The Cosmos DB RU/s setting one engineer doubled, and the bill nobody questioned

A latency complaint came in on a Tuesday. By that afternoon an engineer had doubled the provisioned RU/s on three Cosmos DB containers, watched the p99 drop, and closed the ticket. No PR, because throughput isn't code. No line in the cost review, because nobody reviews throughput. The director signed off on nothing, because nothing was put in front of them.

That last part is where the money leaks. Most directors assume cost-bearing changes flow through the same gate as code. They don't. az cosmosdb sql container throughput update is one command any engineer with Contributor access can run, and it commits you to a recurring hourly charge the moment it returns. The fix was correct. It just had no off-ramp, and it is now your baseline.

The latency fix that became a permanent baseline

Doubling RU/s to clear a spike is defensible under pressure. What makes it expensive is that nothing forces a second look once the pressure is gone. The spike passed in a day; the throughput stayed for the quarter. Provisioned throughput bills on what you reserve, not what you consume, so a container now idling in the low double digits of utilization pays full freight for capacity it touches twice a day. Nothing alerts on the headroom you stopped needing three weeks ago. That is the quiet half of FinOps: a decision right when it shipped, wrong by the next billing cycle.

Provisioned vs autoscale RU/s and the tipping point

Microsoft prices autoscale RU/s at 1.5x the manual rate per RU/s. In exchange, the container scales between a tenth of the ceiling you set and the full ceiling, and each hour bills at the highest throughput it scaled to. That premium is why teams reach for manual provisioning, and why they never revisit it. The crossover is clean: if the container needs, hour by hour, more than roughly two-thirds of its ceiling on average, manual wins; below that, you're funding idle reserved capacity autoscale would have released.
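
To make the crossover concrete, here is a minimal sketch of the arithmetic in Python. It works in units of the manual price, so no real price list is involved. The 1.5x multiplier and the one-tenth floor come from the paragraph above; the function name and the two sample days are invented for illustration. It also assumes each hour bills at the share of the ceiling the container needed, which simplifies how autoscale picks its hourly high-water mark.

```python
AUTOSCALE_MULTIPLIER = 1.5  # autoscale rate relative to manual, per RU/s
AUTOSCALE_FLOOR = 0.1       # autoscale never bills below a tenth of the ceiling


def relative_cost(hourly_fractions):
    """Return autoscale cost divided by manual cost for the same ceiling.

    hourly_fractions: one value per hour, the share of the ceiling the
    container needed in that hour (0.0 to 1.0). A result below 1.0
    means autoscale would have been cheaper over that period.
    """
    if not hourly_fractions:
        raise ValueError("need at least one hour of data")
    # Clamp each hour to what autoscale can actually bill.
    billed = [min(1.0, max(AUTOSCALE_FLOOR, f)) for f in hourly_fractions]
    return AUTOSCALE_MULTIPLIER * sum(billed) / len(billed)


# Hypothetical day: full ceiling for 2 hours, near idle for the other 22.
spiky_day = [1.0] * 2 + [0.05] * 22
# Hypothetical day: 80% of the ceiling around the clock.
steady_day = [0.8] * 24

print(round(relative_cost(spiky_day), 4))   # 0.2625
print(round(relative_cost(steady_day), 4))  # 1.2
```

Under these assumptions the spiky container would pay about a quarter of its manual bill on autoscale, while the steady one would pay 20% more. An average of exactly two-thirds returns 1.0, which is the crossover.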

The doubled container sits nowhere near two-thirds. It is the textbook case for autoscale. But manual throughput got picked once, silently, and everyone after inherited it without a reason.

Why throughput changes never hit a cost review

Where cost review exists at all, it lives on infrastructure that moves through pull requests and Terraform plans. RU/s changes route around all of it: a portal slider or a single CLI command, with no diff to read and no approver to ask.

The lesson is uncomfortable but reliable: any setting an engineer can change without a reviewer will eventually be changed without one. Throughput is the one that sends you a bill for it.

Setting a gate on cost-bearing config changes

Treat provisioned-throughput changes the way you already treat schema changes. Move RU/s into Terraform or Bicep so a bump becomes a PR with an approver and a paper trail, then deny portal-slider edits in production through Azure Policy or scoped RBAC. The gate doesn't have to be heavy. It has to exist, and it has to have an owner who reads the bill. A number that survived review because someone defended it is a different thing from one nobody saw.
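
One way to picture a light gate is a check that runs in the pipeline before a throughput change is applied. The Python sketch below is an illustration under loud assumptions, not a real Azure or Terraform API: review_throughput_change, its parameters, and its approval rule are all hypothetical, and the price in the usage line is a placeholder to replace with your own rate.

```python
HOURS_PER_MONTH = 730  # common billing approximation for one month


def review_throughput_change(container, old_rus, new_rus, approver,
                             price_per_100_rus_hour):
    """Return (allowed, message) for a proposed manual RU/s change.

    Hypothetical policy: decreases always pass; any increase needs a
    named approver and reports its recurring monthly cost.
    """
    delta = new_rus - old_rus
    if delta <= 0:
        return True, f"{container}: no increase, no approval needed"
    monthly = delta / 100 * price_per_100_rus_hour * HOURS_PER_MONTH
    if not approver:
        return False, (f"{container}: +{delta} RU/s adds about "
                       f"{monthly:.2f}/month and has no approver")
    return True, (f"{container}: +{delta} RU/s adds about "
                  f"{monthly:.2f}/month, approved by {approver}")


# 0.008 is a placeholder price per 100 RU/s per hour, not a quoted rate.
ok, msg = review_throughput_change("orders", 4000, 8000, None, 0.008)
print(ok, msg)
```

The point of the sketch is the shape, not the numbers: the increase is blocked until someone puts their name next to the monthly figure.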

Right-sizing RU/s against real consumption

Don't argue throughput from memory. Pull the NormalizedRUConsumption metric per container, read the peak-to-average ratio over a representative week, and decide on the data: high, steady utilization stays manual; spiky or low-duty containers move to autoscale. Run that pass across everything your team touched last quarter, and the doubled-and-forgotten containers surface first.
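
A rough version of that pass in Python, assuming the hourly maximum of NormalizedRUConsumption has already been exported per container as plain percentages. The function name, the sample data, and the two-thirds threshold applied to the hourly average are illustrative choices, not an official sizing rule. The metric reports the busiest partition, so a hot partition can make a container look busier than it is overall.

```python
from statistics import mean

CROSSOVER = 2 / 3  # share of the ceiling above which manual tends to win


def rightsizing_pass(samples_by_container):
    """Rank containers from most idle to busiest, with a suggested mode.

    samples_by_container: {name: [hourly max utilization, 0 to 100]}
    """
    rows = []
    for name, samples in samples_by_container.items():
        if not samples:
            continue  # no data, nothing to decide
        avg = mean(samples)
        peak = max(samples)
        ratio = peak / avg if avg > 0 else float("inf")
        mode = "manual" if avg / 100 > CROSSOVER else "autoscale"
        rows.append({"container": name, "avg_pct": round(avg, 1),
                     "peak_to_avg": round(ratio, 1), "suggest": mode})
    # Lowest average utilization first: the forgotten headroom.
    return sorted(rows, key=lambda r: r["avg_pct"])


# Hypothetical week of hourly samples (168 values per container).
week = {
    "orders": [85, 90, 88, 92] * 42,        # steady and high
    "sessions": [100, 8, 6, 5, 7, 4] * 28,  # brief spikes, otherwise idle
}
for row in rightsizing_pass(week):
    print(row)
```

With this sample data the spiky sessions container sorts first and is suggested for autoscale, while the steady orders container stays manual.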

Cloud Horizons surfaces these per workspace, so a throughput change shows up as a cost event the moment it lands, with utilization next to it, before it becomes next year's renewal line. The cheapest savings are almost always the line items nobody owns. See how the FinOps view catches them.