If your team is using BigQuery at scale, there's a good chance your FinOps and engineering teams are both trying to lower the bill, and neither is succeeding. Here's why.
A customer’s data engineering team came to us with a BigQuery spend problem. They were operating at roughly 150K committed slots, tens of thousands of them in production, spread across 100 reservations in more than 20 regions. Costs were climbing, SLOs were being missed, and even though the team knew why, no one could act on it.
Capacity was controlled by the FinOps team, which operated on budgets & contracts, not workload behavior. So, baseline slots were locked into long-term commitments, autoscale limits were configured conservatively, and any change required coordination with engineering that moved too slowly.
At the same time, the system itself was opaque, with no clear way to answer basic questions like:
- Which reservations are consistently underutilized?
- Which reservations are starved?
- How are baseline capacity & autoscale actually used over time?
Even with idle slot sharing enabled, capacity remained fragmented. Some reservations sat idle while others queued.
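If you’re facing the same questions, BigQuery’s own job metadata can give you a rough first-pass answer. The sketch below is illustrative rather than what this customer used: it assumes the region-qualified INFORMATION_SCHEMA.JOBS_BY_PROJECT view is available to you and that your jobs run inside reservations, and the seven-day window is arbitrary.

```python
# Minimal sketch: approximate per-reservation slot usage from job metadata.
# Assumes the region-qualified INFORMATION_SCHEMA.JOBS_BY_PROJECT view exposes
# reservation_id and total_slot_ms; the 7-day window is illustrative.
from google.cloud import bigquery

client = bigquery.Client()

sql = """
SELECT
  reservation_id,
  -- Average slots consumed over the window: total slot-milliseconds
  -- divided by the window length in milliseconds.
  SUM(total_slot_ms) / (7 * 24 * 60 * 60 * 1000) AS avg_slots_used_7d
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
  AND reservation_id IS NOT NULL
GROUP BY reservation_id
ORDER BY avg_slots_used_7d DESC
"""

for row in client.query(sql).result():
    # Compare this against the baseline you actually pay for in each
    # reservation to spot the consistently idle and consistently starved ones.
    print(f"{row.reservation_id}: ~{row.avg_slots_used_7d:.0f} avg slots over 7 days")
```

For a multi-project picture you would want the organization-level variants of these views, but the shape of the analysis is the same. The customer had none of this wired up as a shared, trusted signal.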
The result was predictable: Engineering saw latency spikes and FinOps saw rising costs. Neither team had the control needed to fix the problem.
This is not a unique story. It is, in fact, the defining cloud cost story of the last five years. BigQuery just makes it impossible to ignore.
Why BigQuery Is Where This Problem Gets Expensive
BigQuery is great at removing infrastructure complexity. There's no cluster to provision, no node type to select, and no autoscaling policy to wire together manually. You simply write queries and jobs run; Google handles the underlying infrastructure.
The cost model, though, creates a structural problem that, as we saw with our customer, neither FinOps nor engineering can solve alone.
BigQuery charges by slot usage, not by infrastructure. A slot is Google's abstracted unit of compute, which combines CPU, memory, & I/O into a single billable resource.
Typically, you set up your slots like this:
- Buy a baseline amount upfront at a committed rate
- Set an autoscale maximum for demand spikes, billed at a higher pay-as-you-go rate
- Distribute those slots across reservations scoped to your teams & workloads
This configuration determines your bill & your query performance simultaneously.
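To make that concrete, here is roughly what one such reservation looks like through the google-cloud-bigquery-reservation client. This is a minimal sketch, not a recommendation: the project, location, reservation name, and slot counts are placeholders, and the exact fields you can set (edition, autoscale) depend on your setup.

```python
# Rough sketch of the knobs described above: a baseline you commit to and an
# autoscale ceiling you can burst into. All names and numbers are placeholders;
# depending on your BigQuery edition you may also need to set reservation.edition.
from google.cloud import bigquery_reservation_v1 as bq_reservation

client = bq_reservation.ReservationServiceClient()
parent = "projects/my-admin-project/locations/US"  # hypothetical admin project

reservation = bq_reservation.Reservation(
    slot_capacity=500,            # baseline slots, covered by your commitments
    ignore_idle_slots=False,      # allow borrowing idle slots from sibling reservations
    autoscale=bq_reservation.Reservation.Autoscale(
        max_slots=1000,           # ceiling for pay-as-you-go autoscale slots
    ),
)

created = client.create_reservation(
    parent=parent,
    reservation_id="etl-prod",    # hypothetical reservation name
    reservation=reservation,
)
print(created.name)
```

Assignments then scope a reservation to specific projects or folders, which is how the per-team, per-workload split in the last bullet actually happens.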
However, between baseline slots, reservations, & autoscaling slots, there’s potential for a staggering amount of BigQuery waste.
For instance, baseline commitments lock in a fixed number of slots for a fixed term at a fixed rate, so teams pay for that capacity whether or not their workloads consistently use it.
Autoscale maximums create a similar trap. If the ceiling was set high to absorb a one-time spike and nobody lowers it afterward, BigQuery can scale all the way up to that ceiling whenever demand increases.
And because you’re billed for however many slots BigQuery spins up, you end up paying for more than your workload actually needs.
The 60-second minimum billing window makes this worse: even a short burst gets billed for a full minute at whatever scale BigQuery reached.
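A back-of-the-envelope example shows how the two effects combine. The rates and slot counts below are invented for illustration, not Google’s pricing; the only real number is the 60-second minimum.

```python
# Back-of-the-envelope illustration of a stale autoscale ceiling plus the
# 60-second minimum billing window. All rates and slot counts are made up;
# plug in your own contract numbers.
PAYG_RATE_PER_SLOT_HOUR = 0.06   # hypothetical pay-as-you-go rate ($ per slot-hour)

ceiling_slots = 2000             # stale autoscale max left over from a one-time spike
needed_slots = 300               # what the burst actually required
burst_seconds = 10               # the query finished in ten seconds
billed_seconds = max(burst_seconds, 60)  # 60-second minimum billing window

billed = ceiling_slots * (billed_seconds / 3600) * PAYG_RATE_PER_SLOT_HOUR
needed = needed_slots * (burst_seconds / 3600) * PAYG_RATE_PER_SLOT_HOUR

print(f"billed for the burst: ${billed:.2f}")   # ~$2.00
print(f"work actually needed: ${needed:.2f}")   # ~$0.05
```

Any single burst is pennies. Thousands of them a day across a hundred reservations is how the bill creeps up without any individual query ever looking expensive.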
For batch & interactive workloads that share the same reservation, BigQuery doesn't inherently know which is more urgent. When a batch job runs (like a large overnight ETL), it consumes a chunk of the reservation’s capacity. So if an analyst runs a query or a dashboard refreshes during the batch job, it must wait in the queue for slots to free up.
The cost here isn’t obvious at first. Because you’re paying for baseline slots either way, the batch job consuming them doesn’t change your committed spend; it just makes it look like you have a capacity problem. The typical response is to buy more baseline slots, which only increases your spend and leaves the underlying issue festering.
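You can see this pattern in the job metadata before buying anything. The sketch below is an approximation: it assumes JOBS_BY_PROJECT exposes priority, creation_time, start_time, and reservation_id, it uses the gap between submission and execution start as a rough proxy for time spent waiting on slots, and the reservation name is a placeholder.

```python
# Minimal sketch: is interactive work queueing behind batch inside one reservation?
# The gap between creation_time and start_time is used as a rough proxy for
# slot wait; the reservation_id filter below is a hypothetical example.
from google.cloud import bigquery

client = bigquery.Client()

sql = """
SELECT
  priority,                                        -- BATCH or INTERACTIVE
  COUNT(*) AS jobs,
  AVG(TIMESTAMP_DIFF(start_time, creation_time, SECOND)) AS avg_wait_s
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
  AND reservation_id = 'my-project:US.etl-prod'    -- hypothetical reservation
GROUP BY priority
"""

for row in client.query(sql).result():
    print(f"{row.priority}: {row.jobs} jobs, ~{row.avg_wait_s:.0f}s avg wait before running")
```

If interactive jobs are waiting noticeably longer than batch jobs during batch windows, the problem is scheduling and isolation, not a shortage of paid capacity.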
These sneaky ways BigQuery inflates costs are then exacerbated by a bigger problem: the disconnect between FinOps & engineering.
The Slot Problem Is Really a Coordination Problem
The coordination problem between FinOps & engineering has a price tag. According to Harness, 52% of engineering leaders said this misalignment was the leading cause of wasted cloud spend, which is projected to reach $44.5 billion this year.
We’ve seen this with our own customers working in BigQuery:
When an engineering team notices that services depending on BigQuery are slowing down or timing out, they escalate to a platform team. The platform team looks at the reservation, sees high autoscale spillover, and responds by requesting more reservation capacity.
FinOps then reviews the request against org-level commitments & existing reservation allocations. Increasing baseline slots for one reservation is not just a local change, it also requires either:
- Reallocating capacity from another reservation — which risks impacting a different workload
- Increasing overall commitments — which carries long-term cost implications
Without clear visibility into which reservations are consistently underutilized and which are genuinely capacity-constrained, these optimization decisions become inherently conservative. And while idle slot sharing helps absorb short-term imbalance, it does not provide a reliable signal for structural reallocation.
As a result, FinOps typically approves incremental autoscale increases to handle immediate demand while delaying or minimizing baseline changes. Over time, more workloads shift into autoscale, while baseline capacity remains static, leading to higher costs & continued inefficiency.
And this isn’t just a BigQuery issue; according to the same Harness report, enterprises take an average of 31 days to identify & eliminate cloud waste, and 25 days to detect & rightsize over-provisioned resources.
The industry’s answer is to automate optimizations. But automation has become its own problem.
The $44.5 Billion Organizational Problem That’s Now an Automation Problem
For years, "empowering engineers to take action" was the number one challenge cited by 40% of FinOps practitioners.
But in the State of FinOps 2026 report, “empowering engineers to take action” has given way to an industry looking to “shift left”: creating environments where engineers can act on cloud costs proactively rather than reactively.
It makes sense that we’re at this point. As teams become leaner and have a plethora of automated cost optimization tools to choose from, AI cost management is now the #1 desired skillset for FinOps.
The promise: instead of the back-and-forth between FinOps & engineering, orgs can just automate the problem away.
But automation doesn't know your workloads deeply enough to always make the right optimization. It can only apply rules and react to thresholds, optimizing in isolation from real-world behavior.
Why Automation Makes BigQuery Costs Worse
Most automation tools operate on static rules or point-in-time signals. Pointed at BigQuery, they look at recent utilization and suggest changes like these (a caricature of such a rule is sketched after the list):
- Baseline slot reductions
- Autoscale limit increases
- Reservation rebalancing
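Here is that caricature: a threshold rule with invented inputs and cutoffs. The point is not the specific numbers, it is everything the rule cannot see.

```python
# A caricature of the static rule most threshold-based tooling applies:
# look at a couple of aggregate numbers and emit a recommendation.
# Thresholds and inputs are invented for the example.
def recommend(avg_utilization: float, autoscale_spillover: float) -> str:
    if avg_utilization < 0.6:
        return "reduce baseline slots"        # blind to peak-window demand
    if autoscale_spillover > 0.2:
        return "raise the autoscale maximum"  # blind to SLOs and reservation mix
    return "no change"

# A week of mostly-idle overnight hours easily drags the average below the
# threshold, even if the reservation is saturated every morning at 9am.
print(recommend(avg_utilization=0.45, autoscale_spillover=0.30))  # -> "reduce baseline slots"
```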
But none of these decisions is safe in isolation; each requires context to actually reduce waste.
When we talk about reservation sizing, the right action depends on:
- Time-of-day demand patterns
- The workload mix (batch vs. interactive)
- Latency expectations tied to SLOs
- Cross-reservation dependencies under shared capacity
For example, reducing baseline slots based on low utilization during off-peak hours can increase autoscale dependency during peak windows. And increasing autoscale limits to fix queueing can mask a deeper imbalance in how capacity is distributed across reservations.
Automation without real-world context is inherently unstable.
BigQuery makes this especially visible. A reservation change intended to reduce cost can introduce queueing for latency-sensitive workloads. A conservative baseline meant to control spend can push more usage into autoscale, increasing total cost.
The same automation issues that impact cost also impact performance. Reduce baseline slots too aggressively and you breach SLOs. Cap autoscale too tightly and queries get stuck in the queue.
Ready to optimize your BigQuery Slots?
Book a Sedai demo to speak with a technical expert.

Why Autonomous Optimization Is the Only Safe Optimization
If FinOps & engineering teams are shifting to a more proactive approach to cutting cloud costs, they need to understand that rule-based automation eventually breaks down, leaving them with high costs & more toil.
The only truly proactive, safe approach is to ground optimization in real application behavior, so it can adapt as your workloads change.
For BigQuery, that means the system needs to observe not just costs but:
- Which workloads need slots and when
- What the latency patterns look like across reservations
- Where the true sweet spot sits between execution time and slot usage
In addition, humans must stay in the loop to review recommendations, approve changes, & retain control over structural decisions like org-level commitment purchases.
For our customers, we follow this model for BigQuery:
- Continuous observation of slot consumption patterns across workloads
- SLO-aware recommendations that account for performance, not just cost
- A Copilot model where engineers & FinOps can review data-backed recommendations with projected impact before any change is made
We don’t see this as removing either FinOps or engineers from the process, or as elevating one team’s importance over the other. Instead, we see it as a centralized, trusted system that considers both teams’ perspectives and grounds them in real-world application behavior.
The Practical Takeaway for Engineering Leaders
If you're a VP of Engineering or CTO with significant BigQuery spend, you already know your FinOps & engineering teams have a disconnect. We don’t need to convince you of that. What we do want is for you to examine your own environment and ask yourself some uncomfortable questions.
First, how often do your slot reservations actually get adjusted?
If the answer is weekly or less, you're operating on a lag that's costing you real money, because demand doesn't change on a weekly admin cycle.
Second, who decides the autoscale maximum, and do they have visibility into the SLO implications of capping it too low?
If that decision is owned entirely by FinOps without direct input from the engineers who own your latency-sensitive workloads, you're likely either over-capped or under-capped at any given time.
Third, what happens when a workload's behavior changes?
If the answer involves a ticket, a meeting, or a manual admin step, the lag between observation and action is measured in days — not in the minutes that actually matter when query latency starts climbing.
Everyone talks about not wanting another dashboard; it’s practically cringe-worthy to suggest one. And no matter how many engineers you add to your team, manual optimization will never scale to what you need. The $44.5 billion in annual cloud cost waste is proof that more dashboards & headcount are not working.
BigQuery is where this problem becomes impossible to ignore: the slot model, reservation complexity, disconnect between cost and performance, and the lag between observation & action all converge into a spending problem that dashboards & headcount can't fix.
The teams that get ahead of this won't do so by adding more processes or more automation. They'll do it by grounding every optimization decision in how their workloads actually behave. And that's what Sedai is built to do.
Sedai's BigQuery optimization is available now. Stop managing slots by hand; schedule a demo to see how.
