BigQuery Cost Optimization With AI: Slots and Scans

The first time a BigQuery bill doubled on me overnight, the cause turned out to be a single dashboard query someone had scheduled to run every five minutes against an unpartitioned event table. It scanned the full table each time, which on-demand pricing charges by the byte. Nobody noticed because the query was fast — BigQuery is fast even when it’s scanning terabytes — and “fast” reads as “fine” until the invoice arrives. The fix took ten minutes once I found it. Finding it was the hard part, and it’s the part where AI earns its keep, because the evidence lives in INFORMATION_SCHEMA.JOBS and reading that history by hand is tedious in exactly the way models are good at.

Start from the job history, not the schema

The instinct is to open the biggest tables and start adding partitioning. That’s backwards. The right starting point is the job history, because it tells you which queries actually cost money rather than which tables merely look big. I pull a window of jobs with the bytes billed and slot time, then hand the result to AI to rank.

SELECT
  user_email,
  job_id,
  total_bytes_billed,
  total_slot_ms,
  query
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
  AND job_type = 'QUERY'
  AND state = 'DONE'
ORDER BY total_bytes_billed DESC
LIMIT 50;

Prompt: “Here are the top 50 BigQuery jobs by bytes billed over the last week, with slot_ms and the query text. Rank them by cost contribution, and for each tell me whether the cost is driven by scan volume (suggests partition/cluster/column fixes) or by sustained slot consumption (suggests a reservation question). Don’t suggest fixes yet — just classify.”

That classification is the whole game. Scan-volume problems and slot-contention problems have completely different solutions, and conflating them is how people end up buying capacity they don’t need to fix a query that just needed a WHERE clause.

The cheapest fixes first: pruning and partition filters

Once AI has ranked the offenders, I go after the reversible, no-lock-in fixes before anything else. The biggest one is almost always a query that scans a partitioned table without filtering on the partition column, so BigQuery reads every partition.

Prompt: “This query runs against a table partitioned by event_date and clustered by customer_id, but it scans the whole table. Here’s the query and the table DDL. Rewrite it to use a partition filter and column pruning so it scans the minimum bytes, and estimate the reduction. Keep the result identical.”

The model is reliable here because the rule is mechanical: a partition filter the optimizer can use, and selecting only the columns you need instead of SELECT *. On a wide table, dropping SELECT * alone can cut bytes scanned by an order of magnitude, because BigQuery’s columnar storage only reads the columns you reference. I verify the rewrite returns the same rows before shipping it — AI drafts the rewrite, I confirm it’s equivalent.

For tables that are queried repeatedly with the same expensive aggregation, the next lever is materialization. I ask the model to spot the pattern:

Prompt: “Across these top queries, find aggregations that repeat with the same GROUP BY and could be served by a materialized view or a scheduled summary table instead of re-aggregating the base table every time. Show the CREATE MATERIALIZED VIEW and which queries would then hit it.”

Slots: only after the queries are clean

If the cost classification pointed at sustained slot consumption rather than scan volume — many concurrent queries from BI tools, steady all-day load — then the question becomes pricing model. On-demand charges per byte; capacity (Editions) charges for slots over time. The mistake I see constantly is buying a big reservation sized to peak concurrency.

Prompt: “Our slot_ms usage over the last 30 days shows steady daytime load with sharp spikes during the morning dashboard refresh. We’re on on-demand. Help me decide between staying on-demand and moving to an Editions reservation. If a reservation, size the baseline to the steady floor and use autoscaling for the spikes — not the peak. Show the trade-off in plain terms.”

Sizing to the floor with autoscaling for peaks, rather than to the peak, is the single most common capacity error and the easiest money to leave on the table. A reservation is a commitment, so I treat the model’s number as a starting point and confirm it against a few weeks of real usage before committing.

What AI gets right and where I stay in the loop

AI is genuinely good at the mechanical parts of this work: ranking a JOBS export, rewriting a query to add a partition filter, spotting a repeated aggregation that wants a materialized view. Those are pattern-matching tasks against well-defined rules, and that’s the model’s home turf. What it can’t see is whether that “idle” scheduled query is actually feeding a compliance report someone needs, or whether a table is safe to repartition without breaking a live SLA.

So the division is firm. The model drafts the rewrites and the savings estimates; I confirm equivalence on every query change and I never let it repartition a production table or commit a reservation without checking what depends on it. Repartitioning rewrites the whole table — expensive and disruptive — so that’s always a human decision. The reusable version of this workflow lives in my prompts library, and the broader GCP with AI series covers the cost work that spans services, like the billing and committed-use review that sits one level up from BigQuery. The bill rewards attention, and attention is exactly what AI lets you scale.