GCP Error Guide: 'Quota exceeded: too many concurrent queries' BigQuery Concurrency Limits
Fix BigQuery 'Exceeded rate limits: too many concurrent queries for this project': diagnose interactive slots, reservations, and runaway jobs with read-only bq and gcloud.
- #gcp
- #troubleshooting
- #errors
- #bigquery
Exact Error Message
A BigQuery job fails to start (or is rejected by the API) with one of these forms:
$ bq query --use_legacy_sql=false 'SELECT COUNT(*) FROM `analytics-prod.events.raw`'
Error in query string: Error processing job
'analytics-prod:bqjob_r3f8a1c9d2e4_00000190a1b2c3d4_1':
Quota exceeded: Your project exceeded quota for concurrent queries.
For more information, see https://cloud.google.com/bigquery/docs/troubleshoot-quotas
From a client library or the REST API the equivalent message is:
google.api_core.exceptions.Forbidden: 403 Exceeded rate limits: too many
concurrent queries for this project. For more information, see
https://cloud.google.com/bigquery/quotas
reason: rateLimitExceeded
BigQuery returns this as a rateLimitExceeded / quota error on jobs.insert.
What the Error Means
BigQuery limits how many interactive queries a single project can run at the same time. On-demand (per-byte-billed) projects share a default concurrency limit; reservation-based (capacity / slot) projects get a concurrency target derived from their assigned slots. When the number of in-flight interactive query jobs for the project reaches that ceiling, the next jobs.insert is rejected immediately rather than queued, and you get too many concurrent queries.
Critically, this is a concurrency limit, not a slot-exhaustion or bytes-billed limit. The job never even starts executing; BigQuery refuses to admit it because too many sibling jobs are already running. This is why retrying with backoff usually succeeds: as soon as one of the running queries finishes, a concurrency slot frees up. It is distinct from Resources exceeded during query execution (a per-query memory/shuffle problem) and from daily bytes-billed quotas.
Common Causes
- Fan-out from an orchestrator. Airflow, dbt, Dataform, or a Cloud Function loop submits dozens of interactive queries in parallel.
- Dashboards that issue many queries at once. Looker, Data Studio / Looker Studio, or a custom BI tool firing one query per tile on page load.
- Retry storms. A transient failure triggers retries that pile on top of still-running originals.
- Long-running interactive queries holding concurrency slots for minutes, starving short queries.
- Everything submitted as interactive when many jobs could be BATCH priority (which queues instead of failing).
- Under-provisioned reservation. Few assigned slots yield a low concurrency target for a high-fan-out workload.
- Many sessions / scripting blocks counting as separate concurrent jobs.
How to Reproduce the Error
- Pick a project on the on-demand model (no large reservation).
- Write a short script that submits many interactive queries without waiting:
for i in $(seq 1 200); do
  bq query --nouse_legacy_sql --sync=false \
    "SELECT COUNT(*) FROM \`analytics-prod.events.raw\` WHERE rand() < 0.5" &
done
wait
- Because each job is interactive and --sync=false does not wait, dozens are admitted simultaneously.
- Once in-flight interactive jobs hit the project’s concurrency limit, the next submissions return Quota exceeded: too many concurrent queries for this project.
Diagnostic Commands
All read-only. Use these to see how many jobs are running and how concurrency is provisioned.
# 1. Confirm the active project
gcloud config get-value project
# 2. List jobs currently RUNNING for the project (the live concurrency count)
bq ls --jobs=true --all=true --max_results=200 \
--format=prettyjson | \
python3 -c "import sys,json;[print(j['jobReference']['jobId'], j['status']['state'], j.get('configuration',{}).get('jobType')) for j in json.load(sys.stdin) if j['status']['state']=='RUNNING']"
# 3. Inspect a specific job's priority (INTERACTIVE vs BATCH) and timing
bq show --format=prettyjson -j bqjob_r3f8a1c9d2e4_00000190a1b2c3d4_1
# 4. Query INFORMATION_SCHEMA for jobs currently in flight (read-only)
bq query --use_legacy_sql=false \
'SELECT job_id, user_email, priority, state, creation_time
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE state = "RUNNING"
ORDER BY creation_time DESC'
# 5. List reservations defined in the project (assigned slots set the concurrency target)
bq ls --reservation --location=US --project_id=analytics-prod
# 6. List reservation assignments for the project
bq ls --reservation_assignment --location=US --project_id=analytics-prod
# 7. Review consumer quota usage / limits for BigQuery
gcloud services quota list \
--service=bigquery.googleapis.com \
--consumer=projects/analytics-prod \
--filter="metric:concurrent" 2>/dev/null || \
gcloud alpha services quota list --service=bigquery.googleapis.com \
--consumer=projects/analytics-prod
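Once the job list is captured as JSON, a few lines of Python can show who is holding the concurrency. The sketch below is illustrative only: it assumes each entry has the `status.state` and `user_email` fields of the jobs.list resource that `bq ls --jobs --format=prettyjson` prints, and the sample records and service-account names are hypothetical.

```python
from collections import Counter

# Hypothetical records shaped like `bq ls --jobs --format=prettyjson` output.
SAMPLE_JOBS = [
    {"jobReference": {"jobId": "job_a"}, "status": {"state": "RUNNING"},
     "user_email": "airflow@analytics-prod.iam.gserviceaccount.com"},
    {"jobReference": {"jobId": "job_b"}, "status": {"state": "RUNNING"},
     "user_email": "looker@analytics-prod.iam.gserviceaccount.com"},
    {"jobReference": {"jobId": "job_c"}, "status": {"state": "RUNNING"},
     "user_email": "airflow@analytics-prod.iam.gserviceaccount.com"},
    {"jobReference": {"jobId": "job_d"}, "status": {"state": "DONE"},
     "user_email": "analyst@example.com"},
]


def running_jobs_by_user(jobs):
    """Return (user_email, count) pairs for RUNNING jobs, busiest first."""
    counts = Counter(
        job.get("user_email", "<unknown>")
        for job in jobs
        if job.get("status", {}).get("state") == "RUNNING"
    )
    return counts.most_common()


for email, count in running_jobs_by_user(SAMPLE_JOBS):
    print(f"{count:3d}  {email}")
```

In practice the list would come from `json.load` on the saved output of command 2 rather than from `SAMPLE_JOBS`.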
Step-by-Step Resolution
- Identify the source of the fan-out using the running-jobs query in step 4. Group by user_email to find the orchestrator or dashboard service account flooding the project.
- Add bounded concurrency in the offending client. Cap parallel queries (e.g. a semaphore of 10-20) instead of submitting unbounded loops.
- Switch eligible jobs to BATCH priority. Batch queries queue rather than fail when concurrency is full, so they never throw this error:
bq query --priority=BATCH --use_legacy_sql=false \
  'SELECT ... FROM `analytics-prod.events.raw`'
- Retry interactive failures with exponential backoff and jitter. Since concurrency slots free up as jobs finish, a few seconds of backoff usually clears the error.
- Provision or grow a reservation. For sustained high concurrency, buy slots and assign them so the concurrency target scales with capacity:
# Example: create a reservation, then assign the project to it
bq mk --reservation --location=US --slots=500 prod-reservation
bq mk --reservation_assignment --location=US \
  --reservation_id=prod-reservation \
  --assignee_id=analytics-prod --assignee_type=PROJECT --job_type=QUERY
- Consolidate dashboard queries. Use BI Engine, cache results, or materialized views so a dashboard does not fire dozens of simultaneous interactive queries on load.
- Cancel runaway long queries that are hogging concurrency slots (bq cancel <job_id>) once you confirm they are safe to stop.
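The bounded-concurrency step above can be sketched in Python with a fixed-size thread pool, which acts as the semaphore: no more than `max_in_flight` submissions run at once. This is a minimal sketch, not a drop-in client. `run_bounded` and `FakeBigQuery` are hypothetical names, and the fake only simulates a query with a short sleep; a real caller would pass its own submit function.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_bounded(submit, queries, max_in_flight=10):
    """Run submit(sql) for every query with at most max_in_flight at once.

    The pool's worker count is the concurrency cap: a query only starts
    when one of the max_in_flight workers is free.
    """
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        # pool.map returns results in the same order as `queries`.
        return list(pool.map(submit, queries))


class FakeBigQuery:
    """Hypothetical stand-in for a real client; records peak concurrency."""

    def __init__(self):
        self._lock = threading.Lock()
        self.in_flight = 0
        self.peak = 0

    def run(self, sql):
        with self._lock:
            self.in_flight += 1
            self.peak = max(self.peak, self.in_flight)
        time.sleep(0.01)  # pretend the query takes a moment
        with self._lock:
            self.in_flight -= 1
        return f"done: {sql}"


fake = FakeBigQuery()
results = run_bounded(fake.run, [f"SELECT {i}" for i in range(50)], max_in_flight=5)
print(len(results), "queries finished; peak in flight:", fake.peak)
```

Keeping the cap well below the project's concurrency target leaves headroom for other clients that share the same project.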
Prevention and Best Practices
- Default heavy/scheduled pipelines to BATCH priority and reserve INTERACTIVE for human/ad-hoc work.
- Bound parallelism in orchestrators (Airflow pool size, dbt threads, Dataform concurrency) below the project’s concurrency target.
- Cache and materialize frequent dashboard queries; enable BI Engine for low-latency repeats.
- Add jittered exponential backoff to every BigQuery client so transient concurrency rejections self-heal.
- Right-size a reservation for predictable workloads so concurrency scales with assigned slots.
- Alert on rateLimitExceeded counts trending up before they become user-visible. For automated alert triage, see /dashboard/monitoring-alerts/.
Related Errors
- Resources exceeded during query execution: a single query ran out of memory/shuffle; a per-query problem, not concurrency.
- Quota exceeded: Your project exceeded quota for queries per day: a daily count limit, not simultaneous-execution.
- Exceeded rate limits: too many api requests per user per method: jobs.insert API rate, not query concurrency.
- Quota exceeded: ... for tabledata.list: read-API throttling on results, unrelated.
- Job exceeded rate limits: too many table update operations: DML/load concurrency on a single table.
Frequently Asked Questions
Is this the same as running out of slots? No. Slots are the compute units that execute a query. This error happens before execution: BigQuery refuses to admit the job because too many interactive queries are already in flight. You can hit the concurrency limit even with idle slots available.
Will simply retrying fix it? Often, yes. Because the limit is on simultaneous jobs, a short exponential backoff usually succeeds as soon as a running query finishes. Always pair retries with jitter so a retry storm does not re-trigger the limit.
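A retry wrapper with exponential backoff and full jitter might look like the sketch below. It assumes the caller maps the client's rejection (for example the 403 with reason rateLimitExceeded shown earlier) onto the hypothetical `ConcurrencyRejected` exception; `submit_with_backoff` and its parameters are likewise illustrative names, not part of any BigQuery library.

```python
import random
import time


class ConcurrencyRejected(Exception):
    """Hypothetical stand-in for the client's rateLimitExceeded error."""


def submit_with_backoff(submit, sql, attempts=6, base=1.0, cap=32.0,
                        sleep=time.sleep):
    """Call submit(sql), retrying concurrency rejections with full jitter.

    Before retry n (0-based) the wait is drawn uniformly from
    [0, min(cap, base * 2**n)] seconds, so clients that were rejected
    together do not all retry at the same instant.
    """
    for n in range(attempts):
        try:
            return submit(sql)
        except ConcurrencyRejected:
            if n == attempts - 1:
                raise  # out of attempts: surface the original error
            sleep(random.uniform(0, min(cap, base * 2 ** n)))
```

The `sleep` parameter exists so the wrapper can be exercised in tests without real delays; production code would leave it at the default.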
What is the difference between INTERACTIVE and BATCH priority here?
Interactive queries count against the concurrency limit and fail fast when it is reached. Batch queries are queued by BigQuery and start when capacity is available, so they never throw too many concurrent queries. Move non-urgent work to BATCH.
Does buying a reservation raise the concurrency limit? Reservation (capacity) pricing gives a concurrency target derived from assigned slots, so more slots generally allow more simultaneous queries. It also makes performance predictable compared to the shared on-demand pool.
Why does my dashboard trigger this on page load? Many BI tools fire one query per visualization simultaneously. A dashboard with 30 tiles can submit 30 interactive queries at once. Use caching, BI Engine, or materialized views to collapse that burst.
Can I see exactly which jobs are using the concurrency slots?
Yes. Query INFORMATION_SCHEMA.JOBS_BY_PROJECT (or bq ls --jobs) and filter on state = "RUNNING". Grouping by user_email and priority quickly reveals the source of the fan-out.