Azure Error Guide: '429 TooManyRequests' ARM Throttling
Fix Azure 429 TooManyRequests throttling: diagnose ARM read/write limits, resource-provider throttling, Retry-After headers, Terraform parallelism, and backoff.
- #azure
- #troubleshooting
- #errors
- #throttling
Overview
An Azure 429 TooManyRequests happens when Azure Resource Manager (ARM) or a resource provider rejects your request because you have exceeded a request-rate limit. ARM tracks a token bucket of remaining reads and writes per subscription (and per resource provider), and once the bucket is empty it returns 429 with a Retry-After header telling you how long to wait. The operation does not complete until you slow down and retry.
You will see this in a CLI response or pipeline log:
(TooManyRequests) The request is being throttled as the limit has been reached for operation 'GetVirtualMachine'. Please try again after '23' seconds.
Code: TooManyRequests
Message: The request is being throttled as the limit has been reached for operation 'GetVirtualMachine'.
And in the raw HTTP exchange (visible with --debug) the response headers carry the budget:
Response status: 429
Retry-After: 23
x-ms-ratelimit-remaining-subscription-reads: 0
x-ms-ratelimit-remaining-subscription-resource-requests: 0
x-ms-request-id: 8f2c1d44-2a17-49b8-9c0e-1d3f5a6b7c80
It occurs whenever requests arrive faster than the bucket refills — most often during large automation runs, tight polling loops, parallel Terraform apply, or fan-out scripts that iterate over many resources. The limit is per-subscription per-region per-operation-type, so one noisy pipeline can throttle everything else in the same subscription.
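To make "faster than the bucket refills" concrete, the sketch below estimates how long a full bucket lasts under a steady request rate. It is a simplified continuous model, the function name is hypothetical, and the capacity, refill rate, and request rate are illustrative inputs rather than documented Azure limits.

```shell
# Hypothetical helper: seconds until a full token bucket empties.
# $1 = capacity (tokens), $2 = refill rate (tokens/s), $3 = request rate (requests/s).
# All three numbers are illustrative inputs, not documented Azure limits.
seconds_until_throttled() {
  awk -v c="$1" -v f="$2" -v r="$3" 'BEGIN {
    if (r <= f) { print "never"; exit }   # refill keeps up, so the bucket never empties
    printf "%.1f\n", c / (r - f)          # net drain is (r - f) tokens per second
  }'
}

# Example with made-up numbers: 250 tokens, refilled at 25/s, consumed at 50/s
#   seconds_until_throttled 250 25 50    # prints 10.0
```

Halving the request rate in that example to 25/s matches the refill rate, so the bucket never drains; this is the effect that lowering concurrency aims for.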
Symptoms
- CLI or SDK calls fail intermittently with (TooManyRequests) and a Retry-After value.
- terraform apply errors with Status=429 Code="TooManyRequests" partway through a plan.
- Deployments stall and resume in bursts as the bucket refills.
- The remaining-reads/writes header trends toward 0 under load.
az vm show --resource-group rg-prod --name web-01 --debug 2>&1 \
| grep -iE 'x-ms-ratelimit-remaining-subscription|Retry-After|status: 429'
Response status: 429
Retry-After: 23
x-ms-ratelimit-remaining-subscription-reads: 0
az deployment group list --resource-group rg-prod -o table 2>&1 | tail -3
(TooManyRequests) The request is being throttled as the limit has been reached for operation 'ListDeployments'. Please try again after '17' seconds.
Common Root Causes
1. Subscription-level ARM read/write limits exhausted
ARM enforces a per-subscription budget for read and write requests. A burst of list/show calls drains the read bucket; bulk creates/updates drain the write bucket.
az vm list --query "length(@)" -o tsv
az group list --debug 2>&1 \
| grep -i 'x-ms-ratelimit-remaining-subscription-reads'
x-ms-ratelimit-remaining-subscription-reads: 4
A remaining count this close to 0 means the next handful of reads will return 429.
2. Per-resource-provider throttling
Each resource provider (Compute, Network, Storage) keeps its own limit independent of the ARM subscription bucket. Hammering one provider throttles only its operations.
az vm list -g rg-prod --debug 2>&1 \
| grep -i 'x-ms-ratelimit-remaining-resource'
x-ms-ratelimit-remaining-resource: Microsoft.Compute/HighCostGet3Min;0,Microsoft.Compute/HighCostGet30Min;312
The HighCostGet3Min counter at 0 shows the Compute provider’s short-window budget for expensive GETs is spent.
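The provider header packs several counters into one value, so it helps to split it before reading it. The sketch below assumes the `name;remaining` pairs separated by commas shown above; the function name is hypothetical, and it strips quotes in case the debug output wraps header values in them.

```shell
# Hypothetical helper: read an x-ms-ratelimit-remaining-resource line (or just
# its value) on stdin and print the counters whose remaining budget is 0.
# Assumes the "name;remaining,name;remaining" layout shown in this guide.
exhausted_counters() {
  sed 's/^[^:]*:[[:space:]]*//' |   # drop the header name, if present
    tr -d "'\"" |                   # drop any quoting around the value
    tr ',' '\n' |                   # one counter per line
    awk -F';' 'NF == 2 && $2 ~ /^[ \t]*0+[ \t\r]*$/ { print $1 }'
}

# Example:
#   az vm list -g rg-prod --debug 2>&1 \
#     | grep -i 'x-ms-ratelimit-remaining-resource' | exhausted_counters
```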
3. Tight polling loops or high Terraform parallelism
A loop that polls a resource state with no delay, or terraform running its default 10 parallel operations against many resources, generates requests faster than the bucket refills.
terraform apply -parallelism=10 2>&1 | grep -iE '429|TooManyRequests' | head -3
Error: waiting for creation of Network Interface: Code="TooManyRequests" Message="The request is being throttled..."
Error: retrieving Virtual Machine: Status=429 Code="TooManyRequests"
Lowering -parallelism reduces the concurrent request rate against ARM.
4. Large fan-out automation across many resources
A script that iterates over hundreds of resources, issuing a show/update per item with no batching or pacing, exhausts the bucket within seconds.
for rg in $(az group list --query "[].name" -o tsv); do
az resource list -g "$rg" -o none
done 2>&1 | grep -ic 'TooManyRequests'
37
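37 throttled responses across the loop signal that the fan-out is outrunning the limit. Note that --query filters the response on the client, so it does not reduce the number of requests by itself; replace the per-group loop with a single subscription-wide az resource list call, or add pacing between iterations.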
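Where the loop cannot be collapsed into one call, pacing can be sketched as below. The helper name and the pause length are assumptions; pick a pause that keeps the loop under whatever refill rate you observe.

```shell
# Hypothetical helper: run a command once per input line, pausing between calls.
# $1 = pause in seconds (an assumed tuning knob); the remaining arguments are
# the command, and each input line is appended as its last argument.
paced_each() {
  pause="$1"; shift
  while IFS= read -r item; do
    "$@" "$item" </dev/null   # keep the command from consuming the item list
    sleep "$pause"
  done
}

# Example (one call per second):
#   az group list --query "[].name" -o tsv | paced_each 1 az resource list -o none -g
```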
5. Tenant-level throttling from shared identity
Multiple subscriptions or pipelines authenticating as the same service principal share tenant-scoped limits (for example Graph or management-group reads), so unrelated jobs throttle each other.
az account show --query "{tenant:tenantId, sub:id}" -o json
az role assignment list --all --debug 2>&1 \
| grep -i 'x-ms-ratelimit-remaining-tenant-reads'
x-ms-ratelimit-remaining-tenant-reads: 2
A near-zero tenant-reads counter points at contention from other jobs using the same identity.
6. Missing exponential backoff / Retry-After ignored
Client code that retries immediately (or with a fixed tiny delay) instead of honoring the Retry-After header turns a transient 429 into a sustained throttle storm.
az vm get-instance-view -g rg-prod -n web-01 --debug 2>&1 \
| grep -iE 'Retry-After|status: 429'
Response status: 429
Retry-After: 30
If your retry fires before the Retry-After: 30 window elapses, every retry is itself throttled.
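A minimal sketch of a retry wrapper that waits at least as long as the server asks, with capped exponential backoff and jitter on top. Everything here is an assumption for illustration: the function name, the BACKOFF_BASE and BACKOFF_CAP variables, and the text patterns it matches, which are taken from the messages shown in this guide. It also merges stderr into stdout to inspect the error, which may not suit commands that print warnings.

```shell
# Hypothetical helper: retry a command while it is being throttled.
# Usage: retry_on_429 MAX_ATTEMPTS command [args...]
# BACKOFF_BASE and BACKOFF_CAP (seconds) are assumed tuning knobs.
retry_on_429() {
  max="$1"; shift
  attempt=0
  while :; do
    rc=0
    out="$("$@" 2>&1)" || rc=$?
    if [ "$rc" -eq 0 ]; then
      printf '%s\n' "$out"
      return 0
    fi
    attempt=$((attempt + 1))
    # Give up on non-throttling errors or when attempts are exhausted.
    if [ "$attempt" -ge "$max" ] ||
       ! printf '%s\n' "$out" | grep -qiE 'TooManyRequests|status: 429'; then
      printf '%s\n' "$out" >&2
      return "$rc"
    fi
    # Server hint: a Retry-After header or the "try again after 'N'" message.
    hint="$(printf '%s\n' "$out" | awk '{
      l = tolower($0)
      if (match(l, /(retry-after|try again after)[^0-9]*[0-9]+/)) {
        v = substr(l, RSTART, RLENGTH); gsub(/[^0-9]/, "", v)
      }
    } END { print v }')"
    # Exponential backoff, capped, never shorter than the hint, plus jitter.
    delay="$(awk -v n="$attempt" -v ra="$hint" \
      -v base="${BACKOFF_BASE:-2}" -v cap="${BACKOFF_CAP:-60}" 'BEGIN {
        srand()
        d = base * (2 ^ (n - 1)); if (d > cap) d = cap
        if (ra != "" && ra + 0 > d) d = ra + 0
        printf "%d\n", d + int(rand() * (d / 2 + 1))
      }')"
    sleep "$delay"
  done
}

# Example:
#   retry_on_429 5 az vm get-instance-view -g rg-prod -n web-01
```

The jitter is only ever added on top of the wait, so a retry never fires before the Retry-After window has elapsed.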
Diagnostic Workflow
Step 1: Confirm it is a 429 and read the Retry-After value
az <command> --debug 2>&1 \
| grep -iE 'status: 429|Retry-After|TooManyRequests'
Step 2: Identify which bucket is empty (subscription vs resource provider)
az <command> --debug 2>&1 \
| grep -iE 'x-ms-ratelimit-remaining-(subscription|resource|tenant)'
The header at or near 0 tells you whether it is the ARM subscription bucket, a resource provider, or the tenant.
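When several remaining-budget headers come back at once, a small filter can pick out the tightest one. This is a sketch with a hypothetical function name; it only considers headers whose value is a plain number, so the multi-counter x-ms-ratelimit-remaining-resource header is skipped and still needs to be read separately.

```shell
# Hypothetical helper: read debug output on stdin and print the
# x-ms-ratelimit-remaining-* header with the lowest numeric value.
# Headers whose value is not a plain integer are ignored.
lowest_budget() {
  tr -d "'\"" | awk '
    {
      l = tolower($0)
      if (match(l, /x-ms-ratelimit-remaining-[a-z-]+:[ \t]*[0-9]+[ \t\r]*$/)) {
        h = substr(l, RSTART, RLENGTH)
        name = h; sub(/:.*/, "", name)                      # header name
        val = h; sub(/^[^:]*:[ \t]*/, "", val); val += 0    # numeric budget
        if (best == "" || val < min) { best = name; min = val }
      }
    }
    END { if (best != "") print best, min }'
}

# Example:
#   az <command> --debug 2>&1 | lowest_budget
```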
Step 3: Find the operation and the caller generating the load
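az monitor activity-log list \
--offset 1h \
--query "[?status.value=='Failed'].{op:operationName.value, caller:caller, status:status.value}" \
-o table
The activity log records only write operations (PUT, POST, DELETE), so throttled reads will not appear here; for read throttling, work from the operation name in the 429 message.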
Step 4: Reduce concurrency in the offending client
# Terraform: drop parallelism
terraform apply -parallelism=3
# Ad-hoc loops: pace requests and replace per-item show calls with one list call filtered by --query
az resource list --query "[?type=='Microsoft.Compute/virtualMachines'].id" -o tsv
Step 5: Verify the bucket recovers after backing off
sleep 30
az group list --debug 2>&1 \
| grep -i 'x-ms-ratelimit-remaining-subscription-reads'
A rising remaining-reads count confirms the bucket is refilling and the throttle has cleared.
Example Root Cause Analysis
A nightly Terraform pipeline that manages roughly 200 VMs starts failing midway with Status=429 Code="TooManyRequests", and other engineers report az vm list intermittently throttling in the same subscription.
The deployment log shows the operation:
Error: retrieving Virtual Machine "web-114": Status=429 Code="TooManyRequests" Message="The request is being throttled as the limit has been reached for operation 'GetVirtualMachine'. Please try again after '28' seconds."
A debug read confirms the Compute provider’s short-window budget, not the subscription bucket, is the bottleneck:
az vm show -g rg-prod -n web-01 --debug 2>&1 \
| grep -i 'x-ms-ratelimit-remaining-resource'
x-ms-ratelimit-remaining-resource: Microsoft.Compute/HighCostGet3Min;0,Microsoft.Compute/HighCostGet30Min;88
The pipeline runs terraform apply -parallelism=10, and the provider issues per-VM GetVirtualMachine refreshes during plan. With 200 VMs and 10 concurrent refreshes, the Compute HighCostGet3Min bucket drains to 0, and because the retry logic ignored Retry-After, immediate retries kept the bucket empty.
Fix: lower parallelism so the request rate stays under the refill rate, and let retries honor the header:
terraform apply -parallelism=3
# confirm recovery
sleep 30
az vm show -g rg-prod -n web-01 --debug 2>&1 \
| grep -i 'x-ms-ratelimit-remaining-resource'
x-ms-ratelimit-remaining-resource: Microsoft.Compute/HighCostGet3Min;297,Microsoft.Compute/HighCostGet30Min;1140
The bucket recovers, the apply completes, and the shared throttling on other engineers’ az vm list calls disappears.
Prevention Best Practices
- Always honor the Retry-After header: retry only after the value elapses, and wrap clients in exponential backoff with jitter rather than fixed-delay loops.
- Cap concurrency in automation: use terraform -parallelism and bounded worker pools so the request rate stays below the bucket’s refill rate.
- Batch with a single list call plus --query, or with az graph query (Resource Graph), instead of looping az resource show per item; one Resource Graph call replaces hundreds of ARM reads.
- Isolate noisy pipelines into their own subscription where the workload allows, so one job cannot exhaust shared per-subscription buckets.
- Monitor the x-ms-ratelimit-remaining-* headers and alert before they hit 0, not after requests start failing.
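For scripts, a bounded worker pool can be sketched with xargs. The helper name is hypothetical, and it relies on the -P option, which GNU and BSD xargs provide but POSIX does not require; the worker count is a tuning assumption.

```shell
# Hypothetical helper: run a command once per whitespace-separated input item
# with at most $1 copies in flight. Relies on xargs -P (GNU and BSD xargs).
bounded_each() {
  workers="$1"; shift
  xargs -n 1 -P "$workers" "$@"
}

# Example: at most 3 concurrent reads
#   az vm list --query "[].id" -o tsv | bounded_each 3 az vm get-instance-view --ids
```

A fixed worker count caps concurrency but not the request rate, so it is best combined with retries that honor Retry-After.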
Quick Command Reference
# Confirm a 429 and read the backoff value
az <command> --debug 2>&1 | grep -iE 'status: 429|Retry-After|TooManyRequests'
# See which bucket is empty
az <command> --debug 2>&1 | grep -iE 'x-ms-ratelimit-remaining-(subscription|resource|tenant)'
# Subscription read budget
az group list --debug 2>&1 | grep -i 'x-ms-ratelimit-remaining-subscription-reads'
# Resource-provider (Compute) budget
az vm list -g <RG> --debug 2>&1 | grep -i 'x-ms-ratelimit-remaining-resource'
# Find throttled operations and callers
az monitor activity-log list --offset 1h \
--query "[?status.value=='Failed'].{op:operationName.value, caller:caller}" -o table
# Replace per-item loops with one Resource Graph query
az graph query -q "Resources | where type =~ 'microsoft.compute/virtualmachines' | project name, id"
# Reduce client concurrency
terraform apply -parallelism=3
# Verify the bucket recovers
sleep 30; az group list --debug 2>&1 | grep -i 'x-ms-ratelimit-remaining-subscription-reads'
Conclusion
A 429 TooManyRequests means an ARM or resource-provider rate bucket is empty and Azure is telling you to slow down via Retry-After. The usual root causes:
- The per-subscription ARM read or write bucket is exhausted by a burst of requests.
- A single resource provider (Compute/Network/Storage) hit its own independent limit.
- A tight polling loop or high Terraform parallelism outruns the bucket’s refill rate.
- Large fan-out automation iterates over many resources without batching or pacing.
- A shared service principal causes tenant-level throttling across unrelated jobs.
- Clients ignore Retry-After and retry immediately, sustaining the throttle.
Read the x-ms-ratelimit-remaining-* headers to find the empty bucket, then cut concurrency and honor Retry-After — the fix is almost always pacing the caller, not raising a limit.