AWS Error Guide: 'Throttling: Rate exceeded' and RequestLimitExceeded API Throttling
Fix AWS Throttling, Rate exceeded and RequestLimitExceeded errors: diagnose API rate limits, hot retry loops, missing backoff, pagination storms, and quota caps.
Overview
AWS API endpoints enforce request-rate limits, usually per account and per region, using token buckets. A bucket allows short bursts up to its capacity and refills at a fixed rate; once sustained traffic drains it, the service rejects further calls with a throttling error rather than queuing them. The two common shapes are Throttling / ThrottlingException (most services, usually HTTP 400) and RequestLimitExceeded (EC2 and some others; HTTP 503 on EC2). Both are retryable with backoff.
You will see it surface from the CLI or an SDK:
An error occurred (Throttling) when calling the DescribeInstances operation (reached max retries: 4): Rate exceeded
EC2 phrases the same condition differently:
An error occurred (RequestLimitExceeded) when calling the DescribeVolumes operation: Request limit exceeded.
It occurs whenever aggregate request volume to one API spikes — a tight polling loop, a fan-out of Lambdas all calling the same control-plane API, an un-paginated Describe* storm, or many clients sharing one account’s bucket.
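To make the token-bucket behaviour concrete, here is a minimal simulation sketch. The function name simulate_bucket and every number passed to it are illustrative assumptions, not published AWS limits; real bucket sizes and refill rates vary by service and API.

```shell
# simulate_bucket CAPACITY REFILL_PER_SEC REQUESTS_PER_SEC SECONDS
# Simulates a token bucket in one-second steps and prints how many
# requests were accepted and how many were throttled.
# (Hypothetical helper; all parameters are illustrative.)
simulate_bucket() (
  capacity=$1; refill=$2; rate=$3; seconds=$4
  tokens=$capacity; accepted=0; throttled=0; t=0
  while [ "$t" -lt "$seconds" ]; do
    # Serve as many requests as there are tokens; reject the rest.
    if [ "$rate" -le "$tokens" ]; then served=$rate; else served=$tokens; fi
    accepted=$((accepted + served))
    throttled=$((throttled + rate - served))
    # Refill at the end of the second, never above the bucket capacity.
    tokens=$((tokens - served + refill))
    if [ "$tokens" -gt "$capacity" ]; then tokens=$capacity; fi
    t=$((t + 1))
  done
  echo "accepted=$accepted throttled=$throttled"
)
```

With those assumed numbers, simulate_bucket 100 20 50 5 prints accepted=180 throttled=70: the first seconds are absorbed by burst capacity, after which only the refill rate gets through.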
Symptoms
- Intermittent Throttling: Rate exceeded or RequestLimitExceeded that worsens under load.
- Calls succeed when run alone but fail when run in parallel or in CI fan-out.
- SDK reports reached max retries after several automatic attempts.
- CloudTrail shows the same eventName repeating with errorCode of Client.RequestLimitExceeded or ThrottlingException.
for i in $(seq 1 40); do aws ec2 describe-instances >/dev/null & done; wait
An error occurred (RequestLimitExceeded) when calling the DescribeInstances operation: Request limit exceeded.
aws cloudtrail lookup-events \
--lookup-attributes AttributeKey=EventName,AttributeValue=DescribeInstances \
--query 'Events[?contains(CloudTrailEvent, `RequestLimitExceeded`)]|length(@)'
17
Common Root Causes
1. A tight polling loop with no wait
A script polls a resource state in a hot loop with no sleep, hammering one API far above its refill rate.
grep -n "describe-stacks" deploy.sh
12:while true; do aws cloudformation describe-stacks --stack-name app | grep COMPLETE && break; done
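A while true with no sleep reissues DescribeStacks as fast as the CLI can start, for the whole duration of the deployment, and every pipeline running the same script in parallel multiplies that rate. Add a sleep between polls, or use a built-in waiter such as aws cloudformation wait stack-create-complete.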
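One way to put a wait into such a loop is a small polling helper. This is a sketch under assumptions: poll_until and stack_is_complete are hypothetical names, and the interval and poll count are placeholders to tune for your deployment.

```shell
# poll_until INTERVAL_SECONDS MAX_POLLS COMMAND [ARGS...]
# Runs COMMAND until it succeeds, sleeping INTERVAL_SECONDS between
# attempts, and gives up after MAX_POLLS attempts.
# (Hypothetical helper, not part of the AWS CLI.)
poll_until() (
  interval=$1; max_polls=$2; shift 2
  n=1
  while [ "$n" -le "$max_polls" ]; do
    if "$@"; then exit 0; fi
    # No point sleeping after the final failed attempt.
    [ "$n" -lt "$max_polls" ] && sleep "$interval"
    n=$((n + 1))
  done
  echo "gave up after $max_polls polls" >&2
  exit 1
)

# Example check (assumed stack name passed as $1):
# stack_is_complete() {
#   aws cloudformation describe-stacks --stack-name "$1" \
#     --query 'Stacks[0].StackStatus' --output text | grep -q 'COMPLETE$'
# }
# poll_until 15 40 stack_is_complete app
```

At a 15-second interval this issues four calls per minute instead of a continuous stream. Where a waiter exists for the resource, aws cloudformation wait stack-create-complete --stack-name app does the same job without custom code.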
2. Parallel fan-out sharing one account bucket
Many workers (CI matrix, Lambda fan-out, parallel xargs) all call the same API. The bucket is per-account/region, so concurrency multiplies the rate.
aws cloudtrail lookup-events \
--lookup-attributes AttributeKey=EventName,AttributeValue=DescribeInstances \
--start-time "$(date -u -d '5 min ago' +%Y-%m-%dT%H:%M:%SZ)" \
--query 'Events[].[Username]' --output text | sort | uniq -c | sort -rn
340 ci-runner-role
12 ops-user
ci-runner-role made 340 calls in 5 minutes — the fan-out is the source.
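If the fan-out is driven from a shell script, one simple way to bound it is xargs with a parallelism cap. This is a sketch: run_bounded is a hypothetical wrapper, and -P is a GNU/BSD xargs option rather than strict POSIX.

```shell
# run_bounded MAX_PARALLEL COMMAND [ARGS...]
# Reads one item per line on stdin and runs "COMMAND ARGS... ITEM" for
# each item, with at most MAX_PARALLEL invocations running at once.
# COMMAND must be an executable, not a shell function.
# (Hypothetical helper; relies on xargs -P from GNU or BSD xargs.)
run_bounded() (
  max_parallel=$1; shift
  xargs -P "$max_parallel" -n 1 "$@"
)

# Example (volume IDs are placeholders):
# printf '%s\n' vol-aaa vol-bbb vol-ccc |
#   run_bounded 4 aws ec2 describe-volumes --volume-ids
```

This caps concurrency, not the request rate itself, so it works best combined with a retry mode that backs off on throttle responses.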
3. Retries without exponential backoff
A retry wrapper that retries immediately (or with a fixed short delay) turns a transient throttle into a sustained one, because every retry adds to the rate.
aws configure get retry_mode
aws configure get max_attempts
legacy
3
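legacy retry mode does back off, but it recognizes fewer throttling error codes and has no retry quota or client-side rate limiting. Switch to standard or adaptive; adaptive additionally slows the client down when it sees throttle responses.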
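For scripts that wrap commands in their own retry logic, the sketch below shows capped exponential backoff with full jitter. The names backoff_delay and retry_with_backoff are hypothetical, and the helper assumes a sleep that accepts fractional seconds (GNU and BSD sleep do). It retries on any non-zero exit status, not only on throttling errors, so treat it as an illustration rather than a replacement for the SDK retry modes.

```shell
# backoff_delay ATTEMPT BASE_SECONDS CAP_SECONDS
# Prints a random delay between 0 and min(CAP, BASE * 2^(ATTEMPT-1)).
# (Hypothetical helper; randomness comes from /dev/urandom.)
backoff_delay() (
  r=$(od -An -N2 -tu2 /dev/urandom | tr -d ' \n')
  awk -v n="$1" -v base="$2" -v cap="$3" -v r="$r" 'BEGIN {
    ceiling = base * 2 ^ (n - 1)
    if (ceiling > cap) ceiling = cap
    printf "%.3f\n", (r / 65535) * ceiling
  }'
)

# retry_with_backoff MAX_ATTEMPTS BASE_SECONDS CAP_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached, sleeping a
# jittered, exponentially growing delay between attempts.
retry_with_backoff() (
  max_attempts=$1; base=$2; cap=$3; shift 3
  attempt=1
  while :; do
    "$@" && exit 0
    [ "$attempt" -ge "$max_attempts" ] && exit 1
    sleep "$(backoff_delay "$attempt" "$base" "$cap")"
    attempt=$((attempt + 1))
  done
)

# Example: retry_with_backoff 6 1 20 aws ec2 describe-instances
```

The jitter matters as much as the exponent: without it, workers that were throttled together retry together and hit the bucket in the same instant again.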
4. Un-paginated Describe storms
Listing all resources without server-side filters or pagination limits forces large, repeated calls. Combined with frequent polling, this saturates the API.
grep -rn "describe-instances" scripts/ | grep -vc "max-items\|filter"
6
Six describe-instances calls with no --filters or --max-items each pull the full inventory — expensive and easily throttled at scale.
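A filtered call might look like the sketch below. The function name list_running_by_tag and the tag key and value in the example are assumptions for illustration; --filters is evaluated by EC2, while --query only trims the output on the client and does not reduce API load by itself.

```shell
# list_running_by_tag TAG_KEY TAG_VALUE
# Prints the IDs of running instances carrying the given tag, one per line.
# (Hypothetical wrapper; the filters are applied server-side by EC2.)
list_running_by_tag() {
  aws ec2 describe-instances \
    --filters "Name=instance-state-name,Values=running" "Name=tag:$1,Values=$2" \
    --query 'Reservations[].Instances[].[InstanceId]' \
    --output text
}

# Example: list_running_by_tag Environment prod
```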
5. A low or reduced service API quota
Some API throttle limits are adjustable quotas. A new account, or one with a manually lowered limit, throttles sooner than expected.
aws service-quotas list-service-quotas --service-code ec2 \
--query "Quotas[?contains(QuotaName, 'Rate of')].[QuotaName,Value]" --output text
Rate of DescribeInstances requests 50.0
If the effective rate limit is 50/s and your fleet bursts higher, request a quota increase.
6. Shared credentials across many tools
Multiple unrelated tools (Terraform, a dashboard, a backup job, monitoring) using the same account credentials in the same region all draw from one bucket, so no single tool is “at fault” but the sum throttles.
aws cloudtrail lookup-events \
--lookup-attributes AttributeKey=EventName,AttributeValue=DescribeRegions \
--query 'Events[].[Username]' --output text | sort -u
backup-bot
grafana-cw
terraform-ci
Three distinct principals hitting the same control-plane API at once — the aggregate is what trips the limit.
Diagnostic Workflow
Step 1: Identify the throttled API and error code
aws ec2 describe-instances 2>&1 | grep -oE '(Throttling|RequestLimitExceeded)'
Confirm whether it is Throttling/ThrottlingException or RequestLimitExceeded, and note the operation name — that is the bucket you are exhausting.
Step 2: Measure the call rate in CloudTrail
aws cloudtrail lookup-events \
--lookup-attributes AttributeKey=EventName,AttributeValue=<OPERATION> \
--start-time "$(date -u -d '10 min ago' +%Y-%m-%dT%H:%M:%SZ)" \
--query 'Events[].[Username]' --output text | sort | uniq -c | sort -rn
This reveals which principal(s) drive the volume and roughly how high it is.
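Totals over ten minutes can hide short bursts, so it can help to bucket the same events by minute. The sketch below assumes the CLI prints EventTime as an ISO 8601 string (the AWS CLI v2 default; other timestamp formats would need different slicing), and calls_per_minute is a hypothetical helper name.

```shell
# calls_per_minute
# Reads ISO 8601 timestamps (one per line) on stdin and prints
# "COUNT MINUTE" pairs, busiest minute first.
# (Hypothetical helper; keeps the first 16 characters, YYYY-MM-DDTHH:MM.)
calls_per_minute() {
  cut -c1-16 | sort | uniq -c | sort -rn | awk '{ print $1, $2 }'
}

# Example:
# aws cloudtrail lookup-events \
#   --lookup-attributes AttributeKey=EventName,AttributeValue=<OPERATION> \
#   --query 'Events[].[EventTime]' --output text | calls_per_minute
```

A minute with several times the average count points to a burst, which is what drains a token bucket even when the long-run average looks modest.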
Step 3: Inspect the SDK/CLI retry configuration
aws configure get retry_mode; aws configure get max_attempts
echo "AWS_RETRY_MODE=$AWS_RETRY_MODE AWS_MAX_ATTEMPTS=$AWS_MAX_ATTEMPTS"
legacy mode (which AWS CLI v1 and boto3 also use when no retry mode is set) handles fewer throttling cases than standard or adaptive, which makes it a prime contributor.
Step 4: Check whether the rate limit is an adjustable quota
aws service-quotas list-service-quotas --service-code <SERVICE> \
--query "Quotas[?contains(QuotaName, 'Rate of') || contains(QuotaName, 'request')].[QuotaName,Value,Adjustable]" \
--output text
If Adjustable is True and the value is low, a quota increase is a valid fix alongside backoff.
Step 5: Enable adaptive retries and re-test
export AWS_RETRY_MODE=adaptive
export AWS_MAX_ATTEMPTS=6
aws ec2 describe-instances >/dev/null && echo "OK with adaptive retries"
adaptive mode adds a client-side rate limiter that backs off when it sees throttling, smoothing the call rate.
Example Root Cause Analysis
A nightly Terraform plan in CI began failing intermittently with Throttling: Rate exceeded on DescribeSecurityGroups. The plan ran a 12-way matrix, each job refreshing state for the same large VPC account.
CloudTrail showed the rate was concentrated and bursty:
aws cloudtrail lookup-events \
--lookup-attributes AttributeKey=EventName,AttributeValue=DescribeSecurityGroups \
--start-time "$(date -u -d '10 min ago' +%Y-%m-%dT%H:%M:%SZ)" \
--query 'Events[].[Username]' --output text | sort | uniq -c
288 ci-runner-role
Twelve concurrent jobs each refreshing hundreds of security groups, all as one role in one region, summed well past the API’s refill rate. Retry mode was legacy:
aws configure get retry_mode
legacy
So failed calls retried without adapting to the throttle responses, deepening the storm. Fix: set AWS_RETRY_MODE=adaptive and AWS_MAX_ATTEMPTS=8 in the CI environment, and reduce the matrix concurrency from 12 to 4. The plans completed without throttling, and the reduced concurrency cost only a couple of minutes of wall time.
Prevention Best Practices
- Use the SDK’s standard or adaptive retry mode everywhere (AWS_RETRY_MODE=adaptive); never hand-roll fixed-delay retries that ignore throttle signals.
- Cap fan-out concurrency to the API’s real rate; more parallel workers on one account/region bucket multiply the request rate, they do not raise the limit.
- Replace hot polling loops (while true with no sleep) with backoff and waiters (aws <svc> wait ...).
- Filter and paginate Describe* calls (--filters, --max-items) so each call is cheap and infrequent.
- Spread independent tools across regions or use service-quotas to raise adjustable rate limits where the workload genuinely needs it.
Quick Command Reference
# Confirm the error code and operation
aws ec2 describe-instances 2>&1 | grep -oE '(Throttling|RequestLimitExceeded)'
# Measure call rate per principal for an operation
aws cloudtrail lookup-events \
--lookup-attributes AttributeKey=EventName,AttributeValue=<OPERATION> \
--start-time "$(date -u -d '10 min ago' +%Y-%m-%dT%H:%M:%SZ)" \
--query 'Events[].[Username]' --output text | sort | uniq -c | sort -rn
# Inspect retry configuration
aws configure get retry_mode; aws configure get max_attempts
# Check adjustable rate quotas
aws service-quotas list-service-quotas --service-code <SERVICE> \
--query "Quotas[?contains(QuotaName,'Rate of')].[QuotaName,Value,Adjustable]" --output text
# Re-run with adaptive retries
AWS_RETRY_MODE=adaptive AWS_MAX_ATTEMPTS=6 aws ec2 describe-instances >/dev/null
Conclusion
A Throttling / RequestLimitExceeded error means your aggregate request rate to one API exceeded its token-bucket refill rate. The usual root causes:
- A tight polling loop issuing calls with no wait.
- Parallel fan-out sharing the single per-account/region bucket.
- Retries without exponential backoff turning a transient throttle into a sustained one.
- Un-paginated, unfiltered Describe* storms.
- A low or reduced adjustable API rate quota.
- Many independent tools sharing one account’s credentials and region.
Measure the real rate per principal in CloudTrail, switch to adaptive retries, then cut concurrency or raise the quota — throttling is about smoothing the rate, not just retrying harder.