AWS Error Guide: 'Throttling: Rate exceeded' and RequestLimitExceeded API Throttling

Overview

AWS API endpoints enforce request-rate limits, typically per account and per region, using token buckets. Each call spends a token; when calls arrive faster than the bucket refills and the bucket runs empty, the service rejects further calls with a throttling error rather than queuing them. The two common shapes are Throttling / ThrottlingException (most services) and RequestLimitExceeded (EC2 and some others). The HTTP status depends on the service (commonly 400 for Throttling and ThrottlingException, 503 for EC2’s RequestLimitExceeded), and both are retryable with backoff.
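
To make the token-bucket behaviour concrete, here is a minimal simulation. It assumes a bucket of capacity 5 that refills at 2 tokens per second; these numbers and the simulate_bucket helper are made up for illustration and are not any service's published limits.

```shell
# simulate_bucket CAPACITY REFILL_PER_SEC
# Reads request arrival times (seconds, one per line, ascending) on stdin and
# prints "<time> ok" or "<time> throttled" for each request.
# Hypothetical helper for illustration; the numbers are not real AWS limits.
simulate_bucket() {
  awk -v cap="$1" -v rate="$2" '
    BEGIN { tokens = cap; last = 0 }
    {
      # Refill for the time elapsed since the previous request, up to capacity.
      tokens += ($1 - last) * rate
      if (tokens > cap) tokens = cap
      last = $1
      # Each accepted request spends one token; an empty bucket rejects.
      if (tokens >= 1) { tokens -= 1; print $1, "ok" }
      else             { print $1, "throttled" }
    }'
}

# Eight requests at t=0 against a 5-token bucket, then one more at t=1.
printf '%s\n' 0 0 0 0 0 0 0 0 1 | simulate_bucket 5 2
```

In this run the first five requests pass, the next three are throttled, and the request at t=1 passes again because the bucket has refilled in the meantime. A burst can succeed briefly, while a sustained rate above the refill rate cannot.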

You will see it surface from the CLI or an SDK:

An error occurred (Throttling) when calling the DescribeInstances operation (reached max retries: 4): Rate exceeded

EC2 phrases the same condition differently:

An error occurred (RequestLimitExceeded) when calling the DescribeVolumes operation: Request limit exceeded.

It occurs whenever aggregate request volume to one API spikes — a tight polling loop, a fan-out of Lambdas all calling the same control-plane API, an un-paginated Describe* storm, or many clients sharing one account’s bucket.

Symptoms

Intermittent Throttling: Rate exceeded or RequestLimitExceeded that worsens under load.
Calls succeed when run alone but fail when run in parallel or in CI fan-out.
SDK reports reached max retries after several automatic attempts.
CloudTrail shows the same eventName repeating with errorCode of Client.RequestLimitExceeded or ThrottlingException.

for i in $(seq 1 40); do aws ec2 describe-instances >/dev/null & done; wait

An error occurred (RequestLimitExceeded) when calling the DescribeInstances operation: Request limit exceeded.

aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=EventName,AttributeValue=DescribeInstances \
  --query "length(Events[?contains(CloudTrailEvent, 'RequestLimitExceeded')])"

Common Root Causes

1. A tight polling loop with no wait

A script polls a resource state in a hot loop with no sleep, hammering one API far above its refill rate.

grep -n "describe-stacks" deploy.sh

12:while true; do aws cloudformation describe-stacks --stack-name app | grep COMPLETE && break; done

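A while true with no sleep issues DescribeStacks calls back-to-back, as fast as each CLI invocation returns, and several such loops running at once will drain the bucket quickly. Add a sleep between polls, or use a waiter such as aws cloudformation wait stack-create-complete.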
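
One way to slow the loop down is sketched here, assuming a POSIX shell. The poll_until and stack_complete helpers are hypothetical names for this example, and the 2-second starting delay and 30-second cap are arbitrary choices.

```shell
# poll_until MAX_TRIES CMD [ARGS...]
# Runs CMD until it succeeds, sleeping between attempts with a delay that
# doubles from 2s up to a 30s cap. Returns 1 if MAX_TRIES is exhausted.
# Hypothetical helper; SLEEP can be overridden (for example SLEEP=: in tests).
poll_until() {
  max_tries=$1; shift
  delay=2
  try=1
  while [ "$try" -le "$max_tries" ]; do
    if "$@"; then
      return 0
    fi
    ${SLEEP:-sleep} "$delay"
    delay=$((delay * 2))
    [ "$delay" -gt 30 ] && delay=30
    try=$((try + 1))
  done
  return 1
}

# Hypothetical check mirroring the loop in deploy.sh: succeeds once the
# stack status contains COMPLETE.
stack_complete() {
  aws cloudformation describe-stacks --stack-name app \
    --query 'Stacks[0].StackStatus' --output text | grep -q COMPLETE
}

# Usage: poll_until 20 stack_complete
```

Like the original grep, this check also matches statuses such as ROLLBACK_COMPLETE, so a stricter comparison may be appropriate in a real deploy script.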

2. Parallel fan-out sharing one account bucket

Many workers (CI matrix, Lambda fan-out, parallel xargs) all call the same API. The bucket is per-account/region, so concurrency multiplies the rate.

aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=EventName,AttributeValue=DescribeInstances \
  --start-time "$(date -u -d '5 min ago' +%Y-%m-%dT%H:%M:%SZ)" \
  --query 'Events[].[Username]' --output text | sort | uniq -c | sort -rn

    340 ci-runner-role
     12 ops-user

ci-runner-role made 340 calls in 5 minutes — the fan-out is the source.
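
A fan-out can be capped rather than removed. The sketch below assumes an xargs that supports -P (GNU and BSD both do); run_capped is a hypothetical helper name and the instance IDs in the usage comment are placeholders.

```shell
# run_capped MAX_PARALLEL CMD [ARGS...]
# Reads whitespace-separated items on stdin and runs "CMD ARGS... ITEM" for
# each one, with at most MAX_PARALLEL invocations running at once.
# Hypothetical helper; CMD must be an external command, not a shell function.
run_capped() {
  max_parallel=$1; shift
  xargs -P "$max_parallel" -n 1 "$@"
}

# Usage (placeholder instance IDs): at most 4 calls in flight at a time,
# instead of one background job per item.
#   printf '%s\n' i-0aaa i-0bbb i-0ccc |
#     run_capped 4 aws ec2 describe-instances --instance-ids
```

Where the API accepts several identifiers per call, batching them into one request reduces the call count further than capping parallelism does.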

3. Retries without exponential backoff

A retry wrapper that retries immediately (or with a fixed short delay) turns a transient throttle into a sustained one, because every retry adds to the rate.

aws configure get retry_mode
aws configure get max_attempts

legacy
3

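legacy is the older retry handler and recognizes a narrower set of throttling errors. Switch to standard or adaptive: both retry a broader set of throttling errors with jittered exponential backoff, and adaptive adds a client-side rate limiter that reacts to throttle signals.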
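
For scripts that wrap commands in their own retry logic, the following is a minimal sketch of capped exponential backoff with full jitter. The helper names, the 1-second base, and the 20-second cap are assumptions for illustration. The built-in retry modes remain the preferred option and can be persisted with aws configure set retry_mode standard.

```shell
# backoff_cap ATTEMPT [BASE] [CAP]
# Upper bound of the delay before retry number ATTEMPT (1-based):
# min(CAP, BASE * 2^(ATTEMPT-1)). Defaults of 1s and 20s are assumed values.
backoff_cap() {
  awk -v n="$1" -v base="${2:-1}" -v cap="${3:-20}" 'BEGIN {
    d = base * 2 ^ (n - 1)
    if (d > cap) d = cap
    print d
  }'
}

# jittered_delay ATTEMPT
# "Full jitter": a random delay between 0 and backoff_cap(ATTEMPT).
jittered_delay() {
  awk -v max="$(backoff_cap "$1")" -v seed="$$$(date +%s)" 'BEGIN {
    srand(seed)
    printf "%.2f\n", rand() * max
  }'
}

# retry_with_backoff MAX_ATTEMPTS CMD [ARGS...]
# Retries CMD on any failure, sleeping a jittered delay between attempts.
# SLEEP can be overridden (for example SLEEP=: in tests).
retry_with_backoff() {
  max_attempts=$1; shift
  attempt=1
  until "$@"; do
    [ "$attempt" -ge "$max_attempts" ] && return 1
    ${SLEEP:-sleep} "$(jittered_delay "$attempt")"
    attempt=$((attempt + 1))
  done
}
```

This wrapper retries every non-zero exit, not only throttling errors, and it relies on a sleep that accepts fractional seconds (GNU and BSD sleep do). The jitter matters because it keeps parallel workers from retrying in lockstep.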

4. Un-paginated Describe storms

Listing every resource with no server-side filter makes each call expensive, and on a large inventory the CLI pages through the full result set with several API requests per invocation. Combined with frequent polling, this saturates the API.

grep -rn "describe-instances" scripts/ | grep -vc "max-items\|filter"

6

Six describe-instances calls carry neither --filters nor --max-items, so each pulls the full inventory, which is expensive and easily throttled at scale.
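
A narrowed call might look like the sketch below. The App=web tag and the list_running_web function name are hypothetical; the point is that --filters is evaluated by the service, while --query only trims the response on the client.

```shell
# list_running_web: hypothetical wrapper showing a narrowed DescribeInstances.
# The tag key and value (App=web) are assumptions made up for this sketch.
list_running_web() {
  # --filters reduces what the service returns; --query then keeps only
  # the instance IDs, one per line in text output.
  aws ec2 describe-instances \
    --filters "Name=tag:App,Values=web" \
              "Name=instance-state-name,Values=running" \
    --query 'Reservations[].Instances[].[InstanceId]' \
    --output text
}
```

Because the filter runs server-side, the response is smaller and the CLI needs fewer pages, so fewer API requests, to complete the listing.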

5. A low or reduced service API quota

Some API throttle limits are adjustable quotas. A new account, or one with a manually lowered limit, throttles sooner than expected.

aws service-quotas list-service-quotas --service-code ec2 \
  --query "Quotas[?contains(QuotaName, 'Rate of')].[QuotaName,Value]" --output text

Rate of DescribeInstances requests	50.0

If the effective rate limit is 50/s and your fleet bursts higher, request a quota increase.
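
Requesting the increase can also be done from the CLI. This is a sketch under assumptions: the function names are hypothetical, and the quota code has to be looked up first because it differs per quota.

```shell
# find_rate_quotas SERVICE
# Prints QuotaCode, QuotaName, Value, and Adjustable for rate-style quotas.
find_rate_quotas() {
  aws service-quotas list-service-quotas --service-code "$1" \
    --query "Quotas[?contains(QuotaName, 'Rate of')].[QuotaCode,QuotaName,Value,Adjustable]" \
    --output text
}

# request_rate_increase SERVICE QUOTA_CODE DESIRED_VALUE
# Files a quota increase request; QUOTA_CODE comes from find_rate_quotas.
request_rate_increase() {
  aws service-quotas request-service-quota-increase \
    --service-code "$1" --quota-code "$2" --desired-value "$3"
}

# Usage (L-XXXXXXXX is a placeholder, not a real quota code):
#   find_rate_quotas ec2
#   request_rate_increase ec2 L-XXXXXXXX 100
```

Only quotas reported as adjustable can be raised this way, and the request may take time to be reviewed, so backoff is still needed in the meantime.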

6. Shared credentials across many tools

Multiple unrelated tools (Terraform, a dashboard, a backup job, monitoring) running against the same account in the same region all draw from one bucket, so no single tool is “at fault” but the sum throttles.

aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=EventName,AttributeValue=DescribeRegions \
  --query 'Events[].[Username]' --output text | sort -u

backup-bot
grafana-cw
terraform-ci

Three distinct principals hitting the same control-plane API at once — the aggregate is what trips the limit.

Diagnostic Workflow

Step 1: Identify the throttled API and error code

aws ec2 describe-instances 2>&1 | grep -oE '(Throttling|RequestLimitExceeded)'

Confirm whether it is Throttling/ThrottlingException or RequestLimitExceeded, and note the operation name — that is the bucket you are exhausting.
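
To capture the operation name along with the error code, the CLI's error line can be parsed. This sketch assumes the message format shown in the Overview; parse_aws_error is a hypothetical helper name.

```shell
# parse_aws_error
# Reads CLI error text on stdin and prints "<error code> <operation>" for
# each line of the form:
#   An error occurred (CODE) when calling the OPERATION operation ...
parse_aws_error() {
  sed -n 's/.*An error occurred (\([^)]*\)) when calling the \([A-Za-z0-9]*\) operation.*/\1 \2/p'
}

# Usage: send only stderr into the parser and discard normal output.
#   aws ec2 describe-instances 2>&1 >/dev/null | parse_aws_error
```

Lines that do not match the pattern are dropped, so the parser prints nothing when the call succeeds.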

Step 2: Measure the call rate in CloudTrail

aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=EventName,AttributeValue=<OPERATION> \
  --start-time "$(date -u -d '10 min ago' +%Y-%m-%dT%H:%M:%SZ)" \
  --query 'Events[].[Username]' --output text | sort | uniq -c | sort -rn

This reveals which principal(s) drive the volume and roughly how high it is.
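
Per-principal totals hide bursts, so it can help to bucket the same events by minute. This sketch assumes the CLI prints EventTime as ISO 8601 text, which may require cli_timestamp_format = iso8601 on AWS CLI v1. calls_per_minute is a hypothetical helper name.

```shell
# calls_per_minute
# Reads ISO 8601 timestamps (one per line) on stdin and prints
# "<count> <YYYY-MM-DDTHH:MM>" for each minute, in time order.
calls_per_minute() {
  # The first 16 characters of an ISO timestamp identify the minute.
  cut -c1-16 | sort | uniq -c | awk '{ print $1, $2 }'
}

# Usage:
#   aws cloudtrail lookup-events \
#     --lookup-attributes AttributeKey=EventName,AttributeValue=<OPERATION> \
#     --query 'Events[].[EventTime]' --output text | calls_per_minute
```

A minute whose count is far above its neighbours points to a burst, which a token bucket punishes even when the average rate looks modest.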

Step 3: Inspect the SDK/CLI retry configuration

aws configure get retry_mode; aws configure get max_attempts
echo "AWS_RETRY_MODE=$AWS_RETRY_MODE AWS_MAX_ATTEMPTS=$AWS_MAX_ATTEMPTS"

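legacy mode handles throttling less well and is a prime contributor. If nothing is set, the client's default applies: AWS CLI v1 and boto3 default to legacy, while AWS CLI v2 defaults to standard.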

Step 4: Check whether the rate limit is an adjustable quota

aws service-quotas list-service-quotas --service-code <SERVICE> \
  --query "Quotas[?contains(QuotaName, 'Rate of') || contains(QuotaName, 'request')].[QuotaName,Value,Adjustable]" \
  --output text

If Adjustable is True and the value is low, a quota increase is a valid fix alongside backoff.

Step 5: Enable adaptive retries and re-test

export AWS_RETRY_MODE=adaptive
export AWS_MAX_ATTEMPTS=6
aws ec2 describe-instances >/dev/null && echo "OK with adaptive retries"

adaptive mode adds a client-side rate limiter that backs off when it sees throttling, smoothing the call rate.

Example Root Cause Analysis

A nightly Terraform plan in CI began failing intermittently with Throttling: Rate exceeded on DescribeSecurityGroups. The plan ran a 12-way matrix, each job refreshing state for the same large VPC account.

CloudTrail showed the rate was concentrated and bursty:

aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=EventName,AttributeValue=DescribeSecurityGroups \
  --start-time "$(date -u -d '10 min ago' +%Y-%m-%dT%H:%M:%SZ)" \
  --query 'Events[].[Username]' --output text | sort | uniq -c

    288 ci-runner-role

Twelve concurrent jobs each refreshing hundreds of security groups, all as one role in one region, summed well past the API’s refill rate. Retry mode was legacy:

aws configure get retry_mode

legacy

So throttled calls were retried under the weaker legacy policy, deepening the storm. Fix: set AWS_RETRY_MODE=adaptive and AWS_MAX_ATTEMPTS=8 in the CI environment, and reduce the matrix concurrency from 12 to 4. The plans then completed without throttling, and the lower concurrency cost only a couple of minutes of wall time.

Prevention Best Practices

Use the SDK’s standard or adaptive retry mode everywhere (AWS_RETRY_MODE=adaptive); never hand-roll fixed-delay retries that ignore throttle signals.
Cap fan-out concurrency to the API’s real rate; more parallel workers on one account/region bucket multiply the request rate, they do not raise the limit.
Replace hot polling loops (while true with no sleep) with backoff or waiters (aws <svc> wait ...).
Filter Describe* calls server-side (--filters) and bound pagination (--max-items) so each call is cheap and infrequent.
Spread independent tools across regions or use service-quotas to raise adjustable rate limits where the workload genuinely needs it.

Quick Command Reference

# Confirm the error code and operation
aws ec2 describe-instances 2>&1 | grep -oE '(Throttling|RequestLimitExceeded)'

# Measure call rate per principal for an operation
aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=EventName,AttributeValue=<OPERATION> \
  --start-time "$(date -u -d '10 min ago' +%Y-%m-%dT%H:%M:%SZ)" \
  --query 'Events[].[Username]' --output text | sort | uniq -c | sort -rn

# Inspect retry configuration
aws configure get retry_mode; aws configure get max_attempts

# Check adjustable rate quotas
aws service-quotas list-service-quotas --service-code <SERVICE> \
  --query "Quotas[?contains(QuotaName,'Rate of')].[QuotaName,Value,Adjustable]" --output text

# Re-run with adaptive retries
AWS_RETRY_MODE=adaptive AWS_MAX_ATTEMPTS=6 aws ec2 describe-instances >/dev/null

Conclusion

A Throttling / RequestLimitExceeded error means your aggregate request rate to one API exceeded its token-bucket refill rate. The usual root causes:

A tight polling loop issuing calls with no wait.
Parallel fan-out sharing the single per-account/region bucket.
Retries without exponential backoff turning a transient throttle into a sustained one.
Un-paginated, unfiltered Describe* storms.
A low or reduced adjustable API rate quota.
Many independent tools sharing one account’s credentials and region.

Measure the real rate per principal in CloudTrail, switch to adaptive retries, then cut concurrency or raise the quota — throttling is about smoothing the rate, not just retrying harder.
