Automation Error Guide: '429 Too Many Requests' Downstream API Rate Limit in a Job

Overview

A 429 Too Many Requests means a downstream API rejected your call because you exceeded its rate limit. The server is healthy and your request is well-formed — it’s simply throttling you. In automation, this commonly happens when a job iterates over a large list and fires requests faster than the API’s per-second or per-minute quota allows, or when many parallel workers share one quota. Without backoff that honors the server’s Retry-After, the job either fails outright or makes the throttling worse by retrying immediately.

You will see this in the job log:

ERROR step=enrich-contacts http 429 Too Many Requests url=https://api.crm.example.com/v1/contacts/lookup
INFO  response headers: Retry-After: 12  X-RateLimit-Remaining: 0  X-RateLimit-Reset: 1718900000

And the API’s body usually explains the limit:

{ "error": "rate_limited", "message": "API rate limit exceeded: 100 requests per minute" }

It occurs whenever the job’s call rate crosses the limit: a nightly batch over thousands of items, a fan-out across worker pods, or a retry storm after a transient blip. A job that ran cleanly on a small dataset can start returning 429s the moment the dataset (or the worker count) grows.

Symptoms

Job steps fail with 429 / “rate limit exceeded”; the API is otherwise reachable.
X-RateLimit-Remaining: 0 and a Retry-After header on the response.
429s cluster in bursts (start of a batch, or during a fan-out) rather than randomly.
Naive retries make it worse — a retry storm keeps the remaining quota at 0.

# Count 429s and inspect rate-limit headers in the job log
grep -c "429" /var/log/jobs/enrich-contacts.log
grep -iE "Retry-After|X-RateLimit" /var/log/jobs/enrich-contacts.log | tail -3

317
Retry-After: 12
X-RateLimit-Remaining: 0
X-RateLimit-Limit: 100

# Reproduce and read the live limit headers
curl -s -D - -o /dev/null -H "Authorization: Bearer $TOKEN" \
  "https://api.crm.example.com/v1/contacts/lookup?q=test" | grep -iE "429|ratelimit|retry-after"

HTTP/1.1 429 Too Many Requests
Retry-After: 12
X-RateLimit-Remaining: 0

Common Root Causes

1. Unthrottled loop fires faster than the limit

The job iterates a list with no pacing, sending far more than the allowed requests/second.

grep -RniE "for .*await|map\(.*await|forEach.*fetch|Promise.all" ./jobs | head

jobs/enrich.ts:22: await Promise.all(items.map(i => api.lookup(i)))  // unbounded fan-out

Promise.all over thousands of items launches them all at once — instant quota exhaustion.
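
A minimal sketch of bounding that fan-out, assuming a hypothetical helper named mapWithConcurrency (it is not part of the original job; libraries such as p-limit package the same idea). It caps how many calls are in flight but does not pace them over time, so on its own it only shrinks the burst.

```typescript
// Hypothetical helper: run `fn` over `items` with at most `limit` calls in flight.
// Results are returned in the same order as the input.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  fn: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array<R>(items.length);
  let next = 0; // index of the next unclaimed item

  // Each worker claims the next index until the list is exhausted.
  // JavaScript is single-threaded, so `next++` cannot race between workers.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i], i);
    }
  }

  const workerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

In the job above, the unbounded call would become something like mapWithConcurrency(items, 2, i => api.lookup(i)), where 2 is an assumed starting value to tune against the real limit.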

2. No backoff / Retry-After ignored

On a 429 the job either fails or retries immediately, ignoring the Retry-After the server provided.

grep -RniE "Retry-After|retryAfter|backoff|sleep|setTimeout" ./jobs | head

(no matches)

No backoff logic means retries hit the still-exhausted quota and 429 again.
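
A sketch of turning the Retry-After value into a wait time. The function name retryAfterMs and the 1-second fallback are assumptions for illustration; the header itself may carry either a number of seconds or an HTTP date, so both forms are handled.

```typescript
// Convert a Retry-After header value into a wait in milliseconds.
// The header is either delay-seconds ("12") or an HTTP-date.
function retryAfterMs(
  header: string | null,
  nowMs: number = Date.now(),
  fallbackMs: number = 1000, // assumed default when the header is absent or unparseable
): number {
  if (header === null) return fallbackMs;
  const value = header.trim();

  // Form 1: a non-negative integer number of seconds.
  if (/^\d+$/.test(value)) return Number(value) * 1000;

  // Form 2: an absolute date; wait until then (never a negative wait).
  const dateMs = Date.parse(value);
  if (Number.isNaN(dateMs)) return fallbackMs;
  return Math.max(0, dateMs - nowMs);
}

// With the header from the log above:
// retryAfterMs("12") returns 12000
```

The retry itself should then sleep for at least this long before calling the API again.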

3. Concurrency / fan-out across many workers sharing one quota

The limit is per account/key, but N parallel workers each think they have headroom; combined they blow the shared quota.

grep -RniE "replicas|concurrency|WORKER_COUNT|parallelism" ./deploy ./jobs | head

deploy/job.yaml: parallelism: 20

20 parallel workers against a 100 req/min limit means each can only do ~5/min — coordination is required.
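
The arithmetic can be sketched as a static split of the shared quota. The helper name perWorkerPlan and the safety parameter are assumptions; a static split is only the simplest form of coordination and presumes every worker is active and evenly loaded, so a shared limiter is usually the better fix.

```typescript
// Split one shared per-minute quota evenly across workers.
// `safety` (0..1) optionally reserves headroom below the published limit.
function perWorkerPlan(
  limitPerMin: number,
  workers: number,
  safety: number = 1,
): { perWorkerPerMin: number; intervalMs: number } {
  const perWorkerPerMin = Math.floor((limitPerMin * safety) / workers);
  if (perWorkerPerMin < 1) {
    throw new Error("more workers than the quota can feed; reduce parallelism");
  }
  // Minimum spacing between requests for a single worker.
  const intervalMs = Math.ceil(60000 / perWorkerPerMin);
  return { perWorkerPerMin, intervalMs };
}

// The case from deploy/job.yaml: 100 req/min shared by 20 workers.
// perWorkerPlan(100, 20) returns { perWorkerPerMin: 5, intervalMs: 12000 }
```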

4. Per-key vs per-IP limit confusion

You scaled out IPs expecting more headroom, but the limit is per API key (or per account), so distributing across hosts doesn’t help.

# Same key everywhere?
grep -RniE "API_KEY|Authorization" ./jobs ./deploy | sort -u | head

deploy/job.yaml: API_KEY=sk_live_shared_one_key

One shared key means one shared quota regardless of how many IPs you spread across.

5. Retry storm amplifying the request rate

A transient failure triggers retries from many in-flight requests at once, spiking the rate and self-inflicting 429s.

grep -RniE "maxRetries|retries:|retryCount" ./jobs | head
grep -c "429" /var/log/jobs/enrich-contacts.log

jobs/enrich.ts:31: retries: 5   // immediate, no jitter
317

Five immediate retries with no jitter multiply the request rate exactly when quota is scarce.
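
One way to spread retries out is exponential backoff with random jitter added on top of whatever wait the server requested. The sketch below is an assumption-laden illustration: the name retryDelayMs, the 500 ms base, and the 30 s cap are placeholders to tune, not values from the job.

```typescript
// Delay before retry number `attempt` (0 = first retry).
// floorMs is the minimum wait, e.g. the server's Retry-After in milliseconds.
// The jitter is uniform in [0, min(capMs, baseMs * 2^attempt)), so concurrent
// requests that failed together do not all retry at the same instant.
function retryDelayMs(
  attempt: number,
  floorMs: number = 0,
  baseMs: number = 500,
  capMs: number = 30000,
  random: () => number = Math.random, // injectable for deterministic tests
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  const jitter = Math.floor(random() * ceiling);
  return floorMs + jitter;
}
```

Because the jitter window doubles with each attempt, later retries are spread over a longer period, which is the opposite of the immediate, synchronized retries found in jobs/enrich.ts.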

6. Limit lower than assumed / changed by the provider

The job is paced to an old, higher limit; the provider lowered it (or you’re on a smaller tier) so the existing pacing now exceeds it.

curl -s -D - -o /dev/null -H "Authorization: Bearer $TOKEN" <URL> | grep -i "X-RateLimit-Limit"

X-RateLimit-Limit: 60

If the job paces to 100/min but the header now says 60, the pacing itself is the problem.
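
A sketch of deriving the pacing from the header instead of a hard-coded number. It assumes X-RateLimit-Limit is a per-minute count, as in this guide's example; the header is not standardized and some providers use a different window, so check the provider's documentation. The name paceIntervalMs and the 0.85 safety factor are illustrative.

```typescript
// Compute the spacing between requests from the advertised limit.
// Falls back to `assumedPerMin` when the header is missing or not a positive integer.
function paceIntervalMs(
  limitHeader: string | null,
  assumedPerMin: number,
  safety: number = 0.85, // stay this fraction below the limit (assumed margin)
): number {
  const parsed = limitHeader === null ? NaN : Number.parseInt(limitHeader, 10);
  const perMin = Number.isFinite(parsed) && parsed > 0 ? parsed : assumedPerMin;
  return Math.ceil(60000 / (perMin * safety));
}

// paceIntervalMs("60", 100) returns 1177  (about 51 requests/min)
// paceIntervalMs(null, 100) returns 706   (about 85 requests/min)
```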

Diagnostic Workflow

Step 1: Confirm 429 and read the limit headers

curl -s -D - -o /dev/null -H "Authorization: Bearer $TOKEN" <URL> \
  | grep -iE "429|X-RateLimit-Limit|X-RateLimit-Remaining|Retry-After"

This gives you the actual limit, remaining quota, and required wait — the numbers you must pace to.

Step 2: Measure your job’s real request rate

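# Requests per minute the job is actually sending
# (assumes each log line starts with an ISO-8601 timestamp, so the first 16 characters identify the minute)
grep "http " /var/log/jobs/enrich-contacts.log \
  | awk '{print substr($1,1,16)}' | sort | uniq -c | tail -5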

Compare requests/min to X-RateLimit-Limit; if it exceeds the limit, that’s your gap.

Step 3: Check whether the quota is shared across workers

grep -RniE "parallelism|replicas|concurrency|WORKER_COUNT" ./deploy ./jobs
grep -RniE "API_KEY|Authorization" ./jobs ./deploy | sort -u

Multiple workers + one key = a shared quota that needs a central limiter, not per-worker pacing.

Step 4: Verify backoff honors Retry-After

grep -RniE "Retry-After|retryAfter|backoff|jitter|sleep|setTimeout" ./jobs

Confirm 429s trigger a wait equal to Retry-After (with jitter), not an immediate retry.

Step 5: Add or tune a client-side limiter, then re-run a small batch

# Check whether a limiter is already in place before adding or tuning one
grep -RniE "limiter|p-limit|bottleneck|rate" ./jobs
# Target pacing (example): 1 req / 700ms ~= 85/min for a 100/min cap

Cap concurrency and rate below the published limit, then re-run a bounded batch and watch for zero 429s.

Example Root Cause Analysis

A nightly enrich-contacts job that used to finish clean now logs hundreds of 429s and aborts partway. The API is up and the token is valid.

Reading the live headers shows the limit and the wait:

HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
Retry-After: 12

The job fans out with Promise.all over the whole contact list:

grep -Rni "Promise.all" ./jobs/enrich.ts

jobs/enrich.ts:22: await Promise.all(items.map(i => api.lookup(i)))

The contact list grew from a few hundred to several thousand. Promise.all launches every lookup simultaneously, so the job sends thousands of requests in a couple of seconds against a 100/min limit — quota hits zero instantly and every subsequent call 429s. There’s no backoff, so it just fails.

Fix: bound concurrency and pace to under the limit, and honor Retry-After on the rare 429:

# Replace Promise.all with a concurrency-limited, rate-paced runner:
# const limit = pLimit(2); ~1 req/700ms; on 429 sleep(Retry-After*1000) then retry
node jobs/enrich.js --max-rate 85/min --concurrency 2

processed 4120 contacts, 0 rate-limit errors

Paced under the 100/min cap, the job completes with no 429s.
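
The shape of that fix can be sketched as follows. This is a simplified, sequential version (concurrency of 1 rather than 2) and not the job's actual code: runPaced, CallResult, and the injected call and wait functions are hypothetical names, and Retry-After is assumed to arrive as a number of seconds.

```typescript
// What the sketch needs to know about each API response.
type CallResult = { status: number; retryAfter: string | null };

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Process items one at a time, spacing calls by intervalMs and
// waiting for Retry-After before retrying a 429.
async function runPaced<T>(
  items: readonly T[],
  call: (item: T) => Promise<CallResult>,
  intervalMs: number,
  maxRetries: number = 5,
  wait: (ms: number) => Promise<void> = sleep, // injectable for tests
): Promise<{ ok: number; failed: number }> {
  let ok = 0;
  let failed = 0;
  for (const item of items) {
    let attempt = 0;
    for (;;) {
      const res = await call(item);
      if (res.status !== 429) {
        // Non-429 errors are counted as failures and not retried here.
        if (res.status < 400) ok++; else failed++;
        break;
      }
      if (attempt++ >= maxRetries) { failed++; break; }
      const seconds = Number(res.retryAfter ?? "1");
      await wait((Number.isFinite(seconds) ? seconds : 1) * 1000);
    }
    await wait(intervalMs); // pacing gap before the next item
  }
  return { ok, failed };
}
```

With intervalMs set to 700 this stays near 85 requests per minute; adding jitter to the retry wait and raising concurrency are natural next steps once the basic pacing is verified.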

Prevention Best Practices

Pace client calls below the published limit with a token-bucket / concurrency limiter; never Promise.all an unbounded list against a rate-limited API.
Always honor Retry-After (and back off with jitter) on a 429 instead of retrying immediately — immediate retries are a self-inflicted retry storm.
For multi-worker jobs sharing one API key, enforce the rate centrally (a shared limiter / distributed token bucket), since per-worker pacing doesn’t bound the shared quota.
Read X-RateLimit-Limit/Remaining from responses and adapt dynamically rather than hard-coding an assumed limit the provider may have lowered.
Make the job resumable/checkpointed so a throttled batch can continue rather than restarting and re-spiking the rate.
Alert on 429 rate so a limit change or dataset growth is caught early.
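
For the token-bucket practice above, a minimal in-process sketch looks like this. The class and method names are illustrative, and it only bounds a single worker: jobs that share one key across workers would need the same logic backed by a shared store, which is not shown here.

```typescript
// In-process token bucket: holds up to `capacity` tokens, refilled
// continuously at `refillPerSec`. Each request spends one token.
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;
  private readonly capacity: number;
  private readonly refillPerSec: number;
  private readonly now: () => number;

  constructor(capacity: number, refillPerSec: number, now: () => number = Date.now) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.now = now; // injectable clock for tests
    this.tokens = capacity; // start full, which allows an initial burst
    this.lastRefillMs = now();
  }

  // Returns 0 if a token was taken, otherwise the milliseconds to wait
  // until one token will be available.
  tryTake(): number {
    const t = this.now();
    const elapsedSec = (t - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastRefillMs = t;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return 0;
    }
    return Math.ceil(((1 - this.tokens) / this.refillPerSec) * 1000);
  }
}
```

A caller loops on tryTake(), sleeping for the returned number of milliseconds whenever it is nonzero. For a 100/min cap, a refill rate of about 1.4 tokens per second with a small capacity keeps the job near 85 requests per minute.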

Quick Command Reference

# Read the live rate-limit headers
curl -s -D - -o /dev/null -H "Authorization: Bearer $TOKEN" <URL> \
  | grep -iE "429|X-RateLimit-Limit|X-RateLimit-Remaining|Retry-After"

# Count 429s and inspect headers in the log
grep -c "429" /var/log/jobs/<job>.log
grep -iE "Retry-After|X-RateLimit" /var/log/jobs/<job>.log | tail

# Measure the job's actual requests/min
grep "http " /var/log/jobs/<job>.log | awk '{print substr($1,1,16)}' | sort | uniq -c | tail

# Find unbounded fan-out and missing backoff
grep -RniE "Promise.all|forEach.*fetch|Retry-After|backoff|p-limit|bottleneck" ./jobs

# Shared-quota check (workers vs one key)
grep -RniE "parallelism|replicas|concurrency|API_KEY" ./deploy ./jobs | sort -u

Conclusion

A 429 Too Many Requests in a job means you crossed a healthy downstream’s rate limit. The usual root causes:

An unthrottled loop firing faster than the per-second/minute quota.
No backoff and the Retry-After header ignored.
Many parallel workers sharing one per-key quota.
Confusing a per-key/account limit with a per-IP one.
A retry storm amplifying the request rate at the worst moment.
The provider’s limit being lower than the job’s hard-coded assumption.

Read the live X-RateLimit-Limit and your job’s actual request rate first — the gap between them tells you exactly how much to throttle, and honoring Retry-After keeps retries from making it worse.
