AWS Error Guide: 'Task timed out after N seconds' Lambda Timeout Failures
Fix the Lambda 'Task timed out after N seconds' error: diagnose low timeouts, blocked network calls, cold starts, downstream latency, and unresolved async work.
- #aws
- #troubleshooting
- #errors
- #lambda
Overview
Task timed out after N seconds means a Lambda invocation ran longer than the function’s configured timeout, so the runtime forcibly terminated it. Lambda enforces a hard wall-clock limit (1 second to 15 minutes); when the handler has not returned a result before the limit, the invocation is killed mid-execution and billed for the full duration. Any in-flight work is abandoned — partial writes, half-open connections, and incomplete responses are all possible side effects.
You see it at the end of the invocation’s log stream:
2026-06-23T14:08:22.417Z 5f3c1a9e-... Task timed out after 3.01 seconds
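The invocation reports an error to the caller. Asynchronous invocations are retried and, once retries are exhausted, go to the DLQ or failure destination if one is configured; event-source invocations are retried according to the source's own rules. The error occurs when the timeout is set too low for the real work, when a network call hangs (no route, or a security group that blocks the traffic), when cold-start initialization is slow, or when downstream latency spikes.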
Symptoms
- Logs end with Task timed out after N seconds and no handler completion line.
- Duration in the REPORT line equals (or nearly equals) the configured timeout.
- Async invocations are retried (twice by default, three attempts in total) and then land in a DLQ if one is configured; SQS messages become visible again.
- API Gateway returns 504 Gateway Timeout or 502 when fronting the function.
aws logs filter-log-events --log-group-name /aws/lambda/order-processor \
--filter-pattern "Task timed out" \
--query 'events[-1].message' --output text
2026-06-23T14:08:22.417Z 5f3c1a9e-... Task timed out after 3.01 seconds
aws lambda get-function-configuration --function-name order-processor \
--query '[Timeout,MemorySize]' --output text
3 128
Common Root Causes
1. The timeout is simply too low for the work
The function genuinely needs longer than its configured limit. The 3-second default rarely covers a multi-step API workflow.
aws logs filter-log-events --log-group-name /aws/lambda/order-processor \
--filter-pattern "REPORT" --limit 5 \
--query 'events[].message' --output text | grep -oE 'Duration: [0-9.]+ ms'
Duration: 3000.41 ms
Duration: 2998.10 ms
Duration: 3001.00 ms
Durations clustered exactly at the timeout (3000 ms) mean the work is being cut off, not finishing — raise the timeout.
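The "clustered at the timeout" check can be automated. The following is a minimal sketch, not part of the original workflow: count_pinned is a hypothetical helper name, and the 50 ms tolerance is an assumed default. It reads REPORT lines on stdin and counts how many durations sit at or just under the configured limit.

```shell
# Hypothetical helper: count REPORT durations pinned at the timeout.
# $1 = configured timeout in whole seconds
# $2 = tolerance in ms (assumed default: 50)
count_pinned() {
  local timeout_s="$1" tolerance_ms="${2:-50}"
  awk -v limit_ms="$((timeout_s * 1000))" -v tol="$tolerance_ms" '
    # The leftmost "Duration: X ms" on a REPORT line is the invocation
    # duration; "Billed Duration" appears later on the same line.
    match($0, /Duration: [0-9.]+ ms/) {
      # Skip the 10-char "Duration: " prefix and the 3-char " ms" suffix.
      ms = substr($0, RSTART + 10, RLENGTH - 13) + 0
      total++
      if (ms >= limit_ms - tol) pinned++
    }
    END { printf "pinned=%d total=%d\n", pinned, total }'
}
```

It could be fed from the filter-log-events command above, for example by appending `| tr '\t' '\n' | count_pinned 3` after `--output text`. A high pinned count suggests the work is being cut off; in that case the observed durations are only a lower bound on what the function really needs.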
2. A VPC network call with no route hangs until timeout
A function in a VPC private subnet calling an external API or AWS service with no NAT/endpoint will hang on connect until the timeout fires (no fast failure).
aws lambda get-function-configuration --function-name order-processor \
--query 'VpcConfig.[SubnetIds,SecurityGroupIds]' --output json
aws ec2 describe-route-tables --filters Name=association.subnet-id,Values=subnet-0priv1 \
--query 'RouteTables[].Routes[] | [?NatGatewayId]' --output json
[]
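An empty result for a VPC function that calls the internet means the subnet's route table has no NAT route, so every external call hangs until the timeout. If the subnet has no explicit route-table association, this filter matches nothing and the VPC's main route table applies; check that table instead.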
3. Downstream latency spike (DB, API, DynamoDB)
The function’s own code is fine, but a dependency got slow. Throttled DynamoDB, an overloaded RDS instance, or a slow third-party API pushes total duration over the limit.
aws cloudwatch get-metric-statistics --namespace AWS/Lambda \
--metric-name Duration --dimensions Name=FunctionName,Value=order-processor \
--start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)" \
--end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" --period 300 \
--statistics Maximum --query 'sort_by(Datapoints,&Timestamp)[].Maximum' --output text
820.5 910.2 2998.7 3001.2
Duration jumping from ~900 ms to the 3000 ms ceiling shows a latency spike pushing invocations over the edge.
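To put a number on the jump, the datapoints can be reduced to a baseline, a peak, and their ratio. This is a rough sketch under stated assumptions: duration_spread is a hypothetical helper, and it treats the smallest datapoint as the baseline, which only makes sense when the window includes some healthy periods.

```shell
# Hypothetical helper: summarize whitespace-separated Duration maxima (ms)
# read from stdin as min, max, and max/min ratio.
duration_spread() {
  awk '{
      for (i = 1; i <= NF; i++) {
        v = $i + 0
        if (n == 0 || v < min) min = v
        if (n == 0 || v > max) max = v
        n++
      }
    }
    END {
      if (n == 0 || min == 0) { print "no datapoints"; exit 1 }
      printf "min=%.1f max=%.1f ratio=%.2f\n", min, max, max / min
    }'
}

# Sample input copied from the output shown above.
echo '820.5 910.2 2998.7 3001.2' | duration_spread
# prints: min=820.5 max=3001.2 ratio=3.66
```

A ratio well above 1 with the peak sitting at the timeout fits a dependency slowdown more than a function that is always too slow for its limit.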
4. Heavy cold-start initialization
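Large dependencies, slow SDK clients, or expensive setup slow the first invocation in a new execution environment. For on-demand functions, init-phase code normally runs under its own 10-second limit, separate from the function timeout, but if init overruns that limit Lambda re-runs it during the first invocation, where it counts against the configured timeout. Setup done lazily inside the handler always counts, so with a tight timeout a cold invocation can exceed it.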
aws logs filter-log-events --log-group-name /aws/lambda/order-processor \
--filter-pattern "Init Duration" --limit 3 \
--query 'events[].message' --output text | grep -oE 'Init Duration: [0-9.]+ ms'
Init Duration: 2400.55 ms
A 2.4 s init is 80% of the 3 s timeout: any part of that setup that runs inside the handler leaves almost no room for the real work, and callers wait for it either way. Trim init work or raise the timeout.
5. Under-provisioned memory throttling CPU
Lambda CPU scales with memory. A CPU-bound function at 128 MB runs slowly enough to time out; more memory speeds it up and can cost the same or less overall.
aws lambda get-function-configuration --function-name order-processor \
--query 'MemorySize' --output text
128
For CPU-heavy work, 128 MB starves the function — raising memory (and thus CPU) often eliminates the timeout.
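The "same or less overall" claim follows from how duration is billed: the charge scales with memory multiplied by execution time (GB-seconds). A small sketch of that arithmetic follows; gb_seconds is a hypothetical helper, the durations are illustrative rather than measured, and per-request charges are ignored.

```shell
# Hypothetical helper: billed compute for one invocation in GB-seconds.
# $1 = memory in MB, $2 = duration in ms
gb_seconds() {
  local memory_mb="$1" duration_ms="$2"
  awk -v mb="$memory_mb" -v ms="$duration_ms" \
    'BEGIN { printf "%.4f\n", (mb / 1024) * (ms / 1000) }'
}

gb_seconds 128 3000   # prints 0.3750 (128 MB, cut off at the 3 s limit)
gb_seconds 512 600    # prints 0.3000 (512 MB, assumed to finish in 0.6 s)
```

In this illustrative case, four times the memory costs less per invocation because the run is five times shorter. Whether a real function speeds up that much has to be measured at each memory setting.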
6. An unresolved promise / missing callback
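The handler starts async work and never signals completion: a promise that never resolves, a callback that is never called, or an open handle (timer, socket, connection pool) that keeps the event loop alive. In a Node.js callback-style handler the runtime waits for the event loop to drain before returning, so leaving context.callbackWaitsForEmptyEventLoop at its default of true while a connection stays open is a classic case.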
aws logs filter-log-events --log-group-name /aws/lambda/order-processor \
--filter-pattern "Task timed out" --limit 3 --query 'events[].message' --output text
2026-06-23T14:08:22.417Z ... Task timed out after 3.01 seconds
If the business logic logs “done” well before the timeout but the invocation still times out, an open handle/unresolved promise is keeping the event loop alive.
Diagnostic Workflow
Step 1: Confirm it is a timeout, not an unhandled error
aws logs filter-log-events --log-group-name /aws/lambda/<FN> \
--filter-pattern "Task timed out" --limit 1 --query 'events[0].message' --output text
The literal Task timed out after N seconds distinguishes a timeout from an exception or OOM (Runtime exited).
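When triaging many log lines, that distinction can be scripted. The helper below is a sketch only: classify_failure is a hypothetical name, and it relies solely on the two literal markers mentioned above.

```shell
# Hypothetical helper: classify one Lambda log line by failure type.
classify_failure() {
  case "$1" in
    *"Task timed out after"*) echo "timeout" ;;
    *"Runtime exited"*)       echo "runtime-exit" ;;
    *)                        echo "other" ;;
  esac
}

classify_failure "2026-06-23T14:08:22.417Z 5f3c1a9e-... Task timed out after 3.01 seconds"
# prints: timeout
```

Anything classified as "other" needs a closer look at the log stream, since exceptions surface with runtime-specific messages that this sketch does not try to recognize.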
Step 2: Compare Duration against the configured timeout
aws lambda get-function-configuration --function-name <FN> --query 'Timeout' --output text
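# p99 is a percentile: request it with --extended-statistics p99 in a separate call,
# because --statistics and --extended-statistics cannot be combined.
aws cloudwatch get-metric-statistics --namespace AWS/Lambda --metric-name Duration \
--dimensions Name=FunctionName,Value=<FN> --period 300 --statistics Maximum \
--start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)" \
--end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
--query 'sort_by(Datapoints,&Timestamp)[].[Maximum]' --output text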
Durations pinned at the timeout value confirm work is being cut off.
Step 3: Check whether the function is VPC-bound and routable
aws lambda get-function-configuration --function-name <FN> \
--query 'VpcConfig.[SubnetIds,SecurityGroupIds]' --output json
If it is in a VPC, verify a NAT route or VPC endpoint exists for whatever it calls — hung connects are a top cause.
Step 4: Look for cold-start and init cost
aws logs filter-log-events --log-group-name /aws/lambda/<FN> \
--filter-pattern "Init Duration" --limit 5 --query 'events[].message' --output text
A large Init Duration near the timeout points at init-phase work or provisioned-concurrency needs.
Step 5: Adjust timeout/memory and validate
aws lambda update-function-configuration --function-name <FN> \
--timeout 30 --memory-size 512
aws lambda invoke --function-name <FN> --payload '{}' /tmp/out.json \
--cli-binary-format raw-in-base64-out --query StatusCode
Raise the timeout to cover real latency and bump memory for CPU-bound work, then re-invoke to confirm completion.
Example Root Cause Analysis
A webhook handler, stripe-events, started timing out at exactly 6.00 seconds after a network change, returning 504 through API Gateway. Logs showed the handler logged “verifying signature” then nothing until the timeout.
The function had recently been attached to a VPC to reach a private RDS instance:
aws lambda get-function-configuration --function-name stripe-events \
--query 'VpcConfig.SubnetIds' --output text
subnet-0priv1 subnet-0priv2
But the handler also calls the public Stripe API to verify the event. In the VPC’s private subnets there was no NAT route:
aws ec2 describe-route-tables --filters Name=association.subnet-id,Values=subnet-0priv1 \
--query 'RouteTables[].Routes[].DestinationCidrBlock' --output text
10.0.0.0/16
Only the local route — no 0.0.0.0/0 to a NAT gateway. The outbound Stripe API call hung on connect until the 6 s timeout. Fix: add a NAT gateway and a default route for the private subnets (keeping the function in the VPC for RDS access).
aws ec2 create-route --route-table-id rtb-0priv5678 \
--destination-cidr-block 0.0.0.0/0 --nat-gateway-id nat-0abc1234
After the route was added, the Stripe call returned in under 400 ms and the timeouts stopped.
Prevention Best Practices
- Set the timeout from observed p99 duration plus headroom, not the 3-second default; align API Gateway’s integration timeout with the function’s.
- For VPC functions that call the internet or AWS public APIs, always provision a NAT route or the relevant VPC endpoints — a missing route manifests as a timeout, not a clear network error.
- Right-size memory for CPU-bound work; more memory raises CPU and often lowers total cost while removing the timeout.
- Apply explicit per-call timeouts on every downstream client (DB, HTTP, SDK) shorter than the Lambda timeout, so a slow dependency fails fast instead of consuming the whole budget.
- Use provisioned concurrency or trim init-phase work for latency-sensitive functions with heavy cold starts.
- For correlating timeouts with downstream latency from the logs, the free incident assistant can spot whether the function or a dependency is slow. More Lambda walkthroughs are in the AWS guides.
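The first practice in this list, sizing the timeout from observed p99 plus headroom, can be turned into a small calculation. This is a minimal sketch: suggest_timeout is a hypothetical helper, and the 1.5x headroom factor is an assumed starting point rather than an AWS recommendation.

```shell
# Hypothetical helper: suggest a timeout in whole seconds.
# $1 = observed p99 duration in ms
# $2 = headroom multiplier (assumed default: 1.5)
suggest_timeout() {
  local p99_ms="$1" headroom="${2:-1.5}"
  awk -v p99="$p99_ms" -v h="$headroom" 'BEGIN {
    secs = p99 * h / 1000
    s = int(secs)
    if (s < secs) s++        # round up to the next whole second
    if (s < 1) s = 1         # Lambda accepts timeouts from 1 second...
    if (s > 900) s = 900     # ...up to 15 minutes (900 seconds)
    print s
  }'
}

suggest_timeout 2200      # prints 4 (2.2 s * 1.5 = 3.3 s, rounded up)
```

The p99 input should come from invocations that actually completed. Durations pinned at the current limit understate the real need, so raise the limit first and measure again.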
Quick Command Reference
# Confirm a timeout (vs. exception/OOM)
aws logs filter-log-events --log-group-name /aws/lambda/<FN> \
--filter-pattern "Task timed out" --limit 1 --query 'events[0].message' --output text
# Configured timeout and memory
aws lambda get-function-configuration --function-name <FN> --query '[Timeout,MemorySize]' --output text
# Duration trend
aws cloudwatch get-metric-statistics --namespace AWS/Lambda --metric-name Duration \
--dimensions Name=FunctionName,Value=<FN> --period 300 --statistics Maximum \
--start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)" \
--end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" --query 'sort_by(Datapoints,&Timestamp)[].Maximum' --output text
# VPC config and cold-start cost
aws lambda get-function-configuration --function-name <FN> --query 'VpcConfig' --output json
aws logs filter-log-events --log-group-name /aws/lambda/<FN> \
--filter-pattern "Init Duration" --limit 3 --query 'events[].message' --output text
# Adjust and retest
aws lambda update-function-configuration --function-name <FN> --timeout 30 --memory-size 512
Conclusion
Task timed out after N seconds means the handler did not return before the configured wall-clock limit and was killed. The usual root causes:
- The timeout is set too low for the real work.
- A VPC network call with no route hangs until the timeout fires.
- A downstream latency spike (DB, DynamoDB, third-party API).
- Heavy cold-start initialization eating the budget.
- Under-provisioned memory throttling CPU on compute-bound work.
- An unresolved promise / open handle keeping the runtime alive.
Confirm the durations are pinned at the timeout, then decide whether to raise the limit, fix the network path, or speed up a dependency — a timeout is a symptom, not the root cause.