GCP Error Guide: 'Container failed to start' Cloud Run Revision Errors
Fix the Cloud Run 'Container failed to start and listen on PORT' error: diagnose PORT binding, slow startup, failed health checks, crashes, and image config.
- #gcp
- #troubleshooting
- #errors
- #cloud-run
Overview
A Cloud Run “Container failed to start” error means the new revision’s container did not begin listening on the expected port within the startup window, so the revision is marked unhealthy and traffic is never migrated to it. Cloud Run starts your container, injects a PORT environment variable, and waits for the process to accept TCP connections on that port (and pass the startup probe). If it doesn’t — because it crashed, listened on the wrong port, or took too long — the deploy fails.
You will see this in the deploy output or the revision’s status:
ERROR: (gcloud.run.deploy) Revision 'api-00007-abc' is not ready and cannot serve traffic.
The user-provided container failed to start and listen on the port defined provided by the PORT=8080
environment variable within the allocated timeout.
Or the health-check variant:
ERROR: Revision 'api-00007-abc' is not ready and cannot serve traffic.
Startup probe failed: connection refused on port 8080.
It can surface on any new deploy or revision, and also when a new instance fails to start during an autoscaling cold start. Because Cloud Run keeps serving the last healthy revision, the app may stay up while deploys fail unnoticed.
Symptoms
- Deploy fails with “failed to start and listen on the port … PORT=8080”.
- Revision shows Ready: False with reason HealthCheckContainerError or ContainerMissing.
- Logs show the app binding to 3000/5000 or 127.0.0.1 instead of 0.0.0.0:$PORT.
- The container exits immediately (crash on boot) or hangs past the startup timeout.
gcloud run revisions describe api-00007-abc --region us-central1 \
--format="value(status.conditions[0].type, status.conditions[0].message)"
Ready Revision 'api-00007-abc' is not ready and cannot serve traffic.
The user-provided container failed to start and listen on the port defined by the PORT=8080 environment variable.
Common Root Causes
1. App listens on the wrong port
The app hardcodes a port instead of reading PORT, so it never listens on the port Cloud Run probes (default 8080).
gcloud run services logs read api --region us-central1 --limit 20 \
| grep -iE 'listening|port'
Server listening on http://0.0.0.0:3000
The app bound 3000 but Cloud Run probes 8080 — connection refused. Read process.env.PORT / os.environ["PORT"] instead.
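As a minimal sketch of reading PORT in a Python service: the helper name resolve_port and the 8080 fallback for local runs are illustrative assumptions, not part of any Cloud Run SDK.

```python
import os

DEFAULT_PORT = 8080  # Cloud Run's default container port, used here as a local fallback


def resolve_port(env=None):
    """Return the port to listen on, taken from the PORT environment variable."""
    env = os.environ if env is None else env
    raw = env.get("PORT", str(DEFAULT_PORT))
    port = int(raw)  # a non-numeric PORT raises ValueError, which fails loudly at boot
    if not 0 < port < 65536:
        raise ValueError(f"PORT out of range: {port}")
    return port
```

Passing the result to the framework's listen call, instead of a literal such as 3000, keeps the app aligned with whatever port the platform probes.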
2. App binds to localhost, not 0.0.0.0
Binding to 127.0.0.1 makes the port unreachable from outside the container.
gcloud run services logs read api --region us-central1 --limit 20 \
| grep -i listening
Listening on 127.0.0.1:8080
The right port but the wrong interface; Cloud Run’s probe can’t reach 127.0.0.1. Bind to 0.0.0.0:$PORT.
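The interface fix can be sketched with the standard library alone. The function name open_listener is hypothetical, and a real service would normally pass the same host and port to its web framework instead of opening a raw socket.

```python
import os
import socket


def open_listener(env=None, host="0.0.0.0"):
    """Open a TCP listening socket on host:$PORT (8080 if PORT is unset)."""
    env = os.environ if env is None else env
    port = int(env.get("PORT", "8080"))
    # "0.0.0.0" means every interface. "127.0.0.1" would accept only
    # connections that originate inside the container itself.
    return socket.create_server((host, port))
```

Most frameworks default their host argument to localhost in development mode, so it is worth setting the host explicitly for the container build.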
3. The container crashes on startup
A missing env var, bad config, or unhandled exception kills the process before it listens.
gcloud run services logs read api --region us-central1 --limit 30 \
| grep -iE 'error|exception|traceback|exited'
KeyError: 'DATABASE_URL'
Container called exit(1).
The process exits before binding the port — fix the missing config (here DATABASE_URL).
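One way to make this kind of crash obvious in the logs is to validate configuration up front and name every missing variable at once. This is a sketch under assumptions: REQUIRED_VARS, MissingConfigError, and load_config are hypothetical names, and the required list would differ per service.

```python
import os

REQUIRED_VARS = ("DATABASE_URL",)  # hypothetical list; extend per service


class MissingConfigError(RuntimeError):
    """Raised at boot when required environment variables are absent."""


def load_config(env=None, required=REQUIRED_VARS):
    """Return the required settings, or fail with one clear message."""
    env = os.environ if env is None else env
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise MissingConfigError(
            "missing required environment variables: " + ", ".join(missing)
        )
    return {name: env[name] for name in required}
```

Calling load_config() before the server starts turns a bare KeyError deep in the stack into a single log line that names the variable to set.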
4. Startup is slower than the timeout
Heavy initialization (model load, migrations, warmup) exceeds the startup timeout, so the probe gives up before the app is ready.
gcloud run services describe api --region us-central1 \
--format="value(spec.template.spec.timeoutSeconds, spec.template.metadata.annotations)"
300
run.googleapis.com/startup-cpu-boost: 'false'
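Note that timeoutSeconds here is the request timeout, not the startup window; startup time is bounded by the startup probe. If init takes ~40s but the startup probe window is shorter than that, widen the startup probe (or add one) and enable startup CPU boost.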
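A rough way to size the probe is to estimate its window as initialDelaySeconds + failureThreshold × periodSeconds. The sketch below treats that formula as an approximation (it ignores per-attempt timeouts), and the 10-second period and 1.5× headroom are assumed values, not settings taken from the service above.

```python
import math


def startup_window_seconds(failure_threshold, period_seconds, initial_delay_seconds=0):
    """Approximate time a container gets before the startup probe gives up."""
    return initial_delay_seconds + failure_threshold * period_seconds


def min_failure_threshold(init_seconds, period_seconds, initial_delay_seconds=0, headroom=1.5):
    """Smallest failureThreshold whose window covers init_seconds * headroom."""
    needed = init_seconds * headroom - initial_delay_seconds
    return max(1, math.ceil(needed / period_seconds))
```

With a failureThreshold of 3 and an assumed 10-second period, the window is about 30 seconds, which a 40-second init overruns; min_failure_threshold(40, 10) suggests raising the threshold to 6.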
5. Wrong container entrypoint or architecture
A bad CMD/ENTRYPOINT, or an image built for arm64 while Cloud Run runs amd64, prevents the process from running at all.
gcloud run services logs read api --region us-central1 --limit 10
exec /app/server: exec format error
exec format error is a CPU-architecture mismatch — rebuild for linux/amd64.
6. App fails its own health/readiness check
A configured startup or liveness HTTP probe points at a path the app doesn’t serve, so it never reports healthy.
gcloud run services describe api --region us-central1 \
--format="yaml(spec.template.spec.containers[0].startupProbe)"
startupProbe:
httpGet:
path: /healthz
port: 8080
failureThreshold: 3
If the app serves health at / (not /healthz), the probe fails and the revision never becomes ready.
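If changing the probe is not an option, the app can serve the probed path itself. The following is a minimal WSGI sketch, assuming the probe targets /healthz as in the output above; the names HEALTH_PATH and app are illustrative.

```python
HEALTH_PATH = "/healthz"  # must match startupProbe.httpGet.path


def app(environ, start_response):
    """Minimal WSGI app that answers the startup probe path with 200."""
    path = environ.get("PATH_INFO", "/")
    if path == HEALTH_PATH:
        status, body = "200 OK", b"ok\n"
    elif path == "/":
        status, body = "200 OK", b"hello\n"
    else:
        status, body = "404 Not Found", b"not found\n"
    headers = [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))]
    start_response(status, headers)
    return [body]
```

Any WSGI server can host this callable; the point is only that the path in the probe definition and the path the app answers are the same string.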
Diagnostic Workflow
Step 1: Read the revision status for the exact reason
gcloud run revisions describe <REVISION> --region <REGION> \
--format="value(status.conditions[].type, status.conditions[].message)"
The message states the PORT it expected and whether it was a probe failure, a missing container, or a timeout.
Step 2: Read the container logs around startup
gcloud run services logs read <SERVICE> --region <REGION> --limit 50
Look for the “listening on” line (port + interface), crash stack traces, or exited/exit(1).
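The log triage in this step can be sketched as a small classifier. The regular expressions and the returned labels are assumptions tuned to the sample log lines in this guide, so real services with different log formats would need their own patterns.

```python
import re

# Matches lines such as "Server listening on http://0.0.0.0:3000".
LISTEN_RE = re.compile(
    r"listening on\s+(?:https?://)?(?P<host>[\w.\-]+):(?P<port>\d+)",
    re.IGNORECASE,
)
CRASH_RE = re.compile(r"traceback|exception|exit\(\d+\)|\bexited\b", re.IGNORECASE)
LOOPBACK_HOSTS = {"127.0.0.1", "localhost"}


def diagnose(log_lines, expected_port=8080):
    """Return a short, hypothetical label for the most likely startup failure."""
    lines = list(log_lines)
    for line in lines:
        if "exec format error" in line.lower():
            return "arch-mismatch"
    for line in lines:
        match = LISTEN_RE.search(line)
        if match:
            if int(match.group("port")) != expected_port:
                return "wrong-port"
            if match.group("host") in LOOPBACK_HOSTS:
                return "wrong-interface"
            return "listening-ok"
    if any(CRASH_RE.search(line) for line in lines):
        return "crash-on-boot"
    return "no-signal"
```

A "listening-ok" result with a failed revision points away from port bugs and toward slow startup or a probe path mismatch.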
Step 3: Confirm the app honors PORT and binds 0.0.0.0
The app must read the PORT env var and bind 0.0.0.0:$PORT. Cloud Run sets PORT=8080 by default; if the service uses a different container port, this shows which one is configured:
gcloud run services describe <SERVICE> --region <REGION> \
--format="value(spec.template.spec.containers[0].ports[0].containerPort)"
Step 4: Reproduce the start locally with the same contract
docker run --rm -e PORT=8080 -p 8080:8080 gcr.io/my-prod-project/api:latest
curl -sS localhost:8080/ -o /dev/null -w '%{http_code}\n'
If it doesn’t listen on 8080 locally, it won’t in Cloud Run. This isolates port/crash bugs from the platform.
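The same local check can be scripted as a plain TCP connect, which is roughly what a TCP startup probe tests. This is a sketch: port_is_listening is a hypothetical helper, and the one-second timeout is an arbitrary choice.

```python
import socket


def port_is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers connection refused, timeouts, and unreachable hosts
        return False
```

Running port_is_listening("localhost", 8080) in a retry loop after docker run gives a rough measure of how long the container takes to start accepting connections.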
Step 5: Adjust timeout/probe or fix the image, then redeploy
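# Enable startup CPU boost and set the container port explicitly.
# Note: --timeout is the request timeout; the startup window comes from the startup probe.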
gcloud run deploy api --image gcr.io/my-prod-project/api:latest --region us-central1 \
--timeout=300 --cpu-boost --port=8080
For probe mismatches, point the startup probe at a path the app actually serves, then redeploy.
Example Root Cause Analysis
A team adds a machine-learning model load to their service and the next deploy fails:
ERROR: (gcloud.run.deploy) Revision 'api-00007-abc' is not ready and cannot serve traffic.
The user-provided container failed to start and listen on the port ... PORT=8080 within the allocated timeout.
Logs show the app does eventually start, but late:
gcloud run services logs read api --region us-central1 --limit 30 | grep -iE 'loading|listening'
Loading model weights (1.3 GB)...
Server listening on 0.0.0.0:8080
The port and interface are correct, and there’s no crash — the model load just pushes the first listen past the startup probe window, so Cloud Run tears the revision down before it reports healthy. This is a startup-timeout problem, not a port bug.
Fix: enable startup CPU boost and give the startup probe more room so initialization finishes in time:
gcloud run deploy api --image gcr.io/my-prod-project/api:latest --region us-central1 \
--cpu-boost --timeout=300 \
--port=8080
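The --timeout flag sets the request timeout rather than the startup window, so the deploy also needs a startup probe configured with a higher failureThreshold or periodSeconds. With both changes in place, the revision becomes ready and serves traffic.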
Prevention Best Practices
- Always read the PORT environment variable and bind 0.0.0.0:$PORT; never hardcode a port or bind to localhost.
- Test the exact container contract locally (docker run -e PORT=8080 -p 8080:8080 ...) before deploying so port/crash bugs never reach Cloud Run.
- Build images for linux/amd64 (or set the platform explicitly) so architecture mismatches don’t produce exec format error.
- Move slow initialization behind a startup probe and enable startup CPU boost so legitimately slow boots aren’t killed by the timeout.
- Fail fast and log clearly on missing config (DB URLs, secrets) so a crash-on-boot is obvious in logs rather than a vague timeout.
Quick Command Reference
# Revision status + exact failure message
gcloud run revisions describe <REVISION> --region <REGION> \
--format="value(status.conditions[].type, status.conditions[].message)"
# Startup logs (port, interface, crashes)
gcloud run services logs read <SERVICE> --region <REGION> --limit 50
# What container port is configured?
gcloud run services describe <SERVICE> --region <REGION> \
--format="value(spec.template.spec.containers[0].ports[0].containerPort)"
# Reproduce the start locally with the same contract
docker run --rm -e PORT=8080 -p 8080:8080 <IMAGE>
curl -sS localhost:8080/ -o /dev/null -w '%{http_code}\n'
# Redeploy with CPU boost and an explicit port (--timeout is the request timeout, not the startup window)
gcloud run deploy <SERVICE> --image <IMAGE> --region <REGION> \
--timeout=300 --cpu-boost --port=8080
# Inspect the startup probe
gcloud run services describe <SERVICE> --region <REGION> \
--format="yaml(spec.template.spec.containers[0].startupProbe)"
Conclusion
A Cloud Run “Container failed to start” error means the revision never listened on the expected PORT (or passed its startup probe) within the timeout. The usual root causes:
- The app listens on the wrong port instead of reading PORT.
- The app binds 127.0.0.1 rather than 0.0.0.0.
- The container crashes on startup (missing config, exception).
- Initialization is slower than the startup timeout.
- A bad entrypoint or an amd64/arm64 architecture mismatch.
- A configured health/startup probe targets a path the app doesn’t serve.
Read the revision’s status message for the expected port, check the logs for the listening line or crash, and reproduce the same PORT contract locally — the fix is almost always honoring PORT/0.0.0.0, fixing a boot crash, or giving slow startups more room.