GCP Error Guide: 'Container failed to start' Cloud Run

Overview

A Cloud Run “Container failed to start” error means the new revision’s container did not begin listening on the expected port within the startup window, so the revision is marked unhealthy and traffic is never migrated to it. Cloud Run starts your container, injects a PORT environment variable, and waits for the process to accept TCP connections on that port (and pass the startup probe). If it doesn’t — because it crashed, listened on the wrong port, or took too long — the deploy fails.

You will see this in the deploy output or the revision’s status:

ERROR: (gcloud.run.deploy) Revision 'api-00007-abc' is not ready and cannot serve traffic.
 The user-provided container failed to start and listen on the port defined provided by the PORT=8080
 environment variable within the allocated timeout.

Or the health-check variant:

ERROR: Revision 'api-00007-abc' is not ready and cannot serve traffic.
 Startup probe failed: connection refused on port 8080.

It occurs on every new deploy or revision (including auto-scaling cold starts that fail). Because Cloud Run keeps serving the last healthy revision, the app may stay up while deploys silently fail.

Symptoms

Deploy fails with “failed to start and listen on the port … PORT=8080”.
Revision shows Ready: False with reason HealthCheckContainerError or ContainerMissing.
Logs show the app binding to 3000/5000 or 127.0.0.1 instead of 0.0.0.0:$PORT.
The container exits immediately (crash on boot) or hangs past the startup timeout.

gcloud run revisions describe api-00007-abc --region us-central1 \
  --format="value(status.conditions[0].type, status.conditions[0].message)"

Ready  Revision 'api-00007-abc' is not ready and cannot serve traffic.
 The user-provided container failed to start and listen on the port defined by the PORT=8080 environment variable.

Common Root Causes

1. App listens on the wrong port

The app hardcodes a port instead of reading PORT, so it never listens on the port Cloud Run probes (default 8080).

gcloud run services logs read api --region us-central1 --limit 20 \
  | grep -iE 'listening|port'

Server listening on http://0.0.0.0:3000

The app bound 3000 but Cloud Run probes 8080 — connection refused. Read process.env.PORT / os.environ["PORT"] instead.

2. App binds to localhost, not 0.0.0.0

Binding to 127.0.0.1 makes the port unreachable from outside the container.

gcloud run services logs read api --region us-central1 --limit 20 \
  | grep -i listening

Listening on 127.0.0.1:8080

The right port but the wrong interface; Cloud Run’s probe can’t reach 127.0.0.1. Bind to 0.0.0.0:$PORT.

3. The container crashes on startup

A missing env var, bad config, or unhandled exception kills the process before it listens.

gcloud run services logs read api --region us-central1 --limit 30 \
  | grep -iE 'error|exception|traceback|exited'

KeyError: 'DATABASE_URL'
Container called exit(1).

The process exits before binding the port — fix the missing config (here DATABASE_URL).

4. Startup is slower than the timeout

Heavy initialization (model load, migrations, warmup) exceeds the startup timeout, so the probe gives up before the app is ready.

gcloud run services describe api --region us-central1 \
  --format="value(spec.template.spec.timeoutSeconds, spec.template.metadata.annotations)"

300
run.googleapis.com/startup-cpu-boost: 'false'

If init takes ~40s but the startup probe window is short, raise the startup timeout / add a startup probe and enable CPU boost.

5. Wrong container entrypoint or architecture

A bad CMD/ENTRYPOINT, or an image built for arm64 while Cloud Run runs amd64, prevents the process from running at all.

gcloud run services logs read api --region us-central1 --limit 10

exec /app/server: exec format error

exec format error is a CPU-architecture mismatch — rebuild for linux/amd64.

6. App fails its own health/readiness check

A configured startup or liveness HTTP probe points at a path the app doesn’t serve, so it never reports healthy.

gcloud run services describe api --region us-central1 \
  --format="yaml(spec.template.spec.containers[0].startupProbe)"

startupProbe:
  httpGet:
    path: /healthz
    port: 8080
  failureThreshold: 3

If the app serves health at / (not /healthz), the probe fails and the revision never becomes ready.

Diagnostic Workflow

Step 1: Read the revision status for the exact reason

gcloud run revisions describe <REVISION> --region <REGION> \
  --format="value(status.conditions[].type, status.conditions[].message)"

The message states the PORT it expected and whether it was a probe failure, a missing container, or a timeout.

Step 2: Read the container logs around startup

gcloud run services logs read <SERVICE> --region <REGION> --limit 50

Look for the “listening on” line (port + interface), crash stack traces, or exited/exit(1).

Step 3: Confirm the app honors PORT and binds 0.0.0.0

The app must read the PORT env var and bind 0.0.0.0:$PORT. Cloud Run sets PORT=8080 by default; you can change the container port:

gcloud run services describe <SERVICE> --region <REGION> \
  --format="value(spec.template.spec.containers[0].ports[0].containerPort)"

Step 4: Reproduce the start locally with the same contract

docker run --rm -e PORT=8080 -p 8080:8080 gcr.io/my-prod-project/api:latest
curl -sS localhost:8080/ -o /dev/null -w '%{http_code}\n'

If it doesn’t listen on 8080 locally, it won’t in Cloud Run. This isolates port/crash bugs from the platform.

Step 5: Adjust timeout/probe or fix the image, then redeploy

# Give slow starts more room + CPU boost
gcloud run deploy api --image gcr.io/my-prod-project/api:latest --region us-central1 \
  --timeout=300 --cpu-boost --port=8080

For probe mismatches, point the startup probe at a path the app actually serves, then redeploy.

Example Root Cause Analysis

A team adds a machine-learning model load to their service and the next deploy fails:

ERROR: (gcloud.run.deploy) Revision 'api-00007-abc' is not ready and cannot serve traffic.
 The user-provided container failed to start and listen on the port ... PORT=8080 within the allocated timeout.

Logs show the app does eventually start, but late:

gcloud run services logs read api --region us-central1 --limit 30 | grep -iE 'loading|listening'

Loading model weights (1.3 GB)...
Server listening on 0.0.0.0:8080

The port and interface are correct, and there’s no crash — the model load just pushes the first listen past the startup probe window, so Cloud Run tears the revision down before it reports healthy. This is a startup-timeout problem, not a port bug.

Fix: enable startup CPU boost and give the startup probe more room so initialization finishes in time:

gcloud run deploy api --image gcr.io/my-prod-project/api:latest --region us-central1 \
  --cpu-boost --timeout=300 \
  --port=8080

(Plus configuring a startup probe with a higher failureThreshold/periodSeconds.) The revision now becomes ready and serves traffic.

Prevention Best Practices

Always read the PORT environment variable and bind 0.0.0.0:$PORT; never hardcode a port or bind to localhost.
Test the exact container contract locally (docker run -e PORT=8080 -p 8080:8080 ...) before deploying so port/crash bugs never reach Cloud Run.
Build images for linux/amd64 (or set the platform explicitly) so architecture mismatches don’t produce exec format error.
Move slow initialization behind a startup probe and enable startup CPU boost so legitimately slow boots aren’t killed by the timeout.
Fail fast and log clearly on missing config (DB URLs, secrets) so a crash-on-boot is obvious in logs rather than a vague timeout.
For triage, the free incident assistant can correlate the revision message with the listening port in your logs. More walkthroughs are in the GCP guides.

Quick Command Reference

# Revision status + exact failure message
gcloud run revisions describe <REVISION> --region <REGION> \
  --format="value(status.conditions[].type, status.conditions[].message)"

# Startup logs (port, interface, crashes)
gcloud run services logs read <SERVICE> --region <REGION> --limit 50

# What container port is configured?
gcloud run services describe <SERVICE> --region <REGION> \
  --format="value(spec.template.spec.containers[0].ports[0].containerPort)"

# Reproduce the start locally with the same contract
docker run --rm -e PORT=8080 -p 8080:8080 <IMAGE>
curl -sS localhost:8080/ -o /dev/null -w '%{http_code}\n'

# Redeploy with more startup room
gcloud run deploy <SERVICE> --image <IMAGE> --region <REGION> \
  --timeout=300 --cpu-boost --port=8080

# Inspect the startup probe
gcloud run services describe <SERVICE> --region <REGION> \
  --format="yaml(spec.template.spec.containers[0].startupProbe)"

Conclusion

A Cloud Run “Container failed to start” error means the revision never listened on the expected PORT (or passed its startup probe) within the timeout. The usual root causes:

The app listens on the wrong port instead of reading PORT.
The app binds 127.0.0.1 rather than 0.0.0.0.
The container crashes on startup (missing config, exception).
Initialization is slower than the startup timeout.
A bad entrypoint or an amd64/arm64 architecture mismatch.
A configured health/startup probe targets a path the app doesn’t serve.

Read the revision’s status message for the expected port, check the logs for the listening line or crash, and reproduce the same PORT contract locally — the fix is almost always honoring PORT/0.0.0.0, fixing a boot crash, or giving slow startups more room.

GCP Error Guide: 'Container failed to start' Cloud Run Revision Errors