Kubernetes Error Guide: 'ProgressDeadlineExceeded' Stalled Deployment Rollout

Exact Error Message

kubectl rollout status fails, and the Deployment carries a Progressing condition set to False with reason ProgressDeadlineExceeded:

$ kubectl rollout status deployment/web
Waiting for deployment "web" rollout to finish: 1 out of 3 new replicas have been updated...
error: deployment "web" exceeded its progress deadline

$ kubectl describe deployment web | grep -A4 Conditions
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      True    MinimumReplicasAvailable
  Progressing    False   ProgressDeadlineExceeded

You may also see it in the condition message: ReplicaSet "web-6d8c9f7b5c" has timed out progressing.

What the Error Means

A Deployment counts as progressing while it is creating a new ReplicaSet, scaling it up, scaling old ReplicaSets down, or seeing new pods become Ready or Available. The progressDeadlineSeconds field (default 600) is a watchdog: if the rollout makes none of that progress within the window, the controller sets Progressing=False with reason ProgressDeadlineExceeded. Each step of progress restarts the clock, so the deadline bounds the time between steps, not the length of the whole rollout.

The critical insight: ProgressDeadlineExceeded is a symptom, never a root cause. It only tells you the rollout stopped making progress. The real failure is almost always one level down, in the pods the new ReplicaSet is trying to start: they are crashlooping, failing a readiness probe, unable to pull an image, or unschedulable. If the new ReplicaSet has created no pods at all (for example, because a ResourceQuota rejects them), read the ReplicaSet's events instead. Your job is to find that failure. Kubernetes does not roll back when the deadline passes; old pods keep serving (so the app may still be Available) while the new ones never come up.
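To detect this state in a script rather than by eye, the reason can be read straight from the Deployment's status conditions. This is a minimal sketch: the function names progressing_reason and deadline_exceeded are made up for illustration, and it assumes kubectl is already pointed at the right cluster and namespace.

```shell
# Print the reason of the Deployment's Progressing condition.
# Usage: progressing_reason <deployment-name>
progressing_reason() {
  kubectl get deployment "$1" \
    -o jsonpath='{.status.conditions[?(@.type=="Progressing")].reason}'
}

# Succeed only when the rollout has tripped its progress deadline.
# (Hypothetical helper name, not part of kubectl.)
deadline_exceeded() {
  [ "$(progressing_reason "$1")" = "ProgressDeadlineExceeded" ]
}
```

For example, `deadline_exceeded web && echo "rollout stalled"` prints the message only for a stalled rollout.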

Common Causes

CrashLoopBackOff — the new container starts then exits, never reaching Ready.
Failing readiness probe — the container runs but the probe never passes, so the pod stays 0/1.
ImagePullBackOff — the new image cannot be pulled, so the pod never starts.
FailedScheduling — the new pods are Pending (insufficient resources, taints, affinity).
Resource limits — too-low memory limits cause OOMKills before the pod is Ready.
Bad rollout — a config/env/secret change makes the new revision non-functional.
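Of the causes above, an OOMKill is the easiest to miss because the pod usually just shows CrashLoopBackOff. One way to check is to read each container's last termination reason from the pod status. The sketch below uses hypothetical function names and assumes the pod has restarted at least once, so that lastState is populated.

```shell
# Print "<container>\t<last termination reason>" for each container in a pod.
# Usage: last_termination_reasons <pod-name>
last_termination_reasons() {
  kubectl get pod "$1" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.lastState.terminated.reason}{"\n"}{end}'
}

# Succeed if any container in the pod was last terminated with OOMKilled.
was_oom_killed() {
  last_termination_reasons "$1" |
    awk -F'\t' '$2 == "OOMKilled" { found = 1 } END { exit !found }'
}
```

If `was_oom_killed <NEW_POD>` succeeds, raising the memory limit (or fixing the leak) is a more likely fix than touching the probe or the deadline.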

How to Reproduce the Error

Roll out an image tag that does not exist, with a short deadline:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  progressDeadlineSeconds: 60
  replicas: 3
  selector:
    matchLabels: { app: web }
  template:
    metadata:
      labels: { app: web }
    spec:
      containers:
        - name: web
          image: nginx:does-not-exist

kubectl apply -f web.yaml
kubectl rollout status deployment/web

After roughly 60s, kubectl rollout status fails with "exceeded its progress deadline" because the new pods are stuck in ImagePullBackOff and never become Ready.

Diagnostic Commands

# Confirm the condition and reason
kubectl describe deployment web | grep -A6 Conditions

# Which ReplicaSet is the new (stuck) one?
kubectl get rs -l app=web -o wide

# THE KEY STEP: look at the actual pods of the new ReplicaSet
kubectl get pods -l app=web
kubectl get pods -l app=web -o wide

# Why is a specific new pod not Ready?
kubectl describe pod <NEW_POD> | grep -A12 Events

# Logs from a crashlooping container (and the previous attempt)
kubectl logs <NEW_POD>
kubectl logs <NEW_POD> --previous

Do not stop at the Deployment. kubectl describe deployment only says the deadline passed; kubectl get pods and kubectl describe pod reveal the genuine failure.
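The app=web selector lists old and new pods together. To narrow the list to the stuck revision, one option is to match the Deployment's deployment.kubernetes.io/revision annotation against its ReplicaSets and then filter pods by the pod-template-hash label. A sketch under those assumptions; hash_for_revision and new_revision_pods are illustrative names, not kubectl features.

```shell
# Read "<pod-template-hash> <revision>" lines on stdin and print the hash
# whose revision equals $1.
hash_for_revision() {
  awk -v rev="$1" '$2 == rev { print $1; exit }'
}

# List only the pods that belong to the Deployment's current revision.
# Usage: new_revision_pods <deployment-name> <label-selector>
new_revision_pods() {
  nrp_rev=$(kubectl get deployment "$1" \
    -o jsonpath='{.metadata.annotations.deployment\.kubernetes\.io/revision}')
  nrp_hash=$(kubectl get rs -l "$2" \
    -o jsonpath='{range .items[*]}{.metadata.labels.pod-template-hash}{" "}{.metadata.annotations.deployment\.kubernetes\.io/revision}{"\n"}{end}' |
    hash_for_revision "$nrp_rev")
  if [ -z "$nrp_hash" ]; then
    echo "no ReplicaSet found for revision $nrp_rev" >&2
    return 1
  fi
  kubectl get pods -l "$2,pod-template-hash=$nrp_hash" -o wide
}
```

Calling `new_revision_pods web app=web` should then show only the pods of the revision that is failing to roll out.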

Step-by-Step Resolution

1. Find the new ReplicaSet and its pods. The reason lives in the pods, so list them and read their STATUS:

kubectl get pods -l app=web

NAME                    READY   STATUS             RESTARTS   AGE
web-6d8c9f7b5c-q4n2t    0/1     ImagePullBackOff   0          2m
web-6d8c9f7b5c-r8k3m    0/1     ImagePullBackOff   0          2m
web-7f4d8c9b6a-xab12    1/1     Running            0          1h   # old, still serving

The STATUS column classifies the real failure: ImagePullBackOff, CrashLoopBackOff, Pending, or Running but 0/1 (probe failure).

2. Route to the underlying error:

ImagePullBackOff -> see the ImagePullBackOff guide.
Pending -> see the FailedScheduling guide.
CrashLoopBackOff -> read kubectl logs <pod> --previous for the exit cause.
Running but 0/1 -> a readiness probe is failing (next step).
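The routing above can be sketched as a small shell function that maps a pod's READY and STATUS columns to the next step. The function names and message strings are illustrative only, and ErrImagePull is grouped with ImagePullBackOff because it is the same failure before back-off starts.

```shell
# Map one pod's READY and STATUS columns to the next diagnostic step.
# Usage: next_step <ready> <status>, e.g. next_step 0/1 ImagePullBackOff
next_step() {
  ready=$1
  status=$2
  case "$status" in
    ImagePullBackOff|ErrImagePull) echo "image pull: see the ImagePullBackOff guide" ;;
    Pending)                       echo "scheduling: see the FailedScheduling guide" ;;
    CrashLoopBackOff)              echo "crash: read kubectl logs <pod> --previous" ;;
    Running)
      # READY is "<ready containers>/<total>"; unequal halves mean not Ready.
      if [ "${ready%/*}" != "${ready#*/}" ]; then
        echo "readiness: inspect the readiness probe"
      else
        echo "healthy"
      fi ;;
    *) echo "other: run kubectl describe pod <pod>" ;;
  esac
}

# Classify every pod line from `kubectl get pods --no-headers` on stdin.
classify_pods() {
  while read -r name ready status _rest; do
    printf '%s\t%s\n' "$name" "$(next_step "$ready" "$status")"
  done
}
```

Typical use would be `kubectl get pods -l app=web --no-headers | classify_pods`.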

3. Diagnose a failing readiness probe. A Running pod stuck at 0/1 with repeated Readiness probe failed events means the probe never passes:

kubectl describe pod <NEW_POD> | grep -i 'readiness\|probe'

Warning  Unhealthy  3s (x18 over 1m)  kubelet  Readiness probe failed: HTTP probe failed with statuscode: 503

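Fix the probe path or port so it matches what the app actually serves. For a slow-starting app, initialDelaySeconds holds off the first probe and periodSeconds sets how often it retries. A readiness probe never restarts the container, so the pod only has to start passing before the progress deadline runs out: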

readinessProbe:
  httpGet: { path: /healthz, port: 8080 }
  initialDelaySeconds: 15
  periodSeconds: 5
  failureThreshold: 6
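When the probe keeps failing, it can help to request the endpoint yourself and compare the status code with what the kubelet accepts: an httpGet probe passes on any code from 200 up to, but not including, 400. The sketch below assumes curl is installed locally and uses hypothetical function names. Note that a port-forward reaches the app over the pod's loopback interface, so it may not reproduce the kubelet's request exactly.

```shell
# Succeed if an HTTP status code would pass a kubelet httpGet probe
# (any code >= 200 and < 400).
http_probe_passes() {
  [ "$1" -ge 200 ] 2>/dev/null && [ "$1" -lt 400 ]
}

# Forward a local port to the pod, request the probe path once, and report.
# Usage: check_probe <pod> <port> <path>
check_probe() (
  kubectl port-forward "pod/$1" "$2:$2" >/dev/null 2>&1 &
  pf_pid=$!
  trap 'kill "$pf_pid" 2>/dev/null' EXIT
  sleep 2  # give the port-forward a moment to start listening
  code=$(curl -s -o /dev/null -w '%{http_code}' "http://127.0.0.1:$2$3")
  if http_probe_passes "$code"; then
    echo "probe would pass ($code)"
  else
    echo "probe would fail ($code)"
  fi
)
```

With the probe shown above, the call would be `check_probe <NEW_POD> 8080 /healthz`.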

4. Roll back if the new revision is simply bad. Old pods are still serving, so revert immediately and investigate calmly:

kubectl rollout undo deployment/web
kubectl rollout status deployment/web
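kubectl rollout undo returns to the previous revision. If that one is also suspect, a specific revision can be chosen from the rollout history. A minimal sketch with illustrative function names:

```shell
# Show recorded revisions so a known-good one can be picked.
# Usage: show_history <deployment-name>
show_history() {
  kubectl rollout history "deployment/$1"
}

# Roll back to a specific revision number, then wait for it to finish.
# Usage: rollback_to <deployment-name> <revision>
rollback_to() {
  case "$2" in
    ''|*[!0-9]*) echo "revision must be a number" >&2; return 2 ;;
  esac
  kubectl rollout undo "deployment/$1" --to-revision="$2" &&
    kubectl rollout status "deployment/$1"
}
```

For example, run `show_history web` to find the revision, then `rollback_to web 3`.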

5. Re-roll after fixing the root cause. Apply the corrected manifest (good image, working probe, adequate resources) and watch the rollout finish:

kubectl apply -f web.yaml
kubectl rollout status deployment/web

6. Adjust progressDeadlineSeconds only if the app legitimately needs longer to start (large warmup, migrations). Raising it does not fix a broken pod — it just gives a healthy-but-slow rollout more time:

spec:
  progressDeadlineSeconds: 900
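The same change can be made on a live Deployment with kubectl patch, which is handy for a quick test before editing the manifest. This is a sketch with a hypothetical wrapper name. Keep in mind that the next kubectl apply of an unchanged manifest will put the old value back.

```shell
# Set progressDeadlineSeconds on a live Deployment.
# Usage: set_progress_deadline <deployment-name> <seconds>
set_progress_deadline() {
  case "$2" in
    ''|*[!0-9]*) echo "seconds must be a whole number" >&2; return 2 ;;
  esac
  kubectl patch deployment "$1" \
    -p "{\"spec\":{\"progressDeadlineSeconds\":$2}}"
}
```

For example: `set_progress_deadline web 900`.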

Prevention and Best Practices

Treat ProgressDeadlineExceeded as a pointer, not a diagnosis — always drill into the new ReplicaSet’s pods.
Set readiness probes that reflect real readiness, and make sure a slow-starting app can pass its probe well inside progressDeadlineSeconds, so warmup alone does not trip the deadline.
Verify image tags exist in CI before deploying, so rollouts never stall on ImagePullBackOff.
Keep progressDeadlineSeconds realistic for the workload; the 600s default is fine for most, longer for migration-heavy starts.
Gate rollouts with kubectl rollout status in CD pipelines so a stalled deploy fails the pipeline instead of silently leaving old pods serving.
Keep enough cluster capacity that new pods schedule promptly; a Pending new ReplicaSet trips the deadline too.
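The pipeline gate mentioned above might look like the following sketch. The function name gate_rollout and its argument order are assumptions for illustration; the --timeout flag caps how long the pipeline step itself waits, independently of progressDeadlineSeconds.

```shell
# Gate a pipeline stage on the rollout. On failure, print the pod state
# and return non-zero so the pipeline fails.
# Usage: gate_rollout <deployment-name> <label-selector> <timeout>
gate_rollout() {
  if kubectl rollout status "deployment/$1" --timeout="$3"; then
    echo "rollout of $1 completed"
    return 0
  fi
  echo "rollout of $1 did not complete; pod state follows" >&2
  kubectl get pods -l "$2" -o wide >&2
  return 1
}
```

A pipeline step could then run `gate_rollout web app=web 120s` right after `kubectl apply`. Whether to follow a failure with an automatic `kubectl rollout undo` is a policy choice left out of this sketch.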

Related Errors

ImagePullBackOff: a common underlying cause of a stalled rollout.
FailedScheduling: when the new pods are Pending.
PersistentVolumeClaim not bound: a stateful rollout that waits on storage.

Frequently Asked Questions

Does ProgressDeadlineExceeded roll back my deployment? No. Kubernetes does not auto-rollback. The old ReplicaSet keeps running its pods (so the app may stay Available) while the new pods stay broken. You must run kubectl rollout undo yourself or fix the new revision.

My app shows Available=True but Progressing=False. Which matters? Both. Available=True means enough old pods are still serving traffic, so users are fine for now. Progressing=False with reason ProgressDeadlineExceeded means the new version never finished rolling out. The change you deployed is not live, so investigate before assuming success.

Where is the real reason — the Deployment or the pods? The pods. describe deployment only reports that the deadline elapsed. Run kubectl get pods for the new ReplicaSet and kubectl describe pod / kubectl logs --previous to find the genuine failure.

Should I just increase progressDeadlineSeconds to make the error go away? Only if the pods are genuinely healthy but slow to become Ready. If pods are crashlooping or unschedulable, a larger deadline only delays the same failure. Fix the pod first.

Why did the rollout stall when the image and probe both look fine? Check scheduling. If the new ReplicaSet’s pods are Pending due to insufficient resources or taints, they never become Ready and the deadline trips — exactly the FailedScheduling path.
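To confirm the scheduling path quickly, you can count Pending pods and list the scheduler's own failure events. A small sketch with illustrative function names. It assumes the default `kubectl get pods` column order, with STATUS in the third column.

```shell
# List FailedScheduling events in the current namespace, oldest first.
scheduling_failures() {
  kubectl get events --field-selector reason=FailedScheduling \
    --sort-by=.metadata.creationTimestamp
}

# Count Pending pods from `kubectl get pods --no-headers` output on stdin.
count_pending() {
  awk '$3 == "Pending" { n++ } END { print n + 0 }'
}
```

For example, `kubectl get pods -l app=web --no-headers | count_pending` prints the number of Pending pods, and `scheduling_failures` shows why the scheduler rejected them.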
