CrashLoopBackOff Debugging Prompt
Drill into a specific CrashLoopBackOff failure — application crash, missing config, init container failure, or probe-driven kill — and find the actual cause.
- Target user: Kubernetes admins debugging pods stuck in CrashLoopBackOff
- Difficulty: Intermediate
- Tools: Claude, ChatGPT, Cursor
The prompt
You are a senior Kubernetes SRE who has root-caused hundreds of CrashLoopBackOff incidents. You know the difference between symptom and cause.

A pod is in CrashLoopBackOff. Your job:

1. Determine **which of these specific scenarios** is happening:
   - **Application crash on startup**: the container exits non-zero before getting healthy. Look in `--previous` logs for the exception.
   - **Missing or wrong config**: ConfigMap/Secret value missing, malformed YAML, wrong env var, or volume mount path wrong.
   - **Init container failure**: main container never starts because init failed. Check `kubectl describe` for init container status.
   - **Liveness probe killing it**: container starts, probe fails, kubelet kills it, restart, repeat. Check probe config vs actual readiness time.
   - **OOMKilled in a loop**: limit too low or memory leak. Container exits with code 137. Check `lastState.terminated.reason`.
   - **Permission denied / read-only filesystem**: security context too strict for what the app needs to write.
   - **Image entrypoint wrong**: `CrashLoopBackOff` with `exit code 127` (command not found) or `126` (not executable).
2. Quote the exact log line or describe output that supports your diagnosis.
3. Recommend the targeted fix (config change, manifest patch, image rebuild). Show me the YAML diff.
4. If you need additional data (e.g., previous container logs, the actual entrypoint, file permissions inside the image), ask for it before guessing.

Pod:
- Name: [POD NAME]
- Namespace: [NS]
- Image: [IMAGE:TAG]
- Restart count: [N]

`kubectl describe pod`:
```
[PASTE: make sure 'Last State' and 'Events' sections are included]
```

Current container logs:
```
[PASTE]
```

PREVIOUS container logs (the one that crashed):
```
[PASTE: kubectl logs <pod> --previous]
```

Pod spec fragment (containers, env, volumeMounts):
```yaml
[PASTE]
```
Why this prompt works
“CrashLoopBackOff” is one of Kubernetes’s most overloaded pod states: it only says that a container keeps exiting and the kubelet is backing off between restarts, and the underlying cause could be any of a dozen different things. This prompt narrows down to the specific scenario first, then targets the fix.
How to use it
- Always grab `--previous` logs. The current container has barely started before crashing again. The previous container’s logs hold the actual error.
- Note the exit code in `lastState.terminated`. It’s a huge clue: 137 = SIGKILL (usually OOM; confirm that the reason is `OOMKilled`), 1 = generic app error, 126 = not executable, 127 = command not found, 139 = SIGSEGV.
- Check the restart count. A count of 12+ within an hour vs 12+ over 24 hours tells different stories: the first is a container that dies immediately on every start, the second is one that runs for a while before failing.
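The exit-code clues above can be kept handy as a small lookup. This is a minimal sketch in POSIX shell; the function name `explain_exit_code` is hypothetical, and the entries for 0 and 143 plus the generic "128 + signal number" fallback are additions beyond the codes listed above, based on the usual shell convention for signal exits.

```shell
# Map a container exit code to its most likely meaning.
# Codes above 128 conventionally mean "killed by signal (code - 128)".
explain_exit_code() {
  code="$1"
  case "$code" in
    0)   echo "clean exit" ;;
    1)   echo "generic app error" ;;
    126) echo "not executable" ;;
    127) echo "command not found" ;;
    137) echo "SIGKILL, usually OOM" ;;
    139) echo "SIGSEGV" ;;
    143) echo "SIGTERM" ;;
    *)
      # Non-numeric input makes the test fail and falls through to the else branch.
      if [ "$code" -gt 128 ] 2>/dev/null; then
        echo "killed by signal $((code - 128))"
      else
        echo "application-defined exit code"
      fi
      ;;
  esac
}
```

For example, `explain_exit_code 137` prints a hint to go check whether `lastState.terminated.reason` is `OOMKilled` before touching memory limits.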
What to paste
kubectl describe pod <name> -n <ns>
kubectl logs <name> -n <ns> --previous --tail=200
kubectl get pod <name> -n <ns> -o jsonpath='{.status.containerStatuses[*].lastState}' | jq
kubectl get pod <name> -n <ns> -o jsonpath='{.spec.containers[*].command}'
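To avoid running the four commands one by one, they can be wrapped in a single function whose output is redirected to a file and pasted into the prompt. This is a rough sketch, not a polished tool: the function name `collect_crashloop_data` and the `KUBECTL` override variable are hypothetical, introduced here only so the function can be dry-run without a cluster.

```shell
# Gather the outputs listed above under labelled headings.
# KUBECTL is a hypothetical override (e.g. KUBECTL=echo for a dry run);
# it defaults to the real kubectl binary.
collect_crashloop_data() {
  pod="$1"
  ns="$2"
  k="${KUBECTL:-kubectl}"

  echo "### kubectl describe pod"
  "$k" describe pod "$pod" -n "$ns"

  echo "### previous container logs"
  "$k" logs "$pod" -n "$ns" --previous --tail=200

  echo "### lastState"
  "$k" get pod "$pod" -n "$ns" -o jsonpath='{.status.containerStatuses[*].lastState}'
  echo

  echo "### container command"
  "$k" get pod "$pod" -n "$ns" -o jsonpath='{.spec.containers[*].command}'
  echo
}
```

Usage would look like `collect_crashloop_data my-pod my-namespace > crashloop.txt`. If the pod has no previous container yet, the `--previous` call prints an error to stderr and the remaining sections are still collected.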
Common patterns this catches
- `Exit code 137` with reason `OOMKilled` → memory limit exceeded. Raise limits or fix the leak.
- `Exit code 1` with a stack trace in previous logs → app bug. Patch the app, redeploy.
- `Liveness probe failed` events, typically with `Exit code 143` or `137` (or `0` if the app shuts down cleanly on SIGTERM) → probe is too aggressive for the app’s startup time. Increase `initialDelaySeconds` or add a `startupProbe`.
- `Exit code 127` → entrypoint command not found in the image. Wrong base image or wrong CMD/ENTRYPOINT.
- `Back-off restarting failed container` with no logs at all → init container probably failed silently. Check `kubectl describe` for init container status.
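Some of these patterns can be pre-screened mechanically before handing the output to the model. The sketch below reads saved `kubectl describe pod` output on stdin and prints a hint for each pattern it recognizes. The function name `triage_describe` is hypothetical, the regular expressions assume the usual `Reason:` / `Exit Code:` layout of describe output, and init container failures are deliberately left out because they are hard to detect reliably with a line-based grep.

```shell
# Scan `kubectl describe pod` output (stdin) for a few known crash patterns.
# Prints one hint per matching pattern; prints nothing when none match.
triage_describe() {
  input="$(cat)"

  # Helper: succeed if any line of the captured input matches the regex.
  match() { printf '%s\n' "$input" | grep -Eq "$1"; }

  if match 'Reason:[[:space:]]+OOMKilled'; then
    echo "oom: raise the memory limit or fix the leak"
  fi
  if match 'Exit Code:[[:space:]]+127[[:space:]]*$'; then
    echo "entrypoint: command not found in the image"
  fi
  if match 'Exit Code:[[:space:]]+126[[:space:]]*$'; then
    echo "entrypoint: command is not executable"
  fi
  if match 'Liveness probe failed'; then
    echo "probe: liveness probe is failing, compare its timing with startup time"
  fi
}
```

A typical invocation would be `kubectl describe pod <name> -n <ns> | triage_describe`. Treat the hints as a starting point for the prompt, not as a diagnosis on their own.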
Related prompts
- Kubernetes Pod Crash Diagnosis Prompt: Diagnose CrashLoopBackOff, OOMKilled, ImagePullBackOff, and stuck pods from kubectl output.
- Kubernetes Pod Troubleshooting Prompt: Diagnose any misbehaving pod (pending, evicted, networking-broken, storage-stuck, or just plain slow) with a structured AI walkthrough.
- Kubernetes YAML Security Review Checklist Prompt: AI-driven security review of Kubernetes manifests, covering privilege, capabilities, network exposure, secret handling, and admission-policy compliance.