CrashLoopBackOff Triage From Describe and Logs Prompt

Walk a CrashLoopBackOff pod from kubectl describe, previous-container logs, and exit codes to a precise root cause and fix, instead of blindly restarting it.

Target user: Kubernetes operators and on-call SREs
Difficulty: Intermediate
Tools: Claude, ChatGPT

The prompt

You are a senior Kubernetes SRE triaging a pod stuck in CrashLoopBackOff. Work only from the evidence I give you and reason from container exit codes, not guesses.

I will provide:
- `kubectl describe pod <name>` (events, restart count, last state, exit code, reason)
- `kubectl logs <pod> -c <container> --previous` (the crashed container's last output)
- The container spec: command/args, env, resource requests/limits, probes, volume mounts
- Optionally the image tag and any recent change (deploy, config, secret rotation)

Your job:

1. **Read the exit signal first** — interpret the exit code and last-state reason (137 = SIGKILL, most often `OOMKilled`; 143 = SIGTERM; 127 = command not found; 126 = command not executable; 1 or 2 = application error) and state which category of failure this is.
2. **Correlate with logs** — tie the `--previous` log output to the exit code; quote the specific line that shows the failure (panic, missing env var, connection refused, permission denied).
3. **Rule out probe kills** — check whether a liveness probe is killing a slow-starting container before it is ready, and whether a startupProbe is missing.
4. **Check config and secrets** — flag missing/renamed env vars, ConfigMap/Secret keys, or mount paths that would crash startup.
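5. **Confirm or rule out OOM** — if the exit code is 137, check whether the last-state reason is `OOMKilled`; if it is, compare the memory limit to observed usage and recommend a corrected limit. If it is not, treat the SIGKILL as coming from elsewhere (for example a probe kill or eviction) and say so.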
6. **Give the fix** — exact manifest or command change, plus the one `kubectl` command to confirm recovery.

Output: (a) root cause in one sentence, (b) the evidence line that proves it, (c) the fix, (d) how to verify. If evidence is missing, name the exact command to run next.
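
Step 1 of the prompt can also be applied by hand before any evidence goes to the model. The following is a minimal Python sketch of that exit-code reading, not part of the prompt itself. The function name `classify_exit` and the category labels are hypothetical, and the signal table covers only a few common cases. It relies on the usual convention that an exit code above 128 means the process was terminated by signal number (code - 128).

```python
# Hypothetical helper: maps a container exit code (plus the optional
# last-state reason from `kubectl describe pod`) to a coarse failure category.
SIGNAL_NAMES = {6: "SIGABRT", 9: "SIGKILL", 11: "SIGSEGV", 15: "SIGTERM"}


def classify_exit(exit_code, reason=""):
    """Return a coarse failure category for a terminated container."""
    # The reason field is more specific than the code, so check it first.
    if reason == "OOMKilled":
        return "oom"
    if exit_code == 0:
        # A clean exit still loops when the restart policy is Always.
        return "clean-exit"
    if exit_code in (126, 127):
        # 126 = command not executable, 127 = command not found.
        return "bad-command"
    if exit_code > 128:
        signal_number = exit_code - 128
        return "signal:" + str(SIGNAL_NAMES.get(signal_number, signal_number))
    # Everything else (1, 2, ...) is the application reporting its own error.
    return "app-error"
```

For example, `classify_exit(137, "OOMKilled")` yields `"oom"`, while `classify_exit(137)` with no reason yields `"signal:SIGKILL"`, which is the case where a probe kill or eviction is still on the table.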

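The exit code and reason that the prompt asks for are also available in structured form from `kubectl get pod <name> -o json`, under `status.containerStatuses[].lastState.terminated`. Below is a small sketch, assuming that JSON has already been loaded into a Python dict. The helper name `last_terminations` and the output keys are illustrative, not an established tool.

```python
# Hypothetical helper: pulls the last termination record for each container
# out of a pod object as returned by `kubectl get pod <name> -o json`.
def last_terminations(pod):
    """Return one summary row per container that has a previous termination."""
    rows = []
    statuses = pod.get("status", {}).get("containerStatuses", [])
    for status in statuses:
        terminated = status.get("lastState", {}).get("terminated")
        if terminated is None:
            # This container has not crashed (or has no recorded last state).
            continue
        rows.append({
            "container": status.get("name", ""),
            "restarts": status.get("restartCount", 0),
            "exit_code": terminated.get("exitCode"),
            "reason": terminated.get("reason", ""),
        })
    return rows
```

One way to use it is to pipe the `kubectl` output into a script that calls `last_terminations(json.load(sys.stdin))`, then paste the resulting rows into the prompt alongside the `--previous` logs.
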
Related prompts

Kubernetes OOMKilled Memory Limit Diagnosis Prompt

Kubernetes Liveness, Readiness & Startup Probe Design Prompt
