You are a senior Kubernetes platform engineer with deep experience running production clusters on EKS, GKE, AKS, and bare-metal k8s. I will share kubectl output for a pod that's not behaving the way I want.

Your job:

1. Identify the pod's current lifecycle state and the **first** thing that's wrong (don't chase symptoms; find the cause).
2. Map the failure to one of these buckets:
   - **Scheduling** (Pending, Unschedulable, taint/toleration mismatch, resource quota)
   - **Image pull** (ImagePullBackOff, ErrImagePull, registry auth)
   - **Startup** (CrashLoopBackOff, OOMKilled, init container failure)
   - **Probe-driven** (liveness/readiness probe killing the pod)
   - **Networking** (pod can't reach a Service, DNS, or external host)
   - **Storage** (PVC unbound, mount failure, ReadWriteOnce conflict)
   - **Eviction** (node pressure, preempted by higher-priority pod)
3. Quote the **specific output line(s)** that support your diagnosis. Don't paraphrase.
4. Suggest the next 2–3 diagnostic commands. Label anything destructive (delete, drain, scale to zero, patch) as **DANGEROUS** with the blast radius.
5. Before suggesting a fix, confirm the root cause with me. Ask follow-up questions if needed.

Pod manifest (or relevant fragment):

```yaml
[PASTE]
```

`kubectl describe pod <name> -n <ns>`:

```
[PASTE]
```

Logs (current + previous container if relevant):

```
[PASTE]
```

Cluster context:
- Kubernetes version: [e.g. 1.32]
- Node type / size: [e.g. EKS t3.xlarge, bare-metal 16-core]
- Namespace ResourceQuota: [if any]
- Recent changes: [deployment, image tag, node pool resize, etc.]

Why this prompt works

Kubernetes failures often look alike on the surface (the pod won’t run, or it’s running but doing the wrong thing) but have radically different root causes. This prompt forces a state-machine view of the pod lifecycle and requires the model to quote the actual output lines that support its diagnosis rather than paraphrase them.
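To make the bucket idea concrete, here is a minimal sketch of the mapping the prompt asks the model to perform. The function name `classify_reason` is hypothetical, and the list of reason strings is illustrative rather than exhaustive; it assumes you pass in a single reason as shown in the STATUS column of `kubectl get pod` or the Reason column of the events list.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map one status/reason string to a bucket from the prompt.
# The reason list is a small, non-exhaustive sample.
classify_reason() {
  case "$1" in
    Unschedulable|FailedScheduling)                  echo "Scheduling" ;;
    ImagePullBackOff|ErrImagePull|InvalidImageName)  echo "Image pull" ;;
    CrashLoopBackOff|OOMKilled|Init:Error|Init:CrashLoopBackOff)
                                                     echo "Startup" ;;
    Unhealthy)                                       echo "Probe-driven" ;;
    FailedMount|FailedAttachVolume)                  echo "Storage" ;;
    Evicted|Preempted)                               echo "Eviction" ;;
    # Networking failures rarely surface as a single reason string, and the
    # bare Pending phase is ambiguous, so both fall through to Unknown.
    *)                                               echo "Unknown" ;;
  esac
}
```

For example, `classify_reason ErrImagePull` prints `Image pull`. Anything that lands in `Unknown` is exactly the case where pasting the full describe output into the prompt matters most.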

How to use it

Always include kubectl describe pod output — it’s where the events list lives, and the events list is where root cause usually hides.
Include the manifest, not screenshots. The model needs to compare requested resources to observed behavior.
For OOMKilled / eviction diagnoses, also paste kubectl top pod and node-level pressure metrics if available.
Keep the conversation alive: paste new output as you gather it. Long-context models retain the diagnostic flow.
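For the node-level pressure check mentioned above, one possible approach is to print each node condition as `TYPE=STATUS` and keep only the pressure conditions that are currently `True`. The filter name `pressure_conditions` is an assumption made for this sketch; the condition types it matches (MemoryPressure, DiskPressure, PIDPressure) are the standard node conditions.

```shell
#!/usr/bin/env bash
# Hypothetical filter: reads TYPE=STATUS lines on stdin and prints the
# pressure conditions whose status is True.
pressure_conditions() {
  awk -F= '$1 ~ /Pressure$/ && $2 == "True" { print $1 }'
}

# Usage (needs cluster access; <node> is a placeholder):
#   kubectl get node <node> \
#     -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}' \
#     | pressure_conditions
```

Empty output means the node reports no pressure right now; it does not rule out pressure at the time of an earlier eviction, so the events list is still worth pasting.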

What to paste

kubectl get pod <name> -n <ns> -o yaml | head -100
kubectl describe pod <name> -n <ns>
kubectl logs <name> -n <ns> --tail=200
kubectl logs <name> -n <ns> --previous --tail=200 2>/dev/null || true
kubectl get events -n <ns> --sort-by='.lastTimestamp' | tail -30
kubectl top pod <name> -n <ns> 2>/dev/null || true
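If you run these often, the commands above can be wrapped in one function that labels each section, so the whole bundle can be pasted in one go. This is a sketch, not a hardened tool: the function name `collect_pod_diagnostics` and the `KUBECTL` override variable are assumptions introduced here.

```shell
#!/usr/bin/env bash
# Sketch: run the commands from "What to paste" in order, with section labels.
# KUBECTL is an assumed override hook; it defaults to the real kubectl.
collect_pod_diagnostics() {
  local pod="$1" ns="$2" kc="${KUBECTL:-kubectl}"
  if [ -z "$pod" ] || [ -z "$ns" ]; then
    echo "usage: collect_pod_diagnostics <pod> <namespace>" >&2
    return 2
  fi
  echo "===== manifest ====="
  "$kc" get pod "$pod" -n "$ns" -o yaml | head -100
  echo "===== describe ====="
  "$kc" describe pod "$pod" -n "$ns"
  echo "===== logs (current) ====="
  "$kc" logs "$pod" -n "$ns" --tail=200
  echo "===== logs (previous) ====="
  # No previous container exists before the first restart, so ignore errors.
  "$kc" logs "$pod" -n "$ns" --previous --tail=200 2>/dev/null || true
  echo "===== events ====="
  "$kc" get events -n "$ns" --sort-by='.lastTimestamp' | tail -30
  echo "===== top ====="
  # kubectl top needs a metrics API; ignore errors when it is unavailable.
  "$kc" top pod "$pod" -n "$ns" 2>/dev/null || true
}
```

A typical call would be `collect_pod_diagnostics <name> <ns> > pod-diagnostics.txt`, after which the file contents go into the prompt's paste slots.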

Common patterns this catches

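Pod Pending forever → almost always a scheduling failure. Check the Events: section of kubectl describe for “0/N nodes are available.”
CrashLoopBackOff → look at --previous logs; the current logs cover only the newest container instance, which may not have failed yet.
Pod running but no traffic → often a failing readiness probe, which removes the pod from Service endpoints without restarting it. Check kubectl describe for the Ready condition and probe failure events.
Container exits 137 → the process received SIGKILL (128 + 9). This is usually OOMKilled, but confirm that Last State shows Reason: OOMKilled, because a failed liveness probe can also end in a kill. If it is OOM, either raise the limit or fix the leak.
Error: ImagePullBackOff → image name or tag typo, missing registry secret, or rate limit.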
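The exit-code pattern above follows the common shell convention that codes above 128 mean "terminated by signal (code minus 128)". A small sketch of that arithmetic follows; the function name `explain_exit_code` is hypothetical, and the convention is only a hint, since an application can also return such a code on its own.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decode a container exit code using the 128+signal
# convention. Treat the result as a hint, not proof of the cause.
explain_exit_code() {
  local code="$1"
  if [ "$code" -eq 0 ]; then
    echo "exit 0: clean exit"
  elif [ "$code" -gt 128 ]; then
    echo "exit $code: killed by signal $((code - 128))"
  else
    echo "exit $code: application error"
  fi
}

# To read the last exit code and reason of the first container
# (needs cluster access; <name> and <ns> are placeholders):
#   kubectl get pod <name> -n <ns> \
#     -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}'
#   kubectl get pod <name> -n <ns> \
#     -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
```

`explain_exit_code 137` prints `exit 137: killed by signal 9`, and 143 decodes to signal 15 (SIGTERM). The `reason` field is what distinguishes an OOM kill from other SIGKILLs.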



