Kubernetes Pod Lifecycle & Graceful Shutdown Prompt
Design and debug pod lifecycle — preStop hooks, terminationGracePeriodSeconds, SIGTERM handling, connection draining, readiness probe behavior on shutdown.
- Target user: Kubernetes engineers designing production workloads
- Difficulty: Intermediate
- Tools: Claude, ChatGPT
The prompt
You are a senior Kubernetes engineer who has debugged "users see 502 during deploys" too many times. You know that graceful shutdown requires correctly handling SIGTERM, draining connections, and coordinating with the readiness probe and Service endpoints.

I will provide:
- The workload (HTTP server, gRPC, message consumer)
- Current pod spec (preStop, terminationGracePeriodSeconds, probes)
- The symptom (in-flight requests dropped, 502s during rollouts, slow shutdown)
- App's signal handling behavior

Your job:

1. **Understand the shutdown sequence**:
   1. Pod marked for deletion (`deletionTimestamp` set); the **terminationGracePeriodSeconds** countdown starts here
   2. **Endpoints removed** from Service (kube-proxy updates iptables/IPVS), but this is ASYNC and runs in parallel with the steps below
   3. **preStop hook** runs (if defined); its runtime counts against the grace period
   4. **SIGTERM** sent to PID 1 of each container once preStop finishes
   5. App drains and exits within whatever is left of the grace period
   6. If not exited by then → **SIGKILL**
2. **The endpoint propagation race**:
   - Service endpoint removal is async; load balancers may still send traffic
   - **Apps that exit immediately on SIGTERM lose those in-flight requests**
   - Solution: preStop sleep (5-15s) gives kube-proxy time to update
3. **For HTTP servers**:
   - On SIGTERM: stop accepting NEW connections, drain existing
   - Set readinessProbe to fail → endpoints remove → no new traffic (slow path)
   - preStop sleep > readinessProbe failureThreshold × periodSeconds
4. **For SIGTERM handling**:
   - **PID 1 receives SIGTERM**, but the kernel applies no default action to PID 1: a shell running as PID 1 neither exits on SIGTERM nor forwards it to the app it started
   - Use `exec` in script: `exec myapp` so myapp is PID 1
   - Or use tini / dumb-init as PID 1 to forward signals
5. **For terminationGracePeriodSeconds**:
   - Default 30s
   - Set to (preStop time + drain time + buffer): for long-lived connections, may need 5+ min
   - kubectl delete with `--grace-period=0 --force` SKIPS this
6. **For preStop hook**:
   - Runs BEFORE SIGTERM
   - Can be `exec` (command) or `httpGet`
   - Common: `sleep 15` for endpoint propagation
   - Or: notify load balancer to drain
7. **For long-running workloads** (batch jobs, message consumers):
   - Save progress on shutdown
   - For at-least-once queues: ack only after work done
8. **For sidecar coordination**:
   - Regular sidecars receive SIGTERM at the same time as main; order matters
   - Native sidecars (1.28+ behind the `SidecarContainers` feature gate, on by default from 1.29) are stopped after the main containers: sidecars die last

Mark DESTRUCTIVE: `kubectl delete pod --grace-period=0 --force` (no graceful shutdown), removing preStop without verifying endpoint propagation, increasing terminationGracePeriodSeconds without bounded shutdown logic (pod hangs until the grace period expires).

---

Workload: [DESCRIBE]

Pod spec excerpt:
```yaml
[PASTE]
```

Symptom: [DESCRIBE: 502s, dropped requests, slow shutdown]

App signal handling: [DESCRIBE]
Why this prompt works
Pod shutdown is a series of races: endpoint propagation, SIGTERM handling, connection draining. Getting any one of them wrong drops user requests. This prompt walks the model through the sequence in order, so each race is checked explicitly.
How to use it
- Verify SIGTERM reaches your app (PID 1, signal forwarding).
- Add preStop sleep for endpoint propagation.
- Set terminationGracePeriodSeconds to (preStop + drain + buffer); the preStop hook runs inside the grace period.
- Test rolling restart under load.
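The timing rules in this checklist reduce to simple arithmetic, so they can be checked before a rollout. The following is a minimal sketch in Python; the function name, its parameters, and the 5-second default buffer are illustrative assumptions and not part of any Kubernetes API.

```python
def shutdown_budget(pre_stop_s, drain_timeout_s, grace_period_s,
                    probe_period_s=None, probe_failure_threshold=None,
                    buffer_s=5):
    """Return a list of problems; an empty list means the timings add up.

    All values are in seconds. buffer_s is an assumed safety margin.
    """
    problems = []

    # The grace period starts at deletion, so it must cover preStop AND drain.
    needed = pre_stop_s + drain_timeout_s + buffer_s
    if grace_period_s < needed:
        problems.append(
            f"grace period {grace_period_s}s < preStop {pre_stop_s}s + "
            f"drain {drain_timeout_s}s + buffer {buffer_s}s = {needed}s"
        )

    # If the readiness probe is the path that removes the endpoint,
    # the preStop sleep has to outlast it.
    if probe_period_s is not None and probe_failure_threshold is not None:
        readiness_window = probe_period_s * probe_failure_threshold
        if pre_stop_s <= readiness_window:
            problems.append(
                f"preStop {pre_stop_s}s <= readiness window {readiness_window}s "
                f"(periodSeconds x failureThreshold)"
            )
    return problems
```

For example, `shutdown_budget(15, 30, 60, probe_period_s=5, probe_failure_threshold=1)` returns an empty list, while lowering the grace period to 30 reports that 30s cannot cover the 50s that is needed.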
Useful commands
# Test signal handling
kubectl exec <pod> -- kill -TERM 1   # send SIGTERM to PID 1 (needs a kill binary in the image)
kubectl logs -f <pod>                # in a second terminal: confirm the app logged the signal

# Measure shutdown time
time kubectl delete pod <pod>        # waits for deletion; honors spec.terminationGracePeriodSeconds

# Test with load: run wrk or hey during a rolling restart
hey -c 50 -z 30s http://svc.example.com/ &
kubectl rollout restart deploy/web

# Check current settings
kubectl get pod <pod> -o yaml | yq '.spec.containers[].lifecycle, .spec.terminationGracePeriodSeconds'

# Watch endpoints during shutdown
kubectl get endpoints <svc> -w
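When `hey` or `wrk` is not available, the same load check can be approximated with a small poller that tallies status codes while the rollout runs. This is a rough sketch using only the Python standard library; the function names, the 0.1-second interval, and the choice to count 5xx responses and connection errors as failures are assumptions made for illustration.

```python
import time
import urllib.error
import urllib.request
from collections import Counter


def fetch_status(url, timeout=2.0):
    """Return the HTTP status code, or the string 'error' if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code      # 4xx/5xx responses still carry a status code
    except (urllib.error.URLError, OSError):
        return "error"       # connection refused, reset, or timed out


def probe(url, duration_s=30.0, interval_s=0.1, fetch=fetch_status):
    """Poll url sequentially for duration_s seconds and count each outcome."""
    counts = Counter()
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        counts[fetch(url)] += 1
        time.sleep(interval_s)
    return counts


def failure_rate(counts):
    """Fraction of requests that failed (5xx or no response at all)."""
    total = sum(counts.values())
    bad = sum(n for status, n in counts.items()
              if status == "error" or status >= 500)
    return bad / total if total else 0.0
```

Start `probe("http://svc.example.com/")` in one terminal, trigger `kubectl rollout restart deploy/web` in another, and print `failure_rate(...)` on the result. A sequential poller produces far less load than `hey -c 50`, so treat a clean run as weaker evidence.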
Patterns
HTTP server with proper drain
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: web              # selector and labels added so the manifest applies; rename to suit
  template:
    metadata:
      labels:
        app: web
    spec:
      terminationGracePeriodSeconds: 60   # preStop sleep (15s) + drain + buffer
      containers:
        - name: app
          image: myapp:v1
          ports: [{ containerPort: 8080 }]
          lifecycle:
            preStop:
              exec:
                command:
                  - /bin/sh
                  - -c
                  - "sleep 15"   # give endpoints time to update
          readinessProbe:
            httpGet:
              path: /healthz
              port: 8080
            periodSeconds: 5
            failureThreshold: 1
App on SIGTERM:
- Stop accepting new connections (close listener)
- Drain in-flight requests (with timeout)
- Exit cleanly
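The three steps above can be sketched with Python's standard-library HTTP server. This is an illustration under assumptions, not a production server: the class names, the `serve` helper, and port 8080 are made up for the example, and the drain here has no timeout of its own, so it relies on terminationGracePeriodSeconds as the upper bound.

```python
import signal
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the example quiet


class DrainingServer(ThreadingHTTPServer):
    # ThreadingHTTPServer defaults to daemon threads, which are abandoned at
    # exit. Non-daemon threads plus block_on_close make server_close() wait
    # for requests that are still in flight.
    daemon_threads = False
    block_on_close = True


def serve(port=8080):
    server = DrainingServer(("0.0.0.0", port), Handler)

    def on_sigterm(signum, frame):
        # shutdown() blocks until serve_forever() returns, so it must run on
        # a different thread than the one inside serve_forever().
        threading.Thread(target=server.shutdown).start()

    signal.signal(signal.SIGTERM, on_sigterm)
    server.serve_forever()   # 1. returns after shutdown(): no new requests accepted
    server.server_close()    # 2. closes the listener and joins in-flight handlers
    # 3. returning from serve() lets the process exit cleanly
```

Calling `serve()` from the main thread is required because Python only allows signal handlers to be installed there.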
Tini as PID 1 (signal forwarding)
FROM node:20-alpine
RUN apk add --no-cache tini
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["node", "server.js"]
Tini forwards SIGTERM to the node process and reaps zombies.
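To make the forwarding behavior concrete, here is a simplified Python sketch of the core idea: a parent process starts the real workload as a child and relays termination signals to it. It is not how tini is implemented and it does not reap orphaned zombie processes; the function name `run_and_forward` and the choice of forwarded signals are assumptions for the example.

```python
import signal
import subprocess
import sys

FORWARDED = (signal.SIGTERM, signal.SIGINT)


def run_and_forward(cmd):
    """Run cmd as a child, relay SIGTERM/SIGINT to it, return a shell-style exit code."""
    child = subprocess.Popen(cmd)

    def forward(signum, frame):
        if child.poll() is None:       # child still running
            child.send_signal(signum)  # pass the signal on instead of ignoring it

    # Install the forwarding handlers and remember what was there before.
    previous = {sig: signal.signal(sig, forward) for sig in FORWARDED}
    try:
        rc = child.wait()
    finally:
        for sig, handler in previous.items():
            if handler is not None:    # None means it was not set from Python
                signal.signal(sig, handler)

    # Popen reports "killed by signal N" as -N; shells conventionally use 128 + N.
    return 128 - rc if rc < 0 else rc
```

Used as an entrypoint wrapper it would look like `sys.exit(run_and_forward(sys.argv[1:]))`. In a real image, prefer tini or dumb-init, which also handle zombie reaping.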
Long-running batch worker (save progress)
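# Python example
# `queue` and `process` are placeholders for your queue client and job handler;
# get(timeout=...) returning None when idle is an assumed client behavior.
import signal
import sys

shutdown = False

def handle_sigterm(signum, frame):
    # Only set a flag; the loop finishes the current job before exiting.
    global shutdown
    shutdown = True

signal.signal(signal.SIGTERM, handle_sigterm)

while not shutdown:
    job = queue.get(timeout=1)  # short timeout so the flag is re-checked while idle
    if job is None:
        continue
    process(job)
    queue.ack(job)              # at-least-once: ack only after the work is done

# Save any remaining state here, then exit before the grace period ends.
sys.exit(0)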
With pod spec:
spec:
  terminationGracePeriodSeconds: 300   # 5 min for current job + buffer
  containers:
    - name: worker
      lifecycle:
        preStop:
          exec:
            command: ["/bin/sh", "-c", "echo 'preStop'; sleep 5"]
Java app with shutdown hook
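Runtime.getRuntime().addShutdownHook(new Thread(() -> {
    server.shutdown(); // stop accepting new requests
    try {
        server.awaitTermination(30, TimeUnit.SECONDS); // bounded drain
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // awaitTermination is interruptible
    }
}));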
Common findings this catches
- 502s during rollout → no preStop sleep; endpoint propagation race.
- App doesn’t exit on SIGTERM → shell as PID 1 without exec.
- terminationGracePeriodSeconds too short → SIGKILL before drain completes.
- Long drain blocks rollouts → bound your drain time.
- Sidecar dies first, main fails → use native sidecars (1.28+) OR a preStop on the sidecar that waits until main has finished.
- Force-deleting a pod loses data → reserve `--grace-period=0 --force` for pods that are already stuck.
- App keeps accepting connections during shutdown → close listener on SIGTERM.
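Two of these findings (a missing preStop sleep and a grace period that is too short) can be spotted mechanically from the pod spec. The sketch below assumes the spec has already been parsed into a Python dict, for example from `kubectl get pod <pod> -o json`; the function names and the regex that looks for `sleep N` in an exec preStop command are illustrative heuristics and will miss other kinds of hooks.

```python
import re

DEFAULT_GRACE_S = 30  # Kubernetes default when terminationGracePeriodSeconds is unset


def prestop_sleep_seconds(container):
    """Best effort: find `sleep N` in an exec preStop command; 0 if there is none."""
    command = (container.get("lifecycle", {})
                        .get("preStop", {})
                        .get("exec", {})
                        .get("command", []))
    match = re.search(r"\bsleep\s+(\d+)", " ".join(command))
    return int(match.group(1)) if match else 0


def lint_pod_spec(spec):
    """Return findings for a pod spec dict (the `.spec` of a Pod)."""
    findings = []
    grace = spec.get("terminationGracePeriodSeconds", DEFAULT_GRACE_S)
    for container in spec.get("containers", []):
        name = container.get("name", "?")
        sleep_s = prestop_sleep_seconds(container)
        if container.get("ports") and sleep_s == 0:
            findings.append(
                f"{name}: exposes ports but has no preStop sleep "
                f"(endpoint propagation race)")
        if sleep_s >= grace:
            findings.append(
                f"{name}: preStop sleep {sleep_s}s >= grace period {grace}s "
                f"(SIGKILL before drain)")
    return findings
```

A spec with `terminationGracePeriodSeconds: 60` and a `sleep 15` preStop yields no findings; removing the `lifecycle` block from a container that exposes ports yields the first finding.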
When to escalate
- Long-lived connection workloads (WebSocket, gRPC streaming): design for graceful close at the protocol level.
- Stateful workloads needing coordinated shutdown: move to a StatefulSet, whose default ordered pod management terminates pods one at a time.
- Custom controller dropping work: add observability (logs, metrics) for the shutdown phase.
Related prompts
- Kubernetes Deployment Rollout Debug Prompt: Diagnose stuck Deployment rollouts, including `ProgressDeadlineExceeded`, replica set churn, maxSurge/maxUnavailable misconfig, image pull pacing, and stuck-mid-rollout recovery.
- Kubernetes Native Sidecar Containers Prompt: Migrate to native sidecar containers (1.28+), covering `initContainers` with `restartPolicy: Always`, ordering, graceful shutdown, and common patterns (service mesh, log shipper).
- Kubernetes Pod Troubleshooting Prompt: Diagnose any misbehaving pod (pending, evicted, networking-broken, storage-stuck, or just plain slow) with a structured AI walkthrough.