Auditing Kubernetes Manifests With AI: A Practical Workflow
AI is surprisingly good at reviewing Kubernetes YAML — if you prompt it right. Here's a workflow that catches real issues without false-positive noise.
- #kubernetes
- #yaml
- #security
- #ai
- #review
A senior K8s engineer I work with audits manifests faster than I read them. He’s seen so many patterns that “missing readinessProbe on a Deployment that takes 45 seconds to start” jumps off the page. Most of us don’t have that pattern library memorized — and increasingly, we don’t need to. AI assistants have read more Kubernetes manifests than any human ever will.
The catch: a generic “review this YAML” prompt produces generic noise. You need to direct the model toward the categories of issues that actually matter in your environment.
The two mistakes everyone makes
Mistake 1: Asking for “a security review.” You’ll get a bullet list of every possible concern, ranked alphabetically, with no signal about which matter. You’ll skim, dismiss, and learn nothing.
Mistake 2: Pasting one manifest. Real Kubernetes problems live in the interaction between resources — a Deployment’s readiness probe and a Service’s selector, a NetworkPolicy and the actual app traffic. One YAML in isolation hides most of the bugs.
The fix for both is the same: give the model a bounded scope and enough context to reason about interactions.
A workflow that works
Step 1: Pick the audit dimension
Pre-decide what you’re checking for. Different prompts for different dimensions:
- Resource limits & QoS — are requests/limits set, does QoS match intent, are limits realistic
- Probes & lifecycle — readiness, liveness, startup, preStop, terminationGracePeriodSeconds
- Security context — runAsNonRoot, capabilities, readOnlyRootFilesystem, seccomp
- Network exposure — NetworkPolicy, Service type, Ingress rules
- Reliability — PodDisruptionBudget, topology spread, replica count
- State & storage — PVC access modes, retention policies, backup tags
Mixing dimensions in one review produces wishy-washy output. Pick one, get a clean answer, move on.
Step 2: Paste the manifest + related context
For a workload review, paste:
- The Deployment / StatefulSet / DaemonSet
- Its Service(s) and Ingress
- Any NetworkPolicies that match its labels
- The HPA if relevant
- The ConfigMaps and Secrets it references (sanitize first)
For YAML this is usually under 500 lines, well within any model’s context window. The model can now reason about interactions, not just isolated fields.
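As a rough illustration of why the surrounding resources matter, here is a sketch of a Service and a NetworkPolicy that might accompany a Deployment labelled app: payments and listening on port 8080. Every name, label, and port below is an assumption made for illustration; the point is which fields have to agree across resources.

```yaml
# Hypothetical Service: its selector must match the pod template labels,
# and targetPort must match the containerPort the app listens on.
apiVersion: v1
kind: Service
metadata:
  name: payments              # hypothetical name
spec:
  selector:
    app: payments             # must equal the Deployment's pod labels
  ports:
    - port: 80
      targetPort: 8080        # must equal the container's listening port
---
# Hypothetical NetworkPolicy: selects the same pods and has to allow
# the traffic the Service actually forwards.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: payments-ingress      # hypothetical name
spec:
  podSelector:
    matchLabels:
      app: payments
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: api-gateway   # hypothetical caller
      ports:
        - protocol: TCP
          port: 8080
```

A typo in either selector, or a port mismatch between the Service and the policy, is invisible when the Deployment is reviewed on its own.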
Step 3: Use a directive prompt
The big difference between “tell me about this YAML” and a useful review is the instruction format. Compare:
Review this Kubernetes manifest.
versus:
You are reviewing a production Deployment + Service + NetworkPolicy bundle. For each finding, give: (1) severity (critical/high/medium/low), (2) the exact field path that’s wrong, (3) one sentence on why it matters, (4) the corrected YAML snippet. Focus only on probes, lifecycle, and graceful shutdown. Ignore documentation/comments.
The first prompt produces an essay. The second produces a list of fixable issues.
Step 4: Verify before applying
This is where most reviews go wrong. The model is right most of the time. It’s wrong some of the time, often in ways that look correct.
Common AI failure modes in K8s review:
- Hallucinated field names: spec.template.spec.terminationGracePeriod (it’s terminationGracePeriodSeconds)
- Outdated API versions: policy/v1beta1 PodDisruptionBudget (removed in 1.25)
- Wrong defaults claimed: claiming failureThreshold defaults to 1 when it’s 3
- Misreading the use case: recommending runAsNonRoot: true for a workload that legitimately needs root
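For every “fix” the model suggests, check the official K8s docs for that field, or run kubectl explain on the field path (for example, kubectl explain deployment.spec.template.spec.terminationGracePeriodSeconds). This adds about 30 seconds per finding and catches the wrong ones. Without this step, you will eventually apply a change that breaks something.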
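For reference, here is a sketch of what the correct forms of the first three failure modes look like. The resource name, the app: payments label, the /healthz path, and the minAvailable value are illustrative assumptions; the API version, field names, and defaults are the parts worth checking against the docs.

```yaml
# PodDisruptionBudget on the current API version.
# policy/v1beta1 was removed in Kubernetes 1.25; policy/v1 replaces it.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: payments-pdb          # hypothetical name
spec:
  minAvailable: 1             # illustrative value
  selector:
    matchLabels:
      app: payments           # hypothetical label
---
# Fragment of a Deployment's spec.template.spec (not a complete manifest),
# showing the real field names with their defaults written out.
terminationGracePeriodSeconds: 30   # correct field name; 30 is the default
containers:
  - name: app
    readinessProbe:
      httpGet: { path: /healthz, port: 8080 }   # assumed endpoint
      failureThreshold: 3                       # the default is 3, not 1
```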
A real example
Here’s a Deployment I reviewed last week:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payments
spec:
  replicas: 2
  selector:
    matchLabels: { app: payments }
  template:
    metadata:
      labels: { app: payments }
    spec:
      containers:
        - name: app
          image: registry.example.com/payments:v3.1.0
          ports:
            - containerPort: 8080
          env:
            - name: DB_URL
              value: postgres://payments-db:5432/payments
          resources:
            limits:
              cpu: "2"
              memory: "2Gi"
          readinessProbe:
            httpGet: { path: /healthz, port: 8080 }
            initialDelaySeconds: 5
I asked Claude to review for probes and graceful shutdown only. The findings:
1. No requests, only limits → Kubernetes copies the limits into the requests, so the pod runs as Guaranteed QoS and every replica reserves 2 CPUs and 2Gi. Set explicit requests at or below the limits if that reservation isn’t intended.
2. initialDelaySeconds: 5 → Java/Spring apps typically need 30-90 seconds to start. Add a startupProbe with a longer threshold.
3. No livenessProbe → kubelet won’t restart the container if the app deadlocks. Mirror the readinessProbe with looser thresholds.
4. No terminationGracePeriodSeconds → defaults to 30s; for a payment service with in-flight requests, this is borderline. Set to 60s.
5. No preStop hook → SIGTERM hits immediately; load balancers may still send traffic for ~10s after the pod is marked Terminating. Add a sleep 15 preStop.
All five were real, all five were fixable in two minutes of YAML editing. The model didn’t tell me about anything irrelevant. That’s because I scoped the prompt to “probes and graceful shutdown only.”
The big one — #5 — is something I’ve personally been bitten by twice. The model wouldn’t have prioritized it without the directive prompt.
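For illustration, here is roughly what the Deployment looks like with the five findings applied. Treat it as a sketch, not the exact manifest that shipped: the request values, probe periods, and thresholds are assumed placeholders to tune per workload, and the exec-based preStop assumes the image contains a sleep binary.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payments
spec:
  replicas: 2
  selector:
    matchLabels: { app: payments }
  template:
    metadata:
      labels: { app: payments }
    spec:
      # Finding 4: more time for in-flight requests to drain.
      terminationGracePeriodSeconds: 60
      containers:
        - name: app
          image: registry.example.com/payments:v3.1.0
          ports:
            - containerPort: 8080
          env:
            - name: DB_URL
              value: postgres://payments-db:5432/payments
          resources:
            # Finding 1: explicit requests (placeholder values, assumed).
            requests:
              cpu: "500m"
              memory: "1Gi"
            limits:
              cpu: "2"
              memory: "2Gi"
          # Finding 2: allow up to 24 x 5s = 120s for startup; the other
          # probes only begin once this one has succeeded.
          startupProbe:
            httpGet: { path: /healthz, port: 8080 }
            periodSeconds: 5
            failureThreshold: 24
          readinessProbe:
            httpGet: { path: /healthz, port: 8080 }
            periodSeconds: 5
          # Finding 3: same endpoint as readiness, looser thresholds
          # (6 x 10s = 60s of failures before a restart).
          livenessProbe:
            httpGet: { path: /healthz, port: 8080 }
            periodSeconds: 10
            failureThreshold: 6
          # Finding 5: delay SIGTERM so load balancers can stop routing here.
          lifecycle:
            preStop:
              exec:
                command: ["sleep", "15"]   # assumes sleep exists in the image
```

The 15-second sleep counts against terminationGracePeriodSeconds, which leaves about 45 seconds for the app's own shutdown after SIGTERM arrives.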
What about Kyverno / OPA / Pod Security Admission?
Yes, you should run those too. They reliably catch rule-based issues at admission time. They don’t catch issues that require judgment: “is 30 seconds enough graceful shutdown for this specific service?” Policy enforcement is a floor; AI review is a directed second opinion above that floor.
I run both. Kyverno catches “no securityContext at all” before it ever lands. AI review catches “readinessProbe path doesn’t match what the app exposes” — something only a human (or an AI imitating one) would notice.
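As a sketch of the "floor" side, here is the Pod Security Admission variant, shown in place of a Kyverno policy because it needs nothing beyond namespace labels. The namespace name is an assumption; the label keys are the standard Pod Security Admission ones.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: payments                # hypothetical namespace
  labels:
    # Reject pods that violate the "restricted" Pod Security Standard.
    pod-security.kubernetes.io/enforce: restricted
    # Also return warnings to the client when a manifest would violate it.
    pod-security.kubernetes.io/warn: restricted
```

The restricted profile rejects pods whose containers lack settings such as runAsNonRoot: true and allowPrivilegeEscalation: false, which covers the "no securityContext at all" case. It has no opinion on whether a probe path matches what the app serves.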
A starter prompt
If you want a template, here’s the one I use most:
You are reviewing a Kubernetes workload bundle for production readiness. Focus only on: probes (readiness, liveness, startup), terminationGracePeriodSeconds, preStop hooks, and rolling update strategy. For each finding produce: severity, exact field path, why it matters in one sentence, corrected YAML. Ignore everything else (security context, network policies, resource limits; those are separate reviews). The workload is [serves HTTP at /api on port 8080 / consumes from a queue / batch processor that runs N hours].
The bracketed context at the end is what makes the review accurate for your workload. Without it, the model assumes a generic web service.
For our full prompt library on Kubernetes review, see the Kubernetes & Helm category — especially kubernetes-yaml-security-review and kubernetes-resource-limits-tuning.