Auditing Kubernetes Manifests With AI: A Practical Workflow

A senior K8s engineer I work with audits manifests faster than I read them. He’s seen so many patterns that “missing readinessProbe on a Deployment that takes 45 seconds to start” jumps off the page. Most of us don’t have that pattern library memorized — and increasingly, we don’t need to. AI assistants have read more Kubernetes manifests than any human ever will.

The catch: a generic “review this YAML” prompt produces generic noise. You need to direct the model toward the categories of issues that actually matter in your environment.

The two mistakes everyone makes

Mistake 1: Asking for “a security review.” You’ll get a bullet list of every possible concern in no useful order, with no signal about which ones matter. You’ll skim, dismiss, and learn nothing.

Mistake 2: Pasting one manifest. Real Kubernetes problems live in the interaction between resources — a Deployment’s readiness probe and a Service’s selector, a NetworkPolicy and the actual app traffic. One YAML in isolation hides most of the bugs.

The fix for both is the same: give the model a bounded scope and enough context to reason about interactions.

A workflow that works

Step 1: Pick the audit dimension

Pre-decide what you’re checking for. Different prompts for different dimensions:

Resource limits & QoS — are requests/limits set, does QoS match intent, are limits realistic
Probes & lifecycle — readiness, liveness, startup, preStop, terminationGracePeriodSeconds
Security context — runAsNonRoot, capabilities, readOnlyRootFilesystem, seccomp
Network exposure — NetworkPolicy, Service type, Ingress rules
Reliability — PodDisruptionBudget, topology spread, replica count
State & storage — PVC access modes, retention policies, backup tags

Mixing dimensions in one review produces wishy-washy output. Pick one, get a clean answer, move on.

Step 2: Paste the whole bundle

For a workload review, paste:

The Deployment / StatefulSet / DaemonSet
Its Service(s) and Ingress
Any NetworkPolicies that match its labels
The HPA if relevant
The ConfigMaps and Secrets it references (sanitize first)

For YAML this is usually under 500 lines, which fits comfortably in the context window of current models. With the whole bundle in view, the model can reason about interactions, not just isolated fields.
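
As a rough illustration of why the bundle matters, here is a sketch of two resources that might sit next to a Deployment labeled app: payments listening on containerPort 8080. The resource names, the targetPort value, and the ingress-nginx namespace are invented for the example, not taken from a real cluster.

```yaml
# Hypothetical companions to a Deployment labeled app: payments that
# listens on containerPort 8080. Each file looks fine when read alone.
apiVersion: v1
kind: Service
metadata:
  name: payments
spec:
  selector:
    app: payments          # matches the Deployment's pod labels
  ports:
  - port: 80
    targetPort: 8000       # mismatch: the container listens on 8080
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: payments-ingress
spec:
  podSelector:
    matchLabels:
      app: payments
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: ingress-nginx
    ports:
    - protocol: TCP
      port: 8080           # allows 8080, which the Service never targets
```

Reviewed one file at a time, each of these passes. Placed next to the Deployment, the port mismatch is the kind of thing a scoped review can point at directly.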

Step 3: Use a directive prompt

The big difference between “tell me about this YAML” and a useful review is the instruction format. Compare:

Review this Kubernetes manifest.

versus:

You are reviewing a production Deployment + Service + NetworkPolicy bundle. For each finding, give: (1) severity (critical/high/medium/low), (2) the exact field path that’s wrong, (3) one sentence on why it matters, (4) the corrected YAML snippet. Focus only on probes, lifecycle, and graceful shutdown. Ignore documentation/comments.

The first prompt produces an essay. The second produces a list of fixable issues.

Step 4: Verify before applying

This is where most reviews go wrong. The model is right most of the time. It’s wrong some of the time, often in ways that look correct.

Common AI failure modes in K8s review:

Hallucinated field names — spec.template.spec.terminationGracePeriod (it’s terminationGracePeriodSeconds)
Outdated API versions — policy/v1beta1 PodDisruptionBudget (removed in 1.25)
Wrong defaults claimed — claiming failureThreshold defaults to 1 when it’s 3
Misreading the use case — recommending runAsNonRoot: true for a workload that legitimately needs root

For every “fix” the model suggests, glance at the official K8s docs for that field. This adds 30 seconds per finding and catches the wrong ones. Without this step, you will apply changes that break things.
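
For reference, a sketch of what the corrected forms of the first three failure modes look like. The PodDisruptionBudget name and the minAvailable value are placeholders, and the second document is a fragment of a Deployment, not a complete manifest.

```yaml
# PodDisruptionBudget on the stable API (policy/v1beta1 was removed in 1.25).
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: payments-pdb        # hypothetical name
spec:
  minAvailable: 1           # placeholder value
  selector:
    matchLabels:
      app: payments
---
# Fragment of a Deployment, showing only the fields in question.
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 30   # note the "Seconds" suffix; 30 is the default
      containers:
      - name: app
        readinessProbe:
          httpGet: { path: /healthz, port: 8080 }
          failureThreshold: 3             # the default, written out explicitly
```

If you have cluster access, kubectl explain deployment.spec.template.spec.terminationGracePeriodSeconds prints the field's documentation from your own API server, and kubectl apply --dry-run=server -f <file> asks the server to validate a manifest without persisting it.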

A real example

Here’s a Deployment I reviewed last week:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: payments
spec:
  replicas: 2
  selector:
    matchLabels: { app: payments }
  template:
    metadata:
      labels: { app: payments }
    spec:
      containers:
      - name: app
        image: registry.example.com/payments:v3.1.0
        ports:
        - containerPort: 8080
        env:
        - name: DB_URL
          value: postgres://payments-db:5432/payments
        resources:
          limits:
            cpu: "2"
            memory: "2Gi"
        readinessProbe:
          httpGet: { path: /healthz, port: 8080 }
          initialDelaySeconds: 5

I asked Claude to review for probes and graceful shutdown only. The findings:

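No requests, only limits → Kubernetes copies the limits into the requests, so the pod gets Guaranteed QoS and reserves a full 2 CPU / 2Gi per replica. Set explicit requests at or below the limits if you don’t intend to reserve that much.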
initialDelaySeconds: 5 → Java/Spring apps typically need 30-90 seconds to start. Add startupProbe with longer threshold.
No livenessProbe → kubelet won’t restart if the app deadlocks. Mirror readinessProbe with looser thresholds.
No terminationGracePeriodSeconds → defaults to 30s; for a payment service with in-flight requests, this is borderline. Set to 60s.
No preStop hook → SIGTERM hits immediately; load balancers may still send traffic for ~10s after pod marked Terminating. Add sleep 15 preStop.

All five were real, and all five were fixable in two minutes of YAML editing. Apart from the resources note in #1, which strictly belongs to a separate resource-limits review, the model stayed on topic. That’s because I scoped the prompt to “probes and graceful shutdown only.”

The big one — #5 — is something I’ve personally been bitten by twice. The model wouldn’t have prioritized it without the directive prompt.
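
A sketch of how the pod template might look after applying the five findings. The request values, the probe timings, and the reuse of /healthz for the startup and liveness probes are assumptions for illustration; the preStop command also assumes the image ships a shell.

```yaml
# Fragment of the payments Deployment after the review (not a full manifest).
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 60    # finding 4; this budget includes the preStop sleep
      containers:
      - name: app
        image: registry.example.com/payments:v3.1.0
        ports:
        - containerPort: 8080
        resources:
          requests:                         # finding 1; values are placeholders
            cpu: "500m"
            memory: "1Gi"
          limits:
            cpu: "2"
            memory: "2Gi"
        startupProbe:                       # finding 2; allows up to 18 x 5s = 90s to start
          httpGet: { path: /healthz, port: 8080 }
          periodSeconds: 5
          failureThreshold: 18
        readinessProbe:                     # only runs once the startup probe has succeeded
          httpGet: { path: /healthz, port: 8080 }
          periodSeconds: 5
        livenessProbe:                      # finding 3; looser than readiness
          httpGet: { path: /healthz, port: 8080 }
          periodSeconds: 10
          failureThreshold: 6
        lifecycle:
          preStop:                          # finding 5; delays SIGTERM while endpoints drain
            exec:
              command: ["sh", "-c", "sleep 15"]
```

The initialDelaySeconds on the readiness probe is dropped here because the startup probe already holds the other probes back until the app is up.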

What about Kyverno / OPA / Pod Security Admission?

Yes, you should run those too. They catch rule-based issues consistently at admission time. They don’t catch issues that require judgment: “is 30 seconds enough graceful shutdown for this specific service?” Policy enforcement is a floor; AI review is a directed second opinion above that floor.

I run both. Kyverno catches “no securityContext at all” before it ever lands. AI review catches “readinessProbe path doesn’t match what the app exposes” — something only a human (or an AI imitating one) would notice.
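
Of the three, Pod Security Admission is probably the quickest floor to put in place, since it is configured with namespace labels. A minimal sketch, assuming the workload lives in a namespace called payments (a made-up name) and can meet the restricted profile:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: payments                                    # hypothetical namespace
  labels:
    pod-security.kubernetes.io/enforce: restricted  # reject pods that violate the profile
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/warn: restricted     # also surface warnings to the client
    pod-security.kubernetes.io/audit: restricted    # and annotate audit log entries
```

Starting with only the warn and audit labels, then adding enforce once the warnings are clean, is a gentler way to roll this out on an existing namespace.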

A starter prompt

If you want a template, here’s the one I use most:

You are reviewing a Kubernetes workload bundle for production readiness. Focus only on: probes (readiness, liveness, startup), terminationGracePeriodSeconds, preStop hooks, and rolling update strategy. For each finding produce: severity, exact field path, why it matters in one sentence, corrected YAML. Ignore everything else (security context, network policies, resource limits — those are separate reviews). The workload is [serves HTTP at /api on port 8080 / consumes from a queue / batch processor that runs N hours].

The bracketed context at the end is what makes the review accurate for your workload. Without it, the model assumes a generic web service.

For our full prompt library on Kubernetes review, see the Kubernetes & Helm category — especially kubernetes-yaml-security-review and kubernetes-resource-limits-tuning.