Writing Kubernetes Admission Policies With an AI Copilot

The scariest thing you can do to a cluster is enable an admission policy in Enforce mode without testing it. I’ve watched a one-line typo in a Kyverno rule block every single pod from being created, including the ingress controller, including the policy engine’s own webhook. The cluster doesn’t crash dramatically; it just quietly refuses every new pod until someone figures out why nothing deploys.

Admission policies are also where AI assistance shines, because the syntax is finicky and the patterns are well-documented. Kyverno’s match/validate blocks and the newer CEL-based ValidatingAdmissionPolicy are exactly the kind of structured, example-rich thing a model drafts well. The catch is the same as always: the AI is a fast junior engineer who’s never seen your cluster, so it drafts and you test. Nobody enables Enforce on a model’s say-so.

Start from the intent, not the syntax

I don’t ask the model to “write a Kyverno policy.” I describe the guardrail in plain English and let it pick the mechanism:

Write a policy that rejects any Pod whose containers don’t set runAsNonRoot: true. It should apply to all namespaces except kube-system. Give me both a Kyverno ClusterPolicy and an equivalent native ValidatingAdmissionPolicy so I can compare.

Getting both lets me see the trade-off: Kyverno is more ergonomic and adds mutation and generation; the native ValidatingAdmissionPolicy (GA since Kubernetes 1.30) needs no extra controller, but its rules are written in CEL and it only takes effect once a ValidatingAdmissionPolicyBinding points at it. A typical Kyverno draft:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: "require-run-as-nonroot"
spec:
  validationFailureAction: Audit
  rules:
    - name: "check-run-as-nonroot"
      match:
        any:
          - resources:
              kinds: ["Pod"]
      exclude:
        any:
          - resources:
              namespaces: ["kube-system"]
      validate:
        message: "Containers must set runAsNonRoot: true."
        pattern:
          spec:
            containers:
              - securityContext:
                  runAsNonRoot: true

Notice validationFailureAction: Audit. I make the model default every new policy to Audit, never Enforce. That’s the difference between learning and an outage.

Make it generate the test cases too

A policy is only as good as the manifests that should pass and fail. I ask the model to produce both:

Generate two Pod manifests: one that should pass this policy and one that should fail. Make the failing one realistic — a typical app that just forgot the securityContext.
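
As a rough sketch, the two fixtures might look like the following. The pod names and the image reference are placeholders I picked for illustration, and in practice each document would live in its own file (good-pod.yaml and bad-pod.yaml).

```yaml
# good-pod.yaml: should pass, because every container sets runAsNonRoot: true
apiVersion: v1
kind: Pod
metadata:
  name: good-pod                          # hypothetical name
  namespace: default
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.0 # placeholder image
      securityContext:
        runAsNonRoot: true
---
# bad-pod.yaml: should fail, a typical app that never set a securityContext
apiVersion: v1
kind: Pod
metadata:
  name: bad-pod                           # hypothetical name
  namespace: default
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.0 # placeholder image
```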

Then I run them through Kyverno’s CLI without touching the cluster at all:

kyverno apply require-run-as-nonroot.yaml \
  --resource good-pod.yaml \
  --resource bad-pod.yaml

If the “good” pod is blocked or the “bad” pod passes, the policy is wrong, and I caught it offline. This offline test loop is the whole game — the model writes both the rule and the fixtures, and the CLI tells the truth.

Pro Tip: Ask the AI to write the test where the policy is almost satisfied — runAsNonRoot set on the Pod’s securityContext but not the container’s. That edge case (Pod-level vs container-level) is the single most common admission-policy bug, and surfacing it in a test catches it before prod does.
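
A minimal sketch of that edge-case fixture, with a hypothetical pod name and placeholder image. Kubernetes itself applies a Pod-level runAsNonRoot to every container that does not override it, so this Pod is arguably compliant, yet a pattern that only inspects each container's securityContext would reject it. The test forces you to decide which behavior you actually want.

```yaml
# edge-pod.yaml: runAsNonRoot is set at the Pod level only
apiVersion: v1
kind: Pod
metadata:
  name: edge-pod                          # hypothetical name
  namespace: default
spec:
  securityContext:
    runAsNonRoot: true                    # Pod-level setting, inherited by containers
  containers:
    - name: app
      image: registry.example.com/app:1.0 # placeholder image
      # no container-level securityContext on purpose
```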

CEL needs extra scrutiny

The native ValidatingAdmissionPolicy uses CEL expressions, and CEL has sharp edges around null handling. A draft like this looks right:

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: "require-nonroot"
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["pods"]
  validations:
    - expression: "object.spec.containers.all(c, c.securityContext.runAsNonRoot == true)"
      message: "Containers must set runAsNonRoot: true."

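But if any container has no securityContext at all, that expression raises a runtime error instead of evaluating to false. With failurePolicy: Fail the request is still rejected, but with an evaluation error rather than your message, and with failurePolicy: Ignore it would be admitted. I specifically prompt: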

Rewrite this CEL so it handles containers with no securityContext field set — those should fail validation, not error the webhook.

The model knows the has() guard and optional-field idioms; it just won’t apply them unless you ask, because the naive version reads fine.
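
One way the guarded version could look, as a sketch rather than the only correct form. It replaces the validations block of the policy above and uses has() so a missing securityContext or a missing runAsNonRoot evaluates to false instead of erroring. Note the assumption baked in: like the original draft, it only checks spec.containers, not initContainers or ephemeralContainers.

```yaml
  validations:
    # has() guards each optional field before it is read, so a container
    # with no securityContext fails validation instead of raising an error.
    - expression: >-
        object.spec.containers.all(c,
        has(c.securityContext) &&
        has(c.securityContext.runAsNonRoot) &&
        c.securityContext.runAsNonRoot == true)
      message: "Containers must set runAsNonRoot: true."
```

Running the pass and fail fixtures against this version is what confirms that the no-securityContext Pod now gets the policy message rather than an evaluation error.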

Beware locking yourself out

The classic self-inflicted outage is a policy with failurePolicy: Fail that the policy engine’s own pods can’t satisfy. I make the AI reason about this explicitly:

If this policy is enforced cluster-wide, which system components could it block? Check the policy engine’s own webhook, CNI pods, and CSI drivers. Recommend exclusions.

That prompt has saved me more than once. Always exclude kube-system and the policy controller’s namespace, and stage the rollout: Audit for a week, read the violation reports, then flip to Enforce.

kubectl get policyreports -A | grep require-run-as-nonroot
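
For the native ValidatingAdmissionPolicy, the exclusions and the audit stage live on the binding rather than the policy. Here is a hedged sketch of what that binding might look like. The binding name is hypothetical, and "kyverno" stands in for whatever namespace your policy controller actually runs in.

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: "require-nonroot-binding"        # hypothetical name
spec:
  policyName: "require-nonroot"
  # Audit records violations without rejecting requests.
  # Switch to ["Deny"] only after a human has reviewed the audit data.
  validationActions: ["Audit"]
  matchResources:
    namespaceSelector:
      matchExpressions:
        # kubernetes.io/metadata.name is set automatically on every namespace
        - key: kubernetes.io/metadata.name
          operator: NotIn
          values: ["kube-system", "kyverno"]   # "kyverno" is an assumption
```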

Roll out behind a human, behind audit data

Here’s the bright line. The model drafts the policy, the fixtures, and the exclusion list. The CLI validates it offline. A human reads a week of Audit-mode policy reports, confirms nothing legitimate is being flagged, and only then applies the Enforce change. The AI never gets a kubeconfig, never runs kubectl apply, and never decides on its own that a policy is safe to enforce. The audit reports are the authority, not the model’s confidence.

If you’d like the flagged-then-approved workflow formalized, the code review dashboard is built for that handoff, and the incident response dashboard helps if an over-eager policy ever does cause a deploy freeze.

Mutation policies need a higher bar

Kyverno can do more than reject: it can mutate incoming resources, injecting sidecars, adding labels, or setting defaults. That’s powerful and far more dangerous than validation, because a validating policy that’s wrong just blocks a deploy, while a mutating policy that’s wrong silently rewrites every matching object admitted to the cluster into something subtly broken. I treat AI-drafted mutation policies with extra suspicion.

Write a Kyverno mutate rule that adds a default securityContext.runAsNonRoot: true to Pods that don’t set it. Then show me three Pods where this mutation could conflict with an existing securityContext or break an init container that legitimately needs root.

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: "default-nonroot"
spec:
  rules:
    - name: "add-nonroot"
      match:
        any:
          - resources:
              kinds: ["Pod"]
      mutate:
        patchStrategicMerge:
          spec:
            securityContext:
              +(runAsNonRoot): true

The +() anchor means “add only if absent,” which is the safe form, but the model will sometimes draft a plain merge that clobbers an existing value. I make it explain the anchor semantics it chose and prove the rule is additive, never destructive. Mutate rules also have no Audit mode to hide behind (validationFailureAction only governs validate rules), so they stay in offline simulation and a staging cluster longer than validation policies do, because a bad mutation is invisible until something downstream breaks.
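
To check additivity offline, I would feed the rule fixtures along these lines and inspect the mutated output from kyverno apply. This is a sketch: the pod names, the image, and the expected outcomes in the comments are my assumptions about how an add-if-absent rule should behave, not captured output.

```yaml
# Fixture A: no Pod-level securityContext.
# Expectation: the rule adds runAsNonRoot: true.
apiVersion: v1
kind: Pod
metadata:
  name: no-context                        # hypothetical name
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.0 # placeholder image
---
# Fixture B: explicit opt-out already present.
# Expectation: an additive rule leaves runAsNonRoot: false untouched.
apiVersion: v1
kind: Pod
metadata:
  name: explicit-root                     # hypothetical name
spec:
  securityContext:
    runAsNonRoot: false
  containers:
    - name: app
      image: registry.example.com/app:1.0
---
# Fixture C: init container that legitimately runs as root.
# Expectation: the injected Pod-level default would stop this init
# container from starting, so the rule needs an exclusion for it.
apiVersion: v1
kind: Pod
metadata:
  name: root-init                         # hypothetical name
spec:
  initContainers:
    - name: fix-permissions
      image: registry.example.com/init:1.0
      securityContext:
        runAsUser: 0
  containers:
    - name: app
      image: registry.example.com/app:1.0
```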

Version the policies like code

Admission policies are infrastructure, and treating them as throwaway YAML is how a cluster ends up with a require-nonroot-v2-final-ACTUAL policy nobody understands. Every policy goes in git with the test fixtures the model generated next to it, so the next person can run the same offline kyverno apply check and see exactly which manifests should pass and fail. When I ask the model to draft a policy, I also ask it to draft the README line explaining the intent in one sentence — “rejects Pods that run as root, excludes kube-system” — so the why travels with the what. A policy without its rationale gets disabled the first time it inconveniences someone.
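
One way to make the fixtures self-describing is a Kyverno CLI test manifest committed next to the policy and run with kyverno test. The sketch below assumes the schema used by recent CLI releases (older releases use different field names, so check it against your version), and it assumes the fixtures define Pods named good-pod and bad-pod.

```yaml
# kyverno-test.yaml: declares which fixtures should pass and which should fail
apiVersion: cli.kyverno.io/v1alpha1
kind: Test
metadata:
  name: require-run-as-nonroot-test      # hypothetical name
policies:
  - require-run-as-nonroot.yaml
resources:
  - good-pod.yaml
  - bad-pod.yaml
results:
  - policy: require-run-as-nonroot
    rule: check-run-as-nonroot
    kind: Pod
    resources:
      - good-pod                          # assumed fixture name
    result: pass
  - policy: require-run-as-nonroot
    rule: check-run-as-nonroot
    kind: Pod
    resources:
      - bad-pod                           # assumed fixture name
    result: fail
```

With this in the repository, the expected outcome of each fixture is recorded in the file itself, so nobody has to infer it from the file names.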

Conclusion

Admission control is one of the highest-leverage things you can add to a cluster and one of the easiest to turn into an outage. AI removes most of the syntax friction — it drafts Kyverno and CEL rules, generates pass/fail fixtures, and reasons about lockout risks. But the model never enforces anything. You test offline with the CLI, run in Audit mode, read the reports, and let a human flip the switch. That discipline is what makes admission policy a safety net instead of a foot-gun.
