Hardening a Pod securityContext With AI Review

Run kubectl get pod -o yaml on a random workload in most clusters and the securityContext is either empty or wildly permissive. The container runs as root, can write to its own filesystem, keeps the runtime’s full default capability set, and could escalate privileges if something inside it gets popped. None of that is usually needed; it’s just what you get when nobody filled in the security fields.

Hardening a securityContext is high-value and low-glamour, which makes it a good fit for an AI copilot. The model knows the fields and the usual safe value for each. What it doesn’t know is whether your specific app will fall over when you take root away, so it proposes the hardening and you verify it against the running workload. The model reads and drafts; the human applies, after watching the pod survive.

Ask for the locked-down baseline

I start by handing the model the current pod spec and asking for the hardened version with reasoning:

Here is a Deployment’s pod spec. Add a securityContext that drops all capabilities, runs as non-root, sets a read-only root filesystem, and disables privilege escalation. For each field, tell me what could break in the app if that setting is wrong.

The “what could break” clause is the important half. A hardened baseline looks like this:

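# container-level securityContext (under spec.containers[])
securityContext:
  runAsNonRoot: true
  runAsUser: 1000                   # any non-zero UID the image can run as
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  capabilities:
    drop: ["ALL"]
  seccompProfile:
    type: RuntimeDefault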

That’s a strong default. It’s also the spec most likely to break a real app, because plenty of apps assume they can write to /tmp or bind a low port. The model knows the common failure modes; surfacing them up front is what lets you prepare.

Read-only root filesystem is where things break

readOnlyRootFilesystem: true is the highest-value setting and the one most likely to cause a crash, because apps love writing scratch files. The fix isn’t to abandon it — it’s to mount writable emptyDir volumes exactly where the app needs them. I ask:

With a read-only root filesystem, this app still needs to write to /tmp and /var/run. Add the emptyDir volumes and mounts so those paths stay writable.

# under the container (spec.containers[].volumeMounts)
volumeMounts:
  - name: "tmp"
    mountPath: "/tmp"
  - name: "run"
    mountPath: "/var/run"
# under the pod spec (spec.volumes)
volumes:
  - name: "tmp"
    emptyDir: {}
  - name: "run"
    emptyDir: {}

This keeps the filesystem locked down everywhere except the few paths that genuinely need writes — the right trade-off, and exactly the kind of fiddly edit the model handles well.
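The two lists live at different levels of a manifest: volumeMounts belongs to the container, volumes to the pod spec. Here is a minimal sketch of how the pieces might sit together in a Deployment. The image, the app label, and the sizeLimit values are assumptions for illustration, not settings taken from a real workload.

```yaml
# Sketch only: image, labels, and size limits are illustrative assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 1
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: registry.example.com/api:1.0.0   # hypothetical image
          securityContext:
            readOnlyRootFilesystem: true
          volumeMounts:            # container level: where each volume appears
            - name: tmp
              mountPath: /tmp
            - name: run
              mountPath: /var/run
      volumes:                     # pod level: what each volume is
        - name: tmp
          emptyDir:
            sizeLimit: 64Mi        # assumed cap so scratch files cannot grow unbounded
        - name: run
          emptyDir:
            medium: Memory         # tmpfs-backed; usage counts against container memory
            sizeLimit: 8Mi
```

The size limits are optional. They are included here because a writable scratch path with no cap is an easy way to fill a node’s disk.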

Capabilities: drop all, add back the minimum

Dropping ALL capabilities is correct, but a few workloads genuinely need one back — NET_BIND_SERVICE to bind port 80, for instance. I make the model justify any add:

This app binds port 80 directly. Which single capability does it need added back, and is there a way to avoid even that?

The good answer points out that you can usually bind 8080 and let the Service map 80 to it, avoiding the capability entirely. Forcing the model to look for the no-capability path keeps the spec tighter than just bolting NET_BIND_SERVICE back on.
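A sketch of the no-capability path, assuming the app can be configured to listen on 8080. The Service name, selector label, and port name are placeholders.

```yaml
# Sketch only: name, selector, and port name are illustrative assumptions.
apiVersion: v1
kind: Service
metadata:
  name: api
spec:
  selector:
    app: api
  ports:
    - name: http
      port: 80          # what clients connect to
      targetPort: 8080  # unprivileged port the container listens on
```

Clients still reach the app on port 80, and the container never needs NET_BIND_SERVICE because it only binds a port above 1023.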

Pro Tip: runAsNonRoot: true only enforces that the UID isn’t 0 — it doesn’t pick a UID. If your image’s USER is root and you set runAsNonRoot without runAsUser, the pod fails to start with “container has runAsNonRoot and image will run as root.” Ask the model to confirm the image actually has a non-root user, or to set an explicit runAsUser.

Pod-level versus container-level

A common bug is setting security fields at the wrong level. Some fields (runAsNonRoot, fsGroup) make sense on the pod securityContext; others (capabilities, readOnlyRootFilesystem, allowPrivilegeEscalation) only apply at the container level. The model can get this wrong, so I check:

Confirm each field is at the correct level — pod securityContext vs container securityContext — and move any that’s misplaced.

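capabilities is not a field of the pod-level securityContext. With kubectl’s default validation the manifest is rejected as containing an unknown field, but tooling that skips or relaxes validation can drop it without complaint, which means you think you dropped capabilities and you didn’t. Catching this is worth the dedicated check.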
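For reference, a sketch of a pod template spec with each field at a level where the API accepts it. The UID and GID values and the image are assumptions. Fields such as runAsNonRoot, runAsUser, and seccompProfile are valid at both levels, and a container-level value overrides the pod-level one.

```yaml
# Sketch of a pod template's spec; UID/GID values and image are illustrative assumptions.
spec:
  securityContext:               # pod level (PodSecurityContext)
    runAsNonRoot: true
    runAsUser: 1000
    runAsGroup: 1000
    fsGroup: 1000                # pod-level only
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: api
      image: registry.example.com/api:1.0.0   # hypothetical image
      securityContext:           # container level (SecurityContext)
        allowPrivilegeEscalation: false   # container-level only
        readOnlyRootFilesystem: true      # container-level only
        capabilities:                     # container-level only
          drop: ["ALL"]
```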

Verify against the running pod, not against confidence

Here’s the human-in-the-loop line, and it’s bright. The model produces a hardened spec; I apply it to a canary or staging pod and watch what actually happens:

kubectl apply -f hardened.yaml -n staging
kubectl rollout status deploy/api -n staging
kubectl logs deploy/api -n staging | grep -iE 'permission denied|read-only'

Permission-denied and read-only-filesystem errors in the logs tell me which path the app needs that I didn’t account for. I feed those back to the model, add the mount, and repeat. Only after a hardened pod survives a real traffic cycle does the change roll wider.

The AI never applies the hardening to prod, never gets a kubeconfig, and never decides the spec is safe. It drafts; the running pod and the human reading its logs are the authority. A hardened spec that turns out to be wrong while it is still a draft in a chat window costs nothing; the same spec applied fleet-wide is an outage.

Conclusion

Most pods are over-privileged by default, and tightening the securityContext is exactly the tedious, well-patterned work AI handles well. It drafts a locked-down baseline, adds the writable mounts a read-only filesystem needs, finds the minimum capability set, and catches pod-vs-container misplacement. What it can’t do is know whether your app survives the lockdown — so you apply to staging, watch the logs for permission errors, and let a human roll it forward. That split gets you a genuinely hardened workload without a guess in production.

For the broader picture, “Securing a Kubernetes cluster: pod security and admission control” and “Auditing Kubernetes manifests with AI” cover the surrounding controls, and the code review dashboard formalizes the flag-then-approve handoff.

Map your spec to a Pod Security Standard

Hardening one pod is good; knowing which named bar it clears is better. Kubernetes defines three Pod Security Standards — privileged, baseline, and restricted — and the restricted profile is the target for most workloads. Rather than eyeball whether my spec qualifies, I let the model check it against the actual requirements:

Compare this pod spec against the Kubernetes restricted Pod Security Standard. List every requirement it meets and every one it still fails, citing the specific field for each gap.

The restricted profile demands the set we’ve been building toward (runAsNonRoot, allowPrivilegeEscalation: false, capabilities.drop: ["ALL"], a seccompProfile of RuntimeDefault or Localhost, and no host namespaces), so the model’s gap list becomes a concrete finish-line checklist. Note that readOnlyRootFilesystem is not part of restricted, so passing the profile doesn’t prove that setting is in place. Once the spec passes, I can enforce the profile across an entire namespace with a single label instead of trusting each manifest:

kubectl label namespace prod \
  pod-security.kubernetes.io/enforce=restricted

Hardening individual specs and enforcing the standard at the namespace are complementary: the label catches anything that slips through review, and the per-pod work makes sure your own workloads actually pass the bar before you turn enforcement on. Asking the model to do that mapping turns “I think this is locked down” into “this satisfies restricted, here’s the proof.”
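One way to stage the rollout is to start with the warn and audit modes, which report violations without rejecting pods, and add enforce once the warnings are clean. The sketch below reuses the prod namespace from the command above. The staged order is a suggested approach rather than a requirement.

```yaml
# Sketch only: warn/audit first, enforce later, is a suggested rollout order.
apiVersion: v1
kind: Namespace
metadata:
  name: prod
  labels:
    pod-security.kubernetes.io/warn: restricted    # warn the client on violating pods
    pod-security.kubernetes.io/audit: restricted   # record violations in the audit log
    # Add once existing workloads pass cleanly:
    # pod-security.kubernetes.io/enforce: restricted
```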

A hardened main container with a wide-open init container is a real and easy miss. Init containers often run as root to fix volume permissions or pull setup files, and people forget they need their own securityContext. I make the model audit the whole pod, not just the app container:

Check every init container and sidecar in this pod, not just the main one. Does each have its own securityContext, or are any silently running as root with full capabilities?

If an init container genuinely needs root to chown a volume, the better fix is usually an fsGroup on the pod securityContext, which has the kubelet set group ownership on the volume at mount time (for volume types that support it) and eliminates the privileged init step entirely. The model knows that pattern and will usually suggest it when you ask how to avoid the root init container rather than just accepting it.
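A sketch of both ideas together: fsGroup on the pod replaces the chown step, and any init container that remains carries its own locked-down securityContext. The images, the GID, and the claim name are hypothetical. Whether fsGroup takes effect depends on the volume type and storage driver, so treat this as a starting point to verify in staging.

```yaml
# Sketch only: images, GID, and the PVC name are illustrative assumptions.
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    fsGroup: 2000                         # group ownership applied to supported volumes
    fsGroupChangePolicy: OnRootMismatch   # skip the recursive change when ownership already matches
  initContainers:
    - name: setup
      image: registry.example.com/setup:1.0.0   # hypothetical image
      securityContext:            # init containers need their own container-level settings
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]
      volumeMounts:
        - name: data
          mountPath: /data
  containers:
    - name: api
      image: registry.example.com/api:1.0.0     # hypothetical image
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: api-data       # hypothetical claim
```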