DevOps AI ToolKit
AI for Kubernetes & Helm · By James Joyner IV · 9 min read

Securing a Kubernetes Cluster: Pod Security and Admission Control

Pod Security Standards and admission controllers stop dangerous workloads before they run. Here's how to lock down a cluster without breaking deploys, with AI help.

  • #kubernetes
  • #security
  • #pod-security
  • #admission-control
  • #ai
  • #policy

A default Kubernetes cluster will happily run a privileged container, mounted to the host filesystem, running as root, with the host network. Nothing stops it. The job of cluster security is to put gates in front of that — to reject dangerous workloads before they schedule, not to detect them afterward. That’s what Pod Security and admission control do.

Here’s how I lock down a cluster without turning every deploy into a fight, and where AI helps translate policy violations into fixes.

The shift: PodSecurityPolicy is gone

If you learned this years ago, unlearn the first thing: PodSecurityPolicy (PSP) was removed in Kubernetes 1.25. It was confusing and order-dependent, and it’s not coming back. The built-in replacement is Pod Security Admission (PSA), and for anything beyond the basics, a policy engine like Kyverno or OPA Gatekeeper.

Pod Security Standards: three levels

PSA enforces three predefined profiles:

  • Privileged: no restrictions. For infrastructure workloads that genuinely need it.
  • Baseline: blocks the known-bad, including privileged containers, host namespaces, host ports, and hostPath volumes. A sane minimum.
  • Restricted: hardened. Containers must run as non-root, drop all capabilities, disallow privilege escalation, and use a seccomp profile of RuntimeDefault or Localhost. A read-only root filesystem is not part of the profile, but it pairs well with it.

You apply these per namespace with labels. There are three modes, and each one can be set to its own profile:

apiVersion: v1
kind: Namespace
metadata:
  name: payments
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted

  • warn: admits the pod but returns a warning to the user.
  • audit: admits the pod but records the violation in the audit log.
  • enforce: rejects the pod.

Roll out in warn mode first

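The mistake that breaks deploys: setting enforce: restricted on an existing namespace. Pods that are already running stay up, but every new pod that violates the profile is rejected, so the next rollout, restart, or reschedule fails across the namespace at once. Don’t. Roll out in stages: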

  1. Set warn and audit to restricted — nothing breaks, but you collect every violation.
  2. Read the warnings and the audit log; fix the workloads.
  3. Then flip enforce to restricted.

This turns a cluster-wide outage into a punch list. The warn/enforce split exists precisely so you can see the blast radius before you pull the trigger.
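
As a rough sketch of stage 1, the namespace below carries only the warn and audit labels, so nothing is rejected yet. The payments name is reused from the earlier example; the version labels are optional and shown here only as an illustration.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: payments
  labels:
    # Stage 1: report only. There is no enforce label yet,
    # so violating pods are still admitted.
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted
    # Optional: which version of the profile to evaluate against.
    # "latest" tracks the cluster version; a value such as v1.29 pins it.
    pod-security.kubernetes.io/warn-version: latest
    pod-security.kubernetes.io/audit-version: latest
```

Before adding the enforce label, a server-side dry run such as kubectl label --dry-run=server --overwrite ns payments pod-security.kubernetes.io/enforce=restricted should list the existing pods that would violate the profile, without changing anything.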

What restricted actually requires

Most workloads fail restricted for the same handful of reasons, and the fix usually starts in the pod spec:

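# Container-level securityContext
# (allowPrivilegeEscalation and capabilities can only be set per container)
securityContext:
  runAsNonRoot: true                # fixes "runAsNonRoot != true"
  runAsUser: 1000                   # any non-zero UID the image can run as
  seccompProfile:
    type: RuntimeDefault            # fixes the seccompProfile violation
  allowPrivilegeEscalation: false   # fixes "allowPrivilegeEscalation != false"
  capabilities:
    drop: ["ALL"]                   # fixes "unrestricted capabilities"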

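The one that bites hardest is runAsNonRoot: your image must be able to run as a non-root user, and it should listen on an unprivileged port such as 8080 rather than 80 (if you must keep a low port, NET_BIND_SERVICE is the only capability restricted lets you add back). Its usual companion is readOnlyRootFilesystem, which restricted does not require but which is worth setting at the same time; the app must then write only to mounted volumes such as an emptyDir. Both are good hygiene; both can require image changes you can’t always make instantly. That’s why warn mode first matters.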
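
Put together, a pod that passes restricted and also runs with a read-only root filesystem might look like the sketch below. The pod name, image reference, and /tmp mount path are hypothetical; which paths need to be writable depends entirely on the application.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web                  # hypothetical name
  namespace: payments
spec:
  securityContext:           # pod-level defaults inherited by every container
    runAsNonRoot: true
    runAsUser: 1000
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: web
    image: registry.example.com/web:1.4.2   # hypothetical image
    ports:
    - containerPort: 8080    # unprivileged port, no extra capability needed
    securityContext:         # these fields exist only at container level
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true   # not required by restricted; extra hardening
      capabilities:
        drop: ["ALL"]
    volumeMounts:
    - name: tmp
      mountPath: /tmp        # the only writable path in this sketch
  volumes:
  - name: tmp
    emptyDir: {}
```

If the app fails to start with this spec, the errors usually point at a path it tried to write to; each of those either gets its own emptyDir mount or a change in the image.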

Where built-in PSA stops and policy engines start

PSA is namespace-level and profile-based — it can’t express “images must come from our registry” or “every pod must have resource limits” or “no :latest tags.” For that you need Kyverno or OPA Gatekeeper, which run as admission webhooks and let you write arbitrary policy:

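# Kyverno: require a memory limit on every container
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-limits
spec:
  # Start in Audit; switch to Enforce once the policy reports are clean.
  # Newer Kyverno releases prefer a per-rule validate.failureAction,
  # so check the docs for the version you run.
  validationFailureAction: Audit
  rules:
  - name: require-limits
    match:
      any:
      - resources:
          kinds: [Pod]
    validate:
      message: "Resource limits are required"
      pattern:
        spec:
          containers:
          - resources:
              limits:
                memory: "?*"   # any non-empty value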

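Kyverno policies are plain YAML (easier to start with); Gatekeeper uses Rego (more powerful, steeper curve). Both support a report-only stage before blocking: Kyverno’s Audit before Enforce, and Gatekeeper’s enforcementAction: dryrun (or warn) before deny. Same staged-rollout discipline.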
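
The other two examples mentioned above, a trusted registry and no :latest tags, can be sketched in the same pattern style. This is a minimal illustration, not a complete policy: registry.example.com is a placeholder, the policy and rule names are made up, and only spec.containers is checked (init and ephemeral containers would need their own patterns).

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-images          # hypothetical policy name
spec:
  # Report-only to begin with. Newer Kyverno releases prefer a
  # per-rule validate.failureAction; check the docs for your version.
  validationFailureAction: Audit
  rules:
  - name: require-trusted-registry
    match:
      any:
      - resources:
          kinds: [Pod]
    validate:
      message: "Images must come from registry.example.com"
      pattern:
        spec:
          containers:
          - image: "registry.example.com/*"   # placeholder registry
  - name: disallow-latest-tag
    match:
      any:
      - resources:
          kinds: [Pod]
    validate:
      message: "The :latest tag is not allowed"
      pattern:
        spec:
          containers:
          - image: "!*:latest"                # "!" negates the pattern
```

Keeping the two checks as separate rules means each violation gets its own message, which makes the audit results easier to sort into a punch list.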

Where AI helps

Admission rejections produce a wall of text — violates PodSecurity "restricted:latest": allowPrivilegeEscalation != false, unrestricted capabilities, runAsNonRoot != true — and you have to translate each clause into a securityContext change. That’s a perfect AI task:

“This pod was rejected by Pod Security Admission with profile ‘restricted’. Here’s the error and the pod spec. Give me the exact securityContext changes to make it compliant, and flag any that require an image rebuild rather than a spec change.”

The “requires an image rebuild” flag is the useful part — it separates the changes you can make now from the ones that need a Dockerfile change. Keep a set of Kubernetes security prompts ready for this translation work.

The other layers, briefly

Admission control is one layer. A locked-down cluster also wants:

  • Tight RBAC — least privilege on every ServiceAccount.
  • Network Policies — default-deny so a compromised pod is contained.
  • Image scanning in CI and an admission policy that blocks unscanned or high-CVE images.
  • automountServiceAccountToken: false on pods that don’t call the API.
  • Audit logging enabled so you can answer “what happened” after the fact.

No single control is sufficient; defense in depth is the whole game.
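
Two of those layers are short enough to sketch here. The manifests below assume the payments namespace from earlier; the pod name and image are hypothetical. The NetworkPolicy only has an effect if the cluster’s network plugin enforces NetworkPolicy.

```yaml
# Default-deny: selects every pod in the namespace and allows no traffic.
# Denying egress also blocks DNS, so explicit allow rules must follow.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: payments
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
---
# A pod that never calls the Kubernetes API does not need a token mounted.
apiVersion: v1
kind: Pod
metadata:
  name: worker                   # hypothetical name
  namespace: payments
spec:
  automountServiceAccountToken: false
  containers:
  - name: worker
    image: registry.example.com/worker:2.0.1   # hypothetical image
```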

A lockdown sequence that won’t break Friday

  1. Label namespaces warn + audit at baseline, fix violations, enforce baseline.
  2. Repeat for restricted on namespaces that can take it.
  3. Add Kyverno in Audit mode for the policies PSA can’t express.
  4. Fix violations, then flip Kyverno to Enforce.
  5. Layer in Network Policies, RBAC tightening, and image scanning.
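
Between steps 1 and 2, a namespace typically sits in a mixed state: baseline is enforced while restricted is still only reported. A sketch of that intermediate state, reusing the payments namespace from earlier:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: payments
  labels:
    # Step 1 is done: baseline violations are now rejected.
    pod-security.kubernetes.io/enforce: baseline
    # Step 2 in progress: restricted violations are reported, not blocked.
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted
```

Once the restricted warnings stop appearing, changing the enforce label to restricted is a one-line diff.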

Before security policies merge, I run them through the Code Review tool — it catches a policy set to Enforce before anyone’s checked the audit log, which is how you take down every deploy at once.

Cluster security isn’t about catching attackers after the fact; it’s about making the dangerous configuration impossible to deploy. Stage every control through warn/audit before enforce, let AI translate the violations into spec changes, and you harden the cluster with far less risk of a broken-deploy incident.

AI compliance suggestions are assistive. Always test policy changes in audit mode against real workloads before enforcing.
