Kubernetes Security Hardening: Pods, RBAC, and Network Policy That Actually Contain a Breach
A default Kubernetes cluster is dangerously permissive. Here's how I harden pods, RBAC, and network policy so one compromised container can't become the whole cluster — with AI auditing the manifests.
A fresh Kubernetes cluster is convenient and dangerously permissive. Pods run as root by default, can often reach every other pod on the network, and service accounts come with more access than most workloads ever use. The platform is built to make things work, not to make them safe — that part is on you.
After 25 years of running production infrastructure, and plenty of it on Kubernetes, here’s how I harden a cluster so that one compromised container stays one compromised container instead of becoming the whole cluster. And how I use AI to audit the manifests before they ship.
The mental model: assume a pod will be compromised
The right design question isn’t “how do I keep every pod safe?” It’s “when a pod is compromised, how far can the attacker get?” Good Kubernetes hardening is about containment — shrinking the blast radius of an inevitable foothold. Three layers do most of the work: the pod’s own privileges, what its identity can do (RBAC), and where it can talk (network policy).
Layer 1: harden the pod’s security context
By default a container can run as root, write to its own filesystem, and hold Linux capabilities it doesn’t need. Lock all of that down in the securityContext:
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 10001
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: app
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]
Each line closes a path: non-root means a container escape doesn’t start as root, read-only root filesystem stops an attacker from writing a payload, dropped capabilities remove kernel privileges malware reaches for, and the seccomp profile restricts dangerous syscalls. This single block is the highest-leverage pod-level change you can make.
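To show where that block sits in a complete manifest, here is a minimal sketch of a Pod carrying the same settings. The pod name, the image reference, and the /tmp mount are assumptions for illustration only. The emptyDir is there because many applications still expect a writable scratch path once the root filesystem is read-only; drop it if yours does not.

```yaml
# Illustrative Pod: name, image, and the /tmp mount are placeholders
apiVersion: v1
kind: Pod
metadata:
  name: hardened-app
spec:
  securityContext:
    runAsNonRoot: true          # kubelet refuses to start a container as UID 0
    runAsUser: 10001
    seccompProfile:
      type: RuntimeDefault      # container runtime's default syscall filter
  containers:
    - name: app
      image: registry.example.com/app:1.0.0
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]
      volumeMounts:
        - name: tmp             # writable scratch space, since / is read-only
          mountPath: /tmp
  volumes:
    - name: tmp
      emptyDir: {}
```

If the application fails at startup with a read-only filesystem error, the usual fix is another narrowly scoped writable mount rather than turning readOnlyRootFilesystem back off.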
Enforce it with Pod Security Admission
Setting a good security context on your manifests doesn’t stop someone from deploying a privileged pod next week. Enforce a floor at the namespace level with Pod Security Admission:
# Namespace label: reject pods that violate the 'restricted' standard
metadata:
  labels:
    pod-security.kubernetes.io/enforce: restricted
Now the cluster rejects a pod that runs as root or requests host access. Policy beats good intentions — the guardrail holds even when someone’s in a hurry.
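For context, a full Namespace manifest with that label might look like the sketch below. The namespace name is hypothetical. The warn and audit labels are additional Pod Security Admission modes, not something the enforce label requires; they are included here on the assumption that you want violations surfaced to the client and in the audit log as well.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: payments   # hypothetical namespace name
  labels:
    # Reject pods that violate the 'restricted' standard
    pod-security.kubernetes.io/enforce: restricted
    # Which version of the standard to evaluate against
    pod-security.kubernetes.io/enforce-version: latest
    # Also return a warning to the client and record an audit annotation
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted
```

One way to roll this out on an existing namespace is to set only warn and audit first, fix whatever they report, and add enforce afterwards.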
Layer 2: lock down RBAC
RBAC is where a contained breach becomes a cluster takeover, because every pod carries a service account token, and that token is an identity an attacker can use against the API server.
Rules I never break:
- No wildcard roles. A role granting * on * is cluster-admin with extra steps. Scope verbs and resources explicitly.
- One service account per workload, scoped to exactly what that workload needs; never use the default SA for anything real.
- Disable token automounting where the workload doesn’t call the API at all:
spec:
  automountServiceAccountToken: false
Most application pods never talk to the Kubernetes API. Mounting a token into them anyway just hands an attacker a credential for free. Turn it off unless it’s needed.
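The same field also exists on the ServiceAccount object itself, which makes "no token" the default for every pod that uses it. A minimal sketch, with hypothetical names, a hypothetical namespace, and a placeholder image:

```yaml
# Dedicated identity for one workload; names and namespace are illustrative
apiVersion: v1
kind: ServiceAccount
metadata:
  name: batch-worker
  namespace: payments
automountServiceAccountToken: false   # default for pods using this SA
---
apiVersion: v1
kind: Pod
metadata:
  name: batch-worker
  namespace: payments
spec:
  serviceAccountName: batch-worker
  # Pod-level setting takes precedence over the ServiceAccount's value
  automountServiceAccountToken: false
  containers:
    - name: worker
      image: registry.example.com/batch-worker:1.0.0   # placeholder image
```

Setting it in both places is redundant, but it keeps the intent visible in the pod spec where reviewers tend to look.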
Watch especially for the escalation traps: any binding that grants create or update on roles, the bind or escalate verbs, or broad read access to secrets. Those let a modest foothold rewrite its own permissions.
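As a contrast to wildcard roles, here is a sketch of what "scoped to exactly what the workload needs" can look like for a workload that does call the API. It assumes a hypothetical service account named api that only has to read one ConfigMap; all names and the namespace are illustrative.

```yaml
# Role: a single verb on a single named object, nothing else
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: api-config-reader
  namespace: payments
rules:
  - apiGroups: [""]                # core API group
    resources: ["configmaps"]
    resourceNames: ["api-config"]  # only this object
    verbs: ["get"]
---
# RoleBinding: grant that Role to the workload's own service account
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: api-config-reader
  namespace: payments
subjects:
  - kind: ServiceAccount
    name: api
    namespace: payments
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: api-config-reader
```

A stolen token for this identity can read one ConfigMap in one namespace. It cannot list secrets, create pods, or touch RBAC objects.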
Layer 3: network policy — deny by default
This is the layer most teams skip, and it’s the one that most limits lateral movement. By default, every pod can reach every other pod. A compromised frontend can talk straight to your database pod, your metrics, anything.
Start with a default-deny policy per namespace, then explicitly allow only the traffic you need:
# Default deny all ingress in the namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
spec:
  podSelector: {}
  policyTypes: ["Ingress"]
---
# Then allow only frontend -> api on 8080
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-api   # illustrative name
spec:
  podSelector:
    matchLabels: { app: api }
  policyTypes: ["Ingress"]
  ingress:
    - from:
        - podSelector: { matchLabels: { app: frontend } }
      ports:
        - { protocol: TCP, port: 8080 }
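With default-deny in place, a compromised pod can’t freely scan and pivot across the cluster. Keep in mind that NetworkPolicy is only enforced when the cluster’s network plugin supports it; on a CNI that doesn’t, these objects are accepted and silently ignored. Lateral movement is the attacker’s whole game, and network policy is what takes it away.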
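The same pattern can be applied to outbound traffic. The sketch below is one possible egress counterpart, not something the ingress policy requires: it denies all egress from the namespace except DNS lookups. It assumes cluster DNS runs in kube-system and listens on port 53, which is common but not universal, so check your own cluster before applying it.

```yaml
# Default deny egress, with DNS as the only exception (assumes DNS in kube-system)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-egress
spec:
  podSelector: {}              # every pod in the namespace
  policyTypes: ["Egress"]
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              # Label set automatically on every namespace by Kubernetes
              kubernetes.io/metadata.name: kube-system
      ports:
        - { protocol: UDP, port: 53 }
        - { protocol: TCP, port: 53 }
```

Every other outbound flow, such as api to database, then needs its own explicit allow rule, in the same way the frontend to api ingress rule was added.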
Don’t forget the supporting controls
A few more that round out a hardened cluster:
- Encrypt secrets at rest in etcd, and lock down who can get them via RBAC.
- Disable the kubelet read-only port and protect the API server.
- Use signed, scanned images and an admission policy that rejects unsigned ones.
- Keep nodes and the control plane patched — a known kernel CVE undoes a lot of this.
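Two of those controls are plain configuration files on self-managed clusters. The sketches below assume you control the API server flags and the kubelet config; on managed services these are often set by the provider instead. The key name and the key value are placeholders.

```yaml
# File passed to kube-apiserver via --encryption-provider-config
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources: ["secrets"]
    providers:
      # First provider is used for writes
      - aescbc:
          keys:
            - name: key1
              secret: <base64-encoded 32-byte key>   # placeholder, generate your own
      # Fallback so secrets written before encryption was enabled stay readable
      - identity: {}
```

```yaml
# Kubelet configuration file: turn off the unauthenticated read-only port
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
readOnlyPort: 0
```

Existing secrets are only encrypted when they are next written, so enabling the provider is usually followed by rewriting every secret once.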
Using AI to audit manifests
Kubernetes YAML is verbose, and the absence of a security setting is invisible — there’s no error for “you forgot runAsNonRoot.” That’s exactly the kind of gap AI catches well as a reviewer.
I paste the manifests and prompt:
“Audit these Kubernetes manifests for security hardening. Flag every missing or weak control: containers running as root, missing readOnlyRootFilesystem, allowPrivilegeEscalation not set false, capabilities not dropped, automountServiceAccountToken left on for a pod that doesn’t need it, wildcard RBAC, and missing network policy. For each, give the line and the corrected YAML.”
The model reliably surfaces the security context block someone left off, the cluster-admin binding added “temporarily,” and the namespace with no default-deny. It reads manifests more thoroughly than I do during a busy review.
The rule holds: AI audits and proposes; a human reviews and applies. For a structured, risk-classified pass over manifest diffs, our Code Review tool runs a static pre-scan plus an AI layer, and you can keep these audit prompts with your other security hardening prompts.
The short version
Design for containment: assume a pod will be compromised and shrink what happens next. Harden the pod security context — non-root, read-only root, drop all capabilities, seccomp — and enforce it cluster-wide with Pod Security Admission. Lock down RBAC with scoped, per-workload service accounts and no wildcards, and turn off token automounting where it’s not needed. Put a default-deny network policy in every namespace so a foothold can’t pivot. Then point AI at the manifests as a tireless reviewer that catches the control you silently forgot — with a human applying every change.
AI-generated manifest audits are assistive, not authoritative. Always review suggested changes against your workload’s real requirements before applying them to a live cluster.