NetworkPolicy Default-Deny Baseline Design Prompt

Author a zero-trust NetworkPolicy baseline for a cluster — default-deny ingress and egress per namespace, explicit allow rules for DNS and platform traffic, and a safe rollout that won't black-hole production.

Target user: Platform engineers hardening east-west traffic
Difficulty: Advanced
Tools: Claude, ChatGPT

The prompt

You are a Kubernetes network security engineer who has rolled out default-deny across live clusters without taking down DNS or breaking every workload at once.

I will provide:
- The CNI in use and whether it enforces NetworkPolicy (and egress policy)
- Namespace layout and which workloads talk to which
- Platform dependencies (DNS, metrics scraping, API server, ingress controllers)
- Current policy state (likely none) and the compliance driver

Your job:

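1. **Enforcement reality check** — confirm the CNI actually enforces NetworkPolicy, including egress rules and `ipBlock`. The API server accepts NetworkPolicy objects even when the CNI does not implement them (Flannel on its own, for example), so the policies are silently ignored; state how to verify enforcement before trusting any rule.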

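2. **The default-deny pattern** — provide the canonical per-namespace `deny-all` ingress+egress policy, and explain that an empty `podSelector` (`{}`) selects all pods in the policy's namespace and that `policyTypes` must list both `Ingress` and `Egress` for both directions to be denied. Stress that egress deny will break DNS unless DNS is explicitly allowed.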

3. **DNS allowlist** — the must-have egress rule permitting UDP/TCP 53 to kube-dns/CoreDNS, including the namespace selector for `kube-system` (for example, matching the automatic `kubernetes.io/metadata.name` label). Omitting this rule is the #1 cause of broken rollouts.

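4. **Platform allowlists** — rules for: Prometheus scraping (ingress from the monitoring namespace on metrics ports), ingress controller → app, app → API server (which usually is not a selectable pod, so state whether this needs `ipBlock` or a CNI-specific rule), and any required egress to external endpoints.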

5. **Per-app allow rules** — a template for declaring intended dependencies as explicit allows, keyed by labels not pod names. Show ingress and egress sides.

6. **Safe rollout** — start in a single low-risk namespace, use observability (Hubble / flow logs / dropped-packet metrics) to find what would break BEFORE enforcing, then promote. Explain a "log-only" dry run if the CNI supports it.

7. **Testing** — how to verify a policy with `kubectl exec` connectivity probes between pods, confirming allowed paths work and denied paths fail closed.

8. **Anti-patterns** — CIDR-based rules for pod traffic that break when pod IPs churn, covering ingress but forgetting egress, policies that select nothing because of label typos, and cross-namespace selectors that rely on namespace labels that were never applied.

Output as: (a) enforcement-verification steps, (b) the default-deny + DNS base policies, (c) platform allowlist policies, (d) a per-app allow template, (e) a phased, observability-driven rollout plan with rollback.

Treat DNS egress as the failure that will page you at 2am.
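
As a point of reference for parts (b) and (d) of the requested output, a baseline might look roughly like the sketch below. The namespace `team-a`, the `app: frontend` and `app: api` labels, and port 8080 are hypothetical placeholders. The `k8s-app: kube-dns` pod label and the `kubernetes.io/metadata.name` namespace label are common defaults, but they are assumptions here and should be checked against the actual cluster.

```yaml
# Sketch only. "team-a" is a hypothetical namespace; apply one copy per namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: team-a
spec:
  podSelector: {}          # empty selector = every pod in this namespace
  policyTypes:             # both types listed, no rules given => both directions denied
    - Ingress
    - Egress
---
# DNS egress for every pod in the namespace.
# Assumes CoreDNS/kube-dns pods in kube-system carry the label k8s-app=kube-dns.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: team-a
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        # namespaceSelector and podSelector in ONE list item are ANDed together
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
---
# Per-app template, ingress side: pods labeled app=api (hypothetical) accept
# traffic from pods labeled app=frontend (hypothetical) on TCP 8080.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-allow-from-frontend
  namespace: team-a
spec:
  podSelector:
    matchLabels:
      app: api
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
---
# Per-app template, egress side: under default-deny egress the caller
# also needs an explicit allow toward the callee.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: frontend-allow-to-api
  namespace: team-a
spec:
  podSelector:
    matchLabels:
      app: frontend
  policyTypes:
    - Egress
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: api
      ports:
        - protocol: TCP
          port: 8080
```

NetworkPolicies are additive, so each allow policy widens what the deny-all baseline blocks rather than overriding it. Keeping the DNS rule as its own policy lets it be applied before, or together with, the default-deny. Clusters that run NodeLocal DNSCache or serve DNS on a non-standard port would likely need a different DNS rule than the one sketched here.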
