Kubernetes Multi-Tenancy & Hierarchical Namespaces Design Prompt
Design a soft multi-tenancy model on a shared Kubernetes cluster — tenant boundaries with hierarchical namespaces (HNC) or vCluster, propagated policy, isolation depth, and a clear threat model for what 'tenant' actually means.
- Target user: Platform teams building an internal developer platform on shared clusters
- Difficulty: Advanced
- Tools: Claude, ChatGPT
The prompt
You are a platform architect who has run shared multi-tenant Kubernetes clusters and knows that "namespace == tenant" is a half-truth that leaks in five places.

Tell me:
- Who the tenants are (teams within one org, untrusted external customers, ephemeral CI environments)
- Trust level between tenants and the blast radius you can tolerate
- Current namespace/RBAC setup and whether you can mandate policies cluster-wide
- Scale: number of tenants and namespaces, and expected growth

Produce a tenancy design:
1. **Define the isolation contract**: soft (cooperative, same control plane) vs hard (untrusted, needs a separate control plane via vCluster or separate clusters). Pick the right tier for the stated trust level and justify it; don't oversell namespace isolation as security.
2. **Namespace topology**: flat per-team vs Hierarchical Namespace Controller (HNC) with tenant roots and propagated child namespaces. Show how HNC propagates RBAC, NetworkPolicy, ResourceQuota, and LimitRange down a subtree, and the gotchas (same-named objects being overwritten, propagation exceptions).
3. **The five leak points** and how to close each: (a) RBAC scope creep and cluster-scoped resources, (b) shared CRDs/operators, (c) NetworkPolicy default-deny + DNS, (d) node-level escape (Pod Security Admission `restricted`, seccomp, no hostPath), (e) noisy neighbors via ResourceQuota + priority classes.
4. **Per-tenant defaults**: the bundle every new tenant namespace gets: quota, limit range, default-deny NetworkPolicy, baseline RBAC roles, and a Kyverno/Gatekeeper guardrail set.
5. **When to graduate to hard isolation**: concrete signals (untrusted code, compliance boundary, kernel-level risk) and the migration path to vCluster or dedicated clusters.
6. **Self-service**: how a tenant requests a namespace (GitOps PR, operator) without a human granting cluster-admin.
Output: (a) a soft-vs-hard decision matrix for my case, (b) the HNC tenant tree + propagated-policy manifests, (c) the per-tenant default bundle, (d) the leak-point checklist with the control for each, (e) graduation criteria for moving to hard isolation.
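
To calibrate what output item (c) might look like, here is a minimal Python sketch that builds part of a per-tenant default bundle as plain dictionaries. It is an illustration, not a reference implementation: the function name `tenant_default_bundle`, the object names, and every quota and limit value are hypothetical, and the DNS rule assumes cluster DNS runs in `kube-system` and that namespaces carry the automatic `kubernetes.io/metadata.name` label. Baseline RBAC roles and Kyverno/Gatekeeper guardrails are left out.

```python
import json
from typing import Any

Manifest = dict[str, Any]


def tenant_default_bundle(namespace: str, cpu: str = "8", memory: str = "16Gi") -> list[Manifest]:
    """Build default objects for one tenant namespace (all values are placeholders)."""

    def meta(name: str) -> dict[str, str]:
        return {"name": name, "namespace": namespace}

    ns = {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": namespace,
            # Pod Security Admission: enforce the `restricted` profile.
            "labels": {"pod-security.kubernetes.io/enforce": "restricted"},
        },
    }
    quota = {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": meta("tenant-quota"),
        "spec": {"hard": {"requests.cpu": cpu, "requests.memory": memory, "pods": "50"}},
    }
    limits = {
        "apiVersion": "v1",
        "kind": "LimitRange",
        "metadata": meta("tenant-limits"),
        "spec": {"limits": [{
            "type": "Container",
            "defaultRequest": {"cpu": "100m", "memory": "128Mi"},
            "default": {"cpu": "500m", "memory": "512Mi"},
        }]},
    }
    # An empty podSelector selects every pod; listing both policy types with
    # no rules denies all ingress and egress traffic.
    default_deny = {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": meta("default-deny"),
        "spec": {"podSelector": {}, "policyTypes": ["Ingress", "Egress"]},
    }
    # Assumption: cluster DNS lives in kube-system. Without this rule,
    # default-deny egress also breaks name resolution.
    allow_dns = {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": meta("allow-dns-egress"),
        "spec": {
            "podSelector": {},
            "policyTypes": ["Egress"],
            "egress": [{
                "to": [{"namespaceSelector": {
                    "matchLabels": {"kubernetes.io/metadata.name": "kube-system"}}}],
                "ports": [{"protocol": "UDP", "port": 53},
                          {"protocol": "TCP", "port": 53}],
            }],
        },
    }
    return [ns, quota, limits, default_deny, allow_dns]


if __name__ == "__main__":
    # Wrap the objects in a v1 List and print them as JSON.
    bundle = {"apiVersion": "v1", "kind": "List", "items": tenant_default_bundle("team-a")}
    print(json.dumps(bundle, indent=2))
```

Generating the bundle from one function keeps every tenant namespace on the same baseline, which fits the GitOps self-service flow in step 6: a pull request adds a tenant name, and the rendered objects are reviewed like any other change.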