Container Runtime Sandbox Isolation Review Prompt
Evaluate and design stronger workload isolation — gVisor, Kata Containers, microVMs, and user namespaces — for multi-tenant or untrusted-code workloads where shared-kernel containers aren't enough.
- Target user: Platform engineers running untrusted or multi-tenant container workloads
- Difficulty: Advanced
- Tools: Claude, ChatGPT
The prompt
You are a container-isolation specialist who has deployed sandboxed runtimes for multi-tenant platforms and untrusted-code execution.

I will provide:
- The workload trust level (first-party, multi-tenant, or fully untrusted customer code)
- The current runtime stack (for example, runc on a shared kernel under containerd) and the orchestrator (Kubernetes)
- Performance and density constraints
- The threat I care about (container escape, cross-tenant access, kernel exploit)

Your job:
1. **Frame the isolation gap**: explain that standard runc containers share the host kernel, so a single kernel or syscall vulnerability can lead to container escape. Clarify when that risk is acceptable (trusted first-party code) and when it is not (untrusted or multi-tenant code).
2. **Layered baseline first**: before reaching for sandboxes, confirm the cheap wins are in place: non-root user, dropped capabilities, read-only rootfs, RuntimeDefault or custom seccomp, user namespaces, no privileged pods. Many "we need gVisor" cases are actually missing these.
3. **Compare sandbox runtimes**: gVisor (userspace kernel, syscall interception), Kata/microVM (hardware-virtualized, separate guest kernel), and user-namespace isolation (remapped UIDs, still on the shared host kernel). For each: what it protects against, its escape surface, its performance and density cost, and its compatibility gaps (syscalls, GPUs, host features).
4. **Match runtime to threat**: recommend a runtime per workload class. Be honest that sandboxes add overhead and break some workloads (certain syscalls, device access), so don't sandbox everything reflexively.
5. **Wire into Kubernetes**: RuntimeClass setup, scheduling sandboxed workloads onto capable nodes, and keeping untrusted tenants off shared or trusted runtimes.
6. **Residual risks**: what the sandbox still doesn't cover (hypervisor and host-kernel bugs, side channels, shared storage and network), and the compensating controls (network policy, separate node pools, no host mounts).
7. **Validate**: how to test that isolation actually holds and that the workload still functions correctly under the sandbox.
Output:
(a) a baseline-hardening checklist
(b) a runtime comparison matrix mapped to my threat
(c) RuntimeClass + scheduling config
(d) a residual-risk + compensating-controls list
(e) a rollout + validation plan

Bias toward: exhausting cheap layered controls first, matching isolation strength to the real threat, and honest performance/compatibility trade-offs.
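Example: a minimal Python sketch of what deliverables (a) and (c) might boil down to, offered as an illustration rather than a reference configuration. It builds a RuntimeClass manifest with scheduling constraints and checks a pod spec against the six baseline controls named in step 2. The handler name `runsc`, the label key `sandbox.example.com/runtime`, the taint key `sandbox.example.com/untrusted`, and the image name are all assumptions; the handler in particular must match whatever is configured in containerd on the sandbox nodes. Init containers are not checked, and whether `hostUsers: false` is honored depends on the Kubernetes version and runtime support.

```python
import json
from typing import List

# Hypothetical names; real clusters will use their own label and taint keys.
SANDBOX_LABEL_KEY = "sandbox.example.com/runtime"
SANDBOX_TAINT_KEY = "sandbox.example.com/untrusted"


def runtime_class(name: str, handler: str) -> dict:
    """Build a RuntimeClass that steers its pods onto sandbox-capable nodes."""
    return {
        "apiVersion": "node.k8s.io/v1",
        "kind": "RuntimeClass",
        "metadata": {"name": name},
        "handler": handler,  # must match the runtime handler configured on the node
        "scheduling": {
            "nodeSelector": {SANDBOX_LABEL_KEY: name},
            "tolerations": [
                {"key": SANDBOX_TAINT_KEY, "operator": "Exists", "effect": "NoSchedule"}
            ],
        },
    }


def baseline_gaps(pod: dict) -> List[str]:
    """Return the step-2 baseline controls that a pod spec is missing."""
    spec = pod.get("spec", {})
    pod_sc = spec.get("securityContext", {})
    gaps = []
    # User namespaces are a pod-level setting (hostUsers: false).
    if spec.get("hostUsers", True):
        gaps.append("pod: user namespaces not enabled (hostUsers is not false)")
    for c in spec.get("containers", []):
        name = c.get("name", "?")
        sc = c.get("securityContext", {})
        if sc.get("privileged", False):
            gaps.append(f"{name}: privileged container")
        # Container-level settings override pod-level ones where both exist.
        if not sc.get("runAsNonRoot", pod_sc.get("runAsNonRoot", False)):
            gaps.append(f"{name}: runAsNonRoot not set")
        if not sc.get("readOnlyRootFilesystem", False):
            gaps.append(f"{name}: root filesystem is writable")
        if "ALL" not in sc.get("capabilities", {}).get("drop", []):
            gaps.append(f"{name}: capabilities not dropped")
        seccomp = sc.get("seccompProfile") or pod_sc.get("seccompProfile") or {}
        if seccomp.get("type") not in ("RuntimeDefault", "Localhost"):
            gaps.append(f"{name}: no RuntimeDefault or custom seccomp profile")
    return gaps


if __name__ == "__main__":
    pod = {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "tenant-job"},
        "spec": {
            "runtimeClassName": "gvisor",  # selects the RuntimeClass built below
            "hostUsers": False,
            "securityContext": {
                "runAsNonRoot": True,
                "seccompProfile": {"type": "RuntimeDefault"},
            },
            "containers": [
                {
                    "name": "job",
                    "image": "registry.example.com/tenant/job:1.0",  # placeholder
                    "securityContext": {
                        "readOnlyRootFilesystem": True,
                        "capabilities": {"drop": ["ALL"]},
                    },
                }
            ],
        },
    }
    print(json.dumps(runtime_class("gvisor", "runsc"), indent=2))
    print("baseline gaps:", baseline_gaps(pod))
```

Because `kubectl apply -f` accepts JSON as well as YAML, the printed RuntimeClass can be saved to a file and applied directly once the names are replaced with real ones. Running the baseline check before choosing a sandbox follows the prompt's ordering: fix the cheap controls first, then decide whether the remaining threat justifies gVisor or Kata.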