Kubernetes Request Right-Sizing from Prometheus History Prompt

Turn weeks of Prometheus usage history into safe, cost-aware CPU/memory request and limit recommendations per workload — without triggering OOMKills or throttling regressions.

Target user: Platform/FinOps engineers right-sizing workloads across many namespaces
Difficulty: Advanced
Tools: Claude, ChatGPT

The prompt

You are a senior platform engineer who has right-sized thousands of workloads and knows that bad requests cause both wasted spend AND production incidents.

I will provide:
- PromQL results for `container_cpu_usage_seconds_total` (as a `rate()`, in cores), `container_memory_working_set_bytes`, `container_cpu_cfs_throttled_periods_total`, and `container_cpu_cfs_periods_total` over 2-4 weeks
- Current `requests`/`limits` per container
- Workload type (web, batch, JVM, in-memory cache, sidecar)
- VPA recommender output if available, and node instance types + cost per core/GB

Your job:

1. **Pick the right percentile** — explain why p95-p99 of working set (not average) drives the memory request, and why CPU requests should target sustained p90 while limits handle bursts. Call out workloads where average is dangerously misleading (JVM heap, cache warm-up).

2. **Memory is non-compressible** — set memory `request == limit` for anything that OOMKills badly; explain the working-set vs RSS vs cache distinction and why `working_set_bytes` is the correct signal.

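3. **CPU throttling check** — if the ratio of throttled to total CFS periods (`rate(container_cpu_cfs_throttled_periods_total[5m]) / rate(container_cpu_cfs_periods_total[5m])`) exceeds 10%, the limit is too tight regardless of average usage; recommend raising or removing the CPU limit and explain the latency tradeoff.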

4. **Per-workload recommendations** — output a table: container, current req/lim, observed p50/p95/p99, recommended req/lim, % change, monthly $ delta, risk flag.

5. **Guardrails** — never recommend a request below observed p95 memory; never cut CPU request by >50% in one step; flag workloads with too little history (<7 days) as "do not change yet."

6. **QoS implications** — show how the new values shift each pod's QoS class (Guaranteed/Burstable/BestEffort) and what that means for eviction order under node pressure.

7. **Rollout** — stagger changes (10% of replicas first), watch OOMKills + p99 latency + throttling for 24h, and provide rollback values.

8. **VPA decision** — recommend whether to adopt VPA in `Off` (recommend-only), `Initial`, or `Auto` mode per workload, and why `Auto` is unsafe alongside HPA on the same metric.

Output as: (a) recommendation table, (b) patched container resource blocks, (c) a Prometheus alert that catches under-provisioning post-change, (d) staged rollout plan.

Bias toward: safety over savings, p95+ for memory, and explicit "leave it alone" calls when data is thin.
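
To sanity-check the numbers the model returns, the sizing rules from steps 1, 3, and 5 can be reproduced offline. What follows is a minimal Python sketch and is not part of the prompt itself. The names `percentile`, `recommend`, and `Recommendation`, the 15% memory headroom, and the input format (plain lists of samples already exported from Prometheus) are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


def percentile(samples, q):
    """Nearest-rank percentile of a non-empty list; q is in (0, 100]."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


@dataclass
class Recommendation:
    cpu_request_cores: float
    memory_request_bytes: int
    cpu_limit_too_tight: bool
    note: str


def recommend(cpu_cores, mem_bytes, current_cpu_request, current_mem_request,
              throttled_periods, total_periods, history_days,
              mem_headroom=1.15):
    # cpu_cores: per-interval CPU usage in cores, i.e. rate() of
    #   container_cpu_usage_seconds_total, not the raw counter.
    # mem_bytes: samples of container_memory_working_set_bytes.
    # throttled_periods / total_periods: increase of the two CFS counters
    #   over the same window.
    # mem_headroom: hypothetical safety margin, an assumption of this sketch.

    # Step 5: too little history means no change at all.
    if history_days < 7:
        return Recommendation(current_cpu_request, current_mem_request,
                              False, "do not change yet")

    # Step 1: sustained p90 for CPU, p99 of working set for memory.
    cpu_target = percentile(cpu_cores, 90)
    mem_target = math.ceil(percentile(mem_bytes, 99) * mem_headroom)

    # Step 5: never cut the CPU request by more than 50% in one step,
    # and never drop the memory request below observed p95.
    cpu_request = max(cpu_target, 0.5 * current_cpu_request)
    mem_request = max(mem_target, math.ceil(percentile(mem_bytes, 95)))

    # Step 3: more than 10% throttled periods means the CPU limit is too tight.
    ratio = throttled_periods / total_periods if total_periods else 0.0
    return Recommendation(cpu_request, mem_request, ratio > 0.10, "ok")
```

Calling `recommend(...)` once per container and comparing its output with the model's table is one way to spot a recommendation that breaks a guardrail. The sketch covers requests only; limits, QoS class, and cost deltas still need the workload context the prompt asks for.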
