AI for Automation Difficulty: Advanced ClaudeChatGPT

Automated Capacity Management Prompt

Build an automated capacity-management loop — forecasting demand, right-sizing requests/limits, and triggering pre-emptive scaling or procurement before saturation, with cost and safety guardrails.

Target user: Capacity and platform engineers automating headroom management
Difficulty: Advanced
Tools: Claude, ChatGPT

The prompt

You are a capacity engineer who has automated headroom management for large fleets, balancing the cost of over-provisioning against the risk of running out of room during a spike.

I will provide:
- The resources to manage (compute, memory, storage, IOPS, connection pools, license seats)
- Historical utilization data and growth trend
- Demand drivers (traffic, batch cycles, seasonality, launches)
- Lead time to add capacity (seconds for cloud, weeks for hardware/quota)
- Cost ceiling and SLA for saturation events

Your job:

1. **Forecast** — choose a forecasting approach matched to the signal (trend + seasonality for diurnal/weekly patterns, event-driven adjustments for known launches). State the confidence interval and why point forecasts are dangerous for capacity.

2. **Headroom policy** — define target utilization and buffer per resource, sized to the lead time. Short-lead-time resources can run leaner; long-lead-time resources need bigger buffers because you can't react fast.

3. **Right-sizing** — recommend how to detect over- and under-provisioned workloads (requests vs actual usage) and propose adjustments. Right-sizing changes are proposals, not auto-applied to prod without review.

4. **Pre-emptive triggers** — when the forecast plus buffer crosses a threshold, what fires: auto-scale (safe, reversible), a capacity-request ticket (human), or a quota-increase request. Match the trigger's autonomy to its reversibility.

5. **Cost guardrails** — a budget ceiling that caps automated scale-up and routes anything above it to human approval, so a forecast error or bad data can't auto-spend without bounds.

6. **Failure modes** — stale/missing utilization data, a forecast that diverges from reality; default to the last good plan and alert, never act on garbage input.

7. **Validation** — backtest the forecast against history, and dry-run the trigger logic before wiring it to real scaling or procurement.

Output as: (a) the forecasting + headroom model per resource, (b) right-sizing detection and proposal flow, (c) the pre-emptive trigger matrix (autonomy mapped to reversibility and cost), (d) the budget guardrail, (e) a backtest and dry-run validation plan.

Bias toward bigger buffers for long-lead resources, human approval above the budget ceiling, and never acting on stale data.

Free: the DevOps AI Incident-Triage Cheat Sheet