
OpenStack Capacity Planning Prompt

Plan OpenStack capacity — CPU/RAM/disk oversubscription, growth modeling, hypervisor sizing, Cinder backend planning, network bandwidth.

Target user: OpenStack platform engineers and capacity planners
Difficulty: Advanced
Tools: Claude, ChatGPT

The prompt

You are a senior OpenStack platform engineer who has planned and operated clouds with thousands of hypervisors. You know how to set oversubscription ratios that balance density with predictable performance.

I will provide:
- Current state: hypervisor count, vCPUs/RAM per HV, usage stats
- Growth projections (instances/month)
- Workload characteristics (general purpose, GPU, memory-heavy, latency-sensitive)
- SLA requirements

Your job:

1. **Establish baseline**:
   - Total physical CPU / RAM / Disk
   - Current allocation (sum of instance requests)
   - Current actual utilization (sum of measured use)
   - Allocated fraction = allocation / physical capacity (not the same thing as Nova's `*_allocation_ratio` settings)
2. **Apply oversubscription thoughtfully**:
   - **CPU**: typical 4-8x for general workloads, 1:1 for HPC
   - **Memory**: 1-1.5x (memory is harder to reclaim)
   - **Disk**: 1x (no virtual disk oversubscription)
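   - Configure via `cpu_allocation_ratio`, `ram_allocation_ratio`, and `disk_allocation_ratio` in nova.conf per compute host, or set the inventory allocation ratio in Placement for the hosts in an aggregate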
3. **Calculate effective capacity**:
   - effective vCPUs = (pCPUs - reserved) × cpu_ratio
   - effective RAM = (pRAM - reserved) × ram_ratio
   - Available vCPUs = effective vCPUs - allocated vCPUs
4. **Growth model**:
   - Instances per month + average size
   - Months to capacity = (effective - allocated) / monthly demand
   - When to expand (with lead time)
5. **For workload mix**:
   - General-purpose pool with moderate oversubscription
   - GPU/HPC pool with 1:1 (no oversubscription)
   - Memory-heavy DB pool with lower ratio
   - Use host aggregates + flavor extra_specs
6. **Storage capacity**:
   - Cinder backend capacity
   - Ephemeral disk on compute
   - Glance image cache
   - Plan headroom for snapshots / backups
7. **Network capacity**:
   - Tenant network bandwidth aggregate
   - Public IPs available
   - Floating IP pool
   - Bandwidth per compute (NIC)
8. **Quota planning**:
   - Per-project quotas
   - Sum of quotas vs cluster capacity (over-allocation is acceptable if projects do not all consume their quota at the same time)

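Mark as DESTRUCTIVE any recommendation that involves: lowering allocation ratios on a busy cluster (hosts may already hold more than the new limit allows), under-provisioning critical pools (no headroom to absorb a host failure), or ignoring storage headroom (writes fail cluster-wide when a backend fills up).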

---

Current capacity: [DESCRIBE]
Workload mix: [DESCRIBE]
Growth rate: [DESCRIBE]
SLA: [DESCRIBE — uptime, performance]

Why this prompt works

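Capacity planning is part observation, part modeling. This prompt covers both: it starts from measured allocation and utilization, then projects growth against effective capacity.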

How to use it

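  1. Start with measurement: compare current utilization against current allocation.
  2. Define workload pools, each with its own oversubscription ratios.
  3. Model growth as months to capacity for each pool.
  4. Plan expansion with lead time, so hardware is ordered before the runway drops below the time needed to procure and commission it.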
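The arithmetic behind steps 3 and 4 can be sketched in a few lines of Python. This is a minimal illustration, not part of the prompt: the `Pool` class, its field names, and the sample figures (100 hypervisors with 64 threads and 512 GB each, 50 new 4 vCPU / 16 GB instances per month, a 3-month lead time) are all assumptions made up for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Pool:
    """One capacity pool, e.g. a host aggregate. Field names are illustrative."""
    physical_vcpus: int       # physical CPU threads, net of any host reservation
    physical_ram_gb: float    # physical RAM in GB, net of any host reservation
    cpu_ratio: float          # CPU oversubscription ratio applied to the pool
    ram_ratio: float          # RAM oversubscription ratio applied to the pool
    allocated_vcpus: int      # sum of vCPUs requested by existing instances
    allocated_ram_gb: float   # sum of RAM requested by existing instances


def effective_capacity(pool: Pool) -> tuple[float, float]:
    """Return (effective vCPUs, effective RAM in GB) after oversubscription."""
    return pool.physical_vcpus * pool.cpu_ratio, pool.physical_ram_gb * pool.ram_ratio


def months_to_capacity(pool: Pool, vcpus_per_month: float, ram_gb_per_month: float) -> float:
    """Months until the first resource runs out at the given monthly demand."""
    eff_vcpus, eff_ram_gb = effective_capacity(pool)
    runways = []
    if vcpus_per_month > 0:
        runways.append(max(eff_vcpus - pool.allocated_vcpus, 0) / vcpus_per_month)
    if ram_gb_per_month > 0:
        runways.append(max(eff_ram_gb - pool.allocated_ram_gb, 0) / ram_gb_per_month)
    return min(runways) if runways else math.inf


def months_until_order(runway_months: float, lead_time_months: float) -> float:
    """How long you can wait before ordering hardware, given procurement lead time."""
    return max(runway_months - lead_time_months, 0.0)


# Hypothetical general-purpose pool: 100 hypervisors x 64 threads x 512 GB.
general = Pool(
    physical_vcpus=6400, physical_ram_gb=51200,
    cpu_ratio=4.0, ram_ratio=1.0,
    allocated_vcpus=12800, allocated_ram_gb=38400,
)

# 50 instances/month at 4 vCPUs and 16 GB each -> 200 vCPUs and 800 GB per month.
runway = months_to_capacity(general, vcpus_per_month=200, ram_gb_per_month=800)
print(f"runway: {runway:.1f} months, order within {months_until_order(runway, 3):.1f} months")
```

With these sample numbers RAM is the binding constraint (16 months) even though CPU has 64 months of headroom, which is why the model takes the minimum across resources rather than looking at vCPUs alone.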

Useful commands

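# Hypervisor capacity
# (stats show is unavailable from compute API microversion 2.88 onward; use the Placement commands below instead)
openstack hypervisor list --long
openstack hypervisor stats show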

# Per-hypervisor detail
openstack hypervisor show <hostname>

# Resource providers (Placement)
openstack resource provider list
openstack resource provider inventory list <rp>
openstack resource provider usage show <rp>

# Aggregate-based pools
openstack aggregate list --long
openstack aggregate show <agg>

# Flavors
openstack flavor list
openstack flavor show <flavor>

# Project quotas
openstack quota list --compute --project <project>
openstack quota show <project>

# Instance count (value output avoids counting the table borders and header)
openstack server list --all-projects -f value -c ID | wc -l

# Compute service health
openstack compute service list
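Placement derives a resource provider's capacity for each resource class as (total - reserved) × allocation_ratio. The Python sketch below applies that formula to inventory data. It assumes the JSON comes from something like `openstack resource provider inventory list <rp> -f json` and that the keys are named `resource_class`, `total`, `reserved`, and `allocation_ratio`; check the key names against your client version. The sample values are invented.

```python
import json

# Invented sample, shaped like the assumed JSON output for one compute node.
SAMPLE_INVENTORY_JSON = """
[
  {"resource_class": "VCPU", "total": 64, "reserved": 4, "allocation_ratio": 4.0},
  {"resource_class": "MEMORY_MB", "total": 524288, "reserved": 16384, "allocation_ratio": 1.0},
  {"resource_class": "DISK_GB", "total": 3600, "reserved": 100, "allocation_ratio": 1.0}
]
"""


def placement_capacity(inventory_json: str) -> dict[str, int]:
    """Map each resource class to (total - reserved) * allocation_ratio."""
    capacity = {}
    for inv in json.loads(inventory_json):
        usable = inv["total"] - inv["reserved"]
        capacity[inv["resource_class"]] = int(usable * inv["allocation_ratio"])
    return capacity


for resource_class, cap in placement_capacity(SAMPLE_INVENTORY_JSON).items():
    print(f"{resource_class}: {cap}")
```

Summing these per-provider capacities across the hosts of an aggregate gives the effective capacity of that pool, which can then be compared with the figures from `openstack resource provider usage show <rp>`.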

Common findings this catches

  • Allocation above 70% of effective capacity at the current growth rate → expand soon.
  • High allocation but low measured usage → oversubscription is reasonable; the pool can absorb growth.
  • One aggregate near full while others sit empty → flavor design or scheduler bias.
  • Snapshot accumulation consuming the Cinder pool → implement a retention policy.
  • Sum of quotas far above capacity when an incident drives simultaneous demand → some projects are starved; revisit quotas.
  • Network bandwidth saturated in one compute pool → rebalance workloads or upgrade NICs.
  • Memory oversubscription causing OOM kills → reduce the RAM ratio.
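A few of these findings reduce to simple threshold checks. The following Python sketch shows one way to express them; the function names and messages are hypothetical, the 70% allocation threshold comes from the list above, and the 30% low-usage threshold is an assumption chosen only for illustration.

```python
def flag_pool(name: str, effective: float, allocated: float,
              measured_utilization: float,
              alloc_threshold: float = 0.70,
              low_use_threshold: float = 0.30) -> list[str]:
    """Return findings for one pool.

    effective / allocated are in the same unit (e.g. vCPUs);
    measured_utilization is the observed usage as a fraction between 0 and 1.
    low_use_threshold is an illustrative assumption, not a recommended value.
    """
    findings = []
    allocated_fraction = allocated / effective
    if allocated_fraction > alloc_threshold:
        if measured_utilization < low_use_threshold:
            findings.append(f"{name}: high allocation, low usage; can likely absorb growth")
        else:
            findings.append(f"{name}: {allocated_fraction:.0%} allocated; expand soon")
    return findings


def quota_overcommit(project_quotas: dict[str, float], effective: float) -> float:
    """Ratio of summed project quotas to effective capacity (1.0 means fully committed)."""
    return sum(project_quotas.values()) / effective


print(flag_pool("general", effective=1000, allocated=800, measured_utilization=0.65))
print(quota_overcommit({"team-a": 600, "team-b": 900}, effective=1000))
```

A quota overcommit above 1.0 is not a problem in itself; it becomes one when several projects try to use their full quota at once, so it is worth tracking alongside the allocation flags instead of alerting on it alone.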

When to escalate

  • Capacity runway below 90 days: start emergency procurement.
  • Workload shifts that require new pool types: design them with stakeholders.
  • Capacity differences across regions: treat as strategic planning.
