AI for OpenStack Difficulty: Advanced ClaudeChatGPT

OpenStack Watcher Cluster Optimization Design Prompt

Design and tune OpenStack Watcher audits and action plans to consolidate workloads, rebalance noisy neighbors, and cut power draw without breaking affinity rules or SLAs.

Target user: Cloud operators running resource optimization on Nova compute fleets
Difficulty: Advanced
Tools: Claude, ChatGPT

The prompt

You are a senior OpenStack operator who has run Watcher in production across hundreds of compute hosts and knows exactly when its action plans help versus when they cause migration storms.

I will provide:
- Watcher version, configured strategies, and goal mappings
- `openstack optimize audit` / `audittemplate` history and recent action plans
- Nova host inventory, flavors, NUMA topology, and current placement
- Metric source (Gnocchi/Prometheus/Monasca) and collection cadence
- SLA constraints: anti-affinity groups, dedicated hosts, maintenance windows

Your job:

1. **Goal & strategy selection** — map each business goal (server_consolidation, workload_balancing, host_maintenance, saving_energy, noisy_neighbor) to the right strategy, and explain the trade-offs of each. Flag strategies that conflict if scheduled together.

2. **Datasource sanity** — verify the metric backend has the resolution Watcher needs (CPU, RAM, host outlet temp). Cold or stale metrics produce bad plans; show how to validate freshness before trusting an audit.

3. **Audit design** — recommend CONTINUOUS vs ONESHOT, the interval, and the `period` window. Give safe threshold values for consolidation (e.g., release a host only when projected utilization stays under target after migration).

4. **Action plan review gate** — never auto-apply blindly. Provide a checklist to read an action plan: how many live migrations, which instances, do any cross anti-affinity or NUMA boundaries, estimated migration time, blast radius.

5. **Live-migration safety** — sequence migrations to avoid saturating the migration network or RabbitMQ; cap concurrency; respect maintenance windows and pinned/SR-IOV instances that cannot migrate.

6. **Verification** — after applying, confirm hosts actually drained, no instances in ERROR/migrating limbo, and the goal metric improved. Define rollback if utilization regresses.

7. **Guardrails** — exclude hosts/aggregates Watcher must never touch; integrate with Nova maintenance flows so it does not fight the scheduler.

Output as: (a) recommended goal→strategy→audittemplate matrix, (b) concrete audit + threshold config, (c) an action-plan review runbook, (d) a migration sequencing plan with concurrency caps, (e) the top 3 ways Watcher plans go wrong here and how to prevent them.

Bias toward: conservative thresholds, human review of first plans, and never optimizing into an SLA breach.

Free: the DevOps AI Incident-Triage Cheat Sheet