Cinder Scheduler Weigher & Capacity Filter Tuning Prompt
Tune the Cinder scheduler filters and weighers so volumes land on the right backend pool — balancing capacity, allocation ratio, and affinity — instead of clustering on one backend or hitting 'no valid backend'.
- Target user: OpenStack operators running multi-backend Cinder block storage
- Difficulty: Intermediate
- Tools: Claude, ChatGPT
The prompt
You are a senior OpenStack block-storage engineer who has tuned the Cinder scheduler so volumes spread evenly across pools instead of hot-spotting one backend until it fills.

I will provide:
- `cinder.conf` scheduler section (`scheduler_default_filters`, `scheduler_default_weighers`, weight multipliers)
- Backend/pool inventory (`cinder get-pools --detail`: capacity, allocated, `max_over_subscription_ratio`, thin/thick)
- Volume-type extra-specs and any affinity/anti-affinity hints in use
- Symptoms (one pool fills, scheduling fails, volumes ignore a healthy backend)

Your job:
1. **Read the pool stats:** interpret `total_capacity_gb`, `free_capacity_gb`, `provisioned_capacity_gb`, `max_over_subscription_ratio`, and `reserved_percentage` for each pool; identify which pools the `CapacityFilter` is silently excluding and why.
2. **Filter chain:** confirm the active `scheduler_default_filters` (by default `AvailabilityZoneFilter`, `CapacityFilter`, and `CapabilitiesFilter`; optionally `DriverFilter` and `InstanceLocalityFilter`); explain how a too-strict capability or a wrong volume-type extra-spec produces "no valid backend".
3. **Weigher tuning:** explain `CapacityWeigher` (`capacity_weight_multiplier`: positive spreads, negative packs), `AllocatedCapacityWeigher` (`allocated_capacity_weight_multiplier`), and `VolumeNumberWeigher` (`volume_number_multiplier`); note that for the latter two the sign convention is reversed (negative spreads, positive packs), and that only weighers listed in `scheduler_default_weighers` take effect. Recommend multipliers for the stated goal (spread vs consolidate).
4. **Over-subscription correctness:** verify the thin-provisioning math: how `max_over_subscription_ratio` and `reserved_percentage` interact, and how a wrong ratio either rejects healthy pools or over-commits a backend to failure.
5. **Volume-type alignment:** ensure extra-specs (`volume_backend_name`, capability keys) actually match what backends report; a typo here is the most common scheduling bug.
6. **Validate:** create test volumes of each type, confirm placement matches intent via `cinder show` (the admin-only `os-vol-host-attr:host` field) and the scheduler logs, and re-check pool balance.
Output as: (a) per-pool capacity interpretation, (b) `cinder.conf` filter/weigher diff with multiplier rationale, (c) volume-type extra-spec corrections, (d) test-volume placement verification plan, (e) a note on over-subscription risk.

A negative `capacity_weight_multiplier` packs volumes and can fill a backend fast, so change multipliers gradually and watch pool free space.
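
To sanity-check the capacity interpretation and multiplier advice the prompt returns, the filter and weigher arithmetic can be approximated in a few lines of Python. This is a simplified sketch modeled on the commonly documented behavior of `CapacityFilter` and `CapacityWeigher`, not the scheduler's actual code. The `Pool` class, the helper names, and the sample pools are all hypothetical, and the weigher here ranks on free space after the reserve rather than on the virtual free capacity the real weigher uses for thin pools. Verify the details against the Cinder release you run.

```python
import math
from dataclasses import dataclass


@dataclass
class Pool:
    """Hypothetical container; field names mirror `cinder get-pools --detail`."""
    name: str
    total_capacity_gb: float
    free_capacity_gb: float
    provisioned_capacity_gb: float
    max_over_subscription_ratio: float = 1.0
    reserved_percentage: int = 0
    thin_provisioning_support: bool = False


def usable_free_gb(pool):
    """Physical free space left after holding back reserved_percentage."""
    reserved_gb = math.floor(pool.total_capacity_gb * pool.reserved_percentage / 100)
    return pool.free_capacity_gb - reserved_gb


def capacity_check(pool, requested_gb):
    """Approximate the CapacityFilter decision; returns (passes, reason)."""
    if pool.total_capacity_gb <= 0:
        return False, "total capacity is zero or missing"
    free_gb = usable_free_gb(pool)
    if pool.thin_provisioning_support and pool.max_over_subscription_ratio >= 1:
        # Thin pools are checked on the projected provisioned/total ratio first.
        projected = (pool.provisioned_capacity_gb + requested_gb) / pool.total_capacity_gb
        if projected > pool.max_over_subscription_ratio:
            return False, f"projected ratio {projected:.2f} exceeds max_over_subscription_ratio"
        # Then on free space scaled by the over-subscription ratio.
        if free_gb * pool.max_over_subscription_ratio < requested_gb:
            return False, "virtual free capacity too small"
        return True, "fits (thin)"
    if free_gb < requested_gb:
        return False, "free capacity after reserve too small"
    return True, "fits (thick)"


def rank_by_free_capacity(pools, capacity_weight_multiplier):
    """Min-max normalize free space, scale by the multiplier, best pool first."""
    if not pools:
        return []
    free = [usable_free_gb(p) for p in pools]
    lo, hi = min(free), max(free)
    span = hi - lo
    scores = {
        p.name: capacity_weight_multiplier * ((f - lo) / span if span else 0.0)
        for p, f in zip(pools, free)
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical inventory, not taken from any real deployment.
POOLS = [
    Pool("pool_a", 1000, 400, 1900, 2.0, 10, True),   # thin, already heavily provisioned
    Pool("pool_b", 1000, 150, 850, 1.0, 10, False),   # thick, nearly full
    Pool("pool_c", 1000, 600, 800, 2.0, 10, True),    # thin, healthy
    Pool("pool_d", 2000, 1200, 800, 1.0, 5, False),   # thick, healthy
]

REQUEST_GB = 200
survivors = [p for p in POOLS if capacity_check(p, REQUEST_GB)[0]]

for p in POOLS:
    print(p.name, capacity_check(p, REQUEST_GB))
print("spread:", rank_by_free_capacity(survivors, 1.0))
print("pack:  ", rank_by_free_capacity(survivors, -1.0))
```

In this sketch `pool_a` is rejected even though it has 400 GB physically free, because the new volume would push its provisioned ratio past `max_over_subscription_ratio`. That is the "healthy pool silently excluded" case from step 1. Flipping the multiplier's sign reverses the ranking of the surviving pools, which is why a negative value concentrates new volumes on the fullest backend.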