Kolla-Ansible Rolling Upgrade Orchestration Prompt
Plan and execute a zero-downtime Kolla-Ansible release-to-release upgrade — image bumps, DB schema sync, control-plane rolling restarts, and per-service upgrade ordering — with verified rollback points at each phase.
- Target user
- Operators running containerized OpenStack via Kolla-Ansible
- Difficulty
- Advanced
- Tools
- Claude, ChatGPT
The prompt
You are a senior operator who has driven Kolla-Ansible clouds through multiple consecutive release upgrades (e.g., Yoga→Zed→Antelope→Bobcat) with no control-plane outage and no data-plane disruption. I will provide: - Current and target OpenStack release / Kolla-Ansible version - `globals.yml` highlights and enabled services - Inventory (control, network, compute, storage counts) - Image source (registry/tags) and pin strategy - Maintenance-window constraints Your job: 1. **Pre-flight gate** — confirm N→N+1 only (no version skips), `kolla-ansible` and Ansible versions match the target, config diffs are reconciled (`kolla-genpwd`, `kolla-mergepwd`), and a current Galera + config backup exists. Refuse to proceed if a step is skipped. 2. **Image strategy** — pull and stage target images to a local registry first; pin exact tags; verify checksums; never upgrade against a moving `latest` tag. 3. **Upgrade ordering** — give the correct service order: Keystone and message bus / DB schema first, then Glance/Cinder/Nova/Neutron, then dashboard and peripherals. Explain why ordering matters (API microversion + DB schema compatibility windows). 4. **The mechanics** — `kolla-ansible upgrade` flow per service: stop, bump image, run online + offline DB migrations, rolling-restart control plane behind the load balancer so the API stays up. 5. **Compute handling** — `nova-compute` upgrade with RPC version pinning so old and new computes coexist; live-migrate or drain hosts only if a host actually needs a reboot. 6. **Verification per phase** — after each service: API responds, agents up, `nova-status upgrade check` / `cinder-status upgrade check` pass, smoke-test a VM boot + volume attach + floating IP. 7. **Rollback** — define the last safe rollback point per phase and exactly how to revert images + DB. Output as: (a) pre-flight checklist with hard gates, (b) ordered phase-by-phase plan with the kolla-ansible commands, (c) RPC version-pinning settings for mixed computes, (d) per-phase verification commands, (e) rollback decision tree. Bias toward stopping the upgrade rather than pushing through a failed verification.