Safe Ansible Rolling Update with Serial and Check Mode Prompt
Design a zero-downtime rolling update play that batches hosts with serial, drains load balancers, runs health checks, and fails fast with max_fail_percentage.
- Target user: Ansible automation and platform engineers
- Difficulty: Advanced
- Tools: Claude, ChatGPT
The prompt
You are a senior Ansible engineer who writes safe rolling-update plays that never take down a whole tier at once.

I will provide:
- The service being updated (app version, restart command, health endpoint) and the inventory group
- The load balancer or service-discovery mechanism (e.g. remove from LB, drain connections, re-add)
- My downtime tolerance (how many hosts may be out at once) and any pre/post checks

Your job:
1. **Batch with serial**: set `serial` (a number, percentage, or ramp list like `[1, 5, "30%"]`) so only a safe slice updates at a time, and explain the tradeoff.
2. **Fail fast**: set `max_fail_percentage` so the rollout halts before a bad release spreads across the fleet.
3. **Drain and re-add**: add pre_tasks to remove the host from the LB / mark it draining, and post_tasks to re-add only after a health check passes.
4. **Gate on health**: use `uri`/`wait_for` with retries against the health endpoint before re-adding; a host that fails the check must stop the batch.
5. **Dry-run first**: show how to validate the whole play with `--check --diff` and `--limit` to one canary host before the real rollout.
6. **Plan rollback**: describe how to redeploy the previous version and re-add hosts if a batch fails mid-rollout.

Output as: (a) the full play YAML (serial, max_fail_percentage, pre_tasks/tasks/post_tasks), (b) the health-check task, (c) the canary `--check`/`--limit` command, (d) the rollback procedure.

Always run the canary with `--check --diff` and `--limit` first; keep `serial` small enough that a failed batch never breaches your downtime tolerance, and let `max_fail_percentage` stop a bad release early.
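Example output shape

For orientation, a response to this prompt might resemble the minimal sketch below. It is illustrative only, not a definitive implementation: the group `app_servers`, the service and package name `myapp`, port 8080, the `/healthz` path, and the `lb-ctl` load balancer CLI are all hypothetical placeholders to be replaced with your own values.

```yaml
# Sketch only: every host, service, path, and command name here is an assumption.
- name: Rolling update of the app tier
  hosts: app_servers                 # assumed inventory group
  become: true
  serial:                            # ramp: 1 canary host, then 5, then 30% of the group
    - 1
    - 5
    - "30%"
  max_fail_percentage: 0             # any failed host in a batch aborts the rollout
  vars:
    app_version: "1.2.3"             # assumed; override with -e for rollback
    health_url: "http://{{ inventory_hostname }}:8080/healthz"   # assumed endpoint

  pre_tasks:
    - name: Drain this host from the load balancer
      ansible.builtin.command:
        argv: [/usr/local/bin/lb-ctl, drain, "{{ inventory_hostname }}"]   # hypothetical CLI
      delegate_to: localhost
      changed_when: true

  tasks:
    - name: Install the target version
      ansible.builtin.package:
        name: "myapp-{{ app_version }}"   # yum/dnf-style version pin; adjust for apt
        state: present

    - name: Restart the service
      ansible.builtin.service:
        name: myapp
        state: restarted

  post_tasks:
    - name: Wait until the health endpoint returns 200
      ansible.builtin.uri:
        url: "{{ health_url }}"
        status_code: 200
      register: health
      until: health is succeeded     # uri fails on any other status, so this retries
      retries: 10
      delay: 6
      delegate_to: localhost

    - name: Re-add this host to the load balancer
      ansible.builtin.command:
        argv: [/usr/local/bin/lb-ctl, enable, "{{ inventory_hostname }}"]  # hypothetical CLI
      delegate_to: localhost
      changed_when: true
```

Assuming the play is saved as `rolling_update.yml` (a placeholder filename), the canary dry run would be `ansible-playbook rolling_update.yml --check --diff --limit <canary-host>`, and a rollback could re-run the same play with `-e app_version=<previous-version>`. Because post_tasks run in order, a host whose health check exhausts its retries is marked failed before the re-add task, so it stays drained and counts toward `max_fail_percentage`. Note that `command` tasks are skipped in check mode, so the dry run validates the play structure and package changes rather than the LB calls.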
Related prompts
- Design an Ansible Dynamic Inventory Prompt
Replace a brittle static inventory with a dynamic inventory plugin (AWS/GCP/Azure or custom script) that auto-groups hosts by tags and keeps group_vars wiring intact.
- Fix Ansible Handlers and Notify Flow Prompt
Diagnose and correct handler behavior in a playbook — handlers not firing, firing too late, firing every run, or not running on failure — and wire notify/listen correctly.