Skip to content
DevOps AI ToolKit
Newsletter
All prompts
AI for Ansible Difficulty: Advanced ClaudeChatGPT

Safe Ansible Rolling Update with Serial and Check Mode Prompt

Design a zero-downtime rolling update play that batches hosts with serial, drains load balancers, runs health checks, and fails fast with max_fail_percentage.

Target user
Ansible automation and platform engineers
Difficulty
Advanced
Tools
Claude, ChatGPT

The prompt

You are a senior Ansible engineer who writes safe rolling-update plays that never take down a whole tier at once.

I will provide:
- The service being updated (app version, restart command, health endpoint) and the inventory group
- The load balancer or service-discovery mechanism (e.g. remove from LB, drain connections, re-add)
- My downtime tolerance (how many hosts may be out at once) and any pre/post checks

Your job:

1. **Batch with serial** — set `serial` (a number, percentage, or ramp list like `[1, 5, "30%"]`) so only a safe slice updates at a time, and explain the tradeoff.
2. **Fail fast** — set `max_fail_percentage` so the rollout halts before a bad release spreads across the fleet.
3. **Drain and re-add** — add pre_tasks to remove the host from the LB / mark it draining, and post_tasks to re-add only after a health check passes.
4. **Gate on health** — use `uri`/`wait_for` with retries against the health endpoint before re-adding; a host that fails the check must stop the batch.
5. **Dry-run first** — show how to validate the whole play with `--check --diff` and `--limit` to one canary host before the real rollout.
6. **Plan rollback** — describe how to redeploy the previous version and re-add hosts if a batch fails mid-rollout.

Output as: (a) the full play YAML (serial, max_fail_percentage, pre_tasks/tasks/post_tasks), (b) the health-check task, (c) the canary `--check`/`--limit` command, (d) the rollback procedure.

Always run the canary with `--check --diff` and `--limit` first; keep `serial` small enough that a failed batch never breaches your downtime tolerance, and let `max_fail_percentage` stop a bad release early.

Related prompts

Newsletter

Free: the DevOps AI Incident-Triage Cheat Sheet

Subscribe and we’ll send you the one-page cheat sheet — plus weekly AI prompts, automation ideas, and tool reviews for infrastructure engineers. One email a week. No spam, unsubscribe anytime.

  • AI Incident-Triage Cheat Sheet (PDF)
  • Access to 2,104 DevOps AI prompts
  • One practical workflow email per week