AI for Infrastructure as Code

Ansible Blocks, Rescue & Always Error Handling Prompt

Design resilient Ansible plays that group tasks into blocks with rescue and always sections so partial failures roll back cleanly instead of leaving hosts half-configured.

Target user: Ansible engineers and platform teams hardening playbooks
Difficulty: Intermediate
Tools: Claude, ChatGPT

The prompt

You are a senior Ansible automation engineer who specializes in writing fault-tolerant
plays for production fleets.

I will provide:
- A playbook or role that performs a multi-step change (e.g. config deploy + service restart).
- The failure modes I am worried about (mid-deploy crash, failed health check, partial rollout).
- Any constraints (no downtime, must converge idempotently, target host count).

Your job:

1. **Identify atomic units** — group related tasks that must succeed or fail together into `block:` sections.
2. **Add rescue paths** — write `rescue:` tasks (not to be confused with Ansible handlers) that restore the prior config, stop or restart services cleanly, and end with a clear `fail` message naming the failed host and task.
3. **Add always cleanup** — use `always:` for unconditional cleanup (temp files, lock removal, fact reset) that must run on success and failure.
4. **Preserve idempotency** — ensure rescue/always tasks are themselves idempotent and safe to re-run.
5. **Control failure scope** — recommend `any_errors_fatal`, `max_fail_percentage`, or `serial` so one host's failure does not corrupt the fleet.
6. **Surface diagnostics** — register results and emit structured failure context for CI logs.
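7. **Edge cases** — note handler-flush behavior on failure (notified handlers do not run on a failed host unless `force_handlers` or `--force-handlers` is set), `when` guards inside rescue, and that a failure whose rescue succeeds does not count toward `max_fail_percentage` or `any_errors_fatal`.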

Output as: (a) refactored YAML with inline comments, (b) a failure-scenario table mapping each
failure to its rescue action, (c) a short test plan using `--check` and a deliberately broken host.

Flag any rescue task that could itself fail and leave the host in a worse state than before.
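
For orientation, the kind of structure this prompt asks the model to produce might look roughly like the play below. It is a minimal sketch, not a reference implementation: the `webservers` group, the `myapp` service, the file paths, the `app.conf.j2` template, and the health URL are all hypothetical, and it assumes the config file already exists on the host before the first run.

```yaml
# Sketch only: group, service, paths, template, and URL are hypothetical.
- name: Deploy app config with rollback
  hosts: webservers
  become: true
  serial: 2                 # roll out two hosts per batch
  max_fail_percentage: 0    # stop the play if any host in a batch fails
  vars:
    app_conf: /etc/myapp/app.conf
    app_conf_backup: /etc/myapp/app.conf.bak
    app_lock: /var/lock/myapp-deploy.lock
    health_url: http://localhost:8080/healthz

  tasks:
    - name: Deploy config and restart as one unit
      block:
        - name: Write deploy lock
          ansible.builtin.copy:
            dest: "{{ app_lock }}"
            content: "{{ inventory_hostname }}\n"

        - name: Back up current config
          ansible.builtin.copy:
            src: "{{ app_conf }}"
            dest: "{{ app_conf_backup }}"
            remote_src: true
          register: backup_result

        - name: Render new config
          ansible.builtin.template:
            src: app.conf.j2
            dest: "{{ app_conf }}"
          register: conf_result

        - name: Restart service only if config changed
          ansible.builtin.service:
            name: myapp
            state: restarted
          when: conf_result is changed

        - name: Wait for health check
          ansible.builtin.uri:
            url: "{{ health_url }}"
            status_code: 200
          register: health
          until: health is succeeded
          retries: 5
          delay: 3

      rescue:
        # Guard: only restore if the backup was actually taken in this run.
        - name: Restore previous config
          ansible.builtin.copy:
            src: "{{ app_conf_backup }}"
            dest: "{{ app_conf }}"
            remote_src: true
          when:
            - backup_result is defined
            - backup_result is succeeded

        - name: Restart service on restored config
          ansible.builtin.service:
            name: myapp
            state: restarted
          when:
            - backup_result is defined
            - backup_result is succeeded

        # Ending rescue with fail keeps the host marked as failed, so it
        # still counts toward max_fail_percentage.
        - name: Report failure with context
          ansible.builtin.fail:
            msg: >-
              Deploy failed on {{ inventory_hostname }} at task
              "{{ ansible_failed_task.name }}":
              {{ ansible_failed_result.msg | default('no message') }}

      always:
        # Runs on success and on failure; removing an absent file is a no-op.
        - name: Remove deploy lock
          ansible.builtin.file:
            path: "{{ app_lock }}"
            state: absent
```

The backup file is deliberately left in place rather than removed in `always:`, so a failed restore does not also destroy the only copy of the previous config. The restore and restart tasks in `rescue:` can themselves fail, which is exactly the kind of risk the prompt's last instruction asks the model to flag.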
