Ansible delegate_to and run_once Correctness Audit Prompt
Audit delegation and run_once usage in a playbook so single-instance tasks, load-balancer drains, and DB migrations run on exactly the right host exactly once.
- Target user
- Engineers using delegate_to and run_once who have been bitten by tasks firing on the wrong host or N times
- Difficulty
- Advanced
- Tools
- Claude, ChatGPT, Cursor
The prompt
You are a senior Ansible engineer who has debugged the classic delegation bugs: a DB migration that ran once per web node, a load-balancer drain that hit the wrong member, and a `run_once` task whose facts came from a host nobody expected. You know `delegate_to`, `run_once`, `delegate_facts`, and `serial` interact in ways that are easy to get subtly wrong. I will give you tasks that use `delegate_to` and/or `run_once`. Audit them for correctness and rewrite the unsafe ones. Steps: 1. **For each delegated task, state the intended target and the actual target**: confirm `delegate_to` resolves to the host you mean (an inventory hostname, `localhost`, or a templated value) and that the connection vars used are the delegate's, not the loop host's. 2. **run_once semantics**: confirm whether the task should run once per play or once per batch under `serial`, and note that `run_once` runs once per batch — flag any case where that surprises the author. 3. **Fact attribution**: where the delegated task registers a variable or gathers facts, clarify whether the result is attributed to the original host or the delegate, and whether `delegate_facts: true` is needed. 4. **Ordering and gating**: for orchestration steps (drain LB member, migrate DB, run smoke test), confirm they are ordered correctly and gated so a failure stops the rollout rather than continuing. 5. **Loop interaction**: flag any `delegate_to` inside a loop where the delegate is constant — this serializes work on one host and can overload it. 6. **Idempotency**: confirm the once-only operations (migrations, registrations) are themselves idempotent or guarded by a `creates`/`when` check. Fill in: - Tasks to audit: [PASTE TASKS] - What each task is meant to accomplish: [DESCRIBE: e.g. "run DB migration once on db01"] - Are you using serial / batches: [yes + batch size / no] Output format: a per-task verdict table (intended target, actual target, run_once scope, fact attribution, correct? + risk), then the corrected YAML, then a short note on how to verify with `--limit` and a single dry run before a full rollout. Do not run anything. Migrations, LB drains, and registrations are destructive when they fire on the wrong host or repeat; recommend `--check`/`--limit` on one batch and a connectivity verification between orchestration steps.
Why this prompt works
Delegation is where Ansible stops being a simple config tool and starts being an orchestrator, and that is exactly where the subtle bugs live. The two most expensive ones are timeless: a database migration written with run_once that quietly fires once per serial batch instead of once per playbook, and a load-balancer drain whose delegate_to resolved to the wrong member because of a templated hostname. This prompt makes the auditor state, for every delegated task, the intended target versus the actual target — which is the single check that catches both classes of mistake before they reach production.
Fact attribution is the part nobody internalizes until it bites them. When a task is delegated, the connection moves to the delegate, but whether the registered result or gathered facts belong to the original host or the delegate depends on delegate_facts. The prompt forces that to be explicit, so a registered variable used three tasks later actually holds the value you think it does. Combined with the loop-interaction check — a constant delegate inside a loop silently serializes every iteration onto one host — it covers the performance traps alongside the correctness ones.
The orchestration framing keeps the output honest about ordering and gating. A rolling deploy is a sequence of once-only steps that must stop on failure, and the prompt insists those steps be ordered and gated rather than fired hopefully. By refusing to run anything and pushing a --limit single-batch dry run first, it treats migrations, drains, and registrations as the destructive, host-specific operations they are.