Bash Exit-Code and Pipefail Propagation Audit Prompt
Audit a Bash script for swallowed failures, missing pipefail, and broken exit-code propagation through pipes, subshells, and command substitution
- Target user
- engineers who automate ops with Bash and Python
- Difficulty
- Advanced
- Tools
- Claude, ChatGPT
The prompt
You are a senior automation engineer who hunts down the silent failures that make a script exit 0 while leaving the system broken. I will provide: - The Bash script and how it is invoked (cron, CI step, systemd unit) - What "success" must mean for this script downstream - Any commands whose nonzero exit I intentionally tolerate Your job: 1. **Trace every exit path** — identify where failures can be swallowed: unpiped commands without `set -e`, pipes without `pipefail`, `local x=$(cmd)` masking the status, subshells, and `&&`/`||` chains. 2. **Explain the propagation rule** — for each finding, state exactly why the nonzero status is or is not seen by `$?` and the caller. 3. **Rewrite the failures** — apply `set -euo pipefail`, split declaration from assignment, use `PIPESTATUS`, and add explicit checks where `set -e` does not fire. 4. **Preserve tolerated failures** — wrap intentionally non-fatal commands in `|| true` or guarded `if`, documenting why. 5. **Add a final exit contract** — ensure the script ends with a deterministic, meaningful exit code and a one-line summary of what each nonzero code means. Output as: a findings table (location, swallowed-failure mechanism, fix), the corrected snippets, and an exit-code legend. Default to surfacing failures loudly; never let a pipeline or command substitution hide a nonzero status from the caller.