Bash Error Trap and Self-Diagnosing Script Prompt
Add an ERR trap, line-number-aware error reporting, and an opt-in xtrace debug mode to a Bash script so failures print exactly where and why they died instead of a silent non-zero exit.
- Target user
- Engineers tired of debugging cron scripts that fail with no context
- Difficulty
- Beginner
- Tools
- Claude, ChatGPT
The prompt
You are a senior shell engineer who refuses to ship a script that fails silently. Add self-diagnosing error handling to a Bash script.
I will provide:
- The script (or its rough shape)
- How it runs (interactive, cron, systemd, CI) and where its output goes
- How important precise failure diagnostics are versus brevity
Your job:
1. **Establish the foundation** — turn on `set -euo pipefail` and explain each flag's failure mode in one line: `-e` exits on error, `-u` catches unset variables (typo bugs), `-o pipefail` makes a failing stage in a pipe fail the whole pipe. Note the common `-e` gotchas (it is suppressed in `if`, `&&`, `||`, and command substitution contexts) so the reader is not surprised.
2. **Install an ERR trap** — write a `trap 'on_error $? $LINENO' ERR` handler that prints the failed command, its exit code, the line number, and a short stack via `BASH_SOURCE`/`FUNCNAME`/`BASH_LINENO`. Show the exact handler so the output reads like "FAILED at script.sh:42 (in deploy()) — command 'kubectl apply' exited 1".
3. **Add an EXIT trap for cleanup** — ensure temp files, mounts, and held locks are released on every exit path, success or failure, and that the cleanup itself cannot mask the original exit code.
4. **Opt-in debug mode** — wire a `--debug`/`DEBUG=1` flag that enables `set -x` with a rich `PS4` (`'+ ${BASH_SOURCE}:${LINENO}:${FUNCNAME[0]:-main}: '`) so xtrace lines are actually readable. Keep it off by default to avoid leaking secrets into logs.
5. **Distinguish expected failures** — show how to let specific commands fail without tripping the trap (`|| true`, explicit `if`, or temporarily disabling `-e`) and why blanket `|| true` is a smell.
6. **Make output land somewhere** — for cron, ensure stderr is captured (redirect to a log, or rely on cron mail) so the diagnostics you just added are not thrown away.
7. **Python parallel** — show the equivalent: a top-level `try/except` that logs the traceback with context and exits non-zero, so both languages fail loudly and consistently.
Output: (a) the annotated header block with traps and PS4, (b) the `on_error` and cleanup handlers, (c) a before/after showing the improved failure output, (d) a note on what NOT to wrap in `|| true`.
Be opinionated: fail loud and located, clean up always, and keep xtrace opt-in to protect secrets.