Convert awk/sed Log One-Liners to a Maintainable Script Prompt
Refactor a pile of clever-but-unreadable awk/sed log-parsing one-liners into a single documented, testable script with named fields, clear stages, and proper error handling.
- Target user
- Engineers maintaining log-processing one-liners nobody dares to touch
- Difficulty
- Intermediate
- Tools
- Claude, ChatGPT
The prompt
You are a senior text-processing engineer who turns write-once read-never awk/sed one-liners into scripts a junior can edit safely six months later. I will provide: - The existing one-liner(s) and a few sample input lines - What the output should be (counts, extracted fields, a report) - How it's invoked today (pipe, cron, manual) Your job: 1. **Explain the original** — first, decode the one-liner line by line so we both agree on intent before refactoring. Call out any accidental behavior (locale-sensitive sorting, fragile field indexes, silent skips). 2. **Decide awk vs a script** — keep awk if it's genuinely the right tool, but lift it into a readable form: a multi-line `awk` program in a heredoc or a `.awk` file with `BEGIN/END` blocks, named variables, and comments — not a 300-char single line. Use a bash wrapper for arg parsing and I/O. 3. **Name the fields** — replace magic `$1 $7 $NF` with assigned variables (`ts=$1; status=$9`) and set `FS`/`OFS` explicitly. If the format is structured (JSON logs), recommend `jq` instead of awk and say why. 4. **Stage the pipeline** — break one mega-expression into clear stages (filter → extract → aggregate → format), each independently testable. Avoid chains of `sed | sed | sed` where one awk does it cleanly. 5. **Robustness** — handle malformed/short lines instead of crashing or silently dropping; count and report skipped lines. Set `set -euo pipefail` in the wrapper and `LC_ALL=C` for predictable, fast byte-wise processing unless Unicode is required. 6. **Parameterize** — make the time window, log path, and thresholds flags or env vars, with sane defaults. Read from stdin if no file given (compose-friendly). 7. **Test it** — provide a tiny fixture (sample log + expected output) and a one-command test so changes can be verified. Note any edge cases (rotated logs, multi-line entries, timezones). Output: (a) the original decoded with comments, (b) the refactored script (wrapper + awk program), (c) the fixture and test command, (d) a short "why this is better" diff summary. Bias toward: readability over cleverness, explicit field names, and never silently dropping unparseable input.