Structured Logging in Bash and Python Automation Scripts
echo statements don't scale past one machine. Here's how to add leveled, structured JSON logging to bash and Python so your automation is searchable and debuggable.
- #bash
- #python
- #logging
- #json
- #observability
- #automation
When a script runs on your laptop, echo "doing the thing" is fine. When that same script runs unattended across forty machines on a schedule, those echoes scatter into log files nobody reads, with no timestamps, no severity, and no way to search them. The first time a scheduled job fails silently and you have no log to explain why, you understand the cost of casual logging.
After 25 years of running automation in production, here is how I add real structured logging to bash and Python without much ceremony.
Levels and stderr: the bash minimum
The single most important upgrade to a bash script is a logging function that adds a timestamp and a level, and sends logs to stderr so they do not pollute the script’s actual output:
log() {
    local level="$1"; shift
    # UTC ISO-8601 timestamp, then the level, then the message; all of it goes to stderr.
    printf '%s [%s] %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$level" "$*" >&2
}

log INFO "starting backup of ${SOURCE}"
log WARN "retrying upload, attempt ${attempt}"
log ERROR "backup failed: ${err}"
Two things matter here. First, redirecting to stderr with >&2 keeps your logs out of stdout, so a script that produces data (a CSV, a JSON blob) stays pipeable while still logging. Second, the UTC ISO-8601 timestamp means logs from different machines actually correlate. Use date -u everywhere: local timezones in logs are a debugging nightmare across a fleet.
Controllable verbosity
Add a level threshold so a script can run quiet in production and verbose when you are debugging:
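LOG_LEVEL="${LOG_LEVEL:-INFO}"
# Associative arrays need bash 4.0 or newer (the stock macOS bash 3.2 lacks them).
declare -A LEVELS=([DEBUG]=0 [INFO]=1 [WARN]=2 [ERROR]=3)

log() {
    local level="$1"; shift
    # Drop the message when its level is below the configured threshold.
    (( LEVELS[$level] >= LEVELS[$LOG_LEVEL] )) || return 0
    printf '%s [%s] %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$level" "$*" >&2
}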
Now LOG_LEVEL=DEBUG ./script.sh shows everything, and the default run shows INFO and up.
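For scripts written in Python, the same environment-variable threshold could look roughly like the sketch below. The helper name level_from_env is hypothetical, and falling back to INFO for an unrecognized value is an assumption, not something the bash version does.

```python
import logging
import os


def level_from_env(default="INFO"):
    """Map the LOG_LEVEL environment variable to a logging level number."""
    name = os.environ.get("LOG_LEVEL", default).upper()
    # getLevelName returns an int for known names and a string otherwise.
    level = logging.getLevelName(name)
    return level if isinstance(level, int) else logging.INFO


# Configure once at startup; LOG_LEVEL=DEBUG python script.py then shows everything.
logging.basicConfig(
    level=level_from_env(),
    format="%(asctime)s [%(levelname)s] %(message)s",
)
```

Like the bash version, an unset LOG_LEVEL defaults to INFO.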
Structured JSON when a log system is collecting
The moment logs flow into Loki, Elasticsearch, or CloudWatch, plain text becomes a liability — you cannot filter on fields that do not exist. Emit JSON instead, so every field is queryable:
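log_json() {
    local level="$1" msg="$2"
    # Escape backslashes first, then double quotes, so the message stays valid JSON.
    # Newlines and other control characters are not handled here.
    msg=${msg//\\/\\\\}
    msg=${msg//\"/\\\"}
    printf '{"ts":"%s","level":"%s","msg":"%s","host":"%s"}\n' \
        "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$level" "$msg" "$(hostname)" >&2
}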
Now you can query “all ERROR logs from host web-3 in the last hour” instead of grepping free text. For anything more than a few fields, though, bash string-building gets fragile fast (quoting, escaping) — that is the signal to log from Python.
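As a rough illustration of why Python is the easier place to build JSON, here is a minimal sketch of the same log_json idea using the standard library. The stream parameter and the extra keyword fields are assumptions added for illustration; they are not part of the bash function above.

```python
import json
import socket
import sys
from datetime import datetime, timezone


def log_json(level, msg, stream=None, **fields):
    """Write one JSON log line to stderr (or to the given stream)."""
    record = {
        "ts": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "level": level,
        "msg": msg,
        "host": socket.gethostname(),
        **fields,  # hypothetical extra fields supplied by the caller
    }
    # json.dumps escapes quotes, backslashes, and newlines for us.
    out = stream if stream is not None else sys.stderr
    print(json.dumps(record), file=out)


log_json("ERROR", 'upload of "db.sql" failed\nexit code 1', attempt=3)
```

The message contains quotes and a newline, yet the output is still a single valid JSON line, which is the part that gets fragile with printf.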
Python’s logging module, done right
Python ships a capable logging module. Configure it once, use it everywhere:
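import logging
import time

# %(asctime)s uses local time by default; switch to UTC so the trailing "Z" is accurate.
logging.Formatter.converter = time.gmtime
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s] %(name)s: %(message)s",
    datefmt="%Y-%m-%dT%H:%M:%SZ",
)
log = logging.getLogger("backup")

log.info("starting backup of %s", source)
log.warning("retrying upload, attempt %d", attempt)
log.error("backup failed", exc_info=True)  # call this inside an except block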
Two habits worth forming: pass values as arguments ("... %s", source) rather than f-strings, so logging can skip formatting entirely when the level is suppressed; and use exc_info=True on errors to capture the full traceback automatically.
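Both habits can be seen in a small self-contained sketch. The Expensive class, the backup.demo logger name, and the failing path are all hypothetical stand-ins, and output is captured in a StringIO buffer only so the example can inspect it.

```python
import io
import logging


class Expensive:
    """Stand-in for a value that is costly to render as text."""

    def __init__(self):
        self.renders = 0

    def __str__(self):
        self.renders += 1
        return "expensive-value"


buffer = io.StringIO()  # captures output so the sketch is self-contained
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("[%(levelname)s] %(name)s: %(message)s"))

log = logging.getLogger("backup.demo")  # hypothetical logger name
log.setLevel(logging.INFO)
log.addHandler(handler)
log.propagate = False  # keep the demo isolated from any root handlers

value = Expensive()
log.debug("state is %s", value)  # DEBUG is suppressed, so __str__ never runs
log.debug(f"state is {value}")   # the f-string renders before logging is even called

try:
    open("/nonexistent/backup.tar")  # hypothetical failing step
except OSError:
    log.error("backup failed", exc_info=True)  # traceback is appended to the entry

print(value.renders)
print(buffer.getvalue())
```

Only the f-string call pays the rendering cost, even though neither DEBUG message is emitted. Inside an except block, log.exception("backup failed") is a shorthand for the same error call.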
Structured JSON in Python
For machine-readable output, structlog (or a JSON formatter) gives you key-value logs cleanly:
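import structlog
from structlog.processors import JSONRenderer, TimeStamper, add_log_level
# structlog renders human-readable console output by default; ask for JSON explicitly.
structlog.configure(processors=[add_log_level, TimeStamper(fmt="iso", utc=True), JSONRenderer()])
log = structlog.get_logger()
log.info("backup_complete", source=source, bytes=total, duration_s=elapsed)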
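With a JSON renderer configured, that emits a JSON line with source, bytes, and duration_s as real fields. In your log system you can now graph backup duration or filter by source without parsing free text. This is the payoff of structured logging: the log is data, not prose. One caveat: structlog's default logger prints to stdout, so if the script's stdout carries data, point it at stderr (for example with logger_factory=structlog.PrintLoggerFactory(file=sys.stderr)).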
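If adding a dependency is not an option, the "JSON formatter" route can be sketched with the standard library alone. The JsonFormatter class and the convention of passing a dict under extra={"fields": ...} are assumptions made for this example, not an established API.

```python
import io
import json
import logging
import time


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    converter = time.gmtime  # format timestamps in UTC

    def format(self, record):
        payload = {
            "ts": self.formatTime(record, "%Y-%m-%dT%H:%M:%SZ"),
            "level": record.levelname,
            "logger": record.name,
            "event": record.getMessage(),
        }
        # "fields" is a hypothetical convention: a dict passed via extra=.
        payload.update(getattr(record, "fields", {}))
        if record.exc_info:
            payload["exc"] = self.formatException(record.exc_info)
        return json.dumps(payload, default=str)


stream = io.StringIO()  # use sys.stderr in a real script
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

log = logging.getLogger("backup.json_demo")  # hypothetical logger name
log.setLevel(logging.INFO)
log.addHandler(handler)
log.propagate = False

log.info(
    "backup_complete",
    extra={"fields": {"source": "/srv/data", "bytes": 1048576, "duration_s": 12.5}},
)
print(stream.getvalue(), end="")
```

The trade-off against structlog is more boilerplate at each call site in exchange for zero dependencies.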
What to actually log
Good logging is not “log everything.” It is logging the things that answer “what happened and why did it fail” at 2am:
- Start and end of the run, with a duration.
- Every external action — which file, which API, which host.
- Every decision the script makes — “skipping, already migrated.”
- The actual error, including the command and exit code, not just “failed.”
- Counts — how many processed, how many skipped, how many failed.
What not to log: secrets, tokens, full request bodies with customer data. Treat every log line as something that might end up in a system other people can read.
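The checklist above might translate into something like the following sketch. The work list, the migrate.demo logger name, and the SECRET_RE pattern are all hypothetical. The redaction filter is only a safety net for obvious key=value secrets, not a substitute for keeping secrets out of log calls in the first place.

```python
import io
import logging
import re
import time

# Hypothetical pattern: key=value pairs whose key looks sensitive.
SECRET_RE = re.compile(r"(?i)\b(token|password|secret|api_key)=\S+")


class RedactSecrets(logging.Filter):
    """Mask obvious key=value secrets before a record is emitted."""

    def filter(self, record):
        record.msg = SECRET_RE.sub(r"\1=***", record.getMessage())
        record.args = ()  # the message is already fully formatted
        return True


stream = io.StringIO()  # stand-in for sys.stderr so the sketch can inspect output
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("[%(levelname)s] %(message)s"))
handler.addFilter(RedactSecrets())

log = logging.getLogger("migrate.demo")  # hypothetical logger name
log.setLevel(logging.INFO)
log.addHandler(handler)
log.propagate = False

items = ["a.csv", "b.csv", "c.csv"]  # hypothetical work list
already_done = {"b.csv"}
counts = {"processed": 0, "skipped": 0, "failed": 0}

started = time.monotonic()
log.info("run started: %d items, auth token=%s", len(items), "abc123")
for name in items:
    if name in already_done:
        log.info("skipping %s, already migrated", name)  # record the decision
        counts["skipped"] += 1
        continue
    counts["processed"] += 1
log.info(
    "run finished in %.2fs: processed=%d skipped=%d failed=%d",
    time.monotonic() - started,
    counts["processed"], counts["skipped"], counts["failed"],
)
print(stream.getvalue(), end="")
```

The start line, the skip decision, and the final counts with duration are each one log call, and the deliberately leaked token value never reaches the output.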
Where AI helps
Adding consistent logging to an existing script is tedious but mechanical, which makes it a good fit for AI. I paste a script and ask:
“Add structured logging to this script. Use a leveled logger writing to stderr with UTC ISO-8601 timestamps. Log the start/end with duration, every external action, and errors with the failing command and exit code. Do not log any secrets or tokens. Keep the script’s stdout output unchanged.”
The “keep stdout unchanged” and “no secrets” constraints are the ones that matter — without them, models cheerfully log tokens and dump diagnostics into stdout. I keep these logging prompts in my prompt library.
The upshot
Casual echo is fine for a script you babysit. The moment it runs unattended or across more than one machine, you want levels, UTC timestamps, stderr separation, and — once a log system is collecting — structured JSON. The script becomes debuggable by someone who is not you, at an hour you would rather be asleep.
For more on production-grade automation, see our bash and Python automation guides.