Stop Using echo: Safe String Formatting in Bash with printf

There’s a class of Bash bug that survives code review precisely because the offending line looks harmless. echo "$message" reads like the most innocent statement in the file, right up until $message is exactly -n or -e, contains a backslash, or holds a value an attacker controls. Then it silently swallows the value as a flag, mangles an escape sequence, or worse, gets interpolated into a command and runs something you never wrote. The fix is one of the oldest pieces of advice in shell programming and one of the most ignored: stop reaching for echo, and learn printf.

I’m not against generating shell with AI; I do it constantly. But Bash is a language where the difference between safe and exploitable is a single quoting decision, and large language models are happy to hand you echo because most of their training data does too. AI drafts, human verifies. Verifying Bash means knowing which output primitive is actually predictable, and that’s printf.

Why echo is unpredictable

The core problem is that echo’s behavior is not portable, and not even consistent within Bash, where it changes with shell options. Whether echo interprets backslash escapes, whether it honors -n and -e, and how it handles a value that looks like an option all vary across /bin/sh implementations such as dash, Bash’s builtin echo, and Bash with the xpg_echo shell option set. Consider:

$ msg="-n"
$ echo "$msg"
$                # nothing printed: -n was eaten as a flag, no newline either

You wanted to print the literal string -n. Instead echo treated it as an option. There is no quoting you can add to "$msg" to prevent this, because the parsing happens after expansion, and Bash’s echo has no -- to end option parsing. Bash’s builtin only does this when the whole argument is a valid option string such as -n, -e, or -E; a longer value like "-n hello" passes through untouched, which is why the bug slips past casual testing. Now imagine $msg came from a log line, a filename, or an API response. Your output is now data-dependent in a way you can’t lock down.

printf has none of this ambiguity. It takes a format string and arguments, and it does exactly what the format says regardless of what the data looks like:

$ printf '%s\n' "$msg"
-n

The %s directive consumes one argument and prints it verbatim. The data can never be reinterpreted as a flag or an escape, because the format string is fixed in your source code and the values only ever land in %s. That single property, separating the format from the data, is the whole reason to switch.
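As a rough sketch of that property in use, here is a small wrapper that behaves like echo "$*" but cannot be steered by its input. The name say and the sample values are hypothetical, and the xpg_echo toggle is only there to show the builtin changing behavior under you.

```bash
#!/usr/bin/env bash
# say: hypothetical echo replacement. Joins its arguments with spaces
# (via "$*" and the default IFS) and prints them verbatim plus a newline.
say() {
  printf '%s\n' "$*"
}

# Values that Bash's builtin echo could treat as options or escapes.
tricky=('-n' '-e' 'a\tb')
for value in "${tricky[@]}"; do
  say "$value"            # prints -n, -e, and a\tb literally
done

# Same literal input, two different echo results depending on a shell option.
shopt -u xpg_echo
plain=$(echo 'a\tb')      # backslash-t stays as two characters
shopt -s xpg_echo
expanded=$(echo 'a\tb')   # backslash-t becomes a real tab
shopt -u xpg_echo
```

The wrapper is deliberately tiny: all of the safety comes from keeping the format a literal and passing the data as an argument.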

Format strings, briefly

If you’ve used printf in C or Python, the Bash builtin will feel familiar, with a couple of useful twists. The common directives:

printf '%s\n' "plain string"
printf '%d items\n' 42
printf 'pi is %.2f\n' 3.14159
printf '%-20s %s\n' "name:" "value"      # left-justified, width 20
printf '%05d\n' 7                          # zero-padded: 00007
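One habit goes with these directives: the format is always a literal in your source, never a variable. A small sketch of what goes wrong otherwise, using a made-up user_input value:

```bash
#!/usr/bin/env bash
# Hypothetical untrusted value containing a directive and an escape.
user_input='100%s sure\tthing'

# Wrong: the data is used as the format, so %s and \t are interpreted.
# %s finds no argument and expands to nothing; \t becomes a tab.
as_format=$(printf "$user_input")

# Right: the data is an argument and comes out byte for byte.
as_argument=$(printf '%s' "$user_input")

printf 'as format:   [%s]\n' "$as_format"
printf 'as argument: [%s]\n' "$as_argument"
```

Tools like ShellCheck flag the first form for this reason; the second form is the one to keep.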

The twist that catches people off guard is that printf reuses the format string until all arguments are consumed. That’s not a bug; it’s the single most useful feature for ops work:

$ printf '%s = %s\n' KEY1 val1 KEY2 val2 KEY3 val3
KEY1 = val1
KEY2 = val2
KEY3 = val3

One format, applied row by row over a flat list of arguments. This is how you turn an array into aligned, machine-readable output without a loop. Combine it with mapfile and you have a tidy reporting pipeline; for the JSON-output variant of that idea, the machine-readable JSON output prompt is the natural next step.
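A minimal sketch of that array-to-rows idea, assuming Bash 4 or later for mapfile. The settings array and its contents are invented sample data.

```bash
#!/usr/bin/env bash
# Hypothetical sample data: a flat list of name/value pairs.
settings=(host db01.internal port 5432 user 'report bot')

# One format, reused for each pair; no loop needed.
printf '%-6s %s\n' "${settings[@]}"

# Capture the same rows into an array, one element per line.
mapfile -t rows < <(printf '%-6s %s\n' "${settings[@]}")
printf 'captured %d rows\n' "${#rows[@]}"

# Gotcha: with zero arguments the format still runs once and prints an
# empty row, so guard lists that may be empty.
empty=()
if ((${#empty[@]} > 0)); then
  printf '%s\n' "${empty[@]}"
fi
```

If the argument count is not a multiple of the number of directives, the leftover directives are filled with empty strings (or zero for numeric ones), so it is worth checking the list length when the pairs come from elsewhere.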

%q: the quoting directive that prevents injection

The directive that justifies this whole article is %q. It prints its argument quoted in a form that can be reused as input to the shell, escaping any character that the shell would otherwise treat specially. This is the tool for safely constructing commands, building reproducible logs, and generating scripts on the fly.

$ dir='my dir; rm -rf /'
$ printf '%q\n' "$dir"
my\ dir\;\ rm\ -rf\ /

The semicolon, the spaces, everything dangerous is neutralized. If you took that output and pasted it back into a shell, it would refer to a single literal filename, not a command separator followed by a destructive command. Contrast with the naive approach that everyone has shipped at least once:

# DANGEROUS: building a command by string interpolation
cmd="ls $dir"
eval "$cmd"          # runs: ls my dir; rm -rf /   -> catastrophe

The safe rewrite uses %q to quote every interpolated value before it ever touches eval or a generated script:

# Build a command string that is safe to eval or write to a file
printf -v safe_cmd 'ls -- %q' "$dir"
eval "$safe_cmd"     # runs: ls -- my\ dir\;\ rm\ -rf\ /  -> one harmless ls of one filename

The printf -v safe_cmd form assigns the result to a variable instead of printing it, which is exactly what you want when you’re assembling a command incrementally. This is also the right pattern for generating a remediation script from untrusted input, or for logging the exact command you ran in a way that can be replayed verbatim.
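The same pattern extends to a whole list of values. The sketch below quotes several hostile-looking file names (all hypothetical), builds a replayable log line, and then checks that the quoted text evaluates back to exactly the original arguments. Nothing in it executes the file names; the eval only performs an array assignment.

```bash
#!/usr/bin/env bash
# Hypothetical hostile file names: separators, quotes, expansions, a tab.
paths=('my dir; rm -rf /' "it's here" '$(whoami)' $'tab\tname')

# Quote every element; the format is reused once per path.
printf -v quoted_args '%q ' "${paths[@]}"

# A replayable log line: pasting it into Bash reproduces the same arguments.
printf 'would run: ls -- %s\n' "$quoted_args"

# Round-trip check: eval the quoted text back into an array and compare.
eval "restored=( $quoted_args )"
for i in "${!paths[@]}"; do
  [[ ${restored[i]} == "${paths[i]}" ]] || printf 'mismatch at %d\n' "$i"
done
```

When you control the call site and do not need the command as text, an array such as cmd=(ls -- "$dir") run as "${cmd[@]}" avoids eval altogether; %q earns its place when the command has to be logged, stored, or handed to another shell.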

Prompt I gave the model: “I’m generating a Bash command at runtime from a list of file paths that may contain spaces, semicolons, and quotes, then running it via eval. Rewrite this to be injection-safe. Explain why %q is necessary and where it can still fail.” The draft correctly reached for printf '%q' but initially missed that the related printf '%Q' directive exists only in newer Bash releases (5.2 and later, as far as I can tell), so it can’t be assumed on older hosts. It also didn’t mention that %q output is shell-specific: safe to feed back to Bash, not to a non-shell consumer. Good draft, incomplete on the edge cases, which is exactly the part you have to verify yourself.

A caveat worth internalizing: %q produces output safe for the shell. It is not a general-purpose sanitizer. Do not use it to escape values bound for SQL, a JSON document, or an HTTP header. For credentials specifically, quoting is the wrong layer entirely; you want to keep secrets out of argument lists and command logs in the first place, which the secret-handling prompt covers in depth.
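To make “safe for the shell” concrete, here is a small sketch with a made-up value containing a newline and double quotes. For input like this Bash switches to its own $'...' quoting form, which reads back correctly in Bash but is neither a JSON string nor something every other shell has traditionally understood.

```bash
#!/usr/bin/env bash
# Hypothetical value with double quotes and an embedded newline.
value=$'say "hi"\nbye'

# %q picks a Bash-specific representation for the control character.
printf -v quoted '%q' "$value"
printf 'quoted form: %s\n' "$quoted"

# Bash can read it back exactly...
eval "back=$quoted"
[[ $back == "$value" ]] && printf 'round trip ok\n'

# ...but the text starts with $' and would be rejected by a JSON parser.
printf 'first two characters: %s\n' "${quoted:0:2}"
```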

Locale stability and reproducible output

One more reason printf earns its keep: it’s locale-aware in the places that matter and stable in the places you need it to be. Floating-point formatting respects the locale’s decimal separator, which can surprise you on a host set to a comma locale (this assumes the de_DE.UTF-8 locale is installed):

$ LC_NUMERIC=de_DE.UTF-8 printf '%.2f\n' 3,5
3,50

If a script writes numbers that another tool parses, that comma is a latent bug. It cuts both ways: Bash generally parses the argument under the same locale, so on such a host an input written as 3.5 can be rejected as an invalid number. The fix is to pin the locale for the numeric-sensitive call:

LC_ALL=C printf '%.2f\n' 3.5     # always 3.50, period

For plain %s the content is byte-faithful, so you get deterministic output regardless of locale, which is what you want for hashing, diffing, and reproducible builds. %q is mostly the same, though how it escapes non-ASCII and non-printable bytes can vary with the active locale, which is one more reason to pin it. Pinning LC_ALL=C around the formatting boundary is a cheap insurance policy that I now add by reflex to any script that emits parseable data.
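One way to make that reflex hard to forget is to put the pinned call behind a helper. This is only a sketch; emit_metric and the metric names are hypothetical.

```bash
#!/usr/bin/env bash
# emit_metric: hypothetical helper that prints name=value with a fixed
# two-decimal format, independent of the caller's locale.
emit_metric() {
  local name=$1 value=$2 line
  # The prefix assignment applies only to this one printf call.
  LC_ALL=C printf -v line '%s=%.2f' "$name" "$value"
  printf '%s\n' "$line"
}

emit_metric latency_s 3.5    # latency_s=3.50
emit_metric retries 12       # retries=12.00
```

Because the locale is set as a prefix on the single printf call, the rest of the script keeps whatever locale the user configured, which matters for messages meant for humans.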

The rule, distilled

Use printf '%s\n' instead of echo everywhere; it costs you a few extra characters and buys you predictability. Use printf -v to build strings without subshells. Use %q whenever a value will be interpolated back into a shell, a generated script, or an eval. Pin LC_ALL=C when output must be byte-stable. None of this is new, and that’s the point: printf has been the right answer for decades, and the only thing that’s changed is that AI now generates the echo version faster than ever, so it’s on you to catch it in review.

More hard-won shell patterns, including the quoting and word-splitting traps that pair directly with this one, live in the Bash & Python automation collection.
