Translating a Bash Script to Python with AI Without Breaking

There’s a point where a bash script stops being the right tool — usually when it grows nested associative arrays, JSON parsing, and error handling that bash makes painful. I hit that wall with a 300-line backup orchestrator and asked an AI assistant to port it to Python. It produced a runnable translation in under a minute. It also introduced three behavior changes that would have silently corrupted backups. AI is a fast junior engineer for translation work; the translation is a starting draft you must verify against the original, line by line, before it runs on prod.

Decide what not to translate

The first question I ask the AI isn’t “translate this” — it’s “which parts of this script are genuinely better left as shell calls?” Plenty of bash is just orchestrating tar, rsync, and aws. Rewriting those as Python libraries adds risk for no benefit.

The AI correctly identified that the rsync and tar invocations should stay as subprocess calls, while the argument parsing, retry logic, and JSON manifest handling were the parts that bash made ugly and Python would make clean. That triage is the highest-value thing the AI does here — and it’s also the easiest to over-rule, because the AI loves to “Pythonify” everything. I kept the shell tools as shell tools.

Watch the exit-code semantics

Bash and Python disagree about what “success” means, and this is where silent bugs live. In bash, cmd1 | cmd2 returns cmd2’s status unless you set pipefail. The AI’s naive translation used subprocess.run with shell=True and didn’t check the return code at all.

import subprocess

def run(cmd):
    result = subprocess.run(cmd, check=True)  # raises on non-zero
    return result

run(["rsync", "-a", "--delete", src, dst])

check=True is the Python equivalent of set -e for that call. The original bash had set -e, so the faithful translation must raise on failure. The AI’s first draft dropped that, meaning a failed rsync would have been treated as success and the backup marked complete. I only caught it because I diffed behavior against the original, not just the code.

Never use shell=True with interpolated values

The AI reached for shell=True because it maps cleanly onto the bash mental model. It’s also a command-injection vector the moment any argument comes from outside.

# Dangerous if `name` is attacker- or user-controlled
subprocess.run(f"tar -czf {name}.tar.gz {path}", shell=True)

# Safe: list form, no shell
subprocess.run(["tar", "-czf", f"{name}.tar.gz", path], check=True)

I prompt explicitly: “Translate without shell=True; pass arguments as a list.” The list form also fixes the quoting bugs that plague bash — no more wrestling with "$var" versus $var. The AI handles this well once told; left alone it defaults to the unsafe form.

Pro Tip: ask the AI to flag every place the original bash relied on word-splitting or glob expansion. Those are exactly the spots where a list-form subprocess call will behave differently, and they need an explicit glob.glob() or shlex.split() decision rather than a silent change.

Translate trap handlers into try/finally

The bash script used trap cleanup EXIT. The Python equivalent is try/finally or a context manager, and the AI maps it correctly when asked.

import tempfile, shutil

workdir = tempfile.mkdtemp()
try:
    do_backup(workdir)
finally:
    shutil.rmtree(workdir, ignore_errors=True)

This preserves the original guarantee that the temp directory is always cleaned up. I verified it by forcing a mid-run failure and confirming the directory was gone — the same fault-injection check I’d run on the bash version.

Preserve idempotency and re-run safety

My backup script was safe to re-run; a partial run left no half-state. The translation has to keep that property. The AI’s version wrote the manifest file before the backup completed, so a crash left a manifest claiming a backup existed that didn’t. The original wrote the manifest last, atomically.

import os, json, tempfile

def write_manifest_atomically(path, data):
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path))
    with os.fdopen(fd, "w") as f:
        json.dump(data, f)
    os.replace(tmp, path)  # atomic on POSIX

This is a classic AI translation slip: it preserved the lines but not the ordering guarantee. Reordering looked harmless to a junior. See writing idempotent automation scripts for why ordering and atomic writes are the whole ballgame.

Keep secrets out of the translation prompt

The original sourced an env file with an S3 access key. Before pasting the script, I replaced every secret with a placeholder like __AWS_SECRET__. The AI improved the handling — moving from a sourced file to boto3’s default credential chain — without ever seeing a real key. For the AWS specifics, scripting AWS with boto3 covers the credential chain in depth.

The rule holds across every task: the AI writes the plumbing, you supply the real water, locally and at runtime only.

Diff behavior, not just code

The final step is the one people skip. I run the original bash and the new Python against the same fixture inputs and compare outputs, exit codes, and side effects. For a backup tool that meant comparing the resulting archive contents and manifests byte for byte. Two of the three behavior changes only showed up here, not in code review. Pair this with the testing approach in bats and pytest.

If you want a structured static pass over the translation, the code review dashboard flags shell-injection and unchecked subprocess calls before you merge.

Conclusion

AI translates bash to Python at junior-engineer speed and quality: structurally sound, idiomatic, and subtly wrong in exactly the ways that matter for automation — exit codes, quoting, and ordering guarantees. Decide what to leave as shell, forbid shell=True, preserve idempotency, keep secrets out of the prompt, and diff behavior against the original before anything runs on prod. The translation is a draft; you are the reviewer. More porting prompts live in the bash and Python automation category and the prompts library.

Translating a Bash Script to Python with AI Without Breaking It