Skip to content
CloudOps
Newsletter
All prompts
AI for Bash & Python Automation Difficulty: Advanced ClaudeChatGPT

Python shutil and Safe Archive Extraction Prompt

Create and extract tar/zip archives in Python with shutil, tarfile, and zipfile — defending against path-traversal (zip slip), symlink escapes, and decompression bombs while preserving permissions where intended.

Target user
Engineers packaging or unpacking archives in build, backup, and deploy scripts
Difficulty
Advanced
Tools
Claude, ChatGPT

The prompt

You are a senior Python engineer who has fixed real "zip slip" and tar-extraction CVEs in deployment tooling.

I will provide:
- The code that creates or extracts archives (`shutil.make_archive`, `tarfile`, `zipfile`)
- Whether archives come from a trusted build step or untrusted/external sources
- The Python version (so we know if `tarfile`'s `filter='data'` from 3.12 is available)

Your job:

1. **Classify trust** — make explicit whether the archive is attacker-influenced. If untrusted, every extraction is a security boundary and the defaults are NOT safe.

2. **Extract defensively**:
   - For tar: never use bare `extractall()`. On Python 3.12+ pass `filter='data'`; on older versions, validate each member — reject absolute paths, `..` components, symlinks/hardlinks pointing outside the target, and device/special files — and confirm the resolved destination stays under the extraction root.
   - For zip: guard against zip slip by resolving each member path against the target dir and rejecting escapes; be aware `extractall` does not preserve Unix permissions.

3. **Defuse decompression bombs** — cap total uncompressed bytes and member count, stream-extract rather than loading whole files into memory, and abort when the ratio or size limit is exceeded.

4. **Create archives cleanly** — use `shutil.make_archive` or `tarfile`/`zipfile` with deterministic ordering, controlled `arcname` (strip leading paths), and intentional handling of symlinks and permissions for reproducible output.

5. **Preserve or drop metadata deliberately** — state when permissions/ownership/mtime should be kept (backups) vs normalized (distribution), and how.

6. **Test the attacks** — pytest cases that attempt `../escape`, an absolute path member, a symlink escape, and an oversized member, asserting each is rejected.

Output as: (a) a `safe_extract(archive, dest)` function covering tar and zip, (b) a `make_archive` helper with deterministic options, (c) the adversarial pytest suite.

Bias toward: failing closed on any member that resolves outside the target, explicit size/count caps, and never trusting archive metadata from external sources.
Newsletter

Free: the DevOps AI Incident-Triage Cheat Sheet

Subscribe and we’ll send you the one-page cheat sheet — plus weekly AI prompts, automation ideas, and tool reviews for infrastructure engineers. One email a week. No spam, unsubscribe anytime.

  • AI Incident-Triage Cheat Sheet (PDF)
  • Access to 1,603 DevOps AI prompts
  • One practical workflow email per week