journald Storage & Retention Tuning Prompt
Tune systemd-journald storage, rotation, rate-limiting, and forwarding so logs survive reboots, stop filling the disk, and don't drop bursts during incidents.
- Target user: Linux admins managing journald on Debian/RHEL fleets
- Difficulty: Intermediate
- Tools: Claude, ChatGPT
The prompt
You are a Linux logging specialist who has tuned journald on fleets where logs both vanish too early and eat root partitions.

I will provide:
- `journalctl --disk-usage` and `df -h` output for the journal partition
- The current `/etc/systemd/journald.conf` and any drop-ins under `/etc/systemd/journald.conf.d/` (or a note that everything is at defaults)
- Whether `/var/log/journal` exists (persistent) or logs are volatile
- Retention goal (days), partition size, and whether logs ship to a central system (rsyslog/Loki/Vector)
- Any observed problems: lost logs, dropped messages, disk-full alerts

Your job:
1. **Persistence check**: confirm whether `Storage=` is `volatile`, `persistent`, or `auto`, and whether `/var/log/journal` exists. Explain why volatile logs (kept under `/run/log/journal`) disappear on reboot, and recommend `Storage=persistent` with the correct directory ownership and ACLs.
2. **Size caps**: set `SystemMaxUse=`, `SystemKeepFree=`, `SystemMaxFileSize=`, and `RuntimeMaxUse=` against the actual partition size. Show the math: how much to reserve for the journal, and why `SystemKeepFree=` protects the rest of the partition.
3. **Time-based retention**: use `MaxRetentionSec=` to hit the retention-days goal and `MaxFileSec=` to control how often files rotate, and explain how size caps and time caps interact (whichever limit is reached first triggers deletion).
4. **Rate limiting**: `RateLimitIntervalSec=`/`RateLimitBurst=`. Explain how the default limits drop logs during incidents (exactly when you need them most) and recommend per-service overrides for chatty-but-critical units via `LogRateLimitIntervalSec=` and `LogRateLimitBurst=` in the unit file.
5. **Forwarding**: `ForwardToSyslog=` and the tradeoff of storing every message twice. If logs ship centrally, recommend keeping a short local buffer and offloading long-term retention to the central store.
6. **Verification**: commands to apply the change (`systemctl restart systemd-journald`), confirm it with `journalctl --disk-usage` and `journalctl --verify`, and reclaim space immediately with a vacuum command (`journalctl --vacuum-size=` or `--vacuum-time=`).

Output as: (a) a complete annotated `journald.conf`, (b) the math behind each size value for this specific partition, (c) immediate-cleanup commands, (d) a one-line cron/timer or alert to warn before the journal fills again.

Bias toward: persistent storage by default, size caps that leave generous free space, and never silently dropping logs from critical services.
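
To give a sense of what output (a) and (b) could look like, here is a minimal sketch of a drop-in. It assumes a hypothetical 20 GiB `/var` partition, a 14-day retention goal, and no central log shipping. The file name `90-retention.conf` and every value are illustrative assumptions, not settings taken from any real system; a real answer should derive them from the `df -h` output supplied with the prompt.

```ini
# /etc/systemd/journald.conf.d/90-retention.conf  (hypothetical file name)
# Assumed inputs: 20 GiB /var partition, 14-day retention goal, no central shipping.

[Journal]
# Keep logs on disk under /var/log/journal so they survive reboots.
Storage=persistent

# Size caps. Assumed policy: journal may use about 10% of the partition.
#   20 GiB * 0.10 = 2 GiB
SystemMaxUse=2G
# Always leave about 20% of the partition free for everything else.
#   20 GiB * 0.20 = 4 GiB
SystemKeepFree=4G
# One eighth of SystemMaxUse per file, so rotation frees space in small steps.
#   2 GiB / 8 = 256 MiB
SystemMaxFileSize=256M
# Cap for the volatile journal in /run/log/journal (used early in boot).
RuntimeMaxUse=128M

# Time caps. Entries older than the retention goal are removed;
# daily rotation keeps the deletion granularity at about one day.
MaxRetentionSec=14day
MaxFileSec=1day

# Rate limiting. Assumed choice: keep the 30s window, allow a larger burst
# per service than the usual default of 10000 messages.
RateLimitIntervalSec=30s
RateLimitBurst=20000

# No syslog daemon consumes the stream in this scenario, so avoid double storage.
ForwardToSyslog=no
```

With both `SystemMaxUse=` and `SystemKeepFree=` set, journald honors whichever limit is more restrictive at the moment, so on a nearly full partition the journal can shrink below 2 GiB. Comments sit on their own lines because journald.conf does not support trailing comments after a value.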