systemd-coredump Core Dump Analysis Prompt
Use systemd-coredump and coredumpctl to capture, locate, and analyze userspace process crashes: find the faulting frame, resolve missing debug symbols, and pin down recurring crashers, all without setting up an ad-hoc core dump pipeline.
- Target user
- Linux admins triaging crashing userspace services
- Difficulty
- Intermediate
- Tools
- Claude, ChatGPT
The prompt
You are a production debugging engineer who treats every userspace crash as recoverable evidence and reaches for `coredumpctl` before `strace`.

I will provide:
- The crashing binary/service and how often it dies
- Distro and whether systemd-coredump is the registered handler
- Whether debug symbols / debuginfo are available
- Constraints (disk space, can't install a debugger on prod)

Guide a complete core-dump workflow:
1. **Confirm the handler**: verify that `core_pattern` pipes to `systemd-coredump` (`cat /proc/sys/kernel/core_pattern`), and that the storage and size limits in `/etc/systemd/coredump.conf` won't silently drop or truncate the dump (`Storage=`, `ProcessSizeMax=`, `ExternalSizeMax=`). A truncated core is worse than none.
2. **Locate the evidence**: run `coredumpctl list` to find recent crashes, filter by unit/PID/time, and read the metadata (signal, exe path, command line, timestamp). Map the signal (SIGSEGV vs SIGABRT vs SIGBUS) to a first hypothesis.
3. **Get a backtrace**: open the dump with `coredumpctl debug` (or plain `gdb` against an extracted dump), then run `bt`, `bt full`, and `info registers`; identify the faulting frame and whether it is in app code or a library.
4. **Symbol resolution**: when the backtrace is all `??`, explain how to install the matching `-dbg`/`-dbgsym`/`debuginfo` package (or use `debuginfod` for on-demand symbols) and why the debuginfo *must* match the exact build-id (check with `file` or `eu-unstrip`).
5. **Recurrence analysis**: when the same service crashes repeatedly, correlate dumps over time to see whether the faulting frame is stable (a real bug) or scattered (memory corruption / bad hardware), and check `journalctl` around each crash for the trigger.
6. **Resource hygiene**: explain where dumps live (`/var/lib/systemd/coredump`), how they are compressed, and how to expire them so crash storms don't fill the disk.
7. **Hand-off**: produce a crash report with the signal, faulting frame, build-id, reproducer if known, and whether it's an app bug, a dependency bug, or environmental.

For each step give the exact command, what good vs useless output looks like, and the next branch. End with a one-paragraph root-cause statement and the single backtrace frame that proves it. Bias toward verifying first that the dump isn't truncated, using build-id-matched symbols, and distinguishing a real bug from memory/hardware corruption.
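
As a rough illustration of what steps 1 and 2 of the prompt ask for, the checks can be sketched as a few small shell helpers. This is a minimal sketch under stated assumptions, not part of systemd: the function names (`handler_is_systemd_coredump`, `conf_value`, `first_hypothesis`) are hypothetical, the signal numbers assume x86/ARM Linux, and the hypotheses are starting points rather than diagnoses.

```shell
#!/usr/bin/env bash
# Hypothetical helpers sketching steps 1-2 of the workflow.
# None of these functions ship with systemd; they only wrap simple checks.

# Step 1a: succeed if a core_pattern string pipes dumps to systemd-coredump.
# With no argument, the live kernel setting is read instead.
handler_is_systemd_coredump() {
    local pattern="${1-$(cat /proc/sys/kernel/core_pattern 2>/dev/null)}"
    case "$pattern" in
        "|"*systemd-coredump*) return 0 ;;  # leading "|" means "pipe to a program"
        *) return 1 ;;
    esac
}

# Step 1b: print the last uncommented value of KEY from coredump.conf-style
# text on stdin. Prints nothing if the key is unset (the default applies).
conf_value() {
    sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*//p" | tail -n 1
}

# Step 2: map a signal name or number to a first hypothesis.
# Numbers are the x86/ARM Linux values (assumption; SIGBUS differs elsewhere).
first_hypothesis() {
    case "$1" in
        SIGSEGV|11) echo "invalid memory access" ;;
        SIGABRT|6)  echo "deliberate abort (assertion or allocator check)" ;;
        SIGBUS|7)   echo "bus error (misaligned access or truncated mapping)" ;;
        *)          echo "unclassified" ;;
    esac
}
```

For example, `handler_is_systemd_coredump && coredumpctl list` only lists crashes when the handler is registered, and `conf_value Storage < /etc/systemd/coredump.conf` prints the configured storage mode, or nothing when the default applies. Drop-in files under `/etc/systemd/coredump.conf.d/` are not considered by this sketch.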