Slack Backup & Restore Job Status Reporting Prompt
Design Slack reporting for backup and restore jobs that confirms successful backups, alerts loudly on failures, and surfaces restore-test results to prove recoverability.
- Target user: SREs and DBAs responsible for backup integrity and recoverability
- Difficulty: Intermediate
- Tools: Claude, ChatGPT
The prompt
You are a senior reliability engineer who has owned backup and DR programs and learned that an unverified backup is not a backup, and that Slack reporting should prove recoverability, not just job completion.

I will provide:
- What we back up (databases, volumes, object storage) and the tooling (Velero, pg_dump, snapshots, vendor agents)
- Backup schedule, retention, and restore-test cadence
- Owners and Slack channel layout
- Slack constraints (webhook or bot token)
- Pain points (silent backup failures, no restore verification, no freshness visibility)

Your job:
1. **What to report**: distinguish three signals: backup completed, backup VERIFIED (checksum/size sanity), and restore-test PASSED. Treat "job exited 0" as insufficient on its own.
2. **Failure alerting**: loud, owner-targeted alerts on any backup failure, including the critical missed-backup case (no backup produced when one was due) via a dead-man's-switch.
3. **Freshness & coverage**: a daily summary per protected resource showing last successful backup age, retention status, and whether RPO is being met; flag anything past its RPO window.
4. **Restore verification**: report scheduled restore-test results prominently; a passing restore test is the headline metric, since it is the only real proof of recoverability.
5. **Message design**: Block Kit: header (resource + status emoji), section with size, duration, backup age, and RPO status; context block linking to the backup catalog and restore-test run.
6. **Success cadence**: quiet, batched success summaries (one daily digest) so successes do not spam, while failures alert immediately.
7. **Audit & compliance**: keep a record of backup/restore evidence suitable for DR audits, exportable from the reported data.
8. **Validation**: deliberately fail a backup and skip one to confirm both the failure alert and the missed-backup dead-man's-switch fire.

Output as:
(a) the backup-wrapper reporting success/verified/failed,
(b) the missed-backup watcher,
(c) Block Kit JSON for one failure and one restore-test-pass message,
(d) the freshness/RPO summary format,
(e) a rollout plan for one protected resource.

Bias toward: verified and restore-tested over merely completed, loud on failures and missed runs, quiet batched successes.
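
Example of what to expect (not part of the prompt): to help judge the model's answer, here is a minimal Python sketch of the kind of code that outputs (c) and (d) might contain. It computes backup age against an RPO window and builds a Block Kit failure message with the header, section, and context blocks the prompt asks for. The function names `rpo_status` and `failure_blocks`, the resource `orders-db`, the owner handle, and the catalog URL are all hypothetical placeholders; a real answer should be tailored to your tooling and channel layout.

```python
import json
from datetime import datetime, timedelta, timezone


def rpo_status(last_success, rpo, now):
    """Return (age, rpo_met) for one protected resource.

    last_success is None when no backup has ever succeeded; that is
    treated as an RPO breach so the missed-backup case is never silent.
    """
    if last_success is None:
        return None, False
    age = now - last_success
    return age, age <= rpo


def failure_blocks(resource, owner, error, age, rpo_met, catalog_url):
    """Build a Slack Block Kit payload for an immediate failure alert."""
    age_text = "never" if age is None else f"{age.total_seconds() / 3600:.1f}h"
    rpo_text = "met" if rpo_met else "BREACHED"
    return {
        "blocks": [
            {
                # Header: status emoji plus the affected resource.
                "type": "header",
                "text": {
                    "type": "plain_text",
                    "text": f":red_circle: Backup FAILED: {resource}",
                    "emoji": True,
                },
            },
            {
                # Section: who owns it, what broke, and the RPO impact.
                "type": "section",
                "fields": [
                    {"type": "mrkdwn", "text": f"*Owner:*\n{owner}"},
                    {"type": "mrkdwn", "text": f"*Error:*\n{error}"},
                    {"type": "mrkdwn", "text": f"*Last good backup age:*\n{age_text}"},
                    {"type": "mrkdwn", "text": f"*RPO:*\n{rpo_text}"},
                ],
            },
            {
                # Context: link to the backup catalog (hypothetical URL).
                "type": "context",
                "elements": [
                    {"type": "mrkdwn", "text": f"<{catalog_url}|Backup catalog>"}
                ],
            },
        ]
    }


if __name__ == "__main__":
    # Fixed timestamps keep the example deterministic.
    now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
    last_ok = now - timedelta(hours=30)
    age, met = rpo_status(last_ok, timedelta(hours=24), now)
    payload = failure_blocks(
        "orders-db",
        "@dba-oncall",
        "pg_dump exited 1",
        age,
        met,
        "https://backups.example.com/orders-db",
    )
    print(json.dumps(payload, indent=2))
```

The resulting dictionary can be sent as the JSON body of a Slack incoming-webhook request or passed as `blocks` to a bot-token API call. Treating "never backed up" as a breach is a design choice in this sketch, chosen so that a resource with no successful backup shows up in the freshness summary instead of being skipped.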