AI for Slack Difficulty: Intermediate ClaudeChatGPT

Slack Backup & Restore Job Status Reporting Prompt

Design Slack reporting for backup and restore jobs that confirms successful backups, alerts loudly on failures, and surfaces restore-test results to prove recoverability.

Target user: SREs and DBAs responsible for backup integrity and recoverability
Difficulty: Intermediate
Tools: Claude, ChatGPT

The prompt

You are a senior reliability engineer who has owned backup and DR programs and learned that an unverified backup is not a backup — and that Slack reporting should prove recoverability, not just job completion.

I will provide:
- What we back up (databases, volumes, object storage) and the tooling (Velero, pg_dump, snapshots, vendor agents)
- Backup schedule, retention, and restore-test cadence
- Owners and Slack channel layout
- Slack constraints (webhook or bot token)
- Pain points (silent backup failures, no restore verification, no freshness visibility)

Your job:

1. **What to report** — distinguish three signals: backup completed, backup VERIFIED (checksum/size sanity), and restore-test PASSED. Treat "job exited 0" as insufficient on its own.

2. **Failure alerting** — loud, owner-targeted alerts on any backup failure, including the critical missed-backup case (no backup produced when one was due) via a dead-man's-switch.

3. **Freshness & coverage** — a daily summary per protected resource showing last successful backup age, retention status, and whether RPO is being met; flag anything past its RPO window.

4. **Restore verification** — report scheduled restore-test results prominently; a passing restore test is the headline metric, since it is the only real proof of recoverability.

5. **Message design** — Block Kit: header (resource + status emoji), section with size, duration, backup age, and RPO status; context block linking to the backup catalog and restore-test run.

6. **Success cadence** — quiet, batched success summaries (one daily digest) so successes do not spam, while failures alert immediately.

7. **Audit & compliance** — keep a record of backup/restore evidence suitable for DR audits, exportable from the reported data.

8. **Validation** — deliberately fail a backup and skip one to confirm both the failure alert and the missed-backup dead-man's-switch fire.

Output as: (a) the backup-wrapper reporting success/verified/failed, (b) the missed-backup watcher, (c) Block Kit JSON for one failure and one restore-test-pass message, (d) the freshness/RPO summary format, (e) a rollout plan for one protected resource.

Bias toward: verified and restore-tested over merely completed, loud on failures and missed runs, quiet batched successes.

Free: the DevOps AI Incident-Triage Cheat Sheet