Slack to PagerDuty Bidirectional Bridge Prompt
Design a two-way Slack to PagerDuty integration where Slack actions (ack, escalate, resolve) drive PagerDuty incidents and PagerDuty state changes flow back into the Slack channel.
- Target user: SREs wiring incident tooling to ChatOps
- Difficulty: Advanced
- Tools: Claude, ChatGPT
The prompt
You are an incident-tooling engineer who has built a production Slack ↔ PagerDuty bridge that on-call engineers trust enough to drive an incident entirely from Slack.

I will provide:
- PagerDuty setup (services, escalation policies, REST API access)
- Slack app capabilities and the incident channel convention
- What actions we want from Slack (ack, reassign, escalate, add note, resolve)
- What PagerDuty events we want reflected back into Slack

Your job:
1. **Identity mapping**: map Slack `user_id` ↔ PagerDuty `user_id` reliably (email match first, with a fallback table for mismatches). This is the crux; every action taken in Slack must be attributed to the right PagerDuty user.
2. **Slack → PagerDuty (write path)**: Block Kit buttons that call the PagerDuty REST API: acknowledge, add note, reassign (user select), escalate (move to the next escalation policy level), resolve. Specify the required scopes and the `From` header (the acting PagerDuty user's email). Handle partial failures and show the result inline in Slack.
3. **PagerDuty → Slack (read path)**: consume PagerDuty v3 webhooks: `incident.triggered`, `incident.acknowledged`, `incident.escalated`, `incident.resolved`, and `incident.responder.added`. Post or update the Slack message and its thread so state stays in sync without spamming the channel.
4. **Single source of truth and loops**: treat PagerDuty as the system of record. Prevent echo loops: a Slack-initiated ack that triggers a PagerDuty webhook must not be re-posted as if it were an external change.
5. **Idempotency and ordering**: webhooks can be retried and can arrive out of order; key updates by incident id + status + timestamp and ignore stale transitions.
6. **Security**: verify both PagerDuty webhook signatures and Slack request signatures; store the PagerDuty API token securely and scope it to least privilege; never let an unauthorized Slack user resolve another team's incident.
7. **Failure modes**: PagerDuty API down, Slack rate-limited, identity-mapping miss. Degrade gracefully and surface a clear error rather than silently dropping an ack.
Output as:
(a) a sequence diagram for one incident's full lifecycle across both systems,
(b) the Slack action handlers calling PagerDuty,
(c) the PagerDuty webhook consumer updating Slack,
(d) the identity-mapping module,
(e) a test plan covering loop prevention and out-of-order webhooks.

Bias toward: PagerDuty as source of truth, strict loop prevention, and correct user attribution on every action.
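A note outside the prompt text, for judging what the model returns for steps 4 and 5: the core of loop prevention and stale-event handling can be small. The sketch below is one possible shape, not a reference implementation. `IncidentSync`, its method names, the in-memory dictionaries, and the 60-second echo window are all hypothetical choices made for illustration; a real bridge would likely use a shared store such as a database or cache, and would take `event_id`, `event_type`, and `occurred_at` from the webhook payload.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class IncidentSync:
    """Hypothetical in-memory sync state (illustration only, not durable)."""
    last_seen: dict = field(default_factory=dict)     # incident_id -> newest occurred_at applied
    seen_events: set = field(default_factory=set)     # webhook event ids already processed
    pending_echo: dict = field(default_factory=dict)  # (incident_id, event_type) -> expiry time

    def note_slack_action(self, incident_id, event_type, now, ttl=timedelta(seconds=60)):
        """Record that Slack just caused this transition, so its webhook is a known echo."""
        self.pending_echo[(incident_id, event_type)] = now + ttl

    def classify(self, event_id, incident_id, event_type, occurred_at, now):
        """Return 'duplicate', 'stale', 'echo', or 'external' for one webhook event."""
        # Retried deliveries carry the same event id: drop them.
        if event_id in self.seen_events:
            return "duplicate"
        self.seen_events.add(event_id)

        # Out-of-order delivery: ignore anything older than what was already applied.
        latest = self.last_seen.get(incident_id)
        if latest is not None and occurred_at < latest:
            return "stale"
        self.last_seen[incident_id] = occurred_at

        # A transition Slack itself initiated: update the message quietly, do not re-announce.
        expiry = self.pending_echo.pop((incident_id, event_type), None)
        if expiry is not None and now <= expiry:
            return "echo"
        return "external"


t0 = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
sync = IncidentSync()
sync.note_slack_action("P1", "incident.acknowledged", now=t0)
print(sync.classify("e2", "P1", "incident.acknowledged", t0 + timedelta(seconds=1), t0 + timedelta(seconds=2)))
print(sync.classify("e1", "P1", "incident.triggered", t0 - timedelta(seconds=30), t0 + timedelta(seconds=3)))
```

Under these assumptions, the first call is classified as an echo of the Slack-initiated ack and the late-arriving trigger event as stale. A test plan for item (e) could assert exactly these classifications, plus the duplicate case for a retried delivery.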
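A further note outside the prompt text, for checking the security portion of the output (step 6): both systems sign requests with HMAC-SHA256, so the verification code the model produces should look roughly like the sketch below. It assumes Slack's `v0` scheme (signature over `v0:{timestamp}:{raw body}`, sent in `X-Slack-Signature` alongside `X-Slack-Request-Timestamp`) and PagerDuty's v3 webhook scheme (one or more comma-separated `v1=` signatures over the raw body in `X-PagerDuty-Signature`). The function names and the 300-second replay window are illustrative choices, and header extraction is left to whatever web framework is in use.

```python
import hashlib
import hmac


def verify_slack(signing_secret: str, timestamp: str, body: bytes,
                 signature: str, now: float, max_skew: int = 300) -> bool:
    """Check a Slack request signature and reject old timestamps (replay guard)."""
    try:
        if abs(now - int(timestamp)) > max_skew:
            return False
    except ValueError:
        return False  # non-numeric timestamp header
    base = b"v0:" + timestamp.encode() + b":" + body
    expected = "v0=" + hmac.new(signing_secret.encode(), base, hashlib.sha256).hexdigest()
    # Compare as bytes in constant time to avoid timing leaks.
    return hmac.compare_digest(expected.encode(), signature.encode())


def verify_pagerduty(secret: str, body: bytes, header: str) -> bool:
    """Check a PagerDuty v3 webhook; the header may carry several signatures."""
    expected = "v1=" + hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return any(
        hmac.compare_digest(expected.encode(), candidate.strip().encode())
        for candidate in header.split(",")
    )
```

Both functions must receive the raw request bytes exactly as delivered; verifying a re-serialized JSON body is a common source of false signature mismatches.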