
Slack Log Error-Spike Anomaly Notifications Prompt

Design Slack notifications that detect log error-rate spikes and new error signatures, then post grouped, sampled, and actionable anomaly alerts instead of raw log noise.

Target user: Engineers who want early warning from logs without log spam in Slack
Difficulty: Advanced
Tools: Claude, ChatGPT

The prompt

You are a senior observability engineer who has built log-based alerting and learned that streaming raw error logs into Slack is an anti-pattern — anomalies and new signatures are the signal.

I will provide:
- Our logging stack (e.g. Loki, Elasticsearch, CloudWatch, or Datadog) and how we'd query it
- Service/team boundaries and Slack channel layout
- Baseline error volumes and known-noisy log sources
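- Slack constraints (bot scopes, file-upload availability; note that files.upload is deprecated in favor of files.getUploadURLExternal plus files.completeUploadExternal)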
- Pain points (log spam, missed regressions, alert fatigue)

Your job:

1. **Detect anomalies, not lines** — define two triggers: (a) an error-RATE spike vs a rolling baseline (e.g. 3x the trailing 1h median), and (b) NEW error signatures not seen in a defined lookback window (e.g. the prior 7 days). Never alert per log line.

2. **Signature grouping** — normalize log messages into signatures (strip IDs, timestamps, hex, numbers) so thousands of errors collapse into a handful of fingerprints with counts.

3. **Message design** — Block Kit: a header (service + "error spike" or "new error" + severity), a section with current rate vs baseline and affected count, and a context block linking to the log query, pre-scoped to the window and signature.

4. **Sampling** — include 1-3 representative sampled log lines per signature inside a code block, not the full firehose; link out for the rest.

5. **Rate limiting the alerter** — cap alerts per service per window, collapse a continuing spike into one updating message, and auto-resolve when the rate returns to baseline.

6. **Noise control** — allowlist known-benign signatures, suppress during known maintenance windows, and require a spike to persist for a short duration (e.g. two consecutive evaluation intervals) before alerting, to avoid single-blip noise.

7. **Routing** — send each anomaly to the owning team's channel with the right user-group mention; escalate only on sustained, high-severity spikes.

8. **Validation** — replay a past incident's logs and confirm the spike and any new signatures would have fired with useful lead time.
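
For orientation, the detection logic in steps 1, 2, and 6 could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not tied to any logging stack: the regex rules, the placeholder tokens, the 5-minute bucket size, the `min_count` floor, and every function name here are hypothetical choices for illustration.

```python
import re
from collections import Counter
from statistics import median

# Hypothetical normalization rules, applied in order (most specific first).
_RULES = [
    (re.compile(r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
                r"[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"), "<uuid>"),
    (re.compile(r"\b\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}"
                r"(?:\.\d+)?(?:Z|[+-]\d{2}:?\d{2})?"), "<ts>"),
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<hex>"),
    (re.compile(r"\b[0-9a-fA-F]{12,}\b"), "<hex>"),
    (re.compile(r"\d+"), "<n>"),
]


def signature(message: str) -> str:
    """Collapse variable parts of a log message into placeholder tokens."""
    for pattern, token in _RULES:
        message = pattern.sub(token, message)
    return " ".join(message.split())  # normalize whitespace


def sustained_spike(buckets: list[int], baseline_len: int = 12,
                    persist: int = 2, factor: float = 3.0,
                    min_count: int = 20) -> bool:
    """Check the last `persist` buckets against the median of the
    `baseline_len` buckets before them (12 x 5 min = trailing 1h, assumed)."""
    if len(buckets) < baseline_len + persist:
        return False  # not enough history to judge
    recent = buckets[-persist:]
    baseline = buckets[-(baseline_len + persist):-persist]
    # max(..., 1) keeps a zero baseline from making every error a "spike".
    threshold = factor * max(median(baseline), 1)
    return all(c >= min_count and c > threshold for c in recent)


def new_signatures(messages: list[str], seen: set[str]) -> dict[str, int]:
    """Return counts for signatures absent from the lookback set `seen`."""
    counts = Counter(signature(m) for m in messages)
    return {sig: n for sig, n in counts.items() if sig not in seen}
```

Requiring every one of the last `persist` buckets to exceed the threshold is one simple way to express the "persist before alerting" rule. In practice the counts would come from the logging backend's own aggregation query, not from in-process counting.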

Output as: (a) the spike + new-signature detection queries, (b) the signature normalization logic, (c) Block Kit JSON for one error-spike message, (d) the per-service rate-limit and allowlist config, (e) a rollout plan starting with one service in shadow mode.

Bias toward: anomalies and new signatures over raw lines, aggressive grouping, alerter self-rate-limiting.
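
As a rough illustration of what outputs (c) and (d) might contain, the Python sketch below builds a Block Kit payload for one error-spike message and gates alerts per service. It is a sketch under stated assumptions: `spike_blocks`, `AlertGate`, the cap of three new alerts per hour, and the field layout are hypothetical, and nothing here calls Slack. In a real alerter the "post" and "update" decisions would presumably map to `chat.postMessage` and `chat.update`, with the stored value being the message `ts` returned by the post.

```python
import json
from dataclasses import dataclass, field

FENCE = "`" * 3  # Slack mrkdwn code fence, built at runtime


def spike_blocks(service: str, severity: str, rate: int, baseline: int,
                 signature: str, count: int, samples: list[str],
                 query_url: str) -> list[dict]:
    """Build header + section + context blocks for one spike alert."""
    sample_text = "\n".join(samples[:3])  # at most three sampled lines
    body = (f"*Rate:* {rate}/min vs baseline {baseline}/min\n"
            f"*Signature:* `{signature}` ({count} events)\n"
            f"{FENCE}\n{sample_text}\n{FENCE}")
    return [
        {"type": "header",
         "text": {"type": "plain_text",
                  "text": f"{service}: error spike ({severity})"[:150]}},
        {"type": "section", "text": {"type": "mrkdwn", "text": body}},
        {"type": "context",
         "elements": [{"type": "mrkdwn",
                       "text": f"<{query_url}|Open scoped log query>"}]},
    ]


@dataclass
class AlertGate:
    """Decide whether to post, update, resolve, or skip an alert."""
    max_new_per_window: int = 3   # assumed cap on new alerts per service
    window_s: int = 3600          # assumed rate-limit window in seconds
    open_alerts: dict = field(default_factory=dict)  # (service, sig) -> opened at
    posted_at: dict = field(default_factory=dict)    # service -> post times

    def decide(self, service: str, sig: str, spiking: bool, now: float) -> str:
        key = (service, sig)
        if key in self.open_alerts:
            if spiking:
                return "update"   # continuing spike: edit the same message
            del self.open_alerts[key]
            return "resolve"      # rate is back to baseline
        if not spiking:
            return "skip"
        recent = [t for t in self.posted_at.get(service, [])
                  if now - t < self.window_s]
        self.posted_at[service] = recent
        if len(recent) >= self.max_new_per_window:
            return "skip"         # per-service cap reached
        recent.append(now)
        self.open_alerts[key] = now  # a real alerter would store the message ts
        return "post"


if __name__ == "__main__":
    blocks = spike_blocks("checkout", "high", 240, 60, "Timeout after <n> ms",
                          1200, ["Timeout after 3000 ms"],
                          "https://logs.example.com/q")
    print(json.dumps(blocks, indent=2))
```

Keying open alerts by service and signature means a continuing spike edits one message instead of posting a new one, while the per-service cap bounds how many distinct signatures can open fresh alerts within the window.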
