You are a senior SRE who runs Grafana Synthetic Monitoring for black-box availability and latency SLOs. I will provide: - The endpoints/services to probe - Target SLOs (availability, latency) - Regions your users come from Your job: 1. **Pick check types**: HTTP(S) for API/web availability, ping for L3 reachability, DNS for resolution, traceroute for path analysis, and k6 browser checks for full page loads. 2. **Choose probe locations**: select public probes near your user regions (or a private probe for internal targets) and set the frequency (e.g. 60s). 3. **HTTP assertions**: validate status code, expected body/regex, TLS certificate expiry, and response time thresholds. 4. **Labels**: tag checks with `service`, `env`, `team` so metrics/logs are queryable and route alerts. 5. **Metrics produced**: `probe_success`, `probe_duration_seconds`, `probe_http_status_code`, `probe_ssl_earliest_cert_expiry` — use these for SLOs. 6. **Alerting**: alert on `probe_success` availability over a window and on cert expiry approaching. 7. **Private probes**: deploy a private probe agent for internal-only endpoints. 8. **As code**: manage checks via the Synthetic Monitoring API or the Terraform provider. Mark DESTRUCTIVE: pointing high-frequency checks at rate-limited endpoints, probing third-party APIs you don't own (ToS), leaking secrets in check bodies. --- Endpoints/services: [DESCRIBE] Target SLOs: [DESCRIBE] User regions: [DESCRIBE]

Why this prompt works

Synthetic checks are only as good as their assertions and probe placement — a 200-only HTTP check misses broken content, and probes in the wrong region measure meaningless latency. This prompt makes the model choose check types deliberately, add real assertions (body, TLS, latency), and connect the resulting probe_* metrics to SLO alerts.

How to use it

List endpoints and whether they are internal so it picks public vs private probes.
State SLO targets so assertions and alert windows match.
Name user regions so probe locations are relevant.
Ask for Terraform or API definitions to manage checks as code.

Useful commands

# List existing checks via the Synthetic Monitoring API
curl -s -H "Authorization: Bearer $SM_TOKEN" \
  https://synthetic-monitoring-api.grafana.net/api/v1/check/list | jq '.[].job'

# Add a check via the API
curl -X POST https://synthetic-monitoring-api.grafana.net/api/v1/check/add \
  -H "Authorization: Bearer $SM_TOKEN" \
  -H "Content-Type: application/json" \
  -d @http-check.json

Example config

# Terraform: HTTP synthetic check with assertions
resource "grafana_synthetic_monitoring_check" "checkout" {
  job     = "checkout-api"
  target  = "https://api.example.com/health"
  enabled = true
  probes  = [/* us-east, eu-west probe ids */]
  labels = {
    service = "checkout"
    env     = "prod"
    team    = "payments"
  }
  settings {
    http {
      method                = "GET"
      valid_status_codes    = [200]
      fail_if_body_not_matches_regexp = ["\"status\":\\s*\"ok\""]
      tls_config { insecure_skip_verify = false }
      ip_version = "V4"
    }
  }
}

# Availability SLO alert over 5m and TLS cert-expiry alert
avg_over_time(probe_success{job="checkout-api"}[5m]) < 0.99
# and
(probe_ssl_earliest_cert_expiry - time()) / 86400 < 14

Common findings this catches

False green → HTTP check validates status only, not body.
Wrong latency → probe region far from real users.
Unreachable internal target → needs a private probe.
Surprise TLS outage → no cert-expiry alert.
Throttled target → check frequency too aggressive.
Secret leak → token in check header/body.
Cost creep → too many checks inflating active series.

When to escalate

Rate-limiting or ToS concerns with external targets — coordinate with the target owner/legal.
Private probe network access — networking/security team.
SLO/error-budget policy — reliability governance.

Grafana Synthetic Monitoring Checks Prompt

Why this prompt works

How to use it

Useful commands

Example config

Common findings this catches

When to escalate

Related prompts

Grafana Business KPI Dashboard Design Prompt

Grafana PagerDuty/Opsgenie Contact Point Prompt

Grafana Terraform Provider Dashboards Prompt

Why this prompt works

How to use it

Useful commands

Example config

Common findings this catches

When to escalate

Related prompts

Grafana Business KPI Dashboard Design Prompt

Grafana PagerDuty/Opsgenie Contact Point Prompt

Grafana Terraform Provider Dashboards Prompt

Free: the DevOps AI Incident-Triage Cheat Sheet