AI for Prometheus & Monitoring

Write better alert rules, PromQL queries, and Grafana dashboards with AI.

Recommended tools

Claude

by Anthropic

4.8

The most cautious and context-aware AI assistant for infrastructure work.

Best for

Production troubleshooting, postmortems, IaC review

Pricing

Free tier; Pro $20/mo; Team & Enterprise tiers

ChatGPT

by OpenAI

4.6

The broadest AI ecosystem with deep plugin support and the largest user community.

Best for

Ansible/Terraform generation, fast scaffolding, plugin-heavy workflows

Pricing

Free tier; Plus $20/mo; Team & Enterprise tiers

Datadog Bits AI

by Datadog

4.2

An AI SRE inside Datadog that auto-investigates alerts, queries your telemetry in plain English, and accelerates incident triage.

Best for

Investigating alerts and incidents inside Datadog, natural-language queries across metrics/logs/traces

Pricing

Bundled with Datadog; AI features vary by plan. Datadog is billed per host and usage (often expensive at scale).

Microsoft Copilot for Azure

by Microsoft

4.0

An AI assistant inside the Azure portal that knows your environment: use it to generate Bicep/CLI, troubleshoot AKS, and query Log Analytics in plain English.

Best for

Managing & troubleshooting Azure resources, generating Bicep/CLI, AKS diagnostics, KQL authoring

Pricing

Included with Azure at no additional charge (standard Azure resource usage applies)

Prompts

Alertmanager Routing Tree Matcher Design Review Prompt

Grafana Prometheus Dashboard Panel Query Design Prompt

Prometheus Active Series Cardinality Reduction Triage Prompt

Prometheus Missing Metric End-to-End Debugging Prompt

Prometheus Multi-Window Multi-Burn-Rate SLO Alert Authoring Prompt

Prometheus Recording Rule Hierarchy Design and Naming Prompt

Prometheus Scrape Config Relabel Target Pruning Design Prompt

PromQL group_left Metadata Enrichment Join Prompt

PromQL Latency SLI from Histograms Aggregation Design Prompt

PromQL Rate Window vs Scrape Interval Mismatch Debugging Prompt

Alertmanager group_wait, group_interval & repeat_interval Tuning Prompt

Grafana $__rate_interval Correctness Review Prompt

OpenTelemetry Collector batch & memory_limiter Processor Sizing Prompt

Prometheus metric_relabel_configs Drop-List Cardinality Audit Prompt

Prometheus query.max-samples, timeout & concurrency Tuning Prompt

PromQL Native Histogram histogram_count & histogram_sum Debugging Prompt

PromQL quantile_over_time vs histogram_quantile Selection Prompt

Thanos Store Gateway Index & Caching Tier Sizing Prompt

VictoriaMetrics vmagent Stream Aggregation Rules Design Prompt

Prometheus Experimental Feature-Flag Rollout Prompt

Prometheus honor_labels & honor_timestamps Conflict Resolution Prompt

Prometheus http_sd Dynamic Target Discovery Prompt

Prometheus Query Log Slow-Query Audit Prompt

Prometheus sample_limit Target Protection Prompt

Prometheus scrape_protocols Content Negotiation Prompt

Prometheus target_limit & label_limit Guardrails Prompt

Prometheus TSDB Head Memory & Series Churn Prompt

Prometheus WAL Replay Startup Latency Prompt

Prometheus Exporter TLS & Auth Hardening Prompt

Prometheus External Labels & Multi-Cluster Collision Prompt

Prometheus Histogram Bucket Boundary Design Prompt

Prometheus Meta-Monitoring & Self-SLO Design Prompt

Prometheus Query API Read-Path Protection Prompt

Prometheus Recording Rule Layered Aggregation Prompt

Prometheus Scrape Timeout & Slow Target Diagnosis Prompt

Prometheus TSDB Snapshot Backup & Restore Prompt

PromQL Clamp & Bounds Sanitization Review Prompt

Grafana Dashboard JSON Model Drift Review Prompt

Prometheus Config Reload Validation with promtool Prompt

Prometheus Out-of-Order Sample Ingestion Tuning Prompt

Prometheus Rule Unit Testing with promtool Prompt

Prometheus Target-Down & Scrape Failure Triage Prompt

Prometheus TSDB Block & Compaction Tuning Prompt

Prometheus WAL & TSDB Corruption Recovery Prompt

PromQL absent_over_time Gap Detection Prompt

PromQL Counter-Reset Resilience Review Prompt

Alertmanager Routing Tree Dry-Run Testing Prompt

Alertmanager Silence Automation via amtool & API Prompt

Dashboard Query to Recording Rule Offload Prompt

Exporter Cardinality Budget & Label Allowlisting Prompt

MetricsQL WITH Templates & Query Optimization Prompt

Prometheus Remote Write Queue & Backpressure Tuning Prompt

PromQL Apdex Score & Latency Satisfaction Prompt

Synthetic Monitoring Multi-Step Journey Checks Prompt

VictoriaMetrics Cardinality Explorer & TSDB Triage Prompt

Alertmanager PagerDuty Receiver Integration Prompt

Grafana k6 Load Test Metrics Dashboard Prompt

Loki Multi-Tenancy & Retention Design Prompt

Long-Term Metrics Storage Backend Selection Prompt

OpenTelemetry Tail Sampling Policy Design Prompt

Prometheus Scrape & Evaluation Interval Tuning Prompt

Prometheus Staleness & Stale Markers Prompt

PromQL offset & Time-Shifted Comparison Prompt

Recording Rule Naming Convention Prompt

Alertmanager HA Cluster & Gossip Mesh Design Prompt

Grafana Notification Policies & Contact Points Design Prompt

Grafana SLO Burn-Rate Dashboard Design Prompt

OpenTelemetry Span Metrics Connector for RED Metrics Prompt

Prometheus Alert Runbook & Annotation Standardization Prompt

Prometheus TLS Certificate Expiry Monitoring Prompt

PromQL topk / bottomk Ranking & Top-N Dashboard Queries Prompt

VictoriaMetrics Migration from Prometheus Prompt

Alertmanager Webhook Receiver Integration Prompt

Grafana Dashboards as Code with Grafonnet Prompt

Grafana OnCall Escalation Chain Design Prompt

OpenTelemetry Temporality & Prometheus Compatibility Prompt

Prometheus for & keep_firing_for Tuning Prompt

Prometheus Query Frontend & Vertical Sharding Prompt