VictoriaMetrics Cardinality Explorer & TSDB Triage Prompt

Diagnose a VictoriaMetrics deployment (single-node or cluster) suffering from a high active time series count and high churn, using the built-in Cardinality Explorer and the TSDB status endpoint, then produce a prioritized remediation plan.

Target user
SREs and platform engineers running single-node or cluster VictoriaMetrics
Difficulty
Advanced
Tools
Claude, ChatGPT

The prompt

You are a senior observability engineer who specializes in VictoriaMetrics
capacity and cardinality management.

I will provide:
- Output from /api/v1/status/tsdb and the Cardinality Explorer UI (top metrics by series count, top label=value pairs, labels with the most unique values)
- vm_cache_size_bytes, vm_slow_queries_total, and active series trends
- Our ingestion rate (vm_rows_inserted_total) and retention settings

Your job:

1. **Baseline**: establish the current active time series count, the churn rate (new series created per hour or day), and how much memory headroom remains before series growth exhausts RAM.
2. **Offender ranking**: identify the metrics and label keys driving cardinality, distinguishing legitimate growth from unbounded labels (request_id, pod hash, full URLs).
3. **Root cause**: classify each offender as churn (frequent restarts), explosion (high-cardinality label), or duplication (overlapping scrape jobs).
4. **Remediation**: propose relabeling (relabel_configs or metric_relabel_configs) or stream aggregation (-streamAggr.config) as the primary fixes, with the exact configuration snippets. Suggest -dropSamplesOnOverload only as a last resort, since it sheds load under pressure rather than reducing cardinality.
5. **Guardrails**: recommend -maxLabelsPerTimeseries (labels per series), -search.maxUniqueTimeseries (series touched per query), and, if a hard cap on stored series is warranted, -storage.maxHourlySeries and -storage.maxDailySeries, all sized to our hardware.
6. **Verification**: define the queries to confirm series reduction without losing needed signals.
7. **Rollback**: describe how to revert each change safely.

Output as: (a) ranked offender table, (b) per-offender fix, (c) guardrail config, (d) verification checklist.

Flag any change that would silently drop metrics currently used by alerting rules before recommending it.
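
Outside the prompt itself, it can help to condense the /api/v1/status/tsdb response before pasting it in. The following is a minimal Python sketch, not an official client. It assumes the response follows the Prometheus-style shape with `data.seriesCountByMetricName` and `data.labelValueCountByLabelName` arrays of `{"name", "value"}` entries plus a `data.totalSeries` field. The function name `rank_offenders` and the `SUSPECT_LABELS` set are hypothetical and should be adapted to your own labels.

```python
from typing import Any

# Assumption: label names that tend to be unbounded. Illustrative only;
# replace with the labels that matter in your environment.
SUSPECT_LABELS = frozenset(
    {"request_id", "trace_id", "url", "path", "pod_template_hash"}
)


def rank_offenders(tsdb_status: dict[str, Any], top_n: int = 5) -> dict[str, Any]:
    """Rank metrics by series count and labels by number of distinct values.

    `tsdb_status` is the already-parsed JSON body of /api/v1/status/tsdb.
    Missing fields are treated as empty rather than raising.
    """
    data = tsdb_status.get("data") or {}
    total = int(data.get("totalSeries", 0))

    # Metrics with the most series first.
    by_metric = sorted(
        data.get("seriesCountByMetricName") or [],
        key=lambda entry: int(entry["value"]),
        reverse=True,
    )[:top_n]
    metrics = [
        {
            "metric": entry["name"],
            "series": int(entry["value"]),
            # Share of all series; None when the total is unknown.
            "share": round(int(entry["value"]) / total, 3) if total else None,
        }
        for entry in by_metric
    ]

    # Labels with the most distinct values first.
    by_label = sorted(
        data.get("labelValueCountByLabelName") or [],
        key=lambda entry: int(entry["value"]),
        reverse=True,
    )[:top_n]
    labels = [
        {
            "label": entry["name"],
            "distinct_values": int(entry["value"]),
            # Flag only; a human still decides whether the label is legitimate.
            "suspect": entry["name"] in SUSPECT_LABELS,
        }
        for entry in by_label
    ]

    return {"total_series": total, "metrics": metrics, "labels": labels}
```

To use it, load the endpoint's JSON body (for example with `json.load` on a saved response) and pass the resulting dict to `rank_offenders`. The summary it returns is small enough to paste alongside the other inputs. The `suspect` flag is only a hint based on the label name, so the classification in step 3 still needs the model's and your own judgment.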