CloudOps
AI for Prometheus & Monitoring | Difficulty: Beginner | Claude, ChatGPT

Recording Rule Naming Convention Prompt

Adopt the standard level:metric:operations recording-rule naming convention so pre-aggregated series are self-documenting, discoverable, and safe to reuse across dashboards and alerts.

Target user: Observability engineers standardizing recording-rule hygiene
Difficulty: Beginner
Tools: Claude, ChatGPT

The prompt

You are a Prometheus standards author who maintains the recording-rule naming guide for a large platform team.

I will provide:
- A list of existing recording rules (often inconsistently named)
- The raw metrics they aggregate
- The dashboards and alerts that consume them
- Any house style we already follow

Your job:

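1. **Teach the convention** — explain the canonical `level:metric:operations` format from the Prometheus docs: `level` = the aggregation level, i.e. the labels the output series keeps (e.g. `instance`, `job`, `cluster`, joined with underscores when there are several, as in `instance_path`), `metric` = the source metric name (with `_total` stripped when `rate()` or `irate()` is applied), `operations` = the operations applied, joined with underscores and listed newest first (e.g. `rate5m`, `avg_rate5m`); a `sum` is omitted when other operations are present. A valid name therefore contains exactly two colons. Give 3-4 worked examples like `instance:node_cpu:rate5m` and `job:http_requests:rate5m`.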

2. **Rename my rules** — produce a before/after table converting my existing rule names into the convention, preserving the underlying expression.

3. **Unit and window discipline** — encode the rate window in the suffix (`rate5m` not just `rate`), keep base-unit metric names (`_seconds`, `_bytes`), and never bake environment or team into the series name (those belong in labels).

4. **Rule-group organization** — put a rule and the rules it depends on in the same group, with dependencies listed first, because rules within a group are evaluated sequentially in order; name groups by domain (e.g. `node-recording`, `http-recording`).

5. **Discoverability** — show how consistent names make autocomplete, dashboard variables, and alert reuse dramatically easier, and how to document the convention in a short team CONTRIBUTING note.

6. **Anti-patterns** — colons used decoratively, missing aggregation level, ambiguous windows, and recording rules that duplicate cheap raw queries with no aggregation benefit.

Output as: (a) a one-page cheat sheet of the convention with examples, (b) the before/after rename table for my rules, (c) a corrected rule-group YAML, (d) a CI lint rule (regex) that flags names violating the convention, (e) the single highest-impact rename to do first.

Bias toward: self-documenting names, encoding the window explicitly, and never putting label data in the series name.
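
For reference, the lint check requested in output (d) could look roughly like the Python sketch below. It assumes lowercase names with exactly two colons and treats a rate-style operation without a window as a violation. `NAME_RE`, `BARE_RATE_RE`, and `lint_rule_name` are illustrative names chosen here, not part of Prometheus or any existing linter, and a real house style may need a looser or stricter pattern.

```python
import re

# Assumed shape: exactly three colon-separated parts.
#   level      - underscore-joined labels the series is aggregated to
#   metric     - source metric name
#   operations - underscore-joined operations, newest first
NAME_RE = re.compile(
    r"^(?P<level>[a-z][a-z0-9_]*)"
    r":(?P<metric>[a-z][a-z0-9_]*)"
    r":(?P<operations>[a-z][a-z0-9_]*)$"
)

# A rate-style operation with no window, e.g. "rate" rather than "rate5m".
BARE_RATE_RE = re.compile(r"(?:^|_)(?:rate|irate|increase|delta)(?:_|$)")


def lint_rule_name(name: str) -> list[str]:
    """Return the problems found in one recording-rule name (empty if none)."""
    match = NAME_RE.match(name)
    if match is None:
        return ["name is not of the form level:metric:operations"]
    problems = []
    if BARE_RATE_RE.search(match["operations"]):
        problems.append("rate-style operation has no window (use e.g. rate5m)")
    return problems


if __name__ == "__main__":
    # Hypothetical names, used only to exercise the checks.
    for candidate in (
        "instance:node_cpu:rate5m",
        "job:http_requests:rate",
        "http:requests:per:second",
        "http_requests_rate5m",
    ):
        print(candidate, lint_rule_name(candidate))
```

In CI, a check like this would typically run over every `record:` value in the rule files and fail the build when any name returns a non-empty list.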