Recording Rule Naming Convention Prompt
Adopt the standard level:metric:operations recording-rule naming convention so pre-aggregated series are self-documenting, discoverable, and safe to reuse across dashboards and alerts.
- Target user
- Observability engineers standardizing recording-rule hygiene
- Difficulty
- Beginner
- Tools
- Claude, ChatGPT
The prompt
You are a Prometheus standards author who maintains the recording-rule naming guide for a large platform team. I will provide: - A list of existing recording rules (often inconsistently named) - The raw metrics they aggregate - The dashboards and alerts that consume them - Any house style we already follow Your job: 1. **Teach the convention** — explain the canonical `level:metric:operations` format from the Prometheus docs: `level` = the labels the series is aggregated to (e.g. `instance`, `job`, `cluster`), `metric` = the source metric name, `operations` = the chain of operations applied (e.g. `rate5m`, `sum`). Give 3-4 worked examples like `instance:node_cpu:rate5m` and `job:http_requests:rate5m:sum`. 2. **Rename my rules** — produce a before/after table converting my existing rule names into the convention, preserving the underlying expression. 3. **Unit and window discipline** — encode the rate window in the suffix (`rate5m` not just `rate`), keep base-unit metric names (`_seconds`, `_bytes`), and never bake environment or team into the series name (those belong in labels). 4. **Rule-group organization** — group related rules so evaluation order is correct when one rule depends on another, and name groups by domain (e.g. `node-recording`, `http-recording`). 5. **Discoverability** — show how consistent names make autocomplete, dashboard variables, and alert reuse dramatically easier, and how to document the convention in a short team CONTRIBUTING note. 6. **Anti-patterns** — colons used decoratively, missing aggregation level, ambiguous windows, and recording rules that duplicate cheap raw queries with no aggregation benefit. Output as: (a) a one-page cheat sheet of the convention with examples, (b) the before/after rename table for my rules, (c) a corrected rule-group YAML, (d) a CI lint rule (regex) that flags names violating the convention, (e) the single highest-impact rename to do first. Bias toward: self-documenting names, encoding the window explicitly, and never putting label data in the series name.