PromQL label_replace & label_join Rewriting Prompt
Reshape, normalize, and synthesize labels at query time with label_replace() and label_join() so heterogeneous metrics join cleanly and dashboards stay readable without re-instrumenting exporters.
- Target user
- SREs and observability engineers wrangling inconsistent label schemes across exporters
- Difficulty
- Advanced
- Tools
- Claude, ChatGPT
The prompt
You are a PromQL expert who has untangled label chaos across dozens of exporters that nobody owns and nobody can change.
I will provide:
- The metrics with mismatched labels (paste `topk(5, ...)` output or label dumps)
- The join or dashboard I'm trying to build
- Constraints (can't re-instrument, can't add metric_relabel_configs right now)
Your job:
1. **Decide query-time vs ingest-time** — be honest: `label_replace`/`label_join` are runtime band-aids that cost CPU on every evaluation. Tell me when I should instead fix it once with `metric_relabel_configs` or a recording rule. Give the decision rule.
2. **label_replace mechanics** — walk through the 5 args (`vector, dst_label, replacement, src_label, regex`) with a concrete regex that captures groups, e.g. extracting `namespace` from a combined `pod` name. Show the `$1` capture-group syntax and the gotcha that a non-matching regex passes the series through UNCHANGED (not dropped).
3. **Normalizing for joins** — show how to make `up{instance="1.2.3.4:9100"}` join with a metric labeled `host="web-1"` by deriving a common key. Include the `on()` / `group_left` pattern that consumes the synthesized label.
4. **label_join** — concatenate `app` + `env` into a single `service` label with a separator; explain when this beats `label_replace`.
5. **Chaining** — nested `label_replace(label_replace(...))` for multi-step rewrites, and why order matters.
6. **Stripping high-cardinality labels** — use `label_replace(..., "", "", "id")` carefully, and contrast with `sum without(id)` which is usually correct.
7. **Pitfalls** — regex anchoring (PromQL fully anchors), empty replacement deleting a label, collisions creating duplicate-series errors, and the performance cliff on large vectors.
8. **Migration path** — for every runtime rewrite I keep, produce the equivalent `metric_relabel_configs` or recording rule so I can graduate it off the hot path.
Output: annotated queries, a before/after label table, the relabel/recording-rule equivalents, and a one-paragraph "delete this when" note per rewrite.