AI for Prometheus & Monitoring · By James Joyner IV · 9 min read

Prometheus Error Guide: 'found duplicate series for the match group' Vector Matching Failure

Fix the PromQL 'found duplicate series for the match group' error: add group_left/group_right for many-to-one joins, or deduplicate a non-unique one-side.

  • #prometheus-monitoring
  • #troubleshooting
  • #errors
  • #promql

Exact Error Message

found duplicate series for the match group ...; many-to-one matching must be explicit (group_left/group_right) is a PromQL vector-matching error. It is returned as the result of a query — there is no value, just an error — and it also surfaces in rule evaluation logs when a recording or alerting rule contains the offending expression.

From the query engine you get the full message, naming the exact match group and the colliding series:

found duplicate series for the match group {instance="10.0.3.4:9100"} on the right hand-side of the operation: [{__name__="node_memory_MemTotal_bytes", instance="10.0.3.4:9100", job="node"}, {__name__="node_memory_MemTotal_bytes", instance="10.0.3.4:9100", job="node-ha"}]; many-to-one matching must be explicit (group_left/group_right)

Over the HTTP API the same failure comes back as a JSON error body:

{"status":"error","errorType":"execution","error":"found duplicate series for the match group {instance=\"10.0.3.4:9100\"} on the right hand-side of the operation: [...]; many-to-one matching must be explicit (group_left/group_right)"}

And in the rule manager log, where it silently breaks an alert or recording rule:

ts=2026-06-27T10:14:02.118Z caller=manager.go:677 level=warn component="rule manager" msg="Evaluating rule failed" rule="record: instance:mem_used:ratio" err="found duplicate series for the match group {instance=\"10.0.3.4:9100\"} on the right hand-side of the operation: [...]; many-to-one matching must be explicit (group_left/group_right)"

What the Error Means

When you combine two instant vectors with a binary operator (/, *, +, -, etc.), PromQL matches each series on the left to a series on the right by their labels. The default is one-to-one: after the match labels are applied, each side may have at most one series per match group, and series with no partner on the other side are dropped from the result. You can narrow the match labels with on(...) (match only on these) or ignoring(...) (match on everything except these).

A many-to-one (or one-to-many) match — where several left series share a single right series — is legal, but it must be declared explicitly with group_left (left side is the “many”) or group_right (right side is the “many”).

This specific error fires when, after applying on()/ignoring(), the “one” side has more than one series per match group. The engine cannot decide which of those series to pair with, so the match is ambiguous and it refuses to guess. The message tells you which side is the offender — on the right hand-side of the operation (or left) — and the exact match group {...} labels that collapsed onto multiple series.
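The rule above can be modeled in a few lines of Python. This is a simplified sketch of the matching semantics, not Prometheus's actual implementation: the helper names match_key and duplicate_groups are hypothetical, series are plain label dictionaries, and the metric name is always left out of the match key.

```python
from collections import defaultdict

def match_key(labels, on=None, ignoring=None):
    """Return the match-group signature for one series' labels.

    on=[...] keeps only those labels; ignoring=[...] drops those labels.
    In this sketch the metric name is always excluded.
    """
    items = {k: v for k, v in labels.items() if k != "__name__"}
    if on is not None:
        items = {k: v for k, v in items.items() if k in on}
    elif ignoring is not None:
        items = {k: v for k, v in items.items() if k not in ignoring}
    return tuple(sorted(items.items()))

def duplicate_groups(series, on=None, ignoring=None):
    """Map each match group that holds more than one series to those series."""
    groups = defaultdict(list)
    for labels in series:
        groups[match_key(labels, on, ignoring)].append(labels)
    return {key: members for key, members in groups.items() if len(members) > 1}

right_side = [
    {"__name__": "node_memory_MemTotal_bytes", "instance": "10.0.3.4:9100", "job": "node"},
    {"__name__": "node_memory_MemTotal_bytes", "instance": "10.0.3.4:9100", "job": "node-ha"},
]

# on(instance) collapses job, so both series land in one match group.
print(duplicate_groups(right_side, on=["instance"]))
# on(instance, job) keeps them apart, so nothing is reported.
print(duplicate_groups(right_side, on=["instance", "job"]))
```

Any non-empty result from duplicate_groups on the "one" side corresponds to a match group the engine would refuse to resolve.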

This is distinct from many-to-many matching not allowed, where both sides have duplicate series for the match group. Here, one side is fine and the other (“the one”) is not unique — typically because you either forgot group_left, or the side you grouped against genuinely has duplicates.

Common Causes

  • Missing group_left/group_right on a deliberate many-to-one join. You join a high-cardinality metric to a low-cardinality one (an info/meta metric, a per-instance total) without declaring the grouping side.
  • Duplicate label sets on the one-side. An *_info metric is exported twice, or two jobs export the same series — so even with group_left, the “one” side has two series per match group.
  • An HA pair scraping the same target. Two scrape jobs (e.g. node and node-ha) scrape the same node, producing two node_memory_MemTotal_bytes series per instance that differ only by job.
  • Too-loose on(). Matching on on(instance) collapses a distinguishing label (like job or device) that was keeping the right side unique, turning a clean join into an ambiguous one.
  • Relabeling that dropped a uniqueness label. A metric_relabel_configs change removed a label that made the series unique, so what used to be one series per group is now several.
  • An info metric with genuine duplicates. kube_pod_info, node_uname_info, or a custom _info metric emits more than one row per join key (e.g. two node_uname_info series per instance after a node was renamed but the old series has not yet aged out).

How to Reproduce the Error

Stand up two scrape jobs that hit the same node exporter, so node_memory_MemTotal_bytes exists twice per instance (once per job):

scrape_configs:
  - job_name: node
    static_configs: [{ targets: ["10.0.3.4:9100"] }]
  - job_name: node-ha
    static_configs: [{ targets: ["10.0.3.4:9100"] }]

Now run a join that matches only on instance:

node_filesystem_avail_bytes / on(instance) node_memory_MemTotal_bytes

Because the right side has two node_memory_MemTotal_bytes series for instance="10.0.3.4:9100" (one per job), the match group is ambiguous and the query returns found duplicate series for the match group {instance="10.0.3.4:9100"} on the right hand-side ....

Diagnostic Commands

All of these are read-only queries against the HTTP API: they fetch data and change nothing.

Run the offending right-hand subquery alone and count series per match group; any count above 1 is a duplicate:

curl -s --data-urlencode 'query=count by (instance) (node_memory_MemTotal_bytes) > 1' \
  http://localhost:9090/api/v1/query | jq '.data.result'
[{"metric":{"instance":"10.0.3.4:9100"},"value":[1719482000,"2"]}]

A non-empty result is the proof: that instance has two series on the one-side. Inspect the actual label sets to see what distinguishes them:

curl -s --data-urlencode 'query=node_memory_MemTotal_bytes{instance="10.0.3.4:9100"}' \
  http://localhost:9090/api/v1/query | jq '.data.result[].metric'
{"__name__":"node_memory_MemTotal_bytes","instance":"10.0.3.4:9100","job":"node"}
{"__name__":"node_memory_MemTotal_bytes","instance":"10.0.3.4:9100","job":"node-ha"}

The differing label is job — that is what on(instance) collapsed. For info-metric joins, inspect the label sets the same way:

curl -s --data-urlencode 'query=node_uname_info' \
  http://localhost:9090/api/v1/query | jq '.data.result[].metric'

If node_uname_info shows two rows for one instance, the info metric itself is duplicated and group_left will not save you.
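Spotting the differing label by eye gets harder when the label sets are wide. Here is a minimal sketch that does the comparison for you, fed with the two node_memory_MemTotal_bytes label sets shown earlier in this section (wrapped by hand into a JSON array). The function name differing_labels is hypothetical.

```python
import json

def differing_labels(metrics):
    """Return the label names whose values are not identical across all series."""
    names = set()
    for metric in metrics:
        names.update(metric)
    return sorted(
        name for name in names
        if len({metric.get(name) for metric in metrics}) > 1
    )

# Label sets copied from the jq output for instance 10.0.3.4:9100.
raw = """
[{"__name__":"node_memory_MemTotal_bytes","instance":"10.0.3.4:9100","job":"node"},
 {"__name__":"node_memory_MemTotal_bytes","instance":"10.0.3.4:9100","job":"node-ha"}]
"""
print(differing_labels(json.loads(raw)))  # prints ['job']
```

Whatever labels this reports are the candidates to add to on() or to aggregate away.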

Step-by-Step Resolution

1. Isolate which side has the duplicate. The error already says left- or right-hand side. Confirm with a count on that side, grouped by the match labels:

curl -s --data-urlencode 'query=count by (instance) (node_memory_MemTotal_bytes) > 1' \
  http://localhost:9090/api/v1/query | jq '.data.result'

2. Decide: legitimate many-to-one, or accidental duplicate?

If the one-side is genuinely unique per match group and you simply forgot to declare the many-to-one direction, add group_left. Here is the broken query and the fix:

# broken: ambiguous many-to-one
node_filesystem_avail_bytes / on(instance) node_memory_MemTotal_bytes
# fixed: explicit group_left (left side is the "many")
node_filesystem_avail_bytes / on(instance) group_left() node_memory_MemTotal_bytes

group_left() tells the engine the left side may have many series per match group while the right has one — and it keeps the left side’s labels in the result. Use group_left(label1, label2) to also copy specific labels from the right side onto the result.
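To make the label handling concrete, here is a rough Python model of a group_left join. It is a sketch under stated assumptions, not the engine's code: join_group_left is a hypothetical helper, only label sets are modeled (no sample values), and the mountpoint and nodename values are made up for illustration.

```python
def join_group_left(left, right, on, include=()):
    """Model the labels produced by `left <op> on(...) group_left(include) right`.

    Raises ValueError if the right ("one") side has more than one series
    for a match group, which group_left cannot resolve.
    """
    def key(labels):
        # A missing label is treated like an empty value.
        return tuple((name, labels.get(name, "")) for name in sorted(on))

    one_side = {}
    for labels in right:
        k = key(labels)
        if k in one_side:
            raise ValueError(f"duplicate series for match group {dict(k)} on the right-hand side")
        one_side[k] = labels

    result = []
    for labels in left:
        match = one_side.get(key(labels))
        if match is None:
            continue  # left series without a partner are dropped
        out = {n: v for n, v in labels.items() if n != "__name__"}
        for name in include:
            # Included labels are taken from the "one" side.
            if match.get(name):
                out[name] = match[name]
            else:
                out.pop(name, None)
        result.append(out)
    return result

left = [
    {"__name__": "node_filesystem_avail_bytes", "instance": "10.0.3.4:9100", "mountpoint": "/"},
    {"__name__": "node_filesystem_avail_bytes", "instance": "10.0.3.4:9100", "mountpoint": "/var"},
]
right = [{"__name__": "node_uname_info", "instance": "10.0.3.4:9100", "nodename": "web-1"}]

for labels in join_group_left(left, right, on=["instance"], include=["nodename"]):
    print(labels)
```

Both left series survive with their own labels, and each gains the nodename copied from the single right-side series. Passing a right side with two series for the same instance raises the ValueError, which mirrors why group_left alone does not help with a duplicated one-side.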

3. If the one-side genuinely has duplicates, group_left will NOT help. This is the key trap: group_left only resolves the direction of the match; it does nothing when the “one” side really has two series per group (the HA-pair case above). You must make that side unique. Three options:

Deduplicate by collapsing the extra series with an aggregation:

node_filesystem_avail_bytes
  / on(instance) group_left()
max by (instance) (node_memory_MemTotal_bytes)

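max by (instance) (or min/avg, or topk by (instance) (1, ...)) reduces the two HA series to one per instance, restoring uniqueness. A bare topk(1, ...) without the by clause would keep a single series across all instances, which is not what you want here.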
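The effect of the aggregation can be sketched in Python. This models only what max by (...) does to labels and values; max_by is a hypothetical helper and the 16 GB sample values are invented.

```python
def max_by(samples, by):
    """Model `max by (<by>) (...)`: keep the largest value per group of `by` labels."""
    groups = {}
    for labels, value in samples:
        # Only the `by` labels survive; absent labels are simply omitted.
        key = tuple((name, labels[name]) for name in by if labels.get(name))
        groups[key] = max(value, groups.get(key, value))
    return [(dict(key), value) for key, value in groups.items()]

ha_pair = [
    ({"__name__": "node_memory_MemTotal_bytes", "instance": "10.0.3.4:9100", "job": "node"}, 16_000_000_000.0),
    ({"__name__": "node_memory_MemTotal_bytes", "instance": "10.0.3.4:9100", "job": "node-ha"}, 16_000_000_000.0),
]

print(max_by(ha_pair, by=["instance"]))
```

Two input series become one, carrying only the instance label, so the right side now has a single series per match group.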

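Or add the distinguishing label back into on() so the series no longer collapse. Keep group_left() here, because node_filesystem_avail_bytes usually still has several series (one per filesystem) for each instance and job:

node_filesystem_avail_bytes / on(instance, job) group_left() node_memory_MemTotal_bytes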

Or fix the root cause: drop the duplicate scrape job / relabel so the one-side series is unique in the first place. For an _info metric duplicated by stale series, the duplicate usually ages out, but a deliberate relabel to drop the redundant copy is cleaner.

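4. Validate the rewrite against the API. Expect a non-zero series count; jq prints 0 both for an empty result and for an error body (where .data is missing), so check .status if you see 0: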

curl -s --data-urlencode 'query=node_filesystem_avail_bytes / on(instance) group_left() max by (instance) (node_memory_MemTotal_bytes)' \
  http://localhost:9090/api/v1/query | jq '.data.result | length'
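If you script this check, it helps to separate "query failed" from "query returned nothing". Here is a minimal sketch that classifies a response body from /api/v1/query. The function name summarize_response is hypothetical, and the two sample bodies are abbreviated illustrations (the 0.42 value is invented), not captured output.

```python
import json

def summarize_response(body):
    """Classify a /api/v1/query response body as ('ok', series_count) or ('error', message)."""
    payload = json.loads(body)
    if payload.get("status") == "success":
        return "ok", len(payload["data"]["result"])
    return "error", payload.get("error", "unknown error")

failing = '{"status":"error","errorType":"execution","error":"found duplicate series for the match group ..."}'
passing = (
    '{"status":"success","data":{"resultType":"vector","result":'
    '[{"metric":{"instance":"10.0.3.4:9100"},"value":[1719482000,"0.42"]}]}}'
)

print(summarize_response(failing))
print(summarize_response(passing))
```

A rewrite is validated when the status is ok and the count is greater than zero.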

Prevention and Best Practices

  • Reserve group_left/group_right for deliberate many-to-one joins (info metrics, per-device or per-CPU series against a per-instance total), and copy in only the labels you need: group_left(node).
  • Before joining against an *_info metric, confirm it is unique per join key with count by (<keys>) (metric) > 1. Info metrics are the most common source of a non-unique one-side.
  • Keep on(...) tight enough to be meaningful but not so loose that it collapses a label keeping the one-side unique — on(instance, job) instead of on(instance) when an HA pair scrapes the same target.
  • Detect duplicate scrapes early: alert on count by (instance) (up) > 1, or on key info metrics having more than one series per join key.
  • Pin validated join expressions in recording rules so the matching is checked once and reused, rather than re-authored in every dashboard panel.
  • When a working join suddenly errors, check recent relabel and scrape-config changes first — a new HA job or a dropped label is the usual trigger.

Related Errors

  • many-to-many matching not allowed. Both sides have duplicate series for the match group, not just the one-side. Same family, but the fix is to make both operands unique (aggregate or tighten on()), not to add a grouping modifier.
  • duplicate sample for timestamp. An ingestion-path collision (same series, same timestamp, different value), unrelated to vector matching but easy to confuse because both mention “duplicate.”
  • multiple matches for labels: many-to-one matching must be explicit. The closely related wording emitted when a one-to-one match finds several series on one side for a single series on the other. Resolve it by declaring the “many” side with group_left or group_right, or by deduplicating that side.

Frequently Asked Questions

Will adding group_left always fix this error? No. group_left/group_right only declares which side is the “many.” If the side you are grouping against (the “one”) genuinely has more than one series per match group — an HA pair, a duplicated _info metric — the match is still ambiguous and the error persists. You must deduplicate that side with max by (...)/topk, add a distinguishing label to on(), or fix the scrape.

How do I tell whether the duplicate is intentional or accidental? Run count by (<match labels>) (<one-side>) > 1. Then inspect the offending series’ labels. If they differ only by something like job from an HA scrape, it is accidental — deduplicate. If they are distinct entities you meant to fan out across, it is many-to-one — declare group_left/group_right.
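The check in this answer is easy to generate for any metric and key set. The sketch below builds the PromQL and a GET URL for the instant-query endpoint used throughout this guide. Both function names are hypothetical, and the base URL assumes the same local Prometheus as the curl examples.

```python
from urllib.parse import urlencode

def uniqueness_check(metric, keys):
    """Build the PromQL that lists join keys with more than one series."""
    return f"count by ({', '.join(keys)}) ({metric}) > 1"

def instant_query_url(base_url, promql):
    """Build a GET URL for the /api/v1/query endpoint."""
    params = urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{params}"

query = uniqueness_check("node_uname_info", ["instance"])
print(query)
print(instant_query_url("http://localhost:9090", query))
```

An empty result from the generated query means the metric is unique per join key and safe to use as the “one” side.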

Which side does group_left keep labels from? group_left keeps all of the left (many) side’s labels and lets you copy named labels from the right (one) side: group_left(role). group_right is the mirror image. The grouping direction must match where the “many” actually is, or the engine rejects it.

Why did this start failing after a config change? A new scrape job or an HA Prometheus pair likely began scraping the same target, duplicating the one-side series, or a relabel dropped a label that kept it unique. Compare count by (instance) (<one-side>) before and after, and review recent scrape_configs/relabel_configs diffs.
