Prometheus Error Guide: 'many-to-many matching not allowed' PromQL Vector Matching
Fix the PromQL 'many-to-many matching not allowed' and 'found duplicate series' errors: diagnose mismatched labels, missing on()/ignoring(), and group_left/group_right.
Overview
many-to-many matching not allowed: matching labels must be unique on one side is a PromQL vector-matching error. When you combine two instant vectors with an arithmetic or comparison binary operator (/, *, +, >, etc.), Prometheus pairs each sample on the left with a sample on the right by comparing their label sets. For the default one-to-one match, the matching labels must identify exactly one series on each side. If either side has multiple series sharing the same matching labels, the pairing is ambiguous and the engine rejects the query. The set operators (and, or, unless) match many-to-many by design and do not raise this error.
You will see this from the API or a panel:
found duplicate series for the match group {instance="10.0.1.5:9100"} on the right hand-side of the operation: [...]; many-to-many matching not allowed: matching labels must be unique on one side
A closely related message is:
multiple matches for labels: many-to-one matching must be explicit (group_left/group_right)
It is purely an expression-authoring error — no infrastructure is broken. The same expression that works in one cluster can fail in another simply because a label that was unique there is not unique here.
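The pairing rule can be illustrated with a small simulation. This is a minimal sketch of one-to-one matching, not Prometheus's implementation: the function names and the sample series are made up for illustration, and each series is modelled as a (labels, value) pair.

```python
def match_key(labels, on):
    """Return the matching signature: only the labels listed in `on`."""
    return tuple((name, labels.get(name, "")) for name in sorted(on))


def one_to_one_divide(left, right, on):
    """Divide left by right, pairing series whose `on` labels are equal.

    Each side is a list of (labels_dict, value) pairs. Raises ValueError when
    a match group holds more than one series on either side.
    """
    right_by_key = {}
    for labels, value in right:
        key = match_key(labels, on)
        if key in right_by_key:
            raise ValueError(f"duplicate series on the right for match group {dict(key)}")
        right_by_key[key] = value

    seen_left = set()
    result = {}
    for labels, value in left:
        key = match_key(labels, on)
        if key not in right_by_key:
            continue  # series without a partner are dropped from the result
        if key in seen_left:
            raise ValueError(f"duplicate series on the left for match group {dict(key)}")
        seen_left.add(key)
        result[key] = value / right_by_key[key]
    return result


# Hypothetical sample data: one load series, two per-CPU series for the same instance.
load = [({"instance": "a", "job": "node"}, 2.0)]
cpus = [
    ({"instance": "a", "job": "node", "cpu": "0"}, 10.0),
    ({"instance": "a", "job": "node", "cpu": "1"}, 30.0),
]

print(one_to_one_divide(load, load, on=["instance"]))
try:
    one_to_one_divide(load, cpus, on=["instance"])
except ValueError as err:
    print(err)
```

The second call fails because two series on the right share instance="a", which is the same ambiguity the engine reports.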
Symptoms
- A division/multiplication panel shows the “many-to-many” or “duplicate series for the match group” error instead of a value.
- An expression that worked yesterday breaks after a new label (e.g., mode, cpu, device) appeared on one metric.
- group_left/group_right is missing where one side intentionally has multiple series.
- The error names a specific match group label set you can use to find the offenders.
node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes
Error: found duplicate series for the match group ...
Common Root Causes
1. One side has extra distinguishing labels
The classic case: the left and right metrics carry different label sets, so after matching on the common labels, one side still has multiple series. Inspect the labels on each side:
curl -s 'http://localhost:9090/api/v1/query' \
--data-urlencode 'query=node_cpu_seconds_total' | jq -r '.data.result[0].metric | keys | join(",")'
curl -s 'http://localhost:9090/api/v1/query' \
--data-urlencode 'query=node_load1' | jq -r '.data.result[0].metric | keys | join(",")'
__name__,cpu,instance,job,mode
__name__,instance,job
node_cpu_seconds_total has cpu and mode; node_load1 does not. With no modifier the two metrics share no identical label set, so the division returns nothing; with on(instance) it fails because the CPU metric has many series per instance.
2. Missing on() or ignoring() to restrict the match labels
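Without on()/ignoring(), Prometheus matches on all labels except the metric name. If the two metrics share most labels but differ in a few, no label sets are identical and the expression returns an empty result instead of a value. The duplicate-series error then tends to appear once on(...) is added with too few labels to tell the series apart.
# Returns nothing: matches on all labels, including the differing ones
rate(http_requests_total[5m]) / http_request_size_bytes
Tell the engine which labels to match on, e.g. on(instance, job), and make sure that label set is unique on at least one side.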
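To see which labels take part in a match under each mode, the following sketch computes the matching signature for a single series. The function name and the label values (including the handler label) are assumptions chosen for illustration.

```python
def signature(labels, on=None, ignoring=None):
    """Labels that take part in matching for one series.

    on=[...]       -> keep only the listed labels
    ignoring=[...] -> keep everything except the listed labels
    neither        -> keep every label except the metric name
    """
    labels = {k: v for k, v in labels.items() if k != "__name__"}
    if on is not None:
        return {k: labels[k] for k in on if k in labels}
    if ignoring is not None:
        return {k: v for k, v in labels.items() if k not in ignoring}
    return labels


# Hypothetical series: the request counter carries an extra handler label.
requests = {"__name__": "http_requests_total", "instance": "a", "job": "api", "handler": "/login"}
size = {"__name__": "http_request_size_bytes", "instance": "a", "job": "api"}

# Default: signatures differ, so the two series never pair up.
print(signature(requests) == signature(size))
# on(instance, job): only the shared labels are compared.
print(signature(requests, on=["instance", "job"]) == signature(size, on=["instance", "job"]))
# ignoring(handler): the differing label is left out of the comparison.
print(signature(requests, ignoring=["handler"]) == signature(size, ignoring=["handler"]))
```

Two series pair up only when their signatures are equal, so on() and ignoring() are two ways of arriving at the same comparison set.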
3. Needing group_left/group_right but not declaring it
Many-to-one is legal but must be explicit. Joining a high-cardinality metric to a low-cardinality info metric requires group_left:
# Fails without group_left: many series share the matching labels
node_cpu_seconds_total * on(instance) node_meta_info
The fix is * on(instance) group_left(role) node_meta_info, telling the engine the left side is the “many.”
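The effect of group_left(role) can be sketched as a lookup join: every series on the "many" side finds its single partner on the "one" side and copies the requested label from it. This is a simplified model with hypothetical sample values, not the engine's code.

```python
def group_left_multiply(many, one, on, copy):
    """Sketch of `many * on(<on>) group_left(<copy>) one`."""
    def key(labels):
        return tuple(labels.get(name, "") for name in on)

    # The "one" side must be unique per match group.
    one_by_key = {}
    for labels, value in one:
        k = key(labels)
        if k in one_by_key:
            raise ValueError(f"'one' side is not unique for {dict(zip(on, k))}")
        one_by_key[k] = (labels, value)

    result = []
    for labels, value in many:
        match = one_by_key.get(key(labels))
        if match is None:
            continue  # no partner on the "one" side
        one_labels, one_value = match
        # Keep the "many" side's labels (minus the metric name) and copy the extras.
        out = {k: v for k, v in labels.items() if k != "__name__"}
        for name in copy:
            if name in one_labels:
                out[name] = one_labels[name]
        result.append((out, value * one_value))
    return result


# Hypothetical sample data.
cpu = [
    ({"__name__": "node_cpu_seconds_total", "instance": "a", "cpu": "0", "mode": "idle"}, 100.0),
    ({"__name__": "node_cpu_seconds_total", "instance": "a", "cpu": "1", "mode": "idle"}, 200.0),
]
meta = [({"__name__": "node_meta_info", "instance": "a", "role": "db"}, 1.0)]

for labels, value in group_left_multiply(cpu, meta, on=["instance"], copy=["role"]):
    print(labels, value)
```

Each CPU series keeps its own labels and value (the info metric's value is 1) and gains role="db".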
4. Duplicate series from aggregation that kept a varying label
An aggregation that should collapse a label but doesn’t (because it was not listed) leaves duplicates on one side.
curl -s 'http://localhost:9090/api/v1/query' \
--data-urlencode 'query=count by (instance) (up) > 1' | jq '.data.result'
[{"metric":{"instance":"10.0.1.5:9100"},"value":[1718804000,"2"]}]
Two up series per instance (e.g., from two jobs) collide when matched on instance alone.
5. A relabeling change introduced a new label
A recently added relabel_configs label (like region on only one metric’s job) breaks a previously working one-to-one match.
git -C /etc/prometheus diff HEAD~1 -- prometheus.yml | grep -A3 'target_label'
+ - target_label: region
+ replacement: us-east-1
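Now one side has region and the other doesn’t, so the implicit all-labels match stops pairing the series, and an on(...) match can hit duplicates if the new label splits what used to be one series into several.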
6. Info-metric join without the correct grouping side
*_info metrics (kube-state-metrics, node meta) are meant to be joined many-to-one, but choosing the wrong direction (group_right vs group_left) still errors.
kube_pod_info * on(pod, namespace) group_left(node) kube_pod_status_ready
If the “many” side is actually kube_pod_status_ready, this needs group_right instead — the engine rejects the wrong grouping.
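One way to pick the direction is to count how many series each operand has per match group. The helper below is a hypothetical sketch that works on plain label dictionaries; the pod and node names are invented.

```python
from collections import Counter


def suggest_modifier(left, right, on):
    """Suggest a grouping modifier from how many series share each match group."""
    def max_per_group(side):
        counts = Counter(tuple(labels.get(name, "") for name in on) for labels in side)
        return max(counts.values(), default=0)

    left_many = max_per_group(left) > 1
    right_many = max_per_group(right) > 1
    if left_many and right_many:
        return "aggregate one side first"
    if left_many:
        return "group_left"
    if right_many:
        return "group_right"
    return "one-to-one"


# Hypothetical label sets: one info series per pod, several readiness conditions per pod.
pod_info = [
    {"namespace": "shop", "pod": "cart-0", "node": "worker-1"},
]
pod_ready = [
    {"namespace": "shop", "pod": "cart-0", "condition": "true"},
    {"namespace": "shop", "pod": "cart-0", "condition": "false"},
]

print(suggest_modifier(pod_info, pod_ready, on=["pod", "namespace"]))  # group_right
print(suggest_modifier(pod_ready, pod_info, on=["pod", "namespace"]))  # group_left
```

The modifier always names the side that has many series, so swapping the operands flips the answer.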
Diagnostic Workflow
Step 1: Read the match-group labels from the error
The error names the exact match group {...} that is non-unique. That label set tells you which labels the engine matched on and where the duplicate lives.
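If you collect these errors from logs or API responses, the match group can be pulled out programmatically. The sketch below assumes the message wording shown in the Overview and does not handle label values that contain a closing brace.

```python
import re

# Assumes the message format quoted earlier in this guide.
GROUP_RE = re.compile(r"match group \{(?P<labels>[^}]*)\} on the (?P<side>left|right) hand-side")
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_match_group(message):
    """Return (side, labels) from a duplicate-series error, or None if absent."""
    found = GROUP_RE.search(message)
    if found is None:
        return None
    labels = dict(LABEL_RE.findall(found.group("labels")))
    return found.group("side"), labels


message = (
    'found duplicate series for the match group {instance="10.0.1.5:9100"} '
    "on the right hand-side of the operation: [...]; many-to-many matching "
    "not allowed: matching labels must be unique on one side"
)
print(parse_match_group(message))
```

The returned side tells you which operand to inspect, and the label dictionary gives you a selector to paste into a follow-up query.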
Step 2: Compare the label sets of both sides
curl -s 'http://localhost:9090/api/v1/query' --data-urlencode 'query=<LEFT>' \
| jq -r '.data.result[0].metric | keys | join(",")'
curl -s 'http://localhost:9090/api/v1/query' --data-urlencode 'query=<RIGHT>' \
| jq -r '.data.result[0].metric | keys | join(",")'
Differences in the key lists explain why an implicit all-labels match fails.
Step 3: Find which side has the duplicates
curl -s 'http://localhost:9090/api/v1/query' \
--data-urlencode 'query=count by (<MATCH_LABELS>) (<SUSPECT_SIDE>) > 1' | jq '.data.result'
A non-empty result is the side that needs aggregation, on() restriction, or a grouping modifier.
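The same check can be run offline on a saved query result. This sketch mirrors count by (<MATCH_LABELS>) (...) > 1 over data shaped like .data.result; the job names in the sample are hypothetical.

```python
from collections import Counter


def duplicate_groups(result, match_labels):
    """Count series per match group and keep the groups with more than one.

    `result` has the shape of `.data.result` from /api/v1/query: a list of
    {"metric": {...}, "value": [...]} objects.
    """
    counts = Counter(
        tuple(series["metric"].get(name, "") for name in match_labels)
        for series in result
    )
    return {group: n for group, n in counts.items() if n > 1}


# Hypothetical saved result: the same instance is scraped by two jobs.
up_result = [
    {"metric": {"__name__": "up", "instance": "10.0.1.5:9100", "job": "node"}, "value": [1718804000, "1"]},
    {"metric": {"__name__": "up", "instance": "10.0.1.5:9100", "job": "node-legacy"}, "value": [1718804000, "1"]},
    {"metric": {"__name__": "up", "instance": "10.0.1.6:9100", "job": "node"}, "value": [1718804000, "1"]},
]

print(duplicate_groups(up_result, ["instance"]))         # one group with 2 series
print(duplicate_groups(up_result, ["instance", "job"]))  # empty: the match is unique
```

Adding job to the match labels makes every group unique, which suggests on(instance, job) as one possible repair for this data.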
Step 4: Decide one-to-one vs many-to-one
If both sides should yield one series per match group, aggregate away the extra labels or add on(...). If one side legitimately has many series (an info metric, a per-CPU metric), use group_left/group_right.
Step 5: Rewrite and validate
curl -s 'http://localhost:9090/api/v1/query' \
--data-urlencode 'query=<REWRITTEN_EXPR>' | jq '.data.result | length'
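A nonzero integer count confirms the match now resolves. Treat 0 with suspicion: jq evaluates the length of null as 0, so an error response (which has no data field) also prints 0. Check .status and .error in the response before trusting the count.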
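When scripting this validation, it is safer to inspect the response status than to rely on a series count alone. The sketch below takes the raw JSON body of a /api/v1/query response; the two sample bodies are hand-written for illustration.

```python
import json


def check_query_response(body):
    """Summarize a /api/v1/query response body (a JSON string)."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        return f"error: {payload.get('errorType')}: {payload.get('error')}"
    return f"ok: {len(payload['data']['result'])} series"


# Hand-written sample bodies, not captured from a real server.
ok_body = (
    '{"status":"success","data":{"resultType":"vector","result":'
    '[{"metric":{"job":"api"},"value":[1718804000,"0.15"]}]}}'
)
err_body = (
    '{"status":"error","errorType":"execution","error":'
    '"many-to-many matching not allowed: matching labels must be unique on one side"}'
)

print(check_query_response(ok_body))
print(check_query_response(err_body))
```

A failed query is reported as an error here instead of being mistaken for an empty result.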
Example Root Cause Analysis
A reliability dashboard panel rate(http_requests_total{status="500"}[5m]) / on(job) rate(http_requests_total[5m]) (error ratio) starts failing with “found duplicate series for the match group” after a deploy.
Comparing labels on each side:
curl -s 'http://localhost:9090/api/v1/query' \
--data-urlencode 'query=http_requests_total{status="500"}' \
| jq -r '.data.result[0].metric | keys | join(",")'
__name__,handler,instance,job,method,status
The numerator is filtered to status="500", but it still has one series per handler, instance, and method, and the denominator additionally carries every status value. Matched on job alone, both sides have many series per job, which is many-to-many. (Without on(job) the query would not error: each status="500" series would pair only with itself and the ratio would always be 1.)
The fix is to aggregate both sides to the same grouping (job), which leaves job as the only label to match on:
sum by (job) (rate(http_requests_total{status="500"}[5m]))
/
sum by (job) (rate(http_requests_total[5m]))
Both sides now have exactly one series per job, the match is one-to-one, and the panel renders the error ratio correctly.
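The arithmetic behind the aggregated ratio can be checked by hand. This sketch applies a sum by (job) to a few made-up per-second rates and then divides; the handler names and values are hypothetical.

```python
from collections import defaultdict


def sum_by(series, by):
    """Sum values of (labels, value) pairs, grouped by the labels in `by`."""
    totals = defaultdict(float)
    for labels, value in series:
        totals[tuple(labels.get(name, "") for name in by)] += value
    return dict(totals)


# Hypothetical request rates (requests per second) for one job.
rates = [
    ({"job": "api", "handler": "/login", "status": "200"}, 9.0),
    ({"job": "api", "handler": "/login", "status": "500"}, 1.0),
    ({"job": "api", "handler": "/cart", "status": "200"}, 8.0),
    ({"job": "api", "handler": "/cart", "status": "500"}, 2.0),
]

errors = sum_by([s for s in rates if s[0]["status"] == "500"], by=["job"])
total = sum_by(rates, by=["job"])

# After aggregation each side has one value per job, so the division is one-to-one.
ratio = {job: errors[job] / total[job] for job in errors if job in total}
print(ratio)  # 3.0 errors out of 20.0 requests per second -> 0.15
```

Because both operands collapse to a single value per job, there is exactly one pairing and no ambiguity left for the engine to reject.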
Prevention Best Practices
- Aggregate both operands to the same label set with sum by (...) before dividing; ratios should compare like-shaped vectors.
- Use explicit on(...)/ignoring(...) to state which labels the match should use, rather than relying on an implicit all-labels match that breaks when a new label appears.
- Reserve group_left/group_right for deliberate many-to-one joins (info metrics, per-CPU/per-device series) and copy in only the labels you need.
- Pin error-ratio and saturation expressions in recording rules so the matching is validated once and reused, not re-authored per dashboard.
- When a working query breaks, check recent relabel/metric changes first — a single added label is the usual trigger.
Quick Command Reference
# Compare label keys on each side
curl -s 'http://localhost:9090/api/v1/query' --data-urlencode 'query=<LEFT>' \
| jq -r '.data.result[0].metric | keys | join(",")'
curl -s 'http://localhost:9090/api/v1/query' --data-urlencode 'query=<RIGHT>' \
| jq -r '.data.result[0].metric | keys | join(",")'
# Which side has duplicates for the match labels?
curl -s 'http://localhost:9090/api/v1/query' \
--data-urlencode 'query=count by (<MATCH_LABELS>) (<SIDE>) > 1' | jq '.data.result'
# Validate a rewritten expression (expect a nonzero length; 0 can also mean an error response)
curl -s 'http://localhost:9090/api/v1/query' \
--data-urlencode 'query=<REWRITTEN>' | jq '.data.result | length'
# Common correct patterns
sum by (job) (rate(http_requests_total{status="500"}[5m]))
/ sum by (job) (rate(http_requests_total[5m]))
node_cpu_seconds_total * on(instance) group_left(role) node_meta_info
Conclusion
many-to-many matching not allowed / found duplicate series for the match group means a binary operation could not pair each left series to a unique right series. Resolve it methodically:
- Read the match group {...} labels the error reports.
- Compare the label keys on both operands.
- Find which side has duplicate series for those match labels.
- Decide whether you want one-to-one (aggregate / on()) or many-to-one (group_left/group_right).
- Rewrite and validate against the API.
The fix is in the expression, not the infrastructure — make both sides share a unique label set, or declare the many-to-one join explicitly.