Prometheus Error Guide: 'many-to-many matching not allowed'

Overview

“many-to-many matching not allowed: matching labels must be unique on one side” is a PromQL vector-matching error. When you combine two instant vectors with an arithmetic or comparison operator (/, *, +, ==, >, etc.), Prometheus pairs each sample on the left with at most one sample on the right by their label sets. For a default one-to-one match, the matching labels must identify a single series on each side. If either side has multiple series sharing the same matching labels, the operation is ambiguous and the engine refuses it. The set operators (and, or, unless) are many-to-many by design and do not raise this error.

You will see this from the API or a panel:

found duplicate series for the match group {instance="10.0.1.5:9100"} on the right hand-side of the operation: [...]; many-to-many matching not allowed: matching labels must be unique on one side

A closely related message is:

multiple matches for labels: many-to-one matching must be explicit (group_left/group_right)

It is purely an expression-authoring error — no infrastructure is broken. The same expression that works in one cluster can fail in another simply because a label that was unique there is not unique here.
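To make the matching rule concrete, here is a minimal Python sketch of one-to-one matching. It is a simplified model rather than Prometheus's actual implementation: the function name divide_one_to_one, the sample label values, and the exact error wording are illustrative assumptions that only loosely mirror the two messages quoted above.

```python
def divide_one_to_one(left, right, on):
    """Model `left / on(<on>) right` for lists of (labels, value) pairs."""

    def group(labels):
        # The match group: only the labels named in on(...).
        return tuple((name, labels.get(name, "")) for name in sorted(on))

    right_by_group = {}
    for labels, value in right:
        g = group(labels)
        if g in right_by_group:
            raise ValueError(
                f"found duplicate series for the match group {dict(g)} on the right side"
            )
        right_by_group[g] = value

    result, seen_left = {}, set()
    for labels, value in left:
        g = group(labels)
        if g not in right_by_group:
            continue  # no partner on the right: the sample is dropped
        if g in seen_left:
            raise ValueError(
                f"multiple matches for labels {dict(g)}: many-to-one must be explicit"
            )
        seen_left.add(g)
        result[g] = value / right_by_group[g]
    return result


# Hypothetical samples: the same instance scraped once, then by two jobs.
available = [({"instance": "a", "job": "node"}, 2.0)]
total_ok = [({"instance": "a", "job": "node"}, 8.0)]
total_dup = total_ok + [({"instance": "a", "job": "node-backup"}, 8.0)]

print(divide_one_to_one(available, total_ok, on=["instance"]))
try:
    divide_one_to_one(available, total_dup, on=["instance"])
except ValueError as exc:
    print("error:", exc)
```

The second call fails only because two right-hand series collapse into the same match group once job is left out of on(...).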

Symptoms

A division/multiplication panel shows the “many-to-many” or “duplicate series for the match group” error instead of a value.
An expression that worked yesterday breaks after a new label (e.g., mode, cpu, device) appeared on one metric.
group_left/group_right is missing where one side intentionally has multiple series.
The error names a specific match group label set you can use to find the offenders.

node_memory_MemAvailable_bytes / on(instance) node_memory_MemTotal_bytes

Error: found duplicate series for the match group ...

Common Root Causes

1. One side has extra distinguishing labels

The classic case: the left and right metrics carry different label sets, so after matching on the common labels, one side still has multiple series. Inspect the labels on each side:

curl -s 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=node_cpu_seconds_total' | jq -r '.data.result[0].metric | keys | join(",")'
curl -s 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=node_load1' | jq -r '.data.result[0].metric | keys | join(",")'

__name__,cpu,instance,job,mode
__name__,instance,job

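node_cpu_seconds_total has cpu and mode; node_load1 does not. With the default all-labels match nothing pairs and the result is empty. Once the match is restricted to the shared labels with on(instance, job), a one-to-one division fails because, per instance, the CPU metric has many series.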

2. Missing on() or ignoring() to restrict the match labels

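Without on()/ignoring(), Prometheus matches on all labels except the metric name. If the two metrics share most labels but differ in a few, no series pair up and the expression silently returns nothing. An on() or ignoring() list that is then too narrow leaves duplicates in a match group and triggers the error.

# Returns no data: matches on all labels, including the differing ones
rate(http_requests_total[5m]) / http_request_size_bytes

You must tell the engine which labels to match on, e.g. on(instance, job), and for a one-to-one match that list must still identify a single series on each side.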
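How on() and ignoring() change which labels take part in the match can be sketched in Python. This is a simplified model under stated assumptions: the helper name signature and the sample label values are hypothetical, and the metric name is always excluded from the comparison.

```python
def signature(labels, on=None, ignoring=None):
    """Return the label pairs used to match a series, mimicking on()/ignoring()."""
    if on is not None:
        names = [n for n in labels if n in on]
    else:
        skipped = set(ignoring or []) | {"__name__"}
        names = [n for n in labels if n not in skipped]
    return tuple(sorted((n, labels[n]) for n in names))


# Hypothetical series: the left metric carries one extra label (status).
requests_series = {
    "__name__": "http_requests_total",
    "instance": "a",
    "job": "api",
    "status": "200",
}
size_series = {"__name__": "http_request_size_bytes", "instance": "a", "job": "api"}

# Default match on all labels: the signatures differ, so the series do not pair.
print(signature(requests_series) == signature(size_series))

# on(instance, job): only the shared labels are compared, so they pair.
shared = ["instance", "job"]
print(signature(requests_series, on=shared) == signature(size_series, on=shared))

# ignoring(status): the same effect, expressed by naming the label to drop.
print(
    signature(requests_series, ignoring=["status"])
    == signature(size_series, ignoring=["status"])
)
```

Narrowing the match makes the two series pair, but it also means every left series that differs only in status now lands in the same match group, which is where duplicates come from.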

3. Needing group_left/group_right but not declaring it

Many-to-one is legal but must be explicit. Joining a high-cardinality metric to a low-cardinality info metric requires group_left:

# Fails without group_left: many series share the matching labels
node_cpu_seconds_total * on(instance) node_meta_info

The fix is * on(instance) group_left(role) node_meta_info, telling the engine the left side is the “many.”
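A many-to-one join with group_left can be modelled in a few lines of Python. This is a sketch under assumptions, not the engine's code: multiply_group_left is a hypothetical helper, and the role value "db" is made up for illustration.

```python
def multiply_group_left(many, one, on, include):
    """Model `many * on(<on>) group_left(<include>) one`."""

    def group(labels):
        return tuple((n, labels.get(n, "")) for n in sorted(on))

    one_by_group = {}
    for labels, value in one:
        g = group(labels)
        if g in one_by_group:
            # Even with group_left, the "one" side must be unique per match group.
            raise ValueError(f"duplicate series on the 'one' side for {dict(g)}")
        one_by_group[g] = (labels, value)

    result = []
    for labels, value in many:
        partner = one_by_group.get(group(labels))
        if partner is None:
            continue  # no info series for this match group
        one_labels, one_value = partner
        # Keep the "many" side's labels and drop the metric name.
        out = {n: v for n, v in labels.items() if n != "__name__"}
        for name in include:
            if name in one_labels:
                out[name] = one_labels[name]  # copy the requested label across
        result.append((out, value * one_value))
    return result


cpu = [
    ({"__name__": "node_cpu_seconds_total", "instance": "a", "cpu": "0", "mode": "idle"}, 10.0),
    ({"__name__": "node_cpu_seconds_total", "instance": "a", "cpu": "1", "mode": "idle"}, 12.0),
]
meta = [({"__name__": "node_meta_info", "instance": "a", "role": "db"}, 1.0)]

for labels, value in multiply_group_left(cpu, meta, on=["instance"], include=["role"]):
    print(labels, value)
```

Each per-CPU series survives with its own labels plus the copied role label, which is why the info metric is conventionally given the value 1.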

4. Duplicate series from aggregation that kept a varying label

An aggregation that should collapse a label but doesn’t (because the label stayed in the by (...) list or was left out of the without (...) list) leaves duplicates on one side.

curl -s 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=count by (instance) (up) > 1' | jq '.data.result'

[{"metric":{"instance":"10.0.1.5:9100"},"value":[1718804000,"2"]}]

Two up series per instance (e.g., from two jobs) collide when matched on instance alone.

5. A relabeling change introduced a new label

A recently added relabel_configs label (like region on only one metric’s job) breaks a previously working one-to-one match.

git -C /etc/prometheus diff HEAD~1 -- prometheus.yml | grep -A3 'target_label'

+      - target_label: region
+        replacement: us-east-1

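Now one side has region and the other doesn’t, so the implicit all-labels match no longer pairs the series at all, and a match narrowed with on(...) or ignoring(...) can expose duplicates that the full label set used to keep apart.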

6. Info-metric join without the correct grouping side

*_info metrics (kube-state-metrics, node meta) are meant to be joined many-to-one, but choosing the wrong direction (group_right vs group_left) still errors.

kube_pod_info * on(pod, namespace) group_left(node) kube_pod_status_ready

If the “many” side is actually kube_pod_status_ready, this needs group_right instead — the engine rejects the wrong grouping.

Diagnostic Workflow

Step 1: Read the match-group labels from the error

The error names the exact match group {...} that is non-unique. That label set tells you which labels the engine matched on and where the duplicate lives.
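Pulling the match group out of the message can be scripted. The following is a minimal Python sketch that assumes the error text has the shape quoted in the Overview; the function name parse_match_group is hypothetical, and the simple pattern would not cope with a label value that itself contains a closing brace.

```python
import re

ERROR_RE = re.compile(
    r"found duplicate series for the match group \{(?P<labels>[^}]*)\} "
    r"on the (?P<side>left|right) hand-side"
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_match_group(message):
    """Return (side, labels) from a duplicate-series error, or None if it does not match."""
    m = ERROR_RE.search(message)
    if m is None:
        return None
    return m.group("side"), dict(LABEL_RE.findall(m.group("labels")))


message = (
    'found duplicate series for the match group {instance="10.0.1.5:9100"} '
    "on the right hand-side of the operation: [...]; many-to-many matching not allowed: "
    "matching labels must be unique on one side"
)
print(parse_match_group(message))
```

The returned side says which operand to inspect first, and the label names are the match labels to reuse in the duplicate check in Step 3.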

Step 2: Compare the label sets of both sides

curl -s 'http://localhost:9090/api/v1/query' --data-urlencode 'query=<LEFT>' \
  | jq -r '.data.result[0].metric | keys | join(",")'
curl -s 'http://localhost:9090/api/v1/query' --data-urlencode 'query=<RIGHT>' \
  | jq -r '.data.result[0].metric | keys | join(",")'

Differences in the key lists explain why an implicit all-labels match fails.
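The jq filter above looks only at the first series of each result. As a complement, here is a small Python sketch that takes the union of label names across every series on each side. It assumes the data.result list shape returned by /api/v1/query; the helper names and sample series are illustrative.

```python
def label_keys(result):
    """Union of label names across every series in an API `data.result` list."""
    keys = set()
    for series in result:
        keys.update(series["metric"])
    return keys


def compare_sides(left_result, right_result):
    """Report which label names exist on only one side, ignoring the metric name."""
    left = label_keys(left_result) - {"__name__"}
    right = label_keys(right_result) - {"__name__"}
    return {
        "only_left": sorted(left - right),
        "only_right": sorted(right - left),
        "shared": sorted(left & right),
    }


left = [
    {
        "metric": {
            "__name__": "node_cpu_seconds_total",
            "cpu": "0",
            "instance": "a",
            "job": "node",
            "mode": "idle",
        },
        "value": [0, "1"],
    }
]
right = [
    {"metric": {"__name__": "node_load1", "instance": "a", "job": "node"}, "value": [0, "0.5"]}
]
print(compare_sides(left, right))
```

The only_left and only_right lists are the labels to aggregate away or ignore; the shared list is the natural candidate for on(...).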

Step 3: Find which side has the duplicates

curl -s 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=count by (<MATCH_LABELS>) (<SUSPECT_SIDE>) > 1' | jq '.data.result'

A non-empty result marks the side with duplicates, which needs aggregation, a wider on() list, or a grouping modifier.
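The count by (...) > 1 check can also be run offline against a saved API response. A minimal Python sketch, assuming the data.result list shape; duplicate_groups is a hypothetical helper and the job names are made up.

```python
from collections import Counter


def duplicate_groups(result, match_labels):
    """Return match groups that contain more than one series, with their counts."""
    counts = Counter(
        tuple((name, series["metric"].get(name, "")) for name in match_labels)
        for series in result
    )
    return {group: n for group, n in counts.items() if n > 1}


up = [
    {"metric": {"__name__": "up", "instance": "10.0.1.5:9100", "job": "node"}, "value": [0, "1"]},
    {"metric": {"__name__": "up", "instance": "10.0.1.5:9100", "job": "node-legacy"}, "value": [0, "1"]},
    {"metric": {"__name__": "up", "instance": "10.0.1.6:9100", "job": "node"}, "value": [0, "1"]},
]
print(duplicate_groups(up, ["instance"]))
```

An empty dictionary means that side is already unique for the chosen match labels.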

Step 4: Decide one-to-one vs many-to-one

If both sides should yield one series per match group, aggregate away the extra labels or add on(...). If one side legitimately has many series (an info metric, a per-CPU metric), use group_left/group_right.
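The decision can be reduced to counting series per match group on each side. The sketch below is a deliberately conservative heuristic, not a rule from the Prometheus documentation: classify_match is a hypothetical helper, and the pod labels are sample values.

```python
from collections import Counter


def classify_match(left, right, on):
    """Suggest a matching mode from the per-group series counts on each side."""

    def counts(series_list):
        return Counter(tuple(labels.get(n, "") for n in on) for labels in series_list)

    left_many = any(n > 1 for n in counts(left).values())
    right_many = any(n > 1 for n in counts(right).values())
    if left_many and right_many:
        return "many-to-many: aggregate at least one side first"
    if left_many:
        return "group_left"
    if right_many:
        return "group_right"
    return "one-to-one"


# One info series per pod versus one readiness series per condition value.
pod_info = [{"namespace": "default", "pod": "web-0", "node": "n1"}]
pod_ready = [
    {"namespace": "default", "pod": "web-0", "condition": c}
    for c in ("true", "false", "unknown")
]
print(classify_match(pod_info, pod_ready, on=["namespace", "pod"]))
```

Here the right operand is the "many" side, so the join needs group_right; swapping the operands would call for group_left instead.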

Step 5: Rewrite and validate

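curl -s 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=<REWRITTEN_EXPR>' \
  | jq 'if .status == "success" then (.data.result | length) else .error end'

A non-zero integer confirms the match is now unique and actually pairs series. An error string means the match is still ambiguous, and 0 means nothing matched. Checking .status matters because jq reports the length of a missing result as 0, which would otherwise hide an error.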
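The same validation can be scripted. A minimal Python sketch that separates a successful response from an error response, assuming the standard /api/v1/query JSON shape (status, data.result, error); the function name and both sample bodies are illustrative.

```python
import json


def summarize_query_response(body):
    """Return ('ok', series_count) or ('error', message) for a /api/v1/query JSON body."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        return "error", payload.get("error", "unknown error")
    return "ok", len(payload["data"]["result"])


ok_body = (
    '{"status":"success","data":{"resultType":"vector","result":'
    '[{"metric":{"job":"api"},"value":[1718804000,"0.02"]}]}}'
)
err_body = (
    '{"status":"error","errorType":"execution","error":'
    '"many-to-many matching not allowed: matching labels must be unique on one side"}'
)
print(summarize_query_response(ok_body))
print(summarize_query_response(err_body))
```

Treating ("ok", 0) as a failure as well is a reasonable choice, since an empty result usually means the match labels no longer pair anything.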

Example Root Cause Analysis

A reliability dashboard panel rate(http_requests_total{status="500"}[5m]) / ignoring(status) rate(http_requests_total[5m]) (error ratio) starts failing with “found duplicate series for the match group” after a deploy.

Comparing labels on each side:

curl -s 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=http_requests_total{status="500"}' \
  | jq -r '.data.result[0].metric | keys | join(",")'

__name__,handler,instance,job,method,status

The numerator is filtered to status="500", but the denominator still carries every status value. With status ignored, the denominator has several series for each handler/instance/job/method match group, so the right-hand side is not unique. (Without ignoring(status) the expression would not error at all: each numerator series would pair with the identical status="500" series in the denominator and the ratio would always be 1.)

The fix is to aggregate both sides to the same grouping (job), so that the default match is already one-to-one and no matching modifier is needed:

sum by (job) (rate(http_requests_total{status="500"}[5m]))
/
sum by (job) (rate(http_requests_total[5m]))

Both sides now have exactly one series per job, the match is one-to-one, and the panel renders the error ratio correctly.
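The arithmetic behind the aggregated ratio can be checked by hand. The Python sketch below models sum by (job) on both operands; the per-second rates, handler names, and the helper sum_by are made-up illustrations, not data from the incident.

```python
from collections import defaultdict


def sum_by(series, by):
    """Model `sum by (<by>) (...)` over a list of (labels, value) pairs."""
    totals = defaultdict(float)
    for labels, value in series:
        totals[tuple(labels.get(n, "") for n in by)] += value
    return dict(totals)


# Hypothetical per-second request rates.
rates = [
    ({"job": "api", "handler": "/a", "status": "200"}, 9.0),
    ({"job": "api", "handler": "/a", "status": "500"}, 1.0),
    ({"job": "api", "handler": "/b", "status": "200"}, 8.0),
    ({"job": "api", "handler": "/b", "status": "500"}, 2.0),
]

errors = sum_by([s for s in rates if s[0]["status"] == "500"], by=["job"])
total = sum_by(rates, by=["job"])

# After aggregation each side has one entry per job, so the division is one-to-one.
ratio = {job: errors[job] / total[job] for job in errors if job in total}
print(ratio)
```

With these numbers the api job has 3 errors per second out of 20 requests per second, an error ratio of 0.15.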

Prevention Best Practices

Aggregate both operands to the same label set with sum by (...) before dividing; ratios should compare like-shaped vectors.
Use explicit on(...)/ignoring(...) to state which labels the match should use, rather than relying on an implicit all-labels match that breaks when a new label appears.
Reserve group_left/group_right for deliberate many-to-one joins (info metrics, per-CPU/per-device series) and copy in only the labels you need.
Pin error-ratio and saturation expressions in recording rules so the matching is validated once and reused, not re-authored per dashboard.
When a working query breaks, check recent relabel/metric changes first — a single added label is the usual trigger.

Quick Command Reference

# Compare label keys on each side
curl -s 'http://localhost:9090/api/v1/query' --data-urlencode 'query=<LEFT>' \
  | jq -r '.data.result[0].metric | keys | join(",")'
curl -s 'http://localhost:9090/api/v1/query' --data-urlencode 'query=<RIGHT>' \
  | jq -r '.data.result[0].metric | keys | join(",")'

# Which side has duplicates for the match labels?
curl -s 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=count by (<MATCH_LABELS>) (<SIDE>) > 1' | jq '.data.result'

# Validate a rewritten expression (expect a non-zero count, not an error string)
curl -s 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=<REWRITTEN>' \
  | jq 'if .status == "success" then (.data.result | length) else .error end'

# Common correct patterns
sum by (job) (rate(http_requests_total{status="500"}[5m]))
  / sum by (job) (rate(http_requests_total[5m]))

node_cpu_seconds_total * on(instance) group_left(role) node_meta_info

Conclusion

many-to-many matching not allowed / found duplicate series for the match group means a binary operation could not pair each left series to a unique right series. Resolve it methodically:

1. Read the match group {...} labels the error reports.
2. Compare the label keys on both operands.
3. Find which side has duplicate series for those match labels.
4. Decide whether you want one-to-one (aggregate / on()) or many-to-one (group_left/group_right).
5. Rewrite and validate against the API.

The fix is in the expression, not the infrastructure — make both sides share a unique label set, or declare the many-to-one join explicitly.
