Prometheus Error Guide: 'out of order sample' Duplicate Sample Ingestion
Fix Prometheus 'out of order sample' and 'duplicate sample for timestamp' ingestion errors: diagnose clock skew, duplicate targets, label collisions, and OOO window settings.
Overview
out of order sample and duplicate sample for timestamp are TSDB ingestion errors. Prometheus’s storage engine requires that, for a given series (a unique set of labels), samples arrive with strictly increasing timestamps. When a sample arrives with a timestamp older than the last one stored for that series, you get out of order sample. When a sample arrives with the same timestamp but a different value, you get duplicate sample for timestamp.
You will see these in the Prometheus log:
ts=2026-06-23T14:08:02.114Z caller=scrape.go:1654 level=warn component="scrape manager" scrape_pool=app msg="Error on ingesting samples that are too old or are too far into the future" num_dropped=312
The TSDB-level error appears as follows, with err set to one of two values:
ts=2026-06-23T14:08:02.118Z caller=append.go:91 level=warn msg="Error on ingesting out-of-order samples" num_dropped=312 err="out of order sample"
err="duplicate sample for timestamp"
These are append-path errors: the scrape or remote-write succeeds at the HTTP level, but individual samples are silently dropped at ingestion. Data goes missing without the target ever showing down.
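The append rule described above can be modelled in a few lines of shell. This is an illustrative sketch, not Prometheus's own code: `classify_samples` is a hypothetical name, the input is assumed to be the samples of one series as "timestamp value" pairs in arrival order, and the out-of-order window is assumed to be 0s. A repeat of the newest timestamp with an unchanged value is treated as a no-op, in line with the "same timestamp but a different value" condition for duplicates.

```shell
# Hypothetical helper: model the per-series append rule with the OOO window at 0s.
# stdin: "timestamp value" pairs for ONE series, in arrival order.
classify_samples() {
  awk '
    NR == 1 {                        # the first sample starts the series
      last_ts = $1 + 0; last_val = $2 + 0
      print $1, $2, "ok"; next
    }
    $1 + 0 < last_ts {               # older than the newest stored sample
      print $1, $2, "out_of_order"; next
    }
    $1 + 0 == last_ts {              # same timestamp: the value decides
      print $1, $2, ($2 + 0 == last_val ? "noop" : "duplicate"); next
    }
    {                                # strictly newer: accept and advance
      last_ts = $1 + 0; last_val = $2 + 0
      print $1, $2, "ok"
    }
  '
}

# Usage:
#   printf '1000 1\n2000 2\n1500 3\n2000 5\n' | classify_samples
```

Feeding it the four samples in the usage line marks the third as out_of_order (1500 is older than 2000) and the fourth as duplicate (same timestamp, different value).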
Symptoms
- Log warnings about “out of order” or “too old or too far into the future” samples with a non-zero num_dropped.
- Gaps or flat lines in graphs even though the target is up.
- prometheus_target_scrapes_sample_out_of_order_total or prometheus_tsdb_out_of_order_samples_total increasing.
- Two scrape jobs or two remote writers pushing the “same” series.
rate(prometheus_target_scrapes_sample_out_of_order_total[5m]) > 0
rate(prometheus_target_scrapes_sample_duplicate_timestamp_total[5m]) > 0
{instance="localhost:9090"} 4.27
Common Root Causes
1. Two targets exposing the same series
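The most common cause: two scrape configs (or two endpoints) produce identical label sets, so Prometheus sees the same series twice per interval and rejects the second as a duplicate or out-of-order sample. Colliding samples share a single label set, so a query cannot show them side by side. As a first pass, count the metric name/job pairs that expose more than one series, then check the targets behind those jobs: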
curl -s http://localhost:9090/api/v1/query \
--data-urlencode 'query=count by (__name__, job) ({__name__=~".+"}) > 1' | jq '.data.result | length'
A frequent trigger is the same exporter scraped under two job names that get relabeled to the same value, or two replicas behind one Service IP.
2. Clock skew on a target or on Prometheus
Samples are timestamped at scrape time using Prometheus’s clock, but for federation, remote-write, or exporters that emit their own timestamps, a skewed clock produces out-of-order or future samples.
chronyc tracking | grep -E 'System time|Last offset'
System time : 3.412009800 seconds slow of NTP time
Last offset : -3.401221 seconds
A multi-second offset between a remote-write source and the receiving Prometheus reliably triggers out of order sample.
3. An exporter exposing explicit (and stale or repeated) timestamps
Most exporters let Prometheus assign the timestamp, but some emit their own (e.g., Pushgateway-style or batch jobs). If the same timestamp is re-exposed across scrapes, ingestion rejects it.
curl -s http://10.0.5.9:9091/metrics | grep -E '^[a-z].* [0-9]{13}$' | head
batch_last_success_seconds 1.718804e+09 1718804000000
A trailing 13-digit value is an explicit millisecond timestamp; if it does not advance between scrapes, you get duplicates.
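To check whether an explicit timestamp actually advances, two saved scrape payloads can be compared. The sketch below is an assumption-laden illustration: `stale_timestamps` is a hypothetical name, the file paths in the usage lines are placeholders, label values are assumed to contain no whitespace, and only trailing 13-digit values are treated as timestamps (the same heuristic as the grep above).

```shell
# Hypothetical helper: list series whose explicit timestamp did not advance
# between two saved scrape payloads ($1 = earlier scrape, $2 = later scrape).
# Assumes label values contain no whitespace and timestamps are 13-digit ms.
stale_timestamps() {
  awk '
    /^#/ || NF < 3 { next }                                # comments, no timestamp
    length($NF) != 13 || $NF !~ /^[0-9]+$/ { next }        # not an explicit ms timestamp
    {
      key = $0
      sub(/[ \t]+[^ \t]+[ \t]+[^ \t]+$/, "", key)          # strip value and timestamp
      if (FILENAME == ARGV[1]) { first[key] = $NF; next }  # remember the earlier scrape
      if ((key in first) && $NF + 0 <= first[key] + 0) { print key, first[key], $NF }
    }
  ' "$1" "$2"
}

# Usage (paths are placeholders; wait at least one scrape interval in between):
#   curl -s <SCRAPE_URL> > /tmp/scrape1.txt; sleep 15
#   curl -s <SCRAPE_URL> > /tmp/scrape2.txt
#   stale_timestamps /tmp/scrape1.txt /tmp/scrape2.txt
```

Each output line is a series followed by its earlier and later timestamps. Any series listed is re-exposing a timestamp that has not moved forward.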
4. Duplicate label sets from relabeling
A metric_relabel_configs or relabel_configs rule that drops a distinguishing label (like instance or pod) can collapse two distinct series into one, which then collide.
grep -nA6 'metric_relabel_configs' /etc/prometheus/prometheus.yml
metric_relabel_configs:
- action: labeldrop
regex: instance
Dropping instance across a multi-replica job merges the replicas’ series and produces duplicate timestamps.
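The collapse can be previewed offline before a relabel rule ships. The following is a rough sketch, assuming series are written one per line in selector form (name{label="value",...}) and that label values do not themselves contain the text `,<label>="`. `collapsed_after_labeldrop` is a hypothetical helper, not a Prometheus tool.

```shell
# Hypothetical helper: show which series become identical once a label is dropped.
# $1 = label name to drop; stdin: one series per line, e.g. up{instance="a",job="app"}
collapsed_after_labeldrop() {
  # Remove the label (and the comma after it), tidy a trailing comma,
  # then print every label set that now occurs more than once.
  sed -E "s/(\{|,)$1=\"[^\"]*\",?/\1/; s/,\}/}/" | sort | uniq -d
}

# Usage:
#   printf 'up{instance="a:9100",job="app"}\nup{instance="b:9100",job="app"}\n' \
#     | collapsed_after_labeldrop instance
```

With the two replicas in the usage line, the output is the single merged series up{job="app"}, which is exactly the series that would receive colliding samples.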
5. Remote-write senders racing each other
Two Prometheus servers (or two agents) remote-writing the same series to one receiver will interleave samples and trip out-of-order rejection at the receiver unless out-of-order ingestion is enabled.
rate(prometheus_remote_storage_samples_failed_total[5m]) > 0
{remote_name="central", url="https://central:9090/api/v1/write"} 6.10
Failed remote-write samples climbing alongside out-of-order errors on the receiver points at overlapping senders.
6. Out-of-order ingestion window too small
Modern Prometheus (2.39+) supports a configurable out-of-order time window. If late samples (common with remote-write from edge agents) arrive beyond that window, they are dropped.
grep -nE 'out_of_order_time_window' /etc/prometheus/prometheus.yml
# (no match -> default 0s, OOO disabled)
With the window at the default 0s, any late-arriving sample is rejected as out of order.
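As a rough mental model, a late sample is kept only if its timestamp is no older than the TSDB's newest timestamp minus the window. The sketch below encodes that comparison. `within_ooo_window` is a hypothetical helper, the timestamps are made-up milliseconds, and the check only makes sense for samples that are already older than the newest one.

```shell
# Hypothetical helper: would a late sample fall inside the out-of-order window?
# $1 = sample timestamp (ms), $2 = TSDB max time (ms), $3 = window (seconds)
within_ooo_window() {
  [ "$1" -ge "$(( $2 - $3 * 1000 ))" ]
}

# Usage: a sample 1000s behind the newest data, checked against 30m and 0s windows.
#   within_ooo_window 1718804000000 1718805000000 1800 && echo accepted
#   within_ooo_window 1718804000000 1718805000000 0    || echo dropped
```

The same sample that a 30m window accepts is dropped at 0s, which is why edge agents with buffered or delayed remote-write need a non-zero window on the receiver.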
Diagnostic Workflow
Step 1: Identify which error and how many samples are dropped
journalctl -u prometheus --no-pager | grep -iE 'out of order|duplicate sample|too old' | tail -20
Note whether it is out of order sample, duplicate sample for timestamp, or too old/too far into the future — they have different fixes.
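To separate the three cases at a glance, the matching log lines can be tallied by kind. This is a minimal sketch: `tally_ingest_errors` is a hypothetical helper, the category names are invented for the output, and it assumes each warning carries a num_dropped=N field as in the log lines shown earlier.

```shell
# Hypothetical helper: tally ingestion warnings by kind and sum num_dropped.
# stdin: Prometheus log lines (for example from journalctl).
tally_ingest_errors() {
  awk '
    function tally(kind) {
      lines[kind]++
      # "num_dropped=" is 12 characters long; the digits follow it
      if (match($0, /num_dropped=[0-9]+/)) { dropped[kind] += substr($0, RSTART + 12, RLENGTH - 12) }
    }
    /duplicate sample for timestamp/         { tally("duplicate"); next }
    /out of order sample/                    { tally("out_of_order"); next }
    /too old or are too far into the future/ { tally("out_of_bounds"); next }
    END {
      n = split("out_of_order duplicate out_of_bounds", kinds, " ")
      for (i = 1; i <= n; i++) { printf "%s lines=%d dropped=%d\n", kinds[i], lines[kinds[i]], dropped[kinds[i]] }
    }
  '
}

# Usage:
#   journalctl -u prometheus --no-pager | tally_ingest_errors
```

The output is one line per kind with the number of warnings and the total samples dropped, so the dominant failure mode stands out before you start on the later steps.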
Step 2: Find the colliding series
curl -s 'http://localhost:9090/api/v1/query' \
--data-urlencode 'query=count by (__name__) ({__name__=~".+"}) > 1' \
| jq -r '.data.result[] | "\(.metric.__name__)\t\(.value[1])"' | head
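This lists metric names that have more than one series. A count above 1 is normal for any multi-target job, so treat the output as a list of candidates and confirm the collision against the targets in Step 3.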
Step 3: Look for duplicate targets producing the same labels
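curl -s http://localhost:9090/api/v1/targets \
| jq -r '.data.activeTargets[] | [.labels.job, .labels.instance] | @tsv' \
| sort | uniq -d
Any job/instance pair printed here is shared by more than one active target. Look up the scrape URLs behind that pair; those targets collide. (Including scrapeUrl in the row would hide this case, because uniq -d only reports rows that are identical in every column.)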
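To see the scrape URLs behind each shared pair in one pass, the targets can be grouped by job/instance. A minimal sketch, assuming the input is tab-separated job, instance, scrapeUrl rows such as the jq `@tsv` output of the targets API. `colliding_targets` is a hypothetical helper name.

```shell
# Hypothetical helper: print job/instance pairs served by more than one scrape URL.
# stdin: TSV rows of job, instance, scrapeUrl (one active target per row).
colliding_targets() {
  awk -F '\t' '
    {
      pair = $1 "\t" $2
      if (!((pair, $3) in seen)) {            # count each URL once per pair
        seen[pair, $3] = 1
        count[pair]++
        urls[pair] = (urls[pair] == "" ? "" : urls[pair] ",") $3
      }
    }
    END { for (p in count) { if (count[p] > 1) { print p "\t" urls[p] } } }
  ' | sort
}

# Usage:
#   curl -s http://localhost:9090/api/v1/targets \
#     | jq -r '.data.activeTargets[] | [.labels.job, .labels.instance, .scrapeUrl] | @tsv' \
#     | colliding_targets
```

Each output row is a job, an instance, and the comma-separated scrape URLs that resolve to that same label pair.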
Step 4: Check clocks if remote-write or explicit timestamps are involved
chronyc tracking
date -u
Any multi-second offset between sender and receiver explains out-of-order timestamps.
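The offset check can be scripted so it is easy to run on both sender and receiver. This sketch assumes the "System time" line format shown in the chronyc output earlier; `check_skew` is a hypothetical helper, and the 1-second threshold in the usage line is only an example value.

```shell
# Hypothetical helper: flag a chrony offset above a threshold.
# $1 = threshold in seconds; stdin: output of "chronyc tracking".
check_skew() {
  awk -v max="$1" '
    /^System time/ {
      state = ($4 + 0 > max + 0) ? "skewed" : "ok"
      print state, $4, $6            # state, offset in seconds, fast or slow
    }
  '
}

# Usage:
#   chronyc tracking | check_skew 1
```

For the 3.4-second offset shown earlier, this prints a line beginning with "skewed", followed by the offset and its direction.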
Step 5: Inspect relabeling for dropped distinguishing labels
grep -nB2 -A8 -E 'relabel_configs|metric_relabel_configs' /etc/prometheus/prometheus.yml
Look for labeldrop/labelkeep rules that remove instance, pod, or replica.
Example Root Cause Analysis
A platform team adds a second Prometheus replica for HA, both remote-writing to a central long-term store. The central store’s log fills with out of order sample and dashboards there show jagged gaps.
The central receiver’s metrics confirm the rejection:
rate(prometheus_tsdb_out_of_order_samples_total[5m])
{job="central"} 812.4
Both replicas scrape the same targets and produce identical series, then remote-write them to the same endpoint. Their scrapes are not synchronized, so samples for each series arrive interleaved and out of order.
The fix is to enable an out-of-order ingestion window on the central receiver so late samples from the trailing replica are accepted:
storage:
tsdb:
out_of_order_time_window: 30m
After reload, prometheus_tsdb_out_of_order_samples_total drops to zero and both replicas’ samples are accepted into the same series, giving true HA without gaps. (Alternatively, set a distinct replica value under external_labels on each sender, for example replica: A and replica: B, and deduplicate downstream.)
Prevention Best Practices
- Ensure each series is produced by exactly one source: avoid scraping the same endpoint under two jobs, and don’t drop the instance/pod label that distinguishes replicas.
- Run NTP/chrony on every host and on Prometheus; alert on offsets over ~1s, which is enough to cause out-of-order samples in remote-write topologies.
- Let Prometheus assign timestamps; only use explicit timestamps for true batch/push workloads, and make sure they always advance.
- For HA pairs and edge agents that remote-write, set a sensible out_of_order_time_window (e.g., 30m) on the receiver, or use external_labels replica deduplication.
- Alert on prometheus_target_scrapes_sample_out_of_order_total and prometheus_tsdb_out_of_order_samples_total so silent sample drops surface immediately.
Quick Command Reference
# Which ingestion error and how many dropped?
journalctl -u prometheus --no-pager | grep -iE 'out of order|duplicate sample|too old' | tail -20
# Find duplicate active targets
curl -s http://localhost:9090/api/v1/targets \
| jq -r '.data.activeTargets[] | [.labels.job,.labels.instance] | @tsv' | sort | uniq -d
# Clock offset
chronyc tracking | grep -E 'System time|Last offset'
# Relabel rules that may collapse series
grep -nB2 -A8 -E 'relabel_configs|metric_relabel_configs' /etc/prometheus/prometheus.yml
# Explicit timestamps in a target's payload (13-digit trailing value)
curl -s <SCRAPE_URL> | grep -E ' [0-9]{13}$' | head
# Ingestion rejection rates
rate(prometheus_target_scrapes_sample_out_of_order_total[5m])
rate(prometheus_target_scrapes_sample_duplicate_timestamp_total[5m])
rate(prometheus_tsdb_out_of_order_samples_total[5m])
Conclusion
out of order sample and duplicate sample for timestamp mean samples were rejected at the TSDB append path while the scrape itself succeeded — data quietly goes missing. Diagnose in order:
- Read the log to tell out of order from duplicate from too old/too far into future.
- Find the colliding series with count by (__name__) (...) > 1.
- Hunt duplicate targets producing identical job/instance labels.
- Check clock skew where remote-write or explicit timestamps are in play.
- Review relabeling for rules that drop distinguishing labels.
Most cases reduce to “two sources, one series.” Eliminate the duplicate source, fix the clock, or set an out-of-order window for legitimate late data.