Redis Error Guide: Replica Stuck in Repeated Full Resync — Partial Resync Failing
Fix a Redis replica looping on full resync: diagnose small repl-backlog-size, replication ID mismatch, output buffer kills, and rising sync_full.
Overview
Redis replication is designed so a briefly-disconnected replica can reattach with a partial resync — the master replays only the commands the replica missed from its in-memory replication backlog. A full resync is the heavyweight fallback: the master forks, produces a full RDB, ships it, and the replica reloads the entire dataset. When partial resync keeps failing, the replica falls into a loop of repeated full resyncs — each one forking the master, saturating the network, and never “catching up.” The sync_full counter in INFO stats climbs steadily, which is the clearest signal.
There is no single error string; the pattern shows up in logs:
# Master log
Partial resynchronization not accepted: Replication ID mismatch (Replica asked for '3f9c...', my replication IDs are 'a1b2...' and '0000...')
Starting BGSAVE for SYNC with target: disk
Background saving started by pid 8123
# Replica log (repeating)
Full resync from master: a1b2...:14311988
MASTER <-> REPLICA sync: Loading DB in memory
Connecting to MASTER 10.0.0.5:6379 # ...and it starts over
The core issue is that partial resync is being refused — usually because the backlog is too small to cover the disconnect, or the master’s replication ID changed.
Symptoms
- INFO stats: sync_full increments repeatedly; sync_partial_ok stays flat while sync_partial_err grows.
- Master CPU/fork and network spike on a cycle; the replica never reaches a stable master_link_status:up for long.
- Master log repeatedly shows “Partial resynchronization not accepted” and “Starting BGSAVE for SYNC”.
- Replica log loops through “Full resync … Loading DB … Connecting to MASTER”.
redis-cli -h <master> INFO stats | grep -E 'sync_full|sync_partial'
sync_full:47
sync_partial_ok:0
sync_partial_err:46
Common Root Causes
1. repl-backlog-size too small for the disconnect window
If the replica is offline longer than the backlog can hold (or write volume is high), the needed offset ages out and only a full resync is possible.
redis-cli -h <master> CONFIG GET repl-backlog-size
redis-cli -h <master> INFO replication | grep -E 'repl_backlog_active|repl_backlog_histlen|master_repl_offset'
repl-backlog-size 1mb # too small for a busy master
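As a rough sizing aid, the backlog has to hold at least the bytes the master writes to the replication stream during the longest disconnect you want to survive. The sketch below estimates that from two master_repl_offset samples. The function name and the 2x headroom factor are assumptions made for illustration, not Redis rules.

```shell
# backlog_bytes_needed OFFSET_START OFFSET_END SAMPLE_SECONDS WINDOW_SECONDS
# OFFSET_START/OFFSET_END: two master_repl_offset readings taken
# SAMPLE_SECONDS apart. WINDOW_SECONDS: disconnect window to cover.
# Prints a suggested repl-backlog-size in bytes (2x headroom is an assumption).
backlog_bytes_needed() (
    start=$1; end=$2; sample=$3; window=$4
    rate=$(( (end - start) / sample ))   # replication-stream bytes per second
    echo $(( rate * window * 2 ))
)

# Example: the offset grew by 60,000,000 bytes in 60 s; cover a 30 s disconnect.
# backlog_bytes_needed 1000000 61000000 60 30
```

Sample during peak write traffic; a quiet-hour sample will understate the size you need.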
2. Replica output buffer limit killing the transfer mid-sync
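While a full sync is in progress, the master queues new writes for the replica in that replica's client output buffer. If the buffer exceeds client-output-buffer-limit slave (the hard limit at any moment, or the soft limit for the configured number of seconds), the master drops the replica mid-sync and the replica restarts with another full resync.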
redis-cli -h <master> CONFIG GET client-output-buffer-limit
journalctl -u redis --no-pager 2>/dev/null | grep -i 'output buffer' # run on the master host
client-output-buffer-limit slave 256mb 64mb 60 # hard limit, soft limit, soft seconds
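CONFIG GET returns the limits for every client class on a single line, in bytes, in the order class, hard limit, soft limit, soft seconds. A small helper can pull out just the replica class. This is a sketch: the function name is hypothetical, and it accepts either "slave" or "replica" as the class name because the label may differ between versions.

```shell
# replica_buffer_limit: reads the output of
#   redis-cli -h <master> CONFIG GET client-output-buffer-limit
# on stdin and prints "hard soft seconds" for the replica class.
replica_buffer_limit() {
    awk '{
        for (i = 1; i + 3 <= NF; i++)
            if ($i == "slave" || $i == "replica") {
                print $(i + 1), $(i + 2), $(i + 3)
                exit
            }
    }'
}

# Usage (assumes redis-cli can reach the master):
# redis-cli -h <master> CONFIG GET client-output-buffer-limit | replica_buffer_limit
```

A hard or soft limit of 0 means that limit is disabled for the class.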
3. Master replication ID changed (restart / failover)
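After a master restart, the master comes up with a new replication ID, so the replica’s cached replid no longer matches and partial resync is refused. The same happens after a failover when the newly promoted master does not share the replica’s replication history.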
Partial resynchronization not accepted: Replication ID mismatch
4. Network instability / repl-timeout too low
A flaky link or a short repl-timeout aborts the transfer before the RDB finishes loading, forcing a restart of the whole sync.
redis-cli -h <master> CONFIG GET repl-timeout
redis-cli -h <replica> INFO replication | grep master_link_status
Diagnostic Workflow
Step 1: Confirm the full-resync loop
redis-cli -h <master> INFO stats | grep -E 'sync_full|sync_partial_ok|sync_partial_err'
watch -n2 "redis-cli -h <master> INFO stats | grep sync_full"
A steadily rising sync_full with sync_partial_err climbing = the loop.
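To turn that check into something scriptable, compare two INFO stats snapshots taken a few minutes apart. The helper names below are hypothetical, and the rule (sync_full grew while sync_partial_ok did not) is a simplification of the pattern described above.

```shell
# info_field NAME: print NAME's value from INFO text on stdin.
# INFO lines end in CRLF, so the trailing carriage return is stripped.
info_field() {
    awk -F: -v name="$1" '$1 == name { sub(/\r$/, "", $2); print $2; exit }'
}

# field_delta NAME BEFORE AFTER: change in a counter between two snapshots.
field_delta() (
    before=$(printf '%s\n' "$2" | info_field "$1")
    after=$(printf '%s\n' "$3" | info_field "$1")
    echo $(( ${after:-0} - ${before:-0} ))
)

# resync_loop_verdict BEFORE AFTER: both arguments are INFO stats text.
resync_loop_verdict() (
    full=$(field_delta sync_full "$1" "$2")
    ok=$(field_delta sync_partial_ok "$1" "$2")
    if [ "$full" -gt 0 ] && [ "$ok" -eq 0 ]; then
        echo "loop: sync_full +$full, sync_partial_ok +$ok"
    else
        echo "no loop: sync_full +$full, sync_partial_ok +$ok"
    fi
)

# Usage (assumes redis-cli can reach the master):
# before=$(redis-cli -h <master> INFO stats); sleep 180
# after=$(redis-cli -h <master> INFO stats)
# resync_loop_verdict "$before" "$after"
```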
Step 2: Read both logs for the refusal reason
journalctl -u redis --no-pager | grep -iE 'resync|BGSAVE|Replication ID|output buffer|Loading DB' | tail -30
“Replication ID mismatch” → master restarted/failed over; “output buffer” → buffer limit; “lack of backlog”, or nothing but repeated BGSAVE → backlog too small.
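The same triage can be done by counting how often each refusal reason appears in the master log. The function name below is hypothetical. The "lack of backlog" pattern assumes the master logs an "Unable to partial resync ... for lack of backlog" line when the requested offset has aged out, so check the wording against your own logs.

```shell
# classify_resync_logs: reads master log lines on stdin and prints one
# summary line with a count per likely cause.
classify_resync_logs() {
    awk '
        /Replication ID mismatch/  { id++ }
        /output buffer limits/     { buf++ }
        /lack of backlog/          { backlog++ }
        /Starting BGSAVE for SYNC/ { bgsave++ }
        END {
            printf "id_mismatch=%d output_buffer=%d lack_of_backlog=%d bgsave_for_sync=%d\n",
                   id, buf, backlog, bgsave
        }'
}

# Usage on the master host:
# journalctl -u redis --no-pager | classify_resync_logs
```

The largest of the first three counters points at the root cause to investigate first; bgsave_for_sync is roughly the number of full resyncs started.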
Step 3: Check backlog sizing vs. write rate
redis-cli -h <master> INFO replication | grep -E 'repl_backlog_size|repl_backlog_histlen|master_repl_offset'
redis-cli -h <master> INFO stats | grep instantaneous_ops_per_sec
If repl_backlog_histlen is tiny relative to how much the master writes during a disconnect, partial resync cannot succeed.
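One way to make "tiny relative to the write volume" concrete is to express the backlog as seconds of coverage: repl_backlog_histlen divided by the replication-stream byte rate. The sketch below assumes you have already measured that rate (for example from two master_repl_offset readings). The function name is hypothetical.

```shell
# backlog_coverage_seconds BYTES_PER_SEC: reads INFO replication text on
# stdin and prints how many whole seconds of writes the backlog history spans.
backlog_coverage_seconds() {
    awk -F: -v rate="$1" '
        $1 == "repl_backlog_histlen" { sub(/\r$/, "", $2); histlen = $2 }
        END {
            if (rate > 0) printf "%d\n", histlen / rate
            else print "unknown"
        }'
}

# Usage (assumes redis-cli can reach the master, 262144 B/s measured rate):
# redis-cli -h <master> INFO replication | backlog_coverage_seconds 262144
```

If the result is shorter than a typical disconnect, any reconnect after that many seconds has to fall back to a full resync.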
Step 4: Check buffer limits and timeouts
redis-cli -h <master> CONFIG GET client-output-buffer-limit
redis-cli -h <master> CONFIG GET repl-timeout
redis-cli -h <master> CONFIG GET repl-diskless-sync
Example Root Cause Analysis
A replica of a write-heavy master never stabilized. sync_full climbed by one every ~90 seconds:
redis-cli -h 10.0.0.5 INFO stats | grep -E 'sync_full|sync_partial_err'
sync_full:47
sync_partial_err:46
The master log showed the transfer being cut off, not an ID mismatch:
Client id=... flags=S ... scheduled to be closed ASAP for overcoming of output buffer limits.
Connection with replica 10.0.0.9:6379 lost.
Starting BGSAVE for SYNC ...
The slave output buffer soft limit (64mb, sustained for 60 seconds) was too small for the writes that accumulated in the replica's output buffer during the multi-GB transfer, so the master killed the replica mid-sync every cycle, and each reconnect restarted as a full resync.
Fix: raise the slave output buffer limits so the transfer can complete, and grow the backlog so brief disconnects recover via partial resync:
redis-cli -h 10.0.0.5 CONFIG SET client-output-buffer-limit "slave 512mb 128mb 60"
redis-cli -h 10.0.0.5 CONFIG SET repl-backlog-size 64mb # + persist to redis.conf
After the change, the next sync completed, master_link_status:up held, and sync_full stopped incrementing — subsequent blips resolved via sync_partial_ok.
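When verifying the new settings, note that CONFIG GET reports sizes in bytes, not in the units passed to CONFIG SET. A small converter makes the comparison easy. It is a sketch with a hypothetical function name, using the unit rules from redis.conf (1k = 1000, 1kb = 1024, 1m = 1000000, 1mb = 1024*1024, and so on, case insensitive).

```shell
# redis_size_to_bytes VALUE: convert a Redis size string (e.g. 64mb) to bytes.
redis_size_to_bytes() (
    v=$(printf '%s' "$1" | tr 'A-Z' 'a-z')
    n=${v%%[a-z]*}                  # numeric part, e.g. 64
    case "${v#"$n"}" in             # unit suffix, e.g. mb
        "") echo "$n" ;;
        k)  echo $(( n * 1000 )) ;;
        kb) echo $(( n * 1024 )) ;;
        m)  echo $(( n * 1000 * 1000 )) ;;
        mb) echo $(( n * 1024 * 1024 )) ;;
        g)  echo $(( n * 1000 * 1000 * 1000 )) ;;
        gb) echo $(( n * 1024 * 1024 * 1024 )) ;;
        *)  echo "unrecognized size: $1" >&2; exit 1 ;;
    esac
)

# Usage: compare with the second line of
#   redis-cli -h 10.0.0.5 CONFIG GET repl-backlog-size
# redis_size_to_bytes 64mb
```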
Prevention Best Practices
- Size repl-backlog-size for peak write rate × expected disconnect window (tens of MB for busy masters, not the 1 MB default).
- Raise client-output-buffer-limit slave so a large RDB transfer plus concurrent writes never trips the limit mid-sync.
- Set a generous repl-timeout for high-latency or high-throughput links so transfers are not aborted prematurely.
- Consider repl-diskless-sync yes when disk I/O on the master is the bottleneck for producing the transfer RDB.
- Use Sentinel/Cluster and a stable topology; frequent master restarts change the replication ID and force full resyncs.
- Alert on sync_full rate and sync_partial_err; a rising sync_full is the early warning.
Quick Command Reference
# Confirm the loop
redis-cli -h <master> INFO stats | grep -E 'sync_full|sync_partial_ok|sync_partial_err'
# Why partial resync is refused
journalctl -u redis | grep -iE 'resync|Replication ID|output buffer|BGSAVE|Loading DB' | tail -30
# Backlog sizing vs write rate
redis-cli -h <master> INFO replication | grep -E 'repl_backlog_size|repl_backlog_histlen|master_repl_offset'
redis-cli -h <master> INFO stats | grep instantaneous_ops_per_sec
# Limits & timeouts
redis-cli -h <master> CONFIG GET client-output-buffer-limit
redis-cli -h <master> CONFIG GET repl-timeout
# Remediate
redis-cli -h <master> CONFIG SET repl-backlog-size 64mb
redis-cli -h <master> CONFIG SET client-output-buffer-limit "slave 512mb 128mb 60"
Conclusion
A replica looping on full resync means partial resync keeps being refused, so the master re-ships the whole dataset over and over — hammering fork, CPU, and network without ever stabilizing. The sync_full counter climbing is the signature. Root causes:
- repl-backlog-size too small to cover the disconnect window.
- client-output-buffer-limit slave too small, killing the transfer mid-sync.
- The master’s replication ID changing after a restart/failover.
- Network instability or a short repl-timeout aborting transfers.
Read both logs to see why partial resync was rejected, then size the backlog and slave output buffers for your write rate and let the transfer complete. Once partial resync succeeds, brief blips heal via sync_partial_ok and the loop ends.