Kubernetes Error Guide: 'etcdserver: request timed out' Slow etcd & Defrag
Fix 'etcdserver: request timed out': diagnose slow disk fsync, defrag and quota limits, leader elections, network latency, and CPU starvation in the etcd backend.
- #kubernetes
- #troubleshooting
- #errors
- #etcd
Exact Error Message
These messages appear in the kube-apiserver logs and in etcd’s own logs when the cluster’s key-value store cannot serve requests fast enough:
etcdserver: request timed out
rpc error: code = DeadlineExceeded desc = context deadline exceeded
{"level":"warn","msg":"leader changed","local-member-id":"8e9e05c52164694d"}
{"level":"warn","msg":"apply request took too long","took":"2.137s","expected-duration":"100ms","request":"header:<ID:...> txn:<...>"}
{"level":"warn","msg":"slow fdatasync","took":"1.842s","expected-duration":"1s"}
When the database has filled its quota, usually with uncompacted revisions and fragmented free space, writes also hard-fail:
etcdserver: mvcc: database space exceeded
What the Error Means
etcd is the consistent, replicated key-value store behind the Kubernetes API. Every LIST, GET, CREATE, and WATCH ultimately reads or writes etcd. For each write, the leader appends the entry to its write-ahead log (WAL), replicates it to the followers via the Raft consensus protocol, and acknowledges it only after a quorum of members has fdatasynced the entry to disk. Both the disk fsync and Raft replication must complete within tight deadlines.
When the disk is slow, the members are out of sync, or the leader keeps changing, etcd cannot reach consensus or persist within its deadline. It returns request timed out or DeadlineExceeded to the apiserver, which then surfaces Error from server (Timeout) / context deadline exceeded to its own callers. apply request took too long and slow fdatasync are the early-warning symptoms; once they cross etcd’s internal deadline, requests fail outright.
mvcc: database space exceeded is a different failure mode: etcd’s on-disk database hit its --quota-backend-bytes limit (default 2 GiB, often raised to 8 GiB). etcd then raises a NOSPACE alarm and accepts only reads and deletes until you compact, defragment, and disarm the alarm.
Common Causes
- Slow disk (WAL fsync latency): etcd is extremely sensitive to disk latency. Spinning disks, network-attached storage, or noisy-neighbor IOPS push fdatasync past 1s and stall every write. etcd needs low-latency local SSD/NVMe.
- Defragmentation needed / DB near quota: deletes and compaction free logical space but leave the physical file fragmented. The db size creeps toward --quota-backend-bytes, and eventually writes fail with mvcc: database space exceeded.
- Network latency between members: high RTT or packet loss between etcd peers slows Raft replication, so the leader cannot get a quorum acknowledgement before its deadline.
- Leader elections / flapping: when a member misses heartbeats (often because of disk or CPU stalls), the cluster re-elects a leader. During the election, writes pause and time out; repeated leader changed lines mean instability.
- CPU starvation: etcd co-located with a busy apiserver or unthrottled workloads gets starved of CPU, missing heartbeats and fsync deadlines.
- Too many large objects / events: large ConfigMaps/Secrets, a runaway controller creating Events, or huge LIST traffic bloats the database and the request payloads, increasing apply latency.
- Clock skew: significant time drift between members disrupts lease and election timing, causing spurious timeouts.
How to Reproduce the Error
Inject disk latency under etcd’s data directory and watch fsync and apply latency blow past their thresholds. On a test control-plane node only:
# Simulate a slow disk, then generate write churn.
# Note: tc/netem shapes network interfaces, not block devices, so it cannot add disk latency.
# Place the etcd data directory on a slow or heavily loaded volume instead.
for i in $(seq 1 5000); do kubectl create cm churn-$i --from-literal=a="$(head -c 4096 /dev/urandom | base64 | tr -d '\n')"; done
{"level":"warn","msg":"slow fdatasync","took":"1.612s","expected-duration":"1s"}
{"level":"warn","msg":"apply request took too long","took":"1.904s","expected-duration":"100ms"}
Error from server (Timeout): etcdserver: request timed out
A more direct reproduction of the quota path is to set a tiny --quota-backend-bytes and write until you hit mvcc: database space exceeded.
Diagnostic Commands
Set up the etcdctl environment from the static-pod certs, then check endpoint status and health across all members:
export ETCDCTL_API=3
CERT=/etc/kubernetes/pki/etcd
etcdctl --endpoints=https://127.0.0.1:2379 \
--cacert=$CERT/ca.crt --cert=$CERT/server.crt --key=$CERT/server.key \
endpoint status --cluster -w table
+------------------------+------------------+---------+---------+-----------+-----------+------------+
| ENDPOINT               | ID               | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+------------------------+------------------+---------+---------+-----------+-----------+------------+
| https://10.0.0.10:2379 | 8e9e05c52164694d | 3.5.12  | 7.6 GB  | true      | 184       | 9912033    |
| https://10.0.0.11:2379 | b2c3d4e5f6a7b8c9 | 3.5.12  | 7.6 GB  | false     | 184       | 9912030    |
+------------------------+------------------+---------+---------+-----------+-----------+------------+
A DB SIZE near the quota is the smoking gun for the space exceeded path. Check health and any active alarms:
etcdctl --endpoints=https://127.0.0.1:2379 --cacert=$CERT/ca.crt --cert=$CERT/server.crt --key=$CERT/server.key endpoint health --cluster -w table
etcdctl --endpoints=https://127.0.0.1:2379 --cacert=$CERT/ca.crt --cert=$CERT/server.crt --key=$CERT/server.key alarm list
memberID:10501334649042878790 alarm:NOSPACE
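As a rough check of how close each member is to its quota, the following sketch pulls dbSize out of the JSON form of endpoint status and expresses it as a percentage. The function names db_sizes and quota_pct are illustrative, and the 8 GiB quota in the usage comment is an assumption; substitute your own --quota-backend-bytes value.

```shell
# Print every dbSize value (in bytes) found in `endpoint status -w json` output on stdin.
db_sizes() {
  grep -o '"dbSize":[0-9]*' | cut -d: -f2
}

# Print a DB size as an integer percentage of the backend quota.
# Usage: quota_pct <db_size_bytes> <quota_bytes>
quota_pct() {
  echo $(( $1 * 100 / $2 ))
}

# Example (assumes an 8 GiB quota; etcdctl flags elided as elsewhere in this guide):
#   etcdctl ... endpoint status --cluster -w json | db_sizes \
#     | while read -r size; do quota_pct "$size" 8589934592; done
```

Integer division truncates, so the printed figure slightly understates usage; treat anything in the high 80s or above as a prompt to compact and defragment.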
Scan etcd’s logs for latency warnings, then pull its disk-latency metrics, which are the single most important signal:
sudo crictl logs $(sudo crictl ps --name etcd -q | head -1) 2>&1 | grep -E 'fdatasync|took too long|leader changed' | tail
curl -sk --cert $CERT/server.crt --key $CERT/server.key --cacert $CERT/ca.crt https://127.0.0.1:2379/metrics \
| grep -E 'etcd_disk_wal_fsync_duration_seconds_bucket|etcd_disk_backend_commit_duration_seconds_bucket|etcd_server_leader_changes_seen_total'
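The raw histogram buckets are hard to read at a glance. A minimal sketch for turning them into a quantile estimate follows: it reports the smallest bucket bound (le) that covers the requested fraction of observations, which is an upper bound on that quantile rather than an interpolated value. The function name bucket_quantile is hypothetical, and it assumes the standard Prometheus text format shown by the /metrics endpoint.

```shell
# Print the smallest bucket bound (le) whose cumulative count covers at least
# the requested fraction of all observations.
# Usage: bucket_quantile <histogram_name> <fraction>   (metrics text on stdin)
bucket_quantile() {
  awk -v metric="$1_bucket" -v q="$2" '
    index($0, metric "{") == 1 {
      # Extract the value of the le="..." label from the line.
      le = $0
      sub(/^.*le="/, "", le)
      sub(/".*$/, "", le)
      n++
      bound[n] = le
      count[n] = $NF + 0          # cumulative count for this bucket
      if (le == "+Inf") total = $NF + 0
    }
    END {
      if (n == 0 || total == 0) { print "no-data"; exit }
      for (i = 1; i <= n; i++) {
        if (count[i] >= q * total) { print bound[i]; exit }
      }
    }
  '
}

# Example (same curl flags as above):
#   curl -sk ... https://127.0.0.1:2379/metrics \
#     | bucket_quantile etcd_disk_wal_fsync_duration_seconds 0.99
```

The result is in seconds and covers every observation since the etcd process started, so it smooths over recent spikes; a monitoring system that computes the quantile over a time window gives a sharper picture.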
Step-by-Step Resolution
1. Confirm the failure class. alarm list showing NOSPACE means you are at the quota: go to step 2. Slow fdatasync / apply request took too long warnings with no alarm mean a disk, CPU, or network latency problem: go to step 4.
2. Clear a quota (NOSPACE) alarm. Compact to the current revision, defragment to reclaim physical space, then disarm:
    REV=$(etcdctl --endpoints=https://127.0.0.1:2379 --cacert=$CERT/ca.crt --cert=$CERT/server.crt --key=$CERT/server.key endpoint status -w json | grep -o '"revision":[0-9]*' | head -1 | cut -d: -f2)
    etcdctl ... compact $REV
    etcdctl ... defrag --cluster
    etcdctl ... alarm disarm
   Defrag blocks the member it runs on. On a busy cluster, run it one member at a time (leader last) by pointing --endpoints at each member in turn, rather than defragmenting the whole cluster in one pass.
3. Raise the quota if the cluster is legitimately large. Edit the etcd static-pod manifest and bump the backend quota (max recommended 8 GiB):
    # /etc/kubernetes/manifests/etcd.yaml
    spec:
      containers:
      - command:
        - etcd
        - --quota-backend-bytes=8589934592   # 8 GiB
        - --auto-compaction-mode=periodic
        - --auto-compaction-retention=5m
   The kubelet restarts etcd automatically when the manifest changes.
4. Fix slow disk. Move etcd’s data directory to dedicated low-latency local SSD/NVMe. Verify etcd_disk_wal_fsync_duration_seconds p99 stays under ~10ms and etcd_disk_backend_commit_duration_seconds p99 under ~25ms. Do not run etcd on network-attached or shared storage.
5. Stabilize leadership and CPU. If etcd_server_leader_changes_seen_total keeps climbing, give etcd dedicated CPU (guaranteed QoS, separate nodes from busy workloads). Reduce network RTT between members; keep them in the same low-latency zone/region.
6. Cut object/event churn. Find controllers spamming Events or large objects and fix them; enable periodic auto-compaction so revisions do not accumulate. Verify clocks are synced via NTP across members.
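To find out which kind of object is bloating the database, one option is to count etcd keys per resource type. The sketch below assumes the apiserver uses its default /registry key prefix and groups keys by the third path segment, which is the resource name for core types and the API group for some others. The function name keys_by_resource is illustrative.

```shell
# Count keys per third path segment (e.g. "events" in /registry/events/default/foo).
# Reads key names on stdin, one per line; blank lines are ignored.
# Prints "<count> <segment>" sorted with the largest count first.
keys_by_resource() {
  awk -F/ '$2 == "registry" && $3 != "" { n[$3]++ }
           END { for (r in n) print n[r], r }' | sort -rn
}

# Example (etcdctl flags elided as elsewhere in this guide):
#   etcdctl ... get /registry --prefix --keys-only | keys_by_resource | head
```

This counts keys, not bytes, so a handful of very large Secrets or ConfigMaps will not stand out here; it is mainly useful for spotting runaway Event or custom-resource creation.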
Prevention and Best Practices
- Run etcd on dedicated local SSD/NVMe and alert on etcd_disk_wal_fsync_duration_seconds p99; disk latency is the dominant cause of etcd timeouts.
- Enable --auto-compaction-mode=periodic with a short retention and schedule a defrag (rolling, one member at a time) so the database never drifts toward the quota.
- Give etcd guaranteed CPU and isolate it from noisy workloads and the apiserver where possible; co-location causes heartbeat misses.
- Keep etcd members in the same low-latency network and sync clocks with NTP to keep Raft elections stable.
- Alert on DB SIZE approaching --quota-backend-bytes and on etcd_server_leader_changes_seen_total so you catch instability before writes start failing.
Related Errors
- context deadline exceeded: what the apiserver surfaces to clients when etcd times out underneath it.
- The connection to the server was refused: an apiserver that won’t start because etcd is unhealthy.
- etcdserver: mvcc: database space exceeded: the quota path of this same error, fixed by compact + defrag + disarm.
- etcdserver: too many requests: etcd shedding load under overwhelming request volume.
Frequently Asked Questions
Is etcdserver: request timed out a network or disk problem?
Most often disk. etcd fdatasyncs every write before acknowledging, so slow disk latency stalls writes long before network does. Check etcd_disk_wal_fsync_duration_seconds first; only if disk latency is healthy should you investigate inter-member network RTT and leader elections.
Is it safe to run defrag on a live cluster?
Defrag is safe but blocks the member it runs on while it rewrites the database file, which can briefly cause timeouts on that member. Run it one member at a time, defrag the leader last, and ideally during a low-traffic window. Never defrag every member simultaneously on a busy cluster.
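The one-member-at-a-time, leader-last procedure could be scripted roughly as follows. This is a sketch under several assumptions: the TLS settings are supplied through the ETCDCTL_CACERT, ETCDCTL_CERT, and ETCDCTL_KEY environment variables, the function names leader_last and rolling_defrag and the PAUSE_SECONDS variable are made up for illustration, and the input is a list of "endpoint is_leader" pairs that you derive from endpoint status yourself.

```shell
# Reorder "endpoint is_leader" pairs so followers come first and the leader last.
# Input lines look like: https://10.0.0.10:2379 true
leader_last() {
  awk '$2 == "false" { print $1 }
       $2 == "true"  { leader[++n] = $1 }
       END { for (i = 1; i <= n; i++) print leader[i] }'
}

# Defragment each endpoint read from stdin in turn, pausing between members.
# Stops at the first failure so a broken member is not followed by more defrags.
rolling_defrag() {
  while read -r ep; do
    echo "defragmenting $ep"
    etcdctl --endpoints="$ep" defrag </dev/null || return 1
    sleep "${PAUSE_SECONDS:-10}"
  done
}

# Example:
#   printf '%s\n' 'https://10.0.0.10:2379 true' 'https://10.0.0.11:2379 false' \
#     | leader_last | rolling_defrag
```

The pause gives each member time to catch up before the next one blocks; checking endpoint health between members would be a sensible addition before relying on something like this in production.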
What causes mvcc: database space exceeded?
The on-disk database hit --quota-backend-bytes. Even after deleting objects, physical space is not reclaimed until you compact (drop old revisions) and defrag (shrink the file). Then alarm disarm re-enables writes. If the cluster is genuinely large, also raise the quota up to the recommended 8 GiB maximum.
Why do I keep seeing leader changed?
Repeated leader elections mean a member keeps missing heartbeats — usually disk or CPU stalls, or network packet loss between peers. Each election pauses writes and causes timeouts. Stabilize the underlying disk/CPU/network rather than treating the elections themselves.
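Because etcd_server_leader_changes_seen_total is a counter, a single reading says little; what matters is whether it grows between two samples. A minimal sketch for extracting the value and comparing two readings is shown here, with the hypothetical helper names metric_value and counter_delta, and assuming the counter is printed as a plain integer.

```shell
# Print the value of an unlabeled metric from Prometheus text on stdin.
# Usage: metric_value <metric_name>
metric_value() {
  awk -v name="$1" '$1 == name { print $2; exit }'
}

# Print how much a counter grew between an earlier and a later sample.
# Usage: counter_delta <earlier> <later>
counter_delta() {
  echo $(( $2 - $1 ))
}

# Example: sample the counter twice, a few minutes apart, on the same member:
#   before=$(curl -sk ... https://127.0.0.1:2379/metrics | metric_value etcd_server_leader_changes_seen_total)
#   sleep 300
#   after=$(curl -sk ... https://127.0.0.1:2379/metrics | metric_value etcd_server_leader_changes_seen_total)
#   counter_delta "$before" "$after"
```

A delta of zero over several minutes suggests leadership is currently stable; the counter resets when the etcd process restarts, so a negative delta means the member was restarted between samples.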
On a managed cluster, can I fix etcd myself?
No. On EKS, GKE, AKS, and similar, etcd is part of the managed control plane and you have no access to its nodes or etcdctl. Capture the apiserver-side Timeout/context deadline exceeded evidence and the timing, then open a support ticket. Your levers are limited to reducing object/event churn and overall API load.