PostgreSQL Error Guide: 'No space left on device' Disk Full on Write
Fix the PostgreSQL 'No space left on device' error: diagnose a full data directory, WAL bloat, table bloat, temp files, and stuck replication slots.
- #postgres
- #troubleshooting
- #errors
- #storage
Overview
PostgreSQL raises this error when a write to disk fails because the underlying filesystem is full. Any operation that needs to grow a file — extending a table or index, writing a WAL segment, spilling a sort to a temp file, or appending to the log — fails with the OS ENOSPC error. Once the data directory’s filesystem hits 100%, the database cannot accept writes and may refuse new connections or shut down to protect data integrity.
The client and server log show:
ERROR: could not extend file "base/16384/24591": No space left on device
HINT: Check free disk space.
PANIC: could not write to file "pg_wal/xlogtemp.12345": No space left on device
It occurs whenever free space on the volume hosting the data directory (or a tablespace, or pg_wal) reaches zero. The trigger is often gradual — WAL accumulating, bloat growing, a big query spilling to disk — until one final write tips the filesystem over. Because PostgreSQL needs headroom even to checkpoint and clean up, a fully wedged disk can be hard to recover without freeing space outside the database first.
Symptoms
- Writes fail with could not extend file ... No space left on device.
- WAL writes fail with PANIC: could not write to file "pg_wal/...", often crashing the server.
- New connections may fail; the server may sit in recovery or refuse to start.
- df shows the data volume at 100% used.
df -h /var/lib/postgresql
Filesystem Size Used Avail Use% Mounted on
/dev/nvme1n1 100G 100G 20K 100% /var/lib/postgresql
# Largest consumers inside the data directory
sudo du -h --max-depth=1 /var/lib/postgresql/16/main | sort -rh | head
99G /var/lib/postgresql/16/main
58G /var/lib/postgresql/16/main/base
33G /var/lib/postgresql/16/main/pg_wal
1.1G /var/lib/postgresql/16/main/pg_stat_tmp
A pg_wal of 33G alongside a near-full disk points straight at WAL accumulation as the culprit.
Common Root Causes
1. Data directory disk full from normal growth
The cluster simply outgrew its volume — tables and indexes legitimately fill the disk.
SELECT datname, pg_size_pretty(pg_database_size(datname)) AS size
FROM pg_database
ORDER BY pg_database_size(datname) DESC
LIMIT 5;
datname | size
-----------+--------
appdb | 54 GB
reporting | 11 GB
postgres | 8 MB
(3 rows)
If the sum of database sizes approaches the volume size, you are out of capacity and need to reclaim or expand it.
2. WAL bloat from failing archiving
If archive_mode is on and archive_command keeps failing, PostgreSQL refuses to recycle WAL segments until they are archived, so pg_wal grows without bound.
SELECT archived_count, failed_count, last_failed_wal, last_failed_time
FROM pg_stat_archiver;
archived_count | failed_count | last_failed_wal | last_failed_time
----------------+--------------+--------------------------+-------------------------------
4210 | 1880 | 0000000100000A3F000000C1 | 2026-06-23 13:55:02.114+00
(1 row)
A high and climbing failed_count means archiving is broken and WAL is piling up in pg_wal.
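When the server is too wedged to answer queries, the archiving backlog can also be estimated from the filesystem. PostgreSQL marks each file still waiting for archive_command with a "<name>.ready" file under pg_wal/archive_status. The following is a minimal sketch under that assumption; the function name is hypothetical and the path follows the Debian-style layout used elsewhere in this guide.

```shell
#!/usr/bin/env bash
# Count files still waiting to be archived.
# Each pending WAL segment has a "<segment>.ready" marker in
# pg_wal/archive_status; it is renamed to ".done" after a successful archive.
count_ready_segments() {
  local status_dir="$1"
  # -maxdepth 1 keeps the scan to the status directory itself.
  find "$status_dir" -maxdepth 1 -type f -name '*.ready' | wc -l | tr -d ' '
}

# Example (path is an assumption; run as a user that can read the data directory):
# count_ready_segments /var/lib/postgresql/16/main/pg_wal/archive_status
```

A count that keeps rising between two runs is consistent with a failing archive_command; multiplied by the segment size (16 MB by default) it gives a rough idea of how much of pg_wal is pinned by archiving.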
3. Table and index bloat
Dead tuples from heavy UPDATE/DELETE churn that autovacuum has not reclaimed inflate on-disk size well beyond live data.
SELECT relname,
pg_size_pretty(pg_total_relation_size(relid)) AS total,
n_dead_tup, n_live_tup
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 5;
relname | total | n_dead_tup | n_live_tup
--------------+--------+------------+------------
events | 22 GB | 41832290 | 9120044
sessions | 3.1 GB | 8123440 | 210330
(2 rows)
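When n_dead_tup rivals or exceeds n_live_tup, dead tuples probably occupy a large share of the table. Both counters are estimates, and space that VACUUM has already marked reusable but not returned to the OS does not appear in n_dead_tup, so treat the ratio as a pointer rather than an exact measurement.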
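To make the dead-versus-live comparison concrete, the two counters can be turned into a percentage. This is a small sketch, not a bloat estimator: the function name is hypothetical, and the inputs are the n_dead_tup and n_live_tup values from the sample output above.

```shell
#!/usr/bin/env bash
# Percentage of tuples that are dead, from n_dead_tup and n_live_tup.
dead_tuple_pct() {
  local dead="$1" live="$2"
  awk -v d="$dead" -v l="$live" 'BEGIN {
    total = d + l
    if (total == 0) { print "0.0"; exit }   # empty table: avoid dividing by zero
    printf "%.1f\n", 100 * d / total
  }'
}

dead_tuple_pct 41832290 9120044   # events   -> 82.1
dead_tuple_pct 8123440 210330     # sessions -> 97.5
```

Both sample tables are dominated by dead tuples, which makes them the first candidates for vacuuming once some free space is available.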
4. Temp files from large sorts and hashes
Queries whose sorts or hash joins exceed work_mem spill to temp files under base/pgsql_tmp. A single huge analytical query can fill the disk transiently.
SELECT datname, temp_files, pg_size_pretty(temp_bytes) AS temp_total
FROM pg_stat_database
WHERE temp_files > 0
ORDER BY temp_bytes DESC;
datname | temp_files | temp_total
---------+------------+------------
appdb | 14021 | 47 GB
(1 row)
sudo du -sh /var/lib/postgresql/16/main/base/pgsql_tmp
9.8G /var/lib/postgresql/16/main/base/pgsql_tmp
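temp_bytes is cumulative since the last statistics reset, so it shows how much has spilled over time, while du on pgsql_tmp shows what is on disk right now. A large live pgsql_tmp together with a fast-growing temp_bytes means runaway spill; bound it with temp_file_limit.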
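To find which backends own the temp files currently on disk, the file names themselves help: PostgreSQL names them pgsql_tmpPPP.NNN, where PPP is the PID of the owning backend. The sketch below relies on that naming; the function name is hypothetical and the path in the usage comment is the one used in this guide.

```shell
#!/usr/bin/env bash
# Print "PID file_count" for every backend that owns temp files in a directory.
temp_file_pids() {
  local tmp_dir="$1"
  find "$tmp_dir" -maxdepth 1 -type f -name 'pgsql_tmp*' 2>/dev/null |
    # Keep only the PID part of ".../pgsql_tmp<PID>.<N>".
    sed -n 's#.*/pgsql_tmp\([0-9][0-9]*\)\..*#\1#p' |
    sort -n | uniq -c | awk '{ print $2, $1 }'
}

# Example (path is an assumption):
# sudo bash -c "$(declare -f temp_file_pids); temp_file_pids /var/lib/postgresql/16/main/base/pgsql_tmp"
```

Each PID printed can then be looked up in pg_stat_activity (WHERE pid = ...) to see the query responsible for the spill.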
5. A replication slot retaining WAL
An inactive or lagging replication slot pins WAL so it can never be recycled, growing pg_wal until the disk fills — even with archiving healthy.
SELECT slot_name, active, restart_lsn,
pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained
FROM pg_replication_slots
ORDER BY pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) DESC;
slot_name | active | restart_lsn | retained
----------------+--------+--------------+-----------
standby_2_slot | f | A3F/C1000000 | 28 GB
standby_1_slot | t | A46/B9800000 | 120 MB
(2 rows)
The inactive slot standby_2_slot is holding 28 GB of WAL — a dead or removed standby that was never cleaned up.
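The retained figure is plain byte arithmetic. An LSN is a 64-bit byte position in the WAL stream, printed as two hexadecimal halves separated by a slash, and pg_wal_lsn_diff subtracts one position from the other. The sketch below reproduces that arithmetic in the shell; the function names are hypothetical, and the "current" LSN in the example is an assumed value chosen to match the 28 GB shown above.

```shell
#!/usr/bin/env bash
# Convert an LSN such as A3F/C1000000 to an absolute byte position.
lsn_to_bytes() {
  local lsn="$1"
  local hi="${lsn%%/*}" lo="${lsn##*/}"   # high and low 32-bit halves, in hex
  echo $(( (0x$hi << 32) + 0x$lo ))
}

# Bytes between two LSNs (first minus second), like pg_wal_lsn_diff.
lsn_diff_bytes() {
  echo $(( $(lsn_to_bytes "$1") - $(lsn_to_bytes "$2") ))
}

# Assumed current LSN A46/C1000000 versus the slot's restart_lsn:
lsn_diff_bytes A46/C1000000 A3F/C1000000   # 30064771072 bytes = 28 GiB
```

Seven increments of the high half, at 4 GiB each, account for the 28 GB that standby_2_slot is pinning.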
6. Log files filling the disk
Verbose logging (log_statement = all or aggressive slow-query logging) or missing log rotation can fill the log volume, which is often the same disk as the data directory.
sudo du -sh /var/log/postgresql
sudo ls -lhS /var/log/postgresql | head
14G /var/log/postgresql
-rw-r----- 1 postgres postgres 11G postgresql-16-main.log
-rw-r----- 1 postgres postgres 2.1G postgresql-16-main.log.1
A multi-gigabyte log with no rotation can be the entire reason the disk is full.
Diagnostic Workflow
Step 1: Confirm which filesystem is full
df -h
df -h /var/lib/postgresql /var/log/postgresql
Identify whether the data directory, a separate tablespace mount, or the log volume hit 100% — that narrows the cause immediately.
Step 2: Find the biggest consumers on disk
sudo du -h --max-depth=1 /var/lib/postgresql/16/main | sort -rh | head
sudo du -sh /var/lib/postgresql/16/main/pg_wal \
/var/lib/postgresql/16/main/base/pgsql_tmp
A dominant pg_wal points to WAL/slot/archiving issues; a dominant pgsql_tmp points to query spill; a dominant base points to bloat or genuine growth.
Step 3: Check WAL retention causes (archiving and slots)
SELECT failed_count, last_failed_wal FROM pg_stat_archiver;
SELECT slot_name, active,
pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained
FROM pg_replication_slots
ORDER BY pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) DESC;
A climbing failed_count or a large retained value on an inactive slot tells you exactly why WAL will not recycle.
Step 4: Free emergency space, then let the database catch up
# Safe quick wins outside PostgreSQL's data files:
sudo journalctl --vacuum-size=200M
sudo find /var/log/postgresql -name '*.log.*' -mtime +1 -delete
# Drop a confirmed-dead replication slot to release retained WAL
psql -c "SELECT pg_drop_replication_slot('standby_2_slot');"
Never delete files inside pg_wal or base by hand — drop the slot or fix archiving and let PostgreSQL recycle WAL safely.
Step 5: Reclaim space inside the database
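-- Plain VACUUM marks dead-tuple space reusable without an exclusive lock;
-- it rarely shrinks the files themselves.
VACUUM (VERBOSE, ANALYZE) events;
-- Cap temp-file spill per backend process
ALTER SYSTEM SET temp_file_limit = '10GB';
SELECT pg_reload_conf();
Plain VACUUM stops a table from growing further but usually does not return space to the OS. To shrink the files, VACUUM FULL rewrites the table under an ACCESS EXCLUSIVE lock and needs free space for the new copy. pg_repack avoids the long lock but also needs room for a copy, so free space elsewhere first when the disk is completely full.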
Example Root Cause Analysis
A primary database starts rejecting writes with could not extend file ... No space left on device, and monitoring shows the data volume at 100%.
df confirms /var/lib/postgresql is full, and du shows where:
33G /var/lib/postgresql/16/main/pg_wal
58G /var/lib/postgresql/16/main/base
pg_wal at 33G is abnormal for this workload. Archiving is healthy (failed_count is 0), so the next suspect is a replication slot:
slot_name | active | retained
----------------+--------+-----------
standby_2_slot | f | 28 GB
standby_2_slot is inactive and holding 28 GB of WAL. The team had decommissioned the second standby a week earlier but never dropped its slot, so the primary kept every WAL segment since, slowly filling the disk.
Fix: drop the orphaned slot so PostgreSQL can recycle the retained WAL:
SELECT pg_drop_replication_slot('standby_2_slot');
CHECKPOINT;
Within a checkpoint cycle, pg_wal shrinks back to its normal size, free space returns, and writes succeed again. To prevent recurrence, the team adds max_slot_wal_keep_size so a single stuck slot can no longer fill the disk.
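A minimal sketch of that prevention step follows, assuming PostgreSQL 13 or later (where max_slot_wal_keep_size and the wal_status and safe_wal_size columns exist) and a psql session with superuser rights. The 20GB value is purely illustrative and should be sized against the actual volume.

```shell
#!/usr/bin/env bash
# Cap the WAL that replication slots may retain. '20GB' is an example value.
# Two -c options are used because ALTER SYSTEM cannot run in a multi-statement string.
psql -c "ALTER SYSTEM SET max_slot_wal_keep_size = '20GB';" \
     -c "SELECT pg_reload_conf();"

# Watch how close each slot is to the cap.
# wal_status: reserved / extended / unreserved / lost
psql -c "SELECT slot_name, active, wal_status, pg_size_pretty(safe_wal_size) AS safe_wal_left FROM pg_replication_slots;"
```

The trade-off is deliberate: a slot that falls behind the cap is invalidated and its standby has to be re-synced, which is usually preferable to the primary running out of disk.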
Prevention Best Practices
- Monitor and alert on filesystem usage for the data, WAL, and log volumes well before 100% (e.g., page at 80%). A full disk is far cheaper to prevent than to recover.
- Set max_slot_wal_keep_size so a dead or lagging replication slot cannot retain WAL without bound, and audit pg_replication_slots regularly for inactive slots.
- Alert on pg_stat_archiver.failed_count; a broken archive_command silently grows pg_wal until the disk fills.
- Tune autovacuum to keep up with churn, and bound query spill with temp_file_limit so one bad analytical query cannot exhaust the disk.
- Configure log rotation (logrotate or log_rotation_size/log_rotation_age) and avoid log_statement = all in production.
- For fast triage, the free incident assistant can turn a disk-full log block into the likely cause: WAL, slot, bloat, or temp spill.
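The filesystem-usage alert from the first bullet can be prototyped in a few lines before a full monitoring stack is in place. The sketch below is an assumption-laden example rather than a monitoring product: the function name is hypothetical, it reads POSIX-format df -P output on stdin, and it does not handle mount points containing spaces.

```shell
#!/usr/bin/env bash
# Print "mount_point usage%" for every filesystem at or above a threshold.
# Input: output of `df -P` (header line plus one line per filesystem).
mounts_over_threshold() {
  local threshold="$1"
  awk -v t="$threshold" 'NR > 1 {
    pct = $5; sub(/%/, "", pct)          # "100%" -> "100"
    if (pct + 0 >= t) print $6, pct "%"
  }'
}

# Example: flag the data and log volumes at 80% or more.
# df -P /var/lib/postgresql /var/log/postgresql | mounts_over_threshold 80
```

Run from cron or a systemd timer, any non-empty output can be mailed or forwarded to a pager well before the volume reaches 100%.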
Quick Command Reference
# Which filesystem is full?
df -h /var/lib/postgresql /var/log/postgresql
# Biggest on-disk consumers in the data directory
sudo du -h --max-depth=1 /var/lib/postgresql/16/main | sort -rh | head
sudo du -sh /var/lib/postgresql/16/main/pg_wal /var/lib/postgresql/16/main/base/pgsql_tmp
# Database sizes
psql -c "SELECT datname, pg_size_pretty(pg_database_size(datname)) FROM pg_database ORDER BY pg_database_size(datname) DESC;"
# Archiving health
psql -c "SELECT archived_count, failed_count, last_failed_wal FROM pg_stat_archiver;"
# WAL retained by replication slots
psql -c "SELECT slot_name, active, pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained FROM pg_replication_slots ORDER BY pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) DESC;"
# Temp-file spill per database
psql -c "SELECT datname, temp_files, pg_size_pretty(temp_bytes) FROM pg_stat_database WHERE temp_files > 0;"
# Bloat (dead tuples) by table
psql -c "SELECT relname, n_dead_tup, n_live_tup FROM pg_stat_user_tables ORDER BY n_dead_tup DESC LIMIT 5;"
# Reclaim: drop a dead slot, vacuum, cap temp files
psql -c "SELECT pg_drop_replication_slot('<SLOT>');"
psql -c "VACUUM (VERBOSE, ANALYZE) <TABLE>;"
psql -c "ALTER SYSTEM SET temp_file_limit = '10GB';" -c "SELECT pg_reload_conf();"
Conclusion
A No space left on device error means a PostgreSQL write hit a full filesystem — most damagingly when it strikes pg_wal and PANICs the server. The usual root causes:
- The data directory’s volume is genuinely full from normal growth.
- WAL bloat because a failing archive_command blocks segment recycling.
- Table and index bloat from churn that autovacuum has not reclaimed.
- Temp files from large sorts/hashes spilling under pgsql_tmp.
- An inactive or lagging replication slot retaining WAL indefinitely.
- Unrotated, verbose log files filling the disk.
Find the full filesystem with df, find the biggest consumer with du, then reclaim safely: fix archiving or drop a dead slot rather than deleting WAL by hand.