AI for Postgres · By James Joyner IV · 11 min read

PostgreSQL Error Guide: 'No space left on device' Disk Full on Write

Fix the PostgreSQL 'No space left on device' error: diagnose a full data directory, WAL bloat, table bloat, temp files, and stuck replication slots.

  • #postgres
  • #troubleshooting
  • #errors
  • #storage

Overview

PostgreSQL raises this error when a write to disk fails because the underlying filesystem is full. Any operation that needs to grow a file — extending a table or index, writing a WAL segment, spilling a sort to a temp file, or appending to the log — fails with the OS ENOSPC error. Once the data directory’s filesystem hits 100%, the database cannot accept writes and may refuse new connections or shut down to protect data integrity.

The client and server log show:

ERROR:  could not extend file "base/16384/24591": No space left on device
HINT:  Check free disk space.
PANIC:  could not write to file "pg_wal/xlogtemp.12345": No space left on device

It occurs whenever free space on the volume hosting the data directory (or a tablespace, or pg_wal) reaches zero. The trigger is often gradual — WAL accumulating, bloat growing, a big query spilling to disk — until one final write tips the filesystem over. Because PostgreSQL needs headroom even to checkpoint and clean up, a fully wedged disk can be hard to recover without freeing space outside the database first.

Symptoms

  • Writes fail with could not extend file ... No space left on device.
  • WAL writes fail with PANIC: could not write to file "pg_wal/...", often crashing the server.
  • New connections may fail; the server may sit in recovery or refuse to start.
  • df shows the data volume at 100% used.
df -h /var/lib/postgresql
Filesystem      Size  Used Avail Use% Mounted on
/dev/nvme1n1    100G  100G   20K 100% /var/lib/postgresql
# Largest consumers inside the data directory
sudo du -h --max-depth=1 /var/lib/postgresql/16/main | sort -rh | head
99G     /var/lib/postgresql/16/main
58G     /var/lib/postgresql/16/main/base
33G     /var/lib/postgresql/16/main/pg_wal
1.1G    /var/lib/postgresql/16/main/pg_stat_tmp

A pg_wal of 33G alongside a near-full disk points straight at WAL accumulation as the culprit.

Common Root Causes

1. Data directory disk full from normal growth

The cluster simply outgrew its volume — tables and indexes legitimately fill the disk.

SELECT datname, pg_size_pretty(pg_database_size(datname)) AS size
FROM pg_database
ORDER BY pg_database_size(datname) DESC
LIMIT 5;
  datname  |  size
-----------+--------
 appdb     | 54 GB
 reporting | 11 GB
 postgres  | 8 MB
(3 rows)

If the sum of database sizes approaches the volume size, you are out of capacity and need to reclaim or expand it.

2. WAL bloat from failing archiving

If archive_mode is on and archive_command keeps failing, PostgreSQL refuses to recycle WAL segments until they are archived, so pg_wal grows without bound.

SELECT archived_count, failed_count, last_failed_wal, last_failed_time
FROM pg_stat_archiver;
 archived_count | failed_count |     last_failed_wal      |       last_failed_time
----------------+--------------+--------------------------+-------------------------------
           4210 |         1880 | 0000000100000A3F000000C1 | 2026-06-23 13:55:02.114+00
(1 row)

A high and climbing failed_count means archiving is broken and WAL is piling up in pg_wal.
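To gauge how large the backlog is from the filesystem side, you can count the marker files PostgreSQL keeps for segments still waiting to be archived. The sketch below assumes the default layout, where each pending segment has a `.ready` file in pg_wal/archive_status; the function name `archive_backlog` is hypothetical, and the directory is normally readable only by the postgres user.

```shell
# Count WAL segments still waiting to be archived.
# PostgreSQL creates <segment>.ready in pg_wal/archive_status for each
# completed segment and renames it to .done once archive_command succeeds.
# Argument: path to the data directory (an assumption, adjust to your install).
archive_backlog() {
  local status_dir="$1/pg_wal/archive_status"
  if [ ! -d "$status_dir" ]; then
    echo "no archive_status directory under $1" >&2
    return 1
  fi
  find "$status_dir" -maxdepth 1 -type f -name '*.ready' | wc -l | tr -d ' '
}

# Usage (run as the postgres user):
#   archive_backlog /var/lib/postgresql/16/main
```

With the default 16 MB segment size, multiplying the count by 16 gives a rough estimate in MB of the WAL that archiving is holding back.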

3. Table and index bloat

Dead tuples from heavy UPDATE/DELETE churn that autovacuum has not reclaimed inflate on-disk size well beyond live data.

SELECT relname,
       pg_size_pretty(pg_total_relation_size(relid)) AS total,
       n_dead_tup, n_live_tup
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 5;
   relname    | total  | n_dead_tup | n_live_tup
--------------+--------+------------+------------
 events       | 22 GB  |  41832290  |   9120044
 sessions     | 3.1 GB |   8123440  |    210330
(2 rows)

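When n_dead_tup rivals or exceeds n_live_tup, dead rows make up half or more of the tuples in the table, so bloat is likely consuming most of its space. Both counters are estimates maintained by the statistics system, so treat them as a pointer to the worst tables rather than an exact measurement.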
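As a rough illustration of that comparison, the dead-tuple share can be computed directly from the two counters. This is a minimal sketch using integer arithmetic; the function name `dead_tuple_pct` is hypothetical, and the inputs are the values from the sample output above.

```shell
# Percentage of a table's tuples that are dead, from the
# n_dead_tup and n_live_tup counters in pg_stat_user_tables.
# Integer arithmetic only, so the result is truncated.
dead_tuple_pct() {
  local dead="$1" live="$2"
  local total=$(( dead + live ))
  if [ "$total" -eq 0 ]; then
    echo 0
    return 0
  fi
  echo $(( dead * 100 / total ))
}

dead_tuple_pct 41832290 9120044   # events   -> prints 82
dead_tuple_pct 8123440 210330     # sessions -> prints 97
```

The ratio does not account for free space that earlier vacuums already left inside the table, so the real on-disk overhead can be higher than this figure suggests.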

4. Temp files from large sorts and hashes

Queries whose sorts or hash joins exceed work_mem spill to temp files under base/pgsql_tmp. A single huge analytical query can fill the disk transiently.

SELECT datname, temp_files, pg_size_pretty(temp_bytes) AS temp_total
FROM pg_stat_database
WHERE temp_files > 0
ORDER BY temp_bytes DESC;
 datname | temp_files | temp_total
---------+------------+------------
 appdb   |     14021  | 47 GB
(1 row)
sudo du -sh /var/lib/postgresql/16/main/base/pgsql_tmp
9.8G    /var/lib/postgresql/16/main/base/pgsql_tmp

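temp_bytes is cumulative since the last statistics reset, while du shows what is on disk right now. A large live pgsql_tmp together with a fast-growing temp_bytes means runaway spill; bound it with temp_file_limit.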

5. A replication slot retaining WAL

An inactive or lagging replication slot pins WAL so it can never be recycled, growing pg_wal until the disk fills — even with archiving healthy.

SELECT slot_name, active, restart_lsn,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained
FROM pg_replication_slots
ORDER BY pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) DESC;
   slot_name    | active |  restart_lsn |  retained
----------------+--------+--------------+-----------
 standby_2_slot | f      | A3F/C1000000 | 28 GB
 standby_1_slot | t      | A46/B9800000 | 120 MB
(2 rows)

The inactive slot standby_2_slot is holding 28 GB of WAL — a dead or removed standby that was never cleaned up.
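The retained figure is plain arithmetic on LSNs: the part before the slash is the high 32 bits of a byte position in the WAL stream and the part after is the low 32 bits. The sketch below reproduces that subtraction in shell, much as pg_wal_lsn_diff does on the server. The function names are hypothetical, and the current LSN A46/C1000000 is an assumed value chosen to be 28 GB ahead of the slot's restart_lsn.

```shell
# Convert an LSN of the form HI/LO (two hex numbers) to a byte position.
lsn_to_bytes() {
  local hi="${1%%/*}" lo="${1##*/}"
  echo $(( (0x$hi << 32) + 0x$lo ))
}

# Bytes of WAL between two LSNs: first argument minus second.
lsn_diff_bytes() {
  echo $(( $(lsn_to_bytes "$1") - $(lsn_to_bytes "$2") ))
}

# Assumed current LSN vs. the restart_lsn of standby_2_slot:
lsn_diff_bytes A46/C1000000 A3F/C1000000   # prints 30064771072 (28 GiB)
```

Each step of one in the high part is 4 GiB of WAL, which makes it easy to eyeball how far behind a slot is before running the query.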

6. Log files filling the disk

Verbose logging (log_statement = all or aggressive slow-query logging), or logs that are never rotated, can fill the log volume, which is often the same disk as the data directory.

sudo du -sh /var/log/postgresql
sudo ls -lhS /var/log/postgresql | head
14G     /var/log/postgresql
-rw-r----- 1 postgres postgres 11G postgresql-16-main.log
-rw-r----- 1 postgres postgres 2.1G postgresql-16-main.log.1

A multi-gigabyte log with no rotation can be the entire reason the disk is full.

Diagnostic Workflow

Step 1: Confirm which filesystem is full

df -h
df -h /var/lib/postgresql /var/log/postgresql

Identify whether the data directory, a separate tablespace mount, or the log volume hit 100% — that narrows the cause immediately.

Step 2: Find the biggest consumers on disk

sudo du -h --max-depth=1 /var/lib/postgresql/16/main | sort -rh | head
sudo du -sh /var/lib/postgresql/16/main/pg_wal \
            /var/lib/postgresql/16/main/base/pgsql_tmp

A dominant pg_wal points to WAL/slot/archiving issues; a dominant pgsql_tmp points to query spill; a dominant base points to bloat or genuine growth.

Step 3: Check WAL retention causes (archiving and slots)

SELECT failed_count, last_failed_wal FROM pg_stat_archiver;

SELECT slot_name, active,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained
FROM pg_replication_slots
ORDER BY pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) DESC;

A climbing failed_count or a large retained value on an inactive slot tells you exactly why WAL will not recycle.

Step 4: Free emergency space, then let the database catch up

# Safe quick wins outside PostgreSQL's data files:
sudo journalctl --vacuum-size=200M
sudo find /var/log/postgresql -name '*.log.*' -mtime +1 -delete

# Drop a confirmed-dead replication slot to release retained WAL
psql -c "SELECT pg_drop_replication_slot('standby_2_slot');"

Never delete files inside pg_wal or base by hand — drop the slot or fix archiving and let PostgreSQL recycle WAL safely.
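Before running a find with -delete during an incident, it can help to preview exactly what it would remove. This is a minimal sketch that mirrors the log cleanup above but only prints the matches; the function name `preview_log_cleanup` is hypothetical, and the directory and age threshold are parameters you supply.

```shell
# Print rotated log files that the cleanup would delete, without deleting them.
# $1: log directory   $2: age threshold in days for -mtime (default 1)
preview_log_cleanup() {
  local dir="$1" days="${2:-1}"
  # '*.log.*' matches rotated files such as postgresql-16-main.log.1
  # but not the active postgresql-16-main.log.
  find "$dir" -type f -name '*.log.*' -mtime +"$days" -print
}

# Usage:
#   preview_log_cleanup /var/log/postgresql 1
```

Once the list looks right, the same find expression with -delete in place of -print removes those files.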

Step 5: Reclaim space inside the database

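-- Mark dead tuples reusable (plain VACUUM takes no exclusive lock but rarely shrinks files)
VACUUM (VERBOSE, ANALYZE) events;

-- Cap temp-file usage per session (the limit applies to each backend process)
ALTER SYSTEM SET temp_file_limit = '10GB';
SELECT pg_reload_conf();

Plain VACUUM makes dead space reusable for new rows but usually does not return it to the operating system. To shrink the files, use VACUUM FULL (exclusive lock, plus free space for a full copy of the table) or pg_repack (no long exclusive lock, but it also needs free space for a full copy). On a disk that is still nearly full, free space by other means first.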

Example Root Cause Analysis

A primary database starts rejecting writes with could not extend file ... No space left on device, and monitoring shows the data volume at 100%.

df confirms /var/lib/postgresql is full, and du shows where:

33G     /var/lib/postgresql/16/main/pg_wal
58G     /var/lib/postgresql/16/main/base

pg_wal at 33G is abnormal for this workload. Archiving is healthy (failed_count is 0), so the next suspect is a replication slot:

   slot_name    | active |  retained
----------------+--------+-----------
 standby_2_slot | f      | 28 GB

standby_2_slot is inactive and holding 28 GB of WAL. The team had decommissioned the second standby a week earlier but never dropped its slot, so the primary kept every WAL segment since, slowly filling the disk.

Fix: drop the orphaned slot so PostgreSQL can recycle the retained WAL:

SELECT pg_drop_replication_slot('standby_2_slot');
CHECKPOINT;

Within a checkpoint cycle, pg_wal shrinks back to its normal size, free space returns, and writes succeed again. To prevent recurrence, the team adds max_slot_wal_keep_size so a single stuck slot can no longer fill the disk.

Prevention Best Practices

  • Monitor and alert on filesystem usage for the data, WAL, and log volumes well before 100% (e.g., page at 80%). A full disk is far cheaper to prevent than to recover.
  • Set max_slot_wal_keep_size so a dead or lagging replication slot cannot retain WAL without bound, and audit pg_replication_slots regularly for inactive slots.
  • Alert on pg_stat_archiver.failed_count; a broken archive_command silently grows pg_wal until the disk fills.
  • Tune autovacuum to keep up with churn, and bound query spill with temp_file_limit so one bad analytical query cannot exhaust the disk.
  • Configure log rotation (logrotate or log_rotation_size/log_rotation_age) and avoid log_statement = all in production.
  • For fast triage, the free incident assistant can turn a disk-full log block into the likely cause — WAL, slot, bloat, or temp spill.
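As one way to act on the first recommendation, a small filter over df output can flag volumes that have crossed the alert threshold. This is a sketch rather than a monitoring setup: the function name `over_threshold` is hypothetical, it expects POSIX-format `df -P` output on stdin, and it assumes mount points contain no spaces.

```shell
# Read `df -P` output on stdin and print "<mount point> <used %>" for every
# filesystem at or above the threshold (default 80).
over_threshold() {
  local limit="${1:-80}"
  awk -v limit="$limit" '
    NR > 1 {
      pct = $5
      sub(/%/, "", pct)          # strip the trailing percent sign
      if (pct + 0 >= limit) print $6, pct
    }'
}

# Usage:
#   df -P /var/lib/postgresql /var/log/postgresql | over_threshold 80
```

Wiring the output into a pager or cron mail is left to whatever alerting you already run.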

Quick Command Reference

# Which filesystem is full?
df -h /var/lib/postgresql /var/log/postgresql

# Biggest on-disk consumers in the data directory
sudo du -h --max-depth=1 /var/lib/postgresql/16/main | sort -rh | head
sudo du -sh /var/lib/postgresql/16/main/pg_wal /var/lib/postgresql/16/main/base/pgsql_tmp

# Database sizes
psql -c "SELECT datname, pg_size_pretty(pg_database_size(datname)) FROM pg_database ORDER BY pg_database_size(datname) DESC;"

# Archiving health
psql -c "SELECT archived_count, failed_count, last_failed_wal FROM pg_stat_archiver;"

# WAL retained by replication slots
psql -c "SELECT slot_name, active, pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained FROM pg_replication_slots ORDER BY pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) DESC;"

# Temp-file spill per database
psql -c "SELECT datname, temp_files, pg_size_pretty(temp_bytes) FROM pg_stat_database WHERE temp_files > 0;"

# Bloat (dead tuples) by table
psql -c "SELECT relname, n_dead_tup, n_live_tup FROM pg_stat_user_tables ORDER BY n_dead_tup DESC LIMIT 5;"

# Reclaim: drop a dead slot, vacuum, cap temp files
psql -c "SELECT pg_drop_replication_slot('<SLOT>');"
psql -c "VACUUM (VERBOSE, ANALYZE) <TABLE>;"
psql -c "ALTER SYSTEM SET temp_file_limit = '10GB';" -c "SELECT pg_reload_conf();"

Conclusion

A No space left on device error means a PostgreSQL write hit a full filesystem — most damagingly when it strikes pg_wal and PANICs the server. The usual root causes:

  1. The data directory’s volume is genuinely full from normal growth.
  2. WAL bloat because a failing archive_command blocks segment recycling.
  3. Table and index bloat from churn that autovacuum has not reclaimed.
  4. Temp files from large sorts/hashes spilling under pgsql_tmp.
  5. An inactive or lagging replication slot retaining WAL indefinitely.
  6. Unrotated, verbose log files filling the disk.

Find the full filesystem with df, find the biggest consumer with du, then reclaim safely — fix archiving or drop a dead slot rather than deleting WAL by hand. More PostgreSQL guides are in the Postgres category.
