GitLab CI Error Guide: 'no space left on device' Runner Disk Exhaustion
Fix GitLab CI 'no space left on device' errors: prune Docker images and volumes, clear runner build cache and artifacts, free /tmp, and resolve inode exhaustion.
- #gitlab-cicd
- #troubleshooting
- #errors
- #disk
Overview
When a GitLab job fails with no space left on device, the runner host (or the Docker daemon backing it) has run out of disk. The job was executing fine until something (a clone, an npm install, a docker build, an artifact upload) tried to write a byte and the filesystem refused. Unlike a flaky network error, this one is deterministic: every job on that runner will keep failing the same way until you free space, because the disk does not heal itself between pipelines.
The message shows up wherever the write happened, so the exact path is a clue about which step overran the disk:
ERROR: write /builds/group/app/node_modules/.cache/webpack/0.pack: no space left on device
ERROR: Job failed: exit code 1
A second, scarier variant appears once the disk is so full that the Docker daemon itself can no longer write its own state — at that point even cleanup commands start failing:
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
ERROR: Job failed (system failure): Error response from daemon: failed to create task: no space left on device
The error code is fixed (no space left on device, ENOSPC); the cause varies. It is almost always an accumulation problem on a long-lived runner — old Docker layers, stale build directories, unbounded cache — not a bug in your .gitlab-ci.yml logic.
Symptoms
- Jobs that used to pass start failing mid-run with no space left on device, often during clone, dependency install, docker build, or artifact upload.
- Every job on one specific runner fails the same way while jobs on other runners are fine.
- The Docker-based jobs report Cannot connect to the Docker daemon after the disk hits 100%.
- df -h shows a filesystem (usually / or /var/lib/docker) at 100% Used, or df -i shows Inodes at 100% even though space looks free.
- Artifact or cache upload fails with tar: Wrote only N of M bytes: No space left on device.
df -h
Filesystem Size Used Avail Use% Mounted on
/dev/root 49G 49G 0 100% /
/dev/nvme1n1 197G 131G 56G 71% /var/lib/docker
tmpfs 7.8G 1.2M 7.8G 1% /run
df -i
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/root 3276800 3276800 0 100% /
/dev/nvme1n1 13107200 4102233 9004967 32% /var/lib/docker
The first output shows the root filesystem full while the data partition has room — a small-partition problem. The second shows free space but 100% inodes — exhaustion by millions of tiny files, not bytes.
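The bytes-versus-inodes check above can be scripted. The following is a minimal sketch, not a definitive tool: the function name full_mounts and the default threshold of 90 are illustrative assumptions, and it assumes POSIX-format df output (df -P) and mount points without spaces.

```shell
# Hypothetical helper: reads POSIX-format df output (df -P or df -iP) on stdin
# and prints every mount whose usage is at or above the threshold (default 90).
full_mounts() {
  local threshold="${1:-90}"
  awk -v t="$threshold" 'NR > 1 {
    pct = $5                      # fifth column is Capacity / IUse%
    sub(/%/, "", pct)
    # Skip pseudo-filesystems that report "-" instead of a number.
    if (pct ~ /^[0-9]+$/ && pct + 0 >= t) print $6 " " pct "%"
  }'
}

# Usage on a runner host:
#   df -P  | full_mounts 90   # mounts short on bytes
#   df -iP | full_mounts 90   # mounts short on inodes
```

A mount that appears only in the second command's output is the inode-exhaustion case: bytes are free, file slots are not.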
Common Root Causes
1. /var/lib/docker filled by accumulated images, containers, and build cache
A long-lived docker or docker-machine runner never garbage-collects on its own. Every pulled image, exited container, anonymous volume, and BuildKit cache layer piles up under /var/lib/docker until it consumes the whole partition.
docker system df
TYPE TOTAL ACTIVE SIZE RECLAIMABLE
Images 412 6 88.4GB 81.2GB (91%)
Containers 230 2 3.1GB 3.0GB (98%)
Local Volumes 148 3 24.6GB 23.9GB (97%)
Build Cache 1971 0 41.8GB 41.8GB (100%)
ERROR: failed to register layer: write /var/lib/docker/overlay2/.../diff/...: no space left on device
ERROR: Job failed: exit code 1
RECLAIMABLE near 100% across images, volumes, and build cache is the signature. Almost all of that disk is dead weight from past jobs.
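To spot that signature without reading the table by eye, a small filter can flag the resource types that are mostly reclaimable. This is a sketch under assumptions: the function name mostly_reclaimable and the 80% default are hypothetical, and it assumes your Docker version accepts the .Type and .Reclaimable placeholders for docker system df --format.

```shell
# Hypothetical helper: reads "Type<TAB>Reclaimable" lines on stdin and prints
# the types whose reclaimable share is at or above the threshold (default 80).
# Lines without a "(NN%)" suffix are skipped rather than guessed at.
mostly_reclaimable() {
  local threshold="${1:-80}"
  awk -F '\t' -v t="$threshold" '
    match($2, /\([0-9]+%\)/) {
      # Strip the surrounding "(" and "%)" to get the bare number.
      pct = substr($2, RSTART + 1, RLENGTH - 3)
      if (pct + 0 >= t) print $1 ": " $2
    }'
}

# Usage on a runner host (assumed format placeholders):
#   docker system df --format '{{.Type}}\t{{.Reclaimable}}' | mostly_reclaimable 80
```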
2. The runner’s builds_dir and git clones are never cleaned between jobs
The runner clones the repo into builds_dir (default /builds inside the container, or under the runner’s working dir for the shell executor). If the executor reuses the host or the build dir is mounted from the host, stale clones, leftover node_modules, and partial checkouts from old jobs accumulate.
du -sh /home/gitlab-runner/builds/* 2>/dev/null | sort -h | tail
1.2G /home/gitlab-runner/builds/abc123/0/group/app
3.8G /home/gitlab-runner/builds/abc123/0/group/data-service
11.4G /home/gitlab-runner/builds/abc123/0/group/monolith
fatal: write error: No space left on device
fatal: unable to write file builds/group/monolith/...: No space left on device
ERROR: Job failed: exit code 128
The clone step itself fails (git exit 128) because the build directory partition is full of old checkouts that were never reclaimed.
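One way to find candidates for removal is to list build directories that have not been touched recently. This is only a sketch: stale_build_dirs and the 7-day default are assumptions, and directory modification time is a rough signal (it changes when entries directly inside the directory are added or removed), so review the list before deleting anything.

```shell
# Hypothetical helper: prints the immediate subdirectories of a builds root
# whose modification time is more than N days old (default 7).
# It only lists; deletion is deliberately left to the operator.
stale_build_dirs() {
  local root="$1" days="${2:-7}"
  find "$root" -mindepth 1 -maxdepth 1 -type d -mtime +"$days" -print
}

# Usage (the path mirrors the du example above):
#   stale_build_dirs /home/gitlab-runner/builds/abc123/0/group 14
```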
3. A single job writes huge artifacts/cache or downloads a massive dataset
One greedy job can fill the disk on its own — pulling a multi-gigabyte dataset to /tmp, generating an enormous coverage report, or declaring an artifacts:/cache: path that sweeps in node_modules or build output.
# .gitlab-ci.yml: this quietly tars the entire workspace every run
test:
  script:
    - curl -o /tmp/dataset.tar.gz https://data.example.com/full-dump.tar.gz
    - tar xzf /tmp/dataset.tar.gz -C /tmp
  artifacts:
    paths:
      - ./   # grabs node_modules, build output, everything in the workspace
Uploading artifacts...
./: found 284113 matching artifact files and directories
tar: /builds/group/app/.tmp-artifacts: Wrote only 4096 of 10240 bytes: No space left on device
ERROR: Uploading artifacts as "archive" to coordinator... failed
The tar step that builds the artifact archive runs out of disk because the artifact path scoops up gigabytes it never needed to.
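A job can guard against this by measuring what it is about to archive before the upload step runs. The sketch below is illustrative: check_artifact_budget and the budget figure are assumptions, not GitLab features, and the check would have to be called from the job's own script.

```shell
# Hypothetical helper: fails (non-zero) when the given paths together use more
# than budget_mib MiB on disk. Prints the measured size for the job log.
check_artifact_budget() {
  local budget_mib="$1" total_kib total_mib
  shift
  # du -sck prints one line per path plus a final "total" line, all in KiB.
  total_kib="$(du -sck -- "$@" 2>/dev/null | awk '{ n = $1 } END { print n + 0 }')"
  total_mib=$(( (total_kib + 1023) / 1024 ))
  echo "artifact paths total ${total_mib} MiB (budget ${budget_mib} MiB)"
  [ "$total_mib" -le "$budget_mib" ]
}

# Usage at the end of a job script, with an assumed 500 MiB budget:
#   check_artifact_budget 500 dist/ coverage/
```

Failing early with a clear size message is easier to debug than a tar error halfway through the archive.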
4. Docker build layer cache / BuildKit cache grows unbounded
Jobs that run docker build (or docker buildx) on a persistent daemon accumulate BuildKit cache mounts and intermediate layers. Without pruning, the build cache alone can dwarf your actual images.
docker buildx du
Reclaimable: 41.8GB
Total: 43.0GB
ID RECLAIMABLE SIZE LAST ACCESSED
3k2j... true 9.1GB 6 days ago
9fa1... true 7.4GB 4 days ago
...
ERROR: failed to solve: failed to compute cache key: write /var/lib/docker/.../snapshots/...: no space left on device
ERROR: Job failed: exit code 1
docker system df can undercount BuildKit cache (for example, cache held by builders other than the default one); docker buildx du shows the cache size for the builder in use. Reclaimable in the tens of gigabytes means an unbounded build cache.
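Rather than wiping the whole cache, the prune can be bounded by age so recent layers stay warm. This is a hedged sketch: prune_build_cache, the DRY_RUN variable, and the 72h default are assumptions, and it assumes the installed buildx supports the until filter on docker buildx prune.

```shell
# Hypothetical helper: prints (DRY_RUN=1, the default) or runs a BuildKit cache
# prune limited to entries not used within max_age.
prune_build_cache() {
  local max_age="${1:-72h}"
  # Build the command once so the dry-run output matches what would execute.
  set -- docker buildx prune --force --filter "until=${max_age}"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage:
#   prune_build_cache 48h             # show the command only
#   DRY_RUN=0 prune_build_cache 48h   # actually prune
```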
5. Inode exhaustion — space free, but df -i is 100%
A filesystem can run out of inodes before it runs out of bytes. Projects with massive node_modules trees or generated artifacts create millions of tiny files; once inodes hit 100%, writes fail with the same ENOSPC even though df -h shows free space.
df -i /builds
sudo find /home/gitlab-runner/builds -xdev -type f | wc -l
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/root 3276800 3276800 0 100% /
2841190
ERROR: write /builds/group/app/node_modules/.pnpm/.../index.js: no space left on device
npm ERR! nospc ENOSPC: no space left on device, write
df -h will look healthy while df -i is pegged at 100%. The fix is to reduce the number of files, not their total size, usually by deleting stale node_modules trees and old build dirs.
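Because the problem is file count, it helps to rank directories by how many entries they hold rather than by size. A minimal sketch, assuming GNU find and file names without newlines; the name inode_hogs is hypothetical:

```shell
# Hypothetical helper: prints "count path" for each immediate subdirectory of
# root, largest first. Every file, directory, and symlink costs one inode, so
# the count includes the subdirectory itself.
inode_hogs() {
  local root="${1:-.}" dir count
  for dir in "$root"/*/; do
    [ -d "$dir" ] || continue
    # -xdev keeps the walk on one filesystem, matching the df -i reading.
    count=$(( $(find "$dir" -xdev | wc -l) ))
    echo "$count ${dir%/}"
  done | sort -rn
}

# Usage:
#   sudo sh -c '. ./inode_hogs.sh; inode_hogs /home/gitlab-runner/builds' | head
```

The script file name in the usage line is also an assumption; any way of loading the function into a root shell works.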
6. dind storage or a small partition fills while the data disk has room
With Docker-in-Docker (docker:dind), the inner daemon writes to its own ephemeral storage; if it is not given a roomy volume it fills fast. Equally common: the root partition (or a tiny /tmp) hits 100% while the big data partition still has space, because something wrote to the wrong path.
# run inside the dind service container
df -h /var/lib/docker /tmp /
Filesystem Size Used Avail Use% Mounted on
overlay 10G 10G 0 100% /var/lib/docker # dind's tiny default
tmpfs 2.0G 2.0G 0 100% /tmp
/dev/nvme1n1 197G 38G 159G 20% /
ERROR: failed to register layer: ApplyLayer ... no space left on device
ERROR: Job failed (system failure): preparing environment: ... no space left on device
The data partition is 20% used but dind’s 10G overlay and /tmp are full. The job dies even though the host “has plenty of disk.”
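Since the failing path may sit on a different filesystem than the one with room, a job can check the specific path it is about to write to. The sketch below is illustrative; require_free_mib and the sizes in the usage lines are assumptions.

```shell
# Hypothetical helper: succeeds only if the filesystem holding target has at
# least min_mib MiB available. Prints what it measured for the job log.
require_free_mib() {
  local target="$1" min_mib="$2" avail_kib avail_mib
  # df -Pk keeps each filesystem on one line; column 4 is available KiB.
  avail_kib="$(df -Pk -- "$target" | awk 'NR == 2 { print $4 }')"
  avail_mib=$(( ${avail_kib:-0} / 1024 ))
  echo "${target}: ${avail_mib} MiB available, ${min_mib} MiB required"
  [ "$avail_mib" -ge "$min_mib" ]
}

# Usage before a heavy step:
#   require_free_mib /tmp 4096 || exit 1
#   require_free_mib /var/lib/docker 20480 || exit 1
```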
Diagnostic Workflow
Step 1: Confirm which filesystem is full — bytes or inodes
df -h
df -i
df -h finds the full byte filesystem; df -i catches inode exhaustion when df -h looks fine. Note which mount is at 100% — /, /var/lib/docker, /tmp, or the builds partition — because that decides the rest of the workflow.
Step 2: Find the biggest offender on the full filesystem
sudo du -xh / 2>/dev/null | sort -h | tail -20
sudo du -sh /var/lib/docker /home/gitlab-runner/builds /tmp 2>/dev/null
du -x stays on one filesystem so you do not chase mounted volumes. This tells you whether Docker, stale build directories, or a runaway /tmp download is eating the disk.
Step 3: Inspect Docker’s accumulated state
docker system df
docker buildx du
If RECLAIMABLE is high across images, containers, volumes, and build cache, the runner has simply never been pruned (Root Causes 1 and 4). docker buildx du reveals BuildKit cache that docker system df can underreport.
Step 4: Check runner build directories and config
du -sh /home/gitlab-runner/builds/* 2>/dev/null | sort -h | tail
grep -E 'builds_dir|cache_dir|disable_cache' /etc/gitlab-runner/config.toml
Old clones in builds_dir (Root Cause 2) and a host-mounted, never-cleared cache_dir are common culprits. Note whether [runners.docker] disable_cache is set.
Step 5: Reclaim space and re-run
# Reclaim Docker space (drop the volumes flag if you keep named volumes)
docker system prune -af --volumes
docker buildx prune -af
# Clear stale runner clones and caches
sudo rm -rf /home/gitlab-runner/builds/* /home/gitlab-runner/cache/*
After freeing space, confirm with df -h / df -i, then retry the pipeline. If it recurs within days, the real fix is automated cleanup (see Prevention), not another manual prune.
Example Root Cause Analysis
Every job on the docker-1 runner has started failing during docker build. Other runners are fine; this one has been live for months without a rebuild.
The job log points at the daemon’s own storage:
ERROR: failed to register layer: write /var/lib/docker/overlay2/2f.../diff/usr/lib/...: no space left on device
ERROR: Job failed: exit code 1
Check the disk on the runner host:
df -h
df -i
Filesystem Size Used Avail Use% Mounted on
/dev/nvme1n1 197G 197G 0 100% /var/lib/docker
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/nvme1n1 13107200 4102233 9004967 32% /var/lib/docker
Bytes are at 100% but inodes are fine, so this is accumulated data, not tiny files. Ask Docker what is reclaimable:
docker system df
docker buildx du
TYPE TOTAL ACTIVE SIZE RECLAIMABLE
Images 412 6 88.4GB 81.2GB (91%)
Build Cache 1971 0 41.8GB 41.8GB (100%)
Local Volumes 148 3 24.6GB 23.9GB (97%)
Reclaimable: 41.8GB / 43.0GB
Over 140 GB of the 197 GB partition is dead images, exited containers, orphaned volumes, and BuildKit cache from months of pipelines. Nothing was ever pruned.
Fix: prune Docker state, prune the build cache explicitly, and verify the partition recovers.
docker system prune -af --volumes
docker buildx prune -af
df -h /var/lib/docker
Total reclaimed space: 143.7GB
Filesystem Size Used Avail Use% Mounted on
/dev/nvme1n1 197G 54G 143G 27% /var/lib/docker
The partition drops from 100% to 27%, the next pipeline builds cleanly, and a daily prune timer (below) keeps it from creeping back to full.
Prevention Best Practices
- Schedule a cleanup cron or systemd timer on every long-lived runner: a nightly docker system prune -af --volumes && docker buildx prune -af keeps /var/lib/docker from creeping to 100%.
- Set a sane builds_dir / cache_dir retention story: clear stale clones between jobs, and avoid host-mounting cache_dir unless you actively manage its size.
- Scope artifacts: and cache: paths tightly so you never tar node_modules or build output into the archive; exclude generated trees explicitly.
- Put /var/lib/docker (and /builds) on a dedicated, generously sized partition, and give dind a roomy volume instead of its tiny default overlay so a small root or /tmp never trips the whole runner.
- Alert on disk early: page on df -h / df -i crossing ~80% rather than waiting for the first failed job, since inode exhaustion (df -i) is easy to miss.
- For fast triage when a runner-disk failure storm hits, the free incident assistant can read the no space left on device log and point at the likely offender. More pipeline fixes live in the GitLab CI/CD guides.
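The scheduled cleanup from the first bullet can be made conditional, so it prunes only when the disk is actually filling. This is a sketch under assumptions: usage_pct, cleanup_if_needed, the CLEANUP_DRY_RUN variable, and the 80% default are all hypothetical; only the two prune commands come from this guide.

```shell
# Hypothetical helper: prints the usage percent (no % sign) of the filesystem
# holding a path, or 0 when df reports no numeric value.
usage_pct() {
  df -P -- "$1" | awk 'NR == 2 { p = $5; sub(/%/, "", p); print ((p ~ /^[0-9]+$/) ? p : 0) }'
}

# Hypothetical helper: runs this guide's prune commands when usage is at or
# above the threshold. CLEANUP_DRY_RUN=1 (the default) only prints them.
cleanup_if_needed() {
  local target="$1" threshold="${2:-80}" pct cmd
  pct="$(usage_pct "$target")"
  if [ "${pct:-0}" -lt "$threshold" ]; then
    echo "skip: $target at ${pct}%"
    return 0
  fi
  for cmd in "docker system prune -af --volumes" "docker buildx prune -af"; do
    if [ "${CLEANUP_DRY_RUN:-1}" = "1" ]; then
      echo "would run: $cmd"
    else
      $cmd   # unquoted on purpose: the string is split into command and flags
    fi
  done
}

# Usage from cron or a systemd timer:
#   CLEANUP_DRY_RUN=0 cleanup_if_needed /var/lib/docker 80
```

Keeping dry-run as the default makes it safe to test the threshold logic on a live runner before letting the timer delete anything.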
Quick Command Reference
# Which filesystem is full — bytes vs inodes?
df -h
df -i
# Find the biggest offender (stay on one filesystem)
sudo du -xh / 2>/dev/null | sort -h | tail -20
sudo du -sh /var/lib/docker /home/gitlab-runner/builds /tmp 2>/dev/null
# What is Docker holding that it could reclaim?
docker system df
docker buildx du
# Reclaim Docker space (images, containers, volumes, build cache)
docker system prune -af --volumes
docker buildx prune -af
# Clear stale runner clones and caches
sudo rm -rf /home/gitlab-runner/builds/* /home/gitlab-runner/cache/*
# Inspect runner build/cache config
grep -E 'builds_dir|cache_dir|disable_cache' /etc/gitlab-runner/config.toml
# Nightly cleanup timer payload (cron example)
# 0 3 * * * docker system prune -af --volumes && docker buildx prune -af
Conclusion
A no space left on device failure is the runner telling you a filesystem it needs to write to is out of space — or out of inodes. The usual root causes:
- /var/lib/docker filled by accumulated images, containers, volumes, and build cache on a long-lived runner.
- The runner’s builds_dir and old git clones never cleaned between jobs.
- A single job writing huge artifacts/cache or downloading a massive dataset to /tmp or the build dir.
- An unbounded Docker/BuildKit build-layer cache.
- Inode exhaustion: df -h shows space free, but df -i is at 100% from millions of small files.
- dind storage or a small / or /tmp partition filling while the data disk still has room.
Run df -h and df -i first to learn whether you are out of bytes or inodes, then find the offender with du and docker system df. The durable fix is almost always a scheduled prune plus tighter artifact and build-directory hygiene, so the disk never creeps back to full.