AI for Linux Admins · By James Joyner IV · 10 min read

Linux Error: Input/output error — Cause, Fix, and Troubleshooting Guide

How to fix the Linux 'Input/output error' (EIO): diagnose failing disks with SMART and dmesg, distinguish device faults from NFS drops, and recover safely.

Summary

Input/output error maps to the errno EIO (5). It is the kernel telling you a low-level read or write against a device failed and could not be completed. Unlike a busy or permission error, Input/output error usually points at hardware or transport trouble — a failing disk, a flaky cable/controller, a dropped SAN/iSCSI path, or a broken NFS backend. Treat it as a data-integrity event first and a config problem second.

Common Symptoms

  • cat, cp, or dd on a specific file returns Input/output error partway through.
  • dmesg fills with I/O error, blk_update_request, or EXT4-fs error lines.
  • A filesystem remounts itself read-only after a write failure.
  • smartctl reports reallocated or pending sectors climbing.
  • On NFS, all access to an export returns EIO after the server or network faulted.

Most Likely Causes of the ‘Input/output error’ Error

An Input/output error is a failed device operation. In production, the causes are, most likely first:

  1. A failing or dying disk — bad/pending sectors, a drive going offline mid-operation. SMART and dmesg will show it.
  2. A flaky transport — a loose/failing SATA/SAS cable, a controller (HBA) fault, or a dropped multipath/iSCSI/SAN path so the block device times out.
  3. A dead or faulted NFS backend — the export’s underlying storage failed, so the client surfaces EIO rather than a hang.
  4. Filesystem metadata corruption — the kernel hits an inconsistency (EXT4-fs error) and returns EIO while flagging the fs.
  5. A device that went read-only or offline — the block device state flipped to offline, so every I/O errors out.
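Cause 5 can be checked across every disk at once rather than one device at a time. The following is a minimal sketch, assuming SCSI-style devices that expose a device/state file under /sys/block (NVMe and some virtual devices may not). The function name list_offline_devices and its optional directory argument are illustrative and not part of any standard tool.

```shell
# Print "<device> <state>" for every block device whose state is not "running".
# The optional first argument overrides the sysfs root; it is an assumption
# added here so the function can be pointed at a test directory.
list_offline_devices() {
    sysroot="${1:-/sys/block}"
    for state_file in "$sysroot"/*/device/state; do
        # Skip devices without a readable state file (and an unmatched glob).
        [ -r "$state_file" ] || continue
        state=$(cat "$state_file")
        if [ "$state" != "running" ]; then
            dev=${state_file#"$sysroot"/}   # strip the sysfs root prefix
            dev=${dev%%/*}                  # keep only the device name
            printf '%s %s\n' "$dev" "$state"
        fi
    done
}

# Usage: list_offline_devices
```

No output means every device that reports a state is running. Any line printed names a device to look up in dmesg.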

Quick Triage

# The kernel's account of the failure (most important single command)
dmesg -T | grep -Ei 'I/O error|blk_update_request|EXT4-fs error|offline|medium error' | tail
# Which device/mount is affected?
findmnt
lsblk -o NAME,SIZE,STATE,MOUNTPOINT

dmesg almost always names the failing device (e.g. sdb) and the sector — start there.
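When the log is noisy, it can help to tally errors per device before reading individual lines. This is a hedged sketch: it assumes the two common message shapes "I/O error, dev sdb, sector N" and "Buffer I/O error on dev sdb1", which vary between kernel versions, and count_io_errors_by_device is a hypothetical helper name.

```shell
# Read kernel log lines on stdin and print "<device> <count>", busiest first.
# Matches "I/O error, dev sdb, ..." and "Buffer I/O error on dev sdb1, ...".
count_io_errors_by_device() {
    grep -Eo 'I/O error,? (on )?dev [[:alnum:]_-]+' \
        | awk '{ print $NF }' \
        | sort | uniq -c | sort -rn \
        | awk '{ print $2, $1 }'
}

# Usage: dmesg | count_io_errors_by_device
```

A single device dominating the tally points at that disk or its cable. Errors spread across several devices on the same controller point at the transport instead.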

Diagnostic Commands

# Full kernel error trail: bad sectors, resets, path failures
dmesg -T | grep -Ei 'error|reset|offline|medium|timeout' | tail -40
# Is the block device even online?
cat /sys/block/sdb/device/state    # expect "running", not "offline"
# SMART health and error/reallocated-sector counters (install smartmontools)
sudo smartctl -H /dev/sdb
sudo smartctl -a /dev/sdb | grep -Ei 'Reallocated|Pending|Uncorrectable|Health'

Rising Reallocated_Sector_Ct, Current_Pending_Sector, or Offline_Uncorrectable means the drive is failing — replace it, do not repair it.
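To track these counters over time, the raw values can be pulled out of the attribute table. The sketch below assumes the ten-column ATA attribute layout printed by smartctl -A, with RAW_VALUE in the last column. SAS and NVMe drives report health differently, so treat the parsing as an assumption to verify against your own output. The function name smart_nonzero_counters is illustrative.

```shell
# Read an ATA attribute table (as printed by "smartctl -A") on stdin.
# Print NAME=RAW for each watched counter whose raw value is above zero,
# and return 1 if any were found, 0 otherwise.
smart_nonzero_counters() {
    awk '
        $2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable)$/ && $10 + 0 > 0 {
            print $2 "=" $10
            bad = 1
        }
        END { exit bad + 0 }
    '
}

# Usage: sudo smartctl -A /dev/sdb | smart_nonzero_counters
```

A nonzero value on its own is a snapshot, not a trend. Saving the output on a schedule and comparing runs shows whether the counts are actually rising.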

# Read-only, non-destructive surface scan (never run the write test on live data)
sudo badblocks -sv -b 4096 /dev/sdb

Warning: badblocks in write mode (-w) destroys all data on the device. Only ever run the read-only form (-sv, no -w) on a disk you care about, and prefer running it on an unmounted device.

# NFS angle: is the backend actually reachable/exporting?
findmnt -t nfs,nfs4
showmount -e <nfs-server>
nfsstat -m

If the affected mount is NFS and showmount -e fails, the fault is the server/network, not a local disk.
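The local-versus-network decision can be scripted so a runbook branches early. This is a small sketch under stated assumptions: classify_fstype is a hypothetical helper, it only recognises the nfs and nfs4 types discussed here, and /mnt/data in the usage line is a placeholder path.

```shell
# Classify a filesystem type string: "network" for NFS, "local" for anything
# else, and "unknown" when no type was supplied.
classify_fstype() {
    case "$1" in
        nfs|nfs4) echo network ;;
        '')       echo unknown ;;
        *)        echo local ;;
    esac
}

# Usage (the path is a placeholder):
#   classify_fstype "$(findmnt -n -o FSTYPE --target /mnt/data)"
```

A result of network sends you to the server and the network path first. A result of local sends you back to dmesg and smartctl on the host itself.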

Fix / Remediation

  1. Back up readable data immediately. If a disk is throwing EIO, copy what you still can before doing anything else — a failing drive gets worse:

    sudo ddrescue -f -n /dev/sdb /dev/sdc rescue.map   # image the failing disk first; /dev/sdc must be a spare of equal or larger size and will be overwritten
  2. Replace failing hardware. If SMART shows reallocated/pending/uncorrectable sectors climbing, the disk is dying — swap it and restore from backup. No filesystem repair fixes bad media.

  3. Reseat/repair the transport. For cable/controller/path faults (resets in dmesg, state flapping), reseat cables, check the HBA, or fail over/restore the multipath/iSCSI path:

    sudo multipath -ll
    echo running | sudo tee /sys/block/sdb/device/state   # re-enable an offlined device (only if hardware is sound)
  4. Address NFS-backend faults on the server side: restore the export’s storage, confirm showmount -e succeeds, then remount the client.

  5. If the fault is filesystem corruption (EXT4-fs error with sound hardware), do not repair a mounted or actively-failing filesystem.

    Warning: fsck/xfs_repair can lose or alter data and must never run on a mounted filesystem. Unmount first, image or back up the device, and run a read-only check (fsck -n / xfs_repair -n) before any repair. Full walkthrough: recovering corrupted Linux filesystems with fsck.

    sudo umount /dev/sdb1
    sudo fsck -n /dev/sdb1        # ext4: read-only check first (Ubuntu/Debian, RHEL/Rocky)
    sudo xfs_repair -n /dev/sdb1  # xfs: dry run (RHEL/Rocky default fs)
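The "never on a mounted filesystem" rule in step 5 can be enforced with a guard before any check runs. This is a minimal sketch, assuming the device appears in /proc/mounts under exactly the name you pass in. Device-mapper, LVM, and by-UUID names may differ, so a miss here is not proof the filesystem is unmounted. The function name is_mounted and its optional file argument are illustrative.

```shell
# Return 0 if DEVICE appears as a mount source in the mounts table, else 1.
# The optional second argument (default /proc/mounts) exists for testing.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Usage: refuse to go further while the filesystem is still mounted.
#   if is_mounted /dev/sdb1; then
#       echo "unmount /dev/sdb1 first" >&2
#   else
#       sudo fsck -n /dev/sdb1
#   fi
```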

Validation

# No new I/O errors after the fix
dmesg -T | grep -Ei 'I/O error|EXT4-fs error' | tail
# Device back online and SMART healthy
cat /sys/block/sdb/device/state
sudo smartctl -H /dev/sdb
# Files read end-to-end
find /mnt/data -type f -print0 | xargs -0 -r cat >/dev/null && echo "reads clean"
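If the end-to-end read fails, the next question is which files are affected, since those are the ones to restore from backup. The sketch below is one hedged way to list them. list_unreadable_files is a hypothetical helper, and -xdev keeps the walk on a single filesystem, which is an assumption about what you want to validate.

```shell
# Read every regular file under DIR and print the path of each one that
# cannot be read in full. Prints nothing when all reads succeed.
list_unreadable_files() {
    find "$1" -xdev -type f -exec sh -c '
        for f in "$@"; do
            cat -- "$f" >/dev/null 2>&1 || printf "%s\n" "$f"
        done
    ' sh {} +
}

# Usage: list_unreadable_files /mnt/data > unreadable.txt
```

File names are passed to the inner shell as arguments rather than pasted into the command string, so names containing quotes or spaces are handled safely.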

Prevention

  • Run scheduled SMART self-tests and alert on rising Reallocated/Pending/Uncorrectable counts; disk EIO is often preceded by climbing counters, although some drives fail without warning.
  • Deploy smartd (smartmontools) with email/webhook alerts on every host with local disks.
  • Use redundant storage (RAID/mirroring, multipath) so a single failing device does not surface EIO to applications.
  • For NFS, keep the default hard mount behavior so the client retries through backend faults instead of failing in-flight I/O the way a soft mount does; use timeo and retrans to tune retry pacing, and monitor the server’s disk health too.
  • Keep tested, restorable backups — EIO from bad media is a replace-and-restore event, not a repair.
  • Alert on filesystems remounting read-only (a common kernel response to write EIO).
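The read-only remount alert in the last bullet can be fed by a small check run from cron or a monitoring agent. This is a sketch under assumptions: list_readonly_mounts is a hypothetical helper, it only considers sources under /dev, and it will also list filesystems you mounted read-only on purpose (for example loop-mounted images), so filter those as needed.

```shell
# Print the mount point of every /dev-backed filesystem mounted read-only.
# Reads a mounts table in /proc/mounts format (optional first argument).
list_readonly_mounts() {
    awk '
        $1 ~ /^\/dev\// {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "ro") { print $2; break }
        }
    ' "${1:-/proc/mounts}"
}

# Usage: list_readonly_mounts
```

The options field is split on commas and compared exactly, so an option such as errors=remount-ro on a healthy read-write mount does not trigger a false alert.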

Final Notes

Input/output error is a device-level failure, so lead with dmesg and smartctl to separate dying media from a flaky transport or a dead NFS backend. If SMART is degrading, image the disk and replace it — no repair fixes bad sectors. Only when hardware is proven sound and the fault is metadata corruption should you unmount, back up, and run fsck -n/xfs_repair -n before any write repair.
