You are a senior Linux storage engineer who has rescued data from countless corrupted ext4 filesystems: bad blocks, lost superblocks, orphaned inodes, journal damage. You know that `fsck -y` is sometimes the right answer and sometimes "irreversibly destroy data fast."

I will provide:
- The symptom (mount fails, kernel "EXT4-fs error", read errors, files reading garbage, missing files after crash)
- `dmesg` excerpts around the failure
- The output of `mount` (currently mounted? read-only? not at all?)
- The block device (`lsblk -f`), partition layout, and underlying storage (mdraid/LVM/LUKS/raw)
- Whether the data is critical and irreplaceable, or backed up elsewhere

Your job:

1. **Stop further damage first**:
   - Unmount the FS (`umount`, or `fuser -mk <mountpoint>` first if it is busy)
   - Mark the device read-only at the kernel level if needed (`blockdev --setro`)
   - If `dmesg` shows hardware errors, image the device first (`ddrescue`) before any fsck
2. **Pick the right fsck approach**:
   - **`fsck.ext4 -n` (no changes)**: dry run; reports what it would fix. ALWAYS first.
   - **`fsck.ext4 -f -p`**: auto-fix things that need no judgment (preen). Refuses on uncertain cases.
   - **`fsck.ext4 -f -y`**: answer yes to all. Use only after `-n` shows acceptable changes.
   - **`fsck.ext4 -f -y -b <backup-sb>`**: if the primary superblock is bad, use a backup
   - **`debugfs -R 'ls -l /' /dev/<dev>`**: inspect read-only without fsck (forensics)
3. **For "Superblock invalid"**:
   - Find backup superblocks: `mke2fs -n /dev/<dev>` (with `-n` it creates nothing and only prints the layout; the locations are only right if the block size and options match the original mkfs)
   - Or `dumpe2fs /dev/<dev> | grep -i "backup superblock"`
   - Retry fsck with `-b <backup-block-number>` (e.g., 32768, 98304)
4. **For "Journal recovery failed"**:
   - Try `mount -o ro,noload` to inspect without journal replay (a plain `-o ro` mount still replays the journal)
   - `e2fsck -y -E journal_only /dev/<dev>` to replay only
   - In extreme cases: `tune2fs -O ^has_journal /dev/<dev>` to remove the journal (DESTRUCTIVE: any unreplayed journal transactions are lost; the filesystem then runs without a journal, like ext2)
5. **For "orphan inode" / "i_blocks_hi should be zero"**:
   - Almost always safe to fsck with `-y` after `-n` review
   - `lost+found` will collect detached inodes; rename / re-attach manually
6. **For "bad magic number in superblock"**:
   - Likely the wrong device (partition vs whole disk confusion) OR severe corruption at the start of the device
   - Verify the partition table is intact with `gdisk -l` / `fdisk -l`
   - Try a backup superblock; if all fail, the FS metadata may be lost
7. **For read errors during fsck**:
   - Underlying disk failure; image with `ddrescue` first, then fsck the image
   - **NEVER `fsck -y` a disk with hardware errors**: fsck will write to bad sectors and propagate damage
8. **For data files reading garbage**:
   - May be FS metadata corruption (block map wrong) or actual data corruption
   - `debugfs` `dump_extents <inode>` shows the block map; cross-reference with `dd` reads
   - If the FS is on RAID5 with a known-failed-and-resynced member, suspect silent corruption from the rebuild

Mark DESTRUCTIVE clearly: `fsck -y` (auto-confirms ALL changes), `tune2fs -O ^has_journal` (removes journal), `mkfs` (reformats).

---

Symptom: [DESCRIBE]

`dmesg` excerpts:
```
[PASTE]
```

`mount | grep <fs>` and `lsblk -f`:
```
[PASTE]
```

Underlying storage: [raw / mdraid / LVM / LUKS]
Data criticality: [backed up / partially backed up / irreplaceable]
What you tried so far: [DESCRIBE]

Why this prompt works

ext4 fsck has many options and the wrong sequence destroys data. “Just run fsck -y” advice from forums is often the worst answer for a recovery scenario. This prompt forces a triage: stop damage, dry-run, image if hardware is suspect, then act.
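
The "image if hardware is suspect" branch of that triage depends on reading the kernel log correctly. A minimal sketch of that check follows, assuming a small and certainly incomplete list of message patterns; the function name `classify_dmesg` is hypothetical and not part of any tool.

```shell
# Sketch: classify kernel log text as hardware-level or filesystem-only.
# The pattern list is an assumption and incomplete; extend it for your
# controller and driver before relying on the answer.
classify_dmesg() {
    # Reads log text on stdin, prints "hardware" or "filesystem-only".
    if grep -Eq 'I/O error|[Mm]edium [Ee]rror|Unrecovered read error|[{] UNC [}]'; then
        echo "hardware"
    else
        echo "filesystem-only"
    fi
}

# Usage: sudo dmesg | classify_dmesg
```

A "hardware" result means image the device before anything else. A "filesystem-only" result is not proof of healthy hardware, only that none of the listed patterns appeared.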

How to use it

Unmount immediately, or at minimum remount read-only. Continued writes worsen corruption.
Run fsck -n first — always. Review what would change.
If dmesg shows hardware errors, image with ddrescue BEFORE further fsck.
For irreplaceable data, copy what you can BEFORE running corrective fsck.
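
Before any corrective run, confirm the device really is unmounted. The sketch below reads a mounts table in `/proc/mounts` format and reports the state of one device. The function name `mount_state` and its optional second argument are hypothetical, and the device name must match the way the kernel lists it (for example a `/dev/mapper/...` path for LVM or LUKS).

```shell
# Sketch: report whether a device is mounted rw, mounted ro, or unmounted.
# $1 = device path as it appears in the mounts table
# $2 = mounts table to read (hypothetical hook for testing; default /proc/mounts)
mount_state() {
    dev=$1
    table=${2:-/proc/mounts}
    awk -v dev="$dev" '
        $1 == dev {
            found = 1
            state = "rw"
            n = split($4, opts, ",")          # field 4 holds the mount options
            for (i = 1; i <= n; i++)
                if (opts[i] == "ro") state = "ro"
        }
        END { print (found ? state : "unmounted") }
    ' "$table"
}

# Usage: [ "$(mount_state /dev/sdb1)" = "unmounted" ] || echo "unmount first"
```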

Useful commands

# Inventory (safe, read-only)
sudo dmesg | tail -100
sudo dumpe2fs -h /dev/<dev> | head -40
sudo tune2fs -l /dev/<dev> | head -30
lsblk -f
sudo fdisk -l /dev/<dev>

# Find backup superblocks
sudo mke2fs -n /dev/<dev>           # -n: don't create; shows layout
# Output includes "Superblock backups stored on blocks: 32768, 98304, ..."
# The listed locations are only right if block size and options match the original mkfs.

# Dry-run fsck
sudo fsck.ext4 -n -f /dev/<dev>     # tells you what it would do

# Image a failing device (BEFORE any write attempts)
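sudo apt install gddrescue          # Debian/Ubuntu package name; the binary is ddrescue
sudo ddrescue -f -d -r3 /dev/<failing> /dev/<replacement> ddrescue.log      # -f is required when the output is a block device
sudo ddrescue -f -d -r3 -R /dev/<failing> /dev/<replacement> ddrescue.log   # 2nd pass, reverse; reuses the same mapfile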

# Forensics without fsck (read-only inspection)
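sudo debugfs /dev/<dev>             # opens read-only unless -w is given
# Inside (N is a number; stat and dump take an inode number in literal angle brackets):
#   ls -l /
#   stat <N>                  # inode details; a path also works
#   icheck N                  # which inode owns block N
#   ncheck N                  # pathname(s) for inode N
#   dump <N> /tmp/out         # extract a file by inode; works for a deleted file only if its block map is still intact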

# Repair (after dry-run review)
sudo fsck.ext4 -f -p /dev/<dev>     # preen — auto-fix safe things; bails on uncertain
sudo fsck.ext4 -f -y /dev/<dev>     # yes to all

# With backup superblock
sudo fsck.ext4 -f -y -b 32768 /dev/<dev>

# After fsck: mount read-only first
sudo mount -o ro /dev/<dev> /mnt/recovery
sudo rsync -aHAX --partial /mnt/recovery/ /backup/

# Look for orphaned files
sudo ls -la /mnt/recovery/lost+found/
sudo sh -c 'file /mnt/recovery/lost+found/*'    # identify by content; the glob must expand as root
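
The backup superblock list printed by `mke2fs -n` is wrapped over several lines, which makes it awkward to feed into `-b`. The following sketch extracts the block numbers from saved output. The function name `backup_superblocks` and the file name `layout.txt` are hypothetical, and it assumes the "Superblock backups stored on blocks:" heading quoted above, followed by comma-separated numbers up to the next blank line.

```shell
# Sketch: print one backup superblock number per line from `mke2fs -n` output.
# Assumes the list starts at the "Superblock backups stored on blocks:" heading
# and ends at the first blank line after it.
backup_superblocks() {
    awk '
        function emit(   i) {
            gsub(/,/, " ")                       # commas become field separators
            for (i = 1; i <= NF; i++)
                if ($i ~ /^[0-9]+$/) print $i
        }
        /Superblock backups stored on blocks:/ {
            grab = 1
            sub(/.*blocks:/, "")                 # keep any numbers on the heading line
            emit()
            next
        }
        grab && NF == 0 { exit }                 # blank line ends the list
        grab { emit() }
    '
}

# Usage: backup_superblocks < layout.txt
```

Each number can then be tried in turn with `fsck.ext4 -n -f -b <number>`, which keeps the probe read-only until one candidate looks sane.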

Recovery decision tree

Symptom: FS won't mount, EXT4-fs error in dmesg
│
├── Hardware errors in dmesg (UNC, sector errors)?
│   ├── Yes → ddrescue to healthy disk → continue on image
│   └── No  → continue on device
│
├── fsck -n -f /dev/<dev>
│   ├── Reports "clean" → not FS corruption; check mount opts, kernel version
│   ├── Reports few fixable issues → fsck -y -f
│   └── Reports superblock invalid → continue
│
├── Superblock invalid
│   ├── mke2fs -n → list backup SB blocks
│   ├── fsck -y -f -b <backup-sb>
│   └── If all backups fail → debugfs forensics; consider data recovery service
│
└── fsck succeeds → mount -o ro → rsync data out → then mount rw
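
The branches under `fsck -n -f` can also be driven from the exit status rather than by reading the output. `e2fsck` documents its exit code as a bit mask (1 and 2: errors corrected, 4: errors left uncorrected, 8 and above: the check itself failed or was cancelled). A minimal sketch, with a hypothetical function name and hypothetical result labels:

```shell
# Sketch: map an e2fsck exit status to the next branch of the tree.
# The labels printed here are illustrative, not e2fsck output.
fsck_next_step() {
    code=$1
    if [ "$code" -eq 0 ]; then
        echo "clean"              # not FS corruption; check mount options, kernel
    elif [ $((code & 8)) -ne 0 ] || [ "$code" -ge 16 ]; then
        echo "fsck-failed"        # operational/usage error or cancelled; suspect superblock or device
    elif [ $((code & 4)) -ne 0 ]; then
        echo "errors-remain"      # expected after -n when problems exist; review, then repair
    else
        echo "corrected"          # bits 1/2 only: errors were fixed
    fi
}

# Usage: sudo fsck.ext4 -n -f /dev/<dev>; fsck_next_step $?
```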

Common findings this catches

fsck -n reports millions of changes → likely wrong device, not corruption. Re-verify lsblk -f.
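fsck -p exits early with “UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY” → preen found something that needs judgment; review the -n output, then retry with -y if the changes are acceptable.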
Journal recovery succeeds but FS mounts read-only → kernel detected error after replay; fsck offline.
Files in lost+found with names like #12345 → orphan inodes; identify by file, restore by content.
Bad block at superblock → use backup superblock with -b.
Repeated “Superblock has an invalid journal” after replay → journal is damaged; as a last resort, consider removing it (the filesystem then runs without a journal, like ext2).
fsck loops, fixing the same issue on every run → the hardware is probably corrupting the repairs as they are written back; image the device first.
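
To judge whether a dry run proposes a handful of fixes or an implausibly large number, the saved output can be counted instead of skimmed. This sketch assumes that `fsck.ext4 -n` echoes each question with the answer "no" at the end of the line (for example "Fix? no"); the function name `count_proposed_fixes` and the file name `dryrun.txt` are hypothetical.

```shell
# Sketch: count the fixes a dry run proposed and declined.
# Assumes every declined question ends its line with "? no".
count_proposed_fixes() {
    # grep -c prints the count; "|| true" keeps a zero count from looking like a failure
    grep -Ec '\? no$' || true
}

# Usage:
#   sudo fsck.ext4 -n -f /dev/<dev> > dryrun.txt 2>&1
#   count_proposed_fixes < dryrun.txt
```

A very large count is a cue to re-check the device name with `lsblk -f` before assuming the filesystem is that badly damaged.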

Verification after recovery

# Mount read-only first, check
sudo mount -o ro /dev/<dev> /mnt/check
sudo find /mnt/check -type f -exec file {} \; | grep -v ASCII | head

# Compare to backup if available
sudo rsync -avn --checksum /backup/ /mnt/check/

# Check FS features
sudo tune2fs -l /dev/<dev> | grep -E "Filesystem features|Last checked|Mount count"

# Schedule periodic checks
sudo tune2fs -c 30 /dev/<dev>     # check every 30 mounts
sudo tune2fs -i 6m /dev/<dev>     # check every 6 months
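
When no backup exists to compare against, a checksum manifest taken right after recovery gives a baseline for later checks. A minimal sketch, assuming GNU `sha256sum` is available; the function name `tree_manifest` and the output file names are hypothetical.

```shell
# Sketch: print a sorted "checksum  ./path" manifest for every regular file
# under a directory, using paths relative to that directory so manifests
# from two different mount points can be compared.
tree_manifest() {
    (
        cd "$1" || exit 1
        find . -type f -exec sha256sum {} + | LC_ALL=C sort -k2
    )
}

# Usage:
#   tree_manifest /mnt/check > recovered.sha256
#   tree_manifest /backup    > backup.sha256
#   diff recovered.sha256 backup.sha256
```

Run it as root if the tree contains files your user cannot read; unreadable files are otherwise missing from the manifest.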

When to escalate

Irreplaceable data + hardware failure → professional data recovery service. Don’t trust forum advice with originals.
Repeated FS errors after fsck — underlying storage problem (controller, cable, disk); replace before trusting again.
Corruption pattern matching a known kernel bug — check distro CVE/bug tracker; may need a downgrade or specific patch.
Encrypted volume (LUKS) where the FS appears clean to debugfs but unreadable through dm-crypt — LUKS header issue; see LUKS recovery prompt.

ext4 Filesystem Corruption Recovery Prompt

Related prompts

Linux Disk Full / Inode Exhaustion Diagnosis Prompt

LVM Troubleshooting Prompt

Linux mdraid Software RAID Recovery Prompt
