DevOps AI ToolKit
AI for Kubernetes & Helm · By James Joyner IV · 9 min read

Kubernetes Error Guide: 'ImageInspectError' Corrupt Local Image

Fix the ImageInspectError: clear corrupt image layers, recover from disk-full nodes, and force a clean re-pull so the container runtime can inspect the image again.

Exact Error Message

NAME                       READY   STATUS              RESTARTS   AGE
api-7d6c5b4f9d-w2kpz       0/1     ImageInspectError   0          41s

# kubectl describe pod api-7d6c5b4f9d-w2kpz
Events:
  Type     Reason             Age              From     Message
  ----     ------             ----             ----     -------
  Warning  InspectFailed  9s (x4 over 41s)  kubelet
    Failed to inspect image "registry.example.com/api:v2.3":
    rpc error: code = Unknown desc = failed to get image status:
    failed to read image content: unexpected EOF

What the Error Means

ImageInspectError (event reason InspectFailed) means the kubelet asked the container runtime for the status of the image and the runtime returned an error instead of an answer. The image reference itself parsed correctly, and in most cases at least part of the image is already on the node. The runtime tries to read the image’s manifest, config, and layer information; when that read fails, usually because the on-disk content is incomplete or corrupted, it returns an inspect error and the kubelet cannot go on to pull the image or create the container.

Unlike ErrImagePull (the pull from the registry failed), ImageInspectError is about an image that was already (partially) stored on the node. The local image store is the problem, not the network or registry.
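
To make the distinction concrete, here is a minimal sketch that maps a pod's waiting reason to the image-handling stage that failed. The function name stage_for_reason and the stage labels are illustrative, not Kubernetes terminology; the reasons themselves are the ones discussed in this guide.

```shell
#!/usr/bin/env bash
# stage_for_reason REASON: print which image-handling stage a waiting
# reason points at. The labels (inspect, pull, policy, create) are
# illustrative names chosen for this sketch.
stage_for_reason() {
  case "$1" in
    ImageInspectError)             echo "inspect: runtime could not read the local image" ;;
    ErrImagePull|ImagePullBackOff) echo "pull: fetching from the registry failed" ;;
    ErrImageNeverPull)             echo "policy: image absent and pulling is forbidden" ;;
    CreateContainerError)          echo "create: image readable, container creation failed" ;;
    *)                             echo "other: not an image-stage reason" ;;
  esac
}
```

On a workstation with cluster access, the reason can be fed in from the jsonpath query shown under Diagnostic Commands, for example: stage_for_reason "$(kubectl get pod <POD> -o jsonpath='{.status.containerStatuses[0].state.waiting.reason}')".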

Common Causes

  • A truncated or partially pulled image layer, often from a node that ran out of disk during a pull (unexpected EOF, failed to read image content).
  • Corruption in the runtime’s content store after an unclean shutdown, kernel panic, or filesystem error.
  • A full or failing node disk that prevented the runtime from completing or persisting the image.
  • A side-loaded image archive that was incomplete or damaged during transfer to the node.
  • Runtime metadata corruption, in containerd’s BoltDB metadata database or in CRI-O’s storage, after an abrupt crash.
  • An interrupted image garbage collection run that removed layers still referenced by the manifest.

How to Reproduce the Error

This is hard to trigger deliberately. On a lab node, the following sequence approximates it:

  1. Start pulling a large image onto a node.
  2. Fill the node’s disk (or kill the runtime) mid-pull so the layer write is truncated.
  3. Restart the runtime and schedule a pod using that image — the runtime finds a partial image and fails to inspect it.

# Lab nodes only: after interrupting the pull in step 2, check what the runtime stored
crictl pull registry.example.com/api:v2.3       # may report success
crictl inspecti registry.example.com/api:v2.3   # fails if the stored content is truncated

Diagnostic Commands

# Confirm the reason and node
kubectl get pod <POD> -o wide
kubectl get pod <POD> -o jsonpath='{.status.containerStatuses[0].state.waiting.reason}{"\n"}'
kubectl describe pod <POD> | grep -A4 -iE 'inspectfailed|imageinspecterror'

# On the affected node: try to inspect the image directly
crictl images | grep api
crictl inspecti registry.example.com/api:v2.3

# Check runtime logs for read/content errors (use -u crio on CRI-O nodes)
journalctl -u containerd --since "15 min ago" | grep -iE 'eof|corrupt|content|inspect'

# Confirm the node has free disk and inodes
# (paths assume containerd; CRI-O stores images under /var/lib/containers/storage)
df -h /var/lib/containerd
df -i /var/lib/containerd
kubectl describe node <NODE> | grep -i diskpressure
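
The disk check above can be turned into a pass/fail test. This is a minimal sketch, not a monitoring solution: the helper names usage_pct and disk_ok are hypothetical, and the default threshold of 85 is an assumption chosen to mirror the kubelet's default image garbage collection high threshold.

```shell
#!/usr/bin/env bash
# usage_pct PATH: print the integer "used" percentage of the filesystem
# that holds PATH, taken from column 5 of POSIX-format df output.
usage_pct() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# disk_ok PATH [THRESHOLD]: succeed when usage is below THRESHOLD percent.
# THRESHOLD defaults to 85 (an assumption, see the lead-in).
# Returns 2 if the usage could not be determined.
disk_ok() {
  local pct
  pct=$(usage_pct "$1")
  [ -n "$pct" ] || return 2
  [ "$pct" -lt "${2:-85}" ]
}
```

On a node, a call such as disk_ok /var/lib/containerd || echo "free space before re-pulling" gives a quick signal before moving on to the resolution steps.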

Step-by-Step Resolution

1. Confirm the node and reproduce the inspect failure locally. On the node from kubectl get pod -o wide, run crictl inspecti <IMAGE>. A failure here confirms the local image is unreadable.
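
When several pods are failing, it helps to see which nodes are affected before logging in to any of them. The sketch below counts ImageInspectError pods per node; the function name count_inspect_errors_by_node is hypothetical, and it only looks at the first container of each pod, like the jsonpath query under Diagnostic Commands.

```shell
#!/usr/bin/env bash
# Reads "NODE REASON" lines on stdin and prints "COUNT NODE" for every node
# that has pods waiting with ImageInspectError, busiest node first.
count_inspect_errors_by_node() {
  awk '$2 == "ImageInspectError" { n[$1]++ }
       END { for (node in n) print n[node], node }' |
    sort -k1,1nr -k2,2
}

# Example input source (requires cluster access; not run by this sketch):
#   kubectl get pods -A --no-headers \
#     -o custom-columns='NODE:.spec.nodeName,REASON:.status.containerStatuses[0].state.waiting.reason' \
#     | count_inspect_errors_by_node
```

A single node at the top of the list points at that node's image store or disk; the same image failing across many nodes suggests checking how the image reached them.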

2. Free disk space first. If the node is full, the runtime cannot re-pull cleanly. Reclaim space and clear unused images:

df -h /var/lib/containerd
crictl rmi --prune

3. Remove the corrupt image so it can be re-pulled. Delete the specific damaged image from the node’s store:

crictl rmi registry.example.com/api:v2.3

4. Force a fresh pull. Re-pull the image and inspect it to verify integrity:

crictl pull registry.example.com/api:v2.3
crictl inspecti registry.example.com/api:v2.3
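
Steps 2 to 4 can be wrapped in one helper for repeated use on a node. This is a sketch under assumptions: the names run and repull_image and the DRY_RUN variable are hypothetical, and by default the helper only prints the commands so they can be reviewed first.

```shell
#!/usr/bin/env bash
# run CMD...: always print the command; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# repull_image IMAGE: prune unused images, remove IMAGE, pull it again,
# then inspect it. A failed removal is tolerated because the prune may
# already have deleted the image; any other failure stops the sequence.
repull_image() {
  local image=$1
  run crictl rmi --prune || return 1
  run crictl rmi "$image" || echo "note: could not remove $image (already pruned?)" >&2
  run crictl pull "$image" || return 1
  run crictl inspecti "$image"
}
```

Running repull_image registry.example.com/api:v2.3 prints the four commands; DRY_RUN=0 repull_image registry.example.com/api:v2.3 executes them on the node.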

5. Delete the pod so the kubelet retries. With a clean image present, the new pod should pass inspection and start:

kubectl delete pod <POD>   # recreated by its controller
kubectl get pods -l app=<APP> -w

6. If corruption persists, restart or drain the node. Deep content-store or metadata corruption may require restarting the runtime or draining the node and rebuilding it:

kubectl cordon <NODE>
kubectl drain <NODE> --ignore-daemonsets --delete-emptydir-data
# then, on the node: systemctl restart containerd (or replace the node)
# once the node is healthy again: kubectl uncordon <NODE>
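
The node-level recovery can be sketched as one sequence. Several things here are assumptions rather than part of the steps above: the helper names run and recover_node, the DRY_RUN variable, SSH access to the node under its Kubernetes node name, a systemd-managed containerd, and the final kubectl uncordon. By default the helper only prints the plan.

```shell
#!/usr/bin/env bash
# run CMD...: always print the command; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# recover_node NODE: cordon and drain the node, restart the runtime on it,
# then allow scheduling again. Stops at the first failing command.
recover_node() {
  local node=$1
  run kubectl cordon "$node" &&
  run kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data &&
  # Assumption: the node is reachable over SSH and runs containerd under systemd.
  run ssh "$node" sudo systemctl restart containerd &&
  run kubectl uncordon "$node"
}
```

Review the printed plan from recover_node <NODE> first; if the restart does not clear the corruption, replacing the node remains the fallback.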

Prevention and Best Practices

  • Monitor node disk usage and configure image garbage collection thresholds so nodes never fill up mid-pull.
  • Keep every image available from a registry and use imagePullPolicy: IfNotPresent, rather than relying solely on side-loaded archives, so a corrupt local copy can be removed and re-pulled.
  • Verify side-loaded image archives with a checksum before importing on air-gapped nodes.
  • Ensure graceful node shutdown (drain before reboot) to avoid leaving the content store in an inconsistent state.
  • Alert on InspectFailed events as an early signal of a degraded node disk or runtime.

Related Errors

  • ErrImagePull / ImagePullBackOff: the pull from the registry itself failed.
  • ErrImageNeverPull: the image is simply not present and the policy forbids pulling.
  • CreateContainerError: the image is fine but the runtime failed to create the container.
  • DiskPressure node condition: the underlying disk problem that frequently causes inspect failures.
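
The prevention advice about checksums for side-loaded archives can be sketched as follows. The function name verify_archive is hypothetical, and the sketch assumes the expected SHA-256 digest was recorded where the archive was built and that sha256sum is available on the node.

```shell
#!/usr/bin/env bash
# verify_archive FILE EXPECTED_SHA256: succeed only if FILE's SHA-256 digest
# equals EXPECTED_SHA256. Returns 2 if no digest could be computed.
verify_archive() {
  local actual
  actual=$(sha256sum "$1" 2>/dev/null | awk '{ print $1 }')
  [ -n "$actual" ] || return 2
  [ "$actual" = "$2" ]
}

# Example (assumes a containerd node and an archive named api.tar):
#   verify_archive api.tar "$EXPECTED" && ctr -n k8s.io images import api.tar
```

Importing only after the digest matches keeps a damaged transfer out of the node's image store in the first place.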

Frequently Asked Questions

Does ImageInspectError mean the image in my registry is broken? No. The registry copy is almost always fine. The error is about the node’s local copy being incomplete or corrupted. Removing the local image and re-pulling resolves it in most cases.

Why did re-pulling not fix it? If the node disk is still full or the content store is corrupted, even a fresh pull lands in a bad state. Free disk space first, prune unused images, and if it still fails, restart the runtime or drain the node.

Can this affect multiple pods at once? Yes. A corrupt image or a degraded disk is a node-level problem, so every pod scheduled on that node that uses the affected image will fail inspection. The fix is per-node: clean the image store and address disk health.

How is this different from CreateContainerError? ImageInspectError happens earlier — the runtime cannot even read the image’s metadata. CreateContainerError happens after a successful inspect, when the runtime tries to build the container from a readable image.
