OpenStack Error Guide: Nova/Glance 'Failed to download image' on Compute
Fix Nova 'Failed to download image' / Glance store NotFound when nova-compute fetches an image: reachability, Ceph RBD auth, image status, and cache space.
Overview
Failed to download image happens when nova-compute tries to fetch an image from Glance to build an instance and the transfer cannot complete. Nova needs the image bits — either copied into the local instance cache (file/qcow backend) or cloned from the shared store (Ceph RBD copy-on-write) — before it can create the root disk. When the fetch fails, the spawn aborts and the instance goes to ERROR.
You will see it in the nova-compute log or the instance fault:
ERROR nova.compute.manager [instance: 5d2a...] Instance failed to spawn:
ImageDownloadFailed: Failed to download image <image_id>:
Connection to glance failed: Error contacting Glance server 'http://10.0.0.10:9292'
The label is generic; the underlying cause spans the whole image path: glance-api unreachable from the compute node, a Glance backend store fault (Ceph pool/auth, Swift, NFS file store), an image that is not active, a checksum or signature failure, or simply no room left in the local image cache. Because the same error string covers all of these, the fix depends entirely on reading the nova-compute and glance-api logs together.
Symptoms
- Instance goes to ERROR during BUILD/spawning with fault ImageDownloadFailed or “Failed to download image”.
- Only new instances on certain compute nodes fail, while existing instances run fine.
- The same image boots on one compute host but not another.
- glance-api logs NotFound/StoreNotFound, or nova-compute logs a connection refused/timeout to port 9292.
openstack server show app-09 -c fault -f value
{'message': 'Failed to download image 0f3c... (HTTPInternalServerError ...)', 'code': 500}
Common Root Causes
1. glance-api unreachable from the compute node
The control-plane endpoint resolves on the controller but the compute node cannot reach it — wrong endpoint, firewall, or a down glance-api.
# from the compute host
openstack endpoint list --service image -c URL -f value
curl -s -o /dev/null -w "%{http_code}\n" http://10.0.0.10:9292/healthcheck
000 # connection refused / timeout from this compute node
2. Ceph RBD backend: pool, auth, or keyring problem
With the RBD store, nova-compute clones the image directly from the Ceph images pool. If the compute node’s cephx key lacks access, or the pool/monitors are wrong, the clone fails even though glance-api is reachable.
ERROR nova.virt.libvirt.imagebackend Error connecting to ceph cluster:
rados.PermissionDeniedError: [errno 13] error connecting to the cluster
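# on a Ceph admin node: inspect the caps granted to the key nova-compute uses
ceph auth get client.nova
# on the compute node: list the images pool with that key
rbd -p images --id nova ls | head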
3. Image not in ‘active’ status
If the image is queued, saving, importing, or deactivated, there are no usable bits to download.
openstack image show <IMAGE_ID> -c status -c size -c checksum -f value
queued
None
None
A queued image has no data uploaded; a deactivated image is intentionally blocked from use.
4. Checksum mismatch (corruption in transit or at rest)
Nova verifies the downloaded data against the image’s stored checksum. A mismatch — from a partial Swift object, a truncated file-store copy, or backend corruption — aborts the download.
ERROR nova.compute.manager ImageUnacceptable: Image <id> is unacceptable:
Image's checksum does not match. Expected 9f2c...e1, got 4ab0...77
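To tell corruption at rest from corruption in transit, it can help to hash a copy of the image yourself and compare it with the checksum Glance reports. The sketch below assumes the Glance `checksum` field is an MD5 hex digest and that `md5sum` is installed; `verify_image_md5` is a hypothetical helper name, not an OpenStack command.

```shell
#!/usr/bin/env bash
# Compare a local copy of an image against the checksum Glance reports.
# Assumption: the expected value is the MD5 hex digest from
# `openstack image show <IMAGE_ID> -c checksum -f value`.
verify_image_md5() {
    local file="$1" expected="$2" actual
    # md5sum prints "<digest>  <filename>"; keep only the digest
    actual="$(md5sum "$file" | awk '{print $1}')"
    if [ "$actual" = "$expected" ]; then
        echo "match"
    else
        echo "mismatch: expected $expected, got $actual"
        return 1
    fi
}

# Example (not run here):
#   openstack image save --file /tmp/image.bin <IMAGE_ID>
#   verify_image_md5 /tmp/image.bin "$(openstack image show <IMAGE_ID> -c checksum -f value)"
```

If a freshly saved copy matches but the compute node still reports a mismatch, the problem is more likely on the path to that node than in the store.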
5. Signature verification failure
When verify_glance_signatures = True, Nova validates the image’s signature against the key in Barbican. A missing/rotated key, or img_signature metadata that does not match the data, blocks the download.
ERROR nova.compute.manager SignatureVerificationError:
Signature verification for the image failed: Invalid signature.
6. Insufficient space in the instance image cache
For the file/qcow backend, the base image is cached under instances_path/_base. If that filesystem is full, the copy fails mid-download.
df -h /var/lib/nova/instances
/dev/sda3 100G 100G 0 100% /var/lib/nova/instances
du -sh /var/lib/nova/instances/_base
7. RBD show_image_direct_url / COW clone misconfig
Fast RBD boot relies on show_image_direct_url = True (and show_multiple_locations) in glance-api so Nova can do a copy-on-write clone instead of a full download. If those are off, or the image was uploaded without an RBD location, Nova falls back to a slow full download that can fail on space/timeouts.
grep -E 'show_image_direct_url|show_multiple_locations' /etc/glance/glance-api.conf
openstack image show <IMAGE_ID> -f json | grep -i direct_url
show_image_direct_url = False
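When direct URLs are exposed, a quick sanity check is whether the image's location has the shape a copy-on-write clone needs. The sketch below assumes the RBD store's location format is `rbd://<fsid>/<pool>/<image>/<snapshot>`; `rbd_location_fields` is a hypothetical helper, and it checks only the URL shape. Nova applies further checks of its own (for example that the cluster fsid matches), which this does not reproduce.

```shell
#!/usr/bin/env bash
# Split an RBD direct_url into its four parts, or explain why it cannot be.
# Assumed format: rbd://<fsid>/<pool>/<image>/<snapshot>
rbd_location_fields() {
    local url="$1" rest
    local IFS='/'
    case "$url" in
        rbd://*) rest="${url#rbd://}" ;;
        *) echo "not an rbd:// location"; return 1 ;;
    esac
    set -f          # disable globbing while the string is split on "/"
    set -- $rest    # positional parameters become the URL parts
    set +f
    if [ "$#" -ne 4 ] || [ -z "$1" ] || [ -z "$2" ] || [ -z "$3" ] || [ -z "$4" ]; then
        echo "unexpected rbd location shape"
        return 1
    fi
    echo "fsid=$1 pool=$2 image=$3 snapshot=$4"
}

# Example (not run here):
#   rbd_location_fields "$(openstack image show <IMAGE_ID> -c direct_url -f value)"
```

An image whose location is not an `rbd://` URL, or has no `direct_url` at all, will go through the full-download path described above.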
Diagnostic Workflow
Step 1: Confirm the image is usable
openstack image show <IMAGE_ID> -c status -c size -c checksum -c disk_format -f value
If status is not active (or size/checksum are None), stop here — fix or re-upload the image; there is nothing to download.
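The Step 1 decision can be written as a small gate for scripts. This is a minimal sketch: `image_usable` is a hypothetical helper that takes the three field values as arguments, and it treats the literal string `None` as "no value", matching the CLI output shown earlier in this guide.

```shell
#!/usr/bin/env bash
# Decide whether an image is worth trying to download, given its
# status, size, and checksum as reported by `openstack image show`.
image_usable() {
    local status="$1" size="$2" checksum="$3"
    if [ "$status" != "active" ]; then
        echo "not usable: status is '$status'"
        return 1
    fi
    if [ -z "$size" ] || [ "$size" = "None" ]; then
        echo "not usable: image has no data (size is missing)"
        return 1
    fi
    if [ -z "$checksum" ] || [ "$checksum" = "None" ]; then
        echo "not usable: checksum is missing"
        return 1
    fi
    echo "usable"
}

# Example (not run here); each field is fetched separately so the result
# does not depend on the column order of a multi-column query:
#   image_usable \
#     "$(openstack image show <IMAGE_ID> -c status -f value)" \
#     "$(openstack image show <IMAGE_ID> -c size -f value)" \
#     "$(openstack image show <IMAGE_ID> -c checksum -f value)"
```

For instance, `image_usable queued None None` prints `not usable: status is 'queued'` and returns non-zero, so a wrapper script can stop before looking at any compute node.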
Step 2: Read the nova-compute log on the failing host
# Kolla-Ansible (on the compute node)
docker logs nova_compute 2>&1 | grep -iE "download image|ImageDownloadFailed|imagebackend|ceph" | tail -30
# Traditional packages
sudo journalctl -u openstack-nova-compute | grep -iE "download image|ImageDownloadFailed" | tail -30
The traceback distinguishes a connection error (Step 3) from a backend/auth error (Step 4) or a checksum/signature failure.
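As a rough aid for that classification, the sketch below maps a single log line to one of the failure classes used in this guide. The match strings are taken from the sample log lines shown earlier; the `No space left on device` pattern is an assumption (the usual ENOSPC message) rather than something quoted from a Nova log, and `classify_image_failure` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Map one nova-compute log line to a likely failure class.
classify_image_failure() {
    local line="$1"
    case "$line" in
        *"Error contacting Glance server"*|*"Connection to glance failed"*)
            echo "reachability" ;;
        *"Error connecting to ceph cluster"*|*"rados."*)
            echo "backend-auth" ;;
        *"checksum does not match"*)
            echo "checksum" ;;
        *"SignatureVerificationError"*|*"Signature verification"*)
            echo "signature" ;;
        *"No space left on device"*)    # assumed ENOSPC wording
            echo "cache-space" ;;
        *)
            echo "unknown" ;;
    esac
}

# Example (not run here): count failure classes in recent log lines
#   docker logs nova_compute 2>&1 | tail -500 |
#     while IFS= read -r l; do classify_image_failure "$l"; done | sort | uniq -c
```

Anything that lands in `unknown` still needs a human to read the full traceback; the patterns cover only the messages this guide has seen.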
Step 3: Test glance-api reachability from that compute node
openstack endpoint list --service image -c Interface -c URL -f value
curl -s -o /dev/null -w "%{http_code}\n" http://<GLANCE_HOST>:9292/healthcheck
A non-200 (or 000) from the compute node points at networking, the endpoint URL, or a down glance-api:
# Kolla-Ansible (controller)
docker logs glance_api 2>&1 | tail -30
# Traditional
sudo journalctl -u openstack-glance-api | tail -30
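The three outcomes of the curl probe can be spelled out in a small helper so the result reads the same in scripts and runbooks. This is a sketch; `interpret_glance_code` is a hypothetical name, and it relies only on curl's behaviour of printing `000` for `%{http_code}` when no HTTP response was received.

```shell
#!/usr/bin/env bash
# Translate the %{http_code} value printed by curl into a next step.
interpret_glance_code() {
    case "$1" in
        200) echo "glance-api reachable and healthy" ;;
        000) echo "no HTTP response: check network path, endpoint URL, and whether glance-api is up" ;;
        *)   echo "glance-api reachable but returned HTTP $1: check the glance-api log" ;;
    esac
}

# Example (not run here), with a timeout so a blackholed route fails fast:
#   code="$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' http://<GLANCE_HOST>:9292/healthcheck)"
#   interpret_glance_code "$code"
```

Run the probe from the failing compute node and from a working one; a different result between the two narrows the problem to that node's network path or endpoint configuration.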
Step 4: Check the Glance backend store
grep -E '^stores|default_store|^\[glance_store\]|rbd_store_pool' /etc/glance/glance-api.conf
For RBD, confirm the compute node’s cephx access:
rbd -p images --id nova ls >/dev/null && echo "RBD OK" || echo "RBD AUTH/POOL FAIL"
For a file/NFS store, confirm the share is mounted and readable; for Swift, confirm the object exists and is not partial.
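Before testing cephx itself, it is worth confirming that the files the RBD client needs are present on the node at all. The sketch below assumes the conventional layout used elsewhere in this guide (`ceph.conf` and `ceph.client.nova.keyring` under `/etc/ceph`); in containerized deployments the files may live inside the container or under a different host path. `check_ceph_client_files` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Verify that ceph.conf and the client keyring exist and are readable.
# Defaults (/etc/ceph, user "nova") are assumptions; pass your own values.
check_ceph_client_files() {
    local dir="${1:-/etc/ceph}" user="${2:-nova}" missing=0 f
    for f in "$dir/ceph.conf" "$dir/ceph.client.$user.keyring"; do
        if [ -r "$f" ]; then
            echo "ok: $f"
        else
            echo "missing or unreadable: $f"
            missing=1
        fi
    done
    return "$missing"
}

# Example (not run here):
#   check_ceph_client_files /etc/ceph nova && rbd -p images --id nova ls >/dev/null
```

A non-zero return here explains an RBD failure without touching the cluster; a zero return means the files exist, not that the key has the right caps.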
Step 5: Check cache space and clear stale base images
df -h /var/lib/nova/instances
ls -lh /var/lib/nova/instances/_base | tail
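If _base is full, free space (or let the periodic image-cache manager prune), then retry. If the failed spawn left the instance with no host assigned (check the host field in openstack server show), a hard reboot is likely to be rejected; in that case delete the server and create it again:
openstack server reboot --hard <SERVER> # or rebuild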
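The space check is easy to automate so that a filling cache is caught before a spawn fails. This is a minimal sketch that parses `df -P` output; the 90% threshold is an arbitrary assumption, and `cache_use_percent` and `cache_is_full` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Read `df -P <path>` output on stdin and print the use percentage
# (the "Capacity" column) of the last filesystem line, without the "%".
cache_use_percent() {
    awk 'NR > 1 { gsub("%", "", $5); pct = $5 } END { print pct }'
}

# Succeed when the given percentage is at or above the threshold.
# The default of 90 is an assumption; tune it to your image sizes.
cache_is_full() {
    local pct="$1" threshold="${2:-90}"
    [ "$pct" -ge "$threshold" ]
}

# Example (not run here):
#   pct="$(df -P /var/lib/nova/instances | cache_use_percent)"
#   if cache_is_full "$pct" 90; then echo "instances_path is ${pct}% full"; fi
```

`df -P` is used instead of `df -h` because the POSIX format keeps each filesystem on one line, which makes the column position predictable.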
Example Root Cause Analysis
Instances on the newly added compute-05 fail at spawn with “Failed to download image”, while the same image boots fine on compute-01 through compute-04.
The image is active with a valid checksum, so it is not an image problem. The nova-compute log on compute-05:
ERROR nova.virt.libvirt.imagebackend Error connecting to ceph cluster (images):
rados.PermissionDeniedError: [errno 13] error connecting to the cluster
glance-api is reachable, but the RBD clone fails on this host only. Testing cephx from the compute node:
rbd -p images --id nova ls
2026-06-24 09:44:02.118 monclient(hunting): authenticate timed out after 300
rbd: couldn't connect to the cluster!
The nova cephx keyring was never deployed to compute-05 (the node was provisioned from a stale template missing /etc/ceph/ceph.client.nova.keyring). Without the key, Nova cannot clone the base image from the images pool.
Fix: deploy the keyring and ceph.conf, restart nova-compute, and rebuild:
# copy ceph.conf + ceph.client.nova.keyring to /etc/ceph on compute-05, then:
rbd -p images --id nova ls >/dev/null && echo OK
docker restart nova_compute # or: systemctl restart openstack-nova-compute
openstack server reboot --hard app-09
With cephx access restored, the COW clone succeeds and the instance reaches ACTIVE.
Prevention Best Practices
- Treat the Glance image path as a per-compute dependency: every compute node needs reachability to glance-api AND, for RBD, the ceph.conf plus the nova/cinder cephx keyrings. Most “works on one host, not another” cases are a missing keyring or endpoint on the new node.
- Pin and validate show_image_direct_url / show_multiple_locations on glance-api when using the RBD store, so Nova does fast COW clones instead of fragile full downloads.
- Monitor free space on instances_path (the _base cache) and let the image-cache manager prune; a full cache silently breaks file-backed downloads.
- Confirm images are active with a non-null checksum and size in your image-publishing pipeline; never expose queued/deactivated images to users.
- If you enable verify_glance_signatures, automate Barbican key rotation so signing metadata stays valid, or signature checks will start failing image downloads fleet-wide.
Quick Command Reference
# Is the image usable at all?
openstack image show <IMAGE_ID> -c status -c size -c checksum -c disk_format -f value
# Root cause from the compute node's nova-compute log
docker logs nova_compute 2>&1 | grep -iE "download image|ImageDownloadFailed|imagebackend" | tail -30
sudo journalctl -u openstack-nova-compute | grep -iE "download image|ImageDownloadFailed" | tail -30
# Glance reachability from the compute host
openstack endpoint list --service image -c Interface -c URL -f value
curl -s -o /dev/null -w "%{http_code}\n" http://<GLANCE_HOST>:9292/healthcheck
# glance-api log (controller)
docker logs glance_api 2>&1 | tail -30
sudo journalctl -u openstack-glance-api | tail -30
# RBD store auth check from the compute node
rbd -p images --id nova ls >/dev/null && echo "RBD OK" || echo "RBD FAIL"
grep -E 'show_image_direct_url|show_multiple_locations|rbd_store_pool' /etc/glance/glance-api.conf
# Image cache space
df -h /var/lib/nova/instances
ls -lh /var/lib/nova/instances/_base | tail
# Retry after the fix
openstack server reboot --hard <SERVER>
Conclusion
“Failed to download image” is a catch-all for any break in the path between nova-compute and the image bits. The string never tells you the cause — the logs do. Work it top to bottom:
- Confirm the image is active with a valid size and checksum; if not, fix the image first.
- Read the nova-compute log on the failing host to classify the failure: connection, backend/auth, checksum, or signature.
- Test glance-api reachability from that specific compute node.
- For the RBD store, verify the node’s ceph.conf and cephx keyring; for file/Swift, verify the mount/object.
- Check _base cache space and confirm direct-URL clone settings for fast RBD boots.
Because the image path is a per-host dependency, the most common surprise is a node that is missing a keyring, endpoint, or mount the rest of the fleet has — fix that one host’s image path and the spawn succeeds.