AI for OpenStack · By James Joyner IV · 11 min read

OpenStack Error Guide: 'Instance failed to spawn' and Nova Instances Stuck in BUILD/spawning

Fix Nova 'Instance failed to spawn' and instances stuck in BUILD/spawning: diagnose libvirt/qemu errors, disk space, VIF plug timeouts, SELinux, and CPU flags.

  • #openstack
  • #troubleshooting
  • #errors
  • #nova

Overview

“Instance failed to spawn” is the error nova-compute logs when it has accepted a build request and the hypervisor driver's spawn step fails: preparing the image and disks, plugging the network interfaces, or asking libvirt to start the domain. The instance either flips to ERROR or sits in BUILD with task state spawning until a timeout fires.

The literal log line in nova-compute looks like:

ERROR nova.compute.manager [instance: 7c9e1a2b-3344-5566-7788-99aabbccddee] Instance failed to spawn: libvirt.libvirtError: internal error: process exited while connecting to monitor: qemu-system-x86_64: -accel kvm: failed to initialize kvm: No such file or directory

If the build instead hangs on networking you will see the spawn aborted by a VIF timeout:

ERROR nova.virt.libvirt.driver [instance: 7c9e...] Failed to allocate network(s)
nova.exception.VirtualInterfaceCreateException: Virtual Interface creation failed

It occurs during the compute-side spawn phase of openstack server create (or rebuild). Unlike “No Valid Host Was Found,” the scheduler already picked a host — the failure is local to that compute node’s hypervisor, disk, image handling, or VIF plugging.

Symptoms

  • Instance stuck in BUILD / task state spawning, then drops to ERROR.
  • Fault message references libvirt, qemu, “Failed to allocate network(s)”, or “No space left on device”.
  • nova-compute log shows Instance failed to spawn on a specific compute host.
openstack server show app-09 -c status -c "OS-EXT-STS:task_state" -c fault -f value
ERROR
None
{'code': 500, 'message': "Build of instance 7c9e... aborted: Virtual Interface creation failed", 'details': '...VirtualInterfaceCreateException...'}
openstack server list --status BUILD --long -c Name -c Status -c "Task State" -c Host
+--------+--------+------------+------------+
| Name   | Status | Task State | Host       |
+--------+--------+------------+------------+
| app-09 | BUILD  | spawning   | compute-02 |
+--------+--------+------------+------------+

Common Root Causes

1. Libvirt/qemu cannot start the domain (no KVM / nested virt)

If the host lacks hardware virtualization (or kvm modules aren’t loaded) but virt_type = kvm, qemu fails to initialize KVM.

grep -Ec '(vmx|svm)' /proc/cpuinfo
lsmod | grep -E 'kvm_intel|kvm_amd'
docker exec nova_libvirt virsh list --all 2>/dev/null || sudo virsh list --all
0

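The 0 is the count from the first command: no CPU exposes the vmx (Intel) or svm (AMD) flag, so KVM init fails with “failed to initialize kvm”. On a compute node that is itself a virtual machine, this usually means nested virtualization is not enabled on the underlying hypervisor.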
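The two checks above (CPU flag and KVM device) can be combined into one small helper. This is a minimal sketch, not part of Nova or libvirt: the function name kvm_ready and its two optional arguments are hypothetical, and it assumes qemu reaches KVM through /dev/kvm.

```shell
# Sketch: report whether a host looks able to run KVM guests.
# kvm_ready is a hypothetical helper name, not a Nova or libvirt tool.
kvm_ready() {
    local cpuinfo="${1:-/proc/cpuinfo}"   # where the CPU flags are read from
    local kvm_dev="${2:-/dev/kvm}"        # device node qemu opens for KVM
    local count

    # Count lines that advertise vmx (Intel) or svm (AMD) as a whole word.
    # grep -c exits non-zero when the count is 0, hence the "|| true".
    count=$(grep -Ewc 'vmx|svm' "$cpuinfo" 2>/dev/null || true)
    count=${count:-0}

    if [ "$count" -eq 0 ]; then
        echo "no-virt-flag"
        return 1
    fi
    if [ ! -e "$kvm_dev" ]; then
        echo "flag-present-but-no-kvm-device"
        return 1
    fi
    echo "ok"
}
```

Run kvm_ready with no arguments on the compute node. The second result separates "the CPU supports virtualization but the kvm modules are not loaded" from "the CPU exposes no virtualization flag at all".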

2. Insufficient disk on the compute host

Image conversion, the ephemeral/root disk, or _base cache can fill the Nova state directory.

df -h /var/lib/nova /var/lib/docker
docker logs nova_compute 2>&1 | grep -i "No space left" | tail -3
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdb1       100G  100G     0 100% /var/lib/nova

A 100% /var/lib/nova produces OSError: [Errno 28] No space left on device during disk creation.
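To catch this before a spawn fails, the df output can be filtered against a threshold. The following is a rough sketch: disk_over_threshold is a hypothetical helper, it reads POSIX-format `df -P` output on stdin, and it assumes mount points contain no spaces.

```shell
# Sketch: print mount points at or above a usage threshold (default 90%).
# Reads `df -P` output on stdin; disk_over_threshold is a hypothetical helper.
disk_over_threshold() {
    local limit="${1:-90}"
    awk -v limit="$limit" '
        NR > 1 {                       # skip the df header line
            pct = $5
            sub(/%/, "", pct)          # "100%" -> "100"
            if (pct + 0 >= limit + 0) print $6, pct "%"
        }'
}
```

A possible invocation is `df -P /var/lib/nova /var/lib/docker | disk_over_threshold 85`, which prints nothing while both filesystems are below 85% used.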

3. Image / backing-file problems

A corrupt image, a wrong disk_format, or a qcow2 with an unreachable backing file makes qemu-img conversion fail.

openstack image show ubuntu-22.04 -c status -c disk_format -c size -f value
docker exec nova_compute qemu-img info \
  /var/lib/nova/instances/_base/<HASH> 2>/dev/null
docker logs nova_compute 2>&1 | grep -i "qemu-img" | tail -5
active
qcow2
0

A reported size of 0 or a qemu-img “Could not open backing file” error points at the image.
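The backing-file case can be checked mechanically from the human-readable `qemu-img info` output, which prints a "backing file:" line for qcow2 overlays. This is a sketch under assumptions: check_backing_file and backing_file_of are hypothetical helpers, paths are assumed to contain no spaces, and the check must run where the path is actually visible (for example inside the nova_compute container on Kolla).

```shell
# Sketch: read `qemu-img info` text on stdin and print the first backing file path.
# Both function names are hypothetical helpers.
backing_file_of() {
    sed -n 's/^backing file: \([^ ]*\).*$/\1/p' | head -n 1
}

# Report whether the declared backing file exists on this filesystem.
check_backing_file() {
    local path
    path=$(backing_file_of)
    if [ -z "$path" ]; then
        echo "no-backing-file"
        return 0
    fi
    if [ -e "$path" ]; then
        echo "ok: $path"
        return 0
    fi
    echo "missing: $path"
    return 1
}
```

Pipe the `qemu-img info` output for an instance disk into check_backing_file. A "missing:" result matches the “Could not open backing file” failure described above.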

4. Neutron VIF plug timeout

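Nova plugs the VIFs, starts the domain paused, and waits for Neutron to send a network-vif-plugged event before resuming it. If the L2 agent is slow or the event never arrives, the wait expires after vif_plugging_timeout seconds and, with vif_plugging_is_fatal = True, the spawn aborts.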

grep -E 'vif_plugging_(timeout|is_fatal)' /etc/nova/nova.conf
docker logs nova_compute 2>&1 | grep -i "Timeout waiting for .*vif" | tail -3
vif_plugging_timeout = 300
vif_plugging_is_fatal = True
WARNING nova.virt.libvirt.driver [instance: 7c9e...] Timeout waiting for [('network-vif-plugged', 'a1b2c3d4-...')] to be plugged.
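A plain grep also matches commented-out defaults and does not show which section a setting belongs to. As a hedged alternative, the sketch below reads one option from one named section of an INI-style file. The helper name ini_get is hypothetical; it ignores multi-line values and assumes both vif_plugging options are set in [DEFAULT].

```shell
# Sketch: print the value of one option from one section of an INI file.
# ini_get is a hypothetical helper; commented lines never match because
# their key would start with "#".
ini_get() {
    local file="$1" section="$2" key="$3"
    awk -v section="$section" -v key="$key" '
        # Section header such as [DEFAULT] or [libvirt]: remember its name.
        /^[ \t]*\[/ {
            gsub(/^[ \t]*\[|\][ \t]*$/, "")
            current = $0
            next
        }
        # key = value line inside the wanted section.
        current == section && index($0, "=") > 0 {
            k = substr($0, 1, index($0, "=") - 1)
            v = substr($0, index($0, "=") + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)
            gsub(/^[ \t]+|[ \t]+$/, "", v)
            if (k == key) { print v; exit }
        }
    ' "$file"
}
```

For example, `ini_get /etc/nova/nova.conf DEFAULT vif_plugging_timeout` prints the active value, or nothing when the option is unset and Nova falls back to its built-in default.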

5. SELinux / AppArmor denials

Mandatory access control can block libvirt/qemu from opening the instance disk or socket, even when permissions look correct.

sudo ausearch -m avc -ts recent 2>/dev/null | grep -iE 'qemu|libvirt|svirt' | tail -5
sudo getenforce
sudo dmesg | grep -i apparmor | grep -i denied | tail -5
type=AVC msg=audit(...): avc:  denied  { read } for  pid=... comm="qemu-system-x86" name="disk" dev="vdb1" ... scontext=system_u:system_r:svirt_t:s0:c12,c34 tcontext=system_u:object_r:default_t:s0
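AVC records are dense, and the useful part is usually three fields: the process, the source type, and the target type. The sketch below condenses each denial to those fields. The helper name avc_summary is hypothetical, and the pattern assumes the field order shown in the sample record above (comm, then scontext, then tcontext).

```shell
# Sketch: reduce each AVC denial on stdin to "comm source_type target_type".
# avc_summary is a hypothetical helper; non-AVC lines are dropped.
avc_summary() {
    sed -n 's/.*avc:  *denied.*comm="\([^"]*\)".*scontext=[^:]*:[^:]*:\([^:]*\):.*tcontext=[^:]*:[^:]*:\([^: ]*\).*/\1 \2 \3/p'
}
```

Piping `sudo ausearch -m avc -ts recent` into avc_summary gives one line per denial. In the sample record the target type is default_t, which generally indicates a file that matched no labelling rule, for instance a disk on a freshly mounted volume.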

6. Unsupported CPU model / missing flags

A flavor or image that requests a CPU model the host doesn’t support, or a misconfigured cpu_mode/cpu_models, makes libvirt reject the domain.

grep -E '^(cpu_mode|cpu_models|cpu_model_extra_flags)' /etc/nova/nova.conf
docker logs nova_compute 2>&1 | grep -i "unsupported configuration\|CPU" | tail -5
ERROR ... libvirt.libvirtError: unsupported configuration: guest and host CPU are not compatible: Host CPU does not provide required features: ...

Diagnostic Workflow

Step 1: Capture the fault and the host

openstack server show <SERVER> -c status -c "OS-EXT-STS:task_state" \
  -c "OS-EXT-SRV-ATTR:host" -c fault -f value

Note the host; every later command runs against that compute node.

Step 2: Read the nova-compute spawn log

# Kolla-Ansible
docker logs nova_compute 2>&1 | grep -A20 "Instance failed to spawn" | tail -40
# Traditional
sudo journalctl -u nova-compute --no-pager | grep -A20 "Instance failed to spawn" | tail -40

The traceback names the layer that failed: libvirt, qemu-img, VirtualInterfaceCreateException, or No space left.
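That mapping from traceback text to failing layer can be sketched as a small classifier. This is an illustration only: classify_spawn_failure and its labels are hypothetical, and the match strings are the ones quoted in this guide, so real logs may need further patterns.

```shell
# Sketch: read nova-compute log text on stdin and print one cause label.
# classify_spawn_failure and its labels are hypothetical; the first
# matching pattern wins, so specific messages are listed before generic ones.
classify_spawn_failure() {
    local log
    log=$(cat)
    case "$log" in
        *"No space left on device"*)
            echo "disk-full" ;;
        *"VirtualInterfaceCreateException"*|*"Failed to allocate network"*)
            echo "vif-plug" ;;
        *"failed to initialize kvm"*|*"failed to initialize KVM"*)
            echo "kvm-unavailable" ;;
        *"guest and host CPU are not compatible"*)
            echo "cpu-model" ;;
        *"Could not open backing file"*|*"qemu-img"*)
            echo "image" ;;
        *"libvirtError"*)
            echo "libvirt-other" ;;
        *)
            echo "unknown" ;;
    esac
}
```

Feed it the output of the Step 2 grep, for instance by appending `| classify_spawn_failure` to either command. An "unknown" or "libvirt-other" result means the traceback needs to be read by hand.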

Step 3: Check host capacity and hypervisor health

df -h /var/lib/nova
free -m
docker exec nova_libvirt virsh list --all 2>/dev/null || sudo virsh list --all
docker exec nova_libvirt virsh nodeinfo 2>/dev/null || sudo virsh nodeinfo

Rule out a full disk, an unresponsive libvirt daemon, or an over-committed host before digging deeper.

Step 4: If it’s networking, correlate with Neutron

docker logs nova_compute 2>&1 | grep -i "vif" | tail -10
docker logs neutron_openvswitch_agent 2>&1 | tail -30
# Traditional
sudo journalctl -u neutron-openvswitch-agent --no-pager | tail -30
openstack network agent list --host <HOST>

A VIF timeout almost always means that the L2 agent on the host is slow or down, or that the port failed to bind.

Step 5: Check mandatory access control (SELinux/AppArmor) and CPU config

sudo ausearch -m avc -ts recent | grep -iE 'qemu|libvirt|svirt' | tail
sudo aa-status | grep -i libvirt
grep -E '^(virt_type|cpu_mode|cpu_models)' /etc/nova/nova.conf
grep -Ec '(vmx|svm)' /proc/cpuinfo

Example Root Cause Analysis

app-09 is stuck in BUILD/spawning on compute-02, then errors with “Virtual Interface creation failed.”

The nova-compute log:

WARNING nova.virt.libvirt.driver [instance: 7c9e...] Timeout waiting for [('network-vif-plugged', 'a1b2c3d4-1111-...')] to be plugged.
ERROR nova.compute.manager [instance: 7c9e...] Instance failed to spawn: nova.exception.VirtualInterfaceCreateException: Virtual Interface creation failed

So Nova started the domain but never received the network-vif-plugged event from Neutron. Checking the OVS agent on compute-02:

openstack network agent list --host compute-02
+---------+--------------------+------------+-------+-------+
| ID      | Agent Type         | Host       | Alive | State |
+---------+--------------------+------------+-------+-------+
| 3a2b... | Open vSwitch agent | compute-02 | XXX   | UP    |
+---------+--------------------+------------+-------+-------+

The agent is not heartbeating (Alive = XXX), so it never wired the tap interface and Nova’s VIF timeout fired. The agent log confirms it was stuck reconnecting to RabbitMQ.
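The same check can be run across all hosts instead of one. The sketch below lists hosts whose agent is not alive. The helper name dead_agent_hosts is hypothetical, and it expects "Host Alive" pairs on stdin, for example from `openstack network agent list -f value -c Host -c Alive`. Whether that format prints XXX or False for a dead agent is treated as an assumption here, so both are accepted.

```shell
# Sketch: read "Host Alive" pairs on stdin and print hosts with a dead agent.
# dead_agent_hosts is a hypothetical helper. The last field is the Alive
# value; both "XXX" (table style) and "False" (machine style) count as dead.
dead_agent_hosts() {
    awk '$NF == "XXX" || $NF == "False" { print $1 }' | sort -u
}
```

An empty result means every listed agent is heartbeating. Any host it prints is a candidate for VIF plug timeouts on new builds.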

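Fix: restart the agent, confirm it heartbeats, then hard-reboot the instance (if Nova rejects the reboot because the aborted build left the instance without a host, delete and recreate it instead):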

docker restart neutron_openvswitch_agent     # on compute-02
openstack network agent list --host compute-02   # Alive should flip to :-)
openstack server reboot --hard app-09

The VIF plugs, Nova receives the event, and app-09 reaches ACTIVE.

Prevention Best Practices

  • Monitor /var/lib/nova (and /var/lib/docker for Kolla) free space; alert well before 90%. A full state dir is a top cause of spawn failures.
  • Validate KVM/nested virt on every compute node (grep -Ec '(vmx|svm)' /proc/cpuinfo) before adding it to the scheduler pool.
  • Alert on dead L2 agents with openstack network agent list; VIF timeouts are usually an agent that stopped heartbeating.
  • Keep vif_plugging_timeout realistic for your fabric and decide deliberately on vif_plugging_is_fatal.
  • Run SELinux/AppArmor in enforcing mode but ship the OpenStack policy modules, and watch ausearch -m avc after upgrades.
  • Set cpu_mode = custom and pin cpu_models to the lowest common host CPU model when you rely on live migration, to avoid “CPU not compatible” spawn errors.

Quick Command Reference

# Fault, task state, and host
openstack server show <SERVER> -c status -c "OS-EXT-STS:task_state" -c "OS-EXT-SRV-ATTR:host" -c fault -f value

# Spawn traceback
docker logs nova_compute 2>&1 | grep -A20 "Instance failed to spawn" | tail -40
sudo journalctl -u nova-compute | grep -A20 "Instance failed to spawn" | tail -40

# Host capacity & hypervisor
df -h /var/lib/nova; free -m
docker exec nova_libvirt virsh list --all
docker exec nova_libvirt virsh nodeinfo

# KVM availability
grep -Ec '(vmx|svm)' /proc/cpuinfo; lsmod | grep kvm

# VIF / Neutron correlation
docker logs nova_compute 2>&1 | grep -i "vif" | tail -10
openstack network agent list --host <HOST>

# Mandatory access control & CPU
sudo ausearch -m avc -ts recent | grep -iE 'qemu|libvirt|svirt' | tail
grep -E '^(virt_type|cpu_mode|cpu_models)' /etc/nova/nova.conf

# Recover
openstack server reboot --hard <SERVER>

Conclusion

“Instance failed to spawn” is a host-local build failure after the scheduler has already chosen the compute node. The typical root causes:

  1. Libvirt/qemu cannot start the domain (no KVM / missing virtualization flags).
  2. The compute host’s /var/lib/nova (or Docker) volume is out of disk.
  3. A corrupt image, wrong disk_format, or broken qcow2 backing file.
  4. A Neutron VIF plug timeout, usually a slow or dead L2 agent.
  5. SELinux/AppArmor denials blocking qemu’s access to the disk or socket.
  6. An unsupported CPU model or missing CPU flags for the requested domain.

Read the nova-compute traceback first, since it names the failing layer, then check disk space and the L2 agent, two of the most common culprits.
