Nova PCI Passthrough and SR-IOV With AI in OpenStack

PCI passthrough is one of those OpenStack features that works perfectly in the docs and fights you for a full day in production. The instance boots without the device, or the device shows up on the wrong NUMA node and tanks performance, or Placement never reports any PCI inventory so the scheduler can’t even consider the host. The failure can live in five different layers, and the way you waste a day is by fixating on one while the bug is in another. Here’s how I wire up SR-IOV and passthrough now, layer by layer, and how AI helps me cross-check the config against the actual hardware instead of against my assumptions.

The Five Layers, Top to Bottom

Before touching config, internalize where things can break: the kernel/IOMMU, the driver/VF state, the Nova device_spec, the Placement inventory, and the flavor extra-specs. Every successful passthrough requires all five aligned. Start at the bottom and confirm the hardware is even ready:

# IOMMU enabled? (Intel logs DMAR, AMD logs AMD-Vi)
dmesg | grep -i -e DMAR -e IOMMU -e AMD-Vi
# the actual device, its driver, and NUMA node
lspci -nnk | grep -A3 -i ethernet
cat /sys/class/net/<pf>/device/numa_node

If IOMMU isn’t on (intel_iommu=on / amd_iommu=on in the kernel cmdline), nothing above it matters. The openstack category collects the related Nova hardware playbooks.
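A dmesg grep can be ambiguous, because a DMAR line can show up just from the firmware table being present. Here is a minimal sketch of a more direct check, assuming a Linux host that exposes /sys/kernel/iommu_groups. The function name iommu_status and its two optional path arguments are my own, added so the check can be pointed at a fake tree.

```shell
#!/usr/bin/env bash
# Sketch: report whether the IOMMU looks active on this host.
# Both arguments are optional and exist only for testing against a fake tree;
# on a real host call it with no arguments.
iommu_status() {
    local sysfs_root="${1:-/sys}"
    local cmdline_file="${2:-/proc/cmdline}"
    local groups=0 d

    # Each directory under iommu_groups is one isolation group.
    for d in "$sysfs_root"/kernel/iommu_groups/*/; do
        [ -d "$d" ] && groups=$((groups + 1))
    done

    # Informational only: some platforms enable the IOMMU without this flag.
    if grep -qE '(intel|amd)_iommu=on' "$cmdline_file" 2>/dev/null; then
        echo "cmdline: iommu flag present"
    else
        echo "cmdline: iommu flag missing"
    fi
    echo "iommu_groups: $groups"

    # A non-zero group count is the signal that matters for passthrough.
    [ "$groups" -gt 0 ]
}
```

Running `iommu_status || echo "fix the kernel cmdline first"` on the compute gives a yes/no answer before you look at any Nova config.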

Creating VFs and Binding the Driver

For SR-IOV you create virtual functions on the physical function and ensure they’re bound to the right driver — vfio-pci for full passthrough, or the VF driver for macvtap/direct modes:

echo 8 > /sys/class/net/<pf>/device/sriov_numvfs
ip link show <pf>            # confirm VFs appear
lspci -nnk | grep -A3 Virtual

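This step has teeth: changing sriov_numvfs resets the PF, which drops the NIC momentarily. On a compute host running instances that share that NIC, that’s an outage. If VFs already exist, the kernel also rejects a change to a different non-zero count until you write 0 first. Always do this on a drained host.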
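A minimal sketch of a more careful way to set the VF count, assuming the standard sriov_totalvfs and sriov_numvfs sysfs files. The helper name set_numvfs is my own. It refuses counts the device cannot support, skips the write when nothing would change (so the PF is not reset needlessly), and drops to 0 before moving between two non-zero counts.

```shell
#!/usr/bin/env bash
# set_numvfs <pf-device-dir> <count>
# <pf-device-dir> is normally /sys/class/net/<pf>/device. Run as root on a drained host.
set_numvfs() {
    local dev_dir="$1" want="$2" total current

    total=$(cat "$dev_dir/sriov_totalvfs") || return 1
    current=$(cat "$dev_dir/sriov_numvfs") || return 1

    if [ "$want" -gt "$total" ]; then
        echo "requested $want VFs, device supports at most $total" >&2
        return 1
    fi
    if [ "$current" -eq "$want" ]; then
        echo "already at $want VFs, leaving the PF alone"
        return 0
    fi
    # The kernel refuses a direct change between two non-zero counts,
    # so drop to 0 first when VFs already exist.
    if [ "$current" -ne 0 ]; then
        echo 0 > "$dev_dir/sriov_numvfs" || return 1
    fi
    echo "$want" > "$dev_dir/sriov_numvfs" || return 1
    echo "numvfs: $current -> $want"
}
```

Usage would look like `set_numvfs /sys/class/net/<pf>/device 8`, followed by the same ip link and lspci checks as above.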

The device_spec: Where Config Meets Hardware

Nova’s [pci] device_spec (the modern name for the passthrough whitelist) tells Nova which devices it may assign. The single most common bug is a spec that doesn’t match the actual lspci IDs:

[pci]
device_spec = {"vendor_id": "8086", "product_id": "154c", "physical_network": "physnet-sriov"}
alias = {"name": "intel-nic", "device_type": "type-VF", "vendor_id": "8086", "product_id": "154c"}

Those vendor/product IDs must exactly match what lspci -nn reports, and the physical_network must match your Neutron physnet mapping. This is exactly the cross-reference humans get wrong when tired — you copy IDs from a vendor doc instead of from your own lspci.
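The same cross-reference can be done mechanically before any model is involved. Here is a minimal sketch that reads lspci -nn text on stdin, assuming the usual [vvvv:pppp] ID format at the end of each device line. Both function names are my own.

```shell
#!/usr/bin/env bash
# spec_matches <vendor_id> <product_id>
# Reads `lspci -nn` output on stdin, prints the matching device lines,
# and exits non-zero when nothing matches.
spec_matches() {
    local vendor="$1" product="$2"
    # -F keeps the square brackets literal; -i tolerates upper-case hex.
    grep -iF "[${vendor}:${product}]"
}

# list_net_ids
# Reads `lspci -nn` output on stdin and prints the unique vendor:product
# pairs found on Ethernet/network device lines.
list_net_ids() {
    grep -iE 'ethernet|network' \
        | grep -oiE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' \
        | tr -d '[]' \
        | sort -u
}
```

With these, `lspci -nn | spec_matches 8086 154c || echo "device_spec matches nothing on this host"` answers the first question directly, and `lspci -nn | list_net_ids` shows which IDs the host really has.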

Prompt: “Here is my [pci] device_spec and alias from nova.conf, and the lspci -nnk output from the compute host. Confirm whether the vendor_id/product_id in my spec actually match a device in lspci, whether the device is bound to vfio-pci, and which NUMA node it’s on. Build a table mapping each spec entry to the matching (or missing) hardware. Don’t propose nova.conf rewrites until the mapping is confirmed.”

Output: A table showing my product_id was 154c but the host’s VFs reported 10ed. The ID in my spec belonged to a different device than the VFs on this host (154c is the VF ID for Intel’s 700-series NICs, 10ed is the 82599 VF), so the spec matched nothing. It also flagged that the device sat on NUMA node 1 while my flavor pinned CPUs to node 0, which would have hurt performance even after the ID fix.

That mapping table caught two bugs I’d have chased separately for hours. The AI is a fast junior engineer cross-referencing config against hardware — but I confirmed the VF product ID with my own lspci before changing the spec, because acting on a hallucinated ID just trades one wrong value for another.

NUMA Affinity: The Silent Performance Killer

A passthrough device has a NUMA node, and your flavor’s CPU/memory NUMA policy has one too. If they disagree, the instance pays a cross-node penalty for every packet — it works, but slowly, and nobody notices until a latency complaint. Express the affinity requirement in the flavor:

openstack flavor set m1.sriov \
  --property "pci_passthrough:alias"="intel-nic:1" \
  --property "hw:numa_nodes"="1" \
  --property "hw:pci_numa_affinity_policy"="required"

Pro Tip: Make the AI check device NUMA node against flavor NUMA policy explicitly. A passthrough that “works” but straddles NUMA nodes is the bug that survives your testing and shows up as a vague performance ticket three weeks later.
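The host side of that check is one sysfs read. Here is a minimal sketch, assuming the standard /sys/bus/pci/devices/<address>/numa_node file. The function name check_pci_numa and the optional sysfs-root argument are my own, and the expected node is whatever node you intend the instance’s CPUs to land on.

```shell
#!/usr/bin/env bash
# check_pci_numa <pci-address> <expected-node> [sysfs-root]
# Exit status: 0 on match, 1 on mismatch or unknown affinity, 2 if the device is missing.
check_pci_numa() {
    local addr="$1" expected="$2" sysfs_root="${3:-/sys}"
    local node_file="$sysfs_root/bus/pci/devices/$addr/numa_node" actual

    [ -r "$node_file" ] || { echo "$addr: no numa_node file" >&2; return 2; }
    actual=$(cat "$node_file")

    # The kernel reports -1 when it has no NUMA information for the device.
    if [ "$actual" = "-1" ]; then
        echo "$addr: kernel reports no NUMA affinity (-1)"
        return 1
    fi
    if [ "$actual" != "$expected" ]; then
        echo "$addr: on node $actual, expected $expected"
        return 1
    fi
    echo "$addr: on node $actual as expected"
}
```

For a VF at a hypothetical address, `check_pci_numa 0000:3b:10.0 0` either confirms the placement or prints the node the device really sits on.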

Confirming Placement Sees the Device

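The scheduler only lands a PCI flavor on a host that advertises matching devices. On releases that support PCI tracking in Placement, that inventory appears only when [pci] report_in_placement is enabled on the compute; otherwise Nova tracks the devices itself and the PciPassthroughFilter does the matching, so an empty Placement inventory is expected. Neutron-managed SR-IOV devices (those with a physical_network tag) have historically not been reported there either, so check the Nova docs for your release. With Placement tracking on, restart nova-compute and verify that a PCI resource class shows up. Nova documents the devices as child resource providers under the compute node, so list the tree rather than only the root provider:

openstack resource provider list
openstack --os-placement-api-version 1.14 resource provider list --in-tree <compute-rp>
openstack resource provider inventory list <pci-child-rp>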

If the PCI inventory isn’t there, the scheduler can’t place the flavor no matter how perfect the device_spec is. When I’m reconciling “the device is configured but won’t schedule,” I’ll hand the device_spec, the Placement inventory, and the flavor to Claude and ask it to find the break in the chain. That triage is fast; I verify the conclusion by actually attempting a boot on the target host. Reusable PCI prompts live in the prompt workspace.

Validate on One Host Before the Fleet

The whole thing comes together as a single-host validation: pick one drained compute, create the VFs, set the device_spec, restart nova-compute, confirm Placement inventory, and boot a test instance with the flavor. Verify in-guest that the device is present and on the right NUMA node. Only then roll the config to the rest of the fleet. Doing this on one host means a misconfiguration costs you one host, not the whole rack.

Conclusion

PCI passthrough and SR-IOV are five aligned layers pretending to be one feature, and the path to a wasted day is debugging one layer while the bug hides in another. AI is genuinely good at the cross-reference that humans fumble — matching device_spec IDs against lspci, checking device NUMA against flavor policy, tracing why Placement has no inventory. Every one of those is a lead you verify against the real hardware before you act, because a confidently wrong product ID is no better than the one you started with. Cross-check with the model, validate on one host, then roll. More Nova hardware prompts are in the prompts library.