Nova Host Aggregates, NUMA, and CPU Pinning in OpenStack

There is a category of OpenStack workload (packet processing, real-time systems, latency-sensitive databases) where a “normal” VM is not good enough because the noisy-neighbor jitter of shared CPUs ruins it. The fix is NUMA-aware placement and dedicated CPU pinning, where Nova binds a VM’s vCPUs to specific physical cores and keeps its memory local to the right NUMA node. The configuration is spread across host aggregates, flavor extra specs, and nova.conf on the compute host, and a single mismatch sends instances into No valid host found. This is the working configuration and the debug path I use.

Decide which hosts do pinning

Mixing pinned and unpinned instances on the same host leads to contention and overcommit chaos. The clean pattern is to dedicate a set of hosts to pinned workloads via a host aggregate and steer flavors to them.

openstack aggregate create --zone nova pinned-hosts
openstack aggregate add host pinned-hosts compute-perf-01
openstack aggregate set --property pinned=true pinned-hosts

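The pinned=true property is metadata that the AggregateInstanceExtraSpecsFilter matches against flavor extra specs, so that filter must be listed in enabled_filters under [filter_scheduler] in the scheduler’s nova.conf. On its own it only steers flavors that carry the matching key onto these hosts. To keep general-purpose VMs off your performance hosts as well, put the remaining hosts in a second aggregate with pinned=false and set aggregate_instance_extra_specs:pinned=false on the general-purpose flavors, so the two populations never fight over the same cores.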

Tell the flavor what it needs

Pinning and NUMA topology are requested through flavor extra specs. These are the knobs that turn an ordinary flavor into a performance flavor.

openstack flavor create --vcpus 8 --ram 16384 --disk 40 perf.8c
openstack flavor set perf.8c \
  --property hw:cpu_policy=dedicated \
  --property hw:cpu_thread_policy=isolate \
  --property hw:numa_nodes=1 \
  --property aggregate_instance_extra_specs:pinned=true

hw:cpu_policy=dedicated is what triggers pinning. hw:numa_nodes=1 keeps the guest’s CPUs and memory on a single NUMA node to avoid cross-node memory latency. The aggregate_instance_extra_specs:pinned=true key matches the aggregate property so these flavors only land on pinned hosts. All three must agree. A mismatched aggregate key typically ends in No valid host found, while a missing hw: key can fail quietly, with the instance booting unpinned on whatever host accepts it.
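The flavor-to-aggregate match can be checked offline before anything is scheduled. The following is a minimal sketch, not Nova's filter code: the function name aggregate_mismatches is hypothetical, and plain string equality is an assumption that stands in for the scheduler's real matching rules.

```python
# Sketch only: plain string equality stands in for the scheduler's real matching.
PREFIX = "aggregate_instance_extra_specs:"


def aggregate_mismatches(flavor_specs, aggregate_props):
    """List the scoped flavor keys that the aggregate metadata does not satisfy."""
    problems = []
    for key, wanted in flavor_specs.items():
        if not key.startswith(PREFIX):
            continue  # hw:* keys are not aggregate metadata
        name = key[len(PREFIX):]
        actual = aggregate_props.get(name)
        if actual is None:
            problems.append(f"aggregate has no '{name}' property")
        elif actual != wanted:
            problems.append(f"'{name}': flavor wants {wanted!r}, aggregate has {actual!r}")
    return problems


# Extra specs of perf.8c as set above.
flavor = {
    "hw:cpu_policy": "dedicated",
    "hw:cpu_thread_policy": "isolate",
    "hw:numa_nodes": "1",
    "aggregate_instance_extra_specs:pinned": "true",
}

print(aggregate_mismatches(flavor, {"pinned": "true"}))   # consistent pair
print(aggregate_mismatches(flavor, {"pinned": "false"}))  # value differs
print(aggregate_mismatches(flavor, {}))                   # property missing on the aggregate
```

Feeding it the properties from openstack flavor show and openstack aggregate show is enough to catch a typo in the key or value before the scheduler does.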

Pro Tip: hw:cpu_thread_policy=isolate reserves whole physical cores by leaving their sibling hyperthreads idle — great for latency, expensive on density. Use prefer instead if you care more about packing than worst-case jitter. Choosing this without understanding the density cost is how clouds run out of “capacity” that is technically there.
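The density cost is easy to put a number on. This is a rough sketch that assumes, as described above, that isolate consumes one whole physical core per vCPU while the other policies can use every hardware thread. The host size (28 dedicated cores, 2 threads per core) and the function name pinned_capacity are hypothetical, and NUMA boundaries are ignored for simplicity.

```python
def pinned_capacity(dedicated_cores, threads_per_core, vcpus, thread_policy):
    """Rough count of instances that fit on a host's dedicated cores.

    Simplification: ignores NUMA node boundaries and emulator threads.
    """
    if thread_policy == "isolate":
        usable = dedicated_cores                     # one vCPU per core, siblings idle
    else:
        usable = dedicated_cores * threads_per_core  # siblings can carry vCPUs too
    return usable // vcpus


# Hypothetical host: 28 physical cores set aside for guests, 2 threads each.
for policy in ("isolate", "prefer"):
    print(policy, pinned_capacity(28, 2, vcpus=8, thread_policy=policy))
```

Under these assumptions the same host holds 3 perf.8c instances with isolate and 7 with prefer, which is the gap that shows up later as missing capacity.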

Configure the compute host to match

Flavor specs are requests; the host has to actually reserve cores for guests and leave some for the host OS. This lives in nova.conf on the compute node.

grep -E 'cpu_dedicated_set|cpu_shared_set|reserved_host' /etc/nova/nova.conf
# Inspect the real topology:
lscpu | grep -E 'NUMA|Socket|Thread'
numactl --hardware

The cpu_dedicated_set option, in the [compute] section, lists the host CPUs Nova may pin guests to, and you must exclude the cores you want the host OS and emulator threads to use. If cpu_dedicated_set overlaps the host’s own cores, you get exactly the jitter you were trying to eliminate. Match it to the real topology from lscpu and numactl, and restart nova-compute after changing it.
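An overlap between the dedicated set and the cores kept for the host is hard to spot by eye in range notation. The sketch below is a simplified re-implementation of the comma, range, and caret-exclusion syntax, not Nova's own parser, and the CPU lists are hypothetical values for a 16-CPU host.

```python
def parse_cpu_set(spec):
    """Expand a CPU list such as '2-7,10-15,^12' into a set of CPU ids."""
    include, exclude = set(), set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        target = include
        if part.startswith("^"):       # caret marks an exclusion
            target, part = exclude, part[1:]
        if "-" in part:
            low, high = (int(x) for x in part.split("-", 1))
            target.update(range(low, high + 1))
        else:
            target.add(int(part))
    return include - exclude


# Hypothetical layout: CPUs 0-1 and 8-9 are kept for the host OS and emulator threads.
host_cpus = parse_cpu_set("0-1,8-9")
good = parse_cpu_set("2-7,10-15")
bad = parse_cpu_set("0-7,10-15")

print(sorted(good & host_cpus))  # no overlap expected
print(sorted(bad & host_cpus))   # CPUs that guests and the host would share
```

Any non-empty intersection is a core where a pinned guest and the host compete, which is the jitter the pinning was meant to remove.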

Debug “No valid host found” for a pinned flavor

This is the signature failure. The scheduler could not find a host with enough free pinnable cores on a single NUMA node. The placement and scheduler views narrow it down.

openstack server show <instance> -f value -c fault
# On the scheduler:
grep -i 'NUMATopologyFilter\|no valid host' /var/log/nova/nova-scheduler.log
# Host-wide totals only; this does not show the per-NUMA-node split:
openstack hypervisor show compute-perf-01 -f value -c vcpus -c vcpus_used

The NUMATopologyFilter rejecting all hosts means no single NUMA node has 8 free dedicated cores, even if the host has 14 free in total, split 7 and 7 across two nodes. NUMA placement is per-node, not per-host, and that distinction is the cause of most baffling pinning failures. Either relax hw:numa_nodes (for example to 2, so the guest is split across host nodes) or free up cores on a node.
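The per-node arithmetic can be sketched in a few lines. This is an illustration of the reasoning, not the NUMATopologyFilter itself: the free-core counts are hypothetical, the function names are made up for the example, and it assumes vCPUs are split evenly across guest NUMA nodes, each of which needs its own host node.

```python
def nodes_with_room(free_pcpus_by_node, needed):
    """Host NUMA nodes that have at least `needed` free dedicated cores."""
    return sorted(n for n, free in free_pcpus_by_node.items() if free >= needed)


def fits(free_pcpus_by_node, vcpus, guest_numa_nodes):
    """True if each guest NUMA node can be placed on a distinct host node."""
    per_node = vcpus // guest_numa_nodes  # assumes an even split
    return len(nodes_with_room(free_pcpus_by_node, per_node)) >= guest_numa_nodes


# Hypothetical host: 14 free dedicated cores in total, 7 on each node.
free = {0: 7, 1: 7}

print(sum(free.values()))                    # host-wide total looks sufficient
print(fits(free, vcpus=8, guest_numa_nodes=1))  # hw:numa_nodes=1 needs 8 on one node
print(fits(free, vcpus=8, guest_numa_nodes=2))  # hw:numa_nodes=2 needs 4 on each of two
```

The host-wide total says there is room, the single-node check says there is not, and splitting the guest across two nodes makes it fit again at the cost of cross-node memory access inside the guest.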

Where AI accelerates the config

This is multi-file, multi-layer configuration where one mismatch is invisible, which is precisely where an AI assistant earns its place as a fast junior engineer. I paste the flavor extra specs, the aggregate properties, the cpu_dedicated_set, and the numactl --hardware output, and ask it to confirm they are mutually consistent — does the dedicated set leave room for emulator threads, does the NUMA request fit a single node, does the aggregate property match the flavor.

I keep it sanitized and credential-free; it never touches nova.conf on a real host. The model finds the mismatch and explains the NUMA math; I make the config change, restart nova-compute myself, and re-test, because a wrong cpu_dedicated_set rolled out cluster-wide degrades every pinned workload at once. The code review dashboard is where I run the actual nova.conf diff, Cursor helps when I am templating these specs across many flavors, and the prompt library has config-consistency prompts.

openstack flavor show perf.8c -f value -c properties   # what I hand the model
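Sanitizing can be scripted so that it is not left to memory. The sketch below redacts values whose option names look sensitive before a nova.conf excerpt leaves the host. The keyword list is an assumption and is not exhaustive, so the output still deserves a read before it is pasted anywhere.

```python
import re

# Assumed keyword list; extend it for your own deployment.
SENSITIVE = re.compile(r"(password|secret|token|connection|transport_url)", re.IGNORECASE)


def sanitize(conf_text):
    """Replace the value of any option whose name looks sensitive."""
    cleaned = []
    for line in conf_text.splitlines():
        stripped = line.strip()
        is_option = stripped and not stripped.startswith(("#", "[")) and "=" in stripped
        if is_option:
            key = stripped.partition("=")[0].strip()
            if SENSITIVE.search(key):
                cleaned.append(f"{key} = <redacted>")
                continue
        cleaned.append(line)
    return "\n".join(cleaned)


# Made-up excerpt for illustration.
sample = """[DEFAULT]
transport_url = rabbit://nova:hunter2@mq.example:5672/
[compute]
cpu_dedicated_set = 2-7,10-15
"""
print(sanitize(sample))
```

Section headers and the pinning options pass through untouched, which is all the model needs to check consistency.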

Huge pages: the other half of NUMA performance

CPU pinning without huge pages is half a solution for memory-latency-sensitive workloads. Default 4 KB pages create heavy TLB pressure, and the gains from pinning evaporate under page-table churn. Requesting huge pages in the flavor — and reserving them on the host — completes the picture, but it adds another layer that must line up with everything else.

openstack flavor set perf.8c --property hw:mem_page_size=large
# On the compute host, confirm huge pages are reserved per NUMA node:
grep -i huge /proc/meminfo
numastat -m | grep -i huge

The catch is that huge pages are reserved per NUMA node, usually at boot, so a host can have plenty of huge pages in total but none free on the specific node where the scheduler wants to place the guest. You get No valid host found again, this time for memory rather than CPUs. NUMA placement is per-node for memory exactly as it is for cores, and the two constraints compound: the guest needs both free cores and free huge pages on the same node. Reserve huge pages with awareness of your NUMA layout, not as a single host-wide number.
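The compounding of the two constraints is easier to see with numbers. This sketch computes how many huge pages the 16384 MB flavor needs and then looks for a node that has both enough free cores and enough free pages. The per-node figures and function names are hypothetical, and 2 MiB pages are assumed.

```python
def pages_needed(ram_mb, page_kb=2048):
    """Huge pages required to back `ram_mb` of guest memory (rounded up)."""
    return (ram_mb * 1024 + page_kb - 1) // page_kb


def candidate_nodes(free_cores, free_pages, vcpus, pages):
    """Nodes that satisfy the CPU and the huge page requirement together."""
    return sorted(
        node for node in free_cores
        if free_cores[node] >= vcpus and free_pages.get(node, 0) >= pages
    )


need = pages_needed(16384)  # perf.8c has 16384 MB of RAM
print(need)

# Hypothetical host: node 0 has the cores, node 1 has the pages.
free_cores = {0: 10, 1: 2}
free_pages = {0: 1024, 1: 12000}
print(candidate_nodes(free_cores, free_pages, vcpus=8, pages=need))
```

Each resource looks available when checked separately, yet no node offers both, so the instance has nowhere to go.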

Don’t forget the emulator threads

A pinned VM is not just its vCPUs — QEMU runs emulator and I/O threads too, and by default they float onto the same pinned cores, reintroducing the jitter you paid to eliminate. The fix is an emulator thread policy that gives those threads their own home, which is the kind of detail that separates “configured pinning” from “pinning that actually delivers low latency.”

openstack flavor set perf.8c --property hw:emulator_threads_policy=isolate

With isolate, the emulator threads get a dedicated pCPU outside the guest’s vCPU set, so a busy I/O path does not steal cycles from the latency-critical vCPUs. The price is one extra dedicated core per instance, so perf.8c consumes 9 pinned cores rather than 8. Skipping this is why some teams pin everything correctly and still see periodic latency spikes: the spikes are QEMU’s own threads contending with the guest. It is the last mile of pinning, and it is invisible until you measure tail latency.
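The extra core changes the fit calculation, which is easy to overlook on a node that was sized exactly for the flavor. The sketch below assumes the emulator-thread core is taken from the same host NUMA node as the guest's vCPUs. Treat that as an assumption to confirm for your release. The free-core counts and function names are hypothetical.

```python
def pcpus_consumed(vcpus, emulator_threads_policy=None):
    """Dedicated host cores one instance takes, including the emulator core."""
    extra = 1 if emulator_threads_policy == "isolate" else 0
    return vcpus + extra


def node_can_host(free_on_node, vcpus, emulator_threads_policy=None):
    """True if a single node has room for the vCPUs plus any emulator core."""
    return free_on_node >= pcpus_consumed(vcpus, emulator_threads_policy)


# Hypothetical node with exactly 8 free dedicated cores.
print(node_can_host(8, vcpus=8))                                     # vCPUs only
print(node_can_host(8, vcpus=8, emulator_threads_policy="isolate"))  # needs a ninth core
```

A node that fit the flavor yesterday can stop fitting it the moment the emulator thread policy is added, so budget the dedicated set for vCPUs plus one per instance.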

Verify pinning actually happened

Configuration without verification is a guess. After an instance boots on a pinned flavor, confirm libvirt really pinned it rather than silently falling back.

# On the compute host hosting the instance:
virsh vcpupin instance-0000abcd
virsh numatune instance-0000abcd

virsh vcpupin should show each vCPU bound to a distinct physical core from your dedicated set, and numatune should show memory bound to the expected node. If the vCPUs are floating across all cores, pinning did not take and the flavor or host config is wrong — better to find that now than under production load.
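Reading the table by eye works for one instance, but a script scales better across a fleet. The sketch below parses captured virsh vcpupin output and flags vCPUs that float over a range, sit outside the dedicated set, or share a core. The sample text is illustrative, the exact column layout varies between libvirt versions, and the function names are hypothetical.

```python
import re


def parse_vcpupin(text):
    """Map vCPU number to its affinity string; header and rule lines are skipped."""
    pins = {}
    for line in text.splitlines():
        match = re.match(r"^\s*(\d+):?\s+(\S+)\s*$", line)
        if match:
            pins[int(match.group(1))] = match.group(2)
    return pins


def pinning_problems(pins, dedicated):
    """Describe every vCPU that is not pinned to its own dedicated core."""
    problems, owner = [], {}
    for vcpu, affinity in sorted(pins.items()):
        if not affinity.isdigit():
            problems.append(f"vCPU {vcpu} floats over {affinity}")
            continue
        core = int(affinity)
        if core not in dedicated:
            problems.append(f"vCPU {vcpu} is on core {core}, outside the dedicated set")
        if core in owner:
            problems.append(f"vCPU {vcpu} shares core {core} with vCPU {owner[core]}")
        owner[core] = vcpu
    return problems


# Illustrative captures, not real output from a host.
pinned = " VCPU   CPU Affinity\n----------------------\n 0      2\n 1      3\n"
floating = " VCPU   CPU Affinity\n----------------------\n 0      0-15\n 1      0-15\n"
dedicated = {2, 3, 4, 5, 6, 7}

print(pinning_problems(parse_vcpupin(pinned), dedicated))
print(pinning_problems(parse_vcpupin(floating), dedicated))
```

An empty list means every vCPU has its own core from the dedicated set. Anything else names the vCPU to investigate.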

Conclusion

NUMA and CPU pinning turn OpenStack into a viable home for latency-sensitive workloads, but the configuration spans aggregates, flavors, and the hypervisor, and the layers must agree exactly. Dedicate hosts via an aggregate, request pinning in flavor specs, match cpu_dedicated_set to the real topology, and always verify with virsh. An AI assistant is an excellent fast junior for checking that these layered settings are mutually consistent and for explaining the per-NUMA-node math behind No valid host found. Keep credentials and live config files away from it, verify every suggestion against numactl and virsh, and roll out and restart yourself.