CPU Affinity and Core Isolation for Latency-Sensitive Linux Workloads

Most servers do fine letting the kernel scheduler put work wherever it likes. But once you’re chasing tail latency — trading systems, real-time media, packet processing, low-jitter benchmarks — the scheduler’s default “move things around for fairness” behavior becomes the enemy. Every time your hot thread migrates to a different core, you eat a cache-cold penalty and a scheduling hiccup. The fix is to nail work to specific CPUs and keep everything else off them. Here’s how I do it.

First, understand your topology

You can’t pin intelligently without knowing the layout. Cores, threads (SMT siblings), and NUMA nodes all matter.

lscpu                    # summary: sockets, cores, threads, NUMA nodes
lscpu --extended         # per-CPU: core id, socket, NUMA node
numactl --hardware       # NUMA node memory + distances
grep -E 'processor|physical id|core id' /proc/cpuinfo

The thing to internalize: CPU numbers are logical IDs, and hyperthread siblings share execution resources. Pinning a latency-critical thread to CPU 0 while a noisy neighbor runs on its SMT sibling means they’re fighting over the same physical core. The sibling is often CPU N/2 on a machine with N logical CPUs, but numbering varies by platform, so check rather than guess:

cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list
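
The sibling file prints a kernel CPU list such as 0,16 or 0-3. Here’s a minimal sketch of a helper that expands that format into one CPU id per line so it is easy to loop over. The name expand_cpulist is my own, and it assumes the plain comma-and-range form that sysfs prints (no stride syntax).

```shell
# Expand a kernel CPU list such as "0-3,8" into one CPU id per line.
# Hypothetical helper; handles only commas and simple ranges.
expand_cpulist() (
    list=$1
    IFS=,            # split on commas (the function body is a subshell,
    set -- $list     # so the IFS change does not leak to the caller)
    for part in "$@"; do
        case $part in
            *-*)
                i=${part%-*}
                last=${part#*-}
                while [ "$i" -le "$last" ]; do
                    echo "$i"
                    i=$((i + 1))
                done
                ;;
            '') ;;
            *)  echo "$part" ;;
        esac
    done
)
```

Typical use would be expand_cpulist "$(cat /sys/devices/system/cpu/cpu2/topology/thread_siblings_list)" to see every logical CPU that shares a physical core with CPU 2.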

taskset: pin a process to CPUs

taskset sets CPU affinity — the mask of CPUs a process is allowed to run on.

# Launch a process pinned to CPUs 2 and 3
taskset -c 2,3 ./myapp

# Re-pin a running process by PID (-a covers every thread, not just the main one;
# assumes pgrep matches exactly one process)
taskset -acp 2,3 $(pgrep myapp)

# Inspect current affinity
taskset -cp $(pgrep myapp)

Affinity is inherited by children created after it is set, so launching a parent under taskset pins the whole tree. Re-pinning a parent that is already running does not move children that already exist. For multithreaded apps where you want different threads on different cores, you usually set affinity from inside the app (pthread_setaffinity_np) or per-thread with taskset -cp against thread IDs from /proc/<pid>/task/.
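
For the per-thread route, a small dry-run helper saves typing. This is a sketch under my own assumptions: plan_thread_pins is a hypothetical name, and round-robin assignment is just one simple policy, not necessarily the right one for your app. It only prints taskset commands so you can review them before running anything.

```shell
# Print one "taskset -cp <cpu> <tid>" line per thread, assigning CPUs round-robin.
# $1 = task directory (normally /proc/<pid>/task); remaining args = CPU ids.
# Dry run only: pipe the output to sh once you are happy with it.
plan_thread_pins() (
    task_dir=$1
    shift
    [ "$#" -gt 0 ] || { echo "usage: plan_thread_pins TASK_DIR CPU..." >&2; exit 1; }
    ncpus=$#
    i=0
    for entry in "$task_dir"/*; do
        [ -d "$entry" ] || continue
        tid=${entry##*/}
        # Select positional argument number (i mod ncpus) + 1.
        idx=$((i % ncpus + 1))
        eval "cpu=\${$idx}"
        echo "taskset -cp $cpu $tid"
        i=$((i + 1))
    done
)
```

For example, plan_thread_pins /proc/$(pgrep myapp)/task 2 3 alternates the threads between CPUs 2 and 3. Threads are visited in directory-glob order, so check which thread is your hot one before trusting the assignment.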

NUMA: pin memory too, not just CPU

On a multi-socket box, accessing memory attached to a different socket is slower. Pinning CPU without pinning memory can leave you fetching across the interconnect. numactl binds both:

numactl --cpunodebind=0 --membind=0 ./myapp

This says “run on node 0’s CPUs and allocate from node 0’s memory.” For memory-bandwidth-bound workloads, getting this right is often a bigger win than the CPU pinning itself. Verify placement:

numastat -p $(pgrep myapp)    # per-node memory for the process
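
Before choosing a --membind node, it helps to confirm which node the CPUs you pinned actually live on. A minimal sketch, assuming a NUMA-enabled kernel that exposes a nodeN link under each CPU directory in sysfs. The function name node_of_cpu and the optional root argument are mine.

```shell
# Print the NUMA node id that a logical CPU belongs to.
# $1 = CPU id; $2 = sysfs CPU root (defaults to /sys/devices/system/cpu).
# Returns non-zero if no node link is found (for example on a non-NUMA kernel).
node_of_cpu() (
    cpu=$1
    root=${2:-/sys/devices/system/cpu}
    for link in "$root/cpu$cpu"/node[0-9]*; do
        [ -e "$link" ] || continue
        name=${link##*/}
        echo "${name#node}"
        exit 0
    done
    exit 1
)
```

If node_of_cpu 2 prints 1, then binding memory to node 0 would send every allocation across the interconnect, which is the opposite of what you want.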

isolcpus: keep the scheduler off your cores

Affinity tells your process where it can run; it doesn’t stop other processes and kernel work from landing on those same cores. To truly reserve cores, isolate them at boot so the general scheduler won’t place tasks there.

Add to the kernel command line (GRUB_CMDLINE_LINUX in /etc/default/grub, then regenerate the GRUB config):

isolcpus=2,3 nohz_full=2,3 rcu_nocbs=2,3

isolcpus — the scheduler won’t put normal tasks on these cores.
nohz_full — disables the periodic scheduler tick on those cores when one task is running, removing a recurring source of jitter.
rcu_nocbs — offloads RCU callback processing off those cores.

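After rebooting, the scheduler places only explicitly pinned tasks (for example with taskset) on those cores, though per-CPU kernel threads and any interrupts routed there still run. The scheduler also stops load-balancing across isolated cores, so a multithreaded process pinned to 2,3 as a group tends to pile onto one of them: pin each thread to its own core. Confirm the isolation took effect: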

cat /sys/devices/system/cpu/isolated
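
It’s also worth checking that all three parameters made it onto the running kernel’s command line, since a typo in the GRUB config fails silently. A rough sketch: cmdline_param is a hypothetical helper of mine that returns the first name=value match from /proc/cmdline (or from a string you pass in).

```shell
# Print the value of a "name=value" kernel parameter.
# $1 = parameter name; $2 = command-line string (defaults to /proc/cmdline).
# Returns non-zero if the parameter is absent. Reports the first match only.
cmdline_param() (
    name=$1
    if [ "$#" -ge 2 ]; then
        cmdline=$2
    else
        cmdline=$(cat /proc/cmdline)
    fi
    set -f    # no glob expansion while word-splitting the command line
    for word in $cmdline; do
        case $word in
            "$name"=*)
                echo "${word#*=}"
                exit 0
                ;;
        esac
    done
    exit 1
)
```

Running cmdline_param isolcpus, cmdline_param nohz_full, and cmdline_param rcu_nocbs in turn should each print the same CPU list you configured.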

The trade-off: isolated cores are useless to everything else, so you’re sacrificing throughput for latency. On a 32-core box, dedicating 4 isolated cores to a latency-critical service and leaving 28 for everything else is a common, sane split.

cgroups: a more flexible alternative to isolcpus

isolcpus is static and requires a reboot. The cgroup v2 cpuset controller lets you carve out cores dynamically. With systemd:

[Service]
AllowedCPUs=2,3

That confines a service to cores 2 and 3 without editing the kernel command line or rebooting. You can also push everything else off those cores by constraining the system slices, achieving soft isolation that’s tunable at runtime. Inspect it:

systemctl show myservice -p AllowedCPUs
cat /sys/fs/cgroup/system.slice/myservice.service/cpuset.cpus
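
Constraining the system slices means giving them the complement of your reserved cores. Here’s a sketch of how I’d compute that and print the matching systemctl commands. The function names are hypothetical, the choice of system.slice, user.slice, and init.scope is my assumption about which units cover “everything else” on a stock systemd box, and nothing is applied until you run the printed commands yourself.

```shell
# Print a comma-separated list of CPUs in 0..(total-1) that are NOT reserved.
# $1 = number of logical CPUs; remaining args = reserved CPU ids.
housekeeping_cpus() (
    total=$1
    shift
    out=
    cpu=0
    while [ "$cpu" -lt "$total" ]; do
        reserved=no
        for r in "$@"; do
            if [ "$r" -eq "$cpu" ]; then reserved=yes; fi
        done
        if [ "$reserved" = no ]; then
            out=${out:+$out,}$cpu
        fi
        cpu=$((cpu + 1))
    done
    echo "$out"
)

# Dry run: print commands that would confine the stock slices to $1.
# --runtime keeps the change from persisting across reboots.
print_slice_commands() {
    for unit in system.slice user.slice init.scope; do
        echo "systemctl set-property --runtime $unit AllowedCPUs=$1"
    done
}
```

On an 8-CPU machine with cores 2 and 3 reserved, print_slice_commands "$(housekeeping_cpus 8 2 3)" prints three commands with AllowedCPUs=0,1,4,5,6,7.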

Don’t forget IRQ affinity

Pinning your process means nothing if a network card keeps interrupting your isolated core. Interrupts have their own affinity:

cat /proc/interrupts                       # which CPU handles each IRQ
echo 1 | sudo tee /proc/irq/<N>/smp_affinity   # hex bitmask: 1 = CPU 0 only

Route device interrupts away from your isolated cores so a busy NIC doesn’t inject latency where you’ve worked hard to remove it. Distros often ship irqbalance, which actively moves IRQs around — disable or constrain it when you’re hand-tuning, or it’ll undo your work.
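
Working out the hex mask by hand is where mistakes creep in. A minimal sketch of a converter follows. The name cpus_to_mask is my own, and I’ve capped it at CPU 31 because larger machines use comma-separated 32-bit groups in smp_affinity, which this does not produce. On those, writing a plain CPU list to /proc/irq/<N>/smp_affinity_list is simpler.

```shell
# Convert CPU ids to the hex bitmask format used by /proc/irq/<N>/smp_affinity.
# Hypothetical helper; supports CPU ids 0-31 only (a single 32-bit group).
cpus_to_mask() (
    mask=0
    for cpu in "$@"; do
        if [ "$cpu" -lt 0 ] || [ "$cpu" -gt 31 ]; then
            echo "cpus_to_mask: CPU id out of range: $cpu" >&2
            exit 1
        fi
        mask=$((mask | (1 << cpu)))
    done
    printf '%x\n' "$mask"
)
```

So cpus_to_mask 0 1 4 5 6 7 gives the mask for every core except 2 and 3 on an 8-CPU machine, which is what you’d write to keep an IRQ off the reserved pair.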

Measure, don’t assume

Pinning is only worth it if you can show the win. Measure jitter before and after with cyclictest (from rt-tests), and watch migrations with perf:

cyclictest -t1 -p 80 -a 2 -i 1000 -l 100000
perf stat -e migrations,context-switches ./myapp

If migrations drops to near zero and your p99 latency tightens, the pinning worked. If not, you tuned the wrong thing.
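
For a service that is already running, the per-process counters in /proc/<pid>/status give a quick second opinion without restarting it under perf. A rough sketch, with ctxt_switches as my own hypothetical helper name. A nonvoluntary count that keeps climbing on a supposedly dedicated core suggests something else is still being scheduled there.

```shell
# Print "<voluntary> <nonvoluntary>" context-switch counts from a status file.
# $1 = path to a /proc/<pid>/status file.
ctxt_switches() (
    awk '
        /^voluntary_ctxt_switches:/    { v = $2 }
        /^nonvoluntary_ctxt_switches:/ { n = $2 }
        END { print v + 0, n + 0 }
    ' "$1"
)
```

Sample it twice a few seconds apart, for example with ctxt_switches /proc/$(pgrep myapp)/status, and compare the deltas before and after pinning. Note that this file reports the main thread only; per-thread counts live under /proc/<pid>/task/<tid>/status.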

Where AI helps

The error-prone part of all this is reading lscpu --extended and numactl --hardware, then working out a sane core-and-NUMA assignment by hand. Paste that topology into a model and ask it to propose an isolation plan — which cores to isolate, which siblings to avoid, which NUMA node to bind — and you get a reasoned starting point to verify. I keep a few Linux admin prompts for exactly this.

CPU pinning is a sharp tool: enormous wins for the few workloads that need it, wasted throughput for the ones that don’t. Know your topology, isolate deliberately, handle IRQs, and always measure the before and after.

Generated commands and configs are assistive, not authoritative. Always verify against your own systems before applying changes in production.
