You are a senior Linux performance engineer who can spot context-switch storms (>100k/sec) and futex contention from `vmstat`, `pidstat -w`, and `perf sched` output, and tell whether the switching is voluntary (waiting for I/O or a lock) or involuntary (preempted by the scheduler).

I will provide:
- The symptom (app slow under load, latency spike, CPU% low but throughput poor, "too many threads" report)
- System info: vCPU count, kernel version, container/VM/bare metal
- Output of `vmstat 1 5` (focus on `cs` and `in` columns)
- `pidstat -w 1 5` (voluntary / involuntary switches per process)
- `pidstat -t -w 1 5` (per-thread)
- For a specific process: `grep ctxt /proc/<pid>/status` to see lifetime totals
- Optional: `perf stat -e context-switches,cpu-migrations,page-faults -p <pid>`, `perf sched record`, futex stats

Your job:

1. **Classify the rate**:
   - `vmstat cs` < 10k/sec → normal
   - 10-50k/sec → notable; investigate if correlated with latency
   - 50k-500k/sec → high; likely a problem (treat >100k/sec as a storm)
   - >500k/sec → pathological; almost always lock contention or runaway thread pool
2. **Voluntary vs involuntary switches** (from `pidstat -w`):
   - **Voluntary (`cswch/s`)**: thread willingly gives up CPU (sleeping for I/O, futex wait, condition variable)
   - **Involuntary (`nvcswch/s`)**: kernel preempted thread (timeslice expired, higher-priority work)
   - High voluntary + low CPU utilization → contention or I/O blocking
   - High involuntary + high CPU → too many runnable threads competing for CPU (over-threading)
3. **For futex contention**: `perf stat -e syscalls:sys_enter_futex -p <pid>` counts futex syscalls; high counts with low forward progress indicate a contended mutex/condvar.
4. **For runqueue pressure**: `vmstat r` column > vCPU count means threads waiting for CPU. Pair with high involuntary switches.
5. **Common root causes**:
   - **Over-threaded app**: a thread pool larger than 2-4× the vCPU count means constant preemption. Reduce the pool.
   - **Hot mutex**: one lock serializes all threads. Refactor (lock-free / striped locking) or accept the cap.
   - **Spin-loops**: app polling for state; CPU-bound, frequent kernel transitions
   - **Excessive timer interrupts** (`vmstat in`): high-resolution timers from monitoring, kernel tick rate
   - **NUMA cross-node migration**: `perf stat -e cpu-migrations` shows the migration rate; pin threads if hot
   - **Container with low CPU quota**: cgroup throttling causes mass involuntary switches at the period boundary
6. **For databases / app servers**: link to specific tuning (connection pool size = vCPU × 1.5-2 for I/O bound, = vCPU for CPU bound).
7. **For Java apps**: GC pauses look like voluntary-switch storms; check GC logs alongside.
8. **For Go apps**: GOMAXPROCS = container vCPU (Go 1.25+ derives it from the cgroup CPU limit by default; older versions use the host CPU count unless GOMAXPROCS is set or a library such as `go.uber.org/automaxprocs` is used); too high = futex contention in the runtime.

Mark DESTRUCTIVE: changing thread pool size live (may queue/drop work), `nice`-ing a critical process, `taskset` on a process under load.

---

Symptom: [DESCRIBE]
vCPU count: [N]

`vmstat 1 5`:
```
[PASTE]
```

`pidstat -w 1 5` (top processes):
```
[PASTE]
```

`pidstat -t -w -p <pid> 1 5` (per-thread for suspect):
```
[PASTE]
```

App context: [language, thread/worker pool size, expected concurrency]

Why this prompt works

“App is slow but CPU is low” is the most-misdiagnosed performance issue. The answer is usually contention (futex, hot mutex, or scheduler pressure from over-threading), visible in `cs` rates and the voluntary/involuntary split, not in `top` CPU%. This prompt forces those metrics into focus.

How to use it

Always capture `pidstat -w` alongside `vmstat`. The per-process voluntary/involuntary split is what makes the diagnosis.
Capture under load, not at rest.
For container workloads, include cgroup CPU stats (`cat /sys/fs/cgroup/cpu.stat`). Throttling shows up there.
Mention the language runtime: JVM, Go, and Python each have distinct contention patterns.
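
As a rough illustration of the first triage step, the sketch below averages the `r`, `in`, and `cs` columns of `vmstat` output and applies the rate tiers from the prompt. The function name `summarize_vmstat` is hypothetical, and treating everything between 50k and 500k switches/sec as "high" is an assumption about how the tiers join up.

```shell
# summarize_vmstat: read `vmstat 1 N` output on stdin and print the average
# r, in and cs values plus a rate class. Columns are located by header name,
# so extra columns in newer procps versions do not matter.
summarize_vmstat() {
  awk '
    # Header row: remember the position of every column name.
    $1 == "r" { for (i = 1; i <= NF; i++) col[$i] = i; next }
    # Data rows start with a number; the banner row ("procs ---") is skipped.
    ("cs" in col) && $1 ~ /^[0-9]+$/ {
      seen++
      if (seen == 1) next            # first sample is the since-boot average
      n++
      r    += $(col["r"])
      intr += $(col["in"])
      cs   += $(col["cs"])
    }
    END {
      if (n == 0) { print "no samples"; exit 1 }
      cs /= n
      if (cs < 10000)        class = "normal"
      else if (cs < 50000)   class = "notable"
      else if (cs <= 500000) class = "high"
      else                   class = "pathological"
      printf "r=%.1f in=%.0f cs=%.0f class=%s\n", r / n, intr / n, cs, class
    }'
}
```

Typical use would be `vmstat 1 5 | summarize_vmstat`, run while the system is under load.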

Useful commands

# Triage
vmstat 1 10                                  # cs (switches) + in (interrupts) + r (runq) + b (blocked)
pidstat -w 1 5                               # per-process voluntary/involuntary
pidstat -t -w -p <pid> 1 5                   # per-thread

# Lifetime totals
grep ctxt /proc/<pid>/status
# voluntary_ctxt_switches: N
# nonvoluntary_ctxt_switches: N

# Perf counters (low overhead)
perf stat -e context-switches,cpu-migrations -p <pid> -- sleep 10
# Futex syscall count (needs syscall tracepoints; mainline kernels have no futex:* tracepoint group)
perf stat -e syscalls:sys_enter_futex -p <pid> -- sleep 10

# Sched detail (heavier)
sudo perf sched record -p <pid> -- sleep 5
sudo perf sched latency | head -20
sudo perf sched timehist | head -50

# Lock-contention specific (Java, with -XX:+PrintConcurrentLocks or jstack)
jstack <pid> | grep -A2 "BLOCKED"

# Off-CPU profile (where is time spent NOT running?)
sudo perf record -e sched:sched_switch -p <pid> -g -- sleep 5
sudo perf script | awk '/sched_switch/ { print $0 }' | head

# eBPF (more efficient on busy systems)
sudo bpftrace -e 'tracepoint:sched:sched_switch { @[comm] = count(); }'

# Cgroup throttling
cat /sys/fs/cgroup/<slice>/cpu.stat       # nr_throttled, throttled_usec
cat /sys/fs/cgroup/<slice>/cpu.max         # quota / period

# CPU migration rate
perf stat -e cpu-migrations -p <pid> sleep 10
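
The lifetime totals above only become a rate when sampled twice. A minimal sketch, assuming a Linux `/proc` and POSIX `sh`: because the counters in `/proc/<pid>/status` belong to the main thread only, it sums over `/proc/<pid>/task/*/status`. The names `sum_ctxt` and `ctxt_rates` are hypothetical helpers, not existing tools.

```shell
# sum_ctxt PID: print "<voluntary> <nonvoluntary>" summed over every thread.
# /proc/PID/status alone covers only the main thread, so walk task/*/status.
sum_ctxt() {
  awk '/^voluntary_ctxt_switches:/    { v  += $2 }
       /^nonvoluntary_ctxt_switches:/ { nv += $2 }
       END { print v + 0, nv + 0 }' /proc/"$1"/task/*/status 2>/dev/null
}

# ctxt_rates PID [SECONDS]: sample twice and print per-second rates.
# Integer division is used, so rates below 1/s show as 0; threads that exit
# between the two samples take their counts with them.
ctxt_rates() {
  cr_pid=$1
  cr_secs=${2:-5}
  set -- $(sum_ctxt "$cr_pid")
  cr_v0=$1
  cr_nv0=$2
  sleep "$cr_secs"
  set -- $(sum_ctxt "$cr_pid")
  echo "cswch/s=$(( ($1 - $cr_v0) / $cr_secs )) nvcswch/s=$(( ($2 - $cr_nv0) / $cr_secs ))"
}
```

For example, `ctxt_rates 1234 10` would report the average rates for PID 1234 over ten seconds, which can be compared with what `pidstat -w` shows.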

Common findings this catches

Java thread pool = 200, vCPU = 4 → massive involuntary switches; reduce pool to 8-16 for CPU-bound, 32-64 for I/O-bound.
Go service with GOMAXPROCS unset in a container with a 2 vCPU quota → before Go 1.25 the runtime sizes itself to the host's CPU count (e.g. 64); futex storms. Set GOMAXPROCS=2.
Container uses its full CPU quota for 100 ms, then sits throttled for 900 ms → cgroup CPU throttling causing periodic latency. Raise or remove the quota.
MySQL Threads_running >> CPU count under load → connection pool over-sized; latency goes up, not throughput.
vmstat in > 100k/sec → interrupt storm; check /proc/interrupts for runaway NIC IRQ or timer.
Voluntary switches at 50k/sec, app idle by CPU → blocked on I/O or hot lock; correlate with iostat or futex syscall counts from perf.
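
The two container findings above come down to arithmetic on cgroup v2 files. The sketch below, with hypothetical helper names `cpu_limit` and `throttle_pct`, turns `cpu.max` into a CPU count (a candidate GOMAXPROCS or pool-sizing input) and `cpu.stat` into the share of periods that were throttled. It assumes the cgroup v2 file formats; cgroup v1 uses different file names.

```shell
# cpu_limit FILE: convert a cgroup v2 cpu.max line ("<quota> <period>" or
# "max <period>") into a CPU count, e.g. "200000 100000" -> 2.00.
cpu_limit() {
  awk '{ if ($1 == "max") print "unlimited"; else printf "%.2f\n", $1 / $2 }' "$1"
}

# throttle_pct FILE: percentage of enforcement periods in which the cgroup
# hit its quota, computed from nr_throttled / nr_periods in cpu.stat.
throttle_pct() {
  awk '$1 == "nr_periods"   { p = $2 }
       $1 == "nr_throttled" { t = $2 }
       END { if (p > 0) printf "%.1f\n", 100 * t / p; else print "0.0" }' "$1"
}
```

Inside a container this would usually be called as `cpu_limit /sys/fs/cgroup/cpu.max` and `throttle_pct /sys/fs/cgroup/cpu.stat`. The counters are cumulative, so comparing two readings taken during load says more than a single one.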

Differential cheatsheet

| Pattern | Likely cause | Fix |
| --- | --- | --- |
| High voluntary, low CPU | I/O blocking or lock contention | Profile blocked threads; reduce contention |
| High involuntary, high CPU | Over-threaded | Reduce thread/worker pool |
| High `cs` + spiky latency | GC pauses or stop-the-world | GC tuning |
| High `in` (interrupts) | IRQ storm | Check `/proc/interrupts`, NIC pinning |
| `r` > vCPU consistently | Runqueue pressure | Scale up CPU, reduce thread count |
| Throttled time > 0 | Container CPU cap | Raise quota |
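
Part of the cheatsheet can be written down as a small decision helper. This is only a sketch: the function name `triage` is hypothetical, and the cut-offs (1000 switches/sec, 50% and 80% CPU busy) are illustrative assumptions, since the cheatsheet does not define "high" or "low" numerically.

```shell
# triage CSWCH_PER_S NVCSWCH_PER_S CPU_BUSY_PCT RUNQ VCPUS [NR_THROTTLED]
# Prints one line per cheatsheet pattern that matches.
# The 1000/s, 50% and 80% thresholds are assumed values for illustration.
triage() {
  awk -v v="$1" -v nv="$2" -v cpu="$3" -v r="$4" -v n="$5" -v thr="${6:-0}" 'BEGIN {
    hits = 0
    if (v > 1000 && v > nv && cpu < 50)  { print "I/O blocking or lock contention"; hits++ }
    if (nv > 1000 && nv > v && cpu > 80) { print "over-threaded"; hits++ }
    if (r > n)                           { print "runqueue pressure"; hits++ }
    if (thr > 0)                         { print "container CPU cap"; hits++ }
    if (!hits) print "no pattern matched"
  }'
}
```

For instance, `triage 50000 200 15 1 4` reports I/O blocking or lock contention, matching the first row of the table.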

When to escalate

Hot kernel-side lock contention (`perf top` shows `_raw_spin_lock` near the top): kernel-level issue; engage the platform team.
App-internal lock contention requiring a code change: escalate to the app owner with `perf` evidence.
Suspected hypervisor steal showing up as “involuntary switches”: check `%st` in `top`; the fix is on the cloud side.
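
To put a number on suspected steal before escalating, one option is to diff two readings of the aggregate `cpu` line in `/proc/stat`, where the eighth value after the label is steal time. The helper name `steal_pct` is hypothetical, and the sketch assumes the standard field order (user, nice, system, idle, iowait, irq, softirq, steal).

```shell
# steal_pct "CPU_LINE_BEFORE" "CPU_LINE_AFTER": percentage of CPU time stolen
# by the hypervisor between two readings of the first line of /proc/stat.
# Fields 2..9 are user, nice, system, idle, iowait, irq, softirq, steal;
# guest time is already included in user, so it is not added again.
steal_pct() {
  printf '%s\n%s\n' "$1" "$2" | awk '
    {
      total = 0
      for (i = 2; i <= 9; i++) total += $i
      if (NR == 1) { t0 = total; s0 = $9 }
      else         { dt = total - t0; ds = $9 - s0 }
    }
    END { if (dt > 0) printf "%.1f\n", 100 * ds / dt; else print "0.0" }'
}
```

A possible invocation: `a=$(head -n1 /proc/stat); sleep 5; b=$(head -n1 /proc/stat); steal_pct "$a" "$b"`.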

Linux Context Switch & Lock Contention Diagnosis Prompt


Related prompts

Linux High Load & CPU Runaway Investigation Prompt

Linux NUMA Imbalance Investigation Prompt

Linux `perf` & Flame Graph Profiling Prompt
