You are a senior Linux network engineer who has tuned production network stacks for high-throughput services (CDNs, load balancers, databases). You know which sysctls matter and which are cargo-culted.

I will provide:

- The symptom (throughput below link spec, high p99 latency, retransmit storms, "connection reset", ephemeral port exhaustion, accept queue overflow)
- Host role: client / server / proxy / load-balancer
- NIC and link spec (`ethtool <iface>`, `ethtool -i <iface>`)
- Current congestion control: `sysctl net.ipv4.tcp_congestion_control`
- Output of `ss -s`, `ss -tnp state established | wc -l`, `nstat`, `netstat -s | grep -Ei "retrans|drop|listen"`
- Distro, kernel version, sysctl baseline (`sysctl -a 2>/dev/null | grep -E "(rmem|wmem|tcp|netdev|somaxconn)"`)

Your job:

1. **Classify the symptom**:
   - **Throughput below link** → window, buffer, or NIC-offload issue
   - **High latency p99 only** → buffer bloat, retransmit, or NIC interrupt pinning
   - **Retransmit storms** → loss, drops, ECN misconfiguration, MTU blackhole
   - **`netstat -s` listen overflow** → `somaxconn` / app accept queue too small
   - **Ephemeral port exhaustion** → outbound-heavy host; `ip_local_port_range`, `tcp_tw_reuse`
   - **`Connection reset`** → backlog full, conntrack table, app `RST` on close, firewall
2. **TCP throughput math**: max-throughput ≈ window_size / RTT. A 1 MB window over 100 ms RTT caps at ~80 Mbps. Raise the buffer maximums if the BDP exceeds them.
3. **Congestion control choice**:
   - **`cubic`** (Linux default): fair, stable, latency-tolerant. Good for general use
   - **`bbr`**: high throughput on lossy/long-fat paths; uses bandwidth × RTT model; can be unfair to cubic neighbors
   - Switch with `sysctl net.ipv4.tcp_congestion_control=bbr`; needs `tcp_bbr` module
4. **Buffer auto-tuning** (default in modern kernels): `net.ipv4.tcp_rmem` / `tcp_wmem` are min/default/max; raise max for long-fat networks.
5. **NIC tuning**:
   - **Multi-queue + IRQ affinity** (`ethtool -L`, `irqbalance`, manual `/proc/irq/<n>/smp_affinity`)
   - **Offload features** (`ethtool -k`): TSO, GSO, GRO, LRO. Helpful for throughput, can hurt latency or break NV-routed traffic
   - **Ring buffer size** (`ethtool -G`): raise if `ethtool -S` or `ip -s link` shows rx drops
6. **For load balancers / proxies**: tune `somaxconn`, `tcp_max_syn_backlog`, app's `listen(backlog)`. Listen drops are invisible without `nstat`.
7. **Conntrack** (firewalled hosts): `nf_conntrack_max`, `nf_conntrack_buckets` (hash size). Table full = silent packet drops.
8. **For DSCP / QoS / multi-queue scheduling**: `fq_codel`, `cake`, or `mq` qdiscs; defaults are usually fine; `pfifo_fast` is legacy.

Mark DESTRUCTIVE: disabling firewall to "test," switching to `bbr` on a load balancer mid-day, dropping ring buffer size, disabling offloads on a live link.

---

Symptom: [DESCRIBE, including rate, latency target, link spec]
Host role: [client/server/proxy/LB]
NIC + link: [`ethtool` output]

TCP / sysctl baseline:
```
[PASTE relevant `sysctl -a` excerpts]
```

`ss -s`, `nstat`, `netstat -s` highlights:
```
[PASTE]
```

Reproduction: `iperf3 -c <host>` or workload-specific benchmark:
```
[PASTE]
```

Why this prompt works

Network tuning is rife with cargo-cult sysctls copied from Stack Overflow answers a decade old. This prompt forces measurement-driven tuning: identify the actual bottleneck (window, drops, buffers, NIC) before changing parameters.

How to use it

Measure first: iperf3 for raw throughput, ss -ti for per-flow TCP info (cwnd, rtt, retrans), nstat for kernel counters.
State the target: “1 Gbps link, currently getting 300 Mbps” tells the model the gap.
Include nstat and netstat -s — drop and overflow counters are diagnostic.
Identify role: server-side tuning differs from client-side (LB tunes accept queue; client tunes ephemeral ports).
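
The "measure first" step can be turned into a single number before pasting counters into the prompt. The sketch below computes a retransmit percentage from `nstat`-style output; `retrans_pct` is a hypothetical helper name (not part of iproute2), and it assumes input lines of the form `<CounterName> <value> <rate>`.

```shell
#!/usr/bin/env bash
# Sketch: retransmit percentage from nstat-style lines on stdin.
# Assumes "<CounterName> <value> <rate>" columns; helper name is illustrative.
retrans_pct() {
  awk '
    $1 == "TcpOutSegs"     { out = $2 }   # segments sent
    $1 == "TcpRetransSegs" { re  = $2 }   # segments retransmitted
    END {
      if (out > 0) printf "%.2f\n", 100 * re / out
      else         print "n/a"            # no traffic seen in the sample
    }'
}
```

On a live host this would typically be fed from something like `nstat -az TcpOutSegs TcpRetransSegs | retrans_pct`, sampled before and after the benchmark so the comparison covers only the test window.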

Useful commands

# Link spec
ethtool <iface>                    # speed, duplex
ethtool -i <iface>                 # driver
ethtool -S <iface> | head -40      # extended stats (drops, errors)
ethtool -k <iface>                 # offload features
ethtool -g <iface>                 # ring buffer sizes
ethtool -l <iface>                 # multi-queue (channel) setting; -L changes it

# TCP stack overview
ss -s                              # summary
ss -ti '( sport = :443 )' | head   # per-flow info: cwnd, rtt, retrans
ss -tnp state listening
ss -lntp                           # listeners

# Kernel counters
nstat                              # all SNMP counters; deltas between runs
netstat -s | grep -Ei "retrans|drop|listen|overflow"
sar -n EDEV 1 5                    # per-NIC error rates
sar -n TCP,ETCP 1 5                # TCP rates

# Buffers / autotuning
sysctl net.ipv4.tcp_rmem net.ipv4.tcp_wmem net.core.rmem_max net.core.wmem_max
sysctl net.ipv4.tcp_congestion_control
sysctl net.core.somaxconn net.ipv4.tcp_max_syn_backlog

# Conntrack
sysctl net.netfilter.nf_conntrack_max net.netfilter.nf_conntrack_buckets
cat /proc/sys/net/netfilter/nf_conntrack_count
dmesg | grep -i conntrack

# IRQ pinning / multi-queue
cat /proc/interrupts | grep <iface>
mpstat -I SCPU 1 3
sudo ethtool -L <iface> combined N    # set N queues

# Throughput test
iperf3 -s                              # on receiver
iperf3 -c <server> -P 4 -t 30          # 4 parallel streams
iperf3 -c <server> -R                  # reverse

# Latency
mtr -rwbc 100 <host>
ping -c 100 -i 0.1 <host>
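
As a small illustration of reading the conntrack values listed above, the sketch below turns the current entry count and the table limit into a utilization percentage. `conntrack_pct` is a hypothetical helper, and the point at which utilization becomes worrying is left to the operator.

```shell
#!/usr/bin/env bash
# Sketch: conntrack table utilization as an integer percent.
# Usage: conntrack_pct COUNT MAX   (helper name is illustrative)
conntrack_pct() {
  local count=$1 max=$2
  if [ "$max" -le 0 ]; then
    echo "n/a"          # avoid dividing by zero when conntrack is not loaded
    return 1
  fi
  echo $(( count * 100 / max ))
}
```

A live invocation might look like `conntrack_pct "$(cat /proc/sys/net/netfilter/nf_conntrack_count)" "$(sysctl -n net.netfilter.nf_conntrack_max)"`.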

Tuning patterns

High-throughput server (long-fat network)

# /etc/sysctl.d/99-network-perf.conf
net.core.rmem_max = 134217728
net.core.wmem_max = 134217728
net.ipv4.tcp_rmem = 4096 87380 67108864
net.ipv4.tcp_wmem = 4096 65536 67108864
net.ipv4.tcp_congestion_control = bbr
net.core.default_qdisc = fq            # required by BBR for best behavior
net.core.netdev_max_backlog = 16384
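
Whether the 64 MiB `tcp_rmem` / `tcp_wmem` maximum above is enough depends on the path's bandwidth-delay product. The sketch below computes the BDP in bytes from link speed and RTT; `bdp_bytes` is a hypothetical helper and it takes whole-number Mbit/s and milliseconds only.

```shell
#!/usr/bin/env bash
# Sketch: bandwidth-delay product in bytes.
# Usage: bdp_bytes LINK_MBPS RTT_MS   (integers; helper name is illustrative)
bdp_bytes() {
  # Mbit/s * 1e6 / 8 = bytes per second; * RTT_MS / 1000 = bytes in flight.
  # 1e6 / 8 / 1000 collapses to a factor of 125.
  echo $(( $1 * $2 * 125 ))
}
```

For example, `bdp_bytes 10000 100` (10 Gbps at 100 ms) gives 125000000 bytes, which is larger than the 67108864-byte maximum in the config above, so a single flow on that path would still be window-limited. The usable window is also somewhat smaller than the configured buffer, so treat the result as a lower bound for the maximum.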

Connection accept-heavy server (load balancer)

net.core.somaxconn = 65535             # also bump app's listen(backlog)
net.ipv4.tcp_max_syn_backlog = 65535
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_tw_reuse = 1              # outbound (upstream) connections only; needs tcp_timestamps
net.ipv4.ip_local_port_range = 1024 65535
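
Before raising `somaxconn`, it helps to confirm that an accept queue is actually filling. For sockets in the LISTEN state, `ss -lnt` reports the current accept-queue length in Recv-Q and the configured backlog in Send-Q. The sketch below flags listeners above a fill threshold; `full_listeners` and the default threshold of 80% are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Sketch: list listeners whose accept queue is at least THRESHOLD_PCT full.
# Reads `ss -lnt` output on stdin. Helper name and default are illustrative.
# Usage: ss -lnt | full_listeners [THRESHOLD_PCT]
full_listeners() {
  awk -v thr="${1:-80}" '
    # Columns: State Recv-Q Send-Q Local-Address:Port Peer-Address:Port
    $1 == "LISTEN" && $3 > 0 && ($2 * 100 / $3) >= thr {
      printf "%s queue=%d backlog=%d\n", $4, $2, $3
    }'
}
```

A listener that shows up here repeatedly under load points at the application's `accept()` rate or its `listen(backlog)` value rather than at the sysctl alone.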

Outbound-heavy client (API gateway)

net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.tcp_fin_timeout = 15          # shortens orphaned FIN_WAIT_2; TIME_WAIT stays at 60 s
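
The reason the port range matters can be shown with rough arithmetic: without reuse, each closed outbound connection holds its source port in TIME_WAIT for 60 seconds, so the sustainable rate of new connections to one destination IP and port is roughly the number of ports divided by 60. The sketch below does that calculation; `port_budget` is a hypothetical helper and the result is a rule of thumb, not a hard kernel limit.

```shell
#!/usr/bin/env bash
# Sketch: ephemeral port count and rough new-connection ceiling per destination.
# Usage: port_budget LOW HIGH [TIME_WAIT_SECS]   (helper name is illustrative)
port_budget() {
  local lo=$1 hi=$2 tw=${3:-60}      # 60 s is the Linux TIME_WAIT duration
  local ports=$(( hi - lo + 1 ))     # range is inclusive on both ends
  echo "$ports $(( ports / tw ))"    # "<ports> <connections per second>"
}
```

With the widened range above, `port_budget 1024 65535` prints `64512 1075`; with the stock range, `port_budget 32768 60999` prints `28232 470`. Enabling `tcp_tw_reuse` relaxes this ceiling, which is why the two settings are usually changed together.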

NIC tuning (post-sysctl)

# Maximize multi-queue (do not exceed the "Combined" maximum shown by ethtool -l)
sudo ethtool -L eth0 combined $(nproc)

# Raise ring buffer if drops in rx
ethtool -g eth0    # max?
sudo ethtool -G eth0 rx 4096 tx 4096

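# Pin queues to specific cores (basic; IRQ names vary by driver, check /proc/interrupts)
sudo systemctl stop irqbalance
# smp_affinity lives in /proc/irq/<n>/, not in the per-action subdirectory; the _list file takes a CPU number
sudo bash -c 'i=0; for q in /proc/irq/*/eth0-rx-*; do echo "$i" > "${q%/*}/smp_affinity_list"; i=$((i+1)); done'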
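
The `ethtool -g eth0    # max?` step above is asking how much headroom the rx ring has. The sketch below pulls the current and maximum RX sizes out of `ethtool -g` output so the value passed to `ethtool -G` stays within the hardware limit. `rx_ring_headroom` is a hypothetical helper, and it assumes the usual "Pre-set maximums" / "Current hardware settings" section layout.

```shell
#!/usr/bin/env bash
# Sketch: report current vs. maximum RX ring size from `ethtool -g` output.
# Usage: ethtool -g eth0 | rx_ring_headroom   (helper name is illustrative)
rx_ring_headroom() {
  awk '
    /^Pre-set maximums/          { sec = "max" }   # following values are limits
    /^Current hardware settings/ { sec = "cur" }   # following values are live
    $1 == "RX:"                  { v[sec] = $2 }   # skips "RX Mini:" / "RX Jumbo:"
    END { printf "current=%d max=%d\n", v["cur"], v["max"] }'
}
```

If the two numbers are already equal, a larger ring is not available and persistent rx drops need a different fix (more queues, IRQ placement, or faster draining by the application).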

Common findings this catches

nstat -az TcpExtListenDrops TcpExtListenOverflows rising with steady arrival → accept queue is full because the app’s listen backlog is too small; raise somaxconn AND app config.
ethtool -S | grep drop rising → ring buffer too small or NIC hardware drops; raise rx ring.
nstat | grep -E "TcpExtTCPSACKReorder|TcpExtTCPOFOQueue" high → significant out-of-order; check for path-MTU or middlebox loss.
Single-flow throughput plateaus near window / RTT, well below link speed → window-limited; BDP > current window; raise tcp_rmem/tcp_wmem max.
Long-fat path stuck at low throughput on cubic → try BBR with fq qdisc; gains are largest on lossy paths, so confirm with a before/after iperf3 run.
Conntrack table full in dmesg → raise nf_conntrack_max AND nf_conntrack_buckets.
Ephemeral port exhaustion on outbound API gateway → enable tcp_tw_reuse; widen port range.
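
To check the window-limited finding against a measurement, the ceiling for a single flow can be computed from the window and the RTT and compared with what `iperf3` reports. This is a minimal sketch; `window_cap_mbps` is a hypothetical helper using integer arithmetic, so results are rounded down.

```shell
#!/usr/bin/env bash
# Sketch: single-flow throughput ceiling in Mbit/s for a given window and RTT.
# Usage: window_cap_mbps WINDOW_BYTES RTT_MS   (helper name is illustrative)
window_cap_mbps() {
  # bytes * 8 = bits sent per round trip; dividing by RTT in seconds gives
  # bits/s; dividing by 1e6 gives Mbit/s. Combined divisor: RTT_MS * 1000.
  echo $(( $1 * 8 / ($2 * 1000) ))
}
```

`window_cap_mbps 1000000 100` prints `80`, the 1 MB over 100 ms case from the prompt. If a single stream measures close to this figure while `-P 4` scales almost linearly, the window is the limit rather than the link.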

When to escalate

NIC firmware bugs (consistent silent drops not in counters) — driver update or NIC replacement.
Cross-region throughput limited by physical / provider topology — tuning won’t help; choose different placement.
Application accepting connections slowly (not the kernel) — coordinate with app team; backlog tuning only papers over.

Related prompts

Linux Block I/O Performance Investigation Prompt

Linux High Load & CPU Runaway Investigation Prompt

Linux Host Network Connectivity Debug Prompt
