ss Socket State & TCP Backlog Triage Prompt

Read ss output to explain a connection problem — stuck SYN-RECV/CLOSE-WAIT/TIME-WAIT piles, full accept/SYN backlogs, or exhausted ephemeral ports — and pinpoint whether the app or the kernel is to blame.

Target user

Linux admins and SREs debugging connection refusals, drops, or socket exhaustion

Difficulty

Intermediate

Tools

Claude, ChatGPT

You are a senior Linux engineer who diagnoses connection problems by reading socket state, not guessing, and who knows which TCP states point at the app vs the kernel.

I will provide:
- The symptom: connection refused, connection timeout, resets, or "running out of connections"
- `ss -s` (summary), `ss -tan state <state>` for the suspect state, and `ss -tlnp` for listeners with backlog
- For a specific listener: `ss -tlnp '( sport = :[PORT] )'` showing Recv-Q / Send-Q
- Relevant sysctls if known: `net.core.somaxconn`, `net.ipv4.tcp_max_syn_backlog`, `net.ipv4.ip_local_port_range`, `net.ipv4.tcp_tw_reuse`
- `dmesg` lines mentioning "possible SYN flooding" or "TCP: out of memory"

Your job:
1. **Read Recv-Q / Send-Q on listeners correctly**: for a LISTEN socket, Recv-Q is the current accept-queue depth and Send-Q is the configured backlog (min of app `listen()` and `somaxconn`). Explain whether the queue is overflowing for THIS listener.
2. **Interpret the state pile**: tell me what a large count in a given state means: many CLOSE-WAIT = the *application* isn't close()-ing sockets (app bug); many TIME-WAIT = normal for a busy client, only a problem near port exhaustion; many SYN-RECV = possible SYN flood or backlog overflow; many FIN-WAIT-2 = peer not closing.
3. **Check for exhaustion**: compare active connection counts against `ip_local_port_range` size and against any fd ulimit; flag ephemeral-port or file-descriptor exhaustion.
4. **App vs kernel verdict**: state clearly whether the fix lives in the application (close sockets, increase `listen()` backlog, connection pooling) or the kernel (`somaxconn`, `tcp_max_syn_backlog`, port range, SYN-cookies), and why.
5. **Recommend the targeted change**: give the specific sysctl or app-config change, its current vs proposed value, and how to verify the queue drains or drops stop after applying.

Output as: (a) a plain-English read of the socket summary and the suspect state, (b) the root-cause verdict (app or kernel) with the evidence line, (c) the exact change to make with current→proposed values, (d) the verification command and what "fixed" looks like.

Verify before acting: confirm the diagnosis against a second `ss` sample under load before changing sysctls. A one-shot snapshot of TIME-WAIT or SYN-RECV is often normal and not worth tuning.
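Step 3 of the prompt compares connection counts against the size of `ip_local_port_range`. That arithmetic can be checked by hand before pasting numbers into the prompt; the sketch below is one way to do it and is not part of the prompt itself. The function names, the sample connection count, and the 80% warning threshold are assumptions chosen for illustration. On Linux the range is typically consumed per destination address and port rather than globally, so treat the ratio as a rough indicator only.

```python
def port_range_size(range_value: str) -> int:
    """Size of the ephemeral range from the sysctl text, e.g. '32768 60999'."""
    low, high = (int(part) for part in range_value.split())
    return high - low + 1


def port_usage(outbound_connections: int, range_value: str) -> float:
    """Fraction of the ephemeral range used by the given connection count."""
    return outbound_connections / port_range_size(range_value)


# Hypothetical inputs: the sysctl value as printed by
# `sysctl -n net.ipv4.ip_local_port_range` and a made-up connection count.
RANGE_VALUE = "32768\t60999"
OUTBOUND = 25000
WARN_AT = 0.8  # assumed threshold, not a kernel constant

ratio = port_usage(OUTBOUND, RANGE_VALUE)
label = "near exhaustion" if ratio >= WARN_AT else "ok"
print(f"{OUTBOUND}/{port_range_size(RANGE_VALUE)} ports in use ({ratio:.0%}): {label}")
```

A ratio well below the threshold suggests TIME-WAIT counts are harmless; a ratio close to 1 is the case where the prompt's exhaustion check matters.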

Why this prompt works

When connections start failing, most people reach for netstat | grep | wc -l and start guessing. ss exposes the actual machinery — accept queues, SYN backlogs, and per-state socket counts — but only if you know that Recv-Q and Send-Q mean completely different things on a LISTEN socket than on an established one. This prompt front-loads that decoding so the AI reads the listener’s queue depth and configured backlog correctly instead of treating Recv-Q as “bytes.”
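The listener decoding described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the sample text is made up to resemble `ss -tln` output, the `Listener` type and helper names are hypothetical, and "full" is approximated as Recv-Q reaching Send-Q.

```python
from typing import NamedTuple


class Listener(NamedTuple):
    local: str         # Local Address:Port column
    accept_queue: int  # Recv-Q on a LISTEN socket: current accept-queue depth
    backlog: int       # Send-Q on a LISTEN socket: configured backlog


def parse_listeners(ss_output: str) -> list[Listener]:
    """Extract LISTEN rows from text shaped like `ss -tln` output."""
    listeners = []
    for line in ss_output.splitlines():
        fields = line.split()
        # Skip the header row and any non-LISTEN sockets.
        if len(fields) < 4 or fields[0] != "LISTEN":
            continue
        listeners.append(Listener(fields[3], int(fields[1]), int(fields[2])))
    return listeners


def is_full(listener: Listener) -> bool:
    """Approximation: the accept queue has reached the configured backlog."""
    return listener.backlog > 0 and listener.accept_queue >= listener.backlog


# Made-up sample, not captured from a real host.
SAMPLE = """\
State   Recv-Q  Send-Q  Local Address:Port  Peer Address:Port
LISTEN  0       128     0.0.0.0:22          0.0.0.0:*
LISTEN  128     128     0.0.0.0:8080        0.0.0.0:*
"""

for listener in parse_listeners(SAMPLE):
    status = "FULL" if is_full(listener) else "ok"
    print(f"{listener.local}: accept queue {listener.accept_queue}/{listener.backlog} {status}")
```

Reading the two columns as a depth/limit pair, rather than as byte counts, is what lets the overflow question be answered per listener.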

The diagnostic leverage comes from mapping TCP states to causes. A pile of CLOSE-WAIT sockets is almost always an application that forgot to close(); no amount of kernel tuning fixes it. A pile of TIME-WAIT is usually just a healthy busy client, and a trap that lures people into enabling tcp_tw_recycle (removed in Linux 4.12 because it broke connections from clients behind NAT). SYN-RECV piling up points at backlog overflow or a SYN flood. By forcing the model to deliver an explicit app-vs-kernel verdict with the evidence line, you stop the reflexive sysctl-poking that fixes nothing.
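The state-to-cause mapping can be illustrated with a short Python sketch that tallies the first column of `ss -tan` style output and attaches the hints from the prompt. The sample rows, the helper names, and the reporting threshold are assumptions for illustration; real thresholds depend on the workload.

```python
from collections import Counter

# Hints taken from the prompt's state interpretation step.
HINTS = {
    "CLOSE-WAIT": "application is not close()-ing sockets (app side)",
    "TIME-WAIT": "usually normal on a busy client; matters only near port exhaustion",
    "SYN-RECV": "possible backlog overflow or SYN flood",
    "FIN-WAIT-2": "peer is not closing its side",
}


def count_states(ss_tan_output: str) -> Counter:
    """Count sockets per TCP state from text shaped like `ss -tan` output."""
    counts = Counter()
    for line in ss_tan_output.splitlines():
        fields = line.split()
        if fields and fields[0] != "State":  # skip blanks and the header row
            counts[fields[0]] += 1
    return counts


def report(counts: Counter, threshold: int) -> list[str]:
    """List the states that have a hint and at least `threshold` sockets."""
    return [
        f"{state}: {n} -> {HINTS[state]}"
        for state, n in counts.most_common()
        if state in HINTS and n >= threshold
    ]


# Made-up sample, not captured from a real host.
SAMPLE = """\
State       Recv-Q  Send-Q  Local Address:Port  Peer Address:Port
ESTAB       0       0       10.0.0.5:443        10.0.0.9:51000
CLOSE-WAIT  1       0       10.0.0.5:443        10.0.0.9:51001
CLOSE-WAIT  1       0       10.0.0.5:443        10.0.0.9:51002
TIME-WAIT   0       0       10.0.0.5:443        10.0.0.9:51003
"""

for line in report(count_states(SAMPLE), threshold=2):
    print(line)
```

Only CLOSE-WAIT crosses the assumed threshold in this sample, which would point the verdict at the application rather than the kernel.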

The prompt also guards the most common tuning footgun: raising somaxconn while the application’s own listen() backlog silently caps the queue lower, so nothing changes and everyone’s confused. It insists the model state current-versus-proposed values and a concrete verification step, and it demands a second sample under load before touching the kernel — because a single snapshot of TIME-WAIT or SYN-RECV is frequently just normal traffic. The AI reads the sockets and proposes the targeted change; you confirm it under load before tuning the kernel.
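The somaxconn footgun comes down to a `min()`: the effective accept backlog is the smaller of the application's `listen()` argument and `net.core.somaxconn`. A minimal sketch of that check follows; the function names and the example values are hypothetical and only illustrate the arithmetic.

```python
def effective_backlog(listen_backlog: int, somaxconn: int) -> int:
    """Effective accept-queue limit: the kernel caps listen() at somaxconn."""
    return min(listen_backlog, somaxconn)


def raising_somaxconn_helps(listen_backlog: int, current: int, proposed: int) -> bool:
    """True only if the proposed somaxconn actually enlarges the queue."""
    return effective_backlog(listen_backlog, proposed) > effective_backlog(
        listen_backlog, current
    )


# Hypothetical case 1: the app calls listen(fd, 128); somaxconn is already 4096.
# Raising the sysctl changes nothing because the app's own value is the cap.
print(raising_somaxconn_helps(listen_backlog=128, current=4096, proposed=8192))

# Hypothetical case 2: the app asks for 1024 but somaxconn is 128.
# Here the kernel limit is the cap, so raising it helps.
print(raising_somaxconn_helps(listen_backlog=1024, current=128, proposed=1024))
```

When the first case applies, the fix belongs in the application's `listen()` backlog setting, and the Send-Q column on the listener should confirm the new value after a restart.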

Related prompts

Linux conntrack Table Exhaustion Tuning Prompt

Linux ulimit & File Descriptor Limits Prompt
