ss Socket State & TCP Backlog Triage Prompt
Read ss output to explain a connection problem — stuck SYN-RECV/CLOSE-WAIT/TIME-WAIT piles, full accept/SYN backlogs, or exhausted ephemeral ports — and pinpoint whether the app or the kernel is to blame.
- Target user: Linux admins and SREs debugging connection refusals, drops, or socket exhaustion
- Difficulty: Intermediate
- Tools: Claude, ChatGPT
The prompt
You are a senior Linux engineer who diagnoses connection problems by reading socket state, not guessing, and who knows which TCP states point at the app vs the kernel.

I will provide:
- The symptom: connection refused, connection timeout, resets, or "running out of connections"
- `ss -s` (summary), `ss -tan state <state>` for the suspect state, and `ss -tlnp` for listeners with backlog
- For a specific listener: `ss -tlnp '( sport = :[PORT] )'` showing Recv-Q / Send-Q
- Relevant sysctls if known: `net.core.somaxconn`, `net.ipv4.tcp_max_syn_backlog`, `net.ipv4.ip_local_port_range`, `net.ipv4.tcp_tw_reuse`
- `dmesg` lines mentioning "possible SYN flooding" or "TCP: out of memory"

Your job:
1. **Read Recv-Q / Send-Q on listeners correctly**: for a LISTEN socket, Recv-Q is the current accept-queue depth and Send-Q is the configured backlog (min of app `listen()` and `somaxconn`). Explain whether the queue is overflowing for THIS listener.
2. **Interpret the state pile**: tell me what a large count in a given state means: many CLOSE-WAIT = the *application* isn't close()-ing sockets (app bug); many TIME-WAIT = normal for a busy client, only a problem near port exhaustion; many SYN-RECV = possible SYN flood or backlog overflow; many FIN-WAIT-2 = peer not closing.
3. **Check for exhaustion**: compare active connection counts against `ip_local_port_range` size and against any fd ulimit; flag ephemeral-port or file-descriptor exhaustion.
4. **App vs kernel verdict**: state clearly whether the fix lives in the application (close sockets, increase `listen()` backlog, connection pooling) or the kernel (`somaxconn`, `tcp_max_syn_backlog`, port range, SYN-cookies), and why.
5. **Recommend the targeted change**: give the specific sysctl or app-config change, its current vs proposed value, and how to verify the queue drains or drops stop after applying.

Output as: (a) a plain-English read of the socket summary and the suspect state, (b) the root-cause verdict (app or kernel) with the evidence line, (c) the exact change to make with current→proposed values, (d) the verification command and what "fixed" looks like.

Verify before acting: confirm the diagnosis against a second `ss` sample under load before changing sysctls. A one-shot snapshot of TIME-WAIT or SYN-RECV is often normal and not worth tuning.
Why this prompt works
When connections start failing, most people reach for netstat | grep | wc -l and start guessing. ss exposes the actual machinery — accept queues, SYN backlogs, and per-state socket counts — but only if you know that Recv-Q and Send-Q mean completely different things on a LISTEN socket than on an established one. This prompt front-loads that decoding so the AI reads the listener’s queue depth and configured backlog correctly instead of treating Recv-Q as “bytes.”
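The listener decoding described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the prompt: it assumes the text comes from `ss -tln` (so the State column is present and the first four columns are State, Recv-Q, Send-Q, Local Address:Port), and the names `Listener`, `parse_listeners`, and `saturated` are hypothetical helpers invented for this example.

```python
from typing import List, NamedTuple


class Listener(NamedTuple):
    local: str   # Local Address:Port column
    recv_q: int  # on a LISTEN socket: connections waiting in the accept queue
    send_q: int  # on a LISTEN socket: configured backlog limit


def parse_listeners(ss_output: str) -> List[Listener]:
    """Extract LISTEN rows from `ss -tln` text (assumes the State column is shown)."""
    listeners = []
    for line in ss_output.splitlines():
        fields = line.split()
        # Skip the header row, blank lines, and any non-LISTEN sockets.
        if len(fields) < 4 or fields[0] != "LISTEN":
            continue
        listeners.append(Listener(fields[3], int(fields[1]), int(fields[2])))
    return listeners


def saturated(listeners: List[Listener]) -> List[Listener]:
    """Return listeners whose accept-queue depth has reached the backlog limit."""
    return [l for l in listeners if l.recv_q >= l.send_q]
```

Fed a row such as `LISTEN 129 128 0.0.0.0:8080 0.0.0.0:*`, `saturated` would return that listener. A single saturated sample is only a hint; a queue that stays at or above its limit across repeated samples is the stronger evidence.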
The diagnostic leverage comes from mapping TCP states to causes. A pile of CLOSE-WAIT sockets is almost always an application that forgot to close(); no amount of kernel tuning fixes it. A pile of TIME-WAIT is usually just a healthy, busy client, and it is also a trap: it lures people into enabling tcp_tw_recycle, which broke connections from clients behind NAT and was removed in Linux 4.12. SYN-RECV piling up points at backlog overflow or a SYN flood. By forcing the model to deliver an explicit app-vs-kernel verdict with the evidence line, you stop the reflexive sysctl-poking that fixes nothing.
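The state-to-cause mapping can be expressed as a small lookup. The sketch below is illustrative only: it assumes `ss -tan` output with the state name in the first column, the hint strings paraphrase the mapping in this article, and the default `threshold` of 1000 is an arbitrary placeholder rather than a recommended value. `count_states` and `suspect_states` are hypothetical names.

```python
from collections import Counter
from typing import List, Tuple

# Hints paraphrased from the mapping above; not an exhaustive diagnosis table.
STATE_HINTS = {
    "CLOSE-WAIT": "app: sockets are not being close()d",
    "TIME-WAIT": "usually normal; matters only near ephemeral-port exhaustion",
    "SYN-RECV": "possible SYN flood or backlog overflow",
    "FIN-WAIT-2": "peer has not closed its side",
}


def count_states(ss_tan_output: str) -> Counter:
    """Count sockets per TCP state from `ss -tan` text (state in column 1)."""
    counts = Counter()
    for line in ss_tan_output.splitlines():
        fields = line.split()
        # Skip blank lines and the header row.
        if not fields or fields[0] == "State":
            continue
        counts[fields[0]] += 1
    return counts


def suspect_states(counts: Counter, threshold: int = 1000) -> List[Tuple[str, int, str]]:
    """Return (state, count, hint) for hinted states at or above threshold, largest first."""
    hits = [(s, counts[s], hint) for s, hint in STATE_HINTS.items()
            if counts[s] >= threshold]
    return sorted(hits, key=lambda h: h[1], reverse=True)
```

The threshold is deliberately a parameter: what counts as "a pile" depends on the host's normal traffic, which is why a second sample under load matters more than any fixed number.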
The prompt also guards the most common tuning footgun: raising somaxconn while the application’s own listen() backlog silently caps the queue lower, so nothing changes and everyone’s confused. It insists the model state current-versus-proposed values and a concrete verification step, and it demands a second sample under load before touching the kernel — because a single snapshot of TIME-WAIT or SYN-RECV is frequently just normal traffic. The AI reads the sockets and proposes the targeted change; you confirm it under load before tuning the kernel.
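The somaxconn footgun comes down to a `min()`. As a minimal sketch, assuming the effective accept-queue limit is the smaller of the application's `listen()` backlog and `net.core.somaxconn` (as the prompt states), the hypothetical helpers below show which side is the limiting one. The numbers in the usage note are illustrative, not measurements.

```python
def effective_backlog(listen_backlog: int, somaxconn: int) -> int:
    """Accept-queue limit the listener actually gets: the smaller of the two."""
    return min(listen_backlog, somaxconn)


def limiting_side(listen_backlog: int, somaxconn: int) -> str:
    """Say which setting must change before the queue limit can grow."""
    if listen_backlog < somaxconn:
        return "app"     # raising somaxconn alone changes nothing
    if somaxconn < listen_backlog:
        return "kernel"  # the app asked for more than the kernel allows
    return "both"        # equal: both must be raised together
```

For example, with an application backlog of 511 and somaxconn at 4096, `effective_backlog(511, 4096)` is 511, and it stays 511 after raising somaxconn to 65535. Only a change to the application's own backlog setting moves the Send-Q value that `ss -tln` reports for that listener.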
Related prompts
- Linux conntrack Table Exhaustion Tuning Prompt: Diagnose and fix nf_conntrack table exhaustion on busy Linux hosts and gateways (dropped connections, log spam, and timeout tuning), or decide where to bypass tracking entirely.
- Linux ulimit & File Descriptor Limits Prompt: Diagnose and raise process resource limits (open files, processes, memlock), fixing 'Too many open files' across systemd units, PAM logins, and containers.