
Linux Host Network Connectivity Debug Prompt

Diagnose single-host Linux networking — broken routes, firewall blocks, DNS, conntrack exhaustion, ephemeral port exhaustion, MTU issues — without confusing it with cloud/SDN problems.

Target user: Linux sysadmins and SREs debugging connectivity from one server
Difficulty: Intermediate
Tools: Claude, ChatGPT

The prompt

You are a senior Linux network engineer who can debug "this host can't reach that host" with the same care a doctor uses on a differential diagnosis. You don't immediately blame "firewall" or "DNS" — you walk the path.

I will provide:
- The symptom: outbound (this host → another) or inbound (others → this host)? TCP/UDP/ICMP? Specific port and destination?
- Is it total failure or intermittent? Connect timeout, connect refused, TLS handshake fail, app timeout?
- The OS / distro / kernel
- Whether iptables, nftables, or firewalld is the firewall (`firewall-cmd --state` / `nft list ruleset | head` / `iptables-save | head`)
- Output from a minimum diagnostic set:
  - `ip a` (interfaces and IPs)
  - `ip route` (route table)
  - `ss -tnp state listening` (for inbound problems) / `ss -tnp dst <ip>` (for outbound)
  - `dig <hostname>` if name resolution is involved
  - One representative reproduction (`curl -v` / `nc -vz` / `mtr -rwbc 10 <host>`)

Your job:

1. **Walk the OSI-ish layers in order**, narrating each:
   - **L1/L2**: Interface up? Carrier present? Speed/duplex? `ethtool` if relevant.
   - **L3**: Right IP? Right route? Default gateway reachable? `mtr` to destination.
   - **DNS**: name resolves at all? Resolves to the IP the app actually contacts? Stale `/etc/hosts`? `nss-dns` vs `nss-resolve` (systemd-resolved)?
   - **L4 reachability**: TCP SYN gets through? Is it `connect refused` (port closed on the destination, or a firewall REJECT) or `connect timeout` (silent drop)?
   - **Firewall**: local iptables/nftables rules, plus reverse-path filter (`rp_filter`), plus conntrack state.
   - **TLS**: SNI / SAN / chain issues — `openssl s_client -connect <h>:<p> -servername <h>` shows the real story.
   - **Application**: app log on receiving side, listening on `0.0.0.0` vs `127.0.0.1`, accepting from the right vhost?
2. **Distinguish "connect refused" vs "connect timeout"**:
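   - **Refused** = something answered, either with a TCP RST or with an ICMP unreachable (what a firewall `REJECT` rule sends by default). Causes: nothing listening on the destination port, a firewall configured with REJECT, or an upstream load balancer rejecting. L3 reachability to whatever answered is proven.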
   - **Timeout** = no response at all. Routing, drop firewall, MTU blackhole, or destination dead.
3. **Check for the easy-but-overlooked**:
   - `/etc/resolv.conf` overwritten by NetworkManager / cloud-init / dhclient
   - `/etc/hosts` overriding what you think DNS returned
   - `nsswitch.conf` order (`files dns` vs `dns files`)
   - Conntrack table full (`nf_conntrack: table full, dropping packet`) → `dmesg` will show this
   - Ephemeral port exhaustion on outbound-heavy hosts (`ss -s` shows estab+timewait, `sysctl net.ipv4.ip_local_port_range`)
   - MTU blackhole: ping works, large packets time out → `tracepath <host>` shows MTU
   - `rp_filter=1` dropping asymmetric-routed return packets
4. **For inbound problems**: confirm the service is listening on the expected interface (`0.0.0.0:<port>` vs `127.0.0.1:<port>`). `127.0.0.1` listeners are unreachable externally regardless of firewall.
5. **Recommend the SAFE diagnostic command sequence** with each step's expected output and what it rules in/out.
6. **Mark DESTRUCTIVE actions** (`iptables -F`, `systemctl restart NetworkManager` over SSH, `ip route flush`).

---

Symptom direction: [outbound from this host / inbound to this host]
Protocol + port: [e.g., TCP/443, UDP/53]
Failure mode: [connect timeout / connect refused / TLS error / DNS resolution fail / intermittent]
Distro + kernel: [e.g., Ubuntu 22.04, 5.15...]
Firewall: [iptables / nftables / firewalld / none]
`ip a`:
```
[PASTE]
```
`ip route`:
```
[PASTE]
```
`ss -tnp` (relevant subset):
```
[PASTE]
```
Reproduction (`curl -v` / `nc -vz` / `mtr` / `dig`):
```
[PASTE]
```
Firewall rules (`nft list ruleset` or `iptables-save -c` or `firewall-cmd --list-all`):
```
[PASTE]
```

Why this prompt works

“Can’t connect” can come from many distinct causes across L1–L7 that look identical to the user. Models tend to over-rotate on firewall or DNS as the answer. This prompt forces a layered walk and explicitly separates connect refused from connect timeout, because each leads down a different debugging path.

How to use it

  1. Be specific about direction. “Can’t reach the database” might be DNS, route, firewall, or the DB just down — and they’re investigated differently.
  2. State protocol and port. A TCP 443 timeout usually points to the network path or a firewall; a UDP 53 timeout is almost always the DNS server or the path to it.
  3. Include the actual error string from the client, verbatim. “curl: (7) Failed to connect: Connection refused” and “curl: (28) Operation timed out” lead down different debugging paths.
  4. Run mtr -rwbc 10 <dest> for any “intermittent” complaint — it shows packet loss per hop.
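
One way to make item 3 actionable is to branch on curl's exit status instead of eyeballing the message. The sketch below maps a few well-known curl exit codes to a debugging branch. The function name and the branch labels are illustrative and not part of curl. Exit code 7 covers every failed connect, so the verbatim message is still needed to tell a refusal from, say, an unreachable network.

```shell
# Map a curl exit status to the debugging branch it most likely points at.
# The labels printed here are hypothetical names chosen for this sketch.
classify_curl_exit() {
    case "$1" in
        0)  echo "ok" ;;
        6)  echo "dns" ;;               # could not resolve host
        7)  echo "connect-failed" ;;    # failed to connect (refused, unreachable, ...)
        28) echo "timeout" ;;           # operation timed out
        35) echo "tls-handshake" ;;     # SSL/TLS connect error
        60) echo "tls-certificate" ;;   # peer certificate could not be verified
        *)  echo "other" ;;
    esac
}

classify_curl_exit 28   # prints timeout
```

Assuming curl is installed, it could be used as `curl -sS -o /dev/null --connect-timeout 5 https://<host>/; classify_curl_exit $?`.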

Useful commands

# Layer 2 / Interface
ip a
ip -s link show <iface>     # rx/tx + errors
ethtool <iface>             # speed, duplex
ethtool -S <iface> | grep -i error   # NIC error counters

# Layer 3 / Routing
ip route
ip route get <dest-ip>      # which route would be used
mtr -rwbc 10 <dest>         # path loss
tracepath <dest>            # discovers MTU
ping -M do -s 1472 <dest>   # MTU test (1500 - 28 for ICMP+IP)

# DNS
resolvectl status           # systemd-resolved view
cat /etc/resolv.conf
dig <name>
dig @8.8.8.8 <name>         # bypass local resolver
nslookup <name>
getent hosts <name>         # uses nsswitch.conf order

# Sockets / Listeners
ss -tnlp                    # TCP listeners
ss -unlp                    # UDP listeners
ss -tnp dst <ip>            # active TCP to a destination
ss -tnp src :443            # active from local port 443
ss -s                       # summary (estab, timewait, etc.)

# Reproduction
curl -v --connect-timeout 5 https://<host>/
nc -vz <host> <port>
openssl s_client -connect <host>:443 -servername <host> </dev/null

# Firewall view (usually only one is authoritative; firewalld sits on top of nftables or iptables)
sudo nft list ruleset                       # nftables (modern)
sudo iptables-save -c                       # iptables (legacy + counters)
sudo firewall-cmd --list-all                # firewalld zone

# Conntrack
sudo conntrack -L | head
cat /proc/sys/net/netfilter/nf_conntrack_count
cat /proc/sys/net/netfilter/nf_conntrack_max
dmesg | grep -i "nf_conntrack: table full"

# Ephemeral ports
sysctl net.ipv4.ip_local_port_range
sysctl net.ipv4.tcp_tw_reuse
ss -s    # look at timewait count

# Reverse-path filter
sysctl net.ipv4.conf.all.rp_filter
sysctl net.ipv4.conf.<iface>.rp_filter

# Capture (narrow scope, low overhead)
sudo tcpdump -i any -n -s 96 'host <ip> and port <p>' -c 100
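
The `ping -M do -s 1472` line above relies on a small piece of arithmetic that is easy to get wrong when the path MTU is not 1500. The helpers below are a minimal sketch, assuming IPv4 with no IP or TCP options; the function names are made up for illustration.

```shell
# Largest ICMP echo payload that fits in one unfragmented IPv4 packet:
# MTU minus 20 bytes of IPv4 header minus 8 bytes of ICMP header.
icmp_payload_for_mtu() {
    echo $(( $1 - 20 - 8 ))
}

# Largest TCP segment payload (MSS) for the same MTU:
# MTU minus 20 bytes of IPv4 header minus 20 bytes of TCP header.
tcp_mss_for_mtu() {
    echo $(( $1 - 20 - 20 ))
}

icmp_payload_for_mtu 1500   # prints 1472
icmp_payload_for_mtu 1450   # prints 1422
tcp_mss_for_mtu 1450        # prints 1410
```

To probe a suspected 1450-byte path, the result can be fed straight into the ping test: `ping -M do -s "$(icmp_payload_for_mtu 1450)" <dest>`.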

Differential cheatsheet

| Symptom | Most likely | Confirm with |
|---|---|---|
| Connection refused immediately | Service not listening on expected IP/port, or firewall REJECT | ss -tnlp on dest; nft list ruleset |
| Connection timed out | Silent drop: routing, DROP firewall, dest dead | mtr, firewall log, dest health |
| First connect OK, later fails | Conntrack table full, ephemeral port exhaustion | dmesg, ss -s, conntrack counters |
| Small payloads OK, large fail | MTU blackhole | tracepath, ping -M do -s 1472 |
| Intermittent loss to one hop | Upstream issue | mtr -rwbc 100 over 5 min |
| Resolves OK, app times out | App contacts wrong IP, or IPv6 path broken | dig, getent hosts, force -4 |
| TLS handshake fails | SNI / SAN mismatch, cert expired | openssl s_client -servername |
| Works to one host, fails to another | Specific route / firewall rule | ip route get, nft list ruleset |
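
For the "conntrack counters" check in the table, comparing the two counters as a percentage is usually more telling than reading them side by side. This is a rough sketch; the function name is hypothetical, and how full counts as "too full" is left to the reader.

```shell
# Percentage of the conntrack table in use (integer arithmetic, rounded down).
# $1 = current entry count, $2 = table maximum.
conntrack_usage_pct() {
    if [ "$2" -le 0 ]; then
        echo "max must be positive" >&2
        return 1
    fi
    echo $(( $1 * 100 / $2 ))
}

conntrack_usage_pct 250000 262144   # prints 95
```

On a host with the conntrack module loaded, the inputs would come from the two /proc files listed under Useful commands: `conntrack_usage_pct "$(cat /proc/sys/net/netfilter/nf_conntrack_count)" "$(cat /proc/sys/net/netfilter/nf_conntrack_max)"`.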

Common findings this catches

  • App listens on 127.0.0.1:5432 but clients connect to <host-ip>:5432 → bind address misconfigured. ss -tnlp reveals immediately.
  • /etc/resolv.conf says nameserver 127.0.0.53 but systemd-resolved is misconfigured upstream → resolvectl status shows the real story.
  • iptables -L -v -n shows no rules but nft list ruleset does → an nftables ruleset is doing the blocking; the iptables view is misleading on systems that manage rules natively through nftables.
  • dmesg | grep conntrack shows table-full drops → raise nf_conntrack_max, or exempt high-volume internal traffic from connection tracking with notrack rules.
  • ss -s shows 30k TIME-WAIT → ephemeral port exhaustion on an outbound-heavy host; enable tcp_tw_reuse and widen ip_local_port_range.
  • MTU 1500 path → 1450 path (VPN/tunnel in between) → the TCP handshake works but large packets time out; clamp the TCP MSS on the gateway or lower the MTU on the sending interface.
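
To judge whether a TIME-WAIT count such as the 30k above is actually close to the limit, it helps to compare it with the size of the ephemeral range. The sketch below does that arithmetic. The function names and the 80% default threshold are arbitrary choices for illustration, not kernel limits. The comparison is also pessimistic: the kernel can reuse a local port for a different destination, so the budget is really per destination IP and port.

```shell
# Number of ports in an ip_local_port_range "low high" pair (inclusive).
ephemeral_port_count() {
    echo $(( $2 - $1 + 1 ))
}

# Report "pressure" when TIME-WAIT sockets use at least pct% of the range.
# $1 = TIME-WAIT count, $2 = range low, $3 = range high, $4 = pct (default 80).
port_pressure() {
    timewait=$1; low=$2; high=$3; pct=${4:-80}
    total=$(ephemeral_port_count "$low" "$high")
    if [ $(( timewait * 100 )) -ge $(( total * pct )) ]; then
        echo "pressure"
    else
        echo "ok"
    fi
}

ephemeral_port_count 32768 60999    # prints 28232
port_pressure 30000 32768 60999     # prints pressure
port_pressure 2000 32768 60999      # prints ok
```

The range 32768 to 60999 used here is the common Linux default; the real values come from `sysctl net.ipv4.ip_local_port_range`.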

When to escalate

  • Loss in mtr that begins at hop 2 or later and persists across runs looks like an upstream problem; hand it to the network team or ISP.
  • Cloud-VM-specific issues (VPC routes, security groups, Network ACLs) — this prompt is for the host layer; check provider console too. For OpenStack, see Neutron Networking Debug.
  • Hardware NIC errors growing in ethtool -S (CRC errors, drops) — physical layer; engage data-center hands.
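
A rough way to decide whether mtr loss is worth escalating is to check whether it persists to the final hop. Loss shown only at an intermediate hop is often that router deprioritising ICMP replies rather than dropping forwarded traffic. The helper below is a sketch that assumes the standard column layout of an `mtr -rw` report; the function name and its output labels are made up for illustration.

```shell
# Read an `mtr -rw` style report on stdin and summarise where loss appears.
# Loss% is the 7th column from the right, so host names that contain a space
# (as produced by -b, "name (ip)") do not shift it.
mtr_loss_summary() {
    awk '
        $1 ~ /^[0-9]+[.][|]--$/ {          # hop lines look like "  3.|-- host ..."
            loss = $(NF-6)
            sub(/%/, "", loss)
            hops++
            last = loss + 0                # loss at the most recent hop seen
            if (loss + 0 > 0) lossy++
        }
        END {
            if (hops == 0)      print "no-hops-parsed"
            else if (last > 0)  print "loss-at-destination"
            else if (lossy > 0) print "intermediate-loss-only"
            else                print "clean"
        }
    '
}
```

Typical use would be `mtr -rwbc 10 <dest> | mtr_loss_summary`. A result of `loss-at-destination` across several runs is the case worth handing to the network team.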
