
OpenStack Floating IP & SNAT Debug Prompt

Diagnose broken north-south connectivity — floating IPs that don't reach instances, missing SNAT for outbound traffic, and router namespace problems across centralized L3 and DVR deployments.

Target user: Network operators debugging external connectivity for tenant instances
Difficulty: Intermediate
Tools: Claude, ChatGPT

The prompt

You are a senior OpenStack networking engineer who has chased floating-IP and SNAT failures through router namespaces, DVR, and the external bridge countless times.

I will provide:
- Topology: centralized L3 or DVR, the provider/external network, the `external_network_bridge`/br-ex setup, and whether the routers are HA
- `openstack floating ip list`, the instance's fixed IP/port, and the router it's attached to
- On the network/compute node: the qrouter/snat/fip namespaces (`ip netns`), their interfaces, routes, and iptables NAT rules
- `tcpdump` at the external interface and inside the namespace
- Symptom: floating IP unreachable inbound, instance has no outbound internet (SNAT broken), or works for some instances not others

Your job:

1. **Inbound vs outbound**: separate the two problems, DNAT for the floating IP (inbound) and SNAT for default outbound. Under DVR they take different paths: floating-IP traffic goes through qrouter- and fip- on the compute node, while default SNAT goes through snat- on the network node.

2. **Namespace walk**: for centralized L3, inspect qrouter-<id> for the floating-IP DNAT/SNAT iptables rules and the external gateway. For DVR, trace fip-<net> (floating IPs, distributed) and snat-<id> (default SNAT, centralized on the network node).

3. **ARP & gateway**: confirm the floating IP is ARP-announced on the external segment, the external gateway is reachable, and there is no IP conflict or missing gratuitous ARP.

4. **DVR specifics**: check for the classic "floating IP works, default outbound doesn't" case, where SNAT lives on the network node and that path is broken, and for fip namespace problems local to a single compute node.

5. **L3 agent health**: check that the l3-agent is hosting the router, the HA/keepalived VRRP state (which node is master), and that an agent restart correctly rebuilt the namespaces.

6. **Fix & verify**: apply the minimal action (re-add gateway, restart l3-agent, fix br-ex uplink), then re-test inbound ping/curl to the FIP and outbound from the instance.

Output as: (a) inbound-vs-outbound triage, (b) the exact `ip netns exec` + iptables commands proving where the packet dies, (c) ranked root cause, (d) corrective command + re-test, (e) DVR-vs-centralized note if relevant.

Bias toward: proving the drop with namespace tcpdump before changing config; treating SNAT and floating-IP DNAT as separate failures; checking VRRP master before blaming the agent.
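
Example: gathering the namespace evidence

The prompt asks for namespace interfaces, routes, NAT rules, and a tcpdump. The sketch below is one possible way to list those commands before running anything; it is not part of the prompt. It only prints commands (a dry run), so they can be reviewed and then run as root on the right host. The function names (`classify_ns`, `ns_commands`, `evidence_plan`) and the router ID, external network ID, and floating IP arguments are hypothetical placeholders, and the snat-<router-id> / fip-<external-net-id> naming is assumed from the usual Neutron convention.

```shell
#!/usr/bin/env bash
# Dry-run helper: prints evidence-gathering commands, does not execute them.
# All IDs and addresses passed in are placeholders supplied by the operator.

# Map a Neutron namespace name to its role in the north-south path.
classify_ns() {
  case "$1" in
    qrouter-*) echo "router" ;;
    snat-*)    echo "snat" ;;
    fip-*)     echo "fip" ;;
    *)         echo "other" ;;
  esac
}

# Print the inspection commands for one namespace:
# addresses, routes, and the NAT table in rule-spec form.
ns_commands() {
  local ns="$1"
  echo "ip netns exec $ns ip -4 addr show"
  echo "ip netns exec $ns ip route show"
  echo "ip netns exec $ns iptables -t nat -S"
}

# Print the command list for a deployment mode ("centralized" or "dvr").
# Under DVR the snat- commands belong on the network node and the
# fip- commands on the compute node hosting the instance.
evidence_plan() {
  local mode="$1" router_id="$2" ext_net_id="$3" fip="$4"
  ns_commands "qrouter-$router_id"
  if [ "$mode" = "dvr" ]; then
    ns_commands "snat-$router_id"
    ns_commands "fip-$ext_net_id"
  fi
  # Capture inside the router namespace to see whether the packet arrives.
  echo "ip netns exec qrouter-$router_id tcpdump -ni any host $fip"
}
```

For instance, `evidence_plan dvr "$ROUTER_ID" "$EXT_NET_ID" "$FIP"` prints ten commands covering qrouter-, snat-, and fip-. Paste their output into the prompt alongside the symptom description.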