Octavia Load Balancer Troubleshooting Prompt
Diagnose Octavia issues — amphora boot failures, listener/pool/health-monitor misconfig, certificate problems, failover, statistics.
- Target user: OpenStack network engineers running Octavia
- Difficulty: Advanced
- Tools: Claude, ChatGPT
The prompt
You are a senior OpenStack engineer who has operated Octavia load balancers: amphora VMs, listeners, pools, members, health monitors, and TLS termination with Barbican.

I will provide:
- The symptom (LB stuck creating, traffic not reaching members, health monitor reporting members down, failover not happening, TLS error)
- `openstack loadbalancer show <id>` and related show commands for listener/pool/member
- Amphora image status and Octavia worker logs
- The Octavia topology (`SINGLE` or `ACTIVE_STANDBY`)
- For TLS: Barbican secret container ref

Your job:

1. **Verify Octavia components**:
   - `octavia-worker`, `octavia-housekeeping`, `octavia-health-manager`, `octavia-api` running
   - Amphora image registered (`amphora-agent` baked in)
   - Management network connectivity from amphora to controller
2. **For "LB stuck in `PENDING_CREATE`"**:
   - Amphora VM boot failed: check Nova for the amphora instance
   - Image missing or wrong: tag `amphora` on Glance image
   - Network not reachable: management network ID configured correctly
   - Flavor issues: amphora flavor doesn't fit on any host
3. **For listener/pool issues**:
   - **Listener** = port + protocol + cert
   - **Pool** = backend members + algorithm + health monitor
   - **Member** = IP:port within a pool
   - **Health Monitor** = how to check member liveness (HTTP, TCP, PING)
   - All must show provisioning status `ACTIVE` and operating status `ONLINE`
4. **For "member shows offline"**:
   - Health monitor expected HTTP 200 from `/healthz` but app returns 404 → fix path
   - HM TCP probe to closed port → app not listening
   - Security group blocking the HM source (the amphora's address on the member subnet)
   - Member IP not reachable from amphora (subnet/security group)
5. **For TLS termination**:
   - Cert + intermediates stored in Barbican secret container
   - Listener references container
   - Listener `default_tls_container_ref` for default cert; `sni_container_refs` for SNI
   - Cert expired = TLS handshake fails; rotate in Barbican and point the listener at the new ref
6. **For failover (ACTIVE_STANDBY)**:
   - Two amphorae; one with role `MASTER` (active), one with role `BACKUP` (standby)
   - VRRP between them (keepalived)
   - Failover triggered by health-manager
   - Common issue: VIP not migrating because VRRP misconfigured
7. **For statistics**:
   - `openstack loadbalancer stats show` for connection / byte counters
   - Per load balancer and per listener (`openstack loadbalancer listener stats show`)
8. **For octavia-housekeeping** (cleanup):
   - Deletes expired amphorae
   - Cleans stale records
   - If stopped: orphan amphorae accumulate, costing resources

Mark DESTRUCTIVE: deleting an LB with active traffic (clients drop), Barbican secret rotation without LB reconcile (expired cert served), force-deleting amphora without Octavia state cleanup.

---

Symptom: [DESCRIBE]

`openstack loadbalancer show <id>`:
```
[PASTE]
```

Listener/Pool/HM details:
```
[PASTE]
```

Octavia logs (worker + housekeeping + health-manager):
```
[PASTE]
```

Topology: [SINGLE / ACTIVE_STANDBY]
Why this prompt works
Octavia spans Nova (amphora VMs), Glance (amphora image), Neutron (networks), Barbican (certs), and its own services. A load balancer stuck in `PENDING_CREATE` can be caused by a fault in any of these layers, so the prompt walks through each one in turn.
How to use it
- Always include the LB topology (SINGLE vs ACTIVE_STANDBY).
- Check amphora Nova instance for “stuck create” issues.
- For TLS, verify Barbican secret and cert validity.
- For member offline, replicate HM check manually.
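Replicating the health-monitor check by hand, as the last tip suggests, could look roughly like the sketch below. It assumes an HTTP monitor that expects a single status code (200 by default) and is meant to be run from inside the amphora's `amphora-haproxy` namespace, so the probe takes the same network path as the real monitor. The function name `hm_probe` and its arguments are illustrative and not part of Octavia.

```shell
#!/usr/bin/env bash
# Hypothetical helper: mimic an Octavia HTTP health-monitor probe with curl.
# Usage: hm_probe <url> [timeout-seconds] [expected-code]
hm_probe() {
  local url="$1" timeout="${2:-3}" expected="${3:-200}"
  local code
  # -s hides progress, -o discards the body, -w prints only the status code.
  # curl prints 000 when no HTTP response arrived (refused, timed out, ...).
  code=$(curl -s -o /dev/null --max-time "$timeout" -w '%{http_code}' "$url") || true
  if [ "$code" = "$expected" ]; then
    echo "PASS $url returned $code"
    return 0
  fi
  echo "FAIL $url returned ${code:-no response}, expected $expected"
  return 1
}
```

For example, after opening a shell with `sudo ip netns exec amphora-haproxy bash` and defining the function there, `hm_probe http://<member-ip>:<port>/healthz 3 200` reports whether the member answers the way the monitor expects. A result of 404 points at the URL path, while 000 points at reachability or a closed port.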
Useful commands
# LB inventory
openstack loadbalancer list
openstack loadbalancer show <id>
openstack loadbalancer status show <id> # tree view
openstack loadbalancer stats show <id>
# Children
openstack loadbalancer listener list --loadbalancer <id>
openstack loadbalancer pool list --loadbalancer <id>
openstack loadbalancer member list <pool-id>
openstack loadbalancer healthmonitor list
# Amphora
openstack loadbalancer amphora list
openstack loadbalancer amphora show <id>
# Amphora image
openstack image list --tag amphora
openstack image show <amphora-image-id>
# Octavia logs (controller)
sudo journalctl -u octavia-worker -n 200 --no-pager
sudo journalctl -u octavia-housekeeping -n 200 --no-pager
sudo journalctl -u octavia-health-manager -n 200 --no-pager
sudo journalctl -u octavia-api -n 100 --no-pager
# Amphora-side (if you can SSH into amphora management)
ssh -i amphora-key ubuntu@<amphora-mgmt-ip>
sudo journalctl -u amphora-agent -n 100 --no-pager
# Inside amphora: /var/log/haproxy.log
# Test member from amphora
ssh -i amphora-key ubuntu@<amphora-mgmt-ip>
sudo ip netns exec amphora-haproxy curl -v http://<member-ip>:<port>/healthz
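To fill in the prompt's paste sections in one go, the read-only commands above can be wrapped in a small collector. This is only a sketch: `collect_lb_diag` is a made-up name, and the set of subcommands is one reasonable selection from the list above, not a required one.

```shell
#!/usr/bin/env bash
# Hypothetical helper: gather the CLI output the prompt asks for into one report.
# Usage: collect_lb_diag <lb-id> > lb-report.txt
collect_lb_diag() {
  local lb="$1"
  local cmd
  # Every entry is a read-only subcommand; nothing here modifies the load balancer.
  for cmd in \
    "loadbalancer show $lb" \
    "loadbalancer status show $lb" \
    "loadbalancer listener list --loadbalancer $lb" \
    "loadbalancer pool list --loadbalancer $lb" \
    "loadbalancer amphora list --loadbalancer $lb"; do
    echo "### openstack $cmd"
    # Word splitting of $cmd is intentional: it expands into separate arguments.
    # shellcheck disable=SC2086
    openstack $cmd 2>&1 || echo "(command failed)"
    echo
  done
}
```

Each section starts with a `###` line naming the command that produced it, which makes it easy to paste the right part under the right heading of the prompt.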
Patterns
Create HTTPS LB with TLS
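# Store cert in Barbican (the payload is base64, so declare the encoding) and keep its href
CERT_HREF=$(openstack secret store --name 'cert' --secret-type certificate \
--payload-content-type application/octet-stream --payload-content-encoding base64 \
--payload "$(base64 -w0 server.crt)" -c "Secret href" -f value)
# Bundle the secrets in a certificate container; keep the container href for the listener
BARBICAN_CONTAINER_HREF=$(openstack secret container create --name 'lb-cert-container' \
--type=certificate \
--secret="certificate=$CERT_HREF" \
--secret="private_key=..." \
--secret="intermediates=..." \
-c "Container href" -f value)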
# Create LB
openstack loadbalancer create --name web-lb --vip-subnet-id <subnet-id>
openstack loadbalancer listener create \
--name web-listener \
--protocol TERMINATED_HTTPS \
--protocol-port 443 \
--default-tls-container-ref "$BARBICAN_CONTAINER_HREF" \
web-lb
openstack loadbalancer pool create \
--name web-pool \
--lb-algorithm ROUND_ROBIN \
--listener web-listener \
--protocol HTTP
openstack loadbalancer member create \
--subnet-id <member-subnet> \
--address 10.0.0.10 \
--protocol-port 8080 \
web-pool
openstack loadbalancer healthmonitor create \
--delay 5 \
--timeout 3 \
--max-retries 3 \
--type HTTP \
--url-path /healthz \
web-pool
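Octavia provisions asynchronously and rejects changes to a load balancer that is still in a `PENDING_*` state, so running the commands above back to back can fail. One way to handle this is to poll between steps, as in the sketch below. `wait_for_lb_active` is a hypothetical helper, and the attempt count and sleep interval are arbitrary defaults. Recent client releases may also offer a `--wait` flag on these commands; check `--help` on your version.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll until the load balancer reaches ACTIVE or ERROR.
# Usage: wait_for_lb_active <lb-name-or-id> [max-attempts] [sleep-seconds]
wait_for_lb_active() {
  local lb="$1" attempts="${2:-60}" pause="${3:-5}"
  local status i
  for ((i = 1; i <= attempts; i++)); do
    status=$(openstack loadbalancer show "$lb" -f value -c provisioning_status)
    case "$status" in
      ACTIVE) return 0 ;;
      ERROR)  echo "LB $lb entered ERROR" >&2; return 1 ;;
    esac
    sleep "$pause"
  done
  echo "LB $lb still $status after $attempts checks" >&2
  return 2
}
```

Calling `wait_for_lb_active web-lb` after the load balancer, listener, pool, member, and health monitor steps keeps each command from racing the previous one. Return code 1 means the LB went to `ERROR`, and 2 means it never left a pending state within the allotted checks.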
Failover an amphora
openstack loadbalancer amphora failover <amphora-id>
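After a failover on an `ACTIVE_STANDBY` load balancer, it is worth confirming that the pair is whole again. The sketch below assumes the amphora roles reported by the CLI are `MASTER` and `BACKUP` and counts only amphorae in status `ALLOCATED`. `check_amphora_pair` is an illustrative name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: confirm an ACTIVE_STANDBY LB has one MASTER and one BACKUP amphora.
# Usage: check_amphora_pair <lb-id>
check_amphora_pair() {
  local lb="$1" roles masters backups
  # --status ALLOCATED skips amphorae that are booting, deleted, or in error.
  roles=$(openstack loadbalancer amphora list --loadbalancer "$lb" \
    --status ALLOCATED -f value -c role)
  # grep -c prints the number of matching lines (and exits 1 when that number is 0).
  masters=$(grep -c '^MASTER$' <<<"$roles" || true)
  backups=$(grep -c '^BACKUP$' <<<"$roles" || true)
  echo "MASTER=$masters BACKUP=$backups"
  [ "$masters" -eq 1 ] && [ "$backups" -eq 1 ]
}
```

A non-zero exit status with output such as `MASTER=1 BACKUP=0` suggests the replacement amphora has not finished building yet, or that its build failed. The worker logs are the next place to look.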
Common findings this catches
- LB stuck PENDING_CREATE > 10 min → check Nova for amphora boot failure.
- Member offline but the same check passes when run manually from another host → security group on the member blocks probes coming from the amphora's address on the member subnet.
- TLS handshake fails after cert rotation → listener still pointing to old Barbican secret.
- Failover doesn’t migrate VIP → ACTIVE_STANDBY VRRP config or network issue.
- Statistics show 0 connections but traffic expected → listener bound to wrong port or VIP not reachable.
- Amphorae piling up in Nova → housekeeping not running.
- HM TCP probe to closed port → app not listening; member offline.
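For the certificate-related findings, a quick expiry check on the PEM file can rule out the simplest cause before digging into Barbican references. The sketch below assumes `openssl` is installed and that you have the certificate as a local PEM file. `cert_valid_for` is a hypothetical helper, and the 30-day default is arbitrary.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a PEM certificate stays valid for at least N more days.
# Usage: cert_valid_for <pem-file> [days]
cert_valid_for() {
  local pem="$1" days="${2:-30}"
  # -checkend exits 0 if the cert is still valid <seconds> from now, non-zero otherwise.
  if openssl x509 -in "$pem" -noout -checkend "$((days * 86400))" >/dev/null 2>&1; then
    echo "OK: $pem is valid for at least $days more days"
    return 0
  fi
  echo "WARN: $pem expires within $days days, is expired, or could not be read"
  return 1
}
```

To see what the listener is actually serving after a rotation, the certificate can be pulled from the VIP with `openssl s_client -connect <vip>:443 </dev/null 2>/dev/null | openssl x509 -noout -enddate` and compared against the file that was uploaded.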
When to escalate
- Octavia control plane down — engage cluster admin.
- Amphora image vulnerability discovered — update + rebuild all LBs.
- Cross-provider (F5/A10) issues — vendor support.
Related prompts
- Barbican Secret Store Management Prompt: Manage Barbican secrets, including the secret/container/order model, HSM backend, key rotation, ACLs, and Octavia integration.
- Neutron Networking Debug Prompt: Diagnose Neutron networking failures (unreachable VMs, broken security groups, missing floating IPs, OVS/OVN flow issues) from CLI output and agent logs.
- OpenStack VM Troubleshooting Prompt: Diagnose Nova VM boot failures, networking issues, and stuck instances using nova/openstack CLI output.