OpenStack Troubleshooting Toolkit
Diagnose Horizon 504s, Cinder and RabbitMQ timeouts, dead Neutron agents, Keystone auth failures, and Kolla-Ansible issues with control-plane runbooks and AI workflows.
Top OpenStack errors
Start with the most common production issues and troubleshooting paths.
Horizon / API 504 Gateway Timeout
Localize the slow backend across HAProxy, Keystone, RabbitMQ, and MariaDB.
Cinder scheduler timeout
Trace the API → scheduler → volume RPC path and diagnose “filtering removed all hosts” failures.
RabbitMQ RPC timeout
Diagnose oslo.messaging MessagingTimeout errors and missed heartbeats across services.
neutron-l3-agent dead (XXX state)
Heartbeat, RPC, namespace, and floating-IP triage for dead L3 agents.
Kolla-Ansible certificate update
Rotate internal/external/HAProxy TLS certs and reconfigure safely.
ImageUnacceptable
Cinder rejecting a volume create with ImageUnacceptable? Diagnose image size vs volume size, format, and virtual-size mismatche…
Floating IP pool not found
Allocating a floating IP and hitting Floating IP pool not found or ExternalNetworkNotReachable? Diagnose missing external netwo…
InstanceNotFound
Nova logging InstanceNotFound during periodic tasks or deletes? Diagnose orphaned database rows, stale local instances, and com…
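For the dead neutron-l3-agent case listed above, a first triage step is to pull the agents Neutron reports as not alive out of the CLI output. The following is a minimal sketch, assuming the output of `openstack network agent list -f json` uses the same field names as the table headings ("Agent Type", "Host", "Alive"). The function name `dead_agents` and the sample data are hypothetical, and the encoding of "Alive" (boolean versus the ":-)" / "XXX" strings) may differ between client versions, so both are handled.

```python
import json


def dead_agents(agent_list_json, agent_type=None):
    """Return agents that Neutron reports as not alive.

    agent_list_json: text captured from `openstack network agent list -f json`.
    agent_type: optional filter, for example "L3 agent".
    """
    dead = []
    for agent in json.loads(agent_list_json):
        alive = agent.get("Alive")
        # Assumption: JSON output carries a boolean; table-style output uses ":-)" / "XXX".
        if alive is True or alive == ":-)":
            continue
        if agent_type is not None and agent.get("Agent Type") != agent_type:
            continue
        dead.append(agent)
    return dead


# Hypothetical sample data, not taken from a real deployment.
SAMPLE = json.dumps([
    {"Agent Type": "L3 agent", "Host": "net01", "Alive": True, "Binary": "neutron-l3-agent"},
    {"Agent Type": "L3 agent", "Host": "net02", "Alive": False, "Binary": "neutron-l3-agent"},
    {"Agent Type": "DHCP agent", "Host": "net02", "Alive": "XXX", "Binary": "neutron-dhcp-agent"},
])

if __name__ == "__main__":
    for agent in dead_agents(SAMPLE, agent_type="L3 agent"):
        print(agent["Host"], agent["Binary"])
```

A dead agent found this way only tells you where to look next; the heartbeat, RPC, and namespace checks still have to be done on the reported host.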
Best OpenStack prompts
Use these prompts to turn symptoms, logs, and config into a structured troubleshooting plan.
Nova Live Migration Troubleshooting
Diagnose Nova live migration failures — shared storage requirements, block migration, network bandwidth, CPU compatibility, error 'migration aborted'.
OpenStack Upgrade Pre-Flight Review
Pre-upgrade safety review of an OpenStack cluster moving release N → N+1 — config drift, deprecated options, DB migrations, breaking changes, service ordering.
Neutron Networking Debug
Diagnose Neutron networking failures — unreachable VMs, broken security groups, missing floating IPs, OVS/OVN flow issues — from CLI output and agent logs.
OpenStack VM Troubleshooting
Diagnose Nova VM boot failures, networking issues, and stuck instances using nova/openstack CLI output.
Free OpenStack tools
Validate, troubleshoot, or analyze your configuration before production changes.
AI Incident Response Assistant
Paste control-plane logs and symptoms, get a triage plan.
Start triage
OpenStack troubleshooting guides
The runbook hub for 504s, Cinder, RabbitMQ, Neutron, and Kolla-Ansible — with downloadable packs.
Open the hub
OpenStack runbook
Use a repeatable checklist for production troubleshooting.
A top-to-bottom control-plane checklist for OpenStack incidents.
1. Check the service list and endpoints (openstack service list, openstack endpoint list)
2. Validate HAProxy backends on the internal and external VIPs
3. Check RabbitMQ queues, consumers, and heartbeats
4. Inspect the affected service logs (nova, cinder, neutron, keystone)
5. Validate the API response path end to end
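The checklist above can be sketched as data plus a small runner, so the same commands are issued in the same order during every incident. This is an illustrative sketch, not the toolkit's own implementation: the `Step`, `build_checklist`, and `run_checklist` names are hypothetical, and the Keystone URL, HAProxy admin socket path, and log paths are assumptions that depend on the deployment (in a Kolla-Ansible environment the commands would typically have to run inside the relevant containers). It also assumes OpenStack credentials are already present in the environment.

```python
import shlex
import subprocess
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    number: int
    title: str
    commands: tuple  # each command is an argv tuple, run without an implicit shell


def build_checklist(keystone_url, haproxy_socket, log_paths):
    """Return the five checklist steps as data. All arguments are site-specific."""
    return [
        Step(1, "Check service list and endpoints", (
            ("openstack", "service", "list"),
            ("openstack", "endpoint", "list"),
        )),
        Step(2, "Validate HAProxy backends", (
            # Assumes the HAProxy admin socket is enabled and socat is installed.
            ("sh", "-c",
             "echo 'show stat' | socat stdio " + shlex.quote(haproxy_socket)),
        )),
        Step(3, "Check RabbitMQ queues, consumers, and heartbeats", (
            # Queue depth and consumer counts; missed heartbeats usually show up in logs.
            ("rabbitmqctl", "list_queues", "name", "messages", "consumers"),
        )),
        Step(4, "Inspect the affected service logs",
             tuple(("tail", "-n", "50", path) for path in log_paths)),
        Step(5, "Validate the API response path end to end", (
            # Prints the HTTP status code and total request time for one API call.
            ("curl", "-sS", "-o", "/dev/null",
             "-w", "%{http_code} %{time_total}\\n", keystone_url),
        )),
    ]


def run_checklist(steps, dry_run=True, timeout=30):
    """Run each command in order; return (step, argv, returncode, output) tuples.

    With dry_run=True nothing is executed and returncode is None.
    """
    results = []
    for step in steps:
        for argv in step.commands:
            if dry_run:
                results.append((step.number, argv, None, ""))
                continue
            try:
                proc = subprocess.run(list(argv), capture_output=True,
                                      text=True, timeout=timeout)
                results.append((step.number, argv, proc.returncode,
                                proc.stdout or proc.stderr))
            except (FileNotFoundError, subprocess.TimeoutExpired) as exc:
                # A missing binary or a hung command is itself a finding.
                results.append((step.number, argv, -1, str(exc)))
    return results


if __name__ == "__main__":
    # Placeholder values; replace with the deployment's real URL, socket, and logs.
    plan = build_checklist("http://127.0.0.1:5000/v3",
                           "/var/run/haproxy.sock",
                           ["/var/log/nova/nova-api.log"])
    for number, argv, _, _ in run_checklist(plan, dry_run=True):
        print(number, " ".join(argv))
```

The runner defaults to a dry run so the plan can be reviewed before anything touches the control plane; passing `dry_run=False` executes the commands and records failures instead of stopping at the first one.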