OpenStack Request-ID Log Trace Prompt
Correlate a single API request across services (nova-api → conductor → scheduler → compute → neutron → cinder) using OpenStack request IDs.
- Target user: OpenStack operators debugging cross-service issues
- Difficulty: Advanced
- Tools: Claude, ChatGPT
The prompt
You are a senior OpenStack operator with deep experience tracing a single user request as it fans out across nova-api, conductor, scheduler, compute, neutron-server, OVS/OVN agents, cinder-api, cinder-volume, glance, keystone, and placement.

I will provide:
- The user-facing symptom (boot failed, attach hung, etc.)
- The initial `X-Openstack-Request-Id` (req-XXXXXXXX) from the failing API call
- Raw log excerpts from multiple services (possibly out of order, possibly partial)
- The OpenStack release

Your job:
1. **Build a timeline** of the request as it crosses service boundaries. Each row: `timestamp | service | host | event | child request-id (if any)`.
2. **Track the request-id chain**: when service A calls service B, the *global* request-id is propagated as `X-Openstack-Request-Id` and B logs a new *local* request-id linked to A's. Reconstruct that chain.
3. **Identify the first error or anomaly** in the timeline (not the user-visible failure, which is downstream).
4. **Flag missing hops**: if you expect e.g. nova-conductor → nova-scheduler but the scheduler log doesn't show the request, that gap is the bug.
5. **Suggest the next log to fetch** if the trace is incomplete (be exact: which host, which service, which log file, what time window).
6. **Conclude** with the root-cause hypothesis grounded in the timeline.

Format the timeline as a markdown table. Use UTC timestamps consistently. If timezones differ across logs, normalize and note assumptions.

---

OpenStack release: [yoga / zed / antelope / bobcat / caracal / dalmatian / epoxy]
Symptom: [DESCRIBE]
Initial request-id: [req-XXXXXXXX]
Affected resource (server/volume/network UUID): [UUID]

Log excerpts (label each block with `service @ host`):

```
# nova-api @ ctrl-01
[PASTE]
```

```
# nova-conductor @ ctrl-01
[PASTE]
```

```
# nova-scheduler @ ctrl-02
[PASTE]
```

```
# nova-compute @ compute-17
[PASTE]
```

```
# neutron-server @ ctrl-01
[PASTE]
```

```
# cinder-volume @ storage-03 (if applicable)
[PASTE]
```

```
# other services / agents
[PASTE]
```
Why this prompt works
OpenStack is a microservices system that’s older than the term “microservice.” A single `openstack server create` call touches Keystone (auth), nova-api, nova-conductor, nova-scheduler, Placement, Glance, Neutron, possibly Cinder, and finally libvirt on the compute host. The user sees ERROR, but the actual error happened five hops in.
Without forcing the model to build a timeline, it tends to fixate on the most recent log line and miss the upstream cause. This prompt produces a row-by-row trace that surfaces the first anomaly.
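As a local cross-check on the model's table, the labelled blocks can be merged into a rough timeline before pasting. The sketch below is illustrative only: `build_timeline` is a hypothetical helper name, and it assumes oslo-style lines that start with `YYYY-MM-DD HH:MM:SS.mmm` and timestamps that are already in one timezone (it does no UTC normalization).

```shell
# Hypothetical helper: read "# service @ host" labelled blocks on stdin and
# print a markdown timeline sorted by timestamp.
build_timeline() {
  tab=$(printf '\t')
  awk '
    # A label line such as "# nova-api @ ctrl-01" sets the current service and host.
    /^# .+ @ .+/ { service = $2; host = $4; next }
    # A log line is assumed to start with "YYYY-MM-DD HH:MM:SS".
    /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] / {
      event = $0
      sub(/^[^ ]+ [^ ]+ /, "", event)   # drop the date and time fields
      gsub(/[|]/, "/", event)           # a raw pipe would break the table row
      print $1 " " $2 "\t" service "\t" host "\t" event
    }
  ' | sort -t "$tab" -k1,1 | awk -F'\t' '
    BEGIN { print "| timestamp | service | host | event |"; print "|---|---|---|---|" }
    { print "| " $1 " | " $2 " | " $3 " | " $4 " |" }
  '
}
```

Lines that match neither pattern (tracebacks, blank lines, code fences) are dropped, so the result is a skeleton to compare against the model's timeline, not a replacement for it.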
How to use it
- Find the request-id from the failing API response header: `X-Openstack-Request-Id: req-XXXXXXXX`.
- On each candidate host, narrow logs to a tight time window: `sudo journalctl --since "2026-05-21 14:30" --until "2026-05-21 14:35" -u nova-api -u nova-conductor`
- Then `grep` the request-id within that window. Don’t grep the whole log; that adds too much noise.
- Paste each service’s relevant lines under a clearly labeled block.
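If the failing call was made with a raw HTTP client, the header can be pulled out mechanically. A minimal sketch, assuming the response headers are available as text (for example from `curl -si`); `extract_request_id` is a hypothetical helper, not part of any OpenStack tool.

```shell
# Hypothetical helper: read raw HTTP response headers on stdin and print the
# value of the first X-Openstack-Request-Id header.
extract_request_id() {
  # HTTP header lines end in CRLF and header names are case-insensitive.
  tr -d '\r' | awk -F': *' '
    !found && tolower($1) == "x-openstack-request-id" { print $2; found = 1 }
  '
}
```

Usage might look like `curl -si -H "X-Auth-Token: $TOKEN" "$COMPUTE_URL/servers" | extract_request_id`, where `$TOKEN` and `$COMPUTE_URL` are placeholders for your own token and compute endpoint.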
Useful one-liner: pull a request across all services
# On each controller / compute / storage host:
sudo grep -rh "req-XXXXXXXX" /var/log/{nova,neutron,cinder,keystone,glance,placement}/ 2>/dev/null | \
sort -k1,2
# Or via journalctl (systemd-journal):
sudo journalctl --since "1 hour ago" | grep "req-XXXXXXXX"
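The prompt expects each excerpt under a `# service @ host` label, which the one-liner above does not produce. A minimal sketch of one way to add the labels, assuming file-based logs named after the service (for example `nova-api.log`); `label_blocks` is a hypothetical helper and the host name is passed in explicitly.

```shell
# Hypothetical helper: print matching lines from each log file under a
# "# service @ host" label, sorted by timestamp within each file.
# Usage: label_blocks REQ_ID HOST FILE...
label_blocks() {
  req_id=$1; host=$2; shift 2
  for f in "$@"; do
    # Assumption: the service name is the file name minus ".log".
    service=$(basename "$f" .log)
    matches=$(grep -hF -- "$req_id" "$f" 2>/dev/null | sort -k1,2) || true
    [ -n "$matches" ] || continue          # skip files with no hits
    printf '# %s @ %s\n%s\n\n' "$service" "$host" "$matches"
  done
}
```

For example, `label_blocks req-XXXXXXXX "$(uname -n)" /var/log/nova/*.log` would emit one block per nova service on the current host; the path is an assumption about your log layout.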
What “good propagation” looks like
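A request-id chain typically looks like the following. Hops inside one project (nova-api, conductor, scheduler, compute) travel over RPC and keep the same request-id; across a REST boundary (neutron, cinder, glance, placement) the callee normally logs its own local request-id and records req-AAAA as the global one: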
req-AAAA user → nova-api
req-AAAA nova-api → keystone (auth)
req-AAAA nova-api → nova-conductor
req-AAAA nova-conductor → nova-scheduler
req-BBBB nova-conductor → glance (image lookup, new global req-id sometimes)
req-AAAA nova-conductor → neutron-server (port allocate)
req-AAAA nova-conductor → cinder-api (volume attach)
req-AAAA nova-scheduler → placement (resource claim)
req-AAAA nova-conductor → nova-compute (build task)
req-AAAA nova-compute → libvirt (define + start)
req-AAAA nova-compute → neutron-l2-agent (port wire-up)
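If your trace shows req-AAAA reaching nova-scheduler but no placement log line carries req-AAAA, either as the request-id or as the global request-id next to placement’s own local id, the call never completed there, and that gap is the bug.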
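To follow the chain across a REST boundary you need the local request-ids that were logged against the global one. A minimal sketch, assuming the log context prints the global request-id followed by the local one inside square brackets (`[req-AAAA req-BBBB ...]`); if your `logging_context_format_string` differs, the pattern needs adjusting. `child_request_ids` is a hypothetical helper.

```shell
# Hypothetical helper: read log lines on stdin and print each distinct local
# request-id that was logged with the given global (parent) request-id.
# Usage: child_request_ids PARENT_REQ_ID
child_request_ids() {
  awk -v parent="$1" '
    {
      # Find a "[<parent> req-..." context prefix anywhere on the line.
      start = index($0, "[" parent " req-")
      if (start == 0) next
      # Skip "[", the parent id and one space to land on the local id.
      rest = substr($0, start + length(parent) + 2)
      split(rest, parts, /[] ]/)          # the local id ends at a space or "]"
      if (!(parts[1] in seen)) { seen[parts[1]] = 1; print parts[1] }
    }
  '
}
```

Each id it prints can then be grepped for in that service's logs to pull in lines that never mention the original request-id.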
Common findings this catches
- Request times out at the scheduler because `[scheduler] max_attempts` is exhausted, but the `nova-api` log just shows `NoValidHost`. The trace reveals the retry storm.
- Neutron port allocation succeeds but the compute log shows the port wiring never happened: the agent on that compute host is down.
- Cinder attach returns 200 OK to nova-conductor but `os-brick` on the compute host fails minutes later. Two separate request IDs, one timeline.
- Auth failures invisible upstream: Keystone validates, but a downstream service’s local policy.yaml rejects. The Keystone log looks clean.
- Cross-cell calls (in cells v2 deployments) routed wrong: the request-id appears in cell0 logs when it should be in cell1.
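Several of these findings come down to locating the earliest non-INFO line across all services rather than the last one. A minimal sketch, assuming oslo-style lines where the fourth whitespace-separated field is the log level (`date time pid LEVEL ...`); journald-prefixed lines would need the field index adjusted. `first_anomaly` is a hypothetical helper.

```shell
# Hypothetical helper: read log lines from any mix of services on stdin and
# print the earliest WARNING, ERROR or CRITICAL line by timestamp.
first_anomaly() {
  awk '$4 == "WARNING" || $4 == "ERROR" || $4 == "CRITICAL"' \
    | sort -k1,2 \
    | sed -n '1p'
}
```

It only compares timestamps as text, so the inputs should already share a timezone.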
When use_global_request_id matters
Some releases need explicit config in [DEFAULT]:
[DEFAULT]
use_global_request_id = true
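to ensure the same request-id propagates rather than each service generating its own. Option names differ between services and releases, so confirm this one against the configuration reference for your release before relying on it. If your trace shows brand-new request-ids appearing at each hop with no link, check this setting, and also check that logging_context_format_string includes %(global_request_id)s; without it the global id is not printed even when it is propagated.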
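To see what a service is actually configured with, it helps to read the value from a specific section rather than grep the file, since commented-out defaults produce false matches. A minimal sketch; `ini_get` is a hypothetical helper and the file path in the usage line is an assumption about your layout.

```shell
# Hypothetical helper: print the value of KEY in [SECTION] of an INI-style
# file, ignoring comment lines.
# Usage: ini_get FILE SECTION KEY
ini_get() {
  awk -v section="$2" -v key="$3" '
    /^[ \t]*[#;]/ { next }                      # skip comment lines
    /^[ \t]*\[/ {                               # section header
      name = $0
      gsub(/^[ \t]*\[|\][ \t]*$/, "", name)     # strip the brackets
      in_section = (name == section)
      next
    }
    in_section {
      eq = index($0, "=")
      if (eq == 0) next
      k = substr($0, 1, eq - 1); v = substr($0, eq + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", k)            # trim the key
      gsub(/^[ \t]+|[ \t]+$/, "", v)            # trim the value
      if (k == key) print v
    }
  ' "$1"
}
```

For example, `ini_get /etc/nova/nova.conf DEFAULT logging_context_format_string` prints nothing when the option is unset, which means the service is using its built-in default.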
When to escalate
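If the timeline shows the request leaving service A but never arriving at service B, and both services’ clocks are NTP-synced, the cause is almost always the message bus (RabbitMQ): a queue is stuck, a binding is wrong, or the consumer is in a slow GC cycle. Time to look at the broker, not the apps. This applies to RPC hops inside a project; REST hops between projects go over HTTP, so a missing arrival there points at the API endpoint or load balancer instead.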
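A quick broker-side check is to list queues that hold messages but have no consumers. A minimal sketch, assuming the tab-separated output of `rabbitmqctl list_queues name messages consumers`; `stuck_queues` is a hypothetical filter and the default threshold of one message is arbitrary.

```shell
# Hypothetical filter: read "name<TAB>messages<TAB>consumers" rows on stdin
# and print queues with at least MIN messages and zero consumers.
# Usage: stuck_queues [MIN]
stuck_queues() {
  awk -F'\t' -v min="${1:-1}" '
    # Header and banner lines are skipped because their columns are not numeric.
    $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ && $2 + 0 >= min + 0 && $3 + 0 == 0 {
      print $1 "\t" $2
    }
  '
}
```

On a broker node this might be run as `sudo rabbitmqctl -q list_queues name messages consumers | stuck_queues 1`. A queue with consumers attached but a steadily growing message count is also suspect, which this filter does not detect.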
Related prompts
- Cinder Volume Troubleshooting Prompt: Diagnose stuck volumes, failed attachments, and backend issues (Ceph/LVM/iSCSI/NFS) in OpenStack Cinder using CLI output and service logs.
- Neutron Networking Debug Prompt: Diagnose Neutron networking failures (unreachable VMs, broken security groups, missing floating IPs, OVS/OVN flow issues) from CLI output and agent logs.
- Nova Scheduler Filter Analysis Prompt: Diagnose why VMs aren't landing on hosts by reviewing scheduler filters, weighers, host aggregates, placement allocations, and capacity.
- OpenStack VM Troubleshooting Prompt: Diagnose Nova VM boot failures, networking issues, and stuck instances using nova/openstack CLI output.