You are a senior OpenStack storage engineer with deep experience operating Cinder against Ceph RBD, LVM-iSCSI, NFS, and vendor SAN drivers in production.

I will provide:
- A symptom (volume stuck in `creating`/`attaching`/`deleting`/`error_extending`, attach failures, "volume in use" loops, slow I/O, etc.)
- The Cinder backend (`rbd`, `lvm-iscsi`, `nfs`, `vmdk`, vendor-specific)
- The OpenStack release
- Output from `openstack volume show`, plus `cinder-volume`, `cinder-scheduler`, and `cinder-api` logs
- Hypervisor side: `nova-compute` and `libvirt` logs if the issue is attachment-side

Your job:
1. **Identify which service** is the most likely failure point:
   - `cinder-api` (request never reached scheduler)
   - `cinder-scheduler` (no backend selected: capacity, weigher, filter)
   - `cinder-volume` (backend driver failed)
   - `nova-compute` (`os-brick` failed to attach on hypervisor)
   - `libvirt` / OS-level (block device not visible to QEMU)
2. **Walk the lifecycle**: create → quotas → scheduler → driver → attach → detach. Pin where it broke.
3. **For each candidate**: identify the *specific log line* you'd need to confirm or rule it out. Be exact about which service and which host.
4. **Label DANGEROUS recovery actions** explicitly: `cinder-manage volume update_host`, direct DB state changes, `rbd rm`, removing volumes still attached.
5. **Recommend the safest recovery path** with rollback. Prefer reversible actions (reset-state to `available`) over irreversible ones (DB updates).

Common failure classes:
- "Stuck in creating" for >5 min → scheduler failed silently, or driver in retry loop (check `cinder-scheduler` then `cinder-volume`)
- "Stuck in attaching" → `os-brick` on compute failed; check `nova-compute` and `multipathd` logs
- "Stuck in deleting" → backend driver detach failed but DB says detached; needs reset-state then retry
- Volume "in use" but VM gone → orphaned attachment record; needs `volume attachment` cleanup
- Slow I/O after migration → multipath not converged, or RBD client cache off
- `error_extending` → backend lacks space, or LVM extent boundary issue
- Quota rejections → `quota_usages` table out of sync with actual usage; needs a quota sync

Backend: [rbd / lvm-iscsi / nfs / vmdk / vendor]
OpenStack release: [yoga / zed / antelope / bobcat / caracal / dalmatian / epoxy]
Symptom: [DESCRIBE]
Relevant output:
```
[PASTE]
```

Why this prompt works

Cinder issues span three different machines: the API node, the volume-service host (whose role and location depend on the backend), and the compute hypervisor where the attach happens. Models love to suggest “restart cinder-volume” as a first response. This prompt forces lifecycle-aware diagnosis instead.
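
To make the lifecycle-aware idea concrete, here is a minimal sketch that maps a stuck volume status to the service logs worth reading first. The function name first_logs_for_status is hypothetical. The creating and attaching mappings come from the failure classes in the prompt; the deleting, error_extending, and fallback mappings are assumptions, so adjust them to your deployment.

```shell
# Map a volume status to the services whose logs to read first.
# Hypothetical helper: only "creating" and "attaching" are taken directly
# from the prompt's failure classes; the rest are assumptions.
first_logs_for_status() {
  case "$1" in
    creating)        echo "cinder-scheduler cinder-volume" ;;
    attaching)       echo "nova-compute multipathd" ;;
    deleting)        echo "cinder-volume" ;;
    error_extending) echo "cinder-volume" ;;
    # Unknown status: fall back to walking the whole Cinder path.
    *)               echo "cinder-api cinder-scheduler cinder-volume" ;;
  esac
}

# Example (status read from `openstack volume show <volume-id> -f value -c status`):
#   first_logs_for_status creating
```

Remember that these services usually run on different hosts: cinder-scheduler on a controller, cinder-volume on the volume host, nova-compute on the hypervisor.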

How to use it

Always name the backend. A stuck volume on RBD is a totally different conversation than on LVM-iSCSI. The backend dictates which logs matter.
Include openstack volume show <id> (full output, not summary) — os-vol-host-attr:host tells you exactly which cinder-volume instance owns it.
Include openstack volume attachment list --volume-id <id>. Orphaned attachment records are extremely common.
If attach-side: include journalctl -u nova-compute -n 200 and the libvirt log for the affected VM.
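
The os-vol-host-attr:host value normally has the form host@backend#pool, and the host part is what you compare against openstack volume service list. A small sketch of splitting it apart follows; the function name split_cinder_host and the sample hostnames are hypothetical, and the pool part may be absent on some backends.

```shell
# Split a Cinder host string (host@backend#pool) into its three parts.
# Defined with a ( ) body so the helper variables stay out of the caller's shell.
split_cinder_host() (
  full="$1"
  backend=""
  pool=""
  # Everything before the first '@' or '#' is the service host.
  host="${full%%[@#]*}"
  case "$full" in
    *@*) backend="${full#*@}"; backend="${backend%%\#*}" ;;
  esac
  case "$full" in
    *\#*) pool="${full##*\#}" ;;
  esac
  printf 'host=%s backend=%s pool=%s\n' "$host" "$backend" "$pool"
)

# Example with a made-up value:
#   split_cinder_host 'ctl01@ceph-rbd#volumes'
```

If the host part does not appear as an "up" cinder-volume service, that is usually the first thing to mention when filling in the prompt.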

Useful commands to gather first

# Cinder side
openstack volume show <volume-id>
openstack volume attachment list --volume-id <volume-id>
openstack volume service list
sudo journalctl -u cinder-scheduler -n 200 --no-pager
sudo journalctl -u cinder-volume -n 200 --no-pager  # on the volume host
sudo journalctl -u cinder-api -n 100 --no-pager

# Compute / hypervisor side (on the host where the VM lives)
sudo journalctl -u nova-compute -n 200 --no-pager
sudo virsh list --all
sudo virsh dumpxml <instance-uuid> | grep -A2 disk
sudo multipath -ll  # for iSCSI/FC backends
sudo iscsiadm -m session  # for iSCSI

# Backend-specific
sudo rbd ls -p <cinder-pool>  # RBD
sudo lvs <cinder-volumes-vg>  # LVM
sudo showmount -e <nfs-host>  # NFS
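
The commands above can be collected into one paste-ready file so the output goes into the prompt with clear section boundaries. This is a minimal sketch, not a finished tool: run_section is a hypothetical helper, and the example invocations in the comments reuse the placeholders from the list above.

```shell
# Run one command under a labelled header and keep going if it fails,
# so a missing tool on one host does not abort the whole collection.
run_section() (
  title="$1"
  shift
  printf '===== %s =====\n' "$title"
  printf '$ %s\n' "$*"
  # Capture stderr too; record the exit status when the command fails.
  "$@" 2>&1 || printf '[exit status %s]\n' "$?"
  printf '\n'
)

# Example usage (replace the placeholders first):
#   {
#     run_section "volume"      openstack volume show <volume-id>
#     run_section "services"    openstack volume service list
#     run_section "scheduler"   sudo journalctl -u cinder-scheduler -n 200 --no-pager
#   } > cinder-debug.txt
```

Review the resulting file for credentials or tenant data before pasting it anywhere.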

Common findings this catches

os-vol-host-attr:host points to a dead cinder-volume service → the host pointer must be moved to a live service (cinder-manage volume update_host --currenthost <old-host> --newhost <new-host>, only with the cinder-volume service stopped).
Attachment record exists but no <disk> in libvirt XML → orphan; safe to delete attachment record after confirming VM-side.
Scheduler accepted then volume errored → capacity filter mismatched (reserved_percentage too low, or backend reporting stale total_capacity_gb).
Volume in-use but VM deleted long ago → instance race during shutdown; attachment record never cleared.
error_deleting on RBD → snapshot dependency or watcher still holding the image.
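
Two of the findings above can be confirmed from command output before anything is changed. The sketch below assumes that Nova writes the Cinder volume ID into the <serial> element of the libvirt disk definition, that rbd status prints one watcher= line per watcher, and that RBD images use Cinder's default volume-<id> naming. Both function names are hypothetical; verify the assumptions against your own output.

```shell
# Read libvirt domain XML on stdin and report whether it references the volume.
volume_in_domain_xml() (
  volume_id="$1"
  if grep -q -F "<serial>${volume_id}</serial>"; then
    echo attached
  else
    echo absent
  fi
)

# Read `rbd status` output on stdin and print the number of watchers.
count_rbd_watchers() {
  # grep -c exits non-zero on zero matches; that is not an error here.
  grep -c 'watcher=' || true
}

# Example usage (read-only commands):
#   sudo virsh dumpxml <instance-uuid> | volume_in_domain_xml <volume-id>
#   sudo rbd status <cinder-pool>/volume-<volume-id> | count_rbd_watchers
```

An "absent" result alongside an existing attachment record points at an orphan, and a non-zero watcher count on an image in error_deleting suggests a client still has it open.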

When to escalate to your storage team

If the AI suggests:

Editing Cinder DB tables directly
rbd rm / lvremove while Cinder thinks the volume exists
cinder-manage volume update_host while cinder-volume is running

…stop and pull in storage on-call. These are the operations that cause the next incident, not the one you’re trying to solve.
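
One way to make that stop rule harder to miss is to scan a suggested runbook for the operations listed above before running anything. This is a rough heuristic sketch: flag_dangerous is a hypothetical name, the patterns are assumptions based on the list above (including the volumes and volume_attachment table names), and a clean result does not mean a suggestion is safe.

```shell
# Read suggested commands on stdin and print the numbered lines that match
# one of the escalation triggers. Exit status is 0 when anything matched.
flag_dangerous() {
  grep -n -i -E \
    -e 'cinder-manage[[:space:]]+volume[[:space:]]+update_host' \
    -e 'rbd[[:space:]](.*[[:space:]])?(rm|remove)([[:space:]]|$)' \
    -e 'lvremove' \
    -e '(update|delete[[:space:]]+from)[[:space:]]+(cinder\.)?(volumes|volume_attachment)'
}

# Example usage with the AI's answer saved to a file:
#   flag_dangerous < suggested-steps.txt && echo "Stop: involve storage on-call"
```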
