Running Database-as-a-Service with OpenStack Trove
Trove gives tenants self-service databases — MySQL, PostgreSQL, more — with backups and replication. Here's how I run it in production without the guest-agent pain.
- #openstack
- #trove
- #dbaas
- #mysql
- #postgresql
- #devops
Every private cloud reaches the moment when application teams stop wanting raw VMs and start wanting services. “Give me a Postgres” instead of “give me an Ubuntu box I’ll install Postgres on.” Trove is OpenStack’s answer to that: database-as-a-service, where tenants provision MySQL, PostgreSQL, and other engines with backups, replication, and resizing built in.
I’ll be honest up front — Trove has a reputation for being finicky, and it’s earned. But most of the pain comes from two things: the guest images and the guest agent. Get those right and Trove is genuinely good. Here’s how I run it.
How Trove actually works
Trove doesn’t run databases on its control plane. It boots a regular Nova instance from a purpose-built guest image that contains the database engine plus the trove-guestagent. The guest agent talks back to the Trove control plane over the message queue and does the actual work: configuring the DB, taking backups, setting up replication.
So the chain is: trove-api -> trove-taskmanager -> Nova boots a guest VM -> trove-guestagent inside the VM does the DB work -> reports back over RabbitMQ. Almost every Trove problem is somewhere in that chain, and almost always at the guest image or the guestagent’s queue connectivity.
The guest image is the hard part
You need a Trove guest image per datastore version. These are built with trovestack or diskimage-builder and bake in the engine plus the agent. The two things people get wrong:
- The agent can’t reach the control plane. The guest VM must be able to reach RabbitMQ. If your guest network is isolated, the agent silently never registers and the instance sticks in BUILD forever.
- Version mismatch. A guest image built for an older Trove release running against a newer control plane fails in confusing ways. Rebuild guest images when you upgrade Trove.
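The first failure mode is easy to check from inside a guest VM (or any VM on the same guest network). The following is a minimal sketch, not part of Trove itself: `check_tcp` is a hypothetical helper name, the hostname in the usage comment is a placeholder, and 5672 is only RabbitMQ's default AMQP port, so substitute whatever your transport URL actually uses.

```shell
#!/usr/bin/env bash
# Sketch: test whether a TCP endpoint (e.g. RabbitMQ) is reachable from this host.
# check_tcp is a hypothetical helper, not a Trove or OpenStack command.

# check_tcp HOST PORT [TIMEOUT_SECONDS]
# Returns 0 if a TCP connection can be opened, non-zero otherwise.
check_tcp() {
  local host="$1" port="$2" timeout_s="${3:-3}"
  # /dev/tcp is a bash feature; `timeout` comes from GNU coreutils.
  # Host and port are passed as arguments so they are never evaluated as code.
  timeout "$timeout_s" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Example (placeholder endpoint, run from inside the guest network):
#   check_tcp rabbit.example.internal 5672 && echo reachable || echo unreachable
```

A successful connect only proves routing and firewalling; wrong credentials in the guest config will still keep the agent from registering.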
Register the datastore once the image exists:
openstack datastore version create 8.0 mysql mysql "" \
--image-tags trove-mysql \
--active
# Confirm it registered
openstack datastore version list mysql
Provisioning a database
With a working datastore, the tenant experience is genuinely nice:
# Generate the password up front so you can hand it to the tenant afterwards
APP_PASSWORD="$(openssl rand -base64 16)"
openstack database instance create prod-orders \
--flavor db.medium \
--size 50 \
--datastore mysql \
--datastore-version 8.0 \
--databases orders \
--users "appuser:${APP_PASSWORD}" \
--nic net-id=<tenant-net-id>
# Watch it come up
openstack database instance show prod-orders
When it reaches ACTIVE, the tenant gets an IP and a connection string. The --size value is the size, in GB, of the Cinder volume backing the data directory, so you get Cinder’s durability and can resize the volume later.
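Rather than re-running the show command by hand, the wait for ACTIVE can be scripted. This is a rough sketch under a few assumptions: `wait_for_active` is a hypothetical helper name, the retry count and sleep interval are arbitrary defaults, and it treats ERROR and FAILED as the terminal failure states.

```shell
#!/usr/bin/env bash
# Sketch: poll a Trove instance until it is ACTIVE, fails, or times out.
# wait_for_active is a hypothetical helper, not an OpenStack command.

# wait_for_active INSTANCE [MAX_TRIES] [SLEEP_SECONDS]
# Returns 0 on ACTIVE, 1 on ERROR/FAILED, 2 on timeout.
wait_for_active() {
  local instance="$1" max_tries="${2:-60}" sleep_s="${3:-10}"
  local status i=1
  while [ "$i" -le "$max_tries" ]; do
    # -f value -c status prints only the status column
    status="$(openstack database instance show "$instance" -f value -c status)"
    case "$status" in
      ACTIVE) echo "$instance is ACTIVE"; return 0 ;;
      ERROR|FAILED) echo "$instance entered $status" >&2; return 1 ;;
      *) echo "[$i/$max_tries] $instance status: ${status:-unknown}" ;;
    esac
    sleep "$sleep_s"
    i=$((i + 1))
  done
  echo "timed out waiting for $instance" >&2
  return 2
}

# Example:
#   wait_for_active prod-orders 60 10
```

A timeout here is itself a useful signal: an instance that never leaves BUILD usually points back at the guest agent's connectivity.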
Backups and replication are the selling point
This is what justifies running Trove over handing out VMs. Backups go to object storage (Swift/S3) and are full or incremental:
# Take a backup
openstack database backup create prod-orders nightly-2026-06-14
# Restore into a brand new instance from that backup
openstack database instance create orders-restore \
--flavor db.medium --size 50 \
--datastore mysql --datastore-version 8.0 \
--backup nightly-2026-06-14
Replication gives tenants a read replica with one command:
openstack database instance create orders-replica \
--flavor db.medium --size 50 \
--datastore mysql --datastore-version 8.0 \
--replica-of prod-orders
The replica streams from the primary; promote it if the primary fails. Test the promote path before you need it — I’ve seen replicas that streamed fine but couldn’t promote because of a binlog config baked wrong into the guest image.
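A promote drill can be as small as the sketch below. It assumes your python-troveclient version provides the `openstack database instance promote` subcommand; `promote_drill` is a hypothetical wrapper name. Run it against a throwaway primary/replica pair, not production.

```shell
#!/usr/bin/env bash
# Sketch: request promotion of a replica and report what happened.
# promote_drill is a hypothetical helper; run it against a test replica set.

# promote_drill REPLICA
# Returns 0 if the promote request was accepted, 1 if it was rejected.
promote_drill() {
  local replica="$1" status
  status="$(openstack database instance show "$replica" -f value -c status)"
  echo "status before promote: ${status:-unknown}"
  if ! openstack database instance promote "$replica"; then
    echo "promote request for $replica was rejected" >&2
    return 1
  fi
  echo "promote requested for $replica; poll its status until it returns to ACTIVE"
}

# Example:
#   promote_drill orders-replica
```

An accepted request is not a finished promotion. Once the instance is ACTIVE again, connect as the application user and write a row to confirm the new primary really accepts writes.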
The failures I keep seeing
After years of running Trove, the production incidents cluster tightly:
- Stuck in BUILD. Almost always the guestagent can’t reach RabbitMQ. Check that the guest VM’s network can route to the message queue and that the credentials in the guest config are right.
- Backups failing silently. Usually the object-storage credentials are wrong or the container is missing. Backups report success at the API but the object never lands. Verify that the object actually exists in Swift/S3; don’t trust the status alone.
- Resize hangs. A volume resize that needs a guest filesystem grow can stall if the guestagent is unhealthy. Always confirm the agent is responsive before resizing.
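The backup check in particular is worth scripting. The sketch below rests on assumptions you should verify against your own cloud: that the backup record exposes its storage URL in a `locationRef` field, and that the URL has the usual Swift layout `<scheme>://<host>/v1/<account>/<container>/<object>`. Both function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: confirm that a Trove backup's object really exists in Swift.
# Assumes a locationRef field and a /v1/<account>/<container>/<object> URL layout.

# swift_container_and_object URL
# Prints "<container> <object>" for a Swift URL of the assumed form.
swift_container_and_object() {
  local path="${1#*://}"   # drop the scheme
  path="${path#*/v1/}"     # drop the host and /v1/
  path="${path#*/}"        # drop the account segment
  printf '%s %s\n' "${path%%/*}" "${path#*/}"
}

# verify_backup_object BACKUP
# Returns non-zero if the backup has no location or the object is missing.
verify_backup_object() {
  local backup="$1" location pair
  location="$(openstack database backup show "$backup" -f value -c locationRef)" || return 1
  if [ -z "$location" ]; then
    echo "backup $backup has no locationRef" >&2
    return 1
  fi
  pair="$(swift_container_and_object "$location")"
  # `openstack object show` exits non-zero when the object does not exist
  openstack object show "${pair%% *}" "${pair#* }"
}

# Example:
#   verify_backup_object nightly-2026-06-14 || echo "backup object missing" >&2
```

Running this right after each scheduled backup turns a silent failure into a loud one. It does not replace restore testing, since an object can exist and still be unrestorable.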
I keep an AI prompt that takes a stuck-in-BUILD instance’s taskmanager log plus the guest network config and points at the most likely break in the chain — it’s reliably faster than me reading taskmanager logs at 2am. A few of those are in our prompt library.
Should you run Trove?
My rule: run Trove when you have enough internal database demand that self-service saves real toil, and when you can commit to maintaining guest images across upgrades. If you have three databases total, just give out VMs. If you have fifty teams that all want a Postgres next week, Trove pays for itself — as long as someone owns the guest-image pipeline.
Where to go next
Trove is a great service hiding behind a fussy guest-image and agent story. Nail the image build, guarantee the agent can reach the message queue, and test backup-restore and replica-promote before you rely on them. Then enjoy tenants provisioning their own databases instead of filing tickets. For the Nova, Cinder, and RabbitMQ services Trove sits on top of, see the OpenStack category.
Database automation is only as reliable as your backup-restore testing. Validate restores and replica promotion against your own guest images before depending on them.