Designing Heat Nested Stacks and ResourceGroups in OpenStack
How to structure Heat templates with nested stacks and ResourceGroups so updates and scale-downs don't replace the wrong resources, with AI predicting the blast radius.
- #openstack
- #ai
- #heat
- #orchestration
- #iac
The Heat template that taught me to respect update semantics was a tidy little ResourceGroup of database replicas. It created beautifully. Then someone scaled it down by one, and Heat removed the wrong replica — not the newest, the one whose index happened to renumber. We lost a node we cared about because the template’s naming wasn’t stable across updates. Heat’s real difficulty isn’t authoring; it’s predicting what a stack update will create, replace, and destroy. Here’s how I structure nested stacks and ResourceGroups to make updates boring, and how AI helps me predict the blast radius before I apply.
Pick the Right Repetition Primitive
When you have N of something, Heat gives you a few ways to express it, and they have different update and scaling behavior:
- OS::Heat::ResourceGroup: a fixed count of identical nested resources, indexed.
- A nested stack via type: my_nested.yaml: reusable, parameterized sub-templates.
- OS::Heat::AutoScalingGroup: count driven by scaling policies/alarms.
Reaching for the wrong one makes everything downstream fragile. ResourceGroup is great for “I want exactly 3 of these,” but it is not an autoscaler. The openstack category has the broader Heat playbooks.
Here’s a ResourceGroup with the index made explicit, which is the detail that prevents the disaster I opened with:
resources:
  db_nodes:
    type: OS::Heat::ResourceGroup
    properties:
      count: 3
      resource_def:
        type: db_node.yaml
        properties:
          index: "%index%"
          name: { list_join: ['-', ['db', '%index%']] }
Using %index% to build a stable, index-derived name means member 0 is always db-0. Without that, a scale-down can renumber members and remove one you didn’t intend.
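Stable names make members identifiable, but they do not choose which member a scale-down removes. To my understanding, a ResourceGroup drops the highest-indexed members when count shrinks, unless removal_policies names a different one. A minimal sketch, assuming the group goes from 3 to 2 and member 1 is the one to retire (the choice of member is purely illustrative):

```yaml
resources:
  db_nodes:
    type: OS::Heat::ResourceGroup
    properties:
      count: 2                      # scaled down from 3
      removal_policies:
        - resource_list: ['1']      # illustrative: remove member 1, keep 0 and 2
      resource_def:
        type: db_node.yaml
        properties:
          index: "%index%"
          name: { list_join: ['-', ['db', '%index%']] }
```

As far as I know, a removed index is not reused, so a later scale-up would create db-3 rather than a new db-1. Confirm which member Heat plans to delete in the dry-run before applying.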
Update-in-Place vs Replace: The Thing That Bites
Every property in a Heat resource either updates in place or forces a replacement (destroy + recreate) when changed. Changing a server’s metadata updates in place; changing its flavor or image can mean a resize, a rebuild, or a brand-new server, depending on the resource’s flavor_update_policy and image_update_policy. During a stack update, a forced replacement on a live resource means downtime or data loss. The entire skill of safe Heat design is structuring templates so the changes you expect to make are update-in-place, and the replace-forcing ones are isolated.
The safety net is that Heat can tell you what an update will do before it does it:
openstack stack update --dry-run --show-nested \
-t my_stack.yaml my-stack
Read that output like a hawk. It groups resources by what the update will do to them: added, updated in place, replaced, deleted, or unchanged. A replacement of anything stateful is your signal to stop and rethink.
Prompt: “Here is my parent template, my db_node.yaml nested template, and the stack update --dry-run output. For each resource, tell me whether the update is in-place or a replacement, and flag any replacement that would destroy a stateful resource. Then explain which property change is forcing each replacement. Do not tell me to run the real update; I’ll do that after reviewing.”
Output: It parsed the dry-run, flagged that a flavor change on db_nodes.1 forced a REPLACE (destroying that replica’s root disk), traced it to a parameter default change, and suggested pinning the flavor or moving the data to a separate volume so replacement wouldn’t lose state.
That replace-vs-update classification is genuinely valuable — the model reads the dry-run and the templates together and explains why each replacement happens, which is tedious to trace by hand. But I never skip running the real dry-run myself and reading it; the AI’s interpretation is a guide, not a substitute for Heat’s own preview.
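The "separate volume" suggestion can be sketched as a fragment of db_node.yaml. This is one possible layout under stated assumptions, not the template from the incident: the parameters name, flavor, image, network, and data_size_gb are hypothetical and assumed to be declared in the nested template's parameters section.

```yaml
resources:
  server:
    type: OS::Nova::Server
    properties:
      name: { get_param: name }
      flavor: { get_param: flavor }
      image: { get_param: image }
      # Stated explicitly so a flavor or image change is not a surprise.
      # REBUILD still wipes the root disk, so state must not live there.
      flavor_update_policy: RESIZE
      image_update_policy: REBUILD
      networks:
        - network: { get_param: network }

  data_volume:
    type: OS::Cinder::Volume
    deletion_policy: Retain          # keep the volume if this resource is deleted
    properties:
      size: { get_param: data_size_gb }

  data_attachment:
    type: OS::Cinder::VolumeAttachment
    properties:
      instance_uuid: { get_resource: server }
      volume_id: { get_resource: data_volume }
```

With this split, replacing the server should show up in the dry-run as a replacement of server and data_attachment only, while data_volume stays untouched. Verify that on a scratch stack rather than assuming it.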
Crossing the Nested Boundary Cleanly
Nested stacks communicate through parameters going down and outputs coming up. The friction is get_attr paths into a ResourceGroup, which return aggregated attributes:
outputs:
  db_ips:
    value: { get_attr: [db_nodes, db_ip] }
That returns a list of each member’s db_ip. Mis-referencing these is a common source of “stack created but the wiring is wrong.” Keep the nested template’s outputs minimal and well-named, and validate that parent get_attr paths resolve to what you expect.
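To make the boundary concrete, here is a sketch of both sides of that reference. The port resource and the network parameter are assumptions for illustration; the original db_node.yaml is not shown in this article. The resource.0.db_ip form is how I understand single-member lookups to work on a ResourceGroup, so treat it as something to validate, not a guarantee.

```yaml
# db_node.yaml (nested template, fragment): expose one well-named output
resources:
  port:
    type: OS::Neutron::Port
    properties:
      network: { get_param: network }   # hypothetical parameter

outputs:
  db_ip:
    description: Fixed IP of this database node
    value: { get_attr: [port, fixed_ips, 0, ip_address] }
---
# parent template (fragment)
outputs:
  db_ips:
    description: db_ip of every group member, as a list
    value: { get_attr: [db_nodes, db_ip] }
  first_db_ip:
    description: db_ip of member 0 only
    value: { get_attr: [db_nodes, resource.0.db_ip] }
```

If the nested output were renamed, the parent path would no longer match it, which is exactly the kind of mismatch worth checking with openstack stack output show on a scratch stack.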
Pro Tip: Have the AI trace every get_attr and get_resource across the nested boundary and confirm each resolves to a real output. Cross-stack reference bugs don’t fail at create time loudly — they hand a downstream resource an empty or wrong value, and you find out in production.
Conditions Without Surprises
Heat conditions let one template serve multiple environments (prod vs dev, HA vs single). They’re powerful and easy to misfire — a condition wired to the wrong parameter silently creates or skips a resource. Keep conditions few and test each branch on a throwaway stack:
conditions:
  is_ha: { equals: [{ get_param: deployment_mode }, 'ha'] }
When I refactor a template to add conditions, I’ll have Claude enumerate which resources each condition value would create, then I deploy each branch to a scratch stack to confirm. The enumeration saves time; the scratch-stack test is the verification. I keep reusable template-review prompts in the prompt workspace.
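A minimal sketch of how that condition might be wired in, assuming a two-value deployment_mode and a hypothetical HA-only port (db_vip_port and the network parameter are illustrative names, not from the original template). Constraining the parameter means a typo fails validation instead of silently evaluating the condition to false.

```yaml
heat_template_version: 2016-10-14   # conditions need this version or later

parameters:
  deployment_mode:
    type: string
    default: dev
    constraints:
      - allowed_values: [dev, ha]   # a misspelled mode is rejected up front
  network:
    type: string                    # hypothetical parameter

conditions:
  is_ha: { equals: [{ get_param: deployment_mode }, 'ha'] }

resources:
  db_nodes:
    type: OS::Heat::ResourceGroup
    properties:
      count: { if: [is_ha, 3, 1] }  # three replicas in HA, one otherwise
      resource_def:
        type: db_node.yaml
        properties:
          index: "%index%"
          name: { list_join: ['-', ['db', '%index%']] }

  db_vip_port:
    type: OS::Neutron::Port
    condition: is_ha                # created only when is_ha is true
    properties:
      network: { get_param: network }
```

Two branches means two scratch-stack deployments: one with deployment_mode=dev and one with deployment_mode=ha, comparing the resource list of each against what you expected.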
Test on a Throwaway Stack First
The discipline that ties this together: never let a refactored nested template’s first run be against production. Deploy it to a disposable stack, run the dry-run against a copy of the production stack, and only then update the real one. Heat’s preview is honest — if you read it, you won’t be surprised.
Conclusion
Heat rewards authors who think in update semantics rather than create-time correctness. Stable indexing keeps scale-downs from removing the wrong member; isolating replace-forcing properties keeps stateful resources alive across updates; clean nested-boundary outputs keep the wiring right. AI accelerates the hardest part — predicting what an update will do — by parsing the dry-run and explaining each replacement. But the ground truth is Heat’s own --dry-run output and a throwaway-stack test, both of which stay in your hands. The model predicts the blast radius; you confirm it before you apply. More Heat prompts live in the prompts library.