Reviewing CloudFormation Templates for Drift With AI
CloudFormation drift creeps in when someone clicks in the console. Here's how I use AI to read drift reports, explain them, and propose safe reconciliation.
- #iac
- #ansible
- #ai
- #cloudformation
- #drift
Drift is the quiet killer of Infrastructure as Code. You write a clean CloudFormation stack, deploy it, and feel good. Three weeks later someone bumps a security group rule in the console to “just unblock a demo,” and now your template no longer describes reality. A later deploy that touches that resource can overwrite their change without warning or fail in a confusing way. CloudFormation’s drift detection tells you that something drifted; making sense of why and what to do is where I lean on AI.
As always, I treat AI as a fast junior engineer. It reads the drift report faster than I can and explains it clearly, but I review every reconciliation it proposes and never let it apply a change to a live stack on its own.
What CloudFormation drift detection actually gives you
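You kick off detection, wait for it to finish, and pull the report:
# Detection is asynchronous; capture the ID it returns
DRIFT_ID=$(aws cloudformation detect-stack-drift --stack-name prod-api --query StackDriftDetectionId --output text)
# Re-run until DetectionStatus is DETECTION_COMPLETE
aws cloudformation describe-stack-drift-detection-status --stack-drift-detection-id "$DRIFT_ID"
aws cloudformation describe-stack-resource-drifts \
  --stack-name prod-api \
  --stack-resource-drift-status-filters MODIFIED DELETED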
The output is a wall of JSON: each drifted resource, the property paths that differ, and the expected versus actual values. It’s accurate but tedious to parse, especially when twelve resources drifted and only two matter. That triage is the first place AI earns its keep.
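Before pasting anything, it can help to flatten the JSON into one line per differing property. Here is a minimal Python sketch, assuming the report was saved from `describe-stack-resource-drifts` and parsed into a dict with the usual `StackResourceDrifts` and `PropertyDifferences` keys. The function name `summarize_drift` and the sample data are hypothetical, made up for illustration.

```python
def summarize_drift(report: dict) -> list[str]:
    """Return one short line per differing property in a drift report."""
    lines = []
    for drift in report.get("StackResourceDrifts", []):
        resource = f'{drift["LogicalResourceId"]} ({drift["ResourceType"]})'
        status = drift["StackResourceDriftStatus"]
        differences = drift.get("PropertyDifferences", [])
        if not differences:
            # Deleted resources have no property-level differences to list.
            lines.append(f"{resource}: {status}")
            continue
        for diff in differences:
            lines.append(
                f'{resource}: {status} {diff["PropertyPath"]} '
                f'[{diff["DifferenceType"]}] '
                f'expected={diff.get("ExpectedValue")!r} '
                f'actual={diff.get("ActualValue")!r}'
            )
    return lines


# Hypothetical sample, shaped like the CLI output but heavily trimmed.
SAMPLE_REPORT = {
    "StackResourceDrifts": [
        {
            "LogicalResourceId": "ApiSecurityGroup",
            "ResourceType": "AWS::EC2::SecurityGroup",
            "StackResourceDriftStatus": "MODIFIED",
            "PropertyDifferences": [
                {
                    "PropertyPath": "/SecurityGroupIngress/1",
                    "ActualValue": '{"FromPort":5432,"ToPort":5432}',
                    "DifferenceType": "ADD",
                }
            ],
        },
        {
            "LogicalResourceId": "OldQueue",
            "ResourceType": "AWS::SQS::Queue",
            "StackResourceDriftStatus": "DELETED",
        },
    ]
}

for line in summarize_drift(SAMPLE_REPORT):
    print(line)
```

With a real report you would load the saved CLI output with `json.load` and pass the result in. The flattened list is usually easier to scrub and shorter to paste than the raw dump.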
Ask AI to triage the drift report
I paste the (non-sensitive) drift JSON and ask for a prioritized summary:
“Here’s a CloudFormation drift report. Group the drifted resources by likely cause: console edits, out-of-band automation, or AWS-side defaults. For each, tell me whether reverting to the template is safe or risky, and why. Don’t generate any change-set yet.”
This turns a 400-line JSON dump into a short list like “three of these are AWS auto-injected tags you can ignore; one is a security group rule someone added manually that the next deploy will silently remove — investigate before reconciling.” That distinction between safe-to-revert and investigate-first is exactly the judgment a drift report doesn’t give you on its own.
The dangerous case: silent reverts
The scenario that scares me is the manual change that mattered. Someone added an ingress rule to fix an outage and forgot to put it in the template. CloudFormation reports that rule as drift, and a later update that touches the resource can remove it without any warning. AI is good at flagging these when you describe the property diff:
# Template says:
Resources:
  ApiSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 443
          ToPort: 443
          CidrIp: 0.0.0.0/0
The live group has an extra rule on port 5432 that isn’t in the template. AI’s job isn’t to decide whether that rule should stay — it’s to make sure I notice it before I run a deploy that wipes it. The decision is mine.
Pro Tip: Before any reconciliation, ask AI to list every drifted property where the live value is more permissive than the template, or exists only on the live resource. Those are the changes a naive deploy can silently remove, and they’re the ones that cause outages.
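The “more present” half of that tip can be pre-filtered mechanically. The sketch below assumes a parsed `describe-stack-resource-drifts` report and relies on the `DifferenceType` value `ADD`, which marks a value found on the live resource but not in the template. The helper name `live_only_additions` and the sample data are hypothetical. It does not catch “more permissive” changes such as a widened CIDR, which show up as `NOT_EQUAL` and still need a human or AI read.

```python
def live_only_additions(report: dict) -> list[tuple[str, str]]:
    """Return (logical ID, property path) pairs that exist only on the live resource."""
    found = []
    for drift in report.get("StackResourceDrifts", []):
        for diff in drift.get("PropertyDifferences", []):
            if diff.get("DifferenceType") == "ADD":
                found.append((drift["LogicalResourceId"], diff["PropertyPath"]))
    return found


# Hypothetical sample: one live-only rule and one changed value.
SAMPLE_REPORT = {
    "StackResourceDrifts": [
        {
            "LogicalResourceId": "ApiSecurityGroup",
            "ResourceType": "AWS::EC2::SecurityGroup",
            "StackResourceDriftStatus": "MODIFIED",
            "PropertyDifferences": [
                {
                    "PropertyPath": "/SecurityGroupIngress/1",
                    "ActualValue": '{"FromPort":5432,"ToPort":5432}',
                    "DifferenceType": "ADD",
                },
                {
                    "PropertyPath": "/GroupDescription",
                    "ExpectedValue": "API access",
                    "ActualValue": "API access (temp)",
                    "DifferenceType": "NOT_EQUAL",
                },
            ],
        }
    ]
}

for logical_id, path in live_only_additions(SAMPLE_REPORT):
    print(f"{logical_id}: {path} exists only on the live resource")
```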
Reconcile by updating the template, not the console
The right fix for legitimate drift is almost always to bring the template up to match reality, then deploy — not to manually revert the live resource. AI helps draft the template update, which I then review:
SecurityGroupIngress:
  - IpProtocol: tcp
    FromPort: 443
    ToPort: 443
    CidrIp: 0.0.0.0/0
  - IpProtocol: tcp
    FromPort: 5432
    ToPort: 5432
    CidrIp: 10.0.0.0/16 # codified the manual DB rule, scoped to VPC
Notice I tightened the manual rule’s CIDR while codifying it. That is a judgment call AI can suggest but shouldn’t make. I always preview with a change-set before executing:
aws cloudformation create-change-set \
  --stack-name prod-api \
  --change-set-name reconcile-drift \
  --template-body file://template.yaml \
  --change-set-type UPDATE
aws cloudformation describe-change-set \
  --stack-name prod-api --change-set-name reconcile-drift
The change-set is CloudFormation’s dry-run. It shows me exactly what will be added, modified, or replaced before anything happens. I read every “Replacement: True” line carefully, because replacing a stateful resource such as a database can mean data loss.
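When the change-set is long, a small filter helps make sure no replacement slips past. This is a sketch in Python, assuming the JSON from `describe-change-set` has been saved and parsed into a dict. `flag_replacements` and the sample change-set are hypothetical. It also flags `Conditional`, the value CloudFormation reports when a replacement may or may not happen.

```python
def flag_replacements(change_set: dict) -> list[str]:
    """List resources the change-set would, or might, replace."""
    flagged = []
    for change in change_set.get("Changes", []):
        resource = change.get("ResourceChange", {})
        replacement = resource.get("Replacement")
        if replacement in ("True", "Conditional"):
            flagged.append(
                f'{resource["LogicalResourceId"]} ({resource["ResourceType"]}): '
                f"Replacement={replacement}"
            )
    return flagged


# Hypothetical sample, trimmed to the fields this sketch reads.
SAMPLE_CHANGE_SET = {
    "Changes": [
        {
            "Type": "Resource",
            "ResourceChange": {
                "Action": "Modify",
                "LogicalResourceId": "ApiSecurityGroup",
                "ResourceType": "AWS::EC2::SecurityGroup",
                "Replacement": "False",
            },
        },
        {
            "Type": "Resource",
            "ResourceChange": {
                "Action": "Modify",
                "LogicalResourceId": "Database",
                "ResourceType": "AWS::RDS::DBInstance",
                "Replacement": "True",
            },
        },
    ]
}

for line in flag_replacements(SAMPLE_CHANGE_SET):
    print(line)
```

An empty result is not a green light on its own. It only means nothing in the change-set is marked for replacement, so the rest of the diff still deserves a read.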
The same idea works beyond CloudFormation
This workflow isn’t AWS-specific. For Ansible-managed config, the equivalent of a drift report is a check-mode diff:
ansible-playbook site.yml --check --diff --limit prod
Any task reported as changed, along with its diff, is drift between your playbook and the live host. I feed that diff to AI with the same triage prompt (group by cause, flag the risky reverts) and it works just as well. The lesson is general: AI reads diffs well and explains them clearly, regardless of which IaC tool produced them.
Never hand it credentials or state secrets
Drift reports can contain sensitive values — connection strings, ARNs that hint at account structure, occasionally a password in a parameter. I scrub those before pasting, and I never give an AI tool my AWS credentials or let it run aws commands against a real account. It reads sanitized reports and proposes text; I run every command myself.
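A first scrubbing pass can be scripted, though it does not replace reading the text before pasting it. The sketch below is rough and rests on assumptions: it masks 12-digit account IDs inside ARNs and the values of JSON keys whose names contain `password`, `secret`, or `token`. The name `scrub` and both patterns are hypothetical. It will miss secrets under other key names and values inside escaped, nested JSON strings.

```python
import re

# Account ID field of an ARN: arn:partition:service:region:ACCOUNT:...
ACCOUNT_ID = re.compile(r"(arn:aws[a-z-]*:[^:\s]*:[^:\s]*:)(\d{12})(:)")

# "SomePasswordKey": "value" pairs in plain (unescaped) JSON text.
SECRET_PAIR = re.compile(
    r'("[^"]*(?:password|secret|token)[^"]*"\s*:\s*)"[^"]*"',
    re.IGNORECASE,
)


def scrub(text: str) -> str:
    """Mask account IDs in ARNs and values of secret-looking JSON keys."""
    text = ACCOUNT_ID.sub(r"\g<1>REDACTED\g<3>", text)
    text = SECRET_PAIR.sub(r'\1"REDACTED"', text)
    return text


# Hypothetical input for illustration.
sample = (
    '{"Arn": "arn:aws:rds:us-east-1:123456789012:db:prod", '
    '"MasterUserPassword": "hunter2", "Port": "5432"}'
)
print(scrub(sample))
```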
Make drift detection routine
The best defense against drift is catching it early and often. I run detection on a schedule and route surprises into review. The config drift detection guide goes deeper on the operational side, and the monitoring alerts dashboard is where I wire drift findings into something a human actually sees. For the prompts I use to triage these reports, the prompt library keeps them versioned.
CloudFormation drift is inevitable the moment more than one human can touch your account. AI won’t stop the drift, but it turns the painful work of understanding and reconciling it into something fast — provided you keep the change-set preview and the human decision firmly in the loop. The rest of this series lives under the IaC category, and Claude handles these JSON-and-YAML reports well.
Detect early, triage with AI, reconcile through change-sets, and never let drift surprise you in a deploy.