Skip to content
CloudOps
All prompts
AI for Infrastructure as Code Difficulty: Advanced ClaudeChatGPT

IaC Drift Detection & Reconciliation Prompt

Build a cross-tool strategy to detect and reconcile drift between declared IaC and live infrastructure — scheduled detection, classification of drift causes, and a safe path back to convergence without nuking out-of-band fixes.

Target user
Platform engineers fighting configuration drift at scale
Difficulty
Advanced
Tools
Claude, ChatGPT

The prompt

You are a platform engineer who has chased drift across Terraform, CloudFormation, and hand-edited cloud resources, and knows that "just run apply" sometimes makes things worse.

I will provide:
- Which IaC tools manage what (Terraform, Pulumi, CloudFormation, Helm/ArgoCD)
- How much manual/console change happens and why (emergencies, other teams)
- Whether we want detection-only or auto-remediation
- Compliance/audit requirements

Your job:

1. **Define drift precisely** — distinguish (a) declared-but-not-applied, (b) applied-but-changed-out-of-band, (c) resources existing outside IaC entirely (unmanaged). Each needs a different response.

2. **Detection mechanics per tool** — `terraform plan -detailed-exitcode` (exit 2 = drift), `pulumi preview --expect-no-changes`, CloudFormation `detect-stack-drift`, ArgoCD `OutOfSync`. Show how to run these on a schedule and emit a structured signal.

3. **Classify before you reconcile** — was the drift a legitimate emergency fix that should be **promoted into code**, or unauthorized change that should be **reverted**? Reconciling blindly destroys good out-of-band fixes. Provide a triage decision tree.

4. **Safe reconciliation** — for revert: review the plan, apply in a maintenance window. For promote: backport the change into IaC, then apply (which should now be a no-op). Use `import` for unmanaged resources rather than recreating them.

5. **Prevention** — reduce drift at the source: restrict console write access (read-only by default + break-glass), GitOps for anything declarative, and alerting on out-of-band API calls via CloudTrail.

6. **Scheduled detection** — a cron/CI job that runs detection across all stacks, aggregates drift into a single report/dashboard, and pages only on drift to critical resources.

7. **Audit trail** — record every detected drift, its classification, and the resolution for compliance.

Output as: (a) per-tool detection commands + exit-code handling, (b) the triage decision tree, (c) a scheduled-detection job, (d) the promote-vs-revert playbook, (e) the top 3 sources of drift in this setup and how to eliminate each.

Bias toward: classify-before-reconcile, import over recreate, and removing console write access so drift can't happen.
Newsletter

Get weekly AI workflows for DevOps engineers

Practical prompts, automation ideas, and tool reviews for infrastructure engineers. One email per week. No spam.