
Terraform CI Concurrency and Run Queueing Prompt

Prevent concurrent Terraform runs from colliding on the same state by designing locking, queueing, and serialization across CI pipelines.

Target user: Platform teams running Terraform from shared CI
Difficulty: Advanced
Tools: Claude, ChatGPT

The prompt

You are a platform engineer who has untangled flaky Terraform pipelines where parallel runs corrupted plans and fought over state locks.

I will provide:
- The CI platform and how Terraform jobs are triggered (per-PR, per-merge, scheduled)
- The backend and its native locking story (S3+DynamoDB, GCS, Terraform Cloud, etc.)
- How environments and state files map to pipelines
- Symptoms (state lock timeouts, stale plans applied, two applies racing)

Your job:

1. **Diagnose the race** — identify where concurrency hurts: two PRs planning against the same state, an apply running while a newer plan is queued, or scheduled drift jobs colliding with merges.

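2. **Lean on backend locks first** — confirm the backend's native state locking is enabled and correctly configured, and explain what it does and does not protect: the lock is held only while a single Terraform command runs, so it serializes access to state but does not cover the gap between a plan job and the apply job that follows it.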

3. **Pipeline-level serialization** — design a concurrency group keyed by environment/state so only one run per state proceeds at a time, with newer runs queued or cancelling superseded ones. Give the concrete config for the CI platform.

4. **Plan/apply staleness** — enforce that an apply uses a plan generated against the current state: re-plan before apply, or use a saved plan with a freshness/lock check that fails if state moved underneath it.

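5. **PR plans vs main applies** — separate read-only PR plans (which can run in parallel, but note that `terraform plan` takes the state lock by default, so decide between `-lock-timeout` and `-lock=false` for them) from mutating applies (which must be serialized), and prevent a merge from applying a plan built against a now-outdated main.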

6. **Lock hygiene** — give safe guidance on stuck locks: how to inspect lock metadata, when `terraform force-unlock` is acceptable (only once the run that took the lock is confirmed dead), and why force-unlocking while another run is still active can corrupt state.

7. **Scheduled jobs** — make drift-detection and apply jobs mutually exclusive per environment so a 2am drift scan never races a deploy.

8. **Observability** — add logging of who holds the lock and how long runs queue, so contention is visible rather than mysterious.

Output as: (a) a diagnosis of my race conditions, (b) the concurrency-group config for my CI platform, (c) the re-plan-before-apply guardrail, (d) a stuck-lock runbook. Prefer serializing applies over clever parallelism that risks state.
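As a rough illustration of the kind of guardrail item (c) asks for, here is a minimal Python sketch of a freshness check. It assumes the pipeline saves the output of `terraform state pull` at plan time and pulls it again just before apply; the function names `state_fingerprint` and `plan_is_fresh` are hypothetical. The check compares the `lineage` and `serial` fields of the two state snapshots. Terraform also rejects a stale saved plan on its own at apply time, so treat this as a cheap early check in CI, not as the only safeguard.

```python
import json


def state_fingerprint(state_json: str) -> tuple[str, int]:
    """Return (lineage, serial) from the JSON printed by `terraform state pull`.

    `lineage` identifies the state's history; `serial` increases each time
    the state is written.
    """
    state = json.loads(state_json)
    return state["lineage"], int(state["serial"])


def plan_is_fresh(state_at_plan: str, state_now: str) -> bool:
    """True only if the state is unchanged since the plan was generated.

    Hypothetical helper: both arguments are raw state JSON captured by the
    pipeline, one at plan time and one immediately before apply.
    """
    return state_fingerprint(state_at_plan) == state_fingerprint(state_now)


if __name__ == "__main__":
    # Made-up snapshots for illustration; real ones come from `terraform state pull`.
    at_plan = json.dumps({"version": 4, "lineage": "abc-123", "serial": 7})
    moved = json.dumps({"version": 4, "lineage": "abc-123", "serial": 8})

    print(plan_is_fresh(at_plan, at_plan))  # True: nothing changed, safe to apply
    print(plan_is_fresh(at_plan, moved))    # False: state moved, re-plan first
```

In a pipeline, the apply job would exit non-zero when `plan_is_fresh` returns `False`, forcing a re-plan before anything mutates.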
