Surviving Terraform Provider Version Upgrades

A major version bump of the AWS provider can quietly change resource defaults, rename arguments, and turn a no-op plan into a wall of unexpected diffs. Across a large estate with dozens of workspaces, an uncoordinated provider upgrade is how you spend a week chasing surprise destroys. After 25 years I treat provider upgrades like database migrations: deliberate, staged, and reversible.

Here’s my playbook.

Pin providers; don’t float them

The root cause of upgrade pain is floating versions. If your config says >= 5.0, every terraform init on a fresh machine can grab a newer provider with different behavior. Pin with a pessimistic constraint:

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.40"
    }
  }
}

~> 5.40 allows patch and minor bumps within 5.x but never jumps to 6.0 by accident. Major upgrades become a deliberate, reviewed change — exactly what you want.

Commit the lock file

The .terraform.lock.hcl file records the exact provider versions and checksums for the whole team. Commit it. Without it, CI and your laptop can resolve different versions and produce different plans. With it, everyone is locked to the same provider until you intentionally upgrade:

terraform init -upgrade   # bumps within constraints, rewrites lock file

That -upgrade is the only thing that should move your providers, and it should produce a reviewable diff to the lock file.

Read the upgrade guide before the code

Every major provider ships an upgrade guide listing breaking changes. Read it first. The provider maintainers tell you precisely which arguments were renamed, which defaults changed, and which resources need attention. Skipping this and discovering breakage via failed plans is doing it the hard and slow way.

Stage the rollout, never big-bang it

Across a large estate, upgrade incrementally:

One low-stakes workspace first — a dev environment. Bump the constraint, init -upgrade, and run a plan.
Read the plan carefully. A clean plan means defaults didn’t shift. Surprise diffs mean the new provider computes something differently — investigate before proceeding.
Fix forward. Update deprecated arguments, add ignore_changes for benign default shifts.
Promote the upgrade to staging, then production, one tier at a time.

Never bump every workspace in one PR. If something breaks, you want a small blast radius and a clear culprit.

Validate read-only before you apply

The whole upgrade can be validated without touching infrastructure. A clean plan after init -upgrade is your green light:

terraform init -upgrade
terraform plan

If the plan shows zero changes, the new provider is behaviorally compatible for that workspace. If it shows changes, each one is a decision: codify it, ignore it, or hold the upgrade.

Where AI accelerates upgrades

Reading a long upgrade guide and mapping it onto your actual code is tedious, and that’s where AI helps. I paste the provider’s upgrade guide alongside a module and ask: “Which of these breaking changes apply to this configuration, and what edits do I need?” It reliably points me at the renamed arguments and changed defaults that affect me, instead of the whole list.

I also have it explain surprise plan diffs after an upgrade: “Why would the new provider want to change this field?” Usually it’s a default that moved. I keep these as Terraform prompts and route the upgrade PR through our Code Review tool to confirm the deprecated-argument fixes are complete.

AI reads guides and proposes edits; I run the plans and applies.

Have a rollback path

Because you committed the lock file and pinned versions, rollback is easy: revert the lock file and constraint, init, and you’re back on the old provider. If state was already mutated, your versioned backend gives you the prior state. Never start an upgrade you can’t reverse.

The takeaway

Provider upgrades break estates when they’re floating and big-bang. Pin with ~>, commit the lock file, read the upgrade guide, and roll out one workspace at a time with read-only plans as your gate. Treat each surprise diff as a decision, keep a rollback path, and a major provider bump becomes a routine staged migration instead of a fire drill.