Terraform Error Guide: 'Failed to load state'

Exact Error Message

These are the variants users paste into Google when the state file itself will not load:

Error: Failed to load state

error loading state: 2 error(s) occurred:

* unexpected end of JSON input
* state snapshot was created by Terraform v1.9.5, which is newer than current
  v1.7.2; upgrade to Terraform v1.9.5 or greater to work with this state

Error: Error loading state: AccessDenied: Access Denied
	status code: 403, request id: ...

Error: Failed to load state: NoSuchKey: The specified key does not exist.

What the Error Means

Terraform state is a single JSON document describing every resource it manages. “Failed to load state” means Terraform contacted the backend (S3, GCS, Azure Blob, local disk, Terraform Cloud) but could not read, parse, or accept the state object there. The state file is the source of truth for the mapping between your HCL and real cloud resources, so Terraform refuses to plan or apply against state it cannot trust; running anyway could orphan or duplicate live infrastructure.

The error covers several distinct failures: the JSON is truncated or corrupt, the object was written by a newer Terraform version whose schema the current binary cannot read, the object is missing or the key is wrong, your credentials cannot read it, or a stale lock left a half-written snapshot. The remedy differs by sub-cause, so diagnosis comes before any fix.
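Because the remedy depends on the sub-cause, it can help to triage the error text before touching anything. The sketch below is one way to do that in shell. classify_state_error is a hypothetical helper, and its patterns are simply the sample messages quoted in this guide, so output from your backend may need extra cases.

```shell
# Hypothetical helper: map Terraform error text to the sub-cause it most likely indicates.
# The patterns come from the sample messages in this guide, not from an official list.
classify_state_error() {
  local msg="$1"
  case "$msg" in
    *"which is newer than current"*)
      echo "newer-version" ;;
    *"unexpected end of JSON input"*|*"invalid character"*)
      echo "corrupt" ;;
    *"NoSuchKey"*)
      echo "missing-object" ;;
    *"AccessDenied"*|*"status code: 403"*)
      echo "permissions" ;;
    *"Error acquiring the state lock"*)
      echo "stale-lock" ;;
    *)
      echo "unknown" ;;
  esac
}
```

A typical call would be classify_state_error "$(terraform plan -no-color 2>&1)". The first matching pattern wins, so a message that reports several problems at once is labeled by whichever case appears first.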

Common Causes

Truncated or corrupt state. An interrupted write (Ctrl-C, crashed CI job, full disk) left a partial file, so parsing fails with unexpected end of JSON input or invalid character.
Newer-version state. State was last written by a newer Terraform/OpenTofu; the older binary refuses it.
Object missing or wrong key. The S3/GCS object was deleted, lifecycle-expired, or the key/prefix points at the wrong path.
Backend permissions. The role/credentials can reach the bucket but lack s3:GetObject / storage.objects.get, yielding 403/AccessDenied that surfaces as a load failure.
Lineage/serial mismatch. A pushed state has a different lineage or a lower serial than expected, so Terraform rejects it as not-the-same-state.
Stale lock with a partial snapshot. A crashed apply left a DynamoDB/GCS lock and an incomplete state object.

How to Reproduce the Error

A truncated local state is the simplest reproduction:

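# run this in a scratch directory with a throwaway config, never against real state
terraform init
terraform apply -auto-approve      # the local terraform.tfstate only exists after an apply
# simulate an interrupted write by chopping the file
head -c 200 terraform.tfstate > terraform.tfstate.broken
mv terraform.tfstate.broken terraform.tfstate
terraform plan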

Error: Failed to load state
error loading state: unexpected end of JSON input

The newer-version variant reproduces by running an older binary against state a colleague applied with a newer one:

terraform version          # v1.7.2
terraform plan

state snapshot was created by Terraform v1.9.5, which is newer than current v1.7.2

Diagnostic Commands

Pull the live state and check whether it is even valid JSON:

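# Fetch the state through Terraform; keep stderr, because a failed pull is itself a diagnostic
terraform state pull > current.tfstate

# If the pull fails (it must be able to load the state), copy the raw object instead
aws s3 cp s3://acme-tf-state/prod/network/terraform.tfstate current.tfstate

# Is it non-empty and well-formed? (jq empty alone exits 0 on an empty file)
test -s current.tfstate && jq empty current.tfstate && echo "valid JSON" || echo "corrupt / truncated"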

# Inspect the identity fields
jq '{version, terraform_version, serial, lineage, resources: (.resources|length)}' current.tfstate

Check the backend object and your version directly:

# Does the S3 object actually exist, and what versions are available?
aws s3api list-object-versions --bucket acme-tf-state \
  --prefix prod/network/terraform.tfstate

terraform version
TF_LOG=DEBUG terraform plan 2>&1 | grep -iE 'state|403|NoSuchKey'

Step-by-Step Resolution

Step 1: Back up whatever is there now

Even a corrupt file may be partially recoverable, so never overwrite blindly:

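terraform state pull > broken-state.$(date +%Y%m%d-%H%M%S).json   # an empty file means the pull itself failed
aws s3 cp s3://acme-tf-state/prod/network/terraform.tfstate raw-state.$(date +%Y%m%d-%H%M%S).json   # raw copy, works even when the pull fails
cp terraform.tfstate local-broken.json 2>/dev/null || true
cp terraform.tfstate.backup local-backup.json 2>/dev/null || true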

Step 2a: If the JSON is truncated or corrupt — restore a good snapshot

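Terraform keeps terraform.tfstate.backup next to a local state file; remote backends keep earlier versions only if object versioning is enabled on the bucket. Restore the most recent snapshot that is still well-formed.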

# Local backend: the .backup is the previous good snapshot
jq empty terraform.tfstate.backup && cp terraform.tfstate.backup terraform.tfstate

# Remote (S3 versioned): list versions, fetch the last good one, push it back
aws s3api list-object-versions --bucket acme-tf-state \
  --prefix prod/network/terraform.tfstate --query 'Versions[].[VersionId,LastModified]' --output table

aws s3api get-object --bucket acme-tf-state \
  --key prod/network/terraform.tfstate --version-id <GOOD_VERSION_ID> good.tfstate

terraform state push good.tfstate
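When several versions exist, the newest one is not always intact, so each candidate is worth checking before it is pushed. A minimal sketch, assuming jq is installed and that the candidates were already downloaded newest-first with the get-object command above. first_good_snapshot and the file names in the usage line are hypothetical.

```shell
# first_good_snapshot FILE...: print the first candidate that is non-empty,
# parses as JSON, and carries the lineage and serial fields of a state file.
# Returns 1 if no candidate qualifies.
first_good_snapshot() {
  local f
  for f in "$@"; do
    # -s rejects empty files; jq -e fails on parse errors and on missing keys
    if [ -s "$f" ] && jq -e 'has("lineage") and has("serial")' "$f" >/dev/null 2>&1; then
      printf '%s\n' "$f"
      return 0
    fi
  done
  return 1
}
```

For example, first_good_snapshot v3.tfstate v2.tfstate v1.tfstate prints the newest file that passes, which is then the one to hand to terraform state push. The check only proves the file is structurally whole; it does not prove the snapshot reflects current infrastructure.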

Step 2b: If state was written by a newer Terraform version

State format is forward-incompatible: an older binary cannot read newer state. Align the version instead of downgrading the state.

# Pin and use the version that wrote the state
tfenv install 1.9.5 && tfenv use 1.9.5    # or asdf / tfswitch
terraform version
terraform plan

Never hand-edit terraform_version downward to force an old binary — it will silently mishandle newer schema fields.
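To catch the mismatch before Terraform does, the terraform_version recorded in the state can be compared with the local binary. A minimal sketch, assuming sort -V is available (GNU coreutils and recent macOS). version_lt and state_needs_newer_binary are hypothetical names, and pre-release suffixes such as -rc1 are not handled.

```shell
# version_lt A B: succeeds when dotted version A is strictly older than B.
# Relies on `sort -V` for the ordering; plain x.y.z versions only.
version_lt() {
  [ "$1" != "$2" ] && [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$1" ]
}

# state_needs_newer_binary STATE_VERSION LOCAL_VERSION:
# succeeds when the state was written by a newer Terraform than the local one.
state_needs_newer_binary() {
  version_lt "$2" "$1"
}
```

With a snapshot saved as current.tfstate, the two inputs could be read with jq -r .terraform_version current.tfstate and terraform version -json | jq -r .terraform_version, and a successful state_needs_newer_binary call is the signal to switch versions before running plan.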

Step 2c: If the object is missing or the key is wrong

# Confirm the path your backend expects vs what exists
grep -A4 'backend "s3"' *.tf
aws s3 ls s3://acme-tf-state/prod/network/

# If deleted but versioning is on, undelete by removing the delete marker
aws s3api list-object-versions --bucket acme-tf-state \
  --prefix prod/network/terraform.tfstate --query 'DeleteMarkers'
aws s3api delete-object --bucket acme-tf-state \
  --key prod/network/terraform.tfstate --version-id <DELETE_MARKER_ID>

Step 2d: If it is a permissions failure

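A 403/AccessDenied is a permissions problem, not corruption. Grant read (and write) on the state object and the lock table, plus s3:ListBucket on the bucket; without ListBucket, S3 answers a request for a missing key with 403 rather than 404, which hides a wrong-key problem behind an access error: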

aws sts get-caller-identity                 # confirm who you are
# IAM needs: s3:GetObject, s3:PutObject on the key;
# dynamodb:GetItem/PutItem/DeleteItem on the lock table
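The permissions listed above can be written out as a policy document. The sketch below generates one for the bucket and key used in this guide. The account ID, region, and lock table name (acme-tf-locks) are placeholders that do not come from the guide, and s3:ListBucket and dynamodb:DescribeTable are included on the assumption that your S3 backend setup needs them, as its documentation suggests.

```shell
# Sketch: write a least-privilege policy for the state backend.
# ASSUMPTIONS: account ID, region, and table name below are placeholders.
BUCKET="acme-tf-state"
KEY="prod/network/terraform.tfstate"
LOCK_TABLE_ARN="arn:aws:dynamodb:us-east-1:111122223333:table/acme-tf-locks"

cat > tf-backend-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::${BUCKET}"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::${BUCKET}/${KEY}"
    },
    {
      "Effect": "Allow",
      "Action": [
        "dynamodb:DescribeTable",
        "dynamodb:GetItem",
        "dynamodb:PutItem",
        "dynamodb:DeleteItem"
      ],
      "Resource": "${LOCK_TABLE_ARN}"
    }
  ]
}
EOF
```

The object permissions are scoped to the single state key rather than the whole bucket, so a role for one workspace cannot read another workspace's state.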

Step 2e: If a stale lock left a partial snapshot

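terraform plan        # note the Lock ID in the error
# confirm no apply is still running before releasing the lock
terraform force-unlock <LOCK_ID>
# re-verify the snapshot is whole; -e makes jq fail when the pull printed nothing
terraform state pull | jq -e 'has("lineage")' > /dev/null && echo "state loads"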

Step 3: Verify recovery

terraform plan

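A plan that reports “No changes. Your infrastructure matches the configuration.” confirms the restored state matches what is deployed. If the plan instead proposes creating resources that already exist, the snapshot you restored predates them, so try a newer version or import the missing resources. A lineage or serial mismatch at push time means you grabbed the wrong snapshot; return to Step 2a with a different version ID.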

Prevention and Best Practices

Enable versioning on the state bucket (S3/GCS) — it is the single highest-value safeguard and turns most “corrupt state” incidents into a one-line restore.
Pin Terraform versions across the team (required_version, tfenv, CI image) so no one writes state a teammate cannot read.
Use DynamoDB/native state locking so concurrent applies cannot interleave and truncate the object.
Never Ctrl-C mid-apply on a local backend; let it finish or you risk a half-written file.
Keep applies in CI with retries and timeouts so a flaky network does not leave partial writes.
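Version pinning is easy to enforce mechanically. As a rough sketch, a CI step could fail the build when a root module has no required_version constraint. check_version_pin is a hypothetical helper, and it only looks at .tf files directly inside the given directory.

```shell
# check_version_pin DIR: succeeds if a .tf file directly in DIR assigns required_version.
# Commented-out lines do not count because the match is anchored to the line start.
check_version_pin() {
  grep -Eqs '^[[:space:]]*required_version[[:space:]]*=' "$1"/*.tf
}
```

In a pipeline this might run as check_version_pin . || { echo "pin required_version first" >&2; exit 1; } before terraform init.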

Related Errors

Backend configuration changed: the backend itself was re-pointed and needs terraform init -reconfigure or -migrate-state.
Authentication failed: the credentials reaching the backend or provider are rejected outright.
“Error acquiring the state lock”: a still-held lock rather than a corrupt snapshot; resolve with force-unlock once you have confirmed no run still holds it.
“state snapshot was created by Terraform vX, which is newer”: the version-mismatch sub-case covered in Step 2b.

Frequently Asked Questions

Can I just fix the JSON by hand? Only as a last resort, and only after backing up. If the file is merely truncated and you have no versioned backup, you can sometimes repair the closing braces, but the lineage and serial fields (and, on S3 with DynamoDB locking, the MD5 digest stored alongside the lock) make hand-editing risky. Prefer restoring a versioned snapshot with terraform state push.

Why does newer-version state refuse to load on my machine? Terraform state format is forward-incompatible by design: a v1.7 binary does not understand schema fields introduced in v1.9, so it stops rather than corrupt your state. Install and use the newer version (via tfenv/tfswitch) instead of downgrading the file.

What is a lineage/serial mismatch? Every state has a random lineage ID and an incrementing serial. If you push a snapshot whose lineage differs, or whose serial is lower than what the backend holds, Terraform rejects it to prevent overwriting newer state with older. Pull the current state, compare, and push the correct lineage.
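The rule described above can be checked before pushing. This is a sketch of the comparison only, not Terraform's own implementation. push_precheck is a hypothetical helper that takes the four values as arguments, which could be read from the two files with jq -r .lineage and jq -r .serial.

```shell
# push_precheck CUR_LINEAGE CUR_SERIAL NEW_LINEAGE NEW_SERIAL
# Prints "ok" when the candidate shares the lineage and its serial is not lower
# than the current one; otherwise prints the reason and returns 1.
push_precheck() {
  local cur_lineage="$1" cur_serial="$2" new_lineage="$3" new_serial="$4"
  if [ "$cur_lineage" != "$new_lineage" ]; then
    echo "lineage-mismatch"
    return 1
  fi
  if [ "$new_serial" -lt "$cur_serial" ]; then
    echo "serial-too-low"
    return 1
  fi
  echo "ok"
}
```

A lineage-mismatch result usually means the candidate belongs to a different workspace or environment, while serial-too-low means it is an older snapshot of the right one.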

My S3 object is gone — is the state recoverable? If bucket versioning was enabled, yes: list object versions, delete the delete-marker (or fetch a prior version), and push it back. If versioning was off and there is no .tfstate.backup, you must rebuild state by re-importing resources with terraform import.

Does terraform force-unlock fix corruption? No. force-unlock only releases a stale lock so you can operate again; it does not repair a truncated or version-mismatched snapshot. Use it only when the real problem is a lock, then re-verify that terraform state pull succeeds and that its output parses as JSON.
