Surgical Terraform Operations: target, replace, and refresh-only
Use terraform -target, -replace, and -refresh-only as careful escape hatches, not workflow. Let AI propose the minimal safe op while a human reviews every plan.
- #terraform
- #cli
- #state
- #operations
- #ai
The page came in at 02:14. One of our application servers had wedged itself into a state where the userdata script had half-run, the disk was mounting fine but the service refused to come up, and SSH was timing out. The fix was obvious: rebuild that one instance. The problem was the blast radius. A plain terraform apply against that workspace wanted to touch eleven resources — a security group rule someone had drifted by hand, an autoscaling tag, and a route table association that a teammate was mid-migration on. I did not want to apply eleven changes at 2 AM. I wanted to replace exactly one EC2 instance and nothing else.
That is what the surgical operations are for. -replace, -target, and -refresh-only exist precisely for the moment when a full apply is too blunt an instrument. They are escape hatches. Used well, they get you out of a hole at 2 AM. Used as a habit, they are how you drift your state into a mess that needs another escape hatch next week.
Why a full apply is sometimes the wrong tool
Terraform’s normal model is beautiful and you should not fight it: you describe desired state, Terraform diffs it against real state, and it reconciles everything in one shot. The whole point is that the apply is comprehensive.
But comprehensiveness is exactly the problem during an incident. When production is degraded, you frequently have:
- Known drift you have not reconciled yet (a hand-edited rule, a console hotfix).
- An in-flight change from a teammate that is half-merged.
- A single broken resource that just needs recreating.
A full apply forces you to confront all of it at once. The surgical operations let you scope the action down to the one thing you actually understand and want to change right now.
-replace: recreate one resource on purpose
When a single resource is corrupt or wedged but its configuration is fine, you want Terraform to destroy and recreate just that one object. That is -replace.
Always plan first:
terraform plan -replace="aws_instance.web"
You will see something like:
# aws_instance.web will be replaced, as requested
-/+ resource "aws_instance" "web" {
~ id = "i-0abc123" -> (known after apply)
~ private_ip = "10.0.1.42" -> (known after apply)
# (12 unchanged attributes hidden)
}
Plan: 1 to add, 0 to change, 1 to destroy.
The -/+ marker means destroy-then-create. Read it carefully: confirm the count is 1 to add, 1 to destroy and that nothing else snuck in. If the plan is clean:
terraform apply -replace="aws_instance.web"
For a resource inside a module, use the full address:
terraform apply -replace="module.app.aws_instance.web[0]"
Pro Tip: -replace does not change your configuration — it forces a re-create of an object whose config Terraform already considers correct. If the resource is broken because the config is wrong, fix the config and do a normal apply instead. -replace is for “the config is right, the real-world object is bad.”
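The "confirm the count" check described above can also be scripted as a guard in front of the apply. What follows is a minimal sketch, not an official Terraform feature: check_single_replace is a hypothetical helper name, and it assumes the plan was saved with -out=tfplan and rendered with terraform show -no-color. It matches the human-readable summary line, which is not a stable interface, so treat it as a convenience on top of reading the plan yourself.

```shell
# Sketch: confirm a plan summary shows exactly one add and one destroy.
# check_single_replace is a hypothetical helper, not a Terraform command.
# It reads human-readable plan text on stdin and inspects the "Plan:" line.
check_single_replace() {
  summary=$(grep -E '^Plan: [0-9]+ to add, [0-9]+ to change, [0-9]+ to destroy' | head -n 1)
  if [ "$summary" = "Plan: 1 to add, 0 to change, 1 to destroy." ]; then
    echo "ok: $summary"
    return 0
  fi
  echo "unexpected plan summary: ${summary:-none found}" >&2
  return 1
}

# Usage (assumes a saved plan file named tfplan):
#   terraform plan -replace="aws_instance.web" -out=tfplan
#   terraform show -no-color tfplan | check_single_replace && terraform apply tfplan
```

Applying the saved plan file means the thing that runs is the same plan that was checked, rather than a fresh plan computed at apply time.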
-replace is the modern replacement for terraform taint
If you have been doing this a while, your muscle memory says terraform taint. That command is deprecated as of Terraform v0.15.2 in favor of -replace (its partner terraform untaint still exists for clearing a tainted mark from state). The old flow was:
# deprecated
terraform taint aws_instance.web
terraform apply
The trouble with taint was that it mutated state as a separate, invisible step. You marked a resource tainted, then later an apply acted on that hidden flag, and anyone reading the apply had no idea why the instance was being recreated. -replace folds the intent into the plan itself, so the recreation shows up in the plan you review, in the run you can see, with no lingering state mutation. Same outcome, but the reasoning is visible and scoped to one command. Reach for -replace; leave taint in the past.
-target: scope an apply to a subgraph
-target restricts the operation to a specific resource or module and its dependencies. This is the one to be most careful with.
terraform plan -target=module.network.aws_subnet.private
Terraform will print a warning that is worth quoting because it tells you exactly how to feel about this flag:
│ Warning: Resource targeting is in effect
│
│ You are creating a plan with the -target option, which means that the
│ result of this plan may not represent all of the changes requested by
│ the current configuration.
And after apply it nudges you again, with a message to this effect:
│ Note: run "terraform plan" with no targets to detect any remaining changes.
That warning is not boilerplate. -target deliberately produces a partial, possibly inconsistent plan. It is for recovering from bugs and weird ordering problems — say a resource fails to create because of a dependency Terraform did not infer, and you need to stand up the dependency first. It is not for “I only want to deploy the database changes today.” If you find yourself reaching for -target to carve up routine deploys, your modules are too coarse; split the configuration instead.
After any targeted apply, run a full terraform plan with no flags to see what you left undone.
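The follow-up plan described above can be turned into a quick yes/no check with the -detailed-exitcode flag, which makes terraform plan exit 0 when there are no changes, 1 on error, and 2 when changes are pending. This is a sketch under assumptions: remaining_changes and the full-plan.txt file name are illustrative choices, not anything Terraform defines.

```shell
# Sketch: after a targeted apply, run an untargeted plan and report leftovers.
# remaining_changes is a hypothetical helper; full-plan.txt is an arbitrary name.
# -detailed-exitcode: 0 = no changes, 1 = error, 2 = changes pending.
remaining_changes() {
  plan_rc=0
  terraform plan -detailed-exitcode -no-color -input=false > full-plan.txt || plan_rc=$?
  case $plan_rc in
    0) echo "clean: no pending changes" ;;
    2) echo "pending: untargeted plan still has changes"; return 2 ;;
    *) echo "error: terraform plan failed" >&2; return 1 ;;
  esac
}

# Usage: run it right after the targeted apply, then read full-plan.txt
#   terraform apply -target=module.network.aws_subnet.private
#   remaining_changes
```

A "pending" result is expected after most targeted applies; the point is to see it written down in full-plan.txt instead of discovering it during the next deploy.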
-refresh-only: reconcile state without changing infrastructure
Sometimes the real world moved and your state is stale — an instance got resized in the console, a tag changed, something drifted. You want Terraform to update its state to match reality without proposing any infrastructure changes. That is apply -refresh-only:
terraform plan -refresh-only
Note: Objects have changed outside of Terraform
~ aws_instance.web
~ instance_type = "t3.medium" -> "t3.large"
This is a refresh-only plan, so Terraform will not take any actions to
undo these. If you were expecting these changes then you can apply this
plan to record the updated values in the Terraform state.
If those are the changes you expected, record them:
terraform apply -refresh-only
This writes the observed values into state. Crucially, it does not try to revert the drift back to your configuration — it just records what is actually out there. That makes it the safe way to answer “did something change underneath us?” before you decide whether to codify the change or fix it. It is the gentlest of the three operations: zero infrastructure mutation.
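To answer "did something change underneath us?" at a glance, the drifted addresses can be pulled out of the refresh-only plan text. The sketch below assumes the unabridged output marks each drifted object with a comment line such as "# aws_instance.web has changed" or "# aws_instance.web has been deleted"; that wording comes from the human-readable renderer and may differ between Terraform versions, so verify it against your own output. drifted_addresses is a hypothetical helper name.

```shell
# Sketch: list resource addresses that a refresh-only plan reports as drifted.
# drifted_addresses is a hypothetical helper, not a Terraform command.
# Assumes drift lines look like "  # <address> has changed" or
# "  # <address> has been deleted" in -no-color plan output.
drifted_addresses() {
  sed -n -E 's/^[[:space:]]*# (.+) has (changed|been deleted)$/\1/p'
}

# Usage:
#   terraform plan -refresh-only -no-color | drifted_addresses
```

For anything beyond a quick look during an incident, the machine-readable plan from terraform show -json is a safer thing to parse than rendered text.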
Let AI propose the operation — and keep a human on the plan
Here is where these tools get genuinely safer. The hardest part of an incident is not typing the command; it is deciding the minimal one. Which single resource? -replace or -refresh-only? What is the exact resource address inside that nested module?
That is a perfect job for an LLM acting like a fast junior engineer. Paste the failing plan output, the error, and the relevant config, and ask it to propose the smallest operation that fixes the problem and explain the blast radius. A tool like Claude or Cursor is very good at reading a wall of -/+ markers and telling you “the only resource that actually needs recreating is module.app.aws_instance.web[0]; the other ten are pre-existing drift, leave them.”
But the model proposes; it does not act. The discipline that keeps this safe:
- Never let the model auto-apply. It hands you a command. You run terraform plan yourself and read it.
- A human reviews every plan. The plan output is the contract. If the count is not what you expected, stop.
- Never give the model state-write access or cloud credentials. It reasons over text you paste in: plan output, config, errors. It does not hold AWS_ACCESS_KEY_ID, it does not run terraform apply, it does not touch your backend. The moment an AI has write access to state or the cloud, “fast junior engineer” becomes “unsupervised junior engineer with prod keys,” which is the opposite of what you want at 2 AM.
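One way to make "a human reviews every plan" mechanical is to gate the apply behind a typed confirmation on a saved plan file. This is a minimal sketch run by the on-call engineer in their own shell, never by the model: reviewed_apply is a hypothetical helper name, and tfplan is an assumed file produced by terraform plan with -out.

```shell
# Sketch: a human-gated apply. reviewed_apply is a hypothetical helper.
# It prints a saved plan, then applies that exact plan only if the operator
# types the word "apply". Any other answer aborts without running apply.
reviewed_apply() {
  planfile=$1
  terraform show -no-color "$planfile" || return 1
  printf 'Type "apply" to run this exact plan: '
  read -r answer
  if [ "$answer" != "apply" ]; then
    echo "aborted: plan not applied"
    return 1
  fi
  terraform apply "$planfile"
}

# Usage (the command itself may come from the model; the human runs it):
#   terraform plan -replace="module.app.aws_instance.web[0]" -out=tfplan
#   reviewed_apply tfplan
```

Because terraform apply is given the saved plan file, what runs is exactly what was displayed and confirmed, not a plan recomputed after the review.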
Pro Tip: Save the prompts that work. A reusable “given this failing plan, propose the minimal -replace/-target/-refresh-only operation and justify the blast radius” prompt is worth keeping in your prompt library or a curated prompt pack, so the next on-call engineer is not reinventing it mid-incident.
If you want the review step itself to be more rigorous, routing the proposed change through an automated code review pass before a human signs off catches the embarrassing mistakes — a wrong resource index, a -target that quietly drags in half a module — without slowing you down.
Conclusion
-replace, -target, and -refresh-only are the right tools when a full apply is too blunt: recreate one wedged resource, scope a recovery to one subgraph, or reconcile drift into state without touching infrastructure. They are escape hatches, and the test for whether you are using them well is simple — if they show up in your incident notes occasionally, good; if they show up in your normal deploy pipeline, your configuration needs splitting, not more flags. Let AI do the fast reasoning about which minimal operation to run. Keep a human reading every plan, and keep the credentials far away from the model. See the rest of the Terraform guides for more on keeping state honest.