Running an AI-Assisted AWS Well-Architected Review

Every team I’ve worked on has had a Well-Architected review on the roadmap and never quite done one. The framework is genuinely good — six pillars, a structured set of questions, a clear way to find the gaps that bite you later — but the review itself is a slog. It means sitting in a room mapping your architecture against a long questionnaire, arguing about whether you “mostly” meet a best practice, and producing a findings document that takes days. So it slips, quarter after quarter, until an incident forces the conversation. AI changes the math here: it can read your architecture and infrastructure code, map it against the pillars, and draft a first pass of findings in an afternoon. That’s not the whole review — but it’s the 80% that was stopping anyone from starting.

The boundary matters more here than almost anywhere, because a Well-Architected review is fundamentally about judgment and trade-offs. AI drafts findings and flags gaps; the human decides which gaps actually matter for this system, what the business is willing to trade, and what to fix first. A model can tell you you’re missing multi-AZ; it can’t tell you whether the cost is worth it for your batch job.

Give the model your real architecture, not a description

The review is only as good as the input. Don’t describe your system in prose — feed the model the actual artifacts: your Terraform or infrastructure code, a list of the AWS resources in play, and your real operational facts (deployment process, backup setup, on-call). For the resource inventory:

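# Count resources per service. This API only lists resources that are (or
# were) tagged, in the current Region, so treat the result as a floor.
aws resourcegroupstaggingapi get-resources \
  --query 'ResourceTagMappingList[].ResourceARN' --output text \
  | tr '\t' '\n' | awk -F: '{print $3}' | sort | uniq -c | sort -rn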

And a slice of the IaC that defines the critical path — say the database and load balancer:

resource "aws_db_instance" "primary" {
  engine                  = "postgres"
  instance_class          = "db.r6g.xlarge"
  multi_az                = false
  backup_retention_period = 1
  deletion_protection     = false
  storage_encrypted       = true
}

That multi_az = false, backup_retention_period = 1, deletion_protection = false triple is the kind of thing a model spots instantly and a tired reviewer reads right past.
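A deterministic cross-check can sit next to the model’s read of the config, so you can confirm that each flagged setting really is in the file. This is a minimal sketch, assuming the resource’s attributes have already been loaded into a plain dict (for example from `terraform show -json`). `rds_red_flags` and `MIN_BACKUP_DAYS` are hypothetical names, and the 7-day floor is an assumption to tune.

```python
# Hypothetical baseline check for the RDS settings discussed above.
# Attribute names match the aws_db_instance snippet; the threshold and the
# "missing means not set" behaviour are assumptions of this sketch.
MIN_BACKUP_DAYS = 7


def rds_red_flags(attrs):
    """Return human-readable flags for risky RDS settings in one resource."""
    flags = []
    if not attrs.get("multi_az", False):
        flags.append("multi_az is not enabled: no automatic failover")
    if attrs.get("backup_retention_period", 0) < MIN_BACKUP_DAYS:
        flags.append(f"backup_retention_period is below {MIN_BACKUP_DAYS} days")
    if not attrs.get("deletion_protection", False):
        flags.append("deletion_protection is not enabled")
    if not attrs.get("storage_encrypted", False):
        flags.append("storage_encrypted is not enabled")
    return flags


# The same values as the Terraform slice above.
primary = {
    "engine": "postgres",
    "instance_class": "db.r6g.xlarge",
    "multi_az": False,
    "backup_retention_period": 1,
    "deletion_protection": False,
    "storage_encrypted": True,
}

if __name__ == "__main__":
    for flag in rds_red_flags(primary):
        print(flag)
```

A check like this only covers the rules you thought to write down. Its job here is to verify the model’s findings against the file, not to replace them.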

Run the pillars one at a time, with the questions

Don’t ask for “a Well-Architected review” in one shot: you’ll get shallow, generic output. Go pillar by pillar, and give the model the actual framework questions so it grounds findings in the real best practices rather than inventing them.

Review the Terraform and resource inventory below against the AWS Well-Architected Reliability pillar. For each finding: state the specific best practice it relates to (e.g. “REL: plan for component failures”), what the config currently does, the risk, and a concrete remediation. Only raise findings you can support from the config I gave you; do not assume facts not in evidence. Rate each finding’s likely impact as High, Medium, or Low, but note that I will re-prioritize based on business context you don’t have.
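If you run this for several pillars, templating the prompt keeps the wording identical from one pillar to the next. This is a minimal sketch with no model API call, only string assembly. `build_pillar_prompt` and its parameters are hypothetical names. The template paraphrases the prompt above and adds an explicit INSUFFICIENT EVIDENCE instruction, which is an assumption about how to encourage that behavior.

```python
from textwrap import dedent

# Hypothetical template; the placeholders are filled with your real artifacts.
PROMPT_TEMPLATE = dedent("""\
    Review the Terraform and resource inventory below against the AWS
    Well-Architected {pillar} pillar. For each finding, state the best
    practice it relates to, what the config currently does, the risk, and a
    concrete remediation. Only raise findings you can support from the
    material below. Where the material is not enough to conclude, say
    INSUFFICIENT EVIDENCE instead of guessing.

    Framework questions for this pillar:
    {questions}

    Terraform:
    {terraform}

    Resource inventory:
    {inventory}
    """)


def build_pillar_prompt(pillar, questions, terraform, inventory):
    """Assemble one prompt for one pillar from the real artifacts."""
    if not questions:
        # Refuse to build a prompt with no framework questions to ground it.
        raise ValueError(f"no framework questions supplied for {pillar}")
    numbered = "\n".join(f"{i}. {q}" for i, q in enumerate(questions, start=1))
    return PROMPT_TEMPLATE.format(
        pillar=pillar,
        questions=numbered,
        terraform=terraform.strip(),
        inventory=inventory.strip(),
    )


if __name__ == "__main__":
    print(build_pillar_prompt(
        pillar="Reliability",
        # Placeholders: paste the real question text from the framework.
        questions=["<framework question 1>", "<framework question 2>"],
        terraform='resource "aws_db_instance" "primary" {\n  multi_az = false\n}',
        inventory="<output of the inventory command>",
    ))
```

Failing loudly when the questions list is empty is deliberate: a prompt without the framework questions is the one-shot request this section warns against.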

A grounded reply looks like this:

Reliability findings:

[HIGH] aws_db_instance.primary has multi_az = false. Best practice: withstand component failure. A single-AZ RDS instance has no automatic failover; an AZ outage takes the database down. Remediation: set multi_az = true. (Note: roughly doubles RDS cost — your call whether the workload justifies it.)

[HIGH] backup_retention_period = 1 gives only a single day of point-in-time recovery. Remediation: raise to at least 7. Near-zero cost.

[MEDIUM] deletion_protection = false risks accidental teardown of the primary database. Remediation: enable it.

[INSUFFICIENT EVIDENCE] Cannot assess RTO/RPO targets or DR strategy from the config — needs your operational input.

That last line is the behavior I want most: the model flagging where it can’t conclude, instead of confidently inventing a finding. The cost caveat on multi-AZ is also exactly right — it surfaces the trade-off and hands the decision back to me.
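To carry findings into the later steps, it helps to turn the reply into records. This is a minimal sketch, assuming the reply keeps the format shown above: one finding per line, starting with a bracketed tag. `Finding` and `parse_findings` are hypothetical names, and a real reply may drift from this format, so lines that do not parse are returned rather than dropped.

```python
import re
from dataclasses import dataclass

# Assumed tag set, taken from the example reply above.
TAG_RE = re.compile(r"^\[(HIGH|MEDIUM|LOW|INSUFFICIENT EVIDENCE)\]\s*(.+)$")


@dataclass
class Finding:
    rating: str       # HIGH / MEDIUM / LOW / INSUFFICIENT EVIDENCE
    text: str         # the finding as the model wrote it
    remediation: str  # text after "Remediation:", empty if none was given


def parse_findings(reply):
    """Split a model reply into findings plus any lines that did not parse."""
    findings, unparsed = [], []
    for line in reply.splitlines():
        line = line.strip()
        if not line:
            continue
        match = TAG_RE.match(line)
        if not match:
            unparsed.append(line)
            continue
        rating, body = match.groups()
        text, _, remediation = body.partition("Remediation:")
        findings.append(Finding(rating, text.strip(), remediation.strip()))
    return findings, unparsed


def needs_human_input(findings):
    """Findings where the model said it could not conclude."""
    return [f for f in findings if f.rating == "INSUFFICIENT EVIDENCE"]


if __name__ == "__main__":
    reply = (
        "Reliability findings:\n\n"
        "[HIGH] aws_db_instance.primary has multi_az = false. "
        "Remediation: set multi_az = true.\n"
        "[INSUFFICIENT EVIDENCE] Cannot assess RTO/RPO targets from the config.\n"
    )
    findings, unparsed = parse_findings(reply)
    for f in findings:
        print(f.rating, "|", f.text, "|", f.remediation)
```

The INSUFFICIENT EVIDENCE entries become the list of questions to take back to the team.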

Where AI is strong, and where you take over

AI is reliably good on the pillars that are mostly config-checkable. Security: it’ll catch unencrypted storage, over-broad security groups, and IAM policies that break least privilege, which overlaps with the public-exposure work. Cost Optimization: it spots over-provisioned instances and missing lifecycle policies from the IaC. Performance Efficiency and Reliability: most of the signal is right there in the config (instance sizing, multi-AZ, autoscaling).

Where it weakens, and where the human carries the review: Operational Excellence and the cultural parts of every pillar. The framework asks whether you can deploy safely, whether you learn from operations, whether ownership is clear — and the model only knows what you tell it about how your team actually works. Don’t let it invent a runbook maturity assessment from infrastructure code. Feed it your real deployment and on-call facts, or mark those questions as human-only.
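One way to enforce that split is to label each question by the evidence it needs before anything reaches the model. This is a minimal sketch. The `Question` record, the `evidence` labels, and `route_questions` are hypothetical names, and the question texts are paraphrases from the paragraph above, not the framework’s official wording.

```python
from dataclasses import dataclass


@dataclass
class Question:
    pillar: str
    text: str
    evidence: str    # "config" or "operational"; a label you assign by hand
    topic: str = ""  # key into operational_facts, for operational questions


def route_questions(questions, operational_facts):
    """Send a question to the model only if it has evidence to work from."""
    for_model, human_only = [], []
    for q in questions:
        has_facts = bool(operational_facts.get(q.topic))
        if q.evidence == "config" or has_facts:
            for_model.append(q)
        else:
            human_only.append(q)
    return for_model, human_only


if __name__ == "__main__":
    questions = [
        Question("Reliability", "Is the database multi-AZ?", "config"),
        Question("Operational Excellence", "Can we deploy safely?",
                 "operational", "deployment"),
        Question("Operational Excellence", "Is ownership clear?",
                 "operational", "ownership"),
    ]
    # Placeholder text: replace with your real deployment facts.
    facts = {"deployment": "<describe your deploy process here>"}
    for_model, human_only = route_questions(questions, facts)
    print("model:", [q.text for q in for_model])
    print("human-only:", [q.text for q in human_only])
```

Here the ownership question lands in the human-only list because no facts were supplied for it.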

Turn the draft into a prioritized, owned plan

The model’s High/Medium/Low ratings are a starting sort, not the final priority — and I told it as much in the prompt. The single-AZ finding might be rated HIGH, but if it’s a nightly batch job that can tolerate a few hours down, it’s not your first fix. Re-rank against blast radius and business value yourself, then have AI help with the grunt work: turning agreed findings into tracked tickets with the remediation snippet attached.

# The two near-zero-cost, no-judgment-needed reliability fixes for
# aws_db_instance.primary: backup retention and deletion protection. Apply these first.
backup_retention_period = 7
deletion_protection     = true

Some findings need no debate (encrypt the storage, raise backup retention, enable deletion protection) and you just do them. Others are genuine trade-off decisions (multi-AZ cost, multi-region DR) that go to the team. AI cleanly separates those two buckets for you, which is most of what makes the review actually finish.
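The re-ranking and the two buckets can be sketched as a small triage step. This is a minimal illustration, not a prescribed workflow. `Finding`, `has_tradeoff`, `human_rating`, and `triage` are hypothetical names, and it assumes a reviewer has confirmed the trade-off flag (for example from the model’s cost caveat) and entered any rating override.

```python
from dataclasses import dataclass

RATING_ORDER = {"HIGH": 0, "MEDIUM": 1, "LOW": 2}


@dataclass
class Finding:
    title: str
    model_rating: str       # the model's HIGH / MEDIUM / LOW
    has_tradeoff: bool      # True if the fix costs money or changes the design
    human_rating: str = ""  # reviewer's override; empty keeps the model's rating

    @property
    def rating(self):
        # The human override always wins over the model's starting sort.
        return self.human_rating or self.model_rating


def triage(findings):
    """Return (just_do_it, needs_decision), each sorted by effective rating."""
    by_rating = sorted(findings, key=lambda f: RATING_ORDER[f.rating])
    just_do_it = [f for f in by_rating if not f.has_tradeoff]
    needs_decision = [f for f in by_rating if f.has_tradeoff]
    return just_do_it, needs_decision


if __name__ == "__main__":
    findings = [
        # Rated HIGH by the model, downgraded because this is a nightly batch job.
        Finding("Enable multi-AZ on aws_db_instance.primary", "HIGH", True,
                human_rating="LOW"),
        Finding("Raise backup_retention_period to 7", "HIGH", False),
        Finding("Enable deletion_protection", "MEDIUM", False),
    ]
    just_do_it, needs_decision = triage(findings)
    print("Do now:", [f.title for f in just_do_it])
    print("Team decision:", [f.title for f in needs_decision])
```

The override field is the part that matters: the model’s rating is kept for the record, but the order of work follows the human one.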

The takeaway

The reason Well-Architected reviews don’t happen is the activation energy — the days of tedious mapping before you reach any insight. AI collapses that: it reads your real architecture, maps it against the pillar best practices, and drafts grounded findings with trade-offs surfaced and uncertainty flagged. But the review’s whole value is judgment — which gaps matter, what the business will trade, what to fix first — and that stays yours. Let AI draft the findings; you own the prioritization and the decisions.

The security and cost findings here connect directly to the rest of the AWS guides, and the pillar-by-pillar review prompts are in the prompts collection.
