AI for Infrastructure as Code · By James Joyner IV · 10 min read

IaC Error Guide: 'Build amazon-ebs errored' Packer AMI Build Failure

Fix Packer Build 'amazon-ebs' errored: diagnose SSH timeouts, missing source AMIs, IAM permissions, failing provisioners, and VPC subnets with no public IP.

  • #iac
  • #troubleshooting
  • #errors
  • #packer

Overview

Build 'amazon-ebs' errored is Packer’s catch-all message when the amazon-ebs builder fails at any stage of baking an AMI: launching the temporary EC2 instance, connecting to it over SSH/WinRM, running provisioners, or creating the final image. By default (-on-error=cleanup), Packer tears down the temporary instance, key pair, and security group on failure, so by the time you read the error the resources are already gone and the log is the only evidence left.

The error surfaces like this:

==> amazon-ebs: Provisioning with shell script: /tmp/packer-shell123456789
==> amazon-ebs: Provisioning step had errors: Running the cleanup provisioner, if present...
==> amazon-ebs: Terminating the source AWS instance...
==> amazon-ebs: Cleaning up any extra volumes...
Build 'amazon-ebs' errored after 2 minutes 14 seconds: Script exited with non-zero exit status: 1. Allowed exit codes are: [0]

The text after errored after ...: is the real cause — Timeout waiting for SSH, InvalidAMIID.NotFound, UnauthorizedOperation, or Script exited with non-zero exit status. The phrase Build 'amazon-ebs' errored itself tells you nothing; always read the clause that follows it and the lines immediately above the teardown.
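Pulling that trailing clause out of a saved log can be scripted. The following is a minimal sketch, assuming the summary line has the form Build '<name>' errored after <duration>: <cause>; the function name packer_cause is made up for illustration and is not part of Packer.

```shell
# Print only the cause clause from a Packer log read on stdin.
# Assumes the summary line looks like:
#   Build '<name>' errored after <duration>: <cause>
# packer_cause is a hypothetical helper name, not a Packer command.
packer_cause() {
  # [^:]* stops at the first colon, so a cause that itself
  # contains colons is kept whole.
  sed -n "s/^Build '[^']*' errored after [^:]*: //p"
}
```

One way to use it is to save the build output with tee build.log and then run packer_cause < build.log, which prints one cause per failed build.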

Symptoms

  • packer build exits non-zero with Build 'amazon-ebs' errored after ....
  • The log shows Waiting for SSH to become available... repeating, then a timeout.
  • An AWS SDK error such as InvalidAMIID.NotFound, UnauthorizedOperation, or InsufficientInstanceCapacity.
  • A provisioner line ending in Script exited with non-zero exit status: 1.
packer build aws.pkr.hcl
==> amazon-ebs: Waiting for SSH to become available...
==> amazon-ebs: Timeout waiting for SSH.
==> amazon-ebs: Terminating the source AWS instance...
Build 'amazon-ebs' errored after 6 minutes 1 second: Timeout waiting for SSH.
==> Builds finished but no artifacts were created.

Common Root Causes

1. SSH timeout — wrong communicator, user, or security group

Packer launched the instance but never connected. Either the temporary security group blocks port 22, the ssh_username is wrong for the source AMI, or Packer has no network route to the instance at all.

PACKER_LOG=1 packer build aws.pkr.hcl 2>&1 | grep -iE 'ssh|security group|waiting'
2026/06/23 14:02:10 packer-builder-amazon-ebs: Using host value: 10.0.3.x
==> amazon-ebs: Waiting for SSH to become available...
2026/06/23 14:08:11 packer-builder-amazon-ebs: [DEBUG] TCP connection to SSH ip/port failed: dial tcp 10.0.3.x:22: i/o timeout
==> amazon-ebs: Timeout waiting for SSH.

The dial tcp ...:22: i/o timeout confirms a network/security-group reachability problem, not credentials. An ssh: handshake failed error would instead point at credentials, most often the wrong ssh_username (ec2-user vs ubuntu vs admin).
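The same distinction can be applied mechanically to a PACKER_LOG=1 trace. This is a rough sketch that only looks for the two strings discussed here and reports anything else as unknown; classify_ssh_failure is a hypothetical helper name.

```shell
# Classify an SSH failure from a PACKER_LOG=1 trace read on stdin.
# Only the two patterns discussed above are recognized.
# classify_ssh_failure is a hypothetical helper name.
classify_ssh_failure() {
  ssh_trace=$(cat)
  if printf '%s\n' "$ssh_trace" | grep -q 'i/o timeout'; then
    echo "network"       # security group, routing, or no reachable IP
  elif printf '%s\n' "$ssh_trace" | grep -q 'handshake failed'; then
    echo "credentials"   # likely the wrong ssh_username or key
  else
    echo "unknown"
  fi
}
```

For example, PACKER_LOG=1 packer build aws.pkr.hcl 2>&1 | classify_ssh_failure prints a single word once the build has finished.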

2. Source AMI not found or wrong region

The source_ami or source_ami_filter resolves to nothing in the build region — the AMI ID is region-specific, was deregistered, or the owner/filter is too narrow.

# Failure reported at launch:
An error occurred (InvalidAMIID.NotFound) when calling the RunInstances operation: The image id '[ami-0bad1dexample]' does not exist
# Check what the filter resolves to in the build region:
aws ec2 describe-images --owners amazon \
  --filters "Name=name,Values=al2023-ami-2023.*-x86_64" \
  --region us-east-1 --query 'Images[0].ImageId' --output text

If describe-images returns None or null, the filter matches nothing in this region. Fix the filter, the owners, or the region so it resolves to a real, current AMI.

3. IAM permissions for RunInstances / CreateImage

The credentials Packer uses lack a permission it needs at some stage — launching the instance, creating the key pair/security group, or registering the final image. AWS returns UnauthorizedOperation.

packer build aws.pkr.hcl
==> amazon-ebs: Error launching source instance: UnauthorizedOperation: You are not authorized to perform this operation. User: arn:aws:iam::123456789012:user/ci-packer is not authorized to perform: ec2:RunInstances on resource: arn:aws:ec2:us-east-1:123456789012:instance/*
Build 'amazon-ebs' errored after 3 seconds: Error launching source instance: UnauthorizedOperation

The error names the exact action (ec2:RunInstances, later ec2:CreateImage, ec2:CreateTags). Grant those actions to the Packer principal. A near-instant failure (errored after 3 seconds) almost always means an authz/validation error, not a build problem.
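When several builds fail on different actions, it can help to collect the denied actions from the logs in one pass. A small sketch, assuming the message wording shown in the log above ("is not authorized to perform: <action>"); denied_actions is a hypothetical helper name.

```shell
# List the distinct IAM actions named in UnauthorizedOperation messages.
# Reads a Packer log on stdin; assumes the AWS wording
#   "... is not authorized to perform: <service>:<Action> ..."
# denied_actions is a hypothetical helper name.
denied_actions() {
  sed -n 's/.*is not authorized to perform: \([A-Za-z0-9-]*:[A-Za-z0-9]*\).*/\1/p' | sort -u
}
```

The output is one action per line, which maps directly onto the Action list of the policy attached to the Packer principal.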

4. Provisioner script exited non-zero

The instance booted and SSH connected, but a shell/Ansible provisioner returned a non-zero exit code: a failed apt-get, a missing package, or a script aborted by set -e.

PACKER_LOG=1 packer build aws.pkr.hcl 2>&1 | grep -iE 'exit status|provision'
==> amazon-ebs: + apt-get install -y nginx=1.18.0-0ubuntu1
==> amazon-ebs: E: Version '1.18.0-0ubuntu1' for 'nginx' was not found
==> amazon-ebs: Provisioning step had errors: Running the cleanup provisioner, if present...
Build 'amazon-ebs' errored after 1 minute 47 seconds: Script exited with non-zero exit status: 100. Allowed exit codes are: [0]

The lines just above the error show the actual command and its output. Fix the script (pin a package version that exists, add a repo, handle the failure) — this is a guest-side problem, not an AWS one.
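For the pinned-version case, a provisioner script can check the pin against the apt index before installing, so the failure message says what is wrong. The sketch below assumes the usual apt-cache madison layout of "package | version | source" per row; version_listed is a hypothetical helper name.

```shell
# Succeed (exit 0) if the wanted version appears in `apt-cache madison` output
# read on stdin, fail (exit 1) otherwise.
# Assumes rows of the form " pkg | version | source".
# version_listed is a hypothetical helper name.
version_listed() {
  awk -F'|' -v want="$1" '
    { gsub(/ /, "", $2); if ($2 == want) found = 1 }
    END { exit !found }
  '
}
```

Inside a provisioner this could be used as apt-cache madison nginx | version_listed 1.18.0-0ubuntu1 || { echo "pinned nginx version not in apt index" >&2; exit 1; }, run after apt-get update.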

5. Subnet/VPC has no public IP, or wrong instance profile

The instance launched into a private subnet with no associate_public_ip_address and no inbound path such as a VPN, bastion host, or Session Manager, so Packer (running outside the VPC) cannot reach it for SSH. Or iam_instance_profile references an instance profile that does not exist.

aws ec2 describe-subnets --subnet-ids subnet-0abc123 \
  --query 'Subnets[0].{Public:MapPublicIpOnLaunch,AZ:AvailabilityZone}' --output table
-----------------------------------
|         DescribeSubnets         |
+----------------+----------------+
|       AZ       |     Public     |
+----------------+----------------+
|  us-east-1a    |  False         |
+----------------+----------------+

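Public: False means the instance gets no public IP by default, and a NAT gateway only carries outbound traffic, so Packer cannot reach the instance from outside the VPC. Set associate_public_ip_address = true and use a public subnet, or run Packer from inside the VPC with ssh_interface = "private_ip".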

6. Spot/quota capacity or unsupported instance type

The chosen instance_type is unavailable in the AZ (spot capacity, on-demand quota, or the type is not offered there), so RunInstances fails with a capacity error.

packer build aws.pkr.hcl
==> amazon-ebs: Error launching source instance: InsufficientInstanceCapacity: We currently do not have sufficient m6g.large capacity in the Availability Zone you requested (us-east-1e).
Build 'amazon-ebs' errored after 11 seconds: Error launching source instance: InsufficientInstanceCapacity
aws ec2 describe-instance-type-offerings --location-type availability-zone \
  --filters "Name=instance-type,Values=m6g.large" --region us-east-1 --output table

Pin a subnet_id in an AZ that offers the type, switch instance types, or remove the spot configuration. If describe-instance-type-offerings returns nothing for the AZ, the type is simply not offered there.
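The offerings check can be turned into a yes/no test for one AZ. This sketch assumes the command is run with --query 'InstanceTypeOfferings[].Location' --output text, which prints the AZ names separated by tabs; az_offers_type is a hypothetical helper name.

```shell
# Succeed if the given AZ appears in a whitespace-separated list of AZ names
# read on stdin (for example the --output text form of
# describe-instance-type-offerings with --query 'InstanceTypeOfferings[].Location').
# az_offers_type is a hypothetical helper name.
az_offers_type() {
  # One AZ per line, then an exact whole-line match.
  tr -s '\t ' '\n\n' | grep -Fqx -- "$1"
}
```

Piping the describe-instance-type-offerings output into az_offers_type us-east-1e returns non-zero when the type is not offered in that AZ, which a CI script can use to fail before the build starts.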

Diagnostic Workflow

Step 1: Validate the template before building

packer init .
packer validate .

validate catches missing variables, malformed HCL, and unset required fields before AWS is ever called — ruling out config typos.

Step 2: Read the clause after ‘errored’

packer build aws.pkr.hcl 2>&1 | tail -20

The text after errored after ...: is the cause. A failure in seconds points to AWS authz/validation; a failure in minutes points to SSH or provisioning.
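The timing rule of thumb can be sketched as a small classifier. It is only a heuristic: it reads the duration from the summary line and treats anything under a minute as an AWS-side failure. The function name failure_layer and its output labels are invented for this example.

```shell
# Guess the failing layer from the duration in the summary line.
# Reads a Packer log on stdin. Heuristic only: under a minute suggests an
# AWS authz/validation error, a minute or more suggests SSH or provisioning.
# failure_layer and its labels are hypothetical.
failure_layer() {
  duration=$(sed -n "s/^Build '[^']*' errored after \([^:]*\):.*/\1/p" | head -n 1)
  case "$duration" in
    "")               echo "no-error-line" ;;
    *hour*|*minute*)  echo "ssh-or-provisioning" ;;
    *)                echo "authz-or-validation" ;;
  esac
}
```

Treat the result as a hint about where to look first, then confirm it against the cause clause itself.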

Step 3: Confirm the source AMI resolves in the build region

aws ec2 describe-images --owners amazon \
  --filters "Name=name,Values=<your-ami-name-filter>" \
  --region <build-region> --query 'reverse(sort_by(Images,&CreationDate))[0].ImageId' --output text

A None result means the source_ami_filter matches nothing — fix the filter, owner, or region.

Step 4: For SSH timeouts, check the network path

aws ec2 describe-security-groups --group-ids <packer-temp-sg> \
  --query 'SecurityGroups[0].IpPermissions' --output json
aws ec2 describe-subnets --subnet-ids <subnet-id> \
  --query 'Subnets[0].MapPublicIpOnLaunch' --output text

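Verify port 22 is open inbound and the subnet assigns a public IP (or that Packer uses private_ip from inside the VPC). The temporary security group exists only while the build is running, so run the first command during the build or after a run with -on-error=abort. If the template sets security_group_id, inspect that group instead.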

Step 5: For provisioner or stubborn failures, build with -debug

PACKER_LOG=1 packer build -debug aws.pkr.hcl

-debug pauses at each step and writes the temporary private key to the current directory, so you can SSH into the live instance and reproduce the failing command by hand. Combine with PACKER_LOG=1 for the full SDK trace.

Example Root Cause Analysis

A CI job that has worked for months suddenly fails:

==> amazon-ebs: Waiting for SSH to become available...
==> amazon-ebs: Timeout waiting for SSH.
Build 'amazon-ebs' errored after 6 minutes 1 second: Timeout waiting for SSH.

A six-minute timeout points at the network path, not credentials. The template was recently changed to deploy into a specific subnet_id for a new VPC. Checking that subnet:

aws ec2 describe-subnets --subnet-ids subnet-0newvpc99 \
  --query 'Subnets[0].MapPublicIpOnLaunch' --output text
False

The new subnet is private and does not auto-assign public IPs, and the build host runs outside the VPC. Packer launches the instance fine but can never reach it on port 22, hence the SSH timeout after the full wait. There is no VPN, bastion host, or Session Manager configuration giving Packer a path in.

The fix is to either force a public IP or tell Packer to connect over the private interface from a runner inside the VPC:

# in aws.pkr.hcl: set associate_public_ip_address = true and point subnet_id at a public subnet
# OR if the CI runner is inside the VPC: set ssh_interface = "private_ip"
packer validate .
packer build aws.pkr.hcl

After setting associate_public_ip_address = true and pointing at a public subnet, SSH connects within seconds and the AMI builds. The root cause was a subnet change that removed Packer’s reachability, not anything in the provisioners.

Prevention Best Practices

  • Run packer validate . in CI before every build so malformed HCL and unset variables fail fast, before any AWS API call.
  • Resolve source_ami_filter with a describe-images smoke test per region so a deregistered or region-mismatched AMI is caught before the build.
  • Give the Packer IAM principal a least-privilege policy that explicitly includes ec2:RunInstances, ec2:CreateImage, ec2:CreateTags, and the security-group/key-pair actions it needs.
  • Standardize the build subnet: a public subnet with associate_public_ip_address = true, or ssh_interface = "private_ip" from an in-VPC runner. Keep that decision in version control.
  • Pin instance types to AZs that actually offer them, and avoid spot for builds where a capacity miss would break the pipeline.
  • When a build fails intermittently, the free incident assistant can summarize the PACKER_LOG trace into whether the failure was network, IAM, or provisioner.
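The first two practices can be combined into one preflight step that runs before packer build. The following is a sketch under stated assumptions: AMI_NAME_FILTER and BUILD_REGION are environment variables invented for this example (they are not Packer settings), the owner is amazon as in the earlier commands, and preflight is a hypothetical function name.

```shell
# Hypothetical CI preflight: validate the template, then smoke-test the AMI filter.
# AMI_NAME_FILTER and BUILD_REGION are assumed environment variables, not Packer settings.
preflight() {
  packer init . || return 1
  packer validate . || return 1

  # Newest image matching the filter; prints "None" when nothing matches.
  ami=$(aws ec2 describe-images --owners amazon \
    --filters "Name=name,Values=${AMI_NAME_FILTER}" \
    --region "${BUILD_REGION}" \
    --query 'reverse(sort_by(Images,&CreationDate))[0].ImageId' \
    --output text) || return 1

  case "$ami" in
    ami-*) echo "source AMI resolves to $ami" ;;
    *)     echo "source AMI filter matched nothing in ${BUILD_REGION}" >&2
           return 1 ;;
  esac
}
```

In a pipeline this might run as preflight && packer build aws.pkr.hcl, so a bad template or an unresolvable AMI stops the job before any instance is launched.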

Quick Command Reference

# Validate template before building
packer init .
packer validate .

# Build and read the real cause
packer build aws.pkr.hcl 2>&1 | tail -20

# Full SDK trace
PACKER_LOG=1 packer build aws.pkr.hcl

# Pause at each step; the temporary private key is written to the current directory for manual SSH
packer build -debug aws.pkr.hcl

# Confirm the source AMI resolves in the region
aws ec2 describe-images --owners amazon --filters "Name=name,Values=<filter>" --region <region> --query 'reverse(sort_by(Images,&CreationDate))[0].ImageId' --output text

# Check the SSH network path
aws ec2 describe-security-groups --group-ids <sg> --query 'SecurityGroups[0].IpPermissions'
aws ec2 describe-subnets --subnet-ids <subnet> --query 'Subnets[0].MapPublicIpOnLaunch' --output text

Conclusion

Build 'amazon-ebs' errored is a wrapper — the cause is always the clause that follows it and the lines above the teardown. Match the symptom to the layer:

  1. SSH timeout from a closed security group, wrong ssh_username, or no network path to the instance.
  2. The source_ami/source_ami_filter resolves to nothing in the build region.
  3. Missing IAM permissions (ec2:RunInstances, ec2:CreateImage) — usually a near-instant failure.
  4. A provisioner script exited non-zero — a guest-side command failure.
  5. A private subnet with no public IP (or a bad instance profile) leaves the instance unreachable.
  6. Spot/quota capacity or an instance type not offered in the chosen AZ.

A failure measured in seconds is almost always AWS authz/validation; one measured in minutes is SSH or provisioning. Read the timing and the trailing clause first, and the layer to fix is obvious.
