Skip to content
DevOps AI ToolKit
Newsletter
All guides
AI for Infrastructure as Code By James Joyner IV · · 9 min read

IaC Error Guide: 'Failed running module' cloud-init Boot Provisioning Error

Fix cloud-init 'failed' / 'Failed running module' on boot: validate cloud-config YAML, debug runcmd exit codes, datasource issues, and failed package installs.

  • #iac
  • #troubleshooting
  • #errors
  • #cloud-init

Overview

cloud-init runs your provisioning instructions during first boot, executing a sequence of modules across the init and config stages. When one of those modules raises an exception — a malformed config, a non-zero command, an unreachable package repo — cloud-init records the failure and cloud-init status reports error. Anything that depended on that module (later runcmd, services, application bootstrap) does not run, and the instance comes up half-provisioned.

You will see this from cloud-init status --long:

status: error
extended_status: error - done
boot_status_code: enabled-by-generator
last_update: Tue, 23 Jun 2026 14:08:02 +0000
detail: Failed running module 'scripts-user' (modules:final) at 2026-06-23 14:08:02. Reason: errors running module scripts-user: [Errno 1] runcmd failed.
errors:
  - Failed running module 'scripts-user' (modules:final)

And in /var/log/cloud-init.log:

util.py[WARNING]: Running module scripts-user (<module 'cloudinit.config.cc_scripts_user'>) failed
util.py[ERROR]: Running module scripts-user (<module 'cloudinit.config.cc_scripts_user'>) failed

The failure is tied to a specific module and stage. Because modules run in order, an early failure (bad YAML, missing datasource) tends to break everything after it, while a late one (a single bad runcmd) may leave most of the instance provisioned.

Symptoms

  • cloud-init status reports error instead of done.
  • The instance boots but expected packages, files, or users are missing.
  • cloud-init-output.log shows a command’s stderr and a non-zero exit code.
  • A specific module name appears under errors: in cloud-init status --long.
cloud-init status --long
status: error
extended_status: error - done
detail: Failed running module 'scripts-user' (modules:final)
errors:
  - Failed running module 'scripts-user' (modules:final)
sudo cat /var/log/cloud-init-output.log | tail -8
Cloud-init v. 24.1.3 running 'modules:final' at Tue, 23 Jun 2026 14:08:01 +0000
+ systemctl enable myapp
Failed to enable unit: Unit file myapp.service does not exist.
Cloud-init v. 24.1.3 finished at Tue, 23 Jun 2026 14:08:02 +0000. Datasource DataSourceEc2Local. Up 21.42 seconds

Common Root Causes

1. Invalid #cloud-config YAML

A syntax error or a value of the wrong type in user-data causes the config modules to fail to parse. The #cloud-config header must be the literal first line.

sudo cloud-init schema --system
Cloud config schema errors: format-l1.c1: File None needs to begin with "#cloud-config"
Error: Invalid schema: user-data

(Or a typed error like runcmd.0: 'apt update' is not of type 'array' when a list item is malformed.)

2. A runcmd or bootcmd command exited non-zero

runcmd runs each entry through a shell; any non-zero exit fails the scripts-user module. This is the single most common cause.

sudo grep -E '\+ |failed|not found|No such' /var/log/cloud-init-output.log | tail -10
+ /opt/bootstrap.sh
/bin/sh: 1: /opt/bootstrap.sh: not found
+ systemctl restart nginx
Failed to restart nginx.service: Unit nginx.service not found.

(The referenced script or unit does not exist yet, so the command fails.)

3. A module not supported on this distro

A module or directive valid on one distro is unavailable on another (for example, an apt-specific package directive on a yum/dnf system), so the module errors out.

sudo journalctl -u cloud-init --no-pager | grep -i 'not supported\|unknown\|no such'
cc_apt_configure.py[WARNING]: Skipping module named apt-configure, no 'apt' package manager found

4. Datasource or network not available

If the metadata service (the datasource) is unreachable, cloud-init cannot fetch user-data, or network-dependent modules fail because the network is not up yet.

cloud-init query userdata 2>&1 | head -3
sudo journalctl -u cloud-init --no-pager | grep -i datasource | tail -5
WARNING: Unable to render user-data
DataSourceEc2.py[WARNING]: Bringing up network interface eth0 failed
util.py[WARNING]: No instance datasource found! Likely bad things to come!

5. Package install failed (repo unreachable)

The packages: module or an apt/dnf command in runcmd fails because a repository is unreachable, a key is missing, or a package name is wrong.

sudo grep -iE 'apt|dpkg|E:|404|Could not' /var/log/cloud-init-output.log | tail -8
Err:1 http://archive.ubuntu.com/ubuntu jammy/main amd64 nginx-core amd64
  Could not connect to archive.ubuntu.com:80 (185.125.190.36), connection timed out
E: Failed to fetch http://archive.ubuntu.com/.../nginx-core_1.18.0.deb
E: Unable to fetch some archives, maybe run apt-get update?

6. write_files permission or path error

A write_files entry targets a directory that does not exist, or uses a permission/owner the running context cannot set, causing the module to fail.

sudo journalctl -u cloud-init --no-pager | grep -i 'write_files\|No such file\|Permission denied'
cc_write_files.py[WARNING]: Failed to write file /etc/myapp/config.yaml
util.py[WARNING]: [Errno 2] No such file or directory: '/etc/myapp/config.yaml'

(The /etc/myapp directory does not exist, so the file cannot be written.)

Diagnostic Workflow

Step 1: Read the full status to find the failing module

cloud-init status --long

The detail: and errors: lines name the exact module (scripts-user, write-files, package-update-upgrade-install) and stage (modules:config, modules:final) that failed.

Step 2: Read the output log for the command stderr

sudo cat /var/log/cloud-init-output.log

This is where runcmd/bootcmd stdout and stderr land — the actual error text (not found, connection timed out, permission denied) lives here.

Step 3: Validate the user-data schema

sudo cloud-init schema --system
cloud-init query userdata

If the schema is invalid, fix the YAML before anything else — a parse failure breaks every config module.

Step 4: Inspect timing and the per-module log

cloud-init analyze blame | head -15
sudo journalctl -u cloud-init --no-pager | tail -60

analyze blame shows which modules ran and how long they took; the journal shows per-module warnings around the failure.

Step 5: Fix, clean, and re-run

After correcting the user-data or the underlying issue (create the missing directory, fix the repo, correct the command), wipe cloud-init’s state so it re-runs on next boot:

sudo cloud-init clean --logs
sudo reboot
# Or re-run a single stage without rebooting (for testing):
sudo cloud-init single --name runcmd

Example Root Cause Analysis

A fleet of new web nodes comes up, but nginx is not installed on any of them. cloud-init status --long reports:

status: error
detail: Failed running module 'package-update-upgrade-install' (modules:config)
errors:
  - Failed running module 'package-update-upgrade-install' (modules:config)

The failing module is package install, so the output log is the place to look:

sudo grep -iE 'Err|404|Could not|E:' /var/log/cloud-init-output.log | tail -6
Err:1 http://repo.internal.example.com/ubuntu jammy/main amd64 nginx
  Could not resolve 'repo.internal.example.com'
E: Unable to locate package nginx

The user-data points packages: at an internal mirror, but the new subnet has no DNS resolver for repo.internal.example.com — cause 5, an unreachable repository rather than a YAML bug. The fix is to correct DNS on the subnet (or point at the public mirror), then re-provision:

sudo cloud-init clean --logs
sudo reboot

After the resolver is reachable, the package module installs nginx and cloud-init status reports done.

Prevention Best Practices

  • Validate every user-data file with cloud-init schema --config-file user-data.yaml in CI before it ever reaches an instance — a bad first line fails the whole config stage.
  • Make runcmd entries idempotent and defensive: guard with command -v, create parent directories before write_files, and avoid referencing units or scripts that later steps create.
  • Pin and pre-validate package repositories (and DNS for internal mirrors) so the package module is not at the mercy of an unreachable archive on first boot.
  • Keep user-data templates in version control with the rest of your infrastructure as code so changes are reviewed and schema-checked, not pasted into a console.
  • Test a representative image end-to-end with cloud-init clean --logs && reboot before baking it into an autoscaling group.
  • When a boot failure pages you mid-rollout, the free incident assistant can map a Failed running module status to the specific module and command at fault.

Quick Command Reference

# Overall status and the failing module
cloud-init status --long

# The command stderr that actually failed
sudo cat /var/log/cloud-init-output.log

# Validate the user-data schema
sudo cloud-init schema --system
cloud-init query userdata

# Per-module timing and journal
cloud-init analyze blame | head -15
sudo journalctl -u cloud-init --no-pager | tail -60

# Reset state and re-run on next boot
sudo cloud-init clean --logs
sudo reboot

# Re-run a single module for testing
sudo cloud-init single --name runcmd

Conclusion

A cloud-init error status with “Failed running module” means one provisioning module raised an exception, leaving the instance partially configured. The usual root causes:

  1. Invalid #cloud-config YAML that the config modules cannot parse.
  2. A runcmd or bootcmd command exiting non-zero.
  3. A module or directive not supported on the running distro.
  4. The datasource or network being unavailable when modules run.
  5. A package install failing because a repository or DNS is unreachable.
  6. A write_files path or permission error.

Start from the module named in cloud-init status --long, read cloud-init-output.log for the real error, fix it, then cloud-init clean --logs and reboot to re-provision cleanly.

Free download · 368-page PDF

Download the Free 500-Prompt DevOps AI Toolkit

500 battle-tested, copy-paste AI prompts engineered by a senior systems engineer — every one with fill-in placeholders and safety/back-out notes. Drop your email and it's yours.

  • 500 prompts: Linux · Kubernetes · Terraform · OpenStack · GitLab · Docker · Monitoring · Incident Response
  • Instant PDF download — yours free, forever
  • Plus one practical AI-workflow email a week (no spam)

Single opt-in · unsubscribe anytime · no spam.