AI-Assisted Ansible Role Refactors Without Breaking Prod

I have a role in one of my repos called webserver that does at least seven things. It installs nginx, configures TLS, sets up logrotate, manages a cron job, opens firewall ports, drops a healthcheck script, and, because someone was in a hurry in 2022, also restarts an unrelated queue worker. Nobody wanted to touch it because nobody could prove a change wouldn’t break something downstream. That fear is exactly what makes role refactors a perfect job for AI assistance, as long as you keep the AI on a short leash.

The principle I work by: AI is a fast junior engineer who can do the mechanical splitting and renaming quickly, but a human reviews every diff, and nothing reaches a real host without check-mode proving behavior is unchanged.

Why roles rot

Roles grow by accretion. Someone needs “just one more task,” the tasks/main.yml grows, variables get added to defaults/ without documentation, and handlers/ accumulate notify chains nobody fully understands. Eventually you have a role that works but can’t be reasoned about. Refactoring it is high-value and high-risk — the kind of work that gets perpetually deferred.

Start by asking AI to map the role, not change it

Before any refactor, I ask AI to build me a behavioral inventory. The prompt:

“Read this Ansible role. List every distinct responsibility it has, the variables each task depends on, the handlers it notifies, and any task that touches something outside the role’s stated purpose. Don’t suggest changes yet.”

This surfaces the surprises. In my webserver example, the AI immediately flagged the queue-worker restart as out of scope — exactly the landmine I’d forgotten. Getting that map before touching code is the whole game.
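I like to cross-check the AI’s inventory against something dumber than the AI. The sketch below is one way to do that with plain grep, assuming the role lives at a path like roles/webserver. The function names are mine, and the patterns are deliberately crude: they only catch the first identifier after "{{" and single-line notify: entries, so treat the output as a second opinion, not a parser.

```shell
# List names referenced as "{{ name ... }}" anywhere under a role directory.
# Crude: captures only the first identifier after "{{", so loop variables
# such as "item" and function names such as "lookup" will show up too.
role_vars() (
  grep -rhoE '\{\{ *[A-Za-z_][A-Za-z0-9_]*' "$1" \
    | sed -E 's/^\{\{ *//' \
    | sort -u
)

# List handler names from single-line "notify: Name" entries in tasks/.
# List-style notify blocks (one handler per "- " line) are not caught.
role_notifies() (
  grep -rhE '^[[:space:]]*notify:[[:space:]]*[^[:space:]]' "$1/tasks" \
    | sed -E 's/^[[:space:]]*notify:[[:space:]]*//' \
    | sort -u
)
```

Running role_vars roles/webserver and role_notifies roles/webserver gives two short lists to hold against the AI’s map. Anything in one and not the other is worth a question before the refactor starts.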

Split responsibilities into focused roles

Once I have the map, I split. AI is fast at the mechanical part: moving tasks into new role skeletons and wiring up the variable references. A monolithic tasks/main.yml becomes a clean dispatcher:

# roles/webserver/tasks/main.yml
- name: "Install and configure nginx"
  ansible.builtin.import_tasks: nginx.yml

- name: "Manage TLS certificates"
  ansible.builtin.import_tasks: tls.yml

- name: "Configure log rotation"
  ansible.builtin.import_tasks: logrotate.yml

Or, better still, genuinely separate roles with a meta/main.yml declaring dependencies:

# roles/webserver/meta/main.yml
dependencies:
  - role: tls_certs
    vars:
      cert_domains: "{{ webserver_domains }}"

The AI proposes the split; I decide where the real seams are. It doesn’t know that tls_certs is reused by three other roles in our org — I do, so I steer it toward extraction rather than duplication.
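Knowing which roles already lean on a shared role is the kind of context I can hand the AI up front. A rough sketch of how I might collect it, assuming all roles sit under one roles/ directory and declare dependencies in meta/main.yml; the function name is hypothetical, and the match is a text pattern, not a YAML parse.

```shell
# Print each <roles_dir>/*/meta/main.yml that lists the given role as
# "- role: NAME" or "- NAME". Text match only: a same-named entry in some
# other list inside meta/main.yml would also match.
roles_depending_on() (
  roles_dir=$1
  name=$2
  grep -lE "^[[:space:]]*-[[:space:]]*(role:[[:space:]]*)?${name}[[:space:]]*\$" \
    "$roles_dir"/*/meta/main.yml 2>/dev/null | sort
)
```

For example, roles_depending_on roles tls_certs prints one meta/main.yml path per dependent role. It won’t see roles pulled in directly from a playbook, so I still grep the playbooks separately.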

Modernize the syntax while you’re in there

Refactors are a good moment to fix deprecated patterns, and AI is excellent at this because it’s pure pattern translation. Old-style bare module names and with_items get modernized:

# Before
- name: install packages
  yum: name={{ item }} state=present
  with_items:
    - nginx
    - certbot

# After
- name: "Install web packages"
  ansible.builtin.package:
    name:
      - nginx
      - certbot
    state: present

I switched yum to the generic package module here, used the FQCN, gave the task a proper capitalized name, and collapsed the loop into the module’s native list support, which is usually faster because the package manager runs once for the whole list instead of once per item. AI proposes all of that in one pass; I confirm package is appropriate for the target distros, since swapping the module is a behavior-relevant change rather than a cosmetic one.
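To see how much of this is left in a role, a quick grep is usually enough. The sketch below is a hypothetical helper, not a linter: it flags with_* loops and key=value shorthand by pattern alone, so it can produce false positives on any line that happens to look like "key: word=value".

```shell
# Print file:line:text for lines that still use with_* loops or key=value
# shorthand arguments (for example "yum: name=foo state=present").
# Returns non-zero when nothing matches, as grep does.
find_legacy_syntax() (
  grep -rnE \
    -e '^[[:space:]]*with_[a-z_]+:' \
    -e '^[[:space:]]*(-[[:space:]]+)?[a-z_.]+:[[:space:]]+[a-z_]+=[^[:space:]]+' \
    "$1"
)
```

I would point it at a tasks directory, for example find_legacy_syntax roles/webserver/tasks, and hand the hits to the AI as the worklist.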

Pro Tip: When modernizing, ask AI to keep a running list of every behavior-changing edit versus every cosmetic edit. The cosmetic ones you can batch-approve; the behavior-changing ones each need their own check-mode run.

Prove behavior is identical

This is where the refactor lives or dies. A refactored role must produce the exact same end state as the original. My proof is a check-mode diff against a host that’s already converged with the old role:

# Host is already in desired state from the old role.
# Run the refactored role in check-mode against it.
ansible-playbook site.yml --check --diff --limit canary-01

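If the refactor is truly behavior-preserving, the play recap reports changed=0 for the host. Any reported change is a behavioral difference I introduced and need to investigate. The reverse is weaker: command and shell tasks are generally skipped in check mode unless they set check_mode: false, so changed=0 says nothing about them and I read those tasks by hand. Even so, that single command has caught more of my refactor mistakes than any code review.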
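Reading the recap by eye works for one host; for a script or CI job I would rather have something that fails loudly. A minimal sketch, assuming the default stdout callback (the one that prints a "PLAY RECAP" banner followed by per-host ok=/changed= counters); the function name is hypothetical.

```shell
# Read ansible-playbook output on stdin, sum the changed= counters that
# appear after the PLAY RECAP banner, and print the total.
# Exit status: 0 if the total is 0, 1 if anything changed,
# 2 if no PLAY RECAP banner was seen at all.
recap_changed_total() (
  awk '
    /^PLAY RECAP/ { in_recap = 1; next }
    in_recap && match($0, /changed=[0-9]+/) {
      # "changed=" is 8 characters; the digits follow it.
      total += substr($0, RSTART + 8, RLENGTH - 8)
    }
    END {
      if (!in_recap) { print "no PLAY RECAP found"; exit 2 }
      print total + 0
      exit (total > 0)
    }
  '
)
```

Usage would look like ansible-playbook site.yml --check --diff --limit canary-01 | recap_changed_total. Note that piping hides ansible-playbook’s own exit code unless the shell has something like bash’s pipefail turned on, so a failed run should be checked separately.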

I also run ansible-lint and a syntax check before anything else:

ansible-lint roles/webserver
ansible-playbook site.yml --syntax-check

Don’t let AI invent variable names

A subtle failure mode: when AI splits a role, it sometimes renames variables to “cleaner” names. That breaks any inventory or group_vars that references the old name. I explicitly instruct it: preserve every variable name exactly; if you think a rename improves clarity, list it separately as a proposal, don’t apply it. Renames are a separate, deliberate change with their own grep-the-whole-repo verification:

# Search the whole repo, not just inventory: playbooks, templates, and other roles can reference the name too.
grep -rn "webserver_old_var" .
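Before any deliberate rename gets that far, it helps to confirm the split itself didn’t quietly rename or drop anything. A rough sketch, assuming variables are declared as top-level keys in defaults/main.yml files; both function names are hypothetical, and the key extraction is a line pattern, not a YAML parse.

```shell
# Print the top-level keys of one or more YAML vars files, sorted and unique.
# Crude: any unindented "name:" line counts as a key.
top_level_keys() (
  grep -hoE '^[A-Za-z_][A-Za-z0-9_]*:' "$@" | tr -d ':' | sort -u
)

# Print keys present in the old defaults file but absent from all new files.
# Usage: renamed_or_dropped_vars OLD_FILE NEW_FILE...
renamed_or_dropped_vars() (
  old=$1
  shift
  old_keys=$(mktemp)
  new_keys=$(mktemp)
  top_level_keys "$old" > "$old_keys"
  top_level_keys "$@" > "$new_keys"
  # comm -23 keeps lines that appear only in the first (old) list.
  comm -23 "$old_keys" "$new_keys"
  rm -f "$old_keys" "$new_keys"
)
```

In practice I would save the pre-refactor defaults with something like git show main:roles/webserver/defaults/main.yml > old_defaults.yml (the branch name is an assumption), then run renamed_or_dropped_vars old_defaults.yml roles/*/defaults/main.yml. Empty output means every original name still exists somewhere.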

Keep secrets and vault out of it

The refactor never includes decrypted vault content. If a task references vault_db_password, the AI sees the variable name and nothing else. I never paste the output of ansible-vault view into a prompt, and I never give a model my vault password. The structure of the role is all it needs; the secrets stay encrypted and stay mine.
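If the AI tool can read the working tree, I also want to know no vault file is sitting there decrypted. A minimal sketch of that check: files encrypted as a whole by ansible-vault start with a "$ANSIBLE_VAULT;" header line, so anything vault-named without it deserves a look. The vault* file naming is an assumption about the repo layout, the function name is hypothetical, and values encrypted inline with !vault are not covered.

```shell
# Print every file under a directory whose name starts with "vault" but
# whose first line lacks the "$ANSIBLE_VAULT;" header, meaning it looks
# decrypted. The vault* naming convention is an assumption; adjust -name.
unencrypted_vault_files() (
  find "$1" -type f -name 'vault*' | sort | while IFS= read -r f; do
    case "$(head -n 1 "$f")" in
      '$ANSIBLE_VAULT;'*) ;;          # header present: still encrypted
      *) printf '%s\n' "$f" ;;        # no header: report it
    esac
  done
)
```

Running unencrypted_vault_files . from the repo root should print nothing. If it prints a path, I re-encrypt that file before any model gets near the directory.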

The human review gate

Every refactor PR gets read by a person, not just the AI. The diff is usually large, so I lean on tooling to focus attention — the code review dashboard helps surface the risky tasks in a big YAML diff, and I keep my refactor prompts in the prompt workspace so they’re consistent across roles.

A good refactor is invisible in production: same behavior, cleaner code, easier next change. AI gets you there faster by doing the mechanical splitting and modernizing, but the safety comes from check-mode proof and human review — not from the model’s confidence. For the rest of this series, see the IaC category, and if you’re comparing tools for this kind of work, Claude versus Cursor for infrastructure is worth a look.

Refactor the scary roles. Just keep the AI doing the typing and keep yourself doing the judging.