Refactoring Ansible When Conditionals With AI: Taming Tangled Logic

A few years into running Ansible at scale, I shipped a role that I was certain disabled SELinux enforcement on a fleet of staging boxes. The task had a tidy when: line, the play went green, and I moved on. Three weeks later a security scan flagged every one of those hosts as still enforcing. The task had never run. Not once. The condition looked correct, read correctly out loud, and was quietly false on every single host because of one tiny mistake in how I compared a variable. Ansible doesn’t warn you when a when: is always false. It just skips, prints a cheerful cyan skipping (the default color for skipped tasks), and lets you believe you did something.

That bug sent me on a long campaign to clean up when: logic across our roles. These days I lean on AI to do the first pass of that cleanup — and I’ve learned exactly where it helps and where it will confidently hand you a foot-gun. Here’s the playbook.

The bare-variable trap that started it all

The single most common when: mistake is comparing a variable to a string when you meant to compare it to a boolean. My SELinux bug was this:

- name: "Set SELinux to permissive"
  ansible.posix.selinux:
    policy: targeted
    state: permissive
  when: disable_selinux == "true"

The disable_selinux variable was set to a real YAML boolean true in group_vars — not the string "true". So true == "true" is false, forever. The fix is to evaluate the variable directly:

- name: "Set SELinux to permissive"
  ansible.posix.selinux:
    policy: targeted
    state: permissive
  when: disable_selinux | bool

A bare when: disable_selinux works too, but I prefer the explicit | bool filter because it normalizes the messy real world: "yes", "on", 1, and True all become a proper boolean. When I ask an AI assistant to review a role, “find every when: that compares against a quoted string and tell me whether the variable is actually a string” is one of the highest-value prompts I run. It catches the silent-skip class of bugs that no linter flags.

Pro Tip: when: my_var and when: my_var == "true" are NOT the same thing. The first is truthy evaluation; the second is a string equality that fails the moment the value is a real boolean. Pick one convention and let AI enforce it across the repo.
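
One way to answer the "is it actually a string?" question before touching any when: is to print the type Ansible sees for the flag. A minimal sketch, assuming the flag is named disable_selinux as above; the diagnostic task is illustrative and not part of the original role:

```yaml
# Illustrative diagnostic: prints the type Ansible sees for the flag.
- name: "Show the type of disable_selinux"
  ansible.builtin.debug:
    msg: "disable_selinux is {{ disable_selinux | type_debug }}"
```

A YAML true should report a boolean type, while a quoted "true" or a key=value extra var reports a string type. The exact type names vary between Ansible versions, so read the output as "bool or not bool" rather than matching on a specific name.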

Don’t wrap when in moustaches

The second thing AI catches well is the Jinja-in-when reflex. People who learn {{ }} everywhere else try to do this:

when: "{{ ansible_facts['os_family'] == 'RedHat' }}"

The when: clause is already a Jinja expression. Wrapping it in {{ }} is redundant at best (Ansible has warned about templating delimiters in conditionals for years) and breaks on newer Ansible at worst. The correct form is bare:

when: ansible_facts['os_family'] == "RedHat"

This is a mechanical, unambiguous fix, which makes it perfect AI work. I’ll feed a whole role to Claude and ask it to strip moustaches from every conditional. It’s the kind of tedious sweep a human gets sloppy on around file number twenty.
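
The reflex often comes from wanting to compare a fact against another variable. Inside when:, variable names are already resolved, so neither side needs delimiters. A small sketch, where target_family is a hypothetical variable introduced only for illustration:

```yaml
# target_family is a hypothetical variable, e.g. set in group_vars.
- name: "Report hosts that match the target OS family"
  ansible.builtin.debug:
    # Moustaches belong here, inside a templated string value.
    msg: "{{ inventory_hostname }} matches {{ target_family }}"
  # No moustaches here: when: is already evaluated as an expression.
  when: ansible_facts['os_family'] == target_family
```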

Defined, default, and the bool filter

Many conditional bugs come from undefined variables, and they fail in the opposite direction from the silent skip: if a when: references a variable that isn’t set, the task errors instead of skipping politely. Defensive conditionals fix this, and AI is good at suggesting the right guard:

# Fragile: errors if feature_flags is never defined
when: feature_flags.enable_caching

# Robust: skips cleanly when undefined or falsy
when: feature_flags.enable_caching | default(false) | bool

The is defined test is the other tool here, and it composes naturally:

when:
  - api_token is defined
  - api_token | length > 0

When I ask AI to harden conditionals, I tell it explicitly: “every variable that comes from optional group_vars must be guarded with default() or is defined.” Without that instruction it tends to assume variables are always present — the same optimistic assumption that bites humans.
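
Two guard patterns along these lines, as a sketch. It assumes feature_flags is an optional dictionary and api_token an optional string, as in the snippets above; the debug tasks are placeholders for real work:

```yaml
# Guard the parent dictionary and the key separately, so the intent is explicit.
- name: "Enable caching only when the flag is present and truthy"
  ansible.builtin.debug:
    msg: "caching would be enabled here"
  when: (feature_flags | default({})).enable_caching | default(false) | bool

# default('', true) replaces undefined, null, and empty values with '',
# so a single length check covers all three cases.
- name: "Call the API only when a non-empty token is available"
  ansible.builtin.debug:
    msg: "API call would happen here"
  when: api_token | default('', true) | length > 0
```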

Combining conditions: lists are AND

This one trips up newcomers constantly. A YAML list under when: is an implicit logical AND. These two are equivalent:

# Implicit AND via a list
# (app_env rather than environment: Ansible reserves that name and warns when a variable uses it)
when:
  - app_env == "production"
  - ansible_facts['os_family'] == "Debian"

# Same thing, written inline
when: app_env == "production" and ansible_facts['os_family'] == "Debian"

The list form is far more readable once you have three or more clauses, and it diffs beautifully in code review. For OR logic you have to be explicit, and parentheses matter once you mix operators:

# app_env rather than environment, which is a reserved name in Ansible
when: >
  (app_env == "staging" or app_env == "production")
  and deploy_enabled | bool

Flattening a snarl of nested when blocks into one clean list is the single thing AI does best here. I’ll paste a 40-line block of duplicated, nested conditions and ask for an equivalent flat list — then I read every clause before I trust it.
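
To make the flattening concrete, here is a sketch of a before and after. The nested blocks, the deploy_enabled flag, and the placeholder debug tasks are illustrative, not taken from a real role. A when: on a block is inherited by every task inside it, so the three clauses below combine with AND in both versions:

```yaml
# Before: three conditions spread over two block levels and the task itself.
- name: "Outer guard"
  when: deploy_enabled | bool
  block:
    - name: "Inner guard"
      when: ansible_facts['os_family'] == "Debian"
      block:
        - name: "Restart the app (nested version)"
          ansible.builtin.debug:
            msg: "restart would happen here"
          when: ansible_facts['distribution_major_version'] is version("11", ">=")

# After: one task, one flat list, the same three clauses in the same order.
- name: "Restart the app (flat version)"
  ansible.builtin.debug:
    msg: "restart would happen here"
  when:
    - deploy_enabled | bool
    - ansible_facts['os_family'] == "Debian"
    - ansible_facts['distribution_major_version'] is version("11", ">=")
```

The flat form pays off when a block wraps a single task. When many tasks share the same guard, keeping the when: on the block avoids repeating the list on each of them.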

When meets loop and register

Two cases deserve special care because the semantics are subtle. First, when: combined with a loop is evaluated per item, not once for the whole task:

- name: "Install only the packages flagged for this host"
  ansible.builtin.package:
    name: "{{ item.name }}"
    state: present
  loop: "{{ optional_packages }}"
  when: item.enabled | bool

Each iteration re-evaluates the condition against its own item. AI sometimes “optimizes” this by hoisting the condition out of the loop, which changes behavior — that’s exactly the kind of edit a human must catch.
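
If filtering before the loop is what you actually want, one behavior-preserving option is to filter the list itself rather than move the when:. A sketch, assuming every entry in optional_packages defines enabled as a real boolean (selectattr with only an attribute name tests plain truthiness, so a string such as "no" would count as enabled):

```yaml
- name: "Install only the packages flagged for this host (pre-filtered)"
  ansible.builtin.package:
    name: "{{ item.name }}"
    state: present
  # Assumes item.enabled is a real boolean on every entry.
  loop: "{{ optional_packages | selectattr('enabled') | list }}"
```

The visible difference is in the output: filtered-out items no longer show up as skipped, because they are never iterated at all.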

Second, conditionals on registered results. After you register: a task, you branch on its return fields — .rc, .stdout, .changed, .failed:

- name: "Check if the migration marker exists"
  ansible.builtin.command: test -f /var/lib/app/.migrated
  register: migration_check
  failed_when: false
  changed_when: false

- name: "Run database migration"
  ansible.builtin.command: /usr/local/bin/migrate.sh
  when: migration_check.rc != 0

Note the failed_when: false and changed_when: false — that probe task should never fail the play or report a change just because the file is absent. AI is genuinely helpful at remembering to add those, because the most common version of this pattern online forgets them and turns every run red or dirty.
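
One wrinkle worth sketching: the command module does not actually execute under --check, so the registered result no longer reflects the host. Depending on the Ansible version, it may lack rc entirely or carry a placeholder value. Marking the read-only probe with check_mode: false is a common way around that. This is a variation on the task above, not a change the original role necessarily needs:

```yaml
- name: "Check if the migration marker exists"
  ansible.builtin.command: test -f /var/lib/app/.migrated
  register: migration_check
  failed_when: false
  changed_when: false
  # Read-only probe, so let it run for real even under --check.
  check_mode: false

- name: "Run database migration"
  ansible.builtin.command: /usr/local/bin/migrate.sh
  when: migration_check.rc != 0
```

Only do this for probes that are genuinely read-only; a task with check_mode: false will touch the host during a dry run.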

Distro and version logic with ansible_facts

Cross-distro roles accumulate when: clauses keyed on facts, and they rot fast. Version comparisons especially need the version test rather than naive string or numeric comparison:

- name: "Use the modern service manager"
  ansible.builtin.systemd:
    name: app
    state: started
  when:
    - ansible_facts['os_family'] == "RedHat"
    - ansible_facts['distribution_major_version'] is version("8", ">=")

distribution_major_version is a string, so > "8" would compare lexically and break on two-digit versions. The version(...) test does the right thing. I lean on AI here to audit every fact-based conditional and flag the ones doing string math on version numbers — a real bug that hides until RHEL hits version 10.
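
The difference can be written down as a pair of assertions. This is a standalone sketch that uses literal strings instead of facts, so it runs on any host:

```yaml
- name: "Show why string comparison breaks on two-digit versions"
  ansible.builtin.assert:
    that:
      # Lexical comparison: the character '1' sorts before '8', so '10' > '8' is false.
      - "not ('10' > '8')"
      # Version-aware comparison: 10 is newer than 8.
      - "'10' is version('8', '>')"
```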

Push complex logic into variables or assert

The cleanest refactor of all is to stop cramming logic into when: and name it instead. If the same gnarly condition appears in five tasks, define it once:

# group_vars/all.yml
# (app_env rather than environment, which is a reserved name in Ansible)
should_deploy: "{{ (app_env in ['staging', 'production']) and deploy_enabled | bool }}"

# In the role's tasks
- name: "Deploy the application"
  ansible.builtin.include_tasks: deploy.yml
  when: should_deploy | bool

Now the intent has a name, the condition lives in one place, and review is trivial. For preconditions that should halt the play loudly rather than skip silently — the lesson my SELinux bug taught me — reach for assert:

- name: "Fail fast on a missing vault token"
  ansible.builtin.assert:
    that:
      - vault_token is defined
      - vault_token | length > 0
    fail_msg: "vault_token must be set before running this role"

A skipped task is invisible; a failed assertion is a wall. When a condition guards something that must happen, AI can help you convert silent skips into explicit asserts across the role.
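
The same idea can be applied to a flag's type. If the repo convention is real booleans, an assert makes a violation loud instead of leaving it to a comparison that quietly evaluates false. A sketch, reusing the disable_selinux name from earlier; the task and its message are illustrative:

```yaml
- name: "Refuse to run when disable_selinux is not a real boolean"
  ansible.builtin.assert:
    that:
      - disable_selinux is defined
      - disable_selinux is boolean
    fail_msg: "disable_selinux must be a YAML boolean (true/false), not a quoted string"
```

Note that this enforces the convention on the variable side only. A when: that still compares against a quoted string needs the review sweep described earlier.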

How to actually work with the AI

AI is a fast, tireless junior engineer for this work, and you should treat it like one. It flattens nested blocks, strips moustaches, and adds default() guards faster than any human, but it does not know your fleet. A human reviews every single change. Run the refactored play in check mode before you trust it: ansible-playbook site.yml --check --diff shows what would change without touching a host, and it’s one of the cheapest ways to catch a condition that flipped meaning during the cleanup. Check mode only predicts changes for modules that support it, so treat a clean dry run as evidence, not proof. And never, ever hand the assistant your vault keys; refactor the logic, keep the secrets encrypted and out of the prompt.

If you want repeatable results, build a small library of conditional-cleanup prompts in the prompt workspace and grab a curated set from the prompt packs or the open prompts library. When the refactor touches anything load-bearing, push the diff through automated code review before it merges. More patterns like this live in the infrastructure-as-code category.

Conclusion

My SELinux task taught me that Ansible’s worst bugs aren’t loud — they’re the conditions that quietly never fire. AI is brilliant at hunting those down at scale, but only because you point it, review it, and dry-run everything it touches. Let it do the tedious flattening; you keep the judgment, the secrets, and the final word.
