Auditing Ansible Playbooks for Secret Leaks With AI and no_log

A junior engineer pinged me last week with a screenshot from our CI runner. There it was, in plain text, scrolled across a job log that anyone with read access to the pipeline could see: the root password we hand to a freshly-imaged OpenStack hypervisor during bootstrap. Nobody had been careless on purpose. The playbook used -v for “better debugging,” a debug task printed a registered variable to confirm the value was set, and Ansible faithfully dumped the secret into the log where it sat for ninety days of retention. The fix took two minutes. The audit to find every other place we’d done the same thing took the rest of the afternoon — and that’s where AI actually earned its keep.

Secrets don’t usually leak because someone commits them to Git. They leak because Ansible is helpful. It echoes module arguments, prints registered results, expands loop items, and streams everything through callback plugins straight into your logs. If you’re not deliberate about no_log, your CI history becomes a searchable secret store.

Where Ansible actually leaks secrets

Before you can audit anything, you need to know the leak surface. In my experience running production playbooks across OpenStack and Kubernetes nodes, these are the recurring offenders.

Registered variables printed with debug. This is the classic. You run a command that returns a token, register it, then print it “just to check.”

- name: Generate a service token
  ansible.builtin.command: vault-cli issue --role bootstrap
  register: svc_token

- name: Confirm token (LEAKY — prints the secret)
  ansible.builtin.debug:
    var: svc_token.stdout

That debug task writes the token to stdout every single run. Even without an explicit debug, the module result is available to any callback plugin that logs task results.

Loop items. When you loop over a list of credentials, Ansible prints each item by default in the per-item result line.

- name: Create database users (LEAKY — passwords appear in the loop label)
  community.mysql.mysql_user:
    name: "{{ item.user }}"
    password: "{{ item.pass }}"
    priv: "{{ item.db }}.*:ALL"
  loop:
    - { user: "app", pass: "s3cr3t-app", db: "app" }
    - { user: "report", pass: "s3cr3t-rep", db: "reporting" }

Each iteration’s item — password included — shows up in the output.

Verbose mode (-v and up). At -v, Ansible prints each task's full result; at -vvv, it adds the module invocation, arguments included. A password passed inline shows up there unless the module itself marks that parameter as no_log. People reach for -v the moment something breaks, which is exactly when secrets are flowing.

Callback logs. The community.general.log_plays callback, or any of the JSON/syslog callbacks shipping task results off-box, will capture whatever the task returned. If the module returns the secret in its result, your central log aggregator now has it.

The no_log pattern

The fix is no_log: true on any task that touches a secret. It tells Ansible to suppress the task’s arguments and results from output and callbacks.

- name: Create database users (safe)
  community.mysql.mysql_user:
    name: "{{ item.user }}"
    password: "{{ item.pass }}"
    priv: "{{ item.db }}.*:ALL"
  loop:
    # db is required here because priv references item.db
    - { user: "app", pass: "{{ vault_app_pass }}", db: "app" }
    - { user: "report", pass: "{{ vault_report_pass }}", db: "reporting" }
  no_log: true

And the registered-token case:

- name: Generate a service token
  ansible.builtin.command: vault-cli issue --role bootstrap
  register: svc_token
  no_log: true

Note what no_log does not do: it doesn’t stop you from later printing svc_token.stdout in a different unguarded task. no_log is per-task. The task that consumes the secret needs the flag too. Treat the secret as radioactive across every task that handles it, not just the one that produces it.
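One way to keep the flag from being forgotten on a consumer is to group the producer and its consumers in a block and set no_log once, since block-level keywords are inherited by the tasks inside. This is a minimal sketch that reuses the token task above; the copy task, its destination path, and the file mode are assumptions for illustration, not part of the original playbook.

```yaml
- name: Issue and install the bootstrap token
  no_log: true   # inherited by every task in this block
  block:
    - name: Generate a service token
      ansible.builtin.command: vault-cli issue --role bootstrap
      register: svc_token

    # Hypothetical consumer: any task that reads svc_token.stdout
    - name: Install the token for the agent
      ansible.builtin.copy:
        content: "{{ svc_token.stdout }}"
        dest: /etc/bootstrap/token   # assumed path
        owner: root
        group: root
        mode: "0600"
```

The trade-off is that a failure anywhere in the block is censored too, so keep such blocks small and limited to tasks that really touch the secret.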

A solid companion practice is keeping the secrets themselves out of the playbook entirely with Ansible Vault. I wrote up a workflow for that in managing Ansible Vault secrets without losing your mind — no_log controls what hits the log, Vault controls what hits the repo. You want both.

The no_log gotchas nobody warns you about

no_log has sharp edges, and the failure modes are the dangerous kind because they fail open — you think you’re protected and you’re not.

Failures still leak. When a task with no_log: true fails, older behavior would sometimes surface data, and module-level tracebacks can still expose arguments depending on the failure path. More importantly, if a templating error occurs before the module runs, the error message can contain the rendered value. Don’t assume no_log is a perfect seal on a crashing task.

Loops and no_log are all-or-nothing. With a loop, no_log: true suppresses the output of every iteration. You can't keep the non-secret item.user visible while hiding item.pass; loop_control.label can shorten what each iteration displays, but it is a readability feature and does not censor the result. That's usually fine, but it makes debugging a 200-item loop miserable, which tempts people to remove the flag temporarily and forget to put it back. Resist.

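Check mode and diff. --diff renders the before/after of a templated file, secrets included, and it is generated by a different path than ordinary task output. Recent ansible-core releases blank the diff for file actions such as template and copy when no_log is set on the task, but that protection depends on your version and on the flag actually being present. If you template a config file full of credentials and the task is unguarded, the diff will happily show them, so disabling diff on secret-bearing renders is cheap insurance either way.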
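A sketch of a defensive render for a credential-bearing file, using the task-level diff keyword so the task never produces before/after output even when the run is started with --diff. The template name, destination path, ownership, and mode are assumptions for illustration.

```yaml
- name: Render haproxy.cfg
  ansible.builtin.template:
    src: haproxy.cfg.j2              # assumed template name
    dest: /etc/haproxy/haproxy.cfg   # assumed destination
    owner: root
    group: root
    mode: "0640"
  diff: false    # no before/after for this file, even under --diff
  no_log: true   # keep the task result out of stdout and callbacks too
```

Setting both keywords is deliberate: each covers a different output path, and neither depends on how the playbook happens to be invoked.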

It hides real errors too. no_log suppresses the genuine error output, so a failed task reports only that its result was censored, with no detail to debug from. The honest move is to flip it off locally on a throwaway host while debugging, never in CI, and turn it back on before committing.
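If editing the flag by hand feels too easy to forget, one common pattern is to template no_log so it defaults to hidden and only opens up on an explicit opt-in. This is a sketch under assumptions: show_secrets and db_users are made-up variable names, not anything Ansible defines.

```yaml
- name: Create database users
  community.mysql.mysql_user:
    name: "{{ item.user }}"
    password: "{{ item.pass }}"
    priv: "{{ item.db }}.*:ALL"
  loop: "{{ db_users }}"   # assumed list of {user, pass, db} dicts
  # Censored unless show_secrets is explicitly set to a true value.
  no_log: "{{ not (show_secrets | default(false) | bool) }}"
```

Locally that would be run as something like ansible-playbook site.yml -l throwaway-host -e show_secrets=true. The committed playbook stays safe by default, but the pattern only holds if CI never sets that variable, so it is worth checking pipeline definitions for it.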

Using AI to find tasks that need no_log

Manually grepping a few hundred roles for missing no_log is exactly the kind of pattern-matching toil where AI shines — and exactly the kind of judgment call where it shouldn’t have the last word. My approach: let the model draft the list of suspects, then I confirm each one. It reads context I’d skim past; I catch the false positives it can’t reason about.

I dump a role’s tasks into a model with a prompt like this:

You are auditing Ansible tasks for secret leakage. For each task below, decide whether it handles a credential, token, key, or password — in its module arguments, in a variable it registers, or in a loop item. Flag any such task that is missing no_log: true. Output a table: task name, the secret-bearing field, and whether no_log is present. Do not flag tasks that only reference public config. Be conservative and explain borderline cases.

The output is genuinely useful as a worklist:

Task name	Secret-bearing field	no_log present?
Generate a service token	registers `svc_token` (token in stdout)	NO — add it
Create database users	`item.pass` in loop	NO — add it
Render haproxy.cfg	templates `vault_stats_pass` into file	partial — task has no_log, but `--diff` will leak
Install base packages	none (public package list)	n/a

That third row is the kind of thing a plain grep for password: would never catch — the secret enters through a template, not a module argument. The model flagged it; I verified the template actually contained the credential before acting. That verify step is non-negotiable. AI will confidently flag a variable named db_password_file that only holds a path, and it’ll occasionally miss a secret hiding behind a vague variable name like cfg.value. You read the diff it proposes the same way you’d read a colleague’s PR.

For a repo-wide sweep, I pair the model with a cheap mechanical pre-filter so I’m not paying to analyze package-install tasks:

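# Surface candidate files: anything mentioning a secret-ish word,
# minus files that already declare no_log somewhere.
# Caveat: this filters whole files, so a file with one guarded task
# and one leaky task is skipped.
# xargs -r (GNU) skips the second grep when nothing matched.
grep -rlE 'pass(word)?:|secret|token|api[_-]?key|private_key' \
  roles/ --include='*.yml' --include='*.yaml' \
| xargs -r grep -L 'no_log' \
| sort -u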

That hands the model a short list of files actually worth reading, and keeps a human — me — looking at the final set of changes. I keep my standard audit prompt versioned alongside the others in my prompt library so the whole team runs the same review, and the broader pile of Ansible workflows lives under the Ansible category.

The workflow that stuck

Here’s what I run now before any playbook touching credentials merges:

1. Mechanical grep to list secret-ish files lacking no_log.
2. Feed that list to the model with the audit prompt; get a table of suspects.
3. Manually confirm each flagged task, especially template and diff paths the model can't fully reason about.
4. Add no_log: true, and for templated secrets, disable diff on the task that renders them or split the secret-bearing render into its own no-diff task.
5. Re-run the leaky scenario at -vvv against a disposable host and grep the output for the actual secret value. If it appears, you missed one.
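The last step can itself be written as a small playbook. The sketch below is a variation on grepping for the real secret: it assumes a vars file (canary_vars.yml, hypothetical) that overrides the secret variables, such as vault_app_pass, with a throwaway canary string, then fails if that string shows up anywhere in the verbose output. The inventory name disposable.ini, the playbook name site.yml, and the canary value are also assumptions.

```yaml
- name: Prove the canary never reaches the log
  hosts: localhost
  gather_facts: false
  vars:
    canary: "CANARY-7f3a9c"   # must match the value set in canary_vars.yml
  tasks:
    - name: Run the playbook under test at high verbosity
      ansible.builtin.command:
        argv:
          - ansible-playbook
          - "-vvv"
          - "-i"
          - disposable.ini
          - site.yml
          - "-e"
          - "@canary_vars.yml"   # extra vars override the Vault values
      register: run
      failed_when: false   # judge the run in the assert below instead
      no_log: true         # the captured output may hold other secrets

    - name: Fail if the run broke or the canary leaked
      ansible.builtin.assert:
        that:
          - run.rc == 0
          - canary not in run.stdout
          - canary not in run.stderr
        quiet: true
```

A canary keeps the real credential out of the check entirely, at the cost of only covering variables the vars file actually overrides.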

That last step is the only proof that counts. The model gives you a fast, thorough first pass; the verbose-run grep tells you the truth. AI drafts the audit, decodes the leak surface, and reviews the diff — but you run the playbook, read the log, and decide what ships. Your CI history will thank you, and so will whoever inherits it.
