Ansible Performance Tuning Prompt

Speed up Ansible playbooks — forks, pipelining, async, smart gathering, fact caching, mitogen.

Target user: Ansible engineers tuning slow runs
Difficulty: Advanced
Tools: Claude, ChatGPT

The prompt

You are a senior automation engineer who has cut Ansible run times by 10× via tuning — forks, pipelining, async tasks, fact caching.

I will provide:
- Playbook runtime
- Host count
- Symptom (slow, blocked on facts, SSH bottleneck)

Your job:

1. **Forks** (parallel hosts):
   - Default 5; way too low for many hosts
   - `forks = 50` to `100` is typical (set in ansible.cfg or with `-f`)
   - Limited by Ansible controller resources + target SSH
2. **Pipelining**:
   - Reduces SSH connections per task
   - Requires sudo without requiretty
   - `pipelining = True` in ansible.cfg
   - 2-3× speedup
3. **Control persist**:
   - SSH connection reuse
   - Default 60s; raise to 600s+
   - Critical for many tasks per host
4. **Smart gathering**:
   - `gathering = smart` — cache facts within run
   - vs `implicit` (every play)
5. **Fact caching**:
   - Persist facts across runs
   - JSON / Redis / Memcached
   - Skip gather_facts if cached
6. **Async tasks**:
   - Long-running tasks (yum install everything)
   - `async: 600` `poll: 0` fire-and-forget
   - Check later
7. **Strategy**:
   - **linear** — default; wait for all hosts at task boundary
   - **free** — each host independent
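   - **host_pinned** — like free, but at most `forks` hosts are active at once and each finishes the play before a waiting host starts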
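8. **Mitogen** (alternative strategy plugin):
   - Replaces Ansible's module execution layer; still connects over SSH
   - Significant speedup on task-heavy plays
   - Supports only specific Ansible versions; check compatibility before adopting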

Mark DESTRUCTIVE: forks too high causing SSH server overload, async without poll on stateful tasks (state inconsistent), removing fact gathering when needed.

---

Playbook runtime: [DESCRIBE]
Host count: [N]
Symptom: [DESCRIBE]

Why this prompt works

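Slow playbooks usually come down to a few settings: parallelism, SSH overhead, and repeated fact gathering. This prompt walks the model through each knob in turn and flags the changes that can cause harm.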

How to use it

  1. Profile first.
  2. Increase forks.
  3. Enable pipelining.
  4. Cache facts.

Useful commands

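# Profile run: per-task timings (ansible-playbook has no --callback flag; use the env var)
ANSIBLE_CALLBACKS_ENABLED=profile_tasks ansible-playbook site.yml

# Total run time plus per-task timings
ANSIBLE_CALLBACKS_ENABLED=timer,profile_tasks ansible-playbook site.yml

# Ansible 2.10 and older: same setting under its previous name
ANSIBLE_CALLBACK_WHITELIST=timer,profile_tasks ansible-playbook site.yml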

# Run with high forks
ansible-playbook site.yml -f 100

ansible.cfg tuning

[defaults]
forks = 100
gathering = smart
fact_caching = jsonfile
fact_caching_connection = /tmp/ansible_facts
fact_caching_timeout = 86400
host_key_checking = True
callbacks_enabled = profile_tasks, timer

[ssh_connection]
pipelining = True
ssh_args = -o ControlMaster=auto -o ControlPersist=600s -o PreferredAuthentications=publickey
control_path_dir = ~/.ansible/cp
control_path = %(directory)s/%%h-%%r
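
Pipelining can also be enabled per group through inventory variables rather than globally, which helps when only some hosts have `requiretty` disabled. A minimal sketch, assuming a hypothetical `group_vars/web.yml` file for the `web` group used in the examples on this page:

```yaml
# group_vars/web.yml (hypothetical path; adjust to your inventory layout)
ansible_ssh_pipelining: true   # same effect as pipelining = True, limited to this group
```

Hosts outside the group keep whatever ansible.cfg specifies.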

Patterns

Async fire-and-forget

- name: Update packages (async)
  apt:
    upgrade: dist
  async: 3600          # max 1 hour
  poll: 0              # fire and forget
  register: async_result   # keeps the job id for the status check below

- name: Other tasks while packages install
  template: ...

- name: Wait for package update
  async_status:
    jid: "{{ async_result.ansible_job_id }}"
  register: status
  until: status.finished
  retries: 120         # 120 x 30s = 1 hour, matching the async limit
  delay: 30
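
When later tasks depend on the result, a non-zero `poll` keeps the task blocking while still bounding its runtime, so ordering is preserved. A minimal sketch, assuming a Debian-family host and an illustrative 15-second polling interval:

```yaml
- name: Upgrade packages and wait for completion
  apt:
    upgrade: dist
  async: 3600   # give up after 1 hour
  poll: 15      # check every 15 seconds; the play waits here until the job ends
```

This form gives no parallelism within the host, but it avoids holding one SSH session open for the whole upgrade.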

Fact caching

[defaults]
gathering = smart
fact_caching = redis
fact_caching_connection = redis.example.com:6379
fact_caching_timeout = 7200

Skip gather_facts when not needed

- hosts: web
  gather_facts: false        # if facts not needed for this play
  tasks:
  - name: Quick task
    shell: ...
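
When a play needs only a few facts, gathering can be narrowed rather than switched off. A sketch assuming the play only reads network facts; the subset name is illustrative and should be chosen to match the facts your tasks actually use:

```yaml
- hosts: web
  gather_facts: true
  gather_subset:
    - "!all"      # drop the full default set (the minimal subset is still collected)
    - network     # add back only what the play reads
  tasks:
  - name: Show the default IPv4 address
    debug:
      var: ansible_facts['default_ipv4']
```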

Strategy: free

- hosts: web
  strategy: free            # each host independent
  tasks:
  - name: Long task
    shell: /long-running.sh
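
For comparison, a sketch of the same play under `host_pinned`. Hosts still proceed independently, but only as many hosts as there are forks are active at once, and each runs the play to the end before a waiting host takes its slot:

```yaml
- hosts: web
  strategy: host_pinned     # each active host runs the play start to finish
  tasks:
  - name: Long task
    shell: /long-running.sh
```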

Smart gathering with cache

# First play: gather + cache
- hosts: all
  gather_facts: true

# Subsequent plays: use cache
- hosts: web
  gather_facts: false       # facts already cached from first
  tasks: ...

Run subset (--limit)

# Test on small subset
ansible-playbook site.yml --limit web-01
ansible-playbook site.yml --limit 'web-0[1-3]'

Common findings this catches

  • Forks=5 with 100 hosts → bump to 50+.
  • Gather facts every play → cache.
  • SSH connections re-established → control persist.
  • Pipelining off → enable.
  • Long-running yum install → async.
  • Stuck on slow host → strategy: free.
  • Tasks running before an async job finishes → poll with async_status first, or run the task synchronously.

When to escalate

  • SSH server capacity — networking.
  • Controller resource scaling — engineering.
  • Mitogen adoption — engineering eval.
