Ansible Network Automation for Switches and Routers, Done Safely With AI
Automate Cisco IOS, Arista EOS, and Juniper config with Ansible and network_cli. Resource modules, backups, check-mode dry runs, and where AI helps.
- #iac
- #ansible
- #networking
- #cisco
- #ai
The closest I ever came to taking down a datacenter was a single misplaced no in front of a switchport line. It was 11pm, I was pushing a “trivial” VLAN change to a stack of access switches by hand, and one of those switches happened to carry the management VLAN for half the rack. The good news: I caught it before walking away. The bad news: I caught it because every monitoring dashboard I had open turned red simultaneously. I drove back to the office at midnight to console into a switch I had just orphaned from the network.
That was the night I stopped treating network changes as something you do “carefully by hand” and started treating them like code: version-controlled, idempotent, peer-reviewed, and dry-run before they ever touch a live device. Ansible is how I do that now, and a model like Claude is how I draft the tasks and parse the noisy show output faster. But the safety rails below are non-negotiable — network changes have an enormous blast radius, and a bad push can lock you out of the very box you need to fix.
Why network devices are different
If you have automated Linux with Ansible, your instincts are about to betray you. Network gear breaks two core assumptions Ansible normally makes:
- There is usually no Python on the device. You can’t ship a module to a switch and run it locally the way you would on a server.
- You don’t get a real shell. You talk to a CLI over SSH, sometimes with an enable/privileged mode gate in front of config.
Ansible handles this with the network_cli connection plugin from the ansible.netcommon collection. Instead of executing modules on the target, it runs them on the control node and drives the device’s CLI as a session. The magic variable is ansible_network_os, which tells the connection plugin which platform grammar to speak.
Inventory that speaks network_cli
Get the connection variables right and everything else follows. Here is an inventory grouping three vendors, with credentials pulled from vault rather than hard-coded.
all:
  children:
    cisco_ios:
      hosts:
        access-sw-01:
          ansible_host: "10.10.0.11"
        access-sw-02:
          ansible_host: "10.10.0.12"
      vars:
        ansible_network_os: "cisco.ios.ios"
    arista_eos:
      hosts:
        spine-01:
          ansible_host: "10.10.0.21"
      vars:
        ansible_network_os: "arista.eos.eos"
    juniper_junos:
      hosts:
        edge-01:
          ansible_host: "10.10.0.31"
      vars:
        ansible_network_os: "junipernetworks.junos.junos"
  vars:
    ansible_connection: "ansible.netcommon.network_cli"
    ansible_user: "{{ vault_net_user }}"
    ansible_password: "{{ vault_net_password }}"
    ansible_become: true
    ansible_become_method: "enable"
    ansible_become_password: "{{ vault_enable_secret }}"
A few things worth calling out because they bite everyone:
- ansible_connection must be ansible.netcommon.network_cli, not the default ssh. Set it once at the group level.
- become: true with become_method: enable is how you get into privileged mode on IOS/EOS. Without it, config commands silently fail or error.
- Those vault_* values live in an Ansible Vault file. They never appear in plaintext and (this matters later) they never get pasted into an AI chat window.
Pro Tip: Keep ansible_become_password separate from ansible_password. On plenty of networks the login password and the enable secret differ, and conflating them produces a confusing “unable to enter privileged mode” error that looks like a connectivity problem.
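As a rough sketch of where those vault_* variables might live: the file path and every value below are placeholders invented for illustration, not taken from a real environment. Only the three variable names come from the inventory above.

```yaml
# group_vars/all/vault.yml  (hypothetical path)
# Create or edit it encrypted with: ansible-vault edit group_vars/all/vault.yml
# The values below are placeholders; real secrets belong only in the encrypted file.
vault_net_user: "automation"
vault_net_password: "CHANGE_ME_LOGIN"
vault_enable_secret: "CHANGE_ME_ENABLE"
```

Run the playbook with --ask-vault-pass (or a vault password file) so Ansible can decrypt the values at runtime. Keeping the login password and the enable secret as two separate variables mirrors the split described in the tip above.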
Gathering facts before you touch anything
Before changing config, gather facts. The vendor collections ship fact modules that return structured data — interfaces, VLANs, version, neighbors — which is gold for both pre-change validation and for feeding context to an AI when you ask it to draft a change.
- name: "Collect device state"
  hosts: cisco_ios
  gather_facts: false
  tasks:
    - name: "Gather IOS facts"
      cisco.ios.ios_facts:
        gather_subset:
          - "min"
        gather_network_resources:
          - "vlans"
          - "l2_interfaces"
      register: device_state
    - name: "Show known VLANs"
      ansible.builtin.debug:
        var: device_state.ansible_facts.ansible_network_resources.vlans
Note gather_facts: false at the play level. It turns off the automatic fact gathering that runs at the start of every play, which is built around Python-capable hosts, so the explicit network-specific ios_facts task decides exactly what gets collected.
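The gathered facts can also act as a pre-change gate. The task below is a minimal sketch that would be appended to the tasks list of the play above. It assumes VLAN 99 is the management VLAN that must already exist on the device; that is an assumption made for illustration, so adjust the ID to your own network.

```yaml
# Assumption: VLAN 99 is the management VLAN and must be present before any change.
- name: "Abort if the management VLAN is not defined"
  ansible.builtin.assert:
    that:
      - >-
        device_state.ansible_facts.ansible_network_resources.vlans
        | selectattr('vlan_id', 'equalto', 99)
        | list | length == 1
    fail_msg: "VLAN 99 not found on {{ inventory_hostname }}; stopping before any change."
    success_msg: "VLAN 99 present on {{ inventory_hostname }}."
```

If the assertion fails, the play stops for that host before any config task runs, which is the cheapest possible place to discover that a device is not in the state you expected.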
Declarative config with resource modules
This is where modern Ansible networking gets good. The old pattern was ios_config shoveling raw CLI lines. It works, but it is imperative — you are responsible for idempotency and for not leaving cruft behind. Resource modules like ios_vlans and ios_l2_interfaces are declarative: you describe the desired state and Ansible computes the diff.
- name: "Manage access VLANs declaratively"
  hosts: cisco_ios
  gather_facts: false
  tasks:
    - name: "Ensure VLAN definitions"
      cisco.ios.ios_vlans:
        config:
          - vlan_id: 10
            name: "users"
          - vlan_id: 20
            name: "voice"
          - vlan_id: 99
            name: "mgmt"
        state: "merged"
    - name: "Set access ports to the users VLAN"
      cisco.ios.ios_l2_interfaces:
        config:
          - name: "GigabitEthernet0/2"
            access:
              vlan: 10
          - name: "GigabitEthernet0/3"
            access:
              vlan: 10
        state: "merged"
The state keyword is the whole game. merged adds or updates without removing anything else. replaced makes a single resource match exactly. overridden makes the entire set of that resource type match your config — which means anything not listed gets removed. overridden is powerful and dangerous; that is precisely the kind of line you want a human to sign off on, not an AI to apply unsupervised.
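To make the difference concrete, here is a hedged sketch of the middle option, replaced, scoped to a single port. The interface name and VLAN are reused from the play above. How unlisted attributes are reset can vary between collection versions, so treat the comments as a description of the intent and confirm it with a check-mode run.

```yaml
- name: "Make one access port match the declared config exactly"
  cisco.ios.ios_l2_interfaces:
    config:
      - name: "GigabitEthernet0/2"
        access:
          vlan: 10
    # replaced: L2 settings on Gi0/2 that are not listed here are reset,
    # while interfaces that are not listed at all are left alone.
    # overridden would also reset every unlisted interface.
    state: "replaced"
```

Starting with replaced on a named interface keeps the blast radius to that one port, which makes it a reasonable stepping stone before anyone considers overridden.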
When a resource module does not exist for what you need, fall back to ios_config:
- name: "Apply NTP servers"
  cisco.ios.ios_config:
    lines:
      - "ntp server 10.0.0.1"
      - "ntp server 10.0.0.2"
    match: "line"
Backups, check mode, and diff: your seatbelt
Never push a change without a current backup and a dry run. The netcommon-aware config modules support a backup option that snapshots the running config to your control node before touching anything.
- name: "Backup running config first"
  cisco.ios.ios_config:
    backup: true
    backup_options:
      filename: "{{ inventory_hostname }}.cfg"
      dir_path: "./backups"
Then run the actual change play in check mode with diff:
ansible-playbook -i inventory.yml vlan-change.yml --check --diff --limit access-sw-01
--check makes Ansible report what would change without applying it. --diff prints the exact line-level delta. Run it against one device first with --limit, read the diff like it is a code review, and only then drop --check to apply. If the diff shows something you did not intend — a VLAN being removed, an interface description vanishing — you just caught a midnight outage at 4pm instead.
Pro Tip: Resource modules respect check mode and produce a clean structured diff, which is far easier to eyeball than raw CLI. This is a strong reason to prefer ios_vlans over ios_config when a resource module exists for the thing you are changing.
Where AI actually helps
AI is a genuinely good fit for two slices of this work, and a genuinely bad fit for a third.
Drafting tasks. Describe the target state — “set Gi0/2 through Gi0/8 as access ports on VLAN 10, trunk Gi0/1 with native VLAN 99” — and a model will produce a first-draft ios_l2_interfaces block faster than you can recall the exact module key names. Tools like GitHub Copilot or Cursor do this inline while you write the playbook. Treat the output as a pull request from a fast junior engineer: review every line, confirm the state, and run it through check mode.
Parsing show output. show interface status, show ip bgp summary, and friends are walls of semi-structured text. Pasting that into ChatGPT to summarize which ports are err-disabled, or to generate a starting point for an ansible.utils.cli_parse template (the module formerly shipped as ansible.netcommon.cli_parse), saves real time. If you want this as a repeatable habit rather than ad-hoc pasting, a saved prompt or a curated prompt pack keeps the phrasing consistent across the team.
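Once a model has helped you spot the pattern you care about, the check itself can often be made repeatable in a playbook. The play below is a minimal sketch, assuming the device prints the literal string err-disabled in the status column. It does plain line filtering, not a full cli_parse template.

```yaml
- name: "Find err-disabled ports"
  hosts: cisco_ios
  gather_facts: false
  tasks:
    - name: "Capture interface status"
      cisco.ios.ios_command:
        commands:
          - "show interfaces status"
      register: intf_status

    # stdout_lines[0] holds the output lines of the first (and only) command.
    - name: "List lines that mention err-disabled"
      ansible.builtin.debug:
        msg: "{{ intf_status.stdout_lines[0] | select('search', 'err-disabled') | list }}"
```

A simple filter like this is read-only and safe to run at any time. A structured parser template becomes worth the effort once you need per-field data rather than matching lines.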
What AI must not do: hold your credentials or apply changes on its own. Never paste vault keys, enable secrets, or live device passwords into a chat window — sanitize inventories before sharing them for help. And never wire a model into a pipeline that pushes config to production gear without a human approving the diff. The model is there to draft and explain; the change window, the --check run, and the final sign-off belong to a person.
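One way to keep a person in the loop is to build the approval into the play itself. The following is a sketch under stated assumptions: it reuses access-sw-01 and VLAN 10 from the earlier examples, it relies on the resource module reporting its planned commands in check mode, and it needs an interactive terminal because ansible.builtin.pause waits for keyboard input.

```yaml
- name: "VLAN change with a human approval gate"
  hosts: access-sw-01
  gather_facts: false
  tasks:
    - name: "Dry-run the VLAN change (this task never applies anything)"
      cisco.ios.ios_vlans:
        config:
          - vlan_id: 10
            name: "users"
        state: "merged"
      check_mode: true
      diff: true
      register: dry_run

    - name: "Show the commands the module would send"
      ansible.builtin.debug:
        msg: "{{ dry_run.commands | default([]) }}"

    - name: "Wait for a person to approve"
      ansible.builtin.pause:
        prompt: "Type 'yes' to apply the change above, anything else to skip"
      register: approval

    - name: "Apply only after explicit approval"
      cisco.ios.ios_vlans:
        config:
          - vlan_id: 10
            name: "users"
        state: "merged"
      when: approval.user_input | default('') == 'yes'
```

In a CI pipeline the same idea usually takes the form of a manual approval step between a check-mode job and the apply job. The point is that the decision to apply is made by a person who has read the proposed commands.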
A safe workflow, end to end
Put it together and the loop looks like this:
- AI drafts the task from a plain-English description of desired state.
- You review the YAML and pick the correct state (beware overridden).
- Run with backup: true to snapshot the running config.
- Run --check --diff --limit one-device and read the delta.
- Apply to the canary device, validate with a facts re-gather.
- Roll out to the rest inside an agreed change window.
That sequence is also where a code review of the playbook diff, before it merges, earns its keep: a second set of eyes on a state: overridden line is cheap insurance against a very expensive evening.
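The canary and validation steps can be sketched as a single play. This is an illustration that assumes VLAN 10 is the change being rolled out. serial and any_errors_fatal are standard play keywords, but the batch size and the validation rule here are choices made for the example.

```yaml
- name: "Roll out VLANs one switch at a time"
  hosts: cisco_ios
  gather_facts: false
  serial: 1                # finish one device completely before starting the next
  any_errors_fatal: true   # a failed validation stops the whole rollout
  tasks:
    - name: "Ensure VLAN definitions"
      cisco.ios.ios_vlans:
        config:
          - vlan_id: 10
            name: "users"
        state: "merged"

    - name: "Re-gather VLAN facts after the change"
      cisco.ios.ios_facts:
        gather_subset:
          - "min"
        gather_network_resources:
          - "vlans"
      register: post_state

    - name: "Validate that VLAN 10 now exists"
      ansible.builtin.assert:
        that:
          - >-
            post_state.ansible_facts.ansible_network_resources.vlans
            | selectattr('vlan_id', 'equalto', 10)
            | list | length == 1
```

With serial: 1 the first host in the group acts as the canary. If its validation fails, no other switch is touched.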
Conclusion
Network automation with Ansible turns the scary, irreversible, midnight-console kind of change into something boring and reviewable — backed up, dry-run, diffed, and approved. AI accelerates the tedious parts: drafting resource-module tasks and parsing endless show output. It does not replace your judgment, your change window, or your --check --diff. Keep the credentials out of the chat, keep a human on the diff, and you will never again be driving back to the office at midnight to rescue a switch from a typo.