Running Ansible AWX for Self-Service Infrastructure Automation
Ad-hoc playbook runs from someone's laptop don't scale. Here's how to stand up AWX so teams can run automation safely, with audit trails and RBAC.
There’s a predictable lifecycle to Ansible adoption. First one engineer writes a playbook. Then five engineers write playbooks. Then someone runs the wrong one against prod from their laptop with a stale inventory, and suddenly you’re explaining to leadership why the production database got reconfigured.
The fix isn’t more discipline. It’s a control plane. AWX — the open-source upstream of Ansible Automation Platform — turns “SSH in and run a playbook” into a governed service with RBAC, audit logs, secret injection, and a button non-experts can press. After years of running it, here’s how I’d set it up.
What AWX actually gives you
Strip away the marketing and AWX solves four real problems:
- Centralized execution. Playbooks run on the AWX node (or remote execution nodes), not random laptops. One inventory source of truth, one set of credentials.
- RBAC. You decide who can run what against which inventory. The app team gets a “restart service” job template; they cannot touch the firewall playbook.
- Credential injection. Secrets live in AWX’s encrypted store and get injected at runtime. Engineers run jobs without ever seeing the SSH key or cloud token.
- Audit and history. Every run is logged: who, when, what changed, full output. When something breaks at 3am, the timeline is already written.
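The RBAC point is easier to picture as code. Here is a minimal sketch, assuming the awx.awx collection is installed (ansible-galaxy collection install awx.awx) and that CONTROLLER_HOST and CONTROLLER_OAUTH_TOKEN are set in the environment. The team name and template name are hypothetical, and older collection releases may expect the singular job_template parameter instead.

```yaml
# grant-restart-access.yml
# Sketch only: "app-team" and "Restart Service" are hypothetical names.
# Connection details are assumed to come from CONTROLLER_HOST and
# CONTROLLER_OAUTH_TOKEN in the environment.
- name: Let the app team run one job template and nothing else
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Grant execute on the restart template to the app team
      awx.awx.role:
        team: app-team
        role: execute          # run-only: no edit or admin rights
        job_templates:
          - Restart Service
        state: present
```

Because the grant is limited to execute on one template, the team can launch that job but gets no access to any other template, such as the firewall playbook.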
Deploying AWX on Kubernetes
AWX ships as a Kubernetes operator. Once the operator is running in a cluster, a minimal AWX instance is a single custom resource:
# awx-instance.yaml
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx
spec:
  service_type: ClusterIP
  ingress_type: ingress
  ingress_hosts:
    - hostname: awx.internal.example.com
  postgres_storage_class: fast-ssd
  postgres_resource_requirements:
    requests:
      cpu: 500m
      memory: 1Gi
Install the operator with Kustomize, apply the manifest, and the operator provisions the web, task, and Postgres pods. Put it behind your internal ingress and SSO — AWX is not something you expose to the public internet.
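For the Kustomize step, something like the following is a reasonable starting point. Treat it as a sketch: the awx namespace and the pinned 2.19.1 tag are my assumptions, so substitute whichever awx-operator release you have actually validated.

```yaml
# kustomization.yaml
# Sketch only: the namespace and the pinned tag (2.19.1) are assumptions.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: awx
resources:
  # The operator's own manifests, fetched from the upstream repository
  - github.com/ansible/awx-operator/config/default?ref=2.19.1
  # The AWX custom resource defined in awx-instance.yaml
  - awx-instance.yaml
images:
  # Keep the operator image tag in step with the ref above
  - name: quay.io/ansible/awx-operator
    newTag: 2.19.1
```

Apply it with kubectl apply -k . from the directory holding both files. If the first pass rejects the AWX resource because its CRD is not registered yet, applying a second time once the operator is up usually resolves it.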
Project, inventory, credential, template
The mental model is four objects that compose into a runnable job:
- Project — a Git repo containing your playbooks. AWX clones and refreshes it on a schedule or on demand. This is the key discipline: playbooks live in version control, not in the UI.
- Inventory — your hosts. Use a dynamic inventory plugin so AWX queries AWS/Azure/GCP at runtime rather than running against a stale list.
- Credential — SSH keys, cloud tokens, vault passwords, encrypted and injected at runtime.
- Job Template — binds a project + playbook + inventory + credential into a runnable unit with defined variables.
# An inventory source pointing at AWS via the dynamic plugin
plugin: amazon.aws.aws_ec2
regions:
  - us-east-1
keyed_groups:
  - key: tags.Environment
    prefix: env
filters:
  instance-state-name: running
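The same objects can be defined as code rather than clicked together in the UI. Below is a minimal sketch using the awx.awx collection, assuming CONTROLLER_HOST and CONTROLLER_OAUTH_TOKEN are set in the environment. The organization, repository URL, object names, and the aws-readonly credential are all hypothetical and assumed to exist already where referenced.

```yaml
# awx-objects.yml
# Sketch only: organization, repo URL, object names, and the
# "aws-readonly" credential are hypothetical.
- name: Define a project and a dynamic AWS inventory as code
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Project that tracks the playbook repository
      awx.awx.project:
        name: platform-playbooks
        organization: Default
        scm_type: git
        scm_url: https://git.example.com/platform/playbooks.git
        scm_branch: main
        scm_update_on_launch: true   # refresh from Git before each job
        state: present

    - name: Empty inventory that the AWS source will populate
      awx.awx.inventory:
        name: aws-dynamic
        organization: Default
        state: present

    - name: EC2 source carrying the plugin settings
      awx.awx.inventory_source:
        name: aws-ec2
        inventory: aws-dynamic
        source: ec2
        credential: aws-readonly
        update_on_launch: true       # query AWS at launch, not a stale list
        overwrite: true              # drop hosts that no longer exist
        source_vars:
          regions:
            - us-east-1
          keyed_groups:
            - key: tags.Environment
              prefix: env
          filters:
            instance-state-name: running
        state: present
```

Keeping these definitions in the same Git repo as the playbooks means the AWX configuration gets the same review and history as the automation it runs.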
Surveys turn playbooks into safe forms
The feature that actually makes AWX self-service is surveys. A survey is a form that prompts the user for variables before a job runs, with validation and dropdowns instead of free-text footguns.
# Survey spec for a "deploy app version" template
- question_name: "Target environment"
  variable: target_env
  type: multiplechoice
  choices:
    - staging
    - production
  required: true
- question_name: "App version (git tag)"
  variable: app_version
  type: text
  required: true
Now an app developer who has never touched Ansible picks “staging,” types a version, and clicks launch. They cannot mistype the environment name, because it is a dropdown rather than a text field, and they never see a credential. That’s the whole game: constrain the inputs, govern the execution.
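To attach a survey like that to a job template from code, the questions go under a spec key alongside a name and description. This is a sketch using the awx.awx collection, assuming CONTROLLER_HOST and CONTROLLER_OAUTH_TOKEN are set in the environment. The project, playbook, inventory, and credential names are hypothetical and assumed to exist already.

```yaml
# deploy-template.yml
# Sketch only: project, playbook, inventory, and credential names are
# hypothetical.
- name: Publish a surveyed deploy template
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Job template with the survey attached
      awx.awx.job_template:
        name: Deploy App Version
        organization: Default
        project: platform-playbooks
        playbook: deploy.yml
        inventory: aws-dynamic
        credentials:
          - deploy-ssh
        survey_enabled: true         # the survey is ignored unless enabled
        survey_spec:
          name: Deploy app version
          description: Inputs for a routine application deploy
          spec:
            - question_name: Target environment
              variable: target_env
              type: multiplechoice
              choices:
                - staging
                - production
              required: true
            - question_name: App version (git tag)
              variable: app_version
              type: text
              required: true
        state: present
```

The answers arrive in the playbook as ordinary extra variables, so deploy.yml can reference target_env and app_version directly.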
Workflows chain jobs with logic
Real operations are multi-step: provision, configure, smoke-test, notify. AWX workflow templates let you chain job templates with success/failure branching:
[Provision VMs] --on success--> [Configure base] --on success--> [Smoke test]
--on failure--> [Rollback + alert]
This is where AWX stops being “remote Ansible” and starts being an orchestration layer. You can gate a production deploy behind an approval node so a human clicks “approve” before the change proceeds.
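Here is roughly what a gated workflow looks like when defined through the awx.awx collection. It is a sketch under a few assumptions: the four job templates already exist under these hypothetical names, every step routes its failure to the rollback template (my guess at the intended branching), and CONTROLLER_HOST and CONTROLLER_OAUTH_TOKEN are set in the environment.

```yaml
# provision-workflow.yml
# Sketch only: template names are hypothetical, and sending every failure
# to the rollback node is an assumption about the intended branching.
- name: Define a gated provisioning workflow
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Workflow with an approval gate and a failure branch
      awx.awx.workflow_job_template:
        name: Provision and Configure
        organization: Default
        workflow_nodes:
          - identifier: approve
            unified_job_template:
              name: Approve production change
              description: A human approves before anything runs
              type: workflow_approval
              timeout: 900           # seconds to wait for a decision
            related:
              success_nodes:
                - identifier: provision
          - identifier: provision
            unified_job_template:
              name: Provision VMs
              type: job_template
              organization:
                name: Default
            related:
              success_nodes:
                - identifier: configure
              failure_nodes:
                - identifier: rollback
          - identifier: configure
            unified_job_template:
              name: Configure base
              type: job_template
              organization:
                name: Default
            related:
              success_nodes:
                - identifier: smoke
              failure_nodes:
                - identifier: rollback
          - identifier: smoke
            unified_job_template:
              name: Smoke test
              type: job_template
              organization:
                name: Default
            related:
              failure_nodes:
                - identifier: rollback
          - identifier: rollback
            unified_job_template:
              name: Rollback and alert
              type: job_template
              organization:
                name: Default
        state: present
```

Each node is keyed by its identifier, so the branching reads as a plain graph: approval leads to provisioning, and any failed step lands on the rollback node.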
Where AI fits
AWX governs execution, but you still have to author playbooks and surveys. That’s where an assistant earns its keep — generating the survey YAML, drafting a workflow’s branching logic, or converting an existing shell runbook into a job-template-ready playbook. I keep a set of Ansible automation prompts for scaffolding survey specs and dynamic inventory configs, then review and commit them to the project repo. The model writes the boilerplate; AWX enforces the guardrails.
Operational guardrails
A few hard-won lessons:
- Never author playbooks in the UI. Everything comes from Git via projects. The UI is for running, not editing.
- Scope credentials tightly. One credential per environment with least-privilege cloud IAM. A prod deploy credential should not be able to terminate instances.
- Use instance groups for isolation. Run prod jobs on dedicated execution nodes that can reach prod, and nothing else.
- Set job timeouts. A hung playbook holding a lock will ruin your evening. Cap it.
- Back up the database. AWX state — templates, credentials, history — lives in Postgres. Snapshot it.
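Several of these guardrails are just fields on the job template. This sketch uses the awx.awx collection and assumes CONTROLLER_HOST and CONTROLLER_OAUTH_TOKEN are set in the environment. The instance group, credential, project, and inventory names are hypothetical, and the env_production limit assumes instances carry an Environment tag whose value is exactly "production".

```yaml
# prod-template.yml
# Sketch only: all object names are hypothetical, and the limit assumes an
# Environment tag whose value is exactly "production".
- name: Apply guardrails to a production job template
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Prod template with a timeout and a dedicated instance group
      awx.awx.job_template:
        name: Deploy App Version (prod)
        organization: Default
        project: platform-playbooks
        playbook: deploy.yml
        inventory: aws-dynamic
        limit: env_production        # only hosts in the prod keyed group
        credentials:
          - prod-deploy-ssh          # one tightly scoped credential
        instance_groups:
          - prod-execution           # nodes that can reach prod, nothing else
        timeout: 1800                # seconds; AWX cancels the job after this
        state: present
```

With the timeout set on the template, a hung run is cancelled by AWX itself instead of waiting for someone to notice.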
The payoff
The first time a product team self-serves a routine operational task — at 2pm on a Tuesday, without paging you — AWX has paid for itself. You’ve moved from being the bottleneck who runs every playbook to being the platform engineer who curates safe, governed automation other people consume.
Start small: one project, one inventory, one well-surveyed job template that replaces something people currently ping you about. Expand from there. For more on structuring the underlying automation, see our Infrastructure as Code guides.
AI-generated AWX configurations and playbooks are assistive, not authoritative. Validate every job template against a non-production inventory before granting self-service access.