Building an AI Ops Copilot With Guardrails That Hold

I built our first ops copilot the wrong way. I gave a model a shell tool, a service account with broad permissions, and a cheerful system prompt about “helping the on-call team move faster.” It worked beautifully in the demo and then, three weeks later, proposed restarting a StatefulSet during a backup window because the metric it was staring at looked stuck. It didn’t execute — pure luck, the tool call failed on a typo — but I spent that evening rewriting the whole thing. The lesson stuck: a copilot is only as safe as the layer between what it wants to do and what actually happens.

The mental model that finally worked for me is to treat the model as a fast, tireless junior engineer. Genuinely useful, often right, occasionally confidently wrong, and absolutely not someone you hand the production credentials to on day one. A junior drafts the plan; a human signs off; the system enforces the boundaries the junior doesn’t yet understand. Everything below is about building those boundaries in code rather than hoping the prompt holds.

Split the tool surface: read freely, write behind a gate

The single most important design decision is the shape of your tools. Read tools can run on their own — querying Prometheus, tailing logs, describing a Kubernetes object hurts nothing and gives the model the context it needs to reason. Write tools are different. Every write tool returns a proposal, not a result. It never touches the cluster on its own.

from dataclasses import dataclass, field
from typing import Any, Callable
import uuid

@dataclass
class Proposal:
    action: str
    args: dict[str, Any]
    blast_radius: str          # "pod", "deployment", "namespace", "cluster"
    reversible: bool
    back_out: str              # the exact command to undo this
    # uuid4 keeps IDs unique even when several proposals land in the same second
    proposal_id: str = field(default_factory=lambda: f"prop-{uuid.uuid4().hex[:12]}")

# READ TOOL — runs immediately, no gate
def get_pod_metrics(namespace: str, pod: str) -> dict:
    return prom_query(
        f'rate(container_cpu_usage_seconds_total{{namespace="{namespace}",pod="{pod}"}}[5m])'
    )

# WRITE TOOL — returns a Proposal, executes nothing
def restart_deployment(namespace: str, deployment: str) -> Proposal:
    return Proposal(
        action="restart_deployment",
        args={"namespace": namespace, "deployment": deployment},
        blast_radius="deployment",
        reversible=True,
        back_out=f"kubectl -n {namespace} rollout undo deployment/{deployment}",
    )

The model can call restart_deployment as often as it likes. Nothing happens. It produces a structured object a human reviews. That asymmetry — reads are cheap and immediate, writes are inert until approved — is the whole game. The model gets to be useful at full speed on the half of the job that’s safe.
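
One way to make that asymmetry structural is to route every tool call through a single dispatcher. The sketch below is an assumption on my part, not the code I run: READ_TOOLS, WRITE_TOOLS, PENDING, and dispatch are hypothetical names, the tools are stubs so it runs without a cluster, and Proposal is trimmed to two fields.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Proposal:
    # Trimmed stand-in for the full Proposal dataclass
    action: str
    args: dict[str, Any]

# Hypothetical registries: which tools run immediately, which only propose
READ_TOOLS: dict[str, Callable[..., Any]] = {}
WRITE_TOOLS: dict[str, Callable[..., Proposal]] = {}
PENDING: list[Proposal] = []   # proposals waiting for a human

def dispatch(tool_name: str, **kwargs: Any) -> Any:
    """Route a model tool call: reads return data, writes return a receipt."""
    if tool_name in READ_TOOLS:
        return READ_TOOLS[tool_name](**kwargs)
    if tool_name in WRITE_TOOLS:
        proposal = WRITE_TOOLS[tool_name](**kwargs)
        PENDING.append(proposal)
        # The model learns only that a proposal was queued, never a result
        return {"status": "pending_review", "action": proposal.action}
    raise KeyError(f"unknown tool: {tool_name}")

# Stub tools so the sketch runs without Prometheus or a cluster
READ_TOOLS["get_pod_metrics"] = lambda namespace, pod: {"cpu": 0.42}
WRITE_TOOLS["restart_deployment"] = lambda namespace, deployment: Proposal(
    "restart_deployment", {"namespace": namespace, "deployment": deployment}
)

metrics = dispatch("get_pod_metrics", namespace="payments", pod="api-0")
receipt = dispatch("restart_deployment", namespace="payments", deployment="api")
```

The read call comes back with data; the write call comes back with a receipt and leaves one entry in PENDING. Nothing in this path can reach the cluster, which is the point.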

Pro Tip: Make back_out a required field on every write proposal. If the model (or you) can’t articulate the undo, that’s a strong signal the action shouldn’t be automated yet.
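
A dataclass field with no default is required, but an empty string still slips through. A minimal sketch of enforcing the tip at construction time, assuming a hypothetical CheckedProposal variant and a VALID_RADII tuple that mirrors the four blast-radius values used above:

```python
from dataclasses import dataclass
from typing import Any

VALID_RADII = ("pod", "deployment", "namespace", "cluster")

@dataclass
class CheckedProposal:
    # Hypothetical variant of Proposal that validates itself on creation
    action: str
    args: dict[str, Any]
    blast_radius: str
    reversible: bool
    back_out: str

    def __post_init__(self) -> None:
        if self.blast_radius not in VALID_RADII:
            raise ValueError(f"unknown blast radius: {self.blast_radius!r}")
        # A blank undo command means nobody has thought the action through
        if not self.back_out.strip():
            raise ValueError(f"{self.action}: back_out must not be empty")

ok = CheckedProposal(
    "restart_deployment",
    {"namespace": "payments", "deployment": "api"},
    "deployment",
    True,
    "kubectl -n payments rollout undo deployment/api",
)

try:
    CheckedProposal("scale_down", {}, "deployment", True, "   ")
    rejected = False
except ValueError:
    rejected = True
```

The blank back_out never becomes a proposal, so a reviewer never has to notice that it is missing.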

Wrap every write in a policy layer

Returning a proposal isn’t enough on its own — proposals still need somewhere to be checked before a human even sees them. I put a single guard wrapper around every write tool so the policy lives in one place, not scattered across fifteen tool definitions.

class PolicyViolation(Exception):
    pass

def guard(proposal: Proposal, *, env: str) -> Proposal:
    # Hard denials — never even surface these to a reviewer
    if env == "prod" and proposal.blast_radius == "cluster":
        raise PolicyViolation("cluster-wide actions are not permitted in prod")

    if not proposal.reversible and env == "prod":
        raise PolicyViolation(f"{proposal.action} is irreversible; requires manual runbook")

    # Maintenance-window check for namespace scope and above
    if proposal.blast_radius in ("namespace", "cluster") and not in_maintenance_window():
        raise PolicyViolation(f"{proposal.blast_radius}-scoped change outside maintenance window")

    return proposal

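def execute_if_approved(proposal: Proposal, *, approver: str, env: str) -> str:
    try:
        guard(proposal, env=env)  # re-check policy at execution time
    except PolicyViolation:
        audit_log(proposal, approver=approver, decision="auto-denied", env=env)
        raise
    audit_log(proposal, approver=approver, decision="approved", env=env)
    return TOOL_IMPL[proposal.action](**proposal.args)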

The guard runs twice in spirit: once to reject proposals that should never be offered, and once at execution to re-check that nothing changed between proposal and approval. A reviewer might sit on an approval for ten minutes; the maintenance window can close in that gap. Re-validating at execution time closes that race. This is the same discipline behind confidence-gated auto-remediation — the policy floor doesn’t move just because the model sounds sure.
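
The guard leans on in_maintenance_window(), which I have not shown. Here is a minimal sketch of what it could look like. The schedule (Tuesdays and Thursdays, 02:00 to 04:00 UTC) is invented for illustration, and the optional now parameter is my addition so the race can be replayed without waiting for a real window.

```python
from datetime import datetime, time, timezone
from typing import Optional

# Assumed schedule: Tuesdays and Thursdays, 02:00-04:00 UTC
WINDOW_DAYS = {1, 3}          # datetime.weekday(): Monday is 0
WINDOW_START = time(2, 0)
WINDOW_END = time(4, 0)

def in_maintenance_window(now: Optional[datetime] = None) -> bool:
    """True if `now` (a timezone-aware datetime) falls inside the window."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return now.weekday() in WINDOW_DAYS and WINDOW_START <= now.time() < WINDOW_END

# Replay the race: proposed with five minutes left, approved ten minutes later
proposed_at = datetime(2024, 1, 2, 3, 55, tzinfo=timezone.utc)   # a Tuesday
approved_at = datetime(2024, 1, 2, 4, 5, tzinfo=timezone.utc)

ok_when_proposed = in_maintenance_window(proposed_at)
ok_when_approved = in_maintenance_window(approved_at)
```

The same proposal passes when it is drafted and fails when it is approved, which is exactly the case the execution-time check exists to catch.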

Scope the service account to the blast radius you’ll allow

Prompts are advisory. RBAC is enforced. If your policy layer says “no cluster-wide actions in prod,” the service account the copilot runs as should be physically incapable of cluster-wide actions in prod — so that a bug in your Python is backstopped by the API server itself.

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: payments
  name: copilot-proposer
rules:
  # Read everything in this one namespace
  - apiGroups: ["", "apps"]
    resources: ["pods", "deployments", "events", "replicasets"]
    verbs: ["get", "list", "watch"]
  # Write: patch only, on two named deployments (RBAC cannot narrow a patch to rollout restarts)
  - apiGroups: ["apps"]
    resources: ["deployments"]
    resourceNames: ["api", "worker"]
    verbs: ["patch"]

Notice what’s missing: no delete, no secrets, no wildcard namespaces, no node access. The copilot can read broadly within payments and patch exactly two deployments. Even if the model hallucinated a perfect-looking kubectl delete namespace payments, the call would 403. Never hand the model a credential whose reach exceeds the worst action you’d let it propose. The service account is the blast-radius boundary made real.
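
Roles tend to widen quietly over time, so it can help to lint the manifest in CI. The sketch below is one possible check, not a standard tool: lint_role and the forbidden sets are hypothetical, and the role is written as a Python dict mirroring the YAML above so the example needs no YAML parser. On a live cluster, kubectl auth can-i is a useful complementary check.

```python
# Verbs and resources the copilot role should never hold (assumed policy)
FORBIDDEN_VERBS = {"create", "delete", "deletecollection", "*"}
FORBIDDEN_RESOURCES = {"secrets", "nodes", "*"}
WRITE_VERBS = {"patch", "update"}

def lint_role(role: dict) -> list[str]:
    """Return human-readable problems; an empty list means the role passes."""
    problems = []
    if role.get("kind") != "Role":
        problems.append("kind must be Role (namespaced), not ClusterRole")
    for i, rule in enumerate(role.get("rules", [])):
        verbs = set(rule.get("verbs", []))
        resources = set(rule.get("resources", []))
        for verb in sorted(verbs & FORBIDDEN_VERBS):
            problems.append(f"rule {i}: forbidden verb {verb!r}")
        for res in sorted(resources & FORBIDDEN_RESOURCES):
            problems.append(f"rule {i}: forbidden resource {res!r}")
        # Any write must be pinned to explicitly named objects
        if verbs & WRITE_VERBS and not rule.get("resourceNames"):
            problems.append(f"rule {i}: write verbs without resourceNames")
    return problems

ROLE = {
    "kind": "Role",
    "metadata": {"namespace": "payments", "name": "copilot-proposer"},
    "rules": [
        {"apiGroups": ["", "apps"],
         "resources": ["pods", "deployments", "events", "replicasets"],
         "verbs": ["get", "list", "watch"]},
        {"apiGroups": ["apps"], "resources": ["deployments"],
         "resourceNames": ["api", "worker"], "verbs": ["patch"]},
    ],
}

# A later edit that quietly widens the role
WIDENED = {
    "kind": "Role",
    "rules": [{"apiGroups": ["apps"], "resources": ["deployments"],
               "verbs": ["patch", "delete"]}],
}

clean = lint_role(ROLE)
dirty = lint_role(WIDENED)
```

The role above lints clean; the widened one is flagged for both the delete verb and the unpinned patch.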

Make the audit trail non-optional

Every proposal, every approval, every denial, every execution gets logged with enough context to reconstruct the decision later. Not for compliance theater — for the postmortem you’ll inevitably write.

import json, datetime

def audit_log(proposal: Proposal, *, approver: str, decision: str, env: str):
    record = {
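        # utcnow() is deprecated since Python 3.12; use an aware UTC timestamp
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat().replace("+00:00", "Z"),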
        "proposal_id": proposal.proposal_id,
        "action": proposal.action,
        "args": proposal.args,
        "blast_radius": proposal.blast_radius,
        "back_out": proposal.back_out,
        "approver": approver,          # a human, always
        "decision": decision,           # approved | denied | auto-denied
        "env": env,
    }
    with open("/var/log/copilot/audit.jsonl", "a") as f:
        f.write(json.dumps(record) + "\n")

The approver field is the one that matters most. It is never the model. A human owns the decision and their name is on it. When something goes wrong at 3 AM, the audit log tells you exactly what was proposed, who said yes, and the precise command to undo it. I feed this same stream into our monitoring and alerts dashboard so a spike in denials becomes visible on its own — a sudden run of policy violations usually means the model is reasoning from stale telemetry, which is worth a human glance.
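
The denial-spike signal does not need a dashboard to get started. A rough sketch of the idea over the JSONL records, assuming the ts and decision fields from audit_log; denial_spike, the 15-minute window, and the threshold of five are illustrative choices rather than tuned values.

```python
import json
from datetime import datetime, timedelta, timezone

def parse_ts(ts: str) -> datetime:
    # Older Pythons reject a trailing "Z" in fromisoformat, so normalize it
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

def denial_spike(lines, *, now: datetime,
                 window: timedelta = timedelta(minutes=15),
                 threshold: int = 5) -> bool:
    """True if at least `threshold` auto-denials fall inside the window."""
    cutoff = now - window
    recent = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record["decision"] == "auto-denied" and parse_ts(record["ts"]) >= cutoff:
            recent += 1
    return recent >= threshold

def sample(ts: str, decision: str) -> str:
    return json.dumps({"ts": ts, "decision": decision})

# Five recent auto-denials, one stale denial, one approval
lines = [sample(f"2024-01-02T02:5{i}:00Z", "auto-denied") for i in range(5)]
lines.append(sample("2024-01-02T01:00:00Z", "auto-denied"))
lines.append(sample("2024-01-02T02:58:00Z", "approved"))

now = datetime(2024, 1, 2, 3, 0, tzinfo=timezone.utc)
spiking = denial_spike(lines, now=now)
```

In practice lines would be an open handle on the audit file; passing any iterable of strings keeps the check easy to test.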

Keep the human in the loop where it counts

The temptation, once the gate works, is to start auto-approving “obviously safe” actions. Resist it for longer than feels comfortable. The proposals that look obviously safe are exactly the ones where a tired engineer rubber-stamps without reading — and the model’s worst suggestions are the ones that look reasonable. I keep a tiered approval flow: pod-scoped reversible actions need one ack; anything namespace-scoped needs an explicit confirmation that re-displays the blast radius and back-out path in full.

def review(proposal: Proposal) -> str:
    print(f"PROPOSAL  {proposal.action}({proposal.args})")
    print(f"  blast radius : {proposal.blast_radius}")
    print(f"  reversible   : {proposal.reversible}")
    print(f"  back out     : {proposal.back_out}")
    if proposal.blast_radius in ("namespace", "cluster"):
        print("  ** elevated scope: type the target name to confirm **")
    return input("approve? [name/no] ").strip()

Friction here is a feature. The job of the copilot is to compress the thinking — gathering telemetry, correlating events, drafting the plan — not to compress the deciding. A good copilot turns a forty-minute investigation into a five-minute one and then stops, holding out a clear proposal for a human to accept. For more on choosing which tasks are even worth automating, identifying and eliminating toil with AI is a useful companion.
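
review() returns whatever the reviewer typed, so something still has to decide whether that answer counts. A minimal sketch of the tiered check, under a few assumptions of mine: is_approved and confirm are hypothetical names, target is whatever name the action touches, and deployment-scoped actions sit in the one-ack tier alongside pod-scoped ones.

```python
from typing import Callable

ELEVATED = ("namespace", "cluster")

def is_approved(blast_radius: str, target: str, answer: str) -> bool:
    """Elevated scopes require retyping the target; lower scopes need a yes."""
    answer = answer.strip()
    if blast_radius in ELEVATED:
        return answer == target          # exact match, so no rubber-stamping
    return answer.lower() in ("y", "yes")

def confirm(blast_radius: str, target: str,
            ask: Callable[[str], str] = input) -> bool:
    # `ask` is injectable so the flow can be tested without a terminal
    if blast_radius in ELEVATED:
        prompt = f"type '{target}' to confirm: "
    else:
        prompt = "approve? [y/N] "
    return is_approved(blast_radius, target, ask(prompt))

# Simulated reviewers
quick_yes = confirm("pod", "api-0", ask=lambda _: "y")
lazy_yes = confirm("namespace", "payments", ask=lambda _: "y")
typed_out = confirm("namespace", "payments", ask=lambda _: "payments")
```

A reflexive "y" clears a pod-scoped action but not a namespace-scoped one; the reviewer has to type the name, which forces at least one read of what is about to change.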

Pick the model for reasoning, not for autonomy

I run this copilot on Claude because the part I lean on is the reasoning over messy telemetry — correlating a latency spike with a deploy event and a config change three services upstream. What I explicitly do not lean on is the model’s judgment about whether to act. That judgment lives in the policy layer, the RBAC, and the human at the keyboard. The model is the fast junior who reads everything and drafts a sharp plan. The guardrails are what let me trust the plan enough to look at it.

A copilot worth having is mostly plumbing: read tools that run free, write tools that only ever propose, a policy wrapper, a scoped credential, and an audit log with a human’s name in it. Get those five right and the model can be as eager as it likes — the system stays boring, and boring is exactly what you want at 3 AM. Start small, scope tighter than feels necessary, and widen the gate only after the audit log earns your trust.