Building a ChatOps Bot With Authorization Guardrails

The most dangerous line of code in a ChatOps bot is the one that doesn’t exist: the authorization check nobody wrote. A bot that runs deploy production when anyone types it in a channel is not an automation convenience, it’s a shared root shell with a friendly emoji. And the trap is subtle, because the bot feels personal — you type, it responds to you — so it’s easy to forget that the command executes with the bot’s credentials, not yours. Whoever can send the message inherits the bot’s power.

I learned this the calm way, in review, when a colleague pointed at a draft bot and asked “what stops someone in the public #general channel from running that?” The answer was nothing. We fixed it before it shipped. This guide is that fix, generalized: identity you can trust, default-deny authorization, context constraints, and an audit trail — with AI doing the drafting and you owning the policy.

Trust the Signed Identity, Not the Name

The foundational mistake is authorizing on a display name. Display names can be changed, and on some platforms impersonated, so a check like if user_name == "alice" is security theater. Every major chat platform delivers a signed, stable user ID in its event payload. Slack gives you a user.id inside a request whose signature you verify with your signing secret; that ID is the thing to trust.

def handle_command(event):
    if not verify_slack_signature(event.headers, event.raw_body):
        return reject("bad signature")          # the request isn't even from Slack
    user_id = event["user"]["id"]               # stable, signed — not the display name
    channel_id = event["channel"]["id"]
    command = parse(event["text"])
    decision = authorize(user_id, channel_id, command)
    audit(user_id, channel_id, command, decision)
    if not decision.allowed:
        return reject(decision.reason)
    return execute(command)                       # only now

Note the order. Verify the platform signature first — if the request isn’t genuinely from Slack, nothing else matters. Then resolve the trusted user ID. Then authorize. Then audit. Then, only then, execute. A model will happily draft this skeleton; what you verify is that no path reaches execute without passing authorize, and that authorize reads the signed ID rather than a name anywhere in the chain.
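The signature check itself is short enough to sketch. The version below follows Slack's documented v0 scheme (HMAC-SHA256 over v0:timestamp:body, compared against the X-Slack-Signature header). It is a minimal sketch, not a drop-in: it assumes headers is a plain dict with Slack's header names, it takes the signing secret as an explicit third argument (the skeleton above hides that detail), and the now parameter exists only to make the replay check testable.

```python
import hashlib
import hmac
import time

MAX_SKEW_SECONDS = 60 * 5  # reject requests more than five minutes old (replay guard)


def verify_slack_signature(headers, raw_body, signing_secret, now=None):
    """Return True only if raw_body (bytes) was signed with signing_secret (bytes)."""
    timestamp = headers.get("X-Slack-Request-Timestamp", "")
    received = headers.get("X-Slack-Signature", "")
    try:
        sent_at = int(timestamp)
    except ValueError:
        return False  # missing or malformed timestamp

    current = time.time() if now is None else now
    if abs(current - sent_at) > MAX_SKEW_SECONDS:
        return False  # too old (or too far in the future) to trust

    # Slack signs the string "v0:<timestamp>:<raw body>".
    basestring = b"v0:" + timestamp.encode("utf-8") + b":" + raw_body
    digest = hmac.new(signing_secret, basestring, hashlib.sha256).hexdigest()
    expected = "v0=" + digest

    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))
```

The check must run against the raw request bytes, before any form or JSON parsing, because re-serialized bodies rarely hash to the same value.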

Default-Deny, Mapped by Blast Radius

ChatOps bots grow commands over time, and that growth is where allow-by-default kills you. If the authorization model permits anything not explicitly forbidden, every new command ships unguarded until someone remembers to restrict it — and the someone is usually an incident. Invert it: nothing is allowed unless explicitly granted to a role, and the required role scales with the command’s blast radius.

POLICY = {
    "status":        {"roles": ["everyone"],          "channels": ["any"]},
    "deploy:staging":{"roles": ["dev", "oncall"],     "channels": ["#deploys"]},
    "deploy:prod":   {"roles": ["oncall", "lead"],    "channels": ["#deploys"],
                      "window": "change-window"},
    "scale:down":    {"roles": ["oncall"],            "channels": ["#ops"]},
}

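def authorize(user_id, channel_id, command):
    rule = POLICY.get(command.key)
    if rule is None:
        return Decision(False, "no policy: default deny")   # unknown == denied
    if not user_has_role(user_id, rule["roles"]):
        return Decision(False, "role not permitted")
    if channel_id not in resolve_channels(rule["channels"]):
        return Decision(False, "wrong channel")
    # in_window is a hypothetical helper; without this check the "window" rule is never enforced
    if "window" in rule and not in_window(rule["window"]):
        return Decision(False, "outside change window")
    return Decision(True, "ok")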

The POLICY.get(...) is None → deny line is the entire safety posture in one branch. A read-only status is open to everyone; deploy:prod demands a high role, a restricted channel, and a change window. This is exactly the kind of table AI drafts well — describe your commands and roles and it produces a credible first cut. Your job is to correct the blast-radius classifications, because only you know that scale:down can cause an outage during peak traffic and therefore deserves more guarding than its innocent name suggests.
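The "window": "change-window" entry in the policy only protects anything if something evaluates it. Here is one way that evaluation might look, as a minimal sketch: the WINDOWS table, the in_window name, and the Tuesday-to-Thursday 14:00 to 16:00 UTC hours are all hypothetical and stand in for whatever your change calendar actually says.

```python
from datetime import datetime, time, timezone

# Hypothetical window definitions: weekday numbers (Monday == 0) plus UTC hours.
WINDOWS = {
    "change-window": {
        "weekdays": {1, 2, 3},      # Tuesday, Wednesday, Thursday
        "start": time(14, 0),       # inclusive, UTC
        "end": time(16, 0),         # exclusive, UTC
    },
}


def in_window(window_name, now=None):
    """Return True if `now` falls inside the named window.

    Unknown window names return False, so a typo in the policy denies
    rather than silently allowing (default deny applies here too).
    """
    window = WINDOWS.get(window_name)
    if window is None:
        return False
    # Normalize to UTC so the comparison matches the table above.
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    if now.weekday() not in window["weekdays"]:
        return False
    return window["start"] <= now.time() < window["end"]
```

Passing now explicitly keeps the function testable; production callers would omit it.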

Tie Power to On-Call, and Record Everything

Static role lists rot. The person who should be able to run emergency commands at 3 a.m. is whoever holds the pager right now, not whoever was added to a group eighteen months ago. Sourcing the elevated role from the on-call schedule means authority follows responsibility: the bot asks the schedule “who is on call?” and grants the elevated command to that person only for the duration of their shift.
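A sketch of what schedule-derived authority could look like follows. Everything concrete in it is an assumption made for illustration: the SHIFTS list stands in for a lookup against your paging tool, the user IDs and STATIC_ROLES table are invented, and this user_has_role takes extra now and shifts parameters that the policy code above does not show.

```python
from datetime import datetime, timezone

UTC = timezone.utc

# Hypothetical shift data; a real bot would fetch this from the paging tool.
SHIFTS = [
    {"user_id": "U111", "start": datetime(2024, 1, 1, tzinfo=UTC),
     "end": datetime(2024, 1, 8, tzinfo=UTC)},
    {"user_id": "U222", "start": datetime(2024, 1, 8, tzinfo=UTC),
     "end": datetime(2024, 1, 15, tzinfo=UTC)},
]

# Hypothetical long-lived roles, e.g. synced from IdP groups.
STATIC_ROLES = {"U111": {"dev"}, "U222": {"dev"}, "U333": {"lead"}}


def current_oncall(shifts, now):
    """Return the set of user IDs whose shift covers `now`."""
    return {s["user_id"] for s in shifts if s["start"] <= now < s["end"]}


def user_has_role(user_id, allowed_roles, now=None, shifts=SHIFTS):
    """True if the user holds any allowed role at time `now`.

    The "oncall" role is never stored: it is derived from the schedule on
    every call, so it expires by itself when the shift ends.
    """
    now = now or datetime.now(UTC)
    roles = {"everyone"} | STATIC_ROLES.get(user_id, set())
    if user_id in current_oncall(shifts, now):
        roles.add("oncall")
    return bool(roles & set(allowed_roles))
```

Deriving the role at decision time, rather than caching it at login, is the design choice that makes the grant end with the shift.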

Auditing is the non-negotiable backstop. Every command, allowed or denied, gets logged with who, what, where, when, and the policy version that decided it.
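One plausible shape for such a log entry is a single JSON line per decision. The sketch below is an assumption about layout, not a prescribed schema: the field names, the POLICY_VERSION string, and the audit_record helper are all hypothetical.

```python
import json
from datetime import datetime, timezone

POLICY_VERSION = "2024-01-15.1"  # hypothetical; bump whenever the policy table changes


def audit_record(user_id, channel_id, command_key, allowed, reason, now=None):
    """Build one JSON log line covering who, what, where, when, and policy version."""
    now = now or datetime.now(timezone.utc)
    record = {
        "when": now.isoformat(),          # timezone-aware UTC timestamp
        "who": user_id,                   # the signed user ID, never a display name
        "where": channel_id,
        "what": command_key,
        "allowed": allowed,               # denials are logged as well as approvals
        "reason": reason,
        "policy_version": POLICY_VERSION,
    }
    # sort_keys gives a stable field order, which makes lines easy to diff and grep.
    return json.dumps(record, sort_keys=True)
```

Recording the policy version alongside the decision lets you answer later whether a command was allowed because of a rule or because of a gap in the rules at the time.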

Prompt: “Here are my ChatOps commands and our IdP groups. Draft a default-deny authorization policy mapping each command to required roles, channels, and time windows by blast radius. Then write the authorization function so unknown commands are denied, and produce a test matrix of allowed and denied invocations including an impersonation attempt and a wrong-channel attempt.”

What it returns: a policy table, a default-deny authorize function, and a test matrix that explicitly includes the abuse cases — which is what turns “looks secure” into something you can assert in CI.

Verify the Denials, Not Just the Approvals

It is tempting to test a ChatOps bot by confirming the allowed commands work. That proves nothing about safety. The tests that matter are the denials: an unauthorized user is refused, a command in the wrong channel is refused, a forged or impersonated identity is refused, and every refusal produces no partial side effect plus an audit entry. Run these in a sandbox workspace before the bot touches a real channel.
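The denial cases can be written as a small table-driven test. The sketch below is self-contained, so it carries its own stripped-down policy, role table, and handler; the user IDs, channel IDs, and the handle and run_denial_matrix names are invented for illustration and are not the bot's real code.

```python
from collections import namedtuple

Decision = namedtuple("Decision", "allowed reason")

# Stripped-down stand-ins for the real policy and role lookup.
POLICY = {"deploy:prod": {"roles": {"oncall"}, "channels": {"C_DEPLOYS"}}}
ROLES = {"U_ONCALL": {"oncall"}, "U_DEV": {"dev"}}


def authorize(user_id, channel_id, command_key):
    rule = POLICY.get(command_key)
    if rule is None:
        return Decision(False, "no policy: default deny")
    if not ROLES.get(user_id, set()) & rule["roles"]:
        return Decision(False, "role not permitted")
    if channel_id not in rule["channels"]:
        return Decision(False, "wrong channel")
    return Decision(True, "ok")


def handle(user_id, channel_id, command_key, audit_log, executed):
    """Authorize, audit, then execute; `executed` records side effects."""
    decision = authorize(user_id, channel_id, command_key)
    audit_log.append((user_id, channel_id, command_key, decision.allowed))
    if decision.allowed:
        executed.append(command_key)
    return decision


# Each row: user, channel, command, expected denial reason.
DENIALS = [
    ("U_DEV", "C_DEPLOYS", "deploy:prod", "role not permitted"),       # wrong role
    ("U_ONCALL", "C_GENERAL", "deploy:prod", "wrong channel"),         # wrong channel
    ("U_ONCALL", "C_DEPLOYS", "drop:database", "no policy: default deny"),
    ("alice", "C_DEPLOYS", "deploy:prod", "role not permitted"),       # name posing as an ID
]


def run_denial_matrix():
    """Assert every abuse case is refused, has no side effect, and is audited."""
    for user_id, channel_id, command_key, expected_reason in DENIALS:
        audit_log, executed = [], []
        decision = handle(user_id, channel_id, command_key, audit_log, executed)
        assert not decision.allowed
        assert decision.reason == expected_reason
        assert executed == []          # no partial side effect
        assert len(audit_log) == 1     # the refusal was still recorded
    return len(DENIALS)
```

Adding a row to DENIALS whenever a new command ships keeps the abuse cases growing alongside the command set.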

The pattern mirrors every other guardrail in AI for Automation: the AI accelerates the drafting of policy and tests, but the load-bearing judgments — which commands are dangerous, whose identity to trust, what default to apply — stay human. A ChatOps authorization bug fails open and in public, so the verification step is not optional. For the design checklist behind this, see the ChatOps RBAC command authorization prompt.
