Building a Python Slack Bot for Ops with AI (ChatOps Without the Foot-Guns)

The first ops Slack bot I built did exactly one thing: /whodeployed told you who pushed the last release, so people would stop @-ing me to ask. It saved maybe thirty interruptions a week, and it was a hundred lines of Python. ChatOps is one of the highest-leverage automations a small team can build, because it puts your runbooks where people already are — the chat window — and gives you a free audit log of who ran what. The wiring is well-documented Slack SDK boilerplate, which is exactly what an AI assistant is good at drafting fast.

The catch is that a Slack bot is a command-execution surface exposed to your whole workspace, so the security review is the part you absolutely own. The AI is the quick junior; you decide what the bot is allowed to do.

Start with Bolt and Socket Mode

Slack’s official slack-bolt library plus Socket Mode is the lowest-friction start, because Socket Mode opens an outbound WebSocket and you don’t need a public HTTPS endpoint or to manage request-signature verification yourself.

import os
from slack_bolt import App
from slack_bolt.adapter.socket_mode import SocketModeHandler

app = App(token=os.environ["SLACK_BOT_TOKEN"])

@app.command("/whodeployed")
def who_deployed(ack, respond):
    ack()  # acknowledge within 3 seconds or the user sees a timeout error
    last = get_last_deploy()  # your function
    respond(f"Last deploy: {last['user']} at {last['time']}")

if __name__ == "__main__":
    SocketModeHandler(app, os.environ["SLACK_APP_TOKEN"]).start()

Notice both tokens come from os.environ, never hardcoded. That ack() is load-bearing: Slack expects an acknowledgment within three seconds. Miss it on a slash command and the user sees a timeout error even if the work eventually finishes; miss it on an Events API delivery over HTTP and Slack retries the request, so the handler can run more than once. When an AI drafts a handler, confirm ack() is called before any slow work.
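A small guard makes the "tokens come from the environment" habit fail loudly at startup instead of on the first API call. This is a minimal sketch; require_env is a hypothetical helper name, not part of slack-bolt.

```python
import os


def require_env(name: str) -> str:
    """Return a required environment variable or exit with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        # Fail at startup rather than on the first Slack API call.
        raise SystemExit(f"Missing required environment variable: {name}")
    return value
```

You would then build the app with App(token=require_env("SLACK_BOT_TOKEN")), so a missing token stops the process before it connects.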

Let AI draft the handler shapes

The repetitive part — wiring up commands, parsing arguments, formatting Block Kit responses — is ideal AI work. A prompt I reuse:

Write a slack-bolt slash command /status that takes a service name argument, calls a function check_health(name) returning a dict with up and latency_ms, and replies with a Block Kit message that’s green if up and red if down.

You’ll get a clean handler with Block Kit JSON you’d never enjoy writing by hand. Review it for two things: that it acknowledges fast, and that it handles the empty argument case. The first thing a user does is type /status with no service name, and a handler that indexes into an empty argument list raises an IndexError that ends up in Bolt’s logs while the user gets no useful reply.

@app.command("/status")
def status(ack, respond, command):
    ack()
    name = command["text"].strip()
    if not name:
        respond("Usage: `/status <service>`")
        return
    health = check_health(name)
    color = "🟢" if health["up"] else "🔴"
    respond(f"{color} {name}: {health['latency_ms']}ms")
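The handler above replies with plain text. If you want the Block Kit version the prompt asks for, the message body might look roughly like this. It is a sketch: build_status_blocks is a hypothetical helper, and it assumes check_health returns a dict with up and latency_ms as described in the prompt.

```python
def build_status_blocks(name: str, health: dict) -> list:
    """Build Block Kit blocks for a /status reply (hypothetical helper)."""
    icon = "🟢" if health["up"] else "🔴"
    state = "up" if health["up"] else "down"
    return [
        {
            # A section block carries the headline status line.
            "type": "section",
            "text": {"type": "mrkdwn", "text": f"{icon} *{name}* is {state}"},
        },
        {
            # A context block renders the latency as smaller secondary text.
            "type": "context",
            "elements": [
                {"type": "mrkdwn", "text": f"Latency: {health['latency_ms']} ms"}
            ],
        },
    ]
```

In the handler you would pass the result as respond(blocks=build_status_blocks(name, health), text=f"{name} status"); the text argument is the fallback shown in notifications.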

The authorization gate is yours to write

Here’s where AI help stops and your judgment starts. A bot that runs /restart-service prod-db for anyone in the workspace is a self-inflicted incident. Every command that mutates anything needs an authorization check, and you should not trust an AI to scope this correctly without explicit instruction.

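# Comma-separated Slack user IDs; strip whitespace and drop empty entries
ALLOWED = {u.strip() for u in os.environ["OPS_USER_IDS"].split(",") if u.strip()}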

def authorized(user_id: str) -> bool:
    return user_id in ALLOWED

@app.command("/restart")
def restart(ack, respond, command):
    ack()
    if not authorized(command["user_id"]):
        respond("Not authorized. Ask an on-call engineer.")
        return
    # ... do the restart

Even this is the floor, not the ceiling — for genuinely dangerous actions, add a confirmation step and consider checking Slack’s user groups via the API rather than a static list. The principle: the AI writes the handler, you write and verify the gate, because the AI has no idea which of your commands can take down production.
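Checking a Slack user group instead of a static list could look something like the sketch below. It assumes client is the slack_sdk WebClient that Bolt passes to listeners, that the app has the usergroups:read scope, and that you know the group ID. The function name in_user_group is hypothetical.

```python
def in_user_group(client, user_id: str, group_id: str) -> bool:
    """Return True if user_id is a member of the Slack user group group_id.

    `client` is expected to be a slack_sdk WebClient. Any API failure is
    treated as "not authorized" so the gate fails closed.
    """
    try:
        response = client.usergroups_users_list(usergroup=group_id)
    except Exception:
        # Fail closed: if membership cannot be confirmed, deny.
        return False
    return user_id in response["users"]
```

A handler would add client to its parameters and call in_user_group(client, command["user_id"], OPS_GROUP_ID), where OPS_GROUP_ID is a placeholder for your own group's ID. Failing closed on errors is a deliberate choice here: an outage in the lookup should block restarts, not open them up.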

Pro Tip: Never let a Slack command pass user text straight into a shell. subprocess.run(f"kubectl rollout restart {command['text']}", shell=True) is a remote-code-execution hole: someone types nginx; rm -rf / and you have a very bad day. Pass arguments as a list, never use shell=True, and allowlist the values that reach any command.
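Here is one way the safe version might look. The service names in RESTARTABLE and the deployment/ prefix are assumptions for illustration; substitute whatever your cluster actually uses.

```python
import subprocess

# Hypothetical allowlist: the only deployments the bot may restart.
RESTARTABLE = {"nginx", "api", "worker"}


def build_restart_argv(text: str) -> list[str]:
    """Validate user text against the allowlist and build an argv list."""
    name = text.strip()
    if name not in RESTARTABLE:
        raise ValueError(f"Unknown or disallowed service: {name!r}")
    return ["kubectl", "rollout", "restart", f"deployment/{name}"]


def restart_service(text: str) -> subprocess.CompletedProcess:
    """Run the restart without a shell, so metacharacters are never parsed."""
    return subprocess.run(
        build_restart_argv(text),
        check=True,
        capture_output=True,
        text=True,
        timeout=60,
    )
```

With this shape, nginx; rm -rf / is rejected by the allowlist before anything runs, and even a value that slipped through would reach kubectl as a single literal argument rather than as shell syntax.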

Verify signatures if you ever leave Socket Mode

The moment you switch from Socket Mode to HTTP endpoints (for scale or hosting reasons), request-signature verification becomes mandatory — otherwise anyone who finds your URL can forge commands. Bolt handles this when you give it the signing secret:

app = App(
    token=os.environ["SLACK_BOT_TOKEN"],
    signing_secret=os.environ["SLACK_SIGNING_SECRET"],
)

If an AI scaffolds an HTTP-mode bot and omits the signing secret, that’s a hard reject. An unverified endpoint is an open door. This is exactly the kind of security-relevant omission worth running past a structured code review before it ships.
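You should not need to write this yourself, since Bolt does the check once it has the signing secret. Still, it helps to know what is being verified when you review a scaffold. The sketch below follows Slack's documented v0 scheme: an HMAC-SHA256 over v0:{timestamp}:{body}, compared against the X-Slack-Signature header, with stale timestamps rejected. The function name and the five-minute default are my own choices.

```python
import hashlib
import hmac
import time


def is_valid_slack_signature(signing_secret, timestamp, body, signature,
                             now=None, max_age=300):
    """Check a request against Slack's v0 signing scheme (illustrative only)."""
    now = time.time() if now is None else now
    try:
        age = abs(now - int(timestamp))
    except ValueError:
        return False  # timestamp header was not an integer
    if age > max_age:
        return False  # too old: possible replay

    base = f"v0:{timestamp}:{body}".encode("utf-8")
    digest = hmac.new(signing_secret.encode("utf-8"), base, hashlib.sha256)
    expected = "v0=" + digest.hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

Here timestamp is the X-Slack-Request-Timestamp header and body is the raw, unparsed request body. Verifying against a re-serialized body is a common way to get false rejections.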

Long-running actions: acknowledge, then work

Slack’s three-second rule means anything slow — a deploy, a backup, a query against a cold database — must be acknowledged immediately and run in the background, posting the result when done.

import threading

@app.command("/backup")
def backup(ack, respond, command):
    ack("Backup started, I'll report back…")
    threading.Thread(
        target=lambda: respond(run_backup()), daemon=True
    ).start()

For real workloads you’d use a proper task queue rather than a raw thread, but the shape is the point: respond fast, work async. When the AI gives you a handler that does slow work before ack(), you’ll see timeout errors on slash commands (and duplicate executions on retried events) in testing. That’s your cue to restructure.
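One step up from a raw thread, still short of a real task queue, is a small thread pool that also reports failures instead of losing them. This is a sketch under that assumption; run_in_background is a hypothetical helper, and report stands in for Bolt's respond.

```python
from concurrent.futures import ThreadPoolExecutor

# A bounded pool so a burst of commands cannot spawn unlimited threads.
executor = ThreadPoolExecutor(max_workers=4)


def run_in_background(task, report):
    """Run task() off the request thread and send the outcome to report()."""

    def job():
        try:
            message = f"Done: {task()}"
        except Exception as exc:
            # Surface the failure in Slack rather than losing it in a thread.
            message = f"Failed: {exc}"
        report(message)

    return executor.submit(job)
```

In the /backup handler this becomes ack("Backup started, I'll report back…") followed by run_in_background(run_backup, respond). The returned future is mainly useful in tests, where you can wait on it.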

Keep the tokens and the secrets out of everything

Three rules that never bend. The bot token and app token come from the environment or a secrets manager, never from source — a leaked bot token lets someone impersonate your bot workspace-wide. When you paste handler code into an AI prompt for help, scrub any token, channel ID, or internal hostname first; the model needs the logic, not your credentials. And the bot’s own actions should use a least-privilege Slack app scope — request only the OAuth scopes the commands actually need, not admin.
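For the scrubbing step, a rough pre-paste filter can catch the obvious leaks. The patterns below are heuristics based on the common xox…- and xapp- token prefixes and on channel IDs that start with C. They are assumptions, not a complete detector, and internal hostnames still need a manual pass.

```python
import re

# Heuristic patterns; order matters only for readability.
_PATTERNS = [
    (re.compile(r"\bxox[a-z]-[A-Za-z0-9-]+"), "<SLACK_TOKEN>"),
    (re.compile(r"\bxapp-[A-Za-z0-9-]+"), "<SLACK_APP_TOKEN>"),
    # Channel-like IDs: "C" plus 8+ uppercase alphanumerics containing a digit,
    # so ordinary all-caps constant names are left alone.
    (re.compile(r"\bC(?=[A-Z0-9]*\d)[A-Z0-9]{8,}\b"), "<CHANNEL_ID>"),
]


def scrub(text: str) -> str:
    """Replace token-like and channel-ID-like strings with placeholders."""
    for pattern, placeholder in _PATTERNS:
        text = pattern.sub(placeholder, text)
    return text
```

Run the handler source through scrub() before pasting, then read the result anyway: a filter like this reduces accidents but does not replace looking at what you are about to share.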

Where it goes from here

A good ops bot grows from /whodeployed into a real ChatOps surface that fronts your incident-response and monitoring-alert workflows: acknowledging pages, pulling dashboards, kicking off runbooks. I draft the handlers with Claude or Cursor, write the auth gates myself, and keep the vetted command templates in a prompt workspace. The prompt patterns are in our prompt library and the prompt packs, with related automation in the Bash and Python automation category.

The standing rule

A Slack bot executes commands on behalf of your whole workspace, which makes its blast radius large and its security non-optional. Let the AI draft the Bolt boilerplate and Block Kit fast — it’s genuinely good at it — but you own the three things it can’t judge: who’s authorized, that no user input reaches a shell, and that no token lives in code or in a prompt. Quick junior writes the handlers; the human decides what the bot can do to production.
