DevOps AI ToolKit
AI for Slack · By James Joyner IV · 9 min read

Building Ops Bots With the Slack Bolt Framework: A From-Scratch Guide

Bolt strips away the HTTP plumbing so you can ship a working Slack ops bot in an afternoon. Here's how I structure a Bolt app that survives production.

  • #slack
  • #bolt
  • #chatops
  • #nodejs
  • #devops
  • #automation

Every Slack bot I built before Bolt started the same way: a little Express server, a signature-verification middleware I copy-pasted and never fully trusted, a switch statement routing payloads, and a creeping suspicion that I’d gotten the 3-second response rule wrong somewhere. Bolt — Slack’s official framework — exists to delete all of that boilerplate so you can spend your time on the part that’s actually yours: the ops logic.

This is how I structure a Bolt app for a DevOps team, from empty directory to something I’d put on-call.

Why Bolt instead of raw HTTP

The Events API has rules that are easy to get subtly wrong. You must respond to event callbacks within 3 seconds, or Slack retries the delivery and can eventually disable event delivery to an endpoint that keeps failing. You must verify the request signature on every inbound call. You must ack interactive payloads before you do any real work. Bolt covers all three: signature verification is built into the HTTP receiver, event callbacks are acknowledged for you, ack() is a first-class argument in command and action listeners, and the app.event / app.command / app.action listeners give you a clean routing layer instead of a payload switch statement.
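To make the signature rule concrete, here is a minimal sketch of the check Bolt's HTTP receiver performs for you, written against Slack's documented v0 signing scheme. The function name verifySlackSignature and its parameter names are mine, not Bolt's, and a real server would also need to capture the raw request body before any JSON or form parsing.

```javascript
const crypto = require('node:crypto');

// Recompute Slack's v0 signature for a raw request body and compare it
// with the value of the X-Slack-Signature header.
// `timestamp` is the X-Slack-Request-Timestamp header value.
function verifySlackSignature({ signingSecret, timestamp, rawBody, signature, now = Date.now() }) {
  // Reject requests more than 5 minutes away from our clock to limit replays.
  const ageSeconds = Math.abs(Math.floor(now / 1000) - Number(timestamp));
  if (!Number.isFinite(ageSeconds) || ageSeconds > 60 * 5) return false;

  const base = `v0:${timestamp}:${rawBody}`;
  const expected =
    'v0=' + crypto.createHmac('sha256', signingSecret).update(base).digest('hex');

  const a = Buffer.from(expected, 'utf8');
  const b = Buffer.from(String(signature), 'utf8');
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}
```

This is exactly the kind of code I used to copy-paste and never fully trust, which is the argument for letting the framework own it.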

You also get Socket Mode for free, which means you can run the bot from behind a firewall with no public endpoint. For internal ops tooling that’s frequently the deciding factor.

The skeleton

const { App } = require('@slack/bolt');

const app = new App({
  token: process.env.SLACK_BOT_TOKEN,
  signingSecret: process.env.SLACK_SIGNING_SECRET,
  socketMode: true,
  appToken: process.env.SLACK_APP_TOKEN,
});

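// A slash command: /deploy status
// getDeployStatus() and renderStatus() are your own helpers, not part of Bolt.
app.command('/deploy', async ({ command, ack, respond }) => {
  await ack();                        // ack FIRST, always
  const [action] = command.text.trim().split(/\s+/);
  if (action === 'status') {
    const status = await getDeployStatus();
    await respond({
      response_type: 'ephemeral',
      blocks: renderStatus(status),
    });
  } else {
    // Never leave an unknown subcommand unanswered.
    await respond({
      response_type: 'ephemeral',
      text: 'Usage: /deploy status',
    });
  }
});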

(async () => {
  await app.start();
  console.log('⚡ ops bot running');
})();

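The single most important habit in that snippet is await ack() as the very first line of every command, action, and shortcut listener. (Bolt acknowledges app.event listeners for you, so they receive no ack argument.) Acking only tells Slack “I received this”; it is not your response. You ack immediately, then do the real work and reply with respond() or client.chat.postMessage(). Keep in mind that respond() posts to the payload's response_url, which Slack accepts up to five times within 30 minutes; for anything slower or chattier, use client.chat.postMessage().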
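The skeleton calls renderStatus() without defining it. A minimal sketch of what such a helper might look like, returning Block Kit blocks, is below. The status shape ({ service, version, environment, healthy, checkedAt }) is an assumption of mine for illustration; yours will depend on what your deploy system reports.

```javascript
// Turn a deploy status object into Block Kit blocks.
// Assumed (hypothetical) shape:
//   { service, version, environment, healthy, checkedAt }
function renderStatus(status) {
  const icon = status.healthy ? ':white_check_mark:' : ':x:';
  return [
    {
      type: 'section',
      text: {
        type: 'mrkdwn',
        text: `${icon} *${status.service}* is on \`${status.version}\` in *${status.environment}*`,
      },
    },
    {
      type: 'context',
      elements: [{ type: 'mrkdwn', text: `Checked at ${status.checkedAt}` }],
    },
  ];
}
```

Keeping rendering in a pure function like this means you can unit-test the message layout without a Slack workspace.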

Structure listeners by domain, not by type

The trap with Bolt is dumping every listener into one file. After three features you have a 600-line app.js and no idea which handler owns which command. I split by operational domain:

src/
  app.js              # wiring + start
  listeners/
    deploys.js        # /deploy command + button actions
    incidents.js      # /incident + modal submissions
    oncall.js         # /oncall + scheduled reminders
  services/
    argocd.js         # external API clients
    pagerduty.js

Each listener module exports a register(app) function:

// listeners/deploys.js
module.exports.register = (app) => {
  app.command('/deploy', handleDeployCommand);
  app.action('deploy_approve', handleApprove);
  app.action('deploy_rollback', handleRollback);
};

And app.js just calls each register. Now adding a feature is adding a file, and your handlers live next to the service clients they call.
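As a rough sketch of that wiring: the two modules are inlined here so the example runs on its own, whereas in the real layout they would be require('./listeners/deploys') and friends. registerAll and createRecordingApp are hypothetical helper names, and the recording app is only a stand-in for a Bolt App so you can see what gets registered.

```javascript
// Each listener module exposes register(app), as in listeners/deploys.js.
const deploys = {
  register: (app) => {
    app.command('/deploy', async ({ ack }) => { await ack(); });
    app.action('deploy_approve', async ({ ack }) => { await ack(); });
  },
};
const incidents = {
  register: (app) => {
    app.command('/incident', async ({ ack }) => { await ack(); });
  },
};

// app.js only needs to hand the app to every module.
function registerAll(app, modules) {
  for (const mod of modules) mod.register(app);
  return app;
}

// Stand-in for a Bolt App that records registrations instead of serving them.
function createRecordingApp() {
  const registered = [];
  return {
    registered,
    command: (name) => { registered.push(`command:${name}`); },
    action: (id) => { registered.push(`action:${id}`); },
  };
}

const fakeApp = registerAll(createRecordingApp(), [deploys, incidents]);
console.log(fakeApp.registered);
// [ 'command:/deploy', 'action:deploy_approve', 'command:/incident' ]
```

The same recording trick doubles as a cheap test that no two modules claim the same command.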

The 3-second rule, the right way

Long-running ops actions — kicking a CI pipeline, querying a slow API — must not block the ack. The pattern is: ack, post a placeholder, do the work, update the message.

app.action('run_migration', async ({ ack, body, client }) => {
  await ack();
  const msg = await client.chat.postMessage({
    channel: body.channel.id,
    text: '⏳ Running migration…',
  });
  let text;
  try {
    const result = await runMigration();   // takes 40s, that's fine
    text = result.ok ? '✅ Migration complete' : `❌ ${result.error}`;
  } catch (error) {
    // Without this, a thrown error leaves the placeholder stuck on "Running".
    text = `❌ Migration failed: ${error.message}`;
  }
  await client.chat.update({
    channel: body.channel.id,
    ts: msg.ts,
    text,
  });
});

The user gets instant feedback, Slack is happy, and the slow work happens off the critical path.

Error handling that doesn’t go silent

An error thrown inside a Bolt listener never reaches Slack: at best it lands in the bot's process logs, and the person who clicked the button sees nothing. Wire up the global error handler and never trust a naked listener in production:

app.error(async (error) => {
  console.error('bolt error', error);
  await notifyOpsChannel(`Bot error: ${error.message}`);
});

I also wrap every external API call in a try/catch that posts a human-readable failure back into the channel. A bot that fails silently is worse than no bot — your team assumes the action worked.
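One way to make that try/catch habit reusable is a small wrapper around the listener itself. This is a sketch, not a Bolt feature: withFailureReport is a hypothetical helper, and it assumes the listener receives a respond function, which Bolt provides for slash commands and for actions that carry a response_url.

```javascript
// Wrap a listener so a failure is reported to the person who triggered it
// instead of only reaching the logs.
function withFailureReport(label, handler) {
  return async (args) => {
    try {
      await handler(args);
    } catch (error) {
      try {
        await args.respond({
          response_type: 'ephemeral',
          text: `:x: ${label} failed: ${error.message}`,
        });
      } catch (replyError) {
        // Reporting is best effort; never let it mask the original error.
        console.error('could not report failure to Slack', replyError);
      }
      // Rethrow so the global app.error handler still sees the failure.
      throw error;
    }
  };
}
```

Usage would look like app.command('/deploy', withFailureReport('Deploy status', handleDeployCommand)), so the user gets a readable message and the ops channel still gets the alert from app.error.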

Keep the AI logic out of the listeners

If your bot summarizes incidents or drafts comms with an LLM, resist putting the prompt inline. Keep prompts in a versioned library and call them through a thin service module so you can iterate on wording without redeploying the bot. We keep a set of reusable ops prompts for exactly this — the listener calls summarize(thread), and what “summarize” means lives in one place.
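A sketch of what that thin service module could look like follows. Everything here is illustrative: the prompt ID, the thread shape ({ user, text }), and especially complete, which stands in for whatever LLM client you use and is assumed to be a function taking a prompt string and returning a promise of text. The prompts are inlined so the example runs; in practice they would be loaded from wherever your versioned library lives.

```javascript
// services/summarizer.js (sketch)
// Prompts are keyed by name and version so wording changes are explicit.
const PROMPTS = {
  'incident-summary@v1': ({ transcript }) =>
    [
      'Summarize this incident thread for an on-call handoff.',
      'List impact, timeline, and open actions.',
      '',
      transcript,
    ].join('\n'),
};

// `complete` is a hypothetical (prompt: string) => Promise<string>.
function createSummarizer({ complete, promptId = 'incident-summary@v1' }) {
  const template = PROMPTS[promptId];
  if (!template) throw new Error(`Unknown prompt: ${promptId}`);

  // thread: array of { user, text } messages, oldest first (assumed shape).
  return async function summarize(thread) {
    const transcript = thread.map((m) => `${m.user}: ${m.text}`).join('\n');
    return complete(template({ transcript }));
  };
}
```

The listener then only ever sees summarize(thread), and swapping the prompt version or the model is a change to one module.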

Deploying it

For a single-team internal bot, Socket Mode plus a systemd unit or a small container is plenty: no ingress, no TLS termination, no public DNS. Set Restart=always, ship the three secrets from the skeleton (bot token, signing secret, app-level token) as environment variables (never in the repo), and you’re done. If you outgrow Socket Mode and need to scale horizontally, switching to the HTTP receiver is a config change, not a rewrite, which is another reason Bolt earns its place.
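To show how small that config change can be, here is a sketch that builds the options object for new App(...) from the environment. buildAppOptions and the SLACK_SOCKET_MODE variable are names I made up for this example; the option keys themselves (token, signingSecret, socketMode, appToken) are the same ones used in the skeleton.

```javascript
// Build the options for `new App(...)` from environment variables.
// SLACK_SOCKET_MODE is a hypothetical switch: anything but 'false'
// keeps the bot in Socket Mode.
function buildAppOptions(env) {
  const base = {
    token: env.SLACK_BOT_TOKEN,
    signingSecret: env.SLACK_SIGNING_SECRET,
  };
  if (env.SLACK_SOCKET_MODE !== 'false') {
    // Socket Mode: outbound WebSocket, needs the app-level token.
    return { ...base, socketMode: true, appToken: env.SLACK_APP_TOKEN };
  }
  // HTTP mode: Bolt's default receiver verifies requests with signingSecret.
  return base;
}
```

With this in place, app.js becomes new App(buildAppOptions(process.env)), and in HTTP mode you pass the port to listen on to app.start().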

Where to go next

Start with one command you run by hand ten times a week — deploy status, on-call lookup, a log tail — and Boltify it. Once the skeleton is in place, every new feature is a listener file and a service client. For the broader pattern library around ChatOps, see the rest of our Slack for ops guides.

Bot-driven actions touch real infrastructure. Gate anything destructive behind explicit confirmation and verify against your own systems before trusting automated output.
