AI for Slack · By James Joyner IV · 9 min read

Building Slack Socket Mode Apps for Ops: Ditch the Public Endpoint

Socket Mode lets your Slack ops bot run behind the firewall with no inbound port and no public URL. Here's how to build one that survives reconnects and production.

  • #slack
  • #socket-mode
  • #bolt
  • #chatops
  • #devops
  • #python

The first ops bot I ever shipped lived behind an Nginx reverse proxy, a public DNS record, and a TLS cert I had to remember to renew — all so Slack could POST events to it. For an internal tool that only my team used, that was a lot of attack surface and a lot of moving parts. Socket Mode deletes all of it.

With Socket Mode, your app opens an outbound WebSocket to Slack and receives events over that connection. No inbound port. No public URL. No request-signature dance on every payload. The bot can run on a box inside your VPC, on your laptop while you develop, or in a Kubernetes pod with no ingress at all. For internal DevOps tooling, that’s almost always the right model.

How Socket Mode actually works

In the default HTTP model, Slack sends each event as a POST to a URL you host, and you verify the request signature. In Socket Mode, the flow inverts:

  1. Your app authenticates with an app-level token (starts with xapp-) and asks Slack for a WebSocket URL.
  2. It opens that WebSocket.
  3. Slack pushes events, slash commands, interactions, and shortcuts down the socket as they happen.
  4. Your app acks each one within 3 seconds and responds using a normal bot token (xoxb-).
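To make step 4 concrete, here is a minimal sketch of the ack itself, assuming the envelope shape described in Slack's Socket Mode documentation (an `envelope_id`, a `type`, a `payload`, and an `accepts_response_payload` flag). Bolt does this for you; `build_ack` is a hypothetical helper written only to show what goes back over the socket.

```python
import json


def build_ack(raw_message, response_payload=None):
    """Return the JSON ack for one Socket Mode message, or None if no ack is due.

    Assumption: messages without an envelope_id (such as "hello" or
    "disconnect") are connection housekeeping and are not acknowledged.
    """
    message = json.loads(raw_message)
    envelope_id = message.get("envelope_id")
    if envelope_id is None:
        return None

    ack = {"envelope_id": envelope_id}
    # Only attach a response body when Slack says this envelope accepts one.
    if response_payload is not None and message.get("accepts_response_payload"):
        ack["payload"] = response_payload
    return json.dumps(ack)


if __name__ == "__main__":
    incoming = json.dumps({
        "envelope_id": "abc-123",  # made-up ID for illustration
        "type": "slash_commands",
        "accepts_response_payload": True,
        "payload": {"command": "/k8s", "text": "pods"},
    })
    print(build_ack(incoming))
```

The ack carries nothing but the envelope ID unless you choose to reply inline, which is why acking first and doing the slow work afterwards costs you nothing.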

You enable it in the app config under Socket Mode, generate an app-level token with the connections:write scope, and subscribe to the events you care about. Crucially, the bot token scopes and event subscriptions are identical to the HTTP model — only the transport changes. You can develop in Socket Mode and switch to HTTP for distribution later without rewriting handlers.

A minimal bot that does something useful

Here’s a Bolt for Python app in Socket Mode that responds to a slash command by running a read-only diagnostic. Note there’s no web framework and no exposed port:

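import os
import subprocess

from slack_bolt import App
from slack_bolt.adapter.socket_mode import SocketModeHandler

# Tokens come from the environment, never from source control.
app = App(token=os.environ["SLACK_BOT_TOKEN"])

# Only these kubectl resource types may be queried from Slack.
ALLOWED = {"pods", "nodes", "events"}

@app.command("/k8s")
def k8s_status(ack, respond, command):
    ack()  # ack within 3 seconds, always, before any slow work
    resource = (command.get("text") or "").strip()
    if resource not in ALLOWED:
        respond(f"Allowed: {', '.join(sorted(ALLOWED))}")
        return
    try:
        out = subprocess.run(
            ["kubectl", "get", resource, "-o", "wide"],
            capture_output=True, text=True, timeout=10,
        )
    except subprocess.TimeoutExpired:
        respond("kubectl timed out after 10 seconds.")
        return
    # Show stderr when kubectl fails; truncate to keep the message short.
    body = out.stdout if out.returncode == 0 else out.stderr
    respond(f"```\n{body[:3500]}\n```")

if __name__ == "__main__":
    SocketModeHandler(app, app_token=os.environ["SLACK_APP_TOKEN"]).start()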

Two things to notice. First, ack() fires immediately: if Slack gets no ack within three seconds it shows the user a timeout error, so you ack before doing any real work. Second, the command allow-lists what it’ll run. A bot that shells out to whatever a user types is a remote code execution hole with a friendly UI. Allow-list, always.
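The allow-list is easier to keep honest if it lives in small pure functions you can unit-test without Slack or a cluster. The sketch below is one way to do that; `parse_resource`, `kubectl_argv`, and `as_code_block` are hypothetical helper names, and the 3500-character limit simply mirrors the truncation used in the handler.

```python
from typing import List, Optional

ALLOWED = {"pods", "nodes", "events"}
FENCE = "`" * 3  # Slack's code-block delimiter, built here to keep this listing readable


def parse_resource(text: str) -> Optional[str]:
    """Return the allow-listed resource named by the command text, else None."""
    candidate = text.strip()
    return candidate if candidate in ALLOWED else None


def kubectl_argv(resource: str) -> List[str]:
    """Build the argv list for kubectl; no shell is involved, so nothing is interpolated."""
    if resource not in ALLOWED:
        raise ValueError(f"resource not allowed: {resource!r}")
    return ["kubectl", "get", resource, "-o", "wide"]


def as_code_block(output: str, limit: int = 3500) -> str:
    """Truncate command output and wrap it in a Slack code block."""
    # Neutralize any fence inside the output so it cannot close the block early.
    body = output[:limit].replace(FENCE, "'''")
    return f"{FENCE}\n{body}\n{FENCE}"


if __name__ == "__main__":
    print(parse_resource("pods"))            # an allowed resource passes through
    print(parse_resource("pods; rm -rf /"))  # anything else is rejected
```

Because the text must match an allowed value exactly, extra arguments, flags, and shell metacharacters are rejected rather than sanitized.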

The reconnect problem nobody warns you about

The demo works on your laptop. Then it runs for a week in production and one morning the bot is silently dead. WebSockets drop — network blips, Slack rotating connections, a deploy on their side. A production Socket Mode app must handle reconnection, and you must not assume the library does it perfectly.

Bolt’s Socket Mode client reconnects automatically in most cases, but I still wrap it for resilience:

  • Run under a process supervisor. systemd with Restart=always and a RestartSec delay means a hard crash self-heals without hammering Slack in a tight restart loop.
  • Add a heartbeat log so you can alert on silence. If the bot hasn’t logged a received event or a ping in N minutes during business hours, page yourself.
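  • Don’t scale out by accident. Slack allows several simultaneous Socket Mode connections per app (up to 10, per its documentation) and delivers each payload to just one of them, in no guaranteed pattern. Two replicas therefore won’t run every command twice, but they will split traffic unpredictably, and any in-memory state (dedup caches, rate limits, conversation context) lives on only one of them. If you want HA, run the extra connections deliberately and make your handlers idempotent; don’t just scale the Deployment to 2 and hope.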
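The heartbeat idea can be sketched in a few lines. This is an illustration rather than anything Bolt ships: `Heartbeat` is a hypothetical class, and the injectable clock exists only so the logic can be tested without waiting.

```python
import time


class Heartbeat:
    """Tracks how long it has been since the bot last saw any traffic."""

    def __init__(self, max_silence_seconds, clock=time.monotonic):
        self._max_silence = max_silence_seconds
        self._clock = clock
        self._last_seen = clock()

    def beat(self):
        """Record activity; call this whenever an event or ping arrives."""
        self._last_seen = self._clock()

    def silent_for(self):
        """Seconds elapsed since the last recorded activity."""
        return self._clock() - self._last_seen

    def is_stale(self):
        """True once the silence exceeds the configured threshold."""
        return self.silent_for() > self._max_silence


if __name__ == "__main__":
    hb = Heartbeat(max_silence_seconds=300)
    hb.beat()
    print(f"silent for {hb.silent_for():.1f}s, stale={hb.is_stale()}")
```

One plausible wiring is to call beat() from a Bolt global middleware so every incoming request counts, and to log or export silent_for() on a timer so your existing alerting can page on it.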

A systemd unit for the bot is boring and correct:

[Service]
ExecStart=/usr/local/bin/python /opt/slackbot/app.py
Restart=always
RestartSec=5
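# Hypothetical path: a root-only (0600) file defining SLACK_BOT_TOKEN and SLACK_APP_TOKEN,
# so the tokens stay out of the unit file itself.
EnvironmentFile=/etc/slackbot/env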

Securing the two tokens

Socket Mode drops request-signature verification, so the app’s security rests entirely on two long-lived tokens and token hygiene matters more:

  • The app-level token (xapp-) can open connections that receive all your subscribed events. Treat it like a master key.
  • The bot token (xoxb-) carries every scope you granted. Scope it minimally: if the bot only posts to and reads one public channel, chat:write and channels:history cover it, so don’t add groups:history or im:history “just in case”.
  • Store both in a secrets manager, inject at runtime, and rotate on a schedule. Never bake them into an image layer.
  • Because there’s no inbound endpoint, you also can’t be hit by replayed or forged requests — a real security win over the HTTP model for internal tools.
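A cheap safeguard that follows from the points above is to validate both tokens at startup and fail fast if they are missing or swapped. The sketch below assumes the tokens arrive as the environment variables SLACK_BOT_TOKEN and SLACK_APP_TOKEN; `load_tokens` and `TokenError` are hypothetical names.

```python
import os


class TokenError(RuntimeError):
    """Raised when a required Slack token is missing or of the wrong kind."""


def load_tokens(env=os.environ):
    """Return (bot_token, app_token) after checking their prefixes.

    Error messages name the variable but never include the token value,
    so a misconfiguration cannot leak a secret into the logs.
    """
    bot_token = env.get("SLACK_BOT_TOKEN", "")
    app_token = env.get("SLACK_APP_TOKEN", "")
    if not bot_token.startswith("xoxb-"):
        raise TokenError("SLACK_BOT_TOKEN is missing or is not a bot token (xoxb-)")
    if not app_token.startswith("xapp-"):
        raise TokenError("SLACK_APP_TOKEN is missing or is not an app-level token (xapp-)")
    return bot_token, app_token


if __name__ == "__main__":
    try:
        load_tokens()
        print("tokens look well-formed")
    except TokenError as exc:
        print(f"refusing to start: {exc}")
```

A prefix check proves nothing about whether a token is valid or correctly scoped, but it does catch the common mistake of pasting the two tokens into each other's variables.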

Where Socket Mode is the wrong call

Socket Mode is fantastic for internal bots, but skip it when:

  • You’re distributing the app to other workspaces via the App Directory. Distributed apps need the HTTP request model; Socket Mode is for apps you operate yourself.
  • You want stateless, autoscaling handlers behind a load balancer. The persistent connection model fights horizontal autoscaling.
  • Your platform genuinely can’t make outbound WebSocket connections (rare, but some locked-down environments block them).

For everything else internal — the on-call helper, the deploy trigger, the diagnostics bot — Socket Mode is less infrastructure, less attack surface, and faster to ship.

A pragmatic starting point

Build your next internal bot in Socket Mode from a single process, run it under systemd with Restart=always, allow-list every command, and add a heartbeat. You’ll skip the reverse proxy, the cert renewal, and the public endpoint entirely — and the whole thing fits in a file you can read in one sitting.

If you’re pairing the bot with an AI model to summarize incidents or classify requests, keep your prompt templates version-controlled alongside the code; our prompt library has starting points, and the rest of our AI for Slack guides cover the surrounding patterns.
