Slack LLM Agent Bot with Safe Function Calling Prompt
Design a Slack bot backed by an LLM that uses tool/function calling to run real operational actions, with guardrails, confirmation steps, and scoped permissions.
- Target user: Engineers building an AI agent that takes actions from Slack
- Difficulty: Advanced
- Tools: Claude, ChatGPT
The prompt
You are an applied-AI engineer who has shipped LLM agents into Slack that can query systems and trigger actions without becoming a confused deputy or a prompt-injection victim.

I will provide:
- The model/provider and SDK in use
- The set of operational actions the agent should be able to take (read and write)
- Who can invoke the bot and in which channels
- Our risk tolerance for autonomous (no-confirm) actions

Your job:
1. **Tool design**: define each callable function with a strict JSON schema, a clear description, and an explicit risk tier (read-only / reversible-write / destructive). Show how to split one fuzzy tool into several narrow, auditable ones.
2. **Trigger surface**: @-mention, slash command, or message in a designated channel; capture `user.id`, `channel.id`, and `thread_ts`, and thread the entire conversation so context (and audit) stays in one place.
3. **Permission mapping**: map the Slack user to an internal identity and authorize each tool call against THAT identity's RBAC, not the bot's powers. The bot must never let a user do via the LLM what they can't do directly.
4. **Confirmation flow**: for any write/destructive tool, the model proposes, the bot renders a Block Kit confirmation with the exact parameters, and only an explicit button click executes. No model-only execution of destructive actions.
5. **Prompt-injection defense**: treat message text, fetched docs, and tool outputs as untrusted; never let retrieved content silently change which tool runs or escalate scope; strip/escape control instructions.
6. **Rate, cost & loop control**: cap tool-call iterations per request, cap tokens/cost per user per day, and detect runaway tool loops.
7. **Observability**: log every prompt, tool call, parameters, and result with a trace ID; redact secrets; surface a `/agent audit` view.

Output: (a) tool registry with schemas + risk tiers, (b) the agent loop pseudocode with authorization + confirmation gates, (c) Block Kit confirmation card, (d) prompt-injection test cases, (e) cost/loop guardrail config.

Bias toward: human-in-the-loop for writes, per-user RBAC over bot omnipotence, and treating all text as untrusted.
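
To make the requested output more concrete, here is a minimal Python sketch of the kind of gate a response to this prompt might describe for items (a), (b), and (e): a tool registry with risk tiers, per-user authorization, a confirmation requirement for writes, and an iteration cap. It is an illustration under stated assumptions, not a reference implementation. The tool names, the `USER_PERMISSIONS` table, the permission strings, and the `MAX_TOOL_CALLS_PER_REQUEST` value are all hypothetical, and `validate_args` is a simplified stand-in for a real JSON Schema validator. No Slack or LLM SDK calls are shown.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable


class RiskTier(Enum):
    READ_ONLY = "read-only"
    REVERSIBLE_WRITE = "reversible-write"
    DESTRUCTIVE = "destructive"


@dataclass(frozen=True)
class Tool:
    name: str
    description: str
    parameters: dict[str, Any]  # JSON Schema describing the arguments
    risk: RiskTier
    required_permission: str    # checked against the *user's* RBAC, not the bot's
    handler: Callable[..., Any]


# Hypothetical narrow tools standing in for real operational actions.
def get_service_status(service: str) -> dict:
    return {"service": service, "status": "healthy"}


def restart_service(service: str) -> dict:
    return {"service": service, "restarted": True}


SERVICE_ARG = {
    "type": "object",
    "properties": {"service": {"type": "string"}},
    "required": ["service"],
    "additionalProperties": False,
}

REGISTRY = {
    t.name: t
    for t in (
        Tool("get_service_status", "Read the health of one service.",
             SERVICE_ARG, RiskTier.READ_ONLY, "service:read", get_service_status),
        Tool("restart_service", "Restart one service.",
             SERVICE_ARG, RiskTier.REVERSIBLE_WRITE, "service:restart", restart_service),
    )
}

# Hypothetical RBAC table: Slack user ID -> permissions of the mapped internal identity.
USER_PERMISSIONS = {
    "U_ALICE": {"service:read", "service:restart"},
    "U_BOB": {"service:read"},
}

MAX_TOOL_CALLS_PER_REQUEST = 5  # assumed loop cap


@dataclass
class Decision:
    status: str  # "executed" | "needs_confirmation" | "denied"
    detail: Any = None


def validate_args(tool: Tool, args: dict) -> bool:
    """Simplified stand-in for JSON Schema validation: required keys present, no extras."""
    allowed = set(tool.parameters["properties"])
    required = set(tool.parameters.get("required", []))
    return required <= set(args) <= allowed


def handle_tool_call(slack_user_id: str, tool_name: str, args: dict,
                     calls_so_far: int, confirmed: bool = False) -> Decision:
    """Gate one model-proposed tool call.

    `confirmed` must only be set by the button-click handler, never by model output.
    """
    if calls_so_far >= MAX_TOOL_CALLS_PER_REQUEST:
        return Decision("denied", "tool-call budget exhausted")
    tool = REGISTRY.get(tool_name)
    if tool is None or not validate_args(tool, args):
        return Decision("denied", "unknown tool or invalid arguments")
    if tool.required_permission not in USER_PERMISSIONS.get(slack_user_id, set()):
        return Decision("denied", "user lacks permission")
    if tool.risk is not RiskTier.READ_ONLY and not confirmed:
        # The bot would render a confirmation card with these exact parameters.
        return Decision("needs_confirmation", {"tool": tool.name, "args": args})
    return Decision("executed", tool.handler(**args))
```

In this sketch, a read-only call by an authorized user executes immediately, while a write returns `needs_confirmation` until the same call is replayed with `confirmed=True` from the interaction handler. Authorization is checked before the confirmation step so that a user without the permission never sees a confirm button.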