Build Declarative Copilot Agents for DevOps in Microsoft Teams

I was skeptical of Copilot for ops work. The generic assistant doesn’t know our runbooks, can’t see our deploy status, and confidently invents kubectl flags that don’t exist. But declarative agents changed my mind. They let you take the Copilot reasoning engine and bolt on your knowledge and your tools — without standing up a full Bot Framework service.

Here’s how I build a DevOps-focused declarative agent for Teams, and where I draw the safety line.

What a declarative agent actually is

A declarative agent is Microsoft 365 Copilot, scoped and specialized by a configuration file. You don’t train a model or host inference. You declare:

Instructions — the system prompt that gives it a persona and rules.
Knowledge — the documents/sites it can ground answers in (your runbooks, your wiki).
Actions — APIs it can call, defined via an OpenAPI plugin.
Conversation starters — suggested prompts so people know what to ask.

Microsoft hosts the model and orchestration. You provide the context. That’s a dramatically lower lift than building a bot from scratch — which I cover separately in the Bot Framework guide.

The manifest

The agent is defined by a JSON manifest. A DevOps agent’s core looks like this (the SharePoint URL is a placeholder for wherever your runbooks live):

{
  "version": "v1.0",
  "name": "OpsCopilot",
  "description": "Helps the SRE team find runbooks and check service status",
  "instructions": "You are an SRE assistant. Ground every answer in the provided runbooks. When asked to act on a system, only use the defined read-only actions. Never invent commands, metric names, or hostnames. If you are unsure, say so and link the runbook.",
  "conversation_starters": [
    { "title": "Find a runbook", "text": "Find the runbook for high checkout error rate" },
    { "title": "Service status", "text": "What's the current status of checkout-api?" }
  ],
  "capabilities": [
    {
      "name": "OneDriveAndSharePoint",
      "items_by_url": [
        { "url": "https://contoso.sharepoint.com/sites/sre-runbooks" }
      ]
    }
  ]
}

The instructions field is where you encode discipline — grounding, no invented commands, admit uncertainty. Treat it like you’d treat a careful system prompt, because that’s exactly what it is.

Grounding it in your runbooks

The single biggest value is knowledge grounding. Point the agent at the SharePoint site or document library where your runbooks live, and now “how do we recover from a stuck checkout queue” returns your procedure, not a generic web answer.

This is what takes most of the hallucination risk out of the read-heavy use case. When the agent is grounded in real runbooks, it cites them instead of inventing steps. The quality of the answer is the quality of your runbooks, which is also a nice forcing function to keep them current.

Adding actions — the careful part

Knowledge is read-only and safe. Actions let the agent call APIs, and that’s where you need discipline. You define actions via an OpenAPI spec (an API plugin), and Copilot can invoke them to answer.

My hard rule, the same one I apply to every AI-in-ops surface: expose read-only actions freely; gate every state-changing action behind human confirmation.

Safe to expose: “get deploy status,” “list firing alerts,” “who’s on call.” These read and inform.
Never expose directly: “restart service,” “scale deployment,” “rollback.” If the agent can call these, a confidently wrong inference becomes a production change.

For anything mutating, the action should create an approval request (a card a human clicks), not perform the change. The agent proposes; a human disposes. This mirrors the incident-response principle: AI reads and reasons, humans run commands.
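Here is a minimal sketch of that gate on the API side, assuming a Python backend sits behind the plugin. Every name in it (handle_action, ApprovalQueue, the action names) is hypothetical, and the in-memory queue stands in for whatever actually posts the approval card to a human.

```python
import uuid

# Hypothetical action names, mirroring the examples above.
MUTATING_ACTIONS = {"restart_service", "scale_deployment", "rollback"}


class ApprovalQueue:
    """In-memory stand-in for whatever posts an approval card to a human."""

    def __init__(self):
        self.pending = {}

    def request(self, action, params):
        approval_id = str(uuid.uuid4())
        self.pending[approval_id] = {"action": action, "params": params}
        return approval_id


def handle_action(action, params, read_handlers, approvals):
    """Run read-only actions; turn mutating ones into approval requests."""
    if action in MUTATING_ACTIONS:
        # Never execute here: record the proposal and wait for a human.
        approval_id = approvals.request(action, params)
        return {"status": "pending_approval", "approval_id": approval_id}
    if action in read_handlers:
        return {"status": "ok", "result": read_handlers[action](**params)}
    # Anything not explicitly allowed is refused rather than guessed at.
    return {"status": "rejected", "reason": f"unknown action: {action}"}


def get_deploy_status(service):
    # Placeholder data; a real handler would query your deploy system.
    return {"service": service, "state": "healthy"}


approvals = ApprovalQueue()
read_handlers = {"get_deploy_status": get_deploy_status}

status = handle_action(
    "get_deploy_status", {"service": "checkout-api"}, read_handlers, approvals
)
restart = handle_action(
    "restart_service", {"service": "checkout-api"}, read_handlers, approvals
)
```

The point of the design is that the mutating branch has no code path that touches production. Even if the agent calls it with a wrong inference, the worst outcome is an approval request a human can decline.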

Packaging and deploying

The agent ships as part of a Teams app package: the app manifest (manifest.json) and its icons, the declarative agent file, and any API plugin specs, zipped and uploaded via the Teams admin center or the Developer Portal. Microsoft’s Copilot Studio and the Teams Toolkit for VS Code both scaffold this for you, which saves a lot of manifest fiddling.
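If you assemble the zip yourself, a pre-flight check catches dangling file references before upload. The sketch below is an illustration under assumptions, not what the tooling does: the file names declarativeAgent.json and ops-plugin.json are hypothetical, and the copilotAgents.declarativeAgents key reflects my understanding of the current app manifest schema, so verify it against the schema version you target. The example manifests are stripped down to the references only; a real package also needs the icons and a complete app manifest.

```python
import io
import json
import zipfile


def referenced_files(files):
    """Collect file names the manifests point at, so gaps fail early."""
    app_manifest = json.loads(files["manifest.json"])
    agents = app_manifest.get("copilotAgents", {}).get("declarativeAgents", [])
    needed = [agent["file"] for agent in agents]
    for agent_file in list(needed):
        if agent_file in files:
            agent = json.loads(files[agent_file])
            needed += [action["file"] for action in agent.get("actions", [])]
    return needed


def build_package(files):
    """Zip the package in memory after checking nothing referenced is missing."""
    if "manifest.json" not in files:
        raise ValueError("manifest.json must sit at the package root")
    missing = [name for name in referenced_files(files) if name not in files]
    if missing:
        raise ValueError(f"referenced but not packaged: {missing}")
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


# Stripped-down, hypothetical package contents (bytes keyed by file name).
example_files = {
    "manifest.json": json.dumps({
        "copilotAgents": {
            "declarativeAgents": [
                {"id": "opsCopilot", "file": "declarativeAgent.json"}
            ]
        }
    }).encode(),
    "declarativeAgent.json": json.dumps({
        "name": "OpsCopilot",
        "actions": [{"id": "opsStatus", "file": "ops-plugin.json"}],
    }).encode(),
    "ops-plugin.json": b"{}",
}

package_bytes = build_package(example_files)
```

Writing package_bytes to a .zip file gives you something to upload. If you use the Toolkit, it handles this step and you can skip the script entirely.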

Roll it out to a pilot group first. An agent that gives a wrong runbook answer to the whole org on day one will lose trust you won’t easily win back.

Testing for hallucination

Before you trust it, adversarially test it:

Ask it things not in the runbooks. It should say “I don’t have a runbook for that,” not invent one. If it confabulates, tighten the instructions.
Ask it to do something destructive. It should refuse or route to an approval, never just call a mutating action.
Check its citations. Grounded answers should point at a real runbook. No citation is a red flag.
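These three checks can be written down as a small regression suite. The sketch below assumes you have some way to send a prompt and capture the reply text, its citations, and any actions the agent called. That hook (the ask callable) is hypothetical, as are the case kinds and the refusal phrases, and plain substring matching is a crude stand-in for reading the answers yourself.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set, Tuple

MUTATING_ACTIONS = {"restart_service", "scale_deployment", "rollback"}
REFUSAL_MARKERS = ("don't have a runbook", "do not have a runbook")


@dataclass
class AgentReply:
    text: str
    citations: List[str] = field(default_factory=list)
    actions_called: List[str] = field(default_factory=list)


@dataclass
class Case:
    prompt: str
    kind: str  # "out_of_scope", "destructive", or "grounded"


def failure_reason(case: Case, reply: AgentReply, known_runbooks: Set[str]) -> Optional[str]:
    """Return why a reply fails its case, or None if it passes."""
    # Applies to every case: a direct mutating call is always a failure.
    if set(reply.actions_called) & MUTATING_ACTIONS:
        return "called a mutating action directly"
    if case.kind == "out_of_scope":
        if not any(marker in reply.text.lower() for marker in REFUSAL_MARKERS):
            return "did not admit the runbook is missing"
    if case.kind == "grounded":
        if not reply.citations:
            return "no citation"
        if any(cite not in known_runbooks for cite in reply.citations):
            return "cited something that is not a known runbook"
    return None


def run_checks(
    cases: List[Case],
    ask: Callable[[str], AgentReply],
    known_runbooks: Set[str],
) -> List[Tuple[str, str]]:
    """Ask every prompt and collect (prompt, reason) for each failure."""
    failures = []
    for case in cases:
        reason = failure_reason(case, ask(case.prompt), known_runbooks)
        if reason:
            failures.append((case.prompt, reason))
    return failures
```

An empty list from run_checks means every case passed. Rerun the suite whenever you change the instructions or add an action, since a tweak that fixes one refusal can loosen another.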

The prompt library has evaluation prompts for stress-testing an agent’s grounding and refusal behavior.

Where declarative agents fit vs. a bot

Choose a declarative agent when the job is mostly knowledge + light read actions and your org has Copilot. It’s faster to build and Microsoft hosts the hard parts. Choose a custom bot when you need complex stateful logic, deep callbacks, or you’re not on Copilot. Many teams run both: the agent for “find me the runbook / what’s the status,” the bot for interactive operational workflows.

Start with read-only and grounding

Don’t open with mutating actions. Build the agent grounded in your best runbooks with a few read-only status lookups, ship it to a pilot team, and let it prove it gives accurate, cited answers. Once people trust it for knowing things, you can carefully add more read actions, and only then consider mutating actions behind approval cards. An ops agent earns capability by demonstrating it won’t lie to you. Get the grounding right first, and the rest follows.

Declarative agents can be wired to real APIs. Keep mutating actions behind human-approval cards, ground answers in real runbooks, and adversarially test for hallucination before broad rollout.
