Sunday, 2 February 2025

Building an Agent for Microsoft 365 Copilot: Adding an API plugin

In the previous post, we saw how to get started building agents for Microsoft 365 Copilot. We saw the different agent manifest files and how to configure them. There are some great out of the box capabilities available for agents like using Web search, Code Interpreter and Creating Images.

However, a very common scenario is to bring in data from external systems into Copilot and pass it to the LLM. So in this post, we will have a look at how to connect Copilot with external systems using API Plugins.

In simple terms, API Plugins are AI "wrappers" on existing APIs. We inform the LLM about an API and give it instructions on when to call these APIs and which parameters to pass to them. 

To connect your API to the Copilot agent, we have to create an "Action" and include it in the declarativeAgent.json file we saw in the previous post:

The ai-plugin.json file contains information on how the LLM will understand our API. We will come to this file later.

Before that, let's understand how our API looks. We have a simple API that can retrieve some sample data about repairs:

Next, we will need the OpenAI specification for this API

The OpenAPI specification is important as it describes in detail the exact signatures of our APIs to the LLM.

Finally, we will come to the most important file which is the ai-plugin.json file. It does several things. It informs Copilot which API methods are available, which parameters they expect and when to call them. This is done in simple human readable language so that the LLM can best understand it.

Additionally, it also handles formatting of the data before it's shown to the user in Copilot. Whether we want to use Adaptive Cards to show the data or if we want to run any other processing like removing direct links from the responses.

If you are doing plugin development, chances are that you will be spending most of your time on this file.

Once everything is in place, we will run our plugin with simple instructions and this is how it will fetch the data from external database


Hope this helps!

Monday, 6 January 2025

Building an Agent for Microsoft 365 Copilot Chat

Microsoft 365 Copilot Chat is a powerful, general-purpose assistant that greatly improves personal productivity, helping users manage tasks like emails, calendars, document searches, and more.

However, the true potential of M365 Copilot lies in its extensibility. By building specialized "vertical" agents on top of M365 Copilot, you can unlock team productivity as well as automate business processes. These custom agents not only help in individual productivity, but also help in building workflows across groups of people.

Agents for Microsoft 365 Copilot leverage the same robust foundation—its orchestrator, foundation models, and trusted AI services—that powers M365 Copilot itself. This ensures consistency, reliability, and security at scale.

So in this post, let's take a look at how to build Agents on top of Microsoft 365 Copilot. 

We will be building a Declarative Agent with the help of the Teams toolkit. Before we start, we need the following prerequisites:

A Microsoft 365 Copilot license

Teams Toolkit Visual Studio Code extension

Enable side loading of Teams apps

Once everything is in place, we will go to Visual Studio Code 

In the Teams Toolkit extension, To create an agent, we will click on "Create new app"

Then, click on "Agent"


When the agent is created, we see a bunch of files getting created as part of the scaffolding.  So let's take a look the difference moving pieces of the agent:

manifest.json

If you have been doing M365 Apps (and Teams apps) for a while, you are familiar with this file. This is the file which represents our Agent in the M365 App catalog. It contains the various details like name, description and capabilities of the app.

However, you will notice the new property in this file which is "copilotAgents" This property will be pointing to a file containing the description of our new declarative agent. So let's look at how that file looks next:

declarativeAgent.json

Lots of interesting things are happening here. 

 
First the "instructions" property is pointing to a file which will contain the "System prompt" of our agent. We will have a look at this file later. 

Next, the "conversation_starters" property contains ready to go prompts which the user can ask the agent. Our agent will be trained to respond to these prompts. This is so that the user is properly onboarded when they land on our agent.

Finally there will be the actions and capabilities properties: 

Actions property contains connections to external APIs which the agent can invoke.

Capabilities property contains the different out of the box "Tools" which we want to allow in our agent. E.g. SharePoint and OneDrive search, image creation, code interpreter to generate charts etc 

We will talk about both these properties in more details in subsequent blog posts.

instruction.txt

And finally we have the instructions file where we can specify the system prompt for the agent. Here, we can guide the agent and assign it personality. We can make it aware of the tools and capabilities it has available and when the use them. We can provide one shot or few shot examples to "train" the agent on responding to users.

Once all files are in place, you can click "Provision" from the Teams toolkit extension:

And our new agent will be ready!

Here is how the agent will provide the conversation starters when we first lang on it:


Simple conversation using Web search capability:


Conversation using Web search and Code Interpreter capabilities:



Hope this was helpful! We will explore M365 Copilot Agents more in subsequent blog posts.