Thursday, 18 December 2025

Custom AI Agents: Use the Copilot Retrieval API for Grounding on SharePoint Content

If you’re building a custom AI agent (Copilot Studio, the Agents SDK, or your own app) and you want permission-trimmed content from SharePoint without managing your own vector store, the Microsoft 365 Copilot Retrieval API is a great solution.

It’s part of the broader Copilot APIs surface in Microsoft Graph, designed to reuse the same “in-tenant” grounding behavior that Microsoft 365 Copilot uses.

So in this post, let’s take a deep dive into the Copilot Retrieval API.

At a high level, we’ll:

  • Decide when to use Retrieval API vs “normal” Graph CRUD/search

  • Configure the minimum delegated permissions for SharePoint grounding

  • Call the Retrieval endpoint with dataSource=sharePoint

  • Scope results to specific sites using KQL filterExpression

  • Use the returned extracts + URLs as grounding context for your LLM

Prereqs

  • A Microsoft 365 Copilot license for each user calling Copilot APIs (this is separate from standard Graph CRUD usage).

  • Your app uses delegated auth (application permissions aren’t supported for this API).

  • Microsoft Entra app registration with delegated Graph permissions: Files.Read.All + Sites.Read.All (required together for SharePoint/OneDrive retrieval).

  • Familiarity with the Copilot APIs model (Copilot APIs are REST under Microsoft Graph and use standard Graph auth).

1) Pick The Right Tool: Retrieval vs Graph

Use Microsoft Graph CRUD when you need to read/update SharePoint data (lists, drives, items, etc.). 

Use the Copilot Retrieval API when you need ranked text content (snippets) to ground an LLM response, while keeping content in place and respecting permissions/sensitivity labels.

A good heuristic:

  • “Give me the document metadata / file bytes” → Graph sites/drives/items

  • “Answer using the most relevant parts of our HR policies” → Retrieval API
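To make the contrast concrete, here’s a rough sketch of the two call shapes. The site/item IDs are placeholders, and $accessToken is assumed to hold a delegated Graph token (covered in the next section):

    # Graph CRUD: fetch a specific file's metadata (placeholder IDs)
    Invoke-RestMethod -Method GET `
        -Uri "https://graph.microsoft.com/v1.0/sites/{site-id}/drive/items/{item-id}" `
        -Headers @{ Authorization = "Bearer $accessToken" }

    # Retrieval API: ranked snippets to ground an LLM answer (full call later in this post)
    # POST https://graph.microsoft.com/v1.0/copilot/retrieval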

2) Set Delegated Permissions

The Retrieval API is delegated-only. That’s intentional: the service retrieves snippets from content the calling user can access, permission-trimmed at query time.

Minimum permissions for SharePoint grounding:

  • Files.Read.All

  • Sites.Read.All

As noted in the prereqs, these two must be granted together for SharePoint/OneDrive retrieval.

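If you’re scripting against the API, one way to get a delegated token is the MSAL.PS module. A minimal sketch, assuming you swap in your own app registration’s values:

    # Interactive delegated sign-in with the two required scopes;
    # the resulting token is trimmed to what this user can access
    $token = Get-MsalToken -ClientId "<app-client-id>" -TenantId "<tenant-id>" -Interactive `
        -Scopes "https://graph.microsoft.com/Files.Read.All", "https://graph.microsoft.com/Sites.Read.All"
    $accessToken = $token.AccessToken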
3) Call Retrieval For SharePoint (With Site Scoping)

Here’s the core call. You provide:

  • queryString (single sentence, up to 1,500 chars)

  • dataSource = sharePoint

  • optional filterExpression (KQL) to scope sites

  • optional resourceMetadata

  • maximumNumberOfResults (1–25) 

Change these values:
  • queryString: the user’s natural-language question

  • filterExpression: one site, or multiple sites using OR

  • resourceMetadata: only the metadata fields you want returned

  • maximumNumberOfResults: keep within 1–25

Minimal working example
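Here’s a minimal sketch of the call in PowerShell. The question, the contoso HR site URL, and $accessToken are placeholders you’d swap for your own values:

    # Build the request body from the parameters described above
    $body = @{
        queryString            = "What is our remote work policy?"
        dataSource             = "sharePoint"
        filterExpression       = 'path:"https://contoso.sharepoint.com/sites/HR"'
        resourceMetadata       = @("title", "author")
        maximumNumberOfResults = 10
    } | ConvertTo-Json

    # Call the Retrieval endpoint with the signed-in user's delegated token
    $response = Invoke-RestMethod -Method POST `
        -Uri "https://graph.microsoft.com/v1.0/copilot/retrieval" `
        -Headers @{ Authorization = "Bearer $accessToken" } `
        -ContentType "application/json" `
        -Body $body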

Run the PowerShell call above and expect a response with retrievalHits[], each containing:

  • webUrl

  • extracts[] with text + relevanceScore

  • optional resourceMetadata and sensitivityLabel
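A sketch of turning those hits into grounding text for your prompt, assuming $response from the call above:

    # Flatten hits into prompt-ready grounding text, keeping each URL for citations
    $grounding = foreach ($hit in $response.retrievalHits) {
        foreach ($extract in $hit.extracts) {
            "Source: $($hit.webUrl)`n$($extract.text)"
        }
    }
    $groundingText = $grounding -join "`n---`n"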

Troubleshooting

  • 403 / access denied: confirm the signed-in user has a Copilot license (Copilot APIs require it).

  • No results: remove filterExpression first, then add it back (KQL path scoping is easy to over-constrain).

  • 400 bad request: maximumNumberOfResults must be 1–25, and queryString has constraints (single sentence, 1,500 chars).

  • Trying application permissions: not supported; switch to delegated auth.

  • Using /beta in production: move to v1.0 (beta can change and isn’t supported for production).
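When you do hit an error, surfacing the Graph error payload makes it much easier to tell which of these applies. A sketch, wrapping the same call as above:

    try {
        $response = Invoke-RestMethod -Method POST `
            -Uri "https://graph.microsoft.com/v1.0/copilot/retrieval" `
            -Headers @{ Authorization = "Bearer $accessToken" } `
            -ContentType "application/json" -Body $body
    }
    catch {
        # The response body usually carries the Graph error code and message
        Write-Warning $_.ErrorDetails.Message
    }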

Notes

  • The Retrieval API is built to avoid “DIY RAG plumbing” (export/crawl/index/vector DB) while still honoring Microsoft 365 security and compliance boundaries.

  • Use filterExpression with path:"<site url>" to keep grounding tight, scoping to a single site or to multiple sites with OR (see the snippet after this list).

  • You typically pass the returned extracts.text + webUrl into your model prompt, and keep the URLs as citations in your UI.
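For example, with placeholder contoso site URLs:

    # Single site
    $filterExpression = 'path:"https://contoso.sharepoint.com/sites/HR"'

    # Multiple sites, joined with OR
    $filterExpression = 'path:"https://contoso.sharepoint.com/sites/HR" OR path:"https://contoso.sharepoint.com/sites/Legal"'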

Wrapping up

The Copilot Retrieval API is the most pragmatic way to pull permission-trimmed SharePoint grounding content into your own agents without building or hosting a parallel index. Once you can reliably retrieve relevant extracts, everything else becomes “just orchestration” around your model and UX.

Hope this helps!
