Azure AI Search as a Knowledge Source
Semantic, vector, and hybrid search across document collections.
When your document collection outgrows simple search
Azure AI Search is like hiring a research librarian for your agent.
Copilot connectors and SharePoint knowledge work fine for thousands of documents. But when you have tens of thousands — or hundreds of thousands — of documents, and you need the agent to understand meaning, not just match keywords? That is when you bring in Azure AI Search.
It is a dedicated search engine that lives in Azure. You feed it your documents, it builds an index, and your Copilot Studio agent queries that index to find relevant answers. You get four search modes: keyword (exact match), vector (meaning-based), semantic (AI re-ranked), and hybrid (keyword and vector combined).
The four search modes
Understanding these four modes is essential. The exam will present scenarios and expect you to pick the right one.
| Feature | How it works | Strengths | Weaknesses | Best for |
|---|---|---|---|---|
| Keyword search | Classic BM25 text matching — searches for exact terms and variations in the index | Fast, predictable, works well for technical terms, product codes, and exact phrases | Misses synonyms and conceptual matches — 'automobile' will not match 'car' | Structured queries, known terminology, product SKU lookups |
| Vector search | Converts query and documents to numerical embeddings — finds conceptually similar content | Understands meaning — 'vehicle maintenance' matches 'car repair schedule' | Requires an embedding model (Azure OpenAI), higher compute cost, embeddings must be generated at index time | Natural-language questions, cross-language search, finding conceptually related documents |
| Semantic ranking | AI re-ranker on top of keyword results — promotes the most relevant passages | Dramatically improves result quality for natural-language queries without needing embeddings | Only re-ranks existing keyword results — cannot find documents that keyword search missed | Improving keyword search quality when vector search is not feasible or not needed |
| Hybrid search | Runs keyword AND vector search in parallel, then fuses results with Reciprocal Rank Fusion (RRF) | Gets the best of both — exact matches from keyword, conceptual matches from vector | Most complex setup, highest compute cost, requires embedding model | Production workloads where query patterns are unpredictable — recommended default for most agents |
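The fusion step in hybrid search is worth understanding mechanically. Below is a minimal sketch of Reciprocal Rank Fusion in Python: each document's fused score is the sum of `1 / (k + rank)` across the rankings it appears in (the constant `k = 60` here is the commonly cited RRF default; the document IDs are illustrative).

```python
def rrf_fuse(keyword_ranking, vector_ranking, k=60):
    """Reciprocal Rank Fusion: score each doc by 1/(k + rank) summed
    over every ranked list it appears in, then sort by fused score."""
    scores = {}
    for ranking in (keyword_ranking, vector_ranking):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# A doc ranked well by BOTH modes (d3, d2) outscores a doc that
# tops only one list (d1).
keyword = ["d1", "d2", "d3"]
vector = ["d3", "d4", "d2"]
print(rrf_fuse(keyword, vector))  # ['d3', 'd2', 'd1', 'd4']
```

This is why hybrid search is resilient to unpredictable query patterns: a document only needs to rank reasonably in one leg to surface, and agreement between legs pushes it to the top.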
Exam tip: hybrid + semantic is the gold standard
For the exam, the recommended production pattern is hybrid search with semantic ranking. This gives you keyword precision, vector conceptual understanding, and AI-powered re-ranking in one query. If a question asks for the “best” or “recommended” approach for a Copilot Studio agent, hybrid + semantic is almost always the answer — unless constraints (cost, no embedding model) rule it out.
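To make the "one query" claim concrete, here is a sketch of what a hybrid + semantic request body looks like in the shape the Azure AI Search REST API expects — the field name `embedding` and the semantic configuration name `default` are illustrative assumptions, not values from this document:

```python
# Hypothetical hybrid + semantic query body. The keyword leg ("search"),
# the vector leg ("vectorQueries"), and the semantic re-ranker
# ("queryType": "semantic") all travel in a single request.
query = {
    "search": "immunotherapy response in stage 3 melanoma",  # keyword leg
    "vectorQueries": [{
        "kind": "text",      # service vectorises the query text itself
        "text": "immunotherapy response in stage 3 melanoma",
        "fields": "embedding",   # assumed vector field name
        "k": 5,
    }],
    "queryType": "semantic",
    "semanticConfiguration": "default",  # assumed configuration name
    "top": 5,
}

print(query["queryType"], query["vectorQueries"][0]["fields"])
```

The point to retain for the exam is structural: hybrid is not two round trips fused client-side — keyword, vector, and semantic ranking are combined server-side in one query.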
Index architecture — what you need to know
An Azure AI Search index is like a database table optimised for search. Understanding its structure helps you connect it correctly.
| Component | What it is | Why it matters |
|---|---|---|
| Index | A collection of documents with a defined schema | One index per knowledge domain (e.g., one for medical papers, one for HR policies) |
| Fields | Named attributes on each document (title, content, category, embedding) | Fields marked as “searchable” are included in full-text search. Fields marked “filterable” support pre-query narrowing |
| Indexer | An automated pipeline that pulls data from a source into the index | Blob Storage, Cosmos DB, Azure SQL, and other sources. Runs on a schedule or on-demand |
| Skillset | Optional AI enrichment pipeline attached to an indexer | OCR for scanned PDFs, language detection, entity extraction, chunking for vector search |
| Scoring profile | Custom relevance tuning rules | Boost recent documents, prioritise certain fields, weight by category |
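The components above come together in the index schema. The sketch below shows a schema in roughly the JSON shape the service's Create Index API uses, expressed as a Python dict — the field names mirror the Lena scenario later in this module, and the 1536 dimensions assume an Azure OpenAI embedding model (both are illustrative assumptions):

```python
# Hypothetical index schema. "searchable" fields join full-text search;
# "filterable" fields support pre-query narrowing; the vector field
# declares its embedding dimensions and a vector-search profile.
index_schema = {
    "name": "medical-papers",
    "fields": [
        {"name": "id", "type": "Edm.String", "key": True},
        {"name": "title", "type": "Edm.String", "searchable": True},
        {"name": "full_text", "type": "Edm.String", "searchable": True},
        {"name": "publication_date", "type": "Edm.DateTimeOffset",
         "filterable": True, "sortable": True},
        {"name": "embedding", "type": "Collection(Edm.Single)",
         "searchable": True, "dimensions": 1536,
         "vectorSearchProfile": "default-profile"},  # assumed profile name
    ],
}

searchable = [f["name"] for f in index_schema["fields"] if f.get("searchable")]
print(searchable)  # ['title', 'full_text', 'embedding']
```

Note that each field's attributes drive behaviour: marking `publication_date` filterable but not searchable means you can narrow by date range, but the date string never competes in relevance scoring.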
Chunking matters for RAG
When using vector search, documents must be split into chunks (typically 500-2,000 tokens). Each chunk gets its own embedding vector. The integrated vectorisation feature in Azure AI Search can handle chunking and embedding automatically using a skillset with the Azure OpenAI embedding skill. The exam may reference “integrated vectorisation” — it means the search service handles chunking + embedding during indexing so you do not need a separate pipeline.
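The chunking logic itself is simple. Here is a minimal sketch using overlapping windows — words stand in for tokens to keep it self-contained (a real pipeline, including integrated vectorisation, counts model tokens, not words):

```python
def chunk_text(text, chunk_size=1000, overlap=100):
    """Split text into overlapping chunks of roughly chunk_size units.
    Overlap keeps a sentence that straddles a boundary retrievable
    from either neighbouring chunk."""
    words = text.split()
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks

doc = " ".join(f"w{i}" for i in range(2500))
chunks = chunk_text(doc, chunk_size=1000, overlap=100)
print(len(chunks))  # 3
```

Each resulting chunk is what gets its own embedding vector and its own entry in the index — so chunk size is a retrieval-quality decision, not just a storage one.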
Connecting Azure AI Search to Copilot Studio
The connection workflow has specific steps the exam expects you to know:
1. Create the Azure AI Search resource in your Azure subscription (choose the tier based on volume — Free, Basic, Standard S1/S2/S3).
2. Create an index — define the schema, configure fields, optionally add a skillset for enrichment.
3. Populate the index — run an indexer or push data via the REST API.
4. In Copilot Studio, go to the agent’s knowledge sources and select “Azure AI Search.”
5. Provide the connection details: search service endpoint URL, index name, and authentication.
6. Configure field mappings — tell Copilot Studio which index fields contain the title, content, and URL for citations.
7. Test — ask questions in the test pane and verify the agent returns grounded answers from the index.
Authentication options
| Method | How it works | When to use |
|---|---|---|
| API key | Pass the search service admin or query key in the request header | Simplest setup — good for development and testing |
| Managed identity | Copilot Studio environment uses a system-assigned managed identity with RBAC role on the search service | Production recommended — no keys to rotate, follows zero-trust principles |
| Microsoft Entra ID token | OAuth 2.0 bearer token from Entra ID | Advanced scenarios with fine-grained access control |
How this differs from Module 21 (Foundry RAG)
In Module 21, you will connect Azure AI Search through Azure AI Foundry to build a full RAG (Retrieval-Augmented Generation) pipeline with a Foundry model. In this module, the agent connects to Azure AI Search directly as a knowledge source — no Foundry model in between. The search results feed into Copilot Studio’s built-in generative answers capability. Same search service, different integration pattern. The exam distinguishes these two paths.
Scenario: Lena indexes 50,000 medical papers
Lena is the AI engineer at a healthcare analytics firm. Their clinical research team needs an agent that can answer questions across 50,000 published medical papers stored as PDFs in Azure Blob Storage.
Lena’s architecture: she creates an Azure AI Search resource (Standard S1 tier for the volume), builds an index with fields for title, authors, abstract, full_text, publication_date, and a vector field for embeddings. She configures an indexer with a skillset that includes OCR (some older papers are scanned images), text chunking (1,000-token chunks), and the Azure OpenAI embedding skill for vector search.
After the indexer runs, she connects the search index to the Copilot Studio agent using managed identity authentication. She maps the title, full_text, and a URL field for citations. In the test pane, she asks: “What are the latest findings on immunotherapy response rates in stage 3 melanoma?” The agent returns a grounded answer citing three specific papers with publication dates — powered by hybrid search (keyword for “melanoma” + vector for conceptual “immunotherapy response” matching) with semantic ranking.
50,000 papers, searchable in seconds.
Knowledge check
- Lena's healthcare agent needs to find papers about 'treatment efficacy for resistant tumours' even when papers use terms like 'therapeutic effectiveness' and 'refractory neoplasms.' Which search mode handles this best?
- A developer connects Azure AI Search to their Copilot Studio agent but the agent never returns results from one specific field. What is the most likely cause?
- Which authentication method is recommended for a production Copilot Studio agent connecting to Azure AI Search?