Monitoring & Security
Your AI solution is only as good as its data pipeline and security posture. Learn how to monitor search index health and data ingestion quality, and how to lock down your AI infrastructure with managed identity, private endpoints, and RBAC.
Monitoring data and search quality
Your AI is only as good as the data it searches. If the data pipeline breaks or the search index goes stale, your AI starts giving wrong answers — and nobody tells you.
Monitoring means watching two things: (1) Is new data flowing in correctly? (2) When users search, are they finding what they need? If either breaks, your RAG application starts hallucinating or giving irrelevant responses.
Data ingestion monitoring
| What to Monitor | Why | Red Flag |
|---|---|---|
| Indexer status | Confirms documents are being processed | Indexer in “failed” or “degraded” state |
| Document count | Tracks how many documents are indexed | Count plateaus when new docs should be flowing in |
| Parsing errors | Catches corrupt or unsupported files | Error rate above 1-2% |
| Field completeness | Ensures extracted metadata is populated | Required fields (title, date) returning null |
| Embedding freshness | Confirms vectors match current model | Embeddings generated with outdated model version |
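The red flags above can be checked programmatically. Below is a minimal sketch in pure Python; the field names on `IndexerStatus` are illustrative (modeled loosely on what an indexer run report contains), not an actual SDK type:

```python
from dataclasses import dataclass

@dataclass
class IndexerStatus:
    """Hypothetical snapshot of one indexer run (field names are illustrative)."""
    state: str                 # e.g. "success", "failed", "degraded"
    docs_indexed: int          # documents processed successfully in this run
    docs_failed: int           # documents that failed to parse
    previous_doc_count: int    # index size before this run
    current_doc_count: int     # index size after this run

def ingestion_red_flags(status: IndexerStatus, new_docs_expected: bool) -> list[str]:
    """Apply the red-flag checks from the table above; returns human-readable alerts."""
    flags = []
    if status.state in ("failed", "degraded"):
        flags.append(f"indexer state is '{status.state}'")
    total = status.docs_indexed + status.docs_failed
    if total and status.docs_failed / total > 0.02:  # parsing error rate above 2%
        flags.append(f"parsing error rate {status.docs_failed / total:.1%} exceeds 2%")
    if new_docs_expected and status.current_doc_count <= status.previous_doc_count:
        flags.append("document count plateaued while new docs were expected")
    return flags
```

In a real deployment you would populate `IndexerStatus` from your search service's indexer status API and run this check on a schedule, alerting when the returned list is non-empty.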
Search relevance monitoring
| Metric | What It Measures | How to Improve |
|---|---|---|
| Precision at K | Of the top K results, how many are relevant? | Tune ranking profiles, adjust chunking |
| Recall | Of all relevant documents, how many were found? | Add more search types (hybrid), broaden index |
| Mean Reciprocal Rank | How high does the correct answer rank? | Improve semantic ranker configuration |
| User satisfaction | Do users rephrase and retry? | Track query reformulations as a proxy for poor results |
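The first three metrics are simple to compute offline if you have labeled relevance judgments. A small sketch (inputs are ranked result lists plus sets of known-relevant document IDs):

```python
def precision_at_k(retrieved: list, relevant: set, k: int) -> float:
    """Of the top K results, what fraction are relevant?"""
    return sum(1 for doc in retrieved[:k] if doc in relevant) / k

def recall(retrieved: list, relevant: set) -> float:
    """Of all relevant documents, what fraction were retrieved?"""
    return sum(1 for doc in relevant if doc in retrieved) / len(relevant)

def mean_reciprocal_rank(ranked_lists: list, relevant_sets: list) -> float:
    """Average of 1/rank of the first relevant result per query (0 if none found)."""
    total = 0.0
    for retrieved, relevant in zip(ranked_lists, relevant_sets):
        for rank, doc in enumerate(retrieved, start=1):
            if doc in relevant:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)
```

For example, if a query returns `["a", "b", "c"]` and the relevant set is `{"a", "c", "e"}`, precision at 3 and recall are both 2/3. Tracking these over time catches relevance regressions after chunking or ranking changes.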
Exam tip: Stale index vs model drift
The exam may describe declining answer quality and ask you to identify the cause. Key distinction:
- Stale index = data pipeline stopped, new documents aren’t indexed, answers are outdated
- Model drift = model behaviour changed, but data is fine
Check the data pipeline first, then the model. In practice, stale indexes cause more quality issues than model drift.
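"Check the data pipeline first" can itself be automated: compare the newest upload timestamp in storage against the newest timestamp the index has ingested. A minimal sketch (the two-hour tolerance is an illustrative assumption for normal indexing lag):

```python
from datetime import datetime, timedelta

def index_is_stale(latest_upload: datetime, latest_indexed: datetime,
                   tolerance: timedelta = timedelta(hours=2)) -> bool:
    """True if storage holds documents newer than anything in the index,
    beyond the normal indexing lag — i.e. the pipeline has likely stopped."""
    return latest_upload - latest_indexed > tolerance
```

If this returns True, investigate the indexer before suspecting the model; if the index is fresh but quality still degrades, look at model drift.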
Security fundamentals for AI
| Feature | Mechanism | What It Does |
|---|---|---|
| Managed identity | System-assigned identity | Azure resources authenticate to each other without storing credentials in code. No API keys needed. |
| Private endpoints | Private networking | AI services communicate over Azure's private network backbone — never touching the public internet. |
| Keyless credentials | Token-based auth | Applications use Microsoft Entra ID tokens instead of API keys. Tokens expire automatically; keys don't. |
| RBAC (role policies) | Role-Based Access Control | Fine-grained permissions: who can deploy models, who can read data, who can manage agents. |
Managed identity in practice
Managed identity is the number one security best practice for Azure AI:
| Without Managed Identity | With Managed Identity |
|---|---|
| Store API key in app config or Key Vault | No keys to store — Azure handles authentication |
| Rotate keys manually | No rotation needed — tokens are short-lived |
| Risk of key exposure in logs or code | No secret to expose |
| Configure key per service | One identity, grant roles to each resource |
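In code, keyless authentication means handing the client a credential object instead of a key. A sketch using the `azure-identity` library; the endpoint and API version are placeholders you would replace with your own:

```python
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AzureOpenAI

# DefaultAzureCredential picks up the managed identity when running in Azure,
# and falls back to developer credentials (e.g. Azure CLI login) locally.
token_provider = get_bearer_token_provider(
    DefaultAzureCredential(),
    "https://cognitiveservices.azure.com/.default",
)

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",  # placeholder
    azure_ad_token_provider=token_provider,  # token-based auth: no api_key anywhere
    api_version="2024-06-01",                # placeholder API version
)
```

Note there is no key to store, rotate, or leak: the credential requests short-lived tokens on the application's behalf, and RBAC on the target resource decides what the identity may do.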
Real-world example: Atlas Financial's security posture
Atlas Financial handles sensitive financial data. Their AI security setup:
- Managed identity on all Foundry resources — zero API keys in code
- Private endpoints for Foundry Project, AI Search, and Storage — no public internet exposure
- RBAC roles:
- Data scientists: “Cognitive Services User” (can call models, can’t deploy)
- AI engineers: “Cognitive Services Contributor” (can deploy models)
- Security team: “Reader” + custom role for audit log access
- VNet integration — all AI traffic stays within Atlas’s private network
- Key Vault — only for third-party API keys (external services that don’t support managed identity)
RBAC roles for AI services
| Role | What It Allows | Who Gets It |
|---|---|---|
| Cognitive Services User | Call deployed models and agents | Application service principals, developers |
| Cognitive Services Contributor | Deploy and manage models | AI engineers, DevOps |
| Search Index Data Reader | Query search indexes | Applications, agents |
| Search Index Data Contributor | Read and write search index data | Indexing pipelines |
| Search Service Contributor | Manage search service configuration | Infrastructure admins |
Key terms
Knowledge check
NeuralMed's RAG chatbot has been giving outdated drug interaction information, even though new research papers are being uploaded to storage daily. What should the team investigate first?
Kai's team stores an API key for the Foundry model deployment in their application's environment variables. The security team flags this as a risk. What's the recommended fix?
Which RBAC role should Atlas Financial assign to their AI application's service principal so it can call deployed models but NOT deploy or delete them?
🎬 Video coming soon