Agent Security
Design multi-layered security for AI agents – covering identity, data access, network isolation, model protection, runtime hardening, and content safety.
Security is not a single layer
Securing an agent is like securing a building. You need locks on the front door (authentication), security badges for each floor (data access controls), guards monitoring behaviour (runtime security), CCTV for the vault (model protection), and a screening process for everything entering or leaving (content safety).
No single control is enough. If someone gets past the front door, the floor badges stop them. If they forge a badge, the guards catch them. Defence in depth.
The five security layers
| Security Layer | What It Protects | Key Controls |
|---|---|---|
| Identity and authentication | Who can use the agent. What the agent can access. | Entra ID authentication, OAuth for API access, managed identities for service-to-service, conditional access policies |
| Data access | The data sources the agent reads from and writes to. | Least-privilege permissions, scoped API access, sensitivity labels, row-level security, data loss prevention policies |
| Network | Traffic between the agent, users, data sources, and model endpoints. | Private endpoints, virtual network integration, network security groups, firewall rules, API Management gateway |
| Runtime | The agent execution environment itself. | Sandboxed execution, rate limiting, DDoS protection, request throttling, timeout enforcement, anomaly detection |
| Content safety | What goes into and comes out of the agent. | Input validation, prompt shields, jailbreak detection, output filtering, PII redaction, content moderation |
Identity and authentication
Agents interact with users AND with backend services. Both directions need authentication:
| Direction | Authentication Method | Design Consideration |
|---|---|---|
| User to agent | Entra ID SSO, multi-factor authentication | Users authenticate through existing identity. Conditional access can restrict agent access by location, device, or risk level. |
| Agent to data sources | Managed identity, OAuth client credentials | Use managed identities – no stored credentials. The agent authenticates as itself with scoped permissions. |
| Agent to model endpoints | API key rotation, managed identity, network restrictions | Rotate API keys automatically. Prefer managed identity where supported. Restrict endpoint access to specific virtual networks. |
| Agent to external APIs | OAuth with connection references, API key via Key Vault | Store secrets in Key Vault. Use connection references for per-environment credential management. |
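The agent-to-service rows in this table all follow one pattern: the agent holds no stored secret and requests a scoped token from an identity provider at call time. The Python below is a minimal sketch of that pattern only – `FakeManagedIdentity`, `OrderApiClient`, and the `orders.example` scope are invented stand-ins, not real Azure APIs (in Azure you would use the `azure-identity` library instead of hand-rolling this).

```python
import time
from dataclasses import dataclass

@dataclass
class AccessToken:
    token: str
    scope: str
    expires_on: float

class TokenProvider:
    """Abstract source of scoped tokens; in Azure, a managed identity plays this role."""
    def get_token(self, scope: str) -> AccessToken:
        raise NotImplementedError

class FakeManagedIdentity(TokenProvider):
    """Stand-in for a platform-issued identity. No secret lives in agent config;
    a real managed identity would call the platform's token endpoint here."""
    def get_token(self, scope: str) -> AccessToken:
        return AccessToken(token=f"token-for-{scope}", scope=scope,
                           expires_on=time.time() + 3600)

class OrderApiClient:
    """Agent-side client: authenticates per call with a freshly issued, scoped token."""
    def __init__(self, provider: TokenProvider):
        self.provider = provider

    def auth_header(self) -> dict:
        tok = self.provider.get_token("https://orders.example/.default")
        return {"Authorization": f"Bearer {tok.token}"}

client = OrderApiClient(FakeManagedIdentity())
print(client.auth_header()["Authorization"])  # Bearer token-for-https://orders.example/.default
```

The design point is the injection seam: the client depends on a `TokenProvider` interface, so swapping the fake for a real platform identity changes no agent code.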
Data access security
The principle of least privilege is critical for agents. An agent should access only the data it needs – nothing more.
- Scoped permissions – if an agent needs to read customer order history, it should not have access to all customers. Scope to the authenticated user's data.
- Sensitivity labels – Microsoft Purview sensitivity labels inform agent data access policies. They help classify content, but labels alone don't automatically suppress content from agent responses. Enforce with permissions, DLP policies, grounding scope controls, and audit logging.
- Row-level security – for Dataverse-backed agents, security roles control which records the agent can access on behalf of the user.
- Data loss prevention – DLP policies can block agents from accessing or transmitting sensitive data types (credit card numbers, national IDs).
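Scoped access and DLP can be sketched together in a few lines of Python. This is an illustration only – the record store, the `c1`/`c2` IDs, and the single card-number regex are invented for the example; in production these controls live in the platform (Dataverse security roles, Purview DLP policies), not in agent code.

```python
import re

# Hypothetical record store; in a real system this sits behind row-level security.
ORDERS = [
    {"customer_id": "c1", "item": "laptop", "card": "4111 1111 1111 1111"},
    {"customer_id": "c2", "item": "phone",  "card": "5500 0000 0000 0004"},
]

# Crude card-number pattern, for illustration only.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")

def fetch_orders(authenticated_user: str) -> list[dict]:
    """Row-level scoping: only the authenticated user's records are visible."""
    return [o for o in ORDERS if o["customer_id"] == authenticated_user]

def dlp_filter(text: str) -> str:
    """DLP-style outbound check: redact anything that looks like a card number."""
    return CARD_PATTERN.sub("[REDACTED]", text)

orders = fetch_orders("c1")
reply = dlp_filter(f"Your order: {orders[0]['item']}, paid with {orders[0]['card']}")
print(reply)  # Your order: laptop, paid with [REDACTED]
```

Note the layering even in the sketch: scoping limits what the agent can read, and the outbound filter catches sensitive values that slip through anyway.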
Model security
Protecting the model itself – not just the data it accesses:
- Endpoint protection – model endpoints should not be publicly accessible. Use private endpoints within a virtual network.
- Model artefact security – model files stored in the registry should have access controls. Not everyone should be able to download or copy production models.
- Inference logging – log all requests to model endpoints for audit purposes. Monitor for unusual patterns (bulk extraction attempts).
- Model theft prevention – rate limiting and output perturbation can mitigate model extraction attacks (where an attacker queries the model systematically to reconstruct it).
Runtime security
The execution environment needs its own protections:
- Sandboxing – agent code runs in isolated environments. A compromised agent cannot access other agents or system resources.
- Rate limiting – cap the number of requests per user, per session, and per time window. Prevents abuse and contains blast radius.
- Timeout enforcement – set maximum execution time for agent responses. Prevents runaway processes from consuming resources.
- Anomaly detection – monitor for unusual patterns: sudden spikes in usage, unusual query patterns, attempts to access out-of-scope data.
Scenario: Marcus designs security for Vanguard's financial advisory agent
Marcus Webb (CISO at Vanguard Financial Group) designs the security architecture for a Copilot Studio agent that provides financial advisory information to wealth management clients.
Identity and authentication:
- Clients authenticate via Entra ID with mandatory MFA
- Conditional access: agent accessible only from approved devices and locations
- Agent authenticates to D365 Finance using a managed identity with read-only access to the client's own portfolio data
Data access:
- Row-level security ensures the agent can only access the authenticated client's records
- Sensitivity labels: "Highly Confidential" labels on portfolio valuations, enforced through DLP, prevent the agent from including exact figures in unencrypted channels
- DLP policy blocks the agent from transmitting account numbers or tax IDs in responses
Network:
- Agent backend runs in a virtual network with private endpoints to D365 and the Foundry model endpoint
- API Management gateway handles external-facing traffic with WAF protection
- No direct internet access from the agent's compute environment
Runtime:
- Rate limited to 20 requests per user per minute
- Session timeout after 15 minutes of inactivity
- All interactions logged to immutable audit storage for regulatory compliance
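The 15-minute inactivity timeout in the second bullet amounts to a last-seen map with an expiry check. The Python below is a simplified, single-process sketch – real session state lives in the hosting platform, not in agent code.

```python
class SessionStore:
    """Expires sessions after `idle_limit_s` of inactivity. A simplified
    in-memory sketch; real session state lives in the hosting platform."""
    def __init__(self, idle_limit_s: float = 15 * 60):
        self.idle_limit_s = idle_limit_s
        self.last_seen: dict[str, float] = {}

    def touch(self, session_id: str, now: float) -> None:
        """Record activity on a session."""
        self.last_seen[session_id] = now

    def is_active(self, session_id: str, now: float) -> bool:
        last = self.last_seen.get(session_id)
        if last is None or now - last > self.idle_limit_s:
            self.last_seen.pop(session_id, None)  # expire and forget the session
            return False
        return True

store = SessionStore()
store.touch("s1", now=0.0)
print(store.is_active("s1", now=600.0))   # True: 10 minutes idle
print(store.is_active("s1", now=1500.0))  # False: 25 minutes idle, expired
```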
Content safety:
- Prompt shields enabled to detect manipulation attempts
- Output filter prevents the agent from providing specific investment recommendations (regulatory requirement)
- PII redaction on any logged conversation data
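The PII-redaction bullet can be sketched as a pass over the transcript before it is persisted. The two patterns below are invented for illustration – a real deployment would use a PII detection service rather than hand-written regexes.

```python
import re

# Illustrative patterns only; real deployments use a PII service, not two regexes.
PATTERNS = {
    "ACCOUNT": re.compile(r"\b\d{8,12}\b"),           # bare 8-12 digit account numbers
    "TAX_ID":  re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # SSN-style tax IDs
}

def redact_for_logging(text: str) -> str:
    """Replace detected PII with typed placeholders before the transcript is logged."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

log_line = redact_for_logging("Client 123-45-6789 asked about account 900112233.")
print(log_line)  # Client [TAX_ID] asked about account [ACCOUNT].
```

Typed placeholders (rather than a blanket `[REDACTED]`) keep the logs useful for audit while removing the sensitive values themselves.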
Exam tip: security covers more than just the agent
The exam asks about end-to-end security, not just agent-level controls:
- The data the agent accesses – who has permission? What sensitivity labels apply? Is row-level security enforced?
- The models the agent calls – are endpoints protected? Are model artefacts secured? Is inference logged?
- The channels the agent communicates through – Teams, web chat, email? Each channel has its own security considerations.
- The people who manage the agent – who can modify topics, update knowledge sources, change configuration? Admin access needs the same rigour as user access.
If the exam presents a security scenario, look for the answer that addresses the MOST layers – not just one.
Knowledge check
Marcus discovers that Vanguard's financial advisory agent can return portfolio valuations for ANY client, not just the authenticated user's portfolio. Which security control is missing?
An architect proposes storing the model API key in the agent's configuration file for simplicity. What is the correct security approach?
Which combination of controls provides the strongest defence-in-depth for an agent that accesses sensitive financial data?
Next up: Governance – designing governance frameworks for agent registration, approval workflows, data residency, and access controls on grounding data.