Agent Security
Design multi-layered security for AI agents – covering identity, data access, network isolation, model protection, runtime hardening, and content safety.
Security is not a single layer
Securing an agent is like securing a building. You need locks on the front door (authentication), security badges for each floor (data access controls), guards monitoring behaviour (runtime security), CCTV for the vault (model protection), and a screening process for everything entering or leaving (content safety).
No single control is enough. If someone gets past the front door, the floor badges stop them. If they forge a badge, the guards catch them. Defence in depth.
The five security layers
| Security Layer | What It Protects | Key Controls |
|---|---|---|
| Identity and authentication | Who can use the agent. What the agent can access. | Entra ID authentication, OAuth for API access, managed identities for service-to-service, conditional access policies |
| Data access | The data sources the agent reads from and writes to. | Least-privilege permissions, scoped API access, sensitivity labels, row-level security, data loss prevention policies |
| Network | Traffic between the agent, users, data sources, and model endpoints. | Private endpoints, virtual network integration, network security groups, firewall rules, API Management gateway |
| Runtime | The agent execution environment itself. | Sandboxed execution, rate limiting, DDoS protection, request throttling, timeout enforcement, anomaly detection |
| Content safety | What goes into and comes out of the agent. | Input validation, prompt shields, jailbreak detection, output filtering, PII redaction, content moderation |
Identity and authentication
Agents interact with users AND with backend services. Both directions need authentication:
| Direction | Authentication Method | Design Consideration |
|---|---|---|
| User to agent | Entra ID SSO, multi-factor authentication | Users authenticate through existing identity. Conditional access can restrict agent access by location, device, or risk level. |
| Agent to data sources | Managed identity, OAuth client credentials | Use managed identities – no stored credentials. The agent authenticates as itself with scoped permissions. |
| Agent to model endpoints | API key rotation, managed identity, network restrictions | Rotate API keys automatically. Prefer managed identity where supported. Restrict endpoint access to specific virtual networks. |
| Agent to external APIs | OAuth with connection references, API key via Key Vault | Store secrets in Key Vault. Use connection references for per-environment credential management. |
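The agent-to-service rows in this table all follow one pattern: the agent holds no stored secret and requests a scoped token from an identity provider at call time. The Python below is a minimal sketch of that pattern only – `FakeManagedIdentity`, `OrderApiClient`, and the `orders.example` scope are invented stand-ins, not real Azure APIs (in Azure you would use the `azure-identity` library instead of hand-rolling this).

```python
import time
from dataclasses import dataclass

@dataclass
class AccessToken:
    token: str
    scope: str
    expires_on: float

class TokenProvider:
    """Abstract source of scoped tokens; in Azure, a managed identity plays this role."""
    def get_token(self, scope: str) -> AccessToken:
        raise NotImplementedError

class FakeManagedIdentity(TokenProvider):
    """Stand-in for a platform-issued identity. No secret lives in agent config;
    a real managed identity would call the platform's token endpoint here."""
    def get_token(self, scope: str) -> AccessToken:
        return AccessToken(token=f"token-for-{scope}", scope=scope,
                           expires_on=time.time() + 3600)

class OrderApiClient:
    """Agent-side client: authenticates per call with a freshly issued, scoped token."""
    def __init__(self, provider: TokenProvider):
        self.provider = provider

    def auth_header(self) -> dict:
        tok = self.provider.get_token("https://orders.example/.default")
        return {"Authorization": f"Bearer {tok.token}"}

client = OrderApiClient(FakeManagedIdentity())
print(client.auth_header()["Authorization"])  # Bearer token-for-https://orders.example/.default
```

The design point is the injection seam: the client depends on a `TokenProvider` interface, so swapping the fake for a real platform identity changes no agent code.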
Data access security
The principle of least privilege is critical for agents. An agent should access only the data it needs – nothing more.
- Scoped permissions – if an agent needs to read customer order history, it should not have access to all customers. Scope to the authenticated user's data.
- Sensitivity labels – Microsoft Purview sensitivity labels inform agent data access policies. They help classify content, but labels alone don't automatically suppress content from agent responses. Enforce with permissions, DLP policies, grounding scope controls, and audit logging.
- Row-level security – for Dataverse-backed agents, security roles control which records the agent can access on behalf of the user.
- Data loss prevention – DLP policies can block agents from accessing or transmitting sensitive data types (credit card numbers, national IDs).
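Scoped access and DLP can be sketched together in a few lines of Python. This is an illustration only – the record store, the `c1`/`c2` IDs, and the single card-number regex are invented for the example; in production these controls live in the platform (Dataverse security roles, Purview DLP policies), not in agent code.

```python
import re

# Hypothetical record store; in a real system this sits behind row-level security.
ORDERS = [
    {"customer_id": "c1", "item": "laptop", "card": "4111 1111 1111 1111"},
    {"customer_id": "c2", "item": "phone",  "card": "5500 0000 0000 0004"},
]

# Crude card-number pattern, for illustration only.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")

def fetch_orders(authenticated_user: str) -> list[dict]:
    """Row-level scoping: only the authenticated user's records are visible."""
    return [o for o in ORDERS if o["customer_id"] == authenticated_user]

def dlp_filter(text: str) -> str:
    """DLP-style outbound check: redact anything that looks like a card number."""
    return CARD_PATTERN.sub("[REDACTED]", text)

orders = fetch_orders("c1")
reply = dlp_filter(f"Your order: {orders[0]['item']}, paid with {orders[0]['card']}")
print(reply)  # Your order: laptop, paid with [REDACTED]
```

Note the layering even in the sketch: scoping limits what the agent can read, and the outbound filter catches sensitive values that slip through anyway.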
Model security
Protecting the model itself – not just the data it accesses:
- Endpoint protection – model endpoints should not be publicly accessible. Use private endpoints within a virtual network.
- Model artefact security – model files stored in the registry should have access controls. Not everyone should be able to download or copy production models.
- Inference logging – log all requests to model endpoints for audit purposes. Monitor for unusual patterns (bulk extraction attempts).
- Model theft prevention – rate limiting and output perturbation can mitigate model extraction attacks (where an attacker queries the model systematically to reconstruct it).
Runtime security
The execution environment needs its own protections:
- Sandboxing – agent code runs in isolated environments. A compromised agent cannot access other agents or system resources.
- Rate limiting – cap the number of requests per user, per session, and per time window. Prevents abuse and contains blast radius.
- Timeout enforcement – set maximum execution time for agent responses. Prevents runaway processes from consuming resources.
- Anomaly detection – monitor for unusual patterns: sudden spikes in usage, unusual query patterns, attempts to access out-of-scope data.
Scenario: Marcus designs security for Vanguard's financial advisory agent
Marcus Webb (CISO at Vanguard Financial Group) designs the security architecture for a Copilot Studio agent that provides financial advisory information to wealth management clients.
Identity and authentication:
- Clients authenticate via Entra ID with mandatory MFA
- Conditional access: agent accessible only from approved devices and locations
- Agent authenticates to D365 Finance using a managed identity with read-only access to the client's own portfolio data
Data access:
- Row-level security ensures the agent can only access the authenticated client's records
- Sensitivity labels: "Highly Confidential" labels on portfolio valuations, enforced through DLP, prevent the agent from including exact figures in unencrypted channels
- DLP policy blocks the agent from transmitting account numbers or tax IDs in responses
Network:
- Agent backend runs in a virtual network with private endpoints to D365 and the Foundry model endpoint
- API Management gateway handles external-facing traffic with WAF protection
- No direct internet access from the agent's compute environment
Runtime:
- Rate limited to 20 requests per user per minute
- Session timeout after 15 minutes of inactivity
- All interactions logged to immutable audit storage for regulatory compliance
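The 15-minute inactivity timeout in the second bullet amounts to a last-seen map with an expiry check. The Python below is a simplified, single-process sketch – real session state lives in the hosting platform, not in agent code.

```python
class SessionStore:
    """Expires sessions after `idle_limit_s` of inactivity. A simplified
    in-memory sketch; real session state lives in the hosting platform."""
    def __init__(self, idle_limit_s: float = 15 * 60):
        self.idle_limit_s = idle_limit_s
        self.last_seen: dict[str, float] = {}

    def touch(self, session_id: str, now: float) -> None:
        """Record activity on a session."""
        self.last_seen[session_id] = now

    def is_active(self, session_id: str, now: float) -> bool:
        last = self.last_seen.get(session_id)
        if last is None or now - last > self.idle_limit_s:
            self.last_seen.pop(session_id, None)  # expire and forget the session
            return False
        return True

store = SessionStore()
store.touch("s1", now=0.0)
print(store.is_active("s1", now=600.0))   # True: 10 minutes idle
print(store.is_active("s1", now=1500.0))  # False: 25 minutes idle, expired
```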
Content safety:
- Prompt shields enabled to detect manipulation attempts
- Output filter prevents the agent from providing specific investment recommendations (regulatory requirement)
- PII redaction on any logged conversation data
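The PII-redaction bullet can be sketched as a pass over the transcript before it is persisted. The two patterns below are invented for illustration – a real deployment would use a PII detection service rather than hand-written regexes.

```python
import re

# Illustrative patterns only; real deployments use a PII service, not two regexes.
PATTERNS = {
    "ACCOUNT": re.compile(r"\b\d{8,12}\b"),           # bare 8-12 digit account numbers
    "TAX_ID":  re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # SSN-style tax IDs
}

def redact_for_logging(text: str) -> str:
    """Replace detected PII with typed placeholders before the transcript is logged."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

log_line = redact_for_logging("Client 123-45-6789 asked about account 900112233.")
print(log_line)  # Client [TAX_ID] asked about account [ACCOUNT].
```

Typed placeholders (rather than a blanket `[REDACTED]`) keep the logs useful for audit while removing the sensitive values themselves.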
Exam tip: security covers more than just the agent
The exam asks about end-to-end security, not just agent-level controls:
- The data the agent accesses – who has permission? What sensitivity labels apply? Is row-level security enforced?
- The models the agent calls – are endpoints protected? Are model artefacts secured? Is inference logged?
- The channels the agent communicates through – Teams, web chat, email? Each channel has its own security considerations.
- The people who manage the agent – who can modify topics, update knowledge sources, change configuration? Admin access needs the same rigour as user access.
If the exam presents a security scenario, look for the answer that addresses the MOST layers – not just one.
Knowledge check
Marcus discovers that Vanguard's financial advisory agent can return portfolio valuations for ANY client, not just the authenticated user's portfolio. Which security control is missing?
An architect proposes storing the model API key in the agent's configuration file for simplicity. What is the correct security approach?
Which combination of controls provides the strongest defence-in-depth for an agent that accesses sensitive financial data?
Next up: Governance – designing governance frameworks for agent registration, approval workflows, data residency, and access controls on grounding data.