Data Classification and Loss Prevention

Classification: The Foundation of Everything

You can’t protect what you haven’t classified. Every data protection decision — encryption, access control, sharing restrictions, retention, DLP enforcement — depends on knowing what the data is. Classification is the foundation that everything else builds on.

Yet most organisations skip classification and jump straight to DLP. They create DLP rules that look for credit card numbers and social security numbers, and those rules catch maybe 10% of sensitive data. The other 90% (strategic plans, engineering designs, customer communications, M&A preparations) doesn’t match a pattern, so it flows freely.

The security architect designs the classification strategy first, then layers DLP on top.

Designing the Classification Taxonomy

The taxonomy must be simple enough for every user to understand and specific enough to drive protection decisions. Microsoft recommends 3-5 sensitivity levels:

Level	Label	Protection	Example Content
1	Public	No restrictions	Marketing brochures, press releases, public website content
2	Internal	Organisation-only access	Internal memos, team plans, process documentation
3	Confidential	Restricted to specific groups	Financial reports, customer data, product roadmaps
4	Highly Confidential	Encrypted, tracked, no external sharing	M&A plans, executive compensation, trade secrets

Common architecture mistakes:

Too many labels — More than 5-6 sensitivity levels confuse users. They won’t choose correctly between “Confidential - Internal,” “Confidential - Partners,” and “Confidential - Restricted.”
No default label — If there’s no default, users skip labelling entirely. Set “Internal” as the default for most users.
Labels without protection — A label that adds a header but doesn’t restrict access gives a false sense of security. Every label should enforce something.
No sublabels for key scenarios — “Confidential” might need sublabels: “Confidential - Financial,” “Confidential - HR,” “Confidential - Legal” — each with different group-based encryption recipients.
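
The mistakes above can be checked mechanically before a taxonomy is published. The following is a minimal sketch, not a Microsoft Purview API: the `Label` class, the `lint_taxonomy` function, the protection names, and the five-label limit are all hypothetical and chosen only to mirror the guidance in this section.

```python
from dataclasses import dataclass, field

MAX_TOP_LEVEL_LABELS = 5  # assumption: upper bound taken from the 3-5 guidance above


@dataclass
class Label:
    name: str
    protections: list[str] = field(default_factory=list)  # hypothetical protection names
    sublabels: list[str] = field(default_factory=list)
    is_default: bool = False


def lint_taxonomy(labels: list[Label]) -> list[str]:
    """Return design warnings; an empty list means none of the checks fired."""
    warnings = []
    if len(labels) > MAX_TOP_LEVEL_LABELS:
        warnings.append(f"too many labels: {len(labels)} > {MAX_TOP_LEVEL_LABELS}")
    if not any(label.is_default for label in labels):
        warnings.append("no default label")
    for label in labels:
        # "Public" is intentionally unrestricted, so it is exempt from this check.
        if label.name != "Public" and not label.protections:
            warnings.append(f"label without protection: {label.name}")
    return warnings


taxonomy = [
    Label("Public"),
    Label("Internal", ["org_only"], is_default=True),
    Label("Confidential", ["restrict_to_groups"], ["Financial", "HR", "Legal"]),
    Label("Highly Confidential", ["encrypt", "track", "block_external"]),
]
print(lint_taxonomy(taxonomy))
```

Running the same check against a 12-label proposal, or one with no default, returns the corresponding warnings, which makes the review conversation concrete.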

Sensitivity Labelling Strategies

Labelling Approaches
Aspect	Manual Labelling	Client-side Auto-labelling	Service-side Auto-labelling
Who/what applies the label?	The user selects the label in Office apps, Outlook, or the SharePoint web UI	The Office client recommends or automatically applies a label based on content inspection	The Microsoft 365 service scans content at rest in SharePoint/OneDrive/Exchange and applies labels without user involvement
When does it apply?	At creation or when the user remembers to label	During document editing in Office desktop/web apps — inspects as the user types	On a schedule — scans existing content in SharePoint libraries, OneDrive folders, and Exchange mailboxes
What triggers it?	User judgment	Sensitive information types (SIT) or trainable classifiers detected in the document	Same SITs and classifiers, but applied to content already at rest — catches the backlog of unlabelled content
Coverage gap	Users forget, choose wrong labels, or don't label at all	Only works in Office apps — doesn't label PDFs, images, or non-Office content	Runs on a schedule, not real-time. Newly created content waits until the next scan cycle.
Licence requirement	E3 (basic) / E5 (recommended for priority)	E5 or E5 Compliance add-on	E5 or E5 Compliance add-on
Best for	All organisations — establishes user awareness and accountability	Catching sensitive data as it's created in Office documents	Classifying the backlog of existing content — millions of existing documents

Trainable Classifiers

Sensitive Information Types (SITs) detect patterns — credit card numbers, social security numbers, bank account formats. But much sensitive data doesn’t follow a pattern. A board strategy document, an employee performance review, or an engineering design specification contains sensitive information that no regex can detect.
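
To make the limits of pattern matching concrete, here is a rough sketch of pattern-plus-checksum detection for card numbers. It is an illustration only and not how any built-in SIT is implemented: the regular expression, `luhn_valid`, and `find_card_numbers` are hypothetical, and real SITs typically also weigh supporting keywords and confidence levels.

```python
import re

# Candidate: 16 digits, optionally grouped in fours by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b")


def luhn_valid(number: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, test mod 10."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return substrings that look like card numbers and pass the checksum."""
    return [m.group() for m in CARD_PATTERN.finditer(text) if luhn_valid(m.group())]


print(find_card_numbers("Card on file: 4111 1111 1111 1111"))
print(find_card_numbers("The board will review the acquisition target in Q3."))
```

The second call finds nothing even though the sentence is far more sensitive than a test card number, which is the gap trainable classifiers are meant to close.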

Trainable classifiers use machine learning to identify content types based on examples:

Built-in classifiers — Microsoft provides pre-trained classifiers for common content: resumes, source code, financial statements, harassment, profanity, intellectual property
Custom classifiers — You provide 50-500 examples of the content type. The classifier learns the patterns and identifies similar content across your tenant.

The architect designs which classifiers to deploy based on the organisation’s sensitive data types. For example:

A financial firm needs classifiers for investment memos, trading strategies, and client reports
A manufacturing company needs classifiers for engineering designs, product specifications, and supplier agreements
A law firm needs classifiers for legal opinions, case strategies, and privileged communications

💰 Scenario: Ingrid’s Classification Taxonomy for Nordic Capital

Ingrid designs the sensitivity label taxonomy for Nordic Capital Partners. The financial services regulator requires data classification, and the firm handles multiple categories of sensitive information:

Taxonomy design:

Label	Sublabel	Protection	Use Case
Public	—	No restrictions	Published research reports, marketing materials
Internal	—	Org-only access, no external sharing	Internal policies, general communications
Confidential	Financial	Encrypted to Finance group + compliance team	Financial reports, trading data, portfolio analysis
Confidential	Client	Encrypted to deal team + client relationship managers	Client proposals, engagement letters, KYC documents
Confidential	HR	Encrypted to HR group only	Employee records, compensation, performance reviews
Highly Confidential	Board	Encrypted to board members + CEO + General Counsel	Board minutes, M&A plans, executive compensation
Highly Confidential	Regulatory	Encrypted to compliance team only	Regulatory submissions, examination responses, audit findings

Labelling strategy:

Default label: “Internal” applied to all new documents. Users must actively upgrade to Confidential or downgrade to Public.
Client-side auto-labelling: Recommend “Confidential - Financial” when financial SITs are detected (account numbers, SWIFT codes, portfolio values).
Service-side auto-labelling: Scan existing SharePoint libraries to classify the backlog. Ingrid estimates 2 million existing documents need classification.
Trainable classifiers: Custom classifier trained on 200 examples of investment memos. This catches strategic analysis content that no SIT can detect.
Mandatory labelling: Users cannot save or send documents/emails without selecting a label.
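
The interplay between the default label and the auto-labelling recommendation can be sketched as a small decision function. This is a simplified model under stated assumptions: `label_document`, the `RANK` ordering, and the signal names are hypothetical stand-ins for real SITs and classifiers, and the sketch only recommends a label, it never overrides the user's choice.

```python
from typing import Optional

DEFAULT_LABEL = "Internal"
# Ordered from least to most sensitive, following the taxonomy table.
RANK = {"Public": 0, "Internal": 1, "Confidential": 2, "Highly Confidential": 3}
# Hypothetical signal names standing in for financial SITs and the investment memo classifier.
FINANCIAL_SIGNALS = {"account_number", "swift_code", "portfolio_value", "investment_memo"}


def label_document(user_choice: Optional[str], signals: set[str]) -> tuple[str, Optional[str]]:
    """Return (applied label, recommended label or None)."""
    applied = user_choice or DEFAULT_LABEL  # the default covers users who pick nothing
    top_level = applied.split(" - ")[0]     # "Confidential - HR" ranks as "Confidential"
    recommendation = None
    if signals & FINANCIAL_SIGNALS and RANK[top_level] < RANK["Confidential"]:
        recommendation = "Confidential - Financial"
    return applied, recommendation


print(label_document(None, {"swift_code"}))
```

A document that is already Confidential or higher gets no recommendation in this sketch, on the assumption that the user has made a deliberate choice.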

Harald Eriksen validates that the taxonomy aligns with the FMA’s data classification requirements. Yuki Tanaka confirms that Conditional Access can enforce label-driven access conditions, for example through an authentication context attached to a labelled SharePoint site, so that content labelled “Highly Confidential” can only be accessed from managed devices.

DLP Architecture

Data Loss Prevention enforces classification decisions. The architect designs DLP policies that detect sensitive content and take appropriate action based on the sensitivity label, the channel, and the user’s risk level.

DLP Policy Architecture Decisions

Scope: Where does the policy apply?

Exchange Online (email)
SharePoint and OneDrive (document libraries)
Teams (chat and channel messages)
Endpoints (Windows devices — copy to USB, print, upload to cloud)
Third-party cloud apps (via Defender for Cloud Apps — SaaS integrations)
On-premises repositories (via Microsoft Purview scanner)

Detection: What triggers the policy?

Sensitivity labels (most reliable — the label has already classified the content)
Sensitive Information Types (pattern matching — credit card numbers, tax IDs)
Trainable classifiers (ML-based content detection)
Exact Data Match (EDM — match against a known database of sensitive values)
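
The idea behind Exact Data Match, comparing content against fingerprints of known sensitive values rather than against a generic pattern, can be sketched as follows. This is a conceptual illustration, not the Purview EDM implementation: the salt, the `fingerprint`, `build_index`, and `exact_matches` helpers, the naive tokeniser, and the account identifiers are all made up for the example.

```python
import hashlib

SALT = b"demo-salt"  # assumption: a real deployment would generate and protect this value


def fingerprint(value: str) -> str:
    """Hash a normalised value so the index never stores the raw data."""
    normalised = value.strip().upper().encode("utf-8")
    return hashlib.sha256(SALT + normalised).hexdigest()


def build_index(sensitive_values: list[str]) -> set[str]:
    return {fingerprint(value) for value in sensitive_values}


def exact_matches(text: str, index: set[str]) -> list[str]:
    """Return tokens whose fingerprint appears in the index."""
    tokens = text.replace(",", " ").split()
    return [token for token in tokens if fingerprint(token) in index]


index = build_index(["ACC-00017", "ACC-00042"])
print(exact_matches("Please refund acc-00042 today", index))
print(exact_matches("Invoice ACC-99999 is overdue", index))
```

Only the identifier that exists in the known data set matches; a value with the same shape but not in the set does not, which is why this approach tends to produce fewer false positives than a pattern alone.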

Actions: What happens when a violation is detected?

Audit only — Log the event but don’t block. Useful for initial rollout to assess false positive rates.
Warn — Show the user a policy tip explaining why this action is risky. User can override with justification.
Block with override — Prevent the action but allow the user to provide a business justification to proceed.
Block — Prevent the action with no override. Used for the most sensitive data (Highly Confidential labels).
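
Scope, detection, and action fit together as a rule lookup. The sketch below models that with hypothetical `Rule` and `evaluate` names and invented rule and channel identifiers; it assumes that when several rules match, the most restrictive action is the one enforced.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Action(IntEnum):
    """Ordered from least to most restrictive."""
    AUDIT = 0
    WARN = 1
    BLOCK_WITH_OVERRIDE = 2
    BLOCK = 3


@dataclass(frozen=True)
class Rule:
    name: str
    labels: frozenset[str]    # detection: which sensitivity labels trigger the rule
    channels: frozenset[str]  # scope: where the rule applies
    action: Action


def evaluate(rules: list[Rule], label: str, channel: str) -> Optional[Rule]:
    """Return the matching rule with the most restrictive action, or None."""
    matched = [r for r in rules if label in r.labels and channel in r.channels]
    if not matched:
        return None
    return max(matched, key=lambda r: r.action)


rules = [
    Rule("audit-all-email", frozenset({"Confidential", "Highly Confidential"}),
         frozenset({"exchange"}), Action.AUDIT),
    Rule("block-hc-external", frozenset({"Highly Confidential"}),
         frozenset({"exchange", "sharepoint", "teams"}), Action.BLOCK),
]
winner = evaluate(rules, "Highly Confidential", "exchange")
print(winner.name, winner.action.name)
```

Using an ordered enum keeps the "most restrictive wins" comparison explicit instead of burying it in string handling.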

🌐 Scenario: Elena’s Multi-Country DLP Design

Elena must implement DLP across Meridian’s 12-country operation. The challenge: different countries have different data protection regulations, different definitions of sensitive data, and different enforcement expectations.

Elena’s architecture:

Global policies (applied everywhere):

Block external sharing of “Highly Confidential” documents across all channels — no override
Warn when “Confidential” documents are shared externally — user can override with justification
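Block USB copy of any labelled document on Windows devices onboarded to endpoint DLP; devices that are not onboarded are outside endpoint DLP’s reach and need a separate control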

Regional policies:

EU operations: GDPR-aligned DLP. Detect EU personal data (names, addresses, national IDs). Block external transfer of personal data without encryption. Aligned to GDPR Article 32 requirements.
US operations: Detect US-specific SITs (SSN, driver’s licence, financial account numbers). Healthcare subsidiaries have additional HIPAA-aligned policies.
APAC operations: Country-specific SITs for each jurisdiction (Australian TFN, Japanese My Number, etc.).
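
The global baseline plus regional overlay design can be expressed as simple composition. This is a sketch only: the policy identifiers, the `policies_for` function, and the country-to-region mapping are hypothetical, since the source does not list Meridian's countries.

```python
# Policy names are illustrative identifiers, not real Purview policy names.
GLOBAL_BASELINE = [
    "block-external-highly-confidential",
    "warn-external-confidential",
    "block-usb-labelled-documents",
]
REGIONAL_OVERLAYS = {
    "EU": ["gdpr-personal-data"],
    "US": ["us-specific-sits"],
    "APAC": ["apac-country-sits"],
}
# Assumption: a sample mapping; the real list of countries is not given in the scenario.
COUNTRY_REGION = {"DE": "EU", "FR": "EU", "US": "US", "AU": "APAC", "JP": "APAC"}


def policies_for(country: str, healthcare: bool = False) -> list[str]:
    """Every country gets the global baseline, then its regional overlay on top."""
    region = COUNTRY_REGION[country]
    policies = GLOBAL_BASELINE + REGIONAL_OVERLAYS[region]
    if region == "US" and healthcare:
        policies = policies + ["hipaa-aligned"]
    return policies


print(policies_for("DE"))
print(policies_for("US", healthcare=True))
```

Keeping the baseline in one place means a change to a global rule reaches every country without touching the regional overlays.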

Enforcement progression:

Month 1-2: Audit only for all policies. Elena collects data on violation frequency and false positive rates.
Month 3-4: Warn mode for medium-sensitivity policies. Users see policy tips and learn the rules.
Month 5+: Block mode for high-sensitivity policies. Warn mode for medium.
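
The progression can be written down as a lookup from rollout month and policy sensitivity to enforcement mode. The `enforcement_mode` function is hypothetical, and it contains one labelled assumption: the schedule above does not say what high-sensitivity policies do in months 3-4, so this sketch leaves them in audit until month 5.

```python
def enforcement_mode(month: int, sensitivity: str) -> str:
    """Map rollout month (1-based) and policy sensitivity to an enforcement mode."""
    if month < 1:
        raise ValueError("month must be 1 or greater")
    if sensitivity not in ("medium", "high"):
        raise ValueError("sensitivity must be 'medium' or 'high'")
    if month <= 2:
        return "audit"  # months 1-2: audit only for all policies
    if month <= 4:
        # Assumption: only the medium-sensitivity behaviour is stated for months 3-4;
        # high-sensitivity policies are kept in audit here.
        return "warn" if sensitivity == "medium" else "audit"
    return "block" if sensitivity == "high" else "warn"  # month 5 onwards


for month in (1, 3, 5):
    print(month, enforcement_mode(month, "medium"), enforcement_mode(month, "high"))
```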

Li Wei asks: “Why not block everything from day one?” Elena explains: “If we block immediately, we’ll get flooded with help desk tickets and override requests. Worse, users will find workarounds — copying content to personal email, using unsanctioned cloud services. A progressive rollout builds user awareness and lets us tune false positives before enforcement.”

Adaptive Protection

Adaptive Protection is the integration between Microsoft Purview Insider Risk Management and DLP. It dynamically adjusts DLP enforcement based on the user’s current risk level.

How Adaptive Protection Works

Insider Risk Management calculates a risk score for each user based on their behaviour — mass file downloads, unusual sharing patterns, access to sensitive data outside normal patterns, potential data exfiltration signals.
Risk levels: Elevated, Moderate, Minor (configurable thresholds).
DLP policies reference risk levels: A DLP policy that normally shows a “Warn” for medium-sensitivity data can automatically escalate to “Block” for users with an “Elevated” risk score.

Example: A departing employee (flagged by Insider Risk because HR data indicates resignation) tries to download 500 documents from SharePoint. Normal DLP policy would show a warning. Adaptive Protection detects the elevated risk score and escalates to blocking the download entirely.
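
The escalation in this example can be sketched as an override table keyed by risk level. The names `ADAPTIVE_OVERRIDES` and `effective_action` are hypothetical. Only the warn-to-block escalation for Elevated users comes from the text above; leaving Moderate and Minor unchanged is an assumption made for the sketch.

```python
from typing import Optional

# Per-risk-level overrides: base action -> escalated action.
# Only the Elevated entry is described in the text; the empty entries are assumptions.
ADAPTIVE_OVERRIDES = {
    "Elevated": {"warn": "block"},
    "Moderate": {},
    "Minor": {},
}


def effective_action(base_action: str, risk_level: Optional[str]) -> str:
    """Return the action to enforce once the user's current risk level is considered."""
    overrides = ADAPTIVE_OVERRIDES.get(risk_level, {})
    return overrides.get(base_action, base_action)


print(effective_action("warn", None))        # standard user: policy behaves as written
print(effective_action("warn", "Elevated"))  # departing employee: escalated
```

The static policy stays the same; only the lookup result changes as the user's risk level changes, which is the point of the design.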

The architect designs Adaptive Protection as the connection between user behaviour monitoring and data protection enforcement — dynamic DLP that responds to context, not static rules.

🎯 Exam Tip: DLP Design Questions

SC-100 tests DLP at the architecture level. You won’t be asked how to create a DLP policy. You’ll be asked: “An organisation operates in 12 countries with different data protection regulations. How should the architect design the DLP strategy?” The answer involves global baseline policies plus regional overlays, a phased enforcement progression (audit → warn → block), and Adaptive Protection for risk-based escalation. Focus on the design decisions, not the configuration steps.

Data Governance and Lifecycle

Classification and DLP protect data from unauthorised access and sharing. Data governance manages the lifecycle — how long data is retained, when it can be deleted, and what happens to records that must be preserved.

Retention Architecture

Retention policies apply broadly to locations (all Exchange mailboxes, all SharePoint sites) and define how long content is retained and what happens after the retention period (delete automatically, require disposition review, or do nothing).
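Retention labels apply to individual items. When a label and a policy both cover an item, the longest retention period still applies, and the label’s deletion setting takes precedence over the policy’s. Labels can be applied manually, automatically (based on content), or through records management workflows.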
Retention wins over deletion — If a retention policy says “retain for 7 years” and a user deletes the document after 2 years, the document is preserved for the remaining 5 years.
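
The arithmetic behind "retention wins over deletion" is simple date math. The sketch below uses hypothetical helpers (`add_years`, `on_user_delete`) and assumes the retention period starts at the creation date; in practice the start can be configured differently.

```python
from datetime import date


def add_years(start: date, years: int) -> date:
    """Add whole years, falling back to 28 February for leap-day start dates."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        return start.replace(year=start.year + years, day=28)


def on_user_delete(created: date, deleted_on: date, retain_years: int) -> str:
    """Describe what happens when a user deletes an item under retention."""
    retain_until = add_years(created, retain_years)  # assumption: period starts at creation
    if deleted_on < retain_until:
        return f"preserved until {retain_until.isoformat()}"
    return "deleted"


# Created 1 March 2020, deleted by the user two years later, 7-year retention.
print(on_user_delete(date(2020, 3, 1), date(2022, 3, 1), 7))
```

The user's view of the document disappears at deletion, but the retained copy stays available to compliance processes for the remaining five years.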

Records Management

For regulated industries, some documents must be declared as records — immutable, with a defined retention period and formal disposition process.

Regulatory records cannot be deleted by anyone, even admins, until the retention period expires
Disposition review requires designated reviewers to approve deletion of expired records
File plan provides a formal classification structure with additional metadata (department, category, citation)

The architect designs the retention and records management strategy alongside classification — the sensitivity label tells you how to protect the data, the retention policy tells you how long to keep it.

Knowledge Check

Ingrid is designing a data classification taxonomy for Nordic Capital Partners. She initially proposes 12 sensitivity labels to cover every type of financial data. Harald from compliance suggests this is too many. What is the architectural risk of having too many sensitivity labels?

Elena is deploying DLP across Meridian's 12-country operation. She proposes starting with blocking mode for all policies to demonstrate strong security to the board. What is the primary risk of this approach?

What does Adaptive Protection add to a DLP architecture that static DLP policies cannot provide?

Question

Why is data classification considered the foundation of all data protection?

Answer

Every data protection decision depends on knowing what the data is. DLP policies reference classifications to determine enforcement actions. Encryption decisions are based on data sensitivity. Retention policies vary by data type. Access controls are scoped by classification. Without classification, protection is either applied uniformly (too restrictive for low-sensitivity data, too permissive for high-sensitivity data) or not applied at all. Classification enables proportional, risk-based protection.

Question

What is the difference between client-side auto-labelling and service-side auto-labelling?

Answer

Client-side auto-labelling runs in Office desktop/web apps as the user works — it inspects content in real-time and recommends or applies labels during document creation. Service-side auto-labelling runs as a background service that scans content already at rest in SharePoint, OneDrive, and Exchange — it catches the backlog of existing unlabelled content. Client-side is real-time but limited to Office apps. Service-side covers existing content but runs on a schedule, not instantly.

Question

How does Adaptive Protection work in a DLP architecture?

Answer

Adaptive Protection integrates Insider Risk Management with DLP. Insider Risk calculates a risk score for each user based on behaviour signals (mass downloads, unusual sharing, data exfiltration patterns). DLP policies reference these risk levels. A policy that normally 'Warns' for a standard user can 'Block' for a user with an elevated risk score. This provides dynamic, context-aware enforcement — the same data action receives different treatment based on the user's current risk profile.

Question

Why should DLP deployment follow a progressive rollout (audit → warn → block) rather than starting with blocking?

Answer

Progressive rollout serves three purposes: 1) Audit mode baselines normal behaviour and identifies false positives before they impact users. 2) Warn mode educates users about policies — they learn what's sensitive and why sharing is restricted. 3) Blocking mode, deployed after tuning, has fewer false positives and more user acceptance. Starting with blocking causes users to find workarounds (personal email, unsanctioned services, screen photos), which moves data to unmonitored channels and reduces overall security.

Summary

Data classification is the foundation of data protection — you can’t protect what you haven’t classified. The architect designs a simple taxonomy (3-5 levels), implements labelling through manual, client-side auto, and service-side auto approaches, and layers DLP enforcement on top. DLP policies should be deployed progressively (audit → warn → block) and scoped globally with regional overlays for multi-country operations. Adaptive Protection connects insider risk scores to DLP enforcement, providing dynamic, context-aware data protection that static rules cannot achieve.

Next module: We complete Domain 4 with Azure data security — encryption at rest, in transit, and in use, plus database security and Key Vault integration.