Purview IP Client: Classify Files at Scale
Extend sensitivity labels beyond the cloud. The Microsoft Purview Information Protection client and scanner classify files on Windows devices, network file shares, and on-premises SharePoint libraries.
Why do you need an on-premises solution?
Imagine your house alarm only works when you are home. Helpful, but what about when you go to work?
Microsoft 365 sensitivity labels work brilliantly in the cloud: SharePoint, OneDrive, Exchange, Teams. But what about the file server in the basement? The shared network drive with 10 years of documents? The on-premises SharePoint farm that has not been migrated yet?
The Purview Information Protection client extends labels to Windows desktop apps (Word, Excel, PowerPoint, Outlook) and File Explorer. The Information Protection scanner crawls your file servers and on-premises SharePoint libraries, discovering and auto-labeling sensitive data at scale, like deploying a robot that reads every file in every cabinet and stamps it with the right classification.
The Information Protection client
What it enables
| Capability | Without Client | With Client |
|---|---|---|
| Labels in Office | Built-in labeling in M365 Apps (cloud-connected) | Same + enhanced features for complex scenarios |
| File Explorer labeling | Not available | Right-click any file to apply or view a sensitivity label |
| PowerShell classification | Not available | Bulk-label files with the Set-FileLabel cmdlet (named Set-AIPFileLabel in the older Azure Information Protection client) |
| Protected file viewer | View protected PDFs in browser | Open protected PDFs, images, and text files natively |
| Track & revoke | Not available for on-prem files | Track who accessed protected files, revoke access |
Built-in labeling vs the IP client
Microsoft 365 Apps include built-in sensitivity labeling. The IP client adds features on top:
| Feature | Built-in Labeling (M365 Apps) | Purview IP Client |
|---|---|---|
| Available in | Word, Excel, PowerPoint, Outlook (M365 Apps) | Same + File Explorer + PowerShell |
| Cloud-connected? | Yes, labels from the Purview portal | Yes, same labels, same policies |
| File Explorer support | No | Yes, right-click to classify any file type |
| PowerShell cmdlets | No | Yes: Set-FileLabel, Get-FileStatus (Set-AIPFileLabel, Get-AIPFileStatus in the older Azure Information Protection client) |
| Non-Office files | Limited | Classify and protect PDFs, images, text files, CAD files |
| Installation | Ships with M365 Apps (no extra install) | Separate download and deployment |
| Recommended for | Most organisations (default choice) | Organisations with on-prem file classification needs or non-Office file types |
Exam tip: built-in vs client, when to use which
Microsoft recommends built-in labeling for most organisations. The IP client is needed when you require:
- File Explorer right-click labeling: classify files outside Office apps
- PowerShell bulk operations: label thousands of files programmatically
- Non-Office file protection: PDFs, images, text files, CAD drawings
- On-premises scanner: the scanner component requires the IP client infrastructure
On the exam, if a scenario involves on-premises file classification or non-Office file types, the IP client is the answer.
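The decision rule in this tip can be written down as a small lookup. The Python sketch below is only a study aid: the requirement names and the `recommend_labeling_tool` function are hypothetical and are not part of any Microsoft API.

```python
# Study-aid sketch. The requirement names are made up for illustration;
# they mirror the four bullets in the exam tip, not a real Purview setting.
CLIENT_ONLY_REQUIREMENTS = {
    "file_explorer_labeling",    # right-click labeling outside Office apps
    "powershell_bulk_labeling",  # label many files programmatically
    "non_office_file_types",     # PDFs, images, text files, CAD drawings
    "on_prem_scanner",           # the scanner needs the client installed
}


def recommend_labeling_tool(requirements):
    """Return 'IP client' if any requirement needs it, else 'built-in labeling'."""
    needed = CLIENT_ONLY_REQUIREMENTS & set(requirements)
    return "IP client" if needed else "built-in labeling"


print(recommend_labeling_tool(["office_labeling"]))
print(recommend_labeling_tool(["office_labeling", "on_prem_scanner"]))
```

A scenario that only mentions labeling inside Word or Outlook falls through to built-in labeling; a single client-only requirement is enough to tip the answer to the IP client.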
The Information Protection scanner
The scanner is the workhorse for on-premises classification: it crawls file repositories, discovers sensitive data, and applies labels automatically.
How the scanner works
| Phase | What Happens |
|---|---|
| 1. Install | Deploy the scanner service on a Windows Server in your network |
| 2. Configure | In the Purview portal, create a scanner cluster and content scan job |
| 3. Discovery | The scanner crawls configured repositories and reports what it finds |
| 4. Enforce | Optionally auto-label files based on SIT matches and label policies |
Scanner requirements
| Requirement | Detail |
|---|---|
| Server OS | Windows Server 2016, 2019, or 2022 |
| SQL Server | Local or remote SQL Server instance for the scanner database |
| Network access | Server must reach file shares and SharePoint farms |
| Service account | Entra ID service principal with Purview permissions |
| Client installed | Purview IP client must be installed on the scanner server |
Scanner modes
| Feature | Discovery Mode | Enforce Mode |
|---|---|---|
| What it does | Scans and reports; does not change any files | Scans, reports, AND applies labels to files |
| Use case | Initial assessment: see what sensitive data exists | Ongoing classification: automatically label files |
| Risk level | Zero risk: read-only operation | Medium: modifies files by adding labels/protection |
| Recommended first? | Yes, always start here | After reviewing discovery results |
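
The difference between the two modes comes down to whether the scan writes anything back. The following Python sketch is a toy model of that idea, not the scanner's actual implementation: `ScannedFile`, `run_scan`, and the sample paths are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScannedFile:
    path: str
    matched_label: Optional[str]          # label suggested by a sensitive-info match
    applied_label: Optional[str] = None   # label actually written to the file


def run_scan(files, enforce=False):
    """Report every match; write labels only when enforce is True."""
    report = []
    for f in files:
        if f.matched_label is None:
            continue  # nothing sensitive found, nothing to report
        report.append((f.path, f.matched_label))
        if enforce:
            f.applied_label = f.matched_label  # the only step that changes a file
    return report


files = [
    ScannedFile(r"\\fs01\records\patient-123.docx", "Confidential"),
    ScannedFile(r"\\fs01\records\lunch-menu.docx", None),
]
print(run_scan(files))                # discovery: report only
print(files[0].applied_label)         # still None after discovery
run_scan(files, enforce=True)
print(files[0].applied_label)         # label written in enforce mode
```

Both modes produce the same report, which is why a discovery run is a safe preview of what an enforce run would label.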
What the scanner can scan
| Repository Type | Supported? |
|---|---|
| Network file shares (SMB) | Yes (UNC paths) |
| SharePoint Server 2013 | Yes |
| SharePoint Server 2016 | Yes |
| SharePoint Server 2019 | Yes |
| SharePoint Online | No, use auto-labeling policies for cloud SharePoint |
| OneDrive | No, use auto-labeling policies |
| NFS shares | Yes (with configuration) |
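
As a rough pre-flight check, the table can be turned into a helper that says which tool a given location calls for. This is a sketch under stated assumptions: the function name and the sample hosts are hypothetical, it treats any `*.sharepoint.com` host as SharePoint Online or OneDrive, and it assumes every other HTTP(S) URL is an on-premises SharePoint Server farm. NFS shares are not covered because they cannot be recognised from the path alone.

```python
from urllib.parse import urlparse


def classification_tool_for(location):
    """Map a repository location to the tool the table points to."""
    # UNC paths such as \\server\share are on-premises file shares.
    if location.startswith("\\\\"):
        return "scanner"
    host = (urlparse(location).hostname or "").lower()
    # Assumption: *.sharepoint.com hosts are SharePoint Online or OneDrive.
    if host.endswith(".sharepoint.com"):
        return "auto-labeling policy"
    # Assumption: any other http(s) host is an on-premises SharePoint farm.
    if host:
        return "scanner"
    raise ValueError(f"Unrecognised repository location: {location!r}")


print(classification_tool_for(r"\\fs01\hr-archive"))
print(classification_tool_for("http://sp2019/sites/finance"))
print(classification_tool_for("https://contoso-my.sharepoint.com/personal/zara"))
```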
Scenario: Dr. Liam scans the hospital file server
St. Harbour Health has a legacy file server with 2 million documents accumulated over 15 years: patient records, administrative files, research data, and old HR documents. Nobody knows exactly what sensitive data is in there.
Dr. Liam deploys the scanner:
- Discovery mode first: Scans the entire server over a weekend. Results: 340,000 documents contain patient health identifiers, 28,000 contain financial data, 95,000 contain employee PII.
- Review: Dr. Liam reviews the discovery report with the compliance team and confirms the label mapping.
- Enforce mode: Enables auto-labeling. "Patient Data - Confidential" label applied to the 340,000 PHI documents. "Internal - Financial" applied to financial data.
- Schedule: Scanner runs nightly to catch new files.
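To put the discovery numbers in proportion, the short Python sketch below computes each category's share of the 2 million files. It uses only the figures from the scenario; the scenario does not say whether a document can fall into more than one category, so the shares should not simply be added together.

```python
total_files = 2_000_000
matches = {
    "patient health identifiers": 340_000,
    "financial data": 28_000,
    "employee PII": 95_000,
}

# Share of the whole server that each category represents.
shares = {category: count / total_files for category, count in matches.items()}

for category, share in shares.items():
    print(f"{category}: {share:.2%}")
```

Roughly one file in six contains patient health identifiers, which is the kind of result that justifies reviewing the label mapping before switching to enforce mode.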
Content scan jobs
A content scan job defines WHAT the scanner looks at:
| Setting | What It Configures |
|---|---|
| Repositories | List of file share paths and/or SharePoint URLs |
| File types | Which file types to scan (all, or a filtered list) |
| Label policy | Which auto-labeling rules to apply |
| Default label | Apply a specific label to all files that match no other rule |
| Relabel | Whether to change existing labels or leave labelled files alone |
| Schedule | One-time scan or recurring (daily, weekly) |
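
The Default label and Relabel settings interact, and it can help to see them as a single decision. The Python sketch below is a simplified model built only from the descriptions in the table; `ContentScanJob`, `resolve_label`, and the label names are hypothetical, and the real scanner's precedence rules may differ in detail.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ContentScanJob:
    repositories: List[str] = field(default_factory=list)
    default_label: Optional[str] = None  # used when no rule matches
    relabel: bool = False                # may existing labels be changed?


def resolve_label(job, existing_label, rule_label):
    """Decide which label a file ends up with under this simplified model."""
    # Already-labelled files are left alone unless relabeling is allowed.
    if existing_label is not None and not job.relabel:
        return existing_label
    # Otherwise prefer a rule match, then the default label, then keep
    # whatever was there before (possibly nothing).
    return rule_label or job.default_label or existing_label


job = ContentScanJob(repositories=[r"\\fs01\records"], default_label="General")
print(resolve_label(job, existing_label=None, rule_label="Confidential"))
print(resolve_label(job, existing_label=None, rule_label=None))
print(resolve_label(job, existing_label="Internal", rule_label="Confidential"))
```

With `relabel` left off, the third call keeps the file's existing label even though a rule matched, which is the "leave labelled files alone" behaviour the table describes.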
Dr. Liam needs to classify 2 million files on an on-premises file server at St. Harbour Health. The files include Word documents, PDFs, images, and proprietary medical formats. He has never scanned this server before. What should he do FIRST?
Zara at Atlas Global needs HR staff to classify employee documents stored locally on their Windows laptops β not in SharePoint or OneDrive, but local files. The documents include PDFs and TIFF scans of employment contracts. What solution should she deploy?
Domain 1 complete! You now know how to classify, label, encrypt, and scan data across cloud and on-premises.
Next up: DLP Foundations: Stop Data Leaks. Domain 2 begins with the policies that enforce your classification work.