
DP-750 Study Guide

Domain 1: Set Up and Configure an Azure Databricks Environment

  • Azure Databricks: Your Lakehouse Platform Free
  • Choosing the Right Compute Free
  • Configuring Compute for Performance Free
  • Unity Catalog: The Three-Level Namespace Free
  • Tables, Views & External Catalogs Free

Domain 2: Secure and Govern Unity Catalog Objects

  • Securing Unity Catalog: Who Gets What
  • Secrets & Authentication
  • Data Discovery & Attribute-Based Access
  • Row Filters, Column Masks & Retention
  • Lineage, Audit Logs & Delta Sharing

Domain 3: Prepare and Process Data

  • Data Modeling: Ingestion Design Free
  • SCD, Granularity & Temporal Tables
  • Partitioning, Clustering & Table Optimization
  • Ingesting Data: Lakeflow Connect & Notebooks
  • Ingesting Data: SQL Methods & CDC
  • Streaming Ingestion: Structured Streaming & Event Hubs
  • Auto Loader & Declarative Pipelines
  • Cleansing & Profiling Data Free
  • Transforming & Loading Data
  • Data Quality & Schema Enforcement

Domain 4: Deploy and Maintain Data Pipelines and Workloads

  • Building Data Pipelines Free
  • Lakeflow Jobs: Create & Configure
  • Lakeflow Jobs: Schedule, Alerts & Recovery
  • Git & Version Control
  • Testing & Databricks Asset Bundles
  • Monitoring Clusters & Troubleshooting
  • Spark Performance: DAG & Query Profile
  • Optimizing Delta Tables & Azure Monitor

Domain 2: Secure and Govern Unity Catalog Objects

Secrets & Authentication

Access Azure Key Vault secrets from Databricks, authenticate with service principals, and use managed identities — the three authentication patterns the exam expects you to know.

Why secrets and authentication matter

☕ Simple explanation

Never put passwords in your code. Ever.

Think of it this way: your notebook is like a recipe card. You write “add the secret spice” — but you don’t write the actual spice name on the card where anyone could read it. Instead, you keep the spice in a locked cabinet (Azure Key Vault) and only people with the right key can open it.

Similarly, when your pipeline needs to connect to a database or storage account, it shouldn’t carry credentials around like a sticky note. Instead, it uses a service principal (an ID card for apps) or a managed identity (an ID card that Azure manages for you — no password at all).

Azure Databricks provides three authentication mechanisms for accessing external resources:

  • Azure Key Vault-backed secret scopes — store and retrieve secrets (connection strings, API keys, passwords) from Azure Key Vault
  • Service principals — Microsoft Entra ID application identities used for automated/unattended access to Azure resources and Unity Catalog objects
  • Managed identities — Azure-managed identities that eliminate credential management entirely; Azure handles token rotation

The exam tests your ability to choose the right authentication method for a given scenario and configure it correctly.

Azure Key Vault secrets

How secret scopes work

Databricks uses secret scopes as a bridge to Azure Key Vault:

Notebook code
    → dbutils.secrets.get("scope-name", "secret-key")
        → Secret scope (Databricks)
            → Azure Key Vault (stores the actual secret value)

Setting up a Key Vault-backed secret scope

  1. Create an Azure Key Vault and add your secrets
  2. Create a secret scope in Databricks that points to the Key Vault
  3. Reference secrets in code using dbutils.secrets.get()

# Read a secret from Key Vault via the secret scope
storage_key = dbutils.secrets.get(scope="kv-production", key="adls-access-key")

# Use it in a connection (the value is never printed or logged)
spark.conf.set(
    "fs.azure.account.key.myaccount.dfs.core.windows.net",
    storage_key
)

Critical security feature: Secret values are redacted in notebook output. If you try to print(storage_key), Databricks shows [REDACTED] — not the actual value.
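The actual redaction happens inside the Databricks runtime, but the idea can be sketched in plain Python. Everything here (the redact function, the SECRET_VALUES set) is an illustrative mock, not Databricks internals:

```python
# Illustrative sketch only: mimics the idea behind Databricks' output
# redaction. These names are hypothetical, not part of any Databricks API.

# Values the runtime knows were fetched via dbutils.secrets.get()
SECRET_VALUES = {"s3cr3t-key-value"}

def redact(text: str) -> str:
    """Replace any known secret value appearing in output with [REDACTED]."""
    for secret in SECRET_VALUES:
        text = text.replace(secret, "[REDACTED]")
    return text

print(redact("connection key = s3cr3t-key-value"))
# connection key = [REDACTED]
```

This is why piecing a secret together character by character is the classic workaround the platform guards against: redaction matches the whole value, so treat secret handling as a policy matter, not just a display feature.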

💡 Exam tip: Secret scope types

There are two types of secret scopes:

Type                    Backend                    Use Case
Azure Key Vault-backed  Azure Key Vault            Production — centralised secret management
Databricks-backed       Databricks internal store  Simple setups — secrets stored in Databricks

The exam strongly favours Key Vault-backed scopes because they integrate with Azure’s security and compliance tooling (audit logs, access policies, key rotation).

Service principals

A service principal is an application identity in Microsoft Entra ID. It’s like giving your ETL pipeline its own ID badge instead of using a human user’s credentials.

When to use service principals

Scenario                     Why a Service Principal
Automated ETL pipelines      No human to log in; the pipeline needs its own identity
Cross-workspace access       Can be granted access across workspaces
External system integration  Azure Data Factory or Logic Apps connecting to Databricks
Unity Catalog automation     Grant table permissions to pipelines, not people

Configuring a service principal

# Authenticate ADLS access using a service principal
spark.conf.set("fs.azure.account.auth.type.myaccount.dfs.core.windows.net", "OAuth")
spark.conf.set("fs.azure.account.oauth.provider.type.myaccount.dfs.core.windows.net",
    "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set("fs.azure.account.oauth2.client.id.myaccount.dfs.core.windows.net",
    dbutils.secrets.get("kv-production", "sp-client-id"))
spark.conf.set("fs.azure.account.oauth2.client.secret.myaccount.dfs.core.windows.net",
    dbutils.secrets.get("kv-production", "sp-client-secret"))
spark.conf.set("fs.azure.account.oauth2.client.endpoint.myaccount.dfs.core.windows.net",
    "https://login.microsoftonline.com/<tenant-id>/oauth2/token")

Notice how the client ID and secret come from Key Vault — never hardcoded.
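Since the same five settings recur for every storage account, they can be wrapped in a small helper. This is a sketch, not an official pattern: configure_sp_auth is a hypothetical name, and because spark and dbutils only exist inside a Databricks runtime, the helper takes the setter and secret-getter as arguments, which also makes it testable outside Databricks:

```python
# Hypothetical helper wrapping the service-principal OAuth configuration.
# In a notebook you would pass spark.conf.set and dbutils.secrets.get.

def configure_sp_auth(set_conf, get_secret, account, tenant_id,
                      scope="kv-production"):
    """Apply the five OAuth settings for ADLS service-principal access."""
    suffix = f"{account}.dfs.core.windows.net"
    set_conf(f"fs.azure.account.auth.type.{suffix}", "OAuth")
    set_conf(f"fs.azure.account.oauth.provider.type.{suffix}",
             "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
    set_conf(f"fs.azure.account.oauth2.client.id.{suffix}",
             get_secret(scope, "sp-client-id"))
    set_conf(f"fs.azure.account.oauth2.client.secret.{suffix}",
             get_secret(scope, "sp-client-secret"))
    set_conf(f"fs.azure.account.oauth2.client.endpoint.{suffix}",
             f"https://login.microsoftonline.com/{tenant_id}/oauth2/token")

# Outside Databricks, the wiring can be verified with a plain dict
# and a stub secret getter:
conf = {}
configure_sp_auth(conf.__setitem__,
                  lambda scope, key: f"<{scope}/{key}>",
                  account="myaccount", tenant_id="<tenant-id>")
print(len(conf))  # 5 settings applied
```

Passing the secret getter in also keeps the "never hardcoded" rule visible in the function signature: secrets only ever flow from Key Vault into the configuration call.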

Tomás uses a service principal at NovaPay for the fraud detection pipeline. The pipeline runs on a schedule with no human interaction — it authenticates using its service principal to access both ADLS storage and Unity Catalog tables.

Managed identities

A managed identity is the simplest authentication method — Azure manages everything:

  • No credentials to store or rotate — Azure handles token generation
  • No Key Vault needed for the identity itself (though you still use Key Vault for other secrets)
  • Two types: system-assigned (tied to one resource) and user-assigned (reusable across resources)

Feature                Service Principal                   Managed Identity
Credential management  You manage the secret/certificate   Azure manages it automatically
Secret rotation        You must rotate                     Azure rotates automatically
Stored in              Entra ID app registration           Tied to an Azure resource
Cross-tenant           Yes (multi-tenant app)              No (same tenant only)
Best for               Cross-workspace, cross-tenant, ADF  Storage access, Azure-to-Azure
Exam cue               'automated pipeline'                'no credentials', 'least management overhead'

# Configure ADLS access using a managed identity (much simpler!)
spark.conf.set("fs.azure.account.auth.type.myaccount.dfs.core.windows.net", "OAuth")
spark.conf.set("fs.azure.account.oauth.provider.type.myaccount.dfs.core.windows.net",
    "org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider")
spark.conf.set("fs.azure.account.oauth2.client.id.myaccount.dfs.core.windows.net",
    "<managed-identity-client-id>")

Dr. Sarah Okafor configures Athena Group’s Databricks workspace to use a user-assigned managed identity for accessing ADLS Gen2 storage. No secrets to rotate, no Key Vault entries to maintain for storage access.

💡 Exam decision tree: which authentication method?

Follow this logic for exam scenarios:

  1. “Store and retrieve secrets/passwords” → Azure Key Vault + secret scope
  2. “Automated pipeline needs identity” → Service principal
  3. “Minimise credential management” or “no secrets to rotate” → Managed identity
  4. “Cross-tenant access” → Service principal (managed identities don’t cross tenants)
  5. “Access from Azure Data Factory” → Service principal or managed identity (ADF supports both)

When the question doesn’t specify constraints, managed identity is usually the “best practice” answer because it eliminates credential management.

Question

How do you access Azure Key Vault secrets from a Databricks notebook?

Answer

Create a Key Vault-backed secret scope, then use dbutils.secrets.get('scope-name', 'secret-key'). The secret value is automatically redacted in notebook output for security.

Question

When should you use a service principal vs. a managed identity?

Answer

Service principal: automated pipelines needing their own identity, cross-tenant access, ADF integration. Managed identity: when you want zero credential management — Azure handles token rotation. Managed identity is the 'least management overhead' answer.

Question

What happens if you try to print a secret value retrieved via dbutils.secrets.get()?

Answer

Databricks shows [REDACTED] instead of the actual value. This prevents accidental exposure of secrets in notebook output, logs, and revision history.

Knowledge check

Ravi's ETL pipeline at DataPulse Analytics needs to connect to an Azure SQL Database to ingest customer records. The connection string contains a password. Where should Ravi store this password?

Dr. Sarah Okafor wants Athena Group's Databricks workspace to access ADLS Gen2 storage with the LEAST management overhead. No manual secret rotation, no Key Vault entries for storage access. Which authentication method should she configure?


Next up: Data Discovery & Attribute-Based Access — descriptions, tags, and ABAC policies.


© 2026 Sutheesh. All rights reserved.

Guided is an independent study resource and is not affiliated with, endorsed by, or officially connected to Microsoft. Microsoft, Azure, and related trademarks are property of Microsoft Corporation. Always verify information against Microsoft Learn.