
DP-750 Study Guide

Domain 1: Set Up and Configure an Azure Databricks Environment

  • Azure Databricks: Your Lakehouse Platform Free
  • Choosing the Right Compute Free
  • Configuring Compute for Performance Free
  • Unity Catalog: The Three-Level Namespace Free
  • Tables, Views & External Catalogs Free

Domain 2: Secure and Govern Unity Catalog Objects

  • Securing Unity Catalog: Who Gets What
  • Secrets & Authentication
  • Data Discovery & Attribute-Based Access
  • Row Filters, Column Masks & Retention
  • Lineage, Audit Logs & Delta Sharing

Domain 3: Prepare and Process Data

  • Data Modeling: Ingestion Design Free
  • SCD, Granularity & Temporal Tables
  • Partitioning, Clustering & Table Optimization
  • Ingesting Data: Lakeflow Connect & Notebooks
  • Ingesting Data: SQL Methods & CDC
  • Streaming Ingestion: Structured Streaming & Event Hubs
  • Auto Loader & Declarative Pipelines
  • Cleansing & Profiling Data Free
  • Transforming & Loading Data
  • Data Quality & Schema Enforcement

Domain 4: Deploy and Maintain Data Pipelines and Workloads

  • Building Data Pipelines Free
  • Lakeflow Jobs: Create & Configure
  • Lakeflow Jobs: Schedule, Alerts & Recovery
  • Git & Version Control
  • Testing & Databricks Asset Bundles
  • Monitoring Clusters & Troubleshooting
  • Spark Performance: DAG & Query Profile
  • Optimizing Delta Tables & Azure Monitor

Domain 1: Set Up and Configure an Azure Databricks Environment

Choosing the Right Compute

Azure Databricks offers five compute types: job compute, serverless, SQL warehouses, classic all-purpose, and shared compute. Know when to use each, because the exam tests this distinction heavily.

What is compute in Databricks?

☕ Simple explanation

Compute is the engine that runs your code.

Think of it like ordering a car. You don’t buy one — you pick the type that matches your trip:

  • Job compute — a rental car. Starts when you need it, returns itself when done. Cheapest for scheduled trips.
  • Serverless compute — an Uber. Shows up instantly, you don’t manage the vehicle. Premium convenience.
  • SQL warehouse — a shuttle bus for SQL passengers only. Optimised for queries, not for Python notebooks.
  • Classic all-purpose — your own car. Always available for interactive work, but you pay while it’s parked.
  • Shared compute — a carpooling app. Multiple people share one vehicle to cut costs.

In Azure Databricks, compute refers to the cluster of virtual machines that execute your code. Each compute type is optimised for different workload patterns — batch ETL, interactive exploration, SQL analytics, or shared development.

The compute type you choose affects cost, startup time, isolation, and feature availability. The exam tests your ability to select the right compute for a given scenario.

Key architecture point: all compute types run Apache Spark under the hood, but they differ in lifecycle management, user isolation, and supported workloads.

The five compute types

| Feature | Job Compute | Serverless | SQL Warehouse | Classic All-Purpose | Shared |
| --- | --- | --- | --- | --- | --- |
| Best for | Scheduled ETL/pipelines | Fast-start notebooks | SQL queries & BI | Interactive dev | Team dev (cost sharing) |
| Lifecycle | Created per job, auto-terminates | Managed by Databricks | Always-on or auto-stop | Manual start/stop | Manual start/stop |
| Startup time | Minutes (cold start) | Seconds | Seconds (serverless) or minutes | Minutes | Minutes |
| User isolation | Full (one job per cluster) | Per-user isolation | Shared endpoint | Shared cluster | Shared cluster |
| Languages | Python, SQL, Scala, R | Python, SQL, Scala, R | SQL only | Python, SQL, Scala, R | Python, SQL |
| Cost model | Pay only during job run | Premium DBU rate | Per-warehouse DBU | Pay while running | Shared cost across users |
| Photon support | Yes | Yes (auto) | Yes (default) | Yes (opt-in) | Limited |
| Exam scenario | Nightly batch pipeline | Ad-hoc notebook exploration | Dashboard queries | Notebook development | Training/workshops |
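
To make the cost-model row concrete, here is a back-of-envelope comparison of job compute against an all-purpose cluster left running 24/7. The DBU rates and consumption figures are hypothetical placeholders, not real Azure Databricks pricing; only the shape of the calculation matters.

```python
# Back-of-envelope cost comparison: job compute vs. an always-on
# all-purpose cluster for a nightly 45-minute pipeline.
# All rates below are HYPOTHETICAL placeholders, not real pricing.

JOB_RATE = 0.25          # hypothetical $/DBU for job compute
ALL_PURPOSE_RATE = 0.50  # hypothetical $/DBU for all-purpose compute
DBU_PER_HOUR = 10        # hypothetical DBU consumption of the cluster

def monthly_cost(rate, hours_per_day, days=30):
    """Cost = rate x DBUs consumed over the month."""
    return rate * DBU_PER_HOUR * hours_per_day * days

# Job compute bills only while the job runs (0.75 h per night).
job_cost = monthly_cost(JOB_RATE, hours_per_day=0.75)

# An all-purpose cluster left running 24/7 bills around the clock.
all_purpose_cost = monthly_cost(ALL_PURPOSE_RATE, hours_per_day=24)

print(f"Job compute:      ${job_cost:,.2f}/month")
print(f"All-purpose 24/7: ${all_purpose_cost:,.2f}/month")
```

Even with made-up numbers, the gap is dramatic, which is exactly why the exam keeps steering scheduled workloads toward job compute.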

When to use each type

Job compute (the exam favourite)

When Ravi schedules a nightly pipeline at DataPulse Analytics, he uses job compute. The cluster spins up, runs the job, and terminates automatically.

Use when:

  • Running scheduled ETL pipelines or batch jobs
  • You want cost efficiency (pay only during execution)
  • Each job needs an isolated environment
  • Running production workloads via Lakeflow Jobs

Exam pattern: If the question says “scheduled,” “automated,” “nightly,” or “production pipeline” — think job compute first.
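
To see what "job compute" looks like in practice, here is a minimal sketch of a Databricks Jobs-style job definition, written as a Python dict. The field values (runtime version, VM size, notebook path, cron expression) are illustrative placeholders, not a recommendation.

```python
# A minimal sketch of a Databricks Jobs-style definition that runs a
# nightly task on job compute. All field values are illustrative.

nightly_job = {
    "name": "nightly-sales-etl",
    "tasks": [
        {
            "task_key": "ingest_and_transform",
            "notebook_task": {"notebook_path": "/Pipelines/nightly_etl"},
            # "new_cluster" is what makes this job compute: the cluster
            # is created for this run and terminates when the task ends.
            "new_cluster": {
                "spark_version": "15.4.x-scala2.12",  # example runtime
                "node_type_id": "Standard_DS3_v2",    # example VM size
                "num_workers": 4,
            },
        }
    ],
    # Quartz cron expression: run at 02:00 every day.
    "schedule": {
        "quartz_cron_expression": "0 0 2 * * ?",
        "timezone_id": "UTC",
    },
}

print(nightly_job["tasks"][0]["new_cluster"]["num_workers"])
```

Because the cluster exists only for the duration of the run, you pay DBUs only while the job executes, which is the core exam point.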

Serverless compute

When Tomás needs to quickly prototype a new fraud detection query at NovaPay, he uses serverless compute. No waiting for cluster startup — it’s ready in seconds.

Use when:

  • Interactive notebook exploration (fast iteration)
  • You want zero infrastructure management
  • Startup time is critical
  • Running Lakeflow Spark Declarative Pipelines (serverless is default for pipelines)

ℹ️ Serverless compute architecture

Serverless compute runs in the Databricks account (not your Azure subscription). This means:

  • Faster startup — pre-warmed pools of machines are ready instantly
  • No VNet configuration needed from your side
  • Higher DBU rate — you pay a premium for the convenience
  • Data still accessed via your storage — compute is remote, but it reads/writes to your ADLS Gen2

Exam tip: If a question mentions security requirements that mandate compute in the customer’s VNet, serverless may NOT be appropriate. Classic or job compute would be the answer.

SQL warehouse

Mei Lin’s BI team at Freshmart runs dashboards against the lakehouse. They don’t write Python — they query with SQL. A SQL warehouse gives them a dedicated SQL endpoint.

Use when:

  • Running SQL queries against lakehouse tables
  • Powering BI tools (Power BI, Tableau)
  • You need a SQL-only endpoint with Photon acceleration
  • Using AI/BI Genie for natural-language data exploration

Two flavours:

| Type | Startup | Cost | Best For |
| --- | --- | --- | --- |
| Serverless SQL warehouse | Seconds | Premium DBU | Low-latency BI, variable workloads |
| Classic SQL warehouse | Minutes | Standard DBU | Predictable workloads, VNet requirements |
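
The table above boils down to one rule of thumb, sketched here as a toy helper function (my own simplification, not an official API): a VNet requirement forces classic, otherwise latency-sensitive or bursty workloads favour serverless.

```python
# Toy encoding of the serverless-vs-classic SQL warehouse choice.
# This is a study aid, not an official Databricks decision rule.

def choose_sql_warehouse(vnet_required: bool, low_latency: bool) -> str:
    if vnet_required:
        # Serverless runs in the Databricks account, outside your VNet,
        # so security constraints override latency preferences.
        return "classic SQL warehouse"
    if low_latency:
        return "serverless SQL warehouse"
    return "classic SQL warehouse"  # predictable workloads, standard DBU

print(choose_sql_warehouse(vnet_required=True, low_latency=True))
```

Note that the VNet check comes first: in exam scenarios, a stated security requirement always trumps a performance preference.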

Classic all-purpose compute

Dr. Sarah Okafor’s data engineering team at Athena Group uses classic all-purpose clusters for interactive development. The cluster stays running while they iterate on notebooks.

Use when:

  • Interactive notebook development and exploration
  • You need full control over cluster configuration
  • Multiple users share a development cluster
  • Running ad-hoc analysis that doesn’t justify serverless cost

Exam pattern: If the question says “development,” “exploration,” or “interactive” with cost sensitivity — classic all-purpose.

Shared compute (formerly “high-concurrency”)

When Athena Group runs a Databricks training workshop for 20 data engineers, shared compute lets everyone use one cluster simultaneously.

Use when:

  • Multiple users need concurrent access to one cluster
  • Training workshops or classroom settings
  • Budget constraints require sharing resources
  • Workloads are lightweight (each user doing small queries)

💡 Exam tip: The decision tree

When the exam gives you a scenario, follow this decision tree:

  1. Is it a production scheduled job? → Job compute
  2. Is it SQL-only (BI dashboards)? → SQL warehouse
  3. Do you need instant startup for notebooks? → Serverless compute
  4. Is it interactive development with cost sensitivity? → Classic all-purpose
  5. Multiple concurrent users, shared budget? → Shared compute

The exam almost never asks you to pick shared compute — it’s the least-tested type. Job compute and SQL warehouse scenarios are the most common.
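
The five-step decision tree above can be written as a function. The scenario flags are deliberately simplified; real exam questions mix several signals, and the order of the checks mirrors the order of the steps.

```python
# The decision tree as code. Flags are simplified study-aid inputs,
# not an official Databricks classification.

def pick_compute(scheduled: bool, sql_only: bool, instant_start: bool,
                 cost_sensitive_dev: bool, many_concurrent_users: bool) -> str:
    if scheduled:                  # step 1: production scheduled job
        return "job compute"
    if sql_only:                   # step 2: BI dashboards, SQL only
        return "SQL warehouse"
    if instant_start:              # step 3: instant notebook startup
        return "serverless compute"
    if cost_sensitive_dev:         # step 4: interactive dev on a budget
        return "classic all-purpose"
    if many_concurrent_users:      # step 5: shared budget, many users
        return "shared compute"
    return "classic all-purpose"   # sensible default for interactive work

# Nightly production pipeline -> job compute
print(pick_compute(True, False, False, False, False))
```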

Question

When should you use job compute instead of all-purpose compute?


Answer

Job compute is for scheduled/automated workloads (ETL pipelines, production jobs). It spins up per job and auto-terminates — you pay only during execution. All-purpose is for interactive development where the cluster stays running.


Question

What is the key architectural difference of serverless compute?


Answer

Serverless compute runs in the Databricks account (not your Azure subscription). This means faster startup (pre-warmed pools) but no VNet control and a higher DBU rate.


Question

Which compute type should you recommend for a BI team running SQL dashboards?


Answer

SQL warehouse — either serverless (fast startup, variable workloads) or classic (predictable workloads, VNet requirements). SQL warehouses are SQL-only endpoints with Photon acceleration, optimised for BI tools.


Question

What are the two flavours of SQL warehouse?


Answer

Serverless SQL warehouse (seconds startup, premium DBU, no VNet config) and Classic SQL warehouse (minutes startup, standard DBU, supports VNet deployment). Choose based on latency needs and security requirements.



Knowledge Check

  1. Ravi needs to run a nightly ETL pipeline that processes 500GB of sales data at DataPulse Analytics. The pipeline runs at 2 AM and typically completes in 45 minutes. Which compute type should he choose?

  2. Mei Lin's BI team at Freshmart needs sub-second query response times for their real-time inventory dashboard. The dashboard connects via Power BI and runs SQL queries only. Security policy requires that all compute must run within Freshmart's Azure VNet. Which compute type is most appropriate?

  3. Dr. Sarah Okafor is setting up a Databricks training workshop for 20 data engineers at Athena Group. Each engineer will run small queries and notebook experiments during the 3-hour session. Budget is limited. Which compute type minimises cost?


Next up: Configuring Compute for Performance — autoscaling, Photon, runtime versions, pooling, and libraries.



© 2026 Sutheesh. All rights reserved.

Guided is an independent study resource and is not affiliated with, endorsed by, or officially connected to Microsoft. Microsoft, Azure, and related trademarks are property of Microsoft Corporation. Always verify information against Microsoft Learn.