Choosing the Right Compute
Azure Databricks offers five compute types — job compute, serverless, SQL warehouses, classic all-purpose, and shared compute. Know when to use each; the exam tests this heavily.
What is compute in Databricks?
Compute is the engine that runs your code.
Think of it like ordering a car. You don’t buy one — you pick the type that matches your trip:
- Job compute — a rental car. Starts when you need it, returns itself when done. Cheapest for scheduled trips.
- Serverless compute — an Uber. Shows up instantly, you don’t manage the vehicle. Premium convenience.
- SQL warehouse — a shuttle bus for SQL passengers only. Optimised for queries, not for Python notebooks.
- Classic all-purpose — your own car. Always available for interactive work, but you pay while it’s parked.
- Shared compute — a carpooling app. Multiple people share one vehicle to cut costs.
The five compute types
| Feature | Job Compute | Serverless | SQL Warehouse | Classic All-Purpose | Shared |
|---|---|---|---|---|---|
| Best for | Scheduled ETL/pipelines | Fast-start notebooks | SQL queries & BI | Interactive dev | Team dev (cost sharing) |
| Lifecycle | Created per job, auto-terminates | Managed by Databricks | Always-on or auto-stop | Manual start/stop | Manual start/stop |
| Startup time | Minutes (cold start) | Seconds | Seconds (serverless) or minutes | Minutes | Minutes |
| User isolation | Full (one job per cluster) | Per-user isolation | Shared endpoint | Shared cluster | Shared cluster |
| Languages | Python, SQL, Scala, R | Python, SQL, Scala, R | SQL only | Python, SQL, Scala, R | Python, SQL |
| Cost model | Pay only during job run | Premium DBU rate | Per-warehouse DBU | Pay while running | Shared cost across users |
| Photon support | Yes | Yes (auto) | Yes (default) | Yes (opt-in) | Limited |
| Exam scenario | Nightly batch pipeline | Ad-hoc notebook exploration | Dashboard queries | Notebook development | Training/workshops |
When to use each type
Job compute (the exam favourite)
When Ravi schedules a nightly pipeline at DataPulse Analytics, he uses job compute. The cluster spins up, runs the job, and terminates automatically.
Use when:
- Running scheduled ETL pipelines or batch jobs
- You want cost efficiency (pay only during execution)
- Each job needs an isolated environment
- Running production workloads via Lakeflow Jobs
Exam pattern: If the question says “scheduled,” “automated,” “nightly,” or “production pipeline” — think job compute first.
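To make "isolated, pay-only-during-execution" concrete, here is a hedged sketch of a Jobs-API-style payload for Ravi's nightly pipeline. Field names follow Databricks Jobs API conventions, but the job name, notebook path, and node sizes are illustrative assumptions, not a definitive configuration. The `new_cluster` block is what makes this job compute: the cluster is created for the run and terminates with it.

```python
# Hypothetical Jobs-API-style payload (names and values are illustrative).
# The "new_cluster" block is the defining feature of job compute: the cluster
# exists only for the duration of the run, then terminates automatically.
nightly_etl_job = {
    "name": "nightly-sales-etl",
    "schedule": {
        "quartz_cron_expression": "0 0 2 * * ?",  # run at 2 AM daily
        "timezone_id": "UTC",
    },
    "tasks": [
        {
            "task_key": "load_sales",
            "notebook_task": {"notebook_path": "/pipelines/load_sales"},
            "new_cluster": {
                "spark_version": "15.4.x-scala2.12",
                "node_type_id": "Standard_DS3_v2",
                "num_workers": 4,
            },
        }
    ],
}
```

Note what is absent: there is no `existing_cluster_id` pointing at an always-on cluster, so the job never pays for idle time.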
Serverless compute
When Tomás needs to quickly prototype a new fraud detection query at NovaPay, he uses serverless compute. No waiting for cluster startup — it’s ready in seconds.
Use when:
- Interactive notebook exploration (fast iteration)
- You want zero infrastructure management
- Startup time is critical
- Running Lakeflow Spark Declarative Pipelines (serverless is default for pipelines)
Serverless compute architecture
Serverless compute runs in the Databricks account (not your Azure subscription). This means:
- Faster startup — pre-warmed pools of machines are ready instantly
- No VNet configuration needed from your side
- Higher DBU rate — you pay a premium for the convenience
- Data is still accessed via your storage — the compute is remote, but it reads and writes your ADLS Gen2
Exam tip: If a question mentions security requirements that mandate compute in the customer’s VNet, serverless may NOT be appropriate. Classic or job compute would be the answer.
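The contrast with job compute is easiest to see in the task definition itself. Below is a hedged sketch of a serverless notebook task in a Jobs-API-style payload — the task key and path are made up, and the defining feature is what's missing: no cluster specification at all, because Databricks provisions and scales the compute for you.

```python
# Hypothetical serverless task sketch (task key and path are illustrative).
# There is no "new_cluster" and no "existing_cluster_id": with serverless,
# Databricks manages the compute, so no cluster spec appears in the payload.
serverless_task = {
    "task_key": "prototype_fraud_query",
    "notebook_task": {"notebook_path": "/Users/tomas/fraud_prototype"},
}
```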
SQL warehouse
Mei Lin’s BI team at Freshmart runs dashboards against the lakehouse. They don’t write Python — they query with SQL. A SQL warehouse gives them a dedicated SQL endpoint.
Use when:
- Running SQL queries against lakehouse tables
- Powering BI tools (Power BI, Tableau)
- You need a SQL-only endpoint with Photon acceleration
- Using AI/BI Genie for natural-language data exploration
Two flavours:
| Type | Startup | Cost | Best For |
|---|---|---|---|
| Serverless SQL warehouse | Seconds | Premium DBU | Low-latency BI, variable workloads |
| Classic SQL warehouse | Minutes | Standard DBU | Predictable workloads, VNet requirements |
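As a rough sketch of how the choices above surface in practice, here is a Warehouses-API-style payload for Mei Lin's BI endpoint. The field names mirror the Databricks SQL Warehouses API, but the warehouse name, size, and auto-stop value are illustrative assumptions.

```python
# Hypothetical SQL-Warehouses-API-style payload (values are illustrative).
# Warehouses are sized in t-shirt sizes rather than node counts, and Photon
# acceleration is the default for SQL warehouses.
bi_warehouse = {
    "name": "freshmart-bi",
    "cluster_size": "Small",              # t-shirt sizing, not worker counts
    "enable_serverless_compute": True,    # seconds-level startup for BI
    "auto_stop_mins": 10,                 # stop when dashboards go idle
}
```

Flipping `enable_serverless_compute` to `False` would give the classic flavour — slower startup at a standard DBU rate, with compute in your own subscription for VNet requirements.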
Classic all-purpose compute
Dr. Sarah Okafor’s data engineering team at Athena Group uses classic all-purpose clusters for interactive development. The cluster stays running while they iterate on notebooks.
Use when:
- Interactive notebook development and exploration
- You need full control over cluster configuration
- Multiple users share a development cluster
- Running ad-hoc analysis that doesn’t justify serverless cost
Exam pattern: If the question says “development,” “exploration,” or “interactive” with cost sensitivity — classic all-purpose.
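For the "pay while it's parked" cost risk, the key lever in a Clusters-API-style spec is auto-termination. The sketch below is hedged: field names follow Databricks Clusters API conventions, while the cluster name, runtime version, and node type are illustrative.

```python
# Hypothetical Clusters-API-style spec for a dev cluster (values illustrative).
# "autotermination_minutes" is the cost control: without it, an all-purpose
# cluster keeps billing while idle between notebook sessions.
dev_cluster = {
    "cluster_name": "athena-dev",
    "spark_version": "15.4.x-scala2.12",
    "node_type_id": "Standard_DS3_v2",
    "num_workers": 2,
    "autotermination_minutes": 60,  # shut down after an hour of inactivity
}
```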
Shared compute (formerly “high-concurrency”)
When Athena Group runs a Databricks training workshop for 20 data engineers, shared compute lets everyone use one cluster simultaneously.
Use when:
- Multiple users need concurrent access to one cluster
- Training workshops or classroom settings
- Budget constraints require sharing resources
- Workloads are lightweight (each user doing small queries)
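A workshop cluster like Athena Group's might be sketched as below. This is an assumption-laden illustration: the name and sizes are invented, and `data_security_mode` set to `"USER_ISOLATION"` is how shared access mode is commonly expressed in Clusters-API-style specs — verify against current Databricks documentation before relying on it.

```python
# Hypothetical shared-cluster spec for a 20-person workshop (illustrative).
# data_security_mode "USER_ISOLATION" denotes shared access mode: many users,
# one cluster, with per-user isolation enforced by the platform.
workshop_cluster = {
    "cluster_name": "athena-workshop-shared",
    "spark_version": "15.4.x-scala2.12",
    "node_type_id": "Standard_DS3_v2",
    "data_security_mode": "USER_ISOLATION",
    "autoscale": {"min_workers": 2, "max_workers": 8},  # grow with the class
}
```

Autoscaling (rather than a fixed `num_workers`) fits the workshop pattern: light, bursty queries from many users at once.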
Exam tip: The decision tree
When the exam gives you a scenario, follow this decision tree:
- Is it a production scheduled job? → Job compute
- Is it SQL-only (BI dashboards)? → SQL warehouse
- Do you need instant startup for notebooks? → Serverless compute
- Is it interactive development with cost sensitivity? → Classic all-purpose
- Multiple concurrent users, shared budget? → Shared compute
The exam almost never asks you to pick shared compute — it’s the least-tested type. Job compute and SQL warehouse scenarios are the most common.
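The decision tree above can be sketched as a small function — the flag names are hypothetical labels for the scenario cues, and the order of the checks matters: "scheduled production job" wins even if the workload is also SQL-only or latency-sensitive.

```python
def pick_compute(scenario: dict) -> str:
    """Mirror of the exam decision tree; keys are hypothetical scenario flags.

    Checks run in priority order, matching the tree: a scheduled production
    job is job compute even if it also happens to be SQL-only.
    """
    if scenario.get("scheduled_production_job"):
        return "job compute"
    if scenario.get("sql_only"):
        return "SQL warehouse"
    if scenario.get("needs_instant_startup"):
        return "serverless compute"
    if scenario.get("interactive_dev_cost_sensitive"):
        return "classic all-purpose"
    if scenario.get("many_concurrent_users"):
        return "shared compute"
    return "no clear match — re-read the scenario"

print(pick_compute({"scheduled_production_job": True}))  # → job compute
print(pick_compute({"sql_only": True}))                  # → SQL warehouse
```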
🎬 Video coming soon
Knowledge check
Ravi needs to run a nightly ETL pipeline that processes 500GB of sales data at DataPulse Analytics. The pipeline runs at 2 AM and typically completes in 45 minutes. Which compute type should he choose?
Mei Lin's BI team at Freshmart needs sub-second query response times for their real-time inventory dashboard. The dashboard connects via Power BI and runs SQL queries only. Security policy requires that all compute must run within Freshmart's Azure VNet. Which compute type is most appropriate?
Dr. Sarah Okafor is setting up a Databricks training workshop for 20 data engineers at Athena Group. Each engineer will run small queries and notebook experiments during the 3-hour session. Budget is limited. Which compute type minimises cost?
Next up: Configuring Compute for Performance — autoscaling, Photon, runtime versions, pooling, and libraries.