
DP-750 Study Guide

Domain 1: Set Up and Configure an Azure Databricks Environment

  • Azure Databricks: Your Lakehouse Platform Free
  • Choosing the Right Compute Free
  • Configuring Compute for Performance Free
  • Unity Catalog: The Three-Level Namespace Free
  • Tables, Views & External Catalogs Free

Domain 2: Secure and Govern Unity Catalog Objects

  • Securing Unity Catalog: Who Gets What
  • Secrets & Authentication
  • Data Discovery & Attribute-Based Access
  • Row Filters, Column Masks & Retention
  • Lineage, Audit Logs & Delta Sharing

Domain 3: Prepare and Process Data

  • Data Modeling: Ingestion Design Free
  • SCD, Granularity & Temporal Tables
  • Partitioning, Clustering & Table Optimization
  • Ingesting Data: Lakeflow Connect & Notebooks
  • Ingesting Data: SQL Methods & CDC
  • Streaming Ingestion: Structured Streaming & Event Hubs
  • Auto Loader & Declarative Pipelines
  • Cleansing & Profiling Data Free
  • Transforming & Loading Data
  • Data Quality & Schema Enforcement

Domain 4: Deploy and Maintain Data Pipelines and Workloads

  • Building Data Pipelines Free
  • Lakeflow Jobs: Create & Configure
  • Lakeflow Jobs: Schedule, Alerts & Recovery
  • Git & Version Control
  • Testing & Databricks Asset Bundles
  • Monitoring Clusters & Troubleshooting
  • Spark Performance: DAG & Query Profile
  • Optimizing Delta Tables & Azure Monitor

Domain 3: Prepare and Process Data · Premium · ⏱ ~14 min read

SCD, Granularity & Temporal Tables

Slowly changing dimensions, choosing the right granularity, and designing temporal history tables — the data modeling patterns that exam scenarios are built on.

Slowly changing dimensions

☕ Simple explanation

A customer moves house. What happens to their old address in your database?

That’s the SCD problem. Dimension data (customers, products, stores) changes slowly over time. You have three choices:

  • Type 1: Overwrite the old value. Simple, but you lose history. “What was their address last year?” — can’t answer.
  • Type 2: Keep both versions. Add a new row with the new address, mark the old row as expired. You can see both past and present.
  • Type 3: Add a column for the old value. Quick to query but limited — you can only track one previous value.

Slowly Changing Dimensions (SCD) are dimension tables where attribute values change occasionally. The SCD type determines how you handle those changes:

  • Type 1 — overwrite in place. No history preserved. Simplest to implement.
  • Type 2 — version rows with effective dates and a current flag. Full history preserved. Most common in data warehouses.
  • Type 3 — add “previous value” columns. Limited history (one prior version only).

In Delta Lake, Type 2 is implemented using MERGE with effective date columns and a boolean is_current flag.

| Feature | SCD Type 1 | SCD Type 2 | SCD Type 3 |
|---|---|---|---|
| History preserved? | No | Yes (all versions) | Partial (one previous) |
| Storage growth | Constant | Grows with changes | Constant |
| Implementation | UPDATE in place | INSERT new row + expire old | UPDATE + shift columns |
| Query complexity | Simple | Need WHERE is_current = TRUE | Moderate |
| Best for | Data where history doesn't matter | Customer addresses, pricing, org changes | Rarely used in practice |
| Delta Lake approach | MERGE with UPDATE | MERGE with INSERT + UPDATE | MERGE with UPDATE |
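Before looking at the Delta Lake MERGE, the behavioral difference between Type 1 and Type 2 is easy to see in miniature. The sketch below uses Python's built-in sqlite3 as a stand-in for any SQL engine; the table, column names, and sample values are invented for illustration.

```python
import sqlite3

# In-memory stand-in for a dimension table (hypothetical names/values)
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""CREATE TABLE dim_customer (
    customer_id INT, address TEXT,
    is_current INT, effective_start TEXT, effective_end TEXT)""")
cur.execute("INSERT INTO dim_customer VALUES (1, 'Old St 1', 1, '2025-01-01', '9999-12-31')")

# SCD Type 1 would simply overwrite, losing history:
#   UPDATE dim_customer SET address = 'New Ave 2' WHERE customer_id = 1

# SCD Type 2: expire the current version, then insert the new one
cur.execute("""UPDATE dim_customer
               SET is_current = 0, effective_end = '2026-02-01'
               WHERE customer_id = 1 AND is_current = 1""")
cur.execute("INSERT INTO dim_customer VALUES (1, 'New Ave 2', 1, '2026-02-01', '9999-12-31')")

rows = cur.execute(
    "SELECT address, is_current FROM dim_customer ORDER BY effective_start"
).fetchall()
print(rows)  # [('Old St 1', 0), ('New Ave 2', 1)] -- both versions survive
```

Under Type 1 the same change would leave a single row, and "what was the address last year?" would be unanswerable.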

SCD Type 2 in Delta Lake

-- SCD Type 2: expire changed rows and insert their new versions in one MERGE.
-- A MERGE can't both UPDATE and INSERT for the same matched key, so the
-- staged source duplicates each changed row with a NULL merge_key: the NULL
-- copy never matches and falls through to the INSERT branch.
MERGE INTO dim_customer AS target
USING (
  -- All staged rows, keyed normally (drive the expire step)
  SELECT customer_id AS merge_key, * FROM staging_customer
  UNION ALL
  -- Changed rows again, with a NULL key (drive the insert step)
  SELECT NULL AS merge_key, s.*
  FROM staging_customer AS s
  JOIN dim_customer AS t
    ON s.customer_id = t.customer_id
  WHERE t.is_current = TRUE
    AND (s.address != t.address OR s.city != t.city)
) AS source
ON target.customer_id = source.merge_key AND target.is_current = TRUE

-- Existing record changed: expire the old version
WHEN MATCHED AND (target.address != source.address OR target.city != source.city) THEN
  UPDATE SET
    is_current = FALSE,
    effective_end = CURRENT_DATE()

-- No match (brand-new customer, or the NULL-key copy of a changed row): insert
WHEN NOT MATCHED THEN
  INSERT (customer_id, name, address, city, is_current, effective_start, effective_end)
  VALUES (source.customer_id, source.name, source.address, source.city,
          TRUE, CURRENT_DATE(), DATE('9999-12-31'));

Mei Lin implements SCD Type 2 for Freshmart’s store dimension — when a store changes its category (from “Express” to “Supermarket”), both the old and new versions are preserved for historical reporting.

Granularity decisions

Granularity is the level of detail in your table — the “grain” of each row.

| Granularity | Row Represents | Example |
|---|---|---|
| Transaction-level | One event/transaction | Each POS scan at Freshmart |
| Daily aggregate | One day's summary | Total sales per store per day |
| Monthly aggregate | One month's summary | Monthly revenue per region |

Choosing granularity:

  • Finer grain (transaction) → more flexibility for analysis, more storage, slower aggregation queries
  • Coarser grain (monthly) → less storage, faster queries, but can’t drill down to individual transactions

Exam tip: Choose the finest granularity that your business requirements demand. You can always aggregate up from transactions to daily/monthly, but you can’t disaggregate monthly back to transactions.
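The asymmetry behind that exam tip can be demonstrated in a few lines of plain Python (sample data invented for illustration): rolling up from transaction grain to daily grain is a simple group-and-sum, but nothing in the daily total lets you reconstruct the individual scans.

```python
from collections import defaultdict

# Transaction-level grain: one row per POS scan (hypothetical sample data)
transactions = [
    ("2026-03-01", "store_7", 4.50),
    ("2026-03-01", "store_7", 12.00),
    ("2026-03-01", "store_9", 3.25),
    ("2026-03-02", "store_7", 8.00),
]

# Aggregating up to daily grain: group by (date, store) and sum
daily = defaultdict(float)
for sale_date, store, amount in transactions:
    daily[(sale_date, store)] += amount

print(daily[("2026-03-01", "store_7")])  # 16.5
# The reverse is impossible: the 16.5 total alone cannot be
# split back into the original 4.50 and 12.00 scans.
```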

Temporal (history) tables

A temporal table records every version of a row with timestamps:

CREATE TABLE prod_sales.curated.product_history (
  product_id INT,
  product_name STRING,
  price DECIMAL(10,2),
  category STRING,
  valid_from TIMESTAMP,
  valid_to TIMESTAMP,
  is_current BOOLEAN
);

This enables point-in-time queries: “What was the price of product X on March 1st?”

-- Point-in-time query
SELECT product_name, price
FROM product_history
WHERE product_id = 42
  AND valid_from <= '2026-03-01'
  AND valid_to > '2026-03-01';
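The WHERE clause above encodes a half-open validity interval: a version covers every date where valid_from <= date < valid_to, so consecutive versions never overlap. The same lookup logic can be sketched in plain Python; the rows, prices, and dates below are invented for illustration (ISO date strings compare correctly as plain strings).

```python
# (product_id, price, valid_from, valid_to) -- hypothetical version history
history = [
    (42, 9.99, "2026-01-01", "2026-02-15"),
    (42, 11.49, "2026-02-15", "9999-12-31"),
]

def price_as_of(product_id: int, date: str):
    """Return the price valid on `date` using valid_from <= date < valid_to."""
    for pid, price, valid_from, valid_to in history:
        if pid == product_id and valid_from <= date < valid_to:
            return price
    return None  # no version covers that date

print(price_as_of(42, "2026-03-01"))  # 11.49 (current version)
print(price_as_of(42, "2026-01-31"))  # 9.99  (historical version)
```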
💡 Temporal tables vs Delta time travel

Don’t confuse these:

  • Temporal table — a table YOU design with valid_from/valid_to columns to track business history. Permanent. Queryable forever.
  • Delta time travel — Delta Lake’s built-in version history. Governed by retention settings. Intended for data recovery and auditing, NOT for business history.

Exam pattern: “Track how customer data changes over time for business reporting” → temporal table (SCD Type 2). “Recover data accidentally deleted yesterday” → Delta time travel.

Question

What are the three SCD types and when should you use each?

Answer

Type 1: Overwrite (no history needed). Type 2: Add rows with effective dates (full history for reporting). Type 3: Add 'previous' columns (rarely used, limited history).

Question

How do you choose the right granularity for a table?

Answer

Choose the finest granularity your business requires. You can aggregate up (transactions → daily → monthly) but can't disaggregate down. Finer grain = more flexibility but more storage.

Question

What is the difference between a temporal table and Delta time travel?

Answer

Temporal table: business-designed history with valid_from/valid_to columns, permanent. Delta time travel: system version history governed by retention, for recovery/auditing.

Knowledge Check

Freshmart changes its store classification system. Mei Lin needs to track how each store's category changed over time so historical reports show the correct category at the time of each sale. Which approach should she use?

Knowledge Check

Ravi is designing a sales fact table for DataPulse clients. Marketing wants to know which promotions drove individual purchases. Finance only needs monthly revenue by region. What granularity should Ravi choose?


Next up: Partitioning, Clustering & Table Optimization — partitioning schemes, liquid clustering, Z-ordering, and deletion vectors.


Guided

I learn, I simplify, I share.


© 2026 Sutheesh. All rights reserved.

Guided is an independent study resource and is not affiliated with, endorsed by, or officially connected to Microsoft. Microsoft, Azure, and related trademarks are property of Microsoft Corporation. Always verify information against Microsoft Learn.