
DP-600 Study Guide

Domain 1: Maintain a Data Analytics Solution

  • Workspace Access Controls
  • Row-Level & Object-Level Security
  • Sensitivity Labels & Endorsement
  • Git Version Control in Fabric
  • Deployment Pipelines: Dev β†’ Test β†’ Prod
  • Impact Analysis & Dependencies
  • XMLA Endpoint & Reusable Assets

Domain 2: Prepare Data

  • Microsoft Fabric: The Big Picture Free
  • Lakehouses: Your Data Foundation Free
  • Warehouses in Fabric Free
  • Choosing the Right Data Store Free
  • Data Connections & OneLake Catalog
  • Shortcuts & OneLake Integration
  • Ingesting Data: Dataflows Gen2 & Pipelines
  • Star Schema Design Free
  • SQL Objects: Views, Functions & Stored Procedures
  • Transforming Data: Reshape & Enrich
  • Data Quality & Cleansing
  • Querying with SQL
  • Querying with KQL
  • Querying with DAX

Domain 3: Implement and Manage Semantic Models

  • Semantic Models: Storage Modes
  • Relationships & Advanced Modeling
  • DAX Essentials: Variables & Functions
  • Calculation Groups & Field Parameters
  • Large Models & Composite Models
  • Direct Lake Mode
  • DAX Performance Optimization
  • Incremental Refresh

Domain 2: Prepare Data Β· Free Β· ⏱ ~12 min read

Choosing the Right Data Store

Lakehouse, warehouse, or Eventhouse? Learn the decision framework for picking the right Fabric storage item for every scenario β€” an exam favourite.

Which data store should I use?

β˜• Simple explanation

Think of it like choosing transport.

Need to haul containers across the ocean? Use a cargo ship (lakehouse) β€” it handles any type of cargo, in bulk, and you can sort it when it arrives. Need to deliver precisely packaged goods to a retail store? Use a delivery truck (warehouse) β€” everything is labelled, sorted, and ready to shelve. Need to stream live sports to millions of viewers? Use a satellite broadcast (Eventhouse) β€” optimised for real-time, high-speed data.

The exam tests your ability to match the right transport to the right cargo. Get comfortable with the decision framework β€” it appears in nearly every scenario question.

Microsoft Fabric provides three primary data stores, each optimised for different workload patterns:

  • Lakehouse β€” schema-on-read storage using Delta tables, accessed via Spark and a read-only SQL endpoint. Best for data engineering, semi-structured data, and Spark-based transformations.
  • Warehouse β€” fully managed SQL analytics warehouse with full T-SQL DML. Best for SQL-first teams, stored procedures, and structured analytical workloads.
  • Eventhouse β€” a real-time analytics database powered by KQL (Kusto Query Language). Best for streaming data, time-series analysis, IoT telemetry, and log analytics.

Lakehouses and warehouses store data natively in OneLake. Eventhouse can expose data to OneLake via OneLake availability (must be enabled). All three can cross-reference each other. The choice is about how your team works with data, not about data isolation.

The three data stores compared

| Feature | Lakehouse | Warehouse | Eventhouse |
| --- | --- | --- | --- |
| Query language | PySpark + SQL (read-only) | T-SQL (full DML) | KQL (Kusto Query Language) |
| Write method | Spark notebooks, pipelines, Dataflows Gen2 | SQL INSERT, UPDATE, DELETE, MERGE, COPY INTO | Streaming ingestion, batch ingestion, connectors |
| Best data types | Semi-structured (JSON, CSV), raw files, Delta tables | Structured (relational tables, star schemas) | Time-series, events, logs, telemetry |
| Schema | Schema-on-read (flexible) with Delta enforcement | Schema-on-write (strict, defined upfront) | Schema-on-write (strict, optimised for append) |
| Stored procedures | Not available | Full T-SQL stored procedures | Not available (KQL functions instead) |
| Power BI connection | SQL analytics endpoint (auto-generated) | Default semantic model (auto-generated) | KQL queryset or DirectQuery |
| Real-time ingestion | Not designed for real-time | Not designed for real-time | Built for real-time β€” sub-second ingestion |
| Typical users | Data engineers, data scientists | BI professionals, SQL developers, analysts | IoT engineers, DevOps, security analysts |
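The schema rows above are the ones scenario questions probe most often. A toy Python contrast (every name and check is illustrative, not a Fabric API) shows the difference between interpreting structure at read time and enforcing it at write time:

```python
import json

# A semi-structured record, as it might arrive from a source system.
record = '{"patient_id": 7, "notes": {"lang": "en"}}'

# Schema-on-read (lakehouse style): store the raw document unchanged,
# and interpret its structure only when a query asks for a field.
stored_raw = record
patient_id = json.loads(stored_raw)["patient_id"]

# Schema-on-write (warehouse style): the table's columns are declared
# upfront, and a row that does not match them is rejected at load time.
declared_columns = {"patient_id": int, "visit_date": str}
row = json.loads(record)
accepted = set(row) == set(declared_columns)

print(patient_id)  # 7
print(accepted)    # False: missing visit_date, unexpected notes
```

The lakehouse happily lands the nested document and defers the question; the warehouse refuses it until it matches the declared shape.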

Decision framework

Use this flowchart when answering exam questions:

Step 1: Is the data streaming or batch?

  • Streaming (real-time events, IoT, logs) β†’ Eventhouse
  • Batch β†’ Go to Step 2

Step 2: Does the team primarily use SQL with DML needs?

  • Yes (stored procedures, MERGE, UPDATE) β†’ Warehouse
  • No β†’ Go to Step 3

Step 3: Is the data semi-structured or unstructured, or does the workload need Spark?

  • Yes (JSON, CSV, Parquet, Python/PySpark transformations) β†’ Lakehouse
  • No (structured, SQL-focused but no DML) β†’ Lakehouse (SQL analytics endpoint covers read-only SQL needs)
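The three steps above can be condensed into a plain Python function; the function name and boolean inputs are illustrative study aids, not part of any Fabric API:

```python
def choose_data_store(streaming: bool, needs_sql_dml: bool) -> str:
    """Return the Fabric data store suggested by the decision framework."""
    # Step 1: streaming data (real-time events, IoT, logs) goes to an Eventhouse.
    if streaming:
        return "Eventhouse"
    # Step 2: SQL-first teams that need DML (stored procedures, MERGE, UPDATE)
    # need a Warehouse.
    if needs_sql_dml:
        return "Warehouse"
    # Step 3: everything else lands in a Lakehouse; its read-only SQL
    # analytics endpoint covers SQL reads without DML.
    return "Lakehouse"

print(choose_data_store(streaming=True, needs_sql_dml=False))   # Eventhouse
print(choose_data_store(streaming=False, needs_sql_dml=True))   # Warehouse
print(choose_data_store(streaming=False, needs_sql_dml=False))  # Lakehouse
```

Note the order of the checks: streaming wins over everything, and SQL DML wins over the default, which mirrors how the exam expects you to work through a scenario.
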

πŸ’‘ Exam tip: The 'both' answer

Many exam scenarios have answers where you use both a lakehouse and a warehouse. This is often correct. A common pattern:

  1. Land raw data in a lakehouse (Bronze/Silver layers via Spark)
  2. Create business-ready tables in a warehouse (Gold layer with stored procedures)
  3. Connect Power BI to the warehouse (or create a custom semantic model)

Do not assume you must pick only one. Cross-database queries make multi-store architectures seamless.
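The three-step pattern can be sketched end to end in plain Python. In practice the Bronze-to-Silver hop runs in Spark notebooks and the Gold hop in warehouse T-SQL; every name and record below is a made-up illustration:

```python
# Bronze: raw data landed in the lakehouse, warts and all.
raw_events = [
    {"store": "S01", "amount": "19.99", "sku": "A100"},
    {"store": "S01", "amount": "bad",   "sku": "A101"},  # malformed row
]

# Bronze -> Silver (lakehouse; Spark in practice): cleanse and type the data.
silver = []
for row in raw_events:
    try:
        silver.append({**row, "amount": float(row["amount"])})
    except ValueError:
        pass  # quarantine rows whose amount cannot be parsed

# Silver -> Gold (warehouse; T-SQL stored procedures in practice):
# business-ready aggregate for reporting.
gold = {}
for row in silver:
    gold[row["store"]] = gold.get(row["store"], 0.0) + row["amount"]

print(gold)  # {'S01': 19.99}
```

Power BI would then connect to the Gold tables, exactly as in step 3 above.
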

When each character picks what

| Character | Primary data store | Why |
| --- | --- | --- |
| πŸ›’ Anita (FreshCart) | Lakehouse + Warehouse | Lakehouse for raw POS data ingestion (CSV, 15M rows/day via Spark); warehouse for Gold-layer star schema with stored procedures |
| 🏒 James (Summit) | Warehouse | SQL-first team, stored procedures for client transforms, cross-database queries to access shared reference data |
| πŸ₯ Dr. Sarah (Pacific Health) | Lakehouse | Semi-structured clinical data (HL7/FHIR JSON), PySpark transformations, sensitivity labels on raw data |
| πŸ’° Raj (Atlas Capital) | Warehouse + Eventhouse | Warehouse for financial reporting; Eventhouse for real-time market data feeds and trade monitoring |

Eventhouse: The real-time option

The Eventhouse is less heavily tested in DP-600 than lakehouses and warehouses, but you need to know when to choose it.

What makes Eventhouse different?

| Capability | Details |
| --- | --- |
| Ingestion speed | Sub-second ingestion of streaming data |
| Query language | KQL (Kusto Query Language) β€” optimised for time-series queries |
| Data pattern | Append-optimised β€” data arrives constantly and is rarely updated |
| Integration | Connects to Event Hubs, IoT Hub, Change Data Capture |
| OneLake integration | Can expose data to OneLake via OneLake availability, then accessible via shortcuts |
| Use cases | IoT telemetry, application logs, security events, real-time dashboards |

πŸ’‘ Scenario: Raj monitors real-time trades

Raj at Atlas Capital receives 50,000 trade events per second from the trading platform. He needs to:

  • Detect anomalous trading patterns within seconds
  • Query the last 24 hours of trades interactively
  • Feed alerts into a real-time Power BI dashboard

This is a perfect Eventhouse scenario: streaming ingestion, time-series queries with KQL, and real-time dashboards. The warehouse handles the nightly P&L calculations; the Eventhouse handles the live monitoring.
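In production Raj would express this check in KQL against the Eventhouse. A plain-Python sketch (the threshold, event sizes, and function name are made-up assumptions) shows the idea of flagging a trade that sits far outside the recent distribution:

```python
from statistics import mean, stdev

# Recent trade sizes; the final event is deliberately anomalous.
trade_sizes = [100, 102, 98, 101, 99, 103, 100, 5000]

def is_anomalous(history, new_value, z_threshold=3.0):
    """Flag a trade whose size is more than z_threshold standard
    deviations away from the mean of the recent history."""
    mu, sigma = mean(history), stdev(history)
    return abs(new_value - mu) > z_threshold * sigma

history, latest = trade_sizes[:-1], trade_sizes[-1]
print(is_anomalous(history, latest))  # True
```

A rolling window of this kind is exactly the append-heavy, time-ordered workload the Eventhouse is optimised for.
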

OneLake integration for Eventhouse and semantic models

The exam specifically tests OneLake integration for Eventhouse and semantic models (a bullet under β€œGet data”):

  • Eventhouse β†’ OneLake: Eventhouse data can be made available in OneLake using OneLake availability (a setting per database/table). This lets other Fabric items (lakehouses, warehouses, semantic models) access Eventhouse data via shortcuts.
  • Semantic model β†’ OneLake: Semantic models in Direct Lake mode read directly from Delta tables in OneLake. This is covered in detail in Domain 3.

Question

Name the three primary data stores in Microsoft Fabric.

Answer

1. Lakehouse β€” Spark-based, Delta tables, read-only SQL endpoint. 2. Warehouse β€” full T-SQL DML, stored procedures, relational analytics. 3. Eventhouse β€” KQL-based, real-time streaming, time-series analytics. Lakehouses and warehouses store data natively in OneLake; an Eventhouse exposes its data there via OneLake availability.

Question

When should you choose a Fabric Warehouse over a lakehouse?

Answer

Choose a warehouse when: (1) your team works primarily in SQL, (2) you need stored procedures, (3) you need INSERT/UPDATE/DELETE/MERGE, or (4) you are building a SQL-based star schema with full DML. Choose a lakehouse for Spark, semi-structured data, or Python transformations.

Question

What is an Eventhouse best suited for?

Answer

Real-time streaming analytics. Eventhouse uses KQL (Kusto Query Language) and is optimised for append-heavy, time-series workloads β€” IoT telemetry, application logs, security events, and real-time dashboards. It supports sub-second ingestion.

Knowledge Check

Dr. Sarah at Pacific Health Network receives patient data in HL7 FHIR format (nested JSON documents). She needs to parse, flatten, and clean this data before building patient outcome reports. Which data store should she land the raw data in?

Knowledge Check

Raj at Atlas Capital has two workloads: (1) nightly financial reporting with complex stored procedures, and (2) real-time trade monitoring with 50,000 events per second. What is the best Fabric architecture?

Knowledge Check

Anita at FreshCart stores daily POS data in a lakehouse. Her finance team wants to run UPDATE statements to correct accounting entries. The SQL analytics endpoint does not support UPDATE. What is the simplest solution?

Next up: Data Connections & OneLake Catalog β€” discover data across your organisation and connect to external sources.

© 2026 Sutheesh. All rights reserved.

Guided is an independent study resource and is not affiliated with, endorsed by, or officially connected to Microsoft. Microsoft, Azure, and related trademarks are property of Microsoft Corporation. Always verify information against Microsoft Learn.