
DP-700 Study Guide

Domain 1: Implement and Manage an Analytics Solution

  • Workspace Settings: Your Fabric Foundation
  • Version Control: Git in Fabric
  • Deployment Pipelines: Dev to Production
  • Access Controls: Who Gets In
  • Data Security: Control Who Sees What
  • Governance: Labels, Endorsement & Audit
  • Orchestration: Pick the Right Tool
  • Pipeline Patterns: Parameters & Expressions

Domain 2: Ingest and Transform Data

  • Delta Lake: The Heart of Fabric
  • Loading Patterns: Full, Incremental & Streaming
  • Dimensional Modeling: Prep for Analytics
  • Data Stores & Tools: Make the Right Choice
  • OneLake Shortcuts: Data Without Duplication
  • Mirroring: Real-Time Database Replication
  • PySpark Transformations: Code Your Pipeline
  • Transform Data with SQL & KQL
  • Eventstreams & Spark Streaming: Real-Time Ingestion
  • Real-Time Intelligence: KQL & Windowing

Domain 3: Monitor and Optimize an Analytics Solution

  • Monitoring & Alerts: Catch Problems Early
  • Troubleshoot Pipelines & Dataflows
  • Troubleshoot Notebooks & SQL
  • Troubleshoot Streaming & Shortcuts
  • Optimize Lakehouse Tables: Delta Tuning
  • Optimize Spark: Speed Up Your Code
  • Optimize Pipelines & Warehouses
  • Optimize Streaming: Real-Time Performance

Domain 2: Ingest and Transform Data (⏱ ~12 min read)

Mirroring: Real-Time Database Replication

Replicate operational databases into Fabric using mirroring — continuous CDC-based replication from Azure SQL, Cosmos DB, Snowflake, and more, with zero ETL code.

What is mirroring?

☕ Simple explanation

Think of a live TV broadcast of a sports match.

The match is happening at the stadium (your operational database). The TV broadcast (mirroring) shows everything happening in near real-time on your screen (Fabric). You don’t need to go to the stadium — the broadcast brings the action to you, with just a few seconds of delay.

Mirroring in Fabric does exactly this for databases. It continuously replicates data from an operational database (Azure SQL, Cosmos DB, Snowflake, etc.) into your Fabric lakehouse as Delta tables. No pipelines to build, no scheduling to configure — it just stays in sync.

Fabric mirroring creates a continuously replicated read-only copy of an external operational database in OneLake, stored as Delta Lake tables. It uses each source’s native change-tracking mechanism (CDC, change feed, streams) to detect inserts, updates, and deletes and applies them to the mirror in near real-time (typically minutes).

Supported sources include Azure SQL Database, Azure Cosmos DB, Snowflake, Azure Database for PostgreSQL, Azure Database for MySQL, and SQL Server (on-premises, 2016-2025). The mirrored data is stored in OneLake as Delta tables, accessible via the SQL analytics endpoint and Spark notebooks.

Supported mirroring sources

Each source uses its native change-tracking mechanism — Fabric handles the rest.

Source | CDC Method | Key Consideration
Azure SQL Database | Change Data Capture (CDC) | Requires CDC to be enabled on source tables
Azure Cosmos DB | Cosmos DB change feed | Replicates all containers in a database; supports NoSQL API
Snowflake | Snowflake streams | Cross-cloud replication; egress costs may apply from Snowflake
Azure Database for PostgreSQL | Logical replication (pgoutput) | Requires wal_level = logical on the server
Azure Database for MySQL | Binary log (binlog) replication | Requires binlog_format = ROW
SQL Server (on-premises) | CDC or Fabric change feed (SQL Server 2025) | Supports SQL Server 2016-2025; requires the on-premises data gateway
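
Prerequisites like these are easy to verify before you create the mirror. A hypothetical pre-flight check (the REQUIREMENTS mapping simply restates the table above — this is an illustration, not an official Fabric validation API) might look like:

```python
# Hypothetical pre-flight check for mirroring prerequisites.
# The REQUIREMENTS mapping restates the table above; it is illustrative only.

REQUIREMENTS = {
    "Azure SQL Database": ("cdc_enabled", True),
    "Azure Database for PostgreSQL": ("wal_level", "logical"),
    "Azure Database for MySQL": ("binlog_format", "ROW"),
}

def check_prerequisite(source_type: str, server_settings: dict) -> bool:
    """Return True if the source's key mirroring prerequisite is satisfied."""
    setting, expected = REQUIREMENTS[source_type]
    return server_settings.get(setting) == expected

ok = check_prerequisite("Azure Database for PostgreSQL", {"wal_level": "logical"})
bad = check_prerequisite("Azure Database for MySQL", {"binlog_format": "STATEMENT"})
```

Running a check like this against the actual server configuration saves a failed initial sync later.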

How mirroring works

The flow

Source Database (Azure SQL)
  │
  ├── CDC captures changes (inserts, updates, deletes)
  │
  ├── Fabric reads the CDC stream
  │
  ├── Changes applied to OneLake as Delta Lake tables
  │
  └── Data accessible via SQL endpoint + Spark
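
The middle step of this flow — folding a stream of change events into a replica — can be sketched in a few lines of plain Python. This is a toy illustration of the idea, not Fabric's implementation:

```python
# Toy sketch of the "apply CDC changes" step: a stream of change events
# is folded into a replica keyed by primary key. Illustrative only --
# Fabric does this at scale against Delta tables in OneLake.

def apply_cdc(mirror: dict, changes: list) -> dict:
    """Apply insert/update/delete events to a replica (a dict keyed by id)."""
    for event in changes:
        op, row = event["op"], event["row"]
        if op in ("insert", "update"):
            mirror[row["id"]] = row          # upsert the latest version of the row
        elif op == "delete":
            mirror.pop(row["id"], None)      # drop the row from the replica
    return mirror

# Initial snapshot first, then incremental changes keep the replica in sync.
mirror = {1: {"id": 1, "qty": 10}, 2: {"id": 2, "qty": 5}}
changes = [
    {"op": "update", "row": {"id": 1, "qty": 12}},
    {"op": "insert", "row": {"id": 3, "qty": 7}},
    {"op": "delete", "row": {"id": 2}},
]
apply_cdc(mirror, changes)
```

After applying the batch, the replica holds the updated row 1 and new row 3, and row 2 is gone — exactly the insert/update/delete semantics CDC delivers.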

What you configure

  1. Create a mirrored database in Fabric
  2. Connect to the source — provide connection string and credentials
  3. Select tables — choose which tables to mirror (or all)
  4. Mirroring starts — initial snapshot loads all data, then CDC keeps it in sync

What you get

Feature | Detail
Delta tables in OneLake | Each source table becomes a Delta table — queryable via Spark and the SQL endpoint
Near real-time sync | Changes typically appear within minutes
Automatic schema sync | New columns in the source are automatically added to the mirror
Read-only | The mirror is a read-only replica — you cannot write back to the source
No ETL code | Zero pipelines, zero notebooks, zero scheduling — mirroring is fully managed

💡 Scenario: Carlos mirrors SAP data

Precision Manufacturing runs SAP on Azure SQL Database. Carlos’s team used to maintain a nightly ETL pipeline with 14 activities to copy production data into the lakehouse. Pipeline failures were common, and data was always at least 12 hours stale.

Carlos replaces the entire pipeline with mirroring:

  1. Creates a mirrored database in Fabric, connected to the Azure SQL Database
  2. Selects the 8 production tables he needs
  3. Mirroring starts — initial snapshot takes 20 minutes for 500M rows
  4. After that, changes appear in Fabric within 3-5 minutes

He retires 14 pipeline activities, saves 4 hours of maintenance per month, and production managers see data that’s minutes old instead of 12 hours old.

Mirroring vs other ingestion methods

Mirroring for databases, shortcuts for storage, pipelines for everything else
Factor | Mirroring | Pipeline (Copy Activity) | OneLake Shortcut
Data copied? | Yes — replicated to OneLake | Yes — copied to target | No — reads from source
Latency | Minutes (near real-time CDC) | Depends on schedule (hourly/daily) | Real-time (reads source directly)
ETL code needed? | None — fully managed | Yes — pipeline activities, expressions | None
Supported sources | Databases only (SQL, Cosmos, Snowflake, etc.) | Any supported connector (150+) | Storage only (ADLS, S3, GCS, Fabric)
Offline access? | Yes — replica in OneLake | Yes — copied data | No — depends on source availability
Write to source? | No (read-only) | No (one-way copy) | No (read-only)
Maintenance | Low — Fabric manages replication | High — monitor and fix pipeline failures | Low — no moving parts

💡 Exam tip: When to choose mirroring

Exam questions about mirroring typically describe a scenario where:

  • The source is a relational database (not files)
  • The requirement is near real-time or continuous replication
  • The team wants to reduce pipeline maintenance
  • The data needs to be available in OneLake (not just accessible via shortcut)

If the scenario describes a file-based source → shortcut. If it describes a database and wants zero-code, near real-time replication → mirroring. If it needs complex transformation during ingestion → pipeline.
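
This decision rule is simple enough to write out as a tiny function. The scenario flags below are made-up names for illustration, not a Fabric API:

```python
# Hypothetical helper encoding the exam-tip decision rule above.
# The boolean flags are assumptions for illustration, not a Fabric API.

def choose_ingestion(source_is_database: bool,
                     needs_near_real_time: bool,
                     needs_transformation: bool) -> str:
    """Pick an ingestion approach using the exam tip's rule of thumb."""
    if needs_transformation:
        return "pipeline"        # complex transformation during ingestion
    if source_is_database and needs_near_real_time:
        return "mirroring"       # database source, zero-code, near real-time
    if not source_is_database:
        return "shortcut"        # file/storage source -> reference it in place
    return "pipeline"            # default: scheduled copy

choice = choose_ingestion(source_is_database=True,
                          needs_near_real_time=True,
                          needs_transformation=False)
```

Note the order matters: a transformation requirement trumps the other flags, because neither mirroring nor shortcuts can reshape data during ingestion.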

Mirroring considerations

Consideration | Detail
Source requirements | CDC must be enabled on the source (varies by database type)
Initial load | First sync loads all data — can take minutes to hours depending on volume
Schema changes | New columns are automatically added; column removals may require re-sync
Cost | OneLake storage for the replica + source egress charges (especially for Snowflake)
Limits | Table count limits per mirrored database (check current documentation)
Monitoring | Use the Monitoring Hub to track replication lag and errors
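
Replication lag is just the gap between the source's latest committed change and the last change applied to the mirror. A sketch of the kind of check you might wire to an alert (the timestamps and the 15-minute threshold are made-up values, not Fabric defaults):

```python
# Illustrative lag check -- the kind of logic behind a replication alert.
# Timestamps and the 15-minute threshold are made-up example values.
from datetime import datetime, timedelta

def replication_lag(last_source_commit: datetime,
                    last_mirror_apply: datetime) -> timedelta:
    """Lag = how far the mirror trails the source's latest committed change."""
    return last_source_commit - last_mirror_apply

threshold = timedelta(minutes=15)
lag = replication_lag(datetime(2026, 1, 1, 12, 30), datetime(2026, 1, 1, 12, 4))
alert = lag > threshold  # True means replication is falling behind; investigate
```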

Question

What is Fabric mirroring?


Answer

Continuous, CDC-based replication of an operational database into Fabric as Delta Lake tables in OneLake. Supported sources include Azure SQL Database, Cosmos DB, Snowflake, PostgreSQL, MySQL, and SQL Server. No ETL code needed — Fabric manages the replication automatically.


Question

Is mirrored data writable?


Answer

No. Mirrored databases in Fabric are read-only replicas. You can query them via the SQL analytics endpoint or Spark, but you cannot write data back to the source through the mirror.


Question

What is the key prerequisite for mirroring an Azure SQL Database?


Answer

Change Data Capture (CDC) must be enabled on the source tables. Fabric uses the CDC stream to detect and replicate inserts, updates, and deletes.



Knowledge Check

Carlos wants to replace his nightly ETL pipeline that copies data from Azure SQL Database to Fabric. He needs data freshness within 5 minutes and wants minimal maintenance. Which approach is best?

Knowledge Check

A data engineer creates a mirrored database for an Azure SQL source. The initial sync completes, but after a few hours, no new changes appear in Fabric. What is the most likely cause?


Next up: PySpark Transformations: Code Your Pipeline — write PySpark to clean, shape, denormalize, and aggregate data at scale.



© 2026 Sutheesh. All rights reserved.

Guided is an independent study resource and is not affiliated with, endorsed by, or officially connected to Microsoft. Microsoft, Azure, and related trademarks are property of Microsoft Corporation. Always verify information against Microsoft Learn.