
DP-420 Study Guide

Domain 1: Design and Implement Data Models

  • Cosmos DB β€” The Big Picture Free
  • Designing Your Data Model Free
  • Partition Key Strategy Free
  • Synthetic and Hierarchical Partition Keys Free
  • Relationships β€” Embedding vs Referencing Free
  • SDK Connectivity and Client Configuration Free
  • SDK CRUD Operations and Transactions Free
  • SQL Queries in Cosmos DB Free
  • SDK Query Pagination and LINQ Free
  • Server-Side Programming Free
  • Transactions in Practice Free

Domain 2: Design and Implement Data Distribution

  • Global Replication and Failover
  • Consistency Levels: Five Choices, Real Trade-Offs
  • Multi-Region Writes and Conflict Resolution

Domain 3: Integrate and Move Data

  • Change Feed with Azure Functions and Processors
  • Analytical Workloads: Synapse Link and Fabric Mirroring
  • Data Movement: ADF, Kafka, and Spark Connectors

Domain 4: Optimize Query and Operation Performance

  • Indexing Policies: Range, Spatial, and Composite
  • Request Units and Query Cost Optimization
  • Integrated Cache and Dedicated Gateway
  • Change Feed Patterns: Materialized Views and Estimator

Domain 5: Maintain an Azure Cosmos DB Solution

  • Monitoring: Metrics, Logs, and Alerts
  • Backup and Restore: Periodic vs Continuous
  • Network Security: Firewalls, VNets, and Private Endpoints
  • Data Security: Encryption, Keys, and RBAC
  • Cost Optimization: Throughput Modes and RU Strategy
  • DevOps: Infrastructure as Code and Deployments
  • Exam Strategy and Cross-Domain Review

Domain 3: Integrate and Move Data (Premium, ~14 min read)

Analytical Workloads: Synapse Link and Fabric Mirroring

Implement HTAP analytics on Cosmos DB data using Azure Synapse Link's auto-synced analytical store and Microsoft Fabric mirroring β€” no ETL pipelines required.

The problem: analytics vs transactions

β˜• Simple explanation

Running analytics on your live database is like doing a full inventory count while the store is open. Customers bump into counters, shelves get blocked, everything slows down.

Synapse Link creates a second copy of your data in a format optimised for analytics (columns instead of documents). This copy auto-syncs from your live data, so analysts query the copy while your app runs at full speed.

HTAP (Hybrid Transactional/Analytical Processing) runs analytics on operational data without extracting it into a separate data warehouse. Cosmos DB achieves this with:

  • Transactional store: Row-oriented, optimised for point reads and writes (the normal Cosmos DB store).
  • Analytical store: Column-oriented, auto-synced, optimised for aggregations and scans.
  • No ETL: Data flows automatically β€” no pipelines to build or maintain.
  • No RU impact: Analytical queries run against the column store, not your transactional RU budget.
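The row-vs-column distinction is the heart of HTAP. A minimal Python sketch (illustrative only β€” not the Cosmos DB SDK; data and names are invented) shows why each layout suits its workload:

```python
# Illustrative sketch: the same readings held row-wise (transactional)
# and column-wise (analytical). All data here is made up.
rows = [
    {"id": "1", "deviceId": "d1", "temp": 70.0},
    {"id": "2", "deviceId": "d1", "temp": 90.0},
    {"id": "3", "deviceId": "d2", "temp": 80.0},
]

# Row store: a point read fetches one whole document cheaply.
doc = next(r for r in rows if r["id"] == "2")

# Column store: pivot into one array per field. An aggregation scans a
# single contiguous array and ignores every column it doesn't need.
columns = {key: [r[key] for r in rows] for key in rows[0]}
avg_temp = sum(columns["temp"]) / len(columns["temp"])
```

The point-read pattern is what the transactional store optimises; the single-column scan is what the analytical store optimises.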

Amara’s analytics challenge

πŸ“‘ Amara at SensorFlow ingests 500M sensor events per day. Her data scientist TomΓ‘s wants to run daily aggregations β€” average temperature per device, anomaly detection, trend analysis. But running these queries against the transactional store would consume massive RU/s and slow down real-time ingestion.

Synapse Link is the answer: TomΓ‘s queries the analytical store via Synapse or Fabric while Amara’s ingestion runs unimpacted.

Enabling Synapse Link

⚠️ Important (2025): Azure Synapse Link for Cosmos DB is no longer supported for new projects. Microsoft recommends Azure Cosmos DB Mirroring for Microsoft Fabric instead, which is now GA and provides the same zero-ETL benefits. The exam may still test Synapse Link concepts, but know that Fabric Mirroring is the recommended replacement.

Step 1: Enable Synapse Link on the Cosmos DB account (one-time, irreversible):

az cosmosdb update --name sensorflow-cosmos \
  --resource-group rg-sensorflow \
  --enable-analytical-storage true

Step 2: Enable the analytical store on each container:

// Create the container with the analytical store enabled.
// "readings" is the container id; "/deviceId" is the partition key path.
ContainerProperties props = new ContainerProperties("readings", "/deviceId")
{
    AnalyticalStoreTimeToLiveInSeconds = -1  // -1 = retain analytical data indefinitely
};
await database.CreateContainerAsync(props);

TTL value       Behaviour
-1              Analytical store enabled; data retained indefinitely
0 or null       Analytical store disabled
N (positive)    Data retained in the analytical store for N seconds

πŸ’‘ Exam tip: analytical store TTL is independent

The analytical store TTL is separate from the transactional store TTL. You can keep data in the analytical store longer than in the transactional store:

  • Transactional TTL = 30 days (keep recent data for the app)
  • Analytical TTL = -1 (keep all data forever for analytics)

When a document expires from the transactional store, it persists in the analytical store until its own TTL expires. This is a common exam scenario.
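That lifecycle can be sketched in a few lines of plain Python (no SDK; the constants and helper are invented for illustration, using the 30-day transactional TTL from the example above):

```python
# Hypothetical helper: is a document still visible in a store N days
# after ingestion, given that store's TTL? -1 means "keep forever".
TRANSACTIONAL_TTL_DAYS = 30
ANALYTICAL_TTL_DAYS = -1  # retained indefinitely

def visible(ttl_days: int, age_days: int) -> bool:
    return ttl_days == -1 or age_days < ttl_days

# Day 31: expired from the transactional store, but still queryable
# from the analytical store because its TTL is independent.
assert not visible(TRANSACTIONAL_TTL_DAYS, 31)
assert visible(ANALYTICAL_TTL_DAYS, 31)
```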

How the analytical store works

Transactional Store (row-oriented)         Analytical Store (column-oriented)
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”           β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ { id, deviceId, temp, ts }   │──auto──→  β”‚ id     β”‚ deviceId β”‚ temp β”‚tsβ”‚
β”‚ { id, deviceId, temp, ts }   β”‚  sync     β”‚ ────── β”‚ ──────── β”‚ ──── │──│
β”‚ { id, deviceId, temp, ts }   β”‚  (~2min)  β”‚ val    β”‚ val      β”‚ val  β”‚  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜           β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
        App reads/writes                      Synapse / Fabric queries
        (RU budget)                           (no RU impact)

  • Auto-sync latency: Typically under 2 minutes; can be up to 5 minutes
  • Schema: Full fidelity, auto-inferred from the transactional store
  • Nested properties: Flattened into columns automatically
  • No RU consumption: Analytical sync and queries don’t consume transactional RU/s
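The flattening of nested properties can be pictured with a small sketch (illustrative Python β€” the analytical store does this server-side, and its exact naming and type-handling rules may differ):

```python
def flatten(doc: dict, prefix: str = "") -> dict:
    """Flatten nested objects into dotted column names, e.g.
    {"device": {"id": "d1"}} -> {"device.id": "d1"}."""
    out = {}
    for key, value in doc.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            out.update(flatten(value, name + "."))
        else:
            out[name] = value
    return out

flatten({"id": "1", "device": {"id": "d1", "loc": {"lat": 51.5}}})
# -> {"id": "1", "device.id": "d1", "device.loc.lat": 51.5}
```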

Querying via Synapse

TomΓ‘s queries the analytical store using Synapse serverless SQL or Spark:

-- Synapse serverless SQL pool
SELECT deviceId,
       AVG(temperature) as avg_temp,
       MAX(temperature) as max_temp,
       COUNT(*) as reading_count
FROM OPENROWSET(
    'CosmosDB',
    'Account=sensorflow-cosmos;Database=sensorflow;Key=...',
    readings
) WITH (
    deviceId VARCHAR(50),
    temperature FLOAT,
    _ts BIGINT
) AS readings
GROUP BY deviceId
HAVING AVG(temperature) > 80
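To make the query's semantics concrete, here is the same aggregation over a few hand-made rows in plain Python (sample data invented for illustration):

```python
from collections import defaultdict

readings = [
    {"deviceId": "d1", "temperature": 85.0},
    {"deviceId": "d1", "temperature": 95.0},
    {"deviceId": "d2", "temperature": 60.0},
]

# GROUP BY deviceId
groups = defaultdict(list)
for r in readings:
    groups[r["deviceId"]].append(r["temperature"])

# AVG / MAX / COUNT per group, keeping only HAVING AVG(temperature) > 80
hot = {
    dev: {"avg_temp": sum(t) / len(t), "max_temp": max(t), "reading_count": len(t)}
    for dev, t in groups.items()
    if sum(t) / len(t) > 80
}
# hot == {"d1": {"avg_temp": 90.0, "max_temp": 95.0, "reading_count": 2}}
```

In production the analytical store's column layout does this scan; the sketch only shows what the SQL computes.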

Fabric mirroring (recommended for new projects)

Microsoft Fabric mirroring is the recommended replacement for Synapse Link. It is GA and continuously replicates Cosmos DB data into Fabric OneLake:

Feature           Synapse Link                         Fabric Mirroring
Data location     Cosmos DB analytical store           Fabric OneLake (Delta tables)
Query engines     Synapse SQL/Spark pools              Fabric Spark, SQL endpoint, Power BI
Latency           ~2 min auto-sync                     Near real-time continuous sync
Schema handling   Auto-inferred columns                Delta tables with schema evolution
Cost model        Synapse compute + analytical store   Fabric capacity units
Setup complexity  Enable on account + container        Configure in Fabric workspace
Best for          Synapse-centric architectures        Fabric/Power BI-centric architectures

πŸ’‘ Exam tip: Synapse Link is the DP-420 focus

The DP-420 exam focuses primarily on Synapse Link and the analytical store. Fabric mirroring may appear in β€œwhich tool to choose” questions, but deep configuration questions will be about Synapse Link. Know how to enable it, set the TTL, and understand the auto-sync behaviour.

🎬 Video walkthrough

🎬 Video coming soon

Analytical Workloads β€” DP-420 Module 16 (~14 min)

Flashcards

Question

What is the analytical store in Cosmos DB?

Answer

A column-oriented, auto-synced copy of your transactional data. It's optimised for analytical queries (aggregations, scans) and is queried via Synapse or Fabric β€” without consuming transactional RU/s. Auto-syncs within ~2 minutes.

Question

What does setting analytical store TTL to -1 mean?

Answer

The analytical store is enabled and data is retained indefinitely. TTL=0 or null means disabled. A positive value sets expiry in seconds. The analytical TTL is independent of the transactional TTL.

Question

Does querying the analytical store consume transactional RU/s?

Answer

No β€” analytical queries run against the column store and do NOT consume the container's transactional RU/s. The compute cost comes from Synapse or Fabric, not from Cosmos DB.

Knowledge Check

  • TomΓ‘s needs to run daily aggregation queries on 500M sensor events. Running these directly on the transactional store would consume 50,000+ RU/s. What's the best approach?
  • Amara sets the transactional store TTL to 7 days and the analytical store TTL to -1. A document is ingested on Monday. What happens after the following Monday?
  • Which is required before you can enable the analytical store on a container?


Next up: Data Movement β€” Azure Data Factory, Kafka connectors, and Spark connectors for moving data into and out of Cosmos DB.

← Previous

Change Feed with Azure Functions and Processors

Next β†’

Data Movement: ADF, Kafka, and Spark Connectors


© 2026 Sutheesh. All rights reserved.

Guided is an independent study resource and is not affiliated with, endorsed by, or officially connected to Microsoft. Microsoft, Azure, and related trademarks are property of Microsoft Corporation. Always verify information against Microsoft Learn.