Analytical Workloads: Synapse Link and Fabric Mirroring
Implement HTAP analytics on Cosmos DB data using Azure Synapse Link's auto-synced analytical store and Microsoft Fabric mirroring, with no ETL pipelines required.
The problem: analytics vs transactions
Running analytics on your live database is like doing a full inventory count while the store is open. Customers bump into counters, shelves get blocked, everything slows down.
Synapse Link creates a second copy of your data in a format optimised for analytics (columns instead of documents). This copy auto-syncs from your live data, so analysts query the copy while your app runs at full speed.
Amara's analytics challenge
💡 Amara at SensorFlow ingests 500M sensor events per day. Her data scientist Tomás wants to run daily aggregations: average temperature per device, anomaly detection, trend analysis. But running these queries against the transactional store would consume massive RU/s and slow down real-time ingestion.
Synapse Link is the answer: Tomás queries the analytical store via Synapse or Fabric while Amara's ingestion runs unaffected.
Enabling Synapse Link
⚠️ Important (2025): Azure Synapse Link for Cosmos DB is no longer supported for new projects. Microsoft recommends Azure Cosmos DB Mirroring for Microsoft Fabric instead, which is now GA and provides the same zero-ETL benefits. The exam may still test Synapse Link concepts, but know that Fabric Mirroring is the recommended replacement.
Step 1: Enable Synapse Link on the Cosmos DB account (one-time, irreversible):
az cosmosdb update --name sensorflow-cosmos \
    --resource-group rg-sensorflow \
    --enable-analytical-storage true
Step 2: Enable the analytical store on each container:
ContainerProperties props = new ContainerProperties("readings", "/deviceId")
{
    AnalyticalStoreTimeToLiveInSeconds = -1 // -1 = retain analytical data indefinitely
};
await database.CreateContainerAsync(props);
| TTL value | Behaviour |
|---|---|
| -1 | Analytical store enabled; data retained indefinitely |
| 0 or null | Analytical store disabled |
| N (positive) | Data retained in the analytical store for N seconds |
Exam tip: analytical store TTL is independent
The analytical store TTL is separate from the transactional store TTL. You can keep data in the analytical store longer than in the transactional store:
- Transactional TTL = 30 days (keep recent data for the app)
- Analytical TTL = -1 (keep all data forever for analytics)
When a document expires from the transactional store, it persists in the analytical store until its own TTL expires. This is a common exam scenario.
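The interplay between the two TTLs can be sketched in plain Python. This is purely illustrative: the `expiry` helper and the dates are made up for this sketch, not part of any Cosmos DB SDK, but the TTL semantics (-1 = keep forever, positive N = expire N seconds after the last write) follow the table above.

```python
from datetime import datetime, timedelta

def expiry(ingested_at, ttl_seconds):
    """Return when a document expires for a given TTL, or None if it never does.

    Follows Cosmos DB TTL semantics: -1 means retain indefinitely,
    a positive N means expire N seconds after ingestion.
    """
    if ttl_seconds == -1:
        return None  # retained indefinitely
    return ingested_at + timedelta(seconds=ttl_seconds)

monday = datetime(2025, 6, 2)          # hypothetical ingestion time
transactional_ttl = 30 * 24 * 3600     # 30 days, in seconds
analytical_ttl = -1                    # keep forever

# Document disappears from the transactional store 30 days after ingestion...
print(expiry(monday, transactional_ttl))
# ...but never expires from the analytical store.
print(expiry(monday, analytical_ttl))
```

With Amara's 7-day transactional TTL from the knowledge check below, the same logic applies: after a week the document is gone from the transactional store yet still queryable in the analytical store.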
How the analytical store works
Transactional store (row-oriented)            Analytical store (column-oriented)
+------------------------------+              +------------------------------+
| { id, deviceId, temp, ts }   |  auto-sync   | id | deviceId | temp | ts    |
| { id, deviceId, temp, ts }   | -----------> |----|----------|------|-------|
| { id, deviceId, temp, ts }   |   (~2 min)   | .. | ..       | ..   | ..    |
+------------------------------+              +------------------------------+
  App reads/writes                              Synapse / Fabric queries
  (RU budget)                                   (no RU impact)
- Auto-sync latency: Typically under 2 minutes; can be up to 5 minutes
- Schema: Auto-inferred from the transactional store (well-defined schema representation by default)
- Nested properties: Flattened into columns automatically
- No RU consumption: Analytical sync and queries don't consume transactional RU/s
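The flattening of nested properties can be illustrated with a small sketch. The `flatten` function below is not part of any SDK; it only mimics the idea of deriving dotted column names from nested JSON (the real inference engine additionally handles arrays, type conflicts, and schema representation rules):

```python
def flatten(doc, prefix=""):
    """Flatten a nested document into column-name -> value pairs,
    mimicking how nested JSON properties become dotted column names
    in a columnar store. Illustrative only; arrays and type conflicts
    are out of scope for this sketch."""
    cols = {}
    for key, value in doc.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            cols.update(flatten(value, name))  # recurse into nested objects
        else:
            cols[name] = value
    return cols

# A hypothetical sensor event with a nested reading object:
event = {"id": "r1", "deviceId": "d-42",
         "reading": {"temp": 81.2, "unit": "F"}}
print(flatten(event))
# {'id': 'r1', 'deviceId': 'd-42', 'reading.temp': 81.2, 'reading.unit': 'F'}
```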
Querying via Synapse
Tomás queries the analytical store using Synapse serverless SQL or Spark:
-- Synapse serverless SQL pool
SELECT deviceId,
       AVG(temperature) AS avg_temp,
       MAX(temperature) AS max_temp,
       COUNT(*) AS reading_count
FROM OPENROWSET(
    'CosmosDB',
    'Account=sensorflow-cosmos;Database=sensorflow;Key=...',
    readings
) WITH (
    deviceId VARCHAR(50),
    temperature FLOAT,
    _ts BIGINT
) AS readings
GROUP BY deviceId
HAVING AVG(temperature) > 80
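To make the aggregation concrete, here is a pure-Python equivalent of what that query computes: group readings by device, then keep only devices whose average temperature exceeds the threshold. The function name, sample rows, and threshold default are illustrative, not from any Cosmos DB or Synapse API:

```python
from collections import defaultdict

def hot_devices(rows, threshold=80.0):
    """Group readings by deviceId and keep devices whose average
    temperature exceeds the threshold -- the same GROUP BY / HAVING
    shape as the serverless SQL query above."""
    temps = defaultdict(list)
    for row in rows:
        temps[row["deviceId"]].append(row["temperature"])
    return {
        device: {
            "avg_temp": sum(t) / len(t),
            "max_temp": max(t),
            "reading_count": len(t),
        }
        for device, t in temps.items()
        if sum(t) / len(t) > threshold  # HAVING AVG(temperature) > threshold
    }

sample = [
    {"deviceId": "d-1", "temperature": 85.0},
    {"deviceId": "d-1", "temperature": 79.0},
    {"deviceId": "d-2", "temperature": 70.0},
]
print(hot_devices(sample))
# d-1 qualifies (avg 82.0 > 80); d-2 is filtered out
```

The key point for the exam is not the aggregation itself but where it runs: against the analytical store, so none of this work touches the transactional RU budget.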
Fabric mirroring (recommended for new projects)
Microsoft Fabric mirroring is the recommended replacement for Synapse Link. It is GA and continuously replicates Cosmos DB data into Fabric OneLake:
| Feature | Synapse Link | Fabric Mirroring |
|---|---|---|
| Data location | Cosmos DB analytical store | Fabric OneLake (Delta tables) |
| Query engines | Synapse SQL/Spark pools | Fabric Spark, SQL endpoint, Power BI |
| Latency | ~2 min auto-sync | Near real-time continuous sync |
| Schema handling | Auto-inferred columns | Delta tables with schema evolution |
| Cost model | Synapse compute + analytical store | Fabric capacity units |
| Setup complexity | Enable on account + container | Configure in Fabric workspace |
| Best for | Synapse-centric architectures | Fabric/Power BI-centric architectures |
Exam tip: Synapse Link is the DP-420 focus
The DP-420 exam focuses primarily on Synapse Link and the analytical store. Fabric mirroring may appear in "which tool to choose" questions, but deep configuration questions will be about Synapse Link. Know how to enable it, set TTL, and understand the auto-sync behaviour.
🎬 Video walkthrough
🎬 Video coming soon
Analytical Workloads (DP-420 Module 16)
~14 min
Knowledge Check
Tomás needs to run daily aggregation queries on 500M sensor events. Running these directly on the transactional store would consume 50,000+ RU/s. What's the best approach?
Amara sets transactional store TTL to 7 days and analytical store TTL to -1. A document is ingested on Monday. What happens after the following Monday?
Which is required before you can enable the analytical store on a container?
Next up: Data Movement, covering Azure Data Factory, Kafka connectors, and Spark connectors for moving data into and out of Cosmos DB.