Guided by A Guide to Cloud

DP-420 Study Guide

Domain 1: Design and Implement Data Models

  • Cosmos DB: The Big Picture Free
  • Designing Your Data Model Free
  • Partition Key Strategy Free
  • Synthetic and Hierarchical Partition Keys Free
  • Relationships: Embedding vs Referencing Free
  • SDK Connectivity and Client Configuration Free
  • SDK CRUD Operations and Transactions Free
  • SQL Queries in Cosmos DB Free
  • SDK Query Pagination and LINQ Free
  • Server-Side Programming Free
  • Transactions in Practice Free

Domain 2: Design and Implement Data Distribution

  • Global Replication and Failover
  • Consistency Levels: Five Choices, Real Trade-Offs
  • Multi-Region Writes and Conflict Resolution

Domain 3: Integrate and Move Data

  • Change Feed with Azure Functions and Processors
  • Analytical Workloads: Synapse Link and Fabric Mirroring
  • Data Movement: ADF, Kafka, and Spark Connectors

Domain 4: Optimize Query and Operation Performance

  • Indexing Policies: Range, Spatial, and Composite
  • Request Units and Query Cost Optimization
  • Integrated Cache and Dedicated Gateway
  • Change Feed Patterns: Materialized Views and Estimator

Domain 5: Maintain an Azure Cosmos DB Solution

  • Monitoring: Metrics, Logs, and Alerts
  • Backup and Restore: Periodic vs Continuous
  • Network Security: Firewalls, VNets, and Private Endpoints
  • Data Security: Encryption, Keys, and RBAC
  • Cost Optimization: Throughput Modes and RU Strategy
  • DevOps: Infrastructure as Code and Deployments
  • Exam Strategy and Cross-Domain Review

Domain 1: Design and Implement Data Models Free ⏱ ~16 min read

Relationships: Embedding vs Referencing

Master the two fundamental data modelling strategies in Cosmos DB: embedding related data inside a document and referencing it with separate documents. This module also covers TTL, unique keys, and keeping denormalised data in sync with the change feed.

Embedding vs referencing

☕ Simple explanation

Think about a recipe card. You can write the ingredients right on the card (embedding), so everything you need is in one place. Or you can write “See pantry shelf 3” and look it up separately (referencing).

Embedding is faster to read (one trip). Referencing is better when the ingredient list changes often or is shared across many recipes.

In relational databases, relationships are modelled with foreign keys and JOINs. In Cosmos DB, you choose between two strategies:

  • Embedding: Nest related data directly inside the parent document. One read returns everything. Best for data that’s read together and doesn’t change independently.
  • Referencing: Store related data in separate documents with a reference (an ID). Requires multiple reads or queries. Best for data that changes frequently, is large, or is shared across many parents.

Most real-world models use both: embed some relationships, reference others.

When to embed

Embed when:

| Signal | Why embedding works |
| --- | --- |
| 1:1 relationship | A user and their profile: always read together |
| 1:few relationship | An order with 3-5 line items: small, bounded |
| Data is read together | A blog post and its tags: always displayed together |
| Child rarely changes independently | Address embedded in a customer record |
| Bounded growth | You know the array won’t exceed a reasonable size |

{
  "id": "user-001",
  "tenantId": "tenant-abc",
  "type": "user",
  "name": "Priya Sharma",
  "email": "priya@novasaas.com",
  "address": {
    "street": "42 Cloud Lane",
    "city": "Auckland",
    "country": "New Zealand"
  },
  "roles": ["admin", "architect"]
}

One point read (about 1 RU for an item of up to 1 KB), with no JOINs and no second query.

When to reference

Reference when:

| Signal | Why referencing works |
| --- | --- |
| 1:many (unbounded) | A product with 10,000 reviews: embedding would blow past 2 MB |
| Data changes independently | A user’s name changes; you don’t want to update 500 embedded copies |
| Data is shared | A category referenced by 10,000 products: store once, reference everywhere |
| Data is large | A comment with 50 KB of rich text |
| Different access patterns | Comments are loaded on scroll, not with the parent |

// Parent document
{
  "id": "proj-001",
  "tenantId": "tenant-abc",
  "type": "project",
  "name": "Website Redesign",
  "ownerId": "user-001"
}

// Referenced document (same or different container)
{
  "id": "task-042",
  "tenantId": "tenant-abc",
  "type": "task",
  "projectId": "proj-001",
  "title": "Update hero section",
  "assigneeId": "user-001"
}

Reading a project and its tasks: two queries, but each is a single-partition query on /tenantId.

Embedding vs referencing comparison

| Aspect | Embedding | Referencing |
| --- | --- | --- |
| Read performance | Single read: fast, 1 RU for point read | Multiple reads: more RU, more latency |
| Write performance | Entire document rewritten on any change | Only the changed document is written |
| Data duplication | Possible: embedded copies may go stale | No duplication: single source of truth |
| Document size | Grows with embedded data: watch the 2 MB limit | Each document stays small |
| Consistency | Always consistent (one document) | May be eventually consistent (denormalised copies) |
| Best for | 1:1, 1:few, read-together, rarely changes | 1:many, changes often, shared across parents |
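
The "watch the 2 MB limit" row can be turned into a rough check before you write a document with a growing embedded array. The sketch below is illustrative only: DocumentSizeGuard, ShouldStopEmbedding, and the 50% threshold are hypothetical names and values, not part of the Cosmos DB SDK. It assumes the System.Text.Json output is close to what your client actually sends, so treat the result as an estimate.

```csharp
using System;
using System.Text.Json;

public static class DocumentSizeGuard
{
    // The 2 MB item limit cited in the comparison above.
    // Service-side accounting may differ slightly from this local estimate.
    public const int MaxItemBytes = 2 * 1024 * 1024;

    // Approximate size: UTF-8 length of the JSON representation.
    public static int EstimateBytes<T>(T document) =>
        JsonSerializer.SerializeToUtf8Bytes(document).Length;

    // True once the document uses more than the given fraction of the limit.
    // The 0.5 default is an arbitrary, illustrative threshold.
    public static bool ShouldStopEmbedding<T>(T document, double fraction = 0.5) =>
        EstimateBytes(document) > MaxItemBytes * fraction;
}
```

If the guard trips for a document type, that is a signal the relationship is not really bounded and the child items are candidates for referencing.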

Denormalisation and the change feed

When you embed data (e.g., a user’s name inside every task they’re assigned to), updates create a sync problem. The change feed solves this:

1. Priya updates her display name in the "users" container
2. A change feed processor detects the update
3. The processor queries all tasks where assigneeId = "user-001"
4. It patches each task's embedded assigneeName with the new value
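
// Change feed processor handler
// Requires: using Microsoft.Azure.Cosmos; plus System.Collections.Generic,
// System.Threading, and System.Threading.Tasks.
// Assumes tasksContainer is a Container field pointing at the tasks container,
// and that User and TaskItem expose the Id, TenantId, and DisplayName properties used here.
static async Task HandleUserChanges(
    ChangeFeedProcessorContext context,
    IReadOnlyCollection<User> changes,
    CancellationToken ct)
{
    foreach (User user in changes)
    {
        // Find all tasks assigned to this user
        var query = new QueryDefinition(
            "SELECT * FROM c WHERE c.assigneeId = @userId AND c.type = 'task'")
            .WithParameter("@userId", user.Id);

        using FeedIterator<TaskItem> feed = tasksContainer
            .GetItemQueryIterator<TaskItem>(query);

        // Drain every page of results
        while (feed.HasMoreResults)
        {
            foreach (TaskItem task in await feed.ReadNextAsync(ct))
            {
                // Patch only the embedded name; the rest of the task is untouched
                await tasksContainer.PatchItemAsync<TaskItem>(
                    task.Id,
                    new PartitionKey(task.TenantId),
                    new[] { PatchOperation.Set("/assigneeName", user.DisplayName) },
                    cancellationToken: ct);
            }
        }
    }
}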

This gives you fast reads (embedded name) with eventual consistency (change feed updates typically propagate within seconds).
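
One practical detail: the change feed processor delivers changes at least once, so the same user update can reach the handler more than once. A minimal way to keep the sync idempotent (and avoid spending RU on no-op patches) is to skip tasks whose embedded name already matches. The sketch below models that filter in memory; the TaskItem record and the DenormSync helper are hypothetical stand-ins, not SDK types.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of a task carrying an embedded (denormalised) assignee name.
public record TaskItem(string Id, string TenantId, string AssigneeId, string AssigneeName);

public static class DenormSync
{
    // Returns only the tasks that belong to the user AND still hold a stale name.
    // Re-running this after the patches have been applied returns an empty list.
    public static IReadOnlyList<TaskItem> TasksNeedingPatch(
        IEnumerable<TaskItem> tasks, string userId, string newDisplayName) =>
        tasks.Where(t => t.AssigneeId == userId && t.AssigneeName != newDisplayName)
             .ToList();
}
```

The same condition could also be pushed into the query's WHERE clause so that up-to-date tasks are never returned in the first place.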

Time to Live (TTL)

TTL automatically deletes items after a specified number of seconds. It’s configured at two levels:

| Level | Setting | Behaviour |
| --- | --- | --- |
| Container default | DefaultTimeToLive | Enables TTL for the container. Set to -1 to enable without a default, or a positive number for a default expiry. |
| Per-item override | ttl property on the JSON document | Overrides the container default. Set to -1 to never expire, or a positive number for a custom expiry. |

// Enable TTL on container with a 90-day default
ContainerProperties props = new("audit", "/tenantId")
{
    DefaultTimeToLive = 90 * 24 * 60 * 60  // 90 days in seconds
};

// Per-item: this specific item never expires
{
    "id": "audit-critical-001",
    "tenantId": "tenant-abc",
    "type": "auditEntry",
    "ttl": -1,
    "action": "data-export",
    "details": "Full tenant export requested by admin"
}

Key rules:

  • Container TTL must be enabled (not null) before per-item TTL works
  • ttl: -1 on an item means “never expire” even if the container has a default
  • TTL deletes run as a background task that uses leftover RU/s, so there is no charge beyond the RUs consumed

💡 Exam tip: TTL hierarchy

TTL has three states: off (container DefaultTimeToLive is null, so no items expire), on with default (positive number: all items expire unless overridden), on without default (set to -1: items only expire if they have their own ttl value). Per-item ttl: -1 always means “never expire.” Per-item ttl only works when the container has TTL enabled.
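
The three container states and the per-item override combine into a small decision table. The sketch below encodes the rules from this section as a pure function; TtlRules and EffectiveTtlSeconds are hypothetical names for illustration, not an SDK API, and null stands for "not set".

```csharp
public static class TtlRules
{
    // Returns the number of seconds after which the item expires,
    // or null if it never expires.
    //   containerDefaultTtl: null = TTL off, -1 = on without default, n > 0 = default expiry
    //   itemTtl:             null = no ttl property, -1 = never expire, n > 0 = custom expiry
    public static int? EffectiveTtlSeconds(int? containerDefaultTtl, int? itemTtl)
    {
        // TTL is off at the container: any per-item ttl is ignored.
        if (containerDefaultTtl is null) return null;

        // No per-item value: fall back to the container default (-1 means no default).
        if (itemTtl is null) return containerDefaultTtl == -1 ? null : containerDefaultTtl;

        // Per-item value wins: -1 means never expire, otherwise a custom expiry.
        return itemTtl == -1 ? null : itemTtl;
    }
}
```

For example, EffectiveTtlSeconds(86400, -1) returns null (the item never expires), while EffectiveTtlSeconds(null, 60) also returns null because the container has TTL switched off.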

Unique keys

Unique keys enforce uniqueness within a logical partition, not across the entire container:

ContainerProperties props = new("users", "/tenantId")
{
    UniqueKeyPolicy = new UniqueKeyPolicy
    {
        UniqueKeys =
        {
            new UniqueKey { Paths = { "/email" } },
            new UniqueKey { Paths = { "/username" } }
        }
    }
};

Key rules:

  • Unique keys are scoped to a logical partition, so two different tenants can have the same email
  • Unique keys must be defined at container creation time; you cannot add them later
  • Missing values are treated as null and take part in the uniqueness check, so only one item per logical partition can have a null (or missing) value for that path

Ravi’s mistake: Ravi assumed unique keys were globally unique across the container. He set /email as a unique key, then was confused when two different tenants could both register admin@company.com. Priya explained: unique keys are per-partition-key-value, not global.
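
Ravi's surprise is easier to see with a small in-memory model of the rule. The sketch below is not how Cosmos DB implements unique keys; UniqueKeyIndex is a hypothetical class that only mimics the scoping: uniqueness is tracked per (partition key value, unique key value) pair, and null counts as a value. In the real service, a write that violates the constraint is rejected with a 409 Conflict.

```csharp
#nullable enable
using System.Collections.Generic;

public sealed class UniqueKeyIndex
{
    // One entry per (logical partition, unique key value) pair.
    private readonly HashSet<(string PartitionKey, string? Value)> _seen = new();

    // Returns true if the write would be accepted,
    // false where the model treats it as a unique key violation.
    public bool TryAdd(string partitionKey, string? value) =>
        _seen.Add((partitionKey, value));
}
```

With this model, adding admin@company.com for tenant-a and then for tenant-b succeeds both times, while a second admin@company.com in tenant-a is rejected. A second null in the same partition is rejected as well.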

💡 Exam tip: unique key creation time

Like partition keys, unique key policies are set at container creation and cannot be changed later. Plan your uniqueness constraints upfront. If you need to add a unique key, you must create a new container and migrate data.

Flashcards

Question

When should you embed related data in Cosmos DB?

Answer

Embed when: (1) 1:1 or 1:few relationship, (2) data is always read together, (3) child rarely changes independently, (4) growth is bounded. The benefit is a single read returning everything.

Question

How do you keep denormalised (embedded) data in sync?

Answer

Use the change feed. When the source document changes, a change feed processor detects the update and patches all documents containing the embedded copy. This provides eventual consistency: updates propagate in seconds.

Question

What does ttl: -1 mean on a Cosmos DB item?

Answer

It means 'never expire': the item lives forever, even if the container has a default TTL. The container must have TTL enabled (DefaultTimeToLive ≠ null) for per-item TTL to work at all.

Question

Are unique keys in Cosmos DB globally unique across the entire container?

Answer

No. Unique keys are scoped to a logical partition (same partition key value). Two items in different logical partitions can have the same value for a unique key path. Unique key policies must be defined at container creation.

Knowledge check

1. Priya's 'project' document embeds the owner's display name. The owner updates their name. What's the best way to keep the embedded name current?
2. A container has DefaultTimeToLive set to 86400 (1 day). An item has ttl: -1. When does the item expire?
3. Ravi sets a unique key on /email in a container partitioned by /tenantId. Can two users in different tenants have the same email?


Next up: SDK Connectivity. Learn how to create and configure the CosmosClient, choose between direct and gateway modes, and authenticate with account keys or Entra ID.



© 2026 Sutheesh. All rights reserved.

Guided is an independent study resource and is not affiliated with, endorsed by, or officially connected to Microsoft. Microsoft, Azure, and related trademarks are property of Microsoft Corporation. Always verify information against Microsoft Learn.