Guided by A Guide to Cloud

DP-420 Study Guide

Domain 1: Design and Implement Data Models

  • Cosmos DB: The Big Picture Free
  • Designing Your Data Model Free
  • Partition Key Strategy Free
  • Synthetic and Hierarchical Partition Keys Free
  • Relationships: Embedding vs Referencing Free
  • SDK Connectivity and Client Configuration Free
  • SDK CRUD Operations and Transactions Free
  • SQL Queries in Cosmos DB Free
  • SDK Query Pagination and LINQ Free
  • Server-Side Programming Free
  • Transactions in Practice Free

Domain 2: Design and Implement Data Distribution

  • Global Replication and Failover
  • Consistency Levels: Five Choices, Real Trade-Offs
  • Multi-Region Writes and Conflict Resolution

Domain 3: Integrate and Move Data

  • Change Feed with Azure Functions and Processors
  • Analytical Workloads: Synapse Link and Fabric Mirroring
  • Data Movement: ADF, Kafka, and Spark Connectors

Domain 4: Optimize Query and Operation Performance

  • Indexing Policies: Range, Spatial, and Composite
  • Request Units and Query Cost Optimization
  • Integrated Cache and Dedicated Gateway
  • Change Feed Patterns: Materialized Views and Estimator

Domain 5: Maintain an Azure Cosmos DB Solution

  • Monitoring: Metrics, Logs, and Alerts
  • Backup and Restore: Periodic vs Continuous
  • Network Security: Firewalls, VNets, and Private Endpoints
  • Data Security: Encryption, Keys, and RBAC
  • Cost Optimization: Throughput Modes and RU Strategy
  • DevOps: Infrastructure as Code and Deployments
  • Exam Strategy and Cross-Domain Review

Domain 1: Design and Implement Data Models Free ⏱ ~14 min read

Synthetic and Hierarchical Partition Keys

Go beyond simple partition keys with synthetic keys (concatenation, random suffixes, hashing) and hierarchical partition keys (up to 3 levels) to break the 20 GB logical partition limit and improve distribution.

When a simple partition key isn’t enough

☕ Simple explanation

Imagine your library’s “S” shelf is overflowing. You have two options: (1) Create combined labels like “S-Fiction” and “S-History” to spread books across sub-shelves (that’s a synthetic key). (2) Build a multi-level system of Floor → Section → Shelf (that’s a hierarchical key).

Both solve the same problem: one shelf had too much stuff. Synthetic keys are a string trick you do yourself. Hierarchical keys are a Cosmos DB feature that lets you define up to 3 levels of partitioning.

A simple partition key like /tenantId may run into the 20 GB logical partition limit for large tenants, or create hot partitions when one key value dominates traffic. Two advanced approaches solve this:

  • Synthetic keys: You create a computed property by concatenating, hashing, or appending a suffix to existing fields. This is a data modelling technique; Cosmos DB sees it as a regular partition key.
  • Hierarchical partition keys: A Cosmos DB feature that lets you define up to 3 levels of partition key paths. The 20 GB limit then applies to each full key combination rather than to the first level alone, and queries that supply a key prefix stay efficiently scoped.

Synthetic partition keys

Technique 1: Concatenation

Combine two or more fields into a single string:

// In your application code, before writing to Cosmos DB
var item = new
{
    id = Guid.NewGuid().ToString(),
    tenantId = "tenant-abc",
    type = "task",
    partitionKey = "tenant-abc_task",  // synthetic key
    title = "Update hero section"
};

await container.CreateItemAsync(item, new PartitionKey("tenant-abc_task"));

When to use: Your queries almost always filter on both fields (e.g., tenantId AND type).
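
A small helper can keep the key format identical on the write path and the read path. The sketch below is illustrative only: `BuildSyntheticKey` and the `_` separator are assumptions, not part of the Cosmos DB SDK.

```csharp
using System;

// Hypothetical helper: joins field values into one synthetic key string.
// Parts containing the separator are rejected so that ("a_b", "c") and
// ("a", "b_c") cannot collapse into the same key.
string BuildSyntheticKey(params string[] parts)
{
    const string separator = "_";
    foreach (string part in parts)
    {
        if (string.IsNullOrEmpty(part) || part.Contains(separator))
            throw new ArgumentException($"Invalid key part: '{part}'");
    }
    return string.Join(separator, parts);
}

string key = BuildSyntheticKey("tenant-abc", "task");
Console.WriteLine(key); // tenant-abc_task
```

The same function can then produce both the `partitionKey` property on the item and the `PartitionKey` value passed to the SDK, so the two cannot drift apart.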

Technique 2: Random suffix

Append a random number to spread writes across partitions:

// Spread a hot tenant across 10 sub-partitions
int suffix = new Random().Next(0, 10);
string syntheticKey = $"tenant-bigcorp_{suffix}";

var item = new
{
    id = Guid.NewGuid().ToString(),
    tenantId = "tenant-bigcorp",
    partitionKey = syntheticKey,  // e.g., "tenant-bigcorp_7"
    sensorData = new { temperature = 72.5 }
};

Trade-off: Writes are spread roughly evenly across the suffixes, but reads need to fan out across all suffix values:

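// To read ALL data for tenant-bigcorp, you need 10 queries (one per suffix)
var results = new List<dynamic>();
for (int i = 0; i < 10; i++)
{
    var options = new QueryRequestOptions { PartitionKey = new PartitionKey($"tenant-bigcorp_{i}") };
    using FeedIterator<dynamic> feed = container.GetItemQueryIterator<dynamic>("SELECT * FROM c", requestOptions: options);
    while (feed.HasMoreResults)
    {
        results.AddRange(await feed.ReadNextAsync());  // drain every page of this sub-partition
    }
}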
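
The write side and the read side have to agree on the number of suffixes, so it can help to keep both in one place. A minimal sketch, assuming hypothetical helpers `PickWriteKey` and `AllReadKeys` and a fixed count of 10 (none of these names come from the SDK):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

const int SuffixCount = 10; // assumed sub-partition count; writers and readers must agree on it

// Write path: pick one sub-partition at random.
string PickWriteKey(string tenantId, Random rng) => $"{tenantId}_{rng.Next(0, SuffixCount)}";

// Read path: every suffix has to be visited to see all of the tenant's data.
List<string> AllReadKeys(string tenantId) =>
    Enumerable.Range(0, SuffixCount).Select(i => $"{tenantId}_{i}").ToList();

var random = new Random();
string writeKey = PickWriteKey("tenant-bigcorp", random);
List<string> readKeys = AllReadKeys("tenant-bigcorp");

Console.WriteLine(readKeys.Count);              // 10
Console.WriteLine(readKeys.Contains(writeKey)); // True
```

Raising `SuffixCount` later spreads writes further, but every full read then costs that many queries.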

Technique 3: Hash

Use a hash for deterministic distribution without fan-out (if you know the input):

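// Needs: using System.Buffers.Binary; using System.Security.Cryptography; using System.Text;
// string.GetHashCode() is randomized per process in modern .NET, so hash with SHA-256 instead
string input = $"{tenantId}_{projectId}";
byte[] digest = SHA256.HashData(Encoding.UTF8.GetBytes(input));
uint bucket = BinaryPrimitives.ReadUInt32LittleEndian(digest) % 100;  // 0..99
string syntheticKey = $"{tenantId}_{bucket}";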

When to use: You want even distribution AND can recompute the key at read time from known inputs.
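
The key is only useful if a reader can reproduce it later, possibly in a different process. A minimal sketch of that round trip follows. `SyntheticKeyFor` and the bucket count of 100 are illustrative assumptions, and SHA-256 is used here because `string.GetHashCode()` is not stable across processes in modern .NET.

```csharp
using System;
using System.Buffers.Binary;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: maps (tenantId, projectId) to one of `bucketCount` keys.
// SHA-256 serves only as a stable, well-spread hash here, not as a security measure.
string SyntheticKeyFor(string tenantId, string projectId, int bucketCount = 100)
{
    byte[] digest = SHA256.HashData(Encoding.UTF8.GetBytes($"{tenantId}_{projectId}"));
    // Read the first 4 bytes with fixed endianness so every machine computes the same value.
    uint value = BinaryPrimitives.ReadUInt32LittleEndian(digest);
    return $"{tenantId}_{value % (uint)bucketCount}";
}

string writeKey = SyntheticKeyFor("tenant-abc", "proj-001");
string readKey = SyntheticKeyFor("tenant-abc", "proj-001"); // recomputed later from the same inputs
Console.WriteLine(writeKey == readKey); // True
```

Changing the bucket count after data has been written changes the computed keys, so existing items would no longer be found under the recomputed value.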

Hierarchical partition keys

Hierarchical keys are a Cosmos DB feature: you define up to 3 levels of partition key paths when creating the container:

// Create a container with hierarchical partition keys
ContainerProperties properties = new ContainerProperties(
    id: "workitems",
    partitionKeyPaths: new List<string> { "/tenantId", "/projectId", "/id" }
);

Database database = await cosmosClient.CreateDatabaseIfNotExistsAsync("novasaas");
Container container = await database.CreateContainerIfNotExistsAsync(properties, throughput: 10000);

How it works

Level 1: /tenantId     → "tenant-abc"
Level 2: /projectId    → "proj-001"
Level 3: /id           → "task-042"

Logical partition = combination of all 3 levels
  • Queries at level 1 (WHERE c.tenantId = 'tenant-abc') are routed only to the partitions that hold that tenant’s data
  • Queries at levels 1 + 2 (WHERE c.tenantId = 'tenant-abc' AND c.projectId = 'proj-001') are more targeted
  • Queries at all 3 levels target a single logical partition, the same scope as a point read
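
The narrowing effect of each extra level can be pictured with a toy in-memory model. This is not the Cosmos DB SDK and says nothing about how the service routes requests internally; `MatchPrefix` and the sample items are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy model: each item carries its three key levels.
var items = new List<(string TenantId, string ProjectId, string Id)>
{
    ("tenant-abc", "proj-001", "task-042"),
    ("tenant-abc", "proj-001", "task-043"),
    ("tenant-abc", "proj-002", "task-044"),
    ("tenant-xyz", "proj-001", "task-045"),
};

// Returns the items whose key starts with the given prefix (1 to 3 levels, left to right).
List<(string TenantId, string ProjectId, string Id)> MatchPrefix(params string[] prefix)
{
    return items.Where(item =>
    {
        string[] levels = { item.TenantId, item.ProjectId, item.Id };
        return prefix.Length <= levels.Length
            && levels.Take(prefix.Length).SequenceEqual(prefix);
    }).ToList();
}

Console.WriteLine(MatchPrefix("tenant-abc").Count);                         // 3
Console.WriteLine(MatchPrefix("tenant-abc", "proj-001").Count);             // 2
Console.WriteLine(MatchPrefix("tenant-abc", "proj-001", "task-042").Count); // 1
```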

Breaking the 20 GB limit

With a simple /tenantId key, all data for one tenant must fit in 20 GB. With hierarchical keys:

  • Each unique combination of all levels is a logical partition
  • Tenant “abc” with 100 projects has at least 100 logical partitions, and with /id as the last level it has one per item
  • Each logical partition stays small → the 20 GB limit effectively disappears for that tenant
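
The arithmetic behind this can be checked with a rough simulation. The workload below (100 projects, 250 items each, about 1 MB per item) is invented for illustration and only compares the largest logical partition under each key design against the 20 GB limit.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

const double LimitGb = 20.0; // logical partition size limit discussed above

// Hypothetical workload: 1 tenant x 100 projects x 250 items x ~1 MB = ~25 GB in total.
var items = new List<(string TenantId, string ProjectId, string Id, double SizeGb)>();
for (int p = 0; p < 100; p++)
{
    for (int i = 0; i < 250; i++)
    {
        items.Add(("tenant-abc", $"proj-{p:D3}", $"task-{p:D3}-{i:D3}", 0.001));
    }
}

// Largest logical partition when the key is /tenantId only.
double maxSimple = items.GroupBy(x => x.TenantId).Max(g => g.Sum(x => x.SizeGb));

// Largest logical partition when the key is /tenantId + /projectId + /id.
double maxHierarchical = items
    .GroupBy(x => (x.TenantId, x.ProjectId, x.Id))
    .Max(g => g.Sum(x => x.SizeGb));

Console.WriteLine($"Simple key over limit:       {maxSimple > LimitGb}");       // True
Console.WriteLine($"Hierarchical key over limit: {maxHierarchical > LimitGb}"); // False
```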

Query efficiency: left-to-right

Hierarchical keys enable prefix queries: you can query from left to right.

-- ✅ Uses first 2 levels: efficient, scoped
SELECT * FROM c
WHERE c.tenantId = 'tenant-abc'
  AND c.projectId = 'proj-001'

-- ✅ Uses first level only: still targeted (all of tenant's data)
SELECT * FROM c WHERE c.tenantId = 'tenant-abc'

-- ❌ Skips level 1: cross-partition fan-out
SELECT * FROM c WHERE c.projectId = 'proj-001'

💡 Exam tip: query left-to-right

With hierarchical partition keys, queries must specify levels from left to right without skipping. You can use level 1 alone, levels 1+2, or levels 1+2+3. But you cannot skip level 1 and query only level 2; that’s a cross-partition query. Think of it like a phone book: you can look up by country → city → name, but not city alone without a country.
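
The left-to-right rule can be expressed as a small check: count how many leading key levels a query's filter covers. The sketch below is a teaching aid under stated assumptions (`UsablePrefixLength` is hypothetical and only looks at which properties are filtered with equality), not a description of the query engine.

```csharp
using System;
using System.Collections.Generic;

string[] keyPaths = { "tenantId", "projectId", "id" }; // hierarchy order from the container definition

// Counts how many leading levels the filter covers.
// 0 means there is no usable prefix, so the query fans out across partitions.
int UsablePrefixLength(ISet<string> filteredProperties)
{
    int length = 0;
    while (length < keyPaths.Length && filteredProperties.Contains(keyPaths[length]))
    {
        length++;
    }
    return length;
}

Console.WriteLine(UsablePrefixLength(new HashSet<string> { "tenantId", "projectId" })); // 2
Console.WriteLine(UsablePrefixLength(new HashSet<string> { "tenantId" }));              // 1
Console.WriteLine(UsablePrefixLength(new HashSet<string> { "projectId" }));             // 0
```

A filter on tenantId and id that skips projectId yields 1: only the first level counts as a prefix, because the chain breaks at the missing level.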

Synthetic vs hierarchical keys

Aspect | Synthetic keys | Hierarchical partition keys
Where defined | In your application code (a computed property) | In the container definition (Cosmos DB feature)
Max levels | Unlimited (it's just string concatenation) | Up to 3 levels
20 GB limit | Still applies to the synthetic key value | Effectively broken: each level combination is a separate logical partition
Query support | You manage the key format in queries | Native prefix queries (left-to-right)
Write logic | You must compute the key before every write | Cosmos DB combines the levels automatically
Fan-out on reads | Random suffix requires reading all suffix values | Prefix queries are natively efficient
Best for | Simple distribution improvements, write-heavy append | Multi-level hierarchical data (tenant → project → item)
Retroactive | The property can be added to items at any time, but it only partitions data if the container's partition key path points at it | Must be set at container creation

Best practice: Make the last level unique (e.g., /id). This ensures every item has its own logical partition, giving maximum distribution while still allowing efficient prefix queries at higher levels.

💡 Exam tip: hierarchical key creation

Hierarchical partition keys must be defined at container creation time; you cannot add or change levels later. The paths must be in the document (they’re not computed). If you need to change the hierarchy, you must create a new container and migrate data.

Flashcards

Question

What are the three synthetic partition key techniques?

Answer

1) Concatenation: combine fields (e.g., 'tenant-abc_task'). 2) Random suffix: append a random number for write distribution (e.g., 'tenant-abc_7'). 3) Hash: deterministic distribution using a hash function. All are computed in your application code before writing.

Question

How many levels can a hierarchical partition key have?

Answer

Up to 3 levels (e.g., /tenantId, /projectId, /id). Each unique combination of all levels forms a separate logical partition, effectively breaking the 20 GB limit for any single top-level value.

Question

What is the query rule for hierarchical partition keys?

Answer

Queries must specify levels from left to right without skipping. You can filter on level 1 alone, levels 1+2, or levels 1+2+3. Skipping a level (e.g., querying only level 2) results in a cross-partition fan-out.

Question

What is the best practice for the last level in a hierarchical partition key?

Answer

Make it unique, typically /id. This ensures every item is its own logical partition, giving maximum distribution. Higher levels enable efficient prefix queries for group-level access patterns.

Knowledge check

Priya's enterprise tenant 'BigCorp' has 25 GB of data. She used a simple /tenantId partition key. What problem will she hit?

Ravi uses a random suffix (0-9) synthetic key for write-heavy IoT data. What's the trade-off?

Priya defines hierarchical keys as /tenantId, /projectId, /id. A query filters only on projectId (skipping tenantId). What happens?

Next up: Relationships: Embedding vs Referencing. Learn when to nest data inside a document and when to keep separate documents with references.

© 2026 Sutheesh. All rights reserved.

Guided is an independent study resource and is not affiliated with, endorsed by, or officially connected to Microsoft. Microsoft, Azure, and related trademarks are property of Microsoft Corporation. Always verify information against Microsoft Learn.