πŸ”’ Guided

Pre-launch preview. Authorised access only.

Incorrect code

Guided by A Guide to Cloud
Explore AB-900 AI-901 aws-aif-c01
Guided DP-420 Domain 1
Domain 1 β€” Module 8 of 11 73%
8 of 28 overall

DP-420 Study Guide

Domain 1: Design and Implement Data Models

  • Cosmos DB β€” The Big Picture Free
  • Designing Your Data Model Free
  • Partition Key Strategy Free
  • Synthetic and Hierarchical Partition Keys Free
  • Relationships β€” Embedding vs Referencing Free
  • SDK Connectivity and Client Configuration Free
  • SDK CRUD Operations and Transactions Free
  • SQL Queries in Cosmos DB Free
  • SDK Query Pagination and LINQ Free
  • Server-Side Programming Free
  • Transactions in Practice Free

Domain 2: Design and Implement Data Distribution

  • Global Replication and Failover
  • Consistency Levels: Five Choices, Real Trade-Offs
  • Multi-Region Writes and Conflict Resolution

Domain 3: Integrate and Move Data

  • Change Feed with Azure Functions and Processors
  • Analytical Workloads: Synapse Link and Fabric Mirroring
  • Data Movement: ADF, Kafka, and Spark Connectors

Domain 4: Optimize Query and Operation Performance

  • Indexing Policies: Range, Spatial, and Composite
  • Request Units and Query Cost Optimization
  • Integrated Cache and Dedicated Gateway
  • Change Feed Patterns: Materialized Views and Estimator

Domain 5: Maintain an Azure Cosmos DB Solution

  • Monitoring: Metrics, Logs, and Alerts
  • Backup and Restore: Periodic vs Continuous
  • Network Security: Firewalls, VNets, and Private Endpoints
  • Data Security: Encryption, Keys, and RBAC
  • Cost Optimization: Throughput Modes and RU Strategy
  • DevOps: Infrastructure as Code and Deployments
  • Exam Strategy and Cross-Domain Review

DP-420 Study Guide

Domain 1: Design and Implement Data Models

  • Cosmos DB β€” The Big Picture Free
  • Designing Your Data Model Free
  • Partition Key Strategy Free
  • Synthetic and Hierarchical Partition Keys Free
  • Relationships β€” Embedding vs Referencing Free
  • SDK Connectivity and Client Configuration Free
  • SDK CRUD Operations and Transactions Free
  • SQL Queries in Cosmos DB Free
  • SDK Query Pagination and LINQ Free
  • Server-Side Programming Free
  • Transactions in Practice Free

Domain 2: Design and Implement Data Distribution

  • Global Replication and Failover
  • Consistency Levels: Five Choices, Real Trade-Offs
  • Multi-Region Writes and Conflict Resolution

Domain 3: Integrate and Move Data

  • Change Feed with Azure Functions and Processors
  • Analytical Workloads: Synapse Link and Fabric Mirroring
  • Data Movement: ADF, Kafka, and Spark Connectors

Domain 4: Optimize Query and Operation Performance

  • Indexing Policies: Range, Spatial, and Composite
  • Request Units and Query Cost Optimization
  • Integrated Cache and Dedicated Gateway
  • Change Feed Patterns: Materialized Views and Estimator

Domain 5: Maintain an Azure Cosmos DB Solution

  • Monitoring: Metrics, Logs, and Alerts
  • Backup and Restore: Periodic vs Continuous
  • Network Security: Firewalls, VNets, and Private Endpoints
  • Data Security: Encryption, Keys, and RBAC
  • Cost Optimization: Throughput Modes and RU Strategy
  • DevOps: Infrastructure as Code and Deployments
  • Exam Strategy and Cross-Domain Review
Domain 1: Design and Implement Data Models Free ⏱ ~16 min read

SQL Queries in Cosmos DB

Master the Cosmos DB SQL query language β€” SELECT, WHERE, ORDER BY, aggregations, intra-document JOIN, single vs cross-partition queries, parameterised queries, query metrics, and composite indexes.

SQL in Cosmos DB

β˜• Simple explanation

It looks like SQL, but it’s not the SQL you know. Cosmos DB’s query language borrows familiar syntax β€” SELECT, WHERE, ORDER BY β€” but works on JSON documents instead of table rows. The biggest surprise? JOIN doesn’t join two tables. It joins a document with its own nested arrays.

Think of it as β€œJSON-flavoured SQL” β€” powerful for querying documents, but different enough from SQL Server that Sophie needs to rewire her brain.

Cosmos DB provides a SQL-like query language for the NoSQL API with key differences from traditional SQL:

  • Schema-free: No fixed columns. You query JSON paths β€” properties may or may not exist on every document.
  • Intra-document JOIN: JOIN operates within a single document (self-join on nested arrays), NOT across documents.
  • No GROUP BY without aggregate: GROUP BY requires an aggregate function (COUNT, SUM, AVG, MIN, MAX).
  • Partition-aware: Queries that include the partition key in WHERE hit one partition (efficient). Those that don’t fan out across all partitions (expensive).

Basic queries

SELECT and WHERE

-- Get all active projects for a tenant
SELECT c.id, c.name, c.status
FROM c
WHERE c.tenantId = 'tenant-abc'
  AND c.type = 'project'
  AND c.status = 'active'

The c alias refers to items in the container. You can also use * to return the full document:

SELECT * FROM c WHERE c.tenantId = 'tenant-abc' AND c.type = 'task'

ORDER BY

-- Tasks ordered by creation date (newest first)
SELECT c.title, c.createdAt
FROM c
WHERE c.tenantId = 'tenant-abc' AND c.type = 'task'
ORDER BY c.createdAt DESC

Important: ORDER BY on a single property uses the default range index. Multi-field ORDER BY requires a composite index (covered below).

Aggregations

-- Count tasks per status
SELECT c.status, COUNT(1) AS taskCount
FROM c
WHERE c.tenantId = 'tenant-abc' AND c.type = 'task'
GROUP BY c.status

-- Average estimated hours
SELECT AVG(c.estimatedHours) AS avgHours
FROM c
WHERE c.tenantId = 'tenant-abc' AND c.type = 'task'

Available aggregations: COUNT, SUM, AVG, MIN, MAX.

Intra-document JOIN

Critical concept: JOIN in Cosmos DB is NOT a cross-document join like SQL Server. It’s a self-join that flattens nested arrays within a single document.

{
  "id": "proj-001",
  "tenantId": "tenant-abc",
  "name": "Website Redesign",
  "tags": ["frontend", "priority", "q1"],
  "milestones": [
    { "name": "Design", "dueDate": "2025-03-01" },
    { "name": "Build", "dueDate": "2025-06-01" }
  ]
}
-- Flatten milestones array
SELECT p.name AS project, m.name AS milestone, m.dueDate
FROM p
JOIN m IN p.milestones
WHERE p.tenantId = 'tenant-abc' AND p.type = 'project'

Result:

projectmilestonedueDate
Website RedesignDesign2025-03-01
Website RedesignBuild2025-06-01

Sophie’s confusion: Sophie tried SELECT * FROM projects p JOIN tasks t ON p.id = t.projectId β€” this doesn’t work. There’s no cross-document JOIN. To get a project and its tasks (stored as separate documents), you run two queries.

Single-partition vs cross-partition queries

AspectSingle-partition queryCross-partition query
FilterIncludes the partition key in WHEREDoesn't include the partition key
RU costLow β€” hits one partition onlyHigh β€” fans out to all physical partitions
LatencyFast β€” single network hopSlow β€” parallel fan-out, waits for slowest partition
ExampleWHERE c.tenantId = 'abc' AND c.type = 'task'WHERE c.type = 'task' AND c.status = 'overdue'
When to useAlways when possible β€” design for thisRarely β€” analytics, reports, admin queries
SDK headerx-ms-documentdb-query-enablecrosspartition: falsex-ms-documentdb-query-enablecrosspartition: true
// Single-partition query (fast, cheap)
var query = new QueryDefinition(
    "SELECT * FROM c WHERE c.tenantId = @tenant AND c.type = 'task'")
    .WithParameter("@tenant", "tenant-abc");

// The SDK automatically routes to the correct partition
using FeedIterator<TaskItem> feed = container.GetItemQueryIterator<TaskItem>(
    query,
    requestOptions: new QueryRequestOptions
    {
        PartitionKey = new PartitionKey("tenant-abc")  // explicit routing
    });

Query metrics

Enable query metrics to understand performance:

var options = new QueryRequestOptions
{
    PopulateIndexMetrics = true
};

using FeedIterator<TaskItem> feed = container.GetItemQueryIterator<TaskItem>(query, requestOptions: options);

while (feed.HasMoreResults)
{
    FeedResponse<TaskItem> response = await feed.ReadNextAsync();
    Console.WriteLine($"RU charge: {response.RequestCharge}");
    Console.WriteLine($"Index metrics: {response.IndexMetrics}");
    // Shows which indexes were used and suggests composite indexes
}

Parameterised queries

Always use parameters β€” never concatenate user input into queries:

// βœ… Parameterised β€” safe from injection, cached execution plan
var query = new QueryDefinition(
    "SELECT * FROM c WHERE c.tenantId = @tenant AND c.status = @status")
    .WithParameter("@tenant", tenantId)
    .WithParameter("@status", "active");

// ❌ String concatenation β€” injection risk, no plan caching
var unsafeQuery = $"SELECT * FROM c WHERE c.tenantId = '{tenantId}'";

Ravi’s mistake: Ravi concatenated user input directly into queries. A tenant named tenant' OR '1'='1 returned all tenants’ data. Parameterised queries prevent this.

πŸ’‘ Exam tip: JOIN is intra-document only

This is one of the most tested concepts. Cosmos DB JOIN flattens nested arrays within a single document. It does NOT join across documents or containers like SQL Server’s JOIN. If a question asks how to β€œjoin” two separate document types, the answer involves two separate queries or embedding the data.

Composite indexes

By default, Cosmos DB creates range indexes on each property individually. For multi-field ORDER BY, you need a composite index:

{
  "indexingPolicy": {
    "compositeIndexes": [
      [
        { "path": "/tenantId", "order": "ascending" },
        { "path": "/createdAt", "order": "descending" }
      ],
      [
        { "path": "/status", "order": "ascending" },
        { "path": "/priority", "order": "descending" }
      ]
    ]
  }
}

When you need a composite index:

  • ORDER BY c.status ASC, c.priority DESC β†’ needs composite index on [status ASC, priority DESC]
  • ORDER BY c.createdAt DESC β†’ single field, default index works
  • Filters with ORDER BY on a different property β†’ composite improves performance
πŸ’‘ Exam tip: ORDER BY multiple fields

A query with ORDER BY c.field1, c.field2 requires a composite index that matches the exact fields and sort directions. Without it, the query fails with an error β€” it doesn’t silently fall back to a scan. The exam tests this: if you see multi-field ORDER BY, check for a composite index.

🎬 Video walkthrough

🎬 Video coming soon

SQL Queries β€” DP-420 Module 8

SQL Queries β€” DP-420 Module 8

~16 min

Flashcards

Question

What does JOIN do in Cosmos DB SQL?

Click or press Enter to reveal answer

Answer

It's an intra-document self-join that flattens nested arrays within a single document. It does NOT join across documents or containers. Example: JOIN m IN c.milestones flattens each milestone into a separate row.

Click to flip back

Question

What happens if you use ORDER BY on two fields without a composite index?

Click or press Enter to reveal answer

Answer

The query fails with an error. Multi-field ORDER BY requires a composite index that matches the exact fields and sort directions. A single-field range index is not sufficient.

Click to flip back

Question

Why should you use parameterised queries instead of string concatenation?

Click or press Enter to reveal answer

Answer

Two reasons: (1) Security β€” prevents SQL injection attacks. (2) Performance β€” parameterised queries reuse cached execution plans, while concatenated strings create a new plan each time.

Click to flip back

Question

What makes a cross-partition query expensive?

Click or press Enter to reveal answer

Answer

It fans out to ALL physical partitions in parallel, each consuming RU/s independently. The total cost is the sum of all partition costs. Latency equals the slowest partition's response. It's typically 5-10Γ— more expensive than a single-partition query.

Click to flip back

Knowledge check

Knowledge Check

Sophie writes: SELECT * FROM projects p JOIN tasks t ON p.id = t.projectId. What happens?

Knowledge Check

Priya's top query is: SELECT * FROM c WHERE c.tenantId = 'abc' AND c.type = 'task' ORDER BY c.createdAt DESC. What index does she need?

Knowledge Check

A query runs without a partition key filter on a container with 8 physical partitions. Each partition charges 5 RU. What's the total cost?


Next up: SDK Query Pagination β€” learn how to use LINQ, FeedIterator, continuation tokens, and MaxItemCount for efficient query pagination.

← Previous

SDK CRUD Operations and Transactions

Next β†’

SDK Query Pagination and LINQ

Guided

I learn, I simplify, I share.

A Guide to Cloud YouTube Feedback

© 2026 Sutheesh. All rights reserved.

Guided is an independent study resource and is not affiliated with, endorsed by, or officially connected to Microsoft. Microsoft, Azure, and related trademarks are property of Microsoft Corporation. Always verify information against Microsoft Learn.