Cosmos DB & Semi-Structured Data

Why Cosmos DB design matters

Simple explanation

Cosmos DB is like a global postal service that guarantees delivery speed — but you choose how “fresh” the letter needs to be.

It’s Azure’s globally distributed NoSQL database. Three design decisions dominate: which API to use (NoSQL, MongoDB, Cassandra, Gremlin, or Table), which consistency level to run at (from strong to eventual, a tradeoff between correctness and speed), and how to partition (the partition key determines performance and cost).

Cosmos DB consistency models

This is one of the most exam-tested topics in AZ-305. The five consistency levels form a spectrum:

Cosmos DB Consistency Levels
Level	Guarantee	Latency	Throughput	Best For
Strong	Reads always return most recent committed write	Highest (cross-region round-trip)	Lowest	Financial transactions, inventory counts
Bounded Staleness	Reads lag behind writes by at most K versions or T time	High	Medium-low	Global apps needing near-strong consistency with better perf
Session (default)	Reads are consistent within a session (your own writes)	Medium	Medium	Most applications — user sees their own updates immediately
Consistent Prefix	Reads never see out-of-order writes	Low	Medium-high	Social feeds, activity logs (order matters, staleness is OK)
Eventual	Reads may return older data, eventually converges	Lowest	Highest	View counters, likes, non-critical telemetry

🏦 Elena’s consistency decision: FinSecure’s trading platform needs Strong consistency for account balance reads — a trade must see the latest balance. But their customer activity feed uses Session consistency — each user sees their own activity immediately, but seeing other users’ activity with a slight delay is acceptable.

Exam tip: Session consistency is the default and most common answer

If the exam scenario doesn’t mention a specific consistency requirement, Session is almost always correct. It provides the “read your own writes” guarantee that most applications need, with good performance. Choose Strong only when the scenario explicitly mentions “must always read the latest data across all users/regions” — and be ready for the performance/cost tradeoff.
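The selection logic in the table and the exam tip can be sketched as a small decision helper. This is only an illustration of the reasoning, not part of any Cosmos DB SDK: the function name and its boolean flags are hypothetical, and real designs usually weigh latency and cost alongside these requirements.

```python
from enum import Enum


class Consistency(Enum):
    STRONG = "Strong"
    BOUNDED_STALENESS = "Bounded Staleness"
    SESSION = "Session"
    CONSISTENT_PREFIX = "Consistent Prefix"
    EVENTUAL = "Eventual"


def recommend_consistency(
    must_read_latest_everywhere: bool = False,
    needs_bounded_lag: bool = False,
    needs_read_own_writes: bool = True,
    needs_ordered_reads: bool = False,
) -> Consistency:
    """Return the weakest level that still satisfies the stated requirements.

    The flags are hypothetical simplifications of a scenario description.
    Checks run from the strongest requirement to the weakest.
    """
    if must_read_latest_everywhere:
        return Consistency.STRONG
    if needs_bounded_lag:
        return Consistency.BOUNDED_STALENESS
    if needs_read_own_writes:
        return Consistency.SESSION
    if needs_ordered_reads:
        return Consistency.CONSISTENT_PREFIX
    return Consistency.EVENTUAL
```

With no stated requirement the helper falls back to Session, mirroring the exam tip; passing `must_read_latest_everywhere=True` yields Strong.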

API selection

Cosmos DB API Options
API	Data Model	Query Language	Best For
NoSQL	JSON documents	SQL-like queries	New cloud-native apps, most common choice
MongoDB	BSON documents	MongoDB query language	Migrating existing MongoDB applications
Cassandra	Wide-column (tables)	CQL (Cassandra Query Language)	Migrating Cassandra workloads, high-write IoT
Gremlin	Graph (vertices + edges)	Gremlin traversal language	Social networks, recommendation engines, fraud detection
Table	Key-value pairs	OData queries	Migrating Azure Table Storage (better perf + global distribution)

Design rule: Choose the API based on existing application investment, not personal preference. If the app is already built on MongoDB, use the MongoDB API; don’t rewrite it for the NoSQL API. For new apps, the NoSQL API is the recommended default.

Partition key design

The partition key is the most important Cosmos DB design decision. A bad partition key causes hot partitions, poor query performance, and wasted RUs.

Principle	Good Example	Bad Example
High cardinality	`userId` (millions of unique values)	`country` (only ~200 values → hot partitions)
Even distribution	`tenantId` (uniform data per tenant)	`createdDate` (all today’s data in one partition)
Query alignment	Key used in most WHERE clauses	Key rarely used in queries (forces cross-partition queries)

🚀 Marcus’s partition strategy: NovaSaaS uses tenantId as the partition key for most containers — queries are always scoped to a tenant, data is evenly distributed, and cross-partition queries are rare.
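One way to sanity-check a candidate partition key before committing to it is to measure cardinality and skew on a sample of documents. The sketch below does that in plain Python. The helper name, the sample data, and the field values are all hypothetical, and a real assessment would use a representative export of production data.

```python
from collections import Counter
from typing import Iterable, Mapping


def partition_key_stats(docs: Iterable[Mapping[str, object]], key: str) -> dict:
    """Summarise how a candidate partition key spreads a sample of documents."""
    counts = Counter(doc[key] for doc in docs)
    if not counts:
        raise ValueError("sample is empty")
    total = sum(counts.values())
    return {
        # More distinct values means more logical partitions to spread load over.
        "distinct_values": len(counts),
        # Share of documents in the busiest value; near 1.0 means a hot partition.
        "largest_share": max(counts.values()) / total,
    }


# Hypothetical sample: 1,000 documents spread over 100 tenants and 2 countries.
sample = [
    {"tenantId": f"tenant-{i % 100}", "country": "US" if i % 10 else "NZ"}
    for i in range(1000)
]

tenant_stats = partition_key_stats(sample, "tenantId")
country_stats = partition_key_stats(sample, "country")
```

In this made-up sample, `tenantId` yields 100 distinct values with the largest holding 1% of documents, while `country` yields 2 values with one holding 90%, which is the hot-partition pattern the table warns about.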

Design decision: Hierarchical partition keys

For very large datasets, Cosmos DB supports hierarchical partition keys (up to 3 levels). Example: /tenantId, then /userId, then /sessionId. This allows:

Queries filtered by tenantId → scoped to that tenant’s partitions
Sub-filtering by userId → further narrowed
Even data distribution across the hierarchy

Use hierarchical keys when a single-level partition key would create logical partitions that exceed the 20 GB limit.
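A rough back-of-the-envelope check shows when the 20 GB limit becomes a concern. The numbers below (users per tenant, sessions retained per user, document size) are hypothetical, and the sketch assumes that with a hierarchical key the size limit applies to each full key path rather than to the first level alone.

```python
LOGICAL_PARTITION_LIMIT_GB = 20  # limit quoted in the text above


def logical_partition_gb(docs_per_partition: int, avg_doc_kb: float) -> float:
    """Approximate size of one logical partition in GB (1 GB = 1,000,000 KB)."""
    return docs_per_partition * avg_doc_kb / 1_000_000


users_per_tenant = 10_000   # hypothetical large tenant
sessions_per_user = 1_000   # hypothetical number of retained session documents
avg_doc_kb = 5              # hypothetical average document size

# Key = tenantId only: every session of every user shares one logical partition.
tenant_only_gb = logical_partition_gb(users_per_tenant * sessions_per_user, avg_doc_kb)

# Key = tenantId + userId: one logical partition per user within the tenant.
tenant_user_gb = logical_partition_gb(sessions_per_user, avg_doc_kb)

exceeds_limit = tenant_only_gb > LOGICAL_PARTITION_LIMIT_GB
```

Under these assumptions a tenant-only key reaches about 50 GB for the large tenant, well past the limit, while adding `userId` as a second level keeps each partition at roughly 0.005 GB.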

Throughput models

Cosmos DB Throughput Options
Factor	Provisioned (Manual)	Provisioned (Autoscale)	Serverless
RU allocation	Fixed RU/s you set	Scales between 10% and max RU/s	No pre-allocation — pay per request
Billing	Per-hour for allocated RU/s	Per-hour for the highest RU/s the system scaled to within that hour	Per-RU consumed
Minimum cost	400 RU/s	10% of max (e.g., 400 RU/s if max is 4,000 RU/s)	No request charges when idle (storage is still billed)
Best for	Predictable, steady workloads	Variable workloads with known peaks	Dev/test, infrequent access, spiky traffic
Max throughput	Unlimited (manual scaling)	Unlimited (set max)	5,000 RU/s per container, 1 TB storage per container
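The billing rows above can be made concrete with a small calculation of billed RU/s-hours for one day. This is a sketch under stated assumptions: the function names and the hourly traffic profile are hypothetical, and the result is in billed throughput units only, so current per-unit rates for each mode would still need to be applied before comparing cost.

```python
def manual_billed_rus(provisioned_rus: int, hours: int) -> int:
    """Manual throughput bills the fixed RU/s for every hour, used or not."""
    return provisioned_rus * hours


def autoscale_billed_rus(hourly_peak_rus: list[int], max_rus: int) -> int:
    """Autoscale bills each hour at the highest RU/s reached in that hour,
    never below 10% of the configured max and never above the max."""
    floor = max_rus // 10
    return sum(min(max(peak, floor), max_rus) for peak in hourly_peak_rus)


# Hypothetical day: idle overnight, moderate daytime load, a four-hour spike.
peaks = [0] * 8 + [1_000] * 6 + [4_000] * 4 + [1_000] * 6

manual_total = manual_billed_rus(4_000, hours=len(peaks))      # sized for the spike
autoscale_total = autoscale_billed_rus(peaks, max_rus=4_000)
```

For this profile, manual provisioning sized for the peak bills 96,000 RU/s-hours, while autoscale bills 31,200, because the idle hours drop to the 400 RU/s floor. A steadier profile narrows or reverses that gap once rates are applied.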

Azure Table Storage vs Cosmos DB Table API

Factor	Azure Table Storage	Cosmos DB Table API
Performance	Variable latency	Single-digit ms guaranteed
Global distribution	Single region (GRS for DR only)	Multi-region active-active
Throughput	Per-partition limits	Provisioned or serverless RU/s
Secondary indexes	Primary key only	Automatic indexing on all properties
Cost	Very low	Higher (premium performance)
Best for	Simple key-value, cost-sensitive, low traffic	High-performance global key-value

Knowledge check

Question

What are the five Cosmos DB consistency levels from strongest to weakest?

Answer

Strong → Bounded Staleness → Session (default) → Consistent Prefix → Eventual. Each step down increases throughput and reduces latency but weakens the consistency guarantee.

Question

What makes a good Cosmos DB partition key?

Answer

Three qualities: (1) High cardinality — many unique values to distribute data evenly, (2) Even distribution — no 'hot' partitions, (3) Query alignment — used in most WHERE clauses to avoid expensive cross-partition queries. Example: userId or tenantId.

Question

When should you choose autoscale RU/s over manual provisioned throughput in Cosmos DB?

Answer

Autoscale when traffic is unpredictable or spiky — it scales between 10% and 100% of the max RU/s you set, and you pay for peak usage per hour. Manual provisioned when traffic is steady and predictable — lower cost but risks throttling during spikes. For dev/test, use serverless (pay per request, no minimum).

Knowledge Check

🚀 NovaSaaS is designing a Cosmos DB container for user session data. Each tenant has 100-10,000 users. Most queries filter by tenant first, then by user. Session data per user is typically 2-5 KB. Which partition key should Marcus recommend?

🏦 Elena needs Cosmos DB for a global trading platform. Account balance reads must always return the latest committed write — even across regions. Which consistency level should she recommend?

Next up: Semi-structured data is designed. Now let’s handle unstructured data: Blob, Data Lake & Azure Files.