Partition Key Strategy
Master the most important Cosmos DB design decision: choosing the right partition key. Understand physical vs logical partitions, the three rules of a good key, and common mistakes that cause hot partitions.
Why the partition key is everything
Imagine a library with millions of books. The partition key is how you organise the shelves. If you sort by "first letter of the author's last name," shelf "S" gets Shakespeare, Spielberg, and 10,000 other authors, and it overflows. But if you sort by a full author ID, each shelf has a manageable number of books.
A bad partition key creates one overloaded shelf (a hot partition). A good key spreads books evenly so every shelf does its fair share of work.
Physical vs logical partitions
| Concept | Logical partition | Physical partition |
|---|---|---|
| Definition | All items sharing the same partition key value | A physical storage unit managed by Cosmos DB |
| Size limit | 20 GB per logical partition | ~50 GB per physical partition |
| Throughput limit | N/A (shares the physical partition's RU budget) | 10,000 RU/s per physical partition |
| Who manages it | You (via partition key choice) | Cosmos DB (automatic splits) |
| Contains | Items with identical PK value | One or more logical partitions |
Physical partition 1 (50 GB max, 10K RU/s)
├── Logical partition: tenantId = "abc" (15 GB)
└── Logical partition: tenantId = "def" (8 GB)
Physical partition 2 (50 GB max, 10K RU/s)
├── Logical partition: tenantId = "ghi" (12 GB)
└── Logical partition: tenantId = "jkl" (3 GB)
Key insight: Cosmos DB automatically splits physical partitions as data grows. But a single logical partition can never be split: all items with the same PK value must stay together on one physical partition.
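Because a logical partition cannot be split, it can help to track how close each one is to the 20 GB limit. The following is a minimal sketch in plain Python, using the sizes from the diagram above. The function name `headroom_report` and the 75% warning threshold are assumptions chosen for illustration and are not part of any Cosmos DB API.

```python
LOGICAL_PARTITION_LIMIT_GB = 20  # per-logical-partition limit from the table above


def headroom_report(sizes_gb, warn_ratio=0.75):
    """Map each partition key value to (headroom_gb, needs_attention).

    warn_ratio is an illustrative threshold, not a Cosmos DB setting.
    """
    report = {}
    for pk_value, size_gb in sizes_gb.items():
        headroom = LOGICAL_PARTITION_LIMIT_GB - size_gb
        needs_attention = size_gb >= warn_ratio * LOGICAL_PARTITION_LIMIT_GB
        report[pk_value] = (headroom, needs_attention)
    return report


# Sizes (GB) taken from the diagram above
sizes = {"abc": 15, "def": 8, "ghi": 12, "jkl": 3}
report = headroom_report(sizes)
for pk_value, (headroom, needs_attention) in report.items():
    print(pk_value, headroom, needs_attention)
```

With these numbers only tenant "abc" is flagged: it has 5 GB of headroom left before the logical partition is full.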
The three rules of a good partition key
Rule 1: High cardinality
The key should have many distinct values, ideally as many as there are items, or close to it.
❌ /country: ~200 values for millions of documents
❌ /status: 3-5 values (active, inactive, archived)
✅ /tenantId: thousands of distinct tenants
✅ /userId: one per user
✅ /orderId: one per order (highest cardinality)
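One rough way to compare candidates on Rule 1 is to count distinct values in a sample of your data. This is a sketch only: `cardinality_ratio` and the sample documents are hypothetical, and a real check would run against an export or a representative sample of the container.

```python
def cardinality_ratio(items, field):
    """Distinct values of `field` divided by item count (1.0 = unique per item)."""
    if not items:
        return 0.0
    distinct = {item[field] for item in items}
    return len(distinct) / len(items)


# Hypothetical sample: 1,000 orders, 50 users, 3 statuses
statuses = ["active", "inactive", "archived"]
sample = [
    {"orderId": f"o{i}", "userId": f"u{i % 50}", "status": statuses[i % 3]}
    for i in range(1000)
]

for field in ("orderId", "userId", "status"):
    print(field, cardinality_ratio(sample, field))
```

A ratio near 1.0 (`orderId`) means almost every item gets its own logical partition, while a ratio near 0 (`status`) means a handful of logical partitions hold everything.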
Rule 2: Even distribution
Traffic and storage should spread evenly across partition key values. No single value should dominate.
❌ /companyId in a B2B app where one company has 80% of data
→ That one logical partition becomes a hot partition
✅ /userId in a consumer app with millions of balanced users
→ Each user's data is roughly the same size
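High cardinality alone does not guarantee an even spread, so it is worth measuring how much of the data the single largest key value holds. The sketch below assumes a hypothetical sample matching the B2B example (one company with 80% of items); `largest_share` is an illustrative helper, not an SDK function.

```python
from collections import Counter


def largest_share(items, field):
    """Return (value, fraction) for the most common value of `field`."""
    if not items:
        return None, 0.0
    counts = Counter(item[field] for item in items)
    value, count = counts.most_common(1)[0]
    return value, count / len(items)


# Hypothetical sample: one company owns 80 of 100 items
sample_items = [{"companyId": "big"} for _ in range(80)]
sample_items += [{"companyId": f"small-{i}"} for i in range(20)]

value, share = largest_share(sample_items, "companyId")
print(value, share)
```

Here there are 21 distinct values, yet "big" alone accounts for 0.8 of the items, which is the hot-partition pattern Rule 2 warns about.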
Rule 3: Query alignment
Your most frequent queries should include the partition key in the WHERE clause.
-- ✅ Single-partition query (fast: routed to one partition; a point read by id + partition key is cheaper still, ~1 RU for a 1 KB item)
SELECT * FROM c WHERE c.tenantId = 'abc' AND c.type = 'task'
-- ❌ Cross-partition query (fan-out to every physical partition; can cost 10× more RU)
SELECT * FROM c WHERE c.type = 'task' AND c.status = 'overdue'
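To make the difference between the two queries concrete, here is a toy in-memory model. It is not the Cosmos DB SDK: `TinyContainer` is a hypothetical stand-in that groups items by partition key value and counts how many groups a query has to scan. In the real service the fan-out is across physical partitions, so treat the counts as an illustration of the routing idea only.

```python
from collections import defaultdict


class TinyContainer:
    """Toy in-memory stand-in for a partitioned container (not the real SDK)."""

    def __init__(self, partition_key_field):
        self.pk_field = partition_key_field
        self.partitions = defaultdict(list)

    def add(self, item):
        self.partitions[item[self.pk_field]].append(item)

    def query(self, predicate, partition_key=None):
        """Return (matching_items, partitions_scanned)."""
        if partition_key is not None:
            # Scoped query: only the named partition is visited
            targets = [partition_key] if partition_key in self.partitions else []
        else:
            # No partition key supplied: every partition must be visited
            targets = list(self.partitions)
        matches = [
            item for pk in targets for item in self.partitions[pk] if predicate(item)
        ]
        return matches, len(targets)


container = TinyContainer("tenantId")
for tenant in ["abc", "def", "ghi", "jkl"]:
    container.add({"tenantId": tenant, "type": "task", "status": "overdue"})
    container.add({"tenantId": tenant, "type": "note", "status": "active"})

scoped, scanned_scoped = container.query(
    lambda i: i["type"] == "task", partition_key="abc"
)
fanned, scanned_all = container.query(
    lambda i: i["type"] == "task" and i["status"] == "overdue"
)
print(scanned_scoped, scanned_all)  # prints "1 4"
```

The scoped query touches one partition regardless of how many tenants exist, while the unscoped query touches all of them, so its cost grows with the container.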
Priya's scenario: choosing the right key
Priya evaluates partition key candidates for her workitems container:
| Candidate | Cardinality | Distribution | Query alignment | Verdict |
|---|---|---|---|---|
| /tenantId | Medium (1,000 tenants) | ⚠️ One enterprise tenant has 60% of data | ✅ Every query filters by tenant | Risky: hot partition |
| /projectId | High (50,000 projects) | ✅ Projects are roughly equal size | ⚠️ Task queries need projectId + tenantId | Possible |
| /id | Perfect (unique per item) | ✅ Perfectly even | ❌ Queries never filter by id alone | Too scattered |
| /tenantId + type | Higher (1,000 × 5 types) | ✅ Better spread | ✅ Most queries filter by both | Better, but synthetic key needed |
Priya decides on /tenantId for now but plans to evaluate hierarchical partition keys (next module) once she confirms the enterprise tenant's data is on track to reach the 20 GB logical partition limit.
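The "/tenantId + type" candidate in the table needs a synthetic key, meaning a property your application computes and stores on each item. A minimal sketch follows, assuming a hypothetical property name `partitionKey` and a `-` separator; the container would then be created with that property as its partition key path.

```python
def with_synthetic_key(item, fields=("tenantId", "type"), separator="-"):
    """Return a copy of `item` with a `partitionKey` property built from `fields`.

    The property name and separator are illustrative choices.
    """
    enriched = dict(item)
    enriched["partitionKey"] = separator.join(str(item[f]) for f in fields)
    return enriched


work_item = {"id": "1", "tenantId": "abc", "type": "task", "title": "Ship report"}
stored = with_synthetic_key(work_item)
print(stored["partitionKey"])  # prints "abc-task"
```

The trade-off is that queries must now supply the combined value (for example "abc-task") to stay within one partition, so readers need to know both parts.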
Read-heavy vs write-heavy strategies
| Strategy | Read-heavy workloads | Write-heavy workloads |
|---|---|---|
| Goal | Minimise cross-partition reads | Distribute writes across many partitions |
| Ideal PK | Matches your WHERE clause filters | High cardinality (e.g., /deviceId, /eventId) |
| Example | /tenantId for 'get all tasks for tenant X' | /deviceId for IoT sensor writes |
| Trade-off | May concentrate writes on popular tenants | Reads that span devices need fan-out |
| Common pattern | Denormalise + type discriminator | Append-only with synthetic/random suffix |
| Risk | Hot partition on popular key values | Cross-partition queries for aggregations |
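The "synthetic/random suffix" pattern for write-heavy workloads can be sketched as follows. The helper names and the bucket count of 10 are assumptions for illustration. Writers pick a random bucket so writes for one base value spread across several logical partitions, and readers must then query every bucket for that base value.

```python
import random


def suffixed_key(base_value, bucket_count=10, rng=random):
    """Build a write-side partition key such as '2025-06-15-7'."""
    return f"{base_value}-{rng.randrange(bucket_count)}"


def all_suffixed_keys(base_value, bucket_count=10):
    """List every key a reader must query to see all data for base_value."""
    return [f"{base_value}-{i}" for i in range(bucket_count)]


rng = random.Random(0)  # seeded only to make the example repeatable
write_key = suffixed_key("2025-06-15", bucket_count=10, rng=rng)
read_keys = all_suffixed_keys("2025-06-15", bucket_count=10)
print(write_key in read_keys, len(read_keys))
```

More buckets spread writes further but multiply the number of partitions each read has to visit, which is the read-side cost noted in the table.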
Common partition key mistakes
| Mistake | Why it's bad | Fix |
|---|---|---|
| Using /date or /timestamp | All today's writes go to one partition (hot write) | Use /deviceId or append a random suffix |
| Using a low-cardinality field like /status | 3-5 partitions total, so it can't scale | Use the entity's natural ID |
| Choosing a key that doesn't appear in queries | Every query becomes cross-partition | Align the key with your WHERE clauses |
| Ignoring the 20 GB logical limit | One large tenant fills the partition | Use hierarchical keys or synthetic keys |
| Assuming you can change it later | The partition key is immutable after container creation | Design carefully upfront; migration = new container + data copy |
Ravi's mistake: Ravi chose /createdDate (a date string like "2025-06-15") as the partition key for the audit log container. On a busy day, all writes hammer the same partition. He got 429 (throttled) errors even though the container had plenty of total RU/s, because one physical partition was maxed out at 10,000 RU/s.
Exam tip: partition key is immutable
You cannot change the partition key after creating a container. If you chose wrong, the only fix is to create a new container with the correct key and migrate data. The exam tests this: know that there's no ALTER CONTAINER SET PARTITION KEY equivalent. Plan carefully or use the emulator to prototype first.
Exam tip: 429 on a single partition
You can get 429 (Too Many Requests) even when total RU/s isn't exhausted, if a single physical partition exceeds its ~10,000 RU/s allocation. This is the classic hot partition symptom. The fix is a better partition key with more even distribution, not more total RU/s.
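The arithmetic behind this tip can be worked through with a small sketch. It relies on provisioned throughput being divided evenly across physical partitions. The numbers below (50,000 RU/s over 5 physical partitions, with one partition receiving most of the demand) are hypothetical, and `throttled_partitions` is an illustrative helper rather than anything the service exposes.

```python
def throttled_partitions(total_ru, demand_by_partition):
    """Return {partition: excess_ru} for partitions over their even share."""
    per_partition_budget = total_ru / len(demand_by_partition)
    return {
        name: demand - per_partition_budget
        for name, demand in demand_by_partition.items()
        if demand > per_partition_budget
    }


# Hypothetical: 50,000 RU/s over 5 physical partitions = 10,000 RU/s each
demand = {"p1": 14_000, "p2": 500, "p3": 500, "p4": 500, "p5": 500}
over = throttled_partitions(50_000, demand)

print(sum(demand.values()))  # total demand: 16000, well under 50,000
print(over)                  # only p1 exceeds its 10,000 RU/s share
```

Total demand is less than a third of what is provisioned, yet p1 is 4,000 RU/s over its share and would see 429s. Raising total RU/s lifts every partition's share equally, so most of the extra throughput would sit unused on the idle partitions.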
Partition Key Strategy: DP-420 Module 3
Knowledge check
Ravi chose /createdDate as the partition key for an audit log. The container has 50,000 RU/s but he's getting 429 errors. What's the most likely cause?
Priya is choosing a partition key for a container storing projects and tasks. Her #1 query is 'get all tasks for tenant X'. Which key best serves this query?
A logical partition for 'tenant-big-corp' has grown to 19.5 GB. What happens if it reaches 20 GB?
Next up: Synthetic & Hierarchical Keys, advanced techniques to break past the 20 GB logical partition limit and distribute data more evenly.