Integrated Cache and Dedicated Gateway
Implement the Cosmos DB integrated cache with a dedicated gateway cluster to reduce RU consumption for read-heavy workloads, including staleness configuration and when to use it.
What is the integrated cache?
Think of it as a fast shelf next to the warehouse loading dock. Instead of walking to the back of the warehouse every time someone asks for the same product, you keep popular items on the shelf near the door. Much faster, and the warehouse workers aren't interrupted.
The integrated cache stores frequently-read items in memory at the gateway so repeated reads don't consume RU/s from the backend.
Amara's read-heavy dashboard
Amara at SensorFlow built a dashboard that shows the latest readings for 10,000 devices. The dashboard auto-refreshes every 30 seconds. That's 10,000 point reads every 30 seconds, or about 333 reads/second. At roughly 1 RU per point read of a small (about 1 KB) item, that costs ~333 RU/s. Most readings haven't changed since the last refresh.
With the integrated cache, repeat reads of unchanged data are served from cache at zero RU cost.
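The arithmetic behind Amara's numbers can be sketched as follows. This is a rough estimate, not a pricing tool: the function name `EstimateBackendRuPerSecond`, the 1 RU per point read figure, and the 75% hit rate are illustrative assumptions.

```csharp
using System;

// Estimates the RU/s that still reach the backend when a fraction of reads
// is served from the cache. Assumption: each miss is a ~1 KB point read (about 1 RU).
static double EstimateBackendRuPerSecond(
    int devices, double refreshSeconds, double cacheHitRate, double ruPerPointRead = 1.0)
{
    double readsPerSecond = devices / refreshSeconds;
    double missesPerSecond = readsPerSecond * (1.0 - cacheHitRate);
    return missesPerSecond * ruPerPointRead;
}

// 10,000 devices refreshed every 30 seconds, with no cache: about 333 RU/s
Console.WriteLine($"No cache: {EstimateBackendRuPerSecond(10_000, 30, 0.0):F0} RU/s");

// The same workload with a hypothetical 75% cache hit rate: about 83 RU/s
Console.WriteLine($"75% hits: {EstimateBackendRuPerSecond(10_000, 30, 0.75):F0} RU/s");
```

The real hit rate depends on how often the data changes and on the staleness each request tolerates, so treat the second figure as an example only.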
How it works
Application
    │
    ▼
Dedicated Gateway (with integrated cache)
    │
    ├── Cache HIT → return from memory (0 RU)
    │
    └── Cache MISS → fetch from backend (normal RU cost)
                   → store in cache for next request
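The hit and miss paths above can be modelled with a small in-memory simulation. This is a minimal sketch of the read-through idea only, not the gateway's actual implementation: the `Read` and `FetchFromBackend` helpers are hypothetical, and each backend fetch is assumed to cost 1 RU.

```csharp
using System;
using System.Collections.Generic;

var cache = new Dictionary<string, string>();
double totalRu = 0;

// Stand-in for the backend: every fetch is charged 1 RU (assumed 1 KB point read).
string FetchFromBackend(string id)
{
    totalRu += 1.0;
    return $"reading-for-{id}";
}

// Read-through lookup: serve hits from memory, fetch and store on a miss.
string Read(string id)
{
    if (cache.TryGetValue(id, out var cached))
        return cached;                     // cache HIT: 0 RU

    string value = FetchFromBackend(id);   // cache MISS: normal RU cost
    cache[id] = value;                     // store in cache for the next request
    return value;
}

Read("device-1"); // miss
Read("device-1"); // hit
Read("device-2"); // miss

Console.WriteLine($"Total RU charged: {totalRu}"); // two misses, so 2 RU
```

The sketch leaves out eviction and staleness, both of which limit how long an entry can keep producing hits.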
Setting up the dedicated gateway
Step 1: Provision a dedicated gateway cluster:
az cosmosdb service create \
    --resource-group rg-sensorflow \
    --account-name sensorflow-cosmos \
    --name "SqlDedicatedGateway" \
    --kind "SqlDedicatedGateway" \
    --count 3 \
    --size "Cosmos.D4s"
Step 2: Connect via gateway mode (required for cache):
using Microsoft.Azure.Cosmos;

// The integrated cache requires Gateway connection mode
CosmosClientOptions options = new CosmosClientOptions
{
    ConnectionMode = ConnectionMode.Gateway
};

// Use the dedicated gateway endpoint, not the standard endpoint.
// It follows this pattern: https://<account-name>.sqlx.cosmos.azure.com
CosmosClient client = new CosmosClient(
    "https://sensorflow-cosmos.sqlx.cosmos.azure.com",
    key, // account key, assumed to be loaded elsewhere
    options
);
Exam tip: gateway mode is mandatory
The integrated cache only works with ConnectionMode.Gateway via the dedicated gateway endpoint (.sqlx.cosmos.azure.com). If you use ConnectionMode.Direct (the default and recommended for production), the cache is bypassed entirely.
This is a key exam trap: direct mode has lower latency but skips the cache. Gateway mode adds a network hop but enables caching. Choose based on your read pattern.
MaxIntegratedCacheStaleness
Control how stale cached data can be:
// Accept cached results that are up to 2 minutes old
FeedIterator<SensorReading> iterator = container.GetItemQueryIterator<SensorReading>(
    query,
    requestOptions: new QueryRequestOptions
    {
        DedicatedGatewayRequestOptions = new DedicatedGatewayRequestOptions
        {
            MaxIntegratedCacheStaleness = TimeSpan.FromMinutes(2)
        }
    }
);
| Staleness Value | Behaviour |
|---|---|
| TimeSpan.Zero | Always fetch fresh data (bypass cache) |
| Small (seconds) | Near-real-time with minimal caching benefit |
| Large (minutes) | Higher cache hit rate, more stale data |
| Not set | Uses the default staleness of 5 minutes |
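One way to picture the staleness check is as an age comparison made per request. The sketch below is a simplified model under stated assumptions, not the service's actual logic: `CanServeFromCache` is a hypothetical helper, and treating an entry exactly at the limit as still usable is a guess.

```csharp
using System;

// Simplified model: a cached entry may be served only if its age does not
// exceed the staleness that this particular request is willing to accept.
static bool CanServeFromCache(DateTimeOffset cachedAt, DateTimeOffset now, TimeSpan maxStaleness)
{
    if (maxStaleness <= TimeSpan.Zero)
        return false; // zero staleness always bypasses the cache

    return now - cachedAt <= maxStaleness;
}

DateTimeOffset cachedAt = new DateTimeOffset(2024, 1, 1, 12, 0, 0, TimeSpan.Zero);
DateTimeOffset now = cachedAt.AddSeconds(90); // the entry is 90 seconds old

Console.WriteLine(CanServeFromCache(cachedAt, now, TimeSpan.FromMinutes(2)));  // True
Console.WriteLine(CanServeFromCache(cachedAt, now, TimeSpan.FromSeconds(30))); // False
Console.WriteLine(CanServeFromCache(cachedAt, now, TimeSpan.Zero));            // False
```

Because staleness is set on each request, the same cached entry can be a hit for a dashboard that tolerates 2 minutes and a miss for a caller that tolerates only 30 seconds.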
When to use the integrated cache
| Scenario | Integrated Cache? | Why |
|---|---|---|
| Read-heavy dashboard with repeated queries | Yes | Same data read repeatedly, so the cache hit rate is high |
| IoT ingestion (write-heavy) | No | Writes don't benefit from read cache |
| Unique queries (low repetition) | No | Low cache hit rate: most reads are cache misses |
| Session-scoped point reads | Yes | Same session data read multiple times per session |
| Serverless accounts | No | Dedicated gateway not supported on serverless |
| Applications using Direct mode | No | Cache requires Gateway mode via dedicated gateway |
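The difference between the repeated-query and unique-query rows can be estimated from a request log. The sketch below is illustrative only: `HitRate` is a hypothetical helper that assumes an unbounded cache with no staleness expiry, and the query keys are made up.

```csharp
using System;
using System.Collections.Generic;

// Fraction of requests whose key was already requested earlier in the sequence.
static double HitRate(IEnumerable<string> requestKeys)
{
    var seen = new HashSet<string>();
    int hits = 0, total = 0;
    foreach (string key in requestKeys)
    {
        total++;
        if (!seen.Add(key))
            hits++; // Add returns false when the key was already present
    }
    return total == 0 ? 0.0 : (double)hits / total;
}

// Dashboard-style workload: the same query repeated (3 of 4 requests are hits)
string[] dashboard = { "latest-readings", "latest-readings", "latest-readings", "latest-readings" };

// Ad hoc workload: every query is different (no hits at all)
string[] adHoc = { "q1", "q2", "q3", "q4" };

Console.WriteLine(HitRate(dashboard));
Console.WriteLine(HitRate(adHoc));
```

Running the same calculation over a sample of real request keys gives a first impression of whether a workload repeats enough to justify a dedicated gateway.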
Limitations
- Serverless: Not supported on serverless accounts
- Multi-region writes: Cache is per-gateway, region-local
- Write-through: Writes always go to the backend and are charged RUs as usual; the cache only reduces the cost of reads
- Size: Cache size depends on the dedicated gateway SKU
- Cost: Dedicated gateway cluster has its own pricing (separate from RU/s)
Integrated Cache and Dedicated Gateway: DP-420 Module 20
Knowledge Check
Amara's dashboard reads the same 10,000 device readings every 30 seconds. She enables the integrated cache with 2-minute staleness. What happens to RU consumption?
A developer enables the integrated cache but sees no RU savings. Their connection string uses the standard endpoint. What's wrong?
Next up: Change Feed Patterns, covering materialized views, denormalization propagation, and the change feed estimator for lag monitoring.