DevOps: Infrastructure as Code and Deployments
Manage Cosmos DB infrastructure with Bicep templates, Azure CLI, and PowerShell, including throughput migration scripts, blue-green deployments, canary patterns, and CI/CD pipeline integration.
Why Infrastructure as Code for Cosmos DB?
Clicking through the Azure portal to create databases is like building IKEA furniture without instructions. You might get it right once, but good luck doing it the same way in three environments. Infrastructure as Code (IaC) is the instruction manual: write it once, deploy it identically everywhere.
Marcus's DevOps workflow
Marcus at FinSecure manages Cosmos DB across three environments (dev, staging, prod). He needs:
- Identical configurations across all environments (except throughput)
- Automated deployment via GitHub Actions
- Zero-downtime throughput scaling during market hours
- Safe rollout of indexing policy changes
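A common way to meet the "identical except throughput" requirement is one shared Bicep template plus a parameter file per environment. The file name and values below are illustrative assumptions, not part of Marcus's actual repo:

```bicep
// prod.bicepparam — hypothetical per-environment parameter file
using 'main.bicep'

param accountName = 'finsecure-cosmos-prod'
param throughput = 20000
```

A matching dev.bicepparam would differ only in the account name and a lower throughput, so the container configuration itself stays identical across environments.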
Bicep template example
Bicep is the recommended IaC language for Azure (compiles to ARM templates):
param accountName string
param location string = resourceGroup().location
// Autoscale max RU/s; the autoscale minimum is 1,000 RU/s, so 400 would be rejected
param throughput int = 1000

resource cosmosAccount 'Microsoft.DocumentDB/databaseAccounts@2024-05-15' = {
  name: accountName
  location: location
  kind: 'GlobalDocumentDB'
  properties: {
    databaseAccountOfferType: 'Standard'
    consistencyPolicy: {
      defaultConsistencyLevel: 'Session'
    }
    locations: [
      { locationName: location, failoverPriority: 0 }
    ]
    enableAutomaticFailover: true
    backupPolicy: {
      type: 'Continuous'
      continuousModeProperties: {
        tier: 'Continuous30Days'
      }
    }
  }
}

resource database 'Microsoft.DocumentDB/databaseAccounts/sqlDatabases@2024-05-15' = {
  parent: cosmosAccount
  name: 'transactions'
  properties: {
    resource: { id: 'transactions' }
  }
}

resource container 'Microsoft.DocumentDB/databaseAccounts/sqlDatabases/sqlContainers@2024-05-15' = {
  parent: database
  name: 'orders'
  properties: {
    resource: {
      id: 'orders'
      partitionKey: {
        paths: ['/customerId']
        kind: 'Hash'
        version: 2
      }
      indexingPolicy: {
        indexingMode: 'consistent'
        includedPaths: [
          { path: '/customerId/?' }
          { path: '/orderDate/?' }
          { path: '/status/?' }
        ]
        excludedPaths: [
          { path: '/*' }
        ]
        compositeIndexes: [
          [
            { path: '/customerId', order: 'ascending' }
            { path: '/orderDate', order: 'descending' }
          ]
        ]
      }
      defaultTtl: -1
    }
    options: {
      autoscaleSettings: {
        maxThroughput: throughput
      }
    }
  }
}
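To deploy a template like the one above, you would typically run an az deployment command. The sketch below only prints the command (dry-run style) so the account, resource group, and file names, which are assumptions here, can be checked before running it for real:

```shell
#!/bin/sh
# Sketch: build the deployment command for one environment.
# ENV, RG, ACCOUNT, and the main.bicep path are illustrative assumptions.
ENV="dev"
RG="rg-finsecure-$ENV"
ACCOUNT="finsecure-cosmos-$ENV"

CMD="az deployment group create --resource-group $RG --template-file main.bicep --parameters accountName=$ACCOUNT throughput=1000"

# Printed for review; in a real pipeline you would execute the command directly.
echo "$CMD"
```

Parameterizing the environment name this way keeps dev, staging, and prod deployments byte-for-byte identical except for the values that are meant to differ.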
Azure CLI for operational tasks
For scripts and automation, Azure CLI provides imperative commands:
# Create the account
az cosmosdb create --name finsecure-cosmos \
  --resource-group rg-finsecure \
  --default-consistency-level Session \
  --enable-automatic-failover true

# Scale up for market hours (autoscale max)
az cosmosdb sql container throughput update \
  --account-name finsecure-cosmos \
  --resource-group rg-finsecure \
  --database-name transactions \
  --name orders \
  --max-throughput 20000

# Scale down after market hours
az cosmosdb sql container throughput update \
  --account-name finsecure-cosmos \
  --resource-group rg-finsecure \
  --database-name transactions \
  --name orders \
  --max-throughput 5000
Exam tip: throughput change limits
When you change autoscale max throughput, the new max must be between 10% and 10× of the current max. For example, if the current max is 10,000 RU/s:
- Minimum new max: 1,000 RU/s (10%)
- Maximum new max: 100,000 RU/s (10×)
For manual throughput, changes take effect immediately but large increases may require partition splits (which Cosmos DB handles automatically).
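The 10%-to-10× rule above is easy to sanity-check in a script before issuing an update. A minimal sketch (the variable names are mine, not from any az command):

```shell
#!/bin/sh
# Compute the allowed range for a new autoscale max, per the 10%-10x rule.
CURRENT_MAX=10000

MIN_NEW=$((CURRENT_MAX / 10))   # 10% of current max
MAX_NEW=$((CURRENT_MAX * 10))   # 10x current max

echo "Allowed new autoscale max: $MIN_NEW to $MAX_NEW RU/s"
```

A pre-flight check like this lets an automation script fail fast with a clear message instead of surfacing an API error mid-deployment.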
Throughput migration scripts
Automate throughput changes for day/night cycles:
#!/bin/bash
# market-hours-scale.sh: run via cron or Azure Automation
ACCOUNT="finsecure-cosmos"
RG="rg-finsecure"
DB="transactions"
CONTAINER="orders"

HOUR=$(date +%H)
if [ "$HOUR" -ge 9 ] && [ "$HOUR" -lt 16 ]; then
  echo "Market hours: scaling up to 20,000 RU/s"
  az cosmosdb sql container throughput update \
    --account-name "$ACCOUNT" -g "$RG" -d "$DB" -n "$CONTAINER" \
    --max-throughput 20000
else
  echo "Off-hours: scaling down to 5,000 RU/s"
  az cosmosdb sql container throughput update \
    --account-name "$ACCOUNT" -g "$RG" -d "$DB" -n "$CONTAINER" \
    --max-throughput 5000
fi
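A script like this is typically wired up to a scheduler. Hypothetical crontab entries (weekday triggers at market open and close, server local time; the path is an assumption) might look like:

```
# m h dom mon dow  command
0 9  * * 1-5  /opt/scripts/market-hours-scale.sh
0 16 * * 1-5  /opt/scripts/market-hours-scale.sh
```

Azure Automation or a scheduled GitHub Actions workflow would serve the same purpose if no always-on VM is available to host the cron job.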
Deployment patterns
Blue-green deployment
Blue (current): finsecure-cosmos-blue (production traffic)
Green (new): finsecure-cosmos-green (testing new config)
Steps:
1. Deploy new config to green environment
2. Run smoke tests against green
3. Switch application connection string from blue to green
4. Monitor green for issues
5. Decommission blue (or keep as rollback)
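Step 3, the connection-string switch, is where blue-green cutover actually happens. One hedged sketch, assuming the application reads its Cosmos DB endpoint from an App Service setting (the app name, setting key, and account names are all illustrative assumptions; the command is echoed rather than executed):

```shell
#!/bin/sh
# Sketch: point the app at the green account by updating an app setting.
APP="finsecure-api"                      # hypothetical App Service name
RG="rg-finsecure"
TARGET="finsecure-cosmos-green"          # the environment receiving traffic
NEW_ENDPOINT="https://$TARGET.documents.azure.com:443/"

SWITCH_CMD="az webapp config appsettings set --name $APP --resource-group $RG --settings COSMOS_ENDPOINT=$NEW_ENDPOINT"

# Echoed for review; rollback is the same command with TARGET set back to blue.
echo "$SWITCH_CMD"
```

Because the rollback is just the same command pointed back at blue, keeping the blue account alive (step 5) makes the cutover effectively reversible in seconds.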
Canary deployment
Canary pattern for indexing policy change:
1. Apply new indexing policy (Cosmos DB rebuilds index in background)
2. Monitor index transformation progress
3. Route 5% of traffic to test the new policy
4. If RU cost improves, route 100% of traffic
5. If RU cost increases, revert the policy
Exam tip: indexing policy changes are online
Changing an indexing policy triggers a background index transformation. During transformation:
- The container remains fully available for reads and writes
- Queries may use the old index until transformation completes
- You can monitor progress via the IndexTransformationProgress response header
- Multiple policy changes queue up, so avoid rapid successive changes
The exam tests this: "Will the container be unavailable during an indexing policy change?" No, it's an online operation.
CI/CD pipeline integration
# GitHub Actions example
name: Deploy Cosmos DB
on:
  push:
    branches: [main]
    paths: ['infra/cosmos/**']
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: azure/login@v2
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }}
      - uses: azure/arm-deploy@v2
        with:
          resourceGroupName: rg-finsecure
          template: infra/cosmos/main.bicep
          parameters: >
            accountName=finsecure-cosmos
            throughput=10000
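Before the deploy step, a what-if preview can surface unintended changes (for example, an accidental indexing-policy diff) in the pull-request logs. A hedged sketch of an extra step that could be added to the workflow above, using the azure/cli action:

```yaml
      # Optional pre-deployment validation step (assumed placement: before azure/arm-deploy)
      - name: Preview infrastructure changes
        uses: azure/cli@v2
        with:
          inlineScript: |
            az deployment group what-if \
              --resource-group rg-finsecure \
              --template-file infra/cosmos/main.bicep \
              --parameters accountName=finsecure-cosmos throughput=10000
```

What-if reports each resource as Create, Modify, Delete, or NoChange, which makes template reviews far less error-prone than eyeballing the Bicep diff alone.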
Video walkthrough
DevOps and IaC (DP-420 Module 27), ~16 min. Video coming soon.
Knowledge Check
Marcus wants to ensure his dev, staging, and production Cosmos DB environments have identical container configurations (except throughput). What's the best approach?
Marcus applies a new indexing policy to his production container. What happens to ongoing reads and writes?
Marcus wants to scale his orders container from 5,000 to 50,000 autoscale max RU/s. Is this possible in a single operation?
Next up: Exam Strategy and Cross-Domain Review, the capstone module with domain weight recap, most-tested topics, common traps, and final tips from all four characters.