Scale, Performance, and Compression
Configure Azure SQL Database, Managed Instance, and SQL on VMs for scale and performance. Learn table partitioning implementation and data compression techniques.
Scaling Azure SQL
Think of scaling like adjusting the lanes on a motorway.
Scaling up (vertical) is widening each lane: more power per lane. You upgrade your database to a bigger tier with more CPU, memory, and I/O.
Scaling out (horizontal) is adding more lanes: more capacity through parallelism. You add read replicas, elastic pools, or shard your data across multiple databases.
Azure SQL Database makes this easy: you can scale up with a single slider. SQL on VMs means resizing the entire virtual machine. Each platform has its own scaling levers.
Scaling Azure SQL Database
Priya at ScaleWave uses these levers every week:
Service tiers
| Tier | Use Case | Key Features |
|---|---|---|
| General Purpose | Most workloads | Remote storage, up to 128 vCores |
| Business Critical | Low-latency, high IOPS | Local SSD, built-in read replica |
| Hyperscale | Very large databases (up to 100 TB) | Near-instant scale, up to 4 read replicas, fast backup/restore |
Compute models
| Model | How It Works | Best For |
|---|---|---|
| Provisioned | Fixed vCores, always running | Predictable workloads, consistent performance |
| Serverless | Auto-scales between min/max vCores, auto-pauses | Intermittent use, dev/test, unpredictable traffic |
Scaling operations
- Scale up/down: Change vCore count or tier. Brief connectivity blip during switchover.
- Elastic pools: Multiple databases share a pool of resources. Set min/max per database.
- Read replicas: Offload read queries to up to 4 replicas (Hyperscale) or 1 built-in replica (Business Critical).
Priya's move: ScaleWave's traffic spikes during product launches. She uses serverless for dev databases (auto-pause saves money) and provisioned Business Critical for production (consistent latency).
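A tier or vCore change doesn't require the portal; it can also be issued from T-SQL against the logical server. A minimal sketch, assuming a database called `SalesDb` (hypothetical) and a target SKU of 4 vCores General Purpose:

```sql
-- Scale an Azure SQL Database to General Purpose, 4 vCores.
-- Run from the master database; the switchover causes the brief
-- connectivity blip described above.
ALTER DATABASE [SalesDb]
MODIFY (EDITION = 'GeneralPurpose', SERVICE_OBJECTIVE = 'GP_Gen5_4');

-- Confirm the current service objective
SELECT DATABASEPROPERTYEX('SalesDb', 'ServiceObjective') AS CurrentObjective;
```

The service objective name (`GP_Gen5_4`) follows the tier/hardware/vCore naming pattern; substitute the SKU that matches your target tier.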
Scaling Azure SQL Managed Instance
Kenji scales MI differently:
- vCores: 4 to 128 vCores in both General Purpose and Business Critical
- Storage: Up to 16 TB (auto-grows in General Purpose)
- Tier change: Move between General Purpose and Business Critical. Scaling operations can take minutes to hours.
- Instance pools: Multiple MIs share underlying resources, giving cost savings for smaller instances
MI scaling gotcha
Scaling a Managed Instance (changing vCores or tier) triggers an operation that involves moving the instance to new hardware. This can take 30 minutes or longer. Plan scaling operations during maintenance windows, not during peak hours.
The exam may present a scenario where an admin needs to scale MI quickly β the correct answer acknowledges that MI scaling is NOT instant.
Scaling SQL Server on Azure VMs
Kenji also manages SQL VMs for the legacy reporting system:
- VM resize: Change VM series/size for more CPU, memory, I/O. Requires VM restart.
- Storage: Add Premium SSD or Ultra Disk. Separate data, log, and tempdb to different disks.
- Memory: Configure max server memory to leave room for the OS.
- tempdb: Place on local SSD (ephemeral disk) for best performance, but data is lost on VM restart.
| VM Consideration | Recommendation |
|---|---|
| VM series | Memory-optimised (E-series) or storage-optimised (L-series) for SQL |
| Data files | Premium SSD P30+ or Ultra Disk, striped with Storage Spaces |
| Log files | Separate Premium SSD, NOT on same disk as data |
| tempdb | Local/ephemeral SSD for best IOPS (if data loss is acceptable on restart) |
| Read caching | Enable read caching on data disks; disable on log disks |
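The memory and tempdb recommendations above map to straightforward T-SQL. A sketch, assuming a VM with 64 GB of RAM and the local ephemeral SSD mounted at `D:\` (the usual Azure layout; the `D:\tempdb` folder must exist at startup):

```sql
-- Cap SQL Server memory so the OS keeps roughly 8 GB headroom on a 64 GB VM
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 57344;  -- 56 GB for SQL Server
RECONFIGURE;

-- Move tempdb files to the local/ephemeral SSD (takes effect after a
-- service restart; tempdb is recreated on startup, so the disk being
-- wiped on restart is safe)
ALTER DATABASE tempdb
MODIFY FILE (NAME = tempdev, FILENAME = 'D:\tempdb\tempdb.mdf');
ALTER DATABASE tempdb
MODIFY FILE (NAME = templog, FILENAME = 'D:\tempdb\templog.ldf');
```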
Table partitioning
When Kenji's biggest tables hit hundreds of millions of rows, queries slow down. Partitioning helps.
How partitioning works
- Partition function: defines the boundaries (how to split)
- Partition scheme: maps partitions to filegroups (where to store)
- Partitioned table: uses the scheme for physical storage
```sql
-- Example: Partition orders by year
CREATE PARTITION FUNCTION pfOrderDate (DATE)
AS RANGE RIGHT FOR VALUES ('2023-01-01', '2024-01-01', '2025-01-01');

CREATE PARTITION SCHEME psOrderDate
AS PARTITION pfOrderDate ALL TO ([PRIMARY]);

CREATE TABLE Orders (
    OrderID INT,
    OrderDate DATE,
    Amount DECIMAL(10,2)
) ON psOrderDate (OrderDate);
```
Partitioning benefits
- Query performance: Partition elimination skips irrelevant partitions
- Maintenance: Rebuild indexes on one partition, not the whole table
- Data lifecycle: Use partition switching to archive old data instantly (metadata-only operation)
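The partition-switching trick from the last bullet can be sketched in one statement. Assumptions: an `OrdersArchive` table (hypothetical) already exists with a schema and indexes identical to `Orders`, it is empty, and it sits on the same filegroup as the partition being switched out:

```sql
-- Switch the oldest partition out of Orders into an archive table.
-- This is a metadata-only operation: near-instant even for millions
-- of rows, because no data is physically moved.
ALTER TABLE Orders
SWITCH PARTITION 1 TO OrdersArchive;
```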
RANGE LEFT vs RANGE RIGHT
| Type | Boundary Value Belongs To |
|---|---|
| RANGE LEFT | Left partition (boundary value is the MAX of the left partition) |
| RANGE RIGHT | Right partition (boundary value is the MIN of the right partition) |
Most date-based partitions use RANGE RIGHT: the boundary date starts a new partition.
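You can verify where a row lands using the `pfOrderDate` function from the earlier example (three boundaries, so four partitions). With RANGE RIGHT, the boundary date is the minimum of the new partition:

```sql
-- pfOrderDate boundaries: 2023-01-01, 2024-01-01, 2025-01-01 (RANGE RIGHT)
SELECT
    $PARTITION.pfOrderDate('2023-12-31') AS P_Dec2023,  -- partition 2
    $PARTITION.pfOrderDate('2024-01-01') AS P_Boundary; -- partition 3: the
                                                        -- boundary starts
                                                        -- the new partition
```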
Data compression
Compression reduces storage and improves I/O performance (less data to read from disk):
| Compression Type | How It Works | Savings | Overhead |
|---|---|---|---|
| Row compression | Fixed-length data stored as variable-length | 15-25% | Minimal CPU |
| Page compression | Row compression + prefix/dictionary compression | 50-80% | Moderate CPU |
| Columnstore compression | Column-oriented storage + batch processing | 80-95% | Designed for analytics |
```sql
-- Enable page compression on an existing table
ALTER TABLE Orders REBUILD WITH (DATA_COMPRESSION = PAGE);

-- Enable compression on a specific partition
ALTER TABLE Orders REBUILD PARTITION = 3 WITH (DATA_COMPRESSION = PAGE);
```
Tomas's learning moment: Sam explains: "Row compression is free performance: almost no CPU overhead, good savings. Page compression squeezes harder but uses more CPU. Use page compression on cold data (archives, history). Use row compression on hot tables with frequent updates."
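Before committing CPU to a rebuild, you can check whether compression is worth it: `sp_estimate_data_compression_savings` samples the table and reports current vs. estimated compressed size without changing anything:

```sql
-- Estimate page-compression savings for the Orders table
-- (sampled, so it runs quickly even on very large tables)
EXEC sp_estimate_data_compression_savings
    @schema_name      = 'dbo',
    @object_name      = 'Orders',
    @index_id         = NULL,   -- NULL = all indexes
    @partition_number = NULL,   -- NULL = all partitions
    @data_compression = 'PAGE';
```

Run it once with `'ROW'` and once with `'PAGE'` to compare the trade-off Sam describes.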
Exam tip: compression and partitioning together
You can apply different compression levels to different partitions. This is a common exam scenario:
- Hot partitions (current year): Row compression or none
- Warm partitions (last year): Page compression
- Cold partitions (archives): Page compression or columnstore (if the table is read-only)
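The hot/warm/cold layout above can be applied in a single rebuild: the `ON PARTITIONS` clause sets different compression per partition. A sketch using the four partitions of the earlier `pfOrderDate` example (partitions 1-2 hold older years, 3-4 hold current data):

```sql
-- One statement, different compression per partition
ALTER TABLE Orders REBUILD PARTITION = ALL
WITH (
    DATA_COMPRESSION = PAGE ON PARTITIONS (1 TO 2),  -- cold/warm: archives
    DATA_COMPRESSION = ROW  ON PARTITIONS (3 TO 4)   -- hot: current data
);
```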
Priya has a development database that's only used during business hours (8 AM β 6 PM). What compute model should she choose to minimise costs?
Kenji has a 500-million-row Orders table partitioned by OrderDate. He wants to archive 2022 data with maximum compression and minimal impact on current queries. What should he do?
🎬 Video coming soon
Next up is Migration Planning: Online vs Offline, where you'll evaluate migration requirements and choose between online and offline strategies.