Sizing for Performance and Capacity

Right-sizing your session hosts

Simple explanation

Imagine each session host VM as a bus.

A small bus (4 vCPU) can carry a few passengers (users). A large bus (16 vCPU) carries more. If the passengers only carry handbags (light workload — email and web), you fit lots on one bus. If they bring full suitcases (heavy workload — CAD software), you need fewer passengers per bus or a bigger bus. Picking the wrong bus size means either wasted money (huge empty bus) or angry passengers (overloaded, slow bus).

Workload types and VM sizing

Microsoft defines four workload types. The exam uses these categories in scenario questions:

Workload	Examples	Max users per vCPU	Min VM spec	Example VM
Light	Data entry, basic web browsing	6	8 vCPU, 16 GB RAM	D8s_v5
Medium	Office apps, email, light databases	4	8 vCPU, 16 GB RAM	D8s_v5
Heavy	Software development, content creation	2	8 vCPU, 16 GB RAM	D8s_v5
Power	CAD, 3D modelling, simulation, video editing	1	6 vCPU, 56 GB RAM, GPU	NV-series

Reading the table: “Max users per vCPU” tells you the density. A D8s_v5 with 8 vCPUs can host up to 48 light users (8 times 6), 32 medium users (8 times 4), or 16 heavy users (8 times 2). Power users are capped at 1 user per vCPU.
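The density arithmetic can be sketched in a few lines of Python. This is only an illustration: the dictionary and function names are hypothetical and not part of any Azure SDK, and the ratios are copied from the workload table.

```python
# Max users per vCPU, copied from the workload table (names are illustrative).
MAX_USERS_PER_VCPU = {"light": 6, "medium": 4, "heavy": 2, "power": 1}


def max_users_by_vcpu(vcpus: int, workload: str) -> int:
    """Return the vCPU-based ceiling on users for one session host."""
    return vcpus * MAX_USERS_PER_VCPU[workload]


# A D8s_v5 has 8 vCPUs.
for workload in ("light", "medium", "heavy"):
    print(workload, max_users_by_vcpu(8, workload))
```

This gives only the vCPU ceiling. Memory can lower the real number of users a host can carry.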

🌐 Priya at NomadTech: “We profiled our 200 users: 150 are light (email, browser, Teams), 45 are medium (Excel dashboards, Power BI), and 5 are power (video editing). That profile drove completely different VM sizes across our three pools.”

VM families for AVD

VM families for AVD workloads
VM Family	Optimised for	Typical AVD use	Example sizes
D-series v5	General purpose (balanced CPU/memory)	Default choice for light and medium workloads	D4s_v5 (4 vCPU/16 GB), D8s_v5 (8/32), D16s_v5 (16/64)
E-series v5	Memory optimised	Heavy workloads with large datasets, big Excel files, in-memory databases	E4s_v5 (4 vCPU/32 GB), E8s_v5 (8/64)
F-series v2	Compute optimised	CPU-intensive light workloads (high density)	F8s_v2 (8 vCPU/16 GB)
NV-series	GPU — virtualised (vGPU)	Graphics rendering, CAD, video playback	NV6ads_A10_v5 (6 vCPU/55 GB/A10 GPU partition)
NC-series	GPU — full GPU (CUDA/ML)	Machine learning inference, scientific simulations	NC4as_T4_v3 (4 vCPU/28 GB/T4 GPU)
B-series	Burstable (credits)	Dev/test or very light intermittent workloads only	B4ms (4 vCPU/16 GB)

Key rules:

D-series is the default — start here unless you have a specific reason not to
E-series when users complain about RAM pressure (lots of browser tabs, big spreadsheets)
NV-series for graphics-accelerated desktops (CAD, video, 3D)
B-series only for dev/test — burstable VMs throttle once credits run out, making them unsuitable for production workloads

🏢 Raj’s CAD team: “We tested D16s_v5 for CAD but Revit rendering was painfully slow without GPU acceleration. We switched to NV-series with a partitioned NVIDIA A10 GPU. Now each designer gets a slice of GPU and render times dropped by 80%. Cost went up, but productivity went up even more.”

Memory planning

While vCPU ratios get the most attention, memory is often the real bottleneck:

Workload	RAM per user (guideline)
Light	2 GB
Medium	3-4 GB
Heavy	6-8 GB
Power	8-16+ GB

A D8s_v5 has 32 GB RAM. With 32 medium users (per the 4:1 vCPU ratio), that is only 1 GB per user — well below the 3-4 GB guideline. You may need to drop density or choose a memory-optimised E-series VM.
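A minimal sketch of the "which limits first" check, assuming the upper end of the medium guideline (4 GB per user) and ignoring the RAM the operating system itself needs. The function name is hypothetical.

```python
def limiting_resource(vcpus, ram_gb, users_per_vcpu, ram_per_user_gb):
    """Return (max_users, limiting_resource) for one session host."""
    by_cpu = vcpus * users_per_vcpu
    # Assumption: all RAM is available to users (no OS overhead reserved).
    by_ram = int(ram_gb // ram_per_user_gb)
    if by_ram < by_cpu:
        return by_ram, "memory"
    return by_cpu, "vCPU"


# D8s_v5 (8 vCPU, 32 GB), medium users at 4 per vCPU and 4 GB each.
print(limiting_resource(8, 32, 4, 4))  # (8, 'memory')
# E8s_v5 (8 vCPU, 64 GB) under the same assumptions.
print(limiting_resource(8, 64, 4, 4))  # (16, 'memory')
```

In both cases memory, not vCPU, sets the ceiling. The E-series host simply doubles it.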

Exam tip: RAM vs vCPU — which limits first?

The exam may give you a VM size and a user count and ask if it is adequate. Always check BOTH vCPU ratio and memory per user. A D8s_v5 (8 vCPU, 32 GB) hosting 32 medium users satisfies the 4:1 vCPU ratio but only gives 1 GB RAM per user (well below the 3-4 GB recommendation). The correct answer would be to reduce density, switch to E8s_v5 (8 vCPU, 64 GB), or both. Note that E8s_v5 with the same 32 users still gives only 2 GB per user, so meeting the 3-4 GB guideline on that size means roughly 16 to 21 users per VM.

GPU workloads

Some users need hardware graphics acceleration. AVD supports GPU-enabled VMs in two modes:

GPU partitioning — shares a physical GPU among multiple users using GPU-P (NV-series, NVads A10 v5)
Full GPU pass-through — dedicates the entire GPU to one user (NC-series for compute, NV for graphics)

Note: RemoteFX vGPU was deprecated and removed from Windows Server due to security concerns. AVD uses Azure GPU VM sizes with native driver support instead.

Use cases: CAD/CAM (AutoCAD, Revit, SolidWorks), video editing (Premiere, DaVinci), 3D visualisation (Power BI 3D visuals), medical imaging

🎧 Mia at Horizons Health: “Dr. Patel’s radiology workstation needs GPU for DICOM 3D rendering. We deployed NV-series personal desktops for the 8 radiologists. Everyone else in the clinics uses D-series pooled desktops for EHR and scheduling — no GPU needed.”

OS disk types

The operating system disk impacts boot time and application launch speed:

Disk type	Performance	Cost	Persistent?	Best for
Standard HDD	Low IOPS, high latency	Lowest	Yes	Dev/test only
Standard SSD	Moderate IOPS	Low	Yes	Light workloads on a budget
Premium SSD	High IOPS, low latency	Medium	Yes	Production personal desktops
Ephemeral OS disk	Very high (uses local VM cache)	Included in VM price	No — resets on reimage/deallocate	Production pooled desktops

Ephemeral OS disks

Ephemeral disks use the VM’s local temporary storage (or cache) instead of a remote managed disk. Benefits:

Faster boot and reimage — no network storage latency
No disk cost — the local cache is included in the VM price
Stateless by design: perfect for pooled host pools where VMs are reimaged regularly

The tradeoff: data on the OS disk is lost when the VM is deallocated or reimaged. This is ideal for pooled host pools (user data is in FSLogix on the file share, not on the OS disk) but NOT suitable for personal desktops where users install software locally.

Exam tip: Ephemeral disk placement

Ephemeral OS disks can be placed on the VM cache or the temporary (temp) disk. Cache placement gives better IOPS but only works if the VM size has enough cache capacity for the OS image. The exam may ask where to place the ephemeral disk, so check that the VM size supports it. For example, D8s_v3 has a cache size of 200 GiB, which comfortably fits a standard Windows image. Sizes with no local temp disk, such as D8s_v5, need extra care: confirm in the VM size documentation that ephemeral OS disks are supported before relying on them.
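The placement check can be sketched as a small helper. This is a simplified model, not Azure's own selection logic: the function name is hypothetical, the capacities must be read from the VM size documentation, and the 127 GiB image size is an assumption about a typical Windows marketplace image. The strings `CacheDisk` and `ResourceDisk` mirror the placement values Azure uses for cache and temp disk placement.

```python
from typing import Optional


def ephemeral_placement(image_gib: int, cache_gib: int, temp_gib: int) -> Optional[str]:
    """Pick a placement for an ephemeral OS disk, preferring the cache.

    Returns None when neither location can hold the OS image, in which
    case a managed (persistent) OS disk is needed instead.
    """
    if cache_gib >= image_gib:
        return "CacheDisk"
    if temp_gib >= image_gib:
        return "ResourceDisk"
    return None


# A 200 GiB cache fits an assumed 127 GiB Windows image.
print(ephemeral_placement(image_gib=127, cache_gib=200, temp_gib=64))
# A size with no cache and no temp disk cannot host an ephemeral OS disk.
print(ephemeral_placement(image_gib=127, cache_gib=0, temp_gib=0))
```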

Session limits

Each session host has a maximum session limit that caps how many users can connect to it:

For breadth-first load balancing, the default max session limit is 999,999 (effectively unlimited — relies on even distribution)
For depth-first, you MUST set a realistic max session limit (e.g., 12 users per D8s_v5 for medium workloads)
The max session limit should be based on your vCPU and memory calculations

If all session hosts reach their max session limit, new users cannot connect. You need to either increase the limit or add more session hosts.
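To see why depth-first needs a realistic limit, here is a deliberately simplified model of the two load-balancing algorithms. It assumes breadth-first always picks the host with the fewest sessions and depth-first picks the fullest host that still has room. The real service is more nuanced, and all names here are hypothetical.

```python
from typing import List, Optional


def pick_host(sessions: List[int], algorithm: str, max_session_limit: int) -> Optional[int]:
    """Return the index of the host that receives the next session."""
    available = [i for i, count in enumerate(sessions) if count < max_session_limit]
    if not available:
        return None  # every host is full: the new user cannot connect
    if algorithm == "breadth-first":
        return min(available, key=lambda i: sessions[i])
    if algorithm == "depth-first":
        return max(available, key=lambda i: sessions[i])
    raise ValueError(f"unknown algorithm: {algorithm}")


def simulate(users: int, hosts: int, algorithm: str, max_session_limit: int) -> List[int]:
    """Connect users one by one and return the sessions per host."""
    sessions = [0] * hosts
    for _ in range(users):
        host = pick_host(sessions, algorithm, max_session_limit)
        if host is None:
            break  # remaining users are refused
        sessions[host] += 1
    return sessions


print(simulate(10, 3, "breadth-first", 999_999))  # [4, 3, 3]
print(simulate(10, 3, "depth-first", 999_999))    # [10, 0, 0]
print(simulate(10, 3, "depth-first", 4))          # [4, 4, 2]
print(simulate(14, 3, "depth-first", 4))          # [4, 4, 4]
```

With the default limit, depth-first piles every user onto the first host. With a limit of 4, hosts fill in turn, and the last call shows two users being refused once all three hosts are full.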

Capacity planning walkthrough

🏢 Raj’s calculation for TerraStack:

Scenario: 500 medium-workload users, breadth-first load balancing, target 4:1 vCPU ratio, 4 GB RAM per user.

1. Pick VM size: D16s_v5 (16 vCPU, 64 GB RAM)
2. vCPU capacity: 16 vCPU times 4 users per vCPU = 64 users per VM
3. Memory check: 64 GB divided by 64 users = 1 GB per user, too low for medium workloads
4. Adjust for memory: 64 GB divided by 4 GB per user = 16 users per VM
5. VMs needed: 500 users divided by 16 per VM = 31.25, round up to 32 VMs
6. Add buffer: 20% buffer for failover and updates = 32 times 1.2 = 38.4, round up to 39 VMs

Notice how memory (step 4) limited capacity far more than vCPU ratio alone (step 2). Always check both.
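Raj's walkthrough can be reproduced with a short script. This is a sketch under the same assumptions as the scenario (all RAM available to users, a flat 20% buffer). The function names are illustrative, and integer arithmetic is used so the rounding-up steps are exact.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer division that rounds up."""
    return -(-a // b)


def size_pool(total_users, vcpus, ram_gb, users_per_vcpu, ram_per_user_gb, buffer_percent=20):
    """Return (users_per_vm, base_vms, total_vms) for a host pool."""
    by_cpu = vcpus * users_per_vcpu        # vCPU capacity
    by_ram = ram_gb // ram_per_user_gb     # memory capacity
    users_per_vm = min(by_cpu, by_ram)     # the tighter limit wins
    base_vms = ceil_div(total_users, users_per_vm)
    total_vms = ceil_div(base_vms * (100 + buffer_percent), 100)
    return users_per_vm, base_vms, total_vms


# 500 medium users on D16s_v5 (16 vCPU, 64 GB), 4 users per vCPU, 4 GB each.
print(size_pool(500, 16, 64, 4, 4))  # (16, 32, 39)
```

Changing `ram_per_user_gb` to 3 or swapping in a different VM size shows quickly how sensitive the VM count is to the memory assumption.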

🎧 Mia’s clinic sizing: “We have 400 light-workload clinical staff (EHR and scheduling). With D8s_v5 (8 vCPU, 32 GB), we get 48 users per VM from the vCPU ratio (8 times 6). Memory check: 32 GB divided by 48 = 0.67 GB, which is too low even for light users. We settled on 24 users per VM, which gives about 1.3 GB RAM each, and that needs 17 VMs plus a 20% buffer = 21 VMs total.”

Question

What is the maximum users-per-vCPU ratio for light workloads?