Autoscaling and Session Management

Running all your session hosts 24/7 is like leaving every light in the building on overnight — it works, but it wastes money. Autoscaling lets AVD start VMs when users need them and shut them down when they don’t. Combined with session management tools, you get full control over who is connected, where, and how much compute you are paying for.

Why Autoscaling Matters

Without autoscaling, you have two bad choices:

Keep all VMs running — reliable, but you pay for idle compute during nights and weekends
Manually start/stop VMs — saves money, but someone has to remember, and users get angry if their VM is not ready

Autoscaling eliminates this trade-off. Microsoft estimates it can reduce VM costs by 30 to 60 percent compared to always-on hosts, depending on your usage patterns.
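
To get a feel for where a figure in that range could come from, here is a rough back-of-the-envelope sketch. The pool size, the always-on floor, and the 12-hour weekday schedule are all made-up assumptions, and the model only counts VM compute hours (disks, licensing, and networking still cost money).

```python
HOURS_PER_WEEK = 24 * 7  # 168 hours


def weekly_vm_hours(always_on_hosts: int, scheduled_hosts: int,
                    scheduled_hours_per_week: float) -> float:
    """Total VM-hours per week for a pool with a small always-on floor."""
    return (always_on_hosts * HOURS_PER_WEEK
            + scheduled_hosts * scheduled_hours_per_week)


def savings_fraction(total_hosts: int, autoscaled_vm_hours: float) -> float:
    """Fraction of compute hours saved versus running every host 24/7."""
    baseline = total_hosts * HOURS_PER_WEEK
    return 1 - autoscaled_vm_hours / baseline


# Hypothetical pool: 10 hosts, 2 kept on as the floor,
# the other 8 running 12 hours a day on weekdays only.
hours = weekly_vm_hours(always_on_hosts=2, scheduled_hosts=8,
                        scheduled_hours_per_week=12 * 5)
print(hours)                                  # 816
print(f"{savings_fraction(10, hours):.1%}")   # 51.4%
```

Change the floor or the schedule and the result moves a lot, which is why the savings depend so heavily on usage patterns.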

Scaling Plans — The Four Phases

A scaling plan divides each day into four time phases. Each phase has its own rules:

Phase	Typical hours	What happens	Recommended load balancing
Ramp-up	7:00 AM - 9:00 AM	Gradually start hosts before users arrive	Breadth-first (spread users across hosts)
Peak hours	9:00 AM - 5:00 PM	All needed hosts running, full capacity	Breadth-first or depth-first
Ramp-down	5:00 PM - 7:00 PM	Gradually shut down hosts as users leave	Depth-first (consolidate onto fewer hosts)
Off-peak	7:00 PM - 7:00 AM	Minimum hosts running	Depth-first
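
One way to picture the table is as a lookup from local time to phase. The sketch below is illustrative only: the start times are copied from the example hours above, and the function name `phase_for` is invented for this example rather than taken from any Azure API.

```python
from datetime import time

# Phase start times from the example table; a real plan uses whatever you configure.
PHASE_STARTS = [
    (time(7, 0), "ramp-up"),
    (time(9, 0), "peak"),
    (time(17, 0), "ramp-down"),
    (time(19, 0), "off-peak"),
]


def phase_for(local_time: time) -> str:
    """Return the phase whose start time most recently passed."""
    current = "off-peak"  # before 07:00 we are still in last night's off-peak
    for start, name in PHASE_STARTS:
        if local_time >= start:
            current = name
    return current


print(phase_for(time(8, 30)))   # ramp-up
print(phase_for(time(18, 0)))   # ramp-down
print(phase_for(time(2, 15)))   # off-peak
```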

Key Settings Per Phase

Minimum percentage of hosts — The floor. Even if nobody is connected, keep at least this percentage running. Set to 10-20 percent during off-peak so a few early birds can connect immediately.
Capacity threshold — When the percentage of used sessions across running hosts exceeds this number, start more hosts. Example: threshold of 75 percent means “start a new host when current hosts are 75 percent full.”
Force logoff — During ramp-down, if a host needs to shut down but still has users, should AVD force them off? You set a delay (e.g., 15 minutes warning) and a message they see.
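
The first two settings boil down to simple arithmetic. Here is a minimal sketch under stated assumptions: the function names are hypothetical, rounding the minimum up to a whole host is a guess for illustration, and the real service may apply extra rules.

```python
import math


def minimum_hosts(total_hosts: int, minimum_percent: float) -> int:
    """Hosts that must stay running, rounding up to a whole host (assumption)."""
    return math.ceil(total_hosts * minimum_percent / 100)


def needs_more_hosts(active_sessions: int, running_hosts: int,
                     max_sessions_per_host: int, threshold_percent: float) -> bool:
    """True when used capacity on the running hosts exceeds the threshold."""
    capacity = running_hosts * max_sessions_per_host
    if capacity == 0:
        # Nothing is running: any session at all means we need a host.
        return active_sessions > 0
    return active_sessions * 100 / capacity > threshold_percent


print(minimum_hosts(10, 15))              # 2 (1.5 rounded up)
print(needs_more_hosts(30, 4, 12, 75))    # False: 30 of 48 is 62.5 percent
print(needs_more_hosts(40, 4, 12, 75))    # True: 40 of 48 is about 83 percent
```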

🏢 Raj’s multi-timezone challenge: TerraStack has offices in Sydney, London, and Toronto. Raj creates three scaling plans — one per region — each aligned to local business hours. The Sydney plan ramps up at 7 AM AEST while London is still in off-peak. He assigns each plan to the appropriate host pool. Andrea loves the cost report: compute spend dropped 42 percent in the first month.
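
To see why one plan per region matters, the sketch below asks what phase each office is in at a single moment. It is a simplification: the fixed UTC offsets are assumptions that hold for an early-June date and ignore daylight saving changes, and the phase boundaries are the example hours from the table above.

```python
from datetime import datetime, time, timedelta, timezone

# Fixed offsets for illustration only (valid in early June; DST is ignored).
OFFSETS = {
    "Sydney": timezone(timedelta(hours=10)),   # AEST
    "London": timezone(timedelta(hours=1)),    # BST
    "Toronto": timezone(timedelta(hours=-4)),  # EDT
}


def phase_at(local: time) -> str:
    """Map a local wall-clock time to the example phase boundaries."""
    if time(7) <= local < time(9):
        return "ramp-up"
    if time(9) <= local < time(17):
        return "peak"
    if time(17) <= local < time(19):
        return "ramp-down"
    return "off-peak"


def phases_at(instant_utc: datetime) -> dict:
    """Phase for every office at the same instant."""
    return {city: phase_at(instant_utc.astimezone(tz).time())
            for city, tz in OFFSETS.items()}


instant = datetime(2024, 6, 3, 21, 30, tzinfo=timezone.utc)
print(phases_at(instant))
# {'Sydney': 'ramp-up', 'London': 'off-peak', 'Toronto': 'ramp-down'}
```

A single plan pinned to one time zone would get at least two of those three offices wrong.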

Exam Tip — Breadth-First vs Depth-First for Scaling

The exam frequently tests this. During ramp-up and peak, use breadth-first — spread users across many hosts so no single host gets overwhelmed. During ramp-down and off-peak, switch to depth-first — pack users onto fewer hosts so empty hosts can be deallocated. This is the cost-optimal pattern.
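
If the two algorithms blur together, this toy model may help. It is a simplified sketch, not the service's actual implementation: breadth-first is modeled as "least-loaded host" and depth-first as "fullest host that still has room", and the host names are invented.

```python
from typing import Dict, Optional


def pick_host(sessions_by_host: Dict[str, int], max_sessions: int,
              algorithm: str) -> Optional[str]:
    """Choose a host for the next session; None if every host is full."""
    candidates = {h: n for h, n in sessions_by_host.items() if n < max_sessions}
    if not candidates:
        return None
    if algorithm == "breadth-first":
        return min(candidates, key=candidates.get)  # spread: least-loaded host
    if algorithm == "depth-first":
        return max(candidates, key=candidates.get)  # pack: fullest host with room
    raise ValueError(f"unknown algorithm: {algorithm}")


hosts = {"host-a": 7, "host-b": 2, "host-c": 10}
print(pick_host(hosts, 10, "breadth-first"))  # host-b
print(pick_host(hosts, 10, "depth-first"))    # host-a (host-c is already full)
```

With depth-first, host-b keeps draining toward zero sessions, which is exactly what lets it be deallocated later.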

Configuring a Scaling Plan

To create a scaling plan in the Azure portal:

1. Navigate to Azure Virtual Desktop, then Scaling plans
2. Click Create a scaling plan
3. Select the subscription, resource group, and give it a name
4. Set the time zone (critical for multi-region)
5. Add schedules for each day of the week (you can use one schedule for weekdays and another for weekends)
6. Configure each phase with its start time, load balancing, minimum hosts, and capacity threshold
7. Assign the plan to one or more host pools
8. Set the plan to Enabled

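Important: a scaling plan acts through the Azure Virtual Desktop service principal, which needs the Desktop Virtualization Power On Off Contributor role. Microsoft's guidance is to assign the role at subscription scope; a narrower scope, such as only the resource group containing the VMs, may prevent autoscale from working properly. Without this RBAC assignment, the plan cannot start or stop hosts.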

Deep Dive — Scaling Plan Evaluation Logic

The scaling plan evaluates every 15 minutes. It checks: (1) What phase are we in? (2) How many hosts are running? (3) What is the current session load vs capacity threshold? (4) Do we need to start or stop hosts?

When starting hosts, it picks deallocated VMs and starts them. When stopping hosts, it picks hosts with zero sessions first. If all hosts have sessions and the plan needs to reduce capacity, it puts hosts in drain mode (no new sessions) and optionally force-logs-off users after the configured delay.
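
The stop-side logic can be sketched as a small selection function. This is a guess at the shape of the decision, not the real algorithm: the source only says empty hosts go first, so preferring the least-loaded hosts for drain mode is an assumption made here, and `plan_scale_in` is a hypothetical name.

```python
from typing import Dict, List, Tuple


def plan_scale_in(sessions_by_host: Dict[str, int],
                  hosts_to_remove: int) -> Tuple[List[str], List[str]]:
    """Return (hosts to deallocate now, hosts to put in drain mode)."""
    # Empty hosts sort first, then the least-loaded ones (assumed tie-break: name).
    ordered = sorted(sessions_by_host, key=lambda h: (sessions_by_host[h], h))
    chosen = ordered[:hosts_to_remove]
    stop_now = [h for h in chosen if sessions_by_host[h] == 0]
    drain = [h for h in chosen if sessions_by_host[h] > 0]
    return stop_now, drain


pool = {"host-a": 0, "host-b": 3, "host-c": 0, "host-d": 8}
print(plan_scale_in(pool, 3))  # (['host-a', 'host-c'], ['host-b'])
```

Here two empty hosts can be stopped right away, while host-b is drained and only shut down once its three users leave or are logged off after the configured delay.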

Start VM on Connect

Start VM on Connect solves a different problem: personal host pools where each user has a dedicated VM.

In a personal pool, each user needs their own specific VM, so the pooled approach of keeping "enough" interchangeable hosts running does not apply. Start VM on Connect, which can also be enabled on pooled host pools, works like this:

1. User opens the Remote Desktop client and clicks their desktop
2. AVD detects the VM is deallocated
3. AVD starts the VM automatically
4. User waits 1-2 minutes while the VM boots
5. Connection completes

This means personal VMs can be deallocated when not in use, saving compute costs, without requiring users to manually start their VM or wait for an admin.
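
The connection flow above can be written down as a tiny state sketch. Everything here is illustrative: the `PowerState` enum, the `connect` function, and the step strings are invented for this example and do not correspond to any real AVD API.

```python
from enum import Enum


class PowerState(Enum):
    DEALLOCATED = "deallocated"
    RUNNING = "running"


def connect(vm_state: PowerState, start_vm_on_connect: bool) -> list:
    """Return the ordered steps a connection attempt goes through."""
    steps = ["user selects desktop"]
    if vm_state is PowerState.DEALLOCATED:
        if not start_vm_on_connect:
            # Without the feature, someone has to start the VM by hand first.
            return steps + ["connection fails: VM is deallocated"]
        steps += ["service starts VM", "user waits for boot"]
    return steps + ["connection completes"]


print(connect(PowerState.DEALLOCATED, start_vm_on_connect=True))
print(connect(PowerState.RUNNING, start_vm_on_connect=True))
```

The only visible difference for the user is the extra wait in the first case.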

How to Enable It

1. Go to the host pool properties
2. Set Start VM on Connect to Yes
3. Assign the Desktop Virtualization Power On Contributor role to the AVD service principal on the subscription or resource group

🌐 Priya’s cost win: NomadTech’s 200 remote workers each have a personal desktop. Before Start VM on Connect, all 200 VMs ran 24/7 — even though most people work 8-hour days across different time zones. After enabling the feature, VMs only run when someone actually connects. Combined with auto-shutdown after 2 hours of inactivity, Priya reduced compute costs by 55 percent. Ben (creative director) barely noticed — his VM takes about 90 seconds to start when he clicks his desktop.

Autoscaling vs Start VM on Connect

Feature	Autoscaling (Scaling Plans)	Start VM on Connect
Best for	Pooled host pools	Personal host pools
How it works	Starts/stops hosts based on schedule and demand	Starts a specific VM when a user connects
Trigger	Time schedule + capacity threshold	User connection attempt
Load balancing	Breadth-first or depth-first per phase	Not applicable (user has assigned VM)
User experience	Hosts already running — instant connection	1-2 minute wait while VM boots
Cost savings	30-60 percent typically	40-70 percent for personal pools
Configuration	Scaling plan with 4 phases	Single toggle on host pool
RBAC required	Power On Off Contributor	Power On Contributor
Can be combined?	Yes: scaling plans can manage pooled pools while Start VM on Connect covers personal pools	Yes: can coexist with scaling plans

Managing Active Sessions

As an AVD admin, you need to manage users who are currently connected. The Azure portal gives you these tools under each host pool’s Session hosts blade:

View active sessions — See every connected user, their session host, login time, and session state
Send a message — Push a notification to a user’s session (e.g., “Maintenance in 15 minutes, please save your work”)
Disconnect — End the RDP connection but keep the session alive on the host (user can reconnect and resume)
Log off — End the session completely, closing all apps
Drain mode — Prevent NEW connections to a specific host while existing sessions continue

Drain Mode

Drain mode is essential for maintenance. When you turn on drain mode for a session host:

No new users will be assigned to that host
Existing users remain connected and can keep working
When the last user logs off, the host is empty and safe to update or restart
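
Those three rules fit in a few lines of code. This is a toy model with invented names (`SessionHost`, `hosts_accepting_new_sessions`, `safe_to_patch`), meant only to show how the drain flag and the session count interact.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SessionHost:
    name: str
    sessions: int
    drain_mode: bool = False


def hosts_accepting_new_sessions(hosts: List[SessionHost]) -> List[str]:
    """Hosts in drain mode are skipped when placing new users."""
    return [h.name for h in hosts if not h.drain_mode]


def safe_to_patch(host: SessionHost) -> bool:
    """A host is safe to update once it is draining and its last user has left."""
    return host.drain_mode and host.sessions == 0


pool = [
    SessionHost("host-a", sessions=4, drain_mode=True),   # draining, still busy
    SessionHost("host-b", sessions=0, drain_mode=True),   # draining and empty
    SessionHost("host-c", sessions=6),                    # normal operation
]
print(hosts_accepting_new_sessions(pool))       # ['host-c']
print([h.name for h in pool if safe_to_patch(h)])  # ['host-b']
```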

🎧 Mia’s maintenance window: Mia needs to update the antivirus agent on session hosts at Horizons Health. She cannot just reboot them, because nurses are charting at all hours. She enables drain mode on one host at a time, waits for users to log off naturally (or sends a friendly message), patches the host, disables drain mode, and moves to the next. Zero downtime for the clinic.

Application Group Monitoring

Application groups control which desktops and apps users can access. From the portal, you can:

See which application groups are assigned to which users or groups
Check the number of active connections per application group
Identify unused application groups (candidates for cleanup)
Review which RemoteApps are published and their paths

This data also flows into AVD Insights if diagnostics are enabled, letting you track usage trends over time.

Exam Tip — Drain Mode vs Deallocate vs Delete

Know the difference. Drain mode keeps the VM running but blocks new sessions — use for maintenance. Deallocate stops the VM to save compute costs — the VM still exists and can be started. Delete removes the VM entirely — you lose the OS disk unless it is backed up. The exam may describe a scenario and ask which action to take.

Flashcards

Question

What are the four phases of an AVD scaling plan?

Answer

1. Ramp-up — gradually start hosts before peak. 2. Peak hours — full capacity. 3. Ramp-down — gradually stop hosts as demand drops. 4. Off-peak — minimum hosts running. Each phase has its own load-balancing algorithm, minimum host percentage, and capacity threshold.

Question

What RBAC role does a scaling plan need to start and stop VMs?

Answer

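Desktop Virtualization Power On Off Contributor. The role is assigned to the Azure Virtual Desktop service principal, and Microsoft's guidance is to assign it at subscription scope; a narrower scope such as a resource group may prevent autoscale from working properly. Without it, the scaling plan has no permission to change VM power state.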

Question

What is Start VM on Connect and which host pool type uses it?

Answer

Start VM on Connect automatically starts a deallocated VM when a user tries to connect to it. It is designed for personal host pools where each user has a dedicated VM. The user waits 1-2 minutes for the VM to boot, then their connection completes.

Question

What does drain mode do on a session host?

Answer

Drain mode prevents new user sessions from being assigned to that host, but existing sessions continue uninterrupted. It is used for maintenance — you drain the host, wait for users to finish, then safely patch or reboot it.

Knowledge Check

Priya wants to reduce costs for NomadTech's personal desktop pool. Users work across 12 time zones and there is no predictable peak hour. Which feature should she enable?

Knowledge Check

Raj's scaling plan is configured with a capacity threshold of 75 percent during peak hours. The host pool has 10 VMs, each with a max session limit of 10. Currently 6 VMs are running with a total of 48 active sessions. What will happen at the next evaluation?
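
Working the numbers in this question is a useful habit. The sketch below uses only the figures given in the scenario and the simple model "used sessions divided by capacity of running hosts"; how many hosts the real service starts per evaluation is not something this arithmetic can tell you.

```python
import math

running_hosts = 6
max_sessions_per_host = 10
active_sessions = 48
threshold_percent = 75

capacity = running_hosts * max_sessions_per_host      # 60 sessions
used_percent = active_sessions * 100 / capacity       # 80.0
over_threshold = used_percent > threshold_percent     # True

# Fewest running hosts that would bring usage back to the threshold or below.
hosts_needed = math.ceil(
    active_sessions / (max_sessions_per_host * threshold_percent / 100))

print(used_percent, over_threshold, hosts_needed)  # 80.0 True 7
```

At 80 percent the pool is over the 75 percent threshold, so under this model the plan would start at least one more of the four deallocated VMs.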

Knowledge Check

Which load-balancing algorithm should be used during the ramp-down phase and why?

Summary

Autoscaling and session management are how you balance cost and user experience. Use scaling plans for pooled host pools (four-phase schedule with capacity thresholds), Start VM on Connect for personal pools, and drain mode for safe maintenance. Get the load-balancing strategy right — breadth-first when ramping up, depth-first when winding down.

Next up: VMs do not patch themselves — Update Strategy and Backups.
