Backup & Recovery for Compute
Azure Backup, VM snapshots, and Azure Site Recovery — design a compute backup strategy that matches your RPO/RTO requirements without breaking the budget.
Compute backup design
Backing up a VM is like taking a photo of your desk. Azure Backup takes consistent “photos” (snapshots) of your entire VM — OS, data, applications — so you can restore to that exact point if something breaks.
Three tools: Azure Backup (scheduled, policy-driven backups to a vault), VM snapshots (instant point-in-time copy of disks), and Azure Site Recovery (continuous replication for disaster recovery — different from backup).
Azure Backup vs snapshots vs Site Recovery
| Factor | Azure Backup | VM Disk Snapshots | Azure Site Recovery (ASR) |
|---|---|---|---|
| Purpose | Scheduled backup with retention | Instant point-in-time disk copy | Continuous replication for DR failover |
| RPO | Hours (policy-based schedule) | Manual — at time of snapshot | Minutes (continuous replication) |
| RTO | Hours (restore from vault) | Minutes (create VM from snapshot) | Minutes (orchestrated failover) |
| Retention | Days to years (configurable policy) | Manual management (no auto-delete) | Short-lived recovery points only (configurable, up to 15 days); no long-term history |
| Cross-region | Yes: cross-region restore from a GRS vault | Created in the source disk's region; incremental snapshots can be copied to another region manually | Yes: replicates to a secondary region |
| Application consistency | Yes — VSS integration for Windows, scripts for Linux | Crash-consistent only | Yes — application-consistent recovery points |
| Cost | Per protected instance + backup storage | Snapshot storage only | Per protected instance + replica storage; target-region compute is billed only while failed over |
| Best for | Operational recovery (accidental deletion, corruption) | Pre-change snapshots (before patching) | Regional disaster recovery (region outage) |
Critical distinction: Azure Backup and Azure Site Recovery solve different problems. Backup is for “I need yesterday’s data” (operational recovery). ASR is for “the entire region is down, fail over NOW” (disaster recovery). Most production VMs need both.
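The mapping from requirement to tool can be written down as a small decision helper. This is only an illustrative sketch: the `Requirements` class, its field names, and `recommend_tools` are hypothetical and not part of any Azure SDK. It simply encodes the "Best for" row of the table above.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    """Hypothetical requirement flags for one workload (not an Azure type)."""
    needs_point_in_time_restore: bool  # "I need yesterday's data"
    needs_regional_failover: bool      # "the entire region is down"
    needs_pre_change_rollback: bool    # quick rollback point before patching


def recommend_tools(req: Requirements) -> list[str]:
    """Map each requirement to the tool the comparison table lists as best for it."""
    tools = []
    if req.needs_point_in_time_restore:
        tools.append("Azure Backup")
    if req.needs_regional_failover:
        tools.append("Azure Site Recovery")
    if req.needs_pre_change_rollback:
        tools.append("VM disk snapshot")
    return tools


# A production VM that needs both operational recovery and regional DR.
production = Requirements(
    needs_point_in_time_restore=True,
    needs_regional_failover=True,
    needs_pre_change_rollback=False,
)
print(recommend_tools(production))  # ['Azure Backup', 'Azure Site Recovery']
```

The helper returns a list rather than a single answer because the requirements are independent: a workload that ticks two boxes needs two tools.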
Azure Backup policies
| Policy Setting | Options | Design Guidance |
|---|---|---|
| Frequency | Daily, hourly (Enhanced policy) | Daily for most; hourly for databases or critical apps |
| Retention | Days, weeks, months, years | 30 days operational + monthly/yearly for compliance |
| Instant restore | 1-5 days in the snapshot tier (Standard policy); up to 30 days with Enhanced policy | Restores from the local snapshot, which is faster than restoring from the vault |
| Cross-region restore | Enabled on GRS vaults | Enable for workloads needing regional DR from backup |
| Soft delete | 14-day default recovery window (configurable up to 180 days) | Always enable — protects against ransomware deleting backups |
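To make the soft-delete window concrete, the sketch below computes the last day on which deleted backup data could still be recovered. It is plain date arithmetic, not an Azure API call. The function names are hypothetical, and two details are assumptions for illustration: that the final day of the window is inclusive, and that 14 days is the lower bound as well as the default.

```python
from datetime import date, timedelta

DEFAULT_SOFT_DELETE_DAYS = 14  # default window from the table above
MAX_SOFT_DELETE_DAYS = 180     # upper bound from the table above


def recoverable_until(deleted_on: date,
                      retention_days: int = DEFAULT_SOFT_DELETE_DAYS) -> date:
    """Return the last day deleted backup data is assumed to be recoverable."""
    # Assumption: the default (14) is also the minimum allowed value.
    if not DEFAULT_SOFT_DELETE_DAYS <= retention_days <= MAX_SOFT_DELETE_DAYS:
        raise ValueError("retention_days must be between 14 and 180")
    return deleted_on + timedelta(days=retention_days)


def is_recoverable(deleted_on: date, today: date,
                   retention_days: int = DEFAULT_SOFT_DELETE_DAYS) -> bool:
    """True while 'today' is still inside the soft-delete window (inclusive)."""
    return today <= recoverable_until(deleted_on, retention_days)


print(recoverable_until(date(2024, 3, 1)))                  # 2024-03-15
print(is_recoverable(date(2024, 3, 1), date(2024, 3, 20)))  # False
```

With the default window, a deletion that goes unnoticed for three weeks is already unrecoverable, which is one reason to consider a longer window for critical workloads.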
🏗️ Priya’s backup design for GlobalTech’s production VMs:
- Daily backup at 2 AM (off-peak), retain 30 days
- Weekly checkpoint retained for 12 weeks
- Monthly checkpoint retained for 12 months
- Yearly checkpoint retained for 7 years (regulatory compliance)
- GRS vault with cross-region restore enabled
- Soft delete enabled, so backup data deleted by ransomware or a compromised account stays recoverable for the retention window
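The tiered retention in Priya's design can be modelled as a simple rule: a recovery point is kept as long as at least one tier it belongs to has not expired. The sketch below is a simplified model, not how Azure Backup evaluates policies internally. It assumes (hypothetically) that the weekly point is Sunday's backup, the monthly point is the backup from the 1st, and the yearly point is the backup from 1 January, and it approximates months and years in days.

```python
from datetime import date, timedelta

# Retention per tier, taken from the design above (months/years approximated).
RETENTION = {
    "daily": timedelta(days=30),
    "weekly": timedelta(weeks=12),
    "monthly": timedelta(days=365),     # 12 months, approximated
    "yearly": timedelta(days=7 * 365),  # 7 years, ignoring leap days
}


def tiers_for(point: date) -> set[str]:
    """Return the retention tiers a recovery point belongs to (assumed rules)."""
    tiers = {"daily"}
    if point.weekday() == 6:               # Sunday
        tiers.add("weekly")
    if point.day == 1:                     # first day of the month
        tiers.add("monthly")
    if point.day == 1 and point.month == 1:
        tiers.add("yearly")
    return tiers


def is_retained(point: date, today: date) -> bool:
    """A point survives if any tier it belongs to has not expired yet."""
    age = today - point
    return any(age <= RETENTION[tier] for tier in tiers_for(point))


today = date(2024, 6, 30)
print(is_retained(date(2024, 6, 10), today))  # True  (daily, 20 days old)
print(is_retained(date(2024, 5, 15), today))  # False (daily only, 46 days old)
print(is_retained(date(2024, 5, 19), today))  # True  (Sunday, weekly tier)
print(is_retained(date(2020, 1, 1), today))   # True  (yearly tier)
```

The model shows why the recovery points thin out with age: after 30 days only the weekly, monthly, and yearly points remain, which keeps long-term storage costs bounded.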
Azure Site Recovery for DR
ASR continuously replicates VMs to a secondary region:
| Feature | Detail |
|---|---|
| Replication | Continuous; crash-consistent recovery points are created about every 5 minutes |
| Failover | Orchestrated — run a recovery plan that starts VMs in sequence |
| Failback | Reverse replication back to primary after recovery |
| Recovery plans | Define VM startup order, pre/post scripts, manual steps |
| Test failover | Validate DR without affecting production (isolated network) |
🏦 Elena’s DR architecture for FinSecure Bank’s critical VMs:
- ASR replicates to the paired region (West Europe → North Europe)
- Recovery plan starts databases first, then application servers, then web frontends
- Monthly DR drill — test failover to validate the plan works
- Azure Backup runs alongside ASR — backup for operational recovery, ASR for regional DR
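The effect of startup ordering on recovery time can be estimated with a rough model of the recovery plan above. The sketch assumes that groups start strictly one after another and that VMs inside a group start in parallel, so each group takes as long as its slowest VM. The VM names and boot times are invented for illustration, and `RecoveryGroup`, `estimated_rto`, and `meets_rto` are hypothetical helpers, not ASR APIs.

```python
from dataclasses import dataclass


@dataclass
class RecoveryGroup:
    """One step of a recovery plan (hypothetical model, not an ASR type)."""
    name: str
    boot_minutes: dict[str, float]  # VM name -> estimated minutes until healthy


def estimated_rto(groups: list[RecoveryGroup]) -> float:
    """Sum the slowest VM of each group, since groups run sequentially."""
    return sum(max(g.boot_minutes.values(), default=0) for g in groups)


def meets_rto(groups: list[RecoveryGroup], target_minutes: float) -> bool:
    """Check the estimate against an RTO target."""
    return estimated_rto(groups) <= target_minutes


# Order mirrors the plan above: databases, then app servers, then web frontends.
plan = [
    RecoveryGroup("databases", {"sql-01": 6, "sql-02": 5}),
    RecoveryGroup("application servers", {"app-01": 4}),
    RecoveryGroup("web frontends", {"web-01": 2, "web-02": 3}),
]

print(estimated_rto(plan))   # 13
print(meets_rto(plan, 15))   # True
```

An estimate like this is only a starting point. The monthly test failover is what confirms whether the real plan fits inside the RTO.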
Exam tip: Backup ≠ DR — you often need both
A common exam question pattern: “The company needs to recover from accidental data deletion AND survive a regional outage.” The answer is Azure Backup (data recovery) PLUS Azure Site Recovery (regional DR). Neither alone covers both scenarios. Backup doesn’t keep VMs running during an outage. ASR keeps recovery points only for a short window (15 days at most), so it can’t take you back to a point from last month.
Knowledge check
🏗️ GlobalTech needs their production VMs to survive a regional outage (RTO: 15 minutes) AND be recoverable from accidental data corruption (recover to any point in the last 30 days). Which combination should Priya recommend?
🏛️ David's government agency needs to protect VM backups from a compromised admin account. Even a global administrator should not be able to delete or disable backups without a second authorisation. Which Azure Backup feature should David enable?
Next up: Compute is protected. Now let’s back up databases and unstructured data in Backup for Databases & Unstructured Data.