HANA System Replication for HA
Configure HANA System Replication (HSR) for high availability with synchronous replication, active/read-enabled secondary, Pacemaker resource agents SAPHana and SAPHanaTopology, hook scripts, and Azure Load Balancer integration.
Protecting the HANA database
Lars points to the architecture diagram. "We have ASCS protected. Now the database. If our HANA VM fails, GlobalPharma loses access to all data. Dr. Schmidt wants automatic failover: no manual intervention, no 30-minute restart from disk."
Mei draws two HANA nodes. "That is exactly what HANA System Replication gives us. HSR continuously copies data from the primary HANA instance to a secondary instance on a separate VM. If the primary fails, Pacemaker triggers failover and the secondary becomes the new primary, usually within minutes."
Think of it like two identical notebooks.
You write every transaction in your primary notebook. A colleague sits next to you, copying every entry into their notebook in real time (synchronous replication). If you suddenly cannot continue, your colleague already has an identical copy and takes over immediately. With the "read-enabled" option, your colleague can even answer questions from their notebook while they are still copying, so the backup is not just sitting idle.
Architecture diagram: Open the HANA HA Cluster diagram in Excalidraw to see the full Pacemaker cluster with HSR, Azure Load Balancer, and STONITH fencing.
HSR replication modes for HA
HANA System Replication supports multiple modes. For HA, you need to know:
SYNC (synchronous): the primary waits for the secondary to acknowledge the write before committing. Guarantees zero data loss (RPO=0). Recommended for HA.
SYNCMEM (synchronous in-memory): the primary waits for the secondary to receive the data in memory (but not persist it to disk). Slightly faster than SYNC, with RPO=0 when only the primary fails. Note: in the unlikely event of a simultaneous dual-node failure, in-flight data that has not been persisted on the secondary could be lost.
ASYNC (asynchronous): the primary does not wait for the secondary. Used for DR across regions where latency makes synchronous replication impractical. Potential data loss (RPO > 0).
| Feature | SYNC | SYNCMEM | ASYNC |
|---|---|---|---|
| Primary waits for secondary | Yes, write to disk | Yes, write to memory | No, fire and forget |
| Data loss on failover (RPO) | Zero | Zero when only the primary fails (simultaneous dual failure could lose in-flight data) | Possible, depends on lag |
| Performance impact | Higher latency on writes | Moderate latency | Minimal impact on primary |
| Network requirement | Low latency (same region/zone) | Low latency (same region/zone) | Tolerates higher latency (cross-region) |
| Use case | HA within a region | HA within a region (alternative to SYNC) | DR across regions |
| Exam recommendation | Primary choice for HA | Know it exists as HA alternative | Know it for DR scenarios |
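The table can be read as a small decision rule. The sketch below is only an illustration of that rule; the function name `choose_replication_mode` and its two parameters are hypothetical and are not part of any SAP or Azure tooling.

```python
def choose_replication_mode(cross_region: bool,
                            prefer_lower_write_latency: bool = False) -> str:
    """Pick an HSR replication mode following the comparison table above.

    Both parameters are illustrative assumptions, not real HANA settings.
    """
    if cross_region:
        # Cross-region latency makes waiting for the secondary impractical,
        # so the table lists ASYNC for DR scenarios.
        return "ASYNC"
    if prefer_lower_write_latency:
        # The secondary acknowledges once the data is received in memory.
        return "SYNCMEM"
    # The secondary acknowledges once the data is written to disk.
    return "SYNC"


print(choose_replication_mode(cross_region=False))
print(choose_replication_mode(cross_region=False, prefer_lower_write_latency=True))
print(choose_replication_mode(cross_region=True))
```

The default branch returns SYNC because the table marks it as the primary choice for HA within a region.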
Exam tip: SYNC + memory preload for HA
When the exam asks about HANA HA configuration, the answer is synchronous replication (SYNC or SYNCMEM) with memory preload enabled on the secondary. Memory preload means the secondary loads data tables into memory so it can serve queries immediately after takeover, reducing RTO significantly.
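As a rough illustration, memory preload is typically controlled by the `preload_column_tables` parameter in the `[system_replication]` section of `global.ini`. The sketch below assumes that parameter name and checks a configuration fragment passed in as text; the helper `preload_enabled` is hypothetical, and a real `global.ini` may contain constructs that Python's `configparser` does not handle.

```python
import configparser
from typing import Optional


def preload_enabled(global_ini_text: str) -> Optional[bool]:
    """Return True/False if preload is set explicitly, None if it is not set.

    Assumes the setting lives in [system_replication] as
    preload_column_tables, which is an assumption of this sketch.
    """
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(global_ini_text)
    if not parser.has_option("system_replication", "preload_column_tables"):
        return None  # not set explicitly; HANA's default applies
    return parser.getboolean("system_replication", "preload_column_tables")


# Hypothetical fragment, not taken from a real system.
sample = """
[system_replication]
preload_column_tables = true
"""
print(preload_enabled(sample))
```

Returning `None` for a missing entry keeps "not configured" distinct from "explicitly disabled", which matters when reviewing a secondary before a failover test.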
Active/read-enabled secondary
Starting with HANA 2.0 SPS01, the secondary node in an HSR pair can serve read-only queries while actively receiving replication data. This is called the active/read-enabled secondary.
Benefits:
- Offload read-heavy reporting queries to the secondary
- Better utilization of the secondary VM (it is not just idle standby)
- Reduces load on the primary for better write performance
- No additional licensing cost for HANA Enterprise Edition
Limitations:
- Read queries on the secondary may see slightly stale data during replication lag
- If the secondary needs to take over as primary, read sessions are disconnected
- Only available with HANA 2.0 SPS01 or later
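For illustration, read access on the secondary is usually requested when the secondary is registered, by choosing the `logreplay_readaccess` operation mode instead of `logreplay`. The sketch below only builds the argument list for `hdbnsutil -sr_register` and does not run it; the host name, instance number, and site name in the example call are hypothetical.

```python
from typing import List

VALID_REPLICATION_MODES = {"sync", "syncmem", "async"}


def build_sr_register_args(remote_host: str,
                           remote_instance: str,
                           site_name: str,
                           replication_mode: str = "sync",
                           read_enabled: bool = True) -> List[str]:
    """Build (but do not execute) an hdbnsutil -sr_register command line."""
    if replication_mode not in VALID_REPLICATION_MODES:
        raise ValueError(f"unknown replication mode: {replication_mode}")
    # logreplay_readaccess makes the secondary readable while it replicates.
    operation_mode = "logreplay_readaccess" if read_enabled else "logreplay"
    return [
        "hdbnsutil", "-sr_register",
        f"--remoteHost={remote_host}",
        f"--remoteInstance={remote_instance}",
        f"--replicationMode={replication_mode}",
        f"--operationMode={operation_mode}",
        f"--name={site_name}",
    ]


# Hypothetical values for a two-node cluster.
print(" ".join(build_sr_register_args("hana-vm-0", "03", "SITE2")))
```

Building the arguments as a list rather than a string makes it easier to review the command, or to pass it to `subprocess.run` as the HANA administration user, without shell quoting issues.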
Lars considers. "So GlobalPharma's auditors can run their compliance reports against the secondary node without slowing down production?"
Mei nods. "Exactly. And if the primary fails, the secondary drops the read connections and becomes the new primary. It is a much better use of the standby hardware than having it sit idle."
Pacemaker for HANA HA
On Linux, Pacemaker automates HANA HSR failover using two specialized resource agents:
SAPHana β manages the HANA primary/secondary roles. It monitors HSR status and orchestrates takeover when the primary fails. This agent understands HANA-specific states and replication status.
SAPHanaTopology β monitors the HANA replication topology (which node is primary, which is secondary, replication status). It feeds information to SAPHana for decision-making.
Both agents work together:
- SAPHanaTopology continuously checks replication status on both nodes
- SAPHana uses this information to determine cluster health
- If the primary fails, SAPHana promotes the secondary to primary
- STONITH fences the failed node
- Azure Load Balancer health probe detects the new primary
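The interplay can be pictured with a deliberately simplified model. This is not the resource agents' actual logic: the `NodeState` class, the `decide` function, and the use of `SOK`/`SFAIL` as sync-state labels are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    """Snapshot of one node, as a topology monitor might report it."""
    name: str
    role: str           # "primary" or "secondary"
    hana_running: bool
    sync_state: str     # "SOK" (in sync) or "SFAIL" (out of sync); assumed labels


def decide(primary: NodeState, secondary: NodeState) -> str:
    """Toy takeover decision based on the collected topology information."""
    if primary.hana_running:
        return "no_action"
    # Promote only a secondary that is running and was in sync,
    # otherwise a takeover could lose committed data.
    if secondary.hana_running and secondary.sync_state == "SOK":
        return "takeover"
    return "no_takeover"


# Hypothetical node names.
failed = NodeState("hana-vm-0", "primary", hana_running=False, sync_state="SOK")
standby = NodeState("hana-vm-1", "secondary", hana_running=True, sync_state="SOK")
print(decide(failed, standby))
```

The point of the model is the division of labor: one component gathers state, the other acts on it, and a takeover is only attempted when the last known replication state makes it safe.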
Hook scripts
HANA hook scripts are Python scripts that HANA calls during replication events (takeover, registration, status changes). They integrate HANA's internal replication awareness with Pacemaker:
- SAPHanaSR hook: notifies Pacemaker about HSR status changes
- Runs inside the HANA process, providing faster notification than polling
- Reduces the time Pacemaker needs to detect a replication issue
- Must be configured on both primary and secondary nodes
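As a sketch of what "configured on both nodes" can look like, the hook is usually declared in `global.ini` on each node. The snippet below generates such a fragment; the section name, the default path `/usr/share/SAPHanaSR`, and the trace entry follow the SUSE packaging of SAPHanaSR as an assumption, and other distributions or versions use different values.

```python
import configparser
import io


def hook_config(path: str = "/usr/share/SAPHanaSR") -> str:
    """Render a global.ini fragment that registers the SAPHanaSR hook.

    Section names, keys, and the default path are assumptions based on the
    SUSE SAPHanaSR package layout; check your distribution's documentation.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case exactly as written
    parser["ha_dr_provider_SAPHanaSR"] = {
        "provider": "SAPHanaSR",
        "path": path,
        "execution_order": "1",
    }
    parser["trace"] = {"ha_dr_saphanasr": "info"}
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


print(hook_config())
```

Generating the same fragment for both nodes from one function is a simple way to avoid the two configurations drifting apart.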
Hook scripts improve failover speed
Without hook scripts, Pacemaker relies on periodic polling to detect HSR status changes. With hook scripts, HANA proactively notifies Pacemaker the moment a replication event occurs. This can reduce failover detection time from 30+ seconds to near-instant. The exam may test whether you know that hook scripts complement (not replace) the Pacemaker resource agents.
Azure Load Balancer for HANA
The Load Balancer configuration for HANA HA follows the same principles as ASCS:
- Standard SKU, internal: HANA cluster IPs are private
- Floating IP enabled: mandatory for the HANA virtual IP
- Health probe on port 625xx, where xx is the HANA instance number (e.g., 62503 for instance 03)
- Backend pool: contains both HANA VMs
- HA ports rule: forwards all HANA ports (3xx13, 3xx14, 3xx15, etc.) through a single rule
Applications connect to the Load Balancer's frontend IP, which always routes to the active HANA primary. After failover, the health probe detects the new primary and traffic switches automatically.
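The port patterns above (625xx and 3xxNN) are plain string substitution on the two-digit instance number. A minimal sketch, with hypothetical helper names:

```python
def _check_instance(instance_number: str) -> str:
    """Validate that the instance number is exactly two digits, e.g. "03"."""
    if len(instance_number) != 2 or not instance_number.isdigit():
        raise ValueError("instance number must be two digits, e.g. '03'")
    return instance_number


def health_probe_port(instance_number: str) -> int:
    """Load Balancer health probe port, pattern 625xx."""
    return int("625" + _check_instance(instance_number))


def hana_port(instance_number: str, suffix: str) -> int:
    """HANA service port, pattern 3xxNN (for example suffix "13" or "15")."""
    if len(suffix) != 2 or not suffix.isdigit():
        raise ValueError("suffix must be two digits, e.g. '15'")
    return int("3" + _check_instance(instance_number) + suffix)


print(health_probe_port("03"))   # 62503
print(hana_port("03", "15"))     # 30315
```

Taking the instance number as a string avoids losing the leading zero, which is easy to do when it is handled as an integer.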
Testing failover
Lars insists. "We need to test this before going live. Dr. Schmidt requires documented evidence that failover works."
Testing approaches:
- Graceful takeover: use `hdbnsutil -sr_takeover` to trigger a planned failover
- Simulate VM failure: stop the primary VM from the Azure portal
- Kill HANA process: terminate the HANA indexserver process to test Pacemaker detection
- Network isolation: block network between nodes to test fencing
- Document results: failover time, data loss verification, client reconnection behavior
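To document failover time, one option is to poll the Load Balancer frontend from a client machine during the test and record how long connections fail. The sketch below is one possible way to do that, not an SAP or Azure tool; `tcp_probe` and `measure_outage` are hypothetical helpers, and the clock and sleep functions are injectable so the logic can be checked without a real cluster.

```python
import socket
import time
from typing import Callable, Optional


def tcp_probe(host: str, port: int, timeout_s: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


def measure_outage(probe: Callable[[], bool],
                   timeout_s: float = 600.0,
                   interval_s: float = 1.0,
                   clock: Callable[[], float] = time.monotonic,
                   sleep: Callable[[float], None] = time.sleep) -> Optional[float]:
    """Seconds from the first failed probe to the next successful one.

    Returns None if no outage, or no recovery, is seen within timeout_s.
    The result is only as precise as interval_s.
    """
    start = clock()
    down_since = None
    while clock() - start < timeout_s:
        ok = probe()
        now = clock()
        if not ok and down_since is None:
            down_since = now          # outage begins
        elif ok and down_since is not None:
            return now - down_since   # service is reachable again
        sleep(interval_s)
    return None
```

A test run might call `measure_outage(lambda: tcp_probe("10.0.0.13", 30315))`, where the frontend IP and port are placeholders for your own values, start it before triggering the failure, and record the returned duration with the other test evidence. A successful TCP connection only shows that the port answers; data loss verification and client reconnection behavior still need separate checks.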
Knowledge check
GlobalPharma needs HANA HA with zero data loss. Which HSR configuration should Lars use?
Lars notices that Pacemaker takes 45 seconds to detect an HSR status change. What should he configure to improve detection speed?
GlobalPharma's auditors want to run compliance reports against the HANA database without impacting production users. What HANA feature enables this?
Lars is configuring the Pacemaker cluster for GlobalPharma's HANA HA setup. Which two Pacemaker resource agents are required for HANA HA on Azure?
Summary
You now know how to protect the HANA database: HSR with synchronous replication for zero data loss, memory preload for fast takeover, active/read-enabled secondary for report offloading, Pacemaker with SAPHana and SAPHanaTopology resource agents, hook scripts for fast detection, and Azure Load Balancer for traffic routing. Combined with the ASCS HA from the previous module, you have a fully protected SAP system.
Next, we take a deep dive into shared storage and Load Balancer configuration details that apply to both ASCS and HANA clusters.