High Availability for ASCS/SCS
Configure HA for SAP Central Services including enqueue replication with ENSA1 vs ENSA2, Pacemaker on SUSE and RHEL, WSFC on Windows, Azure Load Balancer configuration, and SBD vs Azure Fence Agent fencing.
Protecting the heart of SAP
🛡️ Lars pulls up the architecture diagram. "ASCS is the single most important thing to protect. If the enqueue server goes down and we lose the lock table, every user's unsaved transaction is at risk. Dr. Schmidt will not accept that for GlobalPharma."
⚙️ Mei agrees. "The good news is that SAP designed a replication mechanism specifically for this. The Enqueue Replication Server (ERS) keeps a copy of the lock table on a second node. If the primary ASCS fails, the secondary picks up the lock table and users do not lose their work."
Think of it like a backup cashier at a busy store.
The main cashier (ASCS) tracks every open transaction. A backup cashier (ERS) writes down everything the main cashier does in real time. If the main cashier suddenly has to leave, the backup already knows every open transaction and takes over instantly. No customer loses their purchase. With the old system (ENSA1), the backup kept a copy, but the handoff only went smoothly if the replacement stepped in at the backup's own register; otherwise entries could be lost. With the new system (ENSA2), the backup's list can be handed reliably to any register.
Architecture diagram: Open the ASCS/SCS HA Cluster diagram in Excalidraw to see the clustering layout with ERS, shared storage, Azure LB, and STONITH fencing.
ENSA1 vs ENSA2
| Feature | ENSA1 (Legacy) | ENSA2 (Current) |
|---|---|---|
| Lock table replication | ERS keeps a replicated lock table in shared memory on its own node | ERS keeps an independent copy that ASCS can read back over the network |
| Lock preservation on failover | Locks survive only if ASCS fails over to the node running ERS; otherwise they are lost | All locks preserved; seamless failover |
| SAP version | Older NetWeaver releases | S/4HANA default, available from NetWeaver 7.52 |
| ERS behavior | ASCS must follow ERS: it fails over to the node where ERS runs, and ERS then restarts elsewhere | ASCS can start on any node and recovers the lock table from ERS over the network |
| Complexity | Simpler but less reliable | Slightly more complex but robust |
| Exam focus | Know it exists and its limitations | Primary focus – this is the modern standard |
🛡️ Lars checks the SAP version. "GlobalPharma runs S/4HANA, so we use ENSA2 by default?"
⚙️ Mei confirms. "Yes. ENSA2 is the default for S/4HANA. ENSA1 is only relevant if you are running older NetWeaver systems that have not been updated."
Exam tip: ENSA2 is the modern answer
When the exam asks about enqueue replication for S/4HANA or modern SAP systems, ENSA2 is always correct. ENSA1 questions typically describe legacy NetWeaver systems. The key difference to remember: with ENSA2, ASCS can fail over to any cluster node and rebuild the lock table from ERS over the network (no lock loss); with ENSA1, locks survive only if ASCS fails over to the node running ERS.
⚠️ Recently changed – exam alert
ENSA2 (Enqueue Server 2) is the current standard and is mandatory for S/4HANA. The older ENSA1 is only relevant for legacy ECC systems. If an exam question asks about the recommended enqueue replication approach for a new S/4HANA deployment, ENSA2 with standalone ERS is always the correct answer. ENSA1 may appear as a distractor.
Linux HA: Pacemaker for ASCS
On Linux (SUSE SLES or Red Hat RHEL), ASCS HA uses Pacemaker, an open-source cluster resource manager:
Cluster components:
- Two VMs – one runs ASCS, the other runs ERS
- Pacemaker – manages which node runs which service
- Corosync – provides cluster communication and membership
- STONITH – the fencing mechanism (SBD or Azure Fence Agent)
- Azure Load Balancer – routes traffic to the active ASCS node
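In practice, the cluster pieces above are wired together as Pacemaker resource definitions. A minimal sketch in SUSE `crmsh` syntax, assuming a hypothetical SID `GP1` with ASCS instance 00 and virtual hostname `sapascs` (names, paths, and timeouts are illustrative, not a complete build):

```shell
# Sketch only: register the ASCS instance as a Pacemaker resource (SUSE crmsh).
# SID "GP1", instance 00, and hostname "sapascs" are assumed example values.
crm configure primitive rsc_sap_GP1_ASCS00 SAPInstance \
  op monitor interval=11 timeout=60 on-fail=restart \
  params InstanceName=GP1_ASCS00_sapascs \
         START_PROFILE="/sapmnt/GP1/profile/GP1_ASCS00_sapascs" \
         AUTOMATIC_RECOVER=false \
  meta resource-stickiness=5000
```

A matching ERS primitive plus colocation and ordering constraints keep ASCS and ERS on separate nodes; Microsoft's distribution-specific guides list the complete configuration.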
How failover works:
1. The active ASCS node fails (VM crash, OS issue, or the ASCS process dies)
2. Pacemaker detects the failure through its resource monitor operations
3. STONITH fences the failed node (Azure Fence Agent restarts the VM via the Azure API; SBD delivers a poison pill via the shared disk and the node resets itself)
4. Pacemaker moves ASCS to the surviving node (which was running ERS)
5. ERS moves to the recovered node
6. The Azure Load Balancer health probe detects the move and redirects traffic
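This failover sequence should be exercised deliberately during cluster validation. A hedged sketch using SUSE `crmsh` commands, where `rsc_sap_GP1_ASCS00` is a hypothetical resource name:

```shell
# One-shot view of which node currently runs ASCS and ERS
crm_mon -r -1

# Planned failover test: move ASCS to the other node, then watch ERS relocate
crm resource move rsc_sap_GP1_ASCS00 <target-node>

# Important: remove the location constraint the move created,
# or ASCS stays pinned to that node
crm resource clear rsc_sap_GP1_ASCS00
```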
SUSE vs RHEL differences
Both SUSE and RHEL support Pacemaker for SAP, but they have different cluster agent names, configuration syntax, and supported fencing approaches:
- SUSE uses the `sapstartsrv` and `SAPInstance` resource agents
- RHEL uses the `SAPInstance` resource agent with `sap_redhat_cluster_connector`
- Both require the distribution's HA extension for SAP (SLES for SAP, RHEL for SAP)
- Configuration guides are distribution-specific – Microsoft publishes separate guides for each
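The same logical resource looks different in each distribution's CLI. A side-by-side sketch with hypothetical names (SID `GP1`, instance 00; parameters trimmed for brevity):

```shell
# SUSE (crmsh syntax):
crm configure primitive rsc_sap_GP1_ASCS00 SAPInstance \
    params InstanceName=GP1_ASCS00_sapascs

# RHEL (pcs syntax):
pcs resource create rsc_sap_GP1_ASCS00 SAPInstance \
    InstanceName=GP1_ASCS00_sapascs
```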
Windows HA: WSFC for ASCS
On Windows, ASCS HA uses Windows Server Failover Clustering (WSFC):
- Azure Shared Disk or SOFS (Scale-Out File Server) for shared storage
- Cluster role for ASCS with a virtual network name
- Azure Load Balancer with health probe (same concept as Linux)
- No STONITH needed – WSFC handles fencing through quorum arbitration instead (typically with a cloud witness in Azure)
Windows vs Linux ASCS HA
The exam focuses more on Linux (Pacemaker) than Windows (WSFC) for ASCS HA because HANA itself runs only on Linux, so most modern SAP landscapes are Linux-based. However, know that WSFC is the Windows equivalent and uses Azure Shared Disk for shared storage instead of NFS.
SBD vs Azure Fence Agent
| Feature | SBD (STONITH Block Device) | Azure Fence Agent |
|---|---|---|
| How it works | Uses a shared disk for fencing messages – nodes write 'poison pills' | Calls the Azure REST API to deallocate or restart the failed VM |
| Storage required | Azure Shared Disk (small, dedicated for SBD) | No additional storage needed |
| Network dependency | Works even if network is partitioned (uses shared disk) | Requires network access to Azure API endpoints |
| Setup complexity | Moderate – configure the SBD device and iSCSI/shared disk | Simpler – configure a managed identity and permissions |
| SUSE support | Yes | Yes |
| RHEL support | Yes | Yes |
| Exam tip | Know that SBD uses shared storage for fencing | Know that it calls the Azure API and needs network access |
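In configuration terms, the two fencing options look roughly like this. A hedged sketch: the disk path, resource group, subscription ID, and VM names are placeholders, and the Azure Fence Agent example assumes a managed identity with permission to restart the cluster VMs:

```shell
# SBD: initialize a small shared disk once, then register a fencing
# resource (SUSE crmsh shown; device path is a placeholder)
sbd -d /dev/disk/by-id/<sbd-disk-id> create
crm configure primitive stonith-sbd stonith:external/sbd \
    params pcmk_delay_max=30

# Azure Fence Agent: no extra disk; fences by calling the Azure API
# (RHEL pcs shown, managed identity assumed)
pcs stonith create rsc_st_azure fence_azure_arm msi=true \
    resourceGroup=<resource-group> subscriptionId=<subscription-id> \
    pcmk_host_map="node1:vm-node1;node2:vm-node2"
```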
🛡️ Lars evaluates. "Azure Fence Agent is simpler, but SBD works even if the network is down. For GlobalPharma's compliance requirements, I prefer SBD – belt and suspenders."
⚙️ Mei agrees. "SBD is more robust in network-partition scenarios. But Azure Fence Agent is easier to set up and works well for most deployments. The exam may test both."
Azure Load Balancer for ASCS
The Load Balancer configuration for ASCS HA:
- Standard SKU, internal – SAP cluster IPs are always private
- Floating IP enabled – mandatory for the virtual cluster IP to work
- Health probe on port 620xx – where xx is the ASCS instance number (e.g., 62000 for instance 00)
- Health probe on port 621xx – for the ERS instance
- HA ports rule – forwards all ports to the active node
- Idle timeout – set to 30 minutes for long-running SAP connections
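The settings above map onto a couple of Azure CLI calls. A sketch with hypothetical resource names (`rg-sap`, `lb-ascs`, frontend and pool names), assuming the internal Standard load balancer and its backend pool already exist:

```shell
# Health probe on 620xx for ASCS instance 00 (port 62000)
az network lb probe create --resource-group rg-sap --lb-name lb-ascs \
    --name ascs-health-probe --protocol tcp --port 62000 --interval 5

# HA-ports rule (frontend/backend port 0 = all ports) with Floating IP
# enabled and a 30-minute idle timeout
az network lb rule create --resource-group rg-sap --lb-name lb-ascs \
    --name ascs-ha-rule --protocol All --frontend-port 0 --backend-port 0 \
    --frontend-ip-name ascs-frontend --backend-pool-name ascs-backend-pool \
    --probe-name ascs-health-probe --floating-ip true --idle-timeout 30
```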
Exam tip: Health probe ports
The exam loves testing ASCS Load Balancer health probe ports. The pattern is 620xx for ASCS and 621xx for ERS, where xx is the SAP instance number. For a multi-SID setup, each SID needs its own frontend IP and health probe on distinct ports.
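The 620xx/621xx pattern is simple enough to express as two tiny helpers; a sketch assuming the instance number is passed as a two-digit string:

```shell
# Build health-probe ports from a two-digit SAP instance number (e.g. "01")
ascs_probe_port() { printf '620%s' "$1"; }
ers_probe_port()  { printf '621%s' "$1"; }

ascs_probe_port 01; echo   # ASCS instance 01 -> probe port 62001
ers_probe_port 01; echo    # ERS for the same SID -> probe port 62101
```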
Knowledge check
GlobalPharma runs S/4HANA and needs ASCS HA. Which enqueue replication architecture should Lars configure?
Lars needs a STONITH mechanism that works even if network connectivity between the two ASCS nodes is lost. What should he choose?
What port should Lars configure for the Azure Load Balancer health probe for ASCS instance number 01?
Lars is comparing clustering options for different OS platforms in GlobalPharma's SAP landscape. On which operating system does ASCS HA use Windows Server Failover Clustering (WSFC)?
Summary
You now know how to protect ASCS/SCS, the most critical SPOF in SAP. ENSA2 provides active lock table replication for seamless failover, Pacemaker handles Linux clustering (WSFC for Windows), SBD and Azure Fence Agent provide STONITH fencing, and Azure Load Balancer routes traffic to the active node. The health probe port pattern (620xx/621xx) is an exam favorite.
Next, we protect the other critical SPOF: the HANA database itself, using HANA System Replication for high availability.
🎬 Video coming soon