FT | Electric Monk

Tag Archive for FT

Test FT failover, secondary restart and app fault tolerance in a FT VM

February 6, 2013 | Objective 4 Business Continuity

Fault Tolerance failure scenarios

Fault Tolerance failovers are only triggered when communication between the Primary and Secondary VMs is lost.

Three scenarios may occur:

Deterministic

This is where you can predict how a failover will occur:

An ESXi host fails, which causes a complete host failover
The Primary VM process fails or becomes unresponsive on the ESXi host
A Fault Tolerance test is initiated from vCenter Server

Reactionary

This is where a failover may occur but you don’t know the outcome ahead of time. These events are not predictable because the Primary and Secondary VMs race to decide which one should be the live one. The race prevents a split-brain scenario, which can cause data corruption.

The Fault Tolerant NIC is interrupted or fails
The Fault Tolerant NIC communication is very slow

No action taken

This is where no failover occurs because Fault Tolerance does not monitor for this type of event:

Management network interruption or failure
VM network interruption or failure
HBA Failures that do not affect the entire host
Any combination of the above
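
The three categories above can be written down as a simple lookup. This is only a minimal sketch for revision purposes: the event labels are hypothetical names for the scenarios listed in this post, not identifiers from any VMware API.

```python
from enum import Enum


class Outcome(Enum):
    DETERMINISTIC = "deterministic"
    REACTIONARY = "reactionary"
    NO_ACTION = "no action taken"


# Hypothetical event labels for the scenarios listed above.
SCENARIOS = {
    "host_failure": Outcome.DETERMINISTIC,
    "primary_vm_process_failure": Outcome.DETERMINISTIC,
    "test_failover_from_vcenter": Outcome.DETERMINISTIC,
    "ft_logging_nic_failure": Outcome.REACTIONARY,
    "ft_logging_nic_slow": Outcome.REACTIONARY,
    "management_network_failure": Outcome.NO_ACTION,
    "vm_network_failure": Outcome.NO_ACTION,
    "partial_hba_failure": Outcome.NO_ACTION,
}


def classify(event):
    """Look up which of the three categories a single event falls into."""
    return SCENARIOS[event]


def may_trigger_failover(events):
    """True if at least one of the events is something FT reacts to."""
    return any(classify(e) is not Outcome.NO_ACTION for e in events)
```

For example, may_trigger_failover(["management_network_failure", "vm_network_failure"]) is False, which matches the "any combination of the above" line in the No action taken list.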

Testing Fault Tolerance

VMware provides a Test Failover function on the VM's Fault Tolerance menu, which is the best option for testing.

3 Tests

Select the Test Failover Function from the Fault Tolerance menu on the Primary VM

This tests the Fault Tolerance functionality in a fully supported and non-invasive way. In this scenario, the virtual machine fails over from Host A to Host B and a new Secondary VM is started. No VMware HA failover occurs in this case.

Host Failure

This can be accomplished by pulling the power cord of the host, rebooting the host, or powering off the host from a remote management interface such as iLO, DRAC, IMM or RSA. The Secondary VM on Host B takes over immediately and continues to process information for the VM. A VMware HA failover occurs.

Virtual Machine process on Host A fails

This scenario can be accomplished by logging into Host A and terminating the active process for the VM. The Secondary VM takes over and no VMware HA failover occurs. VMware does not recommend testing in this way.

Fault Tolerance

February 5, 2013 | Objective 4 Business Continuity

What is Fault Tolerance?

FT is the evolution of continuous availability. It uses VMware vLockstep technology to keep a Primary and a Secondary virtual machine in sync, and is based on the record/playback technology used in VMware Workstation. The non-deterministic events seen by the Primary VM are streamed to the Secondary VM, which then replays them deterministically. This means the two VMs match instruction for instruction and memory for memory, so their processing is identical.

Deterministic means that the processor executes the same instructions on the Secondary VM as on the Primary VM.

Non-deterministic refers to events such as network, disk, mouse and keyboard input, including hardware interrupts. These are recorded and played back as well.
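
The record and replay idea can be illustrated with a toy model. This is a sketch of the general technique only, not of how vLockstep is implemented: ToyVM, run_primary and replay_on_secondary are made-up names, and a random number stands in for a non-deterministic input such as an interrupt or a network packet.

```python
import random


class ToyVM:
    """A toy 'VM' whose state depends only on the inputs it is given."""

    def __init__(self):
        self.state = 0

    def step(self, event):
        # Deterministic: the same event sequence always yields the same state.
        self.state = (self.state * 31 + event) % 1_000_003


def run_primary(vm, n_events, rng):
    """Run the primary and record every non-deterministic input in a log."""
    log = []
    for _ in range(n_events):
        event = rng.randrange(256)  # stands in for an interrupt, packet, keystroke
        log.append(event)
        vm.step(event)
    return log


def replay_on_secondary(vm, log):
    """Feed the recorded inputs to the secondary in the same order."""
    for event in log:
        vm.step(event)


primary, secondary = ToyVM(), ToyVM()
log = run_primary(primary, 1000, random.Random())
replay_on_secondary(secondary, log)
print(primary.state == secondary.state)  # the two copies end in the same state
```

Because the only unpredictable part (the inputs) is captured in the log, the secondary does not need to see the original hardware events to reach the same state as the primary.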

The Primary and Secondary VMs continuously exchange heartbeats. This exchange allows the virtual machine pair to monitor the status of one another to ensure that Fault Tolerance is continually maintained. A transparent failover occurs if the host running the Primary VM fails, in which case the Secondary VM is immediately activated to replace the Primary VM. A new Secondary VM is started and Fault Tolerance redundancy is reestablished within a few seconds. If the host running the Secondary VM fails, it is also immediately replaced. In either case, users experience no interruption in service and no loss of data

Fault Tolerance avoids “split-brain” situations, which can lead to two active copies of a virtual machine after recovery from a failure. Atomic file locking on shared storage is used to coordinate failover so that only one side continues running as the Primary VM and a new Secondary VM is respawned automatically.
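
The principle behind using an atomic file operation to settle the race can be sketched as below. This is an illustration under stated assumptions, not VMware's implementation: the lock file name is hypothetical, a temporary directory stands in for shared storage, and exclusive file creation (O_CREAT with O_EXCL) stands in for the atomic operation on the datastore.

```python
import os
import tempfile


def try_acquire(lock_path):
    """Atomically create the lock file; only one caller can ever succeed."""
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def resolve_race(lock_path, contenders):
    """Return the contenders that obtained the lock (at most one)."""
    return [name for name in contenders if try_acquire(lock_path)]


with tempfile.TemporaryDirectory() as shared_dir:
    lock = os.path.join(shared_dir, "ft-failover.lock")  # hypothetical file name
    winners = resolve_race(lock, ["primary", "secondary"])

print(winners)  # exactly one side is allowed to continue as the Primary VM
```

Whichever side creates the file first goes live; the other side sees that the file already exists and stands down, so two active copies can never run at once.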

Use Cases

Applications that need to be available at all times, especially those that have long-lasting client connections that users want to maintain during hardware failure.
Custom applications that have no other way of doing clustering.
Cases where high availability might be provided through custom clustering solutions, which are too complicated to configure and maintain.
On demand protection for VMs running end of month reports or financials

Best Practices for Fault Tolerance

To ensure optimal Fault Tolerance results, VMware recommends that you follow certain best practices. In addition to the following information, see the white paper VMware Fault Tolerance Recommendations and Considerations at http://www.vmware.com/resources/techresources/10040

Requirements for FT

Cluster Requirements
Host Requirements
VM Requirements

Cluster Requirements

Host certificate checking must be enabled. This is the default in vSphere 4.1, but you may need to enable it manually (vCenter Server Settings > SSL Settings > select "vCenter requires verified host SSL certificates")
The cluster must have at least 2 ESXi hosts running the same FT Version or build number
HA must be enabled on the cluster
EVC must be enabled if you want to use FT in conjunction with DRS; without EVC, DRS is disabled for the fault tolerant VMs

Host Requirements

The ESXi hosts must have access to the same datastores and networks
The ESXi hosts must have an FT Logging network set up
The FT Logging network must have at least 1 Gbit/s connectivity
NICs can be shared if necessary
The ESXi hosts' CPUs must be FT compatible
Hosts must be licensed for FT
Hardware Virtualisation must be enabled in the BIOS of the hosts to enable CPU support for FT
It is recommended that Power Management is turned off in the BIOS. This helps ensure uniformity in the CPU speeds

VM Requirements

Only VMs with a single CPU are supported
VMs must be running a supported O/S
VMs must be stored on shared storage available to all hosts
FC, iSCSI, FCoE and NFS are supported
A VM's disks must be in eager zeroed thick format or be virtual mode RDMs (physical mode RDMs are not supported)
No VM snapshots
The VM must not be a linked clone
No USB, Sound devices, serial ports or parallel ports configured
The VM cannot use NPIV
Nested Page Tables/Extended Page Tables are not supported
The VM cannot use NIC Passthrough
The VM cannot use the older vlance drivers
No CD-ROM or floppy devices attached
The VM cannot use a paravirtualised kernel
VMs must be on the correct Monitor Mode
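
A checklist like the one above lends itself to a simple pre-flight check. The sketch below covers a subset of the VM requirements. VMConfig and its field names are hypothetical; in practice the values would have to be gathered from vCenter (for example with PowerCLI), which is not shown here.

```python
from dataclasses import dataclass, field

ALLOWED_DISK_FORMATS = {"eagerzeroedthick", "virtual_rdm"}
BLOCKED_DEVICES = {"usb", "sound", "serial", "parallel", "cdrom", "floppy"}


@dataclass
class VMConfig:
    """Hypothetical description of a VM; not a vSphere API object."""
    name: str
    num_vcpus: int = 1
    disk_formats: list = field(default_factory=lambda: ["eagerzeroedthick"])
    devices: list = field(default_factory=list)
    has_snapshots: bool = False
    is_linked_clone: bool = False
    uses_npiv: bool = False
    uses_nic_passthrough: bool = False


def ft_blockers(vm):
    """Return the requirements from the list above that this VM violates."""
    problems = []
    if vm.num_vcpus != 1:
        problems.append("only VMs with a single CPU are supported")
    for fmt in vm.disk_formats:
        if fmt not in ALLOWED_DISK_FORMATS:
            problems.append(f"disk format '{fmt}' is not supported")
    for dev in vm.devices:
        if dev in BLOCKED_DEVICES:
            problems.append(f"device '{dev}' must be removed")
    if vm.has_snapshots:
        problems.append("VM has snapshots")
    if vm.is_linked_clone:
        problems.append("VM is a linked clone")
    if vm.uses_npiv:
        problems.append("VM uses NPIV")
    if vm.uses_nic_passthrough:
        problems.append("VM uses NIC passthrough")
    return problems
```

For example, ft_blockers(VMConfig("app01", num_vcpus=2, disk_formats=["thin"], devices=["usb"])) returns three problems, while a VM with the default values returns an empty list.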

Caveats

You can use vMotion but not Storage vMotion, and therefore not Storage DRS (SDRS) either
Hot Plugging is not allowed
You cannot change the network settings while the VM is on
Because snapshots are not supported, you cannot use any backup mechanism that relies on snapshots. You can disable FT before backing up

Configure FT Networking for Host Machines

On each host that you want to add to a vSphere HA cluster, you must configure two different networking switches so that the host can also support vSphere Fault Tolerance.
To enable Fault Tolerance for a host, you must complete this procedure twice, once for each port group option to ensure that sufficient bandwidth is available for Fault Tolerance logging. Select one option, finish this procedure, and repeat the procedure a second time, selecting the other port group option.

Prerequisites

Multiple gigabit Network Interface Cards (NICs) are required. For each host supporting Fault Tolerance, you need a minimum of two physical gigabit NICs. For example, you need one dedicated to Fault Tolerance logging and one dedicated to vMotion.
VMware recommends three or more NICs to ensure availability.
The vMotion and FT logging NICs must be on different subnets.
IPv6 is not supported on the FT logging NIC.
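
The subnet and IPv6 prerequisites can be checked with Python's standard ipaddress module before any VMkernel ports are created. This is a minimal sketch; check_ft_networking is a made-up helper and the addresses in the example are placeholders.

```python
import ipaddress


def check_ft_networking(vmotion_cidr, ft_logging_cidr):
    """Check the planned vMotion and FT logging addresses (address/prefix form)."""
    problems = []
    vmotion = ipaddress.ip_interface(vmotion_cidr)
    ft = ipaddress.ip_interface(ft_logging_cidr)

    # IPv6 is not supported on the FT logging NIC.
    if ft.version != 4:
        problems.append("FT logging NIC must use IPv4")

    # The two VMkernel ports must sit in different subnets.
    if vmotion.version == ft.version and vmotion.network.overlaps(ft.network):
        problems.append("vMotion and FT logging must be on different subnets")

    return problems


print(check_ft_networking("192.168.10.11/24", "192.168.20.11/24"))  # no problems
print(check_ft_networking("192.168.10.11/24", "192.168.10.12/24"))  # same subnet
```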

Procedure

Connect vSphere Client to vCenter Server.
In the vCenter Server inventory, select the host and click the Configuration tab.
Select Networking under Hardware, and click the Add Networking link
The Add Network wizard appears.
Select VMkernel under Connection Types and click Next.
Select Create a virtual switch and click Next.
Provide a label for the switch.
Select either Use this port group for vMotion or Use this port group for Fault Tolerance logging and click Next.
Provide an IP address and subnet mask and click Next.

Click Finish.

Networking Example

vMotion and FT Logging can share the same VLAN (configure the same VLAN number in both port groups), but require their own unique IP addresses residing in different IP subnets. However, separate VLANs might be preferred if Quality of Service (QoS) restrictions are in effect on the physical network with VLAN based QoS. QoS is of particular use where competing traffic comes into play, for example, where multiple physical switch hops are used or when a failover occurs and multiple traffic types compete for network resources.

This example uses four port groups configured as follows:

VLAN A: Virtual Machine Network Port Group: active on vmnic2 (to physical switch #1); standby on vmnic0 (to physical switch #2)
VLAN B: Management Network Port Group: active on vmnic0 (to physical switch #2); standby on vmnic2 (to physical switch #1)
VLAN C: vMotion Port Group: active on vmnic1 (to physical switch #2); standby on vmnic3 (to physical switch #1)
VLAN D: FT Logging Port Group: active on vmnic3 (to physical switch #1); standby on vmnic1 (to physical switch #2)
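
The layout above can be modelled as data to see what happens when a physical switch is lost. This is a sketch built only from the four port groups in this example; the dictionary and function names are made up for illustration.

```python
# vmnic -> physical switch number, as given in the example above.
UPLINK_SWITCH = {"vmnic0": 2, "vmnic1": 2, "vmnic2": 1, "vmnic3": 1}

PORT_GROUPS = {
    "Virtual Machine Network": {"vlan": "A", "active": "vmnic2", "standby": "vmnic0"},
    "Management Network": {"vlan": "B", "active": "vmnic0", "standby": "vmnic2"},
    "vMotion": {"vlan": "C", "active": "vmnic1", "standby": "vmnic3"},
    "FT Logging": {"vlan": "D", "active": "vmnic3", "standby": "vmnic1"},
}


def is_switch_redundant(pg):
    """A port group survives a switch failure if its uplinks use different switches."""
    return UPLINK_SWITCH[pg["active"]] != UPLINK_SWITCH[pg["standby"]]


def uplinks_after_switch_failure(failed_switch):
    """Return the uplink each port group would use once a switch is down."""
    result = {}
    for name, pg in PORT_GROUPS.items():
        if UPLINK_SWITCH[pg["active"]] != failed_switch:
            result[name] = pg["active"]
        elif UPLINK_SWITCH[pg["standby"]] != failed_switch:
            result[name] = pg["standby"]
        else:
            result[name] = None  # no surviving uplink
    return result


print(all(is_switch_redundant(pg) for pg in PORT_GROUPS.values()))
print(uplinks_after_switch_failure(1))
```

With physical switch #1 down, FT Logging fails over to vmnic1, which is also the active vMotion uplink. That is the situation described above where multiple traffic types compete for the same NIC and QoS becomes useful.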

Instructions for setup

Connect to vCenter Server using the vSphere Client or the vSphere Web Client
Right click the VM you want to use for FT and select Fault Tolerance > Turn on Fault Tolerance

A message is displayed.

vSphere Fault Tolerance Configuration Recommendations

VMware recommends that you observe certain guidelines when configuring Fault Tolerance.

In addition to non-fault tolerant virtual machines, you should have no more than four fault tolerant virtual machines (primaries or secondaries) on any single host. The number of fault tolerant virtual machines that you can safely run on each host is based on the sizes and workloads of the ESXi host and virtual machines, all of which can vary.
If you are using NFS to access shared storage, use dedicated NAS hardware with at least a 1Gbit NIC to obtain the network performance required for Fault Tolerance to work properly.
Ensure that a resource pool containing fault tolerant virtual machines has excess memory above the memory size of the virtual machines. The memory reservation of a fault tolerant virtual machine is set to the virtual machine’s memory size when Fault Tolerance is turned on. Without this excess in the resource pool, there might not be any memory available to use as overhead memory.
Use a maximum of 16 virtual disks per fault tolerant virtual machine.
To ensure redundancy and maximum Fault Tolerance protection, you should have a minimum of three hosts in the cluster. In a failover situation, this provides a host that can accommodate the new Secondary VM that is created.
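
The numeric guidelines above (four FT VMs per host, 16 virtual disks per FT VM, three hosts per cluster) can be checked against a planned layout. This is a sketch only: the function name and the input dictionaries are hypothetical, and the limits are the recommendations quoted in this post rather than values read from vCenter.

```python
MAX_FT_VMS_PER_HOST = 4
MAX_DISKS_PER_FT_VM = 16
MIN_HOSTS_IN_CLUSTER = 3


def check_ft_recommendations(ft_vms_per_host, disks_per_ft_vm):
    """ft_vms_per_host: {host: number of FT primaries and secondaries}
    disks_per_ft_vm: {vm: number of virtual disks}"""
    warnings = []
    if len(ft_vms_per_host) < MIN_HOSTS_IN_CLUSTER:
        warnings.append(
            f"cluster has {len(ft_vms_per_host)} hosts, "
            f"at least {MIN_HOSTS_IN_CLUSTER} recommended"
        )
    for host, count in ft_vms_per_host.items():
        if count > MAX_FT_VMS_PER_HOST:
            warnings.append(f"{host} runs {count} FT VMs (max {MAX_FT_VMS_PER_HOST})")
    for vm, disks in disks_per_ft_vm.items():
        if disks > MAX_DISKS_PER_FT_VM:
            warnings.append(f"{vm} has {disks} virtual disks (max {MAX_DISKS_PER_FT_VM})")
    return warnings
```

For example, check_ft_recommendations({"esx1": 5, "esx2": 0}, {"db01": 17}) returns three warnings: too few hosts, too many FT VMs on esx1, and too many disks on db01.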
