Tag Archive for cluster

Recap on Cluster Admission Control in vSphere 6.5/6.5U1

Cluster Admission Control in vSphere 6.5/6.5U1

vSphere HA uses admission control to ensure that sufficient resources are reserved for virtual machine recovery when a host fails. The basis for vSphere HA admission control is how many host failures your cluster is allowed to tolerate while still guaranteeing failover for the VMs onto the remaining hosts. The default admission control policy has changed from Slot Policy (the default until 6.5) to 'Cluster Resource Percentage'. VMware found that very few people were actually using Slot Policy, that those who were often used it incorrectly, and that it involved manual recalculations when hosts were added or removed.

Admission control imposes constraints on resource usage. Any action that might violate these constraints is not permitted. Actions that might be disallowed include the following examples:

  • Powering on a virtual machine
  • Migrating a virtual machine
  • Increasing the CPU or memory reservation of a virtual machine

Computing the Current Failover Capacity

The total resource requirements for the powered-on virtual machines consist of two components, CPU and memory. vSphere HA calculates these values as follows:

  • The CPU component is calculated by summing the CPU reservations of the powered-on virtual machines. If you have not specified a CPU reservation for a virtual machine, it is assigned a default value of 32 MHz (this value can be changed using the das.vmcpuminmhz advanced option).
  • The memory component is calculated by summing the memory reservation (plus memory overhead) of each powered-on virtual machine.

The total host resources available for virtual machines are calculated by adding together the hosts’ CPU and memory resources. These amounts are those contained in each host’s root resource pool, not the total physical resources of the host; resources used for virtualization purposes are not included. Only hosts that are connected, not in maintenance mode, and free of vSphere HA errors are considered.

The Current CPU Failover Capacity is computed by subtracting the total CPU resource requirements from the total host CPU resources and dividing the result by the total host CPU resources.

The Current Memory Failover Capacity is computed by subtracting the total memory resource requirements from the total host memory resources and dividing the result by the total host memory resources.
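
To make the two formulas above concrete, here is a minimal Python sketch of the calculation; all host and VM figures are invented for illustration, and it simply mirrors the arithmetic described in this section.

```python
# Minimal sketch of the failover-capacity calculation described above.
# All host and VM figures are illustrative, not taken from a real cluster.

DEFAULT_VM_CPU_MHZ = 32  # das.vmcpuminmhz default for VMs with no CPU reservation

# (cpu_reservation_mhz, memory_reservation_mb + memory_overhead_mb) per powered-on VM
powered_on_vms = [
    (0,    1024 + 100),   # no CPU reservation, so counted as 32 MHz
    (1000, 2048 + 150),
    (2000, 4096 + 200),
]

# Root-resource-pool capacity of connected, healthy hosts (virtualization
# overhead already excluded), not the raw physical capacity.
total_host_cpu_mhz = 3 * 18000      # three hosts, 18 GHz usable each
total_host_mem_mb  = 3 * 96 * 1024  # three hosts, 96 GB usable each

required_cpu = sum(cpu or DEFAULT_VM_CPU_MHZ for cpu, _ in powered_on_vms)
required_mem = sum(mem for _, mem in powered_on_vms)

cpu_failover_capacity = (total_host_cpu_mhz - required_cpu) / total_host_cpu_mhz
mem_failover_capacity = (total_host_mem_mb - required_mem) / total_host_mem_mb

print(f"Current CPU failover capacity:    {cpu_failover_capacity:.0%}")
print(f"Current memory failover capacity: {mem_failover_capacity:.0%}")
```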

Host Failures Cluster Tolerates

This option allows you to define the number of ESXi host failures the cluster should tolerate. vSphere HA automatically calculates the percentage of resources to reserve by applying the “Percentage of Cluster Resources” admission control policy (the default in vSphere 6.5), so the reserved failover capacity is directly related to the Host Failures Cluster Tolerates value. For example, with 2 ESXi hosts in the cluster and “Host failures cluster tolerates” set to 1, HA automatically reserves 50% of memory and 50% of CPU as failover capacity. With a 4-host cluster and 1 host failure to tolerate, it reserves 25%. Slot Policy used to be the default admission control policy; with vSphere 6.5, the default is now Cluster Resource Percentage.

If you add or remove ESXi hosts in the cluster, the percentage of failover capacity is automatically recalculated.

You have the option to override the failover capacity calculated from the number of host failures the cluster tolerates by selecting the Override option and specifying percentages for CPU and memory.
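
As a quick sketch of the arithmetic described above (assuming equal-sized hosts, which is the simple case discussed here), the reserved percentage is just the number of host failures to tolerate divided by the number of hosts:

```python
# Sketch only: how "Host failures cluster tolerates" maps to a reserved
# percentage under Cluster Resource Percentage, assuming equal-sized hosts.

def reserved_fraction(total_hosts: int, host_failures_to_tolerate: int) -> float:
    """Fraction of cluster CPU/memory reserved for failover."""
    return host_failures_to_tolerate / total_hosts

print(reserved_fraction(2, 1))  # 0.5  -> 50% reserved, as in the 2-host example
print(reserved_fraction(4, 1))  # 0.25 -> 25% reserved, as in the 4-host example
```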

 

Define host failover capacity by HA Slot Policy

You also have the option to choose “Slot Policy”, which was the default prior to vSphere 6.5. A slot is defined as the amount of memory and CPU that satisfies the reservation requirements of any powered-on virtual machine in the HA cluster. You have 2 options under Slot Policy:

  • Cover All powered-on Virtual Machines

This option calculates the slot size based on the largest CPU and memory reservation (plus overhead) of all powered-on virtual machines in the cluster. The calculation can therefore be skewed by a large reservation on a single VM, which is why the Fixed Slot Size option below exists to override calculations based on large reservations (see the sketch after these options).

  • Fixed Slot Size

You can explicitly specify the fixed slot size.
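
To illustrate how one large reservation can skew the “Cover all powered-on virtual machines” calculation, here is a rough Python sketch with invented figures; it mirrors the idea described above, not vSphere’s exact internal algorithm.

```python
# Rough sketch of the slot-size idea: the slot is sized from the largest CPU
# and memory reservations (plus overhead), so a single big VM inflates the
# slot for everyone. All figures are invented.

DEFAULT_VM_CPU_MHZ = 32

# (cpu_reservation_mhz, memory_reservation_mb + overhead_mb) per powered-on VM
vms = [(0, 1124), (1000, 2198), (8000, 16584)]   # note the one large VM

slot_cpu_mhz = max(cpu or DEFAULT_VM_CPU_MHZ for cpu, _ in vms)
slot_mem_mb  = max(mem for _, mem in vms)

# Slots per host: how many such slots fit in the host's CPU and memory,
# whichever is the more constraining.
host_cpu_mhz, host_mem_mb = 18000, 96 * 1024
slots_per_host = min(host_cpu_mhz // slot_cpu_mhz, host_mem_mb // slot_mem_mb)

print(f"Slot size: {slot_cpu_mhz} MHz / {slot_mem_mb} MB")
print(f"Slots per host: {slots_per_host}")
```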

 

Define host failover capacity by Dedicated Failover Hosts

This option allows you to designate one or more dedicated ESXi hosts in the cluster as failover hosts for the HA cluster. A dedicated failover host will not run virtual machines unless vSphere HA needs to recover from a failed host. However, this effectively idles an entire host and is not generally used unless you have the capacity in your datacenter to keep a spare host aside just in case.

 

VM resource reduction event threshold – Performance Degradation Tolerance

The reserved capacity used by admission control ensures that all configured reservations continue to be honored after a host failure. However, in environments where reservations are not widely used, the actual performance of the cluster could still be impacted after a failure even though admission control is satisfied. The “VM resource reduction event threshold” setting, new in vSphere 6.5, defines how much of a performance impact is tolerated and issues a warning if the consumed resources are higher than the reserved resources.

0% – Raises a warning if there is insufficient failover capacity to guarantee the same performance after VMs restart.

100% – The warning is disabled.

 

Example

400GB of memory available in a 4 node cluster
1 host failure to tolerate specified
310GB of memory actively used by VMs
0% resource reduction tolerated

This results in the following:
400GB – 100GB (1 host worth of memory) = 300GB
We have 310GB of memory actively used, with 0% resource reduction to tolerate
310GB needed but only 300GB available after a failure, so a warning will be issued.
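
The same check, expressed as a few lines of Python using the figures from the example above:

```python
# The memory example above as a quick check (figures taken from the text).
total_memory_gb     = 400    # 4 nodes x 100 GB
per_host_memory_gb  = 100
host_failures       = 1
actively_used_gb    = 310
tolerated_reduction = 0.0    # 0% resource reduction tolerated

available_after_failure = total_memory_gb - host_failures * per_host_memory_gb  # 300 GB
required = actively_used_gb * (1 - tolerated_reduction)                          # 310 GB

if required > available_after_failure:
    print("Warning: insufficient failover capacity to maintain current performance")
```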

Summary

Just some general points I’ve seen lately. I’ll update them if anything else comes up which is interesting.

  • In terms of larger VMs skewing the calculations, any CPU or memory reservation on a VM is taken into account, including the default values used for VMs that have no reservations.
  • Reservations only come into play when the system is under contention anyway, and VMware is very good at managing resources. If you do want to use them, monitor the peak CPU and RAM usage of the VMs first so you can assign an accurate reservation.
  • Slot Policy is no longer the default option and is easy to configure incorrectly, so be careful with it. You can create extra work for yourself, but it is useful.
  • Use the Host Failures Cluster Tolerates policy (the recommended policy now).
  • It may be worth setting the VM resource reduction event threshold so you are warned of potential performance problems. Setting it to 0% generates a warning if admission control determines there is insufficient failover capacity to maintain the same performance after VMs are failed over, based on monitoring of actual CPU and RAM usage. The Host Failures Cluster Tolerates policy, by contrast, uses only reservations, default CPU/RAM values and overhead in its calculations, so the two settings combined give a very useful monitoring view of your cluster.

Analyze vSphere environment to determine appropriate HA admission control policy


Choosing an Admission Control Policy

You should choose a vSphere HA admission control policy based on your availability needs and the characteristics of your cluster. When choosing an admission control policy, you should consider a number of factors.

  • Avoiding Resource Fragmentation

Resource fragmentation occurs when there are enough resources in aggregate for a virtual machine to be failed over, but those resources are located on multiple hosts and are unusable because a virtual machine can run on only one ESXi host at a time (a small sketch after this list illustrates the problem). The Host Failures Cluster Tolerates policy avoids resource fragmentation by defining a slot as the maximum virtual machine reservation. The Percentage of Cluster Resources policy does not address the problem of resource fragmentation. With the Specify Failover Hosts policy, resources are not fragmented, because hosts are reserved for failover.

  • Flexibility of Failover Resource Reservation

Admission control policies differ in the granularity of control they give you when reserving cluster resources for failover protection. The Host Failures Cluster Tolerates policy allows you to set the failover level as a number of hosts. The Percentage of Cluster Resources policy allows you to designate up to 100% of cluster CPU or memory resources for failover. The Specify Failover Hosts policy allows you to specify a set of failover hosts.

  • Heterogeneity of Cluster

Clusters can be heterogeneous in terms of virtual machine resource reservations and host total resource capacities. In a heterogeneous cluster, the Host Failures Cluster Tolerates policy can be too conservative because it only considers the largest virtual machine reservations when defining slot size and assumes the largest hosts fail when computing the Current Failover Capacity. The other two admission control policies are not affected by cluster heterogeneity.
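
The resource fragmentation problem mentioned in the first bullet above is easy to see with a toy example (all figures invented):

```python
# Toy illustration of resource fragmentation: the surviving hosts have enough
# free memory in aggregate to restart a failed VM, but no single host can fit it.

free_memory_per_host_gb  = [6, 5, 4]   # free memory on each surviving host
failed_vm_reservation_gb = 10

aggregate_free_gb = sum(free_memory_per_host_gb)               # 15 GB in total
fits_on_one_host  = any(free >= failed_vm_reservation_gb
                        for free in free_memory_per_host_gb)   # False

print(f"Aggregate free memory: {aggregate_free_gb} GB")
print(f"Can the 10 GB VM actually be restarted? {fits_on_one_host}")
```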

Failover Clusters in Windows Server 2008 – Quorums

What is a cluster?

A failover cluster is a group of independent computers that work together to increase the availability of applications and services. The clustered servers (called nodes) are connected by physical cables and by software. If one of the cluster nodes fails, another node begins to provide service (a process known as failover). Users experience a minimum of disruptions in service.

Are there any special considerations?

Microsoft supports a failover cluster solution only if all the hardware components are marked as “Certified for Windows Server 2008 R2.” In addition, the complete configuration (servers, network, and storage) must pass all tests in the Validate a Configuration wizard, which is included in the Failover Cluster Manager snap-in.

Note that this policy differs from the support policy for server clusters in Windows Server 2003, which required the entire cluster solution to be listed in the Windows Server Catalog under Cluster Solutions.

Cluster validation is intended to catch hardware or configuration problems before the cluster goes into production. Cluster validation helps to ensure that the solution you are about to deploy is truly dependable. Cluster validation can also be performed on configured failover clusters as a diagnostic tool.

Step by Step Guide

  • Run the cluster validation wizard for a failover cluster
  • If the cluster does not yet exist, choose the servers that you want to include in the cluster, and make sure you have installed the failover cluster feature on those servers. To install the feature, on a server running Windows Server 2008 or Windows Server 2008 R2, click Start, click Administrative Tools, click Server Manager, and under Features Summary, click Add Features. Use the Add Features wizard to add the Failover Clustering feature.
  • If the cluster already exists, make sure that you know the name of the cluster or a node in the cluster
  • For a planned cluster with all hardware connected: Run all tests.
  • For a planned cluster with parts of the hardware connected: Run System Configuration tests, Inventory tests, and tests that apply to the hardware that is connected (that is, Network tests if the network is connected or Storage tests if the storage is connected).
  • For a cluster to which you plan to add a server: Run all tests. Before you run them, be sure to connect the networks and storage for all servers that you plan to have in the cluster.
  • For troubleshooting an existing cluster: you might run all tests, although you could run only the tests that relate to the apparent issue.
  • In the failover cluster snap-in, in the console tree, make sure Failover Cluster Management is selected and then, under Management, click Validate a Configuration.

  • Follow the instructions in the wizard to specify the servers and the tests, and run the tests.
  • Note that when you run the cluster validation wizard on unclustered servers, you must enter the names of all the servers you want to test, not just one.
  • The Summary page appears after the tests run.
  • While still on the Summary page, click View Report to view the test results. To view the results of the tests after you close the wizard, see SystemRoot\Cluster\Reports\Validation Report date and time.html, where SystemRoot is the folder in which the operating system is installed (for example, C:\Windows).


Configuring the Quorum in a Failover Cluster

In simple terms, the quorum for a cluster is the number of elements that must be online for that cluster to continue running. In effect, each element can cast one “vote” to determine whether the cluster continues running. The voting elements are nodes or, in some cases, a disk witness or file share witness. Each voting element (with the exception of a file share witness) contains a copy of the cluster configuration, and the Cluster service works to keep all copies synchronized at all times.

Note that the full function of a cluster depends not just on quorum, but on the capacity of each node to support the services and applications that fail over to that node. For example, a cluster that has five nodes could still have quorum after two nodes fail, but each remaining cluster node would continue serving clients only if it had enough capacity to support the services and applications that failed over to it.

Why Quorum is necessary

When network problems occur, they can interfere with communication between cluster nodes. A small set of nodes might be able to communicate together across a functioning part of a network, but might not be able to communicate with a different set of nodes in another part of the network. This can cause serious issues. In this “split” situation, at least one of the sets of nodes must stop running as a cluster.

To prevent the issues that are caused by a split in the cluster, the cluster software requires that any set of nodes running as a cluster must use a voting algorithm to determine whether, at a given time, that set has quorum. Because a given cluster has a specific set of nodes and a specific quorum configuration, the cluster will know how many “votes” constitutes a majority (that is, a quorum). If the number drops below the majority, the cluster stops running. Nodes will still listen for the presence of other nodes, in case another node appears again on the network, but the nodes will not begin to function as a cluster until the quorum exists again.

For example, in a five node cluster that is using a node majority, consider what happens if nodes 1, 2, and 3 can communicate with each other but not with nodes 4 and 5. Nodes 1, 2, and 3 constitute a majority, and they continue running as a cluster. Nodes 4 and 5 are a minority and stop running as a cluster, which prevents the problems of a “split” situation. If node 3 loses communication with other nodes, all nodes stop running as a cluster. However, all functioning nodes will continue to listen for communication, so that when the network begins working again, the cluster can form and begin to run.
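
A minimal sketch of that majority rule, assuming a Node Majority cluster with one vote per node:

```python
# Sketch of the majority rule for a five-node, Node Majority cluster:
# a partition keeps running as a cluster only if it holds more than half the votes.

TOTAL_VOTES = 5  # one vote per node under Node Majority

def partition_has_quorum(votes_in_partition: int) -> bool:
    return votes_in_partition > TOTAL_VOTES / 2

print(partition_has_quorum(3))  # True  -> nodes 1, 2 and 3 continue as a cluster
print(partition_has_quorum(2))  # False -> nodes 4 and 5 stop running as a cluster
```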

Overview of the Quorum Modes

There have been significant improvements to the quorum model in Windows Server 2008. In Windows Server 2003, almost all server clusters used a disk in cluster storage (the “quorum resource”) as the quorum. If a node could communicate with the specified disk, the node could function as a part of a cluster, and otherwise it could not. This made the quorum resource a potential single point of failure. In Windows Server 2008, a majority of ‘votes’ is what determines whether a cluster achieves quorum. Nodes can vote, and where appropriate, either a disk in cluster storage (called a “disk witness”) or a file share (called a “file share witness”) can vote. There is also a quorum mode called No Majority: Disk Only which functions like the disk-based quorum in Windows Server 2003. Aside from that mode, there is no single point of failure with the quorum modes, since what matters is the number of votes, not whether a particular element is available to vote.

This new quorum model is flexible and you can choose the mode best suited to your cluster.

Important: In most situations, it is best to use the quorum mode selected by the cluster software. If you run the quorum configuration wizard, the quorum mode that the wizard lists as “recommended” is the quorum mode chosen by the cluster software. We only recommend changing the quorum configuration if you have determined that the change is appropriate for your cluster.

There are four quorum modes:

  • Node Majority: Each node that is available and in communication can vote. The cluster functions only with a majority of the votes, that is, more than half.
  • Node and Disk Majority: Each node plus a designated disk in the cluster storage (the “disk witness”) can vote, whenever they are available and in communication. The cluster functions only with a majority of the votes, that is, more than half.
  • Node and File Share Majority: Each node plus a designated file share created by the administrator (the “file share witness”) can vote, whenever they are available and in communication. The cluster functions only with a majority of the votes, that is, more than half.
  • No Majority: Disk Only: The cluster has quorum if one node is available and in communication with a specific disk in the cluster storage.

Choosing the Quorum Mode for a particular cluster

Description of cluster and recommended quorum mode:

  • Odd number of nodes – Node Majority
  • Even number of nodes (but not a multi-site cluster) – Node and Disk Majority
  • Even number of nodes, multi-site cluster – Node and File Share Majority
  • Even number of nodes, no shared storage – Node and File Share Majority
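
For reference, the recommendations above can be expressed as a small helper function; this is just a sketch of the table, not anything provided by Windows itself:

```python
# The quorum-mode recommendation table above as a tiny helper (sketch only).

def recommended_quorum_mode(node_count: int, multi_site: bool = False,
                            shared_storage: bool = True) -> str:
    if node_count % 2 == 1:
        return "Node Majority"
    if multi_site or not shared_storage:
        return "Node and File Share Majority"
    return "Node and Disk Majority"

print(recommended_quorum_mode(5))                        # Node Majority
print(recommended_quorum_mode(4))                        # Node and Disk Majority
print(recommended_quorum_mode(4, multi_site=True))       # Node and File Share Majority
print(recommended_quorum_mode(4, shared_storage=False))  # Node and File Share Majority
```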

Node Majority

Node Majority is recommended for a cluster with an odd number of nodes. In this mode, each node gets one vote. In certain circumstances, you might want to install a hotfix that lets you select which nodes have votes. This can be useful with certain multi-site clusters, for example, where you want one site to have more votes than other sites in a disaster recovery situation.

Node and Disk Majority

Node and Disk Majority is recommended for a cluster with an even number of nodes. Each node can vote, as can the disk witness. When configuring the disk witness, follow these guidelines:

  • Use a small Logical Unit Number (LUN) that is at least 512 MB in size.
  • Choose a basic disk with a single volume.
  • Make sure that the LUN is dedicated to the disk witness. It must not contain any other user or application data.
  • Choose whether to assign a drive letter to the LUN based on the needs of your cluster. The LUN does not have to have a drive letter (to conserve drive letters for applications).
  • As with other LUNs that are to be used by the cluster, you must add the LUN to the set of disks that the cluster can use. For more information, see http://go.microsoft.com/fwlink/?LinkId=114539.
  • Make sure that the LUN has been verified with the Validate a Configuration Wizard.
  • We recommend that you configure the LUN with hardware RAID for fault tolerance.
  • In most situations, do not back up the disk witness or the data on it. Backing up the disk witness can add to the input/output (I/O) activity on the disk and decrease its performance, which could potentially cause it to fail.
  • We recommend that you avoid all antivirus scanning on the disk witness.
  • Format the LUN with the NTFS file system.

If there is a disk witness configured but bringing that disk online will not achieve quorum, it remains offline. If bringing the disk online will achieve quorum, it is brought online by the cluster software.

Node and File Share Majority

Node and File Share Majority is recommended for a cluster with an even number of nodes where a file share witness works better than a disk witness. Each node can vote, as can the file share witness. When configuring the file share witness, follow these guidelines:

  • Use a Server Message Block (SMB) share on a Windows Server 2003 or Windows Server 2008 file server.
  • Make sure that the file share has a minimum of 5 MB of free space.
  • Make sure that the file share is dedicated to the cluster and is not used in other ways (including storage of user or application data).
  • Do not place the share on a node that is a member of this cluster or will become a member of this cluster in the future.
  • You can place the share on a file server that has multiple file shares servicing different purposes. This may include multiple file share witnesses, each one a dedicated share. You can even place the share on a clustered file server (in a different cluster), which would typically be a clustered file server containing multiple file shares servicing different purposes.
  • For a multi-site cluster, you can co-locate the external file share at one of the sites where a node or nodes are located. However, we recommend that you configure the external share in a separate third site.
  • Place the file share on a server that is a member of a domain, in the same forest as the cluster nodes.
  • For the folder that the file share uses, make sure that the administrator has Full Control share and NTFS permissions.
  • Do not use a file share that is part of a Distributed File System (DFS) Namespace.

No Majority – Disk only

A cluster that uses the disk as the only determiner of quorum can run even if only one node is available and in communication with the quorum disk. However, the cluster cannot run if the quorum disk is not available, making the disk a single point of failure. For most clusters, one of the majority-based modes above is recommended instead.

The guidelines for the quorum disk in this mode are the same as those listed above for the disk witness under Node and Disk Majority, including the rule that the disk is brought online by the cluster software only when doing so will achieve quorum.

Viewing the Quorum Configuration

  • To open the failover cluster snap-in, click Start, click Administrative Tools, and then click Failover Cluster Management (in Windows Server 2008) or Failover Cluster Manager (in Windows Server 2008 R2). If the User Account Control dialog box appears, confirm that the action it displays is what you want, and then click Continue.
  • In the console tree, if the cluster that you want to view is not displayed, right-click Failover Cluster Management or Failover Cluster Manager, click Manage a Cluster, and then select the cluster you want to view.
  • In the center pane, find Quorum Configuration and view the description.
  • For example, the description might read Node and Disk Majority, with Cluster Disk 2 as the disk witness.