Archive for Objective 3 Tuning and Optimisation

Tune ESXi VM Network Configuration

Tuning Configuration

  • Use the VMXNET3 adapter; if the guest OS does not support it, use the VMXNET2 or VMXNET adapter
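
For reference, here is a minimal pyVmomi (Python) sketch that adds a VMXNET3 adapter to an existing VM. The vCenter address, credentials, VM name and port group name are placeholders for illustration, and error handling is left out. The later sketches in this post assume the same connection and lookup boilerplate and simply refer to 'vm', 'host', 'cluster', 'dc' or 'content' objects.

    import ssl
    from pyVim.connect import SmartConnect, Disconnect
    from pyVmomi import vim

    # Connect to vCenter (placeholder host and credentials)
    si = SmartConnect(host='vcenter.example.com', user='administrator@vsphere.local',
                      pwd='VMware1!', sslContext=ssl._create_unverified_context())
    content = si.RetrieveContent()

    # Find the VM by name using a container view
    view = content.viewManager.CreateContainerView(content.rootFolder,
                                                   [vim.VirtualMachine], True)
    vm = next(v for v in view.view if v.name == 'MyVM')

    # Build a device-change spec for a new VMXNET3 NIC on a standard port group
    nic = vim.vm.device.VirtualVmxnet3()
    nic.backing = vim.vm.device.VirtualEthernetCard.NetworkBackingInfo(deviceName='VM Network')
    nic.connectable = vim.vm.device.VirtualDevice.ConnectInfo(startConnected=True)
    change = vim.vm.device.VirtualDeviceSpec(
        operation=vim.vm.device.VirtualDeviceSpec.Operation.add, device=nic)

    vm.ReconfigVM_Task(spec=vim.vm.ConfigSpec(deviceChange=[change]))
    Disconnect(si)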

nicsettings

  • Use a network adapter that supports TCP checksum offload, TCP segmentation offload (TSO), jumbo frames, multiqueue support (also known as Receive Side Scaling in Windows), IPv6 offloads, and MSI/MSI-X interrupt delivery
  • Use the fastest Ethernet you can. 10GbE is preferable
  • Ensure the speed and duplex settings on the network adapters are correct. For 10/100 NICs, set the speed and duplex manually and make sure the duplex is set to full duplex
  • For Gigabit Ethernet or faster NICs, set the speed and duplex to auto-negotiate
  • DirectPath I/O (DPIO) provides a means of bypassing the VMkernel, giving a VM direct access to hardware devices by leveraging Intel VT-d and AMD-Vi (IOMMU) hardware support. Specific to networking, DPIO allows a VM to connect directly to the host's physical network adapter without the overhead associated with emulation or paravirtualization. The bandwidth increases associated with DPIO are nominal but the savings on CPU cycles can be substantial for busy workloads. There are quite a few restrictions when utilizing DPIO. For example, unless using Cisco UCS hardware, DPIO is not compatible with hot-add, FT, HA, DRS or snapshots.
  • Use NIC teaming where possible, either VMware's proprietary network teaming or EtherChannel
  • Virtual Machine Communications Interface (VMCI) is a virtual device that promotes enhanced communication between a virtual machine and the host on which it resides, and between VMs running on the same host. VMCI provides a high-speed alternative to standard TCP/IP sockets. The VMCI SDK enables engineers to develop applications which take advantage of the VMCI infrastructure. With VMCI, VM application traffic (of VMs on the same host) bypasses the network layer, reducing communication overhead. With VMCI, it’s not uncommon for inter-VM traffic to exceed 10 GB/s

Tune ESXi VM CPU Configuration

Tuning Configuration

  • Configure multicore virtual CPUs with care. There are limitations and considerations on this subject, such as the ESXi host configuration, the VMware licence and guest OS licensing restrictions; only once these are understood can you decide on the number of virtual sockets and the number of cores per socket (see the sketch after this list)
  • CPU affinity is a technique that doesn’t necessarily imply load balancing, but it can be used to restrict a virtual machine to a particular set of processors. Affinity may not apply after a vMotion and it can disrupt ESXi’s ability to apply and meet shares and reservations
  • Duncan Epping raises some good points in this link http://www.yellow-bricks.com/2009/04/28/cpu-affinity/
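
As a quick illustration of the sockets/cores point above, here is a hedged pyVmomi sketch that reshapes a VM as 2 virtual sockets of 4 cores each. It assumes a 'vm' object looked up as in the earlier network sketch, and the VM would normally need to be powered off for this change.

    from pyVmomi import vim
    from pyVim.task import WaitForTask

    # 8 vCPUs presented as 2 virtual sockets x 4 cores per socket
    spec = vim.vm.ConfigSpec(numCPUs=8, numCoresPerSocket=4)
    WaitForTask(vm.ReconfigVM_Task(spec=spec))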

affinity

  • You can use Hot Add to add vCPUs on the fly

HotAdd

  • Check that Hyperthreading is enabled

Advanced CPU

  • Generally keep CPU/MMU Virtualisation on Automatic

CPU_MMU

  • You can adjust Limits, Reservations and Shares to control CPU Resources
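
As a sketch of the limits/reservations/shares point, the snippet below reserves 1000 MHz, caps the VM at 2000 MHz and gives it high CPU shares. The numbers are examples only and 'vm' is looked up as before; memory works the same way through the memoryAllocation field.

    from pyVmomi import vim
    from pyVim.task import WaitForTask

    alloc = vim.ResourceAllocationInfo(
        reservation=1000,     # MHz guaranteed to the VM
        limit=2000,           # MHz ceiling (-1 means unlimited)
        shares=vim.SharesInfo(level=vim.SharesInfo.Level.high, shares=0))
    WaitForTask(vm.ReconfigVM_Task(spec=vim.vm.ConfigSpec(cpuAllocation=alloc)))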

CPUSHARES

Tune ESXi VM Memory Configuration

Tuning Configuration

  • Minimum memory size is 4MB for virtual machines that use BIOS firmware. Virtual machines that use EFI firmware require at least 96MB of RAM or they cannot power on.
  • The memory size must be a multiple of 4MB
  • vNUMA exposes NUMA technology to the Guest O/S. Hosts must have matching NUMA architecture and VMs must be running Hardware Version 8

numa

  • Size VMs so they align with physical NUMA boundaries. If you have a system with 6 cores per NUMA node, size your machines with a multiple of 6 vCPUs
  • vNUMA can be enabled on smaller machines by adding numa.vcpu.maxPerVirtualNode=X (Where X is the number of vCPUs per vNUMA node)
  • Enable Memory Hot Add to be able to add memory to the VMs on the fly

HotAdd

  • Use operating systems that support large memory pages, as ESXi will by default provide large pages to guest operating systems that request them
  • Store a VM's swap file in a different, faster location than the working directory (see the swap placement sketch after this list)
  • Configure a special host cache on an SSD (if one is installed) to be used for the swap to host cache feature. Host cache is new in vSphere 5. If you have a datastore that lives on an SSD, you can designate space on that datastore as host cache. Host cache acts as a cache for all virtual machines on that particular host, serving as write-back storage for virtual machine swap files. What this means is that pages that need to be swapped to disk will swap to host cache first, and then be written back to the particular swap file for that virtual machine
  • Keep Virtual Machine Swap files on low latency, high bandwidth storage systems
  • Do not store swap files on thin provisioned LUNs. This can cause swap file growth to fail.
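
As a sketch of the swap file placement bullets above, the snippet below tells one VM to place its swap file in the host's designated swapfile datastore instead of the VM's working directory. The host must already have a (preferably fast or SSD-backed) swapfile datastore configured, and 'vm' is looked up as in the earlier sketches.

    from pyVmomi import vim
    from pyVim.task import WaitForTask

    # 'hostLocal' = use the host's designated swapfile datastore;
    # the other values are 'vmDirectory' and 'inherit'
    WaitForTask(vm.ReconfigVM_Task(spec=vim.vm.ConfigSpec(swapPlacement='hostLocal')))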

swapfile

  • You can use Limits, Reservations and Shares to control Resources per VM

Memoryres

Configure and apply Advanced ESXi Host, VM and Cluster attributes

What are Advanced Attributes?

You can set advanced attributes for hosts or individual virtual machines to help you customize resource management. In most cases, adjusting the basic resource allocation settings (reservation, limit, shares) or accepting default settings results in appropriate resource allocation. However, you can use advanced attributes to customize resource management for a host or a specific virtual machine.

Note: Changing advanced options is considered unsupported unless VMware technical support or a KB article instructs you to do so. In most cases, the default settings produce the optimum result.

Please read page 101 onwards of the Resource Management Guide

Host Advanced attributes

  • Click on a host
  • Click on Configuration
  • Click on Advanced
  • Select the attribute to modify
  • CPU below (Memory, Network and Disk follow; a pyVmomi equivalent is sketched after the screenshots)

attrib

  • Memory below

MEM

  • Network below

net

  • Disk below

disk
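
The screenshots above show the GUI route; as a rough pyVmomi equivalent, host advanced attributes can be read and written through the host's OptionManager. The key below (Mem.ShareScanGHz) is just an example and 'host' is a vim.HostSystem looked up in the same way as the VMs earlier. Remember these options should only be changed when VMware support or a KB article says so.

    from pyVmomi import vim

    opt_mgr = host.configManager.advancedOption

    # Read the current value of an advanced attribute
    for opt in opt_mgr.QueryOptions(name='Mem.ShareScanGHz'):
        print(opt.key, '=', opt.value)

    # Write a new value. Numeric host options are typed as longs in the API, so
    # check the option's type with QueryOptions first if the server rejects a plain int.
    opt_mgr.UpdateOptions(changedValue=[vim.option.OptionValue(key='Mem.ShareScanGHz',
                                                               value=4)])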

Virtual Machine Advanced attributes

  • Right click on a VM and select Edit Settings
  • Click the Options tab
  • Select General under settings then select Configuration Parameters

vm

config

  • A typical example we added on a vSphere 4 system was an adjustment of the Storage vMotion switchover timeout, fsr.MaxSwitchoverSeconds=300, as recommended in a KB article for VMs with a large memory size. Ours were 160GB
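
The same parameter can also be pushed in programmatically via extraConfig, which is what the Configuration Parameters dialog edits under the covers. A minimal sketch, using the key from the bullet above and a 'vm' object looked up as earlier:

    from pyVmomi import vim
    from pyVim.task import WaitForTask

    opt = vim.option.OptionValue(key='fsr.MaxSwitchoverSeconds', value='300')
    WaitForTask(vm.ReconfigVM_Task(spec=vim.vm.ConfigSpec(extraConfig=[opt])))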

Cluster Advanced Attributes

  • Right click on your cluster
  • Select Edit Settings
  • Select VMware HA
  • Select Advanced Options

cluster1

  • The example below shows an entry for a secondary HA isolation address that we were testing (IP removed); an API equivalent is sketched after the screenshot.

cluster2
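
For completeness, here is a hedged pyVmomi sketch that sets the same kind of HA advanced option on a cluster. 'cluster' is a vim.ClusterComputeResource looked up like the VMs earlier, and the option key and IP address are examples only.

    from pyVmomi import vim
    from pyVim.task import WaitForTask

    # Add a das.* advanced option to the cluster's HA configuration
    das = vim.cluster.DasConfigInfo(option=[
        vim.option.OptionValue(key='das.isolationaddress1', value='192.168.1.254')])
    spec = vim.cluster.ConfigSpecEx(dasConfig=das)
    WaitForTask(cluster.ReconfigureComputeResource_Task(spec=spec, modify=True))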

Tune ESXi host Storage Configuration

Tuning Configurations

  • Always follow the vendor's recommendations, whether it be EMC, NetApp, HP etc
  • Document all configurations
  • In a well-planned virtual infrastructure implementation, a descriptive naming convention aids in identification and mapping through the multiple layers of virtualization from storage to the virtual machines. A simple and efficient naming convention also facilitates configuration of replication and disaster recovery processes.
  • Make sure your SAN fabric is redundant (Multi Path I/O)
  • Separate networks for storage array management and storage I/O. This concept applies to all storage protocols but is very pertinent to Ethernet-based deployments (NFS, iSCSI, FCoE). The separation can be physical (subnets) or logical (VLANs), but must exist.
  • If leveraging an IP-based storage protocol I/O (NFS or iSCSI), you might require more than a single IP address for the storage target. The determination is based on the capabilities of your networking hardware.
  • With IP-based storage protocols (NFS and iSCSI) you can channel multiple Ethernet ports together. NetApp refers to this function as a VIF. It is recommended that you create LACP VIFs over multimode VIFs whenever possible.
  • Use CAT 6 cabling rather than CAT 5
  • Enable flow control (should be set to receive on switches and transmit on iSCSI targets)
  • Enable Spanning Tree Protocol with either RSTP or portfast enabled. Spanning Tree Protocol (STP) is a network protocol that ensures a loop-free topology for any bridged LAN
  • Configure jumbo frames end-to-end. 9000 rather than 1500 MTU
  • Ensure Ethernet switches have the proper amount of port buffers and other internals to support iSCSI and NFS traffic optimally
  • Use Link Aggregation for NFS
  • Maximum of 2 TCP sessions per Datastore for NFS (1 Control Session and 1 Data Session)
  • Ensure that each HBA is zoned correctly to both SPs if using FC
  • Create RAID LUNs according to the application vendor's recommendations
  • Use Tiered storage to separate High Performance VMs from Lower performing VMs
  • Choose Virtual Disk formats as required. Eager Zeroed, Thick and Thin etc
  • Choose RDMs or VMFS formatted Datastores depending on supportability and on application vendor and virtualisation vendor recommendations
  • Utilise VAAI (vStorage APIs for Array Integration), supported by vSphere 5
  • No more than 15 VMs per Datastore
  • Extents are not generally recommended
  • Use de-duplication if you have the option. This reduces consumed storage by keeping a single copy of duplicate data on the system
  • Choose the fastest storage Ethernet or FC adapter (dependent on cost/budget etc)
  • Enable Storage I/O Control
  • VMware highly recommend that customers implement “single-initiator, multiple storage target” zones. This design offers an ideal balance of simplicity and availability with FC and FCoE deployments.
  • Whenever possible, it is recommended that you configure storage networks as a single network that does not route. This model helps to make sure of performance and provides a layer of data security.
  • Each VM creates a swap or pagefile that is typically 1.5 to 2 times the size of the amount of memory configured for each VM. Because this data is transient in nature, we can save a fair amount of storage and/or bandwidth capacity by removing this data from the datastore, which contains the production data. In order to accomplish this design, the VM’s swap or pagefile must be relocated to a second virtual disk stored in a separate datastore
  • It is the recommendation of NetApp, VMware, other storage vendors, and VMware partners that the partitions of VMs and the partitions of VMFS datastores are to be aligned to the blocks of the underlying storage array. You can find more information on VMFS and guest OS file system alignment in the documentation from the various storage vendors
  • Failure to align the file systems results in a significant increase in storage array I/O in order to meet the I/O requirements of the hosted VMs
  • Try using sDRS
  • Turn on Storage I/O Control (SIOC) to split up disk shares globally across all hosts accessing that datastore
  • Make sure your multipathing is correct. Active/Active arrays typically use Fixed, Active/Passive arrays use Most Recently Used, and ALUA-capable arrays have their own recommended policies; check the array vendor's guidance (see the audit sketch after this list)
  • Change queue depths to 64 rather than the default 32 if required. Set the parameter Disk.SchedNumReqOutstanding to 64 in vCenter
  • VMFS and RDM are both good for Random Reads/Writes
  • VMFS and RDM are also good for sequential Reads/Writes of small I/O block sizes
  • VMFS best for sequential Reads/Writes at larger I/O block sizes
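
As a small read-only audit of the multipathing bullet above, the sketch below lists the path selection policy and path count per LUN on a host ('host' is a vim.HostSystem looked up as earlier), so you can spot anything that does not match the array vendor's guidance.

    # Print the path selection policy and number of paths for each LUN on the host
    for lun in host.config.storageDevice.multipathInfo.lun:
        print('%s  policy=%s  paths=%d' % (lun.id, lun.policy.policy, len(lun.path)))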

Tune ESXi host CPU Configuration

Tuning Configurations

  • Deploy single-threaded applications on uniprocessor virtual machines, instead of on SMP virtual machines, for the best performance and resource use.
  • VMware advise against using CPU affinity as it generally constrains the scheduler and can cause an improperly balanced load
  • Use DRS where you can as this will balance the load for you
  • Don't configure your VMs with more vCPUs than their workloads require. Configuring a VM with more vCPUs than it needs will cause additional, unnecessary CPU utilization due to the increased overhead relating to multiple vCPUs
  • Enable Hyperthreading in the BIOS. Check this in the BIOS and in vCenter: click on the host, select Configuration, select Properties and check that Hyperthreading is enabled (see the check sketch after this list)
  • When dealing with NUMA systems, ensure that node interleaving is disabled in the BIOS. If node interleaving is set to enabled it essentially disables NUMA capability on that host
  • When possible, configure the number of vCPUs to be equal to or less than the number of physical cores on a single NUMA node. When the vCPU count does not exceed the physical cores in a NUMA node, the VM will get all its memory from that single NUMA node, resulting in lower memory access latency
  • On undercommitted systems it may sometimes be beneficial for certain machines to schedule all of the vCPUs on the same socket, which gives that VM full access to a shared last-level cache rather than being spread across multiple processors. Set sched.cpu.vsmpConsolidate="true" in the VMX configuration file
  • Pay attention to the Manufacturers recommendations for resources, especially application multithreading support
  • Use processors which support Hardware-Assisted CPU Virtualization (Intel VT-x and AMD AMD-V)
  • Use processors which support Hardware-Assisted MMU Virtualization (Intel EPT and AMD RVI)
  • When configuring virtual machines, the total CPU resources needed by the virtual machines running on the system should not exceed the CPU capacity of the host. If the host CPU capacity is overloaded, the performance of individual virtual machines may degrade
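
A quick read-only check for the hyperthreading and NUMA bullets above, assuming a 'host' object obtained as in the earlier sketches:

    # Hyperthreading status as ESXi sees it (reflects the BIOS setting)
    ht = host.config.hyperThread
    print('HT available:', ht.available, ' HT active:', ht.active)

    # NUMA layout - if node interleaving is enabled in the BIOS you will
    # typically see a single node here
    numa = host.hardware.numaInfo
    print('NUMA type:', numa.type, ' nodes:', numa.numNodes)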

Tune ESXi host networking configuration

Tuning Configurations

  • Use Network I/O Control to utilise Limits, Shares and QoS priority tags for traffic
  • Team NICs across PCI cards and switches for complete redundancy
  • Using vDS switches gives you more features than the Standard Switch and minimises configuration time
  • Utilise NIC Teaming where possible to provide failover and extra bandwidth
  • Use Jumbo Frames where you can (MTU 9000 rather than 1500). Must be set the same end to end (see the vSwitch MTU sketch after this list)
  • Keep physical NIC firmware updated
  • Use VMXNET3 virtual network adapters where possible. Must be supported by the O/S you are running. VMXNET3 shares a ring buffer between the VM and the VMkernel and uses zero copy, which saves CPU cycles, and takes advantage of transmit packet coalescing to reduce address space switching
  • DirectPath I/O may provide you a bump in network performance, but you really need to look at the use case. You can lose a lot of core functionality when using this feature, such as vMotion and FT (some special exceptions when running on UCS for vMotion) so you really need to look at the cost:benefit ratio and determine if it’s worth the tradeoffs
  • Enable Discovery Protocols CDP and LLDP for extra information on your networks
  • Make sure your NIC teaming policies on your Virtual switches match the correct policies on the physical switches
  • Make sure your physical switches support cross stack etherchannel if you are planning on using this in a fully redundant networking solution
  • Use static or ephemeral port bindings due to the deprecation of Dynamic Binding
  • Choose 10GbE over 1GbE. This gives you NetQueue, a feature which uses multiple transmit and receive queues to allow I/O processing across multiple CPUs
  • Choose physical NICs with TCP Checksum Offload which reduces the load on the physical CPU by allowing the NIC to perform checksum operations on network packets
  • Choose physical adapters with TCP Segmentation offload as this can reduce the CPU Overhead involved with sending large amounts of TCP traffic.
  • To speed up packet handling, network adapters can be configured for direct memory access to high memory. This bypasses the CPU and allows the NIC direct access to memory
  • You can use DirectPath which allows a VM to directly access the physical NIC instead of using an emulated or paravirtual device however it is not compatible with certain features such as vMotion, Hot Add/Hot Remove, HA, DRS and Snapshots
  • SplitRx mode on VMXNET3 adapters is an ESXi feature that uses multiple physical CPUs to process network packets received in a single network queue. It is individually configured on each NIC. Good for stock exchanges and multimedia companies
  • Use VMCI if you have 2 VMs on the same host which require a high-speed communication channel which bypasses the guest or VMKernel networking stack
  • In a native environment, CPU utilization plays a significant role in network throughput. To process higher levels of throughput, more CPU resources are needed. The effect of CPU resource availability on the network throughput of virtualized applications is even more significant. Because insufficient CPU resources will limit maximum throughput, it is important to monitor the CPU utilization of high-throughput workloads.
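
As a sketch of the jumbo frames bullet, the snippet below raises the MTU on a standard vSwitch to 9000. The vSwitch name is an example, 'host' is looked up as earlier, and remember the physical switches and any VMkernel ports must be set to match end to end.

    from pyVmomi import vim

    net_sys = host.configManager.networkSystem
    vswitch = next(v for v in host.config.network.vswitch if v.name == 'vSwitch1')

    spec = vswitch.spec      # reuse the existing spec so ports/uplinks are preserved
    spec.mtu = 9000
    net_sys.UpdateVirtualSwitch(vswitchName=vswitch.name, spec=spec)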

Tune ESXi Host Memory Configuration

Tuning Options

  • In addition to the usual 4KB memory pages, ESX also makes 2MB memory pages available (commonly referred to as “large pages”). By default ESX assigns these 2MB machine memory pages to guest operating systems that request them, giving the guest operating system the full advantage of using large pages. The use of large pages results in reduced memory management overhead and can therefore increase hypervisor performance.
  • Hardware-assisted MMU is supported for both AMD and Intel processors beginning with ESX 4.0 (AMD processor support started with ESX 3.5 Update 1). On processors that support it, ESX 4.0 by default uses hardware-assisted MMU virtualization for virtual machines running certain guest operating systems and uses shadow page tables for others
  • Carefully select the amount of memory you allocate to your virtual machines. You should allocate enough memory to hold the working set of applications you will run in the virtual machine, thus minimizing swapping, but avoid over-allocating memory.
  • Understand Limits, Reservations, Shares and Working Set Size

limit

  • If swapping cannot be avoided, placing the virtual machine’s swap file on a high speed/high bandwidth storage system will result in the smallest performance impact. The swap file location can be set with the sched.swap.dir option in the vSphere Client (select Edit virtual machine settings, choose the Options tab, select Advanced, and click Configuration Parameters)

swapfile

  • A new feature that VMware introduced with vSphere 5 is the ability to swap to host cache using a solid state disk. In the event that overcommitment leads to swapping, the swapping can occur on an SSD, a much quicker alternative than traditional disks. Highlight a host > Configuration > Software > Host Cache Configuration

hostcache

  • Use the Mem.ShareScanTime and Mem.ShareScanGHz advanced settings to control the rate at which the system scans memory to identify opportunities for sharing memory. You can also disable sharing for individual virtual machines by setting the sched.mem.pshare.enable option to FALSE (this option defaults to TRUE)

memshare

  • Don't disable the other memory overcommitment techniques: ballooning, page sharing and memory compression

Identify appropriate BIOS and firmware setting requirements for optimal host performance

Appropriate BIOS and firmware settings

  • Make sure you have the most up to date firmware for your Servers including all 3rd party cards
  • Enable Hyperthreading. Note, you cannot enable hyperthreading on a system with greater than 32 physical cores because of the logical limit of 64 CPUs
  • Make sure the BIOS is set to enable all populated processor sockets and to enable all cores in each socket.
  • Enable “Turbo Boost” in the BIOS if your processors support it
  • Some NUMA-capable systems provide an option in the BIOS to disable NUMA by enabling node interleaving. In most cases you will get the best performance by disabling node interleaving (in other words, leaving NUMA enabled)
  • Hardware-Assisted CPU Virtualization (Intel VT-x and AMD AMD-V) The first generation of hardware virtualization assistance, VT-x from Intel and AMD-V from AMD, became available in 2006. These technologies automatically trap sensitive calls, eliminating the overhead required to do so in software. This allows the use of a hardware virtualization (HV) virtual machine monitor (VMM) as opposed to a binary translation (BT) VMM.
  • Hardware-Assisted MMU Virtualization (Intel EPT and AMD RVI) Some recent processors also include a new feature that addresses the overheads due to memory management unit (MMU) virtualization by providing hardware support to virtualize the MMU. ESX 4.0 supports this feature in both AMD processors, where it is called rapid virtualization indexing (RVI) or nested page tables (NPT), and in Intel processors, where it is called extended page tables (EPT).
  • Cache prefetching mechanisms (sometimes called DPL Prefetch, Hardware Prefetcher, L2 Streaming Prefetch, or Adjacent Cache Line Prefetch) usually help performance, especially when memory access patterns are regular. When running applications that access memory randomly, however, disabling these mechanisms might result in improved performance.
  • ESX 4.0 supports Enhanced Intel SpeedStep® and Enhanced AMD PowerNow!™ CPU power management technologies that can save power when a host is not fully utilized. However because these and other power-saving technologies can reduce performance in some situations, you should consider disabling them when performance considerations outweigh power considerations.
  • Disable C1E halt state in the BIOS.
  • Disable any other power-saving mode in the BIOS.
  • Disable any unneeded devices from the BIOS, such as serial and USB ports.

Identify appropriate driver revisions required for optimal ESXi host performance

Check out the VMware HCL and (or) VMware KB 2030818 for recommended drivers and firmware for different vSphere versions.

Configure Datastore Clusters

What is a Datastore Cluster?

A Datastore Cluster is a collection of Datastores with shared resources and a shared management interface. When you create a Datastore Cluster, you can use Storage DRS to manage storage resources and balance:

  • Space capacity
  • I/O latency

A pyVmomi sketch for creating a Datastore Cluster via the API follows the general rules below.

General Rules

  • Datastores from different arrays can be added to the same cluster but LUNs from arrays of different types can adversely affect performance if they are not equally performing LUNs.
  • Datastore clusters must contain similar or interchangeable Datastores
  • Datastore clusters can only have ESXi 5 hosts attached
  • Do not mix NFS and VMFS datastores in the same Datastore Cluster
  • You can mix VMFS-3 and VMFS-5 Datastores in the same Datastore Cluster
  • Datastore Clusters can only be created from the vSphere client, not the Web Client
  • A VM can have its virtual disks on different Datastores
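
As mentioned above, here is a hedged pyVmomi sketch that creates a Datastore Cluster (a StoragePod in the API) and moves two existing datastores into it. 'dc' is a vim.Datacenter object and the names are examples only.

    from pyVim.task import WaitForTask

    # Create the datastore cluster under the datacenter's datastore folder
    pod = dc.datastoreFolder.CreateStoragePod(name='Gold-DatastoreCluster')

    # Move existing, similar datastores into the new cluster
    members = [ds for ds in dc.datastore if ds.name in ('Datastore01', 'Datastore02')]
    WaitForTask(pod.MoveIntoFolder_Task(list=members))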

Storage DRS

Storage DRS provides initial placement and ongoing balancing recommendations assisting vSphere administrators to make placement decisions based on space and I/O capacity. During the provisioning of a virtual machine, a Datastore Cluster can be selected as the target destination for this virtual machine or virtual disk after which a recommendation for initial placement is made based on space and I/O capacity. Initial placement in a manual provisioning process has proven to be very complex in most environments and as such crucial provisioning factors like current space utilization or I/O load are often ignored. Storage DRS ensures initial placement recommendations are made in accordance with space constraints and with respect to the goals of space and I/O load balancing. These goals aim to minimize the risk of storage I/O bottlenecks and minimize performance impact on virtual machines.

Ongoing balancing recommendations are made when

  • One or more Datastores in a Datastore cluster exceeds the user-configurable space utilization threshold, which is checked every 5 minutes
  • One or more Datastores in a Datastore cluster exceeds the user-configurable I/O latency threshold, which is checked every 8 hours (a configuration sketch follows this list)
  • I/O load is evaluated by default every 8 hours. When the configured maximum space utilization or the I/O latency threshold (15ms by default) is exceeded Storage DRS will calculate all possible moves to balance the load accordingly while considering the cost and the benefit of the migration.
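
As referenced above, the sketch below enables Storage DRS on a datastore cluster and sets the space and I/O latency thresholds just described. The vim.storageDrs.* type names are written from memory of the pyVmomi bindings, so treat this as an outline and verify the names against your SDK version; 'content' is the ServiceContent from the first sketch and 'pod' is a vim.StoragePod such as the one created earlier.

    from pyVmomi import vim
    from pyVim.task import WaitForTask

    pod_cfg = vim.storageDrs.PodConfigSpec(
        enabled=True,
        ioLoadBalanceEnabled=True,
        defaultVmBehavior='automated',     # or 'manual' for recommendations only
        loadBalanceInterval=480,           # minutes between I/O evaluations (8 hours)
        spaceLoadBalanceConfig=vim.storageDrs.SpaceLoadBalanceConfig(
            spaceUtilizationThreshold=80), # percent used before moves are considered
        ioLoadBalanceConfig=vim.storageDrs.IoLoadBalanceConfig(
            ioLatencyThreshold=15))        # milliseconds

    spec = vim.storageDrs.ConfigSpec(podConfigSpec=pod_cfg)
    WaitForTask(content.storageResourceManager.ConfigureStorageDrsForPod_Task(
        pod=pod, spec=spec, modify=True))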

Storage DRS utilizes vCenter Server’s Datastore utilization reporting mechanism to make recommendations whenever the configured utilized space threshold is exceeded.

Affinity Rules and Maintenance Mode

Storage DRS affinity rules enable controlling which virtual disks should or should not be placed on the same datastore within a datastore cluster. By default, a virtual machine’s virtual disks are kept together on the same datastore. Storage DRS offers three types of affinity rules:

  1. VMDK Anti-Affinity
    Virtual disks of a virtual machine with multiple virtual disks are placed on different datastores
  2. VMDK Affinity
    Virtual disks are kept together on the same datastore
  3. VM Anti-Affinity
    Two specified virtual machines, including associated disks, are placed on different datastores

In addition, Storage DRS offers Datastore Maintenance Mode, which automatically evacuates all virtual machines and virtual disk drives from the selected datastore to the remaining datastores in the datastore cluster.

Configuring Datastore Clusters on the vSphere Web Client

  • Log into your vSphere client and click on the Datastores and Datastore Clusters view
  • Right-click on your Datacenter object and select New Datastore Cluster

figure1

  • Enter in a name for the Datastore Cluster and choose whether or not to enable Storage DRS

figure2

  • Click Next
  • You can now choose whether you want a “Fully Automated” cluster that migrates files on the fly in order to optimize the Datastore cluster’s performance and utilization, or, if you prefer, you can select No Automation so that you approve recommendations manually.

figure3

  • Here you can decide what utilization levels or I/O latency will trigger SDRS action. To benefit from the I/O metric, all hosts that will use this datastore cluster must be version 5.0 or later. Here you can also access some advanced and very important settings, like defining what is considered a marginal benefit for migration, how often SDRS checks for imbalance and how aggressive the algorithm should be

figure4

  • I/O Latency is only applicable if "Enable I/O metric for SDRS recommendations" is ticked
  • Next you pick what standalone hosts and/or host clusters will have access to the new Datastore Cluster

figure5

  • Select from the list of datastores that can be included in the cluster. You can show datastores that are connected to all hosts, to some hosts, or all datastores that are connected to any of the hosts and/or clusters you chose in the previous step.

figure6

  • At this point check all your selections

figure7

  • Click Finish

vSphere Client Procedure

  • Right click the Datacenter and select New Datastore Cluster
  • Put in a name

cluster1

  • Click Next and select the level of automation you want

cluster2

  • Click Next and choose your sDRS Runtime Rules

cluster3

  • Click Next and select Hosts and Clusters

cluster4

  • Click Next and select your Datastores

cluster5

  • Review your settings

cluster6

  • Click Finish
  • Check the Datastores view

cluster7