Tag Archive for storage

Storage I/O Control

What is Storage I/O Control?

*VMware Enterprise Plus License Feature

Set an equal baseline and then define priority access to storage resources according to established business rules. Storage I/O Control enables a pre-programmed response to occur when access to a storage resource becomes contended.

With VMware Storage I/O Control, you can configure rules and policies to specify the business priority of each VM. When I/O congestion is detected, Storage I/O Control dynamically allocates the available I/O resources to VMs according to your rules, enabling you to:

  • Improve service levels for critical applications
  • Virtualize more types of workloads, including I/O-intensive business-critical applications
  • Ensure that each cloud tenant gets their fair share of I/O resources
  • Increase administrator productivity by reducing the amount of active performance management required
  • Increase flexibility and agility of your infrastructure by reducing your need for storage volumes dedicated to a single application

How is it configured?

It’s quite straightforward to do. First you have to enable it on the datastores. Only if you want to prioritize a certain VM’s I/Os do you need to do additional configuration steps, such as setting shares on a per-VM basis. Yes, this can be a bit tedious if you have a large number of VMs whose default shares value you want to change, but it only needs to be done once; after that, SIOC is up and running without any additional tweaking needed.
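
To make this concrete, here is a minimal pyVmomi sketch of enabling SIOC on a datastore and setting the congestion threshold programmatically. The vCenter address, credentials, datastore name, and the exact vmodl spec name (vim.StorageResourceManager.IORMConfigSpec) are my assumptions, not part of the original post; in the vSphere Client this is simply the “Enable Storage I/O Control” checkbox in the datastore’s properties.

```python
# Hedged sketch: enable SIOC on one datastore with pyVmomi (all names are placeholders).
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()                      # lab use only, skips cert checks
si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)
content = si.RetrieveContent()

# Find the datastore by name (assumes the name is unique in the inventory).
view = content.viewManager.CreateContainerView(content.rootFolder, [vim.Datastore], True)
ds = next(d for d in view.view if d.name == "SharedDatastore01")
view.DestroyView()

# Assumed spec type for the StorageResourceManager IORM configuration.
spec = vim.StorageResourceManager.IORMConfigSpec()
spec.enabled = True
spec.congestionThreshold = 30                               # latency threshold in milliseconds
content.storageResourceManager.ConfigureDatastoreIORM_Task(ds, spec)

Disconnect(si)
```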

The shares mechanism is triggered when the latency to a particular datastore rises above the pre-defined latency threshold (see “What threshold should you set?” below). Note that the latency is calculated cluster-wide. Storage I/O Control also allows you to tune the shares and to place a maximum on the number of IOPS that a particular VM can generate against a shared datastore. The Shares and IOPS values are configured on a per-VM basis: edit the settings of the VM, select the Resources tab, and the Disk setting will allow you to set the Shares value for when contention arises (set to Normal/1000 by default) and limit the IOPS that the VM can generate on the datastore (set to Unlimited by default).
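
For completeness, the same per-VM settings can also be scripted. The sketch below (pyVmomi again; the VM name and the shares/limit values are hypothetical) reconfigures the first virtual disk of a VM with custom shares and an IOPS limit via the disk’s storageIOAllocation property, reusing the connected content object from the previous example.

```python
# Hedged sketch: give one VM's first disk higher shares and cap its IOPS (pyVmomi).
view = content.viewManager.CreateContainerView(content.rootFolder, [vim.VirtualMachine], True)
vm = next(v for v in view.view if v.name == "critical-app-01")        # hypothetical VM name
view.DestroyView()

disk = next(dev for dev in vm.config.hardware.device
            if isinstance(dev, vim.vm.device.VirtualDisk))

alloc = vim.StorageResourceManager.IOAllocationInfo()
alloc.shares = vim.SharesInfo(level='custom', shares=2000)            # default is Normal/1000
alloc.limit = 1000                                                    # IOPS limit; -1 means Unlimited
disk.storageIOAllocation = alloc

dev_spec = vim.vm.device.VirtualDeviceSpec(
    operation=vim.vm.device.VirtualDeviceSpec.Operation.edit, device=disk)
vm.ReconfigVM_Task(vim.vm.ConfigSpec(deviceChange=[dev_spec]))
```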

Why enable it?

The thing is, without SIOC you could easily hit the noisy neighbour problem, where one VM uses more than its fair share of resources and impacts other VMs residing on the same datastore. Simply enabling SIOC on that datastore means the algorithms will ensure fairness across all VMs sharing it, as they will all have the same number of shares by default. This is a great reason for admins to use the feature when it is available to them. Another nice touch is that once SIOC is enabled, additional performance counters become available that you typically don’t have.

What threshold should you set?

30 ms is an appropriate threshold for most applications; however, you may want to have a discussion with your storage array vendor, as they often make recommendations around latency threshold values for SIOC.

Problems

A common problem seen with SIOC is the “External I/O workload detected on shared datastore” alarm (covered in more detail further down this page). One reason this can occur is when the back-end disks/spindles have other LUNs built on them, and those LUNs are presented to non-ESXi hosts. Check out VMware KB 1020651 for details on how to address this, and see this previous post:

http://www.electricmonk.org.uk/2012/04/20/external-io-workload-detected-on-shared-datastore-running-storage-io-control-sioc/

Deleting a VM with Raw Disk Mappings

How do you delete a VM with Raw Disk Mappings?

To the best of my knowledge, if you delete the VM, it will only delete the pointer file, and you would then have to delete the RDM LUN on the SAN.

Before deleting the VM, you could always click Edit Settings, select the RDM, and choose Remove Disk – Delete from disk. This should delete both the pointer file and the RDM.
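
The same “Remove Disk – Delete from disk” action can be scripted if you have many such VMs. A hedged pyVmomi sketch (the vm lookup and connection are assumed to be in place, as in the SIOC examples above) might look like this; note that the underlying LUN itself still has to be unpresented/removed on the SAN afterwards.

```python
# Hedged sketch: remove a VM's RDM disk and destroy its mapping file (pyVmomi).
from pyVmomi import vim

# Find the first disk whose backing is a raw device mapping.
rdm_disk = next(dev for dev in vm.config.hardware.device
                if isinstance(dev, vim.vm.device.VirtualDisk)
                and isinstance(dev.backing, vim.vm.device.VirtualDisk.RawDiskMappingVer1BackingInfo))

dev_spec = vim.vm.device.VirtualDeviceSpec()
dev_spec.operation = vim.vm.device.VirtualDeviceSpec.Operation.remove
dev_spec.fileOperation = vim.vm.device.VirtualDeviceSpec.FileOperation.destroy   # "Delete from disk"
dev_spec.device = rdm_disk

vm.ReconfigVM_Task(vim.vm.ConfigSpec(deviceChange=[dev_spec]))
```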

VMware vSphere Storage Appliance

A VMware vSphere Storage Appliance (VSA) is a virtual appliance that provides small and medium businesses with the benefits of VMware vSphere vMotion and High Availability without requiring shared storage.

VSA runs on an ESXi host. A VSA cluster is a group of ESXi hosts, each running its own VSA instance.

A VSA cluster enables the following features:

  • Shared Datastores for all hosts in the cluster
  • vMotion and HA
  • Datastore Replication
  • Hardware and Software Datastore failover capabilities

VSA is an alternative to SAN storage:

  • A SAN system provides a centralised array of storage
  • A VSA cluster provides a distributed array of storage
  • A VSA cluster eliminates the need to purchase expensive SAN storage

VSA Cluster Architecture

The architecture of a VSA cluster includes the physical servers that have local hard disks, ESXi as the operating system of the physical servers, and the vSphere Storage Appliance virtual machines that run clustering services to create volumes that are exported as the VSA datastores.

vSphere Storage Appliance supports the creation of a VSA cluster with two or three members. A vSphere Storage Appliance uses the hard disks of an ESXi host to create two volumes of the same size. It exports one of the volumes as a datastore. The other volume is a replica of the volume that is exported by another vSphere Storage Appliance from another host in the VSA cluster.

VSA Cluster with 2 hosts

In a VSA cluster with two VSA cluster members, an additional service called VSA cluster service runs on the vCenter Server machine. The service participates as a member in the VSA cluster, but it does not provide storage. To remain online, a VSA cluster requires that more than half of the members are also online. If one instance of a vSphere Storage Appliance fails, the cluster can remain online only if the remaining VSA cluster member and the VSA cluster service are online.

A VSA cluster with 2 members has 2 VSA datastores and maintains a replica of each datastore.

VSA Cluster with 3 hosts

A VSA cluster with 3 members has 3 VSA datastores and maintains a replica of each datastore. This configuration does not require the VSA cluster service to run on the vCenter Server system.

How does it work?

VSA uses the hard disks of the ESXi 5 hosts to maintain a datastore and its replica. VSA creates two volumes of the same size. It exports one of the volumes as a datastore. The other volume is a replica of the volume that is exported by another VSA from another host in the VSA cluster. A VSA cluster with 2 members has 2 VSA datastores and maintains a replica of each datastore.

A VSA cluster with 3 members has 3 datastores and maintains a replica of each datastore.
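
To make the volume/replica relationship easier to picture, here is a small toy illustration (plain Python, not VMware code): each cluster member exports one datastore and holds a replica of another member’s datastore. The round-robin pairing shown is only illustrative; the actual replica placement is decided by the VSA cluster itself.

```python
# Toy model of VSA datastore/replica pairing (illustrative only, hypothetical host names).
hosts = ["esxi-01", "esxi-02", "esxi-03"]

for i, host in enumerate(hosts):
    exported = f"VSADs-{i}"                          # the datastore this member exports
    replica = f"VSADs-{(i + 1) % len(hosts)}"        # a replica of another member's datastore
    print(f"{host}: exports {exported}, holds replica of {replica}")
```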

How is Data accessed?

Data is accessed in the VSA cluster using the NFS version 3 protocol. The NFS exports are used as ESXi datastores, which provide shared storage to all members of the VSA cluster. The default RAID configuration for the VSA cluster is RAID 10.
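
Since the VSA exports are plain NFS version 3 shares, the hosts mount them like any other NFS datastore (VSA Manager does this for you). Purely as an illustration, the equivalent pyVmomi call for one host would be along these lines; the export path, IP address, and datastore name are hypothetical.

```python
# Hedged sketch: mount an NFS v3 export as a datastore on one host (pyVmomi).
from pyVmomi import vim

spec = vim.host.NasVolume.Specification()
spec.remoteHost = "10.0.0.21"            # VSA member exporting the volume (placeholder IP)
spec.remotePath = "/exports/VSADs-0"     # hypothetical export path
spec.localPath = "VSADs-0"               # datastore name as seen by the host
spec.accessMode = "readWrite"

# 'host' is a vim.HostSystem object, looked up the same way as the datastore/VM earlier.
host.configManager.datastoreSystem.CreateNasDatastore(spec)
```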

vCenter Server continues to manage the ESXi hosts and the VMs. vCenter Server can only manage one VSA cluster at a time.

VSA Manager

A VSA cluster is created and managed by VSA Manager, a vCenter Server 5.0 extension that you install on a vCenter Server system. After you install VSA Manager, the VSA Manager tab is displayed in the vSphere Client. VSA Manager is used to do the following:

  • Deploying a VSA cluster
  • Mounting the volumes that each VSA instance exports as datastores
  • Monitoring, maintaining, and troubleshooting a VSA cluster

More Information

VMware VSA Documentation

http://www.vmware.com/support/pubs

Unable to add new LUNS on VMware 4.1 U2

Problem

This week we upgraded our hosts to VMware ESXi 4.1.0, build 582267. Our storage guy gave us 2 x 2 TB LUNs, but I was unable to add them, getting the error shown below. Previously he had created 2 TB LUNs and these had been fine.

Unable to read partition information from disk

Solution

It seems Update 2 enforces the maximum LUN size, which is 2 TB minus 512 bytes in vSphere 4.x. Depending on the storage system, 2 TB could mean either 2,000 GB (marketing size) or 2,048 GB (technical size). The above-mentioned maximum relates to the technical size, so with the storage system you have, you may need to configure 2,047 GB at most.
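
To see why a 2,048 GB LUN is rejected while 2,047 GB works, a quick back-of-the-envelope check (plain Python, nothing vSphere-specific) is enough:

```python
# vSphere 4.x limit: 2 TB (binary) minus 512 bytes, i.e. one sector short of 2 TiB.
GIB = 1024 ** 3
max_lun_bytes = 2 * 1024 ** 4 - 512

print(max_lun_bytes)                        # 2199023255040 bytes
print(2048 * GIB > max_lun_bytes)           # True  -> a full 2,048 GB (technical) LUN is too big
print(2047 * GIB <= max_lun_bytes)          # True  -> 2,047 GB fits
print(2000 * 1000 ** 3 <= max_lun_bytes)    # True  -> a 2,000 GB "marketing" LUN also fits
```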

See Also

http://virtualgeek.typepad.com/virtual_geek/2009/06/vsphere-and-2tb-luns-changes-from-vi3x.html

External I/O workload detected on shared datastore running Storage I/O Control (SIOC)

This alarm may appear in the vCenter vSphere Client. A warning message similar to one of these may also appear in the vCenter vSphere Client:

  • Non-VI Workload detected on the datastore
  • An external I/O workload is detected on datastore XYZABC

This informational event alerts the user of a potential misconfiguration or I/O performance issue caused by a non-ESX workload. It is triggered when Storage I/O Control (SIOC) detects that a workload that is not managed by SIOC is contributing to I/O congestion on a datastore that is managed by SIOC. (Congestion is defined as a datastore’s response time being above the SIOC threshold.) Specific situations that can trigger this event include:

  • The host is running in an unsupported configuration.
  • The storage array is performing a system operation such as replication or RAID reconstruction.
  • VMware Consolidated Backup or vStorage APIs for Data Protection are accessing a snapshot on the datastore for backup purposes.
  • The storage media (spindles, SSD) on which this datastore is located is shared with volumes used by non-vSphere workloads

SIOC continues to work during these situations. This event can be ignored in many cases and you can disable the associated alarm once you have verified that none of the potential misconfigurations or serious performance issues are present in your environment. As explained in detail below, SIOC ensures that the ESX workloads it manages are able to compete for I/O resources on equal footing with external workloads. This event notifies the user of what is happening, provides the user with the opportunity to better understand what is going on, and highlights a potential opportunity to correct or optimize the infrastructure configuration.

NOTE: At this time, SIOC is not supported with NFS storage or with Raw Device Mapping (RDM) virtual disks, including RDMs used for MSCS (Microsoft Cluster Server), or with datastores that have multiple extents. This alarm could occur if these storage objects are configured.

Example Scenario 1:

A /vmfs/volumes/shared-LUN datastore is accessible across multiple hosts. Some hosts are running ESX version 4.1 or later and others are either running an older version or are outside the control domain of vCenter Server.

Example Scenario 2:

The array being used for vSphere is also being used for non-vSphere workloads. The non-vSphere workloads are accessing a storage volume that is on the same disk spindles as the affected datastore.

Impact

When SIOC detects that datastore response time has exceeded the threshold, it typically throttles the ESX workloads accessing the datastore to ensure that the workloads with the highest shares get preference for I/O access to the datastore and lower I/O response time. However, such throttling is not appropriate when workloads not managed by SIOC are accessing the same storage media. Throttling in this case would result in the external workload getting more and more bandwidth, while the vSphere workloads get less and less. Therefore, SIOC detects the presence of such external workloads, and as long as they are present while the threshold is being exceeded, SIOC competes with the interfering workload by curtailing its usual throttling activity.

SIOC automatically detects when the interference goes away and resumes its normal behavior. In this way, SIOC is able to operate correctly even in the presence of interference. The vCenter Server event is notifying the user that SIOC has noticed and handled the interference from external workloads.

Note: When an external workload is acting to drive the datastore response time above the SIOC threshold, the external workload might cause I/O performance issues for vSphere workloads. In most cases, SIOC can automatically and safely manage this situation. However, there may be an opportunity to improve performance by changing some aspects of your configuration. The next section provides guidance on this.

These unsupported configurations can result in the event:

  • One or more hosts accessing the datastore are running an ESX version older than 4.1.
  • One or more hosts accessing the datastore are not managed by vCenter Server.
  • Not all of the hosts accessing the datastore are managed by the same vCenter Server.
  • The storage media (spindles, SSD) where this datastore is located are shared with other datastores that are not SIOC enabled.
  • Datastores in the configuration have multiple extents.

Ensure that you are running a supported configuration:

  • Can you disable and successfully re-enable congestion management for the affected datastore?

Disable and attempt to re-enable congestion management for the affected datastore. If the event occurred because the configuration includes hosts that are running an older version of ESX and the hosts are managed by the same vCenter Server, vCenter Server detects the problem and does not allow you to re-enable congestion management. When the older hosts are updated to ESX 4.1 or later, or the hosts are disconnected from the affected datastore, you can enable congestion management.
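
Disabling and re-enabling is done from the datastore’s properties dialog; if you prefer to script the check across many datastores, a hedged pyVmomi sketch (reusing the assumed IORMConfigSpec type and the content/ds objects from the earlier SIOC example) would be:

```python
# Hedged sketch: disable, then attempt to re-enable, SIOC on a datastore (pyVmomi).
def set_sioc(content, datastore, enabled, threshold_ms=30):
    spec = vim.StorageResourceManager.IORMConfigSpec()      # assumed vmodl type name
    spec.enabled = enabled
    spec.congestionThreshold = threshold_ms
    return content.storageResourceManager.ConfigureDatastoreIORM_Task(datastore, spec)

set_sioc(content, ds, enabled=False)   # disable congestion management
set_sioc(content, ds, enabled=True)    # vCenter rejects this if pre-4.1 hosts still share the datastore
```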

  • Are hosts that are not managed by this vCenter Server accessing the affected datastore?

If disabling and re-enabling congestion management for the affected datastore does not solve the problem, other hosts that are not managed by this vCenter Server might be accessing the datastore.

Verify whether the datastore is shared across hosts that are managed by different vCenter Server systems or are unmanaged. If so, also check the following:

  • Do all datastores in the configuration that share the same physical storage media (spindles, SSD) have the same SIOC configuration?
    All datastores that share physical storage media must share the same SIOC configuration — all enabled or all disabled. In addition, if you have modified the default congestion threshold setting, all datastores that share storage media must have the same setting.
  • Are any SIOC-enabled datastores in the configuration backed up by multiple extents?
    SIOC-enabled datastores must not be backed up by multiple extents.

If none of the above scenarios apply to your configuration and you have determined that you are running a supported configuration, but are still seeing this event, investigate possible I/O throttling by the storage array.

If an environment is known to have shared access to datastores or performance constraints, it may be preferable to disable the alarm in vCenter Server. For more information, see Working with Alarms in the vSphere 4.1 Datacenter Administration Guide.

Flowchart for Troubleshooting