Archive for January 2013

Analyse I/O Workloads to determine storage Performance Requirements

What causes Storage Performance issues?

Analysis of storage system workloads is important for a number of reasons. It might be performed to understand the usage patterns of existing storage systems. Architects need to understand these usage patterns when designing a new storage system or improving an existing design, and system administrators need them when configuring and tuning a storage system.

Poor storage performance is generally the result of high I/O latency, but what can cause high I/O latency and how can it be addressed? Below is a list of things that can cause poor storage performance:

  • Undersized storage arrays/devices unable to provide the needed performance
  • I/O Stack Queue congestion
  • I/O Bandwidth saturation, Link/Pipe Saturation
  • Host CPU Saturation
  • Guest Level Driver and Queuing Interactions
  • Incorrectly Tuned Applications

Methods of determining Performance Requirements

There are various tools which can give us insight into how our applications are performing on a virtual infrastructure as listed below

  • vSphere Client Counters
  • esxtop/resxtop
  • vscsiStats
  • Iometer
  • I/O Analyzer (VMware Fling)

vSphere Client Counters

The most significant counters to monitor for disk performance are

  • Disk Throughput (Disk Read Rate/Disk Write rate/Disk Usage) Monitored per LUN or per Host
  • Disk Latency (Physical Device Read Latency/Physical Device Write Latency no greater than 15 ms and Kernel Disk Read Latency/Kernel Disk Write Latency no greater than 4 ms)
  • Number of commands queued
  • Number of active disk commands
  • Number of aborted disk commands (Disk Command Aborts)

ESXTOP/RESXTOP

The most significant counters to monitor for disk performance are below and can be monitored per HBA

  • READs/s – Number of Disk Reads/s
  • WRITEs/s – Number of Disk Writes/s
  • MBREAD/s – MB read per second
  • MBWRTN/s – MB written per second
  • GAVG (Guest Average Latency) total latency as seen from vSphere. GAVG is made up of KAVG and DAVG
  • KAVG (Kernel Average Latency) time an I/O request spent waiting inside the vSphere storage stack. Should be close to 0 but anything greater than 2 ms may be a performance problem
  • QAVG (Queue Average latency) time spent waiting in a queue inside the vSphere Storage Stack.
  • DAVG (Device Average Latency) latency coming from the physical hardware, HBA and storage device. Should be less than 10 ms
  • ACTV – Number of active I/O Operations
  • QUED – I/O operations waiting to be processed. If this is getting into constant double digits then look carefully as the storage hardware cannot keep up with the host
  • ABRTS/s – Number of commands aborted per second. A sign of an overloaded system
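The rules of thumb in the list above can be turned into a simple check. The following is a minimal sketch, assuming the counter values have already been read from esxtop by hand or from a batch export; the `DiskSample` class and `check_sample` function are hypothetical names for illustration, and a single sample cannot show whether queuing is "constant", so treat the output only as a prompt to look closer.

```python
from __future__ import annotations
from dataclasses import dataclass

# Thresholds quoted in the counter list above.
KAVG_LIMIT_MS = 2.0   # kernel latency above this may be a problem
DAVG_LIMIT_MS = 10.0  # device latency should stay below this
QUED_LIMIT = 10       # queued commands reaching double digits


@dataclass
class DiskSample:
    """One hypothetical esxtop reading for an HBA."""
    kavg_ms: float
    davg_ms: float
    qued: int
    aborts_per_s: float = 0.0

    @property
    def gavg_ms(self) -> float:
        # GAVG is made up of KAVG and DAVG.
        return self.kavg_ms + self.davg_ms


def check_sample(sample: DiskSample) -> list[str]:
    """Return one warning per counter that is outside the rule-of-thumb limits."""
    warnings = []
    if sample.kavg_ms > KAVG_LIMIT_MS:
        warnings.append(f"KAVG {sample.kavg_ms} ms is above {KAVG_LIMIT_MS} ms")
    if sample.davg_ms >= DAVG_LIMIT_MS:
        warnings.append(f"DAVG {sample.davg_ms} ms is not below {DAVG_LIMIT_MS} ms")
    if sample.qued >= QUED_LIMIT:
        warnings.append(f"QUED {sample.qued} is in double digits")
    if sample.aborts_per_s > 0:
        warnings.append(f"ABRTS/s {sample.aborts_per_s} is non-zero")
    return warnings


if __name__ == "__main__":
    sample = DiskSample(kavg_ms=3.0, davg_ms=12.0, qued=15, aborts_per_s=1.0)
    print(f"GAVG = {sample.gavg_ms} ms")
    for warning in check_sample(sample):
        print("WARNING:", warning)
```

In practice you would feed it a series of samples and only act when the same warning repeats.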


vscsiStats

Since ESX 3.5, VMware has provided a tool specifically for profiling storage: vscsiStats. vscsiStats collects and reports counters on storage activity. Its data is collected at the virtual SCSI device level in the kernel. This means that results are reported per VMDK (or RDM) irrespective of the underlying storage protocol. The following data are reported in histogram form:

  • IO size
  • Seek distance
  • Outstanding IOs
  • Latency (in microseconds)

vscsiStats Command Options

  • -l – Lists running virtual machines and their world (worldGroupID)
  • -s – Starts vscsiStats data collection
  • -x – Stops vscsiStats data collection
  • -p – Prints histogram information (all, ioLength, seekDistance, outstandingIOs, latency, interarrival)
  • -c – Produces results in a comma-delimited list
  • -h – Displays the help menu for more info
  • seekDistance is the distance in logical block numbers (LBN) that the disk head must travel to read or write a block. If a concentration of your seek distance is very small (less than 1), then the data is sequential in nature. If the seek distance is varied, your level of randomization may be proportional to this distance traveled
  • interarrival is the amount of time in microseconds between virtual machine disk commands.
  • latency is the time of the I/O trip.
  • ioLength is the size of the I/O. This is useful when you are trying to determine how to layout your disks or how to optimize the performance of the guest O/S and applications running on the virtual machines.
  • outstandingIOs will give you an idea of any queuing that is occurring.
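As a rough illustration of reading the seekDistance histogram, the sketch below estimates how much of a workload is near-sequential. The input format (a dictionary mapping a bucket limit in LBNs to an I/O count) and the sample numbers are assumptions made for this example; they are not real vscsiStats output, which you would need to parse into this shape first.

```python
from __future__ import annotations


def sequential_fraction(seek_histogram: dict[int, int], near: int = 1) -> float:
    """Share of I/Os whose seek-distance bucket limit is within +/- `near` LBNs."""
    total = sum(seek_histogram.values())
    if total == 0:
        return 0.0
    close = sum(count for limit, count in seek_histogram.items()
                if abs(limit) <= near)
    return close / total


if __name__ == "__main__":
    # Hypothetical bucket limits and counts, for illustration only.
    sample = {-5000: 10, -1: 5, 0: 20, 1: 140, 5000: 25}
    share = sequential_fraction(sample)
    print(f"Near-sequential share of I/Os: {share:.3f}")
```

A share close to 1 suggests a mostly sequential workload; a wide spread across large bucket limits suggests a random one.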

Instructions

I found vscsiStats in the following locations

/usr/sbin

/usr/lib/vmware/bin

  • Determine the world number for your virtual machine
  • Log into an SSH session and type
  • cd /usr/sbin
  • vscsiStats -l
  • Record the world ID for the virtual machine you would like to monitor
  • As per example below – 62615

Capture

  • Next capture data for your virtual machine
  • vscsiStats -s -w (worldgroup ID)
  • vscsiStats -s -w 62615
  • Although vscsiStats exits, it is still gathering data


  • Once it has started, it will automatically stop after 30 minutes
  • Type the below command to display histograms for all in a comma-delimited list
  • vscsiStats -p all -c
  • You will see many of these histograms listed


  • Type the following to show the latency histogram
  • vscsiStats -p latency


  • You can also run vscsiStats and output to a file
  • vscsiStats -p latency > /tmp/vscsioutputfile.txt
  • To manually stop the data collection and reset the counters, type the following command
  • vscsiStats -x -w 62615
  • To reset all counters  to zero, run
  • vscsiStats -r
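The capture workflow above (start, print, stop) can be collected into one helper so the same world group ID is used throughout. This is a sketch only: `vscsistats_commands` is a hypothetical helper that builds the command strings shown in this section and does not run anything on the host.

```python
from __future__ import annotations

# Histogram names accepted by -p, as listed in the command options above.
HISTOGRAMS = {"all", "ioLength", "seekDistance", "outstandingIOs",
              "latency", "interarrival"}


def vscsistats_commands(world_group_id: int, histogram: str = "latency",
                        comma_delimited: bool = False) -> list[str]:
    """Build the start, print and stop commands for one virtual machine."""
    if histogram not in HISTOGRAMS:
        raise ValueError(f"unknown histogram: {histogram}")
    print_cmd = f"vscsiStats -p {histogram}"
    if comma_delimited:
        print_cmd += " -c"
    return [
        f"vscsiStats -s -w {world_group_id}",  # start collection
        print_cmd,                             # print the histogram(s)
        f"vscsiStats -x -w {world_group_id}",  # stop collection
    ]


if __name__ == "__main__":
    for command in vscsistats_commands(62615, "all", comma_delimited=True):
        print(command)
```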

Iometer

What is Iometer?

http://www.electricmonk.org.uk/2012/11/27/iometer/

Iometer is an I/O subsystem measurement and characterization tool for single and clustered systems. It is used as a benchmark and troubleshooting tool and is easily configured to replicate the behaviour of many popular applications. One commonly quoted measurement provided by the tool is IOPS

Iometer can be used for measurement and characterization of:

  • Performance of disk and network controllers.
  • Bandwidth and latency capabilities of buses.
  • Network throughput to attached drives.
  • Shared bus performance.
  • System-level hard drive performance.
  • System-level network performance.

I/O Analyzer (VMware Fling)

http://labs.vmware.com/flings/io-analyzer

VMware I/O Analyzer is a virtual appliance solution, which provides a simple and standardized way of measuring storage performance in VMware vSphere virtualized environments. I/O Analyzer supports two types of workload generator: IOmeter for synthetic workload and trace replay for real-world application workload. It collects both guest level statistics as well as the host level statistics via VMware VI SDK. Standardizing load generation and stats collection increases the confidence of the customer and VMware engineers in the data collected. It also ensures completeness of data collected

Understand and apply LUN masking using PSA-related commands


What is LUN Masking?

LUN (Logical Unit Number) Masking is an authorization process that makes a LUN available to some hosts and unavailable to other hosts. LUN Masking is implemented primarily at the HBA (Host Bus Adapter) level. LUN Masking implemented at this level is vulnerable to any attack that compromises the HBA. Some storage controllers also support LUN Masking.

LUN Masking is important because Windows-based servers attempt to write volume labels to all available LUNs. This can render the LUNs unusable by other operating systems and can result in data loss.

How to MASK on a VMware ESXi Host

  • Step 1: Identifying the volume in question and obtaining the naa ID
  • Step 2: Run the esxcli command to associate/find this naa ID with the vmhba identifiers
  • Step 3: Masking the volume when you want to preserve data from the VMFS volumes for later use or if the volume is already deleted
  • Step 4: Loading the Claim Rules
  • Step 5: Verify that the claimrule has loaded
  • Step 6: Unclaim the volume in question
  • Step 7: Check Messages
  • Step 8: Unpresent the LUN
  • Step 9: Rescan all hosts
  • Step 10: Restore normal claim rules
  • Step 11: Rescan Datastores

Step 1

  • Check in both places as listed in the table above that you have the correct ID
  • Note: Check every LUN as sometimes VMware calls the same Datastore different LUN Numbers and this will affect your commands later


  • Make a note of the naa ID

Step 2

  • Once you have the naa ID from the above step, run the following command
  • esxcfg-mpath -L | grep naa.60050768028080befc00000000000050
  • Note we take the : off
  • -L parameter will show a compact list of paths

  • We can see there are 2 paths to the LUN called C0:T0:L40 and C0:T1:L40
  • C=Channel, T=Target, L=LUN
  • Next we need to check and see what claim rules exist in order to not use an existing claim rule number
  • esxcli storage core claimrule list
  • Note I had to revert to the vSphere 4 CLI command as I am screenprinting from vSphere 5 not 4!


Step 3

  • At this point you should be absolutely clear what LUN number you are using!


  • Next, you can use any rule number for the new claim rule that isn’t in the list above, which is pretty much anything from 101 upwards
  • In theory I have several paths so I should do this exercise for all of the paths
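Creating one MASK_PATH rule per path is easy to get wrong by hand, so the sketch below generates the commands for every path and skips rule numbers that are already in use. It assumes the ESXi 5.x syntax `esxcli storage core claimrule add -r <rule> -t location -A <adapter> -C <channel> -T <target> -L <lun> -P MASK_PATH`; please check this against `esxcli storage core claimrule add --help` on your host. The adapter name `vmhba2` is a hypothetical placeholder, and the script only prints commands.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    adapter: str  # e.g. "vmhba2" (hypothetical placeholder)
    channel: int  # C
    target: int   # T
    lun: int      # L


def mask_path_rules(paths: list[Path], first_rule: int = 101,
                    used_rules: frozenset[int] | set[int] = frozenset()) -> list[str]:
    """Build one MASK_PATH claim rule command per path, avoiding used rule numbers."""
    commands = []
    rule = first_rule
    for path in paths:
        while rule in used_rules:
            rule += 1
        commands.append(
            f"esxcli storage core claimrule add -r {rule} -t location "
            f"-A {path.adapter} -C {path.channel} -T {path.target} "
            f"-L {path.lun} -P MASK_PATH"
        )
        rule += 1
    return commands


if __name__ == "__main__":
    # The two paths from the example: C0:T0:L40 and C0:T1:L40.
    paths = [Path("vmhba2", 0, 0, 40), Path("vmhba2", 0, 1, 40)]
    # Pass the rule numbers shown by "esxcli storage core claimrule list".
    for command in mask_path_rules(paths, used_rules={101}):
        print(command)
```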


Step 4


  • The Class for those rules will show as file which means that it is loaded in /etc/vmware/esx.conf but it isn’t yet loaded into runtime.

Step 5


  • Run the following command to see those rules displayed twice, once as the file Class and once as the runtime Class

Step 6


  • Before these paths can be associated with the new plugin (MASK_PATH), they need to be disassociated from the plugin they are currently using. In this case those paths are claimed by the NMP plugin (rule 65535). This next command will unclaim all paths for that device and then reclaim them based on the claimrules in runtime.


Step 7

  • Check Messages


  • Refresh the Datastore and you should see it vanish from the host view
  • Run the following command again to check that it now shows no paths
  • esxcfg-mpath -L | grep naa.60050768028080befc00000000000050

Step 8

  • Now get your Storage Team to remove the LUN from the SAN

Step 9

  • Rescan all hosts and make sure the Datastore has gone

Step 10

  • To restore normal claimrules, perform these steps for every host that had visibility to the LUN, or from all hosts on which you created rules earlier:


  • Run esxcli corestorage claimrule load
  • Run esxcli corestorage claimrule list
  • Note that you do not see/should not see the rules that you created earlier.


  • Perform a rescan on all ESX hosts that had visibility to the LUN. If all of the hosts are in a cluster, right-click the cluster and click Rescan for Datastores. Previously masked LUNs should now be accessible to the ESX hosts

Step 11

  • Next you may have to follow the following KB Article if you find you have these messages in the logs or you cannot add new LUNs
  • Run the following commands on all HBA Adapters


Useful Video of LUN Masking

http://www.youtube.com/watch?feature=player_embedded&v=pyNZkZmTKQQ

Useful VMware Docs (ESXi4)

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1029786

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1015252

Useful VMware Doc (ESXi5)

http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&docTypeID=DT_KB_1_1&externalId=2004605

 

Understand and apply VMFS resignaturing

VMFS LUN UUID

Every VMFS based LUN is assigned a Universally Unique Identifier (UUID). The UUID is a unique hexadecimal number generated by VMware and is stored in the file system metadata, in a structure called the superblock.

When a LUN is copied or replicated, the copy ends up being absolutely identical to the original LUN, including having the same UUID. This means the copied LUN must be resignatured before it can be mounted alongside the original. ESXi can determine whether a LUN contains a VMFS copy and does not mount it automatically.

VMFS resignaturing does not apply to NFS Datastores

VMFS Resignaturing

  1. Creating a new signature for a drive is irreversible
  2. A datastore with extents (Spanned Datastore) may only be resignatured if all extents are online
  3. The VMs that use a datastore that was resignatured must be reassociated with the disk in their respective configuration files. The VMs must also be re-registered within vCenter
  4. The procedure is fault tolerant. If interrupted, it will continue later

Resignature a datastore using vSphere Client

  1. Log into vCenter using the vSphere Client
  2. Click Configuration > Storage
  3. Click Add Storage in the right window frame
  4. Select Disk/LUN and click Next
  5. Select the device to add and click Next
  6. You then have 3 options

  • Keep the existing signature: This option will leave the VMFS partition unchanged
  • Assign a new signature: This option will delete the existing disk signature and replace it with a new one. This option must be selected if the original VMFS volume is still mounted (It isn’t possible to have two separate volumes with the same UUID mounted simultaneously)
  • Format the disk: This option is the same as creating a new VMFS volume on an empty LUN

  7. Select Assign a new signature and click Next
  8. Review your changes and then click Finish
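The choice between the three options comes down to whether the data is needed and whether a volume with the same UUID is already mounted. A minimal sketch of that decision follows; the `choose_option` function and the UUID strings are hypothetical and only model the rule described above, they do not call any VMware API.

```python
from __future__ import annotations
from enum import Enum


class MountOption(Enum):
    KEEP_SIGNATURE = "Keep the existing signature"
    NEW_SIGNATURE = "Assign a new signature"
    FORMAT = "Format the disk"


def choose_option(copy_uuid: str, mounted_uuids: set[str],
                  keep_data: bool = True) -> MountOption:
    """Pick a wizard option for a LUN that holds a VMFS copy."""
    if not keep_data:
        # Same as creating a new VMFS volume on an empty LUN.
        return MountOption.FORMAT
    if copy_uuid in mounted_uuids:
        # Two volumes with the same UUID cannot be mounted simultaneously.
        return MountOption.NEW_SIGNATURE
    return MountOption.KEEP_SIGNATURE


if __name__ == "__main__":
    mounted = {"uuid-original"}  # made-up identifiers for illustration
    print(choose_option("uuid-original", mounted).value)
    print(choose_option("uuid-other", mounted).value)
```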

Applying resignaturing using ESXCLI

  • SSH into a host using PuTTY or log in to vMA
  • Type esxcli storage vmfs snapshot list. This will list the copies
  • esxcli storage vmfs snapshot mount -l (VolumeName)
  • esxcli storage vmfs snapshot resignature -l (VolumeName)

Troubleshooting

As of ESXi/ESX 4.0, it is no longer necessary to handle snapshot LUNs via the CLI. Resignature and Force-Mount operations have full GUI support and vCenter Server does VMFS rescans on all hosts after a resignature operation.

The snapshot LUN issue occurs when the ESXi/ESX host cannot confirm the identity of the LUN with what it expects to see in the VMFS metadata. This can be caused by replaced SAN hardware, firmware upgrades, SAN replication, DR tests, and some HBA firmware upgrades. Some ESXi/ESX host upgrades from 3.5 to 4.x (due to the change in naming convention from mpx to naa) have also been known to cause this, but this is a rare occurrence. For related information, see Managing Duplicate VMFS Datastores in the vSphere Storage Guide for ESXi 5.x.

Force mounting a VMFS datastore may fail if:

  1. Multiple ESXi/ESX 4.x and 5.0 hosts are managed by the same vCenter Server and these hosts are in the same datacenter.
  2. A snapshot LUN containing a VMFS datastore is presented to all these ESXi/ESX hosts.
  3. One of these ESXi/ESX hosts has force mounted the VMFS datastore that resides on this snapshot LUN.
  4. A second ESXi/ESX host is attempting to do an operation at the same time.

When one ESXi/ESX host force mounts a VMFS datastore residing on a LUN which has been detected as a snapshot, an object is added to the datacenter grouping in the vCenter Server database to represent that datastore.

When a second ESXi/ESX host attempts to do the same operation on the same VMFS datastore, the operation fails because an object already exists within the same datacenter grouping in the vCenter Server database.

Since an object already exists, vCenter Server does not allow mounting the datastore on any other ESXi/ESX host residing in that same datacenter.

ESXCLI Commands for troubleshooting


Useful YouTube Link

http://www.youtube.com/watch?feature=player_embedded&v=CFJTjbPGlY4

VMware Article Link

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1011387

Apply VMware storage Best Practices


Datastore supported features


VMware supported storage related functionality


Storage Best Practices

  • Always follow the vendor’s recommendations, whether the vendor is EMC, NetApp, HP etc
  • Document all configurations
  • In a well-planned virtual infrastructure implementation, a descriptive naming convention aids in identification and mapping through the multiple layers of virtualization from storage to the virtual machines. A simple and efficient naming convention also facilitates configuration of replication and disaster recovery processes.
  • Make sure your SAN fabric is redundant (Multi Path I/O)
  • Separate networks for storage array management and storage I/O. This concept applies to all storage protocols but is very pertinent to Ethernet-based deployments (NFS, iSCSI, FCoE). The separation can be physical (subnets) or logical (VLANs), but must exist.
  • If leveraging an IP-based storage protocol I/O (NFS or iSCSI), you might require more than a single IP address for the storage target. The determination is based on the capabilities of your networking hardware.
  • With IP-based storage protocols (NFS and iSCSI) you can channel multiple Ethernet ports together. NetApp refers to this function as a VIF. It is recommended that you create LACP VIFs over multimode VIFs whenever possible.
  • Use CAT 6 cabling rather than CAT 5
  • Enable Flow-Control (should be set to receive on switches and transmit on iSCSI targets)
  • Enable spanning tree protocol with either RSTP or portfast enabled. Spanning Tree Protocol (STP) is a network protocol that ensures a loop-free topology for any bridged LAN
  • Configure jumbo frames end-to-end. 9000 rather than 1500 MTU
  • Ensure Ethernet switches have the proper amount of port buffers and other internals to support iSCSI and NFS traffic optimally
  • Use Link Aggregation for NFS
  • Maximum of 2 TCP sessions per Datastore for NFS (1 Control Session and 1 Data Session)
  • Ensure that each HBA is zoned correctly to both SPs if using FC
  • Create RAID LUNs according to the application vendor’s recommendation
  • Use Tiered storage to separate High Performance VMs from Lower performing VMs
  • Choose Virtual Disk formats as required. Eager Zeroed, Thick and Thin etc
  • Choose RDMs or VMFS formatted Datastores dependent on supportability and on application vendor and virtualisation vendor recommendations
  • Utilise VAAI (vStorage APIs for Array Integration) Supported by vSphere 5
  • No more than 15 VMs per Datastore
  • Extents are not generally recommended
  • Use De-duplication if you have the option. This will manage storage and maintain one copy of a file on the system
  • Choose the fastest storage ethernet or FC adaptor (Dependent on cost/budget etc)
  • Enable Storage I/O Control
  • VMware highly recommend that customers implement “single-initiator, multiple storage target” zones. This design offers an ideal balance of simplicity and availability with FC and FCoE deployments.
  • Whenever possible, it is recommended that you configure storage networks as a single network that does not route. This model helps to make sure of performance and provides a layer of data security.
  • Each VM creates a swap or pagefile that is typically 1.5 to 2 times the size of the amount of memory configured for each VM. Because this data is transient in nature, we can save a fair amount of storage and/or bandwidth capacity by removing this data from the datastore, which contains the production data. In order to accomplish this design, the VM’s swap or pagefile must be relocated to a second virtual disk stored in a separate datastore
  • It is the recommendation of NetApp, VMware, other storage vendors, and VMware partners that the partitions of VMs and the partitions of VMFS datastores are to be aligned to the blocks of the underlying storage array. You can find more information around VMFS and GOS file system alignment in the following documents from various vendors
  • Failure to align the file systems results in a significant increase in storage array I/O in order to meet the I/O requirements of the hosted VMs
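To get a feel for how much transient swap or pagefile data could be moved off the production datastore, the 1.5 to 2 times memory rule of thumb above can be applied across a set of VMs. This is a rough sizing sketch with made-up VM memory sizes; the function names are hypothetical.

```python
from __future__ import annotations


def pagefile_range_gb(vm_memory_gb: float) -> tuple[float, float]:
    """Typical pagefile size range for one VM: 1.5x to 2x configured memory."""
    return (1.5 * vm_memory_gb, 2.0 * vm_memory_gb)


def transient_capacity_gb(vm_memory_sizes_gb: list[float]) -> tuple[float, float]:
    """Total (low, high) capacity needed on a separate pagefile datastore."""
    ranges = [pagefile_range_gb(memory) for memory in vm_memory_sizes_gb]
    low = sum(r[0] for r in ranges)
    high = sum(r[1] for r in ranges)
    return (low, high)


if __name__ == "__main__":
    # Hypothetical VMs with 4, 8 and 16 GB of configured memory.
    low, high = transient_capacity_gb([4, 8, 16])
    print(f"Plan for between {low} and {high} GB of pagefile capacity")
```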

vCenter Server Storage Filters


What are the vCenter Server Storage Filters?

They are filters provided by vCenter to help avoid device corruption or performance issues which could arise as a result of using an unsupported storage device.

Storage Filter Chart


How to access the Storage Filters

If you want to change the filter behaviour, please do the following

  • Log into the vSphere client
  • Select Administration > vCenter Server Settings
  • Select Advanced Settings
  • In the Key box, type the key you want to change
  • To disable the key, type False
  • Click Add
  • Click OK
  • Note the pic below is from vSphere 4.1
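For reference, the filter keys can be kept in a small lookup so the right key is typed into the Key box. The key names below are quoted from memory of the vSphere Storage guide and should be treated as an assumption; verify them against the documentation for your vCenter version before changing anything. The helper only builds the key/value pairs and does not talk to vCenter.

```python
from __future__ import annotations

# Filter name -> advanced setting key (assumed names, verify before use).
STORAGE_FILTER_KEYS = {
    "VMFS Filter": "config.vpxd.filter.vmfsFilter",
    "RDM Filter": "config.vpxd.filter.rdmFilter",
    "Same Host and Transports Filter": "config.vpxd.filter.SameHostAndTransportsFilter",
    "Host Rescan Filter": "config.vpxd.filter.hostRescanFilter",
}


def settings_to_disable(filter_names: list[str]) -> dict[str, str]:
    """Return the key/value pairs to add in Advanced Settings to disable filters."""
    return {STORAGE_FILTER_KEYS[name]: "False" for name in filter_names}


if __name__ == "__main__":
    for key, value in settings_to_disable(["RDM Filter"]).items():
        print(f"{key} = {value}")
```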


Determine appropriate RAID levels for various Virtual Machine workloads


Choosing a RAID level for a particular virtual machine workload means weighing many different factors if you want your machines to run at their full potential and in line with best practices.

Other factors

  • Manufacturers Disk IOPs values
  • Type of Disk. E.g SATA, SAS, NSATA, SSD and FC
  • Speed of Disk. E.g 15K or 10K RPM etc
  • To ensure a stable and consistent I/O response, maximize the number of VM storage disks available. This strategy enables you to spread disk reads and writes across multiple disks at once, which reduces the strain on a smaller number of drives and allows for greater throughput and response times.
  • Controller and transport speeds affect VM performance
  • Disk Cost.
  • Some vendors have their own proprietary RAID Level. E.g Netapp RAID DP
  • The RAID level you choose for your LUN configuration can further optimize VM performance. But there’s a cost-vs-functionality component to consider. RAID 0+1 and 1+0 will give you the best virtual machine performance but will come at a higher cost, because they utilize only 50% of all allocated disks
  • RAID 5 will give you more storage for your money, but it requires you to write parity bits across drives. However slower SANs or local VM storage can create a resource deficit which can create bottlenecks
  • Cache Sizes
  • Connectivity. E.g. iSCSI, FC or FCoE. Fibre Channel and iSCSI are the most common transports and within these transports, there are different speeds. E.g. 1/10 Gb iSCSI and 4/8 Gb FC
  • Thin provisioning. This will take up less space on the SAN but create extra I/O utilisation due to the zeroing of blocks on write
  • De-duplication. This does not necessarily improve storage performance but it stops duplicate data on storage which can save a great deal of money
  • Predictive Scheme. Create several LUNs with varying storage characteristics
  • Adaptive Scheme. Create large datastores and place VMs on and monitor performance

Please see the following links for general information on RAID and IOPS

http://www.electricmonk.org.uk/2013/01/03/raid-levels/

http://www.electricmonk.org.uk/2012/01/30/iops/

 

Group Policy Loopback Processing


Group Policy Processing

Group Policy Objects (GPO) are a collection of configurable policy settings that are organised as a single object and contain Computer Configuration policies which are applied to computers during Startup and User Configuration policies which are applied to users during logon.

Group Policy has 2 main configurations

  • Computer
  • User

When the computer starts, it processes all of the computer policies that are assigned to the computer object from AD in this order:

  • Local Policy
  • Site
  • Domain
  • OU
  • Child OU
  • Any startup scripts that were assigned to it in Group Policy

When a user logs in to the computer, the computer processes all of the policies assigned to that user object in this order:

  • Local Policy
  • Site
  • Domain
  • OU
  • Child OU
  • Any logon scripts that were assigned to it in Group Policy

What is Loopback processing?

The User Group Policy loopback processing mode option available within the computer configuration node of a Group Policy Object is a useful tool for ensuring certain user settings are applied on specified computers.

Essentially, loopback processing changes standard Group Policy processing so that user configuration settings are applied based on the computer’s GPO scope during logon. This means that user configuration options can be applied to all users who log on to a specific computer.

Where is Loopback Processing found?

Loopback processing is configured in the Group Policy Management Console in Computer Configuration / Policies / Administrative Templates / System / Group Policy / User Group Policy loopback processing mode.

Modes

  • Replace

Replace Mode replaces the User policy that is assigned to the user. In the Computer Configuration, set the loopback processing mode to Replace. Next, assign user policies to the computer in addition to the computer policies you would normally assign. When the computer starts, it will process the computer policies. When the user logs in, instead of processing the GPOs assigned to the user, the computer will apply the user policies that are assigned to the computer object.

Where can it be used?

  • File, Print, and other servers that non-admin users don’t typically access via the console or Remote Desktop. When someone with admin rights logs in via the console or Remote Desktop, they only have the default policy or any other policy
  • Redirecting folders, mapping printers, or assigning software with Group Policy; you don’t want unwanted drivers or software showing up on your production server that now has to be maintained or removed.
  • Kiosk systems. An Administrator would typically have an unrestricted desktop experience. If that user logs onto a Kiosk machine, he or she would normally have a “wide open” desktop. This might be dangerous, so it may be useful to enable Replace mode to enforce a specific set of settings.
  • Any other environment where the user settings should be determined by the computer account instead of the user account.
  • Terminal Servers


  • Merge

Merge Mode combines the user policies assigned to the computer with the policies assigned to the user, instead of completely replacing them as Replace Mode does. When the computer starts, it will process the assigned computer policies. When the user logs in, the computer will process the user policies assigned to the user as it normally would and then process the user policies that have been assigned to the computer object.


Where can it be used?

  • Merge Mode can be useful if you need to make additions to a policy or override a general user policy that a user receives when he/she logs in to a computer

Processing order of Loopback Mode

Without Loopback

  • Computer Node policies from all GPOs in scope for the computer account object are applied during start-up (in the normal Local, Site, Domain, OU order)
  • User Node policies from all GPOs in scope for the user account object are applied during logon (in the normal Local, Site, Domain, OU order).

Loopback processing enabled (Merge Mode)

  • Computer Node policies from all GPOs in scope for the computer account object are applied during start-up (in the normal Local, Site, Domain, OU order), the computer flags that loopback processing (Merge Mode) is enabled.
  • User Node policies from all GPOs in scope for the user account object are applied during logon (in the normal Local, Site, Domain, OU order).
  • As the computer is running in loopback (Merge Mode) it then applies all User Node policies from all GPOs in scope for the computer account object during logon (Local, Site, Domain and OU).
  • If any of these settings conflict with what was applied from the user account’s GPOs, then the computer account setting will take precedence.

Loopback processing enabled (Replace Mode)

  • Computer Node policies from all GPOs in scope for the computer account object are applied during start-up (in the normal Local, Site, Domain, OU order), the computer flags that loopback processing (Replace Mode) is enabled.
  • User Node policies from all GPOs in scope for the user account object are not applied during logon (as the computer is running loopback processing in Replace mode no list of user GPOs has been collected).
  • As the computer is running in loopback (Replace Mode) it then applies all User Node policies from all GPOs in scope for the computer account object during logon (Local, Site, Domain and OU)
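The three processing orders above can be modelled as "later GPOs overwrite earlier ones". The following is a simplified sketch, not how the Group Policy engine is implemented: each GPO is represented as a plain dictionary of user-node settings, lists are assumed to be already ordered Local, Site, Domain, OU, and the setting names are invented.

```python
from __future__ import annotations


def effective_user_settings(user_scope_gpos: list[dict[str, str]],
                            computer_scope_gpos: list[dict[str, str]],
                            mode: str | None = None) -> dict[str, str]:
    """Resolve user-node settings without loopback, or in merge/replace mode."""
    if mode is None:
        ordered = user_scope_gpos
    elif mode == "merge":
        # User-scope GPOs first, then computer-scope GPOs, which win conflicts.
        ordered = user_scope_gpos + computer_scope_gpos
    elif mode == "replace":
        # The user's own GPO list is never collected.
        ordered = computer_scope_gpos
    else:
        raise ValueError(f"unknown loopback mode: {mode}")

    settings: dict[str, str] = {}
    for gpo in ordered:
        settings.update(gpo)  # later GPOs overwrite earlier ones
    return settings


if __name__ == "__main__":
    user_gpos = [{"wallpaper": "corporate", "start_menu": "full"}]
    kiosk_gpos = [{"start_menu": "locked"}]
    for mode in (None, "merge", "replace"):
        print(mode, effective_user_settings(user_gpos, kiosk_gpos, mode))
```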

Useful Link

http://kudratsapaev.blogspot.co.uk/2009/07/loopback-processing-of-group-policy.html

Determine requirements for and configure NPIV


What does NPIV stand for?

(N_Port ID Virtualization)

What is an N_Port?

An N_Port is an end node port on the Fibre Channel fabric. This could be an HBA (Host Bus Adapter) in a server or a target port on a storage array.

What is NPIV?

N_Port ID Virtualization or NPIV is a Fibre Channel facility allowing multiple N_Port IDs to share a single physical N_Port. This allows multiple Fibre Channel initiators to occupy a single physical port, easing hardware requirements in Storage Area Network design, especially where virtual SANs are called for. NPIV is defined by the Technical Committee T11 in the Fibre Channel – Link Services (FC-LS) specification

NPIV allows a single host bus adaptor (HBA) or target port on a storage array to register multiple World Wide Port Names (WWPNs) and N_Port identification numbers. This allows each virtual server to present a different world wide name to the storage area network (SAN), which in turn means that each virtual server will see its own storage, but no other virtual server’s storage.

How NPIV-Based LUN Access Works

NPIV enables a single FC HBA port to register several unique WWNs with the fabric, each of which can be assigned to an individual virtual machine.

SAN objects, such as switches, HBAs, storage devices, or virtual machines can be assigned World Wide Name (WWN) identifiers. WWNs uniquely identify such objects in the Fibre Channel fabric. When virtual machines have WWN assignments, they use them for all RDM traffic, so the LUNs pointed to by any of the RDMs on the virtual machine must not be masked against its WWNs. When virtual machines do not have WWN assignments, they access storage LUNs with the WWNs of their host’s physical HBAs. By using NPIV, however, a SAN administrator can monitor and route storage access on a per virtual machine basis. The following section describes how this works.

When a virtual machine has a WWN assigned to it, the virtual machine’s configuration file (.vmx) is updated to include a WWN pair (consisting of a World Wide Port Name, WWPN, and a World Wide Node Name, WWNN). As that virtual machine is powered on, the VMkernel instantiates a virtual port (VPORT) on the physical HBA which is used to access the LUN. The VPORT is a virtual HBA that appears to the FC fabric as a physical HBA, that is, it has its own unique identifier, the WWN pair that was assigned to the virtual machine. Each VPORT is specific to the virtual machine, and the VPORT is destroyed on the host and it no longer appears to the FC fabric when the virtual machine is powered off. When a virtual machine is migrated from one ESX/ESXi to another, the VPORT is closed on the first host and opened on the destination host.

If NPIV is enabled, WWN pairs (WWPN & WWNN) are specified for each virtual machine at creation time. When a virtual machine using NPIV is powered on, it uses each of these WWN pairs in sequence to try to discover an access path to the storage. The number of VPORTs that are instantiated equals the number of physical HBAs present on the host. A VPORT is created on each physical HBA that a physical path is found on. Each physical path is used to determine the virtual path that will be used to access the LUN. Note that HBAs that are not NPIV-aware are skipped in this discovery process because VPORTs cannot be instantiated on them
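The discovery behaviour described above can be pictured with a small model. This is a deliberately simplified sketch and not VMkernel logic: it assumes one WWN pair is consumed per physical HBA in order, that a VPORT is created only on NPIV-aware HBAs with a physical path to the LUN, and that the `Hba` class, HBA names and WWN strings are all hypothetical.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class Hba:
    name: str
    npiv_aware: bool
    has_path_to_lun: bool


def instantiate_vports(hbas: list[Hba],
                       wwn_pairs: list[tuple[str, str]]) -> list[tuple[str, tuple[str, str]]]:
    """Return (HBA name, WWN pair) for each VPORT the model would create."""
    vports = []
    pairs = iter(wwn_pairs)
    for hba in hbas:
        if not hba.npiv_aware:
            continue  # VPORTs cannot be instantiated on these HBAs
        if not hba.has_path_to_lun:
            continue  # no physical path found on this HBA
        pair = next(pairs, None)
        if pair is None:
            break  # no WWN pairs left to assign
        vports.append((hba.name, pair))
    return vports


if __name__ == "__main__":
    hbas = [Hba("vmhba1", True, True), Hba("vmhba2", False, True), Hba("vmhba3", True, True)]
    pairs = [("wwpn-a", "wwnn-a"), ("wwpn-b", "wwnn-b")]  # made-up identifiers
    for name, pair in instantiate_vports(hbas, pairs):
        print(name, pair)
```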

Requirements

  • The Fibre Channel switch must support NPIV.
  • The HBA must support NPIV.
  • Raw Device Mappings (RDMs) must be used.
  • Use HBAs of the same type, either all QLogic or all Emulex. VMware does not support heterogeneous HBAs on the same host accessing the same LUNs
  • If a host uses multiple physical HBAs as paths to the storage, zone all physical paths to the virtual machine. This is required to support multipathing even though only one path at a time will be active
  • Make sure that physical HBAs on the host have access to all LUNs that are to be accessed by NPIV-enabled virtual machines running on that host
  • When configuring a LUN for NPIV access at the storage level, make sure that the NPIV LUN number and NPIV target ID match the physical LUN and Target ID
  • Keep the RDM on the same datastore as the VM configuration file.
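The requirements above lend themselves to a simple pre-flight checklist. The sketch below is a hypothetical helper, not a VMware tool: the NpivPlan fields are invented for the example and you would have to fill them in by hand from your own environment.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NpivPlan:
    # All fields are hypothetical inputs collected manually by the administrator.
    switch_supports_npiv: bool
    all_hbas_support_npiv: bool
    uses_rdm: bool
    hba_vendors: List[str]   # e.g. ["QLogic", "QLogic"]
    physical_lun_id: int
    npiv_lun_id: int
    rdm_datastore: str
    vmx_datastore: str


def check_requirements(plan: NpivPlan) -> List[str]:
    """Return a list of unmet requirements (an empty list means all checks passed)."""
    problems = []
    if not plan.switch_supports_npiv:
        problems.append("FC switch does not support NPIV")
    if not plan.all_hbas_support_npiv:
        problems.append("one or more HBAs do not support NPIV")
    if not plan.uses_rdm:
        problems.append("virtual machine does not use an RDM")
    if len({vendor.lower() for vendor in plan.hba_vendors}) > 1:
        problems.append("mixed HBA vendors on the same host")
    if plan.physical_lun_id != plan.npiv_lun_id:
        problems.append("NPIV LUN number does not match the physical LUN number")
    if plan.rdm_datastore != plan.vmx_datastore:
        problems.append("RDM is not on the same datastore as the VM configuration file")
    return problems


if __name__ == "__main__":
    plan = NpivPlan(True, True, True, ["QLogic", "Emulex"], 5, 5, "ds01", "ds01")
    for problem in check_requirements(plan):
        print("FAIL:", problem)
```

Zoning and target ID checks are left out because they depend on switch and array tooling that is outside the scope of this sketch.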

NPIV Capabilities

  • NPIV supports vMotion. When you use vMotion to migrate a virtual machine it retains the assigned WWN.
  • If you migrate an NPIV-enabled virtual machine to a host that does not support NPIV, VMkernel reverts to using a physical HBA to route the I/O
  • If your FC SAN environment supports concurrent I/O on the disks from an active-active array, the concurrent I/O to two different NPIV ports is also supported.

NPIV Limitations

  • Because NPIV is an extension to the FC protocol, it requires an FC switch and does not work with direct-attached FC disks.
  • When you clone a virtual machine or template with a WWN assigned to it, the clones do not retain the WWN.
  • NPIV does not support Storage vMotion.
  • Disabling and then re-enabling the NPIV capability on an FC switch while virtual machines are running can cause an FC link to fail and I/O to stop

Assign WWNs to Virtual Machines

You can create from 1 to 16 WWN pairs, which can be mapped to the first 1 to 16 physical HBAs on the host.

  • Open the New Virtual Machine wizard.
  • Select Custom, and click Next.
  • Follow all steps required to create a custom virtual machine.
  • On the Select a Disk page, select Raw Device Mapping, and click Next.
  • From a list of SAN disks or LUNs, select a raw LUN you want your virtual machine to access directly.
  • Select a datastore for the RDM mapping file.
  • You can place the RDM file on the same datastore where your virtual machine files reside, or select a different datastore.

Note: If you want to use vMotion for a virtual machine with enabled NPIV, make sure that the RDM file is located on the same datastore where the virtual machine configuration file resides.

  • Follow the steps required to create a virtual machine with the RDM.
  • On the Ready to Complete page, select the Edit the virtual machine settings before completion check box and click Continue.
  • The Virtual Machine Properties dialog box opens.
  • Click the Options tab, and select Fibre Channel NPIV
  • (Optional) Select the Temporarily Disable NPIV for this virtual machine check box
  • Select Generate new WWNs.
  • Specify the number of WWNNs and WWPNs.
  • A minimum of 2 WWPNs is needed to support failover with NPIV. Typically only 1 WWNN is created for each virtual machine.
  • Click Finish.
  • The host creates WWN assignments for the virtual machine.
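The limits mentioned in this procedure (1 to 16 WWN pairs, and at least 2 WWPNs for failover) can be captured in a small sanity check before you fill in the dialog. This is a hedged sketch: the function name is made up, and applying the 16 limit to the WWPN count is my interpretation of the text above.

```python
MAX_WWN_PAIRS = 16  # from "1 to 16 WWN pairs" above


def validate_wwn_counts(num_wwnn, num_wwpn, need_failover=False):
    """Return a list of warnings; raise ValueError for counts that are out of range."""
    if num_wwnn < 1:
        raise ValueError("at least one WWNN is required")
    if not 1 <= num_wwpn <= MAX_WWN_PAIRS:
        raise ValueError(f"WWPN count must be between 1 and {MAX_WWN_PAIRS}")

    warnings = []
    if need_failover and num_wwpn < 2:
        warnings.append("failover with NPIV needs at least 2 WWPNs")
    return warnings


if __name__ == "__main__":
    print(validate_wwn_counts(1, 1, need_failover=True))
    print(validate_wwn_counts(1, 2, need_failover=True))
```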

What to do next

Register the newly created WWNs in the fabric so that the virtual machine is able to log in to the switch, and assign storage LUNs to the WWNs.

NPIV Advantages

  • Granular security: Access to specific storage LUNs can be restricted to specific VMs using the VM WWN for zoning, in the same way that they can be restricted to specific physical servers.
  • Easier monitoring and troubleshooting: The same monitoring and troubleshooting tools used with physical servers can now be used with VMs, since the WWN and the fabric address that these tools rely on to track frames are now uniquely associated to a VM.
  • Flexible provisioning and upgrade: Since zoning and other services are no longer tied to the physical WWN “hard-wired” to the HBA, it is easier to replace an HBA. You do not have to reconfigure the SAN storage, because the new server can be pre-provisioned independently of the physical HBA WWN.
  • Workload mobility: The virtual WWN associated with each VM follows the VM when it is migrated across physical servers. No SAN reconfiguration is necessary when the workload is relocated to a new server.
  • Applications identified in the SAN: Since virtualized applications tend to be run on a dedicated VM, the WWN of the VM now identifies the application to the SAN.
  • Quality of Service (QoS): Since each VM can be uniquely identified, QoS settings can be extended from the SAN to VMs

Identify Supported HBA types

HBA Adapters

The three types of Host Bus Adapters (HBA) that you can use on an ESXi host are

  • Ethernet (iSCSI)
  • Fibre Channel
  • Fibre Channel over Ethernet (FCoE).
  • In addition to the hardware adapters, software versions of the iSCSI and FCoE adapters are available (software FCoE is new with version 5).

Compatibility Guide

To see the full list of supported HBAs, search the VMware Compatibility Guide.

Determine use cases for and configure VMware DirectPath I/O

DirectPath I/O allows virtual machine access to physical PCI functions on platforms with an I/O Memory Management Unit.

The following features are unavailable for virtual machines configured with DirectPath

  • Hot adding and removing of virtual devices
  • Suspend and resume
  • Record and replay
  • Fault tolerance
  • High availability
  • DRS (limited availability: the virtual machine can be part of a cluster, but cannot migrate across hosts)
  • Snapshots

Cisco Unified Computing Systems (UCS) through Cisco Virtual Machine Fabric Extender (VM-FEX) distributed switches support the following features for migration and resource management of virtual machines which use DirectPath I/O

  • Hot adding and removing of virtual devices
  • vMotion
  • Suspend and resume
  • High availability
  • DRS (limited availability)
  • Snapshots
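Comparing the two lists by hand is fiddly, so here is a small sketch that does it with Python sets. The feature names are copied from the lists above; the conclusion that anything missing from the UCS list remains unavailable is my own inference, not something the lists state explicitly.

```python
# Features listed above as unavailable for VMs configured with DirectPath.
UNAVAILABLE_WITH_DIRECTPATH = {
    "hot adding and removing of virtual devices",
    "suspend and resume",
    "record and replay",
    "fault tolerance",
    "high availability",
    "drs",
    "snapshots",
}

# Features listed above as supported on Cisco UCS with VM-FEX distributed switches.
SUPPORTED_ON_UCS_VMFEX = {
    "hot adding and removing of virtual devices",
    "vmotion",
    "suspend and resume",
    "high availability",
    "drs",
    "snapshots",
}


def still_unavailable_on_ucs():
    """Features from the first list that the UCS list does not bring back."""
    return sorted(UNAVAILABLE_WITH_DIRECTPATH - SUPPORTED_ON_UCS_VMFEX)


if __name__ == "__main__":
    for feature in still_unavailable_on_ucs():
        print(feature)
```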

Configure Passthrough Devices on a Host

  • Click on a Host
  • Select the Configuration Tab
  • Under Hardware, select Advanced Settings. You will see a warning message as per below

pass

  • Click Configure Passthrough. The Passthrough Configuration page appears, listing all available passthrough devices.

  • A green icon indicates that a device is enabled and active. An orange icon indicates that the state of the device has changed and the host must be rebooted before the device can be used

Configure a PCI Device on a VM

Prerequisites

Verify that a passthrough networking device is configured on the virtual machine's host, as per the instructions above.

Instructions

  • Select a VM
  • Power off the VM
  • From the Inventory menu, select Virtual Machine > Edit Settings
  • On the Hardware tab, click Add.
  • Select PCI Device and click Next
  • Select the Passthrough device to use
  • Click Finish
  • Power on VM

As per below, I haven’t configured any passthrough devices; this is just to show you where the settings are.

vmpci