www.electricmonk.org.uk

Comparing VM Encryption performance between ESXi 6.7U3 + vSAN and ESXi 7.0U2 + vSAN

May 9, 2021 HCIBench No comments

This blog is similar to another I wrote which compared VM Encryption and vSAN encryption on ESXi 6.7U3. This time, I’m comparing VM Encryption performance on ESXi 6.7U3 and ESXi 7.0U2 running on vSAN.

What is the problem which needs to be solved?

I have posted this section before on the previous blog however it is important to understand the effect of an extra layer of encryption has on the performance of your systems. It has become a requirement (sometimes mandatory) for companies to enable protection of both personal identifiable information and data; including protecting other communications within and across environments New EU General Data Protection Regulations (GDPR) are now a legal requirement for global companies to protect the personal identifiable information of all European Union residents. In the last year, the United Kingdom has left the EU, however the General Data Protection Regulations will still be important to implement. “The Payment Card Industry Data Security Standards (PCI DSS) requires encrypted card numbers. The Health Insurance Portability and Accountability Act and Health Information Technology for Economic and Clinical Health Acts (HIPAA/HITECH) require encryption of Electronic Protected Health Information (ePHI).” (Townsendsecurity, 2019) Little is known about the effect encryption has on the performance of different data held on virtual infrastructure. VM encryption and vSAN encryption are the two data protection options I will evaluate for a better understanding of the functionality and performance effect on software defined storage.

It may be important to understand encryption functionality in order to match business and legal requirements. Certain regulations may need to be met which only specific encryption solutions can provide. Additionally, encryption adds a layer of functionality which is known to have an effect on system performance. With systems which scale into thousands, it is critical to understand what effect encryption will have on functionality and performance in large environments. It will also help when purchasing hardware which has been designed for specific environments to allow some headroom in the specification for the overhead of encryption

Testing Components

Test lab hardware (8 Servers)

HCIBench Test VMs

80 HCIBench Test VMs will be used for this test. I have placed 10 VMs on each of the 8 Dell R640 servers to provide a balanced configuration. No virtual machines other than the HCIBench test VMs will be run on this system to avoid interference with the testing.

The HCIBench appliance is running vdBench, not Fio

The specification of the 80 HCIBench Test VMs are as follows.

RAID Configuration

VM encryption will be tested on RAID1 and RAID6 vSAN storage

VM encryption RAID1 storage policy

Test Parameters	Configuration
vCenter Storage Policy	Name = raid1_vsan_policy Storage Type = vSAN Failures to tolerate = 2 (RAID 1) Thin provisioned = Yes Number of disk stripes per object = 2 Encryption enabled = Yes Deduplication and Compression enabled = No

VM encryption RAID6 storage policy

Test Parameters	Configuration
vCenter Storage Policy	Name = raid6_vsan_policy Storage Type = vSAN Failures to tolerate = 2 (RAID6) Thin provisioned = Yes Number of disk stripes per object = 1 Encryption enabled = Yes Deduplication and Compression enabled = No

HCIBench Test Parameters

The test will run through various types of read/write workload at the different block sizes to replicate different types of applications using 1 and 2 threads.

0% Read 100% Write
20% Read 80% Write
70% Read 30% Write

The block sizes used are

4k
16k
64k
128k

The test plan below containing 24 tests will be run for VM Encryption on 6.7U3 and again for VM Encryption on 7.0U2. These are all parameter files which are uploaded in HCIBench then can run sequentially without intervention through the test. I think I left these running for 3 days! It refreshes the cache in between tests.

Scroll across at the bottom to see the whole table

Test	Number of disks	Working Set %	Number of threads	Block size (k)	Read %	Write %	Random %	Test time (s)
1	2 (O/S and Data)	100%	1	4k	0	100	100	7200
2	2 (O/S and Data)	100%	2	4k	0	100	100	7200
3	2 (O/S and Data)	100%	1	4k	20	80	100	7200
4	2 (O/S and Data)	100%	2	4k	20	80	100	7200
5	2 (O/S and Data)	100%	1	4k	70	30	100	7200
6	2 (O/S and Data)	100%	2	4k	70	30	100	7200
7	2 (O/S and Data)	100%	1	16k	0	100	100	7200
8	2 (O/S and Data)	100%	2	16k	0	100	100	7200
9	2 (O/S and Data)	100%	1	16k	20	80	100	7200
10	2 (O/S and Data)	100%	2	16k	20	80	100	7200
11	2 (O/S and Data)	100%	1	16k	70	30	100	7200
12	2 (O/S and Data)	100%	2	16k	70	30	100	7200
13	2 (O/S and Data)	100%	1	64k	0	100	100	7200
14	2 (O/S and Data)	100%	2	64k	0	100	100	7200
15	2 (O/S and Data)	100%	1	64k	20	80	100	7200
16	2 (O/S and Data)	100%	2	64k	20	80	100	7200
17	2 (O/S and Data)	100%	1	64k	70	30	100	7200
18	2 (O/S and Data)	100%	2	64k	70	30	100	7200
19	2 (O/S and Data)	100%	1	128k	0	100	100	7200
20	2 (O/S and Data)	100%	2	128k	0	100	100	7200
21	2 (O/S and Data)	100%	1	128k	20	80	100	7200
22	2 (O/S and Data)	100%	2	128k	20	80	100	7200
23	2 (O/S and Data)	100%	1	128k	70	30	100	7200
24	2 (O/S and Data)	100%	2	128k	70	30	100	7200

HCIBench Performance Metrics

These metrics will be measured across all tests

Workload Parameter	Explanation	Value
IOPs	IOPS measures the number of read and write operations per second	Input/Outputs per second
Throughput	Throughput measures the number of bits read or written per second Average IO size x IOPS = Throughput in MB/s	MB/s
Read Latency	Latency is the response time when you send a small I/O to a storage device. If the I/O is a data read, latency is the time it takes for the data to come back	ms
Write Latency	Latency is the response time when you send a small I/O to a storage device. If the I/O is a write, latency is the time for the write acknowledgement to return.	ms
Latency Standard Deviation	Standard deviation is a measure of the amount of variation within a set of values. A low standard deviation indicates that the values tend to be close to the mean of the set, while a high standard deviation indicates that the values are spread out over a wider range	Values must be compared to the standard deviation
Average ESXi CPU usage	Average ESXi Host CPU usage	%
Average vSAN CPU usage	Average CPU use for vSAN traffic only	%

Results

IOPs

IOPS and block size tend to have an inverse relationship. As the block size increases, it takes longer latency to read a single block, and therefore the number of IOPS decreases however, smaller block sizes yield higher IOPS

With RAID1 VM Encryption, 7.0U2 performs better than 6.7U3 at the lower block level – 4k and 16k but as we get into the larger 64k and 128k blocks, there is less of a difference with 6.7U3 having the slight edge over IOps performance.

With RAID6 VM Encryption, 7.0U2 has consistently higher IOPS across all tests than 6.7U3.

RAID6 VM Encryption produces less IOPs than RAID1 VM Encryption which is expected due to the increased overhead RAID6 incurs over RAID1 in general. RAID 1 results in 2 writes, one to each mirror. A RAID6 single write operation results in 3 reads and 3 writes (due to double parity) Each write operation requires the disks to read the data, read the first parity, read the second parity, write the data, write the first parity and then finally write the second parity.

RAID 1 VM Encryption

The graph below shows the comparison of IOPs between 6.7U3 and 7.0U2 with RAID 1 VM Encryption

Click the graph for an enlarged view

RAID 6 VM Encryption

The graph below shows the comparison of IOPs between 6.7U3 and 7.0U2 with RAID6 VM Encryption

Click the graph for an enlarged view

Throughput

IOPs and throughput are closely related by the following equation.

Throughput (MB/s) = IOPS * Block size

IOPS measures the number of read and write operations per second, while throughput measures the number of bits read or written per second. The higher the throughput, the more data which can be transferred. The graphs follow a consistent pattern from the heavier to the lighter workload tests. I can see the larger block sizes such as 64K and 128K have the greater throughput in each of the workload tests than 4K or 8K. As the block sizes get larger in a workload, the number of IOPS will decrease. Even though it’s fewer IOPS, you’re getting more data throughput because the block sizes are bigger. The vSAN datastore is a native 4K system. It’s important to remember that storage systems may be optimized for different block sizes. It is often the operating system and applications which set the block sizes which then run on the underlying storage. It is important to test different block sizes on storage systems to see the effect these have.

With RAID1 VM Encryption at at lower block sizes, 4k and 16k, 7.0U2 performs better with greater throughput. At the higher block sizes 64k and 128k, there is less of a difference with 6.7U3 performing slightly better but the increase is minimal.

With RAID6 VM Encryption, there is generally a higher throughput at the lower block sizes but not at the higher block sizes

RAID1 VM Encryption

The graph below shows the comparison of throughput between 6.7U3 and 7.0U2 with RAID1 VM Encryption

Click the graph for an enlarged view

RAID6 VM Encryption

The graph below shows the comparison of throughput between 6.7U3 and 7.0U2 with RAID6 VM Encryption

Click the graph for an enlarged view

Average Latency

With RAID1 VM Encryption at at lower block sizes, 4k and 16k, 7.0U2 shows less latency but at the higher block sizes there is a slight increase in latency than 6.7U3

With RAID6 VM Encryption, the 7.0U2 tests are better showing less latency than the 6.7U3 tests

RAID1 VM Encryption

The graph below shows the comparison of average latency between 6.7U3 and 7.0U2 with RAID1 VM Encryption

Click the graph for an enlarged view

RAID6 VM Encryption

The graph below shows the comparison of average latency between 6.7U3 and 7.0U2 with RAID6 VM Encryption

Click the graph for an enlarged view

Read Latency

The pattern is consistent between the read/write workloads. As the workload decreases, read latency decreases although the figures are generally quite close. Read latency for all tests varies between 0.30 and 1.40ms which is under a generally recommended limit of 15-20ms before latency starts to cause performance problems.

RAID1 VM Encryption shows lower read latency for the 7.0U2 tests than 6.7U3. There are outlier values for the Read Latency across the 4K and 16K block size when testing 2 threads which may be something to note if applications will be used at these block sizes.

RAID6 shows a slightly better latency result than RAID1 however RAID6 has more disks than mirrored RAID1 disks to read from than RAID1 therefore the reads are very fast which is reflected in the results. Faster reads result in lower latency. Overall 7.0U2 performs better than 6.7U3 apart from one value at the 128k block size with 2 threads which may be an outlier.

RAID1 VM Encryption

Click the graph for an enlarged view

RAID6 VM Encryption

Click the graph for an enlarged view

Write Latency

The lowest write latency is 0.72ms and the largest is 9.56ms. Up to 20ms is the recommended value from VMware however with all flash arrays, thse values are expected and well within these limits. With NVMe and flash disks, the faster hardware may expose bottlenecks elsewhere in hardware stack and architecture which can be compared with internal VMware host layer monitoring. Write latency can occur at several virtualization layers and filters which each cause their own latency. The layers can be seen below.

This image has an empty alt attribute; its file name is image-14.png

Latency can be caused by limits on the storage controller, queuing at the VMkernel layer, the disk IOPS limit being reached and the types of workloads being run possibly alongside other types of workloads which cause more processing.

With RAID1 Encryption, 7.0U2 performed better at the lower block size with less write latency than 6.7U3. However on the higher block sizes, 64k and 128k, 6.7U3 performs slightly better but we are talking 1-2ms.

With RAID6 VM Encryption, 7.0U2 performed well with less latency across all tests than 6.7U3.

As expected, all the RAID6 results incurred more write latency than the RAID1 results. Each RAID6 write operation requires the disks to read the data, read the first parity, read the second parity, write the data, write the first parity and then finally write the second parity producing a heavy write penalty and therefore more latency

RAID1 VM Encryption

Click the graph for an enlarged view

RAID6 VM Encryption

Click the graph for an enlarged view

Latency Standard Deviation

The standard deviation value in the testing results uses a 95^th percentile. This is explained below with examples.

An average latency of 2ms and a 95th percentile of 6ms means that 95% of the IO were serviced under 6ms, and that would be a good result
An average latency of 2ms and a 95th percentile latency of 200ms means 95% of the IO were serviced under 200ms (keeping in mind that some will be higher than 200ms). This means that latencies are unpredictable and some may take a long time to complete. An operation could take less than 2ms, but every once in a while, it could take well over 200
Assuming a good average latency, it is typical to see the 95th percentile latency no more than 3 times the average latency.

With RAID1 Encryption, 7.0U2 performed better at the lower block size with less latency standard deviation than 6.7U3. However on the higher block sizes, 64k and 128k, 6.7U3 performs slightly better.

With RAID 6 VM Encryption, 7.0U2 performed with less standard deviation across all the tests.

RAID1 VM Encryption

Click the graph for an enlarged view

RAID6 VM Encryption

Click the graph for an enlarged view

ESXi CPU Usage %

With RAID1 VM Encryption, at the lower block sizes, 4k and 16k, 7.0U2 uses more CPU but at the higher block sizes, 7.0U2 uses slightly less CPU usage.

With RAID6 VM Encryption, there is an increase in CPU usage across all 7.0U2 compared to 6.7U3 tests. RAID 6 has a higher computational penalty than RAID1.

RAID1 VM Encryption

Click the graph for an enlarged view

RAID6 VM Encryption

Click the graph for an enlarged view

Conclusion

The performance tests were designed to get an overall view from a low workload test of 30% Write, 70% Read through a series of increasing workload tests of 80% Write, 20% Read and 100% Write, 0% Read simulation. These tests used different block sizes to simulate different application block sizes. Testing was carried out on an all flash RAID1 and RAID6 vSAN datastore to compare the performance for VM encryption between ESXi 6.7U3 and 7.0U2. The environment was set up to vendor best practice across vSphere ESXi, vSAN, vCenter and the Dell server configuration.

RAID1 VM Encryption

With 6.7U3, IOPs at the higher block sizes, 64k and 128k can be slightly better than 7.0U2 but not at lower block sizes.
With 6.7U3, throughput at the higher block sizes, 64k and 128k can be slightly better than 7.0U2 but not at lower block sizes
Overall latency for 6.7U3 at the higher block sizes, 64k and 128k can be slightly better than 7.0U2 but not for the lower block size
Read latency for 6.7U3 is higher than 7.0U2.
Write latency at the higher block sizes, 64k and 128k can be slightly better than 7.0U2 but not for the lower block sizes.
There is more standard deviation for 6.7U3 then 7.0U2.
At the lower blocks sizes, 6.7U3 uses less CPU on the whole but at the higher block sizes, 7.0U2 uses less CPU

RAID6 VM Encryption

There are higher IOPs for 7.0U2 than 6.7U3 across all tests.
There is generally a higher throughput for 7.0U2 at the lower block sizes, than 6.7U3 but not at the higher block sizes. However, the difference is minimal.
There is lower overall latency for 7.0U2 than 6.7U3 across all tests
There is lower read latency for 7.0U2 than 6.7U3 across all tests
There is lower write latency for 7.0U2 than 6.7U3 across all tests
There is less standard deviation for 7.0U2 than 6.7U3 across all tests
There is a higher CPU % usage for 7.0U2 than 6.7U3 across all tests

With newer processors, AES improvements, memory improvements, RDMA NICs and storage controller driver improvements, we may see further performance improvements in new server models.

BIOS and UEFI

March 22, 2021 IT No comments

What does the BIOS do?

The BIOS (Basic Input Output Operating System) is the first piece of software which runs and carries out the following tasks.

Performing POST – (Power-On Self-Test) in this phase the BIOS checks if the components installed on the motherboard are functioning
Basic I/O checks -This checks the peripherals such as the keyboard, the monitor and serial ports can operate to perform basic tasks.
Booting – The BIOS tries to boot from the devices connected (SSDs, HDDs, PXE, whatever) in order to provide an Operating System) to operate the computer.

It can also be a low level management tool providing some ability to tweak system features and settings

What is UEFI?

UEFI stands for Unifed Extensible Firmware Interface. UEFI was released in 2007 to provide a successor to BIOS to overcome limitations. Before this computers used the BIOS (Basic Input Output Operating System). Most UEFI firmware implementations provide support for legacy BIOS services.

UEFI Advantages over BIOS

32-bit/64/bit architecture rather than 16-bit
CPU independent architecture
Ability to use large disk partitions over 2TB. UEFI’s theoretical size limit for bootable drives is more than nine zettabytes, while BIOS can only boot from drives 2TB or smaller.
Flexible pre-OS environment, including network capability, GUI, multi language
Expanded BIOS with a GUI and mouse ability
UEFI Secure Boot feature, which employs digital signatures to verify the integrity of low-level code like boot loaders and operating system files before execution. If validation fails, Secure Boot halts execution of the compromised bits to stop any potential attack in its tracks. Secure Boot was added in version 2.2 of the UEFI specification
UEFI does not use the Master Boot Record (MBR) scheme to store the low-level bits that bootstrap the operating system. Under the MBR, these key bits reside in the first segment of the disk, and any corruption or damage to that area stops the operating system from loading. Instead, UEFI uses the GUID Partition Table (GPT) scheme and stores initialization code in an .efi file found in a hidden partition. GPT also stores redundant copies of this code and uses cyclic redundancy checks to detect changes or corruption of the data
C / C++ language used instead of assembly language
Backwards compatibility with MBR hard drives

UEFI Specification

This can be found at the link – https://uefi.org/specifications

Considerations

When building Windows 10 or Windows Server 2016 VM’s, it is recommended you build them with EFI firmware enabled. Moving from traditional BIOS/MBR to EFI (UEFI) firmware afterwards introduces some challenges later on down the line and can cause machines not to boot.

UEFI still cannot be used for auto deploying vSphere ESXi hosts but this may change in the future.

Using PowerCLI Image Builder CLI to build a new ESXi 7.0U1c image

February 14, 2021 AutoDeploy No comments

What do we need to build a custom image?

An ESXi image (Download from myvmware.com) and use the depot zip

VMware PowerCLI and the ESXi Image Builder module

For more information on setting this up, see this blog. Thanks to Michelle Laverick.

Other software depots

The vSphere ESXi depot is the main software depot you will need but there are other depots provided by vendors who create collections of VIBs specially packaged for distribution. Depots can be Online and Offline. An online software depot is accessed remotely using the HTTP protocol. An offline software depot is downloaded and accessed locally. These depots have the vendor specific VIBs that you will need to combine with the vSphere ESXi depot in order to create your custom installation image. An example could be HP’s depot on this link

What are VIBS?

VIB actually stands for vSphere Installation Bundle. It is basically a collection of files packaged into a single archive to facilitate distribution. It is composed of 3 parts

A file archive (The files which will be installed on the host)
An xml descriptor file (Describes the contents of the VIB. It contains the requirements for installing the VIB and identifies who created the VIB and the amount of testing that’s been done including any dependencies, any compatibility issues, and whether the VIB can be installed without rebooting.)
A signature file (Verifies the acceptance level of the VIB) There are 4 acceptance levels. See next paragraph

Acceptance levels

Each VIB is released with an acceptance level that cannot be changed. The host acceptance level determines which VIBs can be installed to a host.

VMwareCertfied

The VMwareCertified acceptance level has the most stringent requirements. VIBs with this level go through thorough testing fully equivalent to VMware in-house Quality Assurance testing for the same technology. Today, only I/O Vendor Program (IOVP) program drivers are published at this level. VMware takes support calls for VIBs with this acceptance level.

VMwareAccepted

VIBs with this acceptance level go through verification testing, but the tests do not fully test every function of the software. The partner runs the tests and VMware verifies the result. Today, CIM providers and PSA plug-ins are among the VIBs published at this level. VMware directs support calls for VIBs with this acceptance level to the partner’s support organization.

PartnerSupported

VIBs with the PartnerSupported acceptance level are published by a partner that VMware trusts. The partner performs all testing. VMware does not verify the results. This level is used for a new or nonmainstream technology that partners want to enable for VMware systems. Today, driver VIB technologies such as Infiniband, ATAoE, and SSD are at this level with nonstandard hardware drivers. VMware directs support calls for VIBs with this acceptance level to the partner’s support organization.

CommunitySupported

The CommunitySupported acceptance level is for VIBs created by individuals or companies outside of VMware partner programs. VIBs at this level have not gone through any VMware-approved testing program and are not supported by VMware Technical Support or by a VMware partner.

Steps to create an custom ESXi image

I have an ESXI 7.0U1c software depot zip file and I am going to use an Intel VIB which I will add into the custom image

2. Open PowerCLI and connect to your vCenter

Connect-VIServer <vCenterServer>

3. Next I add my vSphere ESXi and Intel software depot zips

Add-EsxSoftwareDepot c:\Users\rhian\Downloads\VMware-ESXi-7.0U1c-17325551-depot.zip

Add-EsxSoftwareDepot c:\Users\rhian\Downloads\intel-nvme-vmd-en_2.5.0.1066-1OEM.700.1.0.15843807_17238162.zip

4. If you want to check what packages are available once the software depots have been added.

Get-EsxSoftwarePackage

5. Next we can check what image profiles are available. We are going to clone one of these profiles

Get-EsxImageProfile

6. There are two ways to create a new image profile, you can create an empty image profile and manually specify the VIBs you want to add, or you can clone an existing image profile and use that. I have cloned an existing image profile

New-EsxImageProfile -CloneProfile ESXi-7.0U1c-17325551-standard -name esxi701c-imageprofile -vendor vmware -AcceptanceLevel PartnerSupported

If I do a Get-EsxImageProfile now, I can see the new image profile I created

7. Next, I’ll use the Add-EsxSoftwarePackage to add and remove VIBs to/from the image profile. First of all I’ll check my extra Intel package to get the driver name then I will add the software package

Get-EsxSoftwarePackage | where {$_.Vendor -eq “INT”}

Add-EsxSoftwarePackage -ImageProfile esxi701c-imageprofile -SoftwarePackage intel-nvme-vmd -Force

8. We now have the option to export the profile as a zip or an iso.

Export-EsxImageProfile -ImageProfile esxi701c-imageprofile -FilePath c:\Users\rhian\Downloads\esxi701c-imageprofile.zip -ExportToBundle -Force -NoSignatureCheck

Export-EsxImageProfile -ImageProfile esxi701c-imageprofile -FilePath c:\Users\rhian\Downloads\esxi701c-imageprofile.iso -ExportToIso -Force -NoSignatureCheck

9. Just as a note, If you need to change the acceptance level, then you can do so by running the following command before creating the iso or zip. The example below shows changing the imageprofile to the PartnerSupport acceptance level.

Set-EsxImageProfile -AcceptanceLevel PartnerSupported –ImageProfile esxi701c-imageprofile

Useful tip

Typing history in PowerCLI will show you all the commands you have typed. Very handy to check mistakes or save the commands for future use.

Installing Linux bash shell on Windows

January 31, 2021 IT No comments

Cananonical and Windows have linked up to provide the ability to run Linux on Windows. Developers can also use Cygwin, MSYS, or run Linux in a virtual machine, but these workarounds have their own disadvantages and can overload systems. Bash on Windows provides a Windows subsystem and Ubuntu Linux runs on top of it.

Basically, Windows allows you to run the same Bash shell that you find on Linux. This way you can run Linux commands inside Windows without the needing to install a virtual machine, or dual booting Linux/Windows. You install Linux inside Windows like a regular application. This is a good option if you want to learn Linux/Unix commands.

How to enable

Go to Control Panel – Programs and Features – Turn Windows Features on and off.
Enable Windows Subsystem for Linux and Virtual Machine Platform
Reboot

Go to the Windows store and search for Linux or Ubuntu. Install the distribution you want. In my case Ubuntu.

Once Ubuntu has installed, you will need to set up a username and password
This occurs for the first run. Bash shell will be available to use the next time you log in.

When you open the Bash shell in Windows, you are literally running Ubuntu. Developers can now run Bash scripts, Linux command-line tools like sed, awk, grep, and Linux-first tools like Ruby, Git, Python, etc. directly on Windows.
Search for bash or wsl in the Windows search box

Almost all Linux commands can be used in the Bash shell on Windows
Opening bash and wsl will display as a “Run command” that can be selected to instantly open the bash shell. The difference with using either of these methods is that they open in the /mnt/c/Windows/System32 directory so you can browse the System32 subdirectory in Windows 10.

Or you can simply open the Ubuntu app

Examples

You can run sudo apt-get update and sudo apt-get upgrade to obtain and install updates along with all usual Linux commands.

Comparing the functionality and performance of VM encryption and vSAN encryption on RAID1 and RAID6 vSAN storage

December 5, 2020 HCIBench, vSAN No comments

What is the problem which needs to be solved?

It has become a requirement for companies to enable protection of both personal identifiable information and data; including protecting other communications within and across environments New EU General Data Protection Regulations (GDPR) are now a legal requirement for global companies to protect the personal identifiable information of all European Union residents. In the last year, the United Kingdom has left the EU, however the General Data Protection Regulations will still be important to implement. “The Payment Card Industry Data Security Standards (PCI DSS) requires encrypted card numbers. The Health Insurance Portability and Accountability Act and Health Information Technology for Economic and Clinical Health Acts (HIPAA/HITECH) require encryption of Electronic Protected Health Information (ePHI).” (Townsendsecurity, 2019) Little is known about the effect encryption has on the performance of different data held on virtual infrastructure. VM encryption and vSAN encryption are the two data protection options I will evaluate for a better understanding of the functionality and performance effect on software defined storage.

What will be used to test

Key IT Aspects	Description
VMware vSphere ESXi servers	8 x Dell R640 ESXi servers run the virtual lab environment and the software defined storage.
HCIBench test machines	80 x Linux Photon 1.0 virtual machines.
vSAN storage	Virtual datastore combining all 8 ESXi server local NVMe disks. The datastore uses RAID (Redundant array of inexpensive disks), a technique combining multiple disks together for data redundancy and performance.
Key Encryption Management Servers	Clustered and load balanced Thales key management servers for encryption key management.
Encryption Software	VM encryption and vSAN encryption
Benchmarking software	HCIBench v2.3.5 and Oracle Vdbench

Test lab hardware

8 servers

Architecture	Details
Server Model	Dell R640 1U rackmount
CPU Model	Intel Xeon Gold 6148
CPU count	2
Core count	20 per CPU
Processor AES-NI	Enabled in the BIOS
RAM	768GB (12 x 64GB LRDIMM)
NIC	Mellanox ConnectX-4 Lx Dual Port 25GbE rNDC
O/S Disk	1 x 240GB Solid State SATADOM
vSAN Data Disk	3 x 4TB U2 Intel P4510 NVMe
vSAN Cache Disk	1 x 350GB Intel Optane P4800X NVMe
Physical switch	Cisco Nexus N9K-C93180YC-EX
Physical switch ports	48 x 25GbE and 4 x 40GbE
Virtual switch type	VMware Virtual Distributed Switch
Virtual switch port types	Elastic

HCIBench Test VMs

The specification of the 80 HCIBench Test VMs are as follows.

Resources	Details
CPU	4
RAM	8GB
O/S VMDK primary disk	16GB
Data VMDK disk	20GB
Network	25Gb/s

HCIBench Performance Metrics

Workload Parameter	Explanation	Value
IOPs	IOPS measures the number of read and write operations per second	Input/Outputs per second
Throughput	Throughput measures the number of bits read or written per second Average IO size x IOPS = Throughput in MB/s	MB/s
Read Latency	Latency is the response time when you send a small I/O to a storage device. If the I/O is a data read, latency is the time it takes for the data to come back	ms
Write Latency	Latency is the response time when you send a small I/O to a storage device. If the I/O is a write, latency is the time for the write acknowledgement to return.	ms
Latency Standard Deviation	Standard deviation is a measure of the amount of variation within a set of values. A low standard deviation indicates that the values tend to be close to the mean of the set, while a high standard deviation indicates that the values are spread out over a wider range	Values must be compared to the standard deviation
Average ESXi CPU usage	Average ESXi Host CPU usage	%
Average vSAN CPU usage	Average CPU use for vSAN traffic only	%

HCIBench Test Parameter Options

The HCIBench performance options allow you to set the block size and the types of read/write ratios. In these tests, I will be using the following block sizes to give a representation of the different types of applications you can see on corporate systems

4k
16k
64k
128k

In these tests I will be using the following Read/Write ratios to also give a representation of the different types of applications you can see on corporate systems

0% Read 100% Write
20% Read 80% Write
70% Read 30% Write

RAID Configuration

VM encryption will be tested on RAID1 and RAID6 vSAN storage
vSAN encryption will be tested on RAID1 and RAID6 vSAN storage

Note: vSAN encryption is not configured at all in the policy for vSAN encryption as this is turned on at the datastore level but we still need a generic RAID1 and RAID6 storage policy.

VM encryption RAID1 storage policy

Test Parameters	Configuration
vCenter Storage Policy	Name = raid1_vsan_policy Storage Type = vSAN Failures to tolerate = 1 (RAID 1) Thin provisioned = Yes Number of disk stripes per object = 1 Encryption enabled = Yes Deduplication and Compression enabled = No

VM encryption RAID6 storage policy

Test Parameters	Configuration
vCenter Storage Policy	Name = raid6_vsan_policy Storage Type = vSAN Failures to tolerate = 2 (RAID6) Thin provisioned = Yes Number of disk stripes per object = 1 Encryption enabled = Yes Deduplication and Compression enabled = No

vSAN encryption RAID1 storage policy

Test Parameters	Configuration
vCenter Storage Policy	Name = raid1_vsan_policy Storage Type = vSAN Failures to tolerate = 1 (RAID 1) Thin provisioned = Yes Number of disk stripes per object = 1 Deduplication and Compression enabled = No

vSAN encryption RAID6 storage policy

Test Parameters	Configuration
vCenter Storage Policy	Name = raid6_vsan_policy Storage Type = vSAN Failures to tolerate = 2 (RAID6) Thin provisioned = Yes Number of disk stripes per object = 1 Deduplication and Compression enabled = No

Test Plans

The table below shows one individual test plan I have created. This plan is replicated for each of the tests listed below. Scroll across at the bottom to see the whole table.

RAID1 Baseline
RAID1 VM Encryption
RAID1 vSAN Encryption
RAID6 Baseline
RAID6 VM Encryption
RAID6 vSAN Encryption

The tests were run for 3 hours each including a warm up and warm down period.

Test	Number of disks	Working Set %	Number of threads	Block size (k)	Read %	Write %	Random %	Test time (s)
1	2 (O/S and Data)	100%	1	4k	0	100	100	7200
2	2 (O/S and Data)	100%	2	4k	0	100	100	7200
3	2 (O/S and Data)	100%	1	4k	20	80	100	7200
4	2 (O/S and Data)	100%	2	4k	20	80	100	7200
5	2 (O/S and Data)	100%	1	4k	70	30	100	7200
6	2 (O/S and Data)	100%	2	4k	70	30	100	7200
7	2 (O/S and Data)	100%	1	16k	0	100	100	7200
8	2 (O/S and Data)	100%	2	16k	0	100	100	7200
9	2 (O/S and Data)	100%	1	16k	20	80	100	7200
10	2 (O/S and Data)	100%	2	16k	20	80	100	7200
11	2 (O/S and Data)	100%	1	16k	70	30	100	7200
12	2 (O/S and Data)	100%	2	16k	70	30	100	7200
13	2 (O/S and Data)	100%	1	64k	0	100	100	7200
14	2 (O/S and Data)	100%	2	64k	0	100	100	7200
15	2 (O/S and Data)	100%	1	64k	20	80	100	7200
16	2 (O/S and Data)	100%	2	64k	20	80	100	7200
17	2 (O/S and Data)	100%	1	64k	70	30	100	7200
18	2 (O/S and Data)	100%	2	64k	70	30	100	7200
19	2 (O/S and Data)	100%	1	128k	0	100	100	7200
20	2 (O/S and Data)	100%	2	128k	0	100	100	7200
21	2 (O/S and Data)	100%	1	128k	20	80	100	7200
22	2 (O/S and Data)	100%	2	128k	20	80	100	7200
23	2 (O/S and Data)	100%	1	128k	70	30	100	7200
24	2 (O/S and Data)	100%	2	128k	70	30	100	7200

Results

Click on the graphs for a larger view

IOPS comparison for all RAID1 and RAID6 tests

IOPS measures the number of read and write operations per second. The pattern for the 3 different tests is consistent where the heavier write tests show the least IOPs gradually increasing in IOPs as the writes decrease. IOPS and block size tend to have an inverse relationship. As the block size increases, it takes longer latency to read a single block, and therefore the number of IOPS decreases however, smaller block sizes yield higher IOPS.

It is clear to see from the graphs that RAID1 VM encryption and RAID1 vSAN encryption produces more IOPS for all tests than RAID6 VM encryption and RAID6 vSAN encryption. This is expected due to the increased overhead RAID6 incurs over RAID1 in general. RAID 1 results in 2 writes, one to each mirror. A RAID6 single write operation results in 3 reads and 3 writes (due to double parity)

Each write operation requires the disks to read the data, read the first parity, read the second parity, write the data, write the first parity and then finally write the second parity.

RAID1 VM encryption outperforms RAID1 vSAN encryption in terms of IOPs. The RAID6 results are interesting where at the lower block sizes, RAID6 VM encryption outperforms RAID6 vSAN encryption however at the higher block sizes, RAID6 vSAN encryption outperforms VM encryption.

In order of the highest IOPs

RAID1 VM encryption
RAID1 vSAN encryption
RAID6 VM encryption
RAID 6 vSAN encryption

Throughput comparison for all RAID1 and RAID6 tests

IOPs and throughput are closely related by the following equation.

Throughput (MB/s) = IOPS * Block size

RAID1 VM encryption has the best performance in terms of throughput against RAID1 vSAN encryption however the results are very close together.

RAID6 vSAN encryption has the best performance in terms of throughput against RAID6 VM encryption.

In order of highest throughput

RAID1 VM encryption
RAID1 vSAN encryption
RAID6 vSAN encryption
RAID6 VM encryption

Read Latency comparison for all RAID1 and RAID6 tests

The pattern is consistent between the read/write workloads. As the workload decreases, read latency decreases although the figures are generally quite close. Read latency for all tests varies between 0.40 and 1.70ms which is under a generally recommended limit of 15ms before latency starts to cause performance problems.

There are outlier values for the Read Latency across RAID1 VM Encryption and RAID1 vSAN encryption at 4K and 16K when testing 2 threads which may be something to note if applications will be used at these block sizes.

RAID1 vSAN encryption incurs a higher read latency in general than RAID1 VM encryption and RAID6 VM encryption incurs a higher read latency in general than RAID6 vSAN encryption however the figures are very close for all figures from the baseline.

RAID6 has more disks than mirrored RAID1 disks to read from than RAID1 therefore the reads are very fast which is reflected in the results. Faster reads result in lower latency.

From the lowest read latency to the highest

RAID6 vSAN encryption
RAID6 VM encryption
RAID1 VM encryption
RAID1 vSAN encryption

Write latency comparison for all RAID1 and RAID6 tests

The lowest write latency is 0.8ms and the largest is 9.38ms. Up to 20ms is the recommended value from VMware however with all flash arrays, this should be significantly lower which is what I can see from the results. With NVMe and flash disks, the faster hardware may expose bottlenecks elsewhere in hardware stack and architecture which can be compared with internal VMware host layer monitoring. Write latency can occur at several virtualization layers and filters which each cause their own latency. The layers can be seen below.

The set of tests at the 100% write/0% read and 80% write/20% read have nearly no change in the write latency but it does decrease more significantly for the 30% write/70% read test.

When split into the RAID1 VM encryption and RAID1 vSAN encryption results, RAID1 VM encryption incurs less write latency than RAID1 vSAN encryption however the values are very close.

When split into the RAID6 VM encryption and RAID6 vSAN encryption results, RAID6 VM encryption seems to perform with less write latency at the lower block sizes however performs with more write latency at the higher block sizes than RAID6 vSAN encryption.

From the lowest write latency to the highest.

RAID1 VM encryption
RAID1 vSAN encryption
RAID6 vSAN encryption
RAID6 VM encryption

Latency Standard Deviation comparison for all RAID1 and RAID6 tests

The standard deviation value in the testing results uses a 95^th percentile. This is explained below with examples.

An average latency of 2ms and a 95th percentile of 6ms means that 95% of the IO were serviced under 6ms, and that would be a good result
An average latency of 2ms and a 95th percentile latency of 200ms means 95% of the IO were serviced under 200ms (keeping in mind that some will be higher than 200ms). This means that latencies are unpredictable and some may take a long time to complete. An operation could take less than 2ms, but every once in a while, it could take well over 200
Assuming a good average latency, it is typical to see the 95th percentile latency no more than 3 times the average latency.

I analysed the results to see if the 95th percentile latency was no more than 3 times the average latency for all tests. I added new columns for multiplying the latency figures for all tests by 3 then comparing this to the standard deviation figure. The formula for these columns was =sum(<relevant_latency_column*3)

In the 80% write, 20% read test for the 64K RAID1 Baseline there was one result which was more than 3 times the average latency however not by a significant amount. In the 30% write, 70% read test for the 64K RAID6 Baseline, there were two results which were more than 3 times the average latency however not by a significant amount.

For all the RAID1 and RAID6 VM encryption and vSAN encryption tests, all standard deviation results overall were less than 3 times the average latency indicating that potentially, AES-NI may give encryption a performance enhancement which prevents significant latency deviations.

ESXi CPU usage comparison for all RAID1 and RAID6 tests

I used a percentage change formula on the ESXi CPU usage data for all tests. Percentage change differs from percent increase and percent decrease formulas because both directions of the change (Negative or positive) are seen. VMware calculated that using a percentage change formula, that VM encryption added up to 20% overhead to CPU usage (This was for an older vSphere O/S). There are no figures for vSAN encryption from VMware so I have used the same formula for all tests. I used the formula below to calculate the percentage change for all tests.

% change = 100 x (test value – baseline value)/baseline value

The lowest percentage change is -7.73% and the highest percentage change is 18.37% so the tests are all within VMware’s recommendation that encryption can add up to 20% more server CPU usage. Interestingly when the figures are negative, it shows an improvement over the baseline. This could be due to the way AES-NI boosts performance when encryption is enabled. RAID6 VM Encryption and vSAN encryption show more results which outperformed the baseline in these tests than RAID1 VM Encryption and vSAN encryption.

What is interesting about the RAID1 vSAN encryption and RAID6 vSAN encryption figures is that RAID1 vSAN encryption CPU usage goes up between 1 and 2 threads however RAID6 vSAN encryption CPU usage goes down between 1 and 2 threads.

Overall, there is a definite increase in CPU usage when VM encryption or vSAN encryption is enabled for both RAID1 and RAID6 however from looking at graphs, the impact is minimal even at the higher workloads.

RAID6 VM encryption uses less CPU at the higher block sizes than RAID6 vSAN encryption.

From the lowest ESXi CPU Usage to the highest.

RAID6 VM encryption
RAID6 vSAN encryption
RAID1 VM encryption
RAID1 vSAN encryption

vSAN CPU usage comparison for all RAID1 and RAID6 tests

For the vSAN CPU usage tests. I used a percentage change formula on the data for the vSAN CPU usage comparison tests. Percentage change differs from percent increase and percent decrease formulas because I can see both directions of the change (Negative or positive) Negative values indicate the vSAN CPU usage with encryption performed better than the baseline. VMware calculated that using a percentage change formula, that VM encryption would add up to 20% overhead. There are no figures for vSAN encryption from VMware so I have used the same formula for these tests also.

% change = 100 x (test value – baseline value)/baseline value

The lowest percentage change is -21.88% and the highest percentage change is 12.50% so the tests are all within VMware’s recommendation that encryption in general can add up to 20% more CPU usage. Interestingly when the figures are negative, it shows an improvement over the baseline. This could be due to the way AES-NI boosts performance when encryption is enabled.

RAID1 VM encryption and RAID1 vSAN encryption uses more vSAN CPU than RAID6 VM encryption and RAID6 vSAN encryption. All of the RAID6 VM encryption figures performed better than the RAID6 baseline with the majority of RAID6 vSAN encryption figures performing better than the baseline. In comparison RAID1 VM encryption and RAID1 vSAN encryption nearly always used more CPU than the RAID1 baseline.

From the lowest vSAN CPU usage to the highest.

RAID6 VM encryption
RAID6 vSAN encryption
RAID1 vSAN encryption
RAID1 VM encryption

Conclusion

The following pages provide a final conclusion on the comparison between the functionality and performance of VM Encryption and vSAN Encryption.

Functionality

The main functionality differences can be summed up as follows

VM Encryption

Storage Policy based (enable per VM)
Data travels encrypted.
No deduplication or compression.
Simple to set up with a key management server.
The DEK key is stored encrypted in the VMX file/VM advanced settings.
vSAN and VM encryption use the exact same encryption and kmip libraries but they have very different profiles. VM Encryption is a per-VM encryption.
VM Encryption utilizes the vCenter server for key management server key transfer. The hosts do not contact the key management server. vCenter only is a licensed key management client reducing license costs.
Storage agnostic

vSAN Encryption

Enabled on a virtual cluster datastore level. Encryption is happening at different places in the hypervisor’s layers.
Data travels unencrypted, but it is written encrypted to the cache layer.
Full compatibility with deduplication and compression.
More complicated to set up with a key management server as each vendor has a different way of managing the trust between the key management server the vCenter Server.
The DEK key is stored encrypted in metadata on each disk.
vSAN and VM encryption use the exact same libraries but they have very different profiles.
VM Encryption utilizes the vCenter server for key management server key transfer. The hosts do not contact the key management server. vCenter only is a licensed key management client reducing license costs.
vSAN only, no other storage is able to be used for vSAN encryption.

Functionality conclusion

VM encryption and vSAN encryption are similar in some functionality. Both use a KMS server, both support RAID1, RAID5 and RAID6 encryption and both use the same encryption libraries and the kmip protocol. However, there are some fundamental differences. VM encryption gives the flexibility of encrypting individual virtual machines on a datastore opposed to encrypting a complete datastore with vSAN encryption where all VMs will automatically be encrypted. Both solutions provide data at rest encryption but only VM encryption provides end to end encryption as it writes an encrypted data stream whereas vSAN encryption receives an unencrypted data stream and encrypts it during the write process. Due to this level at which data is encrypted at, VM encryption cannot be used with features such as deduplication and compression however vSAN encryption can. It depends if this functionality is required and if the space which could be saved was significant. VM encryption is datastore independent and can use vSAN, NAS, FC and iSCSi datastores. vSAN encryption can only be used on virtual machines on a vSAN datastore. Choosing the encryption depends on whether different types of storage reside in the environment and whether they require encryption.

The choice between VM encryption functionality and vSAN encryption functionality will be on a use case dependency of whether individual virtual machine encryption control is required and/or whether there is other storage in an organization targeted for encryption. If this is the case, VM encryption will be best. If these factors are not required and deduplication and compression are required, then vSAN encryption is recommended.

Performance conclusion

The performance tests were designed to get an overall view from a low workload test of 30% Write, 70% Read through a series of increasing workload tests of 80% Write, 20% Read and 100% Write, 0% Read simulation. These tests used different block sizes to simulate different application block sizes. Testing was carried out on an all flash RAID1 and RAID6 vSAN datastore to compare the performance for VM encryption and vSAN encryption. The environment was set up to vendor best practice across vSphere ESXi, vSAN, vCenter and the Dell server configuration.

It can be seen in all these tests that performance is affected by the below factors.

Block size.
Workload ratios.
RAID level.
Threads used
Application configuration settings.
Access pattern of the application.

The table below shows a breakdown of the performance but in some cases the results are very close

Metric	1st	2nd	3rd	4th
IOPs	RAID1 VM encryption	RAID1 vSAN encryption	RAID6 VM encryption	RAID6 vSAN encryption
Throughput	RAID1 VM encryption	RAID1 vSAN encryption	RAID6 vSAN encryption	RAID6 VM encryption
Read Latency	RAID6 vSAN encryption	RAID6 VM encryption	RAID1 VM encryption	RAID1 vSAN encryption
Write Latency	RAID1 VM encryption	RAID1 vSAN encryption	RAID6 vSAN encryption	RAID6 VM encryption
Standard Dev	All standard deviation results were less than 3 times the average latency which is recommended with minor outliers	All standard deviation results were less than 3 times the average latency which is recommended with minor outliers	All standard deviation results were less than 3 times the average latency which is recommended with minor outliers	All standard deviation results were less than 3 times the average latency which is recommended with minor outliers
ESXi CPU Usage	RAID6 VM encryption	RAID6 vSAN encryption	RAID1 VM encryption	RAID1 vSAN encryption
vSAN CPU Usage	RAID6 VM encryption	RAID6 vSAN encryption	RAID1 vSAN encryption	RAID1 VM encryption

In terms of IOPs, RAID1 VM encryption produces the highest IOPS for all tests. This is expected due to the increased overhead RAID6 incurs over RAID1 in general. RAID 1 results in 2 writes, one to each mirror. A RAID6 single write operation results in 3 reads and 3 writes (due to double parity) causing more latency decreasing the IOPs.

In terms of throughput, RAID1 VM encryption produces the highest throughput for all tests. It is expected that by producing the highest IOPs in the majority of tests would mean it would produce a similar result for the throughput. Depending on whether your environment needs larger IOPs or larger throughput depends on the block sizing. Larger block sizes produce the best throughput due to getting more data through the system in bigger blocks. As the block size increases, it takes longer latency to read a single block, and therefore the number of IOPS decreases however, smaller block sizes yield higher IOPS.

In terms of read latency, RAID6 vSAN encryption performed best in the read latency tests. Read latency for all tests varies between 0.40 and 1.70ms which is under a generally recommended limit of 15ms before latency starts to cause performance problems. RAID6 has more disks than mirrored RAID1 disks to read from than RAID1 therefore the reads are very fast which is reflected in the results. Faster reads result in lower latency. The values overall were very close.

In terms of write latency, RAID1 VM encryption performed best. All the RAID6 results incurred more write latency than the RAID1 results which was to be expected. Each RAID6 write operation requires the disks to read the data, read the first parity, read the second parity, write the data, write the first parity and then finally write the second parity producing a heavy write penalty and therefore more latency. The lowest write latency is 0.8ms and the largest is 9.38ms. Up to 20ms is the recommended value therefore all tests were well within acceptable limits.

The performance of encrypted data also seems to be enhanced by the use of newer flash disks like SSDs and NVME showing latency figures which were within the acceptable values. SSD and NVMe uses a streamlined lightweight protocol compared to SAS, SCSI and AHC protocols while also reducing CPU cycles.

In terms of standard deviation, all standard deviation test results were less than 3 times the average latency which is recommended.

In terms of average ESXi CPU and vSAN CPU usage, RAID6 VM encryption produced the lowest increase in CPU. All encryption appeared to be enhanced by leveraging the AES-NI instructions in Intel and AMD CPU’s. The increase in CPU usage by the hosts and vSAN compared to the baseline for both sets of encryption tests is minimal and within acceptable margins by a considerable amount. In some cases, there was lower CPU use than the baseline possibly due to the AES-NI offload.

Encryption recommendation

Overall RAID1 VM encryption produces the best IOPs, throughput and write latency including the standard deviation metric values for latency being well under the acceptable limits. RAID1 ESXi CPU usage and vSAN CPU usage is higher than RAID6 however the difference is minimal when looking at the graphs especially in some cases where both sets of tests can outperform the baseline across the different block sizes. For applications which need very fast read performance, RAID6 will always be the best option due to having more disks than mirrored RAID1 disks to read from therefore this encryption should be matched to a specific application requirement if reads are a priority.

Reference

(Townsendsecurity, 2019) The Definitive Guide to VMware Encryption and Key Management [Online]. Available at https://info.townsendsecurity.com/vmware-encryption-key-management-definitive-guide (Accessed 19 February 2020)

vSAN Trim/Unmap functionality

November 24, 2020 IT No comments

What is vSAN Trim/Unmap functionality?

This is an interesting feature of vSAN which came up in work recently. vSAN supports thin provisioning which lets you use as much capacity as currently needed where you can add more space in the future. One challenge to thin provisioning is that the VMDKs will not shrink when files within the guest O/S are deleted. An even bigger problem develops where many file systems will always direct new writes into free space rather than the old used space. Previous solutions to this involved manual intervention and storage vMotion to external storage or powering off an external machine. vSAN Trim/Unmap space reclamation solves this problem.

How does it work?

Modern guest O/S file systems have had the ability to reclaim no longer used space which are known as Trim/Unmap commands for the ATA and SCSI protocols. vSAN 6.7U1+ now has full awareness of Trim/Unmap commands sent from the guest O/S and can reclaim previously allocated storage as free space.

Benefits

Faster repair means that blocks which have been reclaimed do not need to be rebalanced or remirrored in the event of a device failure
Removal of dirty cache pages means that read cache can be freed up in the DRAM client cache as well as the hybrid vSAN SSD cache for use by other blocks. If removed from the write buffer then this reduces the number of blocks copied to the capacity tier.

Performance Impact

It does carry some performance impact as I/O must be processed to track pages which are no longer needed. The largest impact will be the UNMAPs issued against the capacity tier. vSAN 7U1 includes performance enhancements which help provide the fairness of UNMAPs in heavy write environments.

How is it enabled?

You can use either of the 2 CLI tools below.

RVC
PowerCLI

RVC

Enable = vsan.unmap_support <cluster> -e

Disable = vsan.unmap_support <cluster> -d

Powercycle the VMs

PowerCLI

Enable = Get-Cluster -name <cluster>|set-VsanClusterConfiguration–GuestTrimUnmap:$true

Disable = Get-Cluster -name <cluster>|set-VsanClusterConfiguration–GuestTrimUnmap:$false

Status = Get-Cluster -name <cluster>|get-VsanClusterConfiguration |ft GuestTrimUnmap

Powercycle the VMs

Powercycle command

You can run a command which will force a powercycle when the VM is next rebooted.

Get-Folder <foldername> | Get-VM | New-AdvancedSetting -Name vmx.reboot.PowerCycle -Value TRUE -Confirm:$false

Requirements

A minimum of virtual machine hardware version 11 for Windows
A minimum of virtual machine hardware version 13 for Linux.
disk.scsiUnmapAllowed flag is not set to false. The default is an implied true. This setting can be used as a “kill switch” at the virtual machine level should you wish to disable this behaviour on a per VM basis and do not want to use in guest configuration to disable this behaviour. VMX changes require a reboot to take effect.
The guest operating system must be able to identify the virtual disk as thin.
After enabling at a cluster level, virtual machines must be power cycled

Monitoring TRIM/UNMAP

TRIM/UNMAP uses the following counters in the vSAN performance service for the hosts as seen in the figure below courtesy of VMware.

UNMAP Throughput – The measure of UNMAP commands being processed by the disk groups of a host.
Recovery UNMAP Throughput – The measure of throughput of UNMAP commands be synchronized as part of an object repair following a failure or absent object.

Using the advanced performance counters for a host will also show you the below counters

AES and AES-NI

November 12, 2020 IT No comments

What is AES?

The Advanced Encryption Standard Instruction Set and the Intel Advanced Encryption Standard New Instructions allows specific Intel/AMD and other CPUs to do extremely fast hardware encryption and decryption. AES (Advanced Encryption Standard), is a symmetric block cipher which means that blocks of text which have a size of 128 bits are encrypted, which is the opposite to a stream cipher where each character is encrypted one at a time. The algorithm takes a block of plain text and applies alternating rounds of substitution and permutation boxes to it which are separate stages. In AES, the size of each box is 128, 192 or 256 bits, depending on the strength of the encryption with 10 rounds applied for a 128-bit key, 12 rounds for the 192-bit key, and 14 rounds for the 256-bit key, providing higher security.

The figure below shows that potential key combinations exponentially increase with the key size. AES-256 is impossible to break by a brute force attack based on current computing power, making it the strongest encryption standard. However longer key and more rounds requires higher performance requirements. AES 256 uses 40% more system resources than AES 192, and is therefore best suited to high sensitivity environments where security is more important than speed.

AES Block Cipher Modes

There are different AES block cipher modes that are part of AES.

Electronic Code Book

The simplest block cipher mode is Electronic Code Book. This cipher mode just repeats the AES encryption process for each 128-bit block of data. Each block is independently encrypted using AES with the same encryption key. For decryption, the process is reversed. With ECB, identical blocks of unencrypted data, referred to as plain text, are encrypted the same way and will produce identical blocks of encrypted data. This cipher mode is not ideal since it does not hide data patterns well.

Cipher Block Chaining

A newer block cipher mode was created called Cipher Block Chaining. CBC’s aim is to achieve an encryption method that encrypts each block using the same encryption key producing different cipher text, even when the plain text for two or more blocks is identical. Cipher Block Chaining addresses security weaknesses with ECB.

AES-XTS Block Cipher mode

AES-XTS Block Cipher Mode is a new block cipher mode and designed to be stronger than other modes. It eliminates potential vulnerabilities from sophisticated side channel attacks used to exploit weaknesses within other modes. XTS uses two AES keys. One key performs the AES block encryption; the other is used to encrypt what is known as a Tweak Value. This encrypted tweak is further modified with a Galois polynomial function (GF) and XOR with both the plain text and the cipher text of each block. The GF function ensures that blocks of identical data will not produce identical cipher text. This achieves the goal of each block producing unique cipher text given identical plain text without the use of initialization vectors and chaining. Decryption of the data is carried out by reversing this process.

What is AES-NI?

Intel AES New Instructions (Intel AES-NI) is a new encryption instruction set which contains improvements to the AES algorithm and accelerates the encryption of data in the Intel Xeon processor family and the Intel Core processor suite. AES is a symmetric block cipher that encrypts/decrypts data through several rounds. It is part of the FIPS standard.

There are seven new instructions. The instructions have been implemented to perform some of the complex and performance intensive steps of the AES algorithm. Hardware is used to accelerate the AES algorithms. Intel say that AES-NI can be used to accelerate the performance of an implementation of AES by 3 to 10x over a total software implementation.

How does it work?

A fixed block size of plain text is encrypted several times to produce a final encrypted output. The number of rounds (10, 12, or 14) used depends on the key length (128, 192, or 256). Each round feeds into the following round. Each round is encrypted using a subkey that is generated using a key schedule

What are the six new instructions?

The new instructions perform several computationally intensive parts of the AES algorithm using fewer clock cycles than a software solution.

Four of the new instructions accelerate the encryption/decryption of a round
Two new instructions are for round key generation.

Improved security

The new instructions also improve security by preventing side channel attacks on AES. Encryption and decryption are performed completely in hardware without the need for software lookup tables. By running in data-independent time and not using tables, they help in eliminating the major timing and cache-based attacks that target table-based software implementations of AES. In addition, AES is simple to implement, with reduced code size, which helps reducing the risk of introducing security flaws, such as difficult-to-detect side channel leaks.

Most of the cloud providers such as Amazon, Google, IBM, Microsoft offer instances equipped with this Intel extension and use it as security feature in their products. AES can be used in applications where confidentiality and integrity is of highest priority. If cryptographic strength is a major factor in the application, AES is the best suited algorithm.

Kubernetes components

October 2, 2020 Kubernetes No comments

Kubenetes components

As seen in the diagram below, every Kubernetes cluster will have one or more control plane nodes and one more worker nodes. The Control Plane manages the worker nodes and the Pods in the cluster. The worker nodes host the pods that are the components of the application workload. There are no Cloud Providers if this is running on bare metal.

Components running on the control plane node include

etcd is the persistent datastore for Kubenetes which stores the cluster state.
kube-api-server is the front end for the Kubernetes control plane which exposes the Kubernetes API and the only component which accesses the etcd.
kube-scheduler assigns workloads to the worker nodes and decides which nodes pods will be run on.
The kube-controller manager runs a collection of control processes to manage various resources. It monitors when nodes go down, maintains the correct number of pods, joins services and pods and creates default accounts and API access tokens for new namespaces.
The cloud controller manager runs controllers which provision underlying infrastructure needed by workloads. It has a control loop to manage storage volumes if a workload needs persistent storage.

Components run on a worker node

Kubelet – Primary node agent which is responsible for spinning up containerized workloads that are assigned to its node.
Kube Proxy – Used for implementing Kubenetes services such as network components which connect workloads in the cluster.
A container runtime such as Docker

etcd

etcd is the database for Kubernetes. It is a distributed key value store. etcd clusters can be 3 or 5 nodes and each node has a copy of the datastore providing fault tolerance.

To maintain consensus, it uses an algorithm called Raft. The nodes can connect to each other on port 2380. To establish consensus they must maintain quorum which requires more than half the nodes in the cluster to be available. If Quorum is lost, the cluster cannot reach consensus and cannot process changes.

3 nodes can tolerate the loss of 1 node.

5 nodes can tolerate the loss of 2 nodes.

The diagram below shows the etcd members in their own dedicated cluster. The etcd client which is the Kubenetes API server connects to any of the members on port 2380.

Alternatively the etcd members can be located with the control plane components on the same machine. The location will depend on cost, performance and capacity. It is not recommended to share the etcd cluster for the Kubernetes cluster with other applications and worth dedicating an etcd installation to the Kubernetes cluster.

Using the Kubernetes command-line tool to find information

If we run the command get-nodes, we can see the 3 control nodes

If we run the command get pods -n kube-system , then we can see the 3 etcd pods that are running on the control plane nodes in a co-located config.

Kubenetes API Server

The API server is where all control plane operations are exposed to the API. We use a tool called kubectl which translates commands into http rest style API calls.

Custom resource definitions

Kubernetes is extensible via custom resource definitions. CRDs are used to create our own API types.

Once we create a CRD in Kubernetes we can use it like any other native Kubernetes object thus leveraging all the features of Kubernetes

Kubenetes API resources

All components communicate with the API server. The API server’s REST endpoint implements the open API specification. Objects created in the API are implementations of Kubernetes resources

Resources include Pods, Services and Namespaces. Each resource contains a spec which defines the desired state of the resource and a status which includes the current state of the object in the system.

Resources can come under a cluster or namespace scope depending on their implementation. You can see below how some resources fit into each category.

API versioning

There are several meanings in Kubernetes

Alpha level

Could contain bugs
May be disabled by default
Lack of support

Beta Level

Tested
Enabled by default
Supported for a length of time to enable adoption and use
Details may change

GA Level

Stable
Will be available through several versions
Details are set

Authentication and Authorization

kubectl is used by parsing a local configuration file containing authentication data and data about the request and posting that json data to the API endpoint. The API server then answers requests from the controllers.

Firstly, the API server needs to authenticate that you are allowed to make a request using different methods of configuration authentication. Kubernetes doesn’t have a concept of a user object and doesn’t store user details. It uses authenticators for this task which are configured by the administrators.

For authorization, the API leverages authorizers and authorization policies. To view these, you can type kubectl auth can-i –list to list the resources, the resource URLs and the verbs you can use within a cluster.

Admission control comes after authorization. we can then validate or mutate the request. Validation just checks the validation logic and makes sure it is correct. Mutate will looks at an object and potentially change that object.

The API server then does some spec validations. These are validation routines and check that everything in your spec is correct and notify you on typos and format errors.

Scheduler

The scheduler’s job is to assign pods to nodes. When you create a pod request, you will provide the pods name and image that it will use. You don’t have to define a node for the pod but the option is there. The scheduler watches for new pod resources to be created. This watch functionality is exposed by the api servers to the controllers and the scheduler uses it to watch pods. When the scheduler finds one that doesn’t have a node name field set, it determines where the pod should run and updates the pod resource in the node name field to add a value to the field. The kubelet on the assigned node will then change the current state to the defined desired state.

The scheduler goes through a process of filtering and scoring stages. Filtering comes first and filters out any nodes which cannot host a pod. There may be something called a taint on the node which describes something a node cannot tolerate and if this isn’t written in the manifest then the node cannot host the pod. However, if the information is written in the manifest, then it will be able to.

A pod may request a certain amount of RAM and CPU or have a requirement for a GPU for example.

Once the filtering is complete, it moves on to the scoring stage. Scoring the candidates means finding the best host for scheduling the pod. Some pods will have an affinity section which sets a preference for scheduling a pod in a certain zone. Another scoring factor could be whether the node has the container image being used by the pod. Also lower workload utilisation on a node may give a preference.

Customised scheduling – Policies and Profiles

You can configure the behaviour of the default scheduler using policies and profiles using predicates (used for filtering) and priorities (used for scoring)

You can also build your own scheduler with custom scheduling logic instead of the existing scheduler.

Running more than one scheduler

The scheduler should be run in a highly available configuration at all times however, only one scheduler is active at any one time.

The first scheduler will acquire a leader lease using an endpoint by default to record a leader lease. The other schedulers will be online and fail to acquire the leader lease. They will periodically it check the lease is current for the active leader and will succeed if the leader is unavailable

The Kube Controller Manager

The Kube Controller Manager runs the core control loops for the Kubernetes control plane. There are many different controllers. Several are responsible for maintaining the desired state of common resources in a Kubernetes cluster. Each controller has a specific set of functionality which depend on the resource they manage.

A control loop is a non terminating loop which regulates the state of the system. In Kubernetes, a controller is a control loop that watches the shared state of the cluster through the API server and makes changes to move the current state towards the desired state.

A controller is responsible for managing a resource and it will have a watch on the resource kind for which it is responsible. The watch is a continuous connection with the kube-api server where notification of changes is sent to the controller. It will then work on changing the existing state to the desired state. It will try to continue trying if it can’t finish the first time.

If a replica set is created then we are duplicating pods. If we want 4 replicas as our desired state. the scheduler will start assigning these replica sets to nodes. A deployment controller will create these replica sets.

Like the scheduler, when used in a highly available configuration, the controller manager uses a leader election to ensure only one instance is actively managing resources in a cluster at a time.

Cloud Controller Manager

The cloud controller manager is similar to the kube controller manager. It is a collection of controllers with the same principles around control loops and leader elections. The Cloud Controller Manager will not be found in every cluster if you’re running on bare metal. The cloud controller manager lets you link your cluster into your cloud provider’s API and separates out the components that interact with that cloud platform from components that just interact with your cluster.

Controllers inside the Cloud controller manager

Node Controller: The node controller is responsible for creating node objects when new servers are created in your cloud infrastructure. The node controller obtains information about the hosts running inside your tenancy with the cloud provider.
Route controller: The route controller is responsible for configuring routes in the cloud correctly so that containers on different nodes in your Kubernetes cluster can communicate.
Service Controller: Services integrate with cloud infrastructure components such as managed load balancers, IP addresses, network packet filtering, and target health checking. The service controller interacts with your cloud provider’s APIs to set up load balancers and other infrastructure components.

Examples

If you have a workload that you would like to expose to requests from outside, one way to do this is to put it behind a Kubernetes service such as a load balancer. If you have a cloud controller manager set up, it will configure a load balancer through your cloud providers API and configure it to route traffic to the pods for your workload.

Another example could be a workload which requires persistent storage. You can set up storage classes which leverage a provisioner from a cloud provider. This allows you to provision backing storage volumes for your workload on demand for example by referencing a storage class from a Kubernetes stateful set. The cloud controller manager will use the cloud providers API to provision the storage volumes when needed so they can be mounted in a workloads pod.

Kubelet

The kubelet is the primary Kubernetes node agent. It runs on every node in the cluster. It’s responsible for running the containers for the pods which are scheduled to its node. The kubelet for each node keeps a watch on pod resources in the api server.

The kubelet is another Kubernetes controller which provides an interface between the Kubernetes control plane and the container runtime on each server in the cluster.

Whenever the scheduler assigns a pod to a node in the api server, the kubelet for that node reads the pod spec then instructs the container runtime to spin up the container to build that spec. The container runtime then downloads the container images if they’re not there, then starts the container. The kubelet instructs the container runtime using the container runtime interface or CRI.

The kubelet is the only Kubernetes component that does not run in a container. The kubelet along with the container runtime are installed and run directly on the machine that is the node in the cluster.

The other components are typically run in containers as kubernets pods but this is the general convention below

The Kubelet gets notifications from the api server on what pods to run. The api server and other control plane components get created by using static pod manifests. When you start the kubelet you can set a path to the directory or file that contains these static pod manifests. The kublelet tells the container runtime to spin up the containers for the pod manifests and monitor them for changes. It can also make http requests to remote endpoints or listen for http connections to get to pod manifests but the most common method is to do static pod manifests on a local file system seen below.

Kube Proxy

Similar to the kubelet the kube proxy runs on every node in the cluster. Unlike the kubelet the kube proxy runs in a Kubernetes pod

kube-proxy enables essential functionality for Kubernetes services. If services didn’t exist then when a client application needed to connect to server pods in a cluster then it would need to use the pods IP to connect to the pod on the IP network. It would need to retrieve and maintain a list of all the pod addresses which is unnecessary work. In Kubernetes it is likely that pods will be created and destroyed frequently so we need a better way to manage it. The service provides a stable IP address seen here as 10.10.10.1.

A controller keeps track of the pods associated with the service and adds and removes them from the back end pool as needed. The client just needs the address of the service and the rest is taken care of by the end point resource which is usually created on your behalf by a controller. If you create a service with a sector which references a label applied to the pods, the endpoints resource is created for you. There the addresses for the back end pods are maintained. If the pool of pods changes, the endpoints will be updated and the client requests routed appropriately. It looks like this service is a proxy, that load balances requests to the back end but in Kubernetes applications, it works slightly differently

We have a end point controller in the kube controller manager that manages the end point resources. It manages the associations between services and pods. Each node in the cluster has the kube proxy running and watching service and end point service resources. When a change occurs which needs updating, kube proxy updates rules in ip tables which is a network packet filtering utility which allows network rules to be set in the network stack of the Linux kernel. kube proxy offers alternatives to using ip tables but this is the most common option.

Now when a client pod sends a request to the services IP, it gets routed by the kernel to one of the pod IPs depending on the rule which has been set by kube proxy. When using IP tables the pod selected by the pool will be random. For more control you would have to use IPVS (IP Virtual Server) which implements layer 4 load balancing in the Linux kernel.

The services IP is a virtual IP and you won’t get a response if you ping it. It is essentially a key in the rule set by IP tables which gives network packet routing instructions to the host’s kernel. The client pod can use the service IP like it normally would as like it was calling an actual pod IP.

Sample manifests

The below screenprints shows a sample manifest for a deployment and service resources

The deployment manifest shows the name of test-deployment with 3 replica pods in the deployment. The selector indicates that this deployment will manage pods with the label app: test and in the template we give that label to the pods. Each pod will consist of a single container with the same sample-container and run the nginx container image and listen on port 8080. It will also spin up a load balancer with the cloud provider to expose it to requests from outside the cluster.

kubectl appy -f deployment.yaml -f service.yaml

This command above will translate the command into a REST API call to the Kubernetes API server. The API server will authenticate and authorize the user and then apply any admission control operations such as pod security policies. if it fails the resource will not be created

Once this is complete the various controllers in the system are notified by the watch mechanism and work to change the existing state to the desired state.

The Deployment controller creates the corresponding replica set with the 3 replicas which were defined in the manifest
The Replication controller is notified of the new replica sets and in response creates the 3 separate pod resources using the pod template
The Endpoints controller will create the endpoints resource which connects the individual pods to the service by the pods label

Another controller which is notified in response to the resources being created is the service controller. The previous controllers are part of the core Kubernetes controllers in the kube controller manager. This controller is part of the cloud controller manager which is responsible for integrations with the underlying cloud providing infrastructure. The service controller is notified by its watch when the services resource is created. It notices that the spec includes a load balancer and responds by calling the cloud providers api to have a load balancer provisioned to route traffic to the cluster and associated pods.

Next the scheduler is watching for new pods and when the replication controller creates them, the scheduler is notified and responds by finding worker nodes for the containers to run on to fulfil the pod spec. Once the assignments have been made, the next controller which is the kubelet instructs the local container runtime to create the requested container image from the nginx deploy spec manifest.

Now the containers are up and running we have to get network access to these containers and this is what kube proxy is used for. It watches the endpoints resource which connects the service to the pod and updates IP tables on it nodes to ensure that traffic sent to the services ip gets routed to one of the pod IPs. This includes any client request from outside the cluster, inside the cluster and through the cloud providers load balancer.

vCloud Foundation Lab Constructor 4.0.1 Install

August 3, 2020 IT No comments

What is vCloud Foundation in general?

VMware Cloud Foundation provides a software-defined stack including VMware vSphere with Kubernetes, VMware vSAN, VMware NSX-T Data Center, and VMware vRealize Suite, VMware Cloud Foundation provides a complete set of software-defined services for compute, storage, network security, Kubernetes management, and cloud management.

What is vCloud Foundation Lab Constructor?

VLC is an automated tool built by Ben Sier and Heath Johnson that deploys an entire nested Cloud Foundation environment onto a single physical host or vSphere Cluster. It is an unsupported tool, this will allow you to learn about VCF with a greatly reduced set of resource requirements. VLC deploys the Core SDDC components in the smallest possible form factor. Specifically, components like the vCenter and vRealize Log Insight nodes are deployed in the tiny, and xsmall format as specified in a JSON config file. With these two stages, the reduction of physical resources needed to deploy the VLC nested lab components becomes possible on a single physical host with 12 CPU Cores, 128 GB RAM, and 2 TB of SSD Disk.

An overall view of what VLC looks like

Download VLC

You will need to register at http://tiny.cc/getVLC and then you will be provided with a zip file.

Support channel

Support for VLC on Slack vlc-support.slack.com

Useful links

https://blogs.vmware.com/cloud-foundation/2020/01/31/deep-dive-into-vmware-cloud-foundation-part-1-building-a-nested-lab/

https://blogs.vmware.com/cloud-foundation/2020/02/06/deep-dive-into-vmware-cloud-foundation-part-2-nested-lab-deployment/

Sofware involved

The below software is used in VLC

VMware Cloud Builder OVA (Contains the S/W below)
VCF-SDDC-Manager-Appliance OVA
VMware NSX Manager OVA
VMware vCenter appliance
VMware ESXi (Gets extracted from CloudBuilder)
VMware vRealize Log Insight OVA

Where to get the software?

VMware Cloud Foundation software is only available a few ways today. Here are a few methods to get started with

VMUG Advantage Eval Experience – VCF is available for VMUG Advantage Subscribers
vExpert Program – If you are a vExpert you can log in and download the software for free however there is not an NSX-T license available here, only in VMUG I believe.
VCF customers – Can download what you need from the My VMware Portal

Pre-requisites

Step 1

You need a single physical host running ESXi 6.7+ with 12 cores, 128 GB RAM and 800 GB SSD. This is the minimum requirement for using VLC and you will need to configure the host in 1 of 4 configurations below.

Standalone ESXi (No vCenter) using a vSS
ESXi host with vCenter using vSS
Single ESXi host in a cluster using vDS
Multiple ESXi hosts in a cluster using vDS

If you are running multiple hosts in a vSAN cluster then run the following command on all hosts because you will be in effect nesting a vSAN within a vSAN

esxcli system settings advanced set -o /VSAN/FakeSCSIReservations -i 1

If you are deploying to a single physical host don’t worry about physical VLANs as all the traffic will reside on that single physical host. If you are deploying to a vSphere cluster you’ll need at least 1 VLAN (10 is the default) physically configured and plumbed up on your physical switch to all hosts in that cluster. If you intend to do anything with NSX (AVN’s are the common thread) you’ll also need 3 additional VLANs (11-13 are default)

If in a cluster configuration, disable all HA and DRS and vMotion on the physical host(s).
You will need a virtual switch (VSS or vDS) with the MTU set to 9000
On the vSwitch, create a portgroup for VCF with VLAN Trunking (0-4094) enabled. On the portgroup (not the switch) set the following security settings:

Promiscuous Mode = Accept
Allow Forged Transmits = Accept
Allow MAC Address Changes = Accept

I chose to deploy my lab on one host with a vDS switch.

Step 2

Build a Windows-based jump host on this ESXi host as a VM and install the following software.

Windows 10/2012/2016 (Older versions are not supported)
Powershell 5.1+
PowerCLI 11.3+
OVFTool 4.3+ (64bit)
Net Framework
VMXNET3 NICs – 1500 MTU

On this jump host, attach two virtual NICs.

Attached one NIC to your local LAN Network so you can RDP to it.
Attach the second NIC to the VCF PortGroup created in Step 1 and configure it with the IP 10.0.0.220. Set the DNS on the second NIC to 10.0.0.221. The 10.0.0.221 address will be the address assigned to the Cloud Builder appliance, by default. VLC will modify the Cloud Builder appliance so that it provides specific services, like DNS, for the nested environment. Thus, this using this IP for DNS will allow you to access the nested VCF environment when using the default configuration file in Automated mode.
This second NIC will also need to be configured in the NIC properties to use the VLAN of your management network. In the default Automated VLC configuration this is VLAN 10.

The jump host should look like the below

This image has an empty alt attribute; its file name is image-7.png

On the jump host, do the following

Disable Windows Firewall.
Turn off Windows Defender Real-time Scanning. Note: this has a habit of resetting after reboots of the Windows VM.

Step 3

On the Windows jump host, create a local disk folder for VLC. This must be a local attached disk (i.e. “C:\VLC\ ”) as mapped Network drives will fail.
Download the VCF Software (Cloud Builder OVA) into this folder.
You used to have to download the vSphere ESXi ISO that matches the version required for VCF. The easiest method to do this was to simply copy the .iso file located on the Cloud Builder appliance but to make this even easier, VLC now provides an option in the setup GUI where it will download this file directly from the Cloud Builder appliance that it deploys.
Download and extract the VLC package to this folder as well
Install anything extra you need like Putty, WinSCP and Notepad++

Step 4

We now need to edit one of two files. You have a choice of Automated_AVN or Automated_No_AVN when deploying VLC.

For more information on AVN, check this blog

https://blogs.vmware.com/cloud-foundation/2020/01/14/application-virtual-networks-with-vcf/

Multiple sample bringup JSON formatted files are provided with VLC. The selection of the bringup JSON file will dictate if AVN will be implemented at bringup or not. Regardless of which bringup file is to be used, you will need to edit the file to be used in order to define the license keys to be used. The default configuration files do not include any license keys. Using a text editor, edit the appropriate file, as desired with an ESXi license, vCenter license, NSX-T license and vSAN license.

Step 5

Either open a Powershell window (as Administrator) and execute the VLC PowerShell Script “C:\VLC\VLCGUi.ps1” or right click on the VLCGUI.ps1 and select ‘Run with PowerShell’.
VLC UI will Launch

Once the above screen completes, you will see the below screen. Select the “Automated” Button. This will build your first four hosts for the Management Domain. This is done by creating four virtual nested ESXi hosts. These nested hosts are automatically sized and created for you. You are able to configure the hostnames and IP addresses to be used within the configuration file that you provide the VCF Lab Constructor

Click on the field titled ’VCF EMS JSON’ and select the JSON file that you just entered the license keys for.
Click on the CB OVA Location field to select the location of the CB OVA.
(Optional) Enter the External GW for the Cloud Builder Appliance to use. This allows you to point to a gateway that will allow internet access.

Click the Connect Button
VLC will connect to the host or vCenter you specified and will validate all necessary settings. It will then populate the Cluster, Network, and Datastore fields with information gathered from your environment.
Select the cluster, network (port group) and datastore that you desire VLC to install the nested lab to. The Cluster field will not display any value if you are deploying directly to a single ESXi host.
** If your port group does not show up, you need to check to see if the previous security settings have been set explicitly on the port group and not just the switch.
Click the yellow Validate button
As VLC validates the information, it will mark the fields in green. When everything has been validated, the Validate button will change to a green button that says ‘Construct’.
Note the Bring up box. his is a fully documented process in the installation of VCF. Using the VCF Lab Constructor will allow you to manually do this so you can follow the steps of the official VMware Documentation, or if you check the box in the GUI the VCF Lab Constructor will complete Bring-up for you automatically.
Click Construct to begin the deployment of VMware Cloud Foundation.

The process will take some time to complete. On average, expect to wait three and a half hours for the deployment process to complete.

Logging

During bringup logs can be found in the Cloudbuilder appliance in the /var/log/vmware/vcf/bringup directory – Check vcf-bringup-debug.log in that directory.
For problems deploying VC and PSC on bringup look in /var/log/vmware/vcf/bringup/ci-installer-xxxx/workflow_xxxx/vcsa-cli-installer.log
After bringup you can look at the SDDC Manager for logs. The are all rooted in the /var/log/vmware/vcf folder. Depending on what operation you are performing you can look into one of the other folders.
Domain Manager – Used when creating/deleting/expanding/shrinking new workload domains:/var/log/vmware/vcf/domainmanager/domainmanager.log
Operations Manager – Used when commissioning/decommissioning hosts and resource utilization collection:/var/log/vmware/vcf/operations/operationsmanager.log
LCM – Used for Life cycle management activities like downloading bundles, applying updates: /var/log/vmware/vcf/lcm/lcm.log

Accessing the VCF UI

To gain network access when the VCF components are installed on layer 3, your jump host will need a NIC with multiple IP addresses or you will need multiple NICs. Be aware that because everything is nested inside Layer 2 all network traffic is being broadcast back up to Layer 1 port groups. Simply having your jump host on this subnet or port group and listening on the default VCF subnet i.e. (192.168.0.0) will allow you to access everything in layer 3. The jump host can also be nested at layer 1 or a physical desktop that has access to the same subnet. Nesting it at Layer 1 has the best performance.

The below diagram courtesy of VMware shows the networks which are created

Further tasks – Expanding the number of hosts

Using the Expansion pack option will now allow you to scale out hosts

This image has an empty alt attribute; its file name is image-5-1024x542.png

When clicking on the Expansion pack option, you get the below screen

This image has an empty alt attribute; its file name is image-10-1024x544.png

When you have used the Automated method to deploy your environment. VLC has configured the Cloud Builder appliance to provide essential infrastructure services for the managment domain. Before adding additional hosts, you will need to add the appropriate DNS entries to the Cloud Builder configuration. You can use the information below or further down the post, I go through using the expansion pack option when running the VLCGui.ps1 script again and modifying some VLC files.

Adding DNS entries for extra hosts

Use SSH to connect to your Cloud Builder VM and log in using the username (admin) and the password that you specified in the VLC GUI when you deployed the environment.
You will need to edit the DNS “db” file for the zone specified. As an example, assume that the domain ‘vcf.sddc.lab’ was used during the creation of the nested environment. This would mean the zone file would be located here: /etc/maradns/db.vcf.sddc.lab
After making your changes and saving the file you will need to reload maradns and the maradns.deadwood services. MaraDNS takes care of forward lookups and Deadwood takes care of Reverse DNS.

You would follow this same procedure for adding DNS entries for vRSLCM, vROps, vRA, Horzion, or any other component. Note: Certain software (like vROps, vRA, and Horizon) are not automated in VCF 4.0 via SDDC Manager. You may need to follow the manual guidance presented in the VCF documentation to deploy these software packages.

Logging into the SDDC

From the jump host you can log into the following

Hosts = 10.0.0.100-103

vCenter IP = 10.0.0.12 (https://vcenter-mgmt.sddc.lab)

SDDC = 10.0.0.4 (https://sddc-manager.vcf.sddc.lab)

Have a click around and get familiar with the user interface

What if we want to create a workload domain?

The initial part of this script deploys the 4 node ESXi management domain so what if we want to create some more hosts for a workload domain for the products below?

K8s
Horizon
HCX
vRealize suite

Step 1

First of all we are going to use the below 3 files and add DNS entries

Open the additional_DNS_Entries.txt file and add in the new 3 hosts. In my case it looks like this.

The next file to look at is the add_3_BIG_hosts_bulk_commission VSAN.json

Next we will have a look at the add_3_BIG_hosts_bulk_commission VSAN.jso file. This is used by the vCloud Foundation software itself.

So now we need to run the VLCGui.ps1 script again located in c:\VLC to get to the point where we see the expansion pack option below.

Run Powershell and run .\VLCGui.ps1

Click on Expansion pack

Add in 10 for the main VLAN

In the Addtl Hosts JSON file box, select your add_3_BIG_hosts.json

In the ESXi ISO Location, navigate to c:\VLC\cb_ESX_iso and select the ESXi image

Next add in the host password which is VMware123

Add in the NTP IP which points to the CloudBuilder appliance on 10.0.0.221

Add in the DNS IP which points to the CloudBuilder appliance on 10.0.0.221

Add in the domain name for the lab which is vcf.sddc.lab

Next put in your vCenter IP, username and password and click Connect

When connected, choose your cluster, network and datastore like you did when configuring this for the inital management host deployment and then click Validate and everything should be green

Click Construct

You will now see in your vCenter the extra hosts being created

Once finished, you should see the below message in PowerShell. You can see it took a total of around 8 minutes.

The hosts are now ready to be commissioned into SDDC Manager so we go back to sddc-manager.vcf.sddc.lab and click on Commission hosts in the top right hand corner.

Say Yes to the entire checklist and click Proceed

Next we will use the Import button to add the additional hosts.

Choose the add_3_BIG_hosts_bulk_commission VSAN.json file

Click upload

If you have a problem where you get the below message then follow the steps below

Log into the Cloudbuilder appliance using root and VMware123! and run the below command

Press i to insert new data and add in your new hosts in the same format as the other entries

Go back to SDDC manager and try an upload again and everything should be fine.

Select all hosts, click on the tickbox on the column saying Confirm FingerPrint and click Validate

Click Next

Review

You will see a message in SDDC Manager saying the hosts are being commissioned

Once commissioned, you will see them as unassigned hosts

Following on from this, I will be following the VLC manual to enable vSphere with Kubernetes on VLC

Enabling Kubernetes on VLC

In vcenter-mgmt.vcf.sddc.lab, set DRS to Conservative on mgmt-cluster

In vcenter-mgmt.vcf.sddc.lab, set VM Monitoring to Disabled

In vcenter-mgmt.vcf.sddc.lab, remove the CPU and memory reservation on nsx-mgmt-1

Make sure you have enough licensing available and add additional licenses if required.

We can now create a VI workload domain with the 3 extra hosts we added before. In sddc-manager.vcf.sddc.lab, click on the menu and select Inventory > Workload domains and click the blue + Workload Domain button. Then select the dropdown VI – Virtual Infrastructure

Select vSAN on Storage Selection and click begin

Enter a Virtual Infrastructure Name and Organisation name

Enter a cluster name and click Next

Fill in the details of the workload domain vCenter. I have screenshotted the file additional_DNS_Entries.txt from the c:\VLC folder next to this for reference. I used the password used throughout this lab which is VMware123! to keep everything easy.

Next we need to fill in the NSX information. Again the information has come from the additional_DNS_Entries.txt from the c:\VLC folder and the password needs to be stronger so I have used VMware123!VMware123!

Leave the vSAN storage parameters as they are

On the Host selection page, select your 3 new unassigned hosts

Put in your license keys

Check the object names

Review the final configuration

You will start to see the vcenter-wld appliance being deployed

You will see the workload domain activating

When it is finally done, we should see the following

If we log out of the vCenter and back in then we will see the linked mgmt and workload vCenters under one page

Edit the cluster settings and change the migration threshold to Conservative

In the HA settings, set the VM monitoring to disabled.

Edit the settings on the nsx1-wld to change the CPU and memory reservation to 0

In the sddc-manager.vcf.sddc.lab > Workload Domains > WLD-1 – Actions – Add Edge Cluster

Select All on the Edge Cluster Prerequisites page

Put in the Edge Cluster details (I followed this from the lab guide

Select Workload domain on the specify use case

Next, add the first Edge node, once you have filled everything in, select the button to add the second edge node

Add the second Edge node and click Add Edge node

Once complete, you should see that both Edge nodes are added successfully

On the Summary page, double check all your details are correct and click Next

Validation will run

Validation should succeed

Click Finish and check out SDDC Manager where you should see a task saying Adding edge cluster

You will see the Edge servers deploying in vCenter if you check

When complete it should say succcesful

In the vCenter, edit the settings for both edge1-wld and edge2-wld to change the CPU shares to normal and the memory reservation to 0

Go to the SDDC Dashboard and select Solutions and Deploy Kubernetes Workload Kubernetes – Workload Management.

Read the Pre-requisites and select all.

Select Workload domain, then cluster01 and next

It will then go through a process of validation

Read the Review page and click Complete in vSphere

In vcenter-wld.vcf.sddc.lab > Workload Management > Select cluster01 and click Next

Select Tiny and click Next

Enter network info and click next

Select storage policy for each component

Review and Confirm

In vcenter-wld.vcf.sddc.lab, you can monitor the tasks

The below actions then take place

The deployment of 3 x Supervisor Control Plane VMs

The creation of a set of SNAT rules (Egress) in NSX-T for a whole array of K8s services

The creation of a Load Balancer (Ingress) in NSX-T for the K8s control plane

The installation of the Spherelet on the ESXi hosts so that they behave as Kubernetes worker nodes

In vCenter, we can see that Workload Management has been succesfully created

Add 2 routes to the jump host

We now need to create a content library

Select subscribed content library and click Next

Accept the certificate

Select vSAN datastore and click Next

Click Finish

Go to vCenter > Home > Workload management > Create Namespace

I created a namespace called test-namespace

Download the kubectl plugin from the CLI Tools link. If you click on Open

You will get this page. Click Download CLI plugin and unzip it. I unzipped mine to c:\VLC\vsphere-plugin

Create permissions on the name space

Set Storage Policies

Open Command prompt, Navigate to c:\VLC\vsphere-plugin\bin

Login as administrator@vsphere.local to 10.50.0.1 which is just the first IP address of the ingress CIDR block you provided which is assigned to the load balancer in NSX that then points to the supervisor cluster

This is where I need to read up more on Kubernetes and vCloud in general to do anything else! 😀

VMware Tanzu, Tanzu Mission Control and Project Pacific

July 27, 2020 IT No comments

What is VMware Tanzu?

VMware Tanzu is a portfolio of services for modernizing Kubernetes controlled container-based applications and infrastructure.

Application services: Modern application platforms
Build service: Container creation and management. Heptio, Bitnami and Pivotal come under this category. Bitnami packages and delivers 180+ Kubernetes applications, ready-to-run virtual machines and cloud images. Pivotal controls one of the most popular application frameworks, “Spring”, and offers customers the Pivotal Application Service recently announcing that PAS and its components, Pivotal Build Service and Pivotal Function Service are being evolved to run on Kubernetes.
Application catalogue: Production ready, open source containers
Data services: Cloud native data and messaging including Gemfire, RabbitMQ and SQL
Kubernetes grid. Enterprise ready runtime
Mission Control: Centralised cluster management
Observability: Modern app monitoring and analytics
Service mesh: App wide networking and control

VMware Tanzu services

What is Tanzu Mission Control?

VMware Tanzu Mission Control is a SaaS based control plane which allows customers to manage all the Kubernetes clusters, across vSphere, VMware PKS, public clouds, managed services, packaged distributions from a central single point of control and single pane of glass. This will allow applying policies for access, quotas, back-up, security and more to individual clusters or groups of clusters. It will support a wide array of operations such as life-cycle management including initial deployment, upgrade, scale and delete. This will be achieved via the open source Cluster API project.

As these environments evolve, there can be a proliferation of containers and applications so how do you keep this all under control allowing the developers to do their jobs and operations to keep the infrastructure under control to help with the following

Map enterprise identity to Kubernetes RBAC across clusters
Define policies once and push them across clusters
Manage cluster lifecycle consistently
Unified view of cluster metrics, logs and data
Cross cluster cloud data
Automated policy controlled cross cluster traffic
Monitor Kubernetes costs

What is Project Pacific?

Project Pacific is an initiative to embed Kubernetes into the control plane of vSphere for managing Kubernetes workloads on ESXi hosts. The integration of Kubernetes and vSphere will happen at the API and UI layers, but also the core virtualization layer where ESXi will run Kubernetes natively. A developer will see and utilise Project Pacific as a Kubernetes cluster and an IT admin will still see the normal vSphere infrastructure.T

The control plane will allow the deployment of

Virtual Machines and cluster of VMs
Kubernetes Clusters
Pods

This image has an empty alt attribute; its file name is image-6-1024x575.png

The Supervisor cluster

The control plane is made up of a supervisor cluster using ESXi as the worker nodes instead of Linux. This is carried out by by integrating a Spherelet directly into ESXi. The Spherelet doesn’t run in a VM, it runs directly on ESXi. This allows workloads or pods to be deployed and run natively in the hypervisor, alongside normal Virtual Machine workloads. A Supervisor Cluster can be thought of as a group of ESXi hosts running virtual machine workloads, while at the same time acting as Kubernetes worker nodes and running container workloads.

vSphere Native Pods

The supervisor cluster allows workloads or pods to be deployed. Native pods are actually containers that comply with the Kubernetes Pod specification. This functionality is provided by a new container runtime built into ESXi called CRX. CRX optimises the Linux kernel and hypervisor and removes some of the traditional heavy config of a virtual machine enabling the binary image and executable code to be quickly loaded and booted. The Spherelet ensures containers are running in pods. Pods are created on a network internal to the Kubernetes nodes. By default, pods cannot talk to each other across the cluster of nodes unless a Service is created. A Service in Kubernetes allows a group of pods to be exposed by a common IP address, helping define network routing and load balancing policies without having to understand the IP addressing of individual pods

CRX – Container runtime for ESXi

Each virtual machine has a vmm (virtual machine manager) and vmx (virtual machine executive) process that handles all of the other subprocesses to support running a VM. To implement Kubernetes, VMware introduced a new process called CRX (the container runtime executive) which manages the processes associated with a Kubernetes Pod. Each ESXi server also runs the equivalent of hostd (the ESXi scheduler) called spherelet, analogous to the kubelet in standard Kubernetes.

A CRX instance is a specific form of VM which is packaged with ESXi and provides a Linux Application Binary Interface (ABI) through a very isolated environment. VMware supply the Linux Kernel image used by CRX instances. When a CRX instance is brought up, ESXi will push the Linux image directly into the CRX instance. Since it is pretty much concentrated down from a normal VM, most of the other features have been removed and you can launch it in less than a second.

CRX instances have a CRX init process which provides the endpoint with communication with ESXi and allows the environment running inside of the CRX instance to be managed

Namespaces

A Namespace in the Kubernetes cluster includes a collection of different objects like CRX VMs or VMX VMs. Namespaces are commonly used to provide multi-tenancy across applications or users, and to manage resource quotas

Guest Kubernetes Clusters

It is important to understand that the Supervisor Cluster itself does not deliver regular Kubernetes based clusters. The supervisor Kubernetes cluster is a specific implementation of Kubernetes for vSphere which is not fully conformant with upstream Kubernetes. If you want general purpose Kubernetes workloads, you have to use Guest Clusters. Guest Clusters in vSphere use the open source Cluster API project to lifecycle manage Kubernetes clusters, which in turn uses the VM operator to manage the VMs that make up a guest.

What is Cluster API?

This is an Open source project for managing the lifecycle of a Kubernetes cluster using Kubernetes itself. You start with the management cluster which gives you an API with custom resources or operators.

« Older Entries Recent Entries »

Comparing VM Encryption performance between ESXi 6.7U3 + vSAN and ESXi 7.0U2 + vSAN

BIOS and UEFI

Using PowerCLI Image Builder CLI to build a new ESXi 7.0U1c image

Installing Linux bash shell on Windows

Comparing the functionality and performance of VM encryption and vSAN encryption on RAID1 and RAID6 vSAN storage

vSAN Trim/Unmap functionality

Monitoring TRIM/UNMAP

AES and AES-NI

Kubernetes components

vCloud Foundation Lab Constructor 4.0.1 Install

VMware Tanzu, Tanzu Mission Control and Project Pacific

Electric Monk

Don't think about what can happen in a month. Don't think what can happen in a year. Just focus on the 24 hours in front of you and do what you can to get closer to where you want to be :-)

Search

Calendar

Social Media and RSS

vExpert

Recent Posts

Archives

Categories

Fatcow Webhosting

Monitoring TRIM/UNMAP

Electric Monk

Don't think about what can happen in a month. Don't think what can happen in a year. Just focus on the 24 hours in front of you and do what you can to get closer to where you want to be :-)

Search

Calendar

Social Media and RSS

vExpert

Recent Posts

Archives

Categories

Tags

Fatcow Webhosting