Windows Virtualization Based Security

Windows Virtualization-Based Security (VBS) is a security feature in Windows that uses hardware virtualization to create and isolate a secure region of memory from the normal operating system. This secure memory region can be used to host various security solutions, providing protection from vulnerabilities and attacks that could compromise the system.

Key Components and Features of VBS:

  1. Hypervisor-Enforced Code Integrity (HVCI):
    • Ensures that only signed and verified code can execute in kernel mode.
    • Uses the hypervisor to enforce code integrity policies, preventing unsigned drivers or system files from being loaded.
  2. Credential Guard:
    • Isolates and protects credentials such as NTLM hashes and Kerberos tickets using VBS.
    • Prevents attackers from stealing credentials even if the operating system kernel is compromised.
  3. Device Guard:
    • Combines HVCI with other features to ensure that the device runs only trusted applications.
    • Includes Configurable Code Integrity (CCI) and relies on policies that define which code can be trusted.
  4. Secure Kernel Mode:
    • Runs alongside the normal Windows kernel, but is isolated from it.
    • Protects key processes and data from being tampered with or read by the normal operating system.
  5. Kernel Data Protection (KDP):
    • Prevents kernel memory from being tampered with by malicious actors.
    • Protects non-executable data in the kernel such as data structures, which are vital for the operating system’s security and stability.

How VBS Works:

  • Hardware Requirements:
    • Requires modern CPUs with virtualization extensions (such as Intel VT-x or AMD-V).
    • Requires a system firmware that supports Secure Boot and UEFI.
    • Typically requires TPM 2.0 for certain features like Credential Guard.
  • Operational Flow:
    • At system boot, the Windows hypervisor (Hyper-V) initializes and creates an isolated environment.
    • The VBS components operate within this environment, isolated from the main operating system and its potential vulnerabilities.
    • This isolation ensures that even if the main operating system is compromised, the VBS-protected components remain secure.

Benefits of VBS:

  • Enhanced Security:
    • Protects against a variety of modern threats, including malware, rootkits, and credential theft.
    • Provides a stronger security boundary than traditional software-based security measures.
  • Trustworthy Execution Environment:
    • Ensures that critical security mechanisms and sensitive data are executed and stored in a protected environment.

Use Cases:

  • Enterprise Environments:
    • Provides advanced protection mechanisms for organizations handling sensitive data and requiring stringent security measures.
    • Helps meet compliance and regulatory requirements by providing enhanced security controls.
  • Secure Workloads:
    • Ideal for protecting workloads that handle sensitive or high-value data, such as financial transactions, healthcare records, and government data.

In summary, Windows VBS leverages hardware virtualization to create a secure environment that enhances the security of the operating system, providing robust protection against a wide range of threats and vulnerabilities.

SNMP explained

What is SNMP?

SNMP (Simple Network Management Protocol) was created in 1988, based on the Simple Gateway Management Protocol (SGMP), as a short-term solution to allow devices to exchange information with each other across a network. Since then, SNMP has achieved universal acceptance and become a standard protocol for many applications and devices. It is considered “simple” because of its reliance on an unsupervised or connectionless communication link.

SNMP has a simple architecture based on a client-server model.

  • The servers, called managers, collect and process information about devices on the network.
  • The clients, called agents, are any type of device or device component connected to the network. They include not just computers but also, for example, network switches, phones and printers.

SNMP is considered “robust” because the managers are independent of the agents. Because they are typically separate devices, if an agent fails the manager continues to function, and vice versa.

SNMP is non-proprietary, fully documented, and supported by multiple vendors.

SNMP Ports

SNMP managers send requests to agents on UDP port 161, and the agents reply from that port. Traps are sent to the manager on UDP port 162.

What versions of SNMP are there?

SNMP v1

  • Advantages:
    • Few, as this is the oldest version of the protocol and offers little compared to v2c and v3
  • Disadvantages:
    • Community string sent in clear text
    • Most community strings set to “public”
    • Only supports 32-bit counters, which is very limiting for today’s networks

SNMP v2c

  • Advantages:
    • Supports 64-bit counters
    • GETBULK command added to request multiple variables from an agent
    • “INFORM” altered the way that “Traps” worked in SNMPv1 by making the manager confirm receipt of a message
    • Brought improvements in areas such as protocol packet types, MIB structure elements, and transport mappings
  • Disadvantages:
    • Still has the same security flaws as its predecessor
    • SNMPv2 introduced a new security system that, unfortunately, limited the adoption of the new protocol. SNMPv2c was developed in response, removing the new security system and reverting to the familiar community approach
    • SNMPv2c’s simple authentication system and lack of encryption make networks vulnerable to a wide range of threats

SNMP v3

  • Advantages:
    • Introduces three new elements: SNMP View, SNMP Groups, and SNMP Users. This ensures every interaction with a device on the network can be authenticated and encrypted
    • Adds authentication algorithms such as MD5 and SHA and encryption algorithms such as DES to increase security and prevent data tampering and eavesdropping
  • Disadvantages:
    • Encryption only works if authentication has been enabled
    • Multiple variables need to be configured, including usernames, passwords, authentication protocols, and privacy protocols, so misconfiguration is a serious concern
    • Not all devices are compatible yet

What layer is SNMP found at?

SNMP is an application layer (Layer 7) protocol and uses UDP as its transport.

SNMP Message Types

SNMP uses six basic messages to communicate between the manager and the agent

  • GET – The manager sends a GET message to the agent to request the value of a specific variable.
  • GET-NEXT – The SNMP manager sends this message to the agent to get information from the next OID within the MIB tree.
  • RESPONSE – The agent sends a RESPONSE to the SNMP manager when replying to a GET request. This provides the SNMP manager with the variables that were requested originally.
  • SET – A SET message allows the manager to request a change be made to a managed object. The agent will then respond with a RESPONSE message if the change has been made, or with an error saying why the change cannot be made.
  • TRAP – TRAP messages are unique because they are the only message type that is initiated by the agent without confirmation. TRAP messages are used to inform the manager when an important event happens. This makes TRAPs perfect for reporting alarms to the manager rather than waiting for a status request from the manager.
  • INFORM – Like a TRAP, an INFORM is initiated by the agent, but the SNMP manager also confirms receipt of the message.
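
As a rough illustration of the message types above, the sketch below maps each one to the command-line tool that sends it. It assumes the Net-SNMP tool suite (snmpget, snmpgetnext, snmpset, snmptrap, snmpinform); the function name snmp_tool_for is made up for this example, and other SNMP toolkits will use different command names.

```shell
#!/bin/sh
# Map an SNMP message type to the Net-SNMP tool that sends it.
# Assumption: the Net-SNMP suite is the toolkit in use.
# RESPONSE has no tool of its own: the receiving side generates it.
snmp_tool_for() {
  case "$1" in
    GET)      echo "snmpget" ;;
    GET-NEXT) echo "snmpgetnext" ;;
    SET)      echo "snmpset" ;;
    TRAP)     echo "snmptrap" ;;
    INFORM)   echo "snmpinform" ;;
    RESPONSE) echo "(generated automatically in reply)" ;;
    *)        echo "unknown message type: $1" >&2; return 1 ;;
  esac
}

# Print the full mapping, one message type per line.
for msg in GET GET-NEXT SET TRAP INFORM RESPONSE; do
  printf '%-9s %s\n' "$msg" "$(snmp_tool_for "$msg")"
done
```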

MIB

A MIB, or Management Information Base, is a formatted ASCII text file that resides within the SNMP manager and is designed to collect information and organize it into a hierarchical format. It’s essentially an agent-to-manager dictionary of the SNMP language, where every object referred to in an SNMP message is listed and explained. In order for your SNMP manager to understand a device that it’s managing, a MIB must first be loaded (“compiled”). The SNMP manager uses information from the MIB to translate and interpret messages before sending them onwards to the end user. A long numeric tag or object identifier (OID) is used to distinguish each variable uniquely in the MIB and in SNMP messages, and the objects in a MIB are organized by OID. In order to read a MIB, you need to load it into a MIB browser, which will make the OID structure visible.


Vendors will make their MIBs available for download when appliances are configured for SNMP. Cohesity is one example.

When an SNMP device sends a Trap or other message, it identifies each data object in the message with a number string called an object identifier (OID). This is great for a computer, but not easily readable for a human being. The MIB provides a text label for each OID. This is similar to DNS servers on the internet that translate numerical IP addresses into domain names that you can understand.

What is an OID?

An OID is an Object Identifier that can be defined by RFCs. A MIB file is a text file that defines all the OIDs available in that file. If you look at this file it will be hard to understand. You can use a MIB browser, which is designed to interpret MIB files and make it easier to understand each OID. Each OID will have a name and a description, and will state whether SNMP Gets or Sets are accepted. Most MIB browsers also have a built-in feature to send SNMP Gets and Sets. You can search for the specific OID you need.

An OID is formatted in a string of numbers as shown below. These numbers each provide you with a piece of corresponding information. Most of the time OIDs will be provided by the vendor you purchased your device from. Example Cisco OID for RAM usage in %

1.3.6.1.4.1.9.9.618.1.8.6.0

Each segment in the number string denotes a different level in the order, starting with one of the two organizations that assign OIDs, all the way down to a unique manufacturer, a unique device, and a unique data object
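
To make the levels concrete, here is a small sketch that splits an enterprise OID into its parts. It relies on the standard prefix 1.3.6.1.4.1 (iso.org.dod.internet.private.enterprises) and on the fact that the next number is the vendor's enterprise number (9 is Cisco). The function name describe_oid is invented for this example, and it only handles OIDs under that prefix.

```shell
#!/bin/sh
# Split an enterprise OID into: fixed prefix, enterprise number, vendor part.
describe_oid() {
  oid=$1
  case "$oid" in
    1.3.6.1.4.1.*) ;;
    *) echo "not under 1.3.6.1.4.1 (private enterprises)" >&2; return 1 ;;
  esac
  rest=${oid#1.3.6.1.4.1.}     # drop the fixed prefix
  enterprise=${rest%%.*}       # first remaining number, e.g. 9 for Cisco
  object=${rest#*.}            # everything after the enterprise number
  echo "1.3.6.1.4.1 = iso.org.dod.internet.private.enterprises"
  echo "$enterprise = enterprise number assigned to the vendor"
  echo "$object = object defined in the vendor MIB"
}

# The Cisco example OID from the text above.
describe_oid 1.3.6.1.4.1.9.9.618.1.8.6.0
```

A MIB browser does the same job with full names for every level; this sketch only shows where the vendor-specific part of the number string begins.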

Every SNMP-enabled network device will have its own MIB table with many different OIDs. There are so many OIDs in most MIBs that it would be next to impossible to record all of the information.

SNMP agents include OIDs with every Trap message they send. This allows the SNMP manager to use the compiled MIB to understand what the agent is saying.

SNMP monitoring tools are designed to take data from MIBs and OIDs to present to you in a format that is easy to understand. Get requests and SNMP traps provide network monitors with raw performance data which is then converted into graphical displays, charts, and graphs. As such, MIBs and OIDs make it possible for you to monitor multiple SNMP-enabled devices from one centralized location.

SNMP v3 Authentication and Encryption

Older versions of SNMP relied on a single unencrypted “community string” for both get requests and traps, making it very insecure on the network (Anyone could ‘snoop’ on the network and detect the unencrypted community strings). The only security options with SNMP v1 and v2c are to either disable it altogether or make sure SNMP enabled devices are ‘read only’ so that if the connection details were obtained by a malicious person, they would only be able to read configuration rather than change device configuration.

Version 3 uses the same base protocol as version 1 and 2c, but introduces encryption and much improved authentication mechanisms. Depending on how you authorize with the SNMP agent on a device, you may be granted different levels of access.

The security level you use depends on what credentials you must provide to authenticate successfully

Authentication protocols

  • MD5 and SHA

Privacy protocols

  • DES and AES
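
The relationship between credentials and security level can be sketched as follows. The level names noAuthNoPriv, authNoPriv and authPriv are the standard SNMPv3 names; the function snmpv3_level is a made-up helper for illustration, and the user name, passphrases and host in the comment are placeholders.

```shell
#!/bin/sh
# Pick the SNMPv3 security level from the credentials that are available.
# usage: snmpv3_level <auth_passphrase> <priv_passphrase>  (either may be empty)
snmpv3_level() {
  auth=$1
  priv=$2
  if [ -n "$priv" ] && [ -z "$auth" ]; then
    # Privacy (encryption) cannot be used without authentication.
    echo "privacy requires authentication" >&2
    return 1
  elif [ -n "$auth" ] && [ -n "$priv" ]; then
    echo "authPriv"
  elif [ -n "$auth" ]; then
    echo "authNoPriv"
  else
    echo "noAuthNoPriv"
  fi
}

snmpv3_level "" ""                    # no credentials at all
snmpv3_level "my-auth-pass" ""        # authentication only
snmpv3_level "my-auth-pass" "my-priv" # authentication and encryption

# With Net-SNMP (an assumption), the level is passed with -l, for example:
#   snmpget -v3 -l authPriv -u <user> -a SHA -A <auth pass> \
#           -x AES -X <priv pass> <host> 1.3.6.1.2.1.1.3.0
```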

The protocols used for authentication are MD5 and SHA; for privacy, they are DES (Data Encryption Standard) and AES (Advanced Encryption Standard).

Engine IDs

In SNMP (Simple Network Management Protocol), an engine ID is a unique identifier assigned to an SNMP entity. It is a string of octets that identifies a particular SNMP entity within a network or administrative domain.

The engine ID is used in SNMP to distinguish between different SNMP entities and to ensure that SNMP messages are sent to the correct recipient. When an SNMP message is sent, it includes the engine ID of the sending entity, as well as the engine ID of the intended recipient. The engine ID is also used in SNMP to authenticate messages and to ensure that they are generated by a trusted SNMP entity.

There are two types of engine IDs in SNMP:

Local Engine ID – This is the engine ID assigned to the local SNMP entity. It is used to identify the local entity to other SNMP entities in the network.

Remote Engine ID – This is the engine ID assigned to a remote SNMP entity. It is used to identify the remote entity to the local entity when SNMP messages are exchanged between them.

The engine ID is an important aspect of SNMP as it ensures that SNMP messages are sent to the correct recipient and are generated by a trusted SNMP entity.

Context engine

The context engine in SNMP (Simple Network Management Protocol) is responsible for providing context to SNMP messages. SNMP messages are used to manage network devices, and they contain information about the operation to be performed on the network device.

However, SNMP manages a large number of network devices, and it is necessary to identify the specific network device that is being managed. This is where the context engine comes in. The context engine provides the necessary context to SNMP messages to identify the specific network device being managed.

In SNMP, a context is a piece of information that identifies the specific instance of a managed object. Managed objects are objects in the network device that can be managed through SNMP. For example, a managed object could be the interface statistics for a network interface.

The context engine provides this context in the form of identifiers carried in each SNMPv3 message: a contextEngineID, which identifies the SNMP entity that holds the managed objects, and a contextName, which names the context within that entity. Together they allow the SNMP manager to identify the specific instance of the managed object being managed.

In summary, the context engine in SNMP provides the context that SNMP messages need to identify the specific network device and managed object instance being managed, using the contextEngineID and contextName.

Authoritative engine

In SNMP, an authoritative engine ID is a unique identifier assigned to an SNMP entity that serves as the authoritative source of information within a particular administrative domain.

The authoritative engine ID is a string of octets that identifies a particular SNMP entity. It is used to distinguish between different SNMP entities within the same network or domain. An SNMP entity is usually a network device or a server that is capable of responding to SNMP queries.

The authoritative engine ID is important in SNMP because it is used to authenticate SNMP messages. SNMP messages can be authenticated by verifying the source of the message and ensuring that it was generated by a trusted SNMP entity. The authoritative engine ID is used in the authentication process to verify the source of the message.

In summary, the authoritative engine ID is a unique identifier assigned to an SNMP entity that serves as the source of information within a particular administrative domain. It is used to authenticate SNMP messages and ensure that they are generated by a trusted SNMP entity.

Using tcpdump

What is tcpdump?

tcpdump is a network capture and protocol analysis tool (www.tcpdump.org). This program is based on the libpcap interface, a library for user-level network datagram capture. Despite its name, tcpdump can also be used to capture non-TCP traffic, including UDP and ICMP. The tcpdump program is native to Linux and ships with many distributions of BSD, Linux, and Mac OS X; there is also a Windows version.

Where is tcpdump installed?

You can check whether tcpdump is installed on your system with the following command

rhian@LAPTOP-KNJ4ALF8:~$ which tcpdump
/usr/sbin/tcpdump

How long does tcpdump run for?

tcpdump will keep capturing packets until it receives an interrupt signal. You can interrupt capturing by pressing Ctrl+C. To limit the number of packets captured and stop tcpdump, use the -c (for count) option.

When tcpdump finishes capturing packets, it will report counts of

  • Packets “captured” (this is the number of packets that tcpdump has received and processed)
  • Packets “received by filter” (the meaning of this depends on the OS on which you’re running tcpdump, and possibly on the way the OS was configured. If a filter was specified on the command line, some OSes count packets regardless of whether they were matched by the filter expression and regardless of whether tcpdump has read and processed them yet; other OSes count only packets that were matched by the filter expression, regardless of whether tcpdump has read and processed them yet; and others count only packets that were matched by the filter expression and were processed by tcpdump)
  • Packets “dropped by kernel” (this is the number of packets that were dropped, due to a lack of buffer space by the packet capture mechanism in the OS on which tcpdump is running. It depends if the OS reports that information to applications; if not, it will be reported as 0).
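
If you save this end-of-capture summary, a short script can check whether the kernel dropped anything. The sketch below is one possible way to do it: the helper name dropped_by_kernel and the sample summary text are made up for illustration, and it assumes the summary lines have the form "N packets dropped by kernel".

```shell
#!/bin/sh
# Print the "dropped by kernel" count from a tcpdump summary read on stdin.
# Exits non-zero if no such line is present.
dropped_by_kernel() {
  awk '/packets? dropped by kernel/ { print $1; found = 1 }
       END { if (!found) exit 1 }'
}

# Illustrative summary text, not output from a real capture.
summary='10 packets captured
12 packets received by filter
0 packets dropped by kernel'

drops=$(printf '%s\n' "$summary" | dropped_by_kernel)
if [ "$drops" -gt 0 ]; then
  echo "warning: $drops packets were dropped, the capture is incomplete"
else
  echo "no packets dropped by kernel"
fi
```

tcpdump writes the summary to standard error, so redirect it (for example with 2> summary.txt) if you want to feed a real summary into a check like this.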

Writing a tcpdump output to file

When running tcpdump, the output file generated by the -w switch is not a text file. It can only be read by tcpdump or by other software, such as Wireshark, which can parse the binary file format.

tcpdump manual

https://www.tcpdump.org/manpages/tcpdump.1.html

Common Parameters

There are many more parameters but these are likely to be the most common ones.

  • -# – Print a packet number on every line.
  • -c – Exit after capturing the specified number of packets.
  • -D – Print all interfaces available for capture. Use ifconfig to check what interfaces you have.
  • -e – Also print the link-layer header of each packet (e.g., to see the VLAN tag). This can be used, for example, to print MAC layer addresses for protocols such as Ethernet and IEEE 802.11.
  • -i, --interface – Interface to capture from.
  • -n – Do not resolve addresses to names (e.g., IP reverse lookup).
  • -nn – Disable name resolution of both host names and port names.
  • -v, -vv, -vvv – Verbose output in increasing detail.
  • -w – Write the output to a file, which can be opened in Wireshark, for example.
  • -X – Show packet contents in both hex and ASCII, which makes screen output easier to read.
  • -r – Read a file containing a previous tcpdump capture.

Examples

Check the interfaces available

# sudo tcpdump -D
1.eth0
2.eth1
3.wifi0
4.any (Pseudo-device that captures on all interfaces)
5.lo [Loopback]

Capture all packets on any interface

# sudo tcpdump --interface any

Capture packets for a specific host and output to a file

# sudo tcpdump -i any host <host_ip> -w /tmp/tcpdump.pcap

Packets can be filtered by source and destination IP address, port, protocol and so on. The example below captures only ICMP packets

# sudo tcpdump -i any -c5 icmp

Filtering packets by port numbers

sudo tcpdump -i any -c5 -nn port 80

Filter based on source or destination ip or hostname

sudo tcpdump -i any -c5 -nn src 192.168.10.125
sudo tcpdump -i any -c5 -nn dst 192.168.20.125

sudo tcpdump -i any -c5 -nn src techlabadc001.techlab.com
sudo tcpdump -i any -c5 -nn dst techlabdns002.techlab.com

Complex expressions

You can also combine filters by using the logical operators “and” and “or” to create more complex expressions. For example, to filter packets from source IP address 192.168.10.125 on port 80 (HTTP) only, use this command.

sudo tcpdump -i any -c5 -nn src 192.168.10.125 and port 80

You can create even more complex expressions by grouping filters with parentheses. Enclose the whole filter expression in quotation marks, which prevents the shell from interpreting the parentheses as shell syntax.

sudo tcpdump -i any -c5 -nn "port 80 and (src 192.168.10.125 or src 192.168.20.125)"
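
When the list of source addresses changes often, the grouped expression can be generated rather than typed by hand. This is only a sketch: build_src_filter is a hypothetical helper, the addresses are illustrative, and the script just prints the tcpdump command instead of running it.

```shell
#!/bin/sh
# Build "port P and (src A or src B ...)" from a port and one or more addresses.
build_src_filter() {
  if [ "$#" -lt 2 ]; then
    echo "usage: build_src_filter <port> <src> [<src> ...]" >&2
    return 1
  fi
  port=$1
  shift
  expr=""
  for addr in "$@"; do
    # Prefix every term after the first with " or ".
    expr="${expr:+$expr or }src $addr"
  done
  printf 'port %s and (%s)\n' "$port" "$expr"
}

filter=$(build_src_filter 80 192.168.10.125 192.168.20.125)
# Quote the filter so the shell passes the parentheses through untouched.
echo "sudo tcpdump -i any -c5 -nn \"$filter\""
```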

Occasionally we need even more visibility, and have to inspect the contents of the packets to ensure that the message we’re sending contains what we need or that we received the expected response. To see the packet content, tcpdump provides two additional flags: -X to print the content in hex and ASCII, or -A to print the content in ASCII.

sudo tcpdump -i any -c20 -nn -A port 80

Reading and writing to a file

tcpdump has the ability to save the capture to a file so you can read and analyze the results later. This allows you to capture packets in batch mode overnight, for example, and verify the results at your leisure. It also helps when there are too many packets to analyze, since real-time capture can scroll past too fast to read. If you have Wireshark installed, you can open the .pcap files there for further analysis as well.

# Writing the file 
sudo tcpdump -i any -c10 -nn -w dnsserver.pcap port 53
# And to read the file
tcpdump -nn -r dnsserver.pcap

Summary

tcpdump and Wireshark are extremely useful tools to have to hand for troubleshooting network issues in more detail. For example, we have used tcpdump to check whether a host can ping a key management server, and to check connectivity between a host and a syslog server over TCP port 514. Sometimes you may have to run these tools from an elevated account, which may not be possible, and in certain situations you may get an error when you run tcpdump, such as

tcpdump: socket for SIOCETHTOOL(ETHTOOL_GET_TS_INFO): Socket type not supported

This can sometimes happen when you are using Windows Subsystem for Linux (WSL), which allows you to install a complete Ubuntu terminal environment on your Windows machine. Some functionality is not yet enabled there, which restricts certain things you may want to do.

Vdbench

What is VdBench?

Vdbench is a command line utility specifically created to help engineers and customers generate disk I/O workloads to be used for validating storage performance and storage data integrity. Vdbench execution parameters may also be specified via an input text file.

Vdbench is written in Java with the objective of supporting Oracle heterogeneous attachment. Vdbench has been tested on Solaris Sparc and x86, Windows NT, 2000, 2003, 2008, XP and Windows 7, HP/UX, AIX, Linux, Mac OS X, zLinux, and native VMware

Objective of Vdbench

The objective of Vdbench is to generate a wide variety of controlled storage I/O workloads, allowing control over workload parameters such as I/O rate, LUN or file sizes, transfer sizes, thread count, volume count, volume skew, read/write ratios, read and write cache hit percentages, and random or sequential workloads. This applies to both raw disks and file system files and is integrated with a detailed performance reporting mechanism eliminating the need for the Solaris command iostat or equivalent performance reporting tools. Vdbench performance reports are web accessible and are linked using HTML. Open your browser to access the summary.html file in the Vdbench output directory.
There is no requirement for Vdbench to run as root as long as the user has read/write access for the target disk(s) or file system(s) and for the output-reporting directory.

Non-performance related functionality includes data validation with Vdbench keeping track of what data is written where, allowing validation after either a controlled or uncontrolled shutdown.

How to download Vdbench

https://www.oracle.com/downloads/server-storage/vdbench-downloads.html

Vdbench comes packaged as a zip file which contains everything you need for Windows and Linux

Vdbench Terminology

Execution parameters control the overall execution of Vdbench and control things like parameter file name and target output directory name.

  • Raw I/O workload parameters describe the storage configuration to be used and the workload to be generated. The parameters include General, Host Definition (HD), Replay Group (RG), Storage Definition (SD), Workload Definition (WD) and Run Definition (RD) and must always be entered in the order in which they are listed here. A Run is the execution of one workload requested by a Run Definition. Multiple Runs can be requested within one Run Definition.
  • File system Workload parameters describe the file system configuration to be used and the workload to be generated. The parameters include General, Host Definition (HD), File System Definition (FSD), File system Workload Definition (FWD) and Run Definition (RD) and must always be entered in the order in which they are listed here. A Run is the execution of one workload requested by a Run Definition. Multiple Runs can be requested within one Run Definition.
  • Replay: This Vdbench function will replay the I/O workload traced with and processed by the Sun StorageTek™ Workload Analysis Tool (Swat).
  • Master and Slave: Vdbench runs as two or more Java Virtual Machines (JVMs). The JVM that you start is the master. The master takes care of the parsing of all the parameters, it determines which workloads should run, and then will also do all the reporting. The actual workload is executed by one or more Slaves. A Slave can run on the host where the Master was started, or it can run on any remote host as defined in the parameter file.
  • Data Validation: Though the main objective of Vdbench has always been to execute storage I/O workloads, Vdbench also is very good at identifying data corruptions on your storage.
  • Journaling: A combination of Data Validation and Journaling allows you to identify data corruption issues across executions of Vdbench.
  • LBA, or lba: For Vdbench this never means Logical Block Address; it means Logical Byte Address. 16 years ago the Vdbench creators decided that they did not want to have to worry about disk sector size changes.

Vdbench Quick start

You can carry out a quick test to make sure everything is working ok

  • ./vdbench -t (for a raw I/O workload)

When running ‘./vdbench -t’, Vdbench will run a hard-coded sample run. A small temporary file is created and a 50/50 read/write test is executed for just five seconds.
This is a great way to test that Vdbench has been correctly installed and works for the current OS platform without the need to first create a parameter file.

  • ./vdbench -tf (for a filesystem workload)

Use a browser to view the sample output report in /vdbench/output/summary.html

To start Vdbench

  • Linux: /home/vdbench/vdbench -f <parameter file>
  • Windows: c:\vdbench\vdbench.bat -f <parameter file>

There are sample parameter files in the PDF documentation in section 1.34 and in the examples folder from the zip file.

Execution Parameter Overview

The main execution parameters are

  • -f <workload parameter file> – One parameter file is required
  • -o <output directory> – Output directory for reporting. Default is “output” in the current directory
  • -t – Run a 5 second sample workload on a small disk file
  • -tf – Run a 5 second sample filesystem workload
  • -e – Override elapsed parameters in Run Definitions
  • -i – Override interval parameters in Run Definitions
  • -w – Override warmup parameters in Run Definitions
  • -m – Override the number of concurrent JVMs used to run the workload
  • -v – Activate data validation
  • -vr – Activate data validation, immediately re-reading after each write
  • -vw – Activate data validation, but don’t read before write
  • -vt – Activate data validation and keep track of each write timestamp (memory intensive)
  • -j – Activate data validation with journaling
  • -jr – Recover existing journal, validate data and run workload
  • -jro – Recover existing journal, validate data but do not run workload
  • -jri – Recover existing journal, ignore pending writes
  • -jm – Activate journaling but only write the journal maps
  • -jn – Activate journaling but use asynchronous writes to journal
  • -s – Simulate execution. Scans parameter names and displays run names
  • -k – Solaris only: Report kstat statistics on the console
  • -c – Clean (delete) existing file system structure at start of run
  • -co – Force format=only
  • -cy – Force format=yes
  • -cn – Force format=no
  • -p – Override Java socket port number
  • -l nnn – After the last run, start over with the first run. Without nnn this is an endless loop
  • -r – Allows for a restart of a parameter file containing multiple run definitions, e.g. -r rd5 if you have rd1 through rd10 in a parameter file
  • xxx=yyy – Used for variable substitution

There are also Vdbench utility functions – See Section 1.9 in the PDF documentation

  • compare – Start Vdbench workload compare
  • csim – Compression simulator
  • dsim – Dedupe simulator
  • edit – Primitive full screen editor, syntax ‘./vdbench edit file.name’
  • jstack – Create stack trace. Requires a JDK
  • parse(flat) – Selective parsing of flatfile.html
  • print – Print any block on any disk or disk file
  • rsh – Start RSH daemon (for multi-host testing)
  • sds – Start Vdbench SD Parameter Generation Tool (Solaris, Windows and Linux)
  • showlba – Used to display output of the XXXX parameter

Parameter files

For raw I/O testing, the parameters in a parameter file are read in the following order

  • General (Optional)
  • HD (Host Definition) (Optional)
  • RG (Replay Group)
  • SD (Storage Definition)
  • WD (Workload Definition)
  • RD (Run Definition)

or for file system testing:

  • General
  • HD (Host Definition)
  • FSD (File System Definition)
  • FWD (File System Workload Definition)
  • RD (Run Definition)
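
As an illustration of the raw I/O ordering above (SD, then WD, then RD), the sketch below writes a minimal parameter file. The file names and the workload values are placeholders chosen for this example, not recommendations; the optional General, HD and RG sections are left out.

```shell
#!/bin/sh
# Write a minimal raw I/O parameter file: SD first, then WD, then RD.
# Both paths below are hypothetical; adjust them for your system.
parmfile="${TMPDIR:-/tmp}/vdbench_example.parm"

cat > "$parmfile" <<'EOF'
* Storage Definition: a 100 MB test file
sd=sd1,lun=/tmp/vdbench_example.dat,size=100m
* Workload Definition: 4k transfers, 50% reads, fully random
wd=wd1,sd=sd1,xfersize=4k,rdpct=50,seekpct=100
* Run Definition: 100 I/Os per second for 10 seconds, reporting every second
rd=rd1,wd=wd1,iorate=100,elapsed=10,interval=1
EOF

echo "wrote $parmfile"
echo "run it with: ./vdbench -f $parmfile"
```

Each definition refers back to the one before it (the WD names an SD, the RD names a WD), which is why the order matters.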

General Parameters

The below parameters must be the first in the file

  • abort_failed_skew=nnn – Abort if requested workload skew is off by more than nnn%
  • compratio=nn – Compression ratio. By default, Vdbench will write an uncompressible random data pattern. ‘compratio=nn’ generates a data pattern that results in a nn:1 ratio. The data patterns implemented are based on the use of the ‘LZJB’ compression algorithm using a ZFS record size of 128k. Compression ratios 1:1 through 25:1 are the only ones implemented; any ratio larger than 25:1 will be set to 25.
  • concatenatesds=yes
  • create_anchors=yes – Create parent directories for FSD anchor
  • data_errors=nn – Terminate after ‘nn’ read/write/data validation errors (default 50)
  • data_errors=cmd – Run command or script ‘cmd’ after first read/write/data validation error, then terminate.
  • dedupratio= – Expected ratio. Default 1 (all blocks are unique).
  • dedupunit= – What size of data does Dedup compare?
  • dedupsets= – How many sets of duplicates.
  • deduphotsets= – Dedicated small sets of duplicates
  • dedupflipflop= – Activate the Dedup flip-flop logic.
  • endcmd=cmd – Execute command or script at the end of the last run
  • formatsds= – Force a one-time (pre)format of all SDs
  • formatxfersize= – Specify xfersize used when creating, expanding, or (pre)formatting an SD.
  • fwd_thread_adjust=no – Override the default of ‘yes’ to NOT allow FWD thread counts to be adjusted because of integer rounding/truncation.
  • histogram=(default,….) – Override defaults for response time histogram.
  • include=/file/name – This is the one parameter that can be anywhere in the file. When it is found, the contents of the file name specified will be copied in place. Example: include=/complicated/workload/definitions.txt
  • ios_per_jvm=nnnnnn – Override the 100,000 default warning for ‘i/o or operations per second per slave’. This means that YOU will be responsible if you are overloading your slaves.
  • journal=yes – Activate Data Validation and Journaling
  • journal=recover – Recover existing journal, validate data and run workload
  • journal=only – Recover existing journal, validate data but do not run requested workload.
  • journal=noflush – Use asynchronous I/O on journal files
  • journal=maponly – Do NOT write before/after journal records
  • journal=skip_read_all – After journal recovery, do NOT read and validate every data block.
  • journal=(max=nnn) – Prevent the journal file from getting larger than nnn bytes
  • journal=ignore_pending – Ignore pending writes during journal recovery.
  • loop= – Repeat all Run Definitions: loop=nn repeats nn times; loop=nn[s|m|h] repeats until nn seconds/minutes/hours have passed. See also the ‘-l nn’ execution parameter
  • messagescan=no – Do not scan /var/xxx/messages (Solaris or Linux)
  • messagescan=nodisplay – Scan but do not display on console, instead display on slave’s stdout.
  • messagescan=nnnn – Scan, but do not report more than nnnn lines. Default 1000
  • monitor=/file/name – See External control of Vdbench termination
  • pattern= – Override the default data pattern generation.
  • port=nn – Override the Java socket port number.
  • report=host_detail, report=slave_detail – Specifies which SD detail reports to generate. Default is SD total only.
  • report=no_sd_detail, report=no_fsd_detail – Will suppress the creation of SD/FSD specific reports.
  • report_run_totals=yes – Reports run totals.
  • startcmd=cmd – Execute command or script at the beginning of the first run
  • showlba=yes – Create a ‘trace’ file to serve as input to ./vdbench showlba
  • timeout=(nn,script)
  • validate=yes – (-v) Activate Data Validation. Options can be combined: validate=(x,y,z)
  • validate=read_after_write – (-vr) Re-reads a data block immediately after it was written.
  • validate=no_preread – (-vw) Do not read before rewrite, though this defeats the purpose of data validation!
  • validate=time – (-vt) Keep track of each write timestamp (memory intensive)
  • validate=reportdedupsets – Reports ‘last time used’ for all duplicate blocks if a duplicate block is found to be corrupted. Also activates validate=time. Note: large SDs with few dedup sets can generate loads of output!

Host Definition Parameter Overview

These parameters are ONLY needed when running Vdbench in a multi-host environment or if you want to override the number of JVMs used in a single-host environment.

Command  Explanation
hd=default  Sets defaults for all HDs that are entered later
hd=localhost  Sets values for the current host
hd=host_label  Specify a host label.
system=hostname  Host IP address or network name, e.g. xyz.customer.com
vdbench=vdbench_dir_name  Where to find Vdbench on a remote host if different from current.
jvms=nnn  How many slaves to use
shell=rsh | ssh | vdbench  How to start a Vdbench slave on a remote system.
user=xxxx  Userid on remote system. Required.
clients=nn  Very useful if you want to simulate numerous clients for file servers without having all the hardware. Internally it basically creates a new ‘hd=’ parameter for each requested client.
mount=”mount xxx …”  This mount command is issued on the target host after the possibly needed mount directories have been created.

Replay Group (RG) Parameter Overview

Command  Explanation
rg=name  Unique name for this Replay Group (RG).
devices=(xxx,yyy,….)  The device numbers from Swat’s flatfile.bin.gz to be replayed.

Example: rg=group1,devices=(89465200,6568108,110)
Note: Swat Trace Facility (STF) will create Replay parameters for you. Select the ‘File’ ‘Create Replay parameter file’ menu option. All that’s then left to do is specify enough SDs to satisfy the amount of gigabytes needed.

Storage Definition (SD) Parameter Overview

This set of parameters identifies each physical or logical volume manager volume or file system file used in the requested workload. Of course, with a file system file, the file system takes responsibility for all I/O: reads and writes can and will be cached (see also openflags=) and Vdbench will not have control over physical I/O. However, Vdbench can still be used to test file system file performance.

Example: sd=sd1,lun=/dev/rdsk/cxt0d0s0,threads=8

Command  Explanation
sd=default  Sets defaults for all SDs that are entered later.
sd=name  Unique name for this Storage Definition (SD).
count=(nn,mm)  Creates a sequence of SD parameters.
align=nnn  Generate logical byte address in ‘nnn’ byte boundaries, not using default ‘xfersize’ boundaries.
dedupratio=, dedupsets=, deduphotsets=, dedupflipflop=  See data deduplication.
hitarea=nn  See read hit percentage for an explanation. Default 1m.
host=name  Name of host where this SD can be found. Default ‘localhost’
journal=xxx  Directory name for journal file for data validation
lun=lun_name  Name of raw disk or file system file.
offset=nnn  At which offset in a lun to start I/O.
openflags=(flag,..)  Pass specific flags when opening a lun or file
range=(nn,mm)  Use only a subset ‘range=nn’: Limit Seek Range of this SD.
replay=(group,..)  Replay Group(s) using this SD.
replay=(nnn,..)  Device number(s) to select for Swat Vdbench replay
resetbus=nnn  Issue ioctl (USCSI_RESET_ALL) every nnn seconds. Solaris only
resetlun=nnn  Issue ioctl (USCSI_RESET) every nnn seconds. Solaris only
size=nn  Size of the raw disk or file to use for workload. Optional unless you want Vdbench to create a disk file for you.
threads=nn  Maximum number of concurrent outstanding I/O for this SD. Default 8

Workload Definition (WD) Parameter Overview

The Workload Definition parameters describe what kind of workload must be executed using the storage definitions entered.
Example: wd=wd1,sd=(sd1,sd2),rdpct=100,xfersize=4k

Command  Explanation
wd=default  Sets defaults for all WDs that are entered later.
wd=name  Unique name for this Workload Definition (WD)
sd=xx  Name(s) of Storage Definition(s) to use
host=host_label  Which host to run this workload on. Default localhost.
hotband=  See hotbanding
iorate=nn  Requested fixed I/O rate for this workload.
openflags=(flag,..)  Pass specific flags when opening a lun or file.
priority=nn  I/O priority to be used for this workload.
range=(nn,nn)  Limit seek range to a defined range within an SD.
rdpct=nn  Read percentage. Default 100.
rhpct=nn  Read hit percentage. Default 0.
seekpct=nn  Percentage of random seeks. Default seekpct=100 or seekpct=random.
skew=nn  Percentage of skew that this workload receives from the total I/O rate.
streams=(nn,mm)  Create independent sequential streams on the same device.
stride=(min,max)  To allow for skip-sequential I/O.
threads=nn  Only available during SD concatenation.
whpct=nn  Write hit percentage. Default 0.
xfersize=nn  Data transfer size. Default 4k.
xfersize=(n,m,n,m,..)  Specify a distribution list with percentages.
xfersize=(min,max,align)  Generate xfersize as a random value between min and max.

File System Definition (FSD) Parameter Overview

Command  Explanation
fsd=name  Unique name for this File System Definition.
fsd=default  All parameters used will serve as default for all the following fsd’s.
anchor=/dir/  The name of the directory where the directory structure will be created.
count=(nn,mm)  Creates a sequence of FSD parameters.
depth=nn  How many levels of directories to create under the anchor.
distribution=all  Default ‘bottom’, creates files only in the lowest directories. ‘all’ creates files in all directories.
files=nn  How many files to create in the lowest level of directories.
mask=(vdb_f%04d.file, vdb.%d_%d.dir)  The default printf() mask used to generate file and directory names. This allows you to create your own names, though they still need to start with ‘vdb’ and end with ‘.file’ or ‘.dir’. ALL files are numbered consecutively starting with zero. The first ‘%’ mask is for directory depth, the second for directory width.
openflags=(flag,..)  Pass extra flags to file system open request (See: man open)
shared=yes/no  Default ‘no’: See FSD sharing
sizes=(nn,nn,…..)  Specifies the size(s) of the files that will be created.
totalsize=nnn  Stop after a total of ‘nnn’ bytes of files have been created.
width=nn  How many directories to create in each new directory.
workingsetsize=nn, wss=nn  Causes Vdbench to only use a subset of the total amount of files defined in the file structure. See workingsetsize.
journal=dir  Where to store your Data Validation journal files
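
As a rough illustration of how depth=, width= and files= interact, the sketch below estimates how many directories and files an FSD would create. It assumes that every directory level holds ‘width’ subdirectories (so the lowest level has width ** depth directories) and that files= applies to each directory that receives files. The function name is hypothetical, this is not Vdbench code, and real totals can differ, for example when totalsize= stops file creation early.

```python
def fsd_counts(depth, width, files, distribution="bottom"):
    """Estimate (total directories, total files) created under an FSD anchor.

    Assumption: level k (1-based) contains width ** k directories.
    """
    if depth < 1 or width < 1 or files < 0:
        raise ValueError("depth and width must be >= 1, files must be >= 0")
    dirs_per_level = [width ** level for level in range(1, depth + 1)]
    total_dirs = sum(dirs_per_level)
    if distribution == "all":
        # Assumption: 'all' places files= files in every created directory.
        dirs_with_files = total_dirs
    else:
        # Default 'bottom': only the lowest-level directories hold files.
        dirs_with_files = dirs_per_level[-1]
    return total_dirs, dirs_with_files * files


if __name__ == "__main__":
    dirs, nfiles = fsd_counts(depth=2, width=3, files=100)
    print(f"{dirs} directories, {nfiles} files")
```

Multiplying the file count by the file size from sizes= gives a quick estimate of how much capacity the file structure needs before you start a format run.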

Filesystem Workload Definition (FWD) Parameter Overview

Command  Explanation
fwd=name  Unique name for this Filesystem Workload Definition.
fwd=default  All parameters used will serve as default for all the following fwd’s.
fsd=(xx,….)  Name(s) of Filesystem Definitions to use
openflags=  Pass extra flags to (Solaris) file system open request (See: man open)
fileio=(random,shared)  Allows multiple threads to use the same file.
fileio=(seq,delete)  Sequential I/O: When opening for writes, first delete the file
fileio=random  How file I/O will be done: random or sequential
fileio=sequential  How file I/O will be done: random or sequential
fileselect=random/seq  How to select file names or directory names for processing.
host=host_label  Which host to run this workload on.
operation=xxxx  Specifies a single file system operation that must be done for this workload.
rdpct=nn  For operation=read and operation=write only. This allows a mix of reads and writes against a single file.
skew=nn  The percentage of the total amount of work for this FWD
stopafter=nnn  For random I/O: stop and close file after ‘nnn’ reads or writes. Default ‘size=’ bytes for random I/O.
threads=nn  How many concurrent threads to run for this workload. (Make sure you have at least one file for each thread).
xfersize=(nn,…)  Specifies the data transfer size(s) to use for read and write operations.

Run Definition (RD) Parameter Overview (For raw I/O testing)

The Run Definition parameters define which of the earlier defined workloads need to be executed, what I/O rates need to be generated, and how long the workload will run. One Run Definition can result in multiple actual workloads, depending on the parameters used.


Example: rd=run1,wd=(wd1,wd2),iorate=1000,elapsed=60,interval=5
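
To see how the SD, WD and RD lines fit together in one parameter file, the lines can be assembled programmatically. The sketch below is only an illustration: the helper name `definition` is hypothetical, the lun path is the placeholder from the SD example above, and only parameters that appear in the overviews above are used.

```python
def definition(kind, name, **params):
    """Render one Vdbench definition line, e.g. sd=sd1,lun=...,threads=8."""
    parts = [f"{kind}={name}"]
    for key, value in params.items():
        if isinstance(value, (list, tuple)):
            # Lists are written in Vdbench's parenthesised form: (a,b,c)
            value = "(" + ",".join(str(v) for v in value) + ")"
        parts.append(f"{key}={value}")
    return ",".join(parts)


lines = [
    definition("sd", "sd1", lun="/dev/rdsk/cxt0d0s0", threads=8),
    definition("wd", "wd1", sd="sd1", rdpct=100, xfersize="4k"),
    definition("rd", "run1", wd="wd1", iorate=1000, elapsed=60, interval=5),
]
parmfile = "\n".join(lines) + "\n"

if __name__ == "__main__":
    print(parmfile, end="")
```

Writing `parmfile` to a file gives you something to pass to the ‘-f parmfile’ execution parameter. Generating the lines this way is mostly useful when you need many similar SDs or runs.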


There is a separate list of RD parameters for file system testing.

Command  Explanation
rd=default  Sets defaults for all RDs that are entered later.
rd=name  Unique name for this Run Definition (RD).
wd=xx  Workload Definitions to use for this run.
sd=xxx  Which SDs to use for this run (Optional).
curve=(nn,nn,..)  Data points to generate when creating a performance curve. See also stopcurve=
distribution=(x[,variable])  I/O inter arrival time calculations: exponential, uniform, or deterministic. Default exponential.
elapsed=nn  Elapsed time for this run in seconds. Default 30 seconds.
endcmd=cmd  Execute command or script at the end of the last run
(for)compratio=nn  Multiple runs for each compression ratio.
(for)hitarea=nn  Multiple runs for each hit area size.
(for)rhpct=nn  Multiple runs for each read hit percentage.
(for)rdpct=nn  Multiple runs for each read percentage.
(for)seekpct=nn  Multiple runs for each seek percentage.
(for)threads=nn  Multiple runs for each thread count.
(for)whpct=nn  Multiple runs for each write hit percentage.
(for)xfersize=nn  Multiple runs for each data transfer size.
Most forxxx parameters may be abbreviated to their regular name, e.g. xfersize=(..,..)
interval=nn  Reporting interval in seconds. Default ‘min(elapsed/2,60)’
iorate=(nn,nn,nn,…)  One or more I/O rates.
iorate=curve  Create a performance curve.
iorate=max  Run an uncontrolled workload.
iorate=(nn,ss,…)  nn,ss: pairs of I/O rates and seconds of duration for this I/O rate. See also ‘distribution=variable’.
maxdata=nnn  Stop the run after nnn bytes have been read or written, e.g. maxdata=200g. I/O will stop at the lower of elapsed= and maxdata=.
openflags=xxxx  Pass specific flags when opening a lun or file
pause=nn  Sleep ‘nn’ seconds before starting next run.
replay=(filename, split=split_dir, repeat=nn)  ‘filename’: Replay file name used for Swat Vdbench replay; ‘split_dir’: directory used to do the replay file split; ‘nn’: how often to repeat the replay.
startcmd=cmd  Execute command or script at the beginning of the first run
stopcurve=n.n  Stop iorate=curve runs when response time > n.n ms.
warmup=nn  Override warmup period.

Run Definition (RD) parameters for file systems, overview

These parameters are file system specific. More RD parameters can be found in the Run Definition (RD) Parameter Overview for raw I/O testing above.

Command  Explanation
fwd=(xx,yy,..)  Name(s) of Filesystem Workload Definitions to use.
fwdrate=nn  How many file system operations per second
format=yes/no/only/restart/clean/once/directories  During this run, if needed, create the complete file structure.
operations=xx  Overrides the operation specified on all selected FWDs.
foroperations=xx  Multiple runs for each specified operation.
fordepth=xx  Multiple runs for each specified directory depth
forwidth=xx  Multiple runs for each specified directory width
forfiles=xx  Multiple runs for each specified amount of files
forsizes=xx  Multiple runs for each specified file size
fortotal=xx  Multiple runs for each specified total file size

Report Files

HTML files are written to the directory specified using the ‘-o’ execution parameter.
These reports are all linked together from one starting point. Use your favourite browser and point at ‘summary.html’.

Report Type  Explanation
summary.html  Contains workload results for each run and interval. summary.html also contains a link to all other html files, and should be used as a starting point when using your browser for viewing. For file system testing, see summary.html for file system testing. From a command prompt in Windows just enter ‘start summary.html’; on a Unix system, just enter ‘firefox summary.html &’.
totals.html  Reports only run totals, allowing you to get a quick overview of run totals instead of having to scan through page after page of numbers.
totals_optional.html  Reports the cumulative amount of work done during a complete Vdbench execution. For SD/WD workloads only.
hostx.summary.html  Identical to summary.html, but containing results for only one specific host. This report will be identical to summary.html when not used in a multi-host environment.
hostx-n.summary.html  Summary for one specific slave.
logfile.html  Contains a copy of most messages displayed on the console window, including several messages needed for debugging.
hostx_n.stdout.html  Contains logfile-type information for one specific slave.
parmfile.html  Contains a copy of the parameter file(s) from the ‘-f parmfile’ execution parameter.
parmscan.html  Contains a running trail of what parameter data is currently being parsed. If a parsing or parameter error is given this file will show you the latest parameter that was being parsed.
sdname.html  Contains performance data for each defined Storage Definition. See summary.html for a description. You can suppress this report with ‘report=no_sd_detail’
hostx.sdname.html  Identical to sdname.html, but containing results for only one specific host. This report will be identical to sdname.html when not used in a multi-host environment. This report is only created when the ‘report=host_detail’ parameter is used.
hostx_n.sdname.html  SD report for one specific slave. This report is only created when the ‘report=slave_detail’ parameter is used.
kstat.html  Contains Kstat summary performance data for Solaris
hostx.kstat.html  Kstat summary report for one specific host. This report will be identical to kstat.html when not used in a multi-host environment.
host_x.instance.html  Contains Kstat device detailed performance data for each Kstat ‘instance’.
nfs3/4.html  Solaris only: Detailed NFS statistics per interval similar to the nfsstat command output.
flatfile.html  A file containing detail statistics to be used for extraction and input for other reporting tools. See also Parse Vdbench flatfile
errorlog.html  Any I/O errors or Data Validation errors will be written here.
swat_mon.txt  This file can be imported into the Swat Performance Monitor allowing you to display performance charts of a Vdbench run.
swat_mon_total.txt  Similar to swat_mon.txt, but allows Swat to display only run totals.
swat_mon.bin  Similar to swat_mon.txt above, but for File System workload data.
messages.html  For Solaris and Linux only. At the end of a run the last 500 lines from /var/adm/messages or /var/log/messages are copied here. These messages can be useful when certain I/O errors or timeout messages have been displayed.
fwdx.html  A detailed report for each File system Workload Definition (FWD).
wdx.html  A separate workload report is generated for each Workload Definition (WD) when more than one workload has been specified.
histogram.html  For file system workloads only. A response time histogram reporting response time details of all requested FWD operations.
sdx.histogram.html  A response time histogram for each SD.
wdx.histogram.html  A response time histogram for each WD. Only generated when there is more than one WD.
fsdx.histogram.html  A response time histogram for each FSD.
fwdx.histogram.html  A response time histogram for each FWD. Only generated when there is more than one FWD.
skew.html  A workload skew report.

Sample Parameter Files

These example parameter files can also be found in the installation directory.

  • Example 1: Single run, one raw disk
  • Example 2: Single run, two raw disks, two workloads.
  • Example 3: Two runs, two concatenated raw disks, two workloads.
  • Example 4: Complex run, including curves with different transfer sizes
  • Example 5: Multi-host.
  • Example 6: Swat trace replay.
  • Example 7: File system test. See also Sample parameter file:

There is a larger set of sample parameter files in the filesys and raw folders of the /examples/ directory inside your Vdbench install directory.


What is SSPI – Security Support Provider Interface?

Alongside its operating systems, Microsoft offers the Security Support Provider Interface (SSPI) which is the foundation for Windows authentication. The SSPI provides a universal, industry-standard interface for secure distributed applications. SSPI is the implementation of the Generic Security Service API (GSSAPI) in Windows Server operating systems. For more information about GSSAPI, see RFC 2743 and RFC 2744 in the IETF RFC Database.

SSPI is a software interface. Distributed programming libraries such as RPC can use it for authenticated communications. Software modules called SSPs provide the actual authentication capabilities. The default Security Support Providers (SSPs) that invoke specific authentication protocols in Windows are incorporated into the SSPI as DLLs. An SSP provides one or more security packages.

Security Support Provider Interface Architecture

The SSPI in Windows provides a mechanism that carries authentication tokens over the existing communication channel between the client computer and the server. When two computers or devices need to be authenticated so that they can communicate securely, the requests for authentication are routed to the SSPI, which completes the authentication process, irrespective of the network protocol currently in use. The SSPI returns transparent binary large objects. These are passed between the applications, at which point they can be passed to the SSPI layer. The SSPI enables an application to use various security models available on a computer or network without changing the interface to the security system.

Security Support Provider

The following sections show the default SSPs that interact with the SSPI. The SSPs are used in different ways in Windows operating systems to enable secure communication in an unsecure network environment. The protocols used by these providers enable authentication of users, computers, and services; the authentication process, in turn, enables authorized users and services to access resources in a secure manner.

Using SSPI ensures that no matter which SSP you select, your application accesses the authentication features in a uniform manner. This capability provides your application greater independence from the implementation of the network than was available in the past.

Distributed applications communicate through the RPC interface. The RPC software in turn, accesses the authentication features of an SSP through the SSPI.

Diagram that shows the components that are required and the paths that credentials take through the system to authenticate the user or process for a successful logon.

Python Training

It’s been a while since I’ve blogged and, in the interest of keeping a focus on training on new concepts, a friend suggested I follow John Zelle’s book, Python Programming: An Introduction to Computer Science. The book is focused on Python but also provides some great detail on computer science principles alongside programming.

Book Link

Available in several different formats

https://fbeedle.com/our-books/23-python-programming-an-introduction-to-computer-science-3rd-ed-9781590282755.html

GitHub

As a result of having to do more programming at work, I thought I would chart my progress by registering for a GitHub account and documenting the end-of-chapter discussions, questions and exercises, whilst also learning some Git concepts.

https://github.com/redrocket83/python

GitHub is a code hosting platform for version control and collaboration. It lets you and others work together on projects from anywhere. It has plenty of tutorials which can teach you GitHub essentials like repositories, branches, commits, and pull requests.

I have found it useful to my learning to document what I have learned, probably a repetitive learning concept and hopefully useful to others as it will be a public repository. If anyone feels like correcting anything or providing simpler and easier solutions, then feel free 🙂

SAML explained

What does SAML stand for?

Security Assertion Markup Language

What is SAML used for?

SAML is an XML-based open standard for transferring identity data between two parties: an identity provider and a service provider. SAML enables single sign-on (SSO), a term that means users can log in once, and those same credentials can be reused to log into other service providers. The OASIS Consortium approved SAML v2 in 2005. SAML 2.0 changed significantly from 1.1 and the versions are incompatible.

What is XML used for in relation to SAML?

SAML transactions use Extensible Markup Language (XML) to communicate between the identity provider and service providers. SAML is the link between the authentication of a user’s identity and the authorization to use a service.

How do authentication and authorization work in SAML?

SAML implements a secure method of transferring user authentications and authorizations between the identity provider and service providers. When a user logs into a SAML enabled application, the service provider requests authorization from the appropriate identity provider. The identity provider authenticates the user’s credentials and then returns the authorization for the user to the service provider, and the user is now able to use the application.

SAML authentication is the process of checking the user’s identity and credentials. SAML authorization tells the service provider what access to grant the authenticated user.

What is a SAML provider?

There are two primary types of SAML providers: the service provider and the identity provider.

  • The identity provider carries out the authentication and passes the user’s identity and authorization level to the service provider.
  • A service provider needs the authentication from the identity provider to grant authorization to the user.

Advantages of SAML

  • Users only need to sign in once to access several service providers. This means a faster authentication process and the user does not need to remember multiple login credentials for every application.
  • SAML provides a single point of authentication.
  • SAML doesn’t require user information to be maintained and synchronized between directories.
  • Identity management best practices require user accounts to be both limited to only the resources the user needs to do their job and to be audited and managed centrally. Using an SSO solution will allow you to disable accounts and remove access to resources simultaneously when needed.

Visualising SAML

SAML Example

SAML uses a claims-based authentication workflow. When a user tries to access an application or site, the service provider asks the identity provider to authenticate the user. Then, the service provider uses the SAML assertion issued by the identity provider to grant the user access.

  1. The user opens a browser and navigates to the service provider’s web application, which uses an identity provider for authentication.
  2. The web application responds with a SAML request.
  3. The browser passes the SAML request to the identity provider.
  4. The identity provider parses the SAML request.
  5. The identity provider authenticates the user by prompting for a username and password or some other authentication factor. NOTE: The identity provider will skip this step if the user is already authenticated.
  6. The identity provider generates the SAML response and returns it to the user’s browser.
  7. The browser sends the generated SAML response to the service provider’s web application which verifies it.
  8. If the verification succeeds, the web application grants the user access.
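
Steps 2 and 3 can be sketched in code. One common way to carry the SAML request through the browser is the HTTP-Redirect binding, where the request XML is DEFLATE-compressed, base64-encoded and URL-encoded into a SAMLRequest query parameter. The sketch below assumes that binding and uses made-up URLs and issuer names; a real service provider would use a maintained SAML library and would normally sign its requests.

```python
import base64
import urllib.parse
import uuid
import zlib
from datetime import datetime, timezone
from xml.sax.saxutils import escape, quoteattr


def build_authn_request(issuer, acs_url):
    """Build a minimal, unsigned SAML 2.0 AuthnRequest as an XML string."""
    issue_instant = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return (
        '<samlp:AuthnRequest xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol" '
        'xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion" '
        f'ID="_{uuid.uuid4().hex}" Version="2.0" IssueInstant="{issue_instant}" '
        f"AssertionConsumerServiceURL={quoteattr(acs_url)}>"
        f"<saml:Issuer>{escape(issuer)}</saml:Issuer>"
        "</samlp:AuthnRequest>"
    )


def encode_redirect(xml):
    """Raw-DEFLATE then base64-encode the request (no zlib header)."""
    compressor = zlib.compressobj(wbits=-15)
    deflated = compressor.compress(xml.encode("utf-8")) + compressor.flush()
    return base64.b64encode(deflated).decode("ascii")


def decode_redirect(value):
    """Inverse of encode_redirect: what the identity provider does to parse it."""
    return zlib.decompress(base64.b64decode(value), wbits=-15).decode("utf-8")


def redirect_url(idp_sso_url, xml):
    """URL the browser is redirected to in step 3."""
    query = urllib.parse.urlencode({"SAMLRequest": encode_redirect(xml)})
    return f"{idp_sso_url}?{query}"


if __name__ == "__main__":
    # Hypothetical service provider and identity provider endpoints.
    request_xml = build_authn_request(
        "https://sp.example.com/metadata", "https://sp.example.com/acs"
    )
    print(redirect_url("https://idp.example.com/sso", request_xml))
```

The identity provider reverses the encoding (step 4) before authenticating the user. The SAML response in steps 6 and 7 typically travels back in a form POST instead of a redirect.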

Network Partition on vSAN Cluster

The problem

We had an interesting problem with a 6-host vSAN cluster where 1 host seemed to be in a network partition according to Skyline Health. I thought it would be useful to document our troubleshooting steps as they may come in useful. Our problem wasn’t one of the usual network misconfigurations, but in order to reach that conclusion we needed to perform the usual tests.

We had removed this host from the vSAN cluster, the HA cluster and the inventory, rebuilt it, then tried adding it back into the vSAN cluster with the other 5 hosts. It let us add the host to the current vSAN Sub-cluster UUID, but the host then partitioned itself from the other 5 hosts.

The usual restarts of hostd, vpxa, clomd and vsanmgmtd did not help.

Test 1 – Check each host’s vSAN details

Running the command below gives you a lot of information about the problem host, in our case techlabesxi1:

esxcli vsan cluster get

  • Enabled: True
  • Current Local Time: 2021-07-16T14:49:44Z
  • Local Node UUID: 70ed98a3-56b4-4g90-607c4cc0e809
  • Local Node Type: NORMAL
  • Local Node State: MASTER
  • Local Node Health State: HEALTHY
  • Sub-Cluster Master UUID: 70ed45ac-bcb1-c165-98404b745103
  • Sub-Cluster Backup UUID: 70av34ab-cd31-dc45-8734b32d104
  • Sub-Cluster UUID: 527645bc-789d-745e-577a68ba6d27
  • Sub-Cluster Member Entry Revision: 1
  • Sub-Cluster Member Count: 1
  • Sub-Cluster Member UUIDs: 98ab45c4-9640-8ed0-1034-503b4d875604
  • Sub-Cluster Member Hostnames: techlabesx01
  • Unicast Mode Enabled: true
  • Maintenance Mode State: OFF
  • Config Generation: b3289723-3bed-4df5-b34f-a5ed43b4542b43 2021-07-13T12:57:06.153

Straight away we can see it is partitioned: the Sub-Cluster Member UUIDs should include the other 5 hosts’ UUIDs, and the Sub-Cluster Member Hostnames should include techlabesxi2, techlabesxi3, techlabesxi4, techlabesxi5 and techlabesxi6. The host has also made itself a MASTER, whereas the other partition of the vSAN cluster already has a master, and there can’t be two masters in a cluster.
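
When you have to compare this output across several hosts, a small script can do the check for you. The sketch below parses the ‘key: value’ lines shown above and flags a host whose member count is lower than expected. The function names are hypothetical, the sample text is a trimmed copy of the output above, and the real output layout may vary between ESXi versions.

```python
def parse_cluster_get(output):
    """Turn 'esxcli vsan cluster get' style 'Key: value' lines into a dict."""
    info = {}
    for line in output.splitlines():
        line = line.strip().lstrip("•").strip()
        if ":" not in line:
            continue
        # Split on the first colon only, so timestamps in values stay intact.
        key, _, value = line.partition(":")
        info[key.strip()] = value.strip()
    return info


def looks_partitioned(info, expected_hosts):
    """True when the host sees fewer cluster members than it should."""
    return int(info.get("Sub-Cluster Member Count", "0")) < expected_hosts


SAMPLE = """\
Enabled: True
Current Local Time: 2021-07-16T14:49:44Z
Local Node State: MASTER
Sub-Cluster Member Count: 1
Sub-Cluster Member Hostnames: techlabesx01
"""

if __name__ == "__main__":
    info = parse_cluster_get(SAMPLE)
    print(info["Local Node State"], looks_partitioned(info, expected_hosts=6))
```

Running the same check against every host also shows quickly whether more than one host reports the MASTER state.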

Master role:

  • A cluster should have only one host with the Master role. More than a single host with the Master role indicates a problem
  • The host with the Master role receives all CMMDS updates from all hosts in the cluster

Backup role:

  • The host with the Backup role assumes the Master role if the current Master fails
  • Normally, only one host has the Backup role

Agent role:

  • Hosts with the Agent role are members of the cluster
  • Hosts with the Agent role can assume the Backup role or the Master role as circumstances change
  • In clusters of four or more hosts, more than one host has the Agent role

Test 2 – Can each host ping the other one?

A lot of problems can be caused by misconfiguration of the vSAN vmkernel port and/or other vmkernel ports. This was not our issue, but it is worth double-checking everything. IP addresses across the specific vmkernel ports must be in the same subnet.

Get the networking details from each host by using the command below. This will give you the full vmkernel networking details including the IP address, Subnet Mask, Gateway and Broadcast.
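
The same-subnet rule is easy to check once you have collected each host’s vSAN vmkernel IP address and subnet mask. This is a minimal sketch using Python’s standard ipaddress module; the function name is hypothetical and the addresses are made-up examples, not values from our cluster.

```python
import ipaddress


def group_by_subnet(interfaces):
    """Group hosts by the network their vmkernel IP/netmask belongs to.

    interfaces maps a host name to an (ip, netmask) tuple.
    """
    groups = {}
    for host, (ip, netmask) in interfaces.items():
        network = ipaddress.ip_interface(f"{ip}/{netmask}").network
        groups.setdefault(network, []).append(host)
    return groups


if __name__ == "__main__":
    # Hypothetical vSAN vmkernel addresses for three hosts.
    vsan_vmk = {
        "host1": ("192.168.1.10", "255.255.255.0"),
        "host2": ("192.168.1.11", "255.255.255.0"),
        "host3": ("192.168.2.12", "255.255.255.0"),
    }
    for network, hosts in group_by_subnet(vsan_vmk).items():
        print(network, hosts)
```

More than one group in the result means at least one host’s vmkernel port sits in a different subnet from the rest.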

esxcli network ip interface ipv4 address list

It may be necessary to test VMkernel network connectivity between ESXi hosts in your environment. From the problem host, we tried pinging the other hosts’ management network.

vmkping -I vmkX x.x.x.x

Where x.x.x.x is the hostname or IP address of the server that you want to ping and vmkX is the vmkernel interface to ping out of.

This was all successful.

Test 3 – Check the unicast agent list and check the NodeUUIDs on each host

To check each host’s NodeUUID, you can run

esxcli vsan cluster unicastagent list

Conclusion

We think what happened was that the non-partitioned hosts still had a reference to an old UUID for techlabesx01 because we had rebuilt the host. The host was removed from the vSAN cluster and the HA cluster and completely rebuilt. However, when we originally removed this host, the other hosts did not seem to update themselves once it had gone. So when we tried to add it back in, the other hosts didn’t recognise it.

The Fix

What we had to do was tell each host to ignore cluster member list updates by setting IgnoreClusterMemberListUpdates to 1:

esxcfg-advcfg -s 1 /VSAN/IgnoreClusterMemberListUpdates

Then remove the old unicastagent information for techlabesx01 from each host:

esxcli vsan cluster unicastagent remove -a 192.168.1.10

Then add the new unicastagent details to each host. The host UUID is the Local Node UUID shown in Test 1:

esxcli vsan cluster unicastagent add -t node -u 70ed98a3-56b4-4g90-607c4cc0e809 -U true -a 192.168.1.10 -p 12321

This resolved the issue.

Comparing VM Encryption performance between ESXi 6.7U3 + vSAN and ESXi 7.0U2 + vSAN

This blog is similar to another I wrote which compared VM Encryption and vSAN encryption on ESXi 6.7U3. This time, I’m comparing VM Encryption performance on ESXi 6.7U3 and ESXi 7.0U2 running on vSAN.

What is the problem which needs to be solved?

I posted this section in the previous blog; however, it is important to understand the effect an extra layer of encryption has on the performance of your systems. It has become a requirement (sometimes mandatory) for companies to protect both personally identifiable information and data, including other communications within and across environments. The EU General Data Protection Regulation (GDPR) is now a legal requirement for global companies to protect the personally identifiable information of all European Union residents. In the last year, the United Kingdom has left the EU; however, the General Data Protection Regulation will still be important to implement. “The Payment Card Industry Data Security Standards (PCI DSS) requires encrypted card numbers. The Health Insurance Portability and Accountability Act and Health Information Technology for Economic and Clinical Health Acts (HIPAA/HITECH) require encryption of Electronic Protected Health Information (ePHI).” (Townsendsecurity, 2019)

Little is known about the effect encryption has on the performance of different data held on virtual infrastructure. VM encryption and vSAN encryption are the two data protection options I will evaluate for a better understanding of the functionality and performance effect on software-defined storage.

It may be important to understand encryption functionality in order to match business and legal requirements. Certain regulations may need to be met which only specific encryption solutions can provide. Additionally, encryption adds a layer of functionality which is known to have an effect on system performance. With systems which scale into thousands, it is critical to understand what effect encryption will have on functionality and performance in large environments. It will also help when purchasing hardware which has been designed for specific environments to allow some headroom in the specification for the overhead of encryption.

Testing Components

Test lab hardware (8 Servers)

HCIBench Test VMs

80 HCIBench Test VMs will be used for this test. I have placed 10 VMs on each of the 8 Dell R640 servers to provide a balanced configuration. No virtual machines other than the HCIBench test VMs will be run on this system to avoid interference with the testing.

The HCIBench appliance is running vdBench, not Fio

The specification of the 80 HCIBench Test VMs is as follows.

RAID Configuration

VM encryption will be tested on RAID1 and RAID6 vSAN storage

VM encryption RAID1 storage policy

Test Parameters: vCenter Storage Policy
Configuration:

  • Name = raid1_vsan_policy
  • Storage Type = vSAN
  • Failures to tolerate = 2 (RAID 1)
  • Thin provisioned = Yes
  • Number of disk stripes per object = 2
  • Encryption enabled = Yes
  • Deduplication and Compression enabled = No

VM encryption RAID6 storage policy

Test Parameters: vCenter Storage Policy
Configuration:

  • Name = raid6_vsan_policy
  • Storage Type = vSAN
  • Failures to tolerate = 2 (RAID6)
  • Thin provisioned = Yes
  • Number of disk stripes per object = 1
  • Encryption enabled = Yes
  • Deduplication and Compression enabled = No

HCIBench Test Parameters

The test will run through various types of read/write workload at the different block sizes to replicate different types of applications using 1 and 2 threads.

  • 0% Read 100% Write
  • 20% Read 80% Write
  • 70% Read 30% Write

The block sizes used are

  • 4k
  • 16k
  • 64k
  • 128k

The test plan below, containing 24 tests, will be run for VM Encryption on 6.7U3 and again for VM Encryption on 7.0U2. Each test is a parameter file uploaded to HCIBench, and the files then run sequentially without intervention. I think I left these running for 3 days! HCIBench refreshes the cache in between tests.


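Test | Number of disks | Working Set % | Number of threads | Block size (k) | Read % | Write % | Random % | Test time (s)
1 | 2 (O/S and Data) | 100% | 1 | 4k | 0 | 100 | 100 | 7200
2 | 2 (O/S and Data) | 100% | 2 | 4k | 0 | 100 | 100 | 7200
3 | 2 (O/S and Data) | 100% | 1 | 4k | 20 | 80 | 100 | 7200
4 | 2 (O/S and Data) | 100% | 2 | 4k | 20 | 80 | 100 | 7200
5 | 2 (O/S and Data) | 100% | 1 | 4k | 70 | 30 | 100 | 7200
6 | 2 (O/S and Data) | 100% | 2 | 4k | 70 | 30 | 100 | 7200
7 | 2 (O/S and Data) | 100% | 1 | 16k | 0 | 100 | 100 | 7200
8 | 2 (O/S and Data) | 100% | 2 | 16k | 0 | 100 | 100 | 7200
9 | 2 (O/S and Data) | 100% | 1 | 16k | 20 | 80 | 100 | 7200
10 | 2 (O/S and Data) | 100% | 2 | 16k | 20 | 80 | 100 | 7200
11 | 2 (O/S and Data) | 100% | 1 | 16k | 70 | 30 | 100 | 7200
12 | 2 (O/S and Data) | 100% | 2 | 16k | 70 | 30 | 100 | 7200
13 | 2 (O/S and Data) | 100% | 1 | 64k | 0 | 100 | 100 | 7200
14 | 2 (O/S and Data) | 100% | 2 | 64k | 0 | 100 | 100 | 7200
15 | 2 (O/S and Data) | 100% | 1 | 64k | 20 | 80 | 100 | 7200
16 | 2 (O/S and Data) | 100% | 2 | 64k | 20 | 80 | 100 | 7200
17 | 2 (O/S and Data) | 100% | 1 | 64k | 70 | 30 | 100 | 7200
18 | 2 (O/S and Data) | 100% | 2 | 64k | 70 | 30 | 100 | 7200
19 | 2 (O/S and Data) | 100% | 1 | 128k | 0 | 100 | 100 | 7200
20 | 2 (O/S and Data) | 100% | 2 | 128k | 0 | 100 | 100 | 7200
21 | 2 (O/S and Data) | 100% | 1 | 128k | 20 | 80 | 100 | 7200
22 | 2 (O/S and Data) | 100% | 2 | 128k | 20 | 80 | 100 | 7200
23 | 2 (O/S and Data) | 100% | 1 | 128k | 70 | 30 | 100 | 7200
24 | 2 (O/S and Data) | 100% | 2 | 128k | 70 | 30 | 100 | 7200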
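The test plan is simply every combination of block size, read percentage and thread count. As a minimal sketch (this is not an HCIBench parameter file, and the function and field names are made up for illustration), the same 24 rows can be generated and the total measured run time added up:

```python
from itertools import product

# Values taken from the test plan table above.
BLOCK_SIZES_K = [4, 16, 64, 128]
READ_PERCENTAGES = [0, 20, 70]
THREAD_COUNTS = [1, 2]
TEST_TIME_S = 7200


def build_test_plan():
    """Return the test plan as a list of dicts, in the same order as the table."""
    plan = []
    combos = product(BLOCK_SIZES_K, READ_PERCENTAGES, THREAD_COUNTS)
    for number, (block_k, read_pct, threads) in enumerate(combos, start=1):
        plan.append({
            "test": number,
            "threads": threads,
            "block_size_k": block_k,
            "read_pct": read_pct,
            "write_pct": 100 - read_pct,
            "random_pct": 100,
            "test_time_s": TEST_TIME_S,
        })
    return plan


plan = build_test_plan()
total_hours = sum(t["test_time_s"] for t in plan) / 3600
print(len(plan), "tests,", total_hours, "hours of measured run time per vSphere version")
```

At 7200 seconds each, the 24 tests come to 48 hours of measured run time per vSphere version. This figure does not include any preparation or cache refresh time between tests.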

HCIBench Performance Metrics

These metrics will be measured across all tests

Workload Parameter | Explanation | Value
IOPS | IOPS measures the number of read and write operations per second | Input/output operations per second
Throughput | Throughput measures the amount of data read or written per second. Average I/O size x IOPS = throughput in MB/s | MB/s
Read Latency | Latency is the response time when you send a small I/O to a storage device. If the I/O is a data read, latency is the time it takes for the data to come back | ms
Write Latency | Latency is the response time when you send a small I/O to a storage device. If the I/O is a write, latency is the time for the write acknowledgement to return | ms
Latency Standard Deviation | Standard deviation is a measure of the amount of variation within a set of values. A low standard deviation indicates that the values tend to be close to the mean of the set, while a high standard deviation indicates that the values are spread out over a wider range | Values must be compared to the standard deviation
Average ESXi CPU usage | Average ESXi host CPU usage | %
Average vSAN CPU usage | Average CPU use for vSAN traffic only | %

Results

IOPs

IOPS measures the number of read and write operations per second. The pattern across the 3 workload mixes is consistent: the write-heavy tests show the fewest IOPS, and IOPS gradually increase as the proportion of writes decreases.

IOPS and block size tend to have an inverse relationship. As the block size increases, each I/O takes longer to complete, so the number of IOPS decreases. Conversely, smaller block sizes yield higher IOPS.
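One way to see why longer I/O times mean fewer IOPS is Little's Law, a general queueing relationship (not something specific to HCIBench or vSAN): for a fixed number of outstanding I/Os, IOPS is that number divided by the average latency. The sketch below uses made-up latency values purely for illustration.

```python
def iops_from_latency(outstanding_ios, latency_ms):
    """Little's Law: sustained IOPS for a given concurrency and average latency."""
    return outstanding_ios * 1000 / latency_ms


# Hypothetical latencies in ms, not measured results.
small_block = iops_from_latency(outstanding_ios=1, latency_ms=0.5)
large_block = iops_from_latency(outstanding_ios=1, latency_ms=2.0)

print(f"0.5 ms per I/O: {small_block:.0f} IOPS per outstanding I/O")
print(f"2.0 ms per I/O: {large_block:.0f} IOPS per outstanding I/O")
```

With a single outstanding I/O, quadrupling the time each I/O takes cuts the achievable IOPS to a quarter. Adding threads raises the number of outstanding I/Os, which is why the tests are run with both 1 and 2 threads.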

With RAID1 VM Encryption, 7.0U2 performs better than 6.7U3 at the lower block sizes, 4k and 16k, but at the larger 64k and 128k block sizes there is less of a difference, with 6.7U3 having a slight edge in IOPS.

With RAID6 VM Encryption, 7.0U2 has consistently higher IOPS across all tests than 6.7U3.

RAID6 VM Encryption produces fewer IOPS than RAID1 VM Encryption, which is expected due to the increased overhead RAID6 incurs over RAID1 in general. A RAID1 write results in 2 writes, one to each mirror. A single RAID6 write results in 3 reads and 3 writes (due to double parity): the disks must read the data, read the first parity, read the second parity, write the data, write the first parity and finally write the second parity.
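The write penalties described above (2 back-end I/Os per RAID1 write, 6 per RAID6 write) can be turned into a rough estimate of how the read/write mix affects achievable IOPS. This is a simplified textbook model, not how vSAN calculates anything internally, and the back-end IOPS budget is a hypothetical number rather than a measured value.

```python
# Back-end I/Os generated per front-end write, as described in the text above.
WRITE_PENALTY = {"RAID1": 2, "RAID6": 6}


def frontend_iops(backend_iops, read_pct, raid_level):
    """Estimate front-end IOPS that a given back-end IOPS budget can sustain.

    Each read costs one back-end I/O; each write costs WRITE_PENALTY I/Os.
    """
    read_fraction = read_pct / 100
    write_fraction = 1 - read_fraction
    cost_per_io = read_fraction + write_fraction * WRITE_PENALTY[raid_level]
    return backend_iops / cost_per_io


BACKEND_IOPS = 100_000  # hypothetical budget, not a measured value

for read_pct in (0, 20, 70):
    r1 = frontend_iops(BACKEND_IOPS, read_pct, "RAID1")
    r6 = frontend_iops(BACKEND_IOPS, read_pct, "RAID6")
    print(f"{read_pct}% read: RAID1 ~{r1:,.0f} IOPS, RAID6 ~{r6:,.0f} IOPS")
```

In this model the gap between RAID1 and RAID6 is widest at 100% write and narrows as the read percentage rises, which matches the general shape of the results.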

RAID 1 VM Encryption

The graph below shows the comparison of IOPs between 6.7U3 and 7.0U2 with RAID 1 VM Encryption


RAID 6 VM Encryption

The graph below shows the comparison of IOPs between 6.7U3 and 7.0U2 with RAID6 VM Encryption


Throughput

IOPs and throughput are closely related by the following equation.

Throughput (MB/s) = IOPS * Block size

IOPS measures the number of read and write operations per second, while throughput measures the amount of data read or written per second. The higher the throughput, the more data can be transferred. The graphs follow a consistent pattern from the heavier to the lighter workload tests. The larger block sizes, 64K and 128K, have greater throughput in each of the workload tests than 4K or 16K. As the block sizes get larger in a workload, the number of IOPS will decrease. Even though there are fewer IOPS, you get more data throughput because the block sizes are bigger.

The vSAN datastore is a native 4K system. It’s important to remember that storage systems may be optimized for different block sizes. It is often the operating system and applications which set the block sizes, which then run on the underlying storage. It is important to test different block sizes on storage systems to see the effect these have.
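To make the relationship between IOPS, block size and throughput concrete, here is a small sketch of the equation above with the unit conversion spelled out. The IOPS figures are hypothetical examples, not values from these tests, and 1 MB is treated as 1024 KB.

```python
def throughput_mb_s(iops, block_size_kb):
    """Throughput in MB/s for a given IOPS and block size in KB (1 MB = 1024 KB)."""
    return iops * block_size_kb / 1024


# Hypothetical IOPS figures, chosen only to illustrate the trade-off.
small = throughput_mb_s(iops=100_000, block_size_kb=4)
large = throughput_mb_s(iops=20_000, block_size_kb=128)

print(f"100,000 IOPS at 4k  = {small:.1f} MB/s")
print(f" 20,000 IOPS at 128k = {large:.1f} MB/s")
```

Even with a fifth of the IOPS, the 128k example moves several times more data per second than the 4k example, which is the pattern described above.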

With RAID1 VM Encryption at the lower block sizes, 4k and 16k, 7.0U2 performs better with greater throughput. At the higher block sizes, 64k and 128k, there is less of a difference, with 6.7U3 performing slightly better, but the increase is minimal.

With RAID6 VM Encryption, 7.0U2 generally shows higher throughput at the lower block sizes but not at the higher block sizes.

RAID1 VM Encryption

The graph below shows the comparison of throughput between 6.7U3 and 7.0U2 with RAID1 VM Encryption


RAID6 VM Encryption

The graph below shows the comparison of throughput between 6.7U3 and 7.0U2 with RAID6 VM Encryption


Average Latency

With RAID1 VM Encryption at the lower block sizes, 4k and 16k, 7.0U2 shows less latency, but at the higher block sizes its latency is slightly higher than that of 6.7U3.

With RAID6 VM Encryption, the 7.0U2 tests are better, showing less latency than the 6.7U3 tests.

RAID1 VM Encryption

The graph below shows the comparison of average latency between 6.7U3 and 7.0U2 with RAID1 VM Encryption


RAID6 VM Encryption

The graph below shows the comparison of average latency between 6.7U3 and 7.0U2 with RAID6 VM Encryption


Read Latency

The pattern is consistent between the read/write workloads. As the workload decreases, read latency decreases although the figures are generally quite close. Read latency for all tests varies between 0.30 and 1.40ms which is under a generally recommended limit of 15-20ms before latency starts to cause performance problems.

RAID1 VM Encryption shows lower read latency for the 7.0U2 tests than 6.7U3. There are outlier values for the Read Latency across the 4K and 16K block size when testing 2 threads which may be something to note if applications will be used at these block sizes.

RAID6 shows a slightly better read latency result than RAID1. RAID6 has more disks to read from than mirrored RAID1, therefore the reads are very fast, which is reflected in the results. Faster reads result in lower latency. Overall, 7.0U2 performs better than 6.7U3 apart from one value at the 128k block size with 2 threads, which may be an outlier.

RAID1 VM Encryption


RAID6 VM Encryption


Write Latency

The lowest write latency is 0.72ms and the highest is 9.56ms. VMware recommends up to 20ms; with all-flash arrays, these values are expected and well within that limit. With NVMe and flash disks, the faster hardware may expose bottlenecks elsewhere in the hardware stack and architecture, which can be compared with internal VMware host layer monitoring. Write latency can occur at several virtualization layers and filters, each of which adds its own latency. The layers can be seen below.

[Image: image-14.png, showing the virtualization layers at which latency can occur]

Latency can be caused by limits on the storage controller, queuing at the VMkernel layer, the disk IOPS limit being reached and the types of workloads being run possibly alongside other types of workloads which cause more processing.

With RAID1 Encryption, 7.0U2 performed better at the lower block size with less write latency than 6.7U3. However on the higher block sizes, 64k and 128k, 6.7U3 performs slightly better but we are talking 1-2ms.

With RAID6 VM Encryption, 7.0U2 performed well with less latency across all tests than 6.7U3.

As expected, all the RAID6 results incurred more write latency than the RAID1 results. Each RAID6 write operation requires the disks to read the data, read the first parity, read the second parity, write the data, write the first parity and finally write the second parity, producing a heavy write penalty and therefore more latency.

RAID1 VM Encryption


RAID6 VM Encryption


Latency Standard Deviation

The standard deviation value in the testing results uses a 95th percentile. This is explained below with examples.

  • An average latency of 2ms and a 95th percentile of 6ms means that 95% of the I/Os were serviced in under 6ms, and that would be a good result.
  • An average latency of 2ms and a 95th percentile latency of 200ms means 95% of the I/Os were serviced in under 200ms (keeping in mind that some will be higher than 200ms). This means that latencies are unpredictable and some I/Os may take a long time to complete. An operation could take less than 2ms, but every once in a while it could take well over 200ms.
  • Assuming a good average latency, it is typical to see a 95th percentile latency of no more than 3 times the average latency.
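The rule of thumb in the last bullet can be checked with a few lines of code. The sketch below computes the average and the 95th percentile (using the simple nearest-rank method, which may differ slightly from how HCIBench or vdBench calculate it) and flags a sample set whose 95th percentile is more than 3 times its average. The latency samples are invented for illustration.

```python
import math
import statistics


def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile using the nearest-rank method."""
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def is_predictable(samples, max_ratio=3.0):
    """Apply the rule of thumb above: p95 should be no more than 3x the mean."""
    mean = statistics.mean(samples)
    p95 = percentile_nearest_rank(samples, 95)
    return p95 <= max_ratio * mean


# Hypothetical latency samples in milliseconds, not measured results.
steady = [2.0] * 90 + [4.0] * 10
spiky = [1.0] * 94 + [200.0] * 6

for name, samples in (("steady", steady), ("spiky", spiky)):
    mean = statistics.mean(samples)
    p95 = percentile_nearest_rank(samples, 95)
    print(f"{name}: mean={mean:.2f}ms p95={p95:.2f}ms predictable={is_predictable(samples)}")
```

The steady set passes because its 95th percentile stays close to the average, while the spiky set fails because a small share of very slow I/Os pushes the 95th percentile far above it.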

With RAID1 Encryption, 7.0U2 performed better at the lower block size with less latency standard deviation than 6.7U3. However on the higher block sizes, 64k and 128k, 6.7U3 performs slightly better.

With RAID 6 VM Encryption, 7.0U2 performed with less standard deviation across all the tests.

RAID1 VM Encryption


RAID6 VM Encryption


ESXi CPU Usage %

With RAID1 VM Encryption, at the lower block sizes, 4k and 16k, 7.0U2 uses more CPU, but at the higher block sizes 7.0U2 uses slightly less CPU.

With RAID6 VM Encryption, CPU usage is higher across all the 7.0U2 tests compared to the 6.7U3 tests. RAID6 has a higher computational penalty than RAID1.

RAID1 VM Encryption


RAID6 VM Encryption


Conclusion

The performance tests were designed to get an overall view from a low workload test of 30% Write, 70% Read through a series of increasing workload tests of 80% Write, 20% Read and 100% Write, 0% Read simulation. These tests used different block sizes to simulate different application block sizes. Testing was carried out on an all flash RAID1 and RAID6 vSAN datastore to compare the performance for VM encryption between ESXi 6.7U3 and 7.0U2. The environment was set up to vendor best practice across vSphere ESXi, vSAN, vCenter and the Dell server configuration.

RAID1 VM Encryption

  • With 6.7U3, IOPS at the higher block sizes, 64k and 128k, can be slightly better than 7.0U2 but not at the lower block sizes.
  • With 6.7U3, throughput at the higher block sizes, 64k and 128k, can be slightly better than 7.0U2 but not at the lower block sizes.
  • Overall latency for 6.7U3 at the higher block sizes, 64k and 128k, can be slightly better than 7.0U2 but not at the lower block sizes.
  • Read latency for 6.7U3 is higher than 7.0U2.
  • Write latency for 6.7U3 at the higher block sizes, 64k and 128k, can be slightly better than 7.0U2 but not at the lower block sizes.
  • There is more standard deviation for 6.7U3 than 7.0U2 at the lower block sizes, but slightly less at the higher block sizes, 64k and 128k.
  • At the lower block sizes, 6.7U3 uses less CPU on the whole, but at the higher block sizes 7.0U2 uses less CPU.

RAID6 VM Encryption

  • There are higher IOPs for 7.0U2 than 6.7U3 across all tests.
  • There is generally a higher throughput for 7.0U2 than 6.7U3 at the lower block sizes but not at the higher block sizes. However, the difference is minimal.
  • There is lower overall latency for 7.0U2 than 6.7U3 across all tests
  • There is lower read latency for 7.0U2 than 6.7U3 across all tests
  • There is lower write latency for 7.0U2 than 6.7U3 across all tests
  • There is less standard deviation for 7.0U2 than 6.7U3 across all tests
  • There is a higher CPU % usage for 7.0U2 than 6.7U3 across all tests

With newer processors, AES improvements, memory improvements, RDMA NICs and storage controller driver improvements, we may see further performance improvements in new server models.

BIOS and UEFI

What does the BIOS do?

The BIOS (Basic Input/Output System) is the first piece of software which runs and carries out the following tasks.

  1. Performing POST (Power-On Self-Test): in this phase the BIOS checks whether the components installed on the motherboard are functioning.
  2. Basic I/O checks: this checks that peripherals such as the keyboard, the monitor and serial ports can operate to perform basic tasks.
  3. Booting: the BIOS tries to boot from the connected devices (SSDs, HDDs, PXE, whatever) in order to provide an operating system to operate the computer.

It can also be a low level management tool, providing some ability to tweak system features and settings.

What is UEFI?

UEFI stands for Unified Extensible Firmware Interface. UEFI was released in 2007 to provide a successor to BIOS and overcome its limitations. Before this, computers used the BIOS (Basic Input/Output System). Most UEFI firmware implementations provide support for legacy BIOS services.

UEFI Advantages over BIOS

  • 32-bit/64-bit architecture rather than 16-bit
  • CPU independent architecture
  • Ability to use large disk partitions over 2TB. UEFI’s theoretical size limit for bootable drives is more than nine zettabytes, while BIOS can only boot from drives 2TB or smaller.
  • Flexible pre-OS environment, including network capability, GUI, multi language
  • Expanded BIOS with a GUI and mouse ability
  • UEFI Secure Boot feature, which employs digital signatures to verify the integrity of low-level code like boot loaders and operating system files before execution. If validation fails, Secure Boot halts execution of the compromised bits to stop any potential attack in its tracks. Secure Boot was added in version 2.2 of the UEFI specification.
  • UEFI does not use the Master Boot Record (MBR) scheme to store the low-level bits that bootstrap the operating system. Under the MBR, these key bits reside in the first segment of the disk, and any corruption or damage to that area stops the operating system from loading. Instead, UEFI uses the GUID Partition Table (GPT) scheme and stores initialization code in an .efi file found in a hidden partition. GPT also stores redundant copies of this code and uses cyclic redundancy checks to detect changes or corruption of the data.
  • C / C++ language used instead of assembly language
  • Backwards compatibility with MBR hard drives
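The 2TB and nine zettabyte figures in the list above fall out of the size of the sector addresses each scheme uses. As a quick check, assuming traditional 512-byte logical sectors, 32-bit sector addresses for MBR and 64-bit sector addresses for GPT:

```python
SECTOR_BYTES = 512  # assumes traditional 512-byte logical sectors

mbr_limit = 2**32 * SECTOR_BYTES  # MBR: 32-bit sector addresses
gpt_limit = 2**64 * SECTOR_BYTES  # GPT: 64-bit sector addresses

print(f"MBR: {mbr_limit / 2**40:.0f} TiB")
print(f"GPT: {gpt_limit / 10**21:.2f} ZB")
```

Disks with larger logical sectors (for example 4K native drives) would raise both limits accordingly.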

UEFI Specification

This can be found at the link – https://uefi.org/specifications

Considerations

When building Windows 10 or Windows Server 2016 VMs, it is recommended you build them with EFI firmware enabled. Moving from traditional BIOS/MBR to EFI (UEFI) firmware afterwards introduces challenges further down the line and can cause machines not to boot.

UEFI still cannot be used for auto deploying vSphere ESXi hosts but this may change in the future.