Updated: Nov 6
In this edition of Random Bits Of Geek News (RBOGN), Volume One, Release One, I am looking at a wide spectrum of issues across all things computing, literally anywhere.
Anywhere meaning on premises or in the cloud.
I want to start with my favorite niche topic: CPU and GPU combinations, and my deep bias for EPYC CPUs paired with either NVIDIA or AMD GPUs.
This time the focus is specifically on VMware's ESXi 8.0 on Nutanix HCI platforms.
For years now, many companies have been using NVIDIA T4 cards, or the M10 and M60 variants, for VDI solutions, mainly with Citrix VDI, although some customers also use VMware Horizon.
Here I am focusing on the mainstream Citrix customer base.
For those of you who do not know, VMware ESXi 8.0 is a strange beast that offers two modes of operation.
For ease of understanding I will over-simplify and refer to them as Legacy 7 mode and the new Enhanced 8.0 mode, as they relate to vSAN and other pieces of the vSphere 8.0 equation.
VMware did this because running native ESXi 8.0 mode with all the features turned on and pumping requires an insane level of hardware capability in the ESXi 8.0 hosts.
If you have older servers with SSD and spinning disk in them, you will find you can only run your ESXi 8.0 software in 7 mode, because the native 8.0 mode only supports the latest CPU types and only SSD and Optane storage devices, with a preference for NVMe SSD.
VMware stopped selling ESXi 7 and so many of these customers had to buy VMware version 8 and run it in degraded 7 mode due to their existing hardware limitations.
vSphere 8 is only supported on the latest CPUs from AMD and Intel, but you can also get it to run on your legacy hardware by editing boot.cfg and appending allowLegacyCPU=true to the kernelopt line.
However, ESXi 8.0 Update 2 now needs CPUs that support the XSAVE instruction, although you can still install 8.0 on unsupported CPUs to have a sniff and tinker.
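To make the boot.cfg edit concrete, here is a minimal Python sketch that appends the option to the kernelopt line of a boot.cfg held as text. The function name and the sample file contents are my own illustration and not taken from a real host, so treat it as a sketch rather than a supported procedure.

```python
def add_legacy_cpu_option(boot_cfg_text, option="allowLegacyCPU=true"):
    """Return boot.cfg text with `option` appended to the kernelopt= line."""
    out = []
    found = False
    for line in boot_cfg_text.splitlines():
        if line.startswith("kernelopt="):
            found = True
            existing = line.split("=", 1)[1].split()
            if option not in existing:
                # Append once; calling the function again changes nothing.
                existing.append(option)
            line = "kernelopt=" + " ".join(existing)
        out.append(line)
    if not found:
        raise ValueError("no kernelopt= line found in boot.cfg text")
    return "\n".join(out) + "\n"


# Illustrative boot.cfg fragment (hypothetical contents).
sample_boot_cfg = (
    "bootstate=0\n"
    "title=Loading ESXi installer\n"
    "kernelopt=runweasel cdromBoot\n"
)
patched_boot_cfg = add_legacy_cpu_option(sample_boot_cfg)
print(patched_boot_cfg)
```

Working on a copy of the text first makes it easy to eyeball the result before touching the real file.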
The Kickstart example below uses two of the bypass options.
--ignoreprereqwarnings - Will ignore warning messages
--ignoreprereqerrors - Will ignore error messages
--forceunsupportedinstall - Will ignore error/warning messages for deprecated CPUs
On 8.0 you can bypass all of these warning messages with the above options in your Kickstart script.
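As a rough illustration, the Python sketch below assembles a small Kickstart script with two of the bypass options on the install line. Which two options to use, the disk flags, the DHCP network line and the placeholder password are all my assumptions for the sake of the example, so adjust them to your own environment.

```python
# Bypass options listed above, keyed by a short hypothetical label.
BYPASS_FLAGS = {
    "warnings": "--ignoreprereqwarnings",
    "errors": "--ignoreprereqerrors",
    "deprecated_cpu": "--forceunsupportedinstall",
}


def kickstart_install_line(bypass=("warnings", "errors")):
    """Build the Kickstart install line with the chosen bypass options."""
    parts = ["install", "--firstdisk", "--overwritevmfs"]
    parts += [BYPASS_FLAGS[name] for name in bypass]
    return " ".join(parts)


def render_kickstart(root_password, bypass=("warnings", "errors")):
    """Return a minimal Kickstart script as text (illustrative only)."""
    lines = [
        "vmaccepteula",
        kickstart_install_line(bypass),
        f"rootpw {root_password}",
        "network --bootproto=dhcp",
        "reboot",
    ]
    return "\n".join(lines) + "\n"


print(render_kickstart("ChangeMe123!"))  # placeholder password
```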
One of the important settings to consider when creating a new Virtual Machine in vSphere is the VM firmware, which can either be BIOS or EFI and can be configured under VM Options->Boot Options->Firmware.
After you select the desired guest operating system (GOS), vSphere defaults to a recommended firmware type, which the user can override.
Ultimately, the selection of the VM firmware should be determined by what your GOS supports.
If you ever need to change the VM firmware, you will typically need to re-install the GOS, because it does not understand the firmware change (just like on a physical server). More than likely the GOS will not boot after the change either, and this is existing behavior from the GOS point of view.
For a net new VM prior to vSphere 8, if you had configured the VM with EFI firmware, had not yet installed a GOS, and then realized you had made a mistake and needed BIOS instead, you could easily change the firmware using the vSphere UI or API and then install your OS.
In vSphere 8 and specifically when using the latest Virtual Machine Compatibility (vHW20), you cannot just switch the VM firmware after the initial VM creation, especially if you had started with EFI firmware and wish to change it to BIOS.
In doing so, you will come across the following error message: ACPI motherboard layout requires EFI. Failed to start the virtual machine. Module DevicePowerOnEarly power on failed.
Ref: docs.vmware.com, "Activate or Deactivate UEFI Secure Boot for a Virtual Machine". UEFI Secure Boot is a security standard that helps ensure that your PC boots using only software that is trusted by the PC manufacturer.
For certain virtual machine hardware versions and operating systems, you can activate secure boot just as you can for a physical machine.
To support the new Virtual NUMA Topology in vSphere 8, a new VM motherboard layout has been introduced for VMs using the new Virtual Machine Compatibility 8.0 (vHW20) configured with EFI firmware.
This new motherboard layout setting is available as a new vSphere API property called motherboardLayout and defaults to the existing i440bxHostBridge value for VMs configured with BIOS firmware and acpiHostBridges for VMs configured with EFI firmware.
The important consideration when creating a new Virtual Machine Compatibility 8.0 (vHW20) VM is that the motherboard layout is only set once during the initial VM creation.
This means that if you had selected EFI as the firmware and then decided to change it to BIOS for whatever reason, the motherboard layout setting will still be configured for the original firmware setting.
The reason for this behavior is that a VM cannot be converted from the new motherboard layout to the old one or vice versa, especially because the new motherboard has device slots that do not exist on the old motherboard. The only proper way to switch is to re-create the VM with the desired VM firmware.
Going back to the error message shown earlier, we can now see why it is displayed: the motherboard layout is still configured as acpiHostBridges from the original VM firmware configured with EFI, but the current VM firmware has been switched to BIOS, and this is not supported with the new motherboard layout.
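The rule described above can be written down as a tiny Python check. This only models what this post states (the default layout per firmware, and BIOS firmware on the ACPI layout failing at power on); it is not the real vSphere validation logic, and the function names are hypothetical.

```python
# Default motherboardLayout per firmware for vHW20 VMs, as described above.
DEFAULT_LAYOUT = {
    "bios": "i440bxHostBridge",
    "efi": "acpiHostBridges",
}


def default_layout(firmware):
    """Layout that is set once, at creation time, for the given firmware."""
    return DEFAULT_LAYOUT[firmware.lower()]


def can_power_on(firmware, motherboard_layout):
    """False only for the failing combination: BIOS on the ACPI layout."""
    return not (
        firmware.lower() == "bios" and motherboard_layout == "acpiHostBridges"
    )


# VM created with EFI, then switched to BIOS: the layout stays as created.
layout_at_creation = default_layout("efi")
print(can_power_on("bios", layout_at_creation))
```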
So, how do we go about remediating this issue?
As mentioned earlier, if this is a brand new VM that you have not installed a GOS on, then you simply need to re-create the VM with the desired VM firmware and the issue goes away.
If you really do not wish to re-create the VM, you do have a couple of options.
Using the vSphere API, you can update the motherboard layout to go back to the default i440bxHostBridge, assuming you are using the BIOS firmware.
The following PowerCLI snippet can be applied to the desired VM and it will automatically update the motherboard layout property based on the configured VM firmware.
If you do not want to use the vSphere API, or if you are using the Free ESXi License, which does not give you write access to the vSphere API, then you will need to update the VMX file manually. We typically do not recommend this, but it is your only option if you do not have API access.
Step 1 - SSH to ESXi host and change into the datastore where the VM is stored
Step 2 - Ensure the VM is powered off, then edit the VMX file and search for the string chipset.motherboardLayout, which should have the value "acpi". Update the value to "i440bx".
Note: When creating a VM using BIOS firmware, this setting is not added directly into the VMX file because the default behavior is to use i440bx, however when you use EFI firmware, this entry is added to the VMX file.
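If you would rather script the Step 2 edit than do it by hand, the Python sketch below rewrites the chipset.motherboardLayout value in VMX text. The function name and the sample VMX lines are illustrative assumptions; it works on a string, so you can try it on a copy of the file before touching the real one.

```python
import re

# Matches a line such as: chipset.motherboardLayout = "acpi"
_LAYOUT_RE = re.compile(
    r'^(\s*chipset\.motherboardLayout\s*=\s*)"[^"]*"\s*$', re.IGNORECASE
)


def set_vmx_motherboard_layout(vmx_text, value="i440bx"):
    """Return (new_text, changed) with the layout entry set to `value`."""
    lines = vmx_text.splitlines()
    changed = False
    for i, line in enumerate(lines):
        match = _LAYOUT_RE.match(line)
        if match:
            lines[i] = f'{match.group(1)}"{value}"'
            changed = True
    return "\n".join(lines) + "\n", changed


# Illustrative VMX fragment (hypothetical contents).
sample_vmx = 'firmware = "bios"\nchipset.motherboardLayout = "acpi"\n'
new_vmx, was_changed = set_vmx_motherboard_layout(sample_vmx)
print(new_vmx)
```

If the entry is absent, as the note above says it is for VMs created with BIOS firmware, the text comes back unchanged and the second return value is False.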
Step 3 - For the change to take effect, we need to reload the VM. To do so, first look up the VM ID (also known as the VM Managed Object Reference ID, or MoRef for short), then pass that ID to the reload command.
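On an ESXi shell, the lookup and reload in Step 3 are commonly done with vim-cmd vmsvc/getallvms and vim-cmd vmsvc/reload followed by the ID; I am assuming those are the commands meant here. The Python sketch below parses a listing for the ID and builds the reload command. The sample listing is made up to show the general column layout, and the simple matching will not cope with VM folder names that contain spaces.

```python
def find_vmid(getallvms_output, vmx_name):
    """Return the Vmid of the row whose VMX path ends in vmx_name, else None."""
    for line in getallvms_output.splitlines():
        fields = line.split()
        # Data rows start with a numeric Vmid; the header row does not.
        if fields and fields[0].isdigit():
            if any(field.endswith("/" + vmx_name) for field in fields):
                return int(fields[0])
    return None


def reload_command(vmid):
    """Argument list for reloading the VM with the given ID."""
    return ["vim-cmd", "vmsvc/reload", str(vmid)]


# Made-up listing for illustration only.
sample_listing = (
    "Vmid   Name      File                               Guest OS       Version\n"
    "12     test-vm   [datastore1] test-vm/test-vm.vmx   otherGuest64   vmx-20\n"
    "13     web01     [datastore1] web01/web01.vmx       otherGuest64   vmx-20\n"
)
vmid = find_vmid(sample_listing, "test-vm.vmx")
print(" ".join(reload_command(vmid)))
```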
If you have existing VMs that have GOS installed and you wish to change the firmware from BIOS to EFI, you should consider re-creating the VM with EFI firmware so you get the full capabilities, especially if you plan to take advantage of the new vNUMA Topology features.
If you do not intend to use any of the new vNUMA features, then you can update the firmware and simply reinstall the GOS and the motherboard layout will continue to use the default i440bx.
vSphere version 8 has a substantial number of new features in many areas from multi-cloud, containers, core vSphere, AI/ML, and of course Data Processing Units (DPUs).
This new DPU feature in vSphere 8 is called vSphere Distributed Services Engine, and there is no need to install any additional appliances; in fact, it is super easy to get started.
The offload capability is enabled within Distributed Switches, which brings the intelligence from ESXi, vCenter and NSX into the DPUs for enhanced performance, better workload consolidation, simplified management, and stronger security.
vSphere on DPUs not only lets you use new hardware technology and offload services to such devices but, more importantly, introduces a new data path that is only available with this feature.
With vSphere Distributed Services Engine, customers can leverage UPTv2 and/or MUX mode to achieve better performance and to reduce both network hops and CPU usage on the x86 servers.
UPTv2 delivers passthrough to the VMs using VMXNET3 drivers rather than relying on vendor-specific drivers and, most importantly, it still keeps those awesome vSphere features such as HA and DRS available, unlike SR-IOV, where those features are not available.
We are going to see a lot of DPU based offloading technology refinements over the next few years to make the application workloads deliver sublime results from an ease of compute perspective.
OK, enough of that gubbins!
With regard to customers wanting to use AMD EPYC CPUs with NVIDIA GPUs for their Nutanix clusters using EUC Ultimate VDI OS on ESXi 8, be aware that this is bleeding edge right now.
Nutanix AMD NX boxes were launched on November 1 2023 with the NX-8155A-G9 and they will support various GPU in March of 2024 or sooner.
For GPU use on HPE DX you can only select the 24 LFF DX-385 from the current HPE AMD lineup for Nutanix.
Not only that, but for VDI use with GPU, the ESXi 8.0 drivers currently available mean you can only select up to 5 x A2 16GB cards in every DX-385 node.
You will also need special case help from the Nutanix HPE team to get these configured, because the normal SE/SA at Nutanix cannot configure these builds yet.
You CANNOT use the A100, A40, A16 or A30 GPU cards at this time for ESXi 8.0 onwards.
Also, for AHV use, the reverse applies: you can NOT use these A2 GPUs in the DX-385 platform!
I am getting my ducks in a row for a blog on the new Nutanix NX-8155A-G9 platform when it supports the new L40 GPU cards, so stay tuned!
I will add to this blog as more info comes to light as I go but I ran out of time and energy on this one, so it is what it is for right now!
Happy Geeking out!