Performance – CB-Net : tech snippets and my personal knowledgebase

ESX 4.X : Increased RAM use on Intel EPT Enabled Processors (Mon, 19 Sep 2011)

Why does memory utilisation appear higher on ESX 4.X when using Intel EPT Enabled Processors?

 

There is a simple reason why memory usage on your ESX 4.X boxes looks higher than on 3.5: ESX 3.5 does not support Intel EPT for hardware MMU (HWMMU), which is why you never saw this behaviour on ESX 3.5. Had you been running AMD RVI-capable processors you would have encountered the same thing when upgrading from 3.0 to 3.5. ESX 4 was the first ESX platform to support Intel EPT: http://www.vmware.com/files/pdf/software_hardware_tech_x86_virt.pdf

 

Interpretation of esxtop counters is key here; the counters we’re interested in:

  • GRANT – the amount of physical memory granted to a VM/pool
  • SHRD – the shared portion of GRANT
  • SHRDSVD – estimated saving due to TPS (Transparent Page Sharing)
  • COWH – an indication of memory which can be reclaimed by TPS

 

When using Intel EPT – http://www.vmware.com/pdf/Perf_ESX_Intel-EPT-eval.pdf – (or AMD RVI), and where a VM supports HWMMU, the ESX kernel will use 2MB pages instead of 4KB pages. On these VMs you will see low values for SHRD/SHRDSVD.
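
To put the two page sizes in perspective (simple arithmetic, not from the original post): the same VM is backed by vastly fewer 2MB pages than 4KB pages, so TPS has far fewer identical pages to find and share:

```shell
# Pages backing a 4GB (4096MB) VM at each page size.
vm_mb=4096
echo "2MB pages: $((vm_mb / 2))"          # → 2MB pages: 2048
echo "4KB pages: $((vm_mb * 1024 / 4))"   # → 4KB pages: 1048576
```

With only a couple of thousand large pages per VM, the odds of two pages being byte-for-byte identical drop dramatically, which is why SHRD/SHRDSVD stay low on HWMMU VMs.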

 

HWMMU is supported on all versions of Windows from 2003 onwards; it is also the default setting for VMs running these operating systems on hardware that supports Intel EPT: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1020524

 

 

How can we interpret memory usage on ESX 4.X when using HWMMU?

 

Using esxtop you will find that Linux-based VMs have high values for SHRD/SHRDSVD; this is because they do NOT use Intel EPT and therefore do not use HWMMU. As the MMU is virtualised in software, small pages are used, which play nicely with TPS. Windows 2003+ VMs will use HWMMU and therefore large pages.

 

You can check the HWMMU status of a VM by opening its vmware.log and looking for “HV Settings” – check the value for “virtual mmu”: if ‘software’, the VM is not using HWMMU (Intel EPT); if ‘hardware’, it is.
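
For example, you can pull the value straight out of the log with grep. The log line below is a simulated sketch based on the description above – on a real host, point grep at the VM’s vmware.log in its datastore directory instead:

```shell
# Simulated "HV Settings" line; the exact surrounding text in a real vmware.log may differ.
printf 'vmx| HV Settings: virtual exec = "hardware"; virtual mmu = "hardware"\n' > vmware.log

# "hardware" => the VM is using HWMMU (Intel EPT); "software" => it is not.
grep -o 'virtual mmu = "[a-z]*"' vmware.log   # → virtual mmu = "hardware"
rm -f vmware.log
```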

 

Large Page support can be disabled (thereby forcing TPS to work regardless of contention), but there is a CPU performance impact (not massive, to be fair, but mileage may vary). At the end of the day TPS will kick in and claim back memory (significant amounts – http://vmnomad.blogspot.com/2011/08/shared-memory-effectiveness-of.html) at 94% host memory usage.

  

I hope this makes sense and sheds some light on why the ESX 4 boxes appear to be using more RAM.

ESXi : Troubleshooting Performance (Fri, 03 Jun 2011)
http://communities.vmware.com/docs/DOC-14905

vSphere : P2V Windows 2000 and vApp Issues (Wed, 02 Mar 2011)

To perform a Windows 2000 P2V you cannot use the most recent version of the VMware Converter; the last version to support Windows 2000 is 4.0.1 build-161434.

A big caveat with this version is that it does not support vApps as defined in vCenter. If you try to import into vCenter, the Converter will crash and the following event will be logged in the event log:

The VMware vCenter Converter Server service terminated unexpectedly.  It has done this 1 time(s).

On loading the converter application again you will receive an error:

VMware vCenter Converter Server is installed but not running. When Converter Server is not running you will not be able to connect to local server.
Do you want to start it now? 

The workaround for this issue is to connect directly to the/an ESX host instead of the vCenter.

The physical machine must be able to connect to the ESX host on ports 443 and 903 in order for the conversion to work.

HP BL465c G7 : ESXi 4.1 Issues (Mon, 25 Oct 2010)

Having just built a new ESXi 4.1 cluster comprising BL465c G7s, we’ve run into some interesting issues I thought I would share.

Six BL465c G7 servers have been installed in two new C7000 G2 enclosures, each with 2x Flex-10 Virtual Connect modules and 2x 20-port 8Gb Fibre Channel Virtual Connect modules. Servers are split three per chassis.

In brief, the issues are as follows:

  1. PSOD on reboot and random crashes with PSOD
  2. Corrected Memory Error threshold exceeded
  3. Hang on reboot
  4. ILO Duplicate SSL Certificate Serial Number

 

 Hardware 

  BL465c G7
  2x AMD Opteron 6172 12-core CPU
  96GB RAM (12x 8GB 1333Mhz)
  2 x 60GB SSD
  ESXi 4.1 (HP Part Number 583772-007)

Firmware Versions (earlier → current, as updated over the course of troubleshooting)

All BL465c G7 ESXi blades have:
•    BIOS vA19 (Jul 23 2010 → Sept 30 2010)
•    ILO3 v1.10 → v1.15 (Oct 22 2010)
•    CNA v2.102.453.0 → v2.102.517.6 → v2.102.517.7
•    P410i v3.50

Chassis firmware:
•    Onboard Administrator 3.20 → 3.21
•    Flex10 Modules 3.10 → 3.15
•    FC Modules 1.40 → 1.41

CNA Driver Version

The following versions have been tried (in order):

•    2.102.404 → 2.102.440 → 2.102.486 → 2.102.518.0 → 2.102.518.0 patch1 (unsigned) → currently 2.102.554.0

ESXi Versions

HP OEM build, as the VMware vanilla build fails to load the installer. Build ID is 260247.

Issues

#1 – PSOD on reboot and random crashes with PSOD – RESOLVED

The PSOD error was always a #PF Exception 14 mentioning benet_vlan_rem_vid, from which you can easily ascertain it is an issue with the CNA driver. To confirm the driver version you are currently using, enable either the local support console or SSH and enter the following command: vsish -e get /net/pNics/vmnic0/properties | grep "Driver Version"

The built-in version in the HP OEM build of ESXi 4.1 is 2.102.404. The 2.102.440 version available on the VMware site simply made the PSOD issue worse.

A new Emulex CNA (onboard Converged Network Adapter) driver is available directly from VMware which resolves the PSOD issue. At the time of writing this is version 2.102.554.0 (earlier fixes were 2.102.486 and 2.102.518.0; the interim patch was unsigned, the current release is signed). Install it using the following process:

  1. Download and install the VMware Remote CLI: http://www.vmware.com/support/developer/vcli/
  2. Obtain the new CNA driver from HP support.
  3. Launch the CLI from the start menu
  4. Execute the following command (modifying the path to the CNA driver and server name)
    • vihostupdate.pl --server server01 --install --bundle "C:\offline-bundle.zip" --bulletin SVE-be2net-2.102.486.0 --nosigcheck
    • vihostupdate.pl --server server01 --install --bundle "C:\offline-bundle.zip" --bulletin SVE-be2net-2.102.518.0
    • vihostupdate.pl --server server01 --install --bundle "C:\SVE-be2net-2.102.554.0-offline_bundle-347594.zip" --bulletin SVE-be2net-2.102.554.0

Check the version that you are running using the following command:

  • vsish -e get /net/pNics/vmnic0/properties | grep "Driver Version"
  • vsish -e get /net/pNics/vmnic0/properties | grep "Driver Firmware Version"

#2 – Corrected Memory Error threshold exceeded (Processor 2, Memory Module 2) – RESOLVED

There are also multiple alarm entries within vCenter for ‘Host memory status’.

We went through changing DIMMs, the system board and a CPU as part of the troubleshooting process. None of these seemed to help the problem. I then came across the following community post that got me thinking: http://communities.vmware.com/thread/221222

I implemented the steps below, and HP has since confirmed this as a workaround for the issue, which is caused by the 8GB DIMMs in combination with the default power management options. Specifically, this affects 8GB DDR3 DIMMs only.

  1. Reboot the server and enter the Rom Based Setup Utility (RBSU)
  2. Select Power Management Options
  3. Select HP Power Profile
  4. Select Maximum Performance
  5. Verify that HP Power regulator is now set to HP Static High Performance Mode

I have also been informed that the following will resolve this issue (thanks to Mads Kirkegaard):

  1. Select Power Management Options
  2. Select Advanced Power Management Options
  3. Select Minimum Processor Idle Power State
  4. Select No C-states (Default is C1E State (AMD C1 Clock Ramping))

If this resolves your issue, contact your HP support specialist for further details, and ensure you advise them of the workaround used. To be clear, this is a hardware issue; the above is a workaround that disables the trigger for it. Note that the above is also best practice for an ESX/ESXi host.

#3 – Hang on reboot – RESOLVED

The shutdown/reboot process for the BL465c G7 blades always fails, with the server hanging. Without the new CNA driver (version 2.102.486) you will get a PSOD (#PF Exception 14) approximately 75% of the time; with it installed you will simply get a hang/crash.

On further analysis, looking at the console output from the ILO using tech support mode, you can see the reboot process gets stuck at ‘Requesting system reboot’. Comparing this to a BL460c G6 blade, this is the last output before that server resets.

We have tested disabling USB support and the serial port in the BIOS; this made no difference. Setting the HP Power Profile to OS Control Mode did not resolve this either.

Use the following commands to check driver/firmware versions:

  • vsish -e get /net/pNics/vmnic0/properties | grep "Driver Version"
  • vsish -e get /net/pNics/vmnic0/properties | grep "Driver Firmware Version"

Update 23/11/2010: I have identified that when the Virtual Connect profile is not attached to the server it will reboot/shutdown without issue.  Whilst in this state the server is useless, it will hopefully help identify the root cause of the problem.

Update 07/12/2010: New firmware and BIOS out. CNA firmware version 2.102.517.6 and ESXi driver version 2.102.518.0. Issue still remains. One thing I have found is that if the following commands are executed in order the server will shutdown:

  1. /sbin/services.sh stop
  2. /sbin/esxcfg-module -u -f be2net
  3. reboot

I have had reported cases from New Zealand and Turkey so this is not a unique issue.

Update 23/12/2010: New firmware out (2.102.517.7) – does not resolve the issue. Also tested with 32GB RAM instead of 96GB as requested by VMware and HP – no difference. Had an interesting discussion with VMware/HP today which points at the Emulex CNA driver being the cause – an issue with interrupts not being disabled on shutdown. A debug driver has been tested on full ESX, and when the driver is instructed to disable interrupts on shutdown the server reboots. This may also point to more issues with 12-core CPUs than 8-core, simply due to the increased number of interrupts.

We’re hoping to get a pre-release version in the next few days, designed specifically to deal with this issue.

Update 24/12/2010: Tests of pre-release driver ‘518.0.elx.patch1-1’ have proven successful. This driver should be available from your support partner for testing. I’ll update this post when the official driver has been released.

Update 21/01/2011: The most recent update I have is that the official release is still on schedule for the first week of February. Note that running the unsigned driver is not a supported solution. I’ve also received similar reports from Australia and Germany.

Update 07/02/2011: The new driver, version 2.102.554.0, has been released and is publicly available from the following location: http://downloads.vmware.com/d/details/esxi4x_emulex_blade_10gb_dt/ZHcqYnRkdCVidGR3

To install;

  1. Download the driver and extract the SVE-be2net-2.102.554.0-offline_bundle-347594.zip file from the ‘offline-bundle’ folder. Copy this to C:\.
  2. Place the host you wish to update in maintenance mode.
  3. Confirm the bulletin ID by listing the available packages (change the server name as appropriate):
    • vihostupdate.pl --server server01 --list --bundle "C:\SVE-be2net-2.102.554.0-offline_bundle-347594.zip"
  4. Use the rCLI to install the driver (change the server name as appropriate):
    • vihostupdate.pl --server server01 --install --bundle "C:\SVE-be2net-2.102.554.0-offline_bundle-347594.zip" --bulletin SVE-be2net-2.102.554.0
  5. Reboot the server – it will still hang, as the new driver is not in use at this stage.
  6. Perform a test reboot – it should work, as the driver is now in use!

Final fix now available (superseding the earlier pre-release fix).

#4 – ILO Duplicate SSL Certificate Serial Number – IN PROGRESS

No fix as of yet, we’re troubleshooting as you read this.

#5 – Network Connectivity Failure – IN PROGRESS

We had an issue last week where the management/vMotion NICs decided that network connectivity was no longer required. This caused VMs to fail over via VMware HA and therefore interrupted live applications. We’re testing the new driver released last week (05/02/2011) to see if it resolves this issue.

No fix as of yet, we’re troubleshooting as you read this. 

VMware : Increasing the HBA / Device Queue Depth (Wed, 24 Feb 2010)

ESX 3.5 ships with a default HBA/LUN queue depth of 32. For QLogic HBAs a setting of 64 may improve storage performance. You can identify a storage IO bottleneck using esxtop from the ESX command line: view LUN queue statistics by pressing ‘u’ and monitor the QUED, ACTV and LOAD stats. If LOAD is constantly above 1.0, and therefore QUED is greater than 0, increasing the queue depth above 32 may improve performance. As always, apply the ‘if it’s not broke, don’t try to fix it’ philosophy!

There are two settings that must be changed to increase the queue depth.

The steps below apply to QLogic HBAs only.

Adapter Queue Depth

Find the current module name:
   esxcfg-module -l | grep -i ql

Check the current module queue depth setting:
   cat /proc/scsi/qla2300/? | grep -i "queue depth"

This will return the value: Device queue depth = 0x20 (0x20 is hex for 32 in decimal)
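
printf can do the hex-to-decimal conversion for you:

```shell
# 0x20 hex is the default queue depth; 0x40 is the increased depth.
printf '%d\n' 0x20   # → 32
printf '%d\n' 0x40   # → 64
```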

Change the queue depth using the following command, note there is no output from the command:
   esxcfg-module -s ql2xmaxqdepth=64 qla2300_707

Verify the change has been written to the esx.conf file:
   cat /etc/vmware/esx.conf

Reboot the ESX server, then check the module configuration:
   cat /proc/scsi/qla2300/? | grep -i "queue depth"

This should return the value: Device queue depth = 0x40
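
If you would rather not read hex at all, the reported line can be parsed and converted in the shell. The assignment below simulates the proc output; on a real host, replace it with the cat/grep command above:

```shell
# On a real host:  line=$(cat /proc/scsi/qla2300/? | grep -i "queue depth")
line='Device queue depth = 0x40'
depth_hex=${line##*= }                      # strip everything up to the last "= "
printf 'Queue depth is %d\n' "$depth_hex"   # → Queue depth is 64
```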

Now, if you stop here you’ll find that the DQLEN will change from 32 to 64 and back again when viewing the LUN statistics in esxtop. It will keep changing randomly unless you perform the step below.

Disk.SchedNumReqOutstanding

Now we must increase the Disk.SchedNumReqOutstanding to 64, otherwise the setting above will have no effect:
   esxcfg-advcfg -s 64 /Disk/SchedNumReqOutstanding

This setting does NOT require a restart of the ESX environment.

Additional Considerations

Consider disabling Device Resets (which can cause backup interruptions and flooding of NSRs) and limiting resets to an individual LUN, so that a reset does not affect the entire SAN. ESX 3.X by default enables device resets and disables LUN resets. The following commands can be executed from the ESX console:

   esxcfg-advcfg -s 1 /Disk/UseLunReset
   esxcfg-advcfg -s 0 /Disk/UseDeviceReset

VMware : Capturing Performance Statistics (Mon, 08 Feb 2010)

The following process will allow you to capture Windows Performance Monitor-compatible CSV files from any ESX server using the ‘esxtop’ utility, which is an integral part of VMware ESX.

First we must create a couple of script files, the first being ‘ftp.sh’. I have created the scripts on a datastore which houses NO virtual servers – be careful where you place this data, as filling up a datastore with VMs will stop those VMs working. You will need to modify the paths (shown below as /vmfs/volumes/LOCAL_ATTACHED/esxtop – the script location and the location where the CSV files are generated) to suit your environment.

This script will generate the CSV file and ‘trim’ it down to the stats we require – by default esxtop generates an insanely large CSV file. Once trimmed, it uploads the CSV file to an FTP server of your choice and finally gzips/archives the file for future reference.

#!/bin/bash
# Every 24 hours, FTP yesterday's stats
#
echo $(date +%R)
# Trim the CSV file down to the counters we need
#
dm1=$(date --date='1 day ago' +%Y-%m-%d)
cat /vmfs/volumes/LOCAL_ATTACHED/esxtop/${HOSTNAME}_$dm1.csv | cut -d "," -f 1,`head -1 /vmfs/volumes/LOCAL_ATTACHED/esxtop/${HOSTNAME}_$dm1.csv | tr "," "\12" | egrep -n "\\\\\Memory\\\\\Free MBytes|Physical Disk\(vmhba1\)\\\\\Reads/sec|Physical Disk\(vmhba1\)\\\\\Writes/sec|Physical Disk\(vmhba1\)\\\\\MBytes Written|Physical Disk\(vmhba1\)\\\\\MBytes Read|\\\\\Physical Disk\(vmhba2\)\\\\\Reads/sec|Physical Disk\(vmhba2\)\\\\\Writes/sec|Physical Disk\(vmhba2\)\\\\\MBytes Written|Physical Disk\(vmhba2\)\\\\\MBytes Read|Physical Disk\(vmhba2\)\\\\\Commands/sec|Physical Disk\(vmhba1\)\\\\\Commands/sec|Physical Cpu\(_Total\)" | cut -d ":" -f 1 | tr "\12" ","` > /vmfs/volumes/LOCAL_ATTACHED/esxtop/trim_${HOSTNAME}_$dm1.csv

# Drop the second row and tidy up intermediate files
sed -i".bak" "2d" /vmfs/volumes/LOCAL_ATTACHED/esxtop/trim_${HOSTNAME}_$dm1.csv
rm /vmfs/volumes/LOCAL_ATTACHED/esxtop/trim_${HOSTNAME}_$dm1.csv.bak -f
rm /vmfs/volumes/LOCAL_ATTACHED/esxtop/${HOSTNAME}_$dm1.csv -f
mv /vmfs/volumes/LOCAL_ATTACHED/esxtop/trim_${HOSTNAME}_$dm1.csv /vmfs/volumes/LOCAL_ATTACHED/esxtop/${HOSTNAME}_$dm1.csv

HOST='ftp.domain.local'
USER='username'
PASS='password'

# Connect to the FTP server and upload the trimmed CSV
ftp -inv $HOST << EOF
user $USER $PASS
lcd /vmfs/volumes/LOCAL_ATTACHED/esxtop
put ${HOSTNAME}_$dm1.csv
bye
EOF

# GZIP and archive the stats
#
gzip /vmfs/volumes/LOCAL_ATTACHED/esxtop/${HOSTNAME}_$dm1.csv
mv /vmfs/volumes/LOCAL_ATTACHED/esxtop/${HOSTNAME}_$dm1.csv.gz /vmfs/volumes/LOCAL_ATTACHED/esxtop/archive/
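
The long cut/egrep pipeline in ftp.sh is doing something quite simple: number the header fields, keep the field numbers whose headers match the counters we want, then cut those columns. The same technique on a toy CSV with hypothetical column names:

```shell
# Keep column 1 (timestamp) plus any column whose header matches "Cpu".
printf 'time,CpuA,MemB,CpuC\n1,10,20,30\n' > stats.csv
cols=$(head -1 stats.csv | tr ',' '\n' | egrep -n 'Cpu' | cut -d: -f1 | tr '\n' ',')
cut -d, -f 1,"${cols%,}" stats.csv   # prints: time,CpuA,CpuC then 1,10,30
rm -f stats.csv
```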


Secondly, create the ‘capturestats.sh’ script, which will launch esxtop and capture the statistics you require. Again, modify the paths to suit your environment. This script captures stats every 60 seconds, 1439 times – there are 1440 minutes in a day, and we want the script to start again at midnight, so this script will run from 00:00 to 23:59.

#!/bin/bash
# capturestats.sh
#
today=$(date +%Y-%m-%d)
#
# There are 1440 minutes in a day; we want to capture 00:00 > 23:59, so we specify 1439 captures at 60-second intervals.
#
esxtop -d 60 -n 1439 -c /root/.esxhoststats >> /vmfs/volumes/LOCAL_ATTACHED/esxtop/${HOSTNAME}_$today.csv

Next, create the esxtop config file at /root/.esxhoststats. This will ensure we capture only what we need: CPU stats, memory usage and disk I/O stats. You can modify your own config file to meet your requirements.

abcdefgh
abcdefghijklmno
AbcdefghIjklm
abcdefghijk
abcdefghijk
ABcDEFGHIJKLm
5u

Finally, under the root account context (accessed via sudo su -), execute the ‘crontab -e’ command and add the following lines to the file:

00 00 * * * /vmfs/volumes/LOCAL_ATTACHED/esxtop/capturestats.sh >/dev/null
00 01 * * * /vmfs/volumes/LOCAL_ATTACHED/esxtop/ftp.sh >/dev/null

This will cause the capturestats.sh script to run at midnight every day and the ftp.sh script to run at 01:00 every day.

VMware : Troubleshooting VM Performance in ESX/vSphere (Sun, 25 Oct 2009)

Firstly, we’re going to be using esxtop, which can be executed from the local console or an SSH session on ESXi, as well as via vCenter.

General guidelines:

  1. The pCPU : vCPU ratio should be around 1:5.
  2. Avoid memory oversubscription in production environments.
  3. Reserve memory for virtualised SQL servers using Lock Pages in Memory.
  4. Reserve memory for virtualised RDS/Citrix servers.
  5. Always create datastores using the vSphere client; this ensures the VMFS partition is aligned, which will reduce IO and potentially increase performance.
  6. Align guest data disks (for Windows, any version earlier than 2008 must be aligned manually) – again, this will reduce IO and potentially increase performance.
  7. Keep the number of VMs per datastore/LUN to around 10-15; this helps reduce SCSI reservation contention.
  8. vCPUs: less is more. From my own testing I’ve found that Citrix servers perform better with 2 vCPUs than with 4. This is not only better for my users but also for the ESXi hosts, as there is less co-scheduling overhead.

Let’s begin…

1) Investigate CPU contention/exhaustion using esxtop (press ‘c’ from esxtop, and shift-v for per-VM stats only):

  1. Check host PCPU usage using esxtop.
  2. Look at %RDY; if this is equal to or greater than 10% there is a performance issue – this can indicate CPU contention.
  3. Look at %MLMTD; if this is high, it indicates a CPU limit is being imposed on the VM. %RDY – %MLMTD gives a true indication of CPU contention.
  4. If %RDY is truly above 10%, the first step is to lower the number of active vCPUs configured on the ESX/vSphere server; next, look at reducing the number of VMs on the server.
  5. Investigate co-scheduling/SMP-related issues – are VMs using all presented vCPUs? From esxtop press ‘c’ then ‘e’, then look at %CSTP. High values can indicate an issue, as this represents the overhead of co-scheduling vCPUs from a co-stopped to a co-started state.

For example, if you have 16 cores, the maximum number of vCPUs defined across all active VMs should not exceed 80.
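
The arithmetic behind that figure is simply cores multiplied by the ratio:

```shell
# 16 physical cores at the 1:5 pCPU:vCPU guideline.
cores=16; ratio=5
echo "vCPU budget: $((cores * ratio))"   # → vCPU budget: 80
```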

2) Investigate memory usage via esxtop and vCenter (press ‘m’ from esxtop, and shift-v for per-VM stats only):

  1. Check host memory utilisation using esxtop.
  2. For full-fat ESX only – the service console may be low on RAM; you can adjust this by following these instructions: http://www.vmware.com/pdf/esx_performance_tips_tricks.pdf
  3. Watch out for memory ballooning; this can have a significant impact on VM performance. You can track ballooning in vCenter and esxtop: MCTLTGT is the VMkernel’s desired memory balloon size, MCTLSZ is the actual size. If the target is greater than the size, the balloon is inflated; if it is smaller, the balloon is deflated. VM memory limits can also trigger ballooning.
  4. Transparent Page Sharing (TPS) allows the host to share identical memory pages between VMs – chiefly of benefit when memory resources are low/overcommitted.
  5. Check esxtop for SWCUR (currently used swap) and SWTGT (the swap target). If SWTGT is greater than SWCUR, more swapping will take place. Swapping is slow, so it should be avoided at all costs. If swapping is unavoidable, use SSDs: there’s a -12% degradation with local SSD versus -69% for Fibre Channel and -83% for local SATA storage. (more information here)
  6. SWPWT represents the amount of time a virtual machine is waiting for memory to be swapped in, and should always be below 5%.
  7. SWR/s and SWW/s represent swap reads/writes from disk to memory and vice versa.

3) Using esxtop, investigate storage (press ‘u’ for per-datastore or ‘d’ for per-HBA stats):

  1. Investigate DAVG – the round-trip time between the HBA and storage; ideally this should be less than 30ms.
  2. Investigate KAVG – the latency added by the VMkernel.
  3. Investigate GAVG – the round-trip time for IO requests sent from the guest to storage (DAVG + KAVG); again, lower is better, ideally less than 30ms.
  4. Check CONS/s – this indicates SCSI reservation conflicts generated by metadata updates on the same LUN at a given time.
  5. vscsiStats (more info here) will report per-VMDK/RDM statistics.

4) Finally, consider the network subsystem:

  1. Check bandwidth availability.
  2. Using esxtop, check %DRPTX and %DRPRX; if the latter is high, consider increasing the Rx buffer on the VM (via Device Manager on Windows; on Linux via the NIC driver’s settings).

If all else fails, check advisories for your hardware platform. I’ve run into issues in the past that have been device-firmware specific, so don’t rule out the simplest of things.

UPDATE 22/02/2010 : Check out the new esxtop article here for further performance troubleshooting tips.
