ESX 4.X : Increased RAM use on Intel EPT Enabled Processors

 

Why does memory utilisation appear higher on ESX 4.X when using Intel EPT Enabled Processors?

 

There is an simple reason why memory usage on your ESX 4.X boxes looks higher than on 3.5; ESX3.5 doesn’t support Intel EPT for HW MMU this is why you’ve not seen this in ESX 3.5. Had you been running AMD RVI capable processors you’d have encountered this with an upgrade from 3.0 to  3.5; ESX 4 was the first ESX platform to support Intel EPT; http://www.vmware.com/files/pdf/software_hardware_tech_x86_virt.pdf

 

Interpretation of esxtop counters is key here, counters we’re interested in:

·         “GRANT” (the amount of physical memory granted to a VM/pool)

·         “SHRD” (the shared portion of  “GRANT”.)

·         “SHRDSVD” (estimated saving due to TPS)

·         “COWH” (indication of memory which can be reclaimed by TPS)

 

When using Intel EPT – http://www.vmware.com/pdf/Perf_ESX_Intel-EPT-eval.pdf  – (or AMD RVI) and where a VM supports HWMMU the ESX kernel will use 2MB pages instead of 4KB pages. On these VM’s you will see low values for SHRD/SHRDSVD.

 

HWMMU is supported on all versions of Windows from 2003 onwards; it is also the default setting for VM’s running these operating systems on hardware that supports Intel EPT; http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1020524

 

 

How can we interpret memory usage on ESX 4.X when using HWMMU?

 

Using esxtop you will find that Linux based VM’s will have high values for SHRD/SHRDSVD, this is because they do NOT use Intel EPT and therefore do not use HWMMU. As MMU is virtualised (software) small pages are used which play nicely with TPS. Windows 2003+ VM’s will use HWMMU and therefore will use large pages.

 

You can check the HWMMU status of a VM by checking the vmware.log and looking for “HV Settings” –look at the value for “virtual mmu” if ‘software’ the VM is not using HWMMU (Intel EPT) if ‘hardware’ it is.

 

Large Page support can be disabled (therefore forcing TPS to work regardless of contention), but there is a CPU performance impact (not massive to be fair but mileage may vary on this). At the end of the day TPS will kick in a claim back memory (significant amounts – http://vmnomad.blogspot.com/2011/08/shared-memory-effectiveness-of.html) at 94% used.

  

I hope this makes sense and sheds some light on why the ESX 4 boxes appear to be using more RAM.