VMware: Increasing the HBA / Device Queue Depth

ESX 3.5 ships with a default HBA / LUN queue depth of 32. For QLogic HBAs, a setting of 64 may improve storage performance. You can identify a storage I/O bottleneck using esxtop from the ESX command line. When running esxtop, view the LUN queue statistics by pressing ‘u’ and monitor the QUED, ACTV and LOAD stats. If LOAD is constantly above 1.0 (which means QUED is greater than 0), increasing the queue depth above 32 may improve performance. As always, apply the ‘if it’s not broke then don’t try to fix it’ philosophy!
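To make the LOAD rule concrete, here is a small sketch of the arithmetic. It assumes that LOAD is the ratio of active plus queued commands to the queue depth (DQLEN), which is how esxtop is generally described. The function names lun_load and is_saturated are made up for illustration, and the numbers are values you would read off the esxtop screen by hand.

```shell
# lun_load ACTV QUED DQLEN
# Prints (ACTV + QUED) / DQLEN with two decimals.
# Assumption: this mirrors how esxtop derives the LOAD column.
lun_load() {
    awk -v a="$1" -v q="$2" -v d="$3" 'BEGIN { printf "%.2f\n", (a + q) / d }'
}

# is_saturated ACTV QUED DQLEN
# Succeeds (exit status 0) when the computed load is above 1.0.
is_saturated() {
    awk -v a="$1" -v q="$2" -v d="$3" 'BEGIN { exit !((a + q) / d > 1.0) }'
}
```

For example, with ACTV=32, QUED=8 and DQLEN=32, lun_load 32 8 32 gives 1.25: the queue is full and 8 commands are waiting. A single reading like this is not enough, so only treat it as a bottleneck if it stays above 1.0 over time.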

There are two settings that must be changed to increase the queue depth.

The steps below apply to QLogic HBAs only.

Adapter Queue Depth

Find the current module name:
   esxcfg-module -l | grep -i ql

Check the current module queue depth setting:
   cat /proc/scsi/qla2300/? | grep -i "queue depth"

This will return the value: Device queue depth = 0x20 (0x20 is hexadecimal, which is 32 in decimal)
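If you would rather not convert hex in your head, the shell can do it. This is a minimal sketch, assuming the line has the form shown above ("Device queue depth = 0x..."). The function name qdepth_to_dec is hypothetical.

```shell
# qdepth_to_dec "Device queue depth = 0x20"
# Pulls the 0x.. token out of the line and prints it in decimal.
qdepth_to_dec() {
    # Capture the hex value that follows the "=" sign.
    hex=$(printf '%s\n' "$1" | sed -n 's/.*= *\(0x[0-9A-Fa-f]*\).*/\1/p')
    # printf understands the 0x prefix and converts to decimal.
    [ -n "$hex" ] && printf '%d\n' "$hex"
}
```

Calling qdepth_to_dec "Device queue depth = 0x20" prints 32, and the same call with 0x40 prints 64.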

Change the queue depth using the following command, with the module name found in the first step. Note that the command produces no output:
   esxcfg-module -s ql2xmaxqdepth=64 qla2300_707

Verify the change has been written to the esx.conf file:
   cat /etc/vmware/esx.conf
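esx.conf is a long file, so it can be easier to filter for the option than to read the whole thing. The sketch below assumes only that the option string ql2xmaxqdepth=N appears somewhere in the file after the change; the exact layout of the surrounding line is not assumed. The function name maxqdepth_from_conf is hypothetical.

```shell
# maxqdepth_from_conf < /etc/vmware/esx.conf
# Reads configuration text on stdin and prints the number that
# follows "ql2xmaxqdepth=" on any line that contains it.
maxqdepth_from_conf() {
    sed -n 's/.*ql2xmaxqdepth=\([0-9][0-9]*\).*/\1/p'
}
```

Run it as maxqdepth_from_conf < /etc/vmware/esx.conf. If nothing is printed, the option was not written to the file.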

Reboot the ESX server, then check the module configuration:
   cat /proc/scsi/qla2300/? | grep -i "queue depth"

This should return the value: Device queue depth = 0x40

If you stop here, you’ll find that DQLEN flips from 32 to 64 and back again when viewing the LUN statistics in esxtop. It will keep changing, apparently at random, unless you perform the step below.

Disk.SchedNumReqOutstanding

Now we must increase Disk.SchedNumReqOutstanding to 64; otherwise the setting above will have no effect:
   esxcfg-advcfg -s 64 /Disk/SchedNumReqOutstanding

This setting does NOT require a restart of the ESX environment.

Additional Considerations

Consider disabling device resets (which can cause backup interruptions and flooding of NSRs) and limiting resets to an individual LUN, so that a reset does not affect the entire SAN. By default, ESX 3.x enables device resets and disables LUN resets. The following commands can be executed from the ESX console to reverse this:

   esxcfg-advcfg -s 1 /Disk/UseLunReset
   esxcfg-advcfg -s 0 /Disk/UseDeviceReset
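To confirm that the advanced settings took effect, you can read them back with esxcfg-advcfg -g. The sketch below assumes the command prints a single line ending in the value (something like "Value of UseLunReset is 1"); check the output on your own host first. The helper name advcfg_value is hypothetical.

```shell
# esxcfg-advcfg -g /Disk/UseLunReset | advcfg_value
# Reads the command output on stdin and prints the last
# whitespace-separated field of the last non-empty line.
advcfg_value() {
    awk 'NF { v = $NF } END { print v }'
}
```

For example, esxcfg-advcfg -g /Disk/SchedNumReqOutstanding | advcfg_value should print 64 once the change above has been made, and the two reset options should read back as 1 and 0 respectively.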