VMWare : Increasing the HBA / Device Queue Depth

ESX 3.5 ships with a default HBA / LUN queue depth of 32. For QLogic HBAs, a setting of 64 may improve storage performance. You can identify a storage I/O bottleneck using esxtop from the ESX command line. When running esxtop, view the LUN queue statistics by pressing ‘u’ and monitor the QUED, ACTV and LOAD stats. If LOAD is constantly above 1.0, and QUED is therefore greater than 0, increasing the queue depth above 32 may improve performance. As always, apply the ‘if it’s not broken, don’t fix it’ philosophy!
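As a rough illustration of how these three columns relate, the sketch below assumes that LOAD is the ratio of active plus queued commands to the queue depth (shown as DQLEN in esxtop). The function names are hypothetical and are not part of any VMware tool.

```python
def lun_load(actv, qued, dqlen):
    """Approximate the esxtop LOAD figure as (ACTV + QUED) / DQLEN."""
    if dqlen <= 0:
        raise ValueError("queue depth must be positive")
    return (actv + qued) / float(dqlen)


def queue_is_saturated(actv, qued, dqlen):
    """True when commands are waiting behind a full device queue."""
    return qued > 0 and lun_load(actv, qued, dqlen) > 1.0
```

For example, 32 active and 16 queued commands against a queue depth of 32 give a LOAD of 1.5, which is the situation where a deeper queue may help.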

There are two settings that must be changed to increase the queue depth.

The steps below apply to QLogic HBAs only.

Adapter Queue Depth

Find the current module name:
   esxcfg-module -l | grep -i ql

Check the current module queue depth setting:
   cat /proc/scsi/qla2300/? | grep -i "queue depth"

This will return the value: Device queue depth = 0x20 (0x20 is hexadecimal, which is 32 in decimal)
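If you want to read this value programmatically, the hexadecimal figure can be converted as in the minimal sketch below. The helper name is hypothetical, and the line format is assumed to match the output shown above.

```python
import re


def parse_queue_depth(proc_line):
    """Extract the queue depth from a line such as 'Device queue depth = 0x20'."""
    match = re.search(r"queue depth\s*=\s*(0x[0-9a-fA-F]+|\d+)", proc_line, re.IGNORECASE)
    if match is None:
        raise ValueError("no queue depth found in: %r" % proc_line)
    # Base 0 makes int() honour the 0x prefix, so hex and decimal both parse
    return int(match.group(1), 0)
```

Passing the line shown above returns 32; after the change described below, a line reading 0x40 returns 64.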

Change the queue depth using the following command, substituting the module name found above (note that the command produces no output):
   esxcfg-module -s ql2xmaxqdepth=64 qla2300_707

Verify the change has been written to the esx.conf file:
   cat /etc/vmware/esx.conf

Reboot the ESX server, then check the module configuration:
   cat /proc/scsi/qla2300/? | grep -i "queue depth"

This should return the value: Device queue depth = 0x40

Now, if you stop here you’ll find that the DQLEN will change from 32 to 64 and back again when viewing the LUN statistics in esxtop. It will keep changing randomly unless you perform the step below.

Disk.SchedNumReqOutstanding

Now we must increase the Disk.SchedNumReqOutstanding to 64, otherwise the setting above will have no effect:
   esxcfg-advcfg -s 64 /Disk/SchedNumReqOutstanding

This setting does NOT require a restart of the ESX environment.

Additional Considerations

Consider disabling device resets (which can cause backup interruptions and flooding of NSRs) and limiting resets to an individual LUN, which does not affect the entire SAN. By default, ESX 3.x enables device resets and disables LUN resets. The following commands can be executed from the ESX console:

   esxcfg-advcfg -s 1 /Disk/UseLunReset
   esxcfg-advcfg -s 0 /Disk/UseDeviceReset

Citrix : LMC Authentication Error

On opening the LMC you may receive the following error:
   “You did not authenticate correctly. Please try again or contact your System Administrator.”

To resolve this, review the users listed in the following configuration file:
   %ProgramFiles%\Citrix\Licensing\LMC\Tomcat\conf\tomcat-users.xml

I experienced this error after migrating an administrative user account to a new domain; the account definition in the XML file was still listed under the old domain. Changing the old domain NetBIOS name in the file to the new domain NetBIOS name resolved the authentication issue.

It may be necessary to restart the Citrix Licensing Service after modification of the file.

VMWare : Configuring VCB with HP Dataprotector 6

This guide is intended to enable you to set up VCB on ESX 3.5 with HP Data Protector 6.0. It covers everything from the initial configuration of the VCB Proxy Server and the installation of the VCB Framework to the steps required for both backup AND restore of Virtual Machines.

VMware Consolidated Backup Concept

In essence, a ‘VMware Consolidated Backup’ (VCB) is a crash-consistent snapshot of a virtual machine which is redirected to a ‘VCB Proxy Server’ across either the LAN or the Fibre Channel (SAN) infrastructure. When using HP Data Protector, before the backup is executed, Data Protector calls a pre-exec script which quiesces the VM and performs the snapshot, redirecting it to the VCB Proxy Server. The backup then runs, copying the data to tape, before a post-exec script cleans up the snapshot.

One important thing to note is that VCB is an online process; traditional backup windows do not necessarily apply. One consideration is storage subsystem IOPS; VCB can be resource intensive for your SAN/storage infrastructure. I would recommend running test VCBs out of hours and closely monitoring the storage infrastructure to understand the performance impact of this process before performing backups during business hours.

The VCB Proxy Server requires free disk space equal to the size of the largest VCB snapshot (or, if backups run concurrently, the sum of all concurrent VCB snapshots). For example, for a VM with a total allocated vmdk size of 500GB, the VCB will require approximately 300GB if only 300GB of the vmdk is used, and approximately 500GB if all 500GB has been used.
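The sizing rule above can be sketched as a small calculation. This is only an illustration under the assumptions stated in the text (space needed is roughly the used vmdk size); the function name and the sample figures are hypothetical.

```python
def proxy_space_required_gb(used_gb_per_vm, concurrent=False):
    """Estimate the free space the VCB proxy needs for a set of VM backups.

    used_gb_per_vm: used (not allocated) vmdk size of each VM, in GB.
    concurrent:     True if all of the backups run at the same time.
    """
    sizes = list(used_gb_per_vm)
    if not sizes:
        return 0
    # Concurrent snapshots must all fit at once; sequential ones only
    # need room for the largest single snapshot.
    return sum(sizes) if concurrent else max(sizes)
```

With two VMs using 300GB and 120GB, sequential backups need about 300GB on the proxy, while running both at once needs about 420GB.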

There are additional considerations with Windows-based VMs: it is necessary to ‘zero’ deleted files in order to reduce VCB snapshot sizes. If you were to create two 20GB files and then delete one of them, Windows will not ‘zero’ out the deleted blocks, so the backup will still be 40GB. Using sdelete we can zero out these blocks, reducing the backup by 20GB and therefore reducing the required backup window.

I will attempt to explain all of the above in more detail as we progress through this guide.

 

1. Windows VCB Proxy Server Configuration

1.1 Disabling Windows Automount

In order to avoid corruption of SAN-attached disk data we must disable Windows automounting, because we will be presenting the ESX LUNs to the VCB Proxy Server.

Step 1: Open a command prompt and enter the command ‘diskpart’.

Step 2: Enter the command ‘automount’.

This will identify whether volume automounting is enabled on this system.

Step 3: If automounting is enabled, we must disable it and flush the mount cache.

Step 4: Enter the command ‘automount disable’.

Step 5: Enter the command ‘automount scrub’.

Step 6: Enter the command ‘exit’.

Step 7: Reboot the VCB Proxy Server

Note: This will not affect existing LUNs that are assigned drive letters on this system.


1.2 Installation of VCB Framework

Step 1: Download the VMware Consolidated Backup Framework application from http://www.vmware.com and run the setup; accept the licence agreement and follow the default on-screen prompts. Please see “Setting up VCB.pdf” for full details.

Note: Once the VCB Framework is installed, there is no need to configure it, as the VCB scripts downloaded from HP will be used.

 

1.3 Installation of HP Data Protector VCB Scripts

Step 1: Download the VCB scripts from the HP site and extract them.

Step 2: On the VCB proxy server, copy vcbmount.js, vmwarepostexec.cmd and vmwarepreexec.cmd to the omniback\bin share.

Step 3: Create the file “vmware_passwd” in C:\Program Files\VMware\VMware Consolidated Backup Framework\config

The vmware_passwd file should contain the IP or server name of the ESX server, its user name and password in the format below:

 

1.4 Installation of the HP Data Protector Media Agent on the VCB Proxy Server

In order to ensure LAN-free backups, the VCB Proxy Server must be fibre attached and have the Data Protector Media Agent installed. Install this agent via the Data Protector GUI.

1.5 Configuration of Tape Libraries/Drives for the VCB Proxy Server

If applicable to your environment, the VCB Proxy Server should be zoned to enable it to access the tape libraries/drives. This is configured by the SAN Fabric Manager or Storage Team.

 

2. Zoning of SAN Disks for VCB Proxy Server

The VCB Proxy Server must be zoned to allow access to the SAN Volumes used by the ESX host; without this FC backups are not possible.

Step 1: Zone the SAN, VCB Proxy and ESX Server together.

Step 2: Configure host presentation of the VCB Proxy Server on the SAN LUNs

Step 3: Reboot the VCB proxy server.

Please note that if LUNs are presented read-only to the VCB Proxy Server, the server will take a long time to start Windows. LUNs must be presented read-write, hence the automount setting detailed at the beginning of this guide.

 

3. ESX Server Configuration

3.1 Creation of User Account and Delegation of Required Permissions

Step 1: Log on to the ESX server using the VMware Infrastructure Client (VI Client) as a user with Administrator privileges.

Step 2: From the VI Client, click Administration in the navigation bar. Click the Roles tab. Click Add Role.

Step 3: Type a name for the new role, for example, VMware Consolidated Backup User.

 

Step 4: Select the following privileges for the new role:

  • Virtual Machine > Configuration > Disk Lease
  • Virtual Machine > State > Create Snapshot
  • Virtual Machine > State > Remove Snapshot
  • Virtual Machine > Provisioning > Allow Virtual Machine Download
  • Virtual Machine > Provisioning > Allow Read-only Disk Access

 

Click OK to complete the process.  (These are the minimum permissions required)

Step 5: From the VI Client, click Inventory in the navigation bar.

Step 6: Click on the Users & Groups tab and select the Users button. In the white space, press the right mouse button and select Add.

Step 7: Enter the logon details and click OK. Note that these logon details should match those in the ‘vmware_passwd’ file on the VCB Proxy Server. This will create the VCB user account.

Step 8: Select the ESX Server from the system tree on the left-hand side. Select the Permissions tab from the main window. This will enable us to set server-wide permissions.

Step 9: Right-click and select Add Permission.

Step 10: Select the new user account and click OK.

Step 11: On the Assign Permissions screen, select the VCB role that was created earlier (for example, VMware Consolidated Backup User) from the drop-down menu and click OK.

UPDATE : Windows 2008 R2 Support

It is possible to get this working on Windows 2008 R2; you’ll need to perform the following additional steps:

  1. Manually create C:\Program Files\OmniBack\tmp
  2. Set C:\Program Files (x86)\VMware\VMware Consolidated Backup Framework\vcbMounter.exe to ‘Run this program as an administrator’ under the executable’s compatibility options.

VMWare : Enabling SNMP on ESX 3.5

SNMP traps can be used to monitor ESX server health and individual Virtual Machine status. An example of a free SNMP monitor for ESX is SolarWinds VM Monitor. In order to use these tools it is necessary to configure and enable snmpd on your ESX server.

First we must edit the snmpd.conf file:
   vi /etc/snmp/snmpd.conf

Change the rocommunity line to match your community string:
   rocommunity public

Ensure the VMware MIBs are enabled:
   dlmod SNMPESX            /usr/lib/vmware/snmp/libSNMPESX.so

If using the ESX firewall you will need to open the snmp ports:
   esxcfg-firewall -e snmpd

Now start the snmpd service:
   /etc/init.d/snmpd start

Set SNMP to startup automatically on system boot:
   chkconfig snmpd on

You can query the status of the SNMPD service using the command:
   /etc/init.d/snmpd status

These changes can be made online, and there is no requirement to restart your ESX server.
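As a quick sanity check of the edited file, the sketch below scans the text of snmpd.conf for the two lines discussed above. It is a hypothetical helper rather than a VMware or Net-SNMP utility, and it assumes the simple one-directive-per-line layout shown in this post.

```python
def check_snmpd_conf(conf_text):
    """Return (community, esx_mib_loaded) found in snmpd.conf text."""
    community = None
    esx_mib_loaded = False
    for raw in conf_text.splitlines():
        # Ignore comments and blank lines
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if fields[0] == "rocommunity" and len(fields) > 1:
            community = fields[1]
        elif fields[0] == "dlmod" and len(fields) > 2 and fields[1] == "SNMPESX":
            esx_mib_loaded = True
    return community, esx_mib_loaded
```

On the service console you could pass it the contents of /etc/snmp/snmpd.conf, for example via open("/etc/snmp/snmpd.conf").read().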

Exchange 2007 : Grant Full Mailbox Access via Shell

The following Exchange Management Shell command can be used to assign ‘Full Mailbox Access’ permissions on a user’s mailbox to another user:

Add-MailboxPermission "UserA" -User "UserB" -AccessRights FullAccess

This will grant UserB full access to UserA’s mailbox.

SQL : View Running Trace Information

The following SQL can be used to identify any traces that are active on a SQL instance.

Use the query below to see how many running traces there are on the instance:

SELECT count(*) FROM :: fn_trace_getinfo(default) WHERE property = 5 and value = 1

running
1

The next query will return more detailed information about the active traces:

SELECT * FROM :: fn_trace_getinfo(default)

traceid property value
1 1 0
1 2 c:\temp\TraceGlobal
1 3 5
1 4 29:27.5
1 5 1
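To make this output easier to read, the property numbers returned by fn_trace_getinfo can be mapped to their meanings (1 = trace options, 2 = file name, 3 = maximum file size, 4 = stop time, 5 = current status). The sketch below is an illustration only; the function names are hypothetical and the rows are assumed to have already been fetched as (traceid, property, value) tuples.

```python
TRACE_PROPERTIES = {
    1: "trace options",
    2: "file name",
    3: "max file size (MB)",
    4: "stop time",
    5: "status (0 = stopped, 1 = running)",
}


def describe_traces(rows):
    """Group (traceid, property, value) rows into {traceid: {label: value}}."""
    traces = {}
    for traceid, prop, value in rows:
        label = TRACE_PROPERTIES.get(prop, "property %d" % prop)
        traces.setdefault(traceid, {})[label] = value
    return traces


def running_trace_ids(rows):
    """Return the ids of traces whose status property (5) is 1."""
    return sorted(t for t, prop, value in rows if prop == 5 and value == 1)
```

The ids returned by running_trace_ids are the values you would pass as the first argument to sp_trace_setstatus.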

To halt the running trace use the following query:

EXEC sp_trace_setstatus  1, @status = 0

To close the trace and delete its definition from the server (the trace must be stopped first):

EXEC sp_trace_setstatus  1, @status = 2

 

VMware VCB Troubleshooting ‘non-zero return code’

I recently came across the following error when running a VCB via HP DataProtector 6.0:

 

“Creating a quiesced snapshot failed because the (user-supplied) custom pre-freeze script in the virtual machine exited with a non-zero return code”

The following steps resolved the issue:
  > Perform a ‘repair’ of the VMWare tools installed on the Virtual Machine
  > Restart the Virtual Machine

The error is generated because the VCB fails to execute a script on the VM due to an issue with the VMWare tools on the guest.

VBScript : Find User SID

The following script can be modified to return the SID of a user object. Set strComputer to the hostname of a local DC, Name to the sAMAccountName of the user whose SID you wish to find, and Domain to the NetBIOS name of the Active Directory domain:

strComputer = ""
Set objWMIService = GetObject("winmgmts:\\" & strComputer & "\root\cimv2")

Set objAccount = objWMIService.Get _
    ("Win32_UserAccount.Name='',Domain=''")
Wscript.Echo objAccount.SID

For example this could be changed to:

strComputer = "DC1"
Set objWMIService = GetObject("winmgmts:\\" & strComputer & "\root\cimv2")

Set objAccount = objWMIService.Get _
    ("Win32_UserAccount.Name='BloggsJ',Domain='MYDOMAIN'")
Wscript.Echo objAccount.SID

VMWare : Capturing Performance Statistics

The following process will allow you to capture Windows Performance counter compatible CSV files from any ESX server using the ‘esxtop’ utility which is an integral part of VMWare ESX.

First we must create a couple of script files, the first being ‘ftp.sh’. I have created the scripts on a datastore which houses NO Virtual Servers. Be careful where you place this data, as filling up a datastore that holds VMs will stop those VMs working. You will need to modify the paths in the script (/vmfs/volumes/LOCAL_ATTACHED/esxtop in this listing) and the FTP details to ensure the script works in your environment. The paths are simply the location of the script files and the location where the CSV files will be generated.

This script will generate the CSV file and ‘trim’ it down to the stats we require. By default esxtop will generate an insanely large csv file. Once ‘trimmed’ it will upload the csv file to an FTP server of your choice and finally gzip/archive the file for future reference.

#!/bin/bash
# ftp.sh: every 24 hours, trim and FTP yesterday's stats
#
echo $(date +%R)

# Location of the scripts and CSV files; change to suit your environment
DIR=/vmfs/volumes/LOCAL_ATTACHED/esxtop
dm1=$(date --date='1 day ago' +%Y-%m-%d)
CSV=${HOSTNAME}_$dm1.csv

# Counters to keep (extended regex; each \\ matches one literal backslash in the CSV header)
KEEP='\\Memory\\Free MBytes|Physical Disk\(vmhba1\)\\Reads/sec|Physical Disk\(vmhba1\)\\Writes/sec|Physical Disk\(vmhba1\)\\MBytes Written|Physical Disk\(vmhba1\)\\MBytes Read|\\Physical Disk\(vmhba2\)\\Reads/sec|Physical Disk\(vmhba2\)\\Writes/sec|Physical Disk\(vmhba2\)\\MBytes Written|Physical Disk\(vmhba2\)\\MBytes Read|Physical Disk\(vmhba2\)\\Commands/sec|Physical Disk\(vmhba1\)\\Commands/sec|Physical Cpu\(_Total\)'

# Perform streamlining of CSV file: look up the column numbers of the wanted counters
# in the header line, then keep only column 1 (the timestamp) and those columns
COLS=$(head -1 "$DIR/$CSV" | tr ',' '\n' | egrep -n "$KEEP" | cut -d ':' -f 1 | tr '\n' ',' | sed 's/,$//')
cut -d ',' -f "1,$COLS" "$DIR/$CSV" > "$DIR/trim_$CSV"

# Delete line 2 of the trimmed file, then replace the raw capture with the trimmed copy
sed -i.bak '2d' "$DIR/trim_$CSV"
rm -f "$DIR/trim_$CSV.bak" "$DIR/$CSV"
mv "$DIR/trim_$CSV" "$DIR/$CSV"

HOST='ftp.domain.local'
USER='username'
PASS='password'

# Connect to FTP Server
ftp -inv $HOST << EOF
user $USER $PASS
lcd $DIR
put $CSV
bye
EOF

# GZIP and archive stats
#
gzip "$DIR/$CSV"
mv "$DIR/$CSV.gz" "$DIR/archive/"
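The column-trimming step in the script above is the least obvious part, so here is the same idea as a minimal Python sketch: find the header columns whose names match a pattern, then keep only the first column and those columns in every row. The function name is hypothetical, and the sketch assumes every row has as many fields as the header.

```python
import csv
import io
import re


def trim_columns(csv_text, pattern):
    """Keep column 1 plus every column whose header matches the regex."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header = rows[0]
    keep = [0] + [i for i, name in enumerate(header)
                  if i > 0 and re.search(pattern, name)]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in rows:
        writer.writerow([row[i] for i in keep])
    return out.getvalue()
```

Because the counter names contain backslashes, the pattern needs each one escaped, for example r"\\Memory\\Free MBytes".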


Secondly, create the ‘capturestats.sh’ script, which will launch esxtop and capture the statistics you require. Again, modify the path and the host name prefix of the CSV file to suit your environment. This script will capture stats every 60 seconds 1439 times; there are 1440 minutes in a day and we want the script to start again at midnight, so this script will run from 00:00 to 23:59.

#!/bin/bash
# capturestats.sh
#
today=$(date +%Y-%m-%d)
#
# There are 1440 minutes in a day, we want to capture 00:00 > 23:59 so we'll specify 1439 captures at 60 second intervals.
# -b runs esxtop in batch mode, which is what produces CSV output.
# EUVMTST1 is this host's name; it must match the $HOSTNAME prefix that ftp.sh looks for.
#
esxtop -b -d 60 -n 1439 -c /root/.esxhoststats >> /vmfs/volumes/LOCAL_ATTACHED/esxtop/EUVMTST1_$today.csv
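The sample count used in the script can be checked with a little arithmetic. The sketch below is illustrative only, with a hypothetical function name, and shows how long a given number of samples at a given interval will run.

```python
from datetime import timedelta


def capture_span(samples, interval_seconds):
    """Total time covered by `samples` captures taken every `interval_seconds`."""
    return timedelta(seconds=samples * interval_seconds)
```

With 1439 samples at 60 seconds this gives 23 hours 59 minutes, so a run started at midnight finishes just before the next day's run begins.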

Next, create the esxtop config file at /root/.esxhoststats. This will ensure that we capture only what we need: CPU stats, memory usage and disk I/O stats. You can modify your own config file to meet your own requirements.

abcdefgh
abcdefghijklmno
AbcdefghIjklm
abcdefghijk
abcdefghijk
ABcDEFGHIJKLm
5u

Finally, under the root account context (accessed via sudo su -), execute the ‘crontab -e’ command. Add the following lines to the file:

#!/bin/bash
00 00 * * * /vmfs/volumes/LOCAL_ATTACHED/esxtop/capturestats.sh >/dev/null
00 01 * * * /vmfs/volumes/LOCAL_ATTACHED/esxtop/ftp.sh >/dev/null

This will cause the capturestats.sh script to run at midnight every day and the ftp.sh script to run at 01:00 every day.

SQL : Enable AWE on i386/x86

SQL 2000 : Enabling AWE on Windows Server

On an x86/i386 system it is possible to use PAE and AWE to allow SQL to use more than 2GB of RAM. Windows 2000 Advanced Server x86 allows for up to 8GB of RAM using PAE and AWE; Windows Server 2003 Enterprise allows for up to 32GB. Datacenter editions allow for even greater amounts of PAE/AWE assigned RAM.

First configure the /PAE switch in the operating system boot.ini file. You can also use the /3GB switch if not configuring more than 16GB of RAM.

Next, run the following SQL to enable the instance to use AWE, and therefore the newly available RAM.

sp_configure 'show advanced options', 1
RECONFIGURE
GO
sp_configure 'awe enabled', 1
RECONFIGURE
GO
sp_configure 'max server memory', 2048 -- This sets the allocation to 2GB (the value is in MB)
RECONFIGURE
GO
 

You will have to restart the SQL instance for the change to become effective.

Considerations:

  • The total sum of all SQL assigned RAM should not be greater than all of the memory in the server; you should remove 2GB from this total for the OS if not using the /3GB switch, or 1GB if using the /3GB switch.
  • You should configure the SQL service account to have ‘Lock Pages In Memory’ permissions; this will prevent the AWE memory being paged to disk.
  • In a failover cluster environment, the sum of ALL instance assigned AWE memory should be no greater than the total memory (less the kernel reserved 2GB/1GB, depending on the /3GB switch) on a single node. If this is exceeded, any instance which starts on a node where all memory is already assigned will start in dynamic mode with 128MB RAM, or may even fail to start.
  • The ‘min server memory’ option is ignored when using AWE.
  • You cannot monitor SQL Server memory use from Task Manager when utilising AWE; this will simply show the SQL instance using the total amount of memory. The following SQL can be used to identify real memory usage:

select counter_name,cntr_value/1024 As MemoryUsedMB from master..sysperfinfo
where counter_name = 'Total Server Memory (KB)'

  • AWE is an enabler allowing a 32-bit operating system to address more than 4GB of physical memory. There are obvious benefits; however, there are performance considerations which should not be overlooked when using AWE. For example, AWE memory cannot be swapped to the page file, so you should closely monitor application memory requirements after machine startup before allocating memory to SQL.
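The memory budgeting rules in the considerations above can be sketched as a small calculation. This is only an illustration of the rule of thumb given in this post (reserve 2GB for the OS, or 1GB with /3GB); the function names and figures are hypothetical.

```python
def max_sql_memory_mb(total_ram_mb, use_3gb_switch=False):
    """RAM left for SQL after reserving 2GB for the OS (1GB with /3GB)."""
    reserve_mb = 1024 if use_3gb_switch else 2048
    return max(total_ram_mb - reserve_mb, 0)


def fits_on_single_node(instance_memory_mb, node_ram_mb, use_3gb_switch=False):
    """True if every instance's AWE memory can be hosted on one cluster node."""
    budget = max_sql_memory_mb(node_ram_mb, use_3gb_switch)
    return sum(instance_memory_mb) <= budget
```

For an 8GB node without /3GB the budget is 6144MB, so three instances of 2048MB each would fit on one node after a failover, while two instances of 4096MB would not.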