Troubleshoot Citrix / Thin Client Performance’
After a long project that was aimed at improving Thin Client performance I though I would post my experiences and solutions in order to aid those in a similar situation.
Citrix Server Performance Improvement
I was recently tasked with improving a Citrix XP and PS 4.5 Farm’s performance; by no means was this simple project which I could simply throw more servers at the farm hoping to resolve the issue.
By far, the most useful tool in diagnosing slow logons is the userenv.dll debugging available in your Windows out of the box. This will really spell out where your problem is coming from.
For further information read this link: http://support.microsoft.com/kb/221833
External File Server Performance
External file servers, especially servers holding roaming user profiles can cause significant delays; if these are running out of free connections or worker threads then logon delays are inevitable.
Symptoms: Long pause / very slow / hangs at logon ‘Loading Your Personal Settings”
Long logon delays often indicate issues with remote file access; namely GPO’s and Profile data if roaming profiles are used. Not only are these logon delays a nuisance for end-users, they have a knock-on effect; the duration of the delay often effects all users on a particular server. I have seen logon delays of 50+ second’s effect all users on a single server until the logon process has finished for the user
To Diagnose: Use userenv.dll debugging – http://support.microsoft.com/kb/221833– log file is located under %Systemroot%DebugUserModeUserenv.log.
Solution: Watch out for ‘Srv’ events in the System Event Log with Error code ‘2022’; see the following KB article for more details: http://support.microsoft.com/kb/317249I would definitely suggest rolling out the MaxFree Connections /MinFree Connections registry tweak described in more detail here: http://support.microsoft.com/kb/830901 Note that Windows Server / Advanced Server 2000 require a hotfix, which is free to obtain form MS Technical support.The following web site is also a great resource: http://support.microsoft.com/kb/324446 – if you’re running RAID cards with battery backup units get the Delayed Write Cache setting enabled!
NOTES: Please note that Microsoft does not support the use of PST files across a network. This can cause significant performance issues to file servers hosting them. For further details please see: http://blogs.technet.com/askperf/archive/2007/01/21/network-stored-pst-files-don-t-do-it.aspx If you’re hosting PST files on the same server as your profiles you’ve more than likely found your problem. I would suggest separating the profiles and PST files on separate servers. Profile access needs to be quick to ensure smooth logons.
Active Directory Access
Slow access to domain controllers, namely Global Catalogue (GC) servers can cause significant delays in logon as group memberships are referenced and permissions are established from the Active Directory.If you have only a single domain in your forest each Domain Controller can be setup as a GC server. In a multi-domain forest you should ensure that the Infrastructure Master FSMO role is not placed on a GC. The first DC in a domain is always automatically configured as a GC, subsequent DC’s are not.
Symptoms: Long pause / delay / hang / slow at logon “Applying computer settings” and loading Logon Scripts
To Diagnose: Use userenv.dll debugging – http://support.microsoft.com/kb/221833 – log file is located under %Systemroot%DebugUserModeUserenv.log.
Solution: Setup dedicated DC’s; DC’s are central to yourActive Directory Domain. Quick access for LDAP queries is essential for performance. Running print/file server roles on these servers is simply not smart and not reccommended.
Citrix Server Hardware / Number of Users Per Citrix Server
There are many myths about the number of users you can effectively have on a single Citrix server. I have seen single servers handle 60 users without any issues what so ever. I have seen servers struggle to handle 20 users when applications or external problems, such as file server access, can cause slowdowns. There isn’t a Citrix reccomended number of users per server. This limit is dictated by the applications your user operates during their session. The only way to find out what your Citrix servers can handle is to test them.
Symptoms: High CPU/ Memory / Page File usage on all Citrix servers within a farm.
To Diagnose: Create a performance benchmark using the built in Window Performance counters. You’ll know if this is an issue when you examine the results.
Solution: Setup and introduce further servers into a farm. Unless you’re seeing high CPU/RAM usage there is little point in adding more servers to the farm; your problem is elsewhere my friend.
It’s worth noting at this point a poor logon script can cause more problems than the few issues it may automatically fix. Avoid, where possible, calling network applications held on File servers – these shares will be in high demand at peak hours and could cause delays.Script type; I’m not going to get into which is better and which is worse programming language wise. I’ve had great success implementing vbscript over KIX scripts and DOS scripts; this may not be the same in your environment.Scripts to look at in particular; • Scripts being called by UsrLogn2.cmd (found under %SystemRoot%System32)• Group Policy Active Directory Account Logon Scripts
Symptoms: Long pause after the ‘Applying your personal settings’ box disappears.
To Diagnose: Test a user account with the same profile settings other than logon script; ensure it has no logon script.
Solution: Scale back / Streamline your scripts where possible. Alternatively you’re looking at a long night rebuilding them. There is no one-fix-fits all here; your scripts are bespoke to your network… good luck!
Network Adapter Configuration
UPDATE 31/01/2008: Simple, yet easy to overlook is the Network Adapter configuration.
Symptoms: Running Citrix Presentation Server 4.5 on Windows Server 2003 I experienced delays of up to 5 minutes for some user accounts whilst logging on. Specifically the logon would get stuck at ‘Loading your personal settings.’
Solution: The cause was simple; a network configuration mismatch. The switch to which the serevr was connected was configured for auto, as was the server. The link infact had auto-negotiated to 10Mb Half Duplex. Forcing the server to 100Mb Full-Duplex reduced logon to around 15 seconds.This can be explained by the use of roaming profiles. The delay was caused by the slow NIC configuration. This means that copying users roaming profiles took up to 5 minutes prior to logon.
UPDATE: 27/09/2009: Antivirus software should be installed and configured correctly for Citrix XenApp/Presentation Server in order to ensure that there is no performance overhead.
Symptoms: Generally slow performance across all applicationsand file access.
To Diagnose: TEMPORARILY disable all anti-virus components (especially the on-access scanner and any application filters/buffer overflow protection)
Solution: You should configure the anti-virus on-access scanner as follows:
• Scan on write events only
• Scan local drives only
• Exclude the pagefile from being scanned
• Exclude the Print Spooler directory to improve print performance
• Exclude the Program FilesCitrix folder from being scanned (the heavily accessed local host cache and Resource Manager local database are contained inside this folder)
• If ICA pass-through connections are used, exclude the user‘s XenApp Plugin bitmap cache and the XenApp Plugin folders
More information is available here
UPDATE: 11/11/2009: If using McAfee Virus Scan 8.7i ensure that at least patch version 2 is installed.
Symptoms: Slow Windows startup and logon performance. Windows boot takes several minutes and gets stuck on ‘Applying Computer Settings…’
To Diagnose: Set the ‘Network Location Awareness’ service startup type to ‘Automatic’
Solution: Install patch 2 for McAfee 8.7i – there is a known issue with version before this causing network communication requests to be sent prior to the ‘Network Location Awareness’ service starting
UPADTE 26/02/2010: I thought I would streamline this article, incorporating an additional troubleshooting step from another article in the cb-net archives.
Symptoms: Slow responses when entering text into applications. Refresh of application GUI appears slow, menus etc appear ‘sluggish.’
To Diagnose: Use the Metaframe Servers SDK (MFCOMSDK) v2.3 tool; smcconsole.exe. Using this tool you can view individual sessions bandwidth utilisation and latency.This tool is incredibly useful when troubleshooting issues regarding session performance. Session latency can also be viewed using the WMI performance counters for ICA Session that are installed when Citrix is installed on a Windows Server.
Solution: When troubleshooting my issues I was receiving figures of 27000ms (yes, 27 seconds!).
Common causes of high latency are:
Ø Network topology issues including port mismatches
Ø MTU issues
Ø Link saturation / QoS misconfiguration
I have seen latency figures as high as 27,000ms (yes, 27 seconds!) due to NIC / switch port mismatches.
Speed Screen Configuration
Symptoms: Slow responses when entering text into applications
Solution: An often overlooked setting is Speedscreen. Speedscreen will significantly improve the speed at which applications appear to respond to text input from a thin user. You should configure speed screen and replicate settings across the server farm. For instruction see this link:
Symptoms: Generally slow performance of virtualised Citrix servers, especially on AMD ESX/ESXi virtualisation platforms. I had similar issues with physical servers which had been converted to virtual servers.
Solution: For AMD RVI deployments beware that on Windows 2003 Hardware-assisted MMU virtualisation (AMD RVI) will not automatically be enabled. This is because of performance related issues in versions of Windows 2003 prior to Service Pack 2. I would suggest that any VM running Windows 2003 SP2 or newer should have hardware MMU manually enabled if your virtualisation platform supports it. You can confirm that Hardware-assisted MMU virtualisation is in use by viewing you vmware.log file that is stored alongised the vmx file, look for virtual exec = ‘hardware’; virtual mmu = ‘hardware’
Less is more; just because your old platform had 4 physical CPU’s, or even more, doesn’t mean that the virtualised platform will perform better. I’ve run 50 users on a single VM with 4GB RAM and 2vCPU’s – performance was good! Also check the %RDY and MLMTD values for you Citridx VM’s in esxtop; these counters can help identify CPU contention or limits that are affecting VM performance. %RDY should always be below 10-15% higher than this and it’s likely you have an over subscribed host – try reducing physical to virtual CPU ratio’s first. With regard to MLMTD; this should be carefully considered – if this has a value it means that ESX is limiting resources to your VM due to limits you have set (i.e. CPU MHz limits). Further ESX/ESXi performance troubleshooting steps can be found here: http://www.cb-net.co.uk/vmwareesxi-articles/32-performance/61-vmware-troubleshooting-vm-performance