There are numerous benefits to running Citrix XenApp on XenServer, including single-vendor support, built-in optimizations, and integration features. However, what if you are working in a VMWare ESX environment? As a consultant or an internal engineer, you cannot always dictate the virtualization environment. The following are some tried and true best practices for optimizing XenApp on VMWare ESX.
The key is Memory Sharing: unlike XenServer, VMWare allows you to overallocate resources. A key to optimizing performance is to NOT over-allocate RAM, so that the host relies less on memory sharing between guests. Memory sharing/de-duplication is a great feature for infrastructure servers, but application/terminal services can suffer faults from the shared memory. Of course, if it were that simple, everyone would do it. Along with avoiding overallocation, the OS and services should be optimized. Finally, you should consolidate/isolate your Terminal Services workloads to common hosts -- in other words, dedicate ESX Hosts to run only XenApp. This will optimize VMWare's native memory sharing as well as streamline I/O.
Create a master VMWare TEMPLATE for each flavor of OS and XenApp you plan to install
- Align the drives - use DISKPART to create the partition at a 64k offset (align=64), then format it with a 32k allocation unit size
- 1 vCPU - assuming your application can run effectively in single CPU mode. Although there are many valid reasons to NOT do this, I recommend it because this reduces CPU %READY and CPU scheduling requirements by single threading the OS. Of course, if your app set requires multiple processors, or you risk pegging the single CPU, then this may not be an option.
- 2 GB RAM - this is a good starting point for baseline testing. Depending on the Host RAM and the app requirements, you may need to move this up or down accordingly.
- Set Page File Min and Max at 1.5 x RAM
- Install the latest VMTools, including the Memory Manager (aka the Balloon Driver)
  - Please note, there are a lot of recommendations to disable this, but I believe that to be a fallacy:
    - A lot of the information and recommendations out there are still based on ESX 2.x
    - The Balloon Driver is a safety net, which should not normally be called on in a properly designed environment
    - When in doubt, and until proven otherwise, go with the standard package
- Disconnect/Disable the CD-ROM - Windows guests poll CD devices quite frequently. When multiple guests try to access the same physical CD drive, performance suffers. Disabling CD devices in virtual machines when they are not needed alleviates this.
- Adjust the Disk timeout value to the storage vendor's recommended setting, for example: HKLM\SYSTEM\CurrentControlSet\Services\Disk\TimeoutValue = REG_DWORD Hex value 3c (60 seconds)
- Disable the Last Access Time attribute for NTFS. This setting keeps track of the last time a file was accessed; removing the need for the system to keep writing this information may speed up performance. The command is: fsutil behavior set disablelastaccess 1
- Disable screen savers
- Disable animations
- Disable USB
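The command-line parts of the template checklist above can be collected in one place. The following is a minimal Python sketch that only builds the command text and prints it; it does not execute anything. The disk number, the drive letter E:, and the function names are placeholders of mine, not part of the original procedure, and the 60-second timeout should be replaced with whatever your storage vendor recommends.

```python
# Sketch only: builds command text for the template tweaks listed above.
# Nothing is executed; review the output and run it manually in the guest.


def page_file_mb(ram_mb: int, factor: float = 1.5) -> int:
    """Fixed page file size (min = max) as a multiple of guest RAM."""
    return int(ram_mb * factor)


def diskpart_script(disk: int = 1, align_kb: int = 64) -> str:
    """DISKPART script text that creates an aligned primary partition.

    The disk number is a placeholder; confirm it with 'list disk' first.
    """
    return "\n".join([
        f"select disk {disk}",
        f"create partition primary align={align_kb}",
    ])


def guest_commands(drive: str = "E", disk_timeout_s: int = 60) -> list[str]:
    """Commands to run inside the guest. The drive letter is hypothetical."""
    return [
        # Format the aligned partition with a 32k allocation unit size.
        f"format {drive}: /FS:NTFS /A:32K /Q",
        # Disk timeout in seconds (60 decimal = 3c hex).
        'reg add "HKLM\\SYSTEM\\CurrentControlSet\\Services\\Disk" '
        f"/v TimeoutValue /t REG_DWORD /d {disk_timeout_s} /f",
        # Stop NTFS from updating the last access time on every read.
        "fsutil behavior set disablelastaccess 1",
    ]


if __name__ == "__main__":
    print(f"Page file min/max for a 2 GB guest: {page_file_mb(2048)} MB")
    print(diskpart_script())
    for cmd in guest_commands():
        print(cmd)
```

Save the DISKPART lines to a text file and pass it with diskpart /s, or type them interactively. Generating the text first makes it easy to review each setting before it goes into the master template.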
Any and all services which are not needed should be DISABLED. These services consume Memory and CPU cycles, which is not usually noticeable on physical hardware, but the overhead is exacerbated in virtual environments.
- Citrix ActiveSync Service
- Citrix XML Service - enable this on your ZDCs, disable on your production app servers
- DHCP Client - if you are using static IP addresses
- Distributed File System
- Help and Support
- Human Interface Device Access
- Indexing Service
- NetMeeting Remote Desktop Sharing
- Windows Audio - unless, of course, you truly need sound for your apps
- Wireless Zero Configuration
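If you want to script this step rather than click through the Services console, one option is to generate sc config commands for the list above. The sketch below is an illustration only: the service short names are my assumption based on Windows Server 2003 defaults and should be confirmed with sc query on your own build. The Citrix services are left out because their short names vary by XenApp version.

```python
# Display name -> service short name.
# ASSUMPTION: short names reflect Windows Server 2003 defaults; verify each
# one with "sc query" on your own build before disabling anything.
WINDOWS_SERVICES = {
    "DHCP Client": "Dhcp",  # only if you are using static IP addresses
    "Distributed File System": "Dfs",
    "Help and Support": "helpsvc",
    "Human Interface Device Access": "HidServ",
    "Indexing Service": "CiSvc",
    "NetMeeting Remote Desktop Sharing": "mnmsrvc",
    "Windows Audio": "AudioSrv",  # keep if your apps need sound
    "Wireless Zero Configuration": "WZCSVC",
}


def disable_commands(services: dict[str, str]) -> list[str]:
    """Build one 'sc config' command per service; nothing is executed."""
    # sc.exe requires the space after "start=".
    return [f"sc config {name} start= disabled" for name in services.values()]


if __name__ == "__main__":
    for cmd in disable_commands(WINDOWS_SERVICES):
        print(cmd)
```

Remove any entry you still need from the dictionary before generating the commands, and test the result on a clone of the template first.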
When virtualizing, the key is to scale out, not up. Sure, a physical machine with 2 CPUs and 4 GB RAM has more horsepower than a 1 CPU / 2 GB RAM VM, but when you can fit 10 of those VMs on a single host, you get much greater density while spreading the load around.
This is one of the larger hurdles in any virtualization project: moving past the habit of assigning large amounts of CPU and RAM. Once you accept that smaller may be better, you see the value of scaling out (more units) instead of up (larger units).
I have found 3 GB is a nice "sweet spot," depending on your specific application requirements. If you are running on a physical Host with 48 GB, this could allow you to run 10-12 VMs, consuming 30-36 GB, leaving plenty of resources for the host, VMotion, and auxiliary servers if necessary.
My Rule of Thumb: 2 Citrix VMs = 1 Physical Citrix Server. Obviously, this will vary based upon the actual workloads and application demands. For the aforementioned 3 GB Guest on a 48 GB Host, 12 VMs could replace six 1U servers with a single host - saving 5U of valuable space (as well as power and cooling).
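The arithmetic behind these numbers is easy to rerun for your own hardware. Here is a rough Python sketch; the function name and the assumption that the ESX host itself occupies 1U are mine, and the 2:1 ratio is only the rule of thumb stated above, not a measured figure.

```python
def host_sizing(host_ram_gb: int, guest_ram_gb: int, vm_count: int,
                vms_per_physical: int = 2, host_rack_units: int = 1) -> dict:
    """Estimate RAM use and rack savings for one dedicated XenApp host.

    Assumes the physical servers being replaced are 1U each and that the
    ESX host occupies host_rack_units (1U by default, an assumption).
    """
    guest_ram_total = guest_ram_gb * vm_count
    if guest_ram_total > host_ram_gb:
        # The whole point of the design is to NOT over-allocate RAM.
        raise ValueError("guest RAM exceeds host RAM: over-allocated")
    servers_replaced = vm_count // vms_per_physical
    return {
        "guest_ram_gb": guest_ram_total,
        "headroom_gb": host_ram_gb - guest_ram_total,
        "servers_replaced": servers_replaced,
        "rack_units_saved": servers_replaced - host_rack_units,
    }


if __name__ == "__main__":
    for vms in (10, 12):
        print(vms, "VMs:", host_sizing(48, 3, vms))
```

With a 48 GB host and 3 GB guests, 12 VMs use 36 GB, leave 12 GB of headroom, and stand in for six physical servers. The function raises an error instead of returning a result if the guests would over-allocate host RAM.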
You will need to tailor your Load Evaluators to this architecture. Because performance counters are abstracted in a virtual environment, I recommend using a custom evaluator based on Memory Utilization, CPU Utilization, Session Load, and Load Throttling.
Combine this design with the use of VMWare Templates, XenAppPrep, and/or Provisioning Services for rapid deployment and you will have a robust, efficient, and highly flexible VMWare environment for running XenApp.