While working with a customer on a large VDI architecture recently, we were comparing the required storage across several vendors. After looking at the proposed solutions, I was asked the question:
"In regard to the configs I'm really surprised at the low number of spindles relative to the IOPS requirements. Can you please help me understand the PAM a little more?"
The short answer is that the PAM can greatly increase performance in an environment that is heavy on small random reads, like VDI, but it is not the only technology NetApp uses to help optimize storage for VDI.
There are a few things working in NetApp's favor to keep the spindle count low. To begin with, on a NetApp system one or more large pools of disks (aggregates) are created, and thinly provisioned volumes carved from them are striped across the entire aggregate. These volumes then contain one or more LUNs; more on this part later.
The RAID groups that make up these aggregates use RAID-DP, which offers double-disk failure protection like RAID6. Because the NetApp storage system always writes full stripes, it never has to do the read, read, write, write sequence that gives RAID5 its 4:1 write overhead (and RAID6 its 6:1 overhead). In fact, since NetApp can write its metadata anywhere in the file system, the RAID write overhead is 1:1; the only place I have to calculate overhead on RAID-DP is for IOPS, using I = P(N - 2) (where I is total RAID group IOPS, P is single-disk IOPS, and N is the number of disks in the group) to account for the two parity disks.
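To make the IOPS math concrete, here is a small Python sketch of the I = P(N - 2) formula from above. The disk counts and per-disk IOPS figures in the example are illustrative assumptions, not NetApp sizing guidance.

```python
def raid_dp_iops(disks_per_group: int, iops_per_disk: int) -> int:
    """Usable data IOPS for one RAID-DP group: I = P * (N - 2).

    The two parity disks contribute no data IOPS, so they are
    subtracted from the disk count before multiplying.
    """
    if disks_per_group <= 2:
        raise ValueError("RAID-DP needs more than 2 disks (2 are parity)")
    return iops_per_disk * (disks_per_group - 2)

# Example: a 16-disk group of 15K drives, assuming ~175 IOPS per disk
print(raid_dp_iops(16, 175))  # -> 2450
```

Note that this is a read/parity-capacity view only; because full-stripe writes avoid the read-modify-write penalty, no additional write-penalty factor is applied.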
Other technologies on the NetApp storage system combine to reduce the physical size of the working set, including thin cloning and primary storage deduplication.
The first of these, thin cloning, allows a snapshot of a single master copy of a volume (FlexClone) or LUN (LUN clone) to be presented read-write to hosts. Each clone appears to hosts as a separate, full copy of the data, but in fact only the deltas between the old and new blocks are written to disk; all the common OS components that make up a good portion of the working set for VDI remain in the same, single location. This technology is available through the NetApp Rapid Cloning Utility (RCU) for VMware View, or when using Citrix XenServer as the host for a XenDesktop machine. In either case the working set decreases from N × W to W + ((N - 1) × D × W), where N is the number of clones, W is the working set size, and D is the fraction of blocks changed from the master.
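The working-set reduction formula above can be sketched in a few lines of Python. The clone count, working set size, and delta fraction in the example are illustrative assumptions.

```python
def cloned_working_set(n_clones: int, ws_gb: float, delta_frac: float) -> float:
    """Working set after thin cloning: W + (N - 1) * D * W.

    One full master copy (W) plus a delta (D * W) for each of the
    remaining N - 1 clones.
    """
    return ws_gb + (n_clones - 1) * delta_frac * ws_gb

# Example: 100 desktops, 20 GB working set each, 10% delta per clone
full_copies = 100 * 20.0                       # 2000 GB without cloning
thin_clones = cloned_working_set(100, 20.0, 0.10)  # roughly 218 GB
print(full_copies, thin_clones)
```

Even with a generous 10% delta assumption, the shared master copy shrinks the on-disk working set by roughly an order of magnitude, which is a large part of why the spindle count can stay low.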
Data deduplication also plays a role when more than one VM is stored in the same volume. This feature scans a volume for duplicate data blocks, removes all but a master copy, and replaces each duplicate with a metadata pointer back to that copy. Like thin cloning, this feature is possible because of NetApp's ability to store metadata anywhere within the file system; creating pointers is nothing unusual given the structure of the WAFL file system. In a VDI scenario deduplication helps contain the size of the deltas between VMs by removing duplicate blocks created by OS or software updates, but this is a batch process, so short-lived data structures may not benefit.
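The scan-and-pointer idea can be illustrated with a toy Python sketch. This is only a conceptual model of batch deduplication, not WAFL's actual implementation; the fingerprinting scheme and block size here are assumptions for illustration.

```python
import hashlib

def deduplicate(blocks: list) -> tuple:
    """Toy batch dedup: keep one physical copy per unique block,
    and record a pointer (here, a fingerprint) for every logical block."""
    store = {}      # fingerprint -> the single physical ("master") copy
    pointers = []   # one pointer per logical block, back to its master
    for blk in blocks:
        fp = hashlib.sha256(blk).hexdigest()
        store.setdefault(fp, blk)   # first occurrence becomes the master
        pointers.append(fp)
    return store, pointers

# Two VMs share an identical "OS" block; only one physical copy survives
data = [b"OS" * 2048, b"app" * 1365, b"OS" * 2048]
store, ptrs = deduplicate(data)
print(len(data), len(store))  # 3 logical blocks, 2 physical blocks
```

A real implementation must also guard against fingerprint collisions (for example by byte-comparing candidate blocks before sharing them), which the sketch omits.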
So now that we have reduced the size of the working set, let's talk about the PAM card: 16GB of DRAM on a PCI-E card, coupled with FlexScale software, acting as an intelligent read cache. For folks from the server world, the PAM operates like an L2 cache on a processor, an accelerator sitting between the controller RAM and the data on disk. There are two modes that are interesting in this discussion: default mode and metadata-only mode.

In default mode the PAM card caches ONLY small random reads and metadata. This keeps a large majority of the pointers used for deduplication and thin cloning very close to the controller RAM, which already caches the MOST frequently used data and metadata. If thin cloning and deduplication are used intelligently with an optimized configuration, this mode can retain a large number of the random read blocks in the PAM, greatly reducing the time users wait for blocks to come all the way from disk. This is the mode I would use with VMware View or persistent XenDesktop VMs, and it is the mode that helps the most with events like boot storms in View and persistent VDI scenarios.

Metadata-only mode is for a working set so large that the PAM cannot hold enough of it to avoid simply churning through the cached data. Metadata is cached in the PAM while data blocks are not, giving instant access to metadata blocks and a shorter access time for data stored on disk. This is the mode I would use with XenDesktop when each VM is configured to store its own Provisioning Services write cache on the SAN, since that cache is unique to each VM.
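The two admission policies can be sketched as a toy LRU read cache in Python. This is purely an illustrative model of the behavior described above, not NetApp's FlexScale code; the class name and flags are invented for the example.

```python
from collections import OrderedDict

class PamLikeCache:
    """Toy LRU read cache with two admission policies, loosely modeling
    the default and metadata-only FlexScale modes (illustrative only)."""

    def __init__(self, capacity_blocks: int, metadata_only: bool = False):
        self.capacity = capacity_blocks
        self.metadata_only = metadata_only
        self.lru = OrderedDict()  # key -> cached block, oldest first

    def admit(self, key, block, is_metadata: bool, is_small_random: bool):
        # Metadata-only mode: cache metadata alone, so a huge data
        # working set cannot churn the pointer blocks out of the cache.
        if self.metadata_only and not is_metadata:
            return
        # Default mode: cache metadata and small random reads only;
        # large sequential reads bypass the cache in both modes.
        if not (is_metadata or is_small_random):
            return
        self.lru[key] = block
        self.lru.move_to_end(key)
        while len(self.lru) > self.capacity:
            self.lru.popitem(last=False)  # evict least recently used

    def read(self, key):
        if key in self.lru:
            self.lru.move_to_end(key)
            return self.lru[key]  # hit: served without touching disk
        return None               # miss: caller must go to disk
```

In this model a boot storm of shared OS blocks keeps hitting the default-mode cache, while in metadata-only mode the unique per-VM write-cache data is deliberately never admitted.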
VMware RCU: http://blogs.netapp.com/virtualization/2009/03/netapp-and-vmware-view-vdi-best-practices-for-solution-architecture-deployment-and-management-part-8.html
XenDesktop on NetApp: http://www.citrix.com/site/resources/dynamic/partnerDocs/CitrixXD2.0withNetAppStoragePilotDeploymentOverview.pdf