Wednesday, October 28, 2009

Cluster Databases Instead of Servers … Brilliant!

By: Rik Hoffelder
Among the major architectural changes in Microsoft's Exchange Server 2010, one of my favorites is the Database Availability Group, or DAG. What is a DAG? Simply put, it is a failover cluster for Exchange 2010 mailbox stores. So? Exchange has supported clustering mailbox servers for years, so what's different about this? Ah, but there is the answer: it clusters mailbox databases, not mailbox servers. This allows you to support the Mailbox, Hub Transport, and Client Access Server roles on the same server, reducing hardware requirements, administration, and complexity while providing high availability and site resiliency for a lot less investment. To top that off, you can create the cluster after Exchange 2010 has been installed, so you can build one server today, add another months later, and still cluster the mailbox stores. This is part of a new Exchange design methodology called Incremental Deployment.

To make this possible, the Exchange Design Team made significant changes to the database structure and database objects. First, the mailbox database structure was flattened to improve performance, allowing it to run on cheaper SATA/Tier 2 drives. Because of this improvement, Exchange 2010 can support up to 100 databases on a single Enterprise Edition server (Standard Edition still only supports 5), with the right storage design of course. Next, Storage Groups were removed from the architecture, forcing a 1-to-1 relationship between each database and its transaction logs. Exchange 2000, 2003, and 2007 supported multiple mailbox stores in a single storage group, all sharing the same set of transaction logs, which has presented its fair share of problems in recovery scenarios, but that's a different topic!
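
To make the storage group change concrete, here is a minimal Python sketch of the two layouts. It is only an illustrative model: the class and function names are hypothetical and are not part of any Exchange API, and the only facts it encodes are the ones stated above (shared logs per storage group before 2010, one log stream per database in 2010, and the 100/5 database limits).

```python
from dataclasses import dataclass, field

# Per-server database limits as stated in the post.
MAX_DATABASES = {"Enterprise": 100, "Standard": 5}


@dataclass
class LogStream:
    """One set of transaction logs (hypothetical model, not an Exchange object)."""
    prefix: str


@dataclass
class StorageGroup:
    """Exchange 2000/2003/2007: several databases may share one log stream."""
    log_stream: LogStream
    databases: list = field(default_factory=list)


@dataclass
class MailboxDatabase2010:
    """Exchange 2010: no storage group, each database owns its own log stream."""
    name: str
    log_stream: LogStream


def can_add_database(edition: str, current_count: int) -> bool:
    """Return True if one more database fits under the edition's limit."""
    return current_count < MAX_DATABASES[edition]
```

With this model, `can_add_database("Standard", 5)` is false while `can_add_database("Enterprise", 99)` is true, and a 2010 database can be moved or replicated together with its logs without dragging sibling databases along.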

The removal of storage groups was necessary to support DAGs. A DAG uses a replication technology based on a combination of Cluster Continuous Replication (CCR) and Standby Continuous Replication (SCR). CCR, first introduced in Exchange 2007, allowed you to create a "shared nothing" cluster, meaning no shared storage, and used log shipping to maintain the passive copy of the database. CCR only supported the mailbox server role, so SCR was introduced in Exchange 2007 Service Pack 1 to provide replication between servers hosting the mailbox, hub, and CAS roles. Unlike a cluster, however, SCR failover was manual and did not maintain the original server name. SCR could be used in conjunction with CCR to provide site resiliency and used the same log shipping technology. To implement either replication technology you could have only one database per storage group anyway.

The combined CCR/SCR technology, simply named continuous replication, includes some big changes as well. First, the elimination of storage groups allows replication to occur at the database level, so a database can be replicated to as many as 16 servers. Log shipping no longer occurs over an SMB connection; rather, it uses an administrator-defined TCP port for data transfer that is compressed and encrypted. Unlike Exchange 2007, Exchange 2010 uses a push model for log shipping, ensuring all passive copies are kept up to date. Database seeding can also be performed from a passive copy of the mailbox database, eliminating the performance hit on the active copy.
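
The push model described above can be sketched roughly as follows. This is a toy illustration, not how the Exchange replication service is implemented: the class names are made up, `zlib` stands in for whatever compression Exchange actually uses, encryption and the TCP transport are left out entirely, and the 16-copy cap is assumed to include the active copy.

```python
import zlib

MAX_COPIES = 16  # active copy plus passive copies (assumed reading of the limit)


class DatabaseCopy:
    """A passive copy that replays whatever logs are pushed to it."""

    def __init__(self, server):
        self.server = server
        self.logs = []

    def receive(self, payload: bytes):
        # Decompress the shipped log and keep it for replay.
        self.logs.append(zlib.decompress(payload))


class ActiveDatabase:
    """The active copy; it pushes each closed log to every passive copy."""

    def __init__(self, server):
        self.server = server
        self.logs = []
        self.passives = []

    def add_copy(self, copy: DatabaseCopy):
        if 1 + len(self.passives) >= MAX_COPIES:
            raise ValueError("copy limit reached")
        self.passives.append(copy)

    def commit(self, log: bytes):
        self.logs.append(log)
        payload = zlib.compress(log)  # compress once, ship to all copies
        for copy in self.passives:
            copy.receive(payload)
```

The point of the sketch is the direction of the data flow: the active copy drives replication to every passive copy, rather than each passive copy fetching logs on its own schedule.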

Exchange 2010 does not support continuous replication of public folder databases, but it does allow a separate copy of a public folder database to exist on each DAG member, with public folder replication enabled. Exchange 2007 CCR only allowed a single public folder database per cluster, requiring a separate mailbox server to host public folder replicas to ensure availability. You may have noticed that I didn't mention Local Continuous Replication (LCR) or Single Copy Clusters (SCC). Those features have been removed in Exchange 2010.

Finally, the Mailbox and Public Folder store objects were moved to the Organization level of the Exchange Forest (yep, new term in 2010). The database is no longer a subordinate of a server; it is a peer. Because of this and the features of the Windows 2008 Failover Cluster service, you are able to cluster the databases across as many as 16 servers. And just to get a little crazy here, those 16 servers could be in 16 different datacenters! Not exactly a recommended scenario, but it is possible. All 16 members of the DAG maintain the same name, so no reconfiguration of legacy Outlook clients is required, as was the case with SCR.
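
One way to picture the "database as a peer" idea is the small Python model below. Everything in it is hypothetical naming for illustration: a database holds references to the DAG members that host its copies instead of living underneath a single server object, and picking the next server in list order on failover is a simplifying assumption, not a description of how Exchange selects a copy.

```python
class Dag:
    MAX_MEMBERS = 16  # member limit stated in the post

    def __init__(self, name):
        self.name = name
        self.members = []

    def add_member(self, server):
        if len(self.members) >= self.MAX_MEMBERS:
            raise ValueError("a DAG holds at most 16 servers")
        self.members.append(server)


class MailboxDatabase:
    """Organization-level object: it points at servers, it is not owned by one."""

    def __init__(self, name, dag, copy_servers):
        missing = [s for s in copy_servers if s not in dag.members]
        if missing:
            raise ValueError(f"not DAG members: {missing}")
        self.name = name
        self.copy_servers = list(copy_servers)
        self.active_on = self.copy_servers[0]

    def fail_over(self, failed_server):
        """Activate another copy if the failed server held the active one."""
        if self.active_on != failed_server:
            return self.active_on
        for server in self.copy_servers:
            if server != failed_server:
                self.active_on = server
                return server
        raise RuntimeError("no surviving copy")
```

Note that the database's name never changes during `fail_over`; only `active_on` moves, which is the property that spares clients from being reconfigured.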

To provide the greatest level of resiliency, changes were made to the hub transport role to help prevent data loss. Exchange 2007 CCR clusters used a feature of the Hub Transport role called the transport dumpster to help backfill missing messages in the event of a lossy failover. A lossy failover means that transaction log data from the primary node had not been shipped to the secondary prior to failure, resulting in data loss. To mitigate data loss, the transport dumpster could redeliver up to seven days' worth of messages, thus backfilling the database. Exchange 2010 still supports this but adds some additional protection.
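
The backfill after a lossy failover amounts to "redeliver whatever the dumpster still holds that the newly active copy is missing". A minimal sketch, assuming a made-up message representation (dicts with `id` and `delivered` keys) and the seven-day window mentioned above:

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=7)  # seven days' worth of messages, per the post


def backfill(database_ids, dumpster, now):
    """Return dumpster messages to redeliver after a lossy failover.

    database_ids: ids of messages already present in the newly active copy
    dumpster:     list of dicts with 'id' and 'delivered' (datetime) keys
    now:          time of the redelivery request
    """
    return [
        msg for msg in dumpster
        if now - msg["delivered"] <= RETENTION  # still inside the window
        and msg["id"] not in database_ids       # lost in the failover
    ]
```

Messages that already made it into the surviving copy are skipped, and anything older than the retention window is no longer available to redeliver.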

When the hub transport role is installed on a server that is also a DAG member, it will reroute all messages destined for local mailbox databases in the DAG to another hub transport server within the Active Directory site to ensure a copy is stored in a second transport dumpster. This protects against the loss of the entire server: if a database failover occurs, another transport dumpster can still redeliver the message. The hub transport server also monitors the replication of delivered messages and will remove a message from the dumpster once it has been replicated to all copies of the mailbox database within the DAG.
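
The "remove it once every copy has it" behavior can be modeled as simple bookkeeping. The sketch below is an assumption-laden illustration with hypothetical names; it tracks, per message, which database copies have not yet confirmed replication, and drops the message when that set becomes empty.

```python
class TransportDumpster:
    """Toy model: keep a message until every database copy has replicated it."""

    def __init__(self, copy_servers):
        self.copy_servers = set(copy_servers)
        self.pending = {}  # message id -> servers that have not replicated it yet

    def store(self, msg_id):
        self.pending[msg_id] = set(self.copy_servers)

    def acknowledge(self, msg_id, server):
        """Record that `server` now holds the message; drop it when all do."""
        waiting = self.pending.get(msg_id)
        if waiting is None:
            return  # unknown or already removed
        waiting.discard(server)
        if not waiting:
            del self.pending[msg_id]
```

Until the last copy acknowledges, the message stays available for redelivery, which is what covers the window in which a failover could otherwise lose it.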

As you can see, this really makes building a highly available, highly resilient Exchange architecture simple and darn near bulletproof without a lot of cost or complications. From the testing and playing I have done, it is very simple and quick to set up, with a lot less hassle and time than the clusters of yore. This makes me want to tell all my customers … DAG - Nab It!
