Wide-area network (WAN) clustering involves clustering across physical locations to provide transparent failover between sites. Jeff Boles, senior analyst with Taneja Group, discusses WAN clustering in this Q&A.
WAN clustering, also known as geoclustering, consists of clustering across physical locations. Its popularity has been driven by affordable technologies such as Double-Take and by increased support in products like Microsoft Exchange. That said, WAN clustering today is still somewhat exotic because it requires high-speed connections between nodes. Those connections have become more readily available now that technologies like dense and coarse wavelength-division multiplexing (DWDM and CWDM) and metropolitan Ethernet have become mainstream.
The core challenge with this technology is making the system state and the data supporting an application concurrent on two different systems at two different locations because that's what ultimately makes transparent failover possible.
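Distance is one reason keeping two sites concurrent is hard: a synchronous write cannot be acknowledged until the remote site confirms it. The following is a minimal sketch of that latency floor, not a vendor formula. It assumes light travels roughly 200 km per millisecond in optical fiber; the function name, the constant, and the `round_trips` parameter are illustrative assumptions.

```python
# Rough lower bound on the latency that synchronous replication adds per write.
# Assumption: light in optical fiber covers about 200 km per millisecond
# (roughly two-thirds of its speed in a vacuum). Real links add switching,
# protocol, and storage delays on top of this floor.

FIBER_KM_PER_MS = 200.0  # approximate, illustrative constant


def sync_write_penalty_ms(fiber_km: float, round_trips: int = 1) -> float:
    """Return the minimum extra latency per write, in milliseconds.

    fiber_km    -- length of the fiber path between sites (not straight-line distance)
    round_trips -- how many request/acknowledge exchanges one write needs
                   (hypothetical parameter; it depends on the replication protocol)
    """
    if fiber_km < 0 or round_trips < 1:
        raise ValueError("fiber_km must be >= 0 and round_trips must be >= 1")
    one_way_ms = fiber_km / FIBER_KM_PER_MS
    return 2 * one_way_ms * round_trips


if __name__ == "__main__":
    for km in (10, 100, 1000):
        print(f"{km} km: at least {sync_write_penalty_ms(km):.2f} ms added per write")
```

Under these assumptions, metro-scale distances add a fraction of a millisecond per write, while sites a thousand kilometers apart add several milliseconds to every write, which is often where asynchronous approaches start to look attractive.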
On top of comprehensive replication, many of these solutions use a single virtualized master identity to make redirects upon failover completely transparent. That gives you a real clustering approach in which a node can fail over across locations in a way that's totally transparent to your organization. Mileage will vary as to whether that transparency can compensate for a complete site failure, and doing so may require some sophisticated engineering at the network layer.
There are WAN clustering options available for everything from mainframes to file servers. Also, WAN clustering options have emerged in the application stack.
I think the main drawback is the complexity of the choices available. I'll just sum up how I feel about the space and this will paint a clear picture of the benefits and drawbacks.
WAN clustering can be applied at several layers in the infrastructure. There's also a range of WAN clustering capability, from full, real-time WAN clustering to near-WAN cluster emulation by point-in-time and asynchronous solutions. The latter often get marketed under the banner of WAN clustering even though they aren't true seamless-failover WAN solutions. So, being aware of what's available and what you require will help you sort through the variety of WAN clustering options.
WAN clustering can be applied from the host operating system, including the hypervisor, on up to individual application components. In general, it's easier to apply WAN clustering technology at a more specific upper layer from a trusted application vendor. This is because the solution is specifically engineered for that application and is available alongside the application from a "one throat to choke" vendor.
Depending on where and how you deploy WAN clustering in the stack, there are implications for the capability of the cluster. Lower levels, such as the host operating system, may limit your ability to use more than one site in an active-active manner. Leveraging both sites for active processing may require some application and network sophistication on top of the clustering technology, and may create additional network load depending on the design of the cluster. Upper-layer application WAN clustering may give you more capability for active-active processing in the application stack. It can also make multi-site load balancing a little bit easier. But keep in mind that active-active WAN clustering is really the "rocket science" of WAN clusters. You're getting into some pretty sophisticated stuff here that can be complex.
But, in contrast with that rocket science, recognize that not all WAN clustering is the same, and there are degrees of technology for nearly any type of customer that needs offsite protection with some kind of automated failover. So, it pays to be attentive to what you really need from a solution. Can you deal with less than fully transparent failover? Because that complete transparency will come with a significant price tag. If you can tolerate a few lost transactions and some minor disruptions during your recovery, there are lots of options to choose from in the category of asynchronous WAN replication technologies for failover. These solutions can give you near-WAN clustering capabilities. In fact, adding confusion to the market, some of these solutions advertise themselves as WAN clustering or geoclustering. Depending on the speed and latency of the connection and the amount of activity within the application, these solutions may achieve transparent or near-transparent failover. But in most cases, these failovers are moderately disruptive.
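To make "a few lost transactions" concrete, it can help to estimate how far an asynchronous replica falls behind during a busy period. The sketch below is simple arithmetic under stated assumptions, not a model of any particular product: the function names are hypothetical, write and link rates are treated as constant, and compression or deduplication is ignored.

```python
# Back-of-the-envelope estimate of asynchronous replication backlog.
# Assumptions: constant write rate during a burst, constant usable link
# throughput, and no compression or deduplication. All names are illustrative.


def replication_backlog_mb(write_mb_per_s: float,
                           link_mb_per_s: float,
                           burst_seconds: float) -> float:
    """Data written locally but not yet replicated when the burst ends."""
    surplus = write_mb_per_s - link_mb_per_s
    return max(0.0, surplus * burst_seconds)


def drain_seconds(backlog_mb: float,
                  link_mb_per_s: float,
                  steady_write_mb_per_s: float) -> float:
    """Time for the replica to catch up once writes return to a steady rate."""
    if backlog_mb <= 0:
        return 0.0
    spare = link_mb_per_s - steady_write_mb_per_s
    if spare <= 0:
        return float("inf")  # the link never catches up
    return backlog_mb / spare


if __name__ == "__main__":
    backlog = replication_backlog_mb(write_mb_per_s=50, link_mb_per_s=20,
                                     burst_seconds=60)
    print(f"Unreplicated data after the burst: {backlog:.0f} MB")
    print(f"Catch-up time at 5 MB/s steady writes: "
          f"{drain_seconds(backlog, 20, 5):.0f} s")
```

The backlog is roughly the data at risk if the primary site fails at that moment, so a link sized only for average load can leave a large exposure window after every burst.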
Can a product handle all of your data types? For example, if you have encrypted data behind a node and you need to send encrypted data over the wire in flight, can your WAN clustering solution handle that data?
How does the WAN clustering solution behave if connectivity between sites is lost? This is especially important when the complexity of network behavior increases.
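One common way clusters in general handle a broken inter-site link is a majority quorum, often with a tie-breaking witness at a third location, so that two isolated sites cannot both keep serving writes. The sketch below illustrates that general technique only. It is not a description of how any product named in this Q&A behaves, and the function name and vote counts are hypothetical.

```python
# Generic majority-quorum check, a common defense against "split brain"
# when the link between sites fails. Illustrative only; real products
# differ in how votes and witnesses are configured.


def may_stay_active(votes_reachable: int, votes_total: int) -> bool:
    """A partition keeps serving only if it can see a strict majority of votes."""
    if votes_total < 1 or not 0 <= votes_reachable <= votes_total:
        raise ValueError("votes_reachable must be between 0 and votes_total")
    return 2 * votes_reachable > votes_total


if __name__ == "__main__":
    # Hypothetical layout: one node per site plus a witness at a third location.
    total = 3
    print("Site A (sees itself + witness):", may_stay_active(2, total))
    print("Site B (sees only itself):     ", may_stay_active(1, total))

    # Without a witness, neither side of an even split has a majority.
    print("Two sites, no witness:         ", may_stay_active(1, 2))
```

The design choice worth asking a vendor about is what happens in the last case: with an even split and no tie-breaker, a strict-majority rule stops both sites, which is safe for the data but is not transparent failover.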
How sophisticated is the solution at optimizing data transmission across the WAN connection? Can it handle everything you are going to throw at it?
Also, it's important to understand the requirements for hosting data replication. Will it be hosted in the WAN clustering software? Or, do you need to manage different components in your infrastructure to perform replication? If so, what is the hardware compatibility behind the WAN clustering solution and how is it going to integrate with those external components?
Finally, understand how many IT components you are trying to protect both today and going forward. Understand the complexity of your WAN clustering solution management today and down the road. Do you have 50 points of management or can you perform everything from a single pane of glass?
I can tick off a list of vendors, but keep in mind that this is a multidisciplinary effort, and it's going to involve different parts of your organization to really construct a true WAN cluster. You are going to have to involve your data storage team, your software team, your network engineering team, and really look holistically at everything.
When you start looking for solutions on the market, there are a bunch of them out there. They include application-specific solutions such as Oracle Stretch Cluster or some of the Java solutions from Oracle. IBM has a number of solutions as well, for example, some of the WebSphere solutions. Also, Microsoft's Cluster Services have some capability for Exchange geoclustering.
When you start looking into the software layer, you'll find products like NEC's ExpressCluster; InMage's CDP asynchronous WAN solution, which can do some failover and some near-geoclustering; and Double-Take Software. SteelEye has some capability here, Symantec/Veritas has some solutions, and of course there's CA's XOsoft. All of those are viable contenders, and there are likely many more that I didn't list. But that's a good starting point.