Published: 12 Aug 2007
Not all data is created equal, and not all of it is necessary for business continuity, so a tiered data recovery plan is often the most effective approach.
A data recovery solution, whether remote or local, is a set of technologies that supports your disaster recovery (DR) strategy. Together, the solution and the strategy form the backbone of your company's business-continuity program. Your data recovery solution must be as crisp and predictable as possible because there's no room for error and no second chance when it's needed. Is your data recovery solution working to make your DR strategy successful?
Data recovery by tape alone isn't a complete data recovery solution for DR because you need your data right away when disaster strikes. Whether you're recovering data from a few feet away or a few hundred miles away, relying on tape to recover your core apps and data is risky and time consuming. The truck carrying the tapes might not reach its destination in time, environmental factors can damage the media, or the tapes may simply be unreadable. You should use tape as a backup (literally), not as a primary recovery solution.
Keep it simple
A well-implemented solution doesn't have to be complex. A simple data recovery solution has the least amount of customization and is implemented with out-of-the-box technologies. A data recovery solution shouldn't end up as a garage project or college thesis. If you need to create full-time positions to support your data recovery solution, it's time for a reality check. The goal isn't to prove how well someone can script, but to have a solution that works with as little effort as possible.
When recovering data during a disaster, you need to do so swiftly and with 100% success. A simple solution lets you focus on the nontechnical aspects of getting the business back up and running. Your plan shouldn't have too many hidden dependencies, customized components or one-offs, but don't oversimplify and leave out critical details. Keep in mind that the person executing the recovery is unlikely to be the one who designed or implemented the plan.
The learning curve for out-of-the-box solutions isn't that steep, either. They're often well documented, and most vendors offer some sort of formal training, which lets you have more than one person trained in managing them.
Go multivendor, but don't overdo it
Don't rely on a single vendor for your recovery plan. The luxury of calling a single support number isn't much of an advantage if the vendor is incapable of providing the right set of technologies.
While vendors might like you to believe otherwise, no single vendor can supply a set of technologies so complete that you never need to look elsewhere when building a data replication strategy. Most DR experts will tell you to avoid putting all of your eggs in one basket. You don't have to be obsessive about being multivendor, however; in some cases, a single-vendor strategy will suffice. If you're putting a new plan in place, you should try to evaluate "best-of-breed" products that--with perhaps a little effort--can be made to work together to meet your DR objectives.
Let's take the most common and often contentious example: database replication. I can't tell you how many times I've seen DBAs and storage folks at odds over selecting the best database replication tool. Storage vendors will tout their proprietary block-level replication solutions as superior to built-in database replication solutions such as Oracle's Data Guard. DBAs are more inclined to opt for the latter replication scenario. Why not implement a strategy that makes use of both?
Block-level solutions provide you with a "restartable" copy of your database, while database replication solutions provide a "recoverable" copy. When designing an enterprise-wide solution, I would lean toward an application-agnostic, block-level solution that's closer to the storage array than the app. For example, if you have an EMC environment, consider using either Symmetrix Remote Data Facility (SRDF) or MirrorView. You may want to include the database replicator because you shouldn't assume SRDF will provide an end-to-end solution. The combined approach will give you a restartable copy that can be made "current" to the last good transaction permitted by your recovery point objective (RPO). This isn't a black-and-white solution--the shades of grey are in how you implement the combination.
Interoperability is a key consideration when selecting technologies for a multivendor strategy, and getting it right helps avoid finger-pointing among vendors. For example, the application clustering technology you pair with the data replication solution determines whether the cluster can effectively control replication failover and application startup at the remote site. Most clustering and application vendors now offer geographical clustering plug-ins for their cluster solutions that can be used to control data replication solutions from different vendors. Therefore, the choice of a clustering solution can't be made in a vacuum.
The bottom line is always price. Is the cost of implementing an elegant solution justified? Some business requirements may dictate large expenditures, but don't assume that every recovery plan has to be expensive. When you put all the options on the table, you'll be surprised how many there are. Block-level, array-to-array solutions tend to be the most expensive in terms of storage, licensing costs and the network bandwidth required to keep the two copies in sync. Application replication products are usually cheaper and have less-demanding bandwidth requirements, though each works only with its specific application.
Array-based replication solutions are application-agnostic and can be used to replicate any of the array's contents. But replicating everything on the array is an expensive proposition; if you need to trim costs, you may want to use a more selective approach.
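To make the cost of "replicate everything" concrete, here is a back-of-the-envelope sketch of the bandwidth arithmetic. It isn't a sizing tool: the data set names and daily change volumes are invented for illustration, and the calculation assumes the changed data is shipped evenly over 24 hours with no compression or protocol overhead.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def required_mbps(changed_gb_per_day: float) -> float:
    """Sustained link speed (megabits/s) needed to ship one day's
    changed data within one day, using decimal gigabytes."""
    bits_per_day = changed_gb_per_day * 8 * 1000**3
    return bits_per_day / SECONDS_PER_DAY / 1_000_000


# Hypothetical daily change volumes (GB) for data sets on one array.
daily_change_gb = {
    "orders_db": 120.0,
    "file_shares": 300.0,
    "test_env": 450.0,
}

# Assumption: only the order database is needed to get back in business.
critical = {"orders_db"}

everything = required_mbps(sum(daily_change_gb.values()))
selective = required_mbps(
    sum(gb for name, gb in daily_change_gb.items() if name in critical)
)

print(f"Replicate everything: {everything:.1f} Mbit/s")
print(f"Replicate critical only: {selective:.1f} Mbit/s")
```

With these made-up numbers, replicating the whole array needs roughly seven times the sustained bandwidth of replicating only the critical data set. Real links also have to absorb peaks in the change rate, so treat the result as a floor rather than a recommendation.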
Tiered data recovery
You don't need 100% of your production data at the remote location for a partial or near full recovery--just enough to put you back in business. The effort, therefore, involves identifying components that can be left behind or those that can be reconstructed using relational data. Databases are the most common examples here. File and print services are another. Keep in mind that we're not talking about losing this data for good; we're only deferring its recovery to a later point in time, perhaps using tape.
Just as production storage is tiered using service-level agreements, data recovery can be offered in a tiered manner and tied into the overall storage service model. Data supporting the applications that form the backbone of your business can be replicated in a synchronous or near-synchronous manner that's capable of providing "zero transactional loss." Second-tier applications may be replicated asynchronously in an "always on" or periodic "sync-split-sync" manner. Those approaches can provide an RPO in which losing a few transactions is considered normal or not business critical. Moreover, the lost information can be reconstructed using indirect methods or a reload of data from sources outside of your company. Of course, if the application data can withstand a few days of downtime, it's a candidate for recovery by tape.
An important technological innovation that has only recently been applied to data storage is quality of service (QoS). QoS allows you to choose which class of data gets preference over other classes. The array or network gear is then capable of serving as a traffic cop to ensure that the most important class of data reaches its destination first.
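The tiering described above can be sketched in a few lines of Python. This is only an illustration of the idea: the thresholds, application names and the `recovery_tier` helper are all assumptions, and your own tiers should come from your service-level agreements. The final sort mimics the "traffic cop" role of QoS by putting the most important class of data first.

```python
from dataclasses import dataclass

# Illustrative thresholds only; real values come from your SLAs.
NEAR_SYNC_RPO_MINUTES = 1    # at or below this, treat as "zero loss"
TAPE_DOWNTIME_HOURS = 48     # "a few days" of tolerable downtime


@dataclass
class App:
    name: str
    rpo_minutes: float         # how much data loss is tolerable
    max_downtime_hours: float  # how long the app can stay down


def recovery_tier(app: App) -> str:
    """Map an application's tolerances to a recovery method."""
    if app.rpo_minutes <= NEAR_SYNC_RPO_MINUTES:
        return "synchronous"
    if app.max_downtime_hours >= TAPE_DOWNTIME_HOURS:
        return "tape"
    return "asynchronous"


# Lower number = higher priority on the replication link.
PRIORITY = {"synchronous": 0, "asynchronous": 1, "tape": 2}

# Hypothetical applications.
apps = [
    App("archive", rpo_minutes=1440, max_downtime_hours=72),
    App("reporting", rpo_minutes=30, max_downtime_hours=8),
    App("billing", rpo_minutes=0, max_downtime_hours=1),
]

# Most important class of data first, as a QoS policy would order it.
plan = sorted(apps, key=lambda a: PRIORITY[recovery_tier(a)])

for app in plan:
    print(f"{app.name}: {recovery_tier(app)}")
```

The value of writing the rules down this way is that the tier for each application becomes an explicit, reviewable decision rather than something implied by whichever replication product happens to be in place.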
It's easy to assume that just because you have a data recovery solution in place, it will work in the manner envisioned by your DR strategy. But the old adage that practice makes perfect is particularly appropriate for a DR plan: Put it through its paces and make it work for you.