Buffington: DR test requires an 'honest look' at your infrastructure

In this Q&A, Jason Buffington, a senior analyst at ESG, discusses some DR test best practices and technologies that are making DR testing easier.

A disaster recovery test can reveal issues with the mechanisms put in place to restore your organization's IT operations following an outage. Will the system work as planned? Are you able to bring systems back online to meet recovery time objectives and recovery point objectives? The issue becomes murkier if you turn to a third party to handle disaster recovery operations for you -- who is responsible for what process, and how can your organization be sure it will get what it pays for?

How often should you test?

Jason Buffington: I would not expect someone to do a full-scale DR test any more frequently than every six to 12 months, and that's based on the criticality of the data and the industry that they are in.

However, even though you won't run a full-scale test, consider the idea of randomly picking a server or a core application on a monthly basis … and, on a recurring basis, test something different, a small component of the plan. That, coupled with an annual full-scale test, will ensure most pieces and parts are covered.

What a lot of people don't understand is that the resiliency that you get in business continuity planning and recovery preparation also yields operational benefits year-round.
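
The monthly spot-check idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `monthly_rotation`, the component names and the rule that every component is tested once before any is repeated are hypothetical choices, not something Buffington prescribes. The annual full-scale test would sit alongside this schedule.

```python
import random

def monthly_rotation(components, months=12, seed=None):
    """Pick one component per month at random, covering all before repeating."""
    if not components:
        raise ValueError("need at least one component to test")
    rng = random.Random(seed)
    schedule, pool = [], []
    for month in range(1, months + 1):
        if not pool:
            # Every component has been tested once: reshuffle and start over.
            pool = list(components)
            rng.shuffle(pool)
        schedule.append((month, pool.pop()))
    return schedule

# Illustrative component names; replace with your own inventory.
components = ["order-db", "email", "file-server", "payroll-app"]
for month, target in monthly_rotation(components, seed=1):
    print(f"month {month:2d}: spot-test {target}")
```

Drawing from a shuffled pool rather than picking independently each month keeps the selection random while making sure no server or application goes untested for long.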

How do you determine what you should be testing?

Buffington: It's OK to take an honest and authentic look at your infrastructure and say, "There's maybe 30, 40 percent of these servers and applications that we really need to make sure we have an additional measure of effort to ensure resiliency for." So, whatever percentage of your infrastructure you believe requires less than, say, a three-day recovery window, let's label that "important." That stuff deserves an extra level of resiliency and preparation and should be part of the test plan.
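
As a rough sketch of that triage, the Python below labels a system "important" when its recovery window is shorter than three days and reports what share of the inventory that covers. The `System` class, the inventory entries and their recovery windows are made up for illustration; only the three-day threshold comes from the answer above.

```python
from dataclasses import dataclass
from datetime import timedelta

# Threshold from the "three-day recovery window" mentioned above.
IMPORTANT_WINDOW = timedelta(days=3)

@dataclass(frozen=True)
class System:
    name: str
    recovery_window: timedelta  # longest outage the business can tolerate

def important_systems(inventory):
    """Return the systems whose recovery window is shorter than the threshold."""
    return [s for s in inventory if s.recovery_window < IMPORTANT_WINDOW]

# Hypothetical inventory; names and windows are illustrative only.
inventory = [
    System("order-db", timedelta(hours=4)),
    System("email", timedelta(days=1)),
    System("file-archive", timedelta(days=14)),
    System("intranet-wiki", timedelta(days=7)),
]

test_scope = important_systems(inventory)
share = len(test_scope) / len(inventory)
print([s.name for s in test_scope], f"{share:.0%} of inventory")
```

The resulting list is the part of the infrastructure that belongs in the test plan; everything else can be recovered on a slower schedule.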

How is DR testing easier today than it was in the past?

Buffington: The hardest part of any data recovery exercise is the initial standup of infrastructure. Server virtualization makes all that portable. I don't have to rebuild physical servers; I can move VMs or copy VMs. So because of that, the testing process is much more straightforward. For some folks, it may be as simple as having a copy of those VMs someplace else, isolating them and then powering them up. It really doesn't have to be that hard.

Some backup solutions even come with automated validity checks to make sure VMs are recoverable. Some of those solutions give you the ability to sandbox a set of VMs. You can bring them up in a silo isolated from everything else, make sure they work and power them all down again.

BC/DR is often presumed to be so complex and so expensive that most people aren't willing to try it. And the reality is, with virtualization and the robust backup and replication tools that are in the market today, BC/DR is viable and obtainable for mid-size organizations and enterprises. You just have to stop presuming it's hard.

Has the cloud made DR easier?

Buffington: Using the cloud as an infrastructure stack is one of the best things that an organization can do, because one of the challenges of BC/DR is that you need a second site. And mid-size organizations often don't have one. Even enterprises have to justify how much money they want to put into maintaining a second infrastructure.

So the idea of an elastic, cloud-based infrastructure that costs you pennies when you're not using it, and is ready to go when you need it, is a godsend. However, from an expertise and a personnel perspective, the cloud is not a silver bullet, because most cloud providers do not have BC/DR expertise to help you get back up and running as quickly as your business demands. Also, no one cares as much about your organization's ability to restore service as you do. So people shouldn't think they can just write a check on a monthly basis and not worry about it anymore.

Organizations still need to own their own BC/DR plan. A cloud provider may be able to supplement your expertise, but your recovery plan is your recovery plan.

Do testing measures vary among cloud DR providers?

Buffington: One of the things you should look for is that a lot of data protection service providers [have] a wide range of features. They charge you next to nothing to push your data into their cloud, and they want to charge you inordinately on the way back out.

One of the challenges is that during a recovery exercise, depending on the model, you might incur quite a bit in recovery costs and face sticker shock afterwards. You want to be sure that the service provider you choose supports the idea that you'll annually test your plan and routinely test specific parts of the plan. You need to make sure that your data protection service provider has an economic model that doesn't hinder you from doing the responsible thing of testing.

So, with cloud DR testing, it's less about ease, and more that other concerns are at play?

Buffington: When it comes to BC/DR, you still own the plan. If you're efficient with your IT deployment models, then you can probably do it yourself at a secondary [site] or one of your existing sites. The question is, do your IT team and your executive team have the appetite for maintaining that second infrastructure? Or would they rather offload the secondary site to the cloud?

It's really about the culture of your IT team, it's about how distributed your workforce is … it's not economics; you can't write a check and have the problem go away.
