With so much emphasis placed on building out a disaster recovery plan, organizations often forget that they need to assess even the most detailed plans to ensure that everything will work. Without a disaster recovery testing strategy, there is no guarantee that all of your planning will be worth it in the end.
It's a common tale: An organization needs to plan what steps to take when a disaster -- of varying types, scopes and levels of impact -- strikes. Typically, an organization puts quite a bit of effort into conducting a business impact analysis and a risk assessment, and then creating the plan itself.
An untested plan is little more than a conceptual set of steps. Some parts of it will probably work, but as your environment changes over time and business needs evolve, there is no guarantee that an untested plan will hold up.
So, you need to test the plan. You've got to prove to yourself and to the powers that be that your plan will work. The whole point of a DR plan is to proactively mitigate the risks associated with a disaster, and testing the plan mitigates the risk that the plan won't work.
There are four high-level steps that can point you down the right path toward establishing a proper disaster recovery testing strategy.
Decide what parts of the plan need to be tested
Your DR testing may include anything from a single system to a multi-tiered application to an entire environment. Depending on what's considered critical to your organization, you need to first define what needs to be tested. Be sure to include dependencies in your thinking. For example, if Exchange relies on Active Directory and DNS records, make a note of that.
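As a rough illustration of tracking dependencies, the sketch below expands a list of test targets into everything they rely on and orders the result so that dependencies come first. The dependency map is hypothetical: the Exchange entry follows the example above, while the Active Directory to DNS link is an added assumption for illustration. The function name `dr_test_scope` is invented here, and the sketch assumes Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each workload lists what it relies on.
# The Active Directory -> DNS entry is an illustrative assumption.
DEPENDENCIES = {
    "Exchange": {"Active Directory", "DNS"},
    "Active Directory": {"DNS"},
    "DNS": set(),
}


def dr_test_scope(targets, dependencies):
    """Return the targets plus all of their dependencies, dependencies first."""
    needed = set()
    stack = list(targets)
    # Walk the map to collect direct and indirect dependencies.
    while stack:
        item = stack.pop()
        if item not in needed:
            needed.add(item)
            stack.extend(dependencies.get(item, ()))
    # Order the collected items so each one follows what it relies on.
    subgraph = {name: dependencies.get(name, set()) for name in needed}
    return list(TopologicalSorter(subgraph).static_order())


scope = dr_test_scope(["Exchange"], DEPENDENCIES)
print(scope)
```

With this map, testing Exchange pulls DNS and Active Directory into scope ahead of it. A real inventory would come from your own documentation or configuration management records rather than a hard-coded dictionary.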
Determine testing frequency
There's always the question of how often a DR plan should be tested. In your disaster recovery testing strategy, base the frequency of testing on how often the plan changes. Workloads that don't change at all probably need DR testing only once a year. When systems, applications and platforms change, the DR plan gets updated, which means it needs to be tested again. This decision is not necessarily final, either. You can review the need to test the DR plan quarterly or semiannually, depending on the criticality of the workloads in question.
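One way to make that rule concrete is a small check along the following lines. It is a minimal sketch, not a prescribed policy: the `Workload` fields, the `dr_test_due` function and the 365-day interval are assumptions chosen to mirror the guidance above (retest when the plan has changed since the last test, otherwise about once a year).

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Assumed baseline interval for workloads whose plan has not changed.
ANNUAL = timedelta(days=365)


@dataclass
class Workload:
    name: str
    last_tested: date
    plan_last_changed: Optional[date] = None  # None means no recorded change


def dr_test_due(workload: Workload, today: date) -> bool:
    """Return True if the workload's DR plan should be tested again."""
    changed = workload.plan_last_changed
    # A plan updated after its last test has not been validated yet.
    if changed is not None and changed > workload.last_tested:
        return True
    # Otherwise fall back to the annual baseline.
    return today - workload.last_tested >= ANNUAL


today = date(2024, 6, 1)
workloads = [
    Workload("file-archive", last_tested=date(2023, 5, 1)),
    Workload("crm", last_tested=date(2024, 3, 1), plan_last_changed=date(2024, 4, 15)),
    Workload("intranet", last_tested=date(2024, 3, 1)),
]
due = [w.name for w in workloads if dr_test_due(w, today)]
print(due)
```

Running a check like this at each quarterly or semiannual review would flag the workloads whose plans have drifted since their last test. The workload names and dates are made up for the example.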
Choose your testing method
There are four commonly accepted ways to perform DR testing. As you consider which one is right for you, keep in mind that the goal is always to validate that the plan will actually work in practical execution. The four common methods are:
- Simple plan review -- This is about as basic as it sounds. The DR team reads through the plan and flags any parts that are outdated or missing.
- Tabletop run-through -- The DR team walks through the plan as if it were being executed, discussing the steps and identifying any potential issues. Often, this is done using recovery scenarios to ensure the plan will work for specific disaster circumstances.
- DR scenario simulations -- This is an actual execution of the DR plan in a non-production DR environment. It is usually limited to specific workloads, systems or applications and does not include the entire environment.
- Full DR simulation -- Same as the previous method, but you're attempting to recover everything in a scenario where there is a total loss of operations and location.
Update your DR plan
Your tests will show either that the plan is spot on and needs no adjusting or, more likely, that it has deficiencies, errors and omissions that need to be addressed. So, set aside some time after testing to update the DR plan so that it reflects any changes necessary to ensure a successful recovery down the line.
Testing is likely the most important part of your DR plan. Without it, you truly have no idea whether the plan is going to work. So, if you don't currently have a disaster recovery testing strategy included in your plan, it would be highly beneficial to add one. That way, you have a good idea of what to expect when it comes time to actually recover.