Any disaster recovery environment is only as good as its last DR test. In a virtualized disaster recovery environment, testing has a number of advantages that are not available in the physical world.
Virtualization lets you bring up servers and services in isolated networks, which allows functional testing while production is still up and running. The ability to delete and recreate DR servers at will is also something that cannot be done with physical boxes.
So how do administrators and DR managers leverage this virtual panacea of DR? More importantly, what are the key steps to a solid virtualized disaster recovery test?
Beginning the virtual DR testing process
Before getting into the mechanics of implementing a DR test, the person responsible for DR needs to pull together the requirements. Getting everything on track at this stage makes the rest of the DR test go far more smoothly.
First, as with any DR-enabled infrastructure, each server or service should have a business criticality attached to it, and that criticality should map to a tier. Most applications are given a tier rating of one through three. Tier one applications should be tested more frequently than the other tiers.
Second, identify the people involved in the DR test. DR managers must get the business on board. This can be an area of friction, because downtime is never desirable. Plan ahead and get approval for the downtime in writing. Written approval tends to make people take notice and help as needed.
At the same time, this conversation provides the chance to define exactly what the purpose of the virtualized disaster recovery test is and what is being tested. Defining a successful outcome makes building a testing roadmap simpler and more straightforward, because you now know the endgame. Make sure it is documented and approved.
How often, how much to test?
Tests of key applications and infrastructure should be conducted at least once a year. Tier three can go up to two years between tests. Each application needs its own specifically tailored DR run book that defines the unique set of steps needed to restore its services. Each ordered step should detail what needs to be done, by whom and by when.
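The minimum intervals above can be turned into a simple overdue check. The following is only a sketch: the function and table names are hypothetical, and the tier two interval is an assumption, since only key applications (yearly) and tier three (up to two years) are given explicit intervals here.

```python
from datetime import date, timedelta

# Maximum number of days allowed between DR tests, per tier.
# Tier 1 and tier 3 follow the intervals stated in the text.
# The tier 2 value is an assumption made for this sketch.
MAX_INTERVAL_DAYS = {
    1: 365,  # key applications: at least yearly
    2: 365,  # assumption: treated the same as tier 1
    3: 730,  # tier three: up to two years
}


def next_test_due(tier: int, last_tested: date) -> date:
    """Return the latest date by which the next DR test should run."""
    return last_tested + timedelta(days=MAX_INTERVAL_DAYS[tier])


def is_overdue(tier: int, last_tested: date, today: date) -> bool:
    """Return True if the application has gone past its test interval."""
    return today > next_test_due(tier, last_tested)
```

A scheduled job could run `is_overdue` against an inventory of applications and flag anything that has slipped past its interval.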
Even for small companies, a run book per application is critical. Not having one means whoever is doing the DR may miss a step in the heat of the moment.
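One way to keep a run book from missing a step is to store it as structured data and check it before the test. The sketch below is an illustration under stated assumptions: the `RunBookStep` fields, the sample steps and owners, and the choice to express "by when" as an offset from the start of the test are all hypothetical.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List


@dataclass
class RunBookStep:
    order: int           # position of the step in the run book
    action: str          # what needs to be done
    owner: str           # by whom
    deadline: timedelta  # by when, as an offset from the start of the test (assumption)


def validate_run_book(steps: List[RunBookStep]) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    # Step numbers must run 1..N so that nothing is skipped or repeated.
    if sorted(s.order for s in steps) != list(range(1, len(steps) + 1)):
        problems.append("step numbers must run 1..N with no gaps or duplicates")
    for s in steps:
        if not s.action.strip():
            problems.append(f"step {s.order}: missing action")
        if not s.owner.strip():
            problems.append(f"step {s.order}: missing owner")
    return problems


# Hypothetical run book for a single application.
run_book = [
    RunBookStep(1, "Shut down production servers cleanly", "infrastructure team", timedelta(minutes=30)),
    RunBookStep(2, "Fail over virtual machines to the DR site", "virtualization team", timedelta(hours=1)),
    RunBookStep(3, "Update DNS records to point at DR", "network team", timedelta(hours=2)),
]
```

Calling `validate_run_book(run_book)` before each test gives a quick check that every step still has an action and an owner.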
Testing the entire application stack end to end is a must. This is sometimes where virtualized disaster recovery can unravel. When a component is physical, such as a midrange server, the test can hit a bump in the road. To get a true DR test, though, it does need to be included.
The test plan should include the people who can manage the physical aspects. This includes shutting down the infrastructure cleanly and bringing up the physical DR. Don't forget to include those people who manage the virtual DR failover process, domain name system, Active Directory and similar components.
A series of meetings before the test can ensure all the required people know what they are doing. Ahead of each meeting, make sure everyone has the latest copy of the DR run book. Keeping those copies under version control is especially helpful.
As each step is implemented, the organization should record its status, along with any snags and fixes. Slow and methodical is the order of the day. Upon completion, it is important to distribute the findings to all those involved, irrespective of success or failure, so that everyone knows the status and the issues.
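Recording each step's status, snags and fixes can be as simple as a small log structure that is turned into a findings summary at the end. This is a minimal sketch, not a prescribed tool: the `StepResult` type and `summarize` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StepResult:
    step: str                                       # description of the run book step
    succeeded: bool                                 # final status of the step
    snags: List[str] = field(default_factory=list)  # problems hit along the way
    fixes: List[str] = field(default_factory=list)  # what was done to resolve them


def summarize(results: List[StepResult]) -> str:
    """Build a plain-text findings summary to distribute after the test."""
    lines = []
    for r in results:
        status = "OK" if r.succeeded else "FAILED"
        lines.append(f"[{status}] {r.step}")
        for snag in r.snags:
            lines.append(f"  snag: {snag}")
        for fix in r.fixes:
            lines.append(f"  fix: {fix}")
    failed = sum(1 for r in results if not r.succeeded)
    lines.append(f"{len(results) - failed} of {len(results)} steps succeeded")
    return "\n".join(lines)
```

The summary lists every step, whether it succeeded or failed, so the same text can be sent to everyone involved regardless of the outcome.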
A good virtualized disaster recovery test should be conducted in a similar manner to a physical DR test. The end goal is the same, whether the environment is virtual or physical. The fact that the disaster recovery environment is virtual does not change how it should be treated.