A great deal of work goes into a disaster recovery (DR) test plan. Planning activities include:
- Identifying the test team.
- Securing the test team's availability on the planned DR test date.
- Identifying additional subject matter experts such as database administrators and network engineers.
- Determining what will be tested and the expected outcomes.
During the actual disaster recovery test, the game plan must be ready to use, the testing procedures must be documented and shared with all players, and the systems and network resources must be in place. But despite all this preparation, here are five mistakes that can happen when planning and conducting your DR test:
Mistake 1: Not having a disaster recovery test plan
You need to have a documented plan of action that clearly outlines pre-test activities, test procedures, post-test wrap-up activities, test scripts and test team members. This written plan is also important from an audit perspective, as it provides auditors with evidence of how the test was planned and executed.
Mistake 2: Not having all the proper technology representatives present
While you may think you have all the necessary players identified for the DR test plan, double-check to make sure a key subject matter expert, such as the system's developer, has not been overlooked. Ensure you have all the technology elements represented -- network, hardware, application, database, utilities -- and any other special arrangements that are unique to the system being recovered.
Mistake 3: Not having a detailed script of the test activities
Aside from the disaster recovery test plan itself, this is one of the most important technical documents you'll have for the test. Scripts are the playbook used to execute tests, and their accuracy is essential. Even transposing two characters in a line of code can cause a system test to fail. If possible, have a subject matter expert review the test script to identify any possible changes or edits. It might also be useful to have a dry run of the test before going live.
Mistake 4: Not having the test elements in place and ready to use
If your test is scheduled for a specific day, all the components needed for the DR test should be in place by then. If one or more of your hardware, software or network elements are not ready, cancel and reschedule the test so you won't disrupt production systems.
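The go/no-go decision described above can be sketched as a simple readiness check. This is a minimal illustration, not a prescribed tool: the `Component` class, the `go_no_go` function and the checklist entries are all hypothetical names invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str    # hypothetical example, e.g. "recovery server"
    kind: str    # "hardware", "software" or "network"
    ready: bool  # True once the owner confirms it is in place


def go_no_go(components):
    """Return ("go", []) if everything is ready,
    otherwise ("reschedule", [names of components not ready])."""
    blockers = [c.name for c in components if not c.ready]
    if blockers:
        return ("reschedule", blockers)
    return ("go", [])


# Hypothetical checklist for illustration only.
checklist = [
    Component("recovery server", "hardware", True),
    Component("database replica", "software", True),
    Component("backup WAN link", "network", False),
]

decision, blockers = go_no_go(checklist)
print(decision, blockers)
```

A single element that is not ready is enough to return "reschedule", which mirrors the advice to cancel rather than risk disrupting production systems.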
Mistake 5: Not checking to see if others have scheduled tests
In a medium- to large-sized IT department, there will likely be many scheduled activities in place, such as new system acceptance testing, network upgrades and software upgrades. This means your test will need to fit in with the other activities, so there are no disruptions to scheduled work. DR testing can often take several hours, so it should be scheduled as far in advance as possible. You should also send out notifications to other IT leads advising them of your planned test.
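Checking a proposed test window against other scheduled IT activities could look something like the sketch below. It assumes the department's calendar can be exported as activity names with start and end times; the `find_conflicts` function, the activity names and the dates are hypothetical.

```python
from datetime import datetime


def overlaps(start_a, end_a, start_b, end_b):
    """Two time windows overlap if each starts before the other ends."""
    return start_a < end_b and start_b < end_a


def find_conflicts(test_window, calendar):
    """Return the names of scheduled activities that overlap the DR test window."""
    start, end = test_window
    return [
        name
        for name, (act_start, act_end) in calendar.items()
        if overlaps(start, end, act_start, act_end)
    ]


# Hypothetical DR test window and IT calendar, for illustration only.
dr_test = (datetime(2030, 3, 9, 8, 0), datetime(2030, 3, 9, 14, 0))
calendar = {
    "network upgrade": (datetime(2030, 3, 9, 12, 0), datetime(2030, 3, 9, 16, 0)),
    "acceptance testing": (datetime(2030, 3, 10, 9, 0), datetime(2030, 3, 10, 17, 0)),
}

conflicts = find_conflicts(dr_test, calendar)
print(conflicts)
```

Any activity returned here is a candidate for rescheduling, or at least a lead who should receive advance notification of the test.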
Additional disaster recovery testing plan considerations
While the above five mistakes should be avoided before your plan is tested, here are three more to watch for during and after the test.
- Not being willing to halt the test and reschedule if things are going poorly. If things are not going as planned, stop the test and review what happened up to the point where the test systems stopped performing as expected. If the failed element can be bypassed, resume the test with the original script; otherwise, reschedule.
- Not documenting what happens during the test. This includes recording when steps are completed and identifying modifications to the test script that are developed "on the fly." One of the key players in a disaster recovery test is a scribe -- the person taking notes of what is happening. Another is a timekeeper, who notes the specific start and end times of test activities.
- Not preparing an after-action report summarizing the test, lessons learned and so on. The scribe's and timekeeper's notes will be used to prepare an after-action report on the disaster recovery test. Auditors will want to review how the test was conducted, actions of the specific test participants, how specific issues were handled, any infrastructure elements that didn't work and the results achieved.
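The scribe's and timekeeper's notes described above could be captured in a structured form that feeds the after-action report directly. The following is only a sketch under assumptions: the `StepRecord` fields, the `summarize` function and the sample steps are hypothetical, not part of any standard DR tooling.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class StepRecord:
    step: str          # test script step, as named in the script
    started: datetime  # timekeeper's start time
    ended: datetime    # timekeeper's end time
    succeeded: bool
    note: str = ""     # scribe's note, e.g. an on-the-fly script change


def summarize(records):
    """Build one summary line per step for the after-action report."""
    lines = []
    for r in records:
        minutes = (r.ended - r.started).total_seconds() / 60
        status = "OK" if r.succeeded else "FAILED"
        line = f"{r.step}: {status}, {minutes:.0f} min"
        if r.note:
            line += f" ({r.note})"
        lines.append(line)
    return lines


# Hypothetical test log, for illustration only.
log = [
    StepRecord("restore database", datetime(2030, 3, 9, 9, 0),
               datetime(2030, 3, 9, 9, 45), True),
    StepRecord("fail over network", datetime(2030, 3, 9, 9, 45),
               datetime(2030, 3, 9, 10, 30), False,
               "script step edited on the fly"),
]

report_lines = summarize(log)
for line in report_lines:
    print(line)
```

Recording durations, failures and script changes per step gives auditors the kind of evidence the report is meant to provide.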
With a documented disaster recovery test plan, sufficient preparation and properly scripted actions, you can increase the odds of a successful system or technology test.