If your disaster recovery as a service vendor includes testing among its service offerings, find out early on how it will support your DR plan tests. Look for firms that offer unlimited testing or a liberal test policy, such as quarterly or bimonthly tests, so you can schedule as many tests as needed.
In addition, find out what support the vendor offers for developing your cloud disaster recovery test plan.
For testing disaster recovery as a service (DRaaS), or cloud DR, the following 20 best practices should help ensure success.
- Determine what will be covered in your cloud disaster recovery plan testing, such as failover from production systems to the DR environment and failback from DR to regular production.
- Document the DR test plan, identifying who will be involved in the test and the resources you will need, such as virtualized systems, databases, and data and network services.
- Have your cloud disaster recovery plan reviewed in advance by the system and data owners to ensure that the plan's objectives are aligned with expectations.
- See if the DRaaS vendor will provide test scripts before you prepare your own.
- Prepare test scripts to facilitate the test. This is essential, as the scripts will be validated in the test and will serve as your procedures in an actual emergency that requires the use of your DRaaS vendor.
- Determine the level of support you can expect from the DRaaS vendor; for example, on-site monitoring of your testing progress vs. remote monitoring via a conference bridge, and technical support as needed.
- Ensure all the resources required for the test are working properly.
- Notify other departments in your organization -- especially within IT -- of your proposed disaster recovery plan testing. Notify senior management of your planned test, including what will be tested and the expected results.
- Discuss disaster recovery plan testing activities with all the members of the test team in advance so each person knows their role and responsibilities during the test.
- If it can be scheduled, and if your DRaaS vendor supports it, schedule a dry run of the test with as many participants as possible, including the vendor. This may uncover problems -- for example, incorrect scripts, incorrect URLs or other resources -- that could adversely affect the success of your DR test.
- Ensure the test is conducted in an environment that will not affect ongoing production systems, such as a research and development or system testing environment.
- Once the cloud disaster recovery plan test commences, have someone take notes and keep time of the testing activities. Test script documents should have places to enter notes and to record times when specific activities are completed.
- Schedule a break in the test to examine how it is progressing, but remember that you won't have that option in a real emergency. Be prepared to halt the test if activities are not completing as planned.
- Once the test has completed and the results have been documented, conduct a debriefing to see what worked, what didn't work and what remediation is needed for activities that failed.
- Prepare an after-action report for the test and submit it to management.
- Review disaster recovery plan testing results with your DRaaS vendor to see how it can assist you with resolving any issues.
- If possible, conduct another test to see if the revised test plan and scripts result in a successful test. If another test is not possible, assess the risk to the organization if the issues uncovered in the test were to occur during a real emergency. All findings should be noted in the after-action report or a subsequent post-test report.
- If the systems and resources involved are mission-critical, perform additional testing to ensure those assets can be recovered.
- Update the disaster recovery plan and any other documents based on the test results.
- Coordinate with your DRaaS vendor to schedule the next round of tests.
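The note-taking and timekeeping practices above (test scripts with places for notes and completion times, followed by a debriefing on what failed) could be captured in a small structured log rather than free-form documents. The following is a minimal sketch, not a prescribed format: the class names `ScriptStep` and `DrTestLog`, their fields, and the sample steps and times are all hypothetical and would need to be adapted to your own test plan and your DRaaS vendor's scripts.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class ScriptStep:
    """One activity in a DR test script (hypothetical structure)."""
    description: str
    owner: str
    completed_at: Optional[datetime] = None  # filled in by the timekeeper
    succeeded: Optional[bool] = None         # None means not yet attempted
    notes: str = ""


@dataclass
class DrTestLog:
    """Timekeeping and note-taking record for a single DR test run."""
    started_at: datetime
    steps: List[ScriptStep] = field(default_factory=list)

    def record(self, index: int, when: datetime,
               succeeded: bool, notes: str = "") -> None:
        """Record the completion time, outcome and notes for one step."""
        step = self.steps[index]
        step.completed_at = when
        step.succeeded = succeeded
        step.notes = notes

    def elapsed(self, index: int) -> Optional[timedelta]:
        """Time from test start to completion of a step, if recorded."""
        done = self.steps[index].completed_at
        return None if done is None else done - self.started_at

    def needs_remediation(self) -> List[ScriptStep]:
        """Steps that failed or never completed, for the debriefing."""
        return [s for s in self.steps if s.succeeded is not True]


# Illustrative run with made-up steps and times.
start = datetime(2024, 1, 1, 9, 0)
log = DrTestLog(started_at=start, steps=[
    ScriptStep("Fail over production to DR environment", "infrastructure lead"),
    ScriptStep("Validate database in DR environment", "DBA"),
    ScriptStep("Fail back from DR to production", "infrastructure lead"),
])
log.record(0, start + timedelta(minutes=42), True)
log.record(1, start + timedelta(minutes=65), False, "incorrect URL in script")

for step in log.needs_remediation():
    print(f"Remediate: {step.description} ({step.notes or 'not completed'})")
```

Treating unattempted steps the same as failed ones is a deliberate choice in this sketch: if the test is halted early, the remaining activities still appear in the debriefing and the after-action report.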
When you have a DRaaS vendor that recognizes the importance of cloud disaster recovery plan testing, you are much more likely to have a successful recovery when it's needed.