What are the components of disaster recovery tests as they are conceived today?
I think we are doing DR testing wrong today. Basically, disaster recovery tests are conducted on scheduled days, once or twice a year, or sometimes quarterly. We take teams off-site and hold an event where we test how we will recover data from a backup or mirror, how we will restore applications in a minimum equipment configuration and how we will reconnect application hosts to a network so that users can get to their work with adequate, if not optimal, performance levels and security.
This is the way disaster recovery testing has been done for a long time, but it has the limitation of being nonlinear. We break strategies down into tasks, and we test those tasks out of order so that the failure of one test will not prevent us from undertaking the other scheduled disaster recovery tests. Such an approach undermines the teaching value of DR testing. The human brain doesn't readily reorganize nonlinear tests into a coherent end-to-end understanding of the strategy, the team member's role in the strategy or the interdependencies between the roles and activities of different team members. That pretty much demolishes the rehearsal value of testing.
Add to this one other criticism: The act of formal disaster recovery test preparation tends to skew the outcome of the test. We pull the right backup tapes in advance, ensure that all necessary tools and equipment are present and so forth -- things that make the test less and less like a real-world application of procedures in the face of an actual disaster. That also limits the efficacy of traditional DR testing.
This was first published in July 2013