Disaster recovery (DR) is a form of insurance: an infrastructure and a set of processes that a company may rarely use but that are essential when the time comes. To have any value, however, a DR plan and its infrastructure must be tested.
Historically, testing disaster recovery products has been difficult, time-consuming and expensive, so it was often neglected. However, server virtualization, sophisticated disaster recovery software and cloud computing are making regular DR testing feasible and cost-effective.
Local DR testing
For disasters such as the failure of a single critical server or storage system, recovery means firing up an instance of the application on a secondary server on-site. This requires that a cloned copy of the VM image has been created on a standby server running the appropriate hypervisor. Some of this can be done using tools native to the virtualization platform, but many companies use specific disaster recovery software for a couple of reasons.
These products can keep the DR copy updated with the primary VM efficiently, using snapshots and replication. Many can also convert a physical server and its applications to a VM as part of the initial cloning process, and then keep that VM updated just as they do primary VM images. Capturing the current state of a production server is not trivial, and DR planners must take this into consideration.
It's also critical that a clean, quiesced copy of the protected data is captured before the DR test is run. While an application can typically be rebuilt from "dirty" data, doing so will increase the recovery time and may result in some data loss. The organization should also understand that in a real disaster there will probably be some data loss unless the application is shut down gracefully. At a minimum, this involves data in transit, but it more probably includes data added or modified since the last clean update.
When it comes time to test that recovery process, the cloned VM is restarted on the standby server and users are pointed to that VM. If a more complex stack is involved, like Web servers, load balancers, database servers, etc., DR software can actually orchestrate the entire process, starting each application in order and monitoring the environment. Again, this isn't trivial and care should be taken to understand all the steps involved so they can be included in the test.
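The ordered startup described above can be sketched in a few lines of Python. This is only an illustration, not how any particular DR product works: the service names, the `DEPENDENCIES` map, and the `start` and `is_healthy` callbacks are all hypothetical stand-ins for whatever the orchestration tool actually provides. The idea is simply that each tier is started only after the tiers it depends on, and that the test stops at the first component that fails its health check.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical application stack: each service maps to the services
# that must already be running before it can start.
DEPENDENCIES = {
    "database": set(),
    "app-server": {"database"},
    "web-server": {"app-server"},
    "load-balancer": {"web-server"},
}


def startup_order(dependencies):
    """Return the services in an order where dependencies come first."""
    return list(TopologicalSorter(dependencies).static_order())


def run_failover_test(dependencies, start, is_healthy):
    """Start each service in dependency order and check it.

    `start` and `is_healthy` are caller-supplied callbacks (assumed here).
    Returns (services that came up cleanly, first failed service or None).
    """
    started = []
    for service in startup_order(dependencies):
        start(service)
        if not is_healthy(service):
            return started, service
        started.append(service)
    return started, None


if __name__ == "__main__":
    # Dry run: "starting" a service just prints its name.
    ok, failed = run_failover_test(
        DEPENDENCIES,
        start=lambda name: print(f"starting {name}"),
        is_healthy=lambda name: True,
    )
    print("started:", ok, "failed:", failed)
```

Writing the dependencies down in this form is also a useful planning exercise, since it forces every step of the recovery to be listed before the test is run.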
Remote DR testing
When the DR location is remote, whether at a secondary site or in the cloud with DR as a service (DRaaS), testing is essentially the same process. The updated VM images are stored off-site, alongside the compute infrastructure that runs the hypervisor, instead of on a local standby server as described above. Testing involves restarting the replicated VMs and pointing users to them. The real difference is that when these VMs are actually used in a DR situation, network latency will come into play.
For this reason, a hybrid DR product is an appealing option. It combines the local and remote approaches: a failover server is kept running in the company data center and is synchronized with another system off-site.
The cloud is an increasingly popular DR target because it offers flexibility and cost-effectiveness that an in-house secondary site does not. For example, many DRaaS products offer hot, warm and cold DR options at different price points. Depending on the recovery time objective (RTO) of a particular application, users can choose the right approach for their organization's needs and budget.
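As a rough illustration of matching a DR option to an application's RTO, the sketch below picks the cheapest option whose typical recovery time still fits within the target. The recovery times attached to each option are invented for the example and are not taken from any provider; real figures would come from the DRaaS contract and, ideally, from the organization's own test results.

```python
from datetime import timedelta

# Assumed, illustrative recovery times, listed from cheapest to most
# expensive option. Replace with figures measured in actual DR tests.
TIERS = [
    ("cold", timedelta(hours=24)),
    ("warm", timedelta(hours=4)),
    ("hot", timedelta(minutes=15)),
]


def choose_tier(rto):
    """Return the cheapest tier that can recover within `rto`, or None."""
    for name, typical_recovery in TIERS:
        if typical_recovery <= rto:
            return name
    return None  # no listed option is fast enough for this RTO


if __name__ == "__main__":
    for app, rto in [
        ("archive", timedelta(hours=48)),
        ("intranet", timedelta(hours=8)),
        ("order-entry", timedelta(minutes=30)),
    ]:
        print(app, "->", choose_tier(rto))
```

Regular testing is what keeps a table like this honest: if a test shows that a warm site actually takes longer to bring up than assumed, the figure should be updated and the choice revisited.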
Historically, disaster recovery testing for most companies has been a less-than-comprehensive process conducted on a less-than-regular basis, which has often eroded the confidence in the DR system, at least among those who knew the details. But now, mostly due to the emergence of server virtualization, DR testing can be a much simpler exercise that's not just limited to nights and weekends. The result is a DR system that can provide the peace of mind that a good insurance policy is supposed to.
About the author:
Eric Slack is a senior analyst at Storage Switzerland, an IT analyst firm focused on storage and virtualization.