By Dave Raffo, Senior News Director
Disaster recovery (DR) planning is more of a process than a technology. Sure, it involves technologies spanning enterprise data storage, networking and security, but finding the right disaster recovery facility and making sure adequate logistics and communications are set up are also crucial steps. Remember, DR programs are enacted at times of great chaos when workers are often concerned about the safety of their own families as well as saving the business. That brings a lot more into play than making sure servers can fail over and fail back.
In our tutorial on disaster recovery operations, learn about how to choose a disaster recovery facility, failover and failback in disaster recovery operations, virtual servers and DR, and outsourcing disaster recovery services.
When finding the right spot for a disaster recovery site, you have to take into account what type of disaster you're likely to be recovering from and how much protection you need. If you're located in an area at risk of natural disasters -- hurricanes, earthquakes, floods, flash fires, etc. -- you need your DR site in an area that is not likely to be affected by the same type of disaster. But you don't want to go so far away that latency between the sites becomes a problem.
The same goes for the area you want to set up shop in after getting your IT resources back to working level. You want that spot to be outside the disaster zone but close enough to get your key personnel there without too much cost or inconvenience.
For instance, New Orleans law firm Deutsch Kerrigan & Stiles (DK&S) leases office space in Memphis, Tenn., for a DR team to work out of if a hurricane hits. "We picked Memphis because it's probably the safest large city away from hurricane area," said Don Champagne, the firm's director of administration.
Organizations need to set up hot, warm or cold sites to continue operations, depending on their recovery time objective (RTO) -- the maximum length of time an organization can go without its computer resources after a disaster. A hot site is a replica of a company's data center. A cold site is nothing more than office space and minimal storage and networking equipment, enough to let the company keep the doors open until it can reopen its main office. A warm site has more IT resources than a cold site but fewer than a hot site. Hot sites cost a lot more than cold sites to set up, with the cost of warm sites falling in between.
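To make the link between RTO and site type concrete, here is a minimal Python sketch. The cutoff values (4 hours and 72 hours) and the function name are illustrative assumptions, not figures from any standard or from the organizations mentioned here; each business would set its own thresholds.

```python
from datetime import timedelta

# Hypothetical cutoffs for illustration only; tune them to your own business.
HOT_SITE_MAX_RTO = timedelta(hours=4)
WARM_SITE_MAX_RTO = timedelta(hours=72)


def suggest_site_type(rto: timedelta) -> str:
    """Map a recovery time objective to a hot, warm or cold site."""
    if rto <= timedelta(0):
        raise ValueError("RTO must be a positive duration")
    if rto <= HOT_SITE_MAX_RTO:
        return "hot"   # little tolerance for downtime: keep a full replica
    if rto <= WARM_SITE_MAX_RTO:
        return "warm"  # some equipment in place, finish the build-out later
    return "cold"      # space and minimal gear, lowest cost


if __name__ == "__main__":
    for hours in (1, 24, 240):
        print(hours, "hour RTO ->", suggest_site_type(timedelta(hours=hours)))
```

The shorter the RTO, the more equipment has to be waiting at the DR site, which is why cost climbs from cold to warm to hot.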
Organizations also need to keep communications open with workers while recovering from disasters. Some use notification systems such as MessageOne's AlertFind, which sends mass emails and other Web-based messages to workers. These messages can keep staff abreast of developments such as when they can return to the main data center.
After Hurricane Katrina, IT workers for the Supreme Court of Louisiana, located in New Orleans, began wearing USB thumb drives around their necks that hold the organization's DR plan. The thumb drives include key phone numbers of team members and other data the workers need in an emergency.
Even the best-laid disaster recovery plans won't work if the technology comes up short. Recovering from a disaster requires a business to replicate data offsite beforehand, fail over when the disaster strikes and fail back once the primary site is usable again.
Failover is the process of automatically switching to the DR site when the primary site fails, as it does when disaster strikes. Failback is the reverse step: restoring the system to the state it was in before the failure, with operations running at the primary site again.
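The failover and failback sequence can be sketched as a small state machine. This is a simplified illustration in Python, not how any particular vendor's product works; the class and method names are hypothetical, and a real failback would first copy changes made at the DR site back to the primary.

```python
class SitePair:
    """Tracks which of two sites, 'primary' or 'dr', is serving users."""

    def __init__(self) -> None:
        self.active = "primary"
        self.events: list[str] = []

    def health_check(self, primary_up: bool) -> str:
        # Failover is automatic: it happens as soon as the primary is seen down.
        if self.active == "primary" and not primary_up:
            self.active = "dr"
            self.events.append("failover")
        return self.active

    def fail_back(self, primary_up: bool) -> str:
        # Failback is a deliberate step taken only once the primary is healthy.
        # (Assumption: data written at the DR site has already been resynced.)
        if self.active == "dr" and primary_up:
            self.active = "primary"
            self.events.append("failback")
        return self.active


if __name__ == "__main__":
    pair = SitePair()
    pair.health_check(primary_up=False)  # disaster strikes
    pair.fail_back(primary_up=True)      # primary site repaired
    print(pair.events)
```

Keeping failback as a separate, explicit call reflects the idea that returning to the primary site is a planned operation rather than an automatic reaction.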
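Replicating data between the primary and DR sites before a failure ensures that data and applications are current at the secondary site. Synchronous replication makes each write wait for an acknowledgment from the remote site, so application performance suffers as the distance between sites grows. Asynchronous replication does not wait for that acknowledgment, which makes it the usual choice for long-distance replication and better suited for DR. Most storage vendors have synchronous and asynchronous replication tools specific to their arrays.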
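A rough calculation shows why distance favors asynchronous replication. The sketch below assumes light travels through optical fiber at roughly 200 km per millisecond and ignores switching, routing and array processing delays, so real-world figures will be higher. The function name is hypothetical.

```python
# Approximate propagation speed of light in optical fiber (an assumption
# used for estimation; it ignores all equipment and protocol delays).
FIBER_KM_PER_MS = 200.0


def sync_write_penalty_ms(distance_km: float) -> float:
    """Minimum delay added to every synchronous write by the round trip."""
    if distance_km < 0:
        raise ValueError("distance cannot be negative")
    # The write travels to the remote site and the acknowledgment travels back.
    return 2 * distance_km / FIBER_KM_PER_MS


if __name__ == "__main__":
    for km in (10, 100, 1000):
        print(f"{km:>5} km apart: at least {sync_write_penalty_ms(km):.1f} ms per write")
```

With asynchronous replication the application does not wait out that round trip, at the price of the DR copy lagging slightly behind the primary.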
CA XOsoft, Double-Take Software Inc., EMC Corp., Neverfail Ltd. and Symantec Corp. are among the vendors with software applications that will handle replication and failover capabilities. Some of these vendors, along with InMage Systems and FalconStor Software, also use continuous data protection (CDP) to minimize data transmission and maximize synchronization of replicated data.
No DR plan is complete without testing, although this step is often overlooked. Besides verifying the replication and failover/failback capabilities, DR tests also train staff on how to react when disaster strikes. Yet according to "Symantec's 2009 Disaster Recovery Survey," only 35% of the more than 1,650 global IT managers contacted said they tested their disaster recovery plans at least once a year.
The rise in virtual servers is changing the game for disaster recovery, and in many cases, DR drives companies to virtualize servers.
Virtual server images are more portable because they are not tied to specific hardware, and you don't need the same hardware at your disaster recovery site as you have at your primary site. Companies don't have to buy hardware in pairs when setting up DR sites. They can move out old servers to their DR site when they buy new servers for their primary site.
By storing virtual images on a storage area network (SAN), you can take advantage of replication and snapshot tools either built into arrays or included with storage virtualization products from vendors such as DataCore Software Corp., FalconStor Software and IBM Corp.
Organizations can virtualize servers at both their primary and disaster recovery sites, or only at the DR site. It is easier to manage DR if both sites are virtualized, but some applications on the primary site may not lend themselves to running on virtual servers. Replication and testing can be more difficult if only the DR site is virtualized.
VMware's Site Recovery Manager (SRM) makes DR even easier with virtual servers by automatically moving and restoring virtual environments between VMware ESX clusters. VMware Site Recovery Manager also automates DR testing without disrupting production servers.
Getting the most out of wide area network (WAN) bandwidth is another key piece of successful DR operations. Many organizations choose to implement WAN optimization tools instead of paying for DS-3 or higher-speed circuits to carry traffic.
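To see why bandwidth matters, consider how long a replication job takes over a DS-3 circuit, which has a line rate of 44.736 Mbps. The Python sketch below is a back-of-the-envelope estimate: it treats the full line rate as usable and models WAN optimization as a simple data reduction ratio, both of which are simplifying assumptions. The function and parameter names are hypothetical.

```python
DS3_MBPS = 44.736  # DS-3 line rate in megabits per second


def transfer_hours(gigabytes: float,
                   link_mbps: float = DS3_MBPS,
                   reduction_ratio: float = 1.0) -> float:
    """Estimate hours needed to send `gigabytes` of changed data.

    reduction_ratio is a hypothetical stand-in for WAN optimization:
    4.0 means only a quarter of the data actually crosses the wire.
    """
    if link_mbps <= 0 or reduction_ratio <= 0:
        raise ValueError("link speed and reduction ratio must be positive")
    bits_on_wire = gigabytes * 1e9 * 8 / reduction_ratio
    seconds = bits_on_wire / (link_mbps * 1e6)
    return seconds / 3600


if __name__ == "__main__":
    print(f"100 GB, no optimization:  {transfer_hours(100):.1f} hours")
    print(f"100 GB, 4:1 reduction:    {transfer_hours(100, reduction_ratio=4):.1f} hours")
```

Under these assumptions, 100 GB of changes takes roughly five hours on an unoptimized DS-3, which is the kind of number to compare against your replication window before buying a faster circuit.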
WAN optimization eliminates redundant data transmission by deduplicating data when replicating it between sites. It also uses quality of service to prioritize traffic. Riverbed Technology, Cisco Systems, Blue Coat Systems, Expand Networks, and Silver Peak Systems are among the vendors offering WAN optimization devices.
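The deduplication idea can be illustrated with a simplified Python sketch: split the data into chunks, hash each chunk, and send only chunks the far side has not seen before. Commercial WAN optimization devices use their own, more sophisticated techniques; the fixed 4 KB chunk size, the function name and the message format here are assumptions made for illustration.

```python
import hashlib


def dedupe_stream(data: bytes, seen: set, chunk_size: int = 4096):
    """Return (messages, bytes_saved) for one replication pass.

    `seen` holds digests of chunks already sent to the remote site and is
    updated in place, so it should be reused across calls.
    """
    messages = []
    bytes_saved = 0
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        digest = hashlib.sha256(chunk).digest()
        if digest in seen:
            # Remote site already has this chunk: send a short reference.
            messages.append(("ref", digest))
            bytes_saved += len(chunk)
        else:
            seen.add(digest)
            messages.append(("data", chunk))
    return messages, bytes_saved


if __name__ == "__main__":
    seen_digests: set = set()
    payload = b"A" * 4096 * 3 + b"B" * 4096
    _, saved_first = dedupe_stream(payload, seen_digests)
    _, saved_second = dedupe_stream(payload, seen_digests)
    print(saved_first, saved_second)
```

The second pass over unchanged data sends only references, which is why deduplication pays off most for repeated replication of largely similar data.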
Outsourcing disaster recovery is an option for organizations without resources to dedicate to DR. But outsourcing doesn't alleviate the need for planning. You still need to understand your RTO and recovery point objective (RPO), and how long you might have to operate from your DR site.
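One way to turn RTO and RPO into something checkable is to compare them with what your plan or provider actually delivers. The Python sketch below assumes RPO is expressed as the maximum tolerable window of lost data and that the worst-case loss equals the interval between replication or backup runs. The class, function and field names are hypothetical.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class DrObjectives:
    rto: timedelta  # longest acceptable time without IT resources
    rpo: timedelta  # largest acceptable window of lost data


def check_plan(objectives: DrObjectives,
               replication_interval: timedelta,
               measured_recovery_time: timedelta) -> list:
    """Return a list of problems; an empty list means both objectives are met."""
    problems = []
    # Assumption: worst-case data loss is one full replication interval.
    if replication_interval > objectives.rpo:
        problems.append("RPO at risk: data is copied offsite too infrequently")
    # measured_recovery_time would come from an actual DR test.
    if measured_recovery_time > objectives.rto:
        problems.append("RTO at risk: the last test took longer than allowed")
    return problems


if __name__ == "__main__":
    goals = DrObjectives(rto=timedelta(hours=8), rpo=timedelta(hours=1))
    print(check_plan(goals, timedelta(hours=24), timedelta(hours=6)))
```

Running a check like this against an outsourcer's stated backup schedule and test results gives a quick read on whether the service matches your objectives.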
Geography also plays a role when outsourcing disaster recovery. You don't want your provider too close to you if you're worried about a natural disaster. But the location of a provider's other clients also matters. Cloud disaster recovery services often share resources among customers to help keep costs down, but what happens if too many of those customers are located in the same area and a disaster hits? That can make availability a problem.
The Oklahoma Bar Association in Oklahoma City uses service provider CoreVault to protect its backups from problems ranging from possible tornadoes to accidental deletion of files by users.
"CoreVault provides a backup site 120 miles away and an automatic way to backup data," said the organization's director of information systems Rick Loomis. "It wasn't the cheapest way to go, but we wanted to go with a proven model at a reasonable cost. They have two sites and a facility to store the backup, to make sure not even a disaster would wipe out our backups."
The deciding factors when weighing outsourced vs. in-house DR are the costs involved and how much time it would take to implement DR under each scenario. Large vendors such as SunGard, IBM and Iron Mountain Inc. provide DR services, but there are also many smaller colocation providers.