Hurricane Katrina left in its wake more than shattered communities and ruined lives; it also demonstrated the limits of the typical IT disaster recovery plan (DRP). It showed, too, that many DRPs are more limited than they need to be.
The truth of the matter is that every DRP has an upper boundary. A plan won't be able to handle some disasters, and that's perfectly appropriate. However, it is also true that many DRPs can easily be stretched to cover larger disasters -- situations that reach wider areas, last longer and are generally more severe than those the plan was written for.
Here are some tips that can help you expand the reach of your DRP:
Have a complete, well-tested DRP
If you don't have a disaster plan in place to cover 'run-of-the-mill' problems, it's not worth thinking about using one for extraordinary problems. The first step is to develop and implement such a plan and to test it regularly. Then, you can start looking for places where you can adjust it for more severe contingencies.
Many of the limits in DRPs are not inherent in the process, but are the result of assumptions that were built into the plan. A good example is the '72-hour rule,' which assumes that a facility will be cut off from access for no more than 72 hours. However, it is fairly easy to increase that limit by stockpiling more supplies, such as fuel for backup generators.
You don't have to have a hurricane to exceed the 72-hour rule. In other kinds of disasters, from earthquakes to fires to chemical spills, it may be longer than that before you can get most (or any) of your people back in.
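As a rough illustration of what stretching the 72-hour rule means for generator fuel, the arithmetic can be sketched in a few lines of Python. The function names, the liters-per-hour burn rate and the 25% safety margin are all assumptions made for this example, not figures from any standard; use the consumption rating of your own generator at its expected load.

```python
def fuel_needed_liters(outage_hours, burn_rate_lph, safety_margin=0.25):
    """Fuel required to run a generator for outage_hours.

    burn_rate_lph is the generator's consumption in liters per hour
    (hypothetical unit choice); safety_margin is an assumed extra
    fraction to cover higher load or delayed resupply.
    """
    if outage_hours < 0 or burn_rate_lph < 0 or safety_margin < 0:
        raise ValueError("inputs must be non-negative")
    return outage_hours * burn_rate_lph * (1 + safety_margin)


def runtime_hours(fuel_on_hand_liters, burn_rate_lph):
    """How long the fuel already on site will last at the given burn rate."""
    if burn_rate_lph <= 0:
        raise ValueError("burn rate must be positive")
    return fuel_on_hand_liters / burn_rate_lph


if __name__ == "__main__":
    burn = 10.0  # liters per hour, illustrative only
    print("72-hour plan:", fuel_needed_liters(72, burn), "liters")
    print("7-day plan:  ", fuel_needed_liters(7 * 24, burn), "liters")
    print("Hours covered by 1,000 liters:", runtime_hours(1000, burn))
```

Comparing the 72-hour figure with the seven-day figure shows how much extra storage a longer outage assumption demands, which is usually the real constraint.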
Think about other parts of the operation
A DRP needs to include more than just IT; it should look at all the business practices and how they will be supported and maintained after a disaster.
This is an area that is particularly weak in a lot of enterprises, especially in coordination. If the other departments don't share the same assumptions as IT, the plan will likely fail embarrassingly.
The good news is that it's easy to fix with better planning, coordination and especially information.
Consider your people
The best plan in the world isn't worth much if people aren't available to implement it. The larger the disaster, the less likely that your people will be available. Either they won't be able to reach your site or they'll be busy taking care of their own families.
In a really large disaster, like Katrina, you're likely to have to implement your DRP with a pickup crew. If your disaster plan is well worked out and clearly written down, you'll be able to do a lot more with the people who are able to show up.
Consider infrastructure beyond your interfaces
One of the worst problems in the wake of Katrina was that the communications structure was wrecked. Not only were the land lines out, both voice and data, but cell phones, the backbone of a lot of disaster plans, weren't working. In fact, one of the worst problems for Louisiana's disaster management officials was that they were largely cut off from what was happening because of communications failures.
In most types of disasters, from earthquakes to chemical spills, telephone communication is one of the first things to go. Even if the system isn't physically affected, circuits are likely to be overloaded and unavailable.
One form of communication that kept working after the hurricane in spite of the chaos was satellite links. Satellite phones were invaluable -- and extremely rare -- in the first days after the storm. A few satellite phones and perhaps a satellite terminal for low-rate data can make an enormous difference in how big an emergency you can cope with.
Count the cost -- and the cost-effectiveness
Some of these additions to a DRP are relatively simple and inexpensive to implement. Others, such as a standby VSAT link, can range from expensive to very expensive. Clearly not every enterprise needs to stretch the limits of disaster recovery. Consider how much additional protection you're buying with extensions to your DRP and decide which ones are worth it in your environment.
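One simple way to frame that decision is to compare what each extension costs per year with the loss it would be expected to prevent per year. The sketch below does this in Python; the `Extension` class, its field names and every figure in it are hypothetical, and the expected-value calculation is a deliberately crude model that ignores factors such as reputation or safety.

```python
from dataclasses import dataclass


@dataclass
class Extension:
    """A candidate addition to the DRP (all fields are illustrative)."""
    name: str
    annual_cost: float         # yearly cost to buy and maintain
    annual_probability: float  # estimated chance per year it is needed
    loss_avoided: float        # estimated loss prevented when it is used

    def expected_benefit(self):
        # Expected loss avoided per year.
        return self.annual_probability * self.loss_avoided

    def net_benefit(self):
        # Positive means the extension is expected to pay for itself.
        return self.expected_benefit() - self.annual_cost


def worthwhile(extensions):
    """Return extensions with positive net benefit, best first."""
    keep = [e for e in extensions if e.net_benefit() > 0]
    return sorted(keep, key=lambda e: e.net_benefit(), reverse=True)


if __name__ == "__main__":
    candidates = [
        Extension("extra generator fuel", 2_000, 0.05, 200_000),
        Extension("standby VSAT link", 30_000, 0.02, 500_000),
    ]
    for ext in worthwhile(candidates):
        print(ext.name, round(ext.net_benefit()))
```

The probabilities and loss estimates are the hard part and will differ for every enterprise; the point of the sketch is only that the comparison should be made explicitly rather than by instinct.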
A very valuable document for disaster recovery planning is "Generally Accepted Practice for Business Continuity Practitioners," put together by DRI International and the Disaster Recovery Journal. It is available at http://www.drj.com/GAP/.
About the author: Rick Cook has been writing about mass storage since the days when the term meant an 80 K floppy disk. The computers he learned on used ferrite cores and magnetic drums. For the last 20 years, he has been a freelance writer specializing in storage and other computer issues.