Editor's note: This article was expanded and updated in October 2017.
An information technology disaster recovery (DR) plan provides a structured approach for responding to unplanned incidents that threaten an IT infrastructure, which includes hardware, software, networks, processes and people. Protecting your firm's investment in its technology infrastructure -- and your firm's ability to conduct business -- are the key reasons for implementing an IT disaster recovery plan.
In this article, we cover everything you need to know about putting together an IT disaster recovery plan. You'll learn about IT DR plan development and the most important IT disaster recovery planning considerations. You can then download our IT disaster recovery plan template, which can be printed and customized for your company's unique needs.
Reasons to have a disaster recovery plan
Organizations can't afford to be nonoperational because of regional power outages, cyberattacks or hardware failures. Every minute that applications and systems are down translates into lost revenue. For example, the average cost of losing critical applications is estimated to be $5,000 a minute.
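To make that arithmetic concrete, here is a minimal sketch that multiplies outage duration by a per-minute cost. The $5,000 default is the estimate quoted above; the function name and the 90-minute outage are illustrative assumptions, not figures from any particular organization.

```python
def downtime_cost(minutes_down: float, cost_per_minute: float = 5_000.0) -> float:
    """Estimate revenue lost during an outage (hypothetical helper)."""
    if minutes_down < 0:
        raise ValueError("minutes_down must be non-negative")
    return minutes_down * cost_per_minute


# Assumed example: a 90-minute outage at the quoted $5,000-per-minute estimate.
print(f"${downtime_cost(90):,.0f}")  # prints $450,000
```

Substituting your own per-minute figure, if you have one from a business impact analysis, gives a more realistic number than the generic estimate.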
A DR plan also ensures that remote offices and branch locations are accounted for and protected when a catastrophe occurs.
In addition, many organizations must comply with regulations when conducting business, including requirements to produce disaster recovery reports as part of a business impact analysis strategy.
What is an IT disaster recovery plan?
IT disaster recovery plans provide step-by-step procedures to recover disrupted systems and networks, and they help organizations resume normal operations. The goal of these processes is to minimize any negative impacts to company operations.
The IT disaster recovery process identifies critical IT systems and networks; prioritizes their recovery time objectives; and delineates the steps needed to restart, reconfigure and recover them. A comprehensive IT DR plan also includes all the relevant supplier contacts, sources of expertise for recovering disrupted systems and a logical sequence of actions to take for a smooth recovery.
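As a rough illustration of prioritizing by recovery time objective (RTO), the sketch below orders systems so that critical ones come first and, within each group, the shortest RTO is recovered first. The `ITSystem` class, its field names, the sample systems and their RTO values are all assumptions made for the example; a real plan would take them from the business impact analysis.

```python
from dataclasses import dataclass


@dataclass
class ITSystem:
    name: str
    rto_hours: float  # recovery time objective: target time to restore service
    critical: bool    # flagged as critical during the assessment


def recovery_order(systems):
    """Return critical systems first, then by shortest RTO."""
    # False sorts before True, so `not s.critical` puts critical systems first.
    return sorted(systems, key=lambda s: (not s.critical, s.rto_hours))


# Hypothetical inventory for illustration only.
systems = [
    ITSystem("email", 24, False),
    ITSystem("order database", 2, True),
    ITSystem("internet access", 4, True),
    ITSystem("intranet wiki", 72, False),
]

for rank, system in enumerate(recovery_order(systems), start=1):
    print(rank, system.name, f"RTO {system.rto_hours}h")
```

Sorting on a tuple keeps the rule explicit: criticality decides first, and RTO only breaks ties within the same criticality level.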
Once you have completed a risk assessment and identified potential threats to your IT infrastructure, the next step is to determine which infrastructure elements are most important to the performance of your company's business. If all IT systems and networks are performing normally, your organization ought to be fully viable, competitive and financially solid. When an incident -- internal or external -- negatively affects the IT infrastructure, the business could be compromised.
According to National Institute of Standards and Technology (NIST) Special Publication 800-34, "Contingency Planning Guide for Federal Information Systems," the following summarizes the ideal structure for an IT disaster recovery plan:
- Develop a contingency planning policy statement. A formal policy provides the authority and guidance necessary to develop an effective contingency plan.
- Conduct a business impact analysis. A BIA helps to identify and prioritize critical IT systems and components.
- Identify preventive controls. Measures that reduce the effects of system disruptions can increase system availability and reduce contingency lifecycle costs.
- Develop recovery strategies. Thorough recovery strategies ensure that the system can be recovered quickly and effectively following a disruption.
- Develop an IT contingency plan. The contingency plan should contain detailed guidance and procedures for restoring a damaged system.
- Plan testing, training and exercising. Testing the plan identifies planning gaps, whereas training prepares recovery personnel for plan activation; both activities improve plan effectiveness and overall agency preparedness.
- Plan maintenance. The plan should be a living document that is updated regularly to remain current with system enhancements.
Step-by-step IT DR plan development
Using the structure noted in SP 800-34, we can expand those steps into the following sequence of activities:
1. The plan development team should meet with the internal technology team, application team and network administrator(s) and establish the scope of the activity, e.g., internal elements, external assets, third-party resources, linkages to other offices/clients/vendors. Be sure to brief IT department senior management on these meetings so they are properly informed.
2. Gather all the relevant network infrastructure documents, e.g., network diagrams, equipment configurations, databases.
3. Obtain copies of existing IT and network DR plans. If these do not exist, proceed with the following steps.
- Identify what management perceives as the most serious threats to the IT infrastructure, e.g., fire, human error, loss of power, system failure.
- Identify what management perceives as the most serious vulnerabilities to the infrastructure, e.g., lack of backup power, out-of-date copies of databases.
- Review previous history of outages and disruptions, as well as how the firm handled them.
- Identify what management perceives as the most critical IT assets, e.g., call center, server farms, internet access.
- Determine the maximum outage time management will accept if the identified IT assets are unavailable.
- Identify the operational procedures currently used to respond to critical outages.
- Determine when these procedures were last tested to validate their appropriateness.
4. Identify emergency response team(s) for all critical IT infrastructure disruptions. Determine their level of training with critical systems, especially in emergencies.
5. Assess vendor emergency response capabilities: whether they have ever been used and, if so, whether they worked properly; how much the company pays for these services; the status of the service contract; and whether a service-level agreement exists and is used.
6. Compile results from all the assessments into a gap analysis report that identifies what is currently done versus what ought to be done, with recommendations as to how to achieve the required level of preparedness and the estimated investment required.
7. Have management review the report and agree on recommended actions.
8. Prepare IT disaster recovery plan(s) to address critical IT systems and networks.
9. Conduct tests of plans and system recovery assets to validate their operation.
10. Update DR plan documentation to reflect changes. A good disaster recovery plan has thorough documentation, including a detailed inventory of the equipment in the infrastructure. This inventory is particularly important because it helps new IT administrators understand the environment that previous administrators built, and it supports good asset management.
11. Schedule next review/audit of IT disaster recovery capabilities. (Source: NIST SP 800-34)
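The gap analysis described in the steps above can be sketched as a comparison between the maximum outage management will accept and the recovery time observed in the last test. This is only one possible way to structure it; the dictionary keys, the asset names and the hour values are hypothetical.

```python
def gap_report(assets):
    """Compare tested recovery time with the maximum acceptable outage.

    Each asset is a dict with assumed keys: name, max_outage_hours and
    tested_recovery_hours (None if recovery has never been tested).
    """
    findings = []
    for asset in assets:
        tested = asset["tested_recovery_hours"]
        limit = asset["max_outage_hours"]
        if tested is None:
            status = "untested"
        elif tested > limit:
            status = f"gap of {tested - limit}h"
        else:
            status = "meets target"
        findings.append((asset["name"], status))
    return findings


# Hypothetical assessment results for illustration only.
assets = [
    {"name": "call center", "max_outage_hours": 4, "tested_recovery_hours": 6},
    {"name": "server farm", "max_outage_hours": 8, "tested_recovery_hours": 5},
    {"name": "internet access", "max_outage_hours": 2, "tested_recovery_hours": None},
]

for name, status in gap_report(assets):
    print(f"{name}: {status}")
```

Treating "untested" as its own finding, rather than as a pass, reflects the emphasis in the steps above on knowing when procedures were last validated.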
Important IT disaster recovery planning considerations
Get support from senior management. Without senior management's backing, the plan's goals are unlikely to be achieved.
Establish clearly defined roles. The DR plan should outline all employee responsibilities and designate a proper chain of command that can ensure a comprehensive DR response during a crisis.
Use available standards. Among the relevant standards you can use when developing IT DR plans are NIST SP 800-34, ISO/IEC 24762:2008 and BS 25777:2008.
Keep it simple. The IT DR plan doesn't have to be dozens of pages long. Plans simply need the right information, which should be current and accurate.
Review results with business units. Once the IT disaster recovery plan is complete, review the findings with business unit leaders to make sure your assumptions are correct.
Be flexible. The suggested disaster recovery plan template in this article can be modified as needed to accomplish your goals.
Reviewing the IT disaster recovery plan template
Now we'll examine the table of contents from the template, indicating key issues to address and activities to perform.
- Information Technology Statement of Intent. This sets the stage and direction for the plan.
- Policy Statement. It is very important to include an approved statement of policy regarding the provision of disaster recovery services.
- Objectives. Main goals of the plan.
- Key Personnel Contact Information. It is very important to have key contact data near the front of the plan. It's the information most likely to be used right away, and it should be easy to locate.
- Plan Overview. Describes basic aspects of the plan, such as how to update it.
- Emergency Response. Describes what needs to be done immediately following the onset of an incident.
- Disaster Recovery Team. Members and contact information of the DR team.
- Emergency Alert, Escalation and DRP Activation. Steps to take through the early phase of the incident, leading to activation of the DR plan.
- Media. Tips for dealing with the media.
- Insurance. Summarizes the insurance coverage associated with the IT environment and any other relevant policies.
- Financial and Legal Issues. Actions to take for dealing with financial and legal issues.
- DRP Exercising. Underscores the importance of DR plan exercising.
- Appendix A, Technology Disaster Recovery Plan Templates. Sample templates for a variety of technology recovery scenarios; it is useful to have technical documentation available from select vendors.
- Appendix B, Suggested Forms. Ready-to-use forms that will help you complete the plan.
Considering the investments businesses make in their IT infrastructures, they should also invest sufficient time and resources to protect those investments from unplanned and potentially destructive events.