In disaster recovery (DR) planning, once you've completed a business impact analysis (BIA), the next step is to perform a risk assessment. The BIA helps identify the most critical business processes and describes the potential impact of a disruption to those processes, and a risk assessment identifies internal and external situations that could negatively impact the critical processes. It also attempts to quantify the potential severity of such events and the likelihood of them occurring.
In this guide on information technology (IT) risk assessments in disaster recovery planning, learn how to get started with a risk assessment, how to prepare one, and how natural and man-made hazards figure in the risk assessment process. Read our guide, and then download our free risk assessment template.
The risk assessment should help you identify events that could adversely affect your organization. This includes the potential damage the events could cause, the amount of time needed to recover or restore operations, and preventive measures or controls that can reduce the likelihood of the event occurring. The risk assessment will also help you determine what steps, if properly implemented, could reduce the severity of the event.
To get started with a risk assessment, begin by identifying the most critical business processes from the business impact analysis. For threat information, numerous sources are available, such as:
- Company records of disruptive events
- Employee recollection of disruptive events
- Local and national media records
- Local libraries
- First-responder organizations
- National Weather Service historical data
- U.S. Geological Survey maps and other documentation
- Experience of key stakeholder organizations
- Experience of vendors doing business with the firm
- Government agencies such as the Federal Emergency Management Agency (FEMA), Department of Homeland Security, U.S. Department of Energy, etc.
These sources can help you determine the likelihood of specific events occurring, as well as the severity of actual events. For example, it may be possible to rule out certain kinds of events, such as earthquakes, if U.S. Geological Survey maps indicate the region is not in or near an earthquake zone.
An excellent document to assist you in preparing a risk assessment comes from the National Institute of Standards and Technology (NIST). The document is Special Publication 800-30, Risk Management Guide for Information Technology Systems.
The risk analysis involves identifying risks, assessing the likelihood of each event occurring, and defining the severity of the event's consequences. It may also be useful to conduct a vulnerability assessment while performing the risk analysis; this helps identify situations in which the organization may be putting itself at increased risk by not performing certain activities. An example is the increased risk of virus attacks when anti-virus software is not kept current. Finally, the risk analysis results are summarized in a report to management, with recommended mitigation activities.
Once risks and vulnerabilities have been identified, four types of defensive responses can be considered:
- Protective measures: These are activities designed to reduce the chances of a disruptive event occurring; an example is security cameras to identify unauthorized visitors and alert authorities before they can cause any damage.
- Mitigation measures: These activities are designed to minimize the severity of the event, once it has occurred. Examples are surge suppressors to reduce the impact of a lightning strike, and uninterruptible power systems to reduce the chances of a hard stop to critical systems due to a blackout or brownout.
- Recovery activities: These activities serve to bring back disrupted systems and infrastructure to a level that can support business operations; an example is critical data that is stored offsite that can be used to restart business operations to an appropriate point in time.
- Contingency plans: These process-level documents describe what an organization can do in the aftermath of a disruptive event; they are usually triggered based on input from the emergency management team.
The sequence in which these measures are implemented depends to a large extent upon the results of the risk assessment. Once a specific threat and its associated vulnerability have been identified, it becomes easier to plan the most effective defensive strategy. Remember that contingency plans must cope with the effects, regardless of the causes.
Disasters are unique combinations of events and circumstances. The two primary categories are natural and man-made. Within the man-made category, we can further define deliberate and accidental causes.
Natural hazards are typically considered "acts of God," for which there is no one to blame. By contrast, man-made events are those in which one or more persons may be held accountable for contributing to the events that caused the disaster, whether through intent, neglect or accident. See the chart below, "Natural and man-made hazards," for more detail.
Natural and man-made hazards
|Natural hazards|Man-made hazards (deliberate)|Man-made hazards (accidental)|Man-made hazards (indirect)|
|---|---|---|---|
|Thunderstorm|Break-in|Operator error|Power failure|
|Flooding|Fraud|Software programming error|Telecommunications failure|
|Snow storm|Strike|Fire|Floods from fire fighting|
|Ice storm|Riot|Fire extinguisher discharge|Sinkhole from collapsing road|
|Hail|Vandalism|Water leaks|Collapsing elevated roadway|
|Sunspots|Bomb damage|Fire suppression system discharge| |
Once the risks have been identified, you'll want to identify the potential effects, symptoms and consequences resulting from the event occurring.
There are five basic effects that can have disastrous consequences: denial of access, data loss, loss of personnel, loss of function and lack of information.
The perceived symptoms might be a loss (or lack of):
- Access or availability
- Data integrity
- Personnel (temporary loss)
- System function
Secondary effects or consequences might include:
- Interrupted cash flow
- Loss of image
- Brand damage
- Loss of market share
- Lower employee morale
- Increased staff turnover
- Costs of repair
- Costs of recovery
- Penalty fees
- Legal fees
Risk assessments generally take one of two forms: quantitative, which seeks to identify the risks and quantify them on a numeric scale, e.g., 0.0 to 1.0 or 1 to 10; and qualitative, which is based on gaining a general impression of the risks so as to qualify them. The qualitative process uses subjective terms such as "low," "medium" and "high," or "poor," "good" and "excellent," instead of numeric values.
Quantitative methods, which assign a numeric value to the risk, usually require access to reliable statistics to project future likelihoods. As mentioned earlier, qualitative methods often include subjective measures like low, medium and high. However, sometimes the qualitative approach is more acceptable to management.
A basic formula, Risk = Likelihood x Impact, is typically used to compute a risk value. For example, we can use a scale of 0.0 to 1.0 for likelihood, in which 0.0 means the threat is not likely to occur and 1.0 means the threat will absolutely occur. For impact, 0.0 means there is no damage or disruption to the organization, whereas 1.0 could mean the company is completely destroyed and unable to further conduct business. Numbers in between can represent the result of a statistical analysis of threat data and company experience. The downloadable risk assessment template uses this approach.
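As a minimal sketch of the formula above, the following Python computes a risk value for a few threats. The threat names come from the hazards chart in this guide; the likelihood and impact numbers are illustrative assumptions rather than data from a real assessment, and `risk_value` is a hypothetical helper name.

```python
def risk_value(likelihood: float, impact: float) -> float:
    """Return Risk = Likelihood x Impact, with both inputs on a 0.0 to 1.0 scale."""
    for name, value in (("likelihood", likelihood), ("impact", impact)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")
    return likelihood * impact


# Assumed (likelihood, impact) estimates per threat, for illustration only.
threats = {
    "Power failure": (0.5, 0.5),
    "Flooding": (0.25, 1.0),
    "Operator error": (0.5, 0.25),
}

scores = {name: risk_value(lik, imp) for name, (lik, imp) in threats.items()}

# List the threats from highest to lowest risk value.
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

In practice the estimates would come from the threat sources listed earlier, such as company records and historical weather data, rather than from hard-coded values.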
Using the quantitative range 0.0 to 1.0, you may decide to assign qualitative terms to the results, e.g., 0.0 to 0.4 = low risk, 0.5 to 0.7 = moderate risk, and 0.8 to 1.0 = high risk.
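The mapping from a numeric risk value to a qualitative term could be sketched as follows. The bands quoted above are stated to one decimal place, so a value such as 0.45 falls between them; this sketch assumes each band extends up to the start of the next one. The function name `risk_category` is hypothetical.

```python
def risk_category(score: float) -> str:
    """Map a 0.0 to 1.0 risk value to 'low', 'moderate' or 'high'.

    Assumption: each band runs up to the start of the next,
    so values between 0.4 and 0.5 count as low and values
    between 0.7 and 0.8 count as moderate.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0.0 and 1.0, got {score}")
    if score < 0.5:
        return "low"
    if score < 0.8:
        return "moderate"
    return "high"


for value in (0.1, 0.45, 0.5, 0.75, 0.9):
    print(value, risk_category(value))
```

If your organization prefers different cut-offs, only the two threshold constants need to change.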
Once all relevant risks have been analyzed and assigned a qualitative category, you can then examine strategies to deal with only the highest risks, or you can address all risk categories. This will depend on management's risk appetite, which is the level of risk management is willing to accept. The strategies you define for risks can next be used to help design business continuity and disaster recovery strategies.
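One way to make that choice concrete is to treat risk appetite as a numeric threshold and select only the risks that exceed it. This is an assumption made for illustration; the guide does not define risk appetite numerically. The `Risk` class, the `risks_to_address` function and the sample figures below are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    name: str
    likelihood: float  # 0.0 to 1.0
    impact: float      # 0.0 to 1.0

    @property
    def score(self) -> float:
        # Risk = Likelihood x Impact
        return self.likelihood * self.impact


def risks_to_address(risks, appetite):
    """Return the risks whose score exceeds the appetite threshold, highest first."""
    above = [r for r in risks if r.score > appetite]
    return sorted(above, key=lambda r: r.score, reverse=True)


# Illustrative register; the numbers are assumptions, not real data.
register = [
    Risk("Power failure", 0.5, 1.0),
    Risk("Fraud", 0.25, 0.5),
    Risk("Water leaks", 0.5, 0.5),
]

selected = risks_to_address(register, appetite=0.2)
print([r.name for r in selected])
```

Setting the threshold to 0.0 selects every risk with a nonzero score, which corresponds to addressing all risk categories; a higher threshold narrows the list to the highest risks.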
Risk assessments are key activities in a business continuity or disaster recovery program. The process can be relatively simple if you elect to use a qualitative approach. It can be more rigorous with a quantitative approach, as you may want to substantiate your numerical factors with statistical evidence. Results should be reviewed periodically to determine whether the risks (e.g., their likelihood and impact) have changed, and updated accordingly. Regardless of the methodology, the results should map to the critical business processes identified in the business impact analysis, and can help define strategies for responding to the identified risks. SearchDisasterRecovery.com's free risk assessment template will help you get started.
About this author: Paul Kirvan, CISA, CISSP, FBCI, CBCP, has more than 20 years of experience in business continuity management as a consultant, author and educator. He is also secretary of the Business Continuity Institute USA Chapter.