Disaster Recovery.com

disaster recovery (DR)

By Kinza Yasar

What is disaster recovery (DR)?

Disaster recovery (DR) is an organization's ability to respond to and recover from an event that negatively affects business operations. The goal of DR is to reduce downtime, data loss and operational disruptions while maintaining business continuity. To prepare for this, organizations often perform an in-depth analysis of their systems and IT infrastructure and create a formal document to follow in times of crisis. This document is known as a disaster recovery plan.

What is a disaster?

The practice of DR revolves around events that are serious in nature. These events are often thought of in terms of natural disasters, but they can also be caused by systems or technical failure, human errors, or intentional attacks. These events are significant enough to disrupt or completely stop critical business operations for a period of time.

Why is disaster recovery important?

Disasters can inflict damage with varying levels of severity, depending on the scenario. A brief network outage could frustrate customers and cost an e-commerce business some sales. A hurricane or tornado could destroy an entire manufacturing facility, data center or office.

An effective disaster recovery plan lets organizations respond promptly to disruptive events.

DR initiatives are more attainable for businesses of all sizes today due to widespread cloud adoption and the broad availability of virtualization technologies that make backup and replication easier. However, much of the terminology and many of the best practices developed for DR were based on enterprise efforts to re-create large-scale physical data centers. This involved plans to transfer, or fail over, workloads from a primary data center to a secondary location or DR site to restore data and operations.

What is the difference between disaster recovery and business continuity?

On a practical level, DR and business continuity are often combined into a single corporate initiative and even abbreviated together as BCDR, but they aren't the same thing. While the two disciplines have similar goals relating to an organization's resilience, they differ greatly in scope.

Business continuity (BC) is a proactive discipline intended to minimize risk and help ensure the business can continue to deliver its products and services no matter the circumstances. It focuses especially on how employees continue to work and how the business continues operations while a disaster is occurring. BC is also closely related to business resilience, crisis management and risk management, but each of these disciplines has different goals and parameters.

DR is a subset of business continuity that focuses on the IT systems that enable business functions. It addresses the specific steps an organization must take to resume technology operations following an event. DR is also a reactive process by nature. While planning for it must be done in advance, DR activity isn't kicked off until a disaster actually occurs.

Elements of a disaster recovery strategy

Organizations should consider several factors while developing a disaster recovery strategy. Common elements of a DR strategy include the following:

Risk analysis

Risk analysis or risk assessment is an evaluation of all the potential risks the business could face, as well as their outcomes. Risks can vary greatly depending on the industry the organization is in and its geographic location. The assessment should identify potential hazards, determine whom or what these hazards would harm, and use the findings to create procedures that take these risks into account.

Business impact analysis

A business impact analysis (BIA) evaluates the effects of the identified risks on business operations. A BIA can help predict and quantify costs, both financial and nonfinancial. It also examines the effects of different disasters on an organization's safety, finances, marketing, business reputation, legal compliance and quality assurance.
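The financial side of a BIA can be sketched as simple arithmetic. The example below is a minimal illustration, not a standard BIA formula: the `ImpactEstimate` fields, the function names and the cost model (hourly loss multiplied by outage duration, plus a one-time recovery cost) are all assumptions introduced here, and nonfinancial effects such as reputation are left out.

```python
from dataclasses import dataclass


@dataclass
class ImpactEstimate:
    """Hypothetical inputs for one business function; figures are illustrative."""
    name: str
    revenue_loss_per_hour: float   # direct financial loss while the function is down
    recovery_cost: float           # one-time cost to restore the function
    expected_outage_hours: float   # estimated downtime for the scenario


def estimated_financial_impact(e: ImpactEstimate) -> float:
    """Assumed model: hourly loss x outage duration + one-time recovery cost."""
    return e.revenue_loss_per_hour * e.expected_outage_hours + e.recovery_cost


def rank_by_impact(estimates):
    """Return the estimates ordered from highest to lowest financial impact."""
    return sorted(estimates, key=estimated_financial_impact, reverse=True)
```

Ranking business functions this way is one possible input for deciding which systems need the most aggressive recovery goals.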

Understanding the difference between risk analysis and BIA and conducting the assessments can also help an organization define its goals when it comes to data protection and the need for backup. Organizations generally quantify these using measurements called recovery point objective (RPO) and recovery time objective (RTO).

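RPO and RTO are both important elements in disaster recovery, but the metrics have different uses. RPO, the maximum amount of data loss an organization can tolerate expressed as a span of time, is acted on before a disruptive event takes place to ensure data is backed up often enough. RTO, the maximum time systems can remain down, comes into play after an event occurs.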
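As a rough sketch of how the two metrics might be checked, the example below treats RPO as a limit on the data-loss window (the time between the last good backup and the event) and RTO as a limit on downtime (the time between the event and restoration). The function names and parameters are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta


def meets_rpo(last_backup: datetime, event_time: datetime, rpo: timedelta) -> bool:
    """Data written after the last backup is lost; that window must not exceed the RPO."""
    return event_time - last_backup <= rpo


def meets_rto(event_time: datetime, restored_time: datetime, rto: timedelta) -> bool:
    """Downtime runs from the event until service is restored; it must not exceed the RTO."""
    return restored_time - event_time <= rto


# Example: nightly backups cannot satisfy a 4-hour RPO if the event occurs at midday.
event = datetime(2024, 1, 2, 12, 0)
last_backup = datetime(2024, 1, 2, 0, 0)
print(meets_rpo(last_backup, event, timedelta(hours=4)))
```

The RPO check can run continuously against the backup schedule, while the RTO check only has meaning once a recovery has actually taken place or been rehearsed.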

Incident response

This encompasses detecting, containing, analyzing and resolving a disruptive event. Incident response includes activating the disaster recovery plan, evaluating the incident's scope and effect, executing the recovery strategy, restoring normal operations and deactivating the plan. To maintain accountability and promote ongoing improvement, it's also essential to record and report incident response actions and results.
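The sequence described above, together with the need to record each action, could be modeled as an ordered set of phases with a timestamped log. This is only a sketch: the `Phase` names paraphrase the steps in the text, and the `IncidentLog` class and its strict ordering rule are assumptions made for illustration, since real incidents may revisit earlier steps.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import IntEnum


class Phase(IntEnum):
    """Phases in the order the text lists them (names are illustrative)."""
    ACTIVATE_PLAN = 1
    ASSESS_SCOPE = 2
    EXECUTE_RECOVERY = 3
    RESTORE_OPERATIONS = 4
    DEACTIVATE_PLAN = 5


@dataclass
class IncidentLog:
    """Records each phase with a UTC timestamp and rejects out-of-order entries."""
    entries: list = field(default_factory=list)

    def record(self, phase: Phase, note: str) -> None:
        if len(self.entries) == len(Phase):
            raise ValueError("incident already closed")
        expected = Phase(len(self.entries) + 1)
        if phase != expected:
            raise ValueError(f"expected {expected.name}, got {phase.name}")
        self.entries.append((datetime.now(timezone.utc), phase, note))

    def is_closed(self) -> bool:
        """True once every phase, including plan deactivation, has been recorded."""
        return len(self.entries) == len(Phase)
```

Keeping the log as structured entries rather than free text makes it easier to report on response times afterward.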

The components of a DR strategy can vary depending on the size, industry and particular demands of an organization. Therefore, these plans should be customized to meet the unique requirements of each business.

What's in a disaster recovery plan?

Once an organization has thoroughly reviewed its risk factors, recovery goals and technology environment, it can write a disaster recovery plan. The DR plan is the formal document that specifies these elements and outlines how the organization will respond when disruption or disaster occurs. The plan details recovery goals including RTO and RPO, as well as the steps the organization will take to minimize the effects of the disaster.

The components of a DR plan should include the following:

An organization should consider its DR plan a living document. It should schedule regular disaster recovery testing to ensure the plan is accurate and will work when a recovery is required. The plan should also be evaluated against consistent criteria whenever there are changes in the business or IT systems that could affect disaster recovery.
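A simple check can flag when a plan is due for another test. The sketch below assumes a fixed testing interval and a flag for relevant business or IT changes; both inputs and the function name are hypothetical, and an organization would set the interval according to its own policy.

```python
from datetime import date, timedelta


def review_needed(last_test: date, today: date,
                  test_interval: timedelta, changed_since_test: bool) -> bool:
    """A plan is due for testing if relevant changes occurred or the interval has elapsed."""
    return changed_since_test or (today - last_test) > test_interval
```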

Disaster recovery sites

An organization uses a DR site to recover and restore its data, technology infrastructure and operations when its primary data center is unavailable. DR sites can be internal, external or cloud-based.

An internal DR site is set up and maintained by the organization itself. Organizations with large information requirements and aggressive RTOs are more likely to use an internal DR site, which is typically a second data center. When building an internal site, the business must consider hardware configuration, supporting equipment, power maintenance, heating and cooling of the site, layout design, location and staff.

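An external disaster recovery site is owned and operated by a third-party provider. External sites can be hot, warm or cold. A hot site is a fully equipped facility with current data that can take over almost immediately; a warm site has equipment in place but lacks current data; and a cold site provides space and basic infrastructure but no technology until a disaster occurs.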

A cloud-based disaster recovery site is another option. An organization should consider site proximity, internal and external resources, operational risks, service-level agreements (SLAs) and cost when contracting with cloud providers to host its DR assets or outsourcing additional services.

Disaster recovery tiers

In addition to choosing the most appropriate DR site, it can be helpful for organizations to consult the tiers of disaster recovery identified by the Share Technical Steering Committee and IBM in the 1980s. The tiers feature a variety of recovery options organizations can use as a blueprint to help determine the best DR approach depending on their business needs.

The recognized disaster recovery tiers include the following:

Another type of DR tiering involves assigning levels of importance to different types of data and applications, and treating each tier differently based on the tolerance for data loss. This approach recognizes that some mission-critical functions might not be able to tolerate any data loss or downtime, while others can be offline for longer or have smaller sets of data restored.
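This kind of tiering could be expressed as a small classification rule. In the sketch below, the three tier numbers and the 15-minute and 4-hour thresholds are purely hypothetical; an organization would substitute its own tolerances for downtime and data loss.

```python
from datetime import timedelta


def assign_tier(max_downtime: timedelta, max_data_loss: timedelta) -> int:
    """Classify an application by its strictest tolerance (thresholds are assumptions).

    1 = mission-critical, 2 = important, 3 = can wait for a slower restore.
    """
    strictest = min(max_downtime, max_data_loss)
    if strictest <= timedelta(minutes=15):
        return 1
    if strictest <= timedelta(hours=4):
        return 2
    return 3
```

Using the stricter of the two tolerances means an application that can be offline for a day but cannot lose any data still lands in the top tier.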

Types of disaster recovery

In addition to choosing a DR site and considering DR tiers, IT and business leaders must evaluate the best way to put their DR plan into action. This will depend on the IT environment and the technology the business chooses to support its DR strategy.

Types of disaster recovery can vary, based on the IT infrastructure and assets that need protection, as well as the method of backup and recovery the organization decides to use. Depending on the size and scope of the organization, it might have separate DR plans and response and resilience teams specific to different departments.

Major types of DR include the following:

Disaster recovery services and vendors

Disaster recovery providers can take many forms, as DR is more than just an IT issue, and business continuity affects the entire organization. DR vendors include those selling backup and recovery software, as well as those offering hosted or managed services. Because disaster recovery is also an element of organizational risk management, some vendors couple it with other aspects of security planning, such as incident response and emergency planning.

Examples of options for DR services and vendors include the following:

Choosing the best option for an organization ultimately depends on top-level business continuity plans and data protection goals, as well as which option best meets those needs and budgetary goals.

Examples of DR software and DRaaS providers include the following:

Emergency communication vendors are also a key part of the disaster recovery process, as they help keep employees informed during a crisis by sending them notifications and communications. Examples of vendors and their systems include AlertMedia, BlackBerry AtHoc, Cisco Emergency Responder, Everbridge Crisis Management and Rave Alert.


While some organizations might find it challenging to invest in comprehensive disaster recovery planning, none can afford to ignore the concept when planning for long-term growth and sustainability. In addition, if the worst were to happen, organizations that have prioritized DR would experience less downtime and be able to resume normal operations faster.

Businesses often prepare for minor disruptions, but it's easy to overlook larger and more intricate disasters.

02 Jan 2024

All Rights Reserved, Copyright 2008 - 2024, TechTarget | Read our Privacy Statement