Hot site, warm site or cold site? Here's how to figure out the best disaster recovery strategy for your company.
By Jacob Gsoedl
The ability to recover from a disaster in an acceptable period of time is a critical issue for companies with increasing dependence on information technology. Once thought to be a concern only for larger organizations, being able to recover mission-critical applications within a predictable timeframe is now a mandate for companies of any size. But some users see disaster recovery (DR) as a pricey insurance policy, and may take shortcuts to try to save a few dollars. To avoid becoming victims of budget cuts, DR provisions and sites must be built around a few basic principles that allow management to decide what's required while candidly showing the possible business impact and consequences of retrenchments.
Recovery time objective (RTO) and recovery point objective (RPO) are the key metrics to determine the DR level required to recover business processes and applications. They are inversely related to the cost of DR: The closer RTO and RPO need to be to zero, the more expensive DR provisioning will be. If recovery time can be days or even weeks, costs will likely be significantly lower.
Determining the necessary RTOs and RPOs is the single most important exercise a business needs to perform to ensure the right level of DR without wasting money. RTOs and RPOs are derived through business impact analysis of business processes and applications to determine the value of business processes and the anticipated financial impact if they become unavailable. Obviously, this varies greatly by business process and application. "While for just-in-time manufacturing the critical threshold may be 15 minutes, it could be days for a marketing application," says George Ferguson, worldwide service segment manager for Hewlett-Packard (HP) Co.'s business continuity and recovery services.
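The arithmetic behind a business impact analysis can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the `BusinessProcess` class, the `outage_impact` function and the hourly loss figures are all hypothetical; only the 15-minute and multi-day thresholds echo the examples quoted above.

```python
from dataclasses import dataclass


@dataclass
class BusinessProcess:
    name: str
    loss_per_hour: float  # hypothetical estimate of loss per hour of downtime (USD)
    rto_hours: float      # candidate recovery time objective, in hours


def outage_impact(process: BusinessProcess) -> float:
    """Estimated loss if the process stays down for its full RTO."""
    return process.loss_per_hour * process.rto_hours


# Illustrative figures only; a real analysis would use numbers from the business.
processes = [
    BusinessProcess("just-in-time manufacturing", 200_000.0, 0.25),  # 15 minutes
    BusinessProcess("marketing application", 500.0, 48.0),           # two days
]

# Rank processes by how quickly losses accumulate, most urgent first.
ranked = sorted(processes, key=lambda p: p.loss_per_hour, reverse=True)

for p in ranked:
    print(f"{p.name}: RTO {p.rto_hours} h, estimated impact ${outage_impact(p):,.0f}")
```

Ranking by loss per hour rather than by total impact is a design choice in this sketch: it highlights which processes can least tolerate a long RTO, which is the question the analysis is meant to answer.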
Very likely, determining RTOs and RPOs will be an iterative process because of two competing forces: available budget and required recovery objectives. "The challenge of contingency services like disaster recovery is to find the right balance between available budget and what's required to sustain the business," says Greg Schulz, founder and senior analyst at StorageIO Group, Stillwater, MN.
Disaster recovery options
With a business impact analysis in hand and agreement on RTOs and RPOs, IT management can devise implementation options. Disaster recovery site terminology can be confusing -- terms like hot site, warm site and cold site are common in DR parlance, but they're used inconsistently. A hot site in the U.S. typically comprises shared equipment, while "in Europe the term hot site is predominantly used for dedicated equipment," says Ferguson. The following definitions match the prevailing U.S. interpretations of these terms:
It's quite common for a DR site to serve various roles for different applications. For instance, a DR site may serve as a hosted site with close to real-time failover for a mission-critical e-commerce application, and it may also serve as a low-end warm site with tape-based recovery for a less critical engineering application. Many DR sites are hybrids where the application determines the role of the site. As a result, disaster recovery companies that host DR sites typically offer their services in tiers that can be mapped to RTOs and RPOs required by applications (see "DR tiers," below).
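Mapping an application's objectives to a service tier can be sketched as a simple lookup. The tier names and thresholds below are hypothetical placeholders, not any provider's actual offering, and `select_tier` is an illustrative helper. The sketch assumes tiers are listed from least to most capable, so the first match is presumably the least expensive one that meets both objectives.

```python
from typing import NamedTuple, Optional


class Tier(NamedTuple):
    name: str
    rto_hours: float  # recovery time the tier can deliver
    rpo_hours: float  # data-loss window the tier can deliver


# Hypothetical tiers, ordered from least to most capable.
TIERS = [
    Tier("tier C (tape-based recovery)", 72.0, 24.0),
    Tier("tier B (replicated data, standby equipment)", 8.0, 1.0),
    Tier("tier A (near real-time failover)", 0.25, 0.0),
]


def select_tier(required_rto: float, required_rpo: float) -> Optional[Tier]:
    """Return the first (least capable) tier that meets both objectives."""
    for tier in TIERS:
        if tier.rto_hours <= required_rto and tier.rpo_hours <= required_rpo:
            return tier
    return None  # no tier is fast enough; the objectives need revisiting


# A less critical engineering application versus a mission-critical one.
print(select_tier(96.0, 24.0))
print(select_tier(0.25, 0.0))
```

A `None` result signals that no available tier satisfies the stated objectives, which is the point at which budget and recovery objectives have to be renegotiated.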