If you have been assigned the task of determining what type of failover or disaster recovery site is required to respond to your organization's disaster recovery (DR) program, then you know there are a number of vendors and options (hot, cold and warm sites) available with a wide range of prices. But which one is right for your company? This tip provides you with what you need to know to determine the right failover disaster recovery site for your company.
First, you must know the difference between a hot, cold and warm site. A hot site is a recovery site that is fully equipped to take over data processing operations at short notice. The data is either frequently or continuously replicated from the live site to the hot site, usually via data communication links. Hot sites provide companies with the highest level of protection. Therefore, they are the most expensive option.
A cold site is a standby location, which can be used to house data processing facilities in the event of a disaster. It typically contains the appropriate electrical and heating/air conditioning systems, but does not contain active equipment or communication links. Think of it as a reserved open space/room where you can set up and connect equipment or furniture when necessary. Cold sites are the least expensive option.
A warm site falls between a hot site and a cold site. Typically, warm sites contain data links and preconfigured equipment necessary to rapidly start operations, but do not have active or live data. So, to recover operations at a warm site, at a minimum you will need to load and restore data before operations can resume.
The type of recovery site you'll need for your company is determined by the business impact analysis (BIA), which is part of most business continuity programs. When a business process is defined within the BIA, one of the required items is the recovery time objective (RTO), or the maximum time that the function can be unavailable. Most RTOs are classified as less than two hours, two to eight hours, eight to 24 hours, or more than 24 hours. The length of the RTO helps determine the recovery strategy and whether the organization needs a hot, cold or warm site.
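As a rough illustration of how BIA results might be sorted into these RTO classes, here is a minimal Python sketch. The function name `rto_band`, the sample business functions and their RTO values are all hypothetical, and the handling of the exact boundary values (two, eight and 24 hours) is an assumption, since the classes above do not say which side a boundary falls on.

```python
def rto_band(rto_hours: float) -> str:
    """Return the RTO class for a recovery time objective given in hours.

    Boundary handling is an assumption: exactly 2 hours falls in
    "2 to 8 hours", and exactly 8 or 24 hours falls in "8 to 24 hours".
    """
    if rto_hours < 0:
        raise ValueError("RTO cannot be negative")
    if rto_hours < 2:
        return "less than 2 hours"
    if rto_hours < 8:
        return "2 to 8 hours"
    if rto_hours <= 24:
        return "8 to 24 hours"
    return "more than 24 hours"


# Hypothetical business functions and RTOs (in hours), for illustration only.
bia = {"order entry": 1, "payroll": 6, "reporting": 20, "archive search": 72}

# Classify every function from the BIA into its RTO class.
bands = {name: rto_band(hours) for name, hours in bia.items()}

for name, band in bands.items():
    print(f"{name}: {band}")
```

Grouping functions this way makes it easier to see how many of them fall into the shortest class, which is the group that drives the most expensive recovery options.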
If a number of applications can tolerate no more than two hours of downtime, you may need to consider the hot site option. If most of the applications can tolerate more than 12 hours of downtime, then you can look at other options. And depending on its financial situation, an organization might decide that, because of the high price of a hot site, the business function will have to wait longer than the two hours it expects before returning to operation.
A hybrid approach to failover disaster recovery sites
Some organizations take a hybrid approach with their recovery sites. Rather than have a hot site with technology available within a short timeframe, some organizations will use local or standalone PCs, or even manual processing, until the technical environment is available in recovery mode. Then, when the technology is restored, all the data collected manually and on PCs is entered into the appropriate applications as normal. This hybrid approach allows the operation to continue while the technology is being established. The delay in restoring the technology can save an organization a substantial amount of money.
Ultimately, it is a business decision whether to pay the extra premium for a hot site with reduced delay, or to find some other approach. More and more, organizations are finding themselves dependent on applications such as email, and are thus finding their recovery window narrowing. An option for email is to host that application with an outsourced cloud-based email service. This benefits the organization in two ways: First, an incident in the organization's data center does not affect email, and second, the vendor will probably have a more resilient site.
If many of your applications must be recovered and running within a short timeframe (less than two hours), a hot site is your best option. If your applications are not needed for over 12 hours, a cold site may be your best option because it gives you substantial time to get to the site and begin recovering hardware, telecommunications and applications. If your applications have RTOs anywhere between two and 12 hours, choose the warm site option. The selection of recovery type is purely a cost-benefit analysis of value to the organization. Applications that are mission critical, and/or more profitable, may justify the additional expense. The answer lies with the business or operating units more than IT.
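The rule of thumb above can be sketched in a few lines of Python. This is only an illustration: the function name `recommend_site`, the application names and the RTO values are hypothetical, and treating exactly two hours and exactly 12 hours as warm-site cases is an assumption. It also looks only at RTO and leaves out the cost side of the decision.

```python
from collections import Counter


def recommend_site(rto_hours: float) -> str:
    """Map an application's RTO (in hours) to a recovery site type.

    Under 2 hours -> hot, over 12 hours -> cold, otherwise warm.
    Exactly 2 or 12 hours is treated as warm (an assumption).
    """
    if rto_hours < 0:
        raise ValueError("RTO cannot be negative")
    if rto_hours < 2:
        return "hot"
    if rto_hours > 12:
        return "cold"
    return "warm"


# Hypothetical applications and RTOs (in hours), for illustration only.
apps = {
    "trading platform": 1.0,
    "customer portal": 4.0,
    "payroll": 12.0,
    "document archive": 48.0,
}

# Recommend a site type per application, then count how many land in each.
recommendations = {name: recommend_site(rto) for name, rto in apps.items()}
summary = Counter(recommendations.values())

for name, site in recommendations.items():
    print(f"{name}: {site} site")
print(dict(summary))
```

A per-application breakdown like this is a starting point for the conversation with the business units, who then weigh each recommendation against what the site would cost.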
Warm sites may have equipment stored at the recovery site, but additional time is needed to configure the hardware and load any applications and data. This additional time to configure and load applications for processing reduces the cost, but increases the time needed to resume operation. The cold site usually requires shipment of equipment to the site, after which you must configure hardware and applications much as you would at a warm site. Also, with warm and cold site configurations, there is more chance of errors while configuring the equipment and performing additional setup functions, which can increase the setup time even more.
Third-party vendors and costs for each recovery site option will vary by location. Well-known vendors that offer hot sites are SunGard and IBM Corp.; they also provide warm and cold sites. When considering local alternatives, make sure that the vendor has a recovery site of its own. Depending on your business, you may opt for an in-house solution; hot, warm and cold sites can be established within an organization itself as an alternative.
About this author: Harvey Betan is a certified business continuity (BC) planning consultant with experience in disaster recovery (DR) in both technology and business functions. He has guided government agencies and private companies in preparing disaster recovery plans and exercises to determine readiness and the impact of specific events. Betan has also written articles on business continuity and disaster recovery for various industry periodicals, conducted seminars at DRJ and CPM conferences on pandemic planning and crisis management, and served on numerous expert panels on business continuity planning.