Many companies and government agencies are engaged in consolidating data centers. The largest of these organizations have data centers around the country and even around the world. In many cases, they have grown by acquisition and have added data centers along with their new subsidiaries.
Data center consolidation isn't only an issue for disaster recovery (DR) planners in large companies. Smaller organizations, often with only one physical location, find themselves with servers and data storage scattered around their office premises. These distributed data centers often lack the safety and access control of a centralized facility. Like larger organizations, these companies also see a need to consolidate their processing.
Data center disaster recovery considerations
Moreover, for both large and small organizations, the needless duplication of personnel, real estate, heating and cooling, and network connectivity has become a financial drag at a time when expenses are receiving increased management scrutiny. Thus, these organizations are consolidating the equipment in numerous large and small facilities into data centers intended to serve their enterprises as a whole. The drivers for this consolidation are both technical and economic, but the result is the same: more processing power and storage in fewer locations.
When there are many data centers, the loss of one has limited effect. But when consolidation reduces the number of sites from many to a relatively small number, the loss of any one of them would be devastating. The difference of scale also renders many previous economic and technical calculations moot. For most companies, it's not economical to build a data backup site for each consolidated data center. And there are few, if any, commercial disaster recovery services that can accommodate the new mega-data centers.
Assuming that the decision to consolidate is made on the basis of a combination of technology, risk management, cost and customer service considerations, it is clear that recoverability requires a different approach than many companies have taken to disaster recovery in the past. There are a number of factors that might be addressed with regard to the structure of recoverability and resilience among consolidated data centers, and companies should give them consideration in their decision-making.
What kind of backup sites, and how many?
The viability and adaptability of legacy data centers to serve as a backup for large sites built for modern systems should be viewed with caution. For example, data centers originally designed for mainframe operations do not adapt well to large-scale implementation of blade servers, large Unix servers, concentrated storage and intense network activity. The heat generated by the equipment is difficult to dissipate in an older data center, which typically has 12-foot ceilings. The heat often leads to irregular airflow within the data center and hot spots in a number of locations. Also, the 1½- to 2-foot raised floor found in older data centers makes it difficult to provide and manage sufficient power for server and storage arrays. Legacy data centers were designed for power loading of 35 to 45 watts per square foot, whereas highly concentrated equipment may draw up to 100 watts per square foot. Many older data centers are laid out with a relatively large area for raised floor and less for mechanical, electrical and plumbing (MEP) equipment. Modern data centers often reverse that ratio of raised floor to MEP space.
In addition, in legacy data centers, the sheer number of cables necessary to link numerous servers, storage and networks becomes a tangle in restricted spaces. Plus, the cabling gives off heat that needs to be dissipated under the floor.
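The power-loading figures quoted above lend themselves to a quick back-of-the-envelope check. The sketch below is only illustrative: the function name is hypothetical, and it assumes that power is the sole constraint and that load is averaged evenly over the floor, which ignores the cooling and cabling problems just described.

```python
def usable_floor_fraction(design_w_per_sqft: float,
                          equipment_w_per_sqft: float) -> float:
    """Fraction of the raised floor that could hold dense equipment
    without exceeding the room's design power load.

    Assumption (not from the article): power is the only limit and the
    load is spread evenly across the floor.
    """
    if design_w_per_sqft <= 0 or equipment_w_per_sqft <= 0:
        raise ValueError("power densities must be positive")
    # A room designed for more power than the equipment draws is fully usable.
    return min(1.0, design_w_per_sqft / equipment_w_per_sqft)


# Legacy design loads of 35 and 45 W/sq ft versus equipment drawing 100 W/sq ft.
for design_load in (35, 45):
    fraction = usable_floor_fraction(design_load, 100)
    print(f"{design_load} W/sq ft design load: "
          f"roughly {fraction:.0%} of the floor usable")
```

Under these simplified assumptions, well under half of a legacy floor could be populated with highly concentrated equipment, which is one reason such a site may fall short as a backup for a modern facility.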
One approach is to design the consolidated data centers to back up one another. If the sites are to be consolidated to one, it is clear that an alternate site is also required. If there are two, one might back up the other. With three or more, it might make sense to have one act as a disaster recovery center for the others.
The problems with a centralized disaster recovery facility
A core assumption within a centralized disaster recovery strategy is that the backup facility contains enough equipment to run the production operations of any of the other sites should one of them be affected. The implication is that there is sufficient consistency in the hardware and software used within each site that the applications and infrastructure of any one could run on the servers and storage located in the recovery center. Of course, 100% overlap is nearly impossible to achieve, but the more the configurations at the primary sites differ from one another, the more diverse the equipment that must be installed at the recovery site. If configurations are significantly diverse (as would be the case in data centers obtained through a merger), the equipment installed in the backup site would be a multiple of that in any of the others, which is an uneconomic proposition. Server and storage virtualization help with this, but if physical configurations are not comparable, a price will be paid in capacity.
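The effect of configuration diversity on the size of a central recovery site can be illustrated with a small calculation. This is a rough sketch under stated assumptions: only one production site fails at a time, equipment is counted in interchangeable "units" per platform type, and the site names, platform names and quantities are all invented for illustration.

```python
def recovery_site_inventory(sites: dict[str, dict[str, int]]) -> dict[str, int]:
    """Units of each platform the recovery site needs so that it can
    stand in for any ONE production site (single-failure assumption)."""
    needed: dict[str, int] = {}
    for inventory in sites.values():
        for platform, units in inventory.items():
            # Cover the largest requirement any single site has per platform.
            needed[platform] = max(needed.get(platform, 0), units)
    return needed


def overhead_ratio(sites: dict[str, dict[str, int]]) -> float:
    """Recovery-site size relative to the largest production site."""
    recovery_total = sum(recovery_site_inventory(sites).values())
    largest_site = max(sum(inv.values()) for inv in sites.values())
    return recovery_total / largest_site


# Hypothetical inventories: similar configurations versus merger-style diversity.
similar = {
    "site_a": {"x86_blade": 100, "unix_large": 10},
    "site_b": {"x86_blade": 90, "unix_large": 12},
    "site_c": {"x86_blade": 110, "unix_large": 8},
}
diverse = {
    "site_a": {"x86_blade": 100},
    "site_b": {"unix_large": 100},
    "site_c": {"mainframe_lpar": 100},
}

print(f"similar sites: recovery site is {overhead_ratio(similar):.2f}x the largest site")
print(f"diverse sites: recovery site is {overhead_ratio(diverse):.2f}x the largest site")
```

With similar configurations the recovery site is only slightly larger than the biggest production site; with fully disjoint platforms it has to hold a copy of every site's equipment, which is the "multiple of any of the others" case described above.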
Many organizations have found that they have significant business requirements for recovery with little or no data loss. The equipment and network requirements for synchronous or asynchronous data replication will constrain the use of a single backup site. A centralized disaster recovery site would need to have storage on the floor for those applications in all production sites that require data replication. If there were a need to use the facility for recovery purposes, and if server virtualization were not employed, it would be necessary to run servers equivalent to those in the affected site (with the concerns noted above) while continuing to receive and store replicated data from the surviving sites. Moreover, with the disaster recovery site running production, it would also be necessary to have the bandwidth to support both production applications and ongoing replication from the surviving sites. If a company has a reliable plan for bandwidth on demand, this may be feasible. Without such arrangements, the investment in excess network capacity could be daunting.
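The bandwidth concern can be made concrete with simple arithmetic: during a recovery, the central site carries the failed site's production traffic on top of the replication streams still arriving from the surviving sites. The following sketch is illustrative only; the function name, site names and traffic figures (in Mbps) are hypothetical, and it assumes that replication from the failed site stops while the others continue unchanged.

```python
def recovery_site_bandwidth_mbps(production_mbps: dict[str, float],
                                 replication_mbps: dict[str, float],
                                 failed_site: str) -> float:
    """Bandwidth the recovery site needs while it runs the failed site's
    production and keeps receiving replication from the surviving sites."""
    if failed_site not in production_mbps:
        raise KeyError(f"unknown site: {failed_site}")
    surviving_replication = sum(
        rate for site, rate in replication_mbps.items() if site != failed_site
    )
    return production_mbps[failed_site] + surviving_replication


# Hypothetical figures for three production sites.
production = {"site_a": 800, "site_b": 600, "site_c": 400}
replication = {"site_a": 200, "site_b": 150, "site_c": 100}

normal = sum(replication.values())  # day-to-day: replication traffic only
worst = max(
    recovery_site_bandwidth_mbps(production, replication, site)
    for site in production
)
print(f"normal operation: {normal} Mbps; worst-case recovery: {worst} Mbps")
```

The gap between the two figures is the capacity that would have to be either provisioned permanently or obtained through a bandwidth-on-demand arrangement.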
Pairing data centers
Pairing data centers is a strategy where two data centers replicate to each other and serve as each other's recovery site. Pairing data centers with similar configurations and relying on virtualization to enable increased utilization of physical platforms in an emergency may be a more effective strategy than a centralized disaster recovery site. While virtualization has helped some companies to alleviate constraints imposed by a lack of hardware available for recovery purposes, many companies are just now seeking to deploy virtualization for mission-critical applications and may not yet be in a position to leverage its full benefits for resilience and data recovery.
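Whether a paired site can actually take on its partner's workload comes down to headroom on its physical platforms. The check below is a minimal sketch, not a sizing method: loads and capacity are expressed in an arbitrary common unit (for example, virtual CPUs), the 85% utilization ceiling is an assumed planning threshold, and all figures are hypothetical.

```python
def can_absorb_partner(own_load: float,
                       partner_critical_load: float,
                       capacity: float,
                       max_utilization: float = 0.85) -> bool:
    """True if this site can run its own workload plus the partner's
    critical workload without exceeding the utilization ceiling.

    max_utilization is an assumed planning threshold, not a standard.
    """
    if capacity <= 0 or not 0 < max_utilization <= 1:
        raise ValueError("capacity must be positive and utilization in (0, 1]")
    return own_load + partner_critical_load <= capacity * max_utilization


# Hypothetical pair: each site has 1,000 units of virtualized capacity.
print(can_absorb_partner(own_load=450, partner_critical_load=350, capacity=1000))
print(can_absorb_partner(own_load=600, partner_critical_load=350, capacity=1000))
```

In the first case the combined load fits under the ceiling; in the second it does not, so the pair would need more capacity, a lower day-to-day utilization, or a narrower definition of which partner applications are recovered.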
There are many financial, technical, operational, risk and service-level analyses that go into the strategies for both consolidation and recoverability. No single solution is correct or incorrect. If these analyses are well documented and convincing, they should lead to conclusions that satisfy production needs and recoverability within a realistic budget. Some companies may benefit from a shared disaster recovery facility, others may find success with a commercial recovery service, and many benefit from paired sites. Regardless of which option you choose, it's important that all aspects of data center consolidation be given appropriate consideration.
About this author:
Steven Ross is an Executive Principal of Risk Masters Inc. and holds certification as a Master Business Continuity Professional (MBCP). He is a specialist in business continuity management, crisis management and IT disaster recovery planning. He is editor of the multi-volume series, "e-Commerce Security," and author of several of the books in the series, including "e-Commerce Security: Business Continuity Planning."
This was first published in August 2010