Data center and IT systems availability in disaster recovery planning

When planning for disaster recovery (DR), IT professionals put a lot of effort into ensuring that the critical components of their IT infrastructure have the necessary redundancies in place to support the availability or recovery requirements defined by the business.

While this is definitely a requirement at the system and application level, the underlying infrastructure, such as power and cooling, cannot be overlooked. Disasters do not always take the shape of destructive events such as a tornado, hurricane or fire that can wipe out an entire facility. Many organizations consider a power failure lasting many hours to be a disaster. Most will agree that failing over to a recovery site because of building maintenance or other services can be a risky and costly exercise. Likewise, replacing an uninterruptible power system (UPS) or servicing an air conditioning unit should not force a company to activate its disaster recovery plan. This is why the data center infrastructure must be redundant and meet the same availability requirements as the IT infrastructure it supports.

Data center infrastructure redundancy covers cooling and power sources, but it must also include a reliable power distribution path. Essentially, an outage at the facility level should not cause IT systems to be unavailable beyond their defined recovery or availability requirements.

Data center availability rating

Data center availability is usually rated by tiers, which were originally developed by The Uptime Institute in 1995 and have since become widely accepted by the industry. Four tiers describe the availability of site infrastructure: Basic (Tier 1), Redundant capacity components (Tier 2), Concurrently maintainable (Tier 3) and Fault tolerant (Tier 4). Note that even a Tier 1 certified facility must include components such as an emergency power generator and a UPS to ensure a basic level of availability. Subsequent tier levels build on those basic redundancies, all the way up to a fully fault-tolerant facility (Tier 4).

A facility without a power generator will not achieve any tier rating. Not every facility seeks availability certification, so this is not necessarily a problem as long as your IT systems' availability requirements can tolerate it.
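To make an availability requirement more tangible, it can help to express it as allowable downtime per year. The following Python sketch does that conversion. The function name and the sample percentages are illustrative assumptions; they are not figures taken from this article or from any tier standard.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def annual_downtime_hours(availability_percent: float) -> float:
    """Return the hours of downtime per year implied by an availability percentage."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability_percent must be between 0 and 100")
    return HOURS_PER_YEAR * (1.0 - availability_percent / 100.0)


if __name__ == "__main__":
    # Sample targets chosen only for illustration.
    for target in (99.0, 99.9, 99.99):
        hours = annual_downtime_hours(target)
        print(f"{target}% availability allows about {hours:.2f} hours of downtime per year")
```

Comparing the result for the facility with the downtime the business can tolerate for its IT systems is a quick way to see whether the two sets of requirements are aligned.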

Common mistakes in data center disaster recovery planning

Even for organizations that are not seeking to build the most fault-tolerant facility, the following common mistakes and issues are often at the root of outages:

UPS batteries: Organizations that do not have access to an emergency power generator are often tempted to fill the gap with extended UPS battery runtime, hoping to ride out a power failure. The problem is that systems are of little use if they are the only thing running while the rest of the building is without power. Furthermore, systems cannot run very long without air conditioning, which is one piece of equipment that should never be powered by an uninterruptible power system. It is usually a good idea to limit battery runtime to just enough capacity for a graceful system shutdown; anything more will not necessarily provide a good return on investment unless it serves a very specific purpose, such as security or life support systems.

Redundant UPS: Better uninterruptible power systems have built-in redundancies such as N+1 power modules and maintenance bypass switches to prevent outages resulting from maintenance such as battery replacement. Deploying dual uninterruptible power systems (N+N) for protection against total UPS or power circuit failure is a good practice, but only if there are dual (A+B) power feeds. A standard of dual-corded servers powered from separate in-rack power distribution units, each fed by an independent uninterruptible power system, still leaves an exposure if both UPS units are connected to the same breaker or subpanel, which then becomes a single point of failure.

Cooling redundancy and capacity: Cooling issues are a common source of outages that can quickly turn into a disaster. Many IT environments are at risk simply because they have run out of cooling capacity. Capacity must be monitored closely, and shortages must be addressed before issues develop.
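The exposure described under "Redundant UPS" above can be checked with a simple inventory of each power path. The following Python sketch is one possible way to do it, assuming each power cord's path is recorded as a list of upstream components. The function and all component names are hypothetical and serve only as an illustration.

```python
from typing import Dict, List, Set


def single_points_of_failure(power_paths: Dict[str, List[str]]) -> Set[str]:
    """Return the components that appear in every power path feeding a device.

    Each path lists the components between the utility source and one of the
    device's power cords. A component shared by all paths takes the device
    down when it fails, no matter how many cords the device has.
    """
    paths = [set(components) for components in power_paths.values()]
    if not paths:
        return set()
    return set.intersection(*paths)


# Hypothetical dual-corded server: separate PDUs and UPS units,
# but both UPS units are fed from the same subpanel.
server_paths = {
    "cord_a": ["subpanel_1", "ups_a", "pdu_a"],
    "cord_b": ["subpanel_1", "ups_b", "pdu_b"],
}

print(single_points_of_failure(server_paths))  # {'subpanel_1'}
```

With truly independent A and B feeds, the two paths share no component and the function returns an empty set.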

Building a redundant data center is always a challenge for smaller organizations that have high-availability requirements for their IT systems but do not have the budget to implement a highly redundant or fault-tolerant facility. This is where options like hosting or colocation become appealing. The cost of access to a hardened facility is shared by many users and becomes an operational cost rather than a large capital expenditure.

Ultimately, it is the recovery and availability requirements of the IT infrastructure supporting the business activity that dictate the availability requirements of the data center facility. The higher the impact of an outage on the business, the easier it becomes to justify the cost of redundancy. IT and data center/storage managers working in areas at high risk of natural disasters should never forget that even the most redundant data center is still a single point of failure by itself. Having a highly redundant and fault-tolerant facility is not a substitute for a disaster recovery strategy or plan.

Pierre Dorion is the data center practice director and a senior consultant with Long View Systems Inc. in Phoenix, Ariz., specializing in the areas of business continuity and DR planning services and corporate data protection.

This was first published in August 2009