Develop a new virtualization strategy for disaster recovery

In part one of this series, learn about the importance of building a virtualization-ready DR strategy with replication and WAN optimization.

The data center is rapidly evolving. As a result, most data protection and disaster recovery (DR) plans were developed for an IT environment that bears little resemblance to today's dynamic, highly distributed and increasingly virtualized data centers. Traditional disaster recovery planning assumed that remote locations had a dedicated IT staff, applications and data were generally static, local data protection was most efficient, and wide-area networks (WANs) between locations were strictly limited and expensive.

To accommodate the evolving data centers, a new disaster recovery strategy is required -- one that addresses the key constraint on enterprise DR scalability and efficiency: limited network capacity. Wide-area network optimization removes this "choke point" by unlocking existing network bandwidth and optimizing the WAN for the unique transport requirements of backup and recovery workloads. The proven benefits of WAN optimization include dramatic data reduction ratios and massive gains in network throughput. These benefits are further magnified when combined with storage replication products. This powerful joint solution makes long-distance, multi-site replication feasible, simpler, and cost-effective, and provides the clearest path to reducing enterprise dependence on expensive and error-prone tape-based disaster recovery.

In this article, learn about the benefits of a virtualization strategy for disaster recovery.


Limitations of a traditional disaster recovery strategy
New technologies offer new opportunities and challenges in disaster recovery planning 
Cloud computing and storage elasticity
The wide-area network becomes core

Limitations of a traditional disaster recovery strategy

Today, many servers and data stores are still backed up locally to tape. According to Taneja Group research, 55% of users still use tape as their primary backup/DR medium, but most also employ at least one additional disk-to-disk method. Tape-based backup requires local IT staff to manage data backup software, schedules, tape libraries, and offsite archiving. When a failure occurs, multiple complex processes must be coordinated to separately recover and reconfigure servers and data sets, often in multiple locations. As a result, recovery times are often too long and unpredictable.

Distributed, tape-based data backup also suffers from geographic limitations. For example, it can be expensive to ship tapes long distances, and the farther they must be shipped, the longer it will take to recover them in the event of a disaster. This has led many firms to situate recovery sites too close to primary sites, significantly increasing the risk of catastrophic failure due to a major event (power grid failure, hurricane, etc.) affecting a large geographic area.

These workarounds are necessary when a disaster recovery strategy relies too heavily on backup; ideally, data should be replicated to hot or warm recovery sites populated with a mirrored set of server and storage platforms. The most efficient DR architecture is based on data center-to-data center (or site-to-site) replication, eliminating the costs and delays imposed by backup media and handling.

However, data replication between data centers -- either synchronous or asynchronous -- just shifts the cost burden away from the backup media and onto the network, where the high cost of bandwidth remains a prime barrier to deploying widespread replication for DR.
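To make the bandwidth burden concrete, the following Python sketch estimates the sustained WAN throughput needed to replicate a day's changed data within a fixed window. The function name and all input figures (500 GB of changed data, an 8-hour window, a 5:1 data reduction ratio) are illustrative assumptions, not numbers from the research cited in this article.

```python
def required_mbps(changed_gb, window_hours, reduction_ratio=1.0):
    """Sustained WAN throughput (megabits per second) needed to move
    `changed_gb` decimal gigabytes within `window_hours`.

    `reduction_ratio` is a hypothetical data reduction factor applied
    before transmission (5.0 means 5:1); 1.0 means no reduction.
    """
    if changed_gb < 0 or window_hours <= 0 or reduction_ratio <= 0:
        raise ValueError(
            "changed_gb must be >= 0; window_hours and reduction_ratio must be > 0"
        )
    bits_on_wire = changed_gb * 1e9 * 8 / reduction_ratio  # decimal GB -> bits
    return bits_on_wire / (window_hours * 3600) / 1e6      # bits/s -> Mbit/s


# Illustrative inputs only: 500 GB changed per day, 8-hour replication window.
raw = required_mbps(500, 8)
reduced = required_mbps(500, 8, reduction_ratio=5.0)
print(f"No reduction:  {raw:.1f} Mbit/s")
print(f"5:1 reduction: {reduced:.1f} Mbit/s")
```

Under these assumed inputs, the replication stream needs roughly 139 Mbit/s of sustained throughput without any data reduction and roughly 28 Mbit/s at 5:1, which shows why the volume of data sent over the WAN dominates the cost of replication-based DR.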

New technologies offer new opportunities and challenges in disaster recovery planning

In addition to these industry-wide challenges, new technologies -- server and storage virtualization and cloud computing, in particular -- open new avenues for virtualization and disaster recovery planning, but only if properly managed.

Server virtualization encapsulates servers and applications into mobile workloads, which is a boon for workload protection: any server capable of running a hypervisor can be a recovery target. Also, since virtual machines (VMs) are actually just sets of files, they can be protected and replicated as data. These features can aid disaster recovery planning if understood and managed properly.

First, a virtualization strategy is not appropriate for all workloads today or for the foreseeable future, so existing physical-server DR is still required, alongside virtualization-aware disaster recovery. In addition, server virtualization delivers two key benefits that must be kept in balance: higher utilization through consolidation, and server mobility. Consolidation significantly reduces capital costs for server hardware, but raises the chance that a physical server failure affects multiple applications. This higher failure risk can be mitigated by workload mobility within the local area network (LAN) -- for example, by using a live migration technology such as VMware's vMotion.

Storage virtualization enables dynamically optimized storage tiering for greater choice and efficiency when choosing a recovery target. Virtual backup targets, including virtual tape libraries (VTLs), allow easy migration from tape-based to disk-based backup. Thin provisioning reduces the disk footprint of VM images for efficient storage and transport for DR operations. Array-based VM snapshot and cloning moves server protection to the array and leverages economies of scale; multiple data types (including VMs) can be protected with a smaller set of backup targets. Finally, image and file-level recovery options on the array speed VM recovery as well as recovery of files within VM images.

These enhancements are further extended for DR by advanced data replication solutions, including Brocade's FCIP Gateway, Dell/EqualLogic's Auto-Replication, EMC Corp.'s Symmetrix Remote Data Facility/Asynchronous (SRDF/A), Hitachi Data Systems' TrueCopy and Universal Replicator, and NetApp's SnapMirror.

For virtual servers, many advanced data protection features are also offered by the hypervisor platform vendors themselves. Examples include VMware's Consolidated Backup (VCB), Site Recovery Manager (SRM), Storage vMotion and vStorage APIs, and Citrix's StorageLink APIs. Together with replication, these virtualization-optimized enhancements deliver flexibility and choice, but also contribute to an increase in the overall volume of data requiring protection -- an important consideration.

Cloud computing and storage elasticity

The emergence of new utility-priced outsourcing options is leading many enterprises to explore cloud computing as an adjunct to existing DR strategies. Low-cost cloud-based storage offerings are maturing, and can provide an effective alternative to building dedicated disaster recovery facilities. If several key requirements are met, cloud-based storage provides an elastic pool of trusted storage. The provider platform must be able to scale quickly, on demand and to large capacities. It must provide clear multi-tenancy policies to isolate one firm's data from all others, and must enable secure access to data at any time. Cloud storage providers must also provide rapid, robust data recovery, which demands a storage infrastructure with very high mean time between failures (MTBF) and mean time to data loss (MTTDL).

Disaster recovery may be one of the most compelling use cases for cloud storage. In fact, according to a Zetta Inc. survey, over 40% of enterprise data center managers recently reported that they either have deployed or plan to deploy some form of cloud storage by the end of 2011. The survey also said most business-critical applications are protected by some amount of offsite tape storage, but many firms require more advanced DR capabilities in order to satisfy industry compliance, risk mitigation, or business partner requirements. However, many firms struggle to justify the capital and operating costs required to implement them.

However cloud-based storage is implemented, it demands a reliable, efficient, and high-volume WAN interface. Any DR strategy that includes cloud-based storage will clearly benefit from maximizing the capacity and performance of the WAN infrastructure.

The wide-area network becomes core

Taken together, these industry trends and technology developments reveal a consistent underlying infrastructure requirement: the wide-area network, more than ever, is an essential storage transport resource for DR, and will increasingly play a critical role in unlocking DR operational efficiency. But, if the WAN is to become a core DR infrastructure element, it's imperative to resolve the most challenging aspects of network performance: contention for bandwidth and protocol latency.
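One way to see why protocol latency matters as much as raw bandwidth is the general rule that a single TCP stream cannot move more than one window of data per round trip. The Python sketch below applies that bound. It assumes replication traffic travels over a single TCP connection with a fixed window, and the function name, the 65,535-byte window and the 80 ms round-trip time are illustrative assumptions rather than figures from this article.

```python
def max_single_stream_mbps(window_bytes, rtt_ms):
    """Upper bound on one TCP stream's throughput in megabits per second.

    At most one full window can be in flight per round trip, so
    throughput <= window / RTT regardless of the link's rated speed.
    """
    if window_bytes <= 0 or rtt_ms <= 0:
        raise ValueError("window_bytes and rtt_ms must be > 0")
    rtt_seconds = rtt_ms / 1000
    return window_bytes * 8 / rtt_seconds / 1e6


# Illustrative inputs only: a 65,535-byte window over an 80 ms round trip.
cap = max_single_stream_mbps(65535, 80)
print(f"Single-stream ceiling: {cap:.2f} Mbit/s")
```

With these assumed values the ceiling is about 6.6 Mbit/s, so a long-distance replication stream can fall far short of the purchased bandwidth unless latency effects are addressed.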

In the next part in this series, we'll look at how WAN optimization can improve virtual DR, and learn how to determine whether or not WAN optimization is right for your company.

About this author: Dave Bartoletti is a senior analyst and consultant at the Taneja Group. Bartoletti has developed, delivered and marketed emerging technologies for more than 20 years at several high-profile infrastructure software and solutions companies. He was at the forefront of the virtualization, data center automation, messaging middleware and Web 2.0 technology waves as both a vendor and consumer. Bartoletti advises Taneja Group clients on server and storage virtualization technologies, cloud computing strategies and the automation of highly virtualized environments.
