Disaster recovery in the cloud is a relatively new concept, and like many technology trends, there's a lot of hype and misinformation out there. In this Storage magazine article from Jacob Gsoedl, you'll learn about the top cloud disaster recovery concerns, like security and data recovery, and whether or not disaster recovery in the cloud is a good choice for your organization.
Cloud computing, along with mobile and tablet devices, accounts for much of the high-tech buzz these days. But when it comes to hype, the cloud seems to absorb more than its fair share, which has had the unintended consequence of sometimes overshadowing its real utility.
Although the concept of cloud-based disaster recovery -- and some of the products and services that deliver it -- is still nascent, some companies, especially SMBs, are discovering and starting to leverage cloud services for DR. It can be an attractive alternative for companies that are strapped for IT resources, because the usage-based cost of cloud services is well suited to DR, where the secondary infrastructure is parked and idling most of the time. Having DR sites in the cloud reduces the need for data center space, IT infrastructure and IT resources. That leads to significant cost reductions and enables smaller companies to deploy disaster recovery options that were previously found only in larger enterprises. “Cloud-based DR moves the discussion from data center space and hardware to one about cloud capacity planning,” said Lauren Whitehouse, senior analyst at Enterprise Strategy Group (ESG) in Milford, Mass.
But disaster recovery in the cloud isn’t a perfect solution, and its shortcomings and challenges need to be clearly understood before a firm ventures into it. Security usually tops the list of concerns:
- Is data securely transferred and stored in the cloud?
- How are users authenticated?
- Are passwords the only option or does the cloud provider offer some type of two-factor authentication?
- Does the cloud provider meet regulatory requirements?
And because clouds are accessed via the Internet, bandwidth requirements also need to be clearly understood. There’s a risk of only planning for bandwidth requirements to move data into the cloud without sufficient analysis of how to make the data accessible when a disaster strikes:
- Do you have the bandwidth and network capacity to redirect all users to the cloud?
- If you plan to restore from the cloud to on-premises infrastructure, how long will that restore take?
“If you use cloud-based backups as part of your DR, you need to design your backup sets for recovery,” said Chander Kant, CEO and founder at Zmanda Inc., a provider of cloud backup services and an open-source backup app.
Reliability of the cloud provider, its availability and its ability to serve your users while a disaster is in progress are other key considerations. The choice of a cloud service provider or managed service provider (MSP) that can deliver service within the agreed terms is essential, and while making a wrong choice may not land you in IT hell, it can easily put you in the doghouse or even get you fired.
Devising a disaster recovery in the cloud blueprint
Just as with traditional DR, there isn’t a single blueprint for disaster recovery in the cloud. Every company is unique in the applications it runs, in how relevant those applications are to its business, and in the industry it operates in. A cloud disaster recovery plan (aka cloud DR blueprint) is therefore specific to each organization.
Triage is the overarching principle used to derive traditional as well as cloud-based DR plans. The process of devising a DR plan starts with identifying and prioritizing applications, services and data, and determining for each one the amount of downtime that’s acceptable before there’s a significant business impact. Priority and required recovery time objectives (RTOs) will then determine the disaster recovery approach.
Identifying critical resources and recovery methods is the most important part of this process, since you need to ensure that all critical apps and data are included in your blueprint. By the same token, to control costs and to ensure a speedy and focused recovery when the plan needs to be executed, you want to leave out irrelevant applications and data. The more focused a DR plan is, the more likely you’ll be able to test it periodically and execute it within the defined objectives.
With applications identified and prioritized, and RTOs defined, you can then determine the best and most cost-effective methods of achieving the RTOs, which needs to be done by application and service. In the rarest of cases, you’ll have a single DR method for all your applications and data; more likely you’ll end up with several methods that protect clusters of applications and data with similar RTOs. “A combination of cost and recovery objectives drive different levels of disaster recovery,” said Seth Goodling, virtualization practice manager at Acronis Inc.
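The grouping step described above can be sketched in a few lines of Python. This is only an illustration: the RTO thresholds, the mapping of thresholds to DR methods, and the application names are all hypothetical assumptions, not recommendations from the article or from any vendor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class App:
    name: str
    rto_hours: float  # maximum acceptable downtime for this application


# Hypothetical tiers: (upper RTO bound in hours, DR method).
# Real thresholds depend on cost and on each organization's objectives.
TIERS = [
    (1, "replication to cloud VMs"),
    (24, "back up to and restore to the cloud"),
    (float("inf"), "back up to and restore from the cloud"),
]


def dr_method(app):
    """Return the first (most aggressive) tier whose bound covers the app's RTO."""
    for limit_hours, method in TIERS:
        if app.rto_hours <= limit_hours:
            return method
    raise ValueError(f"invalid RTO for {app.name}: {app.rto_hours}")


def build_blueprint(apps):
    """Cluster applications with similar RTOs under one DR method."""
    blueprint = {}
    for app in sorted(apps, key=lambda a: a.rto_hours):
        blueprint.setdefault(dr_method(app), []).append(app.name)
    return blueprint


if __name__ == "__main__":
    # Made-up application inventory for illustration only.
    inventory = [
        App("email", 4),
        App("order entry", 0.5),
        App("file archive", 72),
    ]
    for method, names in build_blueprint(inventory).items():
        print(f"{method}: {', '.join(names)}")
```

Applications deliberately left out of the inventory are simply not protected, which matches the advice to keep the plan focused.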
Disaster recovery in the cloud options
Below is a look at the different types of disaster recovery in the cloud options enterprises can choose from. For more information, download our chart "Disaster recovery in the cloud approaches."
Managed applications and managed DR. An increasingly popular option is to put both primary production and disaster recovery instances into the cloud and have both handled by a managed service provider (MSP). By doing this you reap all the benefits of cloud computing, from usage-based cost to eliminating on-premises infrastructure. Instead of doing it yourself, you defer DR to the cloud or managed service provider. The choice of service provider and the process of negotiating appropriate service-level agreements (SLAs) are of utmost importance. Because you’re handing over control to the service provider, you need to be absolutely certain it’s able to deliver uninterrupted service within the defined SLAs for both primary and DR instances. “The relevance of service-level agreements with a cloud provider cannot be overstated; with SLAs you’re negotiating access to your applications,” said Greg Schulz, founder and senior analyst at Stillwater, Minn.-based StorageIO Group.
A pure cloud play is becoming increasingly popular for email and some other business applications, such as customer relationship management (CRM), where Salesforce.com has been a pioneer and is now leading the cloud-based CRM market.
Back up to and restore from the cloud. Applications and data remain on-premises in this approach, with data being backed up into the cloud and restored onto on-premises hardware when a disaster occurs. In other words, the backup in the cloud becomes a substitute for tape-based off-site backups.
When contemplating cloud backup and recovery, it’s crucial to clearly understand both the backup and the more problematic restore aspects. Backing up into the cloud is relatively straightforward, and backup application vendors have been extending their backup suites with options to directly back up to popular cloud service providers such as AT&T, Amazon, Microsoft Corp., Nirvanix Inc. and Rackspace. “Our cloud connector moves data deduped, compressed and encrypted into the cloud, and allows setting retention times of data in the cloud,” said David Ngo, director of engineering alliances at CommVault Systems Inc., who aptly summarized features you should look for in products that move data into the cloud. Likewise, cloud gateways such as the Cirtas Bluejet Cloud Storage Controller, F5 ARX Cloud Extender, Nasuni Filer, Riverbed Whitewater and TwinStrata CloudArray, can be used to move data into the cloud. They straddle on-premises and cloud storage, and keep both on-premises data and data in the cloud in sync.
The challenging aspect of using cloud-based backups for disaster recovery is the recovery. With bandwidth limited and possibly terabytes of data to be recovered, getting data restored back on-premises within defined RTOs can be challenging. Some cloud backup service providers offer an option to restore data to disks, which are then sent to the customer for local on-premises recovery. Another option is a large on-premises cache of recent backups that can be used for local restores.
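A rough back-of-the-envelope calculation shows why restores are the hard part. The sketch below estimates how long a restore over a WAN link would take and compares it with an RTO. The function names, the 70% link-efficiency default and the example figures are assumptions chosen for illustration; real throughput depends on the provider, the protocol and competing traffic.

```python
def restore_hours(data_tb, link_mbps, efficiency=0.7):
    """Estimate hours needed to pull data_tb terabytes over a link_mbps link.

    efficiency is an assumed fraction of the nominal bandwidth that is
    actually usable for the restore (protocol overhead, shared links).
    """
    if data_tb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("data size, bandwidth and efficiency must be valid")
    bits_to_move = data_tb * 1e12 * 8               # decimal terabytes to bits
    usable_bits_per_sec = link_mbps * 1e6 * efficiency
    return bits_to_move / usable_bits_per_sec / 3600


def meets_rto(data_tb, link_mbps, rto_hours, efficiency=0.7):
    """True if the estimated restore time fits within the RTO."""
    return restore_hours(data_tb, link_mbps, efficiency) <= rto_hours


if __name__ == "__main__":
    # Hypothetical scenario: 5 TB to restore over a 100 Mbps link, 24-hour RTO.
    hours = restore_hours(5, 100)
    print(f"Estimated restore time: {hours:.1f} hours")
    print("Meets 24-hour RTO:", meets_rto(5, 100, 24))
```

When the estimate exceeds the RTO, the alternatives mentioned above, such as shipped disks or a local cache of recent backups, become the realistic options.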
“I firmly believe that backups need to be local and from there sent into the cloud; in other words, the backup in the cloud becomes your secondary off-site backup,” said Jim Avazpour, president at OS33 Inc.’s infrastructure division.
On the other hand, depending on the data to be restored, features like compression and, more importantly, data dedupe can make restores from data in the cloud to on-premises infrastructure a viable option. A case in point is Michigan-based Rockford Construction Co., which uses a StorSimple appliance for cloud-based protection of its Exchange and SharePoint infrastructures. “In case of a disaster, we’ll pull VMs [virtual machines] from the cloud; with StorSimple’s deduplication we pretty much have to only pull down one full VM copy and the differences for others,” said Shaun Partridge, vice president of IT at Rockford Construction.
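The effect Partridge describes can be approximated with simple arithmetic: one full VM image plus only the unique blocks of every other VM. The sketch below is a simplified model, not a description of how StorSimple or any other product actually deduplicates; the function name, the VM sizes and the 10% unique-data fraction are hypothetical.

```python
def dedup_transfer_gb(vm_count, vm_size_gb, unique_fraction):
    """Estimate GB to pull from the cloud when similar VMs are deduplicated.

    Assumes one VM is transferred in full and each remaining VM only
    contributes unique_fraction of its size (the "differences").
    """
    if vm_count < 1 or vm_size_gb < 0 or not 0 <= unique_fraction <= 1:
        raise ValueError("invalid VM count, size or unique fraction")
    return vm_size_gb + (vm_count - 1) * vm_size_gb * unique_fraction


if __name__ == "__main__":
    # Hypothetical example: ten 40 GB VMs that differ by about 10% each.
    full = 10 * 40
    deduped = dedup_transfer_gb(10, 40, 0.10)
    print(f"Without dedupe: {full} GB, with dedupe: about {deduped:.0f} GB")
```

The closer the VMs are to one another, for example several servers built from the same template, the more a restore over a limited link benefits.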
Back up to and restore to the cloud. In this approach, data isn’t restored back to on-premises infrastructure; instead it’s restored to virtual machines in the cloud. This requires both cloud storage and cloud compute resources, such as Amazon’s Elastic Compute Cloud (EC2). The restore can be done when a disaster is declared or on a continuous basis (pre-staged). Pre-staging DR VMs and keeping them relatively up-to-date through scheduled restores is crucial in cases where aggressive RTOs need to be met. Some cloud service providers facilitate bringing up cloud virtual machines as part of their DR offering. “Several cloud service providers use our products for secure deduped replication and to bring servers up virtually in the cloud,” said Chris Poelker, VP of enterprise solutions at FalconStor Software.
Replication to virtual machines in the cloud. For applications that require aggressive recovery time and recovery point objectives (RPOs), as well as application awareness, replication is the data movement option of choice. Replication to cloud virtual machines can be used to protect both cloud and on-premises production instances.
In other words, replication is suitable for both cloud-VM-to-cloud-VM and on-premises-to-cloud-VM data protection. Replication products are based on continuous data protection (CDP), such as CommVault Continuous Data Replicator; on snapshots; or on object-based cloud storage, such as EMC Atmos or the Hitachi Content Platform (HCP). “Cloud service provider Peak Web Hosting enables on-premises HCP instances to replicate to a Peak Web HCP instance instead of another on-premises HCP instance,” said Robert Primmer, senior technologist and senior director of content services at Hitachi Data Systems.
New options, old fundamentals
The cloud greatly extends disaster recovery options, yields significant cost savings, and enables DR methods in SMBs that were previously possible only in larger organizations. It does not, however, change the DR fundamentals: devising a solid disaster recovery plan, testing it periodically, and training and preparing users appropriately.
About this author: Jacob Gsoedl is a freelance writer and a corporate director for business systems. He can be reached at firstname.lastname@example.org.
This article was previously published in Storage magazine.
This was first published in August 2011