Disaster recovery methods: Legacy DR vs. the cloud

Cost, RTO and expertise are important factors in selecting the right disaster recovery method. After careful analysis, the cloud may be the answer for your organization.

Disaster recovery today requires an agile, fast response. Per-hour losses for businesses due to fire, flood or ransomware, for example, are just too high to continue to follow the disaster recovery methods of a decade ago.

When discussing recovery, you need to separate legacy systems and their proprietary platforms from commercial off-the-shelf (COTS)-based modern products. The standardization in COTS provides additional powerful options for protection when the cloud acts as a recovery site.

Among the common disaster recovery methods is configuring multiple sites for workload execution. The idea is that the loss of one site allows the workload to be restarted elsewhere, which works well if the sites are located far apart.

The problem with this multisite disaster recovery approach is cost. Extra platforms are needed, which can increase costs by 50% to 100%, depending on whether only critical workloads are protected or all job streams are mirrored. Additional administration staff is required, as well as facilities, utilities and so on. Even so, this is a common option for legacy gear because the equipment and software configuration consists of semi-customized proprietary elements.

Backup, usually to tape, is an alternative for legacy systems. Copying data to tape is a relatively painless backup process, but recovery times can range from hours to days, depending on whether alternative hardware is available or has to be physically procured. Backup to the cloud, rather than to tape, also fits some COTS use cases.

Find your happy place

COTS-based computing is a much happier place when it comes to disaster recovery methods. The answer lies in the cloud. With most applications able to run in virtual instances, the in-house cloud or cluster can reconstitute the workload in a new set of instances in a public cloud.

There are a couple of ways to approach this. One is to operate a cloud-bursting environment, where excess workloads automatically spill over into the public cloud. We won't go into the data placement issues involved in this process, other than to mention that proper placement is essential to a short recovery time objective (RTO).

Alternatively, an endpoint backup system will keep a near-current copy of data and instance images in a public cloud, ready to be fired up in a disaster. With public clouds, there is no need to maintain active instances for recovery because they can be built in minutes and paid for on an actual usage basis. Docker containers can be spawned much faster than hypervisor instances, which bodes well for future RTO.

With backup tools well-tuned to the cloud, the challenge shifts to recovery, which, even with a backup, is far from trivial. Ensuring the right files are in the right places, with the correct linkages built, is tricky work, especially for administration teams that face the issue perhaps once every 10 years.

Breaking down disaster recovery methods

This is where disaster recovery as a service (DRaaS) comes into the picture. An administration team has to ask whether it should do all of the work associated with protecting continuous operation or outsource the job to experts. Outsourcing comes at a cost, but it is justified by the ongoing workload of managing DR readiness, coupled with recovery and schedule risks. Additionally, the costs of being offline are substantial.

Determining which data protection method is the right choice for a given use case requires good TCO analysis. It may well be that one offering does not fit all use cases, so a mashup of disaster recovery methods is needed.

Go with cloud backup when:

  • The business isn't accessing data on an urgent basis. Cold data meets this criterion, as does any workload that has a long RTO window.
  • The task of rebuilding instances/images in the public cloud isn't too complex.
  • There is in-house expertise in DR, and a willingness to test readiness at least every six months.

DRaaS is a better choice when:

  • The administration team consists of generalists with little expertise in DR.
  • The RTO is short, as with online marketing operations.
  • The application and infrastructure setup is complex.
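
The two checklists above can be read as a simple decision rule. The following Python sketch is one possible encoding, not a definitive tool: the `Workload` fields, the function name and the four-hour cutoff for a "short" RTO are all illustrative assumptions that do not come from this article.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of a workload to be protected."""
    rto_hours: float               # required recovery time objective
    complex_setup: bool            # complex application and infrastructure setup
    in_house_dr_expertise: bool    # team has DR specialists, not only generalists
    tests_every_six_months: bool   # team is willing to test readiness regularly


def suggest_dr_method(w: Workload, short_rto_hours: float = 4.0) -> str:
    """Return "DRaaS" or "cloud backup" based on the checklists.

    short_rto_hours is an assumed threshold; tune it to your business.
    """
    if w.rto_hours <= short_rto_hours:
        return "DRaaS"          # short RTO, as with online marketing operations
    if w.complex_setup:
        return "DRaaS"          # rebuilding instances would be too complex
    if not (w.in_house_dr_expertise and w.tests_every_six_months):
        return "DRaaS"          # generalist team or no regular testing
    return "cloud backup"       # long RTO, simple rebuild, expert team


cold_archive = Workload(rto_hours=72, complex_setup=False,
                        in_house_dr_expertise=True, tests_every_six_months=True)
print(suggest_dr_method(cold_archive))
```

A rule like this is only a first filter. A mix of methods across workloads may still be the right outcome.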

Typically, DRaaS costs more than an in-house operation using backup. However, the ability of DRaaS vendors to use on-demand instances has reduced their costs considerably, so, in the end, a detailed TCO analysis will be the deciding factor.
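
As a rough illustration of such a TCO comparison, the Python sketch below adds the expected cost of downtime to the yearly service and staffing costs. Every figure in it is hypothetical and chosen only to show the arithmetic. The function name and cost categories are assumptions, and the 0.1 disasters per year simply reflects a once-in-10-years event.

```python
def annual_tco(service_cost: float, staff_cost: float, rto_hours: float,
               loss_per_hour: float, disasters_per_year: float) -> float:
    """Yearly cost of a DR method, including expected downtime losses.

    Expected downtime loss = RTO (hours) x loss per hour x disasters per year.
    This is a simplified model that ignores facilities, testing and data loss.
    """
    expected_downtime_loss = rto_hours * loss_per_hour * disasters_per_year
    return service_cost + staff_cost + expected_downtime_loss


# Hypothetical inputs: in-house cloud backup with a 24-hour RTO
backup_tco = annual_tco(service_cost=20_000, staff_cost=40_000, rto_hours=24,
                        loss_per_hour=50_000, disasters_per_year=0.1)

# Hypothetical inputs: DRaaS with a 2-hour RTO and a higher service fee
draas_tco = annual_tco(service_cost=90_000, staff_cost=10_000, rto_hours=2,
                       loss_per_hour=50_000, disasters_per_year=0.1)

cheaper = "DRaaS" if draas_tco < backup_tco else "cloud backup"
print(f"backup: {backup_tco:,.0f}  DRaaS: {draas_tco:,.0f}  cheaper: {cheaper}")
```

With these made-up inputs, the higher DRaaS fee is outweighed by the shorter outage. A lower per-hour loss or a rarer disaster would tip the result the other way, which is why the analysis has to be done per use case.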

A word about backup/DR software tools is in order here. Many tools make backup relatively easy, even when running a full endpoint backup. The best tools focus a lot of development effort on the recovery problem, and allow for the recovery of single nodes, as well as clusters or data centers. These tools make rebuilding easier, with good interfaces and ways to keep related objects together in the recovery process.
