Server virtualization presents some interesting implications for disaster recovery (DR), including cost reductions and faster recovery times. In this FAQ, business continuity expert Paul Kirvan discusses the various server virtualization approaches that are effective for DR and how they benefit failover and failback processes. His answers are also available to download below as an MP3.
Table of contents:
>>>Server virtualization and recovery times
>>>Effective server virtualization approaches for disaster recovery
>>>Determining what systems need to be virtualized
>>>Server virtualization solutions for disaster recovery
>>>Failover and failback in a server virtualization environment
>>>DR planning in a virtualized environment
Right now, with traditional tape systems, it's not uncommon for recovery to take one to two days to completely restore a system, though much of that depends on how large the system is and how much data is being restored. With server virtualization, it's possible to reduce the time needed for a full restoration to four hours or less. One reason this is possible is that it's not necessary to rebuild servers, applications or even operating systems separately, because they exist elsewhere and can simply be brought back online.
But it's important to make sure that those systems are regularly monitored to make sure that they're in place and kept up to date. It's also a very useful way to perform disaster recovery testing because you can run a test on the virtual image of the system without affecting your production system. So there are some really compelling reasons for using virtualization.
Primarily, virtualization is performed with hardware, software or a combination of both. And, of course, the idea behind virtualization is to save money and to reduce space and power requirements. But again, the key issue here in terms of disaster recovery is to determine how many servers you're going to need to back up and create a full image of all the information.
So, this gets us into what is called the server consolidation ratio. That's, in effect, the number of virtual machines that a server can host. If you're looking to reduce your inventory of servers and other related technologies, that's fine, but it's also important to make sure that you don't reduce the number of servers too far, because you need to be able to recover and restore critical applications and operating systems. You want to make sure that you have enough hardware assets to do a restoration.
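The consolidation-ratio math above can be sketched in a few lines. This is an illustrative calculation only; the function name, ratio and spare-host count are assumptions for the example, not figures from any vendor:

```python
# Hypothetical sketch: estimating how many physical hosts are needed
# so that critical VMs can still be restored if consolidation goes too far.

def hosts_needed(total_vms: int, consolidation_ratio: int, spare_hosts: int = 1) -> int:
    """Return the physical hosts required to run total_vms at the given
    consolidation ratio (VMs per host), plus earmarked spare DR capacity."""
    base = -(-total_vms // consolidation_ratio)  # ceiling division
    return base + spare_hosts

# Example: 40 VMs at 10 VMs per host, keeping one spare host for recovery.
print(hosts_needed(40, 10))  # 5
```

The point of the `spare_hosts` term is the caution in the paragraph above: consolidating down to the bare minimum leaves nothing to restore onto when a host is lost.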
So, it's a good idea when you're building this environment to create your virtual DR servers so that you have earmarked assets in place that can be used for a disaster and that those particular servers are fully maintained and managed in the same way that your production environment would be. The good news is, in a virtual environment, you can restore things much faster than if you were moving tapes or recovering servers.
From a practical standpoint, you probably want to look at those applications and operating systems that are most crucial to the business. Email, obviously, is one of the most important applications. Microsoft Office is crucial as well, though it is not uncommon for Microsoft Office to be installed on individual systems. Depending on how security is arranged, those assets may also be implemented so that people keep the files they're working with locally but go to another part of the network to actually run the applications.
So, it's important to make sure that you first identify those systems and applications that are the most critical to the business and then focus on those in terms of the recovery effort. The principal thing to do would be to protect email and other critical applications, as well as any specialized applications that were developed in-house and cannot be easily recreated.
Microsoft Virtual Server is widely used as well, though not as widely as the VMware products. A lot of organizations leverage partitioning capabilities from their mainframe environment as a virtualization option. The Xen hypervisor, which is open-source software, supports a number of different environments, including Windows, Linux and Solaris, and has built a reputation for efficiency.
Another company, Neverfail Ltd., started as a U.K. organization and has expanded into the U.S. They specialize in delivering continuous availability solutions for all types of environments. Their products utilize existing hardware and software, so only a nominal extra investment is necessary, and they're able to do failovers in a matter of seconds. Failbacks to the original environment are just as quick.
By definition, failover is the capability to switch over to a redundant or standby server, system or network upon the failure or termination of an existing asset. It should normally happen without any kind of human intervention or warning. This contrasts with switchover, where a person deliberately initiates the transition from one environment to another.
Failback is the process of restoring a system or another asset that's in a failover state back to its original state. The assumption there is that you're able to bring it back to the state of operation before the disruption.
With a virtualized environment, you can fail over to an environment that exists in real time, and it's very easy to fail back to the original environment because you can maintain images of your previous environments. What's nice about this is that it speeds up your recovery time, and it's possible to do testing on an actual system without adversely affecting your production environment. You can then turn it on or off as quickly as you'd like, so virtualization really helps in the area of testing.
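The failover/failback behavior described above can be sketched as simple routing logic. This is a minimal illustration, assuming hypothetical host names and a stand-in health probe; a real deployment would use the monitoring built into the virtualization platform:

```python
# Illustrative sketch of failover and failback routing.
# Host names and the check_health() probe are hypothetical assumptions.

def check_health(host: str, healthy_hosts: set) -> bool:
    """Stand-in health probe; a real setup would query a service endpoint."""
    return host in healthy_hosts

def select_active(primary: str, standby: str, healthy: set) -> str:
    """Failover: route to the standby when the primary stops responding.
    Failback: return to the primary once it is healthy again."""
    if check_health(primary, healthy):
        return primary   # normal operation, or failback after recovery
    return standby       # automatic failover, no human intervention

healthy = {"vm-primary", "vm-standby"}
print(select_active("vm-primary", "vm-standby", healthy))  # vm-primary
healthy.discard("vm-primary")                              # primary fails
print(select_active("vm-primary", "vm-standby", healthy))  # vm-standby
healthy.add("vm-primary")                                  # primary restored
print(select_active("vm-primary", "vm-standby", healthy))  # vm-primary (failback)
```

The design choice worth noting is that failback here is just the failover check running in reverse: once the primary's image is healthy again, traffic returns automatically, which is what makes maintaining images of previous environments so useful.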
Virtualization obviously has a number of very significant benefits, such as reducing the number of devices, reducing the equipment footprint in a data center and reducing power requirements. When you're doing disaster planning, you need to make sure that you're looking at various issues such as backing up critical data. You may be using tape, RAID or disk technologies, but the access to a virtual environment can certainly be factored in when you're developing your DR plan.
If nothing else, you'll have excellent control over where your emergency resources are located. As I mentioned before, you'll have the ability to do testing on a dynamic basis, so you'll always know that your resources are in place, ready to go and that the data is current.
You'll also be able to justify the value of having disaster recovery, because as you reduce your footprint, you're also improving your ability to recover from any kind of emergency. So, while you're saving money, you're also improving your resilience, and you should be able to make the case that disaster recovery, in conjunction with a virtualized environment, can go a long way toward proving the ROI of your virtualization strategy.
About this author: Paul F. Kirvan, FBCI, CBCP, CISSP, has more than 20 years experience in business continuity management as a consultant, author and educator. He is also secretary of the Business Continuity Institute USA Chapter.
This was first published in March 2009