Site Recovery Manager, which first shipped in the spring of 2008 after being previewed at VMworld 2007, automates the setup and failover processes needed to bring up the VMware environment at a secondary data center in the event of a disaster or outage. It can also be used for DR testing or to test patches and other updates to the VMware environment without affecting production.
VMware product marketing manager Jon Bock said SRM 1.5 is due out before the end of the year. The upgrade adds support for VMware's latest release, vSphere 4, along with the NFS protocol for file storage. At present, SRM officially supports only block storage, over either iSCSI or Fibre Channel.
Of the Ethernet-based storage protocols, he said, "there's more flexibility with NFS -- with iSCSI, you have to format it with a file system like VMFS. NFS is a more open format." For example, Scherer said, an NFS volume could be mounted to a non-virtual server.
Another feature coming to SRM is the ability to use a "hub/spoke" design for failover. The SRM 1.5 GUI in the demo showed linked vCenter Server instances that an administrator can toggle between in the same console to manage multiple sites.
Automated failback still on the to-do list
Bock said SRM 1.5 will not include the ability to automatically fail back from the secondary data center to the primary site. VMware customers have asked for that capability, and it has been forecast for future releases of SRM, but it is not yet ready.
"There are a lot of details involved in recovering the original protection environment, and more variables," Bock said. For example, the failover profiles of a power outage and a total primary site disaster are similar, Bock said, but the failback scenarios are not.
This hasn't stopped storage vendors from trying to supply their customers with something similar. FalconStor Software this week rolled out a new set of features for its Network Storage Server (NSS) storage virtualization software that lets customers fail back following an SRM failover. The features use NSS and FalconStor replication to reverse replication from the DR site, re-register virtual machines at the primary site, and power them on.
DataCore Software takes a different approach to automated failback with its Advanced Site Recovery (ASR) product, launched in June. Instead of failing over servers to a second data center, ASR uses branch and remote sites as failover sites, and it can fail back once the disruption is over.
"Vendors have started to try to automate the failback process, particularly on the storage side," Bock said, citing an EMC Corp. plug-in announced with a Celerra refresh in February. "Those are definitely useful, but not the whole process -- there are network settings outside the scope of what's done by partners," such as re-assigning IP addresses.
"It's something we're working to do," Bock continued. "But it's not something we'll have in the short term." VMware is also working to create common APIs for all its storage partners so new code doesn't have to be written to integrate with the replication and failback mechanisms of each vendor's disk arrays.