Many disaster recovery (DR) plans aim to recover (e.g., fail over) critical systems and data to another location and, following a disaster event, restore (e.g., fail back) those same systems to their original operational status. Once systems have been returned to that status, the organization can resume business.
Failover and failback processes are complex and must be carefully planned and tested in advance. When failing over to an alternate location, the appropriate technology must be in place to accommodate the application, in particular the operating system(s), networking access and bandwidth, sufficient data storage, databases, files and utilities. However, a few additional circumstances must be addressed when failing back to previous operations. This tip will examine issues that must be addressed before you fail back/return systems and data to your primary location.
If your organization is fairly large, with multiple offices and a well-established IT department, you can probably fail over and fail back with minimal difficulty. This is because you’ll probably have backup resources, such as servers, desktop systems, laptop computers and other devices in your inventory and can obtain emergency replacements quickly at your recovery site. But if your organization is smaller and does not have such extensive resources, your options for recovery may be limited.
For small to medium businesses (SMBs), there are plenty of disaster recovery options that must be balanced against available financial resources, staffing, physical space and existing system and data requirements.
But when you are ready to return (fail back) to business as usual, the post-disaster environment may be the same as, or different from, what it was before the incident. When failing back, you generally have two options: your original office, assuming little to no damage has occurred there; or a different location, if your original office space has been partially or totally lost.
When failing over, several issues must be addressed as part of the overall solution. You should ensure that:
- Existing systems and associated software can be replicated at an alternate site (both physical and cloud-based solutions can work)
- All critical data (e.g., daily work files) and information (e.g., customer records) can be replicated at the alternate site as close as possible to the point in time when the incident occurred
- Specialized failover software (e.g., Double-Take) is available at both locations to initiate the failover
- There is sufficient network bandwidth (e.g., via the Internet) to the alternate site for the failover to occur
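The four checks above can be captured in a small script that is run before a planned test. The following is only a sketch under stated assumptions: the `FailoverReadiness` fields, the `readiness_gaps` function and the thresholds (`max_data_age`, `min_link_mbps`) are hypothetical names chosen for illustration, and the values would have to come from your own monitoring or replication tools.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FailoverReadiness:
    """Hypothetical snapshot of the four checklist items."""
    systems_replicated: bool               # systems and software exist at the alternate site
    last_data_replication: datetime        # when data was last copied to the alternate site
    failover_software_at_both_sites: bool  # failover tool installed at both locations
    link_mbps: float                       # measured bandwidth to the alternate site


def readiness_gaps(state, now, max_data_age, min_link_mbps):
    """Return a list of gaps; an empty list means every check passed."""
    gaps = []
    if not state.systems_replicated:
        gaps.append("systems not replicated at alternate site")
    if now - state.last_data_replication > max_data_age:
        gaps.append("replicated data is older than the allowed age")
    if not state.failover_software_at_both_sites:
        gaps.append("failover software missing at one or both sites")
    if state.link_mbps < min_link_mbps:
        gaps.append("insufficient bandwidth to alternate site")
    return gaps


if __name__ == "__main__":
    now = datetime.now()
    # Assumed example values: data copied 30 minutes ago over a 200 Mbps link.
    state = FailoverReadiness(True, now - timedelta(minutes=30), True, 200.0)
    print(readiness_gaps(state, now, timedelta(hours=1), 100.0))
```

Returning a list of gaps rather than a single yes/no makes it easier to record what needs fixing before the next test.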
When preparing to return to your original office, test the systems and infrastructure to ensure they are working properly; ensure there is sufficient storage capacity to handle the returning systems and data; and ensure that sufficient network bandwidth is available for the failback.
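Two of these checks, storage capacity and network bandwidth, come down to simple arithmetic. The sketch below is illustrative only: the function names, the 70% link efficiency and the 20% storage headroom are assumptions, not figures from any vendor or standard, and should be replaced with values measured in your own environment.

```python
def transfer_hours(data_gb, link_mbps, efficiency=0.7):
    """Estimate the hours needed to move data_gb gigabytes over a link.

    efficiency is the assumed usable fraction of the nominal bandwidth.
    """
    if link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_mbps must be positive and efficiency in (0, 1]")
    megabits = data_gb * 8 * 1000  # decimal units: 1 GB = 8,000 megabits
    seconds = megabits / (link_mbps * efficiency)
    return seconds / 3600


def has_storage_headroom(free_gb, data_gb, headroom=0.2):
    """True if free space covers the returning data plus a safety margin."""
    return free_gb >= data_gb * (1 + headroom)


if __name__ == "__main__":
    # Assumed example: 2 TB of data returning over a 100 Mbps link.
    print(f"{transfer_hours(2000, 100):.1f} hours")
    print(has_storage_headroom(free_gb=2500, data_gb=2000))
```

With these assumed figures, returning 2 TB over a 100 Mbps link takes roughly 63 hours, which is longer than a weekend test window. Free space at the primary site can be read with Python's `shutil.disk_usage` rather than entered by hand.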
Now, if your original office is not available, your disaster recovery plan should initiate the following actions:
- Obtain new hardware (e.g., servers, storage and routers) and software (e.g., operating systems and applications) that will support the same operational requirements as your original office
- Secure space to house the new IT equipment
- Obtain new desktop systems and peripherals
- Install failover/failback software if that strategy is in your recovery plan
- Ensure that network bandwidth is available to support failback activities
As a side note, if you’re an SMB, consider buying a few terabytes of external storage to back up your systems and data. These storage devices can be purchased at many office supply stores. It’s a good idea to have a “secondary backup” arrangement in case your failover/failback arrangements don’t work as planned.
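A secondary backup is only useful if the copy is intact. As a minimal sketch, assuming the external drive is mounted as an ordinary directory, the following copies a folder and then compares SHA-256 checksums of every file. The function names and paths are hypothetical, and this is not a substitute for dedicated backup software.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_and_verify(source_dir, backup_dir):
    """Copy source_dir into backup_dir, then list files that failed verification.

    Returns relative paths that are missing from the backup or whose
    checksum differs from the source; an empty list means the copy verified.
    """
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    # dirs_exist_ok requires Python 3.8 or later.
    shutil.copytree(source_dir, backup_dir, dirs_exist_ok=True)
    mismatched = []
    for src in source_dir.rglob("*"):
        if src.is_file():
            rel = src.relative_to(source_dir)
            dst = backup_dir / rel
            if not dst.is_file() or sha256_of(src) != sha256_of(dst):
                mismatched.append(str(rel))
    return mismatched
```

A call might look like `backup_and_verify("/data/work", "/mnt/external/work")`, where both paths are placeholders for your own locations.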
The good news is that there are plenty of options for SMBs (as well as larger firms) for failover and failback activities. Regardless of which strategy you use, remember these key points:
- Ensure that failover and failback activities are in your DR plan
- Work with established data recovery vendors to arrange for emergency backup and recovery
- Document a DR plan that clearly outlines the steps (e.g., scripts) for failover to an alternate location and failback to the primary (or a new) location
- Establish primary and alternate sources for IT hardware and software
- Annually test failover and failback procedures and systems to ensure they work properly (a live test over a weekend is preferred but a tabletop exercise will work)
- Establish roles and responsibilities for employees in a disaster
- Train IT staff in failover and failback procedures
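Documented steps are easier to test when they are written as an ordered runbook that stops at the first failure. The sketch below shows one possible shape for such a runbook. The `run_runbook` function and the step names are hypothetical, and each real step would call your own scripts or vendor tools.

```python
def run_runbook(steps):
    """Run (name, action) pairs in order and stop at the first failure.

    Each action is a callable that returns True on success. Returns a tuple
    (completed_step_names, failed_step_name_or_None).
    """
    completed = []
    for name, action in steps:
        try:
            ok = bool(action())
        except Exception:
            ok = False  # treat any error as a failed step
        if not ok:
            return completed, name
        completed.append(name)
    return completed, None


if __name__ == "__main__":
    # Placeholder actions; real steps would invoke your own procedures.
    failback_steps = [
        ("test primary site systems", lambda: True),
        ("confirm storage capacity", lambda: True),
        ("confirm network bandwidth", lambda: False),
        ("start failback", lambda: True),
    ]
    done, failed = run_runbook(failback_steps)
    print("completed:", done)
    print("stopped at:", failed)
```

Stopping at the first failed step keeps a failback from starting before its prerequisites have been confirmed, and the same list of step names can double as the agenda for a tabletop exercise.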
Make sure that when you plan for disaster recovery, you consider both how you’ll relocate your IT operations (failover) to an alternate site and how you’ll return (failback) to your original space or possibly to a new location. It’s possible that your failback may be more complex than the failover, so be prepared to carefully examine all the options.
About the author: Paul Kirvan, CISA, FBCI, has more than 24 years of experience in business continuity management (BCM) as a consultant, author and educator. He has completed dozens of BCM consulting and audit engagements that address all aspects of a business continuity management system (BCMS) and are aligned with global standards including BS 25999 and ISO 22301. Kirvan currently works as an independent business continuity consultant/auditor and is the secretary of the Business Continuity Institute USA chapter and a member of the BCI Global Membership Council.