Wide area networks (WANs) provide connectivity to local area and other networks over long distances.
WANs have a multi-faceted role in an organization: They can support voice and data communications and Internet connectivity, provide connectivity for company email and virtual private networks (VPNs), and link to other organizations doing business with the company.
In a disaster situation, WANs become essential tools for an organization to communicate internally among its employees and externally with stakeholders and other third parties. Loss of a WAN infrastructure, without suitable backup and recovery capabilities, can seriously disrupt business operations.
WAN technologies have evolved dramatically from the days of fixed point-to-point circuits. Depending on the applications being transported, a WAN may support a variety of protocols and transport technologies, such as MPLS (Multiprotocol Label Switching), SIP (Session Initiation Protocol), SONET (Synchronous Optical Network), Ethernet (e.g., 10 GbE) and, of course, the TCP/IP suite. Transport is typically over fiber-optic networks coupled with high-capacity copper- and fiber-based local access facilities.
When building or managing WANs, a primary activity is to keep them running with minimal disruptions. A principal WAN design goal, therefore, is resilience, which ensures that any potential disruptions are found and resolved quickly and efficiently.
When developing WAN resilience plans, your most important ongoing activity is to work with your carriers to take full advantage of their recovery and restoration capabilities. In addition to getting details on their service recovery and restoration offerings, find out how they approach service-level agreements (SLAs) that specifically address how they will respond during a service disruption. Make sure that their time frames align with your business requirements. For instance, if you have a four-hour recovery time objective (RTO) for a specific system that needs Internet access, be sure that your carrier can restore access within your RTO.
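The RTO comparison described above can be kept as a simple, repeatable check. The following is a minimal sketch in Python, not a carrier tool: the `Service` record, its field names and the sample figures are all hypothetical, and the restoration times would come from your own SLA documents.

```python
from dataclasses import dataclass


@dataclass
class Service:
    name: str
    rto_hours: float               # business recovery time objective
    carrier_restore_hours: float   # restoration time committed in the carrier SLA


def sla_gaps(services):
    """Return the services whose carrier restoration commitment exceeds the RTO."""
    return [s for s in services if s.carrier_restore_hours > s.rto_hours]


# Hypothetical sample data for illustration only.
services = [
    Service("order-entry Internet access", rto_hours=4, carrier_restore_hours=8),
    Service("branch VPN", rto_hours=24, carrier_restore_hours=8),
]

for s in sla_gaps(services):
    print(f"{s.name}: carrier commits to {s.carrier_restore_hours} h, "
          f"RTO is {s.rto_hours} h")
```

Any service the check reports is a candidate for renegotiating the SLA or adding an alternate access path.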
To build resilient WANs, access to real-time information about network performance is essential for spotting potential disruptions. That information must be end-to-end, and not limited to network segments. To obtain visibility across WANs, your network management system must be able to “see” all network segments and how well they are performing. Ideally, you should have an automated tool that can be programmed to analyze cross-WAN performance data. Use that data to compare current network performance against specific metrics and/or SLAs. The tool should also be able to flag situations that indicate impending problems.
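As an illustration of comparing performance data against SLA metrics, the sketch below classifies an end-to-end measurement for one path. It assumes lower-is-better metrics; the metric names, the thresholds and the 80% warning ratio are assumptions chosen for the example, not values from any particular management system.

```python
def check_path(sample, sla, warn_ratio=0.8):
    """Classify each metric as 'ok', 'warning' or 'breach'.

    sample and sla map metric names to numbers; every metric is assumed to be
    lower-is-better (latency, loss, jitter). A value at or above
    warn_ratio * limit but not over the limit is flagged as an impending problem.
    """
    status = {}
    for metric, limit in sla.items():
        value = sample[metric]
        if value > limit:
            status[metric] = "breach"
        elif value >= warn_ratio * limit:
            status[metric] = "warning"
        else:
            status[metric] = "ok"
    return status


# Hypothetical SLA limits and one end-to-end measurement.
sla = {"latency_ms": 100.0, "loss_pct": 1.0, "jitter_ms": 30.0}
sample = {"latency_ms": 85.0, "loss_pct": 0.2, "jitter_ms": 35.0}

print(check_path(sample, sla))
```

The "warning" band is what allows a tool to flag a path that is still within its SLA but trending toward a violation.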
The most resilient network topology is a full mesh, in which every network endpoint connects directly to every other. This, of course, is also the most expensive configuration, so you may wish to use network design software (work with your service provider on this) to define a configuration that balances cost-effectiveness and resilience. Ensure that channels with the highest traffic volumes have alternate routes available, from different carriers if possible, that can be rapidly activated to maintain performance levels. If your WAN uses undersea cables and/or satellite channels, be sure to consider alternate cable and satellite systems for diversity and resilience.
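To make the cost and resilience trade-off concrete, the sketch below counts the links a full mesh would need and lists the links in a cheaper design whose loss alone would cut off a site. It is a simplified stand-in for network design software; the site names and the sample topology are hypothetical.

```python
def full_mesh_links(n):
    """Number of point-to-point links needed to fully mesh n sites."""
    return n * (n - 1) // 2


def is_connected(sites, links):
    """True if every site can reach every other site over the given links."""
    sites = set(sites)
    if not sites:
        return True
    adjacency = {s: set() for s in sites}
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, stack = set(), [next(iter(sites))]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(adjacency[node] - seen)
    return seen == sites


def single_points_of_failure(sites, links):
    """Links whose failure, on its own, would partition the network."""
    return [
        link for link in links
        if not is_connected(sites, [other for other in links if other != link])
    ]


# Hypothetical four-site WAN: a three-site ring plus one spur.
sites = ["NYC", "CHI", "DAL", "SFO"]
links = [("NYC", "CHI"), ("CHI", "DAL"), ("DAL", "NYC"), ("DAL", "SFO")]

print(full_mesh_links(len(sites)))             # 6 links for a full mesh
print(single_points_of_failure(sites, links))  # [('DAL', 'SFO')]
```

Here the design uses four links instead of six, and the check shows that only the spur to the fourth site lacks an alternate route, which is where a second link or a second carrier would add the most resilience.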
At your data centers and offices, install redundant network connection devices, such as routers and switches, and also have an inventory of spares that can be brought into service quickly if a device fails. Be sure to rotate spare devices into production networks to ensure they perform properly.
It may be worthwhile to consider locating some of your WAN infrastructure in a secure, multiple-carrier colocation building, such as the well-known telecommunications hotspot 60 Hudson St. in New York City. This particular site is highly secure and offers multiple facility entry/exit paths, redundant power systems, extensive bandwidth to address most requirements, and numerous carriers available to provide service. Many U.S. cities have “carrier hotels” that can support WAN performance and resilience goals.
Ensure that WAN equipment running on commercial power also has backup power (e.g., uninterruptible power supplies) so it will remain operational in the aftermath of a commercial power outage or lightning strike.
Locate network infrastructure equipment in secure, HVAC-equipped rooms that are accessible to a limited number of employees and vendors.
Establish network disaster recovery (DR) plans that provide step-by-step activities to diagnose problems, establish bypass and recovery arrangements, recover failed network components and return WAN operations to normal. Periodically test these plans to ensure that they are appropriate for your WAN as configured, that the procedures work and are in the correct sequence, and that your service providers are in sync with your network resilience requirements.
Resilient wide area networks can be achieved through a combination of partnering with service providers, intelligent network design, proactive network management, a disaster recovery program combining plans and regular testing, and an operational philosophy that blends performance with resilience and survivability.
About the author: Paul Kirvan, CISA, FBCI, has more than 24 years of experience in business continuity management (BCM) as a consultant, author and educator. He has completed dozens of BCM consulting and audit engagements that address all aspects of a business continuity management system (BCMS) and which are aligned with global standards including BS 25999 and ISO 22301. Kirvan currently works as an independent business continuity consultant/auditor and is the secretary of the Business Continuity Institute USA chapter and a member of the BCI Global Membership Council. He can be reached at firstname.lastname@example.org.