In late May 2017, British Airways suffered an IT outage that grounded flights and stranded approximately 75,000 passengers. Given the substantial losses that resulted, it's worth examining what went wrong and how similar disasters might be prevented in the future.
The reported cause of the British Airways outage varies slightly from one news site to the next, but it seems an engineer disconnected a power supply, and then caused a major power surge upon reconnecting it. This power surge is said to have caused major damage to the company's IT infrastructure, which resulted in the outage.
Industry analysts have speculated that a number of factors may have contributed to the severity of the British Airways outage, ranging from IT outsourcing to badly aging data center hardware. While it is impossible to say for sure what British Airways could have done differently, there are lessons to be learned.
British Airways almost certainly had a business continuity/disaster recovery (DR) plan in place going into the outage. However, that plan may well have been outdated or not properly tested. The airline also outsourced hundreds of IT jobs in 2016, so it is possible that the staff handling IT operations were not made aware of certain aspects of the company's disaster recovery plans.
A key takeaway from the British Airways outage is that a DR plan is not something you create and then file away. IT resources continuously change, so a DR plan must evolve, or it will quickly become stale and outdated.
It is also important to test the DR plan regularly. Ongoing testing not only validates the plan's effectiveness but also ensures IT staff know what to do in the event of a disaster. After all, the moments following an outage are not the time to be figuring out how to initiate disaster recovery operations.