In late May 2017, British Airways suffered an IT outage that grounded flights and stranded approximately 75,000 passengers. Given the substantial losses that resulted, it's worth examining what went wrong and how similar disasters might be prevented in the future.
The reported cause of the British Airways outage varies slightly from one news site to the next, but it seems an engineer disconnected a power supply, and then caused a major power surge upon reconnecting it. This power surge is said to have caused major damage to the company's IT infrastructure, which resulted in the outage.
Industry analysts have speculated that a number of factors may have contributed to the severity of the British Airways outage, ranging from IT outsourcing to badly aging data center hardware. While it is impossible to say for sure what British Airways could have done differently, there are lessons to be learned.
British Airways almost certainly had a business continuity/disaster recovery (DR) plan in place going into the outage. However, that plan may well have been outdated or inadequately tested. The airline also outsourced hundreds of IT jobs in 2016, so it is possible that the staff handling IT operations were not made aware of certain aspects of the company's disaster recovery plans.
A key takeaway from the British Airways outage is that a DR plan is not something you create and then file away. IT resources change continuously, so a DR plan must evolve with them, or it will quickly become outdated.
It is also important to test the DR plan regularly. Ongoing testing not only validates the plan's effectiveness, it also ensures IT staff know what to do in the event of a disaster. After all, the moments following an outage are not the time to be figuring out how to initiate disaster recovery operations.
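One way to keep a plan from being filed away and forgotten is to track when it was last reviewed and last tested, and to flag it when either date gets too old. The following is a minimal Python sketch of that idea, not a description of anything British Airways used. The `DrPlanRecord` type, the `overdue_items` function and the 180-day and 365-day thresholds are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class DrPlanRecord:
    """Hypothetical record of a DR plan's maintenance history."""
    name: str
    last_reviewed: date  # when the plan document was last updated
    last_tested: date    # when a recovery exercise was last run


def overdue_items(plan, today,
                  max_review_age=timedelta(days=180),
                  max_test_age=timedelta(days=365)):
    """Return which activities ("review", "test") are overdue.

    The default thresholds are illustrative assumptions, not a standard.
    """
    overdue = []
    if today - plan.last_reviewed > max_review_age:
        overdue.append("review")
    if today - plan.last_tested > max_test_age:
        overdue.append("test")
    return overdue


if __name__ == "__main__":
    plan = DrPlanRecord(
        name="Primary data center failover",
        last_reviewed=date(2016, 1, 1),
        last_tested=date(2017, 3, 1),
    )
    # The review is overdue; the test is still within its window.
    print(overdue_items(plan, today=date(2017, 5, 27)))
```

A check like this could run on a schedule and alert the team responsible for the plan. The appropriate intervals depend on how quickly the underlying IT resources change.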