Jet leasing company Flight Options had to accelerate plans to replace an aging storage area network (SAN) and revamp its disaster recovery (DR) site when a leaky air conditioning unit flooded out its data center. The company put an impromptu data center disaster recovery plan into effect to quickly recover data after the flood, and had a new SAN up and running in less than four days.
Flight Options CIO David Davies said the September 2009 data center flood set his team back about six months in application development work, but caused no downtime in the operational control center. Avoiding downtime is critical for Flight Options because its flight scheduling and maintenance records are stored on its SAN.
"We can't afford to lose our data center or an application," Davies said. "I can't remember the last time we shut something down in the middle of the day. Nine times out of 10, it's going to be at 3:00 or 4:00 in the morning, and if it's more than half an hour, it's going to be a big deal."
In the middle of 2009, Davies said he enlisted reseller CDW to help him look at new storage systems to replace the five-year-old EMC Corp. Clariion CX400 he had running in the data center at Flight Options' Cleveland headquarters. But he had yet to decide on a replacement when one of his IT staff noticed intermittent drive errors on the SAN in the wee hours one morning.
"That by itself didn't cause a lot of alarm because we had redundancy and it happens once in a while," Davies said. "Then all of a sudden one of the disk shelves went off about 4:30 in the morning. One of my guys went into the office, opened up the door to the data center and his exact words were 'It looked like Niagara Falls coming out of the ceiling.'"
A drain for a five-ton air conditioning unit on the roof had been leaking for several hours. By the time Flight Options' IT staff noticed the problem, there was about an inch of water on the data center floor and several inches of water on the drives inside the array. "It soaked three full cases of hardware -- including the SAN -- out of about nine full rack cases in the data center," Davies said. "We pulled out drives, and water poured out. We were literally under water."
Flight Options' data was safe because it was backed up to a Columbus, Ohio, DR site, but that facility had no flight control center for personnel to work in. Davies said he called his CDW rep the next morning to talk about buying a new SAN for headquarters as soon as possible. "The good part was we knew what we wanted and what we needed," he said. "We just didn't want to do it with a gun to our heads. We didn't want to do it in 12 hours; we wanted to have a professional rollout, take our time and do it right."
Davies said he was leaning toward a Compellent Storage Center SAN and his CDW representative called Compellent that afternoon to order a new system. The storage array arrived at Flight Options two days later, and he immediately began setting it up. The data center flood hit Tuesday morning and the SAN was functioning by Friday.
"We had equipment with data on it late Thursday night and [pushed] some data live by Friday morning," he said.
Davies was prepared to migrate his backup data from Columbus via CA XOsoft WANSync (now CA ARCserve Replication) to restore his primary center, but that wasn't necessary for all of the data.
Davies said his staff dried the equipment with heat guns and clear air from compressors, and then copied data using virtual machine builds and Ghost hard drive mirrors.
"Believe it or not, we took the drives out of the cages, dried them off and most of the gear fired back up," he said. "We powered up the old EMC array, powered the Compellent array, put them side by side, and started sucking data off the EMC array into the Compellent array.
"By doing the transfer locally, we got up and running a lot quicker than you would expect."
New disaster recovery site makes room for people
Flight Options also changed its data center disaster recovery plan and its disaster recovery strategy after the flood, opening a larger DR site closer to headquarters. "If we lost this [Cleveland] building, we had our data but no place to put bodies to use the data," Davies said. "As soon as we got all the data and applications back up and running, we found space closer to corporate headquarters where people could drive to and sit down, and we stood up another DR center there."
Davies said that if the Cleveland building were lost, he could have operations up and running at the DR site within 45 minutes.
Besides a new disaster recovery center, Flight Options has more than twice as much storage capacity and a more modern SAN after the flood. Davies said Flight Options stores data from its large databases used to track plane maintenance, scheduling and dispatching on Fibre Channel drives, and file data on SATA. Compellent's Data Progression tiering software handles that automatically, he said.
"Before our great flood, one person would spend about three days over the course of a week managing data on the SAN," he said. "With the Compellent SAN, we haven't had to tell it what to do. It's moving data and managing access to the data the way it's supposed to, pretty close to hands-off."