
Utility fills testing gaps with third-party disaster recovery service

Beth Pariseau
The largest electricity and natural gas distribution company in New England has already gotten to a point many IT pros are still only dreaming of: a cohesive disaster recovery plan and an infrastructure to match. But reaching infrastructure Nirvana, as Northeast Utilities discovered, is only the beginning of effective disaster recovery.

The company has around 500 TB of data on EMC DMX 3000 arrays at its main data center just north of Hartford, Conn., and a secondary site south of Hartford. Northeast Utilities serves Connecticut, Western Massachusetts and New Hampshire through subsidiaries Connecticut Light and Power, Public Service of New Hampshire, Yankee Gas and Western Massachusetts Electric Co.

Four years ago, Northeast hired EMC to help it classify its data by application, determine the recovery time objective (RTO) and the recovery point objective (RPO), and set up failover between data centers, said Ed Goldberg, the company's business continuity/disaster recovery coordinator. Currently, Northeast has four tiers of recovery: Tier 1 has a two-hour recovery time objective. Because it's reserved for the company's most mission-critical applications that control the electrical grid, the failover process to hot standby servers at the secondary data center requires no manual intervention.

Tier 2 has a 48-hour RTO and requires manual intervention to initiate failover to cold standby servers. This tier includes email and some billing systems. Most of the billing systems and the company's financial data are assigned to Tier 3, which has a five-day RTO and will use repurposed test and development machines at the secondary data center to recover. The rest, Tier 4 applications with a 21-day RTO, "are basically a shopping list and some configuration procedures," based on what Northeast knows it can bring in from its vendors to replace systems on short notice.
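The four tiers described above can be captured in a small lookup table. The following is only an illustrative sketch, not Northeast's actual tooling: the `RecoveryTier` class, the `meets_rto` helper and the field names are all hypothetical, and the values are limited to what the article states.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryTier:
    """One recovery tier as described in the article (class name is hypothetical)."""
    number: int
    rto: timedelta          # recovery time objective
    recovery_method: str    # wording taken from the article
    examples: tuple         # application types the article names for this tier


TIERS = {
    1: RecoveryTier(1, timedelta(hours=2),
                    "automatic failover to hot standby servers",
                    ("applications that control the electrical grid",)),
    2: RecoveryTier(2, timedelta(hours=48),
                    "manual failover to cold standby servers",
                    ("email", "some billing systems")),
    3: RecoveryTier(3, timedelta(days=5),
                    "repurposed test and development machines",
                    ("most billing systems", "financial data")),
    4: RecoveryTier(4, timedelta(days=21),
                    "shopping list and configuration procedures",
                    ()),  # the article names no specific Tier 4 applications
}


def meets_rto(tier_number, elapsed):
    """Return True if a recovery that took `elapsed` is within the tier's RTO."""
    return elapsed <= TIERS[tier_number].rto


if __name__ == "__main__":
    for tier in TIERS.values():
        print(f"Tier {tier.number}: RTO {tier.rto}, {tier.recovery_method}")
```

A table like this makes it easy to check a test result against the objective, for example `meets_rto(2, timedelta(hours=36))` for a Tier 2 recovery that took a day and a half.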

In addition to setting up the disaster recovery plan, EMC consultants helped conduct its first live test several years ago. But since then, the company has been on its own with testing, and that's where the rest of the disaster recovery battle began. The disaster recovery plan held still but, like most infrastructures, Northeast's data centers did not.

"Over the years as we've added new systems and done technology refreshes, it's become more difficult for us to do real DR tests," Goldberg said. A detailed tabletop exercise still takes place annually, but the company hasn't been able to do many live failovers over the last two years, owing to the nature of the company's business. "We can't cause an outage for the test, and we don't have a means of failing back once the server's over at the secondary data center."

With this issue in mind, Goldberg met a representative from Continuity Software at a disaster-planning conference last August. Continuity Software offers prospects a free evaluation that promises to find enough misallocated storage to pay for the software's licensing. "They came in and did their foot-in-the-door scheme," Goldberg said. "But whether or not we've misallocated storage isn't where we saw the value – we see the value in having a scan of our disaster recovery capabilities every night."

But there was another catch. "I told them, you run it," he said. "We didn't have the resources to go around deploying software licenses and agents, and monitoring it every day." Continuity Software offered to manage Northeast's disaster recovery monitoring at its facilities, connecting to Northeast's network through a private VPN and calling if there's a critical problem. Issues that don't need an immediate response are brought to the company's management during a conference call once a month. The service has been in production since January.

"They've already found gaps," Goldberg said, such as forgetting to associate a certain disaster recovery copy of data with a new server after a hardware refresh. "And if we can explain to them why we meant to configure something a certain way, they'll squelch the alert on it so it doesn't keep coming up."
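The kind of gap Goldberg describes, a server left without an associated disaster recovery copy after a refresh, can be illustrated with a simple inventory comparison. This is a hypothetical sketch of the idea only, not how Continuity Software's product works; the function name, the server names and the data layout are all assumptions made for illustration.

```python
def nightly_scan(servers, replica_map, squelched=frozenset()):
    """Return servers that have no DR copy associated with them.

    servers     -- iterable of server names in production (hypothetical inventory)
    replica_map -- dict mapping server name to a list of its DR copies
    squelched   -- servers whose missing copy is intentional, so the alert is suppressed
    """
    gaps = [name for name in sorted(servers) if not replica_map.get(name)]
    return [name for name in gaps if name not in squelched]


if __name__ == "__main__":
    # Hypothetical inventory: "grid-02" was added in a refresh and never mapped.
    servers = {"billing-01", "mail-01", "grid-02"}
    replica_map = {"billing-01": ["replica-a"], "mail-01": []}

    print(nightly_scan(servers, replica_map))
    print(nightly_scan(servers, replica_map, squelched={"mail-01"}))
```

The `squelched` set mirrors the behavior Goldberg mentions: once a configuration is explained as intentional, its alert stops appearing in later scans.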

Goldberg wasn't able to disclose what he paid for the service, but Continuity Software has since opened up the remote monitoring service to any customer through its Disaster Recovery Assurance offering, which was announced in February. The list price is $3,200 per protected server per year. "It's less than the cost of having a consultant come in once a month, and this also checks up on us every night," Goldberg said.
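For a rough sense of scale, the list price works out as a simple multiplication. The sketch below assumes list pricing with no discounts, and the server count is a made-up figure, since the article does not say how many servers Northeast protects or what it paid.

```python
LIST_PRICE_PER_SERVER_PER_YEAR = 3_200  # US dollars, list price quoted in the article


def annual_list_cost(protected_servers):
    """Annual cost at list price for the given number of protected servers."""
    if protected_servers < 0:
        raise ValueError("protected_servers must be non-negative")
    return protected_servers * LIST_PRICE_PER_SERVER_PER_YEAR


if __name__ == "__main__":
    hypothetical_servers = 50  # assumption for illustration only
    print(f"${annual_list_cost(hypothetical_servers):,} per year")
```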

Goldberg admitted there was some nervousness about allowing a third party to access its network. Two things put that to rest: careful proof-of-concept testing, in which network "sniffers" watched the traffic while Continuity Software monitored the systems, and the firewalls that remain between Continuity Software and anything it's not supposed to access.

In the meantime, there are some items on Goldberg's wish list, chief among them mainframe support. "The mainframe is a more stable system, both by nature and because there's less fingers in it – the open systems are always in major flux," he said. "But I'd love for Continuity to be able to monitor that environment also."

