It’s not just for disaster recovery anymore: Replication technology is finding a place in the more common, but just as crucial, recovery of systems and data in ordinary operations.
Replication technology, a staple of traditional disaster recovery (DR) strategies, can also play a pivotal role in local data recovery processes. Replication isn’t new, but it’s being used in new ways, especially with the increased popularity of virtualization and cloud services models. (These technologies often reduce the need to maintain on-premises physical system replicas.) Optimization technologies, such as data deduplication and compression, have also given replication technology a boost because they enable more efficient data transfers across local-area networks (LANs), wide-area networks (WANs) or storage-area networks (SANs).
But the larger contributors to replication’s shift from DR standard to operational recovery tool are data growth and diminishing downtime tolerance. Annual data growth -- averaging 10% to 30% for most companies, but as high as 40% or more for larger companies (those with more than 100 production servers[1]) -- is wreaking havoc on IT’s ability to copy systems and data for recovery purposes and meet recovery time objectives (RTOs).
Companies of all sizes also report little tolerance for time without access to applications and data. For tier 1 data, surveys by Enterprise Strategy Group (ESG) reveal that nearly three-quarters of respondents can tolerate three hours or less of downtime; for a little more than 50% of those surveyed, it’s one hour or less. For many, meeting a service-level agreement (SLA) for this RTO would be a daunting task with traditional backup approaches.
Replication technology overview
Replication refers to technology that creates a mirror copy of data. It can be performed at different points in the infrastructure: on the host system, in the storage array or in the network. Host-based replication asynchronously replicates data, regardless of storage array make or model. Array-based replication software runs on one or more storage controllers resident in disk storage systems, synchronously or asynchronously replicating data at the logical unit number (LUN) or volume block level between similar storage array models.
Network-based replication performs like array-based replication, but has the advantages of running on a network-resident appliance or intelligent switch and supporting dissimilar storage systems. There’s also application- or database-specific replication, typically packaged with or an add-on to the application. Application-based replication copies a specific app’s data to another instance of the application, taking advantage of its intimate knowledge about that app to offer optimal performance and application integrity.
Replication solutions can be broken into two main categories: synchronous and asynchronous. With a synchronous solution, data is copied to the replica (target) as soon as it’s written to primary storage (source), with the write to primary storage being acknowledged after the target storage array confirms the data was written to the replica. In contrast, asynchronous replication acknowledges the write and then moves data to the target replica, introducing a lag in the synchronicity of systems.
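The acknowledgment difference between the two modes can be sketched in a few lines of Python. This is a toy model for illustration only -- the class and method names are invented for the example, not any vendor’s replication API:

```python
import queue


class ReplicatedVolume:
    """Toy model contrasting synchronous and asynchronous
    replication acknowledgment semantics (illustrative only)."""

    def __init__(self, synchronous=True):
        self.synchronous = synchronous
        self.primary = {}              # source storage (block id -> data)
        self.replica = {}              # target storage
        self._pending = queue.Queue()  # async replication backlog

    def write(self, block_id, data):
        self.primary[block_id] = data  # write lands on primary storage
        if self.synchronous:
            # Synchronous: copy to the replica first, acknowledge only
            # after the target confirms the write.
            self.replica[block_id] = data
            return "ack"
        # Asynchronous: acknowledge immediately, replicate later --
        # this is the lag that can put source and target out of sync.
        self._pending.put((block_id, data))
        return "ack"

    def drain(self):
        """Catch the replica up by applying queued writes."""
        while not self._pending.empty():
            block_id, data = self._pending.get()
            self.replica[block_id] = data
```

In the synchronous case the replica is current the moment the write is acknowledged; in the asynchronous case the replica lags until the backlog is drained, which is exactly the trade-off discussed below.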
The differences between these approaches fall into three categories: bandwidth requirements, latency limitations and potential impact on applications. Synchronous replication typically requires more bandwidth, and latency limits it to campus distances. An asynchronous approach is better suited to replication over distance because latency isn’t a factor and low-bandwidth connections are better tolerated. As its name suggests, a synchronous approach keeps data on source and target systems in lock step. But since the write acknowledgment is delayed until the data has been replicated, any time delay can affect application performance. Asynchronous replication is far less likely to affect primary application performance. However, asynchronous replication can fall out of sync -- so much so that it jeopardizes the ability to meet recovery point objectives (RPOs).
Replication risks, rewards
Some might consider replication a risky approach to satisfy backup requirements. Why? There’s no way to roll back to a previous point in time. If corrupted data is replicated, then the replica is damaged goods. That’s why implementing snapshot or continuous data protection (CDP) technology with replication may make more sense.
Implementing a snapshot or CDP strategy locally for operational recovery, and asynchronously replicating the point-in-time copy to a remote location for disaster recovery would provide the failsafe measures to make replication a more viable and reliable method of data protection. This approach can also be more optimized since only the changed blocks between point-in-time captures are replicated. Furthermore, at either the source or target location, recovery can occur to any previous point in time, potentially improving RPOs.
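To illustrate why replicating point-in-time copies is bandwidth-efficient, here is a minimal sketch of replicating only the blocks that changed between two captures. The function names are hypothetical; real snapshot-based replication tracks changes in array metadata (bitmaps or journals) rather than diffing full copies:

```python
def changed_blocks(prev_snap, curr_snap):
    """Return only the blocks that differ between two point-in-time
    copies (hypothetical helper for illustration)."""
    return {blk: data for blk, data in curr_snap.items()
            if prev_snap.get(blk) != data}


def replicate_snapshot(prev_snap, curr_snap, target):
    """Apply only the delta to the remote replica and report how many
    blocks were actually sent over the WAN."""
    delta = changed_blocks(prev_snap, curr_snap)
    target.update(delta)
    return len(delta)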
Replication has seen an uptick in adoption since ESG polled 400 IT professionals in North America regarding their organization’s current and planned data protection strategies and technologies. In ESG’s 2008 survey, 30% of respondents cited replication use vs. 46% today. Similarly, those respondents stating an expectation to deploy replication sometime in the next two years is currently 37% compared to the 30% intending to deploy in the 2008 study. As pressure mounts to capture and, more importantly, recover ever-growing volumes of data within increasingly limited windows of time, replication for operational recovery should get serious consideration.
BIO: Lauren Whitehouse is a senior analyst focusing on backup and recovery software and replication solutions at Enterprise Strategy Group, Milford, Mass.
1 ESG Research Report, Data Protection Trends, April 2010.