[This story was updated February 2013] What you will learn in this tip: In this first part of our series on data replication techniques, learn about the differences between asynchronous and synchronous replication, and the pros and cons of each technology.
It seems like data replication is everywhere today. From replicating virtual machine images for data protection and high availability to the exchange of information with cloud services, replication has proved to be the most suitable and agile data transfer and protection method in increasingly virtualized IT environments. But choosing the best data replication technique can be difficult. Replication offerings are generally grouped into categories that offer varying benefits and value propositions for different use cases and environments. In this tip, we'll look at asynchronous and synchronous replication.
Asynchronous replication is the most broadly supported replication mode, available in array-, network- and host-based replication products. Data is committed to the source array first, then buffered or journaled for subsequent replication to the target array, so it arrives at the replication target with a delay that ranges from nearly instantaneous to minutes or even hours. Because it tolerates network latency and limited bandwidth, asynchronous replication is well suited to long-distance replication.
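To make the commit-then-journal flow concrete, here is a minimal Python sketch. It is a toy model, not any vendor's implementation: the class name AsyncReplicator, its methods and the in-memory dictionaries standing in for the source and target arrays are all assumptions made for illustration.

```python
from collections import deque


class AsyncReplicator:
    """Toy model: acknowledge writes locally, ship them to the target later."""

    def __init__(self):
        self.source = {}        # block address -> data on the source array
        self.target = {}        # block address -> data on the target array
        self.journal = deque()  # writes committed locally but not yet replicated

    def write(self, address, data):
        # The write is committed to the source and journaled, then acknowledged
        # right away; the host never waits on the replication link.
        self.source[address] = data
        self.journal.append((address, data))
        return "ack"

    def drain(self, max_entries=None):
        # Called whenever the link is available; replays journaled writes
        # in the order they were made and returns how many were shipped.
        shipped = 0
        while self.journal and (max_entries is None or shipped < max_entries):
            address, data = self.journal.popleft()
            self.target[address] = data
            shipped += 1
        return shipped

    def lag(self):
        # Number of writes by which the target trails the source.
        return len(self.journal)
```

In this sketch, the value returned by lag() is the data that would be lost if the source failed before the next drain() call, which is the trade-off accepted in exchange for distance tolerance.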
Not all asynchronous replication implementations are equal, though. Key areas of differentiation are how a product deals with network outages and whether it supports transaction recovery or simply creates a crash-consistent replica that depends on the target OS and application to resolve inconsistencies. For instance, both IBM Corp.'s Global Mirror for the IBM System Storage DS8000 and Hitachi Data Systems Corp.'s Universal Replicator have provisions to maintain the sequence of writes. "Hitachi Universal Replicator guarantees transaction recovery by sequencing replicated data within consistency groups," noted Sarah Hamilton, Hitachi Data Systems' senior product marketing manager, data resilience and security.
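The idea of maintaining write sequence can be sketched as follows. This is only an illustration of the general principle, assuming each write in a consistency group carries a sequence number; it does not describe how Global Mirror or Universal Replicator work internally, and the OrderedApplier class and its fields are hypothetical.

```python
class OrderedApplier:
    """Applies replicated writes on the target strictly in sequence order."""

    def __init__(self):
        self.volumes = {}   # (volume, address) -> data applied on the target
        self.next_seq = 1   # next sequence number allowed to be applied
        self.pending = {}   # seq -> (volume, address, data) that arrived early

    def receive(self, seq, volume, address, data):
        # Hold the write until every earlier write in the group has been
        # applied, then apply as many consecutive writes as are available.
        self.pending[seq] = (volume, address, data)
        applied = []
        while self.next_seq in self.pending:
            vol, addr, payload = self.pending.pop(self.next_seq)
            self.volumes[(vol, addr)] = payload
            applied.append(self.next_seq)
            self.next_seq += 1
        return applied
```

Because a write to a data volume is never applied ahead of the log write that preceded it at the source, the replica in this model always reflects a state the source actually passed through, which is what lets an application recover transactions from it.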
Synchronous replication: For the high-end
Rarely supported in host-based replication products, synchronous replication is the hallmark of high-end block-based storage arrays and is also supported by most network-based replication products, including the Hewlett-Packard (HP) Co. StorageWorks SAN Virtualization Services Platform (SVSP), IBM System Storage SAN Volume Controller (SVC) and LSI Corp. StoreAge Storage Virtualization Manager (SVM). Because data is committed to the replication source only after it has been successfully committed to the replication target, synchronous replication guarantees that the source and target remain in sync. A reliable, low-latency network is a prerequisite, and maximum supported distances range from 50 km to 300 km, depending on the array vendor.
The primary use of synchronous replication is for high-end transactional applications that require instantaneous failover if the primary node fails. It's less relevant in network-attached storage (NAS) unless the NAS can also serve as block-based storage for high-end transactional applications. A pure NAS play, such as the Hitachi NAS Platform, powered by BlueArc, usually lacks synchronous replication support. "A NAS doesn't require synchronous replication," asserted Ravi Chalaka, BlueArc's senior director of solutions marketing. Conversely, NetApp filers, with their support for NAS and block-based protocols, especially Fibre Channel (FC), support synchronous replication, allowing NetApp Inc. arrays to compete with very high-end block-based storage systems from EMC Corp., Hitachi Data Systems and IBM.
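The distance limits on synchronous replication follow largely from latency, since every write must wait for a round trip to the target. The rough Python calculation below illustrates this. It assumes light travels about 200 km per millisecond in optical fiber; the function names, the 0.5 ms local commit time and the single serialized write stream are illustrative assumptions, not vendor figures.

```python
FIBER_KM_PER_MS = 200.0  # assumed: roughly 200 km per millisecond in fiber


def round_trip_ms(distance_km, equipment_ms=0.0):
    """Minimum delay the link adds to each synchronous write."""
    # The write travels to the target and the acknowledgment travels back.
    return 2 * distance_km / FIBER_KM_PER_MS + equipment_ms


def max_sync_writes_per_second(distance_km, local_commit_ms=0.5):
    """Upper bound for one stream of strictly serialized writes."""
    per_write_ms = local_commit_ms + round_trip_ms(distance_km)
    return 1000.0 / per_write_ms


for km in (50, 300):
    print(km, "km:", round_trip_ms(km), "ms round trip")
```

Under these assumptions, the link alone adds 0.5 ms per write at 50 km and 3 ms at 300 km, before any switch, protocol or array overhead is counted.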
In the next part of our series, learn about the pros and cons of array-based and network-based replication.
About this author: Jacob Gsoedl is a freelance writer and a corporate director for business systems.
This article was previously published in Storage magazine.
This was first published in April 2011