In array-based data replication, the replication software runs on one or more storage controllers. It's most prevalent in medium- and large-sized companies, mostly because larger firms have deployed higher-end storage arrays that come with data replication features.
With more than 15 years of history, array-based replication is the most mature and proven replication approach, and its scalability is constrained only by the processing power of the array's storage controllers. "Customers scale replication performance in both our Clariion and Symmetrix arrays by distributing data replication across a larger number of storage processors," explained Rick Walsworth, director of product marketing for replication solutions at EMC Corp.
Because the replication software resides on the array, this approach is well suited to environments with a large number of servers, for several reasons: it's operating system-agnostic, capable of supporting Windows and Unix-based open systems as well as mainframes (on high-end arrays); licensing fees are typically based on the amount of storage rather than the number of servers attached; and it doesn't require any administrative work on attached servers. Because replication is offloaded to the storage controllers, replication processing overhead on servers is eliminated, which makes array-based replication attractive for mission-critical and high-end transactional applications.
The biggest disadvantage of array-based replication is its lack of support for heterogeneous storage systems. Unless the array provides a storage virtualization option, as Hitachi Data Systems does for its Universal Storage Platform (USP), array-based replication usually works only between similar array models. Besides the high degree of vendor lock-in, the entry cost for array-based replication is relatively high, and it can be particularly expensive for companies that have to support a large number of locations. In general, array-based replication works best for companies that have standardized on a single storage array vendor.
Almost all vendors of midsized to high-end arrays provide a replication feature. The replication products of these leading array vendors have made significant inroads and gained market share:
- EMC Symmetrix Remote Data Facility (SRDF) for both synchronous and asynchronous replication, and EMC MirrorView for synchronous and asynchronous replication of Clariion systems.
- Hitachi Data Systems TrueCopy for synchronous replication and Hitachi Data Systems Universal Replicator software for asynchronous replication.
- HP StorageWorks XP Continuous Access and Continuous Access EVA for both synchronous and asynchronous replication for HP XP and EVA arrays.
- IBM Corp. Metro Mirror for synchronous replication and IBM Global Mirror for asynchronous replication.
- NetApp SnapMirror for synchronous and asynchronous block-based replication, and NetApp SnapVault for file-based replication.
Even though these replication products are similar in many respects, a close technical analysis reveals subtle differences. For instance, the efficiency of the handshake between primary and target storage systems during synchronous replication greatly affects the distance a replication product can support. "Metro Mirror is able to write data to the target system with a single handshake, enabling it to support distances of up to 300 km," said Vic Pelz, consulting IT architect at IBM. That distance goes well beyond the 50 km to 200 km cited by other storage vendors.
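To see why the number of handshakes matters, consider propagation delay alone. The following is a rough sketch, not a model of any vendor's protocol: it assumes light travels through optical fiber at roughly 200,000 km/s (about 5 microseconds per kilometer one way) and ignores switch, controller and disk latency. The function name and the round-trip counts are illustrative assumptions.

```python
# Assumption: one-way propagation delay in optical fiber is about
# 5 microseconds per km (light travels at roughly 200,000 km/s in glass).
FIBER_DELAY_US_PER_KM = 5.0


def sync_write_penalty_ms(distance_km: float, round_trips: int) -> float:
    """Latency added to each synchronous write by propagation delay alone."""
    one_way_us = distance_km * FIBER_DELAY_US_PER_KM
    # Each round trip crosses the link twice (request out, acknowledgment back).
    return round_trips * 2 * one_way_us / 1000.0


for distance_km in (50, 200, 300):
    single = sync_write_penalty_ms(distance_km, round_trips=1)
    double = sync_write_penalty_ms(distance_km, round_trips=2)
    print(f"{distance_km:>3} km: 1 round trip = {single:.1f} ms, "
          f"2 round trips = {double:.1f} ms")
```

Under these assumptions, a protocol that needs two round trips per write pays the same 3 ms penalty at 150 km that a single-round-trip protocol pays at 300 km, which is why a leaner handshake translates directly into a longer supported distance.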
Differences can also be found among asynchronous replication implementations. While EMC buffers data to be replicated in memory, IBM Global Mirror tracks changes with so-called bitmaps, continuously transmitting changes and periodically re-synchronizing the source and target to ensure they stay in sync. Hitachi Data Systems, by contrast, uses change journals stored on disk in its Universal Replicator software.
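The difference between tracking changes in a bitmap and recording them in a journal can be illustrated with a toy example. This is a simplified sketch of the two general techniques, not a description of how any of the products above are implemented. The class names, the block-level granularity and the in-memory lists are all hypothetical.

```python
from collections import deque


class BitmapTracker:
    """Marks which blocks changed; write order and intermediate versions are lost."""

    def __init__(self, num_blocks):
        self.dirty = [False] * num_blocks

    def record_write(self, block, data):
        # Only the block number is noted; the data is read from the source later.
        self.dirty[block] = True

    def drain(self, source):
        """Return (block, current data) for each dirty block, then clear the bitmap."""
        changes = [(b, source[b]) for b, is_dirty in enumerate(self.dirty) if is_dirty]
        self.dirty = [False] * len(self.dirty)
        return changes


class JournalTracker:
    """Appends every write in order so the target can replay them in sequence."""

    def __init__(self):
        self.journal = deque()

    def record_write(self, block, data):
        self.journal.append((block, data))

    def drain(self, source):
        changes = list(self.journal)
        self.journal.clear()
        return changes


def apply_writes(tracker, source, writes):
    """Apply writes to the source volume and let the tracker observe each one."""
    for block, data in writes:
        source[block] = data
        tracker.record_write(block, data)


writes = [(2, "a"), (2, "b"), (0, "c")]  # block 2 is written twice

bitmap_source = [""] * 4
bitmap = BitmapTracker(num_blocks=4)
apply_writes(bitmap, bitmap_source, writes)
bitmap_changes = bitmap.drain(bitmap_source)

journal_source = [""] * 4
journal = JournalTracker()
apply_writes(journal, journal_source, writes)
journal_changes = journal.drain(journal_source)

print(bitmap_changes)   # [(0, 'c'), (2, 'b')]
print(journal_changes)  # [(2, 'a'), (2, 'b'), (0, 'c')]
```

In this sketch the bitmap sends less data because repeated writes to the same block collapse into one transfer, but it cannot reproduce the original write order. The journal preserves every write in sequence at the cost of storing and transmitting each one.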
This article originally appeared in Storage magazine.
About this author: Jacob Gsoedl is a frequent contributor to "Storage" magazine.