What you will learn in this tip: Data deduplication tools have morphed from a hot technology trend in data backup into being a necessary tool for your IT organization. Learn how dedupe technology can be a useful component of your business' disaster recovery plan.
Although there's no question data deduplication is a must-have technology for many organizations, many issues come up when it's being used as part of your disaster recovery strategy. Deduplication technology is primarily deployed as part of the data protection process. Dedupe eliminates duplicate copies of the protected data. It doesn't matter whether or not dedupe is source-based (an agent), media/backup server-based, or target-based (NAS or VTL with inline or post-processing deduplication); they all aim at creating a single copy of the protected data.
The first issue to consider when using dedupe technology for disaster recovery is recovery performance. Disaster recovery requires large amounts of data to be recovered and restored, and large amounts of data take longer to recover and restore. Dedupe technologies add to that recovery time because all deduped data must be "rehydrated" (reassembled from its deduplicated form) before it can be recovered and restored. Rehydration adds latency, and latency slows performance. Far too many IT pros are distracted by the data protection process. They focus on how well deduplication technology backs up data within the required window of time and forget that the reason they are protecting or backing up their data is to be able to recover it.
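To make the rehydration step concrete, here is a minimal sketch of fixed-size chunk deduplication in Python. It is an illustration only, not how any particular product works: the names `dedupe`, `rehydrate` and `CHUNK_SIZE` are hypothetical, the chunk size is unrealistically small, and real tools typically use variable-size chunking and on-disk indexes. The point is that a restore has to look up and reassemble every chunk, which is the extra work that rehydration adds.

```python
import hashlib

# Assumption: a tiny fixed chunk size, chosen only to keep the demo readable.
CHUNK_SIZE = 4


def dedupe(data: bytes, store: dict) -> list:
    """Split data into fixed-size chunks, keep each unique chunk once,
    and return the ordered list of chunk hashes (the "recipe")."""
    recipe = []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # only the first copy is stored
        recipe.append(digest)
    return recipe


def rehydrate(recipe: list, store: dict) -> bytes:
    """Rebuild the original data by looking up every chunk in order."""
    return b"".join(store[digest] for digest in recipe)


store = {}
backup = b"ABCDABCDABCDWXYZ"      # three identical chunks plus one unique chunk
recipe = dedupe(backup, store)    # four recipe entries, two stored chunks
restored = rehydrate(recipe, store)
```

In this toy run the store holds two chunks for a four-chunk backup, and the restore must walk the full recipe to get the original bytes back.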
Ensure your deduplication tool can meet your RTO
No one ever lost his or her job for a failed backup, but many people have lost their jobs for a failed recovery. For disaster recovery administrators, this means they must ensure their deduplication tool's recovery performance (as measured in terabytes per hour) meets all recovery time objectives (RTOs). Measuring recovery performance requires more than measuring the rehydration speed of data from the target storage to the media server or doing a specification analysis. It requires measuring the speed of recovering the data all the way to the point at which the application is running, can access its data and is no longer in a state of disruption.
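The arithmetic behind that check can be sketched in a few lines of Python. This is a rough estimate under stated assumptions, not a substitute for an actual recovery test: the function names are hypothetical, the restore rate is assumed to be a measured end-to-end figure in terabytes per hour, and `app_restart_hours` stands in for the time it takes to get the application running again after the data is back.

```python
def estimated_recovery_hours(data_tb, restore_tb_per_hour, app_restart_hours=0.0):
    """Estimate total recovery time: data restore time plus the time
    needed to bring the application back into service."""
    if restore_tb_per_hour <= 0:
        raise ValueError("restore rate must be positive")
    return data_tb / restore_tb_per_hour + app_restart_hours


def meets_rto(data_tb, restore_tb_per_hour, rto_hours, app_restart_hours=0.0):
    """Return True if the estimated recovery time fits within the RTO."""
    estimate = estimated_recovery_hours(data_tb, restore_tb_per_hour, app_restart_hours)
    return estimate <= rto_hours


# Hypothetical figures: 20 TB to restore at a measured 4 TB/hour,
# plus 1 hour to restart the application.
hours = estimated_recovery_hours(20, 4, app_restart_hours=1)
fits_8_hour_rto = meets_rto(20, 4, rto_hours=8, app_restart_hours=1)
fits_4_hour_rto = meets_rto(20, 4, rto_hours=4, app_restart_hours=1)
```

With these made-up numbers the estimate is 6 hours, which fits an 8-hour RTO but not a 4-hour one. The restore rate fed into the estimate should come from a real recovery test rather than a specification sheet.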
The second disaster recovery deduplication issue arises when IT is required to provide a business disaster recovery plan in which recovery takes place at a different geographic location. In that situation, the dedupe technology must have some form of wide-area replication or the ability to replicate or back up deduped data to a cloud storage service provider.
Cloud backup services as an alternative to dedupe technology
Cloud-based backup and recovery services offer another option for combining dedupe and disaster recovery. These services offer local and geographically distributed disaster recovery with built-in deduplication. Some services offer local disaster recovery by keeping a copy of mission-critical data backed up locally. In this context, local disaster recovery is known as a private cloud.
Cloud backup and recovery service providers also offer geographically distributed disaster recovery (i.e., public cloud) for site disasters such as a hurricane, flood, earthquake or tornado. It's not usually a good idea to depend on a public cloud backup for recoveries that must meet local RTOs. However, providing disaster recovery at a geographically different location absolutely requires the backed-up data to be in some form of a public cloud. Disaster recovery requires both types of protection, which is a hybrid cloud (a combination of private and public).
Dedupe technology plays a key role in both aspects of disaster recovery. Deduplication keeps the amount of data stored locally and in the off-site cloud at an absolute minimum. And because most cloud backup services charge based on the amount of data stored, this keeps costs at a minimum, too.
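The cost effect can be shown with a short Python calculation. The figures are purely hypothetical: the 10:1 deduplication ratio and the price per terabyte are assumptions chosen for round numbers, not vendor data, and actual ratios vary widely with the type of data being protected.

```python
def stored_tb(logical_tb, dedupe_ratio):
    """Capacity actually stored after deduplication.
    A ratio of 10 means 10 TB of protected data is stored as 1 TB."""
    if dedupe_ratio < 1:
        raise ValueError("dedupe ratio must be at least 1")
    return logical_tb / dedupe_ratio


def monthly_cost(logical_tb, dedupe_ratio, price_per_tb_month):
    """Monthly charge when the provider bills on stored capacity."""
    return stored_tb(logical_tb, dedupe_ratio) * price_per_tb_month


# Hypothetical inputs: 100 TB of protected data, an assumed 10:1 ratio
# and an assumed price of 25 currency units per TB per month.
without_dedupe = monthly_cost(100, 1, 25)
with_dedupe = monthly_cost(100, 10, 25)
```

Under these assumptions the monthly bill drops from 2,500 to 250, in direct proportion to the deduplication ratio.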
Using a dedupe tool in your business disaster recovery plan requires that it have good enough recovery performance to meet all RTOs. In addition, the dedupe technology must be able to move the deduped data off-site and recover that data at a different geographical location.
About the author:
Marc Staimer is the founder, senior analyst, and CDS of Dragon Slayer Consulting in Beaverton, OR. The consulting practice of more than 12 years has focused on strategic planning, product development and market development. With more than 30 years of marketing, sales and business experience in infrastructure, storage, server, software, and virtualization, he's considered one of the industry's leading experts. Marc can be reached at firstname.lastname@example.org.