Complete guide to backup deduplication
Data deduplication reduces the amount of data to be stored by identifying and eliminating duplicate files and blocks of data. With the right data dedupe solution, you back up less data to a storage device; as a result, you can store more data while using less network bandwidth to transmit it to and from the storage site.
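To make the idea of eliminating duplicate blocks concrete, here is a minimal sketch in Python. It assumes fixed-size 4 KB blocks and SHA-256 fingerprints; the names `BLOCK_SIZE`, `dedupe` and `rebuild` are hypothetical and chosen for illustration only. Commercial products typically use more sophisticated techniques, such as variable-size chunking, so treat this as an illustration of the principle rather than how any particular product works.

```python
import hashlib

# Assumed fixed block size for this sketch; real products often use
# variable-size chunks instead.
BLOCK_SIZE = 4096


def dedupe(data: bytes, store: dict) -> list:
    """Split data into blocks and keep only one copy of each unique block.

    `store` maps a block's SHA-256 digest to the block contents.
    Returns the ordered list of digests needed to rebuild `data`.
    """
    recipe = []
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        digest = hashlib.sha256(block).digest()
        # Store the block only if this digest has not been seen before.
        store.setdefault(digest, block)
        recipe.append(digest)
    return recipe


def rebuild(recipe: list, store: dict) -> bytes:
    """Reassemble the original data from its list of block digests."""
    return b"".join(store[digest] for digest in recipe)


if __name__ == "__main__":
    store = {}
    # Four blocks of input, but only two distinct block contents.
    backup = b"A" * 8192 + b"B" * 4096 + b"A" * 4096
    recipe = dedupe(backup, store)
    print(len(recipe), "blocks referenced,", len(store), "blocks stored")
    assert rebuild(recipe, store) == backup
```

In this example, four blocks of input are represented by two stored blocks plus a short list of references, which is where the storage and bandwidth savings come from.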
Deduplication is an important strategy for disaster recovery (DR). For example, if you can reduce the amount of data to be backed up, yet ensure that the data being backed up is the critical data that needs to be available in an emergency, chances are you can get to the critical data more quickly after a disruptive incident, simply because there is less data to find and recover.
The nature of deduplication makes it a natural fit for DR applications, but users must examine the options carefully. A key goal in disaster recovery is to retrieve backed-up data as quickly as possible so that it may enter production. A data deduplication solution reduces the volume of data so that it can be stored on disk-based storage rather than tape. Disk-based backup is more reliable for restores and takes less time to restore than tape. The combination of deduplication and disk-based backup shrinks backup and restoration windows dramatically.
Deduplication also improves network utilization. Once data has been reduced, it can be quickly and securely backed up to a variety of storage locations via wide area networks (WANs). Likewise, in a disaster, faster access to the deduped data over WANs helps ensure that recovery time objectives can be achieved.
Compare proposed data compression and deduplication ratios with your disaster recovery backup policies and technical requirements. Since some types of data compress more effectively than others, check that your deduplication solution maps to your backup and recovery objectives without any trade-offs in performance and scalability.
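When comparing quoted ratios, it can help to convert them into the share of storage actually saved. The sketch below uses the common definitions (ratio = data before reduction divided by data after reduction; savings = 1 - 1/ratio). The function names and sample figures are hypothetical and not taken from any vendor.

```python
def reduction_ratio(logical_tb: float, stored_tb: float) -> float:
    """Ratio of data before reduction to data actually stored."""
    return logical_tb / stored_tb


def space_savings(ratio: float) -> float:
    """Fraction of storage saved for a given reduction ratio."""
    return 1 - 1 / ratio


if __name__ == "__main__":
    # Hypothetical example: 10 TB of backups stored in 2 TB.
    ratio = reduction_ratio(10, 2)
    print(f"{ratio:.0f}:1 ratio saves {space_savings(ratio):.0%}")

    # Higher ratios bring diminishing returns in space saved.
    for r in (5, 10, 20):
        print(f"{r}:1 -> {space_savings(r):.0%} saved")
```

Doubling a ratio from 10:1 to 20:1 raises the savings only from 90% to 95%, so a higher advertised ratio does not always justify a trade-off in performance or scalability.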
Determine if your existing backup software works seamlessly with the deduplication solution. Will the data dedupe solution force you to replace and/or modify your existing storage infrastructure and systems?
When evaluating deduplication solutions, determine what is needed to recover and "rehydrate" the data. How much time is needed for the data to be ready for use and how much bandwidth is needed to get the data to where it can be used? If it's determined that the data recovery and reconstitution processes make it difficult or impossible to achieve your recovery time objectives, deduplication may not be the best solution.
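A rough way to frame these questions is to estimate transfer and rehydration time and compare the total with your recovery time objective. The following sketch is a deliberately simplified model: it assumes the deduplicated data is transferred first and rehydrated afterward, and that both the link speed and the rehydration throughput are constant. All names and figures are hypothetical; real values should come from testing your own environment.

```python
def estimated_restore_hours(stored_gb: float, logical_gb: float,
                            link_mbps: float,
                            rehydrate_mb_per_s: float) -> float:
    """Estimate hours to transfer deduplicated data and then rehydrate it.

    Uses decimal units: 1 GB = 1,000 MB = 8,000 megabits.
    """
    # Only the reduced (stored) data crosses the network.
    transfer_seconds = stored_gb * 8000 / link_mbps
    # Rehydration must reproduce the full (logical) data set.
    rehydrate_seconds = logical_gb * 1000 / rehydrate_mb_per_s
    return (transfer_seconds + rehydrate_seconds) / 3600


def meets_rto(restore_hours: float, rto_hours: float) -> bool:
    """True if the estimated restore time fits within the RTO."""
    return restore_hours <= rto_hours


if __name__ == "__main__":
    # Hypothetical figures: 1,000 GB of data reduced to 200 GB,
    # a 1,000 Mbps link and 200 MB/s rehydration throughput.
    hours = estimated_restore_hours(200, 1000, 1000, 200)
    print(f"Estimated restore time: {hours:.2f} hours")
    print("Meets a 4-hour RTO:", meets_rto(hours, 4))
```

If the estimate exceeds the recovery time objective even under optimistic assumptions, that is a sign to look more closely at the rehydration process before committing to a solution.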
If your DR strategies include the replication of on-premises backups to private or public clouds or across a WAN, deduplication can be an effective strategy. By deduping backup data before replicating it, you can reduce the time and bandwidth needed to move the data to an alternate location, as well as cloud storage costs.
See if you can find a comprehensive and unified data backup and recovery solution that incorporates deduplication and network optimization techniques. Whether it's a backup server running as a physical or virtual appliance, or a traditional data protection package, a unified data protection and deduplication solution reduces costs, accelerates backups and ensures rapid data recovery following a disaster.
Be sure to get references from companies that back up similar amounts of data and choose a data deduplication solution provider with a successful track record.
Deduplication is an increasingly popular approach for handling massive (and growing) data volumes. Properly configured in alignment with data storage and backup policies and disaster recovery requirements, data deduplication can achieve major economies of scale, reduce storage costs, boost network performance and fulfill DR performance metrics.
About the author:
Paul Kirvan, CISA, FBCI, has more than 24 years of experience in business continuity management (BCM) as a consultant, author and educator. He has completed dozens of BCM consulting and audit engagements that address all aspects of a business continuity management system (BCMS) and which are aligned with global standards including BS 25999 and ISO 22301. Kirvan currently works as an independent business continuity consultant/auditor and is the secretary of the Business Continuity Institute USA chapter and a member of the BCI Global Membership Council. He can be reached at firstname.lastname@example.org.