NetBackup Media Server Deduplication Pool (MSDP): Overview

By Mariusz | February 8, 2015

This post covers the Media Server Deduplication Pool (MSDP), an embedded deduplication technology that ships and installs with the base NetBackup 7.6 software and is also provided on NetBackup appliances running version 2.6 software. Deduplication offers the following benefits:

  • Reduce the amount of data that is stored.
  • Reduce backup bandwidth.
  • Reduce backup windows.
  • Reduce infrastructure requirements.

Two deduplication options are available:

  • Client deduplication
  • Media Server deduplication

With media server deduplication, NetBackup clients send their backups to a NetBackup media server, which deduplicates the backup data. The media server hosts the NetBackup Deduplication Engine, which writes the data to a Media Server Deduplication Pool on the target storage and manages the deduplicated data.

With NetBackup MSDP client deduplication, clients deduplicate their backup data and then send it directly to the storage server, which writes it to the storage. Because only deduplicated data crosses the network, network traffic is reduced.
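
The difference in network traffic between the two options can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the StorageServer class, the client_side_backup function, the 4-byte segment size, and the use of SHA-256 as the fingerprint are all hypothetical and are not NetBackup APIs or settings. The sketch also ignores the small cost of sending the fingerprints themselves.

```python
import hashlib

SEGMENT_SIZE = 4  # unrealistically small, chosen so the example is easy to follow


def fingerprint(segment: bytes) -> str:
    # SHA-256 is a stand-in here, not necessarily the hash MSDP uses.
    return hashlib.sha256(segment).hexdigest()


def split(data: bytes) -> list[bytes]:
    # Fixed-size segmentation; the last segment may be shorter.
    return [data[i:i + SEGMENT_SIZE] for i in range(0, len(data), SEGMENT_SIZE)]


class StorageServer:
    """Hypothetical storage server that counts the payload bytes it receives."""

    def __init__(self) -> None:
        self.segments: dict[str, bytes] = {}
        self.bytes_received = 0

    def receive_raw(self, data: bytes) -> None:
        # Media server deduplication: the full backup crosses the network
        # and is deduplicated on arrival.
        self.bytes_received += len(data)
        for seg in split(data):
            self.segments.setdefault(fingerprint(seg), seg)

    def missing(self, fingerprints: set[str]) -> set[str]:
        # Tell the client which fingerprints are not stored yet.
        return {fp for fp in fingerprints if fp not in self.segments}

    def receive_segments(self, segments: dict[str, bytes]) -> None:
        # Client deduplication: only segments the server lacks arrive here.
        for fp, seg in segments.items():
            self.bytes_received += len(seg)
            self.segments.setdefault(fp, seg)


def client_side_backup(server: StorageServer, data: bytes) -> int:
    """Deduplicate on the client and return the payload bytes sent."""
    local = {fingerprint(seg): seg for seg in split(data)}
    needed = server.missing(set(local))
    to_send = {fp: local[fp] for fp in needed}
    server.receive_segments(to_send)
    return sum(len(seg) for seg in to_send.values())
```

In this toy model, backing up the 16-byte stream AAAABBBBAAAACCCC and then the 12-byte stream AAAABBBBDDDD sends 12 and 4 bytes with client_side_backup, compared with 28 bytes in total through receive_raw. Both paths end up storing the same four unique segments.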

Every file to be deduplicated goes through the following steps:

  1. The deduplication plug-in separates the file metadata from the file contents.
  2. The file contents are logically divided into segments.
  3. A hash fingerprint is calculated for each segment.
  4. The fingerprints are used to identify unique segments, and only the unique segments are stored.
  5. The data stream is processed.

Uniqueness of data segments is maintained across all clients – not just for individual client backup data.

MSDP storage servers

A storage server is an entity that writes to and reads from the storage. One host functions as the storage server, and only one storage server exists for each NetBackup deduplication node. The host must be a NetBackup media server. The MSDP storage server does the following:

  • Receives the backups from clients and then deduplicates the data.
  • Receives the deduplicated data from clients or from other media servers. NetBackup clients and other NetBackup media servers can also be configured to deduplicate data, in which case the storage server receives the data only after it is deduplicated.
  • Writes the deduplicated data to and reads the deduplicated data from the disk storage.
  • Manages that storage.
  • Manages the deduplication processes.

MSDP stream handlers

NetBackup provides stream handlers that process various backup data stream types. Stream handlers improve backup deduplication rates by processing the underlying data stream. In addition, the deduplication plug-in uses a multi-threaded agent that improves backup performance as follows:

  • Performs multiple MD5 fingerprint calculations in parallel rather than serially.
  • Overlaps I/O and CPU operations, making better use of system resources.
  • Uses batch fingerprint queries to determine whether data segments are unique.
  • Transmits data concurrently to the Deduplication Engine over multiple streams.
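
Two of the techniques listed above, parallel fingerprint calculation and batch fingerprint queries, can be illustrated with the Python standard library. This is a sketch under assumptions, not the mtstrmd implementation: the function names, the worker count, the batch size, and the in-memory set standing in for the fingerprint index are hypothetical, and SHA-256 is used as a stand-in fingerprint.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def fingerprint(segment: bytes) -> str:
    # Stand-in hash; the text above mentions MD5 fingerprints.
    return hashlib.sha256(segment).hexdigest()


def fingerprint_parallel(segments: list[bytes], workers: int = 4) -> list[str]:
    """Fingerprint segments on a thread pool; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fingerprint, segments))


def batch_unique(fingerprints: list[str], known: set[str],
                 batch_size: int = 3) -> tuple[list[str], int]:
    """Find fingerprints not yet stored, asking the index once per batch.

    Returns the unique fingerprints in first-seen order and the number of
    batch queries issued. `known` stands in for the remote fingerprint index.
    """
    unique: list[str] = []
    seen: set[str] = set()
    queries = 0
    for start in range(0, len(fingerprints), batch_size):
        batch = fingerprints[start:start + batch_size]
        queries += 1  # one round trip covers the whole batch
        for fp in batch:
            if fp not in known and fp not in seen:
                seen.add(fp)
                unique.append(fp)
    return unique, queries
```

Batching matters because each query to a remote index costs a round trip. With a batch size of 2, five fingerprints need three queries instead of five.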

On Windows systems, the multi-threaded agent runs as the NetBackup Deduplication Multi-Threaded Agent service, which is visible in the Windows Services manager (mtstrmd process). On UNIX and Linux NetBackup systems, the agent appears as mtstrmd in the output of the bpps command.

MSDP Data Integrity Enhancements

MSDP includes the following data integrity enhancements:

  • Data loss and corruption are automatically contained to ensure new backups are intact.
  • CRC checking of data containers is performed automatically, with the results reported to NetBackup.
  • Storage leaks are automatically detected and repaired.
  • Reference database entries are automatically recovered if they are corrupt or missing.
  • Storage garbage collection is automatically performed.
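
The idea behind CRC checking of data containers can be shown with a short sketch. It assumes a hypothetical container layout (a dictionary holding a payload and the CRC-32 recorded when the payload was written); the actual MSDP container format is not described in this post.

```python
import zlib


def make_container(payload: bytes) -> dict:
    """Record the payload together with the CRC-32 computed at write time."""
    return {"payload": payload, "crc": zlib.crc32(payload)}


def verify_containers(containers: dict[str, dict]) -> list[str]:
    """Return the IDs of containers whose payload no longer matches its CRC."""
    return [
        container_id
        for container_id, container in containers.items()
        if zlib.crc32(container["payload"]) != container["crc"]
    ]
```

A periodic job could call verify_containers and report the returned IDs, so that damaged containers are isolated before new backups reference them.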

MSDP new features in NetBackup 7.6.1

  • Support for a 96TB Media Server Deduplication Pool (MSDP) on SLES 11.
  • Catalog self-healing. Two copies of the catalog with identical contents are maintained in two different formats in real time, so corruption or loss of the primary copy can be detected and fixed automatically.
  • Catalog disaster recovery. MSDP metadata is protected offline by a NetBackup policy. If catalog self-healing fails because both the MSDP catalog and its shadow copy are lost, the catalog can be restored from an MSDP catalog backup to its state at the time of that backup.
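
The self-healing idea, keeping the same contents in two formats and rebuilding the primary copy from the shadow copy, can be sketched roughly as below. Everything here is an assumption made for illustration: the SelfHealingCatalog class, the choice of JSON for the primary copy and key=value lines for the shadow copy, and the rule that the shadow copy is trusted whenever the two disagree. Keys and values are assumed to be strings without newlines, and keys must not contain "=".

```python
import json


class SelfHealingCatalog:
    """Hypothetical catalog with a JSON primary and a line-based shadow copy."""

    def __init__(self) -> None:
        self.primary = "{}"  # JSON text
        self.shadow = ""     # one "key=value" line per entry

    def _write(self, entries: dict[str, str]) -> None:
        # Both copies are rewritten together so their contents stay identical.
        self.primary = json.dumps(entries, sort_keys=True)
        self.shadow = "\n".join(f"{k}={v}" for k, v in sorted(entries.items()))

    def _read_shadow(self) -> dict[str, str]:
        return dict(line.split("=", 1) for line in self.shadow.splitlines() if line)

    def load(self) -> dict[str, str]:
        """Return the entries, repairing the primary copy if it is damaged."""
        shadow_entries = self._read_shadow()
        try:
            entries = json.loads(self.primary)
            healthy = entries == shadow_entries
        except (TypeError, ValueError):  # missing or unparsable primary
            healthy = False
        if not healthy:
            self._write(shadow_entries)  # rebuild the primary from the shadow
            entries = shadow_entries
        return entries

    def put(self, key: str, value: str) -> None:
        entries = self.load()
        entries[key] = value
        self._write(entries)
```

This sketch only covers loss or corruption of the primary copy. If both copies are lost, nothing in this class can recover the entries, which is the case catalog disaster recovery addresses.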

Author: Mariusz

Architect (~ 15 years experience based on passion...) with strong background as a System Administrator and Engineer. Focused on Data Center Solutions: Virtualization/Cloud Computing and Storage/Backup Systems. Currently living in Poland.