Data Deduplication in Windows Server

With the release of Windows Server Technical Preview 4, I’d like to send one primary message to all of our customers using or evaluating Windows Server Data Deduplication (which I assume applies to you since you are reading this posting!): Test at full scale!

I’ve seen the telemetry from hundreds of dedup installations using Windows Server TP3, and it’s great to see that a full range of usage types is being evaluated and that the functionality is performing just as we expected.

[Note: For details about all the new dedup features in Windows Server 2016, see the Data Deduplication in Windows Server Technical Preview 3 article, which summarizes all the changes.]

What I’d like to see now are more configurations testing dedup at full scale.

What do I mean by test at full scale?  Basically the following:

  • Large volume sizes: Up to 64TB if you can! Volumes at 32TB or larger are of interest as well.
  • Large file sizes: Files up to 1TB used for any supported scenario!
  • Large data churn: Workloads that modify or create new data across the volume. The ability of a system to process the full data churn depends heavily on the nature of the workload and data as well as on the system’s hardware capabilities.  Evaluating your heaviest loads on the systems you expect to use is the best way to ensure your scenarios are optimally supported by the final release.
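
If you don't have a production-like workload handy for the churn part, one rough way to approximate it on a test volume is a small script that rewrites a block in a random sample of files between dedup job runs. The sketch below is only an illustration under my own assumptions: `churn_files` and its parameters (`fraction`, `block_size`, `seed`) are hypothetical names, not part of any Windows Server tooling, and overwriting one random block per file is a simplification of what real workloads do. Only point it at data you can afford to lose.

```python
import os
import random


def churn_files(root, fraction=0.1, block_size=64 * 1024, seed=None):
    """Overwrite one random block in a random sample of files under root.

    Hypothetical helper for test volumes only. File sizes are left
    unchanged; returns the list of paths that were actually modified.
    """
    rng = random.Random(seed)
    paths = sorted(
        os.path.join(dirpath, name)
        for dirpath, _dirs, names in os.walk(root)
        for name in names
    )
    if not paths:
        return []

    # Pick at least one file, and never more than exist.
    count = min(len(paths), max(1, round(len(paths) * fraction)))
    modified = []
    for path in rng.sample(paths, count):
        size = os.path.getsize(path)
        if size == 0:
            continue  # nothing to overwrite in an empty file
        length = min(block_size, size)
        offset = rng.randrange(0, size - length + 1)
        with open(path, "r+b") as f:
            f.seek(offset)
            f.write(os.urandom(length))  # random bytes, so the block is new data
        modified.append(path)
    return modified
```

Calling something like `churn_files(r"E:\testdata", fraction=0.2)` between optimization runs (the path here is just a placeholder) gives the dedup jobs fresh data to process. Raising `fraction` or `block_size` pushes the churn rate toward your heaviest expected load.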

We did make some final scale improvements to the dedup job algorithms based on direct feedback from TP3, so we strongly believe that TP4 has everything you need for your large-scale deduplication scenarios. (That statement should be enough to get even the most skeptical of you to give it a try 🙂)

And as always, we would love to hear your feedback and results. Please email us and let us know how your evaluation goes and, of course, send along any questions you may have.

The official blog of the Microsoft Windows Server storage engineering teams