BigTrees
Performant hash trees for deduplicating large collections of files.
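To make the idea concrete, here is a minimal Python sketch of deduplication with a hash tree. It is an illustration only and not BigTrees' actual implementation or API: the function names (`hash_file`, `hash_tree`, `find_duplicates`), the choice of SHA-256, and the way directory digests are derived are all assumptions made for this example. Each file is hashed by content, each directory is hashed from the names and digests of its children, and any digest seen at more than one path marks a duplicate file or a duplicate subtree.

```python
import hashlib
import os
from collections import defaultdict


def hash_file(path, chunk_size=1 << 20):
    """Return a SHA-256 hex digest of a file's contents, read in chunks."""
    h = hashlib.sha256(b"file\0")  # prefix keeps file and directory digests apart
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def hash_tree(path, index):
    """Hash `path` recursively and record digest -> [paths] in `index`.

    A directory's digest is built from the sorted (name, digest) pairs of its
    children, so two directories with identical contents get the same digest.
    """
    if os.path.islink(path):
        # Assumption for this sketch: a symlink is identified by its target
        # string and is never followed (this avoids cycles).
        target = os.fsencode(os.readlink(path))
        digest = hashlib.sha256(b"link\0" + target).hexdigest()
    elif os.path.isdir(path):
        h = hashlib.sha256(b"dir\0")
        for name in sorted(os.listdir(path)):
            child_digest = hash_tree(os.path.join(path, name), index)
            h.update(os.fsencode(name))
            h.update(b"\0")
            h.update(child_digest.encode("ascii"))
        digest = h.hexdigest()
    else:
        digest = hash_file(path)
    index[digest].append(path)
    return digest


def find_duplicates(root):
    """Return {digest: [paths]} for every digest found at two or more paths."""
    index = defaultdict(list)
    hash_tree(root, index)
    return {d: sorted(paths) for d, paths in index.items() if len(paths) > 1}
```

Calling `find_duplicates("some/directory")` returns groups of paths that share a digest. Because directory digests summarize everything beneath them, two identical subtrees show up as a single duplicate pair at the directory level, in addition to the matches among their files.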
Take a look at some examples, try it, and/or contact me or open a GitHub issue if anything is unclear.
The code is currently hosted on GitHub. It will always be open source and free—at least for non-commercial use and for early-stage startups not doing anything overtly evil.