Snappy compression for Windows

Quoting from the upstream Snappy homepage:

Snappy is a compression/decompression library. It does not aim for maximum compression, or compatibility with any other compression library; instead, it aims for very high speeds and reasonable compression. For instance, compared to the fastest mode of zlib, Snappy is an order of magnitude faster for most inputs, but the resulting compressed files are anywhere from 20% to 100% bigger.

This is a Windows port of Snappy for C++, .NET, and the command line. There are some other ports of Snappy to Windows listed at the end of this page, but this one aims to be the most complete, the most up to date, and the most stable. Snappy for Windows is provided free of charge under the permissive BSD license.

Tutorial for C++

The C++ NuGet package contains source code that is compiled together with your project. Your project will therefore have no DLL dependencies and there will be no C++ runtime issues. It does mean, however, that the Debug build of your project will contain a slower Debug build of Snappy.

<PackageReference Include="Snappy" Version="1.1.1.7" />

Alternatively, you can download a plain 7-Zip archive of the DLLs with the associated LIBs and headers. Sources are available from GitHub and Bitbucket. The C++ code and binaries are distributed under the BSD license.

After downloading the library, your first step is to include the header:

#include "snappy-c.h"

You can then compress a buffer of data like this:

char compressed[1000];
size_t length = 1000;
snappy_status status = snappy_compress(
    "Hello World!", 12, compressed, &length);

After this call, the buffer compressed will contain the string "Hello World!" compressed by Snappy, and the variable length will contain the length of the compressed data. The call returns SNAPPY_OK on success. Note that in extreme cases the compressed version can be slightly larger than the input. Snappy lets you calculate the required size of the output buffer:

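char uncompressed[] = "Hello World!";
// sizeof counts the terminating NUL, so the input is 13 bytes here.
size_t length = snappy_max_compressed_length(sizeof(uncompressed));
char *compressed = new char[length];
// ... compress like above, then delete[] compressed when done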
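
To give a feel for how much headroom the sizing call above reserves, here is a minimal sketch that does not link against Snappy. It assumes the worst-case bound given in the upstream sources, 32 + n + n / 6. Treat that constant as an assumption and call snappy_max_compressed_length in real code. The names max_compressed_length_sketch and make_output_buffer are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for snappy_max_compressed_length(), assuming the
// worst-case bound documented in upstream sources: 32 + n + n / 6.
std::size_t max_compressed_length_sketch(std::size_t input_length) {
    return 32 + input_length + input_length / 6;
}

// Allocates an output buffer large enough for any input of the given size.
// std::vector releases the memory automatically, unlike a raw new[].
std::vector<char> make_output_buffer(std::size_t input_length) {
    return std::vector<char>(max_compressed_length_sketch(input_length));
}
```

Under that assumption, the 12-byte "Hello World!" input needs at most 46 bytes, so the 1000-byte buffer used earlier is more than enough.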

You can decompress a compressed buffer like this:

char uncompressed[1000];
size_t uncompressedLength = 1000;
snappy_status status = snappy_uncompress(
    compressed, compressedLength, uncompressed, &uncompressedLength);

This works the same way as compression above, except that the process is reversed. The buffer uncompressed will contain our "Hello World!" string and uncompressedLength will be 12. Again, we can determine the size of the output buffer in advance, but this time it is not a simple function of the input size. Snappy stores the size of the uncompressed data in the header of the compressed block, and it has a function that retrieves the exact uncompressed length in O(1) time:

size_t uncompressedLength;
snappy_status status = snappy_uncompressed_length(
    compressed, compressedLength, &uncompressedLength);
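
The length lookup above is cheap because, according to the upstream format description, a compressed block starts with the uncompressed length stored as a little-endian varint (7 bits per byte, with the high bit set on every byte except the last). The following sketch decodes that header by hand. The function read_length_header is hypothetical and not part of the Snappy API, and real code should keep calling snappy_uncompressed_length.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not part of the Snappy API: decodes the little-endian
// base-128 varint that starts a compressed block. Returns false when the
// header is truncated or does not fit in 32 bits.
bool read_length_header(const unsigned char* data, std::size_t size,
                        std::uint32_t* length) {
    std::uint32_t result = 0;
    for (std::size_t i = 0; i < size && i < 5; ++i) {
        std::uint32_t low7 = data[i] & 0x7Fu;
        // The fifth byte may only carry the top 4 bits of a 32-bit value.
        if (i == 4 && low7 > 0x0Fu) return false;
        result |= low7 << (7 * i);
        // A clear high bit marks the last byte of the varint.
        if ((data[i] & 0x80u) == 0) {
            *length = result;
            return true;
        }
    }
    return false;
}
```

At most five bytes are read regardless of how large the block is, which is consistent with the O(1) claim.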

Snappy also has a function to validate a compressed buffer (snappy_validate_compressed_buffer). There is also a whole alternate C++ API that additionally supports pluggable source/sink interfaces, which allow you to compress a number of buffers into a single solid block. See the Snappy header files, which contain documentation for all the public APIs.

Tutorial for .NET

The .NET DLL is AnyCPU, but it automatically forwards all calls to one of two native DLLs, depending on whether the current process is 32-bit or 64-bit. The two native DLLs are embedded as resources and unpacked into a temporary location before first use.

<PackageReference Include="Snappy.NET" Version="1.1.1.8" />

Alternatively, you can download a plain 7-Zip archive of the DLLs. Sources are available from GitHub and Bitbucket. The .NET code and the underlying native libraries are distributed under the BSD license.

After downloading the library, you can use the Snappy namespace:

using Snappy;

You will gain access to the low-level SnappyCodec API, which provides block-level compression. Snappy.NET additionally contains SnappyStream, which implements a streaming API for streams of unbounded size.

The SnappyCodec class provides a pair of simple compression/decompression methods:

public static byte[] Compress(byte[] input);
public static byte[] Uncompress(byte[] input);

It cannot be simpler than that. If you would like to squeeze out the last bit of performance, there are two corresponding no-copy methods along with methods for estimating the output buffer size. Explore the documentation via IntelliSense or by peeking into the SnappyCodec class.

The SnappyStream class is very similar to the GZipStream class in .NET Framework. Note that SnappyStream produces output that is incompatible with SnappyCodec: while SnappyCodec is for compressing fixed-size blocks, SnappyStream is intended for unbounded streams. SnappyStream is, however, compatible with other implementations of the Snappy framing specification. You can create a Snappy-compressed file like this:

using (var file = File.OpenWrite("mydata.sz"))
using (var compressor = new SnappyStream(file, CompressionMode.Compress))
using (var writer = new StreamWriter(compressor))
    writer.WriteLine("Hello World!");

Decompression is similarly easy:

using (var file = File.OpenRead("mydata.sz"))
using (var decoder = new SnappyStream(file, CompressionMode.Decompress))
using (var reader = new StreamReader(decoder))
    Console.Write(reader.ReadToEnd());

If you are on .NET 4.5, SnappyStream provides async variants of all I/O methods. For advanced stream manipulation, you can use the SnappyFrame class.

Command-line tools

Snappy for Windows includes the command-line tools snzip and snunzip, which can be used to manipulate Snappy files on the command line. These tools are compatible with other tools implementing the Snappy framing specification.

Download the 7-Zip archive, extract it somewhere, and find the bin folder in the extracted package. You can then compress a file like this:

snzip.exe test.dat

That will produce the file test.dat.sz in the same folder. You can decompress it again like this:

snunzip.exe test.dat.sz
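
Compatibility with other framing-format tools rests on a shared container layout. As a rough sketch based on the upstream framing format description, a framed stream begins with a stream identifier chunk, and every chunk starts with a one-byte type followed by a three-byte little-endian length. Whether the .sz files produced here begin with exactly these bytes is an assumption, and the helper names below are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Stream identifier chunk from the framing format description:
// chunk type 0xff, 3-byte little-endian length 6, then the ASCII text "sNaPpY".
static const unsigned char kStreamIdentifier[10] = {
    0xff, 0x06, 0x00, 0x00, 's', 'N', 'a', 'P', 'p', 'Y'};

// Hypothetical helper: true when the data begins with the stream identifier.
bool starts_with_stream_identifier(const unsigned char* data, std::size_t size) {
    return size >= sizeof(kStreamIdentifier) &&
           std::memcmp(data, kStreamIdentifier, sizeof(kStreamIdentifier)) == 0;
}

// Chunk type plus the length of the chunk body that follows the 4-byte header.
struct ChunkHeader {
    std::uint8_t type;
    std::uint32_t length;
};

// Hypothetical helper: reads the 4-byte header found at the start of a chunk.
bool read_chunk_header(const unsigned char* data, std::size_t size,
                       ChunkHeader* header) {
    if (size < 4) return false;
    header->type = data[0];
    header->length = static_cast<std::uint32_t>(data[1]) |
                     (static_cast<std::uint32_t>(data[2]) << 8) |
                     (static_cast<std::uint32_t>(data[3]) << 16);
    return true;
}
```

Reading the first ten bytes of a file and passing them to starts_with_stream_identifier is a quick sanity check before handing the file to a decompressor.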

Here's the list of options you can use with snzip:

snzip, snunzip - Snappy compression command-line tool
Options:
 -d --decompress --uncompress
   Run in decompression mode. This is default if started as 'snunzip'.
 -c --stdout --to-stdout
   Output to standard output instead of file.
 -t --test
   Only test integrity of the compressed file. Don't actually unpack it.
 -v --verbose
   Verbose output.
 -V --version
   Version. Display the version number and compilation options then quit.
 -h --help
   Display this information and quit.

Sources are available from GitHub and Bitbucket.

Performance

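Tests have been ported to Windows as well, and they show that this Windows port is correct and fast. The benchmark types should be interpreted as follows: BM_UFlat measures decompression, BM_UValidate measures validation of a compressed buffer, and BM_ZFlat measures compression, with the compressed size as a percentage of the original shown in parentheses.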

Speed benchmarks should be taken with a grain of salt. The CPU used for testing was a very fast Core i7 3.4GHz with all data fitting in its L3 cache. Benchmarks have been done on a single core with all the other cores unoccupied.
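
When reading the tables below, it can help to relate the throughput column to the CPU(ns) column. The sketch below assumes that throughput is input bytes divided by CPU time per iteration and that "MB" means 2^20 bytes. Neither is stated on this page, and both function names are hypothetical.

```cpp
// Throughput in MiB/s for input_bytes processed in cpu_ns nanoseconds.
// Assumes "MB" in the tables means 2^20 bytes.
double throughput_mib_per_s(double input_bytes, double cpu_ns) {
    return input_bytes / (cpu_ns * 1e-9) / (1024.0 * 1024.0);
}

// Inverse calculation: the input size implied by a table row.
double implied_input_bytes(double mib_per_s, double cpu_ns) {
    return mib_per_s * 1024.0 * 1024.0 * cpu_ns * 1e-9;
}
```

Under those assumptions, the 32-bit BM_UFlat/1 row (785.8MB/s at 852105 ns of CPU time) implies an input of roughly 702,000 bytes per iteration.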

32-bit test results

C:\...\snappy-visual-cpp>Release\runtests.exe
Running microbenchmarks.
Benchmark            Time(ns)    CPU(ns) Iterations
---------------------------------------------------
BM_UFlat/0              86114      90398       1208 1.1GB/s  html
BM_UFlat/1             841390     852105        238 785.8MB/s  urls
BM_UFlat/2               5516       5662      33057 20.9GB/s  jpg
BM_UFlat/3                131        128    1333333 1.4GB/s  jpg_200
BM_UFlat/4              28449      28278       6620 3.1GB/s  pdf
BM_UFlat/5             345159     347857        583 1.1GB/s  html4
BM_UFlat/6              30515      30464       6657 770.2MB/s  cp
BM_UFlat/7              13173      13010      15588 817.3MB/s  c
BM_UFlat/8               3805       3782      53619 938.2MB/s  lsp
BM_UFlat/9            1249163    1275477        159 769.9MB/s  xls
BM_UFlat/10               303        308     606060 617.5MB/s  xls_200
BM_UFlat/11            276564     262186        714 553.2MB/s  txt1
BM_UFlat/12            243323     241429        840 494.5MB/s  txt2
BM_UFlat/13            740712     690778        271 589.2MB/s  txt3
BM_UFlat/14           1011155    1050782        193 437.3MB/s  txt4
BM_UFlat/15            402018     408050        497 1.2GB/s  bin
BM_UFlat/16               266        273     512820 696.7MB/s  bin_200
BM_UFlat/17             50228      51057       3972 714.3MB/s  sum
BM_UFlat/18              4866       4829      38759 834.6MB/s  man
BM_UFlat/19             81618      81972       2474 1.3GB/s  pb
BM_UFlat/20            304378     305423        664 575.5MB/s  gaviota
BM_UValidate/0          39102      39478       5137 2.4GB/s  html
BM_UValidate/1         435403     406075        461 1.6GB/s  urls
BM_UValidate/2            190        196     952380 601.5GB/s  jpg
BM_UValidate/3             73         70    2222222 2.7GB/s  jpg_200
BM_UValidate/4          13005      12112      15455 7.3GB/s  pdf
BM_ZFlat/0             194230     193143       1050 505.6MB/s  html (22.31 %)
BM_ZFlat/1            2245170    2184010        100 306.6MB/s  urls (47.77 %)
BM_ZFlat/2              36788      29345       5316 4.0GB/s  jpg (99.87 %)
BM_ZFlat/3                554        568     246913 335.4MB/s  jpg_200 (79.00 %)
BM_ZFlat/4              95347      87970       2128 1022.6MB/s  pdf (82.07 %)
BM_ZFlat/5             748713     745591        272 523.9MB/s  html4 (22.51 %)
BM_ZFlat/6              83891      80759       2318 290.5MB/s  cp (48.12 %)
BM_ZFlat/7              31498      32165       4365 330.6MB/s  c (42.40 %)
BM_ZFlat/8               9403       9542      21253 371.9MB/s  lsp (48.37 %)
BM_ZFlat/9            2274370    2340020        100 419.7MB/s  xls (41.23 %)
BM_ZFlat/10               705        694     224719 274.8MB/s  xls_200 (78.00 %)
BM_ZFlat/11            681761     680540        298 213.1MB/s  txt1 (57.87 %)
BM_ZFlat/12            602058     598233        339 199.6MB/s  txt2 (61.93 %)
BM_ZFlat/13           1859000    1877787        108 216.7MB/s  txt3 (54.92 %)
BM_ZFlat/14           2377800    2340020        100 196.4MB/s  txt4 (66.22 %)
BM_ZFlat/15            663420     676003        300 724.0MB/s  bin (18.11 %)
BM_ZFlat/16               203        197     869565 966.5MB/s  bin_200 (7.50 %)
BM_ZFlat/17            139682     132860       1409 274.5MB/s  sum (48.96 %)
BM_ZFlat/18             12794      12318      15197 327.3MB/s  man (59.36 %)
BM_ZFlat/19            159285     160316       1265 705.4MB/s  pb (19.64 %)
BM_ZFlat/20            555553     558680        363 314.6MB/s  gaviota (37.72 %)


Running correctness tests.
All tests passed.

64-bit test results

C:\...\snappy-visual-cpp>x64\Release\runtests.exe
Running microbenchmarks.
Benchmark            Time(ns)    CPU(ns) Iterations
---------------------------------------------------
BM_UFlat/0              59839      59391       1576 1.6GB/s  html
BM_UFlat/1             616407     625929        324 1.0GB/s  urls
BM_UFlat/2               7089       7301      23501 16.2GB/s  jpg
BM_UFlat/3                 83         77    1818181 2.4GB/s  jpg_200
BM_UFlat/4              19813      19610       9546 4.5GB/s  pdf
BM_UFlat/5             249815     253501        800 1.5GB/s  html4
BM_UFlat/6              19922      19143       9779 1.2GB/s  cp
BM_UFlat/7               9434       9266      20202 1.1GB/s  c
BM_UFlat/8               2584       2499      74906 1.4GB/s  lsp
BM_UFlat/9             936381     943260        215 1.0GB/s  xls
BM_UFlat/10               215        206     377358 922.7MB/s  xls_200
BM_UFlat/11            205981     204849        990 708.0MB/s  txt1
BM_UFlat/12            183019     188127       1078 634.6MB/s  txt2
BM_UFlat/13            542320     530314        353 767.4MB/s  txt3
BM_UFlat/14            757496     709094        264 648.1MB/s  txt4
BM_UFlat/15            306537     304506        666 1.6GB/s  bin
BM_UFlat/16               209        215     869565 886.0MB/s  bin_200
BM_UFlat/17             35983      35401       5288 1.0GB/s  sum
BM_UFlat/18              3530       3437      58997 1.1GB/s  man
BM_UFlat/19             57265      59177       3427 1.9GB/s  pb
BM_UFlat/20            216308     204145        917 861.1MB/s  gaviota
BM_UValidate/0          35711      36149       5610 2.6GB/s  html
BM_UValidate/1         415625     387579        483 1.7GB/s  urls
BM_UValidate/2            161        154     606060 765.6GB/s  jpg
BM_UValidate/3             52         51    3333333 3.6GB/s  jpg_200
BM_UValidate/4          12067      12249      16556 7.2GB/s  pdf
BM_ZFlat/0             127627     129502       1566 754.1MB/s  html (22.31 %)
BM_ZFlat/1            1647181    1676041        121 399.5MB/s  urls (47.77 %)
BM_ZFlat/2              33583      23252       6038 5.1GB/s  jpg (99.87 %)
BM_ZFlat/3                464        463     303030 411.7MB/s  jpg_200 (79.00 %)
BM_ZFlat/4              53554      55213       3673 1.6GB/s  pdf (82.07 %)
BM_ZFlat/5             527481     525391        386 743.5MB/s  html4 (22.51 %)
BM_ZFlat/6              53204      52971       3534 442.9MB/s  cp (48.12 %)
BM_ZFlat/7              19206      19123       9789 556.0MB/s  c (42.40 %)
BM_ZFlat/8               6457       5389      28943 658.4MB/s  lsp (48.37 %)
BM_ZFlat/9            1650286    1662303        122 590.8MB/s  xls (41.23 %)
BM_ZFlat/10               589        574     298507 331.8MB/s  xls_200 (78.00 %)
BM_ZFlat/11            494736     499509        406 290.4MB/s  txt1 (57.87 %)
BM_ZFlat/12            450292     452680        448 263.7MB/s  txt2 (61.93 %)
BM_ZFlat/13           1332614    1370277        148 297.0MB/s  txt3 (54.92 %)
BM_ZFlat/14           1838211    1860559        109 247.0MB/s  txt4 (66.22 %)
BM_ZFlat/15            504965     468002        400 1.0GB/s  bin (18.11 %)
BM_ZFlat/16               203        196     952380 970.4MB/s  bin_200 (7.50 %)
BM_ZFlat/17             97675      88636       1936 411.4MB/s  sum (48.96 %)
BM_ZFlat/18              8915       8723      21459 462.1MB/s  man (59.36 %)
BM_ZFlat/19            115207     119224       1701 948.6MB/s  pb (19.64 %)
BM_ZFlat/20            410089     404792        501 434.3MB/s  gaviota (37.72 %)


Running correctness tests.
Crazy decompression lengths not checked on 64-bit build
All tests passed.

.NET performance

.NET performance numbers are about the same, perhaps a tiny bit lower. SnappyStream class uses hardware-accelerated CRC-32C wherever possible.

Alternative ports

There have been many efforts to port Snappy to Windows. This port of Snappy for Windows aims to be the most complete, the most up to date, and the most stable. I will briefly mention the existing ports and their relative strengths and weaknesses here.

Snappy for .NET

Developed mostly to compare performance with the LZ4 compressor, which is a close relative of Snappy. It includes a native DLL build as well as a .NET wrapper. I have copied the bit counting optimization from this port.

It has a couple of usability flaws, though. It exposes only C APIs, not C++ APIs. The .NET wrapper requires developers to copy native DLLs around instead of embedding them, and the native DLLs require installation of the Visual C++ redistributable. There are no NuGet packages. It hasn't been updated for over a year.

Snappy for .NET on CodePlex

Snappy.Sharp

This is a pure .NET reimplementation of Snappy. Its readme plainly states that it is a work in progress, and it has been saying that for over a year. I need something stable in my projects. Any kind of "work in progress" is out of the question for me.

The project is nevertheless maintained. There have been some recent commits. Perhaps someday it will be mature enough. I will then include it in my port as a pure .NET fallback in case the native libraries cannot be loaded.

Snappy.Sharp on GitHub

SnappySharp

This is another pure .NET reimplementation of Snappy. Unfortunately, there has been no commit for over 3 years. It looks abandoned. The readme contains no warnings about unfinished stuff, but there's no performance report either.

SnappySharp on GitHub

Snappy.Net

This seems to have been an attempt to create a .NET wrapper for Snappy. There is a single commit made 2 years ago, which contains an empty .NET project. I assume this project was abandoned before any progress had been made.

Snappy.Net on GitHub

Contribute

This Windows port of Snappy is maintained by Robert Važan. Nearly all of the C++ code implementing the core Snappy algorithm was taken from upstream Snappy project. You can submit issues on GitHub (C++, .NET, tools) or Bitbucket (C++, .NET, tools), including requests for documentation. Pull requests are also welcome via GitHub (C++, .NET, tools) or Bitbucket (C++, .NET, tools).

Known bugs and issues: