Zstd

From Wikipedia, the free encyclopedia

Original author(s): Yann Collet
Developer(s): Yann Collet, Przemysław Skibiński (inikep)
Initial release: 23 January 2015
Stable release: 1.3.5 / 28 June 2018[1]
Repository: github.com/facebook/zstd
Written in: C
Operating system: Cross-platform
Platform: Portable
Type: Data compression
License: Dual: BSD License + GPLv2
Website: facebook.github.io/zstd/

Zstandard (or Zstd) is a lossless data compression algorithm developed by Yann Collet at Facebook. The name also refers to the reference implementation in C. Version 1 of the implementation was released as free software on 31 August 2016.[2][3]

Features

Zstandard is designed to give a compression ratio comparable to that of the DEFLATE algorithm (developed in 1991 and used in the original ZIP format, gzip and others), but with higher compression and especially decompression speeds. It combines a dictionary-type matching stage (LZ77) with a large search window and a fast entropy-coding stage, using either the very fast Finite State Entropy (tANS) coder or Huffman coding.[4] A notable characteristic of the Zstandard implementation is that the entropy-coded data is read in the backward direction during decompression.
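
The reference library, libzstd, exposes this functionality through a simple one-shot buffer API. The following is a minimal sketch, not drawn from the cited sources; the input string and the choice of level 3 are arbitrary illustrations, and it assumes libzstd and its zstd.h header are installed (build with: cc example.c -lzstd).

    /* One-shot compression and decompression round trip with libzstd. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <zstd.h>

    int main(void)
    {
        const char src[] = "Zstandard example: the quick brown fox jumps over the lazy dog. "
                           "Zstandard example: the quick brown fox jumps over the lazy dog.";
        size_t const srcSize = sizeof(src);

        /* ZSTD_compressBound() gives a worst-case compressed size for srcSize bytes. */
        size_t const dstCapacity = ZSTD_compressBound(srcSize);
        void* const dst = malloc(dstCapacity);

        /* Compress at level 3 (an arbitrary, commonly used setting). */
        size_t const cSize = ZSTD_compress(dst, dstCapacity, src, srcSize, 3);
        if (ZSTD_isError(cSize)) {
            fprintf(stderr, "compression failed: %s\n", ZSTD_getErrorName(cSize));
            return 1;
        }

        /* Decompress back into a buffer of the original size. */
        char* const back = malloc(srcSize);
        size_t const dSize = ZSTD_decompress(back, srcSize, dst, cSize);
        if (ZSTD_isError(dSize)) {
            fprintf(stderr, "decompression failed: %s\n", ZSTD_getErrorName(dSize));
            return 1;
        }

        printf("%zu bytes -> %zu compressed -> %zu decompressed (%s)\n",
               srcSize, cSize, dSize,
               memcmp(src, back, srcSize) == 0 ? "round trip OK" : "MISMATCH");

        free(back);
        free(dst);
        return 0;
    }

Because ZSTD_compressBound() returns a worst-case size, the destination buffer is always large enough to hold the compressed frame.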

Zstandard also offers several compression levels for additional flexibility, ranging from 1 (fastest) to 22 (slowest in compression speed, but with the best compression ratio). Parallel (multi-threaded) implementations of both compression and decompression are also available. Starting with version 1.3.2 (October 2017), Zstandard optionally implements very-long-range search and deduplication, similar to rzip or lrzip.
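
In the C library, these options are exposed as parameters on a compression context. The sketch below uses the advanced parameter API (ZSTD_CCtx_setParameter() with ZSTD_c_compressionLevel, ZSTD_c_nbWorkers and ZSTD_c_enableLongDistanceMatching, followed by ZSTD_compress2()); note that this API only became stable in libzstd releases after the 1.3.x series described here, and the chosen values are purely illustrative.

    /* Tuned compression: level 19, 4 worker threads, long-distance matching. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <zstd.h>

    /* Compresses src into a freshly allocated buffer; stores the size in *outSize. */
    static void* compress_tuned(const void* src, size_t srcSize, size_t* outSize)
    {
        ZSTD_CCtx* const cctx = ZSTD_createCCtx();

        /* Slow, high-ratio level (levels range from 1 up to 22). */
        ZSTD_CCtx_setParameter(cctx, ZSTD_c_compressionLevel, 19);
        /* Multi-threaded compression (requires a multi-threaded build of libzstd). */
        ZSTD_CCtx_setParameter(cctx, ZSTD_c_nbWorkers, 4);
        /* Long-distance matching for large, internally redundant inputs. */
        ZSTD_CCtx_setParameter(cctx, ZSTD_c_enableLongDistanceMatching, 1);

        size_t const cap = ZSTD_compressBound(srcSize);
        void* const dst = malloc(cap);
        size_t const cSize = ZSTD_compress2(cctx, dst, cap, src, srcSize);
        ZSTD_freeCCtx(cctx);

        if (ZSTD_isError(cSize)) {
            fprintf(stderr, "compression failed: %s\n", ZSTD_getErrorName(cSize));
            free(dst);
            return NULL;
        }
        *outSize = cSize;
        return dst;
    }

    int main(void)
    {
        /* Synthetic, highly redundant 1 MiB input to exercise the settings above. */
        size_t const n = 1u << 20;
        char* const buf = malloc(n);
        for (size_t i = 0; i < n; i++) buf[i] = (char)('A' + (i % 16));

        size_t cSize = 0;
        void* const out = compress_tuned(buf, n, &cSize);
        if (out) printf("compressed %zu bytes to %zu bytes\n", n, cSize);

        free(out);
        free(buf);
        return 0;
    }

Frames produced this way are decoded with the ordinary decompression API; multi-threading only affects the compression side.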

According to the Large Text Compression Benchmark (LTCB), which compresses a single 1 GB text file, zstd at its maximum levels achieves a compression ratio close to that of the boz, xz and tornado archivers, and performs better than lza, brotli or bzip2. Decompression remains fast at every compression level.[5] Zstandard reaches the Pareto frontier, meaning that it decompresses faster than any other algorithm offering a similar or better compression ratio.[6][7]

Zstandard can use any user-provided, pre-populated compression dictionary. It also offers a training mode that can generate a dictionary from a set of samples. Dictionaries can have a large impact on the compression ratio of small files.[8][9] In particular, a single dictionary can be loaded to process large amounts of data with redundancy between, but not necessarily within, a set of files, e.g. log files.
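
A minimal sketch of this workflow with the reference library: ZDICT_trainFromBuffer() (declared in zdict.h) builds a dictionary from concatenated samples, and ZSTD_compress_usingDict() then applies it. The log-like sample records, the dictionary capacity and the compression level below are arbitrary toy values, and such a small, uniform sample set may well be too limited for training to produce a useful dictionary.

    /* Dictionary training and dictionary-based compression of small records. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <zstd.h>
    #include <zdict.h>

    int main(void)
    {
        /* Build a toy training set: many small, similar records concatenated in
         * one buffer, with their individual sizes listed separately. */
        enum { NB_SAMPLES = 1000, MAX_RECORD = 128 };
        char* const samples = malloc((size_t)NB_SAMPLES * MAX_RECORD);
        size_t sampleSizes[NB_SAMPLES];
        size_t total = 0;
        for (int i = 0; i < NB_SAMPLES; i++) {
            int const len = snprintf(samples + total, MAX_RECORD,
                "ts=2018-10-10T09:%02d:%02d level=INFO msg=\"request %d served\" status=200\n",
                i / 60, i % 60, i);
            sampleSizes[i] = (size_t)len;
            total += (size_t)len;
        }

        /* Train a dictionary; real deployments use larger, more varied sample sets. */
        size_t const dictCapacity = 4 * 1024;
        void* const dict = malloc(dictCapacity);
        size_t const dictSize = ZDICT_trainFromBuffer(dict, dictCapacity,
                                                      samples, sampleSizes, NB_SAMPLES);
        if (ZDICT_isError(dictSize)) {
            fprintf(stderr, "training failed: %s\n", ZDICT_getErrorName(dictSize));
            return 1;
        }

        /* Compress one small record with and without the trained dictionary. */
        const char record[] = "ts=2018-10-10T09:36:00 level=INFO msg=\"request 42 served\" status=200\n";
        size_t const cap = ZSTD_compressBound(sizeof(record));
        void* const dst = malloc(cap);
        ZSTD_CCtx* const cctx = ZSTD_createCCtx();

        size_t const plain    = ZSTD_compress(dst, cap, record, sizeof(record) - 1, 3);
        size_t const withDict = ZSTD_compress_usingDict(cctx, dst, cap,
                                                        record, sizeof(record) - 1,
                                                        dict, dictSize, 3);
        printf("dictionary: %zu bytes; record: %zu bytes alone, %zu bytes with dictionary\n",
               dictSize, plain, withDict);

        ZSTD_freeCCtx(cctx);
        free(dst); free(dict); free(samples);
        return 0;
    }

Decompression must then use the same dictionary, for example via ZSTD_decompress_usingDict().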

Usage

The Zstandard method has been supported in the Linux kernel since version 4.14 (released in November 2017) as a compression method for file systems such as btrfs and squashfs.[10][11][12] It has also been tested for FreeBSD, with integration into the OpenZFS file system.[13]

The algorithm is also deployed in data-center services such as Amazon Redshift and in databases such as RocksDB.

In 2018 the algorithm was published as RFC 8478, which also defines an associated media type "application/zstd", filename extension "zst", and HTTP content encoding "zstd".[14]

Canonical plans to change the default deb package compression to zstd in version 18.10 of the Ubuntu Linux distribution, in order to speed up installation by around 10 percent. Packages compressed with zstd at level 19 are larger than those compressed with the previously used xz, but they decompress faster.[15][16]

License

The reference implementation is licensed under the BSD license and published on GitHub.[17] Since version 1.0, it also carried an additional Grant of Patent Rights.[18]

From version 1.3.1,[19] this patent grant was dropped and the license was changed to a BSD + GPLv2 dual license.[20]

References

  1. ^ "Releases - facebook/zstd". Retrieved 1 July 2018 – via GitHub.
  2. ^ Sergio De Simone, Facebook Open-Sources New Compression Algorithm Outperforming Zlib / InfoQ, 2 September 2016
  3. ^ "Life imitates satire: Facebook touts zlib killer just like Silicon Valley's Pied Piper". The Register. 31 August 2016. Retrieved 6 September 2016.
  4. ^ "facebook/zstd". GitHub.
  5. ^ Matt Mahoney (29 August 2016). "Large Text Compression Benchmark, .2157 zstd". Retrieved 1 September 2016.
  6. ^ TurboBench: Static/Dynamic web content compression benchmark, PowTurbo
  7. ^ Matt Mahoney, Silesia Open Source Compression Benchmark
  8. ^ "Facebook developers report massive speedups and compression ratio improvements when using dictionaries" (PDF). https://indico.fnal.gov/event/15154/contribution/5/material/slides/0.pdf
  9. ^ "Smaller and faster data compression with Zstandard". Facebook. 31 August 2016.
  10. ^ "The rest of the 4.14 merge window [LWN.net]". lwn.net.
  11. ^ "Linux_4.14 - Linux Kernel Newbies". Kernelnewbies.org. Retrieved 16 August 2018.
  12. ^ "Zstd Compression For Btrfs & Squashfs Set For Linux 4.14, Already Used Within Facebook - Phoronix". www.phoronix.com.
  13. ^ "Info" (PDF). open-zfs.org. 2017.
  14. ^ Collet, Yann; Kucherawy, Murray (2018), RFC 8478: Zstandard Compression and the application/zstd Media Type, Internet Engineering Task Force Request for Comments, Menlo Park, CA: IETF Trust.
  15. ^ "New Ubuntu Installs Could Be Speed Up by 10% with the Zstd Compression Algorithm". Softpedia. 12 March 2018. Retrieved 13 August 2018.
  16. ^ "Canonical Working On Zstd-Compressed Debian Packages For Ubuntu". phoronix. 12 March 2018. Retrieved 13 August 2018.
  17. ^ "Facebook open sources Zstandard data compression algorithm, aims to replace technology behind Zip". ZDnet. 31 August 2016. Retrieved 1 September 2016.
  18. ^ zstd/PATENTS "Additional Grant of Patent Rights Version 2", Facebook
  19. ^ "Zstd v1.3.1 release", GitHub "facebook/zstd"
  20. ^ "New license", GitHub "facebook/zstd"
