Lempel-Ziv-Markov chain algorithm
From Wikipedia, the free encyclopedia
The Lempel-Ziv-Markov chain-Algorithm (LZMA) is an algorithm used to perform data compression. It has been under development since 1998[1] and is used in the 7z format of the 7-Zip archiver. This algorithm uses a dictionary compression scheme somewhat similar to LZ77 and features a high compression ratio (generally higher than bzip2[2][3]) and a variable compression-dictionary size (up to 4 GiB).[4]
Contents |
[edit] Overview
LZMA uses an improved and optimized version of the LZ77 compression algorithm, backed by a range encoder. Streams of data, repeated-sequence size and repeated-sequence location seem to be compressed separately.[citation needed]
[edit] 7-Zip reference implementation
The LZMA implementation extracted from 7-zip is available as LZMA SDK and has been put by Igor Pavlov under the public domain.[5] It was originally distributed under the terms of both the LGPL and Common Public License, with a special exception for linked binaries. Version 4.61 beta, was released into the public domain on November 23, 2008.
The reference open source LZMA compression library is written in C++ and has the following properties:
- Compression speed: approximately 1 MiB per second on a 2 GHz CPU
- Decompression speed: 10-20 MiB per second on a 2 GHz CPU
- Support for multi-threading.
In addition to the original C++, the LZMA SDK contains reference implementations of LZMA compression and decompression ported to ANSI C, C#, and Java.[5] There also exists an implementation in Pascal and Python bindings for the C++ library.[6][7]
This (the 7-Zip) implementation uses several variants of hash chains, binary trees and Patricia tries as the basis for its dictionary search algorithm.
Decompression-only code for LZMA generally compiles to around 5 kiB and the amount of RAM required during decompression is principally determined by the size of the sliding window used during compression. Small code size and relatively low memory overhead, particularly with smaller dictionary lengths, and free source code make the LZMA decompression algorithm well-suited to embedded applications.
[edit] Algorithm
Paul Sladen summarizes the algorithm:
LZMA is effectively deflate (zlib, gzip, zip) with a larger dictionary size, 32 MiB instead of 32 kiB. LZMA stands for Lempel-Ziv-Markov chain-Algorithm, after string back-references have been located, values are reduced using a Markov chain range-encoder (a.k.a. arithmetic coding) instead of huffman.[8]
[edit] LZMA2
The second version of LZMA is still being developed. LZMA2 provides the following advantages over LZMA:
- better compression ratio for incompressible data, by storing blocks of such data in uncompressed form
- faster decompression
- better multithreading support, by splitting a big file into smaller chunks and compressing these chunks in multiple threads.
[edit] Uses
Software that uses or supports LZMA:
- SQL Backup from Red-Gate. Simple compression and delivery system for SQL Server backups.
- cramfs and SquashFS, with applied patches
- Inno Setup
- Nullsoft Scriptable Install System
- Peazip, GUI frontend to command-line 7z and POSIX 7z binaries
- UPX, from version 2.92 (beta) and above, features optional LZMA compression
- The RPM Package Manager has experimental LZMA support since version 4.6.0[9]
- PiSi (Packages Installed Successfully, as Intended) Pardus[citation needed]
- dpkg, from version 1.13.35.
- GNU Tar, from version 1.20 (using the --lzma switch or --auto-compress (-a) with the archive's filename ending in .tar.lzma or .tlz).
- It is also a part of the Cygwin distribution for Windows and used there at least by MiKTeX.[10]
- PowerArchiver
- WinZip
- GNU GRUB (version 2), for compressing core.img (only on x86 BIOS platforms).
- Unity game engine uses LZMA for compressing web player data files.
- Linux (as the compressed kernel image format, since version 2.6.30)
- compressing man pages
[edit] Notes
- ^ The SDK history file states that it was in development from 1996, and first used in 7-zip 2001-08-30. Aside from some unreferenced comments about 1998, the algorithm appears to have been unpublished before its use in 7-zip.
- ^ Collin, Lasse (2005-05-31). "A Quick Benchmark: Gzip vs. Bzip2 vs. LZMA". The Tukaani Project. http://tukaani.org/lzma/benchmarks. Retrieved on 2008-09-02.
- ^ Klausmann, Tobias (2008-05-08). "Gzip, Bzip2 and Lzma compared". Blog of an Alpha animal. http://blog.i-no.de//archives/2008/05/08/index.html#e2008-05-08T16_35_13.txt. Retrieved on 2008-09-02.
- ^ Overview of the LZMA format 7z Format
- ^ a b http://www.7-zip.org/sdk.html
- ^ http://www.birtles.org.uk/programming/
- ^ http://www.joachim-bauch.de/projects/python/pylzma/
- ^ Paul Sladen. "Comment on 7-zip, LZMA and blah". http://www.paul.sladen.org/projects/compression/. Retrieved on 2007-10-06.
- ^ "RPM 4.6.0 (4.6.0-rc3)". rpm.org. http://rpm.org/wiki/Releases/4.6.0. Retrieved on 2008-12-31.
- ^ Schenk, Christian. "Creating a custom package repository". miktex.org. http://docs.miktex.org/packaging/. Retrieved on 2008-10-15.
[edit] External links
- Official home page
- LZMA SDK (Software Development Kit)
- LZMA Utils = XZ Utils
- Data compression, Compressors & Archivers
|
|||||||||||||||||
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||

