Talk:DEFLATE

WikiProject Computing
This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has not yet received a rating on the project's quality scale.
This article has not yet received a rating on the project's importance scale.

ZipForge

Find out whether ZipForge is a separate implementation. The ZipForge web page includes the wording:

 "One of the fastest Deflate implementations available today"

but doesn't give further information. Downloading zf_d4_274.zip and then unzipping install.exe gives a file called HISTORY.TXT. The history (changelog) file contains the wording:

 - version 2.70 (01/31/2006)
 New features added: 
 1. Extraction speed is increased. ZLib library is updated.
 ...

Whether this is the zlib is unknown. Sladen 17:03, 10 November 2006 (UTC)

I have not received a reply from the company yet, but having downloaded a copy of ZipForge and attacked the library with 'strings', I'm fairly sure it's plain zlib. Sladen 20:05, 12 November 2006 (UTC)

AHA36x

I ended up writing so much after researching the AHA36x that it may be worth moving this section to a separate page rather than risk losing it. Sladen 17:46, 8 December 2006 (UTC)

Notability? Tomstdenis 17:59, 23 February 2007 (UTC)
Many of the details on the site have nothing to do with DEFLATE itself, such as the benchmarks, the ratios of their product, and the PCI device ID. Pretty much all of the details after the first paragraph belong in a separate article (if it passes the notability test). Wikipedia is not a place to list products or vendors if they're not truly notable. There are many vendors of hardware IP; we can't possibly list them all, and even if we could, it wouldn't help the reader. I suggest we remove everything after the first paragraph of AHA. Maybe google for other vendors to add to the list. Tomstdenis 15:42, 26 February 2007 (UTC)
Yeah, from Googling, I only found two vendors of hardware DEFLATE solutions. The AHA36x was the one I found first, and the information in the main page is the result of my research; my conclusion is that it is lacking to the point of non-usefulness. The second card I managed to turn up later (StorCompress, see below, not yet researched/listed) looks to be a much better solution. I think a hardware DEFLATE implementation is notable, though certainly not to the degree currently presented. The second card found should be researched and its capabilities discerned and listed as well. Sladen 05:58, 1 March 2007 (UTC)
Cool edits. Informational and to the point. Tomstdenis 13:04, 1 March 2007 (UTC)

StorCompress

Indra Networks appear to make the StorCompress device (http://www.indranetworks.com/products-StorCompress300.html), which is also hardware deflate. Sladen 05:02, 9 December 2006 (UTC)

BJWFlate/DeflOpt

Well, I feel honoured to be mentioned here on wiki... I stopped developing BJWFlate a few years ago. I picked up DeflOpt again a few months ago. It does not only do .zip files now, but also gzip and PNG. My page on my site: http://www.walbeehm.com/download

Hello Ben. It's wonderful to hear that you've picked things up again. BTW, the URL you have is 404'ing; do you also have a preferred method of contacting you? I'd like to speak a little more about your experience with some of the other DEFLATE implementations you've come across (e.g. Stuffit), as these haven't been written up in the article yet. Sladen 19:31, 7 February 2007 (UTC)
The 404 was due to a server migration and subsequent IP change. You can find how to contact me in the DeflOpt205.7z file on my site. 67.8.85.101 16:19, 10 March 2007 (UTC)

Capitalization

Why does this article capitalize the word DEFLATE? I'm guessing that this is done because that's the way it's written in the title of RFC 1951. But in fact, except in the title, this RFC writes "deflate" in lowercase (and usually, though not always, in quotes). So unless someone can find a better justification for writing the word in uppercase, I think it should be changed. --Zundark 09:28, 18 March 2007 (UTC)

Patents

I vaguely remember LZW being based on LZ78 being based on LZ77... So what exactly are those patents, and why don't they apply to Deflate? 82.139.85.220 (talk) 12:26, 5 April 2008 (UTC)

Suggestions - English Edits

Please feel free to delete this section without my permission!

I noticed that the second sentence of the article is actually NOT a sentence! ;-) You active editors may want to change it.

The original: Deflate is widely thought to be free of any subsisting patents, and at a time before the patent on LZW (which is used in the GIF file format) expired, this has led to its use in gzip compressed files and PNG image files, in addition to the ZIP file format for which Katz originally designed it.

My suggestion: Deflate is widely thought to be free of any subsisting patents. Prior to the time the patent on LZW (which is used in the GIF file format) expired, this patent-free aspect of deflate led to its use in gzip compressed files and PNG image files, in addition to the ZIP file format for which Katz originally designed it.

I am not sure this will maintain the original intent, so I've put it here without changing the article itself.

Thx for letting me intrude in your otherwise technical discussions!

Gloucks (talk) 22:49, 20 September 2008 (UTC)

DEFLATE hardware cards

Sladen, you removed my comment about the zlib library and command-line utility that come with the Linux driver of the new PCIe aha363/364 boards. Why? The older Comtech PCI-X cards mention the included software. It would seem more consistent for the newer cards to have their included software mentioned as well. Also, the deletion of the limitations of the aha362 card was because that comment seemed more like a guess than a fact (sorry, I probably should have commented on that in the change log so as to look less suspicious). I think that statement about static Huffman limitations was true of the 361 board (which I removed because it is no longer available), but I don't know if it is true of the 362 board. —Preceding unsigned comment added by Pur-ja (talkcontribs) 17:46, 21 November 2008 (UTC)

Hello, greetings Pur-ja! Thank you for your edit and getting involved with Wikipedia.
I had missed that AHA361 had been altered to AHA362, as the PCI ID hadn't been changed. I think it's good to mention the original card, even if a new revision is now available. Most of the documents that would clearly answer the details are marked as being "protected" on the Comtech website, so it is hard to use them as verifiable references. If you are able to shed any further light it would be most useful; I seem to recall the answer I got from Comtech stated that, in decode mode, the AHA361 could only guarantee to parse streams that it had produced, which further led on to the root cause being the lack of dynamic Huffman support (the most common case).
I'm unsure that it's necessary to duplicate the information about the zlib library and command-line utility, as these are mentioned already in the previous paragraph covering the AHA361, but it could perhaps be stated somehow that the modified zlib library/apache module/ahagzip is necessary for all of the cards, in a way that doesn't lead to direct copying and pasting.
Once again, thank you for getting involved! —Sladen (talk) 18:52, 21 November 2008 (UTC)
Sladen,
Thanks! I use wiki all the time, I thought it was about time to open an account and start contributing.
I understand your concern about duplication regarding the ahagzip utility, as those have the same name and probably serve the same purpose. However, the newer cards have a modified zlib library which is much more versatile than the "deflate" apache module that came with the PCI-X cards. Apache typically compresses by using a "deflate" module, which in turn uses the zlib library to compress data (excuse me if you are already well versed in this). The old version of the cards came with the modified "deflate" module, which limited its use to Apache only. By modifying the zlib library, other applications that utilize zlib can compress data without having to alter their existing code (though a recompile will be necessary if compiled against a static zlib library). I had previously mentioned Apache in the same sentence as zlib because that would probably be the most common use of the zlib library in the card's market. Although it is a small detail, I think it is important to note the difference between an included apache module and an included zlib library, which can be used not only with Apache but also in a wider array of applications. I'll probably try and take another stab at this wiki later tonight, unless someone else gets to it first.
Thanks,
pur-ja —Preceding unsigned comment added by Pur-ja (talkcontribs) 19:32, 21 November 2008 (UTC)
Hunting around I found[1] which gives differing compression rates of 2.2:1 and 3.0:1 for two of the cards; I'm wondering if that represents the presence of dynamic Huffman tree support. —Sladen (talk) 19:36, 21 November 2008 (UTC) (mid-air collision)
I know the new cards use dynamic Huffman. You are probably right about the 362 card not using dynamic Huffman. —Preceding unsigned comment added by Pur-ja (talkcontribs) 08:36, 22 November 2008 (UTC)

Summary doesn't make sense

"Deflate is widely thought to be free of any subsisting patents, and at a time before the patent on LZW (which is used in the GIF file format) expired, this has led to its use in gzip compressed files and PNG image files, in addition to the ZIP file format for which Katz originally designed it." [Emp. added]

Wordy and confusing. What exactly happened then? Could the bold phrase be dropped or moved?

--68.3.198.253 (talk) —Preceding undated comment added 05:52, 17 May 2010 (UTC).

Nothing in particular happened afterwards, it's just that there wouldn't have been much motivation for the PNG or gzip formats if the LZW patents hadn't existed (since PNG and gzip were developed as patent-free alternatives to GIF and compress, both of which use LZW). The wording needs to be improved, but I don't think we should drop the mention of LZW entirely, since it's an important factor in the current widespread use of Deflate. --Zundark (talk) 08:36, 17 May 2010 (UTC)

My issue is with "Deflate is widely thought to be implementable in a manner not covered by patents." Is it implementable without patents, or isn't it? --KimikoMuffin (talk) 21:25, 26 December 2013 (UTC)

Sliding window

This wording is, I think, not quite correct: "Relative back-references can be made across any number of blocks, as long as the distance appears within the last 32 kB of uncompressed data decoded (termed the sliding window)." The sliding window is usually 32 kB but can be set to some other value. OptiPNG allows you to use a different sliding window value, and I found that sometimes a different value results in a smaller file. I mentioned this to Ken Silverman, though, and he said his own testing showed no advantage significant enough to implement it in PNGOUT. I'm not sure how this might best be explained in the article though. —pfahlstrom (talk) 17:07, 4 March 2011 (UTC)
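One way to see the effect of the window size is to compress the same data twice with different settings. The following is a minimal sketch using Python's zlib module, assuming its `wbits` parameter is acceptable as a stand-in for the window option mentioned above; the helper name `deflate_with_window` and the sample data are made up for illustration and say nothing about how OptiPNG or PNGOUT behave.

```python
import os
import zlib

def deflate_with_window(data: bytes, wbits: int) -> bytes:
    """Compress data as a raw DEFLATE stream using a 2**wbits byte window."""
    # A negative wbits asks zlib for a raw stream (no zlib header or checksum).
    compressor = zlib.compressobj(9, zlib.DEFLATED, -wbits)
    return compressor.compress(data) + compressor.flush()

# Illustrative input: a random 4 KiB chunk repeated, so each repeat
# lies exactly 4096 bytes behind the current position.
chunk = os.urandom(4096)
data = chunk * 8

small = deflate_with_window(data, 9)    # 512-byte window: the repeat is out of reach
large = deflate_with_window(data, 15)   # 32 KiB window: the repeat is within range

# A decoder with the full 32 KiB window can read both streams.
assert zlib.decompress(small, -15) == data
assert zlib.decompress(large, -15) == data
print(len(small), len(large))
```

On input like this the larger window should win clearly, since back-references can only point as far back as the window allows. Real files are less clear-cut, which may be why a different setting occasionally produces a smaller result.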

Parts compression

The article does not say anywhere whether zlib supports compression in parts, that is, feeding it the contents gradually in chunks and obtaining compressed chunks, instead of giving it the full contents at once. Looking at the zlib.h code, it seems to me that this is not possible. —Preceding unsigned comment added by 85.139.250.34 (talk) 14:43, 11 March 2011 (UTC)
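For what it's worth, zlib.h does appear to cover this case through its streaming interface, where deflate() is called repeatedly on a z_stream as input becomes available. A minimal sketch of the same idea through Python's zlib binding follows; the generator name `compress_chunks` and the sample chunks are hypothetical and only there for illustration.

```python
import zlib

def compress_chunks(chunks):
    """Yield compressed pieces as input chunks arrive, without holding the whole input."""
    compressor = zlib.compressobj()
    for chunk in chunks:
        piece = compressor.compress(chunk)
        if piece:              # zlib may buffer input and return nothing yet
            yield piece
    yield compressor.flush()   # emit whatever is still buffered and end the stream

parts = [b"hello " * 1000, b"deflate " * 1000, b"world " * 1000]
stream = b"".join(compress_chunks(parts))

# The concatenated pieces form one ordinary zlib stream.
assert zlib.decompress(stream) == b"".join(parts)
```

Note that the compressed pieces do not line up one-to-one with the input chunks: the compressor is free to hold data back until it has enough to emit, so only the concatenation of all pieces is guaranteed to decode.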

Compression Ratios

Observation: little is said about the performance of compression algorithms other than speed; this does not help people in deciding which to utilize. While moderate ratios are all that is needed between sites thousands of kilometers apart which are linked by broad pipes, it is the ratio that is critical for those at the end of a narrow pipe. If someone can transmit gigabytes in the same time as (formerly) megabytes, that would be a winner for "developing nations" and New York City; despite being one of the ten most compacted cities on the planet, we have worse telecomm than any other "developed" nation (i.e., anybody in the EEC, even France). Howard nyc (talk) 20:50, 23 January 2012 (UTC)HOWARD NYC

Stream Format

The stream format section is laid out in a very confusing manner. The way it is written:

  • The data is divided into blocks (size of blocks/method of determining block size not specified)
  • The stream is encoded with an LZW-type algorithm, either at the block level or the entire stream (the paragraph starts saying each block, then ends with over many blocks)
  • The stream is then Huffman coded again

The encoder/compressor section implies that string deduplication is the final stage, so the second round of Huffman coding is probably just out of order (should be right after the block type description).

Since the bit-level detail of the block header is given, should the encoding for the dynamic Huffman tree also be included?

Since I came to the page looking for how Deflate works, I don't have the knowledge at this time to correct this section, but hopefully the RFC will shed enough light for a contribution. 76.118.1.171 (talk) 22:10, 1 March 2013 (UTC)

  1. It's up to the encoder to determine the size of each block. Literal uncompressed blocks are limited to 65535 bytes; compressed blocks can be as long as the encoder chooses.
  2. LZ77 is string matching (within the dictionary == last 32kB). This is only done within compressed blocks. An encoder, if it wished, could skip looking for duplicate strings and simply not bother with this stage.
  3. Compressed blocks are then (always) Huffman-coded. Either using the default pre-agreed ASCII text-focused Huffman table; or more frequently using a supplied Huffman table that the encoder thinks will be more optimal for the block.
Could you suggest how this could be phrased better? —Sladen (talk) 23:08, 1 March 2013 (UTC)
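The three points above can be checked against real output by reading the first block header of a stream. The following is a rough sketch using Python's zlib module; the helper names `first_block_header` and `raw_deflate` are made up for illustration, and the bit positions follow the block header layout in RFC 1951 (BFINAL in the first bit, BTYPE in the next two).

```python
import zlib

BLOCK_TYPES = {0: "stored", 1: "fixed Huffman", 2: "dynamic Huffman", 3: "reserved"}

def first_block_header(raw: bytes):
    """Return (BFINAL, BTYPE name) from the first three bits of a raw DEFLATE stream."""
    first = raw[0]
    bfinal = first & 1            # bit 0: set when this is the last block
    btype = (first >> 1) & 0b11   # bits 1-2: how the block is encoded
    return bfinal, BLOCK_TYPES[btype]

def raw_deflate(data: bytes, level: int) -> bytes:
    """Compress to a raw DEFLATE stream (negative wbits means no zlib wrapper)."""
    c = zlib.compressobj(level, zlib.DEFLATED, -15)
    return c.compress(data) + c.flush()

# Level 0 asks zlib not to compress, so it emits literal (stored) blocks.
stored = raw_deflate(b"abc", 0)
print(first_block_header(stored))

# A stored block pads to the next byte boundary, then carries a 16-bit
# length and its one's complement, which is where the 65535-byte limit comes from.
length = int.from_bytes(stored[1:3], "little")
nlength = int.from_bytes(stored[3:5], "little")
print(length, length ^ nlength == 0xFFFF)

# For a tiny input at level 9, zlib picks the pre-agreed (fixed) Huffman table.
short = raw_deflate(b"abc", 9)
print(first_block_header(short))
```

Which block type an encoder chooses for larger inputs is its own decision, so this only demonstrates what zlib happens to do for these two cases.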