Talk:Hash function

This is the talk page for discussing Hash function and anything related to its purposes and tasks.
This is not a forum for general discussion of the article's subject.

Computer science Top‑importance

This article is within the scope of WikiProject Computer science, a collaborative effort to improve the coverage of Computer science related articles on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.

This article has been rated as Top-importance on the project's importance scale.

Things you can help WikiProject Computer science with:

Here are some tasks awaiting attention:

Article requests :
- Requested articles/Applied arts and sciences/Computer science, computing, and Internet
Cleanup :
- Computer science articles needing attention
- Computer science articles needing expert attention
Copyedit :
- Computing
Expand :
- Computer science
Infobox :
- Computer science articles without infoboxes
Maintain :
- Timeline of computing 2020–present
Photo :
- Find pictures for the biographies of computer scientists (see List of computer scientists)
- Computing articles needing images
Stubs :
- Computer science stubs
Unreferenced :
- WikiProject Computer science/Unreferenced BLPs
Project-related :
- Tag all relevant articles in Category:Computer science and sub-categories with {{WikiProject Computer science}}

Computing: Software / CompSci Mid‑importance

This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.

This article has been rated as Mid-importance on the project's importance scale.

This article is supported by WikiProject Software (assessed as High-importance).

This article is supported by WikiProject Computer science (assessed as Top-importance).

MD5 and SHA-1 weaknesses

This article should document the issues, raised last summer and sharpened last month, with MD5 and some other widely used hash functions. I'm putting a note that the algorithm has issues on this page and the MD5 page. If anyone wants to document this properly, look at:

'CRYPTO2004 Rump Session Presentations, was Re: A collision in MD5' by Jim Hughes, who is also cited in Tech Republic calling for MD5 to be phased out;
'SHA-1 Broken' and 'Cryptanalysis of MD5 and SHA: Time for a New Standard' by Bruce Schneier;
SHA hash functions, which already documents the issues with the SHA algorithms.

(I won't write this up properly: not my field and I have very limited interest in the topic) ---- Charles Stewart 15:20, 14 Mar 2005 (UTC)

I confused this page with Cryptographic hash function: sorry, no issue ---- Charles Stewart 15:26, 14 Mar 2005 (UTC)

Missing content

The following have been mentioned as being missing from the article. Moving them to talk instead of keeping in a huge cleanup banner:

performance versus data length & distributions, chip architectures, etc
analysis: worst case, average case, etc
collision resolution, since all practical hash functions yield collisions
Basic info missing: is the output longer or shorter than the input? Is the output length fixed as input varies? Examples please. — Preceding unsigned comment added by Ian R Bryce (talk • contribs) 07:05, 17 June 2021 (UTC)

– Thjarkur (talk) 00:19, 21 February 2020 (UTC)

@Þjarkur and Ian R Bryce: This article is also missing information about locality-sensitive hash functions. Jarble (talk) 22:06, 12 April 2022 (UTC)

Folding hash codes

The paragraph on folding hash codes is a mess.

First:

A folding hash code is produced by dividing the input into n sections of m bits, where 2^m is the table size…

and:

…For example, for a table size of 15 bits…

If 2^m = table size = 15, then m = log2(15) ≈ 3.91 bits per section.

…and key value of 0x0123456789ABCDEF…

The binary representation of 0x0123456789ABCDEF,

100100011010001010110011110001001101010111100110111101111,

has 57 bits. 57 bits ÷ 3.91 bits per section gives n ≈ 14.59 sections. However:

…there are five sections consisting of 0x4DEF, 0x1357, 0x159E, 0x091A and 0x8…

These sections appear to consist of 15 bits, sort of:

Section 1

0x4DEF = 100110111101111 — 15 bits — matches key bits 0 through 14.

100100011010001010110011110001001101010111100110111101111
                                          100110111101111

Section 2

0x1357 = 1001101010111 — 13 bits — matches key bits 15 through 27.

100100011010001010110011110001001101010111100110111101111
                             1001101010111

There are 2 key bits skipped between sections 2 and 3. Prepending those two bits (00) to section 2 would make it 15 bits long.

Section 3

0x159E = 1010110011110 — 13 bits — matches key bits 30 through 42.

100100011010001010110011110001001101010111100110111101111
              1010110011110

There are 2 key bits skipped between sections 3 and 4. Prepending those two bits (00) to section 3 would make it 15 bits long.

Section 4

0x091A = 100100011010 — 12 bits — matches key bits 45 through 56.

100100011010001010110011110001001101010111100110111101111
100100011010

Section 5

0x8 = 1000 — 4 bits — matches 3 separate 4-bit segments of the key:

100100011010001010110011110001001101010111100110111101111
   1000   1000           1000

Problematically, all of these segments overlap other sections, and the first 4 sections account for every bit in the key value.
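
For anyone who wants to double-check the bit counts and section boundaries above, here is a minimal Python sketch. It assumes the article means m = 15 bits per section (that is, a table of 2^15 slots) and that sections are taken from the least significant end of the key. The name `fold_sections` and its `width` parameter are illustrative only and do not come from the article.

```python
import math
from typing import List, Optional

KEY = 0x0123456789ABCDEF  # key value quoted from the article


def fold_sections(key: int, m: int, width: Optional[int] = None) -> List[int]:
    """Split `key` into m-bit sections, least significant section first.

    `width` is the number of key bits to cover; by default only the
    significant bits of `key` (no leading zeros) are considered.
    """
    if width is None:
        width = key.bit_length()
    mask = (1 << m) - 1
    return [(key >> shift) & mask for shift in range(0, width, m)]


print(KEY.bit_length())                                     # 57
print(round(math.log2(15), 2))                              # 3.91
print([hex(s) for s in fold_sections(KEY, 15)])             # ['0x4def', '0x1357', '0x159e', '0x91a']
print([hex(s) for s in fold_sections(KEY, 15, width=64)])   # ['0x4def', '0x1357', '0x159e', '0x91a', '0x0']
```

Under these assumptions the four non-empty sections match the first four values quoted from the article, and treating the key as a full 64-bit word gives a fifth section of 0x0 rather than 0x8.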

Finally:

…Adding, we obtain 0x7AA4, a 15-bit value.

The binary representation of 0x7AA4 is 111101010100100. It is indeed a 15-bit value, but neither adding nor XOR-ing all 5 sections (or just the first 4 sections) together produces this number. 216.30.159.11 (talk) 19:15, 5 January 2024 (UTC)
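
The arithmetic in this last point can be reproduced with a short Python sketch. It takes the five section values exactly as quoted from the article and combines them by addition and by XOR. Truncating the sum to 15 bits is an assumption about what the article might intend, and `add_fold` and `xor_fold` are hypothetical helper names.

```python
from functools import reduce
from operator import xor

# Section values exactly as quoted from the article.
QUOTED = [0x4DEF, 0x1357, 0x159E, 0x091A, 0x8]
MASK_15 = (1 << 15) - 1  # assumed truncation to a 15-bit result
CLAIMED = 0x7AA4         # result stated in the article


def add_fold(sections, mask=None):
    """Add the sections; optionally keep only the bits selected by `mask`."""
    total = sum(sections)
    return total if mask is None else total & mask


def xor_fold(sections):
    """XOR all sections together."""
    return reduce(xor, sections, 0)


print(hex(add_fold(QUOTED)))           # 0x8006
print(hex(add_fold(QUOTED, MASK_15)))  # 0x6
print(hex(add_fold(QUOTED[:4])))       # 0x7ffe
print(hex(xor_fold(QUOTED)))           # 0x4234
print(hex(xor_fold(QUOTED[:4])))       # 0x423c
```

None of these results equals 0x7AA4, which is consistent with the observation above.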