Talk:Checksum

WikiProject Computing (Rated Start-class)
This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as Start-Class on the project's quality scale. It has not yet received a rating on the project's importance scale.

WikiProject Computer science (Rated Start-class, Mid-importance)
This article is within the scope of WikiProject Computer science, a collaborative effort to improve the coverage of Computer science related articles on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as Start-Class on the project's quality scale and as Mid-importance on the project's importance scale.

common knowledge does not need citation

For the section "Some checksum algorithms", I'd suggest that no citation should be needed for the example, since an example that illustrates a concept like that is the equivalent of a description of how to graph x^2 = y: it is common knowledge to those who write checksum algorithms. 74.58.154.55 (talk) 01:55, 18 December 2008 (UTC)

How to detect changes to a file

I just have a question about checksums: when is a checksum useful for detecting changes to a file?

A checksum is only used to check that the data sent is correct; it doesn't change anything!

In order to ensure content accuracy you should consider using something like MD5 or SHA-1, which are cryptographic hashing algorithms and in wide use for such purposes.

Checksums are good at detecting accidental corruption (e.g. that introduced by signal noise), but the algorithm is sufficiently predictable that it forms no defence against malicious attack. For that purpose, a cryptographic hash function should be used instead: a kind of checksumming algorithm with the additional property that it is easy to go from the data to the hash, but very hard to find some data that hashes to a particular value. (Or at least that's the way it is supposed to work; SHA-256 is the current recommendation for crypto-hash functions, as MD5 has been broken and SHA-1 is believed breakable. It should be possible to reference these; they were results from a team in China, and were fairly widely reported even outside the security conference where the papers were presented.) 82.42.250.222 00:26, 11 January 2007 (UTC)
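
To make the point about predictability concrete, here is a minimal sketch. It assumes a toy 8-bit additive checksum (the hypothetical `sum8` below, not any particular standard algorithm) and made-up message contents. It shows that a tampered message can be padded with one compensating byte so the simple checksum still matches, while the SHA-256 digests differ.

```python
import hashlib


def sum8(data: bytes) -> int:
    """Toy checksum (illustrative only): sum of all bytes, modulo 256."""
    return sum(data) % 256


# Made-up messages for illustration.
original = b"pay 100 to bob"
tampered = b"pay 900 to bob"

# Append one compensating byte so the toy checksum matches the original's.
pad = (sum8(original) - sum8(tampered)) % 256
forged = tampered + bytes([pad])

same_checksum = sum8(forged) == sum8(original)
same_sha256 = (
    hashlib.sha256(forged).digest() == hashlib.sha256(original).digest()
)

print(same_checksum, same_sha256)  # prints: True False
```

No comparable one-byte trick is known for SHA-256, which is the property the comment above describes.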

hmm thats good —Preceding unsigned comment added by 63.133.162.226 (talk) 21:23, 7 January 2008 (UTC)

How to get fuzzy match in search inquiry

I accidentally typed checksun into the search field. Is there a way to add a "did you mean checksum?" prompt, like Google does?

Someone could create a redirect from checksun to checksum.-96.237.1.158 (talk) 11:15, 7 April 2012 (UTC)
Done. checksun now redirects to "checksum". --DavidCary (talk) 18:20, 11 November 2013 (UTC)

example has an error

Step 3 of the example has an error: the two's complement of 18h is E7h. This means that when testing the checksum byte, the result will be FFh, not 00h as stated. Should 1 be added to the checksum at some point? Klox 21:42, 25 July 2007 (UTC)

I guess you computed the one's complement, which is not what is requested. (Elmar Sack (talk) 08:03, 8 December 2007 (UTC))

The example does not seem correct: 118h + E8h = 200h, not 100h as stated. Is this a mistake? Brolin Empey 17:42, 10 August 2007 (UTC)

Yes, it is. (Elmar Sack (talk) 08:03, 8 December 2007 (UTC))
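
The arithmetic in this thread can be checked with a short sketch. The four data bytes below are an assumption, chosen only because they sum to 118h; the function name is likewise hypothetical.

```python
def twos_complement_checksum(data: bytes) -> int:
    """Two's complement of the low byte of the byte sum."""
    low_byte = sum(data) & 0xFF
    return (-low_byte) & 0xFF


# Illustrative bytes (an assumption): they add up to 118h.
data = bytes([0x25, 0x62, 0x3F, 0x52])

total = sum(data)                            # 118h
low_byte = total & 0xFF                      # 18h
twos = twos_complement_checksum(data)        # E8h
ones = (~low_byte) & 0xFF                    # E7h (one's complement)

print(f"{total:X}h {low_byte:X}h {twos:X}h {ones:X}h")
print(f"{(low_byte + twos) & 0xFF:02X}h")    # low byte plus E8h, modulo 256
print(f"{(low_byte + ones) & 0xFF:02X}h")    # low byte plus E7h, modulo 256
print(f"{total + twos:X}h")                  # full sum plus E8h, no truncation
```

With these values, E7h is the one's complement of 18h and E8h is the two's complement. Adding E8h to the low byte 18h gives 100h, whose low byte is 00h, whereas adding E8h to the untruncated sum 118h gives 200h.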

hex

For us non-computery regular folks, how about using regular numbers instead of hex for clarity? Also, the last digit of a barcode is based on adding all the other digits together. Is this a checksum? A barcode would probably be a great example of a checksum at work, as regular people could understand it. Preceding unsigned comment added by 216.57.220.248 (talk) 22:25, 21 November 2007 (UTC)

The last digit of a barcode is a "check digit". A "check digit" is a kind of checksum that is only 1 digit long. I agree that better examples would be good. --68.0.124.33 (talk) 07:41, 27 September 2008 (UTC)
In what way are values in bases other than ten, not "regular numbers"? — Preceding unsigned comment added by 67.242.114.255 (talk) 03:00, 12 July 2013 (UTC)
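
As a decimal example of the kind requested above, here is a minimal sketch of the UPC-A check digit calculation. The function name is hypothetical, and the sample number is a commonly cited example rather than anything from this article. Note that the digits are not simply added: digits in odd positions are weighted by 3.

```python
def upc_a_check_digit(first11: str) -> int:
    """Compute the 12th (check) digit of a UPC-A code from its first 11 digits."""
    if len(first11) != 11 or not first11.isdigit():
        raise ValueError("expected exactly 11 decimal digits")
    digits = [int(c) for c in first11]
    odd_sum = sum(digits[0::2])   # positions 1, 3, ..., 11 (weight 3)
    even_sum = sum(digits[1::2])  # positions 2, 4, ..., 10 (weight 1)
    return (10 - (3 * odd_sum + even_sum) % 10) % 10


print(upc_a_check_digit("03600029145"))  # prints: 2
```

Here the odd-position digits sum to 14 and the even-position digits to 16, so the weighted total is 3 × 14 + 16 = 58, and the check digit is the amount needed to reach the next multiple of ten, which is 2.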

Checksums, FEC and hash functions

These, though apparently similar, are distinguished by their different purposes:

  • Checksums are added to data items to provide a simple (incomplete) means of error detection. A checksum is useful to assure the integrity of even a single data instance. This type of checksum is generally much shorter than the data and is only for a cursory check.
  • Longer forward error correction (FEC) codes that are derived from the data provide both error detection and some error correction capabilities. They can be called checksums but in reality are intended for more than just checking the data. They are used widely in digital communication and represent a significant bandwidth overhead.
  • Hash functions are intended to segregate a mass of many data items into a smaller number of groups where, ideally, similar but not identical data items fall into different groups. Hash functions are useful only to detect duplicates or speed searching in a large database, where the data items are assumed to be free of errors. Cuddlyable3 (talk) 14:35, 22 November 2008 (UTC)
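
The difference between detecting and correcting can be sketched with two toy schemes. Both are assumptions chosen for brevity (a single parity bit and a triple-repetition code), not codes used by any particular protocol. The parity bit notices a single flipped bit but cannot say which one; the repetition code triples the data size but can repair it.

```python
def parity_bit(bits: list[int]) -> int:
    """One-bit checksum: 1 if the number of 1 bits is odd, else 0."""
    return sum(bits) % 2


def repeat3_encode(bits: list[int]) -> list[int]:
    """Toy FEC: send every bit three times."""
    return [b for b in bits for _ in range(3)]


def repeat3_decode(coded: list[int]) -> list[int]:
    """Majority vote over each group of three received bits."""
    return [
        1 if sum(coded[i:i + 3]) >= 2 else 0
        for i in range(0, len(coded), 3)
    ]


message = [1, 0, 1, 1]

# Detection only: flipping one bit changes the parity.
corrupted = message.copy()
corrupted[2] ^= 1
detected = parity_bit(corrupted) != parity_bit(message)

# Correction: flip one transmitted bit; majority voting recovers the message.
coded = repeat3_encode(message)
coded[7] ^= 1
recovered = repeat3_decode(coded)

print(detected, recovered == message)  # prints: True True
```

The threefold expansion illustrates the bandwidth overhead mentioned in the FEC bullet; practical codes achieve correction with far less redundancy.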

Tools

The article could do with a reference to tools that test how good a checksum algorithm is. Preceding unsigned comment added by Ralph Corderoy (talkcontribs) 11:54, 17 January 2009 (UTC)

Checksums should not be used where data may change!

To strengthen a previous note of warning: checksums can guarantee to catch 1-or-2-bit corruption, and are likely to catch other changes, but cannot be trusted to guarantee that data has not changed in any general/wholesale way, particularly for large datasets. A checksum is ultimately/only a highly many-to-one mapping. I just ran a test, generating a 64-bit checksum for every file on my computer, and of the 1.5 million-or-so files, I found 60 pairs of completely different files that shared a checksum (no triples).
76.233.235.203 (talk) 20:36, 7 February 2011 (UTC) Joe Weinstein at Oracle Inc.

Either you are very lucky/unlucky or your 64-bit checksum is using a very poor function. A good 64-bit checksum would have a very small chance of any random false duplicates among a million files.-96.237.1.158 (talk) 11:21, 7 April 2012 (UTC)
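
The reply's estimate can be checked with a birthday-bound sketch. It assumes an idealised checksum whose outputs are uniformly distributed, and takes the file count as exactly 1.5 million; the function name is hypothetical.

```python
import math


def expected_colliding_pairs(n_files: int, bits: int) -> float:
    """Expected number of colliding pairs for a uniform `bits`-bit checksum."""
    pairs = n_files * (n_files - 1) / 2
    return pairs / 2**bits


n = 1_500_000  # assumed file count, from the comment above

expected_64 = expected_colliding_pairs(n, 64)

# Output width at which 60 colliding pairs would be the expected count.
effective_bits = math.log2(n * (n - 1) / 2 / 60)

print(f"expected pairs at 64 bits: {expected_64:.2e}")
print(f"effective bits for 60 pairs: {effective_bits:.1f}")
```

Under these assumptions a uniform 64-bit checksum would be expected to produce roughly 6 × 10^-8 colliding pairs, so 60 observed pairs would be consistent with a function that behaves more like a 34-bit one.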

A near namesake

O.K., but is it related to the Wiki's Checkuser functions? 213.81.117.254 (talk) 16:51, 26 June 2011 (UTC)

No, completely unrelated (assuming you're referring to Wikipedia:CheckUser). Jaho (talk) 17:20, 12 September 2011 (UTC)

Order independence

The article should highlight the fact that only the simplest checksum functions are order independent. Generally, order dependence is preferred, for more sensitive detection of differences. But sometimes order independence (and insensitivity to zeroes) is wanted, in which case a simple sum or XOR is appropriate. In the *nix world this is available with the classic sum utility. But in the Windows world these simple hash functions are very hard to find. The best source seems to be sum, ported in UnxUtils and CoreUtils for Windows.

See superuser.com/questions/168202/difference-between-unxutils-and-gnu-coreutils for details: "Unxutils are windows native, and have no dependancies - but haven't been updated in a while, and are a subset of coreutils."

-96.237.1.158 (talk) 21:59, 7 April 2012 (UTC)
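
A minimal sketch of the distinction, under stated assumptions: `sum8` and `xor8` are toy 8-bit functions written for illustration (not the exact algorithm of any `sum` utility), and Fletcher-16 is used as one example of an order-dependent checksum.

```python
from functools import reduce
from operator import xor


def sum8(data: bytes) -> int:
    """Order-independent: byte sum modulo 256."""
    return sum(data) % 256


def xor8(data: bytes) -> int:
    """Order-independent: XOR of all bytes."""
    return reduce(xor, data, 0)


def fletcher16(data: bytes) -> int:
    """Order-dependent: the second running sum weights bytes by position."""
    s1 = s2 = 0
    for byte in data:
        s1 = (s1 + byte) % 255
        s2 = (s2 + s1) % 255
    return (s2 << 8) | s1


forward = b"abcde"
backward = b"edcba"
padded = forward + b"\x00"

print(sum8(forward) == sum8(backward))              # prints: True
print(xor8(forward) == xor8(backward))              # prints: True
print(fletcher16(forward) == fletcher16(backward))  # prints: False
print(sum8(forward) == sum8(padded))                # prints: True
print(fletcher16(forward) == fletcher16(padded))    # prints: False
```

Reordering the bytes or appending a zero byte leaves the simple sum and XOR unchanged, while Fletcher-16 changes in both cases.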

Lead sentence is incomprehensible to this layman

"A checksum or hash sum is a sedfrg-size datum computed from an arbitrary block of digital data for the purpose of detecting accidental errors that may have been introduced during its transmission or storage. "

What the heck is sedfrg-size? This needs to be redone! But I don't know what it means so someone who does should. Vranak (talk)

That's because it was vandalism by User:59.160.198.162 [1] which has since been reverted [2]. —Lowellian (reply) 00:32, 2 April 2013 (UTC)

mentally computed checksums

The "disposable email address" article briefly mentions "mentally computed checksums". Which Wikipedia article is the best place for more details about such checksums? --DavidCary (talk) 18:33, 11 November 2013 (UTC)