
Talk:Checksum

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 76.233.235.203 (talk) at 20:28, 7 February 2011 (→‎Checksums should not be used where data may change!). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

WikiProject Computing (Start-class)
This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as Start-class on Wikipedia's content assessment scale.
This article has not yet received a rating on the project's importance scale.
WikiProject Computer science (Start-class, Mid-importance)
This article is within the scope of WikiProject Computer science, a collaborative effort to improve the coverage of Computer science related articles on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as Start-class on Wikipedia's content assessment scale.
This article has been rated as Mid-importance on the project's importance scale.


For the section "Some checksum algorithms", I'd suggest that no citation should be needed for the example, since an example that illustrates a concept like that is the equivalent of a description of how to graph x^2 = y: it is common knowledge to those who write checksum algorithms. 74.58.154.55 (talk) 01:55, 18 December 2008 (UTC)[reply]


I just have a question about checksums: when is a checksum useful for detecting changes to a file?

A checksum is only used to check that the data sent is correct; it doesn't change anything!

In order to ensure content accuracy you should consider using something like MD5 or SHA-1, which are cryptographic hashing algorithms and in wide use for such purposes.

Checksums are good at detecting accidental corruption (e.g. that introduced by signal noise), but the algorithm is sufficiently predictable that it forms no defence against malicious attack. For that sort of purpose, a cryptographic hash function should be used instead. This is a kind of checksumming algorithm with the additional property that it is easy to go from the data to the hash, but very hard to find some data that hashes to a particular value. (Or at least that's the way it is supposed to work; SHA-256 is the current recommendation for crypto-hash functions, as MD5 has been broken and SHA-1 is believed breakable. It should be possible to reference these; they were results from a team in China, and were fairly widely reported even outside the security conference where their papers were presented.) 82.42.250.222 00:26, 11 January 2007 (UTC)
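To illustrate the point above, here is a minimal sketch in Python. The `sum8` function and the two sample messages are made up for this illustration and are not taken from the article. It shows that a plain additive checksum cannot tell two messages apart when one is merely a reordering of the other, whereas a SHA-256 digest can.

```python
import hashlib


def sum8(data: bytes) -> int:
    """Hypothetical 8-bit additive checksum: sum of all bytes modulo 256."""
    return sum(data) % 256


# Two different messages built from exactly the same bytes in a different order.
original = b"PAY 100 TO BOB"
tampered = b"PAY 001 TO BOB"

# Addition is commutative, so reordering bytes never changes the sum.
same_checksum = sum8(original) == sum8(tampered)

# A cryptographic hash depends on byte order as well as byte values.
same_digest = hashlib.sha256(original).digest() == hashlib.sha256(tampered).digest()

print("additive checksums equal:", same_checksum)
print("SHA-256 digests equal:  ", same_digest)
```

Someone who wants to alter the message only has to keep the byte sum unchanged, which is trivial for an additive checksum and is believed to be infeasible for SHA-256.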


Hmm, that's good. —Preceding unsigned comment added by 63.133.162.226 (talk) 21:23, 7 January 2008 (UTC)[reply]


I accidentally typed "checksun" into the search field. Is there a way I could put in a "Did you mean checksum?" prompt, like Google does?


Step 3 of the example has an error: the two's complement of 18h is E7h. This means that when testing the checksum byte, the result will be FFh, not 00h as stated. Should 1 be added to the checksum at some point? Klox 21:42, 25 July 2007 (UTC)[reply]

I guess you took the one's complement, which is not what is requested. (Elmar Sack (talk) 08:03, 8 December 2007 (UTC))[reply]


The example does not seem correct: 118h + E8h = 200h, not 100h as stated. Is this a mistake? Brolin Empey 17:42, 10 August 2007 (UTC)[reply]

Yes, it is. (Elmar Sack (talk) 08:03, 8 December 2007 (UTC))[reply]
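The arithmetic in the two threads above can be checked with a short Python sketch. The four message bytes are an assumption, chosen only because they sum to the 118h mentioned in the discussion; the function name is likewise made up for this illustration.

```python
def twos_complement_checksum(data: bytes) -> int:
    """Sum the bytes, keep the low 8 bits, and return the two's complement."""
    low_byte = sum(data) & 0xFF
    return (-low_byte) & 0xFF


# Assumed example message; its bytes add up to 0x118.
message = bytes([0x25, 0x62, 0x3F, 0x52])
total = sum(message)                        # 0x118
check = twos_complement_checksum(message)   # two's complement of 0x18
ones = (~total) & 0xFF                      # one's complement of 0x18, for comparison

# Verification: add the checksum byte to the sum and keep only the low byte.
verify_twos = (total + check) & 0xFF
verify_ones = (total + ones) & 0xFF

print(f"sum = {total:X}h, two's complement = {check:02X}h, one's complement = {ones:02X}h")
print(f"sum + two's complement = {total + check:X}h, low byte = {verify_twos:02X}h")
print(f"sum + one's complement = {total + ones:X}h, low byte = {verify_ones:02X}h")
```

With these bytes, the two's complement is E8h and the full sum 118h + E8h is 200h, whose low byte is 00h. Using the one's complement E7h instead gives 1FFh, whose low byte is FFh.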

hex

For us non-computery regular folks, how about using regular numbers instead of hex for clarity? Also, the last digit of a barcode is based on adding all the other digits together. Is this a checksum? A barcode would probably be a great example of a checksum at work, as regular people could understand it. —Preceding unsigned comment added by 216.57.220.248 (talk) 22:25, 21 November 2007 (UTC)[reply]

The last digit of a barcode is a "check digit". A "check digit" is a kind of checksum that is only 1 digit long. I agree that better examples would be good. --68.0.124.33 (talk) 07:41, 27 September 2008 (UTC)[reply]
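As a decimal example of the kind requested above, here is a minimal sketch of the UPC-A check digit calculation in Python. The function name is made up for this illustration. Note that UPC-A uses a weighted sum rather than a plain sum: digits in odd positions (counting from 1) are multiplied by 3 before being added.

```python
def upc_a_check_digit(first11: str) -> int:
    """Return the check digit for the first 11 digits of a UPC-A barcode."""
    if len(first11) != 11 or not (first11.isascii() and first11.isdigit()):
        raise ValueError("expected exactly 11 decimal digits")
    digits = [int(c) for c in first11]
    # Positions 1, 3, 5, ... (indices 0, 2, 4, ...) carry weight 3; the rest carry weight 1.
    weighted = 3 * sum(digits[0::2]) + sum(digits[1::2])
    # The check digit brings the weighted total up to a multiple of 10.
    return (10 - weighted % 10) % 10


print(upc_a_check_digit("03600029145"))  # prints 2
```

For 03600029145 the odd-position digits sum to 14 and the even-position digits to 16, so the weighted total is 3 × 14 + 16 = 58, and the check digit is 2, giving the full code 036000291452.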

Checksums, FEC and hash functions

These, though apparently similar, are distinguished by their different purposes:

  • Checksums are added to data items to provide a simple (incomplete) means of error detection. A checksum is useful to assure the integrity of even a single data instance. This type of checksum is generally much shorter than the data and is only for a cursory check.
  • Longer FEC codes that are derived from the data provide both error detection and some error correction capabilities. They can be called checksums but in reality are intended for more than just checking the data. They are used widely in digital communication and represent a significant bandwidth overhead.
  • Hash functions are intended to segregate a mass of many data items into a smaller number of groups where, ideally, similar but not identical data items fall into different groups. Hash functions are useful only to detect duplicates or speed searching in a large database, where the data items are assumed to be free of errors. Cuddlyable3 (talk) 14:35, 22 November 2008 (UTC)[reply]

Tools

The article could do with a reference to tools that test how good a checksum algorithm is. —Preceding unsigned comment added by Ralph Corderoy (talk • contribs) 11:54, 17 January 2009 (UTC)[reply]

Checksums should not be used where data may change!

To strengthen a previous note, checksums can guarantee to catch 1-or-2-bit corruption, and are likely to catch other changes, but cannot be trusted to guarantee that data has not changed in any general/wholesale way, particularly for large datasets. At its heart, a checksum is a highly many-to-one mapping. I just ran a test, generating a 64-bit checksum for every file on my computer, and of the 1.5 million-or-so files, I found 60 pairs of completely different files that shared a checksum (no triples).

Joe Weinstein at Oracle Inc.
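For scale, the number of colliding pairs one would expect from an idealized checksum can be estimated with the usual birthday approximation. The sketch below assumes that checksum values are spread uniformly and independently over all 2^bits possibilities, which real checksum algorithms may not achieve; the function name is made up for this illustration.

```python
def expected_colliding_pairs(n_items: int, bits: int) -> float:
    """Expected number of item pairs sharing a value, assuming uniform checksums."""
    pairs = n_items * (n_items - 1) / 2   # number of distinct pairs of items
    return pairs / 2**bits                # each pair collides with probability 2**-bits


# Figures from the comment above: about 1.5 million files, 64-bit checksum.
print(expected_colliding_pairs(1_500_000, 64))
```

Under that uniformity assumption the estimate is roughly 6e-8 pairs, far below one, so how many collisions show up in practice depends heavily on how evenly the particular algorithm distributes its values.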