|WikiProject Computing||(Rated Start-class)|
|WikiProject Computer science||(Rated Start-class, Mid-importance)|
- 1 common knowledge does not need citation
- 2 How to detect changes to a file
- 3 How to get fuzzy match in search inquiry
- 4 example has an error
- 5 hex
- 6 Checksums, FEC and hash functions
- 7 Tools
- 8 Checksums should not be used where data may change!
- 9 A near namesake
- 10 Order independence
- 11 Lead sentence is incomprehensible to this layman
- 12 mentally computed checksums
- 13 Promodular sum?
- 14 Error in modular sum description? (resolved, reader error)
common knowledge does not need citation
For the section "Some checksum algorithms", I'd suggest that no citation should be needed for the example, since an example that illustrates a concept like that is the equivalent of a description of how to graph x^2 = y: it is common knowledge to those who write checksum algorithms. 22.214.171.124 (talk) 01:55, 18 December 2008 (UTC)
How to detect changes to a file
I just have a question about checksums: when is a checksum useful for detecting changes to a file?
A checksum is only used to check that the data sent is correct; it doesn't change anything!
In order to ensure content accuracy you should consider using something like MD5 or SHA-1, which are cryptographic hashing algorithms and in wide use for such purposes.
Checksums are good at detecting accidental corruption (e.g. that introduced by signal noise) but the algorithm is sufficiently predictable that it forms no defence against malicious attack. For that sort of purpose, a cryptographic hash function should instead be used, which is a kind of checksumming algorithm that has the additional property that it is pretty easy to go from the data to the hash, but very hard to find some data that hashes to a particular value. (Or at least that's the way it is supposed to work; SHA-256 is the current recommendation for crypto-hash functions as MD5 has been broken and SHA-1 is believed breakable. It should be possible to reference these; they were results from a team in China, and were fairly widely reported even outside the security conference where their papers were.) 126.96.36.199 00:26, 11 January 2007 (UTC)
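To illustrate the point about predictability, here is a minimal sketch. The byte-sum checksum and the two messages are made up for illustration and are not taken from the article. A deliberate edit that raises one byte and lowers another leaves the simple checksum unchanged, while the SHA-256 digests differ.

```python
import hashlib

def additive_checksum(data: bytes) -> int:
    """Hypothetical toy checksum: sum of all bytes modulo 256."""
    return sum(data) % 256

original = b"send 15"
tampered = b"send 24"  # '1' -> '2' (+1) and '5' -> '4' (-1) cancel out in the sum

# The simple checksum cannot tell the two messages apart.
same_checksum = additive_checksum(original) == additive_checksum(tampered)

# A cryptographic hash gives different digests for the two messages.
same_digest = hashlib.sha256(original).digest() == hashlib.sha256(tampered).digest()

print(same_checksum, same_digest)
```

An attacker who knows the checksum algorithm can construct such compensating changes at will, which is why a cryptographic hash is suggested above for guarding against deliberate modification.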
How to get fuzzy match in search inquiry
I accidentally typed checksun into the search field. Is there a way to make the search ask "did you mean checksum?", as Google does?
- Someone could create a redirect from checksun to checksum.-188.8.131.52 (talk) 11:15, 7 April 2012 (UTC)
example has an error
Step 3 of the example has an error: the two's complement of 18h is E7h. This means that when testing the checksum byte, the result will be FFh, not 00h as stated. Should 1 be added to the checksum at some point? Klox 21:42, 25 July 2007 (UTC)
- The example does not seem correct: 118h + E8h = 200h, not 100h as stated. Is this a mistake? Brolin Empey 17:42, 10 August 2007 (UTC)
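The arithmetic in question can be checked with a short sketch. It assumes byte-wide (8-bit) words and uses only the values quoted in the two comments above (a running sum of 118h); the variable names are illustrative.

```python
total = 0x118                      # running sum quoted in the comments above
low_byte = total & 0xFF            # keep only the low 8 bits: 0x18

ones_complement = (~low_byte) & 0xFF   # bitwise inversion: 0xE7
twos_complement = (-low_byte) & 0xFF   # inversion plus one: 0xE8

# Verification: add the checksum byte to the sum and keep the low byte.
check_with_twos = (total + twos_complement) & 0xFF   # 0x200 -> 0x00
check_with_ones = (total + ones_complement) & 0xFF   # 0x1FF -> 0xFF

print(hex(ones_complement), hex(twos_complement))
print(hex(check_with_twos), hex(check_with_ones))
```

Under these assumptions E7h is the one's complement of 18h and E8h is the two's complement. The full sum 118h + E8h is indeed 200h, but once the overflow bits are discarded the low byte is 00h.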
hex
For us non-computery regular folks, how about using regular numbers instead of hex for clarity? Also, the last digit of a barcode is based on adding all the other digits together. Is this a checksum? A bar code would probably be a great example of a checksum at work, as regular people could understand it. —Preceding unsigned comment added by 184.108.40.206 (talk) 22:25, 21 November 2007 (UTC)
- The last digit of a barcode is a "check digit". A "check digit" is a kind of checksum that is only 1 digit long. I agree that better examples would be good. --220.127.116.11 (talk) 07:41, 27 September 2008 (UTC)
- In what way are values in bases other than ten, not "regular numbers"? — Preceding unsigned comment added by 18.104.22.168 (talk) 03:00, 12 July 2013 (UTC)
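As a decimal example of a check digit, here is a minimal sketch assuming the UPC-A barcode scheme, in which the digits are not simply added: those in odd positions are weighted by 3 before summing. The function name is hypothetical.

```python
def upc_a_check_digit(first11: str) -> int:
    """Compute the 12th (check) digit of a UPC-A code from its first 11 digits."""
    digits = [int(c) for c in first11]
    odd_positions = sum(digits[0::2])    # 1st, 3rd, ..., 11th digit
    even_positions = sum(digits[1::2])   # 2nd, 4th, ..., 10th digit
    total = 3 * odd_positions + even_positions
    # The check digit brings the weighted total up to a multiple of 10.
    return (10 - total % 10) % 10

print(upc_a_check_digit("03600029145"))  # weighted total is 58, so the check digit is 2
```

The weighting means that swapping two adjacent, different digits usually changes the result, which a plain digit sum would not detect.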
Checksums, FEC and hash functions
These, though apparently similar, are distinguished by their different purposes:
- Checksums are added to data items to provide a simple (incomplete) means of error detection. A checksum is useful to assure the integrity of even a single data instance. This type of checksum is generally much shorter than the data and is only for a cursory check.
- Longer FEC codes that are derived from the data provide both error detection and some error correction capabilities. They can be called checksums but in reality are intended for more than just checking the data. They are used widely in digital communication and represent a significant bandwidth overhead.
- Hash functions are intended to segregate a mass of many data items into a smaller number of groups where, ideally, similar but not identical data items fall into different groups. Hash functions are useful only to detect duplicates or speed searching in a large database, where the data items are assumed to be free of errors. Cuddlyable3 (talk) 14:35, 22 November 2008 (UTC)
Checksums should not be used where data may change!
To strengthen a previous note of warning...
Checksums can guarantee to catch 1-or-2-bit corruption, and are likely to catch other changes, but cannot be trusted to guarantee that data has not changed in any general/wholesale way, particularly for large datasets. A checksum is ultimately/only a highly many-to-one mapping. I just ran a test, generating a 64-bit checksum for every file on my computer, and of the 1.5 million-or-so files, I found 60 pairs of completely different files that shared a checksum (no triples).
22.214.171.124 (talk) 20:36, 7 February 2011 (UTC) Joe Weinstein at Oracle Inc.
Either you are very lucky/unlucky or your 64-bit checksum is using a very poor function. A good 64-bit checksum would have a very small chance of any random false duplicates among a million files.-126.96.36.199 (talk) 11:21, 7 April 2012 (UTC)
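The reply's estimate can be made concrete with a back-of-the-envelope sketch. It assumes an idealised checksum whose values are uniformly distributed and independent, which a real checksum function may not achieve; the function name is illustrative.

```python
def expected_collisions(n_items: int, bits: int) -> float:
    """Expected number of colliding pairs among n_items for an ideal
    checksum with 2**bits equally likely values (birthday approximation)."""
    pairs = n_items * (n_items - 1) // 2
    return pairs / 2**bits

n_files = 1_500_000  # file count quoted in the comment above

print(expected_collisions(n_files, 64))  # on the order of 6e-8 colliding pairs
print(expected_collisions(n_files, 32))  # a few hundred pairs, for comparison
```

Under that assumption, 60 colliding pairs among 1.5 million files is far more than chance would produce with a well-mixed 64-bit function.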
A near namesake
- No, completely unrelated (assuming you're referring to Wikipedia:CheckUser). Jaho (talk) 17:20, 12 September 2011 (UTC)
Order independence
The article should highlight the fact that only the simplest checksum hash functions are order independent. Generally order dependence is preferred, for more sensitive detection of differences. But sometimes order independence is desired (and insensitivity to zeroes) so a simple sum or XOR is desired. In the *nix world this is available with the classic sum utility. But in the Windows world these simple hash functions are very hard to find. The best source seems to be sum, ported in UnxUtils and CoreUtils for Windows.
See superuser.com/questions/168202/difference-between-unxutils-and-gnu-coreutils for details: "Unxutils are windows native, and have no dependancies - but haven't been updated in a while, and are a subset of coreutils."
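The order-independence property can be sketched as follows. The 8-bit sum and XOR below are toy functions written for illustration, not the algorithm of any particular sum utility, and zlib.crc32 stands in as an example of an order-dependent function.

```python
import zlib
from functools import reduce
from operator import xor

def sum8(data: bytes) -> int:
    """Toy checksum: sum of bytes modulo 256."""
    return sum(data) & 0xFF

def xor8(data: bytes) -> int:
    """Toy checksum: XOR of all bytes."""
    return reduce(xor, data, 0)

a = b"\x01\x02\x03"
reordered = b"\x03\x02\x01"          # same bytes, different order
with_zeros = b"\x00\x01\x00\x02\x03"  # same bytes with zero bytes inserted

# Sum and XOR ignore both the ordering and the inserted zero bytes.
print(sum8(a) == sum8(reordered) == sum8(with_zeros))
print(xor8(a) == xor8(reordered) == xor8(with_zeros))

# CRC-32 depends on byte order, so the reordered data gets a different value.
print(zlib.crc32(a) == zlib.crc32(reordered))
```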
Lead sentence is incomprehensible to this layman
"A checksum or hash sum is a sedfrg-size datum computed from an arbitrary block of digital data for the purpose of detecting accidental errors that may have been introduced during its transmission or storage. "
- That's because it was vandalism by User:188.8.131.52, which has since been reverted. —Lowellian (reply) 00:32, 2 April 2013 (UTC)
mentally computed checksums
The "disposable email address" article briefly mentions "mentally computed checksums". Which Wikipedia article is the best place for more details about such checksums? --DavidCary (talk) 18:33, 11 November 2013 (UTC)
Promodular sum?
The modular sum section uses this term, which I have never come across. The term is not defined elsewhere in the article nor in the article linked to shortly after the term is used. Searching the web for "Promodular sum" only brings up references to this article, or copies of it, plus some references to a company called Promodular. Can someone either provide references for this term, or remove it? 184.108.40.206 (talk) 21:13, 17 July 2015 (UTC)
Error in modular sum description? (resolved, reader error)
The article includes:
A variant of the previous algorithm is to add all the "words" as unsigned binary numbers, discarding any overflow bits, and append the two's complement of the total as the checksum. To validate a message, the receiver adds all the words in the same manner, including the checksum; if the result is not a word full of zeros, an error must have occurred.
The message validation description doesn't sound right. E.g., suppose the words are 001 010. Then the sender computes the sum: 011, and sends along the message 001 010 011. Then the receiver would add all three to get: 110. Then "the result is not a word full of zeros", but it's not the case that "an error … occurred;" in fact, the transmission was correct. Am I missing something, or is this description in the article misleading? 220.127.116.11 (talk) 22:39, 6 January 2016 (UTC)
- Oh, I missed the bit about two's complement. Taking that into consideration, the sender adds 001 and 010 to get 011, then takes the two's complement of 011 to get 101, and sends the message 001 010 101. The receiver then adds these and gets (1)000 which, when the overflow is ignored, is 000, a word full of zeros. 18.104.22.168 (talk) 22:45, 6 January 2016 (UTC)
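The worked example in this thread can be reproduced with a small sketch. It assumes 3-bit words as in the comments above; the function names are illustrative and not taken from the article.

```python
WORD_BITS = 3
MASK = (1 << WORD_BITS) - 1  # 0b111, used to discard overflow bits

def modular_checksum(words):
    """Two's complement of the word sum, truncated to WORD_BITS bits."""
    total = sum(words) & MASK
    return (-total) & MASK

def is_valid(words_with_checksum):
    """A message is accepted if all words, checksum included, sum to zero."""
    return (sum(words_with_checksum) & MASK) == 0

message = [0b001, 0b010]
checksum = modular_checksum(message)            # 0b101

print(format(checksum, "03b"))
print(is_valid(message + [checksum]))           # 001 + 010 + 101 = (1)000
print(is_valid(message + [0b011]))              # appending the plain sum fails: 110
```

Appending the plain sum 011, as in the first comment, gives 110 at the receiver, which is why the two's complement step matters.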