Talk:Canonicalization

From Wikipedia, the free encyclopedia

Normalisation

The article singles out UTF-8 as requiring normalisation. I'm not sure what it's referring to. Surrogates? UTF-16 has them too. All Unicode encodings require normalisation, because at the code-point level (that is, after decoding from any UTF encoding) there are ambiguities: see NFKC, for example. -- Preceding unsigned comment added by Jjjjjjbbb (talk • contribs) 23:44, 13 August 2008 (UTC)
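
To illustrate the code-point ambiguity mentioned above, here is a minimal Python sketch using the standard `unicodedata` module. The sample strings and the helper name `equal_after` are chosen purely for illustration and do not come from the article. It shows that the same visible character can be two different code-point sequences whatever the encoding, and that normalisation (NFC here, NFKC for compatibility characters) makes them compare equal.

```python
import unicodedata

# Two code-point sequences that render as the same character "é".
composed = "\u00e9"      # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # U+0065 followed by U+0301 COMBINING ACUTE ACCENT


def equal_after(form: str, a: str, b: str) -> bool:
    """Compare two strings after applying the given normalisation form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# The ambiguity is not specific to UTF-8: the encoded bytes differ
# in UTF-16 as well, because the underlying code points differ.
differs_in_utf8 = composed.encode("utf-8") != decomposed.encode("utf-8")
differs_in_utf16 = composed.encode("utf-16-le") != decomposed.encode("utf-16-le")

# NFC resolves canonical equivalence (composed vs. decomposed).
same_under_nfc = equal_after("NFC", composed, decomposed)

# NFKC additionally folds compatibility characters, e.g. the "fi" ligature.
ligature = "\ufb01"      # U+FB01 LATIN SMALL LIGATURE FI
folded_by_nfkc = unicodedata.normalize("NFKC", ligature) == "fi"
kept_by_nfc = unicodedata.normalize("NFC", ligature) == ligature
```

Which form is appropriate depends on the application: NFKC discards distinctions (such as ligatures) that NFC preserves.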

The article refers to UTF-8 "overlong" forms, such as the byte sequence 0xC0 0x80 being an alias for 0x00. Of course, the current UTF-8 standard defines these overlong forms as invalid, so any decoder from UTF-8 to a sequence of code points must reject characters encoded in such overlong forms. Canonicalisation in this context would refer either to the rejection mandated by the standard or to the replacement of overlong forms by their non-overlong equivalents. --Wtrmute (talk) 01:33, 28 August 2011 (UTC)
Improved the section. 80.235.83.183 (talk) 18:49, 24 March 2015 (UTC)
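
As a rough illustration of the overlong-form issue discussed above, the following Python sketch contrasts a deliberately naive two-byte decoder (a hypothetical helper written for this example, with no overlong check) with Python's built-in strict UTF-8 decoder, which rejects the sequence. The function names are assumptions made for illustration only.

```python
from typing import Optional


def naive_decode_2byte(seq: bytes) -> int:
    """Decode a 2-byte UTF-8 sequence WITHOUT any overlong check (unsafe).

    Assumes seq has the bit pattern 110xxxxx 10xxxxxx.
    """
    lead, cont = seq
    return ((lead & 0x1F) << 6) | (cont & 0x3F)


def is_overlong_2byte(seq: bytes) -> bool:
    """A 2-byte form is overlong if its code point would fit in one byte."""
    return naive_decode_2byte(seq) < 0x80


def strict_decode(data: bytes) -> Optional[str]:
    """Decode with Python's built-in decoder; return None if it rejects."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None


overlong_nul = b"\xc0\x80"   # overlong alias for U+0000
proper_nul = b"\x00"         # the only valid UTF-8 encoding of U+0000
e_acute = b"\xc3\xa9"        # valid 2-byte encoding of U+00E9

# The naive decoder maps the overlong form to code point 0, which is
# exactly the aliasing problem; the strict decoder refuses it.
aliased_code_point = naive_decode_2byte(overlong_nul)
rejected = strict_decode(overlong_nul) is None
```

Rejecting the input, as the strict decoder does, corresponds to the first of the two canonicalisation options described above; replacing the overlong form with `b"\x00"` would be the second.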

XML section

Personally, I don't think the entire list of possible changes should be laid out in the XML section. My suggestion would be to replace the bullet list with something like:

In addition, a full XML canonicalization would also ensure the document is encoded as UTF-8, normalize attribute values, and remove superfluous namespace declarations. For a full list of canonicalization changes, see the W3C specification.

The section already links to the W3C specification for XML canonicalization. JadeMatrix (talk) 23:53, 11 November 2015 (UTC)
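
For readers unfamiliar with what such canonicalization does in practice, here is a small sketch using `xml.etree.ElementTree.canonicalize` from the Python standard library (available since Python 3.8, implementing C14N 2.0). The two sample documents are made up for illustration; they differ only in attribute order, quoting, whitespace inside the tag, and empty-element syntax.

```python
from xml.etree.ElementTree import canonicalize

# Two logically equivalent documents with different surface forms.
doc_a = """<root b='2' a="1"><child/></root>"""
doc_b = """<root a="1"   b="2"><child></child></root>"""

# canonicalize() returns the canonical form as a string when no
# output file is given.
canon_a = canonicalize(doc_a)
canon_b = canonicalize(doc_b)

# After canonicalization the two documents are textually identical,
# so they can be compared or hashed byte for byte.
same_canonical_form = canon_a == canon_b
```

This covers only a few of the transformations; the W3C specification remains the reference for the full list.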

Biological taxonomy

I have removed the following unsourced text and image because I see some problems with it and suspect that there are others:

Type species is the reference for genus definition, even before the existence of a consensual genus name.

"In zoological nomenclature, a type species (Species typica) is the species name with which the name of a genus or subgenus is considered to be permanently taxonomically associated, i.e., the species that contains the biological type specimen(s), and is used as "canonical type" or reference model to a genus."

I think this material could easily be confusing, because a reader might think that the type species or type specimen has to be "central" to a genus, which is not the case; the types are merely contained within the circumscription. More seriously, though, I think this text might get into difficulties over what is a "canonical type" and what is a "canonical object" (the type specimen is an object, which is potentially quite confusing). There is no "canonicalization" (the title of this page) involved in biology, and since this page starts off with "in computer science" and this material is not computer science, it doesn't belong. Sminthopsis84 (talk) 17:47, 14 April 2016 (UTC)

External links modified

Hello fellow Wikipedians,

I have just modified one external link on Canonicalization. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:

When you have finished reviewing my changes, please set the checked parameter below to true or failed to let others know (documentation at {{Sourcecheck}}).

This message was posted before February 2018. After February 2018, "External links modified" talk page sections are no longer generated or monitored by InternetArchiveBot. No special action is required regarding these talk page notices, other than regular verification using the archive tool instructions below. Editors have permission to delete these "External links modified" talk page sections if they want to de-clutter talk pages, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{source check}} (last update: 18 January 2022).

  • If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
  • If you found an error with any archives or the URLs themselves, you can fix them with this tool.

Cheers. --InternetArchiveBot (Report bug) 09:21, 14 November 2016 (UTC)