Talk:XML

From Wikipedia, the free encyclopedia
WikiProject Internet (Rated B-class, High-importance)
This article is within the scope of WikiProject Internet, a collaborative effort to improve the coverage of the internet on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as B-Class on the project's quality scale.
This article has been rated as High-importance on the project's importance scale.

WikiProject Computing (Rated B-class, High-importance)
This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as B-Class on the project's quality scale.
This article has been rated as High-importance on the project's importance scale.
 

Music Markup Language

Please consider adding this external link: Music Markup Language. -- Wavelength (talk) 18:28, 14 April 2010 (UTC)

(1) There's already an article on it (Music Markup Language) and internal links are preferred to external links (2) It is already mentioned in the categories and list articles linked in the 'See also' section (3) Why would this particular language merit mentioning? There are tons of XML-based languages, it's not feasible to list every single one on the page; only a very few examples are given, solely in the last paragraph of the introduction, and those ones are wildly popular; there is no indication Music Markup Language is anywhere near as widely used. --Cybercobra (talk) 18:44, 14 April 2010 (UTC)
Thank you for your reply. I did not notice the article on it. I accept your reasoning for not listing it as an external link.
-- Wavelength (talk) 16:47, 15 April 2010 (UTC)

I would like to see more links to tutorial resources

Hello, members of this community. I consider this article very informative, professional and so on. Great tutorials such as w3schools were used in writing this article. But I would also like to see tutorials here that are less well known but very useful for beginners, such as http://phpforms.net/tutorial/tutorial.html. What do you think about it? Thank you in advance. —Preceding unsigned comment added by Malinari (talk, contribs) 16:35, 21 April 2010 (UTC)

Wikipedia is not a tutorial: see What Wikipedia is not. That policy statement doesn't directly address the level at which an article should be written, for example whether an article on XML should be written for the general public or for professional programmers. But in my view, this article is pitched at about the right level. There are plenty of other sources if you want a more gentle introduction or (conversely) something addressing the formal computer science audience. Mhkay (talk) 13:26, 22 April 2010 (UTC)

The above commentary notwithstanding, this entry is incredibly obtuse, only slightly more readable than a reference book. I come here every few months looking for some useful information about what XML is and what it is made up of, something to fill the gaps in my own knowledge, having forgotten my last abortive attempts to understand it here. Instead I read abstruse, cryptic commentary, assumptive descriptions, and ambiguous terminology that presumes the reader already knows so much about the topic that reading the entry would be unnecessary. I am a programmer from the old school and an EE, so I have had my soirees with technical manuals, but this prose is so dense and so vague about the terms it uses that I find my mind wandering away from it rather than mulling over the information in it. It doesn't have to be a tutorial to be understandable.

text/xml deprecated?

I'm not sure why it says text/xml is deprecated.

Just skimming over the RFC, I can't see that stated explicitly [1]

Jjjjjjjjjj (talk) 19:20, 10 June 2010 (UTC)

I have updated the citation to the IETF memo that deprecates text/xml and explains why. Mhkay (talk) 22:57, 11 June 2010 (UTC)

The RFC says "If an XML document -- that is, the unprocessed, source XML document -- is readable by casual users, text/xml is preferable to application/xml." I think characterising this as deprecation is inaccurate. Perhaps the description should be "application/xml (preferred for most technical use), text/xml (preferred when readable by casual users)" with a link to the RFC. What do people think? Paul Foxworthy (talk) 03:52, 13 June 2010 (UTC)

The cited Murata/Kohn/Lilley memo clearly labels text/xml as deprecated. This memo is much more recent than RFC 3023. The problems with text/xml largely emerged after 3023 was published. Mhkay (talk) 22:50, 14 June 2010 (UTC)
Thanks. But the citation is to RFC 3023. If the MKL memo is the source that confirms that text/xml is deprecated, then the citation is misleading. Well, it misled me at least :-). The memo I can find [2] is a draft and supposedly expired in March. If it's now an RFC, where is it? I propose a second citation be added to the deprecation referring to the draft. If and when the draft becomes an RFC to replace 3023, there should be just one citation that refers to that replacement. Does that make sense to everyone? Paul Foxworthy (talk) 15:47, 21 June 2010 (UTC)
I can't see where you have problems. (You say "But the citation is to RFC 3023". But there are multiple citations.) The article says that RFC 3023 standardizes text/xml and application/xml, which is true, and it also says that text/xml is in the process of being deprecated, which is also true, and both statements are linked to relevant citations. I've no idea what the current state of that process is, but the fact that the memo has timed out doesn't mean the process has been abandoned, unless you can find evidence to the contrary. Mhkay (talk) 21:39, 21 June 2010 (UTC)
I was talking about the citation in the infobox; sorry I didn't make that clear. I am not too fussed about the status of the memo. All I want is the best citation for the fact that text/xml has been deprecated. Paul Foxworthy (talk) 06:02, 1 July 2010 (UTC)
I've added a citation in the infobox. Paul Foxworthy (talk) 04:52, 6 July 2010 (UTC)
Now it still looks as if it were deprecated, but in reality it isn't. It was, as you say yourself, deprecated in a draft which expired. Is this really notable? Anyway, it should be made clear that it isn't deprecated and may never be deprecated, although there may be reasons against its use. —h.e.r—79.236.22.145 (talk) 08:54, 2 August 2010 (UTC)

XML Abuse

XML, which was developed for text markup, is being used as a general serialization container for any data structure.

Should we add a section about XML Abuse?

Or maybe just a reference at the header should be added?

What do you think?

I would like to add a reference to the header since this is an important problem. —Preceding unsigned comment added by 87.217.111.16 (talk) 07:03, 20 June 2010 (UTC)

I think you would find it very hard to get consensus on any statement (let alone one short and pithy enough to go in the article lead) about when XML is and is not appropriate. Certainly, the opinions on the page you cite are far too debateable to go here. Let's keep this article factual and concise. It should tell people what XML is and does, not try to precis all the debates about its whys and wherefores. This is an encyclopedia. Mhkay (talk) 21:44, 21 June 2010 (UTC)
I would like to see a mention of XML abuse, as this is a widespread practice: What is the worst abuse of XML that you have seen? —Preceding unsigned comment added by 79.144.221.71 (talk) 18:41, 12 December 2010 (UTC)
XML abuse is a serious, real-world problem and as such it should be addressed by the Wikipedia article. Things have cooled off now that the current buzz is about anything with the word "cloud" written over it, but it was quite terrible not many years ago. XML probably was, and still is, the most widely misunderstood and heavily buzzworded technology of this century so far, and acts as a selling point of anything that uses it, regardless of purpose and schema. People (especially pointy-haired bosses) think anything which has "XML" in the box will automagically talk to anything else with the same label. 08:55, 13 January 2011 (UTC)

A very interesting insight on two potential examples of XML abuse can be quoted from Håkon Wium Lie, Opera's CTO (e.g. here): he describes OOXML and ODF as essentially "memory dumps with angle brackets". 08:55, 13 January 2011 (UTC)

08:55, 13 January 2011 (UTC)  —Preceding unsigned comment added by 217.125.117.197 (talk)  

Thank you very much for keeping the Criticism section and making clear that XML should not be used to represent structured data, but narrative documents. — Preceding unsigned comment added by 193.127.207.152 (talk) 08:09, 13 October 2011 (UTC)

XML (Extensible Markup Language) is a set of rules for encoding documents in machine-readable form.

Would it not be more appropriate to say that XML is for encoding in human-readable form?

What is machine-readable form supposed to mean?

UndercoverAgents (talk) 18:52, 7 July 2010 (UTC)

Its sole purpose is to interpret human-readable content/context and to turn it into machine-readable form through a medium/interface. XML can be read/interpreted and parsed from within compiled code and ASCII text, which suggests both application and text would be valid (personal opinion). Daemondevel (talk) 00:57, 4 August 2010 (UTC)

Spelled-out title

The W3C defines XML as follows:

Extensible Markup Language, abbreviated XML, describes a class of data objects called XML documents ... and so on

The important part here is that the article should first use the fully spelled out name and then the abbreviation. This is not only in accordance with the standard, but also follows general rules of good writing in English. Kbrose (talk) 03:15, 4 August 2010 (UTC)

This might be true if there were consensus that "XML" is an abbreviation of the three-word form, but many of us just don't believe that. Tim Bray (talk) 05:39, 4 August 2010 (UTC)

"Extensible Markup Language" first?

We're going to need to sort out what it should say at the top of the article:

Current candidates:

Extensible Markup Language (XML) and XML (Extensible Markup Language)

The first is supported by the English convention that a full name is listed first, and wording from the W3C spec: "The Extensible Markup Language (XML) is..." The second by the fact that the title of the article is (appropriately) XML and since the three-letter version is used rather than the three-word version in approximately 100% of spoken and written discourse.

Also note that XML is *not* an abbreviation or an acronym, it is just another name for the same thing.

My vote would be that the primary name should be the same as the title of the article and should reflect common usage. But it's not a matter of life or death. What do others think? Tim Bray (talk) 03:19, 4 August 2010 (UTC)

Of course it's an abbreviation, even the standards documents specifically say so, as quoted above. The title of the article should also be the full name, ideally. The reason that it isn't, is that too many writers here are suffering from Acronymitis. Almost all articles of computer networking protocols use the full protocol name as title, even for the most common of protocols, such as IP. Kbrose (talk) 03:27, 4 August 2010 (UTC)
There's no need for the title and the first name mentioned in the lede to match, particularly for acronyms where the full name is less common: NATO, Laser; see WP:SINGULAR on acronyms. --Cybercobra (talk) 03:56, 4 August 2010 (UTC)
The form in the NATO entry looks better than either alternative to me. "Extensible Markup Language or XML" correctly reflects that one is not an abbreviation of the other. I'd put the more common form first, but that's hard to get too excited about. Tim Bray (talk) 05:42, 4 August 2010 (UTC)
Empirically, the first letter of "Extensible" is not "X". It is generally agreed upon that writing "eXtensible Markup Language" is an error, which is another symptom of the fact that "XML" and "Extensible Markup Language" are two names for the same thing, one immensely more popular and widely used than the other. When I drafted the first sentence of the XML specification, I was insufficiently percipient to have predicted which would catch on. Tim Bray (talk) 05:38, 4 August 2010 (UTC)
XML is definitely an acronym, as evidenced by the fact that "ML" is "Markup Language" and "X" is generally considered an acceptable character abbreviation to represent extensible, at least in part because extensible and xtensible share the exact same phonetics. For other examples see XP (eXtreme Programming, eXperience Point), XSL (eXtensible Stylesheet Language), XBML (eXtended Business Modeling Language, eXtensible Battle Management Language), XMP (eXtensible Metadata Platform), and so on. The oXygen editor's product name is a play on the "X" acronym use, so with many counter examples, I would argue that it's not generally agreed that eXtensible is incorrect. It may not be a well-formed acronym, but it does have the most important semantic mapping characteristic of an acronym and in the one case that it doesn't take the first letter mapping, it uses an acceptable replacement. That's the first point, so if I replace XML with DNA, does the second point hold up?
"The second by the fact that the title of the article is (appropriately) DNA and since the three-letter version is used rather than the three-word version in approximately 100% of spoken and written discourse."
In this case, it's obvious that the typical rules of English apply, even though most people probably don't even know what DNA stands for any more. I can definitely see (and agree with) the logic of mapping from the commonly seen and heard acronym back to the expanded form when the acronym serves as a mental key, but that's inconsistent with currently correct English usage. It's essentially guaranteed that acronyms are always going to be more popular and more widely used than their expansions because that is their very purpose, so your argument would apply to all acronyms. MaxxD (talk) 07:10, 4 August 2010 (UTC)
(Okay, I'll argue with myself...) XML is technically an initialism, not an acronym or abbreviation because it is not a pronounceable word, but the point that it does represent the initials of the expanded form (notwithstanding the ex/x issue) is reasonable. However, The Chicago Manual of Style (CMS) states, "Occasionally, too, it makes sense to use the acronym first and put the full name in parentheses, if the acronym in question is so familiar to your expected audience that it almost goes without explication." [3] and XML has certainly achieved this distinction, so writing it as "XML (Extensible Markup Language)" is not only perfectly okay per the CMS, but almost certainly preferred. MaxxD (talk) 09:36, 4 August 2010 (UTC)

Details of valid characters

The section "details of valid characters" is getting absurdly detailed, especially as it appears so close to the start of the article. It's simply not interesting to the average reader who comes here wanting an overview of what XML is - the kind of people who want this level of detail are much more likely to go to the specs than to come here. I think the usual Wikipedia solution is to move the material out to a separate article, and I propose doing that. Mhkay (talk) 11:06, 13 August 2010 (UTC) (Now done.)

By definition?

Under "key terminology" it is stated: "By definition, an XML document is a string of characters.". By what definition, pray? That's not what the definition of "document" in the XML 1.0 rec says. It might be nice if it did, but it doesn't. Instead it mumbles about "textual objects", thus leaving (deliberately?) ambiguous the question of whether a document is a sequence of characters or a sequence of octets. Mhkay (talk) 23:23, 25 August 2010 (UTC)

This description is only useful to people who already know what XML is useful for

Hi!

It would be helpful if someone rewrote this to explain why XML exists, as this would justify the entry. —Preceding unsigned comment added by 86.9.13.234 (talk) 14:49, 8 September 2010 (UTC)

"&" and "<" in XML entity values

  • The article itself states they "may never appear in content."
  • The matching reference's summary states they are allowed (just not recommended).
  • The actual reference (i.e. the specs) states they "MUST NOT appear in their literal form, except when..." (certain cases like when inside CDATA).

So should the first two be fixed to reflect the latter? Can someone offer correct fixes then? -109.66.203.215 (talk) 08:52, 16 November 2010 (UTC)
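To make the three cases above concrete, here is a minimal sketch in Python, assuming the standard-library `xml.etree.ElementTree` parser and `xml.sax.saxutils.escape` as stand-ins for "an XML processor". The `<note>` element and the sample string are hypothetical. It only illustrates that a literal "&" or "<" in character data is rejected, while the escaped form and a CDATA section both carry the same text; it is not a statement of what the article wording should be.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "AT&T < IBM"  # hypothetical sample text containing both characters


def is_well_formed(doc: str) -> bool:
    """Return True if the parser accepts the document, False on a parse error."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# Literal "&" and "<" in character data: the parser rejects this.
literal_ok = is_well_formed(f"<note>{raw}</note>")

# Escaped form: escape() rewrites "&" to "&amp;" and "<" to "&lt;".
escaped_doc = f"<note>{escape(raw)}</note>"
escaped_text = ET.fromstring(escaped_doc).text

# CDATA section: the characters may appear literally here.
cdata_doc = f"<note><![CDATA[{raw}]]></note>"
cdata_text = ET.fromstring(cdata_doc).text

print(literal_ok)
print(escaped_doc)
print(escaped_text == raw, cdata_text == raw)
```

Both accepted forms yield the same character data once parsed, which matches the spec's "except when..." wording rather than a flat "may never appear".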

Example shown in Icon, not technically invalid but a poor example

Looking at the example, it shows questions and answers being thrown straight into the <quiz> bracket. Surely each pair of Q&A would need to be wrapped in a tag <round> or <question_set>? Otherwise the program using this would have to read through the whole thing serially for any of it to make sense.

This is more a practical issue and not a technical one.--92.14.116.17 (talk) 14:12, 3 March 2011 (UTC)

Using XML for question-and-answer quizzes seems to be a common student exercise set by unimaginative teachers, and as the problem never occurs in real life I guess you'd better find out what those teachers consider the right answer to be. Or at any rate, find out what requirements they are assessing the solution against. Mhkay (talk) 15:15, 3 March 2011 (UTC)

Still a poor example, and barely an example as it is shown as a small piece of graphics. There should definitely be a real example in the text. And in that example it should be explained which one is the root element. My issue is that I believe I heard that the root element is in fact an implicit element above the topmost element. This article does not even explain what a root element is, just that there is only one. 193.140.194.148 (talk) 12:15, 13 January 2012 (UTC)
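As a possible starting point for a text example, here is a small sketch, assuming Python's standard-library `xml.etree.ElementTree`. The `<item>` wrapper and the question text are hypothetical and are not taken from the article's image. It shows each question-and-answer pair wrapped in its own element, and that the root element is simply the outermost element (`<quiz>` here). In the XML 1.0 specification the root element is that outermost element itself; data models such as XPath and DOM place a separate root or document node above it, which may be the source of the "implicit element" recollection.

```python
import xml.etree.ElementTree as ET

# Hypothetical quiz markup: <item> is an assumed wrapper name for one Q&A pair.
doc = """<quiz>
  <item>
    <question>How many predefined entities does XML 1.0 have?</question>
    <answer>Five</answer>
  </item>
  <item>
    <question>How many root elements may a document have?</question>
    <answer>One</answer>
  </item>
</quiz>"""

root = ET.fromstring(doc)  # fromstring returns the root element itself

# Because each pair is wrapped, it can be read without tracking document order.
pairs = [
    (item.findtext("question"), item.findtext("answer"))
    for item in root.findall("item")
]

print(root.tag)
for question, answer in pairs:
    print(f"{question} -> {answer}")
```

With the wrapper in place, a consumer can pick out any single pair directly instead of pairing up sibling elements by position.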

History needs a bit of cleanup

Fixed a couple of things, but the section could (and should) be much better written. A very short to-do list, in decreasing order of importance:

  • the link to Kimber's blog is totally out of place;
  • more supporting citations are needed;
  • more historical sources should be found and linked to.

Andy Monakov (talk) 11:32, 15 September 2011 (UTC)

On the number-of-weeks issue, I can't get Jon's count to work in my head. I seem to remember that we were working in at least part of August, and when I pop up a 1996 calendar I have trouble getting the week count down to his number. However, it is absolutely the case that the first wave of work was in the August-November timeframe, so I thought it best just to say that rather than arguing over the number of weeks. On the section in general, I agree it's rambling and messy, that may have been partially a consequence of too many of the people who were involved wanting their opinion/contribution included. I think it would be a good idea for someone else to be bold and clean it up. Tim Bray (talk) 17:35, 20 September 2011 (UTC)

Jsonix

Is it worth including a link to Jsonix? --Gak (talk) 12:22, 21 September 2011 (UTC)

A bit of Web searching reveals no uptake, and also confluence.highsource.org is offline. So, no. Tim Bray (talk) 23:04, 30 September 2011 (UTC)

Large commented out section under Well-formedness and error-handling

The section in question can be found at the end of this section: http://en.wikipedia.org/w/index.php?title=XML&action=edit&section=8

Are there plans to use that? If not, it should be removed, although it does seem to contain some valid information. — Preceding unsigned comment added by Nick Garvey (talk, contribs) 04:03, 2 November 2011 (UTC)

Hidden commented out sections like this are a menace. I've moved it here from the article:
--Cybercobra (talk) 05:12, 2 November 2011 (UTC)

Character entity references for escaping

I attempted to link the #Escaping section with the Character entity reference article. I thought the link was relevant because that article seems to list the same five objects, and could potentially expand on the topic. If the problem with my change was just an issue with terminology or semantics, perhaps I can avoid this by directly naming the article, for example “There are five predefined entities (see Character entity reference)”. Otherwise, I’d love to know why the two topics shouldn’t be related when they seem so similar. Vadmium (talk, contribs) 12:34, 5 February 2012 (UTC).
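For reference, the five predefined entities mentioned above can be checked with a short sketch, assuming Python's standard-library `xml.etree.ElementTree`. The `<c>` element is a hypothetical container. It also shows that an HTML-style name such as `&nbsp;` is not predefined in XML, so a document without a DTD that uses it is rejected.

```python
import xml.etree.ElementTree as ET

# The five entities predefined by XML 1.0 and the characters they stand for.
PREDEFINED = {"lt": "<", "gt": ">", "amp": "&", "apos": "'", "quot": '"'}

# Parse each reference inside a hypothetical <c> element and read back the text.
decoded = {
    name: ET.fromstring(f"<c>&{name};</c>").text
    for name in PREDEFINED
}

# An entity that is defined in HTML but not predefined in XML.
try:
    ET.fromstring("<c>&nbsp;</c>")
    nbsp_ok = True
except ET.ParseError:
    nbsp_ok = False

for name, char in decoded.items():
    print(f"&{name}; -> {char}")
print(nbsp_ok)
```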

