Talk:Uniform resource locator

WikiProject Internet (Rated C-class, High-importance)
This article is within the scope of WikiProject Internet, a collaborative effort to improve the coverage of the internet on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as C-Class on the project's quality scale.
This article has been rated as High-importance on the project's importance scale.

WikiProject Computing (Rated C-class, Mid-importance)
This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as C-Class on the project's quality scale.
This article has been rated as Mid-importance on the project's importance scale.
 

Headline text

Often a user of the Internet is asked to provide a URL. The user looks in Wikipedia, and finds what follows. If he/she is not a computer programmer, he/she still does not understand. It appears to be very difficult for those who know what a URL is to say what it is, to bridge the gap in understanding between the knower and the unknower. Who knows whether a URL is similar to a clowning of a divining rod? The language used to describe a URL is vastly more complex. --Preceding unsigned comment added by 118.68.92.29 (talk) 16:27, 12 June 2008 (UTC)

Most of us just want to know what to write in the space provided when we are asked for our URL when we want to comment on an article that appears on the Internet. Writing in 'What is URL?' works perfectly well, which would confirm the impression that 'URL' must be one of the most popular and least comprehended expressions of all time.

"To go to the homepage one usually just enters the domain name such as www.wikipedia.org. Sometimes, and also in this case, "www." can be omitted: wikipedia.org."

In what cases can "www" be omitted ? What is the difference between URLs having "www" and those that don't ? Jay 07:19, 21 Mar 2004 (UTC)

The first part of an HTTP URL consists of the DNS host name plus the domain name. In the case of "http://www.wikipedia.org/", the host name is "www" and the domain name is "wikipedia.org". When you enter this in a web browser, the browser typically uses the DNS resolver on your system to determine the IP address of host "www" in domain "wikipedia.org". If the DNS servers for domain wikipedia.org define an IP address for "wikipedia.org" or an alias for wikipedia.org that uses that same IP address as "www.wikipedia.org", then the leading "www" is not required since it will resolve in DNS to the same IP address. The requirement for the leading "www" is entirely dependent on how DNS is configured for that domain.

---

Apart from the possibility of a name being available in the DNS, many browsers actually make a number of heuristic modifications to the URL if it doesn't lead anywhere. Typically, a TLD (such as .com, .org, etc.) is appended to a name if it doesn't contain dots, and a "www." can also be prepended to it (thereby allowing you to type google in your browser and have it completed to www.google.com). Some browsers can even be customized with details about this rewriting... -- Schnolle 16:48, 2004 Oct 24 (UTC)
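The kind of rewriting described above can be sketched roughly as follows. This is only an illustration: the function name `guess_url`, the default `.com` suffix, and the `http://` scheme are assumptions made for the example, and real browsers apply their own, different rules.

```python
def guess_url(typed, default_tld="com"):
    """Hypothetical address-bar heuristic; not any real browser's algorithm."""
    name = typed.strip()
    # A bare word with no dots gets a TLD appended.
    if "." not in name:
        name = f"{name}.{default_tld}"
    # A "www." host label is prepended when it is missing.
    if not name.startswith("www."):
        name = f"www.{name}"
    return f"http://{name}/"


print(guess_url("google"))         # http://www.google.com/
print(guess_url("wikipedia.org"))  # http://www.wikipedia.org/
```

A real implementation would normally try the typed name first and fall back to guesses like these only when the lookup fails.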

"Uniform resource locator" or "universal resource locator"?

Title says Uniform, first sentence says Universal....which is right? 41222-KenS

Uniform is correct. Check some of the external links... Schnolle 10:01, 24 Dec 2004 (UTC)
Some external dictionaries give "universal resource locator" as an alternate expansion, and of those, some preface it with "previously", as if the term's original expansion was "universal resource locator" and it later changed. Merriam-Webster, for example, traces the "universal" term to 1993. If it did change from one to the other, why, and by whom, and when? And what are some older examples (in RFCs, perhaps) of the "universal" version being used? This issue of nomenclature provenance should at least be touched upon in the article. Robert K S (talk) 06:17, 3 July 2009 (UTC)

New RFC

What about RFC 3987? --ajvol 08:40, 22 Apr 2005 (UTC)

Irrelevant. RFC 3987 (IRI generic syntax) is essentially an extension to RFC 3986 (URI generic syntax), which supersedes the previous version, RFC 2396, which supersedes the RFCs that define URLs. As RFC 3305 (Aug 2002) explains, the term URL has effectively been obsolete for some time. The longevity of the term URL is attributable to a number of factors, not the least of which is horrible articles like this one. So if you're going to ask 'what about RFC ___?', you should probably ask 'how can we make this article better reflect the modern view of URL as being an informal term for a certain class of URIs, with respect to RFCs 2396, 3986, and 3305?' --mjb 01:37, 25 Apr 2005 (UTC)

Merging with URI

I think the bulk of content on this page should be merged with the page on URIs, leaving only parts really relevant to locators, as it is on the Uniform Resource Name. As specified in RFC 3986:

  An individual scheme does not have to be classified as being just one
  of "name" or "locator".  Instances of URIs from any given scheme may
  have the characteristics of names or locators or both, often
  depending on the persistence and care in the assignment of
  identifiers by the naming authority, rather than on any quality of
  the scheme.  Future specifications and related documentation should
  use the general term "URI" rather than the more restrictive terms
  "URL" and "URN" [RFC3305].

Hrvoje Šimić 07:00, 13 September 2005 (UTC)

That may be technically correct as far as the spec goes, but the term URL is much more popular, so I'd favor the current arrangement. --William Pietri 07:58, 7 October 2005 (UTC)

I agree that URL is a more popular term, but I don't think that should be the main criterion for Wikipedia organisation. The page on URL could explain the popular and the technical meaning, and refer to the page on URI for details, while containing information specific to locators. This approach is used many times on Wikipedia (e.g. Monkey/Ape). Besides, I don't think that the current arrangement is anything to be satisfied about: articles overlap in scope, duplicate material, etc. Hrvoje Šimić 08:27, 11 October 2005 (UTC)
Sorry, but I'm still unpersuaded. I agree that some of the material here could move into URI, and that the article could use some cleanup. But I think there are two definitions of URL: what the very small number of people who read and follow the RFCs mean, and what the bulk of internet-connected English-speakers (including most web programmers I know) mean. In addition to mentioning the formal URI/URN/URL distinction and formally defining the official URL, I think this article should continue to contain an explanation of URL that's friendly to the man on the street. -- William Pietri 02:59, 12 October 2005 (UTC)
Could you explain about the two definitions of "URL"? If I understood you correctly, one definition is "popular URL", which should correspond to the "rfc URI", and the other one is "rfc URL". The "popular URL" part should cover the same topic as "rfc URI", but in a different, more practical language, like "type in the protocol name, colon, slash, slash, computer name, the file path and the file name". I think that approach is wrong. The Web may have started like that, but it grew into so much more. And the models and definitions changed accordingly. And they were changed not by theorists or politicians, but by the same people who invented it in the first place. And they changed it not because they felt like it, but because they were forced to. The Web would be much more limited and confusing without the change. I wish I had been given this "RFC view" when I was learning about the Web, instead of the contradictory and misleading material I found. That being said, I don't mind covering history, the popular and "practical" aspects of the topic. In fact, I think it is essential for explaining why the RFCs are the way they are. But I'm all for bringing these two views together, instead of keeping them apart. -- Hrvoje Šimić 08:24, 12 October 2005 (UTC)
There is a difference between a URI and a URL according to other Wikipedia articles. A URL is a kind of URI; another kind of URI is a URN (e.g. a Magnet link). A URL specifies a location, whereas a URI does not always do so: a URN, for instance, identifies data. --Computerjoe 17:47, 17 November 2005 (UTC)

I agree with William and Joe on both points. Technically, URLs have some features that make them different from URIs in general. Intuitively, the term URL is still much used even in technical contexts, even if URI should sometimes be used instead. However, my main reason for keeping the articles separated is to ease the reading and learning of these concepts. In this case at least, it is much easier to first learn about URLs and then discover that these are particular cases of URIs rather than the other way around. The fact that the article on URI starts with a section that explains the relationship between URI, URL, and URN seems to me a proof of this. I am not even convinced that the URL article should insist so much on comparing URLs with URIs from the beginning. Paolo Liberatore (Talk) 22:04, 2 December 2005 (UTC)

The proposal is to keep both articles. The merging refers to the content of the URL article that talks about URIs in general. The new URL article should contain information specific to the concept of URL, relation to URI, and the popular and historical usage of the term. Hrvoje Šimić 18:45, 6 December 2005 (UTC)
I agree that the article would be improved by being more specific on URLs. At some point, URLs had been very important in the specification of the Web, and the current "popular" use of URLs is mainly based on that specification. A possible solution could be to describe the (historical) specification, leaving the comparison with the current specification at the end. What I propose is something like:
  1. intro (more or less as is, removing most of the second paragraph but adding a note saying that URLs are technically extinct)
  2. origins (I assume that the basic idea was that of having a single string to identify files that can be retrieved in different ways)
  3. syntax (the historical one, without referring to URI)
  4. the specific case of HTTP URLs
  5. why URLs were replaced by URIs, and the current specification
Your last comment made me realize that perhaps there is a way of writing this article that satisfies all of us. The point I insist on is that the reader should be able to read the URL article completely without having to read the URI article first. Paolo Liberatore (Talk) 19:16, 6 December 2005 (UTC)

Sorry I missed your earlier comment, Hrvoje. Paolo's proposed solution seems like a great approach to me, and firmly in keeping with NPOV. --William Pietri 15:31, 7 December 2005 (UTC)

Pronunciation

A Uniform Resource Locator, URL (spelled out as an acronym, not pronounced as 'earl')

Yuarell is the technically correct pronunciation, but in terms of actual usage it is pronounced as Earl by a lot of people, whether it's formally correct or not.

Out of curiosity, I checked with a number of pals who did web stuff early on, mainly at Hotwired. They all agree on you-are-ell. -- William Pietri 02:36, 12 October 2005 (UTC)
Same in France, for all computer users. Whether in French or in English, they pronounce it as an initialism, not as an acronym (earl, or [yrl] in French).
Technically, all pronounceable acronyms are supposed to be pronounced... Thus, NATO is not pronounced n-a-t-o and URL should be pronounced "earl." I won't change the article until I get some input from other Wikipedians though... --Wulf 00:27, 29 April 2006 (UTC)
"Technically, all pronounceable acronyms are supposed to be pronounced." That's ridiculous. --153.48.52.241 (talk) 16:45, 6 March 2008 (UTC)
According to The Internet for Dummies (10th edition), it's never pronounced as though it's a word. (Excerpt at Dummies.com.) B7T 18:08, 5 May 2006 (UTC)

You must define pronounceable. HTTP could be hatuhtuhpuh, or UAE could be ooay. VCR could be vacur. CD could be cud, DVD could be duved. Need I go on?

LCD TV is lukd-tuv? Either way, even though it shouldn't be pronounced as a word, people do because it's easier to say than spelling it out. If it's able to be pronounced, people will pronounce it. As stated with NATO being Nay-Toe, PC being Pee-See, and RAID being Rayd, it's all able to be said without spelling it out. darthbob 19:03, 8 January 2008 (UTC)

The PC-not-being-spelled point sure made me laugh. Thank you. Oh, and in case it's of any interest, in Poland too it's spelled out (oo-err-ell) rather than pronounced as a word (oorl). IrekReklama (talk) 02:08, 6 September 2010 (UTC)

Keep Both

Just summarise both debates and crosslink.

99% of internet users don't know what a URI is anyhow and a link from URL to URI wouldn't hurt.

HTTP cookie

I have submitted the article HTTP cookie for peer review (I am posting this notice here as this article is related). Comments are welcome here: Wikipedia:Peer review/HTTP cookie/archive1. Thanks. - Liberatore(T) 16:57, 14 January 2006 (UTC)

RFC 1738 is now far obsolete

I'm not a native English speaker, and not registered in the English version of Wikipedia. But I would like to tell you that RFC 1738 is now far obsolete, and that the article should refer to RFC 3986, which has the status of standard (STD 66).

I may come back in a few weeks to do the update if nobody has done it... but I would prefer a native English speaker to do it.

  • Yes, it is true, good fella. STD 66 is the way to go. I am right now preparing 110 training hours on lots of topics including this one.
  • As of today, this article reads that a URL has "a colon, two slashes" at https://en.wikipedia.org/wiki/Uniform_resource_locator#Syntax. But, according to STD 66, the hierarchical part of a URI may begin with 2 slashes (to be used with path-abempty), 1 slash (in the case of path-absolute), or 1 segment (path-rootless). And let us remember that the hierarchical part may be completely empty (path-empty).
  • I think the articles for URI, URL and URN must be updated to reflect the current state of the standard.
  • Can anyone take care of these changes? I am quite busy now and I usually like to do this kind of thing perfectly or not at all. Gotta get back to work...
  • George Rodney Maruri Game (talk) 16:10, 24 November 2013 (UTC)
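The four forms listed above can be told apart with a few string checks. The following is a minimal sketch, assuming an absolute URI as input; the function name `hier_part_form` is made up for the example, and the code does not validate the scheme or the characters used.

```python
def hier_part_form(uri):
    """Classify the hier-part of an absolute URI (sketch after RFC 3986)."""
    scheme, colon, rest = uri.partition(":")
    if not colon:
        raise ValueError("not an absolute URI: no scheme")
    # Drop the fragment, then the query; neither belongs to the hier-part.
    rest = rest.partition("#")[0].partition("?")[0]
    if rest.startswith("//"):
        return "authority + path-abempty"
    if rest.startswith("/"):
        return "path-absolute"
    if rest == "":
        return "path-empty"
    return "path-rootless"


for example in ("http://example.com/a", "file:/etc/hosts",
                "mailto:user@example.com", "foo:"):
    print(example, "->", hier_part_form(example))
```

Only the first example begins with "a colon, two slashes"; the other three are still syntactically valid URIs.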

URL and IP address

Shouldn't there be something about how a DNS server resolves a URL into an IP address? I started writing something but realised I didn't know enough about it, so someone else had better do that. DirkvdM 18:40, 5 March 2006 (UTC)

  • I agree, at least a one-liner with hyperlinks to the wikipedia articles for 'DNS Server' and 'IP Address'. Something like
  A network lookup service, the Domain Name System (DNS),
  provides the ability to map domain names to a specific
  IP address.  For example, the URL www.wikipedia.org 
  maps to the IP address 207.142.131.248.

(which I stole with minor modification from the DNS article)

--Mcswell 19:30, 30 June 2006 (UTC)

  • URLs don't necessarily have a host part, so they don't always need to do IP resolution. --66.241.66.190 (talk) 19:23, 27 November 2007 (UTC)
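A small sketch of the point made in this thread: only the host part of a URL is handed to DNS, and some URLs have no host part at all. The helper name `host_to_resolve` is an assumption for the example; the actual lookup (for instance with `socket.getaddrinfo`) is deliberately left out so the snippet does not need network access.

```python
from urllib.parse import urlsplit


def host_to_resolve(url):
    """Return the host name a resolver would be asked about, or None."""
    # urlsplit() exposes the host of the authority component, if there is one.
    return urlsplit(url).hostname


print(host_to_resolve("http://www.wikipedia.org/wiki/URL"))  # www.wikipedia.org
print(host_to_resolve("mailto:person@server.com"))           # None
```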

according to my programmer room mate once you connect to the DNS server you press f11 and you accidently the whole thing —Preceding unsigned comment added by 121.73.16.14 (talk) 05:15, 15 July 2009 (UTC)

URL of this article

To quote the article: "For example, the URL of this page on Wikipedia is http://en.wikipedia.org/wiki/Uniform_Resource_Locator."

However, I came across this page by searching for "url", which directed me to another version of this page, which of course has a different URL! This could be confusing to some who arrived at that page instead, and noticed that the URL in the above sentence is not the same as the text in their browser's address bar. (If there is a way to use the Wikipedia markup language to always show the correct URL of a page within the text of a page, this would be a good place to use it.) B7T 18:36, 5 May 2006 (UTC)

At some point, Wikipedia may go on CD or even on paper, where using a URL that depends on the web page will not work. While the sentence "the URL of this page on Wikipedia" is formally correct (that is indeed the URL of that page in the Wikipedia web site), it may be somewhat misleading if read on answer.com etc. I'd support choosing an entirely different URL as an example. - Liberatore(T) 17:54, 6 May 2006 (UTC)

mailto

I'm confused about the mailto: being a URL rather than a URN. This is because a URL tells you how to get there, which mailto: doesn't really do. After all, mailto:person@server.com will probably end up in a file like /user/person/mail/342345 on server.com. Furthermore, the mail actually gets sent to the sender's SMTP server before heading to server.com. It's like writing a letter to John Doe, and having the post office try and figure out where John Doe lives. So shouldn't mailto: be classified as a URN? If not, then why is it a URL? (P.S. sorry for the mailto auto-link, I don't know how to turn that off...) IMacWin95 22:18, 15 June 2006 (UTC)

A URN is a name for a thing, which should have a very long lifetime. A URL is a current locator, a much more transient concept. Thus, for example, "mail:John Doe, son of Jim Doe and Mary Roe, born 1972-02-27 in St. Elegius Hospital, Boston MA, USA, Boston birth certificate #123-456-78" could be a URN because it is a permanent reference, but "mailto:John@Doe.Com" has to be a URL because that account might be reused a dozen times in the next 100 years.
P.S. To stop the automatic treatment of almost any text in Wikipedia, wrap it in <nowiki>...</nowiki>. RossPatterson 01:20, 16 June 2006 (UTC)

Avoid self-references

avoid self-references to wikipedia —Preceding unsigned comment added by 71.123.46.27 (talkcontribs) 18 July 2006

Aug 2006 deletions and the future of this article

Etan Wexler removed a large amount of text from the article in mid-August. The changes weren't really discussed here, and now, a few weeks later, people are being tempted to start adding material that covers some of what was previously removed. Although I don't oppose Wexler's edit (most things that can be said about URLs are covered by the URI article), in Wikipedia we often have to acknowledge the priorities of contributors: someone came to this recently-trimmed article and felt that it was missing examples and details, so they attempted to add this info. They did a poor job of it, so I reverted, but this is not a battle I want to have to keep fighting. I hold it up as an example of how we can't rely on people to refer to linked articles like Uniform Resource Identifier; we either have to explain why they need to refer to that other article, or we need to address the topics of interest here, even if it means replicating or making repeated reference to material better covered elsewhere. Thoughts? —mjb 14:25, 6 September 2006 (UTC)

I think that in this article there are actually two concepts hidden under the same label:
  1. URL as a popular synonym for URI;
  2. URLs as locators, in contrast to identifiers and names.
These two meanings should be clearly stated in the introduction. The second meaning should be covered very briefly in the article, because this distinction is rarely needed in practice, as well as being marginal in theory. The introduction should strongly refer the reader to the article on URI for complete coverage of the first meaning. The following sections would explain this in detail:
  • Term URL in popular culture. Here we should give the brief, dumbed-down, practical definition of the URI (alias URL) to satisfy the superficial curiosity. Then, in more detail, we should explain why the term URI replaces it now.
  • URLs as locators: a technical discussion.
  • History of URL and how the change in terminology and semantics happened.
What is not needed are the schema listings and concrete syntaxes, as these are available on URI and URI scheme.
Hrvoje Šimić 15:56, 6 September 2006 (UTC)

What was the purpose of the double-slash again?

Rather obviously, the slashes are not part of the protocol name -- otherwise they'd be on the other side of the colon. Thus, they are separators of some kind for something which is in the hierarchy above the domain name, even above the TLD.

Are they merely historical artifacts? Have they ever actually separated any values other than <null>? i.e. are there any circumstances, at least historically, in which a URL would be any different from http:<null>/<null>/example.com? Or are they not actual separators but rather an indicator of some sort?

Considering other URI schemes, such as URNs, or different protocols (mailto, tel, etc) do not use initial slashes, this is a bit awkward for http and ftp protocols. — Ashmodai (talk · contribs) 00:14, 8 October 2006 (UTC)

Oopsie. The RFC has it all. In case anybody else was wondering:
  Many URI schemes include a hierarchical element for a naming
  authority so that governance of the name space defined by the
  remainder of the URI is delegated to that authority (which may, in
  turn, delegate it further). The generic syntax provides a common
  means for distinguishing an authority based on a registered name or
  server address, along with optional port and user information.

  The authority component is preceded by a double slash ("//") and is
  terminated by the next slash ("/"), question mark ("?"), or number
  sign ("#") character, or by the end of the URI.
i.e. it's an indicator that the next segment is the name of the authority for the rest of the URL. I guess http:/foo/bar.html would therefore be (theoretically) legal within the context of the server on which the content resides. In theory, anyway. I suppose the HTTP protocol forbids that use, as relative URLs omit the protocol prefix entirely. So I guess the FILE protocol has a leading double-slash followed by a slash (for the system root) to indicate that the authority is <null>, i.e. local, inherent or nonexistent, or something. -- Ashmodai (talk · contribs) 00:24, 8 October 2006 (UTC)

There is one place where the double-slashes may be useful, in defining a link in an HTML page to use the scheme of the current page by omitting it (much like you can get the current host by omitting it):

Omitting the scheme in a link tag: '<a href="//site2.com/">site2</a>' will inherit the scheme:
on the address "http://site1.com/" the link becomes "http://site2.com/"
on the address "https://site1.com/" the link becomes "https://site2.com/"
Just like omitting the host in '<a href="/path2">path2</a>' will inherit the scheme and the host:
on the address "http://site1.com/path1" the link becomes "http://site1.com/path2"
on the address "https://site2.com/path1" the link becomes "https://site2.com/path2"

Note: this was tested in Firefox 2 some time ago and if I recall correctly it also worked in all major browsers at the time. Things may have changed. But other than that the double slashes are quite useless and can be left out.--86.121.33.154 (talk) 08:41, 16 April 2010 (UTC)
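The reference resolution described in this comment can be checked outside a browser. The sketch below uses Python's standard `urllib.parse.urljoin`, which follows the RFC 3986 resolution rules; the site names are the placeholders from the comment above, and browser behaviour is assumed rather than tested here.

```python
from urllib.parse import urljoin

# A scheme-relative reference ("//host/...") inherits only the scheme.
print(urljoin("http://site1.com/", "//site2.com/"))   # http://site2.com/
print(urljoin("https://site1.com/", "//site2.com/"))  # https://site2.com/

# A path-absolute reference ("/path") inherits the scheme and the host.
print(urljoin("http://site1.com/path1", "/path2"))    # http://site1.com/path2
print(urljoin("https://site2.com/path1", "/path2"))   # https://site2.com/path2
```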

delimiters

Aren't there alternatives to the '?' and '&' in the query strings?

Also, do the delimiters around the username and password reduce the availability of those delimiters in the actual username/password? For example, can a password contain an '@' and/or ':'? --The preceding unsigned comment was added by Davidmaxwaterman (talkcontribs) 04:49, 9 May 2007 (UTC).

Yes, CGI authors should write code to accept the semicolon (';') as an alternative to the ampersand ('&') in a query string.
It appears that the URI scheme assumes that passwords do not contain '@' (the ':' is OK).
Would it improve this article to mention these things, or would it be better to only mention them at URI_scheme#Generic_syntax? --68.0.124.33 (talk) 02:55, 25 August 2008 (UTC)
I don't think there's any reason to mention them. This is an encyclopedia, not a tutorial. It would be okay to mention something about the acceptable characters though. — FatalError 06:08, 25 August 2008 (UTC)
Also, to answer the original question, you can use a different URL format as long as you write the code that interprets it as needed. The ? and & characters are standard, so they obviously have much more support, but you can use anything if you can write the code yourself. I, for example, don't use them in the first place; I use pretty URLs. :) — FatalError 06:10, 25 August 2008 (UTC)
I believe any reserved characters such as "@" that are needed should simply be URI encoded: "%40" for "@" and "%3A" for ":", for example. -- Joe (talk) 18:08, 4 October 2010 (UTC)
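As a rough illustration of the encoding suggested in the last comment, the sketch below percent-encodes a password before placing it in the userinfo part of a URL. The password, user name, and host are invented for the example, and `urllib.parse` is used only as one convenient implementation.

```python
from urllib.parse import quote, unquote, urlsplit

password = "p@ss:word"  # made-up password containing both delimiters
# safe="" forces every reserved character, including "@" and ":", to be encoded.
encoded = quote(password, safe="")
url = f"ftp://user:{encoded}@example.com/"

parts = urlsplit(url)
print(encoded)                  # p%40ss%3Aword
print(parts.hostname)           # example.com
print(unquote(parts.password))  # p@ss:word
```

Because the delimiters are encoded, the parser still finds the host after the only literal "@" in the authority.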

Is it okay?

Is it okay if we create URLs? Plus, I don't know how to create them. I'm creating a homepage using Publisher 2003.--  PNiddy  Go!  0 00:14, 27 June 2007 (UTC)

I don't know how to create them (and this is not the place to ask about that), but Wikipedia uses clean URLs. --Andrew Hampe Talk 17:03, 2 July 2007 (UTC)

'clean URLs with web services'

Do we need this section? The services suggested don't seem to obey the ideas mentioned in the list above and I don't think we gain anything by mentioning them. Robin 14:18, 12 July 2007 (UTC)

I agree, I really don't think they're needed. SQL(Query Me!) 10:02, 12 October 2007 (UTC)
OK, 'be bold' :) I've removed both the 'web services' section and following subsection. Robin 14:57, 16 October 2007 (UTC)
I like it :) Good work! SQL(Query Me!) 02:33, 19 October 2007 (UTC)

I don't know what this section was, but I came here from a link (from another article) to http://en.wikipedia.org/wiki/Uniform_Resource_Locator#Clean_URLs ... which is a section that no longer exists. Can't find it in the history, though. Put it back if you can, I want to read it.

My edit is here in the history. I don't think it deserves reinstatement, but by all means argue against me :) Which page linked through to that section? Robin (talk) —Preceding comment was added at 10:09, 20 March 2008 (UTC)
OK, I found a reference in Rewrite engine and removed it. If there are any more then feel free to edit. Robin (talk) 11:55, 9 June 2008 (UTC)

Fair use rationale for Image:Address Bar - Wikipedia.png


Image:Address Bar - Wikipedia.png is being used on this article. I notice the image page specifies that the image is being used under fair use but there is no explanation or rationale as to why its use in this Wikipedia article constitutes fair use. In addition to the boilerplate fair use template, you must also write out on the image description page a specific explanation or rationale for why using this image in each article is consistent with fair use.

Please go to the image description page and edit it to include a fair use rationale. Using one of the templates at Wikipedia:Fair use rationale guideline is an easy way to ensure that your image is in compliance with Wikipedia policy, but remember that you must complete the template. Do not simply insert a blank template on an image page.

If there is other fair use media, consider checking that you have specified the fair use rationale on the other images used on this page. Note that any fair use images uploaded after 4 May, 2006, and lacking such an explanation will be deleted one week after they have been uploaded, as described on criteria for speedy deletion. If you have any questions please ask them at the Media copyright questions page. Thank you.

BetacommandBot 20:30, 29 August 2007 (UTC)

Corrected. I replaced the rationale with a more bot-ready template. SQL(Query Me!) 10:01, 12 October 2007 (UTC)

URL vs URI

URLs and URIs are not the same thing, but this article acts as if they are. I think it needs to be rewritten, with only a sentence stating that URL is commonly used as a synonym for URI. (And a source needs to be found for that statement; I know it's true but we need a reliable source to prove it.) The article currently treats them the same, only sometimes talking about "its current strict technical meaning". For example, the article says, "Every URI (and therefore, every URL)." This article isn't about URIs, it's about URLs; URI has its own article. It doesn't matter what the term "URL" is commonly used as, this is an encyclopedia, not a blog post. Please keep that in mind before adding more content. Thank you. — FatalError 21:34, 19 June 2008 (UTC)

I came to this article looking for the answer to a simple question: In "http://www.site.com/folder/page.html", is the URL "www.site.com", or is it the whole thing? What is the URI there? What is the URN? I think that is the most essential question this article should answer, and it fails at that. The definitions are simply not clear and to the point. A good example such as the one I put above, explaining which part of the string is the URL, which one is the URI and which is the URN, would make the whole article much clearer. 71.197.183.128 (talk) 06:25, 11 June 2011 (UTC)
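For what it is worth, here is how a standard parser splits the example string. In RFC 3986 terms the whole string is the URI (and, informally, the URL), while "www.site.com" is only its host; the string is not a URN. The sketch assumes Python's `urllib.parse` as the parser.

```python
from urllib.parse import urlsplit

url = "http://www.site.com/folder/page.html"
parts = urlsplit(url)

print(parts.scheme)  # http
print(parts.netloc)  # www.site.com
print(parts.path)    # /folder/page.html
```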

Country codes

I came to this article looking for information on what the country codes in URLs are (eg. .au for australia). That presumably is a separate article, but I would expect it to be linked from here. Thanks. --Irrevenant [ talk ] 20:58, 5 January 2009 (UTC)

  • The article in question is at: Country code top-level domain. I haven't added it because there are a number of inter-related articles and I don't understand the structure well enough to tinker with it. Top-level domains (.gov, .org, etc.) should probably also be touched on. --Irrevenant [ talk ] 21:09, 5 January 2009 (UTC)

Absolute versus relative URLs

A short paragraph on relative versus absolute urls would be a helpful addition to this URLs page. Mention of the "fragment URLs" found in web pages would also be nice — cf W3C: HTML 4.0 Specification. Page Notes (talk) 19:47, 7 February 2009 (UTC)


I think this section now exists, but it says that a URI points to a "file", whereas it points to a resource, which might be a file but is not necessarily one. Also, the URI in the example (a .jpeg) might be generated by a web server from a blob in a database. --Preceding unsigned comment added by 151.65.44.21 (talk) 14:45, 23 January 2010 (UTC)
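A short sketch of the absolute/relative distinction requested in this thread, using Python's `urllib.parse`. The base URL and the relative references are invented for the example.

```python
from urllib.parse import urldefrag, urljoin

base = "http://example.com/a/b.html"  # hypothetical absolute URL of a page

# Relative references are resolved against the base URL.
print(urljoin(base, "c.html"))     # http://example.com/a/c.html
print(urljoin(base, "../d.html"))  # http://example.com/d.html

# A fragment-only reference keeps the whole base and adds the fragment.
with_fragment = urljoin(base, "#sec")
print(with_fragment)               # http://example.com/a/b.html#sec

# urldefrag() separates the fragment from the rest again.
page, fragment = urldefrag(with_fragment)
print(page, fragment)              # http://example.com/a/b.html sec
```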

History

It might be nice to know some of the history about the URL. I will add some information about this in the coming days. BCP5023 (talk) 17:22, 2 March 2009 (UTC)

Syntax

With the section on syntax, it would be nice to see a brief description of each part of the URL. We can combine the Internet Hostnames section with the syntax. I will update this in the coming days, but I wanted others' opinions. BCP5023 (talk) 17:22, 2 March 2009 (UTC)

Syntax: a bit more clarification about the encoding of a URL

I came to this article looking for a clarification of the way text is encoded in a URL (eg: the way spaces are encoded as "%20") Is there a name for this encoding? This must be part of the syntax definition. —Preceding unsigned comment added by Stib (talkcontribs) 23:08, 9 March 2009 (UTC)

Encoding is also relevant when another language is used, for example Chinese characters. Does it always use UTF-8, or something else? Jackzhp (talk) 02:46, 23 July 2010 (UTC)
It's indeed part of the syntax definition in RFC 3986. It's called percent-encoding there. (See RFC 3986, section 2.1) Jaho (talk) 05:14, 11 January 2011 (UTC)
From memory, aren't there slightly different rules for encoding characters in the hierarchical part of the URL compared to the query part? For example + rather than %20 for spaces, and different allowed characters in the data parts of the query string? Can't find a good ref at the moment, but I remember writing different code to do the two encodings/decodings some years ago. --Nigelj (talk) 10:15, 11 January 2011 (UTC)

I got asked this very question today by a student. How are Arabic or Hebrew encoded in a URL/URI? Are they converted into Latin characters somewhere in DNS? How are languages standardized to fit inside this very specific syntax? Might be a good new section? Rousseaua001 (talk) 09:50, 22 August 2013 (UTC)

URIs (the actual character strings that get sent over the wire in an HTTP message) can only contain characters from a subset of US-ASCII. IRIs (Internationalized Resource Identifiers) may contain characters that are not allowed in URIs (like Arabic or Hebrew); those characters should be encoded in UTF-8, and the resulting byte string should then be percent-encoded. There's some more detail here. Klortho (talk) 19:02, 24 August 2013 (UTC)
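
The two steps described above (UTF-8 first, then percent-encoding) can be sketched as follows with Python's standard library. The helper name iri_segment_to_uri_segment is hypothetical, and the sketch covers only a single path segment, not host names.

```python
from urllib.parse import quote, unquote

def iri_segment_to_uri_segment(segment: str) -> str:
    """Convert one IRI path segment to its ASCII-only URI form."""
    # Step 1: encode the characters as UTF-8 bytes.
    utf8_bytes = segment.encode("utf-8")
    # Step 2: percent-encode every byte that is not an unreserved character.
    return quote(utf8_bytes, safe="")

# The Hebrew word "shalom", written with escapes to avoid bidi display issues.
hebrew = "\u05e9\u05dc\u05d5\u05dd"

encoded = iri_segment_to_uri_segment(hebrew)
print(encoded)  # → %D7%A9%D7%9C%D7%95%D7%9D

# Decoding reverses both steps (unquote assumes UTF-8 by default).
assert unquote(encoded) == hebrew
```

Each Hebrew letter takes two bytes in UTF-8, so every character becomes two %HH triplets.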

Bad reference in History section?[edit]

There's a reference in the History section to "World Wide Web History" that appears to be pointing to a blog that has just copied and pasted much of the content from the Wikipedia "World Wide Web" entry and doesn't add any value. Additionally, it's poorly formatted and the link does not point directly to the article. Can someone provide a better reference? Right now it seems like a spam link to get people to go to the mrfweb.wordpress.com blog. — op12 22:02, 21 July 2009 (UTC)

Berners-Lee regrets the use of dots[edit]

I would like to see a citation on this. In particular, I think the "dotless" URL is ambiguous and inefficient, and I doubt that Berners-Lee would suggest this.

Dotless is ambiguous[edit]

For example, the dotless URL:

  • http:com/serverroute/www/path/to/file.html

could be any one of these:

  • http://serverroute.com/www/path/to/file.html
  • http://www.serverroute.com/path/to/file.html
  • http://path.www.serverroute.com/to/file.html
  • http://to.path.www.serverroute.com/file.html

To resolve this ambiguity, the implementation would have to be much less efficient than it is today. See below.
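
The ambiguity can be made concrete with a short sketch that enumerates every way of splitting a dotless URL into a host part and a path part. The function name dotless_interpretations is hypothetical, and the sketch assumes that the host has at least two labels and that at least one segment is left over for the path, which matches the four readings listed above.

```python
def dotless_interpretations(dotless_url: str) -> list[str]:
    """List the conventional URLs a dotless URL could stand for."""
    scheme, _, rest = dotless_url.partition(":")
    segments = rest.split("/")
    candidates = []
    # Try every split point: the first host_len segments form the host
    # (in reverse order), the remaining segments form the path.
    for host_len in range(2, len(segments)):
        host = ".".join(reversed(segments[:host_len]))
        path = "/".join(segments[host_len:])
        candidates.append(f"{scheme}://{host}/{path}")
    return candidates

for url in dotless_interpretations("http:com/serverroute/www/path/to/file.html"):
    print(url)
```

Nothing in the string itself tells the browser which of the printed candidates is meant.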

Dotless is inefficient[edit]

The browser implementation would also have to make multiple queries just to get the IP address. In the example above, the browser has to

# Do an NS (nameserver) query on "com". This succeeds with the name and address of a nameserver for com
# Do an NS query on "serverroute.com". If this fails, it has to do an A (address) query on "serverroute.com"
# Do an NS query on "www.serverroute.com". If this fails, do an A query on "www.serverroute.com"

I realize things can be cached, but the DNS does the caching. In this implementation, the browser has to do the caching as well. An A (address) query on "www.serverroute.com" is much simpler and more efficient than the iterative mechanism the dotless form would require. There is no way to tell, just by looking at a dotless URL, which part refers to a hostname and which part is a file.

BruceBarnett (talk) 16:01, 2 December 2009 (UTC)


I think the reason for the inefficiency is that DNS was built with the idea that you would know the full name for the A query instead of querying each part separately. If you choose the simplified version it makes more sense, because the path to a file need not be domain dependent, and the DNS querying would be done just as you suggested, with caching making it ultimately efficient enough. A more pressing problem could be that the following parts of the address (URI) are lost: username, password and port. Of course the port could be inferred from the scheme, and the username and password can be sent in the query portion and are virtually unused anyway, so a simpler scheme would benefit everyone.

Currently this could be implemented along with the current protocol by defining an alias scheme, like web, as in: web:com/serverroute/www/path/to/file.html

The link http://www.imdb.com/chart/top would become web:com/imdb/chart/top

I think this is easier to read and interpret (especially for ordinary users), shows a clear path to the file and is address, IP and device agnostic (meaning you don't care about the physical location of the file, or the domain name assigned to the IP assigned to the computer holding the file). And I think that's the way it should be, giving flexibility to the developer and ease of use to the user. --86.121.33.154 (talk) 09:01, 16 April 2010 (UTC)

Wrong Capitalization[edit]

Just because a popular term has an initialism doesn't mean it should be capitalized. As to the other case requiring capitalization, the proper noun, I don't think this is a proper noun, and as such, I move that the redirect at "Uniform resource locator" be removed and this article should be moved to that title/location. Of course the biggest fly in that ointment is that there are just tons of things which link to the capitalized version of the title, all of which should be updated. -- Joe (talk) 18:23, 4 October 2010 (UTC)

Bare URL - What are those?[edit]

I was directed to the URL page from "Bare URL" [found in the Link rot page], and I haven't found anything related to "Bare URL" - can we include this, or link things correctly?

70.50.4.45 (talk) 22:29, 28 January 2011 (UTC)

I think WP:LINKROT is saying that <ref>http://www.example.com/some/page</ref> is not a good reference because it contains only a "bare" URL (that is, the only information is the URL itself). A better reference would use one of the citation templates (see {{citation}}) and would mention the title, author, and so on. Then, if the link later fails to work, other editors have a hope of looking for a new link, or they at least have some way of identifying the original reference. Johnuniq (talk) 22:45, 28 January 2011 (UTC)

URI, URL, URN diagram change[edit]

Hi, just noticed that this article is using a diagram that's been replaced on the URI article per Talk:Uniform Resource Identifier#Misleading Venn Diagram in a different way. To be consistent, I'd recommend updating it. But it's not like it got a lot of discussion on the URI page so I'll give it the same opportunity here as I did there. Read my comments there for the reasoning. Short version: could someone show me an example of a URI that is neither a URL nor a URN? The diagram suggests they exist but the URI article text suggests they don't.

--Qwerty0 (talk) 20:58, 12 April 2011 (UTC)

Hearing no objections (for half a year) I'm replacing the diagram per these two discussions:
--Qwerty0 (talk) 22:04, 26 September 2011 (UTC)

Character set[edit]

There is no mention of what character set is used to encode URLs. Perhaps ISO/IEC 646 with a few additional characters? It would also be useful to mention how international (accented) characters are encoded using the %HH representation. — Loadmaster (talk) 23:45, 28 July 2011 (UTC)

Requested move[edit]

The following discussion is an archived discussion of a requested move. Please do not modify it. Subsequent comments should be made in a new section on the talk page. No further edits should be made to this section.

The result of the move request was: page moved. Vegaswikian (talk) 06:42, 1 December 2011 (UTC)



Uniform Resource Locator → Uniform resource locator

Like uniform resource identifier, although this might be a specific thing in a particular context (in this case a string of characters), it's still generic: there are many of them, as a class of things. Per WP:MOSCAPS ("Wikipedia avoids unnecessary capitalization") and WP:TITLE, this is a generic, common term, not a proprietary or commercial term, so the article title should be downcased. In addition, WP:MOSCAPS says that a compound item should not be upper-cased just because it is abbreviated with caps. Lowercase will match the formatting of related article titles. Tony (talk) 13:16, 24 November 2011 (UTC)

  • Support – clearly generic; appears in uppercase mostly in defining the acronym URL, and more often lowercase elsewhere. Dicklyon (talk) 15:28, 24 November 2011 (UTC)
The above discussion is preserved as an archive of a requested move. Please do not modify it. Subsequent comments should be made in a new section on this talk page. No further edits should be made to this section.

Church enquiry[edit]

We need to know the email address of an executive person in order to contact the church. Lalhminthanga Win Naing lhthanga@gmail.com Myanmar (previously known as Burma) — Preceding unsigned comment added by 61.4.72.107 (talk) 03:36, 1 July 2013 (UTC)

Change wording of section on Syntax[edit]

I think the wording of the sentence "Passwords embedded in this way are not conducive to secure working, but the full possible syntax is..." should be changed to "Passwords embedded in this way are not conducive to security, but the full possible syntax is..." (emphasis mine). I made this change, but it was undone here: http://en.wikipedia.org/w/index.php?title=Uniform_resource_locator&diff=563997234&oldid=561938680 Is there any explanation available for the reasoning behind the current wording, or the undo? --Mount Flatten (talk) 19:40, 12 July 2013 (UTC)

I agree, and reinstated your change. Klortho (talk) 19:37, 18 July 2013 (UTC)
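
As a side note on why embedded passwords are a security concern: any code that parses the URL can read the credentials directly. A minimal sketch with Python's standard urllib.parse follows; the URL, user name and password are invented, and strip_userinfo is a hypothetical helper (it does not handle IPv6 literals, since hostname drops the brackets).

```python
from urllib.parse import urlsplit

# Hypothetical URL using the full syntax, including user name and password.
url = "ftp://alice:s3cret@ftp.example.com:21/pub/file.txt"
parts = urlsplit(url)

# The credentials are available in plain text to anything that sees the URL.
print(parts.username, parts.password, parts.hostname, parts.port)

def strip_userinfo(url: str) -> str:
    """Return the URL with any user name and password removed."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if parts.port is not None:
        host += f":{parts.port}"
    return parts._replace(netloc=host).geturl()

print(strip_userinfo(url))
```

Stripping the userinfo before logging or displaying a URL is one common precaution.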

Representation of URI scheme in article[edit]

This article references specific URI schemes (standing alone; not in a URI itself) in a few places using representations such as "http:" and "ftp://", as though the : or // delimiters are part of the scheme name. This isn't a proper way to represent scheme names. These delimiters should only appear in the context of a URI. I'm going to remove the delimiters included in standalone representations of scheme names and display them in a fixed width typeface unless anyone has any objections. 128.205.39.40 (talk) 23:22, 11 March 2014 (UTC)
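
For what it's worth, standard URL parsers treat the scheme name the same way, without the delimiters. A quick sketch with Python's urllib.parse, using made-up URLs:

```python
from urllib.parse import urlsplit

# Hypothetical example URLs; only the scheme component is of interest here.
examples = [
    "http://www.example.com/index.html",
    "ftp://ftp.example.com/pub/file.txt",
    "mailto:someone@example.com",
]

# The parsed scheme contains neither ":" nor "//".
schemes = [urlsplit(url).scheme for url in examples]
print(schemes)  # → ['http', 'ftp', 'mailto']
```

The mailto example also shows that "//" is not part of every URI, so it cannot belong to the scheme name.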