URL shortening
From Wikipedia, the free encyclopedia
URL shortening is a technique on the World Wide Web where a provider makes a web page available under a very short URL in addition to the original address. For example, the page http://en.wikipedia.org/w/index.php?title=TinyURL&diff=283621022&oldid=283308287 can be shortened to http://tinyurl.com/mmw6lb.
Purposes
There are several reasons to use URL shortening:
Avoid URL garbling
Web developers often pass descriptive attributes in the URL to represent data hierarchies, command structures, transaction paths and session information. The resulting URL can be aesthetically unpleasant and difficult to remember. A URL that is hundreds of characters long can also become garbled when it is copied. A short URL is therefore easier to paste into an e-mail message or a forum post.
Use the smallest space possible
On Twitter or in an instant messaging status, even a 60-character URL can be too long. A URL shortener can produce short URLs such as http://br.st/74 (15 characters), http://vork.us/go/bj28 (22 characters), http://tinyurl.com/gf65th (25 characters), http://ni2.in/j (15 characters), http://tr.im/o65Tg (18 characters), http://x2t.com/88 (17 characters) or http://➡.ws/⦅剝 (14 characters). On Twitter, a URL that begins with http://www can be written starting from www, which saves 7 characters.
Reading aloud
Any URL shortening service makes a URL easier to read aloud. However, services that let the user choose the short URL are better suited to this task.
Manipulating visitors
URL shortening is a special kind of URL redirection, which is sometimes used in pranks, phishing, or affiliate hiding. For example, tinyurl.com/ha56k0k redirects to goatse.cx as a prank. More recently, some of these services (br.st in particular) have started filtering all shortened links through services like Google Safe Browsing.
Techniques
Every long URL is associated with a key, which is the part after http://domain.tld/. For example, http://tinyurl.com/m3q2xt has a key of m3q2xt.
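As a small illustration of this definition, the key can be read off a short URL by taking everything after the slash that follows the host name. This is only a sketch; the function name `extract_key` is hypothetical and not part of any shortening service.

```python
from urllib.parse import urlsplit


def extract_key(short_url: str) -> str:
    """Return the key of a short URL: its path without the leading slash."""
    return urlsplit(short_url).path.lstrip("/")


print(extract_key("http://tinyurl.com/m3q2xt"))  # m3q2xt
```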
There are several techniques for generating keys:
- Keys can be generated sequentially as numbers written in base 36, using 26 letters and 10 digits. The keys in order would be 0, 1, 2, ..., 9, a, b, ..., z, 10, 11, and so on. If both uppercase and lowercase letters are accepted, the number is written in base 62 (26 + 26 + 10).
- A hash function or a random number generator can be used so that the key sequence is not predictable.
- Users can propose their own keys. For example, http://en.wikipedia.org/w/index.php?title=TinyURL&diff=283621022&oldid=283308287 can be shortened to http://tinyurl.com/w1k1t1ny.
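The first two techniques in the list above can be sketched as follows. This is a minimal illustration, not the implementation of any particular service: the names `encode_key` and `random_key`, and the idea that the number being encoded is a sequential database row ID, are assumptions made for the example.

```python
import secrets
import string

BASE36 = string.digits + string.ascii_lowercase   # 0-9, a-z (10 + 26 symbols)
BASE62 = BASE36 + string.ascii_uppercase          # adds A-Z (10 + 26 + 26 symbols)


def encode_key(number: int, alphabet: str = BASE36) -> str:
    """Write a non-negative integer (e.g. a sequential ID) in the given alphabet."""
    if number < 0:
        raise ValueError("number must be non-negative")
    base = len(alphabet)
    if number == 0:
        return alphabet[0]
    digits = []
    while number > 0:
        number, remainder = divmod(number, base)
        digits.append(alphabet[remainder])
    return "".join(reversed(digits))


def random_key(length: int, alphabet: str = BASE62) -> str:
    """Draw an unpredictable key of the given length, one random symbol at a time."""
    return "".join(secrets.choice(alphabet) for _ in range(length))


print(encode_key(35))          # z
print(encode_key(36))          # 10
print(encode_key(62, BASE62))  # 10
print(random_key(6))           # six random symbols; differs on every run
```

A real service using random keys would also need to check that a newly drawn key is not already in use, which this sketch omits.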
For redirection techniques see URL redirection.
History
The first notable URL shortening service, TinyURL, was launched in 2002; however, the idea dates to at least 2001.[1]
The popularity of TinyURLs influenced the creation of at least 100 similar websites.[2] Most are simply domain alternatives.
Initially, Twitter automatically shortened long URLs using TinyURL. As of 2009, it uses bit.ly.[3]
In May 2009, .tk, which had previously been used to generate memorable domains via URL redirection, launched Tweak.tk,[4] which generates very short URLs such as http://mxtux.tk.
On 10 August 2009, a notice on the tr.im shortening service home page announced that "[s]tatistics can no longer be considered reliable, or reliably available going forward" and that the service would stop generating new shortened URLs, but assured users that existing tr.im short URLs would "continue to redirect, and will do so until at least December 31, 2009".[5] A blog post on the site attributed this move to several factors, including the lack of suitable revenue generation mechanisms to cover ongoing hosting and maintenance costs, lack of interest among possible purchasers of the service, and Twitter's default use of the bit.ly shortener.[6] The blog post also questioned whether other shortening services could successfully monetize URL shortening in the longer term. A few days later, tr.im reversed this decision, announcing it would resume all operations "going forward, indefinitely, while we continue to consider our options in regards to tr.im’s future".[7]
Criticism
The convenience offered by URL shortening also introduces potential problems, which have led to criticism of the use of these services.
Short URLs are subject to link rot: if the service stops working, all URLs that depend on it become broken. One mitigation is to save the destination URLs of potentially useful redirects locally, or to periodically download the redirect database as a backup, if the service permits that.
Users may be exposed to privacy issues in that the link shortening service is in a position to track a user's behaviour across many domains.
Short URLs add a layer of indirection: every access requires additional requests (at least one more DNS lookup and one more HTTP request).
A short URL obscures the target address, and as a result it is sometimes used to redirect to an unexpected site. Examples include rickrolling and redirection to scam sites, affiliate websites, or shock sites. ZoneAlarm has given the warning "TinyURL may be unsafe. This website has been known to distribute spyware." TinyURL has countered this problem by offering an option to show a preview of the destination link instead of redirecting immediately.[8] Even if a link was not created with a preview, the preview can still be accessed by prefixing the word "preview" to the host name (for example, http://tinyurl.com/8kmfp can be retyped as http://preview.tinyurl.com/8kmfp) to see where the link will lead. This opaqueness is also exploited by spammers, who use such links in spam to bypass URL blacklists. TinyURL, in turn, disables spam-related links from redirecting.[9]
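The "preview" rewrite described above is a purely textual change to the host name, so it can be sketched without contacting the service. The function name `tinyurl_preview` is hypothetical, and the sketch assumes the host does not already carry the "preview." prefix handling beyond the simple check shown.

```python
from urllib.parse import urlsplit, urlunsplit


def tinyurl_preview(short_url: str) -> str:
    """Prefix "preview." to the host name of a TinyURL link, leaving the key unchanged."""
    parts = urlsplit(short_url)
    host = parts.netloc
    if not host.startswith("preview."):
        host = "preview." + host
    return urlunsplit(parts._replace(netloc=host))


print(tinyurl_preview("http://tinyurl.com/8kmfp"))  # http://preview.tinyurl.com/8kmfp
```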
Some websites have responded by blocking short redirected URLs from being posted:
- In 2006, MySpace banned posting TinyURLs.[citation needed]
- Yahoo! Answers blocks postings that contain TinyURLs.[citation needed]
- The Orkut social network recently suppressed all TinyURL addresses.[citation needed]
- The Twitter network recently replaced TinyURL with Bit.ly as its default shortener of links longer than 26 characters.[10]
- Panera Bread blocks access to TinyURL within its free Wi-Fi network.[citation needed]
Security professionals also advise users to always preview a short URL before accessing it, especially after the shortener service cli.gs was hacked, exposing millions of users.[11]
Theoretical limits
If every web page were to be shortened, short URLs would have to become longer and longer, because the number of keys of a given length is finite. For example, a site that uses lowercase letters and digits for its keys (such as TinyURL) can have $(26 + 10)^6$ URLs with keys of exactly 6 characters, i.e. more than 2 billion pages. This number is very large but not infinite.
Let r be the size of the alphabet used by the URL shortener, and n the length of the longest desirable key (the part after http://domain.com/). Then the total number of keys of length n or shorter is[12]
$$\sum_{k=1}^{n} r^k = \frac{r^{n+1} - r}{r - 1}$$
For r = 36 and n = 6 (current values for TinyURL) the result is $2.24 \times 10^9$.
To achieve even shorter URLs, tr.im uses both uppercase and lowercase letters, giving r = 26 + 26 + 10 = 62. Currently n = 4 (e.g. http://tr.im/oaG5); in this case the result is $1.5 \times 10^7$, i.e. 15 million.
Bit.ly also uses uppercase and lowercase letters and currently has n = 6 (e.g. http://bit.ly/11ozU3). This gives $5.68 \times 10^{10}$ keys of exactly 6 characters (almost 57 billion), or about $5.77 \times 10^{10}$ keys of 6 characters or fewer.
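The figures in this section can be checked by summing the number of keys of each length from 1 to n, which is a geometric progression in r. The sketch below does this both directly and with the closed form $(r^{n+1} - r)/(r - 1)$; the function names are chosen for this example only.

```python
def total_keys(r: int, n: int) -> int:
    """Count keys of length 1 to n over an alphabet of r symbols, term by term."""
    return sum(r ** k for k in range(1, n + 1))


def total_keys_closed_form(r: int, n: int) -> int:
    """Same count using the closed form of the geometric sum (requires r > 1)."""
    return (r ** (n + 1) - r) // (r - 1)


print(total_keys(36, 6))  # 2238976116   (about 2.24 * 10**9)
print(total_keys(62, 4))  # 15018570     (about 1.5 * 10**7)
print(62 ** 6)            # 56800235584  (keys of exactly 6 characters)
print(total_keys(62, 6))  # 57731386986  (keys of 1 to 6 characters)
```

The last two lines show the difference between counting only keys of the maximum length and counting all keys up to that length.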
TinyArro.ws and Doiop allow Unicode characters, so they can create a very large number of links for small values of n. For example, assuming r = 100,000 Unicode characters, n = 2 already gives TinyArro.ws about 10 billion possible keys. Doiop, on the other hand, uses Unicode only for user-chosen keys, so an available short key must be searched for. Because most people probably use ASCII characters in user-chosen Doiop keys, most Unicode characters remain available. An example of a randomly generated TinyArro.ws URL is http://➡.ws/⦅剝 (14 characters). As an example of what is probably the general case, in June 2009 Doiop had http://doiop.com/剝 (18 characters) available.
While using Unicode allows URL shortening services such as TinyArro.ws to represent a large number of links with a small number of additional characters, each Unicode character may require 2, 3 or even more bytes to store and transmit, so in some applications there is no gain from using Unicode shortening.
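The difference between character count and byte count can be seen by encoding a URL as UTF-8. The sketch below assumes UTF-8 is the storage or transmission encoding, which is an assumption of this example rather than something the services are known to use; the function name `url_sizes` is hypothetical.

```python
def url_sizes(url: str) -> tuple[int, int]:
    """Return (number of characters, number of bytes when encoded as UTF-8)."""
    return len(url), len(url.encode("utf-8"))


print(url_sizes("http://tr.im/o65Tg"))   # (18, 18): ASCII only, one byte per character
print(url_sizes("http://doiop.com/剝"))  # (18, 20): the single CJK character takes 3 bytes
```

Both URLs have 18 characters, but the one containing a non-ASCII character is two bytes longer once encoded.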
References
- [1] Comment thread 8916, Metafilter.com, 10 June 2001. Announcement of URL shortening service available at makeashorterlink.com.
- [2] URL Shortening Services, http://code.google.com/p/shortenurl/wiki/URLShorteningServices
- [3] Bit.ly Eclipses TinyURL on Twitter
- [4] http://twitter.com/TweaKdotTK/status/1834883583
- [5] tr.im
- [6] blog.tr.im/post/159369789/tr-im-r-i-p
- [7] blog.tr.im/post/160697842/tr-im-resurrected
- [8] Preview a URL feature
- [9] Spam Spotted Using TinyURL, Brian Krebs
- [10] Bit.ly Eclipses TinyURL on Twitter
- [11] cli.gs blog, blog.cli.gs/news/cligs-got-hacked-restoration-from-backup-started
- [12] Geometric progression
