URL shortening

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 128.12.130.158 (talk) at 05:39, 25 February 2010 (→‎History). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

URL shortening is a technique on the World Wide Web where a provider makes a web page available under a very short URL in addition to the original address. For example, the page http://en.wikipedia.org/w/index.php?title=TinyURL&diff=283621022&oldid=283308287 can be shortened to http://tinyurl.com/mmw6lb.

Purposes

There are several reasons to use URL shortening:

Avoid URL garbling

Web developers often pass descriptive attributes in the URL to represent data hierarchies, command structures, transaction paths and session information. The resulting URL can be aesthetically unpleasant and difficult to remember. A URL that is hundreds of characters long is also easily garbled when it is copied, for instance when it is broken across several lines. A short URL is therefore easier to copy from an e-mail message or forum post.

Use the smallest space possible

On Twitter or in an instant-messaging status, even a 60-character URL can be too long. A URL shortener produces a much shorter equivalent.[citation needed]

Alternatively, the leading http:// can be dropped so that the URL begins with www, which saves 7 characters.[citation needed]

A QR Code that stores a URL can be made physically smaller (or more readable at the same size) by using a URL shortener to shorten the URL it encodes.[citation needed]

Reading aloud

Any URL shortening service makes a URL easier to read aloud, but services that let the user choose the key are better suited to this task. Some shortening services, such as 7les.com, generate URLs that are human-readable, though the resulting strings are longer than those generated by a length-optimized service.

Manipulating visitors

URL shortening is a special kind of URL redirection, which is sometimes used in pranks, phishing, or affiliate hiding. For example, tinyurl.com/ha56k0k is a prank link that redirects to the shock site goatse.cx. Some of these services (br.st, for example) have started filtering all shortened links through services like Google Safe Browsing.

Unsupported schemes

Most URI schemes are supported by URL shorteners, including http:, https:, ftp:, pop:, imap:, nntp:, news:, ldap:, gopher:, dict: and dns:.

However, data: and javascript: URLs are not typically supported for security reasons.
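A minimal sketch of such a scheme check follows. The function name `is_shortenable` and the `BLOCKED_SCHEMES` set are assumptions made for illustration; they are not taken from any particular shortening service, and a real service would probably apply further validation.

```python
from urllib.parse import urlsplit

# Schemes rejected for security reasons (illustrative list, taken from the text above).
BLOCKED_SCHEMES = {"data", "javascript"}


def is_shortenable(url: str) -> bool:
    """Return True if the URL has a scheme and that scheme is not blocked."""
    scheme = urlsplit(url).scheme.lower()
    return bool(scheme) and scheme not in BLOCKED_SCHEMES


print(is_shortenable("http://en.wikipedia.org/wiki/TinyURL"))  # True
print(is_shortenable("javascript:alert(1)"))                   # False
```

Rejecting URLs without any scheme is a design choice of this sketch; a service could instead assume http: for such input.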

Techniques

Every long URL is associated with a key, which is the part after http://domain.tld/. For example http://tinyurl.com/m3q2xt has a key of m3q2xt.
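As a small illustration, the key can be read off a short URL by taking its path without the leading slash. The helper name `short_url_key` is hypothetical.

```python
from urllib.parse import urlsplit


def short_url_key(short_url: str) -> str:
    """Return the key: the path of the short URL without its leading slash."""
    return urlsplit(short_url).path.lstrip("/")


print(short_url_key("http://tinyurl.com/m3q2xt"))  # m3q2xt
```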

There are several techniques for generating keys.

  • Keys can be generated as sequential numbers written in base 36, using 26 letters and 10 digits. The keys in order would be 0, 1, 2, ..., 9, a, b, ..., z, 10, 11, and so on. If uppercase and lowercase letters are distinguished, the number can instead be written in base 62 (26 + 26 + 10).
  • Users can propose their own keys. For example, http://en.wikipedia.org/w/index.php?title=TinyURL&diff=283621022&oldid=283308287 can be shortened to http://tinyurl.com/w1k1t1ny.
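The numeric technique can be sketched as a conversion between a sequence number and its base-36 or base-62 representation. This is only an illustration: the names `encode_key` and `decode_key` are hypothetical, and the digit order chosen for base 62 (digits, then lowercase, then uppercase) is an assumption, since each service may order its alphabet differently.

```python
import string

BASE36 = string.digits + string.ascii_lowercase  # 0-9 followed by a-z
BASE62 = BASE36 + string.ascii_uppercase         # assumed order: 0-9, a-z, A-Z


def encode_key(number: int, alphabet: str = BASE36) -> str:
    """Write a non-negative sequence number in the base given by the alphabet."""
    if number < 0:
        raise ValueError("number must be non-negative")
    base = len(alphabet)
    if number == 0:
        return alphabet[0]
    digits = []
    while number > 0:
        number, remainder = divmod(number, base)
        digits.append(alphabet[remainder])
    return "".join(reversed(digits))


def decode_key(key: str, alphabet: str = BASE36) -> int:
    """Convert a key back to the sequence number it encodes."""
    base = len(alphabet)
    number = 0
    for char in key:
        number = number * base + alphabet.index(char)
    return number


print(encode_key(35))          # z
print(encode_key(36))          # 10
print(encode_key(61, BASE62))  # Z
```

A service using this scheme would store the long URL under the sequence number and call `decode_key` on the key of an incoming request to look it up.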

For redirection techniques see URL redirection.

History

  • The first notable URL shortening service, TinyURL, was launched in 2002; however, the idea dates to at least 2001.[1]
  • The popularity of TinyURLs influenced the creation of at least 100 similar websites.[2] Most are simply alternative domains.
  • Initially Twitter automatically translated long URLs using TinyURL. As of 2009, the domain name bit.ly is used.[3]
  • In May 2009, .tk, which was previously used to generate memorable domains via URL redirection, launched Tweak.tk,[4] which generates very short URLs such as http://mxtux.tk.
  • On 10 August 2009, a notice on the tr.im shortening service home page announced that "[s]tatistics can no longer be considered reliable, or reliably available going forward" and that the service was shutting down the generation of new shortened URLs, but assured users that existing tr.im short URLs would "continue to redirect, and will do so until at least December 31, 2009".[5] A blog post on the site attributed this move to several factors, including the lack of suitable revenue generation mechanisms to cover ongoing hosting and maintenance costs, lack of interest among possible purchasers of the service, and Twitter's default use of the bit.ly shortener.[6] The blog post also questioned whether other shortening services can successfully monetize URL shortening in the longer term. A few days later tr.im reversed this decision, announcing it would resume all operations "going forward, indefinitely, while we continue to consider our options in regards to tr.im’s future".[7]
  • On 14 August 2009, WordPress announced a URL shortener for use when referring to any WordPress.com blog post.[8]
  • In November 2009, shortened links on bit.ly were accessed 2.1 billion times.[9] Around that time, bit.ly and TinyURL were the most widely used URL shortening services.[9]
  • A URL-shortening web service is hosted at the top-level domain of the nation of Tonga (http://to.). With only a top-level domain specified in its URL, it creates shorter URLs than most other services, but some programs do not properly handle a URL containing only a top-level domain.

Criticism and problems

The convenience offered by URL shortening also introduces potential problems, which have led to criticism of the use of these services.

Linkrot

Short URLs are subject to linkrot: if the service stops working, all URLs that depend on it become broken. The problem is aggravated by the concern that many existing URL shortening services may not have a sustainable business model in the long term, which was highlighted by the statement from tr.im in August 2009 (see above).[9] In fall 2009, the Internet Archive started the "301 Works" project, together with (initially) 20 collaborating companies, whose short URLs will be preserved by the project.[9]

Other issues

Users may be exposed to privacy issues in that the link shortening service is in a position to track a user's behaviour across many domains.

Short URLs add a layer of indirection: every access requires additional requests (at least one more DNS lookup and one more HTTP request).

A short URL obscures the target address, and as a result it is sometimes used to redirect to an unexpected site. Examples of this are rickrolling and redirecting to scam websites, affiliate websites or shock sites; ZoneAlarm has given the warning "TinyURL may be unsafe. This website has been known to distribute spyware." TinyURL has countered this problem by offering an option to show a preview of the destination link instead of redirecting immediately.[13] In addition, even if the link does not include a preview, the preview may still be accessed by prefixing "preview." to the host name of the URL (for example, http://tinyurl.com/8kmfp could be retyped as http://preview.tinyurl.com/8kmfp) to see where the link will lead. Opaqueness is also exploited by spammers, who use such links in spam to bypass URL blacklists. TinyURL, in turn, disables spam-related links from redirecting.[14]
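The preview rewrite described above can be sketched as a small URL transformation. The function name `tinyurl_preview` is hypothetical, and the sketch relies only on the "preview" host-name prefix described in the text; it makes no network request and does not verify how the service behaves today.

```python
from urllib.parse import urlsplit, urlunsplit


def tinyurl_preview(short_url: str) -> str:
    """Rewrite a tinyurl.com URL so that it points at the preview host."""
    parts = urlsplit(short_url)
    if parts.netloc.lower() != "tinyurl.com":
        raise ValueError("not a tinyurl.com URL")
    # Only the host changes; scheme, key (path), query and fragment are kept.
    return urlunsplit(parts._replace(netloc="preview." + parts.netloc))


print(tinyurl_preview("http://tinyurl.com/8kmfp"))
# http://preview.tinyurl.com/8kmfp
```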

Blocking

TinyURL is blocked in Saudi Arabia.[15]

Some websites have responded by blocking short redirected URLs from being posted:

Security professionals also advise users to always preview short URLs before following them, especially after the shortening service cli.gs was hacked, exposing millions of users.[18]

Theoretical limits

If every web page were to be given a short URL, the short URLs themselves would have to become longer and longer. For example, a site which uses lowercase letters and digits for its keys (such as TinyURL) can offer URLs with 6-character keys for more than 2 billion pages. This number is very large, but finite.

Let $s$ be the size of the alphabet used by the URL shortener, and $n$ the length of the longest desirable key (the part after http://domain.com/). Then the total number of short URLs with keys of length $n$ or shorter is[19]

$$\sum_{i=1}^{n} s^{i} = \frac{s^{n+1} - s}{s - 1}$$

For $s = 36$ and $n = 6$ (current values for TinyURL) the result is $2.24 \times 10^{9}$.

To achieve even shorter URLs tr.im uses uppercase and lowercase letters, with $s = 62$. Currently $n = 4$ (e.g. http://tr.im/oaG5); in this case the result is $1.5 \times 10^{7}$, i.e. 15 million.

Bit.ly also uses uppercase and lowercase letters and currently has $n = 6$ (e.g. http://bit.ly/11ozU3). This means it can serve $62^{6} \approx 5.68 \times 10^{10}$ pages (almost 57 billion) with six-character keys alone, or about $5.77 \times 10^{10}$ when shorter keys are counted as well.

TinyArro.ws and Doiop allow Unicode characters, so they can create a very large number of links for small values of $n$. Even for very small $n$, TinyArro.ws can generate billions of possible URL combinations; with a large enough Unicode alphabet, the result reaches 10 billion. Doiop, on the other hand, uses Unicode only for user-generated keys, so the user must search for short keys that are still free. Since most people probably use ASCII characters in user-generated Doiop URLs, most Unicode characters are still available. An example of a randomly generated TinyArro.ws URL is http://➡.ws/⦅剝 (13 characters). By contrast, as an example of what is probably the general case, in June 2009 Doiop had http://doiop.com/剝 (18 characters) available.
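The key-space figures quoted in this section can be checked with a short calculation. The sketch below sums the number of keys of each length from 1 up to the maximum length and compares the total with the closed form of the geometric progression; the function name `key_space` is an assumption made for illustration.

```python
def key_space(alphabet_size: int, max_length: int) -> int:
    """Count all keys of length 1..max_length over the given alphabet."""
    return sum(alphabet_size ** length for length in range(1, max_length + 1))


def key_space_closed_form(alphabet_size: int, max_length: int) -> int:
    """Same count via the closed form of the geometric progression."""
    return (alphabet_size ** (max_length + 1) - alphabet_size) // (alphabet_size - 1)


print(key_space(36, 6))  # 2238976116   (TinyURL, about 2.24 * 10^9)
print(key_space(62, 4))  # 15018570     (tr.im, about 1.5 * 10^7)
print(key_space(62, 6))  # 57731386986  (Bit.ly, keys of up to 6 characters)
print(62 ** 6)           # 56800235584  (Bit.ly, keys of exactly 6 characters)
```

The last two lines show that the figure of about 5.68 * 10^10 quoted for Bit.ly corresponds to keys of exactly six characters; including shorter keys raises the total to about 5.77 * 10^10.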

While using Unicode allows URL shortening services such as TinyArro.ws to represent a large number of links with a small number of additional characters, each Unicode character may require 2, 3 or even more bytes to store and transmit, so in some applications there is no gain to using Unicode shortening.
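The difference between character count and byte count can be illustrated by encoding a key as UTF-8, which is assumed here as the storage and transmission encoding. The helper name `utf8_size` is hypothetical; the two keys are taken from the examples in this article.

```python
def utf8_size(text: str) -> tuple[int, int]:
    """Return (number of characters, number of bytes when encoded as UTF-8)."""
    return len(text), len(text.encode("utf-8"))


print(utf8_size("m3q2xt"))  # (6, 6): six ASCII characters, one byte each
print(utf8_size("⦅剝"))     # (2, 6): two characters, three bytes each
```

In this case the two-character Unicode key occupies as many bytes as a six-character ASCII key, so the saving exists only where length is counted in characters.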

References

  1. Comment thread 8916, Metafilter.com, 10 June 2001. Announcement of URL shortening service available at makeashorterlink.com.
  2. URL Shortening Services
  3. Bit.ly Eclipses TinyURL on Twitter
  4. http://twitter.com/TweaKdotTK/status/1834883583
  5. tr.im
  6. blog.tr.im/post/159369789/tr-im-r-i-p
  7. blog.tr.im/post/160697842/tr-im-resurrected
  8. http://en.blog.wordpress.com/2009/08/14/shorten/
  9. Murad Ahmed: New project in scramble to save vanishing internet links, Times Online, December 7, 2009
  10. Google URL Shortener
  11. Making URLs shorter for Google Toolbar and FeedBurner
  12. http://youtube-global.blogspot.com/2009/12/make-way-for-youtube-links.html
  13. Preview a URL feature
  14. Spam Spotted Using TinyURL, Brian Krebs
  15. [1]
  16. Bit.ly Eclipses TinyURL on Twitter
  17. [2]
  18. blog.cli.gs/news/cligs-got-hacked-restoration-from-backup-started, cli.gs
  19. Geometric progression