URL shortening

From Wikipedia, the free encyclopedia

Revision as of 21:09, 8 September 2009

URL shortening is a technique on the World Wide Web where a provider makes a web page available under a very short URL in addition to the original address. For example, the page http://en.wikipedia.org/w/index.php?title=TinyURL&diff=283621022&oldid=283308287 can be shortened to http://tinyurl.com/mmw6lb.

Purposes

There are several reasons to use URL shortening:

Avoid URL garbling

Web developers tend to pass descriptive attributes in the URL to represent data hierarchies, command structures, transaction paths and session information. This may result in a URL that is aesthetically unpleasant and difficult to remember. A URL that is hundreds of characters long can also become garbled when it is copied. A short URL is then easier to paste into an e-mail message or a forum post.

Use the smallest space possible

On Twitter or in an instant-message status, even a 60-character URL can be too long. A URL shortener can produce short URLs such as http://br.st/74 (15 characters), http://haxurl.com/jI50 (22 characters), http://tinyurl.com/gf65tg (25 characters), http://tr.im/o65Tg (18 characters), http://x2t.com/88 (17 characters), http://gf65tg.tk (16 characters) or http://➡.ws/⦅剝 (14 characters). On Twitter, http:// can be replaced by www., which saves 3 characters.

Reading aloud

Any URL shortening service makes a URL easier to read aloud. However, services that let the user choose the key are better suited to this task.

Manipulating visitors

URL shortening is a special kind of URL redirection, and it is sometimes used in pranks, phishing, or affiliate hiding. For example, tinyurl.com/ha56k0k redirects to goatse.cx as a prank. More recently, some of these services (http://br.st/ in particular) have started filtering all shortened links through services such as Google Safe Browsing.

Techniques

Every long URL is associated with a key, which is the part after http://domain.tld/. For example http://tinyurl.com/m3q2xt has a key of m3q2xt.
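As a small illustration of the key concept, the following Python sketch extracts the key from a short URL. The function name short_url_key is hypothetical and chosen only for this example; it assumes the key is simply the URL path without its leading slash.

```python
from urllib.parse import urlsplit


def short_url_key(url: str) -> str:
    """Return the key of a short URL: the part after http://domain.tld/.

    Assumption for this sketch: the key is the whole path, minus the
    leading slash.
    """
    return urlsplit(url).path.lstrip("/")


if __name__ == "__main__":
    print(short_url_key("http://tinyurl.com/m3q2xt"))  # prints m3q2xt
```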

There are several techniques to implement a shortening.

  • Keys can be generated numerically in base 36 assuming 26 letters and 10 numbers. The keys in order would be 0, 1, 2, ..., 9, a, b, ..., z, 00, 01, ..., 0z, 10, etc. If uppercase and lowercase letters are accepted then the number should be in base 62 (26 + 26 + 10).
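The key order listed above (0, 1, ..., z, 00, 01, ..., 0z, 10, ...) can be sketched in Python as follows. This is only an illustration under stated assumptions: the names ALPHABET_36, ALPHABET_62 and key_for_index are hypothetical, and the sketch does not claim to reproduce any particular service's implementation. Note that the order enumerates all one-character keys, then all two-character keys, and so on, which differs from ordinary base-36 counting, where "10" would directly follow "z".

```python
import string

# Assumed alphabets: 10 digits + 26 lowercase letters (36 symbols), and the
# same plus 26 uppercase letters (62 symbols).
ALPHABET_36 = string.digits + string.ascii_lowercase
ALPHABET_62 = ALPHABET_36 + string.ascii_uppercase


def key_for_index(index: int, alphabet: str = ALPHABET_36) -> str:
    """Return the index-th key (0-based) in the order 0, 1, ..., z, 00, 01, ..."""
    if index < 0:
        raise ValueError("index must be non-negative")
    base = len(alphabet)
    length = 1
    block = base  # number of keys that have exactly `length` characters
    # Skip over all blocks of shorter keys.
    while index >= block:
        index -= block
        length += 1
        block *= base
    # Write the remaining offset as a fixed-width number in the given base.
    chars = []
    for _ in range(length):
        index, digit = divmod(index, base)
        chars.append(alphabet[digit])
    return "".join(reversed(chars))


if __name__ == "__main__":
    # Indices around the boundaries between key lengths.
    for i in (0, 9, 10, 35, 36, 37, 71, 72):
        print(i, key_for_index(i))
```

Under these assumptions, index 35 maps to "z", index 36 to "00", index 71 to "0z" and index 72 to "10", matching the order given in the list.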

For redirection techniques see URL redirection.

History

The first notable URL shortening service, TinyURL, was launched in 2002; however, the idea dates to at least 2001.[1]

The popularity of TinyURLs influenced the creation of at least 100 similar websites.[2] Most are simply domain alternatives. However, some offer additional features:

  • NotifyURL sends an email when the link is first visited.
  • SnipURL introduces social bookmarking features such as usernames and RSS feeds.
  • DwarfURL generates statistics.
  • Adjix, XR.com and Linkbee are ad-supported models of URL shorteners that share the revenue with their users.[3]
  • om.ly[4] provides real time, multi-dimensional analysis of click-through statistics.
  • bit.ly offers free click-through statistics and charts.
  • br.st offers real-time statistics in the user's own time zone showing only human traffic, and all destination links are checked against Google Safe Browsing (updated every 30 minutes).
  • Digg offers a shortened URL which includes not just the target URL, but an iframed version that includes a set of Digg-related controls called the Digg bar.
  • Doiop allows the shortening to be selected by the user, and Unicode can be used to achieve really short URLs. For example http://doiop.com/Ⓦ redirects to Wikipedia's main page.
  • urlShort is an open source shortener project.

Initially Twitter automatically translated long URLs using TinyURL. As of 2009, bit.ly is used.[5]

In May 2009, .tk, which was previously used to generate memorable domains via URL redirection, launched Tweak.tk[6], which generates very short URLs such as http://mxtux.tk.

On 10 August 2009 a notice on the tr.im shortening service home page announced that "[s]tatistics can no longer be considered reliable, or reliably available going forward" and that they were shuttering the generation of new shortened URLs, but assured existing tr.im short URLs would "continue to redirect, and will do so until at least December 31, 2009".[7] A blog post on the site attributed this move to several factors, including the lack of suitable revenue generation mechanisms to cover ongoing hosting and maintenance costs, lack of interest among possible purchasers of the service, and Twitter's default use of the bit.ly shortener.[8] This blog post also questioned whether other shortening services can successfully monetize URL shortening in the longer term. A few days later tr.im reversed itself on this move, announcing it would resume all operations "going forward, indefinitely, while we continue to consider our options in regards to tr.im’s future". [9]

Criticism

The convenience offered by URL shortening also introduces potential problems, which have led to criticism of the use of these services.

Short URLs are subject to link rot: if the service stops working, all URLs that rely on it become broken. One solution is to save the destination URLs of potentially useful redirects locally, or, if the service permits it, to periodically download the redirect database as a backup.

Users may be exposed to privacy issues in that the link shortening service is in a position to track a user's behaviour across many domains.

Short URLs add a layer of indirection: every access requires additional requests (at least one more DNS lookup and one more HTTP request).

A short URL obscures the original address, and as a result it is sometimes used to redirect to an unexpected site. Examples of this are rickrolling and redirecting to scam websites, affiliate websites or shock sites; ZoneAlarm has given the warning "TinyURL may be unsafe. This website has been known to distribute spyware." TinyURL has countered this problem by offering an option to present a link to the destination instead of redirecting.[10] In addition, even if the link does not include a preview, the preview may still be accessed by prefixing the host name with "preview." (e.g. "http://tinyurl.com/8kmfp" could be retyped as "http://preview.tinyurl.com/8kmfp") to see where the link will lead. This opaqueness is also exploited by spammers, who use such links in spam to bypass URL blacklists. TinyURL, in turn, disables spam-related links from redirecting.[11]
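The manual retyping described above can be sketched in Python. This is an illustrative helper only: the function name tinyurl_preview is hypothetical, and the sketch assumes the preview form is obtained purely by prefixing the host tinyurl.com with "preview.", as in the example in the text. It only rewrites the string and performs no network access.

```python
from urllib.parse import urlsplit, urlunsplit


def tinyurl_preview(url: str) -> str:
    """Rewrite a tinyurl.com URL to its preview.tinyurl.com form.

    Assumption for this sketch: only the bare host "tinyurl.com" is
    handled; any other host is rejected.
    """
    parts = urlsplit(url)
    if parts.netloc.lower() != "tinyurl.com":
        raise ValueError("not a tinyurl.com URL: " + url)
    # Keep scheme, path, query and fragment; change only the host.
    return urlunsplit(parts._replace(netloc="preview." + parts.netloc))


if __name__ == "__main__":
    print(tinyurl_preview("http://tinyurl.com/8kmfp"))
    # prints http://preview.tinyurl.com/8kmfp
```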

Some websites have responded by blocking short redirected URLs from being posted. For example, Wikimedia's spam blacklist contains almost 500 URL shorteners, including doiop.com, dwarfurl.com and tr.im.

Security professionals also advise users to always preview short URLs before accessing them, especially after the shortener service cligs was hacked, exposing millions of users.

Theoretical limits

If every web page were to be shortened, then short URLs would have to become longer and longer. For example, a site which uses lowercase letters and numbers for its keys (such as TinyURL) can address more than 2 billion pages with keys of up to 6 characters. This number is very large but finite.

Let $c$ be the size of the alphabet used by the URL shortener, and $n$ the length of the longest desirable key (the part after http://domain.com/). Then the total number of short pages with keys of length $n$ or shorter is[12]

$$\sum_{k=1}^{n} c^k = \frac{c^{n+1} - c}{c - 1}$$

For $c = 36$ and $n = 6$ (current values for TinyURL) the result is $2.24 \times 10^9$.

To achieve even shorter URLs, tr.im uses uppercase and lowercase letters, with $c = 62$. Currently $n = 4$ (e.g. http://tr.im/oaG5); in this case the result is 1.5e7, i.e. 15 million.

Bit.ly also uses uppercase and lowercase letters and currently has n = 6 (e.g. http://bit.ly/11ozU3). This means it can serve up to 5.77e10 pages (almost 58 billion).
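The figures quoted for TinyURL, tr.im and bit.ly can be checked with a short Python sketch. The function name key_space and its parameter names are illustrative only. It counts every key of length 1 up to the maximum length over an alphabet of the given size, taking the alphabet sizes (36 or 62) and key lengths (6, 4 and 6) from the examples in the text.

```python
def key_space(alphabet_size: int, max_length: int) -> int:
    """Number of distinct keys of length 1..max_length over the alphabet."""
    return sum(alphabet_size ** k for k in range(1, max_length + 1))


if __name__ == "__main__":
    # (service, alphabet size, longest key length), as read from the text
    examples = [("TinyURL", 36, 6), ("tr.im", 62, 4), ("bit.ly", 62, 6)]
    for name, size, length in examples:
        print(f"{name}: {key_space(size, length):,}")
    # TinyURL: 2,238,976,116
    # tr.im: 15,018,570
    # bit.ly: 57,731,386,986
```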

TinyArro.ws and Doiop allow Unicode characters, so they can create a very large number of links for small values of the key length $n$. For example, with only $n = 2$, TinyArro.ws can generate billions of possible URL combinations: assuming an alphabet of about 100,000 Unicode characters ($c = 10^5$), the result is about 10 billion. Doiop, on the other hand, uses Unicode only for user-chosen keys, so a free short key must be searched for. Since most people probably use ASCII characters in user-chosen Doiop keys, most Unicode characters are still available. An example of a randomly generated TinyArro.ws URL is http://➡.ws/⦅剝 (14 characters). By comparison, in June 2009 Doiop still had http://doiop.com/剝 (18 characters) available, which is probably the typical case.

While using Unicode allows URL shortening services such as TinyArro.ws to represent a large number of links with a small number of additional characters, each Unicode character may require 2, 3 or even more bytes to store and transmit, so in some applications there is no gain to using Unicode shortening.
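The difference between character count and byte count can be made concrete with a short Python sketch. It assumes the URL is stored or transmitted as UTF-8, which is only one possible encoding, and the function name char_and_byte_length is hypothetical.

```python
from typing import Tuple


def char_and_byte_length(url: str) -> Tuple[int, int]:
    """Return (number of characters, number of UTF-8 bytes) for a URL."""
    return len(url), len(url.encode("utf-8"))


if __name__ == "__main__":
    # An ASCII-only short URL: one byte per character.
    print(char_and_byte_length("http://tinyurl.com/gf65tg"))  # (25, 25)
    # A short URL with one CJK character in the key: that character
    # takes 3 bytes in UTF-8.
    print(char_and_byte_length("http://doiop.com/剝"))  # (18, 20)
```

In this sketch the Doiop URL is 7 characters shorter than the TinyURL one but only 5 bytes shorter, so part of the apparent saving disappears once bytes are counted.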

References

  1. Comment thread 8916, Metafilter.com, 10 June 2001. Announcement of URL shortening service available at makeashorterlink.com.
  2. 90+ URL Shortening Services, Mashable.Com, 8 January 2008, page 84
  3. TinyURL with a (questionable) revenue model: Adjix and Linkbee, Cnet.Com, 22 August 2008
  4. http://objectivemarketer.com
  5. Bit.ly Eclipses TinyURL on Twitter
  6. http://twitter.com/TweaKdotTK/status/1834883583
  7. tr.im
  8. blog.tr.im/post/159369789/tr-im-r-i-p
  9. blog.tr.im/post/160697842/tr-im-resurrected
  10. Preview a URL feature
  11. Spam Spotted Using TinyURL, Brian Krebs
  12. Geometric progression