Amazon S3

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Aednichols at 16:18, 16 February 2013 (→‎S3 API and competing services). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Amazon Simple Storage Service
[Image: An S3 Bucket with Objects]
Type of site: Online backup service
Available in: English
Owner: Amazon.com
URL: aws.amazon.com/s3/
IPv6 support: No
Commercial: Yes
Registration: Required

Amazon S3 (Simple Storage Service) is an online storage web service offered by Amazon Web Services. Amazon S3 provides storage through web services interfaces (REST, SOAP, and BitTorrent).[1] Amazon launched S3, its first publicly available web service, in the United States in March 2006[2] and in Europe in November 2007.[3]

At its inception, Amazon charged end users US$0.15 per gigabyte-month, with additional charges for bandwidth used in sending and receiving data, and a per-request (get or put) charge.[4] As of November 1, 2008, pricing moved to tiers where end users storing more than 50 terabytes receive discounted pricing.[5] Amazon claims that S3 uses the same scalable storage infrastructure that Amazon.com uses to run its own global e-commerce network.[6]

Amazon S3 is reported to store more than a trillion objects as of June 2012.[7] This is up from 102 billion objects as of March 2010,[8] 64 billion objects in August 2009,[9] 52 billion in March 2009,[10] 29 billion in October 2008,[5] 14 billion in January 2008, and 10 billion in October 2007.[11] S3 uses include web hosting, image hosting, and storage for backup systems. S3 comes with a 99.9% monthly uptime guarantee[12] which equates to approximately 43 minutes of downtime per month.[13]
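
The downtime figure follows from simple arithmetic on the uptime percentage. The sketch below reproduces it, assuming a 30-day month as in the cited calculation; the function name is illustrative only.

```python
def allowed_downtime_minutes(uptime_fraction: float, days: int = 30) -> float:
    """Minutes of downtime permitted over `days` days at the given uptime fraction."""
    minutes_in_period = 60 * 24 * days  # 43,200 minutes in a 30-day month
    return minutes_in_period * (1.0 - uptime_fraction)


# 99.9% uptime over an assumed 30-day month
print(round(allowed_downtime_minutes(0.999), 1))
```

A 31-day month would allow slightly more downtime, which is why the figure in the text is stated as approximate.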

Design

Details of S3's design are not made public by Amazon. According to Amazon, S3's design aims to provide scalability, high availability, and low latency at commodity costs.

S3 is designed to provide 99.999999999% durability and 99.99% availability of objects over a given year,[14] though there is no SLA for durability.
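
To give a sense of scale for the durability figure, the following sketch treats it as an independent per-object probability of surviving one year. That interpretation is a simplifying assumption made here for illustration, not a statement of how Amazon measures durability, and the function names are hypothetical.

```python
from decimal import Decimal

# 99.999999999% design target quoted above, read as a per-object annual
# survival probability (an assumption of this sketch).
DESIGNED_DURABILITY = Decimal("0.99999999999")


def expected_annual_losses(object_count: int) -> Decimal:
    """Expected number of objects lost per year under the simplified model."""
    return object_count * (Decimal(1) - DESIGNED_DURABILITY)


def years_per_expected_loss(object_count: int) -> Decimal:
    """Average number of years between single-object losses under the same model."""
    return Decimal(1) / expected_annual_losses(object_count)


print(expected_annual_losses(10_000))
print(years_per_expected_loss(10_000))
```

Under these assumptions, a customer storing 10,000 objects would expect to lose about one object every ten million years.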

S3 stores arbitrary objects (computer files) up to 5 terabytes in size, each accompanied by up to 2 kilobytes of metadata. Objects are organized into buckets (each owned by an Amazon Web Services or AWS account), and identified within each bucket by a unique, user-assigned key. Amazon Machine Images (AMIs) which are modified in the Elastic Compute Cloud (EC2) can be exported to S3 as bundles.[15]

Buckets and objects can be created, listed, and retrieved using either a REST-style HTTP interface or a SOAP interface. Additionally, objects can be downloaded using the HTTP GET interface and the BitTorrent protocol.

Requests are authorized using an access control list associated with each bucket and object.

Bucket names and keys are chosen so that objects are addressable using HTTP URLs:

  • http://s3.amazonaws.com/bucket/key
  • http://bucket.s3.amazonaws.com/key
  • http://bucket/key (where bucket is a DNS CNAME record pointing to bucket.s3.amazonaws.com)
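
The three addressing forms above can be sketched as small helper functions. The function names are illustrative and not part of any Amazon library; keys are percent-encoded so that characters such as spaces remain valid in a URL.

```python
from urllib.parse import quote

S3_ENDPOINT = "s3.amazonaws.com"


def path_style_url(bucket: str, key: str) -> str:
    """Bucket appears as the first path segment."""
    return f"http://{S3_ENDPOINT}/{bucket}/{quote(key)}"


def virtual_hosted_url(bucket: str, key: str) -> str:
    """Bucket appears as a subdomain of the S3 endpoint."""
    return f"http://{bucket}.{S3_ENDPOINT}/{quote(key)}"


def cname_url(bucket: str, key: str) -> str:
    """Bucket name is itself a host name whose CNAME points at S3."""
    return f"http://{bucket}/{quote(key)}"


for build in (path_style_url, virtual_hosted_url, cname_url):
    print(build("bucket", "key"))
```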

Because objects are accessible by unmodified HTTP clients, S3 can be used to replace significant existing (static) web hosting infrastructure.[16] The Amazon AWS Authentication mechanism allows the bucket owner to create an authenticated URL with time-bounded validity. That is, the owner can construct a URL that can be handed off to a third party for access for a limited period, such as the next 30 minutes or the next 24 hours.
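
A time-limited URL of this kind can be sketched as follows. The sketch follows the query-string authentication scheme (commonly called Signature Version 2) that Amazon documented for S3 in this period, in which an HMAC-SHA1 signature over the request details and an expiry timestamp is appended to the URL. The credentials shown are placeholders, the function name is hypothetical, and the code has not been checked against the live service.

```python
import base64
import hashlib
import hmac
import time
from urllib.parse import quote, urlencode


def presigned_get_url(bucket: str, key: str, access_key: str,
                      secret_key: str, expires_at: int) -> str:
    """Build a GET URL that is valid until the Unix timestamp `expires_at`."""
    resource = f"/{bucket}/{quote(key)}"
    # Verb, empty Content-MD5, empty Content-Type, expiry, then the resource.
    string_to_sign = f"GET\n\n\n{expires_at}\n{resource}"
    digest = hmac.new(secret_key.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha1).digest()
    signature = base64.b64encode(digest).decode("ascii")
    # urlencode percent-encodes the '+', '/' and '=' that base64 can produce.
    query = urlencode({
        "AWSAccessKeyId": access_key,
        "Expires": expires_at,
        "Signature": signature,
    })
    return f"http://s3.amazonaws.com{resource}?{query}"


# Placeholder credentials; valid for the next 30 minutes.
expires_at = int(time.time()) + 30 * 60
print(presigned_get_url("bucket", "key", "AKIDEXAMPLE", "not-a-real-secret", expires_at))
```

Because the expiry time is part of the signed string, a recipient cannot extend the validity period without invalidating the signature.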

Every item in a bucket can also be served up as a BitTorrent feed. The S3 store can act as a seed host for a torrent and any BitTorrent client can retrieve the file. This drastically reduces the bandwidth costs for the download of popular objects. While the use of BitTorrent does reduce bandwidth, AWS does not provide native bandwidth limiting and as such users have no access to automated cost control. This can lead to users on the 'free-tier' S3 or small hobby users amassing dramatic bills. AWS representatives have previously stated that such a feature was on the design table from 2006 to 2010[17] but have recently stated the feature is no longer in development.[18]

A bucket can be configured to save HTTP log information to a sibling bucket; this can be used in later data mining operations. This feature is currently still in beta.

Hosting entire websites

Amazon S3 provides options to host static websites with index document and error document support.[19] This support was added as a result of user requests dating at least to 2006.[20] For example, suppose that Amazon S3 was configured with CNAME records to host http://subdomain.example.com/. In the past, a visitor to this URL would find only an XML-formatted list of objects instead of a general landing page (e.g., index.html) to accommodate casual visitors. Now, however, websites hosted on S3 may designate a default page to display, and another page to display in the event of an invalid URL. However, because this approach relies on a CNAME record, which DNS does not permit at the apex of a zone, only a subdomain can be hosted this way, not a second-level domain. That is, subdomain.example.com can be hosted, but not example.com. One may use an A record pointing to the S3 server, but this method is not documented by Amazon.
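
The effect of index and error documents can be illustrated with a toy model. This is not Amazon's implementation: the function, the default document names, and the in-memory set of keys are all assumptions made for the sketch.

```python
def resolve_website_request(path: str, keys: set,
                            index_document: str = "index.html",
                            error_document: str = "error.html") -> str:
    """Return the key this toy model would serve for the requested path."""
    key = path.lstrip("/")
    # A request for the site root or a "directory" maps to the index document.
    if key == "" or key.endswith("/"):
        key += index_document
    if key in keys:
        return key
    # Anything that does not match a stored object gets the error document.
    return error_document


bucket_keys = {"index.html", "error.html", "docs/index.html", "logo.png"}
for path in ("/", "/docs/", "/logo.png", "/missing.html"):
    print(path, "->", resolve_website_request(path, bucket_keys))
```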

Notable uses

Photo hosting service SmugMug has used S3 since April 2006. They experienced a number of initial outages and slowdowns, but after one year they described it as being "considerably more reliable than our own internal storage" and claimed to have saved almost $1 million in storage costs.[21]

There are various Filesystem in Userspace (FUSE)-based file systems for Unix-like operating systems (Linux, etc.) that can be used to mount an S3 bucket as a file system. Note that because the semantics of S3 are not those of a POSIX file system, such a file system may not behave entirely as expected.[22]

Apache Hadoop file systems can be hosted on S3, as its requirements of a file system are met by S3. As a result, Hadoop can be used to run MapReduce algorithms on EC2 servers, reading data and writing results back to S3.

Dropbox,[23] StoreGrid, SyncBlaze,[24] Tahoe-LAFS-on-S3,[25] Zmanda and Ubuntu One[26] are some of the many online backup and synchronization services that use S3 as their storage and transfer facility.

Minecraft hosts game updates and player skins on the S3 servers.[27]

Tumblr, Formspring and Posterous images are hosted on the S3 servers.

Alfresco, the open-source enterprise content management provider, hosts data for its Alfresco in the Cloud service on S3.

Garry's Mod's "Toybox" feature is hosted on the S3 servers.

S3 was used by some enterprises as a long-term archiving solution until Amazon Glacier was released. Glacier provides long-term archiving at up to 90% lower cost than S3 ("Cost Comparison Amazon Glacier vs S3"), though with retrieval times increasing to a few hours.

S3 API and competing services

The broad adoption of Amazon S3 and related tooling has given rise to competing services based on the S3 API. These services use the standard programming interface; however, they are differentiated by their underlying technologies and supporting business models.[28] A cloud storage standard (like electrical and networking standards) enables competing service providers to design their services and clients using different components in different ways yet still communicate, and provides the following benefits:[29]

  1. Increase competition by providing a set of rules and a level playing field, encouraging market entry by smaller companies which might otherwise be precluded.
  2. Encourage innovation by Cloud Storage Vendors, Developers, and Client Tool Vendors, because they can focus on improving their own products and services instead of focusing on compatibility.
  3. Allow economies of scale in implementation (i.e., if a service provider encounters an outage or as clients outgrow their tools and need faster operating systems or tools, they can easily swap out solutions).
  4. Provide timely solutions for delivering functionality in response to demands of the marketplace (i.e., as business growth in new locations increases demand, clients can easily change or add service providers simply by subscribing to the new service).

Adopting a new technology is challenging. Knowing that there are competing services in the marketplace built on the standard (or subsets of the standard) makes adopting client applications easier. You get to choose your own service vendor based on location, speed, pricing, and service level agreements as well as choosing from a multitude of client tools and developer tools[30] that work with the API.

Examples of competing S3 compliant storage implementations include:

Notes

  1. ^ http://aws.amazon.com/s3/
  2. ^ "Amazon Web Services Launches "Amazon S3"" (Press release). Amazon.com. 2006-03-14.
  3. ^ Dorsey, John (2007-11-06). "Amazon S3 Storage Now Available in Europe". Dr. Dobb's Portal. Archived from the original on 15 April 2008. Retrieved 2008-03-26.
  4. ^ "Amazon Simple Storage Service pricing". Amazon.com. 2009-02-05.
  5. ^ a b "Amazon S3 - Busier Than Ever". Amazon.com. 2008-10-08.
  6. ^ The same data storage infrastructure that Amazon uses to run its own global network of web sites
  7. ^ - Amazon S3 - The First Trillion Objects
  8. ^ Brian Lillie of Equinix said that Amazon now is hosting 102 billion objects in S3
  9. ^ S3 (Amazon's Simple Storage Service) alone has over 64 billion objects in it.
  10. ^ Just a year ago, there were 18 billion objects in S3. As of today there are 52 billion
  11. ^ Vogels, Werner (2008-03-19). "Happy Birthday, Amazon S3!". All Things Distributed.
  12. ^ Amazon S3 SLA
  13. ^ 60 min/hour * 24 hours in a day * 30 days * 0.1% = 43.2
  14. ^ Amazon S3 Protecting Your Data
  15. ^ Starting Websphere in Cloud and saving the data in S3
  16. ^ How to use Amazon S3 for Web Hosting
  17. ^ https://forums.aws.amazon.com/thread.jspa?threadID=10532&start=0&tstart=0
  18. ^ https://forums.aws.amazon.com/thread.jspa?threadID=58127&tstart=75
  19. ^ http://docs.amazonwebservices.com/AmazonS3/latest/dev/index.html?WebsiteHosting.html
  20. ^ Garnaat, Mitch (19 Nov 2009). "Re: default key or 'default document' - is it possible to specify in S3?". Retrieved 21 Sep 2010.
  21. ^ "Amazon S3: Show Me the Money". SmugMug Blog. SmugMug. November 10, 2006.
  22. ^ "Comparison of S3QL and other S3 file systems". Retrieved 2012-06-29.
  23. ^ "Where are my files stored?". November 28, 2010.
  24. ^ "Where SyncBlaze Cloud stores my files?".
  25. ^ "What is Tahoe-LAFS-on-S3?". August 21, 2012.
  26. ^ "Ubuntu One Technical Details". Ubuntu.com. Retrieved 16 October 2012.
  27. ^ "Minecraft Beta 1.2_02". January 21, 2010.
  28. ^ Watters, Audrey. "Cloud Community Debates, Is Amazon S3's API the Standard? (And Should It Be?)". SAY Media, Inc. Retrieved 19 December 2012.
  29. ^ Committee on Standards Workshop Planning, Board on Telecommunications and Computer Applications, Commission on Engineering and Technical Systems, National Research Council (1990). Crossroads of Information Technology Standards. Washington, DC: The National Academies Press, 1990. pp. 36–37.
  30. ^ "Developer Tools: Amazon Web Services". Amazon Web Services, Inc. Retrieved 19 December 2012.
  31. ^ Harris, Derrick (Mar 9, 2011). "Cloud.com Expands Service Provider Footprint". Gigaom. Retrieved 19 December 2012.
  32. ^ http://www.cloudian.com/
  33. ^ "Connectria Cloud Storage - Amazon S3® Compatible Cloud Storage Service". Connectria. Connectria. Retrieved 19 December 2012.
  34. ^ "Connectria Hosting Launches Cloud Storage Solution". Connectria Hosting. Retrieved 19 December 2012.
  35. ^ Ross, Rose (February 22, 2011). "Connectria selects Scality to launch a public cloud storage service". RealWire. Retrieved 19 December 2012.
  36. ^ "Cloud Storage Providers". Twinstrata. Twinstrata. Retrieved 19 December 2012.
  37. ^ http://basho.com/products/riakcs/
  38. ^ http://ceph.com/docs/master/radosgw/s3/

References