Amazon S3: Difference between revisions

From Wikipedia, the free encyclopedia

Revision as of 21:13, 10 May 2018

Amazon Simple Storage Service
Type of site: Cloud storage
Available in: English
Owner: Amazon.com
URL: aws.amazon.com/s3/
IPv6 support: Yes
Commercial: Yes
Registration: Required (included in free tier layer)

Amazon S3 (Simple Storage Service) is a web service offered by Amazon Web Services (AWS). Amazon S3 provides storage through web services interfaces (REST, SOAP, and BitTorrent).[1][2] Amazon launched S3 as its fifth publicly available web service[citation needed], in the United States in March 2006[3] and in Europe in November 2007.[4]

Amazon says that S3 uses the same scalable storage infrastructure that Amazon.com uses to run its own global e-commerce network.[5]

Amazon S3 was reported to store more than 2 trillion objects as of April 2013,[6] up from 10 billion as of October 2007,[7] 14 billion in January 2008, 29 billion in October 2008,[8] 52 billion in March 2009,[9] 64 billion in August 2009,[10] and 102 billion in March 2010.[11] S3 uses include web hosting, image hosting, and storage for backup systems. The S3 service-level agreement (SLA) guarantees 99.9% monthly uptime,[12] that is, no more than about 43 minutes of downtime per month.[13]
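The downtime figure follows from simple arithmetic on the uptime percentage. The sketch below reproduces the calculation given in the notes for a 30-day month; the function name and the option to vary the month length are illustrative assumptions, not part of the SLA.

```python
def max_downtime_minutes(uptime_percent: float, days_in_month: int = 30) -> float:
    """Minutes of downtime permitted in one month at the given uptime percentage."""
    minutes_in_month = 60 * 24 * days_in_month
    return minutes_in_month * (100 - uptime_percent) / 100


# 60 min/hour * 24 hours * 30 days * 0.1%
print(round(max_downtime_minutes(99.9), 1))  # 43.2
```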

According to The Guardian, Amazon S3 is the most profitable division of Amazon.[14]

Design

At AWS Summit 2013 NYC, CTO Werner Vogels announces 2 trillion objects stored in S3.

Amazon does not make details of S3's design public, though it clearly manages data with an object-storage architecture. According to Amazon, S3's design aims to provide scalability, high availability, and low latency at commodity costs.[citation needed]

S3 is designed to provide 99.999999999% durability and 99.99% availability of objects over a given year,[15] though there is no service-level agreement for durability.
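To give a sense of scale for the durability figure, the sketch below treats it as a per-object probability of surviving one year. That interpretation is an assumption made for illustration, and the function name is hypothetical; the result is a design target, not a guarantee.

```python
from decimal import Decimal


def expected_annual_losses(object_count: int,
                           durability_percent: str = "99.999999999") -> Decimal:
    """Expected objects lost per year, assuming durability is a
    per-object annual survival probability (illustrative assumption)."""
    loss_probability = 1 - Decimal(durability_percent) / 100
    return object_count * loss_probability


losses = expected_annual_losses(10_000_000)
print(losses)      # expected number of objects lost per year
print(1 / losses)  # average number of years per single lost object
```

Under this assumption, storing ten million objects corresponds to an expected loss of one object roughly every ten thousand years.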

S3 stores arbitrary objects (computer files) up to 5 terabytes in size, each accompanied by up to 2 kilobytes of metadata. Objects are organized into buckets (each owned by an Amazon Web Services account), and identified within each bucket by a unique, user-assigned key. Amazon Machine Images (AMIs), which are used in the Elastic Compute Cloud (EC2), can be exported to S3 as bundles.[16]

Buckets and objects can be created, listed, and retrieved using either a REST-style HTTP interface or a SOAP interface. Additionally, objects can be downloaded using the HTTP GET interface and the BitTorrent protocol.

Requests are authorized using an access control list associated with each bucket and object.

Bucket names and keys are chosen so that objects are addressable using HTTP URLs:

  • http://s3.amazonaws.com/bucket/key
  • http://bucket.s3.amazonaws.com/key
  • http://bucket/key (where bucket is a DNS CNAME record pointing to bucket.s3.amazonaws.com)
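The three addressing forms above can be sketched as small helper functions. The function names are hypothetical, and percent-encoding the key with urllib is an assumption about how a client would typically handle keys containing spaces or other special characters.

```python
from urllib.parse import quote

S3_ENDPOINT = "s3.amazonaws.com"


def path_style_url(bucket: str, key: str) -> str:
    """http://s3.amazonaws.com/bucket/key"""
    return f"http://{S3_ENDPOINT}/{bucket}/{quote(key)}"


def virtual_hosted_url(bucket: str, key: str) -> str:
    """http://bucket.s3.amazonaws.com/key"""
    return f"http://{bucket}.{S3_ENDPOINT}/{quote(key)}"


def cname_url(bucket: str, key: str) -> str:
    """http://bucket/key, valid only if the bucket name is a hostname
    with a DNS CNAME record pointing to bucket.s3.amazonaws.com."""
    return f"http://{bucket}/{quote(key)}"


print(path_style_url("bucket", "key"))
print(virtual_hosted_url("bucket", "photos/my cat.jpg"))
print(cname_url("images.example.com", "logo.png"))
```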

Because objects are accessible by unmodified HTTP clients, S3 can be used to replace significant existing (static) web hosting infrastructure.[17] The AWS authentication mechanism allows the bucket owner to create an authenticated URL with time-bounded validity. That is, the owner can construct a URL that can be handed off to a third party for access for a limited period, such as the next 30 minutes or the next 24 hours.
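A time-bounded URL of this kind can be sketched with the legacy query-string authentication scheme (AWS Signature Version 2), in which an expiry timestamp and an HMAC-SHA1 signature are appended to the URL. The sketch below is an illustration under that assumption, not a reference implementation: the function name and credentials are placeholders, newer AWS regions require Signature Version 4 instead, and the details should be checked against Amazon's documentation.

```python
import base64
import hashlib
import hmac
import time
from urllib.parse import quote, urlencode


def presigned_get_url(bucket, key, access_key_id, secret_key,
                      valid_for_seconds, now=None):
    """Build a GET URL that stops working after valid_for_seconds."""
    issued_at = int(now if now is not None else time.time())
    expires = issued_at + valid_for_seconds
    resource = f"/{bucket}/{quote(key)}"
    # Signature Version 2 string to sign: verb, Content-MD5, Content-Type
    # (both empty here), expiry time, then the canonical resource.
    string_to_sign = f"GET\n\n\n{expires}\n{resource}"
    digest = hmac.new(secret_key.encode(), string_to_sign.encode(),
                      hashlib.sha1).digest()
    signature = base64.b64encode(digest).decode()
    query = urlencode({
        "AWSAccessKeyId": access_key_id,
        "Expires": expires,
        "Signature": signature,
    })
    return f"http://s3.amazonaws.com{resource}?{query}"


# Placeholder credentials; the URL is valid for the next 30 minutes.
print(presigned_get_url("bucket", "key", "AKIDEXAMPLE", "secret", 30 * 60))
```

Because the expiry time is part of the signed string, a recipient cannot extend the validity period by editing the URL.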

Every item in a bucket can also be served as a BitTorrent feed. The S3 store can act as a seed host for a torrent, and any BitTorrent client can retrieve the file. This can substantially reduce the bandwidth costs of distributing popular objects. Although BitTorrent reduces the bandwidth served from S3, AWS does not provide native bandwidth limiting, so users have no automated means of cost control. As a result, users of the "free-tier" S3 and small hobby users can run up unexpectedly large bills. AWS representatives stated that such a feature was on the design table from 2006 to 2010,[18] but in 2011 stated that it was no longer in development.[19]

A bucket can be configured to save HTTP log information to a sibling bucket; this can be used in later data mining operations.[20]

Pricing

At its inception, Amazon charged end users US$0.15 per gigabyte-month, with additional charges for bandwidth used in sending and receiving data, and a per-request (get or put) charge.[21] On November 1, 2008, pricing moved to tiers where end users storing more than 50 terabytes receive discounted pricing.[8]

Hosting entire websites

Amazon S3 provides options to host static websites, with support for an index document and an error document.[22] This support was added as a result of user requests dating at least to 2006.[23] For example, suppose that Amazon S3 was configured with CNAME records to host http://subdomain.example.com/. In the past, a visitor to this URL would find only an XML-formatted list of objects instead of a general landing page (e.g., index.html) to accommodate casual visitors. Now, however, websites hosted on S3 may designate a default page to display, and another page to display in the event of an invalid URL.

Notable users

Photo hosting service SmugMug has used S3 since April 2006. They experienced a number of initial outages and slowdowns, but after one year they described it as being "considerably more reliable than our own internal storage" and claimed to have saved almost $1 million in storage costs.[24]

There are various Filesystem in Userspace (FUSE)-based file systems for Unix-like operating systems (Linux, etc.) that can be used to mount an S3 bucket as a file system. Because the semantics of S3 are not those of a POSIX file system, such a file system may not behave entirely as expected.[25]

Apache Hadoop file systems can be hosted on S3, as its requirements of a file system are partially met by S3.[26] As a result, Hadoop can be used to run MapReduce algorithms on EC2 servers, reading data and writing results back to S3.

Netflix uses Amazon Web Services for their storage and compute operations with S3 being their system of record. Netflix implemented a tool, S3mper,[27] to address the limitations of eventual consistency that Amazon S3 provides.[28] S3mper stores the filesystem metadata: filenames, directory structure and permissions in Amazon DynamoDB.[14]

reddit is hosted on S3.[29]

Dropbox,[30] Bitcasa,[31] and Tahoe-LAFS-on-S3,[32] among others, use S3 for online backup and synchronization services. In 2016, Dropbox moved off Amazon S3 and onto its own storage infrastructure.[33][34]

Mojang hosts Minecraft game updates and player skins on S3.[35]

Tumblr, Formspring, and Pinterest host images on S3.

Swiftype's CEO has mentioned that the company uses S3.[36]

S3 was used in the past by some enterprises as a long-term archiving solution, until Amazon Glacier was released in August 2012.[citation needed]

The S3 API has become a popular method for accessing object storage.[37] As a result, a growing number of applications have been built to natively support the S3 API.[38] These include applications that write data to AWS S3, as well as to S3-compatible object stores:[39]

Type                    Company Name           Product
Client Backup           Haystack Software LLC  Arq backup[40]
Client Backup           CloudBerry Lab         CloudBerry Backup[41]
Client Backup           open-source            Duplicati[42]
Client Backup           Novosoft LLC           Handy Backup[43]
File Browser            odrive                 odrive[44]
MySQL Backup            Oracle                 MySQL Enterprise Backup
Oracle Database Backup  Oracle                 Oracle Secure Backup Cloud Manager[45]
Server Backup           Commvault              Commvault[46]
Server Backup           Veritas                NetBackup[47]
Server Backup           Asigra                 Asigra Cloud Backup[48]
Server Backup           Rubrik                 Rubrik[49]
Cloud Storage           Wasabi                 Wasabi Hot Storage
Cloud Storage Gateway   CTERA Networks         C00 Series[50]
Cloud Storage Gateway   Avere                  FXT Series[51]
Cloud Storage Gateway   EMC                    CloudArray[52]
Cloud Storage Gateway   Microsoft              StorSimple[53]
Cloud Storage Gateway   Nasuni                 NF Series[54]
Cloud Storage Gateway   NetApp                 AltaVault[55]
Cloud Storage Gateway   Panzura                Global File System[56]
Sync & Share            Storage Made Easy      SME
Hybrid Storage          Cloudian               Cloudian HyperStore[57]
Hybrid Storage          NooBaa                 NooBaa Storage
On-Premises Storage     Pure Storage           FlashBlade
On-Premises Storage     Scality                RING Storage[58]
Open Source             Zenko.io               Open Source S3 Server[59]

Amazon S3 logs

Amazon S3 allows users to enable or disable logging. If enabled, the logs are stored in Amazon S3 buckets, where they can then be analyzed. These logs contain useful information such as:

  • Date and time of access to the user's content
  • Protocol used
  • HTTP status
  • Turnaround time

These logs can be analyzed and managed by using third-party tools such as S3Stat, Cloudlytics, Qloudstat, AWS Stats or Splunk.
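As a rough illustration of such analysis, the sketch below splits one access log record into named fields. The field order is assumed from Amazon's documented server access log format and may differ between versions of the service; the field names, the helper function, and the sample record are all made up for this example.

```python
import re

# Assumed field order of one space-delimited access log record.
FIELD_NAMES = [
    "bucket_owner", "bucket", "time", "remote_ip", "requester", "request_id",
    "operation", "key", "request_uri", "http_status", "error_code",
    "bytes_sent", "object_size", "total_time_ms", "turn_around_time_ms",
    "referrer", "user_agent", "version_id",
]

# A field is either [bracketed], "quoted", or a run of non-space characters.
TOKEN = re.compile(r'\[([^\]]*)\]|"([^"]*)"|(\S+)')


def parse_log_line(line: str) -> dict:
    """Map the fields of one log record to names, dropping brackets and quotes."""
    values = [a or b or c for a, b, c in TOKEN.findall(line)]
    return dict(zip(FIELD_NAMES, values))


# Invented sample record, not taken from a real bucket.
SAMPLE = ('ownerid examplebucket [10/May/2018:21:13:00 +0000] 192.0.2.3 '
          'requesterid 3E57427F3EXAMPLE REST.GET.OBJECT photo.jpg '
          '"GET /examplebucket/photo.jpg HTTP/1.1" 200 - 2048 2048 70 10 '
          '"-" "curl/7.58.0" -')

entry = parse_log_line(SAMPLE)
print(entry["time"])                     # date and time of access
print(entry["request_uri"].split()[-1])  # protocol used
print(entry["http_status"])              # HTTP status
print(entry["turn_around_time_ms"])      # turnaround time in milliseconds
```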

S3 API and competing services

The broad adoption of Amazon S3 and related tooling has given rise to competing services based on the S3 API. These services use the standard programming interface; however, they are differentiated by their underlying technologies and supporting business models.[60] A cloud storage standard (like electrical and networking standards) enables competing service providers to design their services and clients using different parts in different ways yet still communicate and provide the following benefits:[61]

  1. Increase competition by providing a set of rules and a level playing field, encouraging market entry by smaller companies which might otherwise be precluded.
  2. Encourage innovation by cloud storage and tool vendors and by developers, because they can focus on improving their own products and services instead of focusing on compatibility.
  3. Allow economies of scale in implementation (i.e., if a service provider encounters an outage or as clients outgrow their tools and need faster operating systems or tools, they can easily swap out solutions).
  4. Provide timely solutions for delivering functionality in response to demands of the marketplace (i.e., as business growth in new locations increases demand, clients can easily change or add service providers simply by subscribing to the new service).

Examples of competing S3 compliant storage implementations

Amazon S3 tools

Amazon S3 provides an API for third-party developers. Its documentation describes the API operations, related request and response structures, and error codes.[67] The original AWS Console provides tools for managing and uploading files, but it is not capable of managing large buckets or editing files online.[68] Third-party websites such as S3edit.com can help edit files on S3.[69]

Notes

  1. ^ Amazon S3, Cloud Computing Storage for Files, Images, Videos. Aws.amazon.com (2006-03-01). Retrieved on 2013-08-09.
  2. ^ Huang, Dijiang; Wu, Huijun (2017-09-08). Mobile Cloud Computing: Foundations and Service Models. Morgan Kaufmann. p. 67. ISBN 9780128096444.
  3. ^ "Amazon Web Services Launches "Amazon S3"" (Press release). Amazon.com. 2006-03-14. Retrieved 2015-09-22.
  4. ^ "Amazon Web Services Offers European Storage for Amazon S3" (Press release). Amazon.com. 2007-11-06. Retrieved 2015-09-22.
  5. ^ The same data storage infrastructure that Amazon uses to run its own global network of web sites
  6. ^ - Amazon S3 - Two Trillion Objects, 1.1 Million Requests / Second
  7. ^ Vogels, Werner (2008-03-19). "Happy Birthday, Amazon S3!". All Things Distributed.
  8. ^ a b "Amazon S3 - Busier Than Ever". Amazon.com. 2008-10-08.
  9. ^ Just a year ago, there were 18 billion objects in S3. As of today there are 52 billion
  10. ^ S3 (Amazon's Simple Storage Service) alone has over 64 billion objects in it.
  11. ^ Brian Lillie of Equinix said that Amazon now is hosting 102 billion objects in S3
  12. ^ Amazon S3 SLA
  13. ^ 60 min/hour * 24 hours in a day * 30 days * 0.1% = 43.2 min
  14. ^ a b Hern, Alex (2017-02-02). "Amazon Web Services: the secret to the online retailer's future success". the Guardian. Retrieved 2018-04-23.
  15. ^ Amazon S3 Protecting Your Data
  16. ^ Starting Websphere in Cloud and saving the data in S3
  17. ^ How to use Amazon S3 for Web Hosting
  18. ^ AWS Developer Forums: Limit my own bandwidth?. Forums.aws.amazon.com. Retrieved on 2013-08-09.
  19. ^ AWS Developer Forums: What is the status on the bill capping. Forums.aws.amazon.com. Retrieved on 2013-08-09.
  20. ^ http://docs.aws.amazon.com/AmazonS3/latest/dev/ServerLogs.html Server Access Logging
  21. ^ "Amazon S3 Pricing". Amazon.com. 2009-02-05. Retrieved 2014-05-02.
  22. ^ Amazon Simple Storage Service. Docs.amazonwebservices.com. Retrieved on 2013-08-09.
  23. ^ Garnaat, Mitch (19 Nov 2009). "Re: default key or 'default document' - is it possible to specify in S3?". Retrieved 21 Sep 2010.
  24. ^ "Amazon S3: Show Me the Money". SmugMug Blog. SmugMug. November 10, 2006.
  25. ^ "Comparison of S3QL and other S3 file systems". Retrieved 2012-06-29.
  26. ^ "Hadoop Filesystem Specification".
  27. ^ "S3mper: Consistency in the Cloud".
  28. ^ "Introduction to Amazon S3". Amazon. Retrieved 28 December 2017.
  29. ^ "AWS Case Study: reddit". aws.amazon.com. 2015. Retrieved March 18, 2015.
  30. ^ "Where are my files stored?". November 28, 2010.
  31. ^ "What is Tahoe-LAFS-on-S3?". August 21, 2012.
  32. ^ "The Epic Story of Dropbox's Exodus From the Amazon Cloud Empire". WIRED. Retrieved 2018-04-23.
  33. ^ "Dropbox saved almost $75 million over two years by building its own tech infrastructure". GeekWire. 2018-02-23. Retrieved 2018-04-23.
  34. ^ "Minecraft Beta 1.2_02". January 21, 2010.
  35. ^ "Swiftype Explains Their Cloud Stack". July 1, 2013.
  36. ^ Lelii, Sonia (23 September 2013). "Amazon S3 API for cloud storage leads pack, for now". TechTarget.com. Retrieved 31 May 2016.
  37. ^ Evans, Chris (12 January 2016). "Has S3 Become the De-Facto API Standard?". Architecting.it. Retrieved 31 May 2016.
  38. ^ Leopold, George (July 11, 2017). "Scality Targets Multi-Cloud Data Storage". Datanami news portal.
  39. ^ Sadun, Erica (6 November 2012). "Arq cloud backup adds low-cost Amazon Glacier support". www.engadget.com. Retrieved 31 May 2016.
  40. ^ Moran, Joe (1 December 2015). "Data Backup Software Review: CloudBerry Lab Backup 4.5". www.smallbusinesscomputing.com. Retrieved 31 May 2016.
  41. ^ Sanders, James (4 August 2014). "Securely back up personal files with Duplicati: Q&A with the open source client's creators". www.TechRepublic.com. Retrieved 31 May 2016.
  42. ^ Handy, Backup (17 April 2017). "Amazon S3 Backup Software for Cloud Backup". www.handybackup.net. Retrieved 12 September 2017.
  43. ^ Lohnash, Mike (19 June 2015). "Odrive Review: One Folder for All Your Clouds". www.BackupReview.com. Retrieved 31 May 2016.
  44. ^ "Oracle Database Backup To Cloud: Amazon Simple Storage Service (S3)" (PDF). Oracle.com. Retrieved 31 May 2016.
  45. ^ "Cloud Storage Support". Commvault.com. Retrieved 31 May 2016.
  46. ^ "Veritas launches NetBackup 7.7 with emphasis on cloud backup". SearchDataBackup. Retrieved 2016-05-31.
  47. ^ "Asigra, Veeam remain top users' choice for backup applications". SearchDataBackup. Retrieved 2016-05-31.
  48. ^ "Startup Rubrik Aiming to Erase Backup, Recovery Software". www.eweek.com. Retrieved 2016-05-31.
  49. ^ "CTERA Networks offers up in-cloud server backup". Retrieved 2016-05-31.
  50. ^ Mellor, Chris (7 October 2015). "Like a wedding cake: Avere unveils three-tier AWS cloud NAS". TheRegister.com. Retrieved 8 June 2016.
  51. ^ Armstrong, Adam (28 January 2015). "EMC CloudArray 5.0 Launched". StorageReview.com. Retrieved 8 June 2016.
  52. ^ Mackie, Kurt (1 June 2015). "Microsoft StorSimple Extends Cloud Support to AWS, OpenStack". RedmondChannelPartner.com. Retrieved 8 June 2016.
  53. ^ Mellor, Chris (14 May 2015). "Azure gives AWS the blues again in Nasuni cloud storage poll". TheRegister.com. Retrieved 8 June 2016.
  54. ^ Ramel, David (28 May 2015). "NetApp Introduces AltaVault for Cloud Backup". AWSInsider.net. Retrieved 8 June 2016.
  55. ^ Knuth, Gabe (26 May 2015). "Panzura explains their Global File System and how they can help you deploy XenDesktop from AWS". Retrieved 7 June 2016 – via BrianMadden.com.
  56. ^ "Cloudian deploys Amazon S3-compatible on-premises object storage, sold and metered in AWS Marketplace | Cloudian". Cloudian. Retrieved 2016-11-21.
  57. ^ "Scality RING S3 Connector - Scality". Scality. Retrieved 2017-09-02.
  58. ^ Armstrong, Adam (11 July 2017). "Scality Releases Zenko A Multi-Cloud Controller". Storage Review.
  59. ^ Watters, Audrey. "Cloud Community Debates, Is Amazon S3's API the Standard? (And Should It Be?)". SAY Media, Inc. Retrieved 19 December 2012.
  60. ^ Crossroads of Information Technology Standards. Washington, DC: The National Academies Press. 1990. pp. 36–37.
  61. ^ "Connectria Cloud Storage - Amazon S3® Compatible Cloud Storage Service". Connectria. Connectria. Retrieved 19 December 2012.
  62. ^ "Connectria Hosting Launches Cloud Storage Solution". Connectria Hosting. Retrieved 19 December 2012.
  63. ^ Riak CS. Basho (2013-01-18). Retrieved on 2013-08-09.
  64. ^ Ceph Object Gateway S3 API — Ceph Documentation. Ceph.com. Retrieved on 2013-08-09.
  65. ^ DigitalOcean Spaces API - DigitalOcean Documentation. developers.DigitalOcean.com. Retrieved on 2017-10-30.
  66. ^ Amazon Simple Storage Service Documentation
  67. ^ AWS Management Console
  68. ^ S3Edit.com Online S3 File Editor

References