MongoDB

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Dwo (talk | contribs) at 22:38, 18 October 2016 (→‎Main features: MOS:ANDOR). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

MongoDB
Developer(s): MongoDB Inc.
Initial release: 2009
Stable release: 3.2.9[1] / 16 August 2016
Preview release: 3.3.12[2] / 30 August 2016
Written in: C++, C and JavaScript
Operating system: Windows Vista and later, Linux, OS X 10.7 and later, Solaris,[3] FreeBSD[4]
Available in: English
Type: Document-oriented database
License: Various; see § Licensing
Website: www.mongodb.org

MongoDB (from humongous) is a free and open-source cross-platform document-oriented database program. Classified as a NoSQL database program, MongoDB avoids the traditional table-based relational database structure in favor of JSON-like documents with dynamic schemas (a format MongoDB calls BSON), making the integration of data in certain types of applications easier and faster. MongoDB is developed by MongoDB Inc. and is published under a combination of the GNU Affero General Public License and the Apache License.

History

The software company 10gen began developing MongoDB in 2007 as a component of a planned platform as a service product. In 2009, the company shifted to an open source development model, with the company offering commercial support and other services. In 2013, 10gen changed its name to MongoDB Inc.[5]

Main features

Some of the features include:[6]

Ad hoc queries

MongoDB supports field queries, range queries, and regular expression searches.[7] Queries can return specific fields of documents and can also include user-defined JavaScript functions. Queries can also be configured to return a random sample of results of a given size.
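The three query styles can be illustrated with a minimal in-memory sketch. The filter documents use MongoDB's `$gte`, `$lt` and `$regex` operators, but the `matches` and `find` helpers, the sample documents and their field names are assumptions made for illustration; they imitate a small subset of the query semantics in plain Python and are not the server's implementation.

```python
import re

# Hypothetical sample documents.
people = [
    {"name": "Alice", "age": 34, "city": "Oslo"},
    {"name": "Bob", "age": 17, "city": "Lima"},
    {"name": "Anna", "age": 52, "city": "Oslo"},
]

def matches(doc, query):
    """Return True if doc satisfies a small subset of MongoDB's filter syntax."""
    for field, cond in query.items():
        value = doc.get(field)
        if isinstance(cond, dict):
            for op, arg in cond.items():
                if op == "$gte":
                    ok = value is not None and value >= arg
                elif op == "$lt":
                    ok = value is not None and value < arg
                elif op == "$regex":
                    ok = isinstance(value, str) and re.search(arg, value) is not None
                else:
                    raise ValueError(f"unsupported operator: {op}")
                if not ok:
                    return False
        elif value != cond:
            # A plain value means an equality match on that field.
            return False
    return True

def find(docs, query, projection=None):
    """Yield matching docs, optionally keeping only the projected fields."""
    for doc in docs:
        if matches(doc, query):
            if projection is None:
                yield dict(doc)
            else:
                yield {k: doc[k] for k in projection if k in doc}

field_query = {"city": "Oslo"}                  # field query
range_query = {"age": {"$gte": 18, "$lt": 50}}  # range query
regex_query = {"name": {"$regex": "^A"}}        # regular expression search
```

Passing a projection such as `["name"]` to `find` corresponds to returning only specific fields of each matching document.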

Indexing

Any field in a MongoDB document can be indexed – including within arrays and embedded documents (indices in MongoDB are conceptually similar to those in RDBMSes). Primary and secondary indices are available.

Replication

MongoDB provides high availability with replica sets.[8] A replica set consists of two or more copies of the data. Each replica set member may act in the role of primary or secondary replica at any time. All writes and reads are done on the primary replica by default. Secondary replicas maintain a copy of the data of the primary using built-in replication. When a primary replica fails, the replica set automatically conducts an election process to determine which secondary should become the primary. Secondaries can optionally serve read operations, but that data is only eventually consistent by default.
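The roles described above can be sketched with a toy model. Everything here is an assumption made for illustration: the `Member` and `ReplicaSet` classes are hypothetical, replication is triggered by hand, and the "election" simply promotes the live member with the longest log, which is far simpler than MongoDB's actual election process.

```python
class Member:
    def __init__(self, name):
        self.name = name
        self.log = []      # operations applied so far, in order
        self.alive = True

class ReplicaSet:
    def __init__(self, names):
        self.members = [Member(n) for n in names]
        self.primary = self.members[0]

    def write(self, op):
        # By default, all writes go to the primary.
        self.primary.log.append(op)

    def replicate(self):
        # Each live secondary copies the part of the primary's log it lacks.
        for m in self.members:
            if m.alive and m is not self.primary:
                m.log.extend(self.primary.log[len(m.log):])

    def fail_primary(self):
        # Toy election: promote the live member with the longest log.
        self.primary.alive = False
        candidates = [m for m in self.members if m.alive]
        self.primary = max(candidates, key=lambda m: len(m.log))
        return self.primary

rs = ReplicaSet(["a", "b", "c"])
rs.write("x=1")
rs.replicate()
rs.write("x=2")                              # not yet replicated
primary_view = list(rs.primary.log)          # ["x=1", "x=2"]
secondary_view = list(rs.members[1].log)     # ["x=1"], a stale read
new_primary = rs.fail_primary()
```

In this sketch a read from a secondary lags behind the primary until the next replication round, which is the sense in which secondary reads are only eventually consistent.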

Load balancing

MongoDB scales horizontally using sharding.[9] The user chooses a shard key, which determines how the data in a collection will be distributed. The data is split into ranges (based on the shard key) and distributed across multiple shards. (A shard is a master with one or more slaves.) Alternatively, the shard key can be hashed to map to a shard – enabling an even data distribution.
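The difference between range-based and hashed distribution can be shown with a short sketch. The shard count, the range boundaries and both helper functions are hypothetical, and taking a hash modulo the number of shards is a simplification chosen for brevity rather than a description of how MongoDB assigns hashed keys.

```python
import bisect
import hashlib

NUM_SHARDS = 3  # hypothetical cluster size

# Range-based: keys below "h" go to shard 0, keys from "h" up to (but not
# including) "p" go to shard 1, and everything else goes to shard 2.
RANGE_BOUNDS = ["h", "p"]

def range_shard(key):
    """Pick a shard by locating the key within the sorted range boundaries."""
    return bisect.bisect_right(RANGE_BOUNDS, key)

def hashed_shard(key):
    """Pick a shard from a hash of the key (simplified to hash modulo N)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS
```

With keys that cluster in one range, such as `"a0"`, `"a1"`, `"a2"` and so on, `range_shard` places every document on shard 0, whereas `hashed_shard` spreads the same keys across the shards at the cost of losing key order.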

MongoDB can run over multiple servers, balancing the load or duplicating data to keep the system up and running in case of hardware failure. MongoDB is easy to deploy, and new machines can be added to a running database.

File storage

MongoDB can be used as a file system, taking advantage of load balancing and data replication features over multiple machines for storing files.

This function, called Grid File System (GridFS),[10] is included with MongoDB drivers and available for many development languages (see "Programming language accessibility"). MongoDB exposes functions for file manipulation and content to developers. GridFS is used, for example, in plugins for NGINX[11] and lighttpd.[12] Instead of storing a file in a single document, GridFS divides a file into parts, or chunks, and stores each of those chunks as a separate document.[13]
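The chunking idea can be sketched in a few lines. The helper names and the four-byte chunk size are assumptions chosen to keep the example readable (real chunks are much larger); the `files_id`, `n` and `data` keys mirror the fields GridFS uses for chunk documents, but nothing here talks to a database.

```python
CHUNK_SIZE = 4  # bytes; deliberately tiny for demonstration

def split_into_chunks(file_id, data, chunk_size=CHUNK_SIZE):
    """Split a byte string into chunk documents numbered from 0."""
    return [
        {"files_id": file_id, "n": i // chunk_size, "data": data[i:i + chunk_size]}
        for i in range(0, len(data), chunk_size)
    ]

def reassemble(chunks):
    """Rebuild the original bytes by joining chunks in sequence order."""
    return b"".join(c["data"] for c in sorted(chunks, key=lambda c: c["n"]))

chunks = split_into_chunks("file-1", b"hello world")
```

Because each chunk carries its own sequence number, the chunks can be stored and retrieved as independent documents and still be put back together in the right order.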

In a multi-machine MongoDB system, files can be distributed and copied multiple times between machines transparently, thus effectively creating a load-balanced and fault-tolerant system.

Aggregation

MapReduce can be used for batch processing of data and aggregation operations.

The aggregation framework enables users to obtain the kind of results for which the SQL GROUP BY clause is used. Aggregation operators can be strung together to form a pipeline – analogous to Unix pipes. The aggregation framework includes the $lookup operator, which can join documents from multiple collections, as well as statistical operators such as standard deviation.
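The pipeline idea, in which each stage consumes the output of the previous one, can be imitated in plain Python. The stage helpers, the sample orders and their field names are hypothetical; the sketch mirrors the behaviour of a match stage, a GROUP BY-style sum and a sort, and is not the aggregation framework itself.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical sample documents.
orders = [
    {"cust_id": "A1", "status": "shipped", "amount": 50},
    {"cust_id": "B2", "status": "shipped", "amount": 20},
    {"cust_id": "A1", "status": "pending", "amount": 30},
    {"cust_id": "A1", "status": "shipped", "amount": 25},
]

def match(field, value):
    """Stage that keeps only documents whose field equals value."""
    def stage(docs):
        return (d for d in docs if d.get(field) == value)
    return stage

def group_sum(key_field, sum_field, out_field):
    """Stage that groups by key_field and sums sum_field, like GROUP BY with SUM."""
    def stage(docs):
        totals = defaultdict(int)
        for d in docs:
            totals[d[key_field]] += d[sum_field]
        return ({"_id": k, out_field: v} for k, v in totals.items())
    return stage

def sort_by(field, descending=False):
    """Stage that orders documents by a field."""
    def stage(docs):
        return iter(sorted(docs, key=lambda d: d[field], reverse=descending))
    return stage

def run_pipeline(docs, stages):
    """Feed the documents through each stage in turn, like a chain of Unix pipes."""
    return list(reduce(lambda acc, stage: stage(acc), stages, docs))

result = run_pipeline(orders, [
    match("status", "shipped"),
    group_sum("cust_id", "amount", "total"),
    sort_by("total", descending=True),
])
```

Here `result` holds one document per customer with the summed amount of shipped orders, largest total first.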

Server-side JavaScript execution

JavaScript can be used in queries and aggregation functions (such as MapReduce), and can be sent directly to the database to be executed.

Capped collections

MongoDB supports fixed-size collections called capped collections. This type of collection maintains insertion order and, once the specified size has been reached, behaves like a circular queue.
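The circular-queue behaviour can be sketched with a bounded deque. The `CappedCollection` class is hypothetical, and capping by document count rather than by storage size is a simplification made for the example.

```python
from collections import deque

class CappedCollection:
    """Toy fixed-size collection that keeps insertion order."""

    def __init__(self, max_docs):
        # A deque with maxlen silently drops the oldest item when full.
        self._docs = deque(maxlen=max_docs)

    def insert(self, doc):
        self._docs.append(doc)

    def find(self):
        # Documents come back in insertion order, oldest first.
        return list(self._docs)

log = CappedCollection(max_docs=3)
for i in range(5):
    log.insert({"seq": i})
```

After five inserts into a collection capped at three documents, only the three most recent remain, which is why capped collections suit logs and similar rolling data.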

Bug reports and criticisms

In some failure scenarios where an application can access two distinct MongoDB processes, but these processes cannot access each other, it is possible for MongoDB to return stale reads. In this scenario it is also possible for MongoDB to roll back writes that have been acknowledged.[14]

Before version 2.2, concurrency control was implemented on a per-mongod basis. With version 2.2, concurrency control was implemented at the database level.[15] Version 3.0[16] introduced pluggable storage engines, and each storage engine may implement concurrency control differently.[17] With MongoDB 3.0, concurrency control is implemented at the collection level for the MMAPv1 storage engine,[18] and at the document level with the WiredTiger storage engine.[19] With versions prior to 3.0, one approach to increase concurrency is to use sharding.[20] In some situations, reads and writes will yield their locks. If MongoDB predicts a page is unlikely to be in memory, operations will yield their lock while the pages load. The use of lock yielding expanded greatly in 2.2.[21]

Another criticism is related to the limitations of MongoDB when used on 32-bit systems.[22] In some cases, this was due to inherent memory limitations.[23] MongoDB recommends 64-bit systems and that users provide sufficient RAM for their working set.

MongoDB cannot do collation-based sorting and is limited to byte-wise comparison via memcmp,[24] which will not provide correct ordering for many non-English languages[25] when used with a Unicode encoding.
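The effect of byte-wise comparison can be reproduced in a short sketch. Sorting on the UTF-8 bytes stands in for a memcmp-style comparison; the `crude_collation_key` helper is a hypothetical and deliberately rough stand-in for real collation (it only strips accents and ignores case), used here just to show how the two orderings differ.

```python
import unicodedata

words = ["zoo", "\u00e9clair", "apple", "Zebra"]  # "\u00e9" is e with acute accent

def crude_collation_key(s):
    """Rough approximation of a collation key: drop accents, ignore case."""
    decomposed = unicodedata.normalize("NFD", s)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()

# Byte-wise ordering: uppercase sorts before lowercase, and accented
# letters (multi-byte in UTF-8) sort after every ASCII letter.
bytewise = sorted(words, key=lambda w: w.encode("utf-8"))

# Approximate dictionary ordering.
collated = sorted(words, key=crude_collation_key)
```

The byte-wise result places "Zebra" before "apple" and pushes the accented word to the end, while the approximate collation gives the order a reader would expect in a dictionary.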

MongoDB queries against an index are not atomic and can miss documents that are updated while the query is running, even when those documents match the query both before and after the update.[26]
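One way such a miss can arise is sketched below as a toy model, assuming an index kept as a sorted list and a scan that walks it in key order. The `scan` function, the hook and the data are hypothetical and do not describe MongoDB's internals; the point is only that an entry which moves from ahead of the scan position to behind it is never visited.

```python
import bisect

# Toy index: sorted (indexed value, document id) pairs.
index = [(10, "a"), (40, "b"), (80, "c")]

def scan(index, low, high, on_step=None):
    """Walk the index in key order and collect ids with low <= value <= high."""
    seen = []
    pos = bisect.bisect_left(index, (low, ""))
    while pos < len(index) and index[pos][0] <= high:
        entry = index[pos]
        seen.append(entry[1])
        if on_step:
            on_step(entry)  # simulates a concurrent writer running mid-scan
        # Resume just after the last entry returned.
        pos = bisect.bisect_right(index, entry)
    return seen

def move_c_behind_scan(entry):
    # After the scan has returned "a", update document "c" from 80 to 5.
    # Both values lie inside the queried range [0, 100].
    if entry == (10, "a"):
        index.remove((80, "c"))
        bisect.insort(index, (5, "c"))

result = scan(index, 0, 100, on_step=move_c_behind_scan)
```

Document "c" matches the range before and after its update, yet the scan returns only "a" and "b" because the new index entry lands behind the position already passed.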

Architecture

Programming language accessibility

MongoDB has official drivers for a variety of popular programming languages and development environments.[27] There are also a large number of unofficial or community-supported drivers for other programming languages and frameworks.[28]

Management and graphical front-ends

Record insertion in MongoDB with Robomongo 0.8.5.

Most administration is done from command line tools such as the mongo shell because MongoDB does not include a GUI-style administrative interface. There are products and third-party projects that offer user interfaces for administration and data viewing.[29]

Licensing

MongoDB is available at no cost under the GNU Affero General Public License, version 3.[30] The language drivers are available under an Apache License. In addition, MongoDB Inc. offers proprietary licenses for MongoDB.

Production deployments

Large-scale deployments of MongoDB are tracked by MongoDB Inc. Notable users of MongoDB include:

  • Adobe: Adobe Experience Manager is intended to accelerate development of digital experiences that increase customer loyalty, engagement and demand. Adobe uses MongoDB to store petabytes of data in the large-scale content repositories underpinning the Experience Manager.[31]
  • Amadeus IT Group uses MongoDB for its back-end software.[32]
  • The Compact Muon Solenoid at CERN uses MongoDB as the primary back-end for the Data Aggregation System for the Large Hadron Collider.[33]
  • Craigslist: With 80 million classified ads posted every month, Craigslist needs to archive billions of records in multiple formats, and must be able to query and report on these archives at runtime. Craigslist migrated from MySQL to MongoDB to support its active archive, with continuous availability mandated for regulatory compliance across 700 sites in 70 different countries.[34]
  • eBay uses MongoDB in the search suggestion and the internal Cloud Manager State Hub.[35]
  • FIFA (video game series): EA's Spearhead development studio uses MongoDB[36] to store user data and game state. Auto-sharding allows scaling MongoDB across EA's 250+ servers as user demand grows.
  • Foursquare deploys MongoDB on Amazon AWS to store venues and user check-ins into venues.[37]
  • LinkedIn uses MongoDB for its internal learning platform.[38]
  • McAfee: MongoDB powers McAfee Global Threat Intelligence (GTI), a cloud-based intelligence service that correlates data from millions of sensors around the globe. Billions of documents are stored and analyzed in MongoDB to deliver real-time threat intelligence to other McAfee end-client products.[39]
  • MetLife uses MongoDB for "The Wall", a customer service application providing a "360-degree view" of MetLife customers.[40]
  • Plexistor for MongoDB delivers persistent, high capacity storage at near-memory speed, enabling last transaction safety in the event of an application or power failure. Plexistor can be used in Amazon AWS as well as on premise, running on Linux OS or on Docker containers.[41]
  • SAP uses MongoDB in the SAP PaaS.[42]
  • Shutterfly uses MongoDB for its photo platform. As of 2013, the photo platform stores 18 billion photos uploaded by Shutterfly's 7 million users.[43]
  • Tuenti uses MongoDB as its backend DB.[44]
  • Yandex: The largest search engine in Russia uses MongoDB to manage all user and metadata for its file sharing service. MongoDB has scaled[45] to support tens of billions of objects and TBs of data, growing at 10 million new file uploads per day.

MongoDB World

MongoDB World[46] is an annual developer conference hosted by MongoDB, Inc. Started in 2014, MongoDB World provides a multi-day opportunity for communities and experts in MongoDB to network, learn from peers, research upcoming trends and interesting use cases, and hear about new releases and developments from MongoDB, Inc.[47]

References

  1. ^ "Release Notes for MongoDB 3.2". MongoDB.
  2. ^ "Core Server Versions". MongoDB.
  3. ^ "Install MongoDB". MongoDB Manual. Retrieved 2016-08-17.
  4. ^ "MongoDB Ports". FreeBSD Ports Search. Retrieved 2016-09-15.
  5. ^ "10gen embraces what it created, becomes MongoDB Inc". Gigaom. Retrieved 29 January 2016.
  6. ^ MongoDB. "MongoDB Developer Manual". MongoDB.
  7. ^ "MongoDB Find Command".
  8. ^ MongoDB. "Introduction to Replication". MongoDB.
  9. ^ MongoDB. "Introduction to Sharding". MongoDB.
  10. ^ MongoDB. "GridFS article on MongoDB Developer's Manual". MongoDB.
  11. ^ "NGINX plugin for MongoDB source code". GitHub.
  12. ^ "lighttpd plugin for MongoDB source code". Bitbucket.
  13. ^ Malick Md. "MongoDB overview". Expertstown.
  14. ^ Kyle Kingsbury (2015-04-20). "Call me maybe: MongoDB stale reads". Retrieved 2015-07-04.
  15. ^ "MongoDB Jira Ticket 4328". jira.mongodb.org.
  16. ^ Eliot Horowitz (2015-01-22). "Renaming Our Upcoming Release to MongoDB 3.0". MongoDB. Retrieved 2015-02-23.
  17. ^ "MongoDB 2.8 release". MongoDB.
  18. ^ MongoDB. "MMAPv1 Concurrency Improvement". MongoDB.
  19. ^ MongoDB. "WiredTiger Concurrency and Compression". MongoDB.
  20. ^ MongoDB. "FAQ Concurrency - How Does Sharding Affect Concurrency". MongoDB.
  21. ^ MongoDB. "FAQ Concurrency - Do Operations Ever Yield the Lock". MongoDB.
  22. ^ MongoDB (8 July 2009). "32-bit Limitations". MongoDB.
  23. ^ David Mytton (25 September 2012). "Does Everybody Hate MongoDB". Server Density.
  24. ^ "memcmp". cppreference.com. 31 May 2013. Retrieved 26 April 2014.
  25. ^ "MongoDB Jira ticket 1920". jira.mongodb.org.
  26. ^ "MongoDB queries don't always return all matching documents!".
  27. ^ MongoDB. "MongoDB Drivers and Client Libraries". MongoDB. Retrieved 2013-07-08.
  28. ^ MongoDB. "Community Supported Drivers". MongoDB. Retrieved 2014-07-09.
  29. ^ MongoDB. "Admin UIs". Retrieved 15 September 2015.
  30. ^ MongoDB. "The AGPL". The MongoDB NoSQL Database Blog. MongoDB.
  31. ^ MongoDB. "Adobe Experience Manager". MongoDB.
  32. ^ "Presentation by Amadeus 11/2014". MongoDB.
  33. ^ "Holy Large Hadron Collider, Batman!". MongoDB.
  34. ^ MongoDB. "Craigslist". MongoDB.
  35. ^ "MongoDB at eBay". Slideshare.
  36. ^ "MongoDB based FIFA Online". MongoDB.
  37. ^ "Experiences Deploying MongoDB on AWS". MongoDB.
  38. ^ "Presentation by LinkedIn". MongoDB.
  39. ^ MongoDB. "McAfee is Improving Global Cybersecurity with MongoDB". MongoDB.
  40. ^ Doug Henschen (13 May 2013). "Metlife uses nosql for customer service". Information Week. Retrieved 8 November 2014.
  41. ^ "Plexistor for MangoDB".
  42. ^ Richard Hirsch (30 September 2011). "The Quest to Understand the Use of MongoDB in the SAP PaaS".
  43. ^ Guy Harrison (28 January 2011). "Real World NoSQL: MongoDB at Shutterfly". Gigaom.
  44. ^ "We host the MongoDB user group meetup at our office".
  45. ^ "Yandex: MongoDB". Yandex.
  46. ^ "MongoDB World".
  47. ^ "Interview with Alex Komyagin, Senior CE at MongoDB".
