Azure Cosmos DB is Microsoft’s proprietary globally-distributed, multi-model database service "for managing data at planet-scale" launched in May 2017. It is schema-agnostic, horizontally scalable and generally classified as a NoSQL database.
Internally, Cosmos DB stores "items" in "containers", with these two concepts being surfaced differently depending on the API used (for example, they appear as "documents" in "collections" when using the MongoDB-compatible API). Containers are grouped in "databases", which act as namespaces above containers. Containers are schema-agnostic, which means that no schema is enforced when adding items.
By default, every field in each item is automatically indexed, generally providing good performance without tuning to specific query patterns. These defaults can be modified by setting an indexing policy which can specify, for each field, the index type and precision desired. Cosmos DB offers 3 types of indexes:
- Hash, supporting equality queries,
- Range, supporting range and ORDER BY queries,
- Spatial, supporting spatial queries from points, polygons and line strings encoded in standard GeoJSON fragments.
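As a rough illustration, an indexing policy of this kind can be pictured as a nested document that lists, for each path, the index kind and precision. The sketch below only builds and inspects such a document in plain Python. The field names (`includedPaths`, `indexes`, `kind`, `dataType`, `precision`) and the paths (`/name/?`, `/age/?`, `/location/?`, `/rawPayload/*`) are assumptions made for illustration and are not taken from this article; the exact layout should be checked against the service documentation.

```python
import json

# Hypothetical indexing policy: field names and paths are assumptions.
indexing_policy = {
    "indexingMode": "consistent",
    "automatic": True,
    "includedPaths": [
        # Hash index for equality lookups on a string field
        {"path": "/name/?",
         "indexes": [{"kind": "Hash", "dataType": "String", "precision": 3}]},
        # Range index for range and ORDER BY queries on a numeric field
        {"path": "/age/?",
         "indexes": [{"kind": "Range", "dataType": "Number", "precision": -1}]},
        # Spatial index for GeoJSON points
        {"path": "/location/?",
         "indexes": [{"kind": "Spatial", "dataType": "Point"}]},
    ],
    "excludedPaths": [{"path": "/rawPayload/*"}],
}


def index_kinds(policy):
    """Return a mapping of path -> list of index kinds declared for that path."""
    return {
        entry["path"]: [index["kind"] for index in entry["indexes"]]
        for entry in policy["includedPaths"]
    }


if __name__ == "__main__":
    print(json.dumps(index_kinds(indexing_policy), indent=2))
```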
Containers can also enforce unique key constraints to ensure data integrity.
Each Cosmos DB container exposes a change feed, which clients can subscribe to in order to get notified of new items being added or updated in the container. Item deletions are currently not exposed by the change feed. Changes are persisted by Cosmos DB, which makes it possible to request changes from any point in time, up to the creation of the container.
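The behaviour described above can be modelled with a small in-memory stand-in. This is a toy model only, not the Cosmos DB SDK: the class `ToyContainer` and its methods are hypothetical names, and the sketch records every write, whereas the real service may expose changes differently.

```python
class ToyContainer:
    """In-memory stand-in for a container with a change feed (not the real SDK)."""

    def __init__(self):
        self.items = {}       # id -> current item
        self.change_log = []  # append-only list of (sequence_number, item snapshot)

    def upsert(self, item):
        """Insert or update an item and record the change in the feed."""
        self.items[item["id"]] = item
        self.change_log.append((len(self.change_log) + 1, dict(item)))

    def delete(self, item_id):
        """Remove an item; deletions add nothing to the feed, as described above."""
        self.items.pop(item_id, None)

    def read_changes(self, after_sequence=0):
        """Return changes recorded after a sequence number (0 = since creation)."""
        return [entry for entry in self.change_log if entry[0] > after_sequence]


if __name__ == "__main__":
    container = ToyContainer()
    container.upsert({"id": "a", "value": 1})
    container.upsert({"id": "a", "value": 2})
    container.delete("a")
    print(container.read_changes())   # both writes, no trace of the delete
    print(container.read_changes(1))  # only the second write
```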
A "Time to Live" (or TTL) can be specified at the container level to let Cosmos DB automatically delete items after a certain amount of time, expressed in seconds. The countdown starts from the last update of the item. If needed, the TTL can also be overridden at the item level.
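The expiry rule can be sketched as a small function. This is a simplified model under stated assumptions: the function name `is_expired` and its parameters are hypothetical, an item-level TTL is assumed to take precedence whenever it is present, `None` stands for "no TTL configured", and an item is treated as expired once the full TTL has elapsed. The service may apply additional rules that are not modelled here.

```python
from typing import Optional


def is_expired(last_update_ts: int, now_ts: int,
               container_ttl: Optional[int],
               item_ttl: Optional[int] = None) -> bool:
    """Return True if an item should be considered expired.

    All times are in seconds. The countdown starts at the item's last update.
    Assumption: an item-level TTL, when present, overrides the container TTL.
    """
    ttl = item_ttl if item_ttl is not None else container_ttl
    if ttl is None:
        return False  # no TTL configured anywhere: the item never expires
    return now_ts - last_update_ts >= ttl


if __name__ == "__main__":
    # Container TTL of 60 s, item last updated at t=100.
    print(is_expired(100, 130, container_ttl=60))               # still within TTL
    print(is_expired(100, 130, container_ttl=60, item_ttl=10))  # item override expired
```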
The internal data model described in the previous section is exposed through:
- A proprietary SQL API
- 4 different compatibility APIs, exposing endpoints that are partially compatible with the wire protocols of MongoDB, Gremlin, Cassandra and Azure Table Storage; these compatibility APIs make it possible for any compatible application to connect to and use Cosmos DB through standard drivers or SDKs, while also benefiting from Cosmos DB's core features like partitioning and global distribution.
|API||Internal mapping of containers||Internal mapping of items||Compatibility status and remarks|
|MongoDB||Collections||Documents||Compatible with version 3.2 of the MongoDB wire protocol. Version 3.4 and support for MongoDB's aggregation pipeline are currently under preview.|
|Gremlin||Graphs||Nodes and edges||Compatible with version 3.2 of the Gremlin specification.|
|Cassandra||Table||Row||Compatible with version 4 of the Cassandra Query Language (CQL) wire protocol.|
|Azure Table Storage||Table||Item|
- Stored procedures. Functions that bundle an arbitrarily complex set of operations and logic into an ACID-compliant transaction. They are isolated from changes made while the stored procedure is executing, and either all write operations succeed or they all fail, leaving the database in a consistent state. Stored procedures are executed in a single partition, so the caller must provide a partition key when calling into a partitioned collection. Stored procedures can also compensate for missing functionality: for instance, the open-source documentdb-lumenize project works around the lack of aggregation capability by implementing an OLAP cube as a stored procedure.
- Triggers. Functions that get executed before or after specific operations (such as a document insertion) and that can either alter the operation or cancel it. Triggers are executed only when the request explicitly asks for them.
- User-defined functions (UDF). Functions that can be called from the SQL query language and that extend it, making up for limited SQL features.
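The all-or-nothing behaviour attributed to stored procedures in the list above can be illustrated with a toy model. This is not how Cosmos DB implements transactions, and it does not use the service's stored procedure language: `run_transactionally` and `transfer` are hypothetical helpers, and the "partition" is just a Python dictionary. The sketch applies the writes to a working copy and commits them only if the whole procedure succeeds.

```python
import copy


def run_transactionally(partition: dict, procedure) -> bool:
    """Apply `procedure` to a working copy of `partition`; commit only on success."""
    working_copy = copy.deepcopy(partition)
    try:
        procedure(working_copy)
    except Exception:
        return False  # nothing was written to the original partition
    partition.clear()
    partition.update(working_copy)
    return True


def transfer(amount):
    """Build a procedure that moves `amount` from alice to bob (hypothetical items)."""
    def procedure(items):
        items["alice"]["balance"] -= amount
        if items["alice"]["balance"] < 0:
            raise ValueError("insufficient funds")
        items["bob"]["balance"] += amount
    return procedure


if __name__ == "__main__":
    partition = {"alice": {"balance": 50}, "bob": {"balance": 0}}
    print(run_transactionally(partition, transfer(30)), partition)
    print(run_transactionally(partition, transfer(100)), partition)
```

In the second call the procedure fails after its first write, and the partition is left exactly as it was before the call.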
Cosmos DB added automatic partitioning capability in 2016 with the introduction of partitioned containers. Behind the scenes, partitioned containers span multiple physical partitions with items distributed by a client-supplied partition key. Cosmos DB automatically decides how many partitions to spread data across depending on the size and throughput needs. When partitions are added or removed, the operation is performed without any downtime so data remains available while it is re-balanced across the new or remaining partitions.
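The idea of distributing items by a client-supplied partition key can be sketched with simple hash partitioning. The hash function actually used by Cosmos DB is not described here, so the SHA-256 and modulo scheme below is purely an assumption, as are the function names and the `customerId` field. A plain modulo scheme also moves most keys when the partition count changes, which the real service presumably avoids.

```python
import hashlib


def physical_partition(partition_key: str, partition_count: int) -> int:
    """Map a partition key to a partition index (illustrative hashing scheme)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def distribute(items, key_field, partition_count):
    """Group items into buckets, one bucket per physical partition."""
    buckets = {index: [] for index in range(partition_count)}
    for item in items:
        index = physical_partition(str(item[key_field]), partition_count)
        buckets[index].append(item)
    return buckets


if __name__ == "__main__":
    orders = [{"id": str(n), "customerId": f"c{n % 5}"} for n in range(20)]
    for index, bucket in distribute(orders, "customerId", 4).items():
        print(index, [item["id"] for item in bucket])
```

Items that share a partition key always land in the same bucket, which is what allows single-partition operations to find them together.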
Before partitioned containers were available, it was common to write custom code to partition data and some of the Cosmos DB SDKs explicitly supported several different partitioning schemes. That mode is still available but only recommended when storage and throughput requirements don't exceed the capacity of one container, or when the built-in partitioning capability does not otherwise meet the application's needs.
Developers can specify the desired throughput to match the application's expected load. Cosmos DB reserves resources (memory, CPU and IOPS) to guarantee the requested throughput while maintaining request latency below 10 ms for both reads and writes at the 99th percentile. Throughput is specified in request units (RUs) per second. The number of RUs consumed by a particular operation depends on several factors, but fetching a single 1 KB document by its `id` field consumes 1 RU. Delete, update, and insert operations consume around 5 RUs for 1 KB documents. Large queries and stored procedure executions can consume hundreds to thousands of RUs, depending on the complexity of the operations needed.
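The figures above allow a back-of-the-envelope estimate of the throughput to provision. The sketch below is a rough calculation only: the function name is hypothetical, the cost is assumed to scale linearly with item size (which the text does not state), and queries and stored procedures are ignored.

```python
import math

READ_RU_PER_KB = 1.0   # point read of a 1 KB item, per the figure above
WRITE_RU_PER_KB = 5.0  # approximate cost of an insert, update or delete of a 1 KB item


def estimate_rus_per_second(reads_per_sec: float, writes_per_sec: float,
                            item_size_kb: float = 1.0) -> int:
    """Rough RU/s estimate, assuming cost scales linearly with item size."""
    per_read = READ_RU_PER_KB * item_size_kb
    per_write = WRITE_RU_PER_KB * item_size_kb
    return math.ceil(reads_per_sec * per_read + writes_per_sec * per_write)


if __name__ == "__main__":
    # 500 point reads and 100 writes per second on 1 KB items:
    # 500 * 1 + 100 * 5 = 1000 RU/s
    print(estimate_rus_per_second(500, 100))
```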
Throughput can be provisioned at either the container or the database level. When provisioned at the database level, the throughput is shared across all the containers within that database, with the additional ability to have dedicated throughput for some containers.
Cosmos DB databases can be configured to be available in any of the Microsoft Azure regions (54 regions as of December 2018), letting application developers place their data closer to where their users are. Each container's data gets transparently replicated across all configured regions. Adding or removing regions is performed without any downtime or impact on performance. By leveraging Cosmos DB's multi-homing API, applications don't have to be updated or redeployed when regions are added or removed, as Cosmos DB will automatically route their requests to the regions that are available and closest to their location.
Cosmos DB offers 5 consistency levels, listed here from weakest to strongest:
- Eventual doesn't guarantee any ordering and only ensures that replicas will eventually converge.
- Consistent prefix adds ordering guarantees on top of eventual.
- Session is scoped to a single client connection and ensures read-your-own-writes consistency for each client; it is the default consistency level.
- Bounded staleness augments consistent prefix by ensuring that reads won't lag beyond x versions of an item or some specified time window.
- Strong consistency (or linearizable) ensures that clients always read the latest globally committed write.
The desired consistency level is defined at the account level but can be overridden on a per request basis by using a specific HTTP header or the corresponding feature exposed by the SDKs. All 5 consistency levels have been specified and verified using the TLA+ specification language, with the TLA+ model being open-sourced on GitHub.
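The relationship between the levels, the account-level default and the per-request override can be sketched as follows. This is an illustrative model, not SDK code: the enum, `effective_level` and `within_staleness_bound` are hypothetical names, any restriction on which overrides the service accepts is not modelled, and the staleness check assumes that a read must respect both the version bound and the time bound.

```python
from enum import IntEnum
from typing import Optional


class Consistency(IntEnum):
    """The five levels, ordered from weakest to strongest."""
    EVENTUAL = 1
    CONSISTENT_PREFIX = 2
    SESSION = 3
    BOUNDED_STALENESS = 4
    STRONG = 5


def effective_level(account_default: Consistency,
                    request_override: Optional[Consistency] = None) -> Consistency:
    """Use the per-request override when given, else the account default."""
    return request_override if request_override is not None else account_default


def within_staleness_bound(version_lag: int, seconds_lag: float,
                           max_versions: int, max_seconds: float) -> bool:
    """True if a replica is fresh enough to serve a bounded-staleness read."""
    return version_lag <= max_versions and seconds_lag <= max_seconds


if __name__ == "__main__":
    print(effective_level(Consistency.SESSION).name)
    print(effective_level(Consistency.SESSION, Consistency.EVENTUAL).name)
    print(within_staleness_bound(5, 2.0, max_versions=10, max_seconds=5.0))
```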
Gartner Research positioned Microsoft as the leader in its 2016 Magic Quadrant for Operational Database Management Systems and explicitly called out the unique capabilities of Cosmos DB in its write-up.
Real-world use cases
These Microsoft services utilize Cosmos DB:
- Active Directory
To build a more globally resilient application or system, Cosmos DB can be combined with other Azure services such as Azure App Service and Azure Traffic Manager.
Limitations, criticism and cautions
- Limited backup/restore features. Whilst automated backups are taken, they are limited in duration (only the last two backups are retained over an 8-hour period). Backups can only be restored by raising a support ticket and awaiting assistance from Microsoft's support team. Furthermore, whilst the backup facility does protect against accidental deletion of databases and whole collections, it offers very little protection against document-level corruption because there is no "point-in-time" restore option. These limitations mean that Cosmos DB may not satisfy the long-term data retention policies and requirements of many organisations.
- Triggers must be explicitly specified on each operation that should use them. This makes them ineffective as a mechanism for maintaining business logic consistency unless it is certain that all the correct triggers are specified for every operation.
- .NET LINQ language-integrated queries are not fully supported. More LINQ support has been added over time, but developers are often confused when LINQ code that works on other systems fails to work as expected on Cosmos DB, as evidenced by the large number of Stack Overflow questions containing both tags.
- SQL is very limited, offering no joins and only basic aggregation. Aggregation is limited to the COUNT, SUM, MIN, MAX and AVG functions, with no support for GROUP BY or other aggregation functionality found in other database systems. However, stored procedures can be used to implement in-the-database aggregation capability.
- "Collection" means something different in Cosmos DB: it is simply a bucket of documents. There is a tendency to equate collections with tables, with each collection holding only a single type of document, which is not recommended with Cosmos DB. Rather, developers are encouraged to distinguish document types with a "type" field, or by adding an "isTypeA = true" field to all documents of Type A, "isTypeB = true" to all documents of Type B, etc. This is especially confusing to developers coming from MongoDB, which has a "collection" entity that is intended to be used in a very different way.
- There is no query plan visibility (e.g. nothing comparable to the "EXPLAIN" keyword in SQL).
- Many developers have asked Microsoft to support real pagination through a Skip/Take mechanism. Skip/Take was first requested in late August 2014, and Microsoft has repeatedly stated that it is working on it.
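The "type" field convention mentioned in the list above, together with aggregation performed outside the query language, can be illustrated with plain Python data. The item shapes, field names (`type`, `customerId`, `total`) and helper functions are hypothetical; the grouping is done in client code here simply to show what a GROUP BY would otherwise express.

```python
from collections import defaultdict

# One container holding two kinds of documents, told apart by a "type" field.
items = [
    {"id": "1", "type": "customer", "name": "Ada"},
    {"id": "2", "type": "order", "customerId": "1", "total": 42.0},
    {"id": "3", "type": "order", "customerId": "1", "total": 8.0},
]


def items_of_type(all_items, type_name):
    """Select the documents whose "type" field matches `type_name`."""
    return [item for item in all_items if item.get("type") == type_name]


def total_by_customer(all_items):
    """Sum order totals per customer, emulating a GROUP BY in client code."""
    totals = defaultdict(float)
    for order in items_of_type(all_items, "order"):
        totals[order["customerId"]] += order["total"]
    return dict(totals)


if __name__ == "__main__":
    print(items_of_type(items, "customer"))
    print(total_by_customer(items))
```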