Most database management systems are organized around a single data model that determines how data can be organized, stored, and manipulated. In contrast, a multi-model database is designed to support multiple data models against a single, integrated backend. Document, graph, relational, and key-value models are examples of data models that may be supported by a multi-model database.
The relational data model became popular after its publication by Edgar F. Codd in 1970. Due to increasing requirements for horizontal scalability and fault tolerance, NoSQL databases became prominent after 2009. NoSQL databases use a variety of data models, with document, graph, and key-value models being popular.
A multi-model database is a database that can store, index, and query data in more than one model. For some time, databases primarily supported only one model, such as a relational database, document-oriented database, graph database, or triplestore. A database that combines several of these is multi-model.
For some time, it was all but forgotten (or considered irrelevant) that there were database models other than the relational model. The relational model and the notion of third normal form were the de facto standard for all data storage. However, before relational data modeling became dominant, from about 1980 to 2005, the hierarchical database model was commonly used, and since sometime between 2000 and 2010 many non-relational NoSQL models, including documents, triples, key-value stores, and graphs, have become popular. Arguably, geospatial data, temporal data, and text data are also separate models, though indexed, queryable text data is generally termed a "search engine" rather than a database.
The term "multi-model" was first associated with databases on May 30, 2012, in Cologne, Germany, during Luca Garulli's keynote "NoSQL Adoption – What’s the Next Step?". Garulli envisioned first-generation NoSQL products evolving into new products with more features, able to serve multiple use cases.
The idea of multi-model databases can be traced back to object-relational data management systems (ORDBMS) in the early 1990s and, in a broader scope, to federated and integrated DBMSs in the early 1980s. An ORDBMS manages different types of data, such as relational, object, text, and spatial, by plugging domain-specific data types, functions, and index implementations into the DBMS kernel. A multi-model database is most directly a response to the "polyglot persistence" approach, described by Martin Fowler, of knitting together multiple database products, each handling a different model, to achieve multi-model capability. This strategy has two major disadvantages: it significantly increases operational complexity, and it provides no support for maintaining data consistency across the separate data stores. Multi-model databases have begun to fill this gap.
Multi-model databases are intended to offer the data modeling advantages of polyglot persistence without its disadvantages. Operational complexity, in particular, is reduced through the use of a single data store. Currently, there are two general approaches to managing multi-model data directly: a single integrated multi-model database system, or tightly integrated middleware over multiple single-model data stores.
Multi-model databases include (in alphabetical order):
- ArangoDB – document (JSON), graph, key-value
- Cosmos DB – document (JSON), key-value, SQL
- Couchbase – document (JSON), key-value, N1QL
- Datastax – key-value, tabular, graph
- EnterpriseDB – document (XML and JSON), key-value
- MarkLogic – document (XML and JSON), graph triplestore, binary, SQL
- Oracle Database – relational, document (JSON and XML), graph triplestore, property graph, key-value, objects
- OrientDB – document (JSON), graph, key-value, reactive, SQL
- Redis – key-value, document (JSON), property graph, streaming, time-series
- SAP HANA – relational, document (JSON), graph, streaming
Benchmarking multi-model databases
As more platforms are proposed to deal with multi-model data, several studies have addressed benchmarking multi-model databases. For instance, Pluciennik, Oliveira, and UniBench reviewed existing multi-model databases and evaluated them against each other and against other SQL and NoSQL databases. They pointed out the following advantages of multi-model databases over single-model databases: (i) they can ingest a variety of data formats, such as CSV (including graph and relational data) and JSON, into storage without additional effort; (ii) they can employ a unified query language, such as AQL, Orient SQL, SQL/XML, or SQL/JSON, to retrieve correlated multi-model data (for example graph-JSON-key/value, XML-relational, and JSON-relational) in a single platform; and (iii) they can support multi-model ACID transactions in stand-alone mode.
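Point (ii) above, retrieving correlated relational and JSON data with one query, can be illustrated with a minimal sketch. SQLite is used here only as a readily available stand-in, not as one of the systems listed in this article, and the sketch assumes the SQLite build bundled with Python includes the JSON functions (`json_extract`). The table names, columns, and sample rows are hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")

# Relational model: an ordinary table of customers.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
# Document model: each order is a JSON document stored as text.
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, doc TEXT)")

conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Ada"), (2, "Grace")],
)
sample_orders = [
    {"customer_id": 1, "item": "keyboard", "total": 40},
    {"customer_id": 1, "item": "monitor", "total": 160},
    {"customer_id": 2, "item": "mouse", "total": 15},
]
conn.executemany(
    "INSERT INTO orders (doc) VALUES (?)",
    [(json.dumps(order),) for order in sample_orders],
)


def totals_by_customer(connection):
    """Join relational rows with fields extracted from JSON documents."""
    return connection.execute(
        """
        SELECT c.name, SUM(json_extract(o.doc, '$.total')) AS total
        FROM customers AS c
        JOIN orders AS o
          ON json_extract(o.doc, '$.customer_id') = c.id
        GROUP BY c.id, c.name
        ORDER BY c.name
        """
    ).fetchall()


result = totals_by_customer(conn)
print(result)
```

A single statement reads both representations, so the application does not need to fetch the documents separately and correlate them with the relational rows in its own code.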
The main difference between the available multi-model databases lies in their architectures. Multi-model databases can support different models either within the engine or via different layers on top of the engine. Some products provide an engine that natively supports documents and graphs, while others provide layers on top of a key-value store. With a layered architecture, each data model is provided via its own component.
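The layered architecture can be sketched as follows. This is an illustrative toy, not the design of any particular product: an in-memory dictionary stands in for the key-value engine, and the class names, key prefixes (`doc/`, `edge/`), and sample data are all assumptions made for the example. Node identifiers are assumed not to contain `/`.

```python
import json


class KeyValueStore:
    """Hypothetical stand-in for the underlying key-value engine."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def keys_with_prefix(self, prefix):
        # A real engine would do an ordered range scan; sorting emulates that.
        return sorted(k for k in self._data if k.startswith(prefix))


class DocumentLayer:
    """Document model: one key per JSON document."""

    def __init__(self, store):
        self.store = store

    def save(self, collection, doc_id, doc):
        self.store.put(f"doc/{collection}/{doc_id}", json.dumps(doc))

    def load(self, collection, doc_id):
        raw = self.store.get(f"doc/{collection}/{doc_id}")
        return None if raw is None else json.loads(raw)


class GraphLayer:
    """Graph model: one key per directed edge."""

    def __init__(self, store):
        self.store = store

    def add_edge(self, src, dst):
        self.store.put(f"edge/{src}/{dst}", "")

    def neighbors(self, src):
        prefix = f"edge/{src}/"
        return [k[len(prefix):] for k in self.store.keys_with_prefix(prefix)]


# Both layers share one store, so a graph traversal can resolve documents.
store = KeyValueStore()
docs = DocumentLayer(store)
graph = GraphLayer(store)

for person in ("alice", "bob", "carol"):
    docs.save("people", person, {"name": person.title()})
graph.add_edge("alice", "bob")
graph.add_edge("alice", "carol")

friends = [docs.load("people", n) for n in graph.neighbors("alice")]
print(friends)
```

Each layer only translates its own model into keys and values, which is why every data model in this style of architecture can be shipped as a separate component over the same engine.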
User-defined data models
In addition to offering multiple data models in a single data store, some databases allow developers to easily define custom data models. This capability is enabled by ACID transactions with high performance and scalability. In order for a custom data model to support concurrent updates, the database must be able to synchronize updates across multiple keys. ACID transactions, if they are sufficiently performant, allow such synchronization. JSON documents, graphs, and relational tables can all be implemented in a manner that inherits the horizontal scalability and fault-tolerance of the underlying data store.
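The need to synchronize updates across multiple keys can be shown with a small sketch. Here a custom graph model stores each edge under two keys (an outgoing and an incoming index entry), and a transaction ensures that either both are written or neither is. SQLite serves only as a convenient transactional key-value stand-in; the `kv` table, the `out/` and `in/` key layout, and the `fail_midway` flag are hypothetical and exist purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT NOT NULL)")


def add_edge(connection, src, dst, fail_midway=False):
    """Write both index entries for one edge inside a single transaction."""
    # As a context manager, the connection commits on success and rolls
    # back if an exception escapes the block.
    with connection:
        connection.execute(
            "INSERT INTO kv VALUES (?, ?)", (f"out/{src}/{dst}", "")
        )
        if fail_midway:
            raise RuntimeError("simulated failure between the two writes")
        connection.execute(
            "INSERT INTO kv VALUES (?, ?)", (f"in/{dst}/{src}", "")
        )


def keys(connection):
    return [row[0] for row in connection.execute("SELECT k FROM kv ORDER BY k")]


add_edge(conn, "a", "b")  # both keys are committed together

try:
    add_edge(conn, "a", "c", fail_midway=True)  # first write is rolled back
except RuntimeError:
    pass

all_keys = keys(conn)
print(all_keys)
```

Without the transaction, the failed call would leave an `out/a/c` entry with no matching `in/c/a` entry, and the custom model's two indexes would disagree.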
- Comparison of multi-model databases
- Comparison of structured storage software
- Database transaction
- Distributed database
- Distributed transaction
- Document-oriented database
- Graph database
- Relational model
- The 451 Group, "Neither Fish Nor Fowl: The Rise of Multi-Model Databases"
- Infoworld, "The Rise of the Multi-Model Database"
- Lu, Jiaheng; Holubová, Irena (2019). "Multi-model Databases: A New Journey to Handle the Variety of Data" (PDF). ACM Computing Surveys.
- "Multi-Model storage 1/2 one product". 2012-06-01.
- "Nosql Matters Conference 2012 | NoSQL Matters CGN 2012" (PDF). 2012.nosql-matters.org. Retrieved 2017-01-12.
- Lu, Jiaheng; Holubová, Irena (2017). "Multi-model Data Management: What's New and What's Next?" (PDF). EDBT: 602–605.
- Polyglot Persistence
- Jiaheng Lu, Irena Holubová, Bogdan Cautis. Multi-model Databases and Tightly Integrated Polystores: Current Practices, Comparisons, and Open Challenges. CIKM 2018: 2301-2302
- Ewa Pluciennik and Kamil Zgorzalek. "The Multi-model Databases - A Review". BDAS 2017: 141–152.
- Fábio Roberto Oliveira, Luis del Val Cura. "Performance Evaluation of NoSQL Multi-Model Data Stores in Polyglot Persistence Applications". IDEAS '16: 230–235.
- Chao Zhang, Jiaheng Lu, Pengfei Xu, Yuxing Chen. "UniBench: A Benchmark for Multi-Model Database Management Systems" (PDF). TPCTC 2018.
- ODBMS, "Polyglot Persistence or Multiple Data Models?"