Object database: Difference between revisions

Revision as of 18:30, 28 March 2006

An object database is a database in which information is represented in the form of objects. The database management system for an object database is referred to variously as an ODBMS or OODBMS.

There are two main factors that lead users to adopt object database technology. Firstly, a relational database becomes cumbersome to use with complex data. Secondly, data is generally manipulated by application software written using object-oriented programming languages such as C++, Java, Delphi and C#, and the code needed to translate between this representation of the data and the tuples of a relational database can be tedious to write, and time-consuming to execute. This mismatch between the models used to represent information in the application programs and the database is sometimes referred to as an impedance mismatch.
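The translation code described above can be sketched as follows. This is a minimal illustration in Python using the standard `sqlite3` module; the `Order` class, the table layout, and the `save` and `load` helpers are assumptions made for this example, not part of any particular product.

```python
import sqlite3
from dataclasses import dataclass, field

@dataclass
class Order:
    # Hypothetical application object: one order holding a list of line items.
    order_id: int
    customer: str
    items: list = field(default_factory=list)  # list of (product, quantity)

def save(conn, order):
    """Flatten one Order object into rows of two relational tables."""
    conn.execute("INSERT INTO orders VALUES (?, ?)",
                 (order.order_id, order.customer))
    conn.executemany(
        "INSERT INTO order_items VALUES (?, ?, ?)",
        [(order.order_id, product, qty) for product, qty in order.items],
    )

def load(conn, order_id):
    """Rebuild the Order object from its tuples."""
    row = conn.execute(
        "SELECT order_id, customer FROM orders WHERE order_id = ?",
        (order_id,),
    ).fetchone()
    items = conn.execute(
        "SELECT product, quantity FROM order_items "
        "WHERE order_id = ? ORDER BY rowid",
        (order_id,),
    ).fetchall()
    return Order(row[0], row[1], [tuple(item) for item in items])

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT)")
conn.execute("CREATE TABLE order_items (order_id INTEGER, product TEXT, quantity INTEGER)")

original = Order(1, "Ada", [("widget", 2), ("gear", 5)])
save(conn, original)
restored = load(conn, 1)
```

Even for this small object, both directions of the mapping have to be written and kept in step with the class definition by hand.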

History

Object database management systems grew out of research during the early to mid-1980s into having intrinsic database management support for graph-structured objects. The term "object-oriented database system" first appeared around 1985. Notable research projects included Encore-Ob/Server (Brown University), EXODUS (University of Wisconsin), IRIS (Hewlett-Packard), ODE (Bell Labs), ORION (Microelectronics and Computer Technology Corporation or MCC), Vodak (GMD-IPSI), and Zeitgeist (Texas Instruments). The ORION project had more published papers than any of the other efforts. Won Kim of MCC compiled the best of those papers in a book published by The MIT Press.^[1]

Early commercial products included GemStone (Servio Logic, name changed to GemStone Systems), Gbase (Graphael), and Vbase (Ontologic). The early to mid-1990s saw additional commercial products enter the market. These included ITASCA (Itasca Systems), Matisse (Matisse Software), Objectivity/DB (Objectivity, Inc.), ObjectStore (Progress Software, acquired from eXcelon which was originally Object Design), ONTOS (Ontos, Inc., name changed from Ontologic), O₂^[2] (O₂ Technology, merged with several companies, acquired by Informix, which was in turn acquired by IBM), POET (now FastObjects from Versant which acquired Poet Systems), and Versant Object Database (Versant Corporation). Some of these products remain on the market and have been joined by new products (see the product listings below).

Object database management systems added the concept of persistence to object programming languages. The early commercial products were integrated with various languages: GemStone (Smalltalk), Gbase (Lisp), and Vbase (COP). COP was the C Object Processor, a proprietary language based on C that pre-dated C++. For much of the 1990s, C++ dominated the commercial object database management market. Vendors added Java in the late 1990s and more recently, C#.

Adoption of object databases

Object databases based on persistent programming acquired a niche in application areas such as engineering and spatial databases, telecommunications, and scientific areas such as high energy physics and molecular biology. They have made little impact on mainstream commercial data processing, though there is some usage in specialized areas of financial services. Object databases also hold the records for the world's largest database (over 1000 terabytes at the Stanford Linear Accelerator Center) and for the highest ingest rate ever recorded for a commercial database (over one terabyte per hour).

Starting in 2004, object databases saw a second growth period as open source object databases emerged that were widely affordable and easy to use, because they are written entirely in OOP languages such as Java, C++, or C#. ObjectDB is an example of an object-relational database management system.

Technical features

In a pure object database, data is stored as objects which can be manipulated only using the methods defined for the class to which the object belongs. Objects are organized into a type hierarchy (sometimes a lattice), and subtypes inherit the characteristics of their supertypes. Objects can contain references to other objects, and applications can therefore access the data they require using a navigational style of programming.
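The following sketch illustrates these ideas with ordinary in-memory Python objects. The `Part` and `MachinedPart` classes are hypothetical, and no persistence is shown; the point is only the shape of the data model: state reached through methods, a subtype inheriting from a supertype, and references followed from object to object.

```python
class Part:
    """A part with a cost and references to its subparts."""

    def __init__(self, name, cost):
        self.name = name
        self._cost = cost          # intended to be reached via methods only
        self.subparts = []         # references to other objects

    def add(self, part):
        self.subparts.append(part)
        return self

    def total_cost(self):
        # Navigational access: follow references down the part hierarchy.
        return self._cost + sum(p.total_cost() for p in self.subparts)


class MachinedPart(Part):
    """Subtype: inherits Part's behaviour and adds a machining cost."""

    def __init__(self, name, cost, machining_cost):
        super().__init__(name, cost)
        self.machining_cost = machining_cost

    def total_cost(self):
        return super().total_cost() + self.machining_cost


engine = Part("engine", 100).add(MachinedPart("piston", 20, 5)).add(Part("valve", 10))
car = Part("car", 1000).add(engine)

# Navigate from the car to the piston by following references.
piston = car.subparts[0].subparts[0]
```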

Most object databases also offer some kind of query language, allowing objects to be found by a more declarative programming approach. It is in the area of object query languages, and the integration of the query and navigational interfaces, that the biggest differences between products are found.

Access to data can be faster because joins are often not needed, as they are in a tabular implementation of a relational database. An object can instead be retrieved directly, without a search, by following pointers.
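The contrast can be sketched in a few lines of Python. Both halves are deliberately naive and all names are invented for the example: a real relational system would normally use indexes rather than nested loops, and a real object database would follow persistent references rather than in-memory ones.

```python
# Relational style: rows refer to each other by key values.
employees = [("alice", 10), ("bob", 20)]          # (name, dept_id)
departments = [(10, "Research"), (20, "Sales")]   # (dept_id, dept_name)

def dept_name_by_join(employee_name):
    """Find an employee's department by matching key values (a naive join)."""
    for name, dept_id in employees:
        if name == employee_name:
            for d_id, d_name in departments:
                if d_id == dept_id:
                    return d_name
    return None

# Object style: the employee holds a direct reference to its department.
class Department:
    def __init__(self, name):
        self.name = name

class Employee:
    def __init__(self, name, department):
        self.name = name
        self.department = department

sales = Department("Sales")
bob = Employee("bob", sales)

by_join = dept_name_by_join("bob")     # found by searching for a matching key
by_reference = bob.department.name     # found by following a reference
```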

Another area of variation between products is in the way that the schema of a database is defined. A general characteristic, however, is that the programming language and the database schema use the same type definitions.
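As a rough illustration of the shared type definitions, the sketch below derives a "schema" directly from a Python dataclass. `TinyStore` is a hypothetical in-memory stand-in, not the API of any real object database, and `Star` is an invented example class.

```python
from dataclasses import dataclass, fields

@dataclass
class Star:
    # The class definition doubles as the database schema.
    name: str
    magnitude: float

class TinyStore:
    """Hypothetical in-memory store whose schema is the registered classes."""

    def __init__(self):
        self._extents = {}   # class -> list of stored instances

    def register(self, cls):
        self._extents[cls] = []

    def schema(self, cls):
        # No separate data definition language: read the class itself.
        return {f.name: f.type for f in fields(cls)}

    def store(self, obj):
        cls = type(obj)
        if cls not in self._extents:
            raise TypeError(f"{cls.__name__} is not registered")
        for f in fields(cls):
            if not isinstance(getattr(obj, f.name), f.type):
                raise TypeError(f"{f.name} must be {f.type.__name__}")
        self._extents[cls].append(obj)

    def extent(self, cls):
        return list(self._extents[cls])

store = TinyStore()
store.register(Star)
store.store(Star("Vega", 0.03))
```

Because the check in `store` uses `isinstance`, this sketch only handles fields annotated with plain classes such as `str` or `float`.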

Multimedia applications are facilitated because the class methods associated with the data are responsible for its correct interpretation.

Many object databases offer support for versioning. An object can be viewed as the set of all its versions. Also, object versions can be treated as objects in their own right. Some object databases also provide systematic support for triggers and constraints which are the basis of active databases.
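One simple way to picture an object as the set of all its versions is sketched below. The `VersionedObject` class is an assumption made for illustration; actual products differ in how versions are created, named, and stored.

```python
class VersionedObject:
    """Hypothetical versioned object: every update appends a new version."""

    def __init__(self, initial):
        self._versions = [dict(initial)]

    def update(self, **changes):
        # Copy the latest version, apply the changes, and keep both.
        new_version = dict(self._versions[-1])
        new_version.update(changes)
        self._versions.append(new_version)
        return len(self._versions) - 1   # number of the new version

    def current(self):
        return self._versions[-1]

    def version(self, number):
        # Each version can be handled as an object in its own right.
        return self._versions[number]

    def history(self):
        return list(self._versions)

design = VersionedObject({"width": 10, "material": "steel"})
v1 = design.update(width=12)
```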

Advantages and disadvantages

Benchmarks between ODBMSs and relational DBMSs have shown that ODBMSs can be clearly superior for certain kinds of tasks. The main reason for this is that many operations are performed using navigational rather than declarative interfaces, and navigational access to data is usually implemented very efficiently by following pointers.^[3]

Critics of navigational database technologies, such as ODBMSs, suggest that pointer-based techniques are optimized for very specific "search routes" or viewpoints. For general-purpose queries on the same information, however, pointer-based techniques tend to be slower and more difficult to formulate than relational ones. Thus, the navigational approach appears to simplify specific known uses at the expense of general, unforeseen future uses.
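The criticism can be illustrated with a small Python sketch in which all classes and data are invented. The object graph is laid out for one anticipated route, from a department to its employees; a question that cuts across that route has to visit every department.

```python
class Employee:
    def __init__(self, name, city):
        self.name = name
        self.city = city

class Department:
    def __init__(self, name, employees):
        self.name = name
        self.employees = employees   # the anticipated access path

departments = [
    Department("Research", [Employee("alice", "Oslo"), Employee("bob", "Lima")]),
    Department("Sales", [Employee("carol", "Oslo")]),
]

def staff_of(department):
    """Anticipated route: one hop along the stored references."""
    return [e.name for e in department.employees]

def employees_in_city(all_departments, city):
    """Unanticipated question: no reference leads from a city to its
    employees, so the whole structure has to be traversed."""
    return [e.name
            for d in all_departments
            for e in d.employees
            if e.city == city]
```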

Other factors that work against ODBMSs include the lack of interoperability with many tools and features that are taken for granted in the SQL world, including but not limited to industry-standard connectivity, reporting tools, OLAP tools, and backup and recovery standards. Additionally, object databases lack a formal mathematical foundation, unlike the relational model, and this in turn leads to weaknesses in their query support. However, this objection is offset by the fact that some ODBMSs fully support SQL in addition to navigational access, e.g. Objectivity/SQL++.

In fact there is an intrinsic tension between the notion of encapsulation, which hides data and makes it available only through a published set of interface methods, and the assumption underlying much database technology, which is that data should be accessible to queries based on data content rather than predefined access paths. Database-centric thinking tends to view the world through a declarative and attribute-driven viewpoint, while OOP tends to view the world through a behavioral viewpoint. This is one of the many impedance mismatch issues surrounding OOP and databases.
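The tension can be seen in a small Python sketch. The `Account` class is hypothetical, and Python's name mangling is only a weak form of encapsulation, but it is enough to show the point: the balance is hidden behind methods, so a content-based question such as "which accounts hold at least 100?" can only be asked if the class author published a method for it.

```python
class Account:
    def __init__(self, owner, balance):
        self.owner = owner
        self.__balance = balance   # hidden from outside code by name mangling

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.__balance += amount

    def can_cover(self, amount):
        # The only published way to ask anything about the balance.
        return self.__balance >= amount

accounts = [Account("ann", 50), Account("ben", 250)]

# Behavioural view: works, because can_cover() is part of the interface.
solvent = [a.owner for a in accounts if a.can_cover(100)]

# Attribute-driven view: no public "balance" attribute exists to filter on.
balance_is_visible = hasattr(accounts[0], "balance")
```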

Although some commentators have written off object database technology as a failure, the essential arguments in its favour remain valid, and attempts to integrate database functionality more closely into object programming languages continue in both the research and the industrial communities.

Standards

The Object Data Management Group (ODMG) was a consortium of object database and object-relational mapping vendors, members of the academic community, and interested parties. Its goal was to create a set of specifications that would allow for portable applications that store objects in database management systems. It published several versions of its specification. The last release was ODMG 3.0. By 2001, most of the major object database and object-relational mapping vendors claimed conformance to the ODMG Java Language Binding. Compliance with the other components of the specification was mixed.^[4] In 2001, the ODMG Java Language Binding was submitted to the Java Community Process as a basis for the Java Data Objects specification. The ODMG member companies then decided to concentrate their efforts on the Java Data Objects specification. As a result, the ODMG disbanded in 2001.

In February 2006, the Object Management Group (OMG) announced the acquisition of the ODMG 3.0 specification and the formation of the Object Database Technology Working Group (ODBT WG). The ODBT WG plans to work on the next generation of object database specifications.

Many object database ideas were also absorbed into SQL:1999 and have been implemented in varying degrees in object-relational database products.

In 2005, Cook, Rai, and Rosenberger proposed dropping all standardization efforts to introduce additional object-oriented query APIs and instead using the OO programming language itself, i.e., Java and .NET, to express queries. As a result, Native Queries emerged. Similarly, Microsoft announced Language Integrated Query (LINQ) and DLINQ in September 2005 to provide close, language-integrated database query capabilities with its programming languages C# and VB.NET.
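The idea of expressing a query in the programming language itself can be sketched in Python, used here only for illustration since the proposal concerned Java and .NET. The `native_query` function and the `Student` class are assumptions for this example, and the sketch simply tests the predicate against every object, with no query optimization.

```python
from dataclasses import dataclass

@dataclass
class Student:
    name: str
    age: int
    grade: float

def native_query(extent, predicate):
    """Return the objects for which the language-level predicate holds.

    Hypothetical helper: it evaluates the predicate object by object.
    """
    return [obj for obj in extent if predicate(obj)]

students = [
    Student("Ada", 19, 3.9),
    Student("Bo", 22, 3.7),
    Student("Cy", 18, 2.8),
]

# The query is ordinary code, so it is checked by the same tools as the
# rest of the program instead of being embedded as a query string.
young_high_achievers = native_query(
    students, lambda s: s.age < 20 and s.grade >= 3.5
)
```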

References and notes

^ Kim, Won. Introduction to Object-Oriented Databases. The MIT Press, 1990. ISBN 0-262-11124-1
^ Bancilhon, Francois; Delobel, Claude; and Kanellakis, Paris. Building an Object-Oriented Database System: The Story of O₂. Morgan Kaufmann Publishers, 1992. ISBN 1-55860-169-4.
^ Animation showing how an object database works
^ Barry, Douglas and Duhl, Joshua. Object Storage Fact Books: Object DBMSs and Object-Relational Mapping. Barry & Associates, Inc., 2001. Pages showing the ODMG compliance for both object database and object-relational mapping products in 2001.

External links

@@ Line 16: / Line 16: @@
 Object databases based on persistent programming acquired a niche in application areas such as
-engineering and spatial databases, [[telecommunications]], and scientific areas such as [[high energy physics]] and [[molecular biology]]. They have made little impact on mainstream commercial data processing, though there is some usage in specialized areas of [[financial services]].
+engineering and spatial databases, [[telecommunications]], and scientific areas such as [[high energy physics]] and [[molecular biology]]. They have made little impact on mainstream commercial data processing, though there is some usage in specialized areas of [[financial services]]. It is also worth noting that object datbases hold record for the World's largest database (over 1000 Terabytes at Stanford Linear Accelerator Center) and the highest ingest rate ever recorded for a commercial database (over one Terabyte per hour).
 Starting in 2004, object databases have seen a second growth period when open source object databases emerged that were widely affordable and easy to use, because they are entirely written in OOP languages like Java, C++, or C#.