Triplestore

Overview

A triplestore is a purpose-built database for the storage and retrieval of triples[1] through semantic queries. A triple is a data entity composed of a subject, a predicate, and an object, like "Bob is 35" or "Bob knows Fred".

As with a relational database, information is stored in a triplestore and retrieved via a query language. Unlike a relational database, a triplestore is optimized for the storage and retrieval of triples. In addition to queries, triples can usually be imported and exported using Resource Description Framework (RDF) and other formats.
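
The store-and-retrieve model described above can be illustrated with a minimal in-memory sketch in Python. It is not how any particular triplestore is implemented: the class name TinyTripleStore, the method names, and the predicate names "age" and "knows" are all hypothetical, chosen only to mirror the "Bob is 35" and "Bob knows Fred" examples. A pattern with unspecified positions stands in for a very simple semantic query.

```python
from typing import Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class TinyTripleStore:
    """Hypothetical in-memory store, for illustration only.

    Real triplestores add indexes, persistence, and a query language.
    """

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        """Store one subject-predicate-object statement."""
        self._triples.add((subject, predicate, obj))

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> Iterator[Triple]:
        """Yield every stored triple that fits the pattern; None is a wildcard."""
        for s, p, o in self._triples:
            if subject is not None and s != subject:
                continue
            if predicate is not None and p != predicate:
                continue
            if obj is not None and o != obj:
                continue
            yield (s, p, o)


store = TinyTripleStore()
store.add("Bob", "age", "35")       # "Bob is 35" (predicate name is an assumption)
store.add("Bob", "knows", "Fred")   # "Bob knows Fred"
store.add("Fred", "knows", "Alice")

# Pattern (Bob, knows, ?): whom does Bob know?
people_bob_knows = sorted(o for _, _, o in store.match("Bob", "knows"))
# Pattern (?, knows, ?): every "knows" statement in the store.
knows_statements = sorted(store.match(predicate="knows"))
print(people_bob_knows)
print(knows_statements)
```

The linear scan in match is the part a purpose-built engine would replace, typically with indexes over the subject, predicate, and object positions.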

Implementations

Some triplestores have been built as database engines from scratch, while others have been built on top of existing commercial relational database engines (e.g., SQL-based).[2] As in the early development of online analytical processing (OLAP) databases, this intermediate approach allowed large and powerful database engines to be constructed for little programming effort in the initial phases of triplestore development. In the long term, however, native triplestores seem likely to have the performance advantage. A difficulty with implementing triplestores over SQL is that, although triples can be stored this way, it is hard to translate queries over a graph-based RDF model (e.g., SPARQL queries) into efficient SQL queries.[3]
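
The SQL-backed approach can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the single three-column table named triples and the sample data are hypothetical, and production systems built on relational engines use more elaborate schemas. The sketch shows why the mapping is awkward: a graph pattern with two triple patterns ("Bob knows ?x" and "?x knows ?y") already needs a self-join, and each further pattern adds another.

```python
import sqlite3

# Assumed schema: one row per triple, all three positions stored as text.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE triples (
        subject   TEXT NOT NULL,
        predicate TEXT NOT NULL,
        object    TEXT NOT NULL,
        PRIMARY KEY (subject, predicate, object)
    )
    """
)
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("Bob", "age", "35"),
        ("Bob", "knows", "Fred"),
        ("Fred", "knows", "Alice"),
        ("Fred", "knows", "Carol"),
    ],
)

# Graph pattern: Bob knows ?x . ?x knows ?y
# Each triple pattern becomes one alias of the same table, joined on the
# shared variable ?x (t1.object = t2.subject).
rows = conn.execute(
    """
    SELECT t1.object AS friend, t2.object AS friend_of_friend
    FROM triples AS t1
    JOIN triples AS t2 ON t2.subject = t1.object
    WHERE t1.subject = ? AND t1.predicate = ? AND t2.predicate = ?
    ORDER BY friend, friend_of_friend
    """,
    ("Bob", "knows", "knows"),
).fetchall()
print(rows)
conn.close()
```

Storing triples this way is straightforward; the cost appears at query time, where longer graph patterns turn into chains of self-joins that the relational optimizer has to plan efficiently.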

See also

  • Dataspaces - notes that fact-based, subject-predicate-object triples (data entities) rely on existing matching and mapping generation techniques. The triple data structure allows a pay-as-you-go approach to data integration which effectively postpones the labor-intensive aspects of integration to the very end, just before the integrated data is absolutely needed.
  • Graph database - More generalized structure than triplestore. Uses graph structures with nodes, edges, and properties to represent and store data. Provides index-free adjacency, meaning every element contains a direct pointer to its adjacent elements and no index lookups are necessary. General graph databases that can store any graph are distinct from specialized graph databases such as triplestores and network databases.
  • ISO/IEC 19788 - Metadata for learning resources (MLR). In an MLR triple, the subject is always the literal of an identifier of the learning resource, such as a URI or ISBN. The predicate is also a literal, the MLR data element specification identifier. Finally, the object can be a literal or a resource class (a set of accepted values, such as a list of term identifiers from a controlled vocabulary list).
  • Metadata - syntax section - subject-predicate-object triple, a.k.a. class-attribute-value triple. The first two elements of the triple (class, attribute) are pieces of structural metadata with defined semantics. The third element is a value, preferably from a controlled vocabulary or reference (master) data. Combining the metadata and master data elements yields a metacontent statement, i.e. "metacontent = metadata + master data". All these elements can be thought of as vocabulary: both metadata and master data are vocabularies that can be assembled into metacontent statements. There are many sources of these vocabularies, both meta and master data: UML, EDIFACT, XSD, Dewey/UDC/LoC, SKOS, ISO-25964, Pantone, Linnaean Binomial Nomenclature, etc. Using controlled vocabularies for the components of metacontent statements, whether for indexing or finding, is endorsed by ISO-25964: if both the indexer and the searcher are guided to choose the same term for the same concept, then relevant documents will be retrieved.
  • Named graph a.k.a. quad store. Also see above, Graph database.
  • Outline of databases - overview article useful to place Triplestore in context of various other types of database systems.
  • Resource Description Framework (RDF) - standard for making statements about resources (in particular web resources) in the form of subject–predicate–object expressions.
  • Semantic data model - covers semantic information, symbols (instance data), meaning from instances, and facts as binary relations between data elements (Object-RelationType-Object).
  • RDFLib - a Python library for working with RDF including both in-memory and persistent Graph backends. Supports subject-predicate-object triple pattern matching.
  • Semantic wiki and Semantic MediaWiki - illustrates subject-predicate-object support for wikis, advanced query support, and implementations by organizations including Pfizer, Harvard Pilgrim Health Care, Johnson & Johnson Pharmaceutical Research and Development, Pacific Northwest National Laboratory, Metropolitan Museum of Art, and the U.S. Department of Defense.

References

  1. Rusher, Jack. "TripleStore". Semantic Web Advanced Development for Europe (SWAD-Europe), Workshop on Semantic Web Storage and Retrieval, Position Papers.
  2. US 2003145022, "Storage and Management of Semi-structured Data" (use of SQL relational databases as an RDF triple store), 2003.
  3. Broekstra, Jeen (19 September 2007). "The importance of SPARQL can not be overestimated".

External links