Linked data

An introductory overview of Linked Open Data in the context of cultural institutions.

In computing, linked data (often capitalized as Linked Data) describes a method of publishing structured data so that it can be interlinked and become more useful. It builds upon standard Web technologies such as HTTP, RDF and URIs, but rather than using them to serve web pages for human readers, it extends them to share information in a way that can be read automatically by computers. This enables data from different sources to be connected and queried.[1]

Tim Berners-Lee, director of the World Wide Web Consortium, coined the term in a design note discussing issues around the Semantic Web project.[2]

Principles

Tim Berners-Lee outlined four principles of linked data in his Design Issues: Linked Data note,[2] paraphrased along the following lines:

  1. Use URIs to denote things.
  2. Use HTTP URIs so that these things can be referred to and looked up ("dereferenced") by people and user agents.
  3. Provide useful information about the thing when its URI is dereferenced, using standards such as RDF and SPARQL.
  4. Include links to other related things (using their URIs) when publishing data on the Web.
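Principles 2 and 3 can be illustrated with a minimal Python sketch that builds, but does not send, an HTTP request asking a server for an RDF description of a thing. The URI `http://example.org/id/alice`, the helper name `build_dereference_request` and the particular `Accept` header values are assumptions made for illustration; they are not taken from the design note.

```python
from urllib.request import Request


def build_dereference_request(uri: str) -> Request:
    """Build (but do not send) an HTTP GET request that asks for RDF."""
    # Principle 2: the name of the thing should be an HTTP URI.
    if not uri.startswith(("http://", "https://")):
        raise ValueError("expected an HTTP URI, got: " + uri)
    # Principle 3: ask for a standard data format. Turtle is preferred here,
    # then RDF/XML, then anything else the server can offer (an assumption).
    accept = "text/turtle, application/rdf+xml;q=0.9, */*;q=0.1"
    return Request(uri, headers={"Accept": accept}, method="GET")


# Hypothetical URI naming a person, used only for illustration.
request = build_dereference_request("http://example.org/id/alice")
print(request.get_method(), request.full_url)
print(request.get_header("Accept"))
```

Sending the request (for example with `urllib.request.urlopen`) would only return RDF if the server supports content negotiation for that URI.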

Tim Berners-Lee gave a presentation on linked data at the TED 2009 conference.[3] In it, he restated the linked data principles as three "extremely simple" rules:

  1. All kinds of conceptual things, they have names now that start with HTTP.
  2. I get important information back. I will get back some data in a standard format which is kind of useful data that somebody might like to know about that thing, about that event.
  3. I get back that information it's not just got somebody's height and weight and when they were born, it's got relationships. And when it has relationships, whenever it expresses a relationship then the other thing that it's related to is given one of those names that starts with HTTP.
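The "standard format" and the relationships in these rules can be sketched by writing a few statements out as N-Triples, one common RDF serialization. The people and their URIs below are hypothetical, and treating every value that starts with `http://` or `https://` as a name rather than a literal is a simplification made for this example; the `name` and `knows` properties come from the FOAF vocabulary.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
ALICE = "http://example.org/id/alice"  # hypothetical person URI
BOB = "http://example.org/id/bob"      # hypothetical person URI


def is_http_name(term: str) -> bool:
    """True if the term is one of the 'names that start with HTTP'."""
    return term.startswith(("http://", "https://"))


def to_ntriples(triples) -> str:
    """Serialize (subject, predicate, object) string triples as N-Triples."""
    lines = []
    for s, p, o in triples:
        if is_http_name(o):
            obj = f"<{o}>"  # a relationship to another named thing
        else:
            # Plain literal: escape backslashes and double quotes.
            escaped = o.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


triples = [
    (ALICE, FOAF + "name", "Alice"),  # a plain fact about the thing
    (ALICE, FOAF + "knows", BOB),     # a relationship, named with HTTP
]
print(to_ntriples(triples))
```

The second statement is what makes the data linked: its object is itself an HTTP name that can be looked up in turn.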

Components

Linking open-data community project

Instance linkages within the linking open data datasets
Class linkages within the linking open data datasets

The goal of the W3C Semantic Web Education and Outreach group's Linking Open Data community project is to extend the Web with a data commons by publishing various open datasets as RDF on the Web and by setting RDF links between data items from different data sources. In October 2007, the datasets consisted of over two billion RDF triples, which were interlinked by over two million RDF links.[4][5] By September 2011 this had grown to 31 billion RDF triples, interlinked by around 504 million RDF links. An interactive visualization of the linked datasets is also available for browsing the cloud.[6]
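The distinction between RDF triples and RDF links in these figures can be sketched as follows: every statement counts as a triple, but only statements whose subject and object belong to different data sources count as links. Identifying a data source by the host part of a URI is an assumption made for this example, and the two `.example` hosts are hypothetical.

```python
from urllib.parse import urlsplit

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def dataset_of(uri: str) -> str:
    """Identify a data source by the host of a URI (a simplification)."""
    return urlsplit(uri).netloc


def count_triples_and_links(triples):
    """Return (number of triples, number of links between data sources)."""
    total = 0
    links = 0
    for s, p, o in triples:
        total += 1
        is_uri = o.startswith(("http://", "https://"))
        if is_uri and dataset_of(s) != dataset_of(o):
            links += 1
    return total, links


# Hypothetical data: two sources describing the same city.
triples = [
    ("http://data-a.example/city/berlin", RDFS_LABEL, "Berlin"),
    ("http://data-a.example/city/berlin", OWL_SAME_AS,
     "http://data-b.example/place/42"),
    ("http://data-a.example/city/berlin",
     "http://data-a.example/prop/country",
     "http://data-a.example/country/germany"),
]
print(count_triples_and_links(triples))  # prints (3, 1)
```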

European Union Projects

Several European Union projects involve linked data. These include the Linked Open Data Around-The-Clock (LATC) project,[7] the PlanetData project,[8] and the Linked Open Data 2 (LOD2) project.[9][10][11] Data linking is one of the main goals of the EU Open Data Portal, which makes available thousands of datasets for anyone to reuse and link.

Datasets

  • CKAN – registry of open data and content packages provided by the Open Knowledge Foundation
  • DBpedia – a dataset containing extracted data from Wikipedia; it contains about 3.4 million concepts described by 1 billion triples, including abstracts in 11 different languages
  • GeoNames – provides RDF descriptions of more than 7,500,000 geographical features worldwide
  • UMBEL – a lightweight reference structure of 20,000 subject concept classes and their relationships derived from OpenCyc, which can act as binding classes to external data; also has links to 1.5 million named entities from DBpedia and YAGO
  • FOAF – a dataset describing persons, their properties and relationships
  • reegle data – a linked open data pool containing clean energy datasets, policy reports, project output documents, and terminology from reegle
  • eagle-i – a federated dataset publishing over 60,000 curated biomedical resources with SPARQL endpoints
  • Ontobee – a SPARQL-based linked ontology data server and browser that has been used for over 100 ontologies containing over two million ontology terms
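Several of the datasets above are exposed through SPARQL endpoints. As a rough sketch, a query can be sent to such an endpoint as an HTTP GET request with the query text in a `query` parameter, as described by the SPARQL Protocol. The endpoint address `http://example.org/sparql` and the helper name below are placeholders, not the address or API of any dataset listed here.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint address, used only for illustration.
ENDPOINT = "http://example.org/sparql"

# Ask for any five statements held by the endpoint.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"


def build_sparql_get_url(endpoint: str, query: str) -> str:
    """Build the URL of a SPARQL query sent via HTTP GET."""
    # urlencode percent-encodes the query text so it is safe in a URL.
    return endpoint + "?" + urlencode({"query": query})


url = build_sparql_get_url(ENDPOINT, QUERY)
print(url)

# Decoding the URL again recovers the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded == QUERY)
```

Fetching the URL would return query results only from a server that actually implements the protocol at that address.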

Dataset instance and class relationships

Clickable diagrams that show the individual datasets and their relationships within the DBpedia-spawned LOD cloud, as shown by the figures to the right, are:

References

  1. ^ Bizer, Christian; Heath, Tom; Berners-Lee, Tim (2009). "Linked Data - The Story So Far". International Journal on Semantic Web and Information Systems 5 (3): 1–22. doi:10.4018/jswis.2009081901. ISSN 1552-6283. Retrieved 2010-12-18.
  2. ^ a b Tim Berners-Lee (2006-07-27). "Linked Data—Design Issues". W3C. Retrieved 2010-12-18. 
  3. ^ "Tim Berners-Lee on the next Web". 
  4. ^ Linking Open Data
  5. ^ Fensel, Dieter; Facca, Federico Michele; Simperl, Elena; Toma, Ioan (2011). Semantic Web Services. Springer. p. 99. ISBN 3642191924.
  6. ^ interactive visualization of the linked data sets
  7. ^ Linked open data around the clock (LATC)
  8. ^ PlanetData
  9. ^ Linking Open Data 2 (LOD2)
  10. ^ "CORDIS FP7 ICT Projects – LOD2". European Commission. 2010-04-20. 
  11. ^ "LOD2 Project Fact Sheet – Project Summary". 2010-09-01. Retrieved 2010-12-18. 
