Linked data

From Wikipedia, the free encyclopedia

In computing, linked data describes a method of publishing structured data so that it can be interlinked and become more useful. It builds upon standard Web technologies such as HTTP and URIs, but rather than using them to serve web pages for human readers, it extends them to share information in a way that can be read automatically by computers. This enables data from different sources to be connected and queried.[1]

Tim Berners-Lee, director of the World Wide Web Consortium, coined the term in a design note discussing issues around the Semantic Web project.[2] However, the idea is very old and is closely related to concepts including database network models, citations between scholarly articles, and controlled headings in library catalogs.[citation needed]

Principles

Tim Berners-Lee outlined four principles of linked data in his Design Issues: Linked Data note, paraphrased along the following lines:

  1. Use URIs to identify things.
  2. Use HTTP URIs so that these things can be referred to and looked up ("dereferenced") by people and user agents.
  3. Provide useful information about the thing when its URI is dereferenced, using standard formats such as RDF/XML.
  4. Include links to other, related URIs in the exposed data to improve discovery of other related information on the Web.
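As a rough illustration of the four principles, the following minimal sketch models a few statements as (subject, predicate, object) tuples. The example.org and example.com URIs are hypothetical, as are the helper names `describe` and `outgoing_links`; the two FOAF property URIs are used only as familiar vocabulary. Nothing is fetched over the network: the list stands in for what a server might return when a URI is dereferenced.

```python
# Illustrative data only: the example.org / example.com URIs are hypothetical.
FOAF = "http://xmlns.com/foaf/0.1/"

TRIPLES = [
    ("http://example.org/people/alice", FOAF + "name", "Alice"),
    ("http://example.org/people/alice", FOAF + "knows", "http://example.com/staff/bob"),
    ("http://example.com/staff/bob", FOAF + "name", "Bob"),
]


def is_http_uri(value):
    """Principles 1 and 2: things are named with HTTP URIs."""
    return isinstance(value, str) and value.startswith(("http://", "https://"))


def describe(uri, triples=TRIPLES):
    """Principle 3: return the statements whose subject is `uri`."""
    return [t for t in triples if t[0] == uri]


def outgoing_links(uri, triples=TRIPLES):
    """Principle 4: collect the URIs that the description of `uri` points to."""
    return [o for (_, _, o) in describe(uri, triples) if is_http_uri(o)]
```

Calling `outgoing_links("http://example.org/people/alice")` yields the URI of the second resource, which a client could then look up in turn to discover more data.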

Tim Berners-Lee gave a presentation on linked data at the TED 2009 conference. In it, he restated the linked data principles as three "extremely simple" rules:

  1. All kinds of conceptual things, they have names now that start with HTTP.
  2. I get important information back. I will get back some data in a standard format which is kind of useful data that somebody might like to know about that thing, about that event.
  3. I get back that information it's not just got somebody's height and weight and when they were born, it's got relationships. And when it has relationships, whenever it expresses a relationship then the other thing that it's related to is given one of those names that starts with HTTP.

Note that although the second rule mentions "standard formats", it does not require any specific standard, such as RDF/XML.
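One common way for a client to ask for data in a standard format is HTTP content negotiation, where the `Accept` header lists the serializations the client can read. The sketch below only prepares such a request with Python's standard library and does not send it. The URI and the preference order are assumptions chosen for illustration; the rule itself does not prescribe any particular format.

```python
from urllib.request import Request

# Hypothetical preference order; any standard serialization would satisfy the rule.
ACCEPT = "text/turtle, application/rdf+xml;q=0.9, application/ld+json;q=0.8"


def build_lookup_request(uri, accept=ACCEPT):
    """Prepare (but do not send) a GET request asking for machine-readable data."""
    return Request(uri, headers={"Accept": accept})


# example.org is a placeholder host, not a real linked data endpoint.
request = build_lookup_request("http://example.org/people/alice")
```

Passing `request` to `urllib.request.urlopen` would perform the actual lookup; whether a given server honours the header depends on how that server is configured.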

Components

Linking open-data community project

Figure: Instance linkages within the linking open data datasets
Figure: Class linkages within the linking open data datasets

The goal of the W3C Semantic Web Education and Outreach group's Linking Open Data community project is to extend the Web with a data commons by publishing various open datasets as RDF on the Web and by setting RDF links between data items from different data sources. In October 2007, datasets consisted of over two billion RDF triples, which were interlinked by over two million RDF links. By September 2011 this had grown to 31 billion RDF triples, interlinked by around 504 million RDF links. There is also an interactive visualization of the linked data sets to browse through the cloud.
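The distinction between RDF triples and RDF links can be sketched as follows: every statement is a triple, while a link is a triple whose subject and object belong to different data sources. This minimal sketch assumes, as a simplification, that a dataset can be identified by the host name of a URI; the hosts, the sample data, and the function names are hypothetical and do not reflect how the project computed its published figures.

```python
from urllib.parse import urlparse

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Hypothetical sample data spread over two made-up hosts.
SAMPLE = [
    ("http://data.example.org/city/berlin", OWL_SAME_AS,
     "http://geo.example.net/place/42"),
    ("http://data.example.org/city/berlin",
     "http://data.example.org/ontology/country",
     "http://data.example.org/country/germany"),
    ("http://data.example.org/city/berlin", RDFS_LABEL, "Berlin"),
]


def dataset_of(uri):
    """Assumption: the host name stands in for the dataset."""
    return urlparse(uri).netloc


def count_rdf_links(triples):
    """Count triples whose subject and object URIs live in different datasets."""
    links = 0
    for subject, _predicate, obj in triples:
        is_uri = obj.startswith(("http://", "https://"))
        if is_uri and dataset_of(subject) != dataset_of(obj):
            links += 1
    return links
```

In the sample, only the `owl:sameAs` statement crosses from one host to the other, so it is the only triple counted as a link; the label is a plain literal and the country statement stays within one dataset.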

Dataset instance and class relationships

Clickable diagrams show the individual datasets and their relationships within the DBpedia-spawned LOD cloud.

Linked open data around the clock (LATC) – EU project

The European Commission has provided a support action grant as part of the 7th Framework Programme to support the publishing and consumption of linked open data [2].

The goals are:

  • improve a round-the-clock infrastructure to monitor the usage and improve the quality of linked open data
  • provide low-barrier access for data publishers and consumers
  • develop a library of open source data processing tools
  • maintain a test-bed for processing linked data in combination with European Union data
  • support the community with tutorials and best practices

PlanetData – EU project

The PlanetData project is a European Commission-funded network of excellence that brings together European researchers in large-scale data management, including Semantic Web (RDF) data published according to Linked Data principles. PlanetData is unique in issuing open calls to bring in additional partners during the project via the PlanetData Programs [3].

Linking Open Data 2 – EU project

As part of the European Commission's 7th Framework Programme, a €6.5m grant has been given to the LOD2 project[3] to continue the work of the Linking Open Data project. Started in September 2010 and due to run until 2014, the project states its aim as "Creating Knowledge out of Interlinked Data" by developing:

  • enterprise-ready tools and methodologies for exposing and managing very large amounts of structured information on the Data Web;
  • a testbed and bootstrap network of high-quality multi-domain, multi-lingual ontologies from sources such as Wikipedia and OpenStreetMap;
  • algorithms based on machine learning for automatically interlinking and fusing data from the Web;
  • standards and methods for reliably tracking provenance, ensuring privacy and data security, and assessing the quality of information;
  • adaptive tools for searching, browsing, and authoring of linked data.[4][5]

Examples

Datasets

  • CKAN – registry of open data and content packages provided by the Open Knowledge Foundation
  • DBpedia – a dataset containing extracted data from Wikipedia; it contains about 3.4 million concepts described by 1 billion triples, including abstracts in 11 different languages
  • DBLP Bibliography – provides bibliographic information about scientific papers; it contains about 800,000 articles, 400,000 authors, and approx. 15 million triples
  • GeoNames – provides RDF descriptions of more than 7,500,000 geographical features worldwide
  • Revyu – a review service that consumes and publishes linked data, primarily from DBpedia
  • riese – serving statistical data about 500 million Europeans (the first linked dataset deployed with XHTML+RDFa)
  • UMBEL – a lightweight reference structure of 20,000 subject concept classes and their relationships derived from OpenCyc, which can act as binding classes to external data; also has links to 1.5 million named entities from DBpedia and YAGO
  • Sensorpedia – a scientific initiative at Oak Ridge National Laboratory using a RESTful web architecture to link to sensor data and related sensing systems
  • FOAF – a dataset describing persons, their properties and relationships
  • OpenPSI – a community effort to create a UK government linked data service that supports research
  • VIAF (Virtual International Authority File) – an aggregation of authority files (author names) from national libraries around the world
