
Data Commons: Difference between revisions

From Wikipedia, the free encyclopedia
Content deleted Content added
m @DanBri: Disable the categories on this page while it is still a draft, per WP:DRAFTNOCAT/WP:USERNOCAT (using Draft no cat v1.5). The easiest way to do this is by converting them to links, by adding a colon: "[[Category:" → "[[:Category:"
DanBri (talk | contribs)
m Tidied up, especially links/references.
Line 9: Line 9:


The datacommons.org site was launched in May 2018 with an initial dataset consisting of fact-checking data published in Schema.org "ClaimReview" format by several fact checkers from the [[Poynter_Institute#International_Fact-Checking_Network|International Fact-Checking Network]] ({{cite web |url=http://www.datacommons.org/factcheck/
The datacommons.org site was launched in May 2018 with an initial dataset consisting of fact-checking data published in Schema.org "ClaimReview" format by several fact checkers from the [[Poynter_Institute#International_Fact-Checking_Network|International Fact-Checking Network]] ({{cite web |url=http://www.datacommons.org/factcheck/
|title=Fact Checks |date=29 March 2019 |website=datacommons.org |access-date=14 October 2020}}). The service expanded during 2019 to include an RDF-style [[Knowledge_graph|Knowledge Graph]] populated from a number of largely statistical open datasets. The service was announced to a wider audience in 2019<ref>{{cite web |title=Doing our part to share open data responsibly <ref>{{cite web |url=https://www.blog.google/technology/ai/sharing-open-data/ |website=The Keyword |publisher=Google |accessdate=14 October 2020}}</ref>.
|title=Fact Checks |date=29 March 2019 |website=datacommons.org |access-date=14 October 2020}}). The service expanded during 2019 to include an [[Resource_Description_Framework|RDF-style]] [[Knowledge_graph|Knowledge Graph]] populated from a number of largely statistical open datasets. The service was announced to a wider audience in 2019<ref>{{cite web |title=Doing our part to share open data responsibly
|url=https://www.blog.google/technology/ai/sharing-open-data/ |website=The Keyword |publisher=Google |accessdate=14 October 2020}}</ref>.




Line 22: Line 23:
== Technology ==
== Technology ==


The dataCommons.org approach is built on a graph data-model. The graph can be accessed through several APIs, and is expanded through loading data (typically CSV and MCF-based templates). [[Meta_Content_Framework]]
The datacommons.org approach is built on a [[Graph_database|graph data-model]]. The graph can be accessed through several APIs, and is expanded through loading data (typically CSV and [[Meta_Content_Framework|MCF]]-based templates).
<ref>{{cite web |title=Contributing to Data Commons - Adding datasets |url=https://docs.datacommons.org/contributing/adding_datasets.html |website=datacommons.org | publisher=Data Commons }}</ref>. The data vocabulary used to define the datacommons.org graph is based upon [[Schema.org]]. In particular the schema.org terms http://schema.org/StatisticalPopulation and https://schema.org/Observation were proposed to Schema.org to support datacommons-like usecases.

<ref>{{cite web |title= |url=https://docs.datacommons.org/contributing/adding_datasets.html|website=datacommons.org | publisher=Data Commons |accessdate 14 October 2020}}</ref>. The data vocabulary used to define the datacommons.org graph is based upon [[Schema.org]]. In particular the schema.org terms http://schema.org/StatisticalPopulation and https://schema.org/Observation were proposed to Schema.org to support datacommons-like usecases.
<ref>{{cite web |url=https://github.com/schemaorg/schemaorg/issues/2291 |title=Proposal for representing Aggregate Statistical Data |date=25 June 2019 |website=Github - Schema.org repository |access-date=14 October 2020}}</ref>
<ref>{{cite web |url=https://github.com/schemaorg/schemaorg/issues/2291 |title=Proposal for representing Aggregate Statistical Data |date=25 June 2019 |website=Github - Schema.org repository |access-date=14 October 2020}}</ref>


Software from the project is available on Github under Apache 2 license. <ref>{{cite web |url=https://github.com/datacommonsorg/ |title=datacommons.org github}}</ref>



== External links ==
== External links ==

Revision as of 19:30, 14 October 2020


datacommons.org


datacommons.org is an open knowledge repository hosted by Google that provides a unified view across multiple public datasets, combining economic, scientific and other open datasets into an integrated data graph.

The datacommons.org site was launched in May 2018 with an initial dataset consisting of fact-checking data published in Schema.org "ClaimReview" format by several fact checkers from the International Fact-Checking Network ("Fact Checks". datacommons.org. 29 March 2019. Retrieved 14 October 2020.). The service expanded during 2019 to include an RDF-style Knowledge Graph populated from a number of largely statistical open datasets. The service was announced to a wider audience in 2019.[1]


Features

datacommons.org focuses more on statistical data than is common for Linked Data and Knowledge Graph initiatives. It centers on the entity-oriented integration of statistical observations from a variety of public datasets. As such, although it supports a subset of the W3C SPARQL query language (https://docs.datacommons.org/api/python/query.html), its APIs (https://docs.datacommons.org/api/) also include tools, such as a Pandas dataframe interface, oriented towards data science, statistics and data visualization.

The most important feature of datacommons.org is that it is integrative. Rather than only providing a hosting platform for diverse datasets, it also attempts to consolidate much of the information the datasets provide into a single data graph.
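
As a rough illustration of what entity-oriented integration means, the following Python sketch merges two separately published toy datasets into one structure keyed by a shared entity identifier. It is not the datacommons.org implementation or API: the identifiers (`place/A`, `place/B`), the variable names, the `integrate` helper and all values are invented placeholders chosen only to show the idea.

```python
from collections import defaultdict

# Two toy "datasets" published separately. Every identifier and value
# here is a made-up placeholder, not real Data Commons data.
population_rows = [
    {"entity": "place/A", "variable": "population", "year": 2018, "value": 100},
    {"entity": "place/B", "variable": "population", "year": 2018, "value": 250},
]
income_rows = [
    {"entity": "place/A", "variable": "median_income", "year": 2018, "value": 40},
]


def integrate(*datasets):
    """Merge rows from several datasets into one mapping keyed by entity.

    Hypothetical helper: each entity maps to a dict of
    (variable, year) -> value, regardless of the source dataset.
    """
    merged = defaultdict(dict)
    for rows in datasets:
        for row in rows:
            key = (row["variable"], row["year"])
            merged[row["entity"]][key] = row["value"]
    return dict(merged)


graph = integrate(population_rows, income_rows)

# All observations about one entity are now available in one place.
print(graph["place/A"])
```

Once the rows share an entity identifier, a question about one place can be answered from a single lookup instead of from each source dataset in turn.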


Technology

The datacommons.org approach is built on a graph data model. The graph can be accessed through several APIs, and is expanded through loading data (typically CSV and MCF-based templates).[2] The data vocabulary used to define the datacommons.org graph is based upon Schema.org. In particular, the schema.org terms http://schema.org/StatisticalPopulation and https://schema.org/Observation were proposed to Schema.org to support datacommons-like use cases.[3]
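
A graph data model of this kind can be pictured as a set of (subject, predicate, object) triples, with each statistical observation represented as its own node that points at the entity it describes. The sketch below is a minimal in-memory illustration under that assumption, not the project's storage engine or query API. The node identifiers, the property names (`typeOf`, `observedNode`, `value`), the helper functions and the values are all illustrative placeholders.

```python
# Toy triples: (subject, predicate, object). All identifiers, property
# names and values are illustrative placeholders.
triples = [
    ("place/A", "typeOf", "Place"),
    ("place/A", "name", "Example Town"),
    ("obs/1", "typeOf", "Observation"),
    ("obs/1", "observedNode", "place/A"),
    ("obs/1", "value", 100),
]


def match(pattern, triples):
    """Return the triples matching a (s, p, o) pattern; None is a wildcard."""
    return [
        triple
        for triple in triples
        if all(want is None or want == got for want, got in zip(pattern, triple))
    ]


def observations_about(entity, triples):
    """Collect the properties of every observation node pointing at `entity`."""
    obs_ids = [s for s, _, _ in match((None, "observedNode", entity), triples)]
    return {
        obs_id: {p: o for _, p, o in match((obs_id, None, None), triples)}
        for obs_id in obs_ids
    }


print(observations_about("place/A", triples))
```

The wildcard pattern in `match` plays a role loosely similar to a single triple pattern in a SPARQL query, which is one reason a graph model pairs naturally with SPARQL-style access.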

Software from the project is available on GitHub under the Apache 2 license.[4]



Category:Google Category:Open_data Category:Knowledge_graphs

  1. ^ "Doing our part to share open data responsibly". The Keyword. Google. Retrieved 14 October 2020.
  2. ^ "Contributing to Data Commons - Adding datasets". datacommons.org. Data Commons.
  3. ^ "Proposal for representing Aggregate Statistical Data". Github - Schema.org repository. 25 June 2019. Retrieved 14 October 2020.
  4. ^ "datacommons.org github".