Blank node

From Wikipedia, the free encyclopedia

In RDF, a blank node (also called a bnode) is a node in an RDF graph representing a resource for which a URI or literal is not given. The resource represented by a blank node is also called an anonymous resource. According to the RDF standard, a blank node can only be used as the subject or object of an RDF triple.

Notation in serialization formats

Blank nodes can be denoted through blank node identifiers in the following formats: RDF/XML, Turtle, N3 and N-Triples. The following example shows the notation in RDF/XML.

<!-- rdf:nodeID takes a plain XML name: the node written _:b in Turtle and N-Triples is "b" here -->
<rdf:Description rdf:about="http://www.csd.uoc.gr/~hy561" dc:title="Web Data Management">
<ex:professor rdf:nodeID="b"/>
</rdf:Description>
<rdf:Description rdf:nodeID="b" ex:fullName="Adam Smith">
<ex:homePage rdf:resource="http://www.csd.uoc.gr/~smith/"/>
</rdf:Description>

Blank node identifiers are scoped to a single serialization of a particular RDF graph: the node '_:b' in the preceding example does not represent the same node as a node named '_:b' in any other graph.
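
The scoping rule matters when two graphs are merged: blank node labels have to be kept apart first, otherwise unrelated nodes that happen to share a label would be fused. The following Python sketch illustrates this under stated assumptions: triples are modelled as plain tuples of strings, a leading "_:" marks a blank node, and the function names as well as the second graph (with "John Doe") are hypothetical and not taken from any RDF library.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]


def is_bnode(term: str) -> bool:
    # Assumption of this sketch: blank node labels start with "_:".
    return term.startswith("_:")


def rename_bnodes(graph: Set[Triple], suffix: str) -> Set[Triple]:
    """Return a copy of the graph whose blank node labels carry the given suffix."""
    def rename(term: str) -> str:
        return f"{term}_{suffix}" if is_bnode(term) else term

    return {(rename(s), p, rename(o)) for s, p, o in graph}


def merge(g1: Set[Triple], g2: Set[Triple]) -> Set[Triple]:
    """Union of two graphs after giving each graph its own blank node labels."""
    return rename_bnodes(g1, "g1") | rename_bnodes(g2, "g2")


course = {
    ("<http://www.csd.uoc.gr/~hy561>", "ex:professor", "_:b"),
    ("_:b", "ex:fullName", '"Adam Smith"'),
}
# Hypothetical second document that also uses the label _:b.
other = {
    ("_:b", "ex:fullName", '"John Doe"'),
}

merged = merge(course, other)
for triple in sorted(merged):
    print(triple)
```

A plain set union of the two graphs would instead leave a single node '_:b' carrying both names, which neither source document states.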

Blank nodes can also be denoted through nested elements (in RDF/XML, Turtle and N3). The following expresses the same triples as the example above.

<rdf:Description rdf:about="http://www.csd.uoc.gr/~hy561" dc:title="Web Data Management">
<ex:professor>
<rdf:Description ex:fullName="Adam Smith">
<ex:homePage rdf:resource="http://www.csd.uoc.gr/~smith/"/>
</rdf:Description>
</ex:professor>
</rdf:Description>

The same example in Turtle:

 <http://www.csd.uoc.gr/~hy561> dc:title "Web Data Management" ;
                               ex:professor [ex:fullName "Adam Smith" ;
                               ex:homePage <http://www.csd.uoc.gr/~smith/>].
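
The nested and the labelled notations describe the same triples; a parser simply generates a fresh label for every nested description. The following Python sketch shows the idea. It is an illustration only: the nested-dictionary input format, the function name flatten and the generated labels of the form '_:genidN' are assumptions of this sketch, not the behaviour of a particular parser.

```python
from itertools import count


def flatten(subject, properties, fresh=None, out=None):
    """Turn a nested description into a flat list of (subject, predicate, object) triples.

    A nested dict stands for an anonymous resource and receives a fresh
    blank node label; any other value is used as the object term as-is.
    """
    fresh = count(1) if fresh is None else fresh
    out = [] if out is None else out
    for predicate, value in properties.items():
        if isinstance(value, dict):
            node = f"_:genid{next(fresh)}"  # label invented by this sketch
            out.append((subject, predicate, node))
            flatten(node, value, fresh, out)
        else:
            out.append((subject, predicate, value))
    return out


triples = flatten(
    "<http://www.csd.uoc.gr/~hy561>",
    {
        "dc:title": '"Web Data Management"',
        "ex:professor": {
            "ex:fullName": '"Adam Smith"',
            "ex:homePage": "<http://www.csd.uoc.gr/~smith/>",
        },
    },
)
for triple in triples:
    print(triple)
```

The result contains four triples, two of which have the generated blank node as their subject.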

Usability

Blank nodes are treated as simply indicating the existence of a thing, without using a URI (Uniform Resource Identifier) to identify any particular thing. This is not the same as assuming that the blank node indicates an 'unknown' URI.[1]

Anonymous resources in RDF

From a technical perspective, blank nodes make it possible to:
(a) describe multi-component structures, like the RDF containers,
(b) describe reification (i.e. provenance information),
(c) represent complex attributes without having to name explicitly the auxiliary node (e.g. the address of a person consisting of the street, the number, the postal code and the city) and
(d) offer protection of the inner information (e.g. protecting the sensitive information of the customers from the browsers).[2]

The following example uses blank nodes to represent resources in the ways listed above. In particular, the blank node with the identifier '_:students' represents a Bag RDF container, the blank node '_:ad' represents a complex attribute, and the blank nodes '_:a1' and '_:a2' represent events in the lifecycle of a digital object; '_:s1' and '_:s2' are the members of the Bag.

 <http://www.csd.uoc.gr/~hy561> dc:title "Web Data Management" ;
                                ex:professor _:b ;
                                ex:students _:students ;
                                prov:generatedBy _:a1 .

  _:b ex:fullName "Adam Smith" ;
      ex:homePage <http://www.csd.uoc.gr/~smith/> ;
      ex:hasAddress _:ad .

  _:ad rdf:type ex:Address;
       ex:street "Knossou" ;
       ex:number "122";
       ex:postalcode "71409" ;
       ex:city "Heraklion" .

  _:students rdf:type rdf:Bag;
             dc:hasMember _:s1 ;
             dc:hasMember _:s2 .

  _:a1 rdf:type prov:Event;
       prov:creator _:b ;
       prov:atTime "Tuesday 11 February, 06:51:00 CST".

  _:a2 rdf:type prov:Event;
       rdf:type prov:Update;
       prov:ActionOver _:a1;
       prov:creator _:b ;
       prov:atTime "Monday 17 February, 08:12:00 CST".

Anonymous classes in OWL

The ontology language OWL uses blank nodes to represent anonymous classes such as unions or intersections of classes, or classes called restrictions, defined by a constraint on a property.

For example, to express that a person has at most one birth date, one defines the class "Person" as a subclass of an anonymous class of type "owl:Restriction". This anonymous class is defined by two attributes specifying the constrained property and the constraint itself (cardinality ≤ 1).

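 <owl:Class rdf:about="http://example.org/ontology/Person">
    <rdfs:subClassOf>
      <owl:Restriction>
        <!-- OWL expects the cardinality value to be typed as xsd:nonNegativeInteger -->
        <owl:maxCardinality rdf:datatype="http://www.w3.org/2001/XMLSchema#nonNegativeInteger">1</owl:maxCardinality>
        <owl:onProperty rdf:resource="http://xmlns.com/foaf/0.1/birthDate"/>
      </owl:Restriction>
    </rdfs:subClassOf>
 </owl:Class>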

Blank nodes in published data

Blank node prevalence

According to an empirical survey[3] of Linked Data published on the Web, 345 (44.1%) of the 783 domains contributing to the corpus did not publish any blank nodes. The average percentage of unique terms which were blank nodes for each domain was 7.5%, indicating that although a small number of high-volume domains publish many blank nodes, many other domains publish them less frequently.

Of the 286.3 million unique terms found in data-level positions, 165.4 million (57.8%) were blank nodes, 92.1 million (32.2%) were URIs, and 28.9 million (10%) were literals. Each blank node had on average 5.2 data-level occurrences: on average, 0.99 in the object position of a non-rdf:type triple and 4.2 in the subject position of a triple.

Structure of blank nodes

According to the same empirical survey of linked data published on the Web, the majority of documents surveyed contain tree-based blank node structures. A small fraction contain complex blank node structures for which various tasks are potentially very expensive to compute.
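
As a rough illustration of what a tree-based structure means, the following Python sketch checks whether the blank nodes of a graph, connected by the triples that link one blank node to another, form a forest (no cycles when edge direction is ignored). This reading of "tree-based" is an assumption of the sketch, as are the function names, the hypothetical property ex:knows and the tuple-of-strings encoding of triples; the survey's own definition may differ in detail.

```python
def is_bnode(term):
    # Assumption of this sketch: blank node labels start with "_:".
    return term.startswith("_:")


def bnode_structure_is_forest(graph):
    """True if the undirected graph over blank nodes contains no cycle."""
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    # Undirected edges between blank nodes; parallel triples collapse into one edge.
    edges = {frozenset((s, o)) for s, _, o in graph if is_bnode(s) and is_bnode(o)}
    for edge in edges:
        if len(edge) == 1:
            return False  # a self-loop is treated as a cycle in this sketch
        root_a, root_b = (find(node) for node in edge)
        if root_a == root_b:
            return False  # both ends already connected: this edge closes a cycle
        parent[root_a] = root_b
    return True


tree_like = {
    ("_:b", "ex:fullName", '"Adam Smith"'),
    ("_:b", "ex:hasAddress", "_:ad"),
}
cyclic = {
    ("_:x", "ex:knows", "_:y"),
    ("_:y", "ex:knows", "_:z"),
    ("_:z", "ex:knows", "_:x"),
}
print(bnode_structure_is_forest(tree_like))
print(bnode_structure_is_forest(cyclic))
```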

Sensitive tasks

Blank nodes require special treatment in various tasks whose worst-case complexity grows exponentially with the number of such nodes.

Comparing RDF graphs

The inability to match blank nodes increases the delta size (the number of triples that need to be deleted and added in order to transform one RDF graph to another) and does not assist in detecting the changes between subsequent versions of a Knowledge Base. Building a mapping between the blank nodes of two compared Knowledge Bases that minimizes the delta size is NP-Hard in the general case.[4]

BNodeLand[5] is a framework that deals with this problem and proposes solutions through particular tools.
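
The effect of blank node matching on the delta size can be sketched in a few lines of Python. The code below is a naive brute-force illustration, not the algorithm of BNodeLand or of the cited paper: it tries every injective mapping between the blank nodes of the two graphs and keeps the smallest delta. Triples are modelled as tuples of strings with "_:" marking blank nodes, and all function names are hypothetical.

```python
from itertools import permutations


def is_bnode(term):
    # Assumption of this sketch: blank node labels start with "_:".
    return term.startswith("_:")


def bnodes(graph):
    return sorted({t for s, _, o in graph for t in (s, o) if is_bnode(t)})


def apply_mapping(graph, mapping):
    return {(mapping.get(s, s), p, mapping.get(o, o)) for s, p, o in graph}


def delta_size(g1, g2):
    # Triples to delete from g1 plus triples to add in order to obtain g2.
    return len(g1 - g2) + len(g2 - g1)


def min_delta_size(g1, g2):
    """Smallest delta over all injective blank node mappings (factorial cost)."""
    b1, b2 = bnodes(g1), bnodes(g2)
    if len(b1) > len(b2):
        # The delta size is symmetric, so map the smaller set into the larger one.
        g1, g2, b1, b2 = g2, g1, b2, b1
    return min(
        delta_size(apply_mapping(g1, dict(zip(b1, image))), g2)
        for image in permutations(b2, len(b1))
    )


old = {
    ("_:x", "ex:fullName", '"Adam Smith"'),
    ("_:x", "ex:homePage", "<http://www.csd.uoc.gr/~smith/>"),
}
new = {
    ("_:y", "ex:fullName", '"Adam Smith"'),
    ("_:y", "ex:homePage", "<http://www.csd.uoc.gr/~smith/>"),
    ("_:y", "ex:hasAddress", "_:z"),
}
print(delta_size(old, new))      # 5: no blank node is matched
print(min_delta_size(old, new))  # 1: _:x matched to _:y, only the address triple is new
```

The number of candidate mappings grows factorially with the number of blank nodes, which is why exhaustive search of this kind is only feasible for very small graphs.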

Entailment checking

Regarding the entailment problem, it has been proved that (a) deciding simple or RDF/S entailment of RDF graphs is NP-Complete,[6] and (b) deciding equivalence of simple RDF graphs is Isomorphism-Complete.
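
Simple entailment can be decided by searching for an assignment of the blank nodes of the entailed graph to terms of the entailing graph that turns the former into a subgraph of the latter. The following Python sketch does this by exhaustive search. It is a minimal illustration under stated assumptions: triples are tuples of strings, "_:" marks blank nodes, and the function names and the resources ex:hy561 and ex:smith are hypothetical.

```python
from itertools import product


def is_bnode(term):
    # Assumption of this sketch: blank node labels start with "_:".
    return term.startswith("_:")


def bnodes(graph):
    return sorted({t for s, _, o in graph for t in (s, o) if is_bnode(t)})


def terms(graph):
    return sorted({t for s, _, o in graph for t in (s, o)})


def apply_mapping(graph, mapping):
    return {(mapping.get(s, s), p, mapping.get(o, o)) for s, p, o in graph}


def simply_entails(g, e):
    """True if some assignment of e's blank nodes to terms of g makes e a subgraph of g."""
    variables = bnodes(e)
    candidates = terms(g)
    # len(candidates) ** len(variables) assignments are tried in the worst case.
    return any(
        apply_mapping(e, dict(zip(variables, values))) <= g
        for values in product(candidates, repeat=len(variables))
    )


named = {
    ("ex:hy561", "ex:professor", "ex:smith"),
    ("ex:smith", "ex:fullName", '"Adam Smith"'),
}
anonymous = {
    ("ex:hy561", "ex:professor", "_:p"),
    ("_:p", "ex:fullName", '"Adam Smith"'),
}
print(simply_entails(named, anonymous))  # True: _:p can be assigned ex:smith
print(simply_entails(anonymous, named))  # False: nothing states that the professor is ex:smith
```

The search space grows exponentially with the number of blank nodes in the entailed graph, in line with the NP-completeness result cited above.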

Notes

  1. http://www.w3.org/TR/2014/REC-rdf11-mt-20140225/#blank-nodes
  2. L. Chen, H. Zhang, Y. Chen, and W. Guo. Blank Nodes in RDF. Journal of Software, 2012.
  3. A. Mallea, M. Arenas, A. Hogan, and A. Polleres. On Blank Nodes. In Procs of the 10th Intern. Semantic Web Conference (ISWC 2011), 2011.
  4. Y. Tzitzikas, C. Lantzaki, and D. Zeginis. Blank Node Matching and RDF/S Comparison Functions. In Procs of the 11th Intern. Semantic Web Conference (ISWC 2012), 2012.
  5. http://www.ics.forth.gr/isl/bnodeland/
  6. H. J. ter Horst. "Completeness, decidability and complexity of entailment for RDF Schema and a semantic extension involving the OWL vocabulary." J. of Web Sem. 3:79-115, 2005.