In RDF, a blank node (also called bnode) is a node in an RDF graph representing a resource for which a URI or literal is not given. The resource represented by a blank node is also called an anonymous resource. According to the RDF standard a blank node can only be used as subject or object of an RDF triple.
Notation in serialization formats
In RDF/XML, a blank node can be given a local identifier with the rdf:nodeID attribute. The identifier must be a plain XML name, so the node written '_:b' in Turtle or N-Triples appears here as "b":
<rdf:Description rdf:about="http://www.csd.uoc.gr/~hy561" dc:title="Web Data Management">
  <ex:professor rdf:nodeID="b"/>
</rdf:Description>
<rdf:Description rdf:nodeID="b" ex:fullName="Adam Smith">
  <ex:homePage rdf:resource="http://www.csd.uoc.gr/~smith/"/>
</rdf:Description>
Blank node identifiers are scoped to the serialization of a particular RDF graph: the node '_:b' in the preceding example does not represent the same node as a node named '_:b' in any other graph.
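The scoping rule matters most when two graphs are combined. The following Python sketch illustrates one way to merge graphs while keeping their blank nodes apart. It assumes triples are plain tuples of strings and that blank nodes are the strings starting with '_:'; the function names and the second graph are hypothetical and not taken from any RDF library.

```python
from itertools import count


def is_bnode(term):
    """In this sketch a blank node is any term written with the '_:' prefix."""
    return term.startswith("_:")


def merge(*graphs):
    """Union several graphs, giving every blank node a fresh label per graph."""
    fresh = count()
    merged = set()
    for graph in graphs:
        renaming = {}  # a label is only meaningful inside its own graph
        for triple in graph:
            renamed = []
            for term in triple:
                if is_bnode(term):
                    if term not in renaming:
                        renaming[term] = f"_:g{next(fresh)}"
                    term = renaming[term]
                renamed.append(term)
            merged.add(tuple(renamed))
    return merged


g1 = {
    ("<http://www.csd.uoc.gr/~hy561>", "ex:professor", "_:b"),
    ("_:b", "ex:fullName", '"Adam Smith"'),
}
# Hypothetical second graph that happens to reuse the label '_:b'.
g2 = {("_:b", "ex:fullName", '"John Doe"')}

merged = merge(g1, g2)
```

A plain set union of g1 and g2 would wrongly state that a single resource has both names; after renaming, the merged graph contains two distinct blank nodes.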
A blank node can also be written as a nested element, with no identifier at all:
<rdf:Description rdf:about="http://www.csd.uoc.gr/~hy561" dc:title="Web Data Management">
  <ex:professor>
    <rdf:Description ex:fullName="Adam Smith">
      <ex:homePage rdf:resource="http://www.csd.uoc.gr/~smith/"/>
    </rdf:Description>
  </ex:professor>
</rdf:Description>
The same example in Turtle, where square brackets denote a blank node:
<http://www.csd.uoc.gr/~hy561>
    dc:title "Web Data Management" ;
    ex:professor [
        ex:fullName "Adam Smith" ;
        ex:homePage <http://www.csd.uoc.gr/~smith/>
    ] .
Blank nodes are treated as simply indicating the existence of a thing, without using a URI (Uniform Resource Identifier) to identify any particular thing. This is not the same as assuming that the blank node indicates an 'unknown' URI.
Anonymous resources in RDF
From a technical perspective, blank nodes make it possible to:
(a) describe multi-component structures, like the RDF containers,
(b) describe reification (e.g. to attach provenance information to a statement),
(c) represent complex attributes without having to name explicitly the auxiliary node (e.g. the address of a person consisting of the street, the number, the postal code and the city) and
(d) offer protection of the inner information (e.g. protecting the sensitive information of the customers from the browsers).
The following example uses blank nodes in the ways listed above. The blank node with the identifier '_:students' represents a Bag RDF Container, the blank node with the identifier '_:ad' represents a complex attribute, and those with the identifiers '_:a1' and '_:a2' represent events in the lifecycle of a digital object.
<http://www.csd.uoc.gr/~hy561>
    dc:title "Web Data Management" ;
    ex:professor _:b ;
    ex:students _:students ;
    prov:generatedBy _:a1 .
_:b ex:fullName "Adam Smith" ;
    ex:homePage <http://www.csd.uoc.gr/~smith/> ;
    ex:hasAddress _:ad .
_:ad rdf:type ex:Address ;
    ex:street "Knossou" ;
    ex:number "122" ;
    ex:postalcode "71409" ;
    ex:city "Heraklion" .
_:students rdf:type rdf:Bag ;
    dc:hasMember _:s1 ;
    dc:hasMember _:s2 .
_:a1 rdf:type prov:Event ;
    prov:creator _:b ;
    prov:atTime "Tuesday 11 February, 06:51:00 CST" .
_:a2 rdf:type prov:Event ;
    rdf:type prov:Update ;
    prov:ActionOver _:a1 ;
    prov:creator _:b ;
    prov:atTime "Monday 17 February, 08:12:00 CST" .
Anonymous classes in OWL
OWL uses blank nodes to represent anonymous classes. For example, to express that a person has at most one birth date, one defines the class "Person" as a subclass of an anonymous class of type "owl:Restriction". This anonymous class is defined by two attributes specifying the constrained property and the constraint itself (cardinality ≤ 1):
<owl:Class rdf:about="http://example.org/ontology/Person">
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:maxCardinality>1</owl:maxCardinality>
      <owl:onProperty rdf:resource="http://xmlns.com/foaf/0.1/birthDate"/>
    </owl:Restriction>
  </rdfs:subClassOf>
</owl:Class>
Blank nodes in published data
Blank node prevalence
According to an empirical survey of Linked Data published on the Web, 345 (44.1%) of the 783 domains contributing to the corpus did not publish any blank nodes. The average percentage of unique terms which were blank nodes for each domain was 7.5%, indicating that although a small number of high-volume domains publish many blank nodes, many other domains publish them less frequently.
Of the 286.3 million unique terms found in data-level positions, 165.4 million (57.8%) were blank nodes, 92.1 million (32.2%) were URIs, and 28.9 million (10%) were literals. Each blank node had on average 5.2 data-level occurrences: on average, 0.99 in the object position of a non-rdf:type triple and 4.2 in the subject position of a triple.
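As a rough illustration of how such shares can be computed for a single graph, the Python sketch below classifies the unique terms in data-level positions. It assumes that "data-level positions" means every subject plus the object of every triple whose predicate is not rdf:type, that triples are tuples of strings, and that blank nodes start with '_:' and literals with a double quote. The helper names and the sample graph are hypothetical.

```python
from collections import Counter

RDF_TYPE = "rdf:type"


def kind(term):
    """Classify a term by its written form (an assumption of this sketch)."""
    if term.startswith("_:"):
        return "blank"
    if term.startswith('"'):
        return "literal"
    return "uri"


def data_level_terms(graph):
    """Unique subjects, plus objects of triples that are not rdf:type triples."""
    terms = set()
    for s, p, o in graph:
        terms.add(s)
        if p != RDF_TYPE:
            terms.add(o)
    return terms


def term_shares(graph):
    """Fraction of unique data-level terms of each kind."""
    terms = data_level_terms(graph)
    if not terms:
        return {"blank": 0.0, "uri": 0.0, "literal": 0.0}
    tally = Counter(kind(t) for t in terms)
    return {k: tally[k] / len(terms) for k in ("blank", "uri", "literal")}


graph = {
    ("<http://www.csd.uoc.gr/~hy561>", "ex:professor", "_:b"),
    ("_:b", "ex:fullName", '"Adam Smith"'),
    ("_:b", "ex:hasAddress", "_:ad"),
    ("_:ad", "rdf:type", "ex:Address"),
    ("_:ad", "ex:city", '"Heraklion"'),
}

shares = term_shares(graph)
```

In this small graph the class ex:Address is excluded because it only appears as the object of an rdf:type triple, leaving five unique terms: two blank nodes, one URI and two literals.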
Structure of blank nodes
According to the same empirical survey of Linked Data published on the Web, the majority of documents surveyed contain tree-based blank node structures. A small fraction contain complex blank node structures for which various tasks are potentially very expensive to compute.
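One simple way to tell the two cases apart is sketched below in Python. It takes "tree-based" to mean that the undirected graph formed by triples linking one blank node to another contains no cycle; this reading, the treatment of self-loops as cycles, and all names are assumptions of the sketch rather than the survey's exact method.

```python
def is_bnode(term):
    return term.startswith("_:")


def bnode_edges(graph):
    """Undirected edges between blank nodes; parallel edges collapse into one."""
    edges = set()
    for s, _, o in graph:
        if is_bnode(s) and is_bnode(o):
            edges.add(frozenset((s, o)))
    return edges


def has_tree_structure(graph):
    """True if the blank nodes form a forest (checked with union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for edge in bnode_edges(graph):
        if len(edge) == 1:  # a blank node linked to itself
            return False
        a, b = edge
        root_a, root_b = find(a), find(b)
        if root_a == root_b:  # the edge closes a cycle
            return False
        parent[root_a] = root_b
    return True


tree_like = {
    ("ex:course", "ex:professor", "_:b"),
    ("_:b", "ex:hasAddress", "_:ad"),
    ("_:b", "ex:office", "_:of"),
}
cyclic = {
    ("_:a", "ex:knows", "_:b"),
    ("_:b", "ex:knows", "_:c"),
    ("_:c", "ex:knows", "_:a"),
}
```

Here has_tree_structure(tree_like) holds, while the three mutually linked blank nodes in cyclic form a cycle and fail the check.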
Blank nodes require special treatment in several tasks whose worst-case complexity grows exponentially with the number of such nodes.
Comparing RDF graphs
The inability to match blank nodes increases the delta size (the number of triples that need to be deleted and added in order to transform one RDF graph to another) and does not assist in detecting the changes between subsequent versions of a Knowledge Base. Building a mapping between the blank nodes of two compared Knowledge Bases that minimizes the delta size is NP-Hard in the general case.
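The effect of blank node matching on the delta size can be seen in a small Python sketch. It tries every injective mapping from the blank nodes of one graph to those of the other and keeps the one with the smallest delta, so it is only usable for a handful of blank nodes; this is a brute-force illustration, not the approach of any particular tool. Triples are assumed to be tuples of strings, blank nodes start with '_:', and the function names, the '_:fresh' padding labels and the sample graphs are hypothetical.

```python
from itertools import permutations


def is_bnode(term):
    return term.startswith("_:")


def bnodes(graph):
    return sorted({t for s, _, o in graph for t in (s, o) if is_bnode(t)})


def rename(graph, mapping):
    """Apply a blank node renaming to every triple at once."""
    return {tuple(mapping.get(t, t) for t in triple) for triple in graph}


def delta_size(g1, g2):
    """Triples to delete from g1 plus triples to add to reach g2."""
    return len(g1 - g2) + len(g2 - g1)


def min_delta(g1, g2):
    """Exhaustive search over blank node mappings (factorial cost)."""
    b1, b2 = bnodes(g1), bnodes(g2)
    # Pad with unused labels when g1 has more blank nodes than g2.
    targets = b2 + [f"_:fresh{i}" for i in range(max(0, len(b1) - len(b2)))]
    best = None
    for image in permutations(targets, len(b1)):
        mapping = dict(zip(b1, image))
        size = delta_size(rename(g1, mapping), g2)
        if best is None or size < best[0]:
            best = (size, mapping)
    return best


g1 = {
    ("ex:course", "ex:professor", "_:b"),
    ("_:b", "ex:fullName", '"Adam Smith"'),
    ("ex:course", "ex:assistant", "_:c"),
    ("_:c", "ex:fullName", '"Eve Jones"'),
}
# A later version: same content under different labels, plus one new triple.
g2 = {
    ("ex:course", "ex:professor", "_:y"),
    ("_:y", "ex:fullName", '"Adam Smith"'),
    ("ex:course", "ex:assistant", "_:x"),
    ("_:x", "ex:fullName", '"Eve Jones"'),
    ("_:y", "ex:homePage", "<http://www.csd.uoc.gr/~smith/>"),
}

unmatched = delta_size(g1, g2)
best_size, best_mapping = min_delta(g1, g2)
```

Without matching, every triple that mentions a blank node counts as changed; with the best mapping only the genuinely new triple remains in the delta.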
BNodeLand is a framework that deals with this problem and proposes solutions through particular tools.
Regarding the entailment problem, it has been proved that (a) deciding simple or RDF/S entailment of RDF graphs is NP-complete, and (b) deciding equivalence of simple RDF graphs is Isomorphism-complete (as hard as graph isomorphism).
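Simple entailment can be checked by searching for a mapping from the blank nodes of the entailed graph to terms of the entailing graph such that every mapped triple is present. The Python sketch below does this by backtracking, which takes exponential time in the worst case, in line with the NP-completeness result. It assumes triples are tuples of strings and blank nodes start with '_:'; the function name and the sample graphs are hypothetical.

```python
def is_bnode(term):
    return term.startswith("_:")


def simply_entails(g, h):
    """True if some instance of h (blank nodes replaced by terms of g) is a subgraph of g."""
    g = set(g)

    def unify(term, target, binding):
        # Blank nodes of h act as variables; every other term must match exactly.
        if is_bnode(term):
            if term in binding:
                return binding[term] == target
            binding[term] = target
            return True
        return term == target

    def match(remaining, binding):
        if not remaining:
            return True
        (s, p, o), rest = remaining[0], remaining[1:]
        for gs, gp, go in g:
            if gp != p:
                continue
            candidate = dict(binding)  # copy so failed attempts leave no trace
            if unify(s, gs, candidate) and unify(o, go, candidate):
                if match(rest, candidate):
                    return True
        return False

    return match(list(h), {})


g = {
    ("<http://www.csd.uoc.gr/~hy561>", "ex:professor", "_:b"),
    ("_:b", "ex:fullName", '"Adam Smith"'),
}
# "The course has some professor, and that professor has some full name."
h_yes = {
    ("<http://www.csd.uoc.gr/~hy561>", "ex:professor", "_:x"),
    ("_:x", "ex:fullName", "_:n"),
}
# "Someone named Adam Smith has a home page" is not stated in g.
h_no = {
    ("_:x", "ex:fullName", '"Adam Smith"'),
    ("_:x", "ex:homePage", "_:y"),
}
```

Blank nodes of the entailing graph are treated as ordinary terms here, so a blank node of h may be mapped onto a blank node of g, as happens for '_:x' in h_yes.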
- L. Chen, H. Zhang, Y. Chen, and W. Guo. Blank Nodes in RDF. Journal of Software, 2012.
- A. Mallea, M. Arenas, A. Hogan, and A. Polleres. On Blank Nodes. In Procs of the 10th Intern. Semantic Web Conference (ISWC 2011), 2011.
- Y. Tzitzikas, C. Lantzaki, and D. Zeginis. Blank Node Matching and RDF/S Comparison Functions. In Procs of the 11th Intern. Semantic Web Conference (ISWC 2012), 2012.