Jump to content

Syntax and semantics of logic programming: Difference between revisions

From Wikipedia, the free encyclopedia
Content deleted Content added
Undid revision 1143082959 by Bearcat (talk) not a user page
→‎Datalog: Note name range restriction
Line 33: Line 33:
</syntaxhighlight>
</syntaxhighlight>


Atoms are also referred to as {{dfni|literals}}. The atom to the left of the <code>:-</code> symbol is called the {{dfni|head}} of the rule; the atoms to the right are the {{dfni|body}}. Every Datalog program must satisfy the condition that every variable that appears in the head of a rule also appears in the body.{{sfn|Ceri|Gottlob|Tanca|1989|p=146}}
Atoms are also referred to as {{dfni|literals}}. The atom to the left of the <code>:-</code> symbol is called the {{dfni|head}} of the rule; the atoms to the right are the {{dfni|body}}. Every Datalog program must satisfy the condition that every variable that appears in the head of a rule also appears in the body (this condition is sometimes called the {{dfni|range restriction}}).{{sfn|Ceri|Gottlob|Tanca|1989|p=146}}<ref>{{Cite journal |last=Eisner |first=Jason |last2=Filardo |first2=Nathaniel W. |date=2011 |editor-last=de Moor |editor-first=Oege |editor2-last=Gottlob |editor2-first=Georg |editor3-last=Furche |editor3-first=Tim |editor4-last=Sellers |editor4-first=Andrew |title=Dyna: Extending Datalog for Modern AI |url=https://link.springer.com/chapter/10.1007/978-3-642-24206-9_11 |journal=Datalog Reloaded |language=en |location=Berlin, Heidelberg |publisher=Springer |pages=181–220 |doi=10.1007/978-3-642-24206-9_11 |isbn=978-3-642-24206-9}}</ref>


Rules with empty bodies are called {{dfni|facts}}. For example, the following rule is a fact:
Rules with empty bodies are called {{dfni|facts}}. For example, the following rule is a fact:

Revision as of 23:25, 5 March 2023

Logic programming is a programming paradigm that includes languages based on formal logic, including Datalog and Prolog. This article describes the syntax and semantics of the purely declarative subset of these languages. In particular, programs in the languages considered in this article consist entirely of rules of the form

H :- B1, ..., BN.

Each such rule can be read as an implication:

meaning "If each is true, then is true".

Many implementations of Datalog, Prolog, and related languages add procedural features such as Prolog's cut operator or extra-logical features such as a foreign function interface. The formal semantics of such extensions are beyond the scope of this article.

Datalog

The simplest widely-studied logic programming language is called Datalog. There are three major definitions of the semantics of Datalog, and they are all equivalent. The syntax and semantics of other logic programming languages are extensions and generalizations of those of Datalog.

Syntax

A Datalog program consists of a list of rules (Horn clauses).[1] If constant and variable are two countable sets of constants and variables respectively and relation is a countable set of predicate symbols, then the following BNF grammar expresses the structure of a Datalog program:

<program> ::= <rule> <program> | ""
<rule> ::= <atom> ":-" <atom-list> "."
<atom> ::= <relation> "(" <term-list> ")"
<atom-list> ::= <atom> | <atom> "," <atom-list> | ""
<term> ::= <constant> | <variable>
<term-list> ::= <term> | <term> "," <term-list> | ""

Atoms are also referred to as literals. The atom to the left of the :- symbol is called the head of the rule; the atoms to the right are the body. Every Datalog program must satisfy the condition that every variable that appears in the head of a rule also appears in the body (this condition is sometimes called the range restriction).[1][2]

Rules with empty bodies are called facts. For example, the following rule is a fact:

r(x) :- .

Syntactic sugar

Many implementations of logic programming extend the above grammar to allow writing facts without the :-, like so:

r(x).

Many also allow writing 0-ary relations without parentheses, like so:

p :- q.

These are merely abbreviations (syntactic sugar); they have no impact on the semantics of the program.

Semantics

There are three widely-used approaches to the semantics of Datalog programs: model-theoretic, fixed-point, and proof-theoretic. These three approaches can be proven to be equivalent.[3]

An atom is called ground if none of its subterms are variables. Intuitively, each of the semantics define the meaning of a program to be the set of all ground atoms that can be deduced from the rules of the program, starting from the facts.

Model theoretic

A rule is called ground if all of its atoms (head and body) are ground. A ground rule R1 is a ground instance of another rule R2 if R1 is the result of a substitution of constants for all the variables in R2.

The Herbrand base of a Datalog program is the set of all ground atoms that can be made with the constants appearing in the program. An interpretation (also known as a database instance) is a subset of the Herbrand base. A ground atom is true in an interpretation I if it is an element of I. A rule is true in an interpretation I if for each ground instance of that rule, if all the clauses in the body are true in I, then the head of the rule is also true in I.

A model of a Datalog program P is an interpretation I of P which contains all the ground facts of P, and makes all of the rules of P true in I. Model-theoretic semantics state that the meaning of a Datalog program is its minimal model (equivalently, the intersection of all its models).[4]

Fixed-point

Let I be the set of interpretations of a Datalog program P, that is, I = P(H), where H is the Herbrand base of P and P is the powerset operator. The immediate consequence operator for P is the following map T from I to I: For each ground instance of each rule in P, if every clause in the body is in the input interpretation, then add the head of the ground instance to the output interpretation. This map T is monotonic with respect to the partial order given by subset inclusion on T. By the Knaster–Tarski theorem, this map has a least fixed point; by the Kleene fixed-point theorem the fixed point is the supremum of the chain . The least fixed point of M coincides with the minimal Herbrand model of the program.[5]

The fixpoint semantics suggest an algorithm for computing the minimal Herbrand model: Start with the set of ground facts in the program, then repeatedly add consequences of the rules until a fixpoint is reached. This algorithm is called naïve evaluation.

Proof-theoretic

Other approaches

The semantics of Datalog have also been studied in the context of fixpoints over more general semirings.[6]

Extending Datalog with negation

Datalog has the desirable property that all three major definitions of the semantics of Datalog programs agree. In contrast, there are many conflicting proposals for the semantics of Datalog with negation. The source of the disagreement is that Datalog programs have a unique minimal Herbrand model, but general Datalog programs with negation do not.

Syntax

Negation is written not, and can appear in front of any atom in the body of a rule.

<atom-list> ::= <atom> | "not" <atom> | <atom> "," <atom-list> | ""

Semantics

Stratified negation

A Datalog program with negation is stratified when it is possible to assign each predicate to some stratum, such that if a relation R appears negated in the body of a relation S, then R is in a lower stratum than S.[7] The model-theoretic and fixed-point semantics of Datalog can be extended to handle stratified negation, and such extensions can be proved equivalent.

Many implementations of Datalog use a bottom-up evaluation model inspired by the fixed point semantics. Since this semantics can handle stratified negation, several implementations of Datalog implement stratified negation.

Completion semantics

Perfect model semantics

Stable model semantics

The stable model semantics define a condition for calling certain Herbrand models of a program stable. Intuitively, stable models are the "possible sets of beliefs that a rational agent might hold, given [the program]" as premises.[8]

A program with negation may have many stable models or no stable models. For instance, the program

p :- not q.
q :- not p.

has two stable models , . The one-rule program

p :- not p.

has no stable models.

Every stable model is a minimal Herbrand model. A Datalog program without negation has a single stable model, which is exactly its minimal Herbrand model. The stable model semantics defines the meaning of a logic program with negation to be its stable model, if there is exactly one. However, it can be useful to investigate all (or at least, several) of the stable models of a program; this is the goal of answer set programming.

Well-founded semantics

Disjunctive Datalog

Disjunctive Datalog extends Datalog to allow disjunctions in the heads of rules.

Syntax

Disjunction in the head of a rule is written ; (contrast with conjunction in the body, written ,). Atoms in the head of rules are not negated.

<rule> ::= <disj-atom-list> ":-" <conj-atom-list> "."
<disj-atom-list> ::= <atom> | <atom> "," <disj-atom-list> | ""
<conj-atom-list> ::= <atom> | "not" <atom> | <atom> "," <conj-atom-list> | ""

Semantics

There are at least three ways to define the semantics of disjunctive Datalog:[9]

  • Minimal model semantics
  • Perfect model semantics
  • Disjunctive stable model semantics, which generalizes the stable model semantics

Further extensions

Several other extensions of Datalog have been proposed and studied, including variants with support for integer constants and functions (including Datalog),[10][11] inequality constraints in the bodies of rules, and aggregate functions.

Constraint logic programming allows for constraints over domains such as the reals or integers to appear in the bodies of rules.

See also

References

Notes

  1. ^ a b Ceri, Gottlob & Tanca 1989, p. 146.
  2. ^ Eisner, Jason; Filardo, Nathaniel W. (2011). de Moor, Oege; Gottlob, Georg; Furche, Tim; Sellers, Andrew (eds.). "Dyna: Extending Datalog for Modern AI". Datalog Reloaded. Berlin, Heidelberg: Springer: 181–220. doi:10.1007/978-3-642-24206-9_11. ISBN 978-3-642-24206-9.
  3. ^ Van Emden, M. H.; Kowalski, R. A. (1976-10-01). "The Semantics of Predicate Logic as a Programming Language". Journal of the ACM. 23 (4): 733–742. doi:10.1145/321978.321991. ISSN 0004-5411.
  4. ^ Ceri, Gottlob & Tanca 1989, p. 149.
  5. ^ Ceri, Gottlob & Tanca 1989, p. 150.
  6. ^ Khamis, Mahmoud Abo; Ngo, Hung Q.; Pichler, Reinhard; Suciu, Dan; Wang, Yisu Remy (2023-02-01). "Convergence of Datalog over (Pre-) Semirings". arXiv:2105.14435 [cs].
  7. ^ Halevy, Alon Y.; Mumick, Inderpal Singh; Sagiv, Yehoshua; Shmueli, Oded (2001-09-01). "Static analysis in datalog extensions". Journal of the ACM. 48 (5): 971–1012. doi:10.1145/502102.502104. ISSN 0004-5411.
  8. ^ Lifschitz, Michael Gelfond and Vladimir (1988). "The Stable Model Semantics for Logic Programming". {{cite journal}}: Cite journal requires |journal= (help)
  9. ^ Eiter, Thomas; Gottlob, Georg; Mannila, Heikki (1997-09-01). "Disjunctive datalog". ACM Transactions on Database Systems. 22 (3): 364–418. doi:10.1145/261124.261126. ISSN 0362-5915.
  10. ^ Kaminski, Mark; Grau, Bernardo Cuenca; Kostylev, Egor V.; Motik, Boris; Horrocks, Ian (2017-11-12). "Foundations of Declarative Data Analysis Using Limit Datalog Programs". arXiv:1705.06927 [cs].
  11. ^ Grau, Bernardo Cuenca; Horrocks, Ian; Kaminski, Mark; Kostylev, Egor V.; Motik, Boris (2020-02-25). "Limit Datalog: A Declarative Query Language for Data Analysis". ACM SIGMOD Record. 48 (4): 6–17. doi:10.1145/3385658.3385660. ISSN 0163-5808.

Sources

External links