Selection (relational algebra)
In relational algebra, a selection (sometimes called a restriction, following E.F. Codd's 1970 paper; contrary to a popular belief, this name was not chosen to avoid confusion with SQL's use of SELECT, since Codd's article predates the existence of SQL) is a unary operation that denotes a subset of a relation.
A selection is written as $\sigma_{a \theta b}(R)$ or $\sigma_{a \theta v}(R)$ where:
- a and b are attribute names
- θ is a binary operation in the set $\{<, \leq, =, \neq, \geq, >\}$
- v is a value constant
- R is a relation
The selection $\sigma_{a \theta b}(R)$ denotes all tuples in R for which θ holds between the a and the b attribute.
The selection $\sigma_{a \theta v}(R)$ denotes all tuples in R for which θ holds between the a attribute and the value v.
For an example, consider the following tables where the first table gives the relation Person, the second table gives the result of and the third table gives the result of .
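As a rough illustration of the two forms of selection, the sketch below applies one attribute-to-constant and one attribute-to-attribute condition to a small Person relation. The attribute names (Name, Age, Weight), the rows, the two conditions and the helper names are all assumptions made for this example; a list of dicts stands in for a set of tuples.

```python
import operator

# Comparison operators allowed as θ in a simple selection.
THETA = {
    "<": operator.lt,
    "<=": operator.le,
    "=": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    ">": operator.gt,
}

def select_attr_const(relation, a, theta, v):
    """σ_{a θ v}(R): keep tuples where θ holds between attribute a and constant v."""
    return [t for t in relation if THETA[theta](t[a], v)]

def select_attr_attr(relation, a, theta, b):
    """σ_{a θ b}(R): keep tuples where θ holds between attributes a and b."""
    return [t for t in relation if THETA[theta](t[a], t[b])]

# Hypothetical Person relation (illustrative data only).
person = [
    {"Name": "Ann", "Age": 34, "Weight": 80},
    {"Name": "Bob", "Age": 28, "Weight": 64},
    {"Name": "Cid", "Age": 54, "Weight": 54},
]

older = select_attr_const(person, "Age", ">=", 34)     # Age >= 34
same = select_attr_attr(person, "Age", "=", "Weight")  # Age = Weight
```

Both results contain only tuples already present in Person, which is what makes selection a subset operation.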
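More formally, the semantics of the selection is defined as follows:
$$\sigma_{a \theta b}(R) = \{\, t : t \in R,\ t(a) \mathrel{\theta} t(b) \,\}$$
$$\sigma_{a \theta v}(R) = \{\, t : t \in R,\ t(a) \mathrel{\theta} v \,\}$$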
The result of the selection is only defined if the attribute names that it mentions are in the heading of the relation that it operates upon.
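One way to make this definedness condition concrete is to carry the heading alongside the tuples and reject any selection that mentions an unknown attribute. The following is a minimal sketch under that assumption; the `Relation` class, the `select` function and the error message are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

@dataclass
class Relation:
    heading: frozenset  # attribute names
    body: tuple         # tuples of the relation, represented as dicts

def select(relation, attrs, predicate):
    """Apply a selection whose condition mentions the attribute names in attrs."""
    missing = set(attrs) - relation.heading
    if missing:
        # The selection is undefined for this relation, so refuse to evaluate it.
        raise ValueError(f"selection undefined: {sorted(missing)} not in heading")
    kept = tuple(t for t in relation.body if predicate(t))
    return Relation(relation.heading, kept)

# Hypothetical data for illustration.
person = Relation(frozenset({"Name", "Age"}), ({"Name": "Ann", "Age": 34},))

result = select(person, ["Age"], lambda t: t["Age"] >= 30)

error_message = None
try:
    select(person, ["Height"], lambda t: t["Height"] > 170)
except ValueError as err:
    error_message = str(err)
```

Checking the heading before evaluating the predicate means the error does not depend on whether the relation happens to contain any tuples.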
A generalized selection is a unary operation written as $\sigma_\varphi(R)$ where $\varphi$ is a propositional formula that consists of atoms as allowed in the normal selection and, in addition, the logical operators ∧ (and), ∨ (or) and ¬ (negation). This selection selects all those tuples in R for which $\varphi$ holds.
For an example, consider the following tables where the first table gives the relation Person and the second the result of .
Formally, the semantics of the generalized selection is defined as follows:
$$\sigma_\varphi(R) = \{\, t : t \in R,\ \varphi(t) \,\}$$
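The generalized selection is expressible with other basic algebraic operations. A simulation of generalized selection using the fundamental operators is defined by the following rules, where $\varphi$ and $\psi$ are selection conditions:
$$\sigma_{\varphi \land \psi}(R) = \sigma_\varphi(R) \cap \sigma_\psi(R)$$
$$\sigma_{\varphi \lor \psi}(R) = \sigma_\varphi(R) \cup \sigma_\psi(R)$$
$$\sigma_{\lnot \varphi}(R) = R - \sigma_\varphi(R)$$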
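The simulation can be checked on a small example: a conjunction corresponds to the intersection of two selections, a disjunction to their union, and a negation to the difference from R. The sketch below does this with Python sets; the relation, the two conditions and the helper `sigma` are assumptions made for illustration.

```python
# Hypothetical relation: each tuple is (name, age, weight).
R = frozenset({
    ("Ann", 34, 80),
    ("Bob", 28, 64),
    ("Cid", 54, 54),
    ("Dee", 29, 70),
})

def sigma(pred, rel):
    """Selection: the subset of rel whose tuples satisfy pred."""
    return frozenset(t for t in rel if pred(t))

def phi(t):
    return t[1] >= 30  # Age >= 30

def psi(t):
    return t[2] < 60   # Weight < 60

# Generalized selections evaluated directly on the compound condition.
conj = sigma(lambda t: phi(t) and psi(t), R)
disj = sigma(lambda t: phi(t) or psi(t), R)
neg = sigma(lambda t: not phi(t), R)

# The same results obtained from simple selections and set operations.
assert conj == sigma(phi, R) & sigma(psi, R)  # intersection
assert disj == sigma(phi, R) | sigma(psi, R)  # union
assert neg == R - sigma(phi, R)               # difference
```

These identities rely on two-valued conditions, where every tuple either satisfies a condition or does not.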
In computer languages it is expected that any truth-valued expression be permitted as the selection condition rather than restricting it to be a simple comparison.
In SQL, selections are performed by using WHERE clauses in SELECT, UPDATE, and DELETE statements, but note that the selection condition can result in any of three truth values (true, false and unknown) instead of the usual two.
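The effect of the third truth value can be observed with Python's built-in `sqlite3` module. In this sketch the table, its columns and its rows are hypothetical; the point is that a row whose condition evaluates to unknown is returned neither by the selection nor by its negation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (Name TEXT, Age INTEGER)")
conn.executemany(
    "INSERT INTO Person VALUES (?, ?)",
    [("Ann", 34), ("Bob", 28), ("Cid", None)],  # Cid's Age is NULL
)

# Rows for which the condition is true.
older = conn.execute(
    "SELECT Name FROM Person WHERE Age >= 30 ORDER BY Name"
).fetchall()

# Rows for which the negated condition is true.
not_older = conn.execute(
    "SELECT Name FROM Person WHERE NOT (Age >= 30) ORDER BY Name"
).fetchall()

conn.close()
```

Here `older` contains only Ann and `not_older` contains only Bob. For Cid the comparison with NULL yields unknown, its negation is also unknown, and WHERE keeps only rows for which the condition is true.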
- Codd, E.F. (June 1970). "A Relational Model of Data for Large Shared Data Banks". Communications of the ACM. 13 (6): 377–387. doi:10.1145/362384.362685.