
Covariance and contravariance (computer science)

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 81.243.7.205 (talk) at 09:33, 9 June 2008 (→‎Overview of covariance/contravariance in some programming languages). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

In a type system, a covariant operator preserves the ordering, ≤, of types; a contravariant operator reverses the ordering. If neither applies, the operator is invariant. These terms come from category theory. In software-engineering terms, the distinction matters when considering the argument and return types of methods in class hierarchies. In object-oriented languages such as C++, if class B is a subclass of class A, then each member function of B that replaces one of A must return the same or a narrower type; the return type is said to be covariant. On the other hand, the member functions of B must accept the same or a broader set of arguments than the member functions of A; the argument type is said to be contravariant. The problem for B is how to act as a perfect substitute for A. The only way to avoid breaking the surrounding system is to be equally or more liberal than A on inputs, and equally or more strict than A on outputs.

Typical examples:

  • The array type is usually covariant on the base type: since String ≤ Object, ArrayOf(String) ≤ ArrayOf(Object). Note that this is only correct (i.e. type safe) if the array is immutable. If insert and remove operators are permitted, they pull in opposite directions: inserting is safe contravariantly (one can insert a String into an ArrayOf(Object), so an ArrayOf(Object) can stand in wherever Strings are only inserted), whereas removing is safe covariantly (an element removed from an ArrayOf(String) can be treated as an Object). Since the mutators have conflicting variance, mutable arrays should be invariant on the base type.
  • A function with a parameter of type T (defined as fun f (x : T) : Integer) can be replaced by a function g (defined as fun g (x : S) : Integer) if T ≤ S. In other words, if g cares less about the type of its parameter, then it can replace f anywhere, since both return an Integer. So, in a language accepting function arguments, g ≤ f, and the type of the parameter to f is said to be contravariant.
  • In the general case, the type of the result is covariant.
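The two function rules above can be sketched in Java with bounded wildcards. This is only an illustration: the class name FunctionVariance and the method apply are made up for this example, and java.util.function.Function stands in for the abstract function type.

```java
import java.util.function.Function;

class FunctionVariance {
    // Hypothetical helper: needs something usable as a String -> Object function.
    // "? super String" admits functions with a broader parameter type (contravariance);
    // "? extends Object" admits functions with a narrower result type (covariance).
    static Object apply(Function<? super String, ? extends Object> f, String input) {
        return f.apply(input);
    }

    public static void main(String[] args) {
        // g takes any Object (broader than String) and returns an Integer
        // (narrower than Object), so it can stand in for a String -> Object function.
        Function<Object, Integer> g = o -> o.toString().length();
        System.out.println(apply(g, "hello")); // prints 5
    }
}
```

A function that required, say, an Integer parameter would be rejected by the compiler here, because it could not safely be handed a String.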

In object-oriented programming, substitution is also implicitly invoked by overriding methods in subclasses: the new method can be used where the old method was invoked in the original code. Programming languages vary widely on their allowed forms of overriding, and on the variance of overridden methods' types.

Origin of the terms

The origin of these terms is in category theory, where the types in the type system form a category C, with arrows representing the subtype relationship. The subtype relationship supposedly reflects the substitution principle: that any expression of type t can be substituted by an expression of type s if s ≤ t.

Defining a function that accepts type p and returns type r creates a new type p → r in the type system, with which the new function name is associated. This function definition operator is actually a functor F : C × C → C that creates the said type. From the substitution principle above, this functor must be contravariant in the first argument and covariant in the second (see Luca Cardelli).
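Written as an inference rule, the variance of the function-type constructor reads as follows. The primed names p′ and r′ are introduced here only for illustration; ≤ is the subtype ordering used throughout the article.

```latex
\[
\frac{p' \le p \qquad r \le r'}{(p \to r) \;\le\; (p' \to r')}
\]
```

Read from bottom to top: a function of type p → r may be used where one of type p′ → r′ is expected, provided it accepts at least the arguments p′ and returns at most the results r′.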

The controversy in object oriented languages

The problem arises because different object-oriented languages use different strategies to select the code that actually runs in a particular context, and because the first parameter of a method is the object itself (which is not contravariant).

However, Castagna (1995) showed that it all depends on the method-fetching algorithm: types used to select the method are contravariant; types not used to select the method are covariant. It is immaterial whether the fetching occurs at run time or at compile time.

These terms are also used in the context of modern programming languages that offer other functors to create new types with type variables, e.g., generic programming or parametric polymorphism, and exception handling where method definitions are enriched with annotations that indicate possible failures.

Software engineering

In many strictly-typed languages, subclassing must allow for substitution. That is, a child class can always stand in for a parent class. This places restrictions on the sorts of relationships that subclassing can represent. In particular, it means that arguments to member functions must be contravariant and return types must be covariant.

Suppose you have a class representing a person. A person can see the doctor, so this class might have a method virtual void Person::see(Doctor d). Now suppose you want to make a subclass of the Person class, Child. That is, a Child is a Person. One might then like to make a subclass of Doctor, Pediatrician. Only children visit pediatricians, so we would like to enforce that in the type system. However, a naive implementation fails: because a Child is a Person, Child::see(d) must take any Doctor, not just a Pediatrician.
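The failure of the naive implementation can be seen in a small Java sketch. The class names come from the example above; the String return values and the DoctorVisit driver class are assumptions added only so that the chosen method is visible.

```java
class Doctor {}
class Pediatrician extends Doctor {}

class Person {
    String see(Doctor d) { return "Person sees a Doctor"; }
}

class Child extends Person {
    // Not an override: narrowing the parameter type creates an overload,
    // so the inherited see(Doctor) is still available on every Child.
    String see(Pediatrician p) { return "Child sees a Pediatrician"; }
}

class DoctorVisit {
    public static void main(String[] args) {
        Person p = new Child();
        // Through the Person type only see(Doctor) exists, and Child does not override it.
        System.out.println(p.see(new Pediatrician())); // prints "Person sees a Doctor"

        Child c = new Child();
        System.out.println(c.see(new Pediatrician())); // prints "Child sees a Pediatrician"
        // Nothing stops a Child from seeing an arbitrary Doctor.
        System.out.println(c.see(new Doctor()));       // prints "Person sees a Doctor"
    }
}
```

The last call is exactly what the type system was supposed to forbid: because a Child is a Person, it still accepts any Doctor.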

We could try moving the see() method to the Doctor class hierarchy, but we would have the same problem: If a Doctor could see a Person and a Child is a Person, then there is still no way to enforce that a Child must see a Pediatrician and that a Person who is not a Child cannot see a Pediatrician and must see another Doctor. In this case, the visitor pattern could be used to enforce this relationship.

Overview of covariance/contravariance in some programming languages

Both the subtype and method overriding concepts are defined differently from programming language to programming language. They do not necessarily follow the substitution principle above, sometimes adding runtime checking instead. What follows is a simple comparison of how overriding methods behave in some common programming languages.

C++

C++ supports covariant return types in overridden virtual functions. Adding the covariant return type was the first modification of the C++ language approved by the standards committee in 1998. See Allison, Chuck. "What's New in Standard C++?".

With generic programming, C++ allows for what amounts to covariance in argument and return type alike. For example, the argument and return types of member functions of the std::vector<T> class vary with T. The push_back method takes a const T&, so one pushes an int onto a std::vector<int> but a std::string onto a std::vector<std::string>. This is resolved at compile time (statically) and, strictly speaking, is parametric polymorphism.

C#

Arrays of reference-types are covariant: string[] is a subtype of object[], although with some caveats:

// a is a single-element array of System.String
string[] a = new string[1];

// b is an array of System.Object
object[] b = a;

// Assign an integer to b. This would be possible if b really were
// an array of objects, but since it really is an array of strings,
// we will get an ArrayTypeMismatchException with the following message:
// "Attempted to store an element of the incorrect type into the array".
b[0] = 1;

Note: In the above case you can read from b without problem. It is only when trying to write to the array that you must know its real type.

Arrays of value-types are invariant: int[] is not a subtype of double[].

In the C# programming language, support for both return-type covariance and parameter contravariance for delegates was added in version 2.0 of the language. Neither covariance nor contravariance is supported for method overriding.

Eiffel

Eiffel allows covariant return and parameter types in overriding methods. This is possible because Eiffel does not require subclasses to be substitutable for superclasses — that is, subclasses are not necessarily subtypes.

However, this can lead to surprises when an object of a subclass with such covariant parameter types is used polymorphically through a more general class: a call that type-checks against the general class may fail at run time.

Java

Arrays of objects are covariant: String[] is a subtype of Object[], although with some caveats:

// a is a single-element array of String
String[] a = new String[1];

// b is an array of Object
Object[] b = a;

// Assign an Integer to b. This would be possible if b really were
// an array of Object, but since it really is an array of String,
// we will get a java.lang.ArrayStoreException.
b[0] = new Integer (1);

Note: In the above case you can read from b without problem. It is only when trying to write to the array that you must know its real type.

Arrays of primitive types are invariant: int[] is not a subtype of long[], although int is in some sense a subtype of long.

Exception covariance has been supported since the introduction of the language. Return-type covariance was added in J2SE 5.0. Parameter types have to be exactly the same (invariant) for method overriding; otherwise the method is overloaded with a parallel definition instead.
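A minimal sketch of return-type and exception covariance in an overriding method follows. The classes Shape, Circle and OverrideVariance are hypothetical and exist only for this illustration.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

class Shape {
    // Declared to return a Shape and to possibly throw any IOException.
    Shape copy() throws IOException { return new Shape(); }
}

class Circle extends Shape {
    // A legal override since J2SE 5.0: the return type is narrowed to Circle,
    // and the checked exception is narrowed to a subclass of IOException.
    @Override
    Circle copy() throws FileNotFoundException { return new Circle(); }
}

class OverrideVariance {
    public static void main(String[] args) throws IOException {
        Shape s = new Circle();
        // Dynamic dispatch selects Circle.copy() even through the Shape type.
        System.out.println(s.copy().getClass().getSimpleName()); // prints "Circle"

        // Through the Circle type no cast is needed on the result.
        Circle c = new Circle().copy();
    }
}
```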

Generics were introduced in Java 5.0 to allow type-safe generic programming. Unlike arrays, generic classes are neither covariant nor contravariant. For example, neither List<String> nor List<Object> is a subtype of the other:

// a is a single-element List of String
List<String> a = new ArrayList<String>();
a.add("foo");

// b is a List of Object
List<Object> b = a; // This is a compile-time error

However, generic type parameters can contain wildcards (a shortcut for an extra type parameter that is only used once). Example: if a method must operate on a List of any element type, then the only operations it can perform on the list are those whose type relationships can be guaranteed to be safe.

// a is a single-element List of String
List<String> a = new ArrayList<String>();
a.add("foo");

// b is a List of anything
List<?> b = a;

// retrieve the first element
Object c = b.get(0); // This is legal, because we can guarantee that the return type "?" is a subtype of Object

// Add an Integer to b.
b.add(new Integer (1)); // This is a compile-time error; we cannot guarantee that Integer is a subtype of the parameter type "?"

Wildcards can also be bounded, e.g. "? extends Foo" or "? super Foo" for upper and lower bounds, respectively. This allows the permitted operations to be refined. Example: given a List<? extends Foo>, an element can be retrieved and safely assigned to a Foo type (covariance). Given a List<? super Foo>, a Foo object can be safely added as an element (contravariance).
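Both bounds can be seen together in a small copy routine. In this sketch java.lang.Number plays the role of Foo, and the class name WildcardBounds and the method copyAll are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

class WildcardBounds {
    // src is only read from, so an upper bound suffices: every element is a Number.
    // dst is only written to, so a lower bound suffices: it accepts any Number.
    static void copyAll(List<? extends Number> src, List<? super Number> dst) {
        for (Number n : src) {
            dst.add(n);
        }
    }

    public static void main(String[] args) {
        List<Integer> ints = new ArrayList<Integer>();
        ints.add(1);
        ints.add(2);

        List<Object> objects = new ArrayList<Object>();
        // List<Integer> matches "? extends Number"; List<Object> matches "? super Number".
        copyAll(ints, objects);
        System.out.println(objects); // prints [1, 2]
    }
}
```

Swapping the two arguments is rejected at compile time: a List<Object> cannot promise to yield Numbers, and a List<Integer> cannot accept an arbitrary Number.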

REALbasic

REALbasic added support for return type covariance in version 5.5. As in Java, the parameter types of the overriding method must be the same.

Scala

Scala supports both covariance and contravariance.

Sather

Sather supports both covariance and contravariance. The calling convention for overridden methods is covariant in out arguments and return values, and contravariant in normal arguments (those with the mode in).

See also