Farkas' lemma: Difference between revisions

Revision as of 21:37, 1 July 2019

In mathematics, Farkas' lemma is a solvability theorem for a finite system of linear inequalities. It was originally proven by the Hungarian mathematician Gyula Farkas.^[1] Farkas' lemma is the key result underpinning linear programming duality and has played a central role in the development of mathematical optimization (alternatively, mathematical programming). It is used, among other things, in the proof of the Karush–Kuhn–Tucker theorem in nonlinear programming.

Generalizations of Farkas' lemma concern solvability theorems for convex inequalities,^[2] i.e., infinite systems of linear inequalities. Farkas' lemma belongs to a class of statements called "theorems of the alternative": theorems stating that exactly one of two systems has a solution.

Statement of the lemma

There are a number of slightly different (but equivalent) formulations of the lemma in the literature. The one given here is due to Gale, Kuhn and Tucker (1951).^[3]

Farkas' lemma — Let $\mathbf {A} \in \mathbb {R} ^{m\times n}$ and $\mathbf {b} \in \mathbb {R} ^{m}$ . Then exactly one of the following two statements is true:

There exists an $\mathbf {x} \in \mathbb {R} ^{n}$ such that $\mathbf {Ax} =\mathbf {b}$ and $\mathbf {x} \geq 0$ .
There exists a $\mathbf {y} \in \mathbb {R} ^{m}$ such that $\mathbf {A} ^{\mathsf {T}}\mathbf {y} \geq 0$ and $\mathbf {b} ^{\mathsf {T}}\mathbf {y} <0$ .

Here, the notation $\mathbf {x} \geq 0$ means that all components of the vector $\mathbf {x}$ are nonnegative.
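The two statements can be read as certificates: a vector x witnesses the first, a vector y the second. The following is a minimal sketch of a certificate checker, assuming exact rational arithmetic via Python's `fractions` module; the function names are illustrative and not taken from the literature.

```python
from fractions import Fraction

def mat_vec(M, v):
    """Multiply matrix M (list of rows) by vector v, exactly."""
    return [sum(Fraction(a) * Fraction(c) for a, c in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def is_primal_certificate(A, b, x):
    """True if Ax = b and x >= 0 (statement 1)."""
    return all(c >= 0 for c in x) and mat_vec(A, x) == [Fraction(c) for c in b]

def is_dual_certificate(A, b, y):
    """True if A^T y >= 0 and b^T y < 0 (statement 2)."""
    aty = mat_vec(transpose(A), y)
    bty = sum(Fraction(bi) * Fraction(yi) for bi, yi in zip(b, y))
    return all(c >= 0 for c in aty) and bty < 0

# Hypothetical data: A is the 2x2 identity, so C(A) is the first quadrant.
A = [[1, 0], [0, 1]]
print(is_primal_certificate(A, [2, 3], [2, 3]))   # True: b = (2, 3) lies in the cone
print(is_dual_certificate(A, [-1, 3], [1, 0]))    # True: y = (1, 0) separates b = (-1, 3)
```

The checker only verifies a given certificate; finding one in general requires solving a linear program.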

Example

Let n,m=2 and A = [6,4; 3,0] and b = [b₁,b₂]. The lemma says that exactly one of the following two statements must be true (depending on b₁ and b₂):

There exist x₁ ≥ 0, x₂ ≥ 0 such that 6 x₁ + 4 x₂ = b₁ and 3 x₁ = b₂, or
There exist y₁, y₂ such that 6 y₁ + 3 y₂ ≥ 0 and 4 y₁ ≥ 0 and b₁ y₁ + b₂ y₂ < 0.

Here is a proof of the lemma in this special case:

If b₂ ≥ 0 and b₁ − 2b₂ ≥ 0, then option 1 is true, since the solution of the linear equations is x₁ = b₂/3 and x₂ = (b₁ − 2b₂)/4. Option 2 is false: its constraints give y₁ ≥ 0 and 6 y₁ + 3 y₂ ≥ 0, so b₁ y₁ + b₂ y₂ ≥ b₂ (2 y₁ + y₂) = b₂ (6 y₁ + 3 y₂)/3 ≥ 0, and the left-hand side cannot be negative.
Otherwise, option 1 is false, since the unique solution of the linear equations is not nonnegative. But in this case, option 2 is true:
- If b₂ < 0 then we can take e.g. y₁ = 0 and y₂ = 1;
- If b₁ − 2b₂ < 0 then, for some number B > 0, b₁ = 2b₂ − B, so: b₁ y₁ + b₂ y₂ = 2 b₂ y₁ + b₂ y₂ − B y₁ = b₂ (6 y₁ + 3 y₂)/3 − B y₁. So we can take, for example, y₁ = 1, y₂ = −2, which gives b₁ y₁ + b₂ y₂ = −B < 0.
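The case analysis above can be sketched in code. This is an illustrative sketch for the specific matrix A = [6,4; 3,0] only, using exact fractions; solving the two equations gives x₁ = b₂/3 and x₂ = (b₁ − 2b₂)/4. The function names `farkas_certificate` and `check` are hypothetical.

```python
from fractions import Fraction

def farkas_certificate(b1, b2):
    """Return ("x", (x1, x2)) or ("y", (y1, y2)) for A = [[6, 4], [3, 0]]."""
    b1, b2 = Fraction(b1), Fraction(b2)
    if b2 >= 0 and b1 - 2 * b2 >= 0:
        # Option 1: the unique solution of Ax = b is nonnegative.
        return "x", (b2 / 3, (b1 - 2 * b2) / 4)
    if b2 < 0:
        # Option 2, first sub-case of the proof.
        return "y", (Fraction(0), Fraction(1))
    # Option 2, second sub-case: b1 - 2*b2 < 0.
    return "y", (Fraction(1), Fraction(-2))

def check(b1, b2):
    """Verify that the returned vector really satisfies its system."""
    kind, (u, v) = farkas_certificate(b1, b2)
    if kind == "x":
        return u >= 0 and v >= 0 and 6 * u + 4 * v == b1 and 3 * u == b2
    return 6 * u + 3 * v >= 0 and 4 * u >= 0 and b1 * u + b2 * v < 0

print(farkas_certificate(8, 3))   # an "x" certificate: x1 = 1, x2 = 1/2
print(all(check(p, q) for p in range(-5, 6) for q in range(-5, 6)))  # True
```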

Geometric interpretation

Denote the convex cone generated by the columns of $\mathbf {A}$ by $C(\mathbf {A} )=\{\mathbf {A} \mathbf {x} |\mathbf {x} \geq 0\}$ . Then $C(\mathbf {A} )$ is a closed convex cone. The vector $\mathbf {x}$ proves that $\mathbf {b}$ lies in $C(\mathbf {A} )$ , while the vector $\mathbf {y}$ gives a linear functional that separates $\mathbf {b}$ from $C(\mathbf {A} )$ .

Let $\mathbf {a} _{1},\dots ,\mathbf {a} _{n}\in \mathbb {R} ^{m}$ denote the columns of $\mathbf {A}$ . In terms of these vectors, Farkas' lemma states that exactly one of the following two statements is true:

There exist coefficients $x_{1},\dots ,x_{n}\in \mathbb {R} ,x_{1},\dots ,x_{n}\geq 0$ , such that $\mathbf {b} =x_{1}\mathbf {a} _{1}+\dots +x_{n}\mathbf {a} _{n}$ , i.e., b lies in the cone of A;
There exists a vector $\mathbf {y} \in \mathbb {R} ^{m}$ such that $\mathbf {a} _{i}^{\mathsf {T}}\mathbf {y} \geq 0$ for $i=1,\dots ,n$ and $\mathbf {b} ^{\mathsf {T}}\mathbf {y} <0$ , i.e., there is a hyperplane through the origin, separating the vector b from the cone of A.

The vectors $x_{1}\mathbf {a} _{1}+\dots +x_{n}\mathbf {a} _{n}$ with nonnegative coefficients constitute the convex cone of the set $\{\mathbf {a} _{1},\dots ,\mathbf {a} _{n}\}$ so the first statement says that $\mathbf {b}$ is in this cone.

The second statement says that there exists a vector $\mathbf {y}$ such that the angle of $\mathbf {y}$ with the vectors $\mathbf {a} _{i}$ is at most 90° while the angle of $\mathbf {y}$ with the vector $\mathbf {b}$ is more than 90°. The hyperplane normal to this vector has the vectors $\mathbf {a} _{i}$ on one side and the vector $\mathbf {b}$ on the other side. Hence, this hyperplane separates the vectors in the cone of $\{\mathbf {a} _{1},\dots ,\mathbf {a} _{n}\}$ and the vector $\mathbf {b}$ .

For example, let n,m=2 and a₁ = (1,0)^T and a₂ = (1,1)^T. The convex cone spanned by a₁ and a₂ can be seen as a wedge-shaped slice of the first quadrant in the x-y plane. Now, suppose b = (0,1)^T. Then b is not in the convex cone a₁x₁+a₂x₂, because b = a₁x₁+a₂x₂ would force x₂ = 1 and x₁ = −1 < 0. Hence, there must be a separating hyperplane. Let y = (1,−1)^T. We can see that a₁ · y = 1, a₂ · y = 0, and b · y = −1. Hence, the hyperplane with normal y indeed separates the convex cone a₁x₁+a₂x₂ from b.
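The three inner products in this example can be checked directly. The snippet below is a small sketch in plain Python; the helper `dot` is introduced here for illustration.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))

a1, a2 = (1, 0), (1, 1)   # generators of the cone
b = (0, 1)                # vector to be separated
y = (1, -1)               # candidate normal of the separating hyperplane

products = [dot(a1, y), dot(a2, y), dot(b, y)]
print(products)  # [1, 0, -1]

# y is a valid certificate: nonnegative on the generators, negative on b.
separates = dot(a1, y) >= 0 and dot(a2, y) >= 0 and dot(b, y) < 0
print(separates)  # True
```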

Logic interpretation

A particularly suggestive and easy-to-remember version is the following: if a set of inequalities has no solution, then a contradiction can be produced from it by linear combination with nonnegative coefficients. In formulas: if $Ax\leq b$ is unsolvable then $y^{\mathsf {T}}A=0$ , $y^{\mathsf {T}}b=-1$ , $y\geq 0$ has a solution.^[4] Note that $y^{\mathsf {T}}A$ is a combination of the left-hand sides and $y^{\mathsf {T}}b$ a combination of the right-hand sides of the inequalities. Since the nonnegative combination produces a zero vector on the left and a −1 on the right, the contradiction is apparent.

Thus, Farkas' lemma can be viewed as a theorem of logical completeness: $Ax\leq b$ is a set of "axioms", the linear combinations are the "derivation rules", and the lemma says that, if the set of axioms is inconsistent, then it can be refuted using the derivation rules.^[5]^: 92–94
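As a small illustration of such a refutation, consider the hypothetical one-variable system x ≤ 1, −x ≤ −2 (chosen here for illustration, not taken from the sources). Adding the two inequalities with weights y = (1, 1) gives 0 ≤ −1. The sketch below checks a refutation certificate of this kind; the function name `refutes` is an assumption.

```python
def refutes(A, b, y):
    """True if y >= 0, y^T A = 0 and y^T b < 0, which shows Ax <= b is unsolvable."""
    n = len(A[0])
    # Nonnegative combination of the left-hand sides, one entry per variable.
    combo_lhs = [sum(yi * row[j] for yi, row in zip(y, A)) for j in range(n)]
    # The same combination of the right-hand sides.
    combo_rhs = sum(yi * bi for yi, bi in zip(y, b))
    return (all(yi >= 0 for yi in y)
            and all(c == 0 for c in combo_lhs)
            and combo_rhs < 0)

A = [[1], [-1]]   # rows encode x <= 1 and -x <= -2
b = [1, -2]
print(refutes(A, b, [1, 1]))  # True: the combination reads 0 <= -1
```

Scaling y by a positive factor normalizes the right-hand combination to exactly −1, which matches the formulation above.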

Variants

Farkas' lemma has several variants with different sign constraints (the first one is the original version):^[5]^: 92

Either the system $\mathbf {Ax} =\mathbf {b}$ has a solution with $\mathbf {x} \geq 0$ , or the system $\mathbf {A} ^{\mathsf {T}}\mathbf {y} \geq 0$ has a solution with $\mathbf {b} ^{\mathsf {T}}\mathbf {y} <0$ .
Either the system $\mathbf {Ax} \leq \mathbf {b}$ has a solution with $\mathbf {x} \geq 0$ , or the system $\mathbf {A} ^{\mathsf {T}}\mathbf {y} \geq 0$ has a solution with $\mathbf {b} ^{\mathsf {T}}\mathbf {y} <0$ and $\mathbf {y} \geq 0$ .
Either the system $\mathbf {Ax} \leq \mathbf {b}$ has a solution with $\mathbf {x} \in \mathbb {R} ^{n}$ , or the system $\mathbf {A} ^{\mathsf {T}}\mathbf {y} =0$ has a solution with $\mathbf {b} ^{\mathsf {T}}\mathbf {y} <0$ and $\mathbf {y} \geq 0$ .
Either the system $\mathbf {Ax} =\mathbf {b}$ has a solution with $\mathbf {x} \in \mathbb {R} ^{n}$ , or the system $\mathbf {A} ^{\mathsf {T}}\mathbf {y} =0$ has a solution with $\mathbf {b} ^{\mathsf {T}}\mathbf {y} \neq 0$ .

The last variant is mentioned for completeness; it is not actually a "Farkas lemma" since it contains only equalities. Its proof is a simple exercise in linear algebra.

Generalizations

Generalized Farkas' lemma — Let $\mathbf {A} \in \mathbb {R} ^{m\times n}$ and $\mathbf {b} \in \mathbb {R} ^{m}$ , let $\mathbf {S}$ be a closed convex cone in $\mathbb {R} ^{n}$ , and let $\mathbf {S^{*}} =\{\mathbf {z} \in \mathbb {R} ^{n}|\mathbf {z} ^{\mathsf {T}}\mathbf {x} \geq 0,\forall \mathbf {x} \in \mathbf {S} \}$ be the dual cone of $\mathbf {S}$ . If the convex cone $C(\mathbf {A} )=\{\mathbf {A} \mathbf {x} |\mathbf {x} \in \mathbf {S} \}$ is closed, then exactly one of the following two statements is true:

There exists an $\mathbf {x} \in \mathbb {R} ^{n}$ such that $\mathbf {Ax} =\mathbf {b}$ and $\mathbf {x} \in \mathbf {S}$ .
There exists a $\mathbf {y} \in \mathbb {R} ^{m}$ such that $\mathbf {A} ^{\mathsf {T}}\mathbf {y} \in \mathbf {S^{*}}$ and $\mathbf {b} ^{\mathsf {T}}\mathbf {y} <0$ .

The generalized Farkas' lemma can be interpreted geometrically as follows: either a vector is in a given closed convex cone, or there exists a hyperplane separating the vector from the cone; there are no other possibilities. The closedness condition is necessary; see Separation theorem I in Hyperplane separation theorem. For the original Farkas' lemma, $\mathbf {S}$ is the nonnegative orthant $\mathbb {R} _{+}^{n}$ , hence the closedness condition holds automatically. Indeed, for any polyhedral convex cone, i.e., whenever there exists a $\mathbf {B} \in \mathbb {R} ^{n\times k}$ such that $\mathbf {S} =\{\mathbf {B} \mathbf {x} |\mathbf {x} \in \mathbb {R} _{+}^{k}\}$ , the closedness condition holds automatically. In convex optimization, various kinds of constraint qualification, e.g. Slater's condition, are responsible for closedness of the underlying convex cone $C(\mathbf {A} )$ .

By setting $\mathbf {S} =\mathbb {R} ^{n}$ and $\mathbf {S^{*}} =\{0\}$ in the generalized Farkas' lemma, we obtain the following corollary about the solvability of a finite system of linear equalities.

Corollary — Let $\mathbf {A} \in \mathbb {R} ^{m\times n}$ and $\mathbf {b} \in \mathbb {R} ^{m}$ . Then exactly one of the following two statements is true:

There exists an $\mathbf {x} \in \mathbb {R} ^{n}$ such that $\mathbf {Ax} =\mathbf {b}$ .
There exists a $\mathbf {y} \in \mathbb {R} ^{m}$ such that $\mathbf {A} ^{\mathsf {T}}\mathbf {y} =0$ and $\mathbf {b} ^{\mathsf {T}}\mathbf {y} \neq 0$ .
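For instance, take the hypothetical inconsistent system x₁ + x₂ = 1, 2x₁ + 2x₂ = 3. The vector y = (2, −1) annihilates the columns of A but not b, which certifies that no solution exists. A minimal sketch of this check, with an illustrative function name:

```python
def certifies_no_solution(A, b, y):
    """True if A^T y = 0 and b^T y != 0 (second statement of the corollary)."""
    aty = [sum(a * yi for a, yi in zip(col, y)) for col in zip(*A)]
    bty = sum(bi * yi for bi, yi in zip(b, y))
    return all(c == 0 for c in aty) and bty != 0

A = [[1, 1],
     [2, 2]]
b = [1, 3]
# Twice the first equation minus the second reads 0 = -1.
print(certifies_no_solution(A, b, [2, -1]))  # True
```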

Further implications

Farkas' lemma can be varied to yield many further theorems of the alternative by simple modifications, such as Gordan's theorem: Either $Ax<0$ has a solution x, or $A^{\mathsf {T}}y=0$ has a nonzero solution y with y ≥ 0.
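Both alternatives of Gordan's theorem can be checked on small instances. The sketch below uses two hypothetical one-column matrices chosen for illustration; the function names are assumptions.

```python
def gordan_primal(A, x):
    """True if Ax < 0 holds in every component."""
    return all(sum(a * xi for a, xi in zip(row, x)) < 0 for row in A)

def gordan_dual(A, y):
    """True if A^T y = 0 with y >= 0 and y nonzero."""
    aty_is_zero = all(sum(a * yi for a, yi in zip(col, y)) == 0 for col in zip(*A))
    return aty_is_zero and all(yi >= 0 for yi in y) and any(yi != 0 for yi in y)

# Rows x < 0 and 2x < 0: solvable, e.g. by x = -1.
print(gordan_primal([[1], [2]], [-1]))    # True

# Rows x < 0 and -x < 0: unsolvable, certified by y = (1, 1).
print(gordan_dual([[1], [-1]], [1, 1]))   # True
```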

Common applications of Farkas' lemma include proving the strong duality theorem associated with linear programming, game theory at a basic level, and the Kuhn–Tucker constraints. An extension of Farkas' lemma can be used to analyze the strong duality conditions for, and to construct the dual of, a semidefinite program. It is sufficient to prove the existence of the Kuhn–Tucker constraints using the Fredholm alternative, but for the condition to be necessary, one must apply Von Neumann's minimax theorem to show the equations derived by Cauchy are not violated.

Notes

^ Farkas, Julius (Gyula) (1902), "Über die Theorie der Einfachen Ungleichungen", Journal für die Reine und Angewandte Mathematik, 124: 1–27, doi:10.1515/crll.1902.124.1
^ Dinh, N.; Jeyakumar, V. (2014), "Farkas' lemma: three decades of generalizations for mathematical optimization", TOP, 22 (1): 1–22, doi:10.1007/s11750-014-0319-y
^ Gale, David; Kuhn, Harold; Tucker, Albert W. (1951), "Linear Programming and the Theory of Games - Chapter XII" (PDF), in Koopmans (ed.), Activity Analysis of Production and Allocation, Wiley. See Lemma 1 on page 318.
^ Boyd, Stephen P.; Vandenberghe, Lieven (2004), "Section 5.8.3" (PDF), Convex Optimization, Cambridge University Press, ISBN 978-0-521-83378-3, retrieved October 15, 2011
^ Gärtner, Bernd; Matoušek, Jiří (2006), Understanding and Using Linear Programming, Berlin: Springer, ISBN 3-540-30697-8. Pages 81–104.

@@ Line 6: / Line 6: @@
 == Statement of the lemma ==
 There are a number of slightly different (but equivalent) formulations of the lemma in the literature. The one given here is due to Gale, Kuhn and Tucker (1951).<ref>{{citation|last1=Gale|first1=David|title=Activity Analysis of Production and Allocation|year=1951|editor=Koopmans|chapter=Linear Programming and the Theory of Games - Chapter XII|chapter-url=http://cowles.econ.yale.edu/P/cm/m13/m13-19.pdf|publisher=Wiley|last2=Kuhn|first2=Harold|last3=Tucker|first3=Albert W.|author1-link=David Gale|author2-link=Harold Kuhn|author3-link=Albert W. Tucker}}.  See Lemma 1 on page 318.</ref>
-{{math_theorem|name=Farkas' lemma| Let <math>\mathbf{A} \in \mathbb{R}^{m\times n}</math>  and <math>\mathbf{b} \in \mathbb{R}^{m}</math> . Then [[Uniqueness quantification|exactly one]] of the following two statements is true:
+{{math_theorem|name=Farkas' lemma| Let <math>\mathbf{A} \in \mathbb{R}^{m\times n}</math>  and <math>\mathbf{b} \in \mathbb{R}^{m}</math> . Then exactly one of the following two statements is true:
 # There exists an <math>\mathbf{x} \in \mathbb{R}^{n}</math> such that <math>\mathbf{Ax} = \mathbf{b}</math> and <math>\mathbf{x} \geq 0</math>.
 # There exists a <math>\mathbf{y} \in \mathbb{R}^{m}</math> such that <math>\mathbf{A}^{\mathsf{T}}\mathbf{y} \geq 0</math> and <math>\mathbf{b}^{\mathsf{T}} \mathbf{y}  < 0</math>. }}
 Here, the notation <math>\mathbf{x} \geq 0</math> means that all components of the vector <math>\mathbf{x}</math> are nonnegative.
 == Example ==