Separation oracle
A separation oracle (also called a cutting-plane oracle) is a concept in the mathematical theory of convex optimization. It is a method to describe a convex set that is given as an input to an optimization algorithm. Separation oracles are used as input to ellipsoid methods.[1]: 87, 96, 98
Definition
Let K be a convex and compact set in Rn. A strong separation oracle for K is an oracle (black box) that, given a vector y in Rn, returns one of the following:[1]: 48
- Assert that y is in K.
- Find a hyperplane that separates y from K: a vector a in Rn, such that for all x in K.
A strong separation oracle is completely accurate, and thus may be hard to construct. For practical reasons, a weaker version is considered, which allows for small errors in the boundary of K and the inequalities. Given a small error tolerance d>0, we say that:
- A vector y is d-near K if its Euclidean distance from K is at most d;
- A vector y is d-deep in K if it is in K, and its Euclidean distance from any point in outside K is at least d.
The weak version also considers rational numbers, which have a representation of finite length, rather than arbitrary real numbers. A weak separation oracle for K is an oracle that, given a vector y in Qn and a rational number d>0, returns one of the following::[1]: 51
- Assert that y is d-near K;
- Find a vector a in Qn, normalized such that its maximum element is 1, such that for all x that are d-deep in K.
Implementation
A special case of a convex set is a set represented by linear inequalities: . Such a set is called a convex polytope. A strong separation oracle for a convex polytope can be implemented, but its run-time depends on the input format.
Representation by inequalities
If the matrix A and the vector b are given as input, so that , then a strong separation oracle can be implemented as follows.[2] Given a point y, compute :
- If the outcome is at most , then y is in K by definition;
- Otherwise, there is at least one row of A, such that is larger than the corresponding value in ; this row gives us the separating hyperplane, as for all x in K.
This oracle runs in polynomial time as long as the number of constraints is polynomial.
Representation by vertices
Suppose the set of vertices of K is given as an input, so that the convex hull of its vertices. Then, deciding whether y is in K requires to check whether y is a convex combination of the input vectors, that is, whether there exist coefficients z1,...,zk such that: [1]: 49
- ;
- for all i in 1,...,k.
This is a linear program with k variables and n equality constraints (one for each element of y). If y is not in K, then the above program has no solution, and the separation oracle needs to find a vector c such that
- for all i in 1,...,k.
Note that the two above representations can be very different in size: it is possible that a polytope can be represented by a small number of inequalities, but has exponentially many vertices (for example, an n-dimensional cube). Conversely, it is possible that a polytope has a small number of vertices, but requires exponentially many inequalities (for example, the convex hull of the 2n vectors of the form (0,...,±1,...,0).
Problem-specific representation
In some linear optimization problems, even though the number of constraints is exponential, one can still write a custom separation oracle that works in polynomial time. Some examples are:
- The minimum-cost arborescence problem: given a weighted directed graph and a vertex r in it, find a subgraph of minimum cost that contains a directed path from r to any other vertex. The problem can be presented as an LP with a constraint for each subset of vertices, which is an exponential number of constraints. However, a separation oracle can be implemented using n-1 applications of the minimum cut procedure.[3]
- The maximum independent set problem. It can be approximated by an LP with a constraint for every odd-length cycle. While there are exponentially-many such cycles, a separation oracle that works in polynomial time can be implemented by just finding an odd cycle of minimum length, which can be done in polynomial time.[3]
- The dual of the configuration linear program for the bin packing problem. It can be approximated by an LP with a constraint for each feasible configuration. While there are exponentially-many such cycles, a separation oracle that works in pseudopolynomial time can be implemented by solving a knapsack problem. This is used by the Karmarkar-Karp bin packing algorithms.
Non-linear sets
Let f be a convex function on Rn. The set is a convex set in Rn+1. Given an evaluation oracle for f (a black box that returns the value of f for every given point), one can easily check whether a vector (y, t) is in K. In order to get a separation oracle, we need also an oracle to evaluate the subgradient of f.[1]: 49 Suppose some vector (y, s) is not in K, so f(y) > s. Let g be the subgradient of f at y (g is a vector in Rn). Denote .Then, , and for all (x, t) in K: . By definition of a subgradient: for all x in Rn. Therefore, , so , and c represents a separating hyperplane.
Usage
A strong separation oracle can be given as an input to the ellipsoid method for solving a linear program. Consider the linear program . The ellipsoid method maintains an ellipsoid that initially contains the entire feasible domain . At each iteration t, it takes the center of the current ellipsoid, and sends it to the separation oracle:
- If the oracle says that is feasible (that is, contained in the set ), then we do an "optimality cut" at : we cut from the ellipsoid all points x for which . These points are definitely not optimal.
- If the oracle says that is infeasible, then it typically returns a specific constraint that is violated by , that is, a row in the matrix A, such that . Since for all feasible x, this implies that for all feasible x. Then, we do a "feasibility cut" at : we cut from the ellipsoid all points y for which . These points are definitely not feasible.
After making a cut, we construct a new, smaller ellipsoid, that contains the remaining region. It can be shown that this process converges to an approximate solution, in time polynomial in the required accuracy.
Converting a weak oracle to a strong oracle
Given a weak separation oracle for a polyhedron, it is possible to construct a strong separation oracle by a careful method of rounding, or by diophantine approximations.[1]: 159
Relation to other oracles
Membership oracle
A membership oracle is a weaker variant of the separation oracle, which does not require the separating hyperplane.
Formally, a strong membership oracle for K is an oracle that, given a vector y in Rn, returns one of the following:
- Assert that y is in K.
- Assert that y is in not in K.
Obviously, a strong separation oracle implies a strong membership oracle.
A weak membership oracle for K is an oracle that, given a vector y in Qn and a rational number d>0, returns one of the following:
- Assert that y is d-near K;
- Assert that y is not d-deep in K.
Using a weak separation oracle (WSO), one can construct a weak membership oracle as follows.
- Run WSO(K,y,d). If it returns "yes", then return "yes".
- Otherwise, run WSO(K,y,d/3). If it returns "yes", then return "yes".
- Otherwise, return "no"; see [1]: 52 for proof of correctness.
Optimization oracle
An optimization oracle is an oracle which finds an optimal point in a given convex set.
Formally, a strong optimization oracle for K is an oracle that, given a vector c in Rn, returns one of the following:
- Assert that K is empty.
- Find a vector y in K that maximizes (that is, . for all x in K).
A weak optimization oracle for K is an oracle that, given a vector c in Qn and a rational number d>0, returns one of the following:
- Assert that no point is d-deep in K;
- Find a vector y that is d-near K, such that for all x that are d-deep in K.
Violation oracle
A strong violation oracle for K is an oracle that, given a vector c in Rn and a real number r, returns one of the following:
- Assert that for all x in K;
- Find a vector y in K with .
A weak violation oracle for K is an oracle that, given a vector c in Qn, a rational number r, and a rational number d>0, returns one of the following:
- Assert that for all x that are d-deep in K;
- Find a vector y that is d-near K, such that .
A major result in convex optimization is that, for any "well-described" polyhedron, the strong-separation problem, the strong-optimization problem and the strong-violation problem are equivalent
A major result in convex optimization is that, for any "well-described" polyhedron, the strong-separation problem, the strong-optimization problem and the strong-violation problem are polynomial-time-equivalent. That is, given an oracle for any one of these three problems, the other two problems can be solved in polynomial time.[1]: 158 A polyhedron is called "well-described" if the input contains n (the number of dimensions of the space it lies in), and contains a number p such K can be defined by linear inequalities with encoding-length at most p.[1]: 163 The result is proved using the ellipsoid method.
References
- ^ a b c d e f g h i M. Grötschel, L. Lovász, A. Schrijver: Geometric Algorithms and Combinatorial Optimization, Springer, 1988.
- ^ "MIT 6.854 Spring 2016 Lecture 12: From Separation to Optimization and Back; Ellipsoid Method - YouTube". www.youtube.com. Retrieved 2021-01-03.
- ^ a b Vempala, Santosh (2016). "Separation oracle" (PDF).