Clique problem

From Wikipedia, the free encyclopedia
The brute force algorithm finds a 4-clique in this 7-vertex graph (the complement of the 7-vertex path graph) by systematically checking all C(7,4)=35 4-vertex subgraphs for completeness.

In computer science, the clique problem is any of a number of computational problems of finding cliques (subsets of vertices, all adjacent to each other, also called complete subgraphs) in a graph.

For example, the maximum clique problem arises in the following real-world setting. Consider a social network, where the graph’s vertices represent people, and the graph’s edges represent mutual acquaintance. To find a largest subset of people who all know each other, one can systematically inspect all subsets, a process that is too time-consuming to be practical for social networks comprising more than a few dozen people. Although this brute-force search can be improved by more efficient algorithms, all of these algorithms take exponential time to solve the problem. Therefore, much of the theory about the clique problem is devoted to identifying special types of graph that admit more efficient algorithms, or to establishing the computational difficulty of the general problem in various models of computation.[1] Along with its applications in social networks, the clique problem also has many applications in bioinformatics and computational chemistry.[2]

Clique problems include finding a maximum clique (largest clique by vertices), finding a maximum weight clique in a weighted graph, listing all maximal cliques (cliques that cannot be enlarged), and solving the decision problem of testing whether a graph contains a clique larger than a given size. These problems are all hard: the clique decision problem is NP-complete (one of Karp's 21 NP-complete problems), the problem of finding the maximum clique is both fixed-parameter intractable and hard to approximate, and listing all maximal cliques may require exponential time as there exist graphs with exponentially many maximal cliques. Nevertheless, there are algorithms for these problems that run in exponential time or that handle certain more specialized input graphs in polynomial time.[1]

History and applications

The study of complete subgraphs in mathematics predates the "clique" terminology.[3] The term "clique" and the problem of algorithmically listing cliques both come from the social sciences, where complete subgraphs are used to model social cliques, groups of people who all know each other. Luce & Perry (1949) used graphs to model social networks, and adapted the social science terminology to graph theory. The first algorithm for solving the clique problem is that of Harary & Ross (1957),[1] who were motivated by the sociological application.

Since the work of Harary and Ross, many others have devised algorithms for various versions of the clique problem.[1] In the 1970s, researchers began studying these algorithms from the point of view of worst-case analysis; see, for instance, Tarjan & Trojanowski (1977), an early work on the worst-case complexity of the maximum clique problem. Also in the 1970s, beginning with the work of Cook (1971) and Karp (1972), researchers began using the theory of NP-completeness and related intractability results to provide a mathematical explanation for the perceived difficulty of the clique problem. In the 1990s, a breakthrough series of papers beginning with Feige et al. (1991) and reported at the time in major newspapers,[4] showed that (assuming P ≠ NP) it is not even possible to approximate the problem accurately and efficiently.

Beyond their applications in social network analysis, clique-finding algorithms have been used in automatic test pattern generation to bound the size of a test set,[5] and in chemistry, to find chemicals that match a target structure[6] and to model binding sites of chemical reactions.[7] In bioinformatics, clique-finding algorithms have been used to infer evolutionary trees,[8] predict protein structures,[9] and find closely interacting clusters of proteins (Spirin & Mirny 2003). Listing the cliques in a dependency graph is an important step in the analysis of certain random processes.[10]


Definitions

Main article: Clique (graph theory)
The graph shown has one maximum clique, the triangle {1,2,5}, and four more maximal cliques, the pairs {2,3}, {3,4}, {4,5}, and {4,6}.

An undirected graph is formed by a finite set of vertices and a set of unordered pairs of vertices, which are called edges. By convention, in algorithm analysis, the number of vertices in the graph is denoted by n and the number of edges is denoted by m. A clique in a graph G is a complete subgraph of G; that is, it is a subset S of the vertices such that every two vertices in S are connected by an edge in G. A maximal clique is a clique to which no more vertices can be added; a maximum clique is a clique that includes the largest possible number of vertices, and the clique number ω(G) is the number of vertices in a maximum clique of G.[1]
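
As a small illustration of these definitions, the following Python sketch checks whether a vertex set is a clique and whether it is maximal. The edge list is inferred from the caption above (the triangle {1,2,5} plus the pairs {2,3}, {3,4}, {4,5}, and {4,6}); the function names are illustrative and not taken from any library.

```python
from itertools import combinations

# Edge list inferred from the caption: triangle {1,2,5} plus four extra edges.
EDGES = [(1, 2), (1, 5), (2, 5), (2, 3), (3, 4), (4, 5), (4, 6)]

def build_adjacency(edges):
    """Return a dict mapping each vertex to the set of its neighbors."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def is_clique(adj, vertices):
    """True if every two distinct vertices in `vertices` are adjacent."""
    return all(v in adj[u] for u, v in combinations(vertices, 2))

def is_maximal_clique(adj, vertices):
    """True if `vertices` is a clique and no outside vertex is adjacent to all of it."""
    s = set(vertices)
    if not is_clique(adj, s):
        return False
    return not any(s <= adj[w] for w in adj if w not in s)

adj = build_adjacency(EDGES)
print(is_clique(adj, {1, 2, 5}))         # True
print(is_maximal_clique(adj, {1, 2}))    # False: vertex 5 can still be added
print(is_maximal_clique(adj, {4, 6}))    # True, although it is not a maximum clique
```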

Several closely related clique-finding problems have been studied.[11]

  • In the maximum clique problem, the input is an undirected graph, and the output is a maximum clique in the graph. If there are multiple maximum cliques, only one need be output.[11]
  • In the weighted maximum clique problem, the input is an undirected graph with weights on its vertices (or, less frequently, edges) and the output is a clique with maximum total weight. The maximum clique problem is the special case in which all weights are equal.[12] As well as the problem of optimizing the sum of weights, other more complicated bicriterion optimization problems have also been studied.[13]
  • In the maximal clique listing problem, the input is an undirected graph, and the output is a list of all its maximal cliques. The maximum clique problem may be solved using as a subroutine an algorithm for the maximal clique listing problem, because the maximum clique must be included among all the maximal cliques.[14]
  • In the k-clique problem, the input is an undirected graph and a number k, and the output is a clique of size k if one exists (or, sometimes, all cliques of size k).[15]
  • In the clique decision problem, the input is an undirected graph and a number k, and the output is a Boolean value: true if the graph contains a k-clique, and false otherwise.[16]

The first four of these problems are all important in practical applications; the clique decision problem is not, but is necessary in order to apply the theory of NP-completeness to clique-finding problems.[16]

The clique problem and the independent set problem are complementary: a clique in G is an independent set in the complement graph of G and vice versa.[17] Therefore, many computational results may be applied equally well to either problem, and some research papers do not clearly distinguish between the two problems. However, the two problems have different properties when applied to restricted families of graphs; for instance, the clique problem may be solved in polynomial time for planar graphs[18] while the independent set problem remains NP-hard on planar graphs.[19]
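
The complementarity can be checked directly on a small example. The sketch below assumes a hypothetical four-vertex path graph and uses illustrative helper names; it confirms that a vertex subset is an independent set in G exactly when it is a clique in the complement of G.

```python
from itertools import combinations

def complement(vertices, edges):
    """Return the edge set of the complement graph on the same vertices."""
    present = {frozenset(e) for e in edges}
    return {frozenset(p) for p in combinations(vertices, 2)} - present

def is_clique(edge_set, subset):
    return all(frozenset(p) in edge_set for p in combinations(subset, 2))

def is_independent_set(edge_set, subset):
    return not any(frozenset(p) in edge_set for p in combinations(subset, 2))

# Hypothetical example: the path 1-2-3-4.
V = [1, 2, 3, 4]
E = {frozenset(e) for e in [(1, 2), (2, 3), (3, 4)]}
E_bar = complement(V, E)

# Every subset is independent in G exactly when it is a clique in the complement.
for size in range(len(V) + 1):
    for S in combinations(V, size):
        assert is_independent_set(E, S) == is_clique(E_bar, S)
```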


Maximal versus maximum

A maximal clique, sometimes called inclusion-maximal, is a clique that is not included in a larger clique. Therefore, every clique is contained in a maximal clique.[20] Maximal cliques can be very small. A graph may contain a non-maximal clique with many vertices and a separate clique of size 2 which is maximal. While a maximum (i.e., largest) clique is necessarily maximal, the converse does not hold. There are some types of graphs in which every maximal clique is maximum (the complements of well-covered graphs, notably including complete graphs, triangle-free graphs without isolated vertices, complete multipartite graphs, and k-trees) but other graphs have maximal cliques that are not maximum.

Finding a single maximal clique is straightforward: Starting with an arbitrary clique (for instance, a single vertex), grow the current clique one vertex at a time by iterating over the graph’s remaining vertices, adding a vertex if it is connected to each vertex in the current clique, and discarding it otherwise. This algorithm runs in linear time.[21] Because of the ease of finding maximal cliques, and their potential small size, more attention has been given to the much harder algorithmic problem of finding a maximum or otherwise large clique than has been given to the problem of finding a single maximal clique. However, some research in parallel algorithms has studied the problem of finding a maximal clique; in particular, the problem of finding the lexicographically first maximal clique (the one found by the greedy algorithm described above) has been shown to be complete for the class of polynomial-time functions. This result implies that the problem is unlikely to be solvable within the parallel complexity class NC.[22]
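
A minimal sketch of this greedy procedure, assuming the graph is given as a dictionary from each vertex to the set of its neighbors (a representation chosen here for illustration). It favors brevity and is not tuned to meet the linear-time bound cited above.

```python
def greedy_maximal_clique(adj, order=None):
    """Grow a clique in one pass over the vertices; `adj` maps vertex -> neighbor set."""
    clique = []
    for v in (order if order is not None else sorted(adj)):
        # Keep v only if it is adjacent to every vertex chosen so far.
        if all(v in adj[u] for u in clique):
            clique.append(v)
    return clique

adj = {1: {2, 5}, 2: {1, 3, 5}, 3: {2, 4}, 4: {3, 5, 6}, 5: {1, 2, 4}, 6: {4}}
print(greedy_maximal_clique(adj))                       # [1, 2, 5]
print(greedy_maximal_clique(adj, [3, 4, 1, 2, 5, 6]))   # [3, 4]
```

With the default sorted order the result is the lexicographically first maximal clique; a different scan order can yield a smaller maximal clique, as the second call shows.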

Cliques of fixed size

A brute force algorithm to test whether a graph G contains a k-vertex clique, and to find any such clique that it contains, is to examine each subgraph with k vertices and check to see whether it forms a clique. This algorithm takes time O(n^k k^2): there are O(n^k) subgraphs to check, each of which has O(k^2) edges whose presence in G needs to be checked. Thus, the problem may be solved in polynomial time whenever k is a fixed constant. When k is part of the input to the problem, however, the time is exponential.[23]
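
The brute force test can be sketched in a few lines of Python. The example graph is the complement of the 7-vertex path mentioned in the opening caption, with vertices numbered 0 to 6 as an assumption; the function name is illustrative.

```python
from itertools import combinations

def find_k_clique(adj, k):
    """Return some k-vertex clique as a tuple, or None if none exists."""
    for subset in combinations(sorted(adj), k):
        # Check all pairs inside the candidate subset.
        if all(v in adj[u] for u, v in combinations(subset, 2)):
            return subset
    return None

# Complement of the path 0-1-...-6: i and j are adjacent iff |i - j| >= 2.
adj = {i: {j for j in range(7) if abs(i - j) >= 2} for i in range(7)}
print(find_k_clique(adj, 4))   # (0, 2, 4, 6)
print(find_k_clique(adj, 5))   # None
```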

The simplest nontrivial case of the clique-finding problem is finding a triangle in a graph, or equivalently determining whether the graph is triangle-free. In a graph with m edges, there may be at most Θ(m^{3/2}) triangles; the worst case occurs when G is itself a clique. Therefore, algorithms for listing all triangles must take at least Ω(m^{3/2}) time in the worst case, and algorithms are known that match this time bound.[24] For instance, Chiba & Nishizeki (1985) describe an algorithm that sorts the vertices in order from highest degree to lowest and then iterates through each vertex v in the sorted list, looking for triangles that include v and do not include any previous vertex in the list. To do so the algorithm marks all neighbors of v, searches through all edges incident to a neighbor of v outputting a triangle for every edge that has two marked endpoints, and then removes the marks and deletes v from the graph. As the authors show, the time for this algorithm is proportional to the arboricity of the graph (a(G)) times the number of edges, which is O(m a(G)). Since the arboricity is at most O(m^{1/2}), this algorithm runs in time O(m^{3/2}). More generally, all k-vertex cliques can be listed by a similar algorithm that takes time proportional to the number of edges times the (k − 2)nd power of the arboricity. For graphs of constant arboricity, such as planar graphs (or in general graphs from any non-trivial minor-closed graph family), this algorithm takes O(m) time, which is optimal since it is linear in the size of the input.[15]
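
The mark-and-delete procedure described above can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes a dictionary-of-sets graph with comparable vertex labels, and it uses set membership in place of explicit marks.

```python
def list_triangles(adj):
    """List each triangle once, processing vertices from highest degree to lowest."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}   # work on a copy
    triangles = []
    for v in sorted(adj, key=lambda x: len(adj[x]), reverse=True):
        marked = adj[v]                                # the neighbors of v act as marks
        for u in marked:
            for w in adj[u]:
                # Edge (u, w) has two marked endpoints; u < w reports it only once.
                if w in marked and u < w:
                    triangles.append((v, u, w))
        # Delete v so that later vertices cannot rediscover its triangles.
        for u in marked:
            adj[u].discard(v)
        del adj[v]
    return triangles

adj = {1: {2, 5}, 2: {1, 3, 5}, 3: {2, 4}, 4: {3, 5, 6}, 5: {1, 2, 4}, 6: {4}}
print([tuple(sorted(t)) for t in list_triangles(adj)])   # [(1, 2, 5)]
```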

If one desires only a single triangle, or an assurance that the graph is triangle-free, faster algorithms are possible. As Itai & Rodeh (1978) observe, the graph contains a triangle if and only if its adjacency matrix and the square of the adjacency matrix contain nonzero entries in the same cell; therefore, fast matrix multiplication techniques such as the Coppersmith–Winograd algorithm can be applied to find triangles in time O(n^{2.376}), which may be faster than O(m^{3/2}) for sufficiently dense graphs. Alon, Yuster & Zwick (1994) have improved the O(m^{3/2}) algorithm for finding triangles to O(m^{1.41}) by using fast matrix multiplication. This idea of using fast matrix multiplication to find triangles has also been extended to problems of finding k-cliques for larger values of k.[25]
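
The matrix criterion can be illustrated with a short sketch. It uses schoolbook cubic-time multiplication purely for clarity, so it does not achieve the fast matrix multiplication bounds quoted above; the function name and example matrices are assumptions for illustration.

```python
def has_triangle(A):
    """A is a symmetric 0/1 adjacency matrix (list of lists) with a zero diagonal."""
    n = len(A)
    # A2[i][j] counts the two-edge walks from i to j.
    A2 = [[sum(A[i][k] * A[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    # A triangle exists iff A and A2 are both nonzero in some cell (i, j).
    return any(A[i][j] and A2[i][j] for i in range(n) for j in range(n))

triangle = [[0, 1, 1],
            [1, 0, 1],
            [1, 1, 0]]
square = [[0, 1, 0, 1],      # the 4-cycle 0-1-2-3-0
          [1, 0, 1, 0],
          [0, 1, 0, 1],
          [1, 0, 1, 0]]
print(has_triangle(triangle), has_triangle(square))   # True False
```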

Listing all maximal cliques

By a result of Moon & Moser (1965), any n-vertex graph has at most 3^{n/3} maximal cliques. The Bron–Kerbosch algorithm is a recursive backtracking procedure of Bron & Kerbosch (1973) that augments a candidate clique by considering one vertex at a time, either adding it to the candidate clique or to a set of excluded vertices that cannot be in the clique but must have some non-neighbor in the eventual clique; variants of this algorithm can be shown to have worst-case running time O(3^{n/3}).[26] Therefore, this provides a worst-case-optimal solution to the problem of listing all maximal cliques; further, the Bron–Kerbosch algorithm has been widely reported as being faster in practice than its alternatives.[27]
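
A compact sketch of a pivoting variant of the Bron–Kerbosch recursion, written as a Python generator. The sets R, P, and X (current clique, candidates, excluded vertices) follow the usual textbook naming rather than anything fixed by this article, and the pivot rule shown is one common choice among several.

```python
def bron_kerbosch(adj):
    """Yield every maximal clique of the graph as a frozenset."""
    def expand(R, P, X):
        if not P and not X:
            yield frozenset(R)       # nothing can extend R, so R is maximal
            return
        # Pivot on a vertex with the most neighbors in P to reduce branching.
        pivot = max(P | X, key=lambda u: len(P & adj[u]))
        for v in list(P - adj[pivot]):
            yield from expand(R | {v}, P & adj[v], X & adj[v])
            P = P - {v}              # v has been fully explored
            X = X | {v}              # exclude v from later cliques at this level
    yield from expand(set(), set(adj), set())

adj = {1: {2, 5}, 2: {1, 3, 5}, 3: {2, 4}, 4: {3, 5, 6}, 5: {1, 2, 4}, 6: {4}}
for clique in sorted(bron_kerbosch(adj), key=sorted):
    print(sorted(clique))
```

On this example graph the five maximal cliques are {1,2,5}, {2,3}, {3,4}, {4,5}, and {4,6}.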

As Tsukiyama et al. (1977) showed, it is also possible to list all maximal cliques in a graph in an amount of time that is polynomial per generated clique. An algorithm such as theirs in which the running time depends on the output size is known as an output-sensitive algorithm. Their algorithm is based on the following two observations, relating the maximal cliques of the given graph G to the maximal cliques of a graph G \ v formed by removing an arbitrary vertex v from G:

  • For every maximal clique C of G \ v, either C continues to form a maximal clique in G, or C ∪ {v} forms a maximal clique in G. Therefore, G has at least as many maximal cliques as G \ v does.
  • Each maximal clique in G that does not contain v is a maximal clique in G \ v, and each maximal clique in G that does contain v can be formed from a maximal clique C in G \ v by adding v and removing the non-neighbors of v from C.

Using these observations they can generate all maximal cliques in G by a recursive algorithm that, for each maximal clique C in G \ v, outputs C and the clique formed by adding v to C and removing the non-neighbors of v. However, some cliques of G may be generated in this way from more than one parent clique of G \ v, so they eliminate duplicates by outputting a clique in G only when its parent in G \ v is lexicographically maximum among all possible parent cliques. On the basis of this principle, they show that all maximal cliques in G may be generated in time O(mn) per clique, where m is the number of edges in G and n is the number of vertices; Chiba & Nishizeki (1985) improve this to O(ma) per clique, where a is the arboricity of the given graph. Makino & Uno (2004) provide an alternative output-sensitive algorithm based on fast matrix multiplication, and Johnson & Yannakakis (1988) show that it is even possible to list all maximal cliques in lexicographic order with polynomial delay per clique, although the reverse of this order is NP-hard to generate.

On the basis of this result, it is possible to list all maximal cliques in polynomial time, for families of graphs in which the number of cliques is polynomially bounded. These families include chordal graphs, complete graphs, triangle-free graphs, interval graphs, graphs of bounded boxicity, and planar graphs.[28] In particular, the planar graphs, and more generally, any family of graphs that is both sparse (having a number of edges at most a constant times the number of vertices) and closed under the operation of taking subgraphs, have O(n) cliques, of at most constant size, that can be listed in linear time.[15][29]

Finding maximum cliques in arbitrary graphs

It is possible to find the maximum clique, or the clique number, of an arbitrary n-vertex graph in time O(3^{n/3}) = O(1.4422^n) by using one of the algorithms described above to list all maximal cliques in the graph and returning the largest one. However, for this variant of the clique problem better worst-case time bounds are possible. The algorithm of Tarjan & Trojanowski (1977) solves this problem in time O(2^{n/3}) = O(1.2599^n); it is a recursive backtracking scheme similar to that of the Bron–Kerbosch algorithm, but is able to eliminate some recursive calls when it can be shown that some other combination of vertices not used in the call is guaranteed to lead to a solution at least as good. Jian (1986) improved this to O(2^{0.304n}) = O(1.2346^n). Robson (1986) improved this to O(2^{0.276n}) = O(1.2108^n) time, at the expense of greater space usage, by a similar backtracking scheme with a more complicated case analysis, together with a dynamic programming technique in which the optimal solution is precomputed for all small connected subgraphs of the complement graph and these partial solutions are used to shortcut the backtracking recursion. The fastest algorithm known today is due to Robson (2001), which runs in time O(2^{0.249n}) = O(1.1888^n).[30]
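
For illustration only, here is a basic backtracking search with a simple size bound. It is not the algorithm of Tarjan & Trojanowski, Jian, or Robson and carries none of their running-time guarantees; it merely shows the general idea of cutting off recursive calls that cannot improve on the best clique found so far.

```python
def maximum_clique(adj):
    """Return one maximum clique as a list, by backtracking with a size bound."""
    best = []

    def search(clique, candidates):
        nonlocal best
        if len(clique) > len(best):
            best = clique[:]
        # Prune: even taking every candidate cannot beat the best clique found.
        if len(clique) + len(candidates) <= len(best):
            return
        for v in sorted(candidates):
            candidates = candidates - {v}      # v is not reconsidered later
            search(clique + [v], candidates & adj[v])

    search([], set(adj))
    return best

adj = {1: {2, 5}, 2: {1, 3, 5}, 3: {2, 4}, 4: {3, 5, 6}, 5: {1, 2, 4}, 6: {4}}
print(maximum_clique(adj))   # [1, 2, 5]
```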

There has also been extensive research on heuristic algorithms for solving maximum clique problems without worst-case runtime guarantees, based on methods including branch and bound,[31] local search,[32] greedy algorithms,[33] and constraint programming.[34] Non-standard computing methodologies for finding cliques include DNA computing[35] and adiabatic quantum computation.[36] The maximum clique problem was the subject of an implementation challenge sponsored by DIMACS in 1992–1993,[37] and a collection of graphs used as benchmarks for the challenge is publicly available.[38]

Special classes of graphs

In this permutation graph, the maximum cliques correspond to the longest decreasing subsequences (4,3,1) and (4,3,2) of the defining permutation.

Planar graphs, and other families of sparse graphs, have been discussed above: they have linearly many maximal cliques, of bounded size, that can be listed in linear time.[15] In particular, for planar graphs, any clique can have at most four vertices, by Kuratowski's theorem.[18]

Perfect graphs are defined by the properties that their clique number equals their chromatic number, and that this equality holds also in each of their induced subgraphs. For perfect graphs, it is possible to find a maximum clique in polynomial time, using an algorithm based on semidefinite programming.[39] However, this method is complex and non-combinatorial, and specialized clique-finding algorithms have been developed for many subclasses of perfect graphs.[40] In the complement graphs of bipartite graphs, König's theorem allows the maximum clique problem to be solved using techniques for matching. In another class of perfect graphs, the permutation graphs, a maximum clique is a longest decreasing subsequence of the permutation defining the graph and can be found using known algorithms for the longest decreasing subsequence problem.[41] In chordal graphs, the maximal cliques are a subset of the n cliques formed as part of an elimination ordering.[42]
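
The permutation graph case can be sketched with a standard longest-decreasing-subsequence routine. The permutation (4, 3, 5, 1, 2) is an assumption chosen to be consistent with the subsequences (4,3,1) and (4,3,2) named in the caption above; the function name is illustrative.

```python
from bisect import bisect_left

def longest_decreasing_subsequence(perm):
    """Return one longest strictly decreasing subsequence of `perm`."""
    tails = []       # tails[j]: negated last value of the best chain of length j + 1
    tails_idx = []   # index in perm of that last value
    prev = [None] * len(perm)
    for i, x in enumerate(perm):
        # Negating values turns "decreasing" into "increasing" for bisect.
        pos = bisect_left(tails, -x)
        if pos == len(tails):
            tails.append(-x)
            tails_idx.append(i)
        else:
            tails[pos] = -x
            tails_idx[pos] = i
        prev[i] = tails_idx[pos - 1] if pos > 0 else None
    # Follow predecessor links back from the end of the longest chain.
    result = []
    i = tails_idx[-1] if tails_idx else None
    while i is not None:
        result.append(perm[i])
        i = prev[i]
    return result[::-1]

print(longest_decreasing_subsequence([4, 3, 5, 1, 2]))   # [4, 3, 2]
```

The returned values are the vertices of one maximum clique of the corresponding permutation graph; the routine runs in O(n log n) time.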

In some cases, these algorithms can be extended to other, non-perfect, classes of graphs as well: for instance, in a circle graph, the neighborhood of each vertex is a permutation graph, so a maximum clique in a circle graph can be found by applying the permutation graph algorithm to each neighborhood.[43] Similarly, in a unit disk graph (with a known geometric representation), there is a polynomial time algorithm for maximum cliques based on applying the algorithm for complements of bipartite graphs to shared neighborhoods of pairs of vertices.[44]

The algorithmic problem of finding a maximum clique in a random graph drawn from the Erdős–Rényi model (in which each edge appears with probability 1/2, independently from the other edges) was suggested by Karp (1976). A maximum clique in a random graph can be computed in expected 2^{O(log_2^2 n)} time, a quasi-polynomial time bound.[45] Although the clique number of such graphs is very close to 2 log_2 n, simple greedy algorithms as well as more sophisticated randomized approximation techniques only find cliques with size log_2 n, and the number of maximal cliques in such graphs is with high probability exponential in log^2 n, preventing a polynomial time solution that lists all of them.[46] Because of the difficulty of this problem, several authors have investigated the planted clique problem, the clique problem on random graphs that have been augmented by adding large cliques.[47] While spectral methods[48] and semidefinite programming[49] can detect hidden cliques of size Ω(√n), no polynomial-time algorithms are currently known to detect those of size o(√n).[50]

Approximation algorithms

Several authors have considered approximation algorithms that attempt to find a clique or independent set that, although not maximum, has size as close to the maximum as can be found in polynomial time. Although much of this work has focused on independent sets in sparse graphs, a case that does not make sense for the complementary clique problem, there has also been work on approximation algorithms that do not use such sparsity assumptions.[51]

Feige (2004) describes a polynomial time algorithm that finds a clique of size Ω((log n / log log n)^2) in any graph that has clique number Ω(n / log^k n) for any constant k. By using this algorithm when the clique number of a given input graph is between n / log n and n / log^3 n, switching to a different algorithm of Boppana & Halldórsson (1992) for graphs with higher clique numbers, and choosing a two-vertex clique if both algorithms fail to find anything, Feige provides an approximation algorithm that finds a clique with a number of vertices within a factor of O(n (log log n)^2 / log^3 n) of the maximum. Although the approximation ratio of this algorithm is weak, it is the best known to date, and the results on hardness of approximation described below suggest that there can be no approximation algorithm with an approximation ratio significantly less than linear.

Lower bounds


The 3-CNF Satisfiability instance (x ∨ x ∨ y) ∧ (~x ∨ ~y ∨ ~y) ∧ (~x ∨ y ∨ y) reduced to Clique. The green vertices form a 3-clique and correspond to a satisfying assignment.[52]

The clique decision problem is NP-complete. It was one of Richard Karp's original 21 problems shown NP-complete in his 1972 paper "Reducibility Among Combinatorial Problems".[53] This problem was also mentioned in Stephen Cook's paper introducing the theory of NP-complete problems.[54] Thus, the problem of finding a maximum clique is NP-hard: if one could solve it, one could also solve the decision problem, by comparing the size of the maximum clique to the size parameter given as input in the decision problem.

Karp's NP-completeness proof is a many-one reduction from the Boolean satisfiability problem for formulas in conjunctive normal form, which was proved NP-complete in the Cook–Levin theorem.[55] From a given CNF formula, Karp forms a graph that has a vertex for every pair (v,c), where v is a variable or its negation and c is a clause in the formula that contains v. Vertices are connected by an edge if they represent compatible variable assignments for different clauses: that is, there is an edge from (v,c) to (u,d) whenever c ≠ d and u and v are not each other's negations. If k denotes the number of clauses in the CNF formula, then the k-vertex cliques in this graph represent ways of assigning truth values to some of its variables in order to satisfy the formula; therefore, the formula is satisfiable if and only if a k-vertex clique exists.[53]
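
The reduction can be tried out on the instance from the figure caption in this section. The sketch below is a hedged illustration: it encodes literals as strings with a leading "~" for negation (an assumed convention), creates one vertex per distinct (literal, clause) pair as in the description above, and tests for a k-clique by brute force.

```python
from itertools import combinations

def negate(lit):
    """Return the complementary literal, e.g. 'x' <-> '~x'."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def cnf_to_clique_graph(clauses):
    """Build the graph with one vertex per distinct (literal, clause index) pair."""
    vertices = [(lit, c) for c, clause in enumerate(clauses) for lit in set(clause)]
    edges = {
        frozenset((a, b))
        for a, b in combinations(vertices, 2)
        # Connect literals from different clauses that do not contradict each other.
        if a[1] != b[1] and a[0] != negate(b[0])
    }
    return vertices, edges

def has_k_clique(vertices, edges, k):
    return any(
        all(frozenset(p) in edges for p in combinations(S, 2))
        for S in combinations(vertices, k)
    )

# (x or x or y) and (~x or ~y or ~y) and (~x or y or y)
formula = [["x", "x", "y"], ["~x", "~y", "~y"], ["~x", "y", "y"]]
V, E = cnf_to_clique_graph(formula)
print(has_k_clique(V, E, len(formula)))   # True: e.g. x = false, y = true
```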

Some NP-complete problems (such as the travelling salesman problem in planar graphs) may be solved in time that is exponential in a sublinear function of the input size parameter n.[56] However, it is unlikely that such bounds exist for the clique problem in arbitrary graphs, as they would imply similarly subexponential bounds for many other standard NP-complete problems.[57]

Circuit complexity

A monotone circuit to detect a k-clique in an n-vertex graph for k = 3 and n = 4. Each of the 6 inputs encodes the presence or absence of a particular (red) edge in the input graph. The circuit uses one internal and-gate to detect each potential k-clique.

The computational difficulty of the clique problem has led it to be used to prove several lower bounds in circuit complexity. Because the existence of a clique of a given size is a monotone graph property (if a clique exists in a given graph, it will exist in any supergraph) there must exist a monotone circuit, using only AND gates and OR gates, to solve the clique decision problem for a given fixed clique size. However, the size of these circuits can be proven to be a superpolynomial function of the number of vertices and the clique size, exponential in the cube root of the number of vertices.[58] Even if a small number of NOT gates are allowed, the complexity remains superpolynomial.[59] Additionally, the depth of a monotone circuit for the clique problem using gates of bounded fan-in must be at least a polynomial in the clique size.[60]
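
The straightforward monotone circuit from the caption above (one AND gate per potential k-clique, feeding a single OR gate) can be modeled in a few lines. The representation of gates as lists of edge inputs and the example input graph are assumptions made for this sketch.

```python
from itertools import combinations

def clique_circuit(n, k):
    """One AND gate per k-subset of vertices; each gate reads that subset's edges."""
    return [list(combinations(S, 2)) for S in combinations(range(n), k)]

def evaluate(circuit, edge_present):
    """OR of all AND gates; `edge_present` maps each pair (u, v), u < v, to a bool."""
    return any(all(edge_present[e] for e in gate) for gate in circuit)

circuit = clique_circuit(4, 3)                     # 4 AND gates over 6 edge inputs
inputs = {e: False for e in combinations(range(4), 2)}
for e in [(0, 1), (1, 2), (0, 2), (2, 3)]:         # hypothetical input graph
    inputs[e] = True
print(len(circuit), evaluate(circuit, inputs))     # 4 True
```

This circuit has C(n, k) AND gates, so it grows quickly with n and k; the lower bounds cited above show that no monotone circuit can avoid superpolynomial size.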

Decision tree complexity

A simple decision tree to detect the presence of a 3-clique in a 4-vertex graph. It uses up to 6 questions of the form “Does the red edge exist?”, matching the optimal bound n(n − 1)/2.

The (deterministic) decision tree complexity of determining a graph property is the number of questions of the form "Is there an edge between vertex u and vertex v?" that have to be answered in the worst case to determine whether a graph has a particular property. That is, it is the minimum height of a boolean decision tree for the problem. Since there are at most n(n − 1)/2 possible questions to be asked, any graph property can be determined with n(n − 1)/2 questions. It is also possible to define random and quantum decision tree complexity of a property, the expected number of questions (for a worst case input) that a randomized or quantum algorithm needs to have answered in order to correctly determine whether the given graph has the property.[61]

Because the property of containing a clique is a monotone property (adding an edge can only cause more cliques to exist within the graph, not fewer), it is covered by the Aanderaa–Karp–Rosenberg conjecture, which states that the deterministic decision tree complexity of determining any non-trivial monotone graph property is exactly n(n − 1)/2. For deterministic decision trees, the property of containing a k-clique (2 ≤ k ≤ n) was shown to have decision tree complexity exactly n(n − 1)/2 by Bollobás (1976). Deterministic decision trees also require exponential size to detect cliques, or large polynomial size to detect cliques of bounded size.[62]

The Aanderaa–Karp–Rosenberg conjecture also states that the randomized decision tree complexity of non-trivial monotone functions is Θ(n^2). The conjecture is resolved for the property of containing a k-clique (2 ≤ k ≤ n), since it is known to have randomized decision tree complexity Θ(n^2).[63] For quantum decision trees, the best known lower bound is Ω(n), but no matching algorithm is known for the case of k ≥ 3.[64]

Fixed-parameter intractability

Parameterized complexity is the complexity-theoretic study of problems that are naturally equipped with a small integer parameter k, and for which the problem becomes more difficult as k increases, such as finding k-cliques in graphs. A problem is said to be fixed-parameter tractable if there is an algorithm for solving it on inputs of size n in time f(k) n^{O(1)}; that is, if it can be solved in polynomial time for any fixed value of k and moreover if the exponent of the polynomial does not depend on k.[65]

For the clique problem, the brute force search algorithm has running time O(n^k k^2), and although it can be improved by fast matrix multiplication the running time still has an exponent that is linear in k. Thus, although the running time of known algorithms for the clique problem is polynomial for any fixed k, these algorithms do not suffice for fixed-parameter tractability. Downey & Fellows (1995) defined a hierarchy of parametrized problems, the W hierarchy, that they conjectured did not have fixed-parameter tractable algorithms; they proved that independent set (or, equivalently, clique) is hard for the first level of this hierarchy, W[1]. Thus, according to their conjecture, clique is not fixed-parameter tractable. Moreover, this result provides the basis for proofs of W[1]-hardness of many other problems, and thus serves as an analogue of the Cook–Levin theorem for parameterized complexity.[66]

Chen et al. (2006) showed that the clique problem cannot be solved in time n^{o(k)} unless the exponential time hypothesis fails.[67]

Although the problems of listing maximal cliques or finding maximum cliques are unlikely to be fixed-parameter tractable with the parameter k, they may be fixed-parameter tractable for other parameters of instance complexity. For instance, both problems are known to be fixed-parameter tractable when parametrized by the degeneracy of the input graph.[29]
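
For readers unfamiliar with the parameter, the degeneracy of a graph is the smallest d such that every subgraph has a vertex of degree at most d. A minimal sketch of computing it by repeatedly removing a minimum-degree vertex follows; this simple version takes quadratic time and is not the method of the cited paper.

```python
def degeneracy_ordering(adj):
    """Return (degeneracy, ordering) by repeatedly removing a minimum-degree vertex."""
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}   # work on a copy
    order, degeneracy = [], 0
    while remaining:
        v = min(remaining, key=lambda x: len(remaining[x]))
        # The largest degree seen at removal time is the degeneracy.
        degeneracy = max(degeneracy, len(remaining[v]))
        for u in remaining[v]:
            remaining[u].discard(v)
        del remaining[v]
        order.append(v)
    return degeneracy, order

adj = {1: {2, 5}, 2: {1, 3, 5}, 3: {2, 4}, 4: {3, 5, 6}, 5: {1, 2, 4}, 6: {4}}
print(degeneracy_ordering(adj)[0])   # 2
```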

Hardness of approximation

A graph of compatibility relations among 2-bit samples of 3-bit proof strings. Each maximal clique (triangle) in this graph represents all ways of sampling a single 3-bit string. The proof of inapproximability of the clique problem involves induced subgraphs of analogously defined graphs for larger numbers of bits.

The computational complexity of approximating the clique problem has been studied for a long time; for instance, Garey & Johnson (1978) observed that, because of the fact that the clique number takes on small integer values and is NP-hard to compute, it cannot have a fully polynomial-time approximation scheme. However, little more was known until the early 1990s, when several authors began to make connections between the approximation of maximum cliques and probabilistically checkable proofs, and used these connections to prove hardness of approximation results for the maximum clique problem.[4][68] After many improvements to these results it is now known that, unless P = NP, there can be no polynomial time algorithm that approximates the maximum clique to within a factor better than O(n^{1 − ε}), for any ε > 0.[69]

The rough idea of these inapproximability results is to form a graph that represents a probabilistically checkable proof system for an NP-complete problem such as Satisfiability. A proof system of this type is defined by a family of proof strings (sequences of bits) and proof checkers: algorithms that, after a polynomial amount of computation over a given Satisfiability instance, examine a small number of randomly chosen bits of the proof string and on the basis of that examination either declare it to be a valid proof or declare it to be invalid. False negatives are not allowed: a valid proof must always be declared to be valid, but an invalid proof may be declared to be valid as long as the probability that a checker makes a mistake of this type is low. To transform a probabilistically checkable proof system into a clique problem, one forms a graph in which the vertices represent all the possible ways that a proof checker could read a sequence of proof string bits and end up accepting the proof. Two vertices are connected by an edge whenever the two proof checker runs that they describe agree on the values of the proof string bits that they both examine. The maximal cliques in this graph consist of the accepting proof checker runs for a single proof string, and one of these cliques is large if and only if there exists a proof string that many proof checkers accept. If the original Satisfiability instance is satisfiable, there will be a large clique defined by a valid proof string for that instance, but if the original instance is not satisfiable, then all proof strings are invalid, any proof string has only a small number of checkers that mistakenly accept it, and all cliques are small. 
Therefore, if one could distinguish in polynomial time between graphs that have large cliques and graphs in which all cliques are small, or accurately approximate the clique problem, then applying this ability to the graphs generated from Satisfiability instances would allow satisfiable instances to be distinguished from unsatisfiable instances. However, this is not possible unless P = NP.[70]


  1. ^ a b c d e For surveys of these algorithms, and basic definitions used in this article, see Bomze et al. (1999) and Gutin (2004).
  2. ^ For more details and references, see clique (graph theory).
  3. ^ Complete subgraphs make an early appearance in the mathematical literature in the graph-theoretic reformulation of Ramsey theory by Erdős & Szekeres (1935).
  4. ^ a b Kolata, Gina (June 26, 1990), "In a Frenzy, Math Enters Age of Electronic Mail", New York Times.
  5. ^ Hamzaoglu & Patel (1998).
  6. ^ Rhodes et al. (2003).
  7. ^ Kuhl, Crippen & Friesen (1983).
  8. ^ Day & Sankoff (1986).
  9. ^ Samudrala & Moult (1998).
  10. ^ Frank & Strauss (1986).
  11. ^ a b Valiente (2002); Pelillo (2009).
  12. ^ Pelillo (2009).
  13. ^ Sethuraman & Butenko (2015).
  14. ^ Valiente (2002).
  15. ^ a b c d Chiba & Nishizeki (1985).
  16. ^ a b Cormen et al. (2001).
  17. ^ Cormen et al. (2001), Exercise 34-1, p. 1018.
  18. ^ a b Papadimitriou & Yannakakis (1981); Chiba & Nishizeki (1985).
  19. ^ Garey, Johnson & Stockmeyer (1976).
  20. ^ See, e.g., Frank & Strauss (1986).
  21. ^ Skiena (2009), p. 526.
  22. ^ Cook (1985).
  23. ^ E.g., see Downey & Fellows (1995).
  24. ^ Itai & Rodeh (1978) provide an algorithm with O(m^{3/2}) running time that finds a triangle if one exists but does not list all triangles; Chiba & Nishizeki (1985) list all triangles in time O(m^{3/2}).
  25. ^ Eisenbrand & Grandoni (2004); Kloks, Kratsch & Müller (2000); Nešetřil & Poljak (1985); Vassilevska & Williams (2009); Yuster (2006).
  26. ^ Tomita, Tanaka & Takahashi (2006).
  27. ^ Cazals & Karande (2008); Eppstein, Löffler & Strash (2013).
  28. ^ Rosgen & Stewart (2007).
  29. ^ a b Eppstein, Löffler & Strash (2013).
  30. ^ Robson (2001).
  31. ^ Balas & Yu (1986); Carraghan & Pardalos (1990); Pardalos & Rogers (1992); Östergård (2002); Fahle (2002); Tomita & Seki (2003); Tomita & Kameda (2007); Konc & Janežič (2007).
  32. ^ Battiti & Protasi (2001); Katayama, Hamamoto & Narihisa (2005).
  33. ^ Abello, Pardalos & Resende (1999); Grosso, Locatelli & Della Croce (2004).
  34. ^ Régin (2003).
  35. ^ Ouyang et al. (1997). Although the title refers to maximal cliques, the problem this paper solves is actually the maximum clique problem.
  36. ^ Childs et al. (2002).
  37. ^ Johnson & Trick (1996).
  38. ^ DIMACS challenge graphs for the clique problem, accessed 2009-12-17.
  39. ^ Grötschel, Lovász & Schrijver (1988).
  40. ^ Golumbic (1980).
  41. ^ Golumbic (1980), p. 159. Even, Pnueli & Lempel (1972) provide an alternative quadratic-time algorithm for maximum cliques in comparability graphs, a broader class of perfect graphs that includes the permutation graphs as a special case.
  42. ^ Blair & Peyton (1993), Lemma 4.5, p. 19.
  43. ^ Gavril (1973); Golumbic (1980), p. 247.
  44. ^ Clark, Colbourn & Johnson (1990).
  45. ^ Song (2015).
  46. ^ Jerrum (1992).
  47. ^ Arora & Barak (2009), Example 18.2, pp. 362–363.
  48. ^ Alon, Krivelevich & Sudakov (1998).
  49. ^ Feige & Krauthgamer (2000).
  50. ^ Meka, Potechin & Wigderson (2015).
  51. ^ Boppana & Halldórsson (1992); Feige (2004); Halldórsson (2000).
  52. ^ Adapted from Sipser (1996).
  53. ^ a b Karp (1972).
  54. ^ Cook (1971).
  55. ^ Cook (1971) gives essentially the same reduction, from 3-SAT instead of Satisfiability, to show that subgraph isomorphism is NP-complete.
  56. ^ Lipton & Tarjan (1980).
  57. ^ Impagliazzo, Paturi & Zane (2001).
  58. ^ Alon & Boppana (1987). For earlier and weaker bounds on monotone circuits for the clique problem, see Valiant (1983) and Razborov (1985).
  59. ^ Amano & Maruoka (1998).
  60. ^ Goldmann & Håstad (1992) used communication complexity to prove this result.
  61. ^ See Arora & Barak (2009), Chapter 12, "Decision trees", pp. 259–269.
  62. ^ Wegener (1988).
  63. ^ For instance, this follows from Gröger (1992).
  64. ^ Childs & Eisenberg (2005); Magniez, Santha & Szegedy (2007).
  65. ^ Downey & Fellows (1999).
  66. ^ Downey & Fellows (1995).
  67. ^ Chen et al. (2006).
  68. ^ Feige et al. (1991); Arora & Safra (1998); Arora et al. (1998).
  69. ^ Håstad (1999) showed inapproximability for this ratio using a stronger complexity theoretic assumption, the inequality of NP and ZPP; Khot (2001) described more precisely the inapproximation ratio, and Zuckerman (2006) derandomized the construction weakening its assumption to P ≠ NP.
  70. ^ This reduction is originally due to Feige et al. (1991) and used in all subsequent inapproximability proofs; the proofs differ in the strengths and details of the probabilistically checkable proof systems that they rely on.