# Averaging argument

In computational complexity theory and cryptography, the averaging argument is a standard argument for proving theorems. It typically allows one to convert probabilistic polynomial-time algorithms into non-uniform polynomial-size circuits.

## Example

To build intuition, consider the following example first.

Example: If every person likes at least 1/3 of the books in a library, then there exists a book that at least 1/3 of the people like.

Proof: Suppose there are ${\displaystyle N}$ people and ${\displaystyle B}$ books. Each person likes at least ${\displaystyle B/3}$ of the books. Let each person leave a mark on every book they like. Then there will be at least ${\displaystyle M=(NB)/3}$ marks. The averaging argument claims that there exists a book with at least ${\displaystyle N/3}$ marks on it. Assume, for contradiction, that no such book exists. Then every book has fewer than ${\displaystyle N/3}$ marks. However, since there are ${\displaystyle B}$ books, the total number of marks would then be fewer than ${\displaystyle (NB)/3}$, contradicting the fact that there are at least ${\displaystyle M}$ marks. ${\displaystyle \scriptstyle \blacksquare }$
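The counting in this proof can be checked directly on a small instance. The sketch below (with arbitrarily chosen values of N and B, and randomly chosen "liked" sets — these specifics are illustrative assumptions, not part of the argument) tallies the marks and confirms that some book receives at least N/3 of them:

```python
import random

# Library example: N people, B books; each person likes exactly B//3
# books, chosen at random here for illustration.
N, B = 30, 12
likes = [set(random.sample(range(B), k=B // 3)) for _ in range(N)]

# Each person leaves a mark on every book they like.
marks = [0] * B
for person in likes:
    for book in person:
        marks[book] += 1

# Total marks is at least N*B/3, so by averaging some single book
# must carry at least N/3 marks.
assert sum(marks) >= N * (B // 3)
assert max(marks) >= N / 3
```

Because the total number of marks is at least N·B/3 spread over B books, the maximum is at least the average N/3 — the same pigeonhole step the proof makes by contradiction.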

## Formalized definition of averaging argument

Let X and Y be finite sets, let p be a predicate on X × Y, and let f be a real number in the interval [0, 1]. If for each x in X, at least f |Y| of the elements y in Y satisfy p(x, y), then there exists a y in Y such that at least f |X| of the elements x in X satisfy p(x, y).
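This set-theoretic statement can be verified exhaustively for any concrete finite X, Y, and p. In the sketch below, the sets, the fraction f, and the predicate p(x, y) = ((x + y) mod 2 = 0) are all toy choices made for illustration:

```python
# Toy instance of the formal statement: X, Y finite, p a predicate,
# f a fraction in [0, 1]. All specific values here are illustrative.
X = range(6)
Y = range(4)
f = 0.5
p = lambda x, y: (x + y) % 2 == 0

# Hypothesis: every x satisfies p(x, y) for at least f*|Y| of the y's.
assert all(sum(p(x, y) for y in Y) >= f * len(Y) for x in X)

# Conclusion: some single y satisfies p(x, y) for at least f*|X| of the x's.
good_y = [y for y in Y if sum(p(x, y) for x in X) >= f * len(X)]
assert good_y  # the averaging argument guarantees this list is nonempty
```

Counting the pairs (x, y) with p(x, y) true in two ways — first by rows (over x), then by columns (over y) — is exactly the double-counting behind the statement.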

There is an alternative formulation, stated in the terminology of probability theory.[1]

Let ${\displaystyle f}$ be some function. The averaging argument is the following claim: if we have a circuit ${\displaystyle C}$ such that ${\displaystyle C(x,y)=f(x)}$ with probability at least ${\displaystyle \rho }$, where ${\displaystyle x}$ is chosen at random and ${\displaystyle y}$ is chosen independently from some distribution ${\displaystyle Y}$ over ${\displaystyle \{0,1\}^{m}}$ (which might not even be efficiently sampleable), then there exists a single string ${\displaystyle y_{0}\in \{0,1\}^{m}}$ such that ${\displaystyle \Pr _{x}[C(x,y_{0})=f(x)]\geq \rho }$.

Indeed, for every ${\displaystyle y}$ define ${\displaystyle p_{y}}$ to be ${\displaystyle \Pr _{x}[C(x,y)=f(x)]}$. Then

${\displaystyle \Pr _{x,y}[C(x,y)=f(x)]=E_{y}[p_{y}]\,}$

and the claim reduces to the fact that for every random variable ${\displaystyle Z}$, if ${\displaystyle E[Z]\geq \rho }$ then ${\displaystyle \Pr[Z\geq \rho ]>0}$ (this holds since ${\displaystyle E[Z]}$ is a weighted average of the values of ${\displaystyle Z}$, and if the average of some values is at least ${\displaystyle \rho }$ then at least one of the values must be at least ${\displaystyle \rho }$).
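The final step — that a random variable whose expectation is at least ρ must take some value at least ρ — can be seen on any concrete distribution. The values and weights below stand in for the ${\displaystyle p_{y}}$'s and the distribution over ${\displaystyle y}$; they are arbitrary illustrative numbers:

```python
# Finite random variable Z: it takes values[i] with probability weights[i].
# (These numbers are an arbitrary toy distribution for illustration.)
values = [0.2, 0.5, 0.9, 0.7]
weights = [0.1, 0.3, 0.4, 0.2]

# E[Z] is the weighted average of the values.
expectation = sum(v * w for v, w in zip(values, weights))
rho = expectation  # any rho <= E[Z] works in the claim

# Since the weighted average is rho, at least one value reaches rho,
# i.e. Pr[Z >= rho] > 0.
assert max(values) >= rho
```

In the averaging argument, the value attaining the maximum corresponds to the fixed string ${\displaystyle y_{0}}$ whose success probability ${\displaystyle p_{y_{0}}}$ is at least ${\displaystyle \rho }$.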

## Application

This argument has wide use in complexity theory (e.g. proving ${\displaystyle {\mathsf {BPP}}\subsetneq {\mathsf {P/poly}}}$) and cryptography (e.g. proving that indistinguishable encryption results in semantic security). A plethora of such applications can be found in Goldreich's books.[2][3][4]

## References

1. ^ Barak, Boaz (March 2006). "Note on the averaging and hybrid arguments and prediction vs. distinguishing" (PDF). COS522. Princeton University.
2. ^ Oded Goldreich, Foundations of Cryptography, Volume 1: Basic Tools, Cambridge University Press, 2001, ISBN 0-521-79172-3
3. ^ Oded Goldreich, Foundations of Cryptography, Volume 2: Basic Applications, Cambridge University Press, 2004, ISBN 0-521-83084-2
4. ^ Oded Goldreich, Computational Complexity: A Conceptual Perspective, Cambridge University Press, 2008, ISBN 0-521-88473-X