# Learning with errors

Learning with errors (LWE) is a problem in machine learning that is conjectured to be hard to solve. Introduced[1] by Oded Regev in 2005, it is a generalization of the parity learning problem. Regev showed, furthermore, that the LWE problem is as hard to solve as several worst-case lattice problems. The LWE problem has recently[1][2] been used as a hardness assumption to create public-key cryptosystems, such as the ring learning with errors key exchange by Peikert.[3]

An algorithm is said to solve the LWE problem if, when given access to samples ${\displaystyle (x,y)}$ where ${\displaystyle x\in \mathbb {Z} _{q}^{n}}$ and ${\displaystyle y\in \mathbb {Z} _{q}}$, with the assurance, for some fixed linear function ${\displaystyle f:\mathbb {Z} _{q}^{n}\rightarrow \mathbb {Z} _{q},}$ that ${\displaystyle y=f(x)}$ with high probability and deviates from it according to some known noise model, the algorithm can recreate ${\displaystyle f}$ or some close approximation of it with high probability.

## Definition

Denote by ${\displaystyle \mathbb {T} =\mathbb {R} /\mathbb {Z} }$ the additive group on reals modulo one. Denote by ${\displaystyle A_{\mathbf {s} ,\phi }}$ the distribution on ${\displaystyle \mathbb {Z} _{q}^{n}\times \mathbb {T} }$ obtained by choosing a vector ${\displaystyle \mathbf {a} \in \mathbb {Z} _{q}^{n}}$ uniformly at random, choosing ${\displaystyle e}$ according to a probability distribution ${\displaystyle \phi }$ on ${\displaystyle \mathbb {T} }$ and outputting ${\displaystyle (\mathbf {a} ,\langle \mathbf {a} ,\mathbf {s} \rangle /q+e)}$ for some fixed vector ${\displaystyle \mathbf {s} \in \mathbb {Z} _{q}^{n}}$. Here ${\displaystyle \textstyle \langle \mathbf {a} ,\mathbf {s} \rangle =\sum _{i=1}^{n}a_{i}s_{i}}$ is the standard inner product ${\displaystyle \mathbb {Z} _{q}^{n}\times \mathbb {Z} _{q}^{n}\longrightarrow \mathbb {Z} _{q}}$, the division is done in the field of reals (or more formally, this "division by ${\displaystyle q}$" is notation for the group homomorphism ${\displaystyle \mathbb {Z} _{q}\longrightarrow \mathbb {T} }$ mapping ${\displaystyle 1\in \mathbb {Z} _{q}}$ to ${\displaystyle 1/q+\mathbb {Z} \in \mathbb {T} }$), and the final addition is in ${\displaystyle \mathbb {T} }$.

The learning with errors problem ${\displaystyle \mathrm {LWE} _{q,\phi }}$ is to find ${\displaystyle \mathbf {s} \in \mathbb {Z} _{q}^{n}}$, given access to polynomially many samples of choice from ${\displaystyle A_{\mathbf {s} ,\phi }}$.

For every ${\displaystyle \alpha >0}$, denote by ${\displaystyle D_{\alpha }}$ the one-dimensional Gaussian with density function ${\displaystyle D_{\alpha }(x)=\rho _{\alpha }(x)/\alpha }$ where ${\displaystyle \rho _{\alpha }(x)=e^{-\pi (|x|/\alpha )^{2}}}$, and let ${\displaystyle \Psi _{\alpha }}$ be the distribution on ${\displaystyle \mathbb {T} }$ obtained by considering ${\displaystyle D_{\alpha }}$ modulo one. The version of LWE considered in most of the results is ${\displaystyle \mathrm {LWE} _{q,\Psi _{\alpha }}}$.
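As a rough illustration of the definition, the following Python sketch draws samples from ${\displaystyle A_{\mathbf {s} ,\Psi _{\alpha }}}$. The function names `sample_psi` and `lwe_sample` are hypothetical, elements of ${\displaystyle \mathbb {T} }$ are represented by floating-point numbers in ${\displaystyle [0,1)}$ (up to rounding), and the density ${\displaystyle \rho _{\alpha }(x)/\alpha }$ is treated as a normal distribution with standard deviation ${\displaystyle \alpha /{\sqrt {2\pi }}}$.

```python
import math
import random


def sample_psi(alpha, rng):
    """Draw from Psi_alpha: the Gaussian D_alpha reduced modulo one."""
    # The density exp(-pi * (x / alpha)**2) / alpha is a normal density
    # with standard deviation alpha / sqrt(2 * pi).
    sigma = alpha / math.sqrt(2 * math.pi)
    return rng.gauss(0.0, sigma) % 1.0


def lwe_sample(s, q, alpha, rng):
    """Return one sample (a, b) from A_{s, Psi_alpha}."""
    n = len(s)
    # a is uniform in Z_q^n.
    a = [rng.randrange(q) for _ in range(n)]
    # <a, s> is computed in Z_q, then mapped to T by dividing by q.
    inner = sum(ai * si for ai, si in zip(a, s)) % q
    # The final addition is in T, i.e. modulo one.
    b = (inner / q + sample_psi(alpha, rng)) % 1.0
    return a, b
```

For example, `lwe_sample([1, 2, 3], 17, 0.01, random.Random(0))` returns a vector of three residues modulo 17 together with a noisy value of ${\displaystyle \langle \mathbf {a} ,\mathbf {s} \rangle /q}$ on the torus.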

## Decision version

The LWE problem described above is the search version of the problem. In the decision version (DLWE), the goal is to distinguish between noisy inner products and uniformly random samples from ${\displaystyle \mathbb {Z} _{q}^{n}\times \mathbb {T} }$ (practically, some discretized version of it). Regev[1] showed that the decision and search versions are equivalent when ${\displaystyle q}$ is a prime bounded by some polynomial in ${\displaystyle n}$.

### Solving decision assuming search

Intuitively, if we have a procedure for the search problem, the decision version can be solved easily: just feed the input samples for the decision problem to the solver for the search problem. Denote the given samples by ${\displaystyle \{(\mathbf {a_{i}} ,\mathbf {b_{i}} )\}\subset \mathbb {Z} _{q}^{n}\times \mathbb {T} }$. If the solver returns a candidate ${\displaystyle \mathbf {s} }$, calculate ${\displaystyle \{\mathbf {b_{i}} -\langle \mathbf {a_{i}} ,\mathbf {s} \rangle /q\}}$ for all ${\displaystyle i}$. If the samples are from an LWE distribution, then the results of this calculation will be distributed according to ${\displaystyle \chi }$, but if the samples are uniformly random, these quantities will be distributed uniformly as well.
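The check on the candidate can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `centered` and `looks_like_lwe` are hypothetical, the error distribution is assumed to be concentrated near zero, and the threshold of 0.1 on the mean absolute residual is an arbitrary choice (uniform residuals have mean absolute value close to 1/4).

```python
def centered(x):
    """Map an element of T = R/Z to its representative in [-1/2, 1/2)."""
    return (x + 0.5) % 1.0 - 0.5


def looks_like_lwe(samples, s_candidate, q, threshold=0.1):
    """Guess whether samples (a, b) are LWE samples for s_candidate.

    Computes the residuals b - <a, s>/q in T and tests whether they
    cluster around zero (LWE) or are spread out (uniform).
    """
    residuals = []
    for a, b in samples:
        inner = sum(ai * si for ai, si in zip(a, s_candidate))
        residuals.append(centered(b - inner / q))
    mean_abs = sum(abs(r) for r in residuals) / len(residuals)
    return mean_abs < threshold
```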

### Solving search assuming decision

For the other direction, given a solver for the decision problem, the search version can be solved as follows: Recover ${\displaystyle \mathbf {s} }$ one coordinate at a time. To obtain the first coordinate, ${\displaystyle \mathbf {s} _{1}}$, make a guess ${\displaystyle k\in \mathbb {Z} _{q}}$, and do the following. Choose a number ${\displaystyle r\in \mathbb {Z} _{q}}$ uniformly at random. Transform the given samples ${\displaystyle \{(\mathbf {a_{i}} ,\mathbf {b_{i}} )\}\subset \mathbb {Z} _{q}^{n}\times \mathbb {T} }$ as follows. Calculate ${\displaystyle \{(\mathbf {a_{i}} +(r,0,\ldots ,0),\mathbf {b_{i}} +(rk)/q)\}}$. Send the transformed samples to the decision solver.

If the guess ${\displaystyle k}$ was correct, the transformation takes the distribution ${\displaystyle A_{\mathbf {s} ,\chi }}$ to itself, and otherwise, since ${\displaystyle q}$ is prime, it takes it to the uniform distribution. So, given a polynomial-time solver for the decision problem that errs with very small probability, since ${\displaystyle q}$ is bounded by some polynomial in ${\displaystyle n}$, it only takes polynomial time to guess every possible value for ${\displaystyle k}$ and use the solver to see which one is correct.
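The two cases can be checked with a short calculation. Writing ${\displaystyle \mathbf {b_{i}} =\langle \mathbf {a_{i}} ,\mathbf {s} \rangle /q+e_{i}}$ with ${\displaystyle e_{i}}$ drawn from ${\displaystyle \chi }$ (the symbol ${\displaystyle e_{i}}$ is introduced here for illustration), the transformed second component is

$${\displaystyle \mathbf {b_{i}} +{\frac {rk}{q}}={\frac {\langle \mathbf {a_{i}} ,\mathbf {s} \rangle }{q}}+e_{i}+{\frac {rk}{q}}={\frac {\langle \mathbf {a_{i}} +(r,0,\ldots ,0),\mathbf {s} \rangle }{q}}+e_{i}+{\frac {r(k-\mathbf {s} _{1})}{q}}.}$$

If ${\displaystyle k=\mathbf {s} _{1}}$, the last term vanishes and the pair is again an LWE sample for ${\displaystyle \mathbf {s} }$. If ${\displaystyle k\neq \mathbf {s} _{1}}$, then ${\displaystyle k-\mathbf {s} _{1}}$ is invertible modulo the prime ${\displaystyle q}$, so, assuming ${\displaystyle r}$ is drawn afresh for each sample, ${\displaystyle r(k-\mathbf {s} _{1})}$ is uniform in ${\displaystyle \mathbb {Z} _{q}}$ and the last term masks the sample with a uniformly random multiple of ${\displaystyle 1/q}$.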

After obtaining ${\displaystyle \mathbf {s} _{1}}$, we follow an analogous procedure for each other coordinate ${\displaystyle \mathbf {s} _{j}}$. Namely, we transform our ${\displaystyle \mathbf {b_{i}} }$ samples the same way, and transform our ${\displaystyle \mathbf {a_{i}} }$ samples by calculating ${\displaystyle \mathbf {a_{i}} +(0,\ldots ,r,\ldots ,0)}$, where the ${\displaystyle r}$ is in the ${\displaystyle j^{th}}$ coordinate.[1]
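The coordinate-by-coordinate procedure can be sketched in Python as below. This is an illustration rather than Regev's exact construction: `transform` and `recover_secret` are hypothetical names, `is_lwe` stands for an assumed decision oracle supplied by the caller, and the sketch draws a fresh ${\displaystyle r}$ for every sample.

```python
import random


def transform(samples, j, k, q, rng):
    """Apply the guess k for coordinate j (0-based) to every sample."""
    out = []
    for a, b in samples:
        r = rng.randrange(q)  # fresh r per sample (assumption of this sketch)
        a2 = list(a)
        a2[j] = (a2[j] + r) % q          # a_i + (0, ..., r, ..., 0)
        b2 = (b + r * k / q) % 1.0       # b_i + (r k) / q in T
        out.append((a2, b2))
    return out


def recover_secret(samples, n, q, is_lwe, rng):
    """Recover s one coordinate at a time using a decision oracle.

    is_lwe(samples) -> bool is a hypothetical oracle that returns True
    when its input looks like LWE samples and False when it looks uniform.
    """
    s = []
    for j in range(n):
        for k in range(q):  # q is polynomial in n, so this loop is cheap
            if is_lwe(transform(samples, j, k, q, rng)):
                s.append(k)
                break
        else:
            raise ValueError("oracle rejected every guess for coordinate %d" % j)
    return s
```

The total number of oracle calls is at most ${\displaystyle nq}$, which is polynomial in ${\displaystyle n}$ when ${\displaystyle q}$ is.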

Peikert[2] showed that this reduction, with a small modification, works for any ${\displaystyle q}$ that is a product of distinct, small (polynomial in ${\displaystyle n}$) primes. The main idea is if ${\displaystyle q=q_{1}q_{2}\cdots q_{t}}$, for each ${\displaystyle q_{\ell }}$, guess and check to see if ${\displaystyle \mathbf {s} _{j}}$ is congruent to ${\displaystyle 0\mod q_{\ell }}$, and then use the Chinese remainder theorem to recover ${\displaystyle \mathbf {s} _{j}}$.
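The final recombination step can be illustrated with a small Python helper. The name `crt` is hypothetical, and the sketch assumes that the residues of ${\displaystyle \mathbf {s} _{j}}$ modulo each prime ${\displaystyle q_{\ell }}$ have already been found by guess and check. It relies on `pow(m, -1, p)` for modular inverses, which is available from Python 3.8.

```python
def crt(residues, moduli):
    """Combine x = residues[i] (mod moduli[i]) for pairwise coprime moduli."""
    x, m = 0, 1
    for r, p in zip(residues, moduli):
        # Find t with x + m * t = r (mod p), using the inverse of m modulo p.
        t = ((r - x) * pow(m, -1, p)) % p
        x += m * t
        m *= p
    return x % m
```

For instance, with ${\displaystyle q=3\cdot 5\cdot 7=105}$ and residues 1, 2 and 3 modulo 3, 5 and 7 respectively, `crt([1, 2, 3], [3, 5, 7])` gives 52.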

### Average case hardness

Regev[1] showed the random self-reducibility of the LWE and DLWE problems for arbitrary ${\displaystyle q}$ and ${\displaystyle \chi }$. Given samples ${\displaystyle \{(\mathbf {a_{i}} ,\mathbf {b_{i}} )\}}$ from ${\displaystyle A_{\mathbf {s} ,\chi }}$, it is easy to see that ${\displaystyle \{(\mathbf {a_{i}} ,\mathbf {b_{i}} +\langle \mathbf {a_{i}} ,\mathbf {t} \rangle /q)\}}$ are samples from ${\displaystyle A_{\mathbf {s} +\mathbf {t} ,\chi }}$.
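A minimal Python sketch of this re-randomization is shown below. The name `shift_secret` is hypothetical, and elements of ${\displaystyle \mathbb {T} }$ are again represented as floats reduced modulo one.

```python
def shift_secret(samples, t, q):
    """Turn samples for secret s into samples for secret s + t (mod q).

    Each b is replaced by b + <a, t>/q in T; the vectors a and the
    error terms are left unchanged.
    """
    shifted = []
    for a, b in samples:
        inner = sum(ai * ti for ai, ti in zip(a, t))
        shifted.append((a, (b + inner / q) % 1.0))
    return shifted
```

Because the error terms are untouched, the shifted samples have the same noise distribution ${\displaystyle \chi }$ as the originals.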

So, suppose there was some set ${\displaystyle {\mathcal {S}}\subset \mathbb {Z} _{q}^{n}}$ such that ${\displaystyle |{\mathcal {S}}|/|\mathbb {Z} _{q}^{n}|=1/poly(n)}$, and for distributions ${\displaystyle A_{\mathbf {s'} ,\chi }}$, with ${\displaystyle \mathbf {s'} \leftarrow {\mathcal {S}}}$, DLWE was easy.

Then there would be some distinguisher ${\displaystyle {\mathcal {A}}}$ that, given samples ${\displaystyle \{(\mathbf {a_{i}} ,\mathbf {b_{i}} )\}}$, could tell whether they were uniformly random or from ${\displaystyle A_{\mathbf {s'} ,\chi }}$. If we need to distinguish uniformly random samples from ${\displaystyle A_{\mathbf {s} ,\chi }}$, where ${\displaystyle \mathbf {s} }$ is chosen uniformly at random from ${\displaystyle \mathbb {Z} _{q}^{n}}$, we could simply try different values ${\displaystyle \mathbf {t} }$ sampled uniformly at random from ${\displaystyle \mathbb {Z} _{q}^{n}}$, calculate ${\displaystyle \{(\mathbf {a_{i}} ,\mathbf {b_{i}} +\langle \mathbf {a_{i}} ,\mathbf {t} \rangle /q)\}}$ and feed these samples to ${\displaystyle {\mathcal {A}}}$. Since ${\displaystyle {\mathcal {S}}}$ comprises a large fraction of ${\displaystyle \mathbb {Z} _{q}^{n}}$, with high probability, if we choose a polynomial number of values for ${\displaystyle \mathbf {t} }$, we will find one such that ${\displaystyle \mathbf {s} +\mathbf {t} \in {\mathcal {S}}}$, and ${\displaystyle {\mathcal {A}}}$ will successfully distinguish the samples.

Thus, no such ${\displaystyle {\mathcal {S}}}$ can exist, meaning LWE and DLWE are (up to a polynomial factor) as hard in the average case as they are in the worst case.

## Hardness results

### Regev's result

For an n-dimensional lattice ${\displaystyle L}$, let the smoothing parameter ${\displaystyle \eta _{\epsilon }(L)}$ denote the smallest ${\displaystyle s}$ such that ${\displaystyle \rho _{1/s}(L^{*}\setminus \{\mathbf {0} \})\leq \epsilon }$ where ${\displaystyle L^{*}}$ is the dual of ${\displaystyle L}$ and ${\displaystyle \rho _{\alpha }(x)=e^{-\pi (|x|/\alpha )^{2}}}$ is extended to sets by summing over function values at each element in the set. Let ${\displaystyle D_{L,r}}$ denote the discrete Gaussian distribution on ${\displaystyle L}$ of width ${\displaystyle r}$ for a lattice ${\displaystyle L}$ and real ${\displaystyle r>0}$. The probability of each ${\displaystyle x\in L}$ is proportional to ${\displaystyle \rho _{r}(x)}$.

The discrete Gaussian sampling problem (DGS) is defined as follows: An instance of ${\displaystyle DGS_{\phi }}$ is given by an ${\displaystyle n}$-dimensional lattice ${\displaystyle L}$ and a number ${\displaystyle r\geq \phi (L)}$. The goal is to output a sample from ${\displaystyle D_{L,r}}$. Regev shows that there is a reduction from ${\displaystyle GapSVP_{100{\sqrt {n}}\gamma (n)}}$ to ${\displaystyle DGS_{{\sqrt {n}}\gamma (n)/\lambda (L^{*})}}$ for any function ${\displaystyle \gamma (n)}$.

Regev then shows that there exists an efficient quantum algorithm for ${\displaystyle DGS_{{\sqrt {2n}}\eta _{\epsilon }(L)/\alpha }}$ given access to an oracle for ${\displaystyle LWE_{q,\Psi _{\alpha }}}$ for integer ${\displaystyle q}$ and ${\displaystyle \alpha \in (0,1)}$ such that ${\displaystyle \alpha q>2{\sqrt {n}}}$. This implies the hardness for ${\displaystyle LWE}$. Although the proof of this assertion works for any ${\displaystyle q}$, for creating a cryptosystem, the ${\displaystyle q}$ has to be polynomial in ${\displaystyle n}$.

### Peikert's result

Peikert proves[2] that there is a probabilistic polynomial time reduction from the ${\displaystyle GapSVP_{\zeta ,\gamma }}$ problem in the worst case to solving ${\displaystyle LWE_{q,\Psi _{\alpha }}}$ using ${\displaystyle poly(n)}$ samples for parameters ${\displaystyle \alpha \in (0,1)}$, ${\displaystyle \gamma (n)\geq n/(\alpha {\sqrt {\log {n}}})}$, ${\displaystyle \zeta (n)\geq \gamma (n)}$ and ${\displaystyle q\geq (\zeta /{\sqrt {n}})\,\omega ({\sqrt {\log {n}}})}$.

## Use in cryptography

The LWE problem serves as a versatile problem used in the construction of several[1][2][4][5] cryptosystems. In 2005, Regev[1] showed that the decision version of LWE is hard assuming quantum hardness of the lattice problems ${\displaystyle GapSVP_{\gamma }}$ (for ${\displaystyle \gamma }$ as above) and ${\displaystyle SIVP_{t}}$ with ${\displaystyle t={\tilde {O}}(n/\alpha )}$. In 2009, Peikert[2] proved a similar result assuming only the classical hardness of the related problem ${\displaystyle GapSVP_{\zeta ,\gamma }}$. The disadvantage of Peikert's result is that it bases itself on a non-standard version of an easier (when compared to SIVP) problem GapSVP.

### Public-key cryptosystem

Regev[1] proposed a public-key cryptosystem based on the hardness of the LWE problem. The cryptosystem as well as the proof of security and correctness are completely classical. The system is characterized by ${\displaystyle m,q}$ and a probability distribution ${\displaystyle \chi }$ on ${\displaystyle \mathbb {T} }$. The setting of the parameters used in proofs of correctness and security is

• ${\displaystyle q\geq 2}$, a prime number between ${\displaystyle n^{2}}$ and ${\displaystyle 2n^{2}}$.
• ${\displaystyle m=(1+\epsilon )(n+1)\log {q}}$ for an arbitrary constant ${\displaystyle \epsilon >0}$.
• ${\displaystyle \chi =\Psi _{\alpha (n)}}$ for ${\displaystyle \alpha (n)\in o(1/({\sqrt {n}}\log {n}))}$.

The cryptosystem is then defined by:

• Private Key: Private key is an ${\displaystyle \mathbf {s} \in \mathbb {Z} _{q}^{n}}$ chosen uniformly at random.
• Public Key: Choose ${\displaystyle m}$ vectors ${\displaystyle a_{1},\ldots ,a_{m}\in \mathbb {Z} _{q}^{n}}$ uniformly and independently. Choose error offsets ${\displaystyle e_{1},\ldots ,e_{m}\in \mathbb {T} }$ independently according to ${\displaystyle \chi }$. The public key consists of ${\displaystyle (a_{i},b_{i}=\langle a_{i},\mathbf {s} \rangle /q+e_{i})_{i=1}^{m}}$.
• Encryption: The encryption of a bit ${\displaystyle x\in \{0,1\}}$ is done by choosing a random subset ${\displaystyle S}$ of ${\displaystyle [m]}$ and then defining ${\displaystyle Enc(x)}$ as ${\displaystyle (\sum _{i\in S}a_{i},x/2+\sum _{i\in S}b_{i})}$.
• Decryption: The decryption of ${\displaystyle (a,b)}$ is ${\displaystyle 0}$ if ${\displaystyle b-\langle a,\mathbf {s} \rangle /q}$ is closer to ${\displaystyle 0}$ than to ${\displaystyle {\frac {1}{2}}}$, and ${\displaystyle 1}$ otherwise.

The proof of correctness follows from choice of parameters and some probability analysis. The proof of security is by reduction to the decision version of LWE: an algorithm for distinguishing between encryptions (with above parameters) of ${\displaystyle 0}$ and ${\displaystyle 1}$ can be used to distinguish between ${\displaystyle A_{s,\chi }}$ and the uniform distribution over ${\displaystyle \mathbb {Z} _{q}^{n}\times \mathbb {T} }$.
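The key generation, encryption and decryption steps above can be sketched as a toy Python implementation. It is meant only to illustrate the scheme: the function names are hypothetical, torus elements are floats modulo one, each index is placed in ${\displaystyle S}$ independently with probability 1/2, and the small parameters used in the usage note offer no security.

```python
import math
import random


def keygen(n, m, q, alpha, rng):
    """Return (private key s, public key [(a_i, b_i)])."""
    s = [rng.randrange(q) for _ in range(n)]
    sigma = alpha / math.sqrt(2 * math.pi)  # std. dev. matching Psi_alpha
    pub = []
    for _ in range(m):
        a = [rng.randrange(q) for _ in range(n)]
        e = rng.gauss(0.0, sigma)
        b = (sum(x * y for x, y in zip(a, s)) / q + e) % 1.0
        pub.append((a, b))
    return s, pub


def encrypt(pub, bit, q, rng):
    """Encrypt one bit as (sum of a_i, bit/2 + sum of b_i) over a random subset."""
    n = len(pub[0][0])
    a_sum = [0] * n
    b_sum = 0.0
    for a, b in pub:
        if rng.random() < 0.5:  # index i belongs to the random subset S
            a_sum = [(x + y) % q for x, y in zip(a_sum, a)]
            b_sum += b
    return a_sum, (bit / 2 + b_sum) % 1.0


def decrypt(s, ciphertext, q):
    """Return 0 if b - <a, s>/q is closer to 0 than to 1/2 in T, else 1."""
    a, b = ciphertext
    d = (b - sum(x * y for x, y in zip(a, s)) / q) % 1.0
    dist_to_zero = min(d, 1.0 - d)  # distance to 0 on the torus
    return 0 if dist_to_zero < 0.25 else 1
```

As a usage example, `s, pub = keygen(8, 60, 67, 0.002, random.Random(4))` followed by `decrypt(s, encrypt(pub, 1, 67, random.Random(5)), 67)` should recover the bit, since the accumulated error of at most 60 small offsets stays far below 1/4.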

### CCA-secure cryptosystem

Peikert[2] proposed a system that is secure even against any chosen-ciphertext attack.