# Shamir's Secret Sharing

Shamir's Secret Sharing is a cryptographic algorithm created by Adi Shamir. It is a form of secret sharing in which a secret is divided into parts and each participant receives its own unique part; some or all of the parts are needed to reconstruct the secret.

Relying on all participants to combine their parts may be impractical, so a threshold scheme is sometimes used, in which any $k$ of the parts are sufficient to reconstruct the original secret.

## Mathematical definition

The goal is to divide data $D$ (e.g., a safe combination) into $n$ pieces $D_1,\ldots,D_n$ in such a way that:

1. Knowledge of any $k$ or more $D_i$ pieces makes $D$ easily computable.
2. Knowledge of any $k-1$ or fewer $D_i$ pieces leaves $D$ completely undetermined (in the sense that all its possible values are equally likely).

This scheme is called a $\left(k,n\right)$ threshold scheme. If $k=n$ then all participants are required to reconstruct the secret.

## Shamir's secret-sharing scheme

Figure (illustration only): an infinite number of polynomials of degree 2 can be drawn through 2 points, while 3 points are required to define a unique polynomial of degree 2. Shamir's scheme uses polynomials over a finite field, which cannot be represented on a 2-dimensional plane.

The essential idea of Adi Shamir's threshold scheme is that 2 points are sufficient to define a line, 3 points are sufficient to define a parabola, 4 points to define a cubic curve and so forth. That is, it takes $k\,\!$ points to define a polynomial of degree $k-1\,\!$.

Suppose we want to use a $\left(k,n\right)\,\!$ threshold scheme to share our secret $S\,\!$, without loss of generality assumed to be an element in a finite field $F$ of size $P$ where $0 < k \le n < P$ and $P$ is a prime number.

Choose at random $k-1\,\!$ coefficients $a_1,\cdots,a_{k-1}\,\!$ in $F$, and let $a_0=S\,\!$. Build the polynomial $f\left(x\right)=a_0+a_1x+a_2x^2+a_3x^3+\cdots+a_{k-1}x^{k-1}\,\!$. Let us construct any $n\,\!$ points out of it, for instance set $i=1,\cdots,n\,\!$ to retrieve $\left(i,f\left(i\right)\right)\,\!$. Every participant is given a point (an integer input to the polynomial, and the corresponding integer output). Given any subset of $k\,\!$ of these pairs, we can find the coefficients of the polynomial using interpolation. The secret is the constant term $a_0\,\!$.
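Because only the constant term is needed, the interpolation can be evaluated directly at $x=0$. As a sketch, writing the $k$ available pairs as $(x_j,y_j)$ for $j=0,\ldots,k-1$ (an indexing chosen here for illustration), the Lagrange form gives:

$S=f(0)=\sum_{j=0}^{k-1}y_j\prod_{m\ne j}\frac{x_m}{x_m-x_j}$

All operations are carried out in $F$, so each division is a multiplication by the inverse modulo $P$.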

## Usage

### Example

The following example illustrates the basic idea. Note, however, that the calculations in the example are done using integer arithmetic rather than finite field arithmetic. Therefore the example below does not provide perfect secrecy and is not a true example of Shamir's scheme. The problem is explained after the example, followed by the right way to implement the scheme (using finite field arithmetic).

#### Preparation

Suppose that our secret is 1234 $(S=1234)\,\!$.

We wish to divide the secret into 6 parts $(n=6)\,\!$, where any subset of 3 parts $(k=3)\,\!$ is sufficient to reconstruct the secret. At random we obtain two ($k-1$) numbers: 166 and 94.

$(a_1=166;a_2=94)\,\!$

Our polynomial to produce secret shares (points) is therefore:

$f\left(x\right)=1234+166x+94x^2\,\!$

We construct 6 points $D_{x-1}=(x, f(x))$ from the polynomial:

$D_0=\left(1,1494\right);D_1=\left(2,1942\right);D_2=\left(3,2578\right);D_3=\left(4,3402\right);D_4=\left(5,4414\right);D_5=\left(6,5614\right)\,\!$

We give each participant a different single point (both $x\,\!$ and $f\left(x\right)\,\!$). Because we use $D_{x-1}$ instead of $D_x$, the points start from $(1, f(1))$ and not $(0, f(0))$. This is necessary because anyone holding $(0, f(0))$ would also know the secret ($S=f(0)$).
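The six points can be reproduced with a few lines of code. This is a minimal sketch of the integer-arithmetic version only; the names `coefficients`, `evaluate` and `points` are chosen here for illustration.

```javascript
// Integer-arithmetic version of the example (illustration only, not secure).
const coefficients = [1234, 166, 94]; // a_0 = S, a_1, a_2

// Evaluate f(x) = a_0 + a_1*x + a_2*x^2 + ... using Horner's rule.
function evaluate(coeffs, x) {
  let result = 0;
  for (let i = coeffs.length - 1; i >= 0; i--) {
    result = result * x + coeffs[i];
  }
  return result;
}

// Build the points (x, f(x)) for x = 1..6; x = 0 is skipped because f(0) = S.
const points = [];
for (let x = 1; x <= 6; x++) {
  points.push([x, evaluate(coefficients, x)]);
}
console.log(points);
// [[1,1494],[2,1942],[3,2578],[4,3402],[5,4414],[6,5614]]
```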

#### Reconstruction

Any 3 points are enough to reconstruct the secret.

Let us consider $\left(x_0,y_0\right)=\left(2,1942\right);\left(x_1,y_1\right)=\left(4,3402\right);\left(x_2,y_2\right)=\left(5,4414\right)\,\!$.

We will compute Lagrange basis polynomials:

$\ell_0=\frac{x-x_1}{x_0-x_1}\cdot\frac{x-x_2}{x_0-x_2}=\frac{x-4}{2-4}\cdot\frac{x-5}{2-5}=\frac{1}{6}x^2-\frac{3}{2}x+\frac{10}{3}\,\!$

$\ell_1=\frac{x-x_0}{x_1-x_0}\cdot\frac{x-x_2}{x_1-x_2}=\frac{x-2}{4-2}\cdot\frac{x-5}{4-5}=-\frac{1}{2}x^2+\frac{7}{2}x-5\,\!$

$\ell_2=\frac{x-x_0}{x_2-x_0}\cdot\frac{x-x_1}{x_2-x_1}=\frac{x-2}{5-2}\cdot\frac{x-4}{5-4}=\frac{1}{3}x^2-2x+\frac{8}{3}\,\!$

Therefore

$f(x)=\sum_{j=0}^2 y_j\cdot\ell_j(x)\,\!$

$=1942\,\ell_0(x)+3402\,\ell_1(x)+4414\,\ell_2(x)\,\!$

$=1234+166x+94x^2\,\!$

Recall that the secret is the free coefficient, which means that $S=1234\,\!$, and we are done.
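Since only the free coefficient matters, the basis polynomials need to be evaluated only at $x=0$. The following is a minimal sketch of that shortcut for the integer example; the function name `interpolateAtZero` is hypothetical, and the running sum is kept as an exact fraction to avoid rounding.

```javascript
// Recover f(0) from integer points by Lagrange interpolation at x = 0.
// The running sum is kept as an exact fraction sumNum / sumDen.
function interpolateAtZero(points) {
  let sumNum = 0;
  let sumDen = 1;
  for (let j = 0; j < points.length; j++) {
    const [xj, yj] = points[j];
    let num = yj; // y_j times the product of (0 - x_m)
    let den = 1;  // product of (x_j - x_m)
    for (let m = 0; m < points.length; m++) {
      if (m === j) continue;
      const xm = points[m][0];
      num *= -xm;
      den *= xj - xm;
    }
    // Add num/den to sumNum/sumDen over a common denominator.
    sumNum = sumNum * den + num * sumDen;
    sumDen = sumDen * den;
  }
  return sumNum / sumDen;
}

const secret = interpolateAtZero([[2, 1942], [4, 3402], [5, 4414]]);
console.log(secret); // 1234
```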

#### Problem

Although this method works, there is a security problem: Eve gains a lot of information about $S$ with every $D_i$ that she finds.

Suppose that she finds the 2 points $D_0=(1,1494)$ and $D_1=(2,1942)$. She still doesn't have $k=3$ points, so in theory she shouldn't have gained any information about $S$. But she combines the information from the 2 points with the public information: $n=6, k=3, f(x)=a_0+a_1x+\dots+a_{k-1}x^{k-1}, a_0=S, a_i\in\mathbb{N}$. She:

1. fills the $f(x)$-formula with $S$ and the value of $k$: $f(x)=S+a_1x+\dots+a_{3-1}x^{3-1}\Rightarrow{}f(x)=S+a_1x+a_2x^2$
2. fills the formula from step 1 with the values of $D_0$'s $x$ and $f(x)$: $1494=S+a_{1}\cdot1+a_{2}\cdot1^2\Rightarrow{}1494=S+a_1+a_2$
3. fills the formula from step 1 with the values of $D_1$'s $x$ and $f(x)$: $1942=S+a_{1}\cdot2+a_{2}\cdot2^2\Rightarrow{}1942=S+2a_1+4a_2$
4. subtracts the equation of step 2 from that of step 3: $(1942-1494)=(S-S)+(2a_1-a_1)+(4a_2-a_2)\Rightarrow{}448=a_1+3a_2$ and rewrites this as $a_1=448-3a_2$
5. knows that $a_2\in\mathbb{N}$, so she starts replacing $a_2$ in the result of step 4 with 0, 1, 2, 3, ... to find all possible values for $a_1$:
• $a_2=0\rightarrow{}a_1=448-3\times0=448$
• $a_2=1\rightarrow{}a_1=448-3\times1=445$
• $a_2=2\rightarrow{}a_1=448-3\times2=442$
• $\dots$
• $a_2=148\rightarrow{}a_1=448-3\times148=4$
• $a_2=149\rightarrow{}a_1=448-3\times149=1$

After $a_2=149$ she stops, because continuing would give negative values for $a_1$ (which is impossible because $a_1\in\mathbb{N}$). She can now conclude $a_2\in[0,1,\dots,148,149]$.

6. replaces $a_1$ in the equation of step 2 by the result of step 4: $1494=S+(448-3a_2)+a_2\Rightarrow{}S=1046+2a_2$
7. replaces $a_2$ in the result of step 6 by the values found in step 5, so she gets $S\in[1046+2\times0,1046+2\times1,\dots,1046+2\times148,1046+2\times149]$, which leads her to the information:

$S\in[1046,1048,\dots,1342,1344]$. She now has only 150 numbers to guess from instead of an infinite number of natural numbers.
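Eve's enumeration can be written as a short loop. The sketch below assumes, as in the text, that the coefficients are natural numbers and that she holds the two integer points $(1,1494)$ and $(2,1942)$; the variable names are illustrative.

```javascript
// Eve's two integer shares of f(x) = S + a1*x + a2*x^2.
const [x0, y0] = [1, 1494];
const [x1, y1] = [2, 1942];

// Subtracting the two share equations gives a1 + (x0 + x1)*a2 = (y1 - y0)/(x1 - x0),
// which is a1 = 448 - 3*a2 for these points.
const slope = (y1 - y0) / (x1 - x0); // 448
const step = x0 + x1;                // 3

const candidates = [];
for (let a2 = 0; ; a2++) {
  const a1 = slope - step * a2;
  if (a1 < 0) break; // a1 must be a natural number
  // Solve the first share equation for S.
  candidates.push(y0 - a1 * x0 - a2 * x0 * x0);
}

console.log(candidates.length);                    // 150
console.log(candidates[0], candidates[149]);       // 1046 1344
console.log(candidates.includes(1234));            // true
```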

#### Solution

This problem can be fixed by using finite field arithmetic in a field of size $p\in\mathbb{P}:p>S,p>n$.

In practice this is only a small change: we choose a prime $p$ that is bigger than both the secret and the number of participants, and we calculate the points as $(x, f(x)\pmod{p})$ instead of $(x, f(x))$.

Everyone who receives a point also has to know the value of $p$, so it is publicly known. You should therefore choose a value for $p$ that is not too low: Eve knows $p>S\Rightarrow{}S\in{[0,1,\dots,p-2,p-1]}$, so the lower you choose $p$, the fewer possible values Eve has to guess from to get $S$.

You should also not choose it too high, because Eve knows that the chance of $f(x)\pmod{p}=f(x)$ increases with a higher $p$, and she can use the procedure from the original problem to guess $S$ (although now, instead of being certain of the 150 possible values, she only knows that they have an increased chance of being valid compared to the other natural numbers). This concern arises when the coefficients are small relative to $p$, so that the reduction modulo $p$ rarely takes effect.

For this example we choose $p=1613$, so our polynomial becomes $f\left(x\right)=1234+166x+94x^2\mod{1613}$ which gives the points: $\left(1,1494\right);\left(2,329\right);\left(3,965\right);\left(4,176\right);\left(5,1188\right);\left(6,775\right)$

This time Eve doesn't gain any information when she finds a $D_x$ (until she has $k$ points).

Suppose Eve again finds $D_0=\left(1,1494\right)$ and $D_1=\left(2,329\right)$. This time the public information is: $n=6, k=3, p=1613, f(x)=a_0+a_1x+\dots+a_{k-1}x^{k-1}\mod{p}, a_0=S, a_i\in\mathbb{N}$, so she:

1. fills the $f(x)$-formula with $S$ and the values of $k$ and $p$: $f(x)=S+a_1x+\dots+a_{3-1}x^{3-1}\mod1613\Rightarrow{}f(x)=S+a_1x+a_2x^2-1613m_x: m_x\in\mathbb{N}$
2. fills the formula from step 1 with the values of $D_0$'s $x$ and $f(x)$: $1494=S+a_{1}\cdot1+a_{2}\cdot1^2-1613m_1\Rightarrow{}1494=S+a_1+a_2-1613m_1$
3. fills the formula from step 1 with the values of $D_1$'s $x$ and $f(x)$: $329=S+a_{1}\cdot2+a_{2}\cdot2^2-1613m_2\Rightarrow{}329=S+2a_1+4a_2-1613m_2$
4. subtracts the equation of step 2 from that of step 3: $(329-1494)=(S-S)+(2a_1-a_1)+(4a_2-a_2)-(1613m_2-1613m_1)\Rightarrow{}-1165=a_1+3a_2-1613(m_2-m_1)$ and rewrites this as $a_1=-1165-3a_2+1613(m_2-m_1)$
5. knows that $a_2\in\mathbb{N}$, so she starts replacing $a_2$ in the result of step 4 with 0, 1, 2, 3, ... to find all possible values for $a_1$:
• $a_2=0\rightarrow{}a_1=-1165-3\times0+1613(m_2-m_1)=-1165+1613(m_2-m_1)$
• $a_2=1\rightarrow{}a_1=-1165-3\times1+1613(m_2-m_1)=-1168+1613(m_2-m_1)$
• $a_2=2\rightarrow{}a_1=-1165-3\times2+1613(m_2-m_1)=-1171+1613(m_2-m_1)$
• $\dots$

This time she can't stop, because $(m_2-m_1)$ could be any integer (even negative if $m_2<m_1$), so there is an infinite number of possible values for $a_1$. She knows that $[-1165,-1168,-1171,\dots]$ always decreases by 3, so if $1613$ were divisible by $3$ she could conclude $a_1\in[2, 5, 8, \dots]$. But because $1613$ is prime she can't even conclude that, and so she gains no information.
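The same conclusion can be checked by brute force. The sketch below assumes that the coefficients range over the whole field $\{0,\dots,p-1\}$; for every candidate secret it counts how many coefficient pairs are consistent with Eve's two points. The names `mod`, `shares` and `counts` are illustrative.

```javascript
const p = 1613;
// JavaScript's % keeps the sign of the left operand, so normalise into [0, p - 1].
const mod = (n) => ((n % p) + p) % p;

// Eve's two shares in the finite-field version of the example.
const shares = [[1, 1494], [2, 329]];

// For every candidate secret s, count the pairs (a1, a2) that reproduce both shares.
const counts = new Array(p).fill(0);
for (let s = 0; s < p; s++) {
  for (let a2 = 0; a2 < p; a2++) {
    // a1 is forced by the first share: s + a1 + a2 = 1494 (mod p).
    const a1 = mod(shares[0][1] - s - a2);
    // Keep the pair only if it also satisfies s + 2*a1 + 4*a2 = 329 (mod p).
    if (mod(s + 2 * a1 + 4 * a2) === shares[1][1]) counts[s]++;
  }
}

const everySecretEquallyLikely = counts.every((c) => c === 1);
console.log(everySecretEquallyLikely); // true
```

Each of the 1613 possible secrets is matched by exactly one coefficient pair, so under this assumption the two points do not favour any value of $S$.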

#### JavaScript example

    var prime = 1613; // a prime larger than both the secret and the number of shares

    /*
     * Reduce n into the range [0, prime - 1].
     * JavaScript's % keeps the sign of its left operand, so negative values need the extra step.
     */
    function mod(n)
    {
        return ((n % prime) + prime) % prime;
    }

    /*
     * Split number into the shares
     */
    function split(number, available, needed)
    {
        var coef = [number, 166, 94], x, exp, c, accum, shares = [];
        /*
         * Normally, we use the line:
         * for(c = 1, coef[0] = number; c < needed; c++) coef[c] = Math.floor(Math.random() * prime);
         * so that every coefficient lies between 0 and (prime - 1).
         * However, to follow this example, we hardcode the values:
         * coef = [number, 166, 94];
         * For production, replace the hardcoded values with random coefficients
         * (note that Math.random is not a cryptographically secure source).
         *
         * For each share that is requested to be available, evaluate the polynomial at x.
         * The result is f(x), where x is the index of the share (the secret in the example is 1234)
         */
        for(x = 1; x <= available; x++)
        {
            /*
             * coef = [1234, 166, 94] which is 1234x^0 + 166x^1 + 94x^2
             */
            for(exp = 1, accum = coef[0]; exp < needed; exp++)
                accum = mod(accum + coef[exp] * mod(Math.pow(x, exp))); // Modular math
            /*
             * Store values as (1, 1494), (2, 329), (3, 965), (4, 176), (5, 1188), (6, 775)
             */
            shares[x - 1] = [x, accum];
        }
        return shares;
    }

    /*
     * Gives the decomposition of the gcd of a and b.
     * Returns [x,y,z] such that x = gcd(a,b) and y*a + z*b = x
     */
    function gcdD(a, b)
    {
        if (b == 0) return [a, 1, 0];
        var n = Math.floor(a / b), c = a % b, r = gcdD(b, c);
        return [r[0], r[2], r[1] - r[2] * n];
    }

    /*
     * Gives the multiplicative inverse of k mod prime.
     * In other words (k * modInverse(k)) % prime = 1 for all 1 <= k < prime
     */
    function modInverse(k)
    {
        return mod(gcdD(prime, mod(k))[2]);
    }

    /*
     * Join the shares into a number
     */
    function join(shares)
    {
        var accum, count, formula, startposition, nextposition, value, numerator, denominator;
        for(formula = accum = 0; formula < shares.length; formula++)
        {
            /*
             * Multiply the numerators across the top and the denominators across the bottom to do Lagrange's interpolation at x = 0
             * For x0(2), x1(4), x2(5) the result is -4*-5 over (2-4=-2)(2-5=-3), etc for l0, l1, l2...
             */
            for(count = 0, numerator = denominator = 1; count < shares.length; count++)
            {
                if(formula == count) continue; // Skip the share's own position
                startposition = shares[formula][0];
                nextposition = shares[count][0];
                numerator = mod(numerator * -nextposition);
                denominator = mod(denominator * (startposition - nextposition));
            }
            value = shares[formula][1];
            accum = mod(accum + mod(value * numerator) * modInverse(denominator));
        }
        return accum;
    }

    var sh = split(1234, 6, 3); /* split the secret value 1234 into 6 shares, at least 3 of which are needed to recover it */
    var newshares = [sh[1], sh[3], sh[4]]; /* pick any selection of 3 shares from sh */
    console.log(join(newshares)); /* prints 1234 */



## Properties

Some of the useful properties of Shamir's $\left(k,n\right)\,\!$ threshold scheme are:

1. Secure: Information theoretic security.
2. Minimal: The size of each piece does not exceed the size of the original data.
3. Extensible: When $k\,\!$ is kept fixed, $D_i\,\!$ pieces can be dynamically added or deleted without affecting the other pieces.
4. Dynamic: Security can be easily enhanced without changing the secret, by changing the polynomial occasionally (keeping the same free term) and constructing new shares for the participants.
5. Flexible: In organizations where hierarchy is important, we can supply each participant with a different number of pieces according to their importance inside the organization. For instance, the president can unlock the safe alone, whereas 3 secretaries are required together to unlock it.
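The Extensible and Dynamic properties can be illustrated with the polynomial and the prime $p=1613$ from the example above. This is only a sketch: the helper `evalPoly` is hypothetical, and the replacement coefficients 1000 and 7 are arbitrary stand-ins for freshly drawn random values.

```javascript
const p = 1613;
const mod = (n) => ((n % p) + p) % p;

// Evaluate a_0 + a_1*x + a_2*x^2 + ... modulo p using Horner's rule.
function evalPoly(coeffs, x) {
  let acc = 0;
  for (let i = coeffs.length - 1; i >= 0; i--) {
    acc = mod(acc * x + coeffs[i]);
  }
  return acc;
}

// Extensible: a seventh share is issued later from the same polynomial;
// the six existing shares are left untouched.
const original = [1234, 166, 94];
const seventhShare = [7, evalPoly(original, 7)];

// Dynamic: a replacement polynomial keeps the free term (the secret)
// but uses new higher coefficients, so all shares are reissued.
const refreshed = [1234, 1000, 7];
const refreshedShares = [1, 2, 3].map((x) => [x, evalPoly(refreshed, x)]);

console.log(seventhShare);    // [7, 550]
console.log(refreshedShares); // [[1, 628], [2, 36], [3, 1071]]
```

Shares from the old and the refreshed polynomial must not be mixed during reconstruction, since they lie on different curves that agree only at $x=0$.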