# Block Wiedemann algorithm

The block Wiedemann algorithm for computing kernel vectors of a matrix over a finite field is a generalisation of an algorithm due to Don Coppersmith.

## Coppersmith's algorithm

Let M be an $n\times n$ square matrix over some finite field F, let $x_{\mathrm {base}}$ be a random vector of length n, and let $x = M x_{\mathrm {base}}$. Consider the sequence of vectors $S = \left[x, Mx, M^2x, \ldots\right]$ obtained by repeatedly multiplying the vector by the matrix M; let y be any other vector of length n, and consider the sequence of finite-field elements $S_y = \left[y \cdot x, y \cdot Mx, y \cdot M^2x \ldots\right]$

We know that the matrix M has a minimal polynomial; by the Cayley–Hamilton theorem we know that this polynomial is of degree (which we will call $n_0$) no more than n. Say $\sum_{r=0}^{n_0} p_rM^r = 0$. Then $\sum_{r=0}^{n_0} y \cdot (p_r (M^r x)) = 0$; so the minimal polynomial of the matrix annihilates the sequence $S$ and hence $S_y$.

But the Berlekamp–Massey algorithm allows us to calculate relatively efficiently some sequence $q_0 \ldots q_L$ with $\sum_{i=0}^L q_i S_y[{i+r}]=0 \forall r$. Our hope is that this sequence, which by construction annihilates $y \cdot S$, actually annihilates $S$; so we have $\sum_{i=0}^L q_i M^i x = 0$. We then take advantage of the initial definition of $x$ to say $M \sum_{i=0}^L q_i M^i x_{\mathrm {base}} = 0$ and so $\sum_{i=0}^L q_i M^i x_{\mathrm {base}}$ is a hopefully non-zero kernel vector of $M$.

## The block Wiedemann algorithm

The natural implementation of sparse matrix arithmetic on a computer makes it easy to compute the sequence S in parallel for a number of vectors equal to the width of a machine word – indeed, it will normally take no longer to compute for that many vectors than for one. If you have several processors, you can compute the sequence S for a different set of random vectors in parallel on all the computers.

It turns out, by a generalization of the Berlekamp–Massey algorithm to provide a sequence of small matrices, that you can take the sequence produced for a large number of vectors and generate a kernel vector of the original large matrix. You need to compute $y_i \cdot M^t x_j$ for some $i = 0 \ldots i_\max, j=0 \ldots j_\max, t = 0 \ldots t_\max$ where $i_\max, j_\max, t_\max$ need to satisfy $t_\max > \frac{d}{i_\max} + \frac{d}{j_\max} + O(1)$ and $y_i$ are a series of vectors of length n; but in practice you can take $y_i$ as a sequence of unit vectors and simply write out the first $i_\max$ entries in your vectors at each time t.

## References

Villard's 1997 research report 'A study of Coppersmith's block Wiedemann algorithm using matrix polynomials' (the cover material is in French but the content in English) is a reasonable description.

Thomé's paper 'Subquadratic computation of vector generating polynomials and improvement of the block Wiedemann algorithm' uses a more sophisticated FFT-based algorithm for computing the vector generating polynomials, and describes a practical implementation with imax = jmax = 4 used to compute a kernel vector of a 484603×484603 matrix of entries modulo 2607−1, and hence to compute discrete logarithms in the field GF(2607).