# NK model

The NK model is a mathematical model described by its primary inventor Stuart Kauffman as a "tunably rugged" fitness landscape. "Tunable ruggedness" captures the intuition that both the overall size of the landscape and the number of its local "hills and valleys" can be adjusted via its two parameters, ${\displaystyle N}$ and ${\displaystyle K}$, defined below. The NK model has found application in a wide variety of fields, including the theoretical study of evolutionary biology, immunology, optimisation, technological evolution, and complex systems. The model has also been adopted in organizational theory, where it describes how an agent may search a landscape by changing its own characteristics. For example, the agent can be an organization, the hills and valleys can represent profit (or changes in profit), and movement on the landscape corresponds to organizational decisions (such as adding product lines or altering the organizational structure), which tend to interact with each other and affect profit in a complex fashion.[1]

An early version of the model, which considered only the smoothest (${\displaystyle K=0}$) and most rugged (${\displaystyle K=N-1}$) landscapes, was presented in Kauffman and Levin (1987).[2] The model as it is currently known first appeared in Kauffman and Weinberger (1989).[3]

The model has attracted wide attention in optimisation in part because it is a particularly simple instance of an NP-complete problem.[4]

## Mathematical details

The NK model defines a combinatorial phase space, consisting of every string (chosen from a given alphabet) of length ${\displaystyle N}$. For each string in this search space, a scalar value (called the fitness) is defined. If a distance metric is defined between strings, the resulting structure is a landscape.

Fitness values are defined according to the specific incarnation of the model, but the key feature of the NK model is that the fitness of a given string ${\displaystyle S}$ is the sum of contributions from each locus ${\displaystyle S_{i}}$ in the string:

${\displaystyle F(S)=\sum _{i}f(S_{i}),}$

and the contribution from each locus in general depends on its own value and on the values of ${\displaystyle K}$ other loci:

${\displaystyle f(S_{i})=f(S_{i},S_{1}^{i},\dots ,S_{K}^{i}),\,}$

where ${\displaystyle S_{j}^{i}}$ are the other loci upon which the fitness of ${\displaystyle S_{i}}$ depends.

Hence, the fitness function ${\displaystyle f(S_{i},S_{1}^{i},\dots ,S_{K}^{i})}$ is a mapping between strings of length K + 1 and scalars, which Weinberger's later work calls "fitness contributions". Such fitness contributions are often chosen randomly from some specified probability distribution.
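The definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a reference implementation: the alphabet is binary, the K partner loci of each locus are taken to be its K cyclic right-hand neighbours, contributions are drawn uniformly from [0, 1), and the name `make_nk_landscape` is hypothetical.

```python
import itertools
import random


def make_nk_landscape(n, k, seed=0):
    """Build a random binary NK landscape and return its fitness function.

    Assumptions (not fixed by the model itself): binary alphabet, cyclic
    right-hand neighbours as the K partner loci, uniform contributions.
    """
    # Each locus depends on K *other* loci, so K can be at most N - 1.
    if not 0 <= k < n:
        raise ValueError("k must satisfy 0 <= k < n")
    rng = random.Random(seed)
    # neighbours[i] lists the K other loci that locus i depends on.
    neighbours = [[(i + j) % n for j in range(1, k + 1)] for i in range(n)]
    # One lookup table per locus: every (K+1)-bit pattern -> contribution.
    tables = [
        {bits: rng.random() for bits in itertools.product((0, 1), repeat=k + 1)}
        for _ in range(n)
    ]

    def fitness(s):
        """Return F(S), the sum of the contributions of all N loci."""
        if len(s) != n:
            raise ValueError("string has the wrong length")
        total = 0.0
        for i in range(n):
            key = (s[i],) + tuple(s[j] for j in neighbours[i])
            total += tables[i][key]
        return total

    return fitness


if __name__ == "__main__":
    F = make_nk_landscape(n=5, k=2, seed=1)
    print(F((0, 1, 0, 1, 1)))
```

Storing one table per locus keeps the lookup explicit: each table has one entry for every string of length K + 1, which is exactly the mapping described above.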

In 1991, Weinberger published a detailed analysis[5] of the case in which ${\displaystyle 1\ll k\leq N}$ and the fitness contributions are chosen randomly. His analytical estimate of the number of local optima was later shown to be flawed. However, numerical experiments included in Weinberger's analysis support his analytical result that the expected fitness of a string is normally distributed with a mean of approximately

${\displaystyle \mu +\sigma {\sqrt {{2\ln(k+1)} \over {k+1}}}}$

and a variance of approximately

${\displaystyle {{(k+1)\sigma ^{2}} \over {N[k+1+2(k+2)\ln(k+1)]}}}$.
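The two expressions above are straightforward to evaluate numerically. The sketch below does only that arithmetic; the function names are hypothetical, and treating ${\displaystyle \mu }$ and ${\displaystyle \sigma }$ as the mean and standard deviation of the distribution of fitness contributions is an assumption, since the text does not define them.

```python
import math


def approx_mean(mu, sigma, k):
    """Evaluate mu + sigma * sqrt(2 ln(k+1) / (k+1))."""
    return mu + sigma * math.sqrt(2 * math.log(k + 1) / (k + 1))


def approx_variance(sigma, k, n):
    """Evaluate (k+1) sigma^2 / (N [k+1 + 2 (k+2) ln(k+1)])."""
    return (k + 1) * sigma**2 / (n * (k + 1 + 2 * (k + 2) * math.log(k + 1)))


if __name__ == "__main__":
    # Assumed example values: contributions uniform on [0, 1).
    mu, sigma = 0.5, math.sqrt(1 / 12)
    for k in (2, 4, 8, 16):
        print(k, approx_mean(mu, sigma, k), approx_variance(sigma, k, n=32))
```

As a sanity check on the arithmetic only, setting k = 0 makes the logarithm vanish, so the first expression reduces to ${\displaystyle \mu }$ and the second to ${\displaystyle \sigma ^{2}/N}$; the approximations themselves were derived for large k.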

Visualization of two dimensions of an NK fitness landscape. The arrows represent various mutational paths that the population could follow while evolving on the fitness landscape.

## Example

For simplicity, we will work with binary strings. Consider an NK model with N = 5, K = 1. Here, the fitness of a string is given by the sum of individual fitness contributions from each of 5 loci. Each fitness contribution depends on the local locus value and one other. We will employ the convention that ${\displaystyle f(S_{i})=f(S_{i},S_{i+1})}$, so that each locus is affected by its neighbour, and ${\displaystyle f(S_{5})=f(S_{5},S_{1})}$ for cyclicity. If we choose, for example, the fitness function f(0, 0) = 0; f(0, 1) = 1; f(1, 0) = 2; f(1, 1) = 0, the fitness values of two example strings are:

${\displaystyle F(00101)=f(0,0)+f(0,1)+f(1,0)+f(0,1)+f(1,0)=0+1+2+1+2=6.\,}$
${\displaystyle F(11100)=f(1,1)+f(1,1)+f(1,0)+f(0,0)+f(0,1)=0+0+2+0+1=3.\,}$
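The two sums above can be checked with a short script. The table and the cyclic neighbour convention are taken directly from the example; the names `F_TABLE` and `fitness` are chosen here for illustration.

```python
# Fitness contribution table from the example: f(S_i, S_{i+1}).
F_TABLE = {(0, 0): 0, (0, 1): 1, (1, 0): 2, (1, 1): 0}


def fitness(s):
    """Sum f(S_i, S_{i+1}) over all loci, pairing the last locus with the first."""
    bits = [int(c) for c in s]
    n = len(bits)
    return sum(F_TABLE[(bits[i], bits[(i + 1) % n])] for i in range(n))


if __name__ == "__main__":
    print(fitness("00101"))  # 6
    print(fitness("11100"))  # 3
```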

## Tunable topology

Illustration of tunable topology in the NK model. Nodes are individual binary strings, and edges connect strings with a Hamming distance of exactly one. (left) N = 5, K = 0. (centre) N = 5, K = 1. (right) N = 5, K = 2. The colour of a node denotes its fitness, with redder nodes having higher fitness. The embedding of the hypercube is chosen so that the fitness maximum is at the centre. Notice that the K = 0 landscape appears smoother than the higher-K cases.

The value of K controls the degree of epistasis in the NK model, that is, how strongly other loci affect the fitness contribution of a given locus. With K = 0, the fitness of a given string is a simple sum of individual contributions of loci: for nontrivial fitness functions, a global optimum is present and easy to locate, because each locus can be optimised independently (when every locus shares the same contribution function f, the optimum is the genome of all 0s if f(0) > f(1), or all 1s if f(1) > f(0)). For nonzero K, the fitness of a string is a sum of fitnesses of substrings, which may interact to frustrate the system (consider how to achieve optimal fitness in the example above). Increasing K thus increases the ruggedness of the fitness landscape.
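One way to see this effect is to count local optima by exhaustive enumeration for small N. The sketch below is a rough illustration, assuming a binary alphabet, cyclic neighbours, uniformly random contributions, and single-bit flips as the only mutations; the function names are hypothetical and the counts depend on the random seed.

```python
import itertools
import random


def random_nk_fitness(n, k, rng):
    """Return the fitness function of a random binary NK landscape.

    Assumed for illustration: locus i depends on itself and its K cyclic
    right-hand neighbours, with contributions uniform on [0, 1).
    """
    tables = [
        {bits: rng.random() for bits in itertools.product((0, 1), repeat=k + 1)}
        for _ in range(n)
    ]

    def fitness(s):
        return sum(
            tables[i][tuple(s[(i + j) % n] for j in range(k + 1))]
            for i in range(n)
        )

    return fitness


def count_local_optima(n, k, seed=0):
    """Count strings that are strictly fitter than all N one-bit mutants."""
    fitness = random_nk_fitness(n, k, random.Random(seed))
    count = 0
    for s in itertools.product((0, 1), repeat=n):
        f_s = fitness(s)
        mutants = (s[:i] + (1 - s[i],) + s[i + 1:] for i in range(n))
        if all(fitness(m) < f_s for m in mutants):
            count += 1
    return count


if __name__ == "__main__":
    n = 8
    for k in range(n):
        print(k, count_local_optima(n, k))
```

With K = 0 the count is 1, since every locus can be optimised on its own; for larger K the count typically grows, which is the ruggedness described above. Exhaustive enumeration visits all 2^N strings, so it is only practical for small N.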

### Variations with neutral spaces

The bare NK model does not support the phenomenon of neutral space, that is, sets of genomes connected by single mutations that have the same fitness value. Two adaptations have been proposed to include this biologically important structure. The NKP model introduces a parameter ${\displaystyle P}$: a proportion ${\displaystyle P}$ of the ${\displaystyle 2^{K+1}}$ fitness contributions (one for each binary string of length K + 1) is set to zero, so that the contributions of several genetic motifs are degenerate. The NKQ model introduces a parameter ${\displaystyle Q}$ and enforces a discretisation on the possible fitness contribution values so that each contribution takes one of ${\displaystyle Q}$ possible values, again introducing degeneracy in the contributions from some genetic motifs. The bare NK model corresponds to the ${\displaystyle P=0}$ and ${\displaystyle Q=\infty }$ cases under these parameterisations.
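Both variants only change how the table of fitness contributions for a locus is filled. A minimal sketch follows, assuming a binary alphabet; the names `nkp_table` and `nkq_table` are hypothetical, zeroing each entry independently with probability P (so the proportion of zeros is P only on average) is one possible reading of the NKP rule, and using the integers 0 to Q - 1 as the allowed NKQ values is likewise an assumption.

```python
import itertools
import random


def nkp_table(k, p, rng):
    """Contribution table for one locus in which entries are zero with probability p.

    Assumption: each entry is zeroed independently, so the fraction of
    zero entries equals p only in expectation.
    """
    return {
        bits: 0.0 if rng.random() < p else rng.random()
        for bits in itertools.product((0, 1), repeat=k + 1)
    }


def nkq_table(k, q, rng):
    """Contribution table for one locus whose entries take one of q values.

    Assumption: the q allowed values are the integers 0, 1, ..., q - 1.
    """
    if q < 1:
        raise ValueError("q must be at least 1")
    return {
        bits: rng.randrange(q)
        for bits in itertools.product((0, 1), repeat=k + 1)
    }


if __name__ == "__main__":
    rng = random.Random(0)
    print(nkp_table(k=1, p=0.5, rng=rng))
    print(nkq_table(k=1, q=2, rng=rng))
```

Because many table entries now coincide, different strings are more likely to share the same total fitness, which is what produces neutral sets of genomes.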

## Applications

The NK model has found use in many fields, including in the study of spin glasses, epistasis and pleiotropy in evolutionary biology, and combinatorial optimisation.

## References

1. Levinthal, D. A. (1997). "Adaptation on Rugged Landscapes". Management Science. 43 (7): 934–950. doi:10.1287/mnsc.43.7.934.
2. Kauffman, S.; Levin, S. (1987). "Towards a general theory of adaptive walks on rugged landscapes". Journal of Theoretical Biology. 128 (1): 11–45. doi:10.1016/s0022-5193(87)80029-2.
3. Kauffman, S.; Weinberger, E. (1989). "The NK Model of rugged fitness landscapes and its application to the maturation of the immune response". Journal of Theoretical Biology. 141 (2): 211–245. doi:10.1016/s0022-5193(89)80019-0.
4. Weinberger, E. (1996). "NP-completeness of Kauffman's N-k model, a Tuneably Rugged Fitness Landscape". Santa Fe Institute Working Paper 96-02-003.
5. Weinberger, E. (November 15, 1991). "Local properties of Kauffman's N-k model: A tunably rugged energy landscape". Physical Review A. 44 (10): 6399–6413. doi:10.1103/physreva.44.6399.