= Redundancy principle (biology) =

The redundancy principle in biology expresses the need for many copies of the same entity (cells, molecules, ions) to fulfill a biological function. Examples are numerous: the disproportionate number of spermatozoa compared to the single egg during fertilization, the large number of neurotransmitters released during neuronal communication compared to the number of receptors, the large number of calcium ions released during calcium transients in cells, and many more in molecular and cellular transduction, gene activation, and cell signaling. This redundancy is particularly relevant when the sites of activation are physically separated from the initial position of the molecular messengers. The redundancy is often generated to overcome the time constraint of fast-activating pathways. It can be expressed in terms of the theory of extreme statistics to determine its laws and quantify how the shortest paths are selected. The main goal is to estimate these large numbers from physical principles and mathematical derivations.

When a large distance separates the source and the target (a small activation site), the redundancy principle explains that this geometrical gap can be compensated by a large number of copies. Had nature used fewer copies than normal, activation would have taken much longer, because finding a small target by chance is a rare event that belongs to the class of narrow escape problems.

== Molecular rate ==
The time for the fastest particles to reach a target in the context of redundancy depends on the number of particles and the local geometry of the target. In most cases, this time sets the rate of activation. This rate should be used instead of the classical Smoluchowski rate, which describes the mean arrival time rather than the arrival time of the fastest particle. The statistics of the minimal time to activation set kinetic laws in biology, which can be quite different from the ones associated with average times.

== Physical models ==

=== Stochastic process ===
The motion of a particle located at position $X_t$ can be described by the Smoluchowski limit of the Langevin equation:

$dX_t=\sqrt{2D} \, dB_t+\frac{1}{\gamma}F(x)dt,$

where $D$ is the diffusion coefficient of the particle, $\gamma$ is the friction coefficient per unit of mass, $F(x)$ the force per unit of mass, and $B_t$ is a Brownian motion. This model is classically used in molecular dynamics simulations.
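
As a minimal sketch of how this equation is commonly discretized, the Euler-Maruyama scheme below advances a one-dimensional trajectory with a fixed time step. The function name `simulate_langevin`, the restriction to one dimension, and the time step `dt` are illustrative assumptions, not part of the model above.

```python
import math
import random


def simulate_langevin(x0, force, D, gamma, dt, n_steps, rng=None):
    """Euler-Maruyama sketch of dX = sqrt(2D) dB + F(X)/gamma dt in one dimension."""
    rng = rng or random.Random()
    noise_scale = math.sqrt(2.0 * D * dt)  # standard deviation of one Brownian increment
    x = x0
    path = [x]
    for _ in range(n_steps):
        drift = force(x) / gamma * dt
        x = x + drift + noise_scale * rng.gauss(0.0, 1.0)
        path.append(x)
    return path
```

For instance, `simulate_langevin(1.0, lambda x: -x, D=0.1, gamma=1.0, dt=0.01, n_steps=1000)` would sample a particle in a harmonic well; the force and the numerical values are arbitrary choices for illustration.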

=== Jump processes ===
A jump process can be defined by the recursion
$\begin{align}
x_{n+1}=
\begin{cases} x_n-a, & \text{with probability } l(x_n) \\ x_n+b, &
\text{with probability } r(x_n)
\end{cases}
\end{align}$, which is, for example, a model of telomere length dynamics. Here $r(x)=\frac{1}{1+\beta x}$, with $r(x)+l(x)=1$.
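
The recursion can be simulated directly. The sketch below is one possible implementation; the function names are hypothetical, and clamping the state at zero is an added assumption (not stated in the model) that keeps $r(x)$ a valid probability.

```python
import random


def right_jump_probability(x, beta):
    """r(x) = 1 / (1 + beta * x), a probability for x >= 0."""
    return 1.0 / (1.0 + beta * x)


def simulate_jump_process(x0, a, b, beta, n_steps, rng=None):
    """Iterate x -> x + b with probability r(x), x -> x - a otherwise."""
    rng = rng or random.Random()
    x = x0
    trajectory = [x]
    for _ in range(n_steps):
        if rng.random() < right_jump_probability(x, beta):
            x = x + b
        else:
            # Assumption of this sketch: clamp at zero so that r(x) stays in (0, 1].
            x = max(x - a, 0.0)
        trajectory.append(x)
    return trajectory
```

Because $r(x)$ decreases with $x$, long states tend to shrink and short states tend to grow, which is the qualitative behaviour the model is meant to capture.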

=== Directed motion process ===
$\dot{X}=v_0 \bf u,$ where $\bf u$ is a unit vector chosen from a uniform distribution. Upon hitting an obstacle at a boundary point $X_0 \in \partial \Omega$, the velocity changes to $\dot{X}=v_0 \bf v,$ where $\bf v$ is chosen on the unit sphere in the supporting half space at $X_0$ from a uniform distribution, independently of $\bf u$. This rectilinear motion with constant velocity is a simplified model of spermatozoon motion in a bounded domain $\Omega$. Other models include diffusion on a graph and active motion on a graph.
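
To make the reflection rule concrete, the sketch below follows one trajectory in a two-dimensional disk and records the boundary points it hits. The disk geometry, the function name `simulate_billiard`, and the choice of a uniform angle in the inward half-plane (the two-dimensional analogue of the uniform choice in the supporting half space) are assumptions made for illustration.

```python
import math
import random


def simulate_billiard(start, radius, n_hits, rng=None):
    """Rectilinear motion in a disk with uniformly random re-emission at the wall.

    Returns the boundary points hit. The speed v0 only rescales time,
    so it does not enter the geometry.
    """
    rng = rng or random.Random()
    px, py = start
    angle = rng.uniform(0.0, 2.0 * math.pi)  # initial direction u, uniform on the circle
    ux, uy = math.cos(angle), math.sin(angle)
    hits = []
    for _ in range(n_hits):
        # Solve |p + s u| = radius for the non-negative root s.
        p_dot_u = px * ux + py * uy
        disc = p_dot_u ** 2 - (px * px + py * py - radius ** 2)
        s = -p_dot_u + math.sqrt(max(disc, 0.0))
        px, py = px + s * ux, py + s * uy
        hits.append((px, py))
        # New direction v: uniform angle within 90 degrees of the inward normal.
        inward = math.atan2(-py, -px)
        angle = inward + rng.uniform(-0.5 * math.pi, 0.5 * math.pi)
        ux, uy = math.cos(angle), math.sin(angle)
    return hits
```

A search time could be obtained from such a trajectory by accumulating the segment lengths divided by $v_0$ until a hit falls inside a chosen target arc; that extension is left out of the sketch.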

== Mathematical formulation: Computing the rate of arrival time for the fastest ==
The mathematical analysis of large numbers of molecules, which are obviously redundant in the traditional activation theory, is used to compute the in vivo time scale of stochastic chemical reactions. The computation relies on asymptotics or probabilistic approaches to estimate the mean time of the fastest to reach a small target in various geometries.

With N non-interacting i.i.d. Brownian trajectories (ions) in a bounded domain Ω that bind at a site, the shortest arrival time is by definition

$\tau^{1}=\min (t_1,\ldots,t_N),$ where $t_i$ are the independent arrival times of the N ions in the medium. The survival distribution of the arrival time of the fastest, $\Pr\{\tau^{1}>t\}$, is expressed in terms of a single particle: $\Pr\{\tau^{1}>t\}=\left[\Pr\{t_1>t\}\right]^N$. Here $\Pr\{t_{1}>t \}$ is the survival probability of a single particle prior to binding at the target. This probability is computed from the solution of the diffusion equation in a domain $\Omega$:

$\frac{\partial p(x,t)}{\partial t} =D \Delta p(x,t) \hbox { for } x \in \Omega, t>0$

$\begin{align}
p(x,0)&=p_0(x) \hbox{ for } x \in \Omega \\
\frac{\partial p}{\partial n}(x,t) &=0 \hbox{ for } x \in \partial \Omega_r\\
p(x,t)&=0 \hbox{ for } x \in \partial \Omega_a,
\end{align}$

where the boundary $\partial \Omega$ contains $N_R$ binding sites $\partial \Omega_i\subset\partial \Omega$ ($\partial \Omega_a=\bigcup\limits_{i=1}^{N_R}\partial\Omega_i,\ \partial\Omega_r=\partial\Omega-\partial\Omega_a$). The single particle survival probability is

$\Pr\{t_{1}>t \} =\int\limits_{\Omega} p(x,t)dx,$ so that $\Pr\{\tau^{1}=t \} = \frac{d}{dt}\Pr\{\tau^{1}<t \}=N(\Pr\{t_{1}>t \})^{N-1}\Pr\{t_{1}=t\},$ where

$\Pr\{t_{1}=t \}= {\oint_{\partial \Omega_a}} \frac{\partial p(x,t)}{\partial n}\, dS_{x}$ and, when the $N_R$ binding sites contribute equally, $\Pr\{t_{1}=t \}= N_R {\oint_{\partial \Omega_1}} \frac{\partial p(x,t)}{\partial n}\,dS_{x}$.

The probability density function (pdf) of the arrival time is

$\Pr\{\tau^{1}=t \} =N N_R \left[\int\limits_{\Omega} p(x,t)dx \right]^{N-1}\oint\limits_{\partial \Omega_1} \frac{\partial p(x,t)}{\partial n} dS_{x},$ which gives the mean first passage time (MFPT)

$\bar{\tau}^{1}=\int\limits_0 ^{\infty}\Pr\{\tau^{1}>t\} dt = \int\limits_0 ^{\infty} \left[ \Pr\{t_{1}>t\} \right]^N dt.$ The probability $\Pr\{t_{1}>t \}$ can be computed using short-time asymptotics of the diffusion equation as shown in the next sections.
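
The relation $\bar{\tau}^{1}=\int_0^{\infty}[\Pr\{t_1>t\}]^N dt$ can be evaluated numerically once a single-particle survival probability is available. The sketch below uses a plain trapezoidal rule; the function name, the truncation time `t_max`, and the exponential survival function used to check it are assumptions chosen for illustration and do not come from the diffusion model.

```python
import math


def mean_fastest_time(survival, n_particles, t_max, n_steps=100_000):
    """Trapezoidal estimate of the integral of survival(t)**n_particles over [0, t_max]."""
    h = t_max / n_steps
    total = 0.5 * (survival(0.0) ** n_particles + survival(t_max) ** n_particles)
    for k in range(1, n_steps):
        total += survival(k * h) ** n_particles
    return total * h


# Sanity check with a stand-in survival law exp(-rate * t), for which the
# minimum of N independent times has mean 1 / (N * rate).
rate, n_particles = 2.0, 10
estimate = mean_fastest_time(lambda t: math.exp(-rate * t), n_particles, t_max=5.0)
```

With exponential arrival times the mean of the fastest decays like $1/N$, in contrast with the much slower logarithmic decay obtained for diffusing particles.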

== Explicit computation in dimension 1 ==
The short-time asymptotics of the diffusion equation are based on the ray method approximation. For the half-line $[0,\infty)$, the survival pdf is the solution of

$\begin{align}
\frac{\partial p(x,t)}{\partial t}& =D \frac{\partial^2 p(x,t)}{\partial x^2}
\quad\mbox{ for } x>0,\ t>0 \\
p(x,0)&=\delta(x-a)\quad\mbox{ for }\ x>0,\quad p(0,t)=0\quad\mbox{ for } t>0,
\end{align}$

that is, $p(x,t) =\frac{1}{\sqrt{4D \pi t}}\left[\exp\left\{ - \frac{(x-a)^2}{4Dt}\right\}- \exp\left\{ - \frac{(x+a)^2}{4Dt}\right\}\right].$

The survival probability with D=1 is $\Pr\{t_{1}>t \}=\int\limits_{0}^{\infty} p(x,t)\,dx=1-\frac{2}{\sqrt{\pi}} \int\limits_{a/\sqrt{4t}}^{\infty}e^{-u^2}\,du$. To compute the MFPT, we expand the complementary error function

$\frac{2}{\sqrt{\pi}} \int\limits_{x}^{\infty}e^{-u^2}\,du =\frac{e^{-x^2}}{x\sqrt{\pi}}\left(1-\frac{1}{2x^2}+O(x^{-4})\right)\quad\mbox{for}\ x\gg1,$ which gives $\bar{\tau}^{1}=\int\limits_0 ^{\infty} \left[ \Pr\{t_{1}>t\} \right]^N dt \approx \int\limits_0 ^{\infty} \exp\left\{ N\ln\left(1-\frac{e^{-(a/\sqrt{4t})^2}}{(a/\sqrt{4t})\sqrt{\pi}}\right)\right\}\, dt \approx \frac{a^2}{4}\int\limits_0^{\infty} \exp \left\{ -N\frac{\sqrt{u}e^{-\frac{1}{u}}}{\sqrt{\pi}} \right\}du$,

leading (the main contribution of the integral is near 0), after restoring the diffusion coefficient D, to $\bar{\tau}^{1} \approx \frac{a^2}{4D\ln \frac{N}{\sqrt{\pi}}}\quad\mbox{for}\ N\gg1.$
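
The asymptotic formula can be compared with a direct Monte Carlo estimate. The sketch below relies on the fact that the survival probability $\Pr\{t_1>t\}=\mathrm{erf}(a/\sqrt{4Dt})$ is reproduced exactly by sampling $t_1=a^2/(2DZ^2)$ with $Z$ a standard normal variable, so no time stepping is needed. The function names, the number of trials, and the seed are illustrative assumptions.

```python
import math
import random


def asymptotic_mean_fastest(a, D, n_particles):
    """Leading-order formula a^2 / (4 D ln(N / sqrt(pi))), intended for N >> 1."""
    return a * a / (4.0 * D * math.log(n_particles / math.sqrt(math.pi)))


def sample_fastest_time(a, D, n_particles, rng):
    """One sample of min(t_1, ..., t_N) for particles started at a and absorbed at 0.

    Each t_i has the same law as a^2 / (2 D Z^2) with Z standard normal,
    so the minimum corresponds to the largest Z^2 among the N draws.
    """
    largest_z2 = max(rng.gauss(0.0, 1.0) ** 2 for _ in range(n_particles))
    return a * a / (2.0 * D * largest_z2)


def monte_carlo_mean_fastest(a, D, n_particles, n_trials, seed=0):
    """Average of n_trials independent samples of the fastest arrival time."""
    rng = random.Random(seed)
    samples = [sample_fastest_time(a, D, n_particles, rng) for _ in range(n_trials)]
    return sum(samples) / n_trials
```

Calling both functions with the same `a`, `D`, and a large `n_particles` gives two numbers that should agree only to leading order; the corrections to the asymptotic formula can be expected to shrink slowly as N grows.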

This result is reminiscent of Gumbel's law. Similarly, escape from the interval [0,a] is computed from the infinite sum

$p(x,t\,|\,y) =\frac{1}{\sqrt{ 4 D \pi t}}\sum\limits_{n=-\infty}^{\infty} \left[\exp \left\{ -\frac{(x-y+2na)^2}{4Dt} \right\} -\exp \left\{ -\frac{(x+y+2na)^2}{4Dt} \right\} \right]$. The conditional survival probability is approximated by

$\Pr\{t_{1}>t\,|\,y \}=\int\limits_{0}^{a} p(x,t\,|\,y)\,dx \sim 1-\max\frac{2\sqrt{t}}{\sqrt{\pi}}\left[\frac{e^{-y^2/4t}}{y},\frac{e^{-(a-y)^2/4t}}{a-y}\right] \quad\mbox{as}\ t\to0$ (with D=1), where the maximum occurs at $\delta=\min[y,a-y]$ for $0<y<a$ (the shortest ray from y to the boundary). All other integrals can be computed explicitly, leading to

$\bar{\tau}^{1}= \int\limits\limits_0 ^{\infty} \left[ \Pr\{t_{1}>t\} \right]^N dt \approx
\int\limits\limits_0 ^{\infty} \exp\left\{ N\ln\left(1-\frac{8\sqrt{t}}{\delta\sqrt{\pi}}
e^{-\delta^2/16t}\right) \right\}dt \approx \frac{\delta^2}{16D\ln\frac{2N}{\sqrt{\pi}}}\quad\mbox{for}\ N\gg1.$

== Arrival times of the fastest in higher dimensions ==
The arrival times of the fastest among many Brownian motions are expressed in terms of the shortest distance from the source S to the absorbing window A, measured by the distance $\delta_{min}=d(S,A),$ where d is the associated Euclidean distance. Interestingly, the trajectories followed by the fastest particles are as close as possible to the optimal trajectories. In technical language, the trajectories of the fastest among N concentrate near the optimal trajectory (shortest path) as the number N of particles increases. For a diffusion coefficient D and a window of size a, the expected first arrival times of N independent identically distributed Brownian particles initially positioned at the source S are given by the following asymptotic formulas:

$\bar\tau^{d1} \approx \frac{\delta^2_{min}}{4D\ln\left(\frac{N}{\sqrt{\pi}}\right)}, \hbox{ in dim 1, valid for } N \gg1,$

$\bar \tau^{d2} \approx \frac{\delta^2_{min}}{ 4 D \log \left(\frac{\pi
\sqrt{2}N}{8\log\left(\frac{1}{a}\right)}\right)}, \hbox{ in dim 2, for } \frac{N}{\log (\frac{1}{a})}\gg1,$

$\bar\tau^{d3} \approx \frac{\delta^2_{min}}{4D{\log\left(
N\frac{4a^2}{\pi^{1/2}\delta^2_{min}}\right)}}, \hbox{ in dim } 3, \hbox{ for } \frac{Na^2}{\delta^2_{min}}\gg1.$

These formulas show that the expected arrival time of the fastest particle in dimensions 1 and 2 is $O(1/\log N)$. They should be used instead of the classical forward rate in models of activation in biochemical reactions. The method used to derive these formulas is based on short-time asymptotics and the Green's function representation of the Helmholtz equation. Note that other distributions could lead to other decays with respect to N.
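
The three asymptotic expressions are straightforward to evaluate. The sketch below transcribes them as written above; the function names are hypothetical, and the caller is assumed to stay in the stated regimes of validity (in particular $a<1$ in dimension 2 and a positive logarithm in every case).

```python
import math


def tau_fastest_1d(delta_min, D, n):
    """delta^2 / (4 D ln(N / sqrt(pi))), for N >> 1."""
    return delta_min ** 2 / (4.0 * D * math.log(n / math.sqrt(math.pi)))


def tau_fastest_2d(delta_min, D, n, a):
    """delta^2 / (4 D log(pi sqrt(2) N / (8 log(1/a)))), for N / log(1/a) >> 1."""
    argument = math.pi * math.sqrt(2.0) * n / (8.0 * math.log(1.0 / a))
    return delta_min ** 2 / (4.0 * D * math.log(argument))


def tau_fastest_3d(delta_min, D, n, a):
    """delta^2 / (4 D log(4 N a^2 / (sqrt(pi) delta^2))), for N a^2 / delta^2 >> 1."""
    argument = 4.0 * n * a ** 2 / (math.sqrt(math.pi) * delta_min ** 2)
    return delta_min ** 2 / (4.0 * D * math.log(argument))
```

Evaluating any of these functions for increasing `n` illustrates how slowly the arrival time of the fastest particle decreases: multiplying N by a large factor only adds a constant to the logarithm in the denominator.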

== Optimal Paths ==
=== Optimal path for large N ===
The optimal paths for the fastest can be found using the Wentzell-Freidlin functional of large-deviation theory. These paths correspond to the short-time asymptotics of the diffusion equation from a source to a target. In general, the exact solution is hard to find, especially for a space containing a distribution of obstacles.

The Wiener integral representation of the pdf for a pure Brownian motion is obtained for zero drift and a constant diffusion tensor $\sigma=D$. It is given by the probability of a sampled path until it exits at the small window $\partial\Omega_a$ at the random time T:

$\Pr\{ x_M(t_{1,M})\in\Omega, x_M(t_{2,M})\in\Omega,\dots, x_M(t)=x,\ t\leq T\leq t+\Delta t \,|\, x(0)=y\}$

$=\int\limits_{\Omega} \cdots \int\limits_{\Omega}\prod_{j=1}^{M} \frac{dy_j}{\sqrt{(2\pi \Delta t)^n\det \sigma(x(t_{j-1,M}))}}
\exp \left\{ -\frac{1}{2\Delta t} \left[y_j-x(t_{j-1,M})- a(x(t_{j-1,M}))\Delta t \right]^T\sigma^{-1}(x(t_{j-1,M}))\left[y_j-x(t_{j-1,M})-a(x(t_{j-1,M}))\Delta t \right]\right\},$

where

$\Delta t=t/M,\ t_{j,M}=j\Delta t,\ x(t_{0,M})=y \hbox{ and } y_j=x(t_{j,M})$ in the product and T is the exit time in the narrow absorbing window $\partial\Omega_a.$ Finally,

$\langle\tau^{(n)}\rangle=\int\limits\limits_0 ^{\infty}\exp \left\{ n \log \int\limits_{\Omega} p(x,t|y)\,dx\right\} dt =\int_0 ^{\infty} \tau_{\sigma} Pr\{ \hbox{ Path }\sigma \in S_n (y), \tau_{\sigma}=t \} dt,$

where $S_n(y)$ is the ensemble of shortest paths selected among n Brownian trajectories, starting at point y and exiting between time t and t+dt from the domain $\Omega$. The probability $\Pr\{ \hbox{ Path }\sigma \in S_n \}$ is used to show that the empirical stochastic trajectories of $S_n$ concentrate near the shortest paths starting from y and ending at the small absorbing window $\partial \Omega_a$, under the condition that $\epsilon=\frac{|\partial \Omega_a|}{|\partial\Omega|} \ll 1$. The paths of $S_n(y)$ can be approximated using discrete broken lines among a finite number of points and we denote the associated ensemble by $\tilde S_n(y)$. Bayes' rule leads to

$\Pr\{ \hbox{ Path }\sigma \in \tilde S_n(y)\,|\, t<\tau_{\sigma}<t+dt \}=\sum_{m=0}^{\infty} \Pr\{ \hbox{ Path }\sigma \in \tilde S_n(y)\,|\,m, t<\tau_{\sigma}<t+dt \}\Pr\{ m \mbox{ steps}\},$

where $\Pr\{ m \mbox{ steps}\}=\Pr\{ \mbox{the paths of }\tilde S_n(y) \mbox{ exit in } m \mbox{ steps} \}$ is the probability that a path of $\tilde S_n(y)$ exits in m discrete time steps. A path made of broken lines (random walk with a time step $\Delta t$) can be expressed using the Wiener path integral. The probability of a Brownian path x(s) can be expressed in the limit of a path integral with the functional:

$Pr\{ x(s)| s\in[0,t] \} \approx \exp \left(-\int_{0}^t |\dot x|^2ds \right).$
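
For a broken-line path, the functional $\int_0^t |\dot x|^2 ds$ reduces to a finite sum over segments, which makes it easy to see why straight paths dominate: among broken lines with the same endpoints and duration, the straight one has the smallest action and hence the largest weight $\exp(-\int_0^t |\dot x|^2 ds)$. The sketch below illustrates this; the function names and the sample paths are assumptions chosen for the example.

```python
def discrete_action(path, dt):
    """Riemann-sum approximation of the integral of |dx/ds|^2 along a broken line.

    `path` is a list of points (tuples of coordinates) sampled every `dt`.
    """
    action = 0.0
    for p, q in zip(path, path[1:]):
        squared_step = sum((qi - pi) ** 2 for pi, qi in zip(p, q))
        action += squared_step / dt  # |step / dt|^2 * dt
    return action


def straight_path(start, end, m):
    """m equal segments from start to end."""
    return [tuple(s + (e - s) * k / m for s, e in zip(start, end)) for k in range(m + 1)]


# A straight path and a detour with the same endpoints and the same duration.
straight = straight_path((0.0, 0.0), (1.0, 0.0), 10)
detour = list(straight)
detour[5] = (0.5, 0.3)  # push the midpoint sideways
```

Comparing `discrete_action(straight, 0.1)` with `discrete_action(detour, 0.1)` shows the detour carrying the larger action, and therefore the smaller path weight.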

The survival probability conditioned on starting at $x_0$ is given by the Wiener representation:

$S(t|x_0)= \int_{x\in \Omega} dx \int_{x(0)}^{x(t)=x} {\mathcal D} (x)\exp \left(-\int_{0}^t |\dot x|^2ds \right),$

where ${\mathcal D} (x)$ is the limit Wiener measure: the exterior integral is taken over all end points x and the path integral is over all paths starting from x(0). When we consider n independent paths $(\sigma_1,\ldots,\sigma_n)$ (made of points with a time step $\Delta t$) that exit in m steps, the probability of such an event is

$\Pr \{ \sigma_1,\ldots,\sigma_n \in S_n(y)\,|\,m, \tau_{\sigma}=m \Delta t \}= \left(\int\limits_{y_0=y} \cdots \int\right)^n\int_{x} {\mathcal D}(x)\exp \Bigg \{-n\int\limits_0^{m \Delta t} \dot{x}^2ds \Bigg \}$. Indeed, when there are n paths of m steps, and the fastest one escapes in m steps, they should all exit in m steps. Using the limit of the path integral, we get heuristically the representation

<math>Pr \{ \hbox{ Path }\sigma \in \tilde S_n(y)|m, \tau_{\sigma}=m \Delta t \}= \left(\int\limits_
