# Branching process

In probability theory, a branching process is a Markov process that models a population in which each individual in generation n produces some random number of individuals in generation n + 1, according, in the simplest case, to a fixed probability distribution that does not vary from individual to individual.[1] Branching processes are used to model reproduction; for example, the individuals might correspond to bacteria, each of which generates 0, 1, or 2 offspring with some probability in a single time unit. Branching processes can also be used to model other systems with similar dynamics, e.g., the spread of surnames in genealogy or the propagation of neutrons in a nuclear reactor.

A central question in the theory of branching processes is the probability of ultimate extinction, where no individuals exist after some finite number of generations. It is not hard to show that, starting with one individual in generation zero, the expected size of generation n equals μn where μ is the expected number of children of each individual. If μ < 1, then the expected number of individuals goes rapidly to zero, which implies ultimate extinction with probability 1 by Markov's inequality. Alternatively, if μ > 1, then the probability of ultimate extinction is less than 1 (but not necessarily zero; consider a process where each individual either dies without issue or has 100 children with equal probability). If μ = 1, then ultimate extinction occurs with probability 1 unless each individual always has exactly one child.

In theoretical ecology, the parameter μ of a branching process is called the basic reproductive rate.

## Mathematical formulation

The most common formulation of a branching process is that of the Galton–Watson process. Let Zn denote the state in period n (often interpreted as the size of generation n), and let Xn,i be a random variable denoting the number of direct successors of member i in period n, where Xn,i are independent and identically distributed random variables over all n ∈ 0, 1, 2, ...} and i ∈ {1, ..., Zn}. Then the recurrence equation is

$Z_{n+1} = \sum_{i=1}^{Z_n} X_{n,i}$

with Z0 = 1. Alternatively, one can formulate a branching process as a random walk. Let Si denote the state in period i, and let Xi be a random variable that is iid over all i. Then the recurrence equation is

$S_{i+1} = S_i+X_{i+1}-1 = \sum_{j=1}^{i+1} X_j-i$

with S0 = 1. To gain some intuition for this formulation, one can imagine a walk where the goal is to visit every node, but every time a previously unvisited node is visited, additional nodes are revealed that must also be visited. Let Si represent the number of revealed but unvisited nodes in period i, and let Xi represent the number of new nodes that are revealed when node i is visited. Then in each period, the number of revealed but unvisited nodes equals the number of such nodes in the previous period, plus the new nodes that are revealed when visiting a node, minus the node that is visited. The process ends once all revealed nodes have been visited.

## Extinction problem

The ultimate extinction probability is given by

$\lim_{n \to \infty} \Pr(Z_n=0).$

For any nontrivial cases (trivial cases are ones in which the probability of having no offspring is zero for every member of the population - in such cases the probability of ultimate extinction is 0), the probability of ultimate extinction equals one if μ ≤ 1 and strictly less than one if μ > 1.

The process can be analyzed using the method of probability generating function. Let p0p1p2, ... are the probabilities of producing 0, 1, 2, ... offspring by each individual in each generation. Let dm be the extinction probability by the mth generation. Obviously, d0= 0. Since the probabilities for all paths that lead to 0 by the m-th generation must be added up, the extinction probability is nondecreasing in generations. That is,

$0=d_0 \leq d_1\leq d_2 \leq \cdots \leq 1.$

Therefore, dm converges to a limit d, and d is the ultimate extinction probability. If there are j offspring in the first generation, then to die out by the mth generation, each of these lines must die out in m-1 generations. Since they proceed independently, the probability is (dm−1)j. Thus,

$d_m=p_0+p_1d_{m-1}+p_2(d_{m-1})^2+p_3(d_{m-1})^3+\cdots. \,$

The right-hand side of the equation is probability generating function. Let h(z) be the ordinary generating function for pi:

$h(z)=p_0+p_1z+p_2z^2+\cdots. \,$

Using the generating function, the previous equation becomes

$d_m=h(d_{m-1}). \,$

Since dmd, d can be found by solving

$d=h(d). \,$

This is also equivalent to finding the intersection point(s) of lines y = z and y = h(z) for z ≥ 0. y = z is a straight line. y = h(z) is an increasing (since $h'(z) = p_1 + 2 p_2 z + 3 p_3 z^2 + \cdots \geq 0$) and convex (since $h''(z) = 2 p_2 + 6 p_3 z + 12 p_4 z^2 + \cdots \geq 0$) function. There are at most two intersection points. Since (1,1) is always an intersect point for the two functions, there only exist three cases:

Three cases of y = h(z) intersect with y = z.

Case 1 has another intersect point at z < 1 (see the red curve in the graph).

Case 2 has only one intersect point at z = 1.(See the green curve in the graph)

Case 3 has another intersect point at z > 1.(See the black curve in the graph)

In case 1, the ultimate extinction probability is strictly less than one. For case 2 and 3, the ultimate extinction probability equals to one.

By observing that h′(1) = p1 + 2p2 + 3p3 + ... = μ is exactly the expected number of offspring a parent could produce, it can be concluded that for a branching process with generating function h(z) for the number of offspring of a given parent, if the mean number of offspring produced by a single parent is less than or equal to one, then the ultimate extinction probability is one. If the mean number of offspring produced by a single parent is greater than one, then the ultimate extinction probability is strictly less than one.

## Example of extinction problem

Consider a parent can produce at most two offspring and the probabilities for the number produced are p0 = 0.1, p1 = 0.6, and p2 = 0.3. The extinction probability in each generation is

$d_m=p_0+p_1d_{m-1}+p_2(d_{m-1})^2. \,$

with d0 = 0. Here, the extinction probability is calculated from generation 1 to generation 20. The result is shown in the table.

The extinction probability converges to the ultimate extinction probability as time goes.

For the ultimate extinction probability, we need to find d which satisfies d = p0 + p1d + p2d2. In this example, d = 1/3. This is exactly what the probabilities in the table converges to.