# Mathematical modelling of infectious disease

Mathematical models can project how infectious diseases progress to show the likely outcome of an epidemic and help inform public health interventions. Models use some basic assumptions and mathematics to find parameters for various infectious diseases and use those parameters to calculate the effects of possible interventions, like mass vaccination programmes.

## History

The modelling of infectious diseases is a tool that has been used to study the mechanisms by which diseases spread, to predict the future course of an outbreak and to evaluate strategies to control an epidemic.[1]

The first scientist who systematically tried to quantify causes of death was John Graunt in his book Natural and Political Observations made upon the Bills of Mortality, in 1662. The bills he studied were listings of numbers and causes of deaths published weekly. Graunt’s analysis of causes of death is considered the beginning of the “theory of competing risks” which according to Daley and Gani [1] is “a theory that is now well established among modern epidemiologists”.

The earliest account of mathematical modelling of the spread of disease was carried out in 1766 by Daniel Bernoulli. Trained as a physician, Bernoulli created a mathematical model to defend the practice of inoculating against smallpox.[2] The calculations from this model showed that universal inoculation against smallpox would increase life expectancy from 26 years 7 months to 29 years 9 months.[3]

Daniel Bernoulli's work preceded the modern understanding of germ theory. It was not until the 20th century that pioneers in infectious disease modelling such as William Hamer[4] applied the law of mass action to explain epidemic behaviour. This was soon followed by the work of A. G. McKendrick and W. O. Kermack, who published the Kermack–McKendrick epidemic model (1927) to describe the relationship between susceptible, infected and immune individuals in a population. Modern theoretical epidemiology is often traced to the research of Ronald Ross into the spread of malaria; the Reed–Frost epidemic model followed in 1928.[5]

## Assumptions

Models are only as good as the assumptions on which they are based. If a model makes predictions which are out of line with observed results and the mathematics is correct, the initial assumptions must change to make the model useful.

• Rectangular and stationary age distribution, i.e., everybody in the population lives to age L and then dies, and for each age (up to L) there is the same number of people in the population. This is often well-justified for developed countries where there is a low infant mortality and much of the population lives to the life expectancy.
• Homogeneous mixing of the population, i.e., individuals of the population under scrutiny assort and make contact at random and do not mix mostly in a smaller subgroup. This assumption is rarely justified because social structure is widespread. For example, most people in London only make contact with other Londoners. Further, within London there are smaller subgroups, such as the Turkish community or teenagers (just to give two examples), whose members mix with each other more than with people outside their group. However, homogeneous mixing is a standard assumption to make the mathematics tractable.

## Types of epidemic models

### Stochastic

"Stochastic" means being or having a random variable. A stochastic model is a tool for estimating probability distributions of potential outcomes by allowing for random variation in one or more inputs over time. Stochastic models depend on the chance variations in risk of exposure, disease and other illness dynamics. They are used when these fluctuations are important, as in small populations.[6][7]
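As a rough illustration of why chance matters in small populations, the sketch below simulates individual infection and removal events with the Gillespie algorithm. The event rates (infection at rate βSI/N, removal at rate γI) are borrowed from the SIR model described later in this article. The function names, the population of 50 and the parameter values are assumptions chosen for illustration only, not taken from the cited sources.

```python
import random


def gillespie_sir(n, i0, beta, gamma, seed=None):
    """Simulate one stochastic SIR outbreak, one event at a time.

    Returns a list of (time, S, I, R) tuples. Illustrative sketch only.
    """
    rng = random.Random(seed)
    s, i, r = n - i0, i0, 0
    t = 0.0
    path = [(t, s, i, r)]
    while i > 0:
        infection_rate = beta * s * i / n  # S -> I events per unit time
        removal_rate = gamma * i           # I -> R events per unit time
        total_rate = infection_rate + removal_rate
        t += rng.expovariate(total_rate)   # waiting time to the next event
        if rng.random() < infection_rate / total_rate:
            s, i = s - 1, i + 1            # one new infection
        else:
            i, r = i - 1, r + 1            # one removal
        path.append((t, s, i, r))
    return path


def final_sizes(runs, n=50, i0=1, beta=1.5, gamma=1.0):
    """Total number ever infected in each of `runs` independent outbreaks."""
    return [gillespie_sir(n, i0, beta, gamma, seed=k)[-1][3] for k in range(runs)]


if __name__ == "__main__":
    # Identical parameters, different random draws: some outbreaks fizzle
    # out after a handful of cases while others reach much of the population.
    print(sorted(final_sizes(20)))
```

Running many replicates and looking at the spread of final sizes gives an empirical probability distribution of outcomes, which a single deterministic trajectory cannot show.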

### Deterministic

When dealing with large populations, as in the case of tuberculosis, deterministic or compartmental mathematical models are often used. In a deterministic model, individuals in the population are assigned to different subgroups or compartments, each representing a specific stage of the epidemic. Letters such as M, S, E, I, and R are often used to represent different stages.

The transition rates from one class to another are mathematically expressed as derivatives, hence the model is formulated using differential equations. While building such models, it must be assumed that the population size in a compartment is differentiable with respect to time and that the epidemic process is deterministic. In other words, the changes in population of a compartment can be calculated using only the history used to develop the model.[5]

## Reproduction number

There is a threshold quantity which determines whether an epidemic occurs or the disease simply dies out. This quantity is called the basic reproduction number, denoted by R0, which can be defined as the number of secondary infections caused by a single infective introduced into a population made up entirely of susceptible individuals (S(0) ≈ N) over the course of the infection of this single infective. This infectious individual makes β contacts per unit time producing new infections with a mean infectious period of 1/γ. Therefore, the basic reproduction number is

${\displaystyle R_{0}={\frac {\beta }{\gamma }}.}$

This value quantifies the transmission potential of a disease. If the basic reproduction number falls below one (R0 < 1), i.e. the infective may not pass the infection on during the infectious period, the infection dies out. If R0 > 1 there is an epidemic in the population. In cases where R0 = 1, the disease becomes endemic, meaning the disease remains in the population at a consistent rate, as one infected individual transmits the disease to one susceptible.
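The threshold behaviour can be written as a few lines of Python. This is a minimal sketch: the function names and the numerical tolerance used to decide when R0 equals one are assumptions, and the parameter values are illustrative rather than estimates for any real disease.

```python
def basic_reproduction_number(beta, gamma):
    """R0 = beta / gamma: contact rate times the mean infectious period 1/gamma."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return beta / gamma


def classify_outbreak(r0, tol=1e-9):
    """Apply the threshold rule described in the text."""
    if r0 > 1 + tol:
        return "epidemic"
    if r0 < 1 - tol:
        return "dies out"
    return "endemic"


if __name__ == "__main__":
    r0 = basic_reproduction_number(beta=0.5, gamma=0.25)
    print(r0, classify_outbreak(r0))  # 2.0 epidemic
```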

In cases of diseases with varying latent periods, the basic reproduction number can be calculated as the sum of the reproduction numbers for each transition time into the disease. An example of this is tuberculosis. Blower et al.[8] calculated from a simple model of TB the following reproduction number:

${\displaystyle R_{0}=R_{0}^{\text{FAST}}+R_{0}^{\text{SLOW}}}$

In their model, it is assumed that the infected individuals can develop active TB by either direct progression (the disease develops immediately after infection) considered above as FAST tuberculosis or endogenous reactivation (the disease develops years after the infection) considered above as SLOW tuberculosis.

An infectious disease is said to be endemic when it can be sustained in a population without the need for external inputs. This means that, on average, each infected person is infecting exactly one other person (any more and the number of people infected will grow exponentially and there will be an epidemic, any less and the disease will die out). In mathematical terms, that is:

${\displaystyle \ R_{0}S\ =1.}$

The basic reproduction number (R0) of the disease, assuming everyone is susceptible, multiplied by the proportion of the population that is actually susceptible (S) must be one (since those who are not susceptible do not feature in our calculations as they cannot contract the disease). Notice that this relation means that for a disease to be in the endemic steady state, the higher the basic reproduction number, the lower the proportion of the population susceptible must be, and vice versa.

Assume the rectangular stationary age distribution and let also the ages of infection have the same distribution for each birth year. Let the average age of infection be A, for instance when individuals younger than A are susceptible and those older than A are immune (or infectious). Then it can be shown by an easy argument that the proportion of the population that is susceptible is given by:

${\displaystyle S={\frac {A}{L}}.}$

But the mathematical definition of the endemic steady state can be rearranged to give:

${\displaystyle S={\frac {1}{R_{0}}}.}$

Therefore, due to the transitive property:

${\displaystyle {\frac {1}{R_{0}}}={\frac {A}{L}}\Rightarrow R_{0}={\frac {L}{A}}.}$

This provides a simple way to estimate the parameter R0 using easily available data.

For a population with an exponential age distribution,

${\displaystyle R_{0}=1+{\frac {L}{A}}.}$

This allows the basic reproduction number of a disease to be estimated from A and L in either type of population distribution.
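Both estimates are one-line calculations. The sketch below uses hypothetical function names, and the values of L and A in the example are made up for illustration.

```python
def r0_rectangular(life_expectancy, mean_age_at_infection):
    """R0 = L / A for a rectangular, stationary age distribution."""
    return life_expectancy / mean_age_at_infection


def r0_exponential(life_expectancy, mean_age_at_infection):
    """R0 = 1 + L / A for an exponential age distribution."""
    return 1 + life_expectancy / mean_age_at_infection


if __name__ == "__main__":
    L, A = 70, 5  # illustrative values in years, not real data
    print(r0_rectangular(L, A))   # 14.0
    print(r0_exponential(L, A))   # 15.0
```

For a given A and L the two age distributions differ by exactly one in the estimated R0, so the choice of distribution matters most when L/A is small.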

## Modelling epidemics

The SIR model is one of the more basic models used for modelling epidemics. There are a large number of modifications to the model.

### The SIR model

In 1927, W. O. Kermack and A. G. McKendrick created a model in which they considered a fixed population with only three compartments: susceptible, ${\displaystyle S(t)}$; infected, ${\displaystyle I(t)}$; and removed, ${\displaystyle R(t)}$. The three classes are defined as follows:[9]

• ${\displaystyle S(t)}$ is used to represent the number of individuals not yet infected with the disease at time t, or those susceptible to the disease.
• ${\displaystyle I(t)}$ denotes the number of individuals who have been infected with the disease and are capable of spreading the disease to those in the susceptible category.
• ${\displaystyle R(t)}$ is the compartment used for those individuals who have been infected and then removed from the disease, either due to immunization or due to death. Those in this category are not able to be infected again or to transmit the infection to others.

The flow of this model may be considered as follows:

${\displaystyle {\color {blue}{{\mathcal {S}}\rightarrow {\mathcal {I}}\rightarrow {\mathcal {R}}}}}$

Using a fixed population, ${\displaystyle N=S(t)+I(t)+R(t)}$, Kermack and McKendrick derived the following equations:

${\displaystyle {\frac {dS}{dt}}=-{\frac {\beta SI}{N}}}$
${\displaystyle {\frac {dI}{dt}}={\frac {\beta SI}{N}}-\gamma I}$
${\displaystyle {\frac {dR}{dt}}=\gamma I}$

Several assumptions were made in the formulation of these equations. First, every individual in the population is considered to have the same probability as every other individual of contracting the disease, with ${\displaystyle \beta }$ as the contact or infection rate of the disease. An infected individual therefore makes ${\displaystyle \beta }$ contacts per unit time that are sufficient to transmit the disease, and the fraction of those contacts that are with a susceptible is ${\displaystyle S/N}$. The number of new infections in unit time per infective is then ${\displaystyle \beta (S/N)}$, giving the rate of new infections (or those leaving the susceptible category) as ${\displaystyle \beta (S/N)I=\beta SI/N}$.[5] For the second and third equations, the number leaving the susceptible class is equal to the number entering the infected class. However, a fraction ${\displaystyle \gamma }$ of infectives leaves this class per unit time to enter the removed class, where ${\displaystyle \gamma }$ is the mean recovery/death rate and ${\displaystyle 1/\gamma }$ is the mean infective period. These processes, which occur simultaneously, are described by the law of mass action, a widely accepted idea that the rate of contact between two groups in a population is proportional to the size of each of the groups concerned.[1] Finally, it is assumed that infection and recovery happen on a much faster time scale than births and deaths, and therefore these factors are ignored in this model.
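The three equations can be integrated numerically. The sketch below uses a hand-written fourth-order Runge–Kutta step in plain Python. The integration scheme, step size, function names and parameter values (N = 1000, β = 0.5, γ = 0.25) are all assumptions made for illustration and are not part of Kermack and McKendrick's formulation.

```python
def sir_derivatives(state, beta, gamma):
    """Right-hand sides dS/dt, dI/dt, dR/dt of the SIR equations."""
    s, i, r = state
    n = s + i + r                      # fixed population N
    new_infections = beta * s * i / n  # flow S -> I
    removals = gamma * i               # flow I -> R
    return (-new_infections, new_infections - removals, removals)


def _shift(state, deriv, h):
    """Return state + h * deriv, component by component."""
    return tuple(x + h * d for x, d in zip(state, deriv))


def simulate_sir(s0, i0, removed0, beta, gamma, dt=0.1, steps=1000):
    """Integrate the SIR model with fixed-step RK4; returns a list of (S, I, R)."""
    state = (float(s0), float(i0), float(removed0))
    trajectory = [state]
    for _ in range(steps):
        k1 = sir_derivatives(state, beta, gamma)
        k2 = sir_derivatives(_shift(state, k1, dt / 2), beta, gamma)
        k3 = sir_derivatives(_shift(state, k2, dt / 2), beta, gamma)
        k4 = sir_derivatives(_shift(state, k3, dt), beta, gamma)
        state = tuple(
            x + dt / 6 * (a + 2 * b + 2 * c + d)
            for x, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    trajectory = simulate_sir(999, 1, 0, beta=0.5, gamma=0.25)
    peak_s, peak_i, _ = max(trajectory, key=lambda state: state[1])
    print(f"peak infected: {peak_i:.1f} when S is about {peak_s:.0f}")
```

With these values R0 = β/γ = 2, and the number of infectives peaks when S has fallen to roughly N/R0 = 500, which is the point at which dI/dt changes sign.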

### Other compartmental models

There are a large number of modifications of the SIR model, including those that include births and deaths, where upon recovery there is no immunity (SIS model), where immunity lasts only for a short period of time (SIRS), where there is a latent period of the disease where the person is not infectious (SEIS and SEIR), and where infants can be born with immunity (MSIR).
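As one example of such a modification, an SIS model can be sketched by returning recovered individuals straight to the susceptible class. The equations in the docstring are the usual textbook form and are an assumption here, since this article does not state them. The forward Euler integration, the function name and the parameter values are likewise illustrative.

```python
def simulate_sis(n, i0, beta, gamma, dt=0.1, steps=2000):
    """Integrate an SIS model with forward Euler steps.

    Assumed equations (not given in the article):
        dS/dt = -beta*S*I/N + gamma*I
        dI/dt =  beta*S*I/N - gamma*I
    Since S = N - I, only I needs to be tracked.
    """
    i = float(i0)
    history = [i]
    for _ in range(steps):
        s = n - i
        i += dt * (beta * s * i / n - gamma * i)
        history.append(i)
    return history


if __name__ == "__main__":
    history = simulate_sis(n=1000, i0=1, beta=0.5, gamma=0.25)
    # With R0 = 2 the infected count settles near N * (1 - 1/R0).
    print(round(history[-1], 3))
```

Under these assumptions the long-run susceptible fraction is 1/R0, the same endemic steady-state condition used in the reproduction number section.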

## Infectious disease dynamics

Mathematical models need to integrate the increasing volume of data being generated on host-pathogen interactions. Many theoretical studies of the population dynamics, structure and evolution of infectious diseases of plants and animals, including humans, are concerned with this problem.[citation needed]

## Mathematics of mass vaccination

If the proportion of the population that is immune exceeds the herd immunity level for the disease, then the disease can no longer persist in the population. Thus, if this level can be exceeded by vaccination, the disease can be eliminated. An example of this being successfully achieved worldwide is the global smallpox eradication, with the last wild case in 1977. The WHO is carrying out a similar vaccination campaign to eradicate polio.[citation needed]

The herd immunity level will be denoted q. Recall that, for a stable state:

${\displaystyle \ R_{0}\cdot S=1.}$

S will be (1 − q), since q is the proportion of the population that is immune and q + S must equal one (since in this simplified model, everyone is either susceptible or immune). Then:

${\displaystyle \ R_{0}\cdot (1-q)=1,}$
${\displaystyle 1-q={\frac {1}{R_{0}}},}$
${\displaystyle q=1-{\frac {1}{R_{0}}}.}$

Remember that this is the threshold level. If the proportion of immune individuals exceeds this level due to a mass vaccination programme, the disease will die out.

We have just calculated the critical immunisation threshold (denoted qc). It is the minimum proportion of the population that must be immunised at birth (or close to birth) in order for the infection to die out in the population.

${\displaystyle q_{c}=1-{\frac {1}{R_{0}}}}$
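The threshold is straightforward to compute. In the sketch below the function name is hypothetical, the handling of R0 ≤ 1 (returning zero, since no vaccination is needed for the infection to die out) is an assumption, and the R0 values are illustrative rather than estimates for particular diseases.

```python
def critical_immunisation_threshold(r0):
    """q_c = 1 - 1/R0, the minimum immune proportion needed for elimination."""
    if r0 <= 0:
        raise ValueError("R0 must be positive")
    if r0 <= 1:
        return 0.0  # assumption: infection already dies out without vaccination
    return 1 - 1 / r0


if __name__ == "__main__":
    for r0 in (2, 4):
        print(r0, critical_immunisation_threshold(r0))
    # 2 0.5
    # 4 0.75
```

The threshold rises quickly with R0, so highly transmissible infections require coverage close to the whole population.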

### When mass vaccination cannot exceed the herd immunity

If the vaccine used is insufficiently effective or the required coverage cannot be reached (for example due to popular resistance), the programme may fail to exceed qc. Such a programme can, however, disturb the balance of the infection without eliminating it, often causing unforeseen problems.

Suppose that a proportion of the population q (where q < qc) is immunised at birth against an infection with R0 > 1. The vaccination programme changes R0 to Rq where

${\displaystyle \ R_{q}=R_{0}(1-q)}$

This change occurs simply because there are now fewer susceptibles in the population who can be infected. Rq is simply R0 minus those that would normally be infected but that cannot be now since they are immune.

As a consequence of this lower reproduction number, the average age of infection A will also change to some new value Aq in those who have been left unvaccinated.

Recall the relation that linked R0, A and L. Assuming that life expectancy has not changed, now:

${\displaystyle \ R_{q}={\frac {L}{A_{q}}},}$
${\displaystyle \ A_{q}={\frac {L}{R_{q}}}={\frac {L}{R_{0}(1-q)}}.}$

But R0 = L/A so:

${\displaystyle \ {A_{q}}={\frac {L}{(L/A)(1-q)}}={\frac {AL}{L(1-q)}}={\frac {A}{1-q}}.}$

Thus the vaccination programme will raise the average age of infection, another mathematical justification for a result that might have been intuitively obvious. Unvaccinated individuals now experience a reduced force of infection due to the presence of the vaccinated group.
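The two relations derived above can be checked numerically. The function names and the example values (R0 = 4, A = 5 years, q = 0.5) are assumptions for illustration; q = 0.5 is below the critical threshold of 0.75 for this R0.

```python
def reproduction_number_after_vaccination(r0, q):
    """R_q = R0 * (1 - q) when a proportion q is immunised at birth."""
    if not 0 <= q < 1:
        raise ValueError("q must satisfy 0 <= q < 1")
    return r0 * (1 - q)


def mean_age_at_infection_after_vaccination(a, q):
    """A_q = A / (1 - q): average age at infection among the unvaccinated."""
    if not 0 <= q < 1:
        raise ValueError("q must satisfy 0 <= q < 1")
    return a / (1 - q)


if __name__ == "__main__":
    r0, a, q = 4, 5, 0.5  # illustrative values only
    print(reproduction_number_after_vaccination(r0, q))   # 2.0
    print(mean_age_at_infection_after_vaccination(a, q))  # 10.0
```

In this example, halving the susceptible pool halves the reproduction number and doubles the average age at infection, with L = R0 · A = Rq · Aq = 20 unchanged.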

However, it is important to consider this effect when vaccinating against diseases that are more severe in older people. A vaccination programme against such a disease that does not exceed qc may cause more deaths and complications than there were before the programme was brought into force as individuals will be catching the disease later in life. These unforeseen outcomes of a vaccination programme are called perverse effects.[citation needed]

### When mass vaccination exceeds the herd immunity

If a vaccination programme causes the proportion of immune individuals in a population to exceed the critical threshold for a significant length of time, transmission of the infectious disease in that population will stop. This is known as elimination of the infection and is different from eradication.[citation needed]

• Elimination: Interruption of endemic transmission of an infectious disease, which occurs if each infected individual infects fewer than one other on average. It is achieved by maintaining vaccination coverage to keep the proportion of immune individuals above the critical immunisation threshold.
• Eradication: Reduction of infective organisms in the wild worldwide to zero. So far, this has only been achieved for smallpox and rinderpest. To get to eradication, elimination in all world regions must be achieved.

## References

1. Daley, D. J. & Gani, J. (2005). Epidemic Modeling: An Introduction. NY: Cambridge University Press.
2. Hethcote, H. W. (2000). "The mathematics of infectious diseases." SIAM Review, 42, 599–653.
3. Bernoulli, D. & Blower, S. (2004). "An attempt at a new analysis of the mortality caused by smallpox and of the advantages of inoculation to prevent it." Reviews in Medical Virology, 14, 275–288.
4. Hamer, W. (1928). Epidemiology Old and New. London: Kegan Paul.
5. Brauer, F. & Castillo-Chávez, C. (2001). Mathematical Models in Population Biology and Epidemiology. NY: Springer.
6. Trottier, H., & Philippe, P. (2001). "Deterministic modeling of infectious diseases: theory and methods." The Internet Journal of Infectious Diseases.
7. Nakamura, G. M., Monteiro, A. C. P., Cardoso, G. C., & Martinez, A. S. (2017). "Efficient method for comprehensive computation of agent-level epidemic dissemination in networks." Scientific Reports, 7, 40885.
8. Blower, S. M., Mclean, A. R., Porco, T. C., Small, P. M., Hopewell, P. C., Sanchez, M. A., et al. (1995). "The intrinsic transmission dynamics of tuberculosis epidemics." Nature Medicine, 1, 815–821.
9. Kermack, W. O.; McKendrick, A. G. (1927). "A Contribution to the Mathematical Theory of Epidemics". Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences. 115 (772): 700. Bibcode:1927RSPSA.115..700K. JSTOR 94815. doi:10.1098/rspa.1927.0118.