
Mathematical modelling of infectious diseases

From Wikipedia, the free encyclopedia

Mathematical models can project how infectious diseases progress to show the likely outcome of an epidemic (including in plants) and help inform public health and plant health interventions. Models use basic assumptions or collected statistics along with mathematics to find parameters for various infectious diseases and use those parameters to calculate the effects of different interventions, like mass vaccination programs. Modelling can help decide which interventions to avoid and which to trial, and can predict future growth patterns.


The modelling of infectious diseases is a tool that has been used to study the mechanisms by which diseases spread, to predict the future course of an outbreak and to evaluate strategies to control an epidemic.[1]

The first scientist who systematically tried to quantify causes of death was John Graunt in his book Natural and Political Observations made upon the Bills of Mortality, in 1662. The bills he studied were listings of numbers and causes of deaths published weekly. Graunt's analysis of causes of death is considered the beginning of the "theory of competing risks" which according to Daley and Gani [1] is "a theory that is now well established among modern epidemiologists".

The earliest account of mathematical modelling of spread of disease was carried out in 1760 by Daniel Bernoulli. Trained as a physician, Bernoulli created a mathematical model to defend the practice of inoculating against smallpox.[2] The calculations from this model showed that universal inoculation against smallpox would increase the life expectancy from 26 years 7 months to 29 years 9 months.[3] Daniel Bernoulli's work preceded the modern understanding of germ theory.[4]

In the early 20th century, William Hamer[5] and Ronald Ross[6] applied the law of mass action to explain epidemic behaviour.

The 1920s saw the emergence of compartmental models. The Kermack–McKendrick epidemic model (1927) and the Reed–Frost epidemic model (1928) both describe the relationship between susceptible, infected and immune individuals in a population. The Kermack–McKendrick epidemic model was successful in predicting the behavior of outbreaks very similar to that observed in many recorded epidemics.[7]

Recently, agent-based models (ABMs) have been used in place of simpler compartmental models.[8] For example, epidemiological ABMs have been used to inform public health (nonpharmaceutical) interventions against the spread of SARS-CoV-2.[9] Despite their complexity and high computational cost, epidemiological ABMs have been criticized for relying on simplifying and unrealistic assumptions.[10][11] Still, they can be useful in informing decisions regarding mitigation and suppression measures when they are accurately calibrated.[12]


Models are only as good as the assumptions on which they are based. If a model makes predictions that are out of line with observed results and the mathematics is correct, the initial assumptions must change to make the model useful.[13]

  • Rectangular and stationary age distribution, i.e., everybody in the population lives to age L and then dies, and for each age (up to L) there is the same number of people in the population. This is often well-justified for developed countries where there is a low infant mortality and much of the population lives to the life expectancy.
  • Homogeneous mixing of the population, i.e., individuals of the population under scrutiny assort and make contact at random and do not mix mostly in a smaller subgroup. This assumption is rarely justified because social structure is widespread. For example, most people in London only make contact with other Londoners. Further, within London there are smaller subgroups, such as the Turkish community or teenagers (just to give two examples), who mix with each other more than with people outside their group. However, homogeneous mixing is a standard assumption to make the mathematics tractable.

Types of epidemic models


"Stochastic" means being or having a random variable. A stochastic model is a tool for estimating probability distributions of potential outcomes by allowing for random variation in one or more inputs over time. Stochastic models depend on the chance variations in risk of exposure, disease and other illness dynamics. Statistical agent-level disease dissemination in small or large populations can be determined by stochastic methods.[14] [15][16]

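As a minimal sketch of a stochastic model, the following Python code simulates the Reed–Frost chain-binomial model mentioned in the history above. The function names, the population size, the per-contact infection probability `p` and the number of runs are all illustrative assumptions, not values taken from any source. Repeating the simulation gives a distribution of final outbreak sizes rather than a single number, which is the defining feature of a stochastic model.

```python
import random


def reed_frost(s0, i0, p, rng, max_generations=1000):
    """Simulate one Reed-Frost epidemic; return [(susceptible, infectious), ...] per generation."""
    s, i = s0, i0
    history = [(s, i)]
    for _ in range(max_generations):
        if i == 0:
            break
        # A susceptible escapes each of the i infectives independently with probability 1 - p.
        p_infection = 1.0 - (1.0 - p) ** i
        new_cases = sum(1 for _ in range(s) if rng.random() < p_infection)
        s -= new_cases
        i = new_cases  # infectives are removed after exactly one generation
        history.append((s, i))
    return history


def final_sizes(runs, s0, i0, p, seed=0):
    """Number of initially susceptible individuals ever infected, for each of `runs` simulations."""
    rng = random.Random(seed)
    return [s0 - reed_frost(s0, i0, p, rng)[-1][0] for _ in range(runs)]


# Hypothetical parameters chosen only for illustration.
sizes = final_sizes(runs=200, s0=100, i0=1, p=0.02)
print("smallest outbreak:", min(sizes), "largest outbreak:", max(sizes))
```

With the same parameters, some runs die out almost immediately while others infect a large share of the population, so the spread of `sizes` is itself a result of the model.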

When dealing with large populations, as in the case of tuberculosis, deterministic or compartmental mathematical models are often used. In a deterministic model, individuals in the population are assigned to different subgroups or compartments, each representing a specific stage of the epidemic.[17]

The transition rates from one class to another are mathematically expressed as derivatives, hence the model is formulated using differential equations. While building such models, it must be assumed that the population size in a compartment is differentiable with respect to time and that the epidemic process is deterministic. In other words, the changes in population of a compartment can be calculated using only the history that was used to develop the model.[7]

Sub-exponential growth

A common explanation for the growth of epidemics holds that 1 person infects 2, those 2 infect 4, and so on, with the number of infected doubling every generation. It is analogous to a game of tag where 1 person tags 2, those 2 tag 4 others who have never been tagged, and so on. As this game progresses it becomes increasingly frenetic as the tagged run past the previously tagged to hunt down those who have never been tagged. Thus this model of an epidemic leads to a curve that grows exponentially until it crashes to zero once the whole population has been infected, i.e. there is no herd immunity, and no peak and gradual decline as seen in reality.[18]
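The tag-game picture can be sketched in a few lines of Python. This is only an illustration of the naive doubling argument described above, with a hypothetical population of 1,000 and a made-up function name; it is not a model from the cited literature.

```python
def tag_game(population, initial=1):
    """New cases per generation when every case infects two never-infected people."""
    new_cases = initial
    total = initial
    generations = [new_cases]
    while total < population:
        # Doubling continues until there are not enough untagged people left.
        new_cases = min(2 * new_cases, population - total)
        total += new_cases
        generations.append(new_cases)
    return generations


print(tag_game(1000))
```

The counts double every generation and then stop abruptly when nobody is left to infect, which is the unrealistic "crash" the paragraph describes.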

Epidemic Models on Networks

Epidemics can be modeled as diseases spreading over networks of contact between people. Such a network can be represented mathematically with a graph and is called the contact network.[19] Every node in a contact network is a representation of an individual and each link (edge) between a pair of nodes represents the contact between them. Links in the contact networks may be used to transmit the disease between the individuals and each disease has its own dynamics on top of its contact network. The combination of disease dynamics under the influence of interventions, if any, on a contact network may be modeled with another network, known as a transmission network. In a transmission network, all the links are responsible for transmitting the disease. If such a network is a locally tree-like network, meaning that any local neighborhood in such a network takes the form of a tree, then the basic reproduction number can be written in terms of the average excess degree of the transmission network such that:

$$R_0 = \frac{\langle k^2 \rangle}{\langle k \rangle} - 1,$$

where $\langle k \rangle$ is the mean-degree (average degree) of the network and $\langle k^2 \rangle$ is the second moment of the transmission network degree distribution. It is, however, not always straightforward to find the transmission network out of the contact network and the disease dynamics.[20] For example, if a contact network can be approximated with an Erdős–Rényi graph with a Poissonian degree distribution, and the disease spreading parameters are as defined in the example above, such that $\beta$ is the transmission rate per person and the disease has a mean infectious period of $1/\gamma$, then the basic reproduction number is $R_0 = \frac{\beta}{\gamma}\langle k \rangle$[21][22] since $\langle k^2 \rangle - \langle k \rangle^2 = \langle k \rangle$ for a Poisson distribution.
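The excess-degree relation can be checked numerically. The sketch below assumes the standard form R0 = ⟨k²⟩/⟨k⟩ − 1 for a locally tree-like transmission network and evaluates it for two degree distributions; the function names and the example distributions are illustrative assumptions. For a Poisson degree distribution the result should equal the mean degree, because the variance of a Poisson distribution equals its mean.

```python
import math


def r0_from_degree_distribution(pmf):
    """pmf[k] is the probability that a node of the transmission network has degree k."""
    mean_k = sum(k * p for k, p in enumerate(pmf))
    mean_k2 = sum(k * k * p for k, p in enumerate(pmf))
    return mean_k2 / mean_k - 1.0  # average excess degree


def poisson_pmf(mean, k_max):
    """Poisson probabilities for k = 0..k_max (truncated; the tail is negligible for large k_max)."""
    return [math.exp(-mean) * mean ** k / math.factorial(k) for k in range(k_max + 1)]


regular = [0.0, 0.0, 0.0, 1.0]             # hypothetical network where every node has degree 3
poisson = poisson_pmf(mean=4.0, k_max=60)  # hypothetical Poisson network with mean degree 4

print(r0_from_degree_distribution(regular))  # 2.0
print(r0_from_degree_distribution(poisson))  # approximately 4.0, the mean degree
```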

Reproduction number

The basic reproduction number (denoted by R0) is a measure of how transferable a disease is. It is the average number of people that a single infectious person will infect over the course of their infection. This quantity determines whether the infection will increase sub-exponentially, die out, or remain constant: if R0 > 1, then each person on average infects more than one other person so the disease will spread; if R0 < 1, then each person infects fewer than one person on average so the disease will die out; and if R0 = 1, then each person will infect on average exactly one other person, so the disease will become endemic: it will move throughout the population but not increase or decrease.[23]

Endemic steady state

An infectious disease is said to be endemic when it can be sustained in a population without the need for external inputs. This means that, on average, each infected person is infecting exactly one other person (any more and the number of people infected will grow sub-exponentially and there will be an epidemic, any less and the disease will die out). In mathematical terms, that is:

$$R_0 \, S = 1.$$

The basic reproduction number (R0) of the disease, assuming everyone is susceptible, multiplied by the proportion of the population that is actually susceptible (S) must be one (since those who are not susceptible do not feature in our calculations as they cannot contract the disease). Notice that this relation means that for a disease to be in the endemic steady state, the higher the basic reproduction number, the lower the proportion of the population susceptible must be, and vice versa. This expression has a limitation: S is a proportion and cannot exceed 1, so the relation can only hold when R0 is at least 1. For example, R0 = 0.5 would imply S = 2, which exceeds the whole population.[citation needed]

Assume the rectangular stationary age distribution and let the ages of infection have the same distribution for each birth year. Let the average age of infection be A, for instance when individuals younger than A are susceptible and those older than A are immune (or infectious). Then it can be shown by an easy argument that the proportion of the population that is susceptible is given by:

$$S = \frac{A}{L}.$$

We reiterate that L is the age at which in this model every individual is assumed to die. But the mathematical definition of the endemic steady state can be rearranged to give:

$$S = \frac{1}{R_0}.$$

Therefore, due to the transitive property:

$$\frac{1}{R_0} = \frac{A}{L} \quad \Rightarrow \quad R_0 = \frac{L}{A}.$$

This provides a simple way to estimate the parameter R0 using easily available data.

For a population with an exponential age distribution,

$$R_0 = 1 + \frac{L}{A}.$$

This allows the basic reproduction number of a disease to be estimated from A and L in either type of population distribution.
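As a small worked example, the two estimates can be written as one Python function. It assumes the standard relations R0 = L/A for a rectangular age distribution and R0 = 1 + L/A for an exponential one; the function name and the values A = 5 years and L = 75 years are hypothetical and chosen only to show the arithmetic.

```python
def estimate_r0(average_age_at_infection, life_expectancy, age_distribution="rectangular"):
    """Estimate R0 from the average age at infection A and the life expectancy L."""
    if average_age_at_infection <= 0 or life_expectancy <= 0:
        raise ValueError("A and L must be positive")
    ratio = life_expectancy / average_age_at_infection
    if age_distribution == "rectangular":
        return ratio          # R0 = L / A
    if age_distribution == "exponential":
        return 1.0 + ratio    # R0 = 1 + L / A
    raise ValueError("age_distribution must be 'rectangular' or 'exponential'")


# Hypothetical inputs: infection at age 5 on average, life expectancy 75 years.
print(estimate_r0(5, 75))                  # 15.0
print(estimate_r0(5, 75, "exponential"))   # 16.0
```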

Compartmental models in epidemiology

Compartmental models are formulated as Markov chains.[24] A classic compartmental model in epidemiology is the SIR model, which may be used as a simple model for modelling epidemics. Multiple other types of compartmental models are also employed.

The SIR model

Diagram of the SIR model with initial values S(0), I(0) and R(0), and rates β for infection and γ for recovery
Animation of the SIR model with initial values S(0), I(0) and R(0), and rate of recovery γ. The animation shows the effect of reducing the rate of infection β. If there is no medicine or vaccination available, it is only possible to reduce the infection rate (often referred to as "flattening the curve") by appropriate measures such as social distancing.

In 1927, W. O. Kermack and A. G. McKendrick created a model in which they considered a fixed population with only three compartments: susceptible, S(t); infected, I(t); and recovered, R(t). The compartments used for this model consist of three classes:[25]

  • S(t) is used to represent the individuals not yet infected with the disease at time t, or those susceptible to the disease of the population.
  • I(t) denotes the individuals of the population who have been infected with the disease and are capable of spreading the disease to those in the susceptible category.
  • R(t) is the compartment used for the individuals of the population who have been infected and then removed from the disease, either due to immunization or due to death. Those in this category are not able to be infected again or to transmit the infection to others.
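The three compartments can be simulated with a short Python sketch. It assumes the commonly used form of the SIR equations, dS/dt = −βSI/N, dI/dt = βSI/N − γI and dR/dt = γI, where β (infection rate) and γ (recovery rate) are symbol names assumed here, and integrates them with a classical fourth-order Runge–Kutta scheme. The initial values and rates are hypothetical and chosen only for illustration.

```python
def sir_derivatives(s, i, r, beta, gamma):
    """Right-hand side of the assumed SIR equations for a closed population."""
    n = s + i + r
    new_infections = beta * s * i / n
    recoveries = gamma * i
    return (-new_infections, new_infections - recoveries, recoveries)


def simulate_sir(s0, i0, rec0, beta, gamma, days, dt=0.1):
    """Return a list of (time, S, I, R) rows computed with fourth-order Runge-Kutta steps."""
    state = (float(s0), float(i0), float(rec0))
    trajectory = [(0.0, *state)]
    steps = int(round(days / dt))
    for step in range(1, steps + 1):
        k1 = sir_derivatives(*state, beta, gamma)
        k2 = sir_derivatives(*(x + 0.5 * dt * k for x, k in zip(state, k1)), beta, gamma)
        k3 = sir_derivatives(*(x + 0.5 * dt * k for x, k in zip(state, k2)), beta, gamma)
        k4 = sir_derivatives(*(x + dt * k for x, k in zip(state, k3)), beta, gamma)
        state = tuple(
            x + dt * (a + 2 * b + 2 * c + d) / 6.0
            for x, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
        trajectory.append((step * dt, *state))
    return trajectory


# Hypothetical outbreak: 990 susceptible, 10 infected, beta = 0.3 and gamma = 0.1 per day.
trajectory = simulate_sir(990, 10, 0, beta=0.3, gamma=0.1, days=160)
peak_time, _, peak_infected, _ = max(trajectory, key=lambda row: row[2])
print(f"peak of about {peak_infected:.0f} infected around day {peak_time:.0f}")
```

Lowering `beta` while keeping `gamma` fixed produces a later and lower peak, which is the "flattening the curve" effect mentioned in the animation caption.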

Other compartmental models

There are many modifications of the SIR model, including those that include births and deaths, where upon recovery there is no immunity (SIS model), where immunity lasts only for a short period of time (SIRS), where there is a latent period of the disease where the person is not infectious (SEIS and SEIR), and where infants can be born with immunity (MSIR).[citation needed]

Infectious disease dynamics

Mathematical models need to integrate the increasing volume of data being generated on host-pathogen interactions. Many theoretical studies of the population dynamics, structure and evolution of infectious diseases of plants and animals, including humans, are concerned with this problem.[26]


Mathematics of mass vaccination

If the proportion of the population that is immune exceeds the herd immunity level for the disease, then the disease can no longer persist in the population and its transmission dies out.[27] Thus, a disease can be eliminated from a population if enough individuals are immune due to either vaccination or recovery from prior exposure to disease. Examples include the eradication of smallpox, with the last wild case in 1977, and the certified eradication of indigenous transmission of 2 of the 3 types of wild poliovirus (type 2 in 2015, after the last reported case in 1999, and type 3 in 2019, after the last reported case in 2012).[28]

The herd immunity level will be denoted q. Recall that, for a stable state:[citation needed]

$$R_0 \cdot S = 1.$$

In turn,

which is approximately:[citation needed]

Graph of herd immunity threshold vs basic reproduction number with selected diseases

S will be (1 − q), since q is the proportion of the population that is immune and q + S must equal one (since in this simplified model, everyone is either susceptible or immune). Then, substituting into the stable-state condition R0 · S = 1:

$$R_0 (1 - q) = 1, \qquad 1 - q = \frac{1}{R_0}, \qquad q = 1 - \frac{1}{R_0}.$$

Remember that this is the threshold level. Transmission will die out only if the proportion of immune individuals exceeds this level due to a mass vaccination programme.

We have just calculated the critical immunization threshold (denoted qc). It is the minimum proportion of the population that must be immunized at birth (or close to birth) in order for the infection to die out in the population.
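The threshold can be computed directly. The following sketch assumes the relation qc = 1 − 1/R0 that follows from R0 · (1 − q) = 1; the function name and the R0 values in the loop are hypothetical inputs, not estimates for any particular disease.

```python
def critical_immunization_threshold(r0):
    """Minimum immune proportion q_c = 1 - 1/R0; zero when the disease cannot spread anyway."""
    if r0 <= 1.0:
        return 0.0
    return 1.0 - 1.0 / r0


# Hypothetical basic reproduction numbers, for illustration only.
for r0 in (1.5, 2.0, 4.0, 10.0):
    print(f"R0 = {r0:4.1f} -> q_c = {critical_immunization_threshold(r0):.2f}")
```

The threshold rises quickly with R0: doubling R0 from 2 to 4 raises the required immune proportion from one half to three quarters.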

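Let p be the final size of the epidemic, i.e. the fraction of the population that is eventually infected, so that the fraction that is never infected is 1 − p. This fraction can be defined as:

$$1 - p = e^{-R_0 p}.$$

Solving for $R_0$, we obtain:

$$R_0 = -\frac{\ln(1 - p)}{p}.$$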
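The relation between the final size and R0 can be explored numerically. This sketch assumes the standard final-size relation 1 − p = exp(−R0 · p), with p interpreted as the fraction of the population eventually infected, which gives R0 = −ln(1 − p)/p. The inverse direction has no closed form, so it is solved here by bisection; both function names are made up for this example.

```python
import math


def r0_from_final_size(p):
    """R0 implied by a final epidemic size p, with 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return -math.log(1.0 - p) / p


def final_size_from_r0(r0, tol=1e-12):
    """Non-zero root p of 1 - p = exp(-r0 * p), found by bisection; 0.0 if r0 <= 1."""
    if r0 <= 1.0:
        return 0.0
    low, high = 1e-12, 1.0
    while high - low > tol:
        mid = 0.5 * (low + high)
        # g(p) = 1 - p - exp(-r0 * p) is positive below the root and negative above it.
        if 1.0 - mid - math.exp(-r0 * mid) > 0.0:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


print(r0_from_final_size(0.5))   # ln(2) / 0.5, about 1.386
print(final_size_from_r0(2.0))   # about 0.797
```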

When mass vaccination cannot exceed the herd immunity

If the vaccine used is insufficiently effective or the required coverage cannot be reached, the program may fail to exceed qc. Such a program will protect vaccinated individuals from disease, but may change the dynamics of transmission.[citation needed]

Suppose that a proportion of the population q (where q < qc) is immunised at birth against an infection with R0 > 1. The vaccination programme changes R0 to Rq where

$$R_q = R_0 (1 - q).$$

This change occurs simply because there are now fewer susceptibles in the population who can be infected. Rq is simply R0 minus those that would normally be infected but that cannot be now since they are immune.

As a consequence of this lower basic reproduction number, the average age of infection A will also change to some new value Aq in those who have been left unvaccinated.

Recall the relation that linked R0, A and L. Assuming that life expectancy has not changed, now:[citation needed]

$$R_q = \frac{L}{A_q}, \qquad A_q = \frac{L}{R_q} = \frac{L}{R_0 (1 - q)}.$$

But R0 = L/A so:

$$A_q = \frac{L}{(L/A)(1 - q)} = \frac{A}{1 - q}.$$

Thus, the vaccination program may raise the average age of infection, and unvaccinated individuals will experience a reduced force of infection due to the presence of the vaccinated group. For a disease that leads to greater clinical severity in older populations, the unvaccinated proportion of the population may experience the disease relatively later in life than would occur in the absence of vaccine.
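A short numerical illustration of this shift, assuming the relations Rq = R0(1 − q) and Aq = A/(1 − q) that follow from R0 = L/A under the rectangular age distribution. The function name and the input values (R0 = 15, A = 5 years, 50% coverage) are hypothetical.

```python
def vaccination_effect(r0, average_age_at_infection, coverage):
    """Return (R_q, A_q) when a proportion `coverage` is immunised at birth."""
    if not 0.0 <= coverage < 1.0:
        raise ValueError("coverage must be in [0, 1)")
    r_q = r0 * (1.0 - coverage)                        # fewer susceptibles per infectious case
    a_q = average_age_at_infection / (1.0 - coverage)  # infection is pushed to later ages
    return r_q, a_q


# Hypothetical disease with R0 = 15 and A = 5 years; half of newborns are vaccinated.
print(vaccination_effect(15.0, 5.0, 0.5))  # (7.5, 10.0)
```

In this example the reproduction number stays well above 1, so transmission continues, but the unvaccinated are infected at age 10 on average instead of age 5.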

When mass vaccination exceeds the herd immunity

If a vaccination program causes the proportion of immune individuals in a population to exceed the critical threshold for a significant length of time, transmission of the infectious disease in that population will stop. If elimination occurs everywhere at the same time, then this can lead to eradication.[citation needed]

Elimination: Interruption of endemic transmission of an infectious disease, which occurs if each infected individual infects less than one other, is achieved by maintaining vaccination coverage to keep the proportion of immune individuals above the critical immunization threshold.[citation needed]
Eradication: Elimination everywhere at the same time such that the infectious agent dies out (for example, smallpox and rinderpest).[citation needed]


Models have the advantage of examining multiple outcomes simultaneously, rather than making a single forecast. Models have shown varying degrees of reliability in past pandemics, such as SARS, SARS-CoV-2,[29] swine flu, MERS and Ebola.[30]

References


  1. Daley DJ, Gani J (2005). Epidemic Modeling: An Introduction. New York: Cambridge University Press.
  2. Hethcote HW (2000). "The mathematics of infectious diseases". SIAM Review. 42 (4): 599–653. Bibcode:2000SIAMR..42..599H. doi:10.1137/S0036144500371907. S2CID 10836889.
  3. Blower S, Bernoulli D (2004). "An attempt at a new analysis of the mortality caused by smallpox and of the advantages of inoculation to prevent it". Reviews in Medical Virology. 14 (5): 275–88. doi:10.1002/rmv.443. PMID 15334536. S2CID 8169180.
  4. "Germ Theory - an overview | ScienceDirect Topics".
  5. Hamer W (1928). Epidemiology Old and New. London: Kegan Paul.
  6. Ross R (1910). The Prevention of Malaria. Dutton.
  7. Brauer F, Castillo-Chávez C (2001). Mathematical Models in Population Biology and Epidemiology. New York: Springer.
  8. Eisinger D, Thulke HH (April 2008). "Spatial pattern formation facilitates eradication of infectious diseases". The Journal of Applied Ecology. 45 (2): 415–423. Bibcode:2008JApEc..45..415E. doi:10.1111/j.1365-2664.2007.01439.x. PMC 2326892. PMID 18784795.
  9. Adam D (April 2020). "Special report: The simulations driving the world's response to COVID-19". Nature. 580 (7803): 316–318. Bibcode:2020Natur.580..316A. doi:10.1038/d41586-020-01003-6. PMID 32242115. S2CID 214771531.
  10. Squazzoni F, Polhill JG, Edmonds B, Ahrweiler P, Antosz P, Scholz G, et al. (2020). "Computational Models That Matter During a Global Pandemic Outbreak: A Call to Action". Journal of Artificial Societies and Social Simulation. 23 (2): 10. doi:10.18564/jasss.4298. hdl:10037/19057. ISSN 1460-7425. S2CID 216426533.
  11. Sridhar D, Majumder MS (April 2020). "Modelling the pandemic". BMJ. 369: m1567. doi:10.1136/bmj.m1567. PMID 32317328. S2CID 216074714.
  12. Maziarz M, Zach M (October 2020). "Agent-based modelling for SARS-CoV-2 epidemic prediction and intervention assessment: A methodological appraisal". Journal of Evaluation in Clinical Practice. 26 (5): 1352–1360. doi:10.1111/jep.13459. PMC 7461315. PMID 32820573.
  13. Huppert A, Katriel G (2013). "Mathematical modelling and prediction in infectious disease epidemiology". Clinical Microbiology and Infection. 19 (11): 999–1005. doi:10.1111/1469-0691.12308. PMID 24266045.
  14. Tembine H. "COVID-19: Data-Driven Mean-Field-Type Game Perspective. Games". Games Journal. doi:10.3390/g11040051. hdl:10419/257469.
  15. Nakamura GM, Monteiro AC, Cardoso GC, Martinez AS (February 2017). "Efficient method for comprehensive computation of agent-level epidemic dissemination in networks". Scientific Reports. 7 (1): 40885. arXiv:1606.07825. Bibcode:2017NatSR...740885N. doi:10.1038/srep40885. ISSN 2045-2322. PMC 5247741. PMID 28106086.
  16. Nakamura GM, Cardoso GC, Martinez AS (February 2020). "Improved susceptible–infectious–susceptible epidemic equations based on uncertainties and autocorrelation functions". Royal Society Open Science. 7 (2): 191504. Bibcode:2020RSOS....791504N. doi:10.1098/rsos.191504. ISSN 2054-5703. PMC 7062106. PMID 32257317.
  17. Dietz K (1967). "Epidemics and Rumours: A Survey". Journal of the Royal Statistical Society. Series A (General). 130 (4): 505–528. doi:10.2307/2982521. JSTOR 2982521.
  18. Maier BF, Brockmann D (2020). "Effective containment explains subexponential growth in recent confirmed COVID-19 cases in China". Science. 368 (6492): 742–746. Bibcode:2020Sci...368..742M. doi:10.1126/science.abb4557. PMC 7164388. PMID 32269067.
  19. Network Science by Albert-László Barabási.
  20. Kenah E, Robins JM (September 2007). "Second look at the spread of epidemics on networks". Physical Review E. 76 (3 Pt 2): 036113. arXiv:q-bio/0610057. Bibcode:2007PhRvE..76c6113K. doi:10.1103/PhysRevE.76.036113. ISSN 1539-3755. PMC 2215389. PMID 17930312.
  21. Pastor-Satorras R, Castellano C, Van Mieghem P, Vespignani A (2015-08-31). "Epidemic processes in complex networks". Reviews of Modern Physics. 87 (3): 925–979. arXiv:1408.2701. Bibcode:2015RvMP...87..925P. doi:10.1103/RevModPhys.87.925. S2CID 14306926.
  22. K Rizi A, Faqeeh A, Badie-Modiri A, Kivelä M (2022-04-20). "Epidemic spreading and digital contact tracing: Effects of heterogeneous mixing and quarantine failures". Physical Review E. 105 (4): 044313. arXiv:2103.12634. Bibcode:2022PhRvE.105d4313R. doi:10.1103/PhysRevE.105.044313. PMID 35590624. S2CID 232320251.
  23. "Basic Reproduction Number - an overview | ScienceDirect Topics".
  24. Cosma Shalizi (15 November 2018). "Data over Space and Time; Lecture 21: Compartment Models" (PDF). Carnegie Mellon University. Retrieved September 19, 2020.
  25. Kermack WO, McKendrick AG (1991). "Contributions to the mathematical theory of epidemics--I. 1927". Bulletin of Mathematical Biology. 53 (1–2): 33–55. Bibcode:1927RSPSA.115..700K. doi:10.1007/BF02464423. JSTOR 94815. PMID 2059741.
  26. Brauer F (2017). "Mathematical epidemiology: Past, present, and future". Infectious Disease Modelling. 2 (2): 113–127. doi:10.1016/j.idm.2017.02.001. PMC 6001967. PMID 29928732.
  27. Britton T, Ball F, Trapman P (2020). "A mathematical model reveals the influence of population heterogeneity on herd immunity to SARS-CoV-2". Science. 369 (6505): 846–849. Bibcode:2020Sci...369..846B. doi:10.1126/science.abc6810. PMC 7331793. PMID 32576668.
  28. Pollard AJ, Bijker EM (2021). "A guide to vaccinology: From basic principles to new developments". Nature Reviews Immunology. 21 (2): 83–100. doi:10.1038/s41577-020-00479-7. PMC 7754704. PMID 33353987.
  29. Renz A, Widerspick L, Dräger A (2020). "FBA reveals guanylate kinase as a potential target for antiviral therapies against SARS-CoV-2". Bioinformatics. 36 (Supplement_2): i813–i821. doi:10.1093/bioinformatics/btaa813. PMC 7773487. PMID 33381848.
  30. Costris-Vas C, Schwartz EJ, Smith? RJ (November 2020). "Predicting COVID-19 using past pandemics as a guide: how reliable were mathematical models then, and how reliable will they be now?". Mathematical Biosciences and Engineering. 17 (6): 7502–7518. doi:10.3934/mbe.2020383. PMID 33378907.

External links

  • Model-Builder: Interactive (GUI-based) software to build, simulate, and analyze ODE models.
  • GLEaMviz Simulator: Enables simulation of emerging infectious diseases spreading across the world.
  • STEM: Open source framework for Epidemiological Modeling available through the Eclipse Foundation.
  • R package surveillance: Temporal and Spatio-Temporal Modeling and Monitoring of Epidemic Phenomena