= Species distribution modelling =

Species distribution modelling (SDM), also known as environmental (or ecological) niche modelling (ENM), habitat suitability modelling, predictive habitat distribution modelling, and range mapping, uses ecological models to predict the distribution of a species across geographic space and time from environmental data. The environmental data are most often climate data (e.g. temperature, precipitation), but can include other variables such as soil type, water depth, and land cover. SDMs are used in several research areas in conservation biology, ecology and evolution. These models can be used to understand how environmental conditions influence the occurrence or abundance of a species, and for predictive purposes (ecological forecasting). Predictions from an SDM may be of a species' future distribution under climate change, a species' past distribution in order to assess evolutionary relationships, or the potential future distribution of an invasive species. Predictions of current and/or future habitat suitability can be useful for management applications (e.g. reintroduction or translocation of vulnerable species, reserve placement in anticipation of climate change).

There are two main types of SDMs. Correlative SDMs, also known as climate envelope models, bioclimatic models, or resource selection function models, model the observed distribution of a species as a function of environmental conditions. Mechanistic SDMs, also known as process-based models or biophysical models, use independently derived information about a species' physiology to develop a model of the environmental conditions under which the species can exist.

The extent to which such modelled data reflect real-world species distributions will depend on a number of factors, including the nature, complexity, and accuracy of the models used and the quality of the available environmental data layers; the availability of sufficient and reliable species distribution data as model input; and the influence of various factors such as barriers to dispersal, geologic history, or biotic interactions, that increase the difference between the realized niche and the fundamental niche. Environmental niche modelling may be considered a part of the discipline of biodiversity informatics.

== History ==
A. F. W. Schimper used geographical and environmental factors to explain plant distributions in his 1898 Pflanzengeographie auf physiologischer Grundlage (Plant Geography Upon a Physiological Basis) and his 1908 work of the same name. Andrew Murray used the environment to explain the distribution of mammals in his 1866 The Geographical Distribution of Mammals. Robert Whittaker's work with plants and Robert MacArthur's work with birds strongly established the role the environment plays in species distributions. Elgene O. Box constructed environmental envelope models to predict the range of tree species. His computer simulations were among the earliest uses of species distribution modelling.

The adoption of generalised linear models (GLMs) made it possible to create more sophisticated and realistic species distribution models. The expansion of remote sensing and the development of GIS-based environmental modelling increased the amount of environmental information available for model-building and made it easier to use.

== Correlative vs mechanistic models ==

=== Correlative SDMs ===
SDMs originated as correlative models. Correlative SDMs model the observed distribution of a species as a function of georeferenced climatic predictor variables using multiple regression approaches. Given a set of georeferenced observed presences of a species and a set of climate maps, a model defines the most likely environmental ranges within which the species lives. Correlative SDMs assume that species are at equilibrium with their environment and that the relevant environmental variables have been adequately sampled. The models allow for interpolation between a limited number of species occurrences.
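The idea of an environmental range fitted to presence records can be illustrated with a minimal envelope sketch in the spirit of profile methods such as BIOCLIM. It is not the BIOCLIM algorithm itself: the function names, the nearest-rank quantile rule, and the presence records (mean annual temperature in °C, annual precipitation in mm) are all assumptions made for illustration.

```python
def fit_envelope(presences, lower_q=0.05, upper_q=0.95):
    """Return one (low, high) pair per environmental variable.

    Limits are nearest-rank quantiles of the values observed at
    presence sites (an illustrative choice, not a standard).
    """
    n_vars = len(presences[0])
    envelope = []
    for j in range(n_vars):
        values = sorted(record[j] for record in presences)
        last = len(values) - 1
        lo_idx = int(round(lower_q * last))
        hi_idx = int(round(upper_q * last))
        envelope.append((values[lo_idx], values[hi_idx]))
    return envelope


def in_envelope(site, envelope):
    """A site is 'suitable' only if every variable lies inside its range."""
    return all(lo <= x <= hi for x, (lo, hi) in zip(site, envelope))


# Hypothetical presence records: (temperature in °C, precipitation in mm)
presences = [
    (12.0, 800), (14.5, 950), (13.2, 700),
    (15.1, 1100), (11.8, 900), (13.9, 850),
]
envelope = fit_envelope(presences)

mild_site_ok = in_envelope((13.0, 900), envelope)   # inside both ranges
hot_dry_site_ok = in_envelope((25.0, 300), envelope)  # outside both ranges
```

With so few records the quantile limits coincide with the observed minimum and maximum; with larger samples they would trim extreme records from the envelope.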

To be effective, these models require observations not only of species presences but also of absences, that is, of places where the species does not live. Records of species absences are typically not as common as records of presences, so "random background" or "pseudo-absence" data are often used to fit these models. If records of species occurrences are incomplete, pseudo-absences can introduce bias. Since correlative SDMs are models of a species' observed distribution, they are models of the realized niche (the environments where a species is found), as opposed to the fundamental niche (the environments where a species can be found, or where the abiotic environment is appropriate for survival). For a given species, the realized and fundamental niches might be the same, but if a species is geographically confined due to dispersal limitation or species interactions, the realized niche will be smaller than the fundamental niche.
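A minimal sketch of pseudo-absence generation, assuming the study area is a regular grid of cells and that cells with a presence record are excluded from the draw. The grid, the presence cells, and the function name are hypothetical; some workflows instead sample background points from the whole study area, presence cells included.

```python
import random


def sample_pseudo_absences(grid_cells, presence_cells, n, seed=0):
    """Draw n distinct background cells at random, skipping presence cells."""
    presence_set = set(presence_cells)
    candidates = [cell for cell in grid_cells if cell not in presence_set]
    if n > len(candidates):
        raise ValueError("not enough background cells to sample from")
    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    return rng.sample(candidates, n)


# Hypothetical 10 x 10 study area with four presence records
grid_cells = [(row, col) for row in range(10) for col in range(10)]
presence_cells = [(2, 3), (2, 4), (3, 3), (7, 8)]

pseudo_absences = sample_pseudo_absences(grid_cells, presence_cells, n=20)
```

Because the draw is uniform over unoccupied cells, any cell where the species occurs but was never recorded can be sampled as an absence, which is one way incomplete occurrence records introduce bias.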

Correlative SDMs are easier and faster to implement than mechanistic SDMs, and can make ready use of available data. Since they are correlative, however, they provide little information about causal mechanisms and extrapolate poorly. They will also be inaccurate if the observed species range is not at equilibrium (e.g. if a species has been recently introduced and is actively expanding its range).

In standard SDMs, the distribution of a single species is often modeled, with unique parameters describing how environmental (abiotic) factors influence its occurrence probability. This allows for differentiated responses to environmental drivers among species, but can be problematic for data-deficient species. In contrast, similarities in environmental responses can be accounted for in multi-species SDMs, which model several species jointly using shared or hierarchically related parameters. However, neither approach explicitly accounts for community-level biotic interactions, which can be important in explaining species diversity patterns. Joint species distribution models (joint SDMs or J-SDMs) address this by modeling species co-occurrence patterns directly. The occurrence probability of a given species is thus influenced not only by abiotic drivers but also by inferred biotic associations with other species. This can improve accuracy for rarer taxa and provide insights into community ecology. Both standard SDMs and J-SDMs can be used to generate community-level metrics, such as species richness, by aggregating outputs across multiple species. These can be important for decision-making such as conservation planning.
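Aggregating per-species outputs into a community-level metric can be sketched as follows, assuming each species model has already produced an occurrence probability for every site. Summing those probabilities gives an expected species richness per site; the species names and values are hypothetical, and summing probabilities is only one of several possible aggregation rules.

```python
def expected_richness(occurrence_probs):
    """Sum per-species occurrence probabilities site by site."""
    per_species = list(occurrence_probs.values())
    n_sites = len(per_species[0])
    return [sum(probs[i] for probs in per_species) for i in range(n_sites)]


# Hypothetical model outputs for three species at three sites
occurrence_probs = {
    "species_a": [0.9, 0.2, 0.5],
    "species_b": [0.8, 0.1, 0.5],
    "species_c": [0.1, 0.7, 0.5],
}

richness = expected_richness(occurrence_probs)
# Site totals are roughly 1.8, 1.0 and 1.5 expected species
```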

=== Mechanistic SDMs ===
Mechanistic SDMs are more recently developed. In contrast to correlative models, mechanistic SDMs use physiological information about a species (taken from controlled field or laboratory studies) to determine the range of environmental conditions within which the species can persist. These models aim to directly characterize the fundamental niche, and to project it onto the landscape. A simple model may identify only the threshold values outside of which a species cannot survive. A more complex model may consist of several sub-models, e.g. micro-climate conditions given macro-climate conditions, body temperature given micro-climate conditions, fitness or other biological rates (e.g. survival, fecundity) given body temperature (thermal performance curves), resource or energy requirements, and population dynamics. Geographically referenced environmental data are used as model inputs. Because the species distribution predictions are independent of the species' known range, these models are especially useful for species whose range is actively shifting and not at equilibrium, such as invasive species.
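The two ends of this spectrum can be sketched in a few lines: a threshold model that checks site temperatures against critical thermal limits, and a simple thermal performance curve. All parameter values (critical limits, optimum, curve width) and site data are hypothetical, and the curve shape (a Gaussian rise to the optimum followed by a quadratic decline to the upper limit) is only one illustrative choice.

```python
import math


def within_thermal_limits(t_min, t_max, ct_min, ct_max):
    """Threshold model: the site is viable only if it never leaves the limits."""
    return t_min >= ct_min and t_max <= ct_max


def thermal_performance(temp, t_opt=30.0, ct_max=40.0, sigma=5.0):
    """Relative performance in [0, 1] as a function of body temperature."""
    if temp <= t_opt:
        # Gaussian rise towards the optimum
        return math.exp(-(((temp - t_opt) / (2.0 * sigma)) ** 2))
    # Quadratic decline from the optimum, clipped to zero beyond ct_max
    return max(0.0, 1.0 - ((temp - t_opt) / (t_opt - ct_max)) ** 2)


# Hypothetical sites: (coldest, hottest) temperature in °C
sites = {
    "coastal": (8.0, 31.0),
    "alpine": (-12.0, 18.0),
    "desert": (5.0, 46.0),
}
viable = {
    name: within_thermal_limits(t_min, t_max, ct_min=-5.0, ct_max=40.0)
    for name, (t_min, t_max) in sites.items()
}
```

Neither function uses occurrence records, which is what makes the prediction independent of the species' known range.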

Mechanistic SDMs incorporate causal mechanisms and are better for extrapolation and non-equilibrium situations. However, they are more labor-intensive to create than correlational models and require the collection and validation of a lot of physiological data, which may not be readily available. The models require many assumptions and parameter estimates, and they can become very complicated.

Dispersal, biotic interactions, and evolutionary processes present challenges, as they aren't usually incorporated into either correlative or mechanistic models.

Correlative and mechanistic models can be used in combination to gain additional insights. For example, a mechanistic model could be used to identify areas that are clearly outside the species' fundamental niche, and these areas can be marked as absences or excluded from analysis.
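One way to picture this combination is a labelling step that runs before a correlative model is fitted. In the sketch below, cells with a presence record are labelled 1, cells that a mechanistic model flags as non-viable are labelled 0 (mechanistic absences), and all other cells stay unlabelled. The cell identifiers, the viability flags, and the rule that an observed presence overrides the mechanistic flag are assumptions.

```python
def label_training_cells(cells, presence_cells, viable):
    """Label cells 1 (presence), 0 (mechanistic absence) or None (unknown)."""
    presence_set = set(presence_cells)
    labels = {}
    for cell in cells:
        if cell in presence_set:
            labels[cell] = 1      # an observed presence takes priority
        elif not viable[cell]:
            labels[cell] = 0      # outside the modelled fundamental niche
        else:
            labels[cell] = None   # left for pseudo-absence sampling
    return labels


# Hypothetical cells and mechanistic viability flags
cells = ["a", "b", "c", "d"]
presence_cells = ["a"]
viable = {"a": True, "b": True, "c": False, "d": False}

labels = label_training_cells(cells, presence_cells, viable)
```

To exclude the non-viable cells from the analysis instead, the same flags could be used to drop those cells from `cells` before fitting.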

== Niche models (correlative) ==
A variety of mathematical methods can be used for fitting, selecting, and evaluating correlative SDMs. Models include "profile" methods, which are simple statistical techniques that use e.g. environmental distance to known sites of occurrence, such as BIOCLIM and DOMAIN; "regression" methods (e.g. forms of generalized linear models); and "machine learning" methods such as maximum entropy (MAXENT). An incomplete list of models that have been used for niche modelling includes:

=== Profile techniques ===
- BIOCLIM
- DOMAIN
- Ecological niche factor analysis (ENFA)
- Mahalanobis distance
- Isodar analysis

=== Regression-based techniques ===
- Generalized linear model (GLM)
- Generalized additive model (GAM)
- Multivariate adaptive regression splines (MARS)
- Maxlike
- Favourability Function (FF)
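As an illustration of the regression-based family, the following is a minimal sketch of a binomial GLM with a logit link (logistic regression) fitted to presence/absence data by plain gradient ascent on the log-likelihood. The data, learning rate, and iteration count are hypothetical, and in practice an established statistics library would be used rather than a hand-written fitting loop.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, row):
    """Occurrence probability; weights[0] is the intercept."""
    return sigmoid(weights[0] + sum(w * x for w, x in zip(weights[1:], row)))


def fit_logistic(X, y, lr=0.1, n_iter=5000):
    """Fit intercept and slopes by gradient ascent on the mean log-likelihood."""
    weights = [0.0] * (len(X[0]) + 1)
    for _ in range(n_iter):
        grad = [0.0] * len(weights)
        for row, label in zip(X, y):
            err = label - predict(weights, row)
            grad[0] += err
            for j, x in enumerate(row):
                grad[j + 1] += err * x
        weights = [w + lr * g / len(X) for w, g in zip(weights, grad)]
    return weights


# Hypothetical data: one standardised temperature predictor per site,
# with 1 = presence and 0 = absence (or pseudo-absence)
X = [[-1.5], [-1.0], [-0.5], [-0.2], [0.0], [0.3], [0.5], [1.0], [1.5]]
y = [0, 0, 0, 0, 1, 1, 0, 1, 1]

weights = fit_logistic(X, y)
p_warm = predict(weights, [1.5])
p_cold = predict(weights, [-1.5])
```

The fitted slope is positive here, so predicted occurrence probability rises with temperature across the sampled range.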

=== Machine learning techniques ===
- MAXENT
- Artificial neural networks (ANN)
- Genetic Algorithm for Rule Set Production (GARP)
- Boosted regression trees (BRT)/gradient boosting machines (GBM)
- Random forest (RF)
- Support vector machines (SVM)
- XGBoost (XGB)

Furthermore, ensemble models can be created from several model outputs to create a model that captures components of each. Often the mean or median value across several models is used as an ensemble. Similarly, consensus models are models that fall closest to some measure of central tendency of all models; a consensus model can be an individual model run or an ensemble of several models.
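Both ideas can be sketched with the standard library, assuming each model has produced a suitability value for the same list of sites. The model names and values are hypothetical, and squared distance to the per-site median is just one possible measure of closeness to the central tendency.

```python
from statistics import mean, median


def ensemble(predictions, how=median):
    """Combine per-model predictions site by site (mean or median)."""
    return [how(site_values) for site_values in zip(*predictions.values())]


def consensus_model(predictions):
    """Name of the model whose predictions lie closest to the ensemble median."""
    centre = ensemble(predictions, how=median)

    def distance(name):
        return sum((p - c) ** 2 for p, c in zip(predictions[name], centre))

    return min(predictions, key=distance)


# Hypothetical suitability predictions from three models at three sites
predictions = {
    "glm": [0.2, 0.6, 0.9],
    "maxent": [0.3, 0.5, 0.8],
    "random_forest": [0.1, 0.9, 0.7],
}

median_ensemble = ensemble(predictions, how=median)
mean_ensemble = ensemble(predictions, how=mean)
closest = consensus_model(predictions)
```

The median is less sensitive than the mean to a single outlying model, which here is the high random forest prediction at the second site.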

== Niche modelling software (correlative) ==
SPACES is an online environmental niche modelling platform that allows users to design and run dozens of the most prominent methods in a high-performance, multi-platform, browser-based environment.

MaxEnt is the most widely used method and software; it uses presence-only data and performs well when few presence records are available.

ModEco implements various methods.

Qb.SDM implements Random Forest, XGBoost, and MaxEnt, with online GBIF integration.

DIVA-GIS has an easy-to-use implementation of BIOCLIM that is well suited to educational use.

The Biodiversity and Climate Change Virtual Laboratory (BCCVL) is a "one stop modelling shop" that simplifies the process of biodiversity and climate impact modelling. It connects the research community to Australia's national computational infrastructure by integrating a suite of tools in a coherent online environment. Users can access global climate and environmental datasets or upload their own data, perform data analysis across six different experiment types with a suite of 17 different methods, and easily visualize, interpret and evaluate the results of the models. Experiment types include: Species Distribution Model, Multispecies Distribution Model, Species Trait Model (currently under development), Climate Change Projection, Biodiverse Analysis and Ensemble Analysis.

Another example is Ecocrop, which is used to determine the suitability of a crop to a specific environment. This database system can also project crop yields and evaluate the impact of environmental factors such as climate change on plant growth and suitability.

Most niche modelling methods are available in the R packages 'dismo', 'biomod2' and 'mopa'.

Software developers may want to build on the openModeller project.

The Collaboratory for Adaptation to Climate Change (adapt.nd.edu) has implemented an online version of openModeller that allows users to design and run openModeller in a high-performance, browser-based environment, so that multiple experiments can run in parallel without the limitations of local processor power.

== SDM applications ==
SDMs have become one of the most versatile and widely applied tools in ecological research, conservation planning, and environmental management. SDMs allow researchers to quantify and project the ecological niches of species across space and time. Their applications extend far beyond theoretical ecology, providing essential insights for decision-making and policy development in a rapidly changing world.

A primary application of SDMs lies in biodiversity conservation and reserve design. Models are used to identify areas of high habitat suitability, helping to prioritise sites for protection and restoration. In many cases, SDMs guide the expansion of protected area networks by predicting potential habitats for rare, endemic, or threatened species, even in regions where field data are scarce. They also assist in assessing the effectiveness of existing reserves under current and future environmental conditions.

Another key application concerns climate change impact assessment. SDMs are widely used to forecast shifts in species distributions under different climate scenarios, offering projections of potential range contractions, expansions, or migrations. These projections provide critical inputs for climate adaptation strategies, enabling policymakers and conservationists to anticipate biodiversity responses and implement proactive management measures. In addition, SDMs support the identification of climate refugia, areas likely to remain suitable under future climatic conditions, which are crucial for long-term conservation planning.

In invasive species management, SDMs play an increasingly strategic role. By predicting suitable habitats for alien species, models help to assess invasion risks, inform early warning systems, and support eradication or containment programmes. Similarly, in disease ecology, SDMs are applied to forecast the potential spread of vector-borne pathogens, integrating environmental, climatic, and host distribution data to identify risk zones.

In ecological and evolutionary research, SDMs contribute to understanding species–environment relationships, niche differentiation, and biogeographical patterns. They are frequently integrated with genetic data to explore population connectivity, phylogeographic structure, and evolutionary responses to environmental gradients. This integrative approach offers explanations for processes such as local adaptation, gene flow, and the historical dynamics of species distributions.

Finally, SDMs are increasingly applied in ecosystem service assessments, restoration ecology, and land-use planning. By identifying potential habitats for pollinators, keystone species, or functional groups, SDMs inform sustainable management strategies and support the design of multifunctional landscapes that reconcile conservation with human development needs.

Overall, the direct applications of species distribution models span from fundamental ecological inquiry to practical conservation action. Their flexibility, combined with advances in remote sensing, machine learning, and big data availability, continues to expand their role as a bridge between ecological theory, empirical data, and environmental decision-making.

== See also ==
- Biogeography
- Ecosystem model
- Quantum evolution
