# Ensemble forecasting

Top: Weather Research and Forecasting model simulation of Hurricane Rita tracks. Bottom: The spread of National Hurricane Center multi-model ensemble forecast.

Ensemble forecasting is a numerical prediction method that is used to attempt to generate a representative sample of the possible future states of a dynamical system. Ensemble forecasting is a form of Monte Carlo analysis: multiple numerical predictions are conducted using slightly different initial conditions that are all plausible given the past and current set of observations, or measurements. Sometimes the ensemble of forecasts may use different forecast models for different members, or different formulations of a forecast model. The multiple simulations are conducted to account for the two usual sources of uncertainty in forecast models: (1) the errors introduced by the use of imperfect initial conditions, amplified by the chaotic nature of the evolution equations of the dynamical system, which is often referred to as sensitive dependence on the initial conditions; and (2) errors introduced because of imperfections in the model formulation, such as the approximate mathematical methods to solve the equations. Ideally, the verified future dynamical system state should fall within the predicted ensemble spread, and the amount of spread should be related to the uncertainty (error) of the forecast.
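The Monte Carlo idea of perturbed initial conditions can be illustrated with a minimal sketch. This is not an operational forecast model: it uses the Lorenz-63 toy system (a standard stand-in for chaotic dynamics), a simple forward-Euler integrator, and perturbation sizes chosen purely for illustration.

```python
import math
import random

def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One forward-Euler step of the Lorenz-63 system, a chaotic toy model."""
    x, y, z = state
    return (x + dt * sigma * (y - x),
            y + dt * (x * (rho - z) - y),
            z + dt * (x * y - beta * z))

def run_forecast(initial, n_steps=2000):
    """Integrate one ensemble member forward from its initial condition."""
    state = initial
    for _ in range(n_steps):
        state = lorenz_step(state)
    return state

random.seed(0)
analysis = (1.0, 1.0, 1.0)  # the "best guess" initial state

# Ensemble: each member starts from the analysis plus a small perturbation
# representing plausible observation error.
members = [run_forecast((analysis[0] + random.gauss(0, 1e-3),
                         analysis[1] + random.gauss(0, 1e-3),
                         analysis[2] + random.gauss(0, 1e-3)))
           for _ in range(20)]

# Tiny initial differences are amplified by the chaotic dynamics, so the
# members disagree at the forecast time; the spread measures that disagreement.
mean_x = sum(m[0] for m in members) / len(members)
spread_x = math.sqrt(sum((m[0] - mean_x) ** 2 for m in members) / len(members))
print(f"ensemble mean x: {mean_x:.2f}, spread: {spread_x:.2f}")
```

After enough integration steps the members decorrelate, so the spread approaches the variability of the model's own climate, which is the behaviour described in the probability-assessment section below.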

Consider the problem of numerical weather prediction. In this case, the dynamical system is the atmosphere, the model is a numerical weather prediction model, and the initial condition is represented by an objective analysis of an atmospheric state. Today, ensemble predictions are commonly made at most of the major operational weather prediction facilities worldwide.

Experimental ensemble forecasts are made at a number of universities, such as the University of Washington, and ensemble forecasts in the US are also generated by the US Navy and Air Force. There are various ways of viewing the data, such as spaghetti plots, ensemble means, or postage-stamp plots, in which a number of different results from the model runs can be compared.

## History

As Edward Lorenz proposed in 1963, it is impossible for long-range forecasts (those made more than two weeks in advance) to predict the state of the atmosphere with any degree of skill, owing to the chaotic nature of the fluid dynamics equations involved.[1] Furthermore, existing observation networks have limited spatial and temporal resolution (for example, over large bodies of water such as the Pacific Ocean), which introduces uncertainty into the true initial state of the atmosphere. While a set of equations, known as the Liouville equations, exists to describe how uncertainty in the model initialization evolves, the equations are too complex to run in real time, even with the use of supercomputers.[2] These uncertainties limit forecast model accuracy to about six days into the future.[3]

Edward Epstein recognized in 1969 that the atmosphere could not be completely described with a single forecast run due to inherent uncertainty, and proposed a stochastic dynamic model that produced means and variances for the state of the atmosphere.[4] Although these Monte Carlo simulations showed skill, in 1974 Cecil Leith showed that they produced adequate forecasts only when the ensemble probability distribution was a representative sample of the probability distribution in the atmosphere.[5] It was not until 1992 that ensemble forecasts began being prepared by the European Centre for Medium-Range Weather Forecasts and the National Centers for Environmental Prediction. The ECMWF model, the Ensemble Prediction System,[6] uses singular vectors to simulate the initial probability density, while the NCEP ensemble, the Global Ensemble Forecast System, uses a technique known as breeding.[7][8]

## Variations

When many different forecast models are used to try to generate a forecast, the approach is termed multi-model ensemble forecasting. This method of forecasting has been shown to improve forecasts when compared to a single-model approach.[9] When the models within a multi-model ensemble are adjusted for their various biases, the process is known as "superensemble forecasting". This type of forecast significantly reduces errors in model output.[10] When models of different physical processes are combined, such as combinations of atmospheric, ocean and wave models, the multi-model ensemble is called a hyper-ensemble.[11]
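The bias-adjustment idea behind superensemble forecasting can be sketched as follows. This is a deliberate simplification: each model's mean error over a hypothetical training period is removed before averaging, whereas operational superensembles typically fit regression weights per model. All model names and numbers below are invented for illustration.

```python
def superensemble_mean(forecasts, training_forecasts, training_obs):
    """Combine multi-model forecasts after removing each model's mean bias,
    estimated from a training period (a simplified superensemble)."""
    corrected = []
    for name, value in forecasts.items():
        # Mean bias of this model over the training period.
        bias = (sum(training_forecasts[name]) - sum(training_obs)) / len(training_obs)
        corrected.append(value - bias)
    return sum(corrected) / len(corrected)

# Hypothetical training data: model A runs 1.0 degree warm, model B 0.5 degrees cold.
training_obs = [20.0, 22.0, 21.0]
training_forecasts = {"A": [21.0, 23.0, 22.0], "B": [19.5, 21.5, 20.5]}

# Today's raw forecasts from each model; debiasing brings both to 25.0.
today = {"A": 26.0, "B": 24.5}
print(superensemble_mean(today, training_forecasts, training_obs))  # 25.0
```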

## Methods of accounting for uncertainty

Stochastic or "ensemble" forecasting is used to account for uncertainty. It involves multiple forecasts created with an individual forecast model using different physical parametrizations or varying initial conditions.[2] The ensemble forecast is usually evaluated in terms of the average of the individual forecasts of one forecast variable, as well as the degree of agreement between the various forecasts within the ensemble system, as represented by their overall spread. Ensemble spread is diagnosed through tools such as spaghetti diagrams, which show the dispersion of one quantity on prognostic charts for specific time steps in the future. Another tool that uses ensemble spread is the meteogram, which shows the dispersion in the forecast of one quantity for one specific location. It is common for the ensemble spread to be too small to include the solution that verifies, which can lead to a misdiagnosis of model uncertainty;[12] this problem becomes particularly severe for forecasts about 10 days in advance.[13]
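The mean-and-spread evaluation described above can be sketched for one variable at one location, as in a meteogram. The member values and lead times below are invented for illustration; the pattern of spread growing with lead time is the typical behaviour.

```python
import statistics

# Hypothetical 2-m temperature forecasts (deg C) from a 5-member ensemble
# at one location, for lead times of 24, 48 and 72 hours: the raw data
# a meteogram would display.
forecasts = {
    24: [18.2, 18.5, 18.1, 18.4, 18.3],
    48: [17.0, 19.2, 18.1, 16.5, 19.8],
    72: [14.5, 21.0, 17.2, 13.8, 22.4],
}

for lead, members in forecasts.items():
    mean = statistics.fmean(members)
    spread = statistics.stdev(members)  # larger spread = less member agreement
    print(f"{lead:3d} h: mean {mean:5.2f} C, spread {spread:4.2f} C")
```

The growth of spread with lead time is what the meteogram makes visible at a glance: tight agreement at short range, wide disagreement at longer range.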

## Probability assessment

When ensemble spread is small and the forecast solutions are consistent across multiple model runs, forecasters place more confidence in the ensemble mean, and in the forecast in general.[12] However, a spread-skill relationship exists only sometimes, and spread-error correlations are normally less than 0.6.[14] The relationship between ensemble spread and skill varies substantially depending on such factors as the forecast model and the region for which the forecast is made.
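A spread-error correlation of the kind cited above is simply a Pearson correlation between ensemble spread and ensemble-mean error over many past forecast cases. The paired values below are invented for illustration, not taken from any verification study.

```python
import math

# Hypothetical (spread, ensemble-mean error) pairs from past forecast cases.
spread = [1.2, 0.8, 2.5, 1.9, 0.6, 3.1, 1.4, 2.2]
error  = [1.0, 1.1, 2.0, 2.4, 0.4, 2.6, 0.9, 1.5]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

print(f"spread-error correlation: {pearson(spread, error):.2f}")
```

A correlation well below 1 in such a diagnostic is why small spread on a given day cannot be read as a guarantee of small error.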

Ideally, the relative frequency of events from the ensemble could be used directly to estimate the probability of a given weather event. For example, if 30 of 50 members indicated greater than 1 cm rainfall during the next 24 h, the probability of exceeding 1 cm could be estimated to be 60%. The forecast would be considered reliable if, over all past situations in which a 60% probability was forecast, the rainfall actually exceeded 1 cm on 60% of those occasions. This is known as reliability or calibration. In practice, the probabilities generated from operational weather ensemble forecasts are not highly reliable, though with a set of past forecasts (reforecasts or hindcasts) and observations, the probability estimates from the ensemble can be adjusted to ensure greater reliability. Another desirable property of ensemble forecasts is sharpness: provided that the ensemble is reliable, the more an ensemble forecast deviates from the climatological event frequency and issues 0% or 100% forecasts of an event, the more useful it will be. However, sharp forecasts that are unaccompanied by high reliability will generally not be useful. Forecasts at long leads will inevitably not be particularly sharp, because the inevitable (albeit usually small) errors in the initial condition grow with increasing forecast lead until the expected difference between two model states is as large as the difference between two random states drawn from the forecast model's climatology.
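The rainfall example above reduces to two small calculations: the event probability as an ensemble relative frequency, and a reliability check against a verification record. The past-case outcomes below are a toy record invented for illustration.

```python
# Event probability as ensemble relative frequency:
# 30 of 50 members exceed 1 cm of rainfall in the next 24 h.
members_exceeding = 30
n_members = 50
prob = members_exceeding / n_members
print(f"P(rain > 1 cm) = {prob:.0%}")  # 60%

# Reliability check: among past cases where ~60% was forecast, did the
# event occur about 60% of the time? (1 = event occurred, 0 = it did not)
past_cases = [1, 1, 0, 1, 0, 1, 0, 1, 1, 0]  # hypothetical outcomes
observed_freq = sum(past_cases) / len(past_cases)
print(f"forecast 60%, observed frequency {observed_freq:.0%}")  # 60%
```

When the observed frequency over many such cases departs from the forecast probability, the reforecast-based calibration mentioned above is used to map raw ensemble frequencies onto reliable probabilities.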

## Research

The Observing System Research and Predictability Experiment (THORPEX) is a 10-year international research and development programme to accelerate improvements in the accuracy of one-day to two-week high-impact weather forecasts for the benefit of society, the economy and the environment.

THORPEX establishes an organizational framework that addresses weather research and forecast problems whose solutions will be accelerated through international collaboration among academic institutions, operational forecast centres and users of forecast products.

The THORPEX Interactive Grand Global Ensemble (TIGGE) is a key component of THORPEX, a World Weather Research Programme to accelerate improvements in the accuracy of one-day to two-week high-impact weather forecasts for the benefit of humanity. Centralized archives of ensemble model forecast data from many international centers are used to enable extensive data sharing and research. The designated TIGGE archive centers include the China Meteorological Administration (CMA), the European Centre for Medium-Range Weather Forecasts (ECMWF), and the National Center for Atmospheric Research (NCAR). Scientific data requirements and archive planning solidified in late 2005, and archive collection began in October 2006.

The Unidata LDM software package is used to transport the ensemble model data from the providers to the archive centers. Currently, the output from the ECMWF, UK Met Office (UKMO), CMA, Japan Meteorological Agency (JMA), National Centers for Environmental Prediction (NCEP-USA), Meteorological Service of Canada (CMC), Bureau of Meteorology Australia (BOM), Centro de Previsao Tempo e Estudos Climaticos Brazil (CPTEC), Korea Meteorological Administration (KMA), and MeteoFrance (MF) global models, totaling 440 GB/day, is moved at up to 30 GB/hour to NCAR. By requirement, the parameter fields, atmospheric levels, and physical units are consistent across all data from the providers and encoded in WMO GRIB-2 format. In contrast, each provider may submit its model output at a resolution of its choosing.

TIGGE data are available to the public for non-commercial research, with a 48-hour delay after forecast initialization time. At NCAR, users can discover data through the TIGGE portal and select parameters, grid resolution, and spatial subsets for the most current two-week period. The most current two-week period of TIGGE data is also available for direct download as forecast files through the RDA near-realtime three-month TIGGE archive. Long-term TIGGE data are available through the RDA full TIGGE archive. Forecast files are organized by level type (single level, pressure level, potential vorticity level, and potential temperature level) and forecast time step for a specified model. All ensemble members are included in each forecast file. At ECMWF, users can discover and download data through a web interface linked to the Meteorological Archival and Retrieval System (MARS). CMA offers an additional option for CMA TIGGE data access. Each center will offer fast access to terabytes of data kept online and delayed access to the long-term archives preserved in its archive systems.

The key objectives of TIGGE are:

• An enhanced collaboration on development of ensemble prediction, internationally and between operational centres and universities,
• New methods of combining ensembles from different sources and of correcting for systematic errors (biases, spread over-/under-estimation),
• A deeper understanding of the contribution of observation, initial and model uncertainties to forecast error,
• A deeper understanding of the feasibility of an interactive ensemble system responding dynamically to changing uncertainty (including use for adaptive observing, variable ensemble size, and on-demand regional ensembles) and exploiting new technology for grid computing and high-speed data transfer,
• Tests of the concept of a TIGGE Prediction Centre producing ensemble-based predictions of high-impact weather, wherever it occurs, on all predictable time ranges,
• The development of a prototype future Global Interactive Forecasting System.

## References

1. ^ Cox, John D. (2002). Storm Watchers. John Wiley & Sons, Inc. pp. 222–224. ISBN 0-471-38108-X.
2. ^ a b Manousos, Peter (2006-07-19). "Ensemble Prediction Systems". Hydrometeorological Prediction Center. Retrieved 2010-12-31.
3. ^ Weickmann, Klaus, Jeff Whitaker, Andres Roubicek and Catherine Smith (2001-12-01). The Use of Ensemble Forecasts to Produce Improved Medium Range (3–15 days) Weather Forecasts. Climate Diagnostics Center. Retrieved 2007-02-16.
4. ^ Epstein, E.S. (December 1969). "Stochastic dynamic prediction". Tellus A 21 (6): 739–759. Bibcode:1969Tell...21..739E. doi:10.1111/j.2153-3490.1969.tb00483.x.
5. ^ Leith, C.E. (June 1974). "Theoretical Skill of Monte Carlo Forecasts". Monthly Weather Review 102 (6): 409–418. Bibcode:1974MWRv..102..409L. doi:10.1175/1520-0493(1974)102<0409:TSOMCF>2.0.CO;2. ISSN 1520-0493.
6. ^ "The Ensemble Prediction System (EPS)". ECMWF. Retrieved 2011-01-05.
7. ^ Toth, Zoltan; Kalnay, Eugenia (December 1997). "Ensemble Forecasting at NCEP and the Breeding Method". Monthly Weather Review 125 (12): 3297–3319. Bibcode:1997MWRv..125.3297T. doi:10.1175/1520-0493(1997)125<3297:EFANAT>2.0.CO;2. ISSN 1520-0493.
8. ^ Molteni, F.; Buizza, R.; Palmer, T.N.; Petroliagis, T. (January 1996). "The ECMWF Ensemble Prediction System: Methodology and validation". Quarterly Journal of the Royal Meteorological Society 122 (529): 73–119. Bibcode:1996QJRMS.122...73M. doi:10.1002/qj.49712252905.
9. ^ Zhou, Binbin and Jun Du (February 2010). "Fog Prediction From a Multimodel Mesoscale Ensemble Prediction System" (PDF). Weather and Forecasting (American Meteorological Society) 25: 303. Bibcode:2010WtFor..25..303Z. doi:10.1175/2009WAF2222289.1. Retrieved 2011-01-02.
10. ^ Cane, D. and M. Milelli (2010-02-12). "Multimodel SuperEnsemble technique for quantitative precipitation forecasts in Piemonte region" (PDF). Natural Hazards and Earth System Sciences 10: 265. Bibcode:2010NHESS..10..265C. doi:10.5194/nhess-10-265-2010. Retrieved 2011-01-02.
11. ^ Vandenbulcke, L. et al. (2009). "Super-Ensemble techniques: application to surface drift prediction" (PDF). Progress in Oceanography (Pergamon Press - An Imprint of Elsevier Science) 82: 149–167. Bibcode:2009PrOce..82..149V. doi:10.1016/j.pocean.2009.06.002.
12. ^ a b Warner, Thomas Tomkins (2010). Numerical Weather and Climate Prediction. Cambridge University Press. pp. 266–275. ISBN 978-0-521-51389-0. Retrieved 2011-02-11.
13. ^ Palmer, T.N.; G.J. Shutts; R. Hagedorn; F.J. Doblas-Reyes; T. Jung; M. Leutbecher (May 2005). "Representing Model Uncertainty in Weather and Climate Prediction". Annual Review of Earth and Planetary Sciences 33: 163–193. Bibcode:2005AREPS..33..163P. doi:10.1146/annurev.earth.33.092203.122552. Retrieved 2011-02-09.
14. ^ Grimit, Eric P. and Clifford F. Mass (October 2004). "Redefining the Ensemble Spread-Skill Relationship from a Probabilistic Perspective" (PDF). University of Washington. Retrieved 2010-01-02.