'''Sensitivity analysis''' is the study of how the [[uncertainty]] in the output of a [[mathematical model]] or system (numerical or otherwise) can be apportioned to different sources of [[uncertainty]] in its inputs.<ref name="Primer">Saltelli, A., Ratto, M., Andres, T., Campolongo, F., Cariboni, J., Gatelli, D., Saisana, M., and Tarantola, S., 2008, ''Global Sensitivity Analysis. The Primer'', John Wiley & Sons.</ref> A related practice is [[uncertainty analysis]], which has a greater focus on [[uncertainty quantification]] and [[propagation of uncertainty]] but makes no distinction between the different sources of uncertainty. Ideally, uncertainty and sensitivity analysis should be run in tandem.


Sensitivity analysis can be useful for a range of purposes,<ref name="Examples">Pannell, D.J. (1997). Sensitivity analysis of normative economic models: Theoretical framework and practical strategies, ''Agricultural Economics'' 16: 139-152.[http://dpannell.fnas.uwa.edu.au/dpap971f.htm]</ref> including:
* Testing the [[robustness]] of the results of a model or system in the presence of uncertainty.
* Increased understanding of the relationships between input and output variables in a system or model.
* Uncertainty reduction: identifying model inputs that cause significant uncertainty in the output and should therefore be the focus of attention if the robustness is to be increased (perhaps by further research).
* Searching for errors in the model (by encountering unexpected relationships between inputs and outputs).
* Model simplification - fixing model inputs that have no effect on the output, or identifying and removing redundant parts of the model structure.
* Enhancing communication from modelers to decision makers (e.g. by making recommendations more credible, understandable, compelling or persuasive).
* Finding regions in the space of input factors for which the model output is either maximum or minimum or meets some optimum criterion (see [[optimization]] and Monte Carlo filtering).


Taking an example from economics, in any budgeting process there are always variables that are uncertain. Future tax rates, interest rates, inflation rates, headcount, operating expenses and other variables may not be known with great precision. Sensitivity analysis answers the question, "if these variables deviate from expectations, what will the effect be (on the business, model, system, or whatever is being analyzed), and which variables are causing it?"


==Overview==


A [[mathematical model]] is defined by a series of [[equations]], input variables and parameters aimed at characterizing some process under investigation. Some examples might be a [[climate model]], an [[economic model]], or a [[finite element analysis|finite element]] model in engineering. Increasingly, such models are highly complex, and as a result their input/output relationships may be poorly understood. In such cases, the model can be viewed as a [[black box]], i.e. the output is an [[Closed-form expression|intractable]] function of its inputs.


Quite often, some or all of the model inputs are subject to sources of [[Uncertainty quantification|uncertainty]], including [[Measurement_uncertainty|errors of measurement]], absence of information and poor or partial understanding of the driving forces and mechanisms. This uncertainty imposes a limit on our [[confidence]] in the response or output of the model. Further, models may have to cope with the natural intrinsic (aleatory) variability of the system, such as the occurrence of [[stochastic]] events.<ref>Der Kiureghian, A., Ditlevsen, O. (2009) Aleatory or epistemic? Does it matter?, Structural Safety '''31'''(2), 105-112.</ref>


Good modeling practice requires that the modeler provides an evaluation of the confidence in the model. This requires, first, a [[quantification]] of the uncertainty in any model results ([[uncertainty analysis]]); and second, an evaluation of how much each input is contributing to the output uncertainty. Sensitivity analysis addresses the second of these issues (although uncertainty analysis is usually a necessary precursor), performing the role of ordering by importance the strength and relevance of the inputs in determining the variation in the output.<ref name="Primer" />


In models involving many input variables, sensitivity analysis is an essential ingredient of model building and quality assurance. National and international agencies involved in [[impact assessment]] studies have included sections devoted to sensitivity analysis in their guidelines. Examples are the [[European Commission]] (see e.g. the [http://ec.europa.eu/governance/impact/commission_guidelines/docs/iag_2009_en.pdf guidelines] for [[impact assessment]]), the White House [[Office of Management and Budget]], the [[Intergovernmental Panel on Climate Change]] and [[US Environmental Protection Agency]]'s [http://www.epa.gov/CREM/library/cred_guidance_0309.pdf modelling guidelines].


Sometimes a sensitivity analysis may reveal surprising insights about the subject of interest. For instance, the field of [[multi-criteria decision making]] (MCDM) studies (among other topics) the problem of how to select the best alternative among a number of competing alternatives. This is an important task in [[decision making]]. In such a setting each alternative is described in terms of a set of evaluative criteria, and these criteria are associated with weights of importance. Intuitively, one may think that the larger the weight for a criterion is, the more critical that criterion should be. However, this may not be the case. It is important to distinguish here the notion of ''criticality'' from that of ''importance.'' By ''critical'' we mean that a small change (as a percentage) in the weight of a criterion may cause a significant change in the final solution. It is possible for criteria with rather small weights of importance (i.e., ones that are not so important in that respect) to be much more critical in a given situation than ones with larger weights.<ref name = 'SENSITIVITY'>{{cite journal | title = A Sensitivity Analysis Approach for Some Deterministic Multi-Criteria Decision-Making Methods | journal = Decision Sciences | year=1997 | first = E. | last = Triantaphyllou | authorlink = | coauthors = A. Sanchez | volume = 28 | issue = 1 | pages = 151–194 | id = | url = http://www.csc.lsu.edu/trianta/Journal_PAPERS1/SENSIT1.htm | accessdate = 2010-06-28 }}</ref><ref name='MCDM'>{{cite book | last = Triantaphyllou | first = E. | authorlink = | title = Multi-Criteria Decision Making: A Comparative Study | publisher = Kluwer Academic Publishers (now Springer) | year = 2000 | location = Dordrecht, The Netherlands | pages = 320 | url = http://www.csc.lsu.edu/trianta/Books/DecisionMaking1/Book1.htm | doi = | id = | isbn = 0-7923-6607-7 }}</ref> That is, a sensitivity analysis may shed light on issues not anticipated at the beginning of a study. This, in turn, may dramatically improve the effectiveness of the initial study and assist in the successful implementation of the final solution.


==Settings and Constraints==
The choice of method of sensitivity analysis is typically dictated by a number of problem constraints or settings. Some of the most common are:
* '''Computational expense:''' Sensitivity analysis is almost always performed by running the model a (possibly large) number of times, i.e. a [[Sampling (statistics)|sampling]]-based approach<ref>J.C. Helton, J.D. Johnson, C.J. Salaberry, and C.B. Storlie, 2006, Survey of sampling based methods for uncertainty and sensitivity analysis. ''Reliability Engineering and System Safety'', '''91''':1175&ndash;1209.</ref>. This can be a significant problem when:
** A single run of the model takes a significant amount of time (minutes, hours or longer). This is not unusual with very complex models.
** The model has a large number of uncertain inputs. Sensitivity analysis is essentially the exploration of the [[Dimension|multidimensional input space]], which grows exponentially in size with the number of inputs. See the [[curse of dimensionality]].
:Computational expense is a problem in many practical sensitivity analyses. Some methods of reducing computational expense include the use of emulators (for large models), and screening methods (for reducing the dimensionality of the problem).
* '''Correlated inputs:''' Most common sensitivity analysis methods assume [[Independence (probability theory)|independence]] between model inputs, but sometimes inputs can be strongly correlated. This is still an immature field of research and definitive methods have yet to be established.
* '''Nonlinearity:''' Some sensitivity analysis approaches, such as those based on [[linear regression]], can inaccurately measure sensitivity when the model response is [[nonlinear system|nonlinear]] with respect to its inputs. In such cases, [[Variance-based sensitivity analysis|variance-based measures]] are more appropriate.
* '''Model interactions:''' [[Interaction (statistics)|Interactions]] occur when the perturbation of two or more inputs ''simultaneously'' causes variation in the output greater than that of varying each of the inputs alone. Such interactions are present in any model that is non-[[Additive function|additive]], but will be neglected by methods such as scatterplots and one-at-a-time perturbations<ref name="OAT">Saltelli, A., Annoni, P., 2010, How to avoid a perfunctory sensitivity analysis, ''Environmental Modeling and Software'' '''25''', 1508-1517.</ref>. The effect of interactions can be measured by the [[Variance-based sensitivity analysis|total-order sensitivity index]].
* '''Multiple outputs:''' Virtually all sensitivity analysis methods consider a single [[univariate]] model output, yet many models produce a large number of outputs, possibly spatially or time-dependent. Note that this does not preclude the possibility of performing a separate sensitivity analysis for each output of interest. However, for models in which the outputs are correlated, the sensitivity measures can be hard to interpret.
* '''Given data:''' While in many cases the practitioner has access to the model, in some instances a sensitivity analysis must be performed with "given data", i.e. where the sample points (the values of the model inputs for each run) cannot be chosen by the analyst. This may occur when a sensitivity analysis has to be performed retrospectively, perhaps using data from an optimisation or uncertainty analysis, or when data comes from a [[discrete]] source.<ref name="voodoo">Paruolo, P., Saisana, M., and Saltelli, A., (2012) Ratings and rankings: voodoo or science? ''The Royal Statistical Society: Journal Series A''</ref>


==Core Methodology==
[[File:Sensitivity scheme.jpg|thumb | right | 500px | Ideal scheme of a possibly sampling-based sensitivity analysis. Uncertainty arising from different sources—errors in the data, parameter estimation procedure, alternative model structures—are propagated through the model for uncertainty analysis and their relative importance is quantified via sensitivity analysis.]]
[[File:Scatter plots for sensitivity analysis bis.jpg|thumb | right | 500px | Sampling-based sensitivity analysis by scatterplots. ''Y'' (vertical axis) is a function of four factors. The points in the four scatterplots are always the same though sorted differently, i.e. by ''Z''<sub>1</sub>, ''Z''<sub>2</sub>, ''Z''<sub>3</sub>, ''Z''<sub>4</sub> in turn. Note that the abscissa is different for each plot: (&minus;5,&nbsp;+5) for ''Z''<sub>1</sub>, (&minus;8,&nbsp;+8) for ''Z''<sub>2</sub>, (&minus;10,&nbsp;+10) for ''Z''<sub>3</sub> and ''Z''<sub>4</sub>. ''Z''<sub>4</sub> is most important in influencing ''Y'' as it imparts more 'shape' on ''Y''.]]


There are a large number of approaches to performing a sensitivity analysis, many of which have been developed to address one or more of the constraints discussed above.<ref name="Primer" /> They are also distinguished by the type of sensitivity measure, be it based on (for example) [[Variance-based sensitivity analysis|variance decompositions]], [[partial derivatives]] or [[elementary effects method|elementary effects]]. In general, however, most procedures adhere to the following outline:
# Quantify the uncertainty in each input (e.g. ranges, probability distributions). Note that this can be difficult, and many methods exist to elicit uncertainty distributions from subjective data.<ref>O'Hagan, A., ''Uncertain Judgements: Eliciting Experts' Probabilities.'' Wiley, Chichester, 2006.</ref>
# Identify the model output to be analysed (the target of interest should ideally have a direct relation to the problem tackled by the model).
# Run the model a number of times using some [[design of experiments]]<ref>Sacks, J., W. J. Welch, T. J. Mitchell, and H. P. Wynn (1989). Design and analysis of computer experiments. ''Statistical Science'' '''4''', 409&ndash;435.</ref>, dictated by the method of choice and the input uncertainty.
# Using the resulting model outputs, calculate the sensitivity measures of interest.
In some cases this procedure will be repeated, for example in high-dimensional problems where the user has to screen out unimportant variables before performing a full sensitivity analysis.
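As a minimal illustration of the outline above, the sketch below (in Python; the toy model <code>f</code> and its uniform input ranges are hypothetical stand-ins for a real simulator and its elicited uncertainties) runs through the four steps with a crude correlation-based sensitivity measure.

<syntaxhighlight lang="python">
import numpy as np

# Hypothetical toy model standing in for an expensive simulator.
def f(x):
    return x[:, 0] + 2.0 * x[:, 1] + x[:, 0] * x[:, 2]

rng = np.random.default_rng(0)

# Step 1: quantify input uncertainty (here, three independent uniform inputs).
n, k = 10_000, 3
x = rng.uniform(low=-1.0, high=1.0, size=(n, k))

# Steps 2-3: choose the output of interest and run the model over the sample.
y = f(x)

# Step 4: compute a sensitivity measure (here, the squared correlation between
# each input and the output -- a crude, purely linear indicator).
for i in range(k):
    r = np.corrcoef(x[:, i], y)[0, 1]
    print(f"input {i + 1}: r^2 = {r ** 2:.3f}")
</syntaxhighlight>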


This section discusses various types of "core methods", distinguished by the various sensitivity measures that are calculated (note that some of these categories "overlap" somewhat). The following section focuses on alternative ways of obtaining these measures, under the constraints of the problem.


===One-at-a-time (OAT/OFAT)===


One of the simplest and most common approaches is that of changing one-factor-at-a-time ([[OFAT]] or OAT), to see what effect this produces on the output.<ref>J. Campbell, et al. (2008), Photosynthetic Control of Atmospheric Carbonyl Sulfide During the Growing Season, ''Science'' '''322''': 1085-1088</ref>
<ref>R. Bailis, M. Ezzati, D. Kammen, (2005), Mortality and Greenhouse Gas Impacts of Biomass and Petroleum Energy Futures in Africa, ''Science'' '''308''': 98-103</ref>
<ref>J. Murphy, et al.(2004), Quantification of modelling uncertainties in a large ensemble of climate change simulations, ''Nature'' '''430''': 768-772</ref> OAT customarily involves:


* Moving one input variable, keeping others at their baseline (nominal) values, then,
* Returning the variable to its nominal value, then repeating for each of the other inputs in the same way.


Sensitivity may then be measured by monitoring changes in the output, e.g. by [[partial derivatives]] or [[linear regression]]. This appears to be a logical approach, as any change observed in the output will unambiguously be due to the single variable changed. Furthermore, by changing one variable at a time, one can keep all other variables fixed to their central or baseline values. This increases the comparability of the results (all ‘effects’ are computed with reference to the same central point in space) and minimizes the chances of computer programme crashes, which are more likely when several input factors are changed simultaneously.
OAT is frequently preferred by modellers for practical reasons: in case of model failure under OAT analysis, the modeller immediately knows which input factor is responsible for the failure.<ref name="OAT" />


Despite its simplicity, however, this approach does not fully explore the input space, since it does not take into account the simultaneous variation of input variables. This means that the OAT approach cannot detect the presence of [[Interaction (statistics)|interactions]] between input variables.<ref>[http://www.questia.com/googleScholar.qst?docId=5001888588 Czitrom (1999) "One-Factor-at-a-Time Versus Designed Experiments", American Statistician, 53, 2.]</ref> <!--formerly http://www.amstat.org/publications/tas/czitrom.pdf-->
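A minimal sketch of the OAT procedure just described (Python; the model function, baseline point and step size are hypothetical):

<syntaxhighlight lang="python">
import numpy as np

# Hypothetical model; a stand-in for the real simulator.
def model(x):
    return x[0] ** 2 + 3.0 * x[1] + 0.5 * x[2]

baseline = np.array([1.0, 2.0, 0.5])   # nominal values of the three inputs
step = 0.1                             # perturbation applied to one input at a time

y0 = model(baseline)
for i in range(len(baseline)):
    x = baseline.copy()
    x[i] += step                       # move one input, keep the others at baseline
    print(f"input {i + 1}: OAT effect = {model(x) - y0:+.4f}")
    # the perturbed input is implicitly returned to baseline on the next iteration
</syntaxhighlight>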


===Local methods===
Local methods involve taking the [[partial derivative]] of the output ''Y'' with respect to an input factor ''X''<sub>''i''</sub>:
:<math>
\left| \frac{\partial Y}{\partial X_i} \right |_{\textbf {x}^0 }
</math>,
where the subscript '''x'''<sup>0</sup> indicates that the derivative is taken at some fixed point in the space of the input (hence the 'local' in the name of the class). Adjoint modelling<ref>Cacuci, Dan G., ''Sensitivity and Uncertainty Analysis: Theory, Volume I'', Chapman & Hall.</ref><ref>Cacuci, Dan G., Mihaela Ionescu-Bujor, Michael Navon, 2005, ''Sensitivity And Uncertainty Analysis: Applications to Large-Scale Systems (Volume II)'', Chapman & Hall.</ref> and automatic differentiation<ref>Griewank, A. (2000). ''Evaluating derivatives, Principles and techniques of algorithmic differentiation.'' SIAM publisher.</ref> are methods in this class. Similar to OAT/OFAT, local methods do not attempt to fully explore the input space, since they examine small perturbations, typically one variable at a time.
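Where the derivative is not available analytically or via adjoint or automatic differentiation, it can be approximated by finite differences, as in this sketch (Python; the model, the fixed point and the step size are hypothetical):

<syntaxhighlight lang="python">
import numpy as np

# Hypothetical model whose local sensitivities are sought.
def model(x):
    return np.sin(x[0]) + 2.0 * x[1] ** 2

x0 = np.array([0.3, 1.0])   # fixed point at which the derivatives are evaluated
h = 1e-6                    # finite-difference step

for i in range(len(x0)):
    xp, xm = x0.copy(), x0.copy()
    xp[i] += h
    xm[i] -= h
    dy_dxi = (model(xp) - model(xm)) / (2 * h)   # central-difference estimate of dY/dX_i
    print(f"dY/dX_{i + 1} at x0 ~= {dy_dxi:.4f}")
</syntaxhighlight>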


===Scatter plots===
A simple but useful tool is to plot [[scatter plots]] of the output variable against individual input variables, after (randomly) sampling the model over its input distributions. The advantage of this approach is that it can also deal with "given data", i.e. a set of arbitrarily-placed data points, and gives a direct visual indication of sensitivity. Quantitative measures can also be drawn, for example by measuring the [[Correlation and dependence|correlation]] between ''Y'' and ''X''<sub>''i''</sub>, or even by estimating variance-based measures by [[nonlinear regression]]<ref name="voodoo" />.
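A sketch of the scatterplot approach (Python with NumPy and Matplotlib; the two-input model is hypothetical, with one influential and one near-irrelevant input):

<syntaxhighlight lang="python">
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical model: X1 is influential, X2 almost irrelevant.
def model(x):
    return x[:, 0] ** 2 + 0.1 * x[:, 1]

rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=(500, 2))   # random sample over the input distributions
y = model(x)

fig, axes = plt.subplots(1, 2, figsize=(8, 3), sharey=True)
for i, ax in enumerate(axes):
    ax.scatter(x[:, i], y, s=5)
    ax.set_xlabel(f"$X_{i + 1}$")
axes[0].set_ylabel("Y")
plt.tight_layout()
plt.show()   # X1 imparts visible 'shape' on Y; the X2 panel is an unstructured cloud
</syntaxhighlight>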


===Regression analysis===
[[Regression analysis]] can be a very useful tool when the model response is approximately [[linear]], and is a simple approach with low computational cost. Sensitivity can be judged by [[Standardized coefficient|standardized regression coefficients]]. This method is ineffective, however, when the response is strongly [[nonlinear system|nonlinear]]; such nonlinearity can be detected through the [[coefficient of determination]].
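A sketch of estimating standardized regression coefficients from a Monte Carlo sample (Python; the nearly linear three-input model is hypothetical):

<syntaxhighlight lang="python">
import numpy as np

# Hypothetical, nearly linear model.
def model(x):
    return 2.0 * x[:, 0] + 0.5 * x[:, 1] + 0.1 * x[:, 2]

rng = np.random.default_rng(2)
x = rng.normal(size=(2000, 3))
y = model(x)

# Standardize inputs and output, then fit ordinary least squares; the fitted
# coefficients are the standardized regression coefficients (SRCs).
xs = (x - x.mean(axis=0)) / x.std(axis=0)
ys = (y - y.mean()) / y.std()
src, *_ = np.linalg.lstsq(xs, ys, rcond=None)

# A low coefficient of determination warns that the linear approximation is poor.
r2 = 1.0 - np.sum((ys - xs @ src) ** 2) / np.sum(ys ** 2)
print("SRCs:", np.round(src, 3), " R^2 =", round(float(r2), 3))
</syntaxhighlight>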


===Variance-based methods===
{{Main|Variance-based sensitivity analysis}}
Variance-based methods<ref>Sobol’, I. (1990). Sensitivity estimates for nonlinear mathematical models. ''Matematicheskoe Modelirovanie'' '''2''', 112–118. in Russian, translated in English in Sobol’ , I. (1993). Sensitivity analysis for non-linear mathematical models. ''Mathematical Modeling & Computational Experiment (Engl. Transl.)'', 1993, '''1''', 407–414.</ref><ref>Homma, T. and A. Saltelli (1996). Importance measures in global sensitivity analysis of nonlinear models. ''Reliability Engineering and System Safety'', '''52''', 1–17.</ref><ref>Saltelli, A., K. Chan, and M. Scott (Eds.) (2000). ''Sensitivity Analysis''. Wiley Series in Probability and Statistics. New York: John Wiley and Sons.</ref> are a class of probabilistic approaches which quantify the input and output uncertainties as [[probability distribution|probability distributions]], and decompose the output variance into parts attributable to input variables and combinations of variables. The sensitivity of the output to an input variable is therefore measured by the amount of variance in the output caused by that input. These can be expressed as conditional expectations, i.e. considering a model ''Y''=''f''('''''X''''') for '''''X'''''={''X''<sub>''1''</sub>, ''X''<sub>''2''</sub>, ... ''X''<sub>''k''</sub>}, a measure of sensitivity of the ''i''th variable ''X''<sub>''i''</sub> is given as,


:<math>
\operatorname{Var}_{X_i} \left( E_{\textbf{X}_{\sim i}} \left( Y \mid X_i \right) \right)
</math>


where "Var" and "''E''" denote the variance and expected value operators respectively. This expression essentially measures the contribution ''X''<sub>''i''</sub> alone to the uncertainty (variance) in ''Y'' (averaged over variations in other variables), and is known as the ''first-order sensitivity index'' or ''main effect index''. Importantly, it does not measure the uncertainty caused by interactions with other variables. A further measure, known as the ''total effect index'', gives the total variance in ''Y'' caused by ''X''<sub>''i''</sub> ''and'' its interactions with any of the other input variables. Both quantities are typically standardised by dividing by Var(''Y'').


Variance-based methods allow full exploration of the input space, accounting for interactions and nonlinear responses. For these reasons they are widely used when it is feasible to calculate them. Typically this calculation involves the use of [[Monte Carlo integration|Monte Carlo]] methods, but since this can involve many thousands of model runs, other methods (such as emulators) can be used to reduce computational expense when necessary. Note that full variance decompositions are only meaningful when the input factors are independent from one another.<ref>Saltelli, A. and S. Tarantola (2002). On the relative importance of input factors in mathematical models: safety assessment for nuclear waste disposal. ''Journal of American Statistical Association'', '''97''', 702–709.</ref>
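A brute-force Monte Carlo sketch of the first-order index Var(E[''Y''|''X''<sub>''i''</sub>])/Var(''Y'') (Python; the three-input test function and the sample sizes are illustrative, and in practice more efficient estimators based on dedicated sampling designs are used):

<syntaxhighlight lang="python">
import numpy as np

# Hypothetical test function with additive and interaction terms (Ishigami-like).
def model(x1, x2, x3):
    return np.sin(x1) + 7.0 * np.sin(x2) ** 2 + 0.1 * x3 ** 4 * np.sin(x1)

rng = np.random.default_rng(3)
n_outer, n_inner = 500, 500

# Unconditional output variance from a large plain Monte Carlo sample.
x_all = rng.uniform(-np.pi, np.pi, size=(100_000, 3))
var_y = model(*x_all.T).var()

def first_order_index(i):
    """Brute-force estimate of Var_{X_i}( E[Y | X_i] ) / Var(Y)."""
    cond_means = np.empty(n_outer)
    for j in range(n_outer):
        xi = rng.uniform(-np.pi, np.pi)                     # fix the i-th input...
        xs = rng.uniform(-np.pi, np.pi, size=(n_inner, 3))  # ...and vary the others
        xs[:, i] = xi
        cond_means[j] = model(*xs.T).mean()                 # inner mean E[Y | X_i = xi]
    return cond_means.var() / var_y                         # outer variance, normalized

for i in range(3):
    print(f"S_{i + 1} ~= {first_order_index(i):.2f}")
</syntaxhighlight>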


===Screening===
Screening is a particular instance of a sampling-based method. The objective here is to identify which input variables contribute significantly to the output uncertainty in high-dimensionality models, rather than to quantify sensitivity exactly (i.e. in terms of variance). Screening tends to have a relatively low computational cost when compared to other approaches, and can be used in a preliminary analysis to weed out uninfluential variables before applying a more informative analysis to the remaining set. One of the most commonly used screening methods is the [[Elementary effects method|elementary effect method]].<ref>Morris, M. D. (1991). Factorial sampling plans for preliminary computational experiments. ''Technometrics'', '''33''', 161–174.</ref><ref>Campolongo, F., J. Cariboni, and A. Saltelli (2007). An effective screening design for sensitivity analysis of large models. ''Environmental Modelling and Software'', '''22''', 1509&ndash;1518.</ref>
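A simplified sketch of elementary-effects screening (Python; it uses independent radial base points rather than the full Morris trajectory design, and the three-input model, with one near-inactive input, is hypothetical):

<syntaxhighlight lang="python">
import numpy as np

# Hypothetical model on the unit cube; X3 is nearly inactive.
def model(x):
    return x[0] ** 2 + 0.5 * x[1] + 0.01 * x[2]

rng = np.random.default_rng(4)
k, r, delta = 3, 20, 0.2     # number of inputs, of base points, and the step size

ee = np.zeros((r, k))
for t in range(r):
    base = rng.uniform(0.0, 1.0 - delta, size=k)
    y0 = model(base)
    for i in range(k):
        xp = base.copy()
        xp[i] += delta
        ee[t, i] = (model(xp) - y0) / delta   # elementary effect of input i at this base

mu_star = np.abs(ee).mean(axis=0)   # Morris mu*: mean absolute elementary effect
sigma = ee.std(axis=0)              # spread signals nonlinearity and/or interactions
print("mu*  :", np.round(mu_star, 3))
print("sigma:", np.round(sigma, 3))
</syntaxhighlight>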


==Alternative Methods==
A number of methods have been developed to overcome some of the constraints discussed above, which would otherwise make the estimation of sensitivity measures infeasible (most often due to [[computational expense]]). Generally, these methods focus on efficiently calculating variance-based measures of sensitivity.


===Emulators===
Emulators (also known as metamodels, surrogate models or response surfaces) are [[data-modelling]]/[[machine learning]] approaches that involve building a relatively simple mathematical function, known as an ''emulator'', that approximates the input/output behaviour of the model itself.<ref name="emcomp">Storlie, C.B., Swiler, L.P., Helton, J.C., and Sallaberry, C.J. (2009), Implementation and evaluation of nonparametric regression procedures for sensitivity analysis of computationally demanding models, ''Reliability Engineering & System Safety'' '''94'''(11): 1735-1763</ref> In other words, it is the concept of "modelling a model" (hence the name "metamodel"). The idea is that, although computer models may be a very complex series of equations that can take a long time to solve, they can always be regarded as a function of their inputs ''Y''=''f''('''''X'''''). By running the model at a number of points in the input space, it may be possible to fit a much simpler emulator ''η''('''''X'''''), such that ''η''('''''X''''')≈''f''('''''X''''') to within an acceptable margin of error. Then, sensitivity measures can be calculated from the emulator (either with Monte Carlo or analytically), which will have a negligible additional computational cost. Importantly, the number of model runs required to fit the emulator can be orders of magnitude less than the number of runs required to directly estimate the sensitivity measures from the model.<ref name="oak">Oakley, J. and A. O'Hagan (2004). Probabilistic sensitivity analysis of complex models: a Bayesian approach. ''J. Royal Stat. Soc. B'' '''66, 751&ndash;769'''.</ref>
"OAT is frequently preferred by modellers because of practical reasons. In case of model failure under OAT analysis the modeller immediately knows which is the input factor responsible for the failure."<ref name="OAT">Saltelli, A., Annoni, P., 2010, How to avoid a perfunctory sensitivity analysis, ''Environmental Modeling and Software'' '''25''', 1508-1517.</ref>
Despite its simplicity, this approach is non-explorative of the space of the factors and does not take into account their simultaneous variation. This means, that the OAT approach cannot detect the presence of interactions between input factors.<ref>[http://www.questia.com/googleScholar.qst?docId=5001888588 Czitrom (1999) "One-Factor-at-a-Time Versus Designed Experiments", American Statistician, 53, 2.]</ref> <!--formerly http://www.amstat.org/publications/tas/czitrom.pdf-->


Clearly the crux of an emulator approach is to find an ''η'' (emulator) that is a sufficiently close approximation to the model ''f''. This requires the following steps:
# Sampling (running) the model at a number of points in its input space. This requires a sample design.
# Selecting a type of emulator (mathematical function) to use.
# "Training" the emulator using the sample data from the model - this generally involves adjusting the emulator parameters until the emulator mimics the true model as well as possible.


Sampling the model can often be done with [[low-discrepancy sequences]], such as the [[Sobol sequence]], or with [[Latin hypercube sampling]], although random designs can also be used, at the loss of some efficiency. The selection of the emulator type and the training are intrinsically linked, since the training method will be dependent on the class of emulator. Some types of emulators that have been used successfully for sensitivity analysis include:
* [[Gaussian processes]]<ref name="oak" /> (also known as [[kriging]]), where any combination of output points is assumed to be distributed as a [[multivariate Gaussian distribution]]. Recently, "treed" Gaussian processes have been used to deal with [[Heteroscedasticity|heteroscedastic]] and discontinuous responses.
* [[Random forest|Random forests]]<ref name="emcomp" />, in which a large number of [[decision trees]] are trained, and the result averaged.
* [[Gradient boosting]]<ref name="emcomp" />, where a succession of simple regressions are used to weight data points to sequentially reduce error.
* [[polynomial chaos|Polynomial chaos expansions]]<ref>Sudret, B., (2008), Global sensitivity analysis using polynomial chaos expansions, ''Reliability Engineering & System Safety'' '''93'''(7): 964-979.</ref>, which use [[orthogonal polynomials]] to approximate the response surface.
* [[Smoothing spline|Smoothing splines]]<ref>Ratto, M. and Pagano, A., (2010), Using recursive algorithms for the efficient identification of smoothing spline ANOVA models, ''AStA Advances in Statistical Analysis'' '''94'''(4): 367-388</ref>, normally used in conjunction with HDMR truncations (see below).


The use of an emulator introduces a [[machine learning]] problem, which can be difficult if the response of the model is highly [[nonlinear]]. In all cases it is useful to check the accuracy of the emulator, for example using [[Cross-validation (statistics)|cross-validation]].
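A sketch of the emulator workflow (Python; it assumes scikit-learn is available and uses a Gaussian-process emulator, while the 'expensive' model, the training-set size and the accuracy check are all illustrative choices):

<syntaxhighlight lang="python">
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Hypothetical 'expensive' model, evaluated only a small number of times.
def expensive_model(x):
    return np.sin(3.0 * x[:, 0]) + 0.5 * x[:, 1] ** 2

rng = np.random.default_rng(5)
x_train = rng.uniform(-1.0, 1.0, size=(40, 2))   # small training design (random here)
y_train = expensive_model(x_train)

# Fit the emulator to the training runs.
gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), normalize_y=True)
gp.fit(x_train, y_train)

# Check the emulator's accuracy on held-out points before trusting it.
x_test = rng.uniform(-1.0, 1.0, size=(200, 2))
rmse = np.sqrt(np.mean((gp.predict(x_test) - expensive_model(x_test)) ** 2))
print("emulator RMSE on held-out points:", round(float(rmse), 3))

# Sensitivity measures (scatterplots, SRCs, variance-based indices, ...) can now be
# estimated from a large Monte Carlo sample of the cheap emulator instead of the model.
x_big = rng.uniform(-1.0, 1.0, size=(50_000, 2))
y_big = gp.predict(x_big)
print("Var(Y) estimated from the emulator:", round(float(y_big.var()), 3))
</syntaxhighlight>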


===High-Dimensional Model Representations (HDMR)===
A [[high-dimensional model representation]] (HDMR)<ref>Li, G., J. Hu, S.-W. Wang, P. Georgopoulos, J. Schoendorf, and H. Rabitz (2006). Random Sampling-High Dimensional Model Representation (RS-HDMR) and orthogonality of its different order component functions. ''Journal of Physical Chemistry A'' '''110''', 2474&ndash;2485.</ref><ref>Li, G., W. S. W., and R. H. (2002). Practical approaches to construct RS-HDMR component functions. ''Journal of Physical Chemistry'' '''106''', 8721&ndash;8733.</ref> (the term is due to H. Rabitz<ref>Rabitz, H. (1989). System analysis at molecular scale. ''Science'', '''246''', 221–226.</ref>) is essentially an emulator approach, which involves decomposing the function output into a linear combination of input terms and interactions of increasing dimensionality. The HDMR approach exploits the fact that the model can usually be well-approximated by neglecting higher-order interactions (second or third-order and above). The terms in the truncated series can then each be approximated by e.g. polynomials or splines, and the response expressed as the sum of the main effects and interactions up to the truncation order. From this perspective, HDMRs can be seen as emulators which neglect high-order interactions; the advantage being that they are able to emulate models with higher dimensionality than full-order emulators.
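A sketch of a first-order (main effects only) HDMR-style approximation, estimated here by simple binned conditional means (Python; the two-input model with an interaction term is hypothetical, and a practical HDMR would use polynomials or splines rather than bins):

<syntaxhighlight lang="python">
import numpy as np

# Hypothetical model; a first-order HDMR keeps only f0 and the main effects f_i(X_i).
def model(x):
    return x[:, 0] ** 2 + np.sin(3.0 * x[:, 1]) + 0.3 * x[:, 0] * x[:, 1]

rng = np.random.default_rng(7)
x = rng.uniform(-1.0, 1.0, size=(20_000, 2))
y = model(x)
f0 = y.mean()                                    # zeroth-order term

n_bins = 25
edges = np.linspace(-1.0, 1.0, n_bins + 1)

def main_effect(i):
    """Estimate f_i(X_i) = E[Y | X_i] - f0 by binned conditional means."""
    idx = np.clip(np.digitize(x[:, i], edges) - 1, 0, n_bins - 1)
    return np.array([y[idx == b].mean() for b in range(n_bins)]) - f0

# Reconstruct the truncated representation f0 + f_1(X_1) + f_2(X_2); the residual
# variance shows how much is lost by neglecting the interaction term.
approx = np.full_like(y, f0)
for i in range(2):
    fi = main_effect(i)
    idx = np.clip(np.digitize(x[:, i], edges) - 1, 0, n_bins - 1)
    approx += fi[idx]

print("share of Var(Y) captured by main effects:",
      round(float(1.0 - np.var(y - approx) / np.var(y)), 3))
</syntaxhighlight>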


===Fourier Amplitude Sensitivity Test (FAST)===
{{Main| Fourier amplitude sensitivity testing }}


The Fourier Amplitude Sensitivity Test (FAST) uses the [[Fourier series]] to represent a multivariate function (the model) in the frequency domain, using a single frequency variable. Therefore, the integrals required to calculate sensitivity indices become univariate, resulting in computational savings.


===Other===
Monte Carlo filtering is another sampling-based approach, in which the objective is to identify regions in the space of the input factors corresponding to particular values (e.g. high or low) of the output.<ref>Hornberger, G. and R. Spear (1981). An approach to the preliminary analysis of environmental systems. ''Journal of Environmental Management'' '''7''', 7&ndash;18.</ref><ref>Saltelli, A., S. Tarantola, F. Campolongo, and M. Ratto (2004). ''Sensitivity Analysis in Practice: A Guide to Assessing Scientific Models''. John Wiley and Sons.</ref>
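A sketch of Monte Carlo filtering (Python with SciPy; the model, the 'behavioural' threshold of the top 10% of outputs, and the use of a Kolmogorov–Smirnov test to compare the filtered and unfiltered input samples are all illustrative choices):

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import ks_2samp

# Hypothetical model; the 'behaviour' of interest is taken to be a high output value.
def model(x):
    return x[:, 0] ** 2 + 0.1 * x[:, 1]

rng = np.random.default_rng(6)
x = rng.uniform(-1.0, 1.0, size=(5000, 2))
y = model(x)

behavioural = y > np.quantile(y, 0.9)   # runs producing the top 10% of outputs

# Compare, input by input, the values behind behavioural runs with the rest;
# a large Kolmogorov-Smirnov statistic flags an input that drives high outputs.
for i in range(x.shape[1]):
    stat, p = ks_2samp(x[behavioural, i], x[~behavioural, i])
    print(f"input {i + 1}: KS statistic = {stat:.2f}, p-value = {p:.2g}")
</syntaxhighlight>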


==Other Issues==
===Assumptions vs. inferences===
In uncertainty and sensitivity analysis there is a crucial trade-off between how scrupulous an analyst is in exploring the input [[:wikt:assumption|assumptions]] and how wide the resulting [[inference]] may be. The point is well illustrated by the econometrician Edward E. Leamer (1990):<ref>Leamer, E., (1990) Let's take the con out of econometrics, and Sensitivity analysis would help. In C. Granger (ed.), Modelling Economic Series. Oxford: Clarendon Press 1990.</ref>


<blockquote>I have proposed a form of organized sensitivity analysis that I call ‘global sensitivity analysis’ in which a neighborhood of alternative assumptions is selected and the corresponding interval of inferences is identified. Conclusions are judged to be sturdy only if the neighborhood of assumptions is wide enough to be credible and the corresponding interval of inferences is narrow enough to be useful.</blockquote>


Note Leamer’s emphasis is on the need for 'credibility' in the selection of assumptions. The easiest way to invalidate a model is to demonstrate that it is fragile with respect to the uncertainty in the assumptions or to show that its assumptions have not been taken 'wide enough'. The same concept is expressed by Jerome R. Ravetz, for whom bad modeling is when ''uncertainties in inputs must be suppressed lest outputs become indeterminate.''<ref>Ravetz, J.R., 2007, ''No-Nonsense Guide to Science'', New Internationalist Publications Ltd.</ref>


===Pitfalls and Difficulties===
In a sensitivity analysis, a Type I error is assessing as important a factor that is unimportant, and a Type II error is assessing as unimportant a factor that is important; a Type III error corresponds to analysing the wrong problem, e.g. via an incorrect specification of the input uncertainties. Some common difficulties in sensitivity analysis include:
* Too many model inputs to analyse. Screening can be used to reduce dimensionality.
* The model takes too long to run. Emulators (including HDMR) can reduce the number of model runs needed.
* There is not enough information to build probability distributions for the inputs. Probability distributions can be constructed from [[expert elicitation]], although even then it may be hard to build distributions with great confidence. The subjectivity of the probability distributions or ranges will strongly affect the sensitivity analysis.
* Unclear purpose of the analysis. Different statistical tests and measures are applied to the problem and different factors rankings are obtained. The test should instead be tailored to the purpose of the analysis, e.g. one uses Monte Carlo filtering if one is interested in which factors are most responsible for generating high/low values of the output.
* Too many model outputs are considered. This may be acceptable for quality assurance of sub-models but should be avoided when presenting the results of the overall analysis.
* Piecewise sensitivity. This is when one performs sensitivity analysis on one sub-model at a time. This approach is non-conservative as it might overlook interactions among factors in different sub-models (Type II error).


==Applications==
Some examples of sensitivity analyses performed in various disciplines follow here.


Sensitivity Analysis is common in physics and chemistry,<ref>Saltelli, A., M. Ratto, S. Tarantola and F. Campolongo (2005) Sensitivity Analysis for Chemical Models, ''Chemical Reviews'', 105(7) pp 2811–2828.</ref> in [[financial]] applications, risk analysis, [[signal processing]], [[neural networks]] and any area where models are developed. Sensitivity analysis can also be used in model-based [[policy assessment studies]].<ref>Saltelli, Andrea (2006) [http://www.modeling.uga.edu/tauc/background_material/Washington-Main.pdf "The critique of modelling and sensitivity analysis in the scientic discourse: An overview of good practices"], Transatlantic Uncertainty Colloquium (TAUC) Washington, October 10–11</ref> Sensitivity analysis can be used to assess the robustness of [[composite indicators]],<ref>Saisana M., Saltelli A., Tarantola S. (2005) "Uncertainty and Sensitivity analysis techniques as tools for the quality assessment of composite indicators", ''[[Journal of the Royal Statistical Society]], A'', '''168''' (2), 307&ndash;323.</ref> also known as indices, such as the [[Environmental Performance Index]].


===Environmental===
Environmental computer models are increasingly used in a wide variety of studies and applications. For example, [[global climate model|global climate models]] are used for both short-term [[weather forecasts]] and long-term [[climate change]]. Moreover, computer models are increasingly used for environmental decision-making at a local scale, for example for assessing the impact of a waste water treatment plant on a river flow, or for assessing the behavior and life-length of bio-filters for contaminated waste water.


In both cases sensitivity analysis may help to understand the contribution of the various sources of uncertainty to the model output uncertainty and the system performance in general. In these cases, depending on model complexity, different sampling strategies may be advisable and traditional sensitivity indices have to be generalized to cover multiple model outputs,<ref>Fassò, Alessandro () [http://www.iemss.org/iemss2006/papers/s7/268_Fasso_0.pdf "Sensitivity Analysis for Environmental Models and Monitoring Networks"]. Preprint</ref> [[heteroskedastic]] effects and correlated inputs.



===Business===
In a decision problem, the analyst may want to identify cost drivers as well as other quantities for which we need to acquire better knowledge in order to make an informed decision. On the other hand, some quantities have no influence on the predictions, so that we can save resources at no loss in accuracy by relaxing some of the conditions. See [[Corporate finance#Quantifying uncertainty|Corporate finance: Quantifying uncertainty]].
In addition to the general motivations listed above, sensitivity analysis can help in a variety of other circumstances specific to business:

* To identify critical assumptions or compare alternative model structures
* To guide future data collections
* To detect important criteria
* To optimize the tolerance of manufactured parts in terms of the uncertainty in the parameters
* To optimize resource allocation
* To simplify the model, e.g. by lumping together parts of it
However, there are also some problems associated with sensitivity analysis in the business context:
* Variables are often interdependent ([[correlation|correlated]]), which makes examining them each individually unrealistic; e.g. changing one factor, such as sales volume, will most likely affect other factors, such as the selling price.
* Often the assumptions upon which the analysis is based are made by using past experience/data which may not hold in the future.
* Often the assumptions upon which the analysis is based are made by using past experience/data which may not hold in the future.
* Assigning a maximum and minimum (or optimistic and pessimistic) value is open to subjective interpretation. For instance one persons 'optimistic' forecast may be more conservative than that of another person performing a different part of the analysis. This sort of subjectivity can adversely affect the accuracy and overall objectivity of the analysis.
* Assigning a maximum and minimum (or optimistic and pessimistic) value is open to subjective interpretation. For instance one person's 'optimistic' forecast may be more conservative than that of another person performing a different part of the analysis. This sort of subjectivity can adversely affect the accuracy and overall objectivity of the analysis.

===Social Sciences===
Examples of research-led sensitivity analyses include studies of the gender wage gap in Chile<ref>Perticara, M (2007) 'Gender wage gap in Chile: a sensitivity analysis: Analizing the differences between men and women employment and wages', Alberto Hurtado University: http://cloud2.gdnet.org/~research_papers/Gender%20wage%20gap%20in%20Chile:%20a%20sensitivity%20analysis</ref> and of water sector interventions in Nigeria.

In modern econometrics the use of sensitivity analysis to anticipate criticism is the subject of one of the ten commandments of applied econometrics (from Kennedy, 2007<ref>Kennedy, P. (2007). ''A guide to econometrics'', Fifth edition. Blackwell Publishing.</ref>):

<blockquote>Thou shall confess in the presence of sensitivity. Corollary: Thou shall anticipate criticism [...] When reporting a sensitivity analysis, researchers should explain fully their specification search so that the readers can judge for themselves how the results may have been affected. This is basically an ‘honesty is the best policy’ approach, advocated by Leamer (1978<ref>Leamer, E. (1978). ''Specification Searches: Ad Hoc Inferences with Nonexperimental Data''. John Wiley & Sons, Ltd, p. vi.</ref>).</blockquote>

Sensitivity analysis can also be used in model-based policy assessment studies.<ref>Saltelli, Andrea (2006) [http://www.modeling.uga.edu/tauc/background_material/Washington-Main.pdf "The critique of modelling and sensitivity analysis in the scientic discourse: An overview of good practices"], Transatlantic Uncertainty Colloquium (TAUC) Washington, October 10–11</ref> Sensitivity analysis can be used to assess the robustness of composite indicators,<ref>Saisana M., Saltelli A., Tarantola S. (2005) "Uncertainty and Sensitivity analysis techniques as tools for the quality assessment of composite indicators", ''[[Journal of the Royal Statistical Society]], A'', '''168''' (2), 307&ndash;323.</ref> also known as indices, such as the [[Environmental Performance Index]].

===Chemistry===
Sensitivity analysis is common in many areas of physics and chemistry.<ref>Saltelli, A., M. Ratto, S. Tarantola and F. Campolongo (2005) Sensitivity Analysis for Chemical Models, ''Chemical Reviews'', 105(7) pp 2811–2828.</ref>


===Chemical kinetics===
With the accumulation of knowledge about the kinetic mechanisms under investigation, and with the advance of modern computing power, detailed complex kinetic models are increasingly used as predictive tools and as aids for understanding the underlying phenomena. A kinetic model is usually described by a set of differential equations representing the concentration-time relationship. Sensitivity analysis has been proven to be a powerful tool to investigate a complex kinetic model.<ref>Rabitz, H., M. Kramer and D. Dacol (1983). Sensitivity analysis in chemical kinetics. ''Annual Review of Physical Chemistry'', '''34''', 419–461.</ref><ref>Turanyi, T (1990). Sensitivity analysis of complex kinetic systems. Tools and applications. ''Journal of Mathematical Chemistry'', '''5''', 203–248.</ref><ref name="SRI">Komorowski M, Costa MJ, Rand DA, Stumpf MPH (2011). Sensitivity, robustness, and identifiability in stochastic chemical kinetics models. ''Proc Natl Acad Sci U S A'', '''108(21)''', 8645–50.</ref>

Kinetic parameters are frequently determined from experimental data via nonlinear estimation. Sensitivity analysis can be used for [[optimal experimental design]], e.g. determining initial conditions, measurement positions, and sampling time, to generate informative data which are critical to estimation accuracy. A great number of parameters in a complex model can be candidates for estimation but not all are estimable.<ref name="SRI"/> Sensitivity analysis can be used to identify the influential parameters which can be determined from available data while screening out the unimportant ones. Sensitivity analysis can also be used to identify the redundant species and reactions allowing model reduction.
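
As a minimal illustration of the idea (not taken from the cited studies), the Python sketch below computes local sensitivity coefficients for a hypothetical first-order reaction A → B, whose concentration profile [A](t) = A0·exp(-kt) is known in closed form. The rate constant, initial concentration and sampling times are arbitrary assumptions for the example; for a realistic mechanism the same finite-difference idea would be applied to a numerically integrated system of rate equations.

<syntaxhighlight lang="python">
import numpy as np

def concentration_A(t, k, A0=1.0):
    # Hypothetical first-order reaction A -> B: [A](t) = A0 * exp(-k * t).
    return A0 * np.exp(-k * t)

t = np.linspace(0.0, 10.0, 6)   # sampling times (arbitrary choice)
k = 0.5                          # nominal rate constant (assumed)
dk = 1e-6                        # finite-difference step in k

# Local sensitivity coefficient d[A]/dk at each sampling time, by central differences.
sens = (concentration_A(t, k + dk) - concentration_A(t, k - dk)) / (2 * dk)
for ti, si in zip(t, sens):
    analytic = -ti * np.exp(-k * ti)   # exact derivative, available here because the model is analytic
    print(f"t = {ti:4.1f}   d[A]/dk ~ {si:+.4f}   (exact {analytic:+.4f})")
</syntaxhighlight>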

===Engineering===
Modern [[engineering]] design makes extensive use of computer models to test designs before they are manufactured. Sensitivity analysis allows designers to assess the effects and sources of uncertainties, in the interest of building robust models. Sensitivity analyses have for example been performed in biomechanical models<ref>Becker W., Rowson J., Oakley J.E., Yoxall A., Manson G., Worden K. (2011) Bayesian sensitivity analysis of a model of the aortic valve, ''Journal of Biomechanics'' '''44'''(8): 1499-506</ref> amongst others.


===In meta-analysis===
In a [[meta analysis]], a sensitivity analysis tests if the results are sensitive to restrictions on the data included. Common examples are large trials only, higher quality trials only, and more recent trials only. If results are consistent it provides stronger evidence of an effect and of [[generalizability]].<ref>[http://clinicalevidence.bmj.com/ceweb/resources/glossary.jsp clinicalevidence.bmj.com > Glossary > sensitivity analysis] Retrieved on June 21, 2010</ref>

===Multi-criteria decision making===
Sometimes a sensitivity analysis may reveal surprising insights about the subject of interest. For instance, the field of [[multi-criteria decision making]] (MCDM) studies (among other topics) the problem of how to select the best alternative among a number of competing alternatives. This is an important task in [[decision making]]. In such a setting each alternative is described in terms of a set of evaluative criteria, and these criteria are associated with weights of importance. Intuitively, one may think that the larger the weight for a criterion is, the more critical that criterion should be. However, this may not be the case. It is important to distinguish here the notion of ''criticality'' from that of ''importance''. By ''critical'' we mean that a small change (as a percentage) in the weight of a criterion may cause a significant change of the final solution. It is possible for criteria with rather small weights of importance (i.e., ones that are not so important in that respect) to be much more critical in a given situation than ones with larger weights.<ref name = 'SENSITIVITY'>{{cite journal | title = A Sensitivity Analysis Approach for Some Deterministic Multi-Criteria Decision-Making Methods | journal = Decision Sciences | year=1997 | first = E. | last = Triantaphyllou | authorlink = | coauthors = A. Sanchez | volume = 28 | issue = 1 | pages = 151–194 | id = | url = http://www.csc.lsu.edu/trianta/Journal_PAPERS1/SENSIT1.htm | accessdate = 2010-06-28 }}</ref><ref name='MCDM'>{{cite book | last = Triantaphyllou | first = E. | authorlink = | title = Multi-Criteria Decision Making: A Comparative Study | publisher = Kluwer Academic Publishers (now Springer) | year = 2000 | location = Dordrecht, The Netherlands | pages = 320 | url = http://www.csc.lsu.edu/trianta/Books/DecisionMaking1/Book1.htm | doi = | id = | isbn = 0-7923-6607-7 }}</ref> That is, a sensitivity analysis may shed light on issues not anticipated at the beginning of a study. This, in turn, may dramatically improve the effectiveness of the initial study and assist in the successful implementation of the final solution. A minimal numerical illustration of this point is sketched below.
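
The following Python sketch (with entirely hypothetical numbers, not taken from the cited studies) illustrates the point for a simple weighted-sum model: each criterion weight is perturbed in 1% steps until the top-ranked alternative changes, and in this toy decision matrix the criterion with the smallest weight turns out to need the smallest relative change to flip the decision.

<syntaxhighlight lang="python">
import numpy as np

# Hypothetical decision matrix: rows = alternatives, columns = criteria (already normalized).
scores = np.array([[0.66, 0.61, 0.62],
                   [0.65, 0.60, 0.70],
                   [0.40, 0.40, 0.40]])
weights = np.array([0.5, 0.4, 0.1])   # the third criterion has the smallest weight

def best(w):
    # Weighted-sum ranking; normalizing w does not change the arg-max.
    return int(np.argmax(scores @ (w / w.sum())))

baseline_best = best(weights)
print("baseline best alternative:", baseline_best + 1)

# For each criterion, find the smallest relative weight change (scanned in 1% steps,
# in both directions) that changes the top-ranked alternative.
for j in range(len(weights)):
    flip = None
    for rel in np.arange(0.01, 1.005, 0.01):
        for sign in (-1.0, 1.0):
            w = weights.copy()
            w[j] = max(weights[j] * (1.0 + sign * rel), 0.0)
            if best(w) != baseline_best:
                flip = rel
                break
        if flip is not None:
            break
    if flip is None:
        print(f"criterion {j + 1}: no flip within +/-100% of its weight")
    else:
        print(f"criterion {j + 1}: decision flips after a {flip:.0%} change in its weight")
</syntaxhighlight>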

==Related concepts==
Sensitivity analysis is closely related to [[uncertainty analysis]]; while the latter studies the overall [[uncertainty]] in the conclusions of the study, sensitivity analysis tries to identify which source of uncertainty weighs more on the study's conclusions.

The problem setting in sensitivity analysis also has strong similarities with the field of [[design of experiments]]. In a design of experiments, one studies the effect of some process or intervention (the 'treatment') on some objects (the 'experimental units'). In sensitivity analysis one looks at the effect of varying the inputs of a mathematical model on the output of the model itself. In both disciplines one strives to obtain information from the system with a minimum of physical or numerical experiments.


==See also==
* [[Variance-based sensitivity analysis]]
* [[Fourier amplitude sensitivity testing]]
* [[Elementary effects method]]
* [[Morris method]]
* [[Uncertainty analysis]]
* [[Uncertainty quantification]]
* [[Experimental uncertainty analysis]]
* [[Info-gap decision theory]]
* [[Perturbation analysis]]
* [[ROC curve]]
* [[Interval FEM]]


==References==
{{reflist}}




==External links==
*[http://ipsc.jrc.ec.europa.eu/events.php?idx=41/ Seventh Summer School on Sensitivity Analysis], Ispra, (ITALY) 3–6 July 2012
*[http://www.gdr-mascotnum.fr/2013/ 7th International Conference on Sensitivity Analysis of Model Output], July 1-4, 2013, University of Nice, Valrose Campus, Nice, France.
*[http://samo2010.unibocconi.it/ Sixth International Conference on Sensitivity Analysis of Model Output], Bocconi University, Milan (ITALY), 19–22 July 2010
* [http://sensitivity-analysis.jrc.ec.europa.eu web-page on Sensitivity analysis ] - (Joint Research Centre of the European Commission)
* [http://simlab.jrc.ec.europa.eu SimLab], the free software for global sensitivity analysis of the Joint Research Centre
* [http://www.life-cycle-costing.de/sensitivity_analysis/ Sensitivity Analysis Excel Add-In] is a free (for private and commercial use) Excel Add-In that allows for simple sample based sensitivity analysis runs
* [http://mucm.ac.uk/index.html MUCM Project] - Extensive resources for uncertainty and sensitivity analysis of computationally-demanding models.
* [http://ctcd.group.shef.ac.uk/gem.html GEM-SA] - a program for performing sensitivity analysis with Gaussian processes.


{{DEFAULTSORT:Sensitivity Analysis}}
[[pl:Analiza wrażliwości]]
[[pt:Análise de sensibilidade]]
[[ru:Анализ чувствительности]]
[[fi:Herkkyysanalyysi]]
[[vi:Phân tích độ nhạy]]

Revision as of 16:36, 30 October 2012


Overview

A mathematical model is defined by a series of equations, input variables and parameters aimed at characterizing some process under investigation. Some examples might be a climate model, an economic model, or a finite element model in engineering. Increasingly, such models are highly complex, and as a result their input/output relationships may be poorly understood. In such cases, the model can be viewed as a black box, i.e. the output is an intractable function of its inputs.

Quite often, some or all of the model inputs are subject to sources of uncertainty, including errors of measurement, absence of information and poor or partial understanding of the driving forces and mechanisms. This uncertainty imposes a limit on our confidence in the response or output of the model. Further, models may have to cope with the natural intrinsic (aleatory) variability of the system, such as the occurrence of stochastic events.[3]

Good modeling practice requires that the modeler provides an evaluation of the confidence in the model. This requires, first, a quantification of the uncertainty in any model results (uncertainty analysis); and second, an evaluation of how much each input is contributing to the output uncertainty. Sensitivity analysis addresses the second of these issues (although uncertainty analysis is usually a necessary precursor), performing the role of ordering by importance the strength and relevance of the inputs in determining the variation in the output.[1]

In models involving many input variables, sensitivity analysis is an essential ingredient of model building and quality assurance. National and international agencies involved in impact assessment studies have included sections devoted to sensitivity analysis in their guidelines. Examples are the European Commission (see e.g. the guidelines for impact assessment), the White House Office of Management and Budget, the Intergovernmental Panel on Climate Change and US Environmental Protection Agency's modelling guidelines.

Settings and Constraints

The choice of method of sensitivity analysis is typically dictated by a number of problem constraints or settings. Some of the most common are:

  • Computational expense: Sensitivity analysis is almost always performed by running the model a (possibly large) number of times, i.e. a sampling-based approach[4]. This can be a significant problem when:
    • A single run of the model takes a significant amount of time (minutes, hours or longer). This is not unusual with very complex models.
    • The model has a large number of uncertain inputs. Sensitivity analysis is essentially the exploration of the multidimensional input space, which grows exponentially in size with the number of inputs. See the curse of dimensionality.
Computational expense is a problem in many practical sensitivity analyses. Some methods of reducing computational expense include the use of emulators (for large models), and screening methods (for reducing the dimensionality of the problem).
  • Correlated inputs: Most common sensitivity analysis methods assume independence between model inputs, but sometimes inputs can be strongly correlated. This is still an immature field of research and definitive methods have yet to be established.
  • Nonlinearity: Some sensitivity analysis approaches, such as those based on linear regression, can inaccurately measure sensitivity when the model response is nonlinear with respect to its inputs. In such cases, variance-based measures are more appropriate.
  • Model interactions: Interactions occur when the perturbation of two or more inputs simultaneously causes variation in the output greater than that of varying each of the inputs alone. Such interactions are present in any model that is non-additive, but will be neglected by methods such as scatterplots and one-at-a-time perturbations[5]. The effect of interactions can be measured by the total-order sensitivity index.
  • Multiple outputs: Virtually all sensitivity analysis methods consider a single univariate model output, yet many models output a large number of possibly spatially or time-dependent data. Note that this does not preclude the possibility of performing different sensitivity analyses for each output of interest. However, for models in which the outputs are correlated, the sensitivity measures can be hard to interpret.
  • Given data: While in many cases the practitioner has access to the model, in some instances a sensitivity analysis must be performed with "given data", i.e. where the sample points (the values of the model inputs for each run) cannot be chosen by the analyst. This may occur when a sensitivity analysis has to be performed retrospectively, perhaps using data from an optimisation or uncertainty analysis, or when data comes from a discrete source.[6]

Core Methodology

[Figure: Ideal scheme of a possibly sampling-based sensitivity analysis. Uncertainty arising from different sources (errors in the data, the parameter estimation procedure, alternative model structures) is propagated through the model for uncertainty analysis, and its relative importance is quantified via sensitivity analysis.]
[Figure: Sampling-based sensitivity analysis by scatterplots. Y (vertical axis) is a function of four factors. The points in the four scatterplots are always the same though sorted differently, i.e. by Z1, Z2, Z3, Z4 in turn. Note that the abscissa is different for each plot: (−5, +5) for Z1, (−8, +8) for Z2, (−10, +10) for Z3 and Z4. Z4 is most important in influencing Y as it imparts more 'shape' on Y.]

There are a large number of approaches to performing a sensitivity analysis, many of which have been developed to address one or more of the constraints discussed above.[1] They are also distinguished by the type of sensitivity measure, be it based on (for example) variance decompositions, partial derivatives or elementary effects. In general, however, most procedures adhere to the following outline:

  1. Quantify the uncertainty in each input (e.g. ranges, probability distributions). Note that this can be difficult and many methods exist to elicit uncertainty distributions from subjective data [7].
  2. Identify the model output to be analysed (the target of interest should ideally have a direct relation to the problem tackled by the model).
  3. Run the model a number of times using some design of experiments[8], dictated by the method of choice and the input uncertainty.
  4. Using the resulting model outputs, calculate the sensitivity measures of interest.

In some cases this procedure will be repeated, for example in high-dimensional problems where the user has to screen out unimportant variables before performing a full sensitivity analysis.

This section discusses various types of "core methods", distinguished by the various sensitivity measures that are calculated (note that some of these categories "overlap" somewhat). The following section focuses on alternative ways of obtaining these measures, under the constraints of the problem.

One-at-a-time (OAT/OFAT)

One of the simplest and most common approaches is that of changing one-factor-at-a-time (OFAT or OAT), to see what effect this produces on the output.[9] [10] [11] OAT customarily involves:

  • Moving one input variable, keeping others at their baseline (nominal) values, then,
  • Returning the variable to its nominal value, then repeating for each of the other inputs in the same way.

Sensitivity may then be measured by monitoring changes in the output, e.g. by partial derivatives or linear regression. This appears a logical approach, as any change observed in the output will unambiguously be due to the single variable changed. Furthermore, by changing one variable at a time, one can keep all other variables fixed to their central or baseline values. This increases the comparability of the results (all ‘effects’ are computed with reference to the same central point in space) and minimizes the chances of computer programme crashes, which are more likely when several input factors are changed simultaneously. OAT is frequently preferred by modellers for practical reasons: in case of model failure under an OAT analysis, the modeller immediately knows which input factor is responsible for the failure.[5]

Despite its simplicity however, this approach does not fully explore the input space, since it does not take into account the simultaneous variation of input variables. This means that the OAT approach cannot detect the presence of interactions between input variables.[12]
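
As a minimal illustration (not taken from the article's references), the following Python sketch applies OAT to a small hypothetical model: each input is moved by ±10% of its baseline value while the others are held at their nominal values, and the resulting change in the output is recorded. The model f, the baseline point and the perturbation size are arbitrary assumptions for the example; note that the interaction between the first two inputs is invisible to this design.

<syntaxhighlight lang="python">
import numpy as np

def f(x):
    # Hypothetical model with an interaction between x[0] and x[1].
    return x[0] + 2.0 * x[1] ** 2 + x[0] * x[1] + 0.5 * x[2]

baseline = np.array([1.0, 1.0, 1.0])   # nominal (baseline) input values, assumed
y0 = f(baseline)

# One-at-a-time: perturb each input by +/-10% of its baseline, keeping the others fixed.
for i in range(len(baseline)):
    for delta in (-0.10, +0.10):
        x = baseline.copy()
        x[i] = baseline[i] * (1.0 + delta)
        print(f"input {i + 1}, perturbation {delta:+.0%}: output change {f(x) - y0:+.4f}")
</syntaxhighlight>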

Local methods

Local methods involve taking the partial derivative of the output Y with respect to an input factor Xi:

<math>\left.\frac{\partial Y}{\partial X_i}\right|_{\mathbf{x}^0},</math>

where the subscript X0 indicates that the derivative is taken at some fixed point in the space of the input (hence the 'local' in the name of the class). Adjoint modelling[13][14] and Automated Differentiation[15] are methods in this class. Similar to OAT/OFAT, local methods do not attempt to fully explore the input space, since they examine small perturbations, typically one variable at a time.
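
A minimal sketch of the local approach, under the same caveats (hypothetical model and fixed point chosen for the example): the partial derivatives of Y with respect to each Xi at x0 are approximated here by central finite differences, standing in for the exact derivatives that adjoint modelling or automatic differentiation would provide.

<syntaxhighlight lang="python">
import numpy as np

def f(x):
    # Hypothetical model (assumed for the example).
    return x[0] ** 2 + np.sin(x[1]) + x[0] * x[2]

x0 = np.array([1.0, 0.5, 2.0])   # fixed point at which the derivatives are evaluated
h = 1e-6                          # finite-difference step

# Central-difference approximation of the local sensitivity dY/dX_i at x0.
for i in range(len(x0)):
    xp, xm = x0.copy(), x0.copy()
    xp[i] += h
    xm[i] -= h
    print(f"dY/dX{i + 1} at x0 ~ {(f(xp) - f(xm)) / (2 * h):+.4f}")
</syntaxhighlight>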

Scatter plots

A simple but useful tool is to plot scatter plots of the output variable against individual input variables, after (randomly) sampling the model over its input distributions. The advantage of this approach is that it can also deal with "given data", i.e. a set of arbitrarily-placed data points, and gives a direct visual indication of sensitivity. Quantitative measures can also be drawn, for example by measuring the correlation between Y and Xi, or even by estimating variance-based measures by nonlinear regression[6].
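
For instance (a sketch with an assumed test model and input ranges, here the Ishigami function on independent uniform inputs), one can sample the inputs, plot the output against each input in turn with matplotlib, and compute the linear correlation as a simple quantitative companion to the plots.

<syntaxhighlight lang="python">
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n = 1000
X = rng.uniform(-np.pi, np.pi, size=(n, 3))   # assumed independent uniform inputs

# Ishigami function, a common sensitivity-analysis test model (assumed here).
Y = np.sin(X[:, 0]) + 7.0 * np.sin(X[:, 1]) ** 2 + 0.1 * X[:, 2] ** 4 * np.sin(X[:, 0])

fig, axes = plt.subplots(1, 3, figsize=(12, 3), sharey=True)
for i, ax in enumerate(axes):
    ax.scatter(X[:, i], Y, s=4, alpha=0.4)
    ax.set_xlabel(f"X{i + 1}")
    print(f"corr(X{i + 1}, Y) = {np.corrcoef(X[:, i], Y)[0, 1]:+.2f}")
axes[0].set_ylabel("Y")
fig.tight_layout()
fig.savefig("scatterplots.png")   # one panel per input; more 'shape' in a panel indicates more sensitivity
</syntaxhighlight>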

Regression analysis

Regression analysis can be a very useful tool when the model response is approximately linear, and it is a simple approach with low computational cost. Sensitivity can be judged by standardized regression coefficients. However, this method is ineffective when the response is strongly nonlinear; such nonlinearity can be detected by examining the coefficient of determination of the fit.
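
A brief sketch of the standardized-regression-coefficient approach, using a nearly linear hypothetical model (the coefficients and noise level are assumptions for the example): inputs and output are standardized, a linear model is fitted by least squares, and the fitted coefficients are reported together with the coefficient of determination, which indicates whether the linear approximation, and hence the sensitivity ranking, can be trusted.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(1)
n = 2000
X = rng.normal(size=(n, 3))
# Nearly linear hypothetical model with a little noise (assumed for the example).
Y = 1.0 * X[:, 0] + 3.0 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.1, size=n)

# Standardize, then fit Y ~ sum_i beta_i X_i; the beta_i are the standardized
# regression coefficients (SRCs) used as sensitivity measures.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
Ys = (Y - Y.mean()) / Y.std()
beta, residuals, _, _ = np.linalg.lstsq(Xs, Ys, rcond=None)
r2 = 1.0 - residuals[0] / np.sum(Ys ** 2)

print("standardized regression coefficients:", np.round(beta, 3))
print("R^2 of the linear fit:", round(float(r2), 3))   # close to 1 -> the linear model is adequate
</syntaxhighlight>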

Variance-based methods

Variance-based methods[16][17][18] are a class of probabilistic approaches which quantify the input and output uncertainties as probability distributions, and decompose the output variance into parts attributable to input variables and combinations of variables. The sensitivity of the output to an input variable is therefore measured by the amount of variance in the output caused by that input. These can be expressed as conditional expectations, i.e. considering a model Y=f(X) for X={X1, X2, ... Xk}, a measure of sensitivity of the ith variable Xi is given as

<math>\operatorname{Var}_{X_i}\!\left(E_{\mathbf{X}_{\sim i}}\left(Y \mid X_i\right)\right),</math>

where "Var" and "E" denote the variance and expected value operators respectively, and <math>\mathbf{X}_{\sim i}</math> denotes the set of all input variables except <math>X_i</math>. This expression essentially measures the contribution of Xi alone to the uncertainty (variance) in Y (averaged over variations in other variables), and is known as the first-order sensitivity index or main effect index. Importantly, it does not measure the uncertainty caused by interactions with other variables. A further measure, known as the total effect index, gives the total variance in Y caused by Xi and its interactions with any of the other input variables. Both quantities are typically standardised by dividing by Var(Y).

Variance-based methods allow full exploration of the input space, accounting for interactions, and nonlinear responses. For these reasons they are widely used when it is feasible to calculate them. Typically this calculation involves the use of Monte Carlo methods, but since this can involve many thousands of model runs, other methods (such as emulators) can be used to reduce computational expense when necessary. Note that full variance decompositions are only meaningful when the input factors are independent from one another.[19]
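
As a worked sketch (assuming the Ishigami test function with independent uniform inputs, and using the common "pick-and-freeze" sampling scheme with the Saltelli and Jansen estimators), the first-order and total-effect indices can be estimated by plain Monte Carlo as below; in practice a dedicated library such as SALib would usually be used instead of hand-rolled code.

<syntaxhighlight lang="python">
import numpy as np

def model(X):
    # Ishigami test function, a standard benchmark in sensitivity analysis (assumed here).
    return np.sin(X[:, 0]) + 7.0 * np.sin(X[:, 1]) ** 2 + 0.1 * X[:, 2] ** 4 * np.sin(X[:, 0])

rng = np.random.default_rng(2)
N, k = 20000, 3
A = rng.uniform(-np.pi, np.pi, size=(N, k))   # two independent input samples
B = rng.uniform(-np.pi, np.pi, size=(N, k))
fA, fB = model(A), model(B)
var_y = np.var(np.concatenate([fA, fB]))

for i in range(k):
    ABi = A.copy()
    ABi[:, i] = B[:, i]          # "pick and freeze": column i is taken from B
    fABi = model(ABi)
    S_i = np.mean(fB * (fABi - fA)) / var_y          # first-order index (Saltelli estimator)
    ST_i = np.mean((fA - fABi) ** 2) / (2 * var_y)   # total-effect index (Jansen estimator)
    print(f"X{i + 1}: first-order ~ {S_i:.2f}, total-effect ~ {ST_i:.2f}")
</syntaxhighlight>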

Screening

Screening is a particular instance of a sampling-based method. The objective here is to identify which input variables contribute significantly to the output uncertainty in high-dimensionality models, rather than to quantify sensitivity exactly (e.g. in terms of variance). Screening tends to have a relatively low computational cost when compared to other approaches, and can be used in a preliminary analysis to weed out uninfluential variables before applying a more informative analysis to the remaining set. One of the most commonly used screening methods is the elementary effects method.[20][21]
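
The following sketch implements a simplified, radial variant of the elementary effects idea (an assumption made for illustration; the original Morris method uses randomized trajectories): for a number of randomly chosen base points, each factor is moved once by a fixed step, and the mean absolute elementary effect and its standard deviation are reported per factor. The test function and step size are arbitrary choices.

<syntaxhighlight lang="python">
import numpy as np

def model(x):
    # Ishigami test function (assumed), with inputs mapped from the unit cube to [-pi, pi].
    return np.sin(x[0]) + 7.0 * np.sin(x[1]) ** 2 + 0.1 * x[2] ** 4 * np.sin(x[0])

rng = np.random.default_rng(3)
k, r, delta = 3, 50, 0.1          # number of factors, repetitions, step size (unit-cube units)
ee = np.zeros((r, k))

for j in range(r):
    u = rng.uniform(0.0, 1.0 - delta, size=k)   # random base point in the unit cube
    y0 = model(-np.pi + 2.0 * np.pi * u)
    for i in range(k):
        u2 = u.copy()
        u2[i] += delta                           # move factor i by one step
        ee[j, i] = (model(-np.pi + 2.0 * np.pi * u2) - y0) / delta   # elementary effect

mu_star = np.abs(ee).mean(axis=0)   # large mu* -> influential factor
sigma = ee.std(axis=0)              # large sigma -> nonlinearity or interactions
for i in range(k):
    print(f"X{i + 1}: mu* = {mu_star[i]:.2f}, sigma = {sigma[i]:.2f}")
</syntaxhighlight>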

Alternative Methods

A number of methods have been developed to overcome some of the constraints discussed above, which would otherwise make the estimation of sensitivity measures infeasible (most often due to computational expense). Generally, these methods focus on efficiently calculating variance-based measures of sensitivity.

Emulators

Emulators (also known as metamodels, surrogate models or response surfaces) are data-modelling/machine learning approaches that involve building a relatively simple mathematical function, known as an emulator, that approximates the input/output behaviour of the model itself.[22] In other words, it is the concept of "modelling a model" (hence the name "metamodel"). The idea is that, although computer models may be a very complex series of equations that can take a long time to solve, they can always be regarded as a function of their inputs Y=f(X). By running the model at a number of points in the input space, it may be possible to fit a much simpler emulator η(X), such that η(X)≈f(X) to within an acceptable margin of error. Then, sensitivity measures can be calculated from the emulator (either with Monte Carlo or analytically), which will have a negligible additional computational cost. Importantly, the number of model runs required to fit the emulator can be orders of magnitude less than the number of runs required to directly estimate the sensitivity measures from the model.[23]

Clearly the crux of an emulator approach is to find an η (emulator) that is a sufficiently close approximation to the model f. This requires the following steps:

  1. Sampling (running) the model at a number of points in its input space. This requires a sample design.
  2. Selecting a type of emulator (mathematical function) to use.
  3. "Training" the emulator using the sample data from the model - this generally involves adjusting the emulator parameters until the emulator mimics the true model as well as possible.

Sampling the model can often be done with low-discrepancy sequences, such as the Sobol sequence or Latin hypercube sampling, although random designs can also be used, at the loss of some efficiency. The selection of the emulator type and the training are intrinsically linked, since the training method will be dependent on the class of emulator. Some types of emulators that have been used successfully for sensitivity analysis include Gaussian processes,[23] polynomial chaos expansions,[24] smoothing-spline ANOVA models[25] and other nonparametric regression approaches.[22]

The use of an emulator introduces a machine learning problem, which can be difficult if the response of the model is highly nonlinear. In all cases it is useful to check the accuracy of the emulator, for example using cross-validation.
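
As an illustrative sketch (the "expensive" simulator here is just a cheap stand-in, and a Gaussian-process regressor from scikit-learn is only one possible choice of emulator), an emulator is trained on a modest design of model runs and then checked on held-out points before being trusted for sensitivity analysis.

<syntaxhighlight lang="python">
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def expensive_model(X):
    # Cheap stand-in for a slow simulator; in practice each row might take minutes or hours.
    return np.sin(X[:, 0]) + 7.0 * np.sin(X[:, 1]) ** 2 + 0.1 * X[:, 2] ** 4 * np.sin(X[:, 0])

rng = np.random.default_rng(4)
X_train = rng.uniform(-np.pi, np.pi, size=(200, 3))   # a modest design of model runs
y_train = expensive_model(X_train)

# Train the emulator (here a Gaussian process with a squared-exponential kernel).
gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), normalize_y=True)
gp.fit(X_train, y_train)

# Check the emulator on points where the model was not run before trusting it;
# Monte Carlo sensitivity estimates on the emulator are then nearly free.
X_test = rng.uniform(-np.pi, np.pi, size=(500, 3))
err = gp.predict(X_test) - expensive_model(X_test)
print("emulator RMSE on held-out points:", round(float(np.sqrt(np.mean(err ** 2))), 3))
</syntaxhighlight>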

High-Dimensional Model Representations (HDMR)

A high-dimensional model representation (HDMR)[26][27] (the term is due to H. Rabitz[28]) is essentially an emulator approach, which involves decomposing the function output into a linear combination of input terms and interactions of increasing dimensionality. The HDMR approach exploits the fact that the model can usually be well-approximated by neglecting higher-order interactions (second or third-order and above). The terms in the truncated series can then each be approximated by e.g. polynomials or splines, and the response expressed as the sum of the main effects and interactions up to the truncation order. From this perspective, HDMRs can be seen as emulators which neglect high-order interactions; the advantage being that they are able to emulate models with higher dimensionality than full-order emulators.
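
A minimal sketch of the idea behind a first-order (purely additive) HDMR, under assumptions chosen for illustration: the model is approximated by a sum of univariate cubic polynomials in each input, fitted jointly by least squares, and the variance of each fitted component is compared with the total output variance. Truncating at first order ignores interactions, exactly as described above, so the shares need not sum to one.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(6)
n, k, degree = 5000, 3, 3
X = rng.uniform(-np.pi, np.pi, size=(n, k))   # assumed independent uniform inputs
Y = np.sin(X[:, 0]) + 7.0 * np.sin(X[:, 1]) ** 2 + 0.1 * X[:, 2] ** 4 * np.sin(X[:, 0])  # Ishigami

# First-order HDMR: Y ~ f0 + sum_i f_i(X_i), with each f_i a cubic polynomial.
# Build the design matrix [1, X_i, X_i^2, X_i^3 for every i] and fit by least squares.
columns = [np.ones(n)]
for i in range(k):
    for d in range(1, degree + 1):
        columns.append(X[:, i] ** d)
A = np.column_stack(columns)
coef, _, _, _ = np.linalg.lstsq(A, Y, rcond=None)

# The variance of each fitted univariate component, relative to the total variance,
# gives a rough estimate of the corresponding main-effect contribution.
total_var = np.var(Y)
for i in range(k):
    block = A[:, 1 + i * degree: 1 + (i + 1) * degree]
    component = block @ coef[1 + i * degree: 1 + (i + 1) * degree]
    print(f"X{i + 1}: approximate main-effect share ~ {np.var(component) / total_var:.2f}")
</syntaxhighlight>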

Fourier Amplitude Sensitivity Test (FAST)

The Fourier Amplitude Sensitivity Test (FAST) uses the Fourier series to represent a multivariate function (the model) in the frequency domain, using a single frequency variable. Therefore, the integrals required to calculate sensitivity indices become univariate, resulting in computational savings.

Other

Methods based on Monte Carlo filtering.[29][30] These are also sampling-based and the objective here is to identify regions in the space of the input factors corresponding to particular values (e.g. high or low) of the output.
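
A short sketch of Monte Carlo filtering under assumed settings (Ishigami test function, uniform inputs, and "behavioural" defined here as the top 10% of output values): the input samples are split according to the output criterion and the two resulting input distributions are compared factor by factor with a two-sample Kolmogorov-Smirnov statistic from SciPy.

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(5)
n = 5000
X = rng.uniform(-np.pi, np.pi, size=(n, 3))   # assumed independent uniform inputs
Y = np.sin(X[:, 0]) + 7.0 * np.sin(X[:, 1]) ** 2 + 0.1 * X[:, 2] ** 4 * np.sin(X[:, 0])  # Ishigami

# Monte Carlo filtering: call a run "behavioural" if it produces a high output,
# then ask which inputs have clearly different distributions in the two groups.
behavioural = Y > np.quantile(Y, 0.90)
for i in range(3):
    result = ks_2samp(X[behavioural, i], X[~behavioural, i])
    print(f"X{i + 1}: KS distance = {result.statistic:.2f}"
          "  (larger values point to factors driving high outputs)")
</syntaxhighlight>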

Other Issues

Assumptions vs. inferences

In uncertainty and sensitivity analysis there is a crucial trade-off between how scrupulous an analyst is in exploring the input assumptions and how wide the resulting inference may be. The point is well illustrated by the econometrician Edward E. Leamer (1990):[31]

I have proposed a form of organized sensitivity analysis that I call ‘global sensitivity analysis’ in which a neighborhood of alternative assumptions is selected and the corresponding interval of inferences is identified. Conclusions are judged to be sturdy only if the neighborhood of assumptions is wide enough to be credible and the corresponding interval of inferences is narrow enough to be useful.

Note Leamer’s emphasis is on the need for 'credibility' in the selection of assumptions. The easiest way to invalidate a model is to demonstrate that it is fragile with respect to the uncertainty in the assumptions or to show that its assumptions have not been taken 'wide enough'. The same concept is expressed by Jerome R. Ravetz, for whom bad modeling is when uncertainties in inputs must be suppressed lest outputs become indeterminate.[32]

Pitfalls and Difficulties

Some common difficulties in sensitivity analysis include:

  • Too many model inputs to analyse. Screening can be used to reduce dimensionality.
  • The model takes too long to run. Emulators (including HDMR) can reduce the number of model runs needed.
  • There is not enough information to build probability distributions for the inputs. Probability distributions can be constructed from expert elicitation, although even then it may be hard to build distributions with great confidence. The subjectivity of the probability distributions or ranges will strongly affect the sensitivity analysis.
  • Unclear purpose of the analysis. Different statistical tests and measures are applied to the problem and different factor rankings are obtained. The test should instead be tailored to the purpose of the analysis, e.g. one uses Monte Carlo filtering if one is interested in which factors are most responsible for generating high/low values of the output.
  • Too many model outputs are considered. This may be acceptable for quality assurance of sub-models but should be avoided when presenting the results of the overall analysis.
  • Piecewise sensitivity. This is when one performs sensitivity analysis on one sub-model at a time. This approach is non-conservative as it might overlook interactions among factors in different sub-models (Type II error).

Applications

Some examples of sensitivity analyses performed in various disciplines follow here.

Environmental

Environmental computer models are increasingly used in a wide variety of studies and applications. For example, global climate models are used for both short-term weather forecasts and long-term climate change. Moreover, computer models are increasingly used for environmental decision-making at a local scale, for example for assessing the impact of a waste water treatment plant on a river flow, or for assessing the behavior and life-length of bio-filters for contaminated waste water.



References

  1. ^ a b c Saltelli, A., Ratto, M., Andres, T., Campolongo, F., Cariboni, J., Gatelli, D. Saisana, M., and Tarantola, S., 2008, Global Sensitivity Analysis. The Primer, John Wiley & Sons.
  2. ^ Pannell, D.J. (1997). Sensitivity analysis of normative economic models: Theoretical framework and practical strategies, Agricultural Economics 16: 139-152.[1]
  3. ^ Der Kiureghian, A., Ditlevsen, O. (2009) Aleatory or epistemic? Does it matter?, Structural Safety 31(2), 105-112.
  4. ^ J.C. Helton, J.D. Johnson, C.J. Salaberry, and C.B. Storlie, 2006, Survey of sampling based methods for uncertainty and sensitivity analysis. Reliability Engineering and System Safety, 91:1175–1209.
  5. ^ a b Saltelli, A., Annoni, P., 2010, How to avoid a perfunctory sensitivity analysis, Environmental Modeling and Software 25, 1508-1517.
  6. ^ a b Paruolo, P., Saisana, M., and Saltelli, A., (2012) Ratings and rankings: voodoo or science? The Royal Statistical Society: Journal Series A
  7. ^ O'Hagan, A., Uncertain Judgements: Eliciting Experts' Probabilities. Wiley, Chichester, 2006.
  8. ^ Sacks, J., W. J. Welch, T. J. Mitchell, and H. P. Wynn (1989). Design and analysis of computer experiments. Statistical Science 4, 409–435.
  9. ^ J. Campbell, et al. (2008), Photosynthetic Control of Atmospheric Carbonyl Sulfide During the Growing Season, Science 322: 1085-1088
  10. ^ R. Bailis, M. Ezzati, D. Kammen, (2005), Mortality and Greenhouse Gas Impacts of Biomass and Petroleum Energy Futures in Africa, Science 308: 98-103
  11. ^ J. Murphy, et al.(2004), Quantification of modelling uncertainties in a large ensemble of climate change simulations, Nature 430: 768-772
  12. ^ Czitrom (1999) "One-Factor-at-a-Time Versus Designed Experiments", American Statistician, 53, 2.
  13. ^ Cacuci, Dan G., Sensitivity and Uncertainty Analysis: Theory, Volume I, Chapman & Hall.
  14. ^ Cacuci, Dan G., Mihaela Ionescu-Bujor, Michael Navon, 2005, Sensitivity And Uncertainty Analysis: Applications to Large-Scale Systems (Volume II), Chapman & Hall.
  15. ^ Grievank, A. (2000). Evaluating derivatives, Principles and techniques of algorithmic differentiation. SIAM publisher.
  16. ^ Sobol’, I. (1990). Sensitivity estimates for nonlinear mathematical models. Matematicheskoe Modelirovanie 2, 112–118. in Russian, translated in English in Sobol’ , I. (1993). Sensitivity analysis for non-linear mathematical models. Mathematical Modeling & Computational Experiment (Engl. Transl.), 1993, 1, 407–414.
  17. ^ Homma, T. and A. Saltelli (1996). Importance measures in global sensitivity analysis of nonlinear models. Reliability Engineering and System Safety, 52, 1–17.
  18. ^ Saltelli, A., K. Chan, and M. Scott (Eds.) (2000). Sensitivity Analysis. Wiley Series in Probability and Statistics. New York: John Wiley and Sons.
  19. ^ Saltelli, A. and S. Tarantola (2002). On the relative importance of input factors in mathematical models: safety assessment for nuclear waste disposal. Journal of American Statistical Association, 97, 702–709.
  20. ^ Morris, M. D. (1991). Factorial sampling plans for preliminary computational experiments. Technometrics, 33, 161–174.
  21. ^ Campolongo, F., J. Cariboni, and A. Saltelli (2007). An effective screening design for sensitivity analysis of large models. Environmental Modelling and Software, 22, 1509–1518.
  22. ^ a b c Storlie, C.B., Swiler, L.P., Helton, J.C., and Sallaberry, C.J. (2009), Implementation and evaluation of nonparametric regression procedures for sensitivity analysis of computationally demanding models, Reliability Engineering & System Safety 94(11): 1735-1763
  23. ^ a b Oakley, J. and A. O'Hagan (2004). Probabilistic sensitivity analysis of complex models: a Bayesian approach. J. Royal Stat. Soc. B 66, 751–769.
  24. ^ Sudret, B., (2008), Global sensitivity analysis using polynomial chaos expansions, Reliability Engineering & System Safety 93(7): 964-979.
  25. ^ Ratto, M. and Pagano, A., (2010), Using recursive algorithms for the efficient identification of smoothing spline ANOVA models, AStA Advances in Statistical Analysis 94(4): 367-388
  26. ^ Li, G., J. Hu, S.-W. Wang, P. Georgopoulos, J. Schoendorf, and H. Rabitz (2006). Random Sampling-High Dimensional Model Representation (RS-HDMR) and orthogonality of its different order component functions. Journal of Physical Chemistry A 110, 2474–2485.
  27. ^ Li, G., Wang, S.-W., and Rabitz, H. (2002). Practical approaches to construct RS-HDMR component functions. Journal of Physical Chemistry 106, 8721–8733.
  28. ^ Rabitz, H. (1989). System analysis at molecular scale. Science, 246, 221–226.
  29. ^ Hornberger, G. and R. Spear (1981). An approach to the preliminary analysis of environmental systems. Journal of Environmental Management 7, 7–18.
  30. ^ Saltelli, A., S. Tarantola, F. Campolongo, and M. Ratto (2004). Sensitivity Analysis in Practice: A Guide to Assessing Scientific Models. John Wiley and Sons.
  31. ^ Leamer, E., (1990) Let's take the con out of econometrics, and Sensitivity analysis would help. In C. Granger (ed.), Modelling Economic Series. Oxford: Clarendon Press 1990.
  32. ^ Ravetz, J.R., 2007, No-Nonsense Guide to Science, New Internationalist Publications Ltd.
  33. ^ Fassò, Alessandro () "Sensitivity Analysis for Environmental Models and Monitoring Networks". Preprint
  34. ^ Perticara, M (2007) 'Gender wage gap in Chile: a sensitivity analysis: Analizing the differences between men and women employment and wages', Alberto Hurtado University: http://cloud2.gdnet.org/~research_papers/Gender%20wage%20gap%20in%20Chile:%20a%20sensitivity%20analysis
  35. ^ Kennedy, P. (2007). A guide to econometrics, Fifth edition. Blackwell Publishing.
  36. ^ Leamer, E. (1978). Specification Searches: Ad Hoc Inferences with Nonexperimental Data. John Wiley & Sons, Ltd, p. vi.
  37. ^ Saltelli, Andrea (2006) "The critique of modelling and sensitivity analysis in the scientic discourse: An overview of good practices", Transatlantic Uncertainty Colloquium (TAUC) Washington, October 10–11
  38. ^ Saisana M., Saltelli A., Tarantola S. (2005) "Uncertainty and Sensitivity analysis techniques as tools for the quality assessment of composite indicators", Journal of the Royal Statistical Society, A, 168 (2), 307–323.
  39. ^ Saltelli, A., M. Ratto, S. Tarantola and F. Campolongo (2005) Sensitivity Analysis for Chemical Models, Chemical Reviews, 105(7) pp 2811–2828.
  40. ^ Rabitz, H., M. Kramer and D. Dacol (1983). Sensitivity analysis in chemical kinetics. Annual Review of Physical Chemistry, 34, 419–461.
  41. ^ Turanyi, T (1990). Sensitivity analysis of complex kinetic systems. Tools and applications. Journal of Mathematical Chemistry, 5, 203–248.
  42. ^ a b Komorowski M, Costa MJ, Rand DA, Stumpf MPH (2011). Sensitivity, robustness, and identifiability in stochastic chemical kinetics models. Proc Natl Acad Sci U S A, 108(21), 8645-50.
  43. ^ Becker W., Rowson J., Oakley J.E., Yoxall A., Manson G., Worden K. (2011) Bayesian sensitivity analysis of a model of the aortic valve, Journal of Biomechanics 44(8): 1499-506
  44. ^ clinicalevidence.bmj.com > Glossary > sensitivity analysis Retrieved on June 21, 2010
  45. ^ Triantaphyllou, E. (1997). "A Sensitivity Analysis Approach for Some Deterministic Multi-Criteria Decision-Making Methods". Decision Sciences. 28 (1): 151–194. Retrieved 2010-06-28. {{cite journal}}: Unknown parameter |coauthors= ignored (|author= suggested) (help)
  46. ^ Triantaphyllou, E. (2000). Multi-Criteria Decision Making: A Comparative Study. Dordrecht, The Netherlands: Kluwer Academic Publishers (now Springer). p. 320. ISBN 0-7923-6607-7.

References

  • Cruz, J. B., editor, (1973) System Sensitivity Analysis, Dowden, Hutchinson & Ross, Stroudsburg, PA.
  • Cruz, J. B. and Perkins, W.R., (1964), A New Approach to the Sensitivity Problem in Multivariable Feedback System Design, IEEE TAC, Vol. 9, 216-223.
  • Fassò A. (2007) Statistical sensitivity analysis and water quality. In Wymer L. Ed, Statistical Framework for Water Quality Criteria and Monitoring. Wiley, New York.
  • Fassò A., Esposito E., Porcu E., Reverberi A.P., Vegliò F. (2003) Statistical Sensitivity Analysis of Packed Column Reactors for Contaminated Wastewater. Environmetrics. Vol. 14, n.8, 743 - 759.
  • Fassò A., Perri P.F. (2002) Sensitivity Analysis. In Abdel H. El-Shaarawi and Walter W. Piegorsch (eds) Encyclopedia of Environmetrics, Volume 4, pp 1968–1982, Wiley.
  • Saltelli, A., S. Tarantola, and K. Chan (1999). Quantitative model-independent method for global sensitivity analysis of model output. Technometrics 41(1), 39–56.
  • Santner, T. J.; Williams, B. J.; Notz, W.I. Design and Analysis of Computer Experiments; Springer-Verlag, 2003.
  • Haug, Edward J.; Choi, Kyung K.; Komkov, Vadim (1986) Design sensitivity analysis of structural systems. Mathematics in Science and Engineering, 177. Academic Press, Inc., Orlando, FL.
  • Taleb, N. N., (2007) The Black Swan: The Impact of the Highly Improbable, Random House.
  • Pilkey, O. H. and L. Pilkey-Jarvis (2007), Useless Arithmetic. Why Environmental Scientists Can't Predict the Future. New York: Columbia University Press.

Further reading