Propagation of uncertainty
In statistics, propagation of uncertainty (or propagation of error) is the effect of variables' uncertainties (or errors, more specifically random errors) on the uncertainty of a function based on them. When the variables are the values of experimental measurements they have uncertainties due to measurement limitations (e.g., instrument precision) which propagate due to the combination of variables in the function.
The uncertainty u can be expressed in a number of ways. It may be defined by the absolute error Δx. Uncertainties can also be defined by the relative error (Δx)/x, which is usually written as a percentage. Most commonly, the uncertainty on a quantity is quantified in terms of the standard deviation, σ, which is the positive square root of the variance. The value of a quantity and its error are then expressed as an interval x ± u. If the statistical probability distribution of the variable is known or can be assumed, it is possible to derive confidence limits to describe the region within which the true value of the variable may be found. For example, the 68% confidence limits for a one-dimensional variable belonging to a normal distribution are approximately ± one standard deviation σ from the central value x, which means that the region x ± σ will cover the true value in roughly 68% of cases.
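The coverage figure quoted above can be checked directly. The following is a minimal sketch, assuming a normally distributed variable; the helper name `normal_coverage` is illustrative and not part of any library.

```python
import math


def normal_coverage(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2.0))


for k in (1, 2, 3):
    # Fraction of cases in which the interval x ± k·σ contains the true value.
    print(f"x ± {k}σ covers about {normal_coverage(k):.1%} of cases")
```

For k = 1 this gives roughly 68%, which is the figure used in the text.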
If the uncertainties are correlated then covariance must be taken into account. Correlation can arise from two different sources. First, the measurement errors may be correlated. Second, when the underlying values are correlated across a population, the uncertainties in the group averages will be correlated.[1] For very expensive data or complex functions, the error propagation may be achieved with a surrogate model, e.g. based on Bayesian probability theory.[2]
Linear combinations
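Let $\{f_k(x_1, x_2, \dots, x_n)\}$ be a set of $m$ functions, which are linear combinations of $n$ variables $x_1, x_2, \dots, x_n$ with combination coefficients $A_{k1}, A_{k2}, \dots, A_{kn}$ $(k = 1, \dots, m)$:
$$f_k = \sum_{i=1}^{n} A_{ki} x_i,$$
or in matrix notation,
$$\mathbf{f} = \mathbf{A}\mathbf{x}.$$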
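Also let the variance–covariance matrix of $\mathbf{x} = (x_1, \dots, x_n)$ be denoted by $\boldsymbol{\Sigma}^x$ and let the mean value be denoted by $\boldsymbol{\mu}$:
$$\boldsymbol{\Sigma}^x = \operatorname{E}\left[(\mathbf{x}-\boldsymbol{\mu})\otimes(\mathbf{x}-\boldsymbol{\mu})\right] = \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \sigma_{13} & \cdots \\ \sigma_{21} & \sigma_2^2 & \sigma_{23} & \cdots \\ \sigma_{31} & \sigma_{32} & \sigma_3^2 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix},$$
where $\otimes$ is the outer product.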
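Then, the variance–covariance matrix $\boldsymbol{\Sigma}^f$ of $\mathbf{f}$ is given by
$$\boldsymbol{\Sigma}^f = \operatorname{E}\left[(\mathbf{f}-\operatorname{E}[\mathbf{f}])\otimes(\mathbf{f}-\operatorname{E}[\mathbf{f}])\right] = \operatorname{E}\left[\mathbf{A}(\mathbf{x}-\boldsymbol{\mu})\otimes\mathbf{A}(\mathbf{x}-\boldsymbol{\mu})\right] = \mathbf{A}\operatorname{E}\left[(\mathbf{x}-\boldsymbol{\mu})\otimes(\mathbf{x}-\boldsymbol{\mu})\right]\mathbf{A}^{\mathrm{T}} = \mathbf{A}\boldsymbol{\Sigma}^x\mathbf{A}^{\mathrm{T}}.$$
In component notation, the equation
$$\boldsymbol{\Sigma}^f = \mathbf{A}\boldsymbol{\Sigma}^x\mathbf{A}^{\mathrm{T}}$$
reads
$$\Sigma^f_{ij} = \sum_{k=1}^{n}\sum_{l=1}^{n} A_{ik}\,\Sigma^x_{kl}\,A_{jl}.$$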
This is the most general expression for the propagation of error from one set of variables onto another. When the errors on $\mathbf{x}$ are uncorrelated, the general expression simplifies to
$$\Sigma^f_{ij} = \sum_{k=1}^{n} A_{ik}\,\Sigma^x_k\,A_{jk},$$
where $\Sigma^x_k = \sigma_{x_k}^2$ is the variance of the $k$-th element of the $\mathbf{x}$ vector. Note that even though the errors on $\mathbf{x}$ may be uncorrelated, the errors on $\mathbf{f}$ are in general correlated; in other words, even if $\boldsymbol{\Sigma}^x$ is a diagonal matrix, $\boldsymbol{\Sigma}^f$ is in general a full matrix.
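As an illustration of the last remark, the matrix rule (output covariance equals A times input covariance times A transposed) can be sketched in a few lines. This is a minimal sketch using plain nested lists; the function name `propagate_linear` and the numerical values are assumptions chosen for illustration.

```python
def propagate_linear(A, cov_x):
    """Return cov_f = A @ cov_x @ A^T using plain nested lists."""
    m, n = len(A), len(cov_x)
    # Intermediate product A @ cov_x, an m x n matrix.
    A_cov = [[sum(A[i][k] * cov_x[k][l] for k in range(n)) for l in range(n)]
             for i in range(m)]
    # Multiply by A^T to obtain the m x m covariance matrix of f.
    return [[sum(A_cov[i][l] * A[j][l] for l in range(n)) for j in range(m)]
            for i in range(m)]


# Two uncorrelated inputs with variances 4 and 9 (illustrative values).
cov_x = [[4.0, 0.0],
         [0.0, 9.0]]
# f_1 = x_1 + x_2 and f_2 = x_1 - x_2.
A = [[1.0, 1.0],
     [1.0, -1.0]]

cov_f = propagate_linear(A, cov_x)
print(cov_f)  # [[13.0, -5.0], [-5.0, 13.0]]
```

Although the input covariance matrix is diagonal, the off-diagonal entries of the result are non-zero, so the two outputs are correlated.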
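The general expressions for a scalar-valued function $f$ are a little simpler (here $\mathbf{a}$ is a row vector):
$$f = \sum_{i=1}^{n} a_i x_i = \mathbf{a}\mathbf{x},$$
$$\sigma_f^2 = \sum_{i=1}^{n}\sum_{j=1}^{n} a_i\,\Sigma^x_{ij}\,a_j = \mathbf{a}\boldsymbol{\Sigma}^x\mathbf{a}^{\mathrm{T}}.$$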
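Each covariance term $\sigma_{ij}$ can be expressed in terms of the correlation coefficient $\rho_{ij}$ by $\sigma_{ij} = \rho_{ij}\sigma_i\sigma_j$, so that an alternative expression for the variance of $f$ is
$$\sigma_f^2 = \sum_{i=1}^{n} a_i^2\sigma_i^2 + \sum_{i=1}^{n}\sum_{j=1,\,j\neq i}^{n} a_i a_j \rho_{ij}\sigma_i\sigma_j.$$
In the case that the variables in $\mathbf{x}$ are uncorrelated, this simplifies further to
$$\sigma_f^2 = \sum_{i=1}^{n} a_i^2\sigma_i^2.$$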
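In the simple case of identical coefficients and variances, $a_i = a$ and $\sigma_i = \sigma$, we find
$$\sigma_f = \sqrt{n}\,|a|\,\sigma.$$
For the arithmetic mean, $a = 1/n$, the result is the standard error of the mean:
$$\sigma_f = \frac{\sigma}{\sqrt{n}}.$$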
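The standard error of the mean can be recovered numerically from the general scalar formula for a linear combination. The sketch below assumes illustrative values (25 measurements, each with standard deviation 2); the function name `linear_combination_std` is hypothetical.

```python
import math


def linear_combination_std(a, sigma, rho=None):
    """Standard deviation of f = sum_i a[i] * x[i].

    sigma[i] is the standard deviation of x[i]; rho is an optional
    correlation matrix (inputs are treated as uncorrelated if omitted).
    """
    n = len(a)
    variance = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                r = 1.0
            else:
                r = rho[i][j] if rho is not None else 0.0
            variance += a[i] * a[j] * r * sigma[i] * sigma[j]
    return math.sqrt(variance)


n = 25
sigma = 2.0
# Arithmetic mean: every coefficient equals 1/n.
sem = linear_combination_std([1.0 / n] * n, [sigma] * n)
print(sem, sigma / math.sqrt(n))  # both are approximately 0.4
```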
Non-linear combinations
When $\mathbf{f}$ is a set of non-linear combinations of the variables $\mathbf{x}$, an interval propagation could be performed in order to compute intervals which contain all consistent values for the variables. In a probabilistic approach, the function $\mathbf{f}$ must usually be linearised by approximation to a first-order Taylor series expansion, though in some cases exact formulae can be derived that do not depend on the expansion, as is the case for the exact variance of products.[3] The Taylor expansion would be:
$$f_k \approx f_k^0 + \sum_{i=1}^{n} \frac{\partial f_k}{\partial x_i}\,x_i,$$
where $\partial f_k/\partial x_i$ denotes the partial derivative of $f_k$ with respect to the $i$-th variable, evaluated at the mean value of all components of vector $\mathbf{x}$. Or in matrix notation,
$$\mathbf{f} \approx \mathbf{f}^0 + \mathbf{J}\mathbf{x},$$
where $\mathbf{J}$ is the Jacobian matrix. Since $\mathbf{f}^0$ is a constant, it does not contribute to the error on $\mathbf{f}$. Therefore, the propagation of error follows the linear case above, but with the linear coefficients $A_{ki}$ and $A_{kj}$ replaced by the partial derivatives $\frac{\partial f_k}{\partial x_i}$ and $\frac{\partial f_k}{\partial x_j}$. In matrix notation,[4]
$$\boldsymbol{\Sigma}^f = \mathbf{J}\boldsymbol{\Sigma}^x\mathbf{J}^{\mathrm{T}}.$$
That is, the Jacobian of the function is used to transform the rows and columns of the variance-covariance matrix of the argument. Note this is equivalent to the matrix expression for the linear case with $\mathbf{J} = \mathbf{A}$.
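The Jacobian-based rule can be sketched numerically. The example below is a rough illustration under stated assumptions: the Jacobian is approximated by central finite differences rather than derived analytically, and the polar-to-Cartesian conversion, the function names and all numerical values are chosen here for illustration only.

```python
import math


def numerical_jacobian(func, x, h=1e-6):
    """Central-difference Jacobian of func (list -> list) at the point x."""
    m = len(func(x))
    J = [[0.0] * len(x) for _ in range(m)]
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        f_plus, f_minus = func(x_plus), func(x_minus)
        for k in range(m):
            J[k][i] = (f_plus[k] - f_minus[k]) / (2.0 * h)
    return J


def propagate(func, mean_x, cov_x):
    """First-order covariance of func(x): J @ cov_x @ J^T, with J taken at mean_x."""
    J = numerical_jacobian(func, mean_x)
    m, n = len(J), len(mean_x)
    return [[sum(J[a][i] * cov_x[i][j] * J[b][j]
                 for i in range(n) for j in range(n))
             for b in range(m)]
            for a in range(m)]


def polar_to_cartesian(p):
    r, theta = p
    return [r * math.cos(theta), r * math.sin(theta)]


# r = 2 with variance 0.01, theta = pi/2 with variance 0.0004, uncorrelated.
mean = [2.0, math.pi / 2]
cov = [[0.01, 0.0],
       [0.0, 0.0004]]
cov_xy = propagate(polar_to_cartesian, mean, cov)
print(cov_xy)
```

At this point the Jacobian is close to [[0, -2], [1, 0]], so the variance of the first output is about 4 × 0.0004 = 0.0016 and that of the second about 0.01.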
Simplification
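Neglecting correlations or assuming independent variables yields a common formula among engineers and experimental scientists to calculate error propagation, the variance formula:[5]
$$s_f = \sqrt{\left(\frac{\partial f}{\partial x}\right)^2 s_x^2 + \left(\frac{\partial f}{\partial y}\right)^2 s_y^2 + \left(\frac{\partial f}{\partial z}\right)^2 s_z^2 + \cdots},$$
where $s_f$ represents the standard deviation of the function $f$, $s_x$ represents the standard deviation of $x$, $s_y$ represents the standard deviation of $y$, and so forth.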
This formula rests on the linear behaviour of $f$ described by its gradient, so it is a good estimate of the standard deviation of $f$ only as long as the input standard deviations $s_x, s_y, s_z, \ldots$ are small enough. Specifically, the linear approximation of $f$ has to be close to $f$ inside a neighbourhood of radius $s_x, s_y, s_z, \ldots$.[6]
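How quickly the linear approximation degrades can be seen in a case where the exact answer is known. The sketch below assumes f(x) = x² with a normally distributed x, for which the exact variance is 4μ²σ² + 2σ⁴; the function names and the chosen values of μ and σ are illustrative.

```python
import math


def first_order_std(mu, sigma):
    """First-order estimate for f(x) = x**2: |df/dx| * sigma = |2 * mu| * sigma."""
    return abs(2.0 * mu) * sigma


def exact_std(mu, sigma):
    """Exact standard deviation of x**2 for normally distributed x."""
    return math.sqrt(4.0 * mu**2 * sigma**2 + 2.0 * sigma**4)


mu = 10.0
for sigma in (0.1, 1.0, 5.0):
    ratio = first_order_std(mu, sigma) / exact_std(mu, sigma)
    print(f"sigma = {sigma}: first-order / exact = {ratio:.4f}")
```

The ratio stays very close to 1 while σ is small compared with μ and drops to about 0.94 when σ is half of μ, which is where the neighbourhood condition starts to fail.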
Example
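Any non-linear differentiable function, $f(a, b)$, of two variables, $a$ and $b$, can be expanded as
$$f \approx f^0 + \frac{\partial f}{\partial a}\,a + \frac{\partial f}{\partial b}\,b.$$
Taking the variance of both sides and using the formula[7] for the variance of a linear combination of two variables $X$ and $Y$ with constant coefficients $c_1$ and $c_2$,
$$\operatorname{Var}(c_1 X + c_2 Y) = c_1^2\operatorname{Var}(X) + c_2^2\operatorname{Var}(Y) + 2 c_1 c_2\operatorname{Cov}(X, Y),$$
gives
$$\sigma_f^2 \approx \left|\frac{\partial f}{\partial a}\right|^2\sigma_a^2 + \left|\frac{\partial f}{\partial b}\right|^2\sigma_b^2 + 2\,\frac{\partial f}{\partial a}\frac{\partial f}{\partial b}\,\sigma_{ab},$$
where $\sigma_f$ is the standard deviation of the function $f$, $\sigma_a$ is the standard deviation of $a$, $\sigma_b$ is the standard deviation of $b$ and $\sigma_{ab} = \sigma_a\sigma_b\rho_{ab}$ is the covariance between $a$ and $b$.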
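In the particular case that $f = ab$, the partial derivatives are $\frac{\partial f}{\partial a} = b$ and $\frac{\partial f}{\partial b} = a$. Then
$$\sigma_f^2 \approx b^2\sigma_a^2 + a^2\sigma_b^2 + 2ab\,\sigma_{ab}$$
or
$$\left(\frac{\sigma_f}{f}\right)^2 \approx \left(\frac{\sigma_a}{a}\right)^2 + \left(\frac{\sigma_b}{b}\right)^2 + 2\left(\frac{\sigma_a}{a}\right)\left(\frac{\sigma_b}{b}\right)\rho_{ab},$$
where $\rho_{ab}$ is the correlation between $a$ and $b$.
When the variables $a$ and $b$ are uncorrelated, $\rho_{ab} = 0$. Then
$$\left(\frac{\sigma_f}{f}\right)^2 \approx \left(\frac{\sigma_a}{a}\right)^2 + \left(\frac{\sigma_b}{b}\right)^2.$$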
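For a product of two variables, the first-order result says that the squared relative uncertainties add, plus a cross term proportional to the correlation. A minimal sketch of that rule follows; the function name `product_std` and the numbers are assumptions for illustration, and both central values must be non-zero.

```python
import math


def product_std(a, b, sigma_a, sigma_b, rho_ab=0.0):
    """First-order standard deviation of f = a * b (a and b non-zero)."""
    rel_a = sigma_a / a
    rel_b = sigma_b / b
    # Squared relative uncertainty of the product, including the correlation term.
    rel_var = rel_a**2 + rel_b**2 + 2.0 * rel_a * rel_b * rho_ab
    return abs(a * b) * math.sqrt(rel_var)


# Both inputs carry a 10% relative uncertainty.
uncorrelated = product_std(3.0, 4.0, 0.3, 0.4)
fully_correlated = product_std(3.0, 4.0, 0.3, 0.4, rho_ab=1.0)
print(uncorrelated, fully_correlated)  # about 1.70 and 2.40
```

With positive correlation the relative uncertainties reinforce each other, so the fully correlated case gives the larger result.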
Caveats and warnings
Error estimates for non-linear functions are biased on account of using a truncated series expansion. The extent of this bias depends on the nature of the function. For example, the bias on the error calculated for log(1+x) increases as x increases, since the expansion to x is a good approximation only when x is near zero.
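The log(1+x) case can be made concrete by comparing two first-order estimates: one based on the expansion about zero, log(1+x) ≈ x, and one based on the derivative 1/(1+x) taken at x itself. This is only a rough sketch with illustrative names and values, valid for x > −1.

```python
def std_expanding_about_zero(sigma_x):
    """log(1 + x) is approximately x near zero, so the propagated error is sigma_x."""
    return sigma_x


def std_expanding_about_x(x, sigma_x):
    """Local first-order estimate: |d/dx log(1 + x)| * sigma_x = sigma_x / (1 + x)."""
    return sigma_x / (1.0 + x)


sigma_x = 0.01
for x in (0.0, 0.1, 1.0, 10.0):
    ratio = std_expanding_about_zero(sigma_x) / std_expanding_about_x(x, sigma_x)
    print(f"x = {x}: zero-expansion estimate is {ratio:.2f} times the local estimate")
```

The two estimates agree at x = 0, and the expansion about zero overstates the error by a factor of 1 + x as x grows.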
For highly non-linear functions, there exist five categories of probabilistic approaches for uncertainty propagation;[8] see Uncertainty quantification for details.
Reciprocal and shifted reciprocal
In the special case of the inverse or reciprocal $1/B$, where $B \sim N(0, 1)$ follows a standard normal distribution, the resulting distribution is a reciprocal standard normal distribution, and there is no definable variance.[9]
However, in the slightly more general case of a shifted reciprocal function $1/(p - B)$, for $B \sim N(\mu, \sigma)$ following a general normal distribution, the mean and variance do exist in a principal value sense if the difference between the pole $p$ and the mean $\mu$ is real-valued.[10]
Ratios
Ratios are also problematic; normal approximations exist under certain conditions.
Example formulae
This table shows the variances and standard deviations of simple functions of the real variables $A, B$, with standard deviations $\sigma_A, \sigma_B$, covariance $\sigma_{AB} = \rho_{AB}\sigma_A\sigma_B$, and correlation $\rho_{AB}$. The real-valued coefficients $a$ and $b$ are assumed exactly known (deterministic), i.e., $\sigma_a = \sigma_b = 0$.
In the columns "Variance" and "Standard Deviation", $A$ and $B$ should be understood as expectation values (i.e. values around which we're estimating the uncertainty), and $f$ should be understood as the value of the function calculated at the expectation value of $A, B$.
| Function | Variance | Standard Deviation |
| --- | --- | --- |
| $f = aA$ | $\sigma_f^2 = a^2\sigma_A^2$ | $\sigma_f = \lvert a\rvert\,\sigma_A$ |
| $f = aA + bB$ | $\sigma_f^2 = a^2\sigma_A^2 + b^2\sigma_B^2 + 2ab\,\sigma_{AB}$ | $\sigma_f = \sqrt{a^2\sigma_A^2 + b^2\sigma_B^2 + 2ab\,\sigma_{AB}}$ |
| $f = aA - bB$ | $\sigma_f^2 = a^2\sigma_A^2 + b^2\sigma_B^2 - 2ab\,\sigma_{AB}$ | $\sigma_f = \sqrt{a^2\sigma_A^2 + b^2\sigma_B^2 - 2ab\,\sigma_{AB}}$ |
| $f = AB$ | $\sigma_f^2 \approx f^2\left[\left(\frac{\sigma_A}{A}\right)^2 + \left(\frac{\sigma_B}{B}\right)^2 + 2\frac{\sigma_{AB}}{AB}\right]$ | $\sigma_f \approx \lvert f\rvert\sqrt{\left(\frac{\sigma_A}{A}\right)^2 + \left(\frac{\sigma_B}{B}\right)^2 + 2\frac{\sigma_{AB}}{AB}}$ |
| $f = A/B$ | $\sigma_f^2 \approx f^2\left[\left(\frac{\sigma_A}{A}\right)^2 + \left(\frac{\sigma_B}{B}\right)^2 - 2\frac{\sigma_{AB}}{AB}\right]$ | $\sigma_f \approx \lvert f\rvert\sqrt{\left(\frac{\sigma_A}{A}\right)^2 + \left(\frac{\sigma_B}{B}\right)^2 - 2\frac{\sigma_{AB}}{AB}}$ |
| $f = aA^b$ | $\sigma_f^2 \approx \left(abA^{b-1}\sigma_A\right)^2 = \left(\frac{f b\,\sigma_A}{A}\right)^2$ | $\sigma_f \approx \left\lvert abA^{b-1}\sigma_A\right\rvert = \left\lvert\frac{f b\,\sigma_A}{A}\right\rvert$ |
| $f = a\ln(bA)$ | $\sigma_f^2 \approx \left(a\frac{\sigma_A}{A}\right)^2$ | $\sigma_f \approx \left\lvert a\frac{\sigma_A}{A}\right\rvert$ |
| $f = ae^{bA}$ | $\sigma_f^2 \approx f^2\left(b\sigma_A\right)^2$ | $\sigma_f \approx \lvert f\rvert\,\lvert b\sigma_A\rvert$ |
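For uncorrelated variables ($\rho_{AB} = 0$, $\sigma_{AB} = 0$) expressions for more complicated functions can be derived by combining simpler functions. For example, repeated multiplication, assuming no correlation, gives
$$f = ABC; \qquad \left(\frac{\sigma_f}{f}\right)^2 \approx \left(\frac{\sigma_A}{A}\right)^2 + \left(\frac{\sigma_B}{B}\right)^2 + \left(\frac{\sigma_C}{C}\right)^2.$$
For the case $f = AB$ we also have Goodman's expression[3] for the exact variance: for the uncorrelated case it is
$$\operatorname{V}(XY) = \operatorname{E}(X)^2\operatorname{V}(Y) + \operatorname{E}(Y)^2\operatorname{V}(X) + \operatorname{E}\left((X - \operatorname{E}(X))^2 (Y - \operatorname{E}(Y))^2\right),$$
and therefore, when $A$ and $B$ are independent so that the last term factorises, we have:
$$\sigma_f^2 = A^2\sigma_B^2 + B^2\sigma_A^2 + \sigma_A^2\sigma_B^2.$$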
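The difference between the first-order estimate for a product and the exact variance can be checked by simulation. The following is a rough sketch that assumes independent, normally distributed A and B; the function names, the seed, the sample size and the parameter values are all illustrative choices.

```python
import random


def first_order_product_variance(A, B, sigma_A, sigma_B):
    """Linearised variance of f = A * B for uncorrelated inputs."""
    return A**2 * sigma_B**2 + B**2 * sigma_A**2


def exact_product_variance(A, B, sigma_A, sigma_B):
    """Exact variance of f = A * B for independent inputs."""
    return first_order_product_variance(A, B, sigma_A, sigma_B) + sigma_A**2 * sigma_B**2


A, B, sigma_A, sigma_B = 1.0, 1.0, 0.5, 0.5

# Monte Carlo estimate of the variance of the product, with a fixed seed.
rng = random.Random(0)
products = [rng.gauss(A, sigma_A) * rng.gauss(B, sigma_B) for _ in range(200_000)]
mean = sum(products) / len(products)
mc_variance = sum((p - mean) ** 2 for p in products) / len(products)

print("first order:", first_order_product_variance(A, B, sigma_A, sigma_B))  # 0.5
print("exact:      ", exact_product_variance(A, B, sigma_A, sigma_B))        # 0.5625
print("Monte Carlo:", mc_variance)
```

With relative uncertainties as large as 50%, the extra term is no longer negligible, and the simulated variance should lie much closer to the exact value than to the first-order one.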
Effect of correlation on differences
If $A$ and $B$ are uncorrelated, their difference $A - B$ will have more variance than either of them. An increasing positive correlation ($\rho_{AB} \to 1$) will decrease the variance of the difference, converging to zero variance for perfectly correlated variables with the same variance. On the other hand, a negative correlation ($\rho_{AB} \to -1$) will further increase the variance of the difference, compared to the uncorrelated case.
For example, the self-subtraction $f = A - A$ has zero variance, $\sigma_f^2 = 0$, only if the variate is perfectly autocorrelated ($\rho_A = 1$). If $A$ is uncorrelated, $\rho_A = 0$, then the output variance is twice the input variance, $\sigma_f^2 = 2\sigma_A^2$. And if $A$ is perfectly anticorrelated, $\rho_A = -1$, then the input variance is quadrupled in the output, $\sigma_f^2 = 4\sigma_A^2$ (notice $1 - \rho_A = 2$ for $f = aA - aA$ in the table above).
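These three cases follow from the variance of a difference, which is the sum of the two variances minus twice the covariance. A short sketch, with a hypothetical function name and unit standard deviations chosen for illustration:

```python
def difference_variance(sigma_a, sigma_b, rho_ab):
    """Variance of f = A - B for standard deviations sigma_a, sigma_b and correlation rho_ab."""
    return sigma_a**2 + sigma_b**2 - 2.0 * rho_ab * sigma_a * sigma_b


for rho in (1.0, 0.0, -1.0):
    # Equal unit variances: prints 0.0, 2.0 and 4.0 respectively.
    print(rho, difference_variance(1.0, 1.0, rho))
```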
Example calculations
Inverse tangent function
We can calculate the uncertainty propagation for the inverse tangent function as an example of using partial derivatives to propagate error.
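Define
$$f(x) = \arctan(x),$$
where $\Delta_x$ is the absolute uncertainty on our measurement of $x$. The derivative of $f(x)$ with respect to $x$ is
$$\frac{df}{dx} = \frac{1}{1 + x^2}.$$
Therefore, our propagated uncertainty is
$$\Delta_f \approx \frac{\Delta_x}{1 + x^2},$$
where $\Delta_f$ is the absolute propagated uncertainty.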
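As a quick numerical check of the inverse tangent result, the derivative-based estimate can be compared with half the spread of the function over the interval x ± Δx. The function name and the values x = 1, Δx = 0.1 are assumptions for illustration.

```python
import math


def arctan_uncertainty(x, delta_x):
    """First-order propagated uncertainty of arctan(x): delta_x / (1 + x**2)."""
    return delta_x / (1.0 + x**2)


x, delta_x = 1.0, 0.1
propagated = arctan_uncertainty(x, delta_x)
# Half the change of arctan across [x - delta_x, x + delta_x], for comparison.
half_spread = (math.atan(x + delta_x) - math.atan(x - delta_x)) / 2.0
print(propagated, half_spread)  # about 0.0500 and 0.0501
```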
Resistance measurement
A practical application is an experiment in which one measures current, I, and voltage, V, on a resistor in order to determine the resistance, R, using Ohm's law, R = V / I.
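Given the measured variables with uncertainties, $I \pm \sigma_I$ and $V \pm \sigma_V$, and neglecting their possible correlation, the uncertainty in the computed quantity, $\sigma_R$, is:
$$\sigma_R \approx \sqrt{\sigma_V^2\left(\frac{1}{I}\right)^2 + \sigma_I^2\left(\frac{-V}{I^2}\right)^2} = R\sqrt{\left(\frac{\sigma_V}{V}\right)^2 + \left(\frac{\sigma_I}{I}\right)^2}.$$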
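The resistance example can be worked through with numbers. The sketch below assumes illustrative readings of 12.0 ± 0.1 V and 2.0 ± 0.05 A and neglects any correlation between them; the function name is hypothetical.

```python
import math


def resistance_with_uncertainty(V, sigma_V, I, sigma_I):
    """Return (R, sigma_R) for R = V / I with uncorrelated V and I."""
    R = V / I
    # Partial derivatives: dR/dV = 1 / I and dR/dI = -V / I**2.
    sigma_R = math.sqrt((sigma_V / I) ** 2 + (sigma_I * V / I**2) ** 2)
    return R, sigma_R


R, sigma_R = resistance_with_uncertainty(12.0, 0.1, 2.0, 0.05)
print(f"R = {R:.2f} ± {sigma_R:.2f} ohm")  # R = 6.00 ± 0.16 ohm
```

The same value follows from the relative form, R times the square root of the summed squared relative uncertainties of V and I.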
See also
- Accuracy and precision
- Automatic differentiation
- Bienaymé's identity
- Delta method
- Dilution of precision (navigation)
- Errors and residuals in statistics
- Experimental uncertainty analysis
- Interval finite element
- Measurement uncertainty
- Numerical stability
- Probability bounds analysis
- Significance arithmetic
- Uncertainty quantification
- Random-fuzzy variable
- Variance#Propagation
References
- [1] Kirchner, James. "Data Analysis Toolkit #5: Uncertainty Analysis and Error Propagation" (PDF). Berkeley Seismology Laboratory. University of California. Retrieved 22 April 2016.
- [2] Ranftl, Sascha; von der Linden, Wolfgang (2021-11-13). "Bayesian Surrogate Analysis and Uncertainty Propagation". Physical Sciences Forum. 3 (1): 6. doi:10.3390/psf2021003006. ISSN 2673-9984.
- [3] Goodman, Leo (1960). "On the Exact Variance of Products". Journal of the American Statistical Association. 55 (292): 708–713. doi:10.2307/2281592. JSTOR 2281592.
- [4] Ochoa, Benjamin; Belongie, Serge. "Covariance Propagation for Guided Matching". Archived 2011-07-20 at the Wayback Machine.
- [5] Ku, H. H. (October 1966). "Notes on the use of propagation of error formulas". Journal of Research of the National Bureau of Standards. 70C (4): 262. doi:10.6028/jres.070c.025. ISSN 0022-4316. Retrieved 3 October 2012.
- [6] Clifford, A. A. (1973). Multivariate error analysis: a handbook of error propagation and calculation in many-parameter systems. John Wiley & Sons. ISBN 978-0470160558.
- [7] Soch, Joram (2020-07-07). "Variance of the linear combination of two random variables". The Book of Statistical Proofs. Retrieved 2022-01-29.
- [8] Lee, S. H.; Chen, W. (2009). "A comparative study of uncertainty propagation methods for black-box-type problems". Structural and Multidisciplinary Optimization. 37 (3): 239–253. doi:10.1007/s00158-008-0234-7. S2CID 119988015.
- [9] Johnson, Norman L.; Kotz, Samuel; Balakrishnan, Narayanaswamy (1994). Continuous Univariate Distributions, Volume 1. Wiley. p. 171. ISBN 0-471-58495-9.
- [10] Lecomte, Christophe (May 2013). "Exact statistics of systems with uncertainties: an analytical theory of rank-one stochastic dynamic systems". Journal of Sound and Vibration. 332 (11): 2750–2776. doi:10.1016/j.jsv.2012.12.009.
- [11] "A Summary of Error Propagation" (PDF). p. 2. Archived from the original (PDF) on 2016-12-13. Retrieved 2016-04-04.
- [12] "Propagation of Uncertainty through Mathematical Operations" (PDF). p. 5. Retrieved 2016-04-04.
- [13] "Strategies for Variance Estimation" (PDF). p. 37. Retrieved 2013-01-18.
- [14] Harris, Daniel C. (2003), Quantitative chemical analysis (6th ed.), Macmillan, p. 56, ISBN 978-0-7167-4464-1
- [15] "Error Propagation tutorial" (PDF). Foothill College. October 9, 2009. Retrieved 2012-03-01.
Further reading
- Bevington, Philip R.; Robinson, D. Keith (2002), Data Reduction and Error Analysis for the Physical Sciences (3rd ed.), McGraw-Hill, ISBN 978-0-07-119926-1
- Fornasini, Paolo (2008), The uncertainty in physical measurements: an introduction to data analysis in the physics laboratory, Springer, p. 161, ISBN 978-0-387-78649-0
- Meyer, Stuart L. (1975), Data Analysis for Scientists and Engineers, Wiley, ISBN 978-0-471-59995-1
- Peralta, M. (2012), Propagation Of Errors: How To Mathematically Predict Measurement Errors, CreateSpace
- Rouaud, M. (2013), Probability, Statistics and Estimation: Propagation of Uncertainties in Experimental Measurement (PDF) (short ed.)
- Taylor, J. R. (1997), An Introduction to Error Analysis: The Study of Uncertainties in Physical Measurements (2nd ed.), University Science Books
- Wang, C M; Iyer, Hari K (2005-09-07). "On higher-order corrections for propagating uncertainties". Metrologia. 42 (5): 406–410. doi:10.1088/0026-1394/42/5/011. ISSN 0026-1394.
External links
- A detailed discussion of measurements and the propagation of uncertainty explaining the benefits of using error propagation formulas and Monte Carlo simulations instead of simple significance arithmetic
- GUM, Guide to the Expression of Uncertainty in Measurement
- EPFL An Introduction to Error Propagation, Derivation, Meaning and Examples of Cy = Fx Cx Fx'
- uncertainties package, a program/library for transparently performing calculations with uncertainties (and error correlations).
- soerp package, a Python program/library for transparently performing *second-order* calculations with uncertainties (and error correlations).
- Joint Committee for Guides in Metrology (2011). JCGM 102: Evaluation of Measurement Data - Supplement 2 to the "Guide to the Expression of Uncertainty in Measurement" - Extension to Any Number of Output Quantities (PDF) (Technical report). JCGM. Retrieved 13 February 2013.
- Uncertainty Calculator Propagate uncertainty for any expression