Endogeneity (econometrics)

For endogeneity in a non-econometric sense, see Endogeny.

In a statistical model, a parameter or variable is said to be endogenous when there is a correlation between the parameter or variable and the error term.[1] Endogeneity can arise as a result of measurement error, autoregression with autocorrelated errors, simultaneity and omitted variables. Broadly, a loop of causality between the independent and dependent variables of a model leads to endogeneity.

For example, in a simple supply and demand model, when predicting the quantity demanded in equilibrium, the price is endogenous because producers change their price in response to demand and consumers change their demand in response to price. In this case, the price variable is said to have total endogeneity once the demand and supply curves are known. In contrast, a change in consumer tastes or preferences would be an exogenous change on the demand curve.

Exogeneity vs. endogeneity

In a stochastic model, the notions of ordinary exogeneity, sequential exogeneity, and strong (strict) exogeneity can be defined. Exogeneity is always stated relative to a parameter: a variable or set of variables is exogenous for a parameter \alpha. Even if a variable is exogenous for \alpha, it might be endogenous for another parameter \beta.

When the explanatory variables are not stochastic, they are strongly exogenous for all the parameters.

The problem of endogeneity occurs when an independent variable is correlated with the error term in a regression model. This implies that the regression coefficient in an ordinary least squares (OLS) regression is biased; however, if the correlation is not contemporaneous, the coefficient estimate may still be consistent. There are many methods of correcting for this, including instrumental variable regression and Heckman selection correction.
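As a rough illustration of the instrumental variable idea, the following sketch simulates a regressor that shares a component with the error term and compares the OLS slope with a simple IV (Wald-type) estimate. Everything here is an assumption made for the example: the data-generating process, the parameter values, the instrument w, and the helper cov are hypothetical and do not come from any particular study or library.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (n - 1)

rng = random.Random(42)
n = 100_000
alpha, beta = 1.0, 2.0                      # illustrative "true" parameters

w = [rng.gauss(0, 1) for _ in range(n)]     # instrument: moves x, unrelated to u
e = [rng.gauss(0, 1) for _ in range(n)]     # component shared by x and the error
x = [wi + ei for wi, ei in zip(w, e)]       # endogenous regressor
u = [0.8 * ei + rng.gauss(0, 1) for ei in e]  # error term, correlated with x via e
y = [alpha + beta * xi + ui for xi, ui in zip(x, u)]

# OLS slope: Cov(x, y) / Var(x). Inconsistent here because Cov(x, u) != 0.
slope_ols = cov(x, y) / cov(x, x)

# Simple IV slope with one instrument: Cov(w, y) / Cov(w, x).
slope_iv = cov(w, y) / cov(w, x)

print(f"true beta = {beta}, OLS = {slope_ols:.3f}, IV = {slope_iv:.3f}")
```

Under these assumptions Cov(x, u) = 0.8 and Var(x) = 2, so the OLS slope should settle near 2.4, while the IV slope should be close to the true value of 2.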

Static models

The following are some common sources of endogeneity.

Omitted variable

Further information: Omitted-variable bias

In this case, the endogeneity comes from an uncontrolled confounding variable: a variable that is correlated with both an independent variable in the model and the error term. (Equivalently, the omitted variable both affects the independent variable and separately affects the dependent variable.) Assume that the "true" model to be estimated is

 y_i = \alpha + \beta x_i + \gamma z_i + u_i

but we omit z_i (perhaps because we don't have a measure for it) when we run our regression. z_i will get absorbed by the error term and we will actually estimate,

 y_i = \alpha + \beta x_i + \varepsilon_i      (where \varepsilon_i=\gamma z_i + u_i)

If the correlation of x and z is not 0 and z separately affects y (meaning \gamma \neq 0), then x is correlated with the error term \varepsilon.

Here, x and the constant 1 are not exogenous for \alpha and \beta since, given x and 1, the distribution of y depends not only on \alpha and \beta, but also on z and \gamma.
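The effect of omitting z_i can be checked numerically. The sketch below simulates the "true" model, runs the short regression of y on x alone, and compares the resulting slope with the standard omitted-variable expression \beta + \gamma Cov(x,z)/Var(x). The parameter values, the way x is made to depend on z, and the helper cov are assumptions chosen only for illustration.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (n - 1)

rng = random.Random(0)
n = 100_000
alpha, beta, gamma = 1.0, 2.0, 3.0              # illustrative "true" parameters

z = [rng.gauss(0, 1) for _ in range(n)]         # the variable that will be omitted
x = [0.5 * zi + rng.gauss(0, 1) for zi in z]    # x is correlated with z (assumed form)
u = [rng.gauss(0, 1) for _ in range(n)]
y = [alpha + beta * xi + gamma * zi + ui for xi, zi, ui in zip(x, z, u)]

# Short regression: y on x only, so gamma * z is absorbed by the error term.
slope_short = cov(x, y) / cov(x, x)

# Omitted-variable expression evaluated on the same sample.
predicted = beta + gamma * cov(x, z) / cov(x, x)

print(f"true beta = {beta}")
print(f"slope omitting z = {slope_short:.3f}, predicted = {predicted:.3f}")
```

With these assumed values Cov(x,z) = 0.5 and Var(x) = 1.25, so the short-regression slope should be close to 2 + 3(0.4) = 3.2 rather than the true value of 2.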

Measurement error

Suppose that we do not get a perfect measure of one of our independent variables. Imagine that instead of observing x^{*}_{i} we observe x_i=x^{*}_{i}+ \nu_i where \nu_i is the measurement "noise". In this case, a model given by

 y_i = \alpha+\beta x^{*}_{i} + \varepsilon_i

is written in terms of observables and error terms as

 y_i = \alpha+\beta(x_i-\nu_i) + \varepsilon_i
 y_i = \alpha+\beta x_i +(\varepsilon_i - \beta\nu_i)
 y_i = \alpha+\beta x_i +u_i     (where u_i=\varepsilon_i - \beta\nu_i)

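Since both x_i and u_i depend on \nu_i, they are correlated, so the OLS estimate of \beta is biased. Under the classical assumption that \nu_i is uncorrelated with x^{*}_{i} and \varepsilon_i, the estimate is biased toward zero (attenuation bias). Measurement error in the dependent variable, however, does not cause endogeneity (though it does increase the variance of the error term).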
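A small simulation makes the attenuation visible. The sketch below assumes classical measurement error (noise independent of the true regressor and of \varepsilon_i) and compares the slope obtained with the true regressor to the slope obtained with the noisy one. The parameter values, variable names, and the helper cov are illustrative assumptions.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (n - 1)

rng = random.Random(1)
n = 100_000
alpha, beta = 1.0, 2.0                  # illustrative "true" parameters
sigma_xstar, sigma_nu = 1.0, 1.0        # std. dev. of true regressor and of noise

x_star = [rng.gauss(0, sigma_xstar) for _ in range(n)]   # unobserved x*
nu = [rng.gauss(0, sigma_nu) for _ in range(n)]          # measurement noise
eps = [rng.gauss(0, 1) for _ in range(n)]
x_obs = [xs + v for xs, v in zip(x_star, nu)]            # observed x = x* + nu
y = [alpha + beta * xs + e for xs, e in zip(x_star, eps)]

slope_true = cov(x_star, y) / cov(x_star, x_star)   # infeasible: uses x*
slope_noisy = cov(x_obs, y) / cov(x_obs, x_obs)     # feasible: uses noisy x

# Share of Var(x) that is signal; the noisy slope converges to beta times this.
reliability = sigma_xstar**2 / (sigma_xstar**2 + sigma_nu**2)

print(f"slope with x*  = {slope_true:.3f}")
print(f"slope with x   = {slope_noisy:.3f}")
print(f"beta * reliability = {beta * reliability:.3f}")
```

With equal signal and noise variances the reliability ratio is 0.5, so the slope on the noisy regressor should be close to 1, about half the true coefficient.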

Simultaneity

Suppose that two variables are codetermined, with each affecting the other. Suppose that we have two "structural" equations,

y_i = \beta_1 x_i + \gamma_1 z_i + u_i
z_i = \beta_2 x_i + \gamma_2 y_i + v_i

We can show that estimating either equation results in endogeneity. In the case of the first structural equation, we will show that E(z_i u_i) \neq 0. First, solving for z_i we get (assuming that 1-\gamma_1 \gamma_2 \neq 0 ),

z_i = \frac{\beta_2 + \gamma_2 \beta_1}{1-\gamma_1 \gamma_2}x_i+\frac{1}{1-\gamma_1 \gamma_2}v_i+\frac{\gamma_2}{1-\gamma_1 \gamma_2}u_i

Assuming that x_i and v_i are uncorrelated with u_i, and that \gamma_2 \neq 0, we find that

E(z_i u_i) = \frac{\gamma_2}{1-\gamma_1 \gamma_2}E(u_i u_i)
E(z_i u_i) \neq 0

Therefore, attempts at estimating either structural equation will be hampered by endogeneity.
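The result E(z_i u_i) \neq 0 can be verified by simulation. The sketch below draws x_i, u_i and v_i independently, solves the two structural equations for z_i and y_i, and compares the sample covariance of z and u with \frac{\gamma_2}{1-\gamma_1 \gamma_2}E(u_i u_i). The coefficient values, the unit-variance normal shocks, and the helper cov are assumptions made for the example.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (n - 1)

rng = random.Random(2)
n = 100_000
beta1, gamma1 = 1.0, 0.5      # illustrative coefficients of the y equation
beta2, gamma2 = 1.0, 0.5      # illustrative coefficients of the z equation
denom = 1.0 - gamma1 * gamma2 # must be nonzero for a solution to exist

x = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]
v = [rng.gauss(0, 1) for _ in range(n)]

# Reduced form for z, obtained by substituting the y equation into the z equation.
z = [((beta2 + gamma2 * beta1) * xi + vi + gamma2 * ui) / denom
     for xi, ui, vi in zip(x, u, v)]
# y then follows from the first structural equation.
y = [beta1 * xi + gamma1 * zi + ui for xi, zi, ui in zip(x, z, u)]

cov_zu = cov(z, u)                               # sample analogue of E(z u)
predicted = gamma2 / denom * cov(u, u)           # gamma2 / (1 - gamma1*gamma2) * E(u u)

print(f"Cov(z, u) = {cov_zu:.3f}, predicted = {predicted:.3f}")
```

With these assumed coefficients the predicted covariance is 0.5/0.75, roughly 0.67, so z is clearly correlated with the error term of the first structural equation.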

Dynamic models

The endogeneity problem is particularly relevant in the context of time series analysis of causal processes. It is common for some factors within a causal system to be dependent for their value in period t on the values of other factors in the causal system in period t-1. Suppose that the level of pest infestation is independent of all other factors within a given period, but is influenced by the level of rainfall and fertilizer in the preceding period. In this instance it would be correct to say that infestation is exogenous within the period, but endogenous over time.

Let the model be y=f(x,z)+u. If the variable x is sequentially exogenous for the parameter \alpha and y does not cause x in the Granger sense, then x is strongly (strictly) exogenous for the parameter \alpha.

Simultaneity

Generally speaking, simultaneity occurs in the dynamic model just like in the example of static simultaneity above.

References

  1. ^ Wooldridge, Jeffrey M. (2013). Introductory Econometrics: A Modern Approach (Fifth international ed.). Australia: South-Western. pp. 82–83. ISBN 978-1-111-53439-4. 

Further reading

  • Greene, William H. (2007). Econometric Analysis (Sixth ed.). Upper Saddle River: Pearson. ISBN 978-0-13-513740-6. 
  • Kennedy, Peter (2008). A Guide to Econometrics (Sixth ed.). Malden: Blackwell. p. 139. ISBN 978-1-4051-8257-7. 
