# Hausman test

The Hausman test (also called the Wu–Hausman test, Hausman specification test, and Durbin–Wu–Hausman test) is a statistical hypothesis test in econometrics named after James Durbin,[1] De-Min Wu[2] and Jerry A. Hausman.[3][citation needed][clarification needed] The test evaluates the significance[clarification needed] of an estimator versus an alternative estimator. It helps one evaluate if a statistical model corresponds to the data.

## Details

Consider the linear model y = bX + e, where y is the dependent variable and X is vector of regressors, b is a vector of coefficients and e is the error term. We have two estimators for b: b0 and b1. Under the null hypothesis, both of these estimators are consistent, but b1 is efficient (has the smallest asymptotic variance), at least in the class of estimators containing b0. Under the alternative hypothesis, b0 is consistent, whereas b1 isn’t.

Then the Wu–Hausman statistic is:[4]

$H=(b_{1}-b_{0})'\big(\operatorname{Var}(b_{0})-\operatorname{Var}(b_{1})\big)^\dagger(b_{1}-b_{0}),$

where denotes the Moore–Penrose pseudoinverse. Under the null hypothesis, this statistic has asymptotically the chi-squared distribution with the number of degrees of freedom equal to the rank of matrix Var(b0) − Var(b1).

If we reject the null hypothesis, it means that b1 is inconsistent. This test can be used to check for the endogeneity of a variable (by comparing instrumental variable (IV) estimates to ordinary least squares (OLS) estimates). It can also be used to check the validity of extra instruments by comparing IV estimates using a full set of instruments Z to IV estimates that use a proper subset of Z. Note that in order for the test to work in the latter case, we must be certain of the validity of the subset of Z and that subset must have enough instruments to identify the parameters of the equation.

Hausman also showed that the covariance between an efficient estimator and the difference of an efficient and inefficient estimator is zero.

## Panel data

Hausman test can be also used to differentiate between fixed effects model and random effects model in panel data. In this case, Random effects (RE) is preferred under the null hypothesis due to higher efficiency, while under the alternative Fixed effects (FE) is at least consistent and thus preferred.

 H0 is true H1 is true b1 (RE estimator) Consistent Efficient Inconsistent b0 (FE estimator) Consistent Inefficient Consistent