= Information matrix test =

In econometrics, the information matrix test is used to determine whether a regression model is misspecified. The test was developed by Halbert White, who observed that in a correctly specified model and under standard regularity assumptions, the Fisher information matrix can be expressed in either of two ways: as the expected outer product of the gradient of the log-likelihood function, or as the negative of the expected value of its Hessian matrix.

Consider a linear model $\mathbf{y} = \mathbf{X} \mathbf{\beta} + \mathbf{u}$, where the errors $\mathbf{u}$ are assumed to be distributed $\mathrm{N}(0, \sigma^2 \mathbf{I})$. If the parameters $\mathbf{\beta}$ and $\sigma^2$ are stacked in the vector $\mathbf{\theta}^{\mathsf{T}} = \begin{bmatrix} \mathbf{\beta}^{\mathsf{T}} & \sigma^2 \end{bmatrix}$, the resulting log-likelihood function is, up to an additive constant that does not depend on $\mathbf{\theta}$,

$\ell (\mathbf{\theta}) = - \frac{n}{2} \log \sigma^2 - \frac{1}{2 \sigma^2} \left( \mathbf{y} - \mathbf{X} \mathbf{\beta} \right)^{\mathsf{T}} \left( \mathbf{y} - \mathbf{X} \mathbf{\beta} \right)$
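
The log-likelihood above can be evaluated numerically. The following is a minimal sketch, assuming NumPy is available; the function names `log_likelihood` and `mle` and the simulated data are illustrative choices, not part of the original presentation. It uses the standard result that, for this model, the maximum likelihood estimate of $\mathbf{\beta}$ coincides with ordinary least squares and that of $\sigma^2$ is the residual sum of squares divided by $n$.

```python
import numpy as np

def log_likelihood(beta, sigma2, y, X):
    """Gaussian log-likelihood of the linear model.

    The additive constant -(n/2) * log(2*pi) is omitted, matching the
    expression in the text.
    """
    n = y.shape[0]
    resid = y - X @ beta
    return -0.5 * n * np.log(sigma2) - (resid @ resid) / (2.0 * sigma2)

def mle(y, X):
    """Maximum likelihood estimates (beta_hat, sigma2_hat)."""
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)  # OLS solution
    resid = y - X @ beta_hat
    sigma2_hat = (resid @ resid) / y.shape[0]         # divisor n, not n - k
    return beta_hat, sigma2_hat

# Illustrative data (assumed values): intercept plus one regressor.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(scale=0.5, size=n)

beta_hat, sigma2_hat = mle(y, X)
ll_hat = log_likelihood(beta_hat, sigma2_hat, y, X)
```

Evaluating `log_likelihood` at any other parameter value for the same data should give a value no larger than `ll_hat`.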

The information matrix can then be expressed in two ways. First, it can be written as

 $\mathbf{I} (\mathbf{\theta}) = \operatorname{E} \left[ \left( \frac{\partial \ell (\mathbf{\theta}) }{ \partial \mathbf{\theta} } \right) \left( \frac{\partial \ell (\mathbf{\theta}) }{ \partial \mathbf{\theta} } \right)^{\mathsf{T}} \right]$
that is, as the expected value of the outer product of the gradient (the score). Second, it can be written as the negative of the expected value of the Hessian matrix of the log-likelihood function

 $\mathbf{I} (\mathbf{\theta}) = - \operatorname{E} \left[ \frac{\partial^2 \ell (\mathbf{\theta}) }{ \partial \mathbf{\theta} \, \partial \mathbf{\theta}^{\mathsf{T}}} \right]$
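
For the linear model above, the agreement of the two forms can be checked directly. The following is a worked sketch for a single observation, under the assumptions that the regressors are treated as fixed and that the normality assumption holds. The notation $\mathbf{x}_i^{\mathsf{T}}$ for the $i$-th row of $\mathbf{X}$, $u_i$ for the $i$-th error, and $\ell_i$ for the contribution of observation $i$ to the log-likelihood is introduced here for illustration.

```latex
\begin{align*}
\ell_i(\theta) &= -\tfrac{1}{2}\log\sigma^2 - \frac{u_i^2}{2\sigma^2},
  \qquad u_i = y_i - \mathbf{x}_i^{\mathsf{T}}\beta \\[4pt]
\frac{\partial \ell_i}{\partial \beta} &= \frac{\mathbf{x}_i u_i}{\sigma^2},
  \qquad
  \frac{\partial \ell_i}{\partial \sigma^2}
    = -\frac{1}{2\sigma^2} + \frac{u_i^2}{2\sigma^4} \\[4pt]
\frac{\partial^2 \ell_i}{\partial\beta\,\partial\beta^{\mathsf{T}}}
    &= -\frac{\mathbf{x}_i\mathbf{x}_i^{\mathsf{T}}}{\sigma^2},
  \qquad
  \frac{\partial^2 \ell_i}{\partial\beta\,\partial\sigma^2}
    = -\frac{\mathbf{x}_i u_i}{\sigma^4},
  \qquad
  \frac{\partial^2 \ell_i}{\partial(\sigma^2)^2}
    = \frac{1}{2\sigma^4} - \frac{u_i^2}{\sigma^6} \\[8pt]
% Normal moments: E[u_i] = 0, E[u_i^2] = \sigma^2, E[u_i^3] = 0, E[u_i^4] = 3\sigma^4
\operatorname{E}\!\left[
  \frac{\partial \ell_i}{\partial \theta}
  \left(\frac{\partial \ell_i}{\partial \theta}\right)^{\mathsf{T}}
\right]
  &= -\operatorname{E}\!\left[
    \frac{\partial^2 \ell_i}{\partial\theta\,\partial\theta^{\mathsf{T}}}
  \right]
  = \begin{bmatrix}
      \dfrac{\mathbf{x}_i\mathbf{x}_i^{\mathsf{T}}}{\sigma^2} & \mathbf{0} \\[8pt]
      \mathbf{0}^{\mathsf{T}} & \dfrac{1}{2\sigma^4}
    \end{bmatrix} \\[8pt]
\mathbf{I}(\theta)
  &= \begin{bmatrix}
      \dfrac{\mathbf{X}^{\mathsf{T}}\mathbf{X}}{\sigma^2} & \mathbf{0} \\[8pt]
      \mathbf{0}^{\mathsf{T}} & \dfrac{n}{2\sigma^4}
    \end{bmatrix}
\end{align*}
```

The lower-right entry shows where normality enters: the outer-product form gives $\tfrac{1}{4\sigma^4} - \tfrac{1}{2\sigma^4} + \tfrac{3}{4\sigma^4} = \tfrac{1}{2\sigma^4}$ only because $\operatorname{E}[u_i^4] = 3\sigma^4$, so an error distribution with different kurtosis would break the equality.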

If the model is correctly specified, the two expressions are equal, so the sum of the expected Hessian and the expected outer product of the score is zero. Replacing the expectations with sums over the $n$ observations and combining the two forms yields

 $\mathbf{\Delta}(\mathbf{\theta}) = \sum_{i=1}^n \left[ \frac{\partial^2 \ell_i(\mathbf{\theta}) }{ \partial \mathbf{\theta} \, \partial \mathbf{\theta}^{\mathsf{T}} } + \frac{\partial \ell_i(\mathbf{\theta}) }{ \partial \mathbf{\theta} } \left( \frac{\partial \ell_i (\mathbf{\theta}) }{ \partial \mathbf{\theta} } \right)^{\mathsf{T}} \right]$

where $\ell_i(\mathbf{\theta})$ is the contribution of observation $i$ to the log-likelihood and $\mathbf{\Delta} (\mathbf{\theta})$ is an $(r \times r)$ random matrix, with $r$ the number of parameters. White showed that the elements of $n^{-1/2} \mathbf{\Delta} ( \mathbf{\hat{\theta}} )$, where $\mathbf{\hat{\theta}}$ is the maximum likelihood estimator (MLE), are asymptotically jointly normally distributed with zero means when the model is correctly specified. In small samples, however, the test generally performs poorly.
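
For the linear model with normal errors, the matrix $\mathbf{\Delta}(\mathbf{\hat{\theta}})$ can be computed from closed-form per-observation scores and Hessians. The following is a minimal sketch, assuming NumPy; the function name `im_indicators` and the simulated data are hypothetical and chosen only for illustration. It computes the indicator matrix only, not a full test statistic, which would additionally require an estimate of the covariance matrix of the indicators.

```python
import numpy as np

def im_indicators(y, X):
    """Return (Delta(theta_hat), beta_hat, sigma2_hat) for the Gaussian
    linear model, with theta = (beta, sigma^2) and u_i = y_i - x_i' beta."""
    n, k = X.shape
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)  # MLE of beta is OLS
    u = y - X @ beta_hat
    s2 = (u @ u) / n                                  # MLE of sigma^2

    # Per-observation scores, one row per observation, shape (n, k + 1):
    #   d l_i / d beta    = x_i u_i / s2
    #   d l_i / d sigma^2 = -1/(2 s2) + u_i^2 / (2 s2^2)
    score = np.column_stack([
        X * (u / s2)[:, None],
        -0.5 / s2 + u**2 / (2.0 * s2**2),
    ])

    # Sum over observations of the per-observation Hessians.
    hess = np.zeros((k + 1, k + 1))
    hess[:k, :k] = -(X.T @ X) / s2
    hess[:k, k] = hess[k, :k] = -(X.T @ u) / s2**2
    hess[k, k] = n / (2.0 * s2**2) - (u @ u) / s2**3

    # Hessian plus outer product of the scores, summed over i.
    delta = hess + score.T @ score
    return delta, beta_hat, s2

# Illustrative data (assumed values): intercept plus one regressor.
rng = np.random.default_rng(1)
n = 300
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([0.5, -1.0]) + rng.normal(scale=0.5, size=n)

delta, beta_hat, s2 = im_indicators(y, X)
scaled = delta / np.sqrt(n)   # n^{-1/2} * Delta(theta_hat)
```

Some elements of `delta` are identically zero at the MLE and carry no information. For example, when `X` contains an intercept, the element for the intercept with itself reduces to $\sum_i (\hat{u}_i^2 - \hat{\sigma}^2)/\hat{\sigma}^4 = 0$. The remaining elements depend on second, third and fourth powers of the residuals.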
