# First-difference estimator

The first-difference (FD) estimator is an approach used in econometrics and statistics to address omitted-variable bias in panel data when the omitted variables are constant over time. The estimator is obtained by pooled OLS estimation of a regression of the period-to-period change in the dependent variable, ${\displaystyle \Delta y_{it}}$, on the change in the regressors, ${\displaystyle \Delta x_{it}}$.

The FD estimator removes time-invariant omitted variables ${\displaystyle c_{i}}$ by using the repeated observations over time:

${\displaystyle y_{it}=x_{it}\beta +c_{i}+u_{it},\quad t=1,\dots ,T,}$
${\displaystyle y_{it-1}=x_{it-1}\beta +c_{i}+u_{it-1},\quad t=2,\dots ,T.}$

Subtracting the second equation from the first gives:

${\displaystyle \Delta y_{it}=y_{it}-y_{it-1}=\Delta x_{it}\beta +\Delta u_{it},\quad t=2,\dots ,T,}$

which removes the unobserved ${\displaystyle c_{i}}$.
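To make the cancellation explicit, the subtraction can be written out term by term. This is only a worked restatement of the step above, using the symbols already defined:

${\displaystyle y_{it}-y_{it-1}=(x_{it}-x_{it-1})\beta +(c_{i}-c_{i})+(u_{it}-u_{it-1})=\Delta x_{it}\beta +\Delta u_{it}.}$

Because ${\displaystyle c_{i}}$ carries no time index, it enters both periods identically and its difference is exactly zero, whatever its correlation with ${\displaystyle x_{it}}$.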

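Stacking the differenced observations for all units and periods into ${\displaystyle \Delta X}$ and ${\displaystyle \Delta y}$, the FD estimator ${\displaystyle {\hat {\beta }}_{FD}}$ is obtained by regressing changes on changes using OLS: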

${\displaystyle {\hat {\beta }}_{FD}=(\Delta X'\Delta X)^{-1}\Delta X'\Delta y}$

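Note that the rank condition ${\displaystyle \operatorname {rank} [\Delta X'\Delta X]=K}$, where ${\displaystyle K}$ is the number of regressors, must be met for ${\displaystyle \Delta X'\Delta X}$ to be invertible. In particular, a regressor that does not vary over time differences to zero and cannot be included.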
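The estimator above can be sketched in a few lines of NumPy. This is a minimal illustration on simulated data, not a reference implementation: the function name `first_difference_ols`, the array layout (`y` of shape `(n, T)`, `X` of shape `(n, T, K)`), and all simulation parameters are assumptions made for the example.

```python
import numpy as np


def first_difference_ols(y, X):
    """FD estimate for a balanced panel (hypothetical helper).

    y has shape (n, T); X has shape (n, T, K).
    """
    K = X.shape[2]
    # Difference along the time axis, then stack units and periods.
    dy = np.diff(y, axis=1).reshape(-1)        # shape (n*(T-1),)
    dX = np.diff(X, axis=1).reshape(-1, K)     # shape (n*(T-1), K)
    # Rank condition: the differenced regressors must have full column rank.
    if np.linalg.matrix_rank(dX) < K:
        raise ValueError("rank condition fails: differenced X is rank deficient")
    # Solve the normal equations (dX'dX) b = dX'dy.
    return np.linalg.solve(dX.T @ dX, dX.T @ dy)


# Simulated panel (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n, T, K = 500, 4, 2
beta = np.array([1.5, -0.7])
c = rng.normal(size=(n, 1))                      # unobserved unit effect c_i
X = rng.normal(size=(n, T, K)) + c[:, :, None]   # regressors correlated with c_i
u = rng.normal(scale=0.5, size=(n, T))
y = X @ beta + c + u

beta_fd = first_difference_ols(y, X)

# Pooled OLS in levels, for contrast: it ignores c_i and is biased here.
beta_pooled = np.linalg.lstsq(X.reshape(-1, K), y.reshape(-1), rcond=None)[0]

print("true beta:      ", beta)
print("FD estimate:    ", beta_fd)
print("pooled OLS (levels):", beta_pooled)
```

In this simulated design the regressors are built to be correlated with ${\displaystyle c_{i}}$, so the levels regression absorbs part of the unit effect into the slope estimates while the differenced regression does not.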

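Under homoskedastic and serially uncorrelated differenced errors ${\displaystyle \Delta u_{it}}$, the asymptotic variance of ${\displaystyle {\hat {\beta }}_{FD}}$ is estimated by

${\displaystyle {\widehat {\operatorname {Avar} }}({\hat {\beta }}_{FD})={\hat {\sigma }}_{u}^{2}(\Delta X'\Delta X)^{-1},}$

where ${\displaystyle {\hat {\sigma }}_{u}^{2}}$ is given by

${\displaystyle {\hat {\sigma }}_{u}^{2}=[n(T-1)-K]^{-1}{\hat {u}}'{\hat {u}},}$

with ${\displaystyle {\hat {u}}=\Delta y-\Delta X{\hat {\beta }}_{FD}}$ the residuals from the differenced regression, ${\displaystyle n}$ the number of cross-sectional units and ${\displaystyle K}$ the number of regressors.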
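The variance formula can be computed directly from the residuals of the differenced regression. The sketch below assumes a balanced panel and simulates errors that follow a random walk, so that the differenced errors are serially uncorrelated and the classical formula applies. The function name `fd_with_classical_se` and all simulation settings are illustrative assumptions.

```python
import numpy as np


def fd_with_classical_se(y, X):
    """FD estimate with the classical variance estimate (hypothetical helper).

    y has shape (n, T); X has shape (n, T, K).
    """
    n, T, K = X.shape
    dy = np.diff(y, axis=1).reshape(-1)
    dX = np.diff(X, axis=1).reshape(-1, K)
    XtX_inv = np.linalg.inv(dX.T @ dX)
    beta_hat = XtX_inv @ dX.T @ dy
    resid = dy - dX @ beta_hat                        # residuals of the FD regression
    sigma2_hat = resid @ resid / (n * (T - 1) - K)    # degrees-of-freedom correction
    avar_hat = sigma2_hat * XtX_inv
    return beta_hat, sigma2_hat, avar_hat


# Simulated panel with random-walk errors (illustrative assumptions).
rng = np.random.default_rng(1)
n, T, K = 400, 5, 2
beta = np.array([2.0, 0.5])
c = rng.normal(size=(n, 1))
X = rng.normal(size=(n, T, K)) + c[:, :, None]
e = rng.normal(scale=0.5, size=(n, T))    # innovations with variance 0.25
u = np.cumsum(e, axis=1)                  # random walk, so diff(u) equals e[:, 1:]
y = X @ beta + c + u

beta_hat, sigma2_hat, avar_hat = fd_with_classical_se(y, X)
std_errors = np.sqrt(np.diag(avar_hat))

print("beta_hat:  ", beta_hat)
print("sigma2_hat:", sigma2_hat)
print("std errors:", std_errors)
```

Here `sigma2_hat` estimates the variance of the differenced errors (0.25 in this simulation), since the residuals come from the regression in differences.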

## Properties

Under the assumption ${\displaystyle E[u_{it}-u_{it-1}|x_{it}-x_{it-1}]=0}$, the FD estimator is unbiased and consistent, i.e. ${\displaystyle E[{\hat {\beta }}_{FD}]=\beta }$ and ${\displaystyle \operatorname {plim} {\hat {\beta }}_{FD}=\beta }$. Note that this assumption is less restrictive than the assumption of strict exogeneity required for unbiasedness using the fixed effects (FE) estimator. If the disturbance term ${\displaystyle u_{it}}$ follows a random walk, the differenced errors ${\displaystyle \Delta u_{it}}$ are serially uncorrelated and the usual OLS standard errors are asymptotically valid.

## Relation to fixed effects estimator

For ${\displaystyle T=2}$, the FD and fixed effects estimators are numerically equivalent.
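The equivalence for two periods can be checked numerically. The sketch below compares the FD estimate with a within (demeaning) estimate on a simulated two-period panel. Neither regression includes an intercept or time dummies, and all data-generating values are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n, T, K = 200, 2, 3
beta = np.array([1.0, -2.0, 0.3])
c = rng.normal(size=(n, 1))
X = rng.normal(size=(n, T, K)) + c[:, :, None]
y = X @ beta + c + rng.normal(size=(n, T))

# First-difference estimator: regress changes on changes.
dy = np.diff(y, axis=1).reshape(-1)
dX = np.diff(X, axis=1).reshape(-1, K)
beta_fd = np.linalg.lstsq(dX, dy, rcond=None)[0]

# Fixed effects (within) estimator: subtract each unit's time average.
y_within = (y - y.mean(axis=1, keepdims=True)).reshape(-1)
X_within = (X - X.mean(axis=1, keepdims=True)).reshape(-1, K)
beta_fe = np.linalg.lstsq(X_within, y_within, rcond=None)[0]

print("FD:", beta_fd)
print("FE:", beta_fe)
print("numerically equal:", np.allclose(beta_fd, beta_fe))
```

With ${\displaystyle T=2}$, each demeaned observation is plus or minus half the first difference, so the two sets of normal equations differ only by a common scale factor that cancels.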

Under the assumption of spherical errors, i.e. homoscedasticity and no serial correlation in ${\displaystyle u_{it}}$, the FE estimator is more efficient than the FD estimator. If ${\displaystyle u_{it}}$ follows a random walk, however, the FD estimator is more efficient, because the differenced errors ${\displaystyle \Delta u_{it}}$ are then serially uncorrelated.
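The efficiency ranking can be illustrated with a small Monte Carlo experiment. The sketch below uses a single regressor and compares the sampling variance of the two estimators under i.i.d. errors and under random-walk errors. The function names, the sample sizes, and the number of replications are assumptions chosen for the example, and the outcome is a simulation result for this design rather than a proof.

```python
import numpy as np


def fd_slope(y, x):
    """First-difference slope for a single regressor."""
    dy, dx = np.diff(y, axis=1), np.diff(x, axis=1)
    return (dx * dy).sum() / (dx * dx).sum()


def fe_slope(y, x):
    """Within (fixed effects) slope for a single regressor."""
    yw = y - y.mean(axis=1, keepdims=True)
    xw = x - x.mean(axis=1, keepdims=True)
    return (xw * yw).sum() / (xw * xw).sum()


def sampling_variances(random_walk, reps=2000, n=50, T=5, beta=1.0, seed=0):
    """Monte Carlo variance of the FD and FE slopes (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    fd = np.empty(reps)
    fe = np.empty(reps)
    for r in range(reps):
        c = rng.normal(size=(n, 1))
        x = rng.normal(size=(n, T)) + c
        e = rng.normal(size=(n, T))
        # Either i.i.d. errors or their cumulative sum (a random walk).
        u = np.cumsum(e, axis=1) if random_walk else e
        y = beta * x + c + u
        fd[r] = fd_slope(y, x)
        fe[r] = fe_slope(y, x)
    return fd.var(), fe.var()


var_fd_iid, var_fe_iid = sampling_variances(random_walk=False)
var_fd_rw, var_fe_rw = sampling_variances(random_walk=True)

print("i.i.d. errors:      Var(FD) =", var_fd_iid, " Var(FE) =", var_fe_iid)
print("random-walk errors: Var(FD) =", var_fd_rw, " Var(FE) =", var_fe_rw)
```

In this design the FE slope should show the smaller variance under i.i.d. errors and the FD slope the smaller variance under random-walk errors, matching the ranking stated above.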

In practice, the FD estimator is easier to implement without special software, as the only transformation required is to first-difference the data.