Fixed effects model
In econometrics and statistics, a fixed effects model is a statistical model that represents the observed quantities in terms of explanatory variables that are treated as if they were non-random. This is in contrast to random effects models and mixed models, in which all or some of the explanatory variables are treated as if they arise from random causes. Note that the biostatistics definitions differ: biostatisticians refer to the population-average and subject-specific effects as "fixed" and "random" effects, respectively.[1][2][3] Often the same model structure, usually a linear regression model, can be treated as any of the three types depending on the analyst's viewpoint, although there may be a natural choice in any given situation.
In panel data analysis, the term fixed effects estimator (also known as the within estimator) refers to an estimator for the coefficients in the regression model. Assuming fixed effects imposes time-independent effects for each entity that are possibly correlated with the regressors.
Qualitative description
Such models help control for unobserved heterogeneity when this heterogeneity is constant over time and correlated with the independent variables. The constant can be removed from the data through differencing, for example by taking a first difference, which removes any time-invariant components of the model.
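As a minimal sketch of why differencing works, the toy panel below (all names and numbers are made up for illustration) builds an outcome from a regressor plus an entity-specific constant and shows that the constant drops out of the first differences:

```python
# Hypothetical noise-free panel: y_it = beta * x_it + effect_i.
beta = 2.0
effects = {"A": 10.0, "B": -5.0}  # time-invariant, entity-specific constants
x = {"A": [1.0, 2.0, 4.0], "B": [3.0, 3.5, 6.0]}
y = {i: [beta * x_it + effects[i] for x_it in x[i]] for i in x}


def first_difference(series):
    """Return the period-to-period changes of a list of values."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


dx = {i: first_difference(x[i]) for i in x}
dy = {i: first_difference(y[i]) for i in y}

# After differencing, dy_it = beta * dx_it, so every ratio recovers beta
# regardless of the entity's constant.
ratios = [dy_it / dx_it for i in x for dy_it, dx_it in zip(dy[i], dx[i])]
print(ratios)
```

With real data the differenced equation still contains an error term, so the slope is estimated by regression rather than read off a ratio.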
There are two common assumptions made about the individual-specific effect: the random effects assumption and the fixed effects assumption. The random effects assumption (made in a random effects model) is that the individual-specific effects are uncorrelated with the independent variables. The fixed effects assumption is that the individual-specific effect is allowed to be correlated with the independent variables. If the random effects assumption holds, the random effects estimator is more efficient than the fixed effects estimator. However, if this assumption does not hold (i.e., if the Hausman test rejects it), the random effects estimator is not consistent.
Formal description
Consider the linear unobserved effects model for $N$ observations and $T$ time periods:
$$y_{it} = X_{it}\beta + \alpha_i + u_{it}$$
for $t = 1, \dots, T$ and $i = 1, \dots, N$, where $y_{it}$ is the dependent variable observed for individual $i$ at time $t$, $X_{it}$ is the time-variant $1 \times k$ regressor matrix, $\alpha_i$ is the unobserved time-invariant individual effect and $u_{it}$ is the error term. Unlike $X_{it}$, $\alpha_i$ cannot be observed by the econometrician. Common examples of time-invariant effects $\alpha_i$ are innate ability for individuals or historical and institutional factors for countries.
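Unlike the random effects (RE) model, where the unobserved individual effect $\alpha_i$ is independent of $X_{it}$ for all $t = 1, \dots, T$, the FE model allows $\alpha_i$ to be correlated with the regressor matrix $X_{it}$. Strict exogeneity with respect to the idiosyncratic error term $u_{it}$, however, is still required.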
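Since the individual effect $\alpha_i$ is not observable, it cannot be directly controlled for. The FE model eliminates $\alpha_i$ by demeaning the variables using the within transformation:
$$y_{it} - \bar y_i = (X_{it} - \bar X_i)\beta + (\alpha_i - \bar\alpha_i) + (u_{it} - \bar u_i) \quad\Longrightarrow\quad \ddot y_{it} = \ddot X_{it}\beta + \ddot u_{it}$$
where $\bar X_i = \frac{1}{T}\sum_{t=1}^{T} X_{it}$ and $\bar u_i = \frac{1}{T}\sum_{t=1}^{T} u_{it}$ (and likewise for $\bar y_i$). Since $\alpha_i$ is constant over time, $\bar\alpha_i = \alpha_i$ and hence the effect is eliminated. The FE estimator $\hat\beta_{FE}$ is then obtained by an OLS regression of $\ddot y$ on $\ddot X$.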
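The within transformation can be sketched in a few lines of standard-library Python for the single-regressor case. The simulated panel, the parameter values and the function names `within_slope` and `pooled_slope` are illustrative assumptions, not part of the model description. The regressor is deliberately built to be correlated with the individual effect, so pooled OLS (which ignores the effect) should drift away from the true slope while the within estimator should stay close to it.

```python
import random


def within_slope(panel):
    """FE (within) slope for one regressor: OLS on entity-demeaned data."""
    sxy = sxx = 0.0
    for obs in panel.values():
        x_bar = sum(x for x, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            sxy += (x - x_bar) * (y - y_bar)
            sxx += (x - x_bar) ** 2
    return sxy / sxx


def pooled_slope(panel):
    """Pooled OLS slope (with intercept) that ignores the individual effect."""
    pairs = [pair for obs in panel.values() for pair in obs]
    x_bar = sum(x for x, _ in pairs) / len(pairs)
    y_bar = sum(y for _, y in pairs) / len(pairs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    sxx = sum((x - x_bar) ** 2 for x, _ in pairs)
    return sxy / sxx


# Simulated panel; N, T, beta and the noise scales are arbitrary choices.
rng = random.Random(0)
N, T, beta = 300, 4, 1.5
panel = {}
for i in range(N):
    alpha_i = rng.gauss(0, 1)  # unobserved individual effect
    obs = []
    for _ in range(T):
        x_it = alpha_i + rng.gauss(0, 1)  # regressor correlated with alpha_i
        y_it = beta * x_it + 2.0 * alpha_i + rng.gauss(0, 0.5)
        obs.append((x_it, y_it))
    panel[i] = obs

beta_fe = within_slope(panel)
beta_pooled = pooled_slope(panel)
print(f"within: {beta_fe:.3f}, pooled OLS: {beta_pooled:.3f}, true: {beta}")
```

In this design the pooled slope is pulled upward because the omitted effect enters the outcome with a positive coefficient and is positively correlated with the regressor.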
An alternative to the within transformation is to add a dummy variable for each individual $i$. For the slope coefficients this is numerically, but not computationally, equivalent to the within estimator. It is practical mainly when the number of individuals is small enough that one dummy per individual can be estimated, and the estimated individual effects themselves are precise only when $T$, the number of time observations per individual, is large.
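The numerical equivalence of the dummy-variable regression and the within estimator can be checked on a small example. The sketch below uses a made-up three-individual panel and a hand-written normal-equations solver (the helpers `solve` and `ols` are illustrative, not a library API), then compares the coefficient on the regressor from the dummy regression with the within slope.

```python
def solve(a, b):
    """Solve a z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * yi for r, yi in zip(rows, y)) for p in range(k)]
    return solve(xtx, xty)


# Hypothetical panel: individual -> list of (x_it, y_it) pairs.
panel = {
    "A": [(1.0, 3.1), (2.0, 4.9), (3.0, 7.2)],
    "B": [(2.0, 1.0), (4.0, 5.2), (5.0, 6.8)],
    "C": [(0.5, 9.0), (1.5, 11.1), (3.5, 14.8)],
}
ids = sorted(panel)

# Dummy-variable regression: columns are x followed by one dummy per individual
# (no separate intercept, to avoid perfect collinearity).
rows, y = [], []
for i in ids:
    for x_it, y_it in panel[i]:
        rows.append([x_it] + [1.0 if j == i else 0.0 for j in ids])
        y.append(y_it)
beta_lsdv = ols(rows, y)[0]

# Within estimator: OLS on data demeaned individual by individual.
sxy = sxx = 0.0
for obs in panel.values():
    x_bar = sum(x for x, _ in obs) / len(obs)
    y_bar = sum(v for _, v in obs) / len(obs)
    sxy += sum((x - x_bar) * (v - y_bar) for x, v in obs)
    sxx += sum((x - x_bar) ** 2 for x, _ in obs)
beta_within = sxy / sxx

print(beta_lsdv, beta_within)
```

The two slopes agree up to floating-point error, while the dummy regression has to solve a system whose size grows with the number of individuals, which is the computational difference the text refers to.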
Equality of Fixed Effects (FE) and First Differences (FD) estimators when T=2
For the special two-period case ($T=2$), the FE estimator and the FD estimator are numerically equivalent. To see this, write the fixed effects estimator as:
$$FE_{T=2} = \left[\sum_{i=1}^{N} (x_{i1}-\bar x_{i})(x_{i1}-\bar x_{i})' + (x_{i2}-\bar x_{i})(x_{i2}-\bar x_{i})' \right]^{-1} \left[\sum_{i=1}^{N} (x_{i1}-\bar x_{i})(y_{i1}-\bar y_{i}) + (x_{i2}-\bar x_{i})(y_{i2}-\bar y_{i}) \right]$$
Since each $(x_{i1}-\bar x_{i})$ can be rewritten as $\left(x_{i1} - \dfrac{x_{i1}+x_{i2}}{2}\right) = \dfrac{x_{i1}-x_{i2}}{2}$, and similarly for the other demeaned terms, the expression becomes:
$$FE_{T=2} = \left[\sum_{i=1}^{N} \dfrac{x_{i1}-x_{i2}}{2} \dfrac{x_{i1}-x_{i2}}{2}' + \dfrac{x_{i2}-x_{i1}}{2} \dfrac{x_{i2}-x_{i1}}{2}' \right]^{-1} \left[\sum_{i=1}^{N} \dfrac{x_{i1}-x_{i2}}{2} \dfrac{y_{i1}-y_{i2}}{2} + \dfrac{x_{i2}-x_{i1}}{2} \dfrac{y_{i2}-y_{i1}}{2} \right]$$
The two summands inside each bracket are equal, so each sum collapses to a single term in the first differences $x_{i2}-x_{i1}$ and $y_{i2}-y_{i1}$; the factors of $\tfrac{1}{2}$ then cancel between the two brackets, leaving the FD estimator.
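The equivalence is easy to confirm numerically for a single regressor. The two-period panel below is made up for illustration, and the FD regression is run without an intercept, matching the expressions in this section.

```python
# Hypothetical two-period panel: individual -> ((x_i1, y_i1), (x_i2, y_i2)).
panel = {
    "A": ((1.0, 2.0), (3.0, 7.5)),
    "B": ((2.0, 9.0), (2.5, 9.4)),
    "C": ((4.0, 1.0), (1.0, -4.0)),
    "D": ((0.5, 3.0), (2.0, 5.5)),
}


def fe_slope(panel):
    """Within estimator: OLS on data demeaned individual by individual."""
    sxy = sxx = 0.0
    for obs in panel.values():
        x_bar = sum(x for x, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        sxy += sum((x - x_bar) * (y - y_bar) for x, y in obs)
        sxx += sum((x - x_bar) ** 2 for x, _ in obs)
    return sxy / sxx


def fd_slope(panel):
    """First-difference estimator: OLS (no intercept) of delta-y on delta-x."""
    num = sum((x2 - x1) * (y2 - y1) for (x1, y1), (x2, y2) in panel.values())
    den = sum((x2 - x1) ** 2 for (x1, _), (x2, _) in panel.values())
    return num / den


beta_fe = fe_slope(panel)
beta_fd = fd_slope(panel)
print(beta_fe, beta_fd)
```

The two printed values coincide up to floating-point rounding. With more than two periods the estimators generally differ.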
Hausman–Taylor method
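The method requires more than one time-variant regressor ($X$) and time-invariant regressor ($Z$), and at least one $X$ and one $Z$ that are uncorrelated with $\alpha_i$.
Partition the $X$ and $Z$ variables such that
$$X = \left[ X_{1it}^{TN \times K_1} \;\vdots\; X_{2it}^{TN \times K_2} \right], \qquad Z = \left[ Z_{1it}^{TN \times G_1} \;\vdots\; Z_{2it}^{TN \times G_2} \right]$$
where $X_1$ and $Z_1$ are uncorrelated with $\alpha_i$. The order condition $K_1 \geq G_2$ is needed, so that there are at least as many exogenous time-variant regressors as endogenous time-invariant regressors.
Estimating $\gamma$ by instrumental variables in the equation $\hat d_i = Z_i \gamma + \varphi_i$, where $\hat d_i$ denotes the individual means of the residuals from the within regression, using $X_1$ and $Z_1$ as instruments yields a consistent estimate.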
Testing FE vs. RE
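We can test whether a fixed or random effects model is appropriate using a Hausman test.
$H_0$: $\alpha_i \perp X_{it}, Z_i$
$H_a$: $\alpha_i \not\perp X_{it}, Z_i$
If $H_0$ is true, both $\hat\beta_{RE}$ and $\hat\beta_{FE}$ are consistent, but only $\hat\beta_{RE}$ is efficient. If $H_a$ is true, $\hat\beta_{FE}$ is consistent and $\hat\beta_{RE}$ is not.
$$\hat Q = \hat\beta_{RE} - \hat\beta_{FE}, \qquad \widehat{H} = \hat Q' \left[ \operatorname{Var}(\hat\beta_{FE}) - \operatorname{Var}(\hat\beta_{RE}) \right]^{-1} \hat Q \sim \chi^2_K$$
where $K = \dim(\hat Q)$.
The Hausman test is a specification test, so a large test statistic may indicate errors in variables (EIV) or a misspecified model. If the FE assumption is true, the long-difference (LD), first-difference (FD) and fixed effects (FE) estimates should be close to one another: $\hat\beta_{LD} \approx \hat\beta_{FD} \approx \hat\beta_{FE}$.
A simple heuristic is that if $|\hat\beta_{LD}| > |\hat\beta_{FE}| > |\hat\beta_{FD}|$ there could be EIV.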
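For a single coefficient ($K = 1$) the Hausman statistic reduces to a squared difference divided by a difference of variances, which can be sketched with the standard library alone. The function name `hausman_scalar` and the input estimates below are hypothetical; in practice the estimates and their variances would come from fitted FE and RE models.

```python
import math


def hausman_scalar(b_fe, var_fe, b_re, var_re):
    """Hausman statistic and p-value for a single coefficient (K = 1)."""
    var_diff = var_fe - var_re
    if var_diff <= 0:
        # Under H0 the RE estimator is efficient, so this difference should
        # be positive; in finite samples it can fail to be.
        raise ValueError("Var(FE) - Var(RE) must be positive")
    q = b_re - b_fe
    stat = q * q / var_diff
    # With one degree of freedom, P(chi2_1 > h) = erfc(sqrt(h / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Made-up estimates for illustration only.
stat, p_value = hausman_scalar(b_fe=1.50, var_fe=0.0025, b_re=1.62, var_re=0.0016)
print(f"H = {stat:.2f}, p = {p_value:.2g}")
```

A small p-value, as with these inputs, would be read as evidence against the random effects assumption. For $K > 1$ the same quadratic form needs a matrix inverse and the $\chi^2_K$ distribution.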
Steps in Fixed Effects Model for sample data
- Calculate the group means and the grand mean
- Determine k = number of groups, n = number of observations per group, N = total number of observations (k × n)
- Calculate SS-total (total sum of squares) as: (each score − grand mean)^2, summed over all scores
- Calculate SS-treat (treatment effect) as: (each group mean − grand mean)^2, summed over groups, then multiplied by n
- Calculate SS-error (error effect) as: (each score − its group mean)^2, summed over all scores
- Calculate df-total: N − 1, df-treat: k − 1 and df-error: k(n − 1)
- Calculate the mean squares MS-treat: SS-treat/df-treat, then MS-error: SS-error/df-error
- Calculate the obtained F value: MS-treat/MS-error
- Use an F-table or probability function to look up the critical F value at the chosen significance level
- Conclude whether the treatment effect significantly affects the variable of interest
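The steps above can be sketched as follows for a balanced design. The three groups of scores are invented for illustration, and the final table lookup is left to an F-table or a statistics library.

```python
# Hypothetical scores for k = 3 groups with n = 3 observations each.
groups = {
    "g1": [1.0, 2.0, 3.0],
    "g2": [2.0, 3.0, 4.0],
    "g3": [5.0, 6.0, 7.0],
}

k = len(groups)
n = len(next(iter(groups.values())))
if any(len(scores) != n for scores in groups.values()):
    raise ValueError("this sketch assumes equal group sizes")
N = k * n

# Group means and grand mean.
group_means = {g: sum(scores) / n for g, scores in groups.items()}
grand_mean = sum(sum(scores) for scores in groups.values()) / N

# Sums of squares.
ss_total = sum((s - grand_mean) ** 2 for scores in groups.values() for s in scores)
ss_treat = n * sum((m - grand_mean) ** 2 for m in group_means.values())
ss_error = sum(
    (s - group_means[g]) ** 2 for g, scores in groups.items() for s in scores
)

# Degrees of freedom, mean squares and the F value.
df_total, df_treat, df_error = N - 1, k - 1, k * (n - 1)
ms_treat = ss_treat / df_treat
ms_error = ss_error / df_error
f_value = ms_treat / ms_error

print(f"F({df_treat}, {df_error}) = {f_value:.2f}")
```

With these made-up scores the sums of squares are 26 (treatment) and 6 (error), giving F = 13 on (2, 6) degrees of freedom, which exceeds the 5% critical value of about 5.14 from a standard F-table.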
Notes
- Peter J. Diggle, Patrick Heagerty, Kung-Yee Liang, and Scott L. Zeger (2002). Analysis of Longitudinal Data. 2nd ed., Oxford University Press, pp. 169–171.
- Garrett M. Fitzmaurice, Nan M. Laird, and James H. Ware (2004). Applied Longitudinal Analysis. John Wiley & Sons, Inc., pp. 326–328.
- Nan M. Laird and James H. Ware (1982). "Random-Effects Models for Longitudinal Data," Biometrics, 38 (4), 963–974.
References
- Christensen, Ronald (2002). Plane Answers to Complex Questions: The Theory of Linear Models (Third ed.). New York: Springer. ISBN 0-387-95361-2.
In the two-period case ($T=2$), the fixed effects expression simplifies step by step to the first-difference estimator:
$$\begin{aligned} FE_{T=2} &= \left[\sum_{i=1}^{N} 2 \dfrac{x_{i2}-x_{i1}}{2} \dfrac{x_{i2}-x_{i1}}{2}' \right]^{-1} \left[\sum_{i=1}^{N} 2 \dfrac{x_{i2}-x_{i1}}{2} \dfrac{y_{i2}-y_{i1}}{2} \right] \\ &= 2\left[\sum_{i=1}^{N} (x_{i2}-x_{i1})(x_{i2}-x_{i1})' \right]^{-1} \left[\sum_{i=1}^{N} \frac{1}{2} (x_{i2}-x_{i1})(y_{i2}-y_{i1}) \right] \\ &= \left[\sum_{i=1}^{N} (x_{i2}-x_{i1})(x_{i2}-x_{i1})' \right]^{-1} \sum_{i=1}^{N} (x_{i2}-x_{i1})(y_{i2}-y_{i1}) = FD_{T=2} \end{aligned}$$