Specification (regression)

In regression analysis, specification is the process of developing a regression model. This process consists of selecting an appropriate functional form for the model and choosing which variables to include. For instance, one may specify the functional relationship ${\displaystyle y=f(s,x)}$ between personal income ${\displaystyle y}$ and human capital, measured by schooling ${\displaystyle s}$ and on-the-job experience ${\displaystyle x}$, as:[1]

${\displaystyle \ln y=\ln y_{0}+\rho s+\beta _{1}x+\beta _{2}x^{2}+\varepsilon }$

where ${\displaystyle \varepsilon }$ is the unexplained error term, which is assumed to be independent and identically distributed. If the assumptions of the regression model are correct, the least squares estimates of the parameters ${\displaystyle \rho }$, ${\displaystyle \beta _{1}}$ and ${\displaystyle \beta _{2}}$ will be efficient and unbiased. Specification diagnostics therefore usually involve testing the first to fourth moments of the residuals.[2]
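As an illustration, the earnings equation above can be fitted by ordinary least squares on simulated data, after which the first four sample moments of the residuals can be inspected. This is only a sketch: the parameter values, sample size, variable ranges and the use of NumPy are assumptions made for the example and do not come from the cited sources.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical "true" parameter values, chosen only for illustration.
ln_y0, rho, beta1, beta2 = 1.5, 0.10, 0.04, -0.0007

s = rng.integers(8, 21, size=n).astype(float)  # years of schooling (8 to 20)
x = rng.uniform(0.0, 40.0, size=n)             # years of experience
eps = rng.normal(0.0, 0.3, size=n)             # i.i.d. error term

# Simulated log income following the specified functional form.
ln_y = ln_y0 + rho * s + beta1 * x + beta2 * x**2 + eps

# Design matrix with columns: intercept, s, x, x^2.
X = np.column_stack([np.ones(n), s, x, x**2])
coef, *_ = np.linalg.lstsq(X, ln_y, rcond=None)
resid = ln_y - X @ coef

# First to fourth sample moments of the residuals.
z = (resid - resid.mean()) / resid.std()
moments = {
    "mean": float(resid.mean()),
    "variance": float(resid.var()),
    "skewness": float(np.mean(z**3)),
    "kurtosis": float(np.mean(z**4)),
}

print("estimates (ln y0, rho, beta1, beta2):", coef)
print("residual moments:", moments)
```

Because the simulated model is correctly specified, the estimates should lie close to the assumed values, and the residuals should show a mean near zero, little skewness and a kurtosis near 3, as expected for normally distributed errors.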

Specification error and bias

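Specification error occurs when the functional form or the choice of independent variables misrepresents the process that generated the data. In particular, the estimates are biased if an independent variable is correlated with the error term. There are several different causes of specification error: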

• an incorrect functional form could be employed;
• a variable omitted from the model may have a relationship with both the dependent variable and one or more of the independent variables (omitted-variable bias);[3]
• an irrelevant variable may be included in the model;
• the dependent variable may be part of a system of simultaneous equations (simultaneity bias);
• measurement errors may affect the independent variables.
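Omitted-variable bias, one of the causes listed above, can be illustrated with a small simulation. The sketch below assumes a hypothetical unobserved variable, labelled `ability`, that raises both schooling and log income; all coefficients and the helper function `ols` are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000

# Hypothetical omitted variable that affects both schooling and income.
ability = rng.normal(size=n)
s = 12.0 + 2.0 * ability + rng.normal(size=n)  # schooling depends on ability
ln_y = 1.0 + 0.10 * s + 0.20 * ability + rng.normal(0.0, 0.3, size=n)


def ols(columns, y):
    """Least squares coefficients for y on an intercept plus the given columns."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Full model: schooling and ability are both included.
rho_full = ols([s, ability], ln_y)[1]

# Misspecified model: ability is omitted and absorbed into the error term.
rho_short = ols([s], ln_y)[1]

print("schooling coefficient, ability included:", rho_full)
print("schooling coefficient, ability omitted: ", rho_short)
```

Under these assumed values, the full model should recover a schooling coefficient of about 0.10. The short model should give roughly 0.18, since the expected bias is the coefficient on the omitted variable multiplied by the slope from regressing that variable on schooling, here 0.20 × 2/5 = 0.08.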

Detection

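The Ramsey RESET test can help detect specification error. It tests whether nonlinear combinations of the fitted values, typically their powers, help explain the dependent variable; if they do, the model is likely to be misspecified.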
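A minimal sketch of the idea behind the RESET test is given below, using the common variant that adds the squared and cubed fitted values to the regression and compares the two fits with an F-statistic. The function names `ols_rss` and `reset_f` and the simulated data are assumptions made for the example. Only the statistic is computed; a p-value would additionally require the F distribution with the stated degrees of freedom.

```python
import numpy as np


def ols_rss(X, y):
    """Return least squares coefficients and the residual sum of squares."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return coef, float(resid @ resid)


def reset_f(X, y, max_power=3):
    """F-statistic for adding powers 2..max_power of the fitted values to X.

    X must already contain an intercept column.
    """
    n, k = X.shape
    coef, rss_restricted = ols_rss(X, y)
    fitted = X @ coef
    extra = np.column_stack([fitted**p for p in range(2, max_power + 1)])
    _, rss_unrestricted = ols_rss(np.column_stack([X, extra]), y)
    q = extra.shape[1]   # number of added regressors
    df = n - k - q       # residual degrees of freedom of the augmented model
    return ((rss_restricted - rss_unrestricted) / q) / (rss_unrestricted / df)


# Simulated data with a truly quadratic relationship (illustrative values).
rng = np.random.default_rng(2)
n = 500
x = rng.uniform(0.0, 4.0, size=n)
y = 1.0 + 2.0 * x + 1.5 * x**2 + rng.normal(size=n)

X_linear = np.column_stack([np.ones(n), x])           # omits the x^2 term
X_quadratic = np.column_stack([np.ones(n), x, x**2])  # correct functional form

f_linear = reset_f(X_linear, y)
f_quadratic = reset_f(X_quadratic, y)

print("RESET F-statistic, linear model:   ", f_linear)
print("RESET F-statistic, quadratic model:", f_quadratic)
```

With these simulated data, the statistic for the linear model should be far larger than that for the quadratic model, which points to the omitted squared term. A large statistic signals misspecification in general but does not by itself identify which of the possible causes is responsible.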

References

1. ^ This particular example is known as the Mincer earnings function.
2. ^ Long, J. Scott; Trivedi, Pravin K. (1993). "Some Specification Tests for the Linear Regression Model". In Bollen, Kenneth A.; Long, J. Scott. Testing Structural Equation Models. London: Sage. pp. 66–110. ISBN 0-8039-4506-X.
3. ^ Untitled