The Tobit model is a statistical model proposed by James Tobin (1958) to describe the relationship between a non-negative dependent variable $y_i$ and an independent variable (or vector) $x_i$. The term Tobit was derived from Tobin's name by truncating it and adding -it, by analogy with the probit model.
The model supposes that there is a latent (i.e. unobservable) variable $y_i^*$. This variable depends linearly on $x_i$ via a parameter (vector) $\beta$, which determines the relationship between the independent variable (or vector) $x_i$ and the latent variable $y_i^*$ (just as in a linear model). In addition, there is a normally distributed error term $u_i$ to capture random influences on this relationship. The observable variable $y_i$ is defined to be equal to the latent variable whenever the latent variable is above zero, and zero otherwise:
$$y_i = \begin{cases} y_i^* & \text{if } y_i^* > 0 \\ 0 & \text{if } y_i^* \le 0 \end{cases}$$
where $y_i^*$ is a latent variable:
$$y_i^* = \beta x_i + u_i, \qquad u_i \sim N(0, \sigma^2)$$
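The definition above can be made concrete by simulating data from it. The following is a minimal sketch in Python, not part of Tobin's exposition; the function name `simulate_tobit`, the standard normal regressor, and the separate intercept `beta0` (equivalent to including a constant in $x_i$) are assumptions made for illustration.

```python
import random

def simulate_tobit(n, beta0, beta1, sigma, seed=0):
    """Draw n triples (x_i, y_i*, y_i) from a type I Tobit model (illustrative)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)            # regressor; standard normal is an assumption
        u = rng.gauss(0.0, sigma)          # normally distributed error term u_i
        y_star = beta0 + beta1 * x + u     # latent variable y_i*
        y = y_star if y_star > 0 else 0.0  # observed variable, censored at zero
        data.append((x, y_star, y))
    return data

sample = simulate_tobit(n=10_000, beta0=0.0, beta1=1.0, sigma=1.0)
share_censored = sum(1 for _, _, y in sample if y == 0.0) / len(sample)
print(f"share of censored observations: {share_censored:.3f}")
```

With these parameter values the latent variable is symmetric around zero, so roughly half of the observations are recorded as zero.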
When asked why it was called the "Tobit" model instead of Tobin, James Tobin explained that the term was introduced by Arthur Goldberger, either as a contraction of "Tobin probit" or as a reference to The Caine Mutiny, a novel by Tobin's friend Herman Wouk in which Tobin makes a cameo as "Mr Tobit". Tobin reports having asked Goldberger which it was, and Goldberger refused to say.
If the relationship parameter $\beta$ is estimated by regressing the observed $y_i$ on $x_i$, the resulting ordinary least squares regression estimator is inconsistent. It will yield a downward-biased estimate of the slope coefficient and an upward-biased estimate of the intercept. Takeshi Amemiya (1973) has proven that the maximum likelihood estimator suggested by Tobin for this model is consistent.
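The direction of the least squares bias can be seen in a small simulation. The sketch below is illustrative only: the helper `ols`, the parameter values, and the standard normal regressor are assumptions, and a single simulated sample demonstrates the bias rather than proving inconsistency.

```python
import random

def ols(xs, ys):
    """Return (intercept, slope) of a simple least squares fit of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

rng = random.Random(1)
true_intercept, true_slope, sigma = 0.0, 1.0, 1.0   # assumed "true" parameters
xs = [rng.gauss(0.0, 1.0) for _ in range(20_000)]
y_star = [true_intercept + true_slope * x + rng.gauss(0.0, sigma) for x in xs]
y_obs = [max(0.0, v) for v in y_star]               # censoring at zero

# Regression on the (normally unobservable) latent variable recovers the truth;
# regression on the censored variable does not.
intercept_latent, slope_latent = ols(xs, y_star)
intercept_obs, slope_obs = ols(xs, y_obs)
print(f"latent:   intercept={intercept_latent:.3f}, slope={slope_latent:.3f}")
print(f"censored: intercept={intercept_obs:.3f}, slope={slope_obs:.3f}")
```

In this design the slope fitted to the censored data falls well below the true value of 1 and the fitted intercept lies above the true value of 0.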
The $\beta$ coefficient should not be interpreted as the effect of $x_i$ on $y_i$, as one would with a linear regression model; this is a common error. Instead, it should be interpreted as the combination of (1) the change in $y_i$ of those above the limit, weighted by the probability of being above the limit; and (2) the change in the probability of being above the limit, weighted by the expected value of $y_i$ if above.
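This two-part decomposition, due to McDonald and Moffitt (1980), can be checked numerically for censoring at zero. The sketch below assumes a scalar regressor without intercept and uses the standard truncated-normal formulas; the function name `mcdonald_moffitt` is hypothetical. The two parts sum to $\beta \, \Phi(\beta x / \sigma)$, which is smaller in magnitude than $\beta$ itself.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution

def mcdonald_moffitt(x, beta, sigma):
    """Split dE[y|x]/dx into the two parts described in the text (illustrative)."""
    z = beta * x / sigma
    prob_above = STD_NORMAL.cdf(z)                # P(y > 0 | x)
    dens = STD_NORMAL.pdf(z)
    lam = dens / prob_above                       # inverse Mills ratio
    mean_above = beta * x + sigma * lam           # E[y | y > 0, x]
    d_mean_above = beta * (1 - z * lam - lam**2)  # d E[y | y > 0, x] / dx
    d_prob_above = beta * dens / sigma            # d P(y > 0 | x) / dx
    part_above = prob_above * d_mean_above        # (1) change above the limit
    part_prob = mean_above * d_prob_above         # (2) change in the probability
    return part_above, part_prob, beta * prob_above

part_above, part_prob, total = mcdonald_moffitt(x=0.5, beta=1.0, sigma=1.0)
print(part_above, part_prob, total)
```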
Variations of the Tobit model
Variations of the Tobit model can be produced by changing where and when censoring occurs. Amemiya (1985, p. 384) classifies these variations into five categories (Tobit type I through Tobit type V), where Tobit type I stands for the first model described above. Schnedler (2005) provides a general formula to obtain consistent likelihood estimators for these and other variations of the Tobit model.
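The Tobit model is a special case of a censored regression model, because the latent variable $y_i^*$ cannot always be observed while the independent variable $x_i$ is observable. A common variation of the Tobit model is censoring at a value $y_L$ different from zero:
$$y_i = \begin{cases} y_i^* & \text{if } y_i^* > y_L \\ y_L & \text{if } y_i^* \le y_L \end{cases}$$
Another example is censoring of values above $y_U$:
$$y_i = \begin{cases} y_i^* & \text{if } y_i^* < y_U \\ y_U & \text{if } y_i^* \ge y_U \end{cases}$$
Yet another model results when $y_i$ is censored from above and below at the same time:
$$y_i = \begin{cases} y_L & \text{if } y_i^* \le y_L \\ y_i^* & \text{if } y_L < y_i^* < y_U \\ y_U & \text{if } y_i^* \ge y_U \end{cases}$$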
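The three censoring variants differ only in which limits are active, so a single function can express all of them. This is a minimal sketch; the function name `censor` and its `lower` and `upper` parameters are illustrative names for the lower and upper censoring limits.

```python
import math

def censor(y_star, lower=-math.inf, upper=math.inf):
    """Map a latent value y* to its observed value under one- or two-sided censoring."""
    if lower > upper:
        raise ValueError("lower limit must not exceed upper limit")
    return min(max(y_star, lower), upper)

latent = [-1.0, 2.5, 7.0]
from_below = [censor(v, lower=0.0) for v in latent]            # standard Tobit
from_above = [censor(v, upper=5.0) for v in latent]            # censoring from above
two_sided = [censor(v, lower=0.0, upper=5.0) for v in latent]  # both limits
print(from_below, from_above, two_sided)
```

Leaving both limits at their infinite defaults returns the latent value unchanged, which corresponds to an ordinary uncensored linear model.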
The rest of the models will be presented as being bounded from below at 0, though this can be generalized as we have done for Type I.
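Type II Tobit models introduce a second latent variable, $y_{2i}^*$, which is observed only when the first latent variable $y_{1i}^*$ is positive:
$$y_{2i} = \begin{cases} y_{2i}^* & \text{if } y_{1i}^* > 0 \\ 0 & \text{if } y_{1i}^* \le 0 \end{cases}$$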
Heckman (1987) falls into the Type II Tobit category. In Type I Tobit, the latent variable absorbs both the process of participation and the 'outcome' of interest. Type II Tobit allows the process of participation/selection and the process of 'outcome' to be independent, conditional on $x$.
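The separation between participation and outcome can be illustrated with a simulation in which one latent variable decides whether the other is observed. The sketch below is an assumption-laden illustration, not a reproduction of any published specification: it omits regressors, fixes the outcome mean at 1, and introduces a hypothetical parameter `rho` for the correlation between the two error terms.

```python
import math
import random

def simulate_type2(n, rho, seed=0):
    """Simulate a Type II Tobit without regressors (illustrative assumptions)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        e1 = rng.gauss(0.0, 1.0)
        e2 = rho * e1 + math.sqrt(1.0 - rho**2) * rng.gauss(0.0, 1.0)
        y1_star = e1                       # participation: only its sign is observed
        y2_star = 1.0 + e2                 # latent outcome with assumed mean 1
        selected = y1_star > 0
        y2 = y2_star if selected else 0.0  # outcome observed only if selected
        rows.append((selected, y2_star, y2))
    return rows

def selection_gap(rows):
    """Mean observed outcome among the selected minus mean latent outcome overall."""
    observed = [y2 for selected, _, y2 in rows if selected]
    latent = [y2_star for _, y2_star, _ in rows]
    return sum(observed) / len(observed) - sum(latent) / len(latent)

gap_independent = selection_gap(simulate_type2(20_000, rho=0.0))
gap_correlated = selection_gap(simulate_type2(20_000, rho=0.8))
print(f"gap with rho=0.0: {gap_independent:.3f}")
print(f"gap with rho=0.8: {gap_correlated:.3f}")
```

When the two processes are independent, the selected observations are representative of the latent outcome; when the errors are correlated, the mean of the observed outcomes is shifted away from the latent mean.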
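Type III introduces a second observed dependent variable:
$$y_{1i} = \begin{cases} y_{1i}^* & \text{if } y_{1i}^* > 0 \\ 0 & \text{if } y_{1i}^* \le 0 \end{cases}$$
$$y_{2i} = \begin{cases} y_{2i}^* & \text{if } y_{1i}^* > 0 \\ 0 & \text{if } y_{1i}^* \le 0 \end{cases}$$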
The Heckman model falls into this type.
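Type IV introduces a third observed dependent variable and a third latent variable:
$$y_{1i} = \begin{cases} y_{1i}^* & \text{if } y_{1i}^* > 0 \\ 0 & \text{if } y_{1i}^* \le 0 \end{cases}$$
$$y_{2i} = \begin{cases} y_{2i}^* & \text{if } y_{1i}^* > 0 \\ 0 & \text{if } y_{1i}^* \le 0 \end{cases}$$
$$y_{3i} = \begin{cases} y_{3i}^* & \text{if } y_{1i}^* \le 0 \\ 0 & \text{if } y_{1i}^* > 0 \end{cases}$$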
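Similar to Type II, in Type V we only observe the sign of $y_{1i}^*$:
$$y_{2i} = \begin{cases} y_{2i}^* & \text{if } y_{1i}^* > 0 \\ 0 & \text{if } y_{1i}^* \le 0 \end{cases}$$
$$y_{3i} = \begin{cases} y_{3i}^* & \text{if } y_{1i}^* \le 0 \\ 0 & \text{if } y_{1i}^* > 0 \end{cases}$$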
The likelihood function
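Below are the likelihood and log-likelihood functions for a type I Tobit. This is a Tobit that is censored from below at $y_L$ when the latent variable $y_i^* \le y_L$. In writing out the likelihood function, we first define an indicator function $I(y_i)$ where:
$$I(y_i) = \begin{cases} 0 & \text{if } y_i \le y_L \\ 1 & \text{if } y_i > y_L \end{cases}$$
Next, let $\Phi$ be the standard normal cumulative distribution function and $\varphi$ the standard normal probability density function. For a data set with $N$ observations, the likelihood function for a type I Tobit is
$$\mathcal{L}(\beta, \sigma) = \prod_{i=1}^{N} \left( \frac{1}{\sigma} \varphi\left( \frac{y_i - \beta x_i}{\sigma} \right) \right)^{I(y_i)} \left( 1 - \Phi\left( \frac{\beta x_i - y_L}{\sigma} \right) \right)^{1 - I(y_i)}$$
and the log-likelihood is given by
$$\log \mathcal{L}(\beta, \sigma) = \sum_{i=1}^{N} \left[ I(y_i) \log\left( \frac{1}{\sigma} \varphi\left( \frac{y_i - \beta x_i}{\sigma} \right) \right) + \left( 1 - I(y_i) \right) \log\left( 1 - \Phi\left( \frac{\beta x_i - y_L}{\sigma} \right) \right) \right]$$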
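The log-likelihood can be evaluated directly and maximized numerically. The following is a rough sketch assuming a scalar regressor without intercept; the function names are hypothetical, and the coarse grid search stands in for the numerical optimizers that practical implementations would use. The censored term is computed as $\log \Phi((y_L - \beta x_i)/\sigma)$, which equals $\log(1 - \Phi((\beta x_i - y_L)/\sigma))$ but is numerically more stable.

```python
import math
import random

def log_norm_cdf(z):
    """log of the standard normal CDF, computed via erfc for stability."""
    return math.log(0.5 * math.erfc(-z / math.sqrt(2.0)))

def tobit_loglik(beta, sigma, xs, ys, y_lower=0.0):
    """Type I Tobit log-likelihood for a scalar regressor (illustrative)."""
    total = 0.0
    for x, y in zip(xs, ys):
        if y > y_lower:   # uncensored observation, I(y) = 1
            z = (y - beta * x) / sigma
            total += -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * z * z
        else:             # censored observation, I(y) = 0
            total += log_norm_cdf((y_lower - beta * x) / sigma)
    return total

# Simulated data with assumed true values beta = 1 and sigma = 1.
rng = random.Random(2)
xs = [rng.gauss(0.0, 1.0) for _ in range(2_000)]
ys = [max(0.0, 1.0 * x + rng.gauss(0.0, 1.0)) for x in xs]

# Coarse grid search over (beta, sigma) in steps of 0.05.
grid = [0.5 + 0.05 * k for k in range(21)]
beta_hat, sigma_hat = max(
    ((b, s) for b in grid for s in grid),
    key=lambda p: tobit_loglik(p[0], p[1], xs, ys),
)
print(f"beta_hat={beta_hat:.2f}, sigma_hat={sigma_hat:.2f}")
```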
If the underlying latent variable $y_i^*$ is not normally distributed, one must use quantiles instead of moments to analyze the observable variable $y_i$. Powell's censored least absolute deviations (CLAD) estimator offers one way to achieve this.
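The idea behind CLAD can be sketched for censoring at zero: the estimator minimizes the sum of absolute deviations between $y_i$ and $\max(0, \beta x_i)$, which requires only that the error term have median zero. The example below is a simplified illustration with a scalar regressor, Laplace-distributed errors, and a grid search over candidate values; these choices and the function name `clad_objective` are assumptions, and Powell (1984) should be consulted for the actual estimator and its properties.

```python
import random

def clad_objective(beta, xs, ys):
    """Sum of absolute deviations between y and the censored fit max(0, beta * x)."""
    return sum(abs(y - max(0.0, beta * x)) for x, y in zip(xs, ys))

rng = random.Random(3)
xs = [rng.uniform(-1.0, 2.0) for _ in range(2_000)]
# Laplace errors (non-normal, median zero) as a difference of two exponentials.
us = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in xs]
ys = [max(0.0, 1.0 * x + u) for x, u in zip(xs, us)]  # assumed true beta = 1

# Grid search over candidate slopes between 0 and 3.
candidates = [k / 100 for k in range(301)]
beta_clad = min(candidates, key=lambda b: clad_objective(b, xs, ys))
print(f"CLAD estimate of beta: {beta_clad:.2f}")
```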
See also
- Generalized Tobit
- Limited dependent variable
- Rectifier (neural networks)
- Truncated regression model
- Probit model (the name Tobit is a pun on both Tobin, the model's creator, and the model's similarity to probit models)
References
- Tobin, James (1958). "Estimation of relationships for limited dependent variables". Econometrica 26 (1): 24–36. doi:10.2307/1907382. JSTOR 1907382.
- International Encyclopedia of the Social Sciences (2008)
- The ET Interview: Professor James Tobin
- Amemiya, Takeshi (1973). "Regression analysis when the dependent variable is truncated normal". Econometrica 41 (6): 997–1016. doi:10.2307/1914031. JSTOR 1914031.
- McDonald, John F.; Moffitt, Robert A. (1980). "The Uses of Tobit Analysis". The Review of Economics and Statistics (The MIT Press) 62 (2): 318–321. doi:10.2307/1924766. JSTOR 1924766.
- Schnedler, Wendelin (2005). "Likelihood estimation for censored random vectors". Econometric Reviews 24 (2): 195–217. doi:10.1081/ETC-200067925.
- Powell, James L (1 July 1984). "Least absolute deviations estimation for the censored regression model". Journal of Econometrics 25 (3): 303–325. doi:10.1016/0304-4076(84)90004-6.
- Amemiya, Takeshi (1984). "Tobit models: A survey". Journal of Econometrics 24 (1–2): 3–61. doi:10.1016/0304-4076(84)90074-5.
- Amemiya, Takeshi (1985). "Tobit Models". Advanced Econometrics. Oxford: Basil Blackwell. pp. 360–411. ISBN 0-631-13345-3.
- Gouriéroux, Christian (2000). "The Tobit Model". Econometrics of Qualitative Dependent Variables. New York: Cambridge University Press. pp. 170–207. ISBN 0-521-58985-1.