Zero-inflated model

In statistics, a zero-inflated model is a statistical model based on a zero-inflated probability distribution, i.e. a distribution that allows for frequent zero-valued observations.

The zero-inflated Poisson (ZIP) model describes counts of a random event per unit of time that contain more zeros than a Poisson distribution would produce.[1] For example, the number of insurance claims within a population for a certain type of risk is zero-inflated by those people who have not taken out insurance against the risk and are therefore unable to claim. The ZIP model combines two components that correspond to two zero-generating processes. The first process is governed by a binary distribution that generates structural zeros. The second process is governed by a Poisson distribution that generates counts, some of which may be zero. The resulting probabilities are:

 \Pr (y_i = 0) = \pi + (1 - \pi) e^{-\lambda}
 \Pr (y_i = h) = (1 - \pi) \frac{\lambda^{h} e^{-\lambda}} {h!},\qquad h \ge 1

where the outcome variable y_i for the ith individual takes any non-negative integer value, \lambda is the expected count of the Poisson component, and \pi is the probability of an extra (structural) zero.
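
The two probabilities above can be sketched in Python as follows. This is a minimal illustration using only the standard library; the names zip_pmf and zip_sample are hypothetical, lam and pi stand for \lambda and \pi, and the Poisson draw uses Knuth's multiplication method, which is an assumption of this sketch and is only practical for small \lambda.

```python
import math
import random


def zip_pmf(k, lam, pi):
    """P(Y = k) for a zero-inflated Poisson with Poisson mean lam
    and extra-zero probability pi (k is a non-negative integer)."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # Structural zeros add probability mass pi at zero only.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


def zip_sample(lam, pi, rng=random):
    """Draw one value by following the two generating processes."""
    if rng.random() < pi:
        return 0  # structural zero from the binary process
    # Poisson draw via Knuth's multiplication method (assumed here
    # for simplicity; slow for large lam).
    threshold = math.exp(-lam)
    k = 0
    product = rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k
```

A zero can come from either branch of zip_sample, which is why the probability of zero in zip_pmf has two terms.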

The mean is (1-\pi) \lambda and the variance is \lambda (1-\pi) (1+\lambda \pi).
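
One way to obtain these moments, sketched here under the assumption that Y denotes a single ZIP-distributed count: a structural zero contributes nothing to either moment, and a Poisson variable X with mean \lambda has E[X] = \lambda and E[X^2] = \lambda + \lambda^2.

```latex
\begin{align*}
E[Y]   &= (1-\pi)\,\lambda \\
E[Y^2] &= (1-\pi)\,(\lambda + \lambda^2) \\
\operatorname{Var}(Y)
       &= E[Y^2] - E[Y]^2
        = (1-\pi)(\lambda + \lambda^2) - (1-\pi)^2 \lambda^2 \\
       &= (1-\pi)\,\lambda\,\bigl[\,1 + \lambda - (1-\pi)\lambda\,\bigr]
        = \lambda\,(1-\pi)\,(1 + \pi\lambda)
\end{align*}
```

The ratio of variance to mean is 1 + \pi\lambda, so the distribution is overdispersed relative to the Poisson whenever \pi > 0.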

Estimators

The method of moments estimators are given by

 \hat{\lambda}_{mo} = \frac{s^2+m^2-m}{m},

 \hat{\pi}_{mo} = \frac{s^2 - m}{s^2 + m^2 - m},

where m is the sample mean and s^2 is the sample variance.
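
The two formulas above can be evaluated directly from a sample, as in the following sketch. The function name zip_method_of_moments is hypothetical, and the sample variance is assumed to use the n - 1 denominator, which the text does not specify.

```python
import statistics


def zip_method_of_moments(counts):
    """Return (lambda_hat, pi_hat) from the sample mean and variance."""
    m = statistics.fmean(counts)
    # Assumption: unbiased sample variance (n - 1 denominator).
    s2 = statistics.variance(counts)
    denom = s2 + m * m - m
    if m <= 0 or denom <= 0:
        raise ValueError("moment estimators are undefined for this sample")
    lam_hat = denom / m
    pi_hat = (s2 - m) / denom
    return lam_hat, pi_hat
```

When the sample variance is smaller than the sample mean, pi_hat comes out negative, which signals that the data show no excess zeros relative to a Poisson distribution.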

The maximum likelihood estimator[2] of \lambda can be found by solving the following equation

 \bar{x}(1- e^{-\hat{\lambda}_{ml}}) = \hat{\lambda}_{ml} \left( 1 - \frac{n_0}{n} \right),

where \bar{x} is the sample mean (denoted m above) and \frac{n_0}{n} is the observed proportion of zeros.

This equation can be solved by iteration,[3] and the maximum likelihood estimator for \pi is then given by

 \hat{\pi}_{ml} = 1 - \frac{\bar{x}}{\hat{\lambda}_{ml}}.
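
As an illustration of solving the likelihood equation by iteration, the sketch below rewrites it as \lambda = c (1 - e^{-\lambda}), where c = \bar{x} / (1 - n_0/n) is the mean of the positive counts, and applies fixed-point iteration. This scheme is one possible choice and is not necessarily the method used in the cited references; the function name zip_mle and the tolerance settings are hypothetical.

```python
import math


def zip_mle(counts, tol=1e-12, max_iter=10_000):
    """Return (lambda_hat, pi_hat) by fixed-point iteration on
    lambda = c * (1 - exp(-lambda)), with c the mean of the positive counts."""
    n = len(counts)
    n0 = sum(1 for y in counts if y == 0)
    if n0 == n:
        raise ValueError("all counts are zero; lambda is not identifiable")
    xbar = sum(counts) / n
    c = xbar / (1.0 - n0 / n)  # mean of the positive counts
    if c <= 1.0:
        # Every positive count equals 1: the equation has no positive root.
        raise ValueError("no positive solution for lambda")
    lam = c  # start above the root; the iterates decrease towards it
    for _ in range(max_iter):
        new_lam = c * (1.0 - math.exp(-lam))
        converged = abs(new_lam - lam) < tol
        lam = new_lam
        if converged:
            break
    pi_hat = 1.0 - xbar / lam
    return lam, pi_hat
```

As with the moment estimator, pi_hat can be negative when the sample contains fewer zeros than a Poisson distribution with mean lambda_hat would predict.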

Related properties

In 1994, Greene considered the zero-inflated negative binomial (ZINB) model.[4] In 2000, Daniel B. Hall adapted Lambert's methodology to counts with an upper bound, thereby obtaining a zero-inflated binomial (ZIB) model.[5]

See also

References

  1. Lambert, Diane (1992). "Zero-Inflated Poisson Regression, with an Application to Defects in Manufacturing". Technometrics 34 (1): 1–14. JSTOR 1269547.
  2. Johnson, Norman L.; Kotz, Samuel; Kemp, Adrienne W. (1992). Univariate Discrete Distributions (2nd ed.). Wiley. pp. 312–314. ISBN 0-471-54897-9.
  3. Böhning, Dankmar; Dietz, Ekkehart; Schlattmann, Peter; Mendonca, Lisette; Kirchner, Ursula (1999). "The zero-inflated Poisson model and the decayed, missing and filled teeth index in dental epidemiology". Journal of the Royal Statistical Society: Series A (Statistics in Society) 162 (2): 195–209. doi:10.1111/1467-985x.00130.
  4. Greene, William H. (1994). "Some Accounting for Excess Zeros and Sample Selection in Poisson and Negative Binomial Regression Models". Working Paper EC-94-10: Department of Economics, New York University.
  5. Hall, Daniel B. (2000). "Zero-Inflated Poisson and Binomial Regression with Random Effects: A Case Study". Biometrics 56 (4): 1030–1039. doi:10.1111/j.0006-341X.2000.01030.x.
