= Prais–Winsten estimation =

In econometrics, Prais–Winsten estimation is a procedure for handling AR(1) serial correlation in the errors of a linear model. Conceived by Sigbert Prais and Christopher Winsten in 1954, it modifies Cochrane–Orcutt estimation so that the first observation is not lost, which improves efficiency and makes it a special case of feasible generalized least squares.

==Theory==
Consider the model

$y_t = \alpha + X_t \beta+\varepsilon_t,\,$

where $y_{t}$ is the time series of interest at time $t$, $\beta$ is a vector of coefficients, $X_{t}$ is a row vector of explanatory variables observed at time $t$, and $\varepsilon_t$ is the error term. The error term can be serially correlated over time: $\varepsilon_t =\rho \varepsilon_{t-1}+e_t$ with $|\rho| <1$, where $e_t$ is white noise. In addition to the Cochrane–Orcutt transformation, which is

$y_t - \rho y_{t-1} = \alpha(1-\rho)+(X_t - \rho X_{t-1})\beta + e_t, \,$

for $t = 2,3,\dots,T$, the Prais–Winsten procedure also transforms the first observation, $t = 1$, as follows:

$\sqrt{1-\rho^2}y_1 = \alpha\sqrt{1-\rho^2}+\left(\sqrt{1-\rho^2}X_1\right)\beta + \sqrt{1-\rho^2}\varepsilon_1. \,$

Ordinary least squares is then applied to the transformed observations $t = 1,2,\dots,T$.
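The transformation can be sketched in a few lines of Python with NumPy, assuming $\rho$ is known. The function names `prais_winsten_transform` and `prais_winsten_known_rho` are illustrative and not taken from any library, and the regressors are assumed to be supplied without a constant column.

```python
import numpy as np

def prais_winsten_transform(y, X, rho):
    """Transform (y, X) for a known rho (hypothetical helper).

    y: (T,) array; X: (T,) or (T, k) array of regressors without a constant.
    Returns the transformed response and the transformed design matrix,
    whose first column is the transformed constant.
    """
    y = np.asarray(y, dtype=float)
    T = y.shape[0]
    Z = np.column_stack([np.ones(T), np.asarray(X, dtype=float)])
    scale = np.sqrt(1.0 - rho**2)

    y_star = np.empty_like(y)
    Z_star = np.empty_like(Z)
    # t = 1: rescale by sqrt(1 - rho^2) instead of dropping the observation.
    y_star[0] = scale * y[0]
    Z_star[0] = scale * Z[0]
    # t = 2, ..., T: Cochrane-Orcutt quasi-differences.
    y_star[1:] = y[1:] - rho * y[:-1]
    Z_star[1:] = Z[1:] - rho * Z[:-1]
    return y_star, Z_star

def prais_winsten_known_rho(y, X, rho):
    """Least squares on the transformed data; returns [alpha, beta_1, ..., beta_k]."""
    y_star, Z_star = prais_winsten_transform(y, X, rho)
    theta, *_ = np.linalg.lstsq(Z_star, y_star, rcond=None)
    return theta
```

Because the constant column is transformed together with the regressors, the coefficient on the first column is $\alpha$ itself rather than $\alpha(1-\rho)$.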

==Estimation procedure==
First notice that, because $e_t$ is uncorrelated with $\varepsilon_{t-1}$,

$\mathrm{var}(\varepsilon_t)=\mathrm{var}(\rho\varepsilon_{t-1}+e_{t})=\rho^2 \mathrm{var}(\varepsilon_{t-1}) +\mathrm{var}(e_{t})$

Since the variance of a stationary process is constant over time, $\mathrm{var}(\varepsilon_{t-1})=\mathrm{var}(\varepsilon_t)$, so

$(1-\rho^2 )\mathrm{var}(\varepsilon_t)= \mathrm{var}(e_{t})$

and thus,

$\mathrm{var}(\varepsilon_t)=\frac{\mathrm{var}(e_{t})}{(1-\rho^2 )}$

Without loss of generality, suppose the variance of the white noise is 1. To write the estimator compactly, consider the autocovariance function of the error term:

 $\mathrm{cov}(\varepsilon_t,\varepsilon_{t+h})=\rho^{|h|} \mathrm{var}(\varepsilon_t)=\frac{\rho^{|h|}}{1-\rho^2}, \text{ for } h=0,\pm 1, \pm 2, \dots \, .$

It follows that the variance–covariance matrix, $\mathbf{\Omega}$, of the error vector $(\varepsilon_1, \dots, \varepsilon_T)$ is

 $\mathbf{\Omega} = \begin{bmatrix}
\frac{1}{1-\rho^2} & \frac{\rho}{1-\rho^2} & \frac{\rho^2}{1-\rho^2} & \cdots & \frac{\rho^{T-1}}{1-\rho^2} \\[8pt]
\frac{\rho}{1-\rho^2} & \frac{1}{1-\rho^2} & \frac{\rho}{1-\rho^2} & \cdots & \frac{\rho^{T-2}}{1-\rho^2} \\[8pt]
\frac{\rho^2}{1-\rho^2} & \frac{\rho}{1-\rho^2} & \frac{1}{1-\rho^2} & \cdots & \frac{\rho^{T-3}}{1-\rho^2} \\[8pt]
\vdots & \vdots & \vdots & \ddots & \vdots \\[8pt]
\frac{\rho^{T-1}}{1-\rho^2} & \frac{\rho^{T-2}}{1-\rho^2} & \frac{\rho^{T-3}}{1-\rho^2} & \cdots & \frac{1}{1-\rho^2}
\end{bmatrix}.$

Given $\rho$ (or an estimate of it), the generalized least squares estimator is

 $\hat{\Theta}=(\mathbf{Z}^{\mathsf{T}}\mathbf{\Omega}^{-1}\mathbf{Z})^{-1}(\mathbf{Z}^{\mathsf{T}}\mathbf{\Omega}^{-1}\mathbf{Y}), \,$

where $\mathbf{Z}$ is a matrix of observations on the independent variables ($X_{t}$, $t = 1, 2, \dots, T$) including a vector of ones, $\mathbf{Y}$ is a vector stacking the observations on the dependent variable ($y_{t}$, $t = 1, 2, \dots, T$) and $\hat{\Theta}$ includes the model parameters.
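As a minimal sketch, the estimator can be computed directly from $\mathbf{\Omega}$ with NumPy, assuming $\rho$ is known and the white-noise variance is normalized to 1. The helper names `ar1_covariance` and `gls_estimate` are hypothetical. Linear systems are solved instead of forming $\mathbf{\Omega}^{-1}$ explicitly, which is a numerical convenience rather than part of the method.

```python
import numpy as np

def ar1_covariance(T, rho):
    """T x T covariance matrix of AR(1) errors with unit innovation variance."""
    idx = np.arange(T)
    lags = np.abs(idx[:, None] - idx[None, :])   # |t - s| for every pair
    return rho ** lags / (1.0 - rho**2)

def gls_estimate(y, X, rho):
    """GLS estimate (Z' Omega^-1 Z)^-1 Z' Omega^-1 Y for a known rho."""
    y = np.asarray(y, dtype=float)
    T = y.shape[0]
    # Z stacks a column of ones and the regressors.
    Z = np.column_stack([np.ones(T), np.asarray(X, dtype=float)])
    Omega = ar1_covariance(T, rho)
    Omega_inv_Z = np.linalg.solve(Omega, Z)      # Omega^-1 Z
    Omega_inv_y = np.linalg.solve(Omega, y)      # Omega^-1 Y
    return np.linalg.solve(Z.T @ Omega_inv_Z, Z.T @ Omega_inv_y)
```

For long series this dense approach costs far more than least squares on transformed data, since it works with a full $T \times T$ matrix.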

==Note==
To see why the treatment of the initial observation proposed by Prais and Winsten (1954) is reasonable, it helps to consider the mechanics of the generalized least squares procedure sketched above. The inverse of $\mathbf{\Omega}$ can be decomposed as $\mathbf{\Omega}^{-1}=\mathbf{G}^{\mathsf{T}}\mathbf{G}$ with

 $\mathbf{G} = \begin{bmatrix}
\sqrt{1-\rho^2} & 0 & 0 & \cdots & 0 \\
-\rho & 1 & 0 & \cdots & 0 \\
0 & -\rho & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1
\end{bmatrix}.$

Pre-multiplying the model, written in matrix notation, by $\mathbf{G}$ gives the transformed model of Prais–Winsten.
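The decomposition can be checked numerically. The sketch below, with the illustrative helper `prais_winsten_matrix` and arbitrarily chosen values $T = 6$ and $\rho = 0.6$, builds $\mathbf{G}$ and $\mathbf{\Omega}$ (with unit white-noise variance) and compares $\mathbf{G}^{\mathsf{T}}\mathbf{G}$ with $\mathbf{\Omega}^{-1}$.

```python
import numpy as np

def prais_winsten_matrix(T, rho):
    """Lower-bidiagonal matrix G with G.T @ G equal to the inverse of Omega."""
    G = np.eye(T)
    G[0, 0] = np.sqrt(1.0 - rho**2)                # first-observation scaling
    G[np.arange(1, T), np.arange(T - 1)] = -rho    # subdiagonal entries
    return G

T, rho = 6, 0.6   # arbitrary illustrative values
idx = np.arange(T)
Omega = rho ** np.abs(idx[:, None] - idx[None, :]) / (1.0 - rho**2)
G = prais_winsten_matrix(T, rho)

# G.T @ G should reproduce the inverse of Omega up to floating-point error.
decomposition_holds = np.allclose(G.T @ G, np.linalg.inv(Omega))
# Equivalently, the transformed errors G @ eps have identity covariance.
errors_whitened = np.allclose(G @ Omega @ G.T, np.eye(T))
print(decomposition_holds, errors_whitened)
```

The second check is the reason least squares on the transformed model is appropriate: after pre-multiplication by $\mathbf{G}$ the errors are uncorrelated with constant variance.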

==Restrictions==

The error term is still restricted to be of an AR(1) type. If $\rho$ is not known, a recursive procedure (Cochrane–Orcutt estimation) or a grid search (Hildreth–Lu estimation) may be used to make the estimation feasible. Alternatively, a full information maximum likelihood procedure that estimates all parameters simultaneously has been suggested by Beach and MacKinnon.
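One way to make the procedure feasible is to alternate between estimating $\rho$ from residuals and re-estimating the coefficients on the transformed data. The following is a minimal sketch of such an iteration, not a reference implementation. The function name `feasible_prais_winsten`, the residual-based estimator of $\rho$, the clipping bound, the stopping rule and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def feasible_prais_winsten(y, X, tol=1e-8, max_iter=100):
    """Iterate between estimating rho and the coefficients (illustrative)."""
    y = np.asarray(y, dtype=float)
    T = y.shape[0]
    Z = np.column_stack([np.ones(T), np.asarray(X, dtype=float)])
    theta, *_ = np.linalg.lstsq(Z, y, rcond=None)   # start from plain OLS
    rho = 0.0
    for _ in range(max_iter):
        # Residuals of the untransformed model at the current coefficients.
        resid = y - Z @ theta
        # Assumed estimator: regress the residual on its own first lag.
        rho_new = (resid[1:] @ resid[:-1]) / (resid[:-1] @ resid[:-1])
        rho_new = float(np.clip(rho_new, -0.999, 0.999))  # keep |rho| < 1
        # Prais-Winsten transformation with the current rho estimate.
        scale = np.sqrt(1.0 - rho_new**2)
        y_star = np.concatenate([[scale * y[0]], y[1:] - rho_new * y[:-1]])
        Z_star = np.vstack([scale * Z[0], Z[1:] - rho_new * Z[:-1]])
        theta, *_ = np.linalg.lstsq(Z_star, y_star, rcond=None)
        converged = abs(rho_new - rho) < tol
        rho = rho_new
        if converged:
            break
    return theta, rho

# Simulated example with hypothetical true values alpha=1, beta=2, rho=0.7.
rng = np.random.default_rng(0)
T = 2000
x = rng.normal(size=T)
e = rng.normal(size=T)
eps = np.empty(T)
eps[0] = e[0] / np.sqrt(1.0 - 0.7**2)   # draw from the stationary distribution
for t in range(1, T):
    eps[t] = 0.7 * eps[t - 1] + e[t]
y = 1.0 + 2.0 * x + eps
theta_hat, rho_hat = feasible_prais_winsten(y, x)
print(theta_hat, rho_hat)
```

With a sample of this size the estimates should land close to the simulated values, although the exact numbers depend on the random draw.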
