# Scoring algorithm

The scoring algorithm, also known as Fisher's scoring,[1] is a form of Newton's method used in statistics to solve maximum likelihood equations numerically; it is named after Ronald Fisher.

## Sketch of derivation

Let ${\displaystyle Y_{1},\ldots ,Y_{n}}$ be independent and identically distributed random variables with twice-differentiable p.d.f. ${\displaystyle f(y;\theta )}$, and suppose we wish to calculate the maximum likelihood estimator (M.L.E.) ${\displaystyle \theta ^{*}}$ of ${\displaystyle \theta }$. First, suppose we have a starting point for our algorithm ${\displaystyle \theta _{0}}$, and consider a Taylor expansion of the score function, ${\displaystyle V(\theta )}$, about ${\displaystyle \theta _{0}}$:

${\displaystyle V(\theta )\approx V(\theta _{0})-{\mathcal {J}}(\theta _{0})(\theta -\theta _{0}),\,}$

where

${\displaystyle {\mathcal {J}}(\theta _{0})=-\sum _{i=1}^{n}\left.\nabla \nabla ^{\top }\right|_{\theta =\theta _{0}}\log f(Y_{i};\theta )}$

is the observed information matrix at ${\displaystyle \theta _{0}}$. Now, setting ${\displaystyle \theta =\theta ^{*}}$, using that ${\displaystyle V(\theta ^{*})=0}$ and rearranging gives us:

${\displaystyle \theta ^{*}\approx \theta _{0}+{\mathcal {J}}^{-1}(\theta _{0})V(\theta _{0}).\,}$

We therefore use the algorithm

${\displaystyle \theta _{m+1}=\theta _{m}+{\mathcal {J}}^{-1}(\theta _{m})V(\theta _{m}),\,}$

and under certain regularity conditions, it can be shown that ${\displaystyle \theta _{m}\rightarrow \theta ^{*}}$.
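As a concrete illustration, the iteration above can be carried out for the location parameter of a standard Cauchy distribution, where the score and observed information have simple closed forms. The sketch below uses a small hypothetical sample chosen only to make the iteration runnable; it is not from the source.

```python
# Scoring iteration for the location parameter theta of a Cauchy(theta, 1)
# distribution, with log-density log f(y; theta) = -log(pi) - log(1 + (y - theta)^2).
# Hypothetical sample (illustrative only):
y = [-1.7, 0.3, 0.6, 1.1, 1.9]

def score(theta):
    # Score function V(theta) = sum_i 2 (y_i - theta) / (1 + (y_i - theta)^2)
    return sum(2 * (yi - theta) / (1 + (yi - theta) ** 2) for yi in y)

def observed_info(theta):
    # Observed information J(theta) = -dV/dtheta
    #                    = sum_i 2 (1 - u^2) / (1 + u^2)^2, with u = y_i - theta
    total = 0.0
    for yi in y:
        u = yi - theta
        total += 2 * (1 - u * u) / (1 + u * u) ** 2
    return total

theta = 0.0  # starting point theta_0
for _ in range(25):
    # Update: theta_{m+1} = theta_m + J(theta_m)^{-1} V(theta_m)
    step = score(theta) / observed_info(theta)
    theta += step
    if abs(step) < 1e-10:
        break

print(theta)
```

The loop converges to a root of the score equation near the sample median, as expected for a Cauchy location model.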

## Fisher scoring

In practice, ${\displaystyle {\mathcal {J}}(\theta )}$ is usually replaced by its expected value ${\displaystyle {\mathcal {I}}(\theta )=\mathrm {E} [{\mathcal {J}}(\theta )]}$, the Fisher information, thus giving us the Fisher scoring algorithm:

${\displaystyle \theta _{m+1}=\theta _{m}+{\mathcal {I}}^{-1}(\theta _{m})V(\theta _{m})}$.
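A minimal sketch of the Fisher scoring update, for the mean ${\displaystyle \lambda }$ of a Poisson distribution (the sample below is hypothetical). Here the observed information ${\displaystyle {\mathcal {J}}(\lambda )=\sum _{i}Y_{i}/\lambda ^{2}}$ depends on the data, while the Fisher information ${\displaystyle {\mathcal {I}}(\lambda )=n/\lambda }$ does not; in this case a single Fisher scoring update from any positive starting point lands exactly on the M.L.E. ${\displaystyle {\bar {y}}}$:

```python
# Fisher scoring for the Poisson mean lambda, with
# log f(y; lam) = y log(lam) - lam - log(y!).
# Hypothetical count data (illustrative only):
y = [3, 1, 4, 2, 0, 5, 2]
n = len(y)

def score(lam):
    # V(lam) = sum_i y_i / lam - n
    return sum(y) / lam - n

def fisher_info(lam):
    # I(lam) = E[J(lam)] = n / lam, since E[Y_i] = lam
    return n / lam

lam = 1.0  # starting point lambda_0
for _ in range(10):
    # Update: lam_{m+1} = lam_m + I(lam_m)^{-1} V(lam_m)
    lam = lam + score(lam) / fisher_info(lam)

print(lam)  # the sample mean
```

The first update simplifies to ${\displaystyle \lambda _{1}=\lambda _{0}+(\lambda _{0}/n)(\sum _{i}y_{i}/\lambda _{0}-n)={\bar {y}}}$, after which the score is zero and the iterate stays fixed.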