The Hosmer–Lemeshow test is a statistical goodness-of-fit test for logistic regression models. It is used frequently in risk prediction models. The test assesses whether the observed event rates match the expected event rates in subgroups of the model population. The Hosmer–Lemeshow test specifically identifies subgroups as the deciles of fitted risk values. Models for which expected and observed event rates in subgroups are similar are called well calibrated.
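The Hosmer–Lemeshow test statistic is given by:

$$
H = \sum_{g=1}^{G} \left( \frac{(O_{1g} - E_{1g})^2}{E_{1g}} + \frac{(O_{0g} - E_{0g})^2}{E_{0g}} \right) = \sum_{g=1}^{G} \frac{(O_{1g} - E_{1g})^2}{N_g \pi_g (1 - \pi_g)}
$$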
Here $O_{1g}$, $E_{1g}$, $O_{0g}$, $E_{0g}$, $N_g$, and $\pi_g$ denote, respectively, the observed Y=1 events, expected Y=1 events, observed Y=0 events, expected Y=0 events, total observations, and predicted risk for the gth risk decile group, and G is the number of groups. The test statistic asymptotically follows a chi-squared distribution with G − 2 degrees of freedom. The number of risk groups may be adjusted depending on how many distinct fitted risks the model produces, which helps to avoid singular decile groups.
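As an illustration, the statistic can be computed in a few lines of Python. This is a minimal sketch under stated assumptions, not a reference implementation: the function name `hosmer_lemeshow` and its parameters are hypothetical, groups are formed by sorting observations on fitted risk and cutting them into nearly equal-sized bins (tied risks may therefore be split across groups), and each group's mean fitted risk is assumed to lie strictly between 0 and 1.

```python
def hosmer_lemeshow(y, p, groups=10):
    """Return (H, degrees_of_freedom) for binary outcomes y and fitted risks p.

    Hypothetical helper for illustration; assumes 0 < mean risk < 1 per group.
    """
    if len(y) != len(p):
        raise ValueError("y and p must have the same length")
    n = len(y)
    # Sort observations by fitted risk, then cut into nearly equal-sized bins.
    order = sorted(range(n), key=lambda i: p[i])
    h = 0.0
    for g in range(groups):
        idx = order[g * n // groups:(g + 1) * n // groups]
        if not idx:
            raise ValueError("empty risk group; reduce `groups`")
        n_g = len(idx)                    # total observations in the group
        o1 = sum(y[i] for i in idx)       # observed Y=1 events
        e1 = sum(p[i] for i in idx)       # expected Y=1 events
        pi_g = e1 / n_g                   # mean predicted risk in the group
        h += (o1 - e1) ** 2 / (n_g * pi_g * (1 - pi_g))
    return h, groups - 2
```

Calling `h, df = hosmer_lemeshow(y, p)` returns the statistic together with its G − 2 degrees of freedom. A p-value can then be obtained from the upper tail of the chi-squared distribution, for example with `scipy.stats.chi2.sf(h, df)` if SciPy is available.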