# Structural similarity

The structural similarity (SSIM) index is a method for measuring the similarity between two images. The SSIM index is a full-reference metric: it measures image quality relative to an initial uncompressed or distortion-free image used as the reference. SSIM is designed to improve on traditional methods such as peak signal-to-noise ratio (PSNR) and mean squared error (MSE), which have proven to be inconsistent with human visual perception.

Unlike MSE and PSNR, which estimate absolute errors, SSIM treats image degradation as a perceived change in structural information. Structural information is the idea that pixels have strong inter-dependencies, especially when they are spatially close. These dependencies carry important information about the structure of the objects in the visual scene.

The SSIM metric is calculated on various windows of an image. The measure between two windows $x$ and $y$ of common size $N \times N$ is:

$\hbox{SSIM}(x,y) = \frac{(2\mu_x\mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)}$

with

• $\scriptstyle\mu_x$ the average of $\scriptstyle x$;
• $\scriptstyle\mu_y$ the average of $\scriptstyle y$;
• $\scriptstyle\sigma_x^2$ the variance of $\scriptstyle x$;
• $\scriptstyle\sigma_y^2$ the variance of $\scriptstyle y$;
• $\scriptstyle \sigma_{xy}$ the covariance of $\scriptstyle x$ and $\scriptstyle y$;
• $\scriptstyle c_1 = (k_1L)^2$, $\scriptstyle c_2 = (k_2L)^2$ two constants that stabilize the division when the denominator is weak (close to zero);
• $\scriptstyle L$ the dynamic range of the pixel-values (typically this is $\scriptstyle 2^{\#bits\ per\ pixel}-1$);
• $\scriptstyle k_1 = 0.01$ and $\scriptstyle k_2 = 0.03$ by default.
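The formula and definitions above can be sketched in plain Python for a single pair of windows. This is a minimal illustration, not a reference implementation: the names `ssim_window` and `bits_per_pixel` are hypothetical, windows are passed as flat sequences of pixel values, and the variance and covariance are assumed to be population statistics (divided by the number of pixels), which the text above does not specify.

```python
from typing import Sequence


def _mean(values: Sequence[float]) -> float:
    """Arithmetic mean of the pixel values in one window."""
    return sum(values) / len(values)


def _covariance(a: Sequence[float], b: Sequence[float],
                mean_a: float, mean_b: float) -> float:
    """Population covariance (assumption: divide by N, not N - 1)."""
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / len(a)


def ssim_window(x: Sequence[float], y: Sequence[float],
                bits_per_pixel: int = 8,
                k1: float = 0.01, k2: float = 0.03) -> float:
    """SSIM between two equally sized windows given as flat sequences."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("windows must be non-empty and of equal size")

    dynamic_range = 2 ** bits_per_pixel - 1   # L in the formula
    c1 = (k1 * dynamic_range) ** 2            # stabilizes the mean term
    c2 = (k2 * dynamic_range) ** 2            # stabilizes the variance term

    mu_x, mu_y = _mean(x), _mean(y)
    var_x = _covariance(x, x, mu_x, mu_x)     # variance as self-covariance
    var_y = _covariance(y, y, mu_y, mu_y)
    cov_xy = _covariance(x, y, mu_x, mu_y)

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x * mu_x + mu_y * mu_y + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Tiny 2x2 windows, flattened row by row (illustrative data only).
checker = [0, 255, 0, 255]
inverted = [255, 0, 255, 0]
print(ssim_window(checker, checker))    # identical windows
print(ssim_window(checker, inverted))   # anti-correlated windows, negative
```

Because the covariance of the inverted checkerboard is the negative of its variance, the second score comes out close to -1, while the identical pair scores 1.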

To evaluate image quality, this formula is applied only to luma. The resulting SSIM index is a decimal value between -1 and 1; the value 1 is reachable only when the two sets of data are identical. It is typically calculated on windows of size 8×8. The window can be displaced pixel by pixel across the image, but the authors propose using only a subset of the possible windows to reduce the computational cost.
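The effect of using only some of the possible windows can be illustrated with a small window iterator. This is a sketch under stated assumptions: the name `iter_windows` and its `step` parameter are hypothetical, the image is assumed to be a list of rows of luma samples, and a regular stride is just one possible way of choosing a subset of windows (the text does not say which subset the authors propose).

```python
from typing import Iterator, List, Sequence, Tuple

Image = Sequence[Sequence[float]]  # rows of luma samples


def iter_windows(image: Image, size: int = 8,
                 step: int = 1) -> Iterator[Tuple[int, int, List[float]]]:
    """Yield (top, left, pixels) for each size x size window.

    step=1 displaces the window pixel by pixel; a larger step visits
    only a regular subset of the possible windows.
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    height = len(image)
    width = len(image[0]) if height else 0
    for top in range(0, height - size + 1, step):
        for left in range(0, width - size + 1, step):
            pixels = [image[r][c]
                      for r in range(top, top + size)
                      for c in range(left, left + size)]
            yield top, left, pixels


# A synthetic 16x16 luma image (illustrative data only).
image = [[(r * 16 + c) % 256 for c in range(16)] for r in range(16)]

dense = list(iter_windows(image, size=8, step=1))
sparse = list(iter_windows(image, size=8, step=4))
print(len(dense), len(sparse))  # 81 9
```

On this 16×16 example, moving the 8×8 window every fourth pixel cuts the number of windows to score from 81 to 9.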

Structural dissimilarity (DSSIM) is a distance measure derived from SSIM. It is not a metric in the strict sense, since the triangle inequality is not necessarily satisfied.

$\hbox{DSSIM}(x,y) = \frac{1 - \hbox{SSIM}(x, y)}{2}$
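As a quick check of the DSSIM formula, the mapping from an SSIM value to a dissimilarity can be written as a one-line function. The name `dssim` is hypothetical, and the range check simply reflects the stated SSIM range of -1 to 1.

```python
def dssim(ssim_value: float) -> float:
    """Map an SSIM value in [-1, 1] to a dissimilarity in [0, 1]."""
    if not -1.0 <= ssim_value <= 1.0:
        raise ValueError("SSIM is expected to lie in [-1, 1]")
    return (1.0 - ssim_value) / 2.0


for s in (1.0, 0.0, -1.0):
    print(s, dssim(s))  # 1.0 -> 0.0, 0.0 -> 0.5, -1.0 -> 1.0
```

Identical windows (SSIM of 1) therefore have a dissimilarity of 0, and the lowest possible SSIM of -1 maps to the maximum dissimilarity of 1.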