# Structural similarity

"SSIM" redirects here. For other uses, see SSIM (disambiguation).

The structural similarity (SSIM) index is a method for measuring the similarity between two images. It is a full-reference metric: image quality is measured against an initial uncompressed or distortion-free image used as the reference. SSIM is designed to improve on traditional methods such as peak signal-to-noise ratio (PSNR) and mean squared error (MSE), which have proven to be inconsistent with human visual perception.

Unlike MSE or PSNR, which estimate absolute errors, SSIM treats image degradation as a perceived change in structural information. Structural information is the idea that pixels have strong inter-dependencies, especially when they are spatially close. These dependencies carry important information about the structure of the objects in the visual scene.

The SSIM metric is calculated on various windows of an image. The measure between two windows $x$ and $y$ of common size N×N is:

$\hbox{SSIM}(x,y) = \frac{(2\mu_x\mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)}$

with

• $\mu_x$ the average of $x$;
• $\mu_y$ the average of $y$;
• $\sigma_x^2$ the variance of $x$;
• $\sigma_y^2$ the variance of $y$;
• $\sigma_{xy}$ the covariance of $x$ and $y$;
• $c_1 = (k_1 L)^2$, $c_2 = (k_2 L)^2$ two constants that stabilize the division when the denominator is weak (close to zero);
• $L$ the dynamic range of the pixel values (typically $2^{\text{bits per pixel}} - 1$);
• $k_1 = 0.01$ and $k_2 = 0.03$ by default.
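
The formula and definitions above can be sketched for a single pair of windows as follows. This is a minimal illustration, not a reference implementation: the function name `ssim_window` is hypothetical, each window is assumed to be a flat sequence of luma values, and the variance and covariance are computed as population statistics (dividing by the number of pixels), which is an assumption of this sketch.

```python
def ssim_window(x, y, L=255, k1=0.01, k2=0.03):
    """SSIM between two equally sized windows given as flat sequences of pixel values.

    L is the dynamic range of the pixel values (255 for 8 bits per pixel).
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("windows must be non-empty and of equal size")
    n = len(x)

    # Averages of each window.
    mu_x = sum(x) / n
    mu_y = sum(y) / n

    # Variances and covariance. Dividing by n (population statistics)
    # is an assumption of this sketch.
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n

    # Stabilizing constants c1 = (k1 L)^2 and c2 = (k2 L)^2.
    c1 = (k1 * L) ** 2
    c2 = (k2 * L) ** 2

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

With the default constants and 8-bit pixels ($L = 255$), this gives $c_1 = 2.55^2 = 6.5025$ and $c_2 = 7.65^2 = 58.5225$. Because both constants are strictly positive, the denominator is never zero, even for completely flat windows.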

To evaluate image quality, this formula is applied only to the luma component. The resulting SSIM index is a decimal value between -1 and 1; the value 1 is reached only when the two sets of data are identical. It is typically calculated on windows of size 8×8. The window can be displaced pixel by pixel across the image, but the authors propose using only a subset of the possible windows to reduce the complexity of the calculation.
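
The windowing described above might look like the sketch below. The names `iter_window_pairs` and `mean_window_score` are hypothetical, images are assumed to be equally sized lists of rows of luma values, and averaging the per-window scores into one number is an assumption of this sketch (the text above does not say how window scores are combined). A `step` of 1 displaces the window pixel by pixel; a larger `step` is one simple way of using only a subset of the possible windows.

```python
def iter_window_pairs(img_a, img_b, size=8, step=1):
    """Yield matching size x size windows from two images as flat lists."""
    rows, cols = len(img_a), len(img_a[0])
    if len(img_b) != rows or len(img_b[0]) != cols:
        raise ValueError("images must have the same dimensions")
    # Slide the top-left corner of the window across the image.
    for r in range(0, rows - size + 1, step):
        for c in range(0, cols - size + 1, step):
            win_a = [img_a[i][j] for i in range(r, r + size) for j in range(c, c + size)]
            win_b = [img_b[i][j] for i in range(r, r + size) for j in range(c, c + size)]
            yield win_a, win_b


def mean_window_score(img_a, img_b, score, size=8, step=1):
    """Average score(window_a, window_b) over all visited windows.

    `score` is any function of two flat windows, for example a
    per-window SSIM. Averaging is an assumption of this sketch.
    """
    scores = [score(wa, wb) for wa, wb in iter_window_pairs(img_a, img_b, size, step)]
    if not scores:
        raise ValueError("image is smaller than the window")
    return sum(scores) / len(scores)
```

For a 16×16 image and 8×8 windows, `step=1` visits 81 windows, while `step=8` visits only the 4 non-overlapping ones, which shows how much a coarser step reduces the work.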

Structural dissimilarity (DSSIM) is a distance metric derived from SSIM (though the triangle inequality is not necessarily satisfied).

$\hbox{DSSIM}(x,y) = \frac{1 - \hbox{SSIM}(x, y)}{2}$
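
As a small illustration of the DSSIM formula, the sketch below maps an SSIM value to its dissimilarity. The function name `dssim` and the range check are assumptions of this example; it expects an SSIM value that has already been computed.

```python
def dssim(ssim_value):
    """Structural dissimilarity for an SSIM value in [-1, 1]."""
    if not -1.0 <= ssim_value <= 1.0:
        raise ValueError("SSIM must lie between -1 and 1")
    # DSSIM = (1 - SSIM) / 2
    return (1.0 - ssim_value) / 2.0
```

Since SSIM lies between -1 and 1, DSSIM lies between 0 and 1, with 0 corresponding to an SSIM of 1.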

## Discussions over performance

Some research papers, such as "A comprehensive assessment of the structural similarity index" by Richard Dosselmann and Xue Dong Yang, argue that SSIM is less precise than claimed and that its quality scores are no more correlated with human judgment than MSE (mean squared error) values.

Moreover, although SSIM is claimed to reproduce human perception, its formula contains no elaborate model of visual perception and even relies on non-perceptual computations. For example, the human visual system does not compute a product between the mean values of the two images.

Finally, SSIM is designed to measure the quality of still images. It does not contain any parameter related to how human perception and judgment change over time. Nevertheless, SSIM is sometimes used for video quality measurement.