# Histogram matching

An example of histogram matching

In image processing, histogram matching or histogram specification is the transformation of an image so that its histogram matches a specified histogram.[1] The well-known histogram equalization method is a special case in which the specified histogram is uniformly distributed.[2]

Histogram matching can be used to balance detector responses as a relative detector calibration technique. It can also be used to normalize two images that were acquired over the same location and under the same local illumination (such as shadows), but with different sensors, atmospheric conditions, or global illumination.

## Implementation

Consider a grayscale input image ${\displaystyle X}$. It has a probability density function ${\displaystyle p_{r}(r)}$, where ${\displaystyle r}$ is a grayscale value and ${\displaystyle p_{r}(r)}$ is the probability of that value. This probability can be computed from the histogram of the image by

${\textstyle p_{r}(r_{j})={n_{j} \over n}}$

where ${\displaystyle n_{j}}$ is the frequency of the grayscale value ${\displaystyle r_{j}}$, and ${\displaystyle n}$ is the total number of pixels in the image.
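As a minimal sketch of this step, the following Python snippet computes the probabilities from pixel counts. The function name `gray_level_probabilities`, the flattened list representation of the image, and the 4-level toy data are assumptions made for illustration only.

```python
from collections import Counter

def gray_level_probabilities(pixels, levels=256):
    """Return [p_r(r_0), ..., p_r(r_{levels-1})], where p_r(r_j) = n_j / n."""
    n = len(pixels)                 # total number of pixels
    counts = Counter(pixels)        # n_j for every gray value that occurs
    return [counts.get(j, 0) / n for j in range(levels)]

# Toy image flattened to a list, using only 4 gray levels (0..3).
image = [0, 0, 1, 1, 1, 2, 3, 3]
p_r = gray_level_probabilities(image, levels=4)
print(p_r)  # [0.25, 0.375, 0.125, 0.25]
```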

Now consider a desired output probability density function ${\displaystyle p_{z}(z)}$. A transformation of ${\displaystyle p_{r}(r)}$ is needed to convert it to ${\displaystyle p_{z}(z)}$.

Input image CDF matched to desired output CDF

Each pdf can be mapped to its cumulative distribution function (CDF) by

${\displaystyle S(r_{k})=\textstyle \sum _{j=0}^{k}\displaystyle p_{r}(r_{j}),\qquad k=0,1,2,\dots ,L-1}$

${\displaystyle G(z_{k})=\textstyle \sum _{j=0}^{k}\displaystyle p_{z}(z_{j}),\qquad k=0,1,2,\dots ,L-1}$

where ${\displaystyle L}$ is the total number of gray levels (256 for a standard 8-bit image).
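The cumulative sums can be illustrated with a short Python sketch. The helper name `cumulative_distribution` and the 4-level example values are hypothetical; the input list is assumed to hold one probability per gray level.

```python
from itertools import accumulate

def cumulative_distribution(pdf):
    """Running sum of the per-level probabilities: entry k is sum_{j<=k} pdf[j]."""
    return list(accumulate(pdf))

# Probabilities for a toy image with 4 gray levels.
p_r = [0.25, 0.375, 0.125, 0.25]
S = cumulative_distribution(p_r)
print(S)  # [0.25, 0.625, 0.75, 1.0]
```

The last entry is 1 (up to rounding) whenever the probabilities sum to 1, and the sequence never decreases.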

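The idea is to map each ${\displaystyle r}$ value in ${\displaystyle X}$ to the ${\displaystyle z}$ value that has the same cumulative probability in the desired pdf, i.e. ${\displaystyle S(r_{j})=G(z_{i})}$, or ${\displaystyle z=G^{-1}(S(r))}$.[3]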
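Because the gray levels are discrete, an exact inverse of the desired CDF rarely exists. The sketch below assumes one common convention: each input level is mapped to the smallest output level whose cumulative probability is at least as large as the input level's. The function name `matching_table` and the example distributions are illustrative, not part of the source.

```python
from bisect import bisect_left
from itertools import accumulate

def matching_table(p_r, p_z):
    """Return a lookup table: table[r] is the output level assigned to input level r."""
    S = list(accumulate(p_r))   # CDF of the input image
    G = list(accumulate(p_z))   # CDF of the desired distribution
    last = len(G) - 1
    # bisect_left returns the first index z with G[z] >= s (G is non-decreasing).
    # The clamp guards against rounding that makes s slightly exceed G[-1].
    return [min(bisect_left(G, s), last) for s in S]

p_r = [0.25, 0.375, 0.125, 0.25]   # input probabilities (4 gray levels)
p_z = [0.5, 0.25, 0.125, 0.125]    # desired probabilities, favoring low levels
table = matching_table(p_r, p_z)
print(table)  # [0, 1, 1, 3]
```

Here input levels 1 and 2 both map to output level 1, and no input level maps to output level 2, so the result only approximates the desired distribution.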

## Example

The following input grayscale image is to be changed to match the reference histogram.

The input image has the following histogram:

Histogram of input image

It will be matched to this reference histogram to emphasize the lower gray levels:

Desired reference histogram

After matching, the output image has the following histogram:

Histogram of output image after matching

The output image looks like this:

Output image after histogram matching

## Algorithm

Given two images, the reference and the target images, we compute their histograms. Next, we calculate the cumulative distribution functions of the two images' histograms: ${\displaystyle F_{1}()\,}$ for the reference image and ${\displaystyle F_{2}()\,}$ for the target image. Then for each gray level ${\displaystyle G_{1}\in [0,255]}$, we find the gray level ${\displaystyle G_{2}\,}$ for which ${\displaystyle F_{1}(G_{1})=F_{2}(G_{2})\,}$ (or, because the levels are discrete, for which ${\displaystyle F_{2}(G_{2})}$ is closest to ${\displaystyle F_{1}(G_{1})}$), and this defines the histogram matching function ${\displaystyle M(G_{1})=G_{2}\,}$. Finally, we apply the function ${\displaystyle M()}$ to each pixel of the reference image. In this formulation the reference image is the one being transformed, and the target image supplies the desired histogram.
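The steps above can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than a definitive implementation: images are flattened lists of integer gray levels, the closest CDF value is used when no exact match exists, ties go to the lower gray level, and the names `cdf`, `match_histogram`, and `desired_pixels` are hypothetical.

```python
from collections import Counter
from itertools import accumulate

def cdf(pixels, levels):
    """Cumulative distribution of gray levels 0..levels-1 in a flattened image."""
    counts = Counter(pixels)
    n = len(pixels)
    return list(accumulate(counts.get(g, 0) / n for g in range(levels)))

def match_histogram(pixels, desired_pixels, levels=256):
    """Remap `pixels` so its histogram approximates that of `desired_pixels`."""
    F1 = cdf(pixels, levels)           # CDF of the image being transformed
    F2 = cdf(desired_pixels, levels)   # CDF of the image with the desired histogram
    # M[g1] is the level g2 whose F2 value is closest to F1[g1];
    # min() returns the first (lowest) g2 when several are equally close.
    M = [min(range(levels), key=lambda g2: abs(F2[g2] - F1[g1]))
         for g1 in range(levels)]
    return [M[p] for p in pixels]      # apply the lookup table to every pixel

image = [0, 0, 1, 1, 1, 2, 3, 3]
desired = [0, 0, 0, 1, 1, 1, 2, 3]
matched = match_histogram(image, desired, levels=4)
print(matched)  # [0, 0, 1, 1, 1, 1, 3, 3]
```

The lookup table is built once per pair of images (at most 256 × 256 comparisons for 8-bit data), so applying it costs one table access per pixel.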

## Exact histogram matching

In typical real-world applications, with 8-bit pixel values (discrete values in the range [0, 255]), histogram matching can only approximate the specified histogram, because all pixels of a particular value in the original image must be transformed to just one value in the output image. As a result, there may be holes or open spots in the matched output histogram.

Exact histogram matching is the problem of finding a transformation for a discrete image so that its histogram exactly matches the specified histogram.[4] Several techniques have been proposed for this. One simplistic approach converts the discrete-valued image into a continuous-valued image and adds small random values to each pixel so that the values can be ranked without ties. However, this introduces noise into the output image.
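A minimal sketch of the random tie-breaking approach described above is given below. It assumes integer gray levels, a desired histogram given as pixel counts per output level, and uniform jitter in [0, 1); the function name `exact_match` and its parameters are hypothetical.

```python
import random

def exact_match(pixels, target_counts, seed=0):
    """Assign output levels so that level z occurs exactly target_counts[z] times."""
    if sum(target_counts) != len(pixels):
        raise ValueError("target_counts must sum to the number of pixels")
    rng = random.Random(seed)
    # Rank pixel indices by value plus jitter in [0, 1): pixels with different
    # integer values keep their order, pixels with equal values are shuffled.
    order = sorted(range(len(pixels)), key=lambda i: pixels[i] + rng.random())
    out = [0] * len(pixels)
    pos = 0
    for z, count in enumerate(target_counts):
        for i in order[pos:pos + count]:   # next `count` pixels in rank order
            out[i] = z
        pos += count
    return out

image = [0, 0, 1, 1, 1, 2, 3, 3]
result = exact_match(image, target_counts=[4, 2, 1, 1])
print([result.count(z) for z in range(4)])  # [4, 2, 1, 1]
```

Pixels that shared a gray level in the input can receive different output levels, which is the noise mentioned above; which of them do depends on the random seed.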

## Multiple histogram matching

The histogram matching algorithm can be extended to find a monotonic mapping between two sets of histograms. Given two sets of histograms ${\displaystyle P=\{p_{i}\}_{i=1}^{k}}$ and ${\displaystyle Q=\{q_{i}\}_{i=1}^{k}}$, the optimal monotonic color mapping ${\displaystyle M}$ is calculated to minimize the distance between the two sets simultaneously, namely ${\displaystyle \operatorname {arg} \min _{M}\ \sum _{i=1}^{k}d(M(p_{i}),q_{i})}$, where ${\displaystyle d(\cdot ,\cdot )}$ is a distance metric between two histograms. The optimal solution is calculated using dynamic programming.[5]