= Morisita's overlap index =

Morisita's overlap index, named after Masaaki Morisita, is a statistical measure of dispersion of individuals in a population. It is used to compare overlap among samples (Morisita 1959). The formula rests on the assumption that increasing the size of the samples increases their diversity, because larger samples take in different habitats (i.e. different faunas).

Formula:

 $C_D= \frac{2\sum_{i=1}^S x_i y_i }{(D_x + D_y) XY }$

 $x_i$ is the number of times species $i$ is represented in the total $X$ from one sample.
 $y_i$ is the number of times species $i$ is represented in the total $Y$ from another sample.
 $D_x$ and $D_y$ are the Simpson's index values for the $x$ and $y$ samples respectively.
 $S$ is the number of unique species.

$C_D = 0$ if the two samples do not overlap in terms of species, and $C_D = 1$ if the species occur in the same proportions in both samples.
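The two boundary cases can be checked numerically. The following Python sketch is illustrative only: the function names `simpson_index` and `morisita_overlap` are hypothetical, and Simpson's index is assumed to be the sum of squared proportions, which is the form used in the frequency formulation of this article. Other estimators of Simpson's index exist and give slightly different values for small samples.

```python
from typing import Sequence


def simpson_index(counts: Sequence[float]) -> float:
    """Simpson's index, ASSUMED here to be the sum of squared proportions."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample must contain at least one individual")
    return sum((c / total) ** 2 for c in counts)


def morisita_overlap(x: Sequence[float], y: Sequence[float]) -> float:
    """C_D = 2 * sum(x_i * y_i) / ((D_x + D_y) * X * Y)."""
    if len(x) != len(y):
        raise ValueError("both samples must list the same S species")
    X, Y = sum(x), sum(y)
    cross = sum(xi * yi for xi, yi in zip(x, y))
    return 2 * cross / ((simpson_index(x) + simpson_index(y)) * X * Y)


# No shared species: the numerator is zero, so the index is 0.
disjoint = morisita_overlap([5, 3, 0, 0], [0, 0, 4, 6])

# Same proportions (1:2:3) at different sample sizes: the index is 1
# up to floating-point rounding.
proportional = morisita_overlap([10, 20, 30], [1, 2, 3])

# Same species in reversed proportions: a value strictly between 0 and 1.
partial = morisita_overlap([10, 20, 30], [30, 20, 10])

print(disjoint, proportional, partial)
```

Each sample is passed as a list of per-species counts in the same species order, with a zero entry for any species absent from that sample.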

Horn's modification of the index (Horn 1966) replaces $D_x$ and $D_y$ with the sums of squared counts divided by $X^2$ and $Y^2$:
$C_H= \frac{ 2 \sum_{i=1}^S x_i y_i }{ \left( {\sum_{i=1}^S x_i^2 \over X^2} + {\sum_{i=1}^S y_i^2 \over Y^2} \right) X Y } \,.$

This index is not to be confused with Morisita’s index of dispersion.

== As probabilities/frequencies ==

If the population frequencies are defined as

 $p_i = x_i / X$
 $q_i = y_i / Y$

then $D_x = \sum_{i=1}^S p_i^2$ and $D_y = \sum_{i=1}^S q_i^2$, and the formula can be rewritten as

 $C_D = \frac{\sum_{i=1}^S p_i q_i}{\frac{1}{2}\left(\sum_{i=1}^S p_i^2 + \sum_{i=1}^S q_i^2\right)}$

which reveals a relationship to the cosine similarity: both divide the dot product of the two frequency vectors by a mean of their squared lengths, but this index uses the arithmetic mean where the cosine similarity uses the geometric mean.
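The comparison with the cosine similarity can be made concrete with a short Python sketch. The helper names `frequencies` and `overlap_and_cosine` and the example counts are hypothetical, chosen only for illustration. Because the arithmetic mean of two non-negative numbers is never smaller than their geometric mean, the overlap value computed this way cannot exceed the cosine similarity, and the two coincide when both frequency vectors have the same squared length.

```python
import math
from typing import List, Sequence, Tuple


def frequencies(counts: Sequence[float]) -> List[float]:
    """Convert raw counts x_i into proportions p_i = x_i / X."""
    total = sum(counts)
    return [c / total for c in counts]


def overlap_and_cosine(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (overlap index in frequency form, cosine similarity)."""
    p, q = frequencies(x), frequencies(y)
    dot = sum(pi * qi for pi, qi in zip(p, q))
    p_sq = sum(pi * pi for pi in p)  # squared length of p
    q_sq = sum(qi * qi for qi in q)  # squared length of q
    overlap = dot / (0.5 * (p_sq + q_sq))   # arithmetic-mean normalisation
    cosine = dot / math.sqrt(p_sq * q_sq)   # geometric-mean normalisation
    return overlap, cosine


# One sample dominated by a single species, one more even sample.
overlap, cosine = overlap_and_cosine([8, 1, 1], [4, 3, 3])
print(overlap, cosine)
```

For these counts the squared lengths are 0.66 and 0.34, so the arithmetic mean (0.5) exceeds the geometric mean and the overlap value is lower than the cosine similarity.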
