# Moran's I

Illustration: on a checkerboard, the white and black squares are perfectly dispersed, so Moran's I would be −1 (taking squares that share an edge as neighbors). If the white squares were all stacked on one half of the board and the black squares on the other, Moran's I would be close to +1. A random arrangement of square colors would give a value of Moran's I close to 0.

In statistics, Moran's I is a measure of spatial autocorrelation developed by Patrick Alfred Pierce Moran.[1][2] Spatial autocorrelation is characterized by a correlation in a signal among nearby locations in space. Spatial autocorrelation is more complex than one-dimensional autocorrelation because spatial correlation is multi-dimensional (i.e. 2 or 3 dimensions of space) and multi-directional.

## Definition

Moran's I is defined as

${\displaystyle I={\frac {N}{W}}{\frac {\sum _{i}\sum _{j}w_{ij}(x_{i}-{\bar {x}})(x_{j}-{\bar {x}})}{\sum _{i}(x_{i}-{\bar {x}})^{2}}}}$

where ${\displaystyle N}$ is the number of spatial units indexed by ${\displaystyle i}$ and ${\displaystyle j}$; ${\displaystyle x}$ is the variable of interest; ${\displaystyle {\bar {x}}}$ is the mean of ${\displaystyle x}$; ${\displaystyle w_{ij}}$ are the elements of a matrix of spatial weights with zeroes on the diagonal (i.e., ${\displaystyle w_{ii}=0}$); and ${\displaystyle W=\sum _{i}\sum _{j}w_{ij}}$ is the sum of all weights.
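The definition translates almost directly into array code. The following is a minimal sketch in Python with NumPy, not a reference implementation. The function names `morans_i` and `rook_weights`, the 8 × 8 grid, and the choice of edge-sharing ("rook") neighbors are assumptions made for illustration.

```python
import numpy as np


def morans_i(x, w):
    """Moran's I for values x (length N) and an N x N spatial weights matrix w."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    n = x.size
    z = x - x.mean()                 # deviations from the mean
    numerator = z @ w @ z            # sum_i sum_j w_ij * z_i * z_j
    denominator = (z ** 2).sum()     # sum_i z_i^2
    return (n / w.sum()) * numerator / denominator


def rook_weights(rows, cols):
    """Binary weights on a rows x cols grid: 1 if two cells share an edge, else 0."""
    n = rows * cols
    w = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:         # neighbor to the right
                w[i, i + 1] = w[i + 1, i] = 1.0
            if r + 1 < rows:         # neighbor below
                w[i, i + cols] = w[i + cols, i] = 1.0
    return w


rows = cols = 8
w = rook_weights(rows, cols)

# Alternating 0/1 pattern: every neighbor has the opposite value.
checker = np.indices((rows, cols)).sum(axis=0) % 2
i_checker = morans_i(checker.ravel(), w)

# Left half 0, right half 1: almost every neighbor has the same value.
halves = np.zeros((rows, cols))
halves[:, cols // 2:] = 1.0
i_halves = morans_i(halves.ravel(), w)

print(i_checker, i_halves)
```

With these weights the alternating pattern gives I = −1, and the split pattern gives a value near +1 (about 0.86 on this small grid). A different neighbor definition, such as one that also counts diagonal contacts, would give different values.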

### Defining weights matrices

The value of ${\displaystyle I}$ can depend strongly on the assumptions built into the spatial weights matrix ${\displaystyle w_{ij}}$. The matrix should reflect the analyst's assumptions about the particular spatial phenomenon in question. A common approach is to give a weight of 1 if two zones are neighbors and 0 otherwise, though the definition of 'neighbors' can vary. Another common approach is to give a weight of 1 to each zone's ${\displaystyle k}$ nearest neighbors and 0 otherwise. An alternative is to assign weights with a distance decay function. Sometimes the length of a shared edge is used to assign different weights to neighbors. The choice of spatial weights matrix should be guided by theory about the phenomenon in question.
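Several of these weighting schemes can be sketched for point locations as follows. This is an illustrative sketch only: the function names, the example coordinates, the distance threshold used to define 'neighbors', and the inverse-power form of the distance decay are all assumptions, and other decay functions are equally possible.

```python
import numpy as np


def pairwise_distances(coords):
    """Euclidean distance between every pair of points (N x N matrix)."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def distance_band_weights(coords, threshold):
    """Weight 1 if two units lie within `threshold` of each other, else 0."""
    d = pairwise_distances(coords)
    w = (d <= threshold).astype(float)
    np.fill_diagonal(w, 0.0)         # w_ii = 0
    return w


def knn_weights(coords, k):
    """Weight 1 for each unit's k nearest neighbors, else 0 (may be asymmetric)."""
    d = pairwise_distances(coords)
    np.fill_diagonal(d, np.inf)      # a unit is never its own neighbor
    nearest = np.argsort(d, axis=1)[:, :k]
    w = np.zeros_like(d)
    rows = np.repeat(np.arange(d.shape[0]), k)
    w[rows, nearest.ravel()] = 1.0
    return w


def inverse_distance_weights(coords, power=1.0):
    """Distance decay: w_ij = d_ij ** (-power); zero-distance pairs get weight 0."""
    d = pairwise_distances(coords)
    w = np.zeros_like(d)
    positive = d > 0
    w[positive] = d[positive] ** -power
    return w


# Four hypothetical points on a line, spaced so that no distances tie.
coords = [[0.0, 0.0], [1.0, 0.0], [3.0, 0.0], [7.0, 0.0]]
w_band = distance_band_weights(coords, threshold=2.0)
w_knn = knn_weights(coords, k=1)
w_inv = inverse_distance_weights(coords)
```

Note that the k-nearest-neighbor matrix need not be symmetric: here the third point's nearest neighbor is the second point, but the second point's nearest neighbor is the first.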

### Expected value

The expected value of Moran's I under the null hypothesis of no spatial autocorrelation is

${\displaystyle E(I)={\frac {-1}{N-1}}}$

At large sample sizes (i.e., as N approaches infinity), the expected value approaches zero.
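As a quick numerical illustration, with the sample sizes chosen arbitrarily:

${\displaystyle N=10:\quad E(I)={\frac {-1}{9}}\approx -0.111,\qquad N=64:\quad E(I)={\frac {-1}{63}}\approx -0.0159}$

so for small samples the null expectation is noticeably negative rather than zero.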

The variance of Moran's I equals

${\displaystyle \operatorname {Var} (I)={\frac {NS_{4}-S_{3}S_{5}}{(N-1)(N-2)(N-3)W^{2}}}-(E(I))^{2}}$

where

${\displaystyle S_{1}={\frac {1}{2}}\sum _{i}\sum _{j}(w_{ij}+w_{ji})^{2}}$
${\displaystyle S_{2}=\sum _{i}\left(\sum _{j}w_{ij}+\sum _{j}w_{ji}\right)^{2}}$
${\displaystyle S_{3}={\frac {N^{-1}\sum _{i}(x_{i}-{\bar {x}})^{4}}{(N^{-1}\sum _{i}(x_{i}-{\bar {x}})^{2})^{2}}}}$
${\displaystyle S_{4}=(N^{2}-3N+3)S_{1}-NS_{2}+3W^{2}}$
${\displaystyle S_{5}=(N^{2}-N)S_{1}-2NS_{2}+6W^{2}}$[3]

Values of ${\displaystyle I}$ usually range from −1 to +1. Values significantly below ${\displaystyle -1/(N-1)}$ indicate negative spatial autocorrelation, and values significantly above ${\displaystyle -1/(N-1)}$ indicate positive spatial autocorrelation. For statistical hypothesis testing, Moran's I values can be transformed to z-scores.
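The expected value, the variance with its terms ${\displaystyle S_{1}}$ to ${\displaystyle S_{5}}$, and the resulting z-score can be computed together. The sketch below assumes the usual standardization ${\displaystyle z=(I-E(I))/{\sqrt {\operatorname {Var} (I)}}}$, which the text does not spell out. The function names and the example data (ten units in a chain, each linked to its immediate neighbors) are hypothetical.

```python
import numpy as np


def morans_i_test(x, w):
    """Return (I, E(I), Var(I), z-score) using the formulas given above."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    n = x.size
    W = w.sum()
    z = x - x.mean()

    i_obs = (n / W) * (z @ w @ z) / (z @ z)
    e_i = -1.0 / (n - 1)

    s1 = 0.5 * ((w + w.T) ** 2).sum()
    s2 = ((w.sum(axis=1) + w.sum(axis=0)) ** 2).sum()   # row sum + column sum per unit
    s3 = (z ** 4).mean() / (z ** 2).mean() ** 2         # sample kurtosis of x
    s4 = (n ** 2 - 3 * n + 3) * s1 - n * s2 + 3 * W ** 2
    s5 = (n ** 2 - n) * s1 - 2 * n * s2 + 6 * W ** 2

    var_i = (n * s4 - s3 * s5) / ((n - 1) * (n - 2) * (n - 3) * W ** 2) - e_i ** 2
    z_score = (i_obs - e_i) / np.sqrt(var_i)
    return i_obs, e_i, var_i, z_score


def chain_weights(n):
    """Binary weights for n units in a line, each linked to its adjacent units."""
    w = np.zeros((n, n))
    idx = np.arange(n - 1)
    w[idx, idx + 1] = w[idx + 1, idx] = 1.0
    return w


x = np.arange(10, dtype=float)      # a smooth trend along the chain
i_obs, e_i, var_i, z_score = morans_i_test(x, chain_weights(10))
print(i_obs, e_i, var_i, z_score)
```

For this smooth trend the observed I is 7/9 ≈ 0.778, well above the expected value of −1/9, so the z-score is positive. Because the variance formula divides by ${\displaystyle (N-1)(N-2)(N-3)}$, the sketch requires at least four spatial units.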

Moran's I is inversely related to Geary's C, but it is not identical. Moran's I is a measure of global spatial autocorrelation, while Geary's C is more sensitive to local spatial autocorrelation.
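The inverse relationship can be seen on a small example. The sketch below uses the commonly cited definition of Geary's C, ${\displaystyle C={\frac {(N-1)\sum _{i}\sum _{j}w_{ij}(x_{i}-x_{j})^{2}}{2W\sum _{i}(x_{i}-{\bar {x}})^{2}}}}$, which is not given in this article and is stated here as an assumption. The function names and the chain-shaped example data are likewise hypothetical.

```python
import numpy as np


def morans_i(x, w):
    """Moran's I as defined in this article."""
    x = np.asarray(x, dtype=float)
    z = x - x.mean()
    return (x.size / w.sum()) * (z @ w @ z) / (z @ z)


def gearys_c(x, w):
    """Geary's C, based on squared differences between paired values."""
    x = np.asarray(x, dtype=float)
    z = x - x.mean()
    sq_diff = (x[:, None] - x[None, :]) ** 2     # (x_i - x_j)^2 for all pairs
    return (x.size - 1) * (w * sq_diff).sum() / (2.0 * w.sum() * (z @ z))


# Ten units in a line, each linked to its adjacent units.
n = 10
w = np.zeros((n, n))
idx = np.arange(n - 1)
w[idx, idx + 1] = w[idx + 1, idx] = 1.0

smooth = np.arange(n, dtype=float)               # neighbors are similar
alternating = np.arange(n) % 2                   # neighbors are opposite

i_smooth, c_smooth = morans_i(smooth, w), gearys_c(smooth, w)
i_alt, c_alt = morans_i(alternating, w), gearys_c(alternating, w)
print(i_smooth, c_smooth)
print(i_alt, c_alt)
```

Under this definition, the smooth series yields a positive I together with a C below 1, while the alternating series yields a negative I together with a C above 1.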

## Uses

Moran's I is widely used in the fields of geography and GIScience. Some examples include:

• The analysis of geographic differences in health variables.[4]
• Characterizing the impact of lithium concentrations in public water on mental health.[5]
• Measuring the significance of regional language variation in dialectology.[6]
• Defining an objective function for meaningful terrain segmentation in geomorphological studies.[7]

## Sources

1. ^ Moran, P. A. P. (1950). "Notes on Continuous Stochastic Phenomena". Biometrika. 37 (1): 17–23. doi:10.2307/2332142. JSTOR 2332142.
2. ^ Li, Hongfei; Calder, Catherine A.; Cressie, Noel (2007). "Beyond Moran's I: Testing for Spatial Dependence Based on the Spatial Autoregressive Model". Geographical Analysis. 39 (4): 357–375. doi:10.1111/j.1538-4632.2007.00708.x.
3. ^ Cliff, A. D.; Ord, J. K. (1981). Spatial Processes. London: Pion.
4. ^ Getis, Arthur; Ord, J. K. (1992). "The Analysis of Spatial Association by Use of Distance Statistics". Geographical Analysis. 24 (3): 189–206. doi:10.1111/j.1538-4632.1992.tb00261.x.
5. ^ Helbich, M.; Leitner, M.; Kapusta, N. D. (2012). "Geospatial examination of lithium in drinking water and suicide mortality". International Journal of Health Geographics. 11 (1): 19. doi:10.1186/1476-072X-11-19. PMC 3441892. PMID 22695110.
6. ^ Grieve, Jack (2011). "A regional analysis of contraction rate in written Standard American English". International Journal of Corpus Linguistics. 16 (4): 514–546. doi:10.1075/ijcl.16.4.04gri.
7. ^ Alvioli, M.; Marchesini, I.; Reichenbach, P.; Rossi, M.; Ardizzone, F.; Fiorucci, F.; Guzzetti, F. (2016). "Automatic delineation of geomorphological slope units with r.slopeunits v1.0 and their optimization for landslide susceptibility modeling". Geoscientific Model Development. 9: 3975–3991. doi:10.5194/gmd-9-3975-2016.