Join count statistic
Join count statistics are a method of spatial analysis used to assess the degree of association, in particular the autocorrelation, of categorical variables distributed over a spatial map. They were originally introduced by Australian statistician P. A. P. Moran.[1] Join count statistics have found widespread use in econometrics,[2] remote sensing[3] and ecology.[4] Join count statistics can be computed in a number of software packages, including PASSaGE,[5] GeoDa, PySAL[6] and spdep.[7]
Binary data
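Given binary data $x_i \in \{0, 1\}$ distributed over $N$ spatial sites, where the neighbour relations between regions $i$ and $j$ are encoded in the spatial weight matrix

$$w_{ij} = \begin{cases} 1 & \text{if } i \text{ and } j \text{ are neighbours} \\ 0 & \text{otherwise} \end{cases}$$

the join count statistics are defined as[8][4]

$$J_{BB} = \frac{1}{2} \sum_{i \neq j} w_{ij} x_i x_j$$

$$J_{BW} = \frac{1}{2} \sum_{i \neq j} w_{ij} (x_i - x_j)^2$$

$$J_{WW} = \frac{1}{2} \sum_{i \neq j} w_{ij} (1 - x_i)(1 - x_j)$$

where

$$J = \frac{1}{2} \sum_{i \neq j} w_{ij}$$

is the total number of joins. The subscripts refer to 'black' = 1 and 'white' = 0 sites. The relation $J = J_{BB} + J_{BW} + J_{WW}$ implies only three of the four numbers are independent. Generally speaking, large values of $J_{BB}$ and $J_{WW}$ relative to $J_{BW}$ imply autocorrelation, and relatively large values of $J_{BW}$ imply anti-correlation.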
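As an illustration, the three join counts can be obtained by visiting each pair of neighbouring sites once and classifying the join by the values at its two ends. The following is a minimal Python sketch, assuming a symmetric 0/1 weight matrix stored as nested lists; the function name `join_counts` and the 2×2 example grid are hypothetical and are not taken from any of the packages mentioned above.

```python
def join_counts(x, w):
    """Count black-black, black-white and white-white joins.

    x: list of 0/1 values, one per site (1 = black, 0 = white).
    w: symmetric 0/1 neighbour matrix as nested lists (an assumption of
       this sketch; real packages use their own weight structures).
    Returns (bb, bw, ww, total).
    """
    n = len(x)
    bb = bw = ww = 0
    for i in range(n):
        for j in range(i + 1, n):  # visit each unordered pair once
            if not w[i][j]:
                continue
            if x[i] == 1 and x[j] == 1:
                bb += 1
            elif x[i] == 0 and x[j] == 0:
                ww += 1
            else:
                bw += 1
    return bb, bw, ww, bb + bw + ww


# Hypothetical 2x2 grid with rook contiguity, sites numbered row by row:
#   0 1
#   2 3
w = [
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]
x = [1, 1, 0, 0]  # top row black, bottom row white
print(join_counts(x, w))  # (1, 2, 1, 4)
```

In this example the three counts sum to the total number of joins, so any one of the four numbers can be recovered from the other three.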
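To assess the statistical significance of these statistics, the expectation under various null models has been computed.[9] For example, if the null hypothesis is that each sample is chosen at random according to a Bernoulli process with probability

$$p_B = P(x_i = 1), \qquad p_W = 1 - p_B$$

then Cliff and Ord[8] show that

$$E[J_{BB}] = \frac{1}{2} S_0 p_B^2, \qquad E[J_{BW}] = S_0 p_B p_W, \qquad E[J_{WW}] = \frac{1}{2} S_0 p_W^2$$

where

$$S_0 = \sum_{i \neq j} w_{ij}$$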
However, in practice[10] an approach based on random permutations is preferred, since it requires fewer assumptions.
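A permutation approach of this kind can be sketched as follows: the observed values are shuffled across the sites many times, the black-black join count is recomputed for each shuffle, and the observed count is compared with the resulting reference distribution. The Python code below is only an illustrative sketch. The function names, the edge-list representation of neighbour pairs, the default of 999 permutations and the one-sided pseudo p-value `(r + 1) / (n_perm + 1)` are assumptions of this example, not the procedure of any particular package.

```python
import random


def bb_joins(x, edges):
    """Number of joins whose two end sites are both black (value 1)."""
    return sum(1 for i, j in edges if x[i] == 1 and x[j] == 1)


def permutation_p_value(x, edges, n_perm=999, seed=0):
    """One-sided pseudo p-value for the observed black-black join count.

    Shuffling keeps the number of black sites fixed and only changes
    where they are located.
    """
    rng = random.Random(seed)
    observed = bb_joins(x, edges)
    shuffled = list(x)
    at_least = 0  # shuffles with a count at least as large as observed
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if bb_joins(shuffled, edges) >= observed:
            at_least += 1
    return observed, (at_least + 1) / (n_perm + 1)


# Hypothetical example: six sites along a line, each joined to the next.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
x = [1, 1, 1, 0, 0, 0]
observed, p = permutation_p_value(x, edges)
print(observed, p)
```

A small pseudo p-value indicates that few random rearrangements produce as many black-black joins as observed, which is evidence of positive autocorrelation.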
Local join count statistic
Anselin and Li introduced[11][12] the idea of the local join count statistic, following Anselin's general idea of a Local Indicator of Spatial Association (LISA).[13] The local join count is defined by, e.g.,

$$BB_i = x_i \sum_{j \neq i} w_{ij} x_j$$

with similar definitions for $BW_i$ and $WW_i$. This is equivalent to the Getis-Ord statistics computed with binary data. Some analytic results for the expectation of the local statistics are available based on the hypergeometric distribution,[11] but due to the multiple comparisons problem a permutation-based approach is again preferred in practice.[12]
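For illustration, the local black-black count at a site is zero when the site itself is white, and otherwise equals the number of its black neighbours. A minimal Python sketch is given below, assuming neighbours are stored as a list of index lists; the name `local_bb` and the example grid are hypothetical.

```python
def local_bb(x, neighbours):
    """Local black-black join count for every site.

    x: list of 0/1 values (1 = black).
    neighbours: neighbours[i] is the list of sites adjacent to site i
                (an assumed representation for this sketch).
    """
    return [x[i] * sum(x[j] for j in neighbours[i]) for i in range(len(x))]


# Hypothetical 2x2 grid with rook contiguity, sites numbered row by row.
neighbours = [[1, 2], [0, 3], [0, 3], [1, 2]]
x = [1, 1, 0, 1]
local = local_bb(x, neighbours)
print(local)  # [1, 2, 0, 1]
```

Each black-black join is seen from both of its end sites, so the local counts sum to twice the global black-black join count (here 4, for 2 joins).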
Extension to multiple categories
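When there are $k$ categories, join count statistics have been generalised[4][8][9]

$$J_{rr} = \frac{1}{2} \sum_{i \neq j} w_{ij} I_r(x_i) I_r(x_j)$$

$$J_{rs} = \frac{1}{2} \sum_{i \neq j} w_{ij} \left[ I_r(x_i) I_s(x_j) + I_s(x_i) I_r(x_j) \right], \qquad r \neq s$$

where $I_r(x_i)$ is an indicator function for the variable $x_i$ belonging to the category $r$. Analytic results are available,[14] or a permutation approach can be used to test for significance as in the binary case.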
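With more than two categories, the same idea amounts to tallying each join under the unordered pair of categories found at its two ends. The following Python sketch assumes an edge list of neighbouring site pairs and sortable category labels; the function name `categorical_join_counts` and the example data are hypothetical.

```python
from collections import Counter


def categorical_join_counts(labels, edges):
    """Tally joins by the unordered pair of categories at their ends.

    labels: one category label per site.
    edges: list of (i, j) pairs, each neighbouring pair listed once.
    Returns a Counter keyed by sorted (category, category) tuples.
    """
    counts = Counter()
    for i, j in edges:
        pair = tuple(sorted((labels[i], labels[j])))
        counts[pair] += 1
    return counts


# Hypothetical 2x2 grid with rook contiguity, sites numbered row by row.
edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
labels = ["a", "b", "b", "c"]
counts = categorical_join_counts(labels, edges)
print(dict(counts))  # {('a', 'b'): 2, ('b', 'c'): 2}
```

Pairs that never occur as neighbours, such as `('b', 'b')` here, are simply absent from the result and read as zero when looked up in the `Counter`.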
References
1. Moran PA. The interpretation of statistical maps. Journal of the Royal Statistical Society. Series B (Methodological). 1948 Jan 1;10(2):243-51.
2. Anselin L. Spatial econometrics. Handbook of spatial analysis in the social sciences. 2022 Nov 15:101-22.
3. Congalton RG, Green K. Assessing the accuracy of remotely sensed data: principles and practices. CRC press; 2019 Aug 8.
4. Dale MR, Fortin MJ. Spatial analysis: a guide for ecologists. Cambridge University Press; 2014 Sep 11.
5. https://www.passagesoftware.net/
6. "Esda.Join_Counts — esda v0.1.dev1+ga296c39 Manual".
7. "Spdep: Spatial Dependence: Weighting Schemes, Statistics and Models version 0.6-15 from R-Forge".
8. Cliff AD, Ord JK. Spatial Processes: Models & Applications. Pion; 1981. ISBN 9780850860818.
9. Sokal RR, Oden NL. Spatial autocorrelation in biology: 1. Methodology. Biological journal of the Linnean Society. 1978 Jun 1;10(2):199-228.
10. "Local Spatial Autocorrelation (4)".
11. Anselin L, Li X. Operational local join count statistics for cluster detection. Journal of geographical systems. 2019 Jun 1;21:189-210.
12. "Local Spatial Autocorrelation (4)".
13. Anselin L. Local Indicators of Spatial Association — LISA. Geographical Analysis. 1995;27:93-115.
14. Epperson BK. Covariances among join-count spatial autocorrelation measures. Theoretical Population Biology. 2003;64(1):81-87.