Goodman and Kruskal's gamma

In statistics, Goodman and Kruskal's gamma is a measure of rank correlation, i.e., the similarity of the orderings of the data when ranked by each of the quantities. It measures the strength of association of cross-tabulated data when both variables are measured at the ordinal level. It makes no adjustment for either table size or ties. Values range from −1 (100% negative association, or perfect inversion) to +1 (100% positive association, or perfect agreement). A value of zero indicates the absence of association.

This statistic (which is distinct from Goodman and Kruskal's lambda) is named after Leo Goodman and William Kruskal, who proposed it in a series of papers from 1954 to 1972.[1][2][3][4]

Definition

The estimate of gamma, G, depends on two quantities:

  • N_s, the number of pairs of cases ranked in the same order on both variables (number of concordant pairs),
  • N_d, the number of pairs of cases ranked differently on the variables (number of discordant pairs),

where tied pairs, that is, pairs of cases in which either of the two variables takes equal values, are dropped. Then

G=\frac{N_s-N_d}{N_s+N_d}\ .

This statistic can be regarded as the maximum likelihood estimator for the theoretical quantity \gamma, where

\gamma=\frac{P_s-P_d}{P_s+P_d}\ ,

and where P_s and P_d are the probabilities that a randomly selected pair of observations is ranked in the same order or in the opposite order, respectively, by the two variables.
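The sample estimate G defined above can be sketched in a few lines of Python. This is a minimal illustration, assuming the data arrive as two equal-length sequences of ordinal scores; the function names concordant_discordant and goodman_kruskal_gamma are hypothetical and do not come from any particular library. The pair counting is a direct O(n^2) loop, which is adequate for small samples.

```python
from itertools import combinations


def concordant_discordant(x, y):
    """Count concordant (N_s) and discordant (N_d) pairs of cases.

    Pairs tied on either variable are dropped, as in the definition.
    """
    ns = nd = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:      # same order on both variables
            ns += 1
        elif product < 0:    # opposite order on the two variables
            nd += 1
        # product == 0: tie on at least one variable, pair is ignored
    return ns, nd


def goodman_kruskal_gamma(x, y):
    """Return G = (N_s - N_d) / (N_s + N_d)."""
    ns, nd = concordant_discordant(x, y)
    if ns + nd == 0:
        raise ValueError("G is undefined when every pair is tied")
    return (ns - nd) / (ns + nd)


# Illustrative data (made up for this sketch): 8 concordant, 2 discordant pairs.
x = [1, 2, 3, 4, 5]
y = [1, 3, 2, 5, 4]
print(concordant_discordant(x, y))   # (8, 2)
print(goodman_kruskal_gamma(x, y))   # 0.6
```

Because tied pairs are simply skipped, the denominator N_s + N_d can be much smaller than the total number of pairs when the data contain many ties.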

Critical values for the gamma statistic are sometimes found by using an approximation in which a transformed value, t, of the statistic is referred to Student's t distribution, where[citation needed]

t \approx G \sqrt{  \frac{ N_s+N_d}{n(1-G^2)} }\ ,

and where n is the number of observations (not the number of pairs):

n \ne N_s+N_d. \,
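The transformation can be evaluated directly from the pair counts. The sketch below only implements the formula as stated here, which the article itself flags as unsourced; the function name gamma_t_approximation is hypothetical, and the choice of degrees of freedom for the reference distribution is not specified in the text and is therefore left out.

```python
import math


def gamma_t_approximation(ns, nd, n):
    """Evaluate t ~ G * sqrt((N_s + N_d) / (n * (1 - G^2))).

    ns, nd: numbers of concordant and discordant pairs
    n:      number of observations (not the number of pairs)
    """
    if ns + nd == 0:
        raise ValueError("G is undefined when every pair is tied")
    g = (ns - nd) / (ns + nd)
    if abs(g) == 1:
        # 1 - G^2 is zero, so the transformed value is not finite
        raise ValueError("t is undefined when G is exactly +1 or -1")
    return g * math.sqrt((ns + nd) / (n * (1 - g ** 2)))


# Illustrative counts (made up): 5 observations, 8 concordant and 2 discordant pairs.
t = gamma_t_approximation(ns=8, nd=2, n=5)
print(round(t, 2))  # roughly 1.06
```

Note that n enters the formula separately from N_s + N_d: with no ties, 5 observations give 10 pairs, and with ties the number of usable pairs is smaller still.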

References

  1. Goodman, Leo A.; Kruskal, William H. (1954). "Measures of Association for Cross Classifications". Journal of the American Statistical Association 49 (268): 732–764. JSTOR 2281536.
  2. Goodman, Leo A.; Kruskal, William H. (1959). "Measures of Association for Cross Classifications. II: Further Discussion and References". Journal of the American Statistical Association 54 (285): 123–163. doi:10.1080/01621459.1959.10501503. JSTOR 2282143.
  3. Goodman, Leo A.; Kruskal, William H. (1963). "Measures of Association for Cross Classifications III: Approximate Sampling Theory". Journal of the American Statistical Association 58 (302): 310–364. doi:10.1080/01621459.1963.10500850. JSTOR 2283271.
  4. Goodman, Leo A.; Kruskal, William H. (1972). "Measures of Association for Cross Classifications, IV: Simplification of Asymptotic Variances". Journal of the American Statistical Association 67 (338): 415–421. doi:10.1080/01621459.1972.10482401. JSTOR 2284396.

Further reading

  • Sheskin, D.J. (2007). The Handbook of Parametric and Nonparametric Statistical Procedures. Chapman & Hall/CRC. ISBN 9781584888147.
  • http://www.psych.cornell.edu/Darlington/crosstab/TABLE5.HTM