Kendall's W

Kendall's W (also known as Kendall's coefficient of concordance) is a non-parametric statistic for rank correlation. It is a normalization of the statistic of the Friedman test, and can be used for assessing agreement among raters and in particular inter-rater reliability. Kendall's W ranges from 0 (no agreement) to 1 (complete agreement).

Suppose, for instance, that a number of people have been asked to rank a list of political concerns, from the most important to the least important. Kendall's W can be calculated from these data. If the test statistic W is 1, then all the survey respondents have been unanimous, and each respondent has assigned the same order to the list of concerns. If W is 0, then there is no overall trend of agreement among the respondents, and their responses may be regarded as essentially random. Intermediate values of W indicate a greater or lesser degree of unanimity among the various responses.

While tests using the standard Pearson correlation coefficient assume normally distributed values and compare only two sequences of outcomes at a time, Kendall's W makes no assumptions regarding the nature of the probability distribution and can handle any number of distinct outcomes.

Steps of Kendall's W

Suppose that object i is given the rank r_i,j by judge number j, where there are in total n objects and m judges. Then the total rank given to object i is

R_{i}=\sum _{j=1}^{m}r_{i,j},

and the mean value of these total ranks is

{\bar {R}}={\frac {1}{n}}\sum _{i=1}^{n}R_{i}.

The sum of squared deviations, S, is defined as

S=\sum _{i=1}^{n}(R_{i}-{\bar {R}})^{2},

and then Kendall's W is defined as^[1]

W={\frac {12S}{m^{2}(n^{3}-n)}}.

If the test statistic W is 1, then all the judges or survey respondents have been unanimous, and each judge or respondent has assigned the same order to the list of objects or concerns. If W is 0, then there is no overall trend of agreement among the respondents, and their responses may be regarded as essentially random. Intermediate values of W indicate a greater or lesser degree of unanimity among the various judges or respondents.
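The definition above can be sketched in a few lines of Python. This is only an illustration for complete, untied rankings; the function name `kendalls_w` and the layout of the input (one list of ranks per judge) are assumptions made for the example.

```python
def kendalls_w(ranks):
    """Kendall's W for complete, untied rankings.

    ranks[j][i] is the rank (1..n) that judge j gives to object i.
    The name and data layout are illustrative assumptions.
    """
    m = len(ranks)       # number of judges
    n = len(ranks[0])    # number of objects
    # Total rank R_i of each object, summed over the judges.
    totals = [sum(judge[i] for judge in ranks) for i in range(n)]
    mean_total = sum(totals) / n
    # Sum of squared deviations S of the totals from their mean.
    s = sum((r - mean_total) ** 2 for r in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


unanimous = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
opposite = [[1, 2, 3], [3, 2, 1]]
mixed = [[1, 2, 3, 4], [2, 1, 3, 4], [1, 3, 2, 4]]

print(kendalls_w(unanimous))  # 1.0
print(kendalls_w(opposite))   # 0.0
print(kendalls_w(mixed))      # an intermediate value, 7/9
```

Three unanimous judges give W = 1, two judges with exactly reversed rankings give W = 0, and partial agreement gives a value in between.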

Kendall and Gibbons (1990) also show that W is linearly related to the mean value of the Spearman's rank correlation coefficients over all $\binom{m}{2}$ possible pairs of judges' rankings:

{\bar {r}}_{s}={\frac {mW-1}{m-1}}
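The relation can be checked numerically on a small untied example. The sketch below assumes complete rankings without ties and uses the classical difference formula for Spearman's coefficient; all function names are illustrative.

```python
from itertools import combinations


def kendalls_w(ranks):
    """Kendall's W for complete, untied rankings (ranks[j][i])."""
    m, n = len(ranks), len(ranks[0])
    totals = [sum(judge[i] for judge in ranks) for i in range(n)]
    mean_total = sum(totals) / n
    s = sum((r - mean_total) ** 2 for r in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


def spearman(a, b):
    """Spearman's coefficient for two untied rankings of the same objects."""
    n = len(a)
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


def mean_spearman(ranks):
    """Average Spearman coefficient over all pairs of judges."""
    pairs = list(combinations(ranks, 2))
    return sum(spearman(a, b) for a, b in pairs) / len(pairs)


ranks = [[1, 2, 3, 4], [2, 1, 3, 4], [1, 3, 2, 4]]
m = len(ranks)
direct = mean_spearman(ranks)
from_w = (m * kendalls_w(ranks) - 1) / (m - 1)
print(direct, from_w)  # both are 2/3 up to rounding
```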

Incomplete Blocks

Kendall's W can also be computed when the judges evaluate only some subset of the n objects, provided that the corresponding block design is an (n, m, r, p, λ)-design (note the different notation). In other words, this requires that

each judge ranks the same number p of objects for some $p<n$ ,
every object is ranked exactly the same total number r of times,
and each pair of objects is presented together to some judge a total of exactly λ times, $\lambda \geq 1$ , a constant for all pairs.

Then Kendall's W is defined as ^[2]

W={\frac {12\sum _{i=1}^{n}(R_{i}^{2})-3r^{2}n\left(p+1\right)^{2}}{\lambda ^{2}n(n^{2}-1)}}.

If $p=n$ and $\lambda =r=m$ so that each judge ranks all n objects, the formula above is equivalent to the original one.
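As a hedged illustration of the incomplete-blocks formula, the sketch below treats each judge's ranking as a mapping from the objects that judge saw to their ranks. The function name, the dictionary layout, and the decision to take λ as a caller-supplied argument (it is not verified against the design) are assumptions of this example.

```python
def kendalls_w_incomplete(blocks, lam):
    """Kendall's W for a balanced incomplete block design.

    blocks[j] maps each object seen by judge j to its rank (1..p).
    lam is the number of judges who see any given pair of objects;
    it is supplied by the caller and not checked here.
    """
    objects = sorted({obj for block in blocks for obj in block})
    n = len(objects)
    p = len(blocks[0])                    # objects ranked by each judge
    totals = dict.fromkeys(objects, 0)    # rank totals R_i
    counts = dict.fromkeys(objects, 0)    # how often each object is ranked
    for block in blocks:
        for obj, rank in block.items():
            totals[obj] += rank
            counts[obj] += 1
    r = counts[objects[0]]                # replications per object
    if any(len(b) != p for b in blocks) or any(c != r for c in counts.values()):
        raise ValueError("design is not balanced")
    numerator = (12 * sum(t ** 2 for t in totals.values())
                 - 3 * r ** 2 * n * (p + 1) ** 2)
    return numerator / (lam ** 2 * n * (n ** 2 - 1))


# Three objects, three judges, each judge ranks one pair (n=3, p=2, r=2, lam=1).
agree = [{"A": 1, "B": 2}, {"A": 1, "C": 2}, {"B": 1, "C": 2}]
cyclic = [{"A": 1, "B": 2}, {"B": 1, "C": 2}, {"C": 1, "A": 2}]
print(kendalls_w_incomplete(agree, lam=1))   # 1.0
print(kendalls_w_incomplete(cyclic, lam=1))  # 0.0
```

In the first design every pairwise judgment is compatible with the order A, B, C, so W = 1; in the second the judgments form a cycle and the rank totals coincide, so W = 0.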

Correction for Ties

When tied values occur, they are each given the average of the ranks that would have been given had no ties occurred. For example, the data set {80,76,34,80,73,80} has values of 80 tied for 4th, 5th, and 6th place; since the mean of {4,5,6} = 5, ranks would be assigned to the raw data values as follows: {5,3,1,5,2,5}.
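The averaging rule for ties can be sketched as follows. The helper name `average_ranks` is an assumption; it ranks in ascending order, as in the example above.

```python
def average_ranks(values):
    """Rank values in ascending order; tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        # Positions start..end correspond to ranks start+1..end+1.
        mean_rank = (start + 1 + end + 1) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


print(average_ranks([80, 76, 34, 80, 73, 80]))
# [5.0, 3.0, 1.0, 5.0, 2.0, 5.0]
```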

The effect of ties is to reduce the value of W; however, this effect is small unless there are a large number of ties. To correct for ties, assign ranks to tied values as above and compute the correction factors

T_{j}=\sum _{i=1}^{g_{j}}(t_{i}^{3}-t_{i}),

where t_i is the number of tied ranks in the ith group of tied ranks (a group being a set of values that share the same rank), and g_j is the number of groups of ties in the set of ranks (ranging from 1 to n) for judge j. Thus, T_j is the correction factor required for the jth set of ranks, that is, the ranks assigned by judge j. If judge j has no tied ranks, T_j equals 0.

With the correction for ties, the formula for W becomes

W={\frac {12\sum _{i=1}^{n}(R_{i}^{2})-3m^{2}n(n+1)^{2}}{m^{2}n(n^{2}-1)-m\sum _{j=1}^{m}(T_{j})}},

where R_i is the sum of the ranks for object i, and $\sum _{j=1}^{m}(T_{j})$ is the sum of the values of T_j over all m sets of ranks.^[3]
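A minimal sketch of the tie-corrected formula, assuming each judge's ranks have already been assigned with the averaging rule. The function names are illustrative; tie groups are found by counting equal rank values within a judge's ranking.

```python
from collections import Counter


def tie_correction(judge_ranks):
    """T_j: sum of (t^3 - t) over groups of equal ranks for one judge."""
    # Untied ranks form groups of size 1 and contribute 1 - 1 = 0.
    return sum(t ** 3 - t for t in Counter(judge_ranks).values())


def kendalls_w_ties(ranks):
    """Tie-corrected Kendall's W; ranks[j][i] may contain averaged ranks."""
    m, n = len(ranks), len(ranks[0])
    totals = [sum(judge[i] for judge in ranks) for i in range(n)]
    numerator = 12 * sum(r ** 2 for r in totals) - 3 * m ** 2 * n * (n + 1) ** 2
    t_sum = sum(tie_correction(judge) for judge in ranks)
    return numerator / (m ** 2 * n * (n ** 2 - 1) - m * t_sum)


with_ties = [[1, 2.5, 2.5, 4], [1, 2, 3, 4], [1.5, 1.5, 3, 4]]
same_ties = [[1, 2.5, 2.5, 4]] * 3

print(kendalls_w_ties(with_ties))  # 474/504, about 0.94
print(kendalls_w_ties(same_ties))  # 1.0
```

In the second example all three judges give identical rankings containing the same tie. Without the correction the value would be 486/540 = 0.9; the corrected denominator restores W = 1.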

Steps of Weighted Kendall's W

In some cases, the raters (experts) do not all carry the same importance, and the Weighted Kendall's W should be used instead.^[4] Suppose that object $i$ is given the rank $r_{ij}$ by judge number $j$, where there are in total $n$ objects and $m$ judges, and that judge $j$ has weight $\vartheta _{j}$ $(j=1,2,...,m)$. Then the total rank given to object $i$ is

R_{i}=\sum _{j=1}^{m}\vartheta _{j}r_{ij}

and the mean value of these total ranks is

{\bar {R}}={\frac {1}{n}}\sum _{i=1}^{n}R_{i}

The sum of squared deviations, $S$, is defined as

S=\sum _{i=1}^{n}(R_{i}-{\bar {R}})^{2}

and then Weighted Kendall's W is defined as

W_{w}={\frac {12S}{(n^{3}-n)}}

The formula above applies when there are no tied ranks.
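The weighted coefficient without ties can be sketched as below. The function name is illustrative, and the example assumes that the weights have been normalized to sum to 1, which the text above does not state explicitly.

```python
def weighted_kendalls_w(ranks, weights):
    """Weighted Kendall's W for complete, untied rankings.

    ranks[j][i] is the rank judge j gives object i; weights[j] is the
    weight of judge j (assumed here to sum to 1 over the judges).
    """
    n = len(ranks[0])
    # Weighted rank totals R_i.
    totals = [sum(w * judge[i] for judge, w in zip(ranks, weights))
              for i in range(n)]
    mean_total = sum(totals) / n
    s = sum((r - mean_total) ** 2 for r in totals)
    return 12 * s / (n ** 3 - n)


ranks = [[1, 2, 3, 4], [2, 1, 3, 4], [1, 3, 2, 4]]
weights = [0.5, 0.3, 0.2]  # hypothetical judge weights
print(weighted_kendalls_w(ranks, weights))  # about 0.828
```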

Correction for Ties

When tied ranks occur, the formula above must be adjusted. To correct for ties, compute the correction factors

T_{j}=\sum _{i=1}^{n}(t_{ij}^{3}-t_{ij})\quad \forall j

where $t_{ij}$ represents the number of tied ranks for object $i$ in the ranking of judge $j$, so that $T_{j}$ summarizes the ties in the ranking of judge $j$. With the correction for ties, the formula for Weighted Kendall's W becomes

W_{w}={\frac {12S}{(n^{3}-n)-\sum _{j=1}^{m}\vartheta _{j}T_{j}}}

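If the raters' weights are equal (that is, the weights are uniformly distributed), Weighted Kendall's W equals Kendall's W.^[4] Comparing the two untied formulas shows that this holds when the weights are normalized to sum to 1, so that $\vartheta _{j}=1/m$ in the uniform case.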
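The equality under uniform weights can be checked numerically. The sketch below makes two assumptions that are not stated in the text: the weights sum to 1, and each $T_{j}$ is computed per group of tied ranks, exactly as in the unweighted correction. Function names are illustrative.

```python
from collections import Counter


def tie_correction(judge_ranks):
    """T_j computed per group of equal ranks (an assumption, see lead-in)."""
    return sum(t ** 3 - t for t in Counter(judge_ranks).values())


def kendalls_w_ties(ranks):
    """Unweighted, tie-corrected Kendall's W."""
    m, n = len(ranks), len(ranks[0])
    totals = [sum(judge[i] for judge in ranks) for i in range(n)]
    numerator = 12 * sum(r ** 2 for r in totals) - 3 * m ** 2 * n * (n + 1) ** 2
    t_sum = sum(tie_correction(judge) for judge in ranks)
    return numerator / (m ** 2 * n * (n ** 2 - 1) - m * t_sum)


def weighted_kendalls_w_ties(ranks, weights):
    """Weighted, tie-corrected Kendall's W (weights assumed to sum to 1)."""
    n = len(ranks[0])
    totals = [sum(w * judge[i] for judge, w in zip(ranks, weights))
              for i in range(n)]
    mean_total = sum(totals) / n
    s = sum((r - mean_total) ** 2 for r in totals)
    t_sum = sum(w * tie_correction(judge) for judge, w in zip(ranks, weights))
    return 12 * s / ((n ** 3 - n) - t_sum)


ranks = [[1, 2.5, 2.5, 4], [1, 2, 3, 4], [1.5, 1.5, 3, 4]]
uniform = [1 / 3] * 3
print(kendalls_w_ties(ranks))
print(weighted_kendalls_w_ties(ranks, uniform))  # same value up to rounding
```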

Significance Tests

In the case of complete ranks, a commonly used significance test for W against a null hypothesis of no agreement (i.e. random rankings) is given by Kendall and Gibbons (1990):^[5]

\chi ^{2}=m(n-1)W

where the test statistic follows a chi-squared distribution with $df=n-1$ degrees of freedom.

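In the case of incomplete rankings (see above), this becomes

\chi ^{2}={\frac {\lambda (n^{2}-1)}{p+1}}W

where $p$ is the number of objects ranked by each judge and there are again $df=n-1$ degrees of freedom. With $p=n$ and $\lambda =m$ this reduces to the statistic for complete ranks.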
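The two chi-squared statistics can be computed as follows. This sketch only returns the statistic and its degrees of freedom; the Python standard library has no chi-squared distribution, so the p-value lookup is left as a comment that assumes SciPy is available. Function names are illustrative, and `p` denotes the number of objects ranked by each judge, as in the incomplete-blocks section.

```python
def chi2_complete(w, m, n):
    """Chi-squared statistic and degrees of freedom for complete rankings."""
    return m * (n - 1) * w, n - 1


def chi2_incomplete(w, lam, n, p):
    """Chi-squared statistic and degrees of freedom for incomplete rankings."""
    return lam * (n ** 2 - 1) / (p + 1) * w, n - 1


stat, df = chi2_complete(7 / 9, m=3, n=4)
print(stat, df)  # statistic close to 7 with 3 degrees of freedom

# With p = n and lam = m the incomplete form gives the same statistic.
stat_inc, df_inc = chi2_incomplete(7 / 9, lam=3, n=4, p=4)
print(stat_inc, df_inc)

# A p-value could then be obtained with, for example,
# scipy.stats.chi2.sf(stat, df), if SciPy is installed.
```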

Legendre^[6] compared via simulation the power of the chi-square and permutation testing approaches to determining significance for Kendall's W. Results indicated the chi-square method was overly conservative compared to a permutation test when $m<20$ . Marozzi^[7] extended this by also considering the F test, as proposed in the original publication introducing the W statistic by Kendall & Babington Smith (1939):

F={\frac {W(m-1)}{1-W}}

where the test statistic follows an F distribution with $v_{1}=n-1-(2/m)$ and $v_{2}=(m-1)v_{1}$ degrees of freedom. Marozzi found that the F test performs approximately as well as the permutation test, and it may be preferred when $m$ is small because it is computationally simpler.
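The F statistic and a Monte Carlo permutation test can be sketched as below. The permutation scheme used here, shuffling each judge's ranks independently and recomputing W, is an assumption about how such a test is typically set up rather than a reproduction of the procedures studied by Legendre or Marozzi. The function names, the number of permutations, and the seed are illustrative.

```python
import random


def kendalls_w(ranks):
    """Kendall's W for complete, untied rankings (ranks[j][i])."""
    m, n = len(ranks), len(ranks[0])
    totals = [sum(judge[i] for judge in ranks) for i in range(n)]
    mean_total = sum(totals) / n
    s = sum((r - mean_total) ** 2 for r in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


def f_statistic(w, m, n):
    """F statistic with its degrees of freedom; undefined when w == 1."""
    v1 = n - 1 - 2 / m
    v2 = (m - 1) * v1
    return w * (m - 1) / (1 - w), v1, v2


def permutation_p_value(ranks, n_perm=999, seed=0):
    """Monte Carlo p-value for W under independent random rankings."""
    rng = random.Random(seed)
    observed = kendalls_w(ranks)
    hits = 0
    for _ in range(n_perm):
        # Shuffle every judge's ranking independently of the others.
        shuffled = [rng.sample(judge, len(judge)) for judge in ranks]
        if kendalls_w(shuffled) >= observed - 1e-12:
            hits += 1
    # Count the observed data as one of the permutations.
    return (hits + 1) / (n_perm + 1)


ranks = [[1, 2, 3, 4], [2, 1, 3, 4], [1, 3, 2, 4]]
f_value, v1, v2 = f_statistic(kendalls_w(ranks), m=3, n=4)
print(f_value, v1, v2)

unanimous = [[1, 2, 3, 4, 5]] * 3
p_unanimous = permutation_p_value(unanimous)
print(p_unanimous)  # small, since random rankings rarely agree perfectly
```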

Software

Kendall's W and Weighted Kendall's W are implemented in MATLAB,^[8] SPSS, R,^[9] and other statistical software packages.

Notes

^ Dodge (2003): see "concordance, coefficient of"
^ Gibbons & Chakraborti (2003)
^ Siegel & Castellan (1988, p. 266)
^ Mahmoudi, Amin; Abbasi, Mehdi; Yuan, Jingfeng; Li, Lingzhi (2022). "Large-scale group decision-making (LSGDM) for performance measurement of healthcare construction projects: Ordinal Priority Approach". Applied Intelligence. 52 (12): 13781–13802. doi:10.1007/s10489-022-04094-y. ISSN 1573-7497. PMC 9449288. PMID 36091930.
^ Kendall, Maurice G.; Gibbons, Jean Dickinson (1990). Rank correlation methods (5th ed.). London: E. Arnold. ISBN 0-19-520837-4. OCLC 21195423.
^ Legendre (2005)
^ Marozzi, Marco (2014). "Testing for concordance between several criteria". Journal of Statistical Computation and Simulation. 84 (9): 1843–1850. doi:10.1080/00949655.2013.766189. S2CID 119577430.
^ "Weighted Kendall's W". www.mathworks.com. Retrieved 2022-10-06.
^ "Kendall's coefficient of concordance W – generalized for randomly incomplete datasets". The R Project for Statistical Computing.

References

Kendall, M. G.; Babington Smith, B. (Sep 1939). "The Problem of m Rankings". The Annals of Mathematical Statistics. 10 (3): 275–287. doi:10.1214/aoms/1177732186. JSTOR 2235668.
Kendall, M. G., & Gibbons, J. D. (1990). Rank correlation methods. New York, NY : Oxford University Press.
Corder, G.W., Foreman, D.I. (2009). Nonparametric Statistics for Non-Statisticians: A Step-by-Step Approach Wiley, ISBN 978-0-470-45461-9
Dodge, Y. (2003). The Oxford Dictionary of Statistical Terms, OUP. ISBN 0-19-920613-9
Legendre, P. (2005). Species Associations: The Kendall Coefficient of Concordance Revisited. Journal of Agricultural, Biological and Environmental Statistics, 10(2), 226–245.
Siegel, Sidney; Castellan, N. John Jr. (1988). Nonparametric Statistics for the Behavioral Sciences (2nd ed.). New York: McGraw-Hill. p. 266. ISBN 978-0-07-057357-4.
Gibbons, Jean Dickinson; Chakraborti, Subhabrata (2003). Nonparametric Statistical Inference (4th ed.). New York: Marcel Dekker. pp. 476–482. ISBN 978-0-8247-4052-8.
