= Kendall's W =

Kendall's W (also known as Kendall's coefficient of concordance) is a non-parametric statistic for rank correlation. It is a normalization of the statistic of the Friedman test, and can be used for assessing agreement among raters and in particular inter-rater reliability. Kendall's W ranges from 0 (no agreement) to 1 (complete agreement).

Suppose, for instance, that a number of people have been asked to rank a list of political concerns, from the most important to the least important. Kendall's W can be calculated from these data. If the test statistic W is 1, then all the survey respondents have been unanimous, and each respondent has assigned the same order to the list of concerns. If W is 0, then there is no overall trend of agreement among the respondents, and their responses may be regarded as essentially random. Intermediate values of W indicate a greater or lesser degree of unanimity among the various responses.

While tests using the standard Pearson correlation coefficient assume normally distributed values and compare only two sequences of outcomes at a time, Kendall's W makes no assumptions regarding the nature of the probability distribution and can handle any number of distinct outcomes.

== Steps of Kendall's W ==

Suppose that object i is given the rank $r_{i,j}$ by judge number j, where there are in total n objects and m judges. Then the total rank given to object i is
$R_i=\sum_{j=1}^m r_{i,j} ,$
and the mean value of these total ranks is
$\bar R= \frac{1}{n} \sum_{i=1}^n R_i.$
The sum of squared deviations, S, is defined as
$S=\sum_{i=1}^n (R_i- \bar R)^2 ,$
and then Kendall's W is defined as
$W=\frac{12 S}{m^2(n^3-n)}.$
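
The steps above can be sketched in a few lines of Python. This is a minimal illustration, assuming complete rankings without ties; the function name `kendalls_w` and the layout `ranks[j][i]` (rank given by judge j to object i) are choices made for this example, not part of the definition.

```python
def kendalls_w(ranks):
    """Kendall's W for complete, untied rankings.

    ranks[j][i] is the rank judge j gives to object i (illustrative layout).
    """
    m = len(ranks)        # number of judges
    n = len(ranks[0])     # number of objects
    totals = [sum(judge[i] for judge in ranks) for i in range(n)]  # R_i
    mean_total = sum(totals) / n                                   # R-bar
    s = sum((r - mean_total) ** 2 for r in totals)                 # S
    return 12 * s / (m ** 2 * (n ** 3 - n))


unanimous = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
mixed = [[1, 2, 3, 4], [2, 1, 3, 4], [1, 3, 2, 4]]
opposed = [[1, 2, 3], [3, 2, 1]]

print(kendalls_w(unanimous))  # 1.0
print(kendalls_w(mixed))      # about 0.778 (exactly 7/9)
print(kendalls_w(opposed))    # 0.0
```

In the `mixed` example the rank totals are 4, 6, 8 and 12, so S = 35 and W = 12·35 / (9·60) = 7/9.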

If the test statistic W is 1, then all the judges or survey respondents have been unanimous, and each judge or respondent has assigned the same order to the list of objects or concerns. If W is 0, then there is no overall trend of agreement among the respondents, and their responses may be regarded as essentially random. Intermediate values of W indicate a greater or lesser degree of unanimity among the various judges or respondents.

Kendall and Gibbons (1990) also show that W is linearly related to the mean of the Spearman's rank correlation coefficients taken over all $\binom{m}{2}$ possible pairs of judges' rankings:

$\bar{r}_s = \frac{mW-1}{m-1}$
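
The relation can be checked numerically. The sketch below assumes untied, complete rankings, for which Spearman's coefficient reduces to $1 - 6\sum d^2 / (n(n^2-1))$; the helper names are illustrative only.

```python
from itertools import combinations


def kendalls_w(ranks):
    """Kendall's W for complete, untied rankings (ranks[j][i])."""
    m, n = len(ranks), len(ranks[0])
    totals = [sum(judge[i] for judge in ranks) for i in range(n)]
    mean_total = sum(totals) / n
    s = sum((r - mean_total) ** 2 for r in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


def spearman_untied(a, b):
    """Spearman's rank correlation for two untied rankings of the same objects."""
    n = len(a)
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


def mean_pairwise_spearman(ranks):
    """Average Spearman coefficient over all pairs of judges."""
    pairs = list(combinations(ranks, 2))
    return sum(spearman_untied(a, b) for a, b in pairs) / len(pairs)


ranks = [[1, 2, 3, 4], [2, 1, 3, 4], [1, 3, 2, 4]]
m = len(ranks)
direct = mean_pairwise_spearman(ranks)
via_w = (m * kendalls_w(ranks) - 1) / (m - 1)
print(direct, via_w)  # both are 2/3 up to rounding
```

Here the three pairwise coefficients are 0.8, 0.8 and 0.4, whose mean 2/3 matches $(3 \cdot 7/9 - 1)/2$.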

=== Incomplete Blocks ===
When the judges evaluate only a subset of the n objects, W can still be defined provided the corresponding block design is an (n, m, r, p, λ)-design (note the different notation), that is, when

1. each judge ranks the same number p of objects for some $p < n$,
2. every object is ranked exactly the same total number r of times,
3. and each pair of objects is presented together to some judge a total of exactly λ times, $\lambda \ge 1$, a constant for all pairs.

Then Kendall's W is defined as
$W=\frac{12 \sum_{i=1}^n (R_i^2) - 3r^2n\left(p+1\right)^2}{\lambda^2n(n^2-1)}.$

If $p = n$ and $\lambda = r = m$ so that each judge ranks all n objects, the formula above is equivalent to the original one.
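
As an illustration, the sketch below evaluates the incomplete-block formula on the smallest balanced design: n = 3 objects, m = 3 judges, each judge ranking p = 2 objects, each object ranked r = 2 times, and each pair seen together λ = 1 time. The function name and the dictionary layout are assumptions of this example, and the code does not verify that the supplied design is actually balanced.

```python
def kendalls_w_incomplete(judgments, n, p, r, lam):
    """Kendall's W for a balanced incomplete block design.

    judgments: one dict per judge mapping object index -> rank in 1..p
    (illustrative layout). The design parameters n, p, r, lam are trusted
    as given; no balance check is performed.
    """
    totals = [0] * n
    for judge in judgments:
        for obj, rank in judge.items():
            totals[obj] += rank                      # R_i
    sum_sq = sum(t ** 2 for t in totals)
    numerator = 12 * sum_sq - 3 * r ** 2 * n * (p + 1) ** 2
    return numerator / (lam ** 2 * n * (n ** 2 - 1))


# Every judge agrees with the underlying order 0 < 1 < 2.
consistent = [{0: 1, 1: 2}, {0: 1, 2: 2}, {1: 1, 2: 2}]
# Circular preferences: 0 < 1, 1 < 2, 2 < 0.
circular = [{0: 1, 1: 2}, {1: 1, 2: 2}, {2: 1, 0: 2}]

print(kendalls_w_incomplete(consistent, n=3, p=2, r=2, lam=1))  # 1.0
print(kendalls_w_incomplete(circular, n=3, p=2, r=2, lam=1))    # 0.0
```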

=== Correction for Ties ===
When tied values occur, they are each given the average of the ranks that would have been given had no ties occurred. For example, the data set {80,76,34,80,73,80} has values of 80 tied for 4th, 5th, and 6th place; since the mean of {4,5,6} = 5, ranks would be assigned to the raw data values as follows: {5,3,1,5,2,5}.
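
The averaging rule can be sketched as follows. The function name `average_ranks` is illustrative, and rank 1 is assigned to the smallest value, as in the example above.

```python
def average_ranks(values):
    """Rank values from smallest (rank 1) upward, averaging ranks within ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend `end` to cover the whole run of equal values starting at `pos`.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + 1 + end + 1) / 2   # mean of the 1-based positions in the run
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


print(average_ranks([80, 76, 34, 80, 73, 80]))  # [5.0, 3.0, 1.0, 5.0, 2.0, 5.0]
```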

The effect of ties is to reduce the value of W; however, this effect is small unless there are a large number of ties. To correct for ties, assign ranks to tied values as above and compute the correction factors
$T_j=\sum_{i=1}^{g_j} (t_i^3-t_i),$
where $t_i$ is the number of tied ranks in the ith group of tied ranks (a group being a set of values that share the same tied rank), and $g_j$ is the number of groups of ties in the set of ranks (ranging from 1 to n) for judge j. Thus $T_j$ is the correction factor required for the set of ranks for judge j, i.e. the jth set of ranks. Note that if there are no tied ranks for judge j, $T_j$ equals 0.

With the correction for ties, the formula for W becomes
$W=\frac{12\sum_{i=1}^n (R_i^2)-3m^2n(n+1)^2}{m^2n(n^2-1)-m\sum_{j=1}^m (T_j)},$
where $R_i$ is the sum of the ranks for object i, and $\sum_{j=1}^m (T_j)$ is the sum of the values of $T_j$ over all m sets of ranks.
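
A minimal sketch of the tie-corrected formula follows. It assumes the input already contains average ranks for tied values; the function names are illustrative. Untied ranks form groups of size 1, which contribute $1^3 - 1 = 0$ to $T_j$.

```python
from collections import Counter


def tie_correction(judge_ranks):
    """T_j: sum of t^3 - t over the groups of equal ranks for one judge."""
    return sum(t ** 3 - t for t in Counter(judge_ranks).values())


def kendalls_w_ties(ranks):
    """Tie-corrected Kendall's W; ranks[j][i] holds average ranks."""
    m, n = len(ranks), len(ranks[0])
    totals = [sum(judge[i] for judge in ranks) for i in range(n)]   # R_i
    sum_sq = sum(r ** 2 for r in totals)
    t_sum = sum(tie_correction(judge) for judge in ranks)
    numerator = 12 * sum_sq - 3 * m ** 2 * n * (n + 1) ** 2
    denominator = m ** 2 * n * (n ** 2 - 1) - m * t_sum
    return numerator / denominator


# Two judges give identical rankings in which the first two objects are tied.
tied = [[1.5, 1.5, 3], [1.5, 1.5, 3]]
print(kendalls_w_ties(tied))  # 1.0
```

In this example each judge has one tie group of size 2, so $T_j = 6$. Without the correction the same data would give W = 72/96 = 0.75 despite perfect agreement, which illustrates how ties reduce the uncorrected value.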

== Steps of Weighted Kendall's W ==

In some cases the raters (experts) do not all carry the same importance, and the Weighted Kendall's W should be used instead. Suppose that object $i$ is given the rank $r_{ij}$ by judge number $j$, where there are in total $n$ objects and $m$ judges, and let $\vartheta_{j}$ $(j=1,2,\ldots,m)$ denote the weight of judge $j$. Then the total rank given to object $i$ is

$R_i=\sum_{j=1}^m \vartheta_{j} r_{ij},$
and the mean value of these total ranks is
$\bar R= \frac{1}{n} \sum_{i=1}^n R_i.$
The sum of squared deviations, $S$, is defined as
$S=\sum_{i=1}^n (R_i- \bar R)^2,$
and then Weighted Kendall's W is defined as
$W_{w}=\frac{12 S}{(n^3-n)}.$
This formula applies when there are no tied ranks.

=== Correction for Ties ===
When tied ranks occur, the formula above must be adjusted. To correct for ties, compute the correction factors
$T_j=\sum_{i=1}^{n} (t_{ij}^3-t_{ij}) \quad \forall j,$
where $t_{ij}$ represents the number of tied ranks in judge $j$'s ranking for object $i$, so that $T_j$ summarizes the ties in the ranking of judge $j$.
With the correction for ties, the formula for Weighted Kendall's W becomes
$W_{w}=\frac{12 S}{(n^3-n)-\sum_{j=1}^m \vartheta_{j} T_j}$
If the weights of the raters are equal, that is $\vartheta_j = 1/m$ for every judge (the distribution of the weights is uniform), the values of Weighted Kendall's W and Kendall's W are equal.
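
The weighted statistic can be sketched as below. Two assumptions are made for this example and are not stated in the definitions above: the weights are taken to sum to 1, and the tie correction $T_j$ is computed per group of equal ranks, as in the unweighted case, so that the weighted and unweighted values coincide for equal weights. The function name is illustrative.

```python
from collections import Counter


def weighted_kendalls_w(ranks, weights):
    """Weighted Kendall's W with tie correction.

    ranks[j][i]: (average) rank given by judge j to object i.
    weights[j]:  weight of judge j; assumed here to sum to 1.
    """
    n = len(ranks[0])
    totals = [sum(w * judge[i] for w, judge in zip(weights, ranks))
              for i in range(n)]                                  # R_i
    mean_total = sum(totals) / n
    s = sum((r - mean_total) ** 2 for r in totals)                # S
    # Assumption: T_j sums t^3 - t over groups of equal ranks for judge j.
    t = [sum(c ** 3 - c for c in Counter(judge).values()) for judge in ranks]
    correction = sum(w * tj for w, tj in zip(weights, t))
    return 12 * s / ((n ** 3 - n) - correction)


ranks = [[1, 2, 3, 4], [2, 1, 3, 4], [1, 3, 2, 4]]
equal = weighted_kendalls_w(ranks, [1 / 3, 1 / 3, 1 / 3])
unequal = weighted_kendalls_w(ranks, [0.5, 0.25, 0.25])
print(equal)    # about 0.778, the unweighted W for these data
print(unequal)  # about 0.825
```

With weights 0.5, 0.25 and 0.25 the weighted totals are 1.25, 2, 2.75 and 4, giving S = 4.125 and $W_w$ = 12·4.125/60 = 0.825; the first judge's ranking, which is closest to the consensus, counts for more.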

== Significance Tests ==
In the case of complete ranks, a commonly used significance test for W against a null hypothesis of no agreement (i.e. random rankings) is given by Kendall and Gibbons (1990):

$\chi^2 =m(n-1)W,$

where the test statistic approximately follows a chi-squared distribution with $df = n-1$ degrees of freedom.

In the case of incomplete rankings (see above), this becomes

$\chi^2 =\frac{\lambda(n^2-1)}{p+1}W,$

where again there are $df = n-1$ degrees of freedom.
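
The two chi-squared statistics can be computed as in the sketch below. The function names are illustrative, and `p` in the incomplete case denotes the number of objects ranked by each judge (the block size). The upper-tail helper `chi2_sf` is a hand-written routine for integer degrees of freedom, included only to keep the example self-contained; in practice a statistics library routine such as `scipy.stats.chi2.sf` would normally be used.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable with integer df >= 1.

    Uses Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1) with h = x / 2,
    starting from Q(1, h) = exp(-h) or Q(1/2, h) = erfc(sqrt(h)).
    """
    half = x / 2
    if df % 2 == 0:
        a, q = 1.0, math.exp(-half)
    else:
        a, q = 0.5, math.erfc(math.sqrt(half))
    while a < df / 2:
        q += half ** a * math.exp(-half) / math.gamma(a + 1)
        a += 1
    return q


def chi2_complete(w, m, n):
    """Statistic and degrees of freedom for complete rankings."""
    return m * (n - 1) * w, n - 1


def chi2_incomplete(w, n, p, lam):
    """Statistic and degrees of freedom for a balanced incomplete design."""
    return lam * (n ** 2 - 1) / (p + 1) * w, n - 1


# W = 7/9 from three judges ranking four objects.
stat, df = chi2_complete(7 / 9, m=3, n=4)
p_value = chi2_sf(stat, df)
print(stat, df, p_value)  # statistic about 7.0 on 3 df, p roughly 0.07
```

With so few judges the chi-squared approximation is rough, so the p-value here is indicative only.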

Legendre compared via simulation the power of the chi-square and permutation testing approaches to determining significance for Kendall's W. Results indicated the chi-square method was overly conservative compared to a permutation test when $m<20$. Marozzi extended this by also considering the F test, as proposed in the original publication introducing the W statistic by Kendall & Babington Smith (1939):

$F=\frac{W(m-1)}{1-W}$

where the test statistic follows an F distribution with $v_1=n-1-(2/m)$ and $v_2=(m-1)v_1$ degrees of freedom. Marozzi found that the F test performs approximately as well as the permutation test, and may be preferred when $m$ is small, as it is computationally simpler.
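
The F statistic and a permutation test can be sketched as follows. The permutation scheme used here, shuffling each judge's ranks independently to simulate random rankings, is an assumption of this example rather than a description of the cited studies; the function names, the number of permutations and the seed are likewise illustrative.

```python
import random


def kendalls_w(ranks):
    """Kendall's W for complete, untied rankings (ranks[j][i])."""
    m, n = len(ranks), len(ranks[0])
    totals = [sum(judge[i] for judge in ranks) for i in range(n)]
    mean_total = sum(totals) / n
    s = sum((r - mean_total) ** 2 for r in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


def f_statistic(w, m, n):
    """F statistic with its (non-integer) degrees of freedom; undefined for w == 1."""
    v1 = n - 1 - 2 / m
    v2 = (m - 1) * v1
    return w * (m - 1) / (1 - w), v1, v2


def permutation_p_value(ranks, n_perm=2000, seed=0):
    """Share of random re-rankings whose W is at least the observed W."""
    rng = random.Random(seed)
    observed = kendalls_w(ranks)
    hits = 0
    for _ in range(n_perm):
        # Assumed null model: each judge's ranks are shuffled independently.
        shuffled = [rng.sample(judge, len(judge)) for judge in ranks]
        if kendalls_w(shuffled) >= observed - 1e-12:   # tolerance for rounding
            hits += 1
    return (hits + 1) / (n_perm + 1)


ranks = [[1, 2, 3, 4], [2, 1, 3, 4], [1, 3, 2, 4]]
f, v1, v2 = f_statistic(kendalls_w(ranks), m=3, n=4)
print(f, v1, v2)                   # F about 7.0 with v1 = 7/3 and v2 = 14/3
print(permutation_p_value(ranks))  # Monte Carlo estimate; depends on the seed
```

Adding 1 to both the count and the number of permutations treats the observed ranking as one of the permutations, which keeps the estimated p-value above zero.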

== Software ==
Kendall's W and Weighted Kendall's W are implemented in MATLAB, SPSS, R, and other statistical software packages.

== See also ==
- Maurice Kendall
- Kendall's tau
- Spearman's rank correlation coefficient
- Friedman test
