The Kendall tau-b correlation coefficient, \(\tau_b\), is a nonparametric measure of association based on the number of concordances and discordances in paired observations.
Two observations \(\left(X_i , Y_i \right)\) and \(\left(X_j , Y_j \right)\) are concordant if they are in the same order with respect to each variable. That is, if
- \(X_i < X_j\) and \(Y_i < Y_j\) , or if
- \(X_i > X_j\) and \(Y_i > Y_j\)
They are discordant if they are ordered in opposite directions on X and Y. That is, if
- \(X_i < X_j\) and \(Y_i > Y_j\) , or if
- \(X_i > X_j\) and \(Y_i < Y_j\)
The two observations are tied if \(X_i = X_j\) and/or \(Y_i = Y_j\) .
The total number of pairs that can be constructed for a sample size of n is
\(N=\binom{n}{2}=\dfrac{1}{2}n(n-1)\)
N can be decomposed into these five quantities:
\(N = P + Q + X_0 + Y_0 + (XY)_0\)
where P is the number of concordant pairs, Q is the number of discordant pairs, \(X_0\) is the number of pairs tied only on the X variable, \(Y_0\) is the number of pairs tied only on the Y variable, and \(\left(XY\right)_0\) is the number of pairs tied on both X and Y.
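The decomposition above can be checked by brute force on a small sample. The following is a minimal sketch, not a reference implementation: the function name `count_pairs` and the example data are illustrative assumptions, and the loop visits every pair, so it is only suitable for small n.

```python
from itertools import combinations
from math import comb

def count_pairs(x, y):
    """Classify every pair of observations into one of the five categories."""
    counts = {"P": 0, "Q": 0, "X0": 0, "Y0": 0, "XY0": 0}
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        dx, dy = xi - xj, yi - yj
        if dx == 0 and dy == 0:
            counts["XY0"] += 1   # tied on both X and Y
        elif dx == 0:
            counts["X0"] += 1    # tied only on X
        elif dy == 0:
            counts["Y0"] += 1    # tied only on Y
        elif dx * dy > 0:
            counts["P"] += 1     # same ordering on X and Y: concordant
        else:
            counts["Q"] += 1     # opposite ordering: discordant
    return counts

# Hypothetical data, chosen so that every category is non-empty.
x = [1, 2, 2, 3, 3, 4]
y = [1, 3, 3, 2, 4, 4]
counts = count_pairs(x, y)
print(counts)  # {'P': 10, 'Q': 2, 'X0': 1, 'Y0': 1, 'XY0': 1}

# The five counts add up to N = n(n-1)/2.
assert sum(counts.values()) == comb(len(x), 2)
```

With n = 6 there are 15 pairs, and the five counts 10 + 2 + 1 + 1 + 1 account for all of them.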
The Kendall tau-b for measuring order association between variables X and Y is given by the following formula:
\(t_b=\dfrac{P-Q}{\sqrt{(P+Q+X_0)(P+Q+Y_0)}}\)
The coefficient \(t_b\) is scaled so that it ranges between -1 and +1. Unlike the Spearman coefficient, it does estimate a population parameter:
\(t_b \text{ is the sample estimate of } \tau_b = Pr[\text{concordance}] - Pr[\text{discordance}]\)
The Kendall tau-b has properties similar to the properties of the Spearman \(r_s\). Because the sample estimate, \(t_b\) , does estimate a population parameter, \(\tau_b\) , many statisticians prefer the Kendall tau-b to the Spearman rank correlation coefficient.
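As an illustration, the sample coefficient can be computed directly from the pair counts P, Q, \(X_0\) and \(Y_0\) defined earlier. This is a minimal sketch under stated assumptions: the function name `kendall_tau_b` and the data are hypothetical, the pairwise loop is quadratic in n, and returning NaN when the denominator is zero is a choice made here rather than part of the definition.

```python
from itertools import combinations
from math import sqrt

def kendall_tau_b(x, y):
    """Return the sample Kendall tau-b for paired sequences x and y."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    p = q = x0 = y0 = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        dx, dy = xi - xj, yi - yj
        if dx == 0 and dy == 0:
            continue      # tied on both variables: not used in the formula
        if dx == 0:
            x0 += 1       # tied only on X
        elif dy == 0:
            y0 += 1       # tied only on Y
        elif dx * dy > 0:
            p += 1        # concordant
        else:
            q += 1        # discordant
    denom = sqrt((p + q + x0) * (p + q + y0))
    # Assumption: report NaN when a variable is constant and the denominator is zero.
    return (p - q) / denom if denom > 0 else float("nan")

# Hypothetical data with P = 10, Q = 2, X0 = 1, Y0 = 1.
x = [1, 2, 2, 3, 3, 4]
y = [1, 3, 3, 2, 4, 4]
t_b = kendall_tau_b(x, y)
print(round(t_b, 4))  # 0.6154
```

For this sample the formula gives (10 - 2) / sqrt(13 × 13) = 8/13. A perfectly increasing relationship with no ties yields +1, and a perfectly decreasing one yields -1.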