Learning About Kendall’s Tau: A Tutorial on Rank Correlation


The Foundation of Correlation and Rank Metrics

In the discipline of statistics, the concept of correlation is a critical measure used to quantify the strength and direction of the relationship between two distinct variables. The outcome of a correlation analysis is always represented by a coefficient, a numerical value that ranges strictly from -1 to 1. A coefficient of 1 denotes a perfect positive relationship, meaning the variables increase or decrease in perfect unison. Conversely, a value of -1 signifies a perfect negative or inverse relationship. Crucially, a value of 0 indicates that no linear association exists between the variables under examination.

The standard measure for assessing this linear interdependence is the Pearson correlation coefficient ($r$). This powerful metric is designed for use with two numerical variables that are assumed to follow a continuous distribution, effectively measuring the degree of their linear fit. However, real-world data frequently deviates from these parametric assumptions. Many datasets involve information that is not continuous or interval-based but is instead ordinal—that is, data presented strictly in the form of ranks or ordered categories. When researchers encounter such ranked data, where the magnitude of the difference between values is not meaningful, relying on Pearson’s $r$ can lead to inaccurate conclusions.

This limitation necessitates the use of a non-parametric approach, which does not rely on assumptions about the underlying distribution of the data. This shift in methodology leads us directly to rank correlation coefficients. These coefficients focus not on the values themselves, but on the relative ordering of those values, making them robust tools for analyzing complex datasets where normality cannot be assumed or where the data is inherently subjective and ordinal.

Defining Kendall’s Tau ($\tau$): Theory and Formula

The Kendall’s Tau coefficient (symbolized as $\tau$) is a rank correlation coefficient that evaluates the monotonic relationship between two variables. A monotonic relationship is one where the variables move consistently relative to each other (as one increases, the other tends either to increase or to decrease), but not necessarily at a constant rate, which distinguishes it from a strictly linear relationship. The appeal of Kendall’s Tau lies in its simplicity: it compares the probability that two observations chosen at random are ordered consistently (concordant) with the probability that they are ordered inconsistently (discordant) across the two rankings.
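To make the monotonic-versus-linear distinction concrete, here is a minimal sketch using base R's cor() function. The data is made up for illustration: y is the cube of x, so the ordering of the two variables is identical even though the relationship is not linear.

```r
# Hypothetical data: y is a monotonic but non-linear function of x
x <- 1:10
y <- x^3

# Pearson measures linear fit, so it falls short of 1 here
pearson <- cor(x, y, method = "pearson")

# Kendall's Tau only looks at ordering, which is identical in x and y
kendall <- cor(x, y, method = "kendall")

print(c(pearson = pearson, kendall = kendall))
```

Because every pair of observations is ordered the same way in x and y, Kendall's Tau reaches its maximum while Pearson's coefficient does not.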

The core calculation involves systematically comparing every possible pair of data points within the dataset. For each pair, the analysis determines whether the relative ranking order is the same (a concordant pair) or whether the order is reversed (a discordant pair). The final coefficient, $\tau$, is the difference between the number of concordant pairs ($C$) and the number of discordant pairs ($D$), normalized by the total number of pairs. This approach provides a measure of association that is highly resistant to outliers and effective even with small sample sizes.

The standard mathematical foundation for Kendall’s Tau coefficient (often referred to as $\tau_A$) is expressed as a simple ratio of these pair counts. It is crucial to note that this specific formula assumes there are no ties in the rankings, a condition often met when dealing with complete subjective ordering:

$\tau = \frac{C - D}{C + D}$

This ratio effectively normalizes the net agreement (C – D) by the total number of possible pairs (C + D). The resulting Tau value, like other correlation coefficients, will fall between -1 and 1. A Tau value near 1 signifies strong agreement in the ranking order, a value close to -1 indicates strong disagreement or an inverse ranking, and a value near 0 suggests a weak or nonexistent association between the two sets of rankings.

Within this formula, the variables represent critical components derived from the pair comparisons:

  • C = the number of concordant pairs, where the relative ranking order for both variables is identical.
  • D = the number of discordant pairs, where the relative ranking order for the two variables is opposite.
  • C+D = the total number of pairs that can be formed from the data set, which equals $n(n-1)/2$.
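The formula above can be sketched as a brute-force pair comparison. This is an illustrative implementation that assumes no ties; kendall_tau_pairs is a hypothetical helper name, not a function from any package, and the five-item rankings are made up.

```r
# Brute-force sketch of tau = (C - D) / (C + D); assumes no ties.
# kendall_tau_pairs is a hypothetical helper name, not a library function.
kendall_tau_pairs <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) >= 2)
  pairs <- combn(length(x), 2)          # each column is one pair (i, j), i < j
  dx <- x[pairs[1, ]] - x[pairs[2, ]]   # ordering of the pair in x
  dy <- y[pairs[1, ]] - y[pairs[2, ]]   # ordering of the pair in y
  agreement <- sign(dx) * sign(dy)      # +1 concordant, -1 discordant
  C <- sum(agreement > 0)
  D <- sum(agreement < 0)
  list(C = C, D = D, tau = (C - D) / (C + D))
}

# Made-up rankings: 5 items, with one adjacent pair swapped
res <- kendall_tau_pairs(c(1, 2, 3, 4, 5), c(1, 3, 2, 4, 5))
print(res)
```

With 5 items there are 5(5-1)/2 = 10 pairs; the single swap makes one of them discordant, so C = 9, D = 1 and the coefficient is 0.8.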

The Fundamental Logic of Concordant and Discordant Pairs

Understanding the calculation of concordant and discordant pairs is the cornerstone of applying the Kendall’s Tau metric successfully. When we compare two observations, say Player X and Player Y, the pair is concordant if both Ranker A and Ranker B assign the same relative order (e.g., Ranker A ranks X higher than Y, and Ranker B also ranks X higher than Y). Conversely, the pair is discordant if the relative order is reversed (e.g., Ranker A ranks X higher than Y, but Ranker B ranks Y higher than X). The Tau coefficient essentially calculates the margin of victory of agreement over disagreement.
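The rule for a single pair can be written as a sign check. The sketch below is illustrative only: classify_pair is a hypothetical helper, and the rank values passed to it are made up.

```r
# classify_pair is a hypothetical helper: given each ranker's ranks for two
# items X and Y, report whether the pair is concordant or discordant.
classify_pair <- function(a_x, a_y, b_x, b_y) {
  # The product is positive when both rankers order X and Y the same way
  s <- sign(a_x - a_y) * sign(b_x - b_y)
  if (s > 0) "concordant" else if (s < 0) "discordant" else "tied"
}

# Both rankers give X the smaller rank number
print(classify_pair(a_x = 1, a_y = 2, b_x = 3, b_y = 5))

# Ranker B reverses the order of X and Y
print(classify_pair(a_x = 1, a_y = 2, b_x = 5, b_y = 3))
```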

This approach is particularly valuable for situations involving inter-rater reliability, such as assessing the consistency of two judges, reviewers, or experts. Since the input data is purely based on relative position—or ranking—and not on precise interval measurements, Kendall’s Tau provides an honest measure of association without forcing the data into potentially inappropriate linear models. It is frequently employed in psychology, quality assessment, and market research where subjective ordering is common.

To solidify this theoretical understanding, we will now walk through a detailed, step-by-step illustration using a real-world scenario. This example demonstrates how to methodically count these pairs and arrive at the final rank correlation coefficient, ensuring the precision necessary for robust statistical reporting.

Practical Example: Calculating Kendall’s Tau Step-by-Step

Consider a scenario designed to test the agreement between two professional basketball coaches. Both coaches are asked to independently rank 12 of their players based on overall skill, with ranks ranging from 1 (lowest skill) to 12 (highest skill). Because the data consists purely of subjective, ordinal assessments, Kendall’s Tau is the most statistically justified measure to quantify the degree of consensus between the two coaches’ professional judgments.

The table below presents the rankings assigned by each coach to the 12 players. For computational ease, the data is typically sorted according to the ranks of the first variable (Coach #1), which establishes a perfect ascending baseline (1 through 12). Our primary task is then to compare Coach #2’s corresponding ranks against this established baseline to identify instances of agreement and reversal across all possible pairs.

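[Table: each player's rank from Coach #1 (1 through 12, in order) alongside the rank from Coach #2 (1, 2, 3, 5, 4, 7, 6, 8, 10, 9, 11, 12)]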

The calculation process requires meticulous accounting. Since we have sorted the list based on Coach #1’s rankings, we only need to analyze the ranks in Coach #2’s column relative to one another to determine concordance and discordance. This standardized procedure significantly streamlines the computation of the two vital components, C and D, necessary for the final coefficient.

Detailed Calculation of Concordant Pairs (C)

Step 1: Count the number of concordant pairs (C). A pair of players ($i$ and $j$) is defined as concordant if their relative order is consistent across both Coach #1’s list and Coach #2’s list. Since our list is already sorted by Coach #1, we focus entirely on Coach #2’s ranks. For each player in Coach #2’s list, we count how many ranks listed *below* that player are larger than the current player’s rank.

Starting with AJ, who is ranked “1” by Coach #2, we observe all 11 subsequent ranks (2, 3, 5, 4, 7, 6, 8, 10, 9, 11, 12) are larger than 1. Thus, AJ contributes 11 concordant pairs. We record this initial count:

[Figure: rankings table with a concordant pairs column, showing 11 for AJ]

We proceed to the next player, Brandon, ranked “2.” Below Brandon’s rank, we count the numbers that are larger than 2, resulting in 10 such ranks (3, 5, 4, 7, 6, 8, 10, 9, 11, 12). This systematic process of comparison and counting continues sequentially down the list. We must maintain consistency, only looking at the ranks positioned further down the table to ensure every pair is counted exactly once.

[Figure: rankings table with the concordant pairs column filled in for AJ (11) and Brandon (10)]

The counts for all players must then be summed. For example, when reaching Frank, ranked ‘7’, we count all ranks below him that are larger than 7 (8, 10, 9, 11, 12), yielding 5 concordant pairs. Repeating this comparison for all remaining players completes the column of concordant pair counts, and the total of that column is the value of C.

[Figure: rankings table with the concordant pairs column completed for all 12 players]

After summing the column, we finalize the total number of concordant pairs. This sum represents the total number of times the two coaches agree on the relative order of any randomly selected pair of players within the dataset.

[Figure: rankings table with the concordant pairs column summed]
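The counting procedure for the concordant column can be sketched in R as follows. It relies on the list already being sorted by Coach #1, so only Coach #2's ranks are needed; the variable names are chosen for illustration.

```r
# Coach #2's ranks, listed in the order of Coach #1's ranks (1 through 12)
coach_2 <- c(1, 2, 3, 5, 4, 7, 6, 8, 10, 9, 11, 12)
n <- length(coach_2)

# For each position i, count the ranks further down the list that are larger
concordant <- sapply(seq_len(n), function(i) {
  below <- coach_2[-seq_len(i)]   # ranks positioned after player i
  sum(below > coach_2[i])
})

print(concordant)
print(sum(concordant))   # total number of concordant pairs, C
```

The first entry is 11 (AJ), the second is 10 (Brandon), and the sixth is 5 (Frank), matching the counts worked out by hand.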

Detailed Calculation of Discordant Pairs (D)

Step 2: Count the number of discordant pairs (D). A pair is considered discordant if the ranking order is reversed between the two coaches. That is, Player X appears above Player Y in Coach #1’s sorted list (which holds whenever we compare a player with those listed below him), but Coach #2 assigns Player X a larger rank number than Player Y. Using the same sorted list, we now focus on Coach #2’s ranks and, for each player, count how many ranks below them in the column are smaller than the current player’s rank.

Starting with AJ, ranked “1,” there are no ranks below him that are smaller than 1, so the count is 0. Similarly, Brandon (rank 2) has no smaller ranks below him, yielding a count of 0.

[Figure: rankings table with a discordant pairs column, showing 0 for AJ and Brandon]

We repeat this methodical process for every player. A key reversal occurs when we reach Daniel (rank 5). Below Daniel, we see Elliot (rank 4), which is smaller. This means the coaches disagreed on the relative order of Daniel and Elliot. Daniel therefore contributes 1 discordant pair. This counting process isolates all instances where the coaches’ rankings of a pair conflict. The final tally of this column gives us the total number of discordant pairs, D.

[Figure: rankings table with the discordant pairs column completed for all 12 players]
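The discordant column can be computed with the mirror image of the concordant count. This is a sketch under the same assumption that the list is sorted by Coach #1's ranks; the variable names are illustrative.

```r
# Coach #2's ranks, listed in the order of Coach #1's ranks (1 through 12)
coach_2 <- c(1, 2, 3, 5, 4, 7, 6, 8, 10, 9, 11, 12)
n <- length(coach_2)

# For each position i, count the ranks further down the list that are smaller
discordant <- sapply(seq_len(n), function(i) {
  below <- coach_2[-seq_len(i)]   # ranks positioned after player i
  sum(below < coach_2[i])
})

print(discordant)
print(sum(discordant))   # total number of discordant pairs, D
```

The fourth entry is 1, which is the reversal between Daniel (rank 5) and Elliot (rank 4) described above.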

Final Result and Assessment of Statistical Significance

Step 3: Calculate the sum of each column and find Kendall’s Tau. After completing the rigorous counting process, we sum the respective columns. The final tallies show that the total number of concordant pairs (C) is 63, and the total number of discordant pairs (D) is 3. The total number of pairs possible is $C+D = 66$. We now substitute these values directly into the Kendall’s Tau formula:

[Figure: Kendall's Tau computed from the column totals]

Applying the formula yields: Kendall’s Tau ($\tau$) = (63 – 3) / (63 + 3) = 60 / 66, or approximately 0.909. This high positive value indicates very strong agreement between the two coaches on the relative ordering of the 12 players: in 63 out of 66 possible pair comparisons, the coaches agreed on the relative order.
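As a quick arithmetic check of the step above, the totals can be plugged into the formula in R. The variable names are illustrative; choose(n, 2) is base R's binomial coefficient and gives the number of pairs, n(n-1)/2.

```r
C <- 63   # concordant pairs from Step 1
D <- 3    # discordant pairs from Step 2
n <- 12   # number of players

# With no ties, every pair is either concordant or discordant
stopifnot(C + D == choose(n, 2))

tau <- (C - D) / (C + D)
print(round(tau, 3))
```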

While the magnitude of the correlation is clear, a complete analysis also asks whether the observed agreement could plausibly be due to random chance or whether it is statistically significant. To test the null hypothesis (which states that the true population correlation coefficient is zero), we calculate a standardized test statistic, often referred to as a z-score.

For sample sizes greater than ten ($n > 10$), the distribution of Kendall’s Tau under the null hypothesis is generally approximated by a normal distribution. This allows us to convert the calculated $\tau$ value into a $z$-score using the following approximation formula, which uses the sample size ($n$) to express how many standard errors the calculated $\tau$ lies from zero:

$z = \frac{3\tau\sqrt{n(n-1)}}{\sqrt{2(2n+5)}}$

In this context:

  • τ = the calculated value for Kendall’s Tau (0.909).
  • n = the number of observations, here the 12 ranked players (not the number of pairs, which is 66).

Applying the values from our example to the z-score calculation:

$z = \frac{3(0.909)\sqrt{12(12-1)}}{\sqrt{2(2 \cdot 12+5)}} = \frac{3(0.909)\sqrt{132}}{\sqrt{58}} \approx 4.11$

The calculated z-score is 4.11. By cross-referencing this score with a standard normal distribution table, we find the corresponding two-sided p-value is approximately 0.00004. Since this p-value is far smaller than the conventional alpha level of 0.05, we reject the null hypothesis. We conclude that there is a statistically significant correlation between the ranks assigned by the two basketball coaches, consistent with the strong agreement indicated by the $\tau$ coefficient.
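The z-score and p-value above can be reproduced with a few lines of base R. This sketch assumes a two-sided test, which is what matches the reported p-value of roughly 0.00004; pnorm() is the standard normal cumulative distribution function.

```r
tau <- 60 / 66   # Kendall's Tau from the worked example
n <- 12          # number of players

# Normal approximation for the test statistic
z <- 3 * tau * sqrt(n * (n - 1)) / sqrt(2 * (2 * n + 5))

# Two-sided p-value: probability of a value at least this extreme in either tail
p_value <- 2 * pnorm(-abs(z))

print(round(z, 2))
print(p_value)
```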

Computational Efficiency: Implementing Kendall’s Tau in R

While the manual calculation is invaluable for grasping the mathematical underpinnings of Kendall’s Tau, large-scale statistical analysis demands computational speed and precision. Modern statistical programming languages provide robust functions to calculate this coefficient efficiently, minimizing the risk of human error associated with counting hundreds or thousands of pairs.

The statistical programming language R offers packages for this purpose. Specifically, the kendall.tau() function, available in the VGAM package, computes this rank correlation coefficient and can also be used with data that contains ties (though our example did not).

The syntax for this function is remarkably straightforward, requiring only two numerical vectors of equal length that represent the corresponding rankings:

kendall.tau(x, y)

Where x and y correspond to the two sets of ranked data being compared. The following R code demonstrates how to replicate the calculation performed manually in the previous sections, utilizing the exact ranking data provided by the two coaches. This computational approach confirms the precision of our hand calculations and offers an efficient, scalable method for future analyses.

#load VGAM
library(VGAM)

#create vector for each coach's rankings
coach_1 <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
coach_2 <- c(1, 2, 3, 5, 4, 7, 6, 8, 10, 9, 11, 12)

#calculate Kendall's Tau
kendall.tau(coach_1, coach_2)

#[1] 0.9090909

As clearly demonstrated, the output from the R function, 0.9090909, precisely matches the value of 0.909 that we calculated through the detailed, step-by-step manual process. This consistency confirms the accuracy of both the theoretical methodology and the practical efficiency of utilizing specialized statistical software for complex rank correlation analysis.
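If installing VGAM is not an option, base R offers an alternative route: cor() with method = "kendall" returns the coefficient, and cor.test() adds a significance test. The sketch below reuses the coaches' rankings; note that for small samples without ties, cor.test() computes an exact p-value by default, so its p-value need not equal the one from the normal approximation used earlier.

```r
# Rankings from the two coaches
coach_1 <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
coach_2 <- c(1, 2, 3, 5, 4, 7, 6, 8, 10, 9, 11, 12)

# Kendall's Tau with base R (no extra packages needed)
tau_base <- cor(coach_1, coach_2, method = "kendall")
print(tau_base)

# Significance test for the same coefficient
test <- cor.test(coach_1, coach_2, method = "kendall")
print(test$estimate)
print(test$p.value)
```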

Cite this article

Mohammed looti (2025). Learning About Kendall’s Tau: A Tutorial on Rank Correlation. PSYCHOLOGICAL STATISTICS. Retrieved from https://statistics.arabpsychology.com/kendalls-tau-definition-example/

Mohammed looti. "Learning About Kendall’s Tau: A Tutorial on Rank Correlation." PSYCHOLOGICAL STATISTICS, 8 Nov. 2025, https://statistics.arabpsychology.com/kendalls-tau-definition-example/.