A Comprehensive Guide to the Friedman Test in Stata


The Friedman Test stands out as a crucial non-parametric alternative to the standard Repeated-measures ANOVA. This robust statistical procedure is specifically engineered for analyzing data derived from a within-subjects design, where the core objective is to determine if statistically significant differences exist among the central tendencies of three or more related groups. It is particularly useful when the same subjects are exposed to all experimental conditions, making it an indispensable tool in fields like psychology and medicine where repeated measures are common.

The Friedman Test is most useful when the assumptions required for parametric testing, such as normality or sphericity, are violated. Because it operates on the ranks of the data within each subject rather than on the raw values, it does not depend on those assumptions. This lets researchers draw defensible conclusions when data distributions are skewed or sample sizes are small, although small samples still limit statistical power, as they do for any test.

This comprehensive guide offers a precise, step-by-step walkthrough detailing the necessary commands and procedures required to successfully execute the Friedman Test within the powerful statistical software environment, Stata. We will cover everything from initial data management and the installation of specialized community-contributed packages to the careful interpretation of the resulting statistical output, ensuring clarity and reproducibility at every stage.

Case Study: The Research Context

To illustrate the practical application of the Friedman Test, we will utilize the classic, publicly available t43 dataset, a standard example frequently employed in statistical training and methodology. This dataset captures the reaction times of five distinct patients, each of whom was administered four different pharmaceutical drugs. Since every patient participated in all four drug conditions, this structure perfectly exemplifies a repeated-measures design.

The central research question driving our analysis is straightforward: Does reaction time differ systematically across the four drugs administered? If the collected data satisfied the distributional assumptions of parametric tests, we would compare mean reaction times with a Repeated-measures ANOVA. However, assuming those assumptions are not met, a common scenario in real-world research, the Friedman Test becomes the appropriate choice.

The procedure allows us to compare the distributions of reaction times based purely on their relative ranks, providing a statistically sound conclusion irrespective of the underlying shape of the distribution. This non-parametric approach is invaluable, especially in scenarios common in biological or psychological research where data may be highly skewed or where the limitations of small sample sizes preclude the use of parametric tests. The methodology outlined in the subsequent sections provides the precise framework for conducting this analysis accurately in Stata.

Step 1: Data Acquisition and Preparation in Stata

The foundational step in any statistical analysis is the proper loading and structuring of the data within the Stata environment. For a repeated-measures analysis like the Friedman Test, the data must be correctly formatted. Typically, this involves a "long" format where one variable uniquely identifies the subject (the patient ID), a second variable defines the condition (the type of drug), and a third contains the measured outcome (the reaction time score).

To load the t43 dataset directly from the official Stata Press repository, researchers must execute the following simple command in the Stata Command window:

use http://www.stata-press.com/data/r14/t43

Once the data is successfully loaded, it is absolutely critical to perform an initial inspection to verify its structure, identify the variable names, and assess data completeness. Understanding the dataset’s organization is fundamental. This initial exploration can be accomplished using the standard browsing command:

br

Executing this command reveals three key variables: person, which serves as the subject identifier or blocking variable; drug, which represents the treatment or condition variable; and score, which holds the dependent variable (reaction time). Recognizing that person acts as the blocking factor is essential for correctly specifying the command syntax when running the Friedman Test later in the analysis.
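The inspection described above can also be done with commands that print to the Results window, which is easier to keep in a log or do-file than the interactive browser (br is the abbreviation of browse). The following is a minimal sketch, assuming the variables are named person, drug, and score as described above; the clear option is added only so the example can be rerun in a session that already has data in memory.

```stata
* Load the example data; clear discards any data already in memory
use http://www.stata-press.com/data/r14/t43, clear

* Variable names, storage types, and labels
describe

* One row per patient, one column per drug, reaction time in each cell
tabdisp person drug, cellvar(score)

* Stop with an error unless each person-drug pair appears exactly once
isid person drug

* Count missing outcomes; a complete block design should report 0
count if missing(score)
```

If isid stops with an error or the count is not zero, the design is not a complete block design and should be repaired before the test is run.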

[Figure: Friedman Test in Stata example]

Step 2: Installing the Necessary ‘emh’ Package

It is important to note that the Friedman Test is not available as a built-in command in the standard installation of Stata. To run it, analysts rely on packages contributed by the user community. The emh package, which implements extended Mantel-Haenszel (Cochran-Mantel-Haenszel) stratified tests, is what we need: when it is applied to ranks computed within each subject, its test statistic corresponds to the Friedman Test.

If this utility has not been previously installed on your system, it is mandatory to do so before proceeding to the execution phase. Fortunately, Stata simplifies this process tremendously through the use of the ssc install command. This command automatically retrieves and installs the designated package directly from the Statistical Software Components (SSC) archive, which is reliably maintained by Boston College.

To install the essential emh package, simply enter the following command into the Stata command window:

ssc install emh

Assuming a stable internet connection, the installation process is typically rapid, often completing within a few seconds. Once the package is successfully installed, all the specialized commands it contains become immediately accessible for use, not only in the current session but also in all future Stata sessions, permanently enhancing your analytical toolkit.
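In a do-file that may be run more than once, it can be convenient to install the package only when it is missing. This is a small sketch using the standard which command, which returns a nonzero code when a command cannot be found; it is one common pattern, not a required step.

```stata
* Check whether emh is already installed; capture suppresses the error
capture which emh

* A nonzero return code means the command was not found, so install it
if _rc {
    ssc install emh
}

* Show where emh lives, then open its help file for syntax and options
which emh
help emh
```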

Step 3: Executing the Non-Parametric Test

With the data properly loaded and the necessary emh package installed, we are fully prepared to execute the core statistical analysis. The syntax used for the emh command is highly flexible, supporting various non-parametric tests. To specifically trigger the Friedman Test, we must carefully specify three variables: the outcome (dependent) variable, the treatment (grouping) variable, and the blocking (subject) variable, alongside crucial options that define the rank-based procedure.

The general structure of the command requires listing the outcome variable first, followed by the grouping variable. We use the strata() option to identify the repeated-measures or blocking variable, which is the patient ID in our case study. The transformation(rank) option is indispensable; it tells emh to perform the analysis on the ranks of the data, which is the fundamental principle underlying the Friedman Test. We also include the anova option, which selects the form of the test statistic that compares mean scores (here, mean ranks) across the groups.

Applying this structure to our specific case study, involving reaction times (score), drugs (drug), and patients (person), the complete and precise command is structured as follows:

emh score drug, strata(person) anova transformation(rank)

Upon execution, Stata processes the ranks of the reaction times independently within each patient block and then calculates the resulting test statistic. This result quantifies the differences observed across the four drug conditions based on the ranked data.
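To make the ranking step concrete, the statistic can be rebuilt by hand and compared with the Q that emh reports. The sketch below uses the textbook Friedman formula without a tie correction, so it assumes no tied scores within any patient; the names rnk, ranksum, nblock, ktreat, nsubj, and sumsq are illustrative and are not part of emh.

```stata
use http://www.stata-press.com/data/r14/t43, clear

* Rank the scores separately within each patient (1 = fastest time)
bysort person: egen double rnk = rank(score)

* Sum the ranks for each drug and record how many patients contributed
collapse (sum) ranksum = rnk (count) nblock = rnk, by(drug)
list drug ranksum

* Q = 12/(n*k*(k+1)) * sum of squared rank sums - 3*n*(k+1)
generate double ranksum2 = ranksum^2
quietly summarize ranksum2
scalar ktreat = r(N)        // number of drugs (k)
scalar sumsq  = r(sum)      // sum of squared rank sums
scalar nsubj  = nblock[1]   // number of patients (n)
scalar Q = 12/(nsubj*ktreat*(ktreat+1))*sumsq - 3*nsubj*(ktreat+1)

* Compare Q with a chi-squared distribution on k - 1 degrees of freedom
display "Q(" (ktreat-1) ") = " %7.4f Q "   p = " %6.4f chi2tail(ktreat-1, Q)
```

Because collapse replaces the data in memory with one row per drug, reload the dataset before running any further analysis.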

[Figure: Friedman test output in Stata]

Step 4: Interpreting the Results and Conclusion

The output generated by Stata following the execution of the command provides the crucial metrics necessary for drawing a statistical conclusion concerning our research question. The interpretation process hinges primarily on analyzing the calculated test statistic (Q) and its associated p-value.

  • Q (3) = 13.5600. This value, often symbolized as Q or χ² (Chi-squared), is the test statistic for the Friedman Test. It measures the overall magnitude of the observed differences among the group ranks. The value in parentheses, (3), denotes the degrees of freedom (df), calculated as the number of groups minus one (4 drugs – 1 = 3 df).
  • P = 0.0036. This is the associated p-value: the probability of observing a test statistic at least as extreme as 13.5600 if the null hypothesis were true. The null hypothesis asserts that the distribution of reaction times is the same under all four drugs, which is often summarized as no difference in median reaction times.
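The link between the two numbers above can be checked directly with Stata's built-in chi-squared functions. This short sketch plugs in the reported Q and df; the scalar names p and crit are illustrative.

```stata
* Upper-tail probability of a chi-squared(3) variable at Q = 13.56
scalar p = chi2tail(3, 13.56)
display "p = " %6.4f p

* Critical value that Q must exceed for significance at alpha = 0.05
scalar crit = invchi2tail(3, 0.05)
display "critical value = " %6.3f crit
```

The first line should print a value that rounds to 0.0036, and the critical value is about 7.81, well below the observed statistic.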

To reach a decision, we compare the p-value to a predetermined significance level, or alpha (usually set at 0.05). Since 0.0036 is smaller than 0.05, we reject the null hypothesis. The statistical conclusion is that reaction times differ significantly among the four pharmaceutical drugs: at least one drug tends to produce higher or lower reaction times than the others.

Step 5: Formal Reporting and Next Steps

The final and crucial stage of the statistical analysis involves formally communicating the findings in a clear, standardized format that is suitable for inclusion in academic papers or research reports. Reporting the results of a Friedman Test demands precision, requiring the inclusion of the test’s purpose, the sample size, the degrees of freedom (df), the calculated test statistic (Q), and the corresponding p-value.

Based on the output derived from our Stata analysis, a typical example of formal reporting is structured as follows:

A Friedman Test was conducted on five individuals to investigate the effect of four different drugs on patient response time in a within-subjects design, where each patient received all four drug treatments. Results indicated that the type of drug administered led to statistically significant differences in patient response time, Q(3) = 13.56, p = 0.0036. This finding provides strong evidence against the null hypothesis that reaction times are distributed identically across all drug conditions.

It is important to recognize that while the Friedman Test confirms that an overall difference exists somewhere among the groups, it does not specify which particular pairs of drugs are different from one another. If a statistically significant result is obtained, researchers must typically proceed with subsequent post-hoc analyses. These procedures, such as Wilcoxon signed-rank tests combined with a Bonferroni correction to manage the risk of Type I error, are necessary steps for identifying the precise sources of the observed difference and fully refining the conclusion.
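The pairwise follow-up described above can be sketched with Stata's built-in signrank command, which expects the two conditions as separate variables, so the data are first reshaped from long to wide. The sketch assumes drug is coded 1 to 4, so that reshape creates score1 through score4; adjust the loop bounds if the coding differs.

```stata
use http://www.stata-press.com/data/r14/t43, clear

* One row per patient; assumes drug is coded 1-4, giving score1-score4
reshape wide score, i(person) j(drug)

* Wilcoxon signed-rank test for each of the 6 pairs of drugs
forvalues a = 1/3 {
    local start = `a' + 1
    forvalues b = `start'/4 {
        display _newline "drug `a' vs drug `b'"
        signrank score`a' = score`b'
    }
}

* Bonferroni threshold: 0.05 divided by the 6 pairwise comparisons
display "Bonferroni-adjusted alpha = " %6.4f 0.05/6
```

One caution for this particular dataset: with only five patients, the exact two-sided signed-rank p-value cannot fall below 2/2^5 = 0.0625, so no pair can reach the adjusted threshold of about 0.0083. The code is therefore best read as an illustration of the mechanics; pairwise conclusions would need a larger sample.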

Cite this article

Mohammed looti. "A Comprehensive Guide to the Friedman Test in Stata." PSYCHOLOGICAL STATISTICS, 8 Nov. 2025, https://statistics.arabpsychology.com/perform-the-friedman-test-in-stata/.
