Breusch-Pagan Test in Excel: A Step-by-Step Tutorial for Detecting Heteroscedasticity


The Breusch-Pagan Test is one of the most fundamental diagnostic procedures utilized in regression analysis. Its primary purpose is to determine whether the variance of the error terms, known as the disturbance, remains constant across all observations—a condition called homoscedasticity. Failure to meet this requirement results in heteroscedasticity, a condition that seriously violates a core assumption of the Classical Linear Regression Model.

When heteroscedasticity is present, the model’s estimated coefficients remain unbiased, but the calculation of the standard errors becomes unreliable. This distortion leads directly to flawed inference: hypothesis tests and confidence intervals derived from the model cannot be trusted, potentially causing researchers to draw incorrect conclusions about the significance of their predictors. Therefore, confirming the absence of heteroscedasticity is a critical prerequisite for robust statistical modeling.

While dedicated statistical software automates this check, performing the Breusch-Pagan Test manually within a common spreadsheet environment like Microsoft Excel offers invaluable insight into its underlying mechanism. This comprehensive tutorial walks you through every necessary step, from initial model fitting to the final statistical decision, ensuring you can accurately diagnose your model’s stability using readily available tools.

Example Setup: Structuring Data for Statistical Diagnosis

To demonstrate the methodology clearly, we will employ a practical, small-scale dataset consisting of ten observations related to basketball player performance. Our objective is to fit a model and then rigorously test if the relationship between the chosen variables exhibits constant error variance, confirming the data’s suitability for standard statistical inference.

We will construct a multiple linear regression model where the player’s Rating serves as the dependent variable (Y). The explanatory variables (X) used to predict this rating are Points, Assists, and Rebounds. Once this initial model is fitted, the subsequent Breusch-Pagan procedure will focus entirely on the residuals generated by this model, assessing whether their variance is systematically related to the predictors, which would signify the presence of heteroscedasticity.

The raw data, organized and ready for the initial regression analysis in Excel, is displayed below. This dataset forms the foundation upon which all subsequent calculations—predicted values, squared residuals, and the auxiliary regression—will be built.

Raw data in Excel

Step 1: Performing the Initial Multiple Linear Regression

The first prerequisite for the Breusch-Pagan Test is fitting the primary regression model to obtain the coefficients and, crucially, the initial set of residuals. In Excel, this is most efficiently accomplished using the Data Analysis Toolpak. If you cannot locate the Data Analysis option under the Data tab, you must first enable this powerful statistical add-in through Excel’s options menu.

To initiate the regression process, navigate to the Data tab and select Data Analysis. If the option is currently unavailable, ensure the necessary Add-In is activated before proceeding:

Data Analysis Toolpak in Excel

In the Data Analysis dialog box, select the Regression option. Carefully define the Input Y Range (our dependent variable, Rating) and the Input X Range (our independent variables: Points, Assists, and Rebounds). While the Toolpak can generate residuals directly, we will proceed to manually calculate the predicted values and squared residuals in the next step to ensure absolute precision, using the coefficients provided in the summary output.

Multiple regression in Excel

The output sheet provides the critical coefficients—the Intercept and the slopes for each predictor—which define the model’s line of best fit. These coefficient values are non-negotiable inputs for the subsequent calculation of the predicted values, which, in turn, are essential for determining the error terms that drive the entire Breusch-Pagan methodology.

Regression output in Excel

Step 2: Deriving Predicted Values and Squaring the Residuals

The next preparatory phase involves constructing the response variable for the upcoming auxiliary regression. This response variable is the set of squared residuals, representing the estimated error variance for each observation. This step requires two distinct calculations: determining the predicted value () and then calculating the squared error term.

First, we calculate the predicted value () for each observation using the standard regression equation and the coefficients derived in Step 1. This formula estimates the player’s Rating based on their specific values for Points, Assists, and Rebounds. It is vital to use absolute cell references (using the dollar sign, e.g., $A$1) for the coefficient values to ensure the formula can be correctly copied down the column:

Predicted values formula for regression in Excel

Applying this formula across all data points yields a column of predicted ratings. This prediction is crucial because the raw residual is defined precisely as the deviation between the observed actual rating (Y) and this calculated predicted rating (Y – Ŷ).

Predicted regression values in Excel

The final sub-step is calculating the squared residuals. We square the raw residuals for two main reasons: mathematically, it ensures all error terms are positive, and statistically, it emphasizes larger errors, which are the primary evidence used to detect non-constant variance. The calculation is straightforward: (Actual Rating – Predicted Rating)². This newly calculated variable will serve as the dependent variable in the forthcoming auxiliary regression, allowing us to model the variance itself.

Calculating residuals in Excel

This column of squared residuals effectively quantifies the magnitude of the error variance associated with each observation. By using this as the response variable, we are preparing to test if this variance systematically changes as a function of the original explanatory variables.

Squared residual calculation in Excel

Step 3: Executing the Auxiliary Regression

The core of the Breusch-Pagan methodology lies in the auxiliary regression. This secondary regression tests a simple concept: if the variance of the error terms (represented by the squared residuals) can be statistically predicted by the original independent variables, then the variance is not constant, confirming heteroscedasticity. Conversely, if the explanatory variables have no predictive power over the squared residuals, the variance is constant (homoscedastic).

We use the Data Analysis Toolpak for regression once more, but with a crucial modification to the input ranges:

  1. The Input Y Range is now the column containing the Squared Residuals (the calculated error variance).
  2. The Input X Range remains the original set of explanatory variables (Points, Assists, and Rebounds).

The specific output statistic we need from this auxiliary regression is the Coefficient of Determination, commonly referred to as R-squared (new). A high R-squared here suggests that the original predictors successfully explain a significant portion of the variability in the error terms, indicating non-constant variance.

The summary output for this auxiliary regression, where squared residuals predict the predictors, is displayed below:

Breusch-Pagan regression in Excel

From this critical summary, we extract the R-squared value, which is approximately 0.600395. This value, new, is the single required input for the final calculation of the Breusch-Pagan Test statistic.

Step 4: Computing the Test Statistic and P-Value

The final step involves transitioning the auxiliary regression result into a formal statistical test statistic, which follows a Chi-Square test statistic distribution. This comparison is necessary to determine the p-value and make a formal decision regarding the null hypothesis of homoscedasticity.

The Breusch-Pagan test statistic () is calculated by taking the product of the number of observations (n) and the R-squared value (new) obtained from the auxiliary regression:

X2 = n * R2new

In our current example, we have 10 observations (n = 10) and an new of 0.600395. The resulting calculation is:

X2 = 10 * 0.600395 = 6.00395

Once the test statistic is determined, we must find the corresponding p-value. The degrees of freedom (df) for this test are equal to the number of independent variables (k) used in the original regression, which is 3 (Points, Assists, Rebounds). We use Excel’s CHISQ.DIST.RT function, which computes the right-tail probability of the Chi-Square distribution, to find the p-value:

The formula applied in Excel is:

=CHISQ.DIST.RT(6.00395, 3) = 0.111418

Conclusion: Interpreting the Breusch-Pagan Test Results

The final step in the diagnostic process is to compare the calculated p-value against a predetermined significance level (α), typically 0.05. The test is framed by the following competing hypotheses:

  • Null Hypothesis (H₀): The variance of the errors is constant (Homoscedasticity).
  • Alternative Hypothesis (H₁): The variance of the errors is not constant (Presence of Heteroscedasticity).

Our analysis yielded a p-value of 0.111418. Since this calculated p-value (0.111418) is greater than our chosen significance threshold of 0.05, we fail to reject the null hypothesis. This statistical outcome indicates that there is insufficient evidence, at the 5% significance level, to conclude that heteroscedasticity is present in the multiple linear regression model. Consequently, the assumption of constant error variance is satisfied, confirming that the standard error estimates and all associated inference tests derived from the original model are statistically reliable.

Cite this article

Mohammed looti (2025). Breusch-Pagan Test in Excel: A Step-by-Step Tutorial for Detecting Heteroscedasticity. PSYCHOLOGICAL STATISTICS. Retrieved from https://statistics.arabpsychology.com/perform-a-breusch-pagan-test-in-excel/

Mohammed looti. "Breusch-Pagan Test in Excel: A Step-by-Step Tutorial for Detecting Heteroscedasticity." PSYCHOLOGICAL STATISTICS, 8 Nov. 2025, https://statistics.arabpsychology.com/perform-a-breusch-pagan-test-in-excel/.

Mohammed looti. "Breusch-Pagan Test in Excel: A Step-by-Step Tutorial for Detecting Heteroscedasticity." PSYCHOLOGICAL STATISTICS, 2025. https://statistics.arabpsychology.com/perform-a-breusch-pagan-test-in-excel/.

Mohammed looti (2025) 'Breusch-Pagan Test in Excel: A Step-by-Step Tutorial for Detecting Heteroscedasticity', PSYCHOLOGICAL STATISTICS. Available at: https://statistics.arabpsychology.com/perform-a-breusch-pagan-test-in-excel/.

[1] Mohammed looti, "Breusch-Pagan Test in Excel: A Step-by-Step Tutorial for Detecting Heteroscedasticity," PSYCHOLOGICAL STATISTICS, vol. X, no. Y, ص Z-Z, November, 2025.

Mohammed looti. Breusch-Pagan Test in Excel: A Step-by-Step Tutorial for Detecting Heteroscedasticity. PSYCHOLOGICAL STATISTICS. 2025;vol(issue):pages.

Download Post (.PDF)
Scroll to Top