A comprehensive guide to R-squared, the coefficient of determination. Learn what R-squared means, how to calculate it, interpret its value, and use it to evaluate regression models. Includes formulas, intuitive explanations, practical guidelines, and visualizations.

This article is part of the free-to-read Data Science Handbook
R-squared: Measuring Model Fit
R-squared, also known as the coefficient of determination, is a key metric used to evaluate how well a regression model explains the variability of the dependent variable. This section provides an overview of R-squared, its formula, interpretation, and visual intuition.
Introduction
When building regression models, it's important to assess how well the model fits the data. R-squared quantifies the proportion of the variance in the dependent variable that is predictable from the independent variables.
The R-squared Formula
R-squared is defined as:

R² = 1 − SS_res / SS_tot

Where:
- SS_res = Σ(y_i − ŷ_i)²: Residual sum of squares (unexplained variance)
- SS_tot = Σ(y_i − ȳ)²: Total sum of squares (total variance)
- y_i: Actual value
- ŷ_i: Predicted value from the model
- ȳ: Mean of the actual values
Mathematical Intuition
The formula can be understood through variance decomposition. The total variance in the dependent variable can be split into explained variance (how much the model explains) and unexplained variance (how much remains). R-squared represents the proportion of total variance that is explained by the model. For simple linear regression, R-squared equals the square of the correlation coefficient between the predictor and response variables (R² = r²).
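This decomposition is easy to verify numerically. The sketch below, using made-up data, fits a simple linear regression with NumPy, computes R² from the residual and total sums of squares, and confirms that it matches the squared correlation coefficient:

```python
import numpy as np

# Hypothetical data: a linear trend with a little noise
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.3])

# Fit a simple linear regression by least squares
slope, intercept = np.polyfit(x, y, 1)
y_hat = slope * x + intercept

# Variance decomposition
ss_res = np.sum((y - y_hat) ** 2)      # residual (unexplained) sum of squares
ss_tot = np.sum((y - y.mean()) ** 2)   # total sum of squares
r_squared = 1 - ss_res / ss_tot

# For simple linear regression, R^2 equals the squared correlation
r = np.corrcoef(x, y)[0, 1]
print(round(r_squared, 4), round(r ** 2, 4))  # the two values match
```

Because the data here are nearly perfectly linear, R² comes out close to 1; noisier data would pull it down.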
Interpretation
- R² = 1: The model explains all the variability of the response data around its mean (perfect fit).
- R² = 0: The model explains none of the variability (no better than using the mean).
- R² < 0: The model performs worse than simply using the mean as a predictor (rare, but possible with poor models).
- 0 < R² < 1: Indicates the proportion of variance explained by the model.
Example:
If R² = 0.85, then 85% of the variance in the dependent variable is explained by the model.
Visualizing R-squared
A higher R-squared means the regression line fits the data points more closely. Below is a conceptual illustration:
Plot 1: High vs. Low R-squared
- Left: Data points closely follow the regression line (high R-squared)
- Right: Data points are widely scattered around the line (low R-squared)

High R-squared scenario demonstrating excellent model fit. The data points (blue circles) closely follow the regression line (red), indicating that the model explains a large proportion of the variance in the dependent variable. This visualization shows how a well-fitting model captures the underlying relationship between variables with minimal unexplained variation.

Low R-squared scenario showing poor model fit. The data points (blue circles) are widely scattered around the regression line (red), indicating that the model explains only a small proportion of the variance. This demonstrates how a poorly fitting model fails to capture the underlying relationship, leaving substantial unexplained variation in the data.
Adjusted R-squared
When dealing with multiple regression, the standard R-squared can be misleading because it always increases (or stays the same) when adding more predictors, even if those predictors don't improve the model.
Adjusted R-squared accounts for the number of predictors using the formula:

Adjusted R² = 1 − (1 − R²) × (n − 1) / (n − p − 1)

Where:
- n is the number of observations
- p is the number of predictors
Unlike standard R-squared, adjusted R-squared can decrease when adding irrelevant predictors, providing a more honest assessment of model quality in multiple regression.
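The adjustment is simple to apply. The sketch below, using a hypothetical model with R² = 0.85 fitted on n = 50 observations, shows how the penalty grows as the number of predictors increases:

```python
def adjusted_r_squared(r_squared, n, p):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - r_squared) * (n - 1) / (n - p - 1)

# Hypothetical scenario: R^2 = 0.85 from n = 50 observations
few = adjusted_r_squared(0.85, n=50, p=3)    # few predictors: small penalty
many = adjusted_r_squared(0.85, n=50, p=20)  # many predictors: larger penalty
print(round(few, 4), round(many, 4))
```

With the same raw R², the 20-predictor model receives a noticeably lower adjusted score, reflecting the risk that many of those predictors add noise rather than signal.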
Comparison with Other Metrics
R-squared is just one of many regression evaluation metrics. While R-squared measures the proportion of variance explained, other metrics provide different insights:
- RMSE and MAE: Assess absolute prediction errors in the same units as your data
- AIC and BIC: Help with model selection by balancing fit and complexity
- Cross-validation: Evaluates out-of-sample performance
When to use each metric:
- For quick assessment of model fit: Use R-squared
- For understanding prediction errors: Use RMSE or MAE
- For model selection: Use AIC or BIC
- For out-of-sample performance: Use cross-validation
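The sketch below, using made-up predictions, computes R², RMSE, and MAE side by side on the same data, showing how each metric summarizes error differently:

```python
import numpy as np

# Hypothetical actual and predicted values
y_true = np.array([3.0, 5.0, 7.0, 9.0, 11.0])
y_pred = np.array([2.8, 5.3, 6.9, 9.4, 10.6])

residuals = y_true - y_pred
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)

r2 = 1 - ss_res / ss_tot                 # proportion of variance explained
rmse = np.sqrt(np.mean(residuals ** 2))  # typical error, in the data's units
mae = np.mean(np.abs(residuals))         # average error, less sensitive to large misses
print(f"R^2={r2:.3f}  RMSE={rmse:.3f}  MAE={mae:.3f}")
```

R² is unitless, while RMSE and MAE are in the same units as y, which is why they are the better choice when you need to communicate how far off predictions are in practical terms.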
Common Misconceptions
Several misconceptions about R-squared are widespread:
- Higher R-squared is always better: Not true! A model with a very high R-squared (say, 0.99) might be overfitted and perform poorly on new data.
- R-squared indicates causation: R-squared only measures correlation, not causation. A high R-squared doesn't mean your predictors cause changes in the outcome.
- Low R-squared is always bad: Context matters. In the social sciences, an R-squared of 0.3 might be considered excellent, while in physics an R-squared of 0.9 might be unacceptable.
- R-squared works for all models: R-squared is designed for linear regression. For logistic regression and other models, you need pseudo-R-squared measures.
When R-squared is Misleading
R-squared can be misleading in several scenarios:
- Nonlinear relationships: R-squared assumes linearity and might be low even when the model captures the true relationship well.
- Outliers: Outliers can dramatically affect R-squared, making it unreliable for assessing overall model performance.
- Heteroscedasticity: When variance changes across the prediction range, R-squared might not reflect true model quality.
- Overfitting: R-squared can be artificially high when the model has too many parameters relative to observations.
- Small sample sizes: With very few observations, R-squared can be unstable and misleading.
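The outlier effect is easy to demonstrate. The sketch below, using synthetic data, fits clean linear data, then appends a single extreme outlier and refits, showing how sharply R² can fall:

```python
import numpy as np

def r_squared(x, y):
    """Fit a simple linear regression and return its R^2."""
    slope, intercept = np.polyfit(x, y, 1)
    y_hat = slope * x + intercept
    return 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 30)
y = 2 * x + 1 + rng.normal(scale=0.5, size=x.size)  # clean linear data

r2_clean = r_squared(x, y)  # high: the line fits well

# Add one extreme outlier and refit
x_out = np.append(x, 5.0)
y_out = np.append(y, 60.0)
r2_outlier = r_squared(x_out, y_out)  # drops noticeably

print(round(r2_clean, 3), round(r2_outlier, 3))
```

A single bad point inflates the residual sum of squares far more than the total sum of squares, so R² collapses even though the underlying relationship for the other 30 points is unchanged.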
Limitations
- R-squared does not indicate whether a regression model is appropriate.
- It can be artificially high for models with many predictors (use adjusted R-squared for multiple regression).
- A high R-squared does not imply causation.
- R-squared doesn't measure prediction accuracy on new data.
- It can be misleading with nonlinear relationships or outliers.
R-squared is a useful first check for model fit, but always consider it alongside other diagnostics and domain knowledge.
Summary
In summary, R-squared measures how well a regression model explains the variability of the dependent variable. For most fitted models it falls between 0 and 1, with higher values indicating a better fit, though it can be negative for models that predict worse than the mean. However, R-squared alone does not guarantee that the model is appropriate or meaningful, so it should be interpreted in context and used alongside other evaluation metrics.
About the author: Michael Brenndoerfer
All opinions expressed here are my own and do not reflect the views of my employer.
Michael currently works as an Associate Director of Data Science at EQT Partners in Singapore, where he drives AI and data initiatives across private capital investments.
With over a decade of experience spanning private equity, management consulting, and software engineering, he specializes in building and scaling analytics capabilities from the ground up. He has published research in leading AI conferences and holds expertise in machine learning, natural language processing, and value creation through data.