
Coefficient of Determination (R2) Calculator

Enter your paired x and y values (separated by commas or spaces) to compute the coefficient of determination, R2. The calculator fits a least-squares regression line, breaks down the sum of squares into explained (SSR) and unexplained (SSE) parts, reports the total (SST), and tells you exactly what percentage of variation in y is explained by x. You also get the Pearson correlation coefficient r, the regression equation, and a step-by-step worked solution.

By Dr. Hannah Brandt, PhD · Updated June 7, 2026

R2 (coefficient of determination): 0.9567 (very strong fit)

Proportion of variance in y explained by x (0 to 1)

Variance explained: 95.67%

Pearson r: 0.9781

Slope (b1): 0.9214

Intercept (b0): 0.7286

SSE (error): 1.0757

SSR (regression): 23.7729

SST (total): 24.8486

n (data points): 7

[Gauge: R2 = 0.9567 on a scale of very weak (<0.25), weak (0.25-0.5), moderate (0.5-0.7), strong (0.7-0.9), very strong (0.9+)]

[Chart: observed data with the fitted regression line (R2 = 0.9567)]

R2 = 0.9567: a very strong linear fit, explaining 95.7% of variance in y.

95.7% of the variation in y is explained by the linear relationship with x; the remaining 4.3% is due to other factors or random noise.
The regression line is y = 0.9214x + 0.7286.
With 7 data points, the Pearson correlation r is 0.9781, indicating a very strong positive linear association.
An extremely high R2 can sometimes indicate overfitting or a spurious correlation; always check whether the relationship makes scientific or practical sense.

Next step: Plot your residuals to check the linearity assumption and look for outliers, then consider whether additional predictors could improve the model.

Compute the means from 7 data points: x-bar = sum(x) / n, y-bar = sum(y) / n
x-bar = 4.0000, y-bar = 4.4143
Fit the OLS regression line, slope b1 = Sxy / Sxx: b1 = sum[(xi - x-bar)(yi - y-bar)] / sum[(xi - x-bar)^2]
b1 = 0.9214
Compute the intercept b0 = y-bar - b1 * x-bar: 4.4143 - 0.9214 * 4.0000
b0 = 0.7286, so the regression line is y = 0.9214x + 0.7286
Sum of Squared Errors SSE = sum[(yi - y-hat-i)^2]: squared residuals summed over all points
SSE = 1.0757
Regression Sum of Squares SSR = sum[(y-hat-i - y-bar)^2]: squared deviations of fitted values from the mean
SSR = 23.7729
Total Sum of Squares SST = SSR + SSE: 23.7729 + 1.0757
SST = 24.8486
Coefficient of Determination R2 = SSR / SST: 23.7729 / 24.8486
R2 = 0.9567 (95.67% of variance explained)
Pearson r = sqrt(R2) with the sign of the slope: sqrt(0.9567), taken as positive because the slope is positive
r = 0.9781
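
The last three steps can be re-checked from the rounded sums reported above. A minimal Python sketch, assuming only those reported values (the variable names are illustrative, not part of the calculator):

```python
from math import copysign, sqrt

# Rounded values reported in the worked solution above.
ssr = 23.7729    # regression sum of squares
sse = 1.0757     # error sum of squares
slope = 0.9214   # fitted slope b1

sst = ssr + sse                  # total sum of squares
r2 = ssr / sst                   # coefficient of determination
r = copysign(sqrt(r2), slope)    # r takes the sign of the slope

print(f"SST = {sst:.4f}, R2 = {r2:.4f}, r = {r:.4f}")
# prints: SST = 24.8486, R2 = 0.9567, r = 0.9781
```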

What is the coefficient of determination (R2)?

The coefficient of determination, written R2 (pronounced "R-squared"), measures how well a regression model fits its data. It ranges from 0 to 1 and answers one simple question: what fraction of the total variation in y does the linear relationship with x account for? An R2 of 0.85 means the model explains 85% of the variation in y, leaving 15% unexplained by x alone. At R2 = 1 every data point falls exactly on the regression line. At R2 = 0 the model is no better than simply predicting the mean of y for every observation. R2 is the square of the Pearson correlation coefficient r in simple linear regression, so a correlation of 0.9 gives R2 = 0.81.

How R2 is calculated: SSE, SSR, and SST

R2 is built from three sums of squares. The Total Sum of Squares (SST) measures how much y varies overall: SST = sum[(yi - y-bar)^2]. The Regression Sum of Squares (SSR) captures how much variation the fitted line accounts for: SSR = sum[(y-hat-i - y-bar)^2]. The Error Sum of Squares (SSE) is the leftover variation the model cannot explain: SSE = sum[(yi - y-hat-i)^2]. For a least-squares line with an intercept, the residuals are uncorrelated with the fitted values, so the cross term vanishes and SST = SSR + SSE exactly; hence R2 = SSR / SST = 1 - SSE / SST. The "show your work" panel above traces every step with your actual numbers: mean computation, OLS slope and intercept, each sum of squares, and the final ratio.
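
The decomposition can be sketched in a few lines of plain Python. This is an illustrative implementation under stated assumptions, not the calculator's actual code: the function name `fit_and_decompose` and the sample data are hypothetical.

```python
def fit_and_decompose(xs, ys):
    """Fit y = b0 + b1*x by least squares and return the sums of squares."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least 3 paired observations")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sst = sum((y - y_bar) ** 2 for y in ys)
    if sxx == 0 or sst == 0:
        raise ValueError("x and y must each take more than one distinct value")
    b1 = sxy / sxx                 # slope
    b0 = y_bar - b1 * x_bar        # intercept
    fitted = [b0 + b1 * x for x in xs]
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))   # unexplained
    ssr = sum((f - y_bar) ** 2 for f in fitted)           # explained
    return {"b1": b1, "b0": b0, "sse": sse, "ssr": ssr, "sst": sst,
            "r2": ssr / sst}

# Hypothetical sample data, chosen only to keep the arithmetic easy to follow.
result = fit_and_decompose([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print({name: round(value, 4) for name, value in result.items()})
```

For this sample the slope is 0.6 and the intercept 2.2, with SSE = 2.4, SSR = 3.6, and SST = 6, so R2 = 0.6 by either formula.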

How to use this calculator

Paste or type your x values (predictor) in the first field and your y values (response) in the second, separated by commas, semicolons, or spaces. You need at least 3 paired observations. The calculator instantly reports R2, the percentage of variance explained, the Pearson r, and the full sum-of-squares breakdown. The gauge shows your R2 on a color-coded scale from very weak (red) to very strong (green). Use the chart to confirm that a straight-line model is visually reasonable before trusting the R2 value.
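
If you want to prepare the same kind of input in your own script, the separator rules described above might be handled as follows. This is a sketch only: `parse_values` and `parse_pairs` are hypothetical helper names, and the calculator's real parsing may differ.

```python
import re

def parse_values(text):
    """Split on commas, semicolons, or whitespace and convert to floats."""
    tokens = [t for t in re.split(r"[,;\s]+", text.strip()) if t]
    return [float(t) for t in tokens]

def parse_pairs(x_text, y_text, min_points=3):
    """Parse both fields and check that they form enough (x, y) pairs."""
    xs, ys = parse_values(x_text), parse_values(y_text)
    if len(xs) != len(ys):
        raise ValueError(f"got {len(xs)} x values but {len(ys)} y values")
    if len(xs) < min_points:
        raise ValueError(f"need at least {min_points} paired observations")
    return xs, ys

xs, ys = parse_pairs("1, 2; 3 4", "2.1 3.9 6.2 8.0")
print(xs, ys)
```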

Limitations and common mistakes

R2 only measures linear fit. A perfect U-shaped relationship between x and y can produce an R2 near zero even though x determines y exactly, because the best straight line through a symmetric curve is nearly flat. Adding more predictors to a model never decreases R2 and almost always increases it, even when the new predictors are random noise, which is why adjusted R2 is preferred for comparing multi-variable models. A high R2 does not mean x causes y: a spurious correlation between two unrelated time series can produce R2 close to 1. Always plot your data and residuals to check for non-linearity, outliers, and other violations before drawing conclusions from R2 alone.
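
The U-shape case is easy to reproduce. The sketch below uses made-up data, y = x^2 on a range symmetric around zero, and a hypothetical helper `r_squared`; it is meant only to illustrate the point, not to mirror the calculator's code.

```python
def r_squared(xs, ys):
    """R2 of the least-squares line through (xs, ys), as 1 - SSE / SST."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
          / sum((x - x_bar) ** 2 for x in xs))
    b0 = y_bar - b1 * x_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - y_bar) ** 2 for y in ys)
    return 1 - sse / sst

xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]      # y is an exact function of x
print(r_squared(xs, ys))      # prints 0.0
```

Here the fitted slope is exactly zero, so the line is just the mean of y and explains none of the variation, even though x predicts y perfectly through a quadratic.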

R2 interpretation guide

R2 range	Strength label	Common interpretation
0.90 - 1.00	Very strong	Model explains most variability; check for overfitting
0.70 - 0.89	Strong	Good predictive power in most applied fields
0.50 - 0.69	Moderate	Useful but other predictors likely matter
0.25 - 0.49	Weak	Limited predictive value; revisit model specification
0.00 - 0.24	Very weak	Model barely outperforms predicting the mean

Widely used benchmarks for interpreting R2 in applied regression. Context matters: a "weak" R2 in social science may be respectable, while a "strong" one in engineering may still be too low for a control system.
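
The table can be expressed as a small lookup. In this sketch the function name `strength_label` is hypothetical, and it assumes each band includes its lower bound (so 0.90 counts as very strong), which is one reasonable reading of the table.

```python
def strength_label(r2):
    """Map an R2 value in [0, 1] to the strength label from the guide."""
    if not 0.0 <= r2 <= 1.0:
        raise ValueError("R2 from an OLS fit should lie between 0 and 1")
    if r2 >= 0.90:
        return "Very strong"
    if r2 >= 0.70:
        return "Strong"
    if r2 >= 0.50:
        return "Moderate"
    if r2 >= 0.25:
        return "Weak"
    return "Very weak"

print(strength_label(0.9567))   # prints Very strong
```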

Frequently asked questions

What is a good R2 value?

It depends on the field. Physical sciences and engineering typically expect R2 above 0.95 for a model to be useful. In economics and finance, R2 of 0.5-0.7 is often considered strong. In psychology and social science, 0.3 can be notable because human behavior is highly variable. The reference table on this page gives widely used benchmarks, but always interpret R2 in the context of your specific domain and the alternatives available.

What is the difference between R and R2?

In simple linear regression, R (uppercase) is the absolute value of the Pearson correlation coefficient r, and R2 is its square. If r = 0.9, then R2 = 0.81. R2 has a cleaner interpretation as a proportion of variance explained, while r directly encodes the sign of the relationship (positive or negative). In multiple regression, R is the multiple correlation coefficient (correlation between y and y-hat), and R2 remains the proportion of variance explained.

Can R2 be negative?

In ordinary least-squares regression with an intercept, R2 is always between 0 and 1. The OLS line minimizes SSE over all possible lines, and the horizontal line at the mean of y is one of them, so the fitted line can never do worse than the mean. However, if you compute R2 for a model whose line was NOT estimated by OLS (for example a line with a manually fixed slope), or if you apply a model trained on one dataset to a different dataset, then the formula 1 - SSE / SST can go negative, meaning the model is literally worse than predicting the mean.
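
A short illustration of the negative case, using made-up data and a hypothetical helper `r2_for_line` that scores any given line with 1 - SSE / SST:

```python
def r2_for_line(xs, ys, slope, intercept):
    """Score an arbitrary line (not necessarily OLS) with 1 - SSE / SST."""
    y_bar = sum(ys) / len(ys)
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - y_bar) ** 2 for y in ys)
    return 1 - sse / sst

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

ols = r2_for_line(xs, ys, slope=0.6, intercept=2.2)      # the OLS fit for this data
fixed = r2_for_line(xs, ys, slope=-1.0, intercept=7.0)   # manually chosen line
print(round(ols, 4), round(fixed, 4))
```

The OLS line scores about 0.6, while the manually fixed downward line has SSE = 28 against SST = 6 and so scores well below zero.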

What is adjusted R2 and when should I use it?

Adjusted R2 penalizes the ordinary R2 for every extra predictor added to a multiple regression model. The formula is 1 - (1 - R2)(n - 1) / (n - k - 1), where n is the sample size and k is the number of predictors. Use adjusted R2 whenever you are comparing models with different numbers of predictors. This calculator focuses on simple linear regression (one x, one y), where adjusted R2 = 1 - (1 - R2)(n - 1) / (n - 2).
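
As a quick check of the formula, here is a minimal sketch (the function name `adjusted_r2` is illustrative) applied to the example result on this page, R2 = 0.9567 with n = 7 and one predictor:

```python
def adjusted_r2(r2, n, k=1):
    """Adjusted R2 for n observations and k predictors."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1 observations")
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(f"{adjusted_r2(0.9567, n=7):.4f}")   # prints 0.9480
```

With only 7 points, the penalty lowers the value slightly, from 0.9567 to about 0.948.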

Why can a high R2 be misleading?

Spurious correlations, overfitting, and non-random data collection can all inflate R2 without producing a useful model. Time-series data where both variables trend upward together often show R2 close to 1 even if x has no causal link to y. Adding irrelevant predictors to a model can never lower R2 and usually nudges it upward. In the other direction, a low R2 can hide a real relationship: a perfectly quadratic or sinusoidal relationship between x and y, sampled over a symmetric range, can give R2 near 0 even though x is a perfect predictor. Always visualize the data and residuals, not just the R2 number.
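
The trending time-series effect can be seen with two invented series that simply both rise over time. The numbers below are made up for illustration and the helper `r2_from_correlation` is hypothetical; it relies on R2 being the square of Pearson r in simple linear regression.

```python
def r2_from_correlation(a, b):
    """R2 of a simple linear regression, computed as the squared Pearson r."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    saa = sum((u - a_bar) ** 2 for u in a)
    sbb = sum((v - b_bar) ** 2 for v in b)
    sab = sum((u - a_bar) * (v - b_bar) for u, v in zip(a, b))
    return sab ** 2 / (saa * sbb)

# Two invented yearly series with no causal link; both just drift upward.
series_a = [10, 12, 13, 15, 18, 19, 22, 24]
series_b = [3.1, 3.4, 4.0, 4.2, 4.9, 5.5, 5.8, 6.4]

print(round(r2_from_correlation(series_a, series_b), 3))
```

The result is above 0.95 purely because both series share an upward trend, which is why a high R2 on trending data deserves extra scrutiny.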


Written by Dr. Hannah Brandt, PhD · Statistician · Munich, Germany

Applied statistician translating rigorous probability theory into clear, accurate tools for researchers and practitioners.
