MSE Calculator - Mean Squared Error

Enter your actual and predicted values as comma-separated lists to compute the Mean Squared Error (MSE) and a full suite of companion error metrics: RMSE, SSE, MAE, MAPE, and R-squared. A step-by-step panel shows every residual and squared difference so you can follow the calculation exactly. Results update as you type.

By Dr. Hannah Brandt, PhD · Updated June 7, 2026

MSE: 6.0000 (Excellent fit)

Mean Squared Error - the average of squared residuals

RMSE: 2.4495

SSE: 30

MAE: 2.4

MAPE: 10.30%

R-squared: 0.97

n (observations): 5

R-squared gauge: 0.97. Poor fit: below 0.5. Moderate: 0.5-0.7. Good fit: 0.7-0.9. Excellent: 0.9 and above.

[Chart: actual and predicted values plotted against observation index]

MSE = 6.0000, RMSE = 2.4495 across 5 observations.

MSE of 6.0000 means the average squared deviation between actual and predicted values is 6.0000.
RMSE of 2.4495 expresses that error in the same units as your data, making it easier to interpret alongside the scale of your values.
R-squared of 0.9700 indicates the model explains 97.0% of the variance in the actual values.
MAPE of 10.30% gives the average percentage error relative to actual values, useful when comparing errors across different scales.

Next step: With an R-squared above 0.9 your model fits the data well. Consider checking for overfitting if this is a training set.

Compute each residual (actual - predicted): (10 - 12), (20 - 18), (30 - 33), (40 - 37), (50 - 52) = -2.00, 2.00, -3.00, 3.00, -2.00
Square each residual: (-2.00)², (2.00)², (-3.00)², (3.00)², (-2.00)² = 4.0000, 4.0000, 9.0000, 9.0000, 4.0000
Sum all squared residuals to get SSE: SSE = sum of all (actual - predicted)² = 30.0000
Divide SSE by n to get MSE: 30.0000 / 5 = 6.0000
Take the square root of MSE to get RMSE: sqrt(6.0000) = 2.4495
Mean Absolute Error, the average of |residuals|: (|-2.00| + |2.00| + |-3.00| + |3.00| + |-2.00|) / 5 = 2.4000
R-squared, 1 - SSE / SStot (mean actual = 30.0000, SStot = 400 + 100 + 0 + 100 + 400 = 1000.0000): 1 - 30.0000 / 1000.0000 = 0.9700

Residuals breakdown

#	Actual	Predicted	Residual	Squared error	Abs % error
1	10	12	-2.0000	4.0000	20.00%
2	20	18	2.0000	4.0000	10.00%
3	30	33	-3.0000	9.0000	10.00%
4	40	37	3.0000	9.0000	7.50%
5	50	52	-2.0000	4.0000	4.00%

Residual = Actual - Predicted. Squared error is the residual squared. Observations where actual = 0 show N/A for percentage error.
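A table like the one above could be built with a few lines of Python. This is a minimal sketch, not the calculator's own code: the function name residual_breakdown is hypothetical, and the only behaviour taken from this page is that rows with actual = 0 show N/A for the percentage error.

```python
def residual_breakdown(actual, predicted):
    """Return one tuple per observation:
    (index, actual, predicted, residual, squared error, abs % error or None)."""
    rows = []
    for i, (a, p) in enumerate(zip(actual, predicted), start=1):
        residual = a - p
        # Percentage error is undefined when the actual value is zero.
        abs_pct = abs(residual / a) * 100 if a != 0 else None
        rows.append((i, a, p, residual, residual ** 2, abs_pct))
    return rows


rows = residual_breakdown([10, 20, 30, 40, 50], [12, 18, 33, 37, 52])

print("#\tActual\tPredicted\tResidual\tSquared error\tAbs % error")
for i, a, p, residual, squared, abs_pct in rows:
    pct_text = "N/A" if abs_pct is None else f"{abs_pct:.2f}%"
    print(f"{i}\t{a}\t{p}\t{residual:.4f}\t{squared:.4f}\t{pct_text}")
```

Sorting the rows by squared error is a quick way to see which observations contribute most to MSE.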

Formula

MSE = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2, \quad RMSE = \sqrt{MSE}, \quad R^2 = 1 - \frac{SSE}{SS_{tot}}

Worked example

Actual = [10, 20, 30], Predicted = [12, 18, 33]. Residuals: -2, 2, -3. Squared: 4, 4, 9. SSE = 17. MSE = 17/3 = 5.667. RMSE = sqrt(5.667) = 2.380. MAE = (2+2+3)/3 = 2.333. Mean actual = 20, SStot = (100+0+100) = 200, R-squared = 1 - 17/200 = 0.915.
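The same worked example can be checked in Python. The sketch below applies the formulas from this page directly; the function name error_metrics and the choice to return NaN when every actual value is identical are assumptions made for illustration.

```python
import math


def error_metrics(actual, predicted):
    """Return SSE, MSE, RMSE, MAE and R-squared for two equal-length lists."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    n = len(actual)
    residuals = [a - p for a, p in zip(actual, predicted)]
    sse = sum(r ** 2 for r in residuals)
    mse = sse / n
    rmse = math.sqrt(mse)
    mae = sum(abs(r) for r in residuals) / n
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    # Assumption: report NaN when all actual values are identical (SStot = 0).
    r2 = 1 - sse / ss_tot if ss_tot != 0 else float("nan")
    return {"sse": sse, "mse": mse, "rmse": rmse, "mae": mae, "r2": r2}


metrics = error_metrics([10, 20, 30], [12, 18, 33])
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Run as written, this prints sse 17.000, mse 5.667, rmse 2.380, mae 2.333 and r2 0.915, matching the figures in the worked example.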

What is Mean Squared Error (MSE)?

Mean Squared Error (MSE) is one of the most widely used metrics for measuring how well a set of predicted values matches a set of actual (observed) values. For each observation you compute the residual, which is the difference between the actual value and the predicted value, and square it, which makes it non-negative and penalises large errors more heavily. You then average all those squared residuals. Because of the squaring step, a prediction that is 10 units away adds 100 to the sum of squared errors, four times as much as the 25 added by a prediction that is 5 units away. MSE is used in regression analysis, machine learning evaluation, time-series forecasting, and any field where a numerical prediction is compared with a ground-truth measurement.

How to use this calculator

Paste your actual values into the first field and your predicted values into the second field, separating each number with a comma. Both lists must contain the same number of values and at least two observations. The calculator returns MSE, RMSE, SSE, MAE, MAPE, and R-squared all at once. The "Show your work" panel shows every residual, its square, and how the averages are built up. The residuals breakdown table below the results lists each observation individually so you can spot which predictions are driving the error. The actual-versus-predicted chart makes the gap between the two series visible at a glance.
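If you want to reproduce the input rules in your own script, a rough Python sketch might look like the following. The names parse_values and check_inputs are hypothetical, and details such as ignoring a trailing comma are assumptions; the page does not say how the calculator itself handles them.

```python
def parse_values(text):
    """Parse a comma-separated string such as '10, 20, 30' into a list of floats."""
    parts = [part.strip() for part in text.split(",")]
    # Assumption: empty fragments (for example after a trailing comma) are ignored.
    return [float(part) for part in parts if part]


def check_inputs(actual, predicted):
    """Enforce the two rules stated above: equal lengths and at least two observations."""
    if len(actual) != len(predicted):
        raise ValueError("both lists must contain the same number of values")
    if len(actual) < 2:
        raise ValueError("at least two observations are required")


actual = parse_values("10, 20, 30, 40, 50")
predicted = parse_values("12, 18, 33, 37, 52")
check_inputs(actual, predicted)
print(actual, predicted)
```

A non-numeric entry makes float() raise a ValueError, so all three kinds of bad input surface as the same exception type.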

MSE, RMSE, MAE and MAPE - which metric to choose?

MSE is the mathematical foundation: it is differentiable everywhere, which makes it convenient for calculus-based optimisation (for example, gradient descent on a linear model is usually set up to minimise MSE). Its weakness is that squaring inflates the influence of outliers and puts the result in squared units, making it hard to compare against the scale of the original data. RMSE restores the original units by taking the square root of MSE, so an RMSE of 3.5 when forecasting temperatures in degrees Celsius tells you that a typical prediction is off by roughly 3.5 degrees. Because of the squaring, RMSE is never smaller than MAE and weights large misses more heavily. MAE uses absolute values instead of squares, giving outliers less weight and producing a number that is easier to interpret directly: it is the average size of a miss. However, MAE is not differentiable at zero, which complicates some optimisers. MAPE expresses the error as a percentage of each actual value, making it unit-free and comparable across datasets of very different scales, though it is undefined when any actual value is zero.
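To see how differently the metrics react to one bad prediction, here is a small Python sketch. The first prediction list is the example used in the results panel; the second replaces the last prediction with a made-up value of 80, which is an assumption chosen purely for illustration.

```python
import math


def mse(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mae(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


actual = [10, 20, 30, 40, 50]
clean = [12, 18, 33, 37, 52]
one_big_miss = [12, 18, 33, 37, 80]  # last prediction is off by 30 (made-up value)

for label, predicted in [("clean", clean), ("one big miss", one_big_miss)]:
    m = mse(actual, predicted)
    print(f"{label}: MSE = {m:.1f}, RMSE = {math.sqrt(m):.2f}, "
          f"MAE = {mae(actual, predicted):.1f}")
```

With the single large miss, MSE rises from 6.0 to 185.2 (about 31 times larger) while MAE rises from 2.4 to 8.0 (about 3.3 times larger), which is the outlier sensitivity described above.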

R-squared and model fit

R-squared (the coefficient of determination) tells you what fraction of the variance in your actual values is explained by your predicted values. An R-squared of 0.85 means your predictions account for 85% of the variability; the remaining 15% is unexplained error. R-squared can range from negative infinity up to 1. Perfect predictions give R-squared = 1, predicting the mean every time gives 0, and values below 0 indicate a model that performs worse than that baseline. Unlike MSE, R-squared is scale-free, so it is useful for comparing models across different target variables. However, when a least-squares regression is evaluated on its own training data, R-squared never decreases as you add more predictors, even irrelevant ones, which is why adjusted R-squared exists for regression contexts.
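Adjusted R-squared is not one of the metrics listed for this calculator, since computing it requires the number of predictors in the model. As a rough sketch, the standard adjustment can be written in Python as below; the function name adjusted_r2 and the predictor counts in the example are assumptions for illustration.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-squared for n observations and p predictors (intercept not counted)."""
    if n - p - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Same R-squared and sample size as the example results, with assumed predictor counts.
print(f"{adjusted_r2(0.97, n=5, p=1):.4f}")
print(f"{adjusted_r2(0.97, n=5, p=3):.4f}")
```

With n = 5, an R-squared of 0.97 adjusts to 0.9600 for one predictor and 0.8800 for three, showing how the penalty grows as predictors are added to a small sample.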

R-squared interpretation guide

R-squared range	Interpretation	Typical use case
0.90 - 1.00	Excellent fit	Physical models, controlled experiments
0.70 - 0.89	Good fit	Econometrics, engineering predictions
0.50 - 0.69	Moderate fit	Social science, biological data
0.30 - 0.49	Weak fit	Complex behavioral models
Below 0.30	Poor fit	Review model structure

General benchmarks for R-squared. Acceptable thresholds vary by field - cross-sectional social science data often falls below 0.5 while physical measurements can reach 0.99.

Frequently asked questions

What is the MSE formula?

MSE = (1/n) x the sum of (actual_i - predicted_i) squared, where n is the number of observations. You subtract each predicted value from the corresponding actual value, square the result, add all those squared values together, then divide by n. This calculator does every step for you and shows the breakdown in the "Show your work" panel.

What is a good MSE value?

There is no universal threshold because MSE is in squared units of your data. An MSE of 4 is poor when predicting people's heights in metres (an RMSE of 2 m) but negligible when predicting house prices in dollars (an RMSE of 2 dollars). Compare MSE relative to the scale of your target variable, or use RMSE (which is in the same units as the data) and compare it against the mean or standard deviation of the actual values. MAPE and R-squared are better for scale-free comparisons across different datasets.
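One way to put RMSE in context is to divide it by the standard deviation or the mean of the actual values. The Python sketch below does this for the five-point example from the results panel; using the population standard deviation (statistics.pstdev) is an assumption, chosen because it matches the divide-by-n convention of MSE.

```python
import math
import statistics

actual = [10, 20, 30, 40, 50]
predicted = [12, 18, 33, 37, 52]

mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
rmse = math.sqrt(mse)
spread = statistics.pstdev(actual)      # population standard deviation of the actuals
mean_actual = statistics.mean(actual)

print(f"RMSE = {rmse:.4f}")
print(f"RMSE / standard deviation = {rmse / spread:.3f}")
print(f"RMSE / mean = {rmse / mean_actual:.3f}")
```

Here RMSE is about 0.173 standard deviations and about 0.082 of the mean. With this convention the first ratio equals the square root of (1 - R-squared), so a ratio near 1 means the predictions are little better than always guessing the mean.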

What is the difference between MSE and RMSE?

RMSE is the square root of MSE. Both measure the same thing - average squared prediction error - but RMSE expresses it in the original units of your data. If you are predicting temperatures in Celsius, MSE is in Celsius squared, while RMSE is in Celsius, which is far easier to reason about. RMSE is therefore the preferred metric when communicating accuracy to a non-technical audience.

What is the difference between MSE and MAE?

MSE squares each residual before averaging, which gives large errors disproportionate weight. MAE (Mean Absolute Error) takes the absolute value instead, so every residual contributes linearly. If your data has outliers that should not dominate the result, MAE is more robust. If your optimisation algorithm relies on gradient descent, or you want to penalise large errors heavily so that the worst predictions are kept small, MSE (or RMSE) is more appropriate.

Why is MAPE not calculated for some datasets?

MAPE divides each residual by the corresponding actual value, so it is undefined when any actual value equals zero. This calculator skips zero-valued observations when computing MAPE and labels those rows as N/A in the breakdown table. If all actual values are zero, MAPE is not reported at all.
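The skipping rule described above can be sketched in Python as follows. The function name mape and the use of None to mean "not reported" are assumptions; only the behaviour (skip zero actuals, report nothing if all actuals are zero) comes from this page.

```python
def mape(actual, predicted):
    """Mean absolute percentage error over observations whose actual value is non-zero.

    Returns None when every actual value is zero, since nothing can be averaged.
    """
    pct_errors = [
        abs((a - p) / a) * 100
        for a, p in zip(actual, predicted)
        if a != 0  # skip observations where the percentage error is undefined
    ]
    if not pct_errors:
        return None
    return sum(pct_errors) / len(pct_errors)


print(mape([10, 20, 30, 40, 50], [12, 18, 33, 37, 52]))  # all five rows used
print(mape([0, 20], [1, 18]))                            # first row skipped
print(mape([0, 0], [1, 2]))                              # nothing to average
```

The first call averages 20%, 10%, 10%, 7.5% and 4%, giving approximately 10.3; the second uses only the 10% error from the non-zero row; the third returns None.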

Can R-squared be negative?

Yes. R-squared is defined as 1 - SSE/SStot. If your predicted values fit the data worse than a horizontal line at the mean (that is, your model is actively unhelpful), SSE exceeds SStot and R-squared goes negative. A negative R-squared is a strong signal that something is wrong with the model or the alignment of predicted and actual values.
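A short Python sketch makes the sign change concrete. It reuses the three-point worked example and then pastes the same predictions in reverse order to mimic a misaligned input; the function name r_squared is hypothetical and the code assumes the actual values are not all identical.

```python
def r_squared(actual, predicted):
    mean_actual = sum(actual) / len(actual)
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - sse / ss_tot  # assumes SStot is not zero


actual = [10, 20, 30]

print(f"{r_squared(actual, [12, 18, 33]):.3f}")  # sensible predictions
print(f"{r_squared(actual, [20, 20, 20]):.3f}")  # always predict the mean
print(f"{r_squared(actual, [33, 18, 12]):.3f}")  # same predictions in reverse order
```

The three results are 0.915, 0.000 and -3.285. In the last case SSE is 857 against an SStot of 200, so the predictions are far worse than the horizontal line at the mean.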

How many data points do I need?

This calculator requires at least two observations. In practice, error metrics become reliable only with a reasonable sample size. With fewer than 10 points a single outlier can dominate MSE and RMSE; with 30 or more points the metrics are more stable. For machine learning evaluation, compute metrics on a held-out test set rather than the training set to avoid optimistic bias.

Written by Dr. Hannah Brandt, PhD Statistician · Munich, Germany

Applied statistician translating rigorous probability theory into clear, accurate tools for researchers and practitioners.
