Question 1

What does the sum of squares measure?

Accepted Answer

The statistical sum of squares (SS) measures the total variability in a dataset. It answers the question: how far do the individual values stray from the mean, in total, when each distance is squared? An SS of zero means every value equals the mean. A large SS means the values are widely spread, although SS also grows with the number of observations, so datasets of different sizes are better compared after dividing by degrees of freedom. Dividing SS by (n - 1) converts it into the sample variance, one of the most widely used measures of spread.
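As a rough illustration, the calculation can be sketched in a few lines of Python. The data values and variable names here are made up for the example, and the standard library's statistics.variance is used only as a cross-check.

```python
import statistics

# Hypothetical sample; any list of numbers works.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(data) / len(data)

# Statistical SS: squared distances from the mean, added up.
ss = sum((x - mean) ** 2 for x in data)

# Dividing SS by (n - 1) gives the sample variance.
sample_variance = ss / (len(data) - 1)

print(ss)                         # 32.0
print(round(sample_variance, 2))  # 4.57
print(statistics.variance(data))  # should agree with sample_variance
```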

Question 2

What is the difference between statistical SS and algebraic SS?

Accepted Answer

Statistical SS = Σ(xᵢ - x̅)² subtracts the mean from each value before squaring. It measures dispersion around the mean and is used for variance, standard deviation, ANOVA, and regression. Algebraic SS = Σxᵢ² simply squares each raw value and sums them, with no mean subtraction. The two are connected by the identity Σxᵢ² = SS + n * x̅², so you can switch between them if you know the count and mean.
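The identity can be checked numerically. The following is a minimal sketch with invented data, not a proof; the variable names are illustrative only.

```python
import math

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values
n = len(data)
mean = sum(data) / n

statistical_ss = sum((x - mean) ** 2 for x in data)  # deviations from the mean, squared
algebraic_ss = sum(x ** 2 for x in data)             # raw values, squared

# Identity: algebraic SS = statistical SS + n * mean^2
identity_holds = math.isclose(algebraic_ss, statistical_ss + n * mean ** 2)

print(statistical_ss, algebraic_ss)  # 32.0 232
print(identity_holds)                # True
```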

Question 3

How is sum of squares used in ANOVA?

Accepted Answer

One-way ANOVA splits the total sum of squares (SS Total = Σ(xᵢ - grand mean)²) into two components: SS Between, which captures how much the group means differ from the grand mean, and SS Within, which captures the variation of the values around their own group mean. Dividing each by its degrees of freedom (k - 1 for SS Between and N - k for SS Within, with k groups and N observations in total) gives the mean squares. The F-statistic is MS Between / MS Within. A large F, judged against the F distribution with those degrees of freedom, indicates that the between-group differences are unlikely to be due to chance alone.
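The decomposition can be sketched in plain Python. The group labels and values below are invented for illustration, and the code computes only the F-statistic, not a p-value.

```python
# Hypothetical groups; labels and values are made up for this example.
groups = {
    "A": [4, 5, 6],
    "B": [7, 8, 9],
    "C": [10, 11, 12],
}

all_values = [x for values in groups.values() for x in values]
grand_mean = sum(all_values) / len(all_values)

# SS Total: every value against the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_values)

ss_between = 0.0
ss_within = 0.0
for values in groups.values():
    group_mean = sum(values) / len(values)
    # SS Between: group mean against grand mean, weighted by group size.
    ss_between += len(values) * (group_mean - grand_mean) ** 2
    # SS Within: each value against its own group mean.
    ss_within += sum((x - group_mean) ** 2 for x in values)

df_between = len(groups) - 1               # k - 1
df_within = len(all_values) - len(groups)  # N - k
f_stat = (ss_between / df_between) / (ss_within / df_within)

print(ss_total, ss_between, ss_within)  # 60.0 54.0 6.0
print(f_stat)                           # 27.0
```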

Question 4

How is sum of squares used in linear regression?

Accepted Answer

In ordinary least-squares regression with an intercept, SS Total = SS Regression + SS Residual. SS Regression (also called SS Explained) is the portion of total variability that the fitted line accounts for. SS Residual is the leftover variation not explained by the model. R-squared equals SS Regression / SS Total, the fraction of the total variability in the response that the model explains. The residual mean square (SS Residual divided by its degrees of freedom, which is n - 2 for a line with one predictor) estimates the variance of the error term.
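A small sketch of this split for a simple least-squares line follows. The x and y values are invented, and the slope and intercept come from the usual closed-form formulas for one predictor; it is an illustration rather than a general regression routine.

```python
import math

# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n

# Least-squares slope and intercept for a single predictor.
sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
sxx = sum((xi - x_mean) ** 2 for xi in x)
slope = sxy / sxx
intercept = y_mean - slope * x_mean
fitted = [intercept + slope * xi for xi in x]

ss_total = sum((yi - y_mean) ** 2 for yi in y)
ss_regression = sum((fi - y_mean) ** 2 for fi in fitted)        # explained
ss_residual = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained

r_squared = ss_regression / ss_total
residual_mean_square = ss_residual / (n - 2)  # n - 2 df for one predictor

print(round(ss_total, 6), round(ss_regression, 6), round(ss_residual, 6))  # 6.0 3.6 2.4
print(round(r_squared, 6))                                                 # 0.6
```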

Question 5

Why do we square the deviations instead of just summing them?

Accepted Answer

If you simply summed the deviations (xᵢ - x̅) without squaring, positive and negative deviations would cancel out and the total would always equal zero, giving you no information about spread. Squaring makes every term non-negative and penalizes larger deviations more heavily. It is also mathematically convenient: the mean is exactly the value that minimizes the sum of squared deviations, and the same criterion produces the least-squares estimates used in regression.
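Both points can be seen with a few invented numbers. In this sketch, squared_distance is a hypothetical helper written for the example; it measures squared distance from an arbitrary center, so the mean can be compared with other candidate centers.

```python
data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values
mean = sum(data) / len(data)

# Raw deviations cancel: the sum carries no information about spread.
deviation_sum = sum(x - mean for x in data)
print(deviation_sum)  # 0.0

def squared_distance(center):
    """Sum of squared distances of the data from an arbitrary center."""
    return sum((x - center) ** 2 for x in data)

# The mean gives the smallest total; other centers give larger ones.
print(squared_distance(mean))  # 32.0
print(squared_distance(4))     # 40
print(squared_distance(6))     # 40
```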

Question 6

What does SS divided by (n - 1) give, and why not n?

Accepted Answer

Dividing SS by (n - 1) gives the sample variance, s². The denominator (n - 1) is called Bessel's correction. When estimating population variance from a sample, we use the sample mean rather than the (unknown) population mean, and this introduces a small downward bias in SS. Dividing by (n - 1) instead of n corrects for that bias, making the sample variance an unbiased estimator of the population variance. If you know you have the entire population, dividing by n (population variance) is appropriate instead.
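The bias can be made concrete with a tiny, invented population. The sketch below enumerates every possible sample of size 2 drawn with replacement from three values and averages the two candidate estimators, using exact fractions to avoid rounding. It illustrates the idea for this one toy case only.

```python
from fractions import Fraction
from itertools import product

def sum_of_squares(values):
    """Statistical SS around the mean of `values`, computed exactly."""
    mean = Fraction(sum(values), len(values))
    return sum((x - mean) ** 2 for x in values)

population = [1, 2, 3]  # hypothetical population
pop_variance = sum_of_squares(population) / len(population)

n = 2
# All 9 ordered samples of size 2, drawn with replacement.
samples = list(product(population, repeat=n))

avg_div_n_minus_1 = sum(sum_of_squares(s) / (n - 1) for s in samples) / len(samples)
avg_div_n = sum(sum_of_squares(s) / n for s in samples) / len(samples)

print(pop_variance)       # 2/3
print(avg_div_n_minus_1)  # 2/3, matches the population variance on average
print(avg_div_n)          # 1/3, too small on average
```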

Question 7

Can the sum of squares be negative?

Accepted Answer

No. Every term in Σ(xᵢ - x̅)² is the square of a real number, so the sum is always zero or positive. The same holds for the algebraic SS = Σxᵢ². If you ever compute a negative SS, it signals an error: often a sign mistake when computing deviations, or rounding error when SS is obtained from the shortcut Σxᵢ² - n * x̅² and two large, nearly equal numbers are subtracted.
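The rounding problem is easiest to see in floating-point arithmetic. The sketch below uses invented values with a large offset and compares the direct (two-pass) calculation with the shortcut Σxᵢ² - n * x̅². The exact number the shortcut returns depends on the floating-point details, so no specific output is claimed for it; the point is only that it cannot be trusted here.

```python
# Hypothetical values: small spread sitting on a very large offset.
data = [1e9 + 1, 1e9 + 2, 1e9 + 3]
n = len(data)
mean = sum(data) / n

# Two-pass form: subtract the mean first, then square. Never negative.
two_pass_ss = sum((x - mean) ** 2 for x in data)

# Shortcut form: subtracts two huge, nearly equal numbers, so most of the
# significant digits cancel and the result can be badly wrong.
shortcut_ss = sum(x * x for x in data) - n * mean ** 2

print(two_pass_ss)  # 2.0
print(shortcut_ss)  # not 2.0; may even come out zero or negative
```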

SS Type	Formula	Used in
Statistical (total)	Σ(xᵢ - x̅)²	Variance, std dev, ANOVA total
Algebraic	Σxᵢ²	Variance shortcut formulas
SS Between (ANOVA)	Σnⱼ(x̅ⱼ - x̅)²	Between-group variation in ANOVA
SS Within (ANOVA)	ΣΣ(xᵢⱼ - x̅ⱼ)²	Within-group (error) variation in ANOVA
SS Regression	Σ(ŷᵢ - y̅)²	Explained variation in linear regression
SS Residual	Σ(yᵢ - ŷᵢ)²	Unexplained variation (regression error)
