F-Statistic Calculator
Enter two sample standard deviations and sample sizes to test whether the underlying population variances are equal, or switch to regression mode and supply the sums of squared residuals for your full and restricted models. The calculator returns the F-statistic, degrees of freedom, exact p-value, critical value, and a plain-English conclusion. A step-by-step panel shows every arithmetic operation so you can check the working in detail.
What is the F-statistic?
The F-statistic is the ratio of two mean-squared quantities. Under the null hypothesis each is distributed, up to a common scale factor, as an independent chi-squared variable divided by its degrees of freedom, so the ratio follows an F-distribution parametrised by two degrees-of-freedom values: the numerator df (df1) and the denominator df (df2). A large F-value means the numerator is much larger than expected by chance, which is evidence against the null hypothesis. The F-distribution is right-skewed and always non-negative, so large positive values are the ones that challenge H0.
Two-sample variance test (Snedecor F-test)
The most direct application of the F-statistic is testing whether two populations have the same variance. You compute S1-squared / S2-squared where S1 and S2 are the two sample standard deviations. Under H0 (equal variances) this ratio follows an F(n1-1, n2-1) distribution. A p-value below your significance level (commonly 0.05) means the evidence favours unequal variances. This test is a standard pre-check before running a two-sample t-test: if variances are unequal, use Welch's t-test rather than the pooled version. Assumptions: both samples are drawn from normally distributed populations and observations are independent.
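The arithmetic of the variance ratio can be sketched in a few lines of Python. This is a minimal illustration rather than the calculator's own code: the function name `variance_f_test` and the input values are made up for the example.

```python
def variance_f_test(s1, n1, s2, n2):
    """Return (F, df1, df2) for the two-sample variance ratio test.

    s1, s2 are sample standard deviations; n1, n2 are sample sizes.
    Sample 1 is the numerator and sample 2 the denominator.
    """
    if n1 < 2 or n2 < 2:
        raise ValueError("each sample needs at least 2 observations")
    if s1 < 0 or s2 <= 0:
        raise ValueError("s1 must be non-negative and s2 must be positive")
    f_stat = (s1 ** 2) / (s2 ** 2)   # ratio of sample variances
    return f_stat, n1 - 1, n2 - 1    # df are sample sizes minus one


# Hypothetical inputs: s1 = 4.0 from 11 observations, s2 = 2.0 from 21.
f_stat, df1, df2 = variance_f_test(s1=4.0, n1=11, s2=2.0, n2=21)
print(f"F = {f_stat:.3f} on ({df1}, {df2}) degrees of freedom")
```

With these inputs the ratio is 16 / 4 = 4.0 on (10, 20) degrees of freedom, which exceeds the right-tailed critical value of 2.35 listed for those degrees of freedom at α = 0.05.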
Regression and Wald F-test
In linear regression, the F-statistic tests whether a set of J coefficient restrictions holds jointly. You fit the unrestricted (full) model and the restricted model, record their sums of squared residuals (SSR_F and SSR_R), and form F = [(SSR_R - SSR_F) / J] / [SSR_F / (N - K)], where N is the number of observations and K is the number of estimated coefficients in the full model, including the intercept. Under H0 and the usual regression assumptions, this statistic follows an F(J, N - K) distribution. The numerator is the average increase in the sum of squared residuals per restriction; the denominator is the mean-squared error of the full model. A significant F is evidence that at least one of the J restrictions is false, so the restricted model is misspecified. A common special case is testing whether all slope coefficients are zero, which is the "overall model F" reported by most regression packages.
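The regression formula translates directly into code. The sketch below assumes N is the number of observations and K is the number of coefficients estimated in the full model (intercept included); the function name `regression_f` and the numbers are hypothetical.

```python
def regression_f(ssr_r, ssr_f, j, n, k):
    """Return (F, df1, df2) for J joint restrictions in a linear regression.

    ssr_r: sum of squared residuals of the restricted model
    ssr_f: sum of squared residuals of the full (unrestricted) model
    j:     number of restrictions
    n:     number of observations (assumed meaning of N)
    k:     coefficients in the full model, incl. intercept (assumed meaning of K)
    """
    if j < 1 or n - k < 1:
        raise ValueError("need j >= 1 and n > k")
    if ssr_f <= 0 or ssr_r < ssr_f:
        raise ValueError("expected 0 < ssr_f <= ssr_r")
    numerator = (ssr_r - ssr_f) / j      # extra SSR per restriction
    denominator = ssr_f / (n - k)        # mean-squared error of the full model
    return numerator / denominator, j, n - k


# Hypothetical example: dropping 2 regressors raises SSR from 100 to 120.
f_stat, df1, df2 = regression_f(ssr_r=120.0, ssr_f=100.0, j=2, n=65, k=5)
print(f"F = {f_stat:.3f} on ({df1}, {df2}) degrees of freedom")
```

Here the numerator is 20 / 2 = 10 and the denominator is 100 / 60, giving F of about 6.0 on (2, 60) degrees of freedom.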
Interpreting the p-value and critical value
Two decision rules produce the same conclusion. Rule 1: compare the p-value to alpha. If p < alpha, reject H0. Rule 2: compare the computed F to the critical value F_crit. If F > F_crit, reject H0. The critical value is the F-quantile that cuts off an area equal to alpha in the right tail of the distribution, so the two rules are mathematically equivalent. Reporting the exact p-value is generally preferred over a binary reject/fail-to-reject statement because it lets readers apply their own significance threshold. Common alpha levels are 0.10 (marginal), 0.05 (conventional), and 0.01 (stringent).
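To see the two rules agree numerically, the sketch below computes the right-tail p-value and the critical value using only the Python standard library. It is one possible implementation, not necessarily how this calculator works: the regularized incomplete beta function is evaluated with a textbook continued-fraction expansion, the critical value is found by bisection, and all function names are illustrative.

```python
import math


def _beta_cf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        even = m * (b - m) * x / ((a + 2 * m - 1) * (a + 2 * m))
        odd = -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1))
        for term in (even, odd):
            d = 1.0 + term * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + term / c
            c = c if abs(c) > tiny else tiny
            h *= d * c
        if abs(d * c - 1.0) < eps:   # last correction factor is ~1: converged
            break
    return h


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log1p(-x))
    front = math.exp(log_front)
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def f_pvalue(f_stat, df1, df2):
    """Right-tail probability P(F > f_stat) for an F(df1, df2) variable."""
    if f_stat <= 0.0:
        return 1.0
    x = df2 / (df2 + df1 * f_stat)
    return reg_inc_beta(df2 / 2.0, df1 / 2.0, x)


def f_critical(alpha, df1, df2):
    """Value whose right-tail area equals alpha, found by bisection."""
    lo, hi = 0.0, 1.0
    while f_pvalue(hi, df1, df2) > alpha:   # expand until the tail is small enough
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if f_pvalue(mid, df1, df2) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


alpha, f_stat, df1, df2 = 0.05, 5.0, 2, 10   # made-up test result
p = f_pvalue(f_stat, df1, df2)
crit = f_critical(alpha, df1, df2)
print(f"p = {p:.4f}, F_crit = {crit:.2f}")
print("Rule 1 rejects:", p < alpha, "| Rule 2 rejects:", f_stat > crit)
```

For F = 5.0 on (2, 10) degrees of freedom the p-value is about 0.031 and the critical value about 4.10, so both rules reject H0 at α = 0.05.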
Approximate critical F-values (right-tailed, α = 0.05)
| df₁ \ df₂ | 10 | 20 | 30 | 60 | 120 |
|---|---|---|---|---|---|
| 1 | 4.96 | 4.35 | 4.17 | 4.00 | 3.92 |
| 2 | 4.10 | 3.49 | 3.32 | 3.15 | 3.07 |
| 3 | 3.71 | 3.10 | 2.92 | 2.76 | 2.68 |
| 4 | 3.48 | 2.87 | 2.69 | 2.53 | 2.45 |
| 5 | 3.33 | 2.71 | 2.53 | 2.37 | 2.29 |
| 10 | 2.98 | 2.35 | 2.16 | 1.99 | 1.91 |
Rows = numerator df (df₁), columns = denominator df (df₂). Reject H₀ if F > critical value.
Frequently asked questions
What does a large F-statistic mean?
A large F-statistic means the variance in the numerator (between-group variation, or the reduction in residuals from removing the restrictions) is much larger than the variance in the denominator (within-group noise, or the baseline residual variance). The larger F is, the more evidence there is against the null hypothesis. Whether a given value counts as "large" depends on the degrees of freedom, which is why you always need the p-value or critical value to make a decision.
What is the difference between F and t statistics?
For a two-tailed t-test with df degrees of freedom, the square of the t-statistic equals an F(1, df) statistic. The F-test generalises the t-test by testing multiple restrictions simultaneously. When you have just one restriction (for example, testing one regression coefficient), F equals t-squared and the two tests give the same p-value. For two or more restrictions you need the F-test because t cannot handle joint hypotheses.
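The identity F = t-squared is easy to check numerically. The sketch below uses a different but equivalent setting from the regression example: a pooled two-sample t-test and a one-way ANOVA on the same two groups, which impose a single restriction (equal means). The data and function names are invented for illustration.

```python
import math


def mean(values):
    return sum(values) / len(values)


def pooled_t(a, b):
    """Pooled-variance two-sample t-statistic."""
    ss_a = sum((x - mean(a)) ** 2 for x in a)
    ss_b = sum((x - mean(b)) ** 2 for x in b)
    pooled_var = (ss_a + ss_b) / (len(a) + len(b) - 2)
    std_err = math.sqrt(pooled_var * (1 / len(a) + 1 / len(b)))
    return (mean(a) - mean(b)) / std_err


def anova_f(groups):
    """One-way ANOVA F: between-group mean square over within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Made-up measurements for two groups.
group_a = [4.1, 5.0, 5.9, 6.2]
group_b = [3.0, 3.8, 4.4, 4.9]

t_stat = pooled_t(group_a, group_b)
f_stat = anova_f([group_a, group_b])
print(f"t^2 = {t_stat ** 2:.6f}, F = {f_stat:.6f}")
```

The two printed numbers agree up to floating-point rounding, and the F-statistic here has (1, 6) degrees of freedom, matching the 6 degrees of freedom of the t-test.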
Does the variance F-test require normally distributed data?
Yes, the Snedecor F-test for equality of variances is sensitive to non-normality. With moderate departures from normality the p-value can be misleading. For non-normal data, Levene's test or the Brown-Forsythe test are more robust alternatives. The regression F-test is less sensitive to non-normality in large samples because of the central limit theorem, but small-sample regression F-tests also assume normally distributed errors.
What is the null hypothesis of the F-test?
For the variance test, H0 is that both populations have the same variance (sigma-1-squared = sigma-2-squared), so the ratio equals 1. For the regression Wald test, H0 is that all J excluded coefficients are jointly zero, meaning the restrictions hold in the true model. In ANOVA, H0 is that all group means are equal, which is equivalent to saying the between-group variance is no larger than expected by chance.
Can the F-statistic be less than 1?
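Yes. If the sample variance of group 1 is smaller than that of group 2, the ratio S1-squared / S2-squared is less than 1. This is valid. In a right-tailed test, an F-value below 1 gives no evidence that the numerator variance is the larger one. In a two-sided test, however, a value far below 1 is still evidence against equal variances, because it falls in the left tail of the distribution. Some textbooks and software always place the larger variance in the numerator to keep F >= 1. Because that convention picks the numerator after seeing the data, a two-sided test must then use the critical value at alpha/2 (or double the right-tail p-value), and the result can be misleading if that adjustment is missed. This calculator uses sample 1 as the numerator and sample 2 as the denominator, exactly as you enter them.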