Chi-Square Calculator

Enter your observed and expected counts and choose a significance level. The calculator computes the chi-square statistic (χ²), the p-value, the critical value for your chosen alpha, degrees of freedom, and the Cramer's V effect size. A per-category breakdown shows which groups drive the discrepancy.

Your details

  • Observed counts: the frequencies you actually measured. Separate values with commas, spaces or new lines.
  • Expected counts: the frequencies predicted by your null hypothesis. Each must be greater than zero.
  • Significance level (α): the threshold probability for rejecting the null hypothesis. 0.05 (5%) is the most common choice in research.
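
The input rules above (values separated by commas, spaces or new lines; every expected count greater than zero) can be sketched as a small parser. This is only an illustration: the function names `parse_counts` and `validate` are hypothetical and are not taken from the calculator itself.

```python
import re


def parse_counts(text: str) -> list[float]:
    """Split on commas and/or whitespace (including new lines) and convert to floats."""
    tokens = [t for t in re.split(r"[,\s]+", text.strip()) if t]
    return [float(t) for t in tokens]


def validate(observed: list[float], expected: list[float]) -> None:
    """Raise ValueError if the two lists cannot be used in a goodness-of-fit test."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need the same number of categories")
    if len(observed) < 2:
        raise ValueError("at least two categories are needed, since df = k - 1")
    if any(o < 0 for o in observed):
        raise ValueError("observed counts cannot be negative")
    if any(e <= 0 for e in expected):
        raise ValueError("every expected count must be greater than zero")


# Mixed separators are accepted, as in the input description above.
observed = parse_counts("90, 60\n104 95")
expected = parse_counts("80 70, 100\n99")
validate(observed, expected)
print(observed, expected)
```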
Result: Not significant - fail to reject H0

  • Chi-square (χ²): 3.0002
  • p-value: 0.3916
  • Critical value: 7.8147
  • Degrees of freedom: 3
  • Cramer's V (effect size): 0.0535
  • Total sample size (n): 349
  • Categories (k): 4

p-value scale: Highly significant (< 0.001), Very significant (0.001-0.01), Significant (0.01-0.05), Marginal (0.05-0.1), Not significant (0.1+)

χ² = 3.0002, p = 0.3916, df = 3

  • The p-value (0.3916) is above α = 0.05, so the data are consistent with the null hypothesis.
  • Cramer's V = 0.0535, indicating a negligible effect size (V < 0.1 negligible, 0.1-0.3 small, 0.3-0.5 medium, > 0.5 large).
  • Inspect the per-category breakdown below to see which groups contribute most to the statistic.

Next step: Your data are consistent with the expected distribution. Check the effect size and consider whether your sample is large enough to detect a real difference.

Per-category chi-square contributions

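| Category | Observed (O) | Expected (E) | O - E | (O - E)² | (O - E)²/E | % of χ² |
|---|---|---|---|---|---|---|
| Category 1 | 90 | 80.00 | 10.00 | 100.00 | 1.2500 | 41.7% |
| Category 2 | 60 | 70.00 | -10.00 | 100.00 | 1.4286 | 47.6% |
| Category 3 | 104 | 100.00 | 4.00 | 16.00 | 0.1600 | 5.3% |
| Category 4 | 95 | 99.00 | -4.00 | 16.00 | 0.1616 | 5.4% |
| TOTAL | 349 | 349.00 | | | 3.0002 | 100.0% |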

Categories with a large % contribution are the main drivers of the chi-square statistic.

Formula

$$
\chi^2 = \sum_{i=1}^{k}\dfrac{(O_i - E_i)^2}{E_i}, \quad df = k - 1, \quad V = \sqrt{\dfrac{\chi^2}{n \cdot (k-1)}}
$$

Worked example

For observed 90, 60, 104, 95 and expected 80, 70, 100, 99, the differences O - E are 10, -10, 4, -4, so the terms are (10²/80) + ((-10)²/70) + (4²/100) + ((-4)²/99) = 1.25 + 1.4286 + 0.16 + 0.1616 = 3.0002, with df = 4 - 1 = 3. At df = 3, the critical value is 7.815 (α = 0.05). Because 3.0002 is below 7.815, p > 0.05 (here p = 0.3916) and the result is not significant. Cramer's V = sqrt(3.0002 / (349 × 3)) = 0.0535, a negligible effect.
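
The arithmetic in this worked example can be reproduced with a few lines of Python. This is a minimal sketch using only the formulas stated on this page; the function name `chi_square_gof` is an assumption made for illustration, not the calculator's actual code.

```python
import math


def chi_square_gof(observed, expected):
    """Return (chi2, df, cramers_v) for a goodness-of-fit test.

    Uses chi2 = sum((O - E)^2 / E), df = k - 1 and
    V = sqrt(chi2 / (n * (k - 1))), as given in the Formula section.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    n = sum(observed)
    cramers_v = math.sqrt(chi2 / (n * df))
    return chi2, df, cramers_v


chi2, df, v = chi_square_gof([90, 60, 104, 95], [80, 70, 100, 99])
print(f"chi2 = {chi2:.4f}, df = {df}, V = {v:.4f}")
# chi2 = 3.0002, df = 3, V = 0.0535
```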

What the chi-square statistic measures

The chi-square (χ²) statistic quantifies how far a set of observed counts strays from the counts a hypothesis predicts. For each category you take the difference between the observed value and the expected value, square it so positive and negative gaps both count, then divide by the expected value to scale the discrepancy relative to how large that category should be. Summing those terms across every category gives a single number: zero means the data match the model exactly, and the value grows as the mismatch widens. The formula is χ² = sum of (O minus E) squared divided by E, summed over all k categories.

p-value, critical value, and statistical significance

Once you have χ² and the degrees of freedom (df = k minus 1), you can determine whether the result is statistically significant. The p-value is the probability of observing a χ² this large or larger if the null hypothesis were true; a small p-value is evidence against the null. The critical value is the threshold χ² must exceed for the result to be significant at your chosen alpha level. Both are derived from the chi-square distribution, which shifts right as df increases. Typical alpha levels are 0.10 (lenient), 0.05 (standard), 0.01 (strict), and 0.001 (very strict).

Degrees of freedom and Cramer's V effect size

Degrees of freedom describe how many category counts are free to vary once the totals are fixed. For a goodness-of-fit test with k categories, df = k minus 1. Statistical significance alone does not tell you how large the discrepancy is in practical terms; for that you use an effect size. Cramer's V is computed as the square root of chi-square divided by (n times (k minus 1)), where n is the total sample size. V ranges from 0 to 1: values below 0.1 are considered negligible, 0.1 to 0.3 are small, 0.3 to 0.5 are medium, and above 0.5 are large. A test with a huge sample can be statistically significant with a negligible V, so always report both.
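
To see why a huge sample can be significant with a negligible V, the sketch below multiplies every count in the worked example by 100. With proportions unchanged, χ² grows by the same factor while V does not move. The helper name `chi2_and_v` is hypothetical; the comparison value 16.266 is the α = 0.001 critical value for df = 3 from the table on this page.

```python
import math


def chi2_and_v(observed, expected):
    """Return (chi2, cramers_v) using the formulas given on this page."""
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    n = sum(observed)
    df = len(observed) - 1
    return chi2, math.sqrt(chi2 / (n * df))


observed = [90, 60, 104, 95]
expected = [80, 70, 100, 99]

for scale in (1, 100):
    obs = [o * scale for o in observed]
    exp = [e * scale for e in expected]
    chi2, v = chi2_and_v(obs, exp)
    # 16.266 is the tabulated critical value for df = 3 at alpha = 0.001.
    print(f"n = {sum(obs):>6}: chi2 = {chi2:8.4f}, V = {v:.4f}, "
          f"exceeds 16.266: {chi2 > 16.266}")
```

At 100 times the sample size the statistic is roughly 300, far beyond the strictest critical value in the table, yet V is still about 0.0535.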

Per-category breakdown and when the test applies

The per-category breakdown table shows the contribution (O minus E)²/E from each group, expressed both as a raw value and as a percentage of the total χ². Categories with a large share are the ones driving the result. This is especially useful when you have many categories and want to pinpoint where the mismatch is. The chi-square goodness-of-fit test applies to counts of categorical data (not percentages or measured quantities), with independent observations drawn at random. A standard rule of thumb is that every expected count should be at least 5; below that, the chi-square approximation becomes unreliable and you should merge categories or use an exact test.
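
A breakdown like the one described here can be sketched as follows. The function name `breakdown`, the dictionary keys, and the `min_expected` parameter are illustrative assumptions; the default of 5 mirrors the rule of thumb above.

```python
def breakdown(observed, expected, min_expected=5):
    """Return one dict per category with its contribution to chi-square."""
    terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
    total = sum(terms)
    rows = []
    for i, (o, e, t) in enumerate(zip(observed, expected, terms), start=1):
        rows.append({
            "category": i,
            "diff": o - e,
            "contribution": t,
            # Share of the total statistic; 0 if observed matches expected everywhere.
            "share_pct": 100 * t / total if total > 0 else 0.0,
            # Flags categories that violate the "expected >= 5" rule of thumb.
            "low_expected": e < min_expected,
        })
    return rows


for row in breakdown([90, 60, 104, 95], [80, 70, 100, 99]):
    print(f"Category {row['category']}: (O-E)^2/E = {row['contribution']:.4f}, "
          f"share = {row['share_pct']:.1f}%, low expected = {row['low_expected']}")
```

For the example data, the first two categories account for roughly 89% of the statistic, in line with the table above.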

Chi-square critical values at common significance levels

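| df | α = 0.10 | α = 0.05 | α = 0.01 | α = 0.001 |
|---|---|---|---|---|
| 1 | 2.706 | 3.841 | 6.635 | 10.828 |
| 2 | 4.605 | 5.991 | 9.210 | 13.816 |
| 3 | 6.251 | 7.815 | 11.345 | 16.266 |
| 4 | 7.779 | 9.488 | 13.277 | 18.467 |
| 5 | 9.236 | 11.070 | 15.086 | 20.515 |
| 6 | 10.645 | 12.592 | 16.812 | 22.458 |
| 7 | 12.017 | 14.067 | 18.475 | 24.322 |
| 8 | 13.362 | 15.507 | 20.090 | 26.124 |
| 9 | 14.684 | 16.919 | 21.666 | 27.877 |
| 10 | 15.987 | 18.307 | 23.209 | 29.588 |
| 12 | 18.549 | 21.026 | 26.217 | 32.909 |
| 15 | 22.307 | 24.996 | 30.578 | 37.697 |
| 20 | 28.412 | 31.410 | 37.566 | 45.315 |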

Reject the null hypothesis when your χ² exceeds the critical value for your degrees of freedom and chosen α.
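
The decision rule can be written as a simple table lookup. The sketch below copies only the first five rows of the table to stay short; the names `CRITICAL_VALUES` and `decide` are hypothetical, and a real tool would more likely compute the critical value from the distribution than store a table.

```python
# Critical values copied from the table above (df = 1 to 5 only).
CRITICAL_VALUES = {
    1: {0.10: 2.706, 0.05: 3.841, 0.01: 6.635, 0.001: 10.828},
    2: {0.10: 4.605, 0.05: 5.991, 0.01: 9.210, 0.001: 13.816},
    3: {0.10: 6.251, 0.05: 7.815, 0.01: 11.345, 0.001: 16.266},
    4: {0.10: 7.779, 0.05: 9.488, 0.01: 13.277, 0.001: 18.467},
    5: {0.10: 9.236, 0.05: 11.070, 0.01: 15.086, 0.001: 20.515},
}


def decide(chi2, df, alpha=0.05):
    """Return (reject_h0, critical_value) using the tabulated critical values."""
    try:
        critical = CRITICAL_VALUES[df][alpha]
    except KeyError:
        raise ValueError(f"no tabulated value for df={df}, alpha={alpha}") from None
    # Reject only when the statistic exceeds the critical value.
    return chi2 > critical, critical


reject, critical = decide(3.0002, df=3, alpha=0.05)
print(f"reject H0: {reject} (critical value {critical})")
# reject H0: False (critical value 7.815)
```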

Frequently asked questions

How is the p-value calculated from a chi-square statistic?

The p-value is the area to the right of your chi-square value on the chi-square distribution curve with your degrees of freedom. This calculator uses the regularised incomplete gamma function to compute it precisely. If your χ² = 7.82 with df = 3, the p-value is about 0.05, meaning there is roughly a 5% chance of seeing a statistic at least this large when the null hypothesis is true.
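
The page does not show how the calculator evaluates the incomplete gamma function, so the following is only one possible sketch. It uses the identity p = Q(df/2, χ²/2), where Q is the regularised upper incomplete gamma function, with a textbook series expansion for small arguments and a continued fraction otherwise. All function names are assumptions made for illustration.

```python
import math


def _lower_gamma_series(a, x, max_iter=500, eps=1e-14):
    """Regularised lower incomplete gamma P(a, x) via its power series."""
    ap = a
    term = 1.0 / a
    total = term
    for _ in range(max_iter):
        ap += 1.0
        term *= x / ap
        total += term
        if abs(term) < abs(total) * eps:
            break
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))


def _upper_gamma_contfrac(a, x, max_iter=500, eps=1e-14):
    """Regularised upper incomplete gamma Q(a, x) via a continued fraction
    evaluated with the modified Lentz method."""
    tiny = 1e-300
    b = x + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, max_iter + 1):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        if abs(d) < tiny:
            d = tiny
        c = b + an / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h * math.exp(-x + a * math.log(x) - math.lgamma(a))


def chi2_p_value(chi2, df):
    """Right-tail probability of a chi-square statistic with df degrees of freedom."""
    if chi2 <= 0:
        return 1.0
    a, x = df / 2.0, chi2 / 2.0
    if x < a + 1.0:  # the series converges quickly in this region
        return 1.0 - _lower_gamma_series(a, x)
    return _upper_gamma_contfrac(a, x)


print(f"{chi2_p_value(3.0002, 3):.4f}")  # worked example on this page
print(f"{chi2_p_value(7.82, 3):.4f}")    # close to 0.05, as described above
```

For df = 2 the result reduces to exp(-χ²/2), which gives a convenient sanity check on any implementation.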

What is the difference between the chi-square statistic and the p-value?

The chi-square statistic is a raw measure of how far observed counts deviate from expected counts. On its own it is hard to interpret because the expected range shifts with degrees of freedom. The p-value converts that statistic into a probability, giving a consistent scale (0 to 1) that tells you how surprising the result is under the null hypothesis. Always report both along with degrees of freedom so readers can verify and replicate your result.

What does Cramer's V tell me that the p-value doesn't?

Cramer's V measures the practical size of the discrepancy, independent of sample size. With a large sample (say n = 10,000) even a trivial difference can produce a tiny p-value. Cramer's V will still be small (near zero), revealing that the effect is not meaningful in practice. As a rule of thumb: V below 0.1 is negligible, 0.1 to 0.3 is small, 0.3 to 0.5 is medium, and above 0.5 is large.

How are degrees of freedom calculated here?

For a goodness-of-fit test this calculator uses df = k minus 1, where k is the number of categories (the length of your list). A contingency table test of independence instead uses (rows minus 1) times (columns minus 1), so use a dedicated two-way table calculator for that case.
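
The two degrees-of-freedom rules can be put side by side in a short sketch. The function names are hypothetical and chosen only for this illustration.

```python
def df_goodness_of_fit(k: int) -> int:
    """df for a goodness-of-fit test with k categories."""
    if k < 2:
        raise ValueError("need at least two categories")
    return k - 1


def df_independence(rows: int, cols: int) -> int:
    """df for a test of independence on a rows x cols contingency table."""
    if rows < 2 or cols < 2:
        raise ValueError("need at least two rows and two columns")
    return (rows - 1) * (cols - 1)


print(df_goodness_of_fit(4))   # four categories, as in the worked example -> 3
print(df_independence(2, 3))   # a 2 x 3 contingency table -> 2
```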

Why must every expected count be greater than zero, and ideally at least 5?

Each term divides by the expected value, so a zero expected count makes the formula undefined. Beyond that, the chi-square distribution is only an approximation to the sampling distribution of the statistic, and it relies on expected frequencies being large enough for a normal approximation to the counts to hold. Statisticians commonly require each expected count to be at least 5; with smaller expected counts, consider merging categories or using an exact test (an exact multinomial test for goodness of fit; Fisher's exact test is the counterpart for contingency tables).

How do I read the per-category breakdown table?

Each row shows one category's observed count, expected count, difference (O minus E), squared difference, the contribution to χ² that is (O minus E)² divided by E, and the percentage that contribution represents of the total χ². Categories with a high percentage are the main drivers of the statistic. Investigating those categories first helps you understand the practical reason for a significant result.

Written by Dr. Hannah Brandt, PhD Statistician · Munich, Germany

Applied statistician translating rigorous probability theory into clear, accurate tools for researchers and practitioners.
