Sampling Distribution of the Sample Proportion Calculator
Enter a population proportion and a sample size to get the mean and standard error of the sampling distribution of the sample proportion. Choose a tail mode and threshold(s) to find the probability that your sample proportion falls in that region. You also get the z-scores, a Central Limit Theorem validity check, a confidence interval for the proportion, and a full step-by-step breakdown of the arithmetic.
What is the sampling distribution of the sample proportion?
When you draw repeated random samples of size n from a large population where a proportion p of individuals have a characteristic of interest, the sample proportions p-hat you observe across those many samples form a distribution. That distribution has a predictable mean and spread: the mean equals p (so the sample proportion is an unbiased estimator of the population proportion) and the standard deviation - called the standard error - equals sqrt(p(1-p)/n). The collection of all possible p-hat values and their probabilities is the sampling distribution of the sample proportion.
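As a minimal sketch of the two formulas above (not the calculator's actual code; the function name `sampling_distribution` is a hypothetical choice for illustration), the mean and standard error can be computed like this:

```python
import math

def sampling_distribution(p, n):
    """Return (mean, standard_error) of the sampling distribution of p-hat.

    Hypothetical helper for illustration: p is the population proportion,
    n is the sample size.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be a positive integer")
    mean = p                              # p-hat is unbiased for p
    se = math.sqrt(p * (1.0 - p) / n)     # sqrt(p(1-p)/n)
    return mean, se

mean, se = sampling_distribution(0.4, 100)
print(f"mean = {mean}, standard error = {se:.4f}")
```

With p = 0.4 and n = 100 the standard error is sqrt(0.24/100), or roughly 0.049.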
When can you use the normal approximation?
By the Central Limit Theorem, the sampling distribution of p-hat is approximately normal when the sample is large enough. The standard rule of thumb requires both np >= 10 and n(1-p) >= 10. Some textbooks use np >= 5 and n(1-p) >= 5 as a looser check, and some statisticians prefer np >= 15 and n(1-p) >= 15 for greater reliability. This calculator reports whether your inputs satisfy both parts of the >= 10 criterion. If either condition fails, the distribution may be noticeably skewed, and you should use exact binomial probabilities rather than the normal approximation.
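The rule of thumb can be sketched as a small check. This is an illustrative version under stated assumptions, not the calculator's implementation: the name `normal_approx_ok`, the `threshold` parameter, and the small floating-point tolerance are all hypothetical.

```python
def normal_approx_ok(p, n, threshold=10):
    """Return True when both n*p and n*(1-p) reach the threshold.

    threshold=10 is the standard rule of thumb; pass 5 or 15 for the
    looser or stricter variants. The tiny tolerance is an assumption added
    here so that rounding error (e.g. 100 * (1 - 0.9) evaluating slightly
    below 10) does not flip a borderline case.
    """
    tol = 1e-9
    expected_successes = n * p
    expected_failures = n * (1 - p)
    return (expected_successes >= threshold - tol
            and expected_failures >= threshold - tol)

print(normal_approx_ok(0.4, 100))    # np = 40, n(1-p) = 60
print(normal_approx_ok(0.05, 100))   # np = 5, so the check fails
```

Both conditions must hold; a single failure is enough to make the normal approximation doubtful.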
How to read the probability results
Once the standard error is known, any sample proportion threshold p1 can be converted to a z-score: z1 = (p1 - p) / se. You can then look up the probability from the standard normal distribution. "Between two values" mode gives you P(p1 < p-hat < p2) = Phi(z2) - Phi(z1), the area under the curve between the two thresholds. "Left tail" gives P(p-hat <= p1) = Phi(z1), and "right tail" gives P(p-hat >= p1) = 1 - Phi(z1). These probabilities represent how likely a randomly drawn sample of size n is to produce a sample proportion in the stated region, assuming the true population proportion is p and the normal approximation holds.
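The three modes can be sketched as follows, using the standard library's `math.erf` to evaluate Phi. The function names and the mode strings `"left"`, `"right"`, and `"between"` are hypothetical labels chosen for this example, not the calculator's own identifiers.

```python
import math

def phi(z):
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def region_probability(p, n, mode, p1, p2=None):
    """Normal-approximation probability that p-hat falls in a region."""
    se = math.sqrt(p * (1.0 - p) / n)
    z1 = (p1 - p) / se
    if mode == "left":        # P(p-hat <= p1) = Phi(z1)
        return phi(z1)
    if mode == "right":       # P(p-hat >= p1) = 1 - Phi(z1)
        return 1.0 - phi(z1)
    if mode == "between":     # P(p1 < p-hat < p2) = Phi(z2) - Phi(z1)
        if p2 is None:
            raise ValueError("'between' mode needs a second threshold p2")
        z2 = (p2 - p) / se
        return phi(z2) - phi(z1)
    raise ValueError(f"unknown mode: {mode!r}")

print(round(region_probability(0.4, 100, "left", 0.35), 4))
print(round(region_probability(0.4, 100, "right", 0.45), 4))
print(round(region_probability(0.4, 100, "between", 0.35, 0.45), 4))
```

Because 0.35 and 0.45 sit symmetrically around p = 0.4, the left-tail and right-tail results in this example are equal, and the three printed probabilities sum to 1.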
Confidence interval for the population proportion
The interval uses the same standard error to build a range around p: p +/- z* x se, where z* is the critical value for the chosen confidence level (1.96 for 95%). A conventional confidence interval answers a different but related question: it is centered on an observed sample proportion p-hat and gives a plausible range for the unknown population proportion. In this calculator the interval is instead centered on the known population proportion p, so it shows the range of sample proportions consistent with that p at the chosen confidence level.
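A minimal sketch of the p +/- z* x se arithmetic, assuming a hypothetical helper named `interval_around_p` and taking z* as an input (1.96 by default, the 95% value quoted above):

```python
import math

def interval_around_p(p, n, z_star=1.96):
    """Return (lower, upper) for p +/- z* x se, centered on the known p.

    Illustrative only: the bounds are not clamped to [0, 1], which is an
    assumption of this sketch rather than documented calculator behavior.
    """
    se = math.sqrt(p * (1.0 - p) / n)
    margin = z_star * se          # margin of error
    return p - margin, p + margin

lower, upper = interval_around_p(0.4, 100)
print(f"{lower:.4f} to {upper:.4f}")
```

For p = 0.4 and n = 100 the margin is about 0.096, so the interval runs from roughly 0.304 to 0.496.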
Critical z-values for common confidence levels
| Confidence level | z* (critical value) | Interpretation |
|---|---|---|
| 90% | 1.645 | 90% of intervals contain the true p |
| 95% | 1.960 | 95% of intervals contain the true p (most common) |
| 98% | 2.326 | 98% of intervals contain the true p |
| 99% | 2.576 | 99% of intervals contain the true p |
The z* value multiplied by the standard error gives the margin of error for the confidence interval around p.
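The tabulated critical values can be reproduced with Python's standard library, which provides the inverse normal CDF as `statistics.NormalDist.inv_cdf`. The wrapper name `critical_z` is a hypothetical choice for this sketch.

```python
from statistics import NormalDist

def critical_z(confidence):
    """Two-sided critical value z* for a confidence level such as 0.95."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Split the leftover area equally between the two tails.
    upper_tail_cutoff = 1.0 - (1.0 - confidence) / 2.0
    return NormalDist().inv_cdf(upper_tail_cutoff)

for level in (0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%}: z* = {critical_z(level):.3f}")
```

Rounded to three decimals, the results match the table: 1.645, 1.960, 2.326, and 2.576.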
Frequently asked questions
What is the standard error of the sample proportion?
The standard error (SE) is the standard deviation of the sampling distribution of p-hat. It measures how much the sample proportion typically varies from the true population proportion across repeated samples. It is calculated as sqrt(p(1-p)/n). A larger sample size n produces a smaller standard error, meaning your sample proportion is likely to be closer to the true p.
How do I know if the normal approximation is valid?
Check that both np >= 10 and n(1-p) >= 10. For example, with p = 0.4 and n = 100, np = 40 and n(1-p) = 60, so the approximation is reliable. If n is small or p is very close to 0 or 1, these products can fall below 10 and the distribution becomes skewed, making the normal approximation inaccurate. In that case use an exact binomial calculation.
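When the check fails, an exact left-tail probability can be sketched directly from the binomial distribution, since p-hat = X/n where X counts the successes in the sample. The helper names below are hypothetical, and this calculator does not necessarily offer an exact mode.

```python
import math

def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    k = min(k, n)   # counts above n add nothing
    return sum(math.comb(n, i) * p**i * (1.0 - p)**(n - i)
               for i in range(0, k + 1))

def exact_left_tail(p, n, p1):
    """Exact P(p-hat <= p1), i.e. P(X <= floor(n * p1))."""
    # Small tolerance (an assumption of this sketch) guards against
    # n * p1 landing just below an integer due to rounding.
    k = math.floor(n * p1 + 1e-9)
    return binomial_cdf(k, n, p)

# n = 20, p = 0.05 gives np = 1, so the rule of thumb fails.
print(round(exact_left_tail(0.05, 20, 0.10), 4))
```

The right tail follows from the complement: P(p-hat >= p1) = 1 - P(X <= ceil(n x p1) - 1).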
What is the difference between "left tail" and "right tail" modes?
Left tail (P(p-hat <= p1)) gives the probability of observing a sample proportion at most as large as p1. Right tail (P(p-hat >= p1)) gives the probability of observing a sample proportion at least as large as p1. Left-tail results are used when you are concerned about proportions being unusually low; right-tail results are used when you are concerned about proportions being unusually high.
Why does increasing the sample size reduce the standard error?
The standard error formula is sqrt(p(1-p)/n). As n grows, you are dividing by a larger number, so the result shrinks. Intuitively, larger samples average out more random variation, so a sample proportion tends to be closer to the true population proportion. Doubling the sample size divides the standard error by sqrt(2) (a reduction of about 29%), so to halve the standard error you need to quadruple the sample size.
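The square-root scaling is easy to confirm numerically. The sketch below uses a hypothetical helper `standard_error` and an arbitrary example value p = 0.4:

```python
import math

def standard_error(p, n):
    """Standard error of p-hat: sqrt(p(1-p)/n)."""
    return math.sqrt(p * (1.0 - p) / n)

p = 0.4
baseline = standard_error(p, 100)
for n in (100, 200, 400):
    se = standard_error(p, n)
    # The ratio shows how many times smaller se is than at n = 100.
    print(f"n = {n}: se = {se:.4f}, baseline / se = {baseline / se:.3f}")
```

The ratio is about 1.414 at n = 200 and 2 at n = 400, matching the sqrt(2) and "quadruple to halve" statements.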
Can I use this calculator for hypothesis testing?
Yes. In a one-proportion z-test, you assume a null hypothesis proportion p0, compute the standard error under H0, convert your observed sample proportion to a z-score, and find the tail probability (p-value). To do this here, enter p0 as the population proportion and your observed sample proportion as the threshold. For a left-tailed test use left tail mode; for a right-tailed test use right tail mode; for a two-sided test run both and double the smaller probability (or use two thresholds symmetrically around p).
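The test procedure described above can be sketched as follows. The function name `one_proportion_z_test` and the `alternative` labels `"less"`, `"greater"`, and `"two-sided"` are hypothetical choices for this example.

```python
import math

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def one_proportion_z_test(p_hat, p0, n, alternative="two-sided"):
    """Return (z, p_value) for a one-proportion z-test of H0: p = p0."""
    se0 = math.sqrt(p0 * (1.0 - p0) / n)   # standard error under H0
    z = (p_hat - p0) / se0
    left = phi(z)                          # left tail mode
    right = 1.0 - left                     # right tail mode
    if alternative == "less":
        p_value = left
    elif alternative == "greater":
        p_value = right
    elif alternative == "two-sided":
        p_value = min(1.0, 2.0 * min(left, right))  # double the smaller tail
    else:
        raise ValueError(f"unknown alternative: {alternative!r}")
    return z, p_value

z, p_value = one_proportion_z_test(p_hat=0.5, p0=0.4, n=100)
print(f"z = {z:.3f}, two-sided p-value = {p_value:.4f}")
```

In this example the observed proportion 0.5 lies about two standard errors above p0 = 0.4, so the two-sided p-value comes out a little above 0.04.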
What does the confidence interval represent in this context?
Here the CI is built around the known population proportion p and uses the standard error sqrt(p(1-p)/n). It shows the range of sample proportions that would be considered typical given that population proportion and sample size. At 95% confidence, and provided the normal approximation holds, roughly 95 out of 100 random samples of size n will produce a p-hat within that interval - this is the region of "not surprising" outcomes.
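The "95 out of 100 samples" reading can be checked with a small simulation. This is an illustrative sketch only: the function name, the number of trials, and the fixed seed are assumptions made for reproducibility, not part of the calculator.

```python
import math
import random

def share_within_interval(p, n, z_star=1.96, trials=2000, seed=0):
    """Simulate samples of size n and return the share whose p-hat
    lands inside p +/- z* x se."""
    rng = random.Random(seed)              # seeded for repeatable runs
    margin = z_star * math.sqrt(p * (1.0 - p) / n)
    lower, upper = p - margin, p + margin
    hits = 0
    for _trial in range(trials):
        # Draw one sample: each individual has the trait with probability p.
        successes = sum(1 for _draw in range(n) if rng.random() < p)
        if lower <= successes / n <= upper:
            hits += 1
    return hits / trials

print(share_within_interval(0.4, 100))
```

The simulated share should land close to 0.95, though not exactly on it: p-hat can only take values that are multiples of 1/n, and the normal curve is an approximation to that discrete distribution.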