Hardy-Weinberg Calculator
The Hardy-Weinberg principle converts allele frequencies into genotype frequencies for a population at genetic equilibrium. Enter p directly, supply observed genotype counts, or reverse-solve from the fraction of recessive-phenotype individuals. The calculator returns all three genotype frequencies, expected counts for any population size, the chi-square statistic for testing equilibrium, and the estimated carrier rate.
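Formula
p + q = 1
p² + 2pq + q² = 1
Here p is the frequency of the dominant allele (A), q is the frequency of the recessive allele (a), and p², 2pq, and q² are the frequencies of the AA, Aa, and aa genotypes.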
Worked example
If p = 0.6 then q = 0.4: AA = 36%, Aa = 48%, aa = 16%. In a population of 1000: 360 AA, 480 Aa, 160 aa.
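The arithmetic in this example can be sketched in a few lines of Python. The function names `genotype_frequencies` and `expected_counts` are illustrative choices for this sketch, not the calculator's actual code.

```python
def genotype_frequencies(p):
    """Return the (AA, Aa, aa) frequencies for allele frequency p under HWE."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    q = 1.0 - p
    return p * p, 2.0 * p * q, q * q


def expected_counts(p, n):
    """Scale the three genotype frequencies to a population of n individuals."""
    return tuple(freq * n for freq in genotype_frequencies(p))


n_AA, n_Aa, n_aa = expected_counts(0.6, 1000)
print(round(n_AA), round(n_Aa), round(n_aa))  # 360 480 160
```

The counts are rounded only for display; the underlying values are floating-point products and may differ from whole numbers by a tiny amount.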
How the Hardy-Weinberg principle works
In 1908, mathematician G.H. Hardy and physician Wilhelm Weinberg independently showed that allele and genotype frequencies in a large, randomly mating population remain constant from generation to generation in the absence of evolutionary forces such as selection, mutation, migration, and drift. The equation p + q = 1 describes the two allele frequencies. Squaring both sides yields the genotype equation: (p + q)² = p² + 2pq + q² = 1, where p² is the frequency of the homozygous dominant genotype (AA), 2pq is the frequency of heterozygotes (Aa), who are carriers when the recessive allele causes disease, and q² is the homozygous recessive frequency (aa). The principle is powerful because it gives geneticists a mathematical null hypothesis. If observed genotype frequencies in a real population deviate significantly from these predictions, at least one equilibrium assumption is being violated.
Three ways to enter your data
This calculator supports three input modes so you can work from whatever data you have. If you already know p, select "Allele frequency" and enter it directly. If you have a sample of genotyped individuals, select "Observed genotype counts" and enter the number of AA, Aa, and aa individuals. The calculator derives p as (2·N_AA + N_Aa) / (2·N), where N_AA and N_Aa are the observed AA and Aa counts and N is the total number of individuals genotyped; each AA individual contributes two A alleles and each Aa individual contributes one. If you only know the fraction of the population expressing a recessive trait (such as the birth prevalence of cystic fibrosis), select "Recessive phenotype frequency": the calculator treats that fraction as q² and reverse-solves q = sqrt(q²), then p = 1 - q. In all modes, enter a population size N to get expected genotype counts alongside the frequencies.
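The two derived input modes reduce to one short formula each. The sketch below assumes hypothetical helper names (`p_from_counts`, `p_from_recessive_fraction`) and is not taken from the calculator itself.

```python
import math


def p_from_counts(n_AA, n_Aa, n_aa):
    """Estimate p by allele counting: each AA carries two A alleles, each Aa one."""
    n = n_AA + n_Aa + n_aa
    if n <= 0:
        raise ValueError("at least one genotyped individual is required")
    return (2 * n_AA + n_Aa) / (2 * n)


def p_from_recessive_fraction(q_squared):
    """Treat the recessive-phenotype fraction as q^2 and solve for p = 1 - q."""
    if not 0.0 <= q_squared <= 1.0:
        raise ValueError("the recessive fraction must be between 0 and 1")
    return 1.0 - math.sqrt(q_squared)


print(round(p_from_counts(360, 480, 160), 4))      # 0.6
print(round(p_from_recessive_fraction(0.16), 4))   # 0.6
```

Note that the second function is only meaningful if the population is assumed to be in equilibrium, because it infers the allele frequency from a single genotype class.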
Chi-square test for Hardy-Weinberg equilibrium
When you use observed genotype counts, you can enable the chi-square goodness-of-fit test. The calculator estimates p and q from the observed counts, computes the expected counts p²N, 2pqN, and q²N, then evaluates chi-sq = sum of (observed - expected)² / expected across the three genotype classes. The test has one degree of freedom: three genotype classes, minus one because the counts must sum to N, minus one more because p is estimated from the same data (q = 1 - p adds no further parameter). The critical value is 3.841 at alpha = 0.05. A chi-sq below this threshold is consistent with equilibrium; above it, the sample deviates significantly from HWE expectations at the 5% level. Note that expected counts below five make the chi-square approximation unreliable, so the calculator flags this situation. A departure from HWE can indicate selection, inbreeding, admixture, or population structure, but is not proof of any specific mechanism.
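As a rough sketch of the test just described, the function below computes the statistic and the low-expected-count flag. The name `hwe_chi_square` and the returned tuple layout are assumptions made for this example; the 3.841 threshold is the one quoted above.

```python
CRITICAL_VALUE = 3.841  # df = 1, alpha = 0.05


def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Return (chi_sq, expected_counts, low_expected_flag) for observed counts."""
    n = n_AA + n_Aa + n_aa
    if n <= 0:
        raise ValueError("at least one genotyped individual is required")
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = 1.0 - p
    if p == 0.0 or p == 1.0:
        # Only one allele is present, so two expected counts are zero
        # and the statistic is undefined.
        raise ValueError("sample is monomorphic; the test does not apply")
    expected = (p * p * n, 2.0 * p * q * n, q * q * n)
    observed = (n_AA, n_Aa, n_aa)
    chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    low_expected = any(e < 5.0 for e in expected)
    return chi_sq, expected, low_expected


chi_sq, expected, low_expected = hwe_chi_square(400, 400, 200)
print(f"{chi_sq:.2f}")            # 27.78
print(chi_sq > CRITICAL_VALUE)    # True
```

In this made-up sample p is 0.6, so the expected counts are 360, 480, and 160; the shortage of heterozygotes (400 observed against 480 expected) drives the statistic well above the threshold.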
Carrier frequency and its practical importance
The heterozygous carrier frequency 2pq is often the most practically important output in medical genetics. For rare recessive diseases, q is small, so q² (disease frequency) is very small, but 2pq (carrier frequency) is much larger. For example, cystic fibrosis affects roughly 1 in 2,500 people of Northern European descent (q² = 0.0004), giving q = 0.02 and p = 0.98. The carrier frequency 2pq = 2 × 0.98 × 0.02 = 0.039, meaning about 1 in 25 individuals is a carrier, far outnumbering affected individuals. This ratio underpins genetic counseling: even for very rare diseases, a substantial fraction of the population silently carries one copy of the disease allele.
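The cystic fibrosis figures can be reproduced with a short calculation. This is a minimal sketch that assumes the population is in equilibrium; `carrier_frequency` is a name chosen for illustration.

```python
import math


def carrier_frequency(affected_fraction):
    """Return 2pq given the recessive-phenotype fraction q^2, assuming HWE."""
    if not 0.0 <= affected_fraction <= 1.0:
        raise ValueError("the affected fraction must be between 0 and 1")
    q = math.sqrt(affected_fraction)
    p = 1.0 - q
    return 2.0 * p * q


affected = 1 / 2500
carriers = carrier_frequency(affected)
print(f"{carriers:.4f}")            # 0.0392
print(round(carriers / affected))   # 98 carriers per affected individual
```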
Five assumptions of Hardy-Weinberg equilibrium
The equilibrium holds only when five conditions are met: (1) Random mating, meaning every individual has an equal probability of mating with any other. (2) No natural selection, meaning all three genotypes survive and reproduce equally well. (3) No mutation, meaning alleles do not change from one generation to the next. (4) No gene flow (migration), meaning no individuals enter or leave the population carrying different allele frequencies. (5) Large population size, meaning random genetic drift does not cause allele frequencies to fluctuate by chance. In practice, most natural and clinical populations violate at least one of these assumptions to some degree. The Hardy-Weinberg calculation is therefore a baseline, not a description of any specific real population.
Hardy-Weinberg genotype frequencies at key allele frequencies
| p | q | p² (AA) % | 2pq (Aa) % | q² (aa) % | Carriers per affected individual |
|---|---|---|---|---|---|
| 0.99 | 0.01 | 98.01 | 1.98 | 0.01 | ~198 |
| 0.95 | 0.05 | 90.25 | 9.50 | 0.25 | ~38 |
| 0.9 | 0.1 | 81.00 | 18.00 | 1.00 | ~18 |
| 0.8 | 0.2 | 64.00 | 32.00 | 4.00 | ~8 |
| 0.7 | 0.3 | 49.00 | 42.00 | 9.00 | ~4.7 |
| 0.6 | 0.4 | 36.00 | 48.00 | 16.00 | ~3 |
| 0.5 | 0.5 | 25.00 | 50.00 | 25.00 | ~2 |
p = dominant allele frequency; q = 1 - p. Carriers per affected individual is the ratio 2pq / q², which simplifies to 2p / q. Carrier frequency (2pq) peaks at 50% when p = q = 0.5.
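The rows of this table can be regenerated with a small loop, which is also a convenient way to add other values of p. The helper `table_row` is a hypothetical name used only in this sketch.

```python
def table_row(p):
    """Return (p, q, AA %, Aa %, aa %, carriers per affected) for 0 <= p < 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be at least 0 and below 1 so that q is nonzero")
    q = 1.0 - p
    carriers_per_affected = (2.0 * p * q) / (q * q)  # equals 2p / q
    return p, q, 100.0 * p * p, 200.0 * p * q, 100.0 * q * q, carriers_per_affected


for p in (0.99, 0.95, 0.9, 0.8, 0.7, 0.6, 0.5):
    _, q, pct_AA, pct_Aa, pct_aa, ratio = table_row(p)
    print(f"| {p:.2f} | {q:.2f} | {pct_AA:.2f} | {pct_Aa:.2f} | {pct_aa:.2f} | ~{ratio:.1f} |")
```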
Frequently asked questions
What is p and q in the Hardy-Weinberg equation?
p is the frequency of the dominant allele (A) in the population, a value between 0 and 1. q is the frequency of the recessive allele (a), calculated as 1 - p. Together they describe the entire allele composition of the gene pool at a single locus. The equation p + q = 1 simply states that every allele in the population is either A or a.
How do I use Hardy-Weinberg if I only know the frequency of a recessive disease?
If you know the proportion of individuals expressing a recessive disease (the affected frequency), that proportion equals q². Take its square root to get q, then subtract from 1 to get p. For example, if 1 in 10,000 people are affected, q² = 0.0001, q = 0.01, p = 0.99, and the carrier frequency 2pq = 0.0198 (about 1 in 50). Select the "Recessive phenotype frequency" input mode in this calculator to do this automatically.
What does a significant chi-square result mean in a Hardy-Weinberg test?
A chi-square statistic above 3.841 (the critical value at df = 1, alpha = 0.05) means the observed genotype counts differ from HWE expectations more than you would expect by chance alone. This is evidence that at least one equilibrium assumption is violated: the population may be inbreeding (excess homozygotes), experiencing selection against one genotype, or may be a mixture of two subpopulations with different allele frequencies (population structure). It does not tell you which assumption is violated, only that something is at work.
Why is the carrier frequency always higher than the recessive phenotype frequency?
The carrier frequency 2pq exceeds q² whenever 2p is greater than q, which (since p = 1 - q) means whenever q is below 2/3. That condition holds for every rare recessive allele. When q is small, p is close to 1, so 2pq is approximately 2q while q² is q times q, which is much smaller. The ratio of carriers to affected individuals is 2pq / q² = 2p / q, so for every person who expresses a rare recessive condition, many more are silent carriers. The rarer the recessive allele, the more lopsided this ratio becomes.
Does Hardy-Weinberg apply to X-linked genes?
Not directly with the same formula. For X-linked loci, males are hemizygous (they carry only one X-chromosome allele), so the three-genotype model does not apply to them: male genotype frequencies are simply p and q. In females, who have two X chromosomes, p² + 2pq + q² still applies. The full population calculation must account for both sexes separately. If allele frequencies start out different in males and females, random mating does not restore equilibrium in one generation; the male-female difference halves and changes sign each generation, so the two frequencies converge gradually over several generations.
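A small simulation illustrates how X-linked allele frequencies behave when the two sexes start out different. It is a sketch under the usual random-mating assumptions: sons receive their single X from their mother, and daughters receive one X from each parent. The function name `x_linked_trajectory` is made up for this example.

```python
def x_linked_trajectory(p_female, p_male, generations):
    """Track (female, male) frequencies of allele A at an X-linked locus."""
    history = [(p_female, p_male)]
    for _ in range(generations):
        # Daughters average both parents' frequencies; sons copy the mothers'.
        p_female, p_male = (p_female + p_male) / 2.0, p_female
        history.append((p_female, p_male))
    return history


for generation, (pf, pm) in enumerate(x_linked_trajectory(1.0, 0.0, 6)):
    print(generation, round(pf, 4), round(pm, 4), round(pf - pm, 4))
```

Starting from the extreme case of 1.0 in females and 0.0 in males, the printed difference shrinks by half and flips sign each generation, while the weighted mean (2 × female + male) / 3 stays fixed at 2/3, the value both sexes converge toward.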
How do I convert genotype frequencies into actual numbers?
Multiply each genotype frequency by the total population size N. If p = 0.6 and N = 1000, then expected AA count = p² × N = 0.36 × 1000 = 360, expected Aa count = 2pq × N = 0.48 × 1000 = 480, and expected aa count = q² × N = 0.16 × 1000 = 160. This calculator does this automatically when you enter N in the population size field.