
Pearson Correlation Calculator

Paste or type your paired X and Y data (separated by commas or spaces) and get the Pearson correlation coefficient r, its R-squared, t-statistic, two-tailed p-value, 95 percent confidence interval, and a plain-English interpretation of the strength and direction. All calculations update instantly in your browser with a full step-by-step breakdown.

By Dr. Hannah Brandt, PhD · Updated June 7, 2026

Pearson r: 0.9978 (Strong positive)

Correlation coefficient ranging from -1 (perfect negative) to +1 (perfect positive)

R-squared (r²): 0.9956

t-statistic: 42.5958

P-value (two-tailed): < 0.0001

95% confidence interval: 0.9904 to 0.9995

Sample size (n): 10

Strength: Very strong positive

Significance: Yes (p < 0.05)


Strong negative: below -0.7
Moderate negative: -0.7 to -0.4
Weak negative: -0.4 to -0.1
None: -0.1 to 0.1
Weak positive: 0.1 to 0.4
Moderate positive: 0.4 to 0.7
Strong positive: 0.7 and above

r = 0.9978: very strong positive correlation.

r = 0.9978 indicates a very strong linear relationship (positive direction): as X increases, Y tends to increase.
R-squared = 99.6%, meaning X accounts for about 99.6% of the variation in Y.
The correlation is statistically significant at the 0.05 level (p < 0.0001), making it unlikely to be due to chance.
Correlation does not imply causation. A high r only tells you the two variables move together linearly, not that one causes the other.

Next step: Consider also running a simple linear regression to obtain the slope and intercept, and check a scatter plot for outliers or non-linear patterns.

Count the paired observations: n = 10
10 pairs
Calculate the mean of X and the mean of Y: mean(X) = sum(X) / n, mean(Y) = sum(Y) / n
mean(X) = 8.3000, mean(Y) = 13.9000
Calculate the sample standard deviations of X and Y: SD(X) = sqrt(sum((xi - mean(X))^2) / (n-1)), similarly for Y
SD(X) = 4.1110, SD(Y) = 6.5904
Calculate the sample covariance of X and Y: cov(X,Y) = sum((xi - mean(X)) * (yi - mean(Y))) / (n-1)
cov(X,Y) = 27.0333
Compute Pearson r = cov(X,Y) / (SD(X) * SD(Y)): r = 27.0333 / (4.1110 * 6.5904)
r = 0.9978
Compute R-squared = r^2: r^2 = 0.9978^2
r^2 = 0.9956
Compute the t-statistic with df = n - 2 = 8: t = r * sqrt(n-2) / sqrt(1 - r^2) = 0.9978 * sqrt(8) / sqrt(1 - 0.9956)
t = 42.5958
Determine the two-tailed p-value from the t-distribution: p = 2 * P(T > |t|) where T ~ t(8)
p < 0.0001
Apply Fisher's z-transformation to get the 95% confidence interval for r: z = 0.5 * ln((1+r)/(1-r)), SE = 1/sqrt(n-3), CI: [z - 1.96*SE, z + 1.96*SE], back-transformed to the r scale with tanh
95% CI: [0.9904, 0.9995]
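
The first steps of this breakdown (through the t-statistic) can be sketched in Python with only the standard library. This is an illustrative sketch, not the calculator's actual code: the function name `pearson_steps` and the small data set are hypothetical, chosen so the arithmetic is easy to check by hand.

```python
import math
import statistics


def pearson_steps(x, y):
    """Follow the steps above: n, means, sample SDs, covariance, r, r^2, t."""
    if len(x) != len(y):
        raise ValueError("X and Y must contain the same number of values")
    n = len(x)
    if n < 3:
        raise ValueError("need at least 3 pairs so that df = n - 2 is at least 1")

    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    sd_x = statistics.stdev(x)  # sample SD, divides by n - 1
    sd_y = statistics.stdev(y)
    if sd_x == 0 or sd_y == 0:
        raise ValueError("r is undefined when X or Y is constant")

    # Sample covariance, also divided by n - 1
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)

    r = max(-1.0, min(1.0, cov / (sd_x * sd_y)))  # clamp floating-point overshoot
    r2 = r * r
    # t is infinite for a perfect correlation
    t = math.copysign(math.inf, r) if r2 == 1 else r * math.sqrt(n - 2) / math.sqrt(1 - r2)
    return {"n": n, "mean_x": mean_x, "mean_y": mean_y, "sd_x": sd_x,
            "sd_y": sd_y, "cov": cov, "r": r, "r2": r2, "t": t, "df": n - 2}


# Hypothetical illustration data (not the calculator's default data set)
summary = pearson_steps([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(summary["cov"], 4))  # 1.5
print(round(summary["r"], 4))    # 0.7746
print(round(summary["r2"], 4))   # 0.6
print(round(summary["t"], 4))    # 2.1213
```

The p-value and confidence interval steps are left out here because they need the t-distribution and Fisher's transformation rather than simple sums.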

Formula

r = \dfrac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \cdot \sum_{i=1}^{n}(y_i - \bar{y})^2}}, \quad t = \dfrac{r\sqrt{n-2}}{\sqrt{1-r^2}}, \quad \text{df} = n-2

Worked example

With X = [2, 4, 5, 6, 8] and Y = [4, 7, 8, 10, 14]: mean(X) = 5, mean(Y) = 8.6, cov = 33 / 4 = 8.25, SD(X) = 2.236, SD(Y) = 3.715, r = 8.25 / (2.236 * 3.715) = 0.9932. With n = 5, t = 0.9932 * sqrt(3) / sqrt(1 - 0.9864) ≈ 14.76, p < 0.001, significant at the 0.05 level.

What is the Pearson correlation coefficient?

The Pearson correlation coefficient (r) measures the strength and direction of the linear relationship between two continuous variables. It was developed by Karl Pearson in 1895, building on earlier work by Francis Galton, and remains the most widely used measure of association in statistics. The value of r always falls between -1 and +1 inclusive. A value of +1 means a perfect positive linear relationship (both variables increase together in exact proportion), a value of -1 means a perfect negative linear relationship (one increases as the other decreases in exact proportion), and a value of 0 means no linear association. The sign tells you the direction; the absolute value tells you the strength. Because r only captures linear patterns, two datasets can have the same r but very different scatter plots (Anscombe's quartet is a classic demonstration of this), so always inspect a scatter plot alongside the coefficient.

What does r-squared mean?

R-squared (r^2) is the square of the Pearson coefficient and represents the proportion of variance in Y that is statistically explained by X. For example, if r = 0.80, then r^2 = 0.64, meaning that 64% of the variability in Y is accounted for by its linear relationship with X. The remaining 36% is due to other factors or random variation. R-squared is also the key output of a simple linear regression model on the same data, so it bridges correlation and regression analysis. A high r^2 does not mean the model is correct; it only means the line fits the data well within your sample.
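
The link between correlation and regression can be checked numerically. The sketch below, which uses a hypothetical helper `r_squared_two_ways` and made-up data, computes r^2 once by squaring r and once as 1 - SS_residual / SS_total from a least-squares line; for simple linear regression the two should agree.

```python
import statistics


def r_squared_two_ways(x, y):
    """Return (r^2 from the correlation, R^2 from a least-squares line)."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)  # total sum of squares of Y
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    r = sxy / (sxx * syy) ** 0.5

    # Ordinary least-squares line y = intercept + slope * x
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

    return r * r, 1 - ss_res / syy


# Hypothetical illustration data
from_r, from_regression = r_squared_two_ways([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(from_r, 4), round(from_regression, 4))  # 0.6 0.6
```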

How to interpret the p-value and significance

The t-statistic and the associated two-tailed p-value test the null hypothesis that the true population correlation (rho) is zero. If the p-value is below your chosen significance level (commonly 0.05), you reject the null hypothesis and conclude that the correlation is statistically significant, meaning it is unlikely to have arisen by chance alone. However, statistical significance is not the same as practical significance: with a large enough sample (n = 1000), even r = 0.07 becomes significant, yet that relationship explains less than 0.5% of the variance. Always report both r and the p-value, and consider the effect size (r itself) alongside significance. Small samples (n < 10) produce unstable estimates: a single outlier pair can shift r dramatically.
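
To make the large-sample point concrete, here is a standard-library sketch of the significance test. It is an assumption-laden illustration rather than the calculator's method: the function names are hypothetical, and the t-distribution tail is obtained by integrating the density with Simpson's rule, whereas a statistics library would normally use the incomplete beta function. The approach loses accuracy for extremely small p-values.

```python
import math


def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom (lgamma avoids overflow)."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(t, df, intervals=2000):
    """p = 2 * P(T > |t|), via Simpson's rule on [0, |t|] (intervals must be even)."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / intervals
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3           # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * area)


def correlation_test(r, n):
    """Return (t, p) for H0: rho = 0, given a sample correlation r from n pairs."""
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1 - r * r)
    return t, two_tailed_p(t, df)


t_stat, p_value = correlation_test(0.07, 1000)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}, r^2 = {0.07 ** 2:.4f}")
```

With r = 0.07 and n = 1000 the p-value falls below 0.05 even though r^2 is only 0.0049, which is the gap between statistical and practical significance described above.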

95% confidence interval for r (Fisher z-transformation)

Because r is bounded between -1 and +1, its sampling distribution is skewed for values far from zero. Fisher's z-transformation converts r to a value with an approximately normal distribution: z = 0.5 * ln((1+r)/(1-r)), with standard error 1/sqrt(n-3). A 95% confidence interval is constructed in z-space, then back-transformed to the r scale. The resulting CI tells you the plausible range for the true population correlation rho. Wider intervals indicate more uncertainty; they occur with small n and, for a given n, when r is closer to 0, because the back-transformation compresses the interval as |r| approaches 1. This calculator requires n >= 4 to produce a finite confidence interval.
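
A minimal sketch of this interval in Python follows. The function name `fisher_ci` is hypothetical, and the critical value is fixed at 1.96 to match the formula above; `math.atanh` and `math.tanh` are the transformation and its inverse.

```python
import math


def fisher_ci(r, n, z_crit=1.96):
    """Approximate confidence interval for rho via Fisher's z-transformation."""
    if n < 4:
        raise ValueError("n must be at least 4 so that SE = 1/sqrt(n - 3) is finite")
    if not -1 < r < 1:
        raise ValueError("r must be strictly between -1 and 1")
    z = math.atanh(r)               # same as 0.5 * ln((1 + r) / (1 - r))
    se = 1 / math.sqrt(n - 3)
    # Build the interval in z-space, then map both ends back to the r scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


low, high = fisher_ci(0.9978, 10)
print(f"95% CI: {low:.4f} to {high:.4f}")
```

Called with the rounded values r = 0.9978 and n = 10 from the results above, this should land very close to the reported interval of 0.9904 to 0.9995; small differences in the last digit can come from rounding r before the transformation.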

Assumptions and when NOT to use Pearson r

Pearson r is designed for data that are: (1) continuous and measured on an interval or ratio scale; (2) approximately bivariate normally distributed; (3) related linearly (not curved); and (4) free from heavy outliers that distort the mean. If your data are ordinal rankings, contain severe outliers, or show a clear non-linear pattern (a curve, a U-shape), consider Spearman rank correlation or Kendall tau instead. Pearson r also assumes independence of observations: repeated-measures or hierarchical data require different approaches. Never extrapolate a correlation found in one range of data to a wider range without additional evidence.

Pearson r strength guidelines (Cohen 1988 / Evans 1996)

|r| range	Strength	Interpretation
0.00 - 0.09	None	No meaningful linear association
0.10 - 0.29	Weak	Small effect; relationship exists but is not strong
0.30 - 0.49	Moderate-weak	Moderate-small effect; noticeable but limited
0.50 - 0.69	Moderate	Moderate effect; practical significance likely
0.70 - 0.89	Strong	Large effect; meaningful linear association
0.90 - 1.00	Very strong	Very large effect; near-perfect linear association

Conventional thresholds for interpreting the magnitude of a Pearson correlation. Direction (positive or negative) is assessed separately.
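
The table can be turned into a small lookup. This is only a sketch: the function name `describe_correlation` is hypothetical, and because the table leaves gaps between bands (for example between 0.09 and 0.10), the code assumes each band starts at its lower bound and runs up to the next one.

```python
def describe_correlation(r):
    """Map r to a (strength, direction) pair using the table's lower bounds."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and 1")
    # (lower bound of |r|, label), checked from strongest to weakest
    bands = [
        (0.90, "Very strong"),
        (0.70, "Strong"),
        (0.50, "Moderate"),
        (0.30, "Moderate-weak"),
        (0.10, "Weak"),
    ]
    magnitude = abs(r)
    strength = next((label for lower, label in bands if magnitude >= lower), "None")
    # Direction is assessed separately from strength
    direction = "positive" if r > 0 else "negative" if r < 0 else "none"
    return strength, direction


print(describe_correlation(0.9978))  # ('Very strong', 'positive')
print(describe_correlation(-0.35))   # ('Moderate-weak', 'negative')
```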

Frequently asked questions

What is a "good" Pearson r value?

There is no universal cutoff because it depends entirely on the field and research question. In physics or engineering, r below 0.99 might be considered poor. In psychology or social science, r = 0.30 can be a meaningful finding. Use Cohen's (1988) benchmarks as a starting point: r around 0.10 is small, 0.30 is medium, and 0.50 is large. Always interpret r in the context of your subject matter and consider r-squared to understand the practical proportion of explained variance.

How many data points do I need?

You need at least 3 pairs for r and its significance test (df = n - 2) to be defined in a useful way: with only 2 pairs, r is always exactly +1 or -1. Even with 3 pairs, r often lands near those extremes purely by chance. As a practical rule, aim for at least 10 pairs before trusting the result, and at least 30 pairs for stable estimates. For formal significance testing, a power analysis is the right tool: at r = 0.50 with alpha = 0.05 (two-tailed) and 80% power you need about 29 pairs.
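
One common way to approximate such a power calculation uses Fisher's z-transformation: n ≈ ((z_alpha + z_power) / atanh(r))^2 + 3. The sketch below assumes that approximation and a two-tailed test; the function name `approx_pairs_needed` is hypothetical, and dedicated power-analysis software may give slightly different numbers.

```python
import math
from statistics import NormalDist


def approx_pairs_needed(r, alpha=0.05, power=0.80):
    """Approximate n needed to detect a true correlation r (two-tailed test)."""
    if not 0 < abs(r) < 1:
        raise ValueError("r must be non-zero and strictly between -1 and 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_power = NormalDist().inv_cdf(power)          # about 0.84 for 80% power
    return ((z_alpha + z_power) / math.atanh(abs(r))) ** 2 + 3


n_needed = approx_pairs_needed(0.50)
print(f"approximately {n_needed:.1f} pairs")  # approximately 29.0 pairs
```

Smaller expected correlations need far more data: the same function gives a much larger n for r = 0.10 than for r = 0.50.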

What is the difference between Pearson r and Spearman's rho?

Pearson r measures the strength of the LINEAR relationship between two continuous variables, using the actual numeric values. Spearman's rho converts both variables to ranks first, then computes Pearson r on those ranks. This makes Spearman robust to outliers and valid for ordinal data or non-linear monotonic relationships. If your data are ordinal (e.g. survey ratings), heavily skewed, or clearly non-linear, prefer Spearman. If the data are continuous, roughly normal, and the relationship looks linear on a scatter plot, use Pearson.
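
The "ranks first, then Pearson" description can be sketched directly. The helper names and the data are hypothetical; ties are given the average of the ranks they span, which is the usual convention. The example uses y = x^3, a relationship that is perfectly monotonic but not linear.

```python
def pearson(x, y):
    """Pearson r from sums of squares and cross-products."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / (sxx * syy) ** 0.5


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson r computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: monotonic but curved (y = x^3)
xs = [1, 2, 3, 4, 5]
ys = [1, 8, 27, 64, 125]
print(round(pearson(xs, ys), 4))   # 0.9431
print(round(spearman(xs, ys), 4))  # 1.0
```

Spearman reports a perfect monotonic association, while Pearson is pulled below 1 because the points do not fall on a straight line.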

Can Pearson r be used to prove causation?

No. A correlation, however strong, cannot prove that one variable causes another. Both variables might be driven by a third (confounding) variable, the relationship might be coincidental, or the causation might run in the opposite direction. To establish causation you need a well-designed experiment with random assignment, or a rigorous quasi-experimental design with controls for confounders.

What is the t-statistic used for in this calculator?

The t-statistic is derived from r and n, and it follows a t-distribution with n-2 degrees of freedom under the null hypothesis that rho = 0. It is used to compute the p-value: how surprising would this r (or one more extreme) be if the true correlation were zero? Large |t| (and thus small p) means the data would be very unusual if there were no true correlation, so you reject the null hypothesis.


Written by Dr. Hannah Brandt, PhD Statistician · Munich, Germany

Applied statistician translating rigorous probability theory into clear, accurate tools for researchers and practitioners.
