
Polynomial Regression Calculator

Enter your x and y data points (one pair per line, comma-separated), choose a polynomial degree from 1 (linear) to 6 (sextic), and this calculator fits the best curve by least squares. You get the polynomial equation with coefficients, R-squared, adjusted R-squared, RMSE, a full table of fitted values and residuals, and a chart showing your data alongside the fitted curve. You can also enter any x-value to predict a new y.

By Dr. Hannah Brandt, PhD · Updated June 7, 2026

Fitted equation

y = 1.0488x^2 - 0.5012x + 1.5286

The polynomial equation with computed coefficients

R-squared (R²): 0.9997

Adjusted R-squared: 0.9996

RMSE: 0.2717

Data points used: 7

Predicted y (at x = 8): 64.6429

[Chart: data points plotted with the fitted curve]

Fit quality is excellent (R-squared above 0.99).

The degree-2 polynomial explains 99.97% of the variation in your y-values.
Adjusted R-squared is 0.9996, which penalises extra coefficients and is better for comparing models of different degrees.
RMSE of 0.2717 is the typical distance (in y-units) between your data and the fitted curve.

Next step: Compare R-squared and adjusted R-squared across degrees. If adjusted R-squared stops increasing or decreases, the current degree is likely sufficient.

Step 1 - Identify the model: degree = 2 (quadratic)
y = a0 + a1*x + a2*x^2
Step 2 - Count valid data points: n = number of (x, y) pairs parsed from input
n = 7
Step 3 - Build design matrix X (n x 3) via Vandermonde: X[i][j] = x_i^j for j = 0, 1, 2
7 rows, 3 columns
Step 4 - Form the normal equations: (X^T X) c = X^T y, with a matrix of size 3 x 3 on the left
Normal-equation system assembled
Step 5 - Solve by Gaussian elimination with partial pivoting: Gaussian elimination on the 3 x 4 augmented matrix
Coefficients: [1.5286, -0.5012, 1.0488]
Step 6 - Write out the fitted equation: substitute coefficients back into the polynomial
y = 1.0488x^2 - 0.5012x + 1.5286
Step 7 - Compute R-squared: R^2 = 1 - SS_res / SS_tot
0.9997
Step 8 - Compute adjusted R-squared: adj R^2 = 1 - (1 - R^2) * (n - 1) / (n - 2 - 1)
0.9996
Step 9 - Compute RMSE: RMSE = sqrt( sum( (y_i - y_hat_i)^2 ) / n )
0.2717
Step 10 - Predict y at x = 8: substitute x = 8 into the fitted equation
y = 64.6429
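
The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the calculator's actual source: the function names fit_polynomial and predict are hypothetical, the singularity threshold of 1e-12 is an arbitrary choice, and Horner's rule is used for the prediction step simply because it is a convenient way to evaluate a polynomial.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients [a0, a1, ..., a_degree], lowest power first.
    (Hypothetical helper; the calculator's own code is not shown on the page.)
    """
    n, m = len(xs), degree + 1
    if n < m:
        raise ValueError("need at least degree + 1 data points")

    # Step 3: Vandermonde design matrix, X[i][j] = x_i ** j.
    X = [[x ** j for j in range(m)] for x in xs]

    # Step 4: augmented normal-equation matrix [X^T X | X^T y], size m x (m + 1).
    A = [
        [sum(X[i][r] * X[i][c] for i in range(n)) for c in range(m)]
        + [sum(X[i][r] * ys[i] for i in range(n))]
        for r in range(m)
    ]

    # Step 5: Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:  # assumed tolerance
            raise ValueError("singular system; check for repeated x-values")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, m):
            factor = A[r][col] / A[col][col]
            for c in range(col, m + 1):
                A[r][c] -= factor * A[col][c]

    # Back substitution on the upper-triangular system.
    coeffs = [0.0] * m
    for r in range(m - 1, -1, -1):
        tail = sum(A[r][c] * coeffs[c] for c in range(r + 1, m))
        coeffs[r] = (A[r][m] - tail) / A[r][r]
    return coeffs


def predict(coeffs, x):
    """Step 10: evaluate the fitted polynomial at x using Horner's rule."""
    result = 0.0
    for a in reversed(coeffs):
        result = result * x + a
    return result


# Sample data from the table of fitted values on this page.
xs = [1, 2, 3, 4, 5, 6, 7]
ys = [2.1, 4.9, 9.2, 16.1, 25.3, 36.8, 49.1]

coeffs = fit_polynomial(xs, ys, 2)
print([round(a, 4) for a in coeffs])   # should agree with Step 5
print(round(predict(coeffs, 8), 4))    # should agree with Step 10
```

Run on the seven sample points, the rounded coefficients and the prediction at x = 8 should agree with the values reported in Steps 5 and 10.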

Fitted Values and Residuals

x	y (observed)	y-hat (fitted)	Residual	Residual^2
1.0000	2.1000	2.0762	0.0238	0.0006
2.0000	4.9000	4.7214	0.1786	0.0319
3.0000	9.2000	9.4643	-0.2643	0.0698
4.0000	16.1000	16.3048	-0.2048	0.0419
5.0000	25.3000	25.2429	0.0571	0.0033
6.0000	36.8000	36.2786	0.5214	0.2719
7.0000	49.1000	49.4119	-0.3119	0.0973

Residual = observed y minus fitted y. Smaller residuals mean a tighter fit.

What is polynomial regression?

Polynomial regression is a form of regression analysis in which the relationship between the independent variable x and the dependent variable y is modelled as an n-th degree polynomial. Unlike simple linear regression, which fits a straight line, polynomial regression can capture curves and bends in the data. It is still a form of linear regression because the model is linear in the unknown coefficients, even though it is nonlinear in x. The technique is used in physics, engineering, economics, biology, and any field where the data follow a curved pattern rather than a straight line.

How polynomial regression is computed

The calculator uses ordinary least squares (OLS) to find the polynomial coefficients that minimise the sum of squared residuals between the observed y-values and the values predicted by the polynomial. This is done via the normal equations: (X^T X) c = X^T y, where X is the Vandermonde design matrix and c is the vector of coefficients. The system is solved using Gaussian elimination with partial pivoting. For the degree range supported here (1-6) this is usually adequate, although the normal equations can become ill-conditioned when the x-values are large or span a wide range. You need at least (degree + 1) data points with distinct x-values to fit a unique polynomial. R-squared measures how well the polynomial explains the variance in y; values above 0.95 generally indicate a very good fit, though what counts as good depends on the field. Adjusted R-squared penalises the addition of extra terms and is the better metric when comparing polynomials of different degrees.
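
As a worked illustration of the normal equations, here is the system for the seven sample points on this page (x = 1 to 7) at degree 2. The sums are computed by hand from the observed values in the table of fitted values, so treat them as a check on the method rather than as the calculator's internal output.

```latex
X^{\mathsf T} X =
\begin{pmatrix}
n & \sum x_i & \sum x_i^2 \\
\sum x_i & \sum x_i^2 & \sum x_i^3 \\
\sum x_i^2 & \sum x_i^3 & \sum x_i^4
\end{pmatrix}
=
\begin{pmatrix}
7 & 28 & 140 \\
28 & 140 & 784 \\
140 & 784 & 4676
\end{pmatrix},
\qquad
X^{\mathsf T} y =
\begin{pmatrix}
\sum y_i \\ \sum x_i y_i \\ \sum x_i^2 y_i
\end{pmatrix}
=
\begin{pmatrix}
143.5 \\ 794.9 \\ 4725.3
\end{pmatrix}
```

Solving this 3 x 3 system for c = (a0, a1, a2) gives approximately (1.5286, -0.5012, 1.0488), the coefficients of the fitted equation shown above.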

Choosing the right degree

Selecting the polynomial degree is the key modelling decision. A degree that is too low (underfitting) misses important curvature in the data. A degree that is too high (overfitting) passes through noise and performs poorly on new data even though R-squared is high on the training set. A good approach is to start at degree 2 and increase it until adjusted R-squared levels off or decreases. The residual plot is also informative: if residuals show a systematic curve, a higher degree is warranted. With fewer than 20 data points, degrees above 3 or 4 are rarely justified. For smooth interpolation across a large range of x, consider whether a spline or other piecewise approach would serve better than a high-degree global polynomial.
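
The "increase until adjusted R-squared levels off" rule can be sketched as a small loop. This is only an illustration: the helper names (fit, adjusted_r2, choose_degree) are hypothetical, and the min_gain threshold of 0.001 is an assumed cut-off for "levels off" that the calculator itself does not define.

```python
def fit(xs, ys, degree):
    """Least-squares coefficients (lowest power first) via the normal equations."""
    m = degree + 1
    # Augmented matrix [X^T X | X^T y], built directly from power sums.
    A = [[sum(x ** (r + c) for x in xs) for c in range(m)]
         + [sum(y * x ** r for x, y in zip(xs, ys))] for r in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        p = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[p] = A[p], A[col]
        for r in range(col + 1, m):
            f = A[r][col] / A[col][col]
            A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    # Back substitution.
    coeffs = [0.0] * m
    for r in reversed(range(m)):
        s = sum(A[r][c] * coeffs[c] for c in range(r + 1, m))
        coeffs[r] = (A[r][m] - s) / A[r][r]
    return coeffs


def adjusted_r2(xs, ys, degree):
    """Adjusted R-squared of the degree-`degree` fit on the training data."""
    n = len(xs)
    coeffs = fit(xs, ys, degree)
    fitted = [sum(a * x ** j for j, a in enumerate(coeffs)) for x in xs]
    mean_y = sum(ys) / n
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r2 = 1 - ss_res / ss_tot
    return 1 - (1 - r2) * (n - 1) / (n - degree - 1)


def choose_degree(scores, min_gain=0.001):
    """Lowest degree after which adjusted R-squared gains less than min_gain.

    min_gain is an assumed threshold, not a value taken from the calculator.
    """
    degrees = sorted(scores)
    for d, nxt in zip(degrees, degrees[1:]):
        if scores[nxt] - scores[d] < min_gain:
            return d
    return degrees[-1]


# Sample data from this page.
xs = [1, 2, 3, 4, 5, 6, 7]
ys = [2.1, 4.9, 9.2, 16.1, 25.3, 36.8, 49.1]

scores = {d: adjusted_r2(xs, ys, d) for d in (1, 2, 3)}
for d, s in scores.items():
    print(d, round(s, 4))
print("suggested degree:", choose_degree(scores))
```

For the sample data, the jump from degree 1 to degree 2 is large, while degree 3 changes adjusted R-squared only marginally, so a threshold-based rule of this kind settles on degree 2. Picking the strict maximum instead would favour the marginally higher cubic value, which is why a tolerance is used in this sketch.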

Interpreting R-squared and RMSE

R-squared (the coefficient of determination) tells you the fraction of the total variance in y that the fitted polynomial accounts for. An R-squared of 0.95 means 95% of the variation is explained. However, R-squared always increases or stays the same when you add more terms, even if those terms are not meaningful. Adjusted R-squared corrects for this by penalising degrees of freedom used by extra coefficients; it can decrease if an added term does not improve fit enough. RMSE (root mean square error) is the square root of the average squared residual and is in the same units as y. It tells you the typical prediction error of the model on the training data. Use RMSE alongside R-squared for a full picture of fit quality.
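
The three metrics can be reproduced from the observed and fitted columns of the table on this page. The sketch below is an illustration with a hypothetical function name (fit_metrics); it uses the fitted values as printed, rounded to four decimals, so the last digits may differ slightly from a calculation with unrounded values.

```python
import math


def fit_metrics(observed, fitted, degree):
    """Return (R-squared, adjusted R-squared, RMSE) for a polynomial fit."""
    n = len(observed)
    if n <= degree + 1:
        raise ValueError("adjusted R-squared needs more than degree + 1 points")
    mean_y = sum(observed) / n
    ss_res = sum((y - f) ** 2 for y, f in zip(observed, fitted))  # residual sum of squares
    ss_tot = sum((y - mean_y) ** 2 for y in observed)             # total sum of squares
    r2 = 1 - ss_res / ss_tot
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - degree - 1)
    rmse = math.sqrt(ss_res / n)  # divides by n, as in Step 9 on this page
    return r2, adj_r2, rmse


# Observed and fitted columns copied from the table of fitted values.
observed = [2.1, 4.9, 9.2, 16.1, 25.3, 36.8, 49.1]
fitted = [2.0762, 4.7214, 9.4643, 16.3048, 25.2429, 36.2786, 49.4119]

r2, adj_r2, rmse = fit_metrics(observed, fitted, degree=2)
print(round(r2, 4), round(adj_r2, 4), round(rmse, 4))  # 0.9997 0.9996 0.2717
```

Note that this RMSE divides by n. Some texts divide by the residual degrees of freedom (n - degree - 1) instead, which gives a slightly larger value.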

Polynomial Degree Guide

Degree	Name	Equation form	Typical use
1	Linear	y = a0 + a1*x	Straight-line trends, proportional relationships
2	Quadratic	y = a0 + a1*x + a2*x^2	Parabolic shapes, projectile motion, cost curves
3	Cubic	y = a0 + ... + a3*x^3	S-shaped growth, economic cycles
4	Quartic	y = a0 + ... + a4*x^4	Double-hump distributions, complex oscillations
5	Quintic	y = a0 + ... + a5*x^5	High-precision curve matching, engineering splines
6	Sextic	y = a0 + ... + a6*x^6	Very complex curves; use sparingly to avoid overfitting

Typical uses for each polynomial degree in regression analysis.

Frequently asked questions

How many data points do I need?

You need at least (degree + 1) points to fit a polynomial of that degree: 2 points for degree 1, 3 for degree 2, and so on. In practice, you want considerably more than the minimum for a statistically meaningful fit. As a rough guide, aim for at least 5 times (degree + 1) data points before trusting R-squared values.

What format should my data be in?

Enter one (x, y) pair per line. Separate x and y with a comma, a space, or a tab. For example: "1, 2.5" or "1 2.5". Lines with missing or non-numeric values are ignored automatically, so you can paste directly from a spreadsheet.
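
A parser with this behaviour might look like the sketch below. The function name parse_pairs is hypothetical, and the exact rules are assumptions (for example, that a line must contain exactly two finite numbers to count), since the calculator's own parsing code is not shown.

```python
import math
import re


def parse_pairs(text):
    """Return (xs, ys) from text with one pair per line, skipping invalid lines."""
    xs, ys = [], []
    for line in text.splitlines():
        # Split on commas and/or whitespace (spaces or tabs); drop empty pieces.
        parts = [p for p in re.split(r"[,\s]+", line.strip()) if p]
        if len(parts) != 2:
            continue  # missing value, extra columns, or blank line
        try:
            x, y = float(parts[0]), float(parts[1])
        except ValueError:
            continue  # non-numeric entry such as a header row
        if not (math.isfinite(x) and math.isfinite(y)):
            continue  # assumption: treat "nan" and "inf" as invalid
        xs.append(x)
        ys.append(y)
    return xs, ys


sample = "x,y\n1, 2.5\n2 4.0\n3\t6.5\n4,\n\n5, abc"
xs, ys = parse_pairs(sample)
print(xs, ys)  # [1.0, 2.0, 3.0] [2.5, 4.0, 6.5]
```

The header row, the line with a missing y, the blank line, and the non-numeric line are all skipped, leaving three valid pairs.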

Why is my R-squared perfect (1.0) but the fit looks wrong?

If the number of data points equals the polynomial degree plus one, the polynomial passes exactly through every point and R-squared is exactly 1. This is interpolation, not regression. For a meaningful fit that generalises to new data, use substantially more points than the degree requires.

When should I increase the polynomial degree?

Increase the degree if the residuals (observed minus fitted) show a clear systematic pattern, such as a curve or wave shape. A random scatter of residuals around zero indicates a good fit at the current degree. Increasing the degree when residuals are already random will overfit the data.

Can I use this for extrapolation beyond my data range?

You can enter any x-value in the prediction field and get a y-value, but extrapolation beyond the data range is unreliable for high-degree polynomials. Polynomials can diverge rapidly outside the fitted range. If extrapolation is important, use the lowest degree that fits well or consider a mechanistic model based on the underlying process.

What is the difference between R-squared and adjusted R-squared?

R-squared always increases (or stays the same) when you add a polynomial term, even if that term does not help. Adjusted R-squared adjusts for the number of terms and can decrease if a new term does not improve fit proportionally. When comparing models of different degrees, use adjusted R-squared: the degree with the highest adjusted R-squared is generally the best choice.


Written by Dr. Hannah Brandt, PhD · Statistician · Munich, Germany

Applied statistician translating rigorous probability theory into clear, accurate tools for researchers and practitioners.
