Polynomial Regression Calculator

Enter your x and y data points (one pair per line, separated by a comma, space, or tab), choose a polynomial degree from 1 (linear) to 6 (sextic), and this calculator fits the best curve by least squares. You get the polynomial equation with its coefficients, R-squared, adjusted R-squared, RMSE, a full table of fitted values and residuals, and a chart showing your data alongside the fitted curve. You can also enter any x-value to predict a new y.

Your details

  • Data points: Enter one x,y pair per line. Separate x and y with a comma, space, or tab. At least (degree + 1) valid pairs are required.
  • Polynomial degree: Degree 1 is a straight line; degree 2 is a parabola. Higher degrees fit more complex curves but can overfit small datasets.
  • Decimal places: Controls how many decimal places are shown in coefficients and results.
  • Predict y at x: Enter any x-value to interpolate or extrapolate a predicted y from the fitted polynomial.
Fitted equation
y = 1.0488x^2 - 0.5012x + 1.5286

The polynomial equation with computed coefficients

R-squared (R²): 0.9997
Adjusted R-squared: 0.9996
RMSE: 0.2717
Data points used: 7
Predicted y: 64.6429
[Chart: data points and fitted curve plotted against x]

Fit quality is excellent (R-squared above 0.99).

  • The degree-2 polynomial explains 99.97% of the variation in your y-values.
  • Adjusted R-squared is 0.9996, which penalises extra coefficients and is better for comparing models of different degrees.
  • RMSE of 0.2717 is the typical distance (in y-units) between your data and the fitted curve.

Next step: Compare R-squared and adjusted R-squared across degrees. If adjusted R-squared stops increasing or decreases, the current degree is likely sufficient.

Fitted Values and Residuals

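| x | y (observed) | y-hat (fitted) | Residual | Residual^2 |
|---|---|---|---|---|
| 1.0000 | 2.1000 | 2.0762 | 0.0238 | 0.0006 |
| 2.0000 | 4.9000 | 4.7214 | 0.1786 | 0.0319 |
| 3.0000 | 9.2000 | 9.4643 | -0.2643 | 0.0698 |
| 4.0000 | 16.1000 | 16.3048 | -0.2048 | 0.0419 |
| 5.0000 | 25.3000 | 25.2429 | 0.0571 | 0.0033 |
| 6.0000 | 36.8000 | 36.2786 | 0.5214 | 0.2719 |
| 7.0000 | 49.1000 | 49.4119 | -0.3119 | 0.0973 |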

Residual = observed y minus fitted y. Smaller residuals mean a tighter fit.

What is polynomial regression?

Polynomial regression is a form of regression analysis in which the relationship between the independent variable x and the dependent variable y is modelled as an n-th degree polynomial. Unlike simple linear regression, which fits a straight line, polynomial regression can capture curves and bends in the data. It is still a form of linear regression because the model is linear in the unknown coefficients, even though it is nonlinear in x. The technique is used in physics, engineering, economics, biology, and any field where the data follow a curved pattern rather than a straight line.

How polynomial regression is computed

The calculator uses ordinary least squares (OLS) to find the polynomial coefficients that minimise the sum of squared residuals between the observed y-values and the values predicted by the polynomial. This is done via the normal equations: (X^T X) c = X^T y, where X is the Vandermonde design matrix (each row is 1, x, x^2, ..., x^degree for one data point) and c is the vector of coefficients. The system is solved using Gaussian elimination with partial pivoting, which is adequate for the degree range supported here (1-6), although the normal equations can become ill-conditioned when x-values are very large or widely spread. You need at least (degree + 1) data points with distinct x-values to fit a unique polynomial. R-squared measures how well the polynomial explains the variance in y; values above 0.95 generally indicate a very good fit. Adjusted R-squared penalises the addition of extra terms and is the better metric when comparing polynomials of different degrees.
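The procedure described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the calculator's actual source: the function name `fit_polynomial`, the singularity tolerance, and the ascending coefficient order are assumptions made for the example. The data are the seven points from the residual table.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations (X^T X) c = X^T y.

    Returns coefficients in ascending order: [a0, a1, ..., a_degree].
    """
    m = degree + 1
    if len(xs) != len(ys) or len(xs) < m:
        raise ValueError("need at least (degree + 1) x,y pairs")

    # For a Vandermonde design matrix, entry (i, j) of X^T X is sum(x^(i+j))
    # and entry i of X^T y is sum(y * x^i). Build the augmented matrix [A | b].
    power_sums = [sum(x ** k for x in xs) for k in range(2 * degree + 1)]
    aug = [
        [power_sums[i + j] for j in range(m)]
        + [sum(y * x ** i for x, y in zip(xs, ys))]
        for i in range(m)
    ]

    # Forward elimination with partial pivoting: swap in the row with the
    # largest absolute value in the current column before eliminating.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:  # tolerance is an assumption
            raise ValueError("singular system; too few distinct x-values")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, m):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, m + 1):
                aug[row][k] -= factor * aug[col][k]

    # Back substitution on the upper-triangular system.
    coeffs = [0.0] * m
    for i in range(m - 1, -1, -1):
        known = sum(aug[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (aug[i][m] - known) / aug[i][i]
    return coeffs


xs = [1, 2, 3, 4, 5, 6, 7]
ys = [2.1, 4.9, 9.2, 16.1, 25.3, 36.8, 49.1]
coeffs = fit_polynomial(xs, ys, 2)
print([round(c, 4) for c in coeffs])  # [1.5286, -0.5012, 1.0488]
```

The printed values match the fitted equation shown in the results, read from the constant term upward.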

Choosing the right degree

Selecting the polynomial degree is the key modelling decision. A degree that is too low (underfitting) misses important curvature in the data. A degree that is too high (overfitting) passes through noise and performs poorly on new data even though R-squared is high on the training set. A good approach is to start at degree 2 and increase it until adjusted R-squared levels off or decreases. The residual plot is also informative: if residuals show a systematic curve, a higher degree is warranted. With fewer than 20 data points, degrees above 3 or 4 are rarely justified. For smooth interpolation across a large range of x, consider whether a spline or other piecewise approach would serve better than a high-degree global polynomial.
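One way to apply the "increase the degree until adjusted R-squared levels off" advice is to compute adjusted R-squared for several degrees and compare. The sketch below does this for the seven example points. The helper names `solve` and `adjusted_r2` are hypothetical, and the adjusted R-squared formula used, 1 - (1 - R²)(n - 1)/(n - degree - 1), is the standard one and an assumption about what the calculator computes.

```python
def solve(a, b):
    """Solve the linear system a c = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, m):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, m + 1):
                aug[row][k] -= factor * aug[col][k]
    c = [0.0] * m
    for i in reversed(range(m)):
        known = sum(aug[i][j] * c[j] for j in range(i + 1, m))
        c[i] = (aug[i][m] - known) / aug[i][i]
    return c


def adjusted_r2(xs, ys, degree):
    """Fit a polynomial of the given degree and return its adjusted R-squared."""
    m, n = degree + 1, len(xs)
    # Normal equations for a Vandermonde design matrix.
    a = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(m)]
    coeffs = solve(a, b)
    fitted = [sum(c * x ** k for k, c in enumerate(coeffs)) for x in xs]
    mean_y = sum(ys) / n
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r2 = 1 - ss_res / ss_tot
    # Requires n > degree + 1, otherwise the denominator is zero.
    return 1 - (1 - r2) * (n - 1) / (n - degree - 1)


xs = [1, 2, 3, 4, 5, 6, 7]
ys = [2.1, 4.9, 9.2, 16.1, 25.3, 36.8, 49.1]
scores = {d: adjusted_r2(xs, ys, d) for d in range(1, 5)}
for d, score in scores.items():
    print(f"degree {d}: adjusted R-squared = {score:.4f}")
```

For this dataset the jump from degree 1 to degree 2 is large; whether degrees 3 and 4 add anything beyond that is what the printed comparison is meant to show.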

Interpreting R-squared and RMSE

R-squared (the coefficient of determination) tells you the fraction of the total variance in y that the fitted polynomial accounts for. An R-squared of 0.95 means 95% of the variation is explained. However, R-squared always increases or stays the same when you add more terms, even if those terms are not meaningful. Adjusted R-squared corrects for this by penalising degrees of freedom used by extra coefficients; it can decrease if an added term does not improve fit enough. RMSE (root mean square error) is the square root of the average squared residual and is in the same units as y. It tells you the typical prediction error of the model on the training data. Use RMSE alongside R-squared for a full picture of fit quality.
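As a worked check, the three metrics can be recomputed from the observed and fitted values in the residual table. The sketch assumes the standard adjusted R-squared formula and, following the description above, an RMSE that divides by the number of points n. The function name `fit_metrics` is hypothetical.

```python
import math


def fit_metrics(ys, fitted, degree):
    """Return (R-squared, adjusted R-squared, RMSE) for a polynomial fit."""
    n = len(ys)
    mean_y = sum(ys) / n
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))  # residual sum of squares
    ss_tot = sum((y - mean_y) ** 2 for y in ys)             # total sum of squares
    r2 = 1 - ss_res / ss_tot
    # A degree-d polynomial has d + 1 coefficients, leaving n - d - 1
    # residual degrees of freedom (assumed formula).
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - degree - 1)
    rmse = math.sqrt(ss_res / n)  # square root of the average squared residual
    return r2, adj_r2, rmse


ys = [2.1, 4.9, 9.2, 16.1, 25.3, 36.8, 49.1]
fitted = [2.0762, 4.7214, 9.4643, 16.3048, 25.2429, 36.2786, 49.4119]
r2, adj_r2, rmse = fit_metrics(ys, fitted, degree=2)
print(round(r2, 4), round(adj_r2, 4), round(rmse, 4))  # 0.9997 0.9996 0.2717
```

These agree with the values shown in the results panel, which suggests the calculator uses the same definitions, although the page does not state the formulas explicitly.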

Polynomial Degree Guide

| Degree | Name | Equation form | Typical use |
|---|---|---|---|
| 1 | Linear | y = a0 + a1*x | Straight-line trends, proportional relationships |
| 2 | Quadratic | y = a0 + a1*x + a2*x^2 | Parabolic shapes, projectile motion, cost curves |
| 3 | Cubic | y = a0 + ... + a3*x^3 | S-shaped growth, economic cycles |
| 4 | Quartic | y = a0 + ... + a4*x^4 | Double-hump distributions, complex oscillations |
| 5 | Quintic | y = a0 + ... + a5*x^5 | High-precision curve matching, engineering splines |
| 6 | Sextic | y = a0 + ... + a6*x^6 | Very complex curves; use sparingly to avoid overfitting |

Typical uses for each polynomial degree in regression analysis.

Frequently asked questions

How many data points do I need?

You need at least (degree + 1) points to fit a polynomial of that degree: 2 points for degree 1, 3 for degree 2, and so on. In practice, you want considerably more than the minimum for a statistically meaningful fit. As a rough guide, aim for at least 5 times (degree + 1) data points before trusting R-squared values.

What format should my data be in?

Enter one (x, y) pair per line. Separate x and y with a comma, a space, or a tab. For example: "1, 2.5" or "1 2.5". Lines with missing or non-numeric values are ignored automatically, so you can paste directly from a spreadsheet.
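The input rules above could be implemented roughly as follows. This is a sketch of the described behaviour, not the calculator's parser: the function name `parse_pairs` is hypothetical, and skipping lines with more than two fields or with non-finite values (such as "nan") is an assumption, since the page does not say how those cases are handled.

```python
import math
import re


def parse_pairs(text):
    """Parse 'x,y' lines separated by a comma, space, or tab; skip invalid lines."""
    pairs = []
    for line in text.splitlines():
        # Split on any run of commas and whitespace, dropping empty fields.
        fields = [f for f in re.split(r"[,\s]+", line.strip()) if f]
        if len(fields) != 2:
            continue  # missing value, blank line, or extra columns
        try:
            x, y = float(fields[0]), float(fields[1])
        except ValueError:
            continue  # non-numeric value such as a header row
        if math.isfinite(x) and math.isfinite(y):
            pairs.append((x, y))
    return pairs


sample = "x,y\n1, 2.5\n2 4.9\n3\t9.2\n4,\n\n5, abc"
print(parse_pairs(sample))  # [(1.0, 2.5), (2.0, 4.9), (3.0, 9.2)]
```

The header row, the incomplete line "4,", the blank line, and the line with "abc" are all dropped, which mirrors the "ignored automatically" behaviour described above.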

Why is my R-squared perfect (1.0) but the fit looks wrong?

If the number of data points equals the polynomial degree plus one (and the x-values are all distinct), the polynomial passes exactly through every point, every residual is zero, and R-squared is exactly 1. This is interpolation, not regression. For a meaningful fit that generalises to new data, use substantially more points than the degree requires.

When should I increase the polynomial degree?

Increase the degree if the residuals (observed minus fitted) show a clear systematic pattern, such as a curve or wave shape. A random scatter of residuals around zero indicates a good fit at the current degree. Increasing the degree when residuals are already random will overfit the data.

Can I use this for extrapolation beyond my data range?

You can enter any x-value in the prediction field and get a y-value, but extrapolation beyond the data range is unreliable for high-degree polynomials. Polynomials can diverge rapidly outside the fitted range. If extrapolation is important, use the lowest degree that fits well or consider a mechanistic model based on the underlying process.
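To make the distinction concrete, the sketch below evaluates the example quadratic with Horner's method and labels each prediction as interpolation or extrapolation. It uses the rounded coefficients from the fitted equation, so results can differ from the calculator's in the last decimals. The function name `predict` and the test x-values are assumptions; in particular, the page does not show which x produced the displayed predicted y, and x = 8 is used here only because it gives a value close to it.

```python
def predict(coeffs_desc, x):
    """Evaluate a polynomial with Horner's method.

    coeffs_desc lists coefficients from the highest power down to the constant.
    """
    result = 0.0
    for c in coeffs_desc:
        result = result * x + c
    return result


# Rounded coefficients of y = 1.0488x^2 - 0.5012x + 1.5286 from the results panel.
coeffs = [1.0488, -0.5012, 1.5286]
x_min, x_max = 1.0, 7.0  # range of x in the example data

for x in (4.0, 8.0, 20.0):
    kind = "interpolation" if x_min <= x <= x_max else "extrapolation"
    print(f"x = {x}: predicted y = {predict(coeffs, x):.4f} ({kind})")
```

A simple range check like this is a cheap way to flag predictions that fall outside the fitted data, where the polynomial is least trustworthy.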

What is the difference between R-squared and adjusted R-squared?

R-squared always increases (or stays the same) when you add a polynomial term, even if that term does not help. Adjusted R-squared adjusts for the number of terms and can decrease if a new term does not improve fit proportionally. When comparing models of different degrees, use adjusted R-squared: the degree with the highest adjusted R-squared is generally the best choice.

Written by Dr. Hannah Brandt, PhD Statistician · Munich, Germany

Applied statistician translating rigorous probability theory into clear, accurate tools for researchers and practitioners.
