Interpolate and Extrapolate With Fitted Curves
Fit a linear or polynomial curve through your data points, then interpolate or extrapolate values at any x-coordinate. Visualize the fit and understand the difference between interpolation and extrapolation.
Interpolation is estimating y at an x that sits between your known data points. You fit a function (linear segment, polynomial, spline) through the data and evaluate it at the missing x. As long as the underlying relationship behaves smoothly across that gap and the surrounding points are reliable, the estimate inherits roughly the precision of those neighbors.
Extrapolation is not interpolation, and treating the two as if they were is the standard mistake. Going past the boundary of observed x's is a fundamentally different problem: you're assuming the model that fit your known range continues to hold outside it, with no data to check the assumption. Polynomials in particular blow up outside their fit range, because the highest-power term dominates once x leaves the data. A cubic that bends gently through five points can land orders of magnitude outside the observed y-range a few units past the last observation. (You'll often see this cited as Runge's phenomenon. Strictly, that term refers to oscillation between equally spaced points near the ends of the range, but both problems come from asking too much of one polynomial.) Linear extrapolation is more stable than polynomial but still rests on the linearity assumption holding off-domain. The page flags every query as either interpolation or extrapolation based on x's position relative to min(x) and max(x), and surfaces residuals so you can sanity-check the fit before trusting any prediction inside the range, let alone outside.
Paste Your Data Points and Choose x
Enter (x, y) pairs representing known measurements. Each x-value must be distinct: two points at the same x with different y values create ambiguity. If you have repeated measurements, average them first or pick the most reliable reading.
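One way to do the averaging step before pasting data in is sketched below. The helper name average_duplicates and the sample readings are made up for illustration and are not part of this page.

```python
from collections import defaultdict

def average_duplicates(points):
    """Collapse repeated x-values by averaging their y-values.

    Returns the cleaned points sorted by x. (Hypothetical helper, for illustration.)
    """
    groups = defaultdict(list)
    for x, y in points:
        groups[x].append(y)
    return [(x, sum(ys) / len(ys)) for x, ys in sorted(groups.items())]

# Two readings were taken at x = 2.0; they are merged into one averaged point.
raw = [(1.0, 2.0), (2.0, 4.1), (2.0, 3.9), (3.0, 6.0)]
cleaned = average_duplicates(raw)
print(cleaned)
```

Averaging is only one choice. If one of the repeated readings is known to be more reliable, keeping that reading alone is just as valid.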
Specify the query x where you want an estimated y. The tool determines automatically whether this x falls within your data range (interpolation) or outside it (extrapolation). The distinction matters: interpolation leverages surrounding data, while extrapolation projects a trend into unknown territory.
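The range check described above can be sketched in a few lines. This is an assumption about the logic (a query on the boundary counts as interpolation), not the page's actual code, and classify_query is a hypothetical name.

```python
def classify_query(xs, x_query):
    """Label a query by its position relative to min(x) and max(x).

    Assumption: a query exactly on the boundary is treated as interpolation.
    """
    lo, hi = min(xs), max(xs)
    return "interpolation" if lo <= x_query <= hi else "extrapolation"

xs = [1.0, 2.0, 5.0]
print(classify_query(xs, 3.0))   # inside [1, 5]
print(classify_query(xs, 5.5))   # past the last observation
```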
Data quality drives result quality. Outliers, transcription errors, or measurement noise propagate through the fitted curve. Inspect your points visually before trusting any estimate. A single misplaced point can tilt a polynomial dramatically.
Interpolation Methods: Linear vs Polynomial
Linear interpolation connects adjacent points with straight segments. For a query between x₁ and x₂, it weights the y-values by distance: y ≈ y₁ + (y₂ - y₁) × (x - x₁) / (x₂ - x₁). Simple, fast, and robust when data changes smoothly. It won't capture curvature but avoids wild swings.
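The formula above translates directly into code. The sketch below assumes the x-values are already sorted and distinct, and it refuses to extrapolate. The function name and sample data are illustrative only.

```python
from bisect import bisect_right

def linear_interpolate(xs, ys, x):
    """Piecewise-linear estimate at x. Assumes xs is sorted and has no repeats."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x is outside the data range; this sketch does not extrapolate")
    # Index i of the segment [xs[i], xs[i + 1]] that contains x.
    i = min(bisect_right(xs, x) - 1, len(xs) - 2)
    x1, x2, y1, y2 = xs[i], xs[i + 1], ys[i], ys[i + 1]
    return y1 + (y2 - y1) * (x - x1) / (x2 - x1)

xs = [0.0, 1.0, 2.0, 4.0]
ys = [0.0, 2.0, 3.0, 7.0]
estimate = linear_interpolate(xs, ys, 3.0)  # halfway between (2, 3) and (4, 7)
print(estimate)  # 5.0
```

If NumPy is available, numpy.interp(x, xs, ys) gives the same in-range values without the hand-written segment search.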
Polynomial interpolation fits a single curve through all points. A degree-n polynomial can pass exactly through n + 1 points. Quadratic (degree 2) handles one bend, cubic (degree 3) handles two. Higher degrees risk oscillation between points (Runge's phenomenon), producing unrealistic wiggles.
Pick linear when data trends are roughly straight or when you have few points. Pick polynomial when you see clear curvature and have enough points to justify the degree. A degree-4 polynomial through five points will pass exactly through each, but that doesn't guarantee accuracy—it may overfit noise.
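To see the exact-fit claim in practice, here is a minimal sketch using numpy.polyfit with the degree set to one less than the number of points. The five data values are invented for illustration.

```python
import numpy as np

# Five made-up measurements.
xs = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
ys = np.array([1.0, 2.7, 5.8, 6.6, 7.5])

# Degree 4 through five points: the fit passes through every point.
coeffs = np.polyfit(xs, ys, deg=len(xs) - 1)
max_abs_residual = float(np.max(np.abs(np.polyval(coeffs, xs) - ys)))

# Compare the two methods at a point between observations.
poly_between = float(np.polyval(coeffs, 2.5))
linear_between = float(np.interp(2.5, xs, ys))

print(max_abs_residual)              # zero up to rounding error
print(poly_between, linear_between)  # the two estimates need not agree
```

Zero residuals at the data points say nothing about which of the two in-between estimates is closer to the truth. That depends on whether the curvature the polynomial adds is real or noise.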
Rule of thumb: Use the simplest method that captures the trend. If linear looks adequate, don't upgrade to polynomial just because you can.
Extrapolation Risk: When Predictions Break
Extrapolation asks the model to venture past observed boundaries. Linear extrapolation assumes the slope continues unchanged—often reasonable for short steps but unreliable over long distances. Curves bend; systems saturate; relationships shift.
Polynomial extrapolation is far worse. Outside the data window, polynomials can shoot toward infinity or dive negative depending on leading-term signs. A cubic that fits five points nicely may predict absurd values one unit beyond the last observation.
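A small numerical sketch of the contrast: the data below (made up for illustration) follow y ≈ x with ±0.2 of alternating noise. An exact degree-4 polynomial and a least-squares line are both evaluated at x = 8, well past the last observation at x = 4.

```python
import numpy as np

xs = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
ys = np.array([0.0, 1.2, 1.8, 3.2, 3.8])   # roughly y = x, with small alternating noise
x_query = 8.0                               # four units past the last observation

exact_poly = np.polyfit(xs, ys, deg=len(xs) - 1)  # passes through every point
line = np.polyfit(xs, ys, deg=1)                  # least-squares straight line

poly_estimate = float(np.polyval(exact_poly, x_query))
line_estimate = float(np.polyval(line, x_query))

print(round(poly_estimate, 2))  # about -138.8
print(round(line_estimate, 2))  # about 7.76
```

The polynomial has faithfully fitted the noise, and its leading term turns that ±0.2 wiggle into an estimate that is off by well over a hundred. The line's estimate stays near the trend, though it is still only as good as the assumption that the trend continues.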
Treat any extrapolated value as tentative guidance, not a reliable forecast. Flag it in reports. If critical decisions hinge on that estimate, collect additional data closer to the region of interest rather than stretching a model beyond its foundation.
Warning: Confidence intervals widen rapidly outside the data range. An estimate that feels precise inside the window becomes speculative outside it.
Residual View: How Well the Fit Matches
Residuals are observed y minus fitted y at each data point. Small, randomly scattered residuals suggest the model captures the relationship adequately. Patterns—a systematic curve, increasing spread—indicate the model misses structure.
For polynomial interpolation passing exactly through all points, residuals at those points are zero by construction. That doesn't prove the curve is "right"—it just fits the data exactly. The curve's behavior between points may still oscillate or stray from the true underlying function.
Inspect residuals especially when using least-squares regression curves (not exact-fit polynomials). A parabolic residual pattern means a linear fit missed curvature; a funnel shape signals heteroscedasticity. Residual analysis guides model refinement.
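The parabolic residual pattern is easy to reproduce. The sketch below fits a least-squares line to samples of y = x² (chosen so the arithmetic is easy to follow) and then repeats the fit with a quadratic.

```python
import numpy as np

xs = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
ys = xs**2                                  # curved data

# Residual = observed y minus fitted y.
line = np.polyfit(xs, ys, deg=1)
residuals_line = ys - np.polyval(line, xs)

quad = np.polyfit(xs, ys, deg=2)
residuals_quad = ys - np.polyval(quad, xs)

print(np.round(residuals_line, 6))  # [ 2. -1. -2. -1.  2.] : a U shape, so curvature was missed
print(float(np.max(np.abs(residuals_quad))))  # zero up to rounding error
```

Positive residuals at both ends with negative ones in the middle is the signature of a linear fit applied to curved data. Moving to a quadratic removes the pattern here because the data are exactly quadratic; with real measurements you would expect the pattern to shrink to random scatter instead.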
Visual Forecast: Point Estimate on the Curve
The chart overlays your data points with the fitted curve and marks the query location. Seeing where your estimate sits relative to the data helps you judge plausibility. An estimate that falls amid clustered points feels more credible than one perched on a curve bending away into empty space.
Zoom out to view curve behavior beyond the data window. Polynomial curves often behave wildly outside their domain—visualizing this reinforces caution about extrapolation. A curve shooting upward at the edge warns you not to trust values projected further out.
Visualization also reveals outliers. A point far from the curve might be a measurement error or a genuine anomaly worth investigating. Either way, its influence on the fit deserves attention.
Common questions about fitting between points
When is linear interpolation better than polynomial?
When data follows a roughly straight trend or you have only two or three points. Linear avoids overfitting and extrapolates more predictably. If unsure, start linear; upgrade only if residuals show clear curvature.
Can I extrapolate safely with linear fits?
Safer than polynomial but still risky. Linear extrapolation assumes the slope holds indefinitely—fine for short projections, unreliable for long ones. Real systems eventually deviate from straight-line trends.
Why does polynomial degree matter?
Higher degrees fit more complex shapes but risk oscillation (Runge's phenomenon). A degree-6 polynomial through seven points might wiggle dramatically between them, producing unrealistic intermediate values.
What if my query x is far outside the data?
Treat the estimate as speculative. Uncertainty grows with distance from observed points. Collect data nearer to your region of interest if the estimate matters for decisions.
How do I handle noisy measurements?
Consider smoothing or least-squares regression instead of exact interpolation. Fitting a lower-degree polynomial that doesn't pass through every point may better represent the underlying trend by averaging out noise.
Limitations of curve fitting
Inside vs outside the data range: interpolation inherits the precision of the surrounding points and is generally safe with a sensible model class. Extrapolation is a fundamentally different problem. You're assuming the fitted model holds in a regime you have no data for.
Polynomials blow up off-domain: the leading term dominates outside the data, so a cubic that bends gently through five points can land orders of magnitude outside the observed y-range a few units past the last observation. Inside the range, high-degree polynomials passing exactly through n points often oscillate wildly between them (Runge's phenomenon).
Outliers propagate: measurement noise and transcription errors distort the fitted curve. Inspect the data before trusting any prediction.
No uncertainty bands: this page produces point estimates only. For uncertainty quantification, fit a model with explicit variance (Gaussian process regression, bootstrapped splines).
Note: scipy.interpolate covers a broad set of methods (piecewise linear, cubic splines, PCHIP, barycentric). R's approx() and spline() cover the basics. For piecewise-cubic Hermite (PCHIP), which avoids overshoot, scipy.interpolate.PchipInterpolator is the standard choice. ISLR Chapter 7 covers the regression-side perspective on smoothing and basis expansions.
Sources & References
Methods and formulas follow established numerical analysis references:
- NIST/SEMATECH e-Handbook: Polynomial Regression
- Wolfram MathWorld: Least Squares Fitting
- Penn State STAT 501: Polynomial Regression
Curve fitting between points: working questions
Linear vs cubic spline, which should I pick?
Linear if the data are well-spaced and you want predictable behavior at the cost of corners at each data point. Cubic spline if you want a smooth curve through the points without polynomial blow-up. Splines fit a piecewise cubic between each pair of points, with continuity in the function and its first two derivatives. They handle 50-point datasets that would explode under a single degree-49 polynomial. scipy.interpolate.CubicSpline is the standard implementation; R's splinefun() does the same job. PCHIP (scipy.interpolate.PchipInterpolator) is monotonic where the data are monotonic, which avoids overshoot.
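The overshoot difference shows up clearly on step-like data. The sketch below uses made-up monotonic points that jump from 0 to 1, and compares a cubic spline (natural boundary conditions are an arbitrary choice here) with PCHIP on a fine grid.

```python
import numpy as np
from scipy.interpolate import CubicSpline, PchipInterpolator

# Monotonic, step-like data: flat at 0, then flat at 1.
xs = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
ys = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

grid = np.linspace(xs[0], xs[-1], 501)
cubic = CubicSpline(xs, ys, bc_type="natural")(grid)
pchip = PchipInterpolator(xs, ys)(grid)

cubic_min, cubic_max = float(cubic.min()), float(cubic.max())
pchip_min, pchip_max = float(pchip.min()), float(pchip.max())

print(cubic_min, cubic_max)  # dips below 0 and rises above 1
print(pchip_min, pchip_max)  # stays within the data's range
```

The cubic spline pays for its continuous second derivative by ringing on either side of the jump. PCHIP gives up second-derivative continuity and in return never leaves the range of the neighboring data values.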
Why does my polynomial oscillate between data points?
Runge's phenomenon. A high-degree polynomial fit to evenly spaced points oscillates wildly near the endpoints. The classic example: the polynomial through equispaced samples of 1/(1 + 25x²) on [−1, 1] gets worse as the degree rises, with a maximum error near the boundaries of roughly 2 at degree 10 and roughly 60 at degree 20, for a function that never exceeds 1. Fix: use Chebyshev nodes instead of equispaced points (they cluster near the boundaries and avoid the issue), use splines instead of a single high-degree polynomial, or limit polynomial degree to about 6 even for clean data. Above degree 10 with equispaced data, visible oscillation is common for functions like this one.
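The effect of node placement can be checked numerically. The sketch below interpolates the same function at degree 20 with equispaced nodes and with Chebyshev nodes, and measures the maximum error on a fine grid. The degree and grid size are arbitrary choices, and scipy's BarycentricInterpolator is used only as a convenient, stable way to evaluate the interpolating polynomial.

```python
import numpy as np
from scipy.interpolate import BarycentricInterpolator

def runge(x):
    return 1.0 / (1.0 + 25.0 * x**2)

degree = 20                       # 21 nodes
k = np.arange(degree + 1)
equispaced = np.linspace(-1.0, 1.0, degree + 1)
# Chebyshev nodes of the first kind: they cluster toward -1 and 1.
chebyshev = np.cos((2 * k + 1) * np.pi / (2 * (degree + 1)))

grid = np.linspace(-1.0, 1.0, 2001)

def max_error(nodes):
    """Largest gap between the interpolating polynomial and the true function."""
    p = BarycentricInterpolator(nodes, runge(nodes))
    return float(np.max(np.abs(p(grid) - runge(grid))))

err_equispaced = max_error(equispaced)
err_chebyshev = max_error(chebyshev)
print(err_equispaced)  # large: the polynomial swings far from the function near the ends
print(err_chebyshev)   # small by comparison
```

Same function, same degree, same number of samples. Only the sample locations differ, which is why choosing nodes is a real fix when you control where measurements are taken and no help at all when you don't.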
How far can I safely extrapolate past my data?
Generally not at all. Inside the data range you can lean on the local behavior of the fitted function. Outside, you're assuming the model continues to hold in a regime you have no data for. A linear extrapolation a small fraction past the boundary is sometimes defensible if the underlying physics or theory supports linearity off-domain. Polynomial extrapolation is almost never defensible because polynomials grow without bound: the leading term dominates once you leave the data, whatever the fit looked like inside it. Specific quantitative answer: don't.
What is spline interpolation?
Piecewise polynomial fitting where each segment is a low-degree polynomial (usually cubic) and the segments join smoothly. Cubic splines match the function value, first derivative, and second derivative at each interior knot, which gives a curve that looks visually smooth and avoids the wild oscillations of high-degree single-polynomial fits. Natural splines additionally set the second derivative to zero at the boundaries; clamped splines fix the first derivative at the boundaries. The standard reference is de Boor's "A Practical Guide to Splines."
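The two boundary conditions can be tried side by side with scipy.interpolate.CubicSpline. In the sketch below the data are samples of sin(x), chosen so the true end slopes (cos at each end) are known. Note that scipy's own bc_type="clamped" shorthand means a zero end slope, so a clamped spline with specified slopes is written with explicit (order, value) pairs.

```python
import numpy as np
from scipy.interpolate import CubicSpline

xs = np.array([0.0, 1.0, 2.0, 3.0])
ys = np.sin(xs)

# Natural: second derivative is zero at both ends.
natural = CubicSpline(xs, ys, bc_type="natural")

# Clamped: first derivative fixed at both ends, here to the known slopes cos(x).
end_slopes = ((1, np.cos(xs[0])), (1, np.cos(xs[-1])))
clamped = CubicSpline(xs, ys, bc_type=end_slopes)

# Calling the spline with a second argument evaluates that derivative order.
natural_end_curvature = float(natural(xs[0], 2))
clamped_end_slope = float(clamped(xs[0], 1))

print(natural_end_curvature)  # zero up to rounding error
print(clamped_end_slope)      # 1.0 up to rounding error, matching cos(0)
```

Clamped is the better choice when you actually know the end slopes. Natural is a reasonable default when you don't.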
Lagrange interpolation, is it still useful?
Mainly as a theoretical tool. Lagrange's formula gives the unique polynomial of degree at most n − 1 through n points with distinct x-values and is elegant for proofs and small symbolic problems. For numerical work, it's both expensive (O(n²) per evaluation point in its naive form) and unstable (sensitive to round-off). Newton's divided-difference form has the same end product but is incrementally updatable and easier to implement. Barycentric Lagrange is the modern stable variant that fixes the round-off issues and evaluates in O(n) per point after a one-time O(n²) setup. For practical curve-fitting through many points, splines beat any polynomial form.
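For comparison, here is a minimal sketch of the naive Lagrange form next to scipy's barycentric implementation. The helper lagrange_naive and the sample points (taken from x² + 1) are illustrative only.

```python
from scipy.interpolate import BarycentricInterpolator

def lagrange_naive(xs, ys, x):
    """Evaluate the Lagrange form directly: O(n^2) work for each query point."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        basis = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                basis *= (x - xj) / (xi - xj)
        total += yi * basis
    return total

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 2.0, 5.0, 10.0]   # samples of x**2 + 1

naive_value = lagrange_naive(xs, ys, 1.5)
bary_value = float(BarycentricInterpolator(xs, ys)(1.5))

print(naive_value, bary_value)  # both 3.25 up to rounding, since 1.5**2 + 1 = 3.25
```

Both forms describe the same polynomial, so on a small, well-behaved problem like this one they agree. The difference shows up in cost and round-off as the number of points grows.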
How do I deal with noisy data?
Don't interpolate it (interpolation forces the curve through every point, including the noise). Smooth it instead: fit a regression line, polynomial, or smoothing spline that minimizes residuals without passing through every point exactly. R's smooth.spline() and Python's scipy.interpolate.UnivariateSpline take a smoothing parameter that trades fit accuracy for curve smoothness. Cross-validation picks the smoothing level objectively. For genuinely noisy data, a smoothing approach is almost always better than exact interpolation.
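The trade-off can be sketched with scipy.interpolate.UnivariateSpline, whose s argument is the smoothing parameter. The data below are synthetic (a sine wave plus Gaussian noise with a fixed seed), and setting s to n·σ² is a common rule of thumb rather than a tuned value.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(0)
n, sigma = 40, 0.1
xs = np.linspace(0.0, 2.0 * np.pi, n)
ys = np.sin(xs) + rng.normal(0.0, sigma, n)   # synthetic noisy measurements

# s=0 forces the spline through every point, noise included.
exact = UnivariateSpline(xs, ys, s=0)

# s>0 allows a total squared residual of about s, giving a smoother curve.
s_target = n * sigma**2
smooth = UnivariateSpline(xs, ys, s=s_target)

exact_rss = float(exact.get_residual())       # sum of squared residuals
smooth_rss = float(smooth.get_residual())
exact_knots = len(exact.get_knots())
smooth_knots = len(smooth.get_knots())

print(exact_rss, exact_knots)     # essentially zero residual, many knots
print(smooth_rss, smooth_knots)   # residual near s_target, far fewer knots
```

In practice σ is rarely known in advance, which is where cross-validation over a range of s values comes in.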
Best fit vs interpolation, what's the difference?
Best fit (regression) finds parameters of an assumed model class that minimize residuals; the curve doesn't pass through every data point. Interpolation forces the curve through every point exactly. Use regression when the data are noisy or when you have a theoretical model to fit. Use interpolation when the data are exact (computed values, calibration tables) and you want to estimate values at intermediate points. Mixing them up is a frequent mistake: interpolating noisy lab data builds the noise into your function and then propagates it.
Inverse interpolation, how does it work?
Given y, find x such that f(x) = y. Two common approaches. If f is strictly monotonic over the range, swap roles and interpolate x as a function of y. If f isn't monotonic, root-find on g(x) = f(x) − y using bisection or Brent's method on each interval where the sign changes. Useful for inverting CDFs (finding percentiles), backing out concentrations from calibration curves, and solving "at what time did the value cross threshold" questions. scipy.optimize.brentq is the standard Python tool for the bracketed-root case.
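Both approaches are sketched below on made-up monotonic data, using a piecewise-linear f so the two answers can be compared directly. The target value and variable names are illustrative.

```python
import numpy as np
from scipy.optimize import brentq

xs = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
ys = np.array([0.0, 1.0, 4.0, 9.0, 16.0])   # strictly increasing
y_target = 6.0

# Approach 1: swap roles and interpolate x as a function of y.
# Valid only because ys is strictly increasing.
x_swapped = float(np.interp(y_target, ys, xs))

# Approach 2: root-find on g(x) = f(x) - y_target inside a bracket
# where g changes sign (here the whole data range).
def g(x):
    return float(np.interp(x, xs, ys)) - y_target

x_root = brentq(g, xs[0], xs[-1])

print(x_swapped, x_root)  # both 2.4 up to solver tolerance
```

The two agree here because f is piecewise linear in both directions. With a spline or other curved f they would differ slightly, and the root-finding route is the one that stays consistent with the forward curve.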