Interpolation & Extrapolation Tool

Fit a linear or polynomial curve through your data points, then interpolate or extrapolate values at any x-coordinate. Visualize the fit and understand the difference between interpolation and extrapolation.

Last Updated: November 25, 2025

Understanding Interpolation & Extrapolation: Curve Fitting, Linear Regression, and Polynomial Models

Interpolation and extrapolation are fundamental techniques for estimating values from data points. Interpolation estimates values within the range of your known data points, while extrapolation estimates values beyond your data range. Both techniques use curve fitting—finding a mathematical function (linear or polynomial) that best describes the relationship between variables. This tool demonstrates basic curve fitting concepts using least squares regression to fit linear and polynomial models through data points. Whether you're a student learning data analysis, a researcher analyzing experimental data, a data analyst building predictive models, or a business professional forecasting trends, understanding interpolation and extrapolation enables you to estimate values, make predictions, and analyze relationships in data.

For students and researchers, this tool demonstrates practical applications of least squares regression, curve fitting, and model evaluation. The interpolation and extrapolation calculations show how linear and polynomial models fit data, how R² measures goodness of fit, how residuals reveal model quality, and how predictions differ between interpolation and extrapolation. Students can use this tool to verify homework calculations, understand how different polynomial degrees affect fits, explore concepts like overfitting and model selection, and see how extrapolation becomes less reliable outside the data range. Researchers can apply curve fitting to analyze experimental data, estimate values at unmeasured points, understand model limitations, and evaluate prediction quality. The visualization helps students and researchers see how fitted curves relate to data points and how predictions behave within and beyond the data range.

For business professionals and practitioners, interpolation and extrapolation provide essential tools for data analysis and forecasting. Data analysts use curve fitting to model relationships between variables, estimate missing values, and make predictions. Financial analysts use extrapolation to forecast trends, though with caution due to uncertainty. Engineers use interpolation to estimate values between measured data points, analyze system behavior, and design experiments. Operations managers use curve fitting to model production relationships, optimize processes, and predict outcomes. Quality control engineers use interpolation to estimate values at unmeasured conditions, analyze test results, and evaluate process performance. Marketing professionals use curve fitting to model customer behavior, predict campaign performance, and analyze trends.

For the common person, this tool answers practical data analysis questions: What's the value between these data points? What might happen beyond the observed range? The tool fits curves through data points, showing how to estimate values and make predictions. Everyday users can apply interpolation and extrapolation to understand data relationships, estimate missing values, analyze trends, and make informed, data-based decisions. These concepts help you understand how to extract information from data and make predictions, fundamental skills in modern data-driven decision-making.

⚠️ Educational Tool Only - Not for Real Forecasting

This tool demonstrates basic curve fitting concepts for learning purposes. It is NOT designed for professional forecasting, financial analysis, medical diagnosis, or safety-critical applications. Real-world curve fitting requires domain knowledge, proper statistical methods, uncertainty quantification, cross-validation, and assessment of model assumptions. Do NOT use this tool for financial, medical, or safety-critical predictions. For serious work, use proper statistical software (R, Python/SciPy, MATLAB) with appropriate validation and domain expertise.

Understanding the Basics

Interpolation vs. Extrapolation: The Key Difference

Interpolation estimates values within the range of your known data points—it's generally more reliable because you're working in a region where you have information. Extrapolation estimates values beyond your data range, which is inherently riskier because the model may not behave the same way outside observed regions. Think of it this way: interpolation is like guessing what happened between data points you've observed, while extrapolation is predicting what might happen in uncharted territory. Always be cautious with extrapolation—uncertainty grows rapidly beyond the data range, and high-degree polynomials can behave unpredictably outside the observed region.

Linear Fit: Simple and Robust

Linear fit uses a straight line model: ŷ = c₀ + c₁x, where c₀ is the intercept and c₁ is the slope. Linear fit is best for data with roughly linear trends—it's simple, robust, and less prone to overfitting. It cannot capture curvature in the data, but it extrapolates predictably (the straight line continues). Linear fit uses least squares regression to minimize the sum of squared residuals: Σ(yᵢ - ŷᵢ)². The solution has a closed form: slope c₁ = Sxy / Sxx and intercept c₀ = mean(y) - c₁ × mean(x), where Sxy = Σ(xᵢ - mean(x))(yᵢ - mean(y)) measures how x and y vary together (it is proportional to their covariance) and Sxx = Σ(xᵢ - mean(x))² measures the spread of x (proportional to its variance). Linear fit is appropriate when the relationship is approximately linear or when you need a simple, interpretable model.
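
For readers who want to see the arithmetic, here is a minimal Python sketch of the same formulas (the function name fit_linear and the NumPy dependency are illustrative choices, not the tool's actual code):

    import numpy as np

    def fit_linear(x, y):
        # Least squares line: c1 = Sxy / Sxx, c0 = mean(y) - c1 * mean(x)
        x = np.asarray(x, dtype=float)
        y = np.asarray(y, dtype=float)
        dx = x - x.mean()
        sxx = np.sum(dx ** 2)               # Sxx: squared spread of x
        sxy = np.sum(dx * (y - y.mean()))   # Sxy: joint variation of x and y
        if sxx == 0:
            raise ValueError("all x-values are equal; the slope is undefined")
        c1 = sxy / sxx
        c0 = y.mean() - c1 * x.mean()
        return c0, c1

    c0, c1 = fit_linear([1, 2, 3, 4, 5], [2, 4, 5, 7, 9])
    print(c0, c1)  # about 0.3 and 1.7 (the worked example later on this page)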

Polynomial Fit: Capturing Curvature

Polynomial fit uses a polynomial model: ŷ = c₀ + c₁x + c₂x² + ... + c_dxᵈ, where d is the degree. Higher degree means more flexibility to fit curves: quadratic (d=2) for parabolic patterns, cubic (d=3) for S-shaped curves, and higher degrees for more complex patterns. However, high-degree polynomials can oscillate wildly (Runge's phenomenon) and risk overfitting, especially with few data points. Polynomial fit uses least squares regression by solving the normal equations (X^T X) c = X^T y, where X is the Vandermonde design matrix. The system is solved using Gaussian elimination with partial pivoting. Warning: high-degree polynomials can achieve R² ≈ 1 by passing through every point but perform terribly for interpolation at new x-values or extrapolation.
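
A brief Python sketch of this approach, building the Vandermonde design matrix and solving the normal equations (np.linalg.solve stands in here for the tool's hand-rolled Gaussian elimination; a pivoting version is sketched in the formulas section further down):

    import numpy as np

    def fit_polynomial(x, y, degree):
        x = np.asarray(x, dtype=float)
        y = np.asarray(y, dtype=float)
        # Design matrix: X[i][j] = x_i ** j (a Vandermonde matrix)
        X = np.vander(x, degree + 1, increasing=True)
        # Normal equations: (X^T X) c = X^T y
        return np.linalg.solve(X.T @ X, X.T @ y)  # c0, c1, ..., cd

    coeffs = fit_polynomial([0, 1, 2, 3], [1, 2, 5, 10], degree=2)
    # np.polyval expects the highest-degree coefficient first
    print(np.polyval(coeffs[::-1], 1.5))  # about 3.25 for y = x**2 + 1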

Least Squares Regression: Minimizing Error

Least squares regression finds coefficients that minimize the sum of squared residuals: Σ(yᵢ - ŷᵢ)², where yᵢ are observed values and ŷᵢ are fitted values. This minimizes the total squared error between the model and data. For linear fit, the solution uses simple formulas based on means and variances. For polynomial fit, the solution requires solving a system of normal equations derived from minimizing the sum of squared errors. The normal equations are (X^T X) c = X^T y, where X is the design matrix, c is the coefficient vector, and y is the observed values. The system is solved using Gaussian elimination with partial pivoting for numerical stability. Least squares assumes errors are normally distributed and independent, which may not hold for all data.

R² (Coefficient of Determination): Measuring Goodness of Fit

R² measures how much of the variance in your data is explained by the fitted model: R² = 1 - (SS_residual / SS_total), where SS_residual is the sum of squared residuals and SS_total is the total sum of squares. R² = 1 means perfect fit (model explains all variance), R² = 0 means the model is no better than a horizontal line at the mean, and 0 < R² < 1 means the model explains some but not all variance. However, high R² doesn't guarantee good predictions! A high-degree polynomial can achieve R² ≈ 1 by passing through every point but still give terrible predictions at new x-values. Always look at residuals and consider the purpose of your fit. R² is useful for comparing models but should not be the only criterion for model selection.
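
In code, the R² computation is a few lines (an illustrative Python sketch; y and y_hat are the observed and fitted values):

    import numpy as np

    def r_squared(y, y_hat):
        y = np.asarray(y, dtype=float)
        y_hat = np.asarray(y_hat, dtype=float)
        ss_res = np.sum((y - y_hat) ** 2)     # variance left unexplained by the model
        ss_tot = np.sum((y - y.mean()) ** 2)  # total variance in the data
        if ss_tot == 0:
            return None  # all y-values identical: R² is undefined
        return 1 - ss_res / ss_tot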

Residuals: Understanding Model Quality

Residuals are the differences between actual y-values and fitted ŷ-values: residual = y - ŷ. They show how well the model fits each point. Small, random residuals suggest a good fit. If residuals show a pattern (e.g., consistently positive for low x, negative for high x), the model may be missing important structure in your data. Patterns in residuals indicate that the model is not capturing all the information in the data—you may need a different model or higher degree. Random residuals indicate that the model captures the main trend, with remaining variation being random noise. Always examine residuals to assess model quality, not just R².
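
A rough Python sketch for inspecting residual signs (a heuristic illustration, not a formal statistical test):

    import numpy as np

    def residual_signs(x, y, y_hat):
        # Residuals in x order; long runs of one sign hint at missed structure
        x = np.asarray(x, dtype=float)
        r = np.asarray(y, dtype=float) - np.asarray(y_hat, dtype=float)
        order = np.argsort(x)
        return r[order], "".join("+" if v >= 0 else "-" for v in r[order])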

Overfitting: When Models Fit Too Well

Overfitting occurs when a model fits the training data too well but generalizes poorly to new data. Signs of overfitting include: (1) Very high R² but poor predictions at new x-values, (2) The curve wiggles excessively between data points, (3) Polynomial degree is close to the number of data points, (4) Small changes in data cause large changes in the fitted curve. Overfitting happens when the model is too complex for the amount of data—it fits noise rather than the true pattern. A good rule: use the simplest model that captures the main trend in your data. For extrapolation, use linear or low-degree polynomials—high degrees are unstable outside the data range.
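
The sketch below illustrates the effect with NumPy's polyfit: training R² climbs toward 1 as the degree rises, while the prediction at a held-out x drifts (the data here is synthetic, generated only for illustration):

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = 2 * x + rng.normal(0, 0.5, size=5)  # noisy, but truly linear

    for degree in (1, 2, 3, 4):
        coeffs = np.polyfit(x, y, degree)
        y_hat = np.polyval(coeffs, x)
        r2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
        # The underlying line gives about 12 at x = 6; higher degrees drift
        print(degree, round(r2, 4), round(np.polyval(coeffs, 6.0), 2))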

Choosing the Right Model: Linear vs. Polynomial

Choose the model based on your data: (1) Roughly linear trend → Linear fit (simple, robust, interpretable), (2) Clear curvature with one bend → Quadratic (d=2) for parabolic shape, (3) S-shape or inflection → Cubic (d=3) for inflection points, (4) Few data points (< 6) → Low degree to avoid overfitting, (5) Need to extrapolate → Linear or low degree (high degrees are unstable outside data). Always start with the simplest model (linear) and increase complexity only if the data clearly shows curvature. Use residuals and R² to guide model selection, but don't rely solely on R²—examine the fitted curve and residuals to ensure the model makes sense.

Step-by-Step Guide: How to Use This Tool

Step 1: Enter Your Data Points

Enter your data points as (x, y) pairs. The tool supports up to 20 data points for educational purposes. Make sure all x-values are distinct—having two points with the same x but different y creates ambiguity. If you have repeated measurements at the same x, consider averaging them. The tool will sort points by x-value automatically. Enter at least 2 points for linear fit, and at least (d + 1) points for a polynomial fit of degree d, as in the validation sketch below.
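
A sketch of the kind of validation this step implies (illustrative Python; the 20-point and degree limits mirror the tool's stated maximums):

    def validate_points(points, degree=1, max_points=20):
        # points: list of (x, y) pairs
        if len(points) > max_points:
            raise ValueError(f"at most {max_points} points are supported")
        if len(points) < degree + 1:
            raise ValueError(f"need at least {degree + 1} points for degree {degree}")
        xs = [p[0] for p in points]
        if len(set(xs)) != len(xs):
            raise ValueError("x-values must be distinct; average repeated measurements")
        return sorted(points)  # sorted by x, as the tool does automatically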

Step 2: Choose Fitting Method

Choose between Linear fit (straight line) or Polynomial fit (curved line). Use Linear fit when your data shows a roughly linear trend—it's simple, robust, and extrapolates predictably. Use Polynomial fit when you see clear curvature—choose the degree based on the complexity: degree 2 for one bend (parabolic), degree 3 for S-shapes, and higher degrees for more complex patterns. However, avoid high degrees (especially with few data points) to prevent overfitting. The tool limits maximum degree to 6 for numerical stability.

Step 3: Set Polynomial Degree (If Using Polynomial Fit)

If using Polynomial fit, set the degree (1-6). Degree 1 is linear (same as Linear fit), degree 2 is quadratic (parabolic), degree 3 is cubic (S-shaped), and higher degrees allow more complex curves. Choose the lowest degree that captures the main trend—higher degrees risk overfitting, especially with few data points. A good rule: use degree ≤ (number of points - 1), but prefer lower degrees unless the data clearly shows complex patterns. For extrapolation, use low degrees (1-2) as high degrees are unstable outside the data range.

Step 4: Enter Query Point

Enter the x-value where you want to estimate y. The tool will determine whether this is interpolation (within the data range) or extrapolation (outside the data range). Interpolation is generally more reliable because you're working in a region with information. Extrapolation is riskier—uncertainty grows rapidly beyond the data range, and high-degree polynomials can behave wildly. Always be cautious with extrapolation, especially with polynomial fits.
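
The classification itself is a simple range check, roughly (illustrative Python):

    def query_kind(x_query, xs):
        # Within [min(x), max(x)] counts as interpolation; outside is extrapolation
        return "interpolation" if min(xs) <= x_query <= max(xs) else "extrapolation"

    print(query_kind(3.5, [1, 2, 3, 4, 5]))  # interpolation
    print(query_kind(6.0, [1, 2, 3, 4, 5]))  # extrapolation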

Step 5: Generate Curve (Optional)

Optionally, generate a curve visualization showing the fitted model. Set the curve range (min X, max X) and number of points. The curve helps you see how the fitted model relates to data points, how it behaves within and beyond the data range, and whether the model makes sense. Use the curve to identify potential issues like overfitting (excessive wiggling) or poor extrapolation behavior (wild curves outside data range).
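
Generating the curve amounts to evaluating the fitted model on an evenly spaced grid, roughly like this (illustrative Python sketch):

    import numpy as np

    def curve_points(coeffs_low_first, x_min, x_max, n=100):
        # coeffs_low_first: [c0, c1, ..., cd]; np.polyval wants highest degree first
        xs = np.linspace(x_min, x_max, n)
        ys = np.polyval(list(coeffs_low_first)[::-1], xs)
        return xs, ys

    xs, ys = curve_points([0.3, 1.7], 0.0, 7.0)  # the line from the worked example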

Step 6: Calculate and Review Results

Click "Calculate" or submit the form to fit the model and estimate the query value. The tool displays: (1) Fitted coefficients—the model parameters, (2) Model equation—the mathematical formula, (3) R²—goodness of fit measure, (4) Predicted value at query point—the estimated y-value, (5) Interpolation/Extrapolation status—whether the query is within or beyond data range, (6) Residuals—differences between actual and fitted values. Review the interpretation summary and examine residuals to assess model quality.

Formulas and Behind-the-Scenes Logic

Linear Least Squares Fit

Linear fit minimizes the sum of squared residuals:

Model: ŷ = c₀ + c₁x

Minimize: Σ(yᵢ - ŷᵢ)² = Σ(yᵢ - c₀ - c₁xᵢ)²

Solution: c₁ = Sxy / Sxx, c₀ = mean(y) - c₁ × mean(x)

Where: Sxy = Σ(xᵢ - mean(x))(yᵢ - mean(y)), Sxx = Σ(xᵢ - mean(x))²

Linear fit uses least squares regression to find the best straight line through data points. The solution minimizes the sum of squared residuals, giving closed-form formulas for the slope c₁ and intercept c₀ based on means and deviations. Sxy measures the joint variation of x and y (it is proportional to their covariance), and Sxx measures the spread of x (proportional to its variance). If Sxx is too small (all x-values nearly equal), the fit fails because you can't determine a slope. The linear fit is simple, robust, and works well for data with roughly linear trends.

Polynomial Least Squares Fit

Polynomial fit minimizes the sum of squared residuals:

Model: ŷ = c₀ + c₁x + c₂x² + ... + c_dxᵈ

Minimize: Σ(yᵢ - ŷᵢ)²

Normal equations: (X^T X) c = X^T y

Design matrix X: X[i][j] = xᵢ^j

Solution: Gaussian elimination with partial pivoting

Polynomial fit uses least squares regression by solving normal equations. The design matrix X is a Vandermonde matrix, with X[i][j] = xᵢ^j. The normal equations (X^T X) c = X^T y are solved using Gaussian elimination with partial pivoting for numerical stability. If the system is singular (determinant ≈ 0), the fit fails—this can happen when the degree is too high for the data or when x-values are poorly distributed. The tool limits maximum degree to 6 to prevent numerical instability. Higher degrees can fit more complex curves but risk overfitting and numerical issues.
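
A compact Python sketch of Gaussian elimination with partial pivoting as described above (for illustration only; production code would use a vetted routine such as np.linalg.solve):

    import numpy as np

    def solve_pivoting(A, b):
        # Solve A c = b by Gaussian elimination with partial pivoting
        A = np.asarray(A, dtype=float).copy()
        b = np.asarray(b, dtype=float).copy()
        n = len(b)
        for k in range(n):
            p = k + int(np.argmax(np.abs(A[k:, k])))  # row with the largest pivot
            if abs(A[p, k]) < 1e-12:
                raise ValueError("system is singular or nearly singular")
            A[[k, p]] = A[[p, k]]                     # swap rows k and p
            b[[k, p]] = b[[p, k]]
            for i in range(k + 1, n):
                m = A[i, k] / A[k, k]
                A[i, k:] -= m * A[k, k:]
                b[i] -= m * b[k]
        c = np.zeros(n)
        for i in range(n - 1, -1, -1):                # back-substitution
            c[i] = (b[i] - A[i, i + 1:] @ c[i + 1:]) / A[i, i]
        return c

    # Normal equations for the worked linear example below: X^T X and X^T y
    print(solve_pivoting([[5, 15], [15, 55]], [27, 98]))  # [0.3, 1.7]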

R² Calculation

R² measures the proportion of variance explained:

Formula: R² = 1 - (SS_residual / SS_total)

SS_residual: Σ(yᵢ - ŷᵢ)² (sum of squared residuals)

SS_total: Σ(yᵢ - mean(y))² (total sum of squares)

Range: R² ≤ 1. It usually falls in [0, 1], but can be negative if the model fits worse than a horizontal line at the mean.

R² is calculated by comparing the sum of squared residuals (SS_residual) to the total sum of squares (SS_total). SS_residual measures how much variance remains after fitting the model, while SS_total measures the total variance in the data. R² = 1 means perfect fit (SS_residual = 0), R² = 0 means the model is no better than the mean (SS_residual = SS_total), and R² > 0 means the model explains some variance. If all y-values are identical, SS_total = 0 and R² is undefined (returned as null). R² is useful for comparing models but should not be the only criterion—always examine residuals and the fitted curve.

Residual Calculation

Residuals show the difference between actual and fitted values:

Residual: residual = y - ŷ

Sum of squared residuals: SS_residual = Σ(yᵢ - ŷᵢ)²

Interpretation: Small, random residuals = good fit; patterns = model issues

Residuals are computed for each data point as the difference between the observed y-value and the fitted ŷ-value. The sum of squared residuals (SS_residual) is minimized by least squares regression. Small, random residuals indicate a good fit—the model captures the main trend with remaining variation being random noise. Patterns in residuals (e.g., consistently positive for low x, negative for high x) indicate that the model is missing important structure—you may need a different model or higher degree. Always examine residuals to assess model quality, not just R².

Worked Example: Linear Fit Through Data Points

Let's fit a linear model through data points:

Given: Points: (1, 2), (2, 4), (3, 5), (4, 7), (5, 9)

Step 1: Calculate Means

mean(x) = (1 + 2 + 3 + 4 + 5) / 5 = 3.0

mean(y) = (2 + 4 + 5 + 7 + 9) / 5 = 5.4

Step 2: Calculate Sxx and Sxy

Sxx = (1-3)² + (2-3)² + (3-3)² + (4-3)² + (5-3)² = 4 + 1 + 0 + 1 + 4 = 10

Sxy = (1-3)(2-5.4) + (2-3)(4-5.4) + (3-3)(5-5.4) + (4-3)(7-5.4) + (5-3)(9-5.4)

= (-2)(-3.4) + (-1)(-1.4) + (0)(-0.4) + (1)(1.6) + (2)(3.6) = 6.8 + 1.4 + 0 + 1.6 + 7.2 = 17.0

Step 3: Calculate Coefficients

c₁ = Sxy / Sxx = 17.0 / 10 = 1.7 (slope)

c₀ = mean(y) - c₁ × mean(x) = 5.4 - 1.7 × 3.0 = 5.4 - 5.1 = 0.3 (intercept)

Step 4: Model Equation

ŷ = 0.3 + 1.7x

Step 5: Calculate R²

SS_total = (2-5.4)² + (4-5.4)² + (5-5.4)² + (7-5.4)² + (9-5.4)² = 11.56 + 1.96 + 0.16 + 2.56 + 12.96 = 29.2

SS_residual = Σ(yᵢ - ŷᵢ)² = 0² + 0.3² + (-0.4)² + (-0.1)² + 0.2² = 0.30 (small residuals)

R² = 1 - (0.30 / 29.2) ≈ 0.990 (very good fit)

Interpretation:

The linear fit ŷ = 0.3 + 1.7x has slope 1.7 and intercept 0.3. R² ≈ 0.990 indicates the model explains 99% of the variance, suggesting a strong linear relationship. The residuals are small and random, confirming a good fit. For x = 3.5 (interpolation), ŷ = 0.3 + 1.7 × 3.5 = 6.25. For x = 6 (extrapolation), ŷ = 0.3 + 1.7 × 6 = 10.5, but extrapolation is less reliable.

This example demonstrates how linear least squares regression finds the best straight line through data points. The solution uses simple formulas based on means and variances, making linear fit computationally efficient. The high R² (0.990) indicates a strong linear relationship, and small residuals confirm a good fit. The example shows both interpolation (x = 3.5, within data range) and extrapolation (x = 6, beyond data range), highlighting that extrapolation is less reliable.
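
The whole example can be reproduced in a few lines of Python (a quick verification sketch using NumPy's polyfit):

    import numpy as np

    x = np.array([1, 2, 3, 4, 5], dtype=float)
    y = np.array([2, 4, 5, 7, 9], dtype=float)
    c1, c0 = np.polyfit(x, y, 1)               # polyfit returns highest degree first
    y_hat = c0 + c1 * x
    ss_res = np.sum((y - y_hat) ** 2)          # 0.30
    ss_tot = np.sum((y - y.mean()) ** 2)       # 29.2
    print(c0, c1, 1 - ss_res / ss_tot)         # ~0.3, ~1.7, ~0.9897
    print(c0 + c1 * 3.5, c0 + c1 * 6.0)        # 6.25 (interpolation), 10.5 (extrapolation)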

Practical Use Cases

Student Homework: Fitting a Curve Through Data Points

A student needs to fit a curve through 5 data points and estimate the value at x = 3.5. Using linear fit, the tool calculates ŷ = 0.3 + 1.7x with R² = 0.990, predicting ŷ = 6.25 at x = 3.5 (interpolation). The student learns that linear fit works well for roughly linear data, R² measures goodness of fit, and interpolation is more reliable than extrapolation. They can compare with polynomial fit to see how different models behave.

Data Analysis: Estimating Missing Values

A data analyst has temperature measurements at hours 0, 2, 4, 6, 8 and needs to estimate temperature at hour 3. Using linear fit through the 5 points, the tool calculates a linear model and predicts temperature ≈ 22.5°C at hour 3 (interpolation). The analyst learns that interpolation estimates values within the data range, which is generally more reliable than extrapolation. They can verify the estimate makes sense by checking if it falls between the values at hours 2 and 4.

Engineering: Modeling Production Relationships

An engineer analyzes the relationship between production volume and cost using 8 data points. Using polynomial fit (degree 2), the tool calculates a quadratic model with R² = 0.945, showing that cost increases non-linearly with volume. The engineer learns that polynomial fit captures curvature better than linear fit, but must be cautious with extrapolation beyond observed production volumes. They can use the model to estimate costs at intermediate volumes (interpolation) but should be careful predicting costs at much higher volumes (extrapolation).

Common Person: Understanding Data Trends

A person has monthly sales data for 6 months and wants to estimate sales for month 3.5 (mid-month). Using linear fit through the 6 points, the tool calculates a linear model and predicts sales ≈ $4,200 at month 3.5 (interpolation). The person learns that interpolation estimates values between known data points, which is generally more reliable than extrapolation. They can also try extrapolation to month 7, but should be cautious as uncertainty grows beyond the data range.

Business Professional: Forecasting Trends (With Caution)

A business analyst has quarterly revenue data for 4 quarters and wants to estimate revenue for the next quarter. Using linear fit, the tool calculates a linear model and predicts revenue ≈ $125,000 for quarter 5 (extrapolation). The analyst learns that extrapolation is riskier than interpolation—uncertainty grows rapidly beyond the data range. They should use this as a rough estimate and consider other factors. Note: This is for educational purposes only—real forecasting requires proper time series methods.

Researcher: Comparing Linear vs. Polynomial Fits

A researcher compares linear and polynomial fits for experimental data with 7 points. Linear fit gives R² = 0.875, while polynomial fit (degree 2) gives R² = 0.945. The researcher learns that polynomial fit captures curvature better, but must check residuals to ensure the model makes sense. They examine residuals and find that polynomial fit has smaller, more random residuals, confirming it's a better model. However, they're cautious about extrapolation with polynomial fit, as high-degree polynomials can behave wildly outside the data range.

Understanding Overfitting with High-Degree Polynomials

A user fits polynomials of different degrees to 5 data points: linear (R² = 0.875), quadratic (R² = 0.945), cubic (R² = 0.998), quartic (R² = 1.000). The user learns that higher degrees give higher R² but may overfit—the quartic passes through every point (R² = 1.000) but wiggles excessively and may perform poorly at new x-values. The user understands that R² alone doesn't guarantee good predictions—they must examine residuals and the fitted curve to assess model quality. This demonstrates the importance of model selection and avoiding overfitting.

Common Mistakes to Avoid

Trusting Extrapolation Too Much

Don't trust extrapolation too much—uncertainty grows rapidly beyond the data range, and models may not behave the same way outside observed regions. High-degree polynomials can behave unpredictably outside the data range (an effect related to Runge's phenomenon). Always be cautious with extrapolation, especially with polynomial fits. Use linear or low-degree polynomials for extrapolation, and always consider domain knowledge—does the prediction make sense? Never use extrapolation for financial, medical, or safety-critical predictions without proper validation and domain expertise.

Overfitting with High-Degree Polynomials

Don't use too high a polynomial degree for the amount of data—this causes overfitting. The curve fits your points perfectly but generalizes poorly to new data. Signs of overfitting: very high R² but poor predictions at new x-values, excessive wiggling between data points, degree close to number of points. A good rule: use the simplest model that captures the main trend. Start with linear fit, and increase degree only if the data clearly shows curvature. For extrapolation, always use low degrees (1-2) as high degrees are unstable outside the data range.

Relying Solely on R² for Model Selection

Don't rely solely on R² for model selection—high R² doesn't guarantee good predictions! A high-degree polynomial can achieve R² ≈ 1 by passing through every point but still give terrible predictions at new x-values. Always examine residuals and the fitted curve to assess model quality. Look for patterns in residuals—if they show a pattern (e.g., consistently positive for low x, negative for high x), the model may be missing important structure. Use R² to compare models, but don't make it the only criterion—consider simplicity, interpretability, and prediction quality.

Ignoring Residual Patterns

Don't ignore patterns in residuals—they reveal model problems. Small, random residuals suggest a good fit. If residuals show a pattern (e.g., consistently positive for low x, negative for high x, or a U-shaped pattern), the model may be missing important structure in your data. You may need a different model or higher degree. Always examine residuals to assess model quality, not just R². Patterns in residuals indicate that the model is not capturing all the information in the data.

Using Too Few Data Points

Don't use too few data points—with only 3 points, a quadratic will fit exactly, but you have no idea if the true relationship is quadratic. More data points improve reliability and allow you to assess model quality through residuals. A good rule: a polynomial fit of degree d needs at least (d + 1) points, but prefer many more to avoid overfitting. With few data points, use low-degree models (linear or quadratic) to avoid overfitting. Always consider whether you have enough data to support your chosen model complexity.

Confusing Correlation with Prediction

Don't confuse a good fit to past data with good predictions—a model that fits historical data well doesn't guarantee future predictions will be accurate. This is especially true for extrapolation, where uncertainty grows rapidly. Always consider the purpose of your fit: are you interpolating (estimating within data range) or extrapolating (predicting beyond data range)? Interpolation is generally more reliable. For extrapolation, use simple models (linear or low-degree polynomials) and always be cautious. Real forecasting requires understanding trends, seasonality, and uncertainty quantification.

Using This Tool for Real Forecasting

Never use this tool for real financial, medical, or safety-critical predictions. This is an educational tool demonstrating basic curve fitting concepts. Real forecasting requires proper statistical methods, uncertainty quantification, cross-validation, domain expertise, and assessment of model assumptions. For time series forecasting, use dedicated methods like ARIMA, exponential smoothing, or machine learning approaches. For serious work, use proper statistical software (R, Python/SciPy, MATLAB) with appropriate validation. Always consult domain experts for important decisions.

Advanced Tips & Strategies

Start with the Simplest Model

Always start with the simplest model (linear fit) and increase complexity only if the data clearly shows curvature. Use linear fit when data shows a roughly linear trend—it's simple, robust, and extrapolates predictably. Only use polynomial fit when you see clear curvature, and choose the lowest degree that captures the main trend. Higher degrees risk overfitting, especially with few data points. A good rule: use the simplest model that captures the main trend in your data. Simpler models are more interpretable, more robust, and less prone to overfitting.

Examine Residuals to Assess Model Quality

Always examine residuals to assess model quality, not just R². Small, random residuals suggest a good fit—the model captures the main trend with remaining variation being random noise. Patterns in residuals (e.g., consistently positive for low x, negative for high x, or U-shaped) indicate that the model is missing important structure—you may need a different model or higher degree. Use residuals to guide model selection and identify when models need improvement. Don't rely solely on R²—examine residuals and the fitted curve to ensure the model makes sense.

Use Low Degrees for Extrapolation

For extrapolation, always use linear or low-degree polynomials (degree 1-2)—high degrees are unstable outside the data range. High-degree polynomials can behave unpredictably outside the data range (an effect related to Runge's phenomenon), making extrapolation unreliable. Linear fit extrapolates predictably (the straight line continues), while low-degree polynomials (quadratic) are more stable than high degrees. Always be cautious with extrapolation—uncertainty grows rapidly beyond the data range, and models may not behave the same way outside observed regions. Consider domain knowledge: does the prediction make sense? The sketch below illustrates the point.
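
A small demonstration of this instability (illustrative Python; the data is arbitrary, so exact numbers will vary):

    import numpy as np

    x = np.arange(7, dtype=float)                       # 0..6
    y = np.array([0.0, 0.9, 1.9, 3.1, 3.9, 5.2, 5.9])   # roughly linear data

    for degree in (1, 2, 5):
        coeffs = np.polyfit(x, y, degree)
        print(degree, round(np.polyval(coeffs, 8.0), 2))  # evaluate beyond the range
    # Degree 1 continues the trend (about 8); degree 5 typically drifts off it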

Compare Models Using Multiple Criteria

Compare models using multiple criteria: R² (goodness of fit), residuals (model quality), simplicity (parsimony), and prediction quality. Don't rely solely on R²—high R² doesn't guarantee good predictions. Examine residuals for patterns, check if the fitted curve makes sense, and consider model simplicity. A simpler model with slightly lower R² may be better than a complex model with higher R² if it's more interpretable and robust. Use R² to compare models, but make the final decision based on residuals, curve behavior, and your specific needs.

Visualize the Fitted Curve

Always visualize the fitted curve to see how it relates to data points and how it behaves within and beyond the data range. The curve helps you identify potential issues like overfitting (excessive wiggling between points) or poor extrapolation behavior (wild curves outside data range). Use the curve to verify that the model makes sense and to understand how predictions behave. The visualization makes abstract concepts concrete and helps you assess model quality visually. Always check that the curve passes reasonably close to data points and behaves sensibly.
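
A minimal matplotlib sketch of this kind of visualization (assuming matplotlib is installed; the styling choices are arbitrary):

    import numpy as np
    import matplotlib.pyplot as plt

    x = np.array([1, 2, 3, 4, 5], dtype=float)
    y = np.array([2, 4, 5, 7, 9], dtype=float)
    coeffs = np.polyfit(x, y, 1)

    grid = np.linspace(0, 7, 200)                  # extend past the data range
    plt.scatter(x, y, label="data")
    plt.plot(grid, np.polyval(coeffs, grid), label="fitted line")
    plt.axvspan(x.min(), x.max(), alpha=0.1, label="interpolation region")
    plt.legend()
    plt.show()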

Understand Interpolation vs. Extrapolation

Always understand whether your query is interpolation (within data range) or extrapolation (beyond data range). Interpolation is generally more reliable because you're working in a region with information. Extrapolation is riskier—uncertainty grows rapidly beyond the data range, and models may not behave the same way outside observed regions. The tool automatically determines this based on whether the query x-value is within [min(x), max(x)]. Use interpolation when possible, and be extra cautious with extrapolation, especially with polynomial fits.

Remember This Is Educational Only

Always remember that this tool is strictly for educational purposes. Real-world curve fitting requires: (1) proper statistical methods, (2) uncertainty quantification (confidence intervals, prediction bands), (3) cross-validation, (4) assessment of model assumptions, (5) domain expertise, and (6) proper validation. Never use this tool for financial, medical, or safety-critical predictions. For serious work, use proper statistical software (R, Python/SciPy, MATLAB) with appropriate validation and domain expertise. Always consult domain experts for important decisions.

Limitations & Assumptions

• Extrapolation Risk: Extrapolation (predicting beyond the data range) is inherently risky. Models may not behave the same way outside observed regions, and uncertainty grows rapidly. High-degree polynomials can behave unpredictably outside the data range (an effect related to Runge's phenomenon).

• Overfitting Potential: High-degree polynomials can fit training data perfectly (R² ≈ 1) but generalize poorly to new data. With few data points, high-degree fits capture noise rather than true patterns. Always prefer the simplest model that captures the main trend.

• Least Squares Assumptions: The fitting method assumes errors are normally distributed, homoscedastic (constant variance), and independent. Violations of these assumptions may affect the reliability of fitted parameters and predictions.

• No Uncertainty Quantification: This tool does not compute confidence intervals or prediction bands, which are essential for real-world applications. Without uncertainty estimates, predictions should be interpreted cautiously.

Important Note: This calculator is strictly for educational and informational purposes only. It demonstrates basic curve fitting concepts for learning and homework verification. For production forecasting, financial analysis, scientific modeling, or any consequential predictions, use professional statistical software such as R, Python (statsmodels, scikit-learn), MATLAB, or SAS with proper cross-validation and uncertainty quantification. Always consult with qualified statisticians or data scientists for real-world prediction tasks.

Important Limitations and Disclaimers

  • This calculator is an educational tool designed to help you understand interpolation, extrapolation, and curve fitting concepts. While it provides accurate calculations, you should use it to learn the concepts and check your manual calculations, not as a substitute for understanding the material. Always verify important results independently.
  • This tool is NOT designed for professional forecasting, financial analysis, medical diagnosis, or safety-critical applications. Real-world curve fitting requires proper statistical methods, uncertainty quantification, cross-validation, assessment of model assumptions, and domain expertise. Do NOT use this tool for financial, medical, or safety-critical predictions. For serious work, use proper statistical software (R, Python/SciPy, MATLAB) with appropriate validation.
  • The tool limits maximum polynomial degree to 6 for numerical stability and overfitting prevention. Maximum 20 data points are supported for educational purposes. The tool does not compute confidence intervals or prediction bands, which are essential for real-world applications. Least-squares fitting assumes errors are normally distributed and independent, which may not hold for all data.
  • Extrapolation is inherently riskier than interpolation—uncertainty grows rapidly beyond the data range, and models may not behave the same way outside observed regions. High-degree polynomials can behave unpredictably outside the data range (an effect related to Runge's phenomenon). Always be cautious with extrapolation, especially with polynomial fits. Use linear or low-degree polynomials for extrapolation, and always consider domain knowledge.
  • This tool is for informational and educational purposes only. It should NOT be used for critical decision-making, financial planning, medical diagnosis, legal advice, or any professional/legal purposes without independent verification. Consult with appropriate professionals (statisticians, domain experts, financial advisors, medical professionals) for important decisions.
  • Results calculated by this tool are fitted model predictions based on your specified data points and fitting method. Actual values in real-world scenarios may differ due to additional factors, model limitations, violations of assumptions, or errors in data not captured in this simple demonstration tool. Use predictions as guides for understanding curve fitting, not guarantees of specific outcomes.

Frequently Asked Questions

Common questions about interpolation and extrapolation, linear and polynomial curve fitting, least squares regression, R², residuals, overfitting, and how to use this tool for homework and data analysis practice.

What is the difference between interpolation and extrapolation?

Interpolation estimates values WITHIN the range of your known data points — it's generally more reliable because you're working in a region with information. Extrapolation estimates values BEYOND your data range, which is inherently riskier because the model may not behave the same way outside observed regions. Always be cautious with extrapolation.

When should I use a linear fit vs a polynomial fit?

Use a linear fit when your data shows a roughly straight-line trend. It's simple, robust, and extrapolates predictably. Use a polynomial fit when you see clear curvature in your data — quadratic (degree 2) for one bend, cubic (degree 3) for S-shapes. Avoid high degrees unless you have many data points and clear evidence of complex patterns.

Why can high-degree polynomials behave strangely outside the data range?

High-degree polynomials are flexible and can fit many patterns within your data. However, outside the data range, they often curve sharply up or down in ways that don't reflect real behavior. This is because polynomial terms like x⁵ or x⁶ grow or shrink very rapidly. This phenomenon is related to 'Runge's phenomenon' and is a key reason to use low-degree fits for extrapolation.

What does R² tell me, and what are its limitations?

R² (coefficient of determination) measures how much variance in your data the model explains. R² = 1 means perfect fit; R² = 0 means no better than the mean. However, high R² doesn't guarantee good predictions! A high-degree polynomial can achieve R² ≈ 1 by passing through every point but still give terrible predictions at new x-values. Always look at residuals and consider the purpose of your fit.

Why does the tool limit the maximum polynomial degree?

The tool limits polynomial degree to 6 for two reasons: (1) Numerical stability — solving for coefficients of very high-degree polynomials can produce large errors due to floating-point arithmetic. (2) Overfitting prevention — with limited data, high-degree polynomials will fit noise rather than the true pattern. For educational purposes, degrees 1–6 cover most common curve shapes.

What are residuals and why do they matter?

Residuals are the differences between your actual y-values and the fitted ŷ-values (residual = y - ŷ). They show how well the model fits each point. Small, random residuals suggest a good fit. If residuals show a pattern (e.g., consistently positive for low x, negative for high x), the model may be missing important structure in your data.

Why must all x-values be distinct?

Having two data points with the same x-value but different y-values creates ambiguity — which y should the model predict at that x? Mathematically, it makes the curve fitting problem ill-defined. If you have repeated measurements at the same x, consider averaging them or using a different approach like measurement error models.

Can I use this for time series forecasting?

This tool fits polynomial curves to data and can evaluate at future x-values, but it's NOT a proper time series forecasting tool. Real forecasting requires understanding trends, seasonality, autocorrelation, and uncertainty quantification. Polynomial extrapolation into the future is particularly dangerous — use dedicated forecasting methods like ARIMA, exponential smoothing, or machine learning approaches.

How do I know if my model is overfitting?

Signs of overfitting include: (1) Very high R² but poor predictions at new x-values. (2) The curve wiggles excessively between data points. (3) Polynomial degree is close to the number of data points. (4) Small changes in data cause large changes in the fitted curve. A good rule: use the simplest model that captures the main trend in your data.

What's the formula for least-squares fitting?

Least-squares fitting minimizes the sum of squared residuals: Σ(yᵢ - ŷᵢ)². For linear fit (y = c₀ + c₁x), the solution uses means and variances of x and y. For polynomial fit, we solve a system of normal equations derived from minimizing the sum of squared errors. The tool uses Gaussian elimination with partial pivoting to solve these equations numerically.
