Non-linear interpolation generalizes Linear interpolation by fitting a smooth curve through several surrounding samples — a polynomial, a spline, a cubic — and reading the missing value off the curve. It’s more accurate than linear interpolation when the underlying signal has natural curvature, as ECG and EEG signals often do.
A linear interpolation through two neighbors of a curved signal cuts off the curve between them; a cubic spline fitted through four or six neighbors follows the curve much more closely. For wide gaps in smooth data, this difference can be substantial.
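To make the difference concrete, here is a minimal sketch that removes a few samples around the peak of a sine wave and fills them both ways. The synthetic signal, the gap position, and the variable names are assumptions chosen for illustration, and `method="cubic"` needs SciPy installed.

```python
import numpy as np
import pandas as pd

# Synthetic smooth signal: one sine period sampled at 41 evenly spaced points.
t = np.linspace(0.0, 2.0 * np.pi, 41)
truth = pd.Series(np.sin(t), index=t)

# Remove five samples around the first peak (positions 8..12).
gappy = truth.copy()
gappy.iloc[8:13] = np.nan

# method="linear" ignores the index and assumes equal spacing, which holds here.
linear = gappy.interpolate(method="linear")
# method="cubic" fits a cubic spline through the remaining samples (requires SciPy).
cubic = gappy.interpolate(method="cubic")

gap = slice(8, 13)
linear_err = (linear.iloc[gap] - truth.iloc[gap]).abs().max()
cubic_err = (cubic.iloc[gap] - truth.iloc[gap]).abs().max()
print(f"max error in gap: linear={linear_err:.4f}, cubic={cubic_err:.4f}")
```

The two neighbors of the gap sit at the same height on either side of the peak, so the linear fill is a flat line that misses the peak entirely, while the spline bends over it.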
The methods commonly available in pandas interpolate (every method listed here except linear requires SciPy):
df.interpolate(method='linear') # default — straight line between neighbors
df.interpolate(method='quadratic') # quadratic spline: piecewise parabolas through the known points
df.interpolate(method='cubic') # cubic spline: piecewise cubics through the known points
df.interpolate(method='spline', order=3) # spline of specified order
df.interpolate(method='polynomial', order=2) # polynomial of specified order
df.interpolate(method='akima') # Akima spline, robust to outliers
Higher-order methods fit more flexible curves through more surrounding samples, which tracks curvature better but risks overshoot (the curve passing through a maximum or minimum that wasn’t in the true signal) and amplifies noise in the surrounding samples.
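Overshoot is easiest to see on a step-like signal. The sketch below is an illustration under assumed data: a made-up series that jumps from 0 to 1 and then has a two-sample gap. It fills the gap with three of the methods above; the cubic and Akima calls need SciPy.

```python
import numpy as np
import pandas as pd

# Step-like signal: 0 up to x=3, 1 from x=4 onward, with x=5 and x=6 missing.
x = np.arange(11, dtype=float)
y = np.array([0, 0, 0, 0, 1, np.nan, np.nan, 1, 1, 1, 1], dtype=float)
s = pd.Series(y, index=x)

filled = {
    "linear": s.interpolate(method="linear"),
    "cubic": s.interpolate(method="cubic"),   # requires SciPy
    "akima": s.interpolate(method="akima"),   # requires SciPy
}

# Show only the two filled positions for each method.
for name, result in filled.items():
    print(name, result.loc[[5.0, 6.0]].round(3).tolist())
```

No known sample exceeds 1, so any filled value above 1 is overshoot. The cubic spline carries the steep slope of the jump into the gap and lands noticeably above 1, whereas the linear and Akima fills stay at 1 for this data.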
For practical engineering data:
linear is a fine default for most signals.
cubic and spline are better for visibly curved signals with wide gaps.
akima is robust when the surrounding samples might be noisy or contain outliers.
A common pitfall: applying high-order interpolation to noisy data produces a curve that follows and exaggerates the noise rather than smoothing it. If the signal is noisy, smooth it with a Moving-average filter first, then interpolate the gaps. Doing both at once is asking for trouble.
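The smooth-then-interpolate order can be sketched as follows. This is one possible arrangement, not a prescribed recipe: the noise level, the 9-sample window, the gap position, and the fixed random seed are all assumptions picked for the demonstration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)            # fixed seed, purely illustrative
t = np.linspace(0.0, 2.0 * np.pi, 201)
clean = pd.Series(np.sin(t), index=t)
noisy = clean + rng.normal(0.0, 0.1, size=t.size)
noisy.iloc[90:111] = np.nan               # 21-sample gap

# Naive: cubic interpolation straight through the noisy neighbors.
direct = noisy.interpolate(method="cubic", limit_area="inside")

# Two-step: centered 9-sample moving average first. With the default
# min_periods, any window touching a NaN yields NaN, so the gap widens
# by 4 samples per side instead of being filled with one-sided averages.
smoothed = noisy.rolling(window=9, center=True).mean()
# limit_area="inside" fills only NaNs that have valid values on both sides,
# so the incomplete windows at the series ends stay NaN.
two_step = smoothed.interpolate(method="cubic", limit_area="inside")

gap = slice(90, 111)
err_direct = (direct.iloc[gap] - clean.iloc[gap]).abs().max()
err_two_step = (two_step.iloc[gap] - clean.iloc[gap]).abs().max()
print(f"max gap error: direct={err_direct:.3f}, two-step={err_two_step:.3f}")
```

Comparing the two printed errors shows how much the noise in the neighboring samples leaks into the filled gap. The exact numbers depend on the random draw, so treat them as indicative rather than as a benchmark.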
For the basic version with straight lines, see Linear interpolation. For the overall imputation framework, see Imputation and Missing data.