Smoothing a 2-D figure - python

I have a number of vaguely rectangular 2D figures that need to be smoothed. A simplified example:
import matplotlib.pyplot as plt

fig, ax1 = plt.subplots(1, 1, figsize=(3, 3))
xs1 = [-0.25, -0.625, -0.125, -1.25, -1.125, -1.25, 0.875, 1.0, 1.0, 0.5, 1.0, 0.625, -0.25]
ys1 = [1.25, 1.375, 1.5, 1.625, 1.75, 1.875, 1.875, 1.75, 1.625, 1.5, 1.375, 1.25, 1.25]
ax1.plot(xs1, ys1)
ax1.set_ylim(0.5, 2.5)
ax1.set_xlim(-2, 2)
I have tried scipy.interpolate.RectBivariateSpline, but that wants data on a full rectangular grid (e.g. for a heat map), and scipy.interpolate.interp1d, but that, reasonably enough, produces a smoothed 1-D function.
What is an appropriate method to smooth this?
Edit to revise/explain my goal a little better: I don't need the lines to go through all the points; in fact, I'd prefer that they not, because there are clear outlier points that "should" be averaged with their neighbors, or handled by some similar approach. I've included a crude manual sketch of the start of what I have in mind above.

Chaikin's corner-cutting algorithm might be the ideal approach for you. For a given polygon with vertices P(0), P(1), ..., P(N-1), the corner-cutting algorithm generates two new vertices for each line segment defined by P(i) and P(i+1):
Q(i) = (3/4)P(i) + (1/4)P(i+1)
R(i) = (1/4)P(i) + (3/4)P(i+1)
So your new polygon will have 2N vertices. You can then apply corner cutting to the new polygon again, and repeat until the desired resolution is reached. The result is a polygon with many vertices that looks smooth when displayed. It can be proved that the curve produced by this corner-cutting approach converges to a quadratic B-spline curve. The advantage of this approach is that the resulting curve never overshoots. The following pictures will give you a better idea of the algorithm (pictures taken from this link):
Original Polygon
Apply corner cutting once
Apply corner cutting one more time
See this link for more details on Chaikin's corner-cutting algorithm.
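For concreteness, here is a minimal NumPy sketch of the update rule above for a closed polygon (the function name and iteration count are illustrative, not from any library):
import numpy as np

def chaikin_closed(points, iterations=3):
    """Chaikin corner cutting for a closed polygon given as an (N, 2) array
    of distinct vertices (do not repeat the first point at the end)."""
    pts = np.asarray(points, dtype=float)
    for _ in range(iterations):
        nxt = np.roll(pts, -1, axis=0)   # P(i+1) for each P(i), wrapping around
        q = 0.75 * pts + 0.25 * nxt      # Q(i)
        r = 0.25 * pts + 0.75 * nxt      # R(i)
        pts = np.empty((2 * len(q), 2))
        pts[0::2], pts[1::2] = q, r      # interleave: Q(0), R(0), Q(1), R(1), ...
    return pts
Applied to the question's data, you would pass np.column_stack([xs1[:-1], ys1[:-1]]) (dropping the repeated closing point) and append the first output point to the result to close the plotted curve.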

Actually, you can use scipy.interpolate.interp1d for this. You need to treat both the x and y components of your polygon as separate series.
As a quick example with a square:
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import interp1d
x = [0, 1, 1, 0, 0]
y = [0, 0, 1, 1, 0]
t = np.arange(len(x))
ti = np.linspace(0, t.max(), 10 * t.size)
xi = interp1d(t, x, kind='cubic')(ti)
yi = interp1d(t, y, kind='cubic')(ti)
fig, ax = plt.subplots()
ax.plot(xi, yi)
ax.plot(x, y)
ax.margins(0.05)
plt.show()
However, as you can see, this results in some issues at (0, 0).
This is happening because a spline segment depends on more than just two points. The first and last points aren't "connected" in the way we interpolated. We can fix this by "padding" the x and y sequences with the second-to-last and second points so that there are "wrap-around" boundary conditions for the spline at the endpoints.
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import interp1d
x = [0, 1, 1, 0, 0]
y = [0, 0, 1, 1, 0]
# Pad the x and y series so it "wraps around".
# Note that if x and y are numpy arrays, you'll need to
# use np.r_ or np.concatenate instead of addition!
orig_len = len(x)
x = x[-3:-1] + x + x[1:3]
y = y[-3:-1] + y + y[1:3]
t = np.arange(len(x))
ti = np.linspace(2, orig_len + 1, 10 * orig_len)
xi = interp1d(t, x, kind='cubic')(ti)
yi = interp1d(t, y, kind='cubic')(ti)
fig, ax = plt.subplots()
ax.plot(xi, yi)
ax.plot(x, y)
ax.margins(0.05)
plt.show()
And just to show what it looks like with your data:
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import interp1d
x = [-0.25, -0.625, -0.125, -1.25, -1.125, -1.25,
0.875, 1.0, 1.0, 0.5, 1.0, 0.625, -0.25]
y = [1.25, 1.375, 1.5, 1.625, 1.75, 1.875, 1.875,
1.75, 1.625, 1.5, 1.375, 1.25, 1.25]
# Pad the x and y series so it "wraps around".
# Note that if x and y are numpy arrays, you'll need to
# use np.r_ or np.concatenate instead of addition!
orig_len = len(x)
x = x[-3:-1] + x + x[1:3]
y = y[-3:-1] + y + y[1:3]
t = np.arange(len(x))
ti = np.linspace(2, orig_len + 1, 10 * orig_len)
xi = interp1d(t, x, kind='cubic')(ti)
yi = interp1d(t, y, kind='cubic')(ti)
fig, ax = plt.subplots()
ax.plot(xi, yi)
ax.plot(x, y)
ax.margins(0.05)
plt.show()
Note that you get quite a bit of "overshoot" with this method. That's due to the cubic spline interpolation. @pythonstarter's suggestion is another good way to handle it, but Bézier curves will suffer from the same problem (they're basically equivalent mathematically; it's just a matter of how the control points are defined). There are a number of other ways to handle the smoothing, including methods that are specialized for smoothing a polygon (e.g. Polynomial Approximation with Exponential Kernel (PAEK)). I've never tried to implement PAEK, so I'm not sure how involved it is. If you need to do this more robustly, you might try looking into PAEK or another similar method.

This is more of a comment than an answer, but maybe you can try to define this polygon as a Bézier curve. The code is rather simple, and I'm sure you are familiar with how these curves work. In that case, your polygon would be the control polygon. But it's not all perfect: firstly, the result is not really a "smoothed version" of the polygon but a curve, and secondly, the higher the degree of the curve, the less it looks like the control polygon. What I want to say is that maybe you should try solving this problem with math tools instead of trying to smooth the polygon with programming skills alone.
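The comment above doesn't include code, so here is a minimal sketch of evaluating a Bézier curve from a control polygon in Bernstein form (the helper name is illustrative, not a library function):
import numpy as np
from math import comb  # Python 3.8+

def bezier(control_points, n=200):
    """Evaluate a degree-N Bezier curve, N = len(control_points) - 1,
    at n parameter values t in [0, 1] using Bernstein polynomials."""
    P = np.asarray(control_points, dtype=float)
    N = len(P) - 1
    t = np.linspace(0.0, 1.0, n)[:, None]
    curve = np.zeros((n, P.shape[1]))
    for i in range(N + 1):
        # Bernstein basis: C(N, i) * t^i * (1 - t)^(N - i)
        curve += comb(N, i) * t**i * (1 - t)**(N - i) * P[i]
    return curve
As the comment notes, with a degree-12 control polygon like the one in the question, the curve hugs the polygon only loosely; splitting the polygon into several lower-degree segments keeps the curve closer.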

Related

Difference between uniformly distributed variables, Python

I'm very new to Python, and I was trying to use this problem as a learning exercise, but I can't get anywhere with it.
What I want to do is to show that for two random variables that come uniformly distributed within a 200ns window, the probability of them arriving within 7ns of each other is ~5%:
X, Y ~ U[0, 200]
Z = X - Y
P(|Z| < 7) = ?
I wanted to know the most analytical way of doing this, because I thought Python might have some useful libraries to help, and because if I wanted to do a stochastic simulation I would do it in C++ ROOT which would take me far less time!
The way that I've done it is below, but it's different from what I've calculated analytically. Can anyone suggest a better/more accurate way of solving the same problem?
Thanks a lot!
from scipy.stats import uniform
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots(1, 1)
a, b = 0, 200
size = 1000000
# Generating the uniform distribution
uniform_distribution = uniform(loc=a, scale=b)
x = uniform_distribution.rvs(size=size)
y = uniform_distribution.rvs(size=size)
z = x - y
ax.hist(z)
zsmall = [i for i in z if abs(i) < 7]
n = len(zsmall)
print("probability = ", n/size)
Edit: added some code to improve the figure.
Your code is fine, and the results do agree with the analytically derived value. To see this more readily, I have modified your code slightly, scaling the domains of X and Y down to [0, 1] and computing P(|Z| < 7/200), so that this is still equivalent to your original question.
from scipy.stats import uniform
import matplotlib.pyplot as plt
a, b = 0, 1
size = 1000000
# generate uniformly distributed x and y
uniform_distribution = uniform(loc=a, scale=b)
x = uniform_distribution.rvs(size=size)
y = uniform_distribution.rvs(size=size)
z = x - y
# set up figure
fig, ax = plt.subplots(figsize = [16, 8])
ax.set_aspect('equal')
ax.set_xlim([-1, 1])
ax.set_ylim([0, 1])
ax.set_xticks([-1, 0, 1])
ax.set_xticklabels([-1, 0, 1], size=20)
ax.set_yticks([0, 1])
ax.set_yticklabels([0, 1], size=20)
# plot histogram with y-axis scaled to show density,
# increased bin number for better resolution
ax.hist(z, density=True, bins=200, alpha=0.5)
# plot lines around the area we want to estimate
plt.axvline(-7/200, color='black', linestyle='--')
ax.annotate('x = -7/200', xy=(-7/200, 0.4), xytext=(-0.05, 0.4), fontsize=16, ha='right')
plt.axvline( 7/200, color='black', linestyle='--')
ax.annotate('x = 7/200', xy=(7/200, 0.2), xytext=(0.05, 0.2), fontsize=16)
# plot theoretical probability density function
ax.plot([-1, 0], [0, 1], color='gray', linestyle=':')
ax.plot([ 0, 1], [1, 0], color='gray', linestyle=':')
zsmall = [1 for i in z if abs(i) < 7/200]
n = len(zsmall)
print("probability =", n/size)
probability = 0.06857
As you can see, this already approaches the theoretically expected triangular distribution (gray dotted lines) pretty closely. For comparison, we can calculate the theoretical probability, which is the area between the dashed lines and below the dotted lines. We can compute it as the area of the whole rectangle between the dashed lines minus the area of the square formed by the two small triangles above the dotted lines:
2*(7/200) - (7/200)**2
= 0.068775
So the theoretical value does agree with your simulation result.
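Incidentally, the counting step can be vectorized, which is faster and uses less memory than the list comprehension:
import numpy as np
# z is the array of differences from the code above
probability = np.mean(np.abs(z) < 7/200)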

python: interpolation for a closed circle

I want to interpolate a circle using some given points. I referred to scipy.interpolate and used interpolate.splprep to interpolate my circle. However, the interpolated circle looks odd and differs from a standard circle:
And my code is:
import numpy as np
import matplotlib.pyplot as plt
from scipy import interpolate
t = np.arange(0, 1.25, 0.25)
x = np.sin(2*np.pi*t)
y = np.cos(2*np.pi*t)
tck,u = interpolate.splprep([x,y], s=0)
unew = np.arange(0, 1.01, 0.01)
out = interpolate.splev(unew, tck)
plt.figure()
plt.plot(x, y, 'x', out[0], out[1])
plt.axis([-1.05, 1.05, -1.05, 1.05])
plt.title('Spline of parametrically-defined curve')
plt.show()
Of course, if I give more points, such as t = np.arange(0, 1.25, 0.1), the circle looks better. But I still can't accept this result. Is there a better interpolation method for the circle?
First Edit:
@gregory mentioned that scipy.interpolate.CubicSpline can be used to interpolate a circle. Example code:
import numpy as np
from scipy.interpolate import CubicSpline

theta = 2 * np.pi * np.linspace(0, 1, 5)
y = np.c_[np.cos(theta), np.sin(theta)]
cs = CubicSpline(theta, y, bc_type='periodic')
However, it uses theta and [cos, sin] to represent the circle. But what if we do not know the formula for the curve? What if we only have (x, y)? Can we represent the curve parametrically, as interpolate.splprep does?
The cubic spline fits a third-degree polynomial to your data (supposedly a circle), and your points are pi/2 radians apart, so you should either use more closely spaced data or interpolate in polar coordinates. In polar coordinates your data have constant radius, making interpolation unnecessary. If you add some noise or perturbation to your data, the option bc_type='periodic' works just fine with polar coordinates, as @gregory stated.
Matplotlib (plt) polar coordinates:
plt.axes(projection='polar')
The coordinates are then angles in radians and the (computed or measured) radius at the corresponding angles.
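To make that concrete, here is a sketch of the polar-coordinate route for the (x, y)-only case from the edit: recover angles around the centroid, fit a periodic spline to r(theta), and map back. It assumes the curve is star-shaped about its centroid (every ray from the centroid crosses it once):
import numpy as np
from scipy.interpolate import CubicSpline

t = np.arange(0, 1.25, 0.25)
x = np.sin(2 * np.pi * t)                      # points from the question
y = np.cos(2 * np.pi * t)

cx, cy = x[:-1].mean(), y[:-1].mean()          # centroid (skip the repeated point)
theta = np.unwrap(np.arctan2(y - cy, x - cx))  # monotone angle sequence
r = np.hypot(x - cx, y - cy)
if theta[-1] < theta[0]:                       # CubicSpline needs increasing x
    theta, r = theta[::-1], r[::-1]

cs = CubicSpline(theta, r, bc_type='periodic') # requires r[0] == r[-1]
ti = np.linspace(theta[0], theta[-1], 200)
xi = cx + cs(ti) * np.cos(ti)
yi = cy + cs(ti) * np.sin(ti)
For these exact points r is constant, so the reconstruction is an exact circle, which is the answer's point; with noisy radii, the periodic spline smooths r(theta) instead.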

How do I smooth out the edges of a closed line similar to d3's curveCardinal method implementation?

I have a few data points that I am connecting using a closed line plot, and I want the line to have smooth edges similar to how the curveCardinal methods in d3 do it. Link Here
Here's a minimal example of what I want to do:
import numpy as np
from matplotlib import pyplot as plt
x = np.array([0.5, 0.13, 0.4, 0.5, 0.6, 0.7, 0.5])
y = np.array([1.0, 0.7, 0.5, 0.2, 0.4, 0.6, 1.0])
fig, ax = plt.subplots()
ax.plot(x, y)
ax.scatter(x, y)
Now, I'd like to smooth out/interpolate the line similar to d3's curveCardinal methods. Here are a few things that I've tried.
from scipy import interpolate
tck, u = interpolate.splprep([x, y], s=0, per=True)
xi, yi = interpolate.splev(np.linspace(0, 1, 100), tck)
fig, ax = plt.subplots(1, 1)
ax.plot(xi, yi, '-b')
ax.plot(x, y, 'k')
ax.scatter(x[:2], y[:2], s=200)
ax.scatter(x, y)
The result of the above code is not bad, but I was hoping that the curve would stay closer to the line when the data points are far apart (I increased the size of two such data points above to highlight this). Essentially, have the curve stay close to the line.
Using interp1d (has the same problem as the code above):
from scipy.interpolate import interp1d
x = [0.5, 0.13, 0.4, 0.5, 0.6, 0.7, 0.5]
y = [1.0, 0.7, 0.5, 0.2, 0.4, 0.6, 1.0]
orig_len = len(x)
x = x[-3:-1] + x + x[1:3]
y = y[-3:-1] + y + y[1:3]
t = np.arange(len(x))
ti = np.linspace(2, orig_len + 1, 10 * orig_len)
kind='cubic'
xi = interp1d(t, x, kind=kind)(ti)
yi = interp1d(t, y, kind=kind)(ti)
fig, ax = plt.subplots()
ax.plot(xi, yi, 'g')
ax.plot(x, y, 'k')
ax.scatter(x, y)
I also looked at Chaikin's corner-cutting algorithm, but I don't like the result.
def chaikins_corner_cutting(coords, refinements=5):
    coords = np.array(coords)
    for _ in range(refinements):
        L = coords.repeat(2, axis=0)
        R = np.empty_like(L)
        R[0] = L[0]
        R[2::2] = L[1:-1:2]
        R[1:-1:2] = L[2::2]
        R[-1] = L[-1]
        coords = L * 0.75 + R * 0.25
    return coords
fig, ax = plt.subplots()
ax.plot(x, y, 'k', linewidth=1)
ax.plot(chaikins_corner_cutting(x, 4), chaikins_corner_cutting(y, 4))
I also, superficially, looked at Bézier curves, matplotlib's PathPatch, and fancy box implementations, but I couldn't get any satisfactory results.
Suggestions are greatly appreciated.
So, here's how I ended up doing it. I decided to introduce new points between every two existing data points. The following image shows how I am adding these new points. Red points are the data I have. Using a convex hull, I calculate the geometric center of the data points and draw a line to it from each point (shown in blue). I divide these lines in half twice and connect the resulting points (green line). The center of the green line is the new point added.
Here are the functions that accomplish this:
def midpoint(p1, p2, sf=1):
    """Calculate the midpoint, with an optional
    scaling-factor (sf)"""
    xm = ((p1[0] + p2[0]) / 2) * sf
    ym = ((p1[1] + p2[1]) / 2) * sf
    return (xm, ym)

def star_curv(old_x, old_y):
    """Interpolates every point by a star-shaped curve. It does so by adding
    "fake" data points in-between every two data points, and pushes these "fake"
    points towards the center of the graph (roughly 1/4 of the way).
    """
    try:
        # stack the coordinates into (x, y) pairs for ConvexHull
        points = np.array([old_x, old_y]).T
        hull = ConvexHull(points)
        x_mid = np.mean(hull.points[hull.vertices, 0])
        y_mid = np.mean(hull.points[hull.vertices, 1])
    except Exception:
        x_mid = 0.5
        y_mid = 0.5

    c = 1
    x, y = [], []
    for i, j in zip(old_x, old_y):
        x.append(i)
        y.append(j)
        try:
            xm_i, ym_i = midpoint((i, j),
                                  midpoint((i, j), (x_mid, y_mid)))
            xm_j, ym_j = midpoint((old_x[c], old_y[c]),
                                  midpoint((old_x[c], old_y[c]), (x_mid, y_mid)))
            xm, ym = midpoint((xm_i, ym_i), (xm_j, ym_j))
            x.append(xm)
            y.append(ym)
            c += 1
        except IndexError:
            break

    orig_len = len(x)
    x = x[-3:-1] + x + x[1:3]
    y = y[-3:-1] + y + y[1:3]
    t = np.arange(len(x))
    ti = np.linspace(2, orig_len + 1, 10 * orig_len)
    kind = 'quadratic'
    xi = interp1d(t, x, kind=kind)(ti)
    yi = interp1d(t, y, kind=kind)(ti)
    return xi, yi
Here's how it looks:
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import interp1d
from scipy.spatial import ConvexHull
x = [0.5, 0.13, 0.4, 0.5, 0.6, 0.7, 0.5]
y = [1.0, 0.7, 0.5, 0.2, 0.4, 0.6, 1.0]
xi, yi = star_curv(x, y)
fig, ax = plt.subplots()
ax.plot(xi, yi, 'g')
ax.plot(x, y, 'k', alpha=0.5)
ax.scatter(x, y, color='r')
The result is especially noticeable when the data points are more symmetric, for example the following x, y values give the results in the image below:
x = [0.5, 0.32, 0.34, 0.5, 0.66, 0.65, 0.5]
y = [0.71, 0.6, 0.41, 0.3, 0.41, 0.59, 0.71]
Comparison between the interpolation presented here and the default interp1d interpolation:
I would create another array with the vertices extended in/out or up/down by about 5%. So if a point is lower than the average of the neighbouring points, make it a bit lower still.
Then do a linear interpolation between the new points, say 10 points per edge. Finally, do a spline between the second-to-last point of each edge and the actual vertex. If you use Bézier curves, you can make the spline come in at the same angle on each side.
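A heavily hedged sketch of one way to read the first step, pushing each vertex away from the midpoint of its two neighbours by about 5% (the helper name is made up for illustration):
import numpy as np

def exaggerate(points, amount=0.05):
    """Move each vertex of a closed polygon (given as an (N, 2) array of
    distinct vertices) ~5% further from the midpoint of its neighbours."""
    p = np.asarray(points, dtype=float)
    neighbours = 0.5 * (np.roll(p, 1, axis=0) + np.roll(p, -1, axis=0))
    return p + amount * (p - neighbours)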
It's a bit of work, but of course you can use this anywhere.
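For reference, the d3 curveCardinal method named in the question is a cardinal (Catmull-Rom-style) spline: each point gets a tangent proportional to the chord between its two neighbours, scaled by a tension parameter, and consecutive points are joined with cubic Hermite segments. A minimal closed-curve sketch (the names are illustrative, not from d3 or SciPy):
import numpy as np

def cardinal_closed(points, tension=0.0, per_segment=20):
    """Closed cardinal spline through an (N, 2) array of distinct vertices.
    tension=0 gives a Catmull-Rom spline; tension=1 gives straight edges."""
    P = np.asarray(points, dtype=float)
    s = (1.0 - tension) / 2.0
    m = s * (np.roll(P, -1, axis=0) - np.roll(P, 1, axis=0))  # tangents
    t = np.linspace(0.0, 1.0, per_segment, endpoint=False)[:, None]
    # cubic Hermite basis functions
    h00 = 2*t**3 - 3*t**2 + 1
    h10 = t**3 - 2*t**2 + t
    h01 = -2*t**3 + 3*t**2
    h11 = t**3 - t**2
    n = len(P)
    segments = [h00*P[i] + h10*m[i] + h01*P[(i+1) % n] + h11*m[(i+1) % n]
                for i in range(n)]
    return np.vstack(segments)
Called with np.column_stack([x[:-1], y[:-1]]) from the question's data, raising tension toward 1 keeps the curve closer to the polygon at the cost of smoothness, which is exactly the knob the question is after.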

Get best linear function which approximate some dots in 3D

I have 4 dots which are represented with these coordinates:
X = [0.1, 0.5, 0.9, 0.18]
Y = [0.7, 0.5, 0.7, 0.3]
Z = [4.2, 3.3, 4.2, 2.5]
and I have to get the best linear function (plane) which approximate these 4 dots.
I'm aware of numpy.polyfit, but polyfit works only with x and y (2D).
What can I do?
While not completely general: if the data points can be reasonably represented as a surface relative to a coordinate plane, say z = ax + by + c, then np.linalg.lstsq can be used.
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
import numpy as np
X = np.array([0.1, 0.5, 0.9, 0.18])
Y = np.array([0.7, 0.5, 0.7, 0.3])
Z = np.array([4.2, 3.3, 4.2, 2.5])
# least squares fit
A = np.vstack([X, Y, np.ones(len(X))]).T
a, b, c = np.linalg.lstsq(A, Z, rcond=None)[0]
# plots
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
# plot data as big red crosses
ax.scatter(X, Y, Z, color='r', marker='+', linewidth=10)
# plot plane fit as grid of green dots
xs = np.linspace(min(X), max(X), 10)
ys = np.linspace(min(Y), max(Y), 10)
xv, yv = np.meshgrid(xs, ys)
zv = a*xv + b*yv + c
ax.scatter(xv, yv, zv, color = 'g')
# ax.plot_wireframe(xv, yv, zv, color = 'g') # alternative fit plane plot
plt.show()
Plotting the data first, you could select a different coordinate pair for the "independent variable" plane to avoid an ill-conditioned result if necessary; if the data points appeared to lie in a plane containing the z axis, you would use xz or yz instead.
And of course you could have degenerate points on a line, or the vertices of a regular tetrahedron.
For a better "geometric fit", the first fitted plane could be used as the base for a second least-squares fit of the data rotated into that coordinate system (if the data is "reasonably" plane-like).

What to do if I want 3D spline/smooth interpolation of random unstructured data?

I was inspired by this answer by @James to see how griddata and map_coordinates might be used. In the examples below I'm showing 2D data, but my interest is in 3D. I noticed that griddata only provides splines for 1D and 2D, and is limited to linear interpolation for 3D and higher (probably for very good reasons). However, map_coordinates seems to be fine with 3D using higher order (smoother than piece-wise linear) interpolation.
My primary question: if I have random, unstructured data (where I can not use map_coordinates) in 3D, is there some way to get smoother than piece-wise linear interpolation within the NumPy SciPy universe, or at least nearby?
My secondary question: is spline for 3D not available in griddata because it is difficult or tedious to implement, or is there a fundamental difficulty?
The images and horrible python below show my current understanding of how griddata and map_coordinates can or can't be used. Interpolation is done along the thick black line.
STRUCTURED DATA:
UNSTRUCTURED DATA:
Horrible python:
import numpy as np
import matplotlib.pyplot as plt
def g(x, y):
    return np.exp(-((x-1.0)**2 + (y-1.0)**2))

def findit(x, X):  # or could use some 1D interpolation
    fraction = (x - X[0]) / (X[-1] - X[0])
    return fraction * float(X.shape[0] - 1)
nth, nr = 12, 11
theta_min, theta_max = 0.2, 1.3
r_min, r_max = 0.7, 2.0
theta = np.linspace(theta_min, theta_max, nth)
r = np.linspace(r_min, r_max, nr)
R, TH = np.meshgrid(r, theta)
Xp, Yp = R*np.cos(TH), R*np.sin(TH)
array = g(Xp, Yp)
x, y = np.linspace(0.0, 2.0, 200), np.linspace(0.0, 2.0, 200)
X, Y = np.meshgrid(x, y)
blob = g(X, Y)
xtest = np.linspace(0.25, 1.75, 40)
ytest = np.zeros_like(xtest) + 0.75
rtest = np.sqrt(xtest**2 + ytest**2)
thetatest = np.arctan2(ytest, xtest)  # theta = arctan2(y, x), matching x = r*cos(theta), y = r*sin(theta) above
ir = findit(rtest, r)
it = findit(thetatest, theta)
plt.figure()
plt.subplot(2,1,1)
plt.scatter(100.0*Xp.flatten(), 100.0*Yp.flatten())
plt.plot(100.0*xtest, 100.0*ytest, '-k', linewidth=3)
plt.imshow(blob, origin='lower', cmap='gray')
plt.text(5, 5, "don't use jet!", color='white')
exact = g(xtest, ytest)
import scipy.ndimage.interpolation as spndint
ndint0 = spndint.map_coordinates(array, [it, ir], order=0)
ndint1 = spndint.map_coordinates(array, [it, ir], order=1)
ndint2 = spndint.map_coordinates(array, [it, ir], order=2)
import scipy.interpolate as spint
points = np.vstack((Xp.flatten(), Yp.flatten())).T # could use np.array(zip(...))
grid_x = xtest
grid_y = np.array([0.75])
g0 = spint.griddata(points, array.flatten(), (grid_x, grid_y), method='nearest')
g1 = spint.griddata(points, array.flatten(), (grid_x, grid_y), method='linear')
g2 = spint.griddata(points, array.flatten(), (grid_x, grid_y), method='cubic')
plt.subplot(4,2,5)
plt.plot(exact, 'or')
#plt.plot(ndint0)
plt.plot(ndint1)
plt.plot(ndint2)
plt.title("map_coordinates")
plt.subplot(4,2,6)
plt.plot(exact, 'or')
#plt.plot(g0)
plt.plot(g1)
plt.plot(g2)
plt.title("griddata")
plt.subplot(4,2,7)
#plt.plot(ndint0 - exact)
plt.plot(ndint1 - exact)
plt.plot(ndint2 - exact)
plt.title("error map_coordinates")
plt.subplot(4,2,8)
#plt.plot(g0 - exact)
plt.plot(g1 - exact)
plt.plot(g2 - exact)
plt.title("error griddata")
plt.show()
seed_points_rand = 2.0 * np.random.random((400, 2))
rr = np.sqrt((seed_points_rand**2).sum(axis=-1))
thth = np.arctan2(seed_points_rand[...,1], seed_points_rand[...,0])
isinside = (rr>r_min) * (rr<r_max) * (thth>theta_min) * (thth<theta_max)
points_rand = seed_points_rand[isinside]
Xprand, Yprand = points_rand.T # unpack
array_rand = g(Xprand, Yprand)
grid_x = xtest
grid_y = np.array([0.75])
plt.figure()
plt.subplot(2,1,1)
plt.scatter(100.0*Xprand.flatten(), 100.0*Yprand.flatten())
plt.plot(100.0*xtest, 100.0*ytest, '-k', linewidth=3)
plt.imshow(blob, origin='lower', cmap='gray')
plt.text(5, 5, "don't use jet!", color='white')
g0rand = spint.griddata(points_rand, array_rand.flatten(), (grid_x, grid_y), method='nearest')
g1rand = spint.griddata(points_rand, array_rand.flatten(), (grid_x, grid_y), method='linear')
g2rand = spint.griddata(points_rand, array_rand.flatten(), (grid_x, grid_y), method='cubic')
plt.subplot(4,2,6)
plt.plot(exact, 'or')
#plt.plot(g0rand)
plt.plot(g1rand)
plt.plot(g2rand)
plt.title("griddata")
plt.subplot(4,2,8)
#plt.plot(g0rand - exact)
plt.plot(g1rand - exact)
plt.plot(g2rand - exact)
plt.title("error griddata")
plt.show()
Good question! (and nice plots!)
For unstructured data, you'll want to switch back to functions meant for unstructured data. griddata is one option, but uses triangulation with linear interpolation in between. This leads to "hard" edges at triangle boundaries.
Thin-plate splines (and spline-like smoothers in higher dimensions generally) can be expressed as radial basis functions. In scipy terms, you want scipy.interpolate.Rbf. I'd recommend using function="linear" or function="thin_plate" over cubic splines, but cubic is available as well. (Cubic splines will exacerbate problems with "overshooting" compared to linear or thin-plate splines.)
One caveat is that this particular implementation of radial basis functions will always use all points in your dataset. This is the most accurate and smooth approach, but it scales poorly as the number of input observation points increases. There are several ways around this, but things will get more complex. I'll leave that for another question.
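A side note on that caveat: newer SciPy versions (1.7+) ship scipy.interpolate.RBFInterpolator, which supersedes Rbf and takes a neighbors argument that builds each evaluation from only the k nearest observations, directly addressing the scaling problem:
import numpy as np
from scipy.interpolate import RBFInterpolator

np.random.seed(1977)
obs = np.random.random((1000, 2))    # (n_points, n_dims) observation locations
vals = np.random.random(1000)
# neighbors=50: each evaluation point uses only its 50 nearest observations
interp = RBFInterpolator(obs, vals, kernel='thin_plate_spline', neighbors=50)
yi, xi = np.mgrid[0:1:100j, 0:1:100j]
zi = interp(np.column_stack([xi.ravel(), yi.ravel()])).reshape(100, 100)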
At any rate, here's a simplified example. We'll generate random data and then interpolate it at points that are on a regular grid. (Note that the input is not on a regular grid, and the interpolated points don't need to be either.)
import numpy as np
import scipy.interpolate
import matplotlib.pyplot as plt
np.random.seed(1977)
x, y, z = np.random.random((3, 10))
interp = scipy.interpolate.Rbf(x, y, z, function='thin_plate')
yi, xi = np.mgrid[0:1:100j, 0:1:100j]
zi = interp(xi, yi)
plt.plot(x, y, 'ko')
plt.imshow(zi, extent=[0, 1, 1, 0], cmap='gist_earth')
plt.colorbar()
plt.show()
Choice of spline type
I chose "thin_plate" as the type of spline. Our input observations points range from 0 to 1 (they're created by np.random.random). Notice that our interpolated values go slightly above 1 and well below zero. This is "overshooting".
Linear splines will completely avoid overshooting, but you'll wind up with "bullseye" patterns (nowhere near as severe as with IDW methods, though). For example, here's the exact same data interpolated with a linear radial basis function. Notice that our interpolated values never go above 1 or below 0:
Higher order splines will make trends in the data more continuous but will overshoot more. The default "multiquadric" is fairly similar to a thin-plate spline, but will make things a bit more continuous and overshoot a bit worse:
However, as you go to even higher-order splines such as "cubic" (third order) and "quintic" (fifth order), you can really wind up with unreasonable results as soon as you move even slightly beyond your input data.
At any rate, here's a simple example to compare different radial basis functions on random data:
import numpy as np
import scipy.interpolate
import matplotlib.pyplot as plt
np.random.seed(1977)
x, y, z = np.random.random((3, 10))
yi, xi = np.mgrid[0:1:100j, 0:1:100j]
interp_types = ['multiquadric', 'inverse', 'gaussian', 'linear', 'cubic',
                'quintic', 'thin_plate']

for kind in interp_types:
    interp = scipy.interpolate.Rbf(x, y, z, function=kind)
    zi = interp(xi, yi)

    fig, ax = plt.subplots()
    ax.plot(x, y, 'ko')
    im = ax.imshow(zi, extent=[0, 1, 1, 0], cmap='gist_earth')
    fig.colorbar(im)
    ax.set(title=kind)
    fig.savefig(kind + '.png', dpi=80)

plt.show()
