I built a Jupyter Notebook that imports geoelectric VES point data and then interpolates the point data over a uniform 2D mesh. The relevant parts of the code are below (the preceding part only imports all data into a dataframe).
x = df['Distance X [m]'].to_numpy()
y = df['AB/2 [m]'].to_numpy()
z = df['Resistivity [Ohmm]'].to_numpy()
#plot
cax = plt.scatter(x, y, c=z)
cbar = plt.colorbar(cax, fraction=0.03)
plt.title('Measured Resistivity')
#invert y axis
plt.gca().invert_yaxis()
plt.savefig('datapoints.png',dpi=100)
import numpy as np
from scipy.interpolate import griddata
from matplotlib.pyplot import figure
# target grid to interpolate to
xi = np.arange(0,6500,20)
yi = np.arange(0,500,20)
xi,yi = np.meshgrid(xi,yi)
# interpolate
zi = griddata((x,y),z,(xi,yi),method='cubic')
# plot
# create a single figure with the desired size (two separate calls would open two figures)
fig = plt.figure(figsize=(12, 6), dpi=80)
#ax = fig.add_subplot(111)
plt.contourf(xi,yi,zi)
plt.plot(x,y,'k.')
plt.xlabel('xi',fontsize=16)
plt.ylabel('yi',fontsize=16)
plt.gca().invert_yaxis()
plt.colorbar()
plt.savefig('interpolated.png',dpi=100)
#plt.close(fig)
So far, I have managed to import my dataset, plot it, and interpolate over the grid. However, especially at larger grid spacings, it becomes obvious that the cubic and linear interpolations do not include the first row of the mesh (in my context, the first meters of the subsurface), which is actually supposed to have the best data coverage. Only the nearest-neighbor method works fine. In the attached image, for example, the first 20 m are not resolved.
Link to Interpolated Section
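One plausible cause (an assumption; nothing above confirms it) is that griddata with method='linear' or method='cubic' returns NaN for every target point outside the convex hull of the input points, while method='nearest' fills the whole grid. If the shallowest AB/2 value is larger than 0, the grid row at y = 0 lies outside that hull. A minimal sketch with made-up points standing in for the VES data:

```python
import numpy as np
from scipy.interpolate import griddata

# Hypothetical stand-in for the VES data: the shallowest reading is at
# y >= 5, so the grid row at y = 0 lies outside the convex hull.
rng = np.random.default_rng(0)
x = rng.uniform(0, 100, 200)
y = rng.uniform(5, 50, 200)
z = x + y

xi, yi = np.meshgrid(np.arange(0, 101, 20), np.arange(0, 51, 10))

z_linear = griddata((x, y), z, (xi, yi), method='linear')
z_nearest = griddata((x, y), z, (xi, yi), method='nearest')

# Rows of the grid (one per depth) that contain no interpolated value
empty_rows = np.isnan(z_linear).all(axis=1)
print("depths with no linear result:", yi[empty_rows, 0])
print("NaNs in nearest result:", np.isnan(z_nearest).sum())

# One possible workaround: fill the gaps from the nearest-neighbour result
z_filled = np.where(np.isnan(z_linear), z_nearest, z_linear)
```

Starting the target grid at the shallowest measured depth instead of 0 would be another way to avoid the empty row.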
I am using cartopy to display a KDE overlaid on a world map. Initially I was using the ccrs.PlateCarree projection with no issues, but the moment I tried another projection, the scale of the map seemed to explode. For reference, I have included an example below that you can test on your own machine (comment out one of the two projec lines to switch between projections).
from scipy.stats import gaussian_kde
import numpy as np
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import cartopy.feature as cfeature
projec = ccrs.PlateCarree()
#projec = ccrs.InterruptedGoodeHomolosine()
fig = plt.figure(figsize=(12, 12))
ax = fig.add_subplot(projection=projec)
np.random.seed(1)
discrete_points = np.random.randint(0,10,size=(2,400))
kde = gaussian_kde(discrete_points)
x, y = discrete_points
# https://www.oreilly.com/library/view/python-data-science/9781491912126/ch04.html
resolution = 1
x_step = int((max(x)-min(x))/resolution)
y_step = int((max(y)-min(y))/resolution)
xgrid = np.linspace(min(x), max(x), x_step+1)
ygrid = np.linspace(min(y), max(y), y_step+1)
Xgrid, Ygrid = np.meshgrid(xgrid, ygrid)
Z = kde.evaluate(np.vstack([Xgrid.ravel(), Ygrid.ravel()]))
Zgrid = Z.reshape(Xgrid.shape)
ext = [min(x)*5, max(x)*5, min(y)*5, max(y)*5]
earth = plt.cm.gist_earth_r
ax.add_feature(cfeature.NaturalEarthFeature('physical', 'land', '50m',
edgecolor='black', facecolor="none"))
ax.imshow(Zgrid,
origin='lower', aspect='auto',
extent=ext,
alpha=0.8,
cmap=earth, transform=projec)
ax.axis('on')
ax.get_xaxis().set_visible(True)
ax.get_yaxis().set_visible(True)
ax.set_xlim(-30, 90)
ax.set_ylim(-60, 60)
plt.show()
You'll notice that with the ccrs.PlateCarree() projection the KDE is nicely placed over Africa, but with the ccrs.InterruptedGoodeHomolosine() projection you can't see the world map at all, because the world map is drawn at an enormous scale. Below are images of both cases:
Plate Carree projection:
Interrupted Goode Homolosine projection (standard zoom):
Interrupted Goode Homolosine projection (zoomed out):
If anyone could explain why this is occurring, and how to fix it so I can plot the same data on different projections, that would be greatly appreciated.
EDIT:
I would also like to specify that I tried adding transform=projec to line 37 in the example I included, namely:
ax.add_feature(cfeature.NaturalEarthFeature('physical', 'land', '50m',
edgecolor='black', facecolor="none", transform=projec))
However, this did not help. In fact, after adding it the world map no longer appeared at all.
EDIT:
In response to JohanC's answer, this is the plot I get when using that code:
And zoomed out:
Comments on your plots:
Plot 1 (the reference map)
projection: PlateCarree
the (Zgrid) image extent covers an approximately square area, about 40 degrees on each side
the image's lower-left corner is at lat/long (0, 0)
Plot 2
Q: Why are the topo features not shown on the map?
A: The plot covers a very small area that does not include any of them.
projection: InterruptedGoodeHomolosine
the image data, Zgrid, is declared to fit within grid (map projection) coordinates (unit: meters)
the map is plotted within a small extent of a few meters in both x and y, and the aspect ratio is not equal
Plot 3
Q: Why is the Zgrid image not visible on the map?
A: The plot covers such a large area that the image becomes too small to see.
projection: InterruptedGoodeHomolosine
the (Zgrid) image extent is very small and not visible at this scale
the map is plotted within a large extent, and the aspect ratio is not equal
The remedies (for Plots 2 and 3)
Zgrid needs a proper transformation from lat/long to the axes' projection coordinates
the map's extent also needs to be transformed and set appropriately
the aspect ratio must be set to 'equal' to prevent unequal stretching in x and y
About gridlines
useful for location reference
latitude/parallels: OK with InterruptedGoodeHomolosine in this case
longitude/meridians: problematic (I don't know how to fix this)
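To put a rough number on the scale mismatch described for Plots 2 and 3, the sketch below compares the image extent read as degrees with the same numbers read as meters. It assumes a spherical Earth and an extent of about [0, 45, 0, 45] (what the example's random data most likely produces); the helper name is made up for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; spherical approximation


def degrees_to_metres_at_equator(deg):
    """Arc length along the equator spanned by `deg` degrees of longitude."""
    return math.radians(deg) * EARTH_RADIUS_M


ext_deg = [0, 45, 0, 45]  # assumed image extent from the example, in degrees

# Read as metres (the units of the InterruptedGoodeHomolosine axes),
# 45 units is just 45 m. Read as degrees, the same number spans
# thousands of kilometres.
width_m = degrees_to_metres_at_equator(ext_deg[1] - ext_deg[0])
print(f"45 degrees of longitude at the equator is about {width_m / 1000:.0f} km")
print(f"ratio to a 45 m extent: about {width_m / 45:.0f}x")
```

This is why the extent has to be declared in degrees (with a PlateCarree transform) rather than in the target projection's own coordinates.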
Here is the modified code that runs and produces the required map.
# proposed code
from scipy.stats import gaussian_kde
import numpy as np
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import cartopy.feature as cfeature
fig = plt.figure(figsize=(7, 12))
ax = plt.axes(projection=ccrs.InterruptedGoodeHomolosine())
np.random.seed(1)
discrete_points = np.random.randint(0,10,size=(2,400))
kde = gaussian_kde(discrete_points)
x, y = discrete_points
# https://www.oreilly.com/library/view/python-data-science/9781491912126/ch04.html
resolution = 1
x_step = int((max(x)-min(x))/resolution)
y_step = int((max(y)-min(y))/resolution)
xgrid = np.linspace(min(x), max(x), x_step+1)
ygrid = np.linspace(min(y), max(y), y_step+1)
Xgrid, Ygrid = np.meshgrid(xgrid, ygrid)
Z = kde.evaluate(np.vstack([Xgrid.ravel(), Ygrid.ravel()]))
Zgrid = Z.reshape(Xgrid.shape)
earth = plt.cm.gist_earth_r
ocean110 = cfeature.NaturalEarthFeature('physical', 'ocean',
    scale='110m', edgecolor='none', facecolor=cfeature.COLORS['water'])
ax.add_feature(ocean110, zorder=-5)
land110 = cfeature.NaturalEarthFeature('physical', 'land', '110m',
    edgecolor='black', facecolor="silver")
ax.add_feature(land110, zorder=5)
# extents used by both Zgrid and axes
ext = [min(x)*5, max(x)*5, min(y)*5, max(y)*5]
# plot the image's data array
# note the options: `extent` and `transform`
ax.imshow(Zgrid,
origin='lower', aspect='auto',
extent=ext, #set image's extent
alpha=0.75,
cmap=earth, transform=ccrs.PlateCarree(),
zorder=10)
# set the plot's extent with proper coord transformation
ax.set_extent(ext, ccrs.PlateCarree())
ax.coastlines()
#ax.add_feature(cfeature.BORDERS) #uncomment if you need
ax.gridlines(linestyle=':', linewidth=1, draw_labels=True, dms=True, zorder=30, color='k')
ax.set_aspect('equal') #make sure the aspect ratio is 1
plt.show()
The output map:
I have measured the positions of different products at different angular positions (6 values in steps of 60 deg. over a complete rotation). Instead of representing my values on a Cartesian graph where 0 and 360 are the same point, I want to use a polar graph.
With matplotlib, I got a spider-chart-type graph, but I want to avoid straight lines between points and display interpolated values between them instead. I have a solution that is kind of OK, but I was hoping there is a nice "one-liner" I could use to get a more realistic representation or better tangent handling at some points.
Does anyone have an idea how to improve my code below?
# Libraries
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
# Some data to play with
df = pd.DataFrame({'measure':[10, -5, 15,20,20, 20,15,5,10], 'angle':[0,45,90,135,180, 225, 270, 315,360]})
# The few lines I would like to avoid...
angles = [y/180*np.pi for x in [np.arange(x, x+45,5) for x in df.angle[:-1]] for y in x]
values = [y for x in [np.linspace(x, df.measure[i+1], 10)[:-1] for i, x in enumerate(df.measure[:-1])] for y in x]
angles.append(360/180*np.pi)
values.append(values[0])
# Initialise the spider plot
ax = plt.subplot(polar=True)
# Plot data
ax.plot(df.angle/180*np.pi, df['measure'], linewidth=1, linestyle='solid', label="Spider chart")
ax.plot(angles, values, linewidth=1, linestyle='solid', label='what I want')
ax.legend()
# Fill area
ax.fill(angles, values, 'b', alpha=0.1)
plt.show()
The result is below. I want something similar to the orange line, but with some kind of spline to avoid the sharp corners I currently get.
I have a solution that is a patchwork of other solutions. It needs to be cleaned up and optimized, but it does the job!
Comments and improvements are always welcome; see below.
# https://stackoverflow.com/questions/33962717/interpolating-a-closed-curve-using-scipy
from scipy import interpolate
x=df.measure[:-1] * np.cos(df.angle[:-1]/180*np.pi)
y=df.measure[:-1] * np.sin(df.angle[:-1]/180*np.pi)
x = np.r_[x, x[0]]
y = np.r_[y, y[0]]
# fit splines to x=f(u) and y=g(u), treating both as periodic. also note that s=0
# is needed in order to force the spline fit to pass through all the input points.
tck, u = interpolate.splprep([x, y], s=0, per=True)
# evaluate the spline fits for 1000 evenly spaced distance values
xi, yi = interpolate.splev(np.linspace(0, 1, 1000), tck)
def cart2pol(x, y):
    rho = np.sqrt(x**2 + y**2)
    phi = np.arctan2(y, x)
    return rho, phi
# Initialise the spider plot
plt.figure(figsize=(12,8))
ax = plt.subplot(polar=True)
# Plot data
ax.plot(df.angle/180*np.pi, df['measure'], linewidth=1, linestyle='solid', label="Spider chart")
ax.plot(angles, values, linewidth=1, linestyle='solid', label='Interval linearisation')
ax.plot(cart2pol(xi, yi)[1], cart2pol(xi, yi)[0], linewidth=1, linestyle='solid', label='Smooth interpolation')
ax.legend()
# Fill area
ax.fill(angles, values, 'b', alpha=0.1)
plt.show()
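A shorter alternative, sketched here as an assumption rather than as part of the solution above, is to interpolate the radius directly as a periodic function of the angle with scipy.interpolate.CubicSpline and bc_type='periodic'. This is a different technique from the parametric splprep fit: it smooths r(theta) rather than the closed curve in x/y, so the two curves will not be identical.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Same sample data as in the question; the first and last values must match
# for a periodic spline (0 and 360 degrees are the same point).
angle_deg = np.array([0, 45, 90, 135, 180, 225, 270, 315, 360])
measure = np.array([10, -5, 15, 20, 20, 20, 15, 5, 10], dtype=float)

theta = np.deg2rad(angle_deg)
spline = CubicSpline(theta, measure, bc_type='periodic')

# Evaluate on a fine angular grid, one sample per degree
theta_fine = np.linspace(0, 2 * np.pi, 361)
r_fine = spline(theta_fine)

# To draw it on the existing polar axes: ax.plot(theta_fine, r_fine)
```

The periodic boundary condition makes the slope and curvature match at 0 and 360 degrees, so there is no kink where the curve closes.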
I have a polygon shapefile (the state of Illinois) and a CSV file with (lat, lon, zvalue). I want to plot a smooth contour plot representing those zvalues. Following is my code:
import glob
import fiona
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from mpl_toolkits.basemap import Basemap
from matplotlib.mlab import griddata
# Read in the tabulated data
tabfname = glob.glob("Outputs\\*.csv")[0]
df = pd.read_table(tabfname, sep=",")
print(df.head())
lat, lon, z = list(df.y), list(df.x), list(df["Theil Sen Slope"])
z0, z1, z2 = np.min(z)+0.03, np.mean(z), np.max(z)-0.01
# Read some metadata of the shapefile
shp = glob.glob("GIS\\*.shp")[0]
with fiona.drivers():
    with fiona.open(shp) as src:
        bnds = src.bounds
        extent = [values for values in bnds]
lono = np.mean([extent[0], extent[2]])
lato = np.mean([extent[1], extent[3]])
llcrnrlon = extent[0]-0.5
llcrnrlat = extent[1]-0.5
urcrnrlon = extent[2]+0.5
urcrnrlat = extent[3]+0.5
# Create a Basemap
fig = plt.figure()
ax = fig.add_subplot(111)
m = Basemap(llcrnrlon=llcrnrlon, llcrnrlat=llcrnrlat,
urcrnrlon=urcrnrlon, urcrnrlat=urcrnrlat,
resolution='i', projection='tmerc' , lat_0 = lato, lon_0 = lono)
# Read in and display the shapefile
m.readshapefile(shp.split(".")[0], 'shf', zorder=2, drawbounds=True)
# Compute the number of bins to aggregate data
nx = 100
ny = 100
# Create a mesh and interpolate data
xi = np.linspace(llcrnrlon, urcrnrlon, nx)
yi = np.linspace(llcrnrlat, urcrnrlat, ny)
xgrid, ygrid = np.meshgrid(xi, yi)
xs, ys = m(xgrid, ygrid)
zs = griddata(lon, lat, z, xgrid, ygrid, interp='nn')
# Plot the contour map
conf = m.contourf(xs, ys, zs, 30, zorder=1, cmap='jet')
cbar = m.colorbar(conf, location='bottom', pad="5%", ticks=(z0, z1, z2))
# Scatter plot of the points that make up the contour
for x, y in zip(lon, lat):
    X, Y = m(x,y)
    m.scatter(X, Y, zorder=4, color='black', s=1)
plt.show()
fig.savefig("Myplot.png", format="png")
And this is the output I got. (The scattered black dots show the spatial distribution of the points from which the interpolation was generated. I used the nearest-neighbor interpolation method here.)
I basically referred to the examples given in the following two links to plot this:
https://gist.github.com/urschrei/29cd446ae8a8ec60ddbc
https://matplotlib.org/basemap/users/examples.html
Now this image has 3 problems:
The interpolated contour does not extend across the whole of the shapefile
The part of the contour plot protruding out of the shapefile boundary is not masked off
The contour is not smooth.
What I want is to overcome these three deficiencies of my plot and generate a smooth and nice looking plot similar to the ones shown below (Source: https://doi.org/10.1175/JCLI3557.1 ):
How do I achieve that?
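A minimal sketch of one way to approach the three problems, under stated assumptions: scipy.interpolate.griddata with method='cubic' is swapped in for the mlab nearest-neighbor call to get a smoother surface, nearest-neighbor values fill the cells where the cubic result is undefined, and matplotlib.path.Path masks grid nodes outside the boundary. The square boundary and the random points are hypothetical stand-ins for the Illinois shapefile and the CSV.

```python
import numpy as np
from matplotlib.path import Path
from scipy.interpolate import griddata

# Hypothetical stand-ins: a square "boundary" instead of the Illinois
# shapefile, and random points instead of the CSV.
boundary = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
rng = np.random.default_rng(0)
lon = rng.uniform(0.1, 0.9, 300)
lat = rng.uniform(0.1, 0.9, 300)
z = np.sin(3 * lon) + np.cos(3 * lat)

# Grid that extends past the boundary, as in the question
xi = np.linspace(-0.5, 1.5, 100)
yi = np.linspace(-0.5, 1.5, 100)
xgrid, ygrid = np.meshgrid(xi, yi)

# Smooth interpolation inside the convex hull of the points ...
zs = griddata((lon, lat), z, (xgrid, ygrid), method='cubic')
# ... and nearest-neighbour values where the cubic result is undefined
zs_near = griddata((lon, lat), z, (xgrid, ygrid), method='nearest')
zs = np.where(np.isnan(zs), zs_near, zs)

# Blank out every grid node that falls outside the boundary polygon
inside = Path(boundary).contains_points(
    np.column_stack([xgrid.ravel(), ygrid.ravel()])
).reshape(xgrid.shape)
zs_masked = np.ma.masked_where(~inside, zs)
```

The masked array can be passed to contourf, which leaves masked cells blank. With a real shapefile, the boundary vertices would have to come from the polygon geometry, in the same coordinates as the grid.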
I was inspired by this answer by @James to see how griddata and map_coordinates might be used. In the examples below I'm showing 2D data, but my interest is in 3D. I noticed that griddata only provides splines for 1D and 2D, and is limited to linear interpolation for 3D and higher (probably for very good reasons). However, map_coordinates seems to be fine with 3D using higher-order (smoother than piecewise-linear) interpolation.
My primary question: if I have random, unstructured data (where I cannot use map_coordinates) in 3D, is there some way to get smoother than piecewise-linear interpolation within the NumPy/SciPy universe, or at least nearby?
My secondary question: is spline for 3D not available in griddata because it is difficult or tedious to implement, or is there a fundamental difficulty?
The images and horrible python below show my current understanding of how griddata and map_coordinates can or can't be used. Interpolation is done along the thick black line.
STRUCTURED DATA:
UNSTRUCTURED DATA:
Horrible python:
import numpy as np
import matplotlib.pyplot as plt
def g(x, y):
    return np.exp(-((x-1.0)**2 + (y-1.0)**2))
def findit(x, X): # or could use some 1D interpolation
    fraction = (x - X[0]) / (X[-1]-X[0])
    return fraction * float(X.shape[0]-1)
nth, nr = 12, 11
theta_min, theta_max = 0.2, 1.3
r_min, r_max = 0.7, 2.0
theta = np.linspace(theta_min, theta_max, nth)
r = np.linspace(r_min, r_max, nr)
R, TH = np.meshgrid(r, theta)
Xp, Yp = R*np.cos(TH), R*np.sin(TH)
array = g(Xp, Yp)
x, y = np.linspace(0.0, 2.0, 200), np.linspace(0.0, 2.0, 200)
X, Y = np.meshgrid(x, y)
blob = g(X, Y)
xtest = np.linspace(0.25, 1.75, 40)
ytest = np.zeros_like(xtest) + 0.75
rtest = np.sqrt(xtest**2 + ytest**2)
thetatest = np.arctan2(ytest, xtest) # arctan2 takes (y, x), matching Xp = R*cos(TH), Yp = R*sin(TH)
ir = findit(rtest, r)
it = findit(thetatest, theta)
plt.figure()
plt.subplot(2,1,1)
plt.scatter(100.0*Xp.flatten(), 100.0*Yp.flatten())
plt.plot(100.0*xtest, 100.0*ytest, '-k', linewidth=3)
plt.imshow(blob, origin='lower', cmap='gray')
plt.text(5, 5, "don't use jet!", color='white')
exact = g(xtest, ytest)
import scipy.ndimage as spndint # map_coordinates lives in scipy.ndimage; the .interpolation submodule is deprecated
ndint0 = spndint.map_coordinates(array, [it, ir], order=0)
ndint1 = spndint.map_coordinates(array, [it, ir], order=1)
ndint2 = spndint.map_coordinates(array, [it, ir], order=2)
import scipy.interpolate as spint
points = np.vstack((Xp.flatten(), Yp.flatten())).T # could use np.array(zip(...))
grid_x = xtest
grid_y = np.array([0.75])
g0 = spint.griddata(points, array.flatten(), (grid_x, grid_y), method='nearest')
g1 = spint.griddata(points, array.flatten(), (grid_x, grid_y), method='linear')
g2 = spint.griddata(points, array.flatten(), (grid_x, grid_y), method='cubic')
plt.subplot(4,2,5)
plt.plot(exact, 'or')
#plt.plot(ndint0)
plt.plot(ndint1)
plt.plot(ndint2)
plt.title("map_coordinates")
plt.subplot(4,2,6)
plt.plot(exact, 'or')
#plt.plot(g0)
plt.plot(g1)
plt.plot(g2)
plt.title("griddata")
plt.subplot(4,2,7)
#plt.plot(ndint0 - exact)
plt.plot(ndint1 - exact)
plt.plot(ndint2 - exact)
plt.title("error map_coordinates")
plt.subplot(4,2,8)
#plt.plot(g0 - exact)
plt.plot(g1 - exact)
plt.plot(g2 - exact)
plt.title("error griddata")
plt.show()
seed_points_rand = 2.0 * np.random.random((400, 2))
rr = np.sqrt((seed_points_rand**2).sum(axis=-1))
thth = np.arctan2(seed_points_rand[...,1], seed_points_rand[...,0])
isinside = (rr>r_min) * (rr<r_max) * (thth>theta_min) * (thth<theta_max)
points_rand = seed_points_rand[isinside]
Xprand, Yprand = points_rand.T # unpack
array_rand = g(Xprand, Yprand)
grid_x = xtest
grid_y = np.array([0.75])
plt.figure()
plt.subplot(2,1,1)
plt.scatter(100.0*Xprand.flatten(), 100.0*Yprand.flatten())
plt.plot(100.0*xtest, 100.0*ytest, '-k', linewidth=3)
plt.imshow(blob, origin='lower', cmap='gray')
plt.text(5, 5, "don't use jet!", color='white')
g0rand = spint.griddata(points_rand, array_rand.flatten(), (grid_x, grid_y), method='nearest')
g1rand = spint.griddata(points_rand, array_rand.flatten(), (grid_x, grid_y), method='linear')
g2rand = spint.griddata(points_rand, array_rand.flatten(), (grid_x, grid_y), method='cubic')
plt.subplot(4,2,6)
plt.plot(exact, 'or')
#plt.plot(g0rand)
plt.plot(g1rand)
plt.plot(g2rand)
plt.title("griddata")
plt.subplot(4,2,8)
#plt.plot(g0rand - exact)
plt.plot(g1rand - exact)
plt.plot(g2rand - exact)
plt.title("error griddata")
plt.show()
Good question! (and nice plots!)
For unstructured data, you'll want to switch back to functions meant for unstructured data. griddata is one option, but uses triangulation with linear interpolation in between. This leads to "hard" edges at triangle boundaries.
The splines that extend to scattered data in any number of dimensions (thin-plate splines, for example) are radial basis functions. In scipy terms, you want scipy.interpolate.Rbf. I'd recommend using function="linear" or function="thin_plate" over cubic splines, but cubic is available as well. (Cubic splines will exacerbate problems with "overshooting" compared to linear or thin-plate splines.)
One caveat is that this particular implementation of radial basis functions will always use all points in your dataset. This is the most accurate and smooth approach, but it scales poorly as the number of input observation points increases. There are several ways around this, but things will get more complex. I'll leave that for another question.
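One way around the scaling caveat, offered as a sketch and not as the approach this answer had in mind, is scipy.interpolate.RBFInterpolator (available in SciPy 1.7 and later, as far as I know). Its neighbors argument fits each evaluation point from only its nearest observations. The example below uses made-up 3D data, since 3D is the case the question asks about.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(1977)

# Hypothetical scattered 3D observations: 2000 points in the unit cube
points = rng.random((2000, 3))
values = np.exp(-((points - 0.5) ** 2).sum(axis=1))

# `neighbors=50` fits each evaluation point from its 50 nearest
# observations only, instead of solving one system with all 2000 points.
interp = RBFInterpolator(points, values,
                         kernel='thin_plate_spline', neighbors=50)

# Evaluate along a line through the cube
t = np.linspace(0.1, 0.9, 25)
line = np.column_stack([t, t, np.full_like(t, 0.5)])
estimate = interp(line)
exact = np.exp(-((line - 0.5) ** 2).sum(axis=1))
print("max abs error along the line:", np.abs(estimate - exact).max())
```

Restricting the fit to local neighbors trades some global smoothness for speed and memory, so the neighbor count is worth tuning against your own data.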
At any rate, here's a simplified example. We'll generate random data and then interpolate it at points that are on a regular grid. (Note that the input is not on a regular grid, and the interpolated points don't need to be either.)
import numpy as np
import scipy.interpolate
import matplotlib.pyplot as plt
np.random.seed(1977)
x, y, z = np.random.random((3, 10))
interp = scipy.interpolate.Rbf(x, y, z, function='thin_plate')
yi, xi = np.mgrid[0:1:100j, 0:1:100j]
zi = interp(xi, yi)
plt.plot(x, y, 'ko')
plt.imshow(zi, extent=[0, 1, 1, 0], cmap='gist_earth')
plt.colorbar()
plt.show()
Choice of spline type
I chose "thin_plate" as the type of spline. Our input observations points range from 0 to 1 (they're created by np.random.random). Notice that our interpolated values go slightly above 1 and well below zero. This is "overshooting".
Linear splines will completely avoid overshooting, but you'll wind up with "bullseye" patterns (nowhere near as severe as with IDW methods, though). For example, here's the exact same data interpolated with a linear radial basis function. Notice that our interpolated values never go above 1 or below 0:
Higher order splines will make trends in the data more continuous but will overshoot more. The default "multiquadric" is fairly similar to a thin-plate spline, but will make things a bit more continuous and overshoot a bit worse:
However, as you go to even higher order splines such as "cubic" (third order):
and "quintic" (fifth order)
You can really wind up with unreasonable results as soon as you move even slightly beyond your input data.
At any rate, here's a simple example to compare different radial basis functions on random data:
import numpy as np
import scipy.interpolate
import matplotlib.pyplot as plt
np.random.seed(1977)
x, y, z = np.random.random((3, 10))
yi, xi = np.mgrid[0:1:100j, 0:1:100j]
interp_types = ['multiquadric', 'inverse', 'gaussian', 'linear', 'cubic',
                'quintic', 'thin_plate']
for kind in interp_types:
    interp = scipy.interpolate.Rbf(x, y, z, function=kind)
    zi = interp(xi, yi)
    fig, ax = plt.subplots()
    ax.plot(x, y, 'ko')
    im = ax.imshow(zi, extent=[0, 1, 1, 0], cmap='gist_earth')
    fig.colorbar(im)
    ax.set(title=kind)
    fig.savefig(kind + '.png', dpi=80)
plt.show()
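To put numbers on the overshoot rather than judging it from the colorbars, a small variation on the comparison above prints the range of each interpolated surface next to the range of the input values. This is only a sketch: it reuses the same random data, and the ranges dictionary is a name introduced here for illustration.

```python
import numpy as np
import scipy.interpolate

np.random.seed(1977)
x, y, z = np.random.random((3, 10))
yi, xi = np.mgrid[0:1:100j, 0:1:100j]

# Collect (min, max) of the interpolated surface for each basis function
ranges = {}
for kind in ['linear', 'thin_plate', 'multiquadric', 'cubic', 'quintic']:
    interp = scipy.interpolate.Rbf(x, y, z, function=kind)
    zi = interp(xi, yi)
    ranges[kind] = (zi.min(), zi.max())
    print(f"{kind:>12}: min={zi.min():8.3f}  max={zi.max():8.3f}")

print(f"{'input z':>12}: min={z.min():8.3f}  max={z.max():8.3f}")
```

Any surface whose range extends beyond the input range is overshooting; how far it extends gives a quick measure of how much.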