Applying a function to meshgrids only under select polygon criteria? - python

My question is similar to this post but I'm having some trouble adapting it:
Ambiguous truth value for meshgrid and user-defined functions using if-statement
Essentially, I would like the conditional statement to not look like this:
import numpy as np
def test(x, y):
    a = 1.0/(1+x*x)
    b = np.ones(y.shape)
    mask = (y!=0)
    b[mask] = np.sin(y[mask])/y[mask]
    return a*b
Rather, I would like the "mask" to depend on whether (x, y) lies within a certain polygon. Every value in the resulting array starts as 1, and a polygon is defined by 4 vertices. I only want the function to apply to points from the two meshgrid inputs (X, Y) which lie inside the polygon.
x and y are real numbers that can be negative.
I'm not sure how to pass in the array items as single values.
I ultimately want to plot Z on a colour plot.
Thanks
i.e. points within the polygon undergo a transformation, points outside the polygon remain as 1.
For example, I would expect my function to look like this:
from shapely.geometry import Point
from shapely.geometry.polygon import Polygon
def f(x, y, poly):
    a = 1.0/(1+x*x)
    b = np.ones(y.shape)
    mask = (Point(x,y).within(poly) == True)
    b[mask] = a*b
    return b
x and y are meshgrids of arbitrary dimensions
I should add that I get the following error:
"only size-1 arrays can be converted to Python scalars"
X and Y are generated and the function is called via
coords = [(0, 0), (4,0), (4,4), (0,4)]
poly = Polygon(coords)
x = np.linspace(0,10, 11, endpoint = True) # x intervals
y = np.linspace(0,10,11, endpoint = True) # y intervals
X, Y = np.meshgrid(x,y)
Z = f(X, Y, poly)
Thanks!
Error Message:
Traceback (most recent call last):
File "meshgrid_understanding.py", line 28, in <module>
Z = f(X, Y, poly)
File "meshgrid_understanding.py", line 16, in f
mask = (Point(x,y).within(poly) != True)
File "C:\Users\Nick\AppData\Local\Programs\Python\Python37\lib\site-packages\shapely\geometry\point.py", line 48, in __init__
self._set_coords(*args)
File "C:\Users\Nick\AppData\Local\Programs\Python\Python37\lib\site-packages\shapely\geometry\point.py", line 137, in _set_coords
self._geom, self._ndim = geos_point_from_py(tuple(args))
File "C:\Users\Nick\AppData\Local\Programs\Python\Python37\lib\site-packages\shapely\geometry\point.py", line 214, in geos_point_from_py
dx = c_double(coords[0])
TypeError: only size-1 arrays can be converted to Python scalars

Matplotlib's Path.contains_points accepts an array of points and tests them all at once. Demo:
import numpy as np
from matplotlib.path import Path
coords = [(0, 0), (4,0), (4,4), (0,4)]
x = np.linspace(0, 10, 11, endpoint=True)
y = np.linspace(0, 10, 11, endpoint=True)
X, Y = np.meshgrid(x, y)
points = np.c_[X.ravel(), Y.ravel()]
mask = Path(coords).contains_points(points).reshape(X.shape)
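To connect the mask back to the question, here is a minimal sketch that applies the transformation only inside the polygon and leaves 1 elsewhere. The helper name apply_inside is hypothetical, and the transformation 1/(1+x^2) is taken from the question's code. Points lying exactly on the polygon edge may be classified either way, depending on the radius argument of contains_points, so the sketch makes no claim about them.

```python
import numpy as np
from matplotlib.path import Path

def apply_inside(x, y, coords):
    """Return an array of ones, replaced by 1/(1+x^2) inside the polygon.

    apply_inside is a hypothetical helper name, not part of any library.
    """
    # Flatten the meshgrids into an (N, 2) array of (x, y) points.
    points = np.column_stack([x.ravel(), y.ravel()])
    # Test all points at once, then restore the grid shape.
    mask = Path(coords).contains_points(points).reshape(x.shape)
    z = np.ones(x.shape)
    z[mask] = 1.0 / (1.0 + x[mask] ** 2)
    return z, mask

coords = [(0, 0), (4, 0), (4, 4), (0, 4)]
x = np.linspace(0, 10, 11)
y = np.linspace(0, 10, 11)
X, Y = np.meshgrid(x, y)
Z, mask = apply_inside(X, Y, coords)
```

Z can then be passed to a colour plot such as pcolormesh(X, Y, Z).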

You are passing whole arrays to Point, whose constructor accepts only single (scalar) coordinate values.

Related

Order of dimensions for a multivariate (4d) normal distribution using scipy / python?

I would like to evaluate a 4d Gaussian / normal distribution on a 4d grid. Let's call the variables (x1,y1,x2,y2). Then if I have means = (x1=1,y1=0,x2=2,y2=0), I expect that when I do a 2d contour plot in the x1, x2 direction, at y1=y2=0, to see a Gaussian centered in (x1=1, x2=2). However, I see the mean/center at (x1=2,x2=0) instead.
What am I missing here? Is it how I define the grid to begin with?
For a 2d normal distribution it works as expected.
import numpy as np
from matplotlib import pyplot as plt
from scipy.stats import multivariate_normal
xy_min = -5
xy_max = 5
npoints = 50
x = np.linspace(xy_min, xy_max, npoints)
dim = 4
xx1,yy1,xx2,yy2 = np.meshgrid(x, x,x,x)
points = np.concatenate([xx1[:, :,:, :,None], yy1[:, :, :,:,None],xx2[:, :, :,:,None],yy2[:, :, :,:,None]], axis=-1)
cov = np.diag(np.ones(4))
mean=np.array([1,0,2,0])
rv = multivariate_normal.pdf(points , mean=mean, cov=cov)
plt.figure()
plt.contourf(x, x, rv[:,0,:,0])
I tried to manually reshape the evaluation points first, but it gives the same results. So I think I am missing something conceptually here?
points_resh = np.reshape(points,[npoints**4,dim],order='C')
rv_resh = multivariate_normal.pdf(points_resh , mean=mean, cov=cov)
rv2 = np.reshape(rv_resh,[npoints,npoints,npoints,npoints],order='C')
plt.figure()
plt.contourf(x, x, rv2[:,0,:,0])
** EDIT: SOLVED **
Using ij indexing for meshgrid, everything works as expected. One only needs to keep in mind that the matrix must be transposed for contour plotting. See the example below:
#%% Instead use ij indexing
x = np.linspace(-5, 5, 50)
y = np.linspace(-3, 3, 30)
z = np.linspace(-2, 2, 20)
w = np.linspace(-1, 1, 10)
x4d, y4d, z4d, w4d = np.meshgrid(x, y, z, w, indexing='ij')
points4d = np.concatenate([x4d[:, :, :, :, None], y4d[:, :, :, :, None], z4d[:, :, :, :, None], w4d[:, :, :, :, None]], axis=-1)
rv4d = multivariate_normal.pdf(points4d, mean=[1, 0.0, 2, 0.0], cov=[0.1, 0.1, 0.1, 0.1])
fig, ax = plt.subplots()
# axis 0 of the slice is x and axis 2 is z, so the plot shows x against z
ax.contourf(x, z, rv4d[:, 0, :, 0].T)
ax.set(xlabel='x', ylabel='z')
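The reason the original grid looked wrong can be seen in a small sketch: with the default indexing='xy', np.meshgrid swaps the first two axes, so axis 0 of the result follows the second input rather than the first. The example below is an illustration only. It uses a coarse 11-point grid and an unnormalised Gaussian computed by hand instead of multivariate_normal, and the helper name peak_index is hypothetical.

```python
import numpy as np

x = np.linspace(-5, 5, 11)          # integer grid points, includes 0, 1 and 2
mean = np.array([1, 0, 2, 0])       # (x1, y1, x2, y2) as in the question

def peak_index(indexing):
    """Index of the density maximum for a given meshgrid indexing mode."""
    grids = np.meshgrid(x, x, x, x, indexing=indexing)
    points = np.stack(grids, axis=-1)                       # shape (11, 11, 11, 11, 4)
    # Unnormalised Gaussian with identity covariance.
    density = np.exp(-0.5 * np.sum((points - mean) ** 2, axis=-1))
    return np.unravel_index(np.argmax(density), density.shape)

peak_ij = peak_index("ij")   # axes ordered as (x1, y1, x2, y2)
peak_xy = peak_index("xy")   # first two axes swapped: (y1, x1, x2, y2)

print("ij:", [float(x[i]) for i in peak_ij])
print("xy:", [float(x[i]) for i in peak_xy])
```

With 'xy' indexing the slice rv[:, 0, :, 0] therefore runs over y1 and x2, which is consistent with the centre appearing at (2, 0) in the question.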

ValueError: condition must be a 1-d array

Hi, I've been stuck on this error for a while now! I want to interpolate data in 3D and then display it in 2D (in Basemap). Unfortunately, I get this error when I try to plot the grid longitudes, the grid latitudes and the interpolated values with contourf:
ValueError: condition must be a 1-d array
I already tried importing the values as y = df['variable'].values.tolist(), but this did not change the error. Unfortunately, as I am new to arrays, I do not understand them well and need to solve this error quickly.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
from pykrige.uk3d import UniversalKriging3D
def load_data():
    df = pd.read_csv(r"File")
    return df
def get_data(df):
    return {
        "lons": df['Longitude'],
        "lats": df['Latitude'],
        "alts": df['Altitude'],
        "values": df['O18'],
    }
def generate_grid(data, basemap, delta=1):
    grid = {
        'lon': np.arange(-180, 180, delta),
        'lat': np.arange(np.amin(data["lats"]), np.amax(data["lats"]), delta),
        'alt': np.arange(np.amin(data["alts"]), np.amax(data["alts"]), delta)
    }
    grid["x"], grid["y"], grid["z"] = np.meshgrid(grid["lon"], grid["lat"], grid["alt"], indexing="ij")
    grid["x"], grid["y"] = basemap(grid["x"], grid["y"])
    return grid
def interpolate(data, grid):
    uk3d = UniversalKriging3D(
        data["lons"],
        data["lats"],
        data["alts"],
        data["values"],
        variogram_model='exponential',
        drift_terms=["specified"],
        specified_drift=[data["alts"]],
    )
    return uk3d.execute("grid", grid["lon"], grid["lat"], grid["alt"], specified_drift_arrays=[grid["z"]])
def prepare_map_plot():
    figure, axes = plt.subplots(figsize=(10,10))
    basemap = Basemap(projection='robin', lon_0=0, lat_0=0, resolution='h',area_thresh=1000,ax=axes)
    return figure, axes, basemap
def plot_mesh_data(interpolation, grid, basemap):
    colormesh = basemap.contourf(grid["x"], grid["y"], interpolation,32, cmap='RdBu_r')
    color_bar = basemap.colorbar(colormesh,location='bottom',pad="10%")
The error Message:
>>> plot_mesh_data(interpolation, grid,basemap)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 2, in plot_mesh_data
File "C:\Users\Name\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\mpl_toolkits\basemap\__init__.py", line 548, in with_transform
return plotfunc(self,x,y,data,*args,**kwargs)
File "C:\Users\Name \AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\mpl_toolkits\basemap\__init__.py", line 3666, in contourf
xl = xx.compress(condition).tolist()
ValueError: condition must be a 1-d array
Hmm, this seems to be a problem with np.compress:
condition needs to be a 1-D array of bools according to:
https://numpy.org/doc/stable/reference/generated/numpy.compress.html#numpy.compress
This is what Basemap's contourf does to grid['x'] internally:
xx = x[x.shape[0]//2,:]
condition = (xx >= self.xmin) & (xx <= self.xmax)
So I would apply the same steps to your grid["x"], like:
```
def plot_mesh_data(interpolation, grid, basemap):
    x = grid["x"]
    xx = x[x.shape[0]//2,:]
    # Basemap applies this test internally with self; outside the class, use the basemap object
    condition = (xx >= basemap.xmin) & (xx <= basemap.xmax)
    print(condition)
    colormesh = basemap.contourf(grid["x"], grid["y"], interpolation,32,
                                 cmap='RdBu_r')
    color_bar = basemap.colorbar(colormesh,location='bottom',pad="10%")
```
Run this to see why the condition is not a 1-D array of booleans.
The print should give you something like [True, False, True].
Update, since the self pointer was missing: this normally occurs when you try to call a method on a class without having instantiated an object first.
i.e.:
import Class as Class_imp
Class_imp.dosmt()
will give you "positional argument self missing", since you did not do:
my_class_imp = Class_imp()
my_class_imp.dosmt()
Do you have a part at the bottom of your complete script that does the following?
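Judging from the traceback and generate_grid, one likely cause is that grid["x"] comes from a three-argument meshgrid and is therefore 3-D, so the row that Basemap extracts is 2-D rather than 1-D. The sketch below reproduces this with plain NumPy and small made-up axis values (the lon, lat and alt arrays are assumptions for illustration, not the asker's data).

```python
import numpy as np

# Small stand-in axes (assumed values, only for illustration).
lon = np.arange(-180, 180, 60)     # 6 values
lat = np.arange(-60, 61, 30)       # 5 values
alt = np.arange(0, 3000, 1000)     # 3 values

# 3-D grid, as built in generate_grid.
x3d, y3d, z3d = np.meshgrid(lon, lat, alt, indexing="ij")
xx3 = x3d[x3d.shape[0] // 2, :]            # still 2-D: shape (5, 3)
condition3 = (xx3 >= -180) & (xx3 <= 180)
try:
    xx3.compress(condition3)
    raised = False
except ValueError as err:
    raised = True
    print("3-D grid:", err)

# 2-D grid, which is what a map-plane contour plot works with.
x2d, y2d = np.meshgrid(lon, lat)
xx2 = x2d[x2d.shape[0] // 2, :]            # 1-D: shape (6,)
condition2 = (xx2 >= -180) & (xx2 <= 180)
kept = xx2.compress(condition2)
print("2-D grid:", kept.shape)
```

If this is the cause, contourf would need 2-D x and y grids plus a single altitude level of the interpolation result. Which axis of the kriging output holds altitude should be checked against the PyKrige documentation.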
if __name__ == '__main__':
    df = load_data()
    data = get_data(df)
    fig, ax, basemap = prepare_map_plot()
    # the grid must exist before it is passed to interpolate
    grid = generate_grid(data, basemap, delta=1)
    interpol = interpolate(data, grid)
    plot_mesh_data(interpol, grid, basemap)
You can run this like:
>>> import runpy
>>> runpy.run_path(path_name='path_to_script.py')
Cheers

How to convert A[x,y] = z to [ [ x0...xN ], [ y0...yN], [ z0...zN] ]

I have a 2D Numpy array that represents an image, and I want to create a surface plot of image intensity using matplotlib.surface_plot. For this function, I need to convert the 2D array A[x,y] => z into three arrays: [x0,...,xN], [y0,...,yN] and [z0,...,zN]. I can see how to do this conversion element-by-element:
X = []
Y = []
Z = []
for x in range( A.shape[ 0 ] ):
    for y in range( A.shape[ 1 ] ):
        X.append( x )
        Y.append( y )
        Z.append( A[x,y] )
but I'm wondering whether there is a more Pythonic way to do this?
A very simple way to do this is to basically use the code shown in the matplotlib example. Assuming x and y represent the sizes of the two dimensions of your image array A, you could do:
import numpy as np
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
# generate some input data that looks nice on a color map:
A = np.mgrid[0:10:0.1,0:10:0.1][0]
X = np.arange(0, A.shape[0], 1)
Y = np.arange(0, A.shape[1], 1)
# indexing='ij' keeps X and Y the same shape as A, also for non-square images
X, Y = np.meshgrid(X, Y, indexing='ij')
fig = plt.figure()
# fig.gca(projection='3d') no longer works in recent matplotlib releases
ax = fig.add_subplot(projection='3d')
surf = ax.plot_surface(X, Y, A, cmap='viridis',
                       linewidth=0, antialiased=False)
plt.show()
You possibly don't need to construct the actual grid, because some pyplot functions accept 1d arrays for x and y, implying that a grid is to be constructed. It seems that Axes3D.plot_surface (which I presume you meant) does need 2d arrays as input, though.
So to get your grid the easiest way is using np.indices to get the indices corresponding to your array:
>>> import numpy as np
...
... # dummy data
... A = np.random.random((3,4)) # randoms of shape (3,4)
...
... # get indices
... x,y = np.indices(A.shape) # both arrays have shape (3,4)
...
... # prove that the indices correspond to the values of A
... print(all(A[i,j] == A[x[i,j], y[i,j]] for i in x.ravel() for j in y.ravel()))
True
The resulting arrays all have the same shape as A, which should be correct for most use cases. If for any reason you really need a flattened 1d array, you should use x.ravel() etc. to get a flattened view of the same 2d array.
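As a quick check that this reproduces the element-by-element loop from the question, here is a small sketch with a made-up 3x4 array. Flattening with ravel() visits the elements in the same row-major order as the nested loops.

```python
import numpy as np

A = np.arange(12.0).reshape(3, 4)   # dummy image data

# Vectorised version: index grids flattened in row-major order.
x, y = np.indices(A.shape)
X, Y, Z = x.ravel(), y.ravel(), A.ravel()

# Loop version from the question, for comparison.
X_loop, Y_loop, Z_loop = [], [], []
for i in range(A.shape[0]):
    for j in range(A.shape[1]):
        X_loop.append(i)
        Y_loop.append(j)
        Z_loop.append(A[i, j])

same = (X.tolist() == X_loop) and (Y.tolist() == Y_loop) and (Z.tolist() == Z_loop)
print(same)
```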
I should note though that the standard way to visualize images (due to the short-wavelength variation of the data) is pyplot.imshow or pyplot.pcolormesh which can give you pixel-perfect visualization, albeit in two dimensions.
Do we agree that X, Y and Z have different sizes (N for X and Y, and N^2 for Z)? If yes:
your X does not look correct (you add the same values several times).
Something like:
X = list(range(A.shape[0]))
Y = list(range(A.shape[1]))
Z = [A[x,y] for x in X for y in Y]

Repeating Scipy's griddata

Gridding data (d) given on an irregular grid (x and y) with SciPy's griddata is time-consuming when there are many datasets. But the longitudes and latitudes (x and y) are always the same; only the data (d) change. In this case, after using griddata once, how can I repeat the procedure with different d arrays to get a faster result?
import numpy as np, matplotlib.pyplot as plt
from scipy.interpolate import griddata
x = np.array([110, 112, 114, 115, 119, 120, 122, 124]).astype(float)
y = np.array([60, 61, 63, 67, 68, 70, 75, 81]).astype(float)
d = np.array([4, 6, 5, 3, 2, 1, 7, 9]).astype(float)
ulx, lrx = np.min(x), np.max(x)
uly, lry = np.max(y), np.min(y)
xi = np.linspace(ulx, lrx, 15)
yi = np.linspace(uly, lry, 15)
grided_data = griddata((x, y), d, (xi.reshape(1,-1), yi.reshape(-1,1)), method='nearest',fill_value=0)
plt.imshow(grided_data)
plt.show()
The above code works for one array of d.
But I have hundreds of other arrays.
griddata with nearest ends up using NearestNDInterpolator. That's a class whose instances are callable; griddata creates one and calls it with xi:
elif method == 'nearest':
    ip = NearestNDInterpolator(points, values, rescale=rescale)
    return ip(xi)
So you could create your own NearestNDInterpolator and call it multiple times with different xi.
But I think in your case you want to change the values. Looking at the code for that class I see
self.tree = cKDTree(self.points)
self.values = y
the __call__ does:
dist, i = self.tree.query(xi)
return self.values[i]
I don't know the relative cost of creating the tree versus query.
So it should be easy to change values between uses of __call__. And it looks like values could have multiple columns, since it's just indexing on the 1st dimension.
This interpolator is simple enough that you could write your own using the same tree idea.
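The tree idea above can be sketched roughly as follows, using the sample coordinates from the question: build the cKDTree once, query the target grid once to get the index of the nearest input point for every grid cell, and then reuse those indices for each new d array. The second data array d2 is made up for illustration, and no timing claims are made here.

```python
import numpy as np
from scipy.spatial import cKDTree

# Fixed station coordinates (from the question).
x = np.array([110, 112, 114, 115, 119, 120, 122, 124], dtype=float)
y = np.array([60, 61, 63, 67, 68, 70, 75, 81], dtype=float)

# Fixed target grid.
xi = np.linspace(x.min(), x.max(), 15)
yi = np.linspace(y.max(), y.min(), 15)
XI, YI = np.meshgrid(xi, yi)                       # both have shape (15, 15)

# One-off work: build the tree and find the nearest station for each cell.
tree = cKDTree(np.column_stack([x, y]))
_, nearest = tree.query(np.column_stack([XI.ravel(), YI.ravel()]))
nearest = nearest.reshape(XI.shape)

# Per-dataset work is now just fancy indexing.
d1 = np.array([4, 6, 5, 3, 2, 1, 7, 9], dtype=float)
d2 = d1[::-1].copy()                               # made-up second dataset
grid1 = d1[nearest]
grid2 = d2[nearest]
```

For hundreds of datasets, stacking them into one array of shape (n_points, n_datasets) and indexing with nearest would grid them all in a single step.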
Here's a Nearest Interpolator that lets you repeat the interpolation for the same points, but different z values. I haven't done timings yet to see how much time it saves
import numpy as np
from scipy import interpolate
class MyNearest(interpolate.NearestNDInterpolator):
    # normal interpolation, but returns the near neighbor indices as well
    def __call__(self, *args):
        xi = interpolate.interpnd._ndim_coords_from_arrays(args, ndim=self.points.shape[1])
        xi = self._check_call_shape(xi)
        xi = self._scale_x(xi)
        dist, i = self.tree.query(xi)
        return i, self.values[i]
def my_griddata(points, values, method='linear', fill_value=np.nan,
                rescale=False):
    points = interpolate.interpnd._ndim_coords_from_arrays(points)
    if points.ndim < 2:
        ndim = points.ndim
    else:
        ndim = points.shape[-1]
    assert(ndim==2)
    # simplified call for 2d 'nearest'
    ip = MyNearest(points, values, rescale=rescale)
    return ip  # return the interpolator itself, not ip(xi)
ip = my_griddata((xreg, yreg), z, method='nearest',fill_value=0)
print(ip)
xi = (xi.reshape(1,-1), yi.reshape(-1,1))
I, data = ip(xi)
print(data.shape)
print(I.shape)
print(np.allclose(z[I],data))
z1 = xreg+yreg # new z data
data = z1[I] # should show diagonal color bars
So as long as z has the same shape as before (and as xreg), z[I] will return the nearest value for each xi.
It can interpolate 2d data as well (e.g. (225,n) shaped):
z1 = np.array([xreg+yreg, xreg-yreg]).T
print(z1.shape) # (225,2)
data = z1[I]
print(data.shape) # (20,20,2)

Python and Scipy programming

I'm getting this error message:
Traceback (most recent call last):
File "C:/Python27/test", line 14, in <module>
tck = interpolate.bisplrep(X,Y,Z)
File "C:\Python27\lib\site-packages\scipy\interpolate\fitpack.py", line 850, in bisplrep
raise TypeError('m >= (kx+1)(ky+1) must hold')
TypeError: m >= (kx+1)(ky+1) must hold
The error says that m = len(X) is less than (kx+1)(ky+1). How can I solve this? Here's my program:
import scipy
import math
import numpy
from scipy import interpolate
x= [1000,2000,3000,4000,5000,6000]
y= [1000]
Y = numpy.array([[i]*len(x) for i in y])
X = numpy.array([x for i in y])
Z = numpy.array([[21284473.74,2574509.71,453334.97,95761.64,30580.45,25580.60]])
tck = interpolate.bisplrep(x,y,Z)
print interpolate.bisplev(3500,1000,tck)
Have you read the documentation?
If you don't specify kx and ky, default values will be 3:
scipy.interpolate.bisplrep(x, y, z, w=None, xb=None, xe=None, yb=None, ye=None,
kx=3, ky=3, task=0, s=None, eps=1e-16, tx=None, ty=None,
full_output=0, nxest=None, nyest=None, quiet=1)
And of course, len(X) = 6 < 16 = (3+1)(3+1).
Even if you pass kx=1 and ky=1 explicitly in the call, you have another problem: your (x, y) values form a line, and you cannot define a surface from a line. Therefore it gives you ValueError: Invalid inputs. First, you should fix your data. If this really is your data, then since there is no variation in Y, skip it and fit a spline in 2D, with Z as a function of X.
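A minimal sketch of that suggestion, using the values from the question: check the point-count condition for the default degrees, then fit a one-dimensional interpolating spline with splrep and evaluate it with splev. The choice of a cubic spline (k=3) is an assumption. Because the data span several orders of magnitude, the value between samples may overshoot, so interpolating log(Z) instead could be worth trying.

```python
import numpy as np
from scipy import interpolate

x = np.array([1000, 2000, 3000, 4000, 5000, 6000], dtype=float)
z = np.array([21284473.74, 2574509.71, 453334.97, 95761.64, 30580.45, 25580.60])

# bisplrep needs m >= (kx+1)*(ky+1) points; the defaults are kx = ky = 3.
kx = ky = 3
needed = (kx + 1) * (ky + 1)
enough_points = len(x) >= needed
print(needed, enough_points)

# 1-D alternative: interpolating spline of z against x (s=0 when no weights are given).
tck = interpolate.splrep(x, z, k=3)
z_3500 = float(interpolate.splev(3500, tck))
print(z_3500)
```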
