I have a 2D matrix (1800*600) with many NaN values.
I would like to perform a 2D interpolation, which is very simple in MATLAB.
But if scipy.interpolate.interp2d is used, the result is a matrix of NaNs. I know the NaN values could be filled using scipy.interpolate.griddata, but I don't want to fill in the NaNs. What other functions can I use to perform a 2D interpolation?
A workaround using interp2d is to perform two interpolations: one on the filled data (replace the NaNs with an arbitrary value) and one to keep track of the undefined areas. It is then possible to re-assign NaN to these areas:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
from scipy.interpolate import interp2d
# Generate some test data:
x = np.linspace(-2, 2, 40)
y = np.linspace(-2, 2, 41)
xx, yy = np.meshgrid(x, y)
z = xx**2+yy**2
z[ xx**2+yy**2<1 ] = np.nan
# Interpolation functions:
nan_map = np.zeros_like( z )
nan_map[ np.isnan(z) ] = 1
filled_z = z.copy()
filled_z[ np.isnan(z) ] = 0
f = interp2d(x, y, filled_z, kind='linear')
f_nan = interp2d(x, y, nan_map, kind='linear')
# Interpolation on new points:
xnew = np.linspace(-2, 2, 20)
ynew = np.linspace(-2, 2, 21)
z_new = f(xnew, ynew)
nan_new = f_nan( xnew, ynew )
z_new[ nan_new>0.5 ] = np.nan
plt.pcolor(xnew, ynew, z_new);
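interp2d has been deprecated and is no longer available in recent SciPy releases. The same two-pass idea can be sketched with RegularGridInterpolator instead. This is a minimal sketch under the assumption that the data sit on a regular grid, as in the test data above; the 0.5 threshold on the interpolated mask is carried over from the workaround.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

# Same test data as above: a paraboloid with a NaN disc in the middle
x = np.linspace(-2, 2, 40)
y = np.linspace(-2, 2, 41)
xx, yy = np.meshgrid(x, y)
z = xx**2 + yy**2
z[z < 1] = np.nan

nan_mask = np.isnan(z)
filled_z = np.where(nan_mask, 0.0, z)  # arbitrary fill value

# z has shape (len(y), len(x)), so the grid axes are given as (y, x)
f = RegularGridInterpolator((y, x), filled_z, method="linear")
f_nan = RegularGridInterpolator((y, x), nan_mask.astype(float), method="linear")

# New, coarser grid; points are passed as (y, x) pairs
xnew = np.linspace(-2, 2, 20)
ynew = np.linspace(-2, 2, 21)
yy_new, xx_new = np.meshgrid(ynew, xnew, indexing="ij")
points = np.stack([yy_new.ravel(), xx_new.ravel()], axis=-1)

z_new = f(points).reshape(yy_new.shape)
nan_new = f_nan(points).reshape(yy_new.shape)
z_new[nan_new > 0.5] = np.nan  # restore the undefined area
```

The result z_new has shape (len(ynew), len(xnew)) and can be passed to plt.pcolor(xnew, ynew, z_new) as before.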
I have been able to interpolate values successfully from linear values of x to sine-like values of y.
However - I am struggling to interpolate the other way - from nonlinear values of y to linear values of x.
The below is a toy example
import numpy as np
import matplotlib.pylab as plt
from scipy import interpolate
#create 100 x values
x = np.linspace(-np.pi, np.pi, 100)
#create 100 values of y where y= sin(x)
y=np.sin(x)
#learn function to map y from x
f = interpolate.interp1d(x, y)
With new values of linear x
xnew = np.array([-1,1])
I get correctly interpolated values of nonlinear y
ynew = f(xnew)
print(ynew)
array([-0.84114583, 0.84114583])
The problem comes when I try and interpolate values of x from y.
I create a new function, the reverse of f:
f2 = interpolate.interp1d(y,x,kind='cubic')
I put in values of y that I successfully interpolated before
ynew=np.array([-0.84114583, 0.84114583])
I am expecting to get the original values of x [-1, 1]
But I get:
array([-1.57328791, 1.57328791])
I have tried putting in other values for the 'kind' parameter with no luck and am not sure if I have got the wrong approach here. Thanks for your help
I guess the problem arises from the fact that x is not a function of y: for a given y value there may be more than one x value.
Take a look at a truncated range of data.
When x ranges from 0 to np.pi/2, then for every y value there is a unique x value.
In this case the snippet below works as expected.
>>> import numpy as np
>>> from scipy import interpolate
>>> x = np.linspace(0, np.pi / 2, 100)
>>> y = np.sin(x)
>>> f = interpolate.interp1d(x, y)
>>> f([0, 0.1, 0.3, 0.5])
array([0. , 0.09983071, 0.29551713, 0.47941047])
>>> f2 = interpolate.interp1d(y, x)
>>> f2([0, 0.09983071, 0.29551713, 0.47941047])
array([0. , 0.1 , 0.3 , 0.50000001])
Maxim provided the reason for this behavior. interp1d is a class designed to work on functions. In your case, the inverse x = arcsin(y) is a function only on a limited interval. This leads to interesting phenomena, because the interpolation routine interpolates to the nearest y-value, which for a periodic function such as sin is not necessarily the next point along the x-y curve but may lie several periods away. An illustration:
import numpy as np
import matplotlib.pylab as plt
from scipy import interpolate
xmin=-np.pi
xmax=np.pi
fig, axes = plt.subplots(3, 3, figsize=(15, 10))
for i, fac in enumerate([2, 1, 0.5]):
    x = np.linspace(xmin * fac, xmax*fac, 100)
    y=np.sin(x)
    #x->y
    f = interpolate.interp1d(x, y)
    x_fit = np.linspace(xmin*fac, xmax*fac, 1000)
    y_fit = f(x_fit)
    axes[i][0].plot(x_fit, y_fit)
    axes[i][0].set_ylabel(f"sin period {fac}")
    if not i:
        axes[i][0].set_title(label="interpolation x->y")
    #y->x
    f2 = interpolate.interp1d(y, x)
    y2_fit = np.linspace(.99 * min(y), .99 * max(y), 1000)
    x2_fit = f2(y2_fit)
    axes[i][1].plot(x2_fit, y2_fit)
    if not i:
        axes[i][1].set_title(label="interpolation y->x")
    #y->x with cubic interpolation
    f3 = interpolate.interp1d(y, x, kind="cubic")
    y3_fit = np.linspace(.99 * min(y), .99 * max(y), 1000)
    x3_fit = f3(y3_fit)
    axes[i][2].plot(x3_fit, y3_fit)
    if not i:
        axes[i][2].set_title(label="cubic interpolation y->x")
plt.show()
As you can see, the interpolation works along the ordered list of y-values (as you instructed it to), and this works particularly badly with cubic interpolation.
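One way to get the expected values back for the original data is to invert only on a branch where sin is monotonic. The following is a minimal sketch, assuming the values of interest lie on the branch from -pi/2 to pi/2; the names branch, x_b and y_b are illustrative and not part of the question.

```python
import numpy as np

x = np.linspace(-np.pi, np.pi, 100)
y = np.sin(x)

# Keep only the samples where sin is strictly increasing
branch = (x >= -np.pi / 2) & (x <= np.pi / 2)
x_b, y_b = x[branch], y[branch]

# np.interp needs increasing sample points; y_b is increasing on this branch
y_query = np.array([-0.84114583, 0.84114583])
x_recovered = np.interp(y_query, y_b, x_b)
print(x_recovered)  # close to [-1, 1]
```

Values of x outside the chosen branch cannot be recovered this way, since the same y then maps to more than one x.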
I am trying to create a 3D array on which to perform volume rendering (in other software or volume rendering packages) of a strange attractor such as the Lorenz attractor. It is easy enough to plot the attractor from data points, assign a value for the color, and visualize it in matplotlib, for example.
However, I would like a filled volume array. I have tried interpolation methods like griddata, but they don't give the desired result. What I am envisioning is something like the image on the Wikipedia page.
Here is what I have tried, but if you open the result in a simple viewer it doesn't look great. I am thinking of instead interpolating only between the points that make up the x, y, z array... I am a little lost after playing with this for several hours. What I think I need is to take the points and do some sort of interpolation or filling into an array, which I am calling interp_im here. This can then be viewed with volume rendering. Any help is greatly appreciated!
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint
from scipy.interpolate import griddata
from scipy.interpolate import LinearNDInterpolator
import tifffile  # standalone package; skimage.external.tifffile is not available in newer scikit-image releases
rho = 28.0
sigma = 10.0
beta = 8.0 / 3.0
def f(state, t):
    x, y, z = state # Unpack the state vector
    return sigma * (y - x), x * (rho - z) - y, x * y - beta * z # Derivatives
state0 = [1.0, 1.0, 1.0]
t = np.arange(0.0, 40.0, 0.01)
states = odeint(f, state0, t)
# shift x,y,z positions to int for regular image volume
x = states[:, 0]
y = states[:, 1]
z = states[:, 2]
x_min = x.min()
y_min = y.min()
z_min = z.min()
states_int = states + [abs(x_min),abs(y_min),abs(z_min)] + 1
states_int = states_int * 10
states_int = states_int.astype(int)
# values will be in order of tracing for color
values = []
for i,j in enumerate(states_int):
    values.append(i*10)
values = np.asarray(values)
fig = plt.figure()
ax = fig.add_subplot(projection='3d')  # fig.gca(projection='3d') is not supported in recent Matplotlib
sc = ax.scatter(states_int[:, 0], states_int[:, 1], states_int[:, 2],c=values)
plt.colorbar(sc)
plt.draw()
plt.show()
#print(x.shape, y.shape, z.shape, values.shape)
#Interpolate for volume rendering
x_ = np.linspace(0,999,500)
y_ = np.linspace(0,999,500)
z_ = np.linspace(0,999,500)
xx,yy,zz = np.meshgrid(x_,y_,z_, sparse = True)
#
# X = states_int.tolist()
#
interp_im = griddata(states_int, values, (xx,yy,zz), method='linear')
interp_im = interp_im.astype(np.uint16)
np.save('interp_im.npy', interp_im)
tifffile.imsave('LorenzAttractor.tif', interp_im)
Your data is in the volume, it is just pixelated. If you blur the volume, for example with a gaussian, you get something much more usable. For example:
from scipy import ndimage
vol = np.zeros((512, 512, 512), dtype=states_int.dtype)
# add data to vol
vol[tuple(np.split(states_int, vol.ndim, axis=1))] = values[:, np.newaxis]
# apply gaussian filter, sigma=5 in this case
vol = ndimage.gaussian_filter(vol, 5)
I would then use something like napari to view the data in 3D:
import napari
viewer = napari.view_image(vol)
napari.run()  # start the event loop (replaces the older napari.gui_qt() context manager)
To make the volume smoother you may want to reduce your integration step size.
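The two suggestions above (a finer integration step, then rasterise and blur) can be combined in a small self-contained sketch. The grid size of 64, the step of 0.005, the float32 dtype and the sigma of 1.5 are assumptions chosen to keep the example light, and the volume here stores how often the trajectory visits each voxel rather than the tracing order used in the question.

```python
import numpy as np
from scipy.integrate import odeint
from scipy import ndimage

def lorenz(state, t, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = state
    return sigma * (y - x), x * (rho - z) - y, x * y - beta * z

# Finer time step than in the question, so neighbouring samples lie closer together
t = np.arange(0.0, 40.0, 0.005)
states = odeint(lorenz, [1.0, 1.0, 1.0], t)

# Scale every coordinate to integer voxel indices in [0, n - 1]
n = 64
mins = states.min(axis=0)
maxs = states.max(axis=0)
idx = ((states - mins) / (maxs - mins) * (n - 1)).astype(int)

# Accumulate visit counts; np.add.at handles repeated indices correctly
vol = np.zeros((n, n, n), dtype=np.float32)
np.add.at(vol, (idx[:, 0], idx[:, 1], idx[:, 2]), 1.0)

# Blur so that the one-voxel-wide trajectory becomes a smooth density
vol_smooth = ndimage.gaussian_filter(vol, sigma=1.5)
```

A float volume avoids the rounding that an integer dtype would apply to the blurred values; it can be rescaled and cast before saving if the viewer expects integers.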
I have the following quadratic form f(x) = x^T A x - b^T x, and I've used numpy to define the matrix A and the vector b:
A = np.array([[4,3], [3,7]])
b = np.array([3,-7])
So we're talking about 2 dimensions here, meaning that the contour plot will have the axes x1 and x2 and I want these to span from -4 to 4.
I've tried to experiment by doing
u = np.linspace(-4,4,100)
x, y = np.meshgrid(u,u)
in order to create the two axes x1 and x2, but then I don't know how to define my function f(x); plt.contour(x, y, f) won't work because f(x) is defined with only x as an argument.
Any ideas would be greatly appreciated. Thanks!
EDIT: I managed to "solve" the problem by expanding the quadratic form (for example x^T A x) by hand, which gave a function of x1 and x2, the components of the vector x. After that I did
u = np.linspace(-4,4,100)
x, y = np.meshgrid(u,u)
z = 1.5*(x**2) + 3*(y**2) - 2*x + 8*y + 2*x*y  # (that's the function I ended up with)
plt.contour(x, y, z)
If your transformation matrices A, b look like
A = np.array([[4,3], [3,7]])
b = np.array([3,-7])
and your data look like
u = np.linspace(-4,4,100)
x, y = np.meshgrid(u,u)
x.shape
x and y will have the shapes (100,100).
You can define f(x) as
def f(x):
    return np.dot(np.dot(x.T,A),x) - np.dot(b,x)
to then input anything with the shape (2, N) into the function f.
I am unfortunately not sure which values you want to feed into it.
But one example would be: [(-4:4), (-4:4)]
plt.contour(x, y, f(x[0:2,:]))
Update
If the visualization of the contour plot does not fit your purpose, you can use other plots, e.g. 3D visualizations.
from mpl_toolkits.mplot3d import Axes3D # This import has side effects required for the kwarg projection='3d' in the call to fig.add_subplot
fig = plt.figure(figsize=(40,20))
ax = fig.add_subplot(111, projection='3d')
ax.scatter(x,y, f(x[0:2,:]))
plt.show()
If you expect other values in the z-dimension, the projection f might be off.
For other 3d plots see: https://matplotlib.org/mpl_toolkits/mplot3d/tutorial.html
You could try something like this:
import numpy as np
import matplotlib.pyplot as plt
A = np.array([[4,3], [3,7]])
n_points = 100
u = np.linspace(-4, 4, n_points)
x, y = np.meshgrid(u, u)
X = np.vstack([x.flatten(), y.flatten()])
f_x = np.dot(np.dot(X.T, A), X)
f_x = np.diag(f_x).reshape(n_points, n_points)
plt.figure()
plt.contour(x, y, f_x)
Another alternative is to compute f_x as follows.
f_x = np.zeros((n_points, n_points))
for i in range(n_points):
    for j in range(n_points):
        in_v = np.array([x[i][j], y[i][j]])  # 1-D vector (x1, x2) at this grid node
        f_x[i][j] = in_v @ A @ in_v  # scalar value of x^T A x
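Both variants above work but are expensive: the first builds a 10000 x 10000 matrix only to keep its diagonal, and the second loops in Python. A vectorized sketch using np.einsum evaluates the full form f(x) = x^T A x - b^T x at every grid node in one pass. The names x1, x2 and X are illustrative; A and b are the ones from the question.

```python
import numpy as np

A = np.array([[4, 3], [3, 7]])
b = np.array([3, -7])

u = np.linspace(-4, 4, 100)
x1, x2 = np.meshgrid(u, u)

# Stack the grid into an array of 2-vectors with shape (100, 100, 2)
X = np.stack([x1, x2], axis=-1)

# x^T A x at every grid node: contract both vector indices against A
quad = np.einsum("...i,ij,...j->...", X, A, X)
# b^T x at every grid node
lin = X @ b

z = quad - lin  # shape (100, 100), ready for a contour plot

# Check against the hand expansion for this A and b
expanded = 4 * x1**2 + 6 * x1 * x2 + 7 * x2**2 - 3 * x1 + 7 * x2
print(np.allclose(z, expanded))
```

The result can be plotted with plt.contour(x1, x2, z).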
I am interpolating a 2d numpy array to fill missing values that are marked with NaN. The following code works but only uses one core.
Are there any better functions that I can use to utilize all of the 24 cores that I have?
x = np.arange(0, array.shape[1])
y = np.arange(0, array.shape[0])
#mask invalid values
array = np.ma.masked_invalid(array)
xx, yy = np.meshgrid(x, y)
#get only the valid values
x1 = xx[~array.mask]
y1 = yy[~array.mask]
newarr = array[~array.mask]
GD1 = interpolate.griddata((x1, y1), newarr.ravel(),
                           (xx, yy),
                           method='cubic')
I think that you can do this with dask. I am not too familiar with dask but here is a start:
import numpy as np
from scipy import interpolate
import dask.array as da
import matplotlib.pyplot as plt
from dask import delayed
# create data with random missing entries
ar_size = 2000
chunk_size = 500
z_array = np.ones((ar_size, ar_size))
z_array[np.random.randint(0, ar_size-1, 50),
np.random.randint(0, ar_size-1, 50)]= np.nan
# XY coords
x = np.linspace(0, 3, z_array.shape[1])
y = np.linspace(0, 3, z_array.shape[0])
# gen sin wave for testing
z_array = z_array * np.sin(x)
# prove there are nans in the dataset
assert np.isnan(np.sum(z_array))
xx, yy = np.meshgrid(x, y)
print("global x.size: ", xx.size)
# make dask arrays
dask_xyz = da.from_array((xx, yy, z_array), chunks=(3, chunk_size, "auto"), name="dask_all")
dask_xx = dask_xyz[0,:,:]
dask_yy = dask_xyz[1,:,:]
dask_zz = dask_xyz[2,:,:]
# select only valid values
dask_valid_y1 = dask_yy[~da.isnan(dask_zz)]
dask_valid_x1 = dask_xx[~da.isnan(dask_zz)]
dask_newarr = dask_zz[~da.isnan(dask_zz)]
def gd_wrapped(x1, y1, newarr, xx, yy):
    # note: linear and cubic griddata impl do not extrapolate
    # and therefore fail near the boundaries... see RBF interp instead
    print("local x.size: ", x1.size)
    gd_zz = interpolate.griddata((x1, y1), newarr.ravel(),
                                 (xx, yy),
                                 method='nearest')
    return gd_zz
def rbf_wrapped(x1, y1, newarr, xx, yy):
    rbf_interpolant = interpolate.Rbf(x1, y1, newarr, function='linear')
    return rbf_interpolant(xx, yy)
# interpolate
# gd_chunked = [delayed(rbf_wrapped)(x1, y1, newarr, xx, yy) for \
gd_chunked = [delayed(gd_wrapped)(x1, y1, newarr, xx, yy) for \
x1, y1, newarr, xx, yy \
in \
zip(dask_valid_x1.to_delayed().flatten(),
dask_valid_y1.to_delayed().flatten(),
dask_newarr.to_delayed().flatten(),
dask_xx.to_delayed().flatten(),
dask_yy.to_delayed().flatten())]
gd_out = delayed(da.concatenate)(gd_chunked, axis=0)
gd_out.visualize("dask_par.png")
gd1 = np.array(gd_out.compute())
print(gd1)
assert gd1.shape == (ar_size, ar_size)
print(gd1.shape)
plt.figure()
plt.imshow(gd1)
plt.savefig("dask_par_sin.png")
# prove we have no more nans in the data
assert ~np.isnan(np.sum(gd1))
There are some issues with this implementation. With method='linear' or method='cubic', griddata cannot extrapolate, so NaNs are an issue at chunk boundaries. You could probably solve this with some overlapping cells. As a stopgap solution you can use method='nearest' or try radial basis function interpolation.
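Independently of how the work is split across cores, the amount of work can often be reduced by evaluating the interpolant only at the missing cells instead of on the whole grid. The following is a minimal single-core sketch of that idea on synthetic data; the array contents, the 2% NaN rate and method='nearest' (chosen so that no NaNs remain at the edges) are assumptions for illustration.

```python
import numpy as np
from scipy.interpolate import griddata

# Synthetic test array with about 2% missing values
rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:200, 0:300]
array = np.sin(xx / 30.0) + np.cos(yy / 20.0)
array[rng.random(array.shape) < 0.02] = np.nan

valid = ~np.isnan(array)

# Interpolate only where values are missing; valid cells are left untouched
filled = array.copy()
filled[~valid] = griddata(
    (xx[valid], yy[valid]),    # coordinates of known samples
    array[valid],              # known values
    (xx[~valid], yy[~valid]),  # coordinates of the holes only
    method="nearest",
)
```

The same restriction to the missing cells could be applied inside each chunk of the dask version above.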
I have two arrays with values:
x = np.array([100, 123, 123, 118, 123])
y = np.array([12, 1, 14, 13])
I want to evaluate, for example, the function:
def func(a, b):
    return a*0.8 * (b/2)
So, I want to fill in the missing y values.
I am using:
import numpy as np
from scipy import interpolate
def func(a, b):
    return a*0.8 * (b/2)
x = np.array([100, 123, 123, 118, 123])
y = np.array([12, 1, 14, 13])
X, Y = np.meshgrid(x, y)
Z = func(X, Y)
f = interpolate.interp2d(x, y, Z, kind='cubic')
Now, I am not sure how to continue from here. If I try:
xnew = np.linspace(0,150,10)
ynew = np.linspace(0,150,10)
Znew = f(xnew, ynew)
Znew is filled with NaN values.
Also, I want to do the opposite:
when x has fewer values than y, I still want to interpolate based on the x values.
So, for example:
x = np.array([1,3,4])
y = np.array([1,2,3,4,5,6,7])
I want to remove values from y now.
How can I proceed with this?
To interpolate from a 1D array you can use np.interp as follows:
np.interp(np.linspace(0,1,len(x)), np.linspace(0,1,len(y)),y)
You can have a look at the documentation for full details, but in short:
Consider that the values of your array y sit at reference positions from 0 to 1 (for example, [5,2,6,3,9] has the positions [0,0.25,0.5,0.75,1]).
The second and third arguments of the function are these positions and the vector y.
The first argument is the positions at which the interpolated values of y are wanted.
As an example:
>>> y = [0,5]
>>> indexes = [0,1]
>>> new_indexes = [0,0.5,1]
>>> np.interp(new_indexes, indexes, y)
array([0. , 2.5, 5. ])
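Applied to the arrays from the question, the same call covers both directions: stretching the 4-element y to the length of x, and shrinking the 7-element y to 3 values. The helper name resample_to_length is hypothetical and only wraps the np.interp call shown above.

```python
import numpy as np

def resample_to_length(values, new_len):
    """Linearly resample a 1-D sequence to new_len points (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    old_pos = np.linspace(0.0, 1.0, len(values))  # positions of the existing values
    new_pos = np.linspace(0.0, 1.0, new_len)      # positions of the wanted values
    return np.interp(new_pos, old_pos, values)

# Stretch y (4 values) to the length of x (5 values)
y_up = resample_to_length([12, 1, 14, 13], 5)
print(y_up)    # 12, 3.75, 7.5, 13.75, 13

# Shrink y (7 values) to the length of x (3 values)
y_down = resample_to_length([1, 2, 3, 4, 5, 6, 7], 3)
print(y_down)  # 1, 4, 7
```

The first and last values are always preserved, because both position vectors start at 0 and end at 1.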