Evenly sampled 3D meshgrid - python

I have a 3-dimensional meshgrid generated using the following code:
x = np.linspace(-1,1,100)
xx, yy, zz = np.meshgrid(x, x, x)
This generates a 100 x 100 x 100 grid of points in 3D. I would like to plot an evenly spaced subsample of this same grid, without having to generate a new grid. My approach was to use np.linspace() to get an array of 10000 evenly spaced indices into the flattened arrays, and then plot xx[subsample], yy[subsample], and zz[subsample]. I used
subsample = np.linspace(0,len(xx.flatten())-1,10000,dtype=int)
However, when I pass this array to my plotting function, I get uneven structure (diagonal lines) in 3 dimensions:
My guess is that this is happening because I flattened the array, and then used np.linspace(), but I can't figure out how to sample the grid in 3-dimensions and have it come out evenly distributed. I would like to avoid generating a new meshgrid if at all possible.
My question is how would I evenly subsample my original 3-dimensional meshgrid, without having to generate a new meshgrid?

In [117]: x = np.linspace(-1,1,100)
...: xx, yy, zz = np.meshgrid(x, x, x)
In [118]: xx.shape
Out[118]: (100, 100, 100)
Strided slicing picks 1000 equally spaced points from xx; the same slice works for yy and zz:
In [119]: xx[::10,::10,::10].shape
Out[119]: (10, 10, 10)
Or with advanced indexing (making a copy)
In [123]: i=np.arange(0,100,10)
In [124]: xx[np.ix_(i,i,i)].shape
Out[124]: (10, 10, 10)
I think we could use np.ravel_multi_index to get an array of flattened indices. We'd have to generate 1000 tuples of indices to do that!
I don't see how we could get exactly 10,000 points with the same stride on every axis, since 10,000 is not a perfect cube. ::5 would give 8000 points.
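A minimal sketch of how the strided slice above could be turned into 1-D coordinate arrays for plotting, and how the same points correspond to flat indices through np.ravel_multi_index. The step of 10 and the names sub, xs, ys, zs and flat are illustrative assumptions, not part of the question.

```python
import numpy as np

x = np.linspace(-1, 1, 100)
xx, yy, zz = np.meshgrid(x, x, x)

step = 10  # assumed stride; 100 / 10 gives 10 points per axis
sub = (slice(None, None, step),) * 3  # same as [::step, ::step, ::step]

# Strided slices are views; ravel() then copies only the 1000 selected points.
xs, ys, zs = (g[sub].ravel() for g in (xx, yy, zz))

# Equivalent flat indices, usable as xx.ravel()[flat].
i = np.arange(0, xx.shape[0], step)
ii, jj, kk = np.meshgrid(i, i, i, indexing="ij")
flat = np.ravel_multi_index((ii.ravel(), jj.ravel(), kk.ravel()), xx.shape)
```

The slice form is usually simpler; the flat indices are only needed if the plotting code expects a single index array.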

Have you tried using arange? Using linspace to generate integers may introduce rounding issues.
Could you try the following?
subsample = np.arange(0, xx.size, xx.size // 10000) # the last parameter is the step size
Also, be sure that xx.size is divisible by 10000, which is the case for your 100x100x100.
Tip: use .size to get the number of elements in an array. Use .ravel instead of .flatten, as the latter always creates a copy while ravel returns a view whenever possible.
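A small check of the ravel versus flatten tip, using np.shares_memory. The array a and the variable names are made up for the illustration.

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)  # contiguous example array

# ravel() returns a view for a contiguous array, flatten() always copies.
shares_ravel = np.shares_memory(a, a.ravel())
shares_flatten = np.shares_memory(a, a.flatten())

# For a non-contiguous slice, ravel() has to copy as well.
shares_strided = np.shares_memory(a, a[:, ::2].ravel())
```

Here shares_ravel is True, while shares_flatten and shares_strided are False.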
Edit: That subsample did not produce the diagonals, but it only selected a single plane of the grid. Building the flat indices axis by axis gives an even 3D subsample:
subsample_axis = [np.arange(0, xx.shape[i], 10) for i in range(len(xx.shape))]
subsample = np.zeros([len(axis) for axis in subsample_axis])
for i, axis in enumerate(subsample_axis):
    shape = [len(axis) if j == i else 1 for j in range(len(xx.shape))]
    subsample += axis.reshape(shape)*np.prod(xx.shape[i+1:])
subsample = subsample.ravel().astype('int')
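To see why a fixed stride of 100 on the flattened array selects a single plane, and why linspace produces drifting diagonals, the flat indices can be mapped back to 3D indices with np.unravel_index. This is only a diagnostic sketch; the names flat_arange, flat_linspace, k_arange and k_linspace are assumptions for the illustration.

```python
import numpy as np

shape = (100, 100, 100)
size = int(np.prod(shape))

# Stride of exactly 100: every flat index is a multiple of the last-axis length.
flat_arange = np.arange(0, size, size // 10000)
_, _, k_arange = np.unravel_index(flat_arange, shape)

# linspace has a step of about 100.01, so the last-axis index slowly drifts.
flat_linspace = np.linspace(0, size - 1, 10000, dtype=int)
_, _, k_linspace = np.unravel_index(flat_linspace, shape)

print(np.unique(k_arange))         # only index 0 along the last axis
print(np.unique(k_linspace).size)  # many different last-axis indices
```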

Related

Average of a 3D numpy slice based on 2D arrays

I am trying to calculate the average of a 3D array between two indices on the 1st axis. The start and end indices vary from cell to cell and are represented by two separate 2D arrays that are the same shape as a slice of the 3D array.
I have managed to implement a piece of code that loops through the pixels of my 3D array, but this method is painfully slow in the case of my array with a shape of (70, 550, 350). Is there a way to vectorise the operation using numpy or xarray (the arrays are stored in an xarray dataset)?
Here is a snippet of what I would like to optimise:
# My 3D raster containing values; shape = (time, x, y)
values = np.random.rand(10, 55, 60)
# A 2D raster containing start indices for the averaging
start_index = np.random.randint(0, 4, size=(values.shape[1], values.shape[2]))
# A 2D raster containing end indices for the averaging
end_index = np.random.randint(5, 9, size=(values.shape[1], values.shape[2]))
# Initialise an array that will contain results
mean_array = np.zeros_like(values[0, :, :])
# Loop over 3D raster to calculate the average between indices on axis 0
for i in range(0, values.shape[1]):
    for j in range(0, values.shape[2]):
        mean_array[i, j] = np.mean(values[start_index[i, j]: end_index[i, j], i, j], axis=0)
One way to do this without loops is to zero out the entries you don't want to use, compute the sum of the remaining items, then divide by the number of selected entries (end_index - start_index). For example:
i = np.arange(values.shape[0])[:, None, None]
mean_array_2 = np.where((i >= start_index) & (i < end_index), values, 0).sum(0) / (end_index - start_index)
np.allclose(mean_array, mean_array_2)
# True
Note that this assumes that the indices are in the range 0 <= i < values.shape[0]; if this is not the case you can use np.clip or other means to standardize the indices before computation.
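As an alternative that avoids building a full boolean mask, the same averages can be obtained from a prefix sum along axis 0. This is a different technique from the masking shown above, sketched here under the same assumptions (0 <= start_index < end_index <= values.shape[0]); the names csum, upper and lower are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
values = rng.random((10, 55, 60))
start_index = rng.integers(0, 4, size=values.shape[1:])
end_index = rng.integers(5, 9, size=values.shape[1:])

# Masking approach: zero out entries outside [start, end) and sum.
i = np.arange(values.shape[0])[:, None, None]
mask = (i >= start_index) & (i < end_index)
mean_masked = np.where(mask, values, 0).sum(axis=0) / (end_index - start_index)

# Prefix-sum approach: csum[k] holds the sum of values[:k] along axis 0.
zeros = np.zeros((1,) + values.shape[1:])
csum = np.concatenate([zeros, np.cumsum(values, axis=0)], axis=0)
upper = np.take_along_axis(csum, end_index[None], axis=0)[0]
lower = np.take_along_axis(csum, start_index[None], axis=0)[0]
mean_cumsum = (upper - lower) / (end_index - start_index)
```

Both results agree up to floating-point rounding; which one is faster likely depends on the array sizes.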

How to generate a 3D grid of vectors ? (each position in the 3D grid is a vector)

I want to generate a four-dimensional array with dimensions (ndim, N, N, N), where ndim = 3 is the number of vector components and N is the grid length along each axis. How can one elegantly generate such an array using Python?
Here is my 'ugly' implementation:
qvec=np.zeros([ndim,N,N,N])
freq = np.arange(-(N-1)/2.,+(N+1)/2.)
x, y, z = np.meshgrid(freq[range(N)], freq[range(N)], freq[range(N)],indexing='ij')
qvec[0,:,:,:]=x
qvec[1,:,:,:]=y
qvec[2,:,:,:]=z
Your implementation looks good enough to me. However, here are some improvements to make it prettier:
qvec=np.empty([ndim,N,N,N])
freq = np.arange(-(N-1)/2.,+(N+1)/2.)
x, y, z = np.meshgrid(*[freq]*ndim, indexing='ij')
qvec[0,...]=x # qvec[0] = x
qvec[1,...]=y # qvec[1] = y
qvec[2,...]=z # qvec[2] = z
The improvements are:
Using numpy.empty() instead of numpy.zeros()
Getting rid of the range(N) indexing since that would give the same freq array
Using iterable unpacking and utilizing ndim
Using the ellipsis notation for the remaining dimensions (this is optional, since qvec[0] = x is equivalent)
So, after incorporating all of the above points, the below piece of code would suffice:
qvec=np.empty([ndim,N,N,N])
freq = np.arange(-(N-1)/2.,+(N+1)/2.)
x, y, z = np.meshgrid(*[freq]*ndim, indexing='ij')
qvec[0:ndim] = x, y, z
Note: I'm assuming N is the same along every axis, since you used the same variable name.
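Another option, not taken from the answer above, is to let np.stack assemble the components directly, which removes the pre-allocation step. A minimal sketch, assuming N = 4 purely for illustration:

```python
import numpy as np

ndim, N = 3, 4  # N = 4 is an assumed example value
freq = np.arange(-(N - 1) / 2., (N + 1) / 2.)

# Stack the ndim coordinate grids along a new leading axis.
qvec = np.stack(np.meshgrid(*[freq] * ndim, indexing="ij"))

print(qvec.shape)  # (ndim, N, N, N)
```

With indexing="ij", component qvec[c] varies along grid axis c and is constant along the other two.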

numpy meshgrid filter out points

I have a meshgrid in numpy. I make some calculations on the points. I want to filter out points that could not be calculated for some reason (e.g. division by zero).
import numpy
from numpy import arange, array
Xout = arange(-400, 400, 20)
Yout = arange(0, 400, 20)
Zout = arange(0, 400, 20)
Xout_3d, Yout_3d, Zout_3d = numpy.meshgrid(Xout,Yout,Zout)
#some calculations
# for example
b = z / ( y - x )
To perform z / ( y - x ) using those 3D mesh arrays, you can create a mask of the valid entries. The valid entries are those where the y and x values differ, so that the denominator is nonzero. This mask has shape (M,N), where M and N are the lengths of Yout and Xout respectively. To make the mask cover all combinations of X and Y, we can use NumPy's broadcasting:
mask = Yout[:,None] != Xout
Finally, again using broadcasting to extend the mask, which covers the first two axes of the 3D arrays, along the third axis, we can perform the division and choose between an invalid specifier and the actual division result using np.where, like so -
invalid_spec = 0
out = np.where(mask[...,None],Zout_3d/(Yout_3d-Xout_3d),invalid_spec)
Alternatively, we can directly get to such an output using broadcasting and thus avoid using meshgrid and having those heavy 3D arrays in workspace. The idea is to simultaneously populate the 3D grids and perform the subtraction and division computations, both on the fly. So, the implementation would look something like this -
np.where(mask[...,None],Zout/(Yout[:,None,None] - Xout[:,None]),invalid_spec)
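Note that np.where evaluates the division everywhere before selecting, so NumPy may still emit divide-by-zero warnings. One possible variation, sketched here as an assumption rather than as part of the answer above, is to pass where= and out= to np.divide so the division is only carried out at valid positions. The names denom and out are illustrative.

```python
import numpy as np

Xout = np.arange(-400, 400, 20)
Yout = np.arange(0, 400, 20)
Zout = np.arange(0, 400, 20)

invalid_spec = 0.0

# Denominator y - x, broadcast to shape (len(Yout), len(Xout), 1).
denom = Yout[:, None, None] - Xout[:, None]

# Pre-fill the output with the invalid marker, then divide only where denom != 0.
out = np.full((len(Yout), len(Xout), len(Zout)), invalid_spec)
np.divide(Zout, denom, out=out, where=denom != 0)
```

Positions where y == x keep the value of invalid_spec, and no division is attempted there.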

Python numpy grid transformation using universal functions

Here is my problem : I manipulate 432*46*136*136 grids representing time*(space) encompassed in numpy arrays with numpy and python. I have one array alt, which encompasses the altitudes of the grid points, and another array temp which stores the temperature of the grid points.
It is problematic for a comparison : if T1 and T2 are two results, T1[t0,z0,x0,y0] and T2[t0,z0,x0,y0] represent the temperature at H1[t0,z0,x0,y0] and H2[t0,z0,x0,y0] meters, respectively. But I want to compare the temperature of points at the same altitude, not at the same grid point.
Hence I want to modify the z-axis of my matrices to represent the altitude and not the grid point. I created a function conv(alt[t,z,x,y]) which assigns a number between -20 and 200 to each altitude. Here is my code:
def interpolation_extended(self,temp,alt):
    [t,z,x,y]=temp.shape
    new=np.zeros([t,220,x,y])
    for l in range(0,t):
        for j in range(0,z):
            for lat in range(0,x):
                for lon in range(0,y):
                    new[l,conv(alt[l,j,lat,lon]),lat,lon]=temp[l,j,lat,lon]
    return new
But this definitely takes too much time; I can't work with it. I tried to write it using universal functions with numpy:
def interpolation_extended(self,temp,alt):
    [t,z,x,y]=temp.shape
    new=np.zeros([t,220,x,y])
    for j in range(0,z):
        new[:,conv(alt[:,j,:,:]),:,:]=temp[:,j,:,:]
    return new
But that does not work. Do you have any idea of doing this in python/numpy without using 4 nested loops ?
Thank you
I can't really try the code since I don't have your matrices, but something like this should do the job.
First, instead of declaring conv as a function, get the whole altitude projection for all your data:
conv = np.round(alt / 500.).astype(int)
np.round, NumPy's vectorized version of round, rounds every element of the array in compiled code, so you get the new array very quickly. The following line aligns the altitudes to start at 0 by shifting the whole array by its minimum value (in your case, -20):
conv -= conv.min()
The line above transforms your altitude matrix from the range [-20, 200] to [0, 220], which is better for indexing.
With that, interpolation can be done easily by getting multidimensional indices:
t, z, y, x = np.indices(temp.shape)
The arrays above contain all the indices needed to index your original matrix. You can then create the new matrix by doing:
new_matrix[t, conv[t, z, y, x], y, x] = temp[t, z, y, x]
without any loop at all.
Let me know if it works. It might give you some errors, since it is hard for me to test it without data, but it should do the job.
The following toy example works fine:
A = np.random.randn(3,4,5) # Random 3x4x5 matrix -- your temp matrix
B = np.random.randint(0, 10, 3*4*5).reshape(3,4,5) # your conv matrix with altitudes from 0 to 9
C = np.zeros((3,10,5)) # your new matrix
z, y, x = np.indices(A.shape)
C[z, B[z, y, x], x] = A[z, y, x]
C contains your results by altitude.
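The same idea can be sketched for the 4D case and checked against the nested-loop version. The shapes, the number of altitude bins (6) and the random data below are assumptions for the illustration. The bins are built so that no two levels of a column map to the same bin, because the result of an advanced-indexing assignment with repeated target indices should not be relied on.

```python
import numpy as np

rng = np.random.default_rng(0)
n_t, n_z, n_x, n_y, n_bins = 2, 3, 4, 5, 6  # assumed toy sizes

temp = rng.normal(size=(n_t, n_z, n_x, n_y))

# Hypothetical altitude bins in 0..n_bins-1, distinct along the level axis.
conv = np.argsort(rng.random((n_t, n_bins, n_x, n_y)), axis=1)[:, :n_z]

# Vectorized scatter: write each temperature into its altitude bin.
new = np.zeros((n_t, n_bins, n_x, n_y))
t, z, x, y = np.indices(temp.shape)
new[t, conv, x, y] = temp

# Reference result with explicit loops, for comparison only.
ref = np.zeros_like(new)
for l in range(n_t):
    for j in range(n_z):
        for lat in range(n_x):
            for lon in range(n_y):
                ref[l, conv[l, j, lat, lon], lat, lon] = temp[l, j, lat, lon]
```

Bins that receive no value stay at 0, just as in the loop version.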

Python random sample of two arrays, but matching indices

I have two numpy arrays x and y, which have length 10,000.
I would like to plot a random subset of 1,000 entries of both x and y.
Is there an easy way to use the lovely, compact random.sample(population, k) on both x and y to select the same corresponding indices? (The y and x vectors are linked by a function y(x) say.)
Thanks.
You can use np.random.choice on an index array and apply it to both arrays:
idx = np.random.choice(np.arange(len(x)), 1000, replace=False)
x_sample = x[idx]
y_sample = y[idx]
Just zip the two together and use that as the population:
import random
random.sample(zip(xs,ys), 1000)
The result will be 1000 pairs (2-tuples) of corresponding entries from xs and ys.
Update: For Python 3, you need to convert the zipped sequences into a list:
random.sample(list(zip(xs,ys)), 1000)
After testing the numpy.random.choice solution, I found that it was very slow for larger arrays. numpy.random.randint should be much faster.
Example:
x = np.arange(1e8)
y = np.arange(1e8)
idx = np.random.randint(0, x.shape[0], 10000)
x_sample, y_sample = x[idx], y[idx]
Note that numpy.random.randint samples with replacement: it generates an array of independent random indices, so the same data point can be selected more than once.
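If duplicates are not wanted, one option is the newer Generator API, which can draw indices without replacement. A minimal sketch, with the seed, the example data and the variable names chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)  # seed chosen arbitrarily

x = np.linspace(0, 10, 10_000)
y = np.sin(x)  # stand-in for the function y(x)

# Draw 1000 distinct indices and apply them to both arrays.
idx = rng.choice(x.size, size=1000, replace=False)
x_sample, y_sample = x[idx], y[idx]
```

Because the same idx is used for both arrays, each x_sample[i] stays paired with its y_sample[i].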
