3D slice from 3D array - python

I have a large 3D, N x N x N, numpy array with a value at each index in the array.
I want to be able to take cubic slices from the array using a center point:
def take_slice(large_array, center_point):
    ...
    return cubic_slice_from_center
To illustrate, I want the cubic_slice_from_center to come back with the following shape, where slice[1][1][1] would be the value of the center point used to generate the slice:
print(cubic_slice_from_center)
array([[[0.32992015, 0.30037145, 0.04947877],
[0.0158681 , 0.26743224, 0.49967057],
[0.04274621, 0.0738851 , 0.60360489]],
[[0.78985965, 0.16111745, 0.51665212],
[0.08491344, 0.30240689, 0.23544363],
[0.47282742, 0.5777977 , 0.92652398]],
[[0.78797628, 0.98634545, 0.17903971],
[0.76787071, 0.29689657, 0.08112121],
[0.08786254, 0.06319838, 0.27050039]]])
I looked at a couple of ways to do this. One way was the following:
def get_cubic_slice(space, slice_center_x, slice_center_y, slice_center_z):
    return space[slice_center_x-1:slice_center_x+2,
                 slice_center_y-1:slice_center_y+2,
                 slice_center_z-1:slice_center_z+2]
This works so long as the cubic slice is not on the edge but, if it is on the edge, it returns an empty array!
Sometimes, the center point of the slice will be on the edge of the 3D numpy array. When this occurs, rather than return nothing, I would like to return the values of the slice of cubic space that are within the bounds of the space and, where the slice would be out of bounds, fill the return array with np.nan values.
For example, for a 20 x 20 x 20 space, with indices 0-19 for the x, y and z axes, I would like the get_cubic_slice function to return the following kind of result for the point (0,5,5):
print(get_cubic_slice(space,0,5,5))
array([[[np.nan, np.nan, np.nan],
[np.nan , np.nan, np.nan],
[np.nan, np.nan , np.nan]],
[[0.78985965, 0.16111745, 0.51665212],
[0.08491344, 0.30240689, 0.23544363],
[0.47282742, 0.5777977 , 0.92652398]],
[[0.78797628, 0.98634545, 0.17903971],
[0.76787071, 0.29689657, 0.08112121],
[0.08786254, 0.06319838, 0.27050039]]])
What would be the best way to do this with numpy?

x = np.arange(27).reshape(3,3,3)
[[[ 0 1 2] [ 3 4 5] [ 6 7 8]]
[[ 9 10 11] [12 13 14] [15 16 17]]
[[18 19 20] [21 22 23] [24 25 26]]]
x[1][1][2]
14
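x[1:, 0:2, 1:4]
[[[10 11] [13 14]]
[[19 20] [22 23]]]
This is how slicing works on a 3D array: one slice per axis, separated by commas. A stop index past the end of an axis (the 4 in 1:4) is simply clipped.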
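For the edge case in the question, one possible approach is to allocate a NaN-filled cube and copy in only the part of the window that overlaps the array. The following is a minimal sketch rather than a definitive implementation: the `radius` parameter is an addition for illustration, and the function assumes the center point itself lies inside `space`.

```python
import numpy as np

def get_cubic_slice(space, cx, cy, cz, radius=1):
    """Return the cube centred on (cx, cy, cz), NaN where it leaves `space`.

    Assumes the centre point is a valid index into `space`.
    """
    size = 2 * radius + 1
    out = np.full((size, size, size), np.nan, dtype=float)
    src, dst = [], []
    for c, n in zip((cx, cy, cz), space.shape):
        lo, hi = c - radius, c + radius + 1          # requested window on this axis
        src_lo, src_hi = max(lo, 0), min(hi, n)      # part that is inside the array
        src.append(slice(src_lo, src_hi))
        dst.append(slice(src_lo - lo, src_hi - lo))  # where that part lands in `out`
    out[tuple(dst)] = space[tuple(src)]
    return out

space = np.random.rand(20, 20, 20)
cube = get_cubic_slice(space, 0, 5, 5)
print(cube.shape)  # (3, 3, 3)
```

Here `cube[0]` is all NaN because it corresponds to x = -1, while `cube[1, 1, 1]` is still the value at the requested center.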

Related

Query about picking values from numpy array

I have a numpy array of shape (206, 482, 3). I wanted to pick the 1st channel so I used name_of_array[:][:][0] but apparently that doesn't select the 1st channel.
I think name_of_array[:,:,0] picks the 1st channel. I don't understand why. Why name_of_array[:][:][0] != name_of_array[:,:,0]?
It's important to understand what each thing does. To do this break up the action left to right. Perhaps rewriting will make this more clear:
x[:][:][0] -> ( ( x[:] )[:] )[0] # Both are valid and equivalent Python syntax
So we apply [:] to x, then [:] to the result, then [0] to that result. What does x[:] do? It just returns the whole of x again (for a NumPy array, a view of the same data). Thus
( (x[:])[:] )[0] == ( (x)[:] )[0] == (x[:])[0] == x[0]
This is of course, not what you expected. On the other hand,
x[:, :, 0]
returns at once the 0 column of all rows of all frames (I'm treating the index as [frame, row, col]).
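A quick way to see the difference is to compare the shapes of the two results. This is a small sketch using a zero-filled array with the shape from the question; the variable names are only illustrative.

```python
import numpy as np

# Same shape as the array in the question: (rows, cols, channels)
x = np.zeros((206, 482, 3))

chained = x[:][:][0]   # equivalent to x[0]: the first row, all channels
channel = x[:, :, 0]   # the first channel of every pixel

print(chained.shape)   # (482, 3)
print(channel.shape)   # (206, 482)
```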
The short answer: because that's the syntax (see Numpy basics indexing).
arr[:] == arr # full slice of all dimensions of the array
arr[:][:] == arr # full slice of a full slice of all dimensions
arr[:][:][0] == arr[0] # the first two [:] select everything, so this is just arr[0]
vs
arr[:,:,0] # slice all of 1st dim, slice all of 2nd dim, get 0th of 3rd dim
One way to figure things like this out yourself is to make a simplified example and experiment (heeding How to debug small programs):
import numpy as np
res = np.arange(4 * 3 * 2).reshape(4,3,2)
print(":,:,:")
print(res[:, :, :])
print("\n1:2,1:2,:")
print(res[1:2, 1:2, :])
print("\n:,:,0")
print(res[:, :, 0])
print("\n:,:,1")
print(res[:,:,1])
Output:
# :,:,: == all of it
[[[ 0 1]
[ 2 3]
[ 4 5]]
[[ 6 7]
[ 8 9]
[10 11]]
[[12 13]
[14 15]
[16 17]]
[[18 19]
[20 21]
[22 23]]]
# 1:2,1:2,:
[[[8 9]]]
# :,:,0
[[ 0 2 4]
[ 6 8 10]
[12 14 16]
[18 20 22]]
# :,:,1
[[ 1 3 5]
[ 7 9 11]
[13 15 17]
[19 21 23]]
There are lots of questions about numpy slicing on SO, some of which are worth studying to advance your knowledge (they were suggested as possible duplicates, but they do not address this particular confusion):
Numpy extract submatrix
Selecting specific rows and columns from NumPy array

Numpy: Multidimensional index. Row by row with no loop

I have a Nx2x2x2 array called A. I also have a Nx2 array called B, which tells me the position of the last two dimensions of A in which I am interested. I am currently getting a Nx2 array, either by using a loop (as in C in the code below) or by using a list comprehension (as in D in the code below). I want to know whether there would be time gains from vectorization, and, if so, how to vectorize this task.
My current approach to vectorization (E in the code below), is using B to index each submatrix of A, so it does not return what I want. I want E to return the same as C or D.
Input:
A=np.reshape(np.arange(0,32),(4,2,2,2))
print("A")
print(A)
B=np.concatenate((np.array([0,1,0,1])[:,np.newaxis],np.array([1,1,0,0])[:,np.newaxis]),axis=1)
print("B")
print(B)
C=np.empty(shape=(4,2))
for n in range(0, 4):
    C[n,:]=A[n,:,B[n,0],B[n,1]]
print("C")
print(C)
D = np.array([A[n,:,B[n,0],B[n,1]] for n in range(0, 4)])
print("D")
print(D)
E=A[:,:,B[:,0],B[:,1]]
print("E")
print(E)
Output:
A
[[[[ 0 1]
[ 2 3]]
[[ 4 5]
[ 6 7]]]
[[[ 8 9]
[10 11]]
[[12 13]
[14 15]]]
[[[16 17]
[18 19]]
[[20 21]
[22 23]]]
[[[24 25]
[26 27]]
[[28 29]
[30 31]]]]
B
[[0 1]
[1 1]
[0 0]
[1 0]]
C
[[ 1. 5.]
[ 11. 15.]
[ 16. 20.]
[ 26. 30.]]
D
[[ 1 5]
[11 15]
[16 20]
[26 30]]
E
[[[ 1 3 0 2]
[ 5 7 4 6]]
[[ 9 11 8 10]
[13 15 12 14]]
[[17 19 16 18]
[21 23 20 22]]
[[25 27 24 26]
[29 31 28 30]]]
The complicated slicing operation could be done in a vectorized manner like so -
shp = A.shape
out = A.reshape(shp[0],shp[1],-1)[np.arange(shp[0]),:,B[:,0]*shp[3] + B[:,1]]
You are using the first and second columns of B to index into the third and fourth dimensions of the input 4D array, A. What this means is that you are effectively indexing the 4D array with its last two dimensions fused together. So you need to compute linear indices into that fused dimension from B, which is what B[:,0]*shp[3] + B[:,1] does. Before doing that, you need to reshape A to a 3D array with A.reshape(shp[0],shp[1],-1).
Verify results for a generic 4D array case -
In [104]: A = np.random.rand(6,3,4,5)
...: B = np.concatenate((np.random.randint(0,4,(6,1)),np.random.randint(0,5,(6,1))),1)
...:
In [105]: C=np.empty(shape=(6,3))
...: for n in range(0, 6):
...:     C[n,:]=A[n,:,B[n,0],B[n,1]]
...:
In [106]: shp = A.shape
...: out = A.reshape(shp[0],shp[1],-1)[np.arange(shp[0]),:,B[:,0]*shp[3] + B[:,1]]
...:
In [107]: np.allclose(C,out)
Out[107]: True
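As a cross-check on the fused-index idea, the same result can also be obtained without reshaping, by passing one integer index array per indexed axis. This is a sketch of an alternative, not the method from the answer above; it reuses the A and B from the question.

```python
import numpy as np

A = np.arange(32).reshape(4, 2, 2, 2)
B = np.array([[0, 1], [1, 1], [0, 0], [1, 0]])

n = np.arange(A.shape[0])

# Direct advanced indexing: element [i, j] is A[i, j, B[i, 0], B[i, 1]]
direct = A[n, :, B[:, 0], B[:, 1]]

# The reshape-and-linear-index version from the answer, for comparison
shp = A.shape
fused = A.reshape(shp[0], shp[1], -1)[n, :, B[:, 0] * shp[3] + B[:, 1]]

print(direct)
# [[ 1  5]
#  [11 15]
#  [16 20]
#  [26 30]]
print(np.array_equal(direct, fused))  # True
```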

How to iterate over initial dimensions of a Numpy array?

I have a Numpy array with shape [1000, 1000, 1000, 3], where the last dimension, of size 3, contains the components of 3D spatial vectors. How can I use nditer to iterate over each triplet? Like this:
for vec in np.nditer(my_array, op_flags=['writeonly', <???>]):
    vec = np.array(something)
I've addressed this question before, but here's a short example:
vec=np.arange(2*2*2*3).reshape(2,2,2,3)
it=np.ndindex(2,2,2)
for i in it:
    print(vec[i])
producing:
[0 1 2]
[3 4 5]
[6 7 8]
[ 9 10 11]
[12 13 14]
[15 16 17]
[18 19 20]
[21 22 23]
ndindex constructs a multi-index iterator around a dummy array of the size you give it (here (2,2,2)), and returns it along with a next method.
So you can use ndindex as is, or use it as a model for constructing your own nditer.
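Since the question asks about writing to each triplet, here is a small sketch of how ndindex might be used for that. The array is deliberately tiny, and the value written to each triplet (its own index) is just a placeholder for whatever the real computation would be.

```python
import numpy as np

vecs = np.zeros((2, 2, 2, 3))

# Iterate over the leading dimensions only; vecs[idx] is then a length-3 view
for idx in np.ndindex(*vecs.shape[:-1]):
    vecs[idx] = np.array(idx, dtype=float)  # placeholder: store the index itself

print(vecs[1, 0, 1])  # [1. 0. 1.]
```

Assigning through `vecs[idx] = ...` writes into the array, whereas rebinding a loop variable (`vec = ...`) would not.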

Apply function to vectors in 3D numpy array

I have a question about how to apply a function to vectors in a 3D numpy array.
My problem is the following: let's say I have an array like this one:
a = np.arange(24)
a = a.reshape([4,3,2])
I want to apply a function to all following vectors to modify them:
[0 6], [1 7], [2 8], [4 10], [3 9] ...
What is the best method to use? As my array is quite big, looping in two of the three dimension is quite long...
Thanks in advance!
You can use function np.apply_along_axis. From the doc:
Apply a function to 1-D slices along the given axis.
For example:
>>> import numpy as np
>>> a = np.arange(24)
>>> a = a.reshape([4,3,2])
>>>
>>> def my_func(a):
...     print("vector: " + str(a))
...     return sum(a) / len(a)
...
>>> np.apply_along_axis(my_func, 0, a)
vector: [ 0 6 12 18]
vector: [ 1 7 13 19]
vector: [ 2 8 14 20]
vector: [ 3 9 15 21]
vector: [ 4 10 16 22]
vector: [ 5 11 17 23]
array([[ 9., 10.],
       [11., 12.],
       [13., 14.]])
In the example above I've used the 0th axis. If you need to apply the function along n different axes, you can call np.apply_along_axis n times, once per axis.
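One caveat worth adding: np.apply_along_axis still calls the Python function once per vector, so for a big array it may not be much faster than an explicit loop. When the function is a reduction that NumPy already provides, passing `axis=` directly is usually the better option. A sketch, assuming the function is a plain mean as in the example above:

```python
import numpy as np

a = np.arange(24).reshape(4, 3, 2)

# One Python-level call per vector along axis 0
looped = np.apply_along_axis(np.mean, 0, a)

# Single vectorized call over the same axis
vectorized = a.mean(axis=0)

print(vectorized)
# [[ 9. 10.]
#  [11. 12.]
#  [13. 14.]]
print(np.allclose(looped, vectorized))  # True
```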

Fast interpolation of grid data

I have a large 3d np.ndarray of data that represents a physical variable sampled over a volume in a regular grid fashion (as in the value in array[0,0,0] represents the value at physical coords (0,0,0)).
I would like to go to a finer grid spacing by interpolating the data in the rough grid. At the moment I'm using scipy griddata linear interpolation but it's pretty slow (~90secs for 20x20x20 array). It's a bit overengineered for my purposes, allowing random sampling of the volume data. Is there anything out there that can take advantage of my regularly spaced data and the fact that there is only a limited set of specific points I want to interpolate to?
Sure! There are two options that do different things but both exploit the regularly-gridded nature of the original data.
The first is scipy.ndimage.zoom. If you just want to produce a denser regular grid based on interpolating the original data, this is the way to go.
The second is scipy.ndimage.map_coordinates. If you'd like to interpolate a few (or many) arbitrary points in your data, but still exploit the regularly-gridded nature of the original data (e.g. no quadtree required), it's the way to go.
"Zooming" an array (scipy.ndimage.zoom)
As a quick example (This will use cubic interpolation. Use order=1 for bilinear, order=0 for nearest, etc.):
import numpy as np
import scipy.ndimage as ndimage
data = np.arange(9).reshape(3,3)
print('Original:\n', data)
print('Zoomed by 2x:\n', ndimage.zoom(data, 2))
This yields:
Original:
[[0 1 2]
[3 4 5]
[6 7 8]]
Zoomed by 2x:
[[0 0 1 1 2 2]
[1 1 1 2 2 3]
[2 2 3 3 4 4]
[4 4 5 5 6 6]
[5 6 6 7 7 7]
[6 6 7 7 8 8]]
This also works for 3D (and nD) arrays. However, be aware that if you zoom by 2x, for example, you'll zoom along all axes.
data = np.arange(27).reshape(3,3,3)
print('Original:\n', data)
print('Zoomed by 2x gives an array of shape:', ndimage.zoom(data, 2).shape)
This yields:
Original:
[[[ 0 1 2]
[ 3 4 5]
[ 6 7 8]]
[[ 9 10 11]
[12 13 14]
[15 16 17]]
[[18 19 20]
[21 22 23]
[24 25 26]]]
Zoomed by 2x gives an array of shape: (6, 6, 6)
If you have something like a 3-band RGB image that you'd like to zoom, you can do this by specifying a tuple of zoom factors, one per axis:
print('Zoomed by 2x along the last two axes:')
print(ndimage.zoom(data, (1, 2, 2)))
This yields:
Zoomed by 2x along the last two axes:
[[[ 0 0 1 1 2 2]
[ 1 1 1 2 2 3]
[ 2 2 3 3 4 4]
[ 4 4 5 5 6 6]
[ 5 6 6 7 7 7]
[ 6 6 7 7 8 8]]
[[ 9 9 10 10 11 11]
[10 10 10 11 11 12]
[11 11 12 12 13 13]
[13 13 14 14 15 15]
[14 15 15 16 16 16]
[15 15 16 16 17 17]]
[[18 18 19 19 20 20]
[19 19 19 20 20 21]
[20 20 21 21 22 22]
[22 22 23 23 24 24]
[23 24 24 25 25 25]
[24 24 25 25 26 26]]]
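To tie this back to the original question (a regular 3D grid of floating-point data, previously interpolated linearly), a sketch along these lines may be closer to what is needed. It casts to float and uses order=1 for linear interpolation; only the output shapes are shown, since the exact values depend on the interpolation settings.

```python
import numpy as np
from scipy import ndimage

data = np.arange(27, dtype=float).reshape(3, 3, 3)

# Linear interpolation, zooming every axis by 2
fine_all = ndimage.zoom(data, 2, order=1)

# Linear interpolation, leaving the first axis alone
fine_last_two = ndimage.zoom(data, (1, 2, 2), order=1)

print(fine_all.shape)       # (6, 6, 6)
print(fine_last_two.shape)  # (3, 6, 6)
```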
Arbitrary interpolation of regularly-gridded data using map_coordinates
The first thing to understand about map_coordinates is that it operates in pixel coordinates (e.g. just like you'd index the array, but the values can be floats). From your description, this is exactly what you want, but it often confuses people. For example, if you have x, y, z "real-world" coordinates, you'll need to transform them to index-based "pixel" coordinates.
At any rate, let's say we wanted to interpolate the value in the original array at position 1.2, 0.3, 1.4.
If you're thinking of this in terms of the earlier RGB image case, the first coordinate corresponds to the "band", the second to the "row" and the last to the "column". What order corresponds to what depends entirely on how you decide to structure your data, but I'm going to use these as "z, y, x" coordinates, as it makes the comparison to the printed array easier to visualize.
import numpy as np
import scipy.ndimage as ndimage
data = np.arange(27).reshape(3,3,3)
print('Original:\n', data)
print('Sampled at 1.2, 0.3, 1.4:')
print(ndimage.map_coordinates(data, [[1.2], [0.3], [1.4]]))
This yields:
Original:
[[[ 0 1 2]
[ 3 4 5]
[ 6 7 8]]
[[ 9 10 11]
[12 13 14]
[15 16 17]]
[[18 19 20]
[21 22 23]
[24 25 26]]]
Sampled at 1.2, 0.3, 1.4:
[14]
Once again, this is cubic interpolation by default. Use the order kwarg to control the type of interpolation.
It's worth noting here that all of scipy.ndimage's operations preserve the dtype of the original array. If you want floating point results, you'll need to cast the original array as a float:
In [74]: ndimage.map_coordinates(data.astype(float), [[1.2], [0.3], [1.4]])
Out[74]: array([ 13.5965])
Another thing you may notice is that the interpolated coordinates format is rather cumbersome for a single point (e.g. it expects a 3xN array instead of an Nx3 array). However, it's arguably nicer when you have sequences of coordinates. For example, consider the case of sampling along a line that passes through the "cube" of data:
xi = np.linspace(0, 2, 10)
yi = 0.8 * xi
zi = 1.2 * xi
print(ndimage.map_coordinates(data, [zi, yi, xi]))
This yields:
[ 0 1 4 8 12 17 21 24 0 0]
This is also a good place to mention how boundary conditions are handled. By default, anything outside of the array is set to 0. Thus the last two values in the sequence are 0. (i.e. zi is > 2 for the last two elements).
If we wanted the points outside the array to be, say -999 (We can't use nan as this is an integer array. If you want nan, you'll need to cast to floats.):
In [75]: ndimage.map_coordinates(data, [zi, yi, xi], cval=-999)
Out[75]: array([ 0, 1, 4, 8, 12, 17, 21, 24, -999, -999])
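If NaN is the fill value you want, a sketch of the float-cast variant could look like this. The sample points are made up for illustration: one sits exactly on a grid node and the other is far outside the array, and order=1 (linear) is used so that the in-bounds value is easy to predict.

```python
import numpy as np
from scipy import ndimage

data = np.arange(27).reshape(3, 3, 3)

# Coordinates are given as a 3xN array: one row per axis, one column per point
coords = np.array([[1.0, 10.0],
                   [1.0, 10.0],
                   [1.0, 10.0]])

# Cast to float so that NaN can be represented in the output
sampled = ndimage.map_coordinates(data.astype(float), coords,
                                  order=1, cval=np.nan)

print(sampled)  # first entry is the value at data[1, 1, 1], second is nan
```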
If we wanted it to return the nearest value for points outside the array, we'd do:
In [76]: ndimage.map_coordinates(data, [zi, yi, xi], mode='nearest')
Out[76]: array([ 0, 1, 4, 8, 12, 17, 21, 24, 25, 25])
You can also use "reflect" and "wrap" as boundary modes, in addition to "nearest" and the default "constant". These are fairly self-explanatory, but try experimenting a bit if you're confused.
For example, let's interpolate a line along the first row of the first band in the array that extends for twice the distance of the array:
xi = np.linspace(0, 5, 10)
yi, zi = np.zeros_like(xi), np.zeros_like(xi)
The default gives:
In [77]: ndimage.map_coordinates(data, [zi, yi, xi])
Out[77]: array([0, 0, 1, 2, 0, 0, 0, 0, 0, 0])
Compare this to:
In [78]: ndimage.map_coordinates(data, [zi, yi, xi], mode='reflect')
Out[78]: array([0, 0, 1, 2, 2, 1, 2, 1, 0, 0])
In [78]: ndimage.map_coordinates(data, [zi, yi, xi], mode='wrap')
Out[78]: array([0, 0, 1, 2, 0, 1, 1, 2, 0, 1])
Hopefully that clarifies things a bit!
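As mentioned earlier in this answer, "real-world" coordinates have to be converted to index coordinates before calling map_coordinates. For a regular grid that is just a shift and a scale per axis. The sketch below assumes a hypothetical `origin` (physical position of `data[0, 0, 0]`) and `spacing` (grid step per axis); the helper `sample_physical` is not part of scipy. The test field is linear so that linear interpolation reproduces it exactly.

```python
import numpy as np
from scipy import ndimage

# Hypothetical grid description: physical coords of data[0, 0, 0] and step per axis
origin = np.array([10.0, 20.0, 30.0])
spacing = np.array([0.5, 0.5, 2.0])

# Build a linear test field f(z, y, x) = 2z + 3y + 5x on a 4 x 5 x 6 grid
k, j, i = np.indices((4, 5, 6)).astype(float)
z = origin[0] + spacing[0] * k
y = origin[1] + spacing[1] * j
x = origin[2] + spacing[2] * i
data = 2 * z + 3 * y + 5 * x

def sample_physical(data, points, origin, spacing, order=1):
    """Interpolate `data` at physical points given as an (N, 3) array."""
    points = np.atleast_2d(points)
    pixel = (points - origin) / spacing   # physical -> fractional index coords
    return ndimage.map_coordinates(data, pixel.T, order=order)

pts = np.array([[10.7, 21.1, 33.0],
                [11.0, 20.0, 38.5]])
vals = sample_physical(data, pts, origin, spacing)
expected = 2 * pts[:, 0] + 3 * pts[:, 1] + 5 * pts[:, 2]
print(np.allclose(vals, expected))  # True
```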
Great answer by Joe. Based on his suggestion, I created the regulargrid package (https://pypi.python.org/pypi/regulargrid/, source at https://github.com/JohannesBuchner/regulargrid)
It provides support for n-dimensional Cartesian grids (as needed here) via the very fast scipy.ndimage.map_coordinates for arbitrary coordinate scales.
