Efficient transformations of 3D numpy arrays

I have some 3D numpy arrays that need to be transformed in various ways. E.g.:
x.shape = (4, 17, 17)
This array is 1 sample of 4 planes, each of size 17x17. What is the most efficient way to transform each plane: flipud, fliplr, and rot90? Is there a better way than using a for loop? Thanks!
for p in range(4):
    x[p, :, :] = np.fliplr(x[p, :, :])

Look at the code of these functions:
def fliplr(...):
    ....
    return m[:, ::-1]
In other words, it returns a view with reverse slicing on the 2nd dimension.
Your x[p, :, :] = np.fliplr(x[p, :, :]) applies that reverse slicing to the last dimension, so the equivalent for the whole array should be
x[:, :, ::-1]
Flipping the 2nd axis (the equivalent of flipud on each plane) would be
x[:, ::-1, :]
etc.
np.rot90 has 4 cases (k); for k=1 it is
return fliplr(m).swapaxes(0, 1)
in other words, m[:, ::-1].swapaxes(0,1).
To work on your planes you would do something like
m[:, :, ::-1].swapaxes(1,2)
or you could do the swapaxes/transpose first, in which case the flip moves to the 2nd axis:
m.transpose(0,2,1)[:, ::-1, :]
Does that give you enough tools to transform the planes in whatever way you want?
As I discussed in another recent question, https://stackoverflow.com/a/41291462/901925, the flip... functions return a view. rot90 combines a flip and a swap, both of which are view operations, so it should return a view as well rather than a copy. Either way, numpy will be giving you the most efficient version.
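
A quick way to check whole-array expressions like these is to compare them with the per-plane loop. The sketch below uses made-up data for x; the final comparison uses the axes argument of np.rot90, which is an alternative not mentioned in the answer above.

```python
import numpy as np

# Example data: 4 planes of 17x17, as in the question.
x = np.arange(4 * 17 * 17).reshape(4, 17, 17)

# Whole-array versions built from slicing and axis swapping.
fliplr_all = x[:, :, ::-1]                 # fliplr on every plane
flipud_all = x[:, ::-1, :]                 # flipud on every plane
rot90_all = x[:, :, ::-1].swapaxes(1, 2)   # rot90 (k=1) on every plane

# Reference results from an explicit loop over the planes.
fliplr_loop = np.stack([np.fliplr(p) for p in x])
flipud_loop = np.stack([np.flipud(p) for p in x])
rot90_loop = np.stack([np.rot90(p) for p in x])

print(np.array_equal(fliplr_all, fliplr_loop))
print(np.array_equal(flipud_all, flipud_loop))
print(np.array_equal(rot90_all, rot90_loop))

# np.rot90 can also rotate within the plane of the last two axes directly.
rot90_builtin = np.rot90(x, k=1, axes=(1, 2))
print(np.array_equal(rot90_builtin, rot90_loop))
```

np.shares_memory(result, x) can be used on any of these results to see whether it is a view of x.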

Related

Indexing whole Tensor along specific dimension and specific channels

Let's say we have a tensor A with dimensions dim(A) = [i, j, k=6, u, v]. We want the whole tensor along dimension k, restricted to channels [0:3]. I know we can get it this way:
B = A[:, :, 0:3, :, :]
Now I would like to know if there is a more "pythonic" way to achieve the same result without this suboptimal indexing. I mean something like:
B = subset(A, dim=2, index=[0, 1, 2])
The framework does not matter: pytorch, tensorflow, numpy, etc.
Thanks a lot

In numpy, you can use the take method:
B = A.take([0,1,2], axis=2)
In TensorFlow, there is not really a more concise way than the traditional approach. Using tf.slice would be really verbose:
B = tf.slice(A,[0,0,0,0,0],[-1,-1,3,-1,-1])
You can potentially use the experimental version of take (since TF 2.4):
B = tf.experimental.numpy.take(A, [0,1,2], axis=2)
In PyTorch, you can use index_select:
B = torch.index_select(A, dim=2, index=torch.tensor([0,1,2]))
Note that you can avoid explicitly listing the first dimensions (or the last) by using an ellipsis:
# Both are equivalent in that case
B = A[..., 0:3, :, :]
B = A[:, :, 0:3, ...]
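
For NumPy, the subset(A, dim, index) call from the question can be sketched with a tuple of slices. The helper below is hypothetical (it is not part of any library), and the array contents are made up for illustration.

```python
import numpy as np

def subset(arr, dim, index):
    """Hypothetical helper: select `index` along axis `dim` of `arr`."""
    slicer = [slice(None)] * arr.ndim   # one ':' per axis
    slicer[dim] = index                 # replace the ':' on the chosen axis
    return arr[tuple(slicer)]

A = np.arange(2 * 3 * 6 * 4 * 5).reshape(2, 3, 6, 4, 5)

B_slice = A[:, :, 0:3, :, :]
B_take = A.take([0, 1, 2], axis=2)
B_subset = subset(A, dim=2, index=[0, 1, 2])

print(B_slice.shape)  # (2, 3, 3, 4, 5)
print(np.array_equal(B_slice, B_take))
print(np.array_equal(B_slice, B_subset))
```

One difference worth knowing: basic slicing (0:3) returns a view of A, while take and indexing with a list return copies.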

What does the parameter -1 mean in NumPy's sum method?

Consider the following code:
X = rand.rand(10, 2)
differences = X[:, np.newaxis, :] - X[np.newaxis, :, :]
sq_differences = differences ** 2
dist_sq = sq_differences.sum(-1)
In this code, we're calculating the squared distance between the points in the cartesian plane (points are stored in the array X). Please explain the last step in this code, especially that -1 parameter in the sum method.

When numpy.sum() is used on a multidimensional array, it lets you specify the axis to sum over. For a 2D array, axis=0 sums down each column (collapsing the rows), while axis=1 sums across each row (collapsing the columns).
This is explained in more detail in the numpy.sum documentation.
Why -1 is used took a little more digging, but the answer is simple and sensible once you find it.
Passing -1 as the axis means "use the last axis of the array". In a 2D array that is axis 1, so each row is summed. In the code above, differences has shape (10, 10, 2), so sum(-1) adds up the squared differences over the two coordinates and leaves a (10, 10) array of squared distances.
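
The effect of the axis argument can be seen on a small array and then on the code from the question. In this sketch, rand is assumed to be an np.random.RandomState instance, since the question does not show how it was created.

```python
import numpy as np

# Small 2D example: [[0, 1, 2], [3, 4, 5]]
a = np.arange(6).reshape(2, 3)
print(a.sum(0))    # [3 5 7]  (down each column)
print(a.sum(1))    # [ 3 12]  (across each row)
print(a.sum(-1))   # [ 3 12]  (last axis, same as axis=1 here)

# The code from the question.
rand = np.random.RandomState(42)   # assumption: not shown in the question
X = rand.rand(10, 2)
differences = X[:, np.newaxis, :] - X[np.newaxis, :, :]
sq_differences = differences ** 2
dist_sq = sq_differences.sum(-1)

print(differences.shape)   # (10, 10, 2)
print(dist_sq.shape)       # (10, 10)
```

Here sum(-1) and sum(axis=2) give the same result, because the last axis of a 3D array is axis 2.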

ValueError: shape mismatch: with same shape

I got a basic error with strange output that I do not understand very well.
Steps to reproduce:
arr1 = np.zeros([6,10,50])
arr2 = np.zeros([6,10])
arr1[:, :, range(25,26,1)] = [arr2]
That generates this error:
ValueError: shape mismatch: value array of shape (1,6,10) could not be broadcast to indexing result of shape (1,6,10)
Could anyone explain what I'm doing wrong?

Add an extra dimension to arr2:
arr1[:, :, range(25,26,1)] = arr2.reshape(arr2.shape + (1,))
Easier notation for the range used here:
arr1[:, :, 25:26] = arr2.reshape(arr2.shape + (1,))
(and slice(25,26,1), or slice(25,26), could also work; just to add to the options and possible confusion.)
Or insert an extra axis at the end of arr2:
arr1[..., 25:26] = arr2[..., np.newaxis]
(where ... means "as many dimensions as possible"). You can also use None instead of np.newaxis; the latter is probably more explicit, but anyone knowing NumPy will recognise None as inserting an extra dimension (axis).
Of course, you could also set arr2 to be 3-dimensional from the start:
arr2 = np.zeros([6,10,1])
Note that broadcasting does work when used from the left:
>>> arr1 = np.zeros([50,6,10]) # Swapped ("rolled") dimensions
>>> arr2 = np.zeros([6,10])
>>> arr1[25:26, :, :] = arr2 # No need to add an extra axis
It's just that it doesn't work when used from the right, as in your code.
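
The variants above can be checked against each other. This is a minimal sketch with made-up values in arr2 so that the assignment is visible in the result.

```python
import numpy as np

arr2 = np.arange(60, dtype=float).reshape(6, 10)

# Variant 1: reshape arr2 to (6, 10, 1).
a = np.zeros([6, 10, 50])
a[:, :, 25:26] = arr2.reshape(arr2.shape + (1,))

# Variant 2: insert a trailing axis with np.newaxis.
b = np.zeros([6, 10, 50])
b[..., 25:26] = arr2[..., np.newaxis]

# Variant 3: swapped layout, where broadcasting from the left just works.
c = np.zeros([50, 6, 10])
c[25:26, :, :] = arr2

print(np.array_equal(a, b))
print(np.array_equal(a[:, :, 25], arr2))
print(np.array_equal(c[25], arr2))
```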

Since range(25, 26, 1) contains just a single number, you could use either:
arr1[:, :, 25:26] = arr2[..., None]
or:
arr1[:, :, 25] = arr2
in place of arr1[:, :, range(25,26,1)] = [arr2].
Note that for ranges/slices that span more than a single number, the first line would rely on broadcasting.
The reason your original code does not work is that you are mixing NumPy arrays and Python lists in an incompatible way: NumPy interprets [arr2] as having shape (1, 6, 10), while the assignment target expects shape (6, 10, 1) (this is essentially what the error you are getting is about).
The solution above aims to put arr2 into a compatible shape.
Another possibility would have been to change the shape of the recipient, which would allow you to assign [arr2], e.g.:
arr1 = np.zeros([50,6,10])
arr2 = np.zeros([6,10])
arr1[25:26, :, :] = [arr2]
This method may be less efficient though, since arr2[..., None] is just a memory view of the same data in arr2, while [arr2] is creating (read: allocating new memory for) a new list object, which would require some casting (happening under the hood) to be assigned to a NumPy array.
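
The shapes involved can be inspected directly. The sketch below uses the slice 25:26 rather than the range from the question, so the wording of the error message may differ from the one quoted there.

```python
import numpy as np

arr1 = np.zeros([6, 10, 50])
arr2 = np.ones([6, 10])

print(np.shape([arr2]))          # (1, 6, 10): the list adds a leading axis
print(arr1[:, :, 25:26].shape)   # (6, 10, 1): what the target expects
print(arr2[..., None].shape)     # (6, 10, 1): compatible with the target

# Assigning the list fails because (1, 6, 10) cannot broadcast to (6, 10, 1).
try:
    arr1[:, :, 25:26] = [arr2]
    failed = False
except ValueError as exc:
    failed = True
    print("ValueError:", exc)

# Assigning the reshaped view works, and the view shares memory with arr2.
arr1[:, :, 25:26] = arr2[..., None]
print(np.shares_memory(arr2[..., None], arr2))
```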

Divide each plane of cube by its median without loop

I need to normalize a numpy data cube say:
cube = np.random.random(100000).reshape(10,100,100)
and then normalise each of the 10 resulting planes by the median. So, e.g. for the first plane
cube[0, :, :] /= np.median(cube[0, :, :])
I just want to avoid a loop if possible 😊
thanks

You can pass a list of axes to np.median and then expand via None (np.newaxis):
>>> cube = np.random.random(100000).reshape(10,100,100)
>>> simple = cube / np.median(cube,axis=[1,2])[:,None,None]
>>>
>>> brute = cube.copy()
>>> for i in range(10):
...     brute[i, :, :] /= np.median(cube[i, :, :])
...
>>> np.allclose(brute, simple)
True
To be honest, though, looping over the shortest axis often isn't so bad performance-wise if the other axes are much longer.
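
A variation on the same idea, not part of the answer above, is to let np.median keep the reduced axes with keepdims=True, which removes the need for the [:, None, None] indexing. The random data and seed below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
cube = rng.random((10, 100, 100))

# keepdims=True leaves the reduced axes in place with length 1,
# so the result broadcasts against cube without extra indexing.
medians = np.median(cube, axis=(1, 2), keepdims=True)
normalized = cube / medians

print(medians.shape)   # (10, 1, 1)

# After normalization, every plane should have a median of about 1.
print(np.allclose(np.median(normalized, axis=(1, 2)), 1.0))
```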

Iterate across arbitrary dimension in numpy

I have a multidimensional numpy array, and I need to iterate across a given dimension. Problem is, I won't know which dimension until runtime. In other words, given an array m, I could want
m[:,:,:,i] for i in range(n)
or I could want
m[:,:,i,:] for i in range(n)
etc.
I imagine that there must be a straightforward feature in numpy to write this, but I can't figure out what it is/what it might be called. Any thoughts?

There are many ways to do this. You could build the right index with a list of slices, or perhaps alter m's strides. However, the simplest way may be to use np.swapaxes:
import numpy as np
m=np.arange(24).reshape(2,3,4)
print(m.shape)
# (2, 3, 4)
Let axis be the axis you wish to loop over. m_swapped is the same as m except the axis=1 axis is swapped with the last (axis=-1) axis.
axis=1
m_swapped=m.swapaxes(axis,-1)
print(m_swapped.shape)
# (2, 4, 3)
Now you can just loop over the last axis:
for i in range(m_swapped.shape[-1]):
    assert np.all(m[:,i,:] == m_swapped[...,i])
Note that m_swapped is a view, not a copy, of m. Altering m_swapped will alter m.
m_swapped[1,2,0]=100
print(m)
assert(m[1,0,2]==100)
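
A related option, not used in the answer above, is np.moveaxis: moving the chosen axis to the front lets a plain for loop walk over it, because iterating over an array yields sub-arrays along its first axis. A minimal sketch, with axis standing in for the value known only at runtime:

```python
import numpy as np

m = np.arange(24).reshape(2, 3, 4)
axis = 1   # stand-in for the axis chosen at runtime

# Move the chosen axis to position 0, then iterate over it.
moved = np.moveaxis(m, axis, 0)
print(moved.shape)   # (3, 2, 4)

pieces = []
for i, sub in enumerate(moved):
    # Each sub-array corresponds to m[:, i, :] when axis == 1.
    assert np.array_equal(sub, m[:, i, :])
    pieces.append(sub)

print(len(pieces))   # 3
```

Like swapaxes, moveaxis returns a view, so writing to the sub-arrays modifies m.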

You can use slice(None) in place of the :. For example,
import numpy as np
d = 2  # the dimension to iterate
x = np.arange(5*5*5).reshape((5,5,5))
s = slice(None)  # :
for i in range(5):
    slicer = [s]*3  # [:, :, :]
    slicer[d] = i  # [:, :, i]
    print(x[tuple(slicer)])  # x[:, :, i]; the index must be a tuple
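
The same idea can be wrapped in a small function that works for any number of dimensions. The name index_along is hypothetical; the sketch checks it against ndarray.take, which accepts an axis argument directly.

```python
import numpy as np

def index_along(arr, axis, i):
    """Hypothetical helper: index position i on `axis`, ':' elsewhere."""
    slicer = [slice(None)] * arr.ndim
    slicer[axis] = i
    return arr[tuple(slicer)]   # a tuple is required for multi-axis indexing

x = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)

# Compare against x.take(i, axis=axis) for every axis and position.
all_match = all(
    np.array_equal(index_along(x, axis, i), x.take(i, axis=axis))
    for axis in range(x.ndim)
    for i in range(x.shape[axis])
)
print(all_match)
print(index_along(x, 2, 0).shape)   # (2, 3, 5)
```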
