Indexing whole Tensor along specific dimension and specific channels - python

Let's say we have a tensor A with dimensions dim(A) = [i, j, k=6, u, v]. We want the whole tensor along dimension k, restricted to channels [0:3]. I know we can get it this way:
B = A[:, :, 0:3, :, :]
Now I would like to know if there is a better, more "pythonic" way to achieve the same result without this suboptimal indexing. I mean something like:
B = subset(A, dim=2, index=[0, 1, 2])
Any framework is fine: PyTorch, TensorFlow, NumPy, etc.
Thanks a lot

In numpy, you can use the take method:
B = A.take([0,1,2], axis=2)
In TensorFlow, there is not really a more concise way than the traditional slicing approach. Using tf.slice would be quite verbose:
B = tf.slice(A,[0,0,0,0,0],[-1,-1,3,-1,-1])
You can potentially use the experimental version of take (since TF 2.4):
B = tf.experimental.numpy.take(A, [0,1,2], axis=2)
In PyTorch, you can use index_select:
B = torch.index_select(A, dim=2, index=torch.tensor([0,1,2]))
Note that you can avoid explicitly listing the leading (or trailing) dimensions by using an ellipsis:
# Both are equivalent in that case
B = A[..., 0:3, :, :]
B = A[:, :, 0:3, ...]
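As a quick sanity check, the NumPy variants above can be compared on a small array. This is only a sketch: the shape (2, 3, 6, 4, 5) is an arbitrary stand-in for [i, j, k=6, u, v], and the variable names are made up for illustration.

```python
import numpy as np

# Arbitrary example shape standing in for [i, j, k=6, u, v] (an assumption)
A = np.arange(2 * 3 * 6 * 4 * 5).reshape(2, 3, 6, 4, 5)

B_slice = A[:, :, 0:3, :, :]         # explicit slicing
B_take = A.take([0, 1, 2], axis=2)   # take along axis 2
B_ellipsis = A[..., 0:3, :, :]       # ellipsis stands for the leading axes

print(B_slice.shape)                           # (2, 3, 3, 4, 5)
print(np.array_equal(B_slice, B_take))         # True
print(np.array_equal(B_slice, B_ellipsis))     # True

# Slicing gives a view of A, while take builds a new array
print(np.shares_memory(A, B_slice))            # True
print(np.shares_memory(A, B_take))             # False
```

One practical difference: the sliced result shares memory with A, so writing into it modifies A, whereas the result of take is independent.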


Vectorized computation of NumPy's tensordot

I have two vectors containing tensors of shape (3,3) and shape (3,3,3,3) respectively. The vectors have the same length, and I am computing the element-wise tensor dot of these two vectors. For example, I want to vectorise the following computation to improve performance:
a = np.arange(9.).reshape(3,3)
b = np.arange(81.).reshape(3,3,3,3)
c = np.tensordot(a,b)
a_vec = np.asanyarray([a,a])
b_vec = np.asanyarray([b,b])
c_vec = np.empty(a_vec.shape)
for i in range(c_vec.shape[0]):
    c_vec[i, :, :] = np.tensordot(a_vec[i,:,:], b_vec[i,:,:,:,:])
print(np.allclose(c_vec[0], c))
# True
I thought about using numpy.einsum but can't figure out the correct subscripts. I have tried a lot of different approaches but failed so far on all of them:
# I am trying something like this
c_vec = np.einsum("ijk, ilmno -> ijo", a_vec, b_vec)
print(np.allclose(c_vec[0], c))
# False
But this does not reproduce the iterative computation I want above. If this can't be done using einsum, or if there is a more performant way to do it, I am open to any kind of solution.
A vectorized way with np.einsum would be:
c_vec = np.einsum('ijk,ijklm->ilm',a_vec,b_vec)
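np.tensordot has an axes argument you can use too, but note that it contracts every a_vec[i] with every b_vec[j], so the result has shape (N, N, 3, 3) and the element-wise products sit on its diagonal, c_pairs[i, i]. That makes it more expensive than einsum for this purpose:
c_pairs = np.tensordot(a_vec, b_vec, axes=([1, 2], [1, 2]))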
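The following sketch checks the vectorized variants against the explicit loop. The two batch entries are deliberately made different (2 * a and b + 1 are arbitrary choices) so that mixing up batch indices would be detected.

```python
import numpy as np

a = np.arange(9.).reshape(3, 3)
b = np.arange(81.).reshape(3, 3, 3, 3)
a_vec = np.stack([a, 2 * a])      # shape (2, 3, 3)
b_vec = np.stack([b, b + 1])      # shape (2, 3, 3, 3, 3)

# Reference: explicit loop over the batch axis
c_loop = np.empty_like(a_vec)
for i in range(a_vec.shape[0]):
    c_loop[i] = np.tensordot(a_vec[i], b_vec[i])

# einsum: i is the shared batch index, j and k are summed over
c_einsum = np.einsum('ijk,ijklm->ilm', a_vec, b_vec)

# tensordot contracts all pairs (i, j); keep only the diagonal i == j
c_pairs = np.tensordot(a_vec, b_vec, axes=([1, 2], [1, 2]))
idx = np.arange(a_vec.shape[0])
c_diag = c_pairs[idx, idx]

print(np.allclose(c_loop, c_einsum))   # True
print(c_pairs.shape)                   # (2, 2, 3, 3)
print(np.allclose(c_loop, c_diag))     # True
```

Because tensordot computes N * N contractions and then discards all but N of them, einsum is usually the better fit for a batched product like this.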

ValueError: shape mismatch: with same shape

I got a basic error with strange output that I do not understand very well.
Steps to reproduce:
arr1 = np.zeros([6,10,50])
arr2 = np.zeros([6,10])
arr1[:, :, range(25,26,1)] = [arr2]
That generates this error:
ValueError: shape mismatch: value array of shape (1,6,10) could not be broadcast to indexing result of shape (1,6,10)
Could anyone explain what I'm doing wrong ?
Add an extra dimension to arr2:
arr1[:, :, range(25,26,1)] = arr2.reshape(arr2.shape + (1,))
A simpler notation for the range used here:
arr1[:, :, 25:26] = arr2.reshape(arr2.shape + (1,))
(slice(25,26,1) or slice(25,26) would also work; just to add to the options and possible confusion.)
Or insert an extra axis at the end of arr2:
arr1[..., 25:26] = arr2[..., np.newaxis]
(where ... means "as many dimensions as possible"). You can also use None instead of np.newaxis; the latter is probably more explicit, but anyone knowing NumPy will recognise None as inserting an extra dimension (axis).
Of course, you could also set arr2 to be 3-dimensional from the start:
arr2 = np.zeros([6,10,1])
Note that broadcasting does work when used from the left:
>>> arr1 = np.zeros([50,6,10]) # Swapped ("rolled") dimensions
>>> arr2 = np.zeros([6,10])
>>> arr1[25:26, :, :] = arr2 # No need to add an extra axis
It's just that it doesn't work when used from the right, as in your code.
Since range(25, 26, 1) is actually a single number, you could use either:
arr1[:, :, 25:26] = arr2[..., None]
or:
arr1[:, :, 25] = arr2
in place of arr1[:, :, range(25,26,1)] = [arr2].
Note that for ranges/slices that do not reduce to a single number the first line would use broadcasting.
Your original code does not work because it mixes NumPy arrays and Python lists in an incompatible way: NumPy interprets [arr2] as having shape (1, 6, 10), while the indexing result expects shape (6, 10, 1) (this is essentially what the error is about).
The solution above makes sure that arr2 has a compatible shape.
Another possibility would have been to change the shape of the recipient, which would allow you to assign [arr2], e.g.:
arr1 = np.zeros([50,6,10])
arr2 = np.zeros([6,10])
arr1[25:26, :, :] = [arr2]
This method may be less efficient though, since arr2[..., None] is just a memory view of the same data in arr2, while [arr2] is creating (read: allocating new memory for) a new list object, which would require some casting (happening under the hood) to be assigned to a NumPy array.
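A minimal sketch of the behaviour discussed above: broadcasting aligns shapes from the right, so a (6, 10) value does not fit a (6, 10, 1) target until a trailing axis is added. Using np.ones instead of np.zeros for arr2 is only an assumption made here so that the assignment is visible in the result.

```python
import numpy as np

arr1 = np.zeros([6, 10, 50])
arr2 = np.ones([6, 10])   # ones instead of zeros so the writes are visible

# Shapes are aligned from the right: (6, 10) cannot fill a (6, 10, 1) target
try:
    arr1[:, :, 25:26] = arr2
    failed = False
except ValueError as exc:
    failed = True
    print("ValueError:", exc)

# Adding a trailing axis makes the shapes match: (6, 10, 1) into (6, 10, 1)
arr1[:, :, 25:26] = arr2[..., np.newaxis]

# An integer index drops the last axis, so the target is simply (6, 10)
arr1[:, :, 26] = arr2

print(failed)        # True
print(arr1.sum())    # 120.0
```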

What is the best way to keep dimensionality when subarraying numpy arrays?

Suppose I had a standard numpy array such as
a = np.arange(6).reshape((2,3))
When I take a subarray, for example with
a[1, :]
I lose a dimension: the result is 1D and prints as array([3, 4, 5]).
Since the original array is 2D, I would like to keep that dimensionality, so I have to do something tedious such as
b=a[1, :]
b.reshape(1, b.size)
Why does numpy decrease dimensionality when subarraying?
What is the best way to keep dimensionality, since a[1, :].reshape(1, a.size) will break?
Just use slicing rather than indexing, and the shape will be preserved:
a[1:2]
Although I agree with John Zwinck's answer, I wanted to provide an alternative in case, for whatever reason, you are forced into using indexing (instead of slicing).
OP says that "a[1, :].reshape(1, a.size) will break". That is because a.size is the total number of elements (6 here), not the length of one row (3).
You can add dimensions to numpy arrays like this:
b = a[1]
# array([3, 4, 5])
b = a[1][np.newaxis]
# array([[3, 4, 5]])
(Note that np.newaxis is None, but using np.newaxis is a lot more readable.)
As pointed out in the comments (@PaulPanzer and @Divakar), there are actually many ways to accomplish the same thing (again, with indexing instead of slicing).
These do not make a copy (changing the data in any of them affects a):
a[1, None]
a[1, np.newaxis]
a[1].reshape(1, a.shape[1]) # Use shape, not size
This one does make a copy (its data is independent of a):
a[[1]]
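The view-versus-copy distinction can be checked with np.shares_memory. This is just an illustrative sketch; the dictionary and its labels are made up for the comparison.

```python
import numpy as np

a = np.arange(6).reshape((2, 3))

# Each expression keeps the result 2D with shape (1, 3)
candidates = {
    "a[1:2]": a[1:2],
    "a[1, None]": a[1, None],
    "a[1][np.newaxis]": a[1][np.newaxis],
    "a[1].reshape(1, a.shape[1])": a[1].reshape(1, a.shape[1]),
    "a[[1]]": a[[1]],
}

for label, result in candidates.items():
    # shares_memory is True for views of a, False for independent copies
    print(label, result.shape, np.shares_memory(a, result))
```

Only the last entry, which uses a list as the index (advanced indexing), reports False; the others are views of a.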

Efficient transformations of 3D numpy arrays

I have some 3D numpy arrays that need to be transformed in various ways. E.g.:
x.shape = (4, 17, 17)
This array is 1 sample of 4 planes, each of size 17x17. What is the most efficient way to transform each plane: flipud, fliplr, and rot90? Is there a better way than using a for loop? Thanks!
for p in range(4):
    x[p, :, :] = np.fliplr(x[p, :, :])
Look at the code of these functions:
def fliplr(m):
    ...
    return m[:, ::-1]
In other words, it returns a view with reverse slicing on the 2nd dimension.
Your x[p, :, :] = np.fliplr(x[p, :, :]) applies that reverse slicing to the last dimension, so the equivalent for the whole array is
x[:, :, ::-1]
Flipping the 2nd axis (the equivalent of flipud on each plane) would be
x[:, ::-1, :]
etc.
np.rot90 has 4 cases (k); for k=1 it is
return fliplr(m).swapaxes(0, 1)
In other words, m[:, ::-1].swapaxes(0,1).
To work on your planes you would do something like
m[:, :,::-1].swapaxes(1,2)
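or you could do the swapaxes/transpose first (note that flipping the last axis after the transpose rotates each plane in the opposite direction, like k=3; flip axis 1 instead to match k=1)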
m.transpose(0,2,1)[:, :, ::-1]
Does that give you enough tools to transform the planes in whatever way you want?
As I discussed in another recent question, https://stackoverflow.com/a/41291462/901925, the flip... functions return a view. rot90, which combines a flip and a swap, also returns a view; data is only copied when you assign the result back into an array. Either way, NumPy will be giving you the most efficient version.
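The slicing expressions above can be verified against the per-plane loop. This is a sketch only: the array contents are arbitrary, and the `*_loop` names are introduced here for the comparison.

```python
import numpy as np

x = np.arange(4 * 17 * 17).reshape(4, 17, 17)   # 4 planes of 17x17

# Whole-array versions, no Python loop
lr = x[:, :, ::-1]                       # fliplr on every plane
ud = x[:, ::-1, :]                       # flipud on every plane
rot = x[:, :, ::-1].swapaxes(1, 2)       # rot90 (k=1) on every plane
rot_np = np.rot90(x, k=1, axes=(1, 2))   # same rotation via the axes argument

# Per-plane references built with a loop
lr_loop = np.stack([np.fliplr(p) for p in x])
ud_loop = np.stack([np.flipud(p) for p in x])
rot_loop = np.stack([np.rot90(p) for p in x])

print(np.array_equal(lr, lr_loop))       # True
print(np.array_equal(ud, ud_loop))       # True
print(np.array_equal(rot, rot_loop))     # True
print(np.array_equal(rot_np, rot_loop))  # True
```

If the result has to be written back into x, take a copy first (for example `x[...] = lr.copy()`), since the right-hand side would otherwise be a view of the array being overwritten.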

Convert NumPy vector to 2D array / matrix

What is the best way to convert a vector to a 2-dimensional array?
For example, a vector b of size (10, )
a = np.random.rand(10, 10)
b = a[1, :]
b.shape
Out: (10L,)
can be converted to an array of size (10,1) as
b = b.reshape(len(b), 1)
Is there a more concise way to do it?
Since you lose a dimension when indexing with a[1, :], the lost dimension needs to be replaced to maintain a 2D shape. With this in mind, you can make the selection using the syntax:
b = a[1, :, None]
Then b has the required shape of (10, 1). Note that None is the same as np.newaxis and inserts a new axis of length 1.
(This is the same thing as writing b = a[1, :][:, None] but uses only one indexing operation, hence saves a few microseconds.)
If you want to continue using reshape (which is also fine for this purpose), it's worth remembering that you can use -1 for (at most) one axis to have NumPy figure out what the correct length should be instead:
b.reshape(-1, 1)
Use np.newaxis:
In [139]: b.shape
Out[139]: (10,)
In [140]: b=b[:,np.newaxis]
In [142]: b.shape
Out[142]: (10, 1)
I think the clearest way of doing this is to use np.expand_dims, which adds an axis to the array. If you use axis=-1, the new axis is added as the last dimension.
b = np.expand_dims(b, axis=-1)
or, if you want to be more concise:
b = np.expand_dims(b, -1)
Although the question is old, I think it is still worth answering.
Use this style (note that it gives a row of shape (1, 10) rather than a column of shape (10, 1)):
b = a[1:2, :]
You can use np.asmatrix as well (here a is a vector of length 12):
a.shape #--> (12,)
np.asmatrix(a).shape #--> (1, 12)
np.asmatrix(a).T.shape #--> (12, 1)
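To compare the suggestions in this thread side by side, here is a small sketch. The random generator, its seed, and the variable names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded generator, an arbitrary choice
a = rng.random((10, 10))
b = a[1, :]                      # shape (10,)

# Four ways to get a (10, 1) column from row 1 of a
columns = [
    a[1, :, None],               # index and add the axis in one step
    b[:, np.newaxis],            # add a trailing axis to the vector
    b.reshape(-1, 1),            # let NumPy infer the first length
    np.expand_dims(b, axis=-1),  # explicit helper function
]

for col in columns:
    print(col.shape)             # (10, 1) each time

# Slicing instead of indexing keeps 2D, but as a row
row = a[1:2, :]
print(row.shape)                 # (1, 10)
print(row.T.shape)               # (10, 1)
```

All four column variants hold the same values; which one to use is mostly a matter of readability.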
