I want to replicate the torch.gather() function in TensorFlow 2.X.
I have a Tensor A (shape: [2, 3, 3]) and a corresponding Index-Tensor I (shape: [2, 2, 3]).
Using torch.gather() yields the following:
A = torch.tensor([[[10,20,30], [100,200,300], [1000,2000,3000]],
                  [[50,60,70], [500,600,700], [5000,6000,7000]]])
I = torch.tensor([[[0,1,0], [1,2,1]],
                  [[2,1,2], [1,0,1]]])
torch.gather(A, 1, I)
>
tensor([[[10, 200, 30], [100, 2000, 300]],
        [[5000, 600, 7000], [500, 60, 700]]])
I have tried using tf.gather(), but this did not yield pytorch-like results. I also tried to play around with tf.gather_nd(), but I could not find a suitable solution.
I found this StackOverflow post, but this seems not to work for me.
Edit:
When using tf.gather_nd(A, I), I get the following result:
tf.gather_nd(A, I)
>
[[100, 6000],
[ 0, 60]]
The result for tf.gather(A, I) is rather lengthy. It has the shape [2, 2, 3, 3, 3].
torch.gather and tf.gather_nd work differently and will therefore yield different results when using the same indices tensor (in some cases an error will also be returned). This is what the indices tensor would have to look like to get the same results:
import tensorflow as tf
A = tf.constant([[
[10,20,30], [100,200,300], [1000,2000,3000]],
[[50,60,70], [500,600,700], [5000,6000,7000]]])
I = tf.constant([[[
[0,0,0],
[0,1,1],
[0,0,2],
],[
[0,1,0],
[0,2,1],
[0,1,2],
]],
[[
[1,2,0],
[1,1,1],
[1,2,2],
],
[
[1,1,0],
[1,0,1],
[1,1,2],
]]])
print(tf.gather_nd(A, I))
tf.Tensor(
[[[ 10 200 30]
[ 100 2000 300]]
[[5000 600 7000]
[ 500 60 700]]], shape=(2, 2, 3), dtype=int32)
So, the question is actually how are you calculating your indices or are they always hard-coded? Also, check out this post on the differences of the two operations.
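If the indices are computed rather than hard-coded, the gather_nd-style index tensor shown above can be derived mechanically from the torch-style one. The following is a minimal NumPy sketch of that idea (NumPy is used here only for illustration; the names `nd_indices`, `result`, and `expected` are made up for this example). It keeps the position along every non-gather axis and substitutes the torch-style index along axis 1:

```python
import numpy as np

A = np.array([[[10, 20, 30], [100, 200, 300], [1000, 2000, 3000]],
              [[50, 60, 70], [500, 600, 700], [5000, 6000, 7000]]])
I = np.array([[[0, 1, 0], [1, 2, 1]],
              [[2, 1, 2], [1, 0, 1]]])

# Coordinates of every element of I along axes 0 and 2 (axis 1 is discarded).
b, _, c = np.indices(I.shape)

# Full (axis0, axis1, axis2) coordinates; axis 1 comes from I itself.
nd_indices = np.stack([b, I, c], axis=-1)   # shape (2, 2, 3, 3)

# gather_nd-style lookup: one coordinate triple per output element.
result = A[tuple(np.moveaxis(nd_indices, -1, 0))]

# np.take_along_axis follows the same semantics as torch.gather here.
expected = np.take_along_axis(A, I, axis=1)
print(result)
```

Here `nd_indices[0, 0]` is `[[0, 0, 0], [0, 1, 1], [0, 0, 2]]`, which matches the first block of the hand-written index tensor.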
As for the post you linked that didn't work for you, you just need to cast the indices and everything should be fine:
def torch_gather(x, indices, gather_axis):
    # Coordinates of every element of `indices`
    all_indices = tf.where(tf.fill(indices.shape, True))
    gather_locations = tf.reshape(indices, [indices.shape.num_elements()])
    gather_indices = []
    for axis in range(len(indices.shape)):
        if axis == gather_axis:
            # Along the gather axis, use the requested indices
            gather_indices.append(tf.cast(gather_locations, dtype=tf.int64))
        else:
            # Along every other axis, keep the element's own position
            gather_indices.append(tf.cast(all_indices[:, axis], dtype=tf.int64))
    gather_indices = tf.stack(gather_indices, axis=-1)
    gathered = tf.gather_nd(x, gather_indices)
    reshaped = tf.reshape(gathered, indices.shape)
    return reshaped
I = tf.constant([[[0,1,0], [1,2,1]],
[[2,1,2], [1,0,1]]])
A = tf.constant([[
[10,20,30], [100,200,300], [1000,2000,3000]],
[[50,60,70], [500,600,700], [5000,6000,7000]]])
print(torch_gather(A, I, 1))
tf.Tensor(
[[[ 10 200 30]
[ 100 2000 300]]
[[5000 600 7000]
[ 500 60 700]]], shape=(2, 2, 3), dtype=int32)
You could also try this as an equivalent to torch.gather:
import random
import numpy as np
import tensorflow as tf
import torch
# torch.gather equivalent
def tf_gather(x: tf.Tensor, indices: tf.Tensor, axis: int) -> tf.Tensor:
    # Coordinates of every element of `indices` (assumes all indices are non-negative)
    complete_indices = np.array(np.where(indices > -1))
    # Replace the coordinate along `axis` with the requested indices
    complete_indices[axis] = tf.reshape(indices, [-1])
    flat_ind = np.ravel_multi_index(tuple(complete_indices), x.shape)
    return tf.reshape(tf.gather(tf.reshape(x, [-1]), flat_ind), indices.shape)

# ======= test program ========
if __name__ == '__main__':
    a = np.random.rand(2, 5, 3, 4)
    dim = 2  # 0 <= dim < len(a.shape)
    ind = np.expand_dims(np.argmax(a, axis=dim), axis=dim)
    # ========== np: groundtruth ==========
    np_max = np.expand_dims(np.max(a, axis=dim), axis=dim)
    # ========= torch: gather =========
    torch_max = torch.gather(torch.tensor(a), dim=dim, index=torch.tensor(ind))
    # ========= tensorflow: torch-like gather =========
    tf_max = tf_gather(tf.convert_to_tensor(a), axis=dim, indices=tf.convert_to_tensor(ind))
    keepdim = False
    if not keepdim:
        np_max = np.squeeze(np_max, axis=dim)
        torch_max = torch.squeeze(torch_max, dim=dim)
        tf_max = tf.squeeze(tf_max, axis=dim)
    # print('np_max:\n', np_max)
    # print('torch_max:\n', torch_max)
    # print('tf_max:\n', tf_max)
    assert np.allclose(np_max, torch_max.numpy()), '\33[1m\33[31mError with torch\33[0m'
    assert np.allclose(np_max, tf_max.numpy()), '\33[1m\33[31mError with tensorflow\33[0m'
    print('\33[1m\33[32mSuccess!\33[0m')
I have a multidimensional array. Once a critical value appears along the last dimension, I would like to overwrite the rest (the tail) of that dimension.
np.random.seed(100)
arr = np.random.uniform(size=100).reshape([2,5,2,5])
# array([[[[ 0.54340494, 0.27836939, 0.42451759, 0.84477613, 0.00471886],
# [ 0.12156912, 0.67074908, 0.82585276, 0.13670659, 0.57509333]],
# [[ 0.89132195, 0.20920212, 0.18532822, 0.10837689, 0.21969749],
# [ 0.97862378, 0.81168315, 0.17194101, 0.81622475, 0.27407375]],
# [[ 0.43170418, 0.94002982, 0.81764938, 0.33611195, 0.17541045],
# [ 0.37283205, 0.00568851, 0.25242635, 0.79566251, 0.01525497]],
# [[ 0.59884338, 0.60380454, 0.10514769, 0.38194344, 0.03647606],
# [ 0.89041156, 0.98092086, 0.05994199, 0.89054594, 0.5769015 ]],
# [[ 0.74247969, 0.63018394, 0.58184219, 0.02043913, 0.21002658],
# [ 0.54468488, 0.76911517, 0.25069523, 0.28589569, 0.85239509]]],
# [[[ 0.97500649, 0.88485329, 0.35950784, 0.59885895, 0.35479561],
# [ 0.34019022, 0.17808099, 0.23769421, 0.04486228, 0.50543143]],
# [[ 0.37625245, 0.5928054 , 0.62994188, 0.14260031, 0.9338413 ],
# [ 0.94637988, 0.60229666, 0.38776628, 0.363188 , 0.20434528]],
# [[ 0.27676506, 0.24653588, 0.173608 , 0.96660969, 0.9570126 ],
# [ 0.59797368, 0.73130075, 0.34038522, 0.0920556 , 0.46349802]],
# [[ 0.50869889, 0.08846017, 0.52803522, 0.99215804, 0.39503593],
# [ 0.33559644, 0.80545054, 0.75434899, 0.31306644, 0.63403668]],
# [[ 0.54040458, 0.29679375, 0.1107879 , 0.3126403 , 0.45697913],
# [ 0.65894007, 0.25425752, 0.64110126, 0.20012361, 0.65762481]]]])
Let's say the critical value is 0.80. We need to overwrite all further values after the first value of 0.80 or higher. Let's focus on the first two "rows", which correspond to [3,2] in the result of np.argmax.
where_bigger = np.argmax(arr >= 0.80, axis = 3)
# array([[[3, 2], ## used as example later !!!!!!!!!
# [0, 0],
# [1, 0],
# [0, 0],
# [0, 4]],
# [[0, 0],
# [4, 0],
# [3, 0],
# [3, 1],
# [0, 0]]])
As an example, we first focus on the element with index 3 in [3,2] (marked with !!!! above). Once we find a value of 0.80 or higher (here at index 3), all following values should be replaced with np.nan:
arr[0,0,0,3] ## 0.84477613 comes as first element in [3,2]
# [ 0.54340494, 0.27836939, 0.42451759, 0.84477613, np.nan]
Similarly, we focus on element 2 of [3,2] and need to set all following elements to np.nan:
arr[0,0,1,2] ## 0.82585276 comes as second element in [3,2]
# [ 0.12156912, 0.67074908, 0.82585276, np.nan, np.nan]
In the end, we repeat this for all elements found by argmax:
# array([[[[ 0.54340494, 0.27836939, 0.42451759, 0.84477613, np.nan],
# [ 0.12156912, 0.67074908, 0.82585276, np.nan, np.nan]],
# [[ 0.89132195, np.nan, np.nan, np.nan, np.nan],
# [ 0.97862378, np.nan, np.nan, np.nan, np.nan]],
# [[ 0.43170418, 0.94002982, np.nan, np.nan, np.nan],
# ...
Is it possible to adjust the whole array at once without looping? It can probably be done with slicing. I would like to use something like
arr[where_bigger:] = np.nan, but that is clearly wrong, and so far I have not been able to progress further.
Your best bet is some type of boolean mask. You can build the tail with np.logical_or.accumulate, but that will include the index of the threshold value itself. If you want to keep the first instance, you'll have to pad the mask, i.e. shift it right by one position along the last axis.
mask = np.c_[np.zeros(arr.shape[:-1] + (1,), dtype = bool), np.logical_or.accumulate(arr > .8, axis = -1)[...,:-1]]
arr[mask] = np.nan
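To see what the padded mask does, here is a small sketch on a made-up 2x5 array (the sample values and the names `hit`, `pad`, and `out` are assumptions for illustration). It uses `>=` to match the question's criterion and np.concatenate in place of np.c_, which should be equivalent for this purpose:

```python
import numpy as np

arr = np.array([[0.5, 0.2, 0.9, 0.1, 0.85],
                [0.1, 0.3, 0.4, 0.5, 0.6]])

# True from the first value >= 0.8 onwards (inclusive).
hit = np.logical_or.accumulate(arr >= 0.8, axis=-1)

# Shift right by one so the threshold value itself is kept.
pad = np.zeros(arr.shape[:-1] + (1,), dtype=bool)
mask = np.concatenate([pad, hit[..., :-1]], axis=-1)

out = arr.copy()
out[mask] = np.nan
print(out)
```

The second row contains no value at or above the threshold, so its mask stays all False and the row is left untouched. An approach based on np.argmax alone could not distinguish that case from a hit at index 0, since argmax returns 0 for both.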
I have a NumPy array with the following shape:
(1532, 2036, 5)
I would like to generate a list of arrays where each one has the following shape:
(1532, 2036)
You can use Ellipsis to signify all dimensions up to the last. For example:
arr = np.random.rand(4, 3, 2)
arr
array([[[ 0.35235813, 0.57984153],
[ 0.53743048, 0.46753367],
[ 0.80048303, 0.07982378]],
[[ 0.1339381 , 0.84586721],
[ 0.81425027, 0.41086151],
[ 0.34039991, 0.19972737]],
[[ 0.2112466 , 0.73086434],
[ 0.03755819, 0.40113463],
[ 0.74622891, 0.74695994]],
[[ 0.99313615, 0.65634951],
[ 0.90787642, 0.37387861],
[ 0.8738962 , 0.41747727]]])
The list of the last dimension arrays can be constructed as @Usernamenotfound mentioned or with Ellipsis like so:
[arr[..., i] for i in range(arr.shape[-1])]
[array([[ 0.35235813, 0.53743048, 0.80048303],
[ 0.1339381 , 0.81425027, 0.34039991],
[ 0.2112466 , 0.03755819, 0.74622891],
[ 0.99313615, 0.90787642, 0.8738962 ]]),
array([[ 0.57984153, 0.46753367, 0.07982378],
[ 0.84586721, 0.41086151, 0.19972737],
[ 0.73086434, 0.40113463, 0.74695994],
[ 0.65634951, 0.37387861, 0.41747727]])]
Each element has the shape (4, 3).
Likewise, you could do the same for the first dimension, making 4 arrays of shape (3, 2).
[arr[i, ...] for i in range(arr.shape[0])]
[array([[ 0.35235813, 0.57984153],
[ 0.53743048, 0.46753367],
[ 0.80048303, 0.07982378]]), array([[ 0.1339381 , 0.84586721],
[ 0.81425027, 0.41086151],
[ 0.34039991, 0.19972737]]), array([[ 0.2112466 , 0.73086434],
[ 0.03755819, 0.40113463],
[ 0.74622891, 0.74695994]]), array([[ 0.99313615, 0.65634951],
[ 0.90787642, 0.37387861],
[ 0.8738962 , 0.41747727]])]
You can also permute the axes with numpy.transpose then simply iterate through the array:
import numpy as np
arr = ... # Define the input array here
out = list(np.transpose(arr, (2, 0, 1))) # iterating yields one 2D array per index of the last axis
You can slice the 3D array using
[x[:,:,i] for i in range(5)]
The above would give you a list of 2D arrays.
The same approach extends to arrays with more dimensions.
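As a quick check of the approaches above, the sketch below uses a small made-up array of shape (4, 3, 5) in place of the (1532, 2036, 5) one, and np.moveaxis as one more way to bring the last axis to the front. The variable names are assumptions for illustration:

```python
import numpy as np

arr = np.arange(4 * 3 * 5).reshape(4, 3, 5)

# Three ways to split along the last axis.
by_ellipsis = [arr[..., i] for i in range(arr.shape[-1])]
by_transpose = list(np.transpose(arr, (2, 0, 1)))
by_moveaxis = list(np.moveaxis(arr, -1, 0))

print(len(by_ellipsis), by_ellipsis[0].shape)
```

Each element is a view into `arr` rather than a copy, so writing to one of the slices also changes the original array.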
Can someone please help me to understand why sometimes the advanced selection doesn't work and what I can do to get it to work (2nd case)?
>>> import numpy as np
>>> b = np.random.rand(5, 14, 3, 2)
# advanced selection works as expected
>>> b[[0,1],[0,1]]
array([[[ 0.7575555 , 0.18989068],
[ 0.06816789, 0.95760398],
[ 0.88358107, 0.19558106]],
[[ 0.62122898, 0.95066355],
[ 0.62947885, 0.00297711],
[ 0.70292323, 0.2109297 ]]])
# doesn't work - why?
>>> b[[0,1],[0,1,2]]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: shape mismatch: objects cannot be broadcast to a single shape
# but this seems to work
>>> b[:,[0,1,2]]
array([[[[ 7.57555496e-01, 1.89890676e-01],
[ 6.81678915e-02, 9.57603975e-01],
[ 8.83581071e-01, 1.95581063e-01]],
[[ 2.24896112e-01, 4.77818599e-01],
[ 4.29313861e-02, 8.61578045e-02],
[ 4.80092364e-01, 3.66821618e-01]],
...
Update
Breaking up the selection seems to resolve the problem, but I am unsure why this is necessary (or if there's a better way to achieve this).
>>> b.shape
(5, 14, 3, 2)
>>> b[[0,1]].shape
(2, 14, 3, 2)
# trying to separate indexing by dimension.
>>> b[[0,1]][:,[0,1,2]]
array([[[[ 0.7575555 , 0.18989068],
[ 0.06816789, 0.95760398],
[ 0.88358107, 0.19558106]],
[[ 0.22489611, 0.4778186 ],
[ 0.04293139, 0.0861578 ],
You want
b[np.ix_([0, 1], [0, 1, 2])]
You also need to do the same thing for b[[0, 1], [0, 1]], because that's not actually doing what you think it is:
b[np.ix_([0, 1], [0, 1])]
The problem here is that advanced indexing does something completely different from what you think it does. You've made the mistake of thinking that b[[0, 1], [0, 1, 2]] means "take all parts b[i, j] of b where i is 0 or 1 and j is 0, 1, or 2". This is a reasonable mistake to make, considering that it seems to work that way when you have one list in the indexing expression, like
b[:, [1, 3, 5], 2]
In fact, for an array A and one-dimensional integer arrays I and J, A[I, J] is an array where
A[I, J][n] == A[I[n], J[n]]
This generalizes in the natural way to more index arrays, so for example
A[I, J, K][n] == A[I[n], J[n], K[n]]
and to higher-dimensional index arrays, so if I and J are two-dimensional, then
A[I, J][m, n] == A[I[m, n], J[m, n]]
It also applies the broadcasting rules to the index arrays, and converts lists in the indexes to arrays. This is much more powerful than what you expected to happen, but it means that to do what you were trying to do, you need something like
b[[[0],
[1]], [[0, 1, 2]]]
np.ix_ is a helper that will do that for you so you don't have to write a dozen brackets.
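The difference between the two behaviours can be checked directly. This is a minimal sketch using a deterministic array of the same shape as in the question (np.arange instead of random values, and the names `paired`, `outer`, `manual`, and `chained` are chosen here for illustration):

```python
import numpy as np

b = np.arange(5 * 14 * 3 * 2).reshape(5, 14, 3, 2)

# Paired indexing: picks b[0, 0] and b[1, 1], nothing else.
paired = b[[0, 1], [0, 1]]

# Outer-product style selection: every i in {0, 1} with every j in {0, 1, 2}.
outer = b[np.ix_([0, 1], [0, 1, 2])]

# The same selection with the brackets written by hand ...
manual = b[[[0], [1]], [[0, 1, 2]]]

# ... and with the two-step indexing from the question's update.
chained = b[[0, 1]][:, [0, 1, 2]]

print(paired.shape, outer.shape)
```

`paired` has shape (2, 3, 2), while the other three all have shape (2, 3, 3, 2) and hold the same values.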
I think you misunderstood the advanced selection syntax for this case. I used your example, just made it smaller to be easier to see.
import numpy as np
b = np.random.rand(5, 4, 3, 2)
# advanced selection works as expected
print(b[[0,1],[0,1]]) # http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html
# this picks b[0,0] (a 3x2 matrix) and b[1,1] (another 3x2 matrix)
# doesn't work - why?
#print(b[[0,1],[0,1,2]]) # this doesn't work because [0,1] and [0,1,2] have different lengths
print(b[[0,1,2],[0,1,2]]) # works
Output:
[[[ 0.27334558 0.90065184]
[ 0.8624593 0.34324983]
[ 0.19574819 0.2825373 ]]
[[ 0.38660087 0.63941692]
[ 0.81522421 0.16661912]
[ 0.81518479 0.78655536]]]
[[[ 0.27334558 0.90065184]
[ 0.8624593 0.34324983]
[ 0.19574819 0.2825373 ]]
[[ 0.38660087 0.63941692]
[ 0.81522421 0.16661912]
[ 0.81518479 0.78655536]]
[[ 0.65336551 0.1435357 ]
[ 0.91380873 0.45225145]
[ 0.57255923 0.7645396 ]]]
I am trying to make my program faster.
I have a matrix and a vector:
import numpy as N
GDES = N.array([[1,2,3,4,5],
                [6,7,8,9,10],
                [11,12,13,14,15],
                [16,17,18,19,20],
                [21,22,23,24,25]])
Ene = N.array([1,2,3,4,5])
NN = len(GDES)
I have defined a function for matrix multiplication:
def Gl(n,np,k,q):
    matrix = GDES[k,np]*GDES[k,n]*GDES[q,np]*GDES[q,n]
    return matrix
and I have made a for loop in my calculation:
SIl = N.zeros((NN,NN), float)
for n in range(NN):
    for np in range(NN):
        # sum over both k and q
        SumJ = sum(Gl(n,np,k,q) for k in range(NN) for q in range(NN))
        SIl[n,np] = SumJ
print('SIl:', SIl)
output:
SIl: [[ 731025. 828100. 931225. 1040400. 1155625.]
[ 828100. 940900. 1060900. 1188100. 1322500.]
[ 931225. 1060900. 1199025. 1345600. 1500625.]
[ 1040400. 1188100. 1345600. 1512900. 1690000.]
[ 1155625. 1322500. 1500625. 1690000. 1890625.]]
I want to use newaxis to make it faster:
def G():
    Mknp = GDES[:, :, N.newaxis, N.newaxis]
    Mkn = GDES[:, N.newaxis, :, N.newaxis]
    Mqnp = GDES[:, N.newaxis, N.newaxis, :]
    Mqn = GDES[N.newaxis, :, :, N.newaxis]
    matrix = Mknp*Mkn*Mqnp*Mqn
    return matrix
tmp = G()
MGI = N.sum(N.sum(tmp,axis=3), axis=2)
MGI = N.reshape(MGI,(NN,NN))
print('MGI:', MGI)
output:
MGI: [[ 825 3900 9225 16800 26625]
[ 31200 92400 169600 262800 372000]
[ 146575 413400 722475 1073800 1467375]
[ 403200 1116900 1911600 2787300 3744000]
[ 857325 2352900 3980725 5740800 7633125]]
Any idea how I can get the right answer?
Your problem is a perfect fit for np.einsum:
>>> GDES = np.arange(1, 26).reshape(5, 5)
>>> np.einsum('kj,ki,lj,li->ij', GDES, GDES, GDES, GDES)
array([[ 731025, 828100, 931225, 1040400, 1155625],
[ 828100, 940900, 1060900, 1188100, 1322500],
[ 931225, 1060900, 1199025, 1345600, 1500625],
[1040400, 1188100, 1345600, 1512900, 1690000],
[1155625, 1322500, 1500625, 1690000, 1890625]])
For your particular case, this other syntax may be easier to figure out:
>>> np.einsum(GDES, [2,1], GDES, [2,0], GDES, [3,1], GDES, [3,0], [0,1])
array([[ 731025, 828100, 931225, 1040400, 1155625],
[ 828100, 940900, 1060900, 1188100, 1322500],
[ 931225, 1060900, 1199025, 1345600, 1500625],
[1040400, 1188100, 1345600, 1512900, 1690000],
[1155625, 1322500, 1500625, 1690000, 1890625]])
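For comparison, here is a sketch of how the newaxis attempt from the question could be arranged so that each factor gets its own consistent axes. The axis order (k, q, n, np) and the variable names are choices made for this example, not the only option. Since the sum over k and the sum over q factor apart, the result should also equal the elementwise square of GDES.T @ GDES:

```python
import numpy as np

GDES = np.arange(1, 26).reshape(5, 5)

# Axis layout of the 4D product: (k, q, n, np)
Mknp = GDES[:, np.newaxis, np.newaxis, :]   # GDES[k, np]
Mkn = GDES[:, np.newaxis, :, np.newaxis]    # GDES[k, n]
Mqnp = GDES[np.newaxis, :, np.newaxis, :]   # GDES[q, np]
Mqn = GDES[np.newaxis, :, :, np.newaxis]    # GDES[q, n]

# Sum over k and q, leaving an (n, np) matrix.
via_newaxis = (Mknp * Mkn * Mqnp * Mqn).sum(axis=(0, 1))

# The double sum factors into (sum_k GDES[k, n] * GDES[k, np]) ** 2.
via_gram = (GDES.T @ GDES) ** 2

via_einsum = np.einsum('kj,ki,lj,li->ij', GDES, GDES, GDES, GDES)
print(via_newaxis)
```

The newaxis version builds an intermediate array with NN**4 elements, so for larger matrices the einsum or the GDES.T @ GDES form is likely the more economical choice.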