Gather elements along second dimension of tensor - python

Assume values and tensor T both have shape (N,K). Thinking of them as matrices, I would like to get, for each row of T, the element at the index where the corresponding row of values has its maximum. I can easily find those indices with
max_indices = tf.argmax(values, 1)
which returns a tensor of shape (N). Now, how can I gather up these indices from T such that I get something of shape N? I tried
result = tf.gather(T,max_indices)
but it doesn't do the right thing - it returns something of shape (N,K) which means that it didn't gather up anything.

You can use tf.gather_nd.
For example,
import tensorflow as tf
sess = tf.InteractiveSession()
values = tf.constant([[0, 0, 0, 1],
                      [0, 1, 0, 0],
                      [0, 0, 1, 0]])
T = tf.constant([[0, 1, 2, 3],
                 [4, 5, 6, 7],
                 [8, 9, 10, 11]])
max_indices = tf.argmax(values, axis=1)
# If T.get_shape()[0] is None, you can replace it with tf.shape(T)[0].
result = tf.gather_nd(T, tf.stack((tf.range(T.get_shape()[0],
                                            dtype=max_indices.dtype),
                                   max_indices),
                                  axis=1))
print(result.eval())
However, when the ranks of values and T are higher, using tf.gather_nd becomes a little awkward. I posted my current solution on this question. There might be a better solution for high-dimensional values and T.
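For reference, here is a minimal sketch of the same lookup, assuming TensorFlow 2.x with eager execution (so no session is needed). It first repeats the tf.gather_nd approach and then shows tf.gather with batch_dims=1, which is not part of the answer above but may be less awkward, since it takes one index per row directly.

```python
import tensorflow as tf

values = tf.constant([[0, 0, 0, 1],
                      [0, 1, 0, 0],
                      [0, 0, 1, 0]])
T = tf.constant([[0, 1, 2, 3],
                 [4, 5, 6, 7],
                 [8, 9, 10, 11]])

# Column index of the maximum in each row of `values`; shape (3,), dtype int64.
max_indices = tf.argmax(values, axis=1)

# Approach from the answer: build explicit (row, column) pairs for gather_nd.
rows = tf.range(T.shape[0], dtype=max_indices.dtype)
result_nd = tf.gather_nd(T, tf.stack((rows, max_indices), axis=1))

# Alternative (assumption: TF 2.x): treat axis 0 as a batch dimension,
# so each row of T is indexed by its own entry of max_indices.
result_batched = tf.gather(T, max_indices, axis=1, batch_dims=1)

print(result_nd.numpy())       # [ 3  5 10]
print(result_batched.numpy())  # [ 3  5 10]
```

Both results have shape (N,), which is what the question asks for.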

Related

Torch - How to calculate average of tensors with the same indexes

Suppose we have two matrices: X(m, n) and an index matrix I(m, 1). Each item I_k of the index matrix is the index assigned to the kth row X_k of X.
Suppose also that the indices lie in the range [0, 1, 2, ..., j-1].
I would like to calculate the average of the rows of X that share the same index i and return a result matrix R(j, n).
For example,
X = [[1, 1, 1],
[2, 2, 2],
[3, 3, 3]]
I = [0, 0, 1]
The result matrix should be:
R = [[torch.mean(([1, 1, 1], [2, 2, 2]))],
[torch.mean(([3, 3, 3]))]]
which equals:
R = [[1.5, 1.5, 1.5],
[3, 3, 3]]
My current solution is to traverse through m, stack the tensors with the same index and perform torch.mean.
Is there a way to avoid traversing through m? It seems inelegant and rather time-consuming.
ret = torch.empty_like(X)
ret.scatter_reduce_(0, I.unsqueeze(-1).expand_as(X), X, "mean", include_self=False)
should do what you want.
Now, note that this is a fairly new method so it may not be particularly performant. If you bump into an issue with this method, you may be better off running scatter_add_ on the tensor X and a tensor of ones and then divide.
If you also want a smaller tensor as output, you need to figure out how many distinct indices there are and use that to set the size of the output tensor (ret above).
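The fallback mentioned above (scatter_add_ on X and on a tensor of ones, then divide) could look like the sketch below. It assumes X is a floating-point tensor and I is a 1-D int64 tensor whose values cover 0..j-1; the names j, sums and counts are introduced here for illustration only.

```python
import torch

X = torch.tensor([[1., 1., 1.],
                  [2., 2., 2.],
                  [3., 3., 3.]])
I = torch.tensor([0, 0, 1])

# Number of groups, inferred from the largest index (assumes indices start at 0).
j = int(I.max()) + 1

# Sum the rows of X that share an index.
sums = torch.zeros(j, X.shape[1], dtype=X.dtype)
sums.scatter_add_(0, I.unsqueeze(-1).expand_as(X), X)

# Count how many rows fell into each group.
counts = torch.zeros(j, dtype=X.dtype)
counts.scatter_add_(0, I, torch.ones_like(I, dtype=X.dtype))

# Mean per group; a group with zero rows would produce nan here.
R = sums / counts.unsqueeze(-1)
print(R)
```

The output has shape (j, n) rather than the shape of X, which also covers the "smaller tensor" case.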

diagonalize multiple vectors using numpy

Say I have a matrix of shape (2,3). I need to turn each 3-element vector into a diagonal matrix of shape (3,3), for both vectors at once. That is, I need to return a matrix with shape (2,3,3). How can I do that elegantly with NumPy?
Given data = np.array([[1,2,3],[4,5,6]])
I want the result [[[1,0,0],
[0,2,0],
[0,0,3]],
[[4,0,0],
[0,5,0],
[0,0,6]]]
Thanks
tl;dr, my one-liner: mydiag=np.vectorize(np.diag, signature='(n)->(n,n)')
I suppose here that by "diagonalize" you mean "applying np.diag".
As a teacher of linear algebra, that tickles me a bit, since "diagonalizing" has a specific meaning, which is not that: it means computing eigenvectors and eigenvalues and, from there, writing M=P⁻¹ΛP, which you cannot do from the inputs you have.
So, I suppose that if input matrix is
[[1, 2, 3],
[9, 8, 7]]
The output matrix you want is
[[[1, 0, 0],
[0, 2, 0],
[0, 0, 3]],
[[9, 0, 0],
[0, 8, 0],
[0, 0, 7]]]
If not, you can ignore this answer [Edit: in the meantime, you explained exactly that. So you may continue to read].
There are many ways to do that.
My one liner would be
mydiag=np.vectorize(np.diag, signature='(n)->(n,n)')
This builds a new function which does what you want: it interprets the input as a list of 1D arrays, calls np.diag on each of them to get a 2D array, and puts each 2D array in a numpy array, thus producing a 3D array.
Then, you just call mydiag(M)
One advantage of vectorize is that it handles numpy broadcasting for you. It is tempting to conclude that the loops are executed in C, not in Python, and that it is therefore faster. It is not: the NumPy documentation notes that vectorize is provided primarily for convenience, not for performance, and that its implementation is essentially a for loop. Indeed, on small matrices it is slower than Michael's method (in the comments); on large matrices it has exactly the same speed. Which is frustrating, since the einsum doc itself specifies that it sacrifices broadcasting.
Plus, it is a one-liner, which has no other interest than bragging on forums. But well, here we are.
Here is one way with indexing:
out = np.zeros(data.shape+(data.shape[-1],), dtype=data.dtype)
x,y = np.indices(data.shape).reshape(2, -1)
out[x,y,y] = data.ravel()
output:
array([[[1, 0, 0],
[0, 2, 0],
[0, 0, 3]],
[[4, 0, 0],
[0, 5, 0],
[0, 0, 6]]])
We use array indexing to precisely grab those elements that are on the diagonal. Note that array indexing allows broadcasting between the indices, so we have index1 contain the index of the array, and index2 contain the index of the diagonal element.
index1 = np.arange(2)[:, None] # 2 is the number of arrays
index2 = np.arange(3)[None, :] # 3 is the square size of each matrix
result = np.zeros((2, 3, 3))
result[index1, index2, index2] = data
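As a quick sanity check, the three approaches above can be run side by side on the example from the question. This is only a sketch; the variable names out_vectorize, out_indices and out_broadcast are introduced here for the comparison.

```python
import numpy as np

data = np.array([[1, 2, 3], [4, 5, 6]])
n_arrays, size = data.shape

# Approach 1: np.diag applied to each row via np.vectorize.
mydiag = np.vectorize(np.diag, signature='(n)->(n,n)')
out_vectorize = mydiag(data)

# Approach 2: flat (array, position) index pairs from np.indices.
out_indices = np.zeros(data.shape + (size,), dtype=data.dtype)
x, y = np.indices(data.shape).reshape(2, -1)
out_indices[x, y, y] = data.ravel()

# Approach 3: two index arrays that broadcast to shape (n_arrays, size).
index1 = np.arange(n_arrays)[:, None]
index2 = np.arange(size)[None, :]
out_broadcast = np.zeros((n_arrays, size, size), dtype=data.dtype)
out_broadcast[index1, index2, index2] = data

print(out_vectorize.shape)  # (2, 3, 3)
print(np.array_equal(out_vectorize, out_indices))
print(np.array_equal(out_indices, out_broadcast))
```

All three produce the same (2, 3, 3) array; they differ only in how the diagonal positions are addressed.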

How can I index each occurrence of a max value along a given axis of a numpy array?

Suppose I have the following numpy array.
Q = np.array([[0,1,1],[1,0,1],[0,2,0]])
Question: How do I identify the position of each max value along axis 1? So the desired output would be something like:
array([[1,2],[0,2],[1]]) # The dtype of the output is not required to be a np array.
With np.argmax I can identify the first occurrence of the maximum along the axis, but not the subsequent values.
In: np.argmax(Q, axis =1)
Out: array([1, 0, 1])
I've also seen answers that rely on using np.argwhere that use a term like this.
np.argwhere(Q == np.amax(Q))
This will also not work here because I can't limit argwhere to work along a single axis. I also can't just flatten out the np array to a single axis because the max's in each row will differ. I need to identify each instance of the max of each row.
Is there a pythonic way to achieve this without looping through each row of the entire array, or is there a function analogous to np.argwhere that accepts an axis argument?
Any insight would be appreciated thanks!
Try with np.where:
np.where(Q == Q.max(axis=1)[:,None])
Output:
(array([0, 0, 1, 1, 2]), array([1, 2, 0, 2, 1]))
Not quite the output you want, but contains equivalent information.
You can also use np.argwhere, which gives you the same information as zipped (row, column) pairs:
np.argwhere(Q==Q.max(axis=1)[:,None])
Output:
array([[0, 1],
[0, 2],
[1, 0],
[1, 2],
[2, 1]])
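If the per-row list from the question is really needed, the np.where output can be regrouped by row. The sketch below is one possible way to do it; it relies on np.where returning row indices in sorted order, and the names counts and per_row are introduced here for illustration.

```python
import numpy as np

Q = np.array([[0, 1, 1], [1, 0, 1], [0, 2, 0]])

# Row and column positions of every entry equal to its row maximum.
rows, cols = np.where(Q == Q.max(axis=1)[:, None])

# How many maxima each row has; every row has at least one.
counts = np.bincount(rows, minlength=Q.shape[0])

# Cut the column indices at the row boundaries.
per_row = [c.tolist() for c in np.split(cols, np.cumsum(counts)[:-1])]
print(per_row)  # [[1, 2], [0, 2], [1]]
```

The result is a plain Python list of lists, since rows may have different numbers of maxima.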

Slicing a tensor with a tensor of indices and tf.gather

I am trying to slice a tensor with an index tensor. For this purpose I am trying to use tf.gather.
However, I am having a hard time understanding the documentation and don't get it to work as I would expect it to:
I have two tensors. An activations tensor with a shape of [1,240,4] and an ids tensor with the shape [1,1,120]. I want to slice the second dimension of the activations tensor with the indices provided in the third dimension of the ids tensor:
downsampled_activations = tf.gather(activations, ids, axis=1)
I have given it the axis=1 option since that is the axis in the activations tensor I want to slice.
However, this does not render the expected result and only gives me the following error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[0,0,1] = 1 is not in [0, 1)
I have tried various combinations of the axis and batch_dims options, but to no avail so far, and the documentation doesn't really help me on my path. If anybody could explain the parameters in more detail, or using the example above, that would be very helpful!
Edit:
The IDs are precomputed before runtime and come in through an input pipeline as such:
features = tf.io.parse_single_example(
    serialized_example,
    features={'featureIDs': tf.io.FixedLenFeature([], tf.string)}
)
They are then reshaped into the previous format:
feature_ids_raw = tf.decode_raw(features['featureIDs'], tf.int32)
feature_ids_shape = tf.stack([batch_size, (num_neighbours * 4)])
feature_ids = tf.reshape(feature_ids_raw, feature_ids_shape)
feature_ids = tf.expand_dims(feature_ids, 0)
Afterwards they have the previously mentioned shape (batch_size = 1 and num_neighbours = 30 -> [1,1,120]) and I want to use them to slice the activations tensor.
Edit2: I would like the output to be [1,120,4]. (So I would like to gather the entries along the second dimension of the activations tensor in accordance with the IDs stored in my ids tensor.)
You can use:
downsampled_activations = tf.gather(activations, tf.squeeze(ids), axis=1)
downsampled_activations.shape # [1,120,4]
In most cases, tf.gather is used with 1-D indices, and that is what you need here: instead of 3-D indices of shape (1,1,120), 1-D indices of shape (120,) are sufficient. tf.gather looks along the given axis (= 1) and returns the slice found at each index provided by the indices tensor.
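A small sketch of this, assuming TensorFlow 2.x with eager execution. The activations and ids below are made-up stand-ins with the shapes from the question, and tf.reshape(ids, [-1]) is used in place of tf.squeeze to make the 1-D shape explicit.

```python
import tensorflow as tf

# Stand-in data with the shapes from the question (values are hypothetical).
activations = tf.reshape(tf.range(1 * 240 * 4), [1, 240, 4])
ids = tf.reshape(tf.range(0, 240, 2), [1, 1, 120])  # every second position

# Flatten the ids to shape (120,) before gathering along axis 1.
flat_ids = tf.reshape(ids, [-1])
downsampled = tf.gather(activations, flat_ids, axis=1)

print(downsampled.shape)  # (1, 120, 4)
```

Each entry of flat_ids selects a whole slice of length 4 along axis 1, so downsampled[0, 1] equals activations[0, 2].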
The documentation describes tf.gather as: "Gather slices from params axis axis according to indices."
Granted, the documentation is not the most expressive, and the emphasis should be placed on the word slices (since you index slices from the axis and not elements, which is what I suppose you mistakenly took it for).
Let's take a much smaller example:
activations_small = tf.convert_to_tensor([[[1, 2, 3, 4], [11, 22, 33, 44]]])
print(activations_small.shape) # [1, 2, 4]
Let's picture this tensor:
[ASCII diagram: a 1 x 2 x 4 box. The two columns along axis 1 are [1, 2, 3, 4] and [11, 22, 33, 44]; the four values of each column run along axis 2.]
tf.gather(activations_small, [0, 0], axis=1) will return
<tf.Tensor: shape=(1, 2, 4), dtype=int32, numpy=
array([[[1, 2, 3, 4],
[1, 2, 3, 4]]], dtype=int32)>
What tf.gather did was to look along axis 1 and pick up index 0 (twice, of course, since the indices are [0, 0]). If you were to run tf.gather(activations_small, [0, 0, 0, 0, 0], axis=1).shape, you'd get TensorShape([1, 5, 4]).
Your Error
Now let's try to trigger the error that you're getting.
tf.gather(activations_small, [0, 2], axis=1)
InvalidArgumentError: indices[1] = 2 is not in [0, 2) [Op:GatherV2]
What happened here was that when tf.gather looks from axis 1 perspective, there's no item (column if you will) with index = 2.
I guess this is what the documentation is hinting at by
param:<indices> The index Tensor. Must be one of the following types: int32, int64. Must be in range [0, params.shape[axis]).
Your (potential) solution
From the dimensions of indices, and that of the expected result from your question, I am not sure if the above was very obvious to you.
tf.gather(activations, indices=[0, 1, 2, 3], axis=2) or anything with indices within the range of indices in [0, activations.shape[2]) i.e. [0, 4) would work. Anything else would give you the error that you're getting.
There's a verbatim answer below in case that's your expected result.

How to replicate numpy.choose() in tensorflow?

I'm trying to efficiently replicate numpy's ndarray.choose() method.
Here's a numpy example of what I'm looking for:
b = np.arange(15).reshape(3, 5)
c = np.array([1,0,4])
c.choose(b.T) # trying to replicate in tensorflow
-> array([ 1, 5, 14])
The best I've been able to do with this is generate a batch_size square matrix (which is huge if batch size is huge) and take the diagonal of it:
tf_b = tf.constant(b)
tf_c = tf.constant(c)
sess.run(tf.diag_part(tf.gather(tf.transpose(tf_b), tf_c)))
-> array([ 1, 5, 14])
Is there a way to do this that is just linear in the first dimension (instead of squared)?
Yeah, there's an easier way to do this. Flatten your b array to 1-d, so it's [0, 1, 2, ..., 13, 14]. Take an array of indices that are in the range of the number of 'choices' you are taking (3 in your case). That will be [0, 1, 2]. Multiply this range by the second dimension of your original shape, which is the number of options for each choice (5 in your case). That gives you [0, 5, 10]. Then add your indices to this to obtain [1, 5, 14]. Now you're good to call tf.gather().
Here is some code that I've taken from here that does a similar thing for RNN outputs. Yours will be slightly different, but the idea is the same.
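# Hypothetical wrapper: the function name and argument list were added so that
# the fragment (which ends in a return) is valid on its own.
def last_relevant(output, length, batch_size, max_length, out_size):
    index = tf.range(0, batch_size) * max_length + (length - 1)
    flat = tf.reshape(output, [-1, out_size])
    relevant = tf.gather(flat, index)
    return relevant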
In the big picture, the operation is pretty straightforward. You use the range operation to get the index of the beginning of each row, then add the index of where you are in each row. I think doing it in 1D is easiest, so that's why we flatten it.
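Applied to the arrays from the question, the flat-index idea can be checked in NumPy first. This is only a sketch of the arithmetic; in TensorFlow the same three steps would use tf.range, tf.reshape and tf.gather. The names flat_index and picked are introduced here for illustration.

```python
import numpy as np

b = np.arange(15).reshape(3, 5)
c = np.array([1, 0, 4])

n_rows, n_cols = b.shape

# Start of each row in the flattened array, plus the chosen column per row.
flat_index = np.arange(n_rows) * n_cols + c

# One lookup per row: linear in the number of rows, no square matrix needed.
picked = b.reshape(-1)[flat_index]

print(picked)           # [ 1  5 14]
print(c.choose(b.T))    # [ 1  5 14]
```

Because b happens to be np.arange(15), the flat indices and the picked values coincide here; with other data only the picked values would match the choose() output.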
