Suppose I have a tensor like the following:
x = torch.tensor([[[1, 2, 3, 4, 5],
[2, 3, 4, 5, 6],
[4, 5, 6, 7, 8]]])
and I want to extract the positions of the 3 lowest values, which are 1, 2, and 2 in this example.
So I first flatten x and get the index:
v, i = torch.topk(x.flatten(), 3, largest = False)
i is tensor([0, 5, 1]), which points at the values I want, but these are indices into the flattened tensor, not into the original shape. What I am looking for is [0, 0, 0], [0, 0, 1], and [0, 1, 0].
How can I convert the flat indices back into indices of the original tensor?
NumPy has a function that produces the desired output. Unfortunately, I wasn't able to find a PyTorch equivalent, and I believe there isn't one yet. I suggest using np.unravel_index:
import torch
import numpy as np
x = torch.tensor([[[1, 2, 3, 4, 5],
[2, 3, 4, 5, 6],
[4, 5, 6, 7, 8]]])
v, i = torch.topk(x.flatten(), 3, largest = False)
print(np.unravel_index(i, x.size()))
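Output (one array per dimension; read position by position, the coordinates are [0, 0, 0], [0, 1, 0], and [0, 0, 1]):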
(array([0, 0, 0]), array([0, 1, 0]), array([0, 0, 1]))
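If one coordinate triple per value is wanted, as in the question, the per-dimension arrays can be stacked. The following is a minimal NumPy-only sketch, not the PyTorch code above: it assumes the data is available as a NumPy array, and it swaps torch.topk for a stable argsort so that the two tied values come back in a predictable order. The names flat_idx and coords are illustrative.

```python
import numpy as np

x = np.array([[[1, 2, 3, 4, 5],
               [2, 3, 4, 5, 6],
               [4, 5, 6, 7, 8]]])

# Flat indices of the 3 smallest values; a stable sort keeps ties in index order.
flat_idx = np.argsort(x, axis=None, kind="stable")[:3]

# unravel_index returns one array per dimension; stack them into one row per value.
coords = np.stack(np.unravel_index(flat_idx, x.shape), axis=1)
print(coords.tolist())  # [[0, 0, 0], [0, 0, 1], [0, 1, 0]]
```

Each row of coords can be used directly as an index, for example x[tuple(coords[0])].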
Say that I have an array of arrays
array = np.random.randint(0, 6, (4, 6))
array
array([[3, 5, 2, 5, 1, 3],
[5, 3, 0, 1, 4, 3],
[2, 1, 0, 2, 2, 4],
[2, 1, 0, 4, 2, 2]])
I also have arrays of the desired start and end indices for slicing each row:
starts = np.random.randint(0, 3, (4,))
ends = starts + 3
ends
array([6, 3, 4, 3])
How do I slice out the array of arrays using these indices?
For the example, the desired result will be
array([[5, 1, 3],
[5, 3, 0],
[1, 0, 2],
[2, 1, 0]])
Something like array[starts:ends] does not work.
I am not sure, but this may help you:
a = []
i = 0
for s, e in zip(starts, ends):
    a.append(array[i][s:e])  # slice row i from s up to (not including) e
    i += 1
Or use enumerate():
a = []
for i, (s, e) in enumerate(zip(starts, ends)):
    a.append(array[i][s:e])
Then convert the list into an array to get the desired output:
print(np.array(a))
It looks like this:
array([[2, 4, 2],
[4, 3, 0],
[2, 2, 0],
[4, 0, 4]])
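Because every slice here has the same width (ends = starts + 3), the loop can also be replaced by integer-array indexing. This is a sketch under that equal-width assumption; it uses the array from the question, and starts is inferred from the printed ends. The names width, rows, and cols are illustrative.

```python
import numpy as np

array = np.array([[3, 5, 2, 5, 1, 3],
                  [5, 3, 0, 1, 4, 3],
                  [2, 1, 0, 2, 2, 4],
                  [2, 1, 0, 4, 2, 2]])
starts = np.array([3, 0, 1, 0])  # inferred from ends = starts + 3 = [6, 3, 4, 3]
width = 3                        # assumes all slices have the same length

rows = np.arange(array.shape[0])[:, None]  # shape (4, 1): one row index per row
cols = starts[:, None] + np.arange(width)  # shape (4, 3): column indices per row
result = array[rows, cols]                 # rows and cols broadcast to (4, 3)
print(result)
```

If the slices had different lengths, the result would be ragged and the list-based loop would be the safer choice.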
I was working with NumPy when I encountered a strange (?) behavior of argsort:
>>> array = [[0, 1, 2, 3, 4, 5],
...          [444, 4, 8, 3, 1, 10],
...          [2, 5, 8, 999, 1, 4]]
>>> np.argsort(array, axis=0)
array([[0, 0, 0, 0, 1, 2],
[2, 1, 1, 1, 2, 0],
[1, 2, 2, 2, 0, 1]], dtype=int64)
The first 4 values of each list are pretty clear to me: argsort is doing its job right. But the last 2 values are confusing, as they seem to be sorted wrong.
Shouldn't the output of argsort be:
array([[0, 0, 0, 0, 2, 1],
[2, 1, 1, 1, 0, 2],
[1, 2, 2, 2, 1, 0]], dtype=int64)
I think the issue is with what you think argsort is outputting. Let's focus on a simpler 1D example:
arr = np.array([5, 10, 4])
The result of np.argsort will be the indices from the original array to make the elements sorted:
[2, 0, 1]
Let's take a look at what the actual sorted values are to understand why:
[
4, # at index 2 in the original array
5, # at index 0 in the original array
10, # at index 1 in the original array
]
It seems like you are imagining the inverse operation, where argsort will tell you what index in the output each element will move to. You can obtain those indices by applying argsort to the result of argsort.
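The difference between the two operations can be sketched on the same 1D example. The names order and ranks are illustrative.

```python
import numpy as np

arr = np.array([5, 10, 4])

# order[k] is the index in arr of the k-th smallest element.
order = np.argsort(arr)
print(order.tolist())       # [2, 0, 1]
print(arr[order].tolist())  # [4, 5, 10]

# ranks[i] is the position that arr[i] ends up at in the sorted output.
ranks = np.argsort(order)
print(ranks.tolist())       # [1, 2, 0]
```

Here 5 moves to position 1, 10 to position 2, and 4 to position 0, which is the reading the question expected from a single argsort.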
The output is correct. np.argsort with axis=0 sorts along the first axis, that is, it compares the elements within each column. So for the array
array = [[0, 1, 2, 3, 4, 5],
         [444, 4, 8, 3, 1, 10],
         [2, 5, 8, 999, 1, 4]]
axis=0 compares the elements (0, 444, 2), (1, 4, 5), (2, 8, 8), (3, 3, 999), (4, 1, 1), and (5, 10, 4), which gives the array of indices:
np.argsort(array, axis=0)
array([[0, 0, 0, 0, 1, 2],
[2, 1, 1, 1, 2, 0],
[1, 2, 2, 2, 0, 1]])
So, to answer your question, the last 2 columns of the result come from the elements (4, 1, 1), which give the indices (1, 2, 0), and (5, 10, 4), which give (2, 0, 1).
See the documentation for np.argsort.
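One way to check the column-wise reading is to use the indices to gather the values and compare against np.sort. This is a small sketch; kind="stable" is an added assumption so that the tied values in (4, 1, 1) come back in a fixed order, and the names idx and sorted_cols are illustrative.

```python
import numpy as np

array = np.array([[0, 1, 2, 3, 4, 5],
                  [444, 4, 8, 3, 1, 10],
                  [2, 5, 8, 999, 1, 4]])

# Sort each column independently; a stable sort fixes the order of ties.
idx = np.argsort(array, axis=0, kind="stable")

# Show the last two columns next to their argsort results.
for j in (4, 5):
    print(array[:, j].tolist(), "->", idx[:, j].tolist())

# Gathering with the indices reproduces the column-wise sorted array.
sorted_cols = np.take_along_axis(array, idx, axis=0)
print(np.array_equal(sorted_cols, np.sort(array, axis=0)))  # True
```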
Let's say I have the following array
import numpy as np
matrix = np.array([
[[1, 2, 3, 4], [0, 1], [2, 3, 4, 5]],
[[1, 2, 3], [4], [0, 1], [2, 0], [0, 0]],
[[2, 2], [3, 4, 0], [1, 1, 0, 0], [0]],
[[6, 3, 3, 4, 0], [4, 2, 3, 4, 5]],
[[1, 2, 3, 2], [0, 1, 2], [3, 4, 5]]], dtype=object)  # NumPy 1.24+ needs dtype=object for ragged input
As you can see, it's a staggered array. What I want to do is to sum the elements in a way so that the output is:
[11, 11, 15, 18, 0, 8, 9, 9, 12, 15]
I want to sum the elements in the "columns" of the matrix, but I don't know how to do it.
As mentioned by juanpa.arrivillaga in the comments, you don't have a multi-dimensional array; you have a 1-D array of lists of lists. You need to flatten the inner lists first:
>>> np.array([[z for y in x for z in y] for x in matrix])
array([[1, 2, 3, 4, 0, 1, 2, 3, 4, 5],
[1, 2, 3, 4, 0, 1, 2, 0, 0, 0],
[2, 2, 3, 4, 0, 1, 1, 0, 0, 0],
[6, 3, 3, 4, 0, 4, 2, 3, 4, 5],
[1, 2, 3, 2, 0, 1, 2, 3, 4, 5]])
It should be much easier to solve your problem now. This matrix has a shape of (5,10), and supports T for transposition and np.sum() for summing rows or columns.
You didn't write any code, so I won't solve the problem completely, but with this matrix, you're one step away from:
array([11, 11, 15, 18, 0, 8, 9, 9, 12, 15])
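For reference, the remaining step can be sketched as a sum along axis 0. This assumes, as in the question's data, that every row flattens to the same length (10 here); matrix is kept as a plain nested list, which sidesteps the ragged-array construction entirely. The names flat and column_sums are illustrative.

```python
import numpy as np

matrix = [
    [[1, 2, 3, 4], [0, 1], [2, 3, 4, 5]],
    [[1, 2, 3], [4], [0, 1], [2, 0], [0, 0]],
    [[2, 2], [3, 4, 0], [1, 1, 0, 0], [0]],
    [[6, 3, 3, 4, 0], [4, 2, 3, 4, 5]],
    [[1, 2, 3, 2], [0, 1, 2], [3, 4, 5]],
]

# Flatten the inner lists of each row; every row ends up with 10 values.
flat = np.array([[z for y in x for z in y] for x in matrix])

# Sum down the rows to get one total per column.
column_sums = flat.sum(axis=0)
print(column_sums.tolist())  # [11, 11, 15, 18, 0, 8, 9, 9, 12, 15]
```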
I have a 2D array filled with some values (column 0) and zeros (the rest of the columns). I would like to do pretty much what I do in MS Excel, but using NumPy: fill the remaining columns with values calculated from the first column. Here is an MWE:
import numpy as np
a = np.zeros(20, dtype=np.int8).reshape(4,5)
b = [1, 2, 3, 4]
b = np.array(b)
a[:, 0] = b
# don't change the first column
for column in a[:, 1:]:
    a[:, column] = column[0]+1
The expected output:
array([[1, 2, 3, 4, 5],
[2, 3, 4, 5, 6],
[3, 4, 5, 6, 7],
[4, 5, 6, 7, 8]], dtype=int8)
The resulting output:
array([[1, 0, 0, 0, 0],
[1, 0, 0, 0, 0],
[1, 0, 0, 0, 0],
[1, 0, 0, 0, 0]], dtype=int8)
Any help would be appreciated.
Looping is slow and there is no need to loop to produce the array that you want:
>>> a = np.ones(20, dtype=np.int8).reshape(4,5)
>>> a[:, 0] = b
>>> a
array([[1, 1, 1, 1, 1],
[2, 1, 1, 1, 1],
[3, 1, 1, 1, 1],
[4, 1, 1, 1, 1]], dtype=int8)
>>> np.cumsum(a, axis=1)
array([[1, 2, 3, 4, 5],
[2, 3, 4, 5, 6],
[3, 4, 5, 6, 7],
[4, 5, 6, 7, 8]])
What went wrong
Let's start, as in the question, with this array:
>>> a
array([[1, 0, 0, 0, 0],
[2, 0, 0, 0, 0],
[3, 0, 0, 0, 0],
[4, 0, 0, 0, 0]], dtype=int8)
Now, using the code from the question, let's do the loop and see what column actually is:
>>> for column in a[:, 1:]:
... print(column)
...
[0 0 0 0]
[0 0 0 0]
[0 0 0 0]
[0 0 0 0]
As you can see, column is not the index of a column but an array of values. In fact, iterating over a 2D array yields its rows, so each column here is a row of a[:, 1:]. Consequently, the following does not do what you would hope:
a[:, column] = column[0]+1
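To see why that assignment produces the output reported in the question, here is a sketch of a single loop iteration. The name row is illustrative and stands for the value that the loop variable column takes.

```python
import numpy as np

a = np.zeros((4, 5), dtype=np.int8)
a[:, 0] = [1, 2, 3, 4]

# The first item yielded by `for column in a[:, 1:]` is a row of zeros.
row = a[0, 1:]
print(row.tolist())     # [0, 0, 0, 0]

# Used as an index, [0, 0, 0, 0] selects column 0 four times,
# so this writes 0 + 1 = 1 into column 0 of every row.
a[:, row] = row[0] + 1
print(a.tolist())
```

The remaining iterations behave the same way, because columns 1 to 4 are never written and stay zero.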
Another method
If we want to loop (so that we can do something more complex), here is another approach to generating the desired array:
>>> b = np.array([1, 2, 3, 4])
>>> np.column_stack([b+i for i in range(5)])
array([[1, 2, 3, 4, 5],
[2, 3, 4, 5, 6],
[3, 4, 5, 6, 7],
[4, 5, 6, 7, 8]])
Your usage of column is ambiguous: in for column in a[:, 1:], it is treated as a column of values, but in the loop body it is treated as a column index. You can try this instead:
for column in range(1, a.shape[1]):
    a[:, column] = a[:, column-1]+1
a
#array([[1, 2, 3, 4, 5],
# [2, 3, 4, 5, 6],
# [3, 4, 5, 6, 7],
# [4, 5, 6, 7, 8]], dtype=int8)
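When each column is just the first column plus a fixed offset, as here, the same array can also be built without a loop by broadcasting. This is a minimal sketch of that idea; the name offsets is illustrative.

```python
import numpy as np

b = np.array([1, 2, 3, 4], dtype=np.int8)
offsets = np.arange(5, dtype=np.int8)  # 0 for the first column, then 1, 2, 3, 4

# b[:, None] has shape (4, 1) and offsets has shape (5,), so the sum is (4, 5).
a = b[:, None] + offsets
print(a.tolist())
```

A loop like the one above remains the better fit when each column depends on the previous one in a way that cannot be written as a single expression.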