Getting indices of values in high dimension matrix - python

I have tensorA of size 10x4x9x2, the other tensorB is of size 10x5x2 that contains values from tensorA. Now, how can i find the index of each element in tensorB in tensorA.
Example:
First 2 elements of TensorA:
[[[[ 4., 1.],
[ 1., 2.],
[ 2., 5.],
[ 5., 3.],
[ 3., 11.],
[11., 10.],
[10., -1.],
[-1., -1.],
[-1., -1.]],
[[12., 13.],
[13., 9.],
[ 9., 7.],
[ 7., 5.],
[ 5., 3.],
[ 3., 4.],
[ 4., 1.],
[ 1., 0.],
[ 0., -1.]],
...... so on
Fist 2 elements of TensorB:
[[[ 2., 5.],
[ 5., 7.],
[ 7., 9.],
[ 9., 10.],
[10., 12.]],
[[ 0., 1.],
[ 1., 2.],
[ 2., 5.],
[ 5., -1.],
[-1., -1.]],
Now in tensorB the first element is [2,5] included in the first 5x2 matrix (dimension 0).
so the element should be matched against dimension 0 in tensorA. And the output should be index
0,0,2 since it is the 3rd element.

You can compare the rows that are equal, sum along the last axis, and check that sum against the size of the searched tensor. Then the nonzero function will get you the indices you're looking for.
Since for the example tensors you have given, TensorB[0, 0] is [2., 5.], that looks like:
((TensorA == TensorB[0, 0]).sum(dim=3) == 2).nonzero()
This will return a tensor of [[0, 0, 2]] if that is the only matching row. If you don't want to hard-code 2 (the size of the searched tensor), you can use:
((TensorA == TensorB[0, 0]).sum(dim=3) == TensorB[0, 0].size()[0]).nonzero()

Related

What is Pytorch equivalent of Pandas groupby.apply(list)?

I have the following pytorch tensor long_format:
tensor([[ 1., 1.],
[ 1., 2.],
[ 1., 3.],
[ 1., 4.],
[ 0., 5.],
[ 0., 6.],
[ 0., 7.],
[ 1., 8.],
[ 0., 9.],
[ 0., 10.]])
I would like to groupby the first column and store the 2nd column as a tensor. The result is NOT guranteed to be the same size for each grouping. See example below.
[tensor([ 1., 2., 3., 4., 8.]),
tensor([ 5., 6., 7., 9., 10.])]
Is there any nice way to do this using purely Pytorch operators? I would like to avoid using for loops for tracebility purposes.
I have tried using a for loop and empty list of empty tensors but this result in an incorrect trace (different inputs values gave same results)
n_groups = 2
inverted = [torch.empty([0]) for _ in range(n_groups)]
for index, value in long_format:
value = value.unsqueeze(dim=0)
index = index.int()
if type(inverted[index]) != torch.Tensor:
inverted[index] = value
else:
inverted[index] = torch.cat((inverted[index], value))
You can use this code:
import torch
x = torch.tensor([[ 1., 1.],
[ 1., 2.],
[ 1., 3.],
[ 1., 4.],
[ 0., 5.],
[ 0., 6.],
[ 0., 7.],
[ 1., 8.],
[ 0., 9.],
[ 0., 10.]])
result = [x[x[:,0]==i][:,1] for i in x[:,0].unique()]
output
[tensor([ 5., 6., 7., 9., 10.]), tensor([1., 2., 3., 4., 8.])]

how to modify a column of numpy arrays stored in a list

I have a list of numpy arrays and want to modify some numbers of arrays. This is my simplified list:
first_list=[np.array([[1.,2.,0.], [2.,1.,0.], [6.,8.,3.], [8.,9.,7.]]),
np.array([[1.,0.,2.], [0.,0.,2.], [5.,5.,1.], [0.,6.,2.]])]
I have a factor which defines how many splits I have in each arrays:
spl_array=2.
it means each array of the list can be splited into 2 ones. I want to add a fixed value (3.) into last column of each split of each array and also copy the last split and subtract this value (3.) from the third column of this copied split. Finally I want to have it as following:
final_list=[np.array([[1.,2.,3.], [2.,1.,3.], [6.,8.,6.], [8.,9.,10.], \
[6.,8.,0.], [8.,9.,4.]]), # copied and subtracted
np.array([[1.,0.,5.], [0.,0.,5.], [5.,5.,4.], [0.,6.,5.], \
[5.,5.,-2.], [0.,6.,-1.]])] # copied and subtracted
I tried some for loops but I totaly lost. In advance , I do appreciate any help.
final_list=[]
for i in first_list:
each_lay=np.split (i, spl_array)
for j in range (len(each_lay)):
final_list.append([each_lay[j][:,0], each_lay[j][:,1], each_lay[j][:,2]+3])
Is it what you expect:
m = np.asarray(first_list)
m = np.concatenate((m, m[:, 2:]), axis=1)
m[:, :4, 2] += 3
m[:, 4:, 2] -= 3
final_list = m.tolist()
>>> m
array([[[ 1., 2., 3.],
[ 2., 1., 3.],
[ 6., 8., 6.],
[ 8., 9., 10.],
[ 6., 8., 0.],
[ 8., 9., 4.]],
[[ 1., 0., 5.],
[ 0., 0., 5.],
[ 5., 5., 4.],
[ 0., 6., 5.],
[ 5., 5., -2.],
[ 0., 6., -1.]]])

multiplication of all the combination of elements in certain dimension

I'm trying to do multiplication of all the combination of elements in a certain dimension.
For example,
using for loops it would be
for a,i in enumerate(A):
for b,j in enumerate(B):
c[i][j] = a[1]*b[0]
not using for loop would be
A[:,None,1]*B[:,0]
this worked fine for 2-dimensions(A.shape=(4,2),B.shape=(4,2)).
However, When I am expanding above procedures to 3-dimensions(A.shape=(2,4,2),B.shape=(2,4,2)).
I tried with
A[:,:,1]*B[:,:,None,0]
but gives me an error.
How can I do this without using for loops???
For 2-dimensional tensors this is what I get
a = torch.FloatTensor([[0,1],[0,2],[0,3],[0,4]])
b = torch.FloatTensor([[1,0],[2,0],[3,0],[4,0]])
c = a[:,None,0] * b[:,1]
print(c)
tensor([[ 1., 2., 3., 4.],
[ 2., 4., 6., 8.],
[ 3., 6., 9., 12.],
[ 4., 8., 12., 16.]])
but in the case of
A = torch.cat([a.unsqueeze(0),a.unsqueeze(0)],dim=0)
B = torch.cat([b.unsqueeze(0),b.unsqueeze(0)],dim=0)
C = A[:,:,None,0] * B[:,:,1] << RuntimeError
RuntimeError: The size of tensor a (4) must match the size of tensor b (2) at non-singleton dimension 1
what I want is
tensor([[[ 1., 2., 3., 4.],
[ 2., 4., 6., 8.],
[ 3., 6., 9., 12.],
[ 4., 8., 12., 16.]],
[[ 1., 2., 3., 4.],
[ 2., 4., 6., 8.],
[ 3., 6., 9., 12.],
[ 4., 8., 12., 16.]]])
I cannot think of a way to do this without using for loop in dimension 0.

python - numpy - summation of numpy arrays of different dimension

Suppose we create an numpy array like this:
x = np.linspace(1,5,5).reshape(-1,1)
which results in this:
array([[ 1.],
[ 2.],
[ 3.],
[ 4.],
[ 5.]])
now we add the transpose of this array to it:
x + x.T
which results in this:
array([[ 2., 3., 4., 5., 6.],
[ 3., 4., 5., 6., 7.],
[ 4., 5., 6., 7., 8.],
[ 5., 6., 7., 8., 9.],
[ 6., 7., 8., 9., 10.]])
I don't understand this because the two arrays have different dimensions (5x1 and 1x5) and I learned in linear algebra that we can only sum up matrices when they have the same dimension.
Edit: OK thanks, got it
Here, we have
x = array([[ 1.],[ 2.],[ 3.],[ 4.],[ 5.]])
x.T = array([[ 1., 2., 3., 4., 5.]])
Now you are trying to add two matrices of different dimensions (1 X 5) and (5 X 1).
The way numpy handles this is by copying elements in each row of 1st matrix across its columns to match a number of columns of 2nd matrix and copying elements in each column of 2nd matrix across its rows to match no. of rows of 1st matrix. This gives you 2 5 X 5 matrix which can be added together.
The element wise addition is done as
array([[ 1., 1., 1., 1., 1.],[ 2., 2., 2., 2., 2.,],[ 3., 3., 3., 3., 3.,],[4., 4., 4., 4., 4.],[ 5., 5., 5., 5., 5.,]]) + array([[ 1., 2., 3., 4., 5.],[ 1., 2., 3., 4., 5.],[ 1., 2., 3., 4., 5.],[ 1., 2., 3., 4., 5.],[ 1., 2., 3., 4., 5.]])
which produces the result
array([[ 2., 3., 4., 5., 6.],
[ 3., 4., 5., 6., 7.],
[ 4., 5., 6., 7., 8.],
[ 5., 6., 7., 8., 9.],
[ 6., 7., 8., 9., 10.]])

Python - Find values whose coordinates are known in several times

I would like to get several values whose i have the coordinates.
My coordinates are given by "Coord" (shape : (3, 3, 2, 3) : X and Y during 3 times and with 2 because of 2 coordinates) and my values are given by "Values" (shape : (3, 3, 3) for 3 times)
In other words, i would like to concatenate values in time with "slices" for each positions...
I dont know how to undertake that...Here there is a little part of the arrays.
import numpy as np
Coord = np.array([[[[ 4., 6., 10.],
[ 1., 3., 7.]],
[[ 3., 5., 9.],
[ 1., 3., 7.]],
[[ 2., 4., 8.],
[ 1., 3., 7.]]],
[[[ 4., 6., 10.],
[ 2., 4., 8.]],
[[ 3., 5., 9.],
[ 2., 4., 8.]],
[[ 2., 4., 8.],
[ 2., 4., 8.]]],
[[[ 4., 6., 10.],
[ 3., 5., 9.]],
[[ 3., 5., 9.],
[ 3., 5., 9.]],
[[ 2., 4., 8.],
[ 3., 5., 9.]]]])
Values = np.array([[[-4.24045246, 0.97551048, -5.78904502],
[-3.24218504, 0.9771782 , -4.79103141],
[-2.24390519, 0.97882129, -3.79298771]],
[[-4.24087775, 1.97719843, -5.79065966],
[-3.24261128, 1.97886271, -4.7926441 ],
[-2.24433235, 1.98050192, -3.79459845]],
[[-4.24129055, 2.97886284, -5.79224713],
[-3.24302502, 2.98052345, -4.79422942],
[-2.24474697, 2.98215901, -3.79618161]]])
EDIT LATER
I try in case of a simplified problem (without time first). I have used a "for loop" but
somes errors seems subsist...do you think it s the best way to treat this problem? because my arrays are important... 400x300x100
Coord3 = np.array([[[ 2, 2.],
[ 0., 1.],
[ 0., 2.]],
[[ 1., 0.],
[ 2., 1.],
[ 1., 2.]],
[[ 2., 0.],
[ 1., 1.],
[ 0., 0.]]])
Coord3 = Coord3.astype(int)
Values2 = np.array([[0., 1., 2.],
[3., 4., 5.],
[6., 7., 8.]])
b = np.zeros((3,3))
for i in range(Values2.shape[0]):
for j in range(Values2.shape[1]):
b[Coord3[i,j,0], Coord3[i,j,1]] = Values2[i,j]
b
Your second example is relatively easy to do with fancy indexing:
b = np.zeros((3,3), values2.dtype)
b[coord3[..., 0], coord3[..., 1]] = values2
The origial problem is a bit harder to do, but I think this takes care of it:
coord = coord.astype(int)
x_size = coord[..., 0, :].max() + 1
y_size = coord[..., 1, :].max() + 1
# x_size, y_size = coord.max(axis=(0, 1, 3)) + 1
nt = coord.shape[3]
b = np.zeros((x_size, y_size, nt), values.dtype)
b[coord[..., 0, :], coord[..., 1, :], np.arange(nt)] = values

Categories