How do I concatenate an array into a 3D matrix? - python

In my Python application I have a 3D matrix (array) like this:
array([[[ 1., 2., 3.]], [[ 4., 5., 6.]], [[ 7., 8., 9.]]])
and I would like to insert zero rows at a particular "line", for example in the middle, so that I end up with the following matrix:
array([[[ 1.,  2.,  3.]],
       [[ 4.,  5.,  6.]],
       [[ 0.,  0.,  0.]],
       [[ 0.,  0.,  0.]],
       [[ 7.,  8.,  9.]]])
Does anybody know how to solve this? I tried numpy.concatenate, but it only lets me append more "lines" at the end.
Thanks in advance!

Possible duplicate of: Inserting a row at a specific location in a 2d array in numpy?
For example:
a = np.array([[[ 1., 2., 3.]], [[ 4., 5., 6.]], [[ 7., 8., 9.]]])
output = np.insert(a, 2, np.array([0, 0, 0]), 0)
Output:
array([[[ 1.,  2.,  3.]],
       [[ 4.,  5.,  6.]],
       [[ 0.,  0.,  0.]],
       [[ 7.,  8.,  9.]]])
Why does this work on a 3D array?
See the doc here. It says:
numpy.insert(arr, obj, values, axis=None)
...
Parameters:
values : array_like
    Values to insert into arr. If the type of values is different from that
    of arr, values is converted to the type of arr. values should be shaped
    so that arr[...,obj,...] = values is legal.
...
So it's a very handy function!
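Building on that shape rule, the two zero rows from the question can probably be inserted in a single call by passing a block of zeros of the right shape; a small untested sketch:

a = np.array([[[ 1., 2., 3.]], [[ 4., 5., 6.]], [[ 7., 8., 9.]]])
# insert a (2, 1, 3) block of zeros before index 2 along axis 0
out = np.insert(a, 2, np.zeros((2, 1, 3)), axis=0)
# out should now contain two zero "lines" between [[4., 5., 6.]] and [[7., 8., 9.]]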

Is this what you want?
result = np.r_[a[:2], np.zeros((2, 1, 3)), a[2][None]]

I'd do it this way:
>>> a = np.array([[[ 1., 2., 3.]], [[ 4., 5., 6.]], [[ 7., 8., 9.]]])
>>> np.concatenate((a[:2], np.tile(np.zeros_like(a[0]), (2,1,1)), a[2:]))
array([[[ 1.,  2.,  3.]],
       [[ 4.,  5.,  6.]],
       [[ 0.,  0.,  0.]],
       [[ 0.,  0.,  0.]],
       [[ 7.,  8.,  9.]]])
The 2 in (2,1,1) given to tile() is how many zero "rows" to insert. The 2 in the slice indexes is of course where to insert.
If you're going to insert a large amount of zeros, it may be more efficient to just create a big array of zeros first and then copy in the parts you need from the original array.
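For instance, a minimal sketch of that preallocation approach (pos and n_zeros are illustrative names, not from the original answer):

pos, n_zeros = 2, 2                       # where to insert, and how many zero rows
out = np.zeros((a.shape[0] + n_zeros,) + a.shape[1:], dtype=a.dtype)
out[:pos] = a[:pos]                       # copy the part before the insertion point
out[pos + n_zeros:] = a[pos:]             # copy the rest after the zero block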

Related

What is Pytorch equivalent of Pandas groupby.apply(list)?

I have the following pytorch tensor long_format:
tensor([[ 1.,  1.],
        [ 1.,  2.],
        [ 1.,  3.],
        [ 1.,  4.],
        [ 0.,  5.],
        [ 0.,  6.],
        [ 0.,  7.],
        [ 1.,  8.],
        [ 0.,  9.],
        [ 0., 10.]])
I would like to group by the first column and store the 2nd column as a tensor. The result is NOT guaranteed to be the same size for each group. See the example below.
[tensor([ 1., 2., 3., 4., 8.]),
 tensor([ 5., 6., 7., 9., 10.])]
Is there any nice way to do this using purely PyTorch operators? I would like to avoid using for loops for traceability purposes.
I have tried using a for loop and a list of empty tensors, but this results in an incorrect trace (different input values gave the same results):
n_groups = 2
inverted = [torch.empty([0]) for _ in range(n_groups)]
for index, value in long_format:
    value = value.unsqueeze(dim=0)
    index = index.int()
    if type(inverted[index]) != torch.Tensor:
        inverted[index] = value
    else:
        inverted[index] = torch.cat((inverted[index], value))
You can use this code:
import torch
x = torch.tensor([[ 1.,  1.],
                  [ 1.,  2.],
                  [ 1.,  3.],
                  [ 1.,  4.],
                  [ 0.,  5.],
                  [ 0.,  6.],
                  [ 0.,  7.],
                  [ 1.,  8.],
                  [ 0.,  9.],
                  [ 0., 10.]])
result = [x[x[:, 0] == i][:, 1] for i in x[:, 0].unique()]
Output:
[tensor([ 5., 6., 7., 9., 10.]), tensor([1., 2., 3., 4., 8.])]
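Note that the groups come out in the sorted order of the keys (0 first, then 1). If you also want to avoid the Python-level comprehension over groups, one possible alternative is to sort by the key column and split by group counts; a rough sketch (stable=True in torch.sort needs a reasonably recent PyTorch, and torch.split still returns a Python tuple of tensors):

keys = x[:, 0]
_, order = torch.sort(keys, stable=True)          # stable sort keeps within-group order
sorted_vals = x[order, 1]
_, counts = torch.unique(keys, return_counts=True)
groups = list(torch.split(sorted_vals, counts.tolist()))
# groups == [tensor([ 5., 6., 7., 9., 10.]), tensor([1., 2., 3., 4., 8.])]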

how to modify a column of numpy arrays stored in a list

I have a list of numpy arrays and want to modify some numbers in the arrays. This is my simplified list:
first_list = [np.array([[1., 2., 0.], [2., 1., 0.], [6., 8., 3.], [8., 9., 7.]]),
              np.array([[1., 0., 2.], [0., 0., 2.], [5., 5., 1.], [0., 6., 2.]])]
I have a factor which defines how many splits there are in each array:
spl_array = 2.
It means each array in the list can be split into 2 parts. I want to add a fixed value (3.) to the last column of each split of each array, and also copy the last split and subtract this value (3.) from the third column of the copied split. Finally I want to end up with the following:
final_list = [np.array([[1., 2., 3.], [2., 1., 3.], [6., 8., 6.], [8., 9., 10.],
                        [6., 8., 0.], [8., 9., 4.]]),   # copied and subtracted
              np.array([[1., 0., 5.], [0., 0., 5.], [5., 5., 4.], [0., 6., 5.],
                        [5., 5., -2.], [0., 6., -1.]])] # copied and subtracted
I tried some for loops but I am totally lost. I do appreciate any help.
final_list = []
for i in first_list:
    each_lay = np.split(i, spl_array)
    for j in range(len(each_lay)):
        final_list.append([each_lay[j][:, 0], each_lay[j][:, 1], each_lay[j][:, 2] + 3])
Is this what you expect:
m = np.asarray(first_list)
m = np.concatenate((m, m[:, 2:]), axis=1)
m[:, :4, 2] += 3
m[:, 4:, 2] -= 3
final_list = m.tolist()
>>> m
array([[[ 1.,  2.,  3.],
        [ 2.,  1.,  3.],
        [ 6.,  8.,  6.],
        [ 8.,  9., 10.],
        [ 6.,  8.,  0.],
        [ 8.,  9.,  4.]],

       [[ 1.,  0.,  5.],
        [ 0.,  0.,  5.],
        [ 5.,  5.,  4.],
        [ 0.,  6.,  5.],
        [ 5.,  5., -2.],
        [ 0.,  6., -1.]]])
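This works because both arrays in the list have the same shape, so the whole list can be stacked into one 3D array. If the arrays could have different lengths, a per-array variant along the same lines might look like this (a sketch, not from the original answer; it assumes spl_array evenly divides each array):

final_list = []
for arr in first_list:
    shifted = arr.copy()
    shifted[:, 2] += 3.                     # add 3 to the last column of every split
    tail = np.split(arr, int(spl_array))[-1].copy()
    tail[:, 2] -= 3.                        # copy of the last split with 3 subtracted
    final_list.append(np.vstack([shifted, tail]))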

fold/col2im for convolutions in numpy

Suppose I have an input matrix of shape (batch_size, channels, h, w),
in this case (1, 2, 3, 3):
[[[[ 0.,  1.,  2.],
   [ 3.,  4.,  5.],
   [ 6.,  7.,  8.]],

  [[ 9., 10., 11.],
   [12., 13., 14.],
   [15., 16., 17.]]]]
To do a convolution with it I unroll it to the shape
(batch_size, channels * kernel_size * kernel_size, out_h * out_w)
which is:
[[[ 0.,  1.,  3.,  4.],
  [ 1.,  2.,  4.,  5.],
  [ 3.,  4.,  6.,  7.],
  [ 4.,  5.,  7.,  8.],
  [ 9., 10., 12., 13.],
  [10., 11., 13., 14.],
  [12., 13., 15., 16.],
  [13., 14., 16., 17.]]]
Now I want to get the unrolled matrix back to its original form,
which looks like this:
# for demonstration only the first and second column of the unrolled matrix
# the output should be the same shape as the initial matrix -> initialized to zeros
# current column -> [ 0., 1., 3., 4., 9., 10., 12., 13.]
[[[[0+0,  0+1,  0],
   [0+3,  0+4,  0],
   [0,    0,    0]],

  [[0+9,  0+10, 0],
   [0+12, 0+13, 0],
   [0,    0,    0]]]]
# for the next column it would be
# current column -> [ 1., 2., 4., 5., 10., 11., 13., 14.]
[[[[0,   1+1,   0+2],
   [3,   4+4,   0+5],
   [0,   0,     0]],

  [[9,   10+10, 0+11],
   [12,  13+13, 0+14],
   [0,   0,     0]]]]
You basically put the unrolled elements back in their original places and sum the overlapping parts together.
But now to my question:
How could one implement this as fast as possible using numpy and as few loops as possible? I already looped through it kernel by kernel, but this approach isn't feasible with larger inputs. I think this could be parallelized quite a bit, but my numpy indexing and overall knowledge isn't good enough to figure out a good solution by myself.
Thanks for reading and have a nice day :)
With numpy, I expect this can be done using numpy.lib.stride_tricks.as_strided. However, I'd suggest that you look at pytorch, which interoperates easily with numpy and has quite efficient primitives for this operation. In your case, the code would look like:
kernel_size = 2
x = torch.arange(18).reshape(1, 2, 3, 3).to(torch.float32)
unfold = torch.nn.Unfold(kernel_size=kernel_size)
fold = torch.nn.Fold(kernel_size=kernel_size, output_size=(3, 3))
unfolded = unfold(x)
cols = torch.arange(kernel_size ** 2)
for col in range(kernel_size ** 2):
    # keep only the entries belonging to output position `col`, zero out the rest
    unfolded_masked = torch.where(col == cols, unfolded, torch.tensor(0.0, dtype=torch.float32))
    refolded = fold(unfolded_masked)
    print(refolded)
tensor([[[[ 0.,  1.,  0.],
          [ 3.,  4.,  0.],
          [ 0.,  0.,  0.]],

         [[ 9., 10.,  0.],
          [12., 13.,  0.],
          [ 0.,  0.,  0.]]]])
tensor([[[[ 0.,  1.,  2.],
          [ 0.,  4.,  5.],
          [ 0.,  0.,  0.]],

         [[ 0., 10., 11.],
          [ 0., 13., 14.],
          [ 0.,  0.,  0.]]]])
tensor([[[[ 0.,  0.,  0.],
          [ 3.,  4.,  0.],
          [ 6.,  7.,  0.]],

         [[ 0.,  0.,  0.],
          [12., 13.,  0.],
          [15., 16.,  0.]]]])
tensor([[[[ 0.,  0.,  0.],
          [ 0.,  4.,  5.],
          [ 0.,  7.,  8.]],

         [[ 0.,  0.,  0.],
          [ 0., 13., 14.],
          [ 0., 16., 17.]]]])
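If you do want to stay in numpy, one option is a fold/col2im that loops only over the kernel_size * kernel_size offsets rather than over every output position; a rough sketch under the question's assumptions (stride 1, no padding; the function name fold_np is mine):

import numpy as np

def fold_np(cols, x_shape, kernel_size):
    # cols has shape (batch, channels * k * k, out_h * out_w), as produced by unfold
    batch, channels, h, w = x_shape
    k = kernel_size
    out_h, out_w = h - k + 1, w - k + 1
    cols = cols.reshape(batch, channels, k, k, out_h, out_w)
    out = np.zeros(x_shape, dtype=cols.dtype)
    for i in range(k):
        for j in range(k):
            # each kernel offset contributes a shifted (out_h, out_w) block;
            # += sums the overlapping parts, as in the example above
            out[:, :, i:i + out_h, j:j + out_w] += cols[:, :, i, j]
    return out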

Getting indices of values in high dimension matrix

I have tensorA of size 10x4x9x2 and another tensor, tensorB, of size 10x5x2 that contains values from tensorA. Now, how can I find the index of each element of tensorB in tensorA?
Example:
First 2 elements of TensorA:
[[[[ 4.,  1.],
   [ 1.,  2.],
   [ 2.,  5.],
   [ 5.,  3.],
   [ 3., 11.],
   [11., 10.],
   [10., -1.],
   [-1., -1.],
   [-1., -1.]],

  [[12., 13.],
   [13.,  9.],
   [ 9.,  7.],
   [ 7.,  5.],
   [ 5.,  3.],
   [ 3.,  4.],
   [ 4.,  1.],
   [ 1.,  0.],
   [ 0., -1.]],
  ... and so on
First 2 elements of TensorB:
[[[ 2.,  5.],
  [ 5.,  7.],
  [ 7.,  9.],
  [ 9., 10.],
  [10., 12.]],

 [[ 0.,  1.],
  [ 1.,  2.],
  [ 2.,  5.],
  [ 5., -1.],
  [-1., -1.]],
Now in tensorB the first element is [2, 5], contained in the first 5x2 matrix (dimension 0), so the element should be matched against dimension 0 of tensorA. The output should be index 0, 0, 2 since it is the 3rd element there.
You can compare the rows that are equal, sum along the last axis, and check that sum against the size of the searched tensor. Then the nonzero function will get you the indices you're looking for.
Since for the example tensors you have given, TensorB[0, 0] is [2., 5.], that looks like:
((TensorA == TensorB[0, 0]).sum(dim=3) == 2).nonzero()
This will return a tensor of [[0, 0, 2]] if that is the only matching row. If you don't want to hard-code 2 (the size of the searched tensor), you can use:
((TensorA == TensorB[0, 0]).sum(dim=3) == TensorB[0, 0].size()[0]).nonzero()
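To look up all of TensorB at once rather than one row at a time, broadcasting should give the same result in one shot; a sketch (untested, assuming exact equality of values is acceptable, as in the answer above):

# TensorA: (10, 4, 9, 2), TensorB: (10, 5, 2)
eq = (TensorA[:, None, :, :, :] == TensorB[:, :, None, None, :]).all(dim=-1)
idx = eq.nonzero()   # each row is (batch, row_in_B, dim1_in_A, dim2_in_A)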

Python - Find values whose coordinates are known in several times

I would like to get several values whose coordinates I have.
My coordinates are given by "Coord" (shape (3, 3, 2, 3): X and Y over 3 times, the 2 being the two coordinates) and my values are given by "Values" (shape (3, 3, 3), for 3 times).
In other words, I would like to concatenate the values in time, with "slices" for each position...
I don't know how to approach this... Here is a small part of the arrays.
import numpy as np
Coord = np.array([[[[ 4.,  6., 10.],
                    [ 1.,  3.,  7.]],
                   [[ 3.,  5.,  9.],
                    [ 1.,  3.,  7.]],
                   [[ 2.,  4.,  8.],
                    [ 1.,  3.,  7.]]],

                  [[[ 4.,  6., 10.],
                    [ 2.,  4.,  8.]],
                   [[ 3.,  5.,  9.],
                    [ 2.,  4.,  8.]],
                   [[ 2.,  4.,  8.],
                    [ 2.,  4.,  8.]]],

                  [[[ 4.,  6., 10.],
                    [ 3.,  5.,  9.]],
                   [[ 3.,  5.,  9.],
                    [ 3.,  5.,  9.]],
                   [[ 2.,  4.,  8.],
                    [ 3.,  5.,  9.]]]])

Values = np.array([[[-4.24045246,  0.97551048, -5.78904502],
                    [-3.24218504,  0.9771782 , -4.79103141],
                    [-2.24390519,  0.97882129, -3.79298771]],

                   [[-4.24087775,  1.97719843, -5.79065966],
                    [-3.24261128,  1.97886271, -4.7926441 ],
                    [-2.24433235,  1.98050192, -3.79459845]],

                   [[-4.24129055,  2.97886284, -5.79224713],
                    [-3.24302502,  2.98052345, -4.79422942],
                    [-2.24474697,  2.98215901, -3.79618161]]])
EDIT LATER
I tried a simplified case of the problem (without time, at first). I used a "for loop", but some errors seem to remain... Do you think this is the best way to treat the problem? My arrays are large: 400x300x100.
Coord3 = np.array([[[ 2., 2.],
                    [ 0., 1.],
                    [ 0., 2.]],

                   [[ 1., 0.],
                    [ 2., 1.],
                    [ 1., 2.]],

                   [[ 2., 0.],
                    [ 1., 1.],
                    [ 0., 0.]]])
Coord3 = Coord3.astype(int)

Values2 = np.array([[0., 1., 2.],
                    [3., 4., 5.],
                    [6., 7., 8.]])
b = np.zeros((3, 3))
for i in range(Values2.shape[0]):
    for j in range(Values2.shape[1]):
        b[Coord3[i, j, 0], Coord3[i, j, 1]] = Values2[i, j]
b
Your second example is relatively easy to do with fancy indexing:
b = np.zeros((3,3), values2.dtype)
b[coord3[..., 0], coord3[..., 1]] = values2
The original problem is a bit harder to do, but I think this takes care of it:
coord = coord.astype(int)
x_size = coord[..., 0, :].max() + 1
y_size = coord[..., 1, :].max() + 1
# x_size, y_size = coord.max(axis=(0, 1, 3)) + 1
nt = coord.shape[3]
b = np.zeros((x_size, y_size, nt), values.dtype)
b[coord[..., 0, :], coord[..., 1, :], np.arange(nt)] = values
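For instance, applied to the simplified example from the question (the lower-case names here just refer to the question's Coord3 and Values2):

coord3 = Coord3.astype(int)
values2 = Values2

b = np.zeros((3, 3), values2.dtype)
b[coord3[..., 0], coord3[..., 1]] = values2
# b now holds every value at the (x, y) position given by its coordinates,
# matching the result of the question's double for loop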
