Join numpy arrays of different dimensions and shapes - python

I have two arrays: one has shape (2,2) and the other has shape (2,2,2). I want to stack them so that the final result has shape (3,2,2). Here is an illustration of what I mean:
Array 1 -> [ 1,2 ] -> shape(2,2)
[ 3,4 ]
Array 2 -> [ 5,6 ] [ 9,10 ] -> shape (2,2,2)
[ 7,8 ] [ 11,12 ]
Final Array after stacking Arrays 1 and 2 -> [ 1,2 ] [ 5,6 ] [ 9,10 ] ->shape (3,2,2)
[ 3,4 ] [ 7,8 ] [ 11,12 ]

To be more flexible with dimension choices you can use ravel with reshape:
import numpy as np
arr1 = np.arange(1, 5).reshape(2, 2)
arr2 = np.arange(5, 13).reshape(2, 2, 2)
stack = np.concatenate((arr1.ravel(),arr2.ravel())).reshape(3,2,2)
Output:
>>> stack
array([[[ 1, 2],
[ 3, 4]],
[[ 5, 6],
[ 7, 8]],
[[ 9, 10],
[11, 12]]])

Use numpy.dstack to stack your arrays along the depth (third) axis:
import numpy as np
a = np.arange(1, 5).reshape(2, 2)
b = np.arange(5, 13).reshape(2, 2, 2)
c = np.dstack((a, b))
print(c)
#[[[ 1 5 6]
# [ 2 7 8]]
#
# [[ 3 9 10]
# [ 4 11 12]]]
print(c.shape)
#(2, 2, 3)
EDIT: To get your desired shape of (3, 2, 2), you can still use np.dstack, but with transposed input and output:
c = np.dstack((a.T, b.T)).T
print(c)
#[[[ 1 2]
# [ 3 4]]
#
# [[ 5 6]
# [ 7 8]]
#
# [[ 9 10]
# [11 12]]]
print(c.shape)
#(3, 2, 2)
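If the goal is only to place the 2-D array in front of the 3-D one, a sketch that avoids the transposes is to give the 2-D array a leading axis of length 1 and concatenate along axis 0. The arrays are the ones from the snippets above; the name stacked is only illustrative.

```python
import numpy as np

a = np.arange(1, 5).reshape(2, 2)      # shape (2, 2)
b = np.arange(5, 13).reshape(2, 2, 2)  # shape (2, 2, 2)

# a[np.newaxis] has shape (1, 2, 2), so both inputs are 3-D
# and agree on every axis except axis 0.
stacked = np.concatenate((a[np.newaxis], b), axis=0)
print(stacked.shape)  # (3, 2, 2)
```

This keeps the element order of both inputs, so no reshape of the result is needed.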

Related

Get batched indices from stacked matrices - Python Jax

I would like to extract the indices of stacked matrices.
Let us say we have an array a of shape (3, 2, 4), i.e. three matrices of shape (2, 4), and an array of indices of shape (3, 2).
import jax
import jax.numpy as jnp
def get_cols(x, idx):
    # index the second axis of x with idx
    x = x[:, idx]
    return x
idx = jnp.array([[0,1],[2,3],[1,2]])
a = jnp.array([[[1,2,3,4],
[3,2,2,4]],
[[100,20,3,50],
[5,5,2,4]],
[[1,2,3,4],
[3,2,2,4]]
])
e = jax.vmap(get_cols, in_axes=(None,0))(a,idx)
I want to extract the columns of the different matrices given a batch of indices. I expect the following result:
e = [[[[1,2],
[3,2]],
[[100,20],
[5,5]],
[[1,2],
[3,2]]],
[[[3,4],
[2,4]],
[[3,50],
[2,4]],
[[3,4],
[2,4]]],
[[[2,3],
[2,2]],
[[20,3],
[5,2]],
[[2,3],
[2,2]]]]
What am I missing?
It looks like you're interested in a double vmap over the inputs; e.g. something like this:
e = jax.vmap(jax.vmap(get_cols, in_axes=(0, None)), in_axes=(None, 0))(a, idx)
print(e)
[[[[ 1 2]
[ 3 2]]
[[100 20]
[ 5 5]]
[[ 1 2]
[ 3 2]]]
[[[ 3 4]
[ 2 4]]
[[ 3 50]
[ 2 4]]
[[ 3 4]
[ 2 4]]]
[[[ 2 3]
[ 2 2]]
[[ 20 3]
[ 5 2]]
[[ 2 3]
[ 2 2]]]]
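For comparison, the same selection can be sketched with plain integer-array indexing instead of vmap. This is written with NumPy so that it runs without JAX installed; jax.numpy arrays are expected to accept the same indexing expression, but treat that as an assumption to verify. The names picked and e_idx are illustrative.

```python
import numpy as np

idx = np.array([[0, 1], [2, 3], [1, 2]])       # shape (3, 2)
a = np.array([[[1, 2, 3, 4], [3, 2, 2, 4]],
              [[100, 20, 3, 50], [5, 5, 2, 4]],
              [[1, 2, 3, 4], [3, 2, 2, 4]]])    # shape (3, 2, 4)

# Indexing the last axis with idx gives shape (3, 2, 3, 2):
# (matrix, row, index batch, selected column).
picked = a[:, :, idx]

# Move the index-batch axis to the front:
# (index batch, matrix, row, selected column).
e_idx = np.moveaxis(picked, 2, 0)
print(e_idx.shape)  # (3, 3, 2, 2)
```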

how to merge two 3d-arrays on the 2nd dimension efficiently?

Let's say I have two 3-dimensional arrays (a and b) of shape (1,000,000, ???, 50), where ??? varies (see below).
How can I merge them so that the result has shape (1,000,000, {sum of a's and b's second dimensions}, 50)?
Here are the samples (np.arrays are also possible):
EDIT: I added usable code further down.
[ #a
[
],
[
[1 2 3]
],
[
[0 2 7]
[1 Nan 3]
],
[
[10 0 3]
[NaN 9 9]
[10 NaN 3]
],
[
[8 2 0]
[2 2 3]
[8 1 3]
[1 2 3]
],
[
[0 2 3]
[1 2 9]
[1 2 3]
[1 0 3]
[1 2 3]
]
]
[#b
[
[7 2 3]
[1 2 9]
[1 2 3]
[8 0 3]
[1 7 3]
]
[
[3 9 0]
[2 2 3]
[8 1 3]
[0 2 3]
],
[
[10 0 3]
[0 NaN 9]
[10 NaN 3]
],
[
[0 2 NaN]
[1 Nan 3]
],
[
[1 2 NaN]
],
[
]
]
a = [ [ ],
[ [1, 2, 3] ],
[ [0, 2, 7], [1,np.nan,3] ],
[
[10,0,3], [np.nan,9,9], [10,np.nan,3]
],
[
[8,2,0], [2,2,3], [8,1,3], [1,2,3]
],
[
[0,2,3], [1,2,9], [1,2,3], [1,0,3], [1,2,3]
]
]
b = [
[
[7,2,3], [1,2,9], [1,2,3], [8,0,3], [1,7,3]
],
[
[3,9,0], [2,2,3], [8,1,3], [0,2,3]
],
[
[10,0,3], [0,np.nan,9], [10,np.nan,3]
],
[
[0,2,np.nan], [1,np.nan,3]
],
[
[1,2,np.nan]
],
[
]
]
expected outcome:
[
[ [7 2 3]# from b
[1 2 9]# from b
[1 2 3]# from b
[8 0 3]# from b
[1 7 3]# from b
],
[
[1 2 3]
[3 9 0]# from b
[2 2 3]# from b
[8 1 3]# from b
[0 2 3]# from b
],
[
[0 2 7]
[1 Nan 3]
[10 0 3]# from b
[0 NaN 9]# from b
[10 NaN 3]# from b
],
[
[10 0 3]
[NaN 9 9]
[10 NaN 3]
[0 2 NaN]# from b
[1 Nan 3]# from b
],
[
[8 2 0]
[2 2 3]
[8 1 3]
[1 2 3]
[1 2 NaN]# from b
],
[
[0 2 3]
[1 2 9]
[1 2 3]
[1 0 3]
[1 2 3]
]
]
Do you know a way to do that efficiently?
EDIT: I tried np.concatenate, but it didn't work:
DF_LEN, COL_LEN, cols = 20,5,['A', 'B']
a = np.asarray(pd.DataFrame(1, index=range(DF_LEN), columns=cols))
a = list((map(lambda i: a[:i], range(1,a.shape[0]+1))))
b = np.asarray(pd.DataFrame(np.nan, index=range(DF_LEN), columns=cols))
b = list((map(lambda i: b[:i], range(1,b.shape[0]+1))))
b = b[::-1]
a_first = a[0]; del a[0]
b_last = b[-1]; del b[-1]
result = np.concatenate([a, b], axis=1)
>>>AxisError: axis 1 is out of bounds for array of dimension 1
You cannot have an array whose length varies along a dimension, so a and b are most likely lists of lists rather than arrays. You can use a list comprehension together with zip:
np.array([x+y for x,y in zip(a,b)])
EDIT: or, based on the comment, if a and b are lists of arrays:
np.array([np.vstack((x,y)) for x,y in zip(a,b)])
The output for your example looks like:
[[[ 7.  2.  3.]
  [ 1.  2.  9.]
  [ 1.  2.  3.]
  [ 8.  0.  3.]
  [ 1.  7.  3.]]
[[ 1.  2.  3.]
  [ 3.  9.  0.]
  [ 2.  2.  3.]
  [ 8.  1.  3.]
  [ 0.  2.  3.]]
[[ 0.  2.  7.]
  [ 1. nan  3.]
  [10.  0.  3.]
  [ 0. nan  9.]
  [10. nan  3.]]
[[10.  0.  3.]
  [nan  9.  9.]
  [10. nan  3.]
  [ 0.  2. nan]
  [ 1. nan  3.]]
[[ 8.  2.  0.]
  [ 2.  2.  3.]
  [ 8.  1.  3.]
  [ 1.  2.  3.]
  [ 1.  2. nan]]
[[ 0.  2.  3.]
  [ 1.  2.  9.]
  [ 1.  2.  3.]
  [ 1.  0.  3.]
  [ 1.  2.  3.]]]
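The sample data contains empty entries, which have no column dimension to match against the other operand. A minimal sketch of one way to handle that, assuming every row has the same known width: reshape each entry to (-1, width) before concatenating, so an empty entry becomes a (0, width) array. The helper merge_ragged and its width parameter are hypothetical names, not part of NumPy.

```python
import numpy as np

def merge_ragged(a, b, width):
    """Concatenate a[i] and b[i] row-wise for every i; returns a list of 2-D arrays."""
    merged = []
    for x, y in zip(a, b):
        # reshape(-1, width) turns an empty entry into shape (0, width)
        x2 = np.asarray(x, dtype=float).reshape(-1, width)
        y2 = np.asarray(y, dtype=float).reshape(-1, width)
        merged.append(np.concatenate((x2, y2), axis=0))
    return merged

a = [[], [[1, 2, 3]], [[0, 2, 7], [1, np.nan, 3]]]
b = [[[7, 2, 3], [1, 2, 9]], [[3, 9, 0]], []]

merged = merge_ragged(a, b, width=3)
print([m.shape for m in merged])  # [(2, 3), (2, 3), (2, 3)]

# Only when every merged block has the same number of rows
# can the list be stacked into one 3-D array.
result = np.stack(merged)
print(result.shape)  # (3, 2, 3)
```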
To perform your concatenation, run:
result = np.concatenate([a, b], axis=1)
To test this code, I created a and b as:
a = np.stack([ np.full((2, 3), i) for i in range(1, 6)], axis=1)
b = np.stack([ np.full((2, 3), i + 10) for i in range(1, 4)], axis=1)
So they contain:
a:
array([[[1, 1, 1],
        [2, 2, 2],
        [3, 3, 3],
        [4, 4, 4],
        [5, 5, 5]],
       [[1, 1, 1],
        [2, 2, 2],
        [3, 3, 3],
        [4, 4, 4],
        [5, 5, 5]]])
b:
array([[[11, 11, 11],
        [12, 12, 12],
        [13, 13, 13]],
       [[11, 11, 11],
        [12, 12, 12],
        [13, 13, 13]]])
and their shapes are: (2, 5, 3) and (2, 3, 3)
The result of my concatenation is:
array([[[ 1, 1, 1],
[ 2, 2, 2],
[ 3, 3, 3],
[ 4, 4, 4],
[ 5, 5, 5],
[11, 11, 11],
[12, 12, 12],
[13, 13, 13]],
[[ 1, 1, 1],
[ 2, 2, 2],
[ 3, 3, 3],
[ 4, 4, 4],
[ 5, 5, 5],
[11, 11, 11],
[12, 12, 12],
[13, 13, 13]]])
and the shape is (2, 8, 3), just as it should be.
Edit following the comment as of 19:56Z
I tried the code from your comment.
After you executed a = list((map(lambda i: a[:i], range(1,a.shape[0]+1)))),
the result is:
[array([[1, 1]], dtype=int64),
array([[1, 1],
[1, 1]], dtype=int64),
array([[1, 1],
[1, 1],
[1, 1]], dtype=int64),
array([[1, 1],
[1, 1],
[1, 1],
[1, 1]], dtype=int64),
array([[1, 1],
[1, 1],
[1, 1],
[1, 1],
[1, 1]], dtype=int64),
...
so a is a list of arrays of varying sizes.
There is something wrong in the way you construct your data.
First check that both of your arrays are 3-D and that their shapes differ
only in axis 1. Only then can you run my code on them.
For now both a and b are plain Python lists, not NumPy arrays!
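The check described above can be sketched as a small helper. The function name can_concat_on_axis1 is hypothetical; it only tests the conditions named in this answer (real ndarrays, three dimensions, matching axes 0 and 2).

```python
import numpy as np

def can_concat_on_axis1(a, b):
    """True if a and b are 3-D ndarrays whose shapes differ at most in axis 1."""
    if not (isinstance(a, np.ndarray) and isinstance(b, np.ndarray)):
        return False
    if a.ndim != 3 or b.ndim != 3:
        return False
    return a.shape[0] == b.shape[0] and a.shape[2] == b.shape[2]

a = np.ones((2, 5, 3))
b = np.zeros((2, 3, 3))
ragged = [np.ones((1, 2)), np.ones((2, 2))]  # a list of arrays, not an ndarray

print(can_concat_on_axis1(a, b))            # True
print(can_concat_on_axis1(ragged, ragged))  # False

if can_concat_on_axis1(a, b):
    result = np.concatenate([a, b], axis=1)
    print(result.shape)  # (2, 8, 3)
```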

python numpy 3-d matrix times 1-d array

I have a multi-dimensional array named a (shape (2,3,2)) and another array named c (shape (2,)), as in the code below. How can I get the combination (a[0]*c[0], a[1]*c[1]) without loops? That is, 1 times the first group of a, i.e. [[1,2],[2,-2],[3,-3]], and 10 times the second group of a, namely [[4,-4],[5,-5],[6,-6]]. I have tried a*c, np.multiply(a,c), etc., but these multiply the first column of a by 1 and the second column by 10, which is not what I want. Many thanks.
In [88]: a = np.array([[[1,2],[2,-2],[3,-3]],[[4,-4],[5,-5],[6,-6]]])
In [89]: a
Out[89]:
array([[[ 1, 2],
[ 2, -2],
[ 3, -3]],
[[ 4, -4],
[ 5, -5],
[ 6, -6]]])
In [90]: c = np.array([1,10])
In [91]: c
Out[91]: array([ 1, 10])
In [92]: a*c
Out[92]:
array([[[ 1, 20],
[ 2, -20],
[ 3, -30]],
[[ 4, -40],
[ 5, -50],
[ 6, -60]]])
The output that I want is:
array([[[ 1, 2],
[ 2, -2],
[ 3, -3]],
[[ 40, -40],
[ 50, -50],
[ 60, -60]]])
import numpy as np
a = np.array([[[1,2],
[2,-2],
[3,-3]],
[[4,-4],
[5,-5],
[6,-6]]])
c = np.array([1,10])
print(a*c)
Output:
[[[ 1 20]
[ 2 -20]
[ 3 -30]]
[[ 4 -40]
[ 5 -50]
[ 6 -60]]]
I'm guessing that's what you asked.
What is your question? How to multiply? That you could do like this:
import numpy as np
a = np.array([[[1,2],[2,-2],[3,-3]], [[4,-4],[5,-5],[6,-6]]])
c = np.array([1, 10])
print(a.dot(c))  # dot sums over the last axis of a, so the result has shape (2, 3)
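The requested result needs c aligned with the first axis of a rather than the last. A minimal sketch, assuming the a and c from the question: indexing c as c[:, None, None] gives it shape (2, 1, 1), so broadcasting scales each 3x2 block by its own factor. The names scaled and scaled_einsum are illustrative.

```python
import numpy as np

a = np.array([[[1, 2], [2, -2], [3, -3]],
              [[4, -4], [5, -5], [6, -6]]])  # shape (2, 3, 2)
c = np.array([1, 10])                        # shape (2,)

# c[:, None, None] has shape (2, 1, 1) and broadcasts over axes 1 and 2
scaled = a * c[:, None, None]

# The same product written with einsum, as a cross-check
scaled_einsum = np.einsum('ijk,i->ijk', a, c)

print(scaled)
```

Plain a*c pairs c with the last axis because broadcasting aligns shapes from the right, which is why the columns were scaled instead of the blocks.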

numpy nansum across first index

I have an example 2 x 2 x 2 array:
np.array([[[ 1, 2],
[ 3, 4]],
[[ 5, 6],
[ 7 , 8]]])
I want the nansum of the array across the first index as follows:
Sum all values in:
[[ 1, 2],
[ 3, 4]]
and
[[ 5, 6],
[ 7 , 8]]
The sum of the first array would be 10 and the second would be 26
i.e.
array([10, 26])
I think you are looking for this
a = np.array([[[ 1, 2],
[ 3, 4]],
[[ 5, 6],
[ 7 , 8]]])
np.nansum(a,axis=(1,2))
# array([10, 26])
because you want to sum over axes 1 and 2 only, leaving one number per entry along axis 0.
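The example array contains no NaN, so here is a small sketch with one value replaced by NaN (an assumption made only for illustration) to show what nansum changes compared with a plain sum. Flattening the trailing axes first is an equivalent way to write it.

```python
import numpy as np

a = np.array([[[1.0, np.nan], [3.0, 4.0]],
              [[5.0, 6.0], [7.0, 8.0]]])

# NaN is treated as zero, one sum per entry along axis 0
per_block = np.nansum(a, axis=(1, 2))

# Equivalent: flatten everything after axis 0, then sum each row
flat = np.nansum(a.reshape(a.shape[0], -1), axis=1)

# A plain sum propagates the NaN into the first result
plain = np.sum(a, axis=(1, 2))

print(per_block)  # [ 8. 26.]
```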

convert separate 1D np.arrays into a list of 2D np.arrays

I'm trying to convert three 1D arrays into a list of 2D arrays. I've managed to do this by creating an empty ndarray and populating it line by line. Could someone show me a more elegant approach?
import numpy as np
import pandas as pd
one=np.arange(1,4,1)
two=np.arange(10,40,10)
three=np.arange(100,400,100)
df=pd.DataFrame({'col1':one,'col2':two,'col3':three})
desired_output=[np.array([[1.,10.],[1.,100.]]),np.array([[2.,20.],[2.,200.]]),np.array([[3.,30.],[3.,300.]])]
current, inelegant approach that works:
output=[]
for i in range(len(df)):
    temp=np.zeros(shape=(2,2))
    temp[0][0]=df.iloc[i,0]
    temp[0][1]=df.iloc[i,1]
    temp[1][0]=df.iloc[i,0]
    temp[1][1]=df.iloc[i,2]
    output.append(temp)
First, you can get an array from the DataFrame values like this:
In [61]:
arr = df.values
arr
Out[61]:
array([[ 1, 10, 100],
[ 2, 20, 200],
[ 3, 30, 300]])
then add the first column in the array again
In [73]:
arr_mod = np.hstack((arr , arr[: , 0][:, np.newaxis]))
arr_mod
Out[73]:
array([[ 1, 10, 100, 1],
[ 2, 20, 200, 2],
[ 3, 30, 300, 3]])
swap the column you've just added with the last column in the array
In [74]:
arr_mod[: , [2 , 3]] = arr_mod [: , [3 , 2]]
arr_mod
Out[74]:
array([[ 1, 10, 1, 100],
[ 2, 20, 2, 200],
[ 3, 30, 3, 300]])
then reshape this 2-D array into a 3-D array and convert it to a list
In [78]:
list(arr_mod.reshape( -1, 2 , 2))
Out[78]:
[array([[ 1, 10],
[ 1, 100]]), array([[ 2, 20],
[ 2, 200]]), array([[ 3, 30],
[ 3, 300]])]
Here's one approach using np.column_stack and np.vsplit -
arr2D = np.column_stack((df['col1'],df['col2'],df['col1'],df['col3']))
out_list = np.vsplit(arr2D.reshape(-1,2),arr2D.shape[0])
Basically, we use np.column_stack to stack column-1 with column-2 and then again column-1 with column-3 to give us a 2D NumPy array arr2D of shape N x 4. Next, we reshape arr2D to a 2*N X 2 array and split along the rows with np.vsplit to give us the expected list of 2D arrays.
Sample run -
>>> df
col1 col2 col3
0 1 10 100
1 2 20 200
2 3 30 300
3 4 40 400
4 5 50 500
5 6 60 600
>>> arr2D = np.column_stack((df['col1'],df['col2'],df['col1'],df['col3']))
>>> out_list = np.vsplit(arr2D.reshape(-1,2),arr2D.shape[0])
>>> print(out_list)
[array([[ 1, 10],
[ 1, 100]]), array([[ 2, 20],
[ 2, 200]]), array([[ 3, 30],
[ 3, 300]]), array([[ 4, 40],
[ 4, 400]]), array([[ 5, 50],
[ 5, 500]]), array([[ 6, 60],
[ 6, 600]])]
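The column duplication and the reshape can also be combined into a single indexing step. A minimal sketch, assuming the three 1-D arrays from the question and working on NumPy arrays directly rather than on the DataFrame; the names arr, blocks and out_list are illustrative.

```python
import numpy as np

one = np.arange(1, 4, 1)
two = np.arange(10, 40, 10)
three = np.arange(100, 400, 100)

arr = np.column_stack((one, two, three))  # shape (3, 3)

# Pick the columns in the order col1, col2, col1, col3,
# then view each row of four values as a 2x2 block.
blocks = arr[:, [0, 1, 0, 2]].reshape(-1, 2, 2)

out_list = list(blocks)  # list of 2-D arrays, one per original row
print(out_list[0])
```

The blocks keep the integer dtype of the inputs; append .astype(float) to blocks if float arrays like those in the desired output are required.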
