I would like to extract columns from stacked matrices by index.
Say we have an array a of shape (3, 2, 4), i.e. three matrices of shape (2, 4), and an index array idx of shape (3, 2).
import jax
import jax.numpy as jnp

def get_cols(x, idx):
    # Select the columns listed in idx
    x = x[:, idx]
    return x

idx = jnp.array([[0,1],[2,3],[1,2]])
a = jnp.array([[[1,2,3,4],
                [3,2,2,4]],
               [[100,20,3,50],
                [5,5,2,4]],
               [[1,2,3,4],
                [3,2,2,4]]
              ])
e = jax.vmap(get_cols, in_axes=(None,0))(a,idx)
I want to extract the columns of the different matrices given a batch of indices. I expect the following result:
e = [[[[1,2],
[3,2]],
[[100,20],
[5,5]],
[[1,2],
[3,2]]],
[[[3,4],
[2,4]],
[[3,50],
[2,4]],
[[3,4],
[2,4]]],
[[[2,3],
[2,2]],
[[20,3],
[5,2]],
[[2,3],
[2,2]]]]
What am I missing?
It looks like you're interested in a double vmap over the inputs; e.g. something like this:
e = jax.vmap(jax.vmap(get_cols, in_axes=(0, None)), in_axes=(None, 0))(a, idx)
print(e)
[[[[ 1 2]
[ 3 2]]
[[100 20]
[ 5 5]]
[[ 1 2]
[ 3 2]]]
[[[ 3 4]
[ 2 4]]
[[ 3 50]
[ 2 4]]
[[ 3 4]
[ 2 4]]]
[[[ 2 3]
[ 2 2]]
[[ 20 3]
[ 5 2]]
[[ 2 3]
[ 2 2]]]]
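The same gather can also be written with plain integer-array indexing instead of nested vmap. The sketch below uses NumPy so that it runs without JAX; I would expect jax.numpy to accept the same indexing expression, but that is an assumption, not something taken from the answer above. The helper name gather_cols is made up for illustration.

```python
import numpy as np

a = np.array([[[1, 2, 3, 4], [3, 2, 2, 4]],
              [[100, 20, 3, 50], [5, 5, 2, 4]],
              [[1, 2, 3, 4], [3, 2, 2, 4]]])
idx = np.array([[0, 1], [2, 3], [1, 2]])

def gather_cols(a, idx):
    # a[:, :, idx] has shape (3, 2, 3, 2): (matrix, row, index pair, column).
    picked = a[:, :, idx]
    # Move the index-pair axis to the front to match the double-vmap layout.
    return np.moveaxis(picked, 2, 0)

e = gather_cols(a, idx)
print(e.shape)  # (3, 3, 2, 2)
```

Here e[i, m] holds the columns idx[i] of matrix a[m], which is the layout shown in the expected result.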
Let's say I have two 3-dimensional arrays (a and b) of shape (1,000,000, ???, 50), where ??? varies per entry (see below).
How can I merge them
so that the result has shape (1,000,000, {a's second dimension + b's second dimension}, 50)?
Here are the samples, as you can see below (np.arrays are also possible):
EDIT: added usable code further down, please scroll.
[ #a
[
],
[
[1 2 3]
],
[
[0 2 7]
[1 Nan 3]
],
[
[10 0 3]
[NaN 9 9]
[10 NaN 3]
],
[
[8 2 0]
[2 2 3]
[8 1 3]
[1 2 3]
],
[
[0 2 3]
[1 2 9]
[1 2 3]
[1 0 3]
[1 2 3]
]
]
[#b
[
[7 2 3]
[1 2 9]
[1 2 3]
[8 0 3]
[1 7 3]
],
[
[3 9 0]
[2 2 3]
[8 1 3]
[0 2 3]
],
[
[10 0 3]
[0 NaN 9]
[10 NaN 3]
],
[
[0 2 NaN]
[1 Nan 3]
],
[
[1 2 NaN]
],
[
]
]
a = [ [ ],
[ [1, 2, 3] ],
[ [0, 2, 7], [1,np.nan,3] ],
[
[10,0,3], [np.nan,9,9], [10,np.nan,3]
],
[
[8,2,0], [2,2,3], [8,1,3], [1,2,3]
],
[
[0,2,3], [1,2,9], [1,2,3], [1,0,3], [1,2,3]
]
]
b = [
[
[7,2,3], [1,2,9], [1,2,3], [8,0,3], [1,7,3]
],
[
[3,9,0], [2,2,3], [8,1,3], [0,2,3]
],
[
[10,0,3], [0,np.nan,9], [10,np.nan,3]
],
[
[0,2,np.nan], [1,np.nan,3]
],
[
[1,2,np.nan]
],
[
]
]
expected outcome:
[
[ [7 2 3]# from b
[1 2 9]# from b
[1 2 3]# from b
[8 0 3]# from b
[1 7 3]# from b
],
[
[1 2 3]
[3 9 0]# from b
[2 2 3]# from b
[8 1 3]# from b
[0 2 3]# from b
],
[
[0 2 7]
[1 Nan 3]
[10 0 3]# from b
[0 NaN 9]# from b
[10 NaN 3]# from b
],
[
[10 0 3]
[NaN 9 9]
[10 NaN 3]
[0 2 NaN]# from b
[1 Nan 3]# from b
],
[
[8 2 0]
[2 2 3]
[8 1 3]
[1 2 3]
[1 2 NaN]# from b
],
[
[0 2 3]
[1 2 9]
[1 2 3]
[1 0 3]
[1 2 3]
]
]
Do you know a way to do that efficiently?
EDIT: I tried np.concatenate, which didn't work:
DF_LEN, COL_LEN, cols = 20,5,['A', 'B']
a = np.asarray(pd.DataFrame(1, index=range(DF_LEN), columns=cols))
a = list((map(lambda i: a[:i], range(1,a.shape[0]+1))))
b = np.asarray(pd.DataFrame(np.nan, index=range(DF_LEN), columns=cols))
b = list((map(lambda i: b[:i], range(1,b.shape[0]+1))))
b = b[::-1]
a_first = a[0]; del a[0]
b_last = b[-1]; del b[-1]
result = np.concatenate([a, b], axis=1)
>>>AxisError: axis 1 is out of bounds for array of dimension 1
You cannot have an array whose length varies along a dimension. a and b are most likely lists of lists, not arrays. You can use a list comprehension together with zip:
np.array([x+y for x,y in zip(a,b)])
EDIT: or, based on the comment, if a and b are lists of arrays:
np.array([np.vstack((x,y)) for x,y in zip(a,b)])
The output for your example looks like:
[[[ 7. 2. 3.]
[ 1. 2. 9.]
[ 1. 2. 3.]
[ 8. 0. 3.]
[ 1. 7. 3.]]
[[ 1. 2. 3.]
[ 3. 9. 0.]
[ 2. 2. 3.]
[ 8. 1. 3.]
[ 0. 2. 3.]]
[[ 0. 2. 7.]
[ 1. nan 3.]
[10. 0. 3.]
[ 0. nan 9.]
[10. nan 3.]]
[[10. 0. 3.]
[nan 9. 9.]
[10. nan 3.]
[ 0. 2. nan]
[ 1. nan 3.]]
[[ 8. 2. 0.]
[ 2. 2. 3.]
[ 8. 1. 3.]
[ 1. 2. 3.]
[ 1. 2. nan]]
[[ 0. 2. 3.]
[ 1. 2. 9.]
[ 1. 2. 3.]
[ 1. 0. 3.]
[ 1. 2. 3.]]]
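One caveat with the np.vstack variant: as far as I can tell it fails when an entry is an empty list, because the empty list is promoted to a (1, 0) array whose width does not match the other operand. A minimal sketch that sidesteps this, assuming every row has a known fixed width (the helper name merge_ragged and its width parameter are made up for illustration):

```python
import numpy as np

def merge_ragged(a, b, width):
    """Row-wise concatenate a[i] and b[i] for every i; entries may be empty."""
    merged = []
    for x, y in zip(a, b):
        # reshape(-1, width) turns an empty entry into a (0, width) array,
        # so it can be concatenated with a non-empty one.
        x = np.asarray(x, dtype=float).reshape(-1, width)
        y = np.asarray(y, dtype=float).reshape(-1, width)
        merged.append(np.concatenate([x, y], axis=0))
    return merged

a = [[], [[1, 2, 3]], [[0, 2, 7], [1, np.nan, 3]]]
b = [[[7, 2, 3], [1, 2, 9]], [[3, 9, 0]], []]

merged = merge_ragged(a, b, width=3)
# Stacking into one 3-D array only works if every merged block
# ends up with the same number of rows, as in the question's example.
result = np.stack(merged)
print(result.shape)  # (3, 2, 3)
```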
To perform your concatenation, run:
result = np.concatenate([a, b], axis=1)
To test this code, I created a and b as:
a = np.stack([ np.full((2, 3), i) for i in range(1, 6)], axis=1)
b = np.stack([ np.full((2, 3), i + 10) for i in range(1, 4)], axis=1)
So they contain:
a:
array([[[1, 1, 1],
        [2, 2, 2],
        [3, 3, 3],
        [4, 4, 4],
        [5, 5, 5]],
       [[1, 1, 1],
        [2, 2, 2],
        [3, 3, 3],
        [4, 4, 4],
        [5, 5, 5]]])
b:
array([[[11, 11, 11],
        [12, 12, 12],
        [13, 13, 13]],
       [[11, 11, 11],
        [12, 12, 12],
        [13, 13, 13]]])
and their shapes are: (2, 5, 3) and (2, 3, 3)
The result of my concatenation is:
array([[[ 1, 1, 1],
[ 2, 2, 2],
[ 3, 3, 3],
[ 4, 4, 4],
[ 5, 5, 5],
[11, 11, 11],
[12, 12, 12],
[13, 13, 13]],
[[ 1, 1, 1],
[ 2, 2, 2],
[ 3, 3, 3],
[ 4, 4, 4],
[ 5, 5, 5],
[11, 11, 11],
[12, 12, 12],
[13, 13, 13]]])
and the shape is (2, 8, 3), just as it should be.
Edit following the comment as of 19:56Z
I tried the code from your comment.
After you executed a = list((map(lambda i: a[:i], range(1,a.shape[0]+1)))),
the result is:
[array([[1, 1]], dtype=int64),
array([[1, 1],
[1, 1]], dtype=int64),
array([[1, 1],
[1, 1],
[1, 1]], dtype=int64),
array([[1, 1],
[1, 1],
[1, 1],
[1, 1]], dtype=int64),
array([[1, 1],
[1, 1],
[1, 1],
[1, 1],
[1, 1]], dtype=int64),
...
so a is a list of arrays of varying sizes.
There is something wrong with the way you construct your data.
First check that both of your arrays are 3-D and that their shapes differ
only in axis 1. Only then can you run my code on them.
For now both a and b are plain Python lists, not NumPy arrays!
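The check described above could be sketched roughly as follows. The function name can_concat_axis1 is hypothetical and not part of NumPy; it only tests the preconditions named in the text (real ndarrays, 3-D, shapes equal everywhere except axis 1).

```python
import numpy as np

def can_concat_axis1(a, b):
    """Return True if a and b are 3-D ndarrays whose shapes match except on axis 1."""
    if not (isinstance(a, np.ndarray) and isinstance(b, np.ndarray)):
        return False  # plain Python lists end up here
    if a.ndim != 3 or b.ndim != 3:
        return False
    return a.shape[0] == b.shape[0] and a.shape[2] == b.shape[2]

x = np.ones((2, 5, 3))
y = np.ones((2, 3, 3))
ragged = [np.ones((1, 2)), np.ones((2, 2))]  # list of arrays, as in the question

print(can_concat_axis1(x, y))            # True
print(can_concat_axis1(ragged, ragged))  # False
```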
I have a multi-dimensional array named a (shape (2,3,2)) and another array named c (shape (2,)), as in the following code. How can I get the combination (a[0]*c[0], a[1]*c[1]) without loops? That is, 1 times the first group of a, i.e. [[1,2],[2,-2],[3,-3]], and 10 times the second group of a, namely [[4,-4],[5,-5],[6,-6]]. I have tried a*c, np.multiply(a,c), etc., but these multiply the first column of a by 1 and the second column by 10, which is not what I want. Many thanks.
In [88]: a = np.array([[[1,2],[2,-2],[3,-3]],[[4,-4],[5,-5],[6,-6]]])
In [89]: a
Out[89]:
array([[[ 1, 2],
[ 2, -2],
[ 3, -3]],
[[ 4, -4],
[ 5, -5],
[ 6, -6]]])
In [90]: c = np.array([1,10])
In [91]: c
Out[91]: array([ 1, 10])
In [92]: a*c
Out[92]:
array([[[ 1, 20],
[ 2, -20],
[ 3, -30]],
[[ 4, -40],
[ 5, -50],
[ 6, -60]]])
The output that i want is like
array([[[ 1, 2],
[ 2, -2],
[ 3, -3]],
[[ 40, -40],
[ 50, -50],
[ 60, -60]]])
import numpy as np
a = np.array([[[1,2],
[2,-2],
[3,-3]],
[[4,-4],
[5,-5],
[6,-6]]])
c = np.array([1,10])
print(a*c)
Output:
[[[ 1 20]
[ 2 -20]
[ 3 -30]]
[[ 4 -40]
[ 5 -50]
[ 6 -60]]]
I'm guessing that's what you asked.
What is your question? How to multiply? That you could do like this:
import numpy as np
a = np.array([[[1,2],[2,-2],[3,-3]], [[4,-4],[5,-5],[6,-6]]])
c = np.array([1, 10])
print(a.dot(c))
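For the output requested in the question (block a[i] scaled by c[i]), one option is to give c two trailing axes of length 1 so that broadcasting lines it up with axis 0 of a rather than with the last axis. This is a sketch of that idea, not taken from the answers above; the variable names scaled and scaled_einsum are illustrative.

```python
import numpy as np

a = np.array([[[1, 2], [2, -2], [3, -3]],
              [[4, -4], [5, -5], [6, -6]]])
c = np.array([1, 10])

# Reshape c from (2,) to (2, 1, 1) so it broadcasts along axis 0 of a.
scaled = a * c[:, None, None]

# Equivalent formulation with einsum: scale block i by c[i].
scaled_einsum = np.einsum('i,ijk->ijk', c, a)

print(scaled)
```

The first block is left unchanged and the second block is multiplied by 10, matching the output asked for.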
I have an example 2 x 2 x 2 array:
np.array([[[ 1, 2],
[ 3, 4]],
[[ 5, 6],
[ 7 , 8]]])
I want the nansum of the array across the first index as follows:
Sum all values in:
[[ 1, 2],
[ 3, 4]]
and
[[ 5, 6],
[ 7 , 8]]
The sum of the first array would be 10 and the second would be 26
i.e.
array([10, 26])
I think you are looking for this
a = np.array([[[ 1, 2],
[ 3, 4]],
[[ 5, 6],
[ 7 , 8]]])
np.nansum(a,axis=(1,2))
# array([10, 26])
This sums over axes 1 and 2 only, leaving one number per entry along axis 0.
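To show why nansum rather than sum matters here, the sketch below replaces one entry of the example with NaN (an assumption for illustration; the array in the question contains no NaN) and compares the two reductions.

```python
import numpy as np

a = np.array([[[1.0, np.nan], [3.0, 4.0]],
              [[5.0, 6.0], [7.0, 8.0]]])

# NaN entries are treated as zero; one total per block along axis 0.
per_block = np.nansum(a, axis=(1, 2))

# Same reduction after flattening each block into a single row.
flattened = np.nansum(a.reshape(a.shape[0], -1), axis=1)

# Plain sum lets the NaN propagate into the first total.
plain = a.sum(axis=(1, 2))

print(per_block)  # [ 8. 26.]
```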
I'm trying to convert three 1D arrays into a list of 2D arrays. I've managed to do this by creating an empty ndarray and populating it line by line. Could someone show me a more elegant approach?
import numpy as np
import pandas as pd
one=np.arange(1,4,1)
two=np.arange(10,40,10)
three=np.arange(100,400,100)
df=pd.DataFrame({'col1':one,'col2':two,'col3':three})
desired_output=[np.array([[1.,10.],[1.,100.]]),np.array([[2.,20.],[2.,200.]]),np.array([[3.,30.],[3.,300.]])]
My current, inelegant approach that works:
output=[]
for i in range(len(df)):
    temp=np.zeros(shape=(2,2))
    temp[0][0]=df.iloc[i,0]
    temp[0][1]=df.iloc[i,1]
    temp[1][0]=df.iloc[i,0]
    temp[1][1]=df.iloc[i,2]
    output.append(temp)
First, you can get an array from the DataFrame values as follows:
In [61]:
arr = df.values
arr
Out[61]:
array([[ 1, 10, 100],
[ 2, 20, 200],
[ 3, 30, 300]])
Then append the first column to the array again:
In [73]:
arr_mod = np.hstack((arr , arr[: , 0][:, np.newaxis]))
arr_mod
Out[73]:
array([[ 1, 10, 100, 1],
[ 2, 20, 200, 2],
[ 3, 30, 300, 3]])
Swap the column you've just added with the col3 column, so that each row reads col1, col2, col1, col3:
In [74]:
arr_mod[: , [2 , 3]] = arr_mod [: , [3 , 2]]
arr_mod
Out[74]:
array([[ 1, 10, 1, 100],
[ 2, 20, 2, 200],
[ 3, 30, 3, 300]])
Then reshape this 2D array into a 3D array and convert it to a list:
In [78]:
list(arr_mod.reshape( -1, 2 , 2))
Out[78]:
[array([[ 1, 10],
[ 1, 100]]), array([[ 2, 20],
[ 2, 200]]), array([[ 3, 30],
[ 3, 300]])]
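The append-and-swap steps above can probably be collapsed into a single column selection, since the target row order is simply col1, col2, col1, col3. A minimal sketch, starting from a plain array with the same values as df.values so that it does not depend on pandas (the name blocks is illustrative):

```python
import numpy as np

# Same values as df.values in the example above.
arr = np.array([[1, 10, 100],
                [2, 20, 200],
                [3, 30, 300]])

# Pick columns in the order col1, col2, col1, col3, then fold
# every row of four values into a 2 x 2 block.
blocks = arr[:, [0, 1, 0, 2]].reshape(-1, 2, 2)
out_list = list(blocks)

print(out_list[0])
```

Note that the blocks keep the integer dtype of arr; call blocks.astype(float) first if float arrays are needed, as in the desired output of the question.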
Here's one approach using np.column_stack and np.vsplit -
arr2D = np.column_stack((df['col1'],df['col2'],df['col1'],df['col3']))
out_list = np.vsplit(arr2D.reshape(-1,2),arr2D.shape[0])
Basically, we use np.column_stack to stack column-1 with column-2 and then again column-1 with column-3 to give us a 2D NumPy array arr2D of shape N x 4. Next, we reshape arr2D to a 2*N X 2 array and split along the rows with np.vsplit to give us the expected list of 2D arrays.
Sample run -
>>> df
col1 col2 col3
0 1 10 100
1 2 20 200
2 3 30 300
3 4 40 400
4 5 50 500
5 6 60 600
>>> arr2D = np.column_stack((df['col1'],df['col2'],df['col1'],df['col3']))
>>> out_list = np.vsplit(arr2D.reshape(-1,2),arr2D.shape[0])
>>> print(out_list)
[array([[ 1, 10],
[ 1, 100]]), array([[ 2, 20],
[ 2, 200]]), array([[ 3, 30],
[ 3, 300]]), array([[ 4, 40],
[ 4, 400]]), array([[ 5, 50],
[ 5, 500]]), array([[ 6, 60],
[ 6, 600]])]