Get batched indices from stacked matrices - Python Jax - python

I would like to extract columns from stacked matrices using a batch of indices.
Let us say we have an array a of shape (3, 2, 4), meaning that we have three matrices of shape (2, 4), and an index array of shape (3, 2).
import jax
import jax.numpy as jnp
def get_cols(x, idx):
    x = x[:, idx]
    return x
idx = jnp.array([[0,1],[2,3],[1,2]])
a = jnp.array([[[1,2,3,4],
[3,2,2,4]],
[[100,20,3,50],
[5,5,2,4]],
[[1,2,3,4],
[3,2,2,4]]
])
e = jax.vmap(get_cols, in_axes=(None,0))(a,idx)
I want to extract the columns of the different matrices given a batch of indices. I expect the following result:
e = [[[[1,2],
[3,2]],
[[100,20],
[5,5]],
[[1,2],
[3,2]]],
[[[3,4],
[2,4]],
[[3,50],
[2,4]],
[[3,4],
[2,4]]],
[[[2,3],
[2,2]],
[[20,3],
[5,2]],
[[2,3],
[2,2]]]]
What am I missing?

It looks like you're interested in a double vmap over the inputs; e.g. something like this:
e = jax.vmap(jax.vmap(get_cols, in_axes=(0, None)), in_axes=(None, 0))(a, idx)
print(e)
[[[[ 1 2]
[ 3 2]]
[[100 20]
[ 5 5]]
[[ 1 2]
[ 3 2]]]
[[[ 3 4]
[ 2 4]]
[[ 3 50]
[ 2 4]]
[[ 3 4]
[ 2 4]]]
[[[ 2 3]
[ 2 2]]
[[ 20 3]
[ 5 2]]
[[ 2 3]
[ 2 2]]]]
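For comparison, the same result can probably be had without vmap by indexing the last axis with the whole index array and then moving the new batch axis to the front. This is only a minimal sketch in plain NumPy; the names cols and e are illustrative, and I would expect the same expression to work on jnp arrays, but that part is an assumption:

```python
import numpy as np

idx = np.array([[0, 1], [2, 3], [1, 2]])
a = np.array([[[1, 2, 3, 4], [3, 2, 2, 4]],
              [[100, 20, 3, 50], [5, 5, 2, 4]],
              [[1, 2, 3, 4], [3, 2, 2, 4]]])

# a[:, :, idx] has shape (3, 2, 3, 2): (matrix, row, index set, column)
cols = a[:, :, idx]
# move the index-set axis to the front -> (3, 3, 2, 2)
e = np.transpose(cols, (2, 0, 1, 3))
print(e.shape)
```

Here e[i, m] holds the columns idx[i] of matrix a[m], which is the layout of the expected result in the question.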

Related

Join numpy arrays of different dimensions and shapes

I have 2 arrays: one has a shape of (2,2) and the other has a shape of (2,2,2). I want to stack them together so that my final result has a shape of (3,2,2). Here is an illustration of what I mean:
Array 1 -> [ 1,2 ] -> shape(2,2)
[ 3,4 ]
Array 2 -> [ 5,6 ] [ 9,10 ] -> shape (2,2,2)
[ 7,8 ] [ 11,12 ]
Final Array after stacking Arrays 1 and 2 -> [ 1,2 ] [ 5,6 ] [ 9,10 ] ->shape (3,2,2)
[ 3,4 ] [ 7,8 ] [ 11,12 ]
To be more flexible with dimension choices you can use ravel with reshape:
import numpy as np
arr1 = np.arange(1, 5).reshape(2, 2)
arr2 = np.arange(5, 13).reshape(2, 2, 2)
stack = np.concatenate((arr1.ravel(),arr2.ravel())).reshape(3,2,2)
Output:
>>> stack
array([[[ 1, 2],
[ 3, 4]],
[[ 5, 6],
[ 7, 8]],
[[ 9, 10],
[11, 12]]])
Use numpy.dstack to stack your arrays along the depth (third) axis:
import numpy as np
a = np.arange(1, 5).reshape(2, 2)
b = np.arange(5, 13).reshape(2, 2, 2)
c = np.dstack((a, b))
print(c)
#[[[ 1 5 6]
# [ 2 7 8]]
#
# [[ 3 9 10]
# [ 4 11 12]]]
print(c.shape)
#(2, 2, 3)
EDIT: To get your desired shape of (3, 2, 2), you can still use np.dstack, but with transposed input and output:
c = np.dstack((a.T, b.T)).T
print(c)
#[[[ 1 2]
# [ 3 4]]
#
# [[ 5 6]
# [ 7 8]]
#
# [[ 9 10]
# [11 12]]]
print(c.shape)
#(3, 2, 2)
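Another option, sketched here under the assumption that the 2-D array should simply become the first (2,2) slice of the result, is to give it a leading axis of length 1 and concatenate along axis 0:

```python
import numpy as np

a = np.arange(1, 5).reshape(2, 2)
b = np.arange(5, 13).reshape(2, 2, 2)

# a[np.newaxis] has shape (1, 2, 2), so it lines up with b along axis 0
c = np.concatenate((a[np.newaxis], b), axis=0)
print(c.shape)
```

This avoids the transposes, since no axis has to be moved.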

Is there a pythonic way to concatenate groups of columns from a matrix along the first axis?

I'm working with an m x n numpy 2D array which holds some integer values. The dimensions are unknown before executing the script, but n (the width) is always even. Something like:
[[ 1 2 3 4]
[ 1 2 3 4]
[ 1 2 3 4]
[ 1 2 3 4]
[ 1 2 3 4]]
What I need is to group the columns in pairs and concatenate them along the first axis:
[[ 1 2]
[ 1 2]
[ 1 2]
[ 1 2]
[ 1 2]
[ 3 4]
[ 3 4]
[ 3 4]
[ 3 4]
[ 3 4]]
I tried using reshape but that doesn't output the expected result. I'm not very used to programming in Python; I could implement it using loops and if statements, but I'm sure there's a more elegant way to do it. Any help is welcome!
You need to transpose the array between the two reshapes:
import numpy as np
# sample
a = np.stack([[1,2,3,4, 5, 6]]*2)
a.reshape(a.shape[0], -1, 2).transpose(1,0,2).reshape(-1,2)
Output:
array([[1, 2],
[1, 2],
[3, 4],
[3, 4],
[5, 6],
[5, 6]])
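The reshape, transpose, reshape idea can be wrapped in a small function. This is just a sketch: the name stack_column_pairs and the width parameter are made up for illustration, and it is run on the 5 x 4 example from the question:

```python
import numpy as np

def stack_column_pairs(a, width=2):
    """Stack consecutive groups of `width` columns on top of each other."""
    m, n = a.shape
    if n % width:
        raise ValueError("number of columns must be a multiple of width")
    # (m, n) -> (m, groups, width) -> (groups, m, width) -> (groups * m, width)
    return a.reshape(m, n // width, width).transpose(1, 0, 2).reshape(-1, width)

a = np.tile(np.array([1, 2, 3, 4]), (5, 1))  # the 5 x 4 example
out = stack_column_pairs(a)
print(out.shape)
```

The first five rows of out are [1 2] and the last five are [3 4].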
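With reshape you can also read the array column by column (order='F'), but on this 4-column example a single Fortran-order reshape pairs column 0 with column 2, which is not the layout asked for. Splitting the array into column pairs and stacking them gives the expected result:
a=np.array([[1,2,3,4],[1,2,3,4],[1,2,3,4],[1,2,3,4]])
np.vstack(np.hsplit(a, a.shape[1] // 2))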

how to merge two 3d-arrays on the 2nd dimension efficiently?

Let's say I have two 3-dimensional arrays (a and b) of shape (1000000, ???, 50), where ??? varies per entry (see below).
How can I merge them
so that the result has shape (1000000, {a's second dimension + b's second dimension}, 50)?
Here are the samples, as you can see below: (np.arrays are also possible)
EDIT: added usable code, please scroll^^
[ #a
[
],
[
[1 2 3]
],
[
[0 2 7]
[1 NaN 3]
],
[
[10 0 3]
[NaN 9 9]
[10 NaN 3]
],
[
[8 2 0]
[2 2 3]
[8 1 3]
[1 2 3]
],
[
[0 2 3]
[1 2 9]
[1 2 3]
[1 0 3]
[1 2 3]
]
]
[#b
[
[7 2 3]
[1 2 9]
[1 2 3]
[8 0 3]
[1 7 3]
],
[
[3 9 0]
[2 2 3]
[8 1 3]
[0 2 3]
],
[
[10 0 3]
[0 NaN 9]
[10 NaN 3]
],
[
[0 2 NaN]
[1 NaN 3]
],
[
[1 2 NaN]
],
[
]
]
a = [ [ ],
[ [1, 2, 3] ],
[ [0, 2, 7], [1,np.nan,3] ],
[
[10,0,3], [np.nan,9,9], [10,np.nan,3]
],
[
[8,2,0], [2,2,3], [8,1,3], [1,2,3]
],
[
[0,2,3], [1,2,9], [1,2,3], [1,0,3], [1,2,3]
]
]
b = [
[
[7,2,3], [1,2,9], [1,2,3], [8,0,3], [1,7,3]
],
[
[3,9,0], [2,2,3], [8,1,3], [0,2,3]
],
[
[10,0,3], [0,np.nan,9], [10,np.nan,3]
],
[
[0,2,np.nan], [1,np.nan,3]
],
[
[1,2,np.nan]
],
[
]
]
expected outcome:
[
[ [7 2 3]# from b
[1 2 9]# from b
[1 2 3]# from b
[8 0 3]# from b
[1 7 3]# from b
],
[
[1 2 3]
[3 9 0]# from b
[2 2 3]# from b
[8 1 3]# from b
[0 2 3]# from b
],
[
[0 2 7]
[1 NaN 3]
[10 0 3]# from b
[0 NaN 9]# from b
[10 NaN 3]# from b
],
[
[10 0 3]
[NaN 9 9]
[10 NaN 3]
[0 2 NaN]# from b
[1 NaN 3]# from b
],
[
[8 2 0]
[2 2 3]
[8 1 3]
[1 2 3]
[1 2 NaN]# from b
],
[
[0 2 3]
[1 2 9]
[1 2 3]
[1 0 3]
[1 2 3]
]
]
Do you know a way to do that efficiently?
EDIT: tried concatenate (didn't work):
import numpy as np
import pandas as pd
DF_LEN, COL_LEN, cols = 20,5,['A', 'B']
a = np.asarray(pd.DataFrame(1, index=range(DF_LEN), columns=cols))
a = list((map(lambda i: a[:i], range(1,a.shape[0]+1))))
b = np.asarray(pd.DataFrame(np.nan, index=range(DF_LEN), columns=cols))
b = list((map(lambda i: b[:i], range(1,b.shape[0]+1))))
b = b[::-1]
a_first = a[0]; del a[0]
b_last = b[-1]; del b[-1]
result = np.concatenate([a, b], axis=1)
>>>AxisError: axis 1 is out of bounds for array of dimension 1
A NumPy array cannot have a variable length along one of its dimensions, so a and b are most likely lists of lists and not arrays. You can use a list comprehension along with zip:
np.array([x+y for x,y in zip(a,b)])
EDIT: or, based on the comment, if a and b are lists of arrays:
np.array([np.vstack((x,y)) for x,y in zip(a,b)])
The output for your example looks like:
[[[ 7.  2.  3.]
  [ 1.  2.  9.]
  [ 1.  2.  3.]
  [ 8.  0.  3.]
  [ 1.  7.  3.]]
[[ 1.  2.  3.]
  [ 3.  9.  0.]
  [ 2.  2.  3.]
  [ 8.  1.  3.]
  [ 0.  2.  3.]]
[[ 0.  2.  7.]
  [ 1. nan  3.]
  [10.  0.  3.]
  [ 0. nan  9.]
  [10. nan  3.]]
[[10.  0.  3.]
  [nan  9.  9.]
  [10. nan  3.]
  [ 0.  2. nan]
  [ 1. nan  3.]]
[[ 8.  2.  0.]
  [ 2.  2.  3.]
  [ 8.  1.  3.]
  [ 1.  2.  3.]
  [ 1.  2. nan]]
[[ 0.  2.  3.]
  [ 1.  2.  9.]
  [ 1.  2.  3.]
  [ 1.  0.  3.]
  [ 1.  2.  3.]]]
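If some groups are empty, it may help to force every group to a 2-D array with a fixed number of columns before concatenating. A minimal sketch, assuming a and b are equally long lists of row lists; merge_groups and n_cols are hypothetical names, not something from the question:

```python
import numpy as np

def merge_groups(a, b, n_cols):
    """Concatenate the i-th group of a with the i-th group of b, row-wise."""
    merged = []
    for x, y in zip(a, b):
        # reshape(-1, n_cols) turns an empty group into a (0, n_cols) array
        x = np.asarray(x, dtype=float).reshape(-1, n_cols)
        y = np.asarray(y, dtype=float).reshape(-1, n_cols)
        merged.append(np.concatenate((x, y), axis=0))
    return merged

a = [[], [[1, 2, 3]], [[0, 2, 7], [1, np.nan, 3]]]
b = [[[7, 2, 3], [1, 2, 9]], [[3, 9, 0]], []]
merged = merge_groups(a, b, n_cols=3)
# every merged group has 2 rows here, so they can be stacked into one array
stacked = np.stack(merged)
print(stacked.shape)
```

np.stack only works because every merged group ends up with the same number of rows, as in the question's example; otherwise keep the result as a list.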
To perform your concatenation, run:
result = np.concatenate([a, b], axis=1)
To test this code, I created a and b as:
a = np.stack([ np.full((2, 3), i) for i in range(1, 6)], axis=1)
b = np.stack([ np.full((2, 3), i + 10) for i in range(1, 4)], axis=1)
So they contain:
a:
array([[[1, 1, 1],
        [2, 2, 2],
        [3, 3, 3],
        [4, 4, 4],
        [5, 5, 5]],
       [[1, 1, 1],
        [2, 2, 2],
        [3, 3, 3],
        [4, 4, 4],
        [5, 5, 5]]])
b:
array([[[11, 11, 11],
        [12, 12, 12],
        [13, 13, 13]],
       [[11, 11, 11],
        [12, 12, 12],
        [13, 13, 13]]])
and their shapes are: (2, 5, 3) and (2, 3, 3)
The result of my concatenation is:
array([[[ 1, 1, 1],
[ 2, 2, 2],
[ 3, 3, 3],
[ 4, 4, 4],
[ 5, 5, 5],
[11, 11, 11],
[12, 12, 12],
[13, 13, 13]],
[[ 1, 1, 1],
[ 2, 2, 2],
[ 3, 3, 3],
[ 4, 4, 4],
[ 5, 5, 5],
[11, 11, 11],
[12, 12, 12],
[13, 13, 13]]])
and the shape is (2, 8, 3), just as it should be.
Edit following the comment as of 19:56Z
I tried the code from your comment.
After you executed a = list((map(lambda i: a[:i], range(1,a.shape[0]+1)))),
the result is:
[array([[1, 1]], dtype=int64),
array([[1, 1],
[1, 1]], dtype=int64),
array([[1, 1],
[1, 1],
[1, 1]], dtype=int64),
array([[1, 1],
[1, 1],
[1, 1],
[1, 1]], dtype=int64),
array([[1, 1],
[1, 1],
[1, 1],
[1, 1],
[1, 1]], dtype=int64),
...
so a is a list of arrays of varying sizes.
There is something wrong with the way you construct your data.
First check that both arrays are 3-D and that their shapes differ
only in axis 1. Only then can you run my code on them.
For now both a and b are plain Python lists, not NumPy arrays!
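That check can be written down as a small guard. This is only a sketch; can_concat_axis1 is a made-up helper name, and the sample arrays are the same shapes as in my test above:

```python
import numpy as np

def can_concat_axis1(a, b):
    """True if a and b are 3-D ndarrays whose shapes differ at most in axis 1."""
    if not (isinstance(a, np.ndarray) and isinstance(b, np.ndarray)):
        return False
    if a.ndim != 3 or b.ndim != 3:
        return False
    return a.shape[0] == b.shape[0] and a.shape[2] == b.shape[2]

a = np.ones((2, 5, 3))
b = np.zeros((2, 3, 3))
ok = can_concat_axis1(a, b)
result = np.concatenate([a, b], axis=1) if ok else None

# plain Python lists of arrays are rejected
list_ok = can_concat_axis1([a[0]], [b[0]])
print(ok, list_ok)
```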

separating 2d numpy array into nxn chunks

How would you separate a 2D numpy array into n x n chunks?
For example, the following array of shape (4,4):
arr = [[1,2,3,4],
[5,6,7,8],
[9,10,11,12],
[13,14,15,16]]
Transformed to this array, of shape (4,2,2), by subsampling with a different (2x2) array:
new_arr = [[[1,2],
[5,6]],
[[3,4],
[7,8]],
[[9,10],
[13,14]],
[[11,12],
[15,16]]]
You can use np.vsplit to split the array into multiple subarrays vertically, and np.hsplit to split it horizontally. The generalized ressample function below combines the two.
Use this:
import numpy as np
def ressample(arr, N):
    A = []
    # split into row bands of height N, then split each band into N-wide blocks
    for v in np.vsplit(arr, arr.shape[0] // N):
        A.extend(np.hsplit(v, arr.shape[1] // N))
    return np.array(A)
Example 1:
The given 2D array is of shape 4x4 and we want to subsample it into the chunks of shape 2x2.
arr = np.array([[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16]])
print(ressample(arr, 2)) #--> chunk size 2
Output 1:
[[[ 1 2]
[ 5 6]]
[[ 3 4]
[ 7 8]]
[[ 9 10]
[13 14]]
[[11 12]
[15 16]]]
Example 2:
Now consider a 2D array with 8 rows and 8 columns, which we subsample into chunks of shape 4x4.
arr = np.random.randint(0, 10, 64).reshape(8, 8)
print(ressample(arr, 4)) #--> chunk size 4
Sample Output 2:
[[[8 3 7 5]
[7 2 6 1]
[7 9 2 2]
[3 1 8 8]]
[[2 0 3 2]
[2 9 0 8]
[2 6 3 9]
[2 4 4 8]]
[[9 9 1 8]
[9 1 5 0]
[8 5 1 2]
[2 7 5 1]]
[[7 8 9 6]
[9 0 9 5]
[8 9 8 3]
[7 3 6 3]]]
You could do the following, and adjust it to your array:
import numpy as np
arr = [[1,2,3,4],
[5,6,7,8],
[9,10,11,12],
[13,14,15,16]]
# step through rows and columns two at a time so the 2x2 blocks do not overlap
arr_new = np.array([[arr[i][j:j+2], arr[i+1][j:j+2]] for i in range(0, len(arr), 2) for j in range(0, len(arr[0]), 2)])
print(arr_new)
print(arr_new.shape)
This gives the following output:
[[[ 1 2]
[ 5 6]]
[[ 3 4]
[ 7 8]]
[[ 9 10]
[13 14]]
[[11 12]
[15 16]]]
(4, 2, 2)
You could use the vsplit() and hsplit() functions to achieve the above.
import numpy as np
arr = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]])
# split into top and bottom halves first so the blocks come out row by row
ls1,ls2 = np.vsplit(arr, 2)
ls1 = np.hsplit(ls1,2)
ls2 = np.hsplit(ls2,2)
ls = ls1 + ls2
result = np.array(ls)
print(result)
>>>
[[[ 1 2]
[ 5 6]]
[[ 3 4]
[ 7 8]]
[[ 9 10]
[13 14]]
[[11 12]
[15 16]]]
print(result.tolist())
>>> [[[1, 2], [5, 6]], [[3, 4], [7, 8]], [[9, 10], [13, 14]], [[11, 12], [15, 16]]]
There is no need to split or anything; the same can be achieved by reshaping and reordering the axes.
result = np.swapaxes(arr.reshape(2, 2, 2, 2), 1, 2).reshape(-1, 2, 2)
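For an arbitrary chunk size the hard-coded 2s can be derived from the array shape. A minimal sketch (split_blocks is an illustrative name, and it assumes both dimensions are multiples of n):

```python
import numpy as np

def split_blocks(arr, n):
    """Split a 2-D array into non-overlapping (n, n) blocks, row by row."""
    h, w = arr.shape
    if h % n or w % n:
        raise ValueError("both dimensions must be multiples of n")
    # (h, w) -> (h//n, n, w//n, n) -> (h//n, w//n, n, n) -> (blocks, n, n)
    return arr.reshape(h // n, n, w // n, n).swapaxes(1, 2).reshape(-1, n, n)

arr = np.arange(1, 17).reshape(4, 4)
blocks = split_blocks(arr, 2)
print(blocks.shape)

# a non-square input works too, as long as n divides both dimensions
wide = split_blocks(np.arange(24).reshape(4, 6), 2)
print(wide.shape)
```

The swapaxes call is what brings the two "within block" axes next to each other before the final reshape.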
Dividing an (N, N) array into (n, n) chunks is also essentially a sliding-window operation with an (n, n) window and a stride of n.
from numpy.lib.stride_tricks import sliding_window_view
result = sliding_window_view(arr, (2, 2))[::2, ::2].reshape(-1, 2, 2)

sum elements of array

I have an array like this:
array = np.array([[[[ 2, -3],[ 3, 2]],[[-4, -1],[-5, 1]],
[[-7, -5],[-1, 6]],[[-5, 0],[-4, 2]]],
[[[-1, 4],[ 6, 1]],[[-2, -3],[-5, 5]],
[[-2, -8],[-1, 7]],[[-1, 8],[-4, 2]]]])
If I call sum(array), I get the elementwise sum of the two (4x2x2) blocks.
How can I instead sum the elements inside the innermost arrays, the opposite of what the sum() function did? For example (2-3) = -1 for the first pair, (3+2) = 5 for the second, etc.
Thanks
Summing along axis 3 (the last axis) should do what you want:
res = np.sum(array, axis=3)
# or:
# res = array.sum(axis=3)
which produces
[[[ -1 5]
[ -5 -4]
[-12 5]
[ -5 -2]]
[[ 3 7]
[ -5 0]
[-10 6]
[ 7 -2]]]
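To see the difference between the two kinds of sum side by side, here is a short sketch; per_pair and builtin are just illustrative names, and axis=-1 is used as another way to refer to the last axis:

```python
import numpy as np

array = np.array([[[[ 2, -3], [ 3, 2]], [[-4, -1], [-5, 1]],
                   [[-7, -5], [-1, 6]], [[-5,  0], [-4, 2]]],
                  [[[-1,  4], [ 6, 1]], [[-2, -3], [-5, 5]],
                   [[-2, -8], [-1, 7]], [[-1,  8], [-4, 2]]]])

# sum each innermost pair: shape (2, 4, 2, 2) -> (2, 4, 2)
per_pair = array.sum(axis=-1)

# the builtin sum iterates over the first axis and adds the two (4, 2, 2) blocks
builtin = sum(array)

print(per_pair.shape, builtin.shape)
```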
