Combine array along axis - python

I'm trying to do the following:
I have a (4,2)-shaped array:
a = np.array([[-1, 0],[1, 0],[0, -1], [0, 1]])
I have another (2, 2)-shaped array:
b = np.array([[10, 10], [5, 5]])
I'd like to add them along rows of b and concatenate, so that I end up with:
[[ 9, 10],
[11, 10],
[10, 9],
[10, 11],
[4, 5],
[6, 5],
[5, 4],
[5, 6]]
The first four elements are b[0]+a, and the last four are b[1]+a. How can I generalize this to a (N, 2)-shaped b without using a for loop over its elements?

You can use broadcasting to get all the summations in a vectorized manner to have a 3D array, which could then be stacked into a 2D array with np.vstack for the desired output. Thus, the implementation would be something like this -
np.vstack((a + b[:,None,:]))
Sample run -
In [74]: a
Out[74]:
array([[-1, 0],
[ 1, 0],
[ 0, -1],
[ 0, 1]])
In [75]: b
Out[75]:
array([[10, 10],
[ 5, 5]])
In [76]: np.vstack((a + b[:,None,:]))
Out[76]:
array([[ 9, 10],
[11, 10],
[10, 9],
[10, 11],
[ 4, 5],
[ 6, 5],
[ 5, 4],
[ 5, 6]])
You can replace np.vstack with a reshape, which might be a bit more efficient, like so -
(a + b[:,None,:]).reshape(-1,a.shape[1])
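To address the general (N, 2) case from the question, the broadcasting-plus-reshape idea can be wrapped in a small helper. This is a minimal sketch: the function name add_rows_and_stack is made up for illustration, and the loop version is only there as a reference to check against.

```python
import numpy as np

def add_rows_and_stack(a, b):
    """Return b[0]+a, b[1]+a, ... stacked vertically (hypothetical helper)."""
    # b[:, None, :] has shape (N, 1, 2); adding a (M, 2) broadcasts to (N, M, 2).
    summed = a + b[:, None, :]
    # Collapse the first two axes into one: (N * M, 2).
    return summed.reshape(-1, a.shape[1])

a = np.array([[-1, 0], [1, 0], [0, -1], [0, 1]])
b = np.array([[10, 10], [5, 5], [0, 3]])  # N = 3 here

result = add_rows_and_stack(a, b)
# Reference result built with an explicit loop over the rows of b.
expected = np.concatenate([row + a for row in b])
print(np.array_equal(result, expected))
```

The reshape works because the broadcast result is laid out with the b index first, so flattening the first two axes keeps each b[i]+a block contiguous.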

Related

Numpy "Fortran"-like reshape?

Let's say I have an array X of shape (6, 2) like this:
import numpy as np
X = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]])
I want to reshape it to an array of shape (3, 2, 2), so I did this:
X.reshape(3, 2, 2)
And got:
array([[[ 1, 2],
[ 3, 4]],
[[ 5, 6],
[ 7, 8]],
[[ 9, 10],
[11, 12]]])
However, I need my data in a different format. To be precise, I want to end up with:
array([[[ 1, 2],
[ 7, 8]],
[[ 3, 4],
[ 9, 10]],
[[ 5, 6],
[11, 12]]])
Should I be using reshape for this or something else? What's the best way to do this in Numpy?
You have to set the order option:
>>> X.reshape(3, 2, 2, order='F')
array([[[ 1, 2],
[ 7, 8]],
[[ 3, 4],
[ 9, 10]],
[[ 5, 6],
[11, 12]]])
‘F’ means to read / write the elements using Fortran-like index order, with the first index changing fastest, and the last index changing slowest.
see: https://numpy.org/doc/stable/reference/generated/numpy.reshape.html
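One way to see what "first index changing fastest" means is to flatten both arrays in Fortran order. The reshape reads the elements of X in that order and writes them into the new shape in the same order, so the two flattened sequences should agree. A small check, using the X from the question:

```python
import numpy as np

X = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]])
Y = X.reshape(3, 2, 2, order='F')

# Reading X with the first index changing fastest walks down column 0, then column 1.
print(X.ravel(order='F'))   # [ 1  3  5  7  9 11  2  4  6  8 10 12]

# Y holds the same sequence when it is read back in Fortran order.
print(np.array_equal(Y.ravel(order='F'), X.ravel(order='F')))
```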
You need to specify order;
X.reshape(3, 2, 2, order='F')
should work
A functional equivalent to the order='F' reshape:
In [31]: x.reshape(2,3,2).transpose(1,0,2)
Out[31]:
array([[[ 1, 2],
[ 7, 8]],
[[ 3, 4],
[ 9, 10]],
[[ 5, 6],
[11, 12]]])
In [32]: x.reshape(2,3,2).transpose(1,0,2).strides
Out[32]: (16, 48, 8)
Without the transpose the strides would be (48,16,8).
One thing that's a bit tricky about this layout is that the last dimension remains in 'C' order. It's just the first two dimensions that are switched.
The full 'F' layout would be
In [33]: x = np.arange(1,13).reshape(3,2,2,order='F')
In [34]: x
Out[34]:
array([[[ 1, 7],
[ 4, 10]],
[[ 2, 8],
[ 5, 11]],
[[ 3, 9],
[ 6, 12]]])
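The equivalence and the strides quoted in this answer can be checked directly. The sketch below expresses strides as multiples of the item size, since the (16, 48, 8) figures assume 8-byte integers, which is not the default integer size on every platform.

```python
import numpy as np

X = np.arange(1, 13).reshape(6, 2)   # same values as X in the question

f_order = X.reshape(3, 2, 2, order='F')
swapped = X.reshape(2, 3, 2).transpose(1, 0, 2)
print(np.array_equal(f_order, swapped))

# transpose only permutes the strides; no data is copied.
print(np.shares_memory(X, swapped))

# Strides in units of one element rather than bytes.
item = X.itemsize
print(tuple(s // item for s in swapped.strides))   # (2, 6, 1)
```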

numpy - slicing a 3d array, how to apply two slices of different length in a certain axis

I have been stuck with a question about slicing numpy array for a while.
Below is an array I have right now:
a = np.array([[[ 1, 2],
[ 3, 4],
[ 5, 6]],
[[ 7, 8],
[ 9, 10],
[11, 12]]])
How can I use slicing to get an array like the following?
np.array([[[ 1, 2]],
[[ 9, 10],
[11, 12]]])
I have tried a[[0,1],[0,[1,2]]], but it didn't work and gave an error:
ValueError: setting an array element with a sequence.
Thank you in advance!
The exact thing you give as your desired output is not possible, since arrays have to be "hyper-rectangles", so X[0].shape has to be the same as X[1].shape.
What you can do is:
a[[0,1,1],[0,1,2]]
# array([[ 1, 2],
# [ 9, 10],
# [11, 12]])
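The two index lists pair up element by element, so (0, 0), (1, 1) and (1, 2) pick a[0, 0], a[1, 1] and a[1, 2]. If the two groups have to stay separate, one option is a plain Python list of slices. This is an assumption about what is wanted, not something NumPy offers as a single regular array:

```python
import numpy as np

a = np.array([[[1, 2], [3, 4], [5, 6]],
              [[7, 8], [9, 10], [11, 12]]])

# Integer-array indexing: the i-th result row is a[first[i], second[i]].
first = [0, 1, 1]
second = [0, 1, 2]
flat = a[first, second]          # shape (3, 2)

# Ragged groups kept as a list of ordinary slices (views into a).
pieces = [a[0, :1], a[1, 1:]]    # shapes (1, 2) and (2, 2)
print(flat)
print([p.shape for p in pieces])
```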
You can do this, for example:
import numpy as np
a = np.array([[[ 1, 2], [ 3, 4], [ 5, 6]], [[ 7, 8], [ 9, 10], [11, 12]]])
print(np.array([[a[0, 0, :], a[1, 1, :], a[1, 2, :]]]))
Result:
[[[ 1 2]
[ 9 10]
[11 12]]]
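You can apply the two slicing operations separately and merge them afterwards. Because the pieces have different lengths, the result has to be an object array (recent NumPy versions require dtype=object to be passed explicitly):
np.array((a[0,0:1].tolist(), a[1,1:].tolist()), dtype=object)
# a length-2 object array holding the lists [[1, 2]] and [[9, 10], [11, 12]]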

Element-wise multiplication of 'slices' of 2D matrix to form 3D matrix

A matrix multiplication like this is easy to implement in Python using numpy:
import numpy as np
np.array([[1, 2, 3]]) * np.array([[1], [2], [3]])
array([[1, 2, 3],
[2, 4, 6],
[3, 6, 9]])
But in my situation, I have 2 2D matrices that I want to multiply to form a 3D matrix. Effectively, the first 'slice' of the 2D matrix is an array that I want to multiply by the first 'slice' of the second matrix to form a 2D matrix. This is continued for all the 'slices' of the 2D matrices. Think of the first as being dimensions [x,z] and the second being dimensions [y,z]. I want to multiply them to get [x,y,z]. Is there an elegant way to do this in numpy?
Because you can already describe your multiplication as
[x, z] * [y, z] -> [x, y, z]
the most straightforward solution will most likely be using Einsum:
import numpy as np
A = np.arange(12).reshape(4, 3)
# array([[ 0, 1, 2],
# [ 3, 4, 5],
# [ 6, 7, 8],
# [ 9, 10, 11]])
B = np.arange(9).reshape(3, 3)
# array([[0, 1, 2],
# [3, 4, 5],
# [6, 7, 8]])
C = np.einsum('xz,yz->xyz', A, B)
# array([[[ 0, 1, 4],
# [ 0, 4, 10],
# [ 0, 7, 16]],
#
# [[ 0, 4, 10],
# [ 9, 16, 25],
# [18, 28, 40]],
#
# [[ 0, 7, 16],
# [18, 28, 40],
# [36, 49, 64]],
#
# [[ 0, 10, 22],
# [27, 40, 55],
# [54, 70, 88]]])
An alternative is to simply use broadcasting
D = A[:, None, :] * B[None, :, :]
np.allclose(D, C)
# True
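Both versions compute C[x, y, z] = A[x, z] * B[y, z]. A quick way to convince yourself is to compare against an explicit loop over the last axis, where each slice is the outer product of one column of A with the matching column of B. This is only a verification sketch; the loop is not meant as a replacement.

```python
import numpy as np

A = np.arange(12).reshape(4, 3)   # dimensions [x, z]
B = np.arange(9).reshape(3, 3)    # dimensions [y, z]

C = np.einsum('xz,yz->xyz', A, B)
D = A[:, None, :] * B[None, :, :]

# Reference: build each z-slice as the outer product of column z of A and B.
ref = np.empty((A.shape[0], B.shape[0], A.shape[1]), dtype=A.dtype)
for z in range(A.shape[1]):
    ref[:, :, z] = np.outer(A[:, z], B[:, z])

print(C.shape)                                       # (4, 3, 3)
print(np.array_equal(C, ref), np.array_equal(D, ref))
```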
I managed to figure it out with the help of the response to this StackOverflow question.
arr = np.array([[1, 2, 3]])
arr * arr.T
array([[1, 2, 3],
[2, 4, 6],
[3, 6, 9]])
mat = np.repeat(arr, 3, axis=0)
mat
array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3]])
mat[:,:,None] * np.transpose(mat[:,None,:], axes=(1, 0, 2))
array([[[1, 2, 3],
[2, 4, 6],
[3, 6, 9]],
[[1, 2, 3],
[2, 4, 6],
[3, 6, 9]],
[[1, 2, 3],
[2, 4, 6],
[3, 6, 9]]])

numpy array - efficiently subtract each row of B from A

I have two numpy arrays a and b. I want to subtract each row of b from a. I tried to use:
a1 - b1[:, None]
This works for small arrays, but takes too long when it comes to real world data sizes.
a = np.arange(16).reshape(8,2)
a
Out[35]:
array([[ 0, 1],
[ 2, 3],
[ 4, 5],
[ 6, 7],
[ 8, 9],
[10, 11],
[12, 13],
[14, 15]])
b = np.arange(6).reshape(3,2)
b
Out[37]:
array([[0, 1],
[2, 3],
[4, 5]])
a - b[:, None]
Out[38]:
array([[[ 0, 0],
[ 2, 2],
[ 4, 4],
[ 6, 6],
[ 8, 8],
[10, 10],
[12, 12],
[14, 14]],
[[-2, -2],
[ 0, 0],
[ 2, 2],
[ 4, 4],
[ 6, 6],
[ 8, 8],
[10, 10],
[12, 12]],
[[-4, -4],
[-2, -2],
[ 0, 0],
[ 2, 2],
[ 4, 4],
[ 6, 6],
[ 8, 8],
[10, 10]]])
%%timeit
a - b[:, None]
The slowest run took 10.36 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 3.18 µs per loop
This approach is too slow / inefficient for larger arrays.
a1 = np.arange(18900 * 41).reshape(18900, 41)
b1 = np.arange(2674 * 41).reshape(2674, 41)
%%timeit
a1 - b1[:, None]
1 loop, best of 3: 12.1 s per loop
%%timeit
for index in range(len(b1)):
a1 - b1[index]
1 loop, best of 3: 2.35 s per loop
Is there any numpy trick I can use to speed this up?
You are playing with memory limits.
If, as in your examples, 8 bits are sufficient to store the data, use uint8:
import numpy as np
a1 = np.arange(18900 * 41,dtype=np.uint8).reshape(18900, 41)
b1 = np.arange(2674 * 41,dtype=np.uint8).reshape(2674, 41)
%time c1=(a1-b1[:,None])
#1.02 s
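The memory pressure can be estimated before running anything: the broadcast result has len(b1) * len(a1) * 41 elements. The sketch below computes that footprint and shows one possible workaround, processing b in chunks so that only a slice of the result exists at a time. The helper names and the chunk size are illustrative assumptions, not part of the original answer.

```python
import numpy as np

def broadcast_nbytes(a_shape, b_shape, dtype):
    """Bytes needed for a - b[:, None] with a of shape (m, k), b of shape (n, k)."""
    m, k = a_shape
    n, _ = b_shape
    return n * m * k * np.dtype(dtype).itemsize

print(broadcast_nbytes((18900, 41), (2674, 41), np.int64) / 1e9)   # about 16.6 GB
print(broadcast_nbytes((18900, 41), (2674, 41), np.uint8) / 1e9)   # about 2.1 GB

def subtract_rows_chunked(a, b, chunk=256):
    """Yield a - b[i:i+chunk, None] piece by piece (hypothetical helper)."""
    for start in range(0, len(b), chunk):
        yield a - b[start:start + chunk, None]

# Small demonstration: the chunks, stitched together, match the one-shot result.
a = np.arange(16).reshape(8, 2)
b = np.arange(6).reshape(3, 2)
stitched = np.concatenate(list(subtract_rows_chunked(a, b, chunk=2)))
print(np.array_equal(stitched, a - b[:, None]))
```

Chunking only helps if each piece is consumed (reduced, written out, and so on) before the next one is produced; collecting every chunk, as the small demonstration does, needs the full amount of memory again.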

Efficient way of making a list of pairs from an array in Numpy

I have a numpy array x (with (n,4) shape) of integers like:
[[0 1 2 3],
[1 2 7 9],
[2 1 5 2],
...]
I want to transform the array into an array of pairs:
[0,1]
[0,2]
[0,3]
[1,2]
...
so that the first element of each row makes a pair with each of the other elements in the same sub-array. I already have a for-loop solution:
y=np.array([[x[j,0],x[j,i]] for i in range(1,4) for j in range(0,n)],dtype=int)
but since looping over a numpy array is not efficient, I tried slicing instead. I can do the slicing for each column like this:
y[1]=np.array([x[:,0],x[:,1]]).T
# [[0,1],[1,2],[2,1],...]
I can repeat this for all columns. My questions are:
How can I append y[2] to y[1],... such that the shape is (N,2)?
If the number of columns is not small (in this example it is 4), how can I find y[i] elegantly?
What are the alternative ways to achieve the final array?
The cleanest way of doing this I can think of would be:
>>> x = np.arange(12).reshape(3, 4)
>>> x
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
>>> n = x.shape[1] - 1
>>> y = np.repeat(x, (n,)+(1,)*n, axis=1)
>>> y
array([[ 0, 0, 0, 1, 2, 3],
[ 4, 4, 4, 5, 6, 7],
[ 8, 8, 8, 9, 10, 11]])
>>> y.reshape(-1, 2, n).transpose(0, 2, 1).reshape(-1, 2)
array([[ 0, 1],
[ 0, 2],
[ 0, 3],
[ 4, 5],
[ 4, 6],
[ 4, 7],
[ 8, 9],
[ 8, 10],
[ 8, 11]])
This will make two copies of the data, so it will not be the most efficient method. That would probably be something like:
>>> y = np.empty((x.shape[0], n, 2), dtype=x.dtype)
>>> y[..., 0] = x[:, 0, None]
>>> y[..., 1] = x[:, 1:]
>>> y.shape = (-1, 2)
>>> y
array([[ 0, 1],
[ 0, 2],
[ 0, 3],
[ 4, 5],
[ 4, 6],
[ 4, 7],
[ 8, 9],
[ 8, 10],
[ 8, 11]])
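The preallocate-and-fill idea can be wrapped in a function that works for any number of columns. A minimal sketch, with a made-up function name, checked against a plain nested comprehension on the array from the question:

```python
import numpy as np

def pair_first_with_rest(x):
    """Pair column 0 of each row with every other column of that row."""
    rows, cols = x.shape
    out = np.empty((rows, cols - 1, 2), dtype=x.dtype)
    out[..., 0] = x[:, :1]    # (rows, 1) broadcasts across the cols - 1 pairs
    out[..., 1] = x[:, 1:]
    return out.reshape(-1, 2)

x = np.array([[0, 1, 2, 3], [1, 2, 7, 9], [2, 1, 5, 2]])
pairs = pair_first_with_rest(x)

# Row-by-row reference built with ordinary Python loops.
expected = np.array([[row[0], v] for row in x for v in row[1:]])
print(np.array_equal(pairs, expected))
```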
Like Jaimie, I first tried a repeat of the 1st column followed by reshaping, but then decided it was simpler to make 2 intermediary arrays, and hstack them:
x=np.array([[0,1,2,3],[1,2,7,9],[2,1,5,2]])
m,n=x.shape
x1=x[:,0].repeat(n-1)[:,None]
x2=x[:,1:].reshape(-1,1)
np.hstack([x1,x2])
producing
array([[0, 1],
[0, 2],
[0, 3],
[1, 2],
[1, 7],
[1, 9],
[2, 1],
[2, 5],
[2, 2]])
There are probably other ways of doing this sort of rearrangement. The result will copy the original data one way or another. My guess is that as long as you are using compiled functions like reshape and repeat, the time differences won't be significant.
Suppose the numpy array is
arr = np.array([[0, 1, 2, 3],
[1, 2, 7, 9],
[2, 1, 5, 2]])
You can get the array of pairs as
import itertools
m, n = arr.shape
new_arr = np.array([x for i in range(m)
for x in itertools.product(arr[i, 0 : 1], arr[i, 1 : n])])
The output would be
array([[0, 1],
[0, 2],
[0, 3],
[1, 2],
[1, 7],
[1, 9],
[2, 1],
[2, 5],
[2, 2]])
