I am trying to make use of NumPy vectorized operations, but I am struggling with the following task: I have two arrays of different lengths (X1, X2), and I want to apply a method to each pair (e.g. X1[0] with X2[0], X1[0] with X2[1], etc.). I wrote the following working code using loops, but I'd like to get rid of the loops.
result = []
for i in range(len(X1)):
    result.append([])
    for j in range(len(X2)):
        tmp = my_method(X1[i] - X2[j])
        result[i].append(tmp)
result = np.asarray(result)
You can reshape one of your vectors to be (N, 1) and then use vectorize which will broadcast the operation as normal:
import numpy as np
X1 = np.arange(5)
X2 = np.arange(3)
print(X1, X2)
# [0 1 2 3 4] [0 1 2]
def my_op(x, y):
    return x + y
np.vectorize(my_op)(X1[:, np.newaxis], X2)
# array([[0, 1, 2],
#        [1, 2, 3],
#        [2, 3, 4],
#        [3, 4, 5],
#        [4, 5, 6]])
Note that my_op is just defined as an example; if your function is actually anything included in numpy's vectorized operations, it'd be much faster to just use that directly, e.g.:
X1[:, np.newaxis] + X2
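Applied to the pattern in the question, where the method receives the difference X1[i] - X2[j], the same broadcast can be sketched as follows. The body of my_method here is a hypothetical stand-in (the question does not show it), chosen so that it is built only from NumPy operations and therefore accepts whole arrays.

```python
import numpy as np

def my_method(d):
    # Hypothetical stand-in for the question's function: it uses only
    # NumPy operations, so it works on scalars and on whole arrays.
    return np.abs(d) ** 2

X1 = np.arange(5)
X2 = np.arange(3)

# Loop version, equivalent to the code in the question
looped = np.asarray([[my_method(X1[i] - X2[j]) for j in range(len(X2))]
                     for i in range(len(X1))])

# Broadcast version: shapes (5, 1) and (3,) combine to (5, 3)
vectorized = my_method(X1[:, np.newaxis] - X2)

print(np.array_equal(looped, vectorized))
```

If the real my_method only works on scalars, this shortcut does not apply and np.vectorize or a loop is still needed.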
itertools.product might be what you're looking for:
from itertools import product
import numpy as np
x1 = np.array(...)
x2 = np.array(...)
result = np.array([my_method(x_1 - x_2) for x_1, x_2 in product(x1,x2)])
Alternatively you could also use a double list comprehension:
result = np.array([my_method(x_1 - x_2) for x_1 in x1 for x_2 in x2])
This obviously depends on what my_method is doing and operating on and what you have stored in x1 and x2.
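One detail worth noting: both comprehensions above produce a flat, one-dimensional result, whereas the loop in the question builds a 2D array. A minimal sketch of restoring that shape with reshape, using a hypothetical scalar-only my_method and made-up input values:

```python
from itertools import product

import numpy as np

def my_method(d):
    # Hypothetical scalar function, for illustration only
    return abs(d) + 1

x1 = np.array([0, 1, 2, 3])
x2 = np.array([10, 20, 30])

# product iterates x1 in the outer position and x2 in the inner one,
# so the flat result is in row-major order for a (len(x1), len(x2)) grid
flat = np.array([my_method(a - b) for a, b in product(x1, x2)])
grid = flat.reshape(len(x1), len(x2))

print(flat.shape, grid.shape)
```

After the reshape, grid[i, j] corresponds to my_method(x1[i] - x2[j]), matching the nested-loop layout.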
Assume a simple function my_method(a, b) that adds the two numbers, and this input:
X1 = np.arange(10)
X2 = np.arange(10,60,10)
Your code is:
result = []
for i in range(len(X1)):
    result.append([])
    for j in range(len(X2)):
        tmp = my_method(X1[i], X2[j])
        result[i].append(tmp)
result = np.asarray(result)
You can replace it with broadcasting:
X1[:,None]+X2
output:
array([[10, 20, 30, 40, 50],
       [11, 21, 31, 41, 51],
       [12, 22, 32, 42, 52],
       [13, 23, 33, 43, 53],
       [14, 24, 34, 44, 54],
       [15, 25, 35, 45, 55],
       [16, 26, 36, 46, 56],
       [17, 27, 37, 47, 57],
       [18, 28, 38, 48, 58],
       [19, 29, 39, 49, 59]])
Now you need to check whether your actual operation can be vectorized; please share details on what you want to achieve. Functions can be wrapped with numpy.vectorize, but it is not a magic tool: it still loops over the elements in Python, which can be slow. A true array operation is best.
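As a rough illustration of the options for the addition example, the sketch below computes the same table three ways: broadcasting, the outer method of the np.add ufunc, and np.vectorize around a plain Python function. The variable names are chosen here for illustration only.

```python
import numpy as np

X1 = np.arange(10)
X2 = np.arange(10, 60, 10)

# True array operation via broadcasting: (10, 1) + (5,) -> (10, 5)
via_broadcast = X1[:, None] + X2

# Every binary ufunc also has an .outer method that does the same pairing
via_outer = np.add.outer(X1, X2)

# np.vectorize gives the same values but calls the Python function per element
via_vectorize = np.vectorize(lambda a, b: a + b)(X1[:, None], X2)

print(np.array_equal(via_broadcast, via_outer))
print(np.array_equal(via_broadcast, via_vectorize))
```

The first two stay inside compiled NumPy code; the third is mainly a convenience for functions that cannot be expressed with array operations.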
If I have a large 2D numpy array and two arrays holding the x and y indices I want to extract, it's easy enough:
h = np.arange(49).reshape(7,7)
# h = [[ 0,  1,  2,  3,  4,  5,  6],
#      [ 7,  8,  9, 10, 11, 12, 13],
#      [14, 15, 16, 17, 18, 19, 20],
#      [21, 22, 23, 24, 25, 26, 27],
#      [28, 29, 30, 31, 32, 33, 34],
#      [35, 36, 37, 38, 39, 40, 41],
#      [42, 43, 44, 45, 46, 47, 48]]
x_indices = np.array([1,3,4])
y_indices = np.array([2,3,5])
reduced_h = h[x_indices, y_indices]
#reduced_h = [ 9, 24, 33]
However, I would like to, for each x,y pair cut out a square (denoted by 'a' - the number of indices in each direction from the centre) surrounding this 'coordinate' and return an array of these little 2D arrays.
For example, for h, x,y_indices as above and a=1:
reduced_h = [[[1,2,3],[8,9,10],[15,16,17]], [[16,17,18],[23,24,25],[30,31,32]], [[25,26,27],[32,33,34],[39,40,41]]]
i.e. one 3x3 array for each x-y index pair, corresponding to the 3x3 square of elements centred on that index. In general, this should return a numpy array of shape (len(x_indices), 2a+1, 2a+1).
By analogy to reduced_h[0] = h[x_indices[0]-1 : x_indices[0]+2, y_indices[0]-1 : y_indices[0]+2] = h[1-1:1+2, 2-1:2+2] = h[0:3, 1:4], my first try was the following:
h[x_indices-a : x_indices+a+1, y_indices-a : y_indices+a+1]
However, perhaps unsurprisingly, slicing between the arrays fails.
So the obvious next thing to try is to create this slice manually. np.arange seems to struggle with this but linspace works:
a=1
xrange = np.linspace(x_indices-a, x_indices+a, 2*a+1, dtype=int)
# xrange = [ [0, 2, 3], [1, 3, 4], [2, 4, 5] ]
yrange = np.linspace(y_indices-a, y_indices+a, 2*a+1, dtype=int)
Now I can try h[xrange, yrange], but this unsurprisingly indexes element-wise, meaning I get only one (2a+1)x(2a+1) array (the same dimensions as xrange and yrange). Is there a way to, for every index, take the right slices from these ranges (without loops)? Or is there a way to make the broadcast work initially without having to set up linspace explicitly? Thanks
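One way the broadcast asked about here could be made to work is to add a small array of offsets to the centre indices along separate axes, so that the row and column index arrays broadcast to (len(x_indices), 2a+1, 2a+1). This is a sketch under the assumption that every window lies fully inside h; the names offsets, rows, cols and patches are introduced here for illustration.

```python
import numpy as np

h = np.arange(49).reshape(7, 7)
x_indices = np.array([1, 3, 4])
y_indices = np.array([2, 3, 5])
a = 1

# Offsets -a..a around each centre
offsets = np.arange(-a, a + 1)

# rows has shape (3, 3, 1), cols has shape (3, 1, 3);
# integer-array indexing broadcasts them together to (3, 3, 3)
rows = x_indices[:, None, None] + offsets[None, :, None]
cols = y_indices[:, None, None] + offsets[None, None, :]
patches = h[rows, cols]

print(patches.shape)
print(patches[0])
```

Centres closer than a to the top or left edge would produce negative indices, which wrap around silently, and centres too close to the bottom or right edge raise an IndexError, so edge cases need separate handling.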
You can index np.lib.stride_tricks.sliding_window_view using your x and y indices:
import numpy as np
h = np.arange(49).reshape(7,7)
x_indices = np.array([1,3,4])
y_indices = np.array([2,3,5])
a = 1
window = (2*a+1, 2*a+1)
out = np.lib.stride_tricks.sliding_window_view(h, window)[x_indices-a, y_indices-a]
out:
array([[[ 1, 2, 3],
[ 8, 9, 10],
[15, 16, 17]],
[[16, 17, 18],
[23, 24, 25],
[30, 31, 32]],
[[25, 26, 27],
[32, 33, 34],
[39, 40, 41]]])
Note that you may need to pad h first to handle windows around your coordinates that reach "outside" h.
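A minimal sketch of the padding idea, assuming a fill value of -1 for cells outside h (the fill value and the edge coordinates below are made up for illustration). After padding by a on every side, the window that starts at padded position (x, y) is centred on h[x, y], so the indices are used without subtracting a.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

h = np.arange(49).reshape(7, 7)
# Coordinates that include corners, where a 3x3 window leaves h
x_indices = np.array([0, 3, 6])
y_indices = np.array([0, 3, 6])
a = 1

# Pad a cells on every side with an assumed fill value of -1
padded = np.pad(h, a, mode="constant", constant_values=-1)

# windows[i, j] is padded[i:i+2a+1, j:j+2a+1], i.e. centred on h[i, j]
windows = sliding_window_view(padded, (2 * a + 1, 2 * a + 1))
out = windows[x_indices, y_indices]

print(out[0])
```

Other np.pad modes such as "edge" or "reflect" may suit better depending on what the out-of-range cells should mean.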
Here is the example:
import numpy as np
a = np.array([[1,2],[3,4],[5,6]])
b = np.array([[7,8],[9,10],[11,12],[12,13]])
What I want is to add each row of a to every row of b and then sum the elements of each result. For example, [1,2] is added to every row of b: 1+7=8, 2+8=10, 8+10=18; 1+9=10, 2+10=12, 10+12=22; and so on. The result should look like [[18,22,26,28...],[22,26,....],[26,30....]]
My question is how to achieve that. I know that NumPy can be more efficient than a loop, but how do I express this calculation with matrix operations?
I believe this does what you want:
>>> a = np.array([[1,2],[3,4],[5,6]])
>>> b = np.array([[7,8],[9,10],[11,12],[12,13]])
>>> np.sum(a, axis=1)[:,None] + np.sum(b, axis=1)[None,:]
array([[18, 22, 26, 28],
[22, 26, 30, 32],
[26, 30, 34, 36]])
You can use list comprehensions:
import numpy as np
a = np.array([[1, 2], [3, 4], [5, 6]])
b = np.array([[7, 8], [9, 10], [11, 12], [12, 13]])
[[sum(i) + sum(j) for j in b] for i in a]
Output:
[[18, 22, 26, 28], [22, 26, 30, 32], [26, 30, 34, 36]]
This can be done succinctly as follows:
np.einsum('ijk-> ij', a[:,None,:]+b)
Let me explain each step. It combines NumPy's einsum and broadcasting.
Step 1- a[:,None,:] reshapes matrix a to shape (3,1,2). This mid axis with value 1 is helpful for broadcasting.
Step 2- a[:,None,:] + b broadcasts a and adds matrix b to get a resultant matrix of shape (3,4,2).
Step 3- np.einsum('ijk-> ij', a[:,None,:]+b) does sum reduction along the last axis of matrix obtained from previous step.
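The three steps can be checked by printing the intermediate shapes. In this sketch the names expanded and pairwise are introduced only for illustration, and the einsum result is compared with a plain sum over the last axis.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])
b = np.array([[7, 8], [9, 10], [11, 12], [12, 13]])

# Step 1: insert a length-1 middle axis -> shape (3, 1, 2)
expanded = a[:, None, :]

# Step 2: broadcast against b of shape (4, 2) -> shape (3, 4, 2)
pairwise = expanded + b

# Step 3: sum over the last axis k -> shape (3, 4)
result = np.einsum('ijk->ij', pairwise)

print(expanded.shape, pairwise.shape, result.shape)
print(np.array_equal(result, pairwise.sum(axis=-1)))
```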
For example, I have one np array A = [[30, 60, 50...], [ 15, 20, 18...], [21, 81, 50...]...] of size (N, 10).
And I have another np array B = [1, 1, 0...] of size (N, ).
I want to do operations on A that depend on B. For example, I want the sum of each column in A, but only over rows where B == 1. How would I do that without using any loops, with just numpy methods?
So if I want the column sums of A for indices where B == 1:
for the first column, result = 30 + 15, because the first two entries in B are 1 but the third is 0, so I wouldn't include it in the sum.
Use np.compress and sum along axis=0
>>> A = [[30, 60, 50], [ 15, 20, 18], [21, 81, 50]]
>>> B = [1, 1, 0]
>>> np.compress(B, A, axis=0).sum(0)
array([45, 80, 68])
If A and B are arrays, use np.nonzero on B:
>>> A = np.array([[30, 60, 50], [ 15, 20, 18], [21, 81, 50]])
>>> B = np.array([1, 1, 0])
>>> A[np.nonzero(B)].sum(0)
array([45, 80, 68])
Another way:
>>> A[B.astype(bool)].sum(0)
array([45, 80, 68])
If you want 0s:
>>> np.compress(B==0, A, axis=0).sum(0)
# Or,
>>> A[np.nonzero(B==0)].sum(0)
# Or,
>>> A[~B.astype(bool)].sum(0)
If you want both 1s and 0s, obviously:
>>> A.sum(0)
You can convert B to bool type and mask A. Then you can get the sum along columns.
A = np.array([[30, 60, 50], [ 15, 20, 18], [21, 81, 50]])
B = np.array([1, 1, 0])
A[B.astype(bool)].sum(axis=0)
array([45, 80, 68])
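Because B holds only 0s and 1s, the same column sums can also be written as a vector-matrix product, which weights each row of A by the matching entry of B. This is an alternative sketch, not something the answers above propose; masked_sum and weighted_sum are names chosen here.

```python
import numpy as np

A = np.array([[30, 60, 50], [15, 20, 18], [21, 81, 50]])
B = np.array([1, 1, 0])

# Boolean-mask version from the answer above
masked_sum = A[B.astype(bool)].sum(axis=0)

# Product version: (N,) @ (N, 3) -> (3,), each row weighted by 0 or 1
weighted_sum = B @ A

print(masked_sum, weighted_sum)
```

The product form only matches the mask when B contains nothing but 0 and 1; any other value would scale the row instead of selecting it.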
Let's say I have this NumPy array
A =
array([[0, 1, 3],
[1, 2, 4]])
I have another array
B =
array([[10, 41, 26, 50, 12, 24],
[20, 15, 42, 40, 41, 62]])
I want to create another array that selects elements from each row of B, using the values in the corresponding row of A as column indices. That is
C =
array([[10, 41, 50],
[15, 42, 41]])
Try:
B[[[0],[1]], A]
Or more generally:
B[np.arange(A.shape[0])[:,None], A]
Output:
array([[10, 41, 50],
[15, 42, 41]])
You can use np.take_along_axis
np.take_along_axis(B, A, axis=1)
output:
array([[10, 41, 50],
[15, 42, 41]])
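The two approaches above can be checked against each other. In this sketch, row_idx is a name introduced for illustration; it is a column vector of row numbers that broadcasts against A.

```python
import numpy as np

A = np.array([[0, 1, 3],
              [1, 2, 4]])
B = np.array([[10, 41, 26, 50, 12, 24],
              [20, 15, 42, 40, 41, 62]])

# Integer-array indexing: row_idx has shape (2, 1) and A has shape (2, 3),
# so together they pick B[r, A[r, c]] for every r, c
row_idx = np.arange(A.shape[0])[:, None]
C_fancy = B[row_idx, A]

# take_along_axis does the same lookup along axis 1
C_take = np.take_along_axis(B, A, axis=1)

print(np.array_equal(C_fancy, C_take))
```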
This can be done simply with a plain loop over lists rather than NumPy indexing.
At the end, the result can still be converted into a NumPy array.
Code:
import numpy as np
# to keep it simple, take 1D lists
a = [0, 1, 3]
b = [10, 41, 26, 50, 12, 24]
c = []
a = np.array(a)
b = np.array(b)
# loop over the indices stored in a and append the matching element of b to c
for i in range(len(a)):
    print(i)  # position in a
    idx = a[i]
    c.append(b[idx])
print(c)
c = np.array(c)
print(type(c))
# To make it more fun, you can use the random module to get random digits
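For the 1D case in this answer, the loop can be replaced by indexing b with a directly, since NumPy accepts an integer array as an index. A small sketch comparing the two (c_loop and c_indexed are names chosen here):

```python
import numpy as np

a = np.array([0, 1, 3])
b = np.array([10, 41, 26, 50, 12, 24])

# Loop version, as in the answer above
c_loop = np.array([b[idx] for idx in a])

# Integer-array indexing picks b[0], b[1], b[3] in one step
c_indexed = b[a]

print(c_indexed)
```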
I want to define a 3d numpy array with shape = [3, 3, 5] and also with values as a range starting with 11 and step = 3 in column-wise manner. I mean:
B[:,:,0] = [[ 11, 20, 29],
[ 14, 23, 32],
[17, 26, 35]]
B[:,:,1] = [[ 38, ...],
[ 41, ...],
[ 44, ...]]
...
I am new to numpy, and I suspect it can be done with np.arange or maybe np.mgrid, but I don't know how.
How can this be done with minimal lines of code?
You can calculate the end of the range by multiplying the number of elements (the product of the shape) by the step and adding the start. Then it's just a reshape and a transpose to move the axes around:
import numpy as np
start = 11
step = 3
shape = [5, 3, 3]
end = np.prod(shape) * step + start
B = np.arange(start, end, step).reshape(shape).transpose(2, 1, 0)
B[:, :, 0]
# array([[11, 20, 29],
# [14, 23, 32],
# [17, 26, 35]])
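An alternative sketch that avoids the transpose is to reshape directly to the target shape in Fortran (column-major) order, where the first index varies fastest. The order="F" argument is a standard reshape option; the names B_transpose and B_fortran are introduced here for comparison.

```python
import numpy as np

start = 11
step = 3
target_shape = (3, 3, 5)
count = int(np.prod(target_shape))

values = np.arange(start, start + count * step, step)

# Approach from the answer above: C-order reshape, then reverse the axes
B_transpose = values.reshape(5, 3, 3).transpose(2, 1, 0)

# Column-major reshape straight to the target shape
B_fortran = values.reshape(target_shape, order="F")

print(np.array_equal(B_transpose, B_fortran))
print(B_fortran[:, :, 0])
```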