How do I vectorize addition between columns in a numpy array? For example, what is the fastest way to implement something like:
import numpy
ary = numpy.array([[1,2,3],[3,4,5],[5,6,7],[7,8,9],[9,10,11]])
for i in range(ary.shape[0]):
    ary[i,0] += ary[i,1]
With numpy.ndarray.sum over axis 1 of the first two columns:
ary[:,0] = ary[:,:2].sum(axis=1)
Or the same with straightforward addition on slices:
ary[:,0] = ary[:, 0] + ary[:, 1]
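For reference, a quick check (a minimal sketch using the question's array) that the slice addition reproduces the original loop:
import numpy as np
ary = np.array([[1,2,3],[3,4,5],[5,6,7],[7,8,9],[9,10,11]])
# looped reference
expected = ary.copy()
for i in range(expected.shape[0]):
    expected[i, 0] += expected[i, 1]
# vectorized slice addition
vec = ary.copy()
vec[:, 0] = vec[:, 0] + vec[:, 1]
assert np.array_equal(vec, expected)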
I am trying to sum specific indices per row in a numpy matrix, based on values in a second numpy vector. For example, given a matrix A and a vector of indices inds, I want to sum:
A[0, inds[0]] + A[1, inds[1]] + A[2, inds[2]] + A[3, inds[3]]
I am currently using a python for loop, making the code quite slow. Is there a way to do this using vectorisation? Thanks!
Yes, numpy's fancy indexing can do this. Just generate a range for the first dimension and use your indices for the second:
import numpy as np
x1 = np.array( [[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]] )
print(x1[ [0,1,2,3],[2,0,3,1] ].sum())
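The same idea generalizes to any row count by generating the row range with np.arange. A minimal sketch, assuming inds is the asker's index vector (the values here are hypothetical):
import numpy as np
A = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]])
inds = np.array([2, 0, 3, 1])  # hypothetical index vector: one chosen column per row
# pair each row index with its chosen column index, then sum the selected elements
total = A[np.arange(A.shape[0]), inds].sum()
print(total)  # 3 + 5 + 12 + 14 = 34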
Out of a huge matrix in numpy (currently 1000x1000), only a few elements are relevant for me. Say these elements are >1000 in value and the others are way lower. I need to find the indices of all such elements in the most efficient way, because the search will be repeated often and the matrix can become even bigger.
For now I have two different approaches, which should be of about the same complexity (I omit possible solutions with for loops as inefficient):
import numpy as np
A = np.zeros((1000,1000))
#do something with the matrix
#first solution with np.where
np.transpose(np.where(A > 999))
# array([[0, 0],[1, 20]....[785, 445]], dtype=int64) - made up numbers
#another solution with np.argwhere
np.argwhere(A > 999)
# array([[0, 0],[1, 20]....[785, 445]], dtype=int64) - outputs the same
Is there any possible way to speed up this search or is my solution the most efficient?
Thanks for any advice and suggestions!
You can try this: a boolean filter applied directly to the numpy array!
import numpy as np
arr = np.array([998, 999, 1000, 1001])
filter_arr = arr > 999
newarr = arr[filter_arr]
print(filter_arr)
print(newarr)
https://www.w3schools.com/python/numpy_array_filter.asp
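Note that boolean filtering returns the matching values themselves; to get their indices, as the question asks, you can pass the same boolean mask to np.nonzero. A minimal sketch, assuming A is the question's matrix:
import numpy as np
A = np.zeros((1000, 1000))
# ...do something with the matrix...
mask = A > 999                         # boolean mask, computed once
rows, cols = np.nonzero(mask)          # row and column indices of all matches
pairs = np.column_stack((rows, cols))  # (k, 2) index pairs, like np.argwhere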
Suppose I have two arrays A and B with dimensions (n1,m1,m2) and (n2,m1,m2), respectively. I want to compute the matrix C with dimensions (n1,n2) such that C[i,j] = sum((A[i,:,:] - B[j,:,:])^2). Here is what I have so far:
import numpy as np
A = np.array(range(1,13)).reshape(3,2,2)
B = np.array(range(1,9)).reshape(2,2,2)
C = np.zeros(shape=(A.shape[0], B.shape[0]))
for i in range(A.shape[0]):
    for j in range(B.shape[0]):
        C[i,j] = np.sum(np.square(A[i,:,:] - B[j,:,:]))
C
What is the most efficient way to do this? In R I would use a vectorized approach, such as outer. Is there a similar method for Python?
Thanks.
You can use scipy's cdist, which is pretty efficient for such calculations after reshaping the input arrays to 2D, like so -
from scipy.spatial.distance import cdist
C = cdist(A.reshape(A.shape[0],-1),B.reshape(B.shape[0],-1),'sqeuclidean')
Now, the above approach is memory efficient and thus a better one when working with large data sizes. For small input arrays, one can also use np.einsum and leverage NumPy broadcasting, like so -
diffs = A[:,None]-B
C = np.einsum('ijkl,ijkl->ij',diffs,diffs)
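As a quick sanity check (a minimal sketch on the question's toy arrays), both vectorized routes reproduce the looped result:
import numpy as np
from scipy.spatial.distance import cdist
A = np.arange(1, 13).reshape(3, 2, 2)
B = np.arange(1, 9).reshape(2, 2, 2)
# looped reference from the question
C_ref = np.array([[np.sum(np.square(A[i] - B[j])) for j in range(B.shape[0])]
                  for i in range(A.shape[0])])
C_cdist = cdist(A.reshape(A.shape[0], -1), B.reshape(B.shape[0], -1), 'sqeuclidean')
diffs = A[:, None] - B
C_einsum = np.einsum('ijkl,ijkl->ij', diffs, diffs)
assert np.allclose(C_ref, C_cdist) and np.allclose(C_ref, C_einsum)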
I'd like to sample from the indices of a 2D Numpy array, considering that each index is weighted by the number inside the array at that position. The way I know is with numpy.random.choice; however, that does not return the index but the number itself. Is there an efficient way of doing this?
Here is my code:
import numpy as np
A=np.arange(1,10).reshape(3,3)
A_flat=A.flatten()
d=np.random.choice(A_flat,size=10,p=A_flat/float(np.sum(A_flat)))
print(d)
You could do something like:
import numpy as np
def wc(weights):
    cs = np.cumsum(weights)
    idx = cs.searchsorted(np.random.random() * cs[-1], 'right')
    return np.unravel_index(idx, weights.shape)
Notice that the cumsum is the slowest part of this, so if you need to do this repeatedly for the same array I'd suggest computing the cumsum ahead of time and reusing it.
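For example (a minimal usage sketch, using the question's array as the weights):
weights = np.arange(1, 10).reshape(3, 3)
print(wc(weights))  # e.g. (2, 1) -- heavier cells are drawn more often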
To expand on my comment: adapting the weighted choice method presented at https://stackoverflow.com/a/10803136/553404:
def weighted_choice_indices(weights):
    cs = np.cumsum(weights.flatten())/np.sum(weights)
    idx = np.sum(cs < np.random.rand())
    return np.unravel_index(idx, weights.shape)
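Alternatively (my own suggestion, not part of either answer above): np.random.choice can sample flat indices directly if you pass it a range of indices instead of the values, and np.unravel_index converts them back to 2D coordinates:
import numpy as np
A = np.arange(1, 10).reshape(3, 3)
p = A.flatten() / A.sum()                           # normalized weights
flat_idx = np.random.choice(A.size, size=10, p=p)   # sample flat indices, not values
rows, cols = np.unravel_index(flat_idx, A.shape)    # convert back to 2D indices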
Given an array and a mask for this array, it is easy, using fancy indexing, to select only the data of the array corresponding to the mask.
import numpy as np
a = np.arange(20).reshape(4, 5)
mask = [0, 2]
data = a[:, mask]
But is there a rapid way to select all the data of the array that does not belong to the mask (i.e. the mask is the data we want to reject)?
I tried to find a general solution going through an intermediate boolean array, but I'm sure there is something much easier.
mask2 = np.ones(a.shape, dtype=bool)
mask2[:, mask] = False
data = a[mask2].reshape(a.shape[0], a.shape[1] - len(mask))
Thank you
Have a look at numpy.invert, numpy.bitwise_not, numpy.logical_not, or more concisely ~mask. (They all do the same thing, in this case.)
As a quick example:
import numpy as np
x = np.arange(10)
mask = x > 5
print(x[mask])
print(x[~mask])
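Applied to the original question, where the "mask" is a list of integer column indices rather than a boolean array, one option (a sketch, assuming the question's a and index list) is to build a boolean column mask first and then invert it with ~:
import numpy as np
a = np.arange(20).reshape(4, 5)
cols = [0, 2]  # the question's integer "mask" of columns to reject
col_mask = np.zeros(a.shape[1], dtype=bool)
col_mask[cols] = True
data = a[:, ~col_mask]  # all columns NOT listed in cols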