This sounds simple, and I think I'm overcomplicating this in my mind.
I want to make an array whose elements are generated from two source arrays of the same shape, depending on which element in the source arrays is greater.
To illustrate:
import numpy as np
array1 = np.array((2,3,0))
array2 = np.array((1,5,0))
array3 = (insert magic)
>> array([2, 5, 0])
I can't work out how to produce an array3 that combines the elements of array1 and array2, taking only the greater of the two values at each position.
Any help would be much appreciated. Thanks.
We could use the NumPy built-in np.maximum, made exactly for that purpose -
np.maximum(array1, array2)
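With the sample arrays from the question (and, since np.maximum broadcasts, a scalar second argument works too; the scalar case is just for illustration):
np.maximum(array1, array2)
# array([2, 5, 0])
np.maximum(array1, 2)
# array([2, 3, 2])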
Another way would be to use np.max on a 2D stacked array, max-reducing along the first axis (axis=0) -
np.max([array1,array2],axis=0)
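Since np.maximum is a ufunc, the same stacked reduction can also be spelled with its reduce method, which is effectively what np.max does here:
np.maximum.reduce([array1, array2])
# array([2, 5, 0])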
Timings on arrays of 1 million elements -
In [271]: array1 = np.random.randint(0,9,(1000000))
In [272]: array2 = np.random.randint(0,9,(1000000))
In [274]: %timeit np.maximum(array1, array2)
1000 loops, best of 3: 1.25 ms per loop
In [275]: %timeit np.max([array1, array2],axis=0)
100 loops, best of 3: 3.31 ms per loop
# @Eric Duminil's soln1
In [276]: %timeit np.where( array1 > array2, array1, array2)
100 loops, best of 3: 5.15 ms per loop
# @Eric Duminil's soln2
In [277]: magic = lambda x,y : np.where(x > y , x, y)
In [278]: %timeit magic(array1, array2)
100 loops, best of 3: 5.13 ms per loop
Extending to other supporting ufuncs
Similarly, there's np.minimum for finding the element-wise minimum between two arrays of the same (or broadcastable) shapes. So, to find the element-wise minimum between array1 and array2, we would have:
np.minimum(array1, array2)
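With the sample arrays from the question, that gives:
np.minimum(array1, array2)
# array([1, 3, 0])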
For a complete list of ufuncs that support this feature, please refer to the docs and look for the keyword: element-wise. Grepping for those, I got the following ufuncs:
add, subtract, multiply, divide, logaddexp, logaddexp2, true_divide,
floor_divide, power, remainder, mod, fmod, divmod, heaviside, gcd,
lcm, arctan2, hypot, bitwise_and, bitwise_or, bitwise_xor, left_shift,
right_shift, greater, greater_equal, less, less_equal, not_equal,
equal, logical_and, logical_or, logical_xor, maximum, minimum, fmax,
fmin, copysign, nextafter, ldexp
If your condition ever becomes more complex, you could use np.where:
import numpy as np
array1 = np.array((2,3,0))
array2 = np.array((1,5,0))
array3 = np.where(array1 > array2, array1, array2)
# array([2, 5, 0])
You could replace array1 > array2 with any condition. If all you want is the maximum, go with @Divakar's answer.
And just for fun:
magic = lambda x, y: np.where(x > y, x, y)
magic(array1, array2)
# array([2, 5, 0])
I am working on a python project and making use of numpy. I frequently have to compute Kronecker products of matrices by the identity matrix. These are a pretty big bottleneck in my code so I would like to optimize them. There are two kinds of products I have to take. The first one is:
np.kron(np.eye(N), A)
This one is pretty easy to optimize by simply using scipy.linalg.block_diag (imported as la below). The product is equivalent to:
import scipy.linalg as la
la.block_diag(*[A]*N)
Which is about 10 times faster. However, I am unsure on how to optimize the second kind of product:
np.kron(A, np.eye(N))
Is there a similar trick I can use?
One approach would be to initialize a 4D output array and then assign values into it from A. Such an assignment broadcasts the values, and that broadcasting is where we would get efficiency in NumPy.
Thus, a solution would be like so -
# Get shape of A
m,n = A.shape
# Initialize output array as 4D, matching A's dtype
out = np.zeros((m,N,n,N), dtype=A.dtype)
# Get range array for indexing into the second and fourth axes
r = np.arange(N)
# Index into the second and fourth axes and selecting all elements along
# the rest to assign values from A. The values are broadcasted.
out[:,r,:,r] = A
# Finally reshape back to 2D
out.shape = (m*N,n*N)
Put as a function -
def kron_A_N(A, N): # Simulates np.kron(A, np.eye(N))
m,n = A.shape
out = np.zeros((m,N,n,N),dtype=A.dtype)
r = np.arange(N)
out[:,r,:,r] = A
out.shape = (m*N,n*N)
return out
To simulate np.kron(np.eye(N), A), simply swap the roles of the first and second axes, and likewise the third and fourth -
def kron_N_A(A, N): # Simulates np.kron(np.eye(N), A)
m,n = A.shape
out = np.zeros((N,m,N,n),dtype=A.dtype)
r = np.arange(N)
out[r,:,r,:] = A
out.shape = (m*N,n*N)
return out
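As a quick sanity check of this second variant on a small input (the shapes here are arbitrary):
A = np.random.rand(3, 4)
print(np.allclose(np.kron(np.eye(5), A), kron_N_A(A, 5)))
# True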
Timings -
In [174]: N = 100
...: A = np.random.rand(100,100)
...:
In [175]: np.allclose(np.kron(A, np.eye(N)), kron_A_N(A,N))
Out[175]: True
In [176]: %timeit np.kron(A, np.eye(N))
1 loops, best of 3: 458 ms per loop
In [177]: %timeit kron_A_N(A, N)
10 loops, best of 3: 58.4 ms per loop
In [178]: 458/58.4
Out[178]: 7.842465753424658
Let's say I have following numpy arrays:
import numpy as np
a = np.array([1, 2])
b = np.array([1])
c = np.array([1, 4, 8, 10])
How can I do something like np.vstack((a, b, c)) without any error? I know there is a pure python way, l = [a, b, c], but that's not efficient enough. I'd like to do it with a numpy method. Do you have any idea? Thanks in advance!
In [863]: a = np.array([1, 2])
In [864]: b = np.array([1])
In [865]: c = np.array([1, 4, 8, 10])
A list of these 3 arrays:
In [866]: ll=[a,b,c]
An object dtype array made from this list:
In [867]: A=np.array(ll, dtype=object)   # newer NumPy requires explicit dtype=object for ragged input
In [868]: A
Out[868]: array([array([1, 2]), array([1]), array([ 1, 4, 8, 10])], dtype=object)
A, like ll, contains pointers to data objects elsewhere in memory. In terms of memory use, they are equally efficient.
In [870]: id(A[1]),id(b)
Out[870]: (3032501768, 3032501768)
You can perform a limited number of math operations on the elements of A; for example, addition works as one might expect:
In [871]: A+3
Out[871]: array([array([4, 5]), array([4]), array([ 4, 7, 11, 13])], dtype=object)
But there's little to no speed advantage, e.g.
In [876]: timeit [x+3 for x in ll]
100000 loops, best of 3: 9.52 µs per loop
In [877]: timeit A+3
100000 loops, best of 3: 14.6 µs per loop
and other things like np.max don't work. You have to test this case by case.
More details here: Maintaining numpy subclass inside a container after applying ufunc and other object array questions.
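Where a reduction like np.max fails on the object array, a simple per-element fallback works (a minimal sketch using the A built above):
# np.max(A) raises here, since it tries to compare the unequal-length element arrays
maxes = [x.max() for x in A]
# [2, 1, 10]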
To get numpy speed, you need to embed the vectors into an array. Either a 2D or a 1D array could work. You could make an array of zeros that is large enough to hold all the values, then put the vectors in that array. Or, you could make one large 1D array and concatenate the vectors end to end.
import numpy as np
a = np.array([1, 2])
b = np.array([1])
c = np.array([1, 4, 8, 10])
# Embed the vectors in a 2D array
A = np.zeros((3, max(a.size, b.size, c.size)))
A[0, :a.size] = a
A[1, :b.size] = b
A[2, :c.size] = c
# 1D array embedding
B = np.zeros(a.size + b.size + c.size)
B[:a.size] = a
B[a.size:(a.size+b.size)] = b
B[(a.size+b.size):] = c
%timeit A+3
1000000 loops, best of 3: 780 ns per loop
%timeit B+3
1000000 loops, best of 3: 764 ns per loop
This has the advantage of numpy speed, but it involves more coding work, and it is less easy to interpret the values of your arrays.
Also, to decide whether the 1D or 2D solution is better, it makes sense to think about how you're using the arrays. For example, if the values are Fourier series coefficients, then the 2D array would probably be better. With a 2D array you can keep specific elements of your vectors aligned.
However, I could also imagine applications where concatenating vectors into a single 1D array would make more sense. I hope this was helpful.
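As an illustration of that interpretation cost, here is a minimal sketch of recovering the original vectors from the 1D embedding B with np.split; note that the offsets have to be tracked separately:
offsets = np.cumsum([a.size, b.size])  # split points: after a, then after b
a2, b2, c2 = np.split(B, offsets)
# a2 -> [1. 2.], b2 -> [1.], c2 -> [ 1.  4.  8. 10.]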
I am quite new to Python and NumPy.
If I have a list of numpy vectors, what is the best way to ensure the computation is fast?
I am currently doing this, which I find to be too slow:
vec = sum(list of numpy vectors) # 4 vectors of 500 dimensions each
It takes a noticeably long time using sum.
Is this what you are trying to do (but with much larger arrays)?
In [193]: sum([np.ones((2,3)),np.arange(6).reshape(2,3)])
Out[193]:
array([[ 1., 2., 3.],
[ 4., 5., 6.]])
"500 dimensions each" is an unclear description. Do you mean an array with shape (500,), or one with ndim==500? If the latter, just how many elements are there in total?
The fact that it is a list of 4 of these arrays shouldn't be a big deal. What's the time for array1 + array2?
If the arrays just have 500 elements each, the sum time is trivial:
In [195]: timeit sum([np.arange(500),np.arange(500),np.arange(500),np.arange(500)])
10000 loops, best of 3: 20.9 µs per loop
On the other hand, a sum of arrays with many small dimensions is slower, simply because such an array is much larger:
In [204]: x=np.ones((3,)*10)
In [205]: timeit z=sum([x,x,x,x])
1000 loops, best of 3: 1.6 ms per loop
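If the list held many equal-length vectors instead of just 4, stacking once and reducing along axis 0 is a common alternative; a sketch:
vecs = [np.arange(500) for _ in range(100)]
total = np.sum(np.stack(vecs), axis=0)  # one C-level reduction instead of repeated additions in a Python loop
assert np.array_equal(total, sum(vecs))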
In my opinion this is already the fastest variant. It is pure numpy and as such is computed in C code.
Alternatives would be to compute the sum of each vector individually and then sum the values of the list, or to stack all the vectors first and then sum. But both are slower:
import numpy as np
import time

n = 10000

start = time.time()
for i in range(n):
    lst = np.hstack([np.random.random(500) for i in range(4)])
    x = np.sum(lst)
print("stack then np.sum: ", time.time() - start)

start = time.time()
for i in range(n):
    lst = [np.sum(np.random.random(500)) for i in range(4)]
    x = np.sum(lst)
print("sum up individually: ", time.time() - start)

start = time.time()
for i in range(n):
    lst = [np.random.random(500) for i in range(4)]
    x = np.sum(lst)
print("np.sum on list of vectors:", time.time() - start)
output:
stack then np.sum: 0.35804247856140137
sum up individually: 0.400468111038208
np.sum on list of vectors: 0.3427283763885498
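Note that the loops above also time the generation of the random vectors, and that np.sum(lst) reduces all the way to a single scalar; the question's sum(lst) is the element-wise vector sum, which corresponds to np.sum(lst, axis=0). A sketch that pre-builds the list and times only that reduction:
lst = [np.random.random(500) for i in range(4)]
start = time.time()
for i in range(n):
    x = np.sum(lst, axis=0)  # element-wise sum; same result as sum(lst)
print("np.sum(lst, axis=0) only:", time.time() - start)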
I have two numpy arrays:
x of shape (d_1, ..., d_m)
y of shape (e_1, ..., e_n)
I would like to form the outer tensor product, that is, the numpy array
z of shape (d_1, ..., d_m, e_1, ..., e_n)
such that
z[i_1, ..., i_m, i_{m+1}, ..., i_{m+n}] == x[i_1, ..., i_m] * y[i_{m+1}, ..., i_{m+n}]
I have to perform the above outer multiplication several times so I would like to speed this up as much as possible.
You want np.multiply.outer:
z = np.multiply.outer(x, y)
An alternative to outer is to explicitly expand the dimensions. For 1d arrays this would be
x[:,None]*y # y[None,:] is automatic.
For 10x10 arrays, and generalizing the dimension expansion, I get essentially the same times:
In [74]: timeit x[tuple([slice(None)]*x.ndim + [None]*y.ndim)] * y
10000 loops, best of 3: 53.6 µs per loop
In [75]: timeit np.multiply.outer(x,y)
10000 loops, best of 3: 52.6 µs per loop
So outer does save some coding, but the basic broadcasted multiplication is the same.
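For arrays of arbitrary ndim, the same dimension expansion can also be written with a reshape rather than an index list; a sketch assuming x and y as above:
z = x.reshape(x.shape + (1,) * y.ndim) * y  # trailing singleton axes broadcast against y
np.allclose(z, np.multiply.outer(x, y))
# True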
I am interested in calculating a large NumPy array. I have a large array A which contains a bunch of numbers. I want to calculate the sum of different combinations of these numbers. The structure of the data is as follows:
A = np.random.uniform(0,1, (3743, 1388, 3))
Combinations = np.random.randint(0,3, (306,3))
Final_Product = np.array([ np.sum( A*cb, axis=2) for cb in Combinations])
Is there a more elegant and memory-efficient way to calculate this? I find it frustrating to work with np.dot() when a 3-D array is involved.
If it helps, the shape of Final_Product ideally should be (3743, 306, 1388). Currently Final_Product is of the shape (306, 3743, 1388), so I can just reshape to get there.
np.dot() won't give you the desired output unless you involve extra step(s) that would probably include reshaping. Here's one vectorized approach using np.einsum to do it in one shot without any extra memory overhead -
Final_Product = np.einsum('ijk,lk->lij',A,Combinations)
For completeness, here's with np.dot and reshaping as discussed earlier -
M,N,R = A.shape
Final_Product = A.reshape(-1,R).dot(Combinations.T).T.reshape(-1,M,N)
Runtime tests and verify output -
In [138]: # Inputs ( smaller version of those listed in question )
...: A = np.random.uniform(0,1, (374, 138, 3))
...: Combinations = np.random.randint(0,3, (30,3))
...:
In [139]: %timeit np.array([ np.sum( A*cb, axis=2) for cb in Combinations])
1 loops, best of 3: 324 ms per loop
In [140]: %timeit np.einsum('ijk,lk->lij',A,Combinations)
10 loops, best of 3: 32 ms per loop
In [141]: M,N,R = A.shape
In [142]: %timeit A.reshape(-1,R).dot(Combinations.T).T.reshape(-1,M,N)
100 loops, best of 3: 15.6 ms per loop
In [143]: Final_Product =np.array([np.sum( A*cb, axis=2) for cb in Combinations])
...: Final_Product2 = np.einsum('ijk,lk->lij',A,Combinations)
...: M,N,R = A.shape
...: Final_Product3 = A.reshape(-1,R).dot(Combinations.T).T.reshape(-1,M,N)
...:
In [144]: print np.allclose(Final_Product,Final_Product2)
True
In [145]: print np.allclose(Final_Product,Final_Product3)
True
Instead of dot you could use tensordot. Your current method is equivalent to:
np.tensordot(A, Combinations, [2, 1]).transpose(2, 0, 1)
Note the transpose at the end to put the axes in the correct order.
Like dot, the tensordot function can call down to the fast BLAS/LAPACK libraries (if you have them installed) and so should perform well for large arrays.
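As a quick check that tensordot plus the transpose reproduces the original loop, a sketch reusing the smaller shapes from the timing section above:
A = np.random.uniform(0, 1, (374, 138, 3))
Combinations = np.random.randint(0, 3, (30, 3))
ref = np.array([np.sum(A*cb, axis=2) for cb in Combinations])
out = np.tensordot(A, Combinations, [2, 1]).transpose(2, 0, 1)
print(np.allclose(ref, out))
# True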