I am trying to create a 'mask' of a numpy.array by specifying certain criteria. Python even has nice syntax for something like this:
>> A = numpy.array([1,2,3,4,5])
>> A > 3
array([False, False, False, True, True])
But if I have a list of criteria instead of a range:
>> A = numpy.array([1,2,3,4,5])
>> crit = [1,3,5]
I can't do this:
>> A in crit
I have to do something based on list comprehensions, like this:
>> [a in crit for a in A]
array([True, False, True, False, True])
Which is correct.
Now, the problem is that I am working with large arrays and the above code is very slow. Is there a more natural way of doing this operation that might speed it up?
EDIT: I was able to get a small speedup by making crit into a set.
EDIT2: For those who are interested:
Jouni's approach:
1000 loops, best of 3: 102 µs per loop
numpy.in1d:
1000 loops, best of 3: 1.33 ms per loop
EDIT3: Just tested again with B = randint(10,size=100)
Jouni's approach:
1000 loops, best of 3: 2.96 ms per loop
numpy.in1d:
1000 loops, best of 3: 1.34 ms per loop
Conclusion: Use numpy.in1d() unless B is very small.
I think that the numpy function in1d is what you are looking for:
>>> A = numpy.array([1,2,3,4,5])
>>> B = [1,3,5]
>>> numpy.in1d(A,B)
array([ True, False, True, False, True], dtype=bool)
as stated in its docstring, "in1d(a, b) is roughly equivalent to np.array([item in b for item in a])"
Admittedly, I haven't done any speed tests, but it sounds like what you are looking for.
Another faster way
Here's another way to do it which is faster. Sort the B array first (it contains the elements you are looking to find in A), turn it into a numpy array, and then do:
B[B.searchsorted(A)] == A
though if you have elements in A that are larger than the largest in B, you will need to do:
inds = B.searchsorted(A)
inds[inds == len(B)] = 0
mask = B[inds] == A
It may not be faster for small arrays (especially if B is small), but before long it will definitely be faster. Why? Because this is an O(N log M) algorithm, where N is the number of elements in A and M is the number of elements in B, while putting together a bunch of individual masks is O(N * M). I tested it with N = 10000 and M = 14 and it was already faster. Anyway, just thought that you might like to know, especially if you are truly planning on using this on very large arrays.
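Putting the pieces together, a minimal sketch of that idea (the function name in_sorted is mine, not from the answer):
import numpy as np

def in_sorted(A, B):
    B = np.sort(np.asarray(B))     # B must be sorted for searchsorted
    inds = B.searchsorted(A)       # candidate positions, O(N log M)
    inds[inds == len(B)] = 0       # clamp indices that fall past the end of B
    return B[inds] == A            # True exactly where the candidate matches

A = np.array([1, 2, 3, 4, 5])
print(in_sorted(A, [1, 3, 5]))     # [ True False  True False  True]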
Combine several comparisons with "or":
from numpy import zeros
from numpy.random import randint

A = randint(10, size=10000)
mask = (A == 1) | (A == 3) | (A == 5)
Or if you have a list B and want to create the mask dynamically:
B = [1, 3, 5]
mask = zeros((10000,), dtype=bool)
for t in B:
    mask = mask | (A == t)
Create a mask and use the compress function of the numpy array. It should be much faster.
If you have complex criteria, remember to construct them using arithmetic on the arrays themselves.
a = numpy.array([3,1,2,4,5])
mask = a > 3
b = a.compress(mask)
or
a = numpy.random.random_integers(1,5,100000)
c = a.compress((a<=4)*(a>=2))    ## numbers with 2 <= n <= 4
d = a.compress(~((a<=4)*(a>=2))) ## numbers with n > 4 or n < 2
OK, if you want a mask marking every a in [1,3,5], you can do something like
a = numpy.random.random_integers(1,5,100000)
mask=(a==1)+(a==3)+(a==5)
or
a = numpy.random.random_integers(1,5,100000)
mask = numpy.zeros(len(a), dtype=bool)
for num in [1,3,5]:
    mask += (a==num)
Related
I have two ndarrays with shapes:
A = (32,512,640)
B = (4,512)
I need to multiply A and B such that I get a new ndarray:
C = (4,32,512,640)
Another way to think of it is that each row of B is multiplied along axis=-2 of A, which results in a new (1,32,512,640) cube. Looping over the rows of B produces these (1,32,512,640) cubes, which can then be used to build C up with np.concatenate or np.vstack, such as:
# Sample inputs, where the dimensions aren't necessarily known
a = np.arange(32*512*465, dtype='f4').reshape((32,512,465))
b = np.ones((4,512), dtype='f4')
# Using a loop
d = []
for row in b:
    d.append(np.expand_dims(row[None,:,None]*a, axis=0))
# Or using list comprehension
d = [np.expand_dims(row[None,:,None]*a,axis=0) for row in b]
# Stacking the final list
result = np.vstack(d)
But I am wondering if it's possible to use something like np.einsum or np.tensordot to get this vectorized all in one line. I'm still learning how to use those two methods, so I'm not sure if it's appropriate here.
Thanks!
We can leverage broadcasting after extending the dimensions of B with None/np.newaxis -
C = A * B[:,None,:,None]
With einsum, it would be -
C = np.einsum('ijk,lj->lijk',A,B)
There's no sum-reduction happening here, so einsum won't be any better than the explicit broadcasting. But since we are looking for a Pythonic solution, it could be used once we get past its string notation.
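As a quick sanity check (a sketch with scaled-down shapes standing in for (32,512,640) and (4,512)), the two forms agree:
import numpy as np

A = np.random.rand(3, 5, 6)              # stand-in for (32, 512, 640)
B = np.random.rand(4, 5)                 # stand-in for (4, 512)
C1 = A * B[:, None, :, None]             # explicit broadcasting
C2 = np.einsum('ijk,lj->lijk', A, B)     # einsum spelling of the same thing
print(C1.shape, np.allclose(C1, C2))     # (4, 3, 5, 6) True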
Let's get some timings to finish things off -
In [15]: m,n,r,p = 32,512,640,4
...: A = np.random.rand(m,n,r)
...: B = np.random.rand(p,n)
In [16]: %timeit A * B[:,None,:,None]
10 loops, best of 3: 80.9 ms per loop
In [17]: %timeit np.einsum('ijk,lj->lijk',A,B)
10 loops, best of 3: 109 ms per loop
# Original soln
In [18]: %%timeit
...: d = []
...: for row in B:
...:     d.append(np.expand_dims(row[None,:,None]*A, axis=0))
...:
...: result = np.vstack(d)
10 loops, best of 3: 130 ms per loop
Leverage multi-core
We could leverage the multi-core capability of numexpr, which is well suited to arithmetic operations on large data, and thus gain some performance boost here. Let's time it -
In [42]: import numexpr as ne
In [43]: B4D = B[:,None,:,None] # this is virtually free
In [44]: %timeit ne.evaluate('A*B4D')
10 loops, best of 3: 64.6 ms per loop
In one line: ne.evaluate('A*B4D', {'A':A, 'B4D':B[:,None,:,None]}).
Related post on how to control multi-core functionality.
I have a function that will always converge to a fixpoint, e.g. f(x) = (x-a)/2 + a. I have a function that will find this fixpoint through repeated invocation of the function:
def find_fix_point(f,x):
    while f(x)>0.1:
        x = f(x)
    return x
This works fine; now I want to do the same thing with a vectorized version:
def find_fix_point(f,x):
    while (f(x)>0.1).any():
        x = f(x)
    return x
However, this is quite inefficient if most of the instances only need about 10 iterations and one needs 1000. What is a fast method to remove the `x` values that have already been found?
The code can use numpy or scipy.
One way to solve this would be to use recursion:
def find_fix_point_recursive(f, x):
    ind = x > 0.1
    if ind.any():
        x[ind] = find_fix_point_recursive(f, f(x[ind]))
    return x
With this implementation, we only call f on the points which need to be updated.
Note that by using recursion we avoid having to do the check x > 0.1 all the time, with each call working on smaller and smaller arrays.
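The timings below call an f that isn't shown; here is a minimal sketch, assuming a simple halving map (any f whose iterates eventually drop below the 0.1 threshold would do):
import numpy as np

def f(x):
    # Illustrative contraction only: repeated halving drives values toward 0.
    return x / 2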
%timeit x = np.zeros(10000); x[0] = 10000; find_fix_point(f, x)
1000 loops, best of 3: 1.04 ms per loop
%timeit x = np.zeros(10000); x[0] = 10000; find_fix_point_recursive(f, x)
10000 loops, best of 3: 141 µs per loop
First, for generality, I change the criterion to fit the fixpoint definition: we stop when |x - f(x)| <= epsilon.
You can mix boolean indexing and integer indexing to keep only the active points at each step. Here is a way to do that:
def find_fix_point(f,x,epsilon):
    ind = np.mgrid[:len(x)]   # initial indices
    while ind.size > 0:
        xind = x[ind]         # integer indexing
        yind = f(xind)
        x[ind] = yind
        ind = ind[abs(yind-xind) > epsilon]   # boolean indexing
An example with a lot of fixpoints:
import numpy as np
from matplotlib.pyplot import plot, show

x0 = np.linspace(0, 1, 1000)[1:]   # skip 0 to avoid division by zero in f
x = x0.copy()

def f(x): return x*np.sin(1/x)

find_fix_point(f, x, 1e-5)
plot(x0, x, '.'); show()
The general method is to use boolean indexing to compute only the values that have not yet reached equilibrium.
I adapted the algorithm given by Jonas Adler to avoid hitting the maximum recursion depth:
def find_fix_point_vector(f,x):
    x = x.copy()
    x_fix = np.empty(x.shape)
    unfixed = np.full(x.shape, True, dtype=bool)
    while unfixed.any():
        x_new = f(x)                 # iteration
        x_fix[unfixed] = x_new       # copy the values
        cond = np.abs(x_new - x) > 1
        unfixed[unfixed] = cond      # find out which ones are fixed
        x = x_new[cond]              # update the x values that still need to be computed
    return x_fix
Edit:
Here I review the 3 solutions proposed. I will call the different functions according to their proposer: find_fix_Jonas, find_fix_Jurg, find_fix_BM. I changed the fixpoint condition in all functions according to BM (see the updated function of Jonas at the end).
Speed:
%timeit find_fix_BM(f, np.linspace(0,100000,10000),1)
100 loops, best of 3: 2.31 ms per loop
%timeit find_fix_Jonas(f, np.linspace(0,100000,10000))
1000 loops, best of 3: 1.52 ms per loop
%timeit find_fix_Jurg(f, np.linspace(0,100000,10000))
1000 loops, best of 3: 1.28 ms per loop
In terms of readability, I think the version of Jonas is the easiest one to understand, so it should be chosen when speed does not matter very much.
Jonas's version, however, might hit the maximum recursion depth (raising a RecursionError) when the number of iterations needed to reach the fixpoint is large (>1000). The other two solutions do not have this drawback.
The version of B.M., however, might be easier to understand than the version proposed by me.
Version of Jonas used:
def find_fix_Jonas(f, x):
    fx = f(x)
    ind = np.abs(fx-x) > 1
    if ind.any():
        fx[ind] = find_fix_Jonas(f, fx[ind])
    return fx
...remove `x` that already have been found?
Create a new array using boolean indexing with your condition.
>>> a = np.array([3,1,6,3,9])
>>> a != 3
array([False, True, True, False, True], dtype=bool)
>>> b = a[a != 3]
>>> b
array([1, 6, 9])
>>>
I have a list of complex numbers for which I want to find the closest value in another list of complex numbers.
My current approach with numpy:
import numpy as np

refArray = np.random.random(16)
myArray = np.random.random(1000)

def find_nearest(array, value):
    idx = (np.abs(array - value)).argmin()
    return idx

for value in np.nditer(myArray):
    index = find_nearest(refArray, value)
    print(index)
Unfortunately, this takes ages for a large amount of values.
Is there a faster or more "pythonian" way of matching each value in myArray to the closest value in refArray?
FYI: I don't necessarily need numpy in my script.
Important: the order of both myArray as well as refArray is important and should not be changed. If sorting is to be applied, the original index should be retained in some way.
Here's one vectorized approach with np.searchsorted based on this post -
def closest_argmin(A, B):
    L = B.size
    sidx_B = B.argsort()
    sorted_B = B[sidx_B]
    sorted_idx = np.searchsorted(sorted_B, A)
    sorted_idx[sorted_idx == L] = L - 1
    mask = (sorted_idx > 0) & \
           (np.abs(A - sorted_B[sorted_idx-1]) < np.abs(A - sorted_B[sorted_idx]))
    return sidx_B[sorted_idx - mask]
Brief explanation:
Get the sorted indices for the left positions. We do this with np.searchsorted(arr1, arr2, side='left'), or just np.searchsorted(arr1, arr2). Since searchsorted expects a sorted array as its first input, we need some preparatory work there.
Compare the values at those left positions with the values at their immediate right positions (left + 1) and see which one is closest. We do this at the step that computes mask.
Based on whether the left ones or their immediate right ones are closest, choose the respective ones. This is done by subtracting the mask values, converted to ints, from the indices, so the mask acts as an offset.
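A tiny worked example (the numbers are mine, chosen for illustration) showing how the mask acts as an offset:
import numpy as np

# B = [0.5, 0.1] sorts to [0.1, 0.5]. For A[0] = 0.4, searchsorted returns 1;
# the left neighbour 0.1 is not closer than 0.5, so mask is False (offset 0)
# and the original index of 0.5, which is 0, is kept.
A = np.array([0.4, 0.05])
B = np.array([0.5, 0.1])
print(closest_argmin(A, B))   # [0 1] -> nearest values are B[0]=0.5 and B[1]=0.1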
Benchmarking
Original approach -
def org_app(myArray, refArray):
    out1 = np.empty(myArray.size, dtype=int)
    for i, value in enumerate(myArray):
        # find_nearest from the posted question
        index = find_nearest(refArray, value)
        out1[i] = index
    return out1
Timings and verification -
In [188]: refArray = np.random.random(16)
...: myArray = np.random.random(1000)
...:
In [189]: %timeit org_app(myArray, refArray)
100 loops, best of 3: 1.95 ms per loop
In [190]: %timeit closest_argmin(myArray, refArray)
10000 loops, best of 3: 36.6 µs per loop
In [191]: np.allclose(closest_argmin(myArray, refArray), org_app(myArray, refArray))
Out[191]: True
50x+ speedup for the posted sample and hopefully more for larger datasets!
An answer that is much shorter than that of @Divakar, also using broadcasting, and even slightly faster:
abs(myArray[:, None] - refArray[None, :]).argmin(axis=-1)
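A quick way to convince yourself both answers agree (a sketch; note the broadcasted difference materializes a full len(myArray) x len(refArray) matrix in memory, which is fine at these sizes):
import numpy as np

refArray = np.random.random(16)
myArray = np.random.random(1000)

idx_bcast = abs(myArray[:, None] - refArray[None, :]).argmin(axis=-1)
# closest_argmin from the answer above; should print True
print(np.array_equal(idx_bcast, closest_argmin(myArray, refArray)))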
I'm desperately searching for an efficient way to check if two 2D numpy arrays intersect.
So what I have is two arrays, each containing an arbitrary number of rows (1D vectors), like:
A=np.array([[2,3,4],[5,6,7],[8,9,10]])
B=np.array([[5,6,7],[1,3,4]])
C=np.array([[1,2,3],[6,6,7],[10,8,9]])
All I need is True if there is at least one vector that also occurs in the other array, otherwise False. So it should give results like this:
f(A,B) -> True
f(A,C) -> False
I'm kind of new to Python, and at first I wrote my program with Python lists, which works but of course is very inefficient. The program takes days to finish, so I am working on a numpy.array solution now, but these arrays really are not so easy to handle.
Here's some context about my program and the Python list solution:
What I'm doing is something like a self-avoiding random walk in 3 dimensions (http://en.wikipedia.org/wiki/Self-avoiding_walk). But instead of doing a random walk and hoping that it will reach a desirable length (e.g. I want chains built up of 1000 beads) without reaching a dead end, I do the following:
I create a "flat" Chain with the desired length N:
X=[]
for i in range(0,N+1):
    X.append((i,0,0))
Now I fold this flat chain:
1. Randomly choose one of the elements ("pivotelement").
2. Randomly choose one direction (either all elements to the left or to the right of the pivotelement).
3. Randomly choose one out of 9 possible rotations in space (3 axes * 3 possible rotations: 90°, 180°, 270°).
4. Rotate all the elements of the chosen direction with the chosen rotation.
5. Check if the new elements of the chosen direction intersect with the other direction.
6. No intersection -> accept the new configuration, else -> keep the old chain.
Steps 1.-6. have to be done a large number of times (e.g. for a chain of length 1000, ~5000 times), so these steps have to be done efficiently. My list-based solution is the following:
def PivotFold(chain):
    randPiv=random.randint(1,N)   # choose a random pivotelement, N is the chain length
    Pivot=chain[randPiv]          # get that pivotelement
    C=[]                          # C is going to be a shifted copy of the chain
    intersect=False
    # Shift the whole chain so the pivotelement sits at the origin, allowing simple rotations around the origin
    for j in range(0,N+1):
        C.append((chain[j][0]-Pivot[0],chain[j][1]-Pivot[1],chain[j][2]-Pivot[2]))
    rotRand=random.randint(1,18)  # rotRand chooses a direction and a rotation (2 possible directions * 9 rotations = 18 possibilities)
    # Rotations around the Z-axis
    if rotRand==1:
        for j in range(randPiv,N+1):
            C[j]=(-C[j][1],C[j][0],C[j][2])
            if C[0:randPiv].__contains__(C[j])==True:
                intersect=True
                break
    elif rotRand==2:
        for j in range(randPiv,N+1):
            C[j]=(C[j][1],-C[j][0],C[j][2])
            if C[0:randPiv].__contains__(C[j])==True:
                intersect=True
                break
    ...etc
    if intersect==False:   # return C if there was no intersection in C
        Shizz=C
    else:
        Shizz=chain
    return Shizz
The function PivotFold(chain) will be used on the initially flat chain X a large number of times. It's pretty naively written, so maybe you have some pro tips to improve it ^^ I thought that numpy arrays would be good because I can efficiently shift and rotate entire chains without looping over all the elements ...
This should do it:
In [11]:
def f(arrA, arrB):
    return not set(map(tuple, arrA)).isdisjoint(map(tuple, arrB))
In [12]:
f(A, B)
Out[12]:
True
In [13]:
f(A, C)
Out[13]:
False
In [14]:
f(B, C)
Out[14]:
False
To find intersection? OK, set sounds like a logical choice.
But numpy.array or list are not hashable? OK, convert them to tuple.
That is the idea.
A numpy way of doing it involves very unreadable broadcasting:
In [34]:
(A[...,np.newaxis]==B[...,np.newaxis].T).all(1)
Out[34]:
array([[False, False],
[ True, False],
[False, False]], dtype=bool)
In [36]:
(A[...,np.newaxis]==B[...,np.newaxis].T).all(1).any()
Out[36]:
True
Some timeit results:
In [38]:
#Dan's method
%timeit set_comp(A,B)
10000 loops, best of 3: 34.1 µs per loop
In [39]:
#Avoiding lambda will speed things up
%timeit f(A,B)
10000 loops, best of 3: 23.8 µs per loop
In [40]:
#numpy way probably will be slow, unless the size of the array is very big (my guess)
%timeit (A[...,np.newaxis]==B[...,np.newaxis].T).all(1).any()
10000 loops, best of 3: 49.8 µs per loop
Also, the numpy method will be RAM hungry, as the A[...,np.newaxis]==B[...,np.newaxis].T step creates a 3D array.
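For instance, with the small example arrays from the question the intermediate array is only (3, 3, 2), but it grows as rows(A) x columns x rows(B):
# A and B as defined in the question above (np is numpy).
print((A[..., np.newaxis] == B[..., np.newaxis].T).shape)   # (3, 3, 2)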
Using the same idea outlined here, you could do the following:
def make_1d_view(a):
    a = np.ascontiguousarray(a)
    dt = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
    return a.view(dt).ravel()

def f(a, b):
    return len(np.intersect1d(make_1d_view(a), make_1d_view(b))) != 0
>>> f(A, B)
True
>>> f(A, C)
False
This doesn't work for floating point types (it will not consider +0.0 and -0.0 the same value), and np.intersect1d uses sorting, so it has linearithmic, not linear, performance. You may be able to squeeze some performance by replicating the source of np.intersect1d in your code, and instead of checking the length of the return array, calling np.any on the boolean indexing array.
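Along those lines, a minimal sketch of that suggestion (it reuses the make_1d_view helper above and mirrors what np.intersect1d does internally, without building the result array):
import numpy as np

def any_common_row(a, b):
    # Deduplicate each side first so an adjacent duplicate can only come from a match.
    va = np.unique(make_1d_view(a))
    vb = np.unique(make_1d_view(b))
    combined = np.concatenate((va, vb))
    combined.sort()
    # Any adjacent equal pair means some row occurs in both arrays.
    return bool(np.any(combined[1:] == combined[:-1]))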
You can also get the job done with some np.tile and np.swapaxes business!
def intersect2d(X, Y):
    """
    Function to find intersection of two 2D arrays.
    Returns index of rows in X that are common to Y.
    """
    X = np.tile(X[:,:,None], (1, 1, Y.shape[0]))
    Y = np.swapaxes(Y[:,:,None], 0, 2)
    Y = np.tile(Y, (X.shape[0], 1, 1))
    eq = np.all(np.equal(X, Y), axis=1)
    eq = np.any(eq, axis=1)
    return np.nonzero(eq)[0]
To answer the question more specifically, you'd only need to check if the returned array is empty.
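For the yes/no question asked here, a usage sketch with the arrays from the question:
print(intersect2d(A, B).size > 0)   # True  -> [5,6,7] occurs in both
print(intersect2d(A, C).size > 0)   # False -> no common row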
This should be much faster since it is not O(n^2) like the for-loop solution, but it isn't fully numpythonic. I'm not sure how to better leverage numpy here.
def set_comp(a, b):
    sets_a = set(map(lambda x: frozenset(tuple(x)), a))
    sets_b = set(map(lambda x: frozenset(tuple(x)), b))
    return not sets_a.isdisjoint(sets_b)
I think you want True if the two arrays have a row in common! You can use this:
def f(A, B):
    for i in A:
        for j in B:
            if (i == j).all():   # rows are arrays, so compare elementwise
                return True
    return False
This problem can be solved efficiently using the numpy_indexed package (disclaimer: I am its author):
import numpy_indexed as npi
len(npi.intersection(A, B)) > 0
I have two lists:
A = [0,0,0,1,0,1]
B = [0,0,1,1,1,1]
I want to find the number of 1s in the same position in both lists.
The answer for these arrays would be 2.
A little shorter and hopefully more pythonic way:
>>> A=[0,0,0,1,0,1]
>>> B=[0,0,1,1,1,1]
>>> x = sum(1 for a,b in zip(A,B) if (a==b==1))
>>> x
2
I'm not an expert in Python, but what is wrong with a simple loop from start to end of the first array?
In C# I would do something like:
int match = 0;
for (int cnt = 0; cnt < A.Count; cnt++)
{
    if (A[cnt] == 1 && B[cnt] == 1) match++;
}
Would that be possible in your language?
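For comparison, a direct Python translation of that loop (a sketch using the question's lists):
A = [0,0,0,1,0,1]
B = [0,0,1,1,1,1]

match = 0
for a, b in zip(A, B):
    if a == 1 and b == 1:
        match += 1
print(match)   # 2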
Motivated by brief need to be perverse, I offer the following solution:
A = [0,0,0,1,0,1]
B = [0,0,1,1,1,1]
print len(set(i for i, n in enumerate(A) if n == 1) &
set(i for i, n in enumerate(B) if n == 1))
(Drakosha's suggestion is a far more reasonable way to solve this problem. This just demonstrates that one can often look at the same problem in different ways.)
With SciPy:
>>> from scipy import array
>>> A=array([0,0,0,1,0,1])
>>> B=array([0,0,1,1,1,1])
>>> A==B
array([ True, True, False, True, False, True], dtype=bool)
>>> sum(A==B)
4
>>> A!=B
array([False, False, True, False, True, False], dtype=bool)
>>> sum(A!=B)
2
[A[i]+B[i] for i in range(min([len(A), len(B)]))].count(2)
Basically this just creates a new list which has all the elements of the other two added together. You know there were two 1's if the sum is 2 (assuming only 0's and 1's in the list). Therefore just perform the count operation on 2.
Here comes another method which exploits the fact that the array just contains zeros and ones.
The scalar product of two vectors x and y is sum(x[i]*y[i]); with only zeros and ones, the only terms contributing are those where x[i]==y[i]==1. Thus, using numpy, for instance:
from numpy import *
x = array([0,0,0,1,0,1])
y = array([0,0,1,1,1,1])
print dot(x,y)
Simple and nice. This method does n multiplications and n-1 additions; however, there are fast implementations using SSE, GPGPU, vectorisation (add your fancy word here) for dot products (scalar products).
I timed the numpy-method against this method:
sum(1 for a,b in zip(x,y) if (a==b==1))
and found that for 1000000 loops the numpy version did it in 2121 ms and the zip method did it in 9502 ms, so the numpy version is a lot faster.
I did a better analysis of the effectiveness and found that for n elements in the array, the zip method took t1 ms and the dot product took t2 ms per iteration:
elements zip dot
1 0.0030 0.0207
10 0.0063 0.0230
100 0.0393 0.0476
1000 0.3696 0.2932
10000 7.6144 2.7781
100000 115.8824 30.1305
From this data one could draw the conclusion that if the number of elements in the array is expected (on average) to be more than 350 (or, say, 1000), one should consider using the dot-product method instead.