How to transform negative elements to zero without a loop? - python

If I have an array like
a = np.array([2, 3, -1, -4, 3])
I want to set all the negative elements to zero: [2, 3, 0, 0, 3]. How can I do it with numpy without an explicit for loop? I need to use the modified a in a computation, for example
c = a * b
where b is another array with the same length as the original a.
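A minimal sketch of the intended usage (the b values here are just made up for illustration):

import numpy as np

a = np.array([2, 3, -1, -4, 3])
b = np.array([1, 1, 2, 2, 3])        # hypothetical second array, same length as a

a_nonneg = np.where(a > 0, a, 0)     # [2, 3, 0, 0, 3]
c = a_nonneg * b                     # uses the modified values, no explicit loop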
Conclusion
import numpy as np
from time import time
a = np.random.uniform(-1, 1, 20000000)
t = time(); b = np.where(a>0, a, 0); print ("1. ", time() - t)
a = np.random.uniform(-1, 1, 20000000)
t = time(); b = a.clip(min=0); print ("2. ", time() - t)
a = np.random.uniform(-1, 1, 20000000)
t = time(); a[a < 0] = 0; print ("3. ", time() - t)
a = np.random.uniform(-1, 1, 20000000)
t = time(); a[np.where(a<0)] = 0; print ("4. ", time() - t)
a = np.random.uniform(-1, 1, 20000000)
t = time(); b = [max(x, 0) for x in a]; print ("5. ", time() - t)
1.38629984856
0.516846179962  <- fastest: a.clip(min=0)
0.615426063538
0.944557905197
51.7364809513

a = a.clip(min=0)

I would do this:
a[a < 0] = 0
If you want to keep the original a and only set the negative elements to zero in a copy, you can copy the array first:
c = a.copy()
c[c < 0] = 0

Another trick is to use multiplication. This actually seems to be much faster than every other method here. For example
b = a*(a>0) # copies data
or
a *= (a>0) # in-place zero-ing
I ran tests with timeit, pre-calculating the < and > comparisons because some of these methods modify a in place and that would greatly affect the results. In all cases a started as np.random.uniform(-1, 1, 20000000); the masks L = a < 0 and G = a > 0 were computed first, and the negatives in a were set to 0 before timing. clip is relatively negatively impacted since it doesn't get to use L or G (however, calculating those on the same machine took only 17 ms each, so it is not the major cause of the speed difference).
%timeit b = np.where(G, a, 0) # 132ms copies
%timeit b = a.clip(min=0) # 165ms copies
%timeit a[L] = 0 # 158ms in-place
%timeit a[np.where(L)] = 0 # 122ms in-place
%timeit b = a*G # 87.4ms copies
%timeit np.multiply(a,G,a) # 40.1ms in-place (normal code would use `a*=G`)
When choosing to penalize the in-place methods instead of clip, the following timings come up:
%timeit b = np.where(a>0, a, 0) # 152ms
%timeit b = a.clip(min=0) # 165ms
%timeit b = a.copy(); b[a<0] = 0 # 231ms
%timeit b = a.copy(); b[np.where(a<0)] = 0 # 205ms
%timeit b = a*(a>0) # 108ms
%timeit b = a.copy(); b*=a>0 # 121ms
Non in-place methods are penalized by 20 ms (the time required to calculate a>0 or a<0) and the in-place methods are penalized by 73-83 ms (so it takes about 53-63 ms to do b.copy()).
Overall the multiplication methods are much faster than clip. If not in-place, it is 1.5x faster. If you can do it in-place then it is 2.75x faster.

Use where
a[numpy.where(a<0)] = 0

Based on my answer here, using np.maximum is the fastest possible way.
a = np.random.random(1000) - 0.5
%%timeit
a_ = a.copy()
a_ = np.maximum(a_,0)
# 15.6 µs ± 2.14 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%%timeit
a_ = a.copy()
a_ = a_.clip(min=0)
# 54.2 µs ± 10.4 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
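As a side note (my own sketch, not benchmarked here), np.maximum also accepts an out argument, so the negatives can be zeroed in place without allocating a new array:

np.maximum(a, 0, out=a)   # overwrites a with the elementwise max(a, 0)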

And just for the sake of comprehensiveness, I would like to add the use of the Heaviside function (or a step function) to achieve a similar outcome as follows:
Let say for continuity we have
a = np.array([2, 3, -1, -4, 3])
Then using a step function np.heaviside() one can try
b = a * np.heaviside(a, 0)
Note something interesting in this operation: the negative signs are preserved, so the zeroed entries come out as -0. Not very ideal for most situations, I would say.
This can then be corrected for by
b = abs(b)
So this is probably a rather long way to do it without invoking some loop.
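For illustration (my own quick run of the above), the negative zeros and the fix look like this:

import numpy as np

a = np.array([2., 3., -1., -4., 3.])
b = a * np.heaviside(a, 0)   # array([ 2.,  3., -0., -0.,  3.])  <- note the -0. entries
b = abs(b)                   # array([ 2.,  3.,  0.,  0.,  3.])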

Related

Comparing numpy array with itself by element efficiently

I am performing a large number of these calculations:
A == A[np.newaxis].T
where A is a dense numpy array which frequently has common values.
For benchmarking purposes we can use:
n = 30000
A = np.random.randint(0, 1000, n)
A == A[np.newaxis].T
When I perform this calculation, I run into memory issues. I believe this is because the output isn't in a more efficient bitarray or np.packbits format. A secondary concern is that we are performing twice as many comparisons as necessary, since the resulting Boolean array is symmetric.
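For a rough sense of scale (my own back-of-the-envelope numbers):

n = 30000
dense_bytes  = n * n       # boolean array: 1 byte per value  -> ~900 MB
packed_bytes = n * n // 8  # packed bits: 1 bit per value     -> ~112.5 MB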
The questions I have are:
Is it possible to produce the Boolean numpy array output in a more memory-efficient fashion without sacrificing speed? The options I know about are bitarray and np.packbits, but I only know how to apply these after the large Boolean array is created.
Can we utilise the symmetry of our calculation to halve the number of comparisons processed, again without sacrificing speed?
I will need to be able to perform & and | operations on Boolean arrays output. I have tried bitarray, which is super-fast for these bitwise operations. But it is slow to pack np.ndarray -> bitarray and then unpack bitarray -> np.ndarray.
[Edited to provide clarification.]
Here's one with numba to give us a NumPy boolean array as output -
from numba import njit
@njit
def numba_app1(idx, n, s, out):
    for i,j in zip(idx[:-1],idx[1:]):
        s0 = s[i:j]
        c = 0
        for p1 in s0[c:]:
            for p2 in s0[c+1:]:
                out[p1,p2] = 1
                out[p2,p1] = 1
            c += 1
    return out
def app1(A):
    s = A.argsort()
    b = A[s]
    n = len(A)
    idx = np.flatnonzero(np.r_[True,b[1:] != b[:-1],True])
    out = np.zeros((n,n),dtype=bool)
    numba_app1(idx, n, s, out)
    out.ravel()[::out.shape[1]+1] = 1
    return out
Timings -
In [287]: np.random.seed(0)
...: n = 30000
...: A = np.random.randint(0, 1000, n)
# Original soln
In [288]: %timeit A == A[np.newaxis].T
1 loop, best of 3: 317 ms per loop
# @Daniel F's soln-1 that skips assigning lower diagonal in output
In [289]: %timeit sparse_outer_eq(A)
1 loop, best of 3: 450 ms per loop
# @Daniel F's soln-2 (complete one)
In [291]: %timeit sparse_outer_eq(A)
1 loop, best of 3: 634 ms per loop
# Solution from this post
In [292]: %timeit app1(A)
10 loops, best of 3: 66.9 ms per loop
This isn't even a numpy answer, but it should work to keep your data requirements down by using a bit of homebrewed sparse notation:
from numba import jit
@jit  # because this is gonna be loopy
def sparse_outer_eq(A):
    n = A.size
    c = []
    for i in range(n):
        for j in range(i + 1, n):
            if A[i] == A[j]:
                c.append((i, j))
    return c
Now c is a list of coordinate tuples (i, j), i < j, that correspond to coordinates in your boolean array that are "True". You can easily do AND and OR operations on these set-wise:
list(set(c1) & set(c2))
list(set(c1) | set(c2))
Later, when you want to apply this mask to an array, you can back out the coordinates and use them for fancy indexing instead:
i_, j_ = list(np.array(c).T)
i = np.r_[i_, j_, np.arange(n)]
j = np.r_[j_, i_, np.arange(n)]
You can then np.lexsort i and j if you care about order.
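For example, a small sketch of that lexsort step (my own addition, not from the original answer):

order = np.lexsort((j, i))   # the last key, i, is the primary sort key
i, j = i[order], j[order]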
Alternatively, you can define sparse_outer_eq as:
@jit
def sparse_outer_eq(A):
    n = A.size
    c = []
    for i in range(n):
        for j in range(n):
            if A[i] == A[j]:
                c.append((i, j))
    return c
Which keeps >2x the data, but then the coordinates come out simply:
i, j = list(np.array(c).T)
If you've done any set operations, this will still need to be lexsorted if you want a rational order.
If your coordinates are each n-bit integers, this should be more space-efficient than boolean format as long as your sparsity is less than 1/n -> 3% or so for 32-bit.
As for time, thanks to numba it's even faster than broadcasting:
n = 3000
A = np.random.randint(0, 1000, n)
%timeit sparse_outer_eq(A)
100 loops, best of 3: 4.86 ms per loop
%timeit A == A[:, None]
100 loops, best of 3: 11.8 ms per loop
and comparisons:
a = A == A[:, None]
b = B == B[:, None]
a_ = sparse_outer_eq(A)
b_ = sparse_outer_eq(B)
%timeit a & b
100 loops, best of 3: 5.9 ms per loop
%timeit list(set(a_) & set(b_))
1000 loops, best of 3: 641 µs per loop
%timeit a | b
100 loops, best of 3: 5.52 ms per loop
%timeit list(set(a_) | set(b_))
1000 loops, best of 3: 955 µs per loop
EDIT: if you want to do &~ (as per your comment) use the second sparse_outer_eq method (so you don't have to keep track of the diagonal) and just do:
list(set(a_) - set(b_))
Here is the more or less canonical argsort solution:
import numpy as np
def f_argsort(A):
    idx = np.argsort(A)
    As = A[idx]
    ne_ = np.r_[True, As[:-1] != As[1:], True]
    bnds = np.flatnonzero(ne_)
    valid = np.diff(bnds) != 1
    return [idx[bnds[i]:bnds[i+1]] for i in np.flatnonzero(valid)]
n = 30000
A = np.random.randint(0, 1000, n)
groups = f_argsort(A)
for grp in groups:
    print(len(grp), set(A[grp]), end=' ')
print()
I'm adding a solution to my question because it satisfies these 3 properties:
Low, fixed, memory requirement
Fast bitwise operations (&, |, ~, etc)
Low storage, 1-bit per Boolean via packing integers
The downside is that it is stored in np.packbits format. It is substantially slower than the other methods (especially argsort), but if speed is not an issue the algorithm should work well. If anyone figures out a way to optimise it further, that would be very helpful.
Update: A more efficient version of the below algorithm can be found here: Improving performance on comparison algorithm np.packbits(A==A[:, None], axis=1).
import numpy as np
from numba import jit
@jit(nopython=True)
def bool2int(x):
    y = 0
    for i, j in enumerate(x):
        if j: y += int(j)<<(7-i)
    return y

@jit(nopython=True)
def compare_elementwise(arr, result, section):
    n = len(arr)
    for row in range(n):
        for col in range(n):
            section[col%8] = arr[row] == arr[col]
            if ((col + 1) % 8 == 0) or (col == (n-1)):
                result[row, col // 8] = bool2int(section)
                section[:] = 0
    return result
A = np.random.randint(0, 10, 100)
n = len(A)
result_arr = np.zeros((n, n // 8 if n % 8 == 0 else n // 8 + 1)).astype(np.uint8)
selection_arr = np.zeros(8).astype(np.uint8)
packed = compare_elementwise(A, result_arr, selection_arr)

Improving performance on comparison algorithm np.packbits(A==A[:, None], axis=1)

I am looking to memory-optimise np.packbits(A==A[:, None], axis=1), where A is a dense array of integers of length n. A==A[:, None] is memory hungry for large n, since the resulting Boolean array is stored inefficiently, with each Boolean value costing 1 byte.
I wrote the below script to achieve the same result while packing bits one section at a time. It is, however, around 3x slower, so I am looking for ways to speed it up. Or, alternatively, a better algorithm with small memory overhead.
Note: this is a follow-up question to one I asked earlier; Comparing numpy array with itself by element efficiently.
Reproducible code below for benchmarking.
import numpy as np
from numba import jit
@jit(nopython=True)
def bool2int(x):
    y = 0
    for i, j in enumerate(x):
        if j: y += int(j)<<(7-i)
    return y

@jit(nopython=True)
def compare_elementwise(arr, result, section):
    n = len(arr)
    for row in range(n):
        for col in range(n):
            section[col%8] = arr[row] == arr[col]
            if ((col + 1) % 8 == 0) or (col == (n-1)):
                result[row, col // 8] = bool2int(section)
                section[:] = 0
    return result
n = 10000
A = np.random.randint(0, 1000, n)
result_arr = np.zeros((n, n // 8 if n % 8 == 0 else n // 8 + 1)).astype(np.uint8)
selection_arr = np.zeros(8).astype(np.uint8)
# memory efficient version, but slow
packed = compare_elementwise(A, result_arr, selection_arr)
# memory inefficient version, but fast
packed2 = np.packbits(A == A[:, None], axis=1)
assert (packed == packed2).all()
%timeit compare_elementwise(A, result_arr, selection_arr) # 1.6 seconds
%timeit np.packbits(A == A[:, None], axis=1) # 0.460 second
Here is a solution 3 times faster than the numpy one (a.size must be a multiple of 8; see below):
import numba as nb

@nb.njit
def comp(a):
    res = np.zeros((a.size, a.size//8), np.uint8)
    for i, x in enumerate(a):
        for j, y in enumerate(a):
            if x == y: res[i, j//8] |= 128 >> j%8
    return res
This works because the array is scanned one time, where you do it many times, and almost all terms are null.
In [122]: %timeit np.packbits(A == A[:, None], axis=1)
389 ms ± 57.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [123]: %timeit comp(A)
123 ms ± 24.4 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
If a.size % 8 > 0, the cost of recovering the information will be higher. The best way in this case is to pad the initial array with a few (up to 7) zeros.
For completeness, the padding could be done like so:
if A.size % 8 != 0: A = np.pad(A, (0, 8 - A.size % 8), 'constant', constant_values=0)
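As a side note (my own sketch, not part of either answer), testing a single entry (i, j) of the packed result looks like this:

def is_equal(packed, i, j):
    # True if entry (i, j) of the original boolean matrix is set
    return bool(packed[i, j // 8] & (128 >> (j % 8)))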

How to find the longest sub-array within a threshold?

Let's say you have a sorted array of numbers sorted_array and a threshold threshold. What is the fastest way to find the longest sub-array in which all the values are within the threshold? In other words, find indices i and j such that:
sorted_array[j] - sorted_array[i] <= threshold
j - i is maximal
In case of a tie, return the pair with the smallest i.
I already have a loop-based solution, which I will post as an answer, but I'm curious to see if there's a better way, or a way that can avoid the loop using a vector-capable language or library like NumPy, for example.
Example input and output:
>>> sorted_array = [0, 0.7, 1, 2, 2.5]
>>> longest_subarray_within_threshold(sorted_array, 0.2)
(0, 0)
>>> longest_subarray_within_threshold(sorted_array, 0.4)
(1, 2)
>>> longest_subarray_within_threshold(sorted_array, 0.8)
(0, 1)
>>> longest_subarray_within_threshold(sorted_array, 1)
(0, 2)
>>> longest_subarray_within_threshold(sorted_array, 1.9)
(1, 4)
>>> longest_subarray_within_threshold(sorted_array, 2)
(0, 3)
>>> longest_subarray_within_threshold(sorted_array, 2.6)
(0, 4)
Most likely, the OP's own answer is the best possible algorithm, as it is O(n). However, the pure-Python overhead makes it very slow. That overhead can easily be removed by compiling the algorithm with numba; with the current version (0.28.1 as of this writing) there is no need for any manual typing, simply decorating your function with @numba.njit() is enough.
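For concreteness (my own sketch, assuming the OP's longest_subarray_within_threshold function from the answer further down is already defined), the compiled variant used in the timings below is simply:

import numba

# Wrap the OP's pure-Python loop in numba's JIT compiler; no manual typing needed.
algo_user1475412_numba = numba.njit()(longest_subarray_within_threshold)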
However, if you do not want to depend on numba, there is a numpy algorithm in O(n log n):
def algo_burnpanck(sorted_array, thresh):
    limit = np.searchsorted(sorted_array, sorted_array + thresh, 'right')
    distance = limit - np.arange(limit.size)
    best = np.argmax(distance)
    return best, limit[best] - 1
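A quick sanity check against one of the examples from the question (my own check, not from the original answer):

sorted_array = np.array([0, 0.7, 1, 2, 2.5])
algo_burnpanck(sorted_array, 1.9)   # -> (1, 4), matching the expected output above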
I ran a quick profiling on my own machine of the two previous answers (the OP's and Divakar's), as well as my numpy algorithm and the numba version of the OP's algorithm.
thresh = 1
for n in [100, 10000]:
    sorted_array = np.sort(np.random.randn(n,))
    for f in [algo_user1475412, algo_Divakar, algo_burnpanck, algo_user1475412_numba]:
        a, b = f(sorted_array, thresh)
        d = b - a
        diff = sorted_array[b] - sorted_array[a]
        closestlonger = np.min(sorted_array[d+1:] - sorted_array[:-d-1])
        assert sorted_array[b] - sorted_array[a] <= thresh
        assert closestlonger > thresh
        print('f=%s, n=%d thresh=%s:' % (f.__name__, n, thresh))  # ,d,a,b,diff,closestlonger
        %timeit f(sorted_array, thresh)
Here are the results:
f=algo_user1475412, n=100 thresh=1:
10000 loops, best of 3: 111 µs per loop
f=algo_Divakar, n=100 thresh=1:
10000 loops, best of 3: 74.6 µs per loop
f=algo_burnpanck, n=100 thresh=1:
100000 loops, best of 3: 9.38 µs per loop
f=algo_user1475412_numba, n=100 thresh=1:
1000000 loops, best of 3: 764 ns per loop
f=algo_user1475412, n=10000 thresh=1:
100 loops, best of 3: 12.1 ms per loop
f=algo_Divakar, n=10000 thresh=1:
1 loop, best of 3: 1.76 s per loop
f=algo_burnpanck, n=10000 thresh=1:
1000 loops, best of 3: 308 µs per loop
f=algo_user1475412_numba, n=10000 thresh=1:
10000 loops, best of 3: 82.9 µs per loop
At 100 numbers, the O(n^2) solution using numpy beats the O(n) pure-Python solution, but soon after, the scaling makes that algorithm useless. The O(n log n) solution keeps up even at 10000 numbers, but the numba approach is unbeaten everywhere.
Here's a simple loop-based solution in Python:
def longest_subarray_within_threshold(sorted_array, threshold):
    result = (0, 0)
    longest = 0
    i = j = 0
    end = len(sorted_array)
    while i < end:
        if j < end and sorted_array[j] - sorted_array[i] <= threshold:
            current_distance = j - i
            if current_distance > longest:
                longest = current_distance
                result = (i, j)
            j += 1
        else:
            i += 1
    return result
Here's a vectorized approach using broadcasting -
def longest_thresh_subarray(sorted_array, thresh):
    diffs = (sorted_array[:,None] - sorted_array)
    r = np.arange(sorted_array.size)
    valid_mask = r[:,None] > r
    mask = (diffs <= thresh) & valid_mask
    bestcolID = (mask).sum(0).argmax()
    idx = np.nonzero(mask[:,bestcolID])[0]
    if len(idx)==0:
        out = (0,0)
    else:
        out = idx[0]-1, idx[-1]
    return out
Sample runs -
In [137]: sorted_array = np.array([0, 0.7, 1, 2, 2.5])
In [138]: longest_thresh_subarray(sorted_array,0.2)
Out[138]: (0, 0)
In [139]: longest_thresh_subarray(sorted_array,0.4)
Out[139]: (1, 2)
In [140]: longest_thresh_subarray(sorted_array,0.8)
Out[140]: (0, 1)
In [141]: longest_thresh_subarray(sorted_array,1)
Out[141]: (0, 2)
In [142]: longest_thresh_subarray(sorted_array,1.9)
Out[142]: (1, 4)
In [143]: longest_thresh_subarray(sorted_array,2)
Out[143]: (0, 3)
In [144]: longest_thresh_subarray(sorted_array,2.6)
Out[144]: (0, 4)

Vectorize addition into array indexed by another array

I am trying to get a fast vectorized version of the following loop:
for i in xrange(N1):
    A[y[i]] -= B[i,:]
Here A.shape = (N2,N3), y.shape = (N1,) with y taking values in [0, N2), and B.shape = (N1,N3). You can think of the entries of y as indices into the rows of A. Here N1 is large, N2 is pretty small and N3 is smallish.
I thought simply doing
A[y] -= B
would work, but the issue is that there are repeated entries in y and this does not do the right thing (i.e. if y=[1,1] then A[1] is only updated once, not twice). Also, this does not seem to be any faster than the unvectorized for loop.
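A tiny illustration of that behaviour (my own example, 1-D for simplicity):

A = np.zeros(3)
y = np.array([1, 1])
B = np.ones(2)
A[y] -= B   # A is now [0., -1., 0.]; the repeated index 1 is applied only once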
Is there a better way of doing this?
EDIT: YXD linked to this answer in the comments, which at first seems to fit the bill. It would seem you can do exactly what I want with
np.subtract.at(A, y, B)
and it does work; however, when I try to run it, it is significantly slower than the unvectorized version. So the question remains: is there a more performant way of doing this?
EDIT2: An example, to make things concrete:
n1,n2,n3 = 10000, 10, 500
A = np.random.rand(n2,n3)
y = np.random.randint(n2, size=n1)
B = np.random.rand(n1,n3)
The for loop, when run using %timeit in ipython gives on my machine:
10 loops, best of 3: 19.4 ms per loop
The subtract.at version produces the same value for A in the end, but is much slower:
1 loops, best of 3: 444 ms per loop
The code for the original for-loop based approach would look something like this -
def for_loop(A):
    N1 = B.shape[0]
    for i in xrange(N1):
        A[y[i]] -= B[i,:]
    return A
Case #1
If n2 >> n3, I would suggest this vectorized approach -
def bincount_vectorized(A):
    n3 = A.shape[1]
    nrows = y.max()+1
    id = y[:,None] + nrows*np.arange(n3)
    A[:nrows] -= np.bincount(id.ravel(),B.ravel()).reshape(n3,nrows).T
    return A
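A quick correctness check against the original loop (my own addition; both functions read y and B from the enclosing scope, as set up in the runtime tests below):

assert np.allclose(for_loop(A.copy()), bincount_vectorized(A.copy()))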
Runtime tests -
In [203]: n1,n2,n3 = 10000, 500, 10
...: A = np.random.rand(n2,n3)
...: y = np.random.randint(n2, size=n1)
...: B = np.random.rand(n1,n3)
...:
...: # Make copies
...: Acopy1 = A.copy()
...: Acopy2 = A.copy()
...:
In [204]: %timeit for_loop(Acopy1)
10 loops, best of 3: 19 ms per loop
In [205]: %timeit bincount_vectorized(Acopy2)
1000 loops, best of 3: 779 µs per loop
Case #2
If n2 << n3, a modified for-loop approach with lower loop complexity could be suggested -
def for_loop_v2(A):
    n2 = A.shape[0]
    for i in range(n2):
        A[i] -= np.einsum('ij->j',B[y==i]) # OR (B[y==i]).sum(0)
    return A
Runtime tests -
In [206]: n1,n2,n3 = 10000, 10, 500
...: A = np.random.rand(n2,n3)
...: y = np.random.randint(n2, size=n1)
...: B = np.random.rand(n1,n3)
...:
...: # Make copies
...: Acopy1 = A.copy()
...: Acopy2 = A.copy()
...:
In [207]: %timeit for_loop(Acopy1)
10 loops, best of 3: 24.2 ms per loop
In [208]: %timeit for_loop_v2(Acopy2)
10 loops, best of 3: 20.3 ms per loop

Numpy modify array in place?

I have the following code which is attempting to normalize the values of an m x n array (It will be used as input to a neural network, where m is the number of training examples and n is the number of features).
However, when I inspect the array in the interpreter after the script runs, I see that the values are not normalized; that is, they still have the original values. I guess this is because the assignment to the array variable inside the function is only seen within the function.
How can I do this normalization in place? Or do I have to return a new array from the normalize function?
import numpy
def normalize(array, imin = -1, imax = 1):
    """I = Imin + (Imax-Imin)*(D-Dmin)/(Dmax-Dmin)"""
    dmin = array.min()
    dmax = array.max()
    array = imin + (imax - imin)*(array - dmin)/(dmax - dmin)
    print array[0]

def main():
    array = numpy.loadtxt('test.csv', delimiter=',', skiprows=1)
    for column in array.T:
        normalize(column)
    return array

if __name__ == "__main__":
    a = main()
If you want to apply mathematical operations to a numpy array in-place, you can simply use the standard in-place operators +=, -=, /=, etc. So for example:
>>> def foo(a):
... a += 10
...
>>> a = numpy.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> foo(a)
>>> a
array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19])
The in-place version of these operations is a tad faster to boot, especially for larger arrays:
>>> def normalize_inplace(array, imin=-1, imax=1):
... dmin = array.min()
... dmax = array.max()
... array -= dmin
... array *= imax - imin
... array /= dmax - dmin
... array += imin
...
>>> def normalize_copy(array, imin=-1, imax=1):
... dmin = array.min()
... dmax = array.max()
... return imin + (imax - imin) * (array - dmin) / (dmax - dmin)
...
>>> a = numpy.arange(10000, dtype='f')
>>> %timeit normalize_inplace(a)
10000 loops, best of 3: 144 us per loop
>>> %timeit normalize_copy(a)
10000 loops, best of 3: 146 us per loop
>>> a = numpy.arange(1000000, dtype='f')
>>> %timeit normalize_inplace(a)
100 loops, best of 3: 12.8 ms per loop
>>> %timeit normalize_copy(a)
100 loops, best of 3: 16.4 ms per loop
This is a trick that is slightly more general than the other useful answers here:
def normalize(array, imin = -1, imax = 1):
    """I = Imin + (Imax-Imin)*(D-Dmin)/(Dmax-Dmin)"""
    dmin = array.min()
    dmax = array.max()
    array[...] = imin + (imax - imin)*(array - dmin)/(dmax - dmin)
Here we are assigning values to the view array[...] rather than assigning these values to some new local variable within the scope of the function.
x = np.arange(5, dtype='float')
print x
normalize(x)
print x
>>> [0. 1. 2. 3. 4.]
>>> [-1. -0.5 0. 0.5 1. ]
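To spell out why this works (my own illustration, not part of the original answer): assigning to array[...] writes through the existing buffer, while a plain assignment just rebinds the local name.

def rebind(arr):
    arr = arr + 1        # new array bound to the local name; the caller's array is unchanged

def write_through(arr):
    arr[...] = arr + 1   # writes the result into the caller's existing buffer

x = np.arange(3, dtype='float')
rebind(x)
print x          # [ 0.  1.  2.]
write_through(x)
print x          # [ 1.  2.  3.]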
EDIT:
It's slower; it allocates a new array. But it may be valuable if you are doing something more complicated where builtin in-place operations are cumbersome or don't suffice.
def normalize2(array, imin=-1, imax=1):
    dmin = array.min()
    dmax = array.max()
    array -= dmin
    array *= (imax - imin)
    array /= (dmax - dmin)
    array += imin
A = np.random.randn(200**3).reshape([200] * 3)
%timeit -n5 -r5 normalize(A)
%timeit -n5 -r5 normalize2(A)
>> 47.6 ms ± 678 µs per loop (mean ± std. dev. of 5 runs, 5 loops each)
>> 26.1 ms ± 866 µs per loop (mean ± std. dev. of 5 runs, 5 loops each)
def normalize(array, imin = -1, imax = 1):
    """I = Imin + (Imax-Imin)*(D-Dmin)/(Dmax-Dmin)"""
    dmin = array.min()
    dmax = array.max()
    array -= dmin
    array *= (imax - imin)
    array /= (dmax - dmin)
    array += imin
    print array[0]
There is a nice way to do in-place normalization when using numpy. np.vectorize is very useful when combined with a lambda function applied to an array. See the example below:
import numpy as np

def normalizeMe(value, vmin, vmax):
    vnorm = float(value - vmin)/float(vmax - vmin)
    return vnorm

imin = 0
imax = 10
feature = np.random.randint(10, size=10)

# Vectorize your function (only need to do it once)
temp = np.vectorize(lambda val: normalizeMe(val, imin, imax))
normfeature = temp(np.asarray(feature))
print feature
print normfeature
One can compare the performance with a generator expression; however, there are likely many other ways to do this.
%%timeit
temp = np.vectorize(lambda val: normalizeMe(val,imin,imax))
normfeature1 = temp(np.asarray(feature))
10000 loops, best of 3: 25.1 µs per loop
%%timeit
normfeature2 = [i for i in (normalizeMe(val,imin,imax) for val in feature)]
100000 loops, best of 3: 9.69 µs per loop
%%timeit
normalize(np.asarray(feature))
100000 loops, best of 3: 12.7 µs per loop
So vectorize is definitely not the fastest, but it can be convenient in cases where performance is not as important.
