Say I have 2 numpy 2D arrays, mins and maxs, that will always have the same shape as one another. I'd like to create a third array, results, that is the result of applying linspace to each min/max pair. Is there some "numpy"/vectorized way to do this? Example non-vectorized code is below to show the results I would like.
import numpy as np
mins = np.random.rand(2,2)
maxs = np.random.rand(2,2)
# Number of elements in the linspace
x = 3
m, n = mins.shape
results = np.zeros((m, n, x))
for i in range(m):
    for j in range(n):
        min = mins[i][j]
        max = maxs[i][j]
        results[i][j] = np.linspace(min, max, num=x)
Here's one vectorized approach, based on this post, that covers generic n-dim cases -
def create_ranges_nd(start, stop, N, endpoint=True):
    if endpoint:
        divisor = N - 1
    else:
        divisor = N
    steps = (1.0 / divisor) * (stop - start)
    return start[..., None] + steps[..., None] * np.arange(N)
Sample run -
In [536]: mins = np.array([[3,5],[2,4]])
In [537]: maxs = np.array([[13,16],[11,12]])
In [538]: create_ranges_nd(mins, maxs, 6)
Out[538]:
array([[[ 3. , 5. , 7. , 9. , 11. , 13. ],
[ 5. , 7.2, 9.4, 11.6, 13.8, 16. ]],
[[ 2. , 3.8, 5.6, 7.4, 9.2, 11. ],
[ 4. , 5.6, 7.2, 8.8, 10.4, 12. ]]])
As of Numpy version 1.16.0, non-scalar start and stop are now supported.
So, now you can do this:
from numpy.lib import NumpyVersion
assert NumpyVersion(np.__version__) >= '1.16.0'  # array-like start/stop arrived in 1.16.0
mins = np.random.rand(2,2)
maxs = np.random.rand(2,2)
# Number of elements in the linspace
x = 3
results = np.linspace(mins, maxs, num=x)
# And, if required
results = np.rollaxis(results, 0, 3)
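Since the same 1.16.0 release, np.linspace also accepts an axis argument, so the rollaxis step can be folded into the call directly; a small variant of the line above:

results = np.linspace(mins, maxs, num=x, axis=-1)  # shape (2, 2, x): the new samples axis is placed last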
Related
I have this for loop that I need to vectorize. The code below works, but takes a lot of time (this is a simplified example, the full version will have about 1e6 rows in col_ids). Can someone give me an idea how to vectorize this code to get rid of the loop? If it matters, the col_ids are fixed (will be the same every time the code is run), while the values will change.
values = np.array([1.5, 2, 2.3])
col_ids = np.array([[0,0,0,0], [0,0,0,1], [0,0,1,1]])
result = np.zeros((4,3))
for idx, col_idx in enumerate(col_ids):
    result[np.arange(4), col_idx] += values[idx]
Result:
[[5.8 0. 0. ]
[5.8 0. 0. ]
[3.5 2.3 0. ]
[1.5 4.3 0. ]]
Update:
I am adding a second example as there was some ambiguity in the dimensions of my first example. Only values and col_ids are updated; everything else is as in the first example. (I keep the first one, since it is referred to in the answers.)
values = np.array([1.5, 2, 5, 20, 50])
col_ids = np.array([[0,0,0,0], [0,0,0,1], [0,0,1,1], [0,0,1,2], [0,1,2,2]])
Result:
[[78.5 0. 0. ]
[28.5 50. 0. ]
[ 3.5 25. 50. ]
[ 1.5 7. 70. ]]
So result is m x n, col_ids is k x m, and values has length k. Both m and n are small (m=4, n=3); k is large (about 1e6 in the full example).
You can vectorize the loop, but creating the additional intermediate array makes it much slower for larger data (starting from a result shape of about (50, 50)).
import numpy as np
values = np.array([1.5, 2, 2.3])
col_ids = np.array([[0,0,0,0], [0,0,0,1], [0,0,1,1]])
(np.equal.outer(col_ids, np.arange(len(values))) * values[:,None,None]).sum(0)
# for a fixed result shape (4,3)
# (np.equal.outer(col_ids, np.arange(3)) * values[:,None,None]).sum(0)
Output
array([[5.8, 0. , 0. ],
[5.8, 0. , 0. ],
[3.5, 2.3, 0. ],
[1.5, 4.3, 0. ]])
The only reliably faster solution I could find is numba (using version 0.55.1). I thought this implementation would benefit from parallel execution, but I couldn't get any speed up on a 2-core colab instance.
import numba as nb
@nb.njit(parallel=False)  # Try parallel=True for multi-threaded execution, no speed up in my benchmarks
def fill(val, ids):
    res = np.zeros(ids.shape[::-1])
    for i in nb.prange(len(res)):
        for j in range(res.shape[1]):
            res[i, ids[j, i]] += val[j]
    return res
fill(values, col_ids)
Output
array([[5.8, 0. , 0. ],
[5.8, 0. , 0. ],
[3.5, 2.3, 0. ],
[1.5, 4.3, 0. ]])
For a fixed result shape (4,3) with suitable input.
@nb.njit(boundscheck=True)  # ~1.25x slower, but much safer
def fill(val, ids):
    res = np.zeros((4, 3))
    for i in nb.prange(ids.shape[0]):
        for j in range(ids.shape[1]):
            res[j, ids[i, j]] += val[i]
    return res
fill(values, col_ids)
Output for the updated example data
array([[78.5, 0. , 0. ],
[28.5, 50. , 0. ],
[ 3.5, 25. , 50. ],
[ 1.5, 7. , 70. ]])
You can solve this using np.add.at. However, AFAIK, this function does not support 2D arrays directly, so you need to flatten the arrays, compute the 1D flattened indices, and then call the function:
result = np.zeros((4,3))
n, m = result.shape
indices = np.tile(np.arange(0, n*m, m), col_ids.shape[0]) + col_ids.ravel()
np.add.at(result.ravel(), indices, np.repeat(values, n)) # In-place
print(result)
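For reference, a self-contained run of this flattened-index approach against the updated example data from the question (restated here) reproduces the expected result:

import numpy as np

values = np.array([1.5, 2, 5, 20, 50])
col_ids = np.array([[0,0,0,0], [0,0,0,1], [0,0,1,1], [0,0,1,2], [0,1,2,2]])

result = np.zeros((4, 3))
n, m = result.shape
# flattened index of result[r, c] is r*m + c
indices = np.tile(np.arange(0, n*m, m), col_ids.shape[0]) + col_ids.ravel()
np.add.at(result.ravel(), indices, np.repeat(values, n))  # ravel() is a view here, so result is updated in place
print(result)
# [[78.5  0.   0. ]
#  [28.5 50.   0. ]
#  [ 3.5 25.  50. ]
#  [ 1.5  7.  70. ]]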
Let's say I have a simple array, like this one:
import numpy as np
a = np.array([1,2,3])
Which returns me, obviously:
array([1, 2, 3])
I'm trying to add calculated values between consecutive values in this array. The calculation should return n equally spaced values between its bounds.
To express myself in numbers, let's say I want to add 1 value between each pair of consecutive values, so the function should return an array like this:
array([1, 1.5, 2, 2.5, 3])
Another example, now with 2 values between each pair:
array([1, 1.33, 1.66, 2, 2.33, 2.66, 3])
I know the logic and I could write such a function myself, but I feel numpy has specific functions that would make my code so much cleaner!
If your array is not evenly spaced, you can use np.interp:
import numpy as np
n = 2
a = np.array([1,2,5])
new_size = a.size + (a.size - 1) * n
x = np.linspace(a.min(), a.max(), new_size)
xp = np.linspace(a.min(), a.max(), a.size)
fp = a
result = np.interp(x, xp, fp)
returns: array([1. , 1.33333333, 1.66666667, 2. , 3. , 4. , 5. ])
If your array is always evenly spaced, you can just use
new_size = a.size + (a.size - 1) * n
result = np.linspace(a.min(), a.max(), new_size)
Using linspace should do the trick:
a = np.array([1,2,3])
n = 1
temps = []
for i in range(1, len(a)):
    temps.append(np.linspace(a[i-1], a[i], num=n+1, endpoint=False))
# Add the final ending point
temps.append(np.array([a[-1]]))
new_a = np.concatenate(temps)
print(new_a)
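As an aside, on NumPy >= 1.16 the per-segment linspace can itself be vectorized by passing array start/stop values (see the first question on this page); a sketch of that variant, assuming the same a and n:

import numpy as np

a = np.array([1, 2, 3])
n = 1  # number of values to insert between each pair

# one row per segment, endpoint excluded, then flatten and re-append the last element
segments = np.linspace(a[:-1], a[1:], num=n+1, endpoint=False, axis=1)
new_a = np.append(segments.ravel(), a[-1])
print(new_a)  # [1.  1.5 2.  2.5 3. ]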
Try with np.arange:
a = np.array([1,2,3])
n = 2
print(np.arange(a.min(), a.max(), 1 / (n + 1)))
Output:
[1. 1.33333333 1.66666667 2. 2.33333333 2.66666667]
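Note that np.arange stops short of the upper bound, so the final 3 from the desired output is missing; if the closing endpoint is needed, appending it is enough (a small tweak, not part of the original answer):

print(np.append(np.arange(a.min(), a.max(), 1 / (n + 1)), a.max()))
# [1.         1.33333333 1.66666667 2.         2.33333333 2.66666667 3.        ]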
I'd like to compare each value x of an array with a rolling window of the n previous values. More precisely I'd like to see at which percentile this new value x would be, if we added it to the previous window:
import numpy as np
A = np.array([1, 4, 9, 28, 28.5, 2, 283, 3.2, 7, 15])
print(A)
n = 4  # window width
for i in range(len(A)-n):
    W = A[i:i+n]
    x = A[i+n]
    q = sum(W <= x) * 1.0 / n
    print('Value:', x, ' Window before this value:', W, ' Quantile:', q)
[ 1. 4. 9. 28. 28.5 2. 283. 3.2 7. 15. ]
Value: 28.5 Window before this value: [ 1. 4. 9. 28.] Quantile: 1.0
Value: 2.0 Window before this value: [ 4. 9. 28. 28.5] Quantile: 0.0
Value: 283.0 Window before this value: [ 9. 28. 28.5 2. ] Quantile: 1.0
Value: 3.2 Window before this value: [ 28. 28.5 2. 283. ] Quantile: 0.25
Value: 7.0 Window before this value: [ 28.5 2. 283. 3.2] Quantile: 0.5
Value: 15.0 Window before this value: [ 2. 283. 3.2 7. ] Quantile: 0.75
Question: What is the name of this computation? Is there a clever numpy way to compute this more efficiently on arrays of millions of items (with n that can be ~5000)?
Note: here is a simulation for 1M items and n=5000 but it would take ~ 2 hours:
import numpy as np
A = np.random.random(1000*1000)  # the following is not very interesting with a [0,1]
n = 5000                         # uniform random variable, but anyway...
Q = np.zeros(len(A)-n)
for i in range(len(Q)):
    Q[i] = sum(A[i:i+n] <= A[i+n]) * 1.0 / n
    if i % 100 == 0:
        print("%.2f %% already done. " % (i * 100.0 / len(A)))
print(Q)
Note: this is not similar to How to compute moving (or rolling, if you will) percentile/quantile for a 1d array in numpy?
Your code is so slow because you're using Python's own sum() instead of numpy.sum() or numpy.ndarray.sum(); Python's sum() has to convert all the raw values to Python objects before doing the calculation, which is really slow. Just by changing sum(...) to np.sum(...) or (...).sum(), the runtime drops to under 20 seconds.
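Concretely, that is just this one-line change inside the loop from the question (same variables as above):

Q[i] = np.sum(A[i:i+n] <= A[i+n]) * 1.0 / n  # was: sum(...); NumPy's sum avoids the per-element Python-object conversion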
You can use np.lib.stride_tricks.as_strided, as in the accepted answer of the question you linked. With the first example you give, it is pretty easy to understand:
A = np.array([1, 4, 9, 28, 28.5, 2, 283, 3.2, 7, 15])
n=4
print(np.lib.stride_tricks.as_strided(A, shape=(n, A.size-n),
                                      strides=(A.itemsize, A.itemsize)))
# you get A.size-n columns, each holding n rolling elements
array([[ 1. , 4. , 9. , 28. , 28.5, 2. ],
[ 4. , 9. , 28. , 28.5, 2. , 283. ],
[ 9. , 28. , 28.5, 2. , 283. , 3.2],
[ 28. , 28.5, 2. , 283. , 3.2, 7. ]])
Now to do the calculation, you can compare this array to A[n:], sum along the first axis (down each column) and divide by n:
print((np.lib.stride_tricks.as_strided(A, shape=(n, A.size-n),
                                        strides=(A.itemsize, A.itemsize))
       <= A[n:]).sum(0) / (1.*n))
[1.   0.   1.   0.25 0.5  0.75]  # same answer
Now the problem is the size of your data (several million elements, with n around 5000); I'm not sure you can use this method directly. One way could be to chunk the data. Let's define a function:
def compare_strides(arr, n):
    return (np.lib.stride_tricks.as_strided(arr, shape=(n, arr.size-n),
                                            strides=(arr.itemsize, arr.itemsize))
            <= arr[n:]).sum(0)
and do the chunking with np.concatenate, not forgetting to divide by n:
nb_chunk = 1000  # this number depends on the capacity of your computer,
                 # not sure how to optimize it
Q = np.concatenate([compare_strides(A[chunk*nb_chunk:(chunk+1)*nb_chunk+n], n)
                    for chunk in range(0, A[n:].size//nb_chunk + 1)]) / (1.*n)
I can't run the 1M/5000 test, but with a 5000-element array and n = 100, see the difference in timeit:
A = np.random.random(5000)
n = 100
%%timeit
Q = np.zeros(len(A)-n)
for i in range(len(Q)):
    Q[i] = sum(A[i:i+n] <= A[i+n]) * 1.0 / n
# 1 loop, best of 3: 6.75 s per loop
%%timeit
nb_chunk = 100
Q1 = np.concatenate([compare_strides(A[chunk*nb_chunk:(chunk+1)*nb_chunk+n], n)
                     for chunk in range(0, A[n:].size//nb_chunk + 1)]) / (1.*n)
# 100 loops, best of 3: 7.84 ms per loop
# check for equality
print((Q == Q1).all())
# True
See the difference in time: from 6750 ms down to 7.84 ms. Hopefully it works on bigger data.
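On NumPy >= 1.20 there is also np.lib.stride_tricks.sliding_window_view, which builds the same kind of windowed view without hand-computing strides. A sketch on the small example from above (the boolean comparison still materializes an (A.size-n, n) array, so the chunking idea remains useful for the 1M/5000 case):

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

A = np.array([1, 4, 9, 28, 28.5, 2, 283, 3.2, 7, 15])
n = 4
windows = sliding_window_view(A, n)[:-1]       # row i is A[i:i+n]
Q = (windows <= A[n:, None]).mean(axis=1)      # fraction of each window <= the next value
print(Q)  # [1.   0.   1.   0.25 0.5  0.75]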
Using np.sum instead of sum was already mentioned, so my only suggestion left is to additionally consider using pandas and its rolling window function, which you can apply any arbitrary function to:
import numpy as np
import pandas as pd
A = np.random.random(1000*1000)
df = pd.DataFrame(A)
n = 5000
def fct(x):
    return np.sum(x[:-1] <= x[-1]) * 1.0 / (len(x) - 1)
percentiles = df.rolling(n + 1).apply(fct, raw=True)  # raw=True passes plain NumPy arrays to fct
print(percentiles)
Additional benchmark: comparison between this solution and this solution:
import numpy as np, time
A = np.random.random(1000*1000)
n = 5000
def compare_strides(arr, n):
    return (np.lib.stride_tricks.as_strided(arr, shape=(n, arr.size-n), strides=(arr.itemsize, arr.itemsize)) <= arr[n:]).sum(0)
# Test #1: with strides ===> 11.0 seconds
t0 = time.time()
nb_chunk = 10*1000
Q = np.concatenate([compare_strides(A[chunk*nb_chunk:(chunk+1)*nb_chunk+n], n) for chunk in range(0, A[n:].size//nb_chunk + 1)]) / (1.*n)
print(time.time() - t0, Q)
# Test #2: with just np.sum ===> 18.0 seconds
t0 = time.time()
Q2 = np.zeros(len(A)-n)
for i in range(len(Q2)):
    Q2[i] = np.sum(A[i:i+n] <= A[i+n])
Q2 *= 1.0 / n  # the multiplication is vectorized here; moving it into the loop as np.sum(A[i:i+n] <= A[i+n]) * 1.0 / n is 6 seconds slower
print(time.time() - t0, Q2)
print(all(Q == Q2))
There's also another (better) way, with numba and the @jit decorator. Then it is much faster: only 5.4 seconds!
from numba import jit
import numpy as np
@jit  # if you remove this line, it is much slower (similar to Test #2 above)
def doit():
    A = np.random.random(1000*1000)
    n = 5000
    Q2 = np.zeros(len(A)-n)
    for i in range(len(Q2)):
        Q2[i] = np.sum(A[i:i+n] <= A[i+n])
    Q2 *= 1.0/n
    print(Q2)
doit()
When adding numba parallelization, it's even faster: 1.8 seconds!
import numpy as np
from numba import jit, prange
@jit(parallel=True)
def doit(A, Q, n):
    for i in prange(len(Q)):
        Q[i] = np.sum(A[i:i+n] <= A[i+n])
A = np.random.random(1000*1000)
n = 5000
Q = np.zeros(len(A)-n)
doit(A, Q, n)
You can use np.quantile instead of sum(A[i:i+n] <= A[i+n]) * 1.0 / n. That may be as good as it gets. Not sure if there really is a better approach for your question.
I have to increase the number of points inside a vector in order to enlarge the vector to a fixed size. For example, for this simple vector:
>>> a = np.array([0, 1, 2, 3, 4, 5])
>>> len(a)
# 6
Now, I want to get a vector of size 11, taking the vector a as the base; the result will be
# array([ 0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5, 5. ])
EDIT 1
What I need is a function that takes the base vector and the size that the resulting vector must have, and returns a new vector of that size. Something like:
def enlargeVector(vector, size):
    .....
    return newVector
to use like:
>>> a = np.array([0, 1, 2, 3, 4, 5])
>>> b = enlargeVector(a, 200)
>>> len(b)
# 200
and b contains the data produced by linear, cubic, or whatever interpolation method.
There are many methods to do this within scipy.interpolate. My favourite is UnivariateSpline, which produces an order k spline guaranteed to be differentiable k times.
To use it:
import numpy as np
from scipy.interpolate import UnivariateSpline

old_indices = np.arange(0, len(a))
new_length = 11
new_indices = np.linspace(0, len(a) - 1, new_length)
spl = UnivariateSpline(old_indices, a, k=3, s=0)
new_array = spl(new_indices)
The s is a smoothing factor that you should set to 0 in this case (since the data are exact).
Note that for the problem you have specified (since a just increases monotonically by 1), this is overkill, since the second np.linspace already gives the desired output.
EDIT: clarified that the length is arbitrary
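Wrapped as the helper the question asks for, here is a minimal sketch built on the spline approach above (the function name enlargeVector comes from the question):

import numpy as np
from scipy.interpolate import UnivariateSpline

def enlargeVector(vector, size, k=3):
    # Resample `vector` to `size` points with a degree-k spline; s=0 interpolates the data exactly.
    old_indices = np.arange(len(vector))
    new_indices = np.linspace(0, len(vector) - 1, size)
    spl = UnivariateSpline(old_indices, vector, k=k, s=0)
    return spl(new_indices)

a = np.array([0, 1, 2, 3, 4, 5])
print(enlargeVector(a, 11))         # [0.  0.5 1.  1.5 2.  2.5 3.  3.5 4.  4.5 5. ]
print(len(enlargeVector(a, 200)))   # 200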
As AGML pointed out there are tools to do this, but how about a pure numpy solution:
In [20]: a = np.arange(6)
In [21]: temp = np.dstack((a[:-1], a[:-1] + np.diff(a) / 2.0)).ravel()
In [22]: temp
Out[22]: array([ 0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5])
In [23]: np.hstack((temp, [a[-1]]))
Out[23]: array([ 0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5, 5. ])
I have an n x n numpy array that contains all pairwise distances and another 1 x n array that contains some scoring metric.
Example:
import numpy as np
import scipy.spatial.distance
dists = scipy.spatial.distance.squareform(np.array([3.2,4.1,8.8,.6,1.5,9.,5.0,9.9,10.,1.1]))
array([[ 0. , 3.2, 4.1, 8.8, 0.6],
[ 3.2, 0. , 1.5, 9. , 5. ],
[ 4.1, 1.5, 0. , 9.9, 10. ],
[ 8.8, 9. , 9.9, 0. , 1.1],
[ 0.6, 5. , 10. , 1.1, 0. ]])
score = np.array([19., 1.3, 4.8, 6.2, 5.7])
array([ 19. , 1.3, 4.8, 6.2, 5.7])
So, note that the ith element of the score array corresponds to the ith row of the distance array.
What I need to do is vectorize this process:
1. For the ith value in the score array, find all other values that are larger than the ith value and note their indices.
2. Then, in the ith row of the distance array, get all of the distances with the same indices as noted in step 1 above and return the smallest distance.
3. In the case where the ith value in the score array is the largest, the smallest distance is set as the largest distance found in the distance array.
Here is an un-vectorized version:
n = score.shape[0]
min_dist = np.full(n, np.max(dists))
for i in range(score.shape[0]):
    inx = np.where(score > score[i])
    if len(inx[0]) > 0:
        min_dist[i] = np.min(dists[i, inx])
min_dist
array([ 10. , 1.5, 4.1, 8.8, 0.6])
This works but is pretty inefficient in terms of speed and my arrays are expected to be much, much larger. I am hoping to improve the efficiency by using faster vectorized operations to achieve the same result.
Update: Based on Oliver W.'s answer, I came up with my own that doesn't require making a copy of the distance array
def new_method(dists, score):
    mask = score > score.reshape(-1, 1)
    return np.ma.masked_array(dists, mask=~mask).min(axis=1).filled(dists.max())
One could in theory make it a one-liner but it's already a bit challenging to read to the untrained eye.
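As a quick check against the example data above, the masked-array version reproduces the loop's result:

print(new_method(dists, score))                          # [10.   1.5  4.1  8.8  0.6]
print(np.allclose(new_method(dists, score), min_dist))   # True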
One possible vectorized solution is given below.
import numpy as np
import scipy.spatial.distance
dists = scipy.spatial.distance.squareform(np.array([3.2,4.1,8.8,.6,1.5,9.,5.0,9.9,10.,1.1]))
score = np.array([19., 1.3, 4.8, 6.2, 5.7])
def your_method(dists, score):
    dim = score.shape[0]
    min_dist = np.full(dim, np.max(dists))
    for i in range(dim):
        inx = np.where(score > score[i])
        if len(inx[0]) > 0:
            min_dist[i] = np.min(dists[i, inx])
    return min_dist
def vectorized_method_v1(dists, score):
    mask = score > score.reshape(-1, 1)
    dists2 = dists.copy()  # copy so the input dists array isn't modified; drop this if mutating dists is acceptable
    dists2[np.logical_not(mask)] = dists.max()
    return dists2.min(axis=1)
The speed gain is not so impressive for these small arrays (~factor of 3 on my machine), so I'll demonstrate with a larger set:
dists = scipy.spatial.distance.squareform(np.random.random(50*99))
score = np.random.random(dists.shape[0])
print(dists.shape)
%timeit your_method(dists, score)
%timeit vectorized_method_v1(dists, score)
(100, 100)
100 loops, best of 3: 2.98 ms per loop
10000 loops, best of 3: 125 µs per loop
Which is close to a factor of 24.