Python numpy split with indices - python

I'm looking for a numpy equivalent of my suboptimal Python code. The calculation I want to do can be summarized by:
The average of the peak of each section for each row.
Here is the code with a sample array and a list of indices. Sections can be of different sizes.
x = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8]])
indices = [2]
result = np.empty((1, x.shape[0]))
for i, row in enumerate(x):
    splited = np.array_split(row, indices)
    peak = [np.amax(a) for a in splited]
    result[0, i] = np.average(peak)
Which gives: result = array([[3., 7.]])
What is the optimized numpy way to eliminate both loops?

You could just take off the for loop and use axis instead:
result2 = np.mean([np.max(arr, 1) for arr in np.array_split(x_large, indices, 1)], axis=0)
Output:
array([3., 7.])
Benchmark:
x_large = np.array([[1, 2, 3, 4],
                    [5, 6, 7, 8]] * 1000)
%%timeit
result = []
for row in x_large:
    splited = np.array_split(row, indices)
    peak = [np.amax(a) for a in splited]
    result.append(np.average(peak))
# 29.9 ms ± 177 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit np.mean([np.max(arr, 1) for arr in np.array_split(x_large, indices, 1)], axis=0)
# 37.4 µs ± 499 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Validation:
np.array_equal(result, result2)
# True
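If you want to drop the remaining list comprehension as well, one possible sketch uses np.maximum.reduceat. This is not from the answer above; the helper name mean_of_section_peaks is made up for illustration, and it assumes the split indices are strictly increasing and smaller than the row length (otherwise reduceat behaves differently from np.array_split).

```python
import numpy as np

def mean_of_section_peaks(x, indices):
    # Hypothetical helper: section start offsets are 0 plus the split points.
    starts = np.r_[0, indices]
    # Maximum of every section in every row, shape (n_rows, n_sections).
    peaks = np.maximum.reduceat(x, starts, axis=1)
    # Average the section peaks for each row.
    return peaks.mean(axis=1)

x = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8]])
result = mean_of_section_peaks(x, [2])
print(result)  # [3. 7.]
```

For the sample array this matches the loop version, since the sections [1, 2] and [3, 4] have peaks 2 and 4, which average to 3.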

Related

Calculating Kernel matrix using numpy methods

I have a data of shape d X N (each column is a vector of features)
I have this code for calculating the kernel matrix:
def kernel(x1, x2):
    return x1.T @ x2
data = np.array([[1,2,3], [1,2,3], [1,2,3]])
result = []
for i in range(data.shape[1]):
    current_result = []
    for j in range(data.shape[1]):
        x1 = data[:, i]
        x2 = data[:, j]
        current_result.append(kernel(x1, x2))
    result.append(current_result)
np.array(result)
and I am getting this result:
array([[ 3, 6, 9],
[ 6, 12, 18],
[ 9, 18, 27]])
The problem is that this code is too slow, so I tried to use np.vectorize:
vec = np.vectorize(kernel, signature='(n),(n)->()')
vec(data, data)
But I am getting the wrong result:
array([14, 14, 14])
What am I doing wrong?
Testing on a bigger instance of your problem, with random numbers to check robustness (for instance with shape (100, 200)), there are several ways to do it:
import numpy as np
def kernel(x1, x2):
    return x1.T @ x2
def kernel_kenny(a):
    result = []
    for i in range(a.shape[1]):
        current_result = []
        for j in range(a.shape[1]):
            x1 = a[:, i]
            x2 = a[:, j]
            current_result.append(kernel(x1, x2))
        result.append(current_result)
    return np.array(result)
a = np.random.random((100,200))
res1 = kernel_kenny(a)
# the einsum signature might help you understand the calculation
res2 = np.einsum('ji,jk->ik', a, a, optimize=True)
# or the following if you want to explicitly specify the transpose
# res2 = np.einsum('ij,jk->ik', a.T, a, optimize=True)
# or simply ...
res3 = a.T @ a
Here are the sanity checks:
np.allclose(res1,res2)
>>> True
np.allclose(res1,res3)
>>> True
and timings:
%timeit kernel_kenny(a)
>>> 83.2 ms ± 425 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit np.einsum('ji,jk->ik', a, a, optimize=True)
>>> 325 µs ± 4.15 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit a.T @ a
>>> 82 µs ± 9.39 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
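As for why the np.vectorize attempt returned array([14, 14, 14]): with signature '(n),(n)->()' the last axis is treated as the vector and the remaining axes are looped over in lockstep, so vec(data, data) pairs row i with row i only. A minimal sketch of how broadcasting could be used to get all pairs instead is below; the names cols and gram are mine, and a.T @ a is still the faster option.

```python
import numpy as np

def kernel(x1, x2):
    return x1.T @ x2

data = np.array([[1, 2, 3], [1, 2, 3], [1, 2, 3]])
vec = np.vectorize(kernel, signature='(n),(n)->()')

# One feature vector per row, shape (N, d).
cols = data.T
# Broadcasting (N, 1, d) against (1, N, d) evaluates every pair (i, j).
gram = vec(cols[:, None, :], cols[None, :, :])
print(gram)
```

This reproduces the 3 x 3 matrix from the double loop in the question.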

Find largest row in a matrix with numpy (row with highest length)

I have a massive array with rows and columns. Some rows are longer than others. I need to get the maximum row length, that is, the length of the longest row. I wrote a simple function for this, but I want it to be as fast as possible, like numpy fast. Currently, it looks like this:
Example array:
values = [
[1,2,3],
[4,5,6,7,8,9],
[10,11,12,13]
]
def values_max_width(values):
    max_width = 1
    for row in values:
        if len(row) > max_width:
            max_width = len(row)
    return max_width
Is there any way to accomplish this with numpy?
In [261]: values = [
     ...:     [1,2,3],
     ...:     [4,5,6,7,8,9],
     ...:     [10,11,12,13]
     ...: ]
     ...:
In [262]:
In [262]: values
Out[262]: [[1, 2, 3], [4, 5, 6, 7, 8, 9], [10, 11, 12, 13]]
In [263]: def values_max_width(values):
     ...:     max_width = 1
     ...:     for row in values:
     ...:         if len(row) > max_width:
     ...:             max_width = len(row)
     ...:     return max_width
     ...:
In [264]: values_max_width(values)
Out[264]: 6
In [265]: [len(v) for v in values]
Out[265]: [3, 6, 4]
In [266]: max([len(v) for v in values])
Out[266]: 6
In [267]: np.max([len(v) for v in values])
Out[267]: 6
Your loop and the list comprehension are similar in speed, np.max is much slower - it has to first turn the list into an array.
In [268]: timeit max([len(v) for v in values])
656 ns ± 16.9 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [269]: timeit np.max([len(v) for v in values])
13.9 µs ± 181 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [271]: timeit values_max_width(values)
555 ns ± 13 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
If you are starting with a list, it's a good idea to thoroughly test the list implementation. numpy is fast when it is doing compiled array stuff, but creating an array from a list is time consuming.
Making an array directly from values isn't much help. The result is an object dtype array:
In [272]: arr = np.array(values)
In [273]: arr
Out[273]:
array([list([1, 2, 3]), list([4, 5, 6, 7, 8, 9]), list([10, 11, 12, 13])],
dtype=object)
Math on such an array is hit-or-miss, and always slower than math on pure numeric arrays. We can iterate on such an array, but that iteration is slower than on a list.
In [275]: values_max_width(arr)
Out[275]: 6
In [276]: timeit values_max_width(arr)
1.3 µs ± 8.27 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
Not sure how you can make it faster. I've tried using np.max over the length of each item, but that will take even longer:
import numpy as np
import time
values = []
for k in range(100000):
    values.append(list(np.random.randint(100, size=np.random.randint(1000))))
def timeit(func):
    # Simple timing decorator: prints how long each call took.
    def wrapper(*args, **kwargs):
        now = time.time()
        retval = func(*args, **kwargs)
        print('{} took {:.5f}s'.format(func.__name__, time.time() - now))
        return retval
    return wrapper
@timeit
def values_max_width(values):
    max_width = 1
    for row in values:
        if len(row) > max_width:
            max_width = len(row)
    return max_width
@timeit
def value_max_width_len(values):
    return np.max([len(l) for l in values])
values_max_width(values)
value_max_width_len(values)
values_max_width took 0.00598s
value_max_width_len took 0.00994s
* Edit *
As @Mstaino suggested, using map does make this code faster:
@timeit
def value_max_width_len(values):
    return max(map(len, values))
values_max_width took 0.00598s
value_max_width_len took 0.00499s
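One small detail if you go with the map version: max raises a ValueError on an empty input. A minimal sketch that guards against that with the default argument is below. The function name max_row_length is just for illustration, and note that it returns 0 for an empty list, whereas the loop in the question returns 1.

```python
values = [
    [1, 2, 3],
    [4, 5, 6, 7, 8, 9],
    [10, 11, 12, 13],
]

def max_row_length(rows):
    # default=0 avoids a ValueError when rows is empty.
    return max(map(len, rows), default=0)

widest = max_row_length(values)
empty = max_row_length([])
print(widest, empty)  # 6 0
```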

How to add multiple extra columns to a NumPy array

Let’s say I have two NumPy arrays, a and b:
a = np.array([
[1, 2, 3],
[2, 3, 4]
])
b = np.array([8,9])
And I would like to append the same array b to every row (i.e. adding multiple columns) to get an array, c:
c = np.array([
[1, 2, 3, 8, 9],
[2, 3, 4, 8, 9]
])
How can I do this easily and efficiently in NumPy?
I am especially concerned about its behaviour with big datasets (where a is much bigger than b), is there any way around creating many copies (ie. a.shape[0]) of b?
Related to this question, but with multiple values.
Here's one way. I assume it's efficient because it's vectorised. It relies on the fact that in matrix multiplication, pre-multiplying a row by the column (1, 1) will produce two stacked copies of the row.
import numpy as np
a = np.array([
[1, 2, 3],
[2, 3, 4]
])
b = np.array([[8,9]])
np.concatenate([a, np.array([[1],[1]]).dot(b)], axis=1)
Out: array([[1, 2, 3, 8, 9],
[2, 3, 4, 8, 9]])
Note that b is specified slightly differently (as a two-dimensional array).
Is there any way around creating many copies of b?
The final result contains those copies (and numpy arrays are literally arrays of values in memory), so I don't see how.
An alternative to the concatenate approach is to allocate the result array first and copy the values into it:
In [483]: a = np.arange(300).reshape(100,3)
In [484]: b=np.array([8,9])
In [485]: res = np.zeros((100,5),int)
In [486]: res[:,:3]=a
In [487]: res[:,3:]=b
sample timings
In [488]: %%timeit
...: res = np.zeros((100,5),int)
...: res[:,:3]=a
...: res[:,3:]=b
...:
...:
6.11 µs ± 20.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [491]: timeit np.concatenate((a, b.repeat(100).reshape(2,-1).T),1)
7.74 µs ± 15.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [164]: timeit np.concatenate([a, np.ones([a.shape[0],1], dtype=int).dot(np.array([b]))], axis=1)
8.58 µs ± 160 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
The way I solved this initially was :
c = np.concatenate([a, np.tile(b, (a.shape[0],1))], axis = 1)
But this feels very inefficient...
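If the worry is the temporary tiled copy of b, one option that might help is np.broadcast_to, which returns a read-only view of b repeated along the rows without allocating it. This is only a sketch: the final array c still has to hold the repeated values, so only the intermediate copy is avoided.

```python
import numpy as np

a = np.array([
    [1, 2, 3],
    [2, 3, 4],
])
b = np.array([8, 9])

# View of b with shape (n_rows, len(b)); no data is copied at this point.
b_rows = np.broadcast_to(b, (a.shape[0], b.shape[0]))
# concatenate writes both blocks into the freshly allocated result.
c = np.concatenate([a, b_rows], axis=1)
print(c)
```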

Search indexes where values in my array match a value in a different array (python) [duplicate]

I have two numpy arrays, A and B. A contains unique values and B is a sub-array of A.
Now I am looking for a way to get the index of B's values within A.
For example:
A = np.array([1,2,3,4,5,6,7,8,9,10])
B = np.array([1,7,10])
# I need a function fun() that:
fun(A,B)
>> 0,6,9
You can use np.in1d with np.nonzero -
np.nonzero(np.in1d(A,B))[0]
You can also use np.searchsorted, if you care about maintaining the order -
np.searchsorted(A,B)
For a generic case, when A & B are unsorted arrays, you can bring in the sorter option in np.searchsorted, like so -
sort_idx = A.argsort()
out = sort_idx[np.searchsorted(A,B,sorter = sort_idx)]
I would add in my favorite broadcasting too in the mix to solve a generic case -
np.nonzero(B[:,None] == A)[1]
Sample run -
In [125]: A
Out[125]: array([ 7, 5, 1, 6, 10, 9, 8])
In [126]: B
Out[126]: array([ 1, 10, 7])
In [127]: sort_idx = A.argsort()
In [128]: sort_idx[np.searchsorted(A,B,sorter = sort_idx)]
Out[128]: array([2, 4, 0])
In [129]: np.nonzero(B[:,None] == A)[1]
Out[129]: array([2, 4, 0])
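One thing worth checking when picking between these: the membership-mask approach returns the indices in the order they occur in A, while the searchsorted and broadcasting versions follow the order of B. A small sketch on the same sample arrays is below. It uses np.isin, the newer spelling of np.in1d, and the variable names are mine.

```python
import numpy as np

A = np.array([7, 5, 1, 6, 10, 9, 8])
B = np.array([1, 10, 7])

# Mask-based lookup: positions in A whose value appears in B (A's order).
in_a_order = np.nonzero(np.isin(A, B))[0]

# searchsorted with a sorter: one index per element of B (B's order).
sort_idx = A.argsort()
in_b_order = sort_idx[np.searchsorted(A, B, sorter=sort_idx)]

print(in_a_order)  # [0 2 4]
print(in_b_order)  # [2 4 0]
```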
Have you tried searchsorted?
A = np.array([1,2,3,4,5,6,7,8,9,10])
B = np.array([1,7,10])
A.searchsorted(B)
# array([0, 6, 9])
Just for completeness: If the values in A are non negative and reasonably small:
lookup = np.empty((np.max(A) + 1), dtype=int)
lookup[A] = np.arange(len(A))
indices = lookup[B]
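A runnable sketch of that lookup-table idea on the unsorted sample from the other answer is below. As a small variation (my assumption, not part of the snippet above) the table is filled with -1 instead of being left uninitialized, so a value that is missing from A shows up as -1 rather than as garbage.

```python
import numpy as np

A = np.array([7, 5, 1, 6, 10, 9, 8])
B = np.array([1, 10, 7])

# One slot per possible value 0..max(A); -1 marks "not present in A".
lookup = np.full(A.max() + 1, -1, dtype=int)
# Store the position of each value of A at the slot named by that value.
lookup[A] = np.arange(len(A))
indices = lookup[B]
print(indices)  # [2 4 0]
```

The memory cost grows with max(A), which is why this only makes sense for small non-negative integers.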
I had the same question these days. However, the timing performance is very critical for me. Therefore, I guess the timing comparison of different solutions may be useful for others.
As Divakar mentioned, you can use np.in1d(A, B) with np.where or np.nonzero. You can also combine np.searchsorted with np.intersect1d (based on this page), or use np.searchsorted on its own for sorted arrays.
I want to add another simple solution: a list comprehension. On its own it may take longer than the previous ones. However, if you take advantage of the Numba package, it takes much less time.
In [1]: import numpy as np
In [2]: from numba import njit
In [3]: a = np.array([1,2,3,4,5,6,7,8,9,10])
In [4]: b = np.array([1,7,10])
In [5]: np.where(np.in1d(a, b))[0]
...: array([0, 6, 9])
In [6]: np.nonzero(np.in1d(a, b))[0]
...: array([0, 6, 9])
In [7]: np.searchsorted(a, b)
...: array([0, 6, 9])
In [8]: np.searchsorted(a, np.intersect1d(a, b))
...: array([0, 6, 9])
In [9]: [i for i, x in enumerate(a) if x in b]
...: [0, 6, 9]
In [10]: @njit
    ...: def func(a, b):
    ...:     return [i for i, x in enumerate(a) if x in b]
In [11]: func(a, b)
...: [0, 6, 9]
Now, let's compare the timing performance of these solutions.
In [12]: %timeit np.where(np.in1d(a, b))[0]
4.26 µs ± 6.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [13]: %timeit np.nonzero(np.in1d(a, b))[0]
4.39 µs ± 14.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [14]: %timeit np.searchsorted(a, b)
800 ns ± 6.04 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [15]: %timeit np.searchsorted(a, np.intersect1d(a, b))
8.8 µs ± 73.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [16]: %timeit [i for i, x in enumerate(a) if x in b]
15.4 µs ± 18.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [17]: %timeit func(a, b)
336 ns ± 0.579 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Python Numpy - numpy axis performance

By default, numpy arrays are row-major.
So the following results seem natural to me.
a = np.random.rand(5000, 5000)
%timeit a[0,:].sum()
3.57 µs ± 13.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit a[:,0].sum()
38.8 µs ± 8.19 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Because the array is in row-major order, it is natural that a[0,:] is faster.
However, if I use the numpy sum function with an axis argument, the result is different.
%timeit a.sum(axis=0)
16.9 ms ± 13.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit a.sum(axis=1)
29.5 ms ± 90.3 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
With the numpy sum function, it is faster to sum along axis=0.
So my question is: why is summing along axis=0 (down the columns) faster than summing along axis=1 (across the rows)?
For example
a = np.array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]], order='C')
In row-major order, [0,1,2], [3,4,5], and [6,7,8] are each stored in adjacent memory.
Therefore, summing along axis=1 should be faster than along axis=0.
However, when using the numpy sum function, it is faster to sum along axis=0.
How can you explain this?
Thanks
You don't compute the same thing.
The first two commands only compute one row/column out of the entire array.
a[0, :].sum().shape # sums only the first row
()
The second two commands sum the entire contents of the 2D array, but along a given axis. So you don't get a single result (as with the first two commands) but a 1D array of sums.
a.sum(axis=0).shape # sums over the rows, giving one total per column
(5000,)
In summary, the two sets of commands do different things.
a
array([[1, 6, 9, 1, 6],
[5, 6, 9, 1, 3],
[5, 0, 3, 5, 7],
[2, 8, 3, 8, 6],
[3, 4, 8, 5, 0]])
a[0, :]
array([1, 6, 9, 1, 6])
a[0, :].sum()
23
a.sum(axis=0)
array([16, 24, 32, 20, 22])
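To make the difference concrete, here is a small sketch on the 3 x 3 array from the question. It prints the strides of a single row and a single column (the row-major layout the question refers to) and shows that the axis sums return one value per column or per row. The dtype is fixed to int64 only so that the stride values are predictable, which is my assumption for the example.

```python
import numpy as np

a = np.arange(9, dtype=np.int64).reshape(3, 3)  # C (row-major) order

row = a[0, :]   # contiguous in memory: neighbours are 8 bytes apart
col = a[:, 0]   # strided: neighbours are one full row (24 bytes) apart
print(row.strides, col.strides)  # (8,) (24,)

# Slicing first and then summing gives a single scalar.
print(row.sum(), col.sum())      # 3 9

# Summing the whole array along an axis gives a 1D array.
col_totals = a.sum(axis=0)       # one total per column
row_totals = a.sum(axis=1)       # one total per row
print(col_totals)  # [ 9 12 15]
print(row_totals)  # [ 3 12 21]
```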
