Let’s say I have two NumPy arrays, a and b:
a = np.array([
[1, 2, 3],
[2, 3, 4]
])
b = np.array([8,9])
And I would like to append the same array b to every row (i.e. adding multiple columns) to get an array, c:
c = np.array([
[1, 2, 3, 8, 9],
[2, 3, 4, 8, 9]
])
How can I do this easily and efficiently in NumPy?
I am especially concerned about its behaviour with big datasets (where a is much bigger than b). Is there any way around creating many copies (i.e. a.shape[0]) of b?
Related to this question, but with multiple values.
Here's one way. I assume it's efficient because it's vectorised. It relies on the fact that in matrix multiplication, pre-multiplying a row by the column (1, 1) will produce two stacked copies of the row.
import numpy as np
a = np.array([
[1, 2, 3],
[2, 3, 4]
])
b = np.array([[8,9]])
np.concatenate([a, np.array([[1],[1]]).dot(b)], axis=1)
Out: array([[1, 2, 3, 8, 9],
[2, 3, 4, 8, 9]])
Note that b is specified slightly differently (as a two-dimensional array).
Is there any way around creating many copies of b?
The final result contains those copies (and numpy arrays are literally arrays of values in memory), so I don't see how.
An alternative to the concatenate approach is to allocate the result array first and copy the values into it:
In [483]: a = np.arange(300).reshape(100,3)
In [484]: b=np.array([8,9])
In [485]: res = np.zeros((100,5),int)
In [486]: res[:,:3]=a
In [487]: res[:,3:]=b
Sample timings:
In [488]: %%timeit
...: res = np.zeros((100,5),int)
...: res[:,:3]=a
...: res[:,3:]=b
...:
...:
6.11 µs ± 20.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [491]: timeit np.concatenate((a, b.repeat(100).reshape(2,-1).T),1)
7.74 µs ± 15.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [164]: timeit np.concatenate([a, np.ones([a.shape[0],1], dtype=int).dot(np.array([b]))], axis=1)
8.58 µs ± 160 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
The way I solved this initially was :
c = np.concatenate([a, np.tile(b, (a.shape[0],1))], axis = 1)
But this feels very inefficient...
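One possible alternative to np.tile, sketched here with the a and b from the question: np.broadcast_to returns a read-only view of b with a zero stride along the new axis, so no intermediate array of copies is built before concatenation. The result c still has to store b's values in every row. The name b_view is only illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [2, 3, 4]])
b = np.array([8, 9])

# Read-only view of b repeated along axis 0; stride 0 means no data is copied.
b_view = np.broadcast_to(b, (a.shape[0], b.size))

# concatenate writes both inputs into one freshly allocated output array.
c = np.concatenate([a, b_view], axis=1)
print(c)
```

Whether this is measurably faster than np.tile depends on the array sizes, so it is worth timing on real data.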
I'm looking for a numpy equivalent of my suboptimal Python code. The calculation I want to do can be summarized by:
The average of the peak of each section for each row.
Here is the code with a sample array and a list of indices. Sections can be of different sizes.
x = np.array([[1, 2, 3, 4],
[5, 6, 7, 8]])
indices = [2]
result = np.empty((1, x.shape[0]))
for i, row in enumerate(x):
    splited = np.array_split(row, indices)
    peak = [np.amax(a) for a in splited]
    result[0, i] = np.average(peak)
Which gives: result = array([[3., 7.]])
What is the optimized numpy way to eliminate both loops?
You could just take off the for loop and use axis instead:
result2 = np.mean([np.max(arr, 1) for arr in np.array_split(x_large, indices, 1)], axis=0)
Output:
array([3., 7.])
Benchmark:
x_large = np.array([[1, 2, 3, 4],
[5, 6, 7, 8]] * 1000)
%%timeit
result = []
for row in x_large:
    splited = np.array_split(row, indices)
    peak = [np.amax(a) for a in splited]
    result.append(np.average(peak))
# 29.9 ms ± 177 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit np.mean([np.max(arr, 1) for arr in np.array_split(x_large, indices, 1)], axis=0)
# 37.4 µs ± 499 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Validation:
np.array_equal(result, result2)
# True
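If the remaining list comprehension over sections should go as well, np.maximum.reduceat can take the per-section maxima in a single call. This is a sketch under the assumption that indices is strictly increasing, starts after 0 and stays below x.shape[1], so that every section is non-empty; outside that case reduceat does not behave like array_split. The names starts, peaks and result3 are illustrative.

```python
import numpy as np

x = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8]])
indices = [2]

# Section start offsets: 0 plus the split points used by array_split.
starts = np.r_[0, indices]

# Maximum of each section for every row, shape (n_rows, n_sections).
peaks = np.maximum.reduceat(x, starts, axis=1)

# Average of the section peaks per row.
result3 = peaks.mean(axis=1)
print(result3)  # [3. 7.]
```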
How do you apply vectorized functions on sub-arrays? Suppose I have the following:
array = np.array([
[0, 1, 2],
[2],
[],
])
And I wanted to obtain the first element in each subarray, else None.
[0, 2, None]
While simple, is there a way to do this leveraging NumPy's pure vectorization? There don't seem to be native operations for it, and np.vectorize() is described in the documentation as not being true vectorization, as has been stated in various other threads.
Is my only option to use np.apply_along_axis()?
How do I know when I cannot solve my problem with NumPy's pure vectorization?
You've created an object dtype array - containing lists (not subarrays):
In [2]: array = np.array([
...: [0, 1, 2],
...: [2],
...: [],
...: ])
/usr/local/bin/ipython3:4: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (1.19dev gives warning)
In [3]: array
Out[3]: array([list([0, 1, 2]), list([2]), list([])], dtype=object)
We could use a list comprehension:
In [4]: [a[0] for a in array]
....
IndexError: list index out of range
and correcting for the empty list:
In [5]: [a[0] if a else None for a in array]
Out[5]: [0, 2, None]
Most of the fast compiled code for numpy - the "vectorized" stuff - only works with numeric dtype arrays. For object dtype it has to do something akin to a list comprehension. Even when math works, it's because it was able to delegate the action to the elements.
For example applying list replication to all elements of your array:
In [7]: array*3
Out[7]:
array([list([0, 1, 2, 0, 1, 2, 0, 1, 2]), list([2, 2, 2]), list([])],
dtype=object)
and sum is just list join:
In [8]: array.sum()
Out[8]: [0, 1, 2, 2]
apply_along_axis isn't any faster than np.vectorize, and I can't imagine how it would be used in a case like this, since array is 1d.
Sometimes frompyfunc is handy when working with object dtype arrays (but it's not a speed solution):
In [11]: timeit np.frompyfunc(lambda a: a[0] if a else None, 1,1)(array)
3.8 µs ± 9.85 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [12]: timeit [a[0] if a else None for a in array]
1.02 µs ± 5.6 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [14]: timeit np.vectorize(lambda a: a[0] if a else None, otypes=['O'])(array)
18 µs ± 46.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
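When the ragged data can be stored as one flat numeric array plus per-row lengths instead of an object array, the first elements can be picked with ordinary indexing. The layout below (flat, lengths, starts) is an assumption made for illustration and is not part of the question. NaN stands in for None because a numeric array cannot hold None, and at least one row is assumed to be non-empty.

```python
import numpy as np

rows = [[0, 1, 2], [2], []]

# Flat storage of all values plus the length of each row.
lengths = np.array([len(r) for r in rows])
flat = np.concatenate([np.asarray(r, dtype=float) for r in rows])

# Offset of each row's first element in flat.
starts = np.cumsum(lengths) - lengths

# Empty rows may point past the end of flat, so clamp before indexing.
safe = np.minimum(starts, flat.size - 1)

# Take the first element where the row is non-empty, NaN otherwise.
first = np.where(lengths > 0, flat[safe], np.nan)
print(first)
```

Building flat and lengths from Python lists is itself a Python-level loop, so this only pays off if the data is kept in that layout from the start.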
I have two numpy arrays, A and B. A contains unique values and B is a sub-array of A.
Now I am looking for a way to get the index of B's values within A.
For example:
A = np.array([1,2,3,4,5,6,7,8,9,10])
B = np.array([1,7,10])
# I need a function fun() that:
fun(A,B)
>> 0,6,9
You can use np.in1d with np.nonzero -
np.nonzero(np.in1d(A,B))[0]
You can also use np.searchsorted, if you care about maintaining the order -
np.searchsorted(A,B)
For a generic case, when A & B are unsorted arrays, you can bring in the sorter option in np.searchsorted, like so -
sort_idx = A.argsort()
out = sort_idx[np.searchsorted(A,B,sorter = sort_idx)]
I would add in my favorite broadcasting too in the mix to solve a generic case -
np.nonzero(B[:,None] == A)[1]
Sample run -
In [125]: A
Out[125]: array([ 7, 5, 1, 6, 10, 9, 8])
In [126]: B
Out[126]: array([ 1, 10, 7])
In [127]: sort_idx = A.argsort()
In [128]: sort_idx[np.searchsorted(A,B,sorter = sort_idx)]
Out[128]: array([2, 4, 0])
In [129]: np.nonzero(B[:,None] == A)[1]
Out[129]: array([2, 4, 0])
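The sorter-based lookup can be wrapped in a small helper that also checks that every value of B really occurs in A, since np.searchsorted returns an insertion point rather than failing for missing values. This is a minimal sketch; the name index_of and the ValueError are hypothetical choices, and A is assumed to be non-empty with unique values.

```python
import numpy as np

def index_of(A, B):
    """Return, for each value in B, its position in the unsorted, unique array A."""
    A = np.asarray(A)
    B = np.asarray(B)
    sort_idx = A.argsort()
    pos = np.searchsorted(A, B, sorter=sort_idx)
    # searchsorted returns len(A) for values above the maximum; clamp before indexing.
    pos = np.clip(pos, 0, len(A) - 1)
    out = sort_idx[pos]
    # A missing value yields an index whose element differs from the query.
    if not np.array_equal(A[out], B):
        raise ValueError("some values of B are not present in A")
    return out

A = np.array([7, 5, 1, 6, 10, 9, 8])
B = np.array([1, 10, 7])
print(index_of(A, B))  # [2 4 0]
```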
Have you tried searchsorted?
A = np.array([1,2,3,4,5,6,7,8,9,10])
B = np.array([1,7,10])
A.searchsorted(B)
# array([0, 6, 9])
Just for completeness: If the values in A are non negative and reasonably small:
lookup = np.empty((np.max(A) + 1), dtype=int)
lookup[A] = np.arange(len(A))
indices = lookup[B]
I had the same question these days. However, the timing performance is very critical for me. Therefore, I guess the timing comparison of different solutions may be useful for others.
As Divakar mentioned, you can use np.in1d(A, B) with np.where or np.nonzero. Moreover, you can use np.searchsorted with np.intersect1d (based on this page). Also, np.searchsorted on its own is another useful approach for sorted arrays.
I want to add another simple solution. You can use a list comprehension. It may take longer than the previous ones. However, if you take advantage of the Numba Python package, it is much less time-consuming.
In [1]: import numpy as np
In [2]: from numba import njit
In [3]: a = np.array([1,2,3,4,5,6,7,8,9,10])
In [4]: b = np.array([1,7,10])
In [5]: np.where(np.in1d(a, b))[0]
...: array([0, 6, 9])
In [6]: np.nonzero(np.in1d(a, b))[0]
...: array([0, 6, 9])
In [7]: np.searchsorted(a, b)
...: array([0, 6, 9])
In [8]: np.searchsorted(a, np.intersect1d(a, b))
...: array([0, 6, 9])
In [9]: [i for i, x in enumerate(a) if x in b]
...: [0, 6, 9]
In [10]: @njit
    ...: def func(a, b):
    ...:     return [i for i, x in enumerate(a) if x in b]
In [11]: func(a, b)
...: [0, 6, 9]
Now, let's compare the timing performance of these solutions.
In [12]: %timeit np.where(np.in1d(a, b))[0]
4.26 µs ± 6.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [13]: %timeit np.nonzero(np.in1d(a, b))[0]
4.39 µs ± 14.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [14]: %timeit np.searchsorted(a, b)
800 ns ± 6.04 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [15]: %timeit np.searchsorted(a, np.intersect1d(a, b))
8.8 µs ± 73.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [16]: %timeit [i for i, x in enumerate(a) if x in b]
15.4 µs ± 18.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [17]: %timeit func(a, b)
336 ns ± 0.579 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
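One caveat with the timings above is that the approaches do not all return the same thing once A is unsorted. The mask-based variants report positions in A's order, while the searchsorted variant with a sorter reports them in B's order. The sketch below uses np.isin, which the NumPy documentation recommends over np.in1d for new code; the variable names are illustrative.

```python
import numpy as np

A = np.array([7, 5, 1, 6, 10, 9, 8])
B = np.array([1, 10, 7])

# Membership mask over A, then the positions where it is True (ascending order).
in_a_order = np.flatnonzero(np.isin(A, B))

# Position of each B value in A, in B's order.
sort_idx = A.argsort()
in_b_order = sort_idx[np.searchsorted(A, B, sorter=sort_idx)]

print(in_a_order)  # [0 2 4]
print(in_b_order)  # [2 4 0]
```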
By default, numpy arrays are row major.
Therefore, the following results seem natural to me.
a = np.random.rand(5000, 5000)
%timeit a[0,:].sum()
3.57 µs ± 13.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit a[:,0].sum()
38.8 µs ± 8.19 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Because the array is in row-major order, it is natural that a[0, :] is faster to sum.
However, if I use the numpy sum function with an axis argument, the result is different.
%timeit a.sum(axis=0)
16.9 ms ± 13.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit a.sum(axis=1)
29.5 ms ± 90.3 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
With the numpy sum function, it is faster to compute along the columns.
So my question is why summing along axis = 0 (along the columns) is faster than summing along axis = 1 (along the rows).
For example
a = np.array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]], order='C')
In row-major order, the rows [0, 1, 2], [3, 4, 5] and [6, 7, 8] are each stored in adjacent memory.
Therefore, summing along axis = 1 should be faster than along axis = 0.
However, when using the numpy sum function, it is faster to sum along the columns (axis = 0).
How can you explain this?
Thanks
You don't compute the same thing.
The first two commands only compute one row/column out of the entire array.
a[0, :].sum().shape # sums just the first row
()
The second two commands sum the entire contents of the 2D array, but along a certain axis. That way, you don't get a single result (as in the first two commands), but a 1D array of sums.
a.sum(axis=0).shape # sums over the rows, giving one total per column
(5000,)
In summary, the two sets of commands do different things.
a
array([[1, 6, 9, 1, 6],
[5, 6, 9, 1, 3],
[5, 0, 3, 5, 7],
[2, 8, 3, 8, 6],
[3, 4, 8, 5, 0]])
a[0, :]
array([1, 6, 9, 1, 6])
a[0, :].sum()
23
a.sum(axis=0)
array([16, 24, 32, 20, 22])
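The memory-layout part of the question can be checked directly through the strides of the two slices. The sketch below assumes a float64 array in the default C order and uses a smaller shape than the question to keep it cheap; it makes no claim about timings, only about what each expression touches and returns.

```python
import numpy as np

a = np.zeros((1000, 1000), dtype=np.float64)  # C (row-major) order by default

row = a[0, :]  # 1000 adjacent elements
col = a[:, 0]  # 1000 elements, one taken from each row

print(row.strides)  # (8,)
print(col.strides)  # (8000,)

# Both axis sums read the whole array and return one value per column or row.
print(a.sum(axis=0).shape, a.sum(axis=1).shape)  # (1000,) (1000,)
```

A stride of 8 bytes means consecutive elements are neighbours in memory, while the column slice jumps a full row (8000 bytes) between elements.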