I have two numpy arrays a and b. I want to subtract each row of b from a. I tried to use:
a1 - b1[:, None]
This works for small arrays, but takes too long when it comes to real world data sizes.
a = np.arange(16).reshape(8,2)
a
Out[35]:
array([[ 0, 1],
[ 2, 3],
[ 4, 5],
[ 6, 7],
[ 8, 9],
[10, 11],
[12, 13],
[14, 15]])
b = np.arange(6).reshape(3,2)
b
Out[37]:
array([[0, 1],
[2, 3],
[4, 5]])
a - b[:, None]
Out[38]:
array([[[ 0, 0],
[ 2, 2],
[ 4, 4],
[ 6, 6],
[ 8, 8],
[10, 10],
[12, 12],
[14, 14]],
[[-2, -2],
[ 0, 0],
[ 2, 2],
[ 4, 4],
[ 6, 6],
[ 8, 8],
[10, 10],
[12, 12]],
[[-4, -4],
[-2, -2],
[ 0, 0],
[ 2, 2],
[ 4, 4],
[ 6, 6],
[ 8, 8],
[10, 10]]])
%%timeit
a - b[:, None]
The slowest run took 10.36 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 3.18 µs per loop
This approach is too slow / inefficient for larger arrays.
a1 = np.arange(18900 * 41).reshape(18900, 41)
b1 = np.arange(2674 * 41).reshape(2674, 41)
%%timeit
a1 - b1[:, None]
1 loop, best of 3: 12.1 s per loop
%%timeit
for index in range(len(b1)):
    a1 - b1[index]
1 loop, best of 3: 2.35 s per loop
Is there any numpy trick I can use to speed this up?
You are running into memory limits: a1 - b1[:, None] allocates a result of shape (2674, 18900, 41).
If, as in your examples, 8 bits are enough to store the data, use uint8:
import numpy as np
a1 = np.arange(18900 * 41,dtype=np.uint8).reshape(18900, 41)
b1 = np.arange(2674 * 41,dtype=np.uint8).reshape(2674, 41)
%time c1=(a1-b1[:,None])
#1.02 s
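To see why this hits memory limits, it helps to estimate the size of the broadcast result before computing it. The sketch below is not from the original answer: `broadcast_nbytes` and `subtract_in_chunks` are hypothetical helper names, and the chunk size is an arbitrary assumption. It also shows that uint8 subtraction wraps around instead of going negative, so the uint8 trick is only safe when that is acceptable.

```python
import numpy as np

def broadcast_nbytes(a_shape, b_shape, dtype):
    # Size in bytes of a - b[:, None] for 2D a and b with the same number of columns.
    n_elements = b_shape[0] * a_shape[0] * a_shape[1]
    return n_elements * np.dtype(dtype).itemsize

size_int64 = broadcast_nbytes((18900, 41), (2674, 41), np.int64)
size_uint8 = broadcast_nbytes((18900, 41), (2674, 41), np.uint8)
print(size_int64 / 1e9, size_uint8 / 1e9)  # about 16.6 and 2.1 (GB)

# uint8 arithmetic wraps modulo 256 rather than producing negative values.
wrapped = np.array([1], dtype=np.uint8) - np.array([2], dtype=np.uint8)
print(wrapped)  # [255]

def subtract_in_chunks(a, b, chunk=64):
    # Yield (start, block) pairs where block == a - b[start:start + chunk, None],
    # so only one block of the full result is held in memory at a time.
    for start in range(0, len(b), chunk):
        yield start, a - b[start:start + chunk, None]
```

Each yielded block can be reduced or written to disk before the next one is computed, which keeps peak memory proportional to the chunk size rather than to `len(b)`.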
Say I have a 1366x768 numpy array in Python. I want to delete every second row (row 0 stays, row 1 is removed, row 2 stays, row 3 is removed, and so on) and fill each gap with a copy of the row just before it (the row that was kept).
Is it possible in numpy?
One approach -
a[::2].repeat(2,axis=0)
This returns a new array; to change a itself, assign the result back.
Sample run -
In [105]: a
Out[105]:
array([[2, 5, 1, 1],
[2, 0, 2, 5],
[1, 1, 5, 7],
[0, 7, 1, 8],
[8, 5, 2, 3],
[2, 1, 0, 6],
[5, 6, 1, 6],
[7, 1, 4, 7],
[3, 8, 1, 4],
[5, 8, 8, 8]])
In [106]: a[::2].repeat(2,axis=0)
Out[106]:
array([[2, 5, 1, 1],
[2, 5, 1, 1],
[1, 1, 5, 7],
[1, 1, 5, 7],
[8, 5, 2, 3],
[8, 5, 2, 3],
[5, 6, 1, 6],
[5, 6, 1, 6],
[3, 8, 1, 4],
[3, 8, 1, 4]])
If we care about performance, here's another approach using NumPy strides -
def strided_app(a):
    m0,n0 = a.strides
    m,n = a.shape
    strided = np.lib.stride_tricks.as_strided
    return strided(a,shape=(m//2,2,n),strides=(2*m0,0,n0)).reshape(-1,n)
Sample run -
In [154]: a
Out[154]:
array([[4, 8, 7, 7],
[5, 5, 1, 7],
[1, 8, 1, 3],
[6, 6, 5, 6],
[0, 2, 6, 3],
[6, 6, 8, 7],
[7, 6, 8, 1],
[7, 8, 8, 2],
[4, 0, 2, 8],
[5, 8, 1, 4]])
In [155]: strided_app(a)
Out[155]:
array([[4, 8, 7, 7],
[4, 8, 7, 7],
[1, 8, 1, 3],
[1, 8, 1, 3],
[0, 2, 6, 3],
[0, 2, 6, 3],
[7, 6, 8, 1],
[7, 6, 8, 1],
[4, 0, 2, 8],
[4, 0, 2, 8]])
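To make the stride trick above less opaque, the intermediate view can be inspected directly. This is only an illustrative sketch (the array `a` here is made up): a stride of 0 on the middle axis makes NumPy read the same row twice, and the final reshape is what actually copies the data. Note that `m // 2` silently drops the last row when the row count is odd.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.arange(12).reshape(4, 3)   # small made-up example with an even row count
m0, n0 = a.strides                # bytes to step one row / one column
m, n = a.shape

# Shape (m//2, 2, n): every second row, each seen twice via the 0 stride.
view = as_strided(a, shape=(m // 2, 2, n), strides=(2 * m0, 0, n0))
print(view.strides[1])            # 0
print(np.shares_memory(view, a))  # True: no data copied yet

out = view.reshape(-1, n)         # this step materializes a new array
print(np.shares_memory(out, a))   # False
```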
Timings -
In [156]: arr = np.arange(1000000).reshape(1000, 1000)
# Proposed soln-1
In [157]: %timeit arr[::2].repeat(2,axis=0)
1000 loops, best of 3: 1.26 ms per loop
# @Psidom's soln
In [158]: %timeit arr[1::2] = arr[::2]
1000 loops, best of 3: 928 µs per loop
In [159]: arr = np.arange(1000000).reshape(1000, 1000)
# Proposed soln-2
In [160]: %timeit strided_app(arr)
1000 loops, best of 3: 830 µs per loop
Looks like you have an even number of rows, in which case you can use assignment (assign the values of the even-indexed rows to the odd-indexed rows that follow them):
arr = np.array([[1,4],[3,1],[2,3],[2,2]])
arr[1::2] = arr[::2]
arr
#array([[1, 4],
# [1, 4],
# [2, 3],
# [2, 3]])
This avoids copying the entire array, but doesn't work if the array has an odd number of rows, because the two slices then differ in length.
Timing: here is a comparison; the assignment does seem faster.
arr = np.arange(1000000).reshape(1000, 1000)
%timeit arr[::2].repeat(2,axis=0)
1000 loops, best of 3: 913 µs per loop
%timeit arr[1::2] = arr[::2]
1000 loops, best of 3: 655 µs per loop
This works for both even and odd numbers of rows.
for i in range(1,len(a),2):
    a[i] = a[i-1]
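The loop above can also be written as a single slice assignment that tolerates an odd row count. This is a sketch rather than one of the posted answers, and `duplicate_even_rows` is a hypothetical name. The slice `a[:-1:2]` stops before the last row, so it always has the same number of rows as `a[1::2]`.

```python
import numpy as np

def duplicate_even_rows(a):
    # Overwrite rows 1, 3, 5, ... in place with rows 0, 2, 4, ...
    # a[:-1:2] excludes the final row, so both sides match for odd lengths too.
    a[1::2] = a[:-1:2]
    return a

odd = np.arange(10).reshape(5, 2)
print(duplicate_even_rows(odd.copy()))
```

Pass a copy, as in the usage line, when the original array must stay untouched.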
I have a vector [x,y,z,q] and I want to create a matrix:
[[x,y,z,q],
[x,y,z,q],
[x,y,z,q],
...
[x,y,z,q]]
with m rows. I think this could be done in some smart way, using broadcasting, but I can only think of doing it with a for loop.
Certainly possible with broadcasting: add the vector to a column of m zeros, like so -
np.zeros((m,1),dtype=vector.dtype) + vector
NumPy also has a built-in function, np.tile, for exactly this task -
np.tile(vector,(m,1))
Sample run -
In [496]: vector
Out[496]: array([4, 5, 8, 2])
In [497]: m = 5
In [498]: np.zeros((m,1),dtype=vector.dtype) + vector
Out[498]:
array([[4, 5, 8, 2],
[4, 5, 8, 2],
[4, 5, 8, 2],
[4, 5, 8, 2],
[4, 5, 8, 2]])
In [499]: np.tile(vector,(m,1))
Out[499]:
array([[4, 5, 8, 2],
[4, 5, 8, 2],
[4, 5, 8, 2],
[4, 5, 8, 2],
[4, 5, 8, 2]])
You can also use np.repeat after extending its dimension with np.newaxis/None for the same effect, like so -
In [510]: np.repeat(vector[None],m,axis=0)
Out[510]:
array([[4, 5, 8, 2],
[4, 5, 8, 2],
[4, 5, 8, 2],
[4, 5, 8, 2],
[4, 5, 8, 2]])
You can also use integer array indexing to get the replications, like so -
In [525]: vector[None][np.zeros(m,dtype=int)]
Out[525]:
array([[4, 5, 8, 2],
[4, 5, 8, 2],
[4, 5, 8, 2],
[4, 5, 8, 2],
[4, 5, 8, 2]])
And finally with np.broadcast_to, you can simply create a 2D view into the input vector and as such this would be virtually free and with no extra memory requirement. So, we would simply do -
In [22]: np.broadcast_to(vector,(m,len(vector)))
Out[22]:
array([[4, 5, 8, 2],
[4, 5, 8, 2],
[4, 5, 8, 2],
[4, 5, 8, 2],
[4, 5, 8, 2]])
Runtime test -
Here's a quick runtime test comparing the various approaches -
In [12]: vector = np.random.rand(10000)
In [13]: m = 10000
In [14]: %timeit np.broadcast_to(vector,(m,len(vector)))
100000 loops, best of 3: 3.4 µs per loop # virtually free!
In [15]: %timeit np.zeros((m,1),dtype=vector.dtype) + vector
10 loops, best of 3: 95.1 ms per loop
In [16]: %timeit np.tile(vector,(m,1))
10 loops, best of 3: 89.7 ms per loop
In [17]: %timeit np.repeat(vector[None],m,axis=0)
10 loops, best of 3: 86.2 ms per loop
In [18]: %timeit vector[None][np.zeros(m,dtype=int)]
10 loops, best of 3: 89.8 ms per loop
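Two properties of the `np.broadcast_to` result explain the timing above and are worth checking before relying on it: the view has a stride of 0 along the new axis, and it is read-only. The sketch below uses a made-up `vector`; call `.copy()` when a writable array is needed, which allocates the full m-row array just as the other approaches do.

```python
import numpy as np

vector = np.array([4, 5, 8, 2])
m = 5

view = np.broadcast_to(vector, (m, len(vector)))
print(view.strides[0])                 # 0: every row points at the same memory
print(np.shares_memory(view, vector))  # True
print(view.flags.writeable)            # False

try:
    view[0, 0] = 99                    # writing to the broadcast view is rejected
except ValueError as exc:
    print("read-only:", exc)

writable = view.copy()                 # a real (m, 4) array with its own memory
writable[0, 0] = 99
```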
I have a 2D array of integers that is MxN, and I would like to expand it to (BM)x(BN), where B is the side length of a square tile, so that each element of the input array is repeated as a BxB block in the final array. Below is an example with a nested for loop. Is there a quicker/built-in way?
import numpy as np
a = np.arange(9).reshape([3,3]) # input array - 3x3
B = 2 # block size - 2 (must be an int to be usable in shapes and slices)
A = np.zeros([a.shape[0]*B,a.shape[1]*B]) # output array - 6x6
# Loop, filling A with tiled values of a at each index
for i,l in enumerate(a): # lines in a
    for j,aij in enumerate(l): # a[i,j]
        A[B*i:B*(i+1),B*j:B*(j+1)] = aij
Result ...
a= [[0 1 2]
[3 4 5]
[6 7 8]]
A = [[ 0. 0. 1. 1. 2. 2.]
[ 0. 0. 1. 1. 2. 2.]
[ 3. 3. 4. 4. 5. 5.]
[ 3. 3. 4. 4. 5. 5.]
[ 6. 6. 7. 7. 8. 8.]
[ 6. 6. 7. 7. 8. 8.]]
One option is
>>> a.repeat(2, axis=0).repeat(2, axis=1)
array([[0, 0, 1, 1, 2, 2],
[0, 0, 1, 1, 2, 2],
[3, 3, 4, 4, 5, 5],
[3, 3, 4, 4, 5, 5],
[6, 6, 7, 7, 8, 8],
[6, 6, 7, 7, 8, 8]])
This is slightly wasteful due to the intermediate array but it's concise at least.
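As a quick sanity check, the chained `repeat` can be compared against the loop from the question for an arbitrary block size. This is a sketch with hypothetical helper names (`expand_loop`, `expand_repeat`); the loop here uses `dtype=a.dtype` so that both results are integer arrays, unlike the float output shown in the question.

```python
import numpy as np

def expand_loop(a, B):
    # Reference implementation: fill each BxB block with one input element.
    A = np.zeros((a.shape[0] * B, a.shape[1] * B), dtype=a.dtype)
    for i, row in enumerate(a):
        for j, aij in enumerate(row):
            A[B * i:B * (i + 1), B * j:B * (j + 1)] = aij
    return A

def expand_repeat(a, B):
    # Repeat rows first, then columns; the first call creates the intermediate array.
    return a.repeat(B, axis=0).repeat(B, axis=1)

a = np.arange(9).reshape(3, 3)
print(np.array_equal(expand_loop(a, 2), expand_repeat(a, 2)))  # True
```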
Here's a potentially fast way using stride tricks and reshaping:
from numpy.lib.stride_tricks import as_strided
def tile_array(a, b0, b1):
    r, c = a.shape # number of rows/columns
    rs, cs = a.strides # row/column strides
    x = as_strided(a, (r, b0, c, b1), (rs, 0, cs, 0)) # view a as larger 4D array
    return x.reshape(r*b0, c*b1) # create new 2D array
The underlying data in a is copied when reshape is called, so this function does not return a view. However, compared to using repeat along multiple axes, fewer copying operations are required.
The function can be then used as follows:
>>> a = np.arange(9).reshape(3, 3)
>>> a
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
>>> tile_array(a, 2, 2)
array([[0, 0, 1, 1, 2, 2],
[0, 0, 1, 1, 2, 2],
[3, 3, 4, 4, 5, 5],
[3, 3, 4, 4, 5, 5],
[6, 6, 7, 7, 8, 8],
[6, 6, 7, 7, 8, 8]])
>>> tile_array(a, 3, 4)
array([[0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2],
[0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2],
[0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2],
[3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5],
[3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5],
[3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5],
[6, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 8],
[6, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 8],
[6, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 8]])
Now, for small blocks, this method is a little slower than using repeat but faster than kron.
For slightly larger blocks, however, it becomes quicker than other alternatives. For instance, using a block shape of (20, 20):
>>> %timeit tile_array(a, 20, 20)
100000 loops, best of 3: 18.7 µs per loop
>>> %timeit a.repeat(20, axis=0).repeat(20, axis=1)
10000 loops, best of 3: 26 µs per loop
>>> %timeit np.kron(a, np.ones((20,20), a.dtype))
10000 loops, best of 3: 106 µs per loop
The gap between the methods increases as the block size increases.
Also if a is a large array, it may be quicker than alternatives:
>>> a2 = np.arange(1000000).reshape(1000, 1000)
>>> %timeit tile_array(a2, 2, 2)
100 loops, best of 3: 11.4 ms per loop
>>> %timeit a2.repeat(2, axis=0).repeat(2, axis=1)
1 loops, best of 3: 30.9 ms per loop
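If `as_strided` feels too low-level, the same 4D view can be built with `np.broadcast_to`, which checks the shapes for you. This variant is not from the original answer: `tile_array_bc` is a hypothetical name, and no timing claim is made for it.

```python
import numpy as np

def tile_array_bc(a, b0, b1):
    r, c = a.shape
    # Insert length-1 axes, broadcast them to the block sizes, then reshape (copies).
    x = np.broadcast_to(a[:, None, :, None], (r, b0, c, b1))
    return x.reshape(r * b0, c * b1)

a = np.arange(9).reshape(3, 3)
out = tile_array_bc(a, 2, 3)
print(out.shape)  # (6, 9)
```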
Probably not the fastest, but..
np.kron(a, np.ones((B,B), a.dtype))
It does the Kronecker product, so it involves a multiplication for each element in the output.
I can't figure out the difference between these two kinds of indexing. It seems like they should produce the same results but they do not. Any explanation?
A[1:3, 0:2] takes rows 1 and 2 (the stop index 3 is excluded) and columns 0 and 1, thus returning a 2x2 array.
A[1:3][0:2] first takes rows 1 and 2, and then takes rows 0 and 1 of that subarray, resulting in a 2xn array where n is the original number of columns.
In [1]: import numpy as np
In [2]: a = np.arange(16).reshape(4,4)
In [3]: a
Out[3]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
In [4]: a[1:3,0:2]
Out[4]:
array([[4, 5],
[8, 9]])
In [5]: a[1:3]
Out[5]:
array([[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
In [6]: a[1:3][0:2]
Out[6]:
array([[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
The equivalent of A[1:3,0:2] using two [] is: A[1:3][:,0:2]:
In [7]: a[1:3][:,0:2]
Out[7]:
array([[4, 5],
[8, 9]])
Here : means "all the rows". So you first select the rows via [1:3] and then, from all of those rows, select columns 0 and 1.
A[1:3][0:2] means first apply [1:3] to A, and then apply [0:2] to the array returned by the first step, so both slices are applied to the rows only. On the other hand, A[1:3, 0:2] means apply 1:3 to the rows and 0:2 to the columns, i.e. get only the second and third rows, and only the first two columns of those rows.
>>> import numpy as np
>>> a = np.arange(12).reshape(3, 4)
>>> a
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
>>> a[1:3][0:2]
array([[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
>>> a[1:3] #Get 2nd and 3rd row.
array([[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
>>> _[0:2] #Get the first two rows of the last array.
array([[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
>>> a[1:3, 0:2]
array([[4, 5],
[8, 9]])
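The difference comes from what Python passes to `__getitem__`. A small sketch (the `ShowKey` class is purely illustrative) makes this visible: `A[1:3, 0:2]` is one call with a tuple of two slices, while `A[1:3][0:2]` is two separate calls that each receive a single slice.

```python
import numpy as np

class ShowKey:
    # Returns whatever key the indexing expression passes in.
    def __getitem__(self, key):
        return key

k = ShowKey()
print(k[1:3, 0:2])  # (slice(1, 3, None), slice(0, 2, None))
print(k[1:3])       # slice(1, 3, None)

a = np.arange(16).reshape(4, 4)
print(a[1:3, 0:2].shape)     # (2, 2)
print(a[1:3][0:2].shape)     # (2, 4)
print(a[1:3][:, 0:2].shape)  # (2, 2)
```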