I have two arrays a & b
a.shape
(5, 4, 3)
array([[[ 0. , 0. , 0. ],
[ 0. , 0. , 0. ],
[ 0. , 0. , 0. ],
[ 0.10772717, 0.604584 , 0.41664413]],
[[ 0. , 0. , 0. ],
[ 0. , 0. , 0. ],
[ 0.10772717, 0.604584 , 0.41664413],
[ 0.95879616, 0.85575133, 0.46135877]],
[[ 0. , 0. , 0. ],
[ 0.10772717, 0.604584 , 0.41664413],
[ 0.95879616, 0.85575133, 0.46135877],
[ 0.70442301, 0.74126523, 0.88965603]],
[[ 0.10772717, 0.604584 , 0.41664413],
[ 0.95879616, 0.85575133, 0.46135877],
[ 0.70442301, 0.74126523, 0.88965603],
[ 0.8039435 , 0.62802183, 0.58885027]],
[[ 0.95879616, 0.85575133, 0.46135877],
[ 0.70442301, 0.74126523, 0.88965603],
[ 0.8039435 , 0.62802183, 0.58885027],
[ 0.95848603, 0.72429311, 0.71461332]]])
and b
array([ 0.79212707, 0.66629398, 0.58676553], dtype=float32)
b.shape
(3,)
I want to get array
ab.shape
(5,5,3)
I do as below
first
b = b.reshape(1,1,3)
then
b=np.concatenate((b, b,b, b, b), axis = 0)
And
ab=np.concatenate((a, b), axis = 1)
ab.shape
(5, 5, 3)
I get the right result, but it's not very convenient especially at the step
b=np.concatenate((b, b,b, b, b), axis = 0)
when I have to type many times (the real dataset has much dimensions). Are there any faster ways to come to this result?
Simply broadcast b to 3D and then concatenate along second axis -
b3D = np.broadcast_to(b,(a.shape[0],1,len(b)))
out = np.concatenate((a,b3D),axis=1)
The broadcasting part with np.broadcast_to doesn't actual replicate or make copies and is simply a replicated view and then in the next step, we do the concatenation that does the replication on-the-fly.
Benchmarking
We are comparing np.repeat version from #cᴏʟᴅsᴘᴇᴇᴅ's solution against np.broadcast_to one
in this section with focus on performance. The broadcasting based one does the replication and concatenation in the second step, as a merged command so to speak, while np.repeat version makes copy and then concatenates in two separate steps.
Timing the approaches as whole :
Case #1 : a = (500,400,300) and b = (300,)
In [321]: a = np.random.rand(500,400,300)
In [322]: b = np.random.rand(300)
In [323]: %%timeit
...: b3D = b.reshape(1, 1, -1).repeat(a.shape[0], axis=0)
...: r = np.concatenate((a, b3D), axis=1)
10 loops, best of 3: 72.1 ms per loop
In [325]: %%timeit
...: b3D = np.broadcast_to(b,(a.shape[0],1,len(b)))
...: out = np.concatenate((a,b3D),axis=1)
10 loops, best of 3: 72.5 ms per loop
For smaller input shapes, call to np.broadcast_to would take a bit longer than np.repeat given the work needed for setting up the broadcasting is apparently more complicated, as the timings suggest below :
In [360]: a = np.random.rand(5,4,3)
In [361]: b = np.random.rand(3)
In [366]: %timeit np.broadcast_to(b,(a.shape[0],1,len(b)))
100000 loops, best of 3: 3.12 µs per loop
In [367]: %timeit b.reshape(1, 1, -1).repeat(a.shape[0], axis=0)
1000000 loops, best of 3: 957 ns per loop
But, the broadcasting part would have a constant time irrepective of the shapes of the inputs, i.e. the 3 u-sec part would stay around that mark. The timing for the counterpart : b.reshape(1, 1, -1).repeat(a.shape[0], axis=0) would depend on the input shapes. So, let's dig deeper and see how the concatenation steps for the two approaches fair/behave.
Diging deeper
Trying to dig deeper to see how much the concatenation part is consuming :
In [353]: a = np.random.rand(500,400,300)
In [354]: b = np.random.rand(300)
In [355]: b3D = np.broadcast_to(b,(a.shape[0],1,len(b)))
In [356]: %timeit np.concatenate((a,b3D),axis=1)
10 loops, best of 3: 72 ms per loop
In [357]: b3D = b.reshape(1, 1, -1).repeat(a.shape[0], axis=0)
In [358]: %timeit np.concatenate((a,b3D),axis=1)
10 loops, best of 3: 72 ms per loop
Conclusion : Doesn't seem too different.
Now, let's try a case where the replication needed for b is a bigger number and b has noticeably high number of elements as well.
In [344]: a = np.random.rand(10000, 10, 1000)
In [345]: b = np.random.rand(1000)
In [346]: b3D = np.broadcast_to(b,(a.shape[0],1,len(b)))
In [347]: %timeit np.concatenate((a,b3D),axis=1)
10 loops, best of 3: 130 ms per loop
In [348]: b3D = b.reshape(1, 1, -1).repeat(a.shape[0], axis=0)
In [349]: %timeit np.concatenate((a,b3D),axis=1)
10 loops, best of 3: 141 ms per loop
Conclusion : Seems like the merged concatenate+replication with np.broadcast_to is doing a bit better here.
Let's try the original case of (5,4,3) shape :
In [360]: a = np.random.rand(5,4,3)
In [361]: b = np.random.rand(3)
In [362]: b3D = np.broadcast_to(b,(a.shape[0],1,len(b)))
In [363]: %timeit np.concatenate((a,b3D),axis=1)
1000000 loops, best of 3: 948 ns per loop
In [364]: b3D = b.reshape(1, 1, -1).repeat(a.shape[0], axis=0)
In [365]: %timeit np.concatenate((a,b3D),axis=1)
1000000 loops, best of 3: 950 ns per loop
Conclusion : Again, not too different.
So, the final conclusion is that if there are a lot of elements in b and if the first axis of a is also a big number (as the replication number is that one), np.broadcast_to would be a good option, otherwise np.repeat based version takes care of the other cases pretty well.
You can use np.repeat:
r = np.concatenate((a, b.reshape(1, 1, -1).repeat(a.shape[0], axis=0)), axis=1)
What this does, is first reshape your b array to match the dimensions of a, and then repeat its values as many times as needed according to a's first axis:
b3D = b.reshape(1, 1, -1).repeat(a.shape[0], axis=0)
array([[[1, 2, 3]],
[[1, 2, 3]],
[[1, 2, 3]],
[[1, 2, 3]],
[[1, 2, 3]]])
b3D.shape
(5, 1, 3)
This intermediate result is then concatenated with a -
r = np.concatenate((a, b3d), axis=0)
r.shape
(5, 5, 3)
This differs from your current answer mainly in the fact that the repetition of values is not hard-coded (i.e., it is taken care of by the repeat).
If you need to handle this for a different number of dimensions (not 3D arrays), then some changes are needed (mainly in how remove the hardcoded reshape of b).
Timings
a = np.random.randn(100, 99, 100)
b = np.random.randn(100)
# Tai's answer
%timeit np.insert(a, 4, b, axis=1)
100 loops, best of 3: 3.7 ms per loop
# Divakar's answer
%%timeit
b3D = np.broadcast_to(b,(a.shape[0],1,len(b)))
np.concatenate((a,b3D),axis=1)
100 loops, best of 3: 3.67 ms per loop
# solution in this post
%timeit np.concatenate((a, b.reshape(1, 1, -1).repeat(a.shape[0], axis=0)), axis=1)
100 loops, best of 3: 3.62 ms per loop
These are all pretty competitive solutions. However, note that performance depends on your actual data, so make sure you test things first!
Here are some simple timings based on cᴏʟᴅsᴘᴇᴇᴅ's and Divakar's solutions:
%timeit np.concatenate((a, b.reshape(1, 1, -1).repeat(a.shape[0], axis=0)), axis=1)
Output:
The slowest run took 6.44 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 3.68 µs per loop
%timeit np.concatenate((a, np.broadcast_to(b[None,None], (a.shape[0], 1, len(b)))), axis=1)
Output:
The slowest run took 4.12 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 10.7 µs per loop
Now here is the timing based on your original code:
%timeit original_func(a, b)
Output:
The slowest run took 4.62 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 4.69 µs per loop
Since the question asked for faster ways to come up with the same result, I would go for cᴏʟᴅsᴘᴇᴇᴅ's solution based on these problem calculations.
You can also use np.insert.
b_broad = np.expand_dims(b, axis=0) # b_broad.shape = (1, 3)
ab = np.insert(a, 4, b_broad, axis=1)
"""
Because now we are inserting along axis 1
a'shape without axis 1 = (5, 3)
b_broad's shape (1, 3)
can be aligned and broadcast b_broad to (5, 3)
"""
In this example, we insert along the axis 1, and will put b_broad before the index given, 4 here. In other words, the b_broad will occupy index 4 at long the axis and make ab.shape equal (5, 5, 3).
Note again that before we do insertion, we turn b into b_broad for safely achieve the right broadcasting you want. The dimension of b is smaller and there will be broadcasting at insertion. We can use expand_dims to achieve this goal.
If a is of shape (3, 4, 5), you will need b_broad to have shape (3, 1) to match up dimensions if inserting along axis 1. This can be achieved by
b_broad = np.expand_dims(b, axis=1) # shape = (3, 1)
It would be a good practice to make b_broad in a right shape because you might have a.shape = (3, 4, 3) and you really need to specify which way to broadcast in this case!
Timing Results
From OP's dataset: COLDSPEED's answer is 3 times faster.
def Divakar(): # Divakar's answer
b3D = b.reshape(1, 1, -1).repeat(a.shape[0], axis=0)
r = np.concatenate((a, b3D), axis=1)
# COLDSPEED's result
%timeit np.concatenate((a, b.reshape(1, 1, -1).repeat(a.shape[0], axis=0)), axis=1)
2.95 µs ± 164 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
# Divakar's result
%timeit Divakar()
3.03 µs ± 173 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
# Mine's
%timeit np.insert(a, 4, b, axis=1)
10.1 µs ± 220 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Dataset 2 (Borrow the timing experiment from COLDSPEED): nothing can be concluded in this case because they share nearly the same mean and standard deviation.
a = np.random.randn(100, 99, 100)
b = np.random.randn(100)
# COLDSPEED's result
%timeit np.concatenate((a, b.reshape(1, 1, -1).repeat(a.shape[0], axis=0)), axis=1)
2.37 ms ± 194 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
# Divakar's
%timeit Divakar()
2.31 ms ± 249 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
# Mine's
%timeit np.insert(a, 99, b, axis=1)
2.34 ms ± 154 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Speed will depend on data's size, shape, and volume. Please tested on you dataset if speed is your concern.
Related
I have N number of points, for example:
A = [2, 3]
B = [3, 4]
C = [3, 3]
.
.
.
And they're in an array like so:
arr = np.array([[2, 3], [3, 4], [3, 3]])
I need as output all pairwise distances in BFS (Breadth First Search) order to track which distance is which, like: A->B, A->C, B->C. For the above example data, the result would be [1.41, 1.0, 1.0].
EDIT: I have to accomplish it with numpy or core libraries.
If you can use it, SciPy has a function for this:
In [2]: from scipy.spatial.distance import pdist
In [3]: pdist(arr)
Out[3]: array([1.41421356, 1. , 1. ])
Here's a numpy-only solution (fair warning: it requires a lot of memory, unlike pdist)...
dists = np.triu(np.linalg.norm(arr - arr[:, None], axis=-1)).flatten()
dists = dists[dists != 0]
Demo:
In [4]: arr = np.array([[2, 3], [3, 4], [3, 3], [5, 2], [4, 5]])
In [5]: pdist(arr)
Out[5]:
array([1.41421356, 1. , 3.16227766, 2.82842712, 1. ,
2.82842712, 1.41421356, 2.23606798, 2.23606798, 3.16227766])
In [6]: dists = np.triu(np.linalg.norm(arr - arr[:, None], axis=-1)).flatten()
In [7]: dists = dists[dists != 0]
In [8]: dists
Out[8]:
array([1.41421356, 1. , 3.16227766, 2.82842712, 1. ,
2.82842712, 1.41421356, 2.23606798, 2.23606798, 3.16227766])
Timings (with the solution above wrapped in a function called triu):
In [9]: %timeit pdist(arr)
7.27 µs ± 738 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [10]: %timeit triu(arr)
25.5 µs ± 4.58 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
As an alternative method, but similar to ddejohn answer, we can use np.triu_indices which return just the upper triangular indices in the matrix, which may be more memory-efficient:
np.linalg.norm(arr - arr[:, None], axis=-1)[np.triu_indices(arr.shape[0], 1)]
This doesn't need additional modules like flattening and indexing. Its performance is similar to the aforementioned answer for large data (e.g. you can check it by arr = np.random.rand(10000, 2) on colab, which will be done near 4.6 s for both; It may beats the np.triu and flatten in larger data).
I have tested the memory usage one time by memory-profiler as follows, but it must be checked again if it be important in terms of memory usage (I'm not sure):
Update:
I have tried to limit the calculations just to the upper triangle, that speed the code up 2 to 3 times on the tested arrays. As array size grows, the performance difference between this loop and the previous methods by np.triu_indices or np.triu grows and be more obvious:
ind = np.arange(arr.shape[0] - 1)
sub_ind = ind + 1
result = np.zeros(sub_ind.sum())
j = 0
for i in range(ind.shape[0]):
result[j:j+ind[-1-i]+1] = np.linalg.norm(arr[ind[i]] - arr[sub_ind[i]:], axis=-1)
j += ind[-1-i]+1
Also, through this way, the memory consumption is reduced at least ~x4. So, this method made it possible to work on larger arrays and more quickly.
Benchmarks:
# arr = np.random.rand(100, 2)
100 loops, best of 5: 459 µs per loop (ddejohns --> np.triu & np.flatten)
100 loops, best of 5: 528 µs per loop (mine --> np.triu_indices)
100 loops, best of 5: 1.42 ms per loop (This method)
--------------------------------------
# arr = np.random.rand(1000, 2)
10 loops, best of 5: 49.9 ms per loop
10 loops, best of 5: 49.7 ms per loop
10 loops, best of 5: 30.4 ms per loop (~x1.7) The fastest
--------------------------------------
# arr = np.random.rand(10000, 2)
2 loops, best of 5: 4.56 s per loop
2 loops, best of 5: 4.6 s per loop
2 loops, best of 5: 1.85 s per loop (~x2.5) The fastest
I have two numpy arrays, A and B. A conatains unique values and B is a sub-array of A.
Now I am looking for a way to get the index of B's values within A.
For example:
A = np.array([1,2,3,4,5,6,7,8,9,10])
B = np.array([1,7,10])
# I need a function fun() that:
fun(A,B)
>> 0,6,9
You can use np.in1d with np.nonzero -
np.nonzero(np.in1d(A,B))[0]
You can also use np.searchsorted, if you care about maintaining the order -
np.searchsorted(A,B)
For a generic case, when A & B are unsorted arrays, you can bring in the sorter option in np.searchsorted, like so -
sort_idx = A.argsort()
out = sort_idx[np.searchsorted(A,B,sorter = sort_idx)]
I would add in my favorite broadcasting too in the mix to solve a generic case -
np.nonzero(B[:,None] == A)[1]
Sample run -
In [125]: A
Out[125]: array([ 7, 5, 1, 6, 10, 9, 8])
In [126]: B
Out[126]: array([ 1, 10, 7])
In [127]: sort_idx = A.argsort()
In [128]: sort_idx[np.searchsorted(A,B,sorter = sort_idx)]
Out[128]: array([2, 4, 0])
In [129]: np.nonzero(B[:,None] == A)[1]
Out[129]: array([2, 4, 0])
Have you tried searchsorted?
A = np.array([1,2,3,4,5,6,7,8,9,10])
B = np.array([1,7,10])
A.searchsorted(B)
# array([0, 6, 9])
Just for completeness: If the values in A are non negative and reasonably small:
lookup = np.empty((np.max(A) + 1), dtype=int)
lookup[A] = np.arange(len(A))
indices = lookup[B]
I had the same question these days. However, the timing performance is very critical for me. Therefore, I guess the timing comparison of different solutions may be useful for others.
As Divakar mentioned, you can use np.in1d(A, B) with np.where, np.nonzero. Moreover, you can use the np.in1d(A, B) with np.intersect1d (based on this page). Also, you can use np.searchsorted as another useful approach for sorted arrays.
I want to add another simple solution. You can use the comprehension list. It may take longer that the previous ones. However, if you take the advantage of Numba python package, it is much less time-consuming.
In [1]: import numpy as np
In [2]: from numba import njit
In [3]: a = np.array([1,2,3,4,5,6,7,8,9,10])
In [4]: b = np.array([1,7,10])
In [5]: np.where(np.in1d(a, b))[0]
...: array([0, 6, 9])
In [6]: np.nonzero(np.in1d(a, b))[0]
...: array([0, 6, 9])
In [7]: np.searchsorted(a, b)
...: array([0, 6, 9])
In [8]: np.searchsorted(a, np.intersect1d(a, b))
...: array([0, 6, 9])
In [9]: [i for i, x in enumerate(a) if x in b]
...: [0, 6, 9]
In [10]: #njit
...: def func(a, b):
...: return [i for i, x in enumerate(a) if x in b]
In [11]: func(a, b)
...: [0, 6, 9]
Now, let's compare the timing performance of these solutions.
In [12]: %timeit np.where(np.in1d(a, b))[0]
4.26 µs ± 6.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [13]: %timeit np.nonzero(np.in1d(a, b))[0]
4.39 µs ± 14.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [14]: %timeit np.searchsorted(a, b)
800 ns ± 6.04 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [15]: %timeit np.searchsorted(a, np.intersect1d(a, b))
8.8 µs ± 73.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [16]: %timeit [i for i, x in enumerate(a) if x in b]
15.4 µs ± 18.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [17]: %timeit func(a, b)
336 ns ± 0.579 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
I have two matrices with different dimensions that I would like to multiply using einsum numpy:
C(24, 79) and D(1, 1, 24, 1). I want to obtain the matrix with the dimension (1, 1, 79, 1).
I have tried to multiply them in two ways:
tmp = np.einsum('px, klpj ->klxj', C, D)
tmp = np.einsum('xp, klpj ->klxj', C, D)
and I'm obtaining different results. Why? What is the correct way of multiplying these matrices?
Owing to the singleton dimensions that don't really result in sum-reduction, we can introduce matrix-multiplication with np.tensordot or np.dot to have two more approaches to solve it -
np.tensordot(C,D,axes=([0],[2])).swapaxes(0,2)
D.ravel().dot(C).reshape(1,1,C.shape[1],1)
Verify results -
In [26]: tmp = np.einsum('px, klpj ->klxj', C, D)
In [27]: out = np.tensordot(C,D,axes=([0],[2])).swapaxes(0,2)
In [28]: np.allclose(out, tmp)
Out[28]: True
In [29]: out = D.ravel().dot(C).reshape(1,1,C.shape[1],1)
In [30]: np.allclose(out, tmp)
Out[30]: True
Runtime test -
In [31]: %timeit np.einsum('px, klpj ->klxj', C, D)
100000 loops, best of 3: 5.84 µs per loop
In [32]: %timeit np.tensordot(C,D,axes=([0],[2])).swapaxes(0,2)
100000 loops, best of 3: 18.5 µs per loop
In [33]: %timeit D.ravel().dot(C).reshape(1,1,C.shape[1],1)
100000 loops, best of 3: 3.29 µs per loop
With bigger datasets, you would see noticeable benefits with matrix-multiplication -
In [36]: C = np.random.rand(240,790)
...: D = np.random.rand(1,1,240,1)
...:
In [37]: %timeit np.einsum('px, klpj ->klxj', C, D)
...: %timeit np.tensordot(C,D,axes=([0],[2])).swapaxes(0,2)
...: %timeit D.ravel().dot(C).reshape(1,1,C.shape[1],1)
...:
1000 loops, best of 3: 182 µs per loop
10000 loops, best of 3: 84.9 µs per loop
10000 loops, best of 3: 55.5 µs per loop
I have two numpy arrays, A and B. A conatains unique values and B is a sub-array of A.
Now I am looking for a way to get the index of B's values within A.
For example:
A = np.array([1,2,3,4,5,6,7,8,9,10])
B = np.array([1,7,10])
# I need a function fun() that:
fun(A,B)
>> 0,6,9
You can use np.in1d with np.nonzero -
np.nonzero(np.in1d(A,B))[0]
You can also use np.searchsorted, if you care about maintaining the order -
np.searchsorted(A,B)
For a generic case, when A & B are unsorted arrays, you can bring in the sorter option in np.searchsorted, like so -
sort_idx = A.argsort()
out = sort_idx[np.searchsorted(A,B,sorter = sort_idx)]
I would add in my favorite broadcasting too in the mix to solve a generic case -
np.nonzero(B[:,None] == A)[1]
Sample run -
In [125]: A
Out[125]: array([ 7, 5, 1, 6, 10, 9, 8])
In [126]: B
Out[126]: array([ 1, 10, 7])
In [127]: sort_idx = A.argsort()
In [128]: sort_idx[np.searchsorted(A,B,sorter = sort_idx)]
Out[128]: array([2, 4, 0])
In [129]: np.nonzero(B[:,None] == A)[1]
Out[129]: array([2, 4, 0])
Have you tried searchsorted?
A = np.array([1,2,3,4,5,6,7,8,9,10])
B = np.array([1,7,10])
A.searchsorted(B)
# array([0, 6, 9])
Just for completeness: If the values in A are non negative and reasonably small:
lookup = np.empty((np.max(A) + 1), dtype=int)
lookup[A] = np.arange(len(A))
indices = lookup[B]
I had the same question these days. However, the timing performance is very critical for me. Therefore, I guess the timing comparison of different solutions may be useful for others.
As Divakar mentioned, you can use np.in1d(A, B) with np.where, np.nonzero. Moreover, you can use the np.in1d(A, B) with np.intersect1d (based on this page). Also, you can use np.searchsorted as another useful approach for sorted arrays.
I want to add another simple solution. You can use the comprehension list. It may take longer that the previous ones. However, if you take the advantage of Numba python package, it is much less time-consuming.
In [1]: import numpy as np
In [2]: from numba import njit
In [3]: a = np.array([1,2,3,4,5,6,7,8,9,10])
In [4]: b = np.array([1,7,10])
In [5]: np.where(np.in1d(a, b))[0]
...: array([0, 6, 9])
In [6]: np.nonzero(np.in1d(a, b))[0]
...: array([0, 6, 9])
In [7]: np.searchsorted(a, b)
...: array([0, 6, 9])
In [8]: np.searchsorted(a, np.intersect1d(a, b))
...: array([0, 6, 9])
In [9]: [i for i, x in enumerate(a) if x in b]
...: [0, 6, 9]
In [10]: #njit
...: def func(a, b):
...: return [i for i, x in enumerate(a) if x in b]
In [11]: func(a, b)
...: [0, 6, 9]
Now, let's compare the timing performance of these solutions.
In [12]: %timeit np.where(np.in1d(a, b))[0]
4.26 µs ± 6.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [13]: %timeit np.nonzero(np.in1d(a, b))[0]
4.39 µs ± 14.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [14]: %timeit np.searchsorted(a, b)
800 ns ± 6.04 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [15]: %timeit np.searchsorted(a, np.intersect1d(a, b))
8.8 µs ± 73.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [16]: %timeit [i for i, x in enumerate(a) if x in b]
15.4 µs ± 18.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [17]: %timeit func(a, b)
336 ns ± 0.579 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
I want to create a list of points that would correspond to a grid. So if I want to create a grid of the region from (0, 0) to (1, 1), it would contain the points (0, 0), (0, 1), (1, 0) and (1, 0).
I know that that this can be done with the following code:
g = np.meshgrid([0,1],[0,1])
np.append(g[0].reshape(-1,1),g[1].reshape(-1,1),axis=1)
Yielding the result:
array([[0, 0],
[1, 0],
[0, 1],
[1, 1]])
My question is twofold:
Is there a better way of doing this?
Is there a way of generalizing this to higher dimensions?
I just noticed that the documentation in numpy provides an even faster way to do this:
X, Y = np.mgrid[xmin:xmax:100j, ymin:ymax:100j]
positions = np.vstack([X.ravel(), Y.ravel()])
This can easily be generalized to more dimensions using the linked meshgrid2 function and mapping 'ravel' to the resulting grid.
g = meshgrid2(x, y, z)
positions = np.vstack(map(np.ravel, g))
The result is about 35 times faster than the zip method for a 3D array with 1000 ticks on each axis.
Source: http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gaussian_kde.html#scipy.stats.gaussian_kde
To compare the two methods consider the following sections of code:
Create the proverbial tick marks that will help to create the grid.
In [23]: import numpy as np
In [34]: from numpy import asarray
In [35]: x = np.random.rand(100,1)
In [36]: y = np.random.rand(100,1)
In [37]: z = np.random.rand(100,1)
Define the function that mgilson linked to for the meshgrid:
In [38]: def meshgrid2(*arrs):
....: arrs = tuple(reversed(arrs))
....: lens = map(len, arrs)
....: dim = len(arrs)
....: sz = 1
....: for s in lens:
....: sz *= s
....: ans = []
....: for i, arr in enumerate(arrs):
....: slc = [1]*dim
....: slc[i] = lens[i]
....: arr2 = asarray(arr).reshape(slc)
....: for j, sz in enumerate(lens):
....: if j != i:
....: arr2 = arr2.repeat(sz, axis=j)
....: ans.append(arr2)
....: return tuple(ans)
Create the grid and time the two functions.
In [39]: g = meshgrid2(x, y, z)
In [40]: %timeit pos = np.vstack(map(np.ravel, g)).T
100 loops, best of 3: 7.26 ms per loop
In [41]: %timeit zip(*(x.flat for x in g))
1 loops, best of 3: 264 ms per loop
Are your gridpoints always integral? If so, you could use numpy.ndindex
print list(np.ndindex(2,2))
Higher dimensions:
print list(np.ndindex(2,2,2))
Unfortunately, this does not meet the requirements of the OP since the integral assumption (starting with 0) is not met. I'll leave this answer only in case someone else is looking for the same thing where those assumptions are true.
Another way to do this relies on zip:
g = np.meshgrid([0,1],[0,1])
zip(*(x.flat for x in g))
This portion scales nicely to arbitrary dimensions. Unfortunately, np.meshgrid doesn't scale well to multiple dimensions, so that part will need to be worked out, or (assuming it works), you could use this SO answer to create your own ndmeshgrid function.
Yet another way to do it is:
np.indices((2,2)).T.reshape(-1,2)
Which can be generalized to higher dimensions, e.g.:
In [60]: np.indices((2,2,2)).T.reshape(-1,3)
Out[60]:
array([[0, 0, 0],
[1, 0, 0],
[0, 1, 0],
[1, 1, 0],
[0, 0, 1],
[1, 0, 1],
[0, 1, 1],
[1, 1, 1]])
To get the coordinates of a grid from 0 to 1, a reshape can do the work. Here are examples for 2D and 3D. Also works with floats.
grid_2D = np.mgrid[0:2:1, 0:2:1]
points_2D = grid_2D.reshape(2, -1).T
grid_3D = np.mgrid[0:2:1, 0:2:1, 0:2:1]
points_3D = grid_3D.reshape(3, -1).T
A simple example in 3D (can be extended to N-dimensions I guess, but beware of the final dimension and RAM usage):
import numpy as np
ndim = 3
xmin = 0.
ymin = 0.
zmin = 0.
length_x = 1000.
length_y = 1000.
length_z = 50.
step_x = 1.
step_y = 1.
step_z = 1.
x = np.arange(xmin, length_x, step_x)
y = np.arange(ymin, length_y, step_y)
z = np.arange(zmin, length_z, step_z)
%timeit xyz = np.array(np.meshgrid(x, y, z)).T.reshape(-1, ndim)
in: 2.76 s ± 185 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
which yields:
In [2]: xyx
Out[2]:
array([[ 0., 0., 0.],
[ 0., 1., 0.],
[ 0., 2., 0.],
...,
[999., 997., 49.],
[999., 998., 49.],
[999., 999., 49.]])
In [4]: xyz.shape
Out[4]: (50000000, 3)
Python 3.6.9
Numpy: 1.19.5
I am using the following to convert meshgrid to M X 2 array. Also changes the list of vectors to iterators can make it really fast.
import numpy as np
# Without iterators
x_vecs = [np.linspace(0,1,1000), np.linspace(0,1,1000)]
%timeit np.reshape(np.meshgrid(*x_vecs),(2,-1)).T
6.85 ms ± 93.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
# With iterators
x_vecs = iter([np.linspace(0,1,1000), np.linspace(0,1,1000)])
%timeit np.reshape(np.meshgrid(*x_vecs),(2,-1)).T
5.78 µs ± 172 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
for N-D array using generator
vec_dim = 3
res = 100
# Without iterators
x_vecs = [np.linspace(0,1,res) for i in range(vec_dim)]
>>> %timeit np.reshape(np.meshgrid(*x_vecs),(vec_dim,-1)).T
11 ms ± 124 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
# With iterators
x_vecs = (np.linspace(0,1,res) for i in range(vec_dim))
>>> %timeit np.reshape(np.meshgrid(*x_vecs),(vec_dim,-1)).T
5.54 µs ± 32.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)