Related
I think it must be easy, but I cannot google it. Suppose I have array of numbers 1, 2, 3, 4.
import numpy as np
a = np.array([1,2,3,4])
How to index array if I want sequence 2, 3, 4, 1??
I know that for sequence 2, 3, 4 I can choose e.g.:
print(a[1::1])
If you want to rotate the list, you can use a deque instead of a numpy array. This data structure is designed for this kind of operation and directly provides a rotate function.
>>> from collections import deque
>>> a = deque([1, 2, 3, 4])
>>> a.rotate(-1)
>>> a
deque([2, 3, 4, 1])
If you want to use Numpy, you can check out the roll function.
>>> import numpy as np
>>> a = np.array([1,2,3,4])
>>> np.roll(a, -1)
array([2, 3, 4, 1])
One possible way is to define index set (a list).
index_set = [1, 2, 3, 0]
print(a[index_set])
Is there any way to get the indices of several elements in a NumPy array at once?
E.g.
import numpy as np
a = np.array([1, 2, 4])
b = np.array([1, 2, 3, 10, 4])
I would like to find the index of each element of a in b, namely: [0,1,4].
I find the solution I am using a bit verbose:
import numpy as np
a = np.array([1, 2, 4])
b = np.array([1, 2, 3, 10, 4])
c = np.zeros_like(a)
for i, aa in np.ndenumerate(a):
c[i] = np.where(b == aa)[0]
print('c: {0}'.format(c))
Output:
c: [0 1 4]
You could use in1d and nonzero (or where for that matter):
>>> np.in1d(b, a).nonzero()[0]
array([0, 1, 4])
This works fine for your example arrays, but in general the array of returned indices does not honour the order of the values in a. This may be a problem depending on what you want to do next.
In that case, a much better answer is the one #Jaime gives here, using searchsorted:
>>> sorter = np.argsort(b)
>>> sorter[np.searchsorted(b, a, sorter=sorter)]
array([0, 1, 4])
This returns the indices for values as they appear in a. For instance:
a = np.array([1, 2, 4])
b = np.array([4, 2, 3, 1])
>>> sorter = np.argsort(b)
>>> sorter[np.searchsorted(b, a, sorter=sorter)]
array([3, 1, 0]) # the other method would return [0, 1, 3]
This is a simple one-liner using the numpy-indexed package (disclaimer: I am its author):
import numpy_indexed as npi
idx = npi.indices(b, a)
The implementation is fully vectorized, and it gives you control over the handling of missing values. Moreover, it works for nd-arrays as well (for instance, finding the indices of rows of a in b).
All of the solutions here recommend using a linear search. You can use np.argsort and np.searchsorted to speed things up dramatically for large arrays:
sorter = b.argsort()
i = sorter[np.searchsorted(b, a, sorter=sorter)]
For an order-agnostic solution, you can use np.flatnonzero with np.isin (v 1.13+).
import numpy as np
a = np.array([1, 2, 4])
b = np.array([1, 2, 3, 10, 4])
res = np.flatnonzero(np.isin(a, b)) # NumPy v1.13+
res = np.flatnonzero(np.in1d(a, b)) # earlier versions
# array([0, 1, 2], dtype=int64)
There are a bunch of approaches for getting the index of multiple items at once mentioned in passing in answers to this related question: Is there a NumPy function to return the first index of something in an array?. The wide variety and creativity of the answers suggests there is no single best practice, so if your code above works and is easy to understand, I'd say keep it.
I personally found this approach to be both performant and easy to read: https://stackoverflow.com/a/23994923/3823857
Adapting it for your example:
import numpy as np
a = np.array([1, 2, 4])
b_list = [1, 2, 3, 10, 4]
b_array = np.array(b_list)
indices = [b_list.index(x) for x in a]
vals_at_indices = b_array[indices]
I personally like adding a little bit of error handling in case a value in a does not exist in b.
import numpy as np
a = np.array([1, 2, 4])
b_list = [1, 2, 3, 10, 4]
b_array = np.array(b_list)
b_set = set(b_list)
indices = [b_list.index(x) if x in b_set else np.nan for x in a]
vals_at_indices = b_array[indices]
For my use case, it's pretty fast, since it relies on parts of Python that are fast (list comprehensions, .index(), sets, numpy indexing). Would still love to see something that's a NumPy equivalent to VLOOKUP, or even a Pandas merge. But this seems to work for now.
I have two arrays with 3 elements in each.
reduction_combs = [2, 3, 7]
elements = [3,6,8]
Is there a shortway to compute new array which is :
c = [2**3 , 3**6, 7**8]
This can be achieved using a simple list comprehension.
[x ** y for (x, y) in zip(elements, reduction_combs)]
Yes, you can just do [x**y for (x,y) in zip(reduction_combs, elements)]
You can also use map with lambda expressions passing two lists:
c = list(map(lambda x,y: x**y, reduction_combs, elements))
Where x and y will be values from reduction_combs and elements, respectively.
In addition to the zip method, this is another way using enumerate and list comprehension. Here j is the element of reduction_combs and i is the corresponding index using which you fetch the power to be raised from elements
c = [j**elements[i] for i, j in enumerate(reduction_combs)]
Using numpy arrays:
import numpy as np
a = np.array([2, 3, 7])
b = np.array([3, 6, 8])
a ** b
# output: array([8, 729,5764801])
You can do this either using numpy:
import numpy
reduction_combs = numpy.array([2, 3, 7])
elements = numpy.array([3, 6, 8])
c = reduction_combs ** elements
or if you want to do it with plain python, you might want to consider list comprehension:
c = [reduction_combs[i] ** elements[i] for i in range(len(reduction_combs))]
You should learn a bit more about what lists do in python and if you often work with arrays, get used to work with numpy!
If you like functional style as an alternative to the excellent list comprehension proposed by tda, here's a solution with operator.pow and itertools.starmap.
>>> from operator import pow
>>> from itertools import starmap
>>> list(starmap(pow, zip(reduction_combs, elements)))
[8, 729, 5764801]
In addition, since you tagged numpy, leveraging element-wise vectorized operations makes for a very straight forward solution.
>>> import numpy as np
>>> r = np.array(reduction_combs)
>>> e = np.array(elements)
>>>
>>> r**e
array([ 8, 729, 5764801])
You could use the numpy power function:
import numpy as np
reduction_combs = [2, 3, 7]
elements = [3, 6, 8]
print(np.power(reduction_combs, elements))
Output
[ 8 729 5764801]
If you want the output as a list simply do:
np.power(reduction_combs, elements).tolist()
One of the quick solution would be:
c = [a**b for a,b in zip(reduction_combs, elements)]
You can also try using numpy as below:
import numpy as np
c = np.power(reduction_combs, elements)
Using enumerate and pow
c = [pow(v, elements[i]) for i, v in enumerate(reduction_combs)]
I have two arrays:
>>> import numpy as np
>>> a=np.array([2, 1, 3, 3, 3])
>>> b=np.array([1, 2, 3, 3, 3])
What is the fastest way of comparing these two arrays for equality of elements, regardless of the order?
EDIT
I measured for the execution times of the following functions:
def compare1(): #works only for arrays without redundant elements
a=np.array([1,2,3,5,4])
b=np.array([2,1,3,4,5])
temp=0
for i in a:
temp+=len(np.where(b==i)[0])
if temp==5:
val=True
else:
val=False
return 0
def compare2():
a=np.array([1,2,3,3,3])
b=np.array([2,1,3,3,3])
val=np.all(np.sort(a)==np.sort(b))
return 0
def compare3(): #thx to ODiogoSilva
a=np.array([1,2,3,3,3])
b=np.array([2,1,3,3,3])
val=set(a)==set(b)
return 0
import numpy.lib.arraysetops as aso
def compare4(): #thx to tom10
a=np.array([1,2,3,3,3])
b=np.array([2,1,3,3,3])
val=len(aso.setdiff1d(a,b))==0
return 0
The results are:
>>> import timeit
>>> timeit.timeit(compare1,number=1000)
0.0166780948638916
>>> timeit.timeit(compare2,number=1000)
0.016178131103515625
>>> timeit.timeit(compare3,number=1000)
0.008063077926635742
>>> timeit.timeit(compare4,number=1000)
0.03257489204406738
Seems like the "set"-method by ODiogoSilva is the fastest.
Do you know other methods that I can test as well?
EDIT2
The runtime above was not the right measure for comparing arrays, as explained in a comment by user2357112.
#test.py
import numpy as np
import numpy.lib.arraysetops as aso
#without duplicates
N=10000
a=np.arange(N,0,step=-2)
b=np.arange(N,0,step=-2)
def compare1():
temp=0
for i in a:
temp+=len(np.where(b==i)[0])
if temp==len(a):
val=True
else:
val=False
return val
def compare2():
val=np.all(np.sort(a)==np.sort(b))
return val
def compare3():
val=set(a)==set(b)
return val
def compare4():
val=len(aso.setdiff1d(a,b))==0
return val
The output is:
>>> from test import *
>>> import timeit
>>> timeit.timeit(compare1,number=1000)
101.16708397865295
>>> timeit.timeit(compare2,number=1000)
0.09285593032836914
>>> timeit.timeit(compare3,number=1000)
1.425955057144165
>>> timeit.timeit(compare4,number=1000)
0.44780397415161133
Now compare2 is the fastest. Is there still a method that could outgun this?
Numpy as a collection of set operations.
import numpy as np
import numpy.lib.arraysetops as aso
a=np.array([2, 1, 3, 3, 3])
b=np.array([1, 2, 3, 3, 3])
print aso.setdiff1d(a, b)
To see if both arrays contain the same kind of elements, in this case [1,2,3], you could do:
import numpy as np
a=np.array([2, 1, 3, 3, 3])
b=np.array([1, 2, 3, 3, 3])
set(a) == set(b)
# True
Is there a straightforward way to RETURN a shuffled array in Python rather than shuffling it in place?
e.g., instead of
x = [array]
random.shuffle(x)
I'm looking for something like
y = shuffle(x)
which maintains x.
Note, I am not looking for a function, not something like:
x=[array]
y=x
random.shuffle(x)
sorted with a key function that returns a random value:
import random
sorted(l, key=lambda *args: random.random())
Or
import os
sorted(l, key=os.urandom)
It would be pretty simple to implement your own using random. I would write it as follows:
def shuffle(l):
l2 = l[:] #copy l into l2
random.shuffle(l2) #shuffle l2
return l2 #return shuffled l2
Just write your own.
import random
def shuffle(x):
x = list(x)
random.shuffle(x)
return x
x = range(10)
y = shuffle(x)
print x # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print y # [2, 5, 0, 4, 9, 3, 6, 1, 7, 8]
There is no function you are looking for. Just copy a list.
You could use numpy.random.permutation for either a list or array, but is the right function if you have a numpy array already. For lists with mixed types, converting to a numpy array will do type conversions.
import numpy as np
my_list = ['foo', 'bar', 'baz', 42]
print list(np.random.permutation(my_list))
# ['bar', 'baz', '42', 'foo']
Using this as a demo elsewhere so thought it may be worth sharing:
import random
x = shuffleThis(x)
def shuffleThis(y):
random.shuffle(y)
return(y)
#end of Shuffle Function
Hope this is useful.