Find positions of elements in sorted array - python

Suppose I have a numpy array (all elements are unique) that I want to sort in descending order. I need to find out which positions the elements of the initial array will take in the sorted array.
Example.
In1: [1, 2, 3] # Input
Out1: [2, 1, 0] # Expected output
In2: [1, -2, 2] # Input
Out2: [1, 2, 0] # Expected output
I tried this one:
def find_positions(A):
    A = np.array(A)
    A_sorted = np.sort(A)[::-1]
    return np.argwhere(A[:, None] == A_sorted[None, :])[:, 1]
But it doesn't work when the input array is very large (len > 100000). What did I do wrong, and how can I resolve it?

Approach #1
We could use double argsort -
np.argsort(a)[::-1].argsort() # a is input array/list
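The broadcast comparison in the question builds an n-by-n boolean array (roughly 10 GB for n = 100000), while argsort needs only O(n) extra memory. A minimal sketch that checks the double argsort against the question's two examples and against the original function on a small input; the helper name positions_desc and the test values are only for illustration:

```python
import numpy as np

def find_positions(A):
    # The question's approach: O(n^2) memory, so only usable on small inputs.
    A = np.array(A)
    A_sorted = np.sort(A)[::-1]
    return np.argwhere(A[:, None] == A_sorted[None, :])[:, 1]

def positions_desc(a):
    # Hypothetical helper wrapping Approach #1 (double argsort).
    return np.argsort(a)[::-1].argsort()

print(positions_desc([1, 2, 3]).tolist())   # [2, 1, 0]
print(positions_desc([1, -2, 2]).tolist())  # [1, 2, 0]

small = [7, -3, 10, 4]  # arbitrary small input with unique values
print(np.array_equal(find_positions(small), positions_desc(small)))  # True
```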
Approach #2
We could use one argsort and then array-assignment -
# https://stackoverflow.com/a/41242285/ @Andras Deak
def argsort_unique(idx):
    n = idx.size
    sidx = np.empty(n, dtype=int)
    sidx[idx] = np.arange(n)
    return sidx
out = argsort_unique(np.argsort(a)[::-1])
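Inverting a permutation by assignment takes O(n) time, so Approach #2 replaces the second sort with a single scatter. A quick sanity check that both approaches agree, using a random permutation as stand-in input (the seed and the size are arbitrary choices):

```python
import numpy as np

def argsort_unique(idx):
    # Invert a permutation: sidx[idx[k]] = k for every k.
    n = idx.size
    sidx = np.empty(n, dtype=int)
    sidx[idx] = np.arange(n)
    return sidx

rng = np.random.default_rng(0)
a = rng.permutation(200_000)  # unique values, as the question assumes

out1 = np.argsort(a)[::-1].argsort()        # Approach #1
out2 = argsort_unique(np.argsort(a)[::-1])  # Approach #2
print(np.array_equal(out1, out2))  # True
```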

Take a look at numpy.argsort(...) function:
Returns the indices that would sort an array.
Perform an indirect sort along the given axis using the algorithm specified by the kind keyword. It returns an array of indices of the same shape as a that index data along the given axis in sorted order.
Here is the reference from the documentation, and the following is a simple example:
import numpy
arr = numpy.random.rand(100000)
indexes = numpy.argsort(arr)
The indexes array will contain the indices of arr in the order that sorts it, so arr[indexes] is arr sorted in ascending order.
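Note that argsort on its own returns the indices that sort the array, which is the inverse of what the question asks for; applying argsort a second time turns those indices into positions. A small sketch with made-up values:

```python
import numpy as np

arr = np.array([0.3, 0.1, 0.2])   # made-up values for illustration
indexes = np.argsort(arr)         # indices that sort arr in ascending order
positions = np.argsort(indexes)   # position each element takes in the sorted array

print(indexes.tolist())       # [1, 2, 0]
print(arr[indexes].tolist())  # [0.1, 0.2, 0.3]
print(positions.tolist())     # [2, 0, 1]
```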

I face the same problem for plain lists, and would like to avoid using numpy. So I propose a possible solution that should also work for an np.array, and which avoids reversal of the result:
def argsort(A, key=None, reverse=False):
    "Indirect sort of list or array A: return indices of elements in order."
    keyfunc = (lambda i: A[i]) if key is None else (lambda i: key(A[i]))
    # key must be passed by keyword: sorted() accepts no positional key in Python 3
    return sorted(range(len(A)), key=keyfunc, reverse=reverse)
Example of use:
>>> L = [3,1,4,1,5,9,2,6]
>>> argsort( L )
[1, 3, 6, 0, 2, 4, 7, 5]
>>> [L[i]for i in _]
[1, 1, 2, 3, 4, 5, 6, 9]
>>> argsort( L, key=lambda x:(x%2,x) ) # even elements first
[6, 2, 7, 1, 3, 0, 4, 5]
>>> [L[i]for i in _]
[2, 4, 6, 1, 1, 3, 5, 9]
>>> argsort( L, key=lambda x:(x%2,x), reverse = True)
[5, 4, 0, 1, 3, 7, 2, 6]
>>> [L[i]for i in _]
[9, 5, 3, 1, 1, 6, 4, 2]
Feedback would be welcome! (Efficiency compared to previously proposed solutions? Suggestions for improvements?)
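To get from this argsort to the positions asked for in the question, the returned order still has to be inverted. A sketch in pure Python, where positions is a hypothetical helper name and the key function is passed to sorted by keyword, as Python 3 requires:

```python
def argsort(A, key=None, reverse=False):
    "Indirect sort of list or array A: return indices of elements in order."
    keyfunc = (lambda i: A[i]) if key is None else (lambda i: key(A[i]))
    return sorted(range(len(A)), key=keyfunc, reverse=reverse)

def positions(A, reverse=False):
    # Hypothetical helper: invert the sort order so that
    # pos[i] is the place element i takes in the sorted sequence.
    order = argsort(A, reverse=reverse)
    pos = [0] * len(A)
    for rank, i in enumerate(order):
        pos[i] = rank
    return pos

print(positions([1, 2, 3], reverse=True))   # [2, 1, 0]
print(positions([1, -2, 2], reverse=True))  # [1, 2, 0]
```

The sort dominates at O(n log n); the inversion loop is linear.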

Related

How do I find the sorted position/index of a list in python?

For example, I have a list:
[12, 1205, 102, 6]
I want to get
[1, 3, 2, 0]
Because this is the position they are supposed to be in if the list is sorted.
How can I achieve this in python?
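Use a double numpy.argsort:
import numpy as np
l = [12, 1205, 102, 6]
out = np.argsort(np.argsort(l)).tolist()
Output: [1, 3, 2, 0]
Self-indexing the sort order (x = np.argsort(l); x[x]) also returns [1, 3, 2, 0] for this input, but only by coincidence: it applies the permutation twice, which is not its inverse in general.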
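A quick check, on an input chosen purely for illustration, that the double argsort yields the ranks while indexing the sort order with itself (x[x]) does not in general:

```python
import numpy as np

l = [2, 3, 4, 1]   # illustrative input whose sort order is a 4-cycle
x = np.argsort(l)  # [3, 0, 1, 2]

print(np.argsort(x).tolist())  # [1, 2, 3, 0]  (the ranks)
print(x[x].tolist())           # [2, 3, 0, 1]  (not the ranks)
```

x[x] composes the permutation with itself, which matches the inverse only when applying the permutation three times gives the identity, as happens to be the case for [12, 1205, 102, 6].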
older (inefficient) answer
IIUC, you want the sorted rank:
sorted_list = sorted(your_list)
[sorted_list.index(x) for x in your_list]
You can get a list of indexes in the order of sorted values using the sorted() function, then use this to set the positions in the resulting list:
L = [12, 1205, 102, 6]
P = sorted(range(len(L)),key=L.__getitem__) # positions in sorted order
S = [None]*len(L) # resulting list
for p,i in enumerate(P): S[i]=p # assign position at original indexes
print(S) # [1, 3, 2, 0]
The equivalent solution using numpy could look like this:
S = np.zeros(len(L), dtype=int) # prepare resulting array (np.int was removed in NumPy 1.24)
S[np.argsort(L)] = np.arange(len(L)) # assign positions at indexes
print(S) # [1 3 2 0]
Sorting your list can be done with numpy:
numpy.argsort([2, 3, 5, 1, 4])

Return indexes of list in same order as input

I have the following list:
inputList = [5, 2, 1]
which corresponds to the indexes in another dataframe
[4,1,0]
However, to get the indexes I always get the sorted indexes ([0,1,4]), not the inputList index order:
idx = df[df['id'].isin(map(str, sorted(inputList)))].index.tolist()
How can I get it?
If inputList = [5, 2, 1] matches the indexes [4, 1, 0], it is not surprising that sorted(inputList) (which is [1, 2, 5]) matches the indexes [0, 1, 4].
Try without sorted:
idx = df[df['id'].isin(map(str, inputList))].index.tolist()
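Note that isin only tests membership, so the matching rows come back in the frame's own order however inputList is ordered, and dropping sorted may not be enough on its own. A sketch of one order-preserving alternative in plain Python, where the ids list is a hypothetical stand-in for the df['id'] column and is assumed to hold unique string values:

```python
ids = ["1", "2", "3", "4", "5"]  # hypothetical stand-in for df['id']
input_list = [5, 2, 1]

# Map each id to its row position once, then look ids up in input order.
position_of = {value: i for i, value in enumerate(ids)}
idx = [position_of[str(v)] for v in input_list]

print(idx)  # [4, 1, 0]
```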

Is there a consistent expression for python list reverse selection?

Suppose I want to select a Python list in reverse, down to (but not including) index n, like:
n = 3
l = [1,2,3,4,5,6]
s = l[5:n:-1] # s is [6, 5]
OK, it works, but how can I set n's value to select the whole list?
Let's look at this example, where I expect the first line to be [5, 4, 3, 2, 1]:
In [40]: for i in range(-1, 5):
    ...:     print(l[4:i:-1])
    ...:
[]
[5, 4, 3, 2]
[5, 4, 3]
[5, 4]
[5]
[]
If the bound n is set to 0, the result loses the element at index 0, but if n is -1, the result is empty because -1 means "the last one".
The only way I have found is:
if n < 0:
    s = l[5::-1]
else:
    s = l[5:n:-1]
which is a bit confusing.
To fully reverse the list by slicing:
l = [1,2,3,4,5,6]
print(l[::-1])
#[6, 5, 4, 3, 2, 1]
If you want to be able to partially or fully reverse the list based on the value of n, you can do it like this:
l = [1,2,3,4,5,6]
def custom_reverse(l,n):
    return [l[i] for i in range(len(l)-1,n,-1)]
print(custom_reverse(l,3)) #[6, 5]
print(custom_reverse(l,-1)) #[6, 5, 4, 3, 2, 1]
Hopefully this is what you mean.
print(l[n+1:][::-1]) # take everything after index n, then reverse; also works for n = -1
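A stop of None tells a slice to run all the way to the end in the step direction, so mapping a negative n to None gives a single expression for both cases. A sketch, with reverse_down_to as a hypothetical helper name:

```python
l = [1, 2, 3, 4, 5, 6]

def reverse_down_to(seq, n):
    # Elements after index n, in reverse order; n = -1 selects the whole list.
    stop = n if n >= 0 else None
    return seq[:stop:-1]

print(reverse_down_to(l, 3))   # [6, 5]
print(reverse_down_to(l, 0))   # [6, 5, 4, 3, 2]
print(reverse_down_to(l, -1))  # [6, 5, 4, 3, 2, 1]
```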

Combine elements from two lists

I want to merge two arrays in python in a special way.
The entries at odd positions (counting from 1) of my output array out shall be the corresponding entries of my first input array in0. The entries at even positions in out shall be the corresponding entries of my second input array in1.
in0, in1 and out are all the same length.
Example:
The input arrays
in0 = [0, 1, 2, 3]
in1 = [4, 5, 6, 7]
shall be merged into the output array
out = [0, 5, 2, 7]
Is there a nicer way than to loop over the whole length of the inputs and fill my out 'by hand'?
You could use a list comprehension and select values from in0 on even indices and in1 on odd indices:
[in0[i] if i % 2 == 0 else in1[i] for i in range(len(in0))]
# [0, 5, 2, 7]
If you're happy to make a full list copy, this is simple with slicing:
>>> in0 = [0, 1, 2, 3]
>>> in1 = [4, 5, 6, 7]
>>> out = in0[:]
>>> out[1::2] = in1[1::2]
>>> out
[0, 5, 2, 7]
If you don't mind some verbosity...
from itertools import cycle
in0 = [0, 1, 2, 3]
in1 = [4, 5, 6, 7]
out = [pair[i] for pair, i in zip(zip(in0, in1), cycle([0,1]))]
How it works:
1. zip(in0, in1) is a sequence of tuples, (0,4), (1,5), (2,6), (3,7).
2. cycle([0,1]) is an endless stream of alternating 0s and 1s to be used as indices into the tuples from step 1.
3. zip(zip(...), cycle(...)) pairs each tuple with an index:
((0,4), 0), ((1,5), 1), ((2,6), 0), ((3,7), 1).
4. The list comprehension takes the correct element from each tuple.
In the end, the list comprehension is a general version of
[(0,4)[0], (1,5)[1], (2,6)[0], (3,7)[1]]
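The steps above can be traced by materialising the intermediate values; a minimal sketch in which the names pairs and steps are introduced only for illustration:

```python
from itertools import cycle

in0 = [0, 1, 2, 3]
in1 = [4, 5, 6, 7]

pairs = list(zip(in0, in1))              # step 1: element-wise pairs
steps = list(zip(pairs, cycle([0, 1])))  # step 3: each pair with an alternating index
out = [pair[i] for pair, i in steps]     # step 4: pick one element per pair

print(pairs)  # [(0, 4), (1, 5), (2, 6), (3, 7)]
print(steps)  # [((0, 4), 0), ((1, 5), 1), ((2, 6), 0), ((3, 7), 1)]
print(out)    # [0, 5, 2, 7]
```

zip stops at the shorter input, so the endless cycle is cut off after four items.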
Without using loops, but not in the exact same order you requested:
>>> in0 = [0, 1, 2, 3]
>>> in1 = [4, 5, 6, 7]
>>> out = in0[0::2] + in1[1::2]
>>> out
[0, 2, 5, 7]
EDIT: correcting the output order with itertools:
>>> import itertools
>>> in0 = [0, 1, 2, 3]
>>> in1 = [4, 5, 6, 7]
>>> out = list(itertools.chain(*zip(in0[0::2], in1[1::2])))
>>> out
[0, 5, 2, 7]

How to replace numbers with order in (python) list

I have a list containing integers and want to replace them so that the element which previously contained the highest number now contains a 1, the second highest number set to 2, etc etc.
Example:
[5, 6, 34, 1, 9, 3] should yield [4, 3, 1, 6, 2, 5].
I personally only care about the first 9 highest numbers, but I thought there might be a simple algorithm or possibly even a python function to take care of this task?
Edit: I don't care how duplicates are handled.
A fast way to do this is to first generate a list of tuples of the element and its position:
sort_data = [(x,i) for i,x in enumerate(data)]
next we sort these elements in reverse:
sort_data = sorted(sort_data,reverse=True)
which generates (for your sample input):
>>> sort_data
[(34, 2), (9, 4), (6, 1), (5, 0), (3, 5), (1, 3)]
and next we need to fill in these elements like:
result = [0]*len(data)
for i,(_,idx) in enumerate(sort_data,1):
    result[idx] = i
Or putting it together:
def obtain_rank(data):
    sort_data = [(x,i) for i,x in enumerate(data)]
    sort_data = sorted(sort_data,reverse=True)
    result = [0]*len(data)
    for i,(_,idx) in enumerate(sort_data,1):
        result[idx] = i
    return result
This approach runs in O(n log n) time, with n the number of elements in data.
A more compact algorithm (in the sense that no tuples are constructed for the sorting) is:
def obtain_rank(data):
    sort_data = sorted(range(len(data)),key=lambda i:data[i],reverse=True)
    result = [0]*len(data)
    for i,idx in enumerate(sort_data,1):
        result[idx] = i
    return result
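Since the question only cares about the highest few numbers, a full sort is not strictly needed. A sketch of a top-k variant using heapq.nlargest from the standard library; the function name top_k_ranks and the choice to leave the remaining entries as None are assumptions made for illustration:

```python
import heapq

def top_k_ranks(data, k):
    # Indices of the k largest values, largest first.
    top = heapq.nlargest(k, range(len(data)), key=lambda i: data[i])
    result = [None] * len(data)  # entries outside the top k stay None
    for rank, idx in enumerate(top, 1):
        result[idx] = rank
    return result

print(top_k_ranks([5, 6, 34, 1, 9, 3], 3))  # [None, 3, 1, None, 2, None]
```

With k equal to the length of the list, this returns the same ranking as obtain_rank.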
Another option: you can use the rankdata function from scipy, which provides options for handling duplicates:
from scipy.stats import rankdata
lst = [5, 6, 34, 1, 9, 3]
rankdata(list(map(lambda x: -x, lst)), method='ordinal')
# array([4, 3, 1, 6, 2, 5])
Assuming you do not have any duplicates, the following list comprehension will do:
lst = [5, 6, 34, 1, 9, 3]
tmp_sorted = sorted(lst, reverse=True) # kudos to @Wondercricket
res = [tmp_sorted.index(x) + 1 for x in lst] # [4, 3, 1, 6, 2, 5]
To understand how it works, you can break it up into pieces like so:
lst = [5, 6, 34, 1, 9, 3]
# let's see what the sorted returns
print(sorted(lst, reverse=True)) # [34, 9, 6, 5, 3, 1]
# biggest to smallest. that is handy.
# Since it returns a list, i can index it. Let's try with 6
print(sorted(lst, reverse=True).index(6)) # 2
# oh, python is 0-index, let's add 1
print(sorted(lst, reverse=True).index(6) + 1) # 3
# that's more like it. now the same for all elements of original list
for x in lst:
    print(sorted(lst, reverse=True).index(x) + 1) # 4, 3, 1, 6, 2, 5
# too verbose and not a list yet..
res = [sorted(lst, reverse=True).index(x) + 1 for x in lst]
# but now we are sorting in every iteration... let's store the sorted one instead
tmp_sorted = sorted(lst, reverse=True)
res = [tmp_sorted.index(x) + 1 for x in lst]
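Each call to list.index is a linear scan, so the comprehension above is quadratic overall. A sketch of the same idea with a dictionary lookup instead, still assuming no duplicates; the name rank_of is introduced here for illustration:

```python
lst = [5, 6, 34, 1, 9, 3]

# Build value -> rank once (rank 1 = largest), then look each value up.
rank_of = {x: rank for rank, x in enumerate(sorted(lst, reverse=True), 1)}
res = [rank_of[x] for x in lst]

print(res)  # [4, 3, 1, 6, 2, 5]
```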
Using numpy.argsort:
numpy.argsort returns the indices that would sort an array.
>>> xs = [5, 6, 34, 1, 9, 3]
>>> import numpy as np
>>> np.argsort(np.argsort(-np.array(xs))) + 1
array([4, 3, 1, 6, 2, 5])
A short, log-linear solution using pure Python, and no look-up tables.
The idea: store the positions in a list of pairs, then sort the list to reorder the positions.
enum1 = lambda seq: enumerate(seq, start=1) # We want 1-based positions
def replaceWithRank(xs):
    # pos = position in the original list, rank = position in the top-down sorted list.
    vp = sorted([(value, pos) for (pos, value) in enum1(xs)], reverse=True)
    pr = sorted([(pos, rank) for (rank, (_, pos)) in enum1(vp)])
    return [rank for (_, rank) in pr]
assert replaceWithRank([5, 6, 34, 1, 9, 3]) == [4, 3, 1, 6, 2, 5]
