How do I sort with multiple conditions in Python?

6 7 5 2 12
0 2 3 6 12
2 8 5 4 13
4 3 5 7 14
def getMinIndex(list, start, stop):
    n=len(list)
    min_index = start
    for i in range(start,stop):
        if list[i][4] < list[min_index][4]:
            min_index = i
    return min_index
def swapElements(list,i,j):
    temp = list[i]
    list[i] = list[j]
    list[j] = temp
With this code I manage to sort by the last element of each row (index 4), but I'm having trouble also sorting by the first element, as I want the result to look like this:
0 2 3 6 12
6 7 5 2 12
2 8 5 4 13
4 3 5 7 14
So if the last elements are equal after sorting, I want to sort by the first element. Can anyone help? Thanks :D

What you're looking for is a key function for items on your list. Let me illustrate with an example.
Sorting keys
Suppose you have a list of people. You want to sort them by height. You can't just sort them using sorted because they aren't comparable by default the way numbers are. You need to specify the key used for sorting. The key is the characteristic you want to sort on. In this case it could look like:
sorted(people, key=lambda person: person.height)
or, if you find the lambda notation confusing:
def get_height(person):
    return person.height
sorted(people, key=get_height)
Sorting tuples
A tuple is a finite sequence of items: (2,3) (2-tuple or pair), (-3, 2, 1) (3-tuple) and so on. Tuples are compared lexicographically (item by item, the way words are ordered alphabetically), so a list of tuples sorts automatically. You don't need to do anything.
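As a quick illustration of that default ordering, here is a minimal sketch. The sample tuples are made up for this example and do not come from the question. Python compares the first items and only looks at the second items when the first ones tie:

```python
# Hypothetical sample data, not taken from the question.
pairs = [(2, 3), (1, 5), (2, 1)]

# Tuples are compared item by item, so no key function is needed here.
ordered = sorted(pairs)
print(ordered)  # [(1, 5), (2, 1), (2, 3)]
```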
What's special in your case is that you don't want to sort by the first element, then by the second, and so on. You want to sort by the fifth and then by the first.
This is where keys enter the scene.
Tying it all together
You need a key function that will turn (a, b, c, d, e) into (e, a) which means: sort by the fifth column first and then by the first one:
def sorting_key(item):
    return (item[4], item[0])
Then you can just call:
sorted(items, key=sorting_key)
# or with a lambda
sorted(items, key=lambda item: (item[4], item[0]))
Getting the index corresponding to a minimum
I noticed that your function returns the index of the minimum element. You can sort the whole thing and take the first element. Alternatively, you can use the built-in min function and provide it the sorting key.
The only thing you need to take into account is that min returns the corresponding value, not the index. You can work around this with:
min_index, min_value = min(enumerate(items), key=lambda pair: (pair[1][4], pair[1][0]))
enumerate pairs list items with their indexes so [a, b, c] becomes [(0, a), (1, b), (2, c)]. Then min compares these pairs as if the indexes weren't present: the key function receives each (index, value) pair but ignores the index completely.
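Applied to the rows from the question, the idea can be sketched as follows. The name items and the single-argument lambda are choices made for this sketch (Python 3 does not allow a lambda to unpack a tuple in its parameter list):

```python
# Rows from the question.
items = [
    [6, 7, 5, 2, 12],
    [0, 2, 3, 6, 12],
    [2, 8, 5, 4, 13],
    [4, 3, 5, 7, 14],
]

# Each pair is (index, row); the key looks only at the row,
# comparing the last column first and the first column on ties.
min_index, min_value = min(
    enumerate(items),
    key=lambda pair: (pair[1][4], pair[1][0]),
)
print(min_index, min_value)  # 1 [0, 2, 3, 6, 12]
```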

You can use operator.itemgetter to build a custom sorting key function, which you can pass to one of the built-in sort functions:
> from operator import itemgetter
> lst = [
    [2, 8, 5, 4, 13],
    [6, 7, 5, 2, 12],
    [4, 3, 5, 7, 14],
    [0, 2, 3, 6, 12]
  ]
# use tuple of last and first element as sorting key
> sorted(lst, key=itemgetter(-1, 0))
[
    [0, 2, 3, 6, 12],
    [6, 7, 5, 2, 12],
    [2, 8, 5, 4, 13],
    [4, 3, 5, 7, 14]
]

Related

Comparisons between an arbitrary number of lists of arbitrary length Python

Given an arbitrary number of lists of integers of arbitrary length, I would like to group the integers into new lists based on a given distance threshold.
Input:
l1 = [1, 3]
l2 = [2, 4, 6, 10]
l3 = [12, 13, 15]
threshold = 2
Output:
[1, 2, 3, 4, 6] # group 1
[10, 12, 13, 15] # group 2
The elements of the groups act as a growing chain so first we have
abs(l1[0] - l2[0]) < threshold #true
so l1[0] and l2[0] are in group 1, and then the next check could be
abs(group[-1] - l1[1]) < threshold #true
so now l1[1] is added to group 1
Is there a clever way to do this without first grouping l1 and l2 and then grouping l3 with that output?

Based on the way that you asked the question, it sounds like you just want a basic python solution for utility, so I'll give you a simple solution.
Instead of treating the lists as all separate entities, it's easiest to just utilize a big cluster of non-duplicate numbers. You can exploit the set property of only containing unique values to go ahead and cluster all of the lists together:
# Throws all contents of lists into a set, converts it back to list, and sorts
elems = sorted(list({*l1, *l2, *l3}))
# elems = [1, 2, 3, 4, 6, 10, 12, 13, 15]
If you had a list of lists that you wanted to perform this on:
lists = [l1, l2, l3]
elems = []
[elems.extend(l) for l in lists]
elems = sorted(list(set(elems)))
# elems = [1, 2, 3, 4, 6, 10, 12, 13, 15]
If you want to keep duplicates:
elems = sorted([*l1, *l2, *l3])
# and respectively
elems = sorted(elems)
From there, you can just do the separation iteratively. Specifically:
Go through the elements one-by-one. If the next element is validly spaced, add it to the list you're building on.
When an invalidly-spaced element is encountered, create a new list containing that element, and start appending to the new list instead.
This can be done as follows (note, -1'th index refers to last element):
out = [[elems[0]]]
thresh = 2
for el in elems[1:]:
    if el - out[-1][-1] <= thresh:
        out[-1].append(el)
    else:
        out.append([el])
# out = [[1, 2, 3, 4, 6], [10, 12, 13, 15]]
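The two steps (merge everything, then split on gaps) can be wrapped into one function that accepts any number of lists. This is only a sketch: the name group_by_threshold is made up here, and it assumes duplicates should be dropped, as in the set-based variant above. It also returns an empty list for empty input instead of failing on elems[0]:

```python
from itertools import chain

def group_by_threshold(lists, threshold):
    """Merge all lists, then split the sorted unique values wherever
    two neighbours differ by more than threshold."""
    elems = sorted(set(chain.from_iterable(lists)))
    groups = []
    for el in elems:
        # Extend the current group while the gap to its last element is small enough.
        if groups and el - groups[-1][-1] <= threshold:
            groups[-1].append(el)
        else:
            groups.append([el])
    return groups

groups = group_by_threshold([[1, 3], [2, 4, 6, 10], [12, 13, 15]], 2)
print(groups)  # [[1, 2, 3, 4, 6], [10, 12, 13, 15]]
```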

How could I write a function to find fractional ranking of a list of numbers?

I'm trying to write a code in Python to create a fractional ranking list for a given one.
The fraction ranking is basically the following:
We have a list of numbers x = [4,4,10,4,10,2,4,1,1,2]
First, we need to sort the list in ascending order. I will use insertion sort for it, I already coded this part.
Now we have the sorted list x = [1, 1, 2, 2, 4, 4, 4, 4, 10, 10]. The list has 10 elements and we need to compare it with a list of the first 10 natural numbers n = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
For each element in x we assign a value. Notice the number 1 appears in positions 1 and 2. So, the number 1 receives the rank (1 + 2) / 2 = 1.5.
The number 2 appears in positions 3 and 4, so it receives the rank (3 + 4) / 2 = 3.5.
The number 4 appears in positions 5, 6, 7 and 8, so it receives the rank (5 + 6 + 7 + 8) / 4 = 6.5
The number 10 appears in positions 9 and 10, so it receives the rank (9 + 10) / 2 = 9.5
In the end of this process we need to have a new list of ranks r = [1.5, 1.5, 3.5, 3.5, 6.5, 6.5, 6.5, 6.5, 9.5, 9.5]
I don't want an entire solution, I want some tips to guide me while writing down the code.
I'm trying to use a for loop to make a new list from the elements of the original one, but my first attempt failed badly. I tried to get at least the first elements right, but it didn't work as expected:
# Suppose the list is already sorted.
def ranking(x):
    l = len(x)
    for ele in range(1, l):
        t = x[ele-1]
        m = x.count(t)
        i = 0
        sum = 0
        while i < m: # my intention was to get right at least the rank of the first item of the list
            sum = sum + 1
            i = i + 1
        x[ele] = sum/t
    return x
Any ideas about how I could solve this problem?

Ok, first, for your for loop there you can more easily loop through each element in the list by just saying for i in x:. At least for me, that would make it a little easier to read. Then, to get the rank, maybe loop through again with a nested for loop and check if it equals whatever element you're currently on. I don't know if that makes sense; I didn't want to provide too many details because you said you didn't want the full solution (definitely reply if you want me to explain better).

Here is an idea:
You can use x.count(1) to see how many number 1s you have in list, x.count(2) for number 2 etc.
Also, never use sum as a variable name, since it shadows the built-in function.
Maybe use 2 for loops. First one will go through elements in list x, second one will also go through elements in list x, and if it finds the same element, appends it to new_list.
You can then use something like sum(new_list) and clear list after each iteration.
You don't even need to loop through list n if you use indexing while looping through x: with for i, y in enumerate(x) you can use n[i] to read the value.
If you want the code, I'll post it in a comment.

@VictorPaesPlinio, would you try this sample code for the problem? (It's a partial solution: it does the data aggregation and leaves the final output step as an exercise for you.)
from collections import defaultdict
x = [4, 4, 10, 4, 10, 2, 4, 1, 1, 2]
x.sort()
print(x)
lst = list(range(1, len(x)+1))
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ranking = defaultdict(list)
for idx, num in enumerate(x, 1):
    print(idx, num)
    ranking[num].append(idx)
print(ranking)
'''defaultdict(<class 'list'>, {1: [1, 2], 2: [3, 4],
4: [5, 6, 7, 8], 10: [9, 10]})
'''
r = []
# r = [1.5, 1.5, 3.5, 3.5, 6.5, 6.5, 6.5, 6.5, 9.5, 9.5]
# 1 1 2 2 4 4 4 4 10 10
for key, values in ranking.items():
    # key is the number, values is the list of its positions
    print(key, values, sum(values))
Outputs:
1 [1, 2] 3
2 [3, 4] 7
4 [5, 6, 7, 8] 26
10 [9, 10] 19 # then you can do the final outputs part...
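For readers who do want to see the last step, one possible way to finish is sketched below. It is not part of the answer above: the name positions is made up for this sketch, and it assumes the rank of a number is the mean of the 1-based positions it occupies in the sorted list, as described in the question.

```python
from collections import defaultdict

x = sorted([4, 4, 10, 4, 10, 2, 4, 1, 1, 2])

# Collect the 1-based positions occupied by each distinct number.
positions = defaultdict(list)
for idx, num in enumerate(x, 1):
    positions[num].append(idx)

# Every occurrence of a number gets the mean of that number's positions.
r = [sum(positions[num]) / len(positions[num]) for num in x]
print(r)  # [1.5, 1.5, 3.5, 3.5, 6.5, 6.5, 6.5, 6.5, 9.5, 9.5]
```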

How to identify duplicate integers within a list, then subtract one from each integer following the duplicate?

I am trying to solve an issue that I am currently running into. I want to have a list that is made up of only random integers. If I find a duplicate integer within this list, I want to subtract one from every integer in the rest of the list, starting after the second appearance of the duplicate. Furthermore, if a second pair of duplicates is encountered, the rest of the list should then be reduced by two, after a third pair by three, and so on.
But this should not affect the duplicate number itself or any other duplicated number (one that differs from the first) in the sequence.
For example
mylist = [0 1 2 3 4 5 6 2 8 5 10 11 12 1 14 15 16 17]
I want the end result to look like;
mylist = [0 1 2 3 4 5 6 2 7 5 9 10 11 1 12 13 14 15]
I have some rough code that I created to attempt this, but it always subtracts from the whole list, including the duplicated integers (the first pair and any further pairs).
If someone can shed some light on how to deal with this problem I will be highly grateful!
Sorry forgot to add my code
a = [49, 51, 53, 56, 49, 54, 53, 48]
dupes = list()
number = 1
print (dupes)
while True:
#move integers from a to dupes (one by one)
for i in a[:]:
if i >= 2:
dupes.append(i)
a.remove(i)
if dupes in a:
a = [x - number for x in a]
print (dupes)
print(dupes)
if dupes in a:
a = [x - number for x in a]
number = number+1
break
Forgot to mention earlier: a friend and I are currently working on this problem, and the code I supplied is our rough outline of what it should look like, not the end result. I know that it does not work, so I decided to ask for help with the issue.

You need to iterate through your list and, whenever you encounter a duplicate (list slicing lets you check the earlier items), increase the amount you subtract from the items that follow.
List slicing - example,
>>> L=[2,4,6,8,10]
>>> L[1:5] # all elements from index 1 up to (not including) index 5
[4, 6, 8, 10]
>>> L[3:] # all elements from index 3 till the end of list
[8, 10]
>>> L[:2] # all elements from the beginning of the list up to (not including) index 2
[2, 4]
>>> L[:-2] # all elements from the beginning of the list except the last two
[2, 4, 6]
>>> L[::-1] # reverse the list
[10, 8, 6, 4, 2]
And enumerate returns tuples, each containing a count (from start, which defaults to 0) and the value obtained from iterating over the sequence.
Therefore,
mylist=[0, 1, 2, 3, 4, 5, 6, 2, 8, 5, 10, 11, 12, 1, 14, 15, 16, 17]
dup=0
for index,i in enumerate(mylist):
    if i in mylist[:index]:
        dup+=1
    else:
        mylist[index]-=dup
print(mylist)
Output:
[0, 1, 2, 3, 4, 5, 6, 2, 7, 5, 8, 9, 10, 1, 11, 12, 13, 14]
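A variant of the same idea is sketched below, under two assumptions that differ from the loop above: it builds a new list instead of modifying mylist in place, and it checks each value against the original values seen so far (kept in a set) rather than against already-shifted ones. The function name shift_after_duplicates is made up for this sketch. For the sample list both versions give the same output:

```python
def shift_after_duplicates(values):
    """Return a new list where every non-duplicate value is reduced
    by the number of duplicates encountered before it."""
    seen = set()   # original values met so far
    dup = 0        # how many duplicates have been found
    result = []
    for value in values:
        if value in seen:
            dup += 1
            result.append(value)        # duplicates are kept unchanged
        else:
            seen.add(value)
            result.append(value - dup)
    return result

mylist = [0, 1, 2, 3, 4, 5, 6, 2, 8, 5, 10, 11, 12, 1, 14, 15, 16, 17]
shifted = shift_after_duplicates(mylist)
print(shifted)
# [0, 1, 2, 3, 4, 5, 6, 2, 7, 5, 8, 9, 10, 1, 11, 12, 13, 14]
```

The set lookup avoids rescanning the list prefix for every item, which matters for long lists.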

How to do multiple comparisons while sorting a dictionary of lists in python?

I have a dictionary of lists, summary:
summary = {
'Raonic': [2, 0, 11, 122, 16, 139],
'Halep': [2, 2, 10, 75, 6, 60],
'Kerber': [2, 0, 7, 68, 7, 71],
'Wawrinka': [1, 2, 14, 133, 13, 128],
'Djokovic': [2, 2, 10, 75, 8, 125],
}
I wish to print to the screen (standard output) a summary in decreasing order of ranking, where the ranking is according to criteria 1-6 in that order: compare item 1 of the list, and if equal compare item 2, and if those are equal compare item 3. This comparison continues up to item 4 of the list, in descending order.
However, the comparison of item 5 and item 6 of the lists must be done in ascending order.
Output:
Halep 2 2 10 75 6 60
Djokovic 2 2 10 75 8 125
Raonic 2 0 11 122 16 139
Kerber 2 0 7 68 7 71
Wawrinka 1 2 14 133 13 128
My Solution was:
for key, value in sorted(summary.items(), key=lambda e: e[1][0], reverse = True):
    print(key, end=' ')
    for v in value:
        print(v, end=' ')
    print()
However my solution merely succeeds at sorting on column 1 of the list.

For numeric values, you can negate the value to get the inverse sort. For your sort key, return a sequence of the values that are to be sorted, negating the values that go in the 'opposite' direction:
ranked = sorted(
    summary.items(),
    key=lambda kv: kv[1][:4] + [-kv[1][4], -kv[1][5]],
    reverse=True)
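This produces a list of (key, value) tuples in sorted order, since a dictionary itself cannot be sorted (and before Python 3.7 dictionaries were not guaranteed to preserve any order).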
For ('Halep', [2, 2, 10, 75, 6, 60]), the sort key lambda returns [2, 2, 10, 75, -6, -60], ensuring that this is sorted before 'Djokovic' where the sort key is set to [2, 2, 10, 75, -8, -125], because in reversed sort order (reverse=True), -6 is sorted before -8. However, any difference in value in the first 4 columns is sorted in the other direction, so anything with 3 or more in the first column will be sorted before ('Halep', [...]), or anything starting with 2, 3 or higher, etc.
Where values are not numeric and there is no other 'inverse' value option available for the columns, you'd have to sort twice. First on the last 2 columns (sorting in ascending order), then on the first 4 columns (in descending, so reversed, order):
# non-numeric option
part_sorted = sorted(summary.items(), key=lambda kv: kv[1][-2:])
ranked = sorted(part_sorted, key=lambda kv: kv[1][:4], reverse=True)
This works because Python's sort algorithm (called Timsort) is stable: the relative order of any two inputs for which the sort key is exactly equal is untouched. So any keys A and B, where only the last one or two columns differ, are given a relative ordering in part_sorted that then doesn't change in ranked, because the second sort sees equal keys for them. If A is sorted after B in the first sort, then the second sort will leave A after B.
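To check that the two approaches agree on the data from the question, here is a small self-contained sketch. The variable names negated and two_pass are made up for this comparison:

```python
summary = {
    'Raonic': [2, 0, 11, 122, 16, 139],
    'Halep': [2, 2, 10, 75, 6, 60],
    'Kerber': [2, 0, 7, 68, 7, 71],
    'Wawrinka': [1, 2, 14, 133, 13, 128],
    'Djokovic': [2, 2, 10, 75, 8, 125],
}

# Approach 1: one sort, with the ascending columns negated.
negated = sorted(
    summary.items(),
    key=lambda kv: kv[1][:4] + [-kv[1][4], -kv[1][5]],
    reverse=True)

# Approach 2: two stable sorts, least significant columns first.
part_sorted = sorted(summary.items(), key=lambda kv: kv[1][-2:])
two_pass = sorted(part_sorted, key=lambda kv: kv[1][:4], reverse=True)

for name, stats in two_pass:
    print(name, *stats)
```

The second sort relies on reverse=True keeping items with equal keys in their existing order, which is what lets Halep stay ahead of Djokovic.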

Remove elements that appear more often than once from numpy array

The question is, how can I remove elements that appear more often than once in an array completely. Below you see an approach that is very slow when it comes to bigger arrays.
Any idea of doing this the numpy-way? Thanks in advance.
import numpy as np
count = 0
result = []
input = np.array([[1,1], [1,1], [2,3], [4,5], [1,1]]) # array with points [x, y]
# count appearance of elements with same x and y coordinate
# append to result if element appears just once
for i in input:
    for j in input:
        if (j[0] == i[0]) and (j[1] == i[1]):
            count += 1
    if count == 1:
        result.append(i)
    count = 0
print(np.array(result))
UPDATE: BECAUSE OF FORMER OVERSIMPLIFICATION
Again, to be clear: how can I remove elements that appear more than once with respect to a certain attribute from an array/list? Here: a list with elements of length 6; if the first and second entries of an element both appear together more than once in the list, remove all such elements from the list. Hope I'm not too confusing. Eumiro helped me a lot on this, but I can't manage to flatten the output list as it should be :(
import numpy as np
import collections.abc
input = [[1,1,3,5,6,6],[1,1,4,4,5,6],[1,3,4,5,6,7],[3,4,6,7,7,6],[1,1,4,6,88,7],[3,3,3,3,3,3],[456,6,5,343,435,5]]
# here, from input there should be removed input[0], input[1] and input[4] because
# first and second entry appears more than once in the list, got it? :)
d = {}
for a in input:
    d.setdefault(tuple(a[:2]), []).append(a[2:])
outputDict = [list(k)+list(v) for k,v in d.items() if len(v) == 1 ]
result = []
def flatten(x):
    if isinstance(x, collections.abc.Iterable):
        return [a for i in x for a in flatten(i)]
    else:
        return [x]
# I took flatten(x) from http://stackoverflow.com/a/2158522/1132378
# And I need it, because output is a nested list :(
for i in outputDict:
    result.append(flatten(i))
print(np.array(result))
So, this works, but it's impractical with big lists.
First I got
RuntimeError: maximum recursion depth exceeded in cmp
and after applying
sys.setrecursionlimit(10000)
I got
Segmentation fault
How could I implement Eumiro's solution for big lists (> 100000 elements)?

np.array(list(set(map(tuple, input))))
returns
array([[4, 5],
[2, 3],
[1, 1]])
UPDATE 1: If you want to remove the [1, 1] too (because it appears more than once), you can do:
from collections import Counter
np.array([k for k, v in Counter(map(tuple, input)).items() if v == 1])
returns
array([[4, 5],
[2, 3]])
UPDATE 2: with input=[[1,1,2], [1,1,3], [2,3,4], [4,5,5], [1,1,7]]:
input=[[1,1,2], [1,1,3], [2,3,4], [4,5,5], [1,1,7]]
d = {}
for a in input:
d.setdefault(tuple(a[:2]), []).append(a[2])
d is now:
{(1, 1): [2, 3, 7],
(2, 3): [4],
(4, 5): [5]}
so we want to take all key-value pairs, that have single values and re-create the arrays:
np.array([k+tuple(v) for k,v in d.items() if len(v) == 1])
returns:
array([[4, 5, 5],
[2, 3, 4]])
UPDATE 3: For larger arrays, you can adapt my previous solution to:
import numpy as np
input = [[1,1,3,5,6,6],[1,1,4,4,5,6],[1,3,4,5,6,7],[3,4,6,7,7,6],[1,1,4,6,88,7],[3,3,3,3,3,3],[456,6,5,343,435,5]]
d = {}
for a in input:
d.setdefault(tuple(a[:2]), []).append(a)
np.array([v for v in d.values() if len(v) == 1])
returns:
array([[[456, 6, 5, 343, 435, 5]],
[[ 1, 3, 4, 5, 6, 7]],
[[ 3, 4, 6, 7, 7, 6]],
[[ 3, 3, 3, 3, 3, 3]]])
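Note that the result above is nested one level deeper than the input, because each kept value is a one-element list of rows. If flat rows in their original order are wanted, one possible sketch in plain Python follows. The names rows, key_counts and unique_rows are made up here, and the approach counts the (first, second) key of each row with collections.Counter instead of grouping rows in a dict:

```python
from collections import Counter

rows = [[1, 1, 3, 5, 6, 6], [1, 1, 4, 4, 5, 6], [1, 3, 4, 5, 6, 7],
        [3, 4, 6, 7, 7, 6], [1, 1, 4, 6, 88, 7], [3, 3, 3, 3, 3, 3],
        [456, 6, 5, 343, 435, 5]]

# Count how often each (first, second) pair occurs across all rows.
key_counts = Counter(tuple(row[:2]) for row in rows)

# Keep only rows whose pair occurs exactly once; input order is preserved.
unique_rows = [row for row in rows if key_counts[tuple(row[:2])] == 1]
print(unique_rows)
# [[1, 3, 4, 5, 6, 7], [3, 4, 6, 7, 7, 6], [3, 3, 3, 3, 3, 3], [456, 6, 5, 343, 435, 5]]
```

Both passes are linear in the number of rows and involve no recursion, so the recursion-limit problem from the question does not arise. Passing unique_rows to np.array should then give a two-dimensional array.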

This is a corrected, faster version of Hooked's answer. count_unique counts the number of occurrences of each unique key in keys.
import numpy as np
input = np.array([[1,1,3,5,6,6],
[1,1,4,4,5,6],
[1,3,4,5,6,7],
[3,4,6,7,7,6],
[1,1,4,6,88,7],
[3,3,3,3,3,3],
[456,6,5,343,435,5]])
def count_unique(keys):
    """Finds an index to each unique key (row) in keys and counts the number of
    occurrences for each key"""
    order = np.lexsort(keys.T)
    keys = keys[order]
    diff = np.ones(len(keys)+1, 'bool')
    diff[1:-1] = (keys[1:] != keys[:-1]).any(-1)
    count = np.where(diff)[0]
    count = count[1:] - count[:-1]
    ind = order[diff[1:]]
    return ind, count
key = input[:, :2]
ind, count = count_unique(key)
print(key[ind])
#[[ 1 1]
# [ 1 3]
# [ 3 3]
# [ 3 4]
# [456 6]]
print(count)
#[3 1 1 1 1]
ind = ind[count == 1]
output = input[ind]
print(output)
#[[ 1 3 4 5 6 7]
# [ 3 3 3 3 3 3]
# [ 3 4 6 7 7 6]
# [456 6 5 343 435 5]]

Updated Solution:
From the comments below, the new solution is:
# assumes A is a numpy array and `from numpy import *`
idx = argsort(A[:, 0:2], axis=0)[:,1]
kidx = where(sum(A[idx,:][:-1,0:2]!=A[idx,:][1:,0:2], axis=1)==0)[0]
kidx = unique(concatenate((kidx,kidx+1)))
for n in arange(0,A.shape[0],1):
    if n not in kidx:
        print(A[idx,:][n])
> [1 3 4 5 6 7]
[3 3 3 3 3 3]
[3 4 6 7 7 6]
[456 6 5 343 435 5]
kidx is an index list of the elements you don't want. This preserves rows whose first two elements do not match the first two elements of any other row. Since everything is done with indexing, it should be fast(ish), though it requires a sort on the first two elements. Note that original row order is not preserved, though I don't think this is a problem.
Old Solution:
If I understand it correctly, you simply want to filter out the results of a list of lists where the first element of each inner list is equal to the second element.
With your input from your update A=[[1,1,3,5,6,6],[1,1,4,4,5,6],[1,3,4,5,6,7],[3,4,6,7,7,6],[1,1,4,6,88,7],[3,3,3,3,3,3],[456,6,5,343,435,5]], the following line removes A[0],A[1] and A[4]. A[5] is also removed since that seems to match your criteria.
[x for x in A if x[0]!=x[1]]
If you can use numpy, there is a really slick way of doing the above. Assume that A is an array, then
A[A[:, 0] != A[:, 1]]
Will pull out the same values. This is probably faster than the solution listed above if you want to loop over it.

Why not create another array to hold the output?
Iterate through your main list and for each i check if i is in your other array and if not append it.
This way, your new array will not contain more than one of each element
