Consider the following minimized example:
Code:
a = [(1,'A'), (2,'A'), (3,'A'), (4,'A'), (5,'A')]
b = [(1,'B'), (2,'B'), (3,'B')]
c = []
d = [(1,'D'), (2,'D'), (3,'D'), (4,'D')]
print(sorted(a+b+c+d))
Result:
[(1, 'A'), (1, 'B'), (1, 'D'), (2, 'A'), (2, 'B'), (2, 'D'), (3, 'A'), (3, 'B'), (3, 'D'), (4, 'A'), (4, 'D'), (5, 'A')]
Python sorts the list of tuples by the first item of each tuple and then by the second. That's fine.
Now, I need the second sort order to be "circular" in strings (not sure if this is the right term for it).
Furthermore, I want to specify the last string in the new ordered list. For example, if I specify 'B', the ordered list should start from 'C'. If 'C' doesn't exist, it should start from 'D', etc. The specified character itself might also be missing from the list: e.g. if I specify 'C' and 'C' doesn't exist, the new sorted list should nevertheless start from 'D'.
Edit:
Sorry, I didn't add the desired output order of the list of tuples to make it clear.
Assume I call mySpecialSort(myList,'B').
Then the number is the highest-priority sort order, so all the tuples containing 1 come first, and within each number the strings follow in "circular" order (here starting from 'D', since there is no 'C' in the list).
Desired sort order:
[(1, 'D'), (1, 'A'), (1, 'B'), (2, 'D'), (2, 'A'), (2, 'B'), (3, 'D'), (3, 'A'), (3, 'B'), (4, 'D'), (4, 'A'), (5, 'A')]
or in shortened readable form:
1D, 1A, 1B, 2D, 2A, 2B, 3D, 3A, 3B, 4D, 4A, 5A
A (cumbersome) solution I came up with, so far only for the "circular" sort on a list of single characters (here with duplicates), is the following:
Code:
myList = ['A', 'D', 'E', 'G', 'Z', 'A', 'J', 'K', 'T']

def myCircularSort(myList,myLast):
    myListTmp = sorted(list(set(myList + [myLast]))) # add myLast, remove duplicates and sort
    idx = myListTmp.index(myLast) # get index of myLast
    myStart = myListTmp[(idx+1)%len(myListTmp)] # get the start list item
    myListSorted = sorted(list(set(myList))) # sorted original list
    print("Normal sort: {}".format(myListSorted))
    idx_start = myListSorted.index(myStart) # find start item and get its index
    myNewSort = myListSorted[idx_start:] + myListSorted[0:idx_start] # split list and put in new order
    print("Circular sort with {} as last: {}\n".format(myLast,myNewSort))

myCircularSort(myList,'D')
myCircularSort(myList,'X')
Result:
Normal sort: ['A', 'D', 'E', 'G', 'J', 'K', 'T', 'Z']
Circular sort with D as last: ['E', 'G', 'J', 'K', 'T', 'Z', 'A', 'D']
Normal sort: ['A', 'D', 'E', 'G', 'J', 'K', 'T', 'Z']
Circular sort with X as last: ['Z', 'A', 'D', 'E', 'G', 'J', 'K', 'T'] # X actually not in the list
However, now I am stuck on how to combine this "circular" sort (on the second item of each tuple) with the "normal" sort (on the first item of each tuple).
Alternatively, I could possibly think of a "brute force" method to find the highest index (here: 4) and all existing strings (here: A-Z) and check the existence of each combination in two nested for-loops.
Am I on the right track or would I do something horribly complicated and inefficient or am I missing some smart Python features?
Edit2:
After some further searching, I guess lambda and cmp(x,y) would have done the job (see an example), but cmp doesn't seem to exist in Python3 anymore. So the answer probably involves operator.itemgetter() or operator.methodcaller(), which I still don't have a clue how to use since I'm missing good examples...
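For reference, a minimal sketch of the two tools mentioned in Edit2, both of which are available in Python 3. The sample data and the helper name compare_by_number are made up for illustration and are not part of the original question:

```python
from functools import cmp_to_key
from operator import itemgetter

data = [(2, 'B'), (1, 'D'), (1, 'A')]  # made-up sample data

# itemgetter(1, 0) builds a key function returning (item[1], item[0]),
# so this sorts by the string first and by the number second.
by_letter_then_number = sorted(data, key=itemgetter(1, 0))

# cmp_to_key wraps an old-style comparison function (returning a negative,
# zero or positive number) so that it can be passed as key=.
def compare_by_number(x, y):  # hypothetical helper, compares the numbers only
    return (x[0] > y[0]) - (x[0] < y[0])

by_number = sorted(data, key=cmp_to_key(compare_by_number))

print(by_letter_then_number)
print(by_number)
```

Because sorted() is stable, tuples with equal numbers keep their original relative order in the second call.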
You can use a dict to map a letter to its correct position:
from string import ascii_uppercase as ABC
start = ABC.index('D') + 1
sorter = {
    ABC[(n + start) % len(ABC)]: n
    for n in range(len(ABC))
}
myList = ['A', 'D', 'E', 'G', 'Z', 'A', 'J', 'K', 'T']
print(sorted(myList, key=sorter.get))
# ['E', 'G', 'J', 'K', 'T', 'Z', 'A', 'A', 'D']
To work with arbitrary keywords, extract them into a keys list, rearrange it as desired and use keys.index(word) as a sort key:
myList = [
    (1, 'ARTHUR'),
    (2, 'CHARLIE'),
    (3, 'GEORGE'),
    (4, 'HARRY'),
    (5, 'JACK'),
    (6, 'LEO'),
    (7, 'MUHAMMAD'),
    (8, 'NOAH'),
    (9, 'OLIVER'),
]

def circ_sorted(lst, start):
    keys = sorted(e[1] for e in lst)
    less = sum(1 for k in keys if k <= start)
    keys = keys[less:] + keys[:less]
    return sorted(lst, key=lambda e: (keys.index(e[1]), e[0]))

print(circ_sorted(myList, 'LEO')) ## [MUHAMMAD, NOAH...]
print(circ_sorted(myList, 'IAN')) ## [JACK, LEO...]
Phew, this was pretty time consuming, but I guess I have a solution now. At least the result seems to have the desired order.
The module functools offers cmp_to_key to replace cmp() which apparently was removed in Python3. At least that's what I found here.
In case there is a "native" Python3 solution I would be happy to learn about it. Comments, improvements, simplifications are welcome.
So, the following code sorts the tuples of a list first by the number (here 1 to 5) and second by the string in a circular way (here: Ag,Au,Ca,Fe,Ti) such that the last string is determined by myRef.
Code:
### special numerical and circular alphanumerical sort on a list of tuples
from functools import cmp_to_key

# different lists of tuples
ag = [(1,'Ag'), (2,'Ag'), (3,'Ag'), (4,'Ag'), (5,'Ag')]
au = [(1,'Au'), (2,'Au')]
ba = []
ca = [(1,'Ca'), (2,'Ca'), (3,'Ca')]
fe = [(1,'Fe'), (2,'Fe')]
ti = [(1,'Ti'), (2,'Ti'), (3,'Ti')]
myList = fe + ti + ag + au + ca + ba # merge all lists

def mySpecialCircularSort(myList,myRef):
    myList = list(set(myList)) # remove duplicates
    myListNew = sorted(myList, key=cmp_to_key(lambda a, b:
        -1 if a[0]<b[0] else 1 if a[0]>b[0] else
        -1 if b[1]==myRef else
        1 if a[1]==myRef else
        -1 if a[1]>myRef and b[1]<myRef else
        1 if a[1]<myRef and b[1]>myRef else
        -1 if a[1]<b[1] else
        1 if a[1]>b[1] else 0))
    print("Circular sort with {} as last: {}".format(myRef,myListNew))

print("Unsorted as is: {}\n".format(myList))
mySpecialCircularSort(myList,'Ag')
mySpecialCircularSort(myList,'Au')
mySpecialCircularSort(myList,'Ba') # since Ba-List was empty, the result will be same as 'Au'
mySpecialCircularSort(myList,'Ca')
mySpecialCircularSort(myList,'Fe')
mySpecialCircularSort(myList,'Ti')
Result:
Unsorted as is: [(1, 'Fe'), (2, 'Fe'), (1, 'Ti'), (2, 'Ti'), (3, 'Ti'), (1, 'Ag'), (2, 'Ag'), (3, 'Ag'), (4, 'Ag'), (5, 'Ag'), (1, 'Au'), (2, 'Au'), (1, 'Ca'), (2, 'Ca'), (3, 'Ca')]
Circular sort with Ag as last: [(1, 'Au'), (1, 'Ca'), (1, 'Fe'), (1, 'Ti'), (1, 'Ag'), (2, 'Au'), (2, 'Ca'), (2, 'Fe'), (2, 'Ti'), (2, 'Ag'), (3, 'Ca'), (3, 'Ti'), (3, 'Ag'), (4, 'Ag'), (5, 'Ag')]
Circular sort with Au as last: [(1, 'Ca'), (1, 'Fe'), (1, 'Ti'), (1, 'Ag'), (1, 'Au'), (2, 'Ca'), (2, 'Fe'), (2, 'Ti'), (2, 'Ag'), (2, 'Au'), (3, 'Ca'), (3, 'Ti'), (3, 'Ag'), (4, 'Ag'), (5, 'Ag')]
Circular sort with Ba as last: [(1, 'Ca'), (1, 'Fe'), (1, 'Ti'), (1, 'Ag'), (1, 'Au'), (2, 'Ca'), (2, 'Fe'), (2, 'Ti'), (2, 'Ag'), (2, 'Au'), (3, 'Ca'), (3, 'Ti'), (3, 'Ag'), (4, 'Ag'), (5, 'Ag')]
Circular sort with Ca as last: [(1, 'Fe'), (1, 'Ti'), (1, 'Ag'), (1, 'Au'), (1, 'Ca'), (2, 'Fe'), (2, 'Ti'), (2, 'Ag'), (2, 'Au'), (2, 'Ca'), (3, 'Ti'), (3, 'Ag'), (3, 'Ca'), (4, 'Ag'), (5, 'Ag')]
Circular sort with Fe as last: [(1, 'Ti'), (1, 'Ag'), (1, 'Au'), (1, 'Ca'), (1, 'Fe'), (2, 'Ti'), (2, 'Ag'), (2, 'Au'), (2, 'Ca'), (2, 'Fe'), (3, 'Ti'), (3, 'Ag'), (3, 'Ca'), (4, 'Ag'), (5, 'Ag')]
Circular sort with Ti as last: [(1, 'Ag'), (1, 'Au'), (1, 'Ca'), (1, 'Fe'), (1, 'Ti'), (2, 'Ag'), (2, 'Au'), (2, 'Ca'), (2, 'Fe'), (2, 'Ti'), (3, 'Ag'), (3, 'Ca'), (3, 'Ti'), (4, 'Ag'), (5, 'Ag')]
With a custom sort key function:
from string import ascii_uppercase
order = {c: i for i, c in enumerate(ascii_uppercase)}

def circular_sort(lst, last):
    return sorted(lst, key=lambda x: (x[0], order[x[1]] + 26*(x[1]<=last)))

>>> circular_sort(a+b+c+d, 'B')
[(1, 'D'), (1, 'A'), (1, 'B'), (2, 'D'), (2, 'A'), (2, 'B'), (3, 'D'), (3, 'A'), (3, 'B'), (4, 'D'), (4, 'A'), (5, 'A')]
This simply adds 26 to the index of any letter that is less than or equal to the specified last letter, so those letters sort after all the others.
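The same idea can be sketched without the letter table, which also covers multi-character strings such as 'Ag' or 'Ti'. This is an illustrative variant, not part of the answer above; the name circular_sort_any is made up. It relies on False sorting before True, so strings greater than last come first and ties are broken alphabetically:

```python
def circular_sort_any(lst, last):
    # Key: number first, then "wrapped around" flag (False before True),
    # then the string itself for the order inside each half.
    return sorted(lst, key=lambda x: (x[0], x[1] <= last, x[1]))

a = [(1, 'A'), (2, 'A'), (3, 'A'), (4, 'A'), (5, 'A')]
b = [(1, 'B'), (2, 'B'), (3, 'B')]
c = []
d = [(1, 'D'), (2, 'D'), (3, 'D'), (4, 'D')]

result = circular_sort_any(a + b + c + d, 'B')
print(result)

# Works for longer strings too; 'Ba' itself does not occur in the data.
elements = [(1, 'Fe'), (1, 'Ag'), (2, 'Ag'), (1, 'Ca'), (1, 'Au')]
result_elements = circular_sort_any(elements, 'Ba')
print(result_elements)
```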
I see a pattern in the sample data:
a = [(1,'A'), (2,'A'), (3,'A'), (4,'A'), (5,'A')]
b = [(1,'B'), (2,'B'), (3,'B')]
c = []
d = [(1,'D'), (2,'D'), (3,'D'), (4,'D')]
Maybe the pattern is misleading me and the real data doesn't follow it. In that case, just ignore my answer.
Otherwise, given the OP's answer to my comment:
starting point is several separate lists
I propose this solution:
build a nested list with the source lists;
rotate the list n times depending on the starting point;
transpose;
flatten;
Here is an example of implementation, defining some helpers:
from itertools import zip_longest

def rotate(l, n):
    return l[n:] + l[:n]

def transpose(l):
    # zip_longest pads the shorter lists with None; filter(None, ...) drops the padding
    return [list(filter(None,i)) for i in zip_longest(*l)]

def flatten(l):
    return [item for sublist in l for item in sublist]

Then, for example rotate three times to start with D:
tmp = [a, b, c, d]
tmp = rotate(tmp, 3)
tmp = transpose(tmp)
tmp = flatten(tmp)
tmp
#=> [(1, 'D'), (1, 'A'), (1, 'B'), (2, 'D'), (2, 'A'), (2, 'B'), (3, 'D'), (3, 'A'), (3, 'B'), (4, 'D'), (4, 'A'), (5, 'A')]
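The rotation count does not have to be picked by hand. A possible sketch, assuming the source lists are kept in label order alongside a sorted list of their labels (the names interleave_from and labels are hypothetical, introduced only for this example): bisect_right counts how many labels are less than or equal to the chosen last label, and that count is the rotation.

```python
from bisect import bisect_right
from itertools import zip_longest

def interleave_from(lists, labels, last):
    # lists[i] holds the tuples for labels[i]; labels must be sorted.
    n = bisect_right(labels, last)      # number of labels <= last
    rotated = lists[n:] + lists[:n]     # start with the first label after `last`
    # zip_longest pads short lists with None; skip the padding while flattening
    return [item
            for column in zip_longest(*rotated)
            for item in column
            if item is not None]

a = [(1, 'A'), (2, 'A'), (3, 'A'), (4, 'A'), (5, 'A')]
b = [(1, 'B'), (2, 'B'), (3, 'B')]
c = []
d = [(1, 'D'), (2, 'D'), (3, 'D'), (4, 'D')]

result = interleave_from([a, b, c, d], ['A', 'B', 'C', 'D'], 'B')
print(result)
```

Like the rotate/transpose/flatten steps, this only gives the desired order when every source list counts up from 1 without gaps.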
I have a little problem, maybe dumb, but it seems I can't solve it.
I have a list of objects that have members, but let's say my list is this:
l = [(1, 'a'), (2, 'a'), (1, 'b'), (1, 'c'), (3, 'a')]
I want to "gather" all elements based on a value I choose, and put them into a dictionary keyed by that value (it can be either the first or the second value of the tuple).
For example, if I want to gather the values based on the first element, I want something like that:
{1: [(1, 'a'), (1, 'b'), (1, 'c')], 2: [(2, 'a')], 3: [(3, 'a')]}
However, what I have achieved so far is this:
>>> {k:v for k,v in zip([e[0] for e in l], l)}
{1: (1, 'c'), 2: (2, 'a'), 3: (3, 'a')}
Can somebody please help me out?
My first thought would be using defaultdict(list) (efficient, in linear time), which does exactly what you were trying to do:
from collections import defaultdict
dic = defaultdict(list)
l = [(1, 'a'), (2, 'a'), (1, 'b'), (1, 'c'), (3, 'a')]
for item in l:
    dic[item[0]].append(item)
output
defaultdict(list,{1: [(1, 'a'), (1, 'b'), (1, 'c')], 2: [(2, 'a')], 3: [(3, 'a')]})
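Since the question asks for grouping by either the first or the second value, the loop can be wrapped in a small function that takes the tuple position as a parameter. A minimal sketch; the function name group_by and its index parameter are made up for illustration:

```python
from collections import defaultdict

def group_by(items, index):
    # Collect every tuple under the value found at position `index`.
    groups = defaultdict(list)
    for item in items:
        groups[item[index]].append(item)
    return dict(groups)  # plain dict, so missing keys raise KeyError again

l = [(1, 'a'), (2, 'a'), (1, 'b'), (1, 'c'), (3, 'a')]
by_first = group_by(l, 0)
by_second = group_by(l, 1)
print(by_first)
print(by_second)
```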
Here you go:
l = [(1, 'a'), (2, 'a'), (1, 'b'), (1, 'c'), (3, 'a')]
output_dict = dict()
for item in l:
    if item[0] in output_dict:
        output_dict[item[0]].append(item)
        continue
    output_dict[item[0]] = [item]
print(output_dict)
Here is a one-liner using a list comprehension:
l = [(1, 'a'), (2, 'a'), (1, 'b'), (1, 'c'), (3, 'a')]
print (dict([(x[0], [y for y in l if y[0] == x[0]]) for x in l]))
Output:
{1: [(1, 'a'), (1, 'b'), (1, 'c')], 2: [(2, 'a')], 3: [(3, 'a')]}
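Note that the comprehension above rescans the whole list for every element, so it gets slow on long lists. Another compact option, sketched here as an alternative rather than as the method of the answer above, is itertools.groupby. It only groups adjacent items, so the list is sorted by the grouping key first:

```python
from itertools import groupby
from operator import itemgetter

l = [(1, 'a'), (2, 'a'), (1, 'b'), (1, 'c'), (3, 'a')]
first = itemgetter(0)  # key function: the first value of each tuple

# sorted() is stable, so tuples keep their original order inside each group
grouped = {key: list(group) for key, group in groupby(sorted(l, key=first), key=first)}
print(grouped)
```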
First create a dict with an empty list for every key that occurs in the data:
l = [(1, 'a'), (2, 'a'), (1, 'b'), (1, 'c'), (3, 'a')]
dico={}
for i in l:
    dico[i[0]]=[]
Then fill this dict:
for i in l:
    dico[i[0]].append(i)