Change `itertools.product` iteration order - python

Given some iterables, itertools.product iterates from back to front, trying all choices of the last iterable before advancing the second-to-last iterable, and trying all choices of the last two iterables before advancing the third-to-last iterable, etc. For instance,
>>> list(itertools.product([2,1,0],['b','c','a']))
[(2, 'b'), (2, 'c'), (2, 'a'), (1, 'b'), (1, 'c'), (1, 'a'), (0, 'b'), (0, 'c'), (0, 'a')]
I would like to iterate over the product in a different order: tuples should be produced in order of the sum of the indices of their elements, i.e., before producing a tuple whose elements' indices in their respective iterables sum to k, produce all tuples whose elements' indices sum to k-1. For example, after producing the tuple containing the first element (index 0) of every iterable, the next tuples produced should each contain the second element from a single iterable and the first from the rest; after that, the tuples produced should contain the third element from one iterable or the second element from two iterables, etc. Using the above example,
>>> my_product([2,1,0],['b','c','a'])
[(2, 'b'), # element 0 from both iterables
(2, 'c'), (1, 'b'), # elements 0,1 and 1,0 (sums to 1)
(2, 'a'), (1, 'c'), (0, 'b'), # elements 0,2 and 1,1 and 2,0 (sums to 2)
(1, 'a'), (0, 'c'), # elements 1,2 and 2,1 (sums to 3)
(0, 'a')] # elements 2,2 (sums to 4)

Solved this with sorting:
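import itertools
def my_product(*args):
    # Pair each element with its index, sort by (index sum, indexed tuple),
    # then strip the indices again.
    return [tuple(i[1] for i in p) for p in
            sorted(itertools.product(*map(enumerate, args)),
                   key=lambda x: (sum(y[0] for y in x), x))]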
Test:
>>> my_product([0,1,2],[3,4,5])
[(0, 3),
(0, 4), (1, 3),
(0, 5), (1, 4), (2, 3),
(1, 5), (2, 4),
(2, 5)]
It also works with unsorted, non-numeric items:
>>> my_product(['s0','b1','k2'],['z3','a4','c5'])
[('s0', 'z3'),
('s0', 'a4'), ('b1', 'z3'),
('s0', 'c5'), ('b1', 'a4'), ('k2', 'z3'),
('b1', 'c5'), ('k2', 'a4'),
('k2', 'c5')]
>>> my_product([2,1,0],['b','c','a'])
[(2, 'b'),
(2, 'c'), (1, 'b'),
(2, 'a'), (1, 'c'), (0, 'b'),
(1, 'a'), (0, 'c'),
(0, 'a')]
and with multiple args:
>>> my_product([2,1,0],['b','c','a'],['x','y','z'])
[(2, 'b', 'x'),
(2, 'b', 'y'), (2, 'c', 'x'), (1, 'b', 'x'),
(2, 'b', 'z'), (2, 'c', 'y'), (2, 'a', 'x'), (1, 'b', 'y'), (1, 'c', 'x'), (0, 'b', 'x'),
(2, 'c', 'z'), (2, 'a', 'y'), (1, 'b', 'z'), (1, 'c', 'y'), (1, 'a', 'x'), (0, 'b', 'y'), (0, 'c', 'x'),
(2, 'a', 'z'), (1, 'c', 'z'), (1, 'a', 'y'), (0, 'b', 'z'), (0, 'c', 'y'), (0, 'a', 'x'),
(1, 'a', 'z'), (0, 'c', 'z'), (0, 'a', 'y'),
(0, 'a', 'z')]

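Given that you'll need to check that sum, why don't you just use sort? Note that this sorts by the sum of the values rather than of the indices, so it matches the requested order for inputs such as [0, 1, 2], where each value equals its index, but not in general:
import itertools
def my_product(*args):
    return sorted(itertools.product(*args), key=lambda x: (sum(x), x))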

As an alternative to sorting, this solution makes multiple passes over the result of itertools.product().
Note that it uses the same decorate-manipulate-undecorate pattern that other answers use.
import itertools
# TESTED on Python3
def my_product(*args):
    args = [list(enumerate(arg)) for arg in args]
    # One full pass over the product for each candidate index sum
    for sum_indexes in range(sum(len(item) for item in args)):
        for partial in itertools.product(*args):
            indexes, values = zip(*partial)
            if sum(indexes) == sum_indexes:
                yield values
assert list(my_product([2,1,0],['b','c','a'])) == [
(2, 'b'), # element 0 from both iterables
(2, 'c'), (1, 'b'), # elements 0,1 and 1,0 (sums to 1)
(2, 'a'), (1, 'c'), (0, 'b'), # elements 0,2 and 1,1 and 2,0 (sums to 2)
(1, 'a'), (0, 'c'), # elements 1,2 and 2,1 (sums to 3)
(0, 'a')]
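Both approaches above materialize or re-scan the whole product. As a rough alternative, the index tuples for each sum can be generated directly, so that tuples are produced lazily and in the requested order. This is only a sketch: the name product_by_index_sum and the inner helper index_tuples are made up for illustration, and ties within one sum are broken lexicographically by index, as in the sorting answer.

```python
def product_by_index_sum(*iterables):
    """Yield product tuples ordered by the sum of their element indices."""
    pools = [tuple(it) for it in iterables]
    if any(len(pool) == 0 for pool in pools):
        return  # like itertools.product, an empty input gives an empty product
    max_sum = sum(len(pool) - 1 for pool in pools)

    def index_tuples(pos, remaining):
        # Yield index tuples for pools[pos:] whose entries sum to `remaining`,
        # in lexicographic order.
        if pos == len(pools):
            if remaining == 0:
                yield ()
            return
        highest = min(remaining, len(pools[pos]) - 1)
        for i in range(highest + 1):
            for rest in index_tuples(pos + 1, remaining - i):
                yield (i,) + rest

    for total in range(max_sum + 1):
        for indexes in index_tuples(0, total):
            yield tuple(pool[i] for pool, i in zip(pools, indexes))


print(list(product_by_index_sum([2, 1, 0], ['b', 'c', 'a'])))
```

Because it is a generator, the first tuples are available without building or sorting the full product, which may matter when the inputs are large.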

Related

How to do a circular (sub)sort of sets in Python?

Consider the following minimized example:
Code:
a = [(1,'A'), (2,'A'), (3,'A'), (4,'A'), (5,'A')]
b = [(1,'B'), (2,'B'), (3,'B')]
c = []
d = [(1,'D'), (2,'D'), (3,'D'), (4,'D')]
print(sorted(a+b+c+d))
Result:
[(1, 'A'), (1, 'B'), (1, 'D'), (2, 'A'), (2, 'B'), (2, 'D'), (3, 'A'), (3, 'B'), (3, 'D'), (4, 'A'), (4, 'D'), (5, 'A')]
Python sorts the list of tuples by the first item of each tuple and then by the second. That's fine.
Now, I need the second sort order to be "circular" in the strings (not sure if this is the right term for it).
Furthermore, I want to specify the last string in the newly ordered list. For example, if I specify 'B', the ordered list should start from 'C'. If 'C' doesn't exist, it should start from 'D', etc. It could also happen that the specified character is not in the list at all; e.g. if 'C' doesn't exist, the new sorted list should nevertheless start from 'D'.
Edit:
Sorry, I didn't include the desired output order of the list of tuples to make it clear.
Assume I call mySpecialSort(myList,'B').
Then the number is the highest-priority sort key, so all the tuples containing 1 come first, ordered by the "circular" strings (here starting from 'D', since there is no 'C' in the list).
Desired sort order:
[(1, 'D'), (1, 'A'), (1, 'B'), (2, 'D'), (2, 'A'), (2, 'B'), (3, 'D'), (3, 'A'), (3, 'B'), (4, 'D'), (4, 'A'), (5, 'A')]
or in shortened readable form:
1D, 1A, 1B, 2D, 2A, 2B, 3D, 3A, 3B, 4D, 4A, 5A
A (cumbersome) solution I came up with, so far only for the "circular" sort on a list of single characters (here with duplicates), is the following:
Code:
myList = ['A', 'D', 'E', 'G', 'Z', 'A', 'J', 'K', 'T']
def myCircularSort(myList,myLast):
    myListTmp = sorted(list(set(myList + [myLast]))) # add myLast, remove duplicates and sort
    idx = myListTmp.index(myLast) # get index of myLast
    myStart = myListTmp[(idx+1)%len(myListTmp)] # get the start list item
    myListSorted = sorted(list(set(myList))) # sorted original list
    print("Normal sort: {}".format(myListSorted))
    idx_start = myListSorted.index(myStart) # find start item and get its index
    myNewSort = myListSorted[idx_start:] + myListSorted[0:idx_start] # split list and put in new order
    print("Circular sort with {} as last: {}\n".format(myLast,myNewSort))
myCircularSort(myList,'D')
myCircularSort(myList,'X')
Result:
Normal sort: ['A', 'D', 'E', 'G', 'J', 'K', 'T', 'Z']
Circular sort with D as last: ['E', 'G', 'J', 'K', 'T', 'Z', 'A', 'D']
Normal sort: ['A', 'D', 'E', 'G', 'J', 'K', 'T', 'Z']
Circular sort with X as last: ['Z', 'A', 'D', 'E', 'G', 'J', 'K', 'T'] # X actually not in the list
However, now I am stuck on how to combine this "circular" sort (on the second item of each tuple) with the "normal" sort (on the first item of each tuple).
Alternatively, I could think of a "brute force" method: find the highest index (here: 4) and all existing strings (here: A-Z) and check the existence of each combination in two nested for-loops.
Am I on the right track, am I doing something horribly complicated and inefficient, or am I missing some smart Python features?
Edit2:
After some further searching, I guess a lambda with cmp(x,y) would have done the job (see an example), but cmp doesn't seem to exist in Python 3 anymore. So it probably needs something with operator.itemgetter() or operator.methodcaller(), which I still don't know how to use since I haven't found good examples...
You can use a dict to map a letter to its correct position:
from string import ascii_uppercase as ABC
start = ABC.index('D') + 1
sorter = {
ABC[(n + start) % len(ABC)]: n
for n in range(len(ABC))
}
myList = ['A', 'D', 'E', 'G', 'Z', 'A', 'J', 'K', 'T']
print(sorted(myList, key=sorter.get))
# ['E', 'G', 'J', 'K', 'T', 'Z', 'A', 'A', 'D']
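The same lookup table can be combined with the number by using a tuple as the sort key, which covers the list of (number, letter) tuples from the question. This is a sketch under the assumption that the second item is always a single uppercase ASCII letter; the helper name circular_rank is illustrative.

```python
from string import ascii_uppercase


def circular_rank(last):
    """Map each uppercase letter to its rank when `last` must sort last."""
    start = ascii_uppercase.index(last) + 1
    size = len(ascii_uppercase)
    return {ascii_uppercase[(n + start) % size]: n for n in range(size)}


data = [(1, 'A'), (2, 'A'), (3, 'A'), (4, 'A'), (5, 'A'),
        (1, 'B'), (2, 'B'), (3, 'B'),
        (1, 'D'), (2, 'D'), (3, 'D'), (4, 'D')]

rank = circular_rank('B')
# Sort by number first, then by the rotated letter order.
result = sorted(data, key=lambda t: (t[0], rank[t[1]]))
print(result)
```

Tuples compare element by element, so the number decides first and the rotated letter rank only breaks ties.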
To work with arbitrary keywords, extract them into a keys list, rearrange it as desired and use keys.index(word) as a sort key:
myList = [
(1, 'ARTHUR'),
(2, 'CHARLIE'),
(3, 'GEORGE'),
(4, 'HARRY'),
(5, 'JACK'),
(6, 'LEO'),
(7, 'MUHAMMAD'),
(8, 'NOAH'),
(9, 'OLIVER'),
]
def circ_sorted(lst, start):
    keys = sorted(e[1] for e in lst)
    # number of keys that sort at or before the start word
    less = sum(1 for k in keys if k <= start)
    keys = keys[less:] + keys[:less]
    return sorted(lst, key=lambda e: (keys.index(e[1]), e[0]))
print(circ_sorted(myList, 'LEO')) ## [MUHAMMAD, NOAH...]
print(circ_sorted(myList, 'IAN')) ## [JACK, LEO...]
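Calling keys.index() inside the sort key scans the key list for every element. For longer lists, a variant that finds the cut point with bisect and looks ranks up in a dict may be preferable. This is a sketch of the same idea, not a measured improvement; circ_sorted_fast is a hypothetical name, and unlike the version above it deduplicates the keywords first.

```python
from bisect import bisect_right


def circ_sorted_fast(lst, start):
    """Sort (number, word) tuples by word, beginning just after `start`."""
    keys = sorted({e[1] for e in lst})          # unique words in normal order
    cut = bisect_right(keys, start)             # first word strictly after start
    rotated = keys[cut:] + keys[:cut]
    rank = {word: i for i, word in enumerate(rotated)}
    return sorted(lst, key=lambda e: (rank[e[1]], e[0]))


names = [(1, 'ARTHUR'), (2, 'CHARLIE'), (3, 'GEORGE'), (4, 'HARRY'),
         (5, 'JACK'), (6, 'LEO'), (7, 'MUHAMMAD'), (8, 'NOAH'), (9, 'OLIVER')]
print(circ_sorted_fast(names, 'LEO'))
print(circ_sorted_fast(names, 'IAN'))
```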
Phew, this was pretty time consuming, but I guess I have a solution now. At least the result seems to have the desired order.
The module functools offers cmp_to_key to replace cmp() which apparently was removed in Python3. At least that's what I found here.
In case there is a "native" Python3 solution I would be happy to learn about it. Comments, improvements, simplifications are welcome.
So, the following code sorts the tuples in a list first by number (here 1 to 5) and second by string in a circular way (here: Ag, Au, Ca, Fe, Ti), such that the last string is determined by myRef.
Code:
### special numerical and circular alphanumerical sort on a list of sets
from functools import cmp_to_key
# different lists of sets
ag = [(1,'Ag'), (2,'Ag'), (3,'Ag'), (4,'Ag'), (5,'Ag')]
au = [(1,'Au'), (2,'Au')]
ba = []
ca = [(1,'Ca'), (2,'Ca'), (3,'Ca')]
fe = [(1,'Fe'), (2,'Fe')]
ti = [(1,'Ti'), (2,'Ti'), (3,'Ti')]
myList = fe + ti + ag + au + ca + ba # merge all lists
def mySpecialCircularSort(myList,myRef):
    myList = list(set(myList)) # remove duplicates
    myListNew = sorted(myList, key=cmp_to_key(lambda a, b:
        -1 if a[0]<b[0] else 1 if a[0]>b[0] else
        -1 if b[1]==myRef else
        1 if a[1]==myRef else
        -1 if a[1]>myRef and b[1]<myRef else
        1 if a[1]<myRef and b[1]>myRef else
        -1 if a[1]<b[1] else
        1 if a[1]>b[1] else 0))
    print("Circular sort with {} as last: {}".format(myRef,myListNew))
    print("Unsorted as is: {}\n".format(myList))
mySpecialCircularSort(myList,'Ag')
mySpecialCircularSort(myList,'Au')
mySpecialCircularSort(myList,'Ba') # since Ba-List was empty, the result will be same as 'Au'
mySpecialCircularSort(myList,'Ca')
mySpecialCircularSort(myList,'Fe')
mySpecialCircularSort(myList,'Ti')
Result:
Unsorted as is: [(1, 'Fe'), (2, 'Fe'), (1, 'Ti'), (2, 'Ti'), (3, 'Ti'), (1, 'Ag'), (2, 'Ag'), (3, 'Ag'), (4, 'Ag'), (5, 'Ag'), (1, 'Au'), (2, 'Au'), (1, 'Ca'), (2, 'Ca'), (3, 'Ca')]
Circular sort with Ag as last: [(1, 'Au'), (1, 'Ca'), (1, 'Fe'), (1, 'Ti'), (1, 'Ag'), (2, 'Au'), (2, 'Ca'), (2, 'Fe'), (2, 'Ti'), (2, 'Ag'), (3, 'Ca'), (3, 'Ti'), (3, 'Ag'), (4, 'Ag'), (5, 'Ag')]
Circular sort with Au as last: [(1, 'Ca'), (1, 'Fe'), (1, 'Ti'), (1, 'Ag'), (1, 'Au'), (2, 'Ca'), (2, 'Fe'), (2, 'Ti'), (2, 'Ag'), (2, 'Au'), (3, 'Ca'), (3, 'Ti'), (3, 'Ag'), (4, 'Ag'), (5, 'Ag')]
Circular sort with Ba as last: [(1, 'Ca'), (1, 'Fe'), (1, 'Ti'), (1, 'Ag'), (1, 'Au'), (2, 'Ca'), (2, 'Fe'), (2, 'Ti'), (2, 'Ag'), (2, 'Au'), (3, 'Ca'), (3, 'Ti'), (3, 'Ag'), (4, 'Ag'), (5, 'Ag')]
Circular sort with Ca as last: [(1, 'Fe'), (1, 'Ti'), (1, 'Ag'), (1, 'Au'), (1, 'Ca'), (2, 'Fe'), (2, 'Ti'), (2, 'Ag'), (2, 'Au'), (2, 'Ca'), (3, 'Ti'), (3, 'Ag'), (3, 'Ca'), (4, 'Ag'), (5, 'Ag')]
Circular sort with Fe as last: [(1, 'Ti'), (1, 'Ag'), (1, 'Au'), (1, 'Ca'), (1, 'Fe'), (2, 'Ti'), (2, 'Ag'), (2, 'Au'), (2, 'Ca'), (2, 'Fe'), (3, 'Ti'), (3, 'Ag'), (3, 'Ca'), (4, 'Ag'), (5, 'Ag')]
Circular sort with Ti as last: [(1, 'Ag'), (1, 'Au'), (1, 'Ca'), (1, 'Fe'), (1, 'Ti'), (2, 'Ag'), (2, 'Au'), (2, 'Ca'), (2, 'Fe'), (2, 'Ti'), (3, 'Ag'), (3, 'Ca'), (3, 'Ti'), (4, 'Ag'), (5, 'Ag')]
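For what it's worth, the comparison function above can probably be expressed as a plain key function, which is the usual Python 3 idiom. The sketch below relies on False sorting before True, so strings greater than the reference come first and strings less than or equal to it come last; special_circular_sort is an illustrative name, not part of the answer.

```python
def special_circular_sort(items, ref):
    """Sort (number, string) tuples by number, then circularly by string.

    Strings greater than `ref` come first; `ref` itself (or the largest
    string below it) ends up last within each number.
    """
    return sorted(set(items), key=lambda t: (t[0], t[1] <= ref, t[1]))


ag = [(1, 'Ag'), (2, 'Ag'), (3, 'Ag'), (4, 'Ag'), (5, 'Ag')]
au = [(1, 'Au'), (2, 'Au')]
ca = [(1, 'Ca'), (2, 'Ca'), (3, 'Ca')]
fe = [(1, 'Fe'), (2, 'Fe')]
ti = [(1, 'Ti'), (2, 'Ti'), (3, 'Ti')]
elements = fe + ti + ag + au + ca

print(special_circular_sort(elements, 'Ag'))
print(special_circular_sort(elements, 'Ba'))  # 'Ba' is not in the data
```

No cmp_to_key is needed, and a reference string that is absent from the data is handled by the same comparison.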
With a custom sort key function:
from string import ascii_uppercase
order = {c: i for i, c in enumerate(ascii_uppercase)}
def circular_sort(lst, last):
    return sorted(lst, key=lambda x: (x[0], order[x[1]] + 26*(x[1]<=last)))
>>> circular_sort(a+b+c+d, 'B')
[(1, 'D'), (1, 'A'), (1, 'B'), (2, 'D'), (2, 'A'), (2, 'B'), (3, 'D'), (3, 'A'), (3, 'B'), (4, 'D'), (4, 'A'), (5, 'A')]
This simply adds 26 to the index of any letter that is less than or equal to the specified last letter.
I see a pattern in the sample data:
a = [(1,'A'), (2,'A'), (3,'A'), (4,'A'), (5,'A')]
b = [(1,'B'), (2,'B'), (3,'B')]
c = []
d = [(1,'D'), (2,'D'), (3,'D'), (4,'D')]
Maybe the pattern is misleading me and the real data doesn't follow it. In that case, just ignore my answer.
Otherwise, given the OP's answer to my comment:
starting point is several separate lists
I propose this solution:
build a nested list with the source lists;
rotate the list n times depending on the starting point;
transpose;
flatten;
Here is an example of implementation, defining some helpers:
from itertools import zip_longest
def rotate(l, n):
    return l[n:] + l[:n]
def transpose(l):
    # zip_longest pads the shorter lists with None; filter(None, ...) drops the padding
    return [list(filter(None, i)) for i in zip_longest(*l)]
def flatten(l):
    return [item for sublist in l for item in sublist]
Then, for example rotate three times to start with D:
tmp = [a, b, c, d]
tmp = rotate(tmp, 3)
tmp = transpose(tmp)
tmp = flatten(tmp)
tmp
#=> [(1, 'D'), (1, 'A'), (1, 'B'), (2, 'D'), (2, 'A'), (2, 'B'), (3, 'D'), (3, 'A'), (3, 'B'), (4, 'D'), (4, 'A'), (5, 'A')]
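One caveat with the transpose helper is that filter(None, ...) would also drop legitimate falsy items such as 0 or an empty tuple. A minimal sketch that avoids this by padding with a private sentinel instead of None; the names interleave and _MISSING are illustrative, not from the answer:

```python
from itertools import chain, zip_longest

_MISSING = object()  # sentinel that cannot collide with real data


def interleave(lists):
    """Take the first item of every list, then the second of every list, etc."""
    rows = zip_longest(*lists, fillvalue=_MISSING)
    return [item for item in chain.from_iterable(rows) if item is not _MISSING]


a = [(1, 'A'), (2, 'A'), (3, 'A'), (4, 'A'), (5, 'A')]
b = [(1, 'B'), (2, 'B'), (3, 'B')]
c = []
d = [(1, 'D'), (2, 'D'), (3, 'D'), (4, 'D')]

groups = [a, b, c, d]
rotated = groups[3:] + groups[:3]   # start with d, as in the example above
print(interleave(rotated))
```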

Convert a list of tuples to a dictionary, based on tuple values

I have a little problem, maybe dumb, but it seems I can't solve it.
I have a list of objects that have members, but let's say my list is this:
l = [(1, 'a'), (2, 'a'), (1, 'b'), (1, 'c'), (3, 'a')]
I want to "gather" all elements based on a value I choose, and put them into a dictionary keyed on that value (which can be either the first or the second value of the tuple).
For example, if I want to gather the values based on the first element, I want something like this:
{1: [(1, 'a'), (1, 'b'), (1, 'c')], 2: [(2, 'a')], 3: [(3, 'a')]}
However, what I achieved until now is this:
>>> {k:v for k,v in zip([e[0] for e in l], l)}
{1: (1, 'c'), 2: (2, 'a'), 3: (3, 'a')}
Can somebody please help me out?
My first thought would be using defaultdict(list) (efficient, in linear time), which does exactly what you were trying to do:
from collections import defaultdict
dic = defaultdict(list)
l = [(1, 'a'), (2, 'a'), (1, 'b'), (1, 'c'), (3, 'a')]
for item in l:
    dic[item[0]].append(item)
output
defaultdict(list,{1: [(1, 'a'), (1, 'b'), (1, 'c')], 2: [(2, 'a')], 3: [(3, 'a')]})
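Since the question mentions grouping on either the first or the second value, the same loop can be wrapped in a small function that takes the position as a parameter. A sketch, with group_by_position as a made-up name; it converts the result to a plain dict so that missing keys raise KeyError again:

```python
from collections import defaultdict


def group_by_position(items, pos):
    """Group tuples into lists keyed by the element at index `pos`."""
    groups = defaultdict(list)
    for item in items:
        groups[item[pos]].append(item)
    return dict(groups)


pairs = [(1, 'a'), (2, 'a'), (1, 'b'), (1, 'c'), (3, 'a')]
print(group_by_position(pairs, 0))
print(group_by_position(pairs, 1))
```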
Here you go:
l = [(1, 'a'), (2, 'a'), (1, 'b'), (1, 'c'), (3, 'a')]
output_dict = dict()
for item in l:
    if item[0] in output_dict:
        output_dict[item[0]].append(item)
        continue
    output_dict[item[0]] = [item]
print(output_dict)
Here with list comprehension, oneliner:
l = [(1, 'a'), (2, 'a'), (1, 'b'), (1, 'c'), (3, 'a')]
print (dict([(x[0], [y for y in l if y[0] == x[0]]) for x in l]))
Output:
{1: [(1, 'a'), (1, 'b'), (1, 'c')], 2: [(2, 'a')], 3: [(3, 'a')]}
First create a dict with an empty list for each key that occurs:
l = [(1, 'a'), (2, 'a'), (1, 'b'), (1, 'c'), (3, 'a')]
dico = {}
for key, _ in l:
    dico[key] = []
Then fill this dict:
for i in l:
    dico[i[0]].append(i)

Join strings from consecutive equal tuples

Given a list which looks like this:
[(1, 'a'), (1, 'b'), (2, 'c'), (1, 'd')]
I want to join consecutive tuples inside the list if they have the same first value, so the result looks like the following:
[(1, 'ab'), (2, 'c'), (1, 'd')]
Tuples should only be joined if they are next to each other.
If the key is None, as below, the tuple should be merged into the previous item.
[(1, 'a'), (1, 'b'), (None, 'e'), (2, 'c'), (1, 'd')]
result should be
[(1, 'abe'), (2, 'c'), (1, 'd')]
You can use itertools.groupby to group consecutive tuples with the same first value, and join the strings from the corresponding grouped tuples:
from itertools import groupby
l = [(1, 'a'), (1, 'b'), (2, 'c'), (1, 'd')]
[(k,''.join([i for _,i in v])) for k,v in groupby(l, key=lambda x:x[0])]
# [(1, 'ab'), (2, 'c'), (1, 'd')]
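The groupby version does not cover the None case from the question, because None forms its own group. One possible way to handle it is a plain loop that merges an item into the previous one when its key is None or equal to the previous key. This is a sketch, with join_consecutive as a hypothetical name; a leading None has no previous item to merge into, so it is kept as-is.

```python
def join_consecutive(pairs):
    """Join strings of adjacent tuples that share a key; None joins the previous item."""
    result = []
    for key, text in pairs:
        if result and (key is None or key == result[-1][0]):
            prev_key, prev_text = result[-1]
            result[-1] = (prev_key, prev_text + text)
        else:
            result.append((key, text))
    return result


print(join_consecutive([(1, 'a'), (1, 'b'), (None, 'e'), (2, 'c'), (1, 'd')]))
```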

How to generate tuples with some constant values with comprehension?

Is it possible to generate tuples with some constant values with comprehension?
I would like to have something like
[
(0, 'A', 'B'),
(1, 'A', 'B'),
(2, 'A', 'B'),
(3, 'A', 'B'),
...
]
so I would take 0, 1, 2, 3, ... from range. But how do I get the As and Bs, which do not change?
The fact that tuples are immutable does not prevent you from generating a list of tuples in which only one item varies:
result = [(i,'A','B') for i in range(1,5)]
print(result)
yields:
[(1, 'A', 'B'), (2, 'A', 'B'), (3, 'A', 'B'), (4, 'A', 'B')]
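If the constant part is already stored in a tuple, it can be unpacked into each new tuple, and starting the range at 0 gives the rows shown in the question. A small sketch; the variable names constants and rows are illustrative:

```python
constants = ('A', 'B')

# A new tuple is built on every iteration; only the first item varies.
rows = [(i, *constants) for i in range(4)]
print(rows)
```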

How to create a given size list of lists from a dictionary?

Suppose I have a dictionary a which is
a = {'a':1,'b':2,'c':3,'d':4,'e':5}
Now if I give size=2, the output should be a list of lists in which each list has 2 items (the size of the last list doesn't matter). In this case it should look like this:
[[('a', 1), ('c', 3)], [('b', 2), ('e', 5)], [('d', 4)]]
or if size = 3, then it should look like this: [[('a', 1), ('c', 3), ('b', 2)], [('e', 5), ('d', 4)]]
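First of all, convert your dictionary into a list of items, which you can do easily like this (the item order shown is from Python 2, where dict order is arbitrary; Python 3.7+ keeps insertion order):
In [1]: a = {'a':1,'b':2,'c':3,'d':4,'e':5}
In [2]: b = list(a.items())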
In [3]: b
Out[3]: [('a', 1), ('c', 3), ('b', 2), ('e', 5), ('d', 4)]
Then use this function to split the list into chunks of a given size:
In [4]: def chunks(c, n):
...:     return [c[i:i+n] for i in range(0, len(c), n)]
...:
In [5]: list(chunks(b,2))
Out[5]: [[('a', 1), ('c', 3)], [('b', 2), ('e', 5)], [('d', 4)]]
In [6]: list(chunks(b,3))
Out[6]: [[('a', 1), ('c', 3), ('b', 2)], [('e', 5), ('d', 4)]]
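In Python 3, dict.items() returns a view that cannot be sliced, so an alternative is to consume the items through an iterator with itertools.islice. The following is a sketch assuming Python 3.7+, where dicts keep insertion order; chunk_items is a made-up name.

```python
from itertools import islice


def chunk_items(mapping, size):
    """Split the (key, value) pairs of a dict into lists of at most `size` items."""
    it = iter(mapping.items())
    chunks = []
    while True:
        chunk = list(islice(it, size))  # take the next `size` items, if any
        if not chunk:
            return chunks
        chunks.append(chunk)


a = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
print(chunk_items(a, 2))
print(chunk_items(a, 3))
```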
