zipping two lists, one containing similar elements - python

This might be a naive question. I have two lists, say list1 and list2.
list1 =[[('a', 'b'), ('c', 'd')], [('e', 'f'), ('g', 'h')]]
list2 =['aa', 'aa']
When I do
dict(zip(list2, list1))
I get the following
{'aa': [('e', 'f'), ('g', 'h')]}
The output I want is
{'aa': [('a', 'b'), ('c', 'd')], 'aa': [('e', 'f'), ('g', 'h')]}
When I change list2 to:
list2 = ['aa1', 'aa2']
dict(zip(list2, list1))
gives
{'aa1': [('a', 'b'), ('c', 'd')], 'aa2': [('e', 'f'), ('g', 'h')]}
Why am I not getting the desired output in the first case? Could you please help me with this? Thanks in advance.

A Python dict cannot contain duplicate keys, although the same value may appear under several keys.
In the first statement, Python first adds the key 'aa' with the value:
[('a', 'b'), ('c', 'd')]
then it overwrites that value with:
[('e', 'f'), ('g', 'h')]
To create a dict with duplicate keys, take a look at the question "make dictionary with duplicate keys in python".
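If the goal is to keep every value that shares a key, one common option is to store a list of values per key. This is a minimal sketch, assuming a list per key is acceptable in place of true duplicate keys; the name grouped is illustrative and not from the question:

```python
from collections import defaultdict

list1 = [[('a', 'b'), ('c', 'd')], [('e', 'f'), ('g', 'h')]]
list2 = ['aa', 'aa']

# Collect every value that shares a key instead of overwriting it.
grouped = defaultdict(list)
for key, value in zip(list2, list1):
    grouped[key].append(value)

print(dict(grouped))
# {'aa': [[('a', 'b'), ('c', 'd')], [('e', 'f'), ('g', 'h')]]}
```

Both original lists are then reachable as grouped['aa'][0] and grouped['aa'][1].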

Related

How to find all unique combinations out of two list in python?

I have two lists:
l1 = ['a', 'b', 'c']
l2 = ['e', 'f', 'g']
And I want to generate all possible combinations of their pairings
Desired Output:
What is the best way to do it?
Right now I have written this code which works but seems highly inefficient.
You only need to permute the second list:
from itertools import permutations
l1 = ['a', 'b', 'c']
l2 = ['e', 'f', 'g']
pairs = [tuple(zip(l1, p)) for p in permutations(l2)]
print(pairs)
Output:
[(('a', 'e'), ('b', 'f'), ('c', 'g')),
(('a', 'e'), ('b', 'g'), ('c', 'f')),
(('a', 'f'), ('b', 'e'), ('c', 'g')),
(('a', 'f'), ('b', 'g'), ('c', 'e')),
(('a', 'g'), ('b', 'e'), ('c', 'f')),
(('a', 'g'), ('b', 'f'), ('c', 'e'))]
The easiest way to think about this is that each set of output values is the result of pairing the elements of the first list with some permutation of the elements of the second list. So you just need to generate all possible permutations of n elements, then use each one as a permutation of the elements of the second list, pairing them with the elements of the first list.
The most flexible way to implement it is by defining a simple generator function:
from itertools import permutations
def gen_perm_pairs(l1, l2):
    cnt = len(l1)
    assert len(l2) == cnt
    for perm in permutations(range(cnt)):
        yield tuple((l1[i], l2[perm[i]]) for i in range(cnt))
You can then call it to generate the desired results:
l1 = ['a', 'b', 'c']
l2 = ['e', 'f', 'g']
for pairs in gen_perm_pairs(l1, l2):
    print(pairs)
This produces the following:
(('a', 'e'), ('b', 'f'), ('c', 'g'))
(('a', 'e'), ('b', 'g'), ('c', 'f'))
(('a', 'f'), ('b', 'e'), ('c', 'g'))
(('a', 'f'), ('b', 'g'), ('c', 'e'))
(('a', 'g'), ('b', 'e'), ('c', 'f'))
(('a', 'g'), ('b', 'f'), ('c', 'e'))

Loop over list of strings in python

I have a list l of strings from a to z, generated with l=list(string.ascii_lowercase).
I want to take one value from the list l and combine it with all the other values in the list except the selected value itself, e.g. ab, ac, ..., az.
Then take b and combine it with all the other values, like ba, bc, ..., bz.
I know there will be redundant combinations.
I have tried this
for i in range(0, len(l)):
    for j in range(0,len(l.pop(i))):
        print (l[i],l[j])
I am getting 'list index out of range' error.
Is there a more optimized way of doing it?
In [1]: import string, itertools
In [2]: combinations = list(itertools.combinations(string.ascii_lowercase, 2))
In [4]: combinations
Out[4]:
[('a', 'b'),
('a', 'c'),
('a', 'd'),
('a', 'e'),
('a', 'f'),
('a', 'g'),
('a', 'h'),
('a', 'i'),
('a', 'j'),
('a', 'k'),
('a', 'l'),
('a', 'm'),
('a', 'n'),
('a', 'o'),
('a', 'p'),
...]
import string
mylist = list(string.ascii_lowercase)
selected_value = 'b'
new_list = [selected_value + i for i in mylist if i != selected_value]
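The loop in the question fails because l.pop(i) removes an element on every outer iteration, so the list shrinks while range(0, len(l)) was fixed at the start, and an index eventually falls past the end. Since the question asks for ordered pairs (both ab and ba), itertools.permutations may be a closer fit than combinations. A minimal sketch; the names letters and pairs are illustrative:

```python
import string
from itertools import permutations

letters = list(string.ascii_lowercase)

# Every ordered pair of two different letters: 'ab', 'ac', ..., 'ba', 'bc', ...
pairs = [a + b for a, b in permutations(letters, 2)]

print(len(pairs))   # 650, i.e. 26 * 25
print(pairs[:3])    # ['ab', 'ac', 'ad']
```

Unlike the original loop, this never modifies the list while iterating over it.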

Reduce execution time of combinations-producing script

I need to reduce the execution time of my Python script, which generates a list of combinations. A brief explanation of the problem:
There are two lists:
char_list = ['a','b','c','d','e','f','g','h']
n_list = [3,2,1,2]
The goal is to create one collection (list, tuple or whatever you want) of all possible combinations of characters from char_list, with lengths and order following the pattern in n_list. One example out of the 1680 possible:
(('a', 'd', 'e'), ('h', 'c'), ('b',), ('d', 'f'))
All combinations in the collection must look like the one above; the only thing that changes is the position of particular characters. This is where the difficulties begin, because there are some rules that cannot be ignored:
There can't be duplicate characters within a combination (each character must occur only once per combination).
Combinations that differ only in the order of characters inside tuples occupying the same positions are also treated as duplicates (this rule is more complicated, so let me show an example).
Let's say we have thousands of combinations in our collection and suddenly notice four that look almost the same:
(('a', 'c', 'e'), ('b', 'd'), ('g',), ('f', 'h'))
(('a', 'c', 'e'), ('h', 'f'), ('g',), ('d', 'b'))
(('a', 'c', 'e'), ('f', 'h'), ('g',), ('b', 'd'))
(('a', 'e', 'c'), ('h', 'f'), ('g',), ('d', 'b'))
Only two of them are correct (which, by the way, means the entire collection is wrong, because two of these four combinations are invalid). Which ones? The first one is fine (at least for the purpose of this example). Of the next three, only one is fine: the first of those three (the second of all four), because it appears before the other two in the collection. Why are the third and fourth combinations not unique? Because their tuples occupy the same positions as in the second combination; only the characters inside individual tuples have switched places, and that is not what makes a whole combination unique. Now look again at the tuples of the first and third combinations. They are the same, but the order of the tuples is different, so the first combination is unique with respect to the others.
My approach to this coding problem:
import itertools as iter
char_list = ['a','b','c','d','e','f','g','h']
n_list = [3,2,1,2]
###this line creates a list of all possible combinations of characters within tuples###
char_comb_in_tuples = list(iter.chain(*[list(iter.combinations(char_list,n)) for n in n_list]))
### this is list in which the appropriate combinations will be appended###
list_of_good_combinations = []
###for loop for looping over all possible combinations of tuples from 'char_comb_in_tuples'###
for combination in iter.combinations(char_comb_in_tuples,4):
    ###filtering only these combinations with appropriate pattern from n_list (3,2,1,2)###
    if len([tuple for n_list_number, tuple in zip(n_list, combination) if n_list_number ==len(tuple)])==4:
        ###filtering only these combinations with no character duplicates###
        if len(list(iter.chain(*combination))) != len(set(list(iter.chain(*combination)))):
            pass
        else:
            ###appending right combination to final list###
            list_of_good_combinations.append(combination)
    else:
        pass
You can use a recursive function that uses itertools.combinations to pick as many items from the pool as the first of the given partitions specifies, subtracts the picked items from the pool, passes the remaining pool to the next recursive call together with the rest of the partitions, and merges each of the returned combinations with the combination currently picked for the first partition. For efficient subtraction of items from the pool, you can convert the given list to a set first:
from itertools import combinations
def partitioned_combinations(s, partitions):
    if partitions:
        for combination in combinations(s, r=partitions[0]):
            # the loop variable must not be named "combinations": that would make
            # the imported function a local name and raise UnboundLocalError
            for rest in partitioned_combinations(s.difference(combination), partitions[1:]):
                yield (combination, *rest)
    else:
        yield ()
so that:
list(partitioned_combinations(set(char_list), n_list))
would return 1680 tuples in a list:
[(('a', 'e', 'f'), ('c', 'd'), ('g',), ('h', 'b')),
(('a', 'e', 'f'), ('c', 'd'), ('b',), ('h', 'g')),
(('a', 'e', 'f'), ('c', 'd'), ('h',), ('g', 'b')),
(('a', 'e', 'f'), ('c', 'g'), ('d',), ('b', 'h')),
(('a', 'e', 'f'), ('c', 'g'), ('b',), ('d', 'h')),
(('a', 'e', 'f'), ('c', 'g'), ('h',), ('d', 'b')),
(('a', 'e', 'f'), ('c', 'b'), ('d',), ('g', 'h')),
(('a', 'e', 'f'), ('c', 'b'), ('g',), ('d', 'h')),
(('a', 'e', 'f'), ('c', 'b'), ('h',), ('d', 'g')),
(('a', 'e', 'f'), ('c', 'h'), ('d',), ('g', 'b')),
...
Note that sets are unordered in Python so the result of this approach will not be in a definite order. If you do want them to be in order, however, you can install the ordered-set module, so that:
from ordered_set import OrderedSet
list(partitioned_combinations(OrderedSet(char_list), n_list))
returns:
[(('a', 'b', 'c'), ('d', 'e'), ('f',), ('g', 'h')),
(('a', 'b', 'c'), ('d', 'e'), ('g',), ('f', 'h')),
(('a', 'b', 'c'), ('d', 'e'), ('h',), ('f', 'g')),
(('a', 'b', 'c'), ('d', 'f'), ('e',), ('g', 'h')),
(('a', 'b', 'c'), ('d', 'f'), ('g',), ('e', 'h')),
(('a', 'b', 'c'), ('d', 'f'), ('h',), ('e', 'g')),
(('a', 'b', 'c'), ('d', 'g'), ('e',), ('f', 'h')),
(('a', 'b', 'c'), ('d', 'g'), ('f',), ('e', 'h')),
(('a', 'b', 'c'), ('d', 'g'), ('h',), ('e', 'f')),
(('a', 'b', 'c'), ('d', 'h'), ('e',), ('f', 'g')),
...
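For a deterministic order without the third-party package, a variant that uses only the standard library is one option. This is a sketch, not the answer's code: ordered_partitioned_combinations is a hypothetical name, and it keeps the remaining items in a list (filtered on each level) instead of a set, which trades some speed for stable ordering:

```python
from itertools import combinations

def ordered_partitioned_combinations(items, partitions):
    """Yield tuples of groups whose sizes follow `partitions`, in input order."""
    if not partitions:
        yield ()
        return
    for picked in combinations(items, partitions[0]):
        picked_set = set(picked)
        # Keep the original order of the items that were not picked.
        remaining = [x for x in items if x not in picked_set]
        for rest in ordered_partitioned_combinations(remaining, partitions[1:]):
            yield (picked, *rest)

char_list = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
n_list = [3, 2, 1, 2]

result = list(ordered_partitioned_combinations(char_list, n_list))
print(len(result))  # 1680
print(result[0])    # (('a', 'b', 'c'), ('d', 'e'), ('f',), ('g', 'h'))
```

The filtering step assumes the characters are hashable and unique, as they are in char_list.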

Find all unique pairs of keys of a dictionary

If there's a dictionary:
test_dict = { 'a':1,'b':2,'c':3,'d':4}
I want to find pairs of keys in list of tuples like:
[('a','b'),('a','c'),('a','d'),('b','c'),('b','d'),('c','d')]
I tried with the following double iteration
test_dict = { 'a':1,'b':2,'c':3,'d':4}
result = []
for first_key in test_dict:
    for second_key in test_dict:
        if first_key != second_key:
            pair = (first_key,second_key)
            result.append(pair)
But it's generating the following result
[('a', 'c'), ('a', 'b'), ('a', 'd'), ('c', 'a'), ('c', 'b'), ('c', 'd'), ('b', 'a'), ('b', 'c'), ('b', 'd'), ('d', 'a'), ('d', 'c'), ('d', 'b')]
For my test case ('a','b') and ('b','a') are similar and I just want one of them in the list. I had to run one more loop for getting the unique pairs from the result.
So is there any efficient way to do it in Python (preferably in 2.x)? I want to remove nested loops.
Update:
I have checked with the possible flagged duplicate, but it's not solving the problem here. It's just providing different combination. I just need the pairs of 2. For that question a tuple of ('a','b','c') and ('a','b','c','d') are valid, but for me they are not. I hope, this explains the difference.
Sounds like a job for itertools.
from itertools import combinations
test_dict = {'a':1, 'b':2, 'c':3, 'd':4}
results = list(combinations(test_dict, 2))
[('a', 'b'), ('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd'), ('c', 'd')]
I should add that although the output above happens to be sorted, this is not guaranteed. If order is important, you can instead use:
results = sorted(combinations(test_dict, 2))
Since dictionary keys are unique, this problem becomes equivalent of finding all combinations of the keys of size 2. You can just use itertools for that:
>>> test_dict = { 'a':1,'b':2,'c':3,'d':4}
>>> import itertools
>>> list(itertools.combinations(test_dict, 2))
[('c', 'a'), ('c', 'd'), ('c', 'b'), ('a', 'd'), ('a', 'b'), ('d', 'b')]
Note, these will come in no particular order, since dict objects are unordered in Python 2 and in Python 3 before 3.7 (from 3.7 on, dicts preserve insertion order). But you can sort before or after, if you want sorted order:
>>> list(itertools.combinations(sorted(test_dict), 2))
[('a', 'b'), ('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd'), ('c', 'd')]
>>>
Note, this algorithm is relatively simple if you are working with sequences like a list:
>>> ks = list(test_dict)
>>> for i, a in enumerate(ks):
...     for b in ks[i+1:]: # this is the important bit
...         print(a, b)
...
c a
c d
c b
a d
a b
d b
Or more succinctly:
>>> [(a,b) for i, a in enumerate(ks) for b in ks[i+1:]]
[('c', 'a'), ('c', 'd'), ('c', 'b'), ('a', 'd'), ('a', 'b'), ('d', 'b')]
>>>
itertools.combinations does just what you want:
from itertools import combinations
test_dict = { 'a':1,'b':2,'c':3,'d':4}
keys = tuple(test_dict)
combs = list(combinations(keys, 2))
print(combs)
# [('a', 'd'), ('a', 'b'), ('a', 'c'), ('d', 'b'), ('d', 'c'), ('b', 'c')]
combs = list(combinations(test_dict, 2)) would just do; iterating over a dictionary is just iterating over its keys...

How to flatten a list of nested tuples in Python?

I have a list of tuples that looks like this:
[('a', 'b'), ('c', 'd'), (('e', 'f'), ('h', 'i'))]
I want to turn it into this:
[('a', 'b'), ('c', 'd'), ('e', 'f'), ('h', 'i')]
What is the most Pythonic way to do this?
one-line, using list comprehension:
l = [('a', 'b'), ('c', 'd'), (('e', 'f'), ('h', 'i'))]
result = [z for y in (x if isinstance(x[0],tuple) else [x] for x in l) for z in y]
print(result)
yields:
[('a', 'b'), ('c', 'd'), ('e', 'f'), ('h', 'i')]
This artificially wraps an element in a list if it is not a tuple of tuples; flattening everything by one level then does the job. To avoid creating the single-element list [x], (x for _ in range(1)) also works (although it looks clunky).
Limitation: doesn't handle more than 1 level of nesting. In which case, a more complex/recursive solution must be coded (check Martijn's answer).
Adjust the canonical flatten recipe to only flatten when there are tuples in the value:
def flatten(l):
    for el in l:
        if isinstance(el, tuple) and any(isinstance(sub, tuple) for sub in el):
            for sub in flatten(el):
                yield sub
        else:
            yield el
This will only unwrap tuples, and only if there are other tuples in it:
>>> sample = [('a', 'b'), ('c', 'd'), (('e', 'f'), ('h', 'i'))]
>>> list(flatten(sample))
[('a', 'b'), ('c', 'd'), ('e', 'f'), ('h', 'i')]
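Because the generator recurses, the same idea also covers more than one level of nesting. A sketch, assuming Python 3.3+ for yield from; the name flatten_nested and the deeper sample are illustrative and not from the answer:

```python
def flatten_nested(items):
    """Yield the innermost tuples, unwrapping any tuple that contains tuples."""
    for el in items:
        if isinstance(el, tuple) and any(isinstance(sub, tuple) for sub in el):
            # A tuple of tuples: descend one level and keep unwrapping.
            yield from flatten_nested(el)
        else:
            yield el

deep_sample = [('a', 'b'), (('c', 'd'), (('e', 'f'), ('g', 'h')))]
print(list(flatten_nested(deep_sample)))
# [('a', 'b'), ('c', 'd'), ('e', 'f'), ('g', 'h')]
```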
A one-line solution would be using itertools.chain:
>>> l = [('a', 'b'), ('c', 'd'), (('e', 'f'), ('h', 'i'))]
>>> from itertools import chain
>>> [*chain.from_iterable(x if isinstance(x[0], tuple) else [x] for x in l)]
[('a', 'b'), ('c', 'd'), ('e', 'f'), ('h', 'i')]
