Permutations between two lists of unequal length - python

I’m having trouble wrapping my head around an algorithm I’m trying to implement. I have two lists and want to take particular combinations from the two lists.
Here’s an example.
names = ['a', 'b']
numbers = [1, 2]
the output in this case would be:
[('a', 1), ('b', 2)]
[('b', 1), ('a', 2)]
I might have more names than numbers, i.e. len(names) >= len(numbers). Here's an example with 3 names and 2 numbers:
names = ['a', 'b', 'c']
numbers = [1, 2]
output:
[('a', 1), ('b', 2)]
[('b', 1), ('a', 2)]
[('a', 1), ('c', 2)]
[('c', 1), ('a', 2)]
[('b', 1), ('c', 2)]
[('c', 1), ('b', 2)]

The simplest way is to use itertools.product:
import itertools
a = ["foo", "melon"]
b = [True, False]
c = list(itertools.product(a, b))
>> [("foo", True), ("foo", False), ("melon", True), ("melon", False)]

This may be simpler than the simplest one above, since it needs no import:
>>> a = ["foo", "bar"]
>>> b = [1, 2, 3]
>>> [(x,y) for x in a for y in b] # for a list
[('foo', 1), ('foo', 2), ('foo', 3), ('bar', 1), ('bar', 2), ('bar', 3)]
>>> ((x,y) for x in a for y in b) # for a generator, if memory is a concern (pairs are produced lazily)
<generator object <genexpr> at 0x1048de850>

Note: This answer is for the specific question asked above. If you are here from Google and just looking for a way to get a Cartesian product in Python, itertools.product or a simple list comprehension may be what you are looking for - see the other answers.
Suppose len(list1) >= len(list2). Then what you appear to want is to take all permutations of length len(list2) from list1 and match them with items from list2. In Python:
import itertools
list1=['a','b','c']
list2=[1,2]
[list(zip(x,list2)) for x in itertools.permutations(list1,len(list2))]
Returns
[[('a', 1), ('b', 2)], [('a', 1), ('c', 2)], [('b', 1), ('a', 2)], [('b', 1), ('c', 2)], [('c', 1), ('a', 2)], [('c', 1), ('b', 2)]]

I was looking for the unique combinations of a list with itself, which this function provides:
import itertools
itertools.combinations(list, n_times)
Here is an excerpt from the Python docs on itertools that might help you find what you're looking for.
Combinatoric generators:
Iterator                                 | Results
-----------------------------------------+---------------------------------------------------------------
product(p, q, ... [repeat=1])            | cartesian product, equivalent to a nested for-loop
permutations(p[, r])                     | r-length tuples, all possible orderings, no repeated elements
combinations(p, r)                       | r-length tuples, in sorted order, no repeated elements
combinations_with_replacement(p, r)      | r-length tuples, in sorted order, with repeated elements
-----------------------------------------+---------------------------------------------------------------
product('ABCD', repeat=2)                | AA AB AC AD BA BB BC BD CA CB CC CD DA DB DC DD
permutations('ABCD', 2)                  | AB AC AD BA BC BD CA CB CD DA DB DC
combinations('ABCD', 2)                  | AB AC AD BC BD CD
combinations_with_replacement('ABCD', 2) | AA AB AC AD BB BC BD CC CD DD
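To make the table concrete, here is a small sketch that runs each of the four generators on the same input and counts what comes back. The names `letters` and `counts` are only for illustration.

```python
import itertools

letters = 'ABCD'

# Number of 2-length tuples each generator yields for four letters.
counts = {
    'product': len(list(itertools.product(letters, repeat=2))),
    'permutations': len(list(itertools.permutations(letters, 2))),
    'combinations': len(list(itertools.combinations(letters, 2))),
    'combinations_with_replacement': len(
        list(itertools.combinations_with_replacement(letters, 2))),
}

for name, n in counts.items():
    print(name, n)
```

The counts line up with the example rows in the table: 16, 12, 6 and 10 entries respectively.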

The best way to find all the combinations for a large number of lists is:
import itertools
from pprint import pprint
inputdata = [
    ['a', 'b', 'c'],
    ['d'],
    ['e', 'f'],
]
result = list(itertools.product(*inputdata))
pprint(result)
The result will be:
[('a', 'd', 'e'),
 ('a', 'd', 'f'),
 ('b', 'd', 'e'),
 ('b', 'd', 'f'),
 ('c', 'd', 'e'),
 ('c', 'd', 'f')]

Or the KISS answer for short lists:
[(i, j) for i in list1 for j in list2]
Not as performant as itertools but you're using python so performance is already not your top concern...
I like all the other answers too!

You might want to try a one line list comprehension:
>>> [name+number for name in 'ab' for number in '12']
['a1', 'a2', 'b1', 'b2']
>>> [name+number for name in 'abc' for number in '12']
['a1', 'a2', 'b1', 'b2', 'c1', 'c2']

A tiny improvement on the answer from interjay, to flatten the result into a single list:
>>> import itertools
>>> list3 = [zip(x,list2) for x in itertools.permutations(list1,len(list2))]
>>> chain = itertools.chain(*list3)
>>> list4 = list(chain)
>>> list4
[('a', 1), ('b', 2), ('a', 1), ('c', 2), ('b', 1), ('a', 2), ('b', 1), ('c', 2), ('c', 1), ('a', 2), ('c', 1), ('b', 2)]

Without itertools as a flattened list:
[(list1[i], list2[j]) for i in range(len(list1)) for j in range(len(list2))]
or in Python 2:
[(list1[i], list2[j]) for i in xrange(len(list1)) for j in xrange(len(list2))]

Answering the question "given two lists, find all possible permutations of pairs of one item from each list" and using basic Python functionality (i.e., without itertools) and, hence, making it easy to replicate for other programming languages:
def rec(a, b, ll, size):
    ret = []
    for i,e in enumerate(a):
        for j,f in enumerate(b):
            l = [e+f]
            new_l = rec(a[i+1:], b[:j]+b[j+1:], ll, size)
            if not new_l:
                ret.append(l)
            for k in new_l:
                l_k = l + k
                ret.append(l_k)
                if len(l_k) == size:
                    ll.append(l_k)
    return ret
a = ['a','b','c']
b = ['1','2']
ll = []
rec(a,b,ll, min(len(a),len(b)))
print(ll)
Returns
[['a1', 'b2'], ['a1', 'c2'], ['a2', 'b1'], ['a2', 'c1'], ['b1', 'c2'], ['b2', 'c1']]

The better answers to this only work for specific lengths of lists that are provided.
Here's a version that works for any lengths of input. It also makes the algorithm clear in terms of the mathematical concepts of combination and permutation.
from itertools import combinations, permutations
list1 = ['1', '2']
list2 = ['A', 'B', 'C']
num_elements = min(len(list1), len(list2))
list1_combs = list(combinations(list1, num_elements))
list2_perms = list(permutations(list2, num_elements))
result = [
    tuple(zip(perm, comb))
    for comb in list1_combs
    for perm in list2_perms
]
for idx, ((l11, l12), (l21, l22)) in enumerate(result):
    print(f'{idx}: {l11}{l12} {l21}{l22}')
This outputs:
0: A1 B2
1: A1 C2
2: B1 A2
3: B1 C2
4: C1 A2
5: C1 B2
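The printing loop above unpacks exactly two pairs per result, so it only works when the shorter list has two items. As a rough sketch of the same combination-times-permutation idea without that assumption, it could be wrapped in a function like the one below. The name `pairings` is hypothetical, not something from the answer.

```python
from itertools import combinations, permutations

def pairings(first, second):
    """Pair k items of `first` with k items of `second` in every possible
    way, where k is the length of the shorter list."""
    k = min(len(first), len(second))
    return [
        list(zip(comb, perm))
        # choose which items of `first` take part (order fixed) ...
        for comb in combinations(first, k)
        # ... and try every ordered choice of partners from `second`
        for perm in permutations(second, k)
    ]

for pairing in pairings(['a', 'b', 'c'], [1, 2]):
    print(pairing)
```

Each pairing comes out once, and it does not matter which of the two arguments is the longer list.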

Combinations with max repetitions per element

I want to get a list of k-sized tuples with the combinations of a list of elements (let's call it elements) similar to what itertools.combinations_with_replacement(elements, k) would do. The difference is that I want to add a maximum to the number of replacements per element.
So for example if I run the following:
elements = ['a', 'b']
print(list(itertools.combinations_with_replacement(elements, 3)))
I get:
[('a', 'a', 'a'), ('a', 'a', 'b'), ('a', 'b', 'b'), ('b', 'b', 'b')]
I would like to have something like the following:
elements = {'a': 2, 'b': 3}
print(list(combinations_with_max_replacement(elements, 3)))
Which would print
[('a', 'a', 'b'), ('a', 'b', 'b'), ('b', 'b', 'b')]
Notice that the max number of 'a' in each tuple is 2 so ('a', 'a', 'a') is not part of the result.
I'd prefer to avoid looping through the results of itertools.combinations_with_replacement(elements, k) counting the elements in each tuple and filtering them out.
Let me know if I can give any further info.
Thanks for the help!
UPDATE
I tried:
elements = ['a'] * 2 + ['b'] * 3
print(set(itertools.combinations(elements, 3)))
and get:
{('a', 'b', 'b'), ('b', 'b', 'b'), ('a', 'a', 'b')}
I get the elements I need, but I lose the order and it seems kind of hacky.
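If the lost order is the main objection to this approach, one possible tweak is to deduplicate with dict.fromkeys instead of set. This is only a sketch and assumes Python 3.7 or later, where dicts keep insertion order. It still generates every duplicate combination before discarding it, so it does not address the performance concern.

```python
import itertools

elements = ['a'] * 2 + ['b'] * 3

# dict.fromkeys keeps the first occurrence of each tuple, in the order seen.
unique_combos = list(dict.fromkeys(itertools.combinations(elements, 3)))
print(unique_combos)
# [('a', 'a', 'b'), ('a', 'b', 'b'), ('b', 'b', 'b')]
```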
I know you don't want to loop through the results but maybe it's easier to filter the output in this way.
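import itertools

def custom_combinations(elements, max_count):
    # note: max_count is the length of each tuple; the per-element limits come from the dict values
    L = list(itertools.combinations_with_replacement(elements, max_count))
    for element in elements.keys():
        # keep only the tuples that respect this element's limit
        L = list(filter(lambda x: x.count(element) <= elements[element], L))
    return L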
Pure Python Solution (i.e. without itertools)
You can use recursion:
def combos(els, l):
    if l == 1:
        return [(k,) for k, v in els.items() if v]
    cs = []
    for e, n in els.items():
        if not n:
            continue  # e is used up, so it cannot start another tuple
        nd = {k: v if k != e else v - 1 for k, v in els.items() if v}
        cs += [(e,)+c for c in combos(nd, l-1)]
    return cs
and a test shows it works:
>>> combos({'a': 2, 'b': 3}, 3)
[('b', 'b', 'b'), ('b', 'b', 'a'), ('b', 'a', 'b'), ('b', 'a', 'a'), ('a', 'b', 'b'), ('a', 'b', 'a'), ('a', 'a', 'b')]
Note that we do lose the order, but this is unavoidable if we are passing els as a dictionary as you requested.
I believe this recursive solution has the time complexity you desire.
Rather than passing down a dict, we pass down a list of the item pairs. We also pass down start_idx, which tells the 'lower' recursive function calls to ignore earlier elements. This fixes the out-of-order problem of the other recursive answer.
def _combos(elements, start_idx, length):
    # ignore elements before start_idx
    for i in range(start_idx, len(elements)):
        elem, count = elements[i]
        if count == 0:
            continue
        # base case: only one element needed
        if length == 1:
            yield (elem,)
        else:
            # need more than one elem: mutate the list and recurse
            elements[i] = (elem, count - 1)
            # when we recurse, we ignore elements before this one
            # this ensures we find combinations, not permutations
            for combo in _combos(elements, i, length - 1):
                yield (elem,) + combo
            # fix the list
            elements[i] = (elem, count)

def combos(elements, length):
    elements = list(elements.items())
    return _combos(elements, 0, length)

print(list(combos({'a': 2, 'b': 3}, 3)))
# [('a', 'a', 'b'), ('a', 'b', 'b'), ('b', 'b', 'b')]
As a bonus, profiling shows it becomes more performant than the set(itertools.combinations(_)) solution as the input size grows (it is slower for the smallest input below).
import timeit
print(timeit.Timer("list(combos({'a': 2, 'b': 2, 'c': 2}, 3))",
                   setup="from __main__ import combos").timeit())
# 9.647649317979813
print(timeit.Timer("set(itertools.combinations(['a'] * 2 + ['b'] * 2 + ['c'] * 2, 3))",
                   setup="import itertools").timeit())
# 1.7750148189952597
print(timeit.Timer("list(combos({'a': 4, 'b': 4, 'c': 4}, 4))",
                   setup="from __main__ import combos").timeit())
# 20.669851204031147
print(timeit.Timer("set(itertools.combinations(['a'] * 4 + ['b'] * 4 + ['c'] * 4, 4))",
                   setup="import itertools").timeit())
# 28.194088937016204
print(timeit.Timer("list(combos({'a': 5, 'b': 5, 'c': 5}, 5))",
                   setup="from __main__ import combos").timeit())
# 36.4631432640017
print(timeit.Timer("set(itertools.combinations(['a'] * 5 + ['b'] * 5 + ['c'] * 5, 5))",
                   setup="import itertools").timeit())
# 177.29063899395987

How can I create word count output in Python just by using the reduce function?

I have the following list of tuples: [('a', 1), ('a', 1), ('b', 1), ('c',1), ('a', 1), ('c', 1)]
I would like to know if I can utilize python's reduce function to aggregate them and produce the following output : [('a', 3), ('b', 1), ('c', 2)]
Or if there are other ways, I would like to know as well (loop is fine)
It seems difficult to achieve using reduce, because if both tuples that you "reduce" don't bear the same letter, you cannot compute the result. How to reduce ('a',1) and ('b',1) to some viable result?
The best I could do was:
import functools
l = functools.reduce(lambda x,y : (x[0],x[1]+y[1]) if x[0]==y[0] else x+y,sorted(l))
It got me ('a', 3, 'b', 1, 'c', 1, 'c', 1). So it kind of worked for the first element, but it would need more than one pass to handle the other ones (recreating tuples and running another similar reduce, which is not very efficient, to say the least!).
Anyway, here are 2 working ways of doing it.
First, using collections.Counter counting elements of the same kind:
l = [('a', 1), ('a', 1), ('b', 1), ('c',1), ('a', 1), ('c', 1)]
import collections
c = collections.Counter()
for a,i in l:
    c[a] += i
We cannot just feed the letters to Counter through a list comprehension, because each element carries a weight (even if here it is 1).
Result: a dictionary: Counter({'a': 3, 'c': 2, 'b': 1})
Second option: use itertools.groupby on the sorted list, grouping by name/letter, and performing the sum on the integers bearing the same letter:
import itertools
print([(k,sum(e for _,e in v)) for k,v in itertools.groupby(sorted(l),key=lambda x : x[0])])
result:
[('a', 3), ('b', 1), ('c', 2)]
An alternative approach using the defaultdict subclass and the sum function:
import collections
l = [('a', 1), ('a', 1), ('b', 1), ('c',1), ('a', 1), ('c', 1)]
d = collections.defaultdict(list)
for t in l:
    d[t[0]].append(t[1])
result = [(k,sum(v)) for k,v in d.items()]
print(result)
The output:
[('b', 1), ('a', 3), ('c', 2)]
Another way is to create your own custom reduce function.
For example:
l = [('a', 1), ('a', 1), ('b', 1), ('c',1), ('a', 1), ('c', 1)]
def myreduce(func, seq):
    output_dict = {}
    for k,v in seq:
        # combine the running total for this key with the new value
        output_dict[k] = func(output_dict.get(k,0),v)
    return output_dict
myreduce((lambda total,value: total+value),l)
Output:
{'a': 3, 'b': 1, 'c': 2}
Later on you can convert the generated output into a list of tuples.
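For completeness, the standard functools.reduce can also produce the requested output if the accumulator is a dict rather than a tuple. The following is only a sketch of that idea; the names `pairs`, `add_pair` and `totals` are made up for the example.

```python
import functools

pairs = [('a', 1), ('a', 1), ('b', 1), ('c', 1), ('a', 1), ('c', 1)]

def add_pair(totals, pair):
    """Fold one (key, count) pair into the running totals dict."""
    key, count = pair
    totals[key] = totals.get(key, 0) + count
    return totals

# The third argument ({}) is the initial accumulator passed to add_pair.
totals = functools.reduce(add_pair, pairs, {})
result = sorted(totals.items())
print(result)
# [('a', 3), ('b', 1), ('c', 2)]
```

Passing an initial value sidesteps the problem described earlier, because the function never has to merge two tuples with different letters.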

Contracting elements from two different lists

I have two different lists list1 = ['A','B'] and list2 = ['C','D','E']. I would like to be able to find all possible contractions between the elements of the two lists. For the present case I would like to have a code (preferably Python, Mathematica or MATLAB) that takes the lists above and returns:
AC,BD , AC,BE , AD,BC , AD,BE , AE,BC , AE,BD
which are all the possible contractions. I would like to be able to do this for lists of variable size (but always 2 of them). I've played a lot with Python's itertools but I can't get the hang of how it works with two lists. Any help would be much appreciated.
Here is my version:
import itertools
l1 = 'AB'
l2 = 'CDE'
n = min(len(l1),len(l2))
print('; '.join(
    ','.join(a+b for a,b in zip(s1,s2))
    for s1,s2 in itertools.product(
        itertools.permutations(l1,n),
        itertools.combinations(l2,n),
    )
))
This will output:
AC,BD; AC,BE; AD,BE; BC,AD; BC,AE; BD,AE
Note that for brevity, I did not build a list of the items, but directly iterated the strings. It does not matter which of the two lists gets permutations and which gets combinations; that just changes the order of the output. permutations takes all possible orderings, while combinations returns only the orderings that follow the input order. This way, you get each contraction exactly once.
For each contraction, you will get two sequences s1 and s2, the contraction is between elements of like index in each sequence. ','.join(a+b for a,b in zip(s1,s2)) makes a nice string for such a contraction.
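As a sanity check on the "exactly once" claim, the same expression can be wrapped in a function and its result count compared with the number of ordered selections of the shorter length from the longer list. This is a sketch: `contractions` is a hypothetical helper name, and math.perm requires Python 3.8 or later.

```python
import itertools
import math

def contractions(l1, l2):
    """Return each contraction between l1 and l2 as a string like 'AC,BD'."""
    n = min(len(l1), len(l2))
    return [
        ','.join(a + b for a, b in zip(s1, s2))
        for s1, s2 in itertools.product(
            itertools.permutations(l1, n),
            itertools.combinations(l2, n),
        )
    ]

found = contractions('AB', 'CDE')
# Ordered ways to pick min(len) partners out of max(len) candidates.
expected = math.perm(max(len('AB'), len('CDE')), min(len('AB'), len('CDE')))
print(found, len(found) == expected)
```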
listA = {"A", "B"};
listB = {"C", "D", "E"};
f[x_, y_] := If[StringMatchQ[StringTake[x, {2}], StringTake[y, {2}]],
   Sequence @@ {}, List[x, y]];
z = Outer[StringJoin, listA, listB];
Flatten[Outer[f, First@z, Last@z], 1]
In [2]: list1 = ['A','B']
In [3]: list2 = ['C','D','E']
In [4]: list(itertools.product(list1, list2))
Out[4]: [('A', 'C'), ('A', 'D'), ('A', 'E'), ('B', 'C'), ('B', 'D'), ('B', 'E')]
In [5]: [''.join(p) for p in itertools.product(list1, list2)]
Out[5]: ['AC', 'AD', 'AE', 'BC', 'BD', 'BE']
If you're asking about how to build all permutations of the items contained within both lists, with no repetitions, with each result of length two, you could use itertools.permutations:
import itertools
combined_list = []
for i in list1 + list2:
    if i not in combined_list:
        combined_list.append(i)
for perm in itertools.permutations(combined_list, 2):
    print(perm)
For the inputs list1 = ['a', 'b'] and list2 = ['c', 'd', 'e'], this outputs:
('a', 'b') ('a', 'c') ('a', 'd') ('a', 'e') ('b', 'a') ('b', 'c') ('b', 'd') ('b', 'e') ('c', 'a') ('c', 'b') ('c', 'd') ('c', 'e') ('d', 'a') ('d', 'b') ('d', 'c') ('d', 'e') ('e', 'a') ('e', 'b') ('e', 'c') ('e', 'd')

Python: how to get sorted count of items in a list?

In Python, I've got a list of items like:
mylist = [a, a, a, a, b, b, b, d, d, d, c, c, e]
And I'd like to output something like:
a (4)
b (3)
d (3)
c (2)
e (1)
How can I output a count and leaderboard of items in a list? I'm not too bothered about efficiency, just any way that works :)
Thanks!
I'm surprised that no one has mentioned collections.Counter. Assuming
import collections
mylist = ['a', 'a', 'a', 'a', 'b', 'b', 'b', 'd', 'd', 'd', 'c', 'c', 'e']
it's just a one liner:
print(collections.Counter(mylist).most_common())
which will print:
[('a', 4), ('b', 3), ('d', 3), ('c', 2), ('e', 1)]
Note that Counter is a subclass of dict with some useful methods for counting objects. Refer to the documentation for more info.
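To get from that list of tuples to the exact layout asked for in the question, the pairs can be formatted one per line. This is a small sketch, and the tie order shown (b before d) assumes Python 3.7 or later, where most_common lists equal counts in the order first encountered.

```python
import collections

mylist = ['a', 'a', 'a', 'a', 'b', 'b', 'b', 'd', 'd', 'd', 'c', 'c', 'e']

# One "item (count)" line per distinct item, highest count first.
lines = [f'{item} ({count})'
         for item, count in collections.Counter(mylist).most_common()]
print('\n'.join(lines))
```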
from collections import defaultdict
def leaders(xs, top=10):
    counts = defaultdict(int)
    for x in xs:
        counts[x] += 1
    return sorted(counts.items(), reverse=True, key=lambda tup: tup[1])[:top]
So this function uses a defaultdict to count the number of each entry in our list. We then take each pair of the entry and its count and sort it in descending order according to the count. We then take the top number of entries and return that.
So now we can say (entries with equal counts may come out in a different order depending on the Python version):
>>> xs = list("jkl;fpfmklmcvuioqwerklmwqpmksdvjioh0-45mkofwk903rmiok0fmdfjsd")
>>> print(leaders(xs))
[('k', 7), ('m', 7), ('f', 5), ('o', 4), ('0', 3), ('d', 3), ('i', 3), ('j', 3), ('l', 3), ('w', 3)]
A two-liner:
for count, elem in sorted(((mylist.count(e), e) for e in set(mylist)), reverse=True):
    print('%s (%d)' % (elem, count))
