Programmatically generate a list of combinations of other lists - python

I want to create a list of possible combinations of a list of lists (example will explain better)
list=[[a,b],[c,d],[f]]
The result should be
acf
adf
bcf
bdf
The length of the list can vary, and the length of the lists within the variable list can also vary. How would I make this/these loop(s) programmatically? (preferably explained in Python or pseudo-language)

That's what itertools.product is for:
>>> lst = ['ab','cd','f']
>>> from itertools import product
>>> list(product(*lst))
[('a', 'c', 'f'), ('a', 'd', 'f'), ('b', 'c', 'f'), ('b', 'd', 'f')]

import itertools
lists = [['a','b'],['c','d'],['f']]  # renamed from `list` to avoid shadowing the built-in
for comb in itertools.product(*lists):
    print(''.join(comb))

You can do it recursively:
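def printCombos(arrays, combo):
    # Base case: no sub-lists left, so combo holds one complete combination.
    if len(arrays) == 0:
        print(combo)
    else:
        for i in arrays[0]:
            combo.append(i)
            printCombos(arrays[1:], combo)
            combo.pop()  # backtrack before trying the next item
l=[['a','b'],['c','d'],['f']]
printCombos(l, [])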
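A variant of the recursive approach that yields each combination instead of printing it may be easier to reuse. This is only a sketch: the name iter_combos and its prefix parameter are made up for illustration and are not part of the answer above.

```python
def iter_combos(arrays, prefix=()):
    """Yield one tuple per combination, taking one item from each sub-list."""
    if not arrays:
        # Nothing left to choose from: the prefix is a complete combination.
        yield prefix
        return
    for item in arrays[0]:
        # Fix the first choice, then recurse on the remaining sub-lists.
        yield from iter_combos(arrays[1:], prefix + (item,))


lists = [['a', 'b'], ['c', 'd'], ['f']]
results = [''.join(combo) for combo in iter_combos(lists)]
print(results)  # ['acf', 'adf', 'bcf', 'bdf']
```

Because it builds a new tuple at each level rather than mutating a shared list, the caller can safely keep every yielded combination.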

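# Hard-coded for exactly three sub-lists; listoflists is assumed to be defined already.
for firstobj in listoflists[0]:
    for secondobj in listoflists[1]:
        for lastobj in listoflists[2]:
            # Build a fresh list per combination so earlier items don't accumulate.
            curlist = [firstobj, secondobj, lastobj]
            print(','.join(curlist))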

Related

Sequential nested list with string concatenation

Given the nested list:
l = [['a','b','c'], ['d'], ['e','f']]
I would like to join them sequentially with '/'.join().
With the expected list result:
['a/d/e', 'a/d/f', 'b/d/e', 'b/d/f', 'c/d/e', 'c/d/f']
The solution needs to be able to scale (2D list of various sizes).
What is the best way to achieve this?
This is what's known as a Cartesian product. Here's an approach using itertools.product:
import itertools as it
list("/".join(p) for p in it.product(*l))
Output:
['a/d/e', 'a/d/f', 'b/d/e', 'b/d/f', 'c/d/e', 'c/d/f']
The itertools.product function takes an arbitrary number of iterables as arguments (and an optional repeat parameter). What I'm doing with *l is unpacking your sublists as separate arguments to the itertools.product function. This is essentially what it sees:
it.product(["a", "b", "c"], ["d"], ["e", "f"])
PS - you could actually use strings as well, since strings are iterable:
In [6]: list(it.product("abc", "d", "ef"))
Out[6]:
[('a', 'd', 'e'),
('a', 'd', 'f'),
('b', 'd', 'e'),
('b', 'd', 'f'),
('c', 'd', 'e'),
('c', 'd', 'f')]
Beware that the size of the Cartesian product of collections A, B, etc. is the product of the sizes of each collection. For example, the Cartesian product of (0, 1) and ("a", "b", "c") has 2x3=6 elements. Adding a third collection, (5, 6, 7, 8), bumps the size up to 24.
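The size claim can be checked directly. The following is a small sketch, assuming Python 3.8+ for math.prod; the variable names are illustrative only.

```python
import itertools
import math

groups = [(0, 1), ("a", "b", "c"), (5, 6, 7, 8)]

# Predicted size: the product of the individual lengths (2 * 3 * 4).
expected_size = math.prod(len(g) for g in groups)

# Actual size: count the tuples one at a time, without building a list of them.
actual_size = sum(1 for _ in itertools.product(*groups))

print(expected_size, actual_size)  # 24 24
```

Counting lazily like this avoids holding the whole product in memory, which matters once the inputs grow.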
You need to unpack the sublists and use itertools.product:
from itertools import product
out = ['/'.join(tpl) for tpl in product(*l)]
Output:
['a/d/e', 'a/d/f', 'b/d/e', 'b/d/f', 'c/d/e', 'c/d/f']

Combinations of a list of items in efficient way

I am trying to find if there is a more efficient way of finding these combinations using some Python scientific library.
I am trying to avoid native for loops and list appends, preferring some NumPy or similar functionality that in theory should be more efficient, given that it uses C code under the hood. I am struggling to find one, even though doing these operations efficiently, rather than with slow native Python structures, seems like quite a common problem to me.
I am wondering if I am looking in the wrong places? E.g. this does not seem to help here: https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.random.binomial.html
See here I am taking the binomial coefficients of a list of length 5 starting from a lower bound of 2 and finding out all the possible combinations. Meanwhile I append to a global list so I then have a nice list of "taken items" from the original input list.
import itertools
input_list = ['a', 'b', 'c', 'd', 'e']
minimum_amount = 2
comb_list = []
for i in range(minimum_amount, len(input_list)):
    curr_list = input_list[:i+1]
    print(f"the current index is: {i}, the lists are: {curr_list}")
    curr_comb_list = list(itertools.combinations(curr_list, i))
    comb_list = comb_list + curr_comb_list
print(f"found {len(comb_list)} combinations (check on set length: {len(set(comb_list))})")
print(comb_list)
Gives:
found 12 combinations (check on set length: 12)
[('a', 'b'), ('a', 'c'), ('b', 'c'), ('a', 'b', 'c'), ('a', 'b', 'd'),
('a', 'c', 'd'), ('b', 'c', 'd'), ('a', 'b', 'c', 'd'), ('a', 'b', 'c', 'e'),
('a', 'b', 'd', 'e'), ('a', 'c', 'd', 'e'), ('b', 'c', 'd', 'e')]
Is it possible to do this avoiding the for loop and using some scientific libraries to do this quicker?
How can I do this in a quicker way?
The final list contains all combinations of any length from 1 to len(input_list), which is actually the Power Set.
Look at How to get all possible combinations of a list’s elements?.
You want all combinations from input_list of length 2 or more.
To get them, you can run:
comb_lst = list(itertools.chain.from_iterable(
    itertools.combinations(input_list, i)
    for i in range(2, len(input_list) + 1)))  # + 1 so the full-length combination is included
This is similar to the powerset recipe in the itertools documentation,
but not exactly the same (the length starts from 2, not from 0).
Note also that curr_list in your code is actually used only for printing.
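As a sketch of the "length 2 or more" idea in lazy form: combinations_of_at_least is a hypothetical helper name, and math.comb (Python 3.8+) is used only to cross-check the count. It follows the "length 2 or more" reading, including the full-length combination, rather than reproducing the prefix-based loop from the question.

```python
import itertools
import math


def combinations_of_at_least(items, minimum):
    """Lazily yield every combination of `items` with at least `minimum` elements."""
    sizes = range(minimum, len(items) + 1)
    return itertools.chain.from_iterable(
        itertools.combinations(items, size) for size in sizes
    )


input_list = ['a', 'b', 'c', 'd', 'e']
combos = list(combinations_of_at_least(input_list, 2))

# Cross-check: C(5,2) + C(5,3) + C(5,4) + C(5,5) = 10 + 10 + 5 + 1 = 26
expected = sum(math.comb(len(input_list), k) for k in range(2, len(input_list) + 1))
print(len(combos), expected)  # 26 26
```

The loop over sizes still exists, but it runs only a handful of times; the per-combination work happens inside itertools, which is implemented in C in CPython.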

Markov Chain from String

I am currently stuck on a problem concerning Markov chains, where the input is given in the form of a list of strings. This input has to be transformed into a Markov chain. I have already been working on this problem for a couple of hours.
My idea: As you can see below I have tried to use the counter from collections to count all transitions, which has worked. Now I am trying to count all the tuples where A and B are the first elements. This gives me all possible transitions for A.
Then I'll count the transitions like (A, B).
Then I want to use these to create a matrix with all probabilities.
from collections import Counter
def markov(seq):
    states = Counter(seq).keys()
    liste = []
    print(states)
    a = zip(seq[:-1], seq[1:])
    print(list(a))
print(markov(["A","A","B","B","A","B","A","A","A"]))
So far I can't get the counting of the tuples to work.
Any help or new ideas on how to solve this are appreciated.
To count the tuple, you can create another counter.
b = Counter()
for word_pair in a:
    b[word_pair] += 1
b will keep the count of the pair.
To create the matrix, you can use numpy (imported as np).
c = np.array([[b[(i,j)] for j in states] for i in states], dtype = float)
I will leave the task of normalizing each row sum to 1 as an exercise.
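For reference, the normalization step could look roughly like this. The sketch deliberately uses plain Python lists instead of numpy so that it runs without extra dependencies; the names pair_counts, states and matrix are illustrative, and sorting the states is an assumption made only to get a fixed row and column order.

```python
from collections import Counter

seq = ["A", "A", "B", "B", "A", "B", "A", "A", "A"]

# Count each (current, next) transition.
pair_counts = Counter(zip(seq[:-1], seq[1:]))

# Fix an explicit state order so rows and columns line up.
states = sorted(set(seq))

matrix = []
for src in states:
    row = [pair_counts[(src, dst)] for dst in states]
    total = sum(row)
    # Guard against a state that never appears as a source (e.g. only at the very end).
    matrix.append([count / total if total else 0.0 for count in row])

for src, row in zip(states, matrix):
    print(src, row)  # each row sums to 1; the row for "A" is [0.6, 0.4]
```

Each row is divided by its own total, so entry [i][j] estimates the probability of moving from state i to state j.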
I didn't get exactly what you wanted but here is what I think it is:
from collections import Counter
def count_occurence(seq):
    counted_states = []
    transition_dict = {}
    for tup in seq:
        if tup not in counted_states:
            transition_dict[tup] = seq.count(tup)
            counted_states.append(tup)
    print(transition_dict)
    #{('A', 'A'): 3, ('A', 'B'): 2, ('B', 'B'): 1, ('B', 'A'): 2}
def markov(seq):
    states = Counter(seq).keys()
    print(states)
    #dict_keys(['A', 'B'])
    a = list(zip(seq[:-1], seq[1:]))
    print(a)
    #[('A', 'A'), ('A', 'B'), ('B', 'B'), ('B', 'A'), ('A', 'B'), ('B',
    #'A'), ('A', 'A'), ('A', 'A')]
    return a
seq = markov(["A","A","B","B","A","B","A","A","A"])
count_occurence(seq)

Get unique products between lists and maintain order of input

There are quite a lot of questions about the unique (Cartesian) product of lists, but I am looking for something peculiar that I haven't found in any of the other questions.
My input will always consist of two lists. When the lists are identical, I want to get all combinations but when they are different I need the unique product (i.e. order does not matter). However, in addition I also need the order to be preserved, in the sense that the order of the input lists matters. In fact, what I need is that the items in the first list should always be the first item of the product tuple.
I have the following working code, which does what I want with the exception I haven't managed to find a good, efficient way to keep the items ordered as described above.
import itertools
xs = ['w']
ys = ['a', 'b', 'c']
def get_up(x_in, y_in):
    if x_in == y_in:
        return itertools.combinations(x_in, 2)
    else:
        ups = []
        for x in x_in:
            for y in y_in:
                if x == y:
                    continue
                # sort so that cases such as (a,b) (b,a) get filtered by set later on
                ups.append(sorted((x, y)))
        ups = set(tuple(up) for up in ups)
        return ups
print(list(get_up(xs, ys)))
# [('c', 'w'), ('b', 'w'), ('a', 'w')]
As you can see, the result is a list of unique tuples that are ordered alphabetically. I used the sorting so I could filter duplicate entries by using a set. But because the first list (xs) contains the w, I want the tuples to have that w as a first item.
[('w', 'c'), ('w', 'b'), ('w', 'a')]
If the two lists overlap, the order of the items that occur in both lists doesn't matter, so for xs = ['w', 'a', 'b'] and ys = ['a', 'b', 'c'] the order for a doesn't matter:
[('w', 'c'), ('w', 'b'), ('w', 'a'), ('a', 'b'), ('a', 'c'), ('b', 'c')]
^
or
[('w', 'c'), ('w', 'b'), ('w', 'a'), ('a', 'c'), ('b', 'a'), ('b', 'c')]
^
Preferably I'd end up with a generator (as combinations returns). I'm also only interested in Python >= 3.6.
Collect the tuples in an order-preserving way (as when the lists are identical), then filter by removing tuples whose inverse is also in the list.
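def get_up(x_in, y_in):
    if x_in == y_in:
        # yield from, not return: the yield below makes this function a generator,
        # so a returned iterator would never reach the caller.
        yield from itertools.combinations(x_in, 2)
    else:
        seen = set()
        for a, b in itertools.product(x_in, y_in):
            if a == b or (b, a) in seen:
                continue
            else:
                yield (a, b)
                seen.add((a, b))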
This will give you the tuples in (x, y) order; when both (a,b) and (b,a) occur, you get only the order seen first.
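To make the resulting order concrete, here is a self-contained sketch of the same filtering idea run on the two inputs from the question. The function name unique_ordered_pairs is hypothetical, and the sketch assumes the items within each list are distinct.

```python
import itertools


def unique_ordered_pairs(x_in, y_in):
    """Yield (x, y) pairs with x from x_in, skipping x == y and mirrored repeats."""
    seen = set()
    for a, b in itertools.product(x_in, y_in):
        if a == b or (a, b) in seen or (b, a) in seen:
            continue
        seen.add((a, b))
        yield (a, b)


print(list(unique_ordered_pairs(['w'], ['a', 'b', 'c'])))
# [('w', 'a'), ('w', 'b'), ('w', 'c')]
print(list(unique_ordered_pairs(['w', 'a', 'b'], ['a', 'b', 'c'])))
# [('w', 'a'), ('w', 'b'), ('w', 'c'), ('a', 'b'), ('a', 'c'), ('b', 'c')]
```

Under the distinct-items assumption, passing the same list twice gives the same pairs as itertools.combinations(x_in, 2), so this sketch has no separate branch for identical lists.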
I'll give an answer to my own question, though I bet there is a better solution using itertools or others.
xs = ['c', 'b']
ys = ['a', 'b', 'c']
def get_unique_combinations(x_in, y_in):
    """ get unique combinations that maintain order, i.e. x is before y """
    yielded = set()
    for x in x_in:
        for y in y_in:
            if x == y or (x, y) in yielded or (y, x) in yielded:
                continue
            yield x, y
            yielded.add((x, y))
    return None
print(list(get_unique_combinations(xs, ys)))

Converting the output of itertools.permutations from list of tuples to list of strings

Having some issues with a list after using the itertools permutations function.
from itertools import permutations
def longestWord(letters):
    combinations = list(permutations(letters))
    for s in combinations:
        ''.join(s)
    print(combinations)
longestWord("aah")
The output looks like this:
[('a', 'a', 'h'), ('a', 'h', 'a'), ('a', 'a', 'h'), ('a', 'h', 'a'),
('h', 'a', 'a'), ('h', 'a', 'a')]
I would like this to be a simple list, but it seems to be coming out as a list of tuples(?). Can anyone help me format this so it comes out as the following:
['aah', 'aha', 'aah', 'aha', 'haa', 'haa']
from itertools import permutations
def longestWord(letters):
    return [''.join(i) for i in permutations(letters)]
print(longestWord("aah"))
Result:
['aah', 'aha', 'aah', 'aha', 'haa', 'haa']
A few suggestions:
Don't print inside the function, return instead and print the returned value.
The variable name combinations is misleading, as a combination is different from a permutation.
Your join wasn't doing anything: join doesn't change the value in place, it returns a new string.
The function name does not describe what it does. Longest word?
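The point about join can be seen in a tiny sketch (the variable names are illustrative only):

```python
letters = ('a', 'h', 'a')

# str.join builds and returns a new string; it does not modify its argument.
joined = ''.join(letters)

print(joined)   # aha
print(letters)  # ('a', 'h', 'a')
```

So the result has to be stored or returned; calling join and discarding the return value has no visible effect.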
Permutations returns an iterator yielding tuples so you need to join them. A map is a nice way to do it instead of your for-loop.
from itertools import permutations
def longestWord(letters):
    combinations = list(map("".join, permutations(letters)))
    print(combinations)
longestWord("aah")
The way you were doing it, you were joining the letters in each tuple into a single string but you weren't altering the combinations list.
One-liner:
[''.join(h) for h in [list(k) for k in longestWord("aah")]]
Try this instead:
combinations = permutations(letters)
print([''.join(x) for x in combinations])
(Your join wasn't really doing anything useful--after the join was performed its return value wasn't saved.)
