Extracting Unique String Combinations from List of List in Python

I'm trying to extract all the unique combinations of strings from a list of lists in Python, where the order of the strings does not matter. For example, in the code below, ['a', 'b', 'c'] and ['b', 'a', 'c'] count as the same combination, while ['a', 'b', 'c'] and ['a', 'e', 'f'], or ['a', 'b', 'c'] and ['d', 'e', 'f'], are different.
I've tried converting my list of lists to a list of tuples and using sets to compare elements, but all elements are still being returned.
combos = [['a', 'b', 'c'], ['c', 'b', 'a'], ['d', 'e', 'f'], ['c', 'a', 'b'], ['c', 'f', 'b']]
# converting list of list to list of tuples, so they can be converted into a set
combos = [tuple(item) for item in combos]
combos = set(combos)
grouping_list = set()
for combination in combos:
    if combination not in grouping_list:
        grouping_list.add(combination)
print(grouping_list)
>>> set([('a', 'b', 'c'), ('c', 'a', 'b'), ('d', 'e', 'f'), ('c', 'b', 'a'), ('c', 'f', 'b')])

How about sorting (and using a Counter)?
from collections import Counter
combos = [['a', 'b', 'c'], ['c', 'b', 'a'], ['d', 'e', 'f'], ['c', 'a', 'b'], ['c', 'f', 'b']]
combos = Counter(tuple(sorted(item)) for item in combos)
print(combos)
returns:
Counter({('a', 'b', 'c'): 3, ('d', 'e', 'f'): 1, ('b', 'c', 'f'): 1})
EDIT: I'm not sure I understand your question correctly. You can use a Counter to count occurrences, or a set if you only care about the distinct combinations and not about how often each one occurs.
Something like:
combos = set(tuple(sorted(item)) for item in combos)
Simply returns
set([('a', 'b', 'c'), ('d', 'e', 'f'), ('b', 'c', 'f')])
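If the original order of the list matters, the same sorted-tuple key can be combined with a "seen" set. This is only a sketch: the function name unique_combos is made up for illustration, and it assumes the strings inside each combination can be sorted.

```python
def unique_combos(combos):
    """Keep the first occurrence of each combination, ignoring element order."""
    seen = set()
    unique = []
    for combo in combos:
        key = tuple(sorted(combo))  # order-independent, hashable key
        if key not in seen:
            seen.add(key)
            unique.append(list(combo))
    return unique

combos = [['a', 'b', 'c'], ['c', 'b', 'a'], ['d', 'e', 'f'], ['c', 'a', 'b'], ['c', 'f', 'b']]
print(unique_combos(combos))
# [['a', 'b', 'c'], ['d', 'e', 'f'], ['c', 'f', 'b']]
```

Unlike the plain set, this keeps each surviving combination exactly as it first appeared in the input.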

>>> set(tuple(set(combo)) for combo in combos)
{('a', 'c', 'b'), ('c', 'b', 'f'), ('e', 'd', 'f')}
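This is simple, but if a combo contains repeated elements it will return the wrong answer, because set() collapses the duplicates. It also relies on equal sets iterating in the same order, which Python does not guarantee. In those cases sorting is the way to go, as suggested in the other answers.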
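To make the caveat about repeated elements concrete, here is a small comparison of the two kinds of key. The helper names key_by_set and key_by_sorted are invented for this illustration.

```python
def key_by_set(combo):
    return frozenset(combo)        # repeated elements are dropped

def key_by_sorted(combo):
    return tuple(sorted(combo))    # repeated elements are kept

first, second = ['a', 'a', 'b'], ['a', 'b', 'b']

print(key_by_set(first) == key_by_set(second))        # True: wrongly treated as the same combo
print(key_by_sorted(first) == key_by_sorted(second))  # False: correctly kept apart
```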

How about this:
combos = [['a', 'b', 'c'], ['c', 'b', 'a'], ['d', 'e', 'f'], ['c', 'a', 'b'], ['c', 'f', 'b']]
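# Note: list(y) splits the joined string back into single characters,
# so this assumes every string is exactly one character long.
print([list(y) for y in set(''.join(sorted(c)) for c in combos)])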

Related

How to do selective combination in Python List?

I'm not really sure how to frame my question, but here's a try. I have a list of strings and tuples of strings. I want all combinations such that I pick only one value from each tuple.
It's much easier to demonstrate with an example.
Input:
x = ['a', ('b', 'c'), ('d', 'e'), 'f']
Output:
y = [
['a', 'b', 'd', 'f'],
['a', 'c', 'd', 'f'],
['a', 'b', 'e', 'f'],
['a', 'c', 'e', 'f']
]
Example 2:
Input:
x = ['a', ('b', 'c'), ('d', 'e'), 'f', ('g', 'h')]
Output:
y = [
['a', 'b', 'd', 'f', 'g'],
['a', 'c', 'd', 'f', 'g'],
['a', 'b', 'e', 'f', 'g'],
['a', 'c', 'e', 'f', 'g'],
['a', 'b', 'd', 'f', 'h'],
['a', 'c', 'd', 'f', 'h'],
['a', 'b', 'e', 'f', 'h'],
['a', 'c', 'e', 'f', 'h']
]

x = ['a', ('b', 'c'), ('d', 'e'), 'f', ('g', 'h')]
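First normalize your input so that every element is a tuple:
x = [xx if isinstance(xx, tuple) else (xx,) for xx in x]
Then:
import itertools
list(itertools.product(*x))
The output is a list of tuples; it should be very easy to turn that into the list of lists you want.
Actually, for single-character strings the normalization step is not necessary, because iterating over 'a' yields just 'a'. It does matter for longer strings.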
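Putting the two steps together, a minimal sketch could look like this. The function name selective_combinations is an assumption for illustration; the wrapping step is what keeps multi-character strings from being split into letters.

```python
import itertools

def selective_combinations(items):
    # Wrap bare strings in a 1-tuple so every position is a tuple of choices.
    pools = [item if isinstance(item, tuple) else (item,) for item in items]
    # product() picks one value from each pool; convert each result to a list.
    return [list(combo) for combo in itertools.product(*pools)]

x = ['a', ('b', 'c'), ('d', 'e'), 'f']
for row in selective_combinations(x):
    print(row)
```

This yields the same four rows as in the question, although itertools.product varies the last position fastest, so they come out in a different order than the question lists them.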

Deleting Duplicate Tuples of Lists from List

I want to write a script to take a list of categories and return the unique ways to split the categories into 2 groups. For now I have it in tuple form (list_a, list_b) where the union of list_a and list_b represents the full list of categories.
Below is an example with categories ['A','B','C','D'], where I can get all the groups. However, some are duplicates: (['A'], ['B', 'C', 'D']) represents the same split as (['B', 'C', 'D'], ['A']). How do I retain only unique splits? Also what is a better title for this post?
import itertools
def getCompliment(smallList, fullList):
    compliment = list()
    for item in fullList:
        if item not in smallList:
            compliment.append(item)
    return compliment
optionList = ['A','B','C','D']
combos = list()
for r in range(1,len(optionList)):
    tuples = list(itertools.combinations(optionList, r))
    for t in tuples:
        combos.append((list(t),getCompliment(list(t), optionList)))
print(combos)
[(['A'], ['B', 'C', 'D']),
(['B'], ['A', 'C', 'D']),
(['C'], ['A', 'B', 'D']),
(['D'], ['A', 'B', 'C']),
(['A', 'B'], ['C', 'D']),
(['A', 'C'], ['B', 'D']),
(['A', 'D'], ['B', 'C']),
(['B', 'C'], ['A', 'D']),
(['B', 'D'], ['A', 'C']),
(['C', 'D'], ['A', 'B']),
(['A', 'B', 'C'], ['D']),
(['A', 'B', 'D'], ['C']),
(['A', 'C', 'D'], ['B']),
(['B', 'C', 'D'], ['A'])]
I need the following:
[(['A'], ['B', 'C', 'D']),
(['B'], ['A', 'C', 'D']),
(['C'], ['A', 'B', 'D']),
(['D'], ['A', 'B', 'C']),
(['A', 'B'], ['C', 'D']),
(['A', 'C'], ['B', 'D']),
(['A', 'D'], ['B', 'C'])]

You are very close. What you need is a set of results.
Since set elements must be hashable and list objects are not hashable, you can use tuple instead. This can be achieved by some trivial changes to your code.
import itertools
def getCompliment(smallList, fullList):
    compliment = list()
    for item in fullList:
        if item not in smallList:
            compliment.append(item)
    return tuple(compliment)
optionList = ('A','B','C','D')
combos = set()
for r in range(1,len(optionList)):
    tuples = list(itertools.combinations(optionList, r))
    for t in tuples:
        combos.add(frozenset((tuple(t), getCompliment(tuple(t), optionList))))
print(combos)
{frozenset({('A',), ('B', 'C', 'D')}),
frozenset({('A', 'C', 'D'), ('B',)}),
frozenset({('A', 'B', 'D'), ('C',)}),
frozenset({('A', 'B'), ('C', 'D')}),
frozenset({('A', 'C'), ('B', 'D')}),
frozenset({('A', 'D'), ('B', 'C')}),
frozenset({('A', 'B', 'C'), ('D',)})}
If you need to convert the result back to a list of lists, this is possible via a list comprehension:
res = [list(map(list, i)) for i in combos]
[[['A'], ['B', 'C', 'D']],
[['B'], ['A', 'C', 'D']],
[['A', 'B', 'D'], ['C']],
[['A', 'B'], ['C', 'D']],
[['B', 'D'], ['A', 'C']],
[['B', 'C'], ['A', 'D']],
[['A', 'B', 'C'], ['D']]]
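A different way to avoid the duplicates, rather than filtering them out afterwards, is to pin the first category to the first group: every split then appears exactly once. This is only a sketch under the assumption that the categories are distinct; unique_splits is a hypothetical name, not something from the question.

```python
from itertools import combinations

def unique_splits(items):
    """Split items into two non-empty groups, listing each split once."""
    first, rest = items[0], items[1:]
    splits = []
    # r is how many other items join `first`; stopping before len(rest)
    # guarantees the second group is never empty.
    for r in range(len(rest)):
        for extra in combinations(rest, r):
            group_a = [first] + list(extra)
            group_b = [x for x in rest if x not in extra]
            splits.append((group_a, group_b))
    return splits

for split in unique_splits(['A', 'B', 'C', 'D']):
    print(split)
```

For the four categories in the question this produces 7 splits, starting with (['A'], ['B', 'C', 'D']), and the order is deterministic, unlike iteration over a set of frozensets.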

Equally distribute a list in python

Suppose I have the following list in python:
a = ['a','b','c','d','e','f','g','h','i','j']
How do I distribute the list like this:
['a','f']
['b','g']
['c','h']
['d','i']
['e','j']
And how do I achieve this if the list does not divide evenly, putting the 'superfluous' items into a separate list?
I want to be able to distribute the elements of the original list into n parts in the indicated manner.
So if n=3 that would be:
['a','d','g']
['b','e','h']
['c','f','i']
and the 'superfluous' element in a separate list
['j']

You can use zip with a list comprehension:
def distribute(seq):
    n = len(seq)//2 #Will work in both Python 2 and 3
    return [list(x) for x in zip(seq[:n], seq[n:])]
print(distribute(['a','b','c','d','e','f','g','h','i','j']))
#[['a', 'f'], ['b', 'g'], ['c', 'h'], ['d', 'i'], ['e', 'j']]

Not exceedingly elegant, but here goes:
In [5]: a = ['a','b','c','d','e','f','g','h','i','j']
In [6]: [[a[i], a[len(a)//2+i]] for i in range(len(a)//2)]
Out[6]: [['a', 'f'], ['b', 'g'], ['c', 'h'], ['d', 'i'], ['e', 'j']]
If you're happy with a list of tuples, you could use zip() (wrapped in list() so that it also works on Python 3, where zip() returns an iterator):
In [7]: list(zip(a[:len(a)//2], a[len(a)//2:]))
Out[7]: [('a', 'f'), ('b', 'g'), ('c', 'h'), ('d', 'i'), ('e', 'j')]
To convert this into a list of lists:
In [8]: list(map(list, zip(a[:len(a)//2], a[len(a)//2:])))
Out[8]: [['a', 'f'], ['b', 'g'], ['c', 'h'], ['d', 'i'], ['e', 'j']]
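The answers above cover the two-column case. For the general case in the question (n parts plus a separate list of 'superfluous' items), a sketch using slice steps might look like the following. The name distribute_n is made up here, and n is assumed to be a positive integer.

```python
def distribute_n(seq, n):
    """Deal seq into n parts of equal length; return (parts, leftover)."""
    per_part = len(seq) // n          # how many items each part receives
    used = per_part * n               # items that fit evenly
    # Part k takes every n-th item starting at index k, up to `used`.
    parts = [list(seq[k:used:n]) for k in range(n)]
    leftover = list(seq[used:])       # whatever did not fit
    return parts, leftover

a = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
print(distribute_n(a, 3))
# ([['a', 'd', 'g'], ['b', 'e', 'h'], ['c', 'f', 'i']], ['j'])
print(distribute_n(a, 5))
# ([['a', 'f'], ['b', 'g'], ['c', 'h'], ['d', 'i'], ['e', 'j']], [])
```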

Sorting a list of lists in python alphabetically by "column"

I have a 2d list of characters like
[['J', 'A', 'M', 'E', 'S'],
['F', 'C', 'A', 'A', 'A'],
['F', 'A', 'B', 'B', 'B']]
What is the best way to sort the first list alphabetically, with the remaining lists reordered to match, i.e.:
[['A', 'E', 'J', 'M', 'S'],
['C', 'A', 'F', 'A', 'A'],
['A', 'B', 'F', 'B', 'B']]

You can use zip():
>>> [list(t) for t in zip(*sorted(zip(*s)))]
[['A', 'E', 'J', 'M', 'S'], ['C', 'A', 'F', 'A', 'A'], ['A', 'B', 'F', 'B', 'B']]
where s is your list of lists.

The other answers demonstrate how it can be done in one line. This answer illustrates how this works:
Given a list, l:
In [1]: l = [['J', 'A', 'M', 'E', 'S'],
...: ['F', 'C', 'A', 'A', 'A'],
...: ['F', 'A', 'B', 'B', 'B']]
Group the columns into tuples, by passing each row into zip():
In [2]: list(zip(*l))
Out[2]:
[('J', 'F', 'F'),
('A', 'C', 'A'),
('M', 'A', 'B'),
('E', 'A', 'B'),
('S', 'A', 'B')]
Sort this list of tuples:
In [3]: sorted(zip(*l))
Out[3]:
[('A', 'C', 'A'),
('E', 'A', 'B'),
('J', 'F', 'F'),
('M', 'A', 'B'),
('S', 'A', 'B')]
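Note that sorted() compares whole tuples, so if the first list contains duplicate items, the tied columns are ordered by the values in the following lists rather than kept in their original order.
Transpose the list again to get three tuples, one per row:
In [4]: list(zip(*sorted(zip(*l))))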
Out[4]:
[('A', 'E', 'J', 'M', 'S'),
('C', 'A', 'F', 'A', 'A'),
('A', 'B', 'F', 'B', 'B')]
Convert the list of tuples back to a list of lists, using a list comprehension:
In [5]: [list(t) for t in zip(*sorted(zip(*l)))]
Out[5]:
[['A', 'E', 'J', 'M', 'S'],
['C', 'A', 'F', 'A', 'A'],
['A', 'B', 'F', 'B', 'B']]
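If columns that tie on the first row should keep their original left-to-right order, one option is to sort on the first row only. sorted() is stable, so a key that looks at just the first element leaves ties untouched. The function name below is an assumption for illustration.

```python
from operator import itemgetter

def sort_columns_by_first_row(rows):
    # Transpose to columns, sort by the first-row value only, transpose back.
    columns = sorted(zip(*rows), key=itemgetter(0))
    return [list(row) for row in zip(*columns)]

rows = [['B', 'A', 'B'],
        ['z', 'y', 'x']]
print(sort_columns_by_first_row(rows))
# [['A', 'B', 'B'], ['y', 'z', 'x']]
```

With a plain sorted(zip(*rows)) the second row would come out as ['y', 'x', 'z'], because the two 'B' columns would be ordered by their second value.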

>>> l = [['J', 'A', 'M', 'E', 'S'],
... ['F', 'C', 'A', 'A', 'A'],
... ['F', 'A', 'B', 'B', 'B']]
>>> zip(*sorted(zip(*l)))
[('A', 'E', 'J', 'M', 'S'), ('C', 'A', 'F', 'A', 'A'), ('A', 'B', 'F', 'B', 'B')]
If you need lists in the result (the list() call makes this work on Python 3 as well):
>>> list(map(list, zip(*sorted(zip(*l)))))
[['A', 'E', 'J', 'M', 'S'], ['C', 'A', 'F', 'A', 'A'], ['A', 'B', 'F', 'B', 'B']]

Permutations of a list of lists

I have a list like this:
l = [['a', 'b', 'c'], ['a', 'b'], ['g', 'h', 'r', 'w']]
I want to pick an element from each list and combine them to be a string.
For example: 'aag', 'aah', 'aar', 'aaw', 'abg', 'abh' ....
However, the length of the list l and the length of each inner list are unknown before the program runs. So how can I do what I want?

Take a previous solution and use itertools.product(*l) instead.
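Spelled out for the list in the question, that suggestion could be sketched as follows. Joining each tuple with ''.join is an addition here, to get the strings the question asks for.

```python
import itertools

l = [['a', 'b', 'c'], ['a', 'b'], ['g', 'h', 'r', 'w']]

# product(*l) picks one element from each inner list, however many there are.
words = [''.join(combo) for combo in itertools.product(*l)]

print(words[:6])   # ['aag', 'aah', 'aar', 'aaw', 'abg', 'abh']
print(len(words))  # 24
```

Because product() is lazy, the list comprehension can be swapped for a generator expression if the full result is too large to hold in memory.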

If anybody's interested in the algorithm, here's a very simple way to use recursion to find the combos:
l = [['a', 'b', 'c'], ['a', 'b'], ['g', 'h', 'r', 'w']]
def permu(lists, prefix=''):
    if not lists:
        print(prefix)
        return
    first = lists[0]
    rest = lists[1:]
    for letter in first:
        permu(rest, prefix + letter)
permu(l)

Using recursion:
def permutenew(l):
    if len(l)==1:
        return l[0]
    else:
        lnew=[]
        for a in l[0]:
            for b in permutenew(l[1:]):
                lnew.append(a+b)
        return lnew
l = [['a', 'b', 'c'], ['a', 'b'], ['g', 'h', 'r', 'w']]
print(permutenew(l))

Piggy-backing off of JasonWoof's answer. The following will create a list instead of printing. Be mindful that this builds the whole result in memory, which can be slow and memory-hungry for large inputs.
from __future__ import print_function
import itertools # Not actually used in the code below
def permu(lists):
    # group and result are passed explicitly to avoid mutable default arguments
    def fn(lists, group, result):
        if not lists:
            result.append(group)
            return
        first, rest = lists[0], lists[1:]
        for letter in first:
            fn(rest, group + [letter], result)
    result = []
    fn(lists, [], result)
    return result
if __name__ == '__main__':
    ll = [ [[1, 2, 3], [5, 10], [42]],
           [['a', 'b', 'c'], ['a', 'b'], ['g', 'h', 'r', 'w']] ]
    nth = lambda i: 'Permutation #{0}:\n{1}'.format(i, '-'*16)
    # Note: permu(l) can be replaced with itertools.product(*l)
    [[print(p) for p in [nth(i)]+permu(l)+['\n']] for i,l in enumerate(ll)]
Result
Permutation #0:
----------------
[1, 5, 42]
[1, 10, 42]
[2, 5, 42]
[2, 10, 42]
[3, 5, 42]
[3, 10, 42]
Permutation #1:
----------------
['a', 'a', 'g']
['a', 'a', 'h']
['a', 'a', 'r']
['a', 'a', 'w']
['a', 'b', 'g']
['a', 'b', 'h']
['a', 'b', 'r']
['a', 'b', 'w']
['b', 'a', 'g']
['b', 'a', 'h']
['b', 'a', 'r']
['b', 'a', 'w']
['b', 'b', 'g']
['b', 'b', 'h']
['b', 'b', 'r']
['b', 'b', 'w']
['c', 'a', 'g']
['c', 'a', 'h']
['c', 'a', 'r']
['c', 'a', 'w']
['c', 'b', 'g']
['c', 'b', 'h']
['c', 'b', 'r']
['c', 'b', 'w']
Below is an equivalent substitution for itertools.product(*iterables[, repeat]):
This function is equivalent to the following code, except that the actual implementation does not build up intermediate results in memory:
def product(*args, **kwds):
    pools = list(map(tuple, args)) * kwds.get('repeat', 1)
    result = [[]]
    for pool in pools:
        result = [x+[y] for x in result for y in pool]
    for prod in result:
        yield tuple(prod)

Quite easy with itertools.product :
>>> import itertools
>>> list(itertools.product("abc", "ab", "ghrw"))
[('a', 'a', 'g'), ('a', 'a', 'h'), ('a', 'a', 'r'), ('a', 'a', 'w'), ('a', 'b', 'g'), ('a', 'b', 'h'), ('a', 'b', 'r'), ('a', 'b', 'w'), ('b', 'a', 'g'), ('b', 'a', 'h'), ('b', 'a', 'r'), ('b', 'a', 'w'), ('b', 'b', 'g'), ('b', 'b', 'h'), ('b', 'b', 'r'), ('b', 'b', 'w'), ('c', 'a', 'g'), ('c', 'a', 'h'), ('c', 'a', 'r'), ('c', 'a', 'w'), ('c', 'b', 'g'), ('c', 'b', 'h'), ('c', 'b', 'r'), ('c', 'b', 'w')]

Here you go:
from functools import reduce  # builtin on Python 2; this import is needed on Python 3
reduce(lambda a,b: [i+j for i in a for j in b], l)
OUT: ['aag', 'aah', 'aar', 'aaw', 'abg', 'abh', 'abr', 'abw', 'bag', 'bah', 'bar', 'baw', 'bbg', 'bbh', 'bbr', 'bbw', 'cag', 'cah', 'car', 'caw', 'cbg', 'cbh', 'cbr', 'cbw']
If you'd like to reuse/generalize:
import operator
def opOnCombos(a,b, op=operator.add):
    return [op(i,j) for i in a for j in b]
def f(x):
    return lambda a,b: opOnCombos(a,b,x)
reduce(opOnCombos, l)  # same as before
reduce(f(operator.mul), l)  # multiplies the combos when l is a list of integer lists
