I want to filter a list of lists for duplicates. I consider two lists to be a duplicate of each other when they contain the same elements but not necessarily in the same order. So for example
[['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A']]
should become
[['A', 'B', 'C'], ['D', 'B', 'A']]
since ['C', 'B', 'A'] is a duplicate of ['A', 'B', 'C'].
It does not matter which one of the duplicates gets removed, as long as the final list of lists no longer contains any duplicates. All lists need to keep the order of their elements, so using set() may not be an option.
I found these related questions:
Determine if 2 lists have the same elements, regardless of order?
How to efficiently compare two unordered lists (not sets)?
But they only cover how to compare two lists, not how to efficiently remove duplicates. I'm using Python.
Using a dictionary comprehension (for each group of duplicates, the last list wins):
>>> data = [['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A']]
>>> result = {tuple(sorted(i)): i for i in data}.values()
>>> result
dict_values([['C', 'B', 'A'], ['D', 'B', 'A']])
>>> list( result )
[['C', 'B', 'A'], ['D', 'B', 'A']]
You can use frozenset (note that this does not preserve the order of elements within each list):
>>> x = [['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A']]
>>> [list(s) for s in set([frozenset(item) for item in x])]
[['A', 'B', 'D'], ['A', 'B', 'C']]
Or, with map:
>>> [list(s) for s in set(map(frozenset, x))]
[['A', 'B', 'D'], ['A', 'B', 'C']]
If you want to keep the order of elements:
data = [['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A']]
seen = set()
result = []
for obj in data:
    if frozenset(obj) not in seen:
        result.append(obj)
        seen.add(frozenset(obj))
Output:
[['A', 'B', 'C'], ['D', 'B', 'A']]
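One caveat with frozenset keys: they ignore how often an element occurs, so ['A', 'A', 'B'] and ['A', 'B'] would be treated as duplicates. If that matters for your data, here is a minimal sketch that keys on element counts instead. It assumes the elements are hashable, and the name dedupe_unordered is made up for this example:

```python
from collections import Counter

def dedupe_unordered(lists):
    """Keep the first list of every group holding the same elements with the same counts."""
    seen = set()
    result = []
    for item in lists:
        # Counter records how often each element occurs;
        # freezing its items gives a hashable, order-independent key
        key = frozenset(Counter(item).items())
        if key not in seen:
            seen.add(key)
            result.append(item)
    return result

data = [['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A'], ['A', 'A', 'B'], ['A', 'B']]
print(dedupe_unordered(data))
# [['A', 'B', 'C'], ['D', 'B', 'A'], ['A', 'A', 'B'], ['A', 'B']]
```

The surviving lists keep their original element order, and the first member of each duplicate group is the one that stays.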
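If you do not need to keep the order of elements, you can use itertools.groupby. Note that groupby only merges adjacent duplicates and returns the sorted keys rather than the original lists: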
from itertools import groupby
data = [['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A']]
print([k for k, _ in groupby(data, key=sorted)])
Output:
[['A', 'B', 'C'], ['A', 'B', 'D']]
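Since groupby only groups consecutive items that share a key, duplicates that are not next to each other slip through. A small sketch of the difference, using a reordered version of the sample data (the variable names are just for illustration):

```python
from itertools import groupby

data = [['A', 'B', 'C'], ['D', 'B', 'A'], ['C', 'B', 'A']]

# Without sorting, the two A/B/C lists are not adjacent, so both survive
unsorted_result = [k for k, _ in groupby(data, key=sorted)]

# Sorting by the same key first puts duplicates next to each other
sorted_result = [k for k, _ in groupby(sorted(data, key=sorted), key=sorted)]

print(unsorted_result)  # [['A', 'B', 'C'], ['A', 'B', 'D'], ['A', 'B', 'C']]
print(sorted_result)    # [['A', 'B', 'C'], ['A', 'B', 'D']]
```

Sorting makes this O(n log n) and changes the order of the outer list, so the set-based loop is probably the better fit when the outer order matters.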
In Python, it is usually simpler to build a new list than to remove items from the existing one.
The simplest way is as follows:
data = [['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A']]
temp = []
seen = []  # sorted copies of the lists kept so far
for i in data:
    key = sorted(i)
    if key not in seen:
        seen.append(key)
        temp.append(i)
print(temp)
cheers, athrv
I have a list:
['A', 'B', 'C', ['D', ['E', 'F'], 'G'], 'H']
and I want to turn this into:
[['E', 'F'], ['D', 'G'], ['A', 'B', 'C', 'H']]
So basically I want the sublist on the deepest level to come first in the new list, followed by the remaining sublists in order of decreasing depth.
This should work with any nested list.
If there are two sublists on the same level, then it doesn't really matter which one comes first.
['A', 'B', 'C', ['D', ['E', 'F'], 'G'], ['H', 'I', 'J']]
[['E', 'F'], ['D', 'G'], ['H', 'I', 'J'], ['A', 'B', 'C']] #this is fine
[['E', 'F'], ['H', 'I', 'J'], ['D', 'G'], ['A', 'B', 'C']] #this too
I thought of first using a function to determine on what level the deepest sublist is, but then again I don't know how to access items in a list based on their level or if that's even possible.
Been tinkering around this for far too long now and I think my head just gave up, hope someone can assist me with this problem!
You can use a recursive generator function:
def sort_depth(d, c = 0):
    r = {0:[], 1:[]}
    for i in d:
        r[not isinstance(i, list)].append(i)
    yield from [i for j in r[0] for i in sort_depth(j, c+1)]
    yield (c, r[1])

def result(d):
    return [b for _, b in sorted(sort_depth(d), key=lambda x:x[0], reverse=True) if b]

print(result(['A', 'B', 'C', ['D', ['E', 'F'], 'G'], 'H']))
print(result(['A', 'B', 'C', ['D', ['E', 'F'], 'G'], ['H', 'I', 'J']]))
print(result([[1, [2]], [3, [4]]]))
Output:
[['E', 'F'], ['D', 'G'], ['A', 'B', 'C', 'H']]
[['E', 'F'], ['D', 'G'], ['H', 'I', 'J'], ['A', 'B', 'C']]
[[2], [4], [1], [3]]
Here is a relatively straight-forward solution:
def sort_depth(d):
    def dlists(obj, dep=0):
        for x in filter(list.__instancecheck__, obj):
            yield from dlists(x, dep-1)
        yield [x for x in obj if not isinstance(x, list)], dep
    return [x for x, y in sorted(dlists(d), key=lambda p: p[1])]
>>> [*sort_depth([[1, [2]], [3, [4]]])]
[[2], [4], [1], [3], []]
>>> [*sort_depth(['A', 'B', 'C', ['D', ['E', 'F'], 'G'], 'H'])]
[['E', 'F'], ['D', 'G'], ['A', 'B', 'C', 'H']]
The approach:
Collect all the sublists and annotate them with their (negative) nesting level, e.g. (['E', 'F'], -2)
Sort them by their nesting level
Extract the lists back from the sorted data
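To make the three steps easier to follow, here is a sketch that performs each one on its own line. It is a variation rather than the exact code above: it uses a positive depth with reverse=True instead of a negative depth, and the names collect, annotated, ordered and flat_levels are made up for this example.

```python
def collect(obj, depth=0):
    """Step 1: yield (depth, non-list items) for obj and for every nested list."""
    yield depth, [x for x in obj if not isinstance(x, list)]
    for x in obj:
        if isinstance(x, list):
            yield from collect(x, depth + 1)

nested = ['A', 'B', 'C', ['D', ['E', 'F'], 'G'], 'H']

annotated = list(collect(nested))
# Step 2: deepest level first; sorted() is stable, so ties keep discovery order
ordered = sorted(annotated, key=lambda pair: pair[0], reverse=True)
# Step 3: drop the depth annotations again
flat_levels = [items for _, items in ordered]

print(annotated)    # [(0, ['A', 'B', 'C', 'H']), (1, ['D', 'G']), (2, ['E', 'F'])]
print(flat_levels)  # [['E', 'F'], ['D', 'G'], ['A', 'B', 'C', 'H']]
```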
I have a nested list
x = [['a', 'b', 'c'], ['d'], ['e', 'f', ['g', ['h', 'i']]]]
I want to generate all possible permutations of the elements within each sublist, without moving any element out of its own sublist.
The expected output is variations of something like this:
[['c', 'b', 'a'], ['d'], ['f', 'e', ['g', ['i', 'h']]]]
[['d'], ['a', 'b', 'c'], ['f', 'e', [['h', 'i'], 'g']]]
Each element must stay inside its own square brackets.
I wrote this function:
import numpy as np

def swap(x):
    if isinstance(x, list):
        # draw a random order of positions, then recurse into each element
        order = np.random.choice(len(x), len(x), replace=False)
        return [swap(x[i]) for i in order]
    else:
        return x
It gives random variants of the expected result, but I need to collect them all. How could I do it? Should I do:
my_list = []
for i in range(10000):  # not necessarily 10000, any huge number
    my_list.append(swap(x))
and then apply a unique function to my_list to select the unique ones, or is there another option?
Combining isinstance() with itertools.permutations() is a good direction. You just need a product of the permutations, plus some way of tracking which permutation applies to which part of the tree (I was thinking along the lines of generating all possible traversals of a tree):
import itertools

def plan(part,res):
    if isinstance(part,list) and len(part)>1:
        res.append(itertools.permutations(range(len(part))))
        for elem in part:
            plan(elem,res)
    return res

def remix(part,p):
    if isinstance(part,list) and len(part)>1:
        coll=[0]*len(part)
        for i in range(len(part)-1,-1,-1):
            coll[i]=remix(part[i],p)
        mix=p.pop()
        return [coll[i] for i in mix]
    else:
        return part

def swap(t):
    plans=itertools.product(*plan(t,[]))
    for p in plans:
        yield remix(t,list(p))

for r in swap([['a', 'b', 'c'], ['d'], ['e', 'f', ['g', ['h', 'i']]]]):
    print(r)
plan() recursively finds all "real" lists (which have more than one element), and creates itertools.permutations() for them.
swap() calls plan(), and then combines the permutations into one single compound megapermutation using itertools.product()
remix() creates an actual object for a single megapermutation step. It is a bit complicated because I did not want to fight with tracking tree position. Instead, remix() works backwards: it goes to the very last list, reorders it with the very last component of the current plan, and removes that component from the plan.
It seems to work. Your example produces rather long output, but with simpler inputs the output is manageable:
for r in swap([['a', ['b', 'c']], ['d'], 'e']):
    print(r)
[['a', ['b', 'c']], ['d'], 'e']
[['a', ['c', 'b']], ['d'], 'e']
[[['b', 'c'], 'a'], ['d'], 'e']
[[['c', 'b'], 'a'], ['d'], 'e']
[['a', ['b', 'c']], 'e', ['d']]
[['a', ['c', 'b']], 'e', ['d']]
[[['b', 'c'], 'a'], 'e', ['d']]
[[['c', 'b'], 'a'], 'e', ['d']]
[['d'], ['a', ['b', 'c']], 'e']
[['d'], ['a', ['c', 'b']], 'e']
[['d'], [['b', 'c'], 'a'], 'e']
[['d'], [['c', 'b'], 'a'], 'e']
[['d'], 'e', ['a', ['b', 'c']]]
[['d'], 'e', ['a', ['c', 'b']]]
[['d'], 'e', [['b', 'c'], 'a']]
[['d'], 'e', [['c', 'b'], 'a']]
['e', ['a', ['b', 'c']], ['d']]
['e', ['a', ['c', 'b']], ['d']]
['e', [['b', 'c'], 'a'], ['d']]
['e', [['c', 'b'], 'a'], ['d']]
['e', ['d'], ['a', ['b', 'c']]]
['e', ['d'], ['a', ['c', 'b']]]
['e', ['d'], [['b', 'c'], 'a']]
['e', ['d'], [['c', 'b'], 'a']]
24 permutations, as expected
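For comparison, the same enumeration can also be written as a single recursive generator, without a separate planning pass. This is only a sketch of an alternative, not the approach above; the name nested_permutations is made up, and the output order differs from swap().

```python
from itertools import permutations, product

def nested_permutations(node):
    """Yield every reordering of node in which each item stays inside its own list."""
    if not isinstance(node, list):
        yield node
        return
    # All variants of each child, computed once per child
    child_variants = [list(nested_permutations(child)) for child in node]
    # Choose an order for the children, then one variant for each child
    for order in permutations(range(len(node))):
        for choice in product(*(child_variants[i] for i in order)):
            yield list(choice)

results = list(nested_permutations([['a', ['b', 'c']], ['d'], 'e']))
print(len(results))  # 24
print(results[0])    # [['a', ['b', 'c']], ['d'], 'e']
```

Inner lists are shared between the yielded results, so take a copy.deepcopy of a result before mutating it.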
Not particularly pythonic, but I would approach it by finding permutations of the indexes, as seen here:
from itertools import permutations
mylist = [[1], [1,2], [1,2,3]]
combinations = list(permutations([i for i in range(len(mylist))]))
print(combinations)
for item in combinations:
    print([mylist[item[i]] for i in range(len(mylist))])
Output:
[(0, 1, 2), (0, 2, 1), (1, 0, 2), (1, 2, 0), (2, 0, 1), (2, 1, 0)]
[[1], [1, 2], [1, 2, 3]]
[[1], [1, 2, 3], [1, 2]]
[[1, 2], [1], [1, 2, 3]]
[[1, 2], [1, 2, 3], [1]]
[[1, 2, 3], [1], [1, 2]]
[[1, 2, 3], [1, 2], [1]]
Have you considered using itertools?
There are explicit combination and permutation tools available
From the docs:
itertools.permutations(iterable[, r])
Return successive r length
permutations of elements in the iterable.
If r is not specified or is None, then r defaults to the length of the
iterable and all possible full-length permutations are generated.
Permutations are emitted in lexicographic sort order. So, if the input
iterable is sorted, the permutation tuples will be produced in sorted
order.
Elements are treated as unique based on their position, not on their
value. So if the input elements are unique, there will be no repeat
values in each permutation.
itertools.combinations(iterable, r)
Return r length subsequences of elements from the input iterable.
Combinations are emitted in lexicographic sort order. So, if the input
iterable is sorted, the combination tuples will be produced in sorted
order.
Elements are treated as unique based on their position, not on their
value. So if the input elements are unique, there will be no repeat
values in each combination.
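A quick illustration of the difference between the two, plus the "unique based on position" remark from the docs (the sample data is made up):

```python
from itertools import combinations, permutations

people = ['a', 'b', 'c']

# permutations: order matters, so ('a', 'b') and ('b', 'a') both appear
perms = list(permutations(people, 2))
# combinations: order does not matter, each pair appears once
combs = list(combinations(people, 2))
# Uniqueness is by position, not value: the two 1s are treated as distinct
repeated = list(combinations([1, 1, 2], 2))

print(perms)     # [('a', 'b'), ('a', 'c'), ('b', 'a'), ('b', 'c'), ('c', 'a'), ('c', 'b')]
print(combs)     # [('a', 'b'), ('a', 'c'), ('b', 'c')]
print(repeated)  # [(1, 1), (1, 2), (1, 2)]
```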
I am trying to enumerate all the possible ways to split a list of people into groups of 2. For example, for a group project I want to have all the possible ways of forming groups of two people from the class list (in Python).
For example if my list of people is: {a,b,c,d,e,f} I want to have:
I tried a lot of things (itertools.combinations() or itertools.permutations()), but I did not manage to get this result without tuples or without the same person appearing twice in a grouping.
You can do the following (copied from https://stackoverflow.com/a/5360442):
lst = ['a','b','c','d','e','f']

def all_pairs(lst):
    if len(lst) < 2:
        yield []
        return
    if len(lst) % 2 == 1:
        # Handle odd length list
        for i in range(len(lst)):
            for result in all_pairs(lst[:i] + lst[i+1:]):
                yield result
    else:
        a = lst[0]
        for i in range(1,len(lst)):
            pair = [a,lst[i]]
            for rest in all_pairs(lst[1:i]+lst[i+1:]):
                yield [pair] + rest

print(list(all_pairs(lst)))
which gives you:
[[['a', 'b'], ['c', 'd'], ['e', 'f']],
[['a', 'b'], ['c', 'e'], ['d', 'f']],
[['a', 'b'], ['c', 'f'], ['d', 'e']],
[['a', 'c'], ['b', 'd'], ['e', 'f']],
[['a', 'c'], ['b', 'e'], ['d', 'f']],
[['a', 'c'], ['b', 'f'], ['d', 'e']],
[['a', 'd'], ['b', 'c'], ['e', 'f']],
[['a', 'd'], ['b', 'e'], ['c', 'f']],
[['a', 'd'], ['b', 'f'], ['c', 'e']],
[['a', 'e'], ['b', 'c'], ['d', 'f']],
[['a', 'e'], ['b', 'd'], ['c', 'f']],
[['a', 'e'], ['b', 'f'], ['c', 'd']],
[['a', 'f'], ['b', 'c'], ['d', 'e']],
[['a', 'f'], ['b', 'd'], ['c', 'e']],
[['a', 'f'], ['b', 'e'], ['c', 'd']]]
As required.
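As a sanity check on the 15 results: the first person can be paired with any of the n-1 others, the next unpaired person with any of the remaining n-3, and so on, which gives (n-1)·(n-3)·…·1 pairings for an even n. A small sketch of that count (the function name count_pairings is made up here):

```python
def count_pairings(n):
    """Number of ways to split n people (n even) into unordered pairs."""
    if n % 2:
        raise ValueError("n must be even")
    total = 1
    # Multiply (n-1) * (n-3) * ... * 1
    for k in range(n - 1, 0, -2):
        total *= k
    return total

print(count_pairings(6))   # 15
print(count_pairings(10))  # 945
```

The count grows quickly, so for larger classes it is probably better to iterate over the generator than to build the full list.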
You can use the built-in itertools module:
import itertools

data = ['a', 'b', 'c', 'd', 'e', 'f']
# in case the number of items is odd, pad with None
if len(data) % 2 != 0:
    data.append(None)
number_of_groups = len(data) // 2
# two pairs are compatible when they share no member
check = lambda x, y: not set(x) & set(y)
test = lambda grps: all(check(x, y) for x, y in itertools.combinations(grps, 2))
pairs = [list(x) for x in itertools.combinations(data, 2)]
result = [list(x) for x in itertools.combinations(pairs, number_of_groups) if test(x)]
print(result)
The result is [[['a', 'b'], ['c', 'd'], ['e', 'f']], [['a', 'b'], ['c', 'e'], ['d', 'f']], [['a', 'b'], ['c', 'f'], ['d', 'e']], [['a', 'c'], ['b', 'd'], ['e', 'f']], [['a', 'c'], ['b', 'e'], ['d', 'f']], [['a', 'c'], ['b', 'f'], ['d', 'e']], [['a', 'd'], ['b', 'c'], ['e', 'f']], [['a', 'd'], ['b', 'e'], ['c', 'f']], [['a', 'd'], ['b', 'f'], ['c', 'e']], [['a', 'e'], ['b', 'c'], ['d', 'f']], [['a', 'e'], ['b', 'd'], ['c', 'f']], [['a', 'e'], ['b', 'f'], ['c', 'd']], [['a', 'f'], ['b', 'c'], ['d', 'e']], [['a', 'f'], ['b', 'd'], ['c', 'e']], [['a', 'f'], ['b', 'e'], ['c', 'd']]]
import itertools
people = ['a', 'b', 'c', 'd', 'e', 'f']
# number of groups that could be created
n_groups = len(people) // 2
# create all possible pairs
pairs = itertools.combinations(people, 2)
# create all group constellations
group_combo = itertools.combinations(pairs, n_groups)
# check for impossible constellations
# ie. it is not possible to have 'a' in two groups
for group in group_combo:
    flatten_group_tuple = [element for tupl in group for element in tupl]
    # check for duplicate members, if duplicates exist the set-size will be < n_groups * 2
    if len(set(flatten_group_tuple)) == n_groups * 2:
        print([list(x) for x in group])
This is an algorithm that is meant to write down the permutations of the list P, and it does that well, but...
def p():
    global P
    P = ['a', 'b', 'c', 'd']
    perm(4)

per = []

def perm(k):
    global P
    if k==1:
        print(P)
        per.append(P)
    else:
        for i in range(k):
            P[i], P[k-1] = P[k-1], P[i]
            perm(k-1)
            P[i], P[k-1] = P[k-1], P[i]
When I want it to add the permutations to a global list (necessary for the rest of the program), there is a problem. It still prints all the permutations:
['b', 'c', 'd', 'a']
['b', 'c', 'd', 'a']
['d', 'b', 'c', 'a']
['d', 'b', 'c', 'a']
['b', 'd', 'c', 'a']
['b', 'd', 'c', 'a']
['a', 'c', 'b', 'd']
['a', 'c', 'b', 'd']
['b', 'a', 'c', 'd']
['b', 'a', 'c', 'd']
['a', 'b', 'c', 'd']
['a', 'b', 'c', 'd']
['b', 'd', 'a', 'c']
['b', 'd', 'a', 'c']
['a', 'b', 'd', 'c']
['a', 'b', 'd', 'c']
['b', 'a', 'd', 'c']
['b', 'a', 'd', 'c']
['a', 'd', 'b', 'c']
['a', 'd', 'b', 'c']
['b', 'a', 'd', 'c']
['b', 'a', 'd', 'c']
['a', 'b', 'd', 'c']
['a', 'b', 'd', 'c']
but when I check the list, it is filled with the same arrangement over and over:
[['a', 'b', 'd', 'c'], ['a', 'b', 'd', 'c'], ['a', 'b', 'd', 'c'], ['a', 'b',
'd', 'c'], ['a', 'b', 'd', 'c'], ['a', 'b', 'd', 'c'], ['a', 'b', 'd', 'c'],
['a', 'b', 'd', 'c'], ['a', 'b', 'd', 'c'], ['a', 'b', 'd', 'c'], ['a', 'b',
'd', 'c'], ['a', 'b', 'd', 'c'], ['a', 'b', 'd', 'c'], ['a', 'b', 'd', 'c'],
['a', 'b', 'd', 'c'], ['a', 'b', 'd', 'c'], ['a', 'b', 'd', 'c'], ['a', 'b',
'd', 'c'], ['a', 'b', 'd', 'c'], ['a', 'b', 'd', 'c'], ['a', 'b', 'd', 'c'],
['a', 'b', 'd', 'c'], ['a', 'b', 'd', 'c'], ['a', 'b', 'd', 'c']]
Could you please help me at least understand what the actual issue is?
Try using the built-in itertools module.
import itertools

P = [letter for letter in "abcd"]

def perm(permutate_this):
    return list(itertools.permutations(permutate_this))

print(perm(P))
Use
per.append(P[:])
to copy the list. You are appending a reference to a list, and that reference always points to the same data. Your per contains the same reference over and over.
Comment-remark of cdarke:
Slicing a list is called a shallow copy. If your list contains other references (e.g. inner lists), slicing only copies those references, and you have the same problem for the inner lists. You would have to resort to copy.deepcopy in that case.
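A stripped-down sketch of the same effect, independent of the permutation code (the names aliases and copies are just for illustration):

```python
P = ['a', 'b']
aliases = []
copies = []
for _ in range(2):
    aliases.append(P)    # stores a reference to the one list object
    copies.append(P[:])  # stores a snapshot of the current contents
    P[0], P[1] = P[1], P[0]  # mutate P in place, as perm() does

print(aliases)                   # [['a', 'b'], ['a', 'b']]
print(copies)                    # [['a', 'b'], ['b', 'a']]
print(aliases[0] is aliases[1])  # True
```

Every entry of aliases is the same object, so it always shows whatever P currently holds; the snapshots in copies are unaffected by later swaps.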
Example:
innerlist = [1,2,3]
l2 = [innerlist, 5, 6]
l3 = l2[:]
print(l2) # orig
print(l3) # the shallow copy
l3[2] = "changed" # l2[2] is unchanged
print(l2)
print(l3)
innerlist[2] = 999 # both (l2 and l3) will reflect this change in the innerlist
print(l2)
print(l3)
Output:
[[1, 2, 3], 5, 6] # l2
[[1, 2, 3], 5, 6] # l3
[[1, 2, 3], 5, 6] # l2 unchanged by l3[2]='changed'
[[1, 2, 3], 5, 'changed'] # l3 changed by -"-
[[1, 2, 999], 5, 6] # l2 and l3 affected by change in innerlist
[[1, 2, 999], 5, 'changed']
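For completeness, here is the copy.deepcopy variant mentioned in the comment, as a small sketch on the same kind of data (variable names chosen for this example):

```python
import copy

inner = [1, 2, 3]
outer = [inner, 5, 6]

shallow = outer[:]           # new outer list, but it still points at inner
deep = copy.deepcopy(outer)  # new outer list with its own copy of inner

inner[2] = 999

print(shallow)  # [[1, 2, 999], 5, 6]
print(deep)     # [[1, 2, 3], 5, 6]
```

For the flat list of letters in the question, a slice is enough; deepcopy is only needed once the list contains other mutable objects.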
It may be more convenient for you to use built-in permutations:
from itertools import permutations
arr = [1, 2, 3]
list(permutations(arr))
> [(1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), (3, 2, 1)]