Eliminate duplicate from list of lists [duplicate]

This question already has answers here:
set of list of lists in python
(4 answers)
Closed 3 years ago.
I need to eliminate duplicates from a list of lists like this one:
list = [[10, 5, 3], [10, 5, 3], [10, 10, 3], [10, 10], [3, 3, 3], [10, 5, 3]]
The expected result is:
result_list = [[10, 5, 3], [10, 3], [10], [3]]
Is it possible to eliminate duplicates both inside the sub-lists and in the main list?
I tried with:
result_list = [list(result) for result in set(set(item) for item in list)]
but it throws a TypeError saying that set is an unhashable type.
I don't think this is a duplicate question: I need to remove the duplicates within the sub-lists, not just in the main list.
Thanks to everyone who helped me, problem solved.

Sets aren't hashable, but frozensets are:
lst = [[10, 5, 3], [10, 5, 3], [10, 10, 3], [10, 10], [3, 3, 3], [10, 5, 3]]
result_list = [list(result) for result in set(frozenset(item) for item in lst)]
Also don't shadow the builtin name list, especially if you want to use its usual meaning immediately after.
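The frozenset approach does not keep the order shown in the expected result. A minimal sketch of an order-preserving variant follows, assuming the items are hashable and that two sub-lists count as duplicates when they contain the same distinct values; the function name dedupe_nested is hypothetical and not part of the original answer.

```python
def dedupe_nested(rows):
    """Remove duplicates inside each sub-list and across sub-lists,
    keeping first-seen order at both levels."""
    seen = set()    # frozensets of the sub-lists already emitted
    result = []
    for row in rows:
        # dict.fromkeys drops repeated items while keeping their order
        unique_items = list(dict.fromkeys(row))
        key = frozenset(unique_items)
        if key not in seen:
            seen.add(key)
            result.append(unique_items)
    return result


lst = [[10, 5, 3], [10, 5, 3], [10, 10, 3], [10, 10], [3, 3, 3], [10, 5, 3]]
print(dedupe_nested(lst))  # [[10, 5, 3], [10, 3], [10], [3]]
```

Using a frozenset as the key means [10, 3] and [3, 10] are treated as the same sub-list; the first one encountered is kept.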

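You can use map to convert each sub-list to a tuple so that it can go into a set. Note that this deduplicates the outer list before the inner ones, so sub-lists such as [10, 10, 3] and [10, 3, 3] would both survive as [10, 3]: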
result = [list(set(i)) for i in set(map(tuple, mylist))]
Output
[[3], [10, 3, 5], [10, 3], [10]]

You need tuples to be able to build a set of them; in a nested list comprehension you can then turn the tuples back into lists:
list_example = [[10, 5, 3], [10, 5, 3], [10, 10, 3], [10, 10], [3, 3, 3], [10, 5, 3]]
output = [list(x) for x in (list(set([tuple(set(result)) for result in list(set(item) for item in list_example)])))]
print(output)
Output:
[[10, 3, 5], [10], [10, 3], [3]]

You are iterating a list:
for item in list
putting the results into a set:
set(item)
then putting the set into a set:
set(set(item) for item in list)
For anything to go into a set it has to be hashable, meaning it has a hash value that never changes during its lifetime; the built-in types guarantee this by being immutable. Sets are mutable, and so don't have a hash. See "Why aren't Python sets hashable?".
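The hashability rule can be checked directly with the built-in hash function. This is a small sketch; the helper name is_hashable is hypothetical and only there for illustration.

```python
def is_hashable(obj):
    """Return True if obj can be used as a set element or dict key."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


print(is_hashable({10, 5, 3}))             # False: set is mutable
print(is_hashable(frozenset({10, 5, 3})))  # True
print(is_hashable((10, 5, 3)))             # True: tuple of ints
print(is_hashable([10, 5, 3]))             # False: list is mutable
print(is_hashable(([10, 5], 3)))           # False: the tuple contains a list
```

The last case shows that a tuple is only hashable when everything inside it is hashable too.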

Related

Recursive method to zip list?

I have a nested list of lists that looks like the following:
list1 = [[1,2,3],[4,5,6],[7,8,9],[10,11,12]]
However, I would like to find a way to group the first element of each list with the first element of every other list, the second with the second, and so on:
list1 = [[1,4,7,10],[2,5,8,11],[3,6,9,12]]
I have tried a list comprehension using the following code:
list1 = [[list1[j][i] for j in range(len(list1)) ] for i in range(len(list1[0])) ]
# gives me
# list1 = [[1,4,7,10],[2,5,8,11],[3,6,9,12]]
However, I was hoping for alternative methods that achieve the same result, ideally something simpler and more elegant.
Thanks in advance.

zip is a built-in function and does not require outside packages:
>>> list1 = [[1,2,3],[4,5,6],[7,8,9],[10,11,12]]
>>> print([list(x) for x in zip(*list1)])
[[1, 4, 7, 10], [2, 5, 8, 11], [3, 6, 9, 12]]
Notice the *list1! This is needed since list1 is a nested list, so the * unpacks that list's elements and passes them to zip as separate arguments to zip together. Then, since zip yields tuples (in Python 3 it returns an iterator of tuples), we simply convert them to lists, as per your request.
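One detail worth knowing about zip(*rows) is that it stops at the shortest sub-list. The sketch below wraps the idiom in two hypothetical helpers, transpose and transpose_padded, and uses itertools.zip_longest from the standard library for the padded variant.

```python
from itertools import zip_longest


def transpose(rows):
    """Transpose a list of lists; extra items in longer rows are dropped."""
    return [list(col) for col in zip(*rows)]


def transpose_padded(rows, fill=None):
    """Transpose a list of lists, padding shorter rows with fill."""
    return [list(col) for col in zip_longest(*rows, fillvalue=fill)]


list1 = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
print(transpose(list1))                       # [[1, 4, 7, 10], [2, 5, 8, 11], [3, 6, 9, 12]]
print(transpose([[1, 2, 3], [4, 5]]))         # [[1, 4], [2, 5]]
print(transpose_padded([[1, 2, 3], [4, 5]]))  # [[1, 4], [2, 5], [3, None]]
```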

A possible recursion solution can utilize a generator:
def r_zip(d):
    yield [i[0] for i in d]
    if d[0][1:]:
        yield from r_zip([i[1:] for i in d])
print(list(r_zip(list1)))
Output:
[[1, 4, 7, 10], [2, 5, 8, 11], [3, 6, 9, 12]]

x = min([len(list1[i]) for i in range(len(list1))])
[[i[j] for i in list1] for j in range(x)]

Or try using:
>>> list1 = [[1,2,3],[4,5,6],[7,8,9],[10,11,12]]
>>> list(map(list, zip(*list1)))
[[1, 4, 7, 10], [2, 5, 8, 11], [3, 6, 9, 12]]

How to sort one list based on another list in Python? [duplicate]

This question already has answers here:
Sorting list based on values from another list
(20 answers)
Closed 3 years ago.
I have two separate lists: one (list1) contains a list of lists, and the other (list2) contains simple numeric values.
I want to sort list1 based on the values of list2. Building a dict from zip gives an error: unhashable type.
list1=[[1 ,2],[2,1],[1,3],[1,9],[6,9],[3,5],[6,8],[4,5],[7,9]]
list2=[0.0,1.4142135623730951,1.0,7.0,8.602325267042627, 3.605551275463989,7.810249675906654,4.242640687119285,9.219544457292887]
keydict=dict(zip(list1,list2))  # Gives error: unhashable type
Can anybody suggest a solution?

You can use zip() + sorted():
[x for x, _ in sorted(zip(list1, list2), key=lambda x: x[1])]
Code:
list1 = [[1 ,2],[2,1],[1,3],[1,9],[6,9],[3,5],[6,8],[4,5],[7,9]]
list2 = [0.0,1.4142135623730951,1.0,7.0,8.602325267042627, 3.605551275463989,7.810249675906654,4.242640687119285,9.219544457292887]
print([x for x, _ in sorted(zip(list1, list2), key=lambda x: x[1])])
# [[1, 2], [1, 3], [2, 1], [3, 5], [4, 5], [1, 9], [6, 8], [6, 9], [7, 9]]

Use the list comprehension below (it assumes the values in list2 are unique, since list2.index returns the first match only):
print([list1[list2.index(i)] for i in sorted(list2)])
Output:
[[1, 2], [1, 3], [2, 1], [3, 5], [4, 5], [1, 9], [6, 8], [6, 9], [7, 9]]
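When the key list may contain repeated values, sorting the indices instead of the values avoids the index lookup. This is a sketch under that assumption; the helper name sort_by_keys is hypothetical.

```python
def sort_by_keys(items, keys):
    """Return items reordered so that their matching keys are ascending.

    Ties keep their original relative order because sorted() is stable.
    """
    if len(items) != len(keys):
        raise ValueError("items and keys must have the same length")
    # Sort the positions 0..n-1 by the key stored at each position
    order = sorted(range(len(keys)), key=keys.__getitem__)
    return [items[i] for i in order]


print(sort_by_keys(["a", "b", "c"], [2, 1, 2]))  # ['b', 'a', 'c']
```

Unlike dict(zip(list1, list2)), nothing here needs the sub-lists to be hashable, because they are never used as keys.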

Sort only the first two elements in each sub-array of 3 elements

I have an array:
list=[[1, 2, 3], [31, 10, 2], [7, 2, 4]]
And I want to sort only the first two elements in each sub-array (in increasing order).
The list should look like this:
list=[[1, 2, 3], [10, 31, 2], [2, 7, 4]]
The code that I used was:
list=[[1, 2, 3], [10, 31, 2], [2, 7, 4]]
for i in list:
    if list[0]>list[1]:
        list[0],list[1]=list[1],list[0]
    else:
        print('Do Nothing')
Is there any faster way to do it?

I don't think this will work, as you're only swapping the first two sublists of list (list[0] and list[1]) instead of the first two elements of each sublist (i[0] and i[1]).
I would use something like this:
list_unsorted = [[1, 2, 3], [31, 10, 2], [7, 2, 4]]
list_sorted = [ [min(i[0], i[1]), max(i[0], i[1]), i[2] ] for i in list_unsorted ]
Also, I renamed list to list_unsorted, as list is already the name of a built-in type in Python.

Not really faster, at least not without more context on how the input is obtained and access to more tools, such as numpy.
But here is a perhaps more Pythonic way:
alist=[[1, 2, 3, 5], [31, 10, 2], [7, 2, 4]]
def p(l):
    a,b,c=l[0], l[1], l[2:]
    if a>b:
        a,b=b,a
    return [a,b, *c]
print(list(map(p, alist)))
This prevents the kind of error you made by using the variable "list" instead of the variable "i" in your code, as it would be detected by the interpreter.
As a good practice, list should not be used as a variable name, because it shadows the Python built-in (it is a built-in name, not a keyword).
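Both answers hard-code the first two positions. A more general sketch, assuming each sub-list has at least n comparable leading items, sorts a prefix with slicing; the name sort_prefix and the parameter n are hypothetical.

```python
def sort_prefix(row, n=2):
    """Return a new list with the first n items sorted and the rest unchanged."""
    return sorted(row[:n]) + list(row[n:])


data = [[1, 2, 3], [31, 10, 2], [7, 2, 4]]
print([sort_prefix(row) for row in data])  # [[1, 2, 3], [10, 31, 2], [2, 7, 4]]
```

Because it builds new lists, the input is left untouched, which differs from the in-place swap attempted in the question.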

Identify duplicate lists in list of lists?

How can I compare a list of lists with itself in Python in order to:
identify identical sublists with the same items (not necessarily in the same item order)
delete these duplicate sublists
Example:
list = [ [1, 3, 5, 6], [7, 8], [10, 12], [9], [3, 1, 5, 6], [12, 10] ]
clean_list = [ [1, 3, 5, 6], [7, 8], [10, 12], [9] ]
Any help is greatly appreciated.
I can't seem to figure this out.

I would rebuild the "clean_list" in a list comprehension, checking that the sorted version of the sublist isn't already among the previous elements:
the_list = [ [1, 3, 5, 6], [7, 8], [10, 12], [9], [3, 1, 5, 6], [12, 10] ]
clean_list = [l for i,l in enumerate(the_list) if all(sorted(l)!=sorted(the_list[j]) for j in range(0,i))]
print(clean_list)
Of course, sorting the items on each iteration is time-consuming, so you could prepare a sorted list of sublists:
the_sorted_list = [sorted(l) for l in the_list]
and use it:
clean_list = [the_list[i] for i,l in enumerate(the_sorted_list) if all(l!=the_sorted_list[j] for j in range(0,i))]
result (in both cases):
[[1, 3, 5, 6], [7, 8], [10, 12], [9]]
As many suggested, a simple for loop (no list comprehension) that stores the already-seen items in a set would likely perform better for the duplicate lookup. That alternative may be necessary if the input list is really big, to avoid the O(n) scan that all performs for every element.
An example implementation could be:
test_set = set()
clean_list = []
for l in the_list:
    sl = sorted(l)
    tsl = tuple(sl)  # tuples are hashable, lists are not
    if tsl not in test_set:
        test_set.add(tsl)  # note it down to avoid inserting it next time
        clean_list.append(sl)
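The loop above appends the sorted copy of each sublist. If the original item order inside each kept sublist matters, a variant can use the sorted tuple only as the lookup key. This is a sketch; the function name dedupe_unordered is hypothetical.

```python
def dedupe_unordered(sublists):
    """Drop sublists whose items match an earlier sublist in any order.

    The first occurrence is kept with its original item order.
    """
    seen = set()
    kept = []
    for sub in sublists:
        key = tuple(sorted(sub))  # order-insensitive, hashable key
        if key not in seen:
            seen.add(key)
            kept.append(sub)
    return kept


the_list = [[1, 3, 5, 6], [7, 8], [10, 12], [9], [3, 1, 5, 6], [12, 10]]
print(dedupe_unordered(the_list))  # [[1, 3, 5, 6], [7, 8], [10, 12], [9]]
```

A sorted tuple keeps repeated items, so [1, 1, 2] and [1, 2] are treated as different; a frozenset key would merge them.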

Create a set. Then, for each sub-list, sort it, convert it to a tuple, and insert it into the set.
setOfLists = set()
for sub in listOfLists:
    sub.sort()  # sorts the sub-list in place
    setOfLists.add(tuple(sub))
print(setOfLists)
You can convert the tuples in the set back into lists afterwards.

Simple for loops will work, and if your dataset is small, e.g. 1k items or fewer, you can use this:
# a is the input list of lists
b = []
for i in a:
    if not any(set(j) == set(i) for j in b):
        b.append(i)
print(b)

So here's my take on this.
I define a function that sorts each sublist and appends it to a temporary list. Then I check whether each sublist in temp_my_list is not in temp_clean_list, and if so append it to a new list. This should work for any two lists of lists. I added some extra sublists to show a result other than an empty list.
my_list = [[1, 3, 5, 6], [7, 8], [10, 12], [9], [3, 1, 5, 6], [12, 10],[16]]
clean_list = [ [1, 3, 5, 6], [7, 8], [10, 12], [9],[18]]
new_list = []
def getNewList():
    temp_my_list = []
    temp_clean_list = []
    for sublist in my_list:
        sublist.sort()
        temp_my_list.append(sublist)
    for sublist in clean_list:
        sublist.sort()
        temp_clean_list.append(sublist)
    for sublist in temp_my_list:
        if sublist not in temp_clean_list:
            new_list.append(sublist)
getNewList()
print (new_list)
Result:
[[16]]

python how to link sublist together in a list which describe a tree

For example, list = [[1,0],[2,1],[6,2],[7,6],[7,8],[15,13],[8,15]]
describes this tree:
      _7_
    _6   8_
  _2       _15
How can I get a new list that contains all of these numbers?
For the example list =
[[1,0],[2,1],[6,2],[7,6],[7,8],[15,13],[8,15]]
the output would be new_list=[0,1,2,6,7,8,15,13] (order not important).
My biggest problem is linking [6,2],[7,6],[7,8],[8,15] together.

Simple, if order doesn't matter:
l =[[1,0],[2,1],[6,2],[7,6],[7,8],[15,13],[8,15]]
# flatten list
t = sum(l,[])
# convert to a set to remove duplicate values
# (to keep the original order you would need something like an OrderedDict instead)
list(set(t)) # [0, 1, 2, 6, 7, 8, 13, 15]
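For the order-preserving case mentioned in the comment, a plain dict is enough on Python 3.7 and later, where dicts keep insertion order. The sketch below assumes that Python version; the helper name flatten_unique is hypothetical.

```python
from itertools import chain


def flatten_unique(pairs):
    """Flatten a list of lists and drop repeats, keeping first-seen order."""
    # chain.from_iterable flattens one level without building
    # the intermediate lists that sum(l, []) creates
    return list(dict.fromkeys(chain.from_iterable(pairs)))


edges = [[1, 0], [2, 1], [6, 2], [7, 6], [7, 8], [15, 13], [8, 15]]
print(flatten_unique(edges))  # [1, 0, 2, 6, 7, 8, 15, 13]
```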

If the order is not important, you can loop through the list and its sub-lists and check whether each element is already in new_list:
lst = [[1, 0], [2, 1], [6, 2], [7, 6], [7, 8], [15, 13], [8, 15]]
new_list = []
for sub in lst:
    for elem in sub:
        if elem not in new_list:
            new_list.append(elem)
print(new_list)
output:
[1, 0, 2, 6, 7, 8, 15, 13]

Leaving Tree and Graph Theory aside:
lst = [[1,0],[2,1],[6,2],[7,6],[7,8],[15,13],[8,15]]
uniq = {}
for i in lst:
    uniq.update({i[0]: True, i[1]: True})
print(sorted(uniq))
# [0, 1, 2, 6, 7, 8, 13, 15]
Using a Python set:
lst = [[1,0],[2,1],[6,2],[7,6],[7,8],[15,13],[8,15]]
uniq = set()
for i in lst:
    uniq.add(i[0])
    uniq.add(i[1])
print(uniq)
