Find duplicates within and outside nested list in python

I have a list s:
s=[[1,2,1],[2,2,1],[1,2,1]]
Case 1: Remove the duplicate groups in the list
Case 2: Remove the duplicate values within the groups
Desired Result:
Case 1 : [[1,2,1],[2,2,1]]
Case 2 : [[1,2],[2,1],[1,2]]
I tried using list(set(s)), but it raises an error:
TypeError: unhashable type: 'list'

IIUC,
Case 1:
Convert the sublists to tuples so they can be hashed, apply a set to the tuples to remove the duplicates, then convert back to lists. Note that a set does not preserve the original order.
out1 = list(map(list, set(map(tuple, s))))
# [[1, 2, 1], [2, 2, 1]] (order may vary)
Case 2:
For each sublist, remove the duplicates while keeping order by converting to dictionary keys (which are unique), then back to a list:
out2 = [list(dict.fromkeys(l)) for l in s]
# [[1, 2], [2, 1], [1, 2]]
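If the original order of the groups matters in Case 1, one option is to swap the set for dict.fromkeys, which keeps the first occurrence of each key in insertion order (guaranteed from Python 3.7). A minimal sketch, where the name out1_ordered is only illustrative:

```python
s = [[1, 2, 1], [2, 2, 1], [1, 2, 1]]

# Tuples are hashable, so they can serve as dict keys; dict.fromkeys keeps
# only the first occurrence of each key, in insertion order.
out1_ordered = [list(t) for t in dict.fromkeys(map(tuple, s))]
print(out1_ordered)  # [[1, 2, 1], [2, 2, 1]]
```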

Python cannot build a set of lists, because lists are not hashable. The simplest way to handle the first case is to build a new list, appending each sublist only if it is not already there:
temp_list = []
for each_element in [[1,2,1],[2,2,1],[1,2,1]]:
    if each_element not in temp_list:
        temp_list.append(each_element)
print(temp_list)
The output:
[[1, 2, 1], [2, 2, 1]]
Case 2 is simpler:
temp_list = []
for each_element in [[1,2,1],[2,2,1],[1,2,1]]:
    temp_list.append(list(set(each_element)))
print(temp_list)
And this is the output:
[[1, 2], [1, 2], [1, 2]]
Note that a set does not keep the original order, so the second group comes out as [1, 2] here rather than the desired [2, 1].
This code is not the most Pythonic way of doing things, but it is simple enough for beginners to understand.
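If Case 2 has to keep the first-seen order inside each group, as in the desired result, the loop can track the values it has already seen instead of converting to a set. This is only a sketch: the helper name dedupe_keep_order is hypothetical, and it assumes the inner values are hashable.

```python
def dedupe_keep_order(values):
    """Return the values with duplicates removed, keeping first-seen order."""
    seen = set()
    result = []
    for value in values:
        if value not in seen:
            seen.add(value)
            result.append(value)
    return result


s = [[1, 2, 1], [2, 2, 1], [1, 2, 1]]
case2 = [dedupe_keep_order(group) for group in s]
print(case2)  # [[1, 2], [2, 1], [1, 2]]
```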

Related

Remove list from list of lists if condition is met

I have a list of lists, each containing an index and two coordinates, [i,x,y], e.g.:
L=[[1,0,0],[2,0,1],[3,1,2]]
I want to check whether L[i][1] is repeated (as is the case in the example for i=0 and i=1) and keep only the sublist with the smallest i. In the example [2,0,1] would be removed and L would be:
L=[[1,0,0],[3,1,2]]
Is there a simple way to do such a thing?

Keep a set of the x coordinates we've already seen, traverse the input list sorted by ascending i, and build an output list, adding only the sublists whose x we haven't seen yet:
L = [[1, 0, 0], [2, 0, 1], [3, 1, 2]]
ans = []
seen = set()
for sl in sorted(L):
    if sl[1] not in seen:
        ans.append(sl)
        seen.add(sl[1])
L = ans
It works as required:
L
=> [[1, 0, 0], [3, 1, 2]]
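The same idea can be wrapped in a small function. The sketch below assumes rows of the form [i, x, y]; the function name keep_smallest_index_per_x is hypothetical and not part of the answer above. It sorts on the index only, so rows that share an index keep their input order.

```python
from operator import itemgetter


def keep_smallest_index_per_x(rows):
    """For each distinct x (rows[k][1]), keep the row with the smallest index."""
    seen_x = set()
    kept = []
    # Visit rows in ascending index order so the first row seen for an x wins.
    for row in sorted(rows, key=itemgetter(0)):
        x = row[1]
        if x not in seen_x:
            seen_x.add(x)
            kept.append(row)
    return kept


L = [[2, 0, 1], [3, 1, 2], [1, 0, 0]]  # deliberately unsorted input
print(keep_smallest_index_per_x(L))  # [[1, 0, 0], [3, 1, 2]]
```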

There are probably better solutions, but you can do it with:
i1_list=[]
result_list=[]
for i in L:
    if i[1] not in i1_list:
        result_list.append(i)
        i1_list.append(i[1])
print(result_list)

Take unique values out of a list with unhashable elements [duplicate]

So I have the following list:
test_list = ['Hallo', 42, [1, 2], 42, 3 + 2j, 'Hallo', 'Hello', [1, 2], [2, 3], 3 + 2j, 42]
Now I want to take the unique values from the list and print them on the screen. I've tried using the set function, but that doesn't work (TypeError: unhashable type: 'list') because of the [1, 2] and [2, 3] values in the list. I tried using the append and extend functions, but haven't come up with a solution yet.
Expected output:
['Hallo', 42, [1,2], (3+2j), 'Hello', [2,3]]
My attempt:
def unique_list(a_list):
    a = set(a_list)
    print(a)
a_list = ['Hallo', 42, [1, 2], 42, 3 + 2j, 'Hallo', 'Hello', [1, 2], [2, 3], 3 + 2j, 42]
print(unique_list(a_list))

If the list contains unhashable elements, create a hashable key using repr that can be used with a set:
def unique_list(a_list):
    seen = set()
    for x in a_list:
        key = repr(x)
        if key not in seen:
            seen.add(key)
            print(x)
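A variant that returns the unique items instead of printing them may be easier to reuse. This is a sketch with the hypothetical name unique_by_repr; it assumes that two items count as duplicates exactly when their repr strings match, which is stricter than ==.

```python
def unique_by_repr(items):
    """Return items whose repr() has not been seen before, in original order."""
    seen = set()
    result = []
    for x in items:
        key = repr(x)  # strings are hashable even when x itself is not
        if key not in seen:
            seen.add(key)
            result.append(x)
    return result


test_list = ['Hallo', 42, [1, 2], 42, 3 + 2j, 'Hallo', 'Hello', [1, 2], [2, 3], 3 + 2j, 42]
print(unique_by_repr(test_list))
# ['Hallo', 42, [1, 2], (3+2j), 'Hello', [2, 3]]

# Caveat: 1 == 1.0, but repr(1) is '1' and repr(1.0) is '1.0', so both are kept.
print(unique_by_repr([1, 1.0]))  # [1, 1.0]
```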

You can use a simple for loop that appends only new elements:
test_list = ['Hallo', 42, [1, 2], 42, 3 + 2j, 'Hallo', 'Hello', [1, 2], [2, 3], 3 + 2j, 42]
new_list = []
for item in test_list:
    if item not in new_list:
        new_list.append(item)
print(new_list)
# ['Hallo', 42, [1, 2], (3+2j), 'Hello', [2, 3]]

To get the unique items from a list of non-hashables, one can partition the list by equivalence. This is a quadratic method: each item is compared against the first item of every existing partition, and if it equals none of them, a new partition is created just for that item. The unique items are then the first item of each partition.
If some of the items are hashable, one can restrict the partition by equivalence to just the non-hashables and feed the rest of the items through a set.
def partition(L):
    parts = []
    for item in L:
        for part in parts:
            if item == part[0]:
                part.append(item)
                break
        else:
            # no existing partition matched, so start a new one
            parts.append([item])
    return parts
def unique(L):
    return [p[0] for p in partition(L)]
Untested.
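The hybrid idea, routing hashable items through a set and comparing only the unhashable ones pairwise, could look roughly like this. It is a sketch rather than the answer's own code: unique_mixed is a hypothetical name, hashability is probed with hash(), and it assumes a hashable item never compares equal to an unhashable one.

```python
def unique_mixed(items):
    """De-duplicate a list mixing hashable and unhashable items, keeping order."""
    seen_hashable = set()   # constant-time membership for hashable items
    seen_unhashable = []    # linear scan with == for everything else
    result = []
    for item in items:
        try:
            hash(item)
        except TypeError:
            # Unhashable (e.g. a list): fall back to pairwise comparison.
            if item in seen_unhashable:
                continue
            seen_unhashable.append(item)
        else:
            if item in seen_hashable:
                continue
            seen_hashable.add(item)
        result.append(item)
    return result


test_list = ['Hallo', 42, [1, 2], 42, 3 + 2j, 'Hallo', 'Hello', [1, 2], [2, 3], 3 + 2j, 42]
print(unique_mixed(test_list))
# ['Hallo', 42, [1, 2], (3+2j), 'Hello', [2, 3]]
```

The quadratic cost then applies only to the unhashable items, which helps when they are a small fraction of the input.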

One approach that solves this in linear time is to serialize the items with a serializer such as pickle, so that unhashable objects such as lists can be added to a set for de-duplication. Since sets are unordered in Python and you apparently want the output in the original insertion order, you can use dict.fromkeys instead:
import pickle
list(map(pickle.loads, dict.fromkeys(map(pickle.dumps, test_list))))
so that given your sample input, this returns:
['Hallo', 42, [1, 2], (3+2j), 'Hello', [2, 3]]
Note that if you're using Python 3.6 or earlier, where the key order of dicts is not guaranteed, you can use collections.OrderedDict in place of dict.
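One caveat worth checking before relying on serialized bytes as keys: objects that compare equal do not always serialize identically. The snippet below is a small illustration using dicts built in different key orders; unique_by_pickle is a hypothetical wrapper around the one-liner above.

```python
import pickle


def unique_by_pickle(items):
    """De-duplicate by pickled bytes, keeping first-occurrence order."""
    return list(map(pickle.loads, dict.fromkeys(map(pickle.dumps, items))))


print(unique_by_pickle([[1, 2], [1, 2], [2, 3]]))  # [[1, 2], [2, 3]]

# Equal dicts with different insertion order pickle to different bytes,
# so both survive the de-duplication.
a = {'x': 1, 'y': 2}
b = {'y': 2, 'x': 1}
print(a == b)                         # True
print(len(unique_by_pickle([a, b])))  # 2
```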

You could do it with regular loops that run in O(n^2).
def unique_list(a_list):
    orig = a_list[:]  # shallow-copy the original list to avoid modifying it
    uniq = []  # start with an empty list as our result
    while len(orig) > 0:  # loop until the working copy is empty
        uniq.append(orig[0])  # append the first remaining element to the unique elements list
        while uniq[-1] in orig:  # then remove all occurrences of that element from the working copy
            orig.remove(uniq[-1])
    return uniq  # finally, return the unique elements in order of first occurrence in the original list
There's also probably a way to finagle this into a list comprehension, which would be more elegant, but I can't figure it out at the moment. If every element were hashable, you could use the set method and that would be easier.
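For what it's worth, one possible list-comprehension form keeps an item only if it does not appear earlier in the list. This is a sketch with the same O(n^2) cost, and it builds a slice per item, so it is not faster than the loop; unique_list_comp is a hypothetical name.

```python
def unique_list_comp(a_list):
    """Keep each item only if no equal item occurs before it in the list."""
    return [x for i, x in enumerate(a_list) if x not in a_list[:i]]


test_list = ['Hallo', 42, [1, 2], 42, 3 + 2j, 'Hallo', 'Hello', [1, 2], [2, 3], 3 + 2j, 42]
print(unique_list_comp(test_list))
# ['Hallo', 42, [1, 2], (3+2j), 'Hello', [2, 3]]
```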

Appending variable that is modified in a for loop doesn't work as expected [duplicate]

I have this code:
lst = []
given = [1, 2, 3, 4, 5]
result = []
for item in given:
    lst.append(item)
    print(lst)
    result.append(lst)
print(result)
My expected result is [[1], [1, 2], [1, 2, 3], ...], but the displayed result is [[1, 2, 3, 4, 5], ...], with [1, 2, 3, 4, 5] repeated 5 times. What is wrong?
The printed lst is as expected: [1] on the first iteration, [1, 2] on the second, and so on.

Python doesn't create a copy of lst each time you append it to result; it just inserts a reference. As a result, you get a list with N references to the same list.
To create a copy of lst you can use lst.copy(). The slice lst[:] works the same way.
A shortened version of your code:
given = [1, 2, 3, 4, 5]
result = [given[0 : i + 1] for i in range(len(given))]
print(result)
Result:
[[1], [1, 2], [1, 2, 3], [1, 2, 3, 4], [1, 2, 3, 4, 5]]
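The difference between storing a reference and storing a copy can be checked directly with the is operator. A small sketch, using a shortened input and the illustrative names aliased and copied:

```python
given = [1, 2, 3]

lst = []
aliased = []
for item in given:
    lst.append(item)
    aliased.append(lst)        # stores a reference to the same list object

lst = []
copied = []
for item in given:
    lst.append(item)
    copied.append(lst.copy())  # stores an independent snapshot

print(aliased)  # [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
print(all(entry is aliased[0] for entry in aliased))  # True: one object, three references
print(copied)   # [[1], [1, 2], [1, 2, 3]]
```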

The problem is that appending the list itself appends a reference to the original list object. Therefore, whenever the original list is modified, the change is visible everywhere that reference is stored, in this case in result. As the for loop iterates, every reference appended to result keeps reflecting the latest value of lst. At the end of the loop you have appended 5 references to the same list lst, and all of them show its final value, [1,2,3,4,5].
There are several ways to avoid this. What you need is to copy the values. One way is to use lst[:]; another is to use lst.copy().
for item in given:
    lst.append(item)
    print(lst)
    result.append(lst[:])
print(result)
# [[1], [1, 2], [1, 2, 3], [1, 2, 3, 4], [1, 2, 3, 4, 5]]

A list is a mutable data type: there is only one copy of a list in memory unless you explicitly copy it. So
result.append(lst)
just appends a reference to that single list, and all the references point to the same object.
In conclusion, you should learn about mutable/immutable data types and object references in Python.

Appending lst.copy() gives the right output.
lst = []
given = [1,2,3,4,5]
result = []
for item in given:
    lst.append(item)
    print(lst)
    result.append(lst.copy())
print(result)

Python - mapping list of lists to dictionary

I want to make a correlation between a list of lists and a dictionary of lists.
On one hand, I have the following list:
list_1=[['new','address'],['hello'],['I','am','John']]
and on the other hand I have a dictionary of lists:
dict={'new':[1,3,4], 'address':[0,1,2], 'hello':[7,8,9], 'I':[1,1,1], 'John':[1,3,4]}
What I want to get is a new list of lists (of lists) like this:
list_2=[[[1,3,4],[0,1,2]],[[7,8,9]],[[1,1,1],[0,0,0],[1,3,4]]]
This means that every word from list_1 is mapped to its value in the dictionary dict. Note also that 'am' from list_1, which is not found in dict, takes the value [0,0,0].
Thanks in advance for the help.

Just rebuild the list of lists with dict.get lookups, passing a default value in case the key isn't found:
list_1=[['new','address'],['hello'],['I','am','John']]
d={'new':[1,3,4], 'address':[0,1,2], 'hello':[7,8,9], 'I':[1,1,1], 'John':[1,3,4]}
list_2=[[d.get(k,[0,0,0]) for k in sl] for sl in list_1]
print(list_2)
result:
[[[1, 3, 4], [0, 1, 2]], [[7, 8, 9]], [[1, 1, 1], [0, 0, 0], [1, 3, 4]]]
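One thing to keep in mind with this approach: for keys that are found, d.get returns the very list stored in the dictionary, so list_2 shares those inner lists with d. If list_2 will be modified later, copying each value avoids surprises. A sketch, with shortened sample data assumed for illustration:

```python
list_1 = [['new', 'address'], ['am']]
d = {'new': [1, 3, 4], 'address': [0, 1, 2]}

# Without copying, the inner lists are the dictionary's own list objects.
shared = [[d.get(k, [0, 0, 0]) for k in sl] for sl in list_1]
print(shared[0][0] is d['new'])  # True

# Wrapping the lookup in list() gives each entry its own shallow copy.
copied = [[list(d.get(k, [0, 0, 0])) for k in sl] for sl in list_1]
copied[0][0].append(99)
print(d['new'])  # [1, 3, 4]
print(copied)    # [[[1, 3, 4, 99], [0, 1, 2]], [[0, 0, 0]]]
```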

list_1=[['new','address'],['hello'],['I','am','John']]
dict={'new':[1,3,4], 'address':[0,1,2], 'hello':[7,8,9], 'I':[1,1,1], 'John':[1,3,4]}
list_2=[[dict[x] for x in l if x in dict] for l in list_1]
If you want an entry (here an empty list) even when the key doesn't exist in dict:
list_2=[[dict.get(x, []) for x in l] for l in list_1]

python 2-D list how to make a set [duplicate]

I have a Python list as follows:
list1 = [[1,2],[3,4],[1,2]]
I want to make a set so I can get the unique list items, like
list2 = [[1,2],[3,4]]
Is there some function in Python I can use? Thanks

That will do:
>>> list1 = [[1,2],[3,4],[1,2]]
>>> list2 = list(map(list, set(map(tuple,list1))))
>>> list2
[[1, 2], [3, 4]]

Unfortunately, there is no single built-in function that can handle this. Lists are "unhashable" (see this SO post), so you cannot have a set of lists in Python.
But tuples are hashable:
l = [[1, 2], [3, 4], [1, 2]]
s = {tuple(x) for x in l}
print(s)
# out: {(1, 2), (3, 4)}
Of course, this won't help you if you want to later, say, append to these lists inside your main data structure, as they are now all tuples. If you absolutely must have the original list functionality, you can check out this code recipe for uniquification by Tim Peters.

Note that this only removes duplicate sublists; it does not take the sublists' individual elements into account. Ex: [[1,2,3], [1,2], [1]] -> [[1,2,3], [1,2], [1]]
>>> print(list(map(list, {tuple(sublist) for sublist in list1})))
[[1, 2], [3, 4]]

You can try this:
list1 = [[1,2],[3,4],[1,2]]
list2 = []
for i in list1:
    if i not in list2:
        list2.append(i)
print(list2)
[[1, 2], [3, 4]]

The most typical solutions have already been posted, so let's give a new one:
Python 2.x
list1 = [[1, 2], [3, 4], [1, 2]]
list2 = {str(v): v for v in list1}.values()
Python 3.x
list1 = [[1, 2], [3, 4], [1, 2]]
list2 = list({str(v): v for v in list1}.values())

There is no inbuilt single function to achieve this. You have received many answers. In addition to those, you may also use a lambda function to achieve this:
list(map(list, set(map(lambda i: tuple(i), list1))))
