Python - mapping list of lists to dictionary - python

I want to map a list of lists onto a dictionary of lists.
On one hand, I have the following list:
list_1=[['new','address'],['hello'],['I','am','John']]
and on the other hand I have a dictionary of lists:
dict={'new':[1,3,4], 'address':[0,1,2], 'hello':[7,8,9], 'I':[1,1,1], 'John':[1,3,4]}
What I want to get is a new list of lists (of lists) like this:
list_2=[[[1,3,4],[0,1,2]],[[7,8,9]],[[1,1,1],[0,0,0],[1,3,4]]]
This means that every word in list_1 is mapped to its value in the dictionary dict. Note also that 'am', which appears in list_1 but is not a key in dict, takes the value [0,0,0].
Thanks in advance for the help.

Just rebuild the list of lists with dictionary lookups, using dict.get with a default value in case the key isn't found:
list_1=[['new','address'],['hello'],['I','am','John']]
d={'new':[1,3,4], 'address':[0,1,2], 'hello':[7,8,9], 'I':[1,1,1], 'John':[1,3,4]}
list_2=[[d.get(k,[0,0,0]) for k in sl] for sl in list_1]
print(list_2)
result:
[[[1, 3, 4], [0, 1, 2]], [[7, 8, 9]], [[1, 1, 1], [0, 0, 0], [1, 3, 4]]]

list_1=[['new','address'],['hello'],['I','am','John']]
dict={'new':[1,3,4], 'address':[0,1,2], 'hello':[7,8,9], 'I':[1,1,1], 'John':[1,3,4]}
list_2=[[dict[x] for x in l if x in dict] for l in list_1]
If you want a list (here an empty one) even when the key doesn't exist in dict:
list_2=[[dict.get(x, []) for x in l] for l in list_1]
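The answers above differ only in how they treat a word that is missing from the dictionary. A minimal sketch comparing the two behaviours on the question's data; the name `lookup` is a hypothetical rename of the question's `dict` variable, used here so the built-in type is not shadowed.

```python
list_1 = [['new', 'address'], ['hello'], ['I', 'am', 'John']]
# 'lookup' is a hypothetical rename of the question's 'dict' variable,
# chosen so the built-in dict type is not shadowed.
lookup = {'new': [1, 3, 4], 'address': [0, 1, 2], 'hello': [7, 8, 9],
          'I': [1, 1, 1], 'John': [1, 3, 4]}

# Variant A: silently drop words that are not in the dictionary.
dropped = [[lookup[w] for w in words if w in lookup] for words in list_1]

# Variant B: substitute a placeholder for missing words.
# The literal [0, 0, 0] is evaluated on every call, so each missing
# word gets its own fresh list rather than a shared one.
padded = [[lookup.get(w, [0, 0, 0]) for w in words] for words in list_1]

print(dropped)  # 'am' is absent from the last sublist
print(padded)   # 'am' is replaced by [0, 0, 0]
```

Only variant B keeps each output sublist the same length as the corresponding input sublist, which is what the question asks for.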

Related

Find duplicates within and outside nested list in python

I have a list s:
s=[[1,2,1],[2,2,1],[1,2,1]]
Case 1: Remove the duplicate groups in the list
Case 2: Remove the duplicate values within the groups
Desired Result:
Case 1 : [[1,2,1],[2,2,1]]
Case 2 : [[1,2],[2,1],[1,2]]
I tried using the list(set(s)) but it throws up an error:
unhashable type: 'list'
IIUC,
Case 1:
Convert the lists to tuple for hashing, then apply a set on the list of tuples to remove the duplicates. Finally, convert back to lists.
out1 = list(map(list, set(map(tuple, s))))
# [[1, 2, 1], [2, 2, 1]]
Case 2:
For each sublist, remove the duplicates while keeping order with conversion to dictionary keys (that are unique), then back to list:
out2 = [list(dict.fromkeys(l)) for l in s]
# [[1, 2], [2, 1], [1, 2]]
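One thing to keep in mind for Case 1: a set does not promise any particular order, so the sublists may come back in a different order than they appeared. If the original order matters, the dict.fromkeys trick from Case 2 can be applied to the tuples as well. A small sketch, assuming Python 3.7+ where dict insertion order is guaranteed; the name `out1_ordered` is my own.

```python
s = [[1, 2, 1], [2, 2, 1], [1, 2, 1]]

# dict keys are unique and (since Python 3.7) keep insertion order,
# so the first occurrence of each sublist determines its position.
out1_ordered = [list(t) for t in dict.fromkeys(map(tuple, s))]

print(out1_ordered)
```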
You need to know that it's not possible in Python to have a set of lists. The reason is that lists are not hashable. The simplest way to do your task in the first case is to use a new list of lists without duplicates such as below:
temp_list = []
for each_element in [[1,2,1],[2,2,1],[1,2,1]]:
    if each_element not in temp_list:
        temp_list.append(each_element)
print(temp_list)
The output:
[[1, 2, 1], [2, 2, 1]]
Case 2 is simpler:
temp_list = []
for each_element in [[1,2,1],[2,2,1],[1,2,1]]:
    temp_list.append(list(set(each_element)))
print(temp_list)
And this is the output (a set does not keep the original element order, so the second group comes out as [1, 2] rather than the desired [2, 1]):
[[1, 2], [1, 2], [1, 2]]
These snippets are not the most Pythonic way of doing things, but they are simple enough for beginners to understand.

Remove list from list of lists if condition is met

I have a list of lists, each containing an index and two coordinates, [i,x,y], e.g.:
L=[[1,0,0],[2,0,1],[3,1,2]]
I want to check if L[i][1] is repeated (as is the case in the example for i=0 and i=1) and keep in the list only the list with the smallest i. In the example [2,0,1] would be removed and L would be:
L=[[1,0,0],[3,1,2]]
Is there a simple way to do such a thing?
Keep a set of the x coordinates we've already seen, traverse the input list sorted by ascending i, and build an output list, adding only the sublists whose x we haven't seen yet:
L = [[1, 0, 0], [2, 0, 1], [3, 1, 2]]
ans = []
seen = set()
for sl in sorted(L):
    if sl[1] not in seen:
        ans.append(sl)
        seen.add(sl[1])
L = ans
It works as required:
L
=> [[1, 0, 0], [3, 1, 2]]
There are probably better solutions, but you can do it with:
i1_list=[]
result_list=[]
for i in L:
    if i[1] not in i1_list:
        result_list.append(i)
        i1_list.append(i[1])
print(result_list)
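Both answers in this thread depend on the rows being visited in ascending i (the first sorts the list, the second assumes it is already ordered). A sketch of an alternative that makes no such assumption, by remembering the best row per x in a dictionary; the function name `keep_smallest_index` is hypothetical.

```python
def keep_smallest_index(rows):
    """Keep, for each x value, only the [i, x, y] row with the smallest i."""
    best = {}  # maps x -> row with the smallest i seen so far
    for row in rows:
        x = row[1]
        if x not in best or row[0] < best[x][0]:
            best[x] = row
    # Return the survivors ordered by i.
    return sorted(best.values(), key=lambda row: row[0])

L = [[2, 0, 1], [3, 1, 2], [1, 0, 0]]  # deliberately unsorted input
print(keep_smallest_index(L))
```

The dictionary lookup is constant time on average, so the scan itself is linear; only the final sort of the surviving rows costs more.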

Take unique values out of a list with unhashable elements [duplicate]

This question already has an answer here:
Python, TypeError: unhashable type: 'list'
(1 answer)
Closed 3 years ago.
So I have the following list:
test_list = ['Hallo', 42, [1, 2], 42, 3 + 2j, 'Hallo', 'Hello', [1, 2], [2, 3], 3 + 2j, 42]
Now I want to take the unique values from the list and print them on the screen. I've tried using the set function, but that doesn't work (TypeError: unhashable type: 'list') because of the [1,2] and [2,3] values in the list. I tried using the append and extend functions, but haven't come up with a solution yet.
expectation:
['Hallo', 42, [1,2], (3+2j), 'Hello', [2,3]]
def unique_list(a_list):
    a = set(a_list)
    print(a)
a_list = ['Hallo', 42, [1, 2], 42, 3 + 2j, 'Hallo', 'Hello', [1, 2], [2, 3], 3 + 2j, 42]
print(unique_list(a_list))
If the list contains unhashable elements, create a hashable key with repr that can be used with a set:
def unique_list(a_list):
    seen = set()
    for x in a_list:
        key = repr(x)
        if key not in seen:
            seen.add(key)
            print(x)
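The repr-based function above prints each unique item instead of returning them. A minimal variant that collects the items into a list, assuming that "same repr" is an acceptable definition of "duplicate"; the name `unique_by_repr` is my own.

```python
def unique_by_repr(items):
    """Return items in first-seen order, treating equal repr() as equal."""
    seen = set()
    result = []
    for x in items:
        key = repr(x)
        if key not in seen:
            seen.add(key)
            result.append(x)
    return result

test_list = ['Hallo', 42, [1, 2], 42, 3 + 2j, 'Hallo', 'Hello', [1, 2], [2, 3], 3 + 2j, 42]
print(unique_by_repr(test_list))

# Caveat: repr() equality is not the same as ==.
print(unique_by_repr([1, 1.0]))  # both are kept, although 1 == 1.0
```

The caveat cuts both ways: objects that compare equal can have different reprs, so this is best suited to plain built-in values like the ones in the question.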
You can use a simple for loop that appends only new elements:
test_list = ['Hallo', 42, [1, 2], 42, 3 + 2j, 'Hallo', 'Hello', [1, 2], [2, 3], 3 + 2j, 42]
new_list = []
for item in test_list:
    if item not in new_list:
        new_list.append(item)
print(new_list)
# ['Hallo', 42, [1, 2], (3+2j), 'Hello', [2, 3]]
To get the unique items from a list of non-hashables, one can partition by equivalence. This is a quadratic method: each item is compared against the first item of every existing partition, and if it equals none of them, a new partition is created just for that item. The unique items are then the first item of each partition.
If some of the items are hashable, one can restrict the partition by equivalence to just the non-hashables and feed the rest of the items through a set.
def partition(L):
    parts = []
    for item in L:
        for part in parts:
            if item == part[0]:
                part.append(item)
                break
        else:
            # no existing partition matched, so start a new one
            parts.append([item])
    return parts
def unique(L):
    return [p[0] for p in partition(L)]
Untested.
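The idea of sending hashable items through a set and only comparing the non-hashables pairwise is described above but not coded. A rough sketch of how that could look, assuming first-seen order should be preserved; the function name `unique_mixed` and its internal names are hypothetical.

```python
def unique_mixed(items):
    """De-duplicate in first-seen order, using a set where possible."""
    seen_hashable = set()   # fast path for hashable items
    seen_unhashable = []    # linear scan for everything else
    result = []
    for item in items:
        try:
            is_new = item not in seen_hashable
            if is_new:
                seen_hashable.add(item)
        except TypeError:
            # Raised when the item (or something nested in it) is unhashable.
            is_new = item not in seen_unhashable
            if is_new:
                seen_unhashable.append(item)
        if is_new:
            result.append(item)
    return result

test_list = ['Hallo', 42, [1, 2], 42, 3 + 2j, 'Hallo', 'Hello', [1, 2], [2, 3], 3 + 2j, 42]
print(unique_mixed(test_list))
```

The quadratic cost is now paid only among the unhashable items, so a list that is mostly strings and numbers stays close to linear.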
One approach that solves this in linear time is to serialize items with serializers such as pickle so that unhashable objects such as lists can be added to a set for de-duplication, but since sets are unordered in Python and you apparently want the output to be in the original insertion order, you can use dict.fromkeys instead:
import pickle
list(map(pickle.loads, dict.fromkeys(map(pickle.dumps, test_list))))
so that given your sample input, this returns:
['Hallo', 42, [1, 2], (3+2j), 'Hello', [2, 3]]
Note that if you're using Python 3.6 or earlier versions where key orders of dicts are not guaranteed, you can use collections.OrderedDict in place of dict.
You could do it in a regular for loop that runs in O(n^2).
def unique_list(a_list):
    orig = a_list[:]  # shallow-copy original list to avoid modifying it
    uniq = []  # start with an empty list as our result
    while len(orig) > 0:  # iterate through the original list
        uniq.append(orig[0])  # for each element, append it to the unique elements list
        while uniq[-1] in orig:  # then, remove all occurrences of that element in the original list
            orig.remove(uniq[-1])
    return uniq  # finally, return the list of unique elements in order of first occurrence in the original list
There's also probably a way to finagle this into a list comprehension, which would be more elegant, but I can't figure it out at the moment. If every element was hashable you could use the set method and that would be easier.
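For what it's worth, one possible list-comprehension form keeps an element only when it has not occurred earlier in the list. This is a sketch rather than a faster method: it is still quadratic, and the name `unique_list_comp` is my own.

```python
def unique_list_comp(a_list):
    # Keep an element only if it does not appear earlier in the list.
    # The slice a_list[:i] is rescanned for every element, so this is
    # still quadratic; it only needs ==, not hashing.
    return [x for i, x in enumerate(a_list) if x not in a_list[:i]]

test_list = ['Hallo', 42, [1, 2], 42, 3 + 2j, 'Hallo', 'Hello', [1, 2], [2, 3], 3 + 2j, 42]
print(unique_list_comp(test_list))
```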

python 2-D list how to make a set [duplicate]

This question already has answers here:
How to remove duplicate lists in a list of list? [duplicate]
(2 answers)
Closed 6 years ago.
Given a Python list as follows:
list1 = [[1,2],[3,4],[1,2]]
I want to make a set so that I get only the unique list items, like
list2 = [[1,2],[3,4]].
Is there a function in Python I can use for this? Thanks.
That will do:
>>> list1 = [[1,2],[3,4],[1,2]]
>>> list2 = list(map(list, set(map(tuple,list1))))
>>> list2
[[1, 2], [3, 4]]
Unfortunately, there is not a single built-in function that can handle this. Lists are "unhashable" (see this SO post). So you cannot have a set of list in Python.
But tuples are hashable:
l = [[1, 2], [3, 4], [1, 2]]
s = {tuple(x) for x in l}
print(s)
# out: {(1, 2), (3, 4)}
Of course, this won't help you if you want to later, say, append to these lists inside your main data structure, as they are now all tuples. If you absolutely must have the original list functionality, you can check out this code recipe for uniquification by Tim Peters.
Note that this only removes duplicate sublists; it does not take into account the sublists' individual elements. Ex: [[1,2,3], [1,2], [1]] -> [[1,2,3], [1,2], [1]]
>>> print(list(map(list, {tuple(sublist) for sublist in list1})))
[[1, 2], [3, 4]]
You can try this:
list1 = [[1,2],[3,4],[1,2]]
list2 = []
for i in list1:
    if i not in list2:
        list2.append(i)
print(list2)
[[1, 2], [3, 4]]
The most typical solutions have already been posted, so let's give a new one:
Python 2.x
list1 = [[1, 2], [3, 4], [1, 2]]
list2 = {str(v): v for v in list1}.values()
Python 3.x
list1 = [[1, 2], [3, 4], [1, 2]]
list2 = list({str(v): v for v in list1}.values())
There is no inbuilt single function to achieve this. You have received many answers. In addition to those, you may also use a lambda function to achieve this:
list(map(list, set(map(lambda i: tuple(i), list1))))

Append feature frequency to existing list

I am looking for a fairly efficient way to append, to each item in a list, the frequency of a feature within that list.
For example, given this list:
[['syme', 4, 2], ['said', 4, 2], ['the', 3, 5]]
I would like to append to it the frequency with which the second two items occur in the list. In the list above, this would look something like this:
[['syme', 4, 2, 2], ['said', 4, 2, 2], ['the', 3, 5, 1]]
Here the last number in each item represents how frequently its two preceding numbers occur together as the second and third elements of the lists. (For example, [4, 2] appears twice and [3, 5] appears once, so the first two lists get a 2 appended and the third list gets a 1.)
The actual list may have several hundred thousand items, so both efficiency and readable code are valued here, and I would like to maintain the current order of the list.
Thanks in advance!
Probably the most performant method is to use collections.Counter to get the counts based on pairs:
from collections import Counter
counts = Counter(tuple(item[1:]) for item in lst)
then update the list accordingly:
for item in lst:
    item.append(counts[tuple(item[1:])])
If the order of the two items doesn't matter, wrap item[1:] with sorted(...) when creating counts and updating lst.
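The sorted(...) suggestion can be sketched as follows. The sample data is hypothetical (the second row is reversed to [2, 4] to show the effect), and `pair_key` is a helper name of my own, not part of the answer.

```python
from collections import Counter

# Hypothetical data: 'said' carries the pair in reversed order.
lst = [['syme', 4, 2], ['said', 2, 4], ['the', 3, 5]]

def pair_key(item):
    # Sorting makes (4, 2) and (2, 4) produce the same key.
    # The slice 1:3 ignores anything appended after the pair.
    return tuple(sorted(item[1:3]))

# Count every pair first, then append, so the list is scanned twice
# but each lookup is a constant-time dictionary access.
counts = Counter(pair_key(item) for item in lst)
for item in lst:
    item.append(counts[pair_key(item)])

print(lst)
```

Using one helper for both the counting pass and the lookup pass keeps the two keys from drifting apart if the definition of a "pair" changes later.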
You can use the collections.Counter class:
from collections import Counter
my_list = [['syme', 4, 2], ['said', 4, 2], ['the', 3, 5]]
counts = Counter([(x[1],x[2],) for x in my_list])
for sub_list in my_list:
    sub_list.append(counts[(sub_list[1], sub_list[2])])
If order doesn't matter:
from collections import Counter
a_list = [['syme', 4, 2], ['said', 4, 2], ['the', 3, 5]]
# frozenset takes a single iterable, so the pair is passed as one tuple
counts = Counter(frozenset((x[1], x[2])) for x in a_list)
for l in a_list:
    l.append(counts[frozenset((l[1], l[2]))])
