Remove duplicate from dictionary in python - python

I have 5 lists in a dictionary. I want to delete duplicates across all the lists, keeping each element only in the first list where it occurs.
dict = {1:[0,1,2,3], 2:[1,4,5], 3:[0,4,2,5,6], 4:[0,2,7,8], 5:[9]}
Output should look like this:
dict = {1:[0,1,2,3], 2:[4,5], 3:[6], 4:[7,8], 5:[9]}

You can make a set to store items that have been seen, and then sequentially update the dict according to the set:
d = {1:[0,1,2,3], 2:[1,4,5], 3:[0,4,2,5,6], 4:[0,2,7,8], 5:[9]}
seen = set()
for k, v in d.items():
    d[k] = [x for x in v if x not in seen]
    seen.update(d[k])
print(d) # {1: [0, 1, 2, 3], 2: [4, 5], 3: [6], 4: [7, 8], 5: [9]}
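If the original dictionary should stay untouched, the same idea can be wrapped in a function that builds a new dict. This is only a sketch: the name dedupe_across_lists is made up for illustration, it relies on dicts preserving insertion order (Python 3.7+), and unlike the loop above it also drops repeats inside a single list.

```python
def dedupe_across_lists(d):
    """Return a new dict; each value keeps only items not seen in earlier lists."""
    seen = set()
    result = {}
    for key, values in d.items():
        kept = []
        for x in values:
            if x not in seen:
                seen.add(x)  # also drops repeats inside the same list
                kept.append(x)
        result[key] = kept
    return result


data = {1: [0, 1, 2, 3], 2: [1, 4, 5], 3: [0, 4, 2, 5, 6], 4: [0, 2, 7, 8], 5: [9]}
print(dedupe_across_lists(data))
# {1: [0, 1, 2, 3], 2: [4, 5], 3: [6], 4: [7, 8], 5: [9]}
```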

A one-liner with a dictionary comprehension:
>>> {k: [i for i in v if i not in sum(list(dct.values())[:idx], [])] for idx, (k, v) in enumerate(dct.items())}
{1: [0, 1, 2, 3], 2: [4, 5], 3: [6], 4: [7, 8], 5: [9]}
>>>
It flattens all the value lists that come before the current key and keeps only the values that do not appear in them.
P.S. I renamed your dict to dct so it doesn't shadow the built-in dict type.
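Unpacked into a loop, the comprehension does roughly the following. This is an illustrative sketch: the names all_values and earlier are hypothetical, and itertools.chain.from_iterable stands in for the sum(..., []) flattening.

```python
from itertools import chain

dct = {1: [0, 1, 2, 3], 2: [1, 4, 5], 3: [0, 4, 2, 5, 6], 4: [0, 2, 7, 8], 5: [9]}

all_values = list(dct.values())
result = {}
for idx, (k, v) in enumerate(dct.items()):
    # Flatten every list that comes before the current key.
    earlier = set(chain.from_iterable(all_values[:idx]))
    # Keep only the items that did not appear in those earlier lists.
    result[k] = [i for i in v if i not in earlier]

print(result)
# {1: [0, 1, 2, 3], 2: [4, 5], 3: [6], 4: [7, 8], 5: [9]}
```

The earlier lists are flattened again for every key, so on large inputs the seen-set loop does less work.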

Related

how to append values to lists in dictionary?

I have a dictionary, say {0:[1,2,3],1[1,2,3],2:[1,2,3],3:[1,2,4],4:[1,2,4],5:[1,2,4]}. There are some duplicate values, but the keys are different. When I append a value to the list for key 0, Python also adds the value to the other duplicates, which I don't want.
my code :-
for k, v in f.items():
    if k == 0:
        v.append(1)
result:-
{0:[1,2,3,1],1[1,2,3,1],2:[1,2,3,1],3:[1,2,4],4:[1,2,4],5:[1,2,4]}
what i want is :-
{0:[1,2,3,1],1[1,2,3],2:[1,2,3],3:[1,2,4],4:[1,2,4],5:[1,2,4]}
I guess your dictionary is faulty. It should be like this:
f={0:[1,2,3],1:[1,2,3],2:[1,2,3],3:[1,2,4],4:[1,2,4],5:[1,2,4]}
for k, v in f.items():
    if k == 0:
        v.append(1)
print(f)
Output: {0: [1, 2, 3, 1], 1: [1, 2, 3], 2: [1, 2, 3], 3: [1, 2, 4], 4: [1, 2, 4], 5: [1, 2, 4]}
Isn't that what you want?
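The question does not show how f was built, so the following is only a guess at the cause: if several keys point at the same list object, an append through one key shows up under all of them. A minimal sketch of that assumption, and of giving each key its own copy (the names shared, base and g are made up for illustration):

```python
shared = [1, 2, 3]
f = {0: shared, 1: shared, 2: shared}  # assumed: all keys reference ONE list
f[0].append(1)
print(f)  # {0: [1, 2, 3, 1], 1: [1, 2, 3, 1], 2: [1, 2, 3, 1]}

# Giving every key its own copy keeps the lists independent.
base = [1, 2, 3]
g = {k: list(base) for k in range(3)}
g[0].append(1)
print(g)  # {0: [1, 2, 3, 1], 1: [1, 2, 3], 2: [1, 2, 3]}
```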

Dict from two lists including multiple values for keys

Is it possible to create a dict from two lists, where the same key occurs multiple times, without iterating over the whole dataset?
Minimal example:
keys = [1, 2, 3, 2, 3, 4, 5, 1]
values = [1, 2, 3, 4, 5, 6, 7, 8]
# hoped for result:
dictionary = dict(???)
dictionary = {1 : [1,8], 2:[2,4], 3:[3,5], 4:[6], 5:[7]}
When using zip the key-value-pair is inserted overwriting the old one:
dictionary = dict(zip(keys,values))
dictionary = {1: 8, 2: 4, 3: 5, 4: 6, 5: 7}
I would be happy with a Multidict as well.
This is one approach that doesn't require two for loops:
from collections import defaultdict
h = defaultdict(list)
for k, v in zip(keys, values):
    h[k].append(v)
print(h)
# defaultdict(<class 'list'>, {1: [1, 8], 2: [2, 4], 3: [3, 5], 4: [6], 5: [7]})
print(dict(h))
# {1: [1, 8], 2: [2, 4], 3: [3, 5], 4: [6], 5: [7]}
This is the only one-liner I could do.
dictionary = {k: [values[i] for i in [j for j, x in enumerate(keys) if x == k]] for k in set(keys)}
It is far from readable. Remember that clear code is always better than pseudo-clever code ;)
Here is an example that I think is easy to follow logically. Unfortunately, it does not use zip as you would prefer, nor does it avoid iterating, because a task like this has to involve iterating in some form.
# Your data
keys = [1, 2, 3, 2, 3, 4, 5, 1]
values = [1, 2, 3, 4, 5, 6, 7, 8]
# Make result dict
result = {}
for x in range(1, max(keys)+1):
    result[x] = []
# Populate result dict
for index, num in enumerate(keys):
    result[num].append(values[index])
# Print result
print(result)
If you know the range of values in the keys array, you could make this faster by providing the results dictionary as a literal with integer keys and empty list values.
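The range(1, max(keys)+1) step assumes the keys are positive integers. A small variation, sketched here under the assumption that the keys are merely hashable, builds the empty lists from the keys themselves:

```python
keys = [1, 2, 3, 2, 3, 4, 5, 1]
values = [1, 2, 3, 4, 5, 6, 7, 8]

# One empty list per distinct key; dict.fromkeys(keys, []) would NOT work
# here because every key would share the same list object.
result = {k: [] for k in keys}

for k, v in zip(keys, values):
    result[k].append(v)

print(result)
# {1: [1, 8], 2: [2, 4], 3: [3, 5], 4: [6], 5: [7]}
```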

Fastest way to generate dictionary from two lists

I have two lists. For example:
keys = [1, 2, 3, 2, 4, 2, 1]
and
values = [1, 2, 3, 4, 5, 6, 7]
I want to create a dictionary of lists out of them as shown below:
dict = {1: [1, 7], 2: [2, 4, 6], 3: [3], 4: [5]}
What is the fastest way to do it, and where do the efficiency gains come from, both with and without importing an additional module?
Using collections.defaultdict
from collections import defaultdict
d_dict = defaultdict(list)
for k, v in zip(keys, values):
    d_dict[k].append(v)
dict(d_dict)
#{1: [1, 7], 2: [2, 4, 6], 3: [3], 4: [5]}
You can use the following:
d = {}
for k, v in zip(keys, values):
    d.setdefault(k, []).append(v)
print(d)
which outputs:
{1: [1, 7], 2: [2, 4, 6], 3: [3], 4: [5]}
You could use defaultdict for your case. If a key does not exist, defaultdict calls the factory (list() in this case) and, instead of raising KeyError, stores and returns a new empty list, to which the value is then appended:
from collections import defaultdict
ld = defaultdict(list)
for k, v in zip(keys, values):
    ld[k].append(v)
print(ld)
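Which variant is fastest depends on the interpreter and the data, so it is safer to measure than to guess. A rough timing sketch with timeit follows; the function names and the input size are arbitrary choices for illustration, and no particular result is implied.

```python
import timeit
from collections import defaultdict

keys = [1, 2, 3, 2, 4, 2, 1] * 1000
values = list(range(len(keys)))


def with_defaultdict():
    d = defaultdict(list)
    for k, v in zip(keys, values):
        d[k].append(v)
    return dict(d)


def with_setdefault():
    d = {}
    for k, v in zip(keys, values):
        d.setdefault(k, []).append(v)
    return d


# Both variants must build the same mapping before timing them.
assert with_defaultdict() == with_setdefault()

for fn in (with_defaultdict, with_setdefault):
    seconds = timeit.timeit(fn, number=100)
    print(f"{fn.__name__}: {seconds:.4f} s for 100 runs")
```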

How to make dict element value as list in Python

I have a list. Let's say [3,4,2,3,4,2,1,4,5].
I need to create a dictionary from the indexes of the elements.
Here in this case, I need to create a dict as follows:
{
'3':[0,3],
'4':[1,4,7],
'2':[2,5],
'1':[6],
'5':[8]
}
where each value is the list of indexes at which the key occurs in the given list.
I've tried, but I was only able to store the values as integers, not as lists.
Is there any way to do this with just 1 for loop?
The code I've tried:
d=dict()
ll=[1,2,1,2,1,2,3,4,5,5,4,2,4,6,5,6,78,3,2,4,5,7,8,9,4,4,2,2,34,5,6,3]
for i, j in enumerate(ll):
    d[j].append(i)
print(d)
You can use collections.defaultdict with enumerate for an O(n) solution:
from collections import defaultdict
d = defaultdict(list)
A = [3,4,2,3,4,2,1,4,5]
for idx, val in enumerate(A):
    d[val].append(idx)
print(d)
defaultdict(list, {1: [6], 2: [2, 5], 3: [0, 3], 4: [1, 4, 7], 5: [8]})
This will work; the key thing you're looking for is the enumerate() function:
list_to_convert = [3,4,2,3,4,2,1,4,5]
out_dict = {}
for idx, val in enumerate(list_to_convert):
    if val in out_dict:
        out_dict[val].append(idx)
    else:
        out_dict[val] = [idx]
print(out_dict)
Gives:
{3: [0, 3], 4: [1, 4, 7], 2: [2, 5], 1: [6], 5: [8]}
mylist = [3, 4, 2, 3, 4, 2, 1, 4, 5]
d = {}
for index, item in enumerate(mylist):
    d.setdefault(item, []).append(index)
results in
{3: [0, 3], 4: [1, 4, 7], 2: [2, 5], 1: [6], 5: [8]}
Why? Well, we iterate over the list, and for each item, we first make sure that there is a list in the dictionary mapped to by this item. Then we append the respective index to that list. What results is a dictionary which maps each seen item to a list of indexes it was found at.
This solution is similar to jpp's, except for the .setdefault() part, which creates an empty list on every loop iteration, while the defaultdict approach only creates a new list when needed.
Another approach could be a dict subclass which implements __missing__. This is called whenever a key isn't present.
class ListDict(dict):
    def __missing__(self, key):
        l = []
        self[key] = l
        return l
and then just do d[item].append(index). Now, whenever a key is not found, __missing__() is called, which "fixes" the problem. See also "How can I call __missing__ from dict" for this.
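Put together with the list from the question, the subclass approach might look like this (a sketch; the sample data is the list from the question, and the comment on __missing__ is my addition):

```python
class ListDict(dict):
    def __missing__(self, key):
        # Called by dict.__getitem__ when `key` is absent.
        value = []
        self[key] = value
        return value


mylist = [3, 4, 2, 3, 4, 2, 1, 4, 5]
d = ListDict()
for index, item in enumerate(mylist):
    d[item].append(index)

print(d)
# {3: [0, 3], 4: [1, 4, 7], 2: [2, 5], 1: [6], 5: [8]}
```

Note that __missing__ is only consulted by d[key]; d.get(key) still returns None for a missing key.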
You can use a set:
d = [3,4,2,3,4,2,1,4,5]
new_d = {i:[c for c, a in enumerate(d) if i == a] for i in set(d)}
Output:
{1: [6], 2: [2, 5], 3: [0, 3], 4: [1, 4, 7], 5: [8]}

Find intersection of dictionary values which are lists

I have a list of dictionaries that share the same keys but have different values:
[{1:[1,2,3,4,5], 2:[6,7,8], 3:[1,3,5,7,9]},
{1:[2,3,4], 2:[6,7], 3:[1,3,5]},
...]
I would like to get the intersection of the values for each key, as a dictionary like this:
{1:[2,3,4], 2:[6,7], 3:[1,3,5]}
Give this a try?
dicts = [{1:[1,2,3,4,5], 2:[6,7,8], 3:[1,3,5,7,9]},{1:[2,3,4], 2:[6,7], 3:[1,3,5]}]
result = { k: set(dicts[0][k]).intersection(*(d[k] for d in dicts[1:])) for k in dicts[0].keys() }
print(result)
# Output:
# {1: {2, 3, 4}, 2: {6, 7}, 3: {1, 3, 5}}
If you want lists instead of sets as the output value type, just throw a list(...) around the set intersection.
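For example, a sketch of the list-valued variant. It uses sorted(...) rather than a bare list(...) because sets are unordered, which assumes the elements can be compared with each other:

```python
dicts = [
    {1: [1, 2, 3, 4, 5], 2: [6, 7, 8], 3: [1, 3, 5, 7, 9]},
    {1: [2, 3, 4], 2: [6, 7], 3: [1, 3, 5]},
]

result = {
    k: sorted(set(dicts[0][k]).intersection(*(d[k] for d in dicts[1:])))
    for k in dicts[0]
}
print(result)
# {1: [2, 3, 4], 2: [6, 7], 3: [1, 3, 5]}
```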
For a list of dictionaries, reduce the whole list as this:
>>> from functools import reduce
>>> d = [{1:[1,2,3,4,5], 2:[6,7,8], 3:[1,3,5,7,9]},{1:[2,3,4], 2:[6,7], 3:[1,3,5]}]
>>> reduce(lambda x, y: {k: sorted(list(set(x[k])&set(y[k]))) for k in x.keys()}, d)
{1: [2, 3, 4], 2: [6, 7], 3: [1, 3, 5]}
I would probably do something along these lines:
# `dictionaries` is the list of dicts from the question.
# Take the first dict and convert the values to `set`.
output = {k: set(v) for k, v in dictionaries[0].items()}
# For the rest of the dicts, update the set at a given key by intersecting it with each of the other lists that have the same key.
for d in dictionaries[1:]:
    for k, v in output.items():
        output[k] = v.intersection(d[k])
There are different variations on this same theme, but I find this one to be about as simple to read as it gets (and since code is read more often than it is written, I consider that a win :-)
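Use dict.viewkeys and dict.viewitems (Python 2 only; in Python 3, dict.keys() and dict.items() return the same kind of set-like view). Note that the and operator does not intersect two views: it returns the second operand whenever the first is non-empty, so the expression below simply yields the keys and values of a[1].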
In [103]: dict.viewkeys?
Docstring: D.viewkeys() -> a set-like object providing a view on D's keys
dict.viewitems?
Docstring: D.viewitems() -> a set-like object providing a view on D's items
a = [{1: [1, 2, 3, 4, 5], 2: [6, 7, 8], 3: [1, 3, 5, 7, 9]},
{1: [2, 3, 4], 2: [6, 7], 3: [1, 3, 5]}]
In [100]: dict(zip(a[0].viewkeys() and a[1].viewkeys(), a[0].viewvalues() and a[1].viewvalues()))
Out[100]: {1: [2, 3, 4], 2: [6, 7], 3: [1, 3, 5]}
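In Python 3, dict.keys() already returns a set-like view, so the & operator can be applied to it directly. A sketch of how that could be used, assuming the goal is to intersect both the keys and the value lists (common_keys and result are illustrative names):

```python
a = [{1: [1, 2, 3, 4, 5], 2: [6, 7, 8], 3: [1, 3, 5, 7, 9]},
     {1: [2, 3, 4], 2: [6, 7], 3: [1, 3, 5]}]

# `&` on key views gives the keys present in both dicts.
common_keys = a[0].keys() & a[1].keys()

# The values are lists (unhashable), so they are intersected per key via sets.
result = {k: sorted(set(a[0][k]) & set(a[1][k])) for k in sorted(common_keys)}
print(result)
# {1: [2, 3, 4], 2: [6, 7], 3: [1, 3, 5]}
```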
