Python remove duplicate value in a combined dictionary's list


I need a little bit of homework help. I have to write a function that combines several dictionaries into a new dictionary. If a key appears more than once, the values corresponding to that key in the new dictionary should be a unique list. As an example, this is what I have so far:
f = {'a': 'apple', 'c': 'cat', 'b': 'bat', 'd': 'dog'}
g = {'c': 'car', 'b': 'bat', 'e': 'elephant'}
h = {'b': 'boy', 'd': 'deer'}
r = {'a': 'adam'}
def merge(*d):
    newdicts={}
    for dict in d:
        for k in dict.items():
            if k[0] in newdicts:
                newdicts[k[0]].append(k[1])
            else:
                newdicts[k[0]]=[k[1]]
    return newdicts
combined = merge(f, g, h, r)
print(combined)
The output looks like:
{'a': ['apple', 'adam'], 'c': ['cat', 'car'], 'b': ['bat', 'bat', 'boy'], 'e': ['elephant'], 'd': ['dog', 'deer']}
Under the 'b' key, 'bat' appears twice. How do I remove the duplicates?
I've looked at filter and lambda, but I couldn't figure out how to use them here (maybe because it's a list in a dictionary?).
Any help would be appreciated. And thank you in advance for all your help!

Just test whether the element is already in the list before adding it:
for k in dict.items():
    if k[0] in newdicts:
        if k[1] not in newdicts[k[0]]:  # Do this test before adding.
            newdicts[k[0]].append(k[1])
    else:
        newdicts[k[0]]=[k[1]]
And since you want only unique elements in each value list, you can use a set as the value instead. You can also use a defaultdict, so that you don't have to test for key existence before adding.
Also, don't use built-in names for your variables. Use some other name instead of dict.
So, you can modify your merge function as:
from collections import defaultdict
def merge(*d):
    newdicts = defaultdict(set)  # Define a defaultdict
    for each_dict in d:
        # dict.items() returns a list of (k, v) tuple.
        # So, you can directly unpack the tuple in two loop variables.
        for k, v in each_dict.items():
            newdicts[k].add(v)
    # And if you want the exact representation that you have shown
    # You can build a normal dict out of your newly built dict.
    unique = {key: list(value) for key, value in newdicts.items()}
    return unique

>>> import collections
>>> import itertools
>>> uniques = collections.defaultdict(set)
>>> for k, v in itertools.chain(f.items(), g.items(), h.items(), r.items()):
...     uniques[k].add(v)
...
>>> uniques
defaultdict(<type 'set'>, {'a': set(['apple', 'adam']), 'c': set(['car', 'cat']), 'b': set(['boy', 'bat']), 'e': set(['elephant']), 'd': set(['deer', 'dog'])})
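Note that the results are sets, not lists. A set discards duplicates on insertion and tests membership in constant time on average, whereas the list version scans the list on every insert. Sets do not preserve order, though. If you would like the final form to be lists, you can do the following: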
>>> {x: list(y) for x, y in uniques.items()}
{'a': ['apple', 'adam'], 'c': ['car', 'cat'], 'b': ['boy', 'bat'], 'e': ['elephant'], 'd': ['deer', 'dog']}

In your for loop add this:
for dict in d:
    for k in dict.items():
        if k[0] in newdicts:
            # This line below
            if k[1] not in newdicts[k[0]]:
                newdicts[k[0]].append(k[1])
        else:
            newdicts[k[0]]=[k[1]]
This makes sure duplicates aren't added.

Use a set when you want unique elements:
def merge_dicts(*d):
    result = {}
    for each_dict in d:  # avoid shadowing the built-in name dict
        for key, value in each_dict.items():
            result.setdefault(key, set()).add(value)
    return result
Try to avoid using indices; unpack tuples instead.
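The set-based answers drop duplicates but do not keep the values in the order they were first seen. If that order matters, one possible sketch is to use an inner dict as an ordered set. This assumes Python 3.7+, where dicts preserve insertion order; merge_ordered is a hypothetical name, not part of any answer above.

```python
def merge_ordered(*dicts):
    """Merge dicts; each key maps to its unique values in first-seen order."""
    merged = {}
    for each_dict in dicts:
        for key, value in each_dict.items():
            # The inner dict acts as an ordered set: its keys are unique
            # and (in Python 3.7+) keep insertion order.
            merged.setdefault(key, {})[value] = None
    return {key: list(values) for key, values in merged.items()}


f = {'a': 'apple', 'c': 'cat', 'b': 'bat', 'd': 'dog'}
g = {'c': 'car', 'b': 'bat', 'e': 'elephant'}
h = {'b': 'boy', 'd': 'deer'}
r = {'a': 'adam'}

combined = merge_ordered(f, g, h, r)
print(combined)
```

With the sample dictionaries from the question, 'bat' should appear only once under 'b', ahead of 'boy'.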

Related

Python: Create a dictionary where keys have multiple values

The problem that I have is hard to explain, easy to understand:
I have a list of tuples:
L=[('a','111'),('b','222'),('a','333'),('b','444')]
From this list I want to create a dictionary where the keys are the first elements of the tuples ('a' and 'b') and the associated values are collected in a list:
expected output:
{'a':['111','333'],'b':['222','444']}
How can I solve this problem?
d={}
for x in range (len(L)):
    d[L[x][0]]=[L[x][1]]
return d
But as you can easily see, the output won't be complete, since each list will hold only the last value associated with that key in L.

You can use setdefault() to set the key in the dict the first time. Then append your value:
L=[('a','111'),('b','222'),('a','333'),('b','444')]
d = {}
for key, value in L:
    d.setdefault(key, []).append(value)
print(d)
# {'a': ['111', '333'], 'b': ['222', '444']}

You have to append L[x][1] to an existing list, not replace whatever was there with a new singleton list.
d={}
for x in range (len(L)):
    if L[x][0] not in d:
        d[L[x][0]] = []
    d[L[x][0]].append(L[x][1])
return d
A defaultdict makes this easier:
from collections import defaultdict
d = defaultdict(list)
for x in range(len(L)):
    d[L[x][0]].append(L[x][1])
return d
A more idiomatic style of writing this would be to iterate directly over the list and unpack the key and value immediately:
d = defaultdict(list)
for key, value in L:
    d[key].append(value)
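The snippets above use return, so they only run inside a function body. A self-contained sketch of the idiomatic version might look like this; group_pairs is a hypothetical name chosen for illustration.

```python
from collections import defaultdict


def group_pairs(pairs):
    """Group (key, value) pairs into a dict that maps each key to a list of values."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    # Convert to a plain dict so later lookups of missing keys raise KeyError
    # instead of silently creating empty lists.
    return dict(grouped)


L = [('a', '111'), ('b', '222'), ('a', '333'), ('b', '444')]
result = group_pairs(L)
print(result)
```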

You can try this:
L = [('a','111'),('b','222'),('a','333'),('b','444')]
my_dict = {}
for item in L:
    if item[0] not in my_dict:
        my_dict[item[0]] = []
    my_dict[item[0]].append(item[1])
print(my_dict)
Output:
python your_script.py
{'a': ['111', '333'], 'b': ['222', '444']}
As pointed out by @chepner, you can use defaultdict too.
Basically, with defaultdict you don't need to check whether the key is already in your dict.
So it would be:
from collections import defaultdict
L = [('a','111'),('b','222'),('a','333'),('b','444')]
my_dict = defaultdict(list)
for item in L:
    my_dict[item[0]].append(item[1])
print(my_dict)
And the output:
defaultdict(<class 'list'>, {'a': ['111', '333'], 'b': ['222', '444']})
And if you want to get a dict from the defaultdict, you can simply create a new dict from it:
print(dict(my_dict))
And the output will be:
{'a': ['111', '333'], 'b': ['222', '444']}

How to count occurrences of key in list of dictionaries

I know this is a frequently asked question; however, I do not have access to Counter as I'm using v2.6 of Python. I want to count the number of times a specific key appears in a list of dictionaries.
If my data looks like this:
data = [{'a':1, 'b':1}, {'a':1, 'c':1}, {'b':1, 'c':1}, {'a':1, 'c':1}, {'a':1, 'd':1}]
How would I find out how many times "a" appears? I've tried using len, but that only returns the number of values for one key.
len(data['a'])

You can use list comprehension.
data = [{'a':1, 'b':1}, {'a':1, 'c':1}, {'b':1, 'c':1}, {'a':1, 'c':1}, {'a':1, 'd':1}]
sum([1 for d in data if 'a' in d])
Explanation:
For each dictionary in the list data, the comprehension checks whether the key 'a' is present; if it is, a 1 goes into the new list. sum then adds up that list.

You won't have access to collections.Counter, but collections.defaultdict was added in Python 2.5.
First, collect the keys into a flat list:
data = [j for i in data for j in i.keys()]
# ['a', 'b', 'a', 'c', 'c', 'b', 'a', 'c', 'a', 'd']
Then count them with collections.defaultdict:
from collections import defaultdict
dct = defaultdict(int)
for key in data:
    dct[key] += 1
# defaultdict(<type 'int'>, {'a': 4, 'c': 3, 'b': 2, 'd': 1})
If you only need the count for a, there are simpler ways to do this, but this will give you the counts of all keys in your list of dictionaries.
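For the single-key case, one simpler way is to count the dictionaries that contain the key. This is a minimal sketch; count_dicts_with_key is a hypothetical name, and it relies only on generator expressions, which Python 2.6 supports.

```python
def count_dicts_with_key(dict_list, key):
    """Return how many dictionaries in dict_list contain key."""
    # A key can appear at most once per dictionary, so counting the
    # dictionaries that contain it counts its occurrences.
    return sum(1 for d in dict_list if key in d)


data = [{'a': 1, 'b': 1}, {'a': 1, 'c': 1}, {'b': 1, 'c': 1},
        {'a': 1, 'c': 1}, {'a': 1, 'd': 1}]
count_a = count_dicts_with_key(data, 'a')
print(count_a)
```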

A one-line solution could be:
len([k for d in data for k in d.keys() if k == 'a'])

For this you could write the following function that would work for data in the structure you provided (a list of dicts):
def count_key(key,dict_list):
    keys_list = []
    for item in dict_list:
        keys_list += item.keys()
    return keys_list.count(key)
Then, you could invoke the function as follows:
data = [{'a':1, 'b':1}, {'a':1, 'c':1}, {'b':1, 'c':1}, {'a':1, 'c':1}, {'a':1, 'd':1}]
count_a = count_key('a',data)
In this case, count_a will be 4.

This question looks very much like a class assignment. Here is a simple bit of code that will do the job:
n=0
for d in data:
    if 'a' in d:
        n+=1
print(n)
Here n is a counter, and the for loop iterates through the list of dictionaries.
The 'a' in d expression evaluates to True if the key 'a' is in the dictionary d, in which case the counter n is incremented. At the end the result is printed. In Python 2.6, print is a statement, so the parentheses are optional there (I am using 3.6).

How to get multiple max key values in a dictionary?

Let's say I have a dictionary:
data = {'a':1, 'b':2, 'c': 3, 'd': 3}
I want to get the maximum value(s) in the dictionary. So far, I have been just doing:
max(zip(data.values(), data.keys()))[1]
but I'm aware that I could be missing another max value. What would be the most efficient way to approach this?

Based on your example, it seems like you're looking for the key(s) which map to the maximum value. You could use a list comprehension:
[k for k, v in data.items() if v == max(data.values())]
# ['c', 'd']
If you have a large dictionary, break this into two lines to avoid recomputing max for every item:
mx = max(data.values())
[k for k, v in data.items() if v == mx]
In Python 2.x, .items() also works but builds a full list of pairs; use .iteritems() there if you want to avoid that copy.
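The two-line approach can be wrapped in a small function that also copes with an empty dictionary, where max would otherwise raise ValueError. This is a sketch only; keys_with_max_value is a hypothetical name.

```python
def keys_with_max_value(mapping):
    """Return every key whose value equals the largest value in mapping."""
    if not mapping:
        # max() raises ValueError on an empty sequence, so handle that case first.
        return []
    highest = max(mapping.values())  # computed once, not once per item
    return [key for key, value in mapping.items() if value == highest]


data = {'a': 1, 'b': 2, 'c': 3, 'd': 3}
max_keys = keys_with_max_value(data)
print(max_keys)
```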

You could try collecting reverse value -> key pairs in a defaultdict, then output the values with the highest key:
from collections import defaultdict
def get_max_value(data):
    d = defaultdict(list)
    for key, value in data.items():
        d[value].append(key)
    return max(d.items())[1]
Which outputs:
>>> get_max_value({'a':1, 'b':2, 'c': 3, 'd': 3})
['c', 'd']
>>> get_max_value({'a': 10, 'b': 10, 'c': 4, 'd': 5})
['a', 'b']

First of all, find the max value that occurs in the dictionary. If you are trying to create a list of all the max value(s), then try something like this:
data = {'a':1, 'b':2, 'c': 3, 'd': 3}
# max(data) alone would return the largest key, not the largest value
max_value = max(data.values())
list_num_max_value = []
for letter in data:
    if data.get(letter) == max_value:
        list_num_max_value.append(max_value)
print (list_num_max_value)
Please let me know if that's not what you are trying to do and I will guide you through the right process.

Update a dictionary with values from a list in Python

I have a Dictionary here:
dic = {'A':1, 'B':6, 'C':42, 'D':1, 'E':12}
and a list here:
lis = ['C', 'D', 'C', 'C', 'F']
What I'm trying to do (it's also a homework requirement) is to check whether each value in lis matches a key in dic. If so, that key's value is incremented by 1 (for example, there are 3 'C's in lis, so 'C' in the output dic should be 45). If not, a new item is created in dic with its value set to 1.
So the example output should look like this:
dic = {'A':1, 'B':6, 'C':45, 'D':2, 'E':12, 'F':1}
Here's what my code is:
def addToInventory(dic, lis):
    for k,v in dic.items():
        for i in lis:
            if i == k:
                dic[k] += 1
            else:
                dic[i] = 1
        return dic
and execute by this code:
dic = addToInventory(dic,lis)
It runs without error, but the output is strange: it added the missing F to dic but didn't update the values correctly.
dic = {'A':1, 'B':6, 'C':1, 'D':1, 'E':12, 'F':1}
What am I missing here?

There's no need to iterate over a dictionary when it supports random lookup. You can use if x in dict to do this. Furthermore, you'd need your return statement outside the loop.
Try, instead:
def addToInventory(dic, lis):
    for i in lis:
        if i in dic:
            dic[i] += 1
        else:
            dic[i] = 1
    return dic
out = addToInventory(dic, lis)
print(out)
{'A': 1, 'B': 6, 'C': 45, 'D': 2, 'E': 12, 'F': 1}
As Harvey suggested, you can shorten the function a little by making use of dict.get.
def addToInventory(dic, lis):
    for i in lis:
        dic[i] = dic.get(i, 0) + 1
    return dic
The dic.get function takes two parameters: the key, and a default value to be returned if that key is not in the dictionary.
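A short illustration of how dict.get behaves may help here. The inventory dictionary below is made up for the example; the point is that get returns the default without inserting the key.

```python
inventory = {'A': 1}

present = inventory.get('A', 0)  # key exists, so its stored value is returned
missing = inventory.get('Z', 0)  # key is absent, so the default is returned

# get() only reads; it does not add the missing key to the dictionary.
still_absent = 'Z' not in inventory

# Combining get() with assignment gives the "increment or create" pattern.
inventory['Z'] = inventory.get('Z', 0) + 1
print(present, missing, still_absent, inventory)
```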
If your professor allows the use of libraries, you can use the collections.Counter data structure, which is meant precisely for keeping counts.
from collections import Counter
c = Counter(dic)
for i in lis:
    c[i] += 1
print(dict(c))
{'A': 1, 'B': 6, 'C': 45, 'D': 2, 'E': 12, 'F': 1}

How to replace keys (key labels) in a dictionary from a list of tokens

I have a dictionary
dict = {'a': 'cat', 'b':'dog'}
and I want to replace the keys in the dict
with new keys (or key labels) from a list ['c', 'd'] so that I get (the same)
dict = {'c': 'cat', 'd':'dog'}. How can I do this?

You can define the relation between the old keys and their replacements, in another dictionary, like this. Here, mapping is the dictionary which maps the old keys with the new keys.
d, mapping = {'a': 'cat', 'b':'dog'}, {"a":"c", "b":"d"}
print({mapping[k]: v for k, v in d.items()})
Output
{'c': 'cat', 'd': 'dog'}

As has already been pointed out, a dictionary is not ordered (before Python 3.7 no order was guaranteed, and since then a dict only preserves insertion order). So if you want to replace your keys with values from a list (which is ordered), you will need to specify how the keys of your dict are ordered. Something along these lines:
def replaceKeys(d, newKeys, sort):
    return {newKeys[idx]: v for idx, (_, v)
            in enumerate(sorted(d.items(), key=lambda kv: sort(kv[0])))}
d = {'cat': 'gato', 'dog': 'chucho', 'mule': 'mula'}
d2 = replaceKeys(d, ['a', 'b', 'c'], lambda oldKey: oldKey)  # sort alphabetically
print(d2)
d2 = replaceKeys(d, ['a', 'b', 'c'], lambda oldKey: -len(oldKey))  # sort by length, longest first
print(d2)
d2 = replaceKeys(d, ['a', 'b', 'c'], lambda oldKey: oldKey[2])  # sort by third letter
print(d2)
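If insertion order is the order you want, a shorter sketch is possible by zipping the new keys with the existing values. This assumes Python 3.7+, where dicts preserve insertion order; replace_keys_in_order is a hypothetical name, and the length check is an added safeguard rather than part of the question.

```python
def replace_keys_in_order(d, new_keys):
    """Pair new_keys with the values of d, following d's insertion order."""
    if len(new_keys) != len(d):
        # zip() would silently drop the extras, so fail loudly instead.
        raise ValueError("new_keys must have exactly one entry per key in d")
    return dict(zip(new_keys, d.values()))


animals = {'a': 'cat', 'b': 'dog'}
renamed = replace_keys_in_order(animals, ['c', 'd'])
print(renamed)
```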
