Python - Splitting dictionary into dictionaries with the same values?

Say I have a dictionary with many items that have the same values; for example:
dict = {'hello':'a', 'goodbye':'z', 'bonjour':'a', 'au revoir':'z', 'how are you':'m'}
How would I split the dictionary into dictionaries (in this case, three dictionaries) with the same values? In the example, I want to end up with this:
dict1 = {'hello':'a', 'bonjour':'a'}
dict2 = {'goodbye':'z', 'au revoir':'z'}
dict3 = {'how are you':'m'}

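You can use itertools.groupby to collect the items by their common values, then create a dict for each group within a list comprehension. Because groupby only groups adjacent items, the items are sorted by value first (d is the input dictionary).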
>>> from itertools import groupby
>>> import operator
>>> by_value = operator.itemgetter(1)
>>> [dict(g) for k, g in groupby(sorted(d.items(), key = by_value), by_value)]
[{'hello': 'a', 'bonjour': 'a'},
{'how are you': 'm'},
{'goodbye': 'z', 'au revoir': 'z'}]

Another way without importing any modules is as follows:
def split_dict(d):
    unique_vals = list(set(d.values()))
    split_dicts = []
    for i in range(len(unique_vals)):
        unique_dict = {}
        for key in d:
            if d[key] == unique_vals[i]:
                unique_dict[key] = d[key]
        split_dicts.append(unique_dict)
    return split_dicts
For each unique value in the input dictionary, we create a dictionary and add the key-value pairs from the input dictionary whose value equals that unique value. We then append each dictionary to a list, which is finally returned.
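The nested loop above rescans the whole dictionary once per unique value. As a rough alternative, the same grouping can be sketched in a single pass with collections.defaultdict. The function name split_by_value and the variable words are made up for this illustration, and the sketch assumes the values are hashable (the set-based version needs that too).

```python
from collections import defaultdict


def split_by_value(d):
    """Group the items of d into one dict per distinct value, in a single pass."""
    groups = defaultdict(dict)  # value -> dict of the items carrying that value
    for key, value in d.items():
        groups[value][key] = value
    return list(groups.values())


# Hypothetical input, mirroring the example from the question
words = {'hello': 'a', 'goodbye': 'z', 'bonjour': 'a', 'au revoir': 'z', 'how are you': 'm'}
result = split_by_value(words)
print(result)
```

On Python 3.7+, where dicts keep insertion order, the groups come out in the order in which each value is first seen.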

Related

Dedupe a list of dictionaries by values of a specified key

I have a list of dictionaries, and each dictionary can have any key-value pairs. Given a key, I want to remove all the dictionaries from the list that share a value for that key, except the last one with that value. If a dictionary does not have the specified key, it should be kept in the final result.
In summary - dedupe the items of a list of dictionaries by the value of a specified key.
Example :
list_dicts = [{'a':'apple','b':'ball'}, {'a':'apple','c':'cat'}, {'c':'cheat','d':'dog'}, {'a':'amazon','c':'cheat'}]
The function dedupe(list_dicts, key='a')
should return: [{'a':'apple','c':'cat'}, {'c':'cheat','d':'dog'}, {'a':'amazon','c':'cheat'}]
I have working code but I somehow feel there would be a much shorter and smarter way to do this.
def dedupe_dicts_from_dict_list(dict_list, dedupe_key):
    result = list()
    temp = dict()
    for dict_ in dict_list:
        if dedupe_key in dict_:
            temp[dict_[dedupe_key]] = dict_
        else:
            result.append(dict_)
    result.extend(temp.values())
    return result
Thanks in advance.
from collections import defaultdict

def dedupe(list_of_dicts, dedupe_key):
    temp = defaultdict(list)
    final_list_of_dict = []
    for d in list_of_dicts:
        if dedupe_key in d:
            # collects the dictionaries based on dedupe_key and the value present in that key
            temp[(dedupe_key, d[dedupe_key])].append(d)
        else:
            # retains the dictionaries in which dedupe_key is not found
            final_list_of_dict.append(d)
    for _, value in temp.items():
        final_list_of_dict.append(value[-1])
    print(final_list_of_dict)

dedupe(list_dicts, dedupe_key="a")
Output:
[{'c': 'cheat', 'd': 'dog'}, {'a': 'apple', 'c': 'cat'}, {'a': 'amazon', 'c': 'cheat'}]
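Both versions above move the dictionaries that lack the key to the front of the result. If the original relative order matters, as in the expected output in the question, one possible sketch is to record the index of the last dictionary for each value and then filter the list in place. The name dedupe_keep_last is invented for this example, and the sketch assumes the values stored under the key are hashable.

```python
def dedupe_keep_last(dict_list, dedupe_key):
    """Keep only the last dict per distinct value of dedupe_key; keep dicts lacking the key."""
    # Later entries overwrite earlier ones, so each value maps to its last index.
    last_index = {d[dedupe_key]: i for i, d in enumerate(dict_list) if dedupe_key in d}
    keep = set(last_index.values())
    return [d for i, d in enumerate(dict_list) if dedupe_key not in d or i in keep]


list_dicts = [{'a': 'apple', 'b': 'ball'}, {'a': 'apple', 'c': 'cat'},
              {'c': 'cheat', 'd': 'dog'}, {'a': 'amazon', 'c': 'cheat'}]
deduped = dedupe_keep_last(list_dicts, 'a')
print(deduped)
```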

Finding a key-value pair present only in the first dictionary

Two dictionaries:
dict1 = {'firstvalue':1, 'secondvalue':2, 'fourthvalue':4}
dict2 = {'firstvalue':1, 'thirdvalue':3, 'fourthvalue':5}
I get set(['secondvalue']) as a result upon doing:
dict1.viewkeys() - dict2
I need {'secondvalue':2} as a result.
When I use set and then apply the - operation, it does not give the desired result, as the result also contains {'fourthvalue':4}.
How could I do it?
The problem with - is that (in this context) it is an operation of dict_keys and thus the results will have no values. Using - with viewitems() does not work, either, as those are tuples, i.e. will compare both keys and values.
Instead, you can use a conditional dictionary comprehension, keeping only those keys that do not appear in the second dictionary. Unlike Counter, this also works in the more general case where the values are not integers; and with integer values, it only checks whether a key is present, irrespective of the value associated with it.
>>> dict1 = {'firstvalue':1, 'secondvalue':2, 'fourthvalue':4}
>>> dict2 = {'firstvalue':1, 'thirdvalue':3, 'fourthvalue':5}
>>> {k: v for k, v in dict1.items() if k not in dict2}
{'secondvalue': 2}
If I understand correctly, and taking the title "Finding a key-value pair present only in the first dictionary" literally, you could build a set of (key, value) tuples from each dictionary, subtract the second set from the first, and construct a dictionary from the result:
dict(set(dict1.items()) - set(dict2.items()))
# {'fourthvalue': 4, 'secondvalue': 2}
Another simple variation with set difference:
res = {k: dict1[k] for k in dict1.keys() - dict2.keys()}
Python 2.x:
dict1 = {'firstvalue':1, 'secondvalue':2, 'fourthvalue':4}
dict2 = {'firstvalue':1, 'thirdvalue':3, 'fourthvalue':5}
keys = dict1.viewkeys() - dict2.viewkeys()
print ({key:dict1[key] for key in keys})
output:
{'secondvalue': 2}
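To make the difference between the two readings of the question concrete, here is a small Python 3 sketch that computes both results side by side. The names only_keys and only_items are chosen for this illustration, and the items-based variant assumes that all values are hashable.

```python
dict1 = {'firstvalue': 1, 'secondvalue': 2, 'fourthvalue': 4}
dict2 = {'firstvalue': 1, 'thirdvalue': 3, 'fourthvalue': 5}

# Difference on keys: pairs whose key is missing from dict2
only_keys = {k: dict1[k] for k in dict1.keys() - dict2.keys()}

# Difference on (key, value) pairs: also keeps keys whose value differs in dict2
only_items = dict(dict1.items() - dict2.items())

print(only_keys == {'secondvalue': 2})                      # True
print(only_items == {'secondvalue': 2, 'fourthvalue': 4})   # True
```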

Comparing a list to each value of dictionary (which are a list of strings)

I have a list of drugs that I want to compare to a dictionary, where the dictionary keys are drug codes and the dictionary values are lists of drugs. I'd like to only retain the drugs within the dictionary that correspond to the list of drugs.
Example list:
l = ['sodium', 'nitrogen', 'phosphorus']
And dictionary:
d = {'A02A4': ['sodium', 'nitrogen', 'carbon']}
I would want my final dictionary to look like:
{'A02A4': ['nitrogen', 'sodium']}
with the value that is not present in the list removed, and to do this for all key, value pairs in the dictionary
You could use a dictionary comprehension and sets to keep only the values that intersect with the list:
l = ['sodium', 'nitrogen', 'phosphorus']
d = {'A02A4': ['sodium', 'nitrogen', 'carbon']}
{i: list(set(v) & set(l)) for i,v in d.items()}
{'A02A4': ['nitrogen', 'sodium']}
Or equivalently, using intersection:
{i: list(set(v).intersection(l)) for i,v in d.items()}
{'A02A4': ['nitrogen', 'sodium']}
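Sets do not keep the order of the original lists and drop duplicates. If the order of each drug list should be preserved, a list comprehension with a set used only for membership tests is one possible sketch. The names allowed and filtered are introduced here for illustration.

```python
l = ['sodium', 'nitrogen', 'phosphorus']
d = {'A02A4': ['sodium', 'nitrogen', 'carbon']}

allowed = set(l)  # set gives fast membership checks
# Keep each drug that appears in the list, in its original order
filtered = {code: [drug for drug in drugs if drug in allowed]
            for code, drugs in d.items()}
print(filtered)  # {'A02A4': ['sodium', 'nitrogen']}
```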

Use list items as keys for nested dictionary

I want to use the items in a list as dictionary keys to find a value in nested dictionaries.
For example, given the following list:
keys = ['first', 'second', 'third']
I want to do:
result = dictionary[keys[0]][keys[1]][keys[2]]
Which would be the equivalent of:
result = dictionary['first']['second']['third']
But I don't know how many items will be in the keys list beforehand (except that it will always have at least 1).
Iteratively go into the subdictionaries.
result = dictionary
for key in keys:
    result = result[key]
print(result)
A simple for-loop will work:
result = dictionary[keys[0]]  # Access the first level (which is always there)
for k in keys[1:]:  # Step down through any remaining levels
    result = result[k]
Demo:
>>> dictionary = {'first': {'second': {'third': 123}}}
>>> keys = ['first', 'second', 'third']
>>> result = dictionary[keys[0]]
>>> for k in keys[1:]:
...     result = result[k]
...
>>> result
123
>>>
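The same walk through the nested dictionaries can also be written as a fold. As a small sketch, functools.reduce with operator.getitem starts from the outer dictionary and applies one lookup per key:

```python
from functools import reduce
from operator import getitem

dictionary = {'first': {'second': {'third': 123}}}
keys = ['first', 'second', 'third']

# reduce(getitem, keys, dictionary) evaluates dictionary['first']['second']['third']
result = reduce(getitem, keys, dictionary)
print(result)  # 123
```

As with the loops, a missing key at any level raises KeyError.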

How to implement associative array (not dictionary) in Python?

I am trying to print out a dictionary in Python:
Dictionary = {"Forename":"Paul","Surname":"Dinh"}
for Key,Value in Dictionary.iteritems():
    print Key,"=",Value
Although the item "Forename" is listed first, dictionaries in Python seem to be sorted by values, so the result looks like this:
Surname = Dinh
Forename = Paul
How can I print these in the same order as in the code, or in the order in which the items were added (sorted neither by values nor by keys)?
You can use a list of tuples (or list of lists). Like this:
Arr= [("Forename","Paul"),("Surname","Dinh")]
for Key,Value in Arr:
    print Key,"=",Value
Forename = Paul
Surname = Dinh
You can make a dictionary out of this with:
Dictionary=dict(Arr)
And the correctly sorted keys like this:
keys = [k for k,v in Arr]
Then do this:
for k in keys: print k,Dictionary[k]
but I agree with the comments on your question: Would it not be easy to sort the keys in the required order when looping instead?
EDIT: (thank you Rik Poggi), OrderedDict does this for you:
import collections
od=collections.OrderedDict(Arr)
for k in od: print k,od[k]
First of all, dictionaries are not sorted at all, neither by key nor by value.
Based on your description, you actually need collections.OrderedDict:
from collections import OrderedDict
my_dict = OrderedDict([("Forename", "Paul"), ("Surname", "Dinh")])
for key, value in my_dict.iteritems():
    print '%s = %s' % (key, value)
Note that you need to instantiate the OrderedDict from a list of tuples, not from another dict, because a dict instance will shuffle the order of its items before the OrderedDict is instantiated.
You can use collections.OrderedDict. It is available in Python 2.7 and Python 3.1+.
This may meet your need better:
Dictionary = {"Forename":"Paul","Surname":"Dinh"}
KeyList = ["Forename", "Surname"]
for Key in KeyList:
    print Key,"=",Dictionary[Key]
'but dictionaries in Python are sorted by values': maybe I'm mistaken here, but what gave you that idea? Dictionaries are not sorted by anything.
You have two solutions: either keep a list of keys in addition to the dictionary, or use a different data structure such as an array of arrays.
I wonder if it is an ordered dict that you want:
>>> k = "one two three four five".strip().split()
>>> v = "a b c d e".strip().split()
>>> k
['one', 'two', 'three', 'four', 'five']
>>> v
['a', 'b', 'c', 'd', 'e']
>>> dx = dict(zip(k, v))
>>> dx
{'four': 'd', 'three': 'c', 'five': 'e', 'two': 'b', 'one': 'a'}
>>> for itm in dx:
...     print(itm)
...
four
three
five
two
one
>>> # instantiate this data structure from the OrderedDict class in the collections module
>>> from collections import OrderedDict
>>> dx = OrderedDict(zip(k, v))
>>> for itm in dx:
...     print(itm)
...
one
two
three
four
five
A dictionary created using OrderedDict preserves the original insertion order.
Put another way, such a dictionary iterates over the key/value pairs according to the order in which they were inserted.
So, for instance, when you delete a key and then add the same key again, the iteration order changes:
>>> del dx['two']
>>> for itm in dx:
...     print(itm)
...
one
three
four
five
>>> dx['two'] = 'b'
>>> for itm in dx:
...     print(itm)
...
one
three
four
five
two
As of Python 3.7, regular dicts are guaranteed to be ordered, so you can just do
Dictionary = {"Forename":"Paul","Surname":"Dinh"}
for Key,Value in Dictionary.items():
    print(Key,"=",Value)
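As a quick check of that guarantee, the following sketch (assuming Python 3.7 or later) repeats the delete-and-reinsert experiment with a plain dict. It also shows one difference that remains: equality between two OrderedDict objects is order-sensitive, while equality between plain dicts is not. The variable names are chosen for this illustration.

```python
from collections import OrderedDict

dx = dict(zip("one two three four five".split(), "a b c d e".split()))
order_before = list(dx)   # keys in insertion order

del dx['two']
dx['two'] = 'b'           # a re-inserted key moves to the end
order_after = list(dx)

print(order_before)  # ['one', 'two', 'three', 'four', 'five']
print(order_after)   # ['one', 'three', 'four', 'five', 'two']

# Plain dicts ignore order when compared; OrderedDicts do not
plain_equal = {'a': 1, 'b': 2} == {'b': 2, 'a': 1}
ordered_equal = OrderedDict([('a', 1), ('b', 2)]) == OrderedDict([('b', 2), ('a', 1)])
print(plain_equal, ordered_equal)  # True False
```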
