Sorting identical keys in a dictionary [duplicate] - python

I just started learning Python and was wondering if it's possible to pair two identical keys.
example:
my_dict= {1:a,3:b,1:c}
and I want to make a new dict, something like this:
new_dict= {1:{a,c},3:{b}}
thanks for the help

You can't have identical keys; unique keys are part of the definition of a dictionary. As soon as you assign to the same key again, the previous value is overwritten.

You cannot have repeated dictionary keys in python.
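Since a dict literal cannot hold the key 1 twice, one way to get the grouping the question asks for is to start from a list of (key, value) pairs instead. The sketch below assumes that input shape; `pairs` and `group_pairs` are hypothetical names, and the values are written as strings because a, b and c are undefined in the question.

```python
from collections import defaultdict

# Assumed input: the question's data rewritten as (key, value) pairs,
# because a dict literal would silently drop one of the entries for key 1.
pairs = [(1, 'a'), (3, 'b'), (1, 'c')]

def group_pairs(pairs):
    """Collect every value seen for a key into a set."""
    grouped = defaultdict(set)
    for key, value in pairs:
        grouped[key].add(value)
    return dict(grouped)  # convert back to a plain dict

new_dict = group_pairs(pairs)
print(new_dict)
```

The result maps 1 to a set holding 'a' and 'c', and 3 to a set holding 'b'.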

From what I understand, you're trying to combine two dictionaries. Given that you cannot have the same key twice in one dictionary, I'll suppose you have two distinct dictionaries that you'd like to combine.
Ex:
dic1 = {'a': 1, 'b': 2, 'c': 3}
dic2 = {'c': 4, 'd': 5, 'e': 6}
And the combination would produce:
{'a': {1}, 'b': {2}, 'c': {3, 4}, 'd': {5}, 'e': {6}}
You could use this example:
from itertools import chain
from collections import defaultdict
dic1 = {'a': 1, 'b': 2, 'c': 3}
dic2 = {'c': 4, 'd': 5, 'e': 6}
dic = defaultdict(set)
# Walk over the items of both dictionaries and collect the values per key.
for k, v in chain(dic1.items(), dic2.items()):
    dic[k].add(v)
print(dict(dic))
Something to note:
'dic' is not exactly a dict(), it's of type 'defaultdict', but you can do dict(dic) to make it so.
Instead of set, you could pass list as the defaultdict() argument and get a dictionary like this:
{'a': [1], 'b': [2], 'c': [3, 4], 'd': [5], 'e': [6]}
But to do so, dic[k].append(v) would be used in the for loop. You add items to sets, append to lists.
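As a rough illustration of the list variant described above, here is a minimal sketch using the same two dictionaries. The name `as_lists` is only illustrative.

```python
from collections import defaultdict
from itertools import chain

dic1 = {'a': 1, 'b': 2, 'c': 3}
dic2 = {'c': 4, 'd': 5, 'e': 6}

# defaultdict(list) creates an empty list the first time a key is seen.
as_lists = defaultdict(list)
for k, v in chain(dic1.items(), dic2.items()):
    as_lists[k].append(v)  # lists use append(), sets use add()

as_lists = dict(as_lists)  # plain dict for printing or returning
print(as_lists)
```

One difference worth keeping in mind: a list keeps repeated values (two dictionaries that both map 'c' to 3 would give [3, 3]), whereas a set would collapse them to {3}.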

Dictionaries in Python, like any hash-table-based mapping, do not allow duplicate keys. If the same key appears more than once in a dictionary literal, the value given last is the one that is preserved. This behavior can be observed first hand:
>>> my_dict= {1:'a',3:'b',1:'c'}
>>> my_dict
{1: 'c', 3: 'b'}
>>> my_dict= {1:'c',3:'b',1:'a'}
>>> my_dict
{1: 'a', 3: 'b'}
>>>
The Python documentation for dict()s touches on this matter:
It is best to think of a dictionary as an unordered set of key: value pairs, with the requirement that the keys are unique (within one dictionary).[...].
(emphasis mine)

Related

Remove the duplicate from list of dict in python

I have a list of dicts like this:
d=[{'a':1,'b':2},{'a':2,'b':2},{'a':3},{'a':1}]
and I need this:
d=[{'a':1,'b':2},{'a':2,'b':2},{'a':3}]
That is, the duplicate should be removed.
Learn more about Python data types and their functions. Here is one tutorial for dictionaries.
As far as I know, using a list of dictionaries is bad practice.
Here is my solution. It may not be very elegant, but it works. Also, I take "remove duplicates" to mean removing ALL of them, so I remove every list element that repeats a key-value pair seen earlier.
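d = [{'a': 1, 'b': 2}, {'a': 2, 'b': 2}, {'a': 3}, {'a': 1}]
temp = []
# Iterate over a copy: removing from a list while looping over it skips elements.
for i in list(d):
    for x, y in i.items():
        temp_var = [x, y]
        if temp_var in temp:
            d.remove(i)
            break  # this dict is gone; no need to check its remaining pairs
        else:
            temp.append(temp_var)
print(d)
# result [{'a': 1, 'b': 2}, {'a': 3}]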
P.S.: Keep learning and have a nice day :)
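The question's expected output keeps {'a': 2, 'b': 2} and drops only {'a': 1}, which differs from the result above. One possible reading, and this is only an assumption, is that a dict should be dropped when all of its key-value pairs already appear in a dict that was kept earlier. A minimal sketch under that assumption (`drop_subsumed` is a hypothetical name):

```python
def drop_subsumed(dicts):
    """Keep a dict unless all of its items already occur in an earlier kept dict."""
    kept = []
    for candidate in dicts:
        # dict.items() views are set-like, so <= tests "is a subset of".
        if not any(candidate.items() <= earlier.items() for earlier in kept):
            kept.append(candidate)
    return kept

d = [{'a': 1, 'b': 2}, {'a': 2, 'b': 2}, {'a': 3}, {'a': 1}]
cleaned = drop_subsumed(d)
print(cleaned)
```

This builds a new list instead of modifying d in place, and it only compares against dicts that come earlier in the list.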

Rearranging levels of a nested dictionary in python

Is there a library that would help me rearrange the levels of a nested dictionary?
Eg: From this:
{1:{"A":"i","B":"ii","C":"i"},2:{"B":"i","C":"ii"},3:{"A":"iii"}}
To this:
{"A":{1:"i",3:"iii"},"B":{1:"ii",2:"i"},"C":{1:"i",2:"ii"}}
i.e. the first two levels of a three-level dictionary are swapped. So instead of 1 mapping to A and 3 mapping to A, we have A mapping to 1 and 3.
The solution should be practical for an arbitrary depth and move from one level to any other within.
>>> d = {1:{"A":"i","B":"ii","C":"i"},2:{"B":"i","C":"ii"},3:{"A":"iii"}}
>>> keys = ['A','B','C']
>>> e = {key:{k:d[k][key] for k in d if key in d[k]} for key in keys}
>>> e
{'C': {1: 'i', 2: 'ii'}, 'B': {1: 'ii', 2: 'i'}, 'A': {1: 'i', 3: 'iii'}}
thank god for dict comprehension
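The comprehension above needs the list of inner keys up front. If those keys are not known in advance, the same two-level swap can be sketched with a plain loop and setdefault. `swap_levels` is a hypothetical helper name, and the sketch only handles the first two levels.

```python
def swap_levels(nested):
    """Swap the outer and inner keys of a two-level dictionary."""
    swapped = {}
    for outer_key, inner in nested.items():
        for inner_key, value in inner.items():
            # Create the new outer entry on first sight, then fill it in.
            swapped.setdefault(inner_key, {})[outer_key] = value
    return swapped

d = {1: {"A": "i", "B": "ii", "C": "i"}, 2: {"B": "i", "C": "ii"}, 3: {"A": "iii"}}
e = swap_levels(d)
print(e)
```

Missing combinations (such as key 2 under "A") are simply absent from the result.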
One way to think about this would be to consider your data as a (named) array and to take the transpose. An easy way to achieve this would be to use the data analysis package Pandas:
import pandas as pd
df = pd.DataFrame({1: {"A":"i","B":"ii","C":"i"},
                   2: {"B":"i","C":"ii"},
                   3: {"A":"iii"}})
df.transpose().to_dict()
{'A': {1: 'i', 2: nan, 3: 'iii'},
 'B': {1: 'ii', 2: 'i', 3: nan},
 'C': {1: 'i', 2: 'ii', 3: nan}}
I don't really care about performance for my application of this, so I haven't bothered checking how efficient it is. It's based on bubble sort, so my guess is ~O(N^2).
Maybe this is convoluted, but essentially the code below works as follows:
- dict_swap_index takes a nested dictionary and a list. The list should have the format [i,j,k], and its length should equal the depth of the dictionary. Each element gives the position that the corresponding level should move to, e.g. [2,0,1] means move level 0 to position 2, level 1 to position 0 and level 2 to position 1.
- This function performs a bubble sort on the order list and dict_, calling deep_swap to swap the dictionary levels that correspond to the entries being swapped in the order list.
- deep_swap recursively calls itself to find the given level and returns a dictionary which has been re-ordered.
- swap_two_level_dict is called to swap the top two levels of a dictionary.
Essentially the idea is to perform a bubble sort on the dictionary, but instead of swapping elements in a list, it swaps levels in a dictionary.
from collections import defaultdict
from copy import deepcopy  # needed by deep_swap

def dict_swap_index(dict_, order):
    # Bubble sort on order; every swap in order is mirrored by a level swap in dict_.
    # Note that order is sorted in place.
    for pas_no in range(len(order)-1,0,-1):
        for i in range(pas_no):
            if order[i] > order[i+1]:
                temp = order[i]
                order[i] = order[i+1]
                order[i+1] = temp
                dict_ = deep_swap(dict_, i)
    return dict_, order

def deep_swap(dict_, level):
    # Swap levels `level` and `level + 1`, working on a copy of the input.
    dict_ = deepcopy(dict_)
    if level==0:
        dict_ = swap_two_level_dict(dict_)
    else:
        for key in dict_:
            dict_[key] = deep_swap(dict_[key], level-1)
    return dict_

def swap_two_level_dict(a):
    # Swap the outer and inner keys of a two-level dictionary.
    b = defaultdict(dict)
    for key1, value1 in a.items():
        for key2, value2 in value1.items():
            b[key2].update({key1: value2})
    return b
e.g.
test_dict = {'a': {'c': {'e':0, 'f':1}, 'd': {'e':2,'f':3}}, 'b': {'c': {'g':4,'h':5}, 'd': {'j':6,'k':7}}}
result = dict_swap_index(test_dict, [2,0,1])
result
(defaultdict(dict,
             {'c': defaultdict(dict,
                               {'e': {'a': 0},
                                'f': {'a': 1},
                                'g': {'b': 4},
                                'h': {'b': 5}}),
              'd': defaultdict(dict,
                               {'e': {'a': 2},
                                'f': {'a': 3},
                                'j': {'b': 6},
                                'k': {'b': 7}})}),
 [0, 1, 2])

Pythonic way of finding duplicate maps in a list while ignoring certain keys, and combining the duplicate maps to make a new list

I want to write code which takes the following inputs:
list (list of maps)
request_keys (list of strings)
operation (add, subtract, multiply, concat)
The code would look through the list for maps that have the same value for all keys except the keys given in request_keys. Upon finding two maps whose values for those remaining keys match, the code would apply the operation (add, multiply, subtract, concat) to the two maps and combine them into one map. This combined map would replace the other two maps.
I have written the following piece of code to do this. The code only does the add operation. It can be extended to support the other operations.
In [83]: list
Out[83]:
[{'a': 2, 'b': 3, 'c': 10},
{'a': 2, 'b': 3, 'c': 3},
{'a': 2, 'b': 4, 'c': 4},
{'a': 2, 'b': 3, 'c': 2},
{'a': 2, 'b': 3, 'c': 3}]
In [84]: %cpaste
Pasting code; enter '--' alone on the line to stop or use Ctrl-D.
:def func(list,request_keys):
:    new_list = []
:    found_indexes = []
:    for i in range(0,len(list)):
:        new_item = list[i]
:        if i in found_indexes:
:            continue
:        for j in range(0,len(list)):
:            if i != j and {k: v for k,v in list[i].iteritems() if k not in request_keys} == {k: v for k,v in list[j].iteritems() if k not in request_keys}:
:                found_indexes.append(j)
:                for request_key in request_keys:
:                    new_item[request_key] += list[j][request_key]
:        new_list.append(new_item)
:    return new_list
:--
In [85]: func(list,['c'])
Out[85]: [{'a': 2, 'b': 3, 'c': 18}, {'a': 2, 'b': 4, 'c': 4}]
In [86]:
What I want to know is: is there a faster, more memory-efficient, cleaner and more Pythonic way of doing the same?
Thank you
You manually generate all the combinations and then compare each of those combinations. This is pretty wasteful. Instead, I suggest grouping the dictionaries in another dictionary by their matching keys, then adding the "same" dictionaries. Also, you forgot the operator parameter.
import collections, operator, functools
def func(lst, request_keys, op=operator.add):
    matching_dicts = collections.defaultdict(list)
    for d in lst:
        key = tuple(sorted(((k, d[k]) for k in d if k not in request_keys)))
        matching_dicts[key].append(d)
    for group in matching_dicts.values():
        merged = dict(group[0])
        merged.update({key: functools.reduce(op, (g[key] for g in group))
                       for key in request_keys})
        yield merged
What this does: First, it creates a dictionary that maps the key-value pairs which must be equal for two dictionaries to match onto all the dictionaries that have those key-value pairs. Then it iterates over those groups, using the first dict of each group as a prototype and updating it with the sum (or product, or whatever, depending on the operator) over all the dicts in that group for the request_keys.
Note that this returns a generator. If you want a list, just call it like list(func(...)), or accumulate the merged dicts in a list and return that list.
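The question lists the operations by name (add, subtract, multiply, concat), so a small lookup table from name to function is one way to expose them. The sketch below is a variation on the grouping idea described above, not the answerer's exact code: `OPERATIONS` and `merge_matching` are hypothetical names, and it returns a list rather than a generator.

```python
import functools
import operator
from collections import defaultdict

# Hypothetical name-to-function table mirroring the operations in the question.
OPERATIONS = {
    "add": operator.add,
    "subtract": operator.sub,
    "multiply": operator.mul,
    "concat": operator.concat,  # for sequences such as strings or lists
}

def merge_matching(dicts, request_keys, operation="add"):
    op = OPERATIONS[operation]
    groups = defaultdict(list)
    for d in dicts:
        # Dicts match when all items outside request_keys are equal.
        key = tuple(sorted((k, v) for k, v in d.items() if k not in request_keys))
        groups[key].append(d)
    merged_list = []
    for group in groups.values():
        merged = dict(group[0])  # copy, so the input dicts stay untouched
        for request_key in request_keys:
            merged[request_key] = functools.reduce(op, (g[request_key] for g in group))
        merged_list.append(merged)
    return merged_list

data = [{'a': 2, 'b': 3, 'c': 10},
        {'a': 2, 'b': 3, 'c': 3},
        {'a': 2, 'b': 4, 'c': 4},
        {'a': 2, 'b': 3, 'c': 2},
        {'a': 2, 'b': 3, 'c': 3}]
summed = merge_matching(data, ['c'])
multiplied = merge_matching(data, ['c'], "multiply")
print(summed)
print(multiplied)
```

Grouping makes a single pass over the input instead of comparing every pair of dicts, which is the main saving over the nested loops in the question.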
from itertools import groupby
from operator import itemgetter
def mergeDic(inputData, request_keys):
    keys = inputData[0].keys()
    comparedKeys = [item for item in keys if item not in request_keys]
    # Note: with a single compared key, itemgetter returns a bare value instead
    # of a tuple, so this assumes at least two compared keys.
    grouper = itemgetter(*comparedKeys)
    result = []
    for key, grp in groupby(sorted(inputData, key = grouper), grouper):
        grp = list(grp)  # groupby yields a one-shot iterator; keep it for every request_key
        temp_dict = dict(zip(comparedKeys, key))
        for request_key in request_keys:
            temp_dict[request_key] = sum(item[request_key] for item in grp)
        result.append(temp_dict)
    return result
inputData = [{'a': 2, 'b': 3, 'c': 10},
             {'a': 2, 'b': 3, 'c': 3},
             {'a': 2, 'b': 4, 'c': 4},
             {'a': 2, 'b': 3, 'c': 2},
             {'a': 2, 'b': 3, 'c': 3}]
from pprint import pprint
pprint(mergeDic(inputData,['c']))

Python remove duplicate value in a combined dictionary's list

I need a little bit of homework help. I have to write a function that combines several dictionaries into a new dictionary. If a key appears more than once, the values corresponding to that key in the new dictionary should be a unique list. As an example, this is what I have so far:
f = {'a': 'apple', 'c': 'cat', 'b': 'bat', 'd': 'dog'}
g = {'c': 'car', 'b': 'bat', 'e': 'elephant'}
h = {'b': 'boy', 'd': 'deer'}
r = {'a': 'adam'}
def merge(*d):
    newdicts={}
    for dict in d:
        for k in dict.items():
            if k[0] in newdicts:
                newdicts[k[0]].append(k[1])
            else:
                newdicts[k[0]]=[k[1]]
    return newdicts
combined = merge(f, g, h, r)
print(combined)
The output looks like:
{'a': ['apple', 'adam'], 'c': ['cat', 'car'], 'b': ['bat', 'bat', 'boy'], 'e': ['elephant'], 'd': ['dog', 'deer']}
Under the 'b' key, 'bat' appears twice. How do I remove the duplicates?
I've looked at filter and lambda, but I couldn't figure out how to use them here (maybe because it's a list in a dictionary?).
Any help would be appreciated. And thank you in advance for all your help!
Just test for the element inside the list before adding it:
for k in dict.items():
    if k[0] in newdicts:
        if k[1] not in newdicts[k[0]]: # Do this test before adding.
            newdicts[k[0]].append(k[1])
    else:
        newdicts[k[0]]=[k[1]]
And since you want just unique elements in the value list, you can use a set as the value instead. Also, you can use a defaultdict here, so that you don't have to test for key existence before adding.
Also, don't use built-in names as your variable names. Use some other variable name instead of dict.
So, you can modify your merge method as:
from collections import defaultdict
def merge(*d):
    newdicts = defaultdict(set) # Define a defaultdict
    for each_dict in d:
        # dict.items() yields (k, v) tuples,
        # so you can unpack each tuple directly into two loop variables.
        for k, v in each_dict.items():
            newdicts[k].add(v)
    # And if you want the exact representation that you have shown,
    # you can build a normal dict out of your newly built dict.
    unique = {key: list(value) for key, value in newdicts.items()}
    return unique
>>> import collections
>>> import itertools
>>> uniques = collections.defaultdict(set)
>>> for k, v in itertools.chain(f.items(), g.items(), h.items(), r.items()):
...     uniques[k].add(v)
...
>>> uniques
defaultdict(<type 'set'>, {'a': set(['apple', 'adam']), 'c': set(['car', 'cat']), 'b': set(['boy', 'bat']), 'e': set(['elephant']), 'd': set(['deer', 'dog'])})
Note the results are in a set, not a list -- far more computationally efficient this way. If you would like the final form to be lists then you can do the following:
>>> {x: list(y) for x, y in uniques.items()}
{'a': ['apple', 'adam'], 'c': ['car', 'cat'], 'b': ['boy', 'bat'], 'e': ['elephant'], 'd': ['deer', 'dog']}
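Sets do not remember the order in which values were added, so the lists built from them can come out in any order. If the order of first appearance matters, one option is to use a dict as an ordered set, relying on the insertion-order guarantee of Python 3.7+. A minimal sketch, with `merge_ordered` as a hypothetical name:

```python
def merge_ordered(*dicts):
    """Merge dicts into {key: [unique values in order of first appearance]}."""
    merged = {}
    for each_dict in dicts:
        for k, v in each_dict.items():
            # The inner dict's keys act as an insertion-ordered set of values.
            merged.setdefault(k, {})[v] = None
    return {k: list(values) for k, values in merged.items()}

f = {'a': 'apple', 'c': 'cat', 'b': 'bat', 'd': 'dog'}
g = {'c': 'car', 'b': 'bat', 'e': 'elephant'}
h = {'b': 'boy', 'd': 'deer'}
r = {'a': 'adam'}
combined = merge_ordered(f, g, h, r)
print(combined)
```

Like the set-based versions, this requires the values to be hashable.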
In your for loop add this:
for dict in d:
    for k in dict.items():
        if k[0] in newdicts:
            # This line below
            if k[1] not in newdicts[k[0]]:
                newdicts[k[0]].append(k[1])
        else:
            newdicts[k[0]]=[k[1]]
This makes sure duplicates aren't added
Use set when you want unique elements:
def merge_dicts(*d):
    result={}
    for dict in d:
        for key, value in dict.items():
            result.setdefault(key, set()).add(value)
    return result
Try to avoid using indices; unpack tuples instead.

How to access key in Python dictionary

I know it's possible to traverse a dictionary by using iteritems() or keys(). But what if I have a list of dicts such as l = [{'a': 1}, {'b': 2}, {'c': 3}] and I want to compose a string from its keys, e.g., s = 'a, b, c'?
One solution is to copy all the keys into a list in advance, and compose the string I want. Just wondering if there is a better solution.
You can use itertools.chain() for it and make use of the fact that iterating over a dict will yield its keys.
import itertools
lst = [{'a': 1}, {'b': 2}, {'c': 3}]
print(', '.join(itertools.chain(*lst)))
This also works if there are dicts with more than one element.
If you do not want duplicate elements, use a set:
print(', '.join(set(itertools.chain(*lst))))
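A set removes duplicate keys but does not keep them in their original order. If the order of first appearance matters, dict.fromkeys can play the same role on Python 3.7+, where dicts preserve insertion order. A small sketch with an assumed input that repeats the key 'a':

```python
from itertools import chain

# Assumed example input: the key 'a' appears in two of the dicts.
lst = [{'a': 1}, {'b': 2, 'a': 5}, {'c': 3}]

# Iterating a dict yields its keys; dict.fromkeys keeps the first
# occurrence of each key and drops later repeats.
unique_keys = list(dict.fromkeys(chain.from_iterable(lst)))
s = ', '.join(unique_keys)
print(s)  # a, b, c
```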
Assuming each dictionary will only have a single key:
', '.join(next(iter(a)) for a in l)
if not, maybe something like:
>>> l = [{'a': 1, 'd': 4}, {'b': 2}, {'c': 3}]
>>> ', '.join(', '.join(a.keys()) for a in l)
'a, d, b, c'
Iterate over the dictionaries and then over the keys of each dictionary:
>>> lst = [{'a': 1}, {'b': 2}, {'c': 3}]
>>> ', '.join(key for dct in lst for key in dct.keys())
'a, b, c'
Try this:
''.join([next(iter(i)) for i in l])
You can put any delimiter you want inside of the quotes. Just bear in mind that before Python 3.7 dictionaries were not guaranteed to keep insertion order, so with multi-key dicts you may get weird looking strings as a result of this.
