I would like to merge or update a dictionary in Python with new entries, but replace the values of entries whose key exists with the smaller of the values associated with the key in the existing entry and the new entry. For example:
Input:
dict_A = {1:14, 2:15, 3:16, 4:17}, dict_B= {2:19, 3:9, 4:11, 5:13}
Expected output:
{1:14, 2:15, 3:9, 4:11, 5:13}
I know this can be done with a loop that iterates through the dictionaries and compares values, but is there a simpler and faster way, or a helpful library, to achieve this?
In this case you could use pandas to avoid writing the loop, though I don't know whether it would be any faster; I didn't test that.
import pandas as pd
df = pd.DataFrame([dict_A, dict_B])
out = df.min().to_dict()
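output: {1: 14.0, 2: 15.0, 3: 9.0, 4: 11.0, 5: 13.0}
Note that the values come back as floats, because keys missing from one dictionary are filled with NaN before the minimum is taken. There are probably other edge cases you'd have to account for.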
Quick One-liner
c = {**a, **b, **{key:min(a[key], b[key]) for key in set(a).intersection(set(b))} }
Explanation
This should be reasonably quick, because the shared keys are found with a set intersection rather than a nested loop.
You can merge dictionaries by using the **dictionary syntax like so: {**a, **b}. The ** simply just "expands" out the dictionary into each individual item, with the last expanded dictionary overwriting any previous ones (so in {**a, **b}, any matching keys in b overwrite the value from a).
The first thing I do is load in all the values in a and b into the new dictionary:
c = {**a, **b, ...
Then I use dictionary comprehension to generate a new dictionary, which only has the smallest value for every set of keys which are in both a and b.
... {key:min(a[key], b[key]) for key in set(a).intersection(set(b))} ...
To get the set of keys that exist in both a and b, I convert both dictionaries to sets (calling set() on a dictionary gives a set of its keys) and use intersection to quickly find all keys which are in both sets.
... set(a).intersection(set(b)) ...
Then I loop through each of the keys in the matching-keys set, and use the dictionary comprehension to generate a new dictionary with the current key and the min of both dictionaries' values for that key.
... {key:min(a[key], b[key]) ...
Then I use the ** syntax to "expand" this new generated dictionary with the expanded a and b, putting it last to make sure it overwrites any values from the two.
Works on the example given (ctrl-cv'd straight from my terminal):
>>> a = {1:14, 2:15, 3:16, 4:17}
>>> b = {2:19, 3:9, 4:11, 5:13}
>>> c = {**a, **b, **{key:min(a[key], b[key]) for key in set(a).intersection(set(b))} }
>>> c
{1: 14, 2: 15, 3: 9, 4: 11, 5: 13}
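The same idea can be wrapped in a small function. This is only a sketch: the name merge_min is made up for illustration, and it uses the keys-view operator & in place of set(a).intersection(set(b)), which should behave the same way on Python 3.

```python
def merge_min(a, b):
    """Merge two dicts; for keys present in both, keep the smaller value."""
    merged = {**a, **b}  # start with every entry; b wins on shared keys for now
    for key in a.keys() & b.keys():  # keys present in both dictionaries
        merged[key] = min(a[key], b[key])
    return merged


a = {1: 14, 2: 15, 3: 16, 4: 17}
b = {2: 19, 3: 9, 4: 11, 5: 13}
c = merge_min(a, b)
print(c)
```

Neither a nor b is modified; the result is a new dictionary.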
Here's something short to do it. Not inherently the fastest or best way to do it, but figured I'd share nonetheless.
max_val = max(max(dict_A.values()), max(dict_B.values())) + 1
keys = set(list(dict_A.keys()) + list(dict_B.keys()))
dict_C = { key : min(dict_A.get(key, max_val), dict_B.get(key, max_val)) for key in keys }
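A variation on the same idea, as a hedged sketch: math.inf can stand in for max_val, so the maximum does not have to be computed first. The function name merge_min_sentinel is hypothetical, and this assumes the values are numbers that can be compared against a float.

```python
import math


def merge_min_sentinel(a, b):
    """Merge two dicts of numbers, keeping the smaller value on shared keys."""
    keys = a.keys() | b.keys()  # union of the keys of both dictionaries
    # A key missing from one dict falls back to infinity, so the value
    # from the other dict is always the minimum.
    return {key: min(a.get(key, math.inf), b.get(key, math.inf)) for key in keys}


dict_A = {1: 14, 2: 15, 3: 16, 4: 17}
dict_B = {2: 19, 3: 9, 4: 11, 5: 13}
dict_C = merge_min_sentinel(dict_A, dict_B)
```

Every key in the union exists in at least one of the inputs, so the sentinel itself never ends up in the result.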
Hello, this is the code I wrote; I hope it's what you needed:
dict_A = {1:14, 2:15, 3:16, 4:17}
dict_B = {2:19, 3:9, 4:11, 5:13}
dict_res = dict(dict_B)  # copy, so that dict_B itself is not modified
for e in dict_A:
    if e in dict_B:
        # key is in both dictionaries: keep the smaller value
        if dict_A[e] > dict_B[e]:
            dict_res[e] = dict_B[e]
        else:
            dict_res[e] = dict_A[e]
    else:
        # key is only in dict_A
        dict_res[e] = dict_A[e]
I have tested it and the output is :
{1: 14, 2: 15, 3: 9, 4: 11, 5: 13}
For example, my dictionary contains this:
a = {1:'a',3:'b',2:'c',4:'d',1:'e',4:'f'}
To print the values sorted by key, I can do this:
b = []
for m in a:
    b.append(m)
b.sort()
for value in b:
    print(a[value])
The problem is: 'd' and 'f' share the same key 4, and only one of them is printed. But I want to print all of them. How can I do that?
A dictionary assigns a value to a key (or, put differently, a key references a specific value). Therefore it is not possible to have the same key twice in a dictionary (check what a looks like after the assignment!).
What you may want to look at is a list of tuples!
a = [(1,'a'),(3,'b'),(2,'c'),(4,'d'),(1,'e'),(4,'f')]
After that, you can do the sorting just by doing this:
a.sort(key=lambda x: x[0])
Duplicate keys are not allowed in a dictionary. Your dict does not contain what you think.
>>> a = {1:'a',3:'b',2:'c',4:'d',1:'e',4:'f'}
>>> a
{1: 'e', 2: 'c', 3: 'b', 4: 'f'}
When initializing your dictionary, the key 4 is defined twice: 4:'d', 4:'f'.
The second value 'f' overwrites the 'd' stored at a[4]. This is by design and therefore, you would need to correct your implementation to take this behaviour into account.
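If every value has to be kept, one option is to start from a list of pairs and group the values per key. The following is a sketch under that assumption; the names pairs, grouped and ordered are only illustrative.

```python
from collections import defaultdict

# The data as a list of (key, value) pairs, so repeated keys survive.
pairs = [(1, 'a'), (3, 'b'), (2, 'c'), (4, 'd'), (1, 'e'), (4, 'f')]

grouped = defaultdict(list)
for key, value in pairs:
    grouped[key].append(value)  # collect every value seen for this key

# Walk the keys in sorted order and keep all values for each key.
ordered = [(key, grouped[key]) for key in sorted(grouped)]
for key, values in ordered:
    print(key, values)
```

This keeps both 'd' and 'f' under key 4, and both 'a' and 'e' under key 1.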
I was running this code through python tutor, and was just confused as to how the keys and values get switched around. I also was confused as to what value myDict[d[key]] would correspond to as I'm not sure what the d in [d[key]] actually does.
def dict_invert(d):
    '''
    d: dict
    Returns an inverted dictionary according to the instructions above
    '''
    myDict = {}
    for key in d.keys():
        if d[key] in myDict:
            myDict[d[key]].append(key)
        else:
            myDict[d[key]] = [key]
    for val in myDict.values():
        val.sort()
    return myDict
print(dict_invert({8: 6, 2: 6, 4: 6, 6: 6}))
In your function d is the dictionary being passed in. Your code is creating a new dictionary, mapping the other direction (from the original dictionary's values to its keys). Since there may not be a one to one mapping (since values can be repeated in a dictionary), the new mapping actually goes from value to a list of keys.
When the code loops over the keys in d, it then uses d[key] to look up the corresponding value. As I commented above, this is not really the most efficient way to go about this. Instead of getting the key first and indexing to get the value, you can instead iterate over the items() of the dictionary and get key, value 2-tuples in the loop.
Here's how I'd rewrite the function, in what I think is a more clear fashion (as well as perhaps a little bit more efficient):
def dict_invert(d):
    myDict = {}
    for key, value in d.items():       # Get both key and value in the iteration.
        if value in myDict:            # That change makes these later lines more clear,
            myDict[value].append(key)  # as they can use value instead of d[key].
        else:
            myDict[value] = [key]      # here too
    for val in myDict.values():
        val.sort()
    return myDict
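For comparison, the if/else branch can also be left to collections.defaultdict. This is just a sketch of an alternative, not a change to the function above; the name dict_invert_defaultdict is made up here.

```python
from collections import defaultdict


def dict_invert_defaultdict(d):
    """Map each value of d to the sorted list of keys that had that value."""
    inverted = defaultdict(list)
    for key, value in d.items():
        inverted[value].append(key)  # a missing entry starts as an empty list
    for keys in inverted.values():
        keys.sort()  # assumes the original keys can be compared with each other
    return dict(inverted)  # hand back a plain dict


print(dict_invert_defaultdict({8: 6, 2: 6, 4: 6, 6: 6}))  # {6: [2, 4, 6, 8]}
```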
The function you are showing inverts a dictionary d. A dictionary is a collection of unique keys that map to values which are not necessarily unique. That means that when you swap keys and values, you may get multiple keys that have the same value. Your function handles this by adding keys in the input to a list in the inverse, instead of storing them directly as values. This avoids any possibility of conflict.
Let's look at a sample conceptually first before digging in. Let's say you have
d = {
'a': 1,
'b': 1,
'c': 2
}
When you invert that, you will have the keys 1 and 2. Key 1 will have two values: 'a' and 'b'. Key 2 will only have one value: 'c'. I used different types for the keys and values so you can tell immediately when you're looking at the input vs the output. The output should look like this:
myDict = {
1: ['a', 'b'],
2: ['c']
}
Now let's look at the code. First you initialize an empty output:
myDict = {}
Then you step through every key in the input d. Remember that these keys will become the values of the output:
for key in d.keys():
The value in d for key is d[key]. You need to check if that's a key in myDict since values become keys in the inverse:
if d[key] in myDict:
If the input's value is already a key in myDict, then it maps to a list of keys from d, and you need to append another one to the list. Specifically, d[key] represents the value in d for the key key. This value becomes a key in myDict, which is why it's being indexed like that:
myDict[d[key]].append(key)
Otherwise, create a new list with the single inverse recorded in it:
else:
myDict[d[key]] = [key]
The final step is to sort the values of the inverse. This is not necessarily a good idea. The values were keys in the input, so they are guaranteed to be hashable, but not necessarily comparable to each other:
for val in myDict.values():
val.sort()
The following should raise a TypeError in Python 3, because the tuple (1, 2) and the int 3 end up in the same list and cannot be compared:
dict_invert({(1, 2): 'a', 3: 'a'})
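To see the sorting problem in isolation, here is a minimal sketch that sorts a list holding a tuple and an int, which is what the inverse would contain if both were keys for the same value. The variable names are illustrative only.

```python
# Both elements are hashable, so both are valid dict keys,
# but Python 3 cannot order a tuple against an int.
mixed_keys = [(1, 2), 3]

try:
    mixed_keys.sort()
    sort_failed = False
except TypeError as exc:
    sort_failed = True
    print("sort failed:", exc)
```

A list with a single element sorts without any comparison, so the error only appears once two incomparable keys share a value.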
myDict[d[key]] takes the value of d[key] and uses it as a key in myDict, for example
d = {'a': 'alpha', 'b': 'beta'}
D = {'alpha': 1, 'beta': 2}
D[d['a']] = 3
D[d['b']] = 4
Now the contents of d and D should be as follows:
d = {'a': 'alpha', 'b': 'beta'}
D = {'alpha': 3, 'beta': 4}
d is the dictionary you are passing into the function:
def dict_invert(d):
When you write
myDict[d[key]] = [key]
it means
myDict[value of d] = [key of d]
resulting in
myDict = {'value of d': ['key of d']}
I have 2 dictionaries, A and B. A has 700000 key-value pairs and B has 560000 key-value pairs. All key-value pairs from B are present in A, but some keys in A are duplicates with different values and some have duplicated values but unique keys. I would like to subtract B from A, so I can get the remaining 140000 key-value pairs. When I subtract key-value pairs based on key identity, I remove, let's say, 150000 key-value pairs because of the repeated keys. I want to subtract key-value pairs based on the identity of BOTH key AND value for each key-value pair, so I get 140000. Any suggestion would be welcome.
This is an example:
A = {'10':1, '11':1, '12':1, '10':2, '11':2, '11':3}
B = {'11':1, '11':2}
I DO want to get:
A-B = {'10':1, '12':1, '10':2, '11':3}
I DO NOT want to get:
a) When based on keys:
{'10':1, '12':1, '10':2}
or
b) When based on values:
{'11':3}
To get items in A that are not in B, based just on key:
C = {k:v for k,v in A.items() if k not in B}
To get items in A that are not in B, based on key and value:
C = {k:v for k,v in A.items() if k not in B or v != B[k]}
To update A in place (as in A -= B) do:
from collections import deque
consume = deque(maxlen=0).extend
consume(A.pop(key, None) for key in B)
(Unlike using map() with A.pop, calling A.pop with a None default will not break if a key from B is not present in A. Also, unlike using all, this iterator consumer will iterate over all values, regardless of truthiness of the popped values.)
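As a small worked example of the key-and-value comparison, here is a sketch with made-up data (the question's own literals collapse because of the repeated keys). The function name subtract_items is hypothetical.

```python
def subtract_items(a, b):
    """Return the items of a whose (key, value) pair does not appear in b."""
    return {k: v for k, v in a.items() if k not in b or b[k] != v}


A = {'10': 2, '11': 3, '12': 1}
B = {'11': 3, '12': 5}

C = subtract_items(A, B)
print(C)  # {'10': 2, '12': 1}
```

'11' is dropped because both key and value match, while '12' is kept because its value differs in B.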
An easy, intuitive way to do this is
dict(set(a.items()) - set(b.items()))
A = {'10':1, '11':1, '12':1, '10':2, '11':2, '11':3}
B = {'11':1, '11':2}
You can't have duplicate keys in Python. If you run the above, it will get reduced to:
A={'11': 3, '10': 2, '12': 1}
B={'11': 2}
But to answer your question, to do A - B (based on dict keys):
all(map(A.pop, B)) # use all() so it works for Python 2 and 3.
print(A) # {'10': 2, '12': 1}
dict-views:
Keys views are set-like since their entries are unique and hashable. If all values are hashable, so that (key, value) pairs are unique and hashable, then the items view is also set-like. (Values views are not treated as set-like since the entries are generally not unique.) For set-like views, all of the operations defined for the abstract base class collections.abc.Set are available (for example, ==, <, or ^).
So you can:
>>> A = {'10':1, '11':1, '12':1, '10':2, '11':2, '11':3}
>>> B = {'11':1, '11':2}
>>> A.items() - B.items()
{('11', 3), ('12', 1), ('10', 2)}
>>> dict(A.items() - B.items())
{'11': 3, '12': 1, '10': 2}
For python 2 use dict.viewitems.
P.S. You can't have duplicate keys in dict.
>>> A = {'10':1, '11':1, '12':1, '10':2, '11':2, '11':3}
>>> A
{'10': 2, '11': 3, '12': 1}
>>> B = {'11':1, '11':2}
>>> B
{'11': 2}
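Since a dict cannot hold the repeated keys, one way to get the result the question asks for is to keep the data as sets of (key, value) tuples instead. This is a sketch under the assumption that the pairs can be built that way from the original source; pairs_a and pairs_b are illustrative names.

```python
# Each pair is a tuple, so the same key may appear with several values.
pairs_a = {('10', 1), ('11', 1), ('12', 1), ('10', 2), ('11', 2), ('11', 3)}
pairs_b = {('11', 1), ('11', 2)}

# Set difference compares whole tuples, i.e. key AND value together.
remaining = pairs_a - pairs_b
print(sorted(remaining))
```

The difference contains exactly the four pairs listed as the wanted result in the question.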
Another way of using the efficiency of sets. This might be more multipurpose than the answer by @brien. His answer is very nice and concise, so I upvoted it.
diffKeys = set(a.keys()) - set(b.keys())
c = dict()
for key in diffKeys:
    c[key] = a.get(key)
EDIT: There is the assumption here, based on the OP's question, that dict B is a subset of dict A, that the key/val pairs in B are in A. The above code will have unexpected results if you are not working strictly with a key/val subset. Thanks to Steven for pointing this out in his comment.
Since I can not (yet) comment: the accepted answer will fail if there are some keys in B not present in A.
Using dict.pop with a default would circumvent it (borrowed from How to remove a key from a Python dictionary?):
all(A.pop(k, None) for k in B)
or
tuple(A.pop(k, None) for k in B)
result = A.copy()
for key in B:
    # remove the pair only if both key and value match
    if key in A and A[key] == B[key]:
        del result[key]
Based on only keys assuming A is a superset of B or B is a subset of A:
Python 3: c = {k:a[k] for k in a.keys() - b.keys()}
Python 2: c = {k:a[k] for k in list(set(a.keys())-set(b.keys()))}
Based on keys, and usable to update a in place as well: see @PaulMcG's answer.
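If A and B are collections.Counter objects rather than plain dicts (a plain dict has no subtract method), you could do:
A.subtract(B)
Note: This subtracts the counts stored as values rather than removing key-value pairs, and it will give you negative values where B has keys that A does not.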
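To make the behaviour of subtract concrete, here is a short sketch with collections.Counter and made-up data. It subtracts counts in place, which is a different operation from removing matching key-value pairs.

```python
from collections import Counter

stock = Counter({'a': 3, 'b': 1})

# subtract() works in place and returns None; the values are subtracted per key.
stock.subtract({'a': 1, 'c': 2})

print(dict(stock))  # {'a': 2, 'b': 1, 'c': -2}
```

The key 'c' was not in the original counter, so it ends up with a negative count.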
I have the following dictionary, where keys are integers and values are floats:
foo = {1:0.001,2:2.097,3:1.093,4:5.246}
This dictionary has keys 1, 2, 3 and 4.
Now, I remove the key '2':
foo = {1:0.001,3:1.093,4:5.246}
I only have the keys 1, 3 and 4 left. But I want these keys to be called 1, 2 and 3.
The function 'enumerate' allows me to get the list [1,2,3]:
some_list = []
for k,v in foo.items():
    some_list.append(k)
num_list = list(enumerate(some_list, start=1))
Next, I try to populate the dictionary with these new keys and the old values:
new_foo = {}
for i in num_list:
    for value in foo.itervalues():
        new_foo[i[0]] = value
However, new_foo now contains the following values:
{1: 5.246, 2: 5.246, 3: 5.246}
So every value was replaced by the last value of 'foo'. I think the problem comes from the design of my for loop, but I don't know how to solve this. Any tips?
Using the list-comprehension-like style:
bar = dict( (k,v) for k,v in enumerate(foo.values(), start=1) )
But, as mentioned in the comments, the ordering is going to be arbitrary in Python versions before 3.7, where a dict does not guarantee any order (from 3.7 on, dicts preserve insertion order). To renumber in sorted key order regardless, the following can be used:
bar = dict( ( i,foo[k] ) for i, k in enumerate(sorted(foo), start=1) )
here sorted(foo) returns the list of sorted keys of foo. i is the new enumeration of the sorted keys as well as the new enumeration for the new dict.
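Put together as a function, the renumbering might look like the sketch below. The name renumber and the start parameter are assumptions for illustration, and it relies on the keys being sortable.

```python
def renumber(d, start=1):
    """Return a new dict whose keys are start, start + 1, ... in sorted key order."""
    return {i: d[k] for i, k in enumerate(sorted(d), start=start)}


foo = {1: 0.001, 2: 2.097, 3: 1.093, 4: 5.246}
del foo[2]  # leaves the keys 1, 3 and 4

bar = renumber(foo)
print(bar)  # {1: 0.001, 2: 1.093, 3: 5.246}
```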
Like others have said, it would be best to use a list instead of dict. However, in case you prefer to stick with a dict, you can do
foo = {j+1:foo[k] for j,k in enumerate(sorted(foo))}
I agree with the other responses that a list implements the behavior you describe, and so is probably more appropriate, but I will suggest an answer anyway.
The problem with your code is the way you are using the data structures. Simply enumerate the items left in the dictionary:
new_foo = {}
for key, (old_key, value) in enumerate( sorted( foo.items() ) ):
    key = key+1 # adjust for 1-based
    new_foo[key] = value
A dictionary is the wrong structure here. Use a list; lists map contiguous integers to values, after all.
Either adjust your code to start at 0 rather than 1, or include a padding value at index 0:
foo = [None, 0.001, 2.097, 1.093, 5.246]
Deleting the 2 'key' is then as simple as:
del foo[2]
giving you automatic renumbering of the rest of your 'keys'.
This looks suspiciously like Something You Should Not Do, but I'll assume for a moment that you're simplifying the process for an MCVE rather than actually trying to name your dict keys 1, 2, 3, 4, 5, ....
d = {1:0.001, 2:2.097, 3:1.093, 4:5.246}
del d[2]
# d == {1:0.001, 3:1.093, 4:5.246}
new_d = {idx:val for idx,val in zip(range(1,len(d)+1),
(v for _,v in sorted(d.items())))}
# new_d == {1: 0.001, 2: 1.093, 3: 5.246}
You can convert dict to list, remove specific element, then convert list to dict. Sorry, it is not a one liner.
In [1]: foo = {1:0.001,2:2.097,3:1.093,4:5.246}
In [2]: l=list(foo.values()) #[0.001, 2.097, 1.093, 5.246]
In [3]: l.pop(1) #returns 2.097, not the list
In [4]: dict(enumerate(l,1))
Out[4]: {1: 0.001, 2: 1.093, 3: 5.246}
Try:
foo = {1:0.001,2:2.097,3:1.093,4:5.246}
foo.pop(2)
new_foo = {i: value for i, (_, value) in enumerate(sorted(foo.items()), start=1)}
print(new_foo)
However, I'd advise you to use a normal list instead, which is designed exactly for fast lookup of gapless, numeric keys:
foo = [0.001, 2.097, 1.093, 5.246]
foo.pop(1) # list indices start at 0
print(foo)
One liner that filters a sequence, then re-enumerates and constructs a dict.
In [1]: foo = {1:0.001, 2:2.097, 3:1.093, 4:5.246}
In [2]: selected=1
In [3]: { k:v for k,v in enumerate((foo[i] for i in foo if i != selected), 1) }
Out[3]: {1: 2.097, 2: 1.093, 3: 5.246}
I have a more compact method.
I think it's more readable and easy to understand. You can refer as below:
foo = {1:0.001,2:2.097,3:1.093,4:5.246}
del foo[2]
foo.update({k: foo[4] for k in foo})
print(foo)
Note that this sets every value to foo[4] and leaves the keys as they are, so the output is:
{1: 5.246, 3: 5.246, 4: 5.246}
Is it possible to replace all values in a dictionary, regardless of value, with the integer 1?
Thank you!
Sure, you can do something like:
d = {x: 1 for x in d}
That creates a new dictionary d that maps every key in d (the old one) to 1.
You can use a dict comprehension (as others have said) to create a new dictionary with the same keys as the old dictionary, or, if you need to do the whole thing in place:
for k in d:
    d[k] = 1
If you're really fond of 1-liners, you can do it in place using update:
d.update( (k,1) for k in d )
a = {1:2, 2:2,3:2}
a = {x:1 for (x,_) in a.items()}
print(a)
{1: 1, 2: 1, 3: 1}
Yes, it's possible. Iterate through every key in the dictionary and set the related value to 1.
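One more option, not taken from the answers above: the built-in dict.fromkeys builds a new dictionary with the same keys and a single shared value. A minimal sketch with made-up data:

```python
d = {'x': 10, 'y': 'hello', 'z': None}

# New dict with the keys of d, each mapped to 1; d itself is left untouched.
ones = dict.fromkeys(d, 1)

print(ones)  # {'x': 1, 'y': 1, 'z': 1}
```

Because the same object is used for every key, this is best suited to immutable values such as the integer 1.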