I have two dictionaries, A and B. A has 700000 key-value pairs and B has 560000 key-value pairs. All key-value pairs from B are present in A, but some keys in A are duplicates with different values and some have duplicated values but unique keys. I would like to subtract B from A, so I can get the remaining 140000 key-value pairs. When I subtract key-value pairs based on key identity alone, I remove, let's say, 150000 key-value pairs because of the repeated keys. I want to subtract key-value pairs based on the identity of BOTH key AND value for each pair, so I get 140000. Any suggestions would be welcome.
This is an example:
A = {'10':1, '11':1, '12':1, '10':2, '11':2, '11':3}
B = {'11':1, '11':2}
I DO want to get:
A-B = {'10':1, '12':1, '10':2, '11':3}
I DO NOT want to get:
a) When based on keys:
{'10':1, '12':1, '10':2}
or
b) When based on values:
{'11':3}
To get items in A that are not in B, based just on key:
C = {k:v for k,v in A.items() if k not in B}
To get items in A that are not in B, based on key and value:
C = {k:v for k,v in A.items() if k not in B or v != B[k]}
To update A in place (as in A -= B) do:
from collections import deque
consume = deque(maxlen=0).extend
consume(A.pop(key, None) for key in B)
(Unlike using map() with A.pop, calling A.pop with a None default will not break if a key from B is not present in A. Also, unlike using all, this iterator consumer will iterate over all values, regardless of truthiness of the popped values.)
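The in-place version above removes entries by key only. A minimal sketch of an in-place variant that also compares values, assuming a hypothetical helper name subtract_in_place (it is not a standard library function):

```python
def subtract_in_place(a, b):
    """Remove from a every pair that also appears in b with an equal value."""
    # Hypothetical helper, not part of the standard library.
    for key, value in b.items():
        # Delete only when the key exists in a AND the values match.
        if key in a and a[key] == value:
            del a[key]
    return a


A = {'10': 2, '11': 3, '12': 1}
B = {'11': 2, '12': 1}
subtract_in_place(A, B)
print(A)  # {'10': 2, '11': 3}
```

The pair '11': 3 survives because B holds a different value for that key.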
An easy, intuitive way to do this is
dict(set(a.items()) - set(b.items()))
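A short sketch of that expression on small example dicts (the data here is made up for illustration). It assumes every value is hashable, since building a set of items fails for values such as lists:

```python
a = {'x': 1, 'y': 2, 'z': 3}
b = {'x': 1, 'y': 5}

# Pairs are compared as (key, value) tuples, so ('y', 2) survives
# because b holds ('y', 5) instead.
c = dict(set(a.items()) - set(b.items()))
print(c == {'y': 2, 'z': 3})  # True

# Unhashable values (for example lists) cannot be placed in a set.
try:
    set({'k': [1, 2]}.items())
except TypeError as exc:
    print("TypeError:", exc)
```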
A = {'10':1, '11':1, '12':1, '10':2, '11':2, '11':3}
B = {'11':1, '11':2}
You can't have duplicate keys in Python. If you run the above, it will get reduced to:
A={'11': 3, '10': 2, '12': 1}
B={'11': 2}
But to answer your question, to do A - B (based on dict keys):
all(map(A.pop, B)) # use all() so it works for Python 2 and 3.
print(A) # {'10': 2, '12': 1}
dict-views:
Keys views are set-like since their entries are unique and hashable. If all values are hashable, so that (key, value) pairs are unique and hashable, then the items view is also set-like. (Values views are not treated as set-like since the entries are generally not unique.) For set-like views, all of the operations defined for the abstract base class collections.abc.Set are available (for example, ==, <, or ^).
So you can:
>>> A = {'10':1, '11':1, '12':1, '10':2, '11':2, '11':3}
>>> B = {'11':1, '11':2}
>>> A.items() - B.items()
{('11', 3), ('12', 1), ('10', 2)}
>>> dict(A.items() - B.items())
{'11': 3, '12': 1, '10': 2}
For Python 2, use dict.viewitems() instead of dict.items().
P.S. You can't have duplicate keys in dict.
>>> A = {'10':1, '11':1, '12':1, '10':2, '11':2, '11':3}
>>> A
{'10': 2, '11': 3, '12': 1}
>>> B = {'11':1, '11':2}
>>> B
{'11': 2}
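Because a dict literal silently keeps only the last value for a repeated key, the data in the question needs a different container. One possible sketch, assuming the pairs are stored as lists of (key, value) tuples (the names pairs_a and pairs_b are illustrative, not from the question):

```python
from collections import Counter

# The question's data as lists of tuples, so repeated keys survive.
pairs_a = [('10', 1), ('11', 1), ('12', 1), ('10', 2), ('11', 2), ('11', 3)]
pairs_b = [('11', 1), ('11', 2)]

# Counter subtraction treats each (key, value) tuple as one item and
# removes one occurrence from pairs_a per occurrence in pairs_b.
remaining = list((Counter(pairs_a) - Counter(pairs_b)).elements())
print(sorted(remaining))
# [('10', 1), ('10', 2), ('11', 3), ('12', 1)]
```

This yields the four pairs the question asks for, which a plain dict cannot represent because '10' appears twice.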
Another way of using the efficiency of sets. This might be more multipurpose than the answer by @brien. His answer is very nice and concise, so I upvoted it.
diffKeys = set(a.keys()) - set(b.keys())
c = dict()
for key in diffKeys:
    c[key] = a.get(key)
EDIT: There is the assumption here, based on the OP's question, that dict B is a subset of dict A, that the key/val pairs in B are in A. The above code will have unexpected results if you are not working strictly with a key/val subset. Thanks to Steven for pointing this out in his comment.
Since I can not (yet) comment: the accepted answer will fail if there are some keys in B not present in A.
Using dict.pop with a default would circumvent it (borrowed from How to remove a key from a Python dictionary?):
all(A.pop(k, None) for k in B)
or
tuple(A.pop(k, None) for k in B)
result = A.copy()
for key in B:
    if key in A and A[key] == B[key]:  # remove only exact key-value matches
        result.pop(key)
Based on only keys assuming A is a superset of B or B is a subset of A:
Python 3: c = {k:a[k] for k in a.keys() - b.keys()}
Python 2: c = {k:a[k] for k in list(set(a.keys())-set(b.keys()))}
Based on keys only, and usable to update a in place as well: see @PaulMcG's answer.
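For subtracting the dictionaries, if A and B are collections.Counter objects (a plain dict has no subtract method), you could do:
A.subtract(B)
Note: This subtracts the counts stored as values rather than removing matching pairs, and it will give you negative values in a situation where B has keys that A does not.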
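A sketch of what subtract does, assuming A and B are collections.Counter objects with made-up counts:

```python
from collections import Counter

A = Counter({'x': 3, 'y': 1})
B = Counter({'x': 1, 'z': 2})

A.subtract(B)  # modifies A in place and returns None
print(A['x'], A['y'], A['z'])  # 2 1 -2
```

The key 'z' exists only in B, so it ends up in A with a negative count.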
Related
I would like to merge or update a dictionary in Python with new entries, but replace the values of entries whose key exists with the smaller of the values associated with the key in the existing entry and the new entry. For example:
Input:
dict_A = {1:14, 2:15, 3:16, 4:17}, dict_B= {2:19, 3:9, 4:11, 5:13}
Expected output:
{1:14, 2:15, 3:9, 4:11, 5:13}
I know it can be achieved with a loop iterating through the dictionaries while performing comparisons, but is there any simpler and faster ways or any helpful libraries to achieve this?
in this case you could easily use pandas to avoid writing the loop, though I dunno if there would be any speedup - didn't test that
import pandas as pd
df = pd.DataFrame([dict_A, dict_B])
out = df.min().to_dict()
output: {1: 14.0, 2: 15.0, 3: 9.0, 4: 11.0, 5: 13.0}
there's probably some edge cases you'd have to account for; for one, the values come back as floats, because keys missing from one dict are filled with NaN
Quick One-liner
c = {**a, **b, **{key:min(a[key], b[key]) for key in set(a).intersection(set(b))} }
Explanation
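This should be quick enough because it uses a set intersection to find the shared keys, rather than checking every key of one dictionary against the other in a nested loop.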
You can merge dictionaries by using the **dictionary syntax like so: {**a, **b}. The ** simply just "expands" out the dictionary into each individual item, with the last expanded dictionary overwriting any previous ones (so in {**a, **b}, any matching keys in b overwrite the value from a).
The first thing I do is load in all the values in a and b into the new dictionary:
c = {**a, **b, ...
Then I use dictionary comprehension to generate a new dictionary, which only has the smallest value for every set of keys which are in both a and b.
... {key:min(a[key], b[key]) for key in set(a).intersection(set(b))} ...
To get the set of keys which exist in both a and b, I convert both dictionaries to sets (which converts them to sets of their keys) and use intersection to quickly find all keys which are in both sets.
... set(a).intersection(set(b)) ...
Then I loop through each of the keys in the matching-keys set, and use the dictionary comprehension to generate a new dictionary with the current key and the min of both dictionaries' values for that key.
... {key:min(a[key], b[key]) ...
Then I use the ** syntax to "expand" this new generated dictionary with the expanded a and b, putting it last to make sure it overwrites any values from the two.
Works on the example given (ctrl-cv'd straight from my terminal):
>>> a = {1:14, 2:15, 3:16, 4:17}
>>> b = {2:19, 3:9, 4:11, 5:13}
>>> c = {**a, **b, **{key:min(a[key], b[key]) for key in set(a).intersection(set(b))} }
>>> c
{1: 14, 2: 15, 3: 9, 4: 11, 5: 13}
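For comparison, a minimal sketch of the same merge written as an explicit loop. The helper name merge_min is hypothetical, and it assumes the values of both dicts can be compared with <:

```python
def merge_min(a, b):
    """Return a new dict holding, for each key, the smaller of the two values."""
    # Hypothetical helper name; start from a copy so neither input is modified.
    merged = dict(a)
    for key, value in b.items():
        # Take b's value when the key is new or b's value is smaller.
        if key not in merged or value < merged[key]:
            merged[key] = value
    return merged


a = {1: 14, 2: 15, 3: 16, 4: 17}
b = {2: 19, 3: 9, 4: 11, 5: 13}
print(merge_min(a, b))  # {1: 14, 2: 15, 3: 9, 4: 11, 5: 13}
```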
Here's something short to do it. Not inherently the fastest or best way to do it, but figured I'd share nonetheless.
max_val = max(max(dict_A.values()), max(dict_B.values())) + 1
keys = set(list(dict_A.keys()) + list(dict_B.keys()))
dict_C = { key : min(dict_A.get(key, max_val), dict_B.get(key, max_val)) for key in keys }
Hello, this is the code I have written. I hope it's what you needed:
dict_A = {1:14, 2:15, 3:16, 4:17}
dict_B = {2:19, 3:9, 4:11, 5:13}
dict_res = dict_B.copy()  # copy, so that dict_B itself is not modified
dict_A_keys = dict_A.keys()
dict_B_keys = dict_B.keys()
for e in dict_A_keys:
    if e in dict_B_keys:
        if dict_A[e] > dict_B[e]:
            dict_res[e] = dict_B[e]
        else:
            dict_res[e] = dict_A[e]
    else:
        dict_res[e] = dict_A[e]
I have tested it and the output is :
{1: 14, 2: 15, 3: 9, 4: 11, 5: 13}
I was asked to write a function, reverse_dict_in_place(d), which swaps the keys and values of the input dictionary without changing the dictionary's location in memory (in place).
However, testing with the id() function shows that all my solutions do change the dictionary's memory location.
def reverse_dict_in_place(d):
    d = {y: x for x, y in d.items()}
    return d
Alternative to current ones which allows values to be same as keys. Works in mostly the same way though, however once again no two values may be the same.
def reverse_dict_in_place(d):
    copy = d.copy().items()
    d.clear()
    for k, v in copy:
        d[v] = k
    return d
>>> x = {0: 1, 1: 2}
>>> y = reverse_dict_in_place(x)
>>> id(x) == id(y)
True
>>>
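A small sketch of why the version in the question fails the id() test while the clear-and-refill version passes. The function names rebind and mutate are made up for this illustration:

```python
def rebind(d):
    # Assigning to the parameter name creates a new dict object;
    # the caller's dict is left untouched.
    d = {v: k for k, v in d.items()}
    return d


def mutate(d):
    # Snapshot the pairs first, then empty and refill the same object.
    pairs = list(d.items())
    d.clear()
    for k, v in pairs:
        d[v] = k
    return d


x = {'a': 1, 'b': 2}
print(rebind(x) is x)  # False
y = {'a': 1, 'b': 2}
print(mutate(y) is y)  # True
print(y)               # {1: 'a', 2: 'b'}
```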
Some assumptions for this to work (thanks to all the users who pointed these out):
There are no duplicate values
There are no non-hashable values
There are no values that are also keys
If you're comfortable with those assumption then I think this should work:
def reverse_dict_in_place(d):
    for k, v in list(d.items()):  # iterate over a snapshot, since d changes inside the loop
        del d[k]
        d[v] = k
    return d
Extending Gad's suggestion, you could use a dict comprehension (note that this builds a new dict rather than modifying d in place):
inverted = {v: k for k, v in d.items()}
Where d is a dict, and the same assumptions apply:
There are no duplicate values
There are no non-hashable values
There are no values that are also keys
This would not work, without modification, for nested dicts.
Note: @NightShade posted an answer similar to mine below, before I posted this one.
You can try this:
def reverse_dict_in_place(d):
    d_copy = d.copy()
    d.clear()
    for k in d_copy:
        d[d_copy[k]] = k
This would work even if one of the dictionary's values happens to also be a key (as tested out below)
Testing it out:
my_dict = {1:1, 2:'two', 3:'three'}
reverse_dict_in_place(my_dict)
print (my_dict)
Output:
{1: 1, 'two': 2, 'three': 3}
I was running this code through python tutor, and was just confused as to how the keys and values get switched around. I also was confused as to what value myDict[d[key]] would correspond to as I'm not sure what the d in [d[key]] actually does.
def dict_invert(d):
    '''
    d: dict
    Returns an inverted dictionary according to the instructions above
    '''
    myDict = {}
    for key in d.keys():
        if d[key] in myDict:
            myDict[d[key]].append(key)
        else:
            myDict[d[key]] = [key]
    for val in myDict.values():
        val.sort()
    return myDict
print(dict_invert({8: 6, 2: 6, 4: 6, 6: 6}))
In your function d is the dictionary being passed in. Your code is creating a new dictionary, mapping the other direction (from the original dictionary's values to its keys). Since there may not be a one to one mapping (since values can be repeated in a dictionary), the new mapping actually goes from value to a list of keys.
When the code loops over the keys in d, it then uses d[key] to look up the corresponding value. As I commented above, this is not really the most efficient way to go about this. Instead of getting the key first and indexing to get the value, you can instead iterate over the items() of the dictionary and get key, value 2-tuples in the loop.
Here's how I'd rewrite the function, in what I think is a more clear fashion (as well as perhaps a little bit more efficient):
def dict_invert(d):
    myDict = {}
    for key, value in d.items(): # Get both key and value in the iteration.
        if value in myDict: # That change makes these later lines more clear,
            myDict[value].append(key) # as they can use value instead of d[key].
        else:
            myDict[value] = [key] # here too
    for val in myDict.values():
        val.sort()
    return myDict
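Another way to write the same inversion, sketched here with collections.defaultdict so the if/else branch disappears. This is an alternative offered for illustration, not the code from the question:

```python
from collections import defaultdict


def dict_invert(d):
    inverted = defaultdict(list)  # missing keys start as an empty list
    for key, value in d.items():
        inverted[value].append(key)
    for keys in inverted.values():
        keys.sort()  # assumes the original keys are mutually comparable
    return dict(inverted)  # convert back to a plain dict


print(dict_invert({8: 6, 2: 6, 4: 6, 6: 6}))  # {6: [2, 4, 6, 8]}
```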
The function you are showing inverts a dictionary d. A dictionary is a collection of unique keys that map to values which are not necessarily unique. That means that when you swap keys and values, you may get multiple keys that have the same value. Your function handles this by adding keys in the input to a list in the inverse, instead of storing them directly as values. This avoids any possibility of conflict.
Let's look at a sample conceptually first before digging in. Let's say you have
d = {
'a': 1,
'b': 1,
'c': 2
}
When you invert that, you will have the keys 1 and 2. Key 1 will have two values: 'a' and 'b'. Key 2 will only have one value: 'c'. I used different types for the keys and values so you can tell immediately when you're looking at the input vs the output. The output should look like this:
myDict = {
1: ['a', 'b'],
2: ['c']
}
Now let's look at the code. First you initialize an empty output:
myDict = {}
Then you step through every key in the input d. Remember that these keys will become the values of the output:
for key in d.keys():
The value in d for key is d[key]. You need to check if that's a key in myDict since values become keys in the inverse:
if d[key] in myDict:
If the input's value is already a key in myDict, then it maps to a list of keys from d, and you need to append another one to the list. Specifically, d[key] represents the value in d for the key key. This value becomes a key in myDict, which is why it's being indexed like that:
myDict[d[key]].append(key)
Otherwise, create a new list with the single inverse recorded in it:
else:
    myDict[d[key]] = [key]
The final step is to sort the values of the inverse. This is not necessarily a good idea. The values were keys in the input, so they are guaranteed to be hashable, but not necessarily comparable to each other:
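for val in myDict.values():
    val.sort()
The following raises a TypeError in Python 3, because the keys (1, 2) and 3 share a value, end up in the same list, and cannot be compared with each other:
dict_invert({(1, 2): 'a', 3: 'a'})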
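A small sketch of the underlying issue: the sort fails only when a single list ends up holding keys whose types cannot be ordered against each other. The workaround shown (sorting by repr) is an assumption added for illustration, not part of the original function:

```python
mixed_keys = [(1, 2), 3]  # a tuple and an int, both valid dict keys

try:
    mixed_keys.sort()
except TypeError as exc:
    print("sort failed:", exc)

# One possible workaround (an assumption, not from the original code):
# sort by a string form so that any mix of types can be ordered.
mixed_keys.sort(key=repr)
print(mixed_keys)  # [(1, 2), 3]
```

Sorting by repr gives a stable, if arbitrary, order; whether that order is meaningful depends on the data.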
myDict[d[key]] takes value of d[key] and uses it as a key in myDict, for example
d = {'a': 'alpha', 'b': 'beta'}
D = {'alpha': 1, 'beta': 2}
D[d['a']] = 3
D[d['b']] = 4
Now the contents of d and D are as follows:
d = {'a': 'alpha', 'b': 'beta'}
D = {'alpha': 3, 'beta': 4}
d is the dictionary you are passing into the function:
def dict_invert(d):
When you write
myDict[d[key]] = [key]
its meaning is
myDict[value of d] = [key of d]
resulting in
myDict = {'value of d': ['key of d']}
I want to copy pairs from this dictionary based on their values so they can be assigned to new variables. From my research it seems easy to do this based on keys, but in my case the values are what I'm tracking.
things = ({'alpha': 1, 'beta': 2, 'cheese': 3, 'delta': 4})
And in made-up language I can assign variables like so -
smaller_things = all values =3 in things
You can use .items() to traverse through the pairs and make changes like this:
smaller_things = {}
for k, v in things.items():
    if v == 3:
        smaller_things[k] = v
If you want a one liner and only need the keys back, list comprehension will do it:
smaller_things = [k for k, v in things.items() if v == 3]
>>> things = { 'a': 3, 'b': 2, 'c': 3 }
>>> [k for k, v in things.items() if v == 3]
['a', 'c']
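If the whole pairs are needed rather than just the keys, a dict comprehension follows the same pattern. A minimal sketch using the dictionary from the question (the name at_most_two is illustrative):

```python
things = {'alpha': 1, 'beta': 2, 'cheese': 3, 'delta': 4}

# Keep whole key-value pairs whose value passes the test.
smaller_things = {k: v for k, v in things.items() if v == 3}
print(smaller_things)  # {'cheese': 3}

# The same pattern works for any condition on the value.
at_most_two = {k: v for k, v in things.items() if v <= 2}
print(at_most_two)  # {'alpha': 1, 'beta': 2}
```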
You can just reverse the dictionary and pull from that:
keys_values = {1: "a", 2: "b"}
values_keys = dict(zip(keys_values.values(), keys_values.keys()))
print(values_keys)
# {'a': 1, 'b': 2}
That way you can do whatever you need to with standard dictionary syntax.
The potential drawback is if you have non-unique values in the original dictionary; items in the original with the same value will have the same key in the reversed dictionary, so you can't guarantee which of the original keys would be the new value. And potentially some values are unhashable (such as lists).
Unless you have a compulsive need to be clever, iterating over items is easier:
for key, val in my_dict.items():
    if matches_condition(val):
        do_something(key)
This answer is based on my understanding of your question.
A dictionary is a kind of hash table. Its main purpose is to provide non-integer indexing for values; the keys in a dictionary act like indexes.
Consider an array: its elements are addressed by index, so we look up an element from its index, not an index from its element. In the same way, a dictionary gives us keys (non-integer indexes) for values.
One implication is that values in a dictionary do not have to be hashable and may be mutable, while keys must be hashable (in practice, immutable). Values can be changed at any time.
So addressing anything by its value in a dictionary is not a good approach.
Is it possible to replace all values in a dictionary, regardless of value, with the integer 1?
Thank you!
Sure, you can do something like:
d = {x: 1 for x in d}
That creates a new dictionary d that maps every key in d (the old one) to 1.
You can use a dict comprehension (as others have said) to create a new dictionary with the same keys as the old dictionary, or, if you need to do the whole thing in place:
for k in d:
    d[k] = 1
If you're really fond of 1-liners, you can do it in place using update:
d.update( (k,1) for k in d )
a = {1:2, 2:2, 3:2}
a = {x:1 for (x,_) in a.items()}  # use iteritems() on Python 2
print(a)
{1: 1, 2: 1, 3: 1}
Yes, it's possible. Iterate through every key in the dictionary and set the related value to 1.
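A short sketch contrasting the in-place loop with building a new dictionary. The data and the name alias are made up, and dict.fromkeys is used here as one more way to get a fresh dict with a single shared value:

```python
d = {'a': 10, 'b': 20, 'c': 30}
alias = d  # second reference to the same dict object

# In place: every reference to the dict sees the change.
for k in d:
    d[k] = 1
print(alias)  # {'a': 1, 'b': 1, 'c': 1}

# New dict: dict.fromkeys builds a fresh mapping with one shared value.
e = dict.fromkeys(d, 1)
print(e == d, e is d)  # True False
```

Assigning to existing keys while iterating is safe because the dictionary's size does not change.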