Key based intersection of two python dictionaries with priority - python

Let's say I have two dictionaries, a and b:
a = {"a": 2, "b": 4}
b = {"a": 5, "b": 2, "c": 10}
I want to find the common keys in both dictionaries, and then take the value of those keys from b to create a new one. Example:
c = intersect_keys(a, b)
# c = {"a": 5, "b": 2}
As you can see, the keys that are not present in the first dictionary are not used in the newly generated one. How can I do this quickly in Python?
Moreover, because we always pick the values from the second dictionary, would it be better to turn a into a list of keys, iterate over it, and look up the values in b? Thanks for any answers.

You can use a dictionary comprehension, in order to keep those keys in b that are also present in a:
{k:v for k,v in b.items() if k in a}
Output
{'a': 5, 'b': 2}

Try this:
myKeys = set(a.keys()).intersection(b.keys())
c = {}
for key in myKeys:
    c[key] = b[key]
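Either answer can be wrapped into the intersect_keys function the question asks for. The following is a minimal sketch, assuming both arguments are plain dicts. Membership tests on a dict are already constant time on average, so there is no need to convert a into a list first.

```python
def intersect_keys(a, b):
    """Return a new dict with the keys common to a and b, taking values from b."""
    # Iterating over a keeps a's key order; `k in b` is an average O(1) lookup.
    return {k: b[k] for k in a if k in b}


a = {"a": 2, "b": 4}
b = {"a": 5, "b": 2, "c": 10}
c = intersect_keys(a, b)
print(c)  # {'a': 5, 'b': 2}
```

Neither input dictionary is modified.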

Related

make list of dictionaries overwriting one key entry from a list using iterators

I have the horrible feeling this will be a duplicate, I tried my best to find the answer already.
I have a dictionary and a list, and I want to create a list of dictionaries, using the list to overwrite one of the key values, like this:
d={"a":1,"b":10}
c=[3,4,5]
arg=[]
for i in c:
    e=d.copy()
    e["a"]=i
    arg.append(e)
this gives the desired result
arg
[{'a': 3, 'b': 10}, {'a': 4, 'b': 10}, {'a': 5, 'b': 10}]
but the code is ugly, especially with the copy command, and instead of one list I have 4 or 5 in my real example which leads to a huge nested loop. I feel sure there is a neater way with an iterator like
arg=[d *with* d[a]=i for i in c]
where I'm not sure what to put in the place of the "with".
Again, apologies if this is already answered.
IIUC, you could do:
d={"a":1,"b":10}
c=[3,4,5]
res = [{ **d, "a" : ci } for ci in c]
print(res)
Output
[{'a': 3, 'b': 10}, {'a': 4, 'b': 10}, {'a': 5, 'b': 10}]
The part:
"a" : ci
rewrites the value at the key "a" and **d unpacks the dictionary.
I would do it this way:
arg=[d.copy() for i in range(len(c))]
for i in range(len(arg)):
    arg[i]['a']=c[i]
This code first creates a list of copies of d with the same length as c, and then updates 'a' in each dictionary with the respective item of c.
You could do it using a dictionary comprehension within a list comprehension, checking for key == 'a':
d = {"a":1,"b":10}
c = [3,4,5]
l = [{k: num if k == 'a' else v for k,v in d.items()} for num in c]
Python 3.9 added a new way to create a dictionary with updated values while leaving the old dictionary unchanged: the | operator
new_dict = old_dict | dict_with_updates
With list comprehension it will be
arg = [ d | {"a": i} for i in c]
Full example
d = {"a": 1, "b": 10}
c = [3, 4, 5]
arg = [ d | {"a": i} for i in c]
print(arg)
BTW: There is also |= to update existing dictionary
old_dict |= dict_with_updates
Doc: What’s New In Python 3.9
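The question also mentions having 4 or 5 lists in the real code, which leads to a deeply nested loop. One way to flatten that is itertools.product. The sketch below assumes the override lists are collected in a dict named overrides (a hypothetical name, as are the extra key "c" and the sample values), with one list of candidate values per key.

```python
from itertools import product

d = {"a": 1, "b": 10, "c": 100}
# Hypothetical override lists: one list of candidate values per key.
overrides = {"a": [3, 4], "b": [20, 30]}

keys = list(overrides)
arg = [
    # Unpack d first, then the overrides, so the overrides win.
    {**d, **dict(zip(keys, values))}
    for values in product(*(overrides[k] for k in keys))
]
print(arg)
```

This builds one dictionary per combination of override values; keys that have no override list (here "c") keep their value from d, and d itself is not modified.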

Over counting pairs in python loop

I have a list of dictionaries where each dict is of the form:
{'A': a,'B': b}
I want to iterate through the list and for every (a,b) pair, find the pair(s), (b,a), if it exists.
For example if for a given entry of the list A = 13 and B = 14, then the original pair would be (13,14). I would want to search the entire list of dicts to find the pair (14,13). If (14,13) occurred multiple times I would like to record that too.
I would like to count the number of times for all original (a,b) pairs in the list, when the complement (b,a) appears, and if so how many times. To do this I have two for loops and a counter when a complement pair is found.
pairs_found = 0
for i, val in enumerate( list_of_dicts ):
    for j, vol in enumerate( list_of_dicts ):
        if val['A'] == vol['B']:
            if vol['A'] == val['B']:
                pairs_found += 1
This generates a pairs_found greater than the length of list_of_dicts. I realize this is because the same pairs will be over-counted. I am not sure how I can overcome this degeneracy?
Edit for Clarity
list_of_dicts = [
    {'A': 14, 'B': 23},
    {'A': 235, 'B': 98},
    {'A': 686, 'B': 999},
    {'A': 128, 'B': 123},
    # ....
]
Let's say that the list has around 100000 entries. Somewhere in that list, there will be one or more entries of the form {'A': 23, 'B': 14}. If this is true, then I would like a counter to increase its value by one. I would like to do this for every value in the list.
Here is what I suggest:
Use tuple to represent your pairs and use them as dict/set keys.
Build a set of unique inverted pairs you'll look for.
Use a dict to store the number of time a pair appears inverted
Then the code should look like this:
# Create a set of unique inverted pairs
inverted_pairs_set = {(d['B'],d['A']) for d in list_of_dicts}
# Create a counter for original pairs
pairs_counter_dict = {(ip[1],ip[0]):0 for ip in inverted_pairs_set}
# Create list of pairs
pairs_list = [(d['A'],d['B']) for d in list_of_dicts]
# Count, for each inverted pair, how many times it appears
for p in pairs_list:
    if p in inverted_pairs_set:
        pairs_counter_dict[(p[1],p[0])] += 1
You can create a counter dictionary that contains the values of the 'A' and 'B' keys in all your dictionaries:
complements_cnt = {(dct['A'], dct['B']): 0 for dct in list_of_dicts}
Then all you need is to iterate over your dictionaries again and increment the value for the "complements":
for dct in list_of_dicts:
    try:
        complements_cnt[(dct['B'], dct['A'])] += 1
    except KeyError: # in case there is no complement there is nothing to increase
        pass
For example with such a list_of_dicts:
list_of_dicts = [{'A': 1, 'B': 2}, {'A': 2, 'B': 1}, {'A': 1, 'B': 2}]
This gives:
{(1, 2): 1, (2, 1): 2}
Which basically says that the {'A': 1, 'B': 2} has one complement (the second) and {'A': 2, 'B': 1} has two (the first and the last).
The solution is O(n) which should be quite fast even for 100000 dictionaries.
Note: This is quite similar to @debzsud's answer. I hadn't seen it before I posted mine, though. :(
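The same single-pass idea can also be written with collections.Counter. This is a minimal sketch, not a replacement for the answers above; count_complements is a hypothetical helper name. It counts every (A, B) pair once and then looks up how often the reversed pair occurs.

```python
from collections import Counter


def count_complements(list_of_dicts):
    """Map each distinct (A, B) pair to the number of (B, A) entries in the list."""
    # One pass to count how often each pair occurs.
    pair_counts = Counter((d['A'], d['B']) for d in list_of_dicts)
    # For every distinct pair, look up the count of its reversed pair (0 if absent).
    return {(a, b): pair_counts.get((b, a), 0) for (a, b) in pair_counts}


list_of_dicts = [{'A': 1, 'B': 2}, {'A': 2, 'B': 1}, {'A': 1, 'B': 2}]
result = count_complements(list_of_dicts)
print(result)  # {(1, 2): 1, (2, 1): 2}
```

Both passes are linear in the number of entries, so this should scale to a list of 100000 dictionaries.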
I am still not 100% sure what it is you want to do but here is my guess:
pairs_found = 0
for i, dict1 in enumerate(list_of_dicts):
    for j, dict2 in enumerate(list_of_dicts[i+1:]):
        if dict1['A'] == dict2['B'] and dict1['B'] == dict2['A']:
            pairs_found += 1
Note the slicing in the second for loop. This avoids checking pairs that have already been checked (comparing D1 with D2 is enough; there is no need to compare D2 with D1).
This halves the number of comparisons, but it is still O(n**2), so there is probably room for improvement.
You could first create a list with the values of each dictionary as tuples:
example_dict = [{"A": 1, "B": 2}, {"A": 4, "B": 3}, {"A": 5, "B": 1}, {"A": 2, "B": 1}]
dict_values = [tuple(x.values()) for x in example_dict]
Then create a second list with the number of occurrences of each element inverted:
occurrences = [dict_values.count(x[::-1]) for x in dict_values]
Finally, create a dict with dict_values as keys and occurrences as values:
dict(zip(dict_values, occurrences))
Output:
{(1, 2): 1, (2, 1): 1, (4, 3): 0, (5, 1): 0}
For each key, you have the number of inverted keys. You can also create the dictionary on the fly:
occurrences = {x: dict_values.count(x[::-1]) for x in dict_values}

Removing dictionaries from a list on the basis of duplicate value of key

I am new to Python. Suppose I have the following list of dictionaries:
mydictList= [{'a':1,'b':2,'c':3},{'a':2,'b':2,'c':4},{'a':2,'b':3,'c':4}]
From the above list, I want to remove dictionaries that have the same value for key b, keeping only the first one. So the resultant list should be:
mydictList = [{'a':1,'b':2,'c':3},{'a':2,'b':3,'c':4}]
You can create a new dictionary based on the value of b, iterating the mydictList backwards (since you want to retain the first value of b), and get only the values in the dictionary, like this
>>> {item['b'] : item for item in reversed(mydictList)}.values()
[{'a': 1, 'c': 3, 'b': 2}, {'a': 2, 'c': 4, 'b': 3}]
If you are using Python 3.x, you might want to use list function over the dictionary values, like this
>>> list({item['b'] : item for item in reversed(mydictList)}.values())
Note: This solution may not maintain the order of the dictionaries.
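If the original order matters, one option on Python 3.7+ (where dicts preserve insertion order) is dict.setdefault, which stores only the first dictionary seen for each value. This is a minimal sketch under that assumption; first_by_key is a hypothetical helper name.

```python
def first_by_key(dicts, key):
    """Keep the first dictionary for each distinct value of `key`, in order."""
    seen = {}
    for d in dicts:
        # setdefault only stores d if this value has not been seen before.
        seen.setdefault(d[key], d)
    return list(seen.values())


mydictList = [{'a': 1, 'b': 2, 'c': 3}, {'a': 2, 'b': 2, 'c': 4}, {'a': 2, 'b': 3, 'c': 4}]
result = first_by_key(mydictList, 'b')
print(result)  # [{'a': 1, 'b': 2, 'c': 3}, {'a': 2, 'b': 3, 'c': 4}]
```

No reversal is needed here, because the first occurrence is never overwritten.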
First, sort the list by b-values (Python's sorting algorithm is stable, so dictionaries with identical b values will retain their relative order).
from operator import itemgetter
tmp1 = sorted(mydictList, key=itemgetter('b'))
Next, use itertools.groupby to create subiterators that iterate over dictionaries with the same b value.
import itertools
tmp2 = itertools.groupby(tmp1, key=itemgetter('b'))
Finally, create a new list that contains only the first element of each subiterator:
# Each x is a tuple (some-b-value, iterator-over-dicts-with-b-equal-some-b-value)
newdictList = [ next(x[1]) for x in tmp2 ]
Putting it all together:
from itertools import groupby
from operator import itemgetter
by_b = itemgetter('b')
newdictList = [ next(x[1]) for x in groupby(sorted(mydictList, key=by_b), key=by_b) ]
A very straight forward approach can go something like this:
mydictList= [{'a':1,'b':2,'c':3},{'a':2,'b':2,'c':4},{'a':2,'b':3,'c':4}]
b_set = set()
new_list = []
for d in mydictList:
    if d['b'] not in b_set:
        new_list.append(d)
        b_set.add(d['b'])
Result:
>>> new_list
[{'a': 1, 'c': 3, 'b': 2}, {'a': 2, 'c': 4, 'b': 3}]

How to get the index with the key in a dictionary?

I have the key of a python dictionary and I want to get the corresponding index in the dictionary. Suppose I have the following dictionary,
d = { 'a': 10, 'b': 20, 'c': 30}
Is there a combination of python functions so that I can get the index value of 1, given the key value 'b'?
d.??('b')
I know it can be achieved with a loop or lambda (with a loop embedded). Just thought there should be a more straightforward way.
Use OrderedDicts: http://docs.python.org/2/library/collections.html#collections.OrderedDict
>>> from collections import OrderedDict
>>> x = OrderedDict((("a", "1"), ("c", '3'), ("b", "2")))
>>> x["d"] = 4
>>> x.keys().index("d")
3
>>> x.keys().index("c")
1
For those using Python 3
>>> list(x.keys()).index("c")
1
Dictionaries in python (<3.6) have no order. You could use a list of tuples as your data structure instead.
d = { 'a': 10, 'b': 20, 'c': 30}
newd = [('a',10), ('b',20), ('c',30)]
Then this code can be used to find the position(s) of a specific key:
locations = [i for i, t in enumerate(newd) if t[0]=='b']
# locations is now [1]
You can simply send the dictionary to list and then you can select the index of the item you are looking for.
DictTest = {
    '4000':{},
    '4001':{},
    '4002':{},
    '4003':{},
    '5000':{},
}
print(list(DictTest).index('4000'))
No, in older Python versions there is no straightforward way, because dictionaries there do not have a set ordering (since Python 3.7, dicts preserve insertion order).
From the documentation:
Keys and values are listed in an arbitrary order which is non-random, varies across Python implementations, and depends on the dictionary’s history of insertions and deletions.
In other words, the 'index' of b depends entirely on what was inserted into and deleted from the mapping before:
>>> map={}
>>> map['b']=1
>>> map
{'b': 1}
>>> map['a']=1
>>> map
{'a': 1, 'b': 1}
>>> map['c']=1
>>> map
{'a': 1, 'c': 1, 'b': 1}
As of Python 2.7, you could use the collections.OrderedDict() type instead, if insertion order is important to your application.
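On Python 3.7 and later, plain dicts are guaranteed to preserve insertion order, so a key's position is well defined there. The sketch below relies on that guarantee; key_index is a hypothetical helper name. It walks the keys without building an intermediate list, but the lookup is still linear in the size of the dictionary.

```python
def key_index(d, key):
    """Return the insertion position of key in d, or raise KeyError."""
    for i, k in enumerate(d):
        if k == key:
            return i
    raise KeyError(key)


d = {'a': 10, 'b': 20, 'c': 30}
print(key_index(d, 'b'))  # 1
```

If the position is needed often, it may be cheaper to build the mapping once, for example with {k: i for i, k in enumerate(d)}.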
#Creating dictionary
animals = {"Cat" : "Pat", "Dog" : "Pat", "Tiger" : "Wild"}
#Convert dictionary to list (array)
keys = list(animals)
#Printing 1st dictionary key by index
print(keys[0])
#Done :)

Delete an element from a dictionary

How do I delete an item from a dictionary in Python?
Without modifying the original dictionary, how do I obtain another dict with the item removed?
See also How can I remove a key from a Python dictionary? for the specific issue of removing an item (by key) that may not already be present.
The del statement removes an element:
del d[key]
Note that this mutates the existing dictionary, so the contents of the dictionary changes for anybody else who has a reference to the same instance. To return a new dictionary, make a copy of the dictionary:
def removekey(d, key):
    r = dict(d)
    del r[key]
    return r
The dict() constructor makes a shallow copy. To make a deep copy, see the copy module.
Note that making a copy for every dict del/assignment/etc. means you're going from constant time to linear time, and also using linear space. For small dicts, this is not a problem. But if you're planning to make lots of copies of large dicts, you probably want a different data structure, like a HAMT (as described in this answer).
pop mutates the dictionary.
>>> lol = {"hello": "gdbye"}
>>> lol.pop("hello")
'gdbye'
>>> lol
{}
If you want to keep the original you could just copy it.
I think your solution is the best way to do it. But if you want another solution, you can create a new dictionary from the keys of the old dictionary, leaving out your specified key, like this:
>>> a
{0: 'zero', 1: 'one', 2: 'two', 3: 'three'}
>>> {i:a[i] for i in a if i!=0}
{1: 'one', 2: 'two', 3: 'three'}
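The comprehension can be wrapped into a small function. This is a minimal sketch; without_key is a hypothetical name. Unlike del or pop() without a default, it does not raise if the key is missing, it simply returns a copy of everything.

```python
def without_key(d, key):
    """Return a new (shallow) dict containing every item of d except `key`."""
    return {k: v for k, v in d.items() if k != key}


a = {0: 'zero', 1: 'one', 2: 'two', 3: 'three'}
b = without_key(a, 0)
print(b)  # {1: 'one', 2: 'two', 3: 'three'}
print(a)  # {0: 'zero', 1: 'one', 2: 'two', 3: 'three'}
```

The original dictionary is left untouched, but the values are shared between the two dictionaries, as with any shallow copy.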
There're a lot of nice answers, but I want to emphasize one thing.
You can use both dict.pop() method and a more generic del statement to remove items from a dictionary. They both mutate the original dictionary, so you need to make a copy (see details below).
And both of them will raise a KeyError if the key you're providing to them is not present in the dictionary:
key_to_remove = "c"
d = {"a": 1, "b": 2}
del d[key_to_remove] # Raises `KeyError: 'c'`
and
key_to_remove = "c"
d = {"a": 1, "b": 2}
d.pop(key_to_remove) # Raises `KeyError: 'c'`
You have to take care of this:
by capturing the exception:
key_to_remove = "c"
d = {"a": 1, "b": 2}
try:
    del d[key_to_remove]
except KeyError as ex:
    # KeyError has no .message attribute in Python 3; the key is in ex.args[0]
    print("No such key: '%s'" % ex.args[0])
and
key_to_remove = "c"
d = {"a": 1, "b": 2}
try:
    d.pop(key_to_remove)
except KeyError as ex:
    print("No such key: '%s'" % ex.args[0])
by performing a check:
key_to_remove = "c"
d = {"a": 1, "b": 2}
if key_to_remove in d:
    del d[key_to_remove]
and
key_to_remove = "c"
d = {"a": 1, "b": 2}
if key_to_remove in d:
    d.pop(key_to_remove)
but with pop() there's also a much more concise way - provide the default return value:
key_to_remove = "c"
d = {"a": 1, "b": 2}
d.pop(key_to_remove, None) # No `KeyError` here
Unless you use pop() to get the value of the key being removed, you may provide any default, not necessarily None.
It might be that using del with an in check is slightly faster, because pop() is a function call with its own overhead. Usually this does not matter, so pop() with a default value is good enough.
As for the main question, you'll have to make a copy of your dictionary, to save the original dictionary and have a new one without the key being removed.
Some other people here suggest making a full (deep) copy with copy.deepcopy(), which might be overkill; a "normal" (shallow) copy, using copy.copy() or dict.copy(), might be enough. The dictionary keeps a reference to the object as the value for a key, so when you remove a key from a dictionary, this reference is removed, not the object being referenced. The object itself may be removed later automatically by the garbage collector if there are no other references to it in memory. Making a deep copy requires more work than a shallow copy: it takes longer, uses more memory, and gives the GC more to do. Often a shallow copy is enough.
However, if you have mutable objects as dictionary values and plan to modify them later in the returned dictionary without the key, you have to make a deep copy.
With shallow copy:
def get_dict_wo_key(dictionary, key):
    """Returns a **shallow** copy of the dictionary without a key."""
    _dict = dictionary.copy()
    _dict.pop(key, None)
    return _dict
d = {"a": [1, 2, 3], "b": 2, "c": 3}
key_to_remove = "c"
new_d = get_dict_wo_key(d, key_to_remove)
print(d) # {"a": [1, 2, 3], "b": 2, "c": 3}
print(new_d) # {"a": [1, 2, 3], "b": 2}
new_d["a"].append(100)
print(d) # {"a": [1, 2, 3, 100], "b": 2, "c": 3}
print(new_d) # {"a": [1, 2, 3, 100], "b": 2}
new_d["b"] = 2222
print(d) # {"a": [1, 2, 3, 100], "b": 2, "c": 3}
print(new_d) # {"a": [1, 2, 3, 100], "b": 2222}
With deep copy:
from copy import deepcopy
def get_dict_wo_key(dictionary, key):
    """Returns a **deep** copy of the dictionary without a key."""
    _dict = deepcopy(dictionary)
    _dict.pop(key, None)
    return _dict
d = {"a": [1, 2, 3], "b": 2, "c": 3}
key_to_remove = "c"
new_d = get_dict_wo_key(d, key_to_remove)
print(d) # {"a": [1, 2, 3], "b": 2, "c": 3}
print(new_d) # {"a": [1, 2, 3], "b": 2}
new_d["a"].append(100)
print(d) # {"a": [1, 2, 3], "b": 2, "c": 3}
print(new_d) # {"a": [1, 2, 3, 100], "b": 2}
new_d["b"] = 2222
print(d) # {"a": [1, 2, 3], "b": 2, "c": 3}
print(new_d) # {"a": [1, 2, 3, 100], "b": 2222}
The del statement is what you're looking for. If you have a dictionary named foo with a key called 'bar', you can delete 'bar' from foo like this:
del foo['bar']
Note that this permanently modifies the dictionary being operated on. If you want to keep the original dictionary, you'll have to create a copy beforehand:
>>> foo = {'bar': 'baz'}
>>> fu = dict(foo)
>>> del foo['bar']
>>> print(foo)
{}
>>> print(fu)
{'bar': 'baz'}
The dict call makes a shallow copy. If you want a deep copy, use copy.deepcopy.
Here's a method you can copy & paste, for your convenience:
def minus_key(key, dictionary):
    shallow_copy = dict(dictionary)
    del shallow_copy[key]
    return shallow_copy
… how can I delete an item from a dictionary to return a copy (i.e., not modifying the original)?
A dict is the wrong data structure to use for this.
Sure, copying the dict and popping from the copy works, and so does building a new dict with a comprehension, but all that copying takes time—you've replaced a constant-time operation with a linear-time one. And all those copies alive at once take space—linear space per copy.
Other data structures, like hash array mapped tries, are designed for exactly this kind of use case: adding or removing an element returns a copy in logarithmic time, sharing most of its storage with the original. [1]
Of course there are some downsides. Performance is logarithmic rather than constant (although with a large base, usually 32-128). And, while you can make the non-mutating API identical to dict, the "mutating" API is obviously different. And, most of all, there's no HAMT batteries included with Python. [2]
The pyrsistent library is a pretty solid implementation of HAMT-based dict-replacements (and various other types) for Python. It even has a nifty evolver API for porting existing mutating code to persistent code as smoothly as possible. But if you want to be explicit about returning copies rather than mutating, you just use it like this:
>>> from pyrsistent import m
>>> d1 = m(a=1, b=2)
>>> d2 = d1.set('c', 3)
>>> d3 = d1.remove('a')
>>> d1
pmap({'a': 1, 'b': 2})
>>> d2
pmap({'c': 3, 'a': 1, 'b': 2})
>>> d3
pmap({'b': 2})
That d3 = d1.remove('a') is exactly what the question is asking for.
If you've got mutable data structures like dict and list embedded in the pmap, you'll still have aliasing issues—you can only fix that by going immutable all the way down, embedding pmaps and pvectors.
1. HAMTs have also become popular in languages like Scala, Clojure, Haskell because they play very nicely with lock-free programming and software transactional memory, but neither of those is very relevant in Python.
2. In fact, there is an HAMT in the stdlib, used in the implementation of contextvars. The earlier withdrawn PEP explains why. But this is a hidden implementation detail of the library, not a public collection type.
d = {1: 2, '2': 3, 5: 7}
del d[5]
print('d =', d)
Result: d = {1: 2, '2': 3}
Using del you can remove a dict value passing the key of that value
Link:
del method
del dictionary['key_to_del']
Simply call del d['key'].
However, in production, it is always a good practice to check if 'key' exists in d.
if 'key' in d:
    del d['key']
No, there is no other way than
def dictMinus(dct, val):
    copy = dct.copy()
    del copy[val]
    return copy
However, frequently creating copies of only slightly altered dictionaries is probably not a good idea, because it results in comparatively large memory demands. It is usually better to log the old dictionary (if that is even necessary) and then modify it.
# mutate/remove with a default
ret_val = body.pop('key', 5)
# no mutation with a default
ret_val = body.get('key', 5)
Here a top level design approach:
def eraseElement(d,k):
    if isinstance(d, dict):
        if k in d:
            d.pop(k)
            print(d)
        else:
            print("Cannot find matching key")
    else:
        print("Not able to delete")
exp = {'A':34, 'B':55, 'C':87}
eraseElement(exp, 'C')
I pass the dictionary and the key I want to remove into my function. It checks that the first argument is a dictionary and that the key is present; if both hold, it removes the item from the dictionary and prints what is left.
Output: {'B': 55, 'A': 34}
Hope that helps!
>>> def delete_key(dict, key):
...     del dict[key]
...     return dict
...
>>> test_dict = {'one': 1, 'two' : 2}
>>> print(delete_key(test_dict, 'two'))
{'one': 1}
>>>
This doesn't do any error handling; it assumes the key is in the dict. You might want to check that first and raise if it's not. Note that it also mutates the dict that is passed in.
The code snippet below removes a key from a copy of the dictionary; the comments on each line explain the steps.
def execute():
    dic = {'a':1,'b':2}
    dic2 = remove_key_from_dict(dic, 'b')
    print(dic2) # {'a': 1}
    print(dic) # {'a': 1, 'b': 2}
def remove_key_from_dict(dictionary_to_use, key_to_delete):
    copy_of_dict = dict(dictionary_to_use) # creating clone/copy of the dictionary
    if key_to_delete in copy_of_dict: # checking given key is present in the dictionary
        del copy_of_dict[key_to_delete] # deleting the key from the dictionary
    return copy_of_dict # returning the final dictionary
or you can also use dict.pop()
d = {"a": 1, "b": 2}
res = d.pop("c") # Raises `KeyError: 'c'`
print (res) # this line will not execute
or the better approach is
res = d.pop("c", "key not found")
print (res) # key not found
print (d) # {"a": 1, "b": 2}
res = d.pop("b", "key not found")
print (res) # 2
print (d) # {"a": 1}
Solution 1: with deleting
info = {'country': 'Iran'}
country = info.pop('country') if 'country' in info else None
Solution 2: without deleting
info = {'country': 'Iran'}
country = info.get('country') or None
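Here's another variation using a generator expression (Python 2 syntax). Note that it removes items by value, dropping every item whose value is falsy: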
original_d = {'a': None, 'b': 'Some'}
d = dict((k,v) for k, v in original_d.iteritems() if v)
# result should be {'b': 'Some'}
The approach is based on an answer from this post:
Efficient way to remove keys with empty strings from a dict
For Python 3 this is
original_d = {'a': None, 'b': 'Some'}
d = dict((k,v) for k, v in original_d.items() if v)
print(d)
species = {'HI': {'1': (1215.671, 0.41600000000000004),
                  '10': (919.351, 0.0012),
                  '1025': (1025.722, 0.0791),
                  '11': (918.129, 0.0009199999999999999),
                  '12': (917.181, 0.000723),
                  '1215': (1215.671, 0.41600000000000004),
                  '13': (916.429, 0.0005769999999999999),
                  '14': (915.824, 0.000468),
                  '15': (915.329, 0.00038500000000000003)},
           'CII': {'1036': (1036.3367, 0.11900000000000001), '1334': (1334.532, 0.129)}}
The following code iterates over a copy of species['HI'] and removes from the original every transition that is not in trans_HI:
trans_HI=['1025','1215']
for transition in species['HI'].copy().keys():
    if transition not in trans_HI:
        species['HI'].pop(transition)
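The loop above changes species['HI'] in place. If the original should stay untouched, as the question asks, one alternative is to build a filtered copy instead. This is a minimal sketch using a shortened version of the data above; filtered and wanted are hypothetical names.

```python
species = {
    'HI': {
        '10': (919.351, 0.0012),
        '1025': (1025.722, 0.0791),
        '1215': (1215.671, 0.41600000000000004),
    },
    'CII': {'1036': (1036.3367, 0.11900000000000001), '1334': (1334.532, 0.129)},
}
trans_HI = ['1025', '1215']

wanted = set(trans_HI)  # set membership tests are O(1) on average
# Copy the outer dict and replace only the 'HI' entry with a filtered inner dict.
filtered = {
    **species,
    'HI': {k: v for k, v in species['HI'].items() if k in wanted},
}
print(sorted(filtered['HI']))  # ['1025', '1215']
print(sorted(species['HI']))   # ['10', '1025', '1215']
```

The 'CII' entry is shared between species and filtered rather than copied, which is fine as long as it is not mutated later.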
In Python 3, a 'dict' object has no attribute 'remove'.
But with the immutables package, you can apply changes to a Map object that leave it untouched and create new (derived) Maps:
import immutables
map = immutables.Map(a=1, b=2)
map1 = map.delete('b')
print(map, map1)
# will print:
# <immutables.Map({'b': 2, 'a': 1})>
# <immutables.Map({'a': 1})>
You can try my method, in one line:
yourList = [{'key':'key1','version':'1'},{'key':'key2','version':'2'},{'key':'key3','version':'3'}]
resultList = [{'key':dic['key']} for dic in yourList if 'key' in dic]
print(resultList)
