Search for values in dictionaries within dictionaries - python

I am quite new to python and wondering if there is an easy way to find a value for a specific key for dictionaries within dictionaries. I am sure you could write a loop etc but wondering if there is a more direct way especially if there are multiple layers and you don't know upfront where exactly the value sits?
Let's say if I like to find the value for 'Mother'
a = {'family 1':{'Father':'Joe', 'Mother': 'Eva'}}
Thanks a lot.

def recursive_lookup(d, key):
    if key in d:
        return d[key]
    for v in d.values():
        if not isinstance(v, dict):
            continue
        x = recursive_lookup(v, key)
        if x is not None:
            return x
    return None
This can be used as follows:
>>> d = {'family 1': {'Father': 'Joe', 'Mother': 'Eva'}}
>>> recursive_lookup(d, "Mother")
'Eva'
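If the same key can occur at several levels, a generator that yields every match is one possible variation. This is only a sketch and not part of the answer above: the name find_all and the second family in the sample data are made up for illustration, and it only descends into dict values (not lists).

```python
def find_all(d, key):
    """Yield every value stored under `key` at any nesting depth."""
    if key in d:
        yield d[key]
    for v in d.values():
        if isinstance(v, dict):
            # Delegate to the nested dictionary and pass its matches through.
            yield from find_all(v, key)


families = {
    'family 1': {'Father': 'Joe', 'Mother': 'Eva'},
    'family 2': {'Father': 'Bob', 'Mother': 'Ann'},  # hypothetical extra entry
}
print(list(find_all(families, 'Mother')))  # ['Eva', 'Ann']
```

Because it yields instead of returning None for "not found", this variant can also tell a missing key apart from a key whose stored value is None.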

Related

Remove JSON data pairs from nested structure [duplicate]

I had to remove some fields from a dictionary, the keys for those fields are on a list. So I wrote this function:
def delete_keys_from_dict(dict_del, lst_keys):
    """
    Delete the keys present in lst_keys from the dictionary.
    Loops recursively over nested dictionaries.
    """
    dict_foo = dict_del.copy()  # Used as iterator to avoid the 'DictionaryHasChanged' error
    for field in dict_foo.keys():
        if field in lst_keys:
            del dict_del[field]
        if type(dict_foo[field]) == dict:
            delete_keys_from_dict(dict_del[field], lst_keys)
    return dict_del
This code works, but it's not very elegant and I'm sure that there is a better solution.
First off, I think your code is working and not inelegant. There's no immediate reason not to use the code you presented.
There are a few things that could be better though:
Comparing the type
Your code contains the line:
if type(dict_foo[field]) == dict:
That can be definitely improved. Generally (see also PEP8) you should use isinstance instead of comparing types:
if isinstance(dict_foo[field], dict):
However that will also return True if dict_foo[field] is a subclass of dict. If you don't want that, you could also use is instead of ==. That will be marginally (and probably unnoticeable) faster.
If you also want to allow arbitrary dict-like objects you could go a step further and test if it's a collections.abc.MutableMapping. That will be True for dict and dict subclasses and for all mutable mappings that explicitly implement that interface without subclassing dict, for example UserDict:
>>> from collections.abc import MutableMapping  # Python 3.3+
>>> # from collections import MutableMapping  # Python 2.x (this alias was removed in Python 3.10)
>>> # from UserDict import UserDict  # Python 2.x
>>> from collections import UserDict  # Python 3.x
>>> isinstance(UserDict(), MutableMapping)
True
>>> isinstance(UserDict(), dict)
False
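To make the difference between the three checks concrete, here is a small comparison. It is only an illustration under the assumption of Python 3; OrderedDict is used here as an example of a dict subclass and is not mentioned in the question.

```python
from collections import OrderedDict, UserDict
from collections.abc import MutableMapping

plain = {}
od = OrderedDict(a=1)   # a subclass of dict
ud = UserDict(a=1)      # dict-like, but not a dict subclass

samples = (plain, od, ud)
exact_type = [type(x) is dict for x in samples]
is_dict = [isinstance(x, dict) for x in samples]
is_mapping = [isinstance(x, MutableMapping) for x in samples]

print(exact_type)   # [True, False, False]
print(is_dict)      # [True, True, False]
print(is_mapping)   # [True, True, True]
```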
Inplace modification and return value
Typically functions either modify a data structure in place or return a new (modified) data structure. To mention a few examples: list.append, dict.clear, and dict.update all modify the data structure in place and return None. That makes it easier to keep track of what a function does. It is not a hard rule, and there are always valid exceptions. Personally, though, I don't think a function like this needs to be an exception; I would simply remove the return dict_del line and let it implicitly return None, but YMMV.
Removing the keys from the dictionary
You copied the dictionary to avoid problems when you remove key-value pairs during the iteration. However, as already mentioned by another answer you could just iterate over the keys that should be removed and try to delete them:
for key in keys_to_remove:
    try:
        del dictionary[key]
    except KeyError:
        pass
That has the additional advantage that you don't need to nest two loops (which could be slower, especially if the number of keys that need to be removed is very large).
If you don't like empty except clauses you can also use: contextlib.suppress (requires Python 3.4+):
from contextlib import suppress
for key in keys_to_remove:
    with suppress(KeyError):
        del dictionary[key]
Variable names
There are a few variables I would rename because they are just not descriptive or even misleading:
delete_keys_from_dict should probably mention the subdict-handling, maybe delete_keys_from_dict_recursive.
dict_del sounds like a deleted dict. I tend to prefer names like dictionary or dct because the function name already describes what is done to the dictionary.
lst_keys, same there. I'd probably use just keys there. If you want to be more specific something like keys_sequence would make more sense because it accepts any sequence (you just have to be able to iterate over it multiple times), not just lists.
dict_foo, just no...
field isn't really appropriate either, it's a key.
Putting it all together:
As I said before I personally would modify the dictionary in-place and not return the dictionary again. Because of that I present two solutions, one that modifies it in-place but doesn't return anything and one that creates a new dictionary with the keys removed.
The version that modifies in-place (very much like Ned Batchelder's solution):
from collections.abc import MutableMapping  # "from collections import ..." fails on Python 3.10+
from contextlib import suppress
def delete_keys_from_dict(dictionary, keys):
    for key in keys:
        with suppress(KeyError):
            del dictionary[key]
    for value in dictionary.values():
        if isinstance(value, MutableMapping):
            delete_keys_from_dict(value, keys)
And the solution that returns a new object:
from collections.abc import MutableMapping  # "from collections import ..." fails on Python 3.10+
def delete_keys_from_dict(dictionary, keys):
    keys_set = set(keys)  # Just an optimization for the "if key in keys" lookup.
    modified_dict = {}
    for key, value in dictionary.items():
        if key not in keys_set:
            if isinstance(value, MutableMapping):
                modified_dict[key] = delete_keys_from_dict(value, keys_set)
            else:
                modified_dict[key] = value  # or copy.deepcopy(value) if a copy is desired for non-dicts.
    return modified_dict
However, it only makes copies of the dictionaries; the other values are not copied. If you want that, you could easily wrap them in copy.deepcopy (I put a comment in the appropriate place of the code).
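The effect of sharing versus deep-copying the non-dict values can be seen with a mutable value such as a list. The sketch below restates the copying approach under a different, made-up name (without_keys) with a hypothetical deep flag, purely to show the difference; it is not the answer's exact function.

```python
import copy
from collections.abc import MutableMapping


def without_keys(dictionary, keys, deep=False):
    """Return a new dict without `keys`; optionally deep-copy non-dict values."""
    keys = set(keys)
    result = {}
    for key, value in dictionary.items():
        if key in keys:
            continue
        if isinstance(value, MutableMapping):
            result[key] = without_keys(value, keys, deep)
        else:
            # Without deep=True the new dict shares this object with the original.
            result[key] = copy.deepcopy(value) if deep else value
    return result


original = {'tags': ['x'], 'secret': 1, 'inner': {'secret': 2, 'n': 3}}
shared = without_keys(original, ['secret'])
copied = without_keys(original, ['secret'], deep=True)

original['tags'].append('y')   # mutate a list held by the original
print(shared)  # {'tags': ['x', 'y'], 'inner': {'n': 3}}
print(copied)  # {'tags': ['x'], 'inner': {'n': 3}}
```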
def delete_keys_from_dict(dict_del, lst_keys):
    for k in lst_keys:
        try:
            del dict_del[k]
        except KeyError:
            pass
    for v in dict_del.values():
        if isinstance(v, dict):
            delete_keys_from_dict(v, lst_keys)
    return dict_del
Since the question requested an elegant way, I'll submit my general-purpose solution to wrangling nested structures. First, install the boltons utility package with pip install boltons, then:
from boltons.iterutils import remap
data = {'one': 'remains', 'this': 'goes', 'of': 'course'}
bad_keys = set(['this', 'is', 'a', 'list', 'of', 'keys'])
drop_keys = lambda path, key, value: key not in bad_keys
clean = remap(data, visit=drop_keys)
print(clean)
# Output:
{'one': 'remains'}
In short, the remap utility is a full-featured, yet succinct approach to handling real-world data structures which are often nested, and can even contain cycles and special containers.
This page has many more examples, including ones working with much larger objects from Github's API.
It's pure-Python, so it works everywhere, and is fully tested in Python 2.7 and 3.3+. Best of all, I wrote it for exactly cases like this, so if you find a case it doesn't handle, you can bug me to fix it right here.
def delete_keys_from_dict(d, to_delete):
    if isinstance(to_delete, str):
        to_delete = [to_delete]
    if isinstance(d, dict):
        for single_to_delete in set(to_delete):
            if single_to_delete in d:
                del d[single_to_delete]
        for k, v in d.items():
            delete_keys_from_dict(v, to_delete)
    elif isinstance(d, list):
        for i in d:
            delete_keys_from_dict(i, to_delete)
d = {'a': 10, 'b': [{'c': 10, 'd': 10, 'a': 10}, {'a': 10}], 'c': 1 }
delete_keys_from_dict(d, ['a', 'c']) # inplace deletion
print(d)
# Output: {'b': [{'d': 10}, {}]}
This solution works for dict and list in a given nested dict. The input to_delete can be a list of str to be deleted or a single str.
Please note that if you remove the only key in a dict, you will get an empty dict.
I think the following is more elegant:
def delete_keys_from_dict(dict_del, lst_keys):
    if not isinstance(dict_del, dict):
        return dict_del
    return {
        key: value
        for key, value in (
            (key, delete_keys_from_dict(value, lst_keys))
            for key, value in dict_del.items()
        )
        if key not in lst_keys
    }
Example usage:
test_dict_in = {
    1: {1: {0: 2, 3: 4}},
    0: {2: 3},
    2: {5: {0: 4}, 6: {7: 8}},
}
test_dict_out = {
    1: {1: {3: 4}},
    2: {5: {}, 6: {7: 8}},
}
assert delete_keys_from_dict(test_dict_in, [0]) == test_dict_out
Since you already need to loop through every element in the dict, I'd stick with a single loop and just make sure to use a set for looking up the keys to delete:
def delete_keys_from_dict(dict_del, the_keys):
    """
    Delete the keys present in the_keys from the dictionary.
    Loops recursively over nested dictionaries.
    """
    # make sure the_keys is a set to get O(1) lookups
    if type(the_keys) is not set:
        the_keys = set(the_keys)
    # iterate over a snapshot: deleting while iterating over dict_del.items()
    # raises RuntimeError in Python 3
    for k, v in list(dict_del.items()):
        if k in the_keys:
            del dict_del[k]
        if isinstance(v, dict):
            delete_keys_from_dict(v, the_keys)
    return dict_del
This works with dicts containing iterables (list, ...) that may contain dicts. It targets Python 3; for Python 2, unicode should also be excluded from the iteration. There may also be some iterables I'm not aware of that don't work (i.e. that lead to infinite recursion).
from collections.abc import Iterable
def deep_omit(d, keys):
    if isinstance(d, dict):
        for k in keys:
            d.pop(k, None)
        for v in d.values():
            deep_omit(v, keys)
    elif isinstance(d, Iterable) and not isinstance(d, str):
        for e in d:
            deep_omit(e, keys)
    return d
Since nobody has posted an iterative version, here is one that could be useful for someone:
def delete_key_from_dict(adict, key):
    stack = [adict]
    while stack:
        elem = stack.pop()
        if isinstance(elem, dict):
            if key in elem:
                del elem[key]
            for k in elem:
                stack.append(elem[k])
This version is probably what you would push to production. The recursive version is elegant and easy to write but it scales badly (by default Python uses a maximum recursion depth of 1000).
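As a rough check of the recursion-depth point, the sketch below builds a dictionary nested deeper than the interpreter's recursion limit and cleans it with an explicit stack. The function name delete_keys_iterative, the support for several keys, and the handling of lists are additions made for illustration; they are not part of the answer above.

```python
import sys


def delete_keys_iterative(data, keys):
    """Remove every key in `keys` from nested dicts/lists without recursion."""
    keys = set(keys)
    stack = [data]
    while stack:
        elem = stack.pop()
        if isinstance(elem, dict):
            for key in keys:
                elem.pop(key, None)        # ignore keys that are absent
            stack.extend(elem.values())    # visit the remaining children later
        elif isinstance(elem, list):
            stack.extend(elem)


# Build a dictionary nested twice as deep as the recursion limit.
depth = sys.getrecursionlimit() * 2
root = {}
node = root
for _ in range(depth):
    node['drop'] = 1
    node['child'] = {}
    node = node['child']

delete_keys_iterative(root, ['drop'])
print('drop' in root, 'drop' in root['child'])  # False False
```

Note that the example deliberately avoids printing root itself, because building the repr of such a deeply nested structure is recursive as well.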
If you have nested keys as well, here is an elegant solution based on @John La Rooy's answer:
from boltons.iterutils import remap
def sof_solution():
    data = {"user": {"name": "test", "pwd": "******"}, "accounts": ["1", "2"]}
    sensitive = {"user.pwd", "accounts"}
    clean = remap(
        data,
        visit=lambda path, key, value: drop_keys(path, key, value, sensitive)
    )
    print(clean)
def drop_keys(path, key, value, sensitive):
    if len(path) > 0:
        nested_key = f"{'.'.join(path)}.{key}"
        return nested_key not in sensitive
    return key not in sensitive
sof_solution()  # prints {'user': {'name': 'test'}}
Using the awesome code from this post, with one small statement added:
def remove_fields(self, d, list_of_keys_to_remove):
    if not isinstance(d, (dict, list)):
        return d
    if isinstance(d, list):
        return [v for v in (self.remove_fields(v, list_of_keys_to_remove) for v in d) if v]
    return {k: v for k, v in ((k, self.remove_fields(v, list_of_keys_to_remove)) for k, v in d.items()) if k not in list_of_keys_to_remove}
I came here to search for a solution to remove keys from deeply nested Python3 dicts and all solutions seem to be somewhat complex.
Here's a oneliner for removing keys from nested or flat dicts:
nested_dict = {
    "foo": {
        "bar": {
            "foobar": {},
            "shmoobar": {}
        }
    }
}
# {'foo': {'bar': {'foobar': {}, 'shmoobar': {}}}}
nested_dict.get("foo", {}).get("bar", {}).pop("shmoobar", None)
# nested_dict is now {'foo': {'bar': {'foobar': {}}}}
I used .get() to not get KeyError and I also provide empty dict as default value up to the end of the chain. I do pop() for the last element and I provide None as the default there to avoid KeyError.
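When the path is not known until runtime, the chained .get() calls can be wrapped in a small helper. The following is a minimal sketch; pop_path is a hypothetical name, and it additionally treats a non-dict value in the middle of the path as "not found", which the chained one-liner does not do.

```python
def pop_path(d, path, default=None):
    """Remove and return the value at `path` (a sequence of keys), or `default`."""
    *parents, last = path
    for key in parents:
        d = d.get(key)
        if not isinstance(d, dict):
            return default      # the path does not exist (or is not a dict)
    return d.pop(last, default)


nested_dict = {"foo": {"bar": {"foobar": {}, "shmoobar": {}}}}
pop_path(nested_dict, ["foo", "bar", "shmoobar"])
print(nested_dict)                                           # {'foo': {'bar': {'foobar': {}}}}
print(pop_path(nested_dict, ["foo", "missing", "x"], "n/a"))  # n/a
```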

get the value of the deepest nested dict in python

I have a dictionary like
{a:{b:{c:{d:2}}}, e:2, f:2}
How am I supposed to get the value of d and change it in python? Previous questions only showed how to get the level of nesting but didn't show how to get the value. In this case, I do not know the level of nesting of the dict. Any help will be appreciated, thank you!!!
P.S. I am using python 3
How about some recursion?
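def unest(data, key):
    if key in data.keys():
        return data.get(key)
    else:
        for dkey in data.keys():
            if isinstance(data.get(dkey), dict):
                # keep searching the remaining branches if this one has no match
                found = unest(data.get(dkey), key)
                if found is not None:
                    return found
            else:
                continue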
d = {'a':{'b':{'c':{'d':25}}}, 'e':2, 'f':2}
r = unest(d, 'd')
# 25
r = unest(d, 'c')
# {'d': 25}
Edit: As Paul Rooney points out we don't have to use data.keys, we can simply iterate over the keys by doing if key in data
Edit 2: Given the input you provided this would find the deepest level, however this does not cover certain cases, such as where for example, key e is also a nested dictionary that goes n + 1 levels where n is equal to the depth of key a.
def find_deepest(data):
    if not any([isinstance(data.get(k), dict) for k in data]):
        return data
    else:
        for dkey in data:
            if isinstance(data.get(dkey), dict):
                return find_deepest(data.get(dkey))
            else:
                continue
u = find_deepest(d)
# {'d': 25}
I was searching for a similar solution, but I wanted to find the deepest instance of any key within a (possibly) nested dictionary and return its value. This assumes you know what key you want to search for ahead of time. It's unclear to me if that is a valid assumption for the original question, but I hope this will help others if they stumble on this thread.
def find_deepest_item(obj, key, deepest_item=None):
    if key in obj:
        deepest_item = obj[key]
    for k, v in obj.items():
        if isinstance(v, dict):
            item = find_deepest_item(v, key, deepest_item)
            if item is not None:
                deepest_item = item
    return deepest_item
In this example, the 'd' key exists at the top level as well:
d = {'a':{'b':{'c':{'d':25}}}, 'd':3, 'e':2, 'f':2}
v = find_deepest_item(d, 'd')
# 25
Search for 'c':
v = find_deepest_item(d, 'c')
# {'d': 25}
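One caveat: the function above keeps the last match met during traversal, which need not be the most deeply nested one when sibling branches also contain the key. A possible variation that tracks depth explicitly, and returns the containing dict so the value can also be changed (as the question asks), is sketched below. The name find_deepest_holder and the extra 'x' branch in the sample data are made up for illustration.

```python
def find_deepest_holder(obj, key, depth=0):
    """Return (depth, dict) for the most deeply nested dict containing `key`,
    or (-1, None) if the key does not occur anywhere."""
    best = (depth, obj) if key in obj else (-1, None)
    for value in obj.values():
        if isinstance(value, dict):
            candidate = find_deepest_holder(value, key, depth + 1)
            if candidate[0] > best[0]:   # compare depths only
                best = candidate
    return best


d = {'a': {'b': {'c': {'d': 25}}}, 'x': {'d': 7}, 'd': 3}
depth, holder = find_deepest_holder(d, 'd')
print(depth, holder['d'])   # 3 25
holder['d'] = 99            # modifies the original structure in place
print(d['a']['b']['c'])     # {'d': 99}
```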

How to get my hashmap to return multiple values to one key in python

I am at my wit's end. I've been learning Python from Learn Python the Hard Way, and I was learning well until exercise 39 with the dictionary module hashmap.
I can get it to work with just key, value pairs. I want this module to allow me to work with multiple values to each key. There was another question on here like this but it wasn't answered completely, or I did not understand the answer. I want to learn Python, but this is really restricting me from progressing and I need help.
def get(aMap, key, default=None):
    """Gets the value in a bucket for the given key, or the default."""
    bucket = get_bucket(aMap, key)
    i, k, v = get_slot(aMap, key, default=default)
    for k in enumerate(bucket):
        while v:
            return v  # returns the value of the key!!!
This get function from the module does not work. I can get Python to list the entire dictionary with multiple key values using list function, so I know the values are in there through my set function:
def set(aMap, key, value):
    """Sets the key to the value, replacing any existing value."""
    bucket = get_bucket(aMap, key)
    i, k, v = get_slot(aMap, key)
    bucket.append((key, value))
I know I'm supposed to get a value list in there and then loop through the list if the key should contain more than one value.
I am having a hard time putting this in code language. The bucket contains the list for the tuple (k,v) pairs and the k should contain a list of v.
So far I can only get one value to appear and it stops. Why does the while loop stop?
Thank you.
EDIT: For more clarity, I want to return multiple values if I input a single key that has more than one value.
cities = hashmap.new()
hashmap.set(cities, 'MH', 'Mumbai')
hashmap.set(cities, 'MH', 'Pune')
hashmap.set(cities, 'MH', 'Augu')
print ("%s" % hashmap.get(cities, 'MH'))
This should return all those values out.
Keep in mind that the return keyword always terminates the function (the callee) and jumps back to the caller; that's why you cannot return multiple values by using return multiple times.
The workarounds are:
return a list or tuple (as suggested by @terry-jan-reedy)
implement it as an iterator or generator
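Both workarounds can be sketched on a plain list of (key, value) pairs. The function names and the 'GJ' entry are made up for illustration, and the list stands in for a single bucket rather than the book's full hashmap.

```python
def values_as_list(pairs, key):
    """Collect every value stored under `key` and return them in one list."""
    return [v for k, v in pairs if k == key]


def values_as_generator(pairs, key):
    """Yield matching values one at a time; execution resumes after each yield."""
    for k, v in pairs:
        if k == key:
            yield v


bucket = [('MH', 'Mumbai'), ('GJ', 'Surat'), ('MH', 'Pune')]
print(values_as_list(bucket, 'MH'))                    # ['Mumbai', 'Pune']
print(', '.join(values_as_generator(bucket, 'MH')))    # Mumbai, Pune
```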
A hashmap is hashed by key (which gives an efficient lookup), not by value.
If I understand correctly, you seem to fear that the value may be a complex type.
The simplest dictionary is
{"key1":"Value1", "key2":"Value2"}
But you can also have:
{"key1": [1,2,3,"hello"], "key2": A_class_instance, "key3": range(500) }
so a value can be as complex as you want.
Now, if you want to build a hash map with several values per key (for filtering, for instance), you may glance at the following code:
Mydic = {}
for k, v in My_huge_key_value_table_of_couples:
    try:
        Mydic[k].append(v)
    except KeyError:
        Mydic[k] = [v]
This builds a hash map with a list of values under each key.
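The try/except grouping above can also be written with collections.defaultdict from the standard library. This is an alternative sketch, not the answer's code; the sample pairs reuse the city names from the question.

```python
from collections import defaultdict

pairs = [('MH', 'Mumbai'), ('MH', 'Pune'), ('MH', 'Augu')]

by_key = defaultdict(list)      # missing keys start out as an empty list
for k, v in pairs:
    by_key[k].append(v)

print(by_key['MH'])             # ['Mumbai', 'Pune', 'Augu']
print(by_key.get('XX', []))     # [] (get() does not create the key)
```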
Now, to access it with a default value, just use:
def get(amap, key, default=None):
    try:
        return amap[key]
    except KeyError:
        return default
That strictly answers the question.
Edit to match your hashmap edit:
If you write hashmap.new(), you have imported a hashmap module. Is it your own?
I will suppose here that the hashmap is a dictionary:
cities = {}  # create the dictionary
def set(amap, key, value):
    try:
        amap[key].append(value)  # append the element if the key already exists
    except KeyError:
        amap[key] = [value]  # init the first element
set(cities, 'MH', 'Mumbai')
set(cities, 'MH', 'Pune')
set(cities, 'MH', 'Augu')
def get(amap, key, default=None):
    try:
        return ','.join(amap[key])  # return a string, not a list
    except KeyError:
        return default  # in case of no key
print(get(cities, 'MH'))
Supposing hashmap exists ;-) as a tuple of variable-length tuples, in the form:
((k,v),(k1,v1,v2), ... )
Your code is a little hard to follow.
Keep in mind also that a tuple is immutable and is meant for static data, so manipulating tuples the way you want would be tricky.
def get_h(h, k):
    # like that?
    def get_bucket(h, k):
        # assumption: get_bucket returns the tuple (key, value1, value2) as a list: [value1, value2]
        # note that the line bucket.append((key, value)) in your set function makes no sense here:
        # it would produce [value1, value2, (key, value)]
        for f in h:
            if f[0] == k:
                return list(f[1:])
    return get_bucket(h, k)
But be careful: at this level the following prints the list itself (brackets and quotes included), not a plain string:
print ("%s" % get_h(cities, 'MH'))
# ^^^^^^^^^^^^^^^^^^^--------> is a list not a string
print ("%s" % ','.join(get_h(cities, 'MH'))) # is correct
Tip: you should use IDLE to test things; it is often very useful.
I think your first question is how to store multiple values for each key. We can try storing the values as a list. I think the changes you need to make are mainly to the function set().
def set(aMap, key, value):
    """Sets the key to the value, multiple values for each key"""
    #i, k, v = get_slot(aMap, key)
    bucket = get_bucket(aMap, key)
    i, k, vlist = get_slot(aMap, key)
    if i >= 0:
        # if key exists, append value
        vlist.append(value)
    else:
        # the key does not exist, append a key and list. The list stores the values
        bucket.append((key, [value]))
The [value] refers to the list.
To get the values, we can change the get function to:
def get(aMap, key, default=None):
    """Gets the values in a bucket for the given key, or the default."""
    i, k, vlist = get_slot(aMap, key, default=default)
    return vlist
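To try the idea end to end without the book's module, here is a self-contained sketch. The helpers new, get_bucket and get_slot are assumptions written for this example and are only modelled loosely on the names used in the question; set_value and get_values are renamed on purpose so that the built-in set is not shadowed.

```python
def new(num_buckets=256):
    """Create a map as a list of empty buckets (assumed layout)."""
    return [[] for _ in range(num_buckets)]


def get_bucket(a_map, key):
    """Pick the bucket, a list of (key, values) pairs, that `key` hashes into."""
    return a_map[hash(key) % len(a_map)]


def get_slot(a_map, key, default=None):
    """Return (index, key, values) for `key`, or (-1, key, default) if absent."""
    for i, (k, values) in enumerate(get_bucket(a_map, key)):
        if k == key:
            return i, k, values
    return -1, key, default


def set_value(a_map, key, value):
    """Append `value` to the list stored under `key`, creating it if needed."""
    i, _, values = get_slot(a_map, key)
    if i >= 0:
        values.append(value)
    else:
        get_bucket(a_map, key).append((key, [value]))


def get_values(a_map, key, default=None):
    """Return the list of values stored under `key`, or `default`."""
    return get_slot(a_map, key, default)[2]


cities = new()
for city in ('Mumbai', 'Pune', 'Augu'):
    set_value(cities, 'MH', city)
print(get_values(cities, 'MH'))        # ['Mumbai', 'Pune', 'Augu']
print(get_values(cities, 'XX', []))    # []
```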

How can I get my function to return more than 1 result (with for-loops and dictionaries) in Python?

I am trying to find a way to return more than one result for my dictionary in Python:
def transitive_property(d1, d2):
    '''
    Return a new dictionary in which the keys are from d1 and the values are from d2.
    A key-value pair should be included only if the value associated with a key in d1
    is a key in d2.
    >>> transitive_property({'one':1, 'two':2}, {1:1.0})
    {'one':1.0}
    >>> transitive_property({'one':1, 'two':2}, {3:3.0})
    {}
    >>> transitive_property({'one':1, 'two':2, 'three':3}, {1:1.0, 3:3.0})
    {'one':1.0}
    {'three': 3.0}
    '''
    for key, val in d1.items():
        if val in d2:
            return {key:d2[val]}
        else:
            return {}
I've come up with a bunch of different things but they would never pass a few test cases such as the third one (with {'three':3}). This what results when I test using the third case in the doc string:
{'one':1.0}
So since it doesn't return {'three':3.0}, I feel that it only returns a single occurrence within the dictionary, so maybe it's a matter of returning a new dictionary so it could iterate over all of the cases. What would you say on this approach? I'm quite new so I hope the code below makes some sense despite the syntax errors. I really did try.
empty = {}
for key, val in d1.items():
    if val in d2:
        return empty += key, d2[val]
return empty
Your idea almost works but (i) you are returning the value immediately, which exits the function at that point, and (ii) you can't add properties to a dictionary using +=. Instead you need to set its properties using dictionary[key] = value.
result = {}
for key, val in d1.items():
    if val in d2:
        result[key] = d2[val]
return result
This can also be written more succinctly as a dictionary comprehension:
def transitive_property(d1, d2):
    return {key: d2[val] for key, val in d1.items() if val in d2}
You can also have the function return a list of dictionaries with a single key-value pair in each, though I'm not sure why you would want that:
def transitive_property(d1, d2):
    return [{key: d2[val]} for key, val in d1.items() if val in d2]
If return is used, the function terminates for that particular call, so it is impossible to return more than once. You can store the values in a list (or a dictionary) instead and then return that.

Recursive dictionary modification in python

What would be the easiest way to go about turning this dictionary:
{'item':{'w':{'c':1, 'd':2}, 'x':120, 'y':240, 'z':{'a':100, 'b':200}}}
into this one:
{'item':{'y':240, 'z':{'b':200}}}
given only that you need the vars y and b while maintaining the structure of the dictionary? The size or number of items or the depth of the dictionary should not matter, as the one I'm working with can be anywhere from 2 to 5 levels deep.
EDIT: I apologize for the typo earlier. To clarify, I am given an array of strings (e.g. ['y', 'b']) which I need to find in the dictionary, and then keep ONLY 'y' and 'b' as well as any other keys needed to maintain the structure of the original dictionary; in this case, that would be 'z'.
A better example can be found here where I need Chipset Model, VRAM, and Resolution.
In regards to the comment, the input would be the above link as the starting dictionary along with an array of ['chipset model', 'vram', 'resolution'] as the keep list. It should return this:
{'Graphics/Displays':{'NVIDIA GeForce 7300 GT':{'Chipset Model':'NVIDIA GeForce 7300 GT', 'Displays':{'Resolution':'1440 x 900 @ 75 Hz'}, 'VRAM (Total)':'256 Mb'}}}
Assuming that the dictionary you want to assign to an element of a super-dictionary is foo, you could just do this:
my_dictionary['keys']['to']['subdict']=foo
Regarding your edit—where you need to eliminate all keys except those on a certain list—this function should do the trick:
def drop_keys(recursive_dict, keep_list):
    # copy the keys first: deleting while iterating over the live view raises RuntimeError in Python 3
    key_list = list(recursive_dict.keys())
    for key in key_list:
        if type(recursive_dict[key]) is dict:
            drop_keys(recursive_dict[key], keep_list)
        elif key not in keep_list:
            del recursive_dict[key]
Something like this?
d = {'item': {'w': {'c': 1, 'd': 2}, 'x': 120, 'y': 240, 'z': {'a': 100, 'b': 200}}}
l = ['y', 'z']
def do_dict(d, l):
    return {k: v for k, v in d['item'].items() if k in l}
Here's what I arrived at for a recursive solution, which ended up being similar to what @Dan posted:
def recursive_del(d, keep):
    for k in d.copy():
        if type(d[k]) == dict:
            recursive_del(d[k], keep)
            if len(d[k]) == 0:  # all keys were deleted, clean up empty dict
                del d[k]
        elif k not in keep:
            del d[k]
demo:
>>> keepset = {'y','b'}
>>> a = {'item':{'w':{'c':1, 'd':2}, 'x':120, 'y':240, 'z':{'a':100, 'b':200}}}
>>> recursive_del(a,keepset)
>>> a
{'item': {'z': {'b': 200}, 'y': 240}}
The only thing I think he missed is that you will sometimes need to clean up dicts which had all their keys deleted; i.e. without that adjustment you would end up with a vestigial 'w':{} in your example output.
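If the original dictionary should stay untouched, the same pruning (including the clean-up of emptied sub-dicts) can be written so that it builds a new dictionary. This is a sketch of that variation; keep_only is a made-up name and not part of the answer above.

```python
def keep_only(d, keep):
    """Return a new dict holding only keys in `keep`, plus the parent dicts
    needed to reach them; sub-dicts that end up empty are dropped."""
    result = {}
    for key, value in d.items():
        if isinstance(value, dict):
            pruned = keep_only(value, keep)
            if pruned:                  # skip dicts that lost all their keys
                result[key] = pruned
        elif key in keep:
            result[key] = value
    return result


a = {'item': {'w': {'c': 1, 'd': 2}, 'x': 120, 'y': 240, 'z': {'a': 100, 'b': 200}}}
print(keep_only(a, {'y', 'b'}))   # {'item': {'y': 240, 'z': {'b': 200}}}
print('w' in a['item'])           # True (the input is left unchanged)
```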
Using your second example I made something like this, it's not exactly pretty but it should be easy to extend. If your tree starts to get big, you can define some sets of rules to parse the dict.
Each rule here is pretty much "what should I do when I'm in which state".
def rule2(key, value):
    if key == 'VRAM (Total)':
        return (key, value)
    elif key == 'Chipset Model':
        return (key, value)
def rule1(key, value):
    if key == "Graphics/Displays":
        if isinstance(value, dict):
            return (key, recursive_checker(value, rule1))
        else:
            return (key, value)
    else:
        return (key, recursive_checker(value, rule2))
def recursive_checker(dat, rule):
    def inner(item):
        key = item[0]
        value = item[1]
        return rule(key, value)
    # a rule returns None for entries that should be dropped
    return dict(filter(lambda x: x is not None, map(inner, dat.items())))
# Important bits
print(recursive_checker(data, rule1))
In your case, as there are not many states, it isn't worth doing. But if you have multiple cards and don't necessarily know which keys should be traversed, only that you want certain keys from the tree, this method can be used to search the tree easily. It can be applied to many things.
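When the keep list arrives as lowercase terms, as in the question (['chipset model', 'vram', 'resolution']), a single predicate can stand in for the per-state rules. The sketch below assumes that a case-insensitive substring match on string keys is acceptable; keep_where and matches_any are hypothetical names, and the 'Bus' and 'Depth' entries in the sample data are invented to have something to drop.

```python
def keep_where(d, predicate):
    """Return a new dict with only the leaf keys accepted by `predicate`,
    plus the parent dicts needed to reach them."""
    result = {}
    for key, value in d.items():
        if isinstance(value, dict):
            pruned = keep_where(value, predicate)
            if pruned:
                result[key] = pruned
        elif predicate(key):
            result[key] = value
    return result


def matches_any(terms):
    """Build a predicate: True if any term occurs in the key, ignoring case."""
    lowered = [t.lower() for t in terms]
    return lambda key: any(t in key.lower() for t in lowered)


data = {'Graphics/Displays': {'NVIDIA GeForce 7300 GT': {
    'Chipset Model': 'NVIDIA GeForce 7300 GT',
    'Bus': 'PCIe',                                   # invented entry
    'VRAM (Total)': '256 Mb',
    'Displays': {'Resolution': '1440 x 900 @ 75 Hz',
                 'Depth': '32-bit Color'},           # invented entry
}}}

wanted = keep_where(data, matches_any(['chipset model', 'vram', 'resolution']))
print(wanted)
```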
