I have an API that I call that returns a dictionary. Part of that dictionary is itself another dictionary. In that inside dictionary, there are some keys that might not exist, or they might. Those keys could reference another dictionary.
To give an example, say I have the following dictionaries:
dict1 = {'a': {'b': {'c':{'d':3}}}}
dict2 = {'a': {'b': {'f': 2}}}
I would like to write a function that I can pass in the dictionary, and a list of keys that would lead me to the 3 in dict1, and the 2 in dict2. However, it is possible that b and c might not exist in dict1, and b and f might not exist in dict2.
I would like to have a function that I could call like this:
get_value(dict1, ['a', 'b', 'c', 'd'])
and that would return 3, or a default value of 0 if any of the keys is not found.
I know that I can use something like this:
val = dict1.get('a', {}).get('b', {}).get('c', 0)
but that seems to be quite wordy to me.
I can also flatten the dict (see https://stackoverflow.com/a/6043835/1758023), but that can be a bit intensive since my dictionary is actually fairly large, and has about 5 levels of nesting in some keys. And, I only need to get two things from the dict.
Right now I am using the flattenDict function in the SO question, but that seems a bit of overkill for my situation.
You can use a recursive function:
def get_value(mydict, keys):
    if not keys:
        return mydict
    if keys[0] not in mydict:
        return 0
    return get_value(mydict[keys[0]], keys[1:])
If intermediate values can be non-dict types, and not just missing keys, you can handle this like so:
def get_value(mydict, keys):
    if not keys:
        return mydict
    key = keys[0]
    try:
        newdict = mydict[key]
    except (TypeError, KeyError):
        return 0
    return get_value(newdict, keys[1:])
Without recursion, just iterate through the keys and go down one level at a time. Putting that inside a try/except allows you to handle the missing key case. KeyError will be raised when the key is not there, and TypeError will be raised if you hit the "bottom" of the dict too soon and try to apply the [] operator to an int or something.
def get_value(d, ks):
    for k in ks:
        try:
            d = d[k]  # descend one level
        except (KeyError, TypeError):
            return 0  # when any lookup fails, return 0
    return d  # return the final element
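The loop above hard-codes 0 as the fallback. A minimal sketch of the same idea with a caller-supplied fallback is shown here; the `default` parameter and the extra `IndexError` case (for lists indexed out of range) are my additions, not part of the answer above:

```python
def get_value(d, keys, default=0):
    """Walk d along keys and return default if any step fails."""
    for k in keys:
        try:
            d = d[k]  # descend one level
        except (KeyError, TypeError, IndexError):
            # missing key, non-subscriptable value, or list index out of range
            return default
    return d


dict1 = {'a': {'b': {'c': {'d': 3}}}}
dict2 = {'a': {'b': {'f': 2}}}

print(get_value(dict1, ['a', 'b', 'c', 'd']))       # 3
print(get_value(dict2, ['a', 'b', 'f']))            # 2
print(get_value(dict2, ['a', 'b', 'c']))            # 0
print(get_value(dict1, ['a', 'x'], default=None))   # None
```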
Here is a recursive function that should work for general cases
def recursive_get(d, k):
    if len(k) == 0:
        return 0
    elif len(k) == 1:
        return d.get(k[0], 0)
    else:
        value = d.get(k[0], 0)
        if isinstance(value, dict):
            return recursive_get(value, k[1:])
        else:
            return value
It takes arguments of the dict to search, and a list of keys, which it will check one per level
>>> dict1 = {'a': {'b': {'c':{'d':3}}}}
>>> recursive_get(dict1, ['a', 'b', 'c'])
{'d': 3}
>>> dict2 = {'a': {'b': {'f': 2}}}
>>> recursive_get(dict2, ['a', 'b', 'c'])
0
I want to implement a function that:
Given a dictionary and an iterable of keys,
deletes the value accessed by iterating over those keys.
Originally I had tried
def delete_dictionary_value(dict, keys):
    inner_value = dict
    for key in keys:
        inner_value = inner_value[key]
    del inner_value
    return dict
I was thinking that since inner_value refers to the same object as dict, we could mutate dict implicitly by mutating inner_value. However, it seems that assigning inner_value itself creates a new reference (sys.getrefcount(dict[key]) is incremented by assigning inner_value inside the loop), the result being that the local variable is deleted but dict is returned unchanged.
Using inner_value = None has the same effect - presumably because this merely reassigns inner_value.
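The behaviour described above can be reproduced in a few lines. This is only an illustrative sketch with made-up names (`d`, `inner`): `del name` unbinds the name, while `del container[key]` mutates the object the name refers to.

```python
d = {'a': {'b': 1}}

inner = d['a']   # a second name for the same inner dict
del inner        # removes only the name `inner`; the dict itself is untouched
print(d)         # {'a': {'b': 1}}

inner = d['a']
del inner['b']   # mutates the shared inner dict, so d changes too
print(d)         # {'a': {}}
```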
Other people have posted looking for answers to questions like:
how do I ensure that my dictionary includes no values at the key x - which might be a question about recursion for nested dictionaries, or
how do I iterate over values at a given key (different flavours of this question)
how do I access the value of the key as opposed to the keyed value in a dictionary
This is none of the above - I want to remove a specific key,value pair in a dictionary that may be nested arbitrarily deeply - but I always know the path to the key,value pair I want to delete.
The solution I have hacked together so far is:
def delete_dictionary_value(dict, keys):
    base_str = f"del dict"
    property_access_str = ''.join([f"['{i}']" for i in keys])
    return exec(base_str + property_access_str)
Which doesn't feel right.
This also seems like pretty basic functionality - but I've not found an obvious solution. Most likely I am missing something (most likely something blindingly obvious) - please help me see.
If error checking is not required at all, you just need to iterate to the penultimate key and then delete the value from there:
def del_by_path(d, keys):
    for k in keys[:-1]:
        d = d[k]
    return d.pop(keys[-1])
d = {'a': {'b': {'c': {'d': 'Value'}}}}
del_by_path(d, 'abcd')
# 'Value'
print(d)
# {'a': {'b': {'c': {}}}}
Just for fun, here's a more "functional-style" way to do the same thing:
from functools import reduce
def del_by_path(d, keys):
    *init, last = keys
    return reduce(dict.get, init, d).pop(last)
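Both versions above raise an exception when any part of the path is missing. If that is not wanted, one possible variant returns a fallback instead. This is a sketch under my own assumptions: the name `del_by_path_safe` and the `default` parameter are hypothetical, and `keys` is assumed to be non-empty.

```python
def del_by_path_safe(d, keys, default=None):
    """Delete the value at the given key path; return it, or default if the path is absent."""
    *init, last = keys  # assumes keys has at least one element
    for k in init:
        if not isinstance(d, dict) or k not in d:
            return default  # path breaks before the last key
        d = d[k]
    if not isinstance(d, dict):
        return default
    return d.pop(last, default)


d = {'a': {'b': {'c': {'d': 'Value'}}}}
print(del_by_path_safe(d, 'abcd'))   # Value
print(del_by_path_safe(d, 'abxd'))   # None
print(d)                             # {'a': {'b': {'c': {}}}}
```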
Don't use a string-evaluation approach. Instead, iteratively move to the innermost dictionary and delete the key-value pair from it. Here is one possibility:
def delete_key(d, value_path):
    # move to most internal dictionary
    for kp in value_path[:-1]:
        if kp in d and isinstance(d[kp], dict):
            d = d[kp]
        else:
            e_msg = f"Key-value delete-operation failed at key '{kp}'"
            raise Exception(e_msg)
    # last entry check
    lst_kp = value_path[-1]
    if lst_kp not in d:
        e_msg = f"Key-value delete-operation failed at key '{lst_kp}'"
        raise Exception(e_msg)
    # delete key-value of most internal dictionary
    print(f'Value "{d[lst_kp]}" at position "{value_path}" deleted')
    del d[lst_kp]
d = {1: 2, 2:{3: "a"}, 4: {5: 6, 6:{8:9}}}
delete_key(d, [4, 6, 8])
#Value "9" at position "[4, 6, 8]" deleted
#{1: 2, 2: {3: 'a'}, 4: {5: 6, 6: {}}}
I had to remove some fields from a dictionary, the keys for those fields are on a list. So I wrote this function:
def delete_keys_from_dict(dict_del, lst_keys):
    """
    Delete the keys present in lst_keys from the dictionary.
    Loops recursively over nested dictionaries.
    """
    dict_foo = dict_del.copy()  #Used as iterator to avoid the 'DictionaryHasChanged' error
    for field in dict_foo.keys():
        if field in lst_keys:
            del dict_del[field]
        if type(dict_foo[field]) == dict:
            delete_keys_from_dict(dict_del[field], lst_keys)
    return dict_del
This code works, but it's not very elegant and I'm sure that there is a better solution.
First off, I think your code is working and not inelegant. There's no immediate reason not to use the code you presented.
There are a few things that could be better though:
Comparing the type
Your code contains the line:
if type(dict_foo[field]) == dict:
That can definitely be improved. Generally (see also PEP8) you should use isinstance instead of comparing types:
if isinstance(dict_foo[field], dict):
However that will also return True if dict_foo[field] is a subclass of dict. If you don't want that, you could also use is instead of ==. That will be marginally (and probably unnoticeable) faster.
If you also want to allow arbitrary dict-like objects you could go a step further and test if it's a collections.abc.MutableMapping. That will be True for dict and dict subclasses and for all mutable mappings that explicitly implement that interface without subclassing dict, for example UserDict:
>>> from collections.abc import MutableMapping  # Python 3.3+
>>> # from collections import MutableMapping  # Python 2.x
>>> from collections import UserDict  # Python 3.x
>>> # from UserDict import UserDict  # Python 2.x
>>> isinstance(UserDict(), MutableMapping)
True
>>> isinstance(UserDict(), dict)
False
Inplace modification and return value
Typically functions either modify a data structure in place or return a new (modified) data structure. To mention a few examples: list.append, dict.clear, and dict.update all modify the data structure in place and return None. That makes it easier to keep track of what a function does. It's not a hard rule, and there are always valid exceptions, but personally I don't think a function like this needs to be one: I would simply remove the return dict_del line and let it implicitly return None, but YMMV.
Removing the keys from the dictionary
You copied the dictionary to avoid problems when you remove key-value pairs during the iteration. However, as already mentioned by another answer you could just iterate over the keys that should be removed and try to delete them:
for key in keys_to_remove:
    try:
        del dict[key]
    except KeyError:
        pass
That has the additional advantage that you don't need to nest two loops (which could be slower, especially if the number of keys that need to be removed is very large).
If you don't like empty except clauses you can also use: contextlib.suppress (requires Python 3.4+):
from contextlib import suppress
for key in keys_to_remove:
    with suppress(KeyError):
        del dict[key]
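A third spelling of the same idea, not mentioned above, is dict.pop with a default value, which never raises for a missing key. A small sketch with made-up data (`settings` and its contents are purely illustrative):

```python
settings = {'host': 'localhost', 'port': 8080, 'debug': True}
keys_to_remove = ['debug', 'missing']

for key in keys_to_remove:
    # pop with a default returns that default instead of raising KeyError
    settings.pop(key, None)

print(settings)  # {'host': 'localhost', 'port': 8080}
```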
Variable names
There are a few variables I would rename because they are just not descriptive or even misleading:
delete_keys_from_dict should probably mention the subdict-handling, maybe delete_keys_from_dict_recursive.
dict_del sounds like a deleted dict. I tend to prefer names like dictionary or dct because the function name already describes what is done to the dictionary.
lst_keys, same there. I'd probably use just keys there. If you want to be more specific something like keys_sequence would make more sense because it accepts any sequence (you just have to be able to iterate over it multiple times), not just lists.
dict_foo, just no...
field isn't really appropriate either, it's a key.
Putting it all together:
As I said before I personally would modify the dictionary in-place and not return the dictionary again. Because of that I present two solutions, one that modifies it in-place but doesn't return anything and one that creates a new dictionary with the keys removed.
The version that modifies in-place (very much like Ned Batchelder's solution):
from collections.abc import MutableMapping
from contextlib import suppress
def delete_keys_from_dict(dictionary, keys):
    for key in keys:
        with suppress(KeyError):
            del dictionary[key]
    for value in dictionary.values():
        if isinstance(value, MutableMapping):
            delete_keys_from_dict(value, keys)
And the solution that returns a new object:
from collections.abc import MutableMapping
def delete_keys_from_dict(dictionary, keys):
    keys_set = set(keys)  # Just an optimization for the "if key in keys" lookup.
    modified_dict = {}
    for key, value in dictionary.items():
        if key not in keys_set:
            if isinstance(value, MutableMapping):
                modified_dict[key] = delete_keys_from_dict(value, keys_set)
            else:
                modified_dict[key] = value  # or copy.deepcopy(value) if a copy is desired for non-dicts.
    return modified_dict
However, this only copies the dictionaries; the other values are not copied. You could easily wrap them in copy.deepcopy (I put a comment in the appropriate place in the code) if you want that.
def delete_keys_from_dict(dict_del, lst_keys):
    for k in lst_keys:
        try:
            del dict_del[k]
        except KeyError:
            pass
    for v in dict_del.values():
        if isinstance(v, dict):
            delete_keys_from_dict(v, lst_keys)
    return dict_del
Since the question requested an elegant way, I'll submit my general-purpose solution to wrangling nested structures. First, install the boltons utility package with pip install boltons, then:
from boltons.iterutils import remap
data = {'one': 'remains', 'this': 'goes', 'of': 'course'}
bad_keys = set(['this', 'is', 'a', 'list', 'of', 'keys'])
drop_keys = lambda path, key, value: key not in bad_keys
clean = remap(data, visit=drop_keys)
print(clean)
# Output:
{'one': 'remains'}
In short, the remap utility is a full-featured, yet succinct approach to handling real-world data structures which are often nested, and can even contain cycles and special containers.
This page has many more examples, including ones working with much larger objects from Github's API.
It's pure-Python, so it works everywhere, and is fully tested in Python 2.7 and 3.3+. Best of all, I wrote it for exactly cases like this, so if you find a case it doesn't handle, you can bug me to fix it right here.
def delete_keys_from_dict(d, to_delete):
    if isinstance(to_delete, str):
        to_delete = [to_delete]
    if isinstance(d, dict):
        for single_to_delete in set(to_delete):
            if single_to_delete in d:
                del d[single_to_delete]
        for k, v in d.items():
            delete_keys_from_dict(v, to_delete)
    elif isinstance(d, list):
        for i in d:
            delete_keys_from_dict(i, to_delete)
d = {'a': 10, 'b': [{'c': 10, 'd': 10, 'a': 10}, {'a': 10}], 'c': 1 }
delete_keys_from_dict(d, ['a', 'c']) # inplace deletion
print(d)
# {'b': [{'d': 10}, {}]}
This solution works for dicts and lists inside a given nested dict. The input to_delete can be a list of str to be deleted or a single str.
Please note that if you remove the only key in a dict, you will get an empty dict.
I think the following is more elegant:
def delete_keys_from_dict(dict_del, lst_keys):
    if not isinstance(dict_del, dict):
        return dict_del
    return {
        key: value
        for key, value in (
            (key, delete_keys_from_dict(value, lst_keys))
            for key, value in dict_del.items()
        )
        if key not in lst_keys
    }
Example usage:
test_dict_in = {
1: {1: {0: 2, 3: 4}},
0: {2: 3},
2: {5: {0: 4}, 6: {7: 8}},
}
test_dict_out = {
1: {1: {3: 4}},
2: {5: {}, 6: {7: 8}},
}
assert delete_keys_from_dict(test_dict_in, [0]) == test_dict_out
Since you already need to loop through every element in the dict, I'd stick with a single loop and just make sure to use a set for looking up the keys to delete:
def delete_keys_from_dict(dict_del, the_keys):
    """
    Delete the keys present in the_keys from the dictionary.
    Loops recursively over nested dictionaries.
    """
    # make sure the_keys is a set to get O(1) lookups
    if type(the_keys) is not set:
        the_keys = set(the_keys)
    # iterate over a snapshot: deleting while iterating the live view raises RuntimeError
    for k, v in list(dict_del.items()):
        if k in the_keys:
            del dict_del[k]
        if isinstance(v, dict):
            delete_keys_from_dict(v, the_keys)
    return dict_del
This works with dicts containing iterables (list, ...) that may contain dicts. Python 3. For Python 2, unicode should also be excluded from the iteration. Also, there may be some iterables that don't work that I'm not aware of (i.e. that will lead to infinite recursion).
from collections.abc import Iterable
def deep_omit(d, keys):
    if isinstance(d, dict):
        for k in keys:
            d.pop(k, None)
        for v in d.values():
            deep_omit(v, keys)
    elif isinstance(d, Iterable) and not isinstance(d, str):
        for e in d:
            deep_omit(e, keys)
    return d
Since nobody has posted an iterative version, here is one that could be useful for someone:
def delete_key_from_dict(adict, key):
    stack = [adict]
    while stack:
        elem = stack.pop()
        if isinstance(elem, dict):
            if key in elem:
                del elem[key]
            for k in elem:
                stack.append(elem[k])
This version is probably what you would push to production. The recursive version is elegant and easy to write but it scales badly (by default Python uses a maximum recursion depth of 1000).
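The stack-based approach extends naturally to several keys and to dicts nested inside lists. The following is only a sketch under those assumptions; the function name `delete_keys_iterative` is hypothetical and not from any answer above.

```python
def delete_keys_iterative(data, keys):
    """Remove every key in keys from data and from all nested dicts/lists, in place."""
    keys = set(keys)
    stack = [data]
    while stack:
        elem = stack.pop()
        if isinstance(elem, dict):
            for key in keys:
                elem.pop(key, None)      # ignore keys that are not present
            stack.extend(elem.values())  # visit the remaining values later
        elif isinstance(elem, list):
            stack.extend(elem)


d = {'a': 10, 'b': [{'c': 10, 'd': 10, 'a': 10}, {'a': 10}], 'c': 1}
delete_keys_iterative(d, ['a', 'c'])
print(d)  # {'b': [{'d': 10}, {}]}
```

Because the traversal uses an explicit list rather than the call stack, the nesting depth is not bounded by the interpreter's recursion limit.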
If you have nested keys as well, here is an elegant solution based on @John La Rooy's answer:
from boltons.iterutils import remap
def sof_solution():
    data = {"user": {"name": "test", "pwd": "******"}, "accounts": ["1", "2"]}
    sensitive = {"user.pwd", "accounts"}
    clean = remap(
        data,
        visit=lambda path, key, value: drop_keys(path, key, value, sensitive)
    )
    print(clean)
def drop_keys(path, key, value, sensitive):
    if len(path) > 0:
        nested_key = f"{'.'.join(path)}.{key}"
        return nested_key not in sensitive
    return key not in sensitive
sof_solution()  # prints {'user': {'name': 'test'}}
Using the awesome code from this post, with one small statement added:
def remove_fields(self, d, list_of_keys_to_remove):
    if not isinstance(d, (dict, list)):
        return d
    if isinstance(d, list):
        return [v for v in (self.remove_fields(v, list_of_keys_to_remove) for v in d) if v]
    return {k: v for k, v in ((k, self.remove_fields(v, list_of_keys_to_remove)) for k, v in d.items()) if k not in list_of_keys_to_remove}
I came here to search for a solution to remove keys from deeply nested Python3 dicts and all solutions seem to be somewhat complex.
Here's a oneliner for removing keys from nested or flat dicts:
nested_dict = {
    "foo": {
        "bar": {
            "foobar": {},
            "shmoobar": {}
        }
    }
}
# nested_dict is {'foo': {'bar': {'foobar': {}, 'shmoobar': {}}}}
nested_dict.get("foo", {}).get("bar", {}).pop("shmoobar", None)
# nested_dict is now {'foo': {'bar': {'foobar': {}}}}
I used .get() to not get KeyError and I also provide empty dict as default value up to the end of the chain. I do pop() for the last element and I provide None as the default there to avoid KeyError.
Many SO posts show you how to efficiently check the existence of a key in a dictionary, e.g., Check if a given key already exists in a dictionary
How do I do this for a multi level key? For example, if d["a"]["b"] is a dict, how can I check if d["a"]["b"]["c"]["d"] exists without doing something horrendous like this:
if "a" in d and isinstance(d["a"], dict) and "b" in d["a"] and isinstance(d["a"]["b"], dict) and ...
Is there some syntax like
if "a"/"b"/"c"/"d" in d
What I am actually using this for: we have jsons, parsed into dicts using simplejson, that I need to extract values from. Some of these values are nested three and four levels deep; but sometimes the value doesn't exist at all. So I wanted something like:
val = None if not d["a"]["b"]["c"]["d"] else d["a"]["b"]["c"]["d"] #here d["a"]["b"] may not even exist
EDIT: prefer not to crash if some subkey exists but is not a dictionary, e.g, d["a"]["b"] = 5.
Sadly, there isn't any builtin syntax or a common library to query dictionaries like that.
However, I believe the simplest thing you can do (and I think it's efficient enough) is:
d.get("a", {}).get("b", {}).get("c")
Edit: It's not very common, but there is: https://github.com/akesterson/dpath-python
Edit 2: Examples:
>>> d = {"a": {"b": {}}}
>>> d.get("a", {}).get("b", {}).get("c")
>>> d = {"a": {}}
>>> d.get("a", {}).get("b", {}).get("c")
>>> d = {"a": {"b": {"c": 4}}}
>>> d.get("a", {}).get("b", {}).get("c")
4
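Note that the chained .get() calls still fail with AttributeError when an intermediate value exists but is not a dict (for example d["a"]["b"] = 5). A small helper can cover that case as well. This is a sketch, and the names `get_path`, `has_path` and `_MISSING` are hypothetical, not from any library:

```python
_MISSING = object()  # sentinel so that stored None values are not mistaken for "absent"


def get_path(d, keys, default=None):
    """Return the value at the nested key path, or default if the path does not exist."""
    for key in keys:
        if not isinstance(d, dict):
            return default  # hit a non-dict value before the path ended
        d = d.get(key, _MISSING)
        if d is _MISSING:
            return default
    return d


def has_path(d, keys):
    """Return True if every key along the path exists."""
    return get_path(d, keys, _MISSING) is not _MISSING


d = {"a": {"b": 5}}
print(has_path(d, ["a", "b"]))        # True
print(has_path(d, ["a", "b", "c"]))   # False
print(get_path(d, ["a", "b", "c"]))   # None
```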
This probably isn't a good idea and I wouldn't recommend using it in prod. However, if you're just doing it for learning purposes, then the code below might work for you.
def rget(dct, keys, default=None):
    """
    >>> rget({'a': 1}, ['a'])
    1
    >>> rget({'a': {'b': 2}}, ['a', 'b'])
    2
    """
    key = keys.pop(0)  # note: this consumes the caller's list
    try:
        elem = dct[key]
    except KeyError:
        return default
    except TypeError:
        # dct is not a dict (or another subscriptable type);
        # beware of sequences when your keys are integers
        return default
    if not keys:
        return elem
    return rget(elem, keys, default)
UPDATE: I ended up writing my own open-source, pippable library that allows one to do this: https://pypi.python.org/pypi/dictsearch
A non-recursive version, quite similar to @Meitham's solution, which does not mutate the looked-for key. Returns True/False depending on whether the exact structure is present in the source dictionary.
def subkey_in_dict(dct, subkey):
    """ Returns True if the given subkey is present within the structure of the source dictionary, False otherwise.
    The format of the subkey is parent_key:sub_key1:sub_sub_key2 (etc.) - description of the dict structure, where the
    character ":" is the delimiter.
    :param dct: the dictionary to be searched in.
    :param subkey: the target keys structure, which should be present.
    :returns Boolean: is the keys structure present in dct.
    :raises AttributeError: if subkey is not a string.
    """
    keys = subkey.split(':')
    work_dict = dct
    while keys:
        target = keys.pop(0)
        if isinstance(work_dict, dict):
            if target in work_dict:
                if not keys:  # this is the last element in the input, and it is in the dict
                    return True
                else:  # not the last element of subkey, change the temp var
                    work_dict = work_dict[target]
            else:
                return False
        else:
            return False
The structure that is checked is in the form parent_key:sub_key1:sub_sub_key2, where the : char is the delimiter. Obviously - it will match case-sensitively, and will stop (return False) if there's a list within the dictionary.
Sample usage:
dct = {'a': {'b': {'c': {'d': 123}}}}
print(subkey_in_dict(dct, 'a:b:c:d')) # prints True
print(subkey_in_dict(dct, 'a:b:c:d:e')) # False
print(subkey_in_dict(dct, 'a:b:d')) # False
print(subkey_in_dict(dct, 'a:b:c')) # True
This is what I usually use
def key_in_dict(_dict: dict, key_lookup: str, separator='.'):
    """
    Searches for a nested key in a dictionary and returns its value, or None if nothing was found.
    key_lookup must be a string where each key is separated by a given "separator" character, which by default is a dot
    """
    keys = key_lookup.split(separator)
    subdict = _dict
    for k in keys:
        # stop as soon as we reach a non-dict value or a missing key
        subdict = subdict[k] if isinstance(subdict, dict) and k in subdict else None
        if subdict is None:
            break
    return subdict
Returns the value if the key exists, or None if it doesn't:
key_in_dict({'test': {'test': 'found'}}, 'test.test')  # 'found'
key_in_dict({'test': {'test': 'found'}}, 'test.not_a_key')  # None
For single layer dicts like x = {'a': 1, 'b': 2} the problem is easy and answered on SO (Pythonic way to check if two dictionaries have the identical set of keys?) but what about nested dicts?
For example, y = {'a': {'c': 3}, 'b': {'d': 4}} has keys 'a' and 'b' but I want to compare its shape to another nested dict structure like z = {'a': {'c': 5}, 'b': {'d': 6}} which has the same shape and keys (different values is fine) as y. w = {'a': {'c': 3}, 'b': {'e': 4}} would have keys 'a' and 'b' but on the next layer in it differs from y because w['b'] has key 'e' while y['b'] has key 'd'.
Want a short/simple function of two arguments dict_1 and dict_2 and return True if they have same shape and key as described above, and False otherwise.
This provides a copy of both dictionaries stripped of any non-dictionary values, then compares them:
def getshape(d):
    if isinstance(d, dict):
        return {k:getshape(d[k]) for k in d}
    else:
        # Replace all non-dict values with None.
        return None
def shape_equal(d1, d2):
    return getshape(d1) == getshape(d2)
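Applied to the y, z and w dictionaries from the question, the approach behaves as follows. The functions are repeated here only so the snippet runs on its own:

```python
def getshape(d):
    """Return a copy of d with every non-dict value replaced by None."""
    if isinstance(d, dict):
        return {k: getshape(v) for k, v in d.items()}
    return None


def shape_equal(d1, d2):
    return getshape(d1) == getshape(d2)


y = {'a': {'c': 3}, 'b': {'d': 4}}
z = {'a': {'c': 5}, 'b': {'d': 6}}
w = {'a': {'c': 3}, 'b': {'e': 4}}

print(shape_equal(y, z))  # True
print(shape_equal(y, w))  # False
```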
I liked nneonneo's answer, and it should be relatively fast, but I want something that didn't create extra unnecessary data structures (I've been learning about memory fragmentation in Python). This may or may not be as fast or faster.
(EDIT: Spoiler!)
Faster by a decent enough margin to make it preferable in all cases, see the other analysis answer.
But if dealing with lots and lots of these and having memory problems, it is likely to be preferable to do it this way.
Implementation
This should work in Python 3, maybe 2.7 if you translate keys to viewkeys, definitely not 2.6. It relies on the set-like view of the keys that dicts have:
def sameshape(d1, d2):
    if isinstance(d1, dict):
        if isinstance(d2, dict):
            # then we have shapes to check
            return (d1.keys() == d2.keys() and
                    # so the keys are all the same
                    all(sameshape(d1[k], d2[k]) for k in d1.keys()))
                    # thus all values will be tested in the same way.
        else:
            return False  # d1 is a dict, but d2 isn't
    else:
        return not isinstance(d2, dict)  # if d2 is a dict, False, else True.
Edit updated to reduce redundant type check, now even more efficient.
Testing
To check:
print('expect false:')
print(sameshape({'foo':{'bar':{None:None}}}, {'foo':{'bar':{None: {} }}}))
print('expect true:')
print(sameshape({'foo':{'bar':{None:None}}}, {'foo':{'bar':{None:'foo'}}}))
print('expect false:')
print(sameshape({'foo':{'bar':{None:None}}}, {'foo':{'bar':{None:None, 'baz':'foo'}}}))
Prints:
expect false:
False
expect true:
True
expect false:
False
To profile the two currently existing answers, first let's import timeit:
import timeit
Now we need to setup the code:
setup = '''
import copy
def getshape(d):
    if isinstance(d, dict):
        return {k:getshape(d[k]) for k in d}
    else:
        # Replace all non-dict values with None.
        return None
def nneo_shape_equal(d1, d2):
    return getshape(d1) == getshape(d2)
def aaron_shape_equal(d1,d2):
    if isinstance(d1, dict) and isinstance(d2, dict):
        return (d1.keys() == d2.keys() and
                all(aaron_shape_equal(d1[k], d2[k]) for k in d1.keys()))
    else:
        return not (isinstance(d1, dict) or isinstance(d2, dict))
class Vividict(dict):
    def __missing__(self, key):
        value = self[key] = type(self)()
        return value
d = Vividict()
d['foo']['bar']
d['foo']['baz']
d['fizz']['buzz']
d['primary']['secondary']['tertiary']['quaternary']
d0 = copy.deepcopy(d)
d1 = copy.deepcopy(d)
d1['primary']['secondary']['tertiary']['extra']
# d == d0 is True
# d == d1 is now False!
'''
And now let's test the two options out, first with Python 3.3!
>>> timeit.repeat('nneo_shape_equal(d0, d); nneo_shape_equal(d1,d)', setup=setup)
[36.784881490981206, 36.212246977956966, 36.29759863798972]
And it looks like my solution takes 2/3rd to 3/4th the time, making it more than 1.25 times as fast.
>>> timeit.repeat('aaron_shape_equal(d0, d); aaron_shape_equal(d1,d)', setup=setup)
[26.838892214931548, 26.61037168605253, 27.170253590098582]
And on a version of Python 3.4 (an alpha) that I compiled myself:
>>> timeit.repeat('nneo_shape_equal(d0, d); nneo_shape_equal(d1,d)', setup=setup)
[272.5629618819803, 273.49581588001456, 270.13374400604516]
>>> timeit.repeat('aaron_shape_equal(d0, d); aaron_shape_equal(d1,d)', setup=setup)
[214.87033835891634, 215.69223327597138, 214.85333003790583]
Still about the same ratio. The time difference between the two is likely because I self-compiled 3.4 without optimizations.
Thanks to all readers!
What would be the easiest way to go about turning this dictionary:
{'item':{'w':{'c':1, 'd':2}, 'x':120, 'y':240, 'z':{'a':100, 'b':200}}}
into this one:
{'item':{'y':240, 'z':{'b':200}}}
given only that you need the vars y and b while maintaining the structure of the dictionary? The size or number of items or the depth of the dictionary should not matter, as the one I'm working with can be anywhere from 2 to 5 levels deep.
EDIT: I apologize for the typo earlier. To clarify, I am given an array of strings (e.g. ['y', 'b']) which I need to find in the dictionary, and then keep ONLY 'y' and 'b' as well as any other keys needed to maintain the structure of the original dictionary; in this case, that would be 'z'.
A better example can be found here where I need Chipset Model, VRAM, and Resolution.
In regards to the comment, the input would be the above link as the starting dictionary along with an array of ['chipset model', 'vram', 'resolution'] as the keep list. It should return this:
{'Graphics/Displays':{'NVIDIA GeForce 7300 GT':{'Chipset Model':'NVIDIA GeForce 7300 GT', 'Displays':{'Resolution':'1440 x 900 @ 75 Hz'}, 'VRAM (Total)':'256 Mb'}}}
Assuming that the dictionary you want to assign to an element of a super-dictionary is foo, you could just do this:
my_dictionary['keys']['to']['subdict']=foo
Regarding your edit—where you need to eliminate all keys except those on a certain list—this function should do the trick:
def drop_keys(recursive_dict, keep_list):
    # copy the keys so that entries can be deleted while looping
    key_list = list(recursive_dict.keys())
    for key in key_list:
        if type(recursive_dict[key]) is dict:
            drop_keys(recursive_dict[key], keep_list)
        elif key not in keep_list:
            del recursive_dict[key]
Something like this?
d = {'item': {'w': {'c': 1, 'd': 2}, 'x': 120, 'y': 240, 'z': {'a': 100, 'b': 200}}}
l = ['y', 'z']
def do_dict(d, l):
    return {k: v for k, v in d['item'].items() if k in l}
Here's what I arrived at for a recursive solution, which ended up being similar to what @Dan posted:
def recursive_del(d,keep):
    for k in d.copy():
        if type(d[k]) == dict:
            recursive_del(d[k],keep)
            if len(d[k]) == 0:  #all keys were deleted, clean up empty dict
                del d[k]
        elif k not in keep:
            del d[k]
demo:
>>> keepset = {'y','b'}
>>> a = {'item':{'w':{'c':1, 'd':2}, 'x':120, 'y':240, 'z':{'a':100, 'b':200}}}
>>> recursive_del(a,keepset)
>>> a
{'item': {'z': {'b': 200}, 'y': 240}}
The only thing I think he missed is that you will sometimes need to clean up dicts which had all their keys deleted; i.e. without that adjustment you would end up with a vestigial 'w':{} in your example output.
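If the original dictionary must stay intact, the same pruning can be written so that it builds a new dictionary instead of deleting in place. This is a sketch; the name `keep_only` is hypothetical and not taken from the answers above.

```python
def keep_only(d, keep):
    """Return a new dict holding only the leaf keys in keep, plus the parents needed to reach them."""
    result = {}
    for key, value in d.items():
        if isinstance(value, dict):
            pruned = keep_only(value, keep)
            if pruned:  # skip sub-dicts that ended up empty
                result[key] = pruned
        elif key in keep:
            result[key] = value
    return result


a = {'item': {'w': {'c': 1, 'd': 2}, 'x': 120, 'y': 240, 'z': {'a': 100, 'b': 200}}}
print(keep_only(a, {'y', 'b'}))  # {'item': {'y': 240, 'z': {'b': 200}}}
```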
Using your second example I made something like this, it's not exactly pretty but it should be easy to extend. If your tree starts to get big, you can define some sets of rules to parse the dict.
Each rule here is pretty much "what should I do when I'm in which state".
def rule2(key, value):
    if key == 'VRAM (Total)':
        return (key, value)
    elif key == 'Chipset Model':
        return (key, value)
def rule1(key, value):
    if key == "Graphics/Displays":
        if isinstance(value, dict):
            return (key, recursive_checker(value, rule1))
        else:
            return (key, value)
    else:
        return (key, recursive_checker(value, rule2))
def recursive_checker(dat, rule):
    def inner(item):
        key = item[0]
        value = item[1]
        return rule(key, value)
    # rules return None for entries that should be dropped
    return dict(filter(lambda x: x is not None, map(inner, dat.items())))
# Important bits
print(recursive_checker(data, rule1))
In your case, as there are not many states, it isn't worth doing, but it helps when you have multiple cards and don't necessarily know which keys should be traversed, only that you want certain keys from the tree. This method can be used to search the tree easily, and it can be applied to many things.