I have a lot of nested dictionaries, and I am trying to find a certain key nested somewhere inside them,
e.g. a key called "fruit". How do I find the value of this key?
@Håvard's recursive solution is probably going to be OK... unless the level of nesting is too high, in which case you get a RuntimeError: maximum recursion depth exceeded. To remedy that, you can use the usual technique for recursion removal: keep your own stack of items to examine (as a list that's under your control). I.e.:
def find_key_nonrecursive(adict, key):
    stack = [adict]
    while stack:
        d = stack.pop()
        if key in d:
            return d[key]
        # Python 2; use d.items() on Python 3
        for k, v in d.iteritems():
            if isinstance(v, dict):
                stack.append(v)
The logic here is quite close to the recursive answer (except for checking for dict in the right way;-), with the obvious exception that the recursive calls are replaced with a while loop and .pop and .append operations on the explicit-stack list, stack.
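To make the recursion-depth point concrete, here is a minimal Python 3 sketch of the same explicit-stack idea, run against a dict nested deeper than the default recursion limit (about 1000 frames). The name find_key_iterative and the test data are made up for illustration; a naive recursive version would be expected to fail on this input.

```python
def find_key_iterative(adict, key):
    """Depth-first search with an explicit stack instead of recursion."""
    stack = [adict]
    while stack:
        d = stack.pop()
        if key in d:
            return d[key]
        for v in d.values():
            if isinstance(v, dict):
                stack.append(v)
    return None  # key not found anywhere


# Build a dict nested far deeper than the default recursion limit.
deep = {"fruit": "apple"}
for _ in range(5000):
    deep = {"level": deep}

print(find_key_iterative(deep, "fruit"))  # apple
```

The stack lives on the heap as an ordinary list, so the nesting depth is limited only by available memory.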
(Making some wild guesses about your data structure...)
Do it recursively:
def findkey(d, key):
    if key in d: return d[key]
    for k,subdict in d.iteritems():
        val = findkey(subdict, key)
        if val: return val
Just traverse the dictionary and check for the keys (note the comment at the bottom about the "not found" value).
def find_key_recursive(d, key):
    if key in d:
        return d[key]
    for k, v in d.iteritems():
        if type(v) is dict: # Only recurse if we hit a dict value
            value = find_key_recursive(v, key)
            if value:
                return value
    # You may want to return something else than the implicit None here (and change the tests above) if None is an expected value
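One way to act on that comment is a private sentinel object, so that None, 0 or an empty string stored under the key are not mistaken for "not found". This is only a sketch for Python 3; _MISSING, _search and the default parameter are names invented here, not part of the answer above.

```python
_MISSING = object()  # hypothetical sentinel; never identical to a stored value


def _search(d, key):
    """Return the value for key, or _MISSING if it is nowhere in d."""
    if key in d:
        return d[key]
    for v in d.values():
        if isinstance(v, dict):
            found = _search(v, key)
            if found is not _MISSING:
                return found
    return _MISSING


def find_key(d, key, default=_MISSING):
    """Like dict.get for nested dicts; raises KeyError without a default."""
    found = _search(d, key)
    if found is not _MISSING:
        return found
    if default is _MISSING:
        raise KeyError(key)
    return default


data = {"a": {"b": None, "c": 0}}
print(find_key(data, "b"))          # None
print(find_key(data, "c"))          # 0
print(find_key(data, "x", "n/a"))   # n/a
```

Comparing with "is not _MISSING" instead of truthiness is what lets falsy values through.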
Almost 11 years later... based on Alex Martelli's answer, with a slight modification for Python 3 and lists:
def find_key_nonrecursive(adict, key):
    stack = [adict]
    while stack:
        d = stack.pop()
        # List elements may be scalars, so only look up keys in dicts
        if isinstance(d, dict):
            if key in d:
                return d[key]
            stack.extend(d.values())
        elif isinstance(d, list):
            stack.extend(d)
I have written a handy library for this purpose.
It iterates over the AST of the dict and checks whether a particular key is present.
Check it out: https://github.com/Agent-Hellboy/trace-dkey
An example from the README:
>>> from trace_dkey import trace
>>> l={'a':{'b':{'c':{'d':{'e':{'f':1}}}}}}
>>> print(trace(l,'f'))
[['a', 'b', 'c', 'd', 'e', 'f']]
Now you can query it as l['a']['b']['c']['d']['e']['f']
>>> l['a']['b']['c']['d']['e']['f']
1
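For readers who would rather not add a dependency, the same kind of result (a list of key paths) can be sketched in a few lines of plain Python 3. This is not the library's implementation; find_paths and get_by_path are hypothetical helper names used only for this illustration.

```python
def find_paths(d, key, prefix=()):
    """Return every path (as a list of keys) that ends at `key`."""
    paths = []
    for k, v in d.items():
        path = prefix + (k,)
        if k == key:
            paths.append(list(path))
        if isinstance(v, dict):
            paths.extend(find_paths(v, key, path))
    return paths


def get_by_path(d, path):
    """Follow a list of keys down into nested dicts."""
    for k in path:
        d = d[k]
    return d


l = {'a': {'b': {'c': {'d': {'e': {'f': 1}}}}}}
paths = find_paths(l, 'f')
print(paths)                      # [['a', 'b', 'c', 'd', 'e', 'f']]
print(get_by_path(l, paths[0]))   # 1
```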
Related
I am quite new to Python and wondering if there is an easy way to find the value for a specific key in dictionaries within dictionaries. I am sure you could write a loop etc., but I am wondering if there is a more direct way, especially if there are multiple layers and you don't know upfront where exactly the value sits.
Let's say I'd like to find the value for 'Mother':
a = {'family 1':{'Father':'Joe', 'Mother': 'Eva'}}
Thanks a lot.
def recursive_lookup(d, key):
    if key in d:
        return d[key]
    for v in d.values():
        if not isinstance(v, dict):
            continue
        x = recursive_lookup(v, key)
        if x is not None:
            return x
    return None
This can be used as follows:
>>> d = {'family 1': {'Father': 'Joe', 'Mother': 'Eva'}}
>>> recursive_lookup(d, "Mother")
'Eva'
I have a dictionary like
{a:{b:{c:{d:2}}}, e:2, f:2}
How am I supposed to get the value of d and change it in python? Previous questions only showed how to get the level of nesting but didn't show how to get the value. In this case, I do not know the level of nesting of the dict. Any help will be appreciated, thank you!!!
P.S. I am using python 3
How about some recursion?
def unest(data, key):
    if key in data.keys():
        return data.get(key)
    for dkey in data.keys():
        if isinstance(data.get(dkey), dict):
            # Keep searching sibling branches if this one has no match
            found = unest(data.get(dkey), key)
            if found is not None:
                return found
    return None
d = {'a':{'b':{'c':{'d':25}}}, 'e':2, 'f':2}
r = unest(d, 'd')
# 25
r = unest(d, 'c')
# {'d': 25}
Edit: As Paul Rooney points out, we don't have to use data.keys; we can simply test membership with if key in data and iterate over the dict directly.
Edit 2: Given the input you provided, this would find the deepest level. However, it does not cover certain cases, for example where key e is also a nested dictionary that goes n + 1 levels deep, where n is the depth of key a.
def find_deepest(data):
    if not any([isinstance(data.get(k), dict) for k in data]):
        return data
    else:
        for dkey in data:
            if isinstance(data.get(dkey), dict):
                return find_deepest(data.get(dkey))
            else:
                continue
u = find_deepest(d)
# {'d': 25}
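The question also asks how to change the value once it is found. Returning the value alone is not enough for that, because rebinding the returned number does not touch the dictionary. A minimal sketch, assuming the key occurs in plain nested dicts only: return the dict that directly contains the key and assign through it. find_parent is a name made up for this example.

```python
def find_parent(data, key):
    """Return the first dict (searching depth-first) that directly contains `key`."""
    if key in data:
        return data
    for value in data.values():
        if isinstance(value, dict):
            parent = find_parent(value, key)
            if parent is not None:
                return parent
    return None


d = {'a': {'b': {'c': {'d': 2}}}, 'e': 2, 'f': 2}
parent = find_parent(d, 'd')
if parent is not None:
    parent['d'] = 99  # mutates the original nested structure in place
print(d)  # {'a': {'b': {'c': {'d': 99}}}, 'e': 2, 'f': 2}
```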
I was searching for a similar solution, but I wanted to find the deepest instance of any key within a (possibly) nested dictionary and return its value. This assumes you know what key you want to search for ahead of time. It's unclear to me if that is a valid assumption for the original question, but I hope this will help others if they stumble on this thread.
def find_deepest_item(obj, key, deepest_item = None):
    if key in obj:
        deepest_item = obj[key]
    for k, v in obj.items():
        if isinstance(v,dict):
            item = find_deepest_item(v, key, deepest_item)
            if item is not None:
                deepest_item = item
    return deepest_item
In this example, the 'd' key exists at the top level as well:
d = {'a':{'b':{'c':{'d':25}}}, 'd':3, 'e':2, 'f':2}
v = find_deepest_item(d, 'd')
# 25
Search for 'c':
v = find_deepest_item(d, 'c')
# {'d': 25}
I have a dictionary something like this:
{'A': [12343,
2342349,
{'B': [3423,
342349283,
73,
{'C': [-23,
-2342342,
36],
'D': [-2,
-2520206,
63]}],
'E': [-1.5119711426000446,
-1405627.5262916991,
26.110728689275614,
{'F': [-1.7211282679440503,
-1601770.8149339128,
113.9541439658396],
'G': [0.21282003105839883,
196143.28864221353,
-13.954143965839597,
{'H': [0.43384581412426826,
399408,
203],
'I': [-0.22,
-203265,
-103]}]}]}]}
I want a function with which I can get values,
for example traverse(dictionary,'F'), and it should give me the output. I couldn't find any solution. I am able to traverse one or two levels, but not more: either the code breaks or it does not stop.
My current solution, which is not working, is:
def traverse(dictionary,search):
    print "Traversing"
    if isinstance(dictionary,dict):
        keys = dictionary.keys()
        print keys
        if search in keys:
            print "Found",search,"in",keys
            print "Printing found dict",dictionary
            print
            print "Check this out",dictionary.get(search)
            print "Trying to return"
            val=dictionary.get(search)
            return val
        else:
            for key in keys:
                print 'Key >>>>>>>>>',dictionary.get(key)
                print
                temp=dictionary.get(key)[-1]
                print "Temp >>>>>>>",temp
                traverse(temp,search)
Assuming there is only one matching key anywhere in your data structure, you can use a function that recursively traverses the dictionary looking for the key and returns its value if found. If the key is not found, it raises an exception, so that the calling frame can catch it and move on to the next candidate key:
def traverse(dictionary, search):
    for k, v in dictionary.items():
        if k == search:
            return v
        if isinstance(v[-1], dict):
            try:
                return traverse(v[-1], search)
            except ValueError:
                pass
    raise ValueError("Key '%s' not found" % search)
so that traverse(d, 'F') returns (assuming your dict is stored as variable d):
[-1.7211282679440503, -1601770.8149339128, 113.9541439658396]
On the other hand, if there can be multiple matches in the given data, you can make the function yield the value of a matching key instead so that the function becomes a generator that generates sub-lists of 0 to many matching keys:
def traverse(dictionary, search):
    for k, v in dictionary.items():
        if k == search:
            yield v
        if isinstance(v[-1], dict):
            yield from traverse(v[-1], search)
so that list(traverse(d, 'F')) returns:
[[-1.7211282679440503, -1601770.8149339128, 113.9541439658396]]
You need to handle both dictionaries and lists to traverse your structure entirely. You currently only handle dictionaries, but the dictionary with the 'F' key in it is an element of a list object, so you can't find it with your method.
While you can use recursion to make use of the function call stack to track the different levels of your structure, I'd do it iteratively and use a list or collections.deque() (faster for this job) to track the objects still to process. That's more efficient and won't run into recursion depth errors for larger structures.
For example, walking all the elements with a generator function, then yielding each element visited, could be:
from collections import deque

def walk(d):
    queue = deque([d])
    while queue:
        elem = queue.popleft()
        if isinstance(elem, dict):
            queue.extend(elem.values())
        elif isinstance(elem, list):
            queue.extend(elem)
        yield elem
The above uses a queue to process elements breadth first; to use it as a stack, just replace queue.popleft() with queue.pop().
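The difference between the two pop calls is easiest to see on a tiny input. The sketch below is a variant of the walker with a depth_first flag (a parameter added here purely for illustration) that prints the order in which the scalar leaves are visited in each mode.

```python
from collections import deque


def walk(d, depth_first=False):
    """Yield every element; popleft() gives breadth-first, pop() depth-first."""
    pending = deque([d])
    while pending:
        elem = pending.pop() if depth_first else pending.popleft()
        if isinstance(elem, dict):
            pending.extend(elem.values())
        elif isinstance(elem, list):
            pending.extend(elem)
        yield elem


def leaves(d, depth_first=False):
    """Collect only the non-container elements, in visiting order."""
    return [e for e in walk(d, depth_first) if not isinstance(e, (dict, list))]


data = {'a': 1, 'b': {'c': 2}, 'd': 3}
print(leaves(data))                    # [1, 3, 2]
print(leaves(data, depth_first=True))  # [3, 2, 1]
```

Breadth first finishes the top level (1 and 3) before descending into 'b'; the stack version handles the most recently added element first, so it works from the last value backwards and descends into 'b' before reaching 1.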
You can then use the above walker to find your elements:
def search_key(obj, key):
    for elem in walk(obj):
        if isinstance(elem, dict) and key in elem:
            return elem
For your dictionary, the above returns the first dictionary that contains the looked-for key:
>>> search_key(dictionary, 'F')
{'F': [-1.7211282679440503, -1601770.8149339128, 113.9541439658396], 'G': [0.21282003105839883, 196143.28864221353, -13.954143965839597, {'H': [0.43384581412426826, 399408, 203], 'I': [-0.22, -203265, -103]}]}
>>> _['F']
[-1.7211282679440503, -1601770.8149339128, 113.9541439658396]
If you are only ever interested in the value for the given key, just return that, of course:
def search_key(obj, key):
    for elem in walk(obj):
        if isinstance(elem, dict) and key in elem:
            return elem[key]
The following works for a dictionary, but not for an OrderedDict. For an OrderedDict it seems to form an infinite loop. Can you tell me why?
If the function input is a dict it has to return a dict; if the input is an OrderedDict it has to return an OrderedDict.
def key_lower(d):
    """returns d for d or od for od with keys changed to lower case
    """
    for k in d.iterkeys():
        v = d.pop(k)
        if (type(k) == str) and (not k.islower()):
            k = k.lower()
        d[k] = v
    return d
It forms an infinite loop because of the way ordered dictionaries add new members (to the end).
Since you are using iterkeys, you are iterating with a generator. When you assign d[k] = v, you add the new key/value to the end of the dictionary, and because the generator is still live, it keeps producing keys as you keep adding them.
You could fix this in a few ways. One would be to create a new ordered dict from the previous one.
from collections import OrderedDict

def key_lower(d):
    newDict = OrderedDict()
    for k, v in d.iteritems():
        if (isinstance(k, (str, basestring))):
            k = k.lower()
        newDict[k] = v
    return newDict
The other way would be to not use a generator and use keys instead of iterkeys.
As sberry mentioned, the infinite loop happens because you are modifying and reading the dict at the same time.
Probably the simplest solution is to use OrderedDict.keys() instead of OrderedDict.iterkeys():
for k in d.keys():
    v = d.pop(k)
    if (type(k) == str) and (not k.islower()):
        k = k.lower()
    d[k] = v
as the keys are captured directly at the start, they won't get updated as items are changed in the dict.
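The answers above are written for Python 2. On Python 3, where iterkeys and basestring no longer exist, one possible sketch that also meets the "dict in, dict out; OrderedDict in, OrderedDict out" requirement is to build a new mapping of the same type. It assumes the mapping type can be constructed from an iterable of key/value pairs, which holds for dict and OrderedDict but not, for example, for defaultdict.

```python
from collections import OrderedDict


def key_lower(d):
    """Return a new mapping of the same type with string keys lowercased."""
    # type(d) is dict or OrderedDict here; both accept an iterable of pairs.
    return type(d)(
        (k.lower() if isinstance(k, str) else k, v) for k, v in d.items()
    )


od = OrderedDict([('A', 1), ('b', 2), (3, 'x')])
result = key_lower(od)
print(type(result).__name__, list(result.items()))
# OrderedDict [('a', 1), ('b', 2), (3, 'x')]
```

Because nothing is popped or reinserted while iterating, the original order is kept and there is no loop to run away. If two keys differ only by case, the later one wins.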
I'm trying to create a generic function that replaces dots in keys of a nested dictionary. I have a non-generic function that goes 3 levels deep, but there must be a way to do this generic. Any help is appreciated! My code so far:
output = {'key1': {'key2': 'value2', 'key3': {'key4 with a .': 'value4', 'key5 with a .': 'value5'}}}
def print_dict(d):
    new = {}
    for key,value in d.items():
        new[key.replace(".", "-")] = {}
        if isinstance(value, dict):
            for key2, value2 in value.items():
                new[key][key2] = {}
                if isinstance(value2, dict):
                    for key3, value3 in value2.items():
                        new[key][key2][key3.replace(".", "-")] = value3
                else:
                    new[key][key2.replace(".", "-")] = value2
        else:
            new[key] = value
    return new
print print_dict(output)
UPDATE: to answer my own question, I made a solution using json object_hooks:
import json

def remove_dots(obj):
    for key in obj.keys():
        new_key = key.replace(".","-")
        if new_key != key:
            obj[new_key] = obj[key]
            del obj[key]
    return obj

output = {'key1': {'key2': 'value2', 'key3': {'key4 with a .': 'value4', 'key5 with a .': 'value5'}}}
new_json = json.loads(json.dumps(output), object_hook=remove_dots)
print new_json
Yes, there is a better way:
def print_dict(d):
    new = {}
    for k, v in d.iteritems():
        if isinstance(v, dict):
            v = print_dict(v)
        new[k.replace('.', '-')] = v
    return new
(Edit: It's recursion, more on Wikipedia.)
Actually, all of the answers contain a mistake that may lead to wrong types in the result.
I'd take the answer of @ngenain and improve it a bit below.
My solution takes care of types derived from dict (OrderedDict, defaultdict, etc.) and handles not only list but also set and tuple.
I also do a simple type check at the beginning of the function for the most common types to reduce the number of comparisons (this may give a small speedup on large amounts of data).
Works for Python 3. Replace obj.items() with obj.iteritems() for Py2.
def change_keys(obj, convert):
    """
    Recursively goes through the dictionary obj and replaces keys with the convert function.
    """
    if isinstance(obj, (str, int, float)):
        return obj
    if isinstance(obj, dict):
        new = obj.__class__()
        for k, v in obj.items():
            new[convert(k)] = change_keys(v, convert)
    elif isinstance(obj, (list, set, tuple)):
        new = obj.__class__(change_keys(v, convert) for v in obj)
    else:
        return obj
    return new
If I understand the need correctly, most users want to convert the keys to use them with MongoDB, which does not allow dots in key names.
I used the code by @horejsek, but I adapted it to accept nested dictionaries with lists and a function that replaces the string.
I had a similar problem to solve: I wanted to convert keys from the lowercase underscore convention to the camel case convention and vice versa.
def change_dict_naming_convention(d, convert_function):
    """
    Convert a nested dictionary from one convention to another.
    Args:
        d (dict): dictionary (nested or not) to be converted.
        convert_function (func): function that takes the string in one convention and returns it in the other one.
    Returns:
        Dictionary with the new keys.
    """
    new = {}
    for k, v in d.iteritems():
        new_v = v
        if isinstance(v, dict):
            new_v = change_dict_naming_convention(v, convert_function)
        elif isinstance(v, list):
            new_v = list()
            for x in v:
                new_v.append(change_dict_naming_convention(x, convert_function))
        new[convert_function(k)] = new_v
    return new
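The answer does not show what a convert_function might look like. As a rough Python 3 sketch, here are two candidate converters; the names camel_to_snake and snake_to_camel are invented for this example, and the regular expressions are the common two-pass approach, which may not match every naming edge case.

```python
import re


def camel_to_snake(name):
    """userFirstName -> user_first_name"""
    s1 = re.sub(r'(.)([A-Z][a-z]+)', r'\1_\2', name)
    return re.sub(r'([a-z0-9])([A-Z])', r'\1_\2', s1).lower()


def snake_to_camel(name):
    """user_first_name -> userFirstName"""
    head, *rest = name.split('_')
    return head + ''.join(part.capitalize() for part in rest)


print(camel_to_snake('userFirstName'))    # user_first_name
print(snake_to_camel('user_first_name'))  # userFirstName
```

Either function can be passed as the second argument of a key-rewriting function such as the one above.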
Here's a simple recursive solution that deals with nested lists and dictionaries.
def change_keys(obj, convert):
    """
    Recursively goes through the dictionary obj and replaces keys with the convert function.
    """
    if isinstance(obj, dict):
        new = {}
        for k, v in obj.iteritems():
            new[convert(k)] = change_keys(v, convert)
    elif isinstance(obj, list):
        new = []
        for v in obj:
            new.append(change_keys(v, convert))
    else:
        return obj
    return new
You have to remove the original key, but you can't do it while iterating over the dictionary itself because it will throw RuntimeError: dictionary changed size during iteration.
To solve this, iterate over a copy of the original object's keys, but modify the original object:
def change_keys(obj):
    # list(obj) snapshots the keys, so obj can be modified inside the loop
    for k in list(obj):
        if isinstance(obj[k], dict):
            change_keys(obj[k])
        if '.' in k:
            obj[k.replace('.', '$')] = obj[k]
            del obj[k]
>>> foo = {'foo': {'bar': {'baz.121': 1}}}
>>> change_keys(foo)
>>> foo
{'foo': {'bar': {'baz$121': 1}}}
You can dump everything to a JSON string, run the replacement over the whole string, and load the JSON back:
import json

def nested_replace(data, old, new):
    json_string = json.dumps(data)
    replaced = json_string.replace(old, new)
    fixed_json = json.loads(replaced)
    return fixed_json
Or use a one-liner:
def short_replace(data, old, new):
    return json.loads(json.dumps(data).replace(old, new))
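One caveat with replacing on the serialized string is that it touches values as well as keys (and a float such as 1.5 would no longer parse if the dot were replaced). If only keys should change, a possible variation is to keep the JSON round trip but do the replacement in an object_pairs_hook, which json.loads calls with the key/value pairs of each decoded object. replace_in_keys is a hypothetical name for this sketch.

```python
import json


def replace_in_keys(data, old, new):
    """Round-trip through JSON, rewriting only object keys."""
    def hook(pairs):
        # pairs is the list of (key, value) tuples for one decoded JSON object
        return {k.replace(old, new): v for k, v in pairs}
    return json.loads(json.dumps(data), object_pairs_hook=hook)


data = {'a.b': 'v1.0', 'c': [{'d.e': 1}]}

# Plain string replacement also rewrites the value 'v1.0':
print(json.loads(json.dumps(data).replace('.', '-')))
# {'a-b': 'v1-0', 'c': [{'d-e': 1}]}

# The hook-based variant leaves values alone:
print(replace_in_keys(data, '.', '-'))
# {'a-b': 'v1.0', 'c': [{'d-e': 1}]}
```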
While jllopezpino's answer works, it only handles input that starts with a dictionary. Here is mine, which works whether the original variable is a list or a dict.
import re

def fix_camel_cases(data):
    def convert(name):
        # https://stackoverflow.com/questions/1175208/elegant-python-function-to-convert-camelcase-to-snake-case
        s1 = re.sub('(.)([A-Z][a-z]+)', r'\1_\2', name)
        return re.sub('([a-z0-9])([A-Z])', r'\1_\2', s1).lower()

    if isinstance(data, dict):
        new_dict = {}
        for key, value in data.items():
            value = fix_camel_cases(value)
            snake_key = convert(key)
            new_dict[snake_key] = value
        return new_dict
    if isinstance(data, list):
        new_list = []
        for value in data:
            new_list.append(fix_camel_cases(value))
        return new_list
    return data
Here's a one-liner variant of @horejsek's answer using a dict comprehension, for those who prefer it:
def print_dict(d):
    return {k.replace('.', '-'): print_dict(v) for k, v in d.items()} if isinstance(d, dict) else d
I've only tested this in Python 2.7
I am guessing you have the same issue as I have, inserting dictionaries into a MongoDB collection, encountering exceptions when trying to insert dictionaries that have keys with dots (.) in them.
This solution is essentially the same as most other answers here, but it is slightly more compact, and perhaps less readable in that it uses a single statement and calls itself recursively. For Python 3.
def replace_keys(my_dict):
    return { k.replace('.', '(dot)'): replace_keys(v) if type(v) == dict else v for k, v in my_dict.items() }