Is there a more readable way to check whether a key buried in a dict exists, without checking each level independently?
Let's say I need to get this value from a nested object (example taken from Wikidata):
x = s['mainsnak']['datavalue']['value']['numeric-id']
To make sure that this does not end with a runtime error it is necessary to either check every level like so:
if 'mainsnak' in s and 'datavalue' in s['mainsnak'] and 'value' in s['mainsnak']['datavalue'] and 'numeric-id' in s['mainsnak']['datavalue']['value']:
    x = s['mainsnak']['datavalue']['value']['numeric-id']
The other way I can think of to solve this is to wrap it in a try/except construct, which I feel is also rather awkward for such a simple task.
I am looking for something like:
x = exists(s['mainsnak']['datavalue']['value']['numeric-id'])
which returns True if all levels exist.
To be brief: with Python you must trust that it is easier to ask for forgiveness than permission (EAFP):
try:
    x = s['mainsnak']['datavalue']['value']['numeric-id']
except KeyError:
    pass
The answer
Here is how I deal with nested dict keys:
def keys_exists(element, *keys):
    '''
    Check if *keys (nested) exists in `element` (dict).
    '''
    if not isinstance(element, dict):
        raise AttributeError('keys_exists() expects dict as first argument.')
    if len(keys) == 0:
        raise AttributeError('keys_exists() expects at least two arguments, one given.')
    _element = element
    for key in keys:
        try:
            _element = _element[key]
        except KeyError:
            return False
    return True
Example:
data = {
    "spam": {
        "egg": {
            "bacon": "Well..",
            "sausages": "Spam egg sausages and spam",
            "spam": "does not have much spam in it"
        }
    }
}
print 'spam (exists): {}'.format(keys_exists(data, "spam"))
print 'spam > bacon (do not exists): {}'.format(keys_exists(data, "spam", "bacon"))
print 'spam > egg (exists): {}'.format(keys_exists(data, "spam", "egg"))
print 'spam > egg > bacon (exists): {}'.format(keys_exists(data, "spam", "egg", "bacon"))
Output:
spam (exists): True
spam > bacon (do not exists): False
spam > egg (exists): True
spam > egg > bacon (exists): True
It loops through the given element, testing each key in the given order.
I prefer this to all the variable.get('key', {}) methods I found because it follows EAFP.
The function expects to be called like: keys_exists(dict_element_to_test, 'key_level_0', 'key_level_1', 'key_level_n', ..). At least two arguments are required, the element and one key, but you can add as many keys as you want.
If you need to use kind of map, you can do something like:
expected_keys = ['spam', 'egg', 'bacon']
keys_exists(data, *expected_keys)
You could use .get with defaults:
s.get('mainsnak', {}).get('datavalue', {}).get('value', {}).get('numeric-id')
but this is almost certainly less clear than using try/except.
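One caveat with chained .get calls, sketched below under the assumption that an intermediate key can be present but hold a non-dict value such as None (the sample data and the helper name numeric_id_via_get are made up for illustration): the {} default is only used when the key is missing, so the next .get is called on None and raises AttributeError.

```python
# Hypothetical data: 'datavalue' exists, but its value is None rather than a dict.
s = {"mainsnak": {"datavalue": None}}

def numeric_id_via_get(obj):
    # The {} default only applies when a key is absent, not when its value is None.
    return obj.get("mainsnak", {}).get("datavalue", {}).get("value", {}).get("numeric-id")

try:
    numeric_id_via_get(s)
    raised = False
except AttributeError:
    # None has no .get method, so the chain fails here.
    raised = True

print(raised)
```

So a .get chain may still need a try/except around it if the data can contain such values.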
Python 3.8 +
dictionary = {
    "main_key": {
        "sub_key": "value",
    },
}
if sub_key_value := dictionary.get("main_key", {}).get("sub_key"):
    print(f"The key 'sub_key' exists in dictionary[main_key] and its value is {sub_key_value}")
else:
    print("Key 'sub_key' doesn't exist or its value is falsy")
Extra
A little but important clarification.
In the previous code block, we verify not only that a key exists in a dictionary but also that its value is truthy.
Most of the time, this is what people are really looking for, and I think this is what the OP really wants. However, it is not really the most "correct" answer: if the key exists but its value is falsy (for example False), the above code block will tell us that the key does not exist, which is not true.
So, I leave here a more correct answer:
dictionary = {
    "main_key": {
        "sub_key": False,
    },
}
if "sub_key" in dictionary.get("main_key", {}):
    print(f"The key 'sub_key' exists in dictionary[main_key] and its value is {dictionary['main_key']['sub_key']}")
else:
    print("Key 'sub_key' doesn't exist")
Try/except seems to be the most Pythonic way to do that.
The following recursive function should work (returns None if one of the keys was not found in the dict):
def exists(obj, chain):
    _key = chain.pop(0)
    if _key in obj:
        return exists(obj[_key], chain) if chain else obj[_key]
myDict = {
    'mainsnak': {
        'datavalue': {
            'value': {
                'numeric-id': 1
            }
        }
    }
}
result = exists(myDict, ['mainsnak', 'datavalue', 'value', 'numeric-id'])
print(result)
>>> 1
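One thing to be aware of with the recursive function: chain.pop(0) consumes the list passed in, so the caller's path is empty afterwards. A minimal sketch of a wrapper that avoids this by passing a copy (exists_keep_path is a hypothetical name, not part of the answer):

```python
def exists(obj, chain):
    # Same recursive function as in the answer; it pops keys off `chain`.
    _key = chain.pop(0)
    if _key in obj:
        return exists(obj[_key], chain) if chain else obj[_key]

def exists_keep_path(obj, chain):
    # Hypothetical wrapper: hand a copy to exists() so the caller's list survives.
    return exists(obj, list(chain))

my_dict = {'mainsnak': {'datavalue': {'value': {'numeric-id': 1}}}}
path = ['mainsnak', 'datavalue', 'value', 'numeric-id']
result = exists_keep_path(my_dict, path)
print(result, path)
```

Also note that a stored value of None cannot be told apart from a missing key with this return convention.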
I suggest using python-benedict, a solid Python dict subclass with full keypath support and many utility methods.
You just need to cast your existing dict:
s = benedict(s)
Now your dict has full keypath support and you can check whether the key exists in the Pythonic way, using the in operator:
if 'mainsnak.datavalue.value.numeric-id' in s:
    # do stuff
Here are the library repository and the documentation:
https://github.com/fabiocaccamo/python-benedict
Note: I am the author of this project
You can use pydash to check whether a key exists: http://pydash.readthedocs.io/en/latest/api.html#pydash.objects.has
Or to get the value (you can even set a default to return if it doesn't exist): http://pydash.readthedocs.io/en/latest/api.html#pydash.objects.get
Here is an example:
>>> get({'a': {'b': {'c': [1, 2, 3, 4]}}}, 'a.b.c[1]')
2
The try/except way is the cleanest, no contest. However, it also counts as an exception in my IDE, which halts execution while debugging.
Furthermore, I do not like using exceptions as in-method control statements, which is essentially what is happening with the try/except.
Here is a short solution which does not use recursion, and supports a default value:
def chained_dict_lookup(lookup_dict, keys, default=None):
    _current_level = lookup_dict
    for key in keys:
        if key in _current_level:
            _current_level = _current_level[key]
        else:
            return default
    return _current_level
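A usage sketch for the function above. Since a stored value may itself be None, a private sentinel object (here _MISSING, a name introduced only for this example) can be passed as the default to tell "missing" apart from "present but None":

```python
def chained_dict_lookup(lookup_dict, keys, default=None):
    # Same function as in the answer, repeated so the example is self-contained.
    _current_level = lookup_dict
    for key in keys:
        if key in _current_level:
            _current_level = _current_level[key]
        else:
            return default
    return _current_level

_MISSING = object()  # hypothetical sentinel, unique to this example
data = {"a": {"b": None}}

# 'b' is present (its value happens to be None); 'c' is absent.
b_exists = chained_dict_lookup(data, ["a", "b"], default=_MISSING) is not _MISSING
c_exists = chained_dict_lookup(data, ["a", "c"], default=_MISSING) is not _MISSING
print(b_exists, c_exists)
```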
The accepted answer is a good one, but here is another approach. It's a little less typing and a little easier on the eyes (in my opinion) if you end up having to do this a lot. It also doesn't require any additional package dependencies like some of the other answers. Have not compared performance.
import functools
def haskey(d, path):
    try:
        functools.reduce(lambda x, y: x[y], path.split("."), d)
        return True
    except KeyError:
        return False
# Throwing in this approach for nested get for the heck of it...
def getkey(d, path, *default):
    try:
        return functools.reduce(lambda x, y: x[y], path.split("."), d)
    except KeyError:
        if default:
            return default[0]
        raise
Usage:
data = {
    "spam": {
        "egg": {
            "bacon": "Well..",
            "sausages": "Spam egg sausages and spam",
            "spam": "does not have much spam in it",
        }
    }
}
(Pdb) haskey(data, "spam")
True
(Pdb) haskey(data, "spamw")
False
(Pdb) haskey(data, "spam.egg")
True
(Pdb) haskey(data, "spam.egg3")
False
(Pdb) haskey(data, "spam.egg.bacon")
True
Original inspiration from the answers to this question.
EDIT: a comment pointed out that this only works with string keys. A more generic approach would be to accept an iterable path param:
def haskey(d, path):
    try:
        functools.reduce(lambda x, y: x[y], path, d)
        return True
    except KeyError:
        return False
(Pdb) haskey(data, ["spam", "egg"])
True
I had the same problem, and a recent Python library popped up:
https://pypi.org/project/dictor/
https://github.com/perfecto25/dictor
So in your case:
from dictor import dictor
x = dictor(s, 'mainsnak.datavalue.value.numeric-id')
Personal note:
I don't like 'dictor' name, since it doesn't hint what it actually does. So I'm using it like:
from dictor import dictor as extract
x = extract(s, 'mainsnak.datavalue.value.numeric-id')
I couldn't come up with a better name than extract. Feel free to comment if you come up with a more suitable one. safe_get and robust_get didn't feel right for my case.
Another way:
def does_nested_key_exists(dictionary, nested_key):
    exists = nested_key in dictionary
    if not exists:
        for key, value in dictionary.items():
            if isinstance(value, dict):
                exists = exists or does_nested_key_exists(value, nested_key)
    return exists
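Note that this function looks for the key at any depth instead of following a specific path. A small usage sketch (the sample dict is made up for illustration):

```python
def does_nested_key_exists(dictionary, nested_key):
    # Same function as in the answer, repeated so the example is self-contained.
    exists = nested_key in dictionary
    if not exists:
        for key, value in dictionary.items():
            if isinstance(value, dict):
                exists = exists or does_nested_key_exists(value, nested_key)
    return exists

# Hypothetical sample shaped like the Wikidata snippet from the question.
s = {'mainsnak': {'datavalue': {'value': {'numeric-id': 1}}}, 'type': 'statement'}

found_deep = does_nested_key_exists(s, 'numeric-id')   # found three levels down
found_missing = does_nested_key_exists(s, 'missing')   # not present anywhere
print(found_deep, found_missing)
```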
The selected answer works well on the happy path, but there are a couple of obvious issues to me. If you were to search for ["spam", "egg", "bacon", "pizza"], it would throw a TypeError due to trying to index "Well.." using the string "pizza". Likewise, if you replaced "pizza" with 2, it would use that to get index 2 from "Well..".
Selected Answer Output Issues:
data = {
    "spam": {
        "egg": {
            "bacon": "Well..",
            "sausages": "Spam egg sausages and spam",
            "spam": "does not have much spam in it"
        }
    }
}
print(keys_exists(data, "spam", "egg", "bacon", "pizza"))
>> TypeError: string indices must be integers
print(keys_exists(data, "spam", "egg", "bacon", 2))
>> True  (index 2 of "Well.." is "l", so the lookup wrongly succeeds)
I also feel that using try except can be a crutch that we might too quickly rely on. Since I believe we already need to check for the type, might as well remove the try except.
Solution:
Undefined = 'INVALID'  # assumed sentinel: the answer never defines Undefined, and the output below prints INVALID
def dict_value_or_default(element, keys=(), default=Undefined):
    '''
    Check if keys (nested) exists in `element` (dict).
    Returns value if last key exists, else returns default value
    '''
    if not isinstance(element, dict):
        return default
    _element = element
    for key in keys:
        # Necessary to ensure _element is not a different indexable type (list, string, etc).
        # get() would have the same issue if that method name was implemented by a different object
        if not isinstance(_element, dict) or key not in _element:
            return default
        _element = _element[key]
    return _element
Output:
print(dict_value_or_default(data, ["spam", "egg", "bacon", "pizza"]))
>> INVALID
print(dict_value_or_default(data, ["spam", "egg", "bacon", 2]))
>> INVALID
print(dict_value_or_default(data, ["spam", "egg", "bacon"]))
>> Well..
Here's my small snippet based on #Aroust's answer:
def exist(obj, *keys: str) -> bool:
    _obj = obj
    try:
        for key in keys:
            _obj = _obj[key]
    except (KeyError, TypeError):
        return False
    return True
if __name__ == '__main__':
    obj = {"mainsnak": {"datavalue": {"value": "A"}}}
    answer = exist(obj, "mainsnak", "datavalue", "value", "B")
    print(answer)
I added TypeError because when _obj is a str, int, None, etc., indexing it with the next key raises that error.
I wrote a data parsing library called dataknead for cases like this, basically because I got frustrated by the JSON the Wikidata API returns as well.
With that library you could do something like this:
from dataknead import Knead
numid = Knead(s).query("mainsnak/datavalue/value/numeric-id").data()
if numid:
    # Do something with `numeric-id`
    pass
Using dict with defaults is concise and appears to execute faster than using consecutive if statements.
Try it yourself:
import timeit
timeit.timeit("'x' in {'a': {'x': {'y'}}}.get('a', {})")
# 0.2874350370002503
timeit.timeit("'a' in {'a': {'x': {'y'}}} and 'x' in {'a': {'x': {'y'}}}['a']")
# 0.3466246419993695
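The statements above rebuild the dict literal on every iteration, so part of what is measured is construction rather than lookup. A sketch that builds the dict once through timeit's setup argument (absolute numbers depend on the machine and Python version, so none are shown here):

```python
import timeit

# Build the dict once; only the membership checks are timed.
setup = "d = {'a': {'x': {'y'}}}"

t_get = timeit.timeit("'x' in d.get('a', {})", setup=setup, number=100000)
t_in = timeit.timeit("'a' in d and 'x' in d['a']", setup=setup, number=100000)

print(t_get, t_in)
```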
I have written a handy library for this purpose.
It iterates over the AST of the dict and checks whether a particular key is present.
Do check it out.
https://github.com/Agent-Hellboy/trace-dkey
If you can suffer testing a string representation of the object path then this approach might work for you:
def exists(str):
    try:
        eval(str)
        return True
    except:
        return False
exists("lst['sublist']['item']")
One can try this to check whether a key, nested key, or value is in a nested dict:
import yaml
#d - nested dictionary
if something in yaml.dump(d, default_flow_style=False):
    print(something, "is in", d)
else:
    print(something, "is not in", d)
There are many great answers; here is my humble take on it. I added a check for arrays of dictionaries as well. Please note that I am not checking the arguments' validity. I used part of Arnot's code above. I added this answer because I had a use case that requires checking arrays of dictionaries in my data.
Here is the code:
def keys_exists(element, *keys):
    '''
    Check if *keys (nested) exists in `element` (dict).
    '''
    retval=False
    if isinstance(element,dict):
        for key,value in element.items():
            for akey in keys:
                if element.get(akey) is not None:
                    return True
            if isinstance(value,dict) or isinstance(value,list):
                retval= keys_exists(value, *keys)
    elif isinstance(element, list):
        for val in element:
            if isinstance(val,dict) or isinstance(val,list):
                retval=keys_exists(val, *keys)
    return retval
I am a little stumped by this code that I have written (pardon me if my title is misleading).
See the following code:
my_dict = {'aaa' : 12, 'bbb' :34, 'ccc' : 56}
my_inputs = ['aaa', 'bbb']
def check(user_input):
    input_check = my_dict.get(user_input)
    if not input_check:
        raise ValueError('{0} is not part of the dictionary'.format(user_input))
        #sys.exit()
        #return
    return input_check
for i in my_inputs:
    check(i)
    print 'Executing next...'
In the ideal scenario, where the contents of my_inputs are all correct and can be found in my_dict, it executes the way I want.
However, if I change the contents to my_inputs = ['aaa1', 'bbb'], in this order, it never prints the statement in the for loop.
But if I change it to my_inputs = ['aaa', 'bbb1'], it first prints the statement and then raises the ValueError. That is technically right, as aaa does exist in the dictionary.
My question is this: I am trying to make my check function check all the inputs in one go and see whether they exist in the dictionary before executing the next function. Whether the order of my_inputs is ['aaa1', 'bbb'] or ['aaa', 'bbb1'], it should simply stop at the ValueError and not print the statement. The statement should only be printed if all items in my_inputs are accounted for in my_dict.
I tried using sys.exit() and return, but that does not seem to work.
You can extract the logic of raising exception outside of check.
my_dict = {'aaa' : 12, 'bbb' :34, 'ccc' : 56}
my_inputs = ['aaa', 'bbb']
def check(user_input):
    input_check = my_dict.get(user_input)
    return input_check is not None
failed_result = [key for key in my_inputs if not check(key)]
if not failed_result:
    print("It's OK!")
else:
    print("The following key(s) is(are) not in dict!", failed_result)
Or make the check simpler: [key for key in my_inputs if key not in my_dict]
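To meet the original goal (validate every input before doing any work), the same idea can be wrapped in one function. A minimal sketch, where check_all is a hypothetical helper name and the error message wording is my own:

```python
my_dict = {'aaa': 12, 'bbb': 34, 'ccc': 56}

def check_all(inputs, mapping):
    # Collect every missing key first, so nothing runs if any input is invalid.
    missing = [key for key in inputs if key not in mapping]
    if missing:
        raise ValueError('{0} not part of the dictionary'.format(missing))
    return [mapping[key] for key in inputs]

values = check_all(['aaa', 'bbb'], my_dict)
print(values)

try:
    check_all(['aaa', 'bbb1'], my_dict)
except ValueError as exc:
    message = str(exc)
    print(message)
```

Because the membership test uses "not in" rather than .get, keys whose values are 0 or None are still treated as present.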
To see whether a variant of any of the elements in my_inputs also exists in the dictionary, you can try this:
my_inputs = ['aaa1', 'bbb']
my_dict = {'aaa' : 12, 'bbb' :34, 'ccc' : 56}
final_vals = [i for i in my_inputs if any(i.startswith(c) or c.startswith(i) for c in my_dict)]
Output:
['aaa1', 'bbb']
In the following test data, I am trying to append key 'x' value to the list ls. My question is why I didn't get a KeyError when looping through the first row of the data. Clearly, the first row does not contain the key 'x'. Originally I thought I had to use Try/Except to avoid getting an error when looping through the data, but it seems that Try/Except is not needed.
Could anyone help me understand why a KeyError is not generated here?
data = [{u'xyz': []},
        {u'xyz': [{u'x' : 2,
                   u'y' : 3,
                   u'z' : 4}]}]
ls = []
for item in data:
    ddd = item['xyz']
    print ddd
    for d in ddd:
        ls.append(d['x'])
ls
output:
[]
[{u'y': 3, u'x': 2, u'z': 4}]
[2]
A loop over nothing doesn't run:
>>> for item in []:
...     print item
...
>>>
so
data = [{u'xyz': []},
        ... ]
# first time through
for item in data:
    ddd = item['xyz']
    # ddd is an empty list
    for d in ddd:
        # this doesn't run, therefore no KeyError
        ls.append(d['x'])
Try/except is needed if you want to catch a KeyError. If you want to avoid one, you can either do:
if 'x' in d: # test if 'x' is a key in d
or
d.get('x') # returns the value, or None
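A short sketch of both options applied to data shaped like the question's, assuming the first row holds an empty dict so that the missing key actually comes into play:

```python
# Sample data modelled on the question; the first row contains a dict without 'x'.
data = [{'xyz': [{}]},
        {'xyz': [{'x': 2, 'y': 3, 'z': 4}]}]

# Option 1: test for the key and skip dicts that lack it.
ls = []
for item in data:
    for d in item['xyz']:
        if 'x' in d:
            ls.append(d['x'])

# Option 2: .get() never raises; missing keys come back as None.
values = [d.get('x') for item in data for d in item['xyz']]

print(ls, values)
```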
The list in the first row is empty; there's no KeyError because there's no dictionary in it. So the loop:
for d in ddd:
is not entered. Try changing the first row to:
data = [{u'xyz': [{}]},
You would then see the KeyError as expected.
I have a list of dictionaries in Python and I want to check whether a dictionary entry exists for a specific term. It works using the syntax
if any(d['acronym'] == 'lol' for d in loaded_data):
    print "found"
but I also want to get the value stored at this key, I mean d['acronym']['meaning']. My problem is that when I try to print it out, Python does not know about d. Any suggestions, maybe on how I can get the index of the occurrence without looping through the whole list again? Thanks!
If you know there's at most one match (or, alternatively, that you only care about the first) you can use next:
>>> loaded_data = [{"acronym": "AUP", "meaning": "Always Use Python"}, {"acronym": "GNDN", "meaning": "Goes Nowhere, Does Nothing"}]
>>> next(d for d in loaded_data if d['acronym'] == 'AUP')
{'acronym': 'AUP', 'meaning': 'Always Use Python'}
And then depending on whether you want an exception or None as the not-found value:
>>> next(d for d in loaded_data if d['acronym'] == 'AZZ')
Traceback (most recent call last):
File "<ipython-input-18-27ec09ac3228>", line 1, in <module>
next(d for d in loaded_data if d['acronym'] == 'AZZ')
StopIteration
>>> next((d for d in loaded_data if d['acronym'] == 'AZZ'), None)
>>>
You could even get the value and not the dict directly, if you wanted:
>>> next((d['meaning'] for d in loaded_data if d['acronym'] == 'GNDN'), None)
'Goes Nowhere, Does Nothing'
You can just use the filter function:
filter(lambda d: d['acronym'] == 'lol', loaded_data)
In Python 2 that returns a list of the dictionaries whose acronym is 'lol' (in Python 3 it returns an iterator, so wrap it in list() first):
l = filter(lambda d: d['acronym'] == 'lol', loaded_data)
if l:
    print "found"
    print l[0]
You don't even need to use any() at all.
If you want to use the item, rather than just check that it's there:
for d in loaded_data:
    if d['acronym'] == 'lol':
        print("found")
        # use d
        break # skip the rest of loaded_data
any() only gives you back a boolean, so you can't use that. So just write a loop:
for d in loaded_data:
    if d['acronym'] == 'lol':
        print "found"
        meaning = d['meaning']
        break
else:
    # The else: of a for runs only if the loop finished without break
    print "not found"
    meaning = None
Edit: or change it into a slightly more generic function:
def first(iterable, condition):
    # Return first element of iterable for which condition is True
    for element in iterable:
        if condition(element):
            return element
    return None
found_d = first(loaded_data, lambda d: d['acronym'] == 'lol')
if found_d:
    print "found"
    # Use found_d
firstone = next((d for d in loaded_data if d['acronym'] == 'lol'), None)
gives you the first dict where the condition applies, or None if there is no such dict.
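Since the question also asks for the index of the match, the same next() pattern can be combined with enumerate. A small sketch using the sample data from the answer above:

```python
loaded_data = [
    {"acronym": "AUP", "meaning": "Always Use Python"},
    {"acronym": "GNDN", "meaning": "Goes Nowhere, Does Nothing"},
]

# Index of the first matching dict, or None if nothing matches.
index = next((i for i, d in enumerate(loaded_data) if d['acronym'] == 'GNDN'), None)
missing_index = next((i for i, d in enumerate(loaded_data) if d['acronym'] == 'lol'), None)

print(index, missing_index)
```

Compare the result with "is not None" rather than relying on truthiness, because a match at position 0 is falsy.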
In my application I am receiving a string 'abc[0]=123'.
I want to convert this string to an array of items. I have tried eval(), but it didn't work for me. I know the array name abc, but the number of items will be different each time.
I can split the string, get the array index, and do it that way. But I would like to know whether there is any direct way to turn this string into an array insert.
I would greatly appreciate any suggestion.
Are you looking for something like this?
In [36]: s = "abc[0]=123"
In [37]: vars()[s[:3]] = []
In [38]: vars()[s[:3]].append(eval(s[s.find('=') + 1:]))
In [39]: abc
Out[39]: [123]
But this is not a good way to create a variable.
Here's a function for parsing urls according to php rules (i.e. using square brackets to create arrays or nested structures):
import urlparse, re
def parse_qs_as_php(qs):
    def sint(x):
        try:
            return int(x)
        except ValueError:
            return x
    def nested(rest, base, val):
        curr, rest = base, re.findall(r'\[(.*?)\]', rest)
        while rest:
            curr = curr.setdefault(
                sint(rest.pop(0) or len(curr)),
                {} if rest else val)
        return base
    def dtol(d):
        if not hasattr(d, 'items'):
            return d
        if sorted(d) == range(len(d)):
            return [d[x] for x in range(len(d))]
        return {k:dtol(v) for k, v in d.items()}
    r = {}
    for key, val in urlparse.parse_qsl(qs):
        id, rest = re.match(r'^(\w+)(.*)$', key).groups()
        r[id] = nested(rest, r.get(id, {}), val) if rest else val
    return dtol(r)
Example:
qs = 'one=1&abc[0]=123&abc[1]=345&foo[bar][baz]=555'
print parse_qs_as_php(qs)
# {'abc': ['123', '345'], 'foo': {'bar': {'baz': '555'}}, 'one': '1'}
Your other application is doing it wrong. It should not be specifying index values in the parameter keys. The correct way to specify multiple values for a single key in a GET is to simply repeat the key:
http://my_url?abc=123&abc=456
The Python server side should correctly resolve this into a dictionary-like object: you don't say what framework you're running, but for instance Django uses a QueryDict which you can then access using request.GET.getlist('abc') which will return ['123', '456']. Other frameworks will be similar.
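Outside of a framework, the standard library handles repeated keys the same way. A minimal sketch using Python 3's urllib.parse on the example URL above:

```python
from urllib.parse import parse_qs, urlsplit

url = 'http://my_url?abc=123&abc=456'

# parse_qs groups repeated keys into a list of values per key.
params = parse_qs(urlsplit(url).query)

print(params['abc'])
```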