Python: set a nested dictionary key with a dot-delimited string

If I have a dictionary that is nested, and I pass in a string like "key1.key2.key3" which would translate to:
myDict["key1"]["key2"]["key3"]
What would be an elegant way to be able to have a method where I could pass on that string and it would translate to that key assignment? Something like
myDict.set_nested('key1.key2.key3', someValue)

Using only builtin stuff:
def set(my_dict, key_string, value):
    """Given `foo`, 'key1.key2.key3', 'something', set foo['key1']['key2']['key3'] = 'something'"""
    # Start off pointing at the original dictionary that was passed in.
    here = my_dict
    # Turn the string of key names into a list of strings.
    keys = key_string.split(".")
    # For every key *before* the last one, we concentrate on navigating through the dictionary.
    for key in keys[:-1]:
        # Try to find here[key]. If it doesn't exist, create it with an empty dictionary. Then,
        # update our `here` pointer to refer to the thing we just found (or created).
        here = here.setdefault(key, {})
    # Finally, set the final key to the given value
    here[keys[-1]] = value

myDict = {}
set(myDict, "key1.key2.key3", "some_value")
assert myDict == {"key1": {"key2": {"key3": "some_value"}}}
This traverses myDict one key at a time, ensuring that each sub-key refers to a nested dictionary.
You could also solve this recursively, but then you risk RecursionError exceptions without any real benefit.
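A companion getter can follow the same key-by-key walk. The following is only a sketch: `get_nested` and its `default` parameter are hypothetical names chosen for illustration and are not part of the answer above.

```python
from functools import reduce


def get_nested(my_dict, key_string, default=None):
    """Return my_dict['key1']['key2']... for 'key1.key2...', or default if the path is missing."""
    try:
        # Index one level deeper for each key in the dotted path.
        return reduce(lambda here, key: here[key], key_string.split("."), my_dict)
    except (KeyError, TypeError):
        # KeyError: a key is absent; TypeError: an intermediate value cannot be indexed by key.
        return default


nested = {"key1": {"key2": {"key3": "some_value"}}}
print(get_nested(nested, "key1.key2.key3"))
print(get_nested(nested, "key1.missing.key3", default="n/a"))
```

Unlike the setter, this version never creates intermediate dictionaries, so a failed lookup leaves the input unchanged.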

There are a number of existing modules that will already do this, or something very much like it. For example, the jmespath module will resolve jmespath expressions, so given:
>>> mydict={'key1': {'key2': {'key3': 'value'}}}
You can run:
>>> import jmespath
>>> jmespath.search('key1.key2.key3', mydict)
'value'
The jsonpointer module does something similar, although it uses / as the separator instead of a dot (.).
Given the number of pre-existing modules I would avoid trying to write your own code to do this.

EDIT: OP's clarification makes it clear that this answer isn't what he's looking for. I'm leaving it up here for people who find it by title.
I implemented a class that did this a while back... it should serve your purposes.
I achieved this by overriding the default getattr/setattr functions for an object.
Check it out! AndroxxTraxxon/cfgutils
This lets you do some code like the following...
from cfgutils import obj
a = obj({
"b": 123,
"c": "apple",
"d": {
"e": "nested dictionary value"
}
})
print(a.d.e)
>>> nested dictionary value

Related

How to pass in a dictionary with additional elements in python?

I have a dictionary:
big_dict = {1:"1",
2:"2",
...
1000:"1000"}
(Note: My dictionary isn't actually numbers to strings)
I am passing this dictionary into a function that calls for it. I use the dictionary often for different functions. However, on occasion I want to send in big_dict with an extra key:item pair such that the dictionary I want to send in would be equivalent to:
big_dict[1001]="1001"
But I don't want to actually add the value to the dictionary. I could make a copy of the dictionary and add it there, but I'd like to avoid the memory + CPU cycles this would consume.
The code I currently have is:
big_dict[1001]="1001"
function_that_uses_dict(big_dict)
del big_dict[1001]
While this works, it seems rather kludgy.
If this were a string I'd do:
function_that_uses_string(myString + 'what I want to add on')
Is there any equivalent way of doing this with a dictionary?
As pointed out by Veedrac in his answer, this problem has already been solved in Python 3.3+ in the form of the ChainMap class:
function_that_uses_dict(ChainMap({1001 : "1001"}, big_dict))
If you don't have Python 3.3 you should use a backport, and if for some reason you don't want to, then below you can see how to implement it by yourself :)
You can create a wrapper, similarly to this:
class DictAdditionalValueWrapper:
    def __init__(self, baseDict, specialKey, specialValue):
        self.baseDict = baseDict
        self.specialKey = specialKey
        self.specialValue = specialValue

    def __getitem__(self, key):
        if key == self.specialKey:
            return self.specialValue
        return self.baseDict[key]

    # ...
You need to supply all the other dict methods too, of course, or use UserDict as a base class, which should simplify this.
and then use it like this:
function_that_uses_dict(DictAdditionalValueWrapper(big_dict, 1001, "1001"))
This can be easily extended to a whole additional dictionary of "special" keys and values, not just single additional element.
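As a rough sketch of that extension, a read-only overlay can be built on `collections.abc.Mapping`, which supplies `get`, `keys`, `items`, `in` and friends once `__getitem__`, `__iter__` and `__len__` are defined. The class name `OverlayMapping` and its attribute names are assumptions made for this example, and a function that insists on a real `dict` instance would not accept it.

```python
from collections.abc import Mapping


class OverlayMapping(Mapping):
    """Read-only view of `base` with the entries of `extra` layered on top."""

    def __init__(self, base, extra):
        self._base = base    # referenced, never copied
        self._extra = extra

    def __getitem__(self, key):
        # Prefer the overlay, then fall back to the base dictionary.
        if key in self._extra:
            return self._extra[key]
        return self._base[key]

    def __iter__(self):
        yield from self._extra
        # Skip base keys that the overlay already produced.
        yield from (key for key in self._base if key not in self._extra)

    def __len__(self):
        # Count base keys plus overlay keys that are genuinely new.
        return len(self._base) + sum(1 for key in self._extra if key not in self._base)


big_dict = {1: "1", 2: "2"}
wrapped = OverlayMapping(big_dict, {1001: "1001", 2: "two"})
print(wrapped[1001], wrapped[2], len(wrapped))
```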
You can also extend this approach to reach something similar as in your string example:
class AdditionalKeyValuePair:
    def __init__(self, specialKey, specialValue):
        self.specialKey = specialKey
        self.specialValue = specialValue

    def __add__(self, d):
        if not isinstance(d, dict):
            raise Exception("Not a dict in AdditionalKeyValuePair")
        return DictAdditionalValueWrapper(d, self.specialKey, self.specialValue)
and use it like this:
function_that_uses_dict(AdditionalKeyValuePair(1001, "1001") + big_dict)
If you're on 3.3+, just use ChainMap. Otherwise use a backport.
new_dict = ChainMap({1001: "1001"}, old_dict)
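A short, self-contained illustration of how that behaves (the small `big_dict` here is just stand-in data):

```python
from collections import ChainMap

big_dict = {1: "1", 2: "2"}

# The first mapping is searched first; big_dict is referenced, not copied.
view = ChainMap({1001: "1001"}, big_dict)

print(view[1001])         # found in the small overlay
print(view[1])            # falls through to big_dict
print(1001 in big_dict)   # the original dictionary is untouched
```

Note that writes through a ChainMap go to the first mapping only, so the function can even assign to the view without modifying `big_dict`.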
You can add the extra key-value pair leaving original dictionary as such like this:
>>> def function_that_uses_bdict(big_dict):
...     print big_dict[1001]
...
>>> dct = {1:'1', 2:'2'}
>>> function_that_uses_bdict(dict(dct.items()+[(1001,'1001')]))
1001
>>> dct
{1: '1', 2: '2'} # original unchanged
This is a bit annoying too, but you could have the function take two parameters: big_dict and a temporary dictionary created just for the call (so something like fxn(big_dict, {1001: '1001'})). The function can then read from both dictionaries without changing the first one and without copying big_dict.

Challenge with codes stemming from an issue regarding the "Hashable" property

We are attempting to refactor and modify a Python program so that it can take a user-defined JSON file, parse that file, and then execute a workflow based on the options that the user has defined in the JSON. So basically, the user will have to specify a dictionary in JSON, and when this JSON file is parsed by the Python program, we obtain a Python dictionary which we then pass in as an argument into a class that we instantiate in a top level module. To sum this up, the JSON dictionary defined by the user will eventually be added into the instance namespace when the Python program is running.
Implementing the context managers to parse the JSON inputs was not a problem for us. However, we have a requirement that we be able to use the JSON dictionary (which gets subsequently added into the instance namespace) and generate multiple lines from a Jinja2 template file using looping within a template. We attempted to use this line for one of the key-value pairs in the JSON:
"extra_scripts" : [["Altera/AlteraCommon.lua",
"Altera/StratixIV/EP4SGX70HF35C2.lua"]]
and this is sitting in a large dictionary object, let's call it option_space_dict and for simplicity in this example, it has only 4 key-value pairs (assume that "extra_scripts" is 'key4' here), although for our program, it is much larger:
option_space_dict = {
'key1' : ['value1'],
'key2' : ['value2'],
'key3' : ['value3A', 'value3B', 'value3C'],
'key4' : [['value4A', 'value4B']]
}
which is then parsed by this line:
import itertools
option_space = [ dict(itertools.izip(option_space_dict, opt)) for opt in itertools.product(*option_space_dict.itervalues()) ]
to get the option_space which essentially differs from option_space_dict in that it is something like:
[
    { 'key1' : 'value1',
      'key2' : 'value2',
      'key3' : 'value3A',
      'key4' : ['value4A', 'value4B'] },
    { 'key1' : 'value1',
      'key2' : 'value2',
      'key3' : 'value3B',
      'key4' : ['value4A', 'value4B'] },
    { 'key1' : 'value1',
      'key2' : 'value2',
      'key3' : 'value3C',
      'key4' : ['value4A', 'value4B'] }
]
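For readers on Python 3, where `itertools.izip` and `dict.itervalues` no longer exist, the same expansion can be sketched with the built-in `zip` and `dict.values()`. This is an equivalent rewrite for illustration, not the code from the program in question.

```python
import itertools

option_space_dict = {
    "key1": ["value1"],
    "key2": ["value2"],
    "key3": ["value3A", "value3B", "value3C"],
    "key4": [["value4A", "value4B"]],
}

# One dict per combination: pick one candidate value for every key.
option_space = [
    dict(zip(option_space_dict, combo))
    for combo in itertools.product(*option_space_dict.values())
]

for option in option_space:
    print(option)
```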
So the option_space we generate serves us well for what we want to do with the jinja2 templating. However, in order to get this, the key4 key that we added to option_space_dict caused an issue somewhere else in the program which did:
# ignore self.option as it is not relevant to the issue here
def getOptionCompack(self) :
    return [ (k, v) for k, v in self.option.iteritems() if set([v]) != set(self.option_space_dict[k])]
I get the error TypeError: unhashable type: 'list' stemming from the fact that the value of key4 contains a nested list structure, which is 'unhashable'.
So we kind of hit a barrier. Does anyone have a suggestion on how we could overcome this; being able to specify our JSON files in that way to do what we'd want with Jinja2 while still being able to parse the data structures out in the same format?
Thanks a million!
You can normalize your key data structures to use hashable types after they have parsed from JSON.
Since key4 is a list, you have two options:
Convert it to a tuple where order is significant. E.g.,
key = tuple(key)
Convert it to a frozenset where order is insignificant. E.g.,
key = frozenset(key)
If a key can contain a dictionary, then you'll have two additional options:
Convert it to either a sorted tuple or frozenset of its item tuples. E.g.,
key = tuple(sorted(key.iteritems())) # Use key.items() for Python 3.
# OR
key = frozenset(key.iteritems()) # Use key.items() for Python 3.
Convert it to a third-party frozendict (Python 3 compatible version here). E.g.,
import frozendict
key = frozendict.frozendict(key)
Depending on how simple or complex your keys are, you may have to apply the transformation recursively.
Since your keys come directly from JSON, you can check for the native types directly:
if isinstance(key, list):
    # Freeze list.
    pass
elif isinstance(key, dict):
    # Freeze dict.
    pass
If you want to support the generic types, you can do something similar to:
import collections
if isinstance(key, collections.Sequence) and not isinstance(key, basestring):  # Use str for Python 3.
    # NOTE: Make sure to exclude basestring because it meets the requirements for a Sequence (of characters).
    # On Python 3, use collections.abc.Sequence and collections.abc.Mapping instead.
    # Freeze list.
    pass
elif isinstance(key, collections.Mapping):
    # Freeze dict.
    pass
Here is a full example:
def getOptionCompack(self):
    results = []
    for k, v in self.option.iteritems():
        k = self.freeze_key(k)
        if set([v]) != set(self.option_space_dict[k]):
            results.append((k, v))
    return results

def freeze_key(self, key):
    if isinstance(key, list):
        return frozenset(self.freeze_key(subv) for subv in key)
    # If dictionaries need to be supported, uncomment this.
    #elif isinstance(key, dict):
    #    return frozendict((subk, self.freeze_key(subv)) for subk, subv in key.iteritems())
    return key
Where self.option_space_dict already had its keys converted using self.freeze_key().
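The recursive transformation mentioned above can be sketched on its own, using only built-in types. The function name `freeze` is an assumption for this example; it uses tuples for lists (order preserved) and a frozenset of item pairs for dicts, which are two of the options listed earlier.

```python
def freeze(value):
    """Recursively convert lists and dicts into hashable equivalents."""
    if isinstance(value, list):
        # Tuples preserve order; use frozenset instead if order is irrelevant.
        return tuple(freeze(item) for item in value)
    if isinstance(value, dict):
        # A frozenset of (key, frozen value) pairs ignores insertion order.
        return frozenset((key, freeze(item)) for key, item in value.items())
    return value


parsed = [["Altera/AlteraCommon.lua", "Altera/StratixIV/EP4SGX70HF35C2.lua"]]
frozen = freeze(parsed)
print(frozen)
print(len({frozen}))  # hashable, so it can be a set member
```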
We have managed to figure out the solution to this problem. The main gist of our solution lies in that we implemented a Helper Function that assists us to actually convert a list into a tuple. Basically, going back to my question, remember we had this list: [["Altera/AlteraCommon.lua", "Altera/StratixIV/EP4SGX70HF35C2.lua"]]?
With our original getOptionCompack(self) method, and the way we were invoking it, what happened was that we directly tried to convert the list to a set with the statement
return [ (k, v) for k, v in self.option.iteritems() if set([v]) != set(self.option_space_dict[k])]
where set(self.option_space_dict[k]) and iterating over k would mean we will hit the dictionary key-value pair that would give us one instance of doing set([["Altera/AlteraCommon.lua", "Altera/StratixIV/EP4SGX70HF35C2.lua"]])
which was the cause of the error. This is because a list object is not hashable and set() would actually hash over each element within the outer list that is fed to it, and the element in this case is an inner list. Try doing set([[2]]) and you will see what I mean.
So we figured that the workaround would be to define a helper function that accepts a list, or any iterable for that matter, and tests whether each element in it is a list. Elements that are not lists are left unchanged; elements that are lists (that is, nested lists) are converted to tuples. The helper collects the results into a set and returns that set. The definition of the function is:
# Helper function to build a set
def Set(iterable) :
    return { tuple(v) if isinstance(v, list) else v for v in iterable }
and so a call that invoked Set() would be in our example:
Set([["Altera/AlteraCommon.lua", "Altera/StratixIV/EP4SGX70HF35C2.lua"]])
and the object that it returns would be:
{("Altera/AlteraCommon.lua", "Altera/StratixIV/EP4SGX70HF35C2.lua")}
The inner nested list gets converted to a tuple, which is hashable and can therefore be a member of a set, as denoted by the {} that encloses the tuple. That is why the set can now be formed.
We proceeded to redefine our original method to use our own Set() function:
def getOptionCompack(self) :
    return [ (k, v) for k, v in self.option.iteritems() if Set([v]) != Set(self.option_space_dict[k]) ]
and now we no longer have the TypeError, and solved the problem. Seems like a lot of trouble just to do this, but the reason why we went through all this was so as to have an objective means of comparing two objects by sort of "normalizing" them to be the same object type, a set, in order to perform some other action later on as part of our source code.
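To make the failure and the fix easy to reproduce outside the class, here is a standalone Python 3 sketch. The name `to_hashable_set` and the sample `option` dictionary are made up for this illustration; the comparison itself mirrors the method above.

```python
def to_hashable_set(iterable):
    """Build a set, turning any list element into a tuple first."""
    return {tuple(v) if isinstance(v, list) else v for v in iterable}


option_space_dict = {
    "key3": ["value3A", "value3B", "value3C"],
    "key4": [["value4A", "value4B"]],
}
option = {"key3": "value3A", "key4": ["value4A", "value4B"]}

try:
    set(option_space_dict["key4"])  # the inner list cannot be hashed
except TypeError as exc:
    print("set() failed:", exc)

# Keep only the options whose chosen value differs from the full candidate set.
compact = [
    (k, v)
    for k, v in option.items()
    if to_hashable_set([v]) != to_hashable_set(option_space_dict[k])
]
print(compact)
```

The helper only converts one level, so a list nested inside the inner list would still raise TypeError; a recursive conversion would be needed for deeper structures.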

Python: How to traverse a List[Dict{List[Dict{}]}]

I was just wondering if there is a simple way to do this. I have a particular structure that is parsed from a file and the output is a list of a dict of a list of a dict. Currently, I just have a bit of code that looks something like this:
for i in xrange(len(data)):
    for j, k in data[i].iteritems():
        for l in xrange(len(data[i]['data'])):
            for m, n in data[i]['data'][l].iteritems():
                dostuff()
I just wanted to know if there was a function that would traverse a structure and internally figure out whether each entry was a list or a dict and if it is a dict, traverse into that dict and so on. I've only been using Python for about a month or so, so I am by no means an expert or even an intermediate user of the language. Thanks in advance for the answers.
EDIT: Even if it's possible to simplify my code at all, it would help.
You never need to iterate through xrange(len(data)). You iterate either through data (for a list) or data.items() (or values()) (for a dict).
Your code should look like this:
for elem in data:
    for val in elem.itervalues():
        for item in val['data']:
            dostuff()
which is quite a bit shorter.
Will, if you're looking to descend an arbitrary structure of array/hash thingies (lists and dicts), then you can create a function to do that based on the type of each element, checked with isinstance().
def traverse_it(it):
    if (isinstance(it, list)):
        for item in it:
            traverse_it(item)
    elif (isinstance(it, dict)):
        for key in it.keys():
            traverse_it(it[key])
    else:
        do_something_with_real_value(it)
Note that the average object oriented guru will tell you not to do this, and instead create a class tree where one is based on an array, another on a dict and then have a single function to process each with the same function name (ie, a virtual function) and to call that within each class function. IE, if/else trees based on types are "bad". Functions that can be called on an object to deal with its contents in its own way "good".
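A variation on the type-based traversal is to write it as a generator, so the caller decides what to do with each leaf value instead of hard-coding a callback. This is a Python 3 sketch; `iter_leaves` and the sample `data` are hypothetical.

```python
def iter_leaves(node):
    """Yield every non-list, non-dict value found anywhere inside node."""
    if isinstance(node, dict):
        for value in node.values():
            yield from iter_leaves(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_leaves(item)
    else:
        yield node


data = [{"name": "first", "data": [{"x": 1, "y": 2}, {"x": 3}]}]
print(list(iter_leaves(data)))
```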
I think this is what you're trying to do. There is no need to use xrange() to pull out the index from the list since for iterates over each value of the list. In my example below d1 is therefore a reference to the current data[i].
for d1 in data: # iterate over outer list, d1 is a dictionary
    for x in d1: # iterate over keys in d1 (the x var is unused)
        for d2 in d1['data']: # iterate over the list
            # iterate over (key,value) pairs in inner most dict
            for k,v in d2.iteritems():
                dostuff()
You're also using the name l twice (intentionally or not), but beware of how the scoping works.
This question is quite old, but out of curiosity I tried it myself and would like to offer another answer.
Suppose, dictionary looks like: dict1 = { 'a':5,'b': [1,2,{'a':100,'b':100}], 'dict 2' : {'a':3,'b':5}}
Solution:
dict1 = { 'a':5,'b': [1,2,{'a':100,'b':100}], 'dict 2' : {'a':3,'b':5}}
def recurse(dict):
    if type(dict) == type({}):
        for key in dict:
            recurse(dict[key])
    elif type(dict) == type([]):
        for element in dict:
            if type(element) == type({}):
                recurse(element)
            else:
                print element
    else:
        print dict
recurse(dict1)

How can I get Python to automatically create missing key/value pairs in a dictionary? [duplicate]

This question already has answers here:
Is there a standard class for an infinitely nested defaultdict?
(6 answers)
Closed 9 years ago.
I'm creating a dictionary structure that is several levels deep. I'm trying to do something like the following:
dict = {}
dict['a']['b'] = True
At the moment the above fails because key 'a' does not exist, so I have to check at every level of nesting and manually insert an empty dictionary. Is there some syntactic sugar that would let something like the above produce:
{'a': {'b': True}}
Without having to create an empty dictionary at each level of nesting?
As others have said, use defaultdict. This is the idiom I prefer for arbitrarily-deep nesting of dictionaries:
import collections

def nested_dict():
    return collections.defaultdict(nested_dict)

d = nested_dict()
d[1][2][3] = 'Hello, dictionary!'
print(d[1][2][3]) # Prints Hello, dictionary!
This also makes checking whether an element exists a little nicer, too, since you may no longer need to use get:
if not d[2][3][4][5]:
    print('That element is empty!')
This has been edited to use a def rather than a lambda for pep8 compliance. The original lambda form looked like this below, which has the drawback of being called <lambda> everywhere instead of getting a proper function name.
>>> nested_dict = lambda: collections.defaultdict(nested_dict)
>>> d = nested_dict()
>>> d[1][2][3]
defaultdict(<function <lambda> at 0x037E7540>, {})
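One thing worth keeping in mind with this idiom is that merely reading a missing path creates it. The sketch below shows that side effect and a way to turn the structure back into ordinary dicts, for example before serializing it; `to_plain_dict` is a hypothetical helper name.

```python
import collections


def nested_dict():
    return collections.defaultdict(nested_dict)


def to_plain_dict(d):
    """Recursively convert nested defaultdicts back into ordinary dicts."""
    if isinstance(d, collections.defaultdict):
        return {key: to_plain_dict(value) for key, value in d.items()}
    return d


d = nested_dict()
d["a"]["b"] = True

# Reading a missing path inserts empty defaultdicts along the way.
_ = d["x"]["y"]
print(to_plain_dict(d))
```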
Use defaultdict.
Python: defaultdict of defaultdict?
Or you can do this, since dict() function can handle **kwargs:
http://docs.python.org/2/library/functions.html#func-dict
print dict(a=dict(b=True))
# {'a': {'b' : True}}
If the depth of your data structure is fixed (that is, you know in advance that you need mydict[a][b][c] but not mydict[a][b][c][d]), you can build a nested defaultdict structure using lambda expressions to create the inner structures:
from collections import defaultdict
two_level = defaultdict(dict)
three_level = defaultdict(lambda: defaultdict(dict))
four_level = defaultdict(lambda: defaultdict(lambda: defaultdict(dict)))

Check for a key pattern in a dictionary in python

dict1=({"EMP$$1":1,"EMP$$2":2,"EMP$$3":3})
How to check if EMP exists in the dictionary using python
dict1.get("EMP##") ??
It's not entirely clear what you want to do.
You can loop through the keys in the dict selecting keys using the startswith() method:
>>> for key in dict1:
...     if key.startswith("EMP$$"):
...         print "Found",key
...
Found EMP$$1
Found EMP$$2
Found EMP$$3
You can use a list comprehension to get all the values that match:
>>> [value for key,value in dict1.items() if key.startswith("EMP$$")]
[1, 2, 3]
If you just want to know if a key matches you could use the any() function:
>>> any(key.startswith("EMP$$") for key in dict1)
True
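If the pattern is more general than a fixed prefix, the standard library's fnmatch module offers shell-style wildcards. The following Python 3 sketch uses `fnmatchcase` for case-sensitive matching; the `select` helper and the extra "MGR$$1" key are assumptions added for illustration.

```python
from fnmatch import fnmatchcase

dict1 = {"EMP$$1": 1, "EMP$$2": 2, "EMP$$3": 3, "MGR$$1": 9}


def select(mapping, pattern):
    """Return the items whose key matches a shell-style pattern (case-sensitive)."""
    return {key: value for key, value in mapping.items() if fnmatchcase(key, pattern)}


print(select(dict1, "EMP$$*"))
print(select(dict1, "*$$1"))
```

Like the loops above, this still scans every key, so it is linear in the size of the dictionary.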
This approach strikes me as contrary to the intent of a dictionary.
A dictionary is made up of hash keys which have had values associated with them. The benefit of this structure is that it provides very fast lookups (on the order of O(1)). By searching through the keys, you're negating that benefit.
I would suggest reorganizing your dictionary.
dict1 = {"EMP$$": {"1": 1, "2": 2, "3": 3} }
Then, finding "EMP$$" is as simple as
if "EMP$$" in dict1:
    #etc...
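If the data already exists in the flat form, the regrouping could be done in one pass. This is a sketch that assumes every key contains the "$$" separator exactly as in the example; the variable names are illustrative.

```python
flat = {"EMP$$1": 1, "EMP$$2": 2, "EMP$$3": 3}

grouped = {}
for key, value in flat.items():
    # Split "EMP$$1" into prefix "EMP", the separator "$$", and suffix "1".
    prefix, sep, suffix = key.partition("$$")
    grouped.setdefault(prefix + sep, {})[suffix] = value

print(grouped)
print("EMP$$" in grouped)
```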
You need to be a lot more specific with what you want to do. However, assuming the dictionary you gave:
dict1={"EMP$$1":1, "EMP$$2":2, "EMP$$3":3}
If you wanted to know if a specific key was present before trying to request it you could:
dict1.has_key('EMP$$1')
True
Returns True as dict1 has a key EMP$$1.
You could also forget about checking for keys and rely on the default return value of dict1.get():
dict1.get('EMP$$5',0)
0
Returns 0 as default given dict1 doesn't have a key EMP$$5.
In a similar way you could also use a try/except structure to catch and handle missing keys:
try:
    dict1['EMP$$5']
except KeyError, e:
    # Code to deal w key error
    print 'Trapped key error in dict1 looking for %s' % e
The other answers to this question are also great, but we need more info to be more precise.
There's no way to match dictionary keys like this. I suggest you rethink your data structure for this problem. If this has to be extra quick you could use something like a suffix tree.
You can use the in operator, which checks whether one string occurs inside another. Iterating over dict1 yields its keys, so you can test "EMP$$" against each key. Note that this matches the substring anywhere in the key, not only at the start.
dict1 = {"EMP$$1": 1, "EMP$$2": 2, "EMP$$3": 3}
print(any("EMP$$" in i for i in dict1))
# True
# testing for item that doesn't exist
print(any("AMP$$" in i for i in dict1))
# False
