I am iterating over a dict created using the following:
from collections import defaultdict
from functools import partial

tree = defaultdict(partial(defaultdict, partial(defaultdict, list)))
for dt, hour, value in flat_list:
    tree[dt][hour] = []
    tree[dt][hour].append(value)
My output looks like this:
for k, v in tree.iteritems():
    print k, v

2012-08-07 defaultdict(<functools.partial object at 0x1e0a050>, {'17': ['30']})
2012-08-24 defaultdict(<functools.partial object at 0x1e0a050>, {'3': ['70']})
How do I get rid of this extra output? How do I iterate over it like a regular dict, without the:
defaultdict(<functools.partial object at 0x1e0a050>
You are already iterating over the default dicts like a regular dict, but you are printing the defaultdict representation too.
To print these like you would print a regular dict, just turn them back into one:
for k, v in tree.iteritems():
    print k, dict(v)
Note that defaultdict is a direct subclass of dict; apart from the updated __getitem__ behaviour and the updated __repr__ hook [1], a defaultdict behaves exactly like a normal dict would, certainly when it comes to iterating.
[1] __copy__ and __deepcopy__ are overridden too, to create a new defaultdict when using the copy module, and a custom __reduce__ is provided for the pickle module for the same reason.
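If you want the nested values to print as plain dicts as well (not just the top level), a small recursive converter works. Here is a minimal sketch; undefaultdict is a name I made up for illustration:
def undefaultdict(d):
    # Recursively convert a (possibly nested) defaultdict into a plain dict.
    if isinstance(d, dict):
        return dict((k, undefaultdict(v)) for k, v in d.items())
    return d

plain_tree = undefaultdict(tree)  # repr() now shows ordinary nested dicts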
I have this nested dictionary that I get from an API.
response_body = \
{
    u'access_token': u'SIF_HMACSHA256lxWT0K',
    u'expires_in': 86000,
    u'name': u'Gandalf Grey',
    u'preferred_username': u'gandalf',
    u'ref_id': u'ab1d4237-edd7-4edd-934f-3486eac5c262',
    u'refresh_token': u'eyJhbGciOiJIUzI1N',
    u'roles': u'Instructor',
    u'sub': {
        u'cn': u'Gandalf Grey',
        u'dc': u'7477',
        u'uid': u'gandalf',
        u'uniqueIdentifier': u'ab1d4237-edd7-4edd-934f-3486eac5c262'
    }
}
I used the following to convert it into a Python object:
class sample_token:
    def __init__(self, **response):
        self.__dict__.update(response)
and used it like this:
s = sample_token(**response_body)
After this, I can access the values using s.access_token, s.name, etc. But the value of s.sub is also a dictionary. How can I get the values of the nested dictionary using this technique, i.e. so that s.sub.cn returns 'Gandalf Grey'?
Maybe a recursive approach like this:
>>> class sample_token:
...     def __init__(self, **response):
...         for k, v in response.items():
...             if isinstance(v, dict):
...                 self.__dict__[k] = sample_token(**v)
...             else:
...                 self.__dict__[k] = v
...
>>> s = sample_token(**response_body)
>>> s.sub
<__main__.sample_token object at 0x02CEA530>
>>> s.sub.cn
'Gandalf Grey'
We go over each key-value pair in the response, and if the value is a dictionary we create a sample_token object for it and store that new object in __dict__.
You can iterate over all key/value pairs with response.items() and, for each value where isinstance(value, dict) holds, replace it with sample_token(**value). Nothing will do the recursion automagically for you.
Once you've evaluated the expression in Python, it's not a JSON object anymore; it's a Python dict. The usual way to access entries is with the [] indexer notation, e.g.:
>>> response_body['sub']['uid']
u'gandalf'
If you must access it as an object rather than a dict, check out the answers to the question Convert Python dict to object?; the case of nested dicts is covered in one of the later answers.
I have a dictionary:
big_dict = {1:"1",
2:"2",
...
1000:"1000"}
(Note: My dictionary isn't actually numbers to strings)
I am passing this dictionary into a function that requires it, and I use the dictionary often for different functions. However, on occasion I want to send in big_dict with an extra key-value pair, such that the dictionary I want to send in would be equivalent to:
big_dict[1001]="1001"
But I don't want to actually add the value to the dictionary. I could make a copy of the dictionary and add it there, but I'd like to avoid the memory + CPU cycles this would consume.
The code I currently have is:
big_dict[1001]="1001"
function_that_uses_dict(big_dict)
del big_dict[1001]
While this works, it seems rather kludgy.
If this were a string I'd do:
function_that_uses_string(myString + 'what I want to add on')
Is there any equivalent way of doing this with a dictionary?
As pointed out by Veedrac in his answer, this problem has already been solved in Python 3.3+ in the form of the ChainMap class:
function_that_uses_dict(ChainMap({1001 : "1001"}, big_dict))
If you don't have Python 3.3 you should use a backport, and if for some reason you don't want to, then below you can see how to implement it by yourself :)
You can create a wrapper, similarly to this:
class DictAdditionalValueWrapper:
    def __init__(self, baseDict, specialKey, specialValue):
        self.baseDict = baseDict
        self.specialKey = specialKey
        self.specialValue = specialValue

    def __getitem__(self, key):
        if key == self.specialKey:
            return self.specialValue
        return self.baseDict[key]

    # ...
You need to supply all the other dict methods of course, or use UserDict as a base class, which should simplify this.
and then use it like this:
function_that_uses_dict(DictAdditionalValueWrapper(big_dict, 1001, "1001"))
This can easily be extended to a whole additional dictionary of "special" keys and values, not just a single additional element, as in the sketch below.
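For example, a minimal sketch of that extension; the class name DictExtrasWrapper is my own, not part of the answer above:
class DictExtrasWrapper:
    # Overlays an extra dict of "special" entries on a base dict,
    # without copying or mutating the base dict.
    def __init__(self, baseDict, extras):
        self.baseDict = baseDict
        self.extras = extras

    def __getitem__(self, key):
        if key in self.extras:
            return self.extras[key]
        return self.baseDict[key]

    # ... supply the remaining dict methods as needed

function_that_uses_dict(DictExtrasWrapper(big_dict, {1001: "1001", 1002: "1002"}))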
You can also extend this approach to achieve something similar to your string example:
class AdditionalKeyValuePair:
    def __init__(self, specialKey, specialValue):
        self.specialKey = specialKey
        self.specialValue = specialValue

    def __add__(self, d):
        if not isinstance(d, dict):
            raise Exception("Not a dict in AdditionalKeyValuePair")
        return DictAdditionalValueWrapper(d, self.specialKey, self.specialValue)
and use it like this:
function_that_uses_dict(AdditionalKeyValuePair(1001, "1001") + big_dict)
If you're on 3.3+, just use ChainMap. Otherwise use a backport.
new_dict = ChainMap({1001: "1001"}, old_dict)
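For illustration, the first mapping in the chain wins on lookup, and the original dict is left untouched:
>>> from collections import ChainMap
>>> old_dict = {1: "1", 2: "2"}
>>> new_dict = ChainMap({1001: "1001"}, old_dict)
>>> new_dict[1001]
'1001'
>>> new_dict[1]
'1'
>>> old_dict
{1: '1', 2: '2'}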
You can add the extra key-value pair while leaving the original dictionary unchanged, like this:
>>> def function_that_uses_bdict(big_dict):
...     print big_dict[1001]
...
>>> dct = {1: '1', 2: '2'}
>>> function_that_uses_bdict(dict(dct.items() + [(1001, '1001')]))
1001
>>> dct
{1: '1', 2: '2'}  # original unchanged
This is a bit annoying too, but you could just have the function take two parameters, one of them being big_dict, and another being a temporary dictionary created just for the function (so something like fxn(big_dict, {1001: '1001'})), as sketched below. Then you could access both dictionaries without changing your first one, and without copying big_dict.
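A minimal sketch of that two-parameter idea; fxn and the hard-coded key are placeholders of my own:
def fxn(big_dict, extra):
    # Check the small temporary dict first, then fall back to the big one;
    # neither dictionary is copied or modified.
    if 1001 in extra:
        return extra[1001]
    return big_dict[1001]

fxn(big_dict, {1001: '1001'})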
Let us say I have a custom data structure comprising primitive dicts. I need to serialize this using JSON. My structure is as follows:
path_list_dict = {(node1, node2 .. nodeN): (float1, float2, float3)}
So this is keyed with a tuple and the value is a tuple of three values. Each node element in the key is a custom class object with a __str__ method written for it. The wrapper dict which identifies each dict entry in path_list_dict with a key is as follows:
path_options_dict = {'Path1': {(node1, node2 .. nodeN): (float1, float2, float3)}, 'Path2': {(nodeA1, nodeA2 .. nodeAN): (floatA1, floatA2, floatA3)}}
and so on.
When I try to serialize this using JSON, I of course run into a TypeError, because the inner dict has tuples as keys and values, and a dict needs string keys to be serialized. This can easily be taken care of by inserting the str(tuple) representation into the dict instead of the native tuple.
What I am concerned about is that when I receive it and unpack the values, I am going to have all strings at the receiving end. The key tuple of the inner dict, which consists of custom class elements, is now represented as a str. Will I be able to recover the embedded data? Or is there some other way to do this better?
For more clarity, I am using this JSON tutorial as reference.
You have several options:
Serialize with a custom key prefix that you can pick out and unserialize again:
tuple_key = '__tuple__({})'.format(','.join(key))
would produce '__tuple__(node1,node2,nodeN)' as a key, which you could parse back into a tuple on the other side:
if key.startswith('__tuple__('):
    key = tuple(key[10:-1].split(','))
Demo:
>>> key = ('node1', 'node2', 'node3')
>>> '__tuple__({})'.format(','.join(key))
'__tuple__(node1,node2,node3)'
>>> mapped_key = '__tuple__({})'.format(','.join(key))
>>> tuple(mapped_key[10:-1].split(','))
('node1', 'node2', 'node3')
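Putting the prefix idea together for a whole dictionary, here is a minimal round-trip sketch; encode_keys and decode_keys are names of my own, and it assumes every key is a tuple of strings:
import json

def encode_keys(d):
    # Turn each tuple-of-strings key into a '__tuple__(...)' string
    # so that json.dumps accepts the dictionary.
    return dict(('__tuple__({})'.format(','.join(k)), v) for k, v in d.items())

def decode_keys(d):
    # Reverse the mapping after json.loads (assumes every key was encoded).
    return dict((tuple(k[10:-1].split(',')), v) for k, v in d.items())

data = {('node1', 'node2'): (1.0, 2.0, 3.0)}
restored = decode_keys(json.loads(json.dumps(encode_keys(data))))
# restored == {('node1', 'node2'): [1.0, 2.0, 3.0]}  -- JSON turns tuples into lists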
Don't use dictionaries, use a list of lists:
{'Path': [[[node1, node2 .. nodeN], [float1, float2, float3]], [...]]}
You can build such a list simply from the dict.items() result:
>>> json.dumps({(1, 2, 3): ('foo', 'bar')}.items())
'[[[1, 2, 3], ["foo", "bar"]]]'
and when decoding, feed the whole thing back into dict() while mapping each key-value list to tuples:
>>> dict(map(tuple, kv) for kv in json.loads('[[[1, 2, 3], ["foo", "bar"]]]'))
{(1, 2, 3): (u'foo', u'bar')}
The latter approach is more suitable for custom classes as well: the JSONEncoder.default() method will still be handed these custom objects for you to serialize to a suitable dictionary object, which gives a suitable object_hook passed to JSONDecoder() a chance to return fully deserialized custom objects again.
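A minimal sketch of that encoder/decoder pairing, assuming a simple Node class of my own invention (the '__node__' marker is likewise just an illustration):
import json

class Node(object):
    def __init__(self, name):
        self.name = name

class NodeEncoder(json.JSONEncoder):
    def default(self, obj):
        # Called for objects json cannot serialize natively.
        if isinstance(obj, Node):
            return {'__node__': obj.name}
        return json.JSONEncoder.default(self, obj)

def node_hook(d):
    # Called for every decoded JSON object; rebuilds Node instances.
    if '__node__' in d:
        return Node(d['__node__'])
    return d

dumped = json.dumps([Node('node1'), Node('node2')], cls=NodeEncoder)
restored = json.loads(dumped, object_hook=node_hook)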
I'm creating a dictionary structure that is several levels deep. I'm trying to do something like the following:
d = {}
d['a']['b'] = True
At the moment the above fails because key 'a' does not exist, so I have to check at every level of nesting and manually insert an empty dictionary. Is there some kind of syntactic sugar so that something like the above can produce:
{'a': {'b': True}}
Without having to create an empty dictionary at each level of nesting?
As others have said, use defaultdict. This is the idiom I prefer for arbitrarily-deep nesting of dictionaries:
import collections

def nested_dict():
    return collections.defaultdict(nested_dict)

d = nested_dict()
d[1][2][3] = 'Hello, dictionary!'
print(d[1][2][3])  # Prints: Hello, dictionary!
This also makes checking whether an element exists a little nicer, since you may no longer need to use get (note that merely looking up a missing path like this creates the intermediate empty dictionaries as a side effect):
if not d[2][3][4][5]:
    print('That element is empty!')
This has been edited to use a def rather than a lambda for PEP 8 compliance. The original lambda form looked like this, and has the drawback of being called <lambda> everywhere instead of getting a proper function name:
>>> nested_dict = lambda: collections.defaultdict(nested_dict)
>>> d = nested_dict()
>>> d[1][2][3]
defaultdict(<function <lambda> at 0x037E7540>, {})
Use defaultdict.
Python: defaultdict of defaultdict?
Or you can do this, since the dict() function can handle **kwargs:
http://docs.python.org/2/library/functions.html#func-dict
print dict(a=dict(b=True))
# {'a': {'b': True}}
If the depth of your data structure is fixed (that is, you know in advance that you need mydict[a][b][c] but not mydict[a][b][c][d]), you can build a nested defaultdict structure using lambda expressions to create the inner structures:
two_level = defaultdict(dict)
three_level = defaultdict(lambda: defaultdict(dict))
four_level = defaultdict(lambda: defaultdict(lambda: defaultdict(dict)))
Several times (even several in a row) I've been bitten by the defaultdict bug: forgetting that something is actually a defaultdict and treating it like a regular dictionary.
d = defaultdict(list)
...
try:
    v = d["key"]
except KeyError:
    print "Sorry, no dice!"
For those who have been bitten too, the problem is evident: when d has no key 'key', the v = d["key"] magically creates an empty list and assigns it to both d["key"] and v instead of raising an exception. Which can be quite a pain to track down if d comes from some module whose details one doesn't remember very well.
I'm looking for a way to take the sting out of this bug. For me, the best solution would be to somehow disable a defaultdict's magic before returning it to the client.
You can still convert it to a normal dict:
d = collections.defaultdict(list)
d = dict(d)
Use a different idiom:
if 'key' not in d:
    print "Sorry, no dice!"
You can prevent creation of default values by assigning d.default_factory = None. However, I don't quite like the idea of an object suddenly changing behavior. I'd prefer copying the values to a new dict, unless that imposes a severe performance penalty.
That is exactly the behavior you want from a defaultdict and not a bug. If you don't want it, don't use a defaultdict.
If you keep forgetting what type your variables have, then name them appropriately, for example by suffixing your defaultdict names with "_ddict".
Using rkhayrov's idea of resetting self.default_factory, here is a toggleable subclass of defaultdict:
class ToggleableDefaultdict(collections.defaultdict):
    def __init__(self, default_factory):
        self._default_factory = default_factory
        super(ToggleableDefaultdict, self).__init__(default_factory)

    def off(self):
        self.default_factory = None

    def on(self):
        self.default_factory = self._default_factory
For example:
d = ToggleableDefaultdict(list)
d['key'].append(1)
print(d)
# defaultdict(<type 'list'>, {'key': [1]})
d.off()
d['newkey'].append(2)
# KeyError: 'newkey'
d.on()
d['newkey'].append(2)
# defaultdict(<type 'list'>, {'newkey': [2], 'key': [1]})