I'm new to Python, and am sort of surprised I cannot do this.
dictionary = {
    'a' : '123',
    'b' : dictionary['a'] + '456'
}
I'm wondering what the Pythonic way to do this correctly in my script is, because I feel like I'm not the only one who has tried to do this.
EDIT: Enough people were wondering what I'm doing with this, so here are more details for my use case. Let's say I want to keep dictionary objects to hold file system paths. The paths are relative to other values in the dictionary. For example, this is what one of my dictionaries may look like.
dictionary = {
    'user': 'sholsapp',
    'home': '/home/' + dictionary['user']
}
It is important that at any point in time I may change dictionary['user'] and have all of the dictionary's values reflect the change. Again, this is an example of what I'm using it for, so I hope that it conveys my goal.
From my own research I think I will need to implement a class to do this.
Don't be afraid of creating new classes.
You can take advantage of Python's string formatting capabilities
and simply do:
class MyDict(dict):
    def __getitem__(self, item):
        return dict.__getitem__(self, item) % self

dictionary = MyDict({
    'user' : 'gnucom',
    'home' : '/home/%(user)s',
    'bin' : '%(home)s/bin'
})

print dictionary["home"]
print dictionary["bin"]
The nearest I came up with, without writing a class:
dictionary = {
    'user' : 'gnucom',
    'home' : lambda:'/home/'+dictionary['user']
}

print dictionary['home']()
dictionary['user']='tony'
print dictionary['home']()
>>> dictionary = {
... 'a':'123'
... }
>>> dictionary['b'] = dictionary['a'] + '456'
>>> dictionary
{'a': '123', 'b': '123456'}
This works fine. In your version, dictionary has not been defined yet at the point where you try to use it, because Python has to evaluate the whole dictionary literal before it can bind the name.
But be careful: this assigns to the key 'b' the value referenced by the key 'a' at the time of assignment, and it is not going to do the lookup every time. If repeated lookup is what you are looking for, it's possible, but it takes more work.
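A minimal sketch of that caveat, using the same keys as above: the right-hand side is evaluated once, so a later change to 'a' is not reflected in 'b'.

```python
# Plain assignment stores the computed string, not a reference to the key.
dictionary = {'a': '123'}
dictionary['b'] = dictionary['a'] + '456'  # evaluated once, right here

dictionary['a'] = 'xyz'                    # later change to 'a'

print(dictionary['b'])  # 'b' still holds the string built from the old 'a'
```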
What you're describing in your edit is how an INI config file works. Python does have a built-in library called ConfigParser (configparser in Python 3) which should work for what you're describing.
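As a rough sketch of that idea in Python 3 (where the module is spelled configparser): the section name `paths` and the `bin` key are assumptions made up for illustration. The default interpolation expands `%(name)s` references on every read, so changing `user` is reflected in the dependent values.

```python
import configparser

cfg = configparser.ConfigParser()
cfg.read_string("""
[paths]
user = sholsapp
home = /home/%(user)s
bin = %(home)s/bin
""")

paths = cfg["paths"]          # hypothetical section name
print(paths["home"])
print(paths["bin"])

paths["user"] = "tony"        # dependent values are re-expanded on the next read
print(paths["bin"])
```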
This is an interesting problem. It seems like Greg has a good solution. But that's no fun ;)
jsbueno has a very elegant solution, but it only applies to strings (as you requested).
The trick to a 'general' self referential dictionary is to use a surrogate object. It takes a few (understatement) lines of code to pull off, but the usage is about what you want:
S = SurrogateDict(AdditionSurrogateDictEntry)
d = S.resolve({'user': 'gnucom',
               'home': '/home/' + S['user'],
               'config': [S['home'] + '/.emacs', S['home'] + '/.bashrc']})
The code to make that happen is not nearly so short. It lives in three classes:
import abc

class SurrogateDictEntry(object):
    __metaclass__ = abc.ABCMeta

    def __init__(self, key):
        """record the key on the real dictionary that this will resolve to a
        value for
        """
        self.key = key

    def resolve(self, d):
        """ return the actual value"""
        if hasattr(self, 'op'):
            # any operation done on self will store its name in self.op.
            # if this is set, resolve it by calling the appropriate method
            # now that we can get self.value out of d
            self.value = d[self.key]
            return getattr(self, self.op + 'resolve__')()
        else:
            return d[self.key]

    @staticmethod
    def make_op(opname):
        """A convenience function. This will be the form of all op hooks for subclasses
        The actual logic for the op is in __op__resolve__ (e.g. __add__resolve__)
        """
        def op(self, other):
            self.stored_value = other
            self.op = opname
            return self
        op.__name__ = opname
        return op
Next comes the concrete class. Simple enough.
class AdditionSurrogateDictEntry(SurrogateDictEntry):

    __add__ = SurrogateDictEntry.make_op('__add__')
    __radd__ = SurrogateDictEntry.make_op('__radd__')

    def __add__resolve__(self):
        return self.value + self.stored_value

    def __radd__resolve__(self):
        return self.stored_value + self.value
Here's the final class
class SurrogateDict(object):
    def __init__(self, EntryClass):
        self.EntryClass = EntryClass

    def __getitem__(self, key):
        """record the key and return"""
        return self.EntryClass(key)

    @staticmethod
    def resolve(d):
        """I eat generators resolve self references"""
        stack = [d]
        while stack:
            cur = stack.pop()
            # This just tries to set it to an appropriate iterable
            it = xrange(len(cur)) if not hasattr(cur, 'keys') else cur.keys()
            for key in it:
                # sorry for being strict. Just register your class with
                # SurrogateDictEntry and you can pass whatever.
                while isinstance(cur[key], SurrogateDictEntry):
                    cur[key] = cur[key].resolve(d)
                # I'm just going to check for iter but you can add other
                # checks here for items that we should loop over.
                if hasattr(cur[key], '__iter__'):
                    stack.append(cur[key])
        return d
In response to gnucom's question about why I named the classes the way that I did:
The word surrogate is generally associated with standing in for something else so it seemed appropriate because that's what the SurrogateDict class does: an instance replaces the 'self' references in a dictionary literal. That being said, (other than just being straight up stupid sometimes) naming is probably one of the hardest things for me about coding. If you (or anyone else) can suggest a better name, I'm all ears.
I'll provide a brief explanation. Throughout S refers to an instance of SurrogateDict and d is the real dictionary.
1. A reference S[key] triggers S.__getitem__, and a SurrogateDictEntry(key) is placed in d.
2. When S[key] = SurrogateDictEntry(key) is constructed, it stores key. This will be the key into d for the value that this instance of SurrogateDictEntry is acting as a surrogate for.
3. After S[key] is returned, it is either entered into d, or has some operation(s) performed on it. If an operation is performed on it, it triggers the relevant __op__ method, which simply stores the value that the operation is performed on and the name of the operation and then returns itself. We can't actually resolve the operation because d hasn't been constructed yet.
4. After d is constructed, it is passed to S.resolve. This method loops through d finding any instances of SurrogateDictEntry and replacing them with the result of calling the resolve method on the instance.
5. The SurrogateDictEntry.resolve method receives the now constructed d as an argument and can use the value of key that it stored at construction time to get the value that it is acting as a surrogate for. If an operation was performed on it after creation, the op attribute will have been set with the name of the operation that was performed. If the class has an __op__ method, then it has an __op__resolve__ method with the actual logic that would normally be in the __op__ method. So now we have the logic (self.__op__resolve__) and all necessary values (self.value, self.stored_value) to finally get the real value of d[key]. So we return that, which step 4 places in the dictionary.
6. Finally, the SurrogateDict.resolve method returns d with all references resolved.
That's a rough sketch. If you have any more questions, feel free to ask.
If you, just like me, are wondering how to make @jsbueno's snippet work with {}-style substitutions, below is example code (which is probably not very efficient, though):
import string

class MyDict(dict):
    def __init__(self, *args, **kw):
        super(MyDict,self).__init__(*args, **kw)
        self.itemlist = super(MyDict,self).keys()
        self.fmt = string.Formatter()

    def __getitem__(self, item):
        return self.fmt.vformat(dict.__getitem__(self, item), {}, self)

xs = MyDict({
    'user' : 'gnucom',
    'home' : '/home/{user}',
    'bin' : '{home}/bin'
})

>>> xs["home"]
'/home/gnucom'
>>> xs["bin"]
'/home/gnucom/bin'
I tried to make it work by simply replacing % self with .format(**self), but it turns out that doesn't work for nested expressions (like 'bin' in the listing above, which references 'home', which has its own reference to 'user') because of the evaluation order: the ** expansion is done before the actual format call, and it is not delayed as in the original % version.
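In Python 3, one possible alternative (a sketch, not part of the snippet above) is str.format_map, which takes the mapping itself instead of expanding it with **. Lookups then go back through the overridden __getitem__, so nested references get expanded. This assumes every stored value is a string.

```python
class MyDict(dict):
    def __getitem__(self, item):
        template = dict.__getitem__(self, item)
        # format_map() looks keys up on `self` lazily, so each lookup goes
        # back through this __getitem__ and nested references are expanded.
        return template.format_map(self)

xs = MyDict({
    'user': 'gnucom',
    'home': '/home/{user}',
    'bin': '{home}/bin',
})

print(xs['bin'])
xs['user'] = 'tony'   # dependent values follow the change
print(xs['bin'])
```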
Write a class, maybe something with properties:
class PathInfo(object):
    def __init__(self, user):
        self.user = user

    @property
    def home(self):
        return '/home/' + self.user

p = PathInfo('thc')
print p.home # /home/thc
As sort of an extended version of @Tony's answer, you could build a dictionary subclass that calls its values if they are callables:
class CallingDict(dict):
    """Returns the result rather than the value of referenced callables.

    >>> cd = CallingDict({1: "One", 2: "Two", 'fsh': "Fish",
    ...                   "rhyme": lambda d: ' '.join((d[1], d['fsh'],
    ...                                                d[2], d['fsh']))})
    >>> cd["rhyme"]
    'One Fish Two Fish'
    >>> cd[1] = 'Red'
    >>> cd[2] = 'Blue'
    >>> cd["rhyme"]
    'Red Fish Blue Fish'
    """
    def __getitem__(self, item):
        it = super(CallingDict, self).__getitem__(item)
        if callable(it):
            return it(self)
        else:
            return it
Of course this would only be usable if you're not actually going to store callables as values. If you need to be able to do that, you could wrap the lambda declaration in a function that adds some attribute to the resulting lambda, and check for it in CallingDict.__getitem__, but at that point it's getting complex, and long-winded, enough that it might just be easier to use a class for your data in the first place.
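A minimal sketch of that marker idea, assuming a hypothetical helper named `computed` and a hypothetical attribute named `is_computed` (neither is an existing API): only functions that went through the helper are called on lookup, so ordinary callables can still be stored as values.

```python
def computed(func):
    """Hypothetical helper: tag a function so the dict calls it on lookup."""
    func.is_computed = True
    return func


class CallingDict(dict):
    def __getitem__(self, item):
        value = super(CallingDict, self).__getitem__(item)
        # Only call values that were explicitly tagged; plain callables
        # (like `len` below) are returned untouched.
        if getattr(value, 'is_computed', False):
            return value(self)
        return value


cd = CallingDict({
    'user': 'gnucom',
    'home': computed(lambda d: '/home/' + d['user']),
    'callback': len,
})

print(cd['home'])
print(cd['callback'] is len)
```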
This is very easy in a lazily evaluated language (Haskell).
Since Python is strictly evaluated, we can do a little trick to turn things lazy:
Y = lambda f: (lambda x: x(x))(lambda y: f(lambda *args: y(y)(*args)))
d1 = lambda self: lambda: {
    'a': lambda: 3,
    'b': lambda: self()['a']()
}
# fix the d1, and evaluate it
d2 = Y(d1)()
# to get a
d2['a']() # 3
# to get b
d2['b']() # 3
Syntax-wise this is not very nice. That's because we need to explicitly construct lazy expressions with lambda: ... and explicitly evaluate lazy expressions with ...(). It's the opposite of the problem in lazy languages, which need strictness annotations; here in Python we end up needing laziness annotations.
I think with some more metaprogramming and some more tricks, the above could be made easier to use.
Note that this is basically how let-rec works in some functional languages.
The jsbueno answer in Python 3 :
class MyDict(dict):
    def __getitem__(self, item):
        return dict.__getitem__(self, item).format(self)

dictionary = MyDict({
    'user' : 'gnucom',
    'home' : '/home/{0[user]}',
    'bin' : '{0[home]}/bin'
})
print(dictionary["home"])
print(dictionary["bin"])
Here we use Python 3 string formatting with curly braces {} and the .format() method.
Documentation : https://docs.python.org/3/library/string.html
I apologise if the title is cryptic, I could not think of a way to describe my problem in a sentence. I am building some code in python2.7 that I describe below.
Minimal working example
My code has a Parameter class that implements attributes such as name and value, which looks something like this.
class Parameter(object):
    def __init__(self, name, value=None, error=None, dist=None, prior=None):
        self.name = name
        self._value = value # given value for parameter, this is going to be changed very often in an MCMC sampler
        self.error = error # initial estimate of error for the parameter, will only be set once
        self._dist = dist # a distribution for the parameter, will only be set once
        self.prior = prior

    @property
    def value(self):
        return self._value

    @property
    def dist(self):
        return self._dist
The class also has several properties that return the mean, median, etc. of Parameter.dist if a distribution is given.
I have another class, e.g. ParameterSample, that creates a population of different Parameter objects. Some of these Parameter objects have their attributes (e.g. value, error) set using the Parameter.set_parameter() function, but some other Parameter objects are not explicitly set, but their value and dist attributes depend on some of the other Parameter objects that are set:
class ParameterSample(object):
    def __init__(self):
        varied_parameters = ('a', 'b') # parameter names whose `value` attribute is varied
        derived_parameters = ('c',) # parameter names whose `value` attribute is varied, but depends on `a.value` and `b.value`
        parameter_names = varied_parameters + derived_parameters
        # create `Parameter` objects for each parameter name
        for name in parameter_names:
            setattr(self, name, Parameter(name))

    def set_parameter(self, name, **kwargs):
        for key, val in kwargs.items():
            if key == 'value':
                key = '_'.join(['', key]) # add underscore to set `Parameter._value`
            setattr(getattr(self, name), key, val) # basically does e.g. `self.a.value = 1`
I can now create a ParameterSample and use them like this:
parobj = ParameterSample()
parobj.set_parameter('a', value=1, error=0.1)
parobj.set_parameter('b', value=2, error=0.5)
parobj.a.value
>>> 1
parobj.b.error
>>> 0.5
parobj.set_parameter('b', value=3)
parobj.b.value
>>> 3
parobj.b.error
>>> 0.5
What I want
What I ultimately want, is to use Parameter.c the same way. For example:
parobj.c.value
>>> 4 # returns parobj.a.value + parobj.b.value
parobj.c.dist
>>> None # returns a.dist + b.dist, but since they are not currently set it is None
c therefore needs to be a Parameter object with all the same attributes as a and b, but where its value and dist are updated according to the current attributes of a and b.
However, I should also mention that I want to be able to set the allowed prior ranges for parameter c, e.g. parobj.set_parameter('c', prior=(0,10)) before making any calls to its value -- so c needs to be an already defined Parameter object upon the creation of the ParameterSample object.
How would I implement this into my ParameterSample class?
What I've tried
I have tried looking into making my own decorators, but I am not sure if that is the way to go since I don't fully understand how I would use those.
I've also considered adding a @property to c that creates a new Parameter object every time it is called, but I feel like that is not the way to go since it may slow down the code.
I should also note that the ParameterSample class above is going to be inherited in a different class, so whatever the solution is it should be able to be used in this setting:
class Companion(ParameterSample):
    def __init__(self, name):
        self.name = name
        super(Companion, self).__init__()

comp = Companion(name='Earth')
comp.set_parameter('a', value=1)
comp.set_parameter('b', value=3)
comp.c.value
>>> 4
I could not get this to work in Python 2 - the setattr calls never seemed to propagate the attributes to the child classes (Companion would have no c attribute).
I was more successful with Python 3 though. Since you have two parameter types (varied vs. derived), it makes sense IMO to have two classes to implement the behavior, instead of treating them all as one.
I added a DerivedParameter class, inheriting from Parameter that takes a dependents argument (along with its parent class' args/kwargs), but redefining value and dist to give dependent behavior:
class DerivedParameter(Parameter):
    def __init__(self, name, dependents, **kwargs):
        self._dependents = dependents
        super().__init__(name, **kwargs)

    @property
    def value(self):
        try:
            return sum(x._value for x in self._dependents if x is not None)
        except TypeError:
            return None

    @property
    def dist(self):
        try:
            return sum(x._dist for x in self._dependents if x is not None)
        except TypeError:
            return None
Then I adjusted how your parameter objects are added:
class ParameterSample:
    def __init__(self):
        # Store as instance attributes to reference later
        self.varied_params = ('a', 'b') # parameter names whose `value` attribute is varied
        self.derived_params = ('c',) # parameter names whose `value` attribute is varied, but depends on `a.value` and `b.value`
        # No more combined names
        # create `Parameter` objects for each varied parameter name
        for name in self.varied_params:
            setattr(self, name, Parameter(name))
        # Create `DerivedParameter` objects for each derived parameter
        # Derived parameters depend on all `Parameter` objects. It wasn't
        # clear if this was the desired behavior though.
        params = [v for _, v in self.__dict__.items() if isinstance(v, Parameter)]
        for name in self.derived_params:
            setattr(self, name, DerivedParameter(name, params))

    def set_parameter(self, name, **kwargs):
        for key, val in kwargs.items():
            if key == 'value':
                key = '_'.join(['', key]) # add underscore to set `Parameter._value`
            setattr(getattr(self, name), key, val) # basically does e.g. `self.a.value = 1`
From this, I could then replicate your given example desired behavior:
>>> comp = Companion(name='Earth')
>>> comp.set_parameter('a', value=1)
>>> comp.set_parameter('b', value=3)
>>> print(comp.c.value)
>>> print(comp.c.dist)
4
None
>>> comp.set_parameter('c', prior=(0,10))
>>> print(comp.c.prior)
(0, 10)
As I pointed out in the comments, the design above ends up causing all derived parameters to use all varied parameters as their dependents - effectively making c and a potential d identical. You should be able to fix this fairly easily with some parameters/conditions.
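One way this could be done, sketched under assumptions: each derived parameter gets its own list of dependents plus a combining function. The `func` argument and the trimmed-down `Parameter` class are illustrative, not part of the code above.

```python
class Parameter(object):
    # Trimmed-down stand-in for the Parameter class in the question.
    def __init__(self, name, value=None, prior=None):
        self.name = name
        self._value = value
        self.prior = prior

    @property
    def value(self):
        return self._value


class DerivedParameter(Parameter):
    def __init__(self, name, dependents, func, **kwargs):
        super(DerivedParameter, self).__init__(name, **kwargs)
        self._dependents = dependents  # the Parameter objects this one reads
        self._func = func              # hypothetical: how to combine their values

    @property
    def value(self):
        values = [p.value for p in self._dependents]
        if any(v is None for v in values):
            return None                # not all dependents are set yet
        return self._func(*values)


a = Parameter('a', value=1)
b = Parameter('b', value=3)
c = DerivedParameter('c', [a, b], lambda x, y: x + y, prior=(0, 10))
d = DerivedParameter('d', [a], lambda x: 2 * x)

print(c.value)
a._value = 5               # c and d pick up the change on the next read
print(c.value, d.value)
```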
Overall, I would have to agree with @Error - Syntactical Remorse though. This is a pretty complicated way to go about designing classes and would make maintenance confusing at best. I would strongly encourage you to reconsider your design and try to find an adaptable general solution that doesn't involve dynamic creation of attributes like this.
I find it more convenient to access dict keys as obj.foo instead of obj['foo'], so I wrote this snippet:
class AttributeDict(dict):
    def __getattr__(self, attr):
        return self[attr]
    def __setattr__(self, attr, value):
        self[attr] = value
However, I assume that there must be some reason that Python doesn't provide this functionality out of the box. What would be the caveats and pitfalls of accessing dict keys in this manner?
Update - 2020
Since this question was asked almost ten years ago, quite a bit has changed in Python itself since then.
While the approach in my original answer is still valid for some cases (e.g. legacy projects stuck on older versions of Python, and cases where you really need to handle dictionaries with very dynamic string keys), I think that in general the dataclasses introduced in Python 3.7 are the obvious/correct solution for the vast majority of the use cases of AttrDict.
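For illustration, a minimal dataclass sketch; the class name `Paths` and its fields are made up for this example. The fields are declared up front, attribute access works as usual, and a plain dict is still one call away.

```python
from dataclasses import dataclass, asdict


@dataclass
class Paths:
    # Hypothetical fields, chosen only for this example.
    user: str
    shell: str = '/bin/bash'

    @property
    def home(self):
        return '/home/' + self.user


p = Paths(user='gnucom')
print(p.home)       # attribute-style access, derived from `user`
print(asdict(p))    # back to a plain dict of the declared fields
```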
Original answer
The best way to do this is:
class AttrDict(dict):
    def __init__(self, *args, **kwargs):
        super(AttrDict, self).__init__(*args, **kwargs)
        self.__dict__ = self
Some pros:
It actually works!
No dictionary class methods are shadowed (e.g. .keys() works just fine, unless, of course, you assign some value to it; see below)
Attributes and items are always in sync
Trying to access non-existent key as an attribute correctly raises AttributeError instead of KeyError
Supports [Tab] autocompletion (e.g. in jupyter & ipython)
Cons:
Methods like .keys() will not work just fine if they get overwritten by incoming data
Causes a memory leak in Python < 2.7.4 / Python3 < 3.2.3
Pylint goes bananas with E1123(unexpected-keyword-arg) and E1103(maybe-no-member)
For the uninitiated it seems like pure magic.
A short explanation on how this works
All python objects internally store their attributes in a dictionary that is named __dict__.
There is no requirement that the internal dictionary __dict__ would need to be "just a plain dict", so we can assign any subclass of dict() to the internal dictionary.
In our case we simply assign the AttrDict() instance we are instantiating (as we are in __init__).
By calling super()'s __init__() method we made sure that it (already) behaves exactly like a dictionary, since that function calls all the dictionary instantiation code.
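The points above can be checked with a short sketch (Python 3 syntax; the keys are arbitrary): attributes and items share one storage object, and a missing key surfaces as AttributeError.

```python
class AttrDict(dict):
    def __init__(self, *args, **kwargs):
        super(AttrDict, self).__init__(*args, **kwargs)
        self.__dict__ = self  # attribute storage and item storage are now one object


d = AttrDict(spam=1)
d.eggs = 2        # set as an attribute, visible as an item
d['ham'] = 3      # set as an item, visible as an attribute

print(d.__dict__ is d)
print(d['eggs'], d.ham)

try:
    d.missing
except AttributeError as exc:
    print('AttributeError:', exc)
```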
One reason why Python doesn't provide this functionality out of the box
As noted in the "cons" list, this combines the namespace of stored keys (which may come from arbitrary and/or untrusted data!) with the namespace of builtin dict method attributes. For example:
d = AttrDict()
d.update({'items':["jacket", "necktie", "trousers"]})
for k, v in d.items():    # TypeError: 'list' object is not callable
    print "Never reached!"
You can have all legal string characters as part of the key if you use array notation.
For example, obj['!#$%^&*()_']
Wherein I Answer the Question That Was Asked
Why doesn't Python offer it out of the box?
I suspect that it has to do with the Zen of Python: "There should be one -- and preferably only one -- obvious way to do it." This would create two obvious ways to access values from dictionaries: obj['key'] and obj.key.
Caveats and Pitfalls
These include possible lack of clarity and confusion in the code. i.e., the following could be confusing to someone else who is going in to maintain your code at a later date, or even to you, if you're not going back into it for awhile. Again, from Zen: "Readability counts!"
>>> KEY = 'spam'
>>> d[KEY] = 1
>>> # Several lines of miscellaneous code here...
... assert d.spam == 1
If d is instantiated or KEY is defined or d[KEY] is assigned far away from where d.spam is being used, it can easily lead to confusion about what's being done, since this isn't a commonly-used idiom. I know it would have the potential to confuse me.
Additionally, if you change the value of KEY as follows (but miss changing d.spam), you now get:
>>> KEY = 'foo'
>>> d[KEY] = 1
>>> # Several lines of miscellaneous code here...
... assert d.spam == 1
Traceback (most recent call last):
File "<stdin>", line 2, in <module>
AttributeError: 'C' object has no attribute 'spam'
IMO, not worth the effort.
Other Items
As others have noted, you can use any hashable object (not just a string) as a dict key. For example,
>>> d = {(2, 3): True,}
>>> assert d[(2, 3)] is True
>>>
is legal, but
>>> C = type('C', (object,), {(2, 3): True})
>>> d = C()
>>> assert d.(2, 3) is True
File "<stdin>", line 1
d.(2, 3)
^
SyntaxError: invalid syntax
>>> getattr(d, (2, 3))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: getattr(): attribute name must be string
>>>
is not. This gives you access to the entire range of printable characters or other hashable objects for your dictionary keys, which you do not have when accessing an object attribute. This makes possible such magic as a cached object metaclass, like the recipe from the Python Cookbook (Ch. 9).
Wherein I Editorialize
I prefer the aesthetics of spam.eggs over spam['eggs'] (I think it looks cleaner), and I really started craving this functionality when I met the namedtuple. But the convenience of being able to do the following trumps it.
>>> KEYS = 'spam eggs ham'
>>> VALS = [1, 2, 3]
>>> d = {k: v for k, v in zip(KEYS.split(' '), VALS)}
>>> assert d == {'spam': 1, 'eggs': 2, 'ham': 3}
>>>
This is a simple example, but I frequently find myself using dicts in different situations than I'd use obj.key notation (i.e., when I need to read prefs in from an XML file). In other cases, where I'm tempted to instantiate a dynamic class and slap some attributes on it for aesthetic reasons, I continue to use a dict for consistency in order to enhance readability.
I'm sure the OP has long-since resolved this to his satisfaction, but if he still wants this functionality, then I suggest he download one of the packages from pypi that provides it:
Bunch is the one I'm more familiar with. Subclass of dict, so you have all that functionality.
AttrDict also looks like it's also pretty good, but I'm not as familiar with it and haven't looked through the source in as much detail as I have Bunch.
Addict Is actively maintained and provides attr-like access and more.
As noted in the comments by Rotareti, Bunch has been deprecated, but there is an active fork called Munch.
However, in order to improve readability of his code I strongly recommend that he not mix his notation styles. If he prefers this notation then he should simply instantiate a dynamic object, add his desired attributes to it, and call it a day:
>>> C = type('C', (object,), {})
>>> d = C()
>>> d.spam = 1
>>> d.eggs = 2
>>> d.ham = 3
>>> assert d.__dict__ == {'spam': 1, 'eggs': 2, 'ham': 3}
Wherein I Update, to Answer a Follow-Up Question in the Comments
In the comments (below), Elmo asks:
What if you want to go one deeper? ( referring to type(...) )
While I've never used this use case (again, I tend to use nested dict, for
consistency), the following code works:
>>> C = type('C', (object,), {})
>>> d = C()
>>> for x in 'spam eggs ham'.split():
...     setattr(d, x, C())
...     i = 1
...     for y in 'one two three'.split():
...         setattr(getattr(d, x), y, i)
...         i += 1
...
>>> assert d.spam.__dict__ == {'one': 1, 'two': 2, 'three': 3}
From this other SO question there's a great implementation example that simplifies your existing code. How about:
class AttributeDict(dict):
    __slots__ = ()
    __getattr__ = dict.__getitem__
    __setattr__ = dict.__setitem__
Much more concise and doesn't leave any room for extra cruft getting into your __getattr__ and __setattr__ functions in the future.
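One behavior worth knowing about with this compact version, shown as a small sketch: because __getattr__ is dict.__getitem__, a missing key raises KeyError rather than AttributeError, so getattr with a default does not fall back to the default.

```python
class AttributeDict(dict):
    __slots__ = ()
    __getattr__ = dict.__getitem__
    __setattr__ = dict.__setitem__


d = AttributeDict(foo=1)
d.bar = 2   # stored as d['bar']

try:
    getattr(d, 'missing', None)   # the default only covers AttributeError
    outcome = 'default returned'
except KeyError:
    outcome = 'KeyError raised'

print(outcome)
```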
You can pull a convenient container class from the standard library:
from argparse import Namespace
to avoid having to copy around code bits. No standard dictionary access, but easy to get one back if you really want it. The code in argparse is simple,
class Namespace(_AttributeHolder):
    """Simple object for storing attributes.

    Implements equality by attribute names and values, and provides a simple
    string representation.
    """

    def __init__(self, **kwargs):
        for name in kwargs:
            setattr(self, name, kwargs[name])

    __hash__ = None

    def __eq__(self, other):
        return vars(self) == vars(other)

    def __ne__(self, other):
        return not (self == other)

    def __contains__(self, key):
        return key in self.__dict__
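A short usage sketch of the "easy to get one back" remark; the attribute names are arbitrary. vars() returns the Namespace's underlying attribute dictionary.

```python
from argparse import Namespace

ns = Namespace(user='gnucom', home='/home/gnucom')
ns.shell = '/bin/bash'       # attributes can be added after construction

as_dict = vars(ns)           # the underlying attribute dictionary
print(as_dict['shell'])
print('user' in ns)          # membership test via Namespace.__contains__
```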
Caveat emptor: For some reason, classes like this seem to break the multiprocessing package. I struggled with this bug for a while before finding this SO question:
Finding exception in python multiprocessing
I found myself wondering what the current state of "dict keys as attr" is in the Python ecosystem. As several commenters have pointed out, this is probably not something you want to roll your own from scratch, as there are several pitfalls and footguns, some of them very subtle. Also, I would not recommend using Namespace as a base class; I've been down that road, and it isn't pretty.
Fortunately, there are several open source packages providing this functionality, ready to pip install! Unfortunately, there are several packages. Here is a synopsis, as of Dec 2019.
Contenders (most recent commit to master|#commits|#contribs|coverage%):
addict (2021-01-05 | 229 | 22 | 100%)
munch (2021-01-22 | 166 | 17 | ?%)
easydict (2021-02-28 | 54 | 7 | ?%)
attrdict (2019-02-01 | 108 | 5 | 100%)
prodict (2021-03-06 | 100 | 2 | ?%)
No longer maintained or under-maintained:
treedict (2014-03-28 | 95 | 2 | ?%)
bunch (2012-03-12 | 20 | 2 | ?%)
NeoBunch
I currently recommend munch or addict. They have the most commits, contributors, and releases, suggesting a healthy open-source codebase for each. They have the cleanest-looking readme.md, 100% coverage, and good looking set of tests.
I do not have a dog in this race (for now!), besides having rolled my own dict/attr code and wasted a ton of time because I was not aware of all these options :). I may contribute to addict/munch in the future as I would rather see one solid package than a bunch of fragmented ones. If you like them, contribute! In particular, looks like munch could use a codecov badge and addict could use a python version badge.
addict pros:
recursive initialization (foo.a.b.c = 'bar'), dict-like arguments become addict.Dict
addict cons:
shadows typing.Dict if you from addict import Dict
No key checking. Due to allowing recursive init, if you misspell a key, you just create a new attribute, rather than KeyError (thanks AljoSt)
munch pros:
unique naming
built-in ser/de functions for JSON and YAML
munch cons:
no recursive init (you cannot construct foo.a.b.c = 'bar'; you must set foo.a, then foo.a.b, etc.)
Wherein I Editorialize
Many moons ago, when I used text editors to write python, on projects with only myself or one other dev, I liked the style of dict-attrs, the ability to insert keys by just declaring foo.bar.spam = eggs. Now I work on teams, and use an IDE for everything, and I have drifted away from these sorts of data structures and dynamic typing in general, in favor of static analysis, functional techniques and type hints. I've started experimenting with this technique, subclassing Pstruct with objects of my own design:
class BasePstruct(dict):
    def __getattr__(self, name):
        if name in self.__slots__:
            return self[name]
        return self.__getattribute__(name)

    def __setattr__(self, key, value):
        if key in self.__slots__:
            self[key] = value
            return
        if key in type(self).__dict__:
            self[key] = value
            return
        raise AttributeError(
            "type object '{}' has no attribute '{}'".format(type(self).__name__, key))


class FooPstruct(BasePstruct):
    __slots__ = ['foo', 'bar']
This gives you an object which still behaves like a dict, but also lets you access keys like attributes, in a much more rigid fashion. The advantage here is I (or the hapless consumers of your code) know exactly what fields can and can't exist, and the IDE can autocomplete fields. Also subclassing vanilla dict means json serialization is easy. I think the next evolution in this idea would be a custom protobuf generator which emits these interfaces, and a nice knock-on is you get cross-language data structures and IPC via gRPC for nearly free.
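A brief usage sketch of the classes above (repeated here so the example is self-contained; the values are arbitrary): declared fields work as both attributes and keys, undeclared names are rejected, and json serialization treats the object as a dict.

```python
import json


class BasePstruct(dict):
    def __getattr__(self, name):
        if name in self.__slots__:
            return self[name]
        return self.__getattribute__(name)

    def __setattr__(self, key, value):
        if key in self.__slots__:
            self[key] = value
            return
        if key in type(self).__dict__:
            self[key] = value
            return
        raise AttributeError(
            "type object '{}' has no attribute '{}'".format(type(self).__name__, key))


class FooPstruct(BasePstruct):
    __slots__ = ['foo', 'bar']


f = FooPstruct()
f.foo = 1                    # declared field: stored as f['foo']
print(f['foo'], f.foo)

try:
    f.baz = 2                # undeclared field: rejected
    rejected = False
except AttributeError:
    rejected = True

print(rejected)
print(json.dumps(f))         # serializes like a plain dict
```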
If you do decide to go with attr-dicts, it's essential to document what fields are expected, for your own (and your teammates') sanity.
Feel free to edit/update this post to keep it recent!
What if you wanted a key which was a method, such as __eq__ or __getattr__?
And you wouldn't be able to have an entry that didn't start with a letter, so using 0343853 as a key is out.
And what if you didn't want to use a string?
Tuples can be used as dict keys. How would you access a tuple key in your construct?
Also, namedtuple is a convenient structure which can provide values via the attribute access.
How about Prodict, the little Python class that I wrote to rule them all:)
Plus, you get auto code completion, recursive object instantiations and auto type conversion!
You can do exactly what you asked for:
p = Prodict()
p.foo = 1
p.bar = "baz"
Example 1: Type hinting
class Country(Prodict):
    name: str
    population: int

turkey = Country()
turkey.name = 'Turkey'
turkey.population = 79814871
Example 2: Auto type conversion
germany = Country(name='Germany', population='82175700', flag_colors=['black', 'red', 'yellow'])
print(germany.population) # 82175700
print(type(germany.population)) # <class 'int'>
print(germany.flag_colors) # ['black', 'red', 'yellow']
print(type(germany.flag_colors)) # <class 'list'>
It doesn't work in generality. Not all valid dict keys make addressable attributes ("the key"). So, you'll need to be careful.
Python objects are all basically dictionaries. So I doubt there is much performance or other penalty.
This doesn't address the original question, but should be useful for people that, like me, end up here when looking for a lib that provides this functionality.
Addict is a great lib for this: https://github.com/mewwts/addict. It takes care of many concerns mentioned in previous answers.
An example from the docs:
body = {
    'query': {
        'filtered': {
            'query': {
                'match': {'description': 'addictive'}
            },
            'filter': {
                'term': {'created_by': 'Mats'}
            }
        }
    }
}
With addict:
from addict import Dict
body = Dict()
body.query.filtered.query.match.description = 'addictive'
body.query.filtered.filter.term.created_by = 'Mats'
Just to add some variety to the answers, scikit-learn has this implemented as a Bunch:
class Bunch(dict):
    """ Scikit Learn's container object

    Dictionary-like object that exposes its keys as attributes.
    >>> b = Bunch(a=1, b=2)
    >>> b['b']
    2
    >>> b.b
    2
    >>> b.c = 6
    >>> b['c']
    6
    """

    def __init__(self, **kwargs):
        super(Bunch, self).__init__(kwargs)

    def __setattr__(self, key, value):
        self[key] = value

    def __dir__(self):
        return self.keys()

    def __getattr__(self, key):
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key)

    def __setstate__(self, state):
        pass
All you need are the __setattr__ and __getattr__ methods. __getattr__ is only called when normal attribute lookup fails, at which point it looks the name up among the dict keys. __setstate__ is a fix for pickling/unpickling "bunches"; if interested, check https://github.com/scikit-learn/scikit-learn/issues/6196
Here's a short example of immutable records using built-in collections.namedtuple:
from collections import namedtuple
def record(name, d):
    return namedtuple(name, d.keys())(**d)
and a usage example:
rec = record('Model', {
    'train_op': train_op,
    'loss': loss,
})
print rec.loss(..)
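For a version that runs on its own, here is a small sketch of the same record helper. The field values are placeholders chosen for illustration (the original train_op and loss come from the author's own program), and the last part shows what "immutable" means in practice.

```python
from collections import namedtuple


def record(name, d):
    # Field names come from the dict keys, so they must be valid identifiers.
    return namedtuple(name, d.keys())(**d)


# Placeholder values; any hashable string keys that are identifiers will do.
rec = record("Model", {"train_op": "minimize", "loss": 0.25})
print(rec.loss)

try:
    rec.loss = 0.1  # namedtuple instances do not allow attribute assignment
    mutated = True
except AttributeError:
    mutated = False
print("mutated:", mutated)
```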
After not being satisfied with the existing options for the reasons below I developed MetaDict. It behaves exactly like dict but enables dot notation and IDE autocompletion without the shortcomings and potential namespace conflicts of other solutions. All features and usage examples can be found on GitHub (see link above).
Full disclosure: I am the author of MetaDict.
Shortcomings/limitations I encountered when trying out other solutions:
Addict
No key autocompletion in IDE
Nested key assignment cannot be turned off
Newly assigned dict objects are not converted to support attribute-style key access
Shadows inbuilt type Dict
Prodict
No key autocompletion in IDE without defining a static schema (similar to dataclass)
No recursive conversion of dict objects when embedded in list or other inbuilt iterables
AttrDict
No key autocompletion in IDE
Converts list objects to tuple behind the scenes
Munch
Inbuilt methods like items(), update(), etc. can be overwritten with obj.items = [1, 2, 3]
No recursive conversion of dict objects when embedded in list or other inbuilt iterables
EasyDict
Only strings are valid keys, but dict accepts all hashable objects as keys
Inbuilt methods like items(), update(), etc. can be overwritten with obj.items = [1, 2, 3]
Inbuilt methods don't behave as expected: obj.pop('unknown_key', None) raises an AttributeError
Apparently there is now a library for this - https://pypi.python.org/pypi/attrdict - which implements this exact functionality plus recursive merging and json loading. Might be worth a look.
You can do it using this class I just made. With this class you can use the Map object like any other dictionary (including JSON serialization) or with dot notation. I hope it helps you:
class Map(dict):
    """
    Example:
    m = Map({'first_name': 'Eduardo'}, last_name='Pool', age=24, sports=['Soccer'])
    """
    def __init__(self, *args, **kwargs):
        super(Map, self).__init__(*args, **kwargs)
        for arg in args:
            if isinstance(arg, dict):
                # items() works on Python 2 and 3; iteritems() is Python 2 only
                for k, v in arg.items():
                    self[k] = v
        if kwargs:
            for k, v in kwargs.items():
                self[k] = v
    def __getattr__(self, attr):
        # Missing keys return None instead of raising AttributeError
        return self.get(attr)
    def __setattr__(self, key, value):
        self.__setitem__(key, value)
    def __setitem__(self, key, value):
        super(Map, self).__setitem__(key, value)
        self.__dict__.update({key: value})
    def __delattr__(self, item):
        self.__delitem__(item)
    def __delitem__(self, key):
        super(Map, self).__delitem__(key)
        del self.__dict__[key]
Usage examples:
m = Map({'first_name': 'Eduardo'}, last_name='Pool', age=24, sports=['Soccer'])
# Add new key
m.new_key = 'Hello world!'
print(m.new_key)
print(m['new_key'])
# Update values
m.new_key = 'Yay!'
# Or
m['new_key'] = 'Yay!'
# Delete key
del m.new_key
# Or
del m['new_key']
Let me post another implementation, which builds upon the answer of Kinvais, but integrates ideas from the AttributeDict proposed in http://databio.org/posts/python_AttributeDict.html.
The advantage of this version is that it also works for nested dictionaries:
class AttrDict(dict):
    """
    A class to convert a nested Dictionary into an object with key-values
    that are accessible using attribute notation (AttrDict.attribute) instead of
    key notation (Dict["key"]). This class recursively sets Dicts to objects,
    allowing you to recurse down nested dicts (like: AttrDict.attr.attr)
    """
    # Inspired by:
    # http://stackoverflow.com/a/14620633/1551810
    # http://databio.org/posts/python_AttributeDict.html
    def __init__(self, iterable, **kwargs):
        super(AttrDict, self).__init__(iterable, **kwargs)
        for key, value in iterable.items():
            if isinstance(value, dict):
                self.__dict__[key] = AttrDict(value)
            else:
                self.__dict__[key] = value
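A short usage sketch of this class may help. The class body is repeated so the example runs on its own, and the cfg data is made up for illustration. It also shows one limitation that follows from the code: attributes are only populated in __init__, so keys added later are not mirrored as attributes.

```python
class AttrDict(dict):
    def __init__(self, iterable, **kwargs):
        super(AttrDict, self).__init__(iterable, **kwargs)
        for key, value in iterable.items():
            if isinstance(value, dict):
                self.__dict__[key] = AttrDict(value)
            else:
                self.__dict__[key] = value


# Made-up configuration data, purely for illustration.
cfg = AttrDict({"db": {"host": "localhost", "port": 5432}, "debug": True})

print(cfg.db.host)        # nested attribute access
print(cfg["db"]["port"])  # normal dict access still works

cfg["timeout"] = 30       # added after construction
print(hasattr(cfg, "timeout"))  # the new key is not mirrored as an attribute
```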
This is what I use
from collections import namedtuple
args = {
    'batch_size': 32,
    'workers': 4,
    'train_dir': 'train',
    'val_dir': 'val',
    'lr': 1e-3,
    'momentum': 0.9,
    'weight_decay': 1e-4
}
args = namedtuple('Args', ' '.join(list(args.keys())))(**args)
print(args.lr)
The easiest way is to define a class, let's call it Namespace, which calls update() on the object's __dict__ with the dict. Then the dict's keys can be accessed as attributes of the object.
class Namespace(object):
    '''
    helps referencing object in a dictionary as dict.key instead of dict['key']
    '''
    def __init__(self, adict):
        self.__dict__.update(adict)
Person = Namespace({'name': 'ahmed',
                    'age': 30})
print(Person.name)
No need to write your own, as setattr() and getattr() already exist.
The advantage of class objects probably comes into play in class definition and inheritance.
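As a rough sketch of what using the built-ins looks like, the snippet below copies a dict onto a plain object with setattr() and reads it back with getattr(). The Config class and the settings data are hypothetical names used only for this example.

```python
class Config(object):
    """Hypothetical empty container class, used only for this illustration."""


settings = {"name": "ahmed", "age": 30}

obj = Config()
for key, value in settings.items():
    setattr(obj, key, value)  # same effect as obj.<key> = value

print(obj.name)
print(getattr(obj, "age"))
print(getattr(obj, "missing", None))  # a default avoids AttributeError
```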
I created this based on the input from this thread. I need to use odict though, so I had to override get and set attr. I think this should work for the majority of special uses.
Usage looks like this:
# Create an ordered dict normally...
>>> od = OrderedAttrDict()
>>> od["a"] = 1
>>> od["b"] = 2
>>> od
OrderedAttrDict([('a', 1), ('b', 2)])
# Get and set data using attribute access...
>>> od.a
1
>>> od.b = 20
>>> od
OrderedAttrDict([('a', 1), ('b', 20)])
# Setting a NEW attribute only creates it on the instance, not the dict...
>>> od.c = 8
>>> od
OrderedAttrDict([('a', 1), ('b', 20)])
>>> od.c
8
The class:
class OrderedAttrDict(odict.OrderedDict):
    """
    Constructs an odict.OrderedDict with attribute access to data.
    Setting a NEW attribute only creates it on the instance, not the dict.
    Setting an attribute that is a key in the data will set the dict data but
    will not create a new instance attribute
    """
    def __getattr__(self, attr):
        """
        Try to get the data. If attr is not a key, fall-back and get the attr
        """
        # "attr in self" replaces has_key(), which only exists on Python 2
        if attr in self:
            return super(OrderedAttrDict, self).__getitem__(attr)
        else:
            return super(OrderedAttrDict, self).__getattr__(attr)
    def __setattr__(self, attr, value):
        """
        Try to set the data. If attr is not a key, fall-back and set the attr
        """
        if attr in self:
            super(OrderedAttrDict, self).__setitem__(attr, value)
        else:
            super(OrderedAttrDict, self).__setattr__(attr, value)
This is a pretty cool pattern already mentioned in the thread, but if you just want to take a dict and convert it to an object that works with auto-complete in an IDE, etc:
class ObjectFromDict(object):
    def __init__(self, d):
        self.__dict__ = d
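One consequence of this pattern worth knowing: the object does not copy the dict, it uses it directly as its attribute storage. A small sketch, with made-up data, showing that changes are visible from both sides:

```python
class ObjectFromDict(object):
    def __init__(self, d):
        # The object now uses d itself as its attribute storage (no copy).
        self.__dict__ = d


data = {"color": "blue"}
obj = ObjectFromDict(data)

obj.year = 2050          # writes through to the original dict
data["color"] = "red"    # is visible through the object

print(data["year"], obj.color)
```

If that sharing is not wanted, passing dict(d) instead of d is one way to decouple the two.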
Use SimpleNamespace:
from types import SimpleNamespace
obj = SimpleNamespace(color="blue", year=2050)
print(obj.color) #> "blue"
print(obj.year) #> 2050
EDIT / UPDATE: a closer answer to the OP's question, starting from a dictionary:
from types import SimpleNamespace
params = {"color":"blue", "year":2020}
obj = SimpleNamespace(**params)
print(obj.color) #> "blue"
print(obj.year) #> 2020
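SimpleNamespace(**params) only converts the top level. If the data arrives as JSON, one possible way to get nested dot access is the object_hook parameter of json.loads, sketched below. The JSON string is made up for illustration, and this assumes every key is usable as an attribute name.

```python
import json
from types import SimpleNamespace

# Made-up JSON document, purely for illustration.
raw = '{"user": {"name": "sholsapp", "langs": ["python", "c"]}, "year": 2020}'

# object_hook is called for every decoded JSON object, innermost first.
obj = json.loads(raw, object_hook=lambda d: SimpleNamespace(**d))

print(obj.user.name)
print(obj.user.langs[0])
print(obj.year)
```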
What would be the caveats and pitfalls of accessing dict keys in this manner?
As @Henry suggests, one reason dotted access may not be used in dicts is that it limits dict key names to valid Python identifiers, thereby restricting the set of possible names.
The following are examples on why dotted-access would not be helpful in general, given a dict, d:
Validity
The following attributes would be invalid in Python:
d.1_foo # enumerated names
d./bar # path names
d.21.7, d.12:30 # decimals, time
d."" # empty strings
d.john doe, d.denny's # spaces, misc punctuation
d.3 * x # expressions
Style
PEP8 conventions would impose a soft constraint on attribute naming:
A. Reserved keyword (or builtin function) names:
d.in
d.False, d.True
d.max, d.min
d.sum
d.id
If a function argument's name clashes with a reserved keyword, it is generally better to append a single trailing underscore ...
B. The case rule on methods and variable names:
Variable names follow the same convention as function names.
d.Firstname
d.Country
Use the function naming rules: lowercase with words separated by underscores as necessary to improve readability.
Sometimes these concerns are raised in libraries like pandas, which permits dotted-access of DataFrame columns by name. The default mechanism to resolve naming restrictions is also array-notation - a string within brackets.
If these constraints do not apply to your use case, there are several options on dotted-access data structures.
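To check whether a given set of keys runs into the validity constraints above, a small helper can be sketched with str.isidentifier() and the keyword module. The function name dot_accessible is an assumption made for this example, and the check covers only syntax, not the PEP8 style concerns.

```python
import keyword


def dot_accessible(key):
    """Return True if key could be written as obj.<key> in source code."""
    return (
        isinstance(key, str)
        and key.isidentifier()
        and not keyword.iskeyword(key)
    )


keys = ["name", "1_foo", "/bar", "", "john doe", "in", "max", 3]
for key in keys:
    print(repr(key), dot_accessible(key))
```

Note that a name like max passes this check: it is legal as an attribute, and only the style guidance discourages it.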
This answer is taken from the book Fluent Python by Luciano Ramalho, so credit goes to him.
from collections.abc import Mapping, MutableSequence
class AttrDict:
    """A read-only façade for navigating a JSON-like object
    using attribute notation
    """
    def __init__(self, mapping):
        self._data = dict(mapping)
    def __getattr__(self, name):
        if hasattr(self._data, name):
            return getattr(self._data, name)
        else:
            return AttrDict.build(self._data[name])
    @classmethod
    def build(cls, obj):
        if isinstance(obj, Mapping):
            return cls(obj)
        elif isinstance(obj, MutableSequence):
            return [cls.build(item) for item in obj]
        else:
            return obj
In __init__ we take the mapping and copy it into a plain dict. When __getattr__ is called, we first check whether the underlying dict itself has an attribute with that name (for example keys or items) and return it if so. Otherwise we look the name up as a key and pass the value to a class method called build. build does the interesting part: if the object is a dict or another mapping, it is wrapped in an AttrDict itself; if it is a mutable sequence such as a list, each item is passed through build in turn; anything else, such as a str or int, is returned unchanged.
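The walkthrough above can be tried out with a short sketch. The class is repeated so the snippet runs on its own, and the feed data is invented for illustration.

```python
from collections.abc import Mapping, MutableSequence


class AttrDict:
    def __init__(self, mapping):
        self._data = dict(mapping)

    def __getattr__(self, name):
        # Dict methods such as keys() take priority over stored keys.
        if hasattr(self._data, name):
            return getattr(self._data, name)
        return AttrDict.build(self._data[name])

    @classmethod
    def build(cls, obj):
        if isinstance(obj, Mapping):
            return cls(obj)
        elif isinstance(obj, MutableSequence):
            return [cls.build(item) for item in obj]
        return obj


# Invented JSON-like data, purely for illustration.
feed = AttrDict({"name": "PyCon", "talks": [{"title": "Intro"}, {"title": "Async"}]})

print(feed.name)
print(feed.talks[1].title)   # list items are wrapped recursively
print(sorted(feed.keys()))   # dict methods are forwarded to the inner dict
```

With this version a missing key surfaces as a KeyError rather than an AttributeError, since the lookup is a plain self._data[name].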
class AttrDict(dict):
    def __init__(self):
        self.__dict__ = self
if __name__ == '__main__':
    d = AttrDict()
    d['ray'] = 'hope'
    d.sun = 'shine'  # Now we can use this . notation
    print(d['ray'])
    print(d.sun)
Solution is:
DICT_RESERVED_KEYS = vars(dict).keys()
class SmartDict(dict):
    """
    A Dict which is accessible via attribute dot notation
    """
    def __init__(self, *args, **kwargs):
        """
        :param args: multiple dicts ({}, {}, ..)
        :param kwargs: arbitrary keys='value'
        If ``keyerror=False`` is passed then not found attributes will
        always return None.
        """
        super(SmartDict, self).__init__()
        self['__keyerror'] = kwargs.pop('keyerror', True)
        for arg in args:
            if isinstance(arg, dict):
                self.update(arg)
        self.update(kwargs)
    def __getattr__(self, attr):
        if attr not in DICT_RESERVED_KEYS:
            if self['__keyerror']:
                return self[attr]
            else:
                return self.get(attr)
        return getattr(self, attr)
    def __setattr__(self, key, value):
        if key in DICT_RESERVED_KEYS:
            raise AttributeError("You cannot set a reserved name as attribute")
        self.__setitem__(key, value)
    def __copy__(self):
        return self.__class__(self)
    def copy(self):
        return self.__copy__()
You can use dict_to_obj
https://pypi.org/project/dict-to-obj/
It does exactly what you asked for
from dict_to_obj import DictToObj
a = {
    'foo': True
}
b = DictToObj(a)
b.foo
True
This isn't a 'good' answer, but I thought this was nifty (it doesn't handle nested dicts in current form). Simply wrap your dict in a function:
def make_funcdict(d=None, **kwargs):
    def funcdict(d=None, **kwargs):
        if d is not None:
            funcdict.__dict__.update(d)
        funcdict.__dict__.update(kwargs)
        return funcdict.__dict__
    funcdict(d, **kwargs)
    return funcdict
Now you have slightly different syntax. To access the dict items as attributes, do f.key. To access the dict items (and other dict methods) in the usual manner, do f()['key']. We can conveniently update the dict by calling f with keyword arguments and/or a dictionary.
Example
d = {'name':'Henry', 'age':31}
d = make_funcdict(d)
>>> for key in d():
... print key
...
age
name
>>> print d.name
... Henry
>>> print d.age
... 31
>>> d({'Height':'5-11'}, Job='Carpenter')
... {'age': 31, 'name': 'Henry', 'Job': 'Carpenter', 'Height': '5-11'}
And there it is. I'll be happy if anyone suggests benefits and drawbacks of this method.
EDIT: NeoBunch is deprecated; Munch (mentioned above) can be used as a drop-in replacement. I'm leaving this solution here, though, as it may be useful to someone.
As noted by Doug, there's a Bunch package which you can use to achieve the obj.key functionality. Actually, there's a newer version called NeoBunch (since superseded by Munch).
It does have a great feature, though: converting your dict to a NeoBunch object through its neobunchify function. I use Mako templates a lot, and passing data as NeoBunch objects makes them far more readable. So if you end up using a normal dict in your Python program but want dot notation in a Mako template, you can use it this way:
from mako.template import Template
from neobunch import neobunchify
mako_template = Template(filename='mako.tmpl', strict_undefined=True)
data = {'tmpl_data': [{'key1': 'value1', 'key2': 'value2'}]}
with open('out.txt', 'w') as out_file:
    out_file.write(mako_template.render(**neobunchify(data)))
And the Mako template could look like:
% for d in tmpl_data:
Column1 Column2
${d.key1} ${d.key2}
% endfor