Python: Quick and dirty datatypes (DTO) - python

Very often, I find myself coding trivial datatypes like
class Pruefer:
    def __init__(self, ident, maxNum=float('inf'), name=""):
        self.ident = ident
        self.maxNum = maxNum
        self.name = name
While this is very useful (clearly I don't want to replace the above with anonymous 3-tuples), it's also a lot of boilerplate.
Now for example, when I want to use the class in a dict, I have to add more boilerplate like
def __hash__(self):
    return hash((self.ident, self.maxNum, self.name))
I admit that it might be difficult to recognize a general pattern amongst all my boilerplate classes, but nevertheless I'd like to ask this question:
Are there any
popular idioms in python to derive quick and dirty datatypes with named accessors?
Or, if there are not, maybe a Python guru might want to show off some metaclass hacking or a class factory to make my life easier?

>>> from collections import namedtuple
>>> Pruefer = namedtuple("Pruefer", "ident maxNum name")
>>> pr = Pruefer(1,2,3)
>>> pr.ident
1
>>> pr.maxNum
2
>>> pr.name
3
>>> hash(pr)
2528502973977326415
To provide default values, you need to do a little bit more. A simple solution is to write a subclass that redefines the __new__ method:
>>> class Pruefer(namedtuple("Pruefer", "ident maxNum name")):
...     def __new__(cls, ident, maxNum=float('inf'), name=""):
...         return super(Pruefer, cls).__new__(cls, ident, maxNum, name)
...
>>> Pruefer(1)
Pruefer(ident=1, maxNum=inf, name='')
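On Python 3.7 and later, the subclassing trick above is no longer necessary: namedtuple itself accepts a defaults parameter. A minimal sketch:

```python
from collections import namedtuple

# Since Python 3.7, `defaults` applies to the rightmost fields,
# so ident stays required while maxNum and name get defaults.
Pruefer = namedtuple("Pruefer", "ident maxNum name",
                     defaults=[float('inf'), ""])

print(Pruefer(1))          # Pruefer(ident=1, maxNum=inf, name='')
print(Pruefer(1, 5, "x"))  # Pruefer(ident=1, maxNum=5, name='x')
```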

One of the most promising features of Python 3.6 is variable annotations. They allow you to define a namedtuple as a class:
In [1]: from typing import NamedTuple
In [2]: class Pruefer(NamedTuple):
   ...:     ident: int
   ...:     max_num: int
   ...:     name: str
   ...:
In [3]: Pruefer(1,4,"name")
Out[3]: Pruefer(ident=1, max_num=4, name='name')
It is the same as a namedtuple, but it keeps the annotations and allows type checking with a static analyzer like mypy.
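typing.NamedTuple also supports default values directly, assigned at class level (fields with defaults must come after fields without):

```python
from typing import NamedTuple

class Pruefer(NamedTuple):
    ident: int
    max_num: float = float('inf')  # defaulted fields follow required ones
    name: str = ""

print(Pruefer(1))  # Pruefer(ident=1, max_num=inf, name='')
```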
Update: 15.05.2018
Now, in Python 3.7, dataclasses are present, so they would be the preferable way of defining DTOs; for backward compatibility you could use the attrs library.

Are there any popular idioms in python to derive quick ... datatypes with named accessors?
Dataclasses. They accomplish this exact need.
Some answers have mentioned dataclasses, but here is an example.
Code
import dataclasses as dc

@dc.dataclass(unsafe_hash=True)
class Pruefer:
    ident: int
    maxnum: float = float("inf")
    name: str = ""
Demo
pr = Pruefer(1, 2.0, "3")
pr
# Pruefer(ident=1, maxnum=2.0, name='3')
pr.ident
# 1
pr.maxnum
# 2.0
pr.name
# '3'
hash(pr)
# -5655986875063568239
Details
You get:
pretty reprs
default values
hashing
dotted attribute-access
... much more
You don't (directly) get:
tuple unpacking (unlike namedtuple)
Here's a guide on the details of dataclasses.
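If you do occasionally need the unpacking that namedtuple gives you, the stdlib helpers dataclasses.astuple and dataclasses.asdict bridge the gap; a sketch restating the class above:

```python
import dataclasses as dc

@dc.dataclass(unsafe_hash=True)
class Pruefer:
    ident: int
    maxnum: float = float("inf")
    name: str = ""

pr = Pruefer(1, 2.0, "3")
ident, maxnum, name = dc.astuple(pr)  # tuple-style unpacking on demand
print(dc.asdict(pr))                  # {'ident': 1, 'maxnum': 2.0, 'name': '3'}
```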

I don't have much to add to the already excellent answer by Alexey Kachayev -- However, one thing that may be useful is the following pattern:
Pruefer.__new__.__defaults__ = (1, float('inf'), "")
(On Python 2 this attribute was spelled func_defaults; __defaults__ works on both Python 2.6+ and 3.)
This would allow you to create a factory function which returns a new namedtuple that can have default arguments:
def default_named_tuple(name, args, defaults=None):
    named_tuple = collections.namedtuple(name, args)
    if defaults is not None:
        named_tuple.__new__.__defaults__ = defaults
    return named_tuple
This may seem like black magic -- it did to me at first, but it's all documented in the Data Model and discussed in this post.
In action:
>>> default_named_tuple("Pruefer", "ident maxNum name",(1,float('inf'),''))
<class '__main__.Pruefer'>
>>> Pruefer = default_named_tuple("Pruefer", "ident maxNum name",(1,float('inf'),''))
>>> Pruefer()
Pruefer(ident=1, maxNum=inf, name='')
>>> Pruefer(3)
Pruefer(ident=3, maxNum=inf, name='')
>>> Pruefer(3,10050)
Pruefer(ident=3, maxNum=10050, name='')
>>> Pruefer(3,10050,"cowhide")
Pruefer(ident=3, maxNum=10050, name='cowhide')
>>> Pruefer(maxNum=12)
Pruefer(ident=1, maxNum=12, name='')
And only specifying some of the arguments as defaults:
>>> Pruefer = default_named_tuple("Pruefer", "ident maxNum name",(float('inf'),''))
>>> Pruefer(maxNum=12)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: __new__() takes at least 2 arguments (2 given)
>>> Pruefer(1,maxNum=12)
Pruefer(ident=1, maxNum=12, name='')
Note that as written, It's probably only safe to pass a tuple in as defaults. However, you could easily get more fancy by ensuring you have a reasonable tuple object within the function.
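For instance, coercing defaults to a tuple inside the factory makes it safe to pass any iterable; a sketch (note that on Python 3 the attribute is spelled __new__.__defaults__):

```python
import collections

def default_named_tuple(name, args, defaults=None):
    named_tuple = collections.namedtuple(name, args)
    if defaults is not None:
        defaults = tuple(defaults)  # accept lists, generators, etc.
        if len(defaults) > len(named_tuple._fields):
            raise ValueError("more defaults than fields")
        named_tuple.__new__.__defaults__ = defaults
    return named_tuple

Pruefer = default_named_tuple("Pruefer", "ident maxNum name", [float('inf'), ''])
print(Pruefer(3))  # Pruefer(ident=3, maxNum=inf, name='')
```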

An alternate approach which might help you make your boilerplate code a little more generic is to iterate over the (local) variable dict. This enables you to put your variable names in a list and process them in a loop, e.g.:
class Pruefer:
    def __init__(self, ident, maxNum=float('inf'), name=""):
        for n in "ident maxNum name".split():
            v = locals()[n]      # extract value from local variables
            setattr(self, n, v)  # set member variable

    def printMemberVars(self):
        print("Member variables are:")
        for k, v in vars(self).items():
            print("    {}: '{}'".format(k, v))

P = Pruefer("Id", 100, "John")
P.printMemberVars()
gives:
Member variables are:
    ident: 'Id'
    maxNum: '100'
    name: 'John'
From the viewpoint of efficient resource usage, this approach is of course suboptimal.
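A variant of the same idea that avoids reaching into locals(): since vars(self) is the instance's attribute dictionary, a single update() call sets all members at once. A sketch:

```python
class Pruefer:
    def __init__(self, ident, maxNum=float('inf'), name=""):
        # vars(self) is self.__dict__, so update() sets every member in one go
        vars(self).update(ident=ident, maxNum=maxNum, name=name)

p = Pruefer("Id", 100, "John")
print(vars(p))  # {'ident': 'Id', 'maxNum': 100, 'name': 'John'}
```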

If using Python 3.7, you can use Data Classes; Data Classes can be thought of as "mutable namedtuples with defaults".
https://docs.python.org/3/library/dataclasses.html
https://www.python.org/dev/peps/pep-0557/

Related

How to type-hint a method that retrieves dynamic enum value?

I have a Python module that has a number of simple enums that are defined as follows:
class WordType(Enum):
    ADJ = "Adjective"
    ADV = "Adverb"

class Number(Enum):
    S = "Singular"
    P = "Plural"
Because there are a lot of these enums and I only decide at runtime which enums to query for any given input, I wanted a function that can retrieve the value given the enum-type and the enum-value as strings. I succeeded in doing that as follows:
names = inspect.getmembers(sys.modules[__name__], inspect.isclass)

def get_enum_type(name: str):
    enum_class = [x[1] for x in names if x[0] == name]
    return enum_class[0]

def get_enum_value(object_name: str, value_name: str):
    return get_enum_type(object_name)[value_name]
This works well, but now I'm adding type hinting and I'm struggling with how to define the return types for these methods: I've tried slice and Literal[], both suggested by mypy, but neither checks out (maybe because I don't understand what type parameter I can give to Literal[]).
I am willing to modify the enum definitions, but I'd prefer to keep the dynamic querying as-is. Worst case scenario, I can do # type: ignore or just return -> Any, but I hope there's something better.
As you don't want to type-check for any Enum, I suggest introducing a base type (say GrammaticalEnum) to mark all your enums, and grouping them in their own module:
# module grammar_enums
import sys
import inspect
from enum import Enum

class GrammaticalEnum(Enum):
    """use as a base to mark all grammatical enums"""
    pass

class WordType(GrammaticalEnum):
    ADJ = "Adjective"
    ADV = "Adverb"

class Number(GrammaticalEnum):
    S = "Singular"
    P = "Plural"

# keep this statement at the end, as all enums must be known first
grammatical_enums = dict(
    m for m in inspect.getmembers(sys.modules[__name__], inspect.isclass)
    if issubclass(m[1], GrammaticalEnum))

# you might prefer the shorter alternative:
# grammatical_enums = {k: v for (k, v) in globals().items()
#                      if inspect.isclass(v) and issubclass(v, GrammaticalEnum)}
Regarding typing, yakir0 already suggested the right types, but with the common base you can narrow them. If you like, you could even get rid of your functions entirely:
from typing import Type
from grammar_enums import grammatical_enums as g_enums
from grammar_enums import GrammaticalEnum

# just use g_enums instead of get_enum_value like this
WordType_ADJ: GrammaticalEnum = g_enums['WordType']['ADJ']

# ...or use your old functions:
# as your grammatical enums are collected in a dict now,
# you don't need this function any more:
def get_enum_type(name: str) -> Type[GrammaticalEnum]:
    return g_enums[name]

def get_enum_value(enum_name: str, value_name: str) -> GrammaticalEnum:
    # return get_enum_type(enum_name)[value_name]
    return g_enums[enum_name][value_name]
You can always run your functions and print the result of the function to get a sense of what it should be. Note that you can use Enum in type hinting like any other class.
For example:
>>> result = get_enum_type('WordType')
... print(result)
... print(type(result))
<enum 'WordType'>
<class 'enum.EnumMeta'>
So you can actually use
get_enum_type(name: str) -> EnumMeta
But you can make it prettier by using Type from typing since EnumMeta is the type of a general Enum.
get_enum_type(name: str) -> Type[Enum]
For a similar process with get_enum_value you get
>>> type(get_enum_value('WordType', 'ADJ'))
<enum 'WordType'>
Obviously you won't always return the type WordType so you can use Enum to generalize the return type.
To sum it all up:
get_enum_type(name: str) -> Type[Enum]
get_enum_value(object_name: str, value_name: str) -> Enum
As I said in a comment, I don't think it's possible to have your dynamic code and have MyPy predict the outputs. For example, I don't think it would be possible to have MyPy know that get_enum_type("WordType") should be a WordType whereas get_enum_type("Number") should be a Number.
As others have said, you could clarify that they'll be Enums. You could add a base type, and say that they'll specifically be one of the base types. Part of the problem is that, although you could promise it, MyPy wouldn't be able to confirm. It can't know that inspect.getmembers(sys.modules[__name__], inspect.isclass) will just have Enums or GrammaticalEnums in [1].
If you're willing to change the implementation of your lookup, then I'd suggest you could make profitable use of __init_subclass__. Something like
GRAMMATICAL_ENUM_LOOKUP: "Dict[str, Type[GrammaticalEnum]]" = {}

class GrammaticalEnum(Enum):
    def __init_subclass__(cls, **kwargs):
        GRAMMATICAL_ENUM_LOOKUP[cls.__name__] = cls
        super().__init_subclass__(**kwargs)

def get_enum_type(name: str) -> Type[GrammaticalEnum]:
    return GRAMMATICAL_ENUM_LOOKUP[name]
This at least has the advantage that MyPy can see what's going on, and should be broadly happy with it. It knows that everything will in fact be a valid GrammaticalEnum because that's all that GRAMMATICAL_ENUM_LOOKUP gets populated with.
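A quick demonstration of the registration (a sketch; __init_subclass__ runs when each subclass is created, and the recorded class object is the very one that later carries the members):

```python
from enum import Enum
from typing import Dict, Type

GRAMMATICAL_ENUM_LOOKUP: Dict[str, Type["GrammaticalEnum"]] = {}

class GrammaticalEnum(Enum):
    def __init_subclass__(cls, **kwargs):
        GRAMMATICAL_ENUM_LOOKUP[cls.__name__] = cls  # register by class name
        super().__init_subclass__(**kwargs)

class Number(GrammaticalEnum):
    S = "Singular"
    P = "Plural"

print(GRAMMATICAL_ENUM_LOOKUP["Number"]["S"])  # Number.S
```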

Why does the Method Resolution Order affect the behavior of my code?

I was looking into how the order in which you declare classes to inherit from affects Method Resolution Order (Detailed Here By Raymond Hettinger). I personally was using this to elegantly create an Ordered Counter via this code:
from collections import Counter, OrderedDict

class OrderedCounter(Counter, OrderedDict):
    pass

counts = OrderedCounter([1, 2, 3, 1])
print(*counts.items())
# (1, 2) (2, 1) (3, 1)
I was trying to understand why the following didn't work similarly:
class OrderedCounter(OrderedDict, Counter):
    pass

counts = OrderedCounter([1, 2, 3, 1])
print(*counts.items())
# TypeError: 'int' object is not iterable
I understand that on a fundamental level this is because the OrderedCounter object uses the OrderedDict.__init__() function in the second example, which according to the documentation only accepts "[items]". In the first example, the Counter.__init__() function is used, which according to the documentation accepts "[iterable-or-mapping]" and thus can take the list as input.
I wanted to further understand this interaction specifically though so I went to look at the actual source. When I looked at the OrderedDict.__init__() function I noticed that after some error handling it made a call to self.update(*args, **kwds). However, the code simply has the line update = MutableMapping.update which I can't find much documentation on.
I guess I would just like a more concrete answer as to why the second code block doesn't work.
Note: For context, I have a decent amount of programming experience but I'm new to python and OOP in Python
TLDR: How/Why does the Method Resolution Order interfere with the second code block?
In your second example, class OrderedCounter(OrderedDict, Counter):, the object looks in OrderedDict first, which uses the update method from MutableMapping.
MutableMapping is an Abstract Base Class in collections.abc. Its update method source is here. You can see that if the other argument is not a mapping, it will try to iterate over other, unpacking a key and a value on each iteration.
for key, value in other:
    self[key] = value
If other is a sequence of tuples it would work.
>>> other = ((1,2),(3,4))
>>> for key,value in other:
...     print(key,value)
...
1 2
3 4
But if other is a sequence of single items it will throw the error when it tries to unpack a single value into two names/variables.
>>> other = (1,2,3,4)
>>> for key,value in other:
...     print(key,value)
...
Traceback (most recent call last):
  File "<pyshell#50>", line 1, in <module>
    for key,value in other:
TypeError: cannot unpack non-iterable int object
Whereas collections.Counter's update method calls a different function if other is not a Mapping.
else:
    _count_elements(self, iterable)
_count_elements adds one to the count of each element, creating keys for new items as needed (their counts effectively start from zero).
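The two update behaviors are easy to compare side by side:

```python
from collections import Counter, OrderedDict

c = Counter()
c.update([1, 2, 3, 1])      # plain iterable: each element is counted
print(c)                    # Counter({1: 2, 2: 1, 3: 1})

d = OrderedDict()
d.update([(1, 2), (3, 4)])  # needs (key, value) pairs instead
print(dict(d))              # {1: 2, 3: 4}
```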
As you probably discovered, if a class inherits from two classes it will look in the first class to find an attribute; if it isn't there, it will look in the second class.
>>> class A:
...     def __init__(self):
...         pass
...     def f(self):
...         print('class A')
...
>>> class B:
...     def __init__(self):
...         pass
...     def f(self):
...         print('class B')
...
>>> class C(A, B):
...     pass
...
>>> c = C()
>>> c.f()
class A
>>> class D(B, A):
...     pass
...
>>> d = D()
>>> d.f()
class B
In mro, children precede their parents and the order of appearance in __bases__ is respected.
In the first example, Counter is a subclass of dict. When OrderedDict is provided along with Counter, the parent dict of Counter is replaced by OrderedDict and the code works seamlessly.
In the second example, OrderedDict is again a subclass of dict. When Counter is provided along with OrderedDict, it tries to replace the parent dict of OrderedDict with Counter, which is counter intuitive (pun intended). Hence the error!!
I hope this layman explanation helps you. Just think about it for a moment.
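The lookup order itself can be inspected directly via the class's __mro__:

```python
from collections import Counter, OrderedDict

class OrderedCounter(Counter, OrderedDict):
    pass

# Counter precedes OrderedDict, so its __init__ and update win:
print([c.__name__ for c in OrderedCounter.__mro__])
# ['OrderedCounter', 'Counter', 'OrderedDict', 'dict', 'object']
```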

Serializing namedtuples via PyYAML

I'm looking for some reasonable way to serialize namedtuples in YAML using PyYAML.
A few things I don't want to do:
Rely on a dynamic call to add a constructor/representor/resolver upon instantiation of the namedtuple. These YAML files may be stored and re-loaded later, so I cannot rely on the same runtime environment existing when they are restored.
Register the namedtuples globally.
Rely on the namedtuples having unique names
I was thinking of something along these lines:
class namedtuple(object):
    def __new__(cls, *args, **kwargs):
        x = collections.namedtuple(*args, **kwargs)

        class New(x):
            def __getstate__(self):
                return {
                    "name": self.__class__.__name__,
                    "_fields": self._fields,
                    "values": self._asdict().values()
                }

        return New

def namedtuple_constructor(loader, node):
    import IPython; IPython.embed()
    value = loader.construct_scalar(node)

import re
pattern = re.compile(r'!!python/object/new:myapp.util\.')

yaml.add_implicit_resolver(u'!!myapp.util.namedtuple', pattern)
yaml.add_constructor(u'!!myapp.util.namedtuple', namedtuple_constructor)
Assuming this was in an application module at the path myapp/util.py
I'm not getting into the constructor, however, when I try to load:
from myapp.util import namedtuple
x = namedtuple('test', ['a', 'b'])
t = x(1,2)
dump = yaml.dump(t)
load = yaml.load(dump)
It will fail to find New in myapp.util.
I tried a variety of other approaches as well, this was just one that I thought might work best.
Disclaimer: Even once I get into the proper constructor, I'm aware my spec will need further work regarding what arguments get saved and how they are passed into the resulting object, but the first step for me is to get the YAML representation into my constructor function; then the rest should be easy.
I was able to solve my problem, though in a slightly less than ideal way.
My application now uses its own namedtuple implementation; I copied the collections.namedtuple source, created a base class for all new namedtuple types to inherit from, and modified the template (excerpts below for brevity, simply highlighting what's changed from the namedtuple source).
class namedtupleBase(tuple):
    pass

_class_template = '''\
class {typename}(namedtupleBase):
    '{typename}({arg_list})'
One little change to the namedtuple function itself to add the new class into the namespace:
namespace = dict(_itemgetter=_itemgetter, __name__='namedtuple_%s' % typename,
                 OrderedDict=OrderedDict, _property=property, _tuple=tuple,
                 namedtupleBase=namedtupleBase)
Now registering a multi_representer solves the problem:
def repr_namedtuples(dumper, data):
    return dumper.represent_mapping(u"!namedtupleBase", {
        "__name__": data.__class__.__name__,
        "__dict__": collections.OrderedDict(
            [(k, v) for k, v in data._asdict().items()])
    })

def construct_namedtuples(loader, node):
    value = loader.construct_mapping(node)
    cls_ = namedtuple(value['__name__'], value['__dict__'].keys())
    return cls_(*value['__dict__'].values())

yaml.add_multi_representer(namedtupleBase, repr_namedtuples)
yaml.add_constructor("!namedtupleBase", construct_namedtuples)
Hat tip to Represent instance of different classes with the same base class in pyyaml for the inspiration behind the solution.
Would love an idea that doesn't require re-creating the namedtuple function, but this accomplished my goals.
Would love an idea that doesn't require re-creating the namedtuple function, but this accomplished my goals.
Here you go.
TL;DR
Proof of concept using PyYAML 3.12.
import yaml
def named_tuple(self, data):
if hasattr(data, '_asdict'):
return self.represent_dict(data._asdict())
return self.represent_list(data)
yaml.SafeDumper.yaml_multi_representers[tuple] = named_tuple
Note: To be clean, you should use one of the add_multi_representer() methods at your disposal and a custom representer/loader, like you did.
This gives you:
>>> import collections
>>> Foo = collections.namedtuple('Foo', 'x y z')
>>> yaml.safe_dump({'foo': Foo(1,2,3), 'bar':(4,5,6)})
'bar: [4, 5, 6]\nfoo: {x: 1, y: 2, z: 3}\n'
>>> print yaml.safe_dump({'foo': Foo(1,2,3), 'bar':(4,5,6)})
bar: [4, 5, 6]
foo: {x: 1, y: 2, z: 3}
How does this work
As you discovered by yourself, a namedtuple does not have a special class; exploring it gives:
>>> collections.namedtuple('Bar', '').mro()
[<class '__main__.Bar'>, <type 'tuple'>, <type 'object'>]
So the instances of the Python named tuples are tuple instances with an additional _asdict() method.
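Which is exactly what the hasattr(data, '_asdict') dispatch above relies on:

```python
import collections

Foo = collections.namedtuple('Foo', 'x y z')
f = Foo(1, 2, 3)

print(isinstance(f, tuple))        # True: a namedtuple really is a tuple
print(hasattr((1, 2), '_asdict'))  # False: plain tuples lack _asdict
print(dict(f._asdict()))           # {'x': 1, 'y': 2, 'z': 3}
```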

How to create a new unknown or dynamic/expando object in Python

In Python, how can we create a new object without having a predefined class, and later dynamically add properties to it?
example:
dynamic_object = Dynamic()
dynamic_object.dynamic_property_a = "abc"
dynamic_object.dynamic_property_b = "abcdefg"
What is the best way to do it?
EDIT Because many people advised in comments that I might not need this.
The thing is that I have a function that serializes an object's properties. For that reason, I don't want to create an object of the expected class due to some constructor restrictions, but instead create a similar one, let's say like a mock, add any "custom" properties I need, then feed it back to the function.
Just define your own class to do it:
class Expando(object):
    pass

ex = Expando()
ex.foo = 17
ex.bar = "Hello"
If you take the metaclass approach from @Martijn's answer, @Ned's answer can be rewritten more concisely (though it's obviously less readable, it does the same thing).
obj = type('Expando', (object,), {})()
obj.foo = 71
obj.bar = 'World'
Or, which does the same as above using the dict argument:
obj = type('Expando', (object,), {'foo': 71, 'bar': 'World'})()
For Python 3, passing object to bases argument is not necessary (see type documentation).
But for simple cases instantiation doesn't have any benefit, so it is okay to do:
ns = type('Expando', (object,), {'foo': 71, 'bar': 'World'})
At the same time, I personally prefer a plain class (i.e. without instantiation) for ad-hoc test configuration cases, as the simplest and most readable option:
class ns:
    foo = 71
    bar = 'World'
Update
In Python 3.3+ there is exactly what the OP asks for: types.SimpleNamespace. It's just:
A simple object subclass that provides attribute access to its namespace, as well as a meaningful repr.
Unlike object, with SimpleNamespace you can add and remove attributes. If a SimpleNamespace object is initialized with keyword arguments, those are directly added to the underlying namespace.
import types
obj = types.SimpleNamespace()
obj.a = 123
print(obj.a) # 123
print(repr(obj)) # namespace(a=123)
However, in stdlib of both Python 2 and Python 3 there's argparse.Namespace, which has the same purpose:
Simple object for storing attributes.
Implements equality by attribute names and values, and provides a simple string representation.
import argparse
obj = argparse.Namespace()
obj.a = 123
print(obj.a) # 123
print(repr(obj)) # Namespace(a=123)
Note that both can be initialised with keyword arguments:
types.SimpleNamespace(a = 'foo',b = 123)
argparse.Namespace(a = 'foo',b = 123)
Using an object just to hold values isn't the most Pythonic style of programming. It's common in programming languages that don't have good associative containers, but in Python you can use a dictionary:
my_dict = {} # empty dict instance
my_dict["foo"] = "bar"
my_dict["num"] = 42
You can also use a "dictionary literal" to define the dictionary's contents all at once:
my_dict = {"foo":"bar", "num":42}
Or, if your keys are all legal identifiers (and they will be, if you were planning on them being attribute names), you can use the dict constructor with keyword arguments as key-value pairs:
my_dict = dict(foo="bar", num=42) # note, no quotation marks needed around keys
Filling out a dictionary is in fact what Python is doing behind the scenes when you do use an object, such as in Ned Batchelder's answer. The attributes of his ex object get stored in a dictionary, ex.__dict__, which should end up being equal to an equivalent dict created directly.
Unless attribute syntax (e.g. ex.foo) is absolutely necessary, you may as well skip the object entirely and use a dictionary directly.
Use the collections.namedtuple() class factory to create a custom class for your return value:
from collections import namedtuple
return namedtuple('Expando', ('dynamic_property_a', 'dynamic_property_b'))('abc', 'abcdefg')
The returned value can be used both as a tuple and by attribute access:
print retval[0] # prints 'abc'
print retval.dynamic_property_b # prints 'abcdefg'
One way that I found is also by creating a lambda. It can have side effects and comes with some properties that are not wanted; just posting it out of interest.
dynamic_object = lambda: None
dynamic_object.dynamic_property_a = "abc"
dynamic_object.dynamic_property_b = "abcdefg"
I define a dictionary first because it's easy to define. Then I use namedtuple to convert it to an object:
from collections import namedtuple

def dict_to_obj(d):  # avoid shadowing the built-in name `dict`
    return namedtuple("ObjectName", d.keys())(*d.values())

my_dict = {
    'name': 'The mighty object',
    'description': 'Yep! Thats me',
    'prop3': 1234
}
my_obj = dict_to_obj(my_dict)
Ned Batchelder's answer is the best. I just wanted to record a slightly different answer here, which avoids the use of the class keyword (in case that's useful for instructive reasons, demonstration of closure, etc.)
In place of a class, a function object can serve, since functions accept arbitrary attributes:
def Expando():
    def inst():
        None
    return inst

ex = Expando()
ex.foo = 17
ex.bar = "Hello"

Accessing dict keys like an attribute?

I find it more convenient to access dict keys as obj.foo instead of obj['foo'], so I wrote this snippet:
class AttributeDict(dict):
    def __getattr__(self, attr):
        return self[attr]
    def __setattr__(self, attr, value):
        self[attr] = value
However, I assume that there must be some reason that Python doesn't provide this functionality out of the box. What would be the caveats and pitfalls of accessing dict keys in this manner?
Update - 2020
Since this question was asked almost ten years ago, quite a bit has changed in Python itself since then.
While the approach in my original answer is still valid for some cases, (e.g. legacy projects stuck to older versions of Python and cases where you really need to handle dictionaries with very dynamic string keys), I think that in general the dataclasses introduced in Python 3.7 are the obvious/correct solution to vast majority of the use cases of AttrDict.
Original answer
The best way to do this is:
class AttrDict(dict):
    def __init__(self, *args, **kwargs):
        super(AttrDict, self).__init__(*args, **kwargs)
        self.__dict__ = self
Some pros:
It actually works!
No dictionary class methods are shadowed (e.g. .keys() works just fine -- unless, of course, you assign some value to it; see below)
Attributes and items are always in sync
Trying to access non-existent key as an attribute correctly raises AttributeError instead of KeyError
Supports [Tab] autocompletion (e.g. in jupyter & ipython)
Cons:
Methods like .keys() stop working if they get overwritten by incoming data
Causes a memory leak in Python < 2.7.4 / Python3 < 3.2.3
Pylint goes bananas with E1123(unexpected-keyword-arg) and E1103(maybe-no-member)
For the uninitiated it seems like pure magic.
A short explanation on how this works
All python objects internally store their attributes in a dictionary that is named __dict__.
There is no requirement that the internal dictionary __dict__ would need to be "just a plain dict", so we can assign any subclass of dict() to the internal dictionary.
In our case we simply assign the AttrDict() instance we are instantiating (as we are in __init__).
By calling super()'s __init__() method we made sure that it (already) behaves exactly like a dictionary, since that function calls all the dictionary instantiation code.
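The attribute/item synchronization can be seen directly:

```python
class AttrDict(dict):
    def __init__(self, *args, **kwargs):
        super(AttrDict, self).__init__(*args, **kwargs)
        self.__dict__ = self  # attributes and items now share one dict

d = AttrDict(a=1)
d.b = 2        # attribute write...
print(d['b'])  # 2  ...is visible as an item
d['c'] = 3     # item write...
print(d.c)     # 3  ...is visible as an attribute
```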
One reason why Python doesn't provide this functionality out of the box
As noted in the "cons" list, this combines the namespace of stored keys (which may come from arbitrary and/or untrusted data!) with the namespace of builtin dict method attributes. For example:
d = AttrDict()
d.update({'items': ["jacket", "necktie", "trousers"]})
for k, v in d.items():  # TypeError: 'list' object is not callable
    print "Never reached!"
You can have all legal string characters as part of the key if you use array notation.
For example, obj['!#$%^&*()_']
Wherein I Answer the Question That Was Asked
Why doesn't Python offer it out of the box?
I suspect that it has to do with the Zen of Python: "There should be one -- and preferably only one -- obvious way to do it." This would create two obvious ways to access values from dictionaries: obj['key'] and obj.key.
Caveats and Pitfalls
These include possible lack of clarity and confusion in the code. I.e., the following could be confusing to someone else who is going in to maintain your code at a later date, or even to you, if you're not going back into it for a while. Again, from the Zen: "Readability counts!"
>>> KEY = 'spam'
>>> d[KEY] = 1
>>> # Several lines of miscellaneous code here...
... assert d.spam == 1
If d is instantiated or KEY is defined or d[KEY] is assigned far away from where d.spam is being used, it can easily lead to confusion about what's being done, since this isn't a commonly-used idiom. I know it would have the potential to confuse me.
Additionally, if you change the value of KEY as follows (but miss changing d.spam), you now get:
>>> KEY = 'foo'
>>> d[KEY] = 1
>>> # Several lines of miscellaneous code here...
... assert d.spam == 1
Traceback (most recent call last):
File "<stdin>", line 2, in <module>
AttributeError: 'C' object has no attribute 'spam'
IMO, not worth the effort.
Other Items
As others have noted, you can use any hashable object (not just a string) as a dict key. For example,
>>> d = {(2, 3): True,}
>>> assert d[(2, 3)] is True
>>>
is legal, but
>>> C = type('C', (object,), {(2, 3): True})
>>> d = C()
>>> assert d.(2, 3) is True
File "<stdin>", line 1
d.(2, 3)
^
SyntaxError: invalid syntax
>>> getattr(d, (2, 3))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: getattr(): attribute name must be string
>>>
is not. This gives you access to the entire range of printable characters or other hashable objects for your dictionary keys, which you do not have when accessing an object attribute. This makes possible such magic as a cached object metaclass, like the recipe from the Python Cookbook (Ch. 9).
Wherein I Editorialize
I prefer the aesthetics of spam.eggs over spam['eggs'] (I think it looks cleaner), and I really started craving this functionality when I met the namedtuple. But the convenience of being able to do the following trumps it.
>>> KEYS = 'spam eggs ham'
>>> VALS = [1, 2, 3]
>>> d = {k: v for k, v in zip(KEYS.split(' '), VALS)}
>>> assert d == {'spam': 1, 'eggs': 2, 'ham': 3}
>>>
This is a simple example, but I frequently find myself using dicts in different situations than I'd use obj.key notation (i.e., when I need to read prefs in from an XML file). In other cases, where I'm tempted to instantiate a dynamic class and slap some attributes on it for aesthetic reasons, I continue to use a dict for consistency in order to enhance readability.
I'm sure the OP has long since resolved this to his satisfaction, but if he still wants this functionality, then I suggest he download one of the packages from PyPI that provides it:
Bunch is the one I'm more familiar with. Subclass of dict, so you have all that functionality.
AttrDict also looks pretty good, but I'm not as familiar with it and haven't looked through the source in as much detail as I have Bunch.
Addict is actively maintained and provides attr-like access and more.
As noted in the comments by Rotareti, Bunch has been deprecated, but there is an active fork called Munch.
However, in order to improve readability of his code I strongly recommend that he not mix his notation styles. If he prefers this notation then he should simply instantiate a dynamic object, add his desired attributes to it, and call it a day:
>>> C = type('C', (object,), {})
>>> d = C()
>>> d.spam = 1
>>> d.eggs = 2
>>> d.ham = 3
>>> assert d.__dict__ == {'spam': 1, 'eggs': 2, 'ham': 3}
Wherein I Update, to Answer a Follow-Up Question in the Comments
In the comments (below), Elmo asks:
What if you want to go one deeper? ( referring to type(...) )
While I've never used this use case (again, I tend to use nested dict, for
consistency), the following code works:
>>> C = type('C', (object,), {})
>>> d = C()
>>> for x in 'spam eggs ham'.split():
...     setattr(d, x, C())
...     i = 1
...     for y in 'one two three'.split():
...         setattr(getattr(d, x), y, i)
...         i += 1
...
>>> assert d.spam.__dict__ == {'one': 1, 'two': 2, 'three': 3}
From this other SO question there's a great implementation example that simplifies your existing code. How about:
class AttributeDict(dict):
    __slots__ = ()
    __getattr__ = dict.__getitem__
    __setattr__ = dict.__setitem__
Much more concise and doesn't leave any room for extra cruft getting into your __getattr__ and __setattr__ functions in the future.
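One subtlety worth knowing with this concise version: because __getattr__ is dict.__getitem__, a missing attribute raises KeyError rather than the expected AttributeError, which trips up hasattr() and anything else relying on the usual protocol:

```python
class AttributeDict(dict):
    __slots__ = ()
    __getattr__ = dict.__getitem__
    __setattr__ = dict.__setitem__

d = AttributeDict(foo=1)
try:
    d.missing
except KeyError as exc:  # not AttributeError!
    print("KeyError:", exc)
```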
You can pull a convenient container class from the standard library:
from argparse import Namespace
to avoid having to copy around code bits. No standard dictionary access, but easy to get one back if you really want it. The code in argparse is simple,
class Namespace(_AttributeHolder):
    """Simple object for storing attributes.

    Implements equality by attribute names and values, and provides a simple
    string representation.
    """

    def __init__(self, **kwargs):
        for name in kwargs:
            setattr(self, name, kwargs[name])

    __hash__ = None

    def __eq__(self, other):
        return vars(self) == vars(other)

    def __ne__(self, other):
        return not (self == other)

    def __contains__(self, key):
        return key in self.__dict__
Caveat emptor: for some reason, classes like this seem to break the multiprocessing package. I struggled with this bug for a while before finding this SO question:
Finding exception in python multiprocessing
I found myself wondering what the current state of "dict keys as attr" is in the Python ecosystem. As several commenters have pointed out, this is probably not something you want to roll your own from scratch, as there are several pitfalls and footguns, some of them very subtle. Also, I would not recommend using Namespace as a base class; I've been down that road, and it isn't pretty.
Fortunately, there are several open source packages providing this functionality, ready to pip install! Unfortunately, there are several packages. Here is a synopsis, as of Dec 2019.
Contenders (most recent commit to master|#commits|#contribs|coverage%):
addict (2021-01-05 | 229 | 22 | 100%)
munch (2021-01-22 | 166 | 17 | ?%)
easydict (2021-02-28 | 54 | 7 | ?%)
attrdict (2019-02-01 | 108 | 5 | 100%)
prodict (2021-03-06 | 100 | 2 | ?%)
No longer maintained or under-maintained:
treedict (2014-03-28 | 95 | 2 | ?%)
bunch (2012-03-12 | 20 | 2 | ?%)
NeoBunch
I currently recommend munch or addict. They have the most commits, contributors, and releases, suggesting a healthy open-source codebase for each. They have the cleanest-looking readme.md, 100% coverage, and a good-looking set of tests.
I do not have a dog in this race (for now!), besides having rolled my own dict/attr code and wasted a ton of time because I was not aware of all these options :). I may contribute to addict/munch in the future as I would rather see one solid package than a bunch of fragmented ones. If you like them, contribute! In particular, looks like munch could use a codecov badge and addict could use a python version badge.
addict pros:
recursive initialization (foo.a.b.c = 'bar'), dict-like arguments become addict.Dict
addict cons:
shadows typing.Dict if you from addict import Dict
No key checking. Due to allowing recursive init, if you misspell a key, you just create a new attribute rather than raising a KeyError (thanks AljoSt)
munch pros:
unique naming
built-in ser/de functions for JSON and YAML
munch cons:
no recursive init (you cannot construct foo.a.b.c = 'bar'; you must set foo.a, then foo.a.b, etc.)
Wherein I Editorialize
Many moons ago, when I used text editors to write python, on projects with only myself or one other dev, I liked the style of dict-attrs, the ability to insert keys by just declaring foo.bar.spam = eggs. Now I work on teams, and use an IDE for everything, and I have drifted away from these sorts of data structures and dynamic typing in general, in favor of static analysis, functional techniques and type hints. I've started experimenting with this technique, subclassing Pstruct with objects of my own design:
class BasePstruct(dict):
    def __getattr__(self, name):
        if name in self.__slots__:
            return self[name]
        return self.__getattribute__(name)

    def __setattr__(self, key, value):
        if key in self.__slots__:
            self[key] = value
            return
        if key in type(self).__dict__:
            self[key] = value
            return
        raise AttributeError(
            "type object '{}' has no attribute '{}'".format(type(self).__name__, key))


class FooPstruct(BasePstruct):
    __slots__ = ['foo', 'bar']
This gives you an object which still behaves like a dict, but also lets you access keys like attributes, in a much more rigid fashion. The advantage here is I (or the hapless consumers of your code) know exactly what fields can and can't exist, and the IDE can autocomplete fields. Also subclassing vanilla dict means json serialization is easy. I think the next evolution in this idea would be a custom protobuf generator which emits these interfaces, and a nice knock-on is you get cross-language data structures and IPC via gRPC for nearly free.
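A sketch of how the pattern above is meant to behave (classes condensed from the answer so the snippet runs standalone; the field names are illustrative):

```python
# Condensed version of BasePstruct/FooPstruct from above.
class BasePstruct(dict):
    def __getattr__(self, name):
        if name in self.__slots__:
            return self[name]
        return self.__getattribute__(name)

    def __setattr__(self, key, value):
        if key in self.__slots__ or key in type(self).__dict__:
            self[key] = value
            return
        raise AttributeError(
            "type object '{}' has no attribute '{}'".format(type(self).__name__, key))


class FooPstruct(BasePstruct):
    __slots__ = ['foo', 'bar']


f = FooPstruct()
f.foo = 1                  # allowed: 'foo' is declared in __slots__
assert f['foo'] == 1       # still a real dict underneath
raised = False
try:
    f.baz = 2              # undeclared field is rejected
except AttributeError:
    raised = True
assert raised
```

Because the instance is still a plain dict, json.dumps(f) works without a custom encoder, which is the "json serialization is easy" point above.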
If you do decide to go with attr-dicts, it's essential to document what fields are expected, for your own (and your teammates') sanity.
Feel free to edit/update this post to keep it recent!
What if you wanted a key which was a method, such as __eq__ or __getattr__?
And you wouldn't be able to have an entry that didn't start with a letter, so using 0343853 as a key is out.
And what if you didn't want to use a string?
Tuples can be used as dict keys. How would you access a tuple key in your construct?
Also, namedtuple is a convenient structure which can provide values via attribute access.
How about Prodict, the little Python class that I wrote to rule them all? :)
Plus, you get auto code completion, recursive object instantiations and auto type conversion!
You can do exactly what you asked for:
p = Prodict()
p.foo = 1
p.bar = "baz"
Example 1: Type hinting
class Country(Prodict):
    name: str
    population: int
turkey = Country()
turkey.name = 'Turkey'
turkey.population = 79814871
Example 2: Auto type conversion
germany = Country(name='Germany', population='82175700', flag_colors=['black', 'red', 'yellow'])
print(germany.population) # 82175700
print(type(germany.population)) # <class 'int'>
print(germany.flag_colors) # ['black', 'red', 'yellow']
print(type(germany.flag_colors)) # <class 'list'>
This doesn't work in general: not all valid dict keys make addressable attributes ("the key"), so you'll need to be careful.
Python objects are all basically dictionaries, so I doubt there is much performance or other penalty.
This doesn't address the original question, but should be useful for people that, like me, end up here when looking for a lib that provides this functionality.
Addict is a great lib for this: https://github.com/mewwts/addict It takes care of many of the concerns mentioned in previous answers.
An example from the docs:
body = {
    'query': {
        'filtered': {
            'query': {
                'match': {'description': 'addictive'}
            },
            'filter': {
                'term': {'created_by': 'Mats'}
            }
        }
    }
}
With addict:
from addict import Dict
body = Dict()
body.query.filtered.query.match.description = 'addictive'
body.query.filtered.filter.term.created_by = 'Mats'
Just to add some variety to the answer, sci-kit learn has this implemented as a Bunch:
class Bunch(dict):
    """Scikit-learn's container object

    Dictionary-like object that exposes its keys as attributes.

    >>> b = Bunch(a=1, b=2)
    >>> b['b']
    2
    >>> b.b
    2
    >>> b.c = 6
    >>> b['c']
    6
    """

    def __init__(self, **kwargs):
        super(Bunch, self).__init__(kwargs)

    def __setattr__(self, key, value):
        self[key] = value

    def __dir__(self):
        return self.keys()

    def __getattr__(self, key):
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key)

    def __setstate__(self, state):
        pass
All you need are the __setattr__ and __getattr__ methods: __getattr__ checks for dict keys and then moves on to checking for actual attributes. __setstate__ is a fix for pickling/unpickling "bunches" - if interested, check https://github.com/scikit-learn/scikit-learn/issues/6196
Here's a short example of immutable records using built-in collections.namedtuple:
from collections import namedtuple

def record(name, d):
    return namedtuple(name, d.keys())(**d)
and a usage example:
rec = record('Model', {
    'train_op': train_op,
    'loss': loss,
})
print(rec.loss(..))
After not being satisfied with the existing options for the reasons below I developed MetaDict. It behaves exactly like dict but enables dot notation and IDE autocompletion without the shortcomings and potential namespace conflicts of other solutions. All features and usage examples can be found on GitHub (see link above).
Full disclosure: I am the author of MetaDict.
Shortcomings/limitations I encountered when trying out other solutions:
Addict
No key autocompletion in IDE
Nested key assignment cannot be turned off
Newly assigned dict objects are not converted to support attribute-style key access
Shadows inbuilt type Dict
Prodict
No key autocompletion in IDE without defining a static schema (similar to dataclass)
No recursive conversion of dict objects when embedded in list or other inbuilt iterables
AttrDict
No key autocompletion in IDE
Converts list objects to tuple behind the scenes
Munch
Inbuilt methods like items(), update(), etc. can be overwritten with obj.items = [1, 2, 3]
No recursive conversion of dict objects when embedded in list or other inbuilt iterables
EasyDict
Only strings are valid keys, but dict accepts all hashable objects as keys
Inbuilt methods like items(), update(), etc. can be overwritten with obj.items = [1, 2, 3]
Inbuilt methods don't behave as expected: obj.pop('unknown_key', None) raises an AttributeError
Apparently there is now a library for this - https://pypi.python.org/pypi/attrdict - which implements this exact functionality plus recursive merging and json loading. Might be worth a look.
You can do it using this class I just made. With this class you can use the Map object like another dictionary (including JSON serialization) or with the dot notation. I hope it helps you:
class Map(dict):
    """
    Example:
    m = Map({'first_name': 'Eduardo'}, last_name='Pool', age=24, sports=['Soccer'])
    """
    def __init__(self, *args, **kwargs):
        super(Map, self).__init__(*args, **kwargs)
        for arg in args:
            if isinstance(arg, dict):
                for k, v in arg.items():
                    self[k] = v
        if kwargs:
            for k, v in kwargs.items():
                self[k] = v

    def __getattr__(self, attr):
        return self.get(attr)

    def __setattr__(self, key, value):
        self.__setitem__(key, value)

    def __setitem__(self, key, value):
        super(Map, self).__setitem__(key, value)
        self.__dict__.update({key: value})

    def __delattr__(self, item):
        self.__delitem__(item)

    def __delitem__(self, key):
        super(Map, self).__delitem__(key)
        del self.__dict__[key]
Usage examples:
m = Map({'first_name': 'Eduardo'}, last_name='Pool', age=24, sports=['Soccer'])
# Add new key
m.new_key = 'Hello world!'
print(m.new_key)
print(m['new_key'])
# Update values
m.new_key = 'Yay!'
# Or
m['new_key'] = 'Yay!'
# Delete key
del m.new_key
# Or
del m['new_key']
Let me post another implementation, which builds upon the answer of Kinvais, but integrates ideas from the AttributeDict proposed in http://databio.org/posts/python_AttributeDict.html.
The advantage of this version is that it also works for nested dictionaries:
class AttrDict(dict):
    """
    A class to convert a nested Dictionary into an object with key-values
    that are accessible using attribute notation (AttrDict.attribute) instead of
    key notation (Dict["key"]). This class recursively sets Dicts to objects,
    allowing you to recurse down nested dicts (like: AttrDict.attr.attr)
    """
    # Inspired by:
    # http://stackoverflow.com/a/14620633/1551810
    # http://databio.org/posts/python_AttributeDict.html

    def __init__(self, iterable, **kwargs):
        super(AttrDict, self).__init__(iterable, **kwargs)
        for key, value in iterable.items():
            if isinstance(value, dict):
                self.__dict__[key] = AttrDict(value)
            else:
                self.__dict__[key] = value
This is what I use
from collections import namedtuple

args = {
    'batch_size': 32,
    'workers': 4,
    'train_dir': 'train',
    'val_dir': 'val',
    'lr': 1e-3,
    'momentum': 0.9,
    'weight_decay': 1e-4
}
args = namedtuple('Args', ' '.join(list(args.keys())))(**args)
print(args.lr)
The easiest way is to define a class, let's call it Namespace, which calls dict.update() on the object's __dict__. Then the dict will be treated as an object.
class Namespace(object):
    '''
    helps referencing object in a dictionary as dict.key instead of dict['key']
    '''
    def __init__(self, adict):
        self.__dict__.update(adict)

Person = Namespace({'name': 'ahmed', 'age': 30})

print(Person.name)
No need to write your own as
setattr() and getattr() already exist.
The advantage of class objects probably comes into play in class definition and inheritance.
I created this based on the input from this thread. I need to use odict though, so I had to override __getattr__ and __setattr__. I think this should work for the majority of special uses.
Usage looks like this:
# Create an ordered dict normally...
>>> od = OrderedAttrDict()
>>> od["a"] = 1
>>> od["b"] = 2
>>> od
OrderedAttrDict([('a', 1), ('b', 2)])
# Get and set data using attribute access...
>>> od.a
1
>>> od.b = 20
>>> od
OrderedAttrDict([('a', 1), ('b', 20)])
# Setting a NEW attribute only creates it on the instance, not the dict...
>>> od.c = 8
>>> od
OrderedAttrDict([('a', 1), ('b', 20)])
>>> od.c
8
The class:
class OrderedAttrDict(odict.OrderedDict):
    """
    Constructs an odict.OrderedDict with attribute access to data.

    Setting a NEW attribute only creates it on the instance, not the dict.
    Setting an attribute that is a key in the data will set the dict data but
    will not create a new instance attribute
    """
    def __getattr__(self, attr):
        """
        Try to get the data. If attr is not a key, fall back and get the attr
        """
        if attr in self:
            return super(OrderedAttrDict, self).__getitem__(attr)
        else:
            return super(OrderedAttrDict, self).__getattr__(attr)

    def __setattr__(self, attr, value):
        """
        Try to set the data. If attr is not a key, fall back and set the attr
        """
        if attr in self:
            super(OrderedAttrDict, self).__setitem__(attr, value)
        else:
            super(OrderedAttrDict, self).__setattr__(attr, value)
This is a pretty cool pattern already mentioned in the thread, but if you just want to take a dict and convert it to an object that works with auto-complete in an IDE, etc:
class ObjectFromDict(object):
    def __init__(self, d):
        self.__dict__ = d
Use SimpleNamespace:
from types import SimpleNamespace
obj = SimpleNamespace(color="blue", year=2050)
print(obj.color) #> "blue"
print(obj.year) #> 2050
EDIT / UPDATE: a closer answer to the OP's question, starting from a dictionary:
from types import SimpleNamespace
params = {"color":"blue", "year":2020}
obj = SimpleNamespace(**params)
print(obj.color) #> "blue"
print(obj.year) #> 2020
What would be the caveats and pitfalls of accessing dict keys in this manner?
As @Henry suggests, one reason dotted access may not be used in dicts is that it limits dict key names to Python-valid variable names, thereby restricting all possible names.
The following are examples on why dotted-access would not be helpful in general, given a dict, d:
Validity
The following attributes would be invalid in Python:
d.1_foo # enumerated names
d./bar # path names
d.21.7, d.12:30 # decimals, time
d."" # empty strings
d.john doe, d.denny's # spaces, misc punctuation
d.3 * x # expressions
Style
PEP8 conventions would impose a soft constraint on attribute naming:
A. Reserved keyword (or builtin function) names:
d.in
d.False, d.True
d.max, d.min
d.sum
d.id
If a function argument's name clashes with a reserved keyword, it is generally better to append a single trailing underscore ...
B. The case rule on methods and variable names:
Variable names follow the same convention as function names.
d.Firstname
d.Country
Use the function naming rules: lowercase with words separated by underscores as necessary to improve readability.
Sometimes these concerns are raised in libraries like pandas, which permits dotted-access of DataFrame columns by name. The default mechanism to resolve naming restrictions is also array-notation - a string within brackets.
If these constraints do not apply to your use case, there are several options on dotted-access data structures.
This answer is taken from the book Fluent Python by Luciano Ramalho, so credit to him.
from collections.abc import Mapping, MutableSequence

class AttrDict:
    """A read-only façade for navigating a JSON-like object
    using attribute notation
    """

    def __init__(self, mapping):
        self._data = dict(mapping)

    def __getattr__(self, name):
        if hasattr(self._data, name):
            return getattr(self._data, name)
        else:
            return AttrDict.build(self._data[name])

    @classmethod
    def build(cls, obj):
        if isinstance(obj, Mapping):
            return cls(obj)
        elif isinstance(obj, MutableSequence):
            return [cls.build(item) for item in obj]
        else:
            return obj
In __init__ we take the mapping and make it a plain dict. When __getattr__ is invoked, we return the attribute from the underlying dict if the dict itself has that attribute; otherwise we pass the looked-up value to the class method build. build does the interesting thing: if the object is a dict or another mapping, it is wrapped in an AttrDict itself; if it is a sequence like a list, each item is passed back through build; and if it is anything else, like a str or int, the object itself is returned.
class AttrDict(dict):
    def __init__(self):
        self.__dict__ = self

if __name__ == '__main__':
    d = AttrDict()
    d['ray'] = 'hope'
    d.sun = 'shine'   # Now we can use this . notation
    print(d['ray'])
    print(d.sun)
Solution is:
DICT_RESERVED_KEYS = vars(dict).keys()


class SmartDict(dict):
    """
    A Dict which is accessible via attribute dot notation
    """
    def __init__(self, *args, **kwargs):
        """
        :param args: multiple dicts ({}, {}, ..)
        :param kwargs: arbitrary keys='value'

        If ``keyerror=False`` is passed then not found attributes will
        always return None.
        """
        super(SmartDict, self).__init__()
        self['__keyerror'] = kwargs.pop('keyerror', True)
        [self.update(arg) for arg in args if isinstance(arg, dict)]
        self.update(kwargs)

    def __getattr__(self, attr):
        if attr not in DICT_RESERVED_KEYS:
            if self['__keyerror']:
                return self[attr]
            else:
                return self.get(attr)
        return getattr(self, attr)

    def __setattr__(self, key, value):
        if key in DICT_RESERVED_KEYS:
            raise AttributeError("You cannot set a reserved name as attribute")
        self.__setitem__(key, value)

    def __copy__(self):
        return self.__class__(self)

    def copy(self):
        return self.__copy__()
You can use dict_to_obj
https://pypi.org/project/dict-to-obj/
It does exactly what you asked for
from dict_to_obj import DictToObj

a = {
    'foo': True
}
b = DictToObj(a)
b.foo
True
This isn't a 'good' answer, but I thought this was nifty (it doesn't handle nested dicts in current form). Simply wrap your dict in a function:
def make_funcdict(d=None, **kwargs):
    def funcdict(d=None, **kwargs):
        if d is not None:
            funcdict.__dict__.update(d)
        funcdict.__dict__.update(kwargs)
        return funcdict.__dict__
    funcdict(d, **kwargs)
    return funcdict
Now you have slightly different syntax. To access the dict items as attributes, do f.key. To access the dict items (and other dict methods) in the usual manner, do f()['key'], and we can conveniently update the dict by calling f with keyword arguments and/or a dictionary.
Example
d = {'name': 'Henry', 'age': 31}
d = make_funcdict(d)
>>> for key in d():
...     print(key)
...
age
name
>>> print(d.name)
Henry
>>> print(d.age)
31
>>> d({'Height': '5-11'}, Job='Carpenter')
{'age': 31, 'name': 'Henry', 'Job': 'Carpenter', 'Height': '5-11'}
And there it is. I'll be happy if anyone suggests benefits and drawbacks of this method.
EDIT: NeoBunch is deprecated; Munch (mentioned above) can be used as a drop-in replacement. I leave this solution here though, as it can be useful for someone.
As noted by Doug there's a Bunch package which you can use to achieve the obj.key functionality. Actually there's a newer version called
NeoBunch (now superseded by Munch)
It does have a great feature, though: converting your dict to a NeoBunch object through its neobunchify function. I use Mako templates a lot, and passing data as NeoBunch objects makes them far more readable. So if you happen to use a normal dict in your Python program but want dot notation in a Mako template, you can use it this way:
from mako.template import Template
from neobunch import neobunchify
mako_template = Template(filename='mako.tmpl', strict_undefined=True)
data = {'tmpl_data': [{'key1': 'value1', 'key2': 'value2'}]}
with open('out.txt', 'w') as out_file:
out_file.write(mako_template.render(**neobunchify(data)))
And the Mako template could look like:
% for d in tmpl_data:
Column1 Column2
${d.key1} ${d.key2}
% endfor