Why is dataclasses.asdict a function instead of a method - python

I know I can convert a Python dataclass to a dictionary using the asdict function:
from dataclasses import asdict, dataclass
@dataclass
class Point:
    x: int
    y: int
point = Point(1, 2)
asdict(point)
# {'x': 1, 'y': 2}
Why is asdict a separate function instead of a method on the object? A method feels more Pythonic / intuitive and avoids what seems like an unnecessary import:
point.asdict()
# {'x': 1, 'y': 2}
Is there a specific reason for this design or did the authors just choose a function for no good reason? I'm curious because I'm wondering if there are any key design takeaways.
Hypothesis
One potential reason may be that injecting the method may override an existing method. For example, if Point.asdict() already exists on the vanilla class:
@dataclass
class Point:
    x: int
    y: int
    def asdict(self):
        return {"x": self.x, "y": self.y, "sum": self.x + self.y}
In this case, injecting asdict as a method would override the existing functionality.
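The hypothesis can be checked with a small sketch: because asdict lives in the dataclasses module rather than on the class, a class is free to define its own asdict method without any collision. The variable names custom and generic are only illustrative.

```python
from dataclasses import asdict, dataclass

@dataclass
class Point:
    x: int
    y: int

    def asdict(self):
        # User-defined method; the module-level function never touches it
        return {"x": self.x, "y": self.y, "sum": self.x + self.y}

point = Point(1, 2)
custom = point.asdict()   # the class's own method, includes "sum"
generic = asdict(point)   # dataclasses.asdict, declared fields only
```

Both calls coexist, which would not be possible if the decorator injected a method with the same name.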

Related

How to create multiple data members for a class in python using a loop..? [duplicate]

I want to use a bunch of local variables defined in a function, outside of the function. So I am passing x=locals() in the return value.
How can I load all the variables defined in that dictionary into the namespace outside the function, so that instead of accessing the value using x['variable'], I could simply use variable.
Rather than create your own object, you can use argparse.Namespace:
from argparse import Namespace
ns = Namespace(**mydict)
To do the inverse:
mydict = vars(ns)
Consider the Bunch alternative:
class Bunch(object):
    def __init__(self, adict):
        self.__dict__.update(adict)
so if you have a dictionary d and want to access (read) its values with the syntax x.foo instead of the clumsier d['foo'], just do
x = Bunch(d)
this works both inside and outside functions -- and it's enormously cleaner and safer than injecting d into globals()! Remember the last line from the Zen of Python...:
>>> import this
The Zen of Python, by Tim Peters
...
Namespaces are one honking great idea -- let's do more of those!
This is a perfectly valid case: importing variables from one local space into another is fine as long as you are aware of what you are doing.
I have seen such code used in useful ways many times.
You just need to be careful not to pollute the common global space.
You can do the following:
adict = { 'x' : 'I am x', 'y' : ' I am y' }
locals().update(adict)
blah(x)
blah(y)
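One caveat worth checking before relying on this: in CPython, updating the dictionary returned by locals() only has the intended effect at module level, where locals() is the same dictionary as globals(). Inside a function the update does not create real local variables. A minimal sketch (the names update_locals_in_function and injected are made up for the demonstration):

```python
def update_locals_in_function():
    # The update changes a dictionary, not the function's actual local variables
    locals().update({'injected': 'I am injected'})
    try:
        return injected  # compiled as a global lookup, and no such global exists
    except NameError:
        return None

result = update_locals_in_function()
```

Here result ends up as None, because the name never becomes visible inside the function.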
Importing variables into a local namespace is a valid problem and often utilized in templating frameworks.
Return all local variables from a function:
return locals()
Then import as follows:
r = fce()
for key in r.keys():
    exec(key + " = r['" + key + "']")
The Bunch answer is ok but lacks recursion and proper __repr__ and __eq__ builtins to simulate what you can already do with a dict. Also the key to recursion is not only to recurse on dicts but also on lists, so that dicts inside lists are also converted.
These two options I hope will cover your needs (you might have to adjust the type checks in __elt() for more complex objects; these were tested mainly on json imports so very simple core types).
The Bunch approach (as per previous answer) - object takes a dict and converts it recursively. repr(obj) will return Bunch({...}) that can be re-interpreted into an equivalent object.
class Bunch(object):
    def __init__(self, adict):
        """Create a namespace object from a dict, recursively"""
        self.__dict__.update({k: self.__elt(v) for k, v in adict.items()})
    def __elt(self, elt):
        """Recurse into elt to create leaf namespace objects"""
        if type(elt) is dict:
            return type(self)(elt)
        if type(elt) in (list, tuple):
            return [self.__elt(i) for i in elt]
        return elt
    def __repr__(self):
        """Return repr(self)."""
        return "%s(%s)" % (type(self).__name__, repr(self.__dict__))
    def __eq__(self, other):
        if hasattr(other, '__dict__'):
            return self.__dict__ == other.__dict__
        return NotImplemented
        # Use this to allow comparing with dicts:
        #return self.__dict__ == (other.__dict__ if hasattr(other, '__dict__') else other)
The SimpleNamespace approach - since types.SimpleNamespace already implements __repr__ and __eq__, all you need is to implement a recursive __init__ method:
import types
class RecursiveNamespace(types.SimpleNamespace):
    # def __init__(self, /, **kwargs): # better, but Python 3.8+
    def __init__(self, **kwargs):
        """Create a SimpleNamespace recursively"""
        self.__dict__.update({k: self.__elt(v) for k, v in kwargs.items()})
    def __elt(self, elt):
        """Recurse into elt to create leaf namespace objects"""
        if type(elt) is dict:
            return type(self)(**elt)
        if type(elt) in (list, tuple):
            return [self.__elt(i) for i in elt]
        return elt
    # Optional, allow comparison with dicts:
    #def __eq__(self, other):
    #    return self.__dict__ == (other.__dict__ if hasattr(other, '__dict__') else other)
The RecursiveNamespace class takes keyword arguments, which can of course come from a de-referenced dict (ex **mydict)
Now let's put them to the test (argparse.Namespace added for comparison, although its nested dict is manually converted):
from argparse import Namespace
from itertools import combinations
adict = {'foo': 'bar', 'baz': [{'aaa': 'bbb', 'ccc': 'ddd'}]}
a = Bunch(adict)
b = RecursiveNamespace(**adict)
c = Namespace(**adict)
c.baz[0] = Namespace(**c.baz[0])
for n in ['a', 'b', 'c']:
    print(f'{n}:', str(globals()[n]))
for na, nb in combinations(['a', 'b', 'c'], 2):
    print(f'{na} == {nb}:', str(globals()[na] == globals()[nb]))
The result is:
a: Bunch({'foo': 'bar', 'baz': [Bunch({'aaa': 'bbb', 'ccc': 'ddd'})]})
b: RecursiveNamespace(foo='bar', baz=[RecursiveNamespace(aaa='bbb', ccc='ddd')])
c: Namespace(foo='bar', baz=[Namespace(aaa='bbb', ccc='ddd')])
a == b: True
a == c: True
b == c: False
Although those are different classes, because they both (a and b) have been initialized to equivalent namespaces and their __eq__ method compares the namespace only (self.__dict__), comparing two namespace objects returns True. For the case of comparing with argparse.Namespace, for some reason only Bunch works and I'm unsure why (please comment if you know, I haven't looked much further as types.SimpleNamespace is a built-in implementation).
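A possible explanation, based on CPython's behaviour as far as I can tell: both types.SimpleNamespace and argparse.Namespace return NotImplemented from __eq__ when the other operand is not an instance of their own type, so b == c falls back to an identity comparison and yields False, whereas Bunch.__eq__ accepts anything that has a __dict__. A minimal sketch (the names sn and ns are only illustrative):

```python
from argparse import Namespace
from types import SimpleNamespace

sn = SimpleNamespace(a=1)
ns = Namespace(a=1)

# Each side declines to compare with the other type...
sn_result = sn.__eq__(ns)
ns_result = ns.__eq__(sn)

# ...so == falls back to identity, even though the attributes match
equal = (sn == ns)
same_attributes = (vars(sn) == vars(ns))
```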
You might also notice that I recurse using type(self)(...) rather than using the class name - this has two advantages: first the class can be renamed without having to update recursive calls, and second if the class is subclassed we'll be recursing using the subclass name. It's also the name used in __repr__ (type(self).__name__).
EDIT 2021-11-27:
Modified the Bunch.__eq__ method to make it safe against type mismatch.
Added/modified optional __eq__ methods (commented out) to allow comparing with the original dict and argparse.Namespace(**dict) (note that the latter is not recursive but would still be comparable with other classes as the sublevel structs would compare fine anyway).
I used the following snippet (Python 2) to make a recursive namespace from my dict (YAML) configs:
class NameSpace(object):
    def __setattr__(self, key, value):
        raise AttributeError('Please don\'t modify config dict')
def dump_to_namespace(ns, d):
    for k, v in d.iteritems():
        if isinstance(v, dict):
            leaf_ns = NameSpace()
            ns.__dict__[k] = leaf_ns
            dump_to_namespace(leaf_ns, v)
        else:
            ns.__dict__[k] = v
config = NameSpace()
dump_to_namespace(config, config_dict)
There's always this option. I don't know that it is the best method out there, but it does work. Assuming type(x) is dict:
for key, val in x.items():  # unpack the keys from the dictionary to individual variables
    exec(key + '=val')

Retrieve keys of output dictionary of python function without calling the function

assuming I have the following simple function
def f(name: str):
    print(name)
    return {'x': 2, 'y': 1}
Assuming I have access to the function f and I want to retrieve the keys of the returned value dict, without calling the function itself, is there a way to achieve this?
EDIT:
The expected output in this case is: ['x', 'y']
Why not annotate the return type, using a TypedDict:
from typing import TypedDict
class F(TypedDict):
    x: int
    y: int
def f(name: str) -> F:
    print(name)
    return {'x': 2, 'y': 1}
Not only does this make it easy to get the keys in the return dictionary from outside the function:
>>> f.__annotations__['return'].__annotations__.keys()
dict_keys(['x', 'y'])
it actually allows the type to be checked, both inside:
def f(name: str) -> F:
    return {'x': 2, 'y': 1}  # OK
def g(name: str) -> F:
    return {'x': 2, 'y': '1'}  # wrong type
def h(name: str) -> F:
    return {'x': 2, 'y': 1, 'z': 'foo'}  # extra key
and outside:
x: int = f('foo')['x'] # OK
y: str = f('foo')['y'] # wrong type
z: int = f('foo')['z'] # missing key
the function (see MyPy playground). Otherwise the inferred return type is just dict[str, int], which can't be checked so precisely, and you have to go spelunking into f.__code__ to find the keys.
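As a small variation on the __annotations__ lookup above, typing.get_type_hints can be used to reach the same keys; it also resolves annotations that are stored as strings. This is only a sketch, and the names return_type and keys are illustrative.

```python
from typing import TypedDict, get_type_hints

class F(TypedDict):
    x: int
    y: int

def f(name: str) -> F:
    return {'x': 2, 'y': 1}

# The function is never called; only its annotations are inspected
return_type = get_type_hints(f)['return']
keys = list(get_type_hints(return_type))
```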
TLDR: Fetch the values from f.__code__.co_consts[-1].
>>> f.__code__.co_consts[-1]
('x', 'y')
Literals are encoded in the code and constants of a function object. A convenient means to inspect this is to use the builtin dis module:
>>> import dis
>>> dis.dis(f)
  2           0 LOAD_GLOBAL              0 (print)
              2 LOAD_FAST                0 (name)
              4 CALL_FUNCTION            1
              6 POP_TOP
  3           8 LOAD_CONST               1 (2)
             10 LOAD_CONST               2 (1)
             12 LOAD_CONST               3 (('x', 'y'))
             14 BUILD_CONST_KEY_MAP      2
             16 RETURN_VALUE
As one can see, building a dict such as {'x': a, 'y': b, ...} will first load the individual values, then a single tuple of all keys. If nothing else is loaded before returning the dict, the keys are the last constant of the function.
The code and constants of a function object are accessible for programmatic inspection. Fetching "the last constant of the function" f literally translates to f.__code__.co_consts[-1].
Disclaimer: Extracting content from the code and constants of a function may be Python version dependent and especially depend on the function. Such an approach can be brittle and should not be used when arbitrary functions need to be processed.
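A somewhat less bytecode-dependent variation on the same idea is to inspect the source with the ast module rather than the compiled constants. The sketch below assumes the source text is available as a string (SOURCE and returned_dict_keys are hypothetical names) and only handles a return statement whose value is a dict literal with constant keys.

```python
import ast

SOURCE = '''
def f(name):
    print(name)
    return {'x': 2, 'y': 1}
'''

def returned_dict_keys(source, func_name):
    """Return the constant keys of the first dict literal returned by func_name."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name == func_name:
            for sub in ast.walk(node):
                if isinstance(sub, ast.Return) and isinstance(sub.value, ast.Dict):
                    # Keys that are not plain constants (e.g. ** unpacking) are skipped
                    return [k.value for k in sub.value.keys
                            if isinstance(k, ast.Constant)]
    return None

keys = returned_dict_keys(SOURCE, 'f')
```

Like the co_consts approach, this stays brittle for functions that build their result dynamically.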

Pythonic way to convert a dictionary into namedtuple or another hashable dict-like?

I have a dictionary like:
d = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
which I would like to convert to a namedtuple.
My current approach is with the following code
namedTupleConstructor = namedtuple('myNamedTuple', ' '.join(sorted(d.keys())))
nt= namedTupleConstructor(**d)
which produces
myNamedTuple(a=1, b=2, c=3, d=4)
This works fine for me (I think), but am I missing a built-in such as...
nt = namedtuple.from_dict() ?
UPDATE: as discussed in the comments, my reason for wanting to convert my dictionary to a namedtuple is so that it becomes hashable, but still generally useable like a dict.
UPDATE2: 4 years after I've posted this question, TLK posts a new answer recommending using the dataclass decorator that I think is really great. I think that's now what I would use going forward.
To create the subclass, you may just pass the keys of a dict directly:
MyTuple = namedtuple('MyTuple', d)
Now to create tuple instances from this dict, or any other dict with matching keys:
my_tuple = MyTuple(**d)
Beware: namedtuples compare on values only (ordered). They are designed to be a drop-in replacement for regular tuples, with named attribute access as an added feature. The field names will not be considered when making equality comparisons. It may not be what you wanted nor expected from the namedtuple type! This differs from dict equality comparisons, which do take into account the keys and also compare order agnostic.
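To make the warning concrete, here is a short sketch; the type names A and B are invented for the illustration.

```python
from collections import namedtuple

A = namedtuple('A', 'a b')
B = namedtuple('B', 'x y')

# Different types and field names, same values in the same order
tuples_equal = (A(1, 2) == B(1, 2))
equals_plain_tuple = (A(1, 2) == (1, 2))

# Same values in a different order are not equal for tuples...
reordered_equal = (A(1, 2) == A(2, 1))

# ...whereas dicts compare by key and ignore insertion order
dicts_equal = ({'a': 1, 'b': 2} == {'b': 2, 'a': 1})
```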
For readers who don't really need a type which is a subclass of tuple, there probably isn't much point to use a namedtuple in the first place. If you just want to use attribute access syntax on fields, it would be simpler and easier to create namespace objects instead:
>>> from types import SimpleNamespace
>>> SimpleNamespace(**d)
namespace(a=1, b=2, c=3, d=4)
my reason for wanting to convert my dictionary to a namedtuple is so that it becomes hashable, but still generally useable like a dict
For a hashable "attrdict" like recipe, check out a frozen box:
>>> from box import Box
>>> b = Box(d, frozen_box=True)
>>> hash(b)
7686694140185755210
>>> b.a
1
>>> b["a"]
1
>>> b["a"] = 2
BoxError: Box is frozen
There may also be a frozen mapping type coming in a later version of Python, watch this draft PEP for acceptance or rejection:
PEP 603 -- Adding a frozenmap type to collections
from collections import namedtuple
nt = namedtuple('x', d.keys())(*d.values())
If you want an easier approach, and you have the flexibility to use another approach other than namedtuple I would like to suggest using SimpleNamespace (docs).
from types import SimpleNamespace as sn
d = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
dd= sn(**d)
# dd.a>>1
# add new property
dd.s = 5
#dd.s>>5
PS: SimpleNamespace is a type, not a class
I'd like to recommend the dataclass for this type of situation. Similar to a namedtuple, but with more flexibility.
https://docs.python.org/3/library/dataclasses.html
from dataclasses import dataclass
@dataclass
class InventoryItem:
    """Class for keeping track of an item in inventory."""
    name: str
    unit_price: float
    quantity_on_hand: int = 0
    def total_cost(self) -> float:
        return self.unit_price * self.quantity_on_hand
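Since the original goal was a hashable object, a frozen dataclass may be the closest fit. The sketch below assumes the dictionary keys are known in advance; the class name Letters and the variable names are hypothetical.

```python
from dataclasses import asdict, dataclass

@dataclass(frozen=True)  # frozen + default eq=True generates __hash__
class Letters:
    a: int
    b: int
    c: int
    d: int

d = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
letters = Letters(**d)

lookup = {letters: 'usable as a dict key'}
round_trip = asdict(letters)  # back to a plain dict
```

Unlike a namedtuple, equality here also depends on the class, so two different dataclasses with the same values do not compare equal.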
You can use this function to handle nested dictionaries:
from collections import OrderedDict, namedtuple
def create_namedtuple_from_dict(obj):
    if isinstance(obj, dict):
        fields = sorted(obj.keys())
        namedtuple_type = namedtuple(
            typename='GenericObject',
            field_names=fields,
            rename=True,
        )
        field_value_pairs = OrderedDict(
            (str(field), create_namedtuple_from_dict(obj[field]))
            for field in fields
        )
        try:
            return namedtuple_type(**field_value_pairs)
        except TypeError:
            # Cannot create namedtuple instance so fallback to dict (invalid attribute names)
            return dict(**field_value_pairs)
    elif isinstance(obj, (list, set, tuple, frozenset)):
        return [create_namedtuple_from_dict(item) for item in obj]
    else:
        return obj
Use the dictionary keys as the field names of the namedtuple:
d = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
def dict_to_namedtuple(d):
    return namedtuple('GenericDict', d.keys())(**d)
result = dict_to_namedtuple(d)
print(result)
Output:
GenericDict(a=1, b=2, c=3, d=4)
def toNametuple(dict_data):
    return namedtuple(
        "X", dict_data.keys()
    )(*tuple(map(lambda x: x if not isinstance(x, dict) else toNametuple(x), dict_data.values())))
d = {
    'id': 1,
    'name': {'firstName': 'Ritesh', 'lastName':'Dubey'},
    'list_data': [1, 2],
}
obj = toNametuple(d)
Access as obj.name.firstName, obj.id
This will work for nested dictionary with any data types.
I find the following 4-liner the most beautiful. It supports nested dictionaries as well.
def dict_to_namedtuple(typename, data):
    return namedtuple(typename, data.keys())(
        *(dict_to_namedtuple(typename + '_' + k, v) if isinstance(v, dict) else v for k, v in data.items())
    )
The output will look good also:
>>> nt = dict_to_namedtuple('config', {
... 'path': '/app',
... 'debug': {'level': 'error', 'stream': 'stdout'}
... })
>>> print(nt)
config(path='/app', debug=config_debug(level='error', stream='stdout'))
Check this out:
def fill_tuple(NamedTupleType, container):
    if container is None:
        args = [None] * len(NamedTupleType._fields)
        return NamedTupleType(*args)
    if isinstance(container, (list, tuple)):
        return NamedTupleType(*container)
    elif isinstance(container, dict):
        return NamedTupleType(**container)
    else:
        raise TypeError("Cannot create '{}' tuple out of {} ({}).".format(NamedTupleType.__name__, type(container).__name__, container))
Exceptions for incorrect names or an invalid argument count are raised by the namedtuple constructor itself.
Test with py.test:
import pytest
from collections import namedtuple
def test_fill_tuple():
    A = namedtuple("A", "aa, bb, cc")
    assert fill_tuple(A, None) == A(aa=None, bb=None, cc=None)
    assert fill_tuple(A, [None, None, None]) == A(aa=None, bb=None, cc=None)
    assert fill_tuple(A, [1, 2, 3]) == A(aa=1, bb=2, cc=3)
    assert fill_tuple(A, dict(aa=1, bb=2, cc=3)) == A(aa=1, bb=2, cc=3)
    with pytest.raises(TypeError) as e:
        fill_tuple(A, 2)
    # Checked after the with block so it actually runs; str(e.value) replaces the Python 2-only .message
    assert str(e.value) == "Cannot create 'A' tuple out of int (2)."
Although I like @fuggy_yama's answer, I had written my own function before reading it, so I leave it here just to show a different approach. It also handles nested namedtuples:
def dict2namedtuple(thedict, name):
    thenametuple = namedtuple(name, [])
    for key, val in thedict.items():
        if not isinstance(key, str):
            msg = 'dict keys must be strings not {}'
            raise ValueError(msg.format(key.__class__))
        if not isinstance(val, dict):
            setattr(thenametuple, key, val)
        else:
            newname = dict2namedtuple(val, key)
            setattr(thenametuple, key, newname)
    return thenametuple

Python: programmatic class instance variable initialisation with locals()

I have a class with many instance variables with default values, which can optionally be overridden at instantiation (note: no mutable default arguments).
Since it's quite redundant to write self.x = x etc. many times, I initialise the variables programmatically.
To illustrate, consider this example (which has, for the sake of brevity, only 5 instance variables and any methods omitted):
Example:
# The "painful" way
class A:
    def __init__(self, a, b=2, c=3, d=4.5, e=5):
        self.a = a
        self.b = b
        self.c = c
        self.d = d
        self.e = e
# The "lazy" way
class B:
    def __init__(self, a, b=2, c=3, d=4.5, e=5):
        self.__dict__.update({k: v for k, v in locals().items() if k!='self'})
# The "better lazy" way suggested
class C:
    def __init__(self, a, b=2, c=3, d=4.5, e=5):
        for k, v in locals().items():
            if k != 'self':
                setattr(self, k, v)
x = A(1, c=7)
y = B(1, c=7)
z = C(1, c=7)
print(x.__dict__) # {'d': 4.5, 'c': 7, 'a': 1, 'b': 2, 'e': 5}
print(y.__dict__) # {'d': 4.5, 'c': 7, 'a': 1, 'b': 2, 'e': 5}
print(z.__dict__) # {'d': 4.5, 'c': 7, 'a': 1, 'b': 2, 'e': 5}
So to make my life easier, I use the idiom shown in class B, which yields the same result as A.
Is this bad practice? Are there any pitfalls?
Addendum:
Another reason to use this idiom was to save some space, as I intended to use it in MicroPython. Because locals() works differently there, only the way shown in class A works in it.
I would actually suggest using the code shown in class A. What you have is repetitive code, not redundant code, and repetitive isn't always bad. You only have to write __init__ once, and keeping one assignment per instance variable is good documentation (explicit and clear) for what instance variables your class expects.
One thing to keep in mind, though, is that too many variables that you can initialize as distinct parameters may be a sign that your class needs to be redesigned. Would some of the individual parameters make more sense being grouped into separate lists, dicts, or even additional classes?
Try a more pythonic approach:
class C:
    def __init__(self, a, b=2, c=3, d=4.5, e=5):
        for k, v in locals().items():
            if k != 'self':  # avoid storing a reference to self on itself
                setattr(self, k, v)
c = C(1)
print(c.a, c.b)
# 1 2
This approach may be a line or two longer, but the line lengths are shorter, and your intent less convoluted. In addition, anyone who may try to reuse your code will be able to access your objects' attributes as expected.
Hope this helps.
Edit: removed second approach using kwargs bc it does not address default variable requirement.
the important take-away here is that a user of your code would not be able to access your object's attributes as expected if done like your example class B shows.
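If the goal is mainly to avoid writing self.x = x repeatedly, a dataclass may be worth considering as an alternative to the locals() idiom on CPython 3.7+. This is only a sketch: the class name D and the field types are assumptions, and the dataclasses module may not be available on MicroPython.

```python
from dataclasses import dataclass

@dataclass
class D:
    # __init__ is generated from these declarations, defaults included
    a: int
    b: int = 2
    c: int = 3
    d: float = 4.5
    e: int = 5

w = D(1, c=7)
attrs = vars(w)  # same attributes the hand-written class A would set
```

The field list doubles as documentation of which instance variables the class expects.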

How to define self-made object that can be unpacked by `**`?

Today I'm learning using * and ** to unpack arguments.
I find that list, str, tuple, and dict can all be unpacked by *.
I guess because they are all iterables. So I made my own class.
# FILE CONTENT
def print_args(*args):
    for i in args:
        print i
class MyIterator(object):
    count = 0
    def __iter__(self):
        while self.count < 5:
            yield self.count
            self.count += 1
        self.count = 0
my_iterator = MyIterator()
# INTERPRETER TEST
In [1]: print_args(*my_iterator)
0
1
2
3
4
It works! But how to make a mapping object like dict in python so that ** unpacking works on it? Is it possible to do that? And is there already another kind of mapping object in python except dict?
PS:
I know I can make an object inherit from dict class to make it a mapping object. But is there some key magic_method like __iter__ to make a mapping object without class inheritance?
PS2:
With the help of @mgilson's answer, I've made an object which can be unpacked by ** without inheriting from an existing mapping type:
# FILE CONTENT
def print_kwargs(**kwargs):
    for i, j in kwargs.items():
        print i, '\t', j
class MyMapping(object):
    def __getitem__(self, key):
        if int(key) in range(5):
            return "Mapping and unpacking!"
    def keys(self):
        return map(str, range(5))
my_mapping = MyMapping()
print_kwargs(**my_mapping)
# RESULTS
1 Mapping and unpacking!
0 Mapping and unpacking!
3 Mapping and unpacking!
2 Mapping and unpacking!
4 Mapping and unpacking!
Be aware, when unpacking using **, the key in your mapping object should be type str, or TypeError will be raised.
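The string-key requirement can be seen with a plain dict as well. A minimal sketch (collect is a hypothetical helper, written so that it also runs on Python 3):

```python
def collect(**kwargs):
    return kwargs

ok = collect(**{'a': 1})  # string keys become keyword arguments

try:
    collect(**{1: 'one'})  # an int key cannot be a keyword name
    error = None
except TypeError as exc:
    error = exc
```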
Any mapping can be used. I'd advise that you inherit from collections.Mapping or collections.MutableMapping [1]. They're abstract base classes -- you supply a couple methods and the base class fills in the rest.
Here's an example of a "frozendict" that you could use:
from collections.abc import Mapping  # the plain collections alias was removed in Python 3.10
class FrozenDict(Mapping):
    """Immutable dictionary.
    Abstract methods required by Mapping are
    1. `__getitem__`
    2. `__iter__`
    3. `__len__`
    """
    def __init__(self, *args, **kwargs):
        self._data = dict(*args, **kwargs)
    def __getitem__(self, key):
        return self._data[key]
    def __iter__(self):
        return iter(self._data)
    def __len__(self):
        return len(self._data)
And usage is just:
def printer(**kwargs):
    print(kwargs)
d = FrozenDict({'a': 1, 'b': 2})
printer(**d)
To answer your question about which "magic" methods are necessary to allow unpacking -- just based on experimentation alone -- in CPython a class with __getitem__ and keys is enough to allow it to be unpacked with **. With that said, there is no guarantee that works on other implementations (or future versions of CPython). To get the guarantee, you need to implement the full mapping interface (usually with the help of a base class as I've used above).
In python2.x, there's also UserDict.UserDict which can be accessed in python3.x as collections.UserDict -- However if you're going to use this one, you can frequently just subclass from dict.
[1] Note that as of Python 3.3, those classes were moved to the collections.abc module.
First, let's define unpacking:
def unpack(**kwargs):
    """
    Collect all keyword arguments under one hood
    and print them as 'key: value' pairs
    """
    for key_value in kwargs.items():
        print('key: %s, value: %s' % key_value)
Now, the structure: two built-in options available are collections.abc.Mapping and collections.UserDict. As there's another answer exploring the highly customizable Mapping type, I will focus on UserDict: it can be easier to start with if all you need is a basic dict structure with some twist. Once an instance is created, its underlying dictionary is also accessible as the .data attribute.
1. It can be used inline, like so:
>>> from collections import UserDict
>>> d = UserDict({'key':'value'})
>>> # UserDict makes it feel like it's a regular dict
>>> d, d.data
({'key':'value'}, {'key':'value'})
Breaking UserDict into key=value pairs:
>>> unpack(**d)
key: key, value: value
>>> unpack(**d.data) # same as above
key: key, value: value
2. If subclassing, all you have to do is define self.data within __init__. Note that I expanded the class with additional functionality via the (self + other) 'magic' methods:
class CustomDict(UserDict):
    def __init__(self, dct=None):
        # A None default avoids sharing one mutable dict between instances
        self.data = {} if dct is None else dct
    def __add__(self, other={}):
        """Returning new object of the same type
        In case of UserDict, unpacking self is the same as unpacking self.data
        """
        return __class__({**self.data, **other})
    def __iadd__(self, other={}):
        """Returning same object, modified in-place"""
        self.update(other)
        return self
Usage is:
>>> d = CustomDict({'key': 'value', 'key2': 'value2'})
>>> d
{'key': 'value', 'key2': 'value2'}
>>> type(d), id(d)
(<class '__main__.CustomDict'>, 4323059136)
Adding other dict (or any mapping type) to it will call __add__, returning new object:
>>> mixin = {'a': 'aaa', 'b': 'bbb'}
>>> d_new = d + mixin # __add__
>>> d_new
{'key': 'value', 'a': 'aaa', 'key2': 'value2', 'b': 'bbb'}
>>> type(d_new), id(d_new)
(<class '__main__.CustomDict'>, 4323059248) # new object
>>> d # unmodified
{'key': 'value', 'key2': 'value2'}
In-place modification with __iadd__ will return the same object (same id in memory)
>>> d += {'a': 'aaa', 'b': 'bbb'} # __iadd__
>>> d
{'key': 'value', 'a': 'aaa', 'key2': 'value2', 'b': 'bbb'}
>>> type(d), id(d)
(<class '__main__.CustomDict'>, 4323059136)
Btw, I agree with other contributors that you should also be familiar with collections.abc.Mapping and related types. For basic dictionary exploration, UserDict has all the same features and does not require you to override abstract methods before it becomes usable.
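One practical difference from subclassing dict directly, sketched below under the assumption of CPython's standard library behaviour: UserDict routes its constructor and update() through __setitem__, while the built-in dict does not. The class names UpperKeys and UpperKeysPlain are invented for the comparison.

```python
from collections import UserDict

class UpperKeys(UserDict):
    def __setitem__(self, key, value):
        super().__setitem__(key.upper(), value)

class UpperKeysPlain(dict):
    def __setitem__(self, key, value):
        super().__setitem__(key.upper(), value)

u = UpperKeys({'a': 1})
u.update(b=2)   # both the constructor and update() call __setitem__

p = UpperKeysPlain({'a': 1})
p.update(b=2)   # dict's constructor and update() bypass the override
p['c'] = 3      # only direct item assignment uses it
```

So the "twist" is applied consistently with UserDict, whereas the dict subclass ends up with a mix of original and transformed keys.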
