I am confused about the use of the property class with regard to references to the fset/fget/fdel functions and which namespaces they live in. The behavior differs depending on whether I use property as a decorator or via a helper function. Why do duplicate names in the class and instance namespaces affect one example but not the other?
When using property as a decorator, as shown here, I must hide the variable name in __dict__ with a leading underscore to avoid preempting the property functions. If I don't, I get a recursion loop.
class setget():
    """Play with setters and getters"""
    @property
    def x(self):
        print('getting x')
        return self._x
    @x.setter
    def x(self, x):
        print('setting x')
        self._x = x
    @x.deleter
    def x(self):
        print('deleting x')
        del self._x
and I can see _x as an instance property and x as a class property:
>>> sg = setget()
>>> sg.x = 1
setting x
>>> sg.__dict__
{'_x': 1}
>>> pprint(setget.__dict__)
mappingproxy({'__dict__': <attribute '__dict__' of 'setget' objects>,
              '__doc__': 'Play with setters and getters',
              '__module__': '__main__',
              '__weakref__': <attribute '__weakref__' of 'setget' objects>,
              'x': <property object at 0x000001BF3A0C37C8>})
>>>
Here's an example of recursion if the instance var name underscore is omitted. (code not shown here) This makes sense to me because instance property x does not exist and so we look further to class properties.
>>> sg = setget()
>>> sg.x = 1
setting x
setting x
setting x
setting x
...
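Since the recursive variant is not shown, here is a minimal sketch of what it presumably looked like. The class name SetGetNoUnderscore is hypothetical, and the prints are dropped to keep the output short. The setter assigns to self.x, which is routed through the property again because a property on the class takes priority over the instance __dict__ for assignment, so the setter keeps re-entering itself until Python gives up:

```python
class SetGetNoUnderscore:
    """Hypothetical variant of setget that stores the value under the property's own name."""

    @property
    def x(self):
        return self.x  # resolves to the property again, never to an instance entry

    @x.setter
    def x(self, value):
        self.x = value  # re-enters this setter instead of writing to __dict__


sg = SetGetNoUnderscore()
try:
    sg.x = 1
except RecursionError as exc:
    print(f"RecursionError: {exc}")

print(sg.__dict__)  # nothing was ever stored on the instance
```

Note that the instance __dict__ stays empty: the assignment never reaches it.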
However if I use property as a helper function as described in one of the answers here:
python class attributes vs instance attributes
the name hiding underscore is not needed and there is no conflict.
Copy of the example code:
class PropertyHelperDemo:
    '''Demonstrates a property definition helper function'''
    def prop_helper(k: str, doc: str):
        print(f'Creating property instance {k}')
        def _get(self):
            print(f'getting {k}')
            return self.__dict__.__getitem__(k) # might use '_'+k, etc.
        def _set(self, v):
            print(f'setting {k}')
            self.__dict__.__setitem__(k, v)
        def _del(self):
            print(f'deleting {k}')
            self.__dict__.__delitem__(k)
        return property(_get, _set, _del, doc)
    X: float = prop_helper('X', doc="X is the best!")
    Y: float = prop_helper('Y', doc="Y do you ask?")
    Z: float = prop_helper('Z', doc="Z plane!")
    # etc...
    def __init__(self, X: float, Y: float, Z: float):
        #super(PropertyHelperDemo, self).__init__() # not sure why this was here
        (self.X, self.Y, self.Z) = (X, Y, Z)
    # for read-only properties, the built-in technique remains sleek enough already
    @property
    def Total(self) -> float:
        return self.X + self.Y + self.Z
And here I verify that the property fset function is being executed on subsequent calls.
>>> p = PropertyHelperDemo(1, 2, 3)
setting X
setting Y
setting Z
>>> p.X = 11
setting X
>>> p.X = 111
setting X
>>> p.__dict__
{'X': 111, 'Y': 2, 'Z': 3}
>>> pprint(PropertyHelperDemo.__dict__)
mappingproxy({'Total': <property object at 0x000002333A093F98>,
              'X': <property object at 0x000002333A088EF8>,
              'Y': <property object at 0x000002333A093408>,
              'Z': <property object at 0x000002333A093D18>,
              '__annotations__': {'X': <class 'float'>,
                                  'Y': <class 'float'>,
                                  'Z': <class 'float'>},
              '__dict__': <attribute '__dict__' of 'PropertyHelperDemo' objects>,
              '__doc__': 'Demonstrates a property definition helper function',
              '__init__': <function PropertyHelperDemo.__init__ at 0x000002333A0B3AF8>,
              '__module__': '__main__',
              '__weakref__': <attribute '__weakref__' of 'PropertyHelperDemo' objects>,
              'prop_helper': <function PropertyHelperDemo.prop_helper at 0x000002333A052F78>})
>>>
I can see the class and instance properties with overlapping names X, Y, Z, in the two namespaces. It is my understanding that the namespace search order begins with local variables so I don't understand why the property fset function is executed here.
Any guidance is greatly appreciated.
I think you're a little astray in construing _x as an "instance property" and x as a "class property" - in fact, both are bound to the instance only, and neither is bound to the other except by the arbitrarily defined behaviour of the method decorated by @property.
They both occupy the same namespace, which is why, though they may represent the same quantity, they cannot share a name for fear of shadowing/confusing the namespace.
The issue of namespaces is not directly connected to the use of the @property decorator. You don't HAVE to "hide" the attribute name - you just need to ensure that the attribute name differs from the name of the method, because once you apply the @property decorator, the method decorated by @property can be accessed just like any other attribute without a typical method call signature including the ().
Here's an example, adjacent to the one you provided, that may help clarify. I define a class, PositionVector below, that holds the x, y and z coordinates of a point in space.
When initialising an instance of the class, I also create an attribute length that computes the length of the vector based on the x, y and z values. Trying this:
import numpy as np
class PositionVector:
    def __init__(self, x: float, y: float, z: float) -> None:
        self.x = x
        self.y = y
        self.z = z
        self.length = np.sqrt(x**2 + y**2 + z**2)
p1 = PositionVector(x = 10, y = 0, z = 0)
print(p1.length)
# Result -> 10.0
Only now I want to change the y attribute of the instance. I do this:
p1.y = 10.0
print (f"p1's 'y' value is {p1.y}")
# Result -> p1's 'y' value is 10.0
Except now, if I again access the length of the vector, I get the wrong answer:
print (f"p1's length is {p1.length}")
# Result -> p1's length is 10.0
This arises because length, which at any given instant depends on the current values of x, y, and z, is never updated and kept consistent. We could fix this issue by redefining our class so length is a method that is continuously recalculated every time the user wants to access it, like so:
class PositionVector:
    def __init__(self, x: float, y: float, z: float) -> None:
        self.x = x
        self.y = y
        self.z = z
    def length(self):
        return np.sqrt(self.x**2 + self.y**2 + self.z**2)
Now, I have a way to get the correct length of an instance of this class at all times by calling the instance's length() method:
p1 = PositionVector(x = 10, y = 0, z = 0)
print (f"p1's length is {p1.length()}")
# Result -> p1's length is 10.0
p1.y = 10.0
print (f"p1's 'y' value is {p1.y}")
# Result -> p1's 'y' value is 10.0
print (f"p1's length is {p1.length()}")
# Result -> p1's length is 14.142135623730951
This is fine, except for two issues:
If this class had been in use already, going back and changing length from an attribute to a method would break backward compatibility, forcing any other code that uses this class to need modifications before it could work as before.
Though I DO want length to recalculate every time I invoke it, I want to be able to pick it up and "handle" it like it's a "property" of the instance, not a "behaviour" of the instance. So using p1.length() to get the instance's length instead of simply p1.length feels unidiomatic.
I can restore backward compatibility, AND permit length to be accessed like any other attribute by applying the @property decorator to the method. Simply adding @property to the length() method definition allows its call signature to go back to its original form:
    @property
    def length(self):
        return np.sqrt(self.x**2 + self.y**2 + self.z**2)
p1 = PositionVector(x=10, y=0, z=0)
print(f"p1's length is {p1.length}")
# Result -> p1's length is 10.0
p1.y = 10.0
print(f"p1's 'y' value is {p1.y}")
# Result -> p1's 'y' value is 10.0
print(f"p1's length is {p1.length}")
# Result -> p1's length is 14.142135623730951
At this point, there are no shadowed or "underscored" attribute names, I don't need them - I can access x, y and z normally, and access length as though it were any other attribute, and yet be confident that anytime I access it, I get the most current value, correctly reflective of the current values of x, y, and z. Inspecting p1.__dict__ in this state yields:
print(p1.__dict__)
# Result -> {'x': 10, 'y': 10.0, 'z': 0}
There could be use cases where you want to not only calculate length, but also save its value as a static attribute of an instance. This is where you might want to create an attribute and have it hold the value of length every time it's calculated. You'd accomplish this like so:
class PositionVector:
    def __init__(self, x: float, y: float, z: float) -> None:
        self.x = x
        self.y = y
        self.z = z
        self.placeholder_attribute_name = None
    @property
    def length(self):
        self.placeholder_attribute_name = np.sqrt(self.x**2 + self.y**2 + self.z**2)
        return self.placeholder_attribute_name
Doing this has no effect whatsoever on the prior functioning of the class. It simply creates a way to statically hold the value of length, independent of the act of creating it.
You don't HAVE to name that attribute anything in particular. You can name it anything you want, except for any other name already in use. In the case above, you can't name it x, y, z, or length, because all of those have other meanings.
For readability, however, it does make sense, and it's common practice, to do the following two things:
Make it obvious that this attribute is not meant to be used directly. In the case above - you don't want someone to get the length of the vector by calling p1.placeholder_attribute_name because this is not guaranteed to yield the correct current length - they should use p1.length instead. You indicate that this attribute is not for public consumption with a commonly adopted Python convention - the leading underscore:
class PositionVector:
    def __init__(self, x: float, y: float, z: float) -> None:
        self.x = x
        self.y = y
        self.z = z
        self._placeholder_attribute_name = None
    @property
    def length(self):
        self._placeholder_attribute_name = np.sqrt(self.x**2 + self.y**2 + self.z**2)
        return self._placeholder_attribute_name
Use the name of the attribute to convey to anyone reading your code what the attribute actually means. If the attribute is meant to shadow the "length" property - putting length in there somewhere instead of the less helpful placeholder_attribute_name would enhance readability. You could indicate that this shadows length by naming it _length.
In summary:
Employing the @property decorator does not compel you to use "public" and "private" attribute names - you would only do so if, besides computing your attribute's value with the @property decorated method, you also wanted to save the value of that method's return in a persistent attribute bound to every instance of the class.
Even when you DO choose to use propertyname in public and _propertyname in private this is not an absolute rule, it is simply a convention adopted in aid of readability.
Thanks to @Vin for a nice detailed description of property but it doesn't really answer my question - which could have been worded much more clearly. It shows my confusion.
The fundamental reason for the recursion in setget but not PropertyHelperDemo is that the property methods in setget invoke themselves while the methods in PropertyHelperDemo access the instance __dict__ directly as such:
def _get(self):
    print(f'getting {k}')
    return self.__dict__.__getitem__(k)
This seems rather obvious now. It is apparent that conflicting property and __dict__ attribute names are not prevented and that the resolution order is to look for properties before __dict__ entries.
In other experiments I've found that it's possible to replace an instance method by making an entry of the same name in __dict__. So the overall resolution sequence remains less than clear (to me.)
Another source of confusion for me is that dir returns a list of names of methods plus __dict__ entries and other attributes, and apparently eliminates duplicates. From the doc:
If the object does not provide __dir__(), the function tries its best
to gather information from the object’s __dict__ attribute, if
defined, and from its type object. The resulting list is not
necessarily complete, and may be inaccurate when the object has a
custom __getattr__().
... If the object is a type or class object, the list contains the names
of its attributes, and recursively of the attributes of its bases.
... The resulting list is sorted alphabetically.
Interestingly, properties appear in the class __dict__ but not in the instance __dict__.
I found this in the Descriptor HowTo Guide offered by @chepner. THANKS!
The default behavior for attribute access is to get, set, or delete the attribute from an object’s dictionary. For instance, a.x has a lookup chain starting with a.__dict__['x'], then type(a).__dict__['x'], and continuing through the method resolution order of type(a). If the looked-up value is an object defining one of the descriptor methods, then Python may override the default behavior and invoke the descriptor method instead. Where this occurs in the precedence chain depends on which descriptor methods were defined.
... Instance lookup scans through a chain of namespaces giving data descriptors the highest priority, followed by instance variables, then non-data descriptors, then class variables, and lastly __getattr__() if it is provided.
A Python property is a type of descriptor so resolution through __dict__ is preempted.
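The precedence described in the quoted HowTo passage can be checked with a small experiment. This is only a sketch; the class Demo and its members are hypothetical names. A property is a data descriptor (it defines __set__ even when no setter is supplied), so it beats an instance __dict__ entry of the same name. A plain function is a non-data descriptor, so an instance __dict__ entry of the same name beats it, which matches the observation above that an instance method can be replaced through __dict__:

```python
class Demo:
    @property
    def value(self):       # data descriptor: highest priority
        return "from property"

    def method(self):      # plain function: non-data descriptor
        return "from class function"


d = Demo()

# Write straight into the instance dictionary, bypassing normal assignment.
d.__dict__["value"] = "from instance dict"
d.__dict__["method"] = lambda: "from instance dict"

print(d.value)     # from property
print(d.method())  # from instance dict
```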
Another way to explore is using inspect which does not eliminate duplicates.
>>> p = PropertyHelperDemo(1, 2, 3)
setting X
setting Y
setting Z
>>>
>>> import inspect
>>> pprint(inspect.getmembers(p))
getting X
getting Y
getting Z
getting X
getting Y
getting Z
[('Total', 6),
('X', 1),
('Y', 2),
('Z', 3),
('__annotations__',
{'X': <class 'float'>, 'Y': <class 'float'>, 'Z': <class 'float'>}),
('__class__', <class '__main__.PropertyHelperDemo'>),
('__delattr__',
<method-wrapper '__delattr__' of PropertyHelperDemo object at 0x00000181D14C6608>),
('__dict__', {'X': 1, 'Y': 2, 'Z': 3}),
('__dir__',
<built-in method __dir__ of PropertyHelperDemo object at 0x00000181D14C6608>),
...
...
...
>>> pprint(inspect.getmembers(p, predicate=inspect.ismethod))
getting X
getting Y
getting Z
getting X
getting Y
getting Z
[('__init__',
<bound method PropertyHelperDemo.__init__ of <__main__.PropertyHelperDemo object at 0x00000181D14C6608>>),
('prop_helper',
<bound method PropertyHelperDemo.prop_helper of <__main__.PropertyHelperDemo object at 0x00000181D14C6608>>)]
>>>
In the first listing we can see the property methods as well as the __dict__ attributes. It's interesting (to me) that the property methods are executed by inspect. We see methods X, Y, Z executed twice because Total also calls them. Properties X, Y, Z and Total are not listed when we filter for methods.
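If running the getters during inspection is unwanted, inspect.getattr_static fetches an attribute without triggering the descriptor protocol. The following is a small sketch with a hypothetical class Tracked that counts how often its getter runs:

```python
import inspect


class Tracked:
    """Hypothetical class whose property counts every getter call."""

    def __init__(self):
        self.calls = 0

    @property
    def total(self):
        self.calls += 1
        return 42


t = Tracked()

# getmembers() uses getattr() on each name, so the getter runs.
inspect.getmembers(t)
calls_after_getmembers = t.calls

# getattr_static() returns the property object itself; the getter does not run.
static_result = inspect.getattr_static(t, "total")
print(type(static_result).__name__)
print(t.calls == calls_after_getmembers)
```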
Of course it's a great idea to re-use names like this only if you want to drive yourself and everyone else crazy.
Enough omphaloskepsis, it's time to move on.
Can anyone amend namedtuple or provide an alternative class so that it works for mutable objects?
Primarily for readability, I would like something similar to namedtuple that does this:
from Camelot import namedgroup
Point = namedgroup('Point', ['x', 'y'])
>>> p = Point(0, 0)
>>> p.x = 10
>>> p
Point(x=10, y=0)
>>> p.x *= 10
>>> p
Point(x=100, y=0)
It must be possible to pickle the resulting object. And per the characteristics of named tuple, the ordering of the output when represented must match the order of the parameter list when constructing the object.
There is a mutable alternative to collections.namedtuple – recordclass.
It can be installed from PyPI:
pip3 install recordclass
It has the same API and memory footprint as namedtuple and it supports assignments (It should be faster as well). For example:
from recordclass import recordclass
Point = recordclass('Point', 'x y')
>>> p = Point(1, 2)
>>> p
Point(x=1, y=2)
>>> print(p.x, p.y)
1 2
>>> p.x += 2; p.y += 3; print(p)
Point(x=3, y=5)
recordclass (since 0.5) supports type hints:
from recordclass import recordclass, RecordClass
class Point(RecordClass):
    x: int
    y: int
>>> Point.__annotations__
{'x':int, 'y':int}
>>> p = Point(1, 2)
>>> p
Point(x=1, y=2)
>>> print(p.x, p.y)
1 2
>>> p.x += 2; p.y += 3; print(p)
Point(x=3, y=5)
There is a more complete example (it also includes performance comparisons).
The recordclass library now provides another variant: the recordclass.make_dataclass factory function. It supports a dataclasses-like API (there are module-level functions update, make and replace instead of the self._update, self._replace, self._asdict and cls._make methods).
from recordclass import dataobject, make_dataclass
Point = make_dataclass('Point', [('x', int), ('y',int)])
Point = make_dataclass('Point', {'x':int, 'y':int})
class Point(dataobject):
    x: int
    y: int
>>> p = Point(1, 2)
>>> p
Point(x=1, y=2)
>>> p.x = 10; p.y += 3; print(p)
Point(x=10, y=5)
recordclass and make_dataclass can produce classes whose instances occupy less memory than __slots__-based instances. This can matter for instances whose attribute values are not intended to form reference cycles. It may help reduce memory usage if you need to create millions of instances. Here is an illustrative example.
types.SimpleNamespace was introduced in Python 3.3 and supports the requested requirements.
from types import SimpleNamespace
t = SimpleNamespace(foo='bar')
t.ham = 'spam'
print(t)
# namespace(foo='bar', ham='spam')
print(t.foo)
# bar
import pickle
with open('/tmp/pickle', 'wb') as f:
    pickle.dump(t, f)
As a Pythonic alternative for this task, since Python 3.7 you can use the dataclasses module. Data classes not only behave like a mutable NamedTuple; because they use normal class definitions, they also support other class features.
From PEP-0557:
Although they use a very different mechanism, Data Classes can be thought of as "mutable namedtuples with defaults". Because Data Classes use normal class definition syntax, you are free to use inheritance, metaclasses, docstrings, user-defined methods, class factories, and other Python class features.
A class decorator is provided which inspects a class definition for variables with type annotations as defined in PEP 526, "Syntax for Variable Annotations". In this document, such variables are called fields. Using these fields, the decorator adds generated method definitions to the class to support instance initialization, a repr, comparison methods, and optionally other methods as described in the Specification section. Such a class is called a Data Class, but there's really nothing special about the class: the decorator adds generated methods to the class and returns the same class it was given.
This feature was introduced in PEP 557; you can read about it in more detail in the linked documentation.
Example:
In [20]: from dataclasses import dataclass
In [21]: @dataclass
    ...: class InventoryItem:
    ...:     '''Class for keeping track of an item in inventory.'''
    ...:     name: str
    ...:     unit_price: float
    ...:     quantity_on_hand: int = 0
    ...:
    ...:     def total_cost(self) -> float:
    ...:         return self.unit_price * self.quantity_on_hand
    ...:
Demo:
In [23]: II = InventoryItem('bisc', 2000)
In [24]: II
Out[24]: InventoryItem(name='bisc', unit_price=2000, quantity_on_hand=0)
In [25]: II.name = 'choco'
In [26]: II.name
Out[26]: 'choco'
In [27]:
In [27]: II.unit_price *= 3
In [28]: II.unit_price
Out[28]: 6000
In [29]: II
Out[29]: InventoryItem(name='choco', unit_price=6000, quantity_on_hand=0)
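To tie this back to the Point example from the question, here is a minimal sketch using only the standard library. It shows in-place mutation and that the repr and the field list follow declaration order. Unlike a namedtuple, a data class instance is not iterable by default, so the sketch unpacks through dataclasses.astuple:

```python
from dataclasses import astuple, dataclass, fields


@dataclass
class Point:
    x: int
    y: int


p = Point(0, 0)
p.x = 10
p.x *= 10

print(p)                            # Point(x=100, y=0)
print([f.name for f in fields(p)])  # ['x', 'y']

# Explicit conversion is needed for tuple-style unpacking.
x, y = astuple(p)
```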
The latest namedlist 1.7 passes all of your tests with both Python 2.7 and Python 3.5 as of Jan 11, 2016. It is a pure python implementation whereas the recordclass is a C extension. Of course, it depends on your requirements whether a C extension is preferred or not.
Your tests (but also see the note below):
from __future__ import print_function
import pickle
import sys
from namedlist import namedlist
Point = namedlist('Point', 'x y')
p = Point(x=1, y=2)
print('1. Mutation of field values')
p.x *= 10
p.y += 10
print('p: {}, {}\n'.format(p.x, p.y))
print('2. String')
print('p: {}\n'.format(p))
print('3. Representation')
print(repr(p), '\n')
print('4. Sizeof')
print('size of p:', sys.getsizeof(p), '\n')
print('5. Access by name of field')
print('p: {}, {}\n'.format(p.x, p.y))
print('6. Access by index')
print('p: {}, {}\n'.format(p[0], p[1]))
print('7. Iterative unpacking')
x, y = p
print('p: {}, {}\n'.format(x, y))
print('8. Iteration')
print('p: {}\n'.format([v for v in p]))
print('9. Ordered Dict')
print('p: {}\n'.format(p._asdict()))
print('10. Inplace replacement (update?)')
p._update(x=100, y=200)
print('p: {}\n'.format(p))
print('11. Pickle and Unpickle')
pickled = pickle.dumps(p)
unpickled = pickle.loads(pickled)
assert p == unpickled
print('Pickled successfully\n')
print('12. Fields\n')
print('p: {}\n'.format(p._fields))
print('13. Slots')
print('p: {}\n'.format(p.__slots__))
Output on Python 2.7
1. Mutation of field values
p: 10, 12
2. String
p: Point(x=10, y=12)
3. Representation
Point(x=10, y=12)
4. Sizeof
size of p: 64
5. Access by name of field
p: 10, 12
6. Access by index
p: 10, 12
7. Iterative unpacking
p: 10, 12
8. Iteration
p: [10, 12]
9. Ordered Dict
p: OrderedDict([('x', 10), ('y', 12)])
10. Inplace replacement (update?)
p: Point(x=100, y=200)
11. Pickle and Unpickle
Pickled successfully
12. Fields
p: ('x', 'y')
13. Slots
p: ('x', 'y')
The only difference with Python 3.5 is that the namedlist has become smaller, the size is 56 (Python 2.7 reports 64).
Note that I have changed your test 10 for in-place replacement. The namedlist has a _replace() method which does a shallow copy, and that makes perfect sense to me because the namedtuple in the standard library behaves the same way. Changing the semantics of the _replace() method would be confusing. In my opinion the _update() method should be used for in-place updates. Or maybe I failed to understand the intent of your test 10?
It seems like the answer to this question is no.
Below is pretty close, but it's not technically mutable. This is creating a new namedtuple() instance with an updated x value:
Point = namedtuple('Point', ['x', 'y'])
p = Point(0, 0)
p = p._replace(x=10)
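A quick check makes the "new instance" point concrete; the variable names original and updated are just for illustration:

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])

original = Point(0, 0)
updated = original._replace(x=10)

print(original)             # Point(x=0, y=0), the first tuple is untouched
print(updated)              # Point(x=10, y=0)
print(updated is original)  # False
```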
On the other hand, you can create a simple class using __slots__ that should work well for frequently updating class instance attributes:
class Point:
    __slots__ = ['x', 'y']
    def __init__(self, x, y):
        self.x = x
        self.y = y
To add to this answer, I think __slots__ is a good fit here because it's memory efficient when you create lots of class instances. The only downside is that you can't add attributes to an instance beyond those listed in __slots__.
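The trade-off can be seen in a short sketch (same Point shape as above, with a tuple for __slots__): declared fields stay mutable, while assigning an undeclared name fails because the instance has no __dict__ to hold it:

```python
class Point:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
p.x = 10  # declared slots can be reassigned freely

try:
    p.z = 3  # "z" is not listed in __slots__
except AttributeError as exc:
    print(f"AttributeError: {exc}")

print(hasattr(p, "__dict__"))  # False
```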
Here's one relevant thread that illustrates the memory efficiency - Dictionary vs Object - which is more efficient and why?
The quoted content in the answer of this thread is a very succinct explanation why __slots__ is more memory efficient - Python slots
The following is a good solution for Python 3: A minimal class using __slots__ and Sequence abstract base class; does not do fancy error detection or such, but it works, and behaves mostly like a mutable tuple (except for typecheck).
from collections.abc import Sequence  # the `collections.Sequence` alias was removed in Python 3.10
class NamedMutableSequence(Sequence):
    __slots__ = ()
    def __init__(self, *a, **kw):
        slots = self.__slots__
        for k in slots:
            setattr(self, k, kw.get(k))
        if a:
            for k, v in zip(slots, a):
                setattr(self, k, v)
    def __str__(self):
        clsname = self.__class__.__name__
        values = ', '.join('%s=%r' % (k, getattr(self, k))
                           for k in self.__slots__)
        return '%s(%s)' % (clsname, values)
    __repr__ = __str__
    def __getitem__(self, item):
        return getattr(self, self.__slots__[item])
    def __setitem__(self, item, value):
        return setattr(self, self.__slots__[item], value)
    def __len__(self):
        return len(self.__slots__)
class Point(NamedMutableSequence):
    __slots__ = ('x', 'y')
Example:
>>> p = Point(0, 0)
>>> p.x = 10
>>> p
Point(x=10, y=0)
>>> p.x *= 10
>>> p
Point(x=100, y=0)
If you want, you can have a method to create the class too (though using an explicit class is more transparent):
def namedgroup(name, members):
    if isinstance(members, str):
        members = members.split()
    members = tuple(members)
    return type(name, (NamedMutableSequence,), {'__slots__': members})
Example:
>>> Point = namedgroup('Point', ['x', 'y'])
>>> Point(6, 42)
Point(x=6, y=42)
In Python 2 you need to adjust it slightly - if you inherit from Sequence, the class will have a __dict__ and __slots__ will stop working.
The solution in Python 2 is to inherit from object rather than Sequence. If you want isinstance(Point(0, 0), Sequence) to be True, you need to register NamedMutableSequence as a virtual subclass of Sequence:
Sequence.register(NamedMutableSequence)
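What register() does can be sketched with a small stand-in class (Pair is a hypothetical name, and the import uses the Python 3 location collections.abc). Registration only affects isinstance/issubclass checks; it does not add Sequence's mixin methods such as index() or count() to the class:

```python
from collections.abc import Sequence


class Pair:
    """Hypothetical sequence-like class that does not inherit from Sequence."""

    __slots__ = ("first", "second")

    def __init__(self, first, second):
        self.first, self.second = first, second

    def __getitem__(self, index):
        return (self.first, self.second)[index]

    def __len__(self):
        return 2


before = issubclass(Pair, Sequence)
Sequence.register(Pair)
after = issubclass(Pair, Sequence)

print(before, after)                     # False True
print(isinstance(Pair(1, 2), Sequence))  # True
print(hasattr(Pair, "index"))            # False
```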
Tuples are by definition immutable.
You can however make a dictionary subclass where you can access the attributes with dot-notation;
In [1]: %cpaste
Pasting code; enter '--' alone on the line to stop or use Ctrl-D.
:class AttrDict(dict):
:
:    def __getattr__(self, name):
:        return self[name]
:
:    def __setattr__(self, name, value):
:        self[name] = value
:--
In [2]: test = AttrDict()
In [3]: test.a = 1
In [4]: test.b = True
In [5]: test
Out[5]: {'a': 1, 'b': True}
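One caveat with this AttrDict: a missing name raises KeyError rather than AttributeError, so hasattr() and three-argument getattr() propagate the error instead of handling it. A possible adjustment, offered as a sketch rather than as part of the original answer, is to translate the exception:

```python
class AttrDict(dict):
    """Variant of the AttrDict above that raises AttributeError for missing names."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            # hasattr() and getattr(obj, name, default) only handle AttributeError.
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self[name] = value


test = AttrDict()
test.a = 1

print(test)                         # {'a': 1}
print(hasattr(test, "missing"))     # False
print(getattr(test, "missing", 0))  # 0
```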
If you want similar behavior as namedtuples but mutable try namedlist
Note that in order to be mutable it cannot be a tuple.
Let's implement this with dynamic type creation:
import copy
def namedgroup(typename, fieldnames):
    def init(self, **kwargs):
        attrs = {k: None for k in self._attrs_}
        for k in kwargs:
            if k in self._attrs_:
                attrs[k] = kwargs[k]
            else:
                raise AttributeError('Invalid Field')
        self.__dict__.update(attrs)
    def getattribute(self, attr):
        if attr.startswith("_") or attr in self._attrs_:
            return object.__getattribute__(self, attr)
        else:
            raise AttributeError('Invalid Field')
    def setattr(self, attr, value):
        if attr in self._attrs_:
            object.__setattr__(self, attr, value)
        else:
            raise AttributeError('Invalid Field')
    def rep(self):
        d = ["{}={}".format(v,self.__dict__[v]) for v in self._attrs_]
        return self._typename_ + '(' + ', '.join(d) + ')'
    def iterate(self):
        # A generator ends simply by returning; an explicit `raise StopIteration()`
        # here would turn into a RuntimeError on Python 3.7+ (PEP 479).
        for x in self._attrs_:
            yield self.__dict__[x]
    def setitem(self, *args, **kwargs):
        return self.__dict__.__setitem__(*args, **kwargs)
    def getitem(self, *args, **kwargs):
        return self.__dict__.__getitem__(*args, **kwargs)
    attrs = {"__init__": init,
             "__setattr__": setattr,
             "__getattribute__": getattribute,
             "_attrs_": copy.deepcopy(fieldnames),
             "_typename_": str(typename),
             "__str__": rep,
             "__repr__": rep,
             "__len__": lambda self: len(fieldnames),
             "__iter__": iterate,
             "__setitem__": setitem,
             "__getitem__": getitem,
             }
    return type(typename, (object,), attrs)
This checks the attributes to see if they are valid before allowing the operation to continue.
So is this pickleable? Yes if (and only if) you do the following:
>>> import pickle
>>> Point = namedgroup("Point", ["x", "y"])
>>> p = Point(x=100, y=200)
>>> p2 = pickle.loads(pickle.dumps(p))
>>> p2.x
100
>>> p2.y
200
>>> id(p) != id(p2)
True
The definition has to be in your namespace, and must exist long enough for pickle to find it. So if you define this to be in your package, it should work.
Point = namedgroup("Point", ["x", "y"])
Pickle will fail if you do the following, or make the definition temporary (goes out of scope when the function ends, say):
some_point = namedgroup("Point", ["x", "y"])
And yes, it does preserve the order of the fields listed in the type creation.
I can't believe nobody's said this before, but it seems to me Python just wants you to write your own simple, mutable class instead of using a namedtuple whenever you need the "namedtuple" to be mutable.
Quick summary
Just jump straight down to Approach 5 below. It's short and to-the-point, and by far the best of these options.
Various, detailed approaches:
Approach 1 (good): simple, callable class with __call__()
Here is an example of a simple Point object for (x, y) points:
class Point():
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def __call__(self):
        """
        Make `Point` objects callable. Print their contents when they
        are called.
        """
        print("Point(x={}, y={})".format(self.x, self.y))
Now use it:
p1 = Point(1,2)
p1()
p1.x = 7
p1()
p1.y = 8
p1()
Here is the output:
Point(x=1, y=2)
Point(x=7, y=2)
Point(x=7, y=8)
This is pretty similar to a namedtuple, except it is fully mutable, unlike a namedtuple. Also, a namedtuple isn't callable, so to see its contents, just type the object instance name withOUT parenthesis after it (as p2 in the example below, instead of as p2()). See this example and output here:
>>> from collections import namedtuple
>>> Point2 = namedtuple("Point2", ["x", "y"])
>>> p2 = Point2(1, 2)
>>> p2
Point2(x=1, y=2)
>>> p2()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'Point2' object is not callable
>>> p2.x = 7
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: can't set attribute
Approach 2 (better): use __repr__() in place of __call__()
I just learned you can use __repr__() in place of __call__(), to get more namedtuple-like behavior. Defining the __repr__() method allows you to define "the 'official' string representation of an object" (see the official documentation here). Now, just typing p1 at the interactive prompt displays the result of its __repr__() method, and you get identical behavior to the namedtuple. Here is the new class:
class Point():
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def __repr__(self):
        """
        Obtain the string representation of `Point`, so that just typing
        the instance name of an object of this type will call this method
        and obtain this string, just like `namedtuple` already does!
        """
        return "Point(x={}, y={})".format(self.x, self.y)
Now use it:
p1 = Point(1,2)
p1
p1.x = 7
p1
p1.y = 8
p1
Here is the output:
Point(x=1, y=2)
Point(x=7, y=2)
Point(x=7, y=8)
Approach 3 (better still, but a little awkward to use): make it a callable which returns an (x, y) tuple
The original poster (OP) would also like something like this to work (see his comment below my answer):
x, y = Point(x=1, y=2)
Well, for simplicity, let's just make this work instead:
x, y = Point(x=1, y=2)()
# OR
p1 = Point(x=1, y=2)
x, y = p1()
While we are at it, let's also condense this:
self.x = x
self.y = y
...into this (source where I first saw this):
self.x, self.y = x, y
Here is the class definition for all of the above:
class Point():
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __repr__(self):
        """
        Obtain the string representation of `Point`, so that just typing
        the instance name of an object of this type will call this method
        and obtain this string, just like `namedtuple` already does!
        """
        return "Point(x={}, y={})".format(self.x, self.y)
    def __call__(self):
        """
        Make the object callable. Return a tuple of the x and y components
        of the Point.
        """
        return self.x, self.y
Here are some test calls:
p1 = Point(1,2)
p1
p1.x = 7
x, y = p1()
x2, y2 = Point(10, 12)()
x
y
x2
y2
I won't show pasting the class definition into the interpreter this time, but here are those calls with their output:
>>> p1 = Point(1,2)
>>> p1
Point(x=1, y=2)
>>> p1.x = 7
>>> x, y = p1()
>>> x2, y2 = Point(10, 12)()
>>> x
7
>>> y
2
>>> x2
10
>>> y2
12
Approach 4 (best so far, but a lot more code to write): make the class also an iterator
By making this into an iterator class, we can get this behavior:
x, y = Point(x=1, y=2)
# OR
x, y = Point(1, 2)
# OR
p1 = Point(1, 2)
x, y = p1
Let's get rid of the __call__() method, but to make this class an iterator we will add the __iter__() and __next__() methods. Read more about these things here:
https://treyhunner.com/2018/06/how-to-make-an-iterator-in-python/
How to build a basic iterator?
https://docs.python.org/3/library/exceptions.html#StopIteration
Here is the solution:
class Point():
    def __init__(self, x, y):
        self.x, self.y = x, y
        self._iterator_index = 0
        self._num_items = 2 # counting self.x and self.y
    def __repr__(self):
        """
        Obtain the string representation of `Point`, so that just typing
        the instance name of an object of this type will call this method
        and obtain this string, just like `namedtuple` already does!
        """
        return "Point(x={}, y={})".format(self.x, self.y)
    def __iter__(self):
        self._iterator_index = 0 # restart, so the same object can be unpacked more than once
        return self
    def __next__(self):
        self._iterator_index += 1
        if self._iterator_index == 1:
            return self.x
        elif self._iterator_index == 2:
            return self.y
        else:
            raise StopIteration
And here are some test calls and their output:
>>> x, y = Point(x=1, y=2)
>>> x
1
>>> y
2
>>> x, y = Point(3, 4)
>>> x
3
>>> y
4
>>> p1 = Point(5, 6)
>>> x, y = p1
>>> x
5
>>> y
6
>>> p1
Point(x=5, y=6)
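One caveat with this approach: the index lives on the instance and is never reset, so a given Point can be unpacked only once. A second `x, y = p1` raises ValueError because the iterator is already exhausted. A possible fix, which is my assumption and not part of the solution above, is to reset the index in __iter__(). A minimal sketch (with __repr__() left out for brevity):

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y
        self._iterator_index = 0

    def __iter__(self):
        # Assumption (not in the solution above): restart the iteration
        # every time a new loop or unpacking begins.
        self._iterator_index = 0
        return self

    def __next__(self):
        self._iterator_index += 1
        if self._iterator_index == 1:
            return self.x
        elif self._iterator_index == 2:
            return self.y
        raise StopIteration


p1 = Point(5, 6)
first = tuple(p1)
second = tuple(p1)  # works only because __iter__() resets the index
```

Even with the reset, two loops over the same Point at the same time would still share one index, since the object is its own iterator.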
Approach 5 (USE THIS ONE) (Perfect!--best and cleanest/shortest approach): make the class an iterable, with the yield generator keyword
Study these references:
https://treyhunner.com/2018/06/how-to-make-an-iterator-in-python/
What does the "yield" keyword do?
Here is the solution. It relies on a fancy "iterable-generator" (AKA: just "generator") keyword/Python mechanism, called yield.
Basically, when something starts iterating over a Point (a for loop or tuple unpacking, for example), Python calls the __iter__() method. Because that method contains yield, calling it does not run the body right away; it returns a "generator" object, which is itself an iterator. The first time the next item is requested, the body runs until the first yield, pauses there, and hands back that value (self.x in the code below). The next request resumes just after that yield and runs until the next one, handing back its value (self.y in the code below). Each further request continues this process until no more yield statements remain, at which point StopIteration is raised and the iteration ends. Therefore, once two items have been requested, both yield statements have been used up and the iterator is finished. The end result is that calls like this work perfectly, just as they did in Approach 4, but with far less code to write:
x, y = Point(x=1, y=2)
# OR
x, y = Point(1, 2)
# OR
p1 = Point(1, 2)
x, y = p1
Here is the solution (a part of this solution can also be found in the treyhunner.com reference just above). Notice how short and clean this solution is!
Just the class definition code; no docstrings, so you can truly see how short and simple this is:
class Point():
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __repr__(self):
        return "Point(x={}, y={})".format(self.x, self.y)

    def __iter__(self):
        yield self.x
        yield self.y
With descriptive docstrings:
class Point():
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __repr__(self):
        """
        Obtain the string representation of `Point`, so that just typing
        the instance name of an object of this type will call this method
        and obtain this string, just like `namedtuple` already does!
        """
        return "Point(x={}, y={})".format(self.x, self.y)

    def __iter__(self):
        """
        Make this `Point` class an iterable. When used as an iterable, it will
        now return `self.x` and `self.y` as the two elements of a list-like,
        iterable object, "generated" by the usages of the `yield` "generator"
        keyword.
        """
        yield self.x
        yield self.y
Copy and paste the exact same test code as used in the previous approach (Approach 4) just above, and you will get the exact same output as above as well!
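To see the yield mechanics step by step, here is a small sketch that drives the generator by hand with the built-in iter() and next() functions. The variable names are only for illustration. It also shows that every unpacking gets a fresh generator, so the same Point can be unpacked as often as you like:

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __iter__(self):
        yield self.x
        yield self.y


p = Point(1, 2)
gen = iter(p)       # calls p.__iter__(), which returns a generator object
first = next(gen)   # runs the body up to the first yield
second = next(gen)  # resumes and runs up to the second yield

try:
    next(gen)       # no yield left, so the generator is finished
    exhausted = False
except StopIteration:
    exhausted = True

x, y = p    # a new generator is created here...
x2, y2 = p  # ...and another one here
```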
References:
https://docs.python.org/3/library/collections.html#collections.namedtuple
Approach 1:
What is the difference between __init__ and __call__?
Approach 2:
https://www.tutorialspoint.com/What-does-the-repr-function-do-in-Python-Object-Oriented-Programming
Purpose of __repr__ method?
https://docs.python.org/3/reference/datamodel.html#object.__repr__
Approach 4:
*****[EXCELLENT!] https://treyhunner.com/2018/06/how-to-make-an-iterator-in-python/
How to build a basic iterator?
https://docs.python.org/3/library/exceptions.html#StopIteration
Approach 5:
See links from Approach 4, plus:
*****[EXCELLENT!] What does the "yield" keyword do?
What is the meaning of single and double underscore before an object name?
Provided performance is of little importance, one could use a silly hack like:
from collections import namedtuple
Point = namedtuple('Point', 'x y z')
mutable_z = Point(1,2,[3])
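To make the hack concrete, here is a short sketch of what it does and does not allow. The list stored in z can be changed in place, while the fields of the tuple still cannot be rebound:

```python
from collections import namedtuple

Point = namedtuple('Point', 'x y z')
mutable_z = Point(1, 2, [3])

mutable_z.z[0] = 42    # mutates the list in place; the tuple is untouched
mutable_z.z.append(7)

try:
    mutable_z.x = 10   # rebinding a field is still not allowed
    rebinding_failed = False
except AttributeError:
    rebinding_failed = True
```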
If you want to be able to create classes "on-site", I find the following very convenient:
class Struct:
    def __init__(self, **kw):
        self.__dict__.update(**kw)
That allows me to write:
p = Struct(x=0, y=0)
p.x = 10
stats = Struct(count=0, total=0.0)
stats.count += 1
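For comparison, the standard library's types.SimpleNamespace offers much the same convenience and adds a readable repr. A sketch of the same usage, in case you would rather not define the class yourself:

```python
from types import SimpleNamespace

p = SimpleNamespace(x=0, y=0)
p.x = 10

stats = SimpleNamespace(count=0, total=0.0)
stats.count += 1
```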
The most elegant way I can think of doesn't require a 3rd party library and lets you create a quick mock class constructor with default member variables, without the cumbersome type annotations that dataclasses require. That makes it better for roughing out some code:
# copy-paste 3 lines:
from inspect import getargvalues, stack
from types import SimpleNamespace
def DefaultableNS(): return SimpleNamespace(**getargvalues(stack()[1].frame)[3])
# then you can make classes with default fields on the fly in one line, eg:
def Node(value,left=None,right=None): return DefaultableNS()
node=Node(123)
print(node)
#[stdout] namespace(value=123, left=None, right=None)
print(node.value,node.left,node.right) # all fields exist
A plain SimpleNamespace is clumsier, it breaks DRY:
def Node(value,left=None,right=None):
    return SimpleNamespace(value=value,left=left,right=right)
    # breaks DRY as you need to repeat the argument names twice
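A possible middle ground, offered as a sketch and not as part of the approach above: calling the built-in locals() as the first statement of the function returns just the arguments, so the names are not repeated and no stack inspection is needed.

```python
from types import SimpleNamespace


def Node(value, left=None, right=None):
    # locals() is called before any other local variable exists,
    # so it holds only the arguments.
    return SimpleNamespace(**locals())


node = Node(123)
tree = Node(1, left=Node(2), right=Node(3))
```

The trade-off is that the `**locals()` call has to be repeated in every such function, and it must stay the first statement.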
I will share my solution to this question. I needed a way to save attributes in case my program crashed or was stopped for some reason, so that it would know where in a list of inputs to resume from. Based on @GabrielStaples's answer:
import json
import pickle
import time

class ScanSession:
    def __init__(self, input_file: str = None, output_file: str = None,
                 total_viable_wallets: int = 0, total: float = 0,
                 report_dict: dict = None, wallet_addresses: list = None,
                 token_map: list = None, token_map_file: str = 'data/token.maps.json',
                 current_batch: int = 0):
        self.initialized = time.time()
        self.input_file = input_file
        self.output_file = output_file
        self.total_viable_wallets = total_viable_wallets
        self.total = total
        # Create fresh containers per instance; mutable default arguments
        # would be shared between all sessions.
        self.report_dict = {} if report_dict is None else report_dict
        self.wallet_addresses = [] if wallet_addresses is None else wallet_addresses
        self.token_map = [] if token_map is None else token_map
        self.token_map_file = token_map_file
        self.current_batch = current_batch

    @property
    def __dict__(self):
        """
        Return the session state as a plain dict, so that it can be
        serialized (for example by `dump_session()` below).
        """
        return {'initialized': self.initialized, 'input_file': self.input_file,
                'output_file': self.output_file, 'total_viable_wallets': self.total_viable_wallets,
                'total': self.total, 'report_dict': self.report_dict,
                'wallet_addresses': self.wallet_addresses, 'token_map': self.token_map,
                'token_map_file': self.token_map_file, 'current_batch': self.current_batch
                }

    def load_session(self, session_file):
        # Read the JSON object written by `dump_session()` and restore
        # each attribute from it.
        with open(session_file, 'r') as f:
            _session = json.load(f)
        for key, value in _session.items():
            setattr(self, key, value)

    def dump_session(self, session_file):
        with open(session_file, 'w') as f:
            json.dump(self.__dict__, fp=f)
Using it:
session = ScanSession()
session.total += 10
session.__dict__
{'initialized': 1670801774.8050613, 'input_file': None, 'output_file': None, 'total_viable_wallets': 0, 'total': 10, 'report_dict': {}, 'wallet_addresses': [], 'token_map': [], 'token_map_file': 'data/token.maps.json', 'current_batch': 0}
pickle.dumps(session)
b'\x80\x04\x95\xe8\x00\x00\x00\x00\x00\x00\x00\x8c\x08__main__\x94\x8c\x0bScanSession\x94\x93\x94)\x81\x94}\x94(\x8c\x0binitialized\x94GA\xd8\xe5\x9a[\xb3\x86 \x8c\ninput_file\x94N\x8c\x0boutput_file\x94N\x8c\x14total_viable_wallets\x94K\x00\x8c\x05total\x94K\n\x8c\x0breport_dict\x94}\x94\x8c\x10wallet_addresses\x94]\x94\x8c\ttoken_map\x94]\x94\x8c\x0etoken_map_file\x94\x8c\x14data/token.maps.json\x94\x8c\rcurrent_batch\x94K\x00ub.'
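The save-and-resume idea can be sketched end to end with a trimmed-down class. MiniSession, its two fields, and the temporary file path are hypothetical stand-ins for illustration only; the built-in vars() is used here in place of the __dict__ property:

```python
import json
import os
import tempfile


class MiniSession:
    """Hypothetical, trimmed-down stand-in for ScanSession."""

    def __init__(self, total=0, current_batch=0):
        self.total = total
        self.current_batch = current_batch

    def dump_session(self, session_file):
        # vars(self) is the instance's attribute dict.
        with open(session_file, 'w') as f:
            json.dump(vars(self), f)

    def load_session(self, session_file):
        with open(session_file) as f:
            for key, value in json.load(f).items():
                setattr(self, key, value)


path = os.path.join(tempfile.mkdtemp(), 'session.json')

before = MiniSession()
before.total += 10
before.current_batch = 3
before.dump_session(path)   # e.g. called periodically while scanning

after = MiniSession()       # e.g. a new run after a crash
after.load_session(path)
```

This only round-trips values that JSON can represent (numbers, strings, lists, dicts, None), which is the case for the attributes shown above.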