python property class namespace confusion

I am confused about the use of the property class with regard to references to the fset/fget/fdel functions and the namespaces they live in. The behavior differs depending on whether I use property as a decorator or as a helper function. Why do duplicate names in the class and instance namespaces affect one example but not the other?
When using property as a decorator, as shown here, I must hide the instance variable name in __dict__ with a leading underscore to keep it from preempting the property functions. If I don't, I get infinite recursion.
class setget():
    """Play with setters and getters"""

    @property
    def x(self):
        print('getting x')
        return self._x

    @x.setter
    def x(self, x):
        print('setting x')
        self._x = x

    @x.deleter
    def x(self):
        print('deleting x')
        del self._x
and I can see _x as an instance property and x as a class property:
>>> sg = setget()
>>> sg.x = 1
setting x
>>> sg.__dict__
{'_x': 1}
>>> pprint(setget.__dict__)
mappingproxy({'__dict__': <attribute '__dict__' of 'setget' objects>,
'__doc__': 'Play with setters and getters',
'__module__': '__main__',
'__weakref__': <attribute '__weakref__' of 'setget' objects>,
'x': <property object at 0x000001BF3A0C37C8>})
>>>
Here's an example of recursion if the instance var name underscore is omitted. (code not shown here) This makes sense to me because instance property x does not exist and so we look further to class properties.
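Roughly, the omitted version presumably looks like this (a sketch; the setter assigns to self.x, which re-invokes the property setter and recurses):
class setget():
    """Play with setters and getters"""

    @property
    def x(self):
        print('getting x')
        return self.x       # looks the property up again -> recursion

    @x.setter
    def x(self, x):
        print('setting x')
        self.x = x          # calls this same setter again -> recursion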
>>> sg = setget()
>>> sg.x = 1
setting x
setting x
setting x
setting x
...
However if I use property as a helper function as described in one of the answers here:
python class attributes vs instance attributes
the name hiding underscore is not needed and there is no conflict.
Copy of the example code:
class PropertyHelperDemo:
    '''Demonstrates a property definition helper function'''

    def prop_helper(k: str, doc: str):
        print(f'Creating property instance {k}')

        def _get(self):
            print(f'getting {k}')
            return self.__dict__.__getitem__(k)  # might use '_'+k, etc.

        def _set(self, v):
            print(f'setting {k}')
            self.__dict__.__setitem__(k, v)

        def _del(self):
            print(f'deleting {k}')
            self.__dict__.__delitem__(k)

        return property(_get, _set, _del, doc)

    X: float = prop_helper('X', doc="X is the best!")
    Y: float = prop_helper('Y', doc="Y do you ask?")
    Z: float = prop_helper('Z', doc="Z plane!")
    # etc...

    def __init__(self, X: float, Y: float, Z: float):
        # super(PropertyHelperDemo, self).__init__()  # not sure why this was here
        (self.X, self.Y, self.Z) = (X, Y, Z)

    # for read-only properties, the built-in technique remains sleek enough already
    @property
    def Total(self) -> float:
        return self.X + self.Y + self.Z
And here I verify that the property fset function is being executed on subsequent calls.
>>> p = PropertyHelperDemo(1, 2, 3)
setting X
setting Y
setting Z
>>> p.X = 11
setting X
>>> p.X = 111
setting X
>>> p.__dict__
{'X': 111, 'Y': 2, 'Z': 3}
>>> pprint(PropertyHelperDemo.__dict__)
mappingproxy({'Total': <property object at 0x000002333A093F98>,
'X': <property object at 0x000002333A088EF8>,
'Y': <property object at 0x000002333A093408>,
'Z': <property object at 0x000002333A093D18>,
'__annotations__': {'X': <class 'float'>,
'Y': <class 'float'>,
'Z': <class 'float'>},
'__dict__': <attribute '__dict__' of 'PropertyHelperDemo' objects>,
'__doc__': 'Demonstrates a property definition helper function',
'__init__': <function PropertyHelperDemo.__init__ at 0x000002333A0B3AF8>,
'__module__': '__main__',
'__weakref__': <attribute '__weakref__' of 'PropertyHelperDemo' objects>,
'prop_helper': <function PropertyHelperDemo.prop_helper at 0x000002333A052F78>})
>>>
I can see the class and instance properties with overlapping names X, Y, Z in the two namespaces. It is my understanding that attribute lookup begins with the instance's own namespace, so I don't understand why the property fset function is executed here.
Any guidance is greatly appreciated.

I think you're a little astray in construing _x as an "instance property" and x as a "class property". _x is just an ordinary attribute stored on the instance, while x is a property object defined on the class; neither is tied to the other except by the behaviour you wrote into the methods decorated with @property.
They compete for the same attribute name during lookup, which is why, though they may represent the same quantity, they cannot share a name without shadowing/confusing each other.
The issue of namespaces is not directly connected to the use of the @property decorator. You don't HAVE to "hide" the attribute name - you just need to ensure that the stored attribute's name differs from the name of the property, because once you apply the @property decorator, the decorated method can be accessed just like any other attribute, without a typical method call signature including the ().
Here's an example, adjacent to the one you provided, that may help clarify. I define a class, PositionVector below, that holds the x, y and z coordinates of a point in space.
When initialising an instance of the class, I also create an attribute length that computes the length of the vector based on the x, y and z values. Trying this:
import numpy as np

class PositionVector:
    def __init__(self, x: float, y: float, z: float) -> None:
        self.x = x
        self.y = y
        self.z = z
        self.length = np.sqrt(x**2 + y**2 + z**2)

p1 = PositionVector(x = 10, y = 0, z = 0)
print (p1.length)
# Result -> 10.0
Only now I want to change the y attribute of the instance. I do this:
p1.y = 10.0
print (f"p1's 'y' value is {p1.y}")
# Result -> p1's 'y' value is 10.0
Except now, if I again access the length of the vector, I get the wrong answer:
print (f"p1's length is {p1.length}")
# Result -> p1's length is 10.0
This arises because length, which at any given instant depends on the current values of x, y, and z, is never updated, so it falls out of sync. We could fix this issue by redefining our class so that length is a method that is recalculated every time the user accesses it, like so:
class PositionVector:
    def __init__(self, x: float, y: float, z: float) -> None:
        self.x = x
        self.y = y
        self.z = z

    def length(self):
        return np.sqrt(self.x**2 + self.y**2 + self.z**2)
Now, I have a way to get the correct length of an instance of this class at all times by calling the instance's length() method:
p1 = PositionVector(x = 10, y = 0, z = 0)
print (f"p1's length is {p1.length()}")
# Result -> p1's length is 10.0
p1.y = 10.0
print (f"p1's 'y' value is {p1.y}")
# Result -> p1's 'y' value is 10.0
print (f"p1's length is {p1.length()}")
# Result -> p1's length is 14.142135623730951
This is fine, except for two issues:
If this class had been in use already, going back and changing length from an attribute to a method would break backward compatibility, forcing any other code that uses this class to need modifications before it could work as before.
Though I DO want length to recalculate every time I invoke it, I want to be able to pick it up and "handle" it like it's a "property" of the instance, not a "behaviour" of the instance. So using p1.length() to get the instance's length instead of simply p1.length feels unidiomatic.
I can restore backward compatibility, AND permit length to be accessed like any other attribute, by applying the @property decorator to the method. Simply adding @property to the length() method definition allows its call signature to go back to its original form:
@property
def length(self):
    return np.sqrt(self.x**2 + self.y**2 + self.z**2)
p1 = PositionVector(x=10, y=0, z=0)
print(f"p1's length is {p1.length}")
# Result -> p1's length is 10.0
p1.y = 10.0
print(f"p1's 'y' value is {p1.y}")
# Result -> p1's 'y' value is 10.0
print(f"p1's length is {p1.length}")
# Result -> p1's length is 14.142135623730951
At this point, there are no shadowed or "underscored" attribute names - I don't need them. I can access x, y and z normally, and access length as though it were any other attribute, yet be confident that any time I access it, I get the most current value, correctly reflecting the current values of x, y, and z. Inspecting p1.__dict__ in this state yields:
print(p1.__dict__)
# Result -> {'x': 10, 'y': 10.0, 'z': 0}
There could be use cases where you want to not only calculate length but also save its value as a static attribute of an instance. This is where you might want to create an attribute and have it hold the value of length every time it's calculated. You'd accomplish this like so:
class PositionVector:
    def __init__(self, x: float, y: float, z: float) -> None:
        self.x = x
        self.y = y
        self.z = z
        self.placeholder_attribute_name = None

    @property
    def length(self):
        self.placeholder_attribute_name = np.sqrt(self.x**2 + self.y**2 + self.z**2)
        return self.placeholder_attribute_name
Doing this has no effect whatsoever on the prior functioning of the class. It simply creates a way to statically hold the value of length, independent of the act of computing it.
You don't HAVE to name that attribute anything in particular. You can name it anything you want, except for any other name already in use. In the case above, you can't name it x, y, z, or length, because all of those have other meanings.
For readability, however, it does make sense, and it's common practice, to do the following two things:
Make it obvious that this attribute is not meant to be used directly. In the case above - you don't want someone to get the length of the vector by calling p1.placeholder_attribute_name because this is not guaranteed to yield the correct current length - they should use p1.length instead. You indicate that this attribute is not for public consumption with a commonly adopted Python convention - the leading underscore:
class PositionVector:
    def __init__(self, x: float, y: float, z: float) -> None:
        self.x = x
        self.y = y
        self.z = z
        self._placeholder_attribute_name = None

    @property
    def length(self):
        self._placeholder_attribute_name = np.sqrt(self.x**2 + self.y**2 + self.z**2)
        return self._placeholder_attribute_name
Use the name of the attribute to convey to anyone reading your code what the attribute actually means. If the attribute is meant to shadow the "length" property - putting length in there somewhere instead of the less helpful placeholder_attribute_name would enhance readability. You could indicate that this shadows length by naming it _length.
In summary:
Employing the @property decorator does not compel you to use "public" and "private" attribute names - you would only do so if, besides computing your attribute's value with the @property-decorated method, you also wanted to save that method's return value in a persistent attribute bound to every instance of the class.
Even when you DO choose to use a public propertyname and a private _propertyname, this is not an absolute rule - it is simply a convention adopted in aid of readability.

Thanks to @Vin for a nice detailed description of property but it doesn't really answer my question - which could have been worded much more clearly. It shows my confusion.
The fundamental reason for the recursion in setget but not PropertyHelperDemo is that the property methods in setget invoke themselves while the methods in PropertyHelperDemo access the instance __dict__ directly as such:
def _get(self):
    print(f'getting {k}')
    return self.__dict__.__getitem__(k)
This seems rather obvious now. It is apparent that conflicting property and __dict__ attribute names are not prevented and that the resolution order is to look for properties before __dict__ entries.
In other experiments I've found that it's possible to replace an instance method by making an entry of the same name in __dict__. So the overall resolution sequence remains less than clear (to me.)
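For instance, here is a quick sketch of that experiment (the class and names are made up): a plain method is a non-data descriptor, so an instance __dict__ entry of the same name shadows it.
class Widget:
    def report(self):
        return 'method on the class'

w = Widget()
print(w.report())                                   # method on the class
w.__dict__['report'] = lambda: 'entry in __dict__'  # no self; it is not bound
print(w.report())                                   # entry in __dict__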
Another source of confusion for me is that dir returns a list of names of methods plus __dict__ entries and other attributes, and apparently eliminates duplicates. From the doc:
If the object does not provide __dir__(), the function tries its best
to gather information from the object’s __dict__ attribute, if
defined, and from its type object. The resulting list is not
necessarily complete, and may be inaccurate when the object has a
custom __getattr__().
... If the object is a type or class object, the list contains the names
of its attributes, and recursively of the attributes of its bases.
... The resulting list is sorted alphabetically.
Interestingly, properties appear in the class __dict__ but not in the instance __dict__.
I found this in the Descriptor HowTo Guide offered by @chepner. THANKS!
The default behavior for attribute access is to get, set, or delete the attribute from an object’s dictionary. For instance, a.x has a lookup chain starting with a.__dict__['x'], then type(a).__dict__['x'], and continuing through the method resolution order of type(a). If the looked-up value is an object defining one of the descriptor methods, then Python may override the default behavior and invoke the descriptor method instead. Where this occurs in the precedence chain depends on which descriptor methods were defined.
... Instance lookup scans through a chain of namespaces giving data descriptors the highest priority, followed by instance variables, then non-data descriptors, then class variables, and lastly __getattr__() if it is provided.
A Python property is a type of descriptor so resolution through __dict__ is preempted.
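A small sketch of that priority ordering (hypothetical class), contrasting a data descriptor with a plain class variable:
class Demo:
    @property
    def a(self):
        return 'property on the class'

    b = 'class variable'

d = Demo()
d.__dict__.update(a='instance entry', b='instance entry')
print(d.a)   # property on the class -- the data descriptor preempts __dict__
print(d.b)   # instance entry -- the plain class variable is shadowed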
Another way to explore is using inspect which does not eliminate duplicates.
>>> p = PropertyHelperDemo(1, 2, 3)
setting X
setting Y
setting Z
>>>
>>> import inspect
>>> pprint(inspect.getmembers(p))
getting X
getting Y
getting Z
getting X
getting Y
getting Z
[('Total', 6),
('X', 1),
('Y', 2),
('Z', 3),
('__annotations__',
{'X': <class 'float'>, 'Y': <class 'float'>, 'Z': <class 'float'>}),
('__class__', <class '__main__.PropertyHelperDemo'>),
('__delattr__',
<method-wrapper '__delattr__' of PropertyHelperDemo object at 0x00000181D14C6608>),
('__dict__', {'X': 1, 'Y': 2, 'Z': 3}),
('__dir__',
<built-in method __dir__ of PropertyHelperDemo object at 0x00000181D14C6608>),
...
...
...
>>> pprint(inspect.getmembers(p, predicate=inspect.ismethod))
getting X
getting Y
getting Z
getting X
getting Y
getting Z
[('__init__',
<bound method PropertyHelperDemo.__init__ of <__main__.PropertyHelperDemo object at 0x00000181D14C6608>>),
('prop_helper',
<bound method PropertyHelperDemo.prop_helper of <__main__.PropertyHelperDemo object at 0x00000181D14C6608>>)]
>>>
In the first listing we can see the property methods as well as the __dict__ attributes. It's interesting (to me) that the property methods are executed by inspect. We see methods X, Y, Z executed twice because Total also calls them. Properties X, Y, Z and Total are not listed when we filter for methods.
Of course it's a great idea to re-use names like this only if you want to drive yourself and everyone else crazy.
Enough omphaloskepsis, it's time to move on.

Related

Python - How to print the variable name of an Object

Thanks for reading. Preface: I don't mean how to make print(objectA) output something other than <__main__.A object at 0x00000273BC36A5C0> by redefining the __str__ attribute.
I will use the following example to try to explain what I'm doing.
class Point:
    '''
    Represents a point in 2D space
    attributes: x, y
    '''
    def __init__(self, x=0, y=0):
        allowed_types = {int, float}
        if type(x) not in allowed_types or type(y) not in allowed_types:
            raise TypeError('Coordinates must be numbers.')
        else:
            self.x = x
            self.y = y

    def __str__(self):
        return f' "the points name" has the points: ({self.x}, {self.y})'

    __repr__ = __str__
I would like "the points name" to be replaced with whatever variable name is assigned to a specific object. So if I instantiated pointA = Point(1, 0), I would like to be able to print
pointA has the points: (1,0)
I can't seem to find anything like this online, just people having issues that are solved by redefining __str__. I tried to solve this by adding a .name attribute, but that made things very unwieldy (especially since I want to make other classes that inherit from Point). I'm not entirely sure this is possible given what I know about variable and object names in Python, but after wrestling with it for a couple of days I figured I'd reach out for ideas.
Note that an object may be referred to as multiple names.
It is also possible that there is no object name referring to the object.
Below is one approach that achieves your goal. It uses globals(), the dictionary that maps names to objects in the global environment. Essentially, the __str__ method searches for the object among the global bindings (so it can be very slow if there are many objects) and keeps each name that matches.
You could possibly use locals instead to narrow the search scope.
In the example, C is referring to the same object as A. So print(C) tells both A and C are the names.
class Point:
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def __str__(self):
        results = []
        for name, obj in globals().items():
            if obj == self:
                results.append(f' "{name}" has the points: ({self.x}, {self.y})')
        return "; ".join(results)
A = Point()
B = Point()
print(A)
#"A" has the points: (0, 0)
print(B)
# "B" has the points: (0, 0)
C = A
print(C)
# "A" has the points: (0, 0); "C" has the points: (0, 0)

@properties and public attribute

I'm following a tutorial on python 3 and there is a simple example I'm struggling with.
class P:
    def __init__(self, x):
        self.x = x

    @property
    def x(self):
        return self.__x

    @x.setter
    def x(self, x):
        if x < 0:
            self.__x = 0
        elif x > 1000:
            self.__x = 1000
        else:
            self.__x = x
Why is the attribute x in __init__ defined as public but accessed like a private attribute, self.__x, in the functions decorated with @property and @x.setter?
This isn't that straightforward, because it relies heavily on Python's descriptor protocol (see also the Descriptor HOW-TO, which covers property as well). But I will try to explain it in easy terms.
You have a class that has (besides what is inherited by the implicit superclass object and some automatically included stuff) 2 attributes:
>>> P.__dict__
mappingproxy({'__init__': <function __main__.P.__init__>,
'x': <property at 0x2842664cbd8>})
I removed the automatically added attributes for the sake of this discussion. You can always add or replace attributes as much as you want:
>>> P.y = 1000
>>> P.__dict__
mappingproxy({'__init__': <function __main__.P.__init__>,
'x': <property at 0x2842664cbd8>,
'y': 1000})
But when you create an instance the instance will have only one attribute _P__x (the _P is inserted because variables starting with __ and not ending in __ are name-mangled):
>>> p = P(10)
>>> p.__dict__
{'_P__x': 10}
You can also add almost (only almost because the descriptor protocol intercepts certain operations - see below) any attribute for the instance:
>>> p.z = 100
>>> p.__dict__
{'_P__x': 10, 'z': 100}
That's where the descriptor protocol comes into play. If you access an attribute on the instance, it starts by looking if the instance has that attribute. If the instance doesn't have that attribute it will look at the class - but through the descriptor protocol! So when you access self.x this is roughly equivalent to: type(self).x.__get__(self):
>>> p.x
10
>>> type(p).x.__get__(p)
10
Likewise setting the attribute with self.x = 200 will call type(self).x.__set__(self, 200):
>>> p.x = 200
>>> p.x
200
>>> type(p).x.__set__(p, 100)
>>> p.x
100
The @property will intercept, through the descriptor protocol, any access to x on self. So you can't use the name x to store the actual value on the instance, because it would always go through the @property and @x.setter (and also the x.deleter, but you haven't implemented that one) functions of the class. So you have to use another name to store the value.
It's typically stored under the same name with one leading underscore (which also eases maintainability). It's actually not good practice to use two leading underscores, because that makes it hard to subclass your class and modify the x property without name-mangling the variable name yourself.
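For reference, a minimal sketch of the conventional single-underscore layout, keeping the same clamping behaviour as the class above:
class P:
    def __init__(self, x):
        self.x = x

    @property
    def x(self):
        return self._x

    @x.setter
    def x(self, x):
        self._x = max(0, min(x, 1000))  # clamp to [0, 1000], as above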

Existence of mutable named tuple in Python?

Can anyone amend namedtuple or provide an alternative class so that it works for mutable objects?
Primarily for readability, I would like something similar to namedtuple that does this:
from Camelot import namedgroup
Point = namedgroup('Point', ['x', 'y'])
p = Point(0, 0)
p.x = 10
>>> p
Point(x=10, y=0)
>>> p.x *= 10
Point(x=100, y=0)
It must be possible to pickle the resulting object. And per the characteristics of named tuple, the ordering of the output when represented must match the order of the parameter list when constructing the object.
There is a mutable alternative to collections.namedtuple – recordclass.
It can be installed from PyPI:
pip3 install recordclass
It has the same API and memory footprint as namedtuple, and it supports assignment (it should be faster as well). For example:
from recordclass import recordclass
Point = recordclass('Point', 'x y')
>>> p = Point(1, 2)
>>> p
Point(x=1, y=2)
>>> print(p.x, p.y)
1 2
>>> p.x += 2; p.y += 3; print(p)
Point(x=3, y=5)
recordclass (since 0.5) supports type hints:
from recordclass import recordclass, RecordClass

class Point(RecordClass):
    x: int
    y: int
>>> Point.__annotations__
{'x':int, 'y':int}
>>> p = Point(1, 2)
>>> p
Point(x=1, y=2)
>>> print(p.x, p.y)
1 2
>>> p.x += 2; p.y += 3; print(p)
Point(x=3, y=5)
There is a more complete example (it also includes performance comparisons).
The recordclass library now provides another variant -- the recordclass.make_dataclass factory function. It supports a dataclasses-like API (there are module-level functions update, make, replace instead of the self._update, self._replace, self._asdict, cls._make methods).
from recordclass import dataobject, make_dataclass

Point = make_dataclass('Point', [('x', int), ('y', int)])
Point = make_dataclass('Point', {'x': int, 'y': int})

class Point(dataobject):
    x: int
    y: int
>>> p = Point(1, 2)
>>> p
Point(x=1, y=2)
>>> p.x = 10; p.y += 3; print(p)
Point(x=10, y=5)
recordclass and make_dataclass can produce classes whose instances occupy less memory than __slots__-based instances. This can matter for instances whose attribute values are not intended to participate in reference cycles, and it may help reduce memory usage if you need to create millions of instances. Here is an illustrative example.
types.SimpleNamespace was introduced in Python 3.3 and meets the stated requirements.
from types import SimpleNamespace
t = SimpleNamespace(foo='bar')
t.ham = 'spam'
print(t)
namespace(foo='bar', ham='spam')
print(t.foo)
'bar'
import pickle
with open('/tmp/pickle', 'wb') as f:
pickle.dump(t, f)
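A quick round-trip check (sketch): reading the pickle back should give an equal namespace, since SimpleNamespace compares by its attribute dict.
with open('/tmp/pickle', 'rb') as f:
    restored = pickle.load(f)

print(restored)       # namespace(foo='bar', ham='spam')
print(restored == t)  # True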
As a Pythonic alternative for this task, since Python 3.7 you can use the dataclasses module, which not only behaves like a mutable NamedTuple but, because it uses normal class definitions, also supports other class features.
From PEP-0557:
Although they use a very different mechanism, Data Classes can be thought of as "mutable namedtuples with defaults". Because Data Classes use normal class definition syntax, you are free to use inheritance, metaclasses, docstrings, user-defined methods, class factories, and other Python class features.
A class decorator is provided which inspects a class definition for variables with type annotations as defined in PEP 526, "Syntax for Variable Annotations". In this document, such variables are called fields. Using these fields, the decorator adds generated method definitions to the class to support instance initialization, a repr, comparison methods, and optionally other methods as described in the Specification section. Such a class is called a Data Class, but there's really nothing special about the class: the decorator adds generated methods to the class and returns the same class it was given.
This feature is introduced in PEP-0557 that you can read about it in more details on provided documentation link.
Example:
In [20]: from dataclasses import dataclass
In [21]: @dataclass
...: class InventoryItem:
...: '''Class for keeping track of an item in inventory.'''
...: name: str
...: unit_price: float
...: quantity_on_hand: int = 0
...:
...: def total_cost(self) -> float:
...: return self.unit_price * self.quantity_on_hand
...:
Demo:
In [23]: II = InventoryItem('bisc', 2000)
In [24]: II
Out[24]: InventoryItem(name='bisc', unit_price=2000, quantity_on_hand=0)
In [25]: II.name = 'choco'
In [26]: II.name
Out[26]: 'choco'
In [27]:
In [27]: II.unit_price *= 3
In [28]: II.unit_price
Out[28]: 6000
In [29]: II
Out[29]: InventoryItem(name='choco', unit_price=6000, quantity_on_hand=0)
The latest namedlist 1.7 passes all of your tests with both Python 2.7 and Python 3.5 as of Jan 11, 2016. It is a pure python implementation whereas the recordclass is a C extension. Of course, it depends on your requirements whether a C extension is preferred or not.
Your tests (but also see the note below):
from __future__ import print_function
import pickle
import sys
from namedlist import namedlist
Point = namedlist('Point', 'x y')
p = Point(x=1, y=2)
print('1. Mutation of field values')
p.x *= 10
p.y += 10
print('p: {}, {}\n'.format(p.x, p.y))
print('2. String')
print('p: {}\n'.format(p))
print('3. Representation')
print(repr(p), '\n')
print('4. Sizeof')
print('size of p:', sys.getsizeof(p), '\n')
print('5. Access by name of field')
print('p: {}, {}\n'.format(p.x, p.y))
print('6. Access by index')
print('p: {}, {}\n'.format(p[0], p[1]))
print('7. Iterative unpacking')
x, y = p
print('p: {}, {}\n'.format(x, y))
print('8. Iteration')
print('p: {}\n'.format([v for v in p]))
print('9. Ordered Dict')
print('p: {}\n'.format(p._asdict()))
print('10. Inplace replacement (update?)')
p._update(x=100, y=200)
print('p: {}\n'.format(p))
print('11. Pickle and Unpickle')
pickled = pickle.dumps(p)
unpickled = pickle.loads(pickled)
assert p == unpickled
print('Pickled successfully\n')
print('12. Fields\n')
print('p: {}\n'.format(p._fields))
print('13. Slots')
print('p: {}\n'.format(p.__slots__))
Output on Python 2.7
1. Mutation of field values
p: 10, 12
2. String
p: Point(x=10, y=12)
3. Representation
Point(x=10, y=12)
4. Sizeof
size of p: 64
5. Access by name of field
p: 10, 12
6. Access by index
p: 10, 12
7. Iterative unpacking
p: 10, 12
8. Iteration
p: [10, 12]
9. Ordered Dict
p: OrderedDict([('x', 10), ('y', 12)])
10. Inplace replacement (update?)
p: Point(x=100, y=200)
11. Pickle and Unpickle
Pickled successfully
12. Fields
p: ('x', 'y')
13. Slots
p: ('x', 'y')
The only difference with Python 3.5 is that the namedlist has become smaller, the size is 56 (Python 2.7 reports 64).
Note that I have changed your test 10 for in-place replacement. The namedlist has a _replace() method which does a shallow copy, and that makes perfect sense to me because the namedtuple in the standard library behaves the same way. Changing the semantics of the _replace() method would be confusing. In my opinion the _update() method should be used for in-place updates. Or maybe I failed to understand the intent of your test 10?
It seems like the answer to this question is no.
Below is pretty close, but it's not technically mutable. This is creating a new namedtuple() instance with an updated x value:
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
p = Point(0, 0)
p = p._replace(x=10)
On the other hand, you can create a simple class using __slots__ that should work well for frequently updating class instance attributes:
class Point:
    __slots__ = ['x', 'y']

    def __init__(self, x, y):
        self.x = x
        self.y = y
To add to this answer: I think __slots__ is a good fit here because it's memory efficient when you create lots of class instances. The only downside is that you can't add new, undeclared attributes to instances.
Here's one relevant thread that illustrates the memory efficiency - Dictionary vs Object - which is more efficient and why?
The quoted content in the answer of this thread is a very succinct explanation why __slots__ is more memory efficient - Python slots
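A quick sketch of that trade-off, using the Point class above: attributes not declared in __slots__ are rejected.
p = Point(1, 2)
p.x = 10              # fine, 'x' is declared in __slots__
try:
    p.label = 'origin'
except AttributeError as exc:
    print(exc)        # 'Point' object has no attribute 'label'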
The following is a good solution for Python 3: a minimal class using __slots__ and the Sequence abstract base class. It does not do fancy error detection or such, but it works and behaves mostly like a mutable tuple (except for type checking).
from collections.abc import Sequence  # 'from collections import Sequence' on older Python 3 versions

class NamedMutableSequence(Sequence):
    __slots__ = ()

    def __init__(self, *a, **kw):
        slots = self.__slots__
        for k in slots:
            setattr(self, k, kw.get(k))

        if a:
            for k, v in zip(slots, a):
                setattr(self, k, v)

    def __str__(self):
        clsname = self.__class__.__name__
        values = ', '.join('%s=%r' % (k, getattr(self, k))
                           for k in self.__slots__)
        return '%s(%s)' % (clsname, values)

    __repr__ = __str__

    def __getitem__(self, item):
        return getattr(self, self.__slots__[item])

    def __setitem__(self, item, value):
        return setattr(self, self.__slots__[item], value)

    def __len__(self):
        return len(self.__slots__)

class Point(NamedMutableSequence):
    __slots__ = ('x', 'y')
Example:
>>> p = Point(0, 0)
>>> p.x = 10
>>> p
Point(x=10, y=0)
>>> p.x *= 10
>>> p
Point(x=100, y=0)
If you want, you can have a method to create the class too (though using an explicit class is more transparent):
def namedgroup(name, members):
    if isinstance(members, str):
        members = members.split()
    members = tuple(members)
    return type(name, (NamedMutableSequence,), {'__slots__': members})
Example:
>>> Point = namedgroup('Point', ['x', 'y'])
>>> Point(6, 42)
Point(x=6, y=42)
In Python 2 you need to adjust it slightly - if you inherit from Sequence, the class will have a __dict__ and __slots__ will stop working.
The solution in Python 2 is to not inherit from Sequence, but object. If isinstance(Point, Sequence) == True is desired, you need to register the NamedMutableSequence as a base class to Sequence:
Sequence.register(NamedMutableSequence)
Tuples are by definition immutable.
You can however make a dictionary subclass where you can access the attributes with dot-notation;
In [1]: class AttrDict(dict):
   ...:
   ...:     def __getattr__(self, name):
   ...:         return self[name]
   ...:
   ...:     def __setattr__(self, name, value):
   ...:         self[name] = value
   ...:
In [2]: test = AttrDict()
In [3]: test.a = 1
In [4]: test.b = True
In [5]: test
Out[5]: {'a': 1, 'b': True}
If you want behavior similar to namedtuples but mutable, try namedlist.
Note that in order to be mutable it cannot be a tuple.
Let's implement this with dynamic type creation:
import copy

def namedgroup(typename, fieldnames):
    def init(self, **kwargs):
        attrs = {k: None for k in self._attrs_}
        for k in kwargs:
            if k in self._attrs_:
                attrs[k] = kwargs[k]
            else:
                raise AttributeError('Invalid Field')
        self.__dict__.update(attrs)

    def getattribute(self, attr):
        if attr.startswith("_") or attr in self._attrs_:
            return object.__getattribute__(self, attr)
        else:
            raise AttributeError('Invalid Field')

    def setattr(self, attr, value):
        if attr in self._attrs_:
            object.__setattr__(self, attr, value)
        else:
            raise AttributeError('Invalid Field')

    def rep(self):
        d = ["{}={}".format(v, self.__dict__[v]) for v in self._attrs_]
        return self._typename_ + '(' + ', '.join(d) + ')'

    def iterate(self):
        for x in self._attrs_:
            yield self.__dict__[x]
        # the generator simply ends here; explicitly raising StopIteration
        # inside a generator is a RuntimeError since Python 3.7 (PEP 479)

    def setitem(self, *args, **kwargs):
        return self.__dict__.__setitem__(*args, **kwargs)

    def getitem(self, *args, **kwargs):
        return self.__dict__.__getitem__(*args, **kwargs)

    attrs = {"__init__": init,
             "__setattr__": setattr,
             "__getattribute__": getattribute,
             "_attrs_": copy.deepcopy(fieldnames),
             "_typename_": str(typename),
             "__str__": rep,
             "__repr__": rep,
             "__len__": lambda self: len(fieldnames),
             "__iter__": iterate,
             "__setitem__": setitem,
             "__getitem__": getitem,
             }

    return type(typename, (object,), attrs)
This checks the attributes to see if they are valid before allowing the operation to continue.
So is this pickleable? Yes if (and only if) you do the following:
>>> import pickle
>>> Point = namedgroup("Point", ["x", "y"])
>>> p = Point(x=100, y=200)
>>> p2 = pickle.loads(pickle.dumps(p))
>>> p2.x
100
>>> p2.y
200
>>> id(p) != id(p2)
True
The definition has to be in your namespace, and must exist long enough for pickle to find it. So if you define this to be in your package, it should work.
Point = namedgroup("Point", ["x", "y"])
Pickle will fail if you do the following, or make the definition temporary (goes out of scope when the function ends, say):
some_point = namedgroup("Point", ["x", "y"])
And yes, it does preserve the order of the fields listed in the type creation.
I can't believe nobody's said this before, but it seems to me Python just wants you to write your own simple, mutable class instead of using a namedtuple whenever you need the "namedtuple" to be mutable.
Quick summary
Just jump straight down to Approach 5 below. It's short and to-the-point, and by far the best of these options.
Various, detailed approaches:
Approach 1 (good): simple, callable class with __call__()
Here is an example of a simple Point object for (x, y) points:
class Point():
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __call__(self):
        """
        Make `Point` objects callable. Print their contents when they
        are called.
        """
        print("Point(x={}, y={})".format(self.x, self.y))
Now use it:
p1 = Point(1,2)
p1()
p1.x = 7
p1()
p1.y = 8
p1()
Here is the output:
Point(x=1, y=2)
Point(x=7, y=2)
Point(x=7, y=8)
This is pretty similar to a namedtuple, except it is fully mutable, unlike a namedtuple. Also, a namedtuple isn't callable, so to see its contents, just type the object instance name withOUT parenthesis after it (as p2 in the example below, instead of as p2()). See this example and output here:
>>> from collections import namedtuple
>>> Point2 = namedtuple("Point2", ["x", "y"])
>>> p2 = Point2(1, 2)
>>> p2
Point2(x=1, y=2)
>>> p2()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'Point2' object is not callable
>>> p2.x = 7
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: can't set attribute
Approach 2 (better): use __repr__() in place of __call__()
I just learned you can use __repr__() in place of __call__(), to get more namedtuple-like behavior. Defining the __repr__() method allows you to define "the 'official' string representation of an object" (see the official documentation here). Now, just calling p1 is the equivalent of calling the __repr__() method, and you get identical behavior to the namedtuple. Here is the new class:
class Point():
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        """
        Obtain the string representation of `Point`, so that just typing
        the instance name of an object of this type will call this method
        and obtain this string, just like `namedtuple` already does!
        """
        return "Point(x={}, y={})".format(self.x, self.y)
Now use it:
p1 = Point(1,2)
p1
p1.x = 7
p1
p1.y = 8
p1
Here is the output:
Point(x=1, y=2)
Point(x=7, y=2)
Point(x=7, y=8)
Approach 3 (better still, but a little awkward to use): make it a callable which returns an (x, y) tuple
The original poster (OP) would also like something like this to work (see his comment below my answer):
x, y = Point(x=1, y=2)
Well, for simplicity, let's just make this work instead:
x, y = Point(x=1, y=2)()
# OR
p1 = Point(x=1, y=2)
x, y = p1()
While we are at it, let's also condense this:
self.x = x
self.y = y
...into this (source where I first saw this):
self.x, self.y = x, y
Here is the class definition for all of the above:
class Point():
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __repr__(self):
        """
        Obtain the string representation of `Point`, so that just typing
        the instance name of an object of this type will call this method
        and obtain this string, just like `namedtuple` already does!
        """
        return "Point(x={}, y={})".format(self.x, self.y)

    def __call__(self):
        """
        Make the object callable. Return a tuple of the x and y components
        of the Point.
        """
        return self.x, self.y
Here are some test calls:
p1 = Point(1,2)
p1
p1.x = 7
x, y = p1()
x2, y2 = Point(10, 12)()
x
y
x2
y2
I won't show pasting the class definition into the interpreter this time, but here are those calls with their output:
>>> p1 = Point(1,2)
>>> p1
Point(x=1, y=2)
>>> p1.x = 7
>>> x, y = p1()
>>> x2, y2 = Point(10, 12)()
>>> x
7
>>> y
2
>>> x2
10
>>> y2
12
Approach 4 (best so far, but a lot more code to write): make the class also an iterator
By making this into an iterator class, we can get this behavior:
x, y = Point(x=1, y=2)
# OR
x, y = Point(1, 2)
# OR
p1 = Point(1, 2)
x, y = p1
Let's get rid of the __call__() method, but to make this class an iterator we will add the __iter__() and __next__() methods. Read more about these things here:
https://treyhunner.com/2018/06/how-to-make-an-iterator-in-python/
How to build a basic iterator?
https://docs.python.org/3/library/exceptions.html#StopIteration
Here is the solution:
class Point():
    def __init__(self, x, y):
        self.x, self.y = x, y
        self._iterator_index = 0
        self._num_items = 2  # counting self.x and self.y

    def __repr__(self):
        """
        Obtain the string representation of `Point`, so that just typing
        the instance name of an object of this type will call this method
        and obtain this string, just like `namedtuple` already does!
        """
        return "Point(x={}, y={})".format(self.x, self.y)

    def __iter__(self):
        return self

    def __next__(self):
        self._iterator_index += 1
        if self._iterator_index == 1:
            return self.x
        elif self._iterator_index == 2:
            return self.y
        else:
            raise StopIteration
And here are some test calls and their output:
>>> x, y = Point(x=1, y=2)
>>> x
1
>>> y
2
>>> x, y = Point(3, 4)
>>> x
3
>>> y
4
>>> p1 = Point(5, 6)
>>> x, y = p1
>>> x
5
>>> y
6
>>> p1
Point(x=5, y=6)
Approach 5 (USE THIS ONE) (Perfect!--best and cleanest/shortest approach): make the class an iterable, with the yield generator keyword
Study these references:
https://treyhunner.com/2018/06/how-to-make-an-iterator-in-python/
What does the "yield" keyword do?
Here is the solution. It relies on a fancy "iterable-generator" (AKA: just "generator") keyword/Python mechanism, called yield.
Basically, the first time an iterable calls for the next item, it calls the __iter__() method, and stops and returns the contents of the first yield call (self.x in the code below). The next time an iterable calls for the next item, it picks up where it last left off (just after the first yield in this case), and looks for the next yield, stopping and returning the contents of that yield call (self.y in the code below). Each "return" from a yield actually returns a "generator" object, which is an iterable itself, so you can iterate on it. Each new iterable call for the next item continues this process, starting up where it last left off, just after the most-recently-called yield, until no more yield calls exist, at which point the iterations are ended and the iterable has been fully iterated. Therefore, once this iterable has called for two objects, both yield calls have been used up, so the iterator ends. The end result is that calls like this work perfectly, just as they did in Approach 4, but with far less code to write!:
x, y = Point(x=1, y=2)
# OR
x, y = Point(1, 2)
# OR
p1 = Point(1, 2)
x, y = p1
Here is the solution (a part of this solution can also be found in the treyhunner.com reference just above). Notice how short and clean this solution is!
Just the class definition code; no docstrings, so you can truly see how short and simple this is:
class Point():
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __repr__(self):
        return "Point(x={}, y={})".format(self.x, self.y)

    def __iter__(self):
        yield self.x
        yield self.y
With descriptive docstrings:
class Point():
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __repr__(self):
        """
        Obtain the string representation of `Point`, so that just typing
        the instance name of an object of this type will call this method
        and obtain this string, just like `namedtuple` already does!
        """
        return "Point(x={}, y={})".format(self.x, self.y)

    def __iter__(self):
        """
        Make this `Point` class an iterable. When used as an iterable, it will
        now return `self.x` and `self.y` as the two elements of a list-like,
        iterable object, "generated" by the usages of the `yield` "generator"
        keyword.
        """
        yield self.x
        yield self.y
Copy and paste the exact same test code as used in the previous approach (Approach 4) just above, and you will get the exact same output as above as well!
References:
https://docs.python.org/3/library/collections.html#collections.namedtuple
Approach 1:
What is the difference between __init__ and __call__?
Approach 2:
https://www.tutorialspoint.com/What-does-the-repr-function-do-in-Python-Object-Oriented-Programming
Purpose of __repr__ method?
https://docs.python.org/3/reference/datamodel.html#object.__repr__
Approach 4:
*****[EXCELLENT!] https://treyhunner.com/2018/06/how-to-make-an-iterator-in-python/
How to build a basic iterator?
https://docs.python.org/3/library/exceptions.html#StopIteration
Approach 5:
See links from Approach 4, plus:
*****[EXCELLENT!] What does the "yield" keyword do?
What is the meaning of single and double underscore before an object name?
Provided performance is of little importance, one could use a silly hack like:
from collections import namedtuple
Point = namedtuple('Point', 'x y z')
mutable_z = Point(1,2,[3])
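A short usage sketch: the tuple itself stays immutable, but the list it holds can be mutated in place.
mutable_z.z[0] = 42
print(mutable_z)       # Point(x=1, y=2, z=[42])
mutable_z.z.append(7)  # also fine -- only rebinding the field is forbidden
# mutable_z.z = [0]    # rebinding the field would raise AttributeError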
If you want to be able to create classes "on-site", I find the following very convenient:
class Struct:
    def __init__(self, **kw):
        self.__dict__.update(**kw)
That allows me to write:
p = Struct(x=0, y=0)
p.x = 10
stats = Struct(count=0, total=0.0)
stats.count += 1
The most elegant way I can think of doesn't require a third-party library, and it lets you create a quick mock class constructor with default member variables without dataclasses' cumbersome type specification. So it's better for roughing out some code:
# copy-paste 3 lines:
from inspect import getargvalues, stack
from types import SimpleNamespace
def DefaultableNS(): return SimpleNamespace(**getargvalues(stack()[1].frame)[3])
# then you can make classes with default fields on the fly in one line, eg:
def Node(value,left=None,right=None): return DefaultableNS()
node=Node(123)
print(node)
#[stdout] namespace(value=123, left=None, right=None)
print(node.value,node.left,node.right) # all fields exist
A plain SimpleNamespace is clumsier; it breaks DRY:
def Node(value, left=None, right=None):
    return SimpleNamespace(value=value, left=left, right=right)
    # breaks DRY as you need to repeat the argument names twice
I will share my solution to this question. I needed a way to save attributes in case my program crashed or was stopped for some reason, so that it would know where in a list of inputs to resume from. Based on @GabrielStaples's answer:
import time
import pickle, json

class ScanSession:
    def __init__(self, input_file: str = None, output_file: str = None,
                 total_viable_wallets: int = 0, total: float = 0,
                 report_dict: dict = {}, wallet_addresses: list = [],
                 token_map: list = [], token_map_file: str = 'data/token.maps.json',
                 current_batch: int = 0):
        self.initialized = time.time()
        self.input_file = input_file
        self.output_file = output_file
        self.total_viable_wallets = total_viable_wallets
        self.total = total
        self.report_dict = report_dict
        self.wallet_addresses = wallet_addresses
        self.token_map = token_map
        self.token_map_file = token_map_file
        self.current_batch = current_batch

    @property
    def __dict__(self):
        """
        Expose the session state as a plain dict so that it can be
        serialized with json.dump (or pickled).
        """
        return {'initialized': self.initialized, 'input_file': self.input_file,
                'output_file': self.output_file, 'total_viable_wallets': self.total_viable_wallets,
                'total': self.total, 'report_dict': self.report_dict,
                'wallet_addresses': self.wallet_addresses, 'token_map': self.token_map,
                'token_map_file': self.token_map_file, 'current_batch': self.current_batch
                }

    def load_session(self, session_file):
        with open(session_file, 'r') as f:
            _session = json.load(f)  # parse the JSON written by dump_session
        for key, value in _session.items():
            setattr(self, key, value)

    def dump_session(self, session_file):
        with open(session_file, 'w') as f:
            json.dump(self.__dict__, fp=f)
Using it:
session = ScanSession()
session.total += 1
session.__dict__
{'initialized': 1670801774.8050613, 'input_file': None, 'output_file': None, 'total_viable_wallets': 0, 'total': 10, 'report_dict': {}, 'wallet_addresses': [], 'token_map': [], 'token_map_file': 'data/token.maps.json', 'current_batch': 0}
pickle.dumps(session)
b'\x80\x04\x95\xe8\x00\x00\x00\x00\x00\x00\x00\x8c\x08__main__\x94\x8c\x0bScanSession\x94\x93\x94)\x81\x94}\x94(\x8c\x0binitialized\x94GA\xd8\xe5\x9a[\xb3\x86 \x8c\ninput_file\x94N\x8c\x0boutput_file\x94N\x8c\x14total_viable_wallets\x94K\x00\x8c\x05total\x94K\n\x8c\x0breport_dict\x94}\x94\x8c\x10wallet_addresses\x94]\x94\x8c\ttoken_map\x94]\x94\x8c\x0etoken_map_file\x94\x8c\x14data/token.maps.json\x94\x8c\rcurrent_batch\x94K\x00ub.'

Argument passing by reference to a class in python (à la C++), to modify it with the class methods

In this case, I want the program to print "X = changed"
class Clase:
    def __init__(self, variable):
        self.var = variable

    def set_var(self):
        self.var = 'changed'

X = 'unchanged'
V = Clase(X)
V.set_var()
print "X = ", X
All values are objects and are passed by reference in Python, and assignment changes the reference.
def myfunc(y):
    y = 13

x = 42      # x now points at the integer object, 42
myfunc(x)   # inside myfunc, y initially points to 42,
            # but myfunc changes its y to point to a
            # different object, 13
print(x)    # prints 42, since changing y inside myfunc
            # does not change any other variable
It's important to note here that there are no "simple types" as there are in other languages. In Python, integers are objects. Floats are objects. Bools are objects. And assignment is always changing a pointer to refer to a different object, whatever the type of that object.
Thus, it's not possible to "assign through" a reference and change someone else's variable. You can, however, simulate this by passing a mutable container (e.g. a list or a dictionary) and changing the contents of the container, as others have shown.
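For example, a small sketch of that simulation (hypothetical names): mutate the container's contents rather than rebinding the parameter.
def set_first(container, value):
    container[0] = value   # changes the shared list object in place

items = ['unchanged']
set_first(items, 'changed')
print(items[0])            # changed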
This kind of mutation of arguments through pointers is common in C/C++ and is generally used to work around the fact that a function can only have a single return value. Python will happily create tuples for you in the return statement and unpack them to multiple variables on the other side, making it easy to return multiple values, so this isn't an issue. Just return all the values you want to return. Here is a trivial example:
def myfunc(x, y, z):
    return x * 2, y + 5, z - 3
On the other side:
a, b, c = myfunc(4, 5, 6)
In practice, then, there is rarely any reason to need to do what you're trying to do in Python.
In Python, list and dict objects are mutable and are effectively passed around by reference, so if you change the type of your variable X to one of those you will get the desired result.
[EDIT: Added use case that op needed]
class Clase:
    def __init__(self, variable):
        self.var = variable

    def set_var(self):
        self.var.test = 'changed'

class ComplicatedClass():
    def __init__(self, test):
        self.test = test

X = ComplicatedClass('unchanged')
print('Before:', X.test)
V = Clase(X)
V.set_var()
print("After:", X.test)
>>> Before: unchanged
>>> After: changed
strings are immutable, so you cannot change X in this way
... an alternative might be reassigning X in the global namespace... this obviously will fail in many, many scenarios (i.e. when it is not a global)
class Clase:
    def __init__(self, variable):
        self.var = variable

    def set_var(self):
        globals()[self.var] = 'changed'

X = 'unchanged'
V = Clase('X')
V.set_var()
print "X = ", X
the other alternative is to use a mutable data type as suggested by Ashwin
or the best option is that this is probably not a good idea and you should likely not do it...

Does a Python object which doesn't override comparison operators equal itself?

class A(object):
    def __init__(self, value):
        self.value = value

x = A(1)
y = A(2)
q = [x, y]
q.remove(y)
I want to remove from the list a specific object which was added before to it and to which I still have a reference. I do not want an equality test. I want an identity test. This code seems to work in both CPython and IronPython, but does the language guarantee this behavior or is it just a fluke?
The list.remove method documentation is this: same as del s[s.index(x)], which implies that an equality test is performed.
So will an object be equal to itself if you don't override __cmp__, __eq__ or __ne__?
Yes. In your example q.remove(y) would remove the first occurrence of an object which compares equal with y. However, the way the class A is defined, you shouldn't† ever have a variable compare equal with y - with the exception of any other names which are also bound to the same y instance.
The relevant section of the docs is here:
If no __cmp__(), __eq__() or __ne__() operation is defined, class
instances are compared by object identity ("address").
So comparison for A instances is by identity (implemented as memory address in CPython). No other object can have an identity equal to id(y) within y's lifetime, i.e. for as long as you hold a reference to y (which you must, if you're going to remove it from a list!)
† Technically, it is still possible to have objects at other memory locations that compare equal - mock.ANY is one such example. But those objects need to override their comparison operators to force the result.
In Python, by default an object is always equal to itself (the only exception I can think of is float("nan")). An object of a user-defined class will not be equal to any other object unless you define a comparison function.
See also http://docs.python.org/reference/expressions.html#notin
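A quick check of the NaN caveat, which is also relevant to list.remove (containment and removal test identity before equality):
nan = float("nan")
print(nan == nan)    # False -- NaN does not compare equal to itself
print(nan in [nan])  # True  -- the identity check short-circuits the comparison
[nan].remove(nan)    # likewise succeeds, no ValueError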
The answer is yes and no.
Consider the following example
>>> class A(object):
        def __init__(self, value):
            self.value = value
>>> x = A(1)
>>> y = A(2)
>>> z = A(3)
>>> w = A(3)
>>> q = [x, y,z]
>>> id(y) #Second element in the list and y has the same reference
46167248
>>> id(q[1]) #Second element in the list and y has the same reference
46167248
>>> q.remove(y) #So it just compares the id and removes it
>>> q
[<__main__.A object at 0x02C19AB0>, <__main__.A object at 0x02C19B50>]
>>> q.remove(w) #Fails because though z and w contain the same value, they are different objects
Traceback (most recent call last):
File "<pyshell#11>", line 1, in <module>
q.remove(w)
ValueError: list.remove(x): x not in list
It will remove the element from the list only if it is the very same object. If they are different objects with the same value, it won't remove it.
