Why does setattr fail on a bound method - python

In the following, setattr succeeds in the first invocation, but fails in the second, with:
AttributeError: 'method' object has no attribute 'i'
Why is this, and is there a way of setting an attribute on a method such that it will only exist on one instance, not for each instance of the class?
class c:
    def m(self):
        print(type(c.m))
        setattr(c.m, 'i', 0)
        print(type(self.m))
        setattr(self.m, 'i', 0)
Python 3.2.2

The short answer: There is no way of adding custom attributes to bound methods.
The long answer follows.
In Python, there are function objects and method objects. When you define a class, the def statement creates a function object that lives within the class' namespace:
>>> class c:
...     def m(self):
...         pass
...
>>> c.m
<function m at 0x025FAE88>
Function objects have a special __dict__ attribute that can hold user-defined attributes:
>>> c.m.i = 0
>>> c.m.__dict__
{'i': 0}
Method objects are different beasts. They are tiny objects just holding a reference to the corresponding function object (__func__) and one to its host object (__self__):
>>> c().m
<bound method c.m of <__main__.c object at 0x025206D0>>
>>> c().m.__self__
<__main__.c object at 0x02625070>
>>> c().m.__func__
<function m at 0x025FAE88>
>>> c().m.__func__ is c.m
True
Method objects provide a special __getattr__ that forwards attribute access to the function object:
>>> c().m.i
0
This is also true for the __dict__ property:
>>> c().m.__dict__['a'] = 42
>>> c.m.a
42
>>> c().m.__dict__ is c.m.__dict__
True
Setting attributes follows the default rules, though, and since method objects don't have a __dict__ of their own, there is no way to set arbitrary attributes on them.
This is similar to instances of user-defined classes that define __slots__ and no __dict__ slot: trying to set a non-existent slot raises an AttributeError (see the docs on __slots__ for more information):
>>> class c:
...     __slots__ = ('a', 'b')
...
>>> x = c()
>>> x.a = 1
>>> x.b = 2
>>> x.c = 3
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'c' object has no attribute 'c'
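If the goal is per-instance data attached to something method-like, one workaround (a minimal sketch of my own, not from the answer above) is to shadow the method with a functools.partial stored on the instance; per the functools docs, partial objects, like functions, can have attributes:
import functools

class c:
    def m(self):
        pass

obj = c()
obj.m = functools.partial(c.m, obj)  # per-instance callable stored in obj.__dict__
obj.m.i = 0                          # partial objects accept arbitrary attributes
obj.m()                              # still calls c.m with obj as self
assert not hasattr(c().m, 'i')       # other instances are unaffected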

Q: "Is there a way of setting an attribute on a method such that it will only exist on one instance, not for each instance of the class?"
A: Yes:
class c:
    def m(self):
        print(type(c.m))
        setattr(c.m, 'i', 0)
        print(type(self))
        setattr(self, 'i', 0)
The static variable on functions in the post you link to is not useful for methods. It sets an attribute on the function so that this attribute is available the next time the function is called, so you can make a counter or whatnot.
But methods have an object instance associated with them (self). Hence you have no need to set attributes on the method, as you can simply set them on the instance instead. That is in fact exactly what the instance is for.
The post you link to shows how to make a function with a static variable. I would say that in Python doing so would be misguided. Instead look at this answer: What is the Python equivalent of static variables inside a function?
That is the way to do it in Python in a way that is clear and easily understandable. You use a class and make it callable. Setting attributes on functions is possible and there are probably cases where it's a good idea, but in general it will just end up confusing people.
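To illustrate the callable-class approach, here is a minimal sketch (the Counter name is my own):
class Counter:
    """A callable object whose state is explicit; a common alternative
    to stashing 'static variable' attributes on a function."""
    def __init__(self):
        self.count = 0
    def __call__(self):
        self.count += 1
        return self.count

counter = Counter()
counter()  # 1
counter()  # 2 -- the state lives on the instance, in plain sight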


What is the purpose of __slots__ in Python — especially with respect to when I would want to use it, and when not?
In Python, what is the purpose of __slots__ and what are the cases one should avoid this?
TLDR:
The special attribute __slots__ allows you to explicitly state which instance attributes you expect your object instances to have, with the expected results:
faster attribute access.
space savings in memory.
The space savings come from:
Storing value references in slots instead of __dict__.
Denying __dict__ and __weakref__ creation if parent classes deny them and you declare __slots__.
Quick Caveats
Small caveat: you should only declare a particular slot once in an inheritance tree. For example:
class Base:
    __slots__ = 'foo', 'bar'

class Right(Base):
    __slots__ = 'baz',

class Wrong(Base):
    __slots__ = 'foo', 'bar', 'baz'  # redundant foo and bar
Python doesn't object when you get this wrong (it probably should), and problems might not otherwise manifest, but your objects will take up more space than they should. In Python 3.8:
>>> from sys import getsizeof
>>> getsizeof(Right()), getsizeof(Wrong())
(56, 72)
This is because the Base's slot descriptor has a slot separate from the Wrong's. This shouldn't usually come up, but it could:
>>> w = Wrong()
>>> w.foo = 'foo'
>>> Base.foo.__get__(w)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: foo
>>> Wrong.foo.__get__(w)
'foo'
The biggest caveat is for multiple inheritance: multiple "parent classes with nonempty slots" cannot be combined.
To accommodate this restriction, follow best practices: factor the parents' functionality into one or more abstract classes with empty slots (just like the abstract base classes in the standard library), have each concrete parent inherit from its abstraction, and have your new concrete class inherit from the abstractions collectively.
See section on multiple inheritance below for an example.
Requirements:
For attributes named in __slots__ to actually be stored in slots instead of a __dict__, a class must inherit from object (automatic in Python 3, but this must be explicit in Python 2).
To prevent the creation of a __dict__, you must inherit from object and all classes in the inheritance must declare __slots__ and none of them can have a '__dict__' entry.
There are a lot of details if you wish to keep reading.
Why use __slots__: Faster attribute access.
The creator of Python, Guido van Rossum, states that he actually created __slots__ for faster attribute access.
It is trivial to demonstrate significantly faster access:
import timeit
class Foo(object): __slots__ = 'foo',
class Bar(object): pass
slotted = Foo()
not_slotted = Bar()
def get_set_delete_fn(obj):
    def get_set_delete():
        obj.foo = 'foo'
        obj.foo
        del obj.foo
    return get_set_delete
and
>>> min(timeit.repeat(get_set_delete_fn(slotted)))
0.2846834529991611
>>> min(timeit.repeat(get_set_delete_fn(not_slotted)))
0.3664822799983085
The slotted access is almost 30% faster in Python 3.5 on Ubuntu.
>>> 0.3664822799983085 / 0.2846834529991611
1.2873325658284342
In Python 2 on Windows I have measured it about 15% faster.
Why use __slots__: Memory Savings
Another purpose of __slots__ is to reduce the space in memory that each object instance takes up.
My own contribution to the documentation clearly states the reasons behind this:
The space saved over using __dict__ can be significant.
SQLAlchemy attributes a lot of memory savings to __slots__.
To verify this, using the Anaconda distribution of Python 2.7 on Ubuntu Linux, with guppy.hpy (aka heapy) and sys.getsizeof: the size of a class instance without __slots__ declared, and nothing else, is 64 bytes. That does not include the __dict__. Thanks to Python's lazy evaluation, the __dict__ is apparently not called into existence until it is referenced; but classes without data are usually useless. When called into existence, the __dict__ attribute is a minimum of 280 additional bytes.
In contrast, a class instance with __slots__ declared to be () (no data) is only 16 bytes, and 56 total bytes with one item in slots, 64 with two.
For 64 bit Python, I illustrate the memory consumption in bytes in Python 2.7 and 3.6, for __slots__ and __dict__ (no slots defined) for each point where the dict grows in 3.6 (except for 0, 1, and 2 attributes):
        Python 2.7              Python 3.6
attrs   __slots__  __dict__*    __slots__  __dict__*    *(no slots defined)
none        16     56 + 272†        16     56 + 112†    †if __dict__ referenced
one         48     56 + 272         48     56 + 112
two         56     56 + 272         56     56 + 112
six         88     56 + 1040        88     56 + 152
11         128     56 + 1040       128     56 + 240
22         216     56 + 3344       216     56 + 408
43         384     56 + 3344       384     56 + 752
So, in spite of smaller dicts in Python 3, we see how nicely __slots__ scale for instances to save us memory, and that is a major reason you would want to use __slots__.
Just for completeness of my notes, note that there is a one-time cost per slot in the class's namespace of 64 bytes in Python 2, and 72 bytes in Python 3, because slots use data descriptors like properties, called "members".
>>> Foo.foo
<member 'foo' of 'Foo' objects>
>>> type(Foo.foo)
<class 'member_descriptor'>
>>> getsizeof(Foo.foo)
72
Demonstration of __slots__:
To deny the creation of a __dict__, you must subclass object. Everything subclasses object in Python 3, but in Python 2 you had to be explicit:
class Base(object):
    __slots__ = ()
now:
>>> b = Base()
>>> b.a = 'a'
Traceback (most recent call last):
File "<pyshell#38>", line 1, in <module>
b.a = 'a'
AttributeError: 'Base' object has no attribute 'a'
Or subclass another class that defines __slots__
class Child(Base):
    __slots__ = ('a',)
and now:
c = Child()
c.a = 'a'
but:
>>> c.b = 'b'
Traceback (most recent call last):
File "<pyshell#42>", line 1, in <module>
c.b = 'b'
AttributeError: 'Child' object has no attribute 'b'
To allow __dict__ creation while subclassing slotted objects, just add '__dict__' to the __slots__ (note that slots are ordered, and you shouldn't repeat slots that are already in parent classes):
class SlottedWithDict(Child):
    __slots__ = ('__dict__', 'b')
swd = SlottedWithDict()
swd.a = 'a'
swd.b = 'b'
swd.c = 'c'
and
>>> swd.__dict__
{'c': 'c'}
Or you don't even need to declare __slots__ in your subclass, and you will still use slots from the parents, but not restrict the creation of a __dict__:
class NoSlots(Child): pass
ns = NoSlots()
ns.a = 'a'
ns.b = 'b'
And:
>>> ns.__dict__
{'b': 'b'}
However, __slots__ may cause problems for multiple inheritance:
class BaseA(object):
    __slots__ = ('a',)

class BaseB(object):
    __slots__ = ('b',)
Because creating a child class from parents with both non-empty slots fails:
>>> class Child(BaseA, BaseB): __slots__ = ()
Traceback (most recent call last):
File "<pyshell#68>", line 1, in <module>
class Child(BaseA, BaseB): __slots__ = ()
TypeError: Error when calling the metaclass bases
multiple bases have instance lay-out conflict
If you run into this problem, you could just remove __slots__ from the parents; or, if you have control of the parents, give them empty slots, or refactor to abstractions:
from abc import ABC
class AbstractA(ABC):
    __slots__ = ()

class BaseA(AbstractA):
    __slots__ = ('a',)

class AbstractB(ABC):
    __slots__ = ()

class BaseB(AbstractB):
    __slots__ = ('b',)

class Child(AbstractA, AbstractB):
    __slots__ = ('a', 'b')
c = Child() # no problem!
Add '__dict__' to __slots__ to get dynamic assignment:
class Foo(object):
    __slots__ = 'bar', 'baz', '__dict__'
and now:
>>> foo = Foo()
>>> foo.boink = 'boink'
So with '__dict__' in slots we lose some of the size benefits with the upside of having dynamic assignment and still having slots for the names we do expect.
When you inherit from an object that isn't slotted, you get the same sort of semantics when you use __slots__ - names that are in __slots__ point to slotted values, while any other values are put in the instance's __dict__.
Avoiding __slots__ because you want to be able to add attributes on the fly is actually not a good reason - just add "__dict__" to your __slots__ if this is required.
You can similarly add __weakref__ to __slots__ explicitly if you need that feature.
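A minimal sketch of that (the class name is mine): in a fully slotted hierarchy without '__weakref__' in any __slots__, weakref.ref raises a TypeError:
import weakref

class SlottedWithWeakref:
    __slots__ = ('a', '__weakref__')  # '__weakref__' enables weak references

sw = SlottedWithWeakref()
r = weakref.ref(sw)  # works; without '__weakref__' in __slots__, TypeError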
Set to empty tuple when subclassing a namedtuple:
The namedtuple builtin makes immutable instances that are very lightweight (essentially, the size of tuples), but to get the memory benefits you need to declare __slots__ = () yourself when you subclass them:
from collections import namedtuple
class MyNT(namedtuple('MyNT', 'bar baz')):
"""MyNT is an immutable and lightweight object"""
__slots__ = ()
usage:
>>> nt = MyNT('bar', 'baz')
>>> nt.bar
'bar'
>>> nt.baz
'baz'
And trying to assign an unexpected attribute raises an AttributeError because we have prevented the creation of __dict__:
>>> nt.quux = 'quux'
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'MyNT' object has no attribute 'quux'
You can allow __dict__ creation by leaving off __slots__ = (), but you can't use non-empty __slots__ with subtypes of tuple.
Biggest Caveat: Multiple inheritance
Even when non-empty slots are the same for multiple parents, they cannot be used together:
class Foo(object):
    __slots__ = 'foo', 'bar'

class Bar(object):
    __slots__ = 'foo', 'bar'  # alas, would work if empty, i.e. ()
>>> class Baz(Foo, Bar): pass
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: Error when calling the metaclass bases
multiple bases have instance lay-out conflict
Using an empty __slots__ in the parent seems to provide the most flexibility, allowing the child to choose to prevent or allow (by adding '__dict__' to get dynamic assignment, see section above) the creation of a __dict__:
class Foo(object): __slots__ = ()
class Bar(object): __slots__ = ()
class Baz(Foo, Bar): __slots__ = ('foo', 'bar')
b = Baz()
b.foo, b.bar = 'foo', 'bar'
You don't have to have slots - so if you add them, and remove them later, it shouldn't cause any problems.
Going out on a limb here: If you're composing mixins or using abstract base classes, which aren't intended to be instantiated, an empty __slots__ in those parents seems to be the best way to go in terms of flexibility for subclassers.
To demonstrate, first, let's create a class with code we'd like to use under multiple inheritance
class AbstractBase:
    __slots__ = ()
    def __init__(self, a, b):
        self.a = a
        self.b = b
    def __repr__(self):
        return f'{type(self).__name__}({repr(self.a)}, {repr(self.b)})'
We could use the above directly by inheriting and declaring the expected slots:
class Foo(AbstractBase):
    __slots__ = 'a', 'b'
But we don't care about that; that's trivial single inheritance. We need another class we might also inherit from, maybe with a noisy attribute:
class AbstractBaseC:
    __slots__ = ()
    @property
    def c(self):
        print('getting c!')
        return self._c
    @c.setter
    def c(self, arg):
        print('setting c!')
        self._c = arg
Now if both bases had nonempty slots, we couldn't do the below. (In fact, if we wanted, we could have given AbstractBase nonempty slots a and b, and left them out of the below declaration - leaving them in would be wrong):
class Concretion(AbstractBase, AbstractBaseC):
    __slots__ = 'a b _c'.split()
And now we have functionality from both via multiple inheritance, and can still deny __dict__ and __weakref__ instantiation:
>>> c = Concretion('a', 'b')
>>> c.c = c
setting c!
>>> c.c
getting c!
Concretion('a', 'b')
>>> c.d = 'd'
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'Concretion' object has no attribute 'd'
Other cases to avoid slots:
Avoid them when you want to perform __class__ assignment with another class that doesn't have them (and you can't add them) unless the slot layouts are identical. (I am very interested in learning who is doing this and why.)
Avoid them if you want to subclass variable length builtins like long, tuple, or str, and you want to add attributes to them.
Avoid them if you insist on providing default values via class attributes for instance variables; a slot name that collides with a class attribute of the same name is an error at class creation, as the sketch below shows.
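To demonstrate that last point, a minimal sketch (the class name is mine); the conflict fails when the class statement executes:
>>> class Pixel:
...     __slots__ = ('x',)
...     x = 0    # attempted default value for the 'x' slot
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: 'x' in __slots__ conflicts with class variable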
You may be able to tease out further caveats from the rest of the __slots__ documentation (the 3.7 dev docs are the most current), which I have made significant recent contributions to.
Critiques of other answers
The current top answers cite outdated information and are quite hand-wavy and miss the mark in some important ways.
Do not "only use __slots__ when instantiating lots of objects"
I quote:
"You would want to use __slots__ if you are going to instantiate a lot (hundreds, thousands) of objects of the same class."
Abstract Base Classes, for example, from the collections module, are not instantiated, yet __slots__ are declared for them.
Why?
If a user wishes to deny __dict__ or __weakref__ creation, those things must not be available in the parent classes.
__slots__ contributes to reusability when creating interfaces or mixins.
It is true that many Python users aren't writing for reusability, but when you are, having the option to deny unnecessary space usage is valuable.
__slots__ doesn't break pickling
When pickling a slotted object, you may find it complains with a misleading TypeError:
>>> pickle.loads(pickle.dumps(f))
TypeError: a class that defines __slots__ without defining __getstate__ cannot be pickled
This is actually incorrect. This message comes from the oldest protocol, which is the default. You can select the latest protocol with the -1 argument. In Python 2.7 this would be 2 (which was introduced in 2.3), and in 3.6 it is 4.
>>> pickle.loads(pickle.dumps(f, -1))
<__main__.Foo object at 0x1129C770>
in Python 2.7:
>>> pickle.loads(pickle.dumps(f, 2))
<__main__.Foo object at 0x1129C770>
in Python 3.6
>>> pickle.loads(pickle.dumps(f, 4))
<__main__.Foo object at 0x1129C770>
So I would keep this in mind, as it is a solved problem.
Critique of the (until Oct 2, 2016) accepted answer
The first paragraph is half short explanation, half predictive. Here's the only part that actually answers the question
The proper use of __slots__ is to save space in objects. Instead of having a dynamic dict that allows adding attributes to objects at anytime, there is a static structure which does not allow additions after creation. This saves the overhead of one dict for every object that uses slots
The second half is wishful thinking, and off the mark:
While this is sometimes a useful optimization, it would be completely unnecessary if the Python interpreter was dynamic enough so that it would only require the dict when there actually were additions to the object.
Python actually does something similar to this, only creating the __dict__ when it is accessed, but creating lots of objects with no data is fairly ridiculous.
The second paragraph oversimplifies and misses actual reasons to avoid __slots__. The below is not a real reason to avoid slots (for actual reasons, see the rest of my answer above.):
They change the behavior of the objects that have slots in a way that can be abused by control freaks and static typing weenies.
It then goes on to discuss other ways of accomplishing that perverse goal with Python, not discussing anything to do with __slots__.
The third paragraph is more wishful thinking. Together it is mostly off-the-mark content that the answerer didn't even author and contributes to ammunition for critics of the site.
Memory usage evidence
Create some normal objects and slotted objects:
>>> class Foo(object): pass
>>> class Bar(object): __slots__ = ()
Instantiate a million of them:
>>> foos = [Foo() for f in xrange(1000000)]
>>> bars = [Bar() for b in xrange(1000000)]
Inspect with guppy.hpy().heap():
>>> guppy.hpy().heap()
Partition of a set of 2028259 objects. Total size = 99763360 bytes.
Index Count % Size % Cumulative % Kind (class / dict of class)
0 1000000 49 64000000 64 64000000 64 __main__.Foo
1 169 0 16281480 16 80281480 80 list
2 1000000 49 16000000 16 96281480 97 __main__.Bar
3 12284 1 987472 1 97268952 97 str
...
Access the regular objects and their __dict__ and inspect again:
>>> for f in foos:
... f.__dict__
>>> guppy.hpy().heap()
Partition of a set of 3028258 objects. Total size = 379763480 bytes.
Index Count % Size % Cumulative % Kind (class / dict of class)
0 1000000 33 280000000 74 280000000 74 dict of __main__.Foo
1 1000000 33 64000000 17 344000000 91 __main__.Foo
2 169 0 16281480 4 360281480 95 list
3 1000000 33 16000000 4 376281480 99 __main__.Bar
4 12284 0 987472 0 377268952 99 str
...
This is consistent with the history of Python, from Unifying types and classes in Python 2.2
If you subclass a built-in type, extra space is automatically added to the instances to accommodate __dict__ and __weakrefs__. (The __dict__ is not initialized until you use it though, so you shouldn't worry about the space occupied by an empty dictionary for each instance you create.) If you don't need this extra space, you can add the phrase "__slots__ = []" to your class.
Quoting Jacob Hallen:
The proper use of __slots__ is to save space in objects. Instead of having
a dynamic dict that allows adding attributes to objects at anytime,
there is a static structure which does not allow additions after creation.
[This use of __slots__ eliminates the overhead of one dict for every object.] While this is sometimes a useful optimization, it would be completely
unnecessary if the Python interpreter was dynamic enough so that it would
only require the dict when there actually were additions to the object.
Unfortunately there is a side effect to slots. They change the behavior of
the objects that have slots in a way that can be abused by control freaks
and static typing weenies. This is bad, because the control freaks should
be abusing the metaclasses and the static typing weenies should be abusing
decorators, since in Python, there should be only one obvious way of doing something.
Making CPython smart enough to handle saving space without __slots__ is a major
undertaking, which is probably why it is not on the list of changes for P3k (yet).
You would want to use __slots__ if you are going to instantiate a lot (hundreds, thousands) of objects of the same class. __slots__ only exists as a memory optimization tool.
It's highly discouraged to use __slots__ for constraining attribute creation.
Pickling objects with __slots__ won't work with the default (oldest) pickle protocol; it's necessary to specify a later version.
Some other introspection features of Python may also be adversely affected.
Each Python object has a __dict__ attribute, which is a dictionary containing all its other attributes. E.g., when you type self.attr, Python is actually doing self.__dict__['attr']. As you can imagine, using a dictionary to store attributes takes some extra space and time to access.
However, when you use __slots__, any object created for that class won't have a __dict__ attribute. Instead, all attribute access is done directly via pointers.
So if you want a C-style structure rather than a full-fledged class, you can use __slots__ to compact the size of the objects and reduce attribute access time. A good example is a Point class containing attributes x and y. If you are going to have a lot of points, you can try using __slots__ in order to conserve some memory (a sketch follows).
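A minimal sketch of such a Point class (my own, assuming two numeric attributes):
class Point(object):
    __slots__ = ('x', 'y')  # no per-instance __dict__ is created

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(3, 4)
p.x, p.y   # -> (3, 4)
# p.z = 5 would raise AttributeError: 'Point' object has no attribute 'z'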
In addition to the other answers, here is an example of using __slots__:
>>> class Test(object): #Must be new-style class!
...     __slots__ = ['x', 'y']
...
>>> pt = Test()
>>> dir(pt)
['__class__', '__delattr__', '__doc__', '__getattribute__', '__hash__',
'__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__',
'__repr__', '__setattr__', '__slots__', '__str__', 'x', 'y']
>>> pt.x
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: x
>>> pt.x = 1
>>> pt.x
1
>>> pt.z = 2
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'Test' object has no attribute 'z'
>>> pt.__dict__
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'Test' object has no attribute '__dict__'
>>> pt.__slots__
['x', 'y']
So, to implement __slots__, it only takes an extra line (and making your class a new-style class if it isn't already). This way you can reduce the memory footprint of those classes 5-fold, at the expense of having to write custom pickle code, if and when that becomes necessary (a sketch follows).
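If you ever do need the oldest pickle protocol with a slotted class, one sketch of that custom pickle code (assuming all state lives in the listed slots) is to supply __getstate__ and __setstate__ yourself:
import pickle

class Point(object):
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __getstate__(self):
        # there is no __dict__, so collect the slot values explicitly
        return {name: getattr(self, name) for name in self.__slots__}

    def __setstate__(self, state):
        for name, value in state.items():
            setattr(self, name, value)

p = pickle.loads(pickle.dumps(Point(1, 2), 0))  # protocol 0 now works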
Slots are very useful for library calls, eliminating the "named method dispatch" overhead when making function calls. This is mentioned in the SWIG documentation. For high-performance libraries that want to reduce function overhead for commonly called functions, using slots is much faster.
Now this may not be directly related to the OP's question. It is related more to building extensions than to using the __slots__ syntax on an object. But it does help complete the picture of how slots are used and some of the reasoning behind them.
A very simple example of the __slots__ attribute.
Problem: Without __slots__
If I don't have a __slots__ attribute in my class, I can add new attributes to my objects:
class Test:
    pass

obj1 = Test()
obj2 = Test()

print(obj1.__dict__)  # --> {}
obj1.x = 12
print(obj1.__dict__)  # --> {'x': 12}
obj1.y = 20
print(obj1.__dict__)  # --> {'x': 12, 'y': 20}

obj2.x = 99
print(obj2.__dict__)  # --> {'x': 99}
If you look at the example above, you can see that obj1 and obj2 have their own x and y attributes, and Python has also created a __dict__ attribute for each object (obj1 and obj2).
Suppose my class Test has thousands of such objects? Creating an additional __dict__ attribute for each object would cause a lot of overhead (memory, computing power, etc.) in my code.
Solution: With __slots__
Now in the following example my class Test contains a __slots__ attribute. I can't add new attributes to my objects (except the attribute x), and Python doesn't create a __dict__ attribute anymore. This eliminates overhead for each object, which can become significant if you have many objects.
class Test:
    __slots__ = ("x",)

obj1 = Test()
obj2 = Test()

obj1.x = 12
print(obj1.x)  # --> 12
obj2.x = 99
print(obj2.x)  # --> 99

obj1.y = 28    # --> AttributeError: 'Test' object has no attribute 'y'
An attribute of a class instance has three properties: the instance, the name of the attribute, and the value of the attribute.
In regular attribute access, the instance acts as the dictionary and the name of the attribute acts as the key in that dictionary, looking up the value:
instance(attribute) --> value
In __slots__ access, the name of the attribute acts as the dictionary and the instance acts as the key, looking up the value:
attribute(instance) --> value
In the flyweight pattern, the name of the attribute acts as the dictionary and the value acts as the key, looking up the instance:
attribute(value) --> instance
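To see the "attribute acts as the dictionary" framing concretely: a slot is a member descriptor living on the class, which you can call with the instance to look up the value (a minimal sketch, names are mine):
class Pt:
    __slots__ = ('x',)

p = Pt()
p.x = 1
# The slot descriptor on the class maps instances to values,
# i.e. attribute(instance) --> value:
Pt.x                 # <member 'x' of 'Pt' objects>
Pt.x.__get__(p, Pt)  # 1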
In addition to the other answers, __slots__ also adds a little protection against typos by limiting attributes to a predefined list. This has long been a problem with JavaScript, which also allows you to add new attributes to an existing object, whether you meant to or not.
Here is a normal unslotted object which does nothing, but allows you to add attributes:
class Unslotted:
    pass
test = Unslotted()
test.name = 'Fred'
test.Name = 'Wilma'
Since Python is case sensitive, the two attributes, spelled the same but with different case, are different. If you suspect that one of those is a typing error, then bad luck.
Using slots, you can limit this:
class Slotted:
    __slots__ = ('name',)
test = Slotted()
test.name = 'Fred' # OK
test.Name = 'Wilma' # Error
This time, the second attribute (Name) is disallowed because it’s not in the __slots__ collection.
I would suggest that it’s probably better to use __slots__ where possible to keep more control over the object.
Beginning in Python 3.9, a dict may be used to add descriptions to attributes via __slots__. None may be used for attributes without descriptions, and private variables will not appear even if a description is given.
class Person:
    __slots__ = {
        "birthday":
            "A datetime.date object representing the person's birthday.",
        "name":
            "The first and last name.",
        "public_variable":
            None,
        "_private_variable":
            "Description",
    }
help(Person)
"""
Help on class Person in module __main__:
class Person(builtins.object)
 |  Data descriptors defined here:
 |
 |  birthday
 |      A datetime.date object representing the person's birthday.
 |
 |  name
 |      The first and last name.
 |
 |  public_variable
"""
Another somewhat obscure use of __slots__ is to add attributes to an object proxy from the ProxyTypes package, formerly part of the PEAK project. Its ObjectWrapper allows you to proxy another object, but intercept all interactions with the proxied object. It is not very commonly used (and no Python 3 support), but we have used it to implement a thread-safe blocking wrapper around an async implementation based on tornado that bounces all access to the proxied object through the ioloop, using thread-safe concurrent.Future objects to synchronise and return results.
By default any attribute access to the proxy object will give you the result from the proxied object. If you need to add an attribute on the proxy object, __slots__ can be used.
from peak.util.proxies import ObjectWrapper
class Original(object):
    def __init__(self):
        self.name = 'The Original'

class ProxyOriginal(ObjectWrapper):

    __slots__ = ['proxy_name']

    def __init__(self, subject, proxy_name):
        # proxy_info attribute added directly to the
        # Original instance, not the ProxyOriginal instance
        self.proxy_info = 'You are proxied by {}'.format(proxy_name)

        # proxy_name added to the ProxyOriginal instance, since it is
        # defined in __slots__
        self.proxy_name = proxy_name

        super(ProxyOriginal, self).__init__(subject)

if __name__ == "__main__":
    original = Original()
    proxy = ProxyOriginal(original, 'Proxy Overlord')

    # Both statements print "The Original"
    print "original.name: ", original.name
    print "proxy.name: ", proxy.name

    # Both statements below print
    # "You are proxied by Proxy Overlord", since the ProxyOriginal
    # __init__ sets it on the original object
    print "original.proxy_info: ", original.proxy_info
    print "proxy.proxy_info: ", proxy.proxy_info

    # prints "Proxy Overlord"
    print "proxy.proxy_name: ", proxy.proxy_name
    # Raises AttributeError since proxy_name is only set on
    # the proxy object
    print "original.proxy_name: ", original.proxy_name
The original question was about general use cases not only about memory.
So it should be mentioned here that you also get better performance when instantiating large amounts of objects - interesting e.g. when parsing large documents into objects or from a database.
Here is a comparison of creating object trees with a million entries, using slots and without slots. As a reference, the performance when using plain dicts for the trees is also shown (Py2.7.10 on OSX):
********** RUN 1 **********
1.96036410332 <class 'css_tree_select.element.Element'>
3.02922606468 <class 'css_tree_select.element.ElementNoSlots'>
2.90828204155 dict
********** RUN 2 **********
1.77050495148 <class 'css_tree_select.element.Element'>
3.10655999184 <class 'css_tree_select.element.ElementNoSlots'>
2.84120798111 dict
********** RUN 3 **********
1.84069895744 <class 'css_tree_select.element.Element'>
3.21540498734 <class 'css_tree_select.element.ElementNoSlots'>
2.59615707397 dict
********** RUN 4 **********
1.75041103363 <class 'css_tree_select.element.Element'>
3.17366290092 <class 'css_tree_select.element.ElementNoSlots'>
2.70941114426 dict
Test classes (identical, apart from slots):
class Element(object):
    __slots__ = ['_typ', 'id', 'parent', 'childs']
    def __init__(self, typ, id, parent=None):
        self._typ = typ
        self.id = id
        self.childs = []
        if parent:
            self.parent = parent
            parent.childs.append(self)

class ElementNoSlots(object):  # (same, w/o slots)
Test code, verbose mode:
import time

na, nb, nc = 100, 100, 100
for i in (1, 2, 3, 4):
    print '*' * 10, 'RUN', i, '*' * 10
    # tree with slots and without slots:
    for cls in Element, ElementNoSlots:
        t1 = time.time()
        root = cls('root', 'root')
        for i in xrange(na):
            ela = cls(typ='a', id=i, parent=root)
            for j in xrange(nb):
                elb = cls(typ='b', id=(i, j), parent=ela)
                for k in xrange(nc):
                    elc = cls(typ='c', id=(i, j, k), parent=elb)
        to = time.time() - t1
        print to, cls
        del root
    # ref: tree with dicts only:
    t1 = time.time()
    droot = {'childs': []}
    for i in xrange(na):
        ela = {'typ': 'a', 'id': i, 'childs': []}
        droot['childs'].append(ela)
        for j in xrange(nb):
            elb = {'typ': 'b', 'id': (i, j), 'childs': []}
            ela['childs'].append(elb)
            for k in xrange(nc):
                elc = {'typ': 'c', 'id': (i, j, k), 'childs': []}
                elb['childs'].append(elc)
    td = time.time() - t1
    print td, 'dict'
    del droot
You have — essentially — no use for __slots__.
For the time when you think you might need __slots__, you actually want to use Lightweight or Flyweight design patterns. These are cases when you no longer want to use purely Python objects. Instead, you want a Python object-like wrapper around an array, struct, or numpy array.
class Flyweight(object):

    def get(self, theData, index):
        return theData[index]

    def set(self, theData, index, value):
        theData[index] = value
The class-like wrapper has no attributes — it just provides methods that act on the underlying data. The methods can be reduced to class methods. Indeed, it could be reduced to just functions operating on the underlying array of data.
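A usage sketch of the Flyweight wrapper above (my own lines, with array.array as a hypothetical compact backing store):
import array

data = array.array('i', [0] * 10)  # compact underlying storage, not objects
fw = Flyweight()
fw.set(data, 3, 42)
print(fw.get(data, 3))  # 42 -- all state lives in the array, none on fw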
In addition to the myriad advantages described in other answers herein – compact instances for the memory-conscious, less error-prone than the more mutable __dict__-bearing instances, et cetera – I find that using __slots__ offers more legible class declarations, as the instance variables of the class are explicitly out in the open.
To contend with inheritance issues with __slots__ declarations I use this metaclass:
import abc

class Slotted(abc.ABCMeta):
    """ A metaclass that ensures its classes, and all subclasses,
        will be slotted types.
    """
    def __new__(metacls, name, bases, attributes, **kwargs):
        """ Override for `abc.ABCMeta.__new__(…)` setting up a
            derived slotted class.
        """
        if '__slots__' not in attributes:
            attributes['__slots__'] = tuple()
        return super(Slotted, metacls).__new__(metacls, name,  # type: ignore
                                                        bases,
                                                        attributes,
                                                        **kwargs)
… which, if declared as the metaclass of the base class in an inheritance tower, ensures that everything that derives from that base class will properly inherit __slots__ attributes, even if an intermediate class fails to declare any. Like so:
# note no __slots__ declaration necessary with the metaclass:
class Base(metaclass=Slotted):
    pass

# class is properly slotted, no __dict__:
class Derived(Base):
    __slots__ = 'slot', 'another_slot'

# class is also properly slotted:
class FurtherDerived(Derived):
    pass

Why is the id of a method not the same between the instance object and the class object?

I am wondering why the method exists in two copies, one for the instance object and the other for the class object. Why is it designed like this?
class Bar():
    def method(self):
        pass

    @classmethod
    def clsmethod(cls):
        pass

b1 = Bar()
b2 = Bar()

print(Bar.method, id(Bar.method))
print(b1.method, id(b1.method))
print(b2.method, id(b2.method))

print(Bar.clsmethod, id(Bar.clsmethod))
print(b1.clsmethod, id(b1.clsmethod))
print(b2.clsmethod, id(b2.clsmethod))
This design is based on descriptors, specifically non-data descriptors. Every function happens to be a non-data descriptor by defining a __get__ method:
>>> def foo():
...     pass
...
>>> foo.__get__
<method-wrapper '__get__' of function object at 0x7fa75be5be50>
When you have an expression x.y in your code, this means the attribute y is being looked up on the object x. The specific rules are explained here, and one of them is concerned with y being a (non-)data descriptor stored on the class of x (or any subclass). The following is an example:
>>> class Foo:
...     def test(self):
...         pass
...
Here Foo.test looks up the name test on the class Foo. The result is the function as you would define in the global namespace:
>>> Foo.test
<function Foo.test at 0x7fa75be5bf70>
However, as we have seen above, every function is also a descriptor, so if you look up test on an instance of Foo, it will call the descriptor's __get__ method to compute the result:
>>> f = Foo()
>>> f.test
<bound method Foo.test of <__main__.Foo object at 0x7fa75bf56b20>>
We can obtain a similar result by manually invoking Foo.test.__get__:
>>> Foo.test.__get__(f, type(f))
<bound method Foo.test of <__main__.Foo object at 0x7fa75bf56b20>>
This mechanism is what ensures that the instance (typically denoted via self) is passed as the first argument to instance methods. The descriptor returns a bound method (bound to the instance on which the lookup was performed) rather than the original function. This bound method inserts the instance as the very first parameter when being called. Every time you do Foo.test a new bound-method object is returned and hence their ids differ.
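You can observe this directly (a short interpreter sketch using the Bar class from the question):
>>> b1 = Bar()
>>> b1.method is b1.method        # each lookup builds a fresh method object
False
>>> b1.method == b1.method        # but they compare equal...
True
>>> b1.method.__func__ is Bar.method   # ...sharing one function object
True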
The situation with classmethods is similar where Foo.test.__get__(None, Foo) is called. The only difference is that for instances object.__getattribute__ is called while for classes type.__getattribute__ takes precedence.
>>> class Bar:
...     @classmethod
...     def test(cls):
...         pass
...
>>> Bar.test
<bound method Bar.test of <class '__main__.Bar'>>
>>> Bar.__dict__['test'].__get__(None, Bar)
<bound method Bar.test of <class '__main__.Bar'>>

How to override a function when Parent already explicitly `setattr` the same function?

A 'minimal' example I created:
class C:
    def wave(self):
        print("C waves")

class A:
    def __init__(self):
        c = C()
        setattr(self, 'wave', getattr(c, 'wave'))

class B(A):
    def wave(self):
        print("B waves")
>>> a = A()
>>> a.wave()
C waves # as expected
>>> b = B()
>>> b.wave()
C waves # why not 'B waves'?
>>>
In the example, class A explicitly defines its wave method to be class C's wave method, not through the more common function definition but using setattr instead. Then we have class B, which inherits from A and tries to override the wave method with its own. However, that's not possible. What is going on, and how can I work around it?
I want to keep class A's setattr-style definition if at all possible, please advise.
I've never systematically learned Python, so I guess I am missing some understanding of how Python's inheritance and setattr work.
Class A sets the wave() method as its instance attribute in __init__(). This can be seen by inspecting the instance's dict:
>>> b.__dict__
{'wave': <bound method C.wave of <__main__.C object at 0x7ff0b32c63c8>>}
You can get around this by deleting the instance attribute from b:
>>> del b.__dict__['wave']
>>> b.wave()
B waves
With the instance attribute removed, the wave() function is then taken from the class dict:
>>> B.__dict__
mappingproxy({'__module__': '__main__',
'wave': <function __main__.B.wave(self)>,
'__doc__': None})
The thing to note here is that when Python looks up an attribute, instance attributes take precedence over the class attributes (unless a class attribute is a data descriptor, but this is not the case here).
I have also written a blog post back then explaining how the attribute lookup works in even more detail.
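One way to keep A's setattr-style definition and still have B win (a sketch of my own, not from the answer above): make wave a data descriptor on B, since data descriptors on the class take precedence over the instance __dict__. The setter below exists only to swallow the assignment performed in A.__init__:
class B(A):
    @property
    def wave(self):
        # data descriptor: takes precedence over the instance __dict__
        return lambda: print("B waves")

    @wave.setter
    def wave(self, _):
        pass  # silently ignore A.__init__'s setattr

>>> b = B()
>>> b.wave()
B waves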

Keep functions to become methods on instantiation

When a function is assigned to an attribute during class definition, this attribute stays an ordinary function with its original signature:
>>> def f(x):
...     return x**2
...
>>> class A:
...     ff = f
...
>>> A.ff
<function f at 0x037D6ED0>
When the class is instantiated, this attribute becomes a bound method and its signature changes:
>>> a = A()
>>> a.ff
<bound method A.f of <__main__.A object at 0x03A726B0>>
I need to define a class that I can later customize by changing some attributes before instantiating. One of these attributes is a function, and I need it to keep its signature.
Using @staticmethod is obviously not an option, since no function is defined at class definition/customization time, and decorations don't apply to attributes.
Is there any way to keep a function from being transformed into a bound method on instantiation?
Using @staticmethod is obviously not an option, since no function is defined on class definition/customization, and decorations don't apply to attributes.
No, staticmethod is the option, just call it directly to produce an instance:
class A:
    ff = staticmethod(f)
The @decorator syntax is only syntactic sugar that produces the exact same assignment after the function object has been created.
This works fine:
>>> def f(x):
...     return x**2
...
>>> class A:
...     f_unchanged = f
...     f_static = staticmethod(f)
...
>>> A().f_unchanged
<bound method f of <__main__.A object at 0x10cf7b2e8>>
>>> A().f_static
<function f at 0x10cfb6510>
>>> A().f_static(4)
16
It doesn't matter where a function is defined; a def statement produces a function object regardless of where it is used. def name does two things: it creates the function object and assigns that function object to a name. Whether or not this takes place in a class statement doesn't actually matter.
What turns functions into bound methods is accessing them on an instance, as then the descriptor protocol kicks in. For example, accessing A().ff is turned into A.__dict__['ff'].__get__(A()), and it is the __get__ method on a function that produces the bound method. The bound method is only a proxy for the actual function, passing in the instance as a first argument when called.
A staticmethod defines a different __get__, one that just returns the original function, unbound. You can play with those __get__ methods directly:
>>> f.__get__(A()) # bind f to an instance
<bound method f of <__main__.A object at 0x10cf9f630>>
>>> A.__dict__['f_unchanged'] # bypass the protocol
<function f at 0x10cfb6510>
>>> A.__dict__['f_static'] # bypass the protocol
<staticmethod object at 0x10cf60f28>
>>> A.__dict__['f_static'].__get__(A()) # activate the protocol
<function f at 0x10cfb6510>

Calling object functions - python [duplicate]

I'm asking this question because of a discussion on the comment thread of this answer. I'm 90% of the way to getting my head round it.
In [1]: class A(object): # class named 'A'
...:     def f1(self): pass
...:
In [2]: a = A() # an instance
f1 exists in three different forms:
In [3]: a.f1 # a bound method
Out[3]: <bound method a.f1 of <__main__.A object at 0x039BE870>>
In [4]: A.f1 # an unbound method
Out[4]: <unbound method A.f1>
In [5]: a.__dict__['f1'] # doesn't exist
KeyError: 'f1'
In [6]: A.__dict__['f1'] # a function
Out[6]: <function __main__.f1>
What is the difference between the bound method, unbound method and function objects, all of which are described by f1? How does one call these three objects? How can they be transformed into each other? The documentation on this stuff is quite hard to understand.
A function is created by the def statement, or by lambda. Under Python 2, when a function appears within the body of a class statement (or is passed to a type class construction call), it is transformed into an unbound method. (Python 3 doesn't have unbound methods; see below.) When a function is accessed on a class instance, it is transformed into a bound method, that automatically supplies the instance to the method as the first self parameter.
def f1(self):
    pass
Here f1 is a function.
class C(object):
    f1 = f1
Now C.f1 is an unbound method.
>>> C.f1
<unbound method C.f1>
>>> C.f1.im_func is f1
True
We can also use the type class constructor:
>>> C2 = type('C2', (object,), {'f1': f1})
>>> C2.f1
<unbound method C2.f1>
We can convert f1 to an unbound method manually:
>>> import types
>>> types.MethodType(f1, None, C)
<unbound method C.f1>
Unbound methods are bound by access on a class instance:
>>> C().f1
<bound method C.f1 of <__main__.C object at 0x2abeecf87250>>
Access is translated into calling through the descriptor protocol:
>>> C.f1.__get__(C(), C)
<bound method C.f1 of <__main__.C object at 0x2abeecf871d0>>
Combining these:
>>> types.MethodType(f1, None, C).__get__(C(), C)
<bound method C.f1 of <__main__.C object at 0x2abeecf87310>>
Or directly:
>>> types.MethodType(f1, C(), C)
<bound method C.f1 of <__main__.C object at 0x2abeecf871d0>>
The main difference between a function and an unbound method is that the latter knows which class it is bound to; calling or binding an unbound method requires an instance of its class type:
>>> f1(None)
>>> C.f1(None)
TypeError: unbound method f1() must be called with C instance as first argument (got NoneType instance instead)
>>> class D(object): pass
>>> f1.__get__(D(), D)
<bound method D.f1 of <__main__.D object at 0x7f6c98cfe290>>
>>> C.f1.__get__(D(), D)
<unbound method C.f1>
Since the difference between a function and an unbound method is pretty minimal, Python 3 gets rid of the distinction; under Python 3 accessing a function on a class instance just gives you the function itself:
>>> C.f1
<function f1 at 0x7fdd06c4cd40>
>>> C.f1 is f1
True
In both Python 2 and Python 3, then, these three are equivalent:
f1(C())
C.f1(C())
C().f1()
Binding a function to an instance has the effect of fixing its first parameter (conventionally called self) to the instance. Thus the bound method C().f1 is equivalent to either of:
(lambda *args, **kwargs: f1(C(), *args, **kwargs))
functools.partial(f1, C())
is quite hard to understand
Well, it is quite a hard topic, and it has to do with descriptors.
Let's start with the function. Everything is clear here: you just call it, and all supplied arguments are passed while executing it:
>>> f = A.__dict__['f1']
>>> f(1)
1
A regular TypeError is raised in case of any problem with the number of parameters:
>>> f()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: f1() takes exactly 1 argument (0 given)
Now, methods. Methods are functions with a bit of spice. Descriptors come into play here. As described in the Data Model, A.f1 and A().f1 are translated into A.__dict__['f1'].__get__(None, A) and type(a).__dict__['f1'].__get__(a, type(a)) respectively. And the results of these __get__ calls differ from the raw f1 function. These objects are wrappers around the original f1 and contain some additional logic.
In the case of an unbound method, this logic includes a check that the first argument is an instance of A:
>>> f = A.f1
>>> f()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unbound method f1() must be called with A instance as first argument (got nothing instead)
>>> f(1)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unbound method f1() must be called with A instance as first argument (got int instance instead)
If this check succeeds, it executes original f1 with that instance as first argument:
>>> f(A())
<__main__.A object at 0x800f238d0>
Note that the im_self attribute is None:
>>> f.im_self is None
True
In the case of a bound method, this logic immediately supplies the original f1 with the instance of A it was created from (this instance is actually stored in the im_self attribute):
>>> f = A().f1
>>> f.im_self
<__main__.A object at 0x800f23950>
>>> f()
<__main__.A object at 0x800f23950>
So, bound means that the underlying function is bound to some instance. unbound means that it is still bound, but only to a class.
A function object is a callable object created by a function definition. Both bound and unbound methods are callable objects created by a Descriptor called by the dot binary operator.
Bound and unbound method objects have 3 main properties: im_func is the function object defined in the class, im_class is the class, and im_self is the class instance. For unbound methods, im_self is None.
When a bound method is called, it calls im_func with im_self as the first parameter followed by its calling parameters. unbound methods call the underlying function with just its calling parameters.
Starting with Python 3, there are no unbound methods. Class.method returns a direct reference to the method.
Please refer to the Python 2 and Python 3 documentation for more details.
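In Python 3 the same machinery is visible through the __self__ and __func__ attributes of a bound method (a short sketch of my own):
>>> class C:
...     def f(self):
...         return 42
...
>>> m = C().f                      # a bound method
>>> m.__func__ is C.f              # the plain function from the class
True
>>> m() == m.__func__(m.__self__)  # calling it supplies __self__ for us
True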
My interpretation is the following.
Class Function snippets:
Python 3:
class Function(object):
    . . .
    def __get__(self, obj, objtype=None):
        "Simulate func_descr_get() in Objects/funcobject.c"
        if obj is None:
            return self
        return types.MethodType(self, obj)
Python 2:
class Function(object):
    . . .
    def __get__(self, obj, objtype=None):
        "Simulate func_descr_get() in Objects/funcobject.c"
        return types.MethodType(self, obj, objtype)
If a function is accessed without a class or instance, it is a plain function.
If a function is accessed through a class or an instance, its __get__ is called to retrieve the wrapped function:
a. B.x is the same as B.__dict__['x'].__get__(None, B).
In Python 3, this returns a plain function.
In Python 2, this returns an unbound method.
b. b.x is the same as type(b).__dict__['x'].__get__(b, type(b)). This will return a bound method in both Python 2 and Python 3, which means self will be implicitly passed as the first argument.
What is the difference between a function, an unbound method and a bound method?
From the ground-level "what is a function" perspective, there is no difference.
Python's object-oriented features are built upon a function-based environment.
Being bound comes down to this question:
will the function take the class (cls) or the object instance (self) as its first parameter, or neither?
Here is the example:
class C:
    # instance method
    def m1(self, x):
        print(f"Excellent m1 self {self} {x}")

    @classmethod
    def m2(cls, x):
        print(f"Excellent m2 cls {cls} {x}")

    @staticmethod
    def m3(x):
        print(f"Excellent m3 static {x}")

ci = C()
ci.m1(1)
ci.m2(2)
ci.m3(3)

print(ci.m1)
print(ci.m2)
print(ci.m3)
print(C.m1)
print(C.m2)
print(C.m3)
Outputs:
Excellent m1 self <__main__.C object at 0x000001AF40319160> 1
Excellent m2 cls <class '__main__.C'> 2
Excellent m3 static 3
<bound method C.m1 of <__main__.C object at 0x000001AF40319160>>
<bound method C.m2 of <class '__main__.C'>>
<function C.m3 at 0x000001AF4023CBF8>
<function C.m1 at 0x000001AF402FBB70>
<bound method C.m2 of <class '__main__.C'>>
<function C.m3 at 0x000001AF4023CBF8>
The output shows that the static method m3 is never bound.
C.m2 is bound to the class C because the cls parameter, a pointer to the class, is passed implicitly.
ci.m1 and ci.m2 are both bound: ci.m1 because self, a pointer to the instance, is passed, and ci.m2 because the instance knows which class it is bound to ;).
To conclude, you can bind a method to a class or to a class instance, based on the first parameter the method takes. If a method is not bound, it can be called unbound.
Note that a method may not originally be part of the class. Check this answer from Alex Martelli for more details.
One interesting thing I saw today is that when I assign a function to a class attribute, it becomes an unbound method. Such as:
class Test(object):
    @classmethod
    def initialize_class(cls):
        def print_string(self, str):
            print(str)
        # Here if I do print(print_string), I see a function
        cls.print_proc = print_string
        # Here if I do print(cls.print_proc), I see an unbound method; so if I
        # get a Test object o, I can call o.print_proc("Hello")
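A usage sketch of the above (my own lines): after initialize_class runs, print_proc lives on the class and binds on access like any other method:
Test.initialize_class()
o = Test()
o.print_proc("Hello")  # prints: Hello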
