While looking through the webapp2 documentation online, I found information on the decorator: webapp2.cached_property
In the documentation, it says:
A decorator that converts a function into a lazy property.
My question is:
→ What is a lazy property?
It is a property decorator that gets out of the way after the first call. It allows you to auto-cache a computed value.
The standard library #property decorator is a data descriptor object and is always called, even if there is an attribute on the instance of the same name.
The #cached_property decorator on the other hand, only has a __get__ method, which means that it is not called if there is an attribute with the same name already present. It makes use of this by setting an attribute with the same name on the instance on the first call.
Given a #cached_property-decorated bar method on an instance named foo, this is what happens:
Python resolves foo.bar. No bar attribute is found on the instance.
Python finds the bar descriptor on the class, and calls __get__ on that.
The cached_property __get__ method calls the decorated bar method.
The bar method calculates something, and returns the string 'spam'.
The cached_property __get__ method takes the return value and sets a new attribute bar on the instance; foo.bar = 'spam'.
The cached_property __get__ method returns the 'spam' return value.
If you ask for foo.bar again, Python finds the bar attribute on the instance, and uses that from here on out.
Also see the source code for the original Werkzeug implementation:
# implementation detail: this property is implemented as non-data
# descriptor. non-data descriptors are only invoked if there is
# no entry with the same name in the instance's __dict__.
# this allows us to completely get rid of the access function call
# overhead. If one choses to invoke __get__ by hand the property
# will still work as expected because the lookup logic is replicated
# in __get__ for manual invocation.
Note that as of Python 3.8, the standard library has a similar object, #functools.cached_property(). It's implementation is a little bit more robust, it guards against accidental re-use under a different name, produces a better error message if used on an object without a __dict__ attribute or where that object doesn't support item assignment, and is also thread-safe.
Related
According to Python 2.7.12 documentation, User-defined methods:
User-defined method objects may be created when getting an attribute
of a class (perhaps via an instance of that class), if that attribute
is a user-defined function object, an unbound user-defined method
object, or a class method object. When the attribute is a
user-defined method object, a new method object is only created if the
class from which it is being retrieved is the same as, or a derived
class of, the class stored in the original method object; otherwise,
the original method object is used as it is.
I know that everything in Python is an object, so a "user-defined method" must be identical to a "user-defined method object". However, I can't understand why there is a "user-defined function object attribute". Say, in the following code:
class Foo(object):
def meth(self):
pass
meth is a function defined inside a class body, and thus a method. So why can we have a "user-defined function object attribute"? Aren't all attributes defined inside a class body?
Bouns question: Provide some examples illustrating how a user-defined method object is created by getting an attribute of a class. Isn't objects defined in their class definition? (I know methods can be assigned to a class instance, but that's monkey patching.)
I'm asking for help because this part of document is really really confusing to me, a programmer who only knows C, since Python is such a magical language that supports both functional programming and object-oriented programmer, which I haven't mastered yet. I've done a lot of search, but still can't figure that out.
When you do
class Foo(object):
def meth(self):
pass
you are defining a class Foo with a method meth. However, when this class definition is executed, no method object is created to represent the method. The def statement creates an ordinary function object.
If you then do
Foo.meth
or
Foo().meth
the attribute lookup finds the function object, but the function object is not used as the value of the attribute. Instead, using the descriptor protocol, Python calls the __get__ method of the function object to construct a method object, and that method object is used as the value of the attribute for that lookup. For Foo.meth, the method object is an unbound method object, which behaves like the function you defined, but with an extra type checking of self. For Foo().meth, the method object is a bound method object, which already knows what self is.
This is why Foo().meth() doesn't complain about a missing self argument; you pass 0 arguments to the method object, which then prepends self to the (empty) argument list and passes the arguments on to the underlying function object. If Foo().meth evaluated to the meth function directly, you would have to pass it self explicitly.
In Python 3, Foo.meth doesn't create an unbound method object; the function's __get__ still gets called, but it returns the function directly, since Guido decided unbound method objects weren't useful. Foo().meth still creates a bound method object, though.
Consider the following:
class A(object):
def __init__(self):
print 'Hello!'
def foo(self):
print 'Foo!'
def __getattribute__(self, att):
raise AttributeError()
a = A() # Works, prints "Hello!"
a.foo() # throws AttributeError as expected
The implementation of __getattribute__ obviously fails all lookups. My questions:
Why is it still possible to instantiate an object? I would have expected the lookup of the __init__ method itself to fail as well.
What's the list of attributes that are not subject to __getattribute__?
The implementation of __getattribute__ obviously fails all lookups
Let's say it fails for all vanilla lookups.
So how did __getattribute__ itself get called in the first place since it is also an attribute of the class?
An attribute would refer to any name following a dot. So to get an attribute of a class instance, __getattribute__ is summoned unconditionally when you try to access that attribute (through dot reference).
However magic methods like __init__ are part of the language construct and so are not directly invoked (via dot reference) since they are implemented as part of the language.
Why is it still possible to instantiate an object?
When you do:
a = A()
The __init__ method gets called behind the scenes, but not via a vanilla lookup. The language handles this. Same applies to other methods like __setattr__, __delattr__, __getattribute__ also and others.
But if you directly called __init__:
a.__init__()
It would raise an error. Eh, this does not make any sense since the class is already initialized.
More subtly, if you tried to access __getattribute__ from your class instance via a dot reference:
a.__getattribute__
it would also raise an AttributeError; the language invocation of the same method attempted to lookup on the attribute __getattribute__, but failed with error.
What's the list of attributes that are not subject to
__getattribute__?
Summarily, __getattribute__ comes play when you try to access any attribute via dot reference. As long as you don't try to explicitly call a magic method, __getattribute__ will not be called.
I'm reading through this blog post on OOP in python 3. In it there is:
As you can see, a __get__ method is listed among the members of the function, and Python recognizes it as a method-wrapper. This method shall connect the open function to the door1 instance, so we can call it passing the instance alone.
I'm trying to understand this more intuitively. In this context what is 'wrapping' what?
The method-wrapper object is wrapping a C function. It binds together an instance (here a function instance, defined in a C struct) and a C function, so that when you call it, the right instance information is passed to the C function.
It is the C API equivalent of method objects and functions on a custom class. Given a class Foo with a function bar, Foo().bar produces a bound method, which combines together the Foo() instance and the Foo.bar function, so that when called, you can pass in that instance as the first argument (usually called self).
Also see the descriptor protocol; this is what defines the __get__ method and how it is invoked.
Best Guess:
method - def(self, maybeSomeVariables); lines of code which achieve some purpose
Function - same as method but returns something
Class - group of methods/functions
Module - a script, OR one or more classes. Basically a .py file.
Package - a folder which has modules in, and also a __init__.py file in there.
Suite - Just a word that gets thrown around a lot, by convention
TestCase - unittest's equivalent of a function
TestSuite - unittest's equivalent of a Class (or Module?)
My question is: Is this completely correct, and did I miss any hierarchical building blocks from that list?
I feel that you're putting in differences that don't actually exist. There isn't really a hierarchy as such. In python everything is an object. This isn't some abstract notion, but quite fundamental to how you should think about constructs you create when using python. An object is just a bunch of other objects. There is a slight subtlety in whether you're using new-style classes or not, but in the absence of a good reason otherwise, just use and assume new-style classes. Everything below is assuming new-style classes.
If an object is callable, you can call it using the calling syntax of a pair of braces, with the arguments inside them: my_callable(arg1, arg2). To be callable, an object needs to implement the __call__ method (or else have the correct field set in its C level type definition).
In python an object has a type associated with it. The type describes how the object was constructed. So, for example, a list object is of type list and a function object is of type function. The types themselves are of type type. You can find the type by using the built-in function type(). A list of all the built-in types can be found in the python documentation. Types are actually callable objects, and are used to create instances of a given type.
Right, now that's established, the nature of a given object is defined by it's type. This describes the objects of which it comprises. Coming back to your questions then:
Firstly, the bunch of objects that make up some object are called the attributes of that object. These attributes can be anything, but they typically consist of methods and some way of storing state (which might be types such as int or list).
A function is an object of type function. Crucially, that means it has the __call__ method as an attribute which makes it a callable (the __call__ method is also an object that itself has the __call__ method. It's __call__ all the way down ;)
A class, in the python world, can be considered as a type, but typically is used to refer to types that are not built-in. These objects are used to create other objects. You can define your own classes with the class keyword, and to create a class which is new-style you must inherit from object (or some other new-style class). When you inherit, you create a type that acquires all the characteristics of the parent type, and then you can overwrite the bits you want to (and you can overwrite any bits you want!). When you instantiate a class (or more generally, a type) by calling it, another object is returned which is created by that class (how the returned object is created can be changed in weird and crazy ways by modifying the class object).
A method is a special type of function that is called using the attribute notation. That is, when it is created, 2 extra attributes are added to the method (remember it's an object!) called im_self and im_func. im_self I will describe in a few sentences. im_func is a function that implements the method. When the method is called, like, for example, foo.my_method(10), this is equivalent to calling foo.my_method.im_func(im_self, 10). This is why, when you define a method, you define it with the extra first argument which you apparently don't seem to use (as self).
When you write a bunch of methods when defining a class, these become unbound methods. When you create an instance of that class, those methods become bound. When you call an bound method, the im_self argument is added for you as the object in which the bound method resides. You can still call the unbound method of the class, but you need to explicitly add the class instance as the first argument:
class Foo(object):
def bar(self):
print self
print self.bar
print self.bar.im_self # prints the same as self
We can show what happens when we call the various manifestations of the bar method:
>>> a = Foo()
>>> a.bar()
<__main__.Foo object at 0x179b610>
<bound method Foo.bar of <__main__.Foo object at 0x179b610>>
<__main__.Foo object at 0x179b610>
>>> Foo.bar()
TypeError: unbound method bar() must be called with Foo instance as first argument (got nothing instead)
>>> Foo.bar(a)
<__main__.Foo object at 0x179b610>
<bound method Foo.bar of <__main__.Foo object at 0x179b610>>
<__main__.Foo object at 0x179b610>
Bringing all the above together, we can define a class as follows:
class MyFoo(object):
a = 10
def bar(self):
print self.a
This generates a class with 2 attributes: a (which is an integer of value 10) and bar, which is an unbound method. We can see that MyFoo.a is just 10.
We can create extra attributes at run time, both within the class methods, and outside. Consider the following:
class MyFoo(object):
a = 10
def __init__(self):
self.b = 20
def bar(self):
print self.a
print self.b
def eep(self):
print self.c
__init__ is just the method that is called immediately after an object has been created from a class.
>>> foo = Foo()
>>> foo.bar()
10
20
>>> foo.eep()
AttributeError: 'MyFoo' object has no attribute 'c'
>>> foo.c = 30
>>> foo.eep()
30
This example shows 2 ways of adding an attribute to a class instance at run time (that is, after the object has been created from it's class).
I hope you can see then, that TestCase and TestSuite are just classes that are used to create test objects. There's nothing special about them except that they happen to have some useful features for writing tests. You can subclass and overwrite them to your heart's content!
Regarding your specific point, both methods and functions can return anything they want.
Your description of module, package and suite seems pretty sound. Note that modules are also objects!
Consider this example of a strategy pattern in Python (adapted from the example here). In this case the alternate strategy is a function.
class StrategyExample(object):
def __init__(self, strategy=None) :
if strategy:
self.execute = strategy
def execute(*args):
# I know that the first argument for a method
# must be 'self'. This is just for the sake of
# demonstration
print locals()
#alternate strategy is a function
def alt_strategy(*args):
print locals()
Here are the results for the default strategy.
>>> s0 = StrategyExample()
>>> print s0
<__main__.StrategyExample object at 0x100460d90>
>>> s0.execute()
{'args': (<__main__.StrategyExample object at 0x100460d90>,)}
In the above example s0.execute is a method (not a plain vanilla function) and hence the first argument in args, as expected, is self.
Here are the results for the alternate strategy.
>>> s1 = StrategyExample(alt_strategy)
>>> s1.execute()
{'args': ()}
In this case s1.execute is a plain vanilla function and as expected, does not receive self. Hence args is empty. Wait a minute! How did this happen?
Both the method and the function were called in the same fashion. How does a method automatically get self as the first argument? And when a method is replaced by a plain vanilla function how does it not get the self as the first argument?
The only difference that I was able to find was when I examined the attributes of default strategy and alternate strategy.
>>> print dir(s0.execute)
['__cmp__', '__func__', '__self__', ...]
>>> print dir(s1.execute)
# does not have __self__ attribute
Does the presence of __self__ attribute on s0.execute (the method), but lack of it on s1.execute (the function) somehow account for this difference in behavior? How does this all work internally?
You can read the full explanation here in the python reference, under "User defined methods". A shorter and easier explanation can be found in the python tutorial's description of method objects:
If you still don’t understand how methods work, a look at the implementation can perhaps clarify matters. When an instance attribute is referenced that isn’t a data attribute, its class is searched. If the name denotes a valid class attribute that is a function object, a method object is created by packing (pointers to) the instance object and the function object just found together in an abstract object: this is the method object. When the method object is called with an argument list, a new argument list is constructed from the instance object and the argument list, and the function object is called with this new argument list.
Basically, what happens in your example is this:
a function assigned to a class (as happens when you declare a method inside the class body) is... a method.
When you access that method through the class, eg. StrategyExample.execute you get an "unbound method": it doesn't "know" to which instance it "belongs", so if you want to use that on an instance, you would need to provide the instance as the first argument yourself, eg. StrategyExample.execute(s0)
When you access the method through the instance, eg. self.execute or s0.execute, you get a "bound method": it "knows" which object it "belongs" to, and will get called with the instance as the first argument.
a function that you assign to an instance attribute directly however, as in self.execute = strategy or even s0.execute = strategy is... just a plain function (contrary to a method, it doesn't pass via the class)
To get your example to work the same in both cases:
either you turn the function into a "real" method: you can do this with types.MethodType:
self.execute = types.MethodType(strategy, self, StrategyExample)
(you more or less tell the class that when execute is asked for this particular instance, it should turn strategy into a bound method)
or - if your strategy doesn't really need access to the instance - you go the other way around and turn the original execute method into a static method (making it a normal function again: it won't get called with the instance as the first argument, so s0.execute() will do exactly the same as StrategyExample.execute()):
#staticmethod
def execute(*args):
print locals()
You need to assign an unbound method (i.e. with a self parameter) to the class or a bound method to the object.
Via the descriptor mechanism, you can make your own bound methods, it's also why it works when you assign the (unbound) function to a class:
my_instance = MyClass()
MyClass.my_method = my_method
When calling my_instance.my_method(), the lookup will not find an entry on my_instance, which is why it will at a later point end up doing this: MyClass.my_method.__get__(my_instance, MyClass) - this is the descriptor protocol. This will return a new method that is bound to my_instance, which you then execute using the () operator after the property.
This will share method among all instances of MyClass, no matter when they were created. However, they could have "hidden" the method before you assigned that property.
If you only want specific objects to have that method, just create a bound method manually:
my_instance.my_method = my_method.__get__(my_instance, MyClass)
For more detail about descriptors (a guide), see here.
The method is a wrapper for the function, and calls the function with the instance as the first argument. Yes, it contains a __self__ attribute (also im_self in Python prior to 3.x) that keeps track of which instance it is attached to. However, adding that attribute to a plain function won't make it a method; you need to add the wrapper. Here is how (although you may want to use MethodType from the types module to get the constructor, rather than using type(some_obj.some_method).
The function wrapped, by the way, is accessible through the __func__ (or im_func) attribute of the method.
When you do self.execute = strategy you set the attribute to a plain method:
>>> s = StrategyExample()
>>> s.execute
<bound method StrategyExample.execute of <__main__.StrategyExample object at 0x1dbbb50>>
>>> s2 = StrategyExample(alt_strategy)
>>> s2.execute
<function alt_strategy at 0x1dc1848>
A bound method is a callable object that calls a function passing an instance as the first argument in addition to passing through all arguments it was called with.
See: Python: Bind an Unbound Method?