Why does functools.update_wrapper update the __dict__ in the wrapper object? - python

I came across a peculiar behaviour of functools.update_wrapper: it overwrites the __dict__ of the wrapper object by that of the wrapped object - which may hinder its use when nesting decorators.
As a simple example, assume that we are writing a decorator class that caches data in memory and another decorator class that caches data to a file. The following example demonstrates this (I made the example brief and omitted all cacheing logic, but I hope that it demonstrates the question):
import functools
class cached:
cache_type = 'memory'
def __init__(self, fcn):
super().__init__()
self.fcn = fcn
functools.update_wrapper(self, fcn, updated=())
def __call__(self, *args):
print("Retrieving from", type(self).cache_type)
return self.fcn(*args)
class diskcached(cached):
cache_type = 'disk'
#cached
#diskcached
def expensive_function(what):
print("expensive_function working on", what)
expensive_function("Expensive Calculation")
This example works as intended - its output is
Retrieving from memory
Retrieving from disk
expensive_function working on Expensive Calculation
However, it took me long to make this work - at first, I hat not included the 'updated=()' argument in the functools.update_wrapper call. But when this is left out, then nesting the decorators does not work - in this case, the output is
Retrieving from memory
expensive_function working on Expensive Calculation
I.e. the outer decorator directly calls the innermost wrapped function. The reason (which took me a while to understand) for this is that functools.update_wrapper updates the __dict__ attribute of the wrapper to the __dict__ attribute of the wrapped argument - which short-circuits the inner decorator, unless one adds the updated=() argument.
My question: is this behaviour intended - and why? (Python 3.7.1)

Making a wrapper function look like the function it wraps is the point of update_wrapper, and that includes __dict__ entries. It doesn't replace the __dict__; it calls update.
If update_wrapper didn't do this, then if one decorator set attributes on a function and another decorator wrapped the modified function:
#decorator_with_update_wrapper
#decorator_that_sets_attributes
def f(...):
...
the wrapper function wouldn't have the attributes set, rendering it incompatible with the code that looks for those attributes.

Related

Instance method aliases to builtin-functions in Python

Trying to code as efficient as possible object-oriented implementation of priority queue in Python, I faced an interesting behavior. The following code works fine
from heapq import heappush
class PriorityQueue(list):
__slots__ = ()
def push(self, item):
heappush(self, item)
However, I really didn’t want to write a wrapper method for calling heappush, as it incurs additional overhead for calling the function. I reasoned that since the heappush signature uses list as the first argument, while aliasing the push class attribute with the heappush function, the latter becomes a full-fledged class instance method. However, my assumption turned out to be false, and the following code gives an error.
from heapq import heappush
class PriorityQueue(list):
__slots__ = ()
push = heappush
PriorityQueue().push(0)
# TypeError: heappush expected 2 arguments, got 1
But going to cpython heapq source code, just copying heappush implementation into the scope and applying the same logic works fine.
from heapq import _siftdown
def heappush(heap, item):
"""Push item onto heap, maintaining the heap invariant."""
heap.append(item)
_siftdown(heap, 0, len(heap) - 1)
class PriorityQueue(list):
__slots__ = ()
push = heappush
pq = PriorityQueue()
pq.push(0)
pq.push(-1)
pq.push(3)
print(pq)
# [-1, 0, 3]
The first question: Why does it happen? How does Python decide which function is appropriate for binding as an instance method and which is not?
The second question: What is the difference between heappush in the cpython/Lib/heapq.py and the actual heappush from the heapq module? They are actually different since the following code gives an error
from dis import dis
from heapq import heappush
dis(heappush)
# TypeError: don't know how to disassemble builtin_function_or_method objects
The third question: How can one force Python to bind native heappush as an instance method? Some metaclass magic?
Thank you!
What takes place is that Python offers pure Python implementations of a lot of its algorithms in the standard library even when it contains acceletated native code implementations of the same algorithms.
The heapq library is one of those - if you pick the file you link to, but close to the end, you will see the code snippet which looks if the native version is available, and overwrites the Python version, which has the code you copy and pasted - https://github.com/python/cpython/blob/76cd81d60310d65d01f9d7b48a8985d8ab89c8b4/Lib/heapq.py#L580
try:
from _heapq import *
except ImportError:
pass
...
The native version of heappush is loaded into the module, and there is no easy way of getting a reference to the original Python function, short of getting to the actual file source code.
Now, the point: why do native functions do not work as class methods?
heappush's type is builtin_function_or_method, in constrast with function for pure Python functions - and one of the major diference is that the second object type features a __get__ method. This __get__ makes Python defined functions work as "descriptors": the __get__ method is called when one retrieves the attribute from an instance. For ordinary functions, this call records the self parameter and injects it when the actual function is called.
Thus, it is easy to write an "instancemethod" decorator that will make built-in functions work as Python functions and usable as methods. However, the overhead of creating a partial or a lambda function should surpass the overhead of the extra function call you are trying to eliminate - so you should get no speed gains from it, although it might still read as more elegant:
class instancemethod:
def __init__(self, func):
self.func = func
def __get__(self, instance, owner):
return lambda *args, **kwargs: self.func(instance, *args, **kwargs)
import heapq
class MyHeap(list):
push = instancemethod(heapq.heappush)
Maybe it the way python calls a function. When you try print(type(heappush)) you will notice the difference.
For question 1, the decorator that use to identify which function is which type (i.e. staticmethod, classmethod) is like call and process the function and return the processed one to that name. So the data that determine that should be in some attribute of the function. After I find where it is, question 3 may be solved.
For question 2. When you import the built-in function, it will be in the type of builtin_function_or_method. But if you copy and paste it, it was defined in your code so it's just function. That may cause the interpreter to call it a static method instead of an instance method.

Python __init__ second argument [duplicate]

This question already has answers here:
What is the purpose of the `self` parameter? Why is it needed?
(26 answers)
Closed 6 months ago.
When defining a method on a class in Python, it looks something like this:
class MyClass(object):
def __init__(self, x, y):
self.x = x
self.y = y
But in some other languages, such as C#, you have a reference to the object that the method is bound to with the "this" keyword without declaring it as an argument in the method prototype.
Was this an intentional language design decision in Python or are there some implementation details that require the passing of "self" as an argument?
I like to quote Peters' Zen of Python. "Explicit is better than implicit."
In Java and C++, 'this.' can be deduced, except when you have variable names that make it impossible to deduce. So you sometimes need it and sometimes don't.
Python elects to make things like this explicit rather than based on a rule.
Additionally, since nothing is implied or assumed, parts of the implementation are exposed. self.__class__, self.__dict__ and other "internal" structures are available in an obvious way.
It's to minimize the difference between methods and functions. It allows you to easily generate methods in metaclasses, or add methods at runtime to pre-existing classes.
e.g.
>>> class C:
... def foo(self):
... print("Hi!")
...
>>>
>>> def bar(self):
... print("Bork bork bork!")
...
>>>
>>> c = C()
>>> C.bar = bar
>>> c.bar()
Bork bork bork!
>>> c.foo()
Hi!
>>>
It also (as far as I know) makes the implementation of the python runtime easier.
I suggest that one should read Guido van Rossum's blog on this topic - Why explicit self has to stay.
When a method definition is decorated, we don't know whether to automatically give it a 'self' parameter or not: the decorator could turn the function into a static method (which has no 'self'), or a class method (which has a funny kind of self that refers to a class instead of an instance), or it could do something completely different (it's trivial to write a decorator that implements '#classmethod' or '#staticmethod' in pure Python). There's no way without knowing what the decorator does whether to endow the method being defined with an implicit 'self' argument or not.
I reject hacks like special-casing '#classmethod' and '#staticmethod'.
Python doesn't force you on using "self". You can give it whatever name you want. You just have to remember that the first argument in a method definition header is a reference to the object.
Also allows you to do this: (in short, invoking Outer(3).create_inner_class(4)().weird_sum_with_closure_scope(5) will return 12, but will do so in the craziest of ways.
class Outer(object):
def __init__(self, outer_num):
self.outer_num = outer_num
def create_inner_class(outer_self, inner_arg):
class Inner(object):
inner_arg = inner_arg
def weird_sum_with_closure_scope(inner_self, num)
return num + outer_self.outer_num + inner_arg
return Inner
Of course, this is harder to imagine in languages like Java and C#. By making the self reference explicit, you're free to refer to any object by that self reference. Also, such a way of playing with classes at runtime is harder to do in the more static languages - not that's it's necessarily good or bad. It's just that the explicit self allows all this craziness to exist.
Moreover, imagine this: We'd like to customize the behavior of methods (for profiling, or some crazy black magic). This can lead us to think: what if we had a class Method whose behavior we could override or control?
Well here it is:
from functools import partial
class MagicMethod(object):
"""Does black magic when called"""
def __get__(self, obj, obj_type):
# This binds the <other> class instance to the <innocent_self> parameter
# of the method MagicMethod.invoke
return partial(self.invoke, obj)
def invoke(magic_self, innocent_self, *args, **kwargs):
# do black magic here
...
print magic_self, innocent_self, args, kwargs
class InnocentClass(object):
magic_method = MagicMethod()
And now: InnocentClass().magic_method() will act like expected. The method will be bound with the innocent_self parameter to InnocentClass, and with the magic_self to the MagicMethod instance. Weird huh? It's like having 2 keywords this1 and this2 in languages like Java and C#. Magic like this allows frameworks to do stuff that would otherwise be much more verbose.
Again, I don't want to comment on the ethics of this stuff. I just wanted to show things that would be harder to do without an explicit self reference.
I think it has to do with PEP 227:
Names in class scope are not accessible. Names are resolved in the
innermost enclosing function scope. If a class definition occurs in a
chain of nested scopes, the resolution process skips class
definitions. This rule prevents odd interactions between class
attributes and local variable access. If a name binding operation
occurs in a class definition, it creates an attribute on the resulting
class object. To access this variable in a method, or in a function
nested within a method, an attribute reference must be used, either
via self or via the class name.
I think the real reason besides "The Zen of Python" is that Functions are first class citizens in Python.
Which essentially makes them an Object. Now The fundamental issue is if your functions are object as well then, in Object oriented paradigm how would you send messages to Objects when the messages themselves are objects ?
Looks like a chicken egg problem, to reduce this paradox, the only possible way is to either pass a context of execution to methods or detect it. But since python can have nested functions it would be impossible to do so as the context of execution would change for inner functions.
This means the only possible solution is to explicitly pass 'self' (The context of execution).
So i believe it is a implementation problem the Zen came much later.
As explained in self in Python, Demystified
anything like obj.meth(args) becomes Class.meth(obj, args). The calling process is automatic while the receiving process is not (its explicit). This is the reason the first parameter of a function in class must be the object itself.
class Point(object):
def __init__(self,x = 0,y = 0):
self.x = x
self.y = y
def distance(self):
"""Find distance from origin"""
return (self.x**2 + self.y**2) ** 0.5
Invocations:
>>> p1 = Point(6,8)
>>> p1.distance()
10.0
init() defines three parameters but we just passed two (6 and 8). Similarly distance() requires one but zero arguments were passed.
Why is Python not complaining about this argument number mismatch?
Generally, when we call a method with some arguments, the corresponding class function is called by placing the method's object before the first argument. So, anything like obj.meth(args) becomes Class.meth(obj, args). The calling process is automatic while the receiving process is not (its explicit).
This is the reason the first parameter of a function in class must be the object itself. Writing this parameter as self is merely a convention. It is not a keyword and has no special meaning in Python. We could use other names (like this) but I strongly suggest you not to. Using names other than self is frowned upon by most developers and degrades the readability of the code ("Readability counts").
...
In, the first example self.x is an instance attribute whereas x is a local variable. They are not the same and lie in different namespaces.
Self Is Here To Stay
Many have proposed to make self a keyword in Python, like this in C++ and Java. This would eliminate the redundant use of explicit self from the formal parameter list in methods. While this idea seems promising, it's not going to happen. At least not in the near future. The main reason is backward compatibility. Here is a blog from the creator of Python himself explaining why the explicit self has to stay.
The 'self' parameter keeps the current calling object.
class class_name:
class_variable
def method_name(self,arg):
self.var=arg
obj=class_name()
obj.method_name()
here, the self argument holds the object obj. Hence, the statement self.var denotes obj.var
There is also another very simple answer: according to the zen of python, "explicit is better than implicit".

Are Python decorators necessary?

I have a basic understanding of decorators, but as yet they seem superfluous and "hacky", like C macros but for functions.
An article stressing the importance of decorators gives this example of decorator usage:
from myapp.log import logger
def log_order_event(func):
def wrapper(*args, **kwargs):
logger.info("Ordering: %s", func.__name__)
order = func(*args, **kwargs)
logger.debug("Order result: %s", order.result)
return order
return wrapper
#log_order_event
def order_pizza(*toppings):
# let's get some pizza!
order_pizza(*toppings) # Usage
Isn't this equivalent to the decoratorless code below?
from myapp.log import logger
def log_order_event(func, *args, **kwargs):
logger.info("Ordering: %s", func.__name__)
order = func(*args, **kwargs)
logger.debug("Order result: %s", order.result)
return order
def order_pizza(*toppings):
# let's get some pizza!
log_order_event(order_pizza, *toppings) # Usage
In fact, isn't the second snippet easier to write since it's one function instead of a function and a wrapper function? The usage calls are longer, but they more clearly indicate what's actually being called.
Is it purely a matter of taste and syntactic sugar, or am I missing something?
Both snippets are equivalent as far as behavior is concerned. However, I'd use the decorator version for two main reasons:
Readability: With snippet 2, you'd require an explicit function call to a less relevant method log_order_event() whose main purpose is logging. This logging functionality may be required by other methods such as order_appetizers() and therefore, you would require individual calls to log_order_event() for every such method. Wouldn't it be better(from a readability perspective) to have a wrapper defined once & for all, that performs logging and you may call the wrapper(as a decorator) for such methods whenever necessary? Decorators are syntactic sugar but certainly not a hack. In fact, PEP 318 clearly talks about the motivation behind decorators, you must read that.
Scalability: Consider a situation where your code has several methods(say 20) such as order_appetizers(), order_maincourse(), order_desserts() and so on. Let's also assume that you'd like to perform some prior checks before ordering any of these items, say check for toppings(for pizza's), calories(for desserts) and so on. When using decorators, its just a matter of defining the wrapper functions in a separate wrapper class for each check, import it in your main code and decorate the required methods respectively. That way, your wrapper functions are completely isolated and the main logic remains clean and easy to maintain.
#check_toppings
#log_order_event
def order_pizza(*toppings):
# let's get some pizza!
#check_calories
#log_order_event
def order_desserts():
# let's order some desserts!
...and so on for 20 more methods
With a decoratorless logic, while you can separate the check(and log) methods in a separate class and import in your main logic, you would still have to add individual calls to each of these methods inside each order method which from a code maintenance perspective could prove to be daunting, especially for new eyes.

Is there a pythonic way to skip decoration on a subclass' method?

I have an class which decorates some methods using a decorator from another library. Specifically, the class subclasses flask-restful resources, decorates the http methods with httpauth.HTTPBasicAuth().login_required(), and does some sensible defaults on a model service.
On most subclasses I want the decorator applied; therefore I'd rather remove it than add it in the subclasses.
My thought is to have a private method which does the operations and a public method which is decorated. The effects of decoration can be avoided by overriding the public method to call the private one and not decorating this override. Mocked example below.
I am curious to know if there's a better way to do this. Is there a shortcut for 'cancelling decorators' in python that gives this effect?
Or can you recommend a better approach?
Some other questions have suitable answers for this, e.g. Is there a way to get the function a decorator has wrapped?. But my question is about broader design - i am interested in any pythonic way to run the operations in decorated methods without the effects of decoration. E.g. my example is one such way but there may be others.
def auth_required(fn):
def new_fn(*args, **kwargs):
print('Auth required for this resource...')
fn(*args, **kwargs)
return new_fn
class Resource:
name = None
#auth_required
def get(self):
self._get()
def _get(self):
print('Getting %s' %self.name)
class Eggs(Resource):
name = 'Eggs'
class Spam(Resource):
name = 'Spam'
def get(self):
self._get()
# super(Spam, self)._get()
eggs = Eggs()
spam = Spam()
eggs.get()
# Auth required for this resource...
# Getting Eggs
spam.get()
# Getting Spam
Flask-HTTPAuth uses functools.wraps in the login_required decorator:
def login_required(self, f):
#wraps(f)
def decorated(*args, **kwargs):
...
From Python 3.2, as this calls update_wrapper, you can access the original function via __wrapped__:
To allow access to the original function for introspection and other
purposes (e.g. bypassing a caching decorator such as lru_cache()),
this function automatically adds a __wrapped__ attribute to the
wrapper that refers to the function being wrapped.
If you're writing your own decorators, as in your example, you can also use #wraps to get the same functionality (as well as keeping the docstrings, etc.).
See also Is there a way to get the function a decorator has wrapped?
Another common option is to have the decorated function keep a copy of the original function that can be accessed:
def auth_required(fn):
def new_fn(*args, **kwargs):
print('Auth required for this resource...')
fn(*args, **kwargs)
new_fn.original_fn = fn
return new_fn
Now, for any function that has been decorated, you can access its original_fn attribute to get a handle to the original, un-decorated function.
In that case, you could define some type of dispatcher that either makes plain function calls (when you are happy with the decorator behavior) or makes calls to thing.original_fn when you prefer to avoid the decorator behavior.
Your proposed method is also a valid way to structure it, and whether my suggestion is "better" depends on the rest of the code you're dealing with, who needs to read it, and other kinds of trade-offs.
I am curious to know if there's a better way to do this. Is there a
shortcut for 'cancelling decorators' in python that gives this effect?
Use the undecorated library. It digs through all the decorators and returns just the original function. The docs should be self-explanatory, basically you just call: undecorated(your_decorated_function)

Allow help() to work on partial function object

I'm trying to make sure running help() at the Python 2.7 REPL displays the __doc__ for a function that was wrapped with functools.partial. Currently running help() on a functools.partial 'function' displays the __doc__ of the functools.partial class, not my wrapped function's __doc__. Is there a way to achieve this?
Consider the following callables:
def foo(a):
"""My function"""
pass
partial_foo = functools.partial(foo, 2)
Running help(foo) will result in showing foo.__doc__. However, running help(partial_foo) results in the __doc__ of a Partial object.
My first approach was to use functools.update_wrapper which correctly replaces the partial object's __doc__ with foo.__doc__. However, this doesn't fix the 'problem' because of how pydoc.
I've investigated the pydoc code, and the issue seems to be that partial_foo is actually a Partial object not a typical function/callable, see this question for more information on that detail.
By default, pydoc will display the __doc__ of the object type, not instance if the object it was passed is determined to be a class by inspect.isclass. See the render_doc function for more information about the code itself.
So, in my scenario above pydoc is displaying the help of the type, functools.partial NOT the __doc__ of my functools.partial instance.
Is there anyway to make alter my call to help() or functools.partial instance that's passed to help() so that it will display the __doc__ of the instance, not type?
I found a pretty hacky way to do this. I wrote the following function to override the __builtins__.help function:
def partialhelper(object=None):
if isinstance(object, functools.partial):
return pydoc.help(object.func)
else:
# Preserve the ability to go into interactive help if user calls
# help() with no arguments.
if object is None:
return pydoc.help()
else:
return pydoc.help(object)
Then just replace it in the REPL with:
__builtins__.help = partialhelper
This works and doesn't seem to have any major downsides, yet. However, there isn't a way with the above naive implementation to support still showing the __doc__ of some functools.partial objects. It's all or nothing, but could probably attach an attribute to the wrapped (original) function to indicate whether or not the original __doc__ should be shown. However, in my scenario I never want to do this.
Note the above does NOT work when using IPython and the embed functionality. This is because IPython directly sets the shell's namespace with references to the 'real' __builtin__, see the code and old mailing list for information on why this is.
So, after some investigation there's another way to hack this into IPython. We must override the site._Helper class, which is used by IPython to explicitly setup the help system. The following code will do just that when called BEFORE IPython.embed:
import site
site._Helper.__call__ = lambda self, *args, **kwargs: partialhelper(*args, **kwargs)
Are there any other downsides I'm missing here?
how bout implementing your own?
def partial_foo(*args):
""" some doc string """
return foo(*((2)+args))
not a perfect answer but if you really want this i suspect this is the only way to do it
You identified the issue - partial functions aren't typical functions, and the dunder variables don't carry over. This applies not just to __doc__, but also __name__, __module__, and more. Not sure if this solution existed when the question was asked, but you can achieve this more elegantly ("elegantly" up to interpretation) by re-writing partial() as a decorator factory. Since decorators (& factories) do not automatically copy over dunder variables, you need to also use #wraps(func):
def wrapped_partial(*args, **kwargs):
def foo(func):
#wraps(func)
def bar(*fargs,**fkwargs):
return func(*args, *fargs, **kwargs, **fkwargs)
return bar
return foo
Usage example:
#wrapped_partial(3)
def multiply_triple(x, y=1, z=0):
"""Multiplies three numbers"""
return x * y * z
# Without decorator syntax: multiply_triple = wrapped_partial(3)(multiply_triple)
With output:
>>>print(multiply_triple())
0
>>>print(multiply_triple(3,z=3))
9
>>>help(multiply_triple)
help(multiply_triple)
Help on function multiply_triple in module __main__:
multiply_triple(x: int, y: int = 1, z: int = 0)
Multiplies three numbers
Thing that didn't work, but informative when using multiple decorators
You might think, as I first did, that based upon the stacking syntax of decorators in PEP-318, you could put the wrapping and the partial function definition in separate decorators, e.g.
def partial_func(*args, **kwargs):
def foo(func):
def bar(*fargs,**fkwargs):
return func(*args, *fargs, **kwargs, **fkwargs)
return bar
return foo
def wrapped(f):
#wraps(f)
def wrapper(*args, **kwargs):
return f(*args, **kwargs)
return wrapper
#wrapped
#partial_func(z=3)
def multiply_triple(x, y=1, z=0):
"""Multiplies three numbers"""
return x * y * z
In these cases (and in reverse order), the decorators are applied one at a time, and the #partial_func interrupts wrapping. This means that if you are trying to use any decorator that you want to wrap, you need to rewrite the decorator in a factory where the decorator's return function is itself decorated by #wraps(func). If you are using multiple decorators, they all have to be turned into wrapped factories.
Alternate method to have decorators "wrap"
Since decorators are just functions, you can write a copy_dunder_vars(obj1, obj2) function that retruns obj2 but with all the dunder variables from obj1. Call as:
def foo()
pass
foo = copy_dunder_vars(decorator(foo), foo)
This goes against the preferred syntax, but practicality beats purity. I think "not forcing you to rewrite decorators that you're borrowing from elsewhere and leaving largely unchanged" fits into that category. After all that wrapping, don't forget ribbon and a bow ;)

Categories