Intercept magic method calls in python class - python

I am trying to make a class that wraps a value that will be used across multiple other objects. For computational reasons, the aim is for this wrapped value to only be calculated once and the reference to the value passed around to its users. I don't believe this is possible in vanilla python due to its object container model. Instead, my approach is a wrapper class that is passed around, defined as follows:
from typing import Any
class DynamicProperty():
    def __init__(self, value = None):
        # Value of the property
        self.value: Any = value
    def __repr__(self):
        # Use value's repr instead
        return repr(self.value)
    def __getattr__(self, attr):
        # Doesn't exist in wrapper, get it from the value
        # instead
        return getattr(self.value, attr)
The following works as expected:
wrappedString = DynamicProperty("foo")
wrappedString.upper() # 'FOO'
wrappedFloat = DynamicProperty(1.5)
wrappedFloat.__add__(2) # 3.5
However, implicitly calling __add__ through normal syntax fails:
wrappedFloat + 2 # TypeError: unsupported operand type(s) for
# +: 'DynamicProperty' and 'int'
Is there a way to intercept these implicit method calls without explicitly defining magic methods for DynamicProperty to call the method on its value attribute?

Talking about "passing by reference" will only confuse you. Keep that terminology to languages where you can have a choice on that, and where it makes a difference. In Python you always pass objects around - and this passing is the equivalent of "passing by reference" - for all objects - from None to int to a live asyncio network connection pool instance.
With that out of the way: the algorithm the language follows to retrieve attributes from an object is complicated and has many details; implementing __getattr__ is just the tip of the iceberg. Reading the document called "Data Model" in its entirety will give you a better grasp of all the mechanisms involved in retrieving attributes.
That said, here is how it works for "magic" or "dunder" methods (special functions with two underscores before and two after the name): when you use an operator that requires a method to implement it (like __add__ for +), the language checks the class of your object for the __add__ method, not the instance. And __getattr__ on the class can dynamically create attributes for instances of that class only.
But that is not the only problem: you could create a metaclass (inheriting from type) and put a __getattr__ method on this metaclass. For all querying you would do from Python, it would look like your object had the __add__ (or any other dunder method) in its class. However, for dunder methods, Python does not go through the normal attribute lookup mechanism: it "looks" directly at the class to see whether the dunder method is "physically" there. There are slots in the memory structure that holds the class, one for each of the possible dunder methods, and they either refer to the corresponding method or are "null" (this is "viewable" when coding in C; on the Python side, the default dir will show these methods when they exist, or omit them if not). If they are not there, Python will just "say" the object does not implement that operation, period.
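A minimal sketch of the metaclass point, using hypothetical names (Meta, Thing) that are not part of the question: explicit attribute access on the class finds __add__ through the metaclass's __getattr__, while the + operator still ignores it.

```python
class Meta(type):
    def __getattr__(cls, name):
        # Only reached when normal lookup on the class fails.
        if name == "__add__":
            return lambda self, other: 42
        raise AttributeError(name)


class Thing(metaclass=Meta):
    pass


# Explicit lookup goes through Meta.__getattr__ and succeeds.
looks_present = hasattr(Thing, "__add__")
explicit_result = Thing.__add__(Thing(), 1)

# The operator consults the type's slot directly, and that slot is empty.
try:
    Thing() + 1
    operator_worked = True
except TypeError:
    operator_worked = False
```

So the class "looks like" it supports addition to any Python-level query, yet the operator itself still raises TypeError.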
The way to work around that with a proxy object like you want is to create a proxy class that either features the dunder methods from the class you want to wrap, or features all possible methods, and upon being called, check if the underlying object actually implements the called method.
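As a rough sketch of that workaround (not the code from the package mentioned below), one could generate forwarding dunder methods on the proxy class. The names ForwardingProxy and _make_forwarder are made up for illustration, and only a handful of arithmetic operators are covered here.

```python
class ForwardingProxy:
    """Hypothetical wrapper that forwards selected operators to `value`."""

    def __init__(self, value=None):
        self.value = value

    def __repr__(self):
        return repr(self.value)

    def __getattr__(self, attr):
        # Ordinary (non-operator) attributes still come from the value.
        return getattr(self.value, attr)


def _make_forwarder(name):
    def forwarder(self, *args):
        # Unwrap other proxies so the underlying method sees plain values.
        args = [a.value if isinstance(a, ForwardingProxy) else a for a in args]
        method = getattr(self.value, name, None)
        if method is None:
            # The wrapped value does not implement this operation.
            return NotImplemented
        return method(*args)

    forwarder.__name__ = name
    return forwarder


# The dunder methods must live on the class itself to be found by operators.
for _name in ("__add__", "__radd__", "__sub__", "__rsub__", "__mul__", "__rmul__"):
    setattr(ForwardingProxy, _name, _make_forwarder(_name))

wrapped_float = ForwardingProxy(1.5)
total = wrapped_float + 2        # forwards to (1.5).__add__(2)
reflected = 2 + wrapped_float    # forwards to (1.5).__radd__(2)
```

A complete proxy would need the same treatment for comparison, container, and conversion methods, which is part of why fully transparent proxies are rare.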
That is why "serious" code will rarely, if ever, offer true "transparent" proxy objects. There are exceptions, but from "Weakrefs", to "super()", to concurrent.futures, just to mention a few in the core language and stdlib, no one attempts a "fully working transparent proxy". Instead, the API is more like this: you call a ".value()" or ".result()" method on the wrapper to get to the original object itself.
However, it can be done, as I described above. I even have a small (long unmaintained) package on pypi that does that, wrapping a proxy for a future.
The code is at https://bitbucket.org/jsbueno/lelo/src/master/lelo/_lelo.py

The + operator in your case does not work, because DynamicProperty does not inherit from float. See:
>>> class Foo(float):
...     pass
...
>>> Foo(1.5) + 2
3.5
So, you'll need to do some kind of dynamic inheritance:
def get_dynamic_property(instance):
    base = type(instance)
    class DynamicProperty(base):
        pass
    return DynamicProperty(instance)
wrapped_string = get_dynamic_property("foo")
print(wrapped_string.upper())
wrapped_float = get_dynamic_property(1.5)
print(wrapped_float + 2)
Output:
FOO
3.5

Related

Explicit call to __call__ works and uses __init__

I'm learning overloading in Python 3.X and to better understand the topic, I wrote the following code that works in 3.X but not in 2.X. I expected the below code to fail since I've not defined __call__ for class Test. But to my surprise, it works and prints "constructor called".
class Test:
    def __init__(self):
        print("constructor called")
#Test.__getitem__() #error as expected
Test.__call__() #this works in 3.X(but not in 2.X) and prints "constructor called"! WHY THIS DOESN'T GIVE ERROR in 3.x?
So my question is: how and why exactly does this code work in 3.x but not in 2.x? I want to know the mechanics behind what is going on.
More importantly, why is __init__ being used here when I am calling __call__?
In 3.x:
About attribute lookup, type and object
Every time an attribute is looked up on an object, Python follows a process like this:
1. Is it directly a part of the actual data in the object? If so, use that and stop.
2. Is it directly a part of the object's class? If so, hold onto that for step 4.
3. Otherwise, check the object's class for __getattr__ and __getattribute__ overrides, look through base classes in the MRO, etc. (This is a massive simplification, of course.)
4. If something was found in step 2 or 3, check if it has a __get__. If it does, look that up (yes, that means starting over at step 1 for the attribute named __get__ on that object), call it, and use its return value. Otherwise, use what was returned directly.
Functions have a __get__ automatically; it is used to implement method binding. Classes are objects; that's why it's possible to look up attributes in them. That is: the purpose of the class Test: block is to define a data type; the code creates an object named Test which represents the data type that was defined.
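To make the __get__ step concrete, here is a small sketch (the method name is made up for illustration) that performs the binding by hand and compares it with ordinary attribute access:

```python
class Test:
    def method(self):
        return "called"


t = Test()
raw_function = Test.__dict__["method"]        # stored in the class as a plain function
bound_method = raw_function.__get__(t, Test)  # the binding that step 4 performs

same_result = bound_method() == t.method()
bound_to_instance = bound_method.__self__ is t
```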
But since the Test class is an object, it must be an instance of some class. That class is called type, and has a built-in implementation.
>>> type(Test)
<class 'type'>
Notice that type(Test) is not a function call. Rather, the name type is pre-defined to refer to a class, which every other class created in user code is (by default) an instance of.
In other words, type is the default metaclass: the class of classes.
>>> type
<class 'type'>
One may ask, what class does type belong to? The answer is surprisingly simple - itself:
>>> type(type) is type
True
Since the above examples call type, we conclude that type is callable. To be callable, it must have a __call__ attribute, and it does:
>>> type.__call__
<slot wrapper '__call__' of 'type' objects>
When type is called with a single argument, it looks up the argument's class (roughly equivalent to accessing the __class__ attribute of the argument). When called with three arguments, it creates a new instance of type, i.e., a new class.
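A short illustration of both forms, with a hypothetical Point class used only as an example:

```python
# Three-argument form: build a class without a `class` statement.
Point = type("Point", (object,), {"x": 0, "describe": lambda self: f"x={self.x}"})

p = Point()
description = p.describe()

# One-argument form: ask for the class of an existing object.
class_of_instance = type(p)
class_of_class = type(Point)
```

Here class_of_instance is Point itself, and class_of_class is type.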
How does type work?
Because this is digging right at the core of the language (allocating memory for the object), it's not quite possible to implement this in pure Python, at least for the reference C implementation (and I have no idea what sort of magic is going on in PyPy here). But we can approximately model the type class like so:
def _validate_type(obj, required_type, context):
    if not isinstance(obj, required_type):
        good_name = required_type.__name__
        bad_name = type(obj).__name__
        raise TypeError(f'{context} must be {good_name}, not {bad_name}')
class type:
    def __new__(cls, name_or_obj, *args):
        # __new__ implicitly gets passed an instance of the class, but
        # `type` is its own class, so it will be `type` itself.
        if len(args) == 0: # 1-argument form: check the type of an existing class.
            return name_or_obj.__class__
        # otherwise, 3-argument form: create a new class.
        try:
            bases, attrs = args
        except ValueError:
            raise TypeError('type() takes 1 or 3 arguments')
        _validate_type(name_or_obj, str, 'type.__new__() argument 1')
        _validate_type(bases, tuple, 'type.__new__() argument 2')
        _validate_type(attrs, dict, 'type.__new__() argument 3')
        # This line would not work if we were actually implementing
        # a replacement for `type`, as it would route to `object.__new__(type)`,
        # which is explicitly disallowed. But let's pretend it does...
        result = super().__new__(cls)
        # Now, fill in attributes from the parameters.
        result.__name__ = name_or_obj
        # Assigning to `__bases__` triggers a lot of other internal checks!
        result.__bases__ = bases
        for name, value in attrs.items():
            setattr(result, name, value)
        return result
    del __new__.__get__ # `__new__`s of builtins don't implement this.
    def __call__(self, *args):
        return self.__new__(self, *args)
    # this, however, does have a `__get__`.
What happens (conceptually) when we call the class (Test())?
1. Test() uses function-call syntax, but it's not a function. To figure out what should happen, we translate the call into Test.__class__.__call__(Test). (We use __class__ directly here, because translating the function call using type - asking type to categorize itself - would end up in endless recursion.)
2. Test.__class__ is type, so this becomes type.__call__(Test).
3. type contains a __call__ directly (type is its own class, remember?), so it's used directly - we don't go through the __get__ descriptor. We call the function, with Test as self, and no other arguments. (We have a function now, so we don't need to translate the function call syntax again. We could - given a function func, func.__class__.__call__.__get__(func) gives us an instance of an unnamed builtin "method wrapper" type, which does the same thing as func when called. Repeating the loop on the method wrapper creates a separate method wrapper that still does the same thing.)
4. This attempts the call Test.__new__(Test) (since self was bound to Test). Test.__new__ isn't explicitly defined in Test, but since Test is a class, we don't look in Test's class (type), but instead in Test's base (object).
5. object.__new__(Test) exists, and does magical built-in stuff to allocate memory for a new instance of the Test class, make it possible to assign attributes to that instance (even though Test is a subtype of object, which disallows that), and set its __class__ to Test.
Similarly, when we call type, the same logical chain turns type(Test) into type.__class__.__call__(type, Test) into type.__call__(type, Test), which forwards to type.__new__(type, Test). This time, there is a __new__ attribute directly in type, so this doesn't fall back to looking in object. Instead, with name_or_obj being set to Test, we simply return Test.__class__, i.e., type. And with separate name, bases, attrs arguments, type.__new__ instead creates an instance of type.
Finally: what happens when we call Test.__call__() explicitly?
If there's a __call__ defined in the class, it gets used, since it's found directly. This will fail, however, because there aren't enough arguments: the descriptor protocol isn't used since the attribute was found directly, so self isn't bound, and so that argument is missing.
If there isn't a __call__ method defined, then we look in Test's class, i.e., type. There's a __call__ there, so the rest proceeds like steps 3-5 in the previous section.
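Both cases can be checked with a small sketch. WithCall is a hypothetical second class added for contrast; it is not part of the question.

```python
class Test:
    def __init__(self):
        self.ready = True


# No __call__ in Test, so the lookup falls through to type.__call__,
# bound to Test: the same thing Test() does.
via_explicit_call = Test.__call__()
via_type = type.__call__(Test)


class WithCall:
    def __call__(self):
        return "instance called"


# Here __call__ is found in the class itself, with no instance to bind.
try:
    WithCall.__call__()
    missing_self = False
except TypeError:
    missing_self = True

instance_result = WithCall()()
```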
In Python 3.x, every class is implicitly a child of the builtin class object. And at least in the CPython implementation, the object class has a __call__ method which is defined in its metaclass type.
That means that Test.__call__() is exactly the same as Test() and will return a new Test object, calling your custom __init__ method.
In Python 2.x, classes are old-style by default and do not inherit from object. Because of that, __call__ is not defined. You can get the same behaviour in Python 2.x by using new-style classes, that is, by explicitly inheriting from object:
# Python 2 new style class
class Test(object):
    ...

Define @property on function

In JavaScript, we can do the following to any object or function
const myFn = () => {};
Object.defineProperties(myFn, {
  property: {
    get: () => console.log('property accessed')
  }
});
This will allow for a @property-like syntax by defining a getter function for the property named property.
myFn.property
// property accessed
Is there anything similar for functions in Python?
I know we can't use property since it's not a new-style class, and assigning a lambda with setattr will not work since it'll be a function.
Basically, what I want is for my_fn.property to return a new instance of another class on each access.
What I currently have with setattr is this
setattr(my_fn, 'property', OtherClass())
My hopes are to design an API that looks like this my_fn.property.some_other_function().
I would prefer using a function as my_fn and not an instance of a class, even though I realize that it might be easier to implement.
Below is the gist of what I'm trying to achieve
def my_fn():
    pass
my_fn = property('property', lambda: OtherClass())
my_fn.property
# will be a new instance of OtherClass on each call
It's not possible to do exactly what you want. The descriptor protocol that powers the property built-in is only invoked when:
1. The descriptor is defined on a class
2. The descriptor's name is accessed on an instance of said class
Problem is, the class behind functions defined in Python (aptly named function, exposed directly as types.FunctionType or indirectly by calling type() on any function defined at the Python layer) is a single shared, immutable class, so you can't add descriptors to it (and even if you could, they'd become attributes of every Python level function, not just one particular function).
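A quick sketch of both limitations, assuming a hypothetical attribute name prop: a property stored on the function object is handed back as-is rather than invoked, and the shared function class refuses new attributes.

```python
import types


def my_fn():
    pass


# A descriptor stored on the function object itself is never invoked:
# attribute access just hands back the property object.
my_fn.prop = property(lambda self: "computed")
got_raw_property = isinstance(my_fn.prop, property)

# Putting it on the shared function class is rejected outright.
try:
    types.FunctionType.prop = property(lambda self: "computed")
    patched = True
except TypeError:
    patched = False
```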
The closest you can get to what you're attempting would be to define a callable class (defining __call__) that defines the descriptor you're interested in as well. Make a single instance of that class (you can throw away the class itself at this point) and it will behave as you expect. Make __call__ a staticmethod, and you'll avoid changing the signature to boot.
For example, the behavior you want could be achieved with:
class my_fn:
    # Note: Using the name "property" for a property has issues if you define
    # other properties later in the class; this is just for illustration
    @property
    def property(self):
        return OtherClass()
    @staticmethod
    def __call__(...whatever args apply; no need for self...):
        ... function behavior goes here ...
my_fn = my_fn() # Replace class with instance of class that behaves like a function
Now you can call the "function" (really a functor, to use C++ parlance):
my_fn(...)
or access the property, getting a brand new OtherClass each time:
>>> type(my_fn.property) is type(my_fn.property)
True
>>> my_fn.property is my_fn.property
False
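For reference, a self-contained variant of the same idea that runs as written. The body of OtherClass, the attribute name prop, and the x/y parameters are placeholders invented for this sketch, since the question does not specify them.

```python
class OtherClass:
    # Stand-in for the asker's class; the method body is made up.
    def some_other_function(self):
        return "hello"


class _MyFn:
    @property
    def prop(self):
        # A fresh OtherClass on every attribute access.
        return OtherClass()

    @staticmethod
    def __call__(x, y=1):
        # Placeholder function behavior; no `self` in the signature.
        return x + y


my_fn = _MyFn()  # the single instance plays the role of the function

call_result = my_fn(2)
chained = my_fn.prop.some_other_function()
fresh_each_time = my_fn.prop is not my_fn.prop
```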
No, this isn't what you asked for (you seem set on having a plain function do this for you), but you're asking for a very JavaScript specific thing which doesn't exist in Python.
What you want is not currently possible, because the property would have to be set on the function type to be invoked correctly. And you are not allowed to monkeypatch the function type:
>>> type(my_fn).property = 'anything else'
TypeError: can't set attributes of built-in/extension type 'function'
The solution: use a callable class instead.
Note: What you want may become possible in Python 3.8 if PEP 575 is accepted.

How to determine the method type of stdlib methods written in C

The classify_class_attrs function from the inspect module can be used to determine what kind of object each of a class's attributes is, including whether a function is an instance method, a class method, or a static method. Here is an example:
from inspect import classify_class_attrs
class Example(object):
    @classmethod
    def my_class_method(cls):
        pass
    @staticmethod
    def my_static_method():
        pass
    def my_instance_method(self):
        pass
print classify_class_attrs(Example)
This will output a list of Attribute objects for each attribute on Example, with metadata about the attribute. The relevant ones in this case are:
Attribute(name='my_class_method', kind='class method', defining_class=<class '__main__.Example'>, object=<classmethod object at 0x100535398>)
Attribute(name='my_instance_method', kind='method', defining_class=<class '__main__.Example'>, object=<unbound method Example.my_instance_method>)
Attribute(name='my_static_method', kind='static method', defining_class=<class '__main__.Example'>, object=<staticmethod object at 0x100535558>)
However, it seems that many objects in Python's standard library can't be introspected this way. I'm guessing this has something to do with the fact that many of them are implemented in C. For example, datetime.datetime.now is described with this Attribute object by inspect.classify_class_attrs:
Attribute(name='now', kind='method', defining_class=<type 'datetime.datetime'>, object=<method 'now' of 'datetime.datetime' objects>)
If we compare this to the metadata returned about the attributes on Example, you'd probably draw the conclusion that datetime.datetime.now is an instance method. But it actually behaves as a class method!
from datetime import datetime
print datetime.now() # called from the class: 2014-09-12 16:13:33.890742
print datetime.now().now() # called from a datetime instance: 2014-09-12 16:13:33.891161
Is there a reliable way to determine whether a method on a stdlib class is a static, class, or instance method?
I think you can get much of what you want, distinguishing five kinds, without relying on anything that isn't documented by inspect:
1. Python instance methods
2. Python class methods
3. Python static methods
4. Builtin instance methods
5. Builtin class methods or static methods
But you can't distinguish those last two from each other without using CPython-specific implementation details.
(As far as I know, only 3.x has any builtin static methods in the stdlib… but of course even in 2.x, someone could always define one in an extension module.)
The details of what's available in inspect and even what it means are a little different in each version of Python, partly because things have changed between 2.x and 3.x, partly because inspect is basically a bunch of heuristics that have gradually improved over time.
But at least for CPython 2.6 and 2.7 and 3.3-3.5, the simplest way to distinguish builtin instance methods from the other two types is isbuiltin on the method looked up from the class. For a static method or class method, this will be True; for an instance method, False. For example:
>>> inspect.isbuiltin(str.maketrans)
True
>>> inspect.isbuiltin(datetime.datetime.now)
True
>>> inspect.isbuiltin(datetime.datetime.ctime)
False
Why does this work? Well, isbuiltin will:
Return true if the object is a built-in function or a bound built-in method.
When looked up on an instance, either a regular method or a classmethod-like method is bound. But when looked up on the class, a regular method is unbound, while a classmethod-like method is bound (to the class). And of course a staticmethod-like method ends up as a plain-old function when looked up either way. So, it's a bit indirect, but it will always be correct.*
What about class methods vs. static methods?
In CPython 3.x, builtin static and class method descriptors both return the exact same type when looked up on their class, and none of the documented attributes can be used to distinguish them either. And even if this weren't true, I think the way the reference is written, it's guaranteed that no functions in inspect would be able to distinguish them.
What if we turn to the descriptors themselves? Yes, there are ways we can distinguish them… but I don't think it's something guaranteed by the language:
>>> callable(str.__dict__['maketrans'])
False
>>> callable(datetime.datetime.__dict__['now'])
True
Why does this work? Well, static methods just use a staticmethod descriptor, exactly like in Python (but wrapping a builtin function instead of a function). But class and instance methods use a special descriptor type, instead of using classmethod wrapping a (builtin) function and the (builtin) function itself, as Python class and instance methods do. These special descriptor types, classmethod_descriptor and method_descriptor, are unbound (class and instance) methods, as well as being the descriptors that bind them. There are historical/implementation reasons for this to be true, but I don't think there's anything in the language reference that requires it to be true, or even implies it.
And if you're willing to rely on implementation artifacts, isinstance(m, staticmethod) seems a lot simpler…
All that being said, are there any implementations besides CPython that have both builtin staticmethods and classmethods? If not, remember that practicality beats purity…
* What it's really testing for is whether the thing is callable without an extra argument, but that's basically the same thing as the documented "function or bound method"; either way, it's what you want.
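If relying on CPython implementation details is acceptable, the descriptor-based checks can be collected into one helper. This is only a sketch for CPython 3.7+ (it assumes types.ClassMethodDescriptorType and types.MethodDescriptorType are available); method_kind and _raw_attr are hypothetical names, and other implementations may classify things differently.

```python
import datetime
import types


def _raw_attr(cls, name):
    # Fetch the attribute from the class dicts without triggering descriptors.
    for base in cls.__mro__:
        if name in vars(base):
            return vars(base)[name]
    raise AttributeError(name)


def method_kind(cls, name):
    raw = _raw_attr(cls, name)
    if isinstance(raw, staticmethod):
        return "static method"
    if isinstance(raw, (classmethod, types.ClassMethodDescriptorType)):
        return "class method"
    if isinstance(raw, (types.FunctionType, types.MethodDescriptorType,
                        types.WrapperDescriptorType)):
        return "instance method"
    return "other"


class Example:
    @classmethod
    def my_class_method(cls):
        pass

    def my_instance_method(self):
        pass


now_kind = method_kind(datetime.datetime, "now")
ctime_kind = method_kind(datetime.datetime, "ctime")
maketrans_kind = method_kind(str, "maketrans")
```

Using isinstance on the raw descriptor avoids depending on whether staticmethod objects themselves are callable, which is not the same in every Python version.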

python can't set attributes of built-in/extension type 'object'

Python seems to disallow assigning attributes to the top-level built-in 'object':
object.lst=lambda o:list(o)
Or
func=lambda o:list(o)
setattr(object,'lst',func)
Both generate error messages.
Any design reason behind it? Honestly, this is odd, since everything is an object, and you can define func(obj) as you wish, so if func does not constrain the argument type, why is adding it as a method prohibited?
obj.func() is exactly func(obj) if you assign func as an attribute of the object. Since everything is an object, an attribute of object is a global function almost by definition.
Any way to work around this?
Basically, if you could add functions to built-in types, you could do pipeline-style programming: obj.list().len().range(). I think the readability of this type of program is very clear and there is nothing non-Pythonic about it.
I am not sure at the moment of the actual "design" reasoning for this, though there is some, but the fact is that object, like some other built-in classes, is written in native code (hence "built-in").
Because of the way these built-in classes are made (their code is usually in C), their attribute namespace cannot be modified from Python code, and thus they disallow setting arbitrary attributes.
Function objects had a __dict__ attribute implemented somewhere along the 2.x development cycle, so they can hold arbitrary attributes.
The way to go if you need something that works as an "object" but has arbitrary attributes is to subclass object, and add your attributes there:
class MyObj(object): pass
MyObj.lst = lambda o: list(o)
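A small sketch of the difference, using a hypothetical tuple subclass (MyTuple) so that the added method has something to iterate over:

```python
# Assigning to the built-in class is rejected...
try:
    object.lst = lambda o: list(o)
    patched = True
except TypeError:
    patched = False


# ...but a subclass defined in Python accepts new attributes.
class MyTuple(tuple):
    pass


MyTuple.lst = lambda self: list(self)
as_list = MyTuple((1, 2, 3)).lst()
```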

Which special methods bypass __getattribute__ in Python?

In addition to bypassing any instance attributes in the interest of correctness, implicit special method lookup generally also bypasses the __getattribute__() method even of the object’s metaclass.
The docs mention special methods such as __hash__, __repr__ and __len__, and I know from experience it also includes __iter__ for Python 2.7.
To quote an answer to a related question:
"Magic __methods__() are treated specially: They are internally assigned to "slots" in the type data structure to speed up their look-up, and they are only looked up in these slots."
In a quest to improve my answer to another question, I need to know: Which methods, specifically, are we talking about?
You can find an answer in the python3 documentation for object.__getattribute__, which states:
Called unconditionally to implement attribute accesses for instances of the class. If the class also defines __getattr__(), the latter will not be called unless __getattribute__() either calls it explicitly or raises an AttributeError. This method should return the (computed) attribute value or raise an AttributeError exception. In order to avoid infinite recursion in this method, its implementation should always call the base class method with the same name to access any attributes it needs, for example, object.__getattribute__(self, name).
Note
This method may still be bypassed when looking up special methods as the result of implicit invocation via language syntax or built-in functions. See Special method lookup.
Also, this page explains exactly how this "machinery" works. Fundamentally, __getattribute__ is called only when you access an attribute with the . (dot) operator (and also by hasattr, as Zagorulkin pointed out).
Note that the page does not specify which special methods are implicitly looked up, so I take it that this holds for all of them (which you may find here).
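The difference between explicit access and implicit invocation can be observed with a small sketch. Logged and calls are hypothetical names, and only __len__ and __repr__ are exercised here.

```python
calls = []


class Logged:
    def __getattribute__(self, name):
        # Record every lookup that goes through __getattribute__.
        calls.append(name)
        return object.__getattribute__(self, name)

    def __len__(self):
        return 3

    def __repr__(self):
        return "Logged()"


obj = Logged()

# Explicit access with the dot operator goes through __getattribute__.
obj.__len__()
explicit_calls = list(calls)
calls.clear()

# Implicit invocation via built-ins looks the method up on the type instead.
length = len(obj)
text = repr(obj)
implicit_calls = list(calls)
```

After running this, explicit_calls contains '__len__' while implicit_calls stays empty.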
Checked in 2.7.9
Couldn't find any way to bypass the call to __getattribute__, with any of the magical methods that are found on object or type:
# Preparation step: did this from the console
# magics = set(dir(object) + dir(type))
# got 38 names, for each of the names, wrote a.<that_name> to a file
# Ended up with this:
a.__module__
a.__base__
#...
Put this at the beginning of that file, which I renamed into a proper Python module (asdf.py)
global_counter = 0
class Counter(object):
    def __getattribute__(self, name):
        # this will count how many times the method was called
        global global_counter
        global_counter += 1
        return super(Counter, self).__getattribute__(name)
a = Counter()
# after this comes the list of 38 attribute accesses
a.__module__
#...
a.__repr__
#...
print global_counter # you're not gonna like it... it printed 38
Then I also tried to get each of those names with getattr and hasattr -> same result. __getattribute__ was called every time.
So if anyone has other ideas... I was too lazy to look inside the C code for this, but I'm sure the answer lies somewhere there.
So either there's something that I'm not getting right, or the docs are lying.
super().method will also bypass __getattribute__. This atrocious code will run just fine (Python 3.11).
class Base:
    def print(self):
        print("whatever")
    def __getattribute__(self, item):
        raise Exception("Don't access this with a dot!")
class Sub(Base):
    def __init__(self):
        super().print()
a = Sub()
# prints 'whatever'
a.print()
# Exception Don't access this with a dot!
