Python object proxying: how to access the proxy

I found this recipe to create a proxy class. I've used it to wrap a custom object and would like to overload certain properties and also attach new attributes to the proxy. However, when I call any method on the proxy (from within the proxy class), I end up being delegated to the wrappee which is not what I want.
Is there any way of accessing or storing a reference to the proxy?
Here's some code (untested) to demonstrate the problem.
class MyObject(object):
    @property
    def value(self):
        return 42

class MyObjectProxy(Proxy):  # see the link above
    def __getattribute__(self, attr):
        # the problem is that `self` refers to the proxied
        # object and thus this throws an AttributeError. How
        # can I reference MyObjectProxy.another_value()?
        if attr == 'value': return self.another_value() # return method or attribute, doesn't matter (same effect)
        return super(MyObjectProxy, self).__getattribute__(attr)

    def another_value(self):
        return 21
o = MyObject()
p = MyObjectProxy(o)
print o.value
print p.value
In a sense my problem is that the proxy works too well, hiding all of its own methods/attributes and posing as the proxied object (which is what it should do)...
Update
Based on the comments below, I changed __getattribute__ to this:
def __getattribute__(self, attr):
    try:
        return object.__getattribute__(self, attr)
    except AttributeError:
        return super(MyObjectProxy, self).__getattribute__(attr)
This seems to do the trick for now, but it would be better to add this directly to the Proxy class.
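For reference, here is a minimal sketch of a Proxy with that local-first lookup baked in. This is a heavily simplified stand-in for the recipe's class (the real recipe also forwards special methods and caches generated classes), meant only to show the lookup order:

```python
class Proxy(object):
    def __init__(self, obj):
        object.__setattr__(self, "_obj", obj)

    def __getattribute__(self, attr):
        # Try the proxy's own attributes first...
        try:
            return object.__getattribute__(self, attr)
        except AttributeError:
            # ...then fall back to the wrapped object.
            return getattr(object.__getattribute__(self, "_obj"), attr)

class MyObject(object):
    @property
    def value(self):
        return 42

class MyObjectProxy(Proxy):
    def another_value(self):
        return 21

p = MyObjectProxy(MyObject())
print(p.another_value())  # found on the proxy itself -> 21
print(p.value)            # delegated to the wrapped object -> 42
```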

Your code goes wrong because of the loop in __getattribute__. You override __getattribute__ so you can reach certain properties on the proxy class itself, but let's trace what actually happens.
When you call p.value, __getattribute__ is invoked and reaches if attr == 'value': return self.another_value(). To call another_value we enter __getattribute__ again.
This time we reach return super(MyObjectProxy, self).__getattribute__(attr). That calls the Proxy's __getattribute__, which tries to fetch another_value on MyObject, so the exception occurs.
You can see from the traceback that we end up at return super(MyObjectProxy, self).__getattribute__(attr), which is exactly where we should not go.
Traceback (most recent call last):
  File "proxytest.py", line 22, in <module>
    print p.value
  File "proxytest.py", line 13, in __getattribute__
    if attr == 'value': return self.another_value() # return method or attribute, doesn't matter (same effect)
  File "proxytest.py", line 14, in __getattribute__
    return super(MyObjectProxy, self).__getattribute__(attr)
  File "/home/hugh/m/tspace/proxy.py", line 10, in __getattribute__
    return getattr(object.__getattribute__(self, "_obj"), name)
AttributeError: 'MyObject' object has no attribute 'another_value'
edit:
Change the line of code if attr == 'value': return self.another_value() to if attr == 'value': return object.__getattribute__(self, 'another_value')().

When using multiprocessing and spawn in Python, using self.a in __getattr__ causes an infinite loop

The following code reproduces the bug:
from multiprocessing import Process, set_start_method

class TestObject:
    def __init__(self) -> None:
        self.a = lambda *args: {}

    def __getattr__(self, item):
        return self.a

class TestProcess(Process):
    def __init__(self, testobject, **kwargs):
        super(TestProcess, self).__init__(**kwargs)
        self.testobject = testobject

    def run(self) -> None:
        print("heihei")
        print(self.testobject)

if __name__ == "__main__":
    set_start_method("spawn")
    testobject = TestObject()
    testprocess = TestProcess(testobject)
    testprocess.start()
Using 'spawn' causes an infinite loop in TestObject.__getattr__.
When the line set_start_method('spawn') is deleted, everything works.
We would be very thankful to know why the infinite loop happens.
If you head over to pickle's documentation, you will find a note that says:
At unpickling time, some methods like __getattr__(), __getattribute__(), or __setattr__() may be called upon the instance. In case those methods rely on some internal invariant being true, the type should implement __new__() to establish such an invariant, as __init__() is not called when unpickling an instance.
I am unsure of what exact conditions leads to a __getattribute__ call, but you can bypass the default behaviour by providing a __setstate__ method:
class TestObject:
    def __init__(self) -> None:
        self.a = lambda *args: {}

    def __getattr__(self, item):
        return self.a

    def __setstate__(self, state):
        self.__dict__ = state
If it's present, pickle calls this method with the unpickled state and you are free to restore it however you wish.
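A small, self-contained illustration of that hook, independent of multiprocessing (class and attribute names are just for demonstration):

```python
import pickle

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __setstate__(self, state):
        # pickle hands us the dict captured at dump time;
        # __init__ is NOT called when unpickling.
        print("restoring from", state)
        self.__dict__ = state

p = pickle.loads(pickle.dumps(Point(1, 2)))
print(p.x, p.y)  # 1 2
```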
Now let's figure out what is really happening in this bug.
Before we look into the code, we should know two things:
First, when we define a __getattr__ method for our class, we should never access an attribute inside __getattr__ that may not exist on the class or the instance itself; otherwise it causes an infinite loop. For example:
class TestObject:
    def __getattr__(self, item):
        return self.a

if __name__ == "__main__":
    testobject = TestObject()
    print(f"print a: {testobject.a}")
The result should be like this:
Traceback (most recent call last):
  File "tmp_test.py", line 10, in <module>
    print(f"print a: {testobject.a}")
  File "tmp_test.py", line 6, in __getattr__
    return self.a
  File "tmp_test.py", line 6, in __getattr__
    return self.a
  File "tmp_test.py", line 6, in __getattr__
    return self.a
  [Previous line repeated 996 more times]
RecursionError: maximum recursion depth exceeded
Because a is not in the instance's __dict__, every lookup of a fails and re-enters the __getattr__ method, causing the infinite loop.
The second thing we should remember is how the pickle module works. When pickling and unpickling a class's instance, the dumps and loads functions (same for dump and load) call the instance's __getstate__ (for dumps) and __setstate__ (for loads) methods. And guess where Python looks when our class does not define these two methods? Yes, the __getattr__ method! Normally this is fine when pickling the instance, because at that point the attributes used in __getattr__ still exist. But when unpickling, things go wrong.
The pickle module documentation describes how class instances are pickled: https://docs.python.org/3/library/pickle.html#pickling-class-instances.
And here is what we should notice:
When unpickling a class's instance, pickle does not call the __init__ function to create the instance! So when unpickling, pickle's loads function checks whether the re-instantiated instance has a __setstate__ method, and as we said above, this goes into the __getattr__ method. But at that point the attributes the instance once owned have not yet been restored (that happens at obj.__dict__.update(attributes)), so bingo, the infinite loop bug appears!
To reproduce the whole exact bug, you can run this code:
import pickle
class TestClass:
def __init__(self):
self.w = 1
class Test:
def __init__(self):
self.a = TestClass()
def __getattr__(self, item):
print(f"{item} begin.")
print(self.a)
print(f"{item} end.")
try:
return self.a.__getattribute__(item)
except AttributeError as e:
raise e
# def __getstate__(self):
# return self.__dict__
#
# def __setstate__(self, state):
# self.__dict__ = state
if __name__ == "__main__":
test = Test()
print(test.w)
test_data = pickle.dumps(test)
new_test = pickle.loads(test_data)
print(new_test.w)
You get the infinite recursion while the __getstate__ and __setstate__ methods are commented out; adding them fixes it. You can also watch the print output to see that the recursion starts at __getattr__('__setstate__').
The connection between this pickle bug and our multiprocessing bug from the beginning is that when using 'spawn', the child process pickles the parent process's context and then unpickles it to inherit it. So now everything makes sense.

Accessing `super()` inside python class property

I'm learning about python's inheritance, and have come across a behaviour I don't quite understand. Here is a minimal working example:
class Test():
    def meth1(self):
        print('accessing meth1')
        return super().a  # accessing random nonexistent attribute; error (as expected)

    @property
    def prop1(self):
        print('accessing prop1')
        return super().a  # accessing random nonexistent attribute; no error?

    def __getattr__(self, name):
        print('getattr ' + name)

test = Test()
Calling .meth1() fails as expected...
In [1]: test.meth1()
accessing meth1
Traceback (most recent call last):
  File "<ipython-input-160-4a0675c95211>", line 1, in <module>
    test.meth1()
  File "<ipython-input-159-1401fb9a0e13>", line 5, in meth1
    return super().a  # accessing random nonexistent attribute; error (as expected)
AttributeError: 'super' object has no attribute 'a'
...as the super() proxy delegates to object, which indeed does not have this attribute.
But .prop1 does not...
In [2]: test.prop1
accessing prop1
getattr prop1
...which I don't understand. It seems the property is called twice, once 'normally' and once via __getattr__.
Some observations:
I assume it's got something to do with the property decorator.
The attribute .a seems to never be accessed.
If I replace the return super().a line in prop1 with something like return 5, the __getattr__ method is never called.
If I actually make Test inherit from a class having an attribute a, its value is returned from test.meth1(), but not from test.prop1.
Could someone explain what's going on here? I've not been able to find any useful information addressing the combination of attribute decorators and super().
Many thanks,
TLDR: meth1 raises AttributeError after lookup, when __getattr__ is not involved. prop1 raises AttributeError during lookup, triggering a fallback to __getattr__, which succeeds and (implicitly) returns None.
>>> test.prop1 # AttributeError happens here during lookup
accessing prop1
getattr prop1
>>> meth = test.meth1 # AttributeError happens *not* here during lookup
>>> meth() # AttributeError happens here *after* lookup
...
AttributeError: 'super' object has no attribute 'a'
The __getattr__ method is only called when an "attribute is not found" – in other words, when AttributeError is raised on access. The same behaviour occurs when the property raises the error directly:
class Test():
    @property
    def prop1(self):
        print('accessing prop1')
        raise AttributeError  # replaces `super().a`

    def __getattr__(self, name):
        print('getattr ' + name)

test = Test()
test.prop1  # < explicitly raises AttributeError
# accessing prop1
# getattr prop1
test.prop2  # < implicitly raises AttributeError
# getattr prop2
The AttributeError does not reveal whether it comes from a missing prop1 attribute or some nested internal attribute (say, super().a). Thus, both trigger the fallback to __getattr__.
This is intended behaviour of __getattr__.
object.__getattr__(self, name)
Called when the default attribute access fails with an AttributeError (either __getattribute__() raises an AttributeError because name is not an instance attribute or an attribute in the class tree for self; or __get__() of a name property raises AttributeError).
It allows properties to fall back to the regular lookup mechanism when they cannot produce a value.
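When this masking is unwanted, a common defensive pattern is to catch the AttributeError inside the property and re-raise it as a different exception type, so the failure is not silently swallowed by __getattr__. A sketch (the RuntimeError wrapping is my choice of pattern, not something from the question):

```python
class Test:
    @property
    def prop1(self):
        try:
            return super().a  # raises AttributeError: object has no 'a'
        except AttributeError as e:
            # Re-raise as a non-AttributeError so the failure surfaces
            # instead of triggering the __getattr__ fallback.
            raise RuntimeError("prop1 failed") from e

    def __getattr__(self, name):
        print('getattr ' + name)

t = Test()
try:
    t.prop1
except RuntimeError as e:
    print(e)  # prop1 failed -- __getattr__ was never consulted
```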

Implications of the IPython "Canary Method" and what happens if it exists?

The IPython source code includes a getattr check that tests for the existence of '_ipython_canary_method_should_not_exist_' at the beginning of the get_real_method function:
def get_real_method(obj, name):
    """Like getattr, but with a few extra sanity checks:

    - If obj is a class, ignore everything except class methods
    - Check if obj is a proxy that claims to have all attributes
    - Catch attribute access failing with any exception
    - Check that the attribute is a callable object

    Returns the method or None.
    """
    try:
        canary = getattr(obj, '_ipython_canary_method_should_not_exist_', None)
    except Exception:
        return None
    if canary is not None:
        # It claimed to have an attribute it should never have
        return None
And although it's easy enough to find other coders special-casing this name, it's harder to find any meaningful explanation of why.
Given these two classes:
from __future__ import print_function

class Parrot(object):
    def __getattr__(self, attr):
        print(attr)
        return lambda *a, **kw: print(attr, a, kw)

class DeadParrot(object):
    def __getattr__(self, attr):
        print(attr)
        if attr == '_ipython_canary_method_should_not_exist_':
            raise AttributeError(attr)
        return lambda *a, **kw: print(attr, a, kw)
It seems that IPython is using the existence or lack of this method to decide whether to use repr or one of its rich display methods. Intentionally thwarting the test in DeadParrot causes IPython to look up and invoke _repr_mimebundle_.
I'm writing an object that pretends all attrs exist. How do I decide whether to special-case this?
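One common approach (a heuristic, not an IPython-documented contract) is to refuse any name that looks like a protocol probe, e.g. everything starting with an underscore, so the canary, _repr_html_-style hooks, and dunder lookups all raise AttributeError while ordinary fabricated attributes keep working:

```python
class Parrot(object):
    def __getattr__(self, attr):
        # Refuse underscore-prefixed names: this makes the object honest
        # about IPython's canary and other _repr_*_ / dunder probes.
        if attr.startswith('_'):
            raise AttributeError(attr)
        return lambda *a, **kw: (attr, a, kw)

p = Parrot()
print(p.squawk('loudly'))  # fabricated method still works
print(hasattr(p, '_ipython_canary_method_should_not_exist_'))  # False
```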

Why can I call all these methods that aren't explicitly defined in a class?

So I am working with an API wrapper in python for vk, Europe's Facebook equivalent. The documentation on the vk site has all the API calls that can be used, and the wrapper is able to call them correctly. For example, to get a user's information, you would call api.users.get(id) to get a user by id. My question is this: how can the wrapper correctly handle such a call when neither users nor a corresponding users.get() method is defined inside the api object?
I know it involves the __getattr__() and __call__() methods, but I can't find any good documentation on how to use them in this way.
EDIT
the api object is instantiated via api = vk.API(id, email, password)
Let's walk through this together, shall we?
api
To execute api.users.get(), Python first has to know api. And due to your instantiation, it does know it: It's a local variable holding an instance of APISession.
api.users
Then, it has to know api.users. Python first looks at the members of the api instance, at the members of its class (APISession) and at the members of that class's super-classes (only object in the case of APISession). Failing to find a member called users in any of these places, it looks for a member function called __getattr__ in those same places. It will find one, because APISession defines a member function of this name.
Python then calls it with 'users' (the name of the so-far missing member) and uses the function's return value as if it were that member. So
api.users
is equivalent to
api.__getattr__('users')
Let's see what that returns.
def __getattr__(self, method_name):
    return APIMethod(self, method_name)
Oh. So
api.users # via api.__getattr__('users')
is equivalent to
APIMethod(api, 'users')
creating a new APIMethod instance.
api and 'users' end up as that instance's _api_session and _method_name members. Makes sense, I guess.
api.users.get
Python still hasn't executed our statement. It needs to know api.users.get() to do so. The same game as before repeats, just in the api.users object instead of the api object this time: No member method get() and no member get is found on the APIMethod instance api.users points to, nor on its class or superclasses, so Python turns to the __getattr__ method, which for this class does something peculiar:
def __getattr__(self, method_name):
    return APIMethod(self._api_session, self._method_name + '.' + method_name)
A new instance of the same class! Let's plug in the instance members of api.users, and
api.users.get
becomes equivalent to
APIMethod(api, 'users' + '.' + 'get')
So we will have the api object in api.users.get's _api_session and the string 'users.get' in its _method_name.
api.users.get() (note the ())
So api.users.get is an object. To call it, Python has to pretend it's a function, or more specifically, a method of api.users. It does so, by instead calling api.users.get's __call__ method, which looks like this:
def __call__(self, **method_kwargs):
    return self._api_session(self._method_name, **method_kwargs)
Let's work this out:
api.users.get()
# is equivalent to
api.users.get.__call__() # no arguments, because we passed none to `get()`
# will return
api.users.get._api_session(api.users.get._method_name)
# which is
api('users.get')
So now Python is calling the api object as if it were a function. __call__ to the rescue, once more, this time looking like this:
def __call__(self, method_name, **method_kwargs):
    response = self.method_request(method_name, **method_kwargs)
    response.raise_for_status()

    # there may be 2 dicts in 1 json
    # for example: {'error': ...}{'response': ...}
    errors = []
    error_codes = []
    for data in json_iter_parse(response.text):
        if 'error' in data:
            error_data = data['error']
            if error_data['error_code'] == CAPTCHA_IS_NEEDED:
                return self.captcha_is_needed(error_data, method_name, **method_kwargs)
            error_codes.append(error_data['error_code'])
            errors.append(error_data)
        if 'response' in data:
            for error in errors:
                warnings.warn(str(error))
            return data['response']

    if AUTHORIZATION_FAILED in error_codes:  # invalid access token
        self.access_token = None
        self.get_access_token()
        return self(method_name, **method_kwargs)
    else:
        raise VkAPIMethodError(errors[0])
Now, that's a lot of error handling. For this analysis, I'm only interested in the happy path and its result (and how we got there). So let's start at the result.
return data['response']
Where did data come from? It's the first element of response.text interpreted as JSON that does contain a 'response' object. So it seems that from the response object we got, we're extracting the actual response part.
Where did the response object come from? It was returned by
api.method_request('users.get')
Which, for all we care, is a plain normal method call that doesn't require any fancy fallbacks. (Its implementation of course, on some levels, might.)
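The whole delegation chain above can be condensed into a runnable sketch. This is a simplified stand-in for vk's classes, not its actual code: the fake session below just records the resolved method name instead of doing an HTTP POST.

```python
class APIMethod:
    def __init__(self, session, name):
        self._api_session = session
        self._method_name = name

    def __getattr__(self, name):
        # Each attribute access extends the dotted method path.
        return APIMethod(self._api_session, self._method_name + '.' + name)

    def __call__(self, **kwargs):
        # Calling the leaf hands the full dotted name back to the session.
        return self._api_session(self._method_name, **kwargs)

class FakeAPISession:
    def __getattr__(self, name):
        return APIMethod(self, name)

    def __call__(self, method_name, **kwargs):
        # Real code would POST here; we just echo the resolved call.
        return ('called', method_name, kwargs)

api = FakeAPISession()
print(api.users.get(user_id=1))  # ('called', 'users.get', {'user_id': 1})
```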
Assuming the comments are correct, and api is an instance of APISession as defined in this particular commit, then this is a bit of an interesting maze:
So first you want to access api.users. APISession has no such attribute, so it calls __getattr__('users') instead, which is defined as:
def __getattr__(self, method_name):
    return APIMethod(self, method_name)
So this constructs an APIMethod(api, 'users'). Now you want to call the method get on the APIMethod(api, 'users') that is bound to api.users, but an APIMethod doesn't have a get method, so it calls its own __getattr__('get') to figure out what to do:
def __getattr__(self, method_name):
    return APIMethod(self._api_session, self._method_name + '.' + method_name)
This returns an APIMethod(api, 'users.get') which is then called, invoking the __call__ method of the APIMethod class, which is:
def __call__(self, **method_kwargs):
    return self._api_session(self._method_name, **method_kwargs)
So this tries to return api('users.get'), but api is an APISession object, so it invokes the __call__ method of this class, defined as (stripping out the error handling for simplicity):
def __call__(self, method_name, **method_kwargs):
    response = self.method_request(method_name, **method_kwargs)
    response.raise_for_status()
    for data in json_iter_parse(response.text):
        if 'response' in data:
            return data['response']
So it then calls a method_request('users.get'), which if you follow that method actually does a POST request, and some data comes back as a response, which is then returned.
The users.get() call has nothing to do with attributes actually defined on the api object. As for users: you are right, if no such member is defined, then there is certainly some logic inside __getattr__. As you can see in the documentation, __getattr__ is...
Called when an attribute lookup has not found the attribute in the usual places (i.e. it is not an instance attribute nor is it found in the class tree for self). name is the attribute name.
So exactly: as there is no users defined on api's class, __getattr__ is called with 'users' passed as the name parameter. Then, most probably dynamically depending on the passed parameter, an object is constructed for the users component and returned, which will be responsible for defining or handling the get() method in a similar way.
To get the whole idea, try the following:
class A(object):
    def __init__(self):
        super(A, self).__init__()
        self.defined_one = 'I am defined inside A'

    def __getattr__(self, item):
        print('getting attribute {}'.format(item))
        return 'I am ' + item

a = A()
>>> print(a.some_item) # this will call __getattr__ as some_item is not defined
getting attribute some_item
I am some_item
>>> print(a.and_another_one) # this will call __getattr__ as and_another_one is not defined
getting attribute and_another_one
I am and_another_one
>>> print(a.defined_one) # this will NOT call __getattr__ as defined_one is defined in A
I am defined inside A

How to make a Python subclass uncallable

How do you "disable" the __call__ method on a subclass so the following would be true:
class Parent(object):
    def __call__(self):
        return

class Child(Parent):
    def __init__(self):
        super(Child, self).__init__()
        object.__setattr__(self, '__call__', None)
>>> c = Child()
>>> callable(c)
False
This and other ways of trying to set __call__ to some non-callable value still result in the child appearing as callable.
You can't. As jonrsharpe points out, there's no way to make Child appear to not have the attribute, and that's what callable(Child()) relies on to produce its answer. Even making it a descriptor that raises AttributeError won't work, per this bug report: https://bugs.python.org/issue23990 . A python 2 example:
>>> class Parent(object):
...     def __call__(self): pass
...
>>> class Child(Parent):
...     __call__ = property()
...
>>> c = Child()
>>> c()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: unreadable attribute
>>> c.__call__
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: unreadable attribute
>>> callable(c)
True
This is because callable(...) doesn't act out the descriptor protocol. Actually calling the object, or accessing a __call__ attribute, involves retrieving the method even if it's behind a property, through the normal descriptor protocol. But callable(...) doesn't bother going that far, if it finds anything at all it is satisfied, and every subclass of Parent will have something for __call__ -- either an attribute in a subclass, or the definition from Parent.
So while you can make actually calling the instance fail with any exception you want, you can't ever make callable(some_instance_of_parent) return False.
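The type-level lookup behind this is easy to demonstrate: special methods are looked up on the type, and callable() effectively consults the type, so even an instance-level __call__ attribute does not count:

```python
class A(object):
    pass

a = A()
a.__call__ = lambda: 42   # instance attribute; ignored by the call machinery
print(callable(a))        # False: callable() consults type(a), not a.__dict__

class B(object):
    def __call__(self):
        return 42

print(callable(B()))      # True: __call__ lives on the class
```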
It's a bad idea to change the public interface of the class so radically between the parent and the child.
As pointed out elsewhere, you can't uninherit __call__. If you really need to mix callable and non-callable classes, you should use another test (such as a class attribute) or simply make the non-functional variants safe to call.
To do the latter, you could override __call__ to raise NotImplementedError (or better, a custom exception of your own) if for some reason you wanted to mix a non-callable class in with the callable variants:
class Parent(object):
    def __call__(self):
        print "called"

class Child(Parent):
    def __call__(self):
        raise NotACallableInstanceException()

for child_or_parent in list_of_children_and_parents():
    try:
        child_or_parent()
    except NotACallableInstanceException:
        pass
Or, just override __call__ with pass:
class Parent(object):
    def __call__(self):
        print "called"

class Child(Parent):
    def __call__(self):
        pass
Which will still be callable but just be a nullop.
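The "another test" alternative mentioned above can be sketched with a class attribute flag; the names here are illustrative, not from the question:

```python
class Parent(object):
    is_callable = True  # callers agree to consult this flag

    def __call__(self):
        return "called"

class Child(Parent):
    # __call__ is still inherited (callable() will still say True),
    # but the flag tells cooperating callers to skip the call.
    is_callable = False

def maybe_call(obj):
    return obj() if obj.is_callable else None

print(maybe_call(Parent()))  # called
print(maybe_call(Child()))   # None
```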
