Lazy loading of attributes - python

How would you implement lazy load of object attributes, i.e. if attributes are accessed but don't exist yet, some object method is called which is supposed to load these?
My first attempt is
def lazyload(cls):
    def _getattr(obj, attr):
        if "_loaded" not in obj.__dict__:
            obj._loaded = True
            try:
                obj.load()
            except Exception as e:
                raise Exception("Load method failed when trying to access attribute '{}' of object\n{}".format(attr, e))
            if attr not in obj.__dict__:
                # TODO: infinite recursion if obj fails
                raise AttributeError("No attribute '{}' in '{}' (after loading)".format(attr, type(obj)))
            return getattr(obj, attr)
        else:
            raise AttributeError("No attribute '{}' in '{}' (already loaded)".format(attr, type(obj)))
    cls.__getattr__ = _getattr
    return cls

@lazyload
class Test:
    def load(self):
        self.x = 1

t = Test()  # not loaded yet
print(t.x)  # will load as x isn't known yet
I will make lazyload specific to certain attribute names only.
As I haven't done much meta-classing yet, I'm not sure if that is the right approach.
What would you suggest?

Seems like a simple property would do the trick better:
@property
def my_attribute(self):
    if not hasattr(self, '_my_attribute'):
        # compute once and cache the result on the instance
        self._my_attribute = do_expensive_operation_to_get_attribute()
    return self._my_attribute
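On Python 3.8 or later, the same idea could be sketched with functools.cached_property, which stores the computed value on the instance after the first access. The class name Report, the load_count counter and the returned list are made up for illustration:

```python
from functools import cached_property


class Report:
    """Hypothetical class whose 'data' attribute is expensive to compute."""

    def __init__(self):
        self.load_count = 0  # counts how often the expensive step runs

    @cached_property
    def data(self):
        # Runs only on first access; the result is then stored in the
        # instance __dict__ under the name 'data'.
        self.load_count += 1
        return [1, 2, 3]


report = Report()
first = report.data   # triggers the computation
second = report.data  # served from the instance __dict__
```

Unlike a class-wide __getattr__ hook, this keeps the lazy behaviour limited to the attributes that are explicitly declared.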

Look at lazy from django/utils/functional.py
https://docs.djangoproject.com/en/2.1/_modules/django/utils/functional

Related

Implications of the IPython "Canary Method" and what happens if it exists?

The IPython source code includes a getattr check that tests for the existence of '_ipython_canary_method_should_not_exist_' at the beginning of the get_real_method function:
def get_real_method(obj, name):
    """Like getattr, but with a few extra sanity checks:

    - If obj is a class, ignore everything except class methods
    - Check if obj is a proxy that claims to have all attributes
    - Catch attribute access failing with any exception
    - Check that the attribute is a callable object

    Returns the method or None.
    """
    try:
        canary = getattr(obj, '_ipython_canary_method_should_not_exist_', None)
    except Exception:
        return None
    if canary is not None:
        # It claimed to have an attribute it should never have
        return None
And although it's easy enough to find other coders special-casing this name, it's harder to find any meaningful explanation of why.
Given these two classes:
from __future__ import print_function

class Parrot(object):
    def __getattr__(self, attr):
        print(attr)
        return lambda *a, **kw: print(attr, a, kw)

class DeadParrot(object):
    def __getattr__(self, attr):
        print(attr)
        if attr == '_ipython_canary_method_should_not_exist_':
            raise AttributeError(attr)
        return lambda *a, **kw: print(attr, a, kw)
It seems that IPython is using the existence or lack of this method to decide whether to use repr or one of its rich display methods. Intentionally thwarting the test in DeadParrot causes IPython to look up and invoke _repr_mimebundle_.
I'm writing an object that pretends all attrs exist. How do I decide whether to special-case this?
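One possible approach, which is not taken from the IPython documentation, is to let the proxy answer only public names and raise AttributeError for everything that starts with an underscore. The sketch below assumes that rejecting all underscore-prefixed names is acceptable for the object in question. The class name EverythingProxy is hypothetical:

```python
class EverythingProxy:
    """Hypothetical proxy that answers any public attribute name."""

    def __getattr__(self, name):
        # Refuse underscore-prefixed names so that probes such as
        # '_ipython_canary_method_should_not_exist_' or '_repr_html_'
        # fail the way they would on an ordinary object.
        if name.startswith("_"):
            raise AttributeError(name)
        return lambda *args, **kwargs: (name, args, kwargs)


proxy = EverythingProxy()
canary = getattr(proxy, "_ipython_canary_method_should_not_exist_", None)
result = proxy.anything(1, key="value")
```

This differs from DeadParrot, which rejects only the canary name and therefore still hands out a callable for names like _repr_mimebundle_.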

How to replace objects causing import errors with None during pickle load?

I have a pickled structure consisting of nested builtin primitives (list, dictionaries) and instances of classes that are not in the project anymore, that therefore cause errors during unpickling. I do not really care about those objects, I wish I could extract numerical values stored in this nested structure. Is there any way to unpickle from a file and replace everything that was broken due to import issues with, let's say, None?
I was trying to inherit from Unpickler and override find_class(self, module, name) to return Dummy if class can not be found, but for some reason I keep getting TypeError: 'NoneType' object is not callable in load reduce after that.
class Dummy(object):
    def __init__(self, *argv, **kwargs):
        pass
I tried something like
class RobustJoblibUnpickle(Unpickler):
    def find_class(self, _module, name):
        try:
            super(RobustJoblibUnpickle, self).find_class(_module, name)
        except ImportError:
            return Dummy
Maybe you can catch the exception in a try block and do what you want (set some object to None, or use a Dummy class)?
edit:
Take a look at this, I don't know if it is the right way to do it, but it seems to work fine:
import sys
import pickle
import _compat_pickle  # private helper module that pickle itself uses

class Dummy:
    pass

class MyUnpickler(pickle._Unpickler):
    def find_class(self, module, name):  # from the pickle module code but with a try
        # Subclasses may override this. # we are doing it right now...
        # Note: pickle._getattribute is a private helper whose behaviour
        # may differ between Python versions.
        try:
            if self.proto < 3 and self.fix_imports:
                if (module, name) in _compat_pickle.NAME_MAPPING:
                    module, name = _compat_pickle.NAME_MAPPING[(module, name)]
                elif module in _compat_pickle.IMPORT_MAPPING:
                    module = _compat_pickle.IMPORT_MAPPING[module]
            __import__(module, level=0)
            if self.proto >= 4:
                return pickle._getattribute(sys.modules[module], name)[0]
            else:
                return getattr(sys.modules[module], name)
        except AttributeError:
            return Dummy

# edit: as per Ben's suggestion, an even simpler subclass can be used
# instead of the above
class MyUnpickler2(pickle._Unpickler):
    def find_class(self, module, name):
        try:
            return super().find_class(module, name)
        except AttributeError:
            return Dummy

class C:
    pass

c1 = C()
with open('data1.dat', 'wb') as f:
    pickle.dump(c1, f)

del C  # simulate the missing class

with open('data1.dat', 'rb') as f:
    unpickler = MyUnpickler(f)  # or MyUnpickler2(f)
    c1 = unpickler.load()

print(c1)  # got a Dummy object because of missing class
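The example above only covers a class that was deleted from a module that still exists. When the whole module is gone, the import itself fails, so a variant could catch ImportError as well. The following is a minimal sketch under that assumption. The names TolerantUnpickler, tolerant_loads and vanished_module are hypothetical, and the missing module is simulated by registering a fake module and removing it again:

```python
import io
import pickle
import sys
import types


class Dummy:
    """Placeholder for any class that can no longer be imported."""

    def __init__(self, *args, **kwargs):
        pass


class TolerantUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        try:
            return super().find_class(module, name)
        except (ImportError, AttributeError):
            # Missing module or missing class: substitute the placeholder.
            return Dummy


def tolerant_loads(data):
    return TolerantUnpickler(io.BytesIO(data)).load()


# Simulate a class living in a module that later disappears.
fake_module = types.ModuleType("vanished_module")
Gone = type("Gone", (), {"__module__": "vanished_module"})
fake_module.Gone = Gone
sys.modules["vanished_module"] = fake_module

instance = Gone()
instance.value = 42
payload = pickle.dumps({"numbers": [1, 2.5], "obj": instance})

del sys.modules["vanished_module"]  # the module is now "missing"
restored = tolerant_loads(payload)
```

The broken objects come back as Dummy instances rather than None, so the surrounding lists and dictionaries, and the numeric values in them, remain reachable.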

Property and __getattr__ compatibility issue with AttributeError

I just encountered an unexpected behavior. This is a simple class with a __getattr__ method and a property attribute with a typo inside:
class A(object):
    def __getattr__(self, attr):
        if not attr.startswith("ignore_"):
            raise AttributeError(attr)

    @property
    def prop(self):
        return self.some_typo

a = A()        # Instantiating
a.ignore_this  # This is ignored
a.prop         # This raises an AttributeError
This is the expected outcome (the one I get if __getattr__ is commented out):
AttributeError: 'A' object has no attribute 'some_typo'
And this is what I get:
AttributeError: prop
I know this has to do with __getattr__ catching the AttributeError, but is there a nice and clean workaround for this issue? Because I can assure you, this is a debugging nightmare...
You can just raise a better exception message:
class A(object):
    def __getattr__(self, attr):
        if not attr.startswith("ignore_"):
            raise AttributeError("%r object has no attribute %r" % (self.__class__.__name__, attr))

    @property
    def prop(self):
        return self.some_typo

a = A()
a.ignore_this
a.prop
EDIT: calling __getattribute__ from object base class solves the problem
class A(object):
    def __getattr__(self, attr):
        if not attr.startswith("ignore_"):
            return self.__getattribute__(attr)

    @property
    def prop(self):
        return self.some_typo
As mentioned by @asmeurer, the solution by @mguijarr calls prop twice. When prop first runs, it raises an AttributeError which triggers __getattr__. Then self.__getattribute__(attr) triggers prop again, finally resulting in the desired exception.
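The double call can be made visible with a counter. This is a small sketch for Python 3. The getter_calls attribute is added purely for illustration and is not part of the original answer:

```python
class A:
    getter_calls = 0  # class-level counter, for illustration only

    def __getattr__(self, attr):
        if not attr.startswith("ignore_"):
            return self.__getattribute__(attr)

    @property
    def prop(self):
        A.getter_calls += 1
        return self.some_typo  # the typo that raises AttributeError


try:
    A().prop
except AttributeError as exc:
    message = str(exc)
```

After this runs, A.getter_calls is 2 and the message names some_typo, so the error is informative but any side effect in the getter happens twice.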
BETTER ANSWER:
Here we are better off replacing __getattribute__ instead of __getattr__. It gives us more control since __getattribute__ is invoked on all attribute access. In contrast, __getattr__ is only called when there has already been an AttributeError, and it doesn't give us access to that original error.
class A(object):
    def __getattribute__(self, attr):
        try:
            return super().__getattribute__(attr)
        except AttributeError as e:
            if not attr.startswith("ignore_"):
                raise e

    @property
    def prop(self):
        print("hi")
        return self.some_typo
To explain, since A subclasses object in this case, super().__getattribute__(attr) is equivalent to object.__getattribute__(self, attr). That reads a's underlying object attribute, avoiding the infinite recursion had we instead used self.__getattribute__(attr).
In case of AttributeError, we have full control to either fail or reraise, and reraising gives a sensible error message.

How do I instantiate a specific subclass based on arguments passed to __init__?

I'm wrapping a remote XML-based API from python 2.7. The API throws errors by sending along a <statusCode> element as well as a <statusDescription> element. Right now, I catch this condition and raise a single exception type. Something like:
class ApiError(Exception):
    pass

def process_response(response):
    if not response.success:
        raise ApiError(response.statusDescription)
This works fine, except I now want to handle errors in a more sophisticated fashion. Since I have the statusCode element, I would like to raise a specific subclass of ApiError based on the statusCode. Effectively, I want my wrapper to be extended like this:
class ApiError(Exception):
    def __init__(self, description, code):
        # How do I change self to be a different type?
        if code == 123:
            return NotFoundError(description, code)
        elif code == 456:
            return NotWorkingError(description, code)

class NotFoundError(ApiError):
    pass

class NotWorkingError(ApiError):
    pass

def process_response(response):
    if not response.success:
        raise ApiError(response.statusDescription, response.statusCode)

def uses_the_api():
    try:
        response = call_remote_api()
    except NotFoundError, e:
        handle_not_found(e)
    except NotWorkingError, e:
        handle_not_working(e)
The machinery for tying specific statusCodes to specific subclasses is straightforward. But what I want is for that to be buried inside of ApiError somewhere. Specifically, I don't want to change process_response except to pass in the statusCode value.
I've looked at metaclasses, but not sure they help the situation, since __new__ gets write-time arguments, not run-time arguments. Similarly unhelpful is hacking around __init__ since it isn't intended to return an instance. So, how do I instantiate a specific subclass based on arguments passed to __init__?
A factory function is going to be much easier to understand. Use a dictionary to map codes to exception classes:
exceptions = {
    123: NotFoundError,
    456: NotWorkingError,
    # ...
}

def exceptionFactory(description, code):
    return exceptions[code](description, code)
Create a function that returns the appropriate error instance based on the code.
Something like this:
def get_valid_exception(description, code):
    if code == 123:
        return NotFoundError(description, code)
    elif code == 456:
        return NotWorkingError(description, code)
Depending on your requirements and future changes, you could create exceptions with different arguments or do anything else, without affecting code that uses this function.
Then in your code you can use it like this:
def process_response(response):
    if not response.success:
        raise get_valid_exception(response.statusDescription, response.statusCode)
You could create a series of subclasses and use the base class' __new__ as a factory for the children. However, that's probably overkill here; you could just create a simple factory method or class. If you wanted to get fancy in another direction though, you could create a metaclass for the base class that would automatically add your subclasses to a factory when they are created. Something like:
class ApiErrorRegistry(type):
    code_map = {}

    def __new__(cls, name, bases, attrs):
        try:
            mapped_code = attrs.pop('__code__')
        except KeyError:
            if name != 'ApiError':
                raise TypeError('ApiError subclasses must define a __code__.')
            mapped_code = None
        new_class = super(ApiErrorRegistry, cls).__new__(cls, name, bases, attrs)
        if mapped_code is not None:
            ApiErrorRegistry.code_map[mapped_code] = new_class
        return new_class

def build_api_error(description, code):
    try:
        return ApiErrorRegistry.code_map[code](description, code)
    except KeyError:
        raise ValueError('No error for code %s registered.' % code)

class ApiError(Exception):
    __metaclass__ = ApiErrorRegistry

class NotFoundError(ApiError):
    __code__ = 123

class NotWorkingError(ApiError):
    __code__ = 456

def process_response(response):
    if not response.success:
        raise build_api_error(response.statusDescription, response.statusCode)

def uses_the_api():
    try:
        response = call_remote_api()
    except ApiError as e:
        handle_error(e)
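For comparison, the other option mentioned in this answer, using the base class' __new__ as a factory, might look roughly like the sketch below. It is written for Python 3 rather than the 2.7 of the question. The subclasses_by_code mapping is a hypothetical name, and falling back to plain ApiError for unknown codes is an assumption:

```python
class ApiError(Exception):
    # Maps status codes to subclasses; filled in once the subclasses exist.
    subclasses_by_code = {}

    def __new__(cls, description, code):
        # Only redirect when the base class itself is being instantiated.
        target = cls
        if cls is ApiError:
            target = ApiError.subclasses_by_code.get(code, ApiError)
        return super().__new__(target, description, code)

    def __init__(self, description, code):
        super().__init__(description)
        self.code = code


class NotFoundError(ApiError):
    pass


class NotWorkingError(ApiError):
    pass


ApiError.subclasses_by_code = {123: NotFoundError, 456: NotWorkingError}

try:
    raise ApiError("no such record", 123)
except NotFoundError as exc:
    caught = exc
```

Because the object returned by __new__ is still an instance of ApiError, Python goes on to call __init__ on it, so process_response can keep raising ApiError(description, code) unchanged.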

Python: Why is __getattr__ catching AttributeErrors?

I'm struggling with __getattr__. I have a complex recursive codebase, where it is important to let exceptions propagate.
class A(object):
    @property
    def a(self):
        raise AttributeError('lala')

    def __getattr__(self, name):
        print('attr: ', name)
        return 1

print(A().a)
Results in:
('attr: ', 'a')
1
Why this behaviour? Why is no exception thrown? This behaviour is not documented (__getattr__ documentation). getattr() could just use A.__dict__. Any thoughts?
I just changed the code to
class A(object):
    @property
    def a(self):
        print "trying property..."
        raise AttributeError('lala')

    def __getattr__(self, name):
        print('attr: ', name)
        return 1

print(A().a)
and, as we see, indeed the property is tried first. But as it claims not to be there (by raising AttributeError), __getattr__() is called as "last resort".
It is not documented clearly, but can maybe be counted under "Called when an attribute lookup has not found the attribute in the usual places".
Using __getattr__ and properties in the same class is dangerous, because it can lead to errors that are very difficult to debug.
If the getter of a property throws AttributeError, then the AttributeError is silently caught, and __getattr__ is called. Usually, this causes __getattr__ to fail with an exception, but if you are extremely unlucky, it doesn't, and you won't even be able to easily trace the problem back to __getattr__.
EDIT: Example code for this problem can be found in this answer.
Unless your property getter is trivial, you can never be 100% sure it won't throw AttributeError. The exception may be thrown several levels deep.
Here is what you could do:
1. Avoid using properties and __getattr__ in the same class.
2. Add a try ... except block to all property getters that are not trivial.
3. Keep property getters simple, so you know they won't throw AttributeError.
4. Write your own version of the @property decorator, which catches AttributeError and re-throws it as RuntimeError.
See also http://blog.devork.be/2011/06/using-getattr-and-property_17.html
EDIT: In case anyone is considering solution 4 (which I don't recommend), it can be done like this:
import sys

def property_(f):
    def getter(*args, **kwargs):
        try:
            return f(*args, **kwargs)
        except AttributeError as e:
            # Python 2 three-argument raise: re-raise as RuntimeError, keeping the traceback
            raise RuntimeError, "Wrapped AttributeError: " + str(e), sys.exc_info()[2]
    return property(getter)
Then use @property_ instead of @property in classes that override __getattr__.
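The three-argument raise above is Python 2 syntax. On Python 3 the same idea could be sketched with exception chaining. The name strict_property and the Example class are hypothetical and only illustrate the technique:

```python
import functools


def strict_property(getter):
    """Like property, but an AttributeError in the getter becomes RuntimeError."""

    @functools.wraps(getter)
    def wrapper(self):
        try:
            return getter(self)
        except AttributeError as exc:
            # 'from exc' keeps the original exception attached as __cause__.
            raise RuntimeError("Wrapped AttributeError: {}".format(exc)) from exc

    return property(wrapper)


class Example:
    def __getattr__(self, name):
        return 1  # pretends every missing attribute exists

    @strict_property
    def value(self):
        raise AttributeError("lala")


try:
    Example().value
    outcome = "fallback"
except RuntimeError as exc:
    outcome = "runtime error"
    cause = exc.__cause__
```

Since the getter no longer lets an AttributeError escape, the attribute machinery never falls back to __getattr__ and the original error stays visible in the traceback.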
__getattribute__ documentation says:
If the class also defines __getattr__(), the latter will not be called unless __getattribute__() either calls it explicitly or raises an AttributeError.
I read this (by inclusio unius est exclusio alterius) as saying that attribute access will call __getattr__ if object.__getattribute__ (which is "called unconditionally to implement attribute accesses") happens to raise AttributeError - whether directly or inside a descriptor __get__ (e.g. a property fget); note that __get__ should "return the (computed) attribute value or raise an AttributeError exception".
As an analogy, operator special methods can return NotImplemented, whereupon the other operator methods (e.g. __radd__ for __add__) will be tried.
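To make the operator analogy concrete: a binary operator method signals "not handled here" by returning NotImplemented, and Python then tries the reflected method on the other operand. The Meters class below is a made-up example:

```python
class Meters:
    """Hypothetical value type used only to show the fallback."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        return NotImplemented  # signals "try the other operand"

    def __radd__(self, other):
        if isinstance(other, (int, float)):
            return Meters(other + self.value)
        return NotImplemented


# int.__add__ returns NotImplemented for a Meters operand, so __radd__ runs.
total = 2 + Meters(3)
```

The parallel with attribute lookup is that in both cases a "not handled" signal from the first mechanism causes Python to try a second one instead of failing immediately.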
__getattr__ is called when an attribute access fails with an AttributeError. Maybe this is why you think it 'catches' the errors. However, it doesn't: it's Python's attribute access machinery that catches them and then calls __getattr__.
__getattr__ itself doesn't catch any errors. If you raise an AttributeError in __getattr__, it simply propagates to the caller; infinite recursion only occurs if __getattr__ itself accesses a missing attribute on self.
I regularly run into this problem because I implement __getattr__ a lot and have lots of @property methods. Here's a decorator I came up with to get a more useful error message:
import functools

def replace_attribute_error_with_runtime_error(f):
    @functools.wraps(f)
    def wrapped(*args, **kwargs):
        try:
            return f(*args, **kwargs)
        except AttributeError as e:
            # logging.exception(e)
            raise RuntimeError(
                '{} failed with an AttributeError: {}'.format(f.__name__, e)
            )
    return wrapped
And use it like this:
class C(object):
    def __getattr__(self, name):
        ...

    @property
    @replace_attribute_error_with_runtime_error
    def complicated_property(self):
        ...

    ...
The error message of the underlying exception will include the name of the class whose instance raised the underlying AttributeError.
You can also log it if you want to.
You're doomed anyway when you combine @property with __getattr__:
class Paradise:
    pass

class Earth:
    @property
    def life(self):
        print('Checking for paradise (just for fun)')
        return Paradise.breasts

    def __getattr__(self, item):
        print("sorry! {} does not exist in Earth".format(item))

earth = Earth()
try:
    print('Life in earth: ' + str(earth.life))
except AttributeError as e:
    print('Exception found!: ' + str(e))
Gives the following output:
Checking for paradise (just for fun)
sorry! life does not exist in Earth
Life in earth: None
When your real problem was with calling Paradise.breasts.
__getattr__ is always called when an AttributeError is raised. The content of the exception is ignored.
The sad thing is that there is no general solution to this problem: hasattr(earth, 'life') returns True (just because __getattr__ is defined), yet __getattr__ is still invoked for 'life' as if the attribute didn't exist, whereas the real underlying problem is with Paradise.breasts.
My partial solution involves using a try-except in @property blocks which are known to hit upon AttributeError exceptions.
