Python multiprocessing managers with context manager and properties

I am using the functionality in the multiprocessing package to create synchronized shared objects. My objects have property attributes and are also context managers (i.e. they have __enter__ and __exit__ methods).
I've come across a curiosity where I can't make both work at the same time, at least with the recipes I found online, in either Python 2 or 3.
Suppose this simple class being registered into a manager:
class Obj(object):
    @property
    def a(self): return 1
    def __enter__(self): return self
    def __exit__(self, *args, **kw): pass
Normally both won't work because what we need is not exposed:
from multiprocessing.managers import BaseManager, NamespaceProxy
BaseManager.register('Obj', Obj)
m = BaseManager(); m.start();
o = m.Obj()
o.a # AttributeError: 'AutoProxy[Obj]' object has no attribute 'a'
with o: pass # AttributeError: __exit__
A solution I found on SO that uses a custom proxy instead of AutoProxy works for the property but not for the context manager (no matter whether __enter__ and __exit__ are exposed this way or not):
class MyProxy(NamespaceProxy):
    _exposed_ = ['__getattribute__', '__setattr__', '__delattr__', 'a', '__enter__', '__exit__']
BaseManager.register('Obj', Obj, MyProxy)
m = BaseManager(); m.start();
o = m.Obj()
o.a # outputs 1
with o: pass # AttributeError: __exit__
I can make the context manager alone work by using the exposed keyword while registering:
BaseManager.register('Obj', Obj, exposed=['__enter__', '__exit__'])
m = BaseManager(); m.start();
o = m.Obj()
with o: pass # works
But if I also add the stuff for the property I get a max recursion error:
BaseManager.register('Obj', Obj, exposed=['__enter__', '__exit__', '__getattribute__', '__setattr__', '__delattr__', 'a'])
m = BaseManager(); m.start();
o = m.Obj() # RuntimeError: maximum recursion depth exceeded
If I leave out __getattribute__ and friends, a shows up as a bound method, and calling it tries to call the property's value rather than the getter itself, so that doesn't work either.
I have tried to mix and match in every way I could think of and couldn't find a solution. Is there a way of doing this or maybe this is a bug in the lib?

The fact is that the way these managers are implemented is focused on controlling access to the shared data in them, in the form of attributes. They won't do great at dealing with other Python features such as properties, or "dunder" methods that depend on the object state, like __enter__ and __exit__.
It would certainly be possible to come up with specific workarounds for each needed feature, by subclassing the proxy object, until each one works - but the result would never be bullet-proof for all corner cases, much less for all Python class features.
So, I think that in this case the best you can do is to create a simple, data-only class: one that just uses plain attributes - no properties, no descriptors, no attribute-access customization - just a plain data class whose instances will hold the data you need to share. Actually, you may not even need such a class, since the managers module provides a synchronized dictionary type - you can just use that.
Then you create a second class that holds the intelligence you need. This second class will have the getters, setters and properties, can implement the context-manager protocol and any dunder method you like, and holds an associated instance of the data class. All the intelligence in the methods and properties can make use of the data in that instance. Actually, you might just use a multiprocessing.managers.SyncManager.dict synchronized dictionary to hold your data.
Then, if you make this associated data class managed, it will work in a straightforward and simple way, and, in each process, you build the "smart class" wrapping it.
Your code snippets don't show how you pass your objects from one process to another - I hope you are aware that by calling BaseManager.Obj() you get new, local instances of your class anyway - you have to use a Queue to share your objects across processes, regardless of the managers.
The proof of concept below shows an example of what I mean.
import time
from multiprocessing import Process, Pool
from multiprocessing.managers import SyncManager

class MySpecialClass:
    def __init__(self, data):
        self.data = data

    @property
    def a(self):
        return self.data["a"]

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        pass

def worker(data):
    obj = MySpecialClass(data)
    for i in range(10):
        time.sleep(1)
        obj.data[i] = i ** 2

def main():
    m = SyncManager()
    m.start()
    data = m.dict()
    server_obj = MySpecialClass(data)
    p = Process(target=worker, args=(data,))
    p.start()
    for i in range(22):
        print(server_obj.data)
        time.sleep(.5)
    p.join()

main()
Keep in mind that if you need to coordinate your context-blocks across processes, due to some resources, you can pass manager.Lock() objects around as easily as the data dictionary above - even as a value in the dictionary - and it would then be ready to use inside the object's __enter__ method.
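For instance, a minimal sketch of that idea, building on the proof of concept above (the lock-per-object arrangement here is just one illustrative choice):
class MySpecialClass:
    def __init__(self, data, lock):
        self.data = data
        self.lock = lock            # a manager.Lock(), shareable across processes like the dict

    def __enter__(self):
        self.lock.acquire()         # coordinate the context block across processes
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.lock.release()

# m = SyncManager(); m.start()
# obj = MySpecialClass(m.dict(), m.Lock())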

Related

What is the difference between assignment and overriding in a child class?

This has been bugging me for a while and appears hard to google.
In the following code can anyone explain the pragmatic difference between FirstChild and SecondChild. It's clear from experiments that both "work" and arguably SecondChild is marginally more efficient. But is there something that I'm missing about the way these two behave? Are they different and how are they different?
import collections

class Parent:
    def send_message(self, message: str):
        pass

class FirstChild(Parent):
    def __init__(self):
        self.message_queue = collections.deque()

    def send_message(self, message: str):
        self.message_queue.append(message)

class SecondChild(Parent):
    def __init__(self):
        self.message_queue = collections.deque()
        self.send_message = self.message_queue.append
FirstChild creates a function in the class called send_message, and functions are descriptors. When you do instance.send_message, the interpreter first searches the instance __dict__ for the name, then the class. When it's found in the class, the function is bound to the instance to create a method object, so you don't pass self explicitly. This binding happens every time you do the lookup, and it looks something like
method = type(instance).send_message.__get__(instance, type(instance))
SecondChild assigns a bound method as the attribute send_message on the instance. It cuts out the lookup in its own class object, the lookup in the deque class object, and the binding step at call time. That is probably why it appears marginally more efficient.
A major practical difference between these approaches is that send_message in SecondChild is not overridable. Since functions are non-data descriptors (they have a __get__ method but no __set__ - and yes, functions have a class and methods, like any other object), the instance attribute send_message in SecondChild will always trump any class-level function. This means that a subclass of SecondChild that calls the parent __init__ method will hide any implementation of send_message it defines.
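A short sketch of that pitfall (ThirdChild here is a hypothetical subclass, not from the question):
import collections

class Parent:
    def send_message(self, message: str):
        pass

class SecondChild(Parent):
    def __init__(self):
        self.message_queue = collections.deque()
        self.send_message = self.message_queue.append

class ThirdChild(SecondChild):
    def __init__(self):
        super().__init__()                 # stores the bound deque.append on the instance
    def send_message(self, message: str):
        print("never called")              # shadowed by the instance attribute

child = ThirdChild()
child.send_message("hi")                   # goes straight to the deque; the override never runs
print(child.message_queue)                 # deque(['hi'])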
You will likely find the official descriptor guide to be quite informative: https://docs.python.org/3/howto/descriptor.html

Function to behave differently on class vs on instance

I'd like a particular function to be callable as a classmethod, and to behave differently when it's called on an instance.
For example, if I have a class Thing, I want Thing.get_other_thing() to work, but also thing = Thing(); thing.get_other_thing() to behave differently.
I think overwriting the get_other_thing method on initialization should work (see below), but that seems a bit hacky. Is there a better way?
class Thing:
    def __init__(self):
        self.get_other_thing = self._get_other_thing_inst()

    @classmethod
    def get_other_thing(cls):
        # do something...
        pass

    def _get_other_thing_inst(self):
        # do something else
        pass
Great question! What you seek can be easily done using descriptors.
Descriptors are Python objects which implement the descriptor protocol, usually starting with __get__().
They exist, mostly, to be set as a class attribute on different classes. Upon accessing them, their __get__() method is called, with the instance and owner class passed in.
class DifferentFunc:
    """Deploys a different function according to attribute access

    I am a descriptor.
    """
    def __init__(self, clsfunc, instfunc):
        # Set our functions
        self.clsfunc = clsfunc
        self.instfunc = instfunc

    def __get__(self, inst, owner):
        # Accessed from class
        if inst is None:
            return self.clsfunc.__get__(None, owner)
        # Accessed from instance
        return self.instfunc.__get__(inst, owner)

class Test:
    @classmethod
    def _get_other_thing(cls):
        print("Accessed through class")

    def _get_other_thing_inst(inst):
        print("Accessed through instance")

    get_other_thing = DifferentFunc(_get_other_thing,
                                    _get_other_thing_inst)
And now for the result:
>>> Test.get_other_thing()
Accessed through class
>>> Test().get_other_thing()
Accessed through instance
That was easy!
By the way, did you notice me using __get__ on the class and instance function? Guess what? Functions are also descriptors, and that's the way they work!
>>> def func(self):
... pass
...
>>> func.__get__(object(), object)
<bound method func of <object object at 0x000000000046E100>>
Upon accessing a function attribute, its __get__ is called, and that's how you get function binding.
For more information, I highly suggest reading the Python manual and the "How-To" linked above. Descriptors are one of Python's most powerful features and are barely even known.
Why not set the function on instantiation?
Or Why not set self.func = self._func inside __init__?
Setting the function on instantiation comes with quite a few problems:
self.func = self._func causes a circular reference. The instance is stored inside the bound method object returned by self._func, which in turn is stored on the instance by the assignment. The end result is that the instance references itself and will be cleaned up in a much slower and heavier manner.
Other code interacting with your class might attempt to take the function straight out of the class, and use __get__(), which is the usual expected method, to bind it. They will receive the wrong function.
Will not work with __slots__.
Although with descriptors you need to understand the mechanism, setting it on __init__ isn't as clean and requires setting multiple functions on __init__.
Takes more memory. Instead of storing one single function, you store a bound function for each and every instance.
Will not work with properties.
There are many more that I didn't add as the list goes on and on.
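As one concrete illustration of the __slots__ point above, a minimal sketch (the class and attribute names are made up):
class Slotted:
    __slots__ = ("x",)           # no per-instance __dict__, and no slot named 'func'

    def __init__(self, x):
        self.x = x
        self.func = self._func   # AttributeError: 'Slotted' object has no attribute 'func'

    def _func(self):
        return self.x

Slotted(1)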
Here is a bit hacky solution:
class Thing(object):
    @staticmethod
    def get_other_thing():
        return 1

    def __getattribute__(self, name):
        if name == 'get_other_thing':
            return lambda: 2
        return super(Thing, self).__getattribute__(name)

print Thing.get_other_thing() # 1
print Thing().get_other_thing() # 2
If it is accessed on the class, the staticmethod is executed. If it is accessed on an instance, __getattribute__ is executed first, so we can return something other than Thing.get_other_thing (a lambda in this case).

how to subclass multiprocessing.JoinableQueue

I am trying to subclass multiprocessing.JoinableQueue so I can keep track of jobs that were skipped instead of completed. I am using a JoinableQueue to pass jobs to a set of multiprocessing.Process's and I have a threading.Thread populating the queue. Here is my implementation attempt:
import multiprocessing

class InputJobQueue(multiprocessing.JoinableQueue):
    def __init__(self, max_size):
        super(InputJobQueue, self).__init__(0)
        self._max_size = max_size
        self._skipped_job_count = 0

    def isFull(self):
        return self.qsize() >= self._max_size

    def taskSkipped(self):
        self._skipped_job_count += 1
        self.task_done()
However, I run into this issue documented here:
class InputJobQueue(multiprocessing.JoinableQueue):
TypeError: Error when calling the metaclass bases
    function() argument 1 must be code, not str
Looking at the code in multiprocessing I see that the actual class is in multiprocessing.queues. So I try to extend that class:
import multiprocessing.queues

class InputJobQueue(multiprocessing.queues.JoinableQueue):
    def __init__(self, max_size):
        super(InputJobQueue, self).__init__(0)
        self._max_size = max_size
        self._skipped_job_count = 0

    def isFull(self):
        return self.qsize() >= self._max_size

    def taskSkipped(self):
        self._skipped_job_count += 1
        self.task_done()
But I get inconsistent results: sometimes my custom attributes exist, other times they don't. E.g. the following error is reported in one of my worker Processes:
AttributeError: 'InputJobQueue' object has no attribute '_max_size'
What am I missing to subclass multiprocessing.JoinableQueue?
With multiprocessing, the way objects like JoinableQueue are magically shared between processes is by explicitly sharing the core sync objects, and pickling the "wrapper" stuff to pass over a pipe.
If you understand how pickling works, you can look at the source to JoinableQueue and see that it's using __getstate__/__setstate__. So, you just need to override those to add your own attributes. Something like this:
def __getstate__(self):
    return super(InputJobQueue, self).__getstate__() + (self._max_size,)

def __setstate__(self, state):
    super(InputJobQueue, self).__setstate__(state[:-1])
    self._max_size = state[-1]
I'm not promising this will actually work, since clearly these classes were not designed to be subclassed (the proposed fix for the bug you referenced is to document that the classes can't be subclassed and find a way to make the error messages nicer…). But it should get you past the particular problem you're having here.
You're trying to subclass a type that isn't meant to be subclassed. This requires you to depend on the internals of its implementation in two different ways (one of which is arguably a bug in the stdlib, but the other isn't). And this isn't necessary.
If the actual type is hidden under the covers, no code can actually expect you to be a formal subtype; as long as you duck-type as a queue, you're fine. Which you can do by delegating to a member:
class InputJobQueue(object):
    def __init__(self, max_size):
        self._jq = multiprocessing.JoinableQueue(0)
        self._max_size = max_size
        self._skipped_job_count = 0

    def __getattr__(self, name):
        return getattr(self._jq, name)

    # your overrides/new methods
(It would probably be cleaner to explicitly delegate only the documented methods of JoinableQueue than to __getattr__-delegate everything, but in the interests of brevity, I did the shorter version.)
It doesn't matter whether that constructor is a function or a class, because the only thing you're doing is calling it. It doesn't matter how the actual type is pickled, because a class is only responsible for identifying its members, not knowing how to pickle them. All of your problems go away.
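For completeness, a sketch of the explicit-delegation variant mentioned above (the delegated method list is illustrative, not exhaustive):
import multiprocessing

class InputJobQueue(object):
    def __init__(self, max_size):
        self._jq = multiprocessing.JoinableQueue(0)
        self._max_size = max_size
        self._skipped_job_count = 0

    # explicit pass-throughs for the documented JoinableQueue methods you actually use
    def put(self, *args, **kwargs):
        return self._jq.put(*args, **kwargs)

    def get(self, *args, **kwargs):
        return self._jq.get(*args, **kwargs)

    def task_done(self):
        return self._jq.task_done()

    def join(self):
        return self._jq.join()

    def qsize(self):
        return self._jq.qsize()

    def isFull(self):
        return self.qsize() >= self._max_size

    def taskSkipped(self):
        self._skipped_job_count += 1
        self.task_done()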

Python __call__ special method practical example

I know that the __call__ method in a class is triggered when an instance of the class is called. However, I have no idea when I would use this special method, because one could simply create a new method that performs the same operation as __call__ and call that method instead of calling the instance.
I would really appreciate it if someone gives me a practical usage of this special method.
This example uses memoization, basically storing values in a table (dictionary in this case) so you can look them up later instead of recalculating them.
Here we use a simple class with a __call__ method to calculate factorials (through a callable object) instead of a factorial function that contains a static variable (as that's not possible in Python).
class Factorial:
    def __init__(self):
        self.cache = {}

    def __call__(self, n):
        if n not in self.cache:
            if n == 0:
                self.cache[n] = 1
            else:
                self.cache[n] = n * self.__call__(n - 1)
        return self.cache[n]
fact = Factorial()
Now you have a fact object which is callable, just like every other function. For example
for i in xrange(10):
    print("{}! = {}".format(i, fact(i)))
# output
0! = 1
1! = 1
2! = 2
3! = 6
4! = 24
5! = 120
6! = 720
7! = 5040
8! = 40320
9! = 362880
And it is also stateful.
Django's forms module uses the __call__ method nicely to implement a consistent API for form validation. You can write your own validator for a form in Django as a function.
def custom_validator(value):
    # your validation logic
Django has some default built-in validators such as email validators, url validators etc., which broadly fall under the umbrella of RegEx validators. To implement these cleanly, Django resorts to callable classes (instead of functions). It implements default Regex Validation logic in a RegexValidator and then extends these classes for other validations.
class RegexValidator(object):
    def __call__(self, value):
        # validation logic

class URLValidator(RegexValidator):
    def __call__(self, value):
        super(URLValidator, self).__call__(value)
        # additional logic

class EmailValidator(RegexValidator):
    # some logic
Now both your custom function and built-in EmailValidator can be called with the same syntax.
for v in [custom_validator, EmailValidator()]:
    v(value) # <-----
As you can see, this implementation in Django is similar to what others have explained in their answers below. Can this be implemented in any other way? You could, but IMHO it will not be as readable or as easily extensible for a big framework like Django.
I find it useful because it allows me to create APIs that are easy to use (you have some callable object that requires some specific arguments), and are easy to implement because you can use Object Oriented practices.
The following is code I wrote yesterday that makes a version of the hashlib.foo methods that hash entire files rather than strings:
# filehash.py
import hashlib

class Hasher(object):
    """
    A wrapper around the hashlib hash algorithms that allows an entire file to
    be hashed in a chunked manner.
    """
    def __init__(self, algorithm):
        self.algorithm = algorithm

    def __call__(self, file):
        hash = self.algorithm()
        with open(file, 'rb') as f:
            for chunk in iter(lambda: f.read(4096), ''):
                hash.update(chunk)
        return hash.hexdigest()

md5 = Hasher(hashlib.md5)
sha1 = Hasher(hashlib.sha1)
sha224 = Hasher(hashlib.sha224)
sha256 = Hasher(hashlib.sha256)
sha384 = Hasher(hashlib.sha384)
sha512 = Hasher(hashlib.sha512)
This implementation allows me to use the functions in a similar fashion to the hashlib.foo functions:
from filehash import sha1
print sha1('somefile.txt')
Of course I could have implemented it a different way, but in this case it seemed like a simple approach.
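For comparison, a closure-based sketch of the same idea (make_file_hasher is a made-up helper name, and this variant uses a Python 3 bytes sentinel):
import hashlib

def make_file_hasher(algorithm):
    def hash_file(path):
        h = algorithm()
        with open(path, 'rb') as f:
            for chunk in iter(lambda: f.read(4096), b''):
                h.update(chunk)
        return h.hexdigest()
    return hash_file

sha1 = make_file_hasher(hashlib.sha1)
# print(sha1('somefile.txt'))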
__call__ is also used to implement decorator classes in Python. Here, the instance of the class is called with the decorated function when the decorator is applied.
class EnterExitParam(object):
    def __init__(self, p1):
        self.p1 = p1

    def __call__(self, f):
        def new_f():
            print("Entering", f.__name__)
            print("p1=", self.p1)
            f()
            print("Leaving", f.__name__)
        return new_f

@EnterExitParam("foo bar")
def hello():
    print("Hello")

if __name__ == "__main__":
    hello()
program output:
Entering hello
p1= foo bar
Hello
Leaving hello
Yes, when you know you're dealing with objects, it's perfectly possible (and in many cases advisable) to use an explicit method call. However, sometimes you deal with code that expects callable objects - typically functions, but thanks to __call__ you can build more complex objects, with instance data and more methods to delegate repetitive tasks, etc. that are still callable.
Also, sometimes you're using both objects for complex tasks (where it makes sense to write a dedicated class) and objects for simple tasks (that already exist as functions, or are more easily written as functions). To have a common interface, you either have to write tiny classes wrapping those functions with the expected interface, or you keep the functions as functions and make the more complex objects callable. Let's take threads as an example. The Thread objects from the standard library module threading want a callable as their target argument (i.e. as the action to be done in the new thread). With a callable object, you are not restricted to functions; you can pass other objects as well, such as a relatively complex worker that gets tasks to do from other threads and executes them sequentially:
import queue

class Worker(object):
    def __init__(self, *args, **kwargs):
        self.queue = queue.Queue()
        self.args = args
        self.kwargs = kwargs

    def add_task(self, task):
        self.queue.put(task)

    def __call__(self):
        while True:
            next_action = self.queue.get()
            success = next_action(*self.args, **self.kwargs)
            if not success:
                self.add_task(next_action)
This is just an example off the top of my head, but I think it is already complex enough to warrant the class. Doing this only with functions is hard, at least it requires returning two functions and that's slowly getting complex. One could rename __call__ to something else and pass a bound method, but that makes the code creating the thread slightly less obvious, and doesn't add any value.
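A minimal usage sketch for the Worker above (the task here is a trivial illustrative lambda):
import threading

worker = Worker()
t = threading.Thread(target=worker, daemon=True)  # the callable object is the thread's target
t.start()
worker.add_task(lambda: True)                     # enqueue work from the main thread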
Class-based decorators use __call__ to reference the wrapped function. E.g.:
class Deco(object):
    def __init__(self, f):
        self.f = f

    def __call__(self, *args, **kwargs):
        print args
        print kwargs
        self.f(*args, **kwargs)
There is a good description of the various options here at Artima.com
IMHO the __call__ method and closures give us a natural way to implement the STRATEGY design pattern in Python. We define a family of algorithms, encapsulate each one, make them interchangeable, and in the end we can execute a common set of steps and, for example, calculate a hash for a file.
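A tiny sketch of that strategy idea (the classes and functions here are illustrative, not from any particular codebase):
class UpperCase:
    def __call__(self, text):
        return text.upper()

def lower_case(text):            # a plain function is interchangeable with the callable object
    return text.lower()

def render(text, strategy):      # the common set of steps; the algorithm is injected
    return strategy(text)

print(render("Hello", UpperCase()))  # HELLO
print(render("Hello", lower_case))   # hello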
I just stumbled upon a usage of __call__() in concert with __getattr__() which I think is beautiful. It allows you to hide multiple levels of a JSON/HTTP/(however_serialized) API inside an object.
The __getattr__() part takes care of iteratively returning a modified instance of the same class, filling in one more attribute at a time. Then, after all information has been exhausted, __call__() takes over with whatever arguments you passed in.
Using this model, you can for example make a call like api.v2.volumes.ssd.update(size=20), which ends up in a PUT request to https://some.tld/api/v2/volumes/ssd/update.
The particular code is a block storage driver for a certain volume backend in OpenStack, you can check it out here: https://github.com/openstack/cinder/blob/master/cinder/volume/drivers/nexenta/jsonrpc.py
EDIT: Updated the link to point to master revision.
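A rough sketch of that __getattr__/__call__ chaining pattern (the class, URL, and HTTP details here are illustrative; the real driver differs):
class APIProxy:
    def __init__(self, base_url, path=()):
        self.base_url = base_url
        self.path = path

    def __getattr__(self, name):
        # every attribute access returns a new proxy with one more path segment filled in
        return APIProxy(self.base_url, self.path + (name,))

    def __call__(self, **kwargs):
        url = self.base_url + "/" + "/".join(self.path)
        print("PUT", url, kwargs)   # a real client would issue the HTTP request here
        return url

api = APIProxy("https://some.tld/api")
api.v2.volumes.ssd.update(size=20)   # PUT https://some.tld/api/v2/volumes/ssd/update {'size': 20}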
I find a good place to use callable objects, those that define __call__(), is when using the functional programming capabilities in Python, such as map(), filter(), reduce().
The best time to use a callable object over a plain function or a lambda is when the logic is complex and needs to retain some state, or uses other info that is not passed to the __call__() function.
Here's some code that filters file names based upon their filename extension using a callable object and filter().
Callable:
import os

class FileAcceptor(object):
    def __init__(self, accepted_extensions):
        self.accepted_extensions = accepted_extensions

    def __call__(self, filename):
        base, ext = os.path.splitext(filename)
        return ext in self.accepted_extensions

class ImageFileAcceptor(FileAcceptor):
    def __init__(self):
        image_extensions = ('.jpg', '.jpeg', '.gif', '.bmp')
        super(ImageFileAcceptor, self).__init__(image_extensions)
Usage:
filenames = [
    'me.jpg',
    'me.txt',
    'friend1.jpg',
    'friend2.bmp',
    'you.jpeg',
    'you.xml']

acceptor = ImageFileAcceptor()
image_filenames = filter(acceptor, filenames)
print image_filenames
Output:
['me.jpg', 'friend1.jpg', 'friend2.bmp', 'you.jpeg']
Specify a __metaclass__ and override the __call__ method, and have the specified metaclass's __new__ method return an instance of the class; voilà, you have a "function" with methods.
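One hedged reading of that terse answer, as a minimal sketch in Python 3 syntax (names are illustrative):
class Meta(type):
    def __call__(cls, *args, **kwargs):
        print("called like a function")
        return super().__call__(*args, **kwargs)   # build and return an instance of the class

class FuncLike(metaclass=Meta):
    def helper(self):
        return "an instance that also has methods"

obj = FuncLike()       # calling the class goes through Meta.__call__
print(obj.helper())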
We can use the __call__ method to use other class methods as static methods.
class _Callable:
    def __init__(self, anycallable):
        self.__call__ = anycallable

class Model:
    def get_instance(conn, table_name):
        """ do something"""

    get_instance = _Callable(get_instance)

provs_fac = Model.get_instance(connection, "users")
One common example is the __call__ in functools.partial, here is a simplified version (with Python >= 3.5):
class partial:
    """New function with partial application of the given arguments and keywords."""

    def __new__(cls, func, *args, **kwargs):
        if not callable(func):
            raise TypeError("the first argument must be callable")
        self = super().__new__(cls)
        self.func = func
        self.args = args
        self.kwargs = kwargs
        return self

    def __call__(self, *args, **kwargs):
        return self.func(*self.args, *args, **self.kwargs, **kwargs)
Usage:
def add(x, y):
    return x + y

inc = partial(add, y=1)
print(inc(41)) # 42
This is too late but I'm giving an example. Imagine you have a Vector class and a Point class. Both take x, y as positional args. Let's imagine you want to create a function that moves the point to be put on the vector.
4 Solutions
put_point_on_vec(point, vec)
Make it a method on the Vector class, e.g. my_vec.put_point(point)
Make it a method on the Point class. my_point.put_on_vec(vec)
Vector implements __call__, so you can use it like my_vec_instance(point) - a sketch follows after this list
This is actually part of some examples I'm working on for a guide for dunder methods explained with Maths that I'm gonna release sooner or later.
I left out the logic of moving the point itself because that's not what this question is about.
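A minimal sketch of option 4 from the list above (the point-moving logic is deliberately omitted, as in the answer):
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

class Vector:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __call__(self, point):
        # "put the point on the vector" - placeholder logic for illustration
        return Point(self.x, self.y)

my_vec_instance = Vector(3, 4)
moved = my_vec_instance(Point(0, 0))   # used like a function, thanks to __call__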
I'm a novice, but here is my take: having __call__ makes composition easier to code. If f and g are instances of a class Function that has a method eval(self, x), then with __call__ one can write f(g(x)) as opposed to f.eval(g.eval(x)).
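A small sketch of that composition point (the Function wrapper here is illustrative):
class Function:
    def __init__(self, fn):
        self.fn = fn

    def __call__(self, x):
        return self.fn(x)

    def eval(self, x):          # the clunkier spelling, for contrast
        return self.fn(x)

f = Function(lambda x: x + 1)
g = Function(lambda x: 2 * x)
print(f(g(3)))                  # 7 - reads like math
print(f.eval(g.eval(3)))        # 7 - same result, noisier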
A neural network can be composed from smaller neural networks, and in pytorch we have a __call__ in the Module class:
Here's an example of where __call__ is used in practice: in PyTorch, when defining a neural network (call it class MyNN(nn.Module), for example) as a subclass of torch.nn.Module, one typically defines a forward method for the class. But when applying an input tensor x to an instance model = MyNN(), we just write model(x) rather than model.forward(x), and both give the same answer. If you dig into the source for torch.nn.Module here
https://pytorch.org/docs/stable/_modules/torch/nn/modules/module.html#Module
and search for __call__, one can eventually trace it back - at least in some cases - to a call to self.forward.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MyNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(784, 200)

    def forward(self, x):
        return F.relu(self.layer1(x))

x = torch.rand(10, 784)
model = MyNN()
print(model(x))
print(model.forward(x))
will print the same values in the last two lines, because the Module class implements __call__; that's what Python turns to when it sees model(x), and that __call__ in turn eventually calls model.forward(x).
The function call operator.
class Foo:
    def __call__(self, a, b, c):
        # do something
        pass

x = Foo()
x(1, 2, 3)
The __call__ method can be used to redefine/re-initialize the same object. It also facilitates the use of instances/objects of a class as functions, by passing arguments to the objects.
import random

class Bingo:
    def __init__(self, items):
        self._items = list(items)
        random.shuffle(self._items, random=None)

    def pick(self):
        try:
            return self._items.pop()
        except IndexError:
            raise LookupError('It is empty now!')

    def __call__(self):
        return self.pick()

b = Bingo(range(3))
print(b.pick())
print(b())
print(callable(b))
Now the output can be, for example (the first two values vary, since the items from range(3) are shuffled):
2
1
True
You can see that class Bingo implements the __call__ method - an easy way to create function-like objects that keep internal state across invocations, like the remaining items in Bingo. Another good use case of __call__ is decorators: decorators must be callable, and it is sometimes convenient to "remember" something between calls of the decorated function (i.e., memoization - caching the results of an expensive computation for later use).
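A small sketch of that memoizing decorator-class idea (Memoize and slow_square are made-up names):
class Memoize:
    def __init__(self, fn):
        self.fn = fn
        self.cache = {}

    def __call__(self, *args):
        if args not in self.cache:
            self.cache[args] = self.fn(*args)   # remembered between calls
        return self.cache[args]

@Memoize
def slow_square(n):
    return n * n

print(slow_square(4))   # computed
print(slow_square(4))   # served from the cache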

Disable class instance methods

How can I quickly disable all methods in a class instance based on a condition? My naive solution is to override __getattr__, but this is not called when the attribute name already exists.
class my():
    def method1(self):
        print 'method1'

    def method2(self):
        print 'method2'

    def __getattr__(self, name):
        print 'Fetching ' + str(name)
        if self.isValid():
            return getattr(self, name)

    def isValid(self):
        return False

if __name__ == '__main__':
    m = my()
    m.method1()
The equivalent of what you want to do is actually to override __getattribute__, which is going to be called for every attribute access. Besides it being very slow, take care: by definition of every, that includes e.g. the call to self.isValid within __getattribute__'s own body, so you'll have to use some circuitous route to access that attribute (type(self).isValid(self) should work, for example, as it gets the attribute from the class, not from the instance).
This points to a horrible terminological confusion: this is not disabling "method from a class", but from an instance, and in particular has nothing to do with classmethods. If you do want to work in a similar way on a class basis, rather than an instance basis, you'll need to make a custom metaclass and override __getattribute__ on the metaclass (that's the one that's called when you access attributes on the class -- as you're asking in your title and text -- rather than on the instance -- as you in fact appear to be doing, which is by far the more normal and usual case).
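A rough sketch of that metaclass route, in Python 3 syntax and assuming a whitelist-style validity rule (all names here are illustrative):
class DisablingMeta(type):
    def __getattribute__(cls, name):
        # leave dunders and the validity check itself alone to avoid recursion
        if not name.startswith('_') and name != 'isValid' and not cls.isValid():
            raise AttributeError(name)
        return super().__getattribute__(name)

class My(metaclass=DisablingMeta):
    @classmethod
    def isValid(cls):
        return False

    @classmethod
    def method1(cls):
        print('method1')

My.method1()   # raises AttributeError: method1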
Edit: a completely different approach might be to use a peculiarly Pythonic pathway to implementing the State design pattern: class-switching. E.g.:
class _NotValid(object):
    def isValid(self):
        return False

    def setValid(self, yesno):
        if yesno:
            self.__class__ = TheGoodOne

class TheGoodOne(object):
    def isValid(self):
        return True

    def setValid(self, yesno):
        if not yesno:
            self.__class__ = _NotValid

    # write all other methods here
As long as you can call setValid appropriately, so that the object's __class__ is switched appropriately, this is very fast and simple -- essentially, the object's __class__ is where all the object's methods are found, so by switching it you switch, en masse, the set of methods that exist on the object at a given time. However, this does not work if you absolutely insist that validity checking must be performed "just in time", i.e. at the very instant the object's method is being looked up.
An intermediate approach between this and the __getattribute__ one would be to introduce an extra level of indirection (which is popularly held to be the solution to all problems;-), along the lines of:
class _Valid(object):
    def __init__(self, actualobject):
        self._actualobject = actualobject

    # all actual methods go here
    # keeping state in self._actualobject

class Wrapit(object):
    def __init__(self):
        self._themethods = _Valid(self)

    def isValid(self):
        # whatever logic you want
        # (DON'T call other self. methods!-)
        return False

    def __getattr__(self, n):
        if self.isValid():
            return getattr(self._themethods, n)
        raise AttributeError(n)
This is more idiomatic than __getattribute__ because it relies on the fact that __getattr__ is only called for attributes that aren't found in other ways - so the object can hold normal state (data) in its __dict__, and that will be accessed without any big overhead; only method calls pay the extra overhead of indirection. The _Valid class instances can keep some or all state in their respective self._actualobject, if any of the state needs to stay accessible on invalid objects (so that the invalid state disables methods, but not data attribute access; it's not clear from your question whether that's needed, but it's a free extra possibility offered by this approach). This idiom is less error-prone than __getattribute__, since state can be accessed more directly in the methods (without triggering validity checks).
As presented, the solution creates a circular reference loop, which may impose a bit of overhead in terms of garbage collection. If that's a problem in your application, use the weakref module from the standard Python library, of course -- that module is generally the simplest way to remove circular loops of references, if and when they're a problem.
(E.g., make the _actualobject attribute of _Valid class instances a weak reference to the object that holds that instance as its _themethods attribute).
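A small sketch of that weakref variant (only the changed part of _Valid is shown):
import weakref

class _Valid(object):
    def __init__(self, actualobject):
        self._actualobject_ref = weakref.ref(actualobject)   # weak, so no reference cycle

    @property
    def _actualobject(self):
        return self._actualobject_ref()   # may be None once the owner has been collected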
