I know that the __call__ method in a class is triggered when an instance of the class is called. However, I have no idea when to use this special method, because one could simply create a new method that performs the same operation and call that method instead of calling the instance.
I would really appreciate it if someone could give me a practical use of this special method.
This example uses memoization, basically storing values in a table (dictionary in this case) so you can look them up later instead of recalculating them.
Here we use a simple class with a __call__ method to calculate factorials (through a callable object) instead of a factorial function that contains a static variable (as that's not possible in Python).
class Factorial:
    def __init__(self):
        self.cache = {}

    def __call__(self, n):
        if n not in self.cache:
            if n == 0:
                self.cache[n] = 1
            else:
                self.cache[n] = n * self.__call__(n - 1)
        return self.cache[n]

fact = Factorial()
Now you have a fact object which is callable, just like every other function. For example
for i in range(10):
    print("{}! = {}".format(i, fact(i)))
# output
0! = 1
1! = 1
2! = 2
3! = 6
4! = 24
5! = 120
6! = 720
7! = 5040
8! = 40320
9! = 362880
And it is also stateful.
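For example, the cache persists on the instance after the loop above; a tiny illustration (not from the original answer):

print(fact.cache[5])    # 120, looked up from the cache rather than recomputed
print(len(fact.cache))  # 10, one entry per argument seen so far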
Django's forms module uses the __call__ method nicely to implement a consistent API for form validation. You can write your own validator for a Django form as a plain function:
def custom_validator(value):
    # your validation logic
    ...
Django also ships built-in validators, such as email validators, URL validators, etc., which broadly fall under the umbrella of regex validators. To implement these cleanly, Django resorts to callable classes (instead of functions): it implements the default regex validation logic in a RegexValidator class and then extends it for the other validations.
class RegexValidator(object):
    def __call__(self, value):
        # validation logic
        ...

class URLValidator(RegexValidator):
    def __call__(self, value):
        super(URLValidator, self).__call__(value)
        # additional logic

class EmailValidator(RegexValidator):
    # some logic
    ...
Now both your custom function and built-in EmailValidator can be called with the same syntax.
for v in [custom_validator, EmailValidator()]:
    v(value)  # <-----
As you can see, this implementation in Django is similar to what others have explained in their answers. Could this be implemented in any other way? It could, but IMHO it would not be as readable or as easily extensible for a big framework like Django.
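For a self-contained flavour of the same pattern outside Django (the class, regex, and validator names below are illustrative, not Django's actual implementation):

import re

class SimpleRegexValidator(object):
    def __init__(self, pattern, message):
        self.pattern = re.compile(pattern)
        self.message = message

    def __call__(self, value):
        # validation logic: raise if the value does not match
        if not self.pattern.match(value):
            raise ValueError(self.message)

def no_spaces_validator(value):
    # a plain-function validator, used with the same calling syntax
    if ' ' in value:
        raise ValueError('value must not contain spaces')

email_validator = SimpleRegexValidator(r'[^@\s]+@[^@\s]+\.[^@\s]+$', 'invalid email')

for v in [no_spaces_validator, email_validator]:
    v('user@example.com')  # both validators are invoked with the same syntax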
I find it useful because it allows me to create APIs that are easy to use (you have some callable object that requires some specific arguments), and are easy to implement because you can use Object Oriented practices.
The following is code I wrote yesterday that makes a version of the hashlib.foo methods that hash entire files rather than strings:
# filehash.py
import hashlib

class Hasher(object):
    """
    A wrapper around the hashlib hash algorithms that allows an entire file to
    be hashed in a chunked manner.
    """
    def __init__(self, algorithm):
        self.algorithm = algorithm

    def __call__(self, file):
        hash = self.algorithm()
        with open(file, 'rb') as f:
            for chunk in iter(lambda: f.read(4096), b''):
                hash.update(chunk)
        return hash.hexdigest()

md5 = Hasher(hashlib.md5)
sha1 = Hasher(hashlib.sha1)
sha224 = Hasher(hashlib.sha224)
sha256 = Hasher(hashlib.sha256)
sha384 = Hasher(hashlib.sha384)
sha512 = Hasher(hashlib.sha512)
This implementation allows me to use the functions in a similar fashion to the hashlib.foo functions:
from filehash import sha1
print(sha1('somefile.txt'))
Of course I could have implemented it a different way, but in this case it seemed like a simple approach.
__call__ is also used to implement decorator classes in Python. In this case, the decorator instance is called with the decorated function and returns a wrapped version of it.
class EnterExitParam(object):
    def __init__(self, p1):
        self.p1 = p1

    def __call__(self, f):
        def new_f():
            print("Entering", f.__name__)
            print("p1=", self.p1)
            f()
            print("Leaving", f.__name__)
        return new_f

@EnterExitParam("foo bar")
def hello():
    print("Hello")

if __name__ == "__main__":
    hello()
program output:
Entering hello
p1= foo bar
Hello
Leaving hello
Yes, when you know you're dealing with objects, it's perfectly possible (and in many cases advisable) to use an explicit method call. However, sometimes you deal with code that expects callable objects - typically functions, but thanks to __call__ you can build more complex objects that are still callable, with instance data and more methods to delegate repetitive tasks, etc.
Also, sometimes you're using both objects for complex tasks (where it makes sense to write a dedicated class) and objects for simple tasks (that already exist in functions, or are more easily written as functions). To have a common interface, you either have to write tiny classes wrapping those functions with the expected interface, or you keep the functions as functions and make the more complex objects callable. Let's take threads as an example. The Thread objects from the standard library module threading want a callable as the target argument (i.e. as the action to be done in the new thread). With a callable object, you are not restricted to functions; you can pass other objects as well, such as a relatively complex worker that gets tasks to do from other threads and executes them sequentially:
import queue

class Worker(object):
    def __init__(self, *args, **kwargs):
        self.queue = queue.Queue()
        self.args = args
        self.kwargs = kwargs

    def add_task(self, task):
        self.queue.put(task)

    def __call__(self):
        while True:
            next_action = self.queue.get()
            success = next_action(*self.args, **self.kwargs)
            if not success:
                self.add_task(next_action)
This is just an example off the top of my head, but I think it is already complex enough to warrant the class. Doing this with functions alone is hard; at the very least it requires returning two functions, and that is slowly getting complex. One could rename __call__ to something else and pass a bound method instead, but that makes the code that creates the thread slightly less obvious, and it doesn't add any value.
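As a hedged usage sketch (assuming the Worker class above; the task function here is purely illustrative):

import threading
import time

def say_hello():
    print("hello from the worker thread")
    return True  # a truthy result means the task is not re-queued

worker = Worker()
thread = threading.Thread(target=worker)  # the callable object is the thread's target
thread.daemon = True                      # the worker loops forever, so don't block exit
thread.start()

worker.add_task(say_hello)
time.sleep(0.1)  # give the daemon thread a moment to run the task before exiting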
Class-based decorators use __call__ to reference the wrapped function. E.g.:
class Deco(object):
    def __init__(self, f):
        self.f = f

    def __call__(self, *args, **kwargs):
        print(args)
        print(kwargs)
        self.f(*args, **kwargs)
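A short usage sketch for the Deco class above (the decorated function is illustrative):

@Deco
def greet(name, punctuation="!"):
    print("Hello, " + name + punctuation)

greet("world", punctuation="?")
# prints ('world',), then {'punctuation': '?'}, then Hello, world?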
There is a good description of the various options here at Artima.com
IMHO the __call__ method and closures give us a natural way to implement the STRATEGY design pattern in Python. We define a family of algorithms, encapsulate each one, make them interchangeable, and in the end we can execute a common set of steps and, for example, calculate a hash for a file.
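As a hedged sketch of that idea (the names below are mine; the file-hashing strategies are just examples of interchangeable algorithms):

import hashlib

class ChunkedHashStrategy:
    """One interchangeable algorithm in the family: hash a file in chunks."""
    def __init__(self, algorithm, chunk_size=4096):
        self.algorithm = algorithm
        self.chunk_size = chunk_size

    def __call__(self, path):
        h = self.algorithm()
        with open(path, 'rb') as f:
            for chunk in iter(lambda: f.read(self.chunk_size), b''):
                h.update(chunk)
        return h.hexdigest()

def whole_file_md5(path):
    """A plain function also fits the strategy slot, thanks to the callable interface."""
    with open(path, 'rb') as f:
        return hashlib.md5(f.read()).hexdigest()

def checksum(path, strategy):
    # The common set of steps does not care which strategy it was handed.
    return strategy(path)

# checksum('somefile.txt', ChunkedHashStrategy(hashlib.sha256))
# checksum('somefile.txt', whole_file_md5)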
I just stumbled upon a usage of __call__() in concert with __getattr__() which I think is beautiful. It allows you to hide multiple levels of a JSON/HTTP/(however_serialized) API inside an object.
The __getattr__() part takes care of iteratively returning a modified instance of the same class, filling in one more attribute at a time. Then, after all information has been exhausted, __call__() takes over with whatever arguments you passed in.
Using this model, you can for example make a call like api.v2.volumes.ssd.update(size=20), which ends up in a PUT request to https://some.tld/api/v2/volumes/ssd/update.
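A minimal, self-contained sketch of the pattern (the APIProxy class and the request handling below are illustrative, not the actual OpenStack driver code):

class APIProxy(object):
    def __init__(self, base_url, path=()):
        self.base_url = base_url
        self.path = path

    def __getattr__(self, name):
        # Each attribute access returns a new proxy with one more path segment filled in.
        return APIProxy(self.base_url, self.path + (name,))

    def __call__(self, **kwargs):
        # When finally called, assemble the URL; a real driver would issue the request here.
        url = self.base_url + '/' + '/'.join(self.path)
        return ('PUT', url, kwargs)

api = APIProxy('https://some.tld/api')
print(api.v2.volumes.ssd.update(size=20))
# ('PUT', 'https://some.tld/api/v2/volumes/ssd/update', {'size': 20})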
The particular code is a block storage driver for a certain volume backend in OpenStack; you can check it out here: https://github.com/openstack/cinder/blob/master/cinder/volume/drivers/nexenta/jsonrpc.py
EDIT: Updated the link to point to master revision.
I find a good place to use callable objects, those that define __call__(), is when using the functional programming capabilities in Python, such as map(), filter(), reduce().
The best time to use a callable object, rather than a plain function or a lambda, is when the logic is complex and needs to retain some state or use other information that is not passed to the __call__() function.
Here's some code that filters file names based upon their filename extension using a callable object and filter().
Callable:
import os

class FileAcceptor(object):
    def __init__(self, accepted_extensions):
        self.accepted_extensions = accepted_extensions

    def __call__(self, filename):
        base, ext = os.path.splitext(filename)
        return ext in self.accepted_extensions

class ImageFileAcceptor(FileAcceptor):
    def __init__(self):
        image_extensions = ('.jpg', '.jpeg', '.gif', '.bmp')
        super(ImageFileAcceptor, self).__init__(image_extensions)
Usage:
filenames = [
    'me.jpg',
    'me.txt',
    'friend1.jpg',
    'friend2.bmp',
    'you.jpeg',
    'you.xml']

acceptor = ImageFileAcceptor()
image_filenames = list(filter(acceptor, filenames))
print(image_filenames)
Output:
['me.jpg', 'friend1.jpg', 'friend2.bmp', 'you.jpeg']
Specify a __metaclass__ and override the __call__ method, and have the specified metaclass's __new__ method return an instance of the class; voilà, you have a "function" with methods.
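One possible reading of that, as a heavily hedged sketch (all names here are made up for illustration):

class CallableMeta(type):
    def __call__(cls, *args, **kwargs):
        # Intercept Greeter(...): build the instance the normal way,
        # then delegate straight to one of its methods, so the class
        # itself is used like a function that still carries methods.
        instance = super().__call__()
        return instance.run(*args, **kwargs)

class Greeter(metaclass=CallableMeta):
    def run(self, name):
        return "Hello, " + name

print(Greeter("world"))  # Hello, world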
We can use the __call__ method to let other class methods act as static methods.
class _Callable:
    def __init__(self, anycallable):
        self.__call__ = anycallable

class Model:
    def get_instance(conn, table_name):
        """ do something """

    get_instance = _Callable(get_instance)

provs_fac = Model.get_instance(connection, "users")
One common example is the __call__ in functools.partial; here is a simplified version (requires Python >= 3.5):
class partial:
    """New function with partial application of the given arguments and keywords."""

    def __new__(cls, func, *args, **kwargs):
        if not callable(func):
            raise TypeError("the first argument must be callable")
        self = super().__new__(cls)
        self.func = func
        self.args = args
        self.kwargs = kwargs
        return self

    def __call__(self, *args, **kwargs):
        return self.func(*self.args, *args, **self.kwargs, **kwargs)
Usage:
def add(x, y):
    return x + y

inc = partial(add, y=1)
print(inc(41))  # 42
This is a bit late, but here is an example. Imagine you have a Vector class and a Point class; both take x, y as positional args. Let's imagine you want to create a function that moves the point to put it on the vector.
4 solutions:

1. put_point_on_vec(point, vec)
2. Make it a method on the Vector class, e.g. my_vec.put_point(point)
3. Make it a method on the Point class: my_point.put_on_vec(vec)
4. Vector implements __call__, so you can use it like my_vec_instance(point)
This is actually part of some examples I'm working on for a guide to dunder methods explained with maths, which I'm going to release sooner or later.
I left out the logic of moving the point itself, because that is not what this question is about.
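Still, a rough sketch of what option 4 could look like (the "translate the point by the vector" interpretation here is my own, purely for illustration):

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

class Vector:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __call__(self, point):
        # "put the point on the vector": move the point by the vector's components
        return Point(point.x + self.x, point.y + self.y)

my_vec_instance = Vector(1, 2)
moved = my_vec_instance(Point(3, 4))  # instead of put_point_on_vec(point, vec)
print(moved.x, moved.y)               # 4 6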
I'm a novice, but here is my take: having __call__ makes composition easier to code. If f and g are instances of a class Function that has a method eval(self, x), then with __call__ one can write f(g(x)) as opposed to f.eval(g.eval(x)).
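A minimal sketch of that composition idea (the Function class here is illustrative):

class Function:
    def __init__(self, fn):
        self.fn = fn

    def eval(self, x):
        return self.fn(x)

    def __call__(self, x):
        # delegating to eval lets instances be composed with call syntax
        return self.eval(x)

f = Function(lambda x: x + 1)
g = Function(lambda x: 2 * x)
print(f(g(3)))  # 7, instead of f.eval(g.eval(3))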
A neural network can be composed from smaller neural networks, and in PyTorch we have a __call__ in the Module class:
Here's an example of where __call__ is used in practice. In PyTorch, when defining a neural network (call it class MyNN(nn.Module), for example) as a subclass of torch.nn.Module, one typically defines a forward method for the class. But when applying an input tensor x to an instance model = MyNN(), we usually write model(x) rather than model.forward(x), and both give the same answer. If you dig into the source for torch.nn.Module here
https://pytorch.org/docs/stable/_modules/torch/nn/modules/module.html#Module
and search for __call__, you can eventually trace it back - at least in some cases - to a call to self.forward:
import torch
import torch.nn as nn
import torch.nn.functional as F

class MyNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(784, 200)

    def forward(self, x):
        return F.relu(self.layer1(x))

x = torch.rand(10, 784)
model = MyNN()
print(model(x))
print(model.forward(x))
will print the same values in the last two lines, because the Module class implements __call__; that is what Python invokes when it sees model(x), and it in turn eventually calls model.forward(x).
The function call operator.
class Foo:
    def __call__(self, a, b, c):
        # do something
        ...

x = Foo()
x(1, 2, 3)
The __call__ method facilitates using instances of a class as functions, by passing arguments to the objects themselves, while the objects keep (and can update) their internal state between calls.
import random

class Bingo:
    def __init__(self, items):
        self._items = list(items)
        random.shuffle(self._items)

    def pick(self):
        try:
            return self._items.pop()
        except IndexError:
            raise LookupError('It is empty now!')

    def __call__(self):
        return self.pick()

b = Bingo(range(3))
print(b.pick())
print(b())
print(callable(b))
Now the output might be as follows (the first two values vary between 0 and 2, since the items are shuffled):
2
1
True
You can see that the Bingo class implements the __call__ method, an easy way to create function-like objects that have internal state which must be kept across invocations, like the remaining items in Bingo. Another good use case for __call__ is decorators: decorators must be callable, and it is sometimes convenient to "remember"
something between calls of the decorator (e.g. memoization: caching the results of an expensive computation for later use).
Related
I have a couple of classes that have the same methods but do things slightly differently (WorkerOne, WorkerTwo). Those classes inherit from an abstract base class that uses the abc module and the @abstractmethod decorator for the methods that should be implemented in WorkerOne and WorkerTwo.
Note: The actual question comes at the end.
Here's the shortened code:
from abc import ABCMeta, abstractmethod

class AbstractWorker(metaclass=ABCMeta):
    @abstractmethod
    def log_value(self, value):
        pass

class WorkerOne(AbstractWorker):
    def log_value(self, value):
        ...  # do something differently

class WorkerTwo(AbstractWorker):
    def log_value(self, value):
        ...  # do something differently
This works fine and I can create objects for both worker classes and execute the functions accordingly.
E.g.
worker_one = WorkerOne()
worker_two = WorkerTwo()
worker_one.log_value(1)
worker_two.log_value('text')
Please note that this is simplified. Each worker uses a different package to track experiments in the ML field and doesn't just differentiate between int and str.
I've been trying to find a way to avoid calling both objects every single time I want to log something. I want to unify these methods in some sort of wrapper class that takes those two objects and executes any method called on the wrapper on each object. I call that wrapper a hive, as it contains its workers.
Currently, I see two solutions to this, but both are lacking in quality. The first is the easier one but results in duplicated code; it is simple and it works, but it doesn't follow the DRY principle.
Solution #1:
from typing import List

class HiveSimple(AbstractWorker):
    def __init__(self, workers: List[AbstractWorker]):
        self.workers = workers

    def log_value(self, value):
        for worker in self.workers:
            worker.log_value(value)

    ...
The idea is to have the wrapper/hive class inherit from the abstract class as well, so we are forced to implement its functions. The workers are passed as a list when creating the object. For the log_value function we iterate through the list of workers and execute each one's own implementation of that method. The problem, as briefly mentioned, is 1) duplicated code, and 2) the hive class also grows, or needs to be altered, whenever a new method is added to the abstract base class.
The second solution is a bit more advanced and avoids duplicated code, but it also has a disadvantage.
Solution #2:
class Hive:
    def __init__(self, workers: List[AbstractWorker]):
        self.workers = workers
        self._ls_functions = []

    def __getattr__(self, name):
        for worker in self.workers:
            self._ls_functions.append(getattr(worker, name))
        return self.fn_executor

    def fn_executor(self, *args, **kwargs):
        for fn in self._ls_functions:
            fn(*args, **kwargs)
        self._ls_functions = []
In this solution I make use of the __getattr__ function. If I call the log_value() function on the hive object (hive.log_value()), Python first looks for a log_value attribute/function on the hive. If the attribute does not exist, __getattr__ is invoked and executes my code. There, I iterate through the list of workers and collect the functions with the same name. I then return the function fn_executor, because otherwise I wouldn't be able to hand over the parameters with which the log_value() function was called on the hive object. Although this works fine, the issue is that you need to know the parameters and the types beforehand. Since we don't use inheritance, we don't have the advantage of IntelliSense, because the functions are not members of the hive class. Makes sense.
So I wanted to mitigate that by adding functions as attributes during the __init__, which works.
Solution #2.1:
class Hive:
    def __init__(self, workers: List[AbstractWorker]):
        self.workers = workers
        self._ls_functions = []
        for fn_name in dir(AbstractWorker):
            if not (fn_name.startswith('__') or fn_name.startswith('_')):
                setattr(self, fn_name, self.fn_wrapper(str(fn_name)))

    def fn_wrapper(self, name):
        def fn(*params, **kwargs):
            return self.__getattr__(name)(*params, **kwargs)
        return fn

    def __getattr__(self, name):
        for worker in self.workers:
            self._ls_functions.append(getattr(worker, name))
        return self.fn_executor

    def fn_executor(self, *args, **kwargs):
        for fn in self._ls_functions:
            fn(*args, **kwargs)
        self._ls_functions = []
In solution #2.1 I try to fetch all functions from the abstract base class with dir(AbstractWorker), filter out dunder and "private" names with the if, and set each function name as an attribute. Additionally, I assign a wrapper function (similar to partial or a decorator) that goes through the __getattr__ function. At runtime the members are set correctly, but since IntelliSense relies on static code analysis, it cannot handle the dynamic attribute assignment, and as a result it refuses to bring them up.
Now to the question:
What would be the best approach to create a wrapper/hive class that knows about the signature of the functions from the abstract base class but gets rid of the duplication of code shown in solution #1?
Is this a plausible and sound way to write a class where a syntactic-sugar @staticmethod is what the outside interacts with? Thanks.
### script1.py ###
from script2 import SampleClass

output = SampleClass.method1(input_var)

### script2.py ###
class SampleClass(object):
    def __init__(self):
        self.var1 = 'var1'
        self.var2 = 'var2'

    @staticmethod
    def method1(input_var):
        # Syntactic sugar method that the outside uses
        sample_class = SampleClass()
        result = sample_class._method2(input_var)
        return result

    def _method2(self, input_var):
        # Main method that executes the various steps.
        self.var4 = self._method3(input_var)
        return self._method4(self.var4)

    def _method3(self, input_var):
        pass

    def _method4(self, var4):
        pass
Answering both your question and your comment: yes, it is possible to write such code, but I see no point in doing it:
class A:
    def __new__(cls, value):
        return cls.meth1(value)

    def meth1(value):
        return value + 1

result = A(100)
print(result)
# output:
101
You can't store a reference to a class A instance, because you get your method's result instead of an A instance. And because of this, an existing __init__ will not be called.
So if the instance just calculates something and gets discarded right away, what you want is to write a simple function, not a class. You are not storing state anywhere.
And if you look at it:
result = some_func(value)
looks exactly like what people expect when reading it: a function call.
So no, it is not a good practice unless you come up with a good use case for it (I can't remember one right now)
Also relevant for this question is the documentation here to understand __new__ and __init__ behaviour.
Regarding your other comment below my answer:
Defining __init__ in a class to set the initial state (attribute values) of the (already) created instance happens all the time. But __new__ has the different goal of customizing the object creation itself. The instance object does not exist yet when __new__ is run (it is a constructor function). __new__ is rarely needed in Python, unless you need things like a singleton, say a class A that always returns the very same object instance (of A) when called with A(). Normal user-defined classes usually return a new object on instantiation. You can check this with the id() builtin function. Another use case is when you create your own version (by subclassing) of an immutable type. Because it's immutable, the value was already set and there is no way of changing it inside __init__ or later; hence the need to act before that, by adding code inside __new__. Using __new__ without returning an object of the same class type (this is the uncommon case) has the additional problem of not running __init__.
If you are just grouping lots of methods inside a class but there is still no state to store/manage in each instance (you notice this also by the absence of self use in the method bodies), consider not using a class at all and organizing these methods, now turned into selfless functions, in a module or package for import. It looks like you are grouping just to organize related code.
If you stick to classes because there is state involved, consider breaking the class into smaller classes with no more than five to seven methods. Think also of giving them some more structure by grouping some of the small classes into various modules/submodules and using subclasses, because a long plain list of small classes (or functions, for that matter) can be mentally difficult to follow.
This has nothing to do with __new__ usage.
In summary, use call syntax for a function call that returns a result (or None), or for object instantiation by calling the class name; in that case the usual thing is to return an object of the intended type (the class called). Returning the result of a method instead usually means returning a different type, and that can look unexpected to the class user. There is a closely related use case in which some coders return self from their methods to allow for train-like (fluent) syntax:
my_font = SomeFont().italic().bold()
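A minimal sketch of that fluent style (SomeFont and its methods are illustrative):

class SomeFont:
    def __init__(self):
        self.styles = set()

    def italic(self):
        self.styles.add('italic')
        return self   # returning self is what enables the chaining

    def bold(self):
        self.styles.add('bold')
        return self

my_font = SomeFont().italic().bold()
print(my_font.styles)  # {'italic', 'bold'} (set order may vary)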
Finally if you don't like result = A().method(value), consider an alias:
func = A().method
...
result = func(value)
Note how you are left with no reference to the A() instance in your code.
If you need the reference split further the assignment:
a = A()
func = a.method
...
result = func(value)
If the reference to A() is not needed, then you probably don't need the instance either, and the class is just grouping the methods. You can just write
func = A.method
result = func(value)
where selfless methods should be decorated with @staticmethod because there is no instance involved. Note also that such static methods could be turned into simple functions outside classes.
Edit:
I have set up an example similar to what you are trying to accomplish. It is also difficult to judge whether having methods inject results into the next method is the best choice for a multi-step procedure. Because they share some state, they are coupled to each other and so can also inject errors into each other more easily. I assume you want to share some data between them that way (and that's why you are setting them up in a class):
So this an example class where a public method builds the result by calling a chain of internal methods. All methods depend on object state, self.offset in this case, despite getting an input value for calculations.
Because of this it makes sense that every method uses self to access the state. It also makes sense that you are able to instantiate different objects holding different configurations, so I see no use here for #staticmethod or #classmethod.
Initial instance configuration is done in __init__ as usual.
# file: multistepinc.py

class MultiStepInc:
    def __init__(self, offset):
        self.offset = offset

    def result(self, value):
        return self._step1(value)

    def _step1(self, x):
        x = self._step2(x)
        return self.offset + 1 + x

    def _step2(self, x):
        x = self._step3(x)
        return self.offset + 2 + x

    def _step3(self, x):
        return self.offset + 3 + x

def get_multi_step_inc(offset):
    return MultiStepInc(offset).result
--------
# file: multistepinc_example.py
from multistepinc import get_multi_step_inc
# get the result method of a configured
# MultiStepInc instance
# with offset = 10.
# Much like an object factory, but you
# mentioned to prefer to have the result
# method of the instance
# instead of the instance itself.
inc10 = get_multi_step_inc(10)
# invoke the inc10 method
result = inc10(1)
print(result)
# creating another instance with offset=2
inc2 = get_multi_step_inc(2)
result = inc2(1)
print(result)
# if you need to manipulate the object
# instance
# you have to (on file top)
from multistepinc import MultiStepInc
# and then
inc_obj = MultiStepInc(5)
# ...
# ... do something with your obj, then
result = inc_obj.result(1)
print(result)
Outputs:
37
13
22
I'd like a particular function to be callable as a classmethod, and to behave differently when it's called on an instance.
For example, if I have a class Thing, I want Thing.get_other_thing() to work, but also thing = Thing(); thing.get_other_thing() to behave differently.
I think overwriting the get_other_thing method on initialization should work (see below), but that seems a bit hacky. Is there a better way?
class Thing:
    def __init__(self):
        self.get_other_thing = self._get_other_thing_inst

    @classmethod
    def get_other_thing(cls):
        # do something...
        ...

    def _get_other_thing_inst(self):
        # do something else
        ...
Great question! What you seek can be easily done using descriptors.
Descriptors are Python objects which implement the descriptor protocol, usually starting with __get__().
They exist, mostly, to be set as a class attribute on different classes. Upon accessing them, their __get__() method is called, with the instance and owner class passed in.
class DifferentFunc:
    """Deploys a different function according to attribute access.

    I am a descriptor.
    """
    def __init__(self, clsfunc, instfunc):
        # Set our functions
        self.clsfunc = clsfunc
        self.instfunc = instfunc

    def __get__(self, inst, owner):
        # Accessed from the class
        if inst is None:
            return self.clsfunc.__get__(None, owner)

        # Accessed from an instance
        return self.instfunc.__get__(inst, owner)


class Test:
    @classmethod
    def _get_other_thing(cls):
        print("Accessed through class")

    def _get_other_thing_inst(inst):
        print("Accessed through instance")

    get_other_thing = DifferentFunc(_get_other_thing,
                                    _get_other_thing_inst)
And now for the result:
>>> Test.get_other_thing()
Accessed through class
>>> Test().get_other_thing()
Accessed through instance
That was easy!
By the way, did you notice me using __get__ on the class and instance function? Guess what? Functions are also descriptors, and that's the way they work!
>>> def func(self):
... pass
...
>>> func.__get__(object(), object)
<bound method func of <object object at 0x000000000046E100>>
Upon accessing a function attribute, its __get__ is called, and that's how you get function binding.
For more information, I highly suggest reading the Python manual and the "How-To" linked above. Descriptors are one of Python's most powerful features and are barely even known.
Why not set the function on instantiation? Or: why not set self.func = self._func inside __init__?
Setting the function on instantiation comes with quite a few problems:

1. self.func = self._func causes a circular reference. The instance is stored inside the function object returned by self._func, and that function object is in turn stored on the instance during the assignment. The end result is that the instance references itself and will be cleaned up in a much slower and heavier manner.
2. Other code interacting with your class might attempt to take the function straight off the class and use __get__(), which is the usual expected mechanism, to bind it. They will receive the wrong function.
3. It will not work with __slots__.
4. Although with descriptors you need to understand the mechanism, setting functions in __init__ isn't as clean and requires setting multiple functions there.
5. It takes more memory. Instead of storing one single function, you store a bound function for each and every instance.
6. It will not work with properties.

There are many more that I didn't add, as the list goes on and on.
Here is a slightly hacky solution:
class Thing(object):
    @staticmethod
    def get_other_thing():
        return 1

    def __getattribute__(self, name):
        if name == 'get_other_thing':
            return lambda: 2
        return super(Thing, self).__getattribute__(name)

print(Thing.get_other_thing())    # 1
print(Thing().get_other_thing())  # 2
If we access it on the class, the staticmethod is executed. If we access it on an instance, __getattribute__ is executed first, so we can return not Thing.get_other_thing but some other function (a lambda in my case).
I am using the decorator design pattern to build a "composite class" that composes together the behavior of a set of "component classes". The behavior of the relevant method from each component class is governed by a dictionary param_dict, so that each component class has its own param_dict. The composite class also has a composite_param_dict, which is successively built up from the component dictionaries.
The behavior I need is the following: when an instance of the composite class has a value of composite_param_dict changed, I need the behavior of the inherited method to change.
Here is a minimal example of my class design:
class Component(object):
    def __init__(self):
        self.param_dict = {'a': 4}

    def component_method(self, x):
        return self.param_dict['a'] * x
I pass an instance of Component to the Composite constructor:
from copy import copy

class Composite(object):
    def __init__(self, instance):
        self.instance = instance
        # In the following line of code,
        # I use copy to emphasize that there are actually multiple
        # instance.param_dict being passed to __init__,
        # so composite_param_dict is not simply a reference
        self.composite_param_dict = copy(self.instance.param_dict)

        setattr(self, 'composite_method',
                self.param_dict_decorator(getattr(self.instance, 'component_method')))

    def param_dict_decorator(self, func):
        self.instance.param_dict['a'] = self.composite_param_dict['a']
        return func
For the sake of being concise in this example, there is only one component, but in general there are many, so in general composite_param_dict has many keys, and the composite class has many inherited methods.
Additionally, I need to use getattr and setattr because I will not necessarily know in advance what the names of the methods I will need to inherit are. In general, the component models will tell the composite model which methods to inherit, so I cannot hard-code the method names into the composite model. In this minimal example, for the sake of being concise, I have gone ahead and hard-coded the method name component_method, and suppressed the mechanism by which this information is transmitted.
I build my composite class as follows:
component_instance = Component()
composite_instance = Composite(component_instance)
With my decorator written as I have in the above example, changes in the composite_param_dict do not propagate correctly, but I do not understand why not. For example:
composite_instance.composite_param_dict['a'] = 10
print(composite_instance.composite_method(10))
40
If the values of composite_param_dict were correctly propagating, then the correct answer should be 100.
You only call param_dict_decorator once, at the moment when you create the composite_method. It is not called again every time you call the composite method. So it effectively "freezes" self.instance.param_dict with the value present in self.composite_param_dict at the time when you create the composite object.
If you want custom code to run every time composite_method is called, you can't just return func from param_dict_decorator. param_dict_decorator is only called once; it is what is returned from param_dict_decorator that you assign to composite_method, so that is what will be called whenever you call composite_method. So you need param_dict_decorator to return a new function that incorporates the "updating" behavior. Here's an example:
def param_dict_decorator(self, func):
    def wrapper(*args, **kw):
        self.instance.param_dict['a'] = self.composite_param_dict['a']
        return func(*args, **kw)
    return wrapper
With this change, it works:
>>> composite_instance = Composite(component_instance)
>>> composite_instance.composite_method(10)
40
>>> composite_instance.composite_param_dict['a'] = 10
>>> composite_instance.composite_method(10)
100
More generally, the concept of decorators is that they take in a function and return a new function that is meant to replace the original function. In your param_dict_decorator, you just return the original function, so your decorator has no effect at all on the behavior of func.
Ok, here is the real world scenario: I'm writing an application, and I have a class that represents a certain type of files (in my case this is photographs but that detail is irrelevant to the problem). Each instance of the Photograph class should be unique to the photo's filename.
The problem is, when a user tells my application to load a file, I need to be able to identify when files are already loaded, and use the existing instance for that filename, rather than create duplicate instances on the same filename.
To me this seems like a good situation to use memoization, and there's a lot of examples of that out there, but in this case I'm not just memoizing an ordinary function, I need to be memoizing __init__(). This poses a problem, because by the time __init__() gets called it's already too late as there's a new instance created already.
In my research I found Python's __new__() method, and I was actually able to write a working trivial example, but it fell apart when I tried to use it on my real-world objects, and I'm not sure why (the only thing I can think of is that my real world objects were subclasses of other objects that I can't really control, and so there were some incompatibilities with this approach). This is what I had:
class Flub(object):
    instances = {}

    def __new__(cls, flubid):
        try:
            self = Flub.instances[flubid]
        except KeyError:
            self = Flub.instances[flubid] = super(Flub, cls).__new__(cls)
            print('making a new one!')
            self.flubid = flubid
        print(id(self))
        return self

    @staticmethod
    def destroy_all():
        for flub in Flub.instances.values():
            print('killing', flub)
a = Flub('foo')
b = Flub('foo')
c = Flub('bar')

print(a)
print(b)
print(c)
print(a is b, b is c)

Flub.destroy_all()
Which outputs this:
making a new one!
139958663753808
139958663753808
making a new one!
139958663753872
<__main__.Flub object at 0x7f4aaa6fb050>
<__main__.Flub object at 0x7f4aaa6fb050>
<__main__.Flub object at 0x7f4aaa6fb090>
True False
killing <__main__.Flub object at 0x7f4aaa6fb050>
killing <__main__.Flub object at 0x7f4aaa6fb090>
It's perfect! Only two instances were made for the two unique id's given, and Flub.instances clearly only has two listed.
But when I tried to take this approach with the objects I was using, I got all kinds of nonsensical errors about how __init__() took only 0 arguments, not 2. So I'd change some things around and then it would tell me that __init__() needed an argument. Totally bizarre.
After a while of fighting with it, I basically just gave up and moved all the __new__() black magic into a staticmethod called get, such that I could call Photograph.get(filename) and it would only call Photograph(filename) if filename wasn't already in Photograph.instances.
Does anybody know where I went wrong here? Is there some better way to do this?
Another way of thinking about it is that it's similar to a singleton, except it's not globally singleton, just singleton-per-filename.
Here's my real-world code using the staticmethod get if you want to see it all together.
Let us see two points about your question.
Using memoize
You can use memoization, but you should decorate the class, not the __init__ method. Suppose we have this memoizator:
def get_id_tuple(f, args, kwargs, mark=object()):
    """
    Some quick'n'dirty way to generate a unique key for a specific call.
    """
    l = [id(f)]
    for arg in args:
        l.append(id(arg))
    l.append(id(mark))
    for k, v in kwargs.items():
        l.append(k)
        l.append(id(v))
    return tuple(l)

_memoized = {}

def memoize(f):
    """
    Some basic memoizer
    """
    def memoized(*args, **kwargs):
        key = get_id_tuple(f, args, kwargs)
        if key not in _memoized:
            _memoized[key] = f(*args, **kwargs)
        return _memoized[key]
    return memoized
Now you just need to decorate the class:
@memoize
class Test(object):
    def __init__(self, somevalue):
        self.somevalue = somevalue
Let us see a test?
tests = [Test(1), Test(2), Test(3), Test(2), Test(4)]
for test in tests:
    print(test.somevalue, id(test))
The output is below. Note that the same parameters yield the same id of the returned object:
1 3072319660
2 3072319692
3 3072319724
2 3072319692
4 3072319756
Anyway, I would prefer to create a function to generate the objects and memoize that. It seems cleaner to me, but it may just be an irrelevant pet peeve of mine:
class Test(object):
    def __init__(self, somevalue):
        self.somevalue = somevalue

@memoize
def get_test_from_value(somevalue):
    return Test(somevalue)
Using __new__:
Or, of course, you can override __new__. Some days ago I posted an answer about the ins, outs and best practices of overriding __new__ that can be helpful. Basically, it says to always pass *args, **kwargs to your __new__ method.
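A hedged sketch of that advice (the class names below are illustrative): accepting *args and **kwargs in __new__ keeps it compatible with whatever signature a subclass __init__ defines.

class Base(object):
    def __new__(cls, *args, **kwargs):
        instance = super(Base, cls).__new__(cls)
        # ...customize creation here...
        return instance

class Child(Base):
    def __init__(self, value, label=None):
        self.value = value
        self.label = label

print(Child(42, label='answer').value)  # 42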
I, for one, would prefer to memoize a function which creates the objects, or even to write a specific function which would take care of never recreating an object for the same parameters. Of course, however, this is mostly an opinion of mine, not a rule.
The solution that I ended up using is this:
class memoize(object):
    def __init__(self, cls):
        self.cls = cls
        self.__dict__.update(cls.__dict__)

        # This bit allows staticmethods to work as you would expect.
        for attr, val in cls.__dict__.items():
            if type(val) is staticmethod:
                self.__dict__[attr] = val.__func__

    def __call__(self, *args):
        key = '//'.join(map(str, args))
        if key not in self.cls.instances:
            self.cls.instances[key] = self.cls(*args)
        return self.cls.instances[key]
And then you decorate the class with this, not __init__. Although brandizzi provided me with that key piece of information, his example decorator didn't function as desired.
I found this concept quite subtle, but basically when you're using decorators in Python, you need to understand that the thing that gets decorated (whether it's a method or a class) is actually replaced by the decorator itself. So for example when I'd try to access Photograph.instances or Camera.generate_id() (a staticmethod), I couldn't actually access them because Photograph doesn't actually refer to the original Photograph class, it refers to the memoized function (from brandizzi's example).
To get around this, I had to create a decorator class that actually took all the attributes and static methods from the decorated class and exposed them as its own. Almost like a subclass, except that the decorator class doesn't know ahead of time what classes it will be decorating, so it has to copy the attributes over after the fact.
The end result is that any instance of the memoize class becomes an almost transparent wrapper around the actual class that it has decorated, with the exception that attempting to instantiate it (but really calling it) will provide you with cached copies when they're available.
The parameters to __new__ also get passed to __init__, so:
def __init__(self, flubid):
...
You need to accept the flubid argument there, even if you don't use it in __init__.
Here is the relevant comment taken from typeobject.c in Python 2.7.3:
/* You may wonder why object.__new__() only complains about arguments
when object.__init__() is not overridden, and vice versa.
Consider the use cases:
1. When neither is overridden, we want to hear complaints about
excess (i.e., any) arguments, since their presence could
indicate there's a bug.
2. When defining an Immutable type, we are likely to override only
__new__(), since __init__() is called too late to initialize an
Immutable object. Since __new__() defines the signature for the
type, it would be a pain to have to override __init__() just to
stop it from complaining about excess arguments.
3. When defining a Mutable type, we are likely to override only
__init__(). So here the converse reasoning applies: we don't
want to have to override __new__() just to stop it from
complaining.
4. When __init__() is overridden, and the subclass __init__() calls
object.__init__(), the latter should complain about excess
arguments; ditto for __new__().
Use cases 2 and 3 make it unattractive to unconditionally check for
excess arguments. The best solution that addresses all four use
cases is as follows: __init__() complains about excess arguments
unless __new__() is overridden and __init__() is not overridden
(IOW, if __init__() is overridden or __new__() is not overridden);
symmetrically, __new__() complains about excess arguments unless
__init__() is overridden and __new__() is not overridden
(IOW, if __new__() is overridden or __init__() is not overridden).
However, for backwards compatibility, this breaks too much code.
Therefore, in 2.6, we'll *warn* about excess arguments when both
methods are overridden; for all other cases we'll use the above
rules.
*/
I was trying to figure this out as well, and I put together a solution that combines some tips from other Stack Overflow questions (links in the code comments).
If anyone still needs it, try this out:
import functools

def memoize(f):
    class Memoized:
        def __init__(self, func):
            self._f = func
            self._cache = {}
            # Make the Memoized class masquerade as the object we are memoizing.
            # Preserve class attributes
            functools.update_wrapper(self, func)
            # Preserve static methods
            # From https://stackoverflow.com/questions/11174362
            for k, v in func.__dict__.items():
                self.__dict__[k] = v.__func__ if type(v) is staticmethod else v

        def __call__(self, *args, **kwargs):
            # Generate a hashable key from the positional and keyword arguments
            key = args
            if kwargs:
                key += (object,)  # stable separator between args and kwargs
                for k, v in kwargs.items():
                    key += (hash(k), hash(v))
            key = hash(key)
            if key in self._cache:
                return self._cache[key]
            else:
                self._cache[key] = self._f(*args, **kwargs)
                return self._cache[key]

        def __get__(self, instance, owner):
            """
            From https://stackoverflow.com/questions/30104047/how-can-i-decorate-an-instance-method-with-a-decorator-class
            """
            return functools.partial(self.__call__, instance)

        def __instancecheck__(self, other):
            """Make isinstance() work"""
            return isinstance(other, self._f)

    return Memoized(f)
Then you can use it like so:
@memoize
class Test:
    def __init__(self, value):
        self._value = value

    @property
    def value(self):
        return self._value
Uploaded the full thing with documentation to: https://github.com/spoorn/nemoize