Changing an object's attribute temporarily when calling it - Python

I'm trying to understand how to change an object's attribute temporarily when it is called and have the original value persist when the object is not called.
Let me describe the problem with some code:
class DateCalc:
    DEFAULT = "1/1/2001"

    def __init__(self, day=DEFAULT):
        self.day = day

    def __call__(self, day=DEFAULT):
        self.day = day
        return self

    def getday(self):
        return self.day
When a user calls the object with another value, e.g. 2/2/2002, self.day is set to 2/2/2002. However, I want to be able to revert self.day to the original value of 1/1/2001 after the call:
d_obj = DateCalc()
d_obj.getday() == "1/1/2001"  # True
d_obj().getday() == "1/1/2001"  # True
another_day_str = "2/2/2002"
d_obj(another_day_str).getday()  # returns "2/2/2002"
But when I then run:
d_obj.getday()  # returns "2/2/2002"
I was wondering what's the right way to revert the value, without needing to include code at every method call. Secondly, this should also be true when the object is called. For example:
d_obj().getday()
should return
"1/1/2001"
I thought a decorator on the call magic method would work here, but I'm not really sure where to start.
Any help would be much appreciated.

Since you probably don't really want to modify the attributes of your object for a poorly defined interval, you need to return or otherwise create a different object.
The simplest case would be one in which you had two separate objects, and no __call__ method at all:
d1_obj = DateCalc()
d2_obj = DateCalc('2/2/2002')
print(d1_obj.getday()) # 1/1/2001
print(d2_obj.getday()) # 2/2/2002
If you know where you want to use d_obj vs d_obj() in the original case, you clearly know where to use d1_obj vs d2_obj in this version as well.
This may not be adequate for cases where DateCalc actually represents a very complex object that has many attributes that you do not want to change. In that case, you can have the __call__ method return a separate object that intelligently copies the portions of the original that you want.
For a simple case, this could be just
def __call__(self, day=DEFAULT):
    return type(self)(day)
If the object becomes complex enough, you will want to create a proxy. A proxy is an object that forwards most of the implementation details to another object. super() is an example of a proxy that has a very highly customized __getattribute__ implementation, among other things.
In your particular case, you have a couple of requirements:
The proxy must store all overridden attributes.
The proxy must get all non-overridden attributes from the original object.
The proxy must pass itself as the self parameter to any (at least non-special) methods that are invoked.
You can get as complicated with this as you want (in which case look up how to properly implement proxy objects like here). Here is a fairly simple example:
# Assume that there are many fields like `day` that you want to modify
class DateCalc:
    DEFAULT = "1/1/2001"

    def __init__(self, day=DEFAULT):
        self.day = day

    def getday(self):
        return self.day

    def __call__(self, **kwargs):
        class Proxy:
            def __init__(self, original, **kwargs):
                self._self_ = original
                self.__dict__.update(kwargs)

            def __getattribute__(self, name):
                # Don't forward any overridden, dunder or quasi-private attributes
                if name.startswith('_') or name in self.__dict__:
                    return object.__getattribute__(self, name)
                # This part is simplified:
                # it does not take into account __slots__
                # or attributes shadowing methods
                t = type(self._self_)
                if name in t.__dict__:
                    try:
                        return t.__dict__[name].__get__(self, t)
                    except AttributeError:
                        pass
                return getattr(self._self_, name)

        return Proxy(self, **kwargs)
The proxy would work exactly as you would want: it forwards any values that you did not override in __call__ from the original object. The interesting thing is that it binds instance methods to the proxy object instead of the original, so that getday gets called with a self that has the overridden value in it:
d_obj = DateCalc()
print(type(d_obj)) # __main__.DateCalc
print(d_obj.getday()) # 1/1/2001
d2_obj = d_obj(day='2/2/2002')
print(type(d2_obj)) # __main__.DateCalc.__call__.<locals>.Proxy
print(d2_obj.getday()) # 2/2/2002
Keep in mind that the proxy object shown here has very limited functionality implemented, and will not work properly in many situations. That being said, it likely covers many of the use cases that you will have out of the box. A good example is if you chose to make day a property instead of having a getter (it is the more Pythonic approach):
class DateCalc:
    DEFAULT = "1/1/2001"

    def __init__(self, day=DEFAULT):
        self.__dict__['day'] = day

    @property
    def day(self):
        return self.__dict__['day']

    # __call__ same as above
    ...
d_obj = DateCalc()
print(d_obj(day='2/2/2002').day) # 2/2/2002
The catch here is that the proxy's version of day is just a regular writable attribute instead of a read-only property. If this is a problem for you, implementing __setattr__ appropriately on the proxy will be left as an exercise for the reader.
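Since __setattr__ is left as an exercise, here is one possible sketch (my assumption, not part of the original answer) of a method that could be added to the Proxy class so that assignments go through the original class's descriptors, making a read-only property raise AttributeError on the proxy as well:
def __setattr__(self, name, value):
    if name.startswith('_'):
        object.__setattr__(self, name, value)
        return
    t = type(self._self_)
    descriptor = t.__dict__.get(name)
    if descriptor is not None and hasattr(descriptor, '__set__'):
        # A property without a setter raises AttributeError here, as desired
        descriptor.__set__(self, value)
    else:
        object.__setattr__(self, name, value)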

It seems that you want a behavior like a context manager: to modify an attribute for a limited time, use the updated attribute and then revert to the original. You can do this by having __call__ return a context manager, which you can then use in a with block like this:
d_obj = DateCalc()
print(d_obj.getday()) # 1/1/2001
with d_obj('2/2/2002'):
    print(d_obj.getday())  # 2/2/2002
print(d_obj.getday()) # 1/1/2001
There are a couple of ways of creating such a context manager. The simplest would be to use a nested method in __call__ and decorate it with contextlib.contextmanager:
from contextlib import contextmanager
...
def __call__(self, day=DEFAULT):
    @contextmanager
    def context():
        orig = self.day
        self.day = day
        try:
            yield
        finally:
            # Restore the original value even if the with-block raises
            self.day = orig
    return context()
You could also use a fully-fledged nested class for this, but I would not recommend it unless you have some really complex requirements. I am just providing it for completeness:
def __call__(self, day=DEFAULT):
    class Context:
        def __init__(self, inst, new):
            self.inst = inst
            self.old = inst.day
            self.new = new

        def __enter__(self):
            self.inst.day = self.new

        def __exit__(self, *args):
            self.inst.day = self.old

    return Context(self, day)
Also, you should consider making getday a property, especially if it is really read-only.
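For example, a minimal sketch of that property-based version (a hypothetical rewrite, exposing day directly instead of getday):
class DateCalc:
    DEFAULT = "1/1/2001"

    def __init__(self, day=DEFAULT):
        self._day = day

    @property
    def day(self):
        # Read-only: no setter is defined, so d.day = ... raises AttributeError
        return self._day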
Another alternative would be to have your methods accept different values:
def getday(self, day=None):
    if day is None:
        day = self.day
    return day
This is actually a fairly common idiom.
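Usage would then look like this (assuming the DateCalc class from the question, with getday replaced by the version above):
d_obj = DateCalc()
print(d_obj.getday())            # 1/1/2001 (falls back to self.day)
print(d_obj.getday("2/2/2002"))  # 2/2/2002 (one-off override)
print(d_obj.getday())            # 1/1/2001 (the stored state was never touched)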

Related

Promote instantiated class/object to a class in python?

Is there a way in Python to store an instantiated class as a class 'template' (aka promote an object to a class) to create new objects of the same type with the same field values, without relying on using the data that was used to create the original object again, or on copy.deepcopy?
Like, for example I have the dictionary:
valid_date = {"date":"30 february"} # dict could have multiple items
and I have the class:
class AwesomeDate:
    def __init__(self, dates_dict):
        for key, val in dates_dict.items():
            setattr(self, key, val)
I create the instance of the class like:
totally_valid_date = AwesomeDate(valid_date)
print(totally_valid_date.date) # output: 30 february
and now I want to use it to create new instances of the AwesomeDate class using the totally_valid_date instance as a template, i.e. like:
how_make_it_work = totally_valid_date()
print(how_make_it_work.date) # should print: 30 february
Is there a way to do so or not? I need a generic solution, not a solution for this specific example.
I don't really see the benefit of having a class act both as a template to instances, and as the instance itself, both conceptually and coding-wise. In my opinion, you're better off using two different classes - one for the template, one for the objects it is able to create.
You can think of a template class that stores the valid_date attributes upon initialization. Once called, the template returns an instance of a different class that has the expected attributes.
Here's a simple implementation (names have been changed to generalize the idea):
class Thing:
    pass

class Template:
    def __init__(self, template_attrs):
        self.template_attrs = template_attrs

    def __call__(self):
        instance = Thing()
        for key, val in self.template_attrs.items():
            setattr(instance, key, val)
        return instance
attrs = {'date': '30 february'}
template = Template(template_attrs=attrs)
# Gets instance of Thing
print(template()) # output: <__main__.Thing object at 0x7ffa656f8668>
# Gets another instance of Thing and accesses the date attribute
print(template().date) # output: 30 february
Yes, there are ways to do it -
there could even be some tweaking of inheriting from type and meddling with __call__ to make all instances automatically become derived classes. But I don't think that would be very sane. Python's own enum.Enum does something along these lines, because it has some use for the enum values - but the price is that it became hard to understand beyond basic usage, even for seasoned Pythonistas.
However, having a custom __init_subclass__ method that can inject some code to run prior to __init__ on the derived class, and then a method that will return a new class bound with the data that the new classes should have, can suffice:
import copy
from functools import wraps

def wrap_init(init):
    @wraps(init)
    def wrapper(self, *args, **kwargs):
        if not getattr(self, "_initialized", False):
            self.__dict__.update(self._template_data or {})
            self._initialized = True
        return init(self, *args, **kwargs)
    wrapper._template_wrapper = True
    return wrapper

class TemplateBase:
    _template_data = None

    def __init_subclass__(cls, *args, **kwargs):
        super().__init_subclass__(*args, **kwargs)
        if getattr(cls.__init__, "_template_wrapper", False):
            return
        init = cls.__init__
        cls.__init__ = wrap_init(init)

    def as_class(self):
        cls = self.__class__
        new_cls = type(cls.__name__ + "_templated", (cls,), {})
        new_cls._template_data = copy.copy(self.__dict__)
        return new_cls
And using it:
class AwesomeDate(TemplateBase):
    def __init__(self, dates_dict):
        for key, val in dates_dict.items():
            setattr(self, key, val)
On the REPL we have:
In [34]: x = AwesomeDate({"x":1, "y":2})
In [35]: Y = x.as_class()
In [36]: y = Y({})
In [37]: y.x
Out[37]: 1
Actually, __init_subclass__ itself could be suppressed, and decorating __init__ could be done in one shot in the as_class method. This code takes some care so that mixin classes can be used, and it will still work.
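A minimal sketch of that variation (my reading of the suggestion, not tested code from the answer), reusing wrap_init from above:
import copy

class TemplateBase:
    def as_class(self):
        cls = self.__class__
        new_cls = type(cls.__name__ + "_templated", (cls,), {})
        new_cls._template_data = copy.copy(self.__dict__)
        # Wrap __init__ here, in one shot, instead of in __init_subclass__
        new_cls.__init__ = wrap_init(cls.__init__)
        return new_cls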
It seems like you are going for something along the lines of the prototype design pattern.
What is the prototype design pattern?
From Wikipedia: Prototype pattern
The prototype pattern is a creational design pattern in software development. It is used when the type of objects to create is determined by a prototypical instance, which is cloned to produce new objects. This pattern is used to avoid subclasses of an object creator in the client application, like the factory method pattern does and to avoid the inherent cost of creating a new object in the standard way (e.g., using the 'new' keyword) when it is prohibitively expensive for a given application.
From Refactoring.guru: Prototype
Prototype is a creational design pattern that lets you copy existing objects without making your code dependent on their classes. The Prototype pattern delegates the cloning process to the actual objects that are being cloned. The pattern declares a common interface for all objects that support cloning. This interface lets you clone an object without coupling your code to the class of that object. Usually, such an interface contains just a single clone method.
The implementation of the clone method is very similar in all classes. The method creates an object of the current class and carries over all of the field values of the old object into the new one. You can even copy private fields because most programming languages let objects access private fields of other objects that belong to the same class. An object that supports cloning is called a prototype. When your objects have dozens of fields and hundreds of possible configurations, cloning them might serve as an alternative to subclassing. Here’s how it works: you create a set of objects, configured in various ways. When you need an object like the one you’ve configured, you just clone a prototype instead of constructing a new object from scratch.
Implementing this for your problem, along with your other ideas
From your explanation, it seems like you want to:
Provide a variable containing a dictionary, which will be passed to the __init__ of some class Foo
Instantiate class Foo and pass the variable containing the dictionary as an argument.
Implement __call__ onto class Foo, allowing us to use the function call syntax on an object of class Foo.
The implementation of __call__ will COPY/CLONE the “template” object. We can then do whatever we want with this copied/cloned instance.
The Code (edited)
import copy

class Foo:
    def __init__(self, *, template_attrs):
        if not isinstance(template_attrs, dict):
            raise TypeError("You must pass a dict to instantiate this class.")
        self.template_attrs = template_attrs

    def __call__(self):
        # Note: copy.copy is shallow, so the clone still shares the
        # template_attrs dict with the original; use copy.deepcopy for
        # fully independent clones.
        return copy.copy(self)

    def __repr__(self):
        return f"{self.template_attrs}"

    def __setitem__(self, key, value):
        self.template_attrs[key] = value

    def __getitem__(self, key):
        if key not in self.template_attrs:
            raise KeyError(f"Key {key} does not exist in '{self.template_attrs=}'.")
        return self.template_attrs[key]
err = Foo(template_attrs=1) # Output: TypeError: You must pass a dict to instantiate this class.
# remove err's assignment to have code under it run
base = Foo(template_attrs={1: 2})
print(f"{base=}") # Output: base={1: 2}
base_copy = base()
base_copy["hello"] = "bye"
print(f"{base_copy=}") # Output: base_copy={1: 2, 'hello': 'bye'}
print(f"{base_copy[1]=}") # Output: base_copy[1]=2
print(f"{base_copy[10]=}") # Output: KeyError: "Key 10 does not exist in 'self.template_attrs={1: 2, 'hello': 'bye'}'."
I also added support for subscripting and item assignment through __getitem__ and __setitem__ respectively. I hope that this helped a bit with your problem! Feel free to comment on this if I missed what you were asking.
Reasons for edits (May 16th, 2022 at 8:49 PM CST | Approx. 9 hours after original answer)
Fix code based on suggestions by comment from user jsbueno
Handle, in __getitem__, if an instance of class Foo is subscripted with a key that doesn't exist in the dict.
Handle, in __init__, if the type of template_attrs isn't dict (did this based on the fact that you used a dictionary in the body of your question)

Is this a valid strategy pattern?

I need to augment the behavior of a class using external methods, hence I leverage the strategy pattern.
First I define an interface for the signature of methods:
from abc import ABC, abstractmethod

class ILabel(ABC):
    @abstractmethod
    def get_label(self, obj):
        pass
and an implementation of that interface:
class Label(ILabel):
    def __init__(self, prefix):
        self.prefix = prefix

    def get_label(self, merchandise, obj):
        return self.prefix + str(obj) + merchandise.name
And the class whose algorithm I would like to augment:
import types

class Merchandise:
    def __init__(self):
        self.name = "__a_name"

    def get_label(self, obj):
        return str(obj) + self.name

    def display(self, obj, get_label=None):
        if get_label:
            self.get_label = types.MethodType(get_label, self)
        print(self.get_label(obj))
And finally the caller:
# default behavior
x = Merchandise().display("an_obj")
# augmented behavior
label = Label("a_prefix__")
y = Merchandise().display("an_obj", label)
print(f"Default output: {x}")
print(f"Augmented output: {y}")
the output should be:
Default output: an_obj__a_name
Augmented output: a_prefix__an_obj__a_name
Two questions:
Given that instead of an "orphan" method (for lack of a better word) I am sending a method defined within a class, with a reference to self: is this still considered the strategy pattern, or is a different pattern closer to this design?
Since I pass a reference to the Merchandise when registering the method (i.e., types.MethodType(get_label, self)), the get_label method in the Label class has a reference to a Merchandise instance, i.e.:
def get_label(self, merchandise, obj):
The question is, is there any better naming convention for merchandise reference?
Update
In an endeavor to provide a minimal working example, a decent amount of context was stripped, which may lead to thinking the get_label method can be stateless (i.e., without a reference to an instance of Merchandise). The Label.get_label is updated to clarify this point.
Given that instead of an "orphan" method (for lack of a better word) I am sending a method defined within a class, with a reference to self: is this still considered the strategy pattern, or is a different pattern closer to this design?
I would still call this Strategy, but keep reading.
Since I pass a reference to the Merchandise when registering the method (i.e., types.MethodType(get_label, self)), the correct definition of get_label in the Label class is:
This confusion is why you should take a different approach.
Your logic for implementing default behaviour of the strategy is backwards. There is no reason why "a Strategy for getting a label for obj" should need to be a method of the Merchandise class, except that you happen to have the default implementation stored there. Even that doesn't need to be an ordinary method, since it doesn't do anything with self.
This means the code is too complex (because you're needlessly using the types.MethodType machinery and dynamically patching the class) and also has unexpected stateful behaviour: when you call display with a non-None value for get_label, that Strategy will affect future calls to display where None is passed.
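A small demonstration of that leak (hypothetical, assuming the asker's classes and passing the bound label.get_label as the strategy):
m = Merchandise()
m.display("an_obj")  # an_obj__a_name (default behavior)
m.display("an_obj", Label("a_prefix__").get_label)  # a_prefix__an_obj__a_name
m.display("an_obj")  # still prefixed: the patched strategy persisted on m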
If you don't want stateful behaviour, then you want the default-setting logic the other way around - set a local rather than modifying the class:
class Merchandise:
    @staticmethod
    def get_label(obj):
        return str(obj)

    def display(self, obj, get_label=None):
        if get_label is None:
            get_label = Merchandise.get_label
        print(get_label(obj))
Although we don't actually need the "replace None with a default value" pattern here, since we aren't going to mutate the parameter:
class Merchandise:
    @staticmethod
    def get_label(obj):
        return str(obj)

    # Note: the name Merchandise is not bound yet while the class body runs,
    # so the default must reference the staticmethod's underlying function.
    def display(self, obj, get_label=get_label.__func__):
        print(get_label(obj))
And in this toy example it could be even simpler:
class Merchandise:
    def display(self, obj, get_label=str):
        print(get_label(obj))
        # although *this* doesn't rely on `self`, either....
If you do want stateful behaviour, then you should set the state either at initialization, or explicitly later, or both:
class Merchandise:
    def __init__(self, get_label=str):
        self.get_label = get_label

    @property
    def get_label(self): return self._get_label

    @get_label.setter
    def get_label(self, value):
        # may as well do a little verification
        if not callable(value):
            raise TypeError("get_label strategy must be callable")
        self._get_label = value

    def display(self, obj):
        print(self.get_label(obj))
Notice here that self.get_label(obj) is not an ordinary method call: Python first looks up get_label as an attribute (here, through the property, which returns the stored callable), and having found a callable object, it then calls that object.
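A tiny illustration of that lookup-then-call behavior (hypothetical names, not from the answer):
class Holder:
    def __init__(self, fn):
        self.fn = fn  # a plain attribute that happens to hold a callable

h = Holder(str.upper)
print(h.fn("abc"))  # ABC: attribute lookup first, then a call on the result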

Is this sound software engineering practice for class construction?

Is this a plausible and sound way to write a class where a syntactic-sugar @staticmethod is used for the outside to interact with? Thanks.
### script1.py ###
from script2 import SampleClass

output = SampleClass.method1(input_var)

### script2.py ###
class SampleClass(object):
    def __init__(self):
        self.var1 = 'var1'
        self.var2 = 'var2'

    @staticmethod
    def method1(input_var):
        # Syntactic sugar method that the outside uses
        sample_class = SampleClass()
        result = sample_class._method2(input_var)
        return result

    def _method2(self, input_var):
        # Main method executes the various steps.
        self.var4 = self._method3(input_var)
        return self._method4(self.var4)

    def _method3(self, input_var):
        pass

    def _method4(self, var4):
        pass
Answering both your question and your comment: yes, it is possible to write such code, but I see no point in doing it:
class A:
    def __new__(cls, value):
        return cls.meth1(value)

    def meth1(value):
        return value + 1

result = A(100)
print(result)
# output:
# 101
You can't store a reference to a class A instance because you get your method result instead of an A instance. And because of this, an existing __init__ will not be called.
So if the instance just calculates something and gets discarded right away, what you want is to write a simple function, not a class. You are not storing state anywhere.
And if you look at it:
result = some_func(value)
looks exactly to what people expect when reading it, a function call.
So no, it is not a good practice unless you come up with a good use case for it (I can't remember one right now).
Also relevant for this question is the documentation here to understand __new__ and __init__ behaviour.
Regarding your other comment below my answer:
Defining __init__ in a class to set the initial state (attribute values) of the (already) created instance happens all the time. But __new__ has the different goal of customizing the object creation. The instance object does not exist yet when __new__ is run (it is a constructor function). __new__ is rarely needed in Python unless you need things like a singleton, say a class A that always returns the very same object instance (of A) when called with A(). Normal user-defined classes usually return a new object on instantiation. You can check this with the id() builtin function. Another use case is when you create your own version (by subclassing) of an immutable type. Because it's immutable, the value was already set and there is no way of changing the value inside __init__ or later. Hence the need to act before that, adding code inside __new__. Using __new__ without returning an object of the same class type (this is the uncommon case) has the additional problem of not running __init__.
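For instance, a minimal singleton sketch (illustrative only, not from the answer) of the __new__ use case mentioned above:
class Singleton:
    _instance = None

    def __new__(cls, *args, **kwargs):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

a = Singleton()
b = Singleton()
print(a is b)  # True: calling the class always returns the very same instance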
If you are just grouping lots of methods inside a class but there is still no state to store/manage in each instance (you notice this also by the absence of self use in the method bodies), consider not using a class at all and organizing these methods, now turned into selfless functions, in a module or package for import. Because it looks like you are grouping just to organize related code.
If you stick to classes because there is state involved, consider breaking the class into smaller classes with no more than five to seven methods. Think also of giving them some more structure by grouping some of the small classes in various modules/submodules and using subclasses, because a long plain list of small classes (or functions anyway) can be mentally difficult to follow.
This has nothing to do with __new__ usage.
In summary, use the syntax of a call for a function call that returns a result (or None), or for an object instantiation by calling the class name. In this case the usual thing is to return an object of the intended type (the class called). Returning the result of a method usually involves returning a different type, and that can look unexpected to the class user. There is a close use case to this where some coders return self from their methods to allow for chain-like (fluent) syntax:
my_font = SomeFont().italic().bold()
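A minimal sketch (hypothetical implementation) of how such a SomeFont could support that chaining by returning self:
class SomeFont:
    def __init__(self):
        self.styles = set()

    def italic(self):
        self.styles.add("italic")
        return self  # returning self is what enables the chaining

    def bold(self):
        self.styles.add("bold")
        return self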
Finally if you don't like result = A().method(value), consider an alias:
func = A().method
...
result = func(value)
Note how you are left with no reference to the A() instance in your code.
If you need the reference, split the assignment further:
a = A()
func = a.method
...
result = func(value)
If the reference to A() is not needed, then you probably don't need the instance either, and the class is just grouping the methods. You can just write
func = A.method
result = func(value)
where selfless methods should be decorated with @staticmethod because there is no instance involved. Note also how static methods could be turned into simple functions outside classes.
Edit:
I have set up an example similar to what you are trying to accomplish. It is also difficult to judge whether having methods inject results into the next method is the best choice for a multistep procedure. Because they share some state, they are coupled to each other and so can also inject errors into each other more easily. I assume you want to share some data between them that way (and that's why you are setting them up in a class):
So this is an example class where a public method builds the result by calling a chain of internal methods. All methods depend on object state, self.offset in this case, despite getting an input value for calculations.
Because of this it makes sense that every method uses self to access the state. It also makes sense that you are able to instantiate different objects holding different configurations, so I see no use here for #staticmethod or #classmethod.
Initial instance configuration is done in __init__ as usual.
# file: multistepinc.py

class MultiStepInc:
    def __init__(self, offset):
        self.offset = offset

    def result(self, value):
        return self._step1(value)

    def _step1(self, x):
        x = self._step2(x)
        return self.offset + 1 + x

    def _step2(self, x):
        x = self._step3(x)
        return self.offset + 2 + x

    def _step3(self, x):
        return self.offset + 3 + x

def get_multi_step_inc(offset):
    return MultiStepInc(offset).result
--------
# file: multistepinc_example.py
from multistepinc import get_multi_step_inc
# get the result method of a configured
# MultiStepInc instance
# with offset = 10.
# Much like an object factory, but you
# mentioned to prefer to have the result
# method of the instance
# instead of the instance itself.
inc10 = get_multi_step_inc(10)
# invoke the inc10 method
result = inc10(1)
print(result)
# creating another instance with offset=2
inc2 = get_multi_step_inc(2)
result = inc2(1)
print(result)
# if you need to manipulate the object
# instance
# you have to (on file top)
from multistepinc import MultiStepInc
# and then
inc_obj = MultiStepInc(5)
# ...
# ... do something with your obj, then
result = inc_obj.result(1)
print(result)
Outputs:
37
13
22

Python decorator get or set dictionary value in class

I'm working on a class representing an object with numerous associated data. I'm storing these data in a dictionary class attribute called metadata. A representation could be:
{'key1':slowToComputeValue, 'key2':evenSlowerToComputeValue}
The calculation of the values is in some cases very slow, so what I want to do is, using "getter" functions, first try to get the value from the metadata dict. Only on a KeyError (i.e. when the getter tries to get a value for a key which doesn't exist yet) should the value be calculated (and added to the dictionary for fast access the next time the getter is called).
I began with a simple:
try:
    return self.metadata[requested_key]
except KeyError:
    # Implementation of function
As there are many getters in the class, I thought that these first 3 lines of code could be handled by a decorator. However, I'm having problems making this work. The problem is that I need to pass the metadata dictionary from the class instance to the decorator. I've found several tutorials and posts like this one which show that it is possible to send a parameter to an enclosing function, but the difficulty I'm having is sending the instance attribute metadata to it (if I send a string value, it works).
Some example code from my attempt is here:
from functools import wraps

def get_existing_value_from_metadata_if_exists(metadata):
    def decorator(function):
        @wraps(function)
        def decorated(*args, **kwargs):
            function_name = function.__name__
            if function_name in metadata.keys():
                return metadata[function_name]
            else:
                return function(*args, **kwargs)
        return decorated
    return decorator

class my_class():
    @get_existing_value_from_metadata_if_exists(metadata)
    def get_key1(self):
        # Costly value calculation and add to metadata
        ...

    @get_existing_value_from_metadata_if_exists(metadata)
    def get_key2(self):
        # Costly value calculation and add to metadata
        ...

    def __init__(self):
        self.metadata = {}
The errors I'm getting are generally self not defined but I've tried various combinations of parameter placement, decorator placement etc. without success.
So my questions are:
How can I make this work?
Are decorators a suitable way to achieve what I'm trying to do?
Yes, a decorator is a good use case for this. Django, for example, already includes something similar; it's called cached_property.
Basically, all it does is that when the property is accessed the first time, it stores the data in the instance's dict (__dict__) under the same name as the function. When we fetch the same property later on, it simply fetches the value from the instance dictionary.
A cached_property is a non-data descriptor. Hence, once the key is set in the instance's dictionary, access to the property will always get the value from there.
class cached_property(object):
    """
    Decorator that converts a method with a single self argument into a
    property cached on the instance.

    Optional ``name`` argument allows you to make cached properties of other
    methods. (e.g. url = cached_property(get_absolute_url, name='url') )
    """
    def __init__(self, func, name=None):
        self.func = func
        self.__doc__ = getattr(func, '__doc__')
        self.name = name or func.__name__

    def __get__(self, instance, cls=None):
        if instance is None:
            return self
        res = instance.__dict__[self.name] = self.func(instance)
        return res
In your case:
class MyClass:
    @cached_property
    def key1(self):
        # Costly value calculation and add to metadata
        ...

    @cached_property
    def key2(self):
        # Costly value calculation and add to metadata
        ...

    def __init__(self):
        # self.metadata not required
        ...
Use the name argument to convert an existing method to cached property.
class MyClass:
    def __init__(self, data):
        self.data = data

    def get_total(self):
        print('Processing...')
        return sum(self.data)

    total = cached_property(get_total, 'total')
Demo:
>>> m = MyClass(list(range(10**5)))
>>> m.get_total()
Processing...
4999950000
>>> m.total
Processing...
4999950000
>>> m.total
4999950000
>>> m.data.append(1000)
>>> m.total # This is now invalid
4999950000
>>> m.get_total() # This still works
Processing...
4999951000
>>> m.total
4999950000
Based on the example above we can see that we can use total as long as we know the internal data hasn't been updated yet, hence saving processing time. But it doesn't make get_total() redundant, as it can get the correct total based on the data.
Another example could be that our public-facing client was using something (say get_full_name()) as a method so far, but we realised that it would be more appropriate to use it as a property (just full_name). In that case it makes sense to keep the method intact but mark it as deprecated, and start suggesting that users use the new property from now on.
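A possible sketch of that deprecation path (my assumption of how it could look, with hypothetical full_name/get_full_name names):
import warnings

class Person:
    def __init__(self, first, last):
        self.first, self.last = first, last

    @property
    def full_name(self):
        return f"{self.first} {self.last}"

    def get_full_name(self):
        # Keep the old method working, but steer callers to the property
        warnings.warn("get_full_name() is deprecated; use the full_name property",
                      DeprecationWarning, stacklevel=2)
        return self.full_name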
Another way to go about this is to use class "properties" like so:
class MyClass():
    def __init__(self):
        self._slowToComputeValue = None

    @property
    def slowToComputeValue(self):
        if self._slowToComputeValue is None:
            self._slowToComputeValue = self.ComputeValue()
        return self._slowToComputeValue

    def ComputeValue(self):
        pass
Now you can access this as though it were a regular attribute:
myclass = MyClass()
print(myclass.slowToComputeValue)

How can I memoize a class instantiation in Python?

Ok, here is the real world scenario: I'm writing an application, and I have a class that represents a certain type of files (in my case this is photographs but that detail is irrelevant to the problem). Each instance of the Photograph class should be unique to the photo's filename.
The problem is, when a user tells my application to load a file, I need to be able to identify when files are already loaded, and use the existing instance for that filename, rather than create duplicate instances on the same filename.
To me this seems like a good situation to use memoization, and there's a lot of examples of that out there, but in this case I'm not just memoizing an ordinary function, I need to be memoizing __init__(). This poses a problem, because by the time __init__() gets called it's already too late as there's a new instance created already.
In my research I found Python's __new__() method, and I was actually able to write a working trivial example, but it fell apart when I tried to use it on my real-world objects, and I'm not sure why (the only thing I can think of is that my real world objects were subclasses of other objects that I can't really control, and so there were some incompatibilities with this approach). This is what I had:
class Flub(object):
    instances = {}

    def __new__(cls, flubid):
        try:
            self = Flub.instances[flubid]
        except KeyError:
            self = Flub.instances[flubid] = super(Flub, cls).__new__(cls)
            print('making a new one!')
            self.flubid = flubid
        print(id(self))
        return self

    @staticmethod
    def destroy_all():
        for flub in Flub.instances.values():
            print('killing', flub)

a = Flub('foo')
b = Flub('foo')
c = Flub('bar')

print(a)
print(b)
print(c)
print(a is b, b is c)

Flub.destroy_all()
Which output this:
making a new one!
139958663753808
139958663753808
making a new one!
139958663753872
<__main__.Flub object at 0x7f4aaa6fb050>
<__main__.Flub object at 0x7f4aaa6fb050>
<__main__.Flub object at 0x7f4aaa6fb090>
True False
killing <__main__.Flub object at 0x7f4aaa6fb050>
killing <__main__.Flub object at 0x7f4aaa6fb090>
It's perfect! Only two instances were made for the two unique id's given, and Flub.instances clearly only has two listed.
But when I tried to take this approach with the objects I was using, I got all kinds of nonsensical errors about how __init__() took only 0 arguments, not 2. So I'd change some things around and then it would tell me that __init__() needed an argument. Totally bizarre.
After a while of fighting with it, I basically just gave up and moved all the __new__() black magic into a staticmethod called get, such that I could call Photograph.get(filename) and it would only call Photograph(filename) if filename wasn't already in Photograph.instances.
Does anybody know where I went wrong here? Is there some better way to do this?
Another way of thinking about it is that it's similar to a singleton, except it's not globally singleton, just singleton-per-filename.
Here's my real-world code using the staticmethod get if you want to see it all together.
Let us see two points about your question.
Using memoize
You can use memoization, but you should decorate the class, not the __init__ method. Suppose we have this memoizator:
def get_id_tuple(f, args, kwargs, mark=object()):
    """
    Some quick'n'dirty way to generate a unique key for a specific call.
    """
    l = [id(f)]
    for arg in args:
        l.append(id(arg))
    l.append(id(mark))
    for k, v in kwargs.items():
        l.append(k)
        l.append(id(v))
    return tuple(l)

_memoized = {}

def memoize(f):
    """
    Some basic memoizer
    """
    def memoized(*args, **kwargs):
        key = get_id_tuple(f, args, kwargs)
        if key not in _memoized:
            _memoized[key] = f(*args, **kwargs)
        return _memoized[key]
    return memoized
Now you just need to decorate the class:
@memoize
class Test(object):
    def __init__(self, somevalue):
        self.somevalue = somevalue
Let us see a test?
tests = [Test(1), Test(2), Test(3), Test(2), Test(4)]
for test in tests:
    print(test.somevalue, id(test))
The output is below. Note that the same parameters yield the same id of the returned object:
1 3072319660
2 3072319692
3 3072319724
2 3072319692
4 3072319756
Anyway, I would prefer to create a function to generate the objects and memoize it. Seems cleaner to me, but it may be some irrelevant pet peeve:
class Test(object):
    def __init__(self, somevalue):
        self.somevalue = somevalue

@memoize
def get_test_from_value(somevalue):
    return Test(somevalue)
Using __new__:
Or, of course, you can override __new__. Some days ago I posted an answer about the ins, outs and best practices of overriding __new__ that can be helpful. Basically, it says to always pass *args, **kwargs to your __new__ method.
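A minimal sketch of that advice (illustrative, not taken from the linked answer): accept *args and **kwargs in __new__ so the same arguments can flow through to __init__:
class CachedByKey:
    instances = {}

    def __new__(cls, key, *args, **kwargs):
        # Accept everything; Python passes the same arguments on to __init__
        if key not in cls.instances:
            cls.instances[key] = super().__new__(cls)
        return cls.instances[key]

    def __init__(self, key, value=None):
        # Note: __init__ runs again even when a cached instance is returned
        self.key = key
        self.value = value

a = CachedByKey("k", value=1)
b = CachedByKey("k", value=2)
print(a is b)  # True (the second __init__ call overwrote value with 2)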
I, for one, would prefer to memoize a function which creates the objects, or even write a specific function which would take care of never recreating an object for the same parameter. Of course, however, this is mostly an opinion of mine, not a rule.
The solution that I ended up using is this:
class memoize(object):
    def __init__(self, cls):
        self.cls = cls
        self.__dict__.update(cls.__dict__)
        # This bit allows staticmethods to work as you would expect.
        for attr, val in cls.__dict__.items():
            if type(val) is staticmethod:
                self.__dict__[attr] = val.__func__

    def __call__(self, *args):
        key = '//'.join(map(str, args))
        if key not in self.cls.instances:
            self.cls.instances[key] = self.cls(*args)
        return self.cls.instances[key]
And then you decorate the class with this, not __init__. Although brandizzi provided me with that key piece of information, his example decorator didn't function as desired.
I found this concept quite subtle, but basically when you're using decorators in Python, you need to understand that the thing that gets decorated (whether it's a method or a class) is actually replaced by the decorator itself. So for example when I'd try to access Photograph.instances or Camera.generate_id() (a staticmethod), I couldn't actually access them because Photograph doesn't actually refer to the original Photograph class, it refers to the memoized function (from brandizzi's example).
To get around this, I had to create a decorator class that actually took all the attributes and static methods from the decorated class and exposed them as its own. Almost like a subclass, except that the decorator class doesn't know ahead of time what classes it will be decorating, so it has to copy the attributes over after the fact.
The end result is that any instance of the memoize class becomes an almost transparent wrapper around the actual class that it has decorated, with the exception that attempting to instantiate it (but really calling it) will provide you with cached copies when they're available.
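A short usage sketch (with a hypothetical Photograph class that defines the instances dict the decorator relies on):
@memoize
class Photograph:
    instances = {}

    def __init__(self, filename):
        self.filename = filename

p1 = Photograph('photo.jpg')
p2 = Photograph('photo.jpg')
print(p1 is p2)              # True: the second call returned the cached instance
print(Photograph.instances)  # works because memoize copied the class attributes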
The parameters to __new__ also get passed to __init__, so:
def __init__(self, flubid):
    ...
You need to accept the flubid argument there, even if you don't use it in __init__.
Here is the relevant comment, taken from typeobject.c in Python 2.7.3:
/* You may wonder why object.__new__() only complains about arguments
when object.__init__() is not overridden, and vice versa.
Consider the use cases:
1. When neither is overridden, we want to hear complaints about
excess (i.e., any) arguments, since their presence could
indicate there's a bug.
2. When defining an Immutable type, we are likely to override only
__new__(), since __init__() is called too late to initialize an
Immutable object. Since __new__() defines the signature for the
type, it would be a pain to have to override __init__() just to
stop it from complaining about excess arguments.
3. When defining a Mutable type, we are likely to override only
__init__(). So here the converse reasoning applies: we don't
want to have to override __new__() just to stop it from
complaining.
4. When __init__() is overridden, and the subclass __init__() calls
object.__init__(), the latter should complain about excess
arguments; ditto for __new__().
Use cases 2 and 3 make it unattractive to unconditionally check for
excess arguments. The best solution that addresses all four use
cases is as follows: __init__() complains about excess arguments
unless __new__() is overridden and __init__() is not overridden
(IOW, if __init__() is overridden or __new__() is not overridden);
symmetrically, __new__() complains about excess arguments unless
__init__() is overridden and __new__() is not overridden
(IOW, if __new__() is overridden or __init__() is not overridden).
However, for backwards compatibility, this breaks too much code.
Therefore, in 2.6, we'll *warn* about excess arguments when both
methods are overridden; for all other cases we'll use the above
rules.
*/
Was trying to figure this out as well and I put together a solution that combines some tips from other StackOverflow questions (links in the code comments).
If anyone still needs it, try this out:
import functools

# Module-level sentinel used to separate positional from keyword arguments
# in the cache key.
_KWARGS_MARK = object()

def memoize(f):
    class Memoized:
        def __init__(self, func):
            self._f = func
            self._cache = {}
            # Make the Memoized class masquerade as the object we are memoizing.
            # Preserve class attributes
            functools.update_wrapper(self, func)
            # Preserve static methods
            # From https://stackoverflow.com/questions/11174362
            for k, v in func.__dict__.items():
                self.__dict__[k] = v.__func__ if type(v) is staticmethod else v

        def __call__(self, *args, **kwargs):
            # Generate a hashable key from the positional and keyword arguments
            key = args
            if kwargs:
                key += (_KWARGS_MARK,)
                for k, v in kwargs.items():
                    key += (hash(k), hash(v))
            key = hash(key)
            if key in self._cache:
                return self._cache[key]
            else:
                self._cache[key] = self._f(*args, **kwargs)
                return self._cache[key]

        def __get__(self, instance, owner):
            """
            From https://stackoverflow.com/questions/30104047/how-can-i-decorate-an-instance-method-with-a-decorator-class
            """
            return functools.partial(self.__call__, instance)

        def __instancecheck__(self, other):
            """Make isinstance() work"""
            return isinstance(other, self._f)

    return Memoized(f)
Then you can use like so:
@memoize
class Test:
    def __init__(self, value):
        self._value = value

    @property
    def value(self):
        return self._value
Uploaded the full thing with documentation to: https://github.com/spoorn/nemoize
