How is this Python design pattern called? - python

In the ffmpeg-python docs, they used following design pattern for their examples:
(
ffmpeg
.input('dummy.mp4')
.filter('fps', fps=25, round='up')
.output('dummy2.mp4')
.run()
)
How is this design pattern called, where can I find more information about it, and what are the pros and cons of it ?

This design pattern called builder, you can read about it in here
basically, all the command (except run) change the object, and return it self, which allow you to "build" the object as it go on.
in my opinion is very usefull things, in query building is super good, and can be simplify a code.
think about db query you want to build, lets say we use sql.
# lets say we implement a builder called query
class query:
def __init__():
...
def from(self, db_name):
self.db_name = db_name
return self
....
q = query()
.from("db_name") # Notice every line change something (like here change query.db_name to "db_name"
.fields("username")
.where(id=2)
.execute() # this line will run the query on the server and return the output

This is called 'method chaining' or 'function chaining'. You can chain method calls together because each method call returns the underlying object itself (represented by self in Python, or this in other languages).
It's a technique used in the Gang of Four builder design pattern where you construct an initial object and then chain additional property setters, for example: car().withColor('red').withDoors(2).withSunroof().
Here's an example:
class Arithmetic:
def __init__(self):
self.value = 0
def total(self, *args):
self.value = sum(args)
return self
def double(self):
self.value *= 2
return self
def add(self, x):
self.value += x
return self
def subtract(self, x):
self.value -= x
return self
def __str__(self):
return f"{self.value}"
a = Arithmetic().total(1, 2, 3)
print(a) # 6
a = Arithmetic().total(1, 2, 3).double()
print(a) # 12
a = Arithmetic().total(1, 2, 3).double().subtract(3)
print(a) # 9

Related

Function overload errors [duplicate]

I know that Python does not support method overloading, but I've run into a problem that I can't seem to solve in a nice Pythonic way.
I am making a game where a character needs to shoot a variety of bullets, but how do I write different functions for creating these bullets? For example suppose I have a function that creates a bullet travelling from point A to B with a given speed. I would write a function like this:
def add_bullet(sprite, start, headto, speed):
# Code ...
But I want to write other functions for creating bullets like:
def add_bullet(sprite, start, direction, speed):
def add_bullet(sprite, start, headto, spead, acceleration):
def add_bullet(sprite, script): # For bullets that are controlled by a script
def add_bullet(sprite, curve, speed): # for bullets with curved paths
# And so on ...
And so on with many variations. Is there a better way to do it without using so many keyword arguments cause its getting kinda ugly fast. Renaming each function is pretty bad too because you get either add_bullet1, add_bullet2, or add_bullet_with_really_long_name.
To address some answers:
No I can't create a Bullet class hierarchy because thats too slow. The actual code for managing bullets is in C and my functions are wrappers around C API.
I know about the keyword arguments but checking for all sorts of combinations of parameters is getting annoying, but default arguments help allot like acceleration=0
What you are asking for is called multiple dispatch. See Julia language examples which demonstrates different types of dispatches.
However, before looking at that, we'll first tackle why overloading is not really what you want in Python.
Why Not Overloading?
First, one needs to understand the concept of overloading and why it's not applicable to Python.
When working with languages that can discriminate data types at
compile-time, selecting among the alternatives can occur at
compile-time. The act of creating such alternative functions for
compile-time selection is usually referred to as overloading a
function. (Wikipedia)
Python is a dynamically typed language, so the concept of overloading simply does not apply to it. However, all is not lost, since we can create such alternative functions at run-time:
In programming languages that defer data type identification until
run-time the selection among alternative
functions must occur at run-time, based on the dynamically determined
types of function arguments. Functions whose alternative
implementations are selected in this manner are referred to most
generally as multimethods. (Wikipedia)
So we should be able to do multimethods in Python—or, as it is alternatively called: multiple dispatch.
Multiple dispatch
The multimethods are also called multiple dispatch:
Multiple dispatch or multimethods is the feature of some
object-oriented programming languages in which a function or method
can be dynamically dispatched based on the run time (dynamic) type of
more than one of its arguments. (Wikipedia)
Python does not support this out of the box1, but, as it happens, there is an excellent Python package called multipledispatch that does exactly that.
Solution
Here is how we might use multipledispatch2 package to implement your methods:
>>> from multipledispatch import dispatch
>>> from collections import namedtuple
>>> from types import * # we can test for lambda type, e.g.:
>>> type(lambda a: 1) == LambdaType
True
>>> Sprite = namedtuple('Sprite', ['name'])
>>> Point = namedtuple('Point', ['x', 'y'])
>>> Curve = namedtuple('Curve', ['x', 'y', 'z'])
>>> Vector = namedtuple('Vector', ['x','y','z'])
>>> #dispatch(Sprite, Point, Vector, int)
... def add_bullet(sprite, start, direction, speed):
... print("Called Version 1")
...
>>> #dispatch(Sprite, Point, Point, int, float)
... def add_bullet(sprite, start, headto, speed, acceleration):
... print("Called version 2")
...
>>> #dispatch(Sprite, LambdaType)
... def add_bullet(sprite, script):
... print("Called version 3")
...
>>> #dispatch(Sprite, Curve, int)
... def add_bullet(sprite, curve, speed):
... print("Called version 4")
...
>>> sprite = Sprite('Turtle')
>>> start = Point(1,2)
>>> direction = Vector(1,1,1)
>>> speed = 100 #km/h
>>> acceleration = 5.0 #m/s**2
>>> script = lambda sprite: sprite.x * 2
>>> curve = Curve(3, 1, 4)
>>> headto = Point(100, 100) # somewhere far away
>>> add_bullet(sprite, start, direction, speed)
Called Version 1
>>> add_bullet(sprite, start, headto, speed, acceleration)
Called version 2
>>> add_bullet(sprite, script)
Called version 3
>>> add_bullet(sprite, curve, speed)
Called version 4
1. Python 3 currently supports single dispatch
2. Take care not to use multipledispatch in a multi-threaded environment or you will get weird behavior.
Python does support "method overloading" as you present it. In fact, what you just describe is trivial to implement in Python, in so many different ways, but I would go with:
class Character(object):
# your character __init__ and other methods go here
def add_bullet(self, sprite=default, start=default,
direction=default, speed=default, accel=default,
curve=default):
# do stuff with your arguments
In the above code, default is a plausible default value for those arguments, or None. You can then call the method with only the arguments you are interested in, and Python will use the default values.
You could also do something like this:
class Character(object):
# your character __init__ and other methods go here
def add_bullet(self, **kwargs):
# here you can unpack kwargs as (key, values) and
# do stuff with them, and use some global dictionary
# to provide default values and ensure that ``key``
# is a valid argument...
# do stuff with your arguments
Another alternative is to directly hook the desired function directly to the class or instance:
def some_implementation(self, arg1, arg2, arg3):
# implementation
my_class.add_bullet = some_implementation_of_add_bullet
Yet another way is to use an abstract factory pattern:
class Character(object):
def __init__(self, bfactory, *args, **kwargs):
self.bfactory = bfactory
def add_bullet(self):
sprite = self.bfactory.sprite()
speed = self.bfactory.speed()
# do stuff with your sprite and speed
class pretty_and_fast_factory(object):
def sprite(self):
return pretty_sprite
def speed(self):
return 10000000000.0
my_character = Character(pretty_and_fast_factory(), a1, a2, kw1=v1, kw2=v2)
my_character.add_bullet() # uses pretty_and_fast_factory
# now, if you have another factory called "ugly_and_slow_factory"
# you can change it at runtime in python by issuing
my_character.bfactory = ugly_and_slow_factory()
# In the last example you can see abstract factory and "method
# overloading" (as you call it) in action
You can use "roll-your-own" solution for function overloading. This one is copied from Guido van Rossum's article about multimethods (because there is little difference between multimethods and overloading in Python):
registry = {}
class MultiMethod(object):
def __init__(self, name):
self.name = name
self.typemap = {}
def __call__(self, *args):
types = tuple(arg.__class__ for arg in args) # a generator expression!
function = self.typemap.get(types)
if function is None:
raise TypeError("no match")
return function(*args)
def register(self, types, function):
if types in self.typemap:
raise TypeError("duplicate registration")
self.typemap[types] = function
def multimethod(*types):
def register(function):
name = function.__name__
mm = registry.get(name)
if mm is None:
mm = registry[name] = MultiMethod(name)
mm.register(types, function)
return mm
return register
The usage would be
from multimethods import multimethod
import unittest
# 'overload' makes more sense in this case
overload = multimethod
class Sprite(object):
pass
class Point(object):
pass
class Curve(object):
pass
#overload(Sprite, Point, Direction, int)
def add_bullet(sprite, start, direction, speed):
# ...
#overload(Sprite, Point, Point, int, int)
def add_bullet(sprite, start, headto, speed, acceleration):
# ...
#overload(Sprite, str)
def add_bullet(sprite, script):
# ...
#overload(Sprite, Curve, speed)
def add_bullet(sprite, curve, speed):
# ...
Most restrictive limitations at the moment are:
methods are not supported, only functions that are not class members;
inheritance is not handled;
kwargs are not supported;
registering new functions should be done at import time thing is not thread-safe
A possible option is to use the multipledispatch module as detailed here:
http://matthewrocklin.com/blog/work/2014/02/25/Multiple-Dispatch
Instead of doing this:
def add(self, other):
if isinstance(other, Foo):
...
elif isinstance(other, Bar):
...
else:
raise NotImplementedError()
You can do this:
from multipledispatch import dispatch
#dispatch(int, int)
def add(x, y):
return x + y
#dispatch(object, object)
def add(x, y):
return "%s + %s" % (x, y)
With the resulting usage:
>>> add(1, 2)
3
>>> add(1, 'hello')
'1 + hello'
In Python 3.4 PEP-0443. Single-dispatch generic functions was added.
Here is a short API description from PEP.
To define a generic function, decorate it with the #singledispatch decorator. Note that the dispatch happens on the type of the first argument. Create your function accordingly:
from functools import singledispatch
#singledispatch
def fun(arg, verbose=False):
if verbose:
print("Let me just say,", end=" ")
print(arg)
To add overloaded implementations to the function, use the register() attribute of the generic function. This is a decorator, taking a type parameter and decorating a function implementing the operation for that type:
#fun.register(int)
def _(arg, verbose=False):
if verbose:
print("Strength in numbers, eh?", end=" ")
print(arg)
#fun.register(list)
def _(arg, verbose=False):
if verbose:
print("Enumerate this:")
for i, elem in enumerate(arg):
print(i, elem)
The #overload decorator was added with type hints (PEP 484).
While this doesn't change the behaviour of Python, it does make it easier to understand what is going on, and for mypy to detect errors.
See: Type hints and PEP 484
This type of behaviour is typically solved (in OOP languages) using polymorphism. Each type of bullet would be responsible for knowing how it travels. For instance:
class Bullet(object):
def __init__(self):
self.curve = None
self.speed = None
self.acceleration = None
self.sprite_image = None
class RegularBullet(Bullet):
def __init__(self):
super(RegularBullet, self).__init__()
self.speed = 10
class Grenade(Bullet):
def __init__(self):
super(Grenade, self).__init__()
self.speed = 4
self.curve = 3.5
add_bullet(Grendade())
def add_bullet(bullet):
c_function(bullet.speed, bullet.curve, bullet.acceleration, bullet.sprite, bullet.x, bullet.y)
void c_function(double speed, double curve, double accel, char[] sprite, ...) {
if (speed != null && ...) regular_bullet(...)
else if (...) curved_bullet(...)
//..etc..
}
Pass as many arguments to the c_function that exist, and then do the job of determining which c function to call based on the values in the initial c function. So, Python should only ever be calling the one c function. That one c function looks at the arguments, and then can delegate to other c functions appropriately.
You're essentially just using each subclass as a different data container, but by defining all the potential arguments on the base class, the subclasses are free to ignore the ones they do nothing with.
When a new type of bullet comes along, you can simply define one more property on the base, change the one python function so that it passes the extra property, and the one c_function that examines the arguments and delegates appropriately. It doesn't sound too bad I guess.
It is impossible by definition to overload a function in python (read on for details), but you can achieve something similar with a simple decorator
class overload:
def __init__(self, f):
self.cases = {}
def args(self, *args):
def store_function(f):
self.cases[tuple(args)] = f
return self
return store_function
def __call__(self, *args):
function = self.cases[tuple(type(arg) for arg in args)]
return function(*args)
You can use it like this
#overload
def f():
pass
#f.args(int, int)
def f(x, y):
print('two integers')
#f.args(float)
def f(x):
print('one float')
f(5.5)
f(1, 2)
Modify it to adapt it to your use case.
A clarification of concepts
function dispatch: there are multiple functions with the same name. Which one should be called? two strategies
static/compile-time dispatch (aka. "overloading"). decide which function to call based on the compile-time type of the arguments. In all dynamic languages, there is no compile-time type, so overloading is impossible by definition
dynamic/run-time dispatch: decide which function to call based on the runtime type of the arguments. This is what all OOP languages do: multiple classes have the same methods, and the language decides which one to call based on the type of self/this argument. However, most languages only do it for the this argument only. The above decorator extends the idea to multiple parameters.
To clear up, assume that we define, in a hypothetical static language, the functions
void f(Integer x):
print('integer called')
void f(Float x):
print('float called')
void f(Number x):
print('number called')
Number x = new Integer('5')
f(x)
x = new Number('3.14')
f(x)
With static dispatch (overloading) you will see "number called" twice, because x has been declared as Number, and that's all overloading cares about. With dynamic dispatch you will see "integer called, float called", because those are the actual types of x at the time the function is called.
By passing keyword args.
def add_bullet(**kwargs):
#check for the arguments listed above and do the proper things
Python 3.8 added functools.singledispatchmethod
Transform a method into a single-dispatch generic function.
To define a generic method, decorate it with the #singledispatchmethod
decorator. Note that the dispatch happens on the type of the first
non-self or non-cls argument, create your function accordingly:
from functools import singledispatchmethod
class Negator:
#singledispatchmethod
def neg(self, arg):
raise NotImplementedError("Cannot negate a")
#neg.register
def _(self, arg: int):
return -arg
#neg.register
def _(self, arg: bool):
return not arg
negator = Negator()
for v in [42, True, "Overloading"]:
neg = negator.neg(v)
print(f"{v=}, {neg=}")
Output
v=42, neg=-42
v=True, neg=False
NotImplementedError: Cannot negate a
#singledispatchmethod supports nesting with other decorators such as
#classmethod. Note that to allow for dispatcher.register,
singledispatchmethod must be the outer most decorator. Here is the
Negator class with the neg methods being class bound:
from functools import singledispatchmethod
class Negator:
#singledispatchmethod
#staticmethod
def neg(arg):
raise NotImplementedError("Cannot negate a")
#neg.register
def _(arg: int) -> int:
return -arg
#neg.register
def _(arg: bool) -> bool:
return not arg
for v in [42, True, "Overloading"]:
neg = Negator.neg(v)
print(f"{v=}, {neg=}")
Output:
v=42, neg=-42
v=True, neg=False
NotImplementedError: Cannot negate a
The same pattern can be used for other similar decorators:
staticmethod, abstractmethod, and others.
I think your basic requirement is to have a C/C++-like syntax in Python with the least headache possible. Although I liked Alexander Poluektov's answer it doesn't work for classes.
The following should work for classes. It works by distinguishing by the number of non-keyword arguments (but it doesn't support distinguishing by type):
class TestOverloading(object):
def overloaded_function(self, *args, **kwargs):
# Call the function that has the same number of non-keyword arguments.
getattr(self, "_overloaded_function_impl_" + str(len(args)))(*args, **kwargs)
def _overloaded_function_impl_3(self, sprite, start, direction, **kwargs):
print "This is overload 3"
print "Sprite: %s" % str(sprite)
print "Start: %s" % str(start)
print "Direction: %s" % str(direction)
def _overloaded_function_impl_2(self, sprite, script):
print "This is overload 2"
print "Sprite: %s" % str(sprite)
print "Script: "
print script
And it can be used simply like this:
test = TestOverloading()
test.overloaded_function("I'm a Sprite", 0, "Right")
print
test.overloaded_function("I'm another Sprite", "while x == True: print 'hi'")
Output:
This is overload 3
Sprite: I'm a Sprite
Start: 0
Direction: Right
This is overload 2
Sprite: I'm another Sprite
Script:
while x == True: print 'hi'
You can achieve this with the following Python code:
#overload
def test(message: str):
return message
#overload
def test(number: int):
return number + 1
Either use multiple keyword arguments in the definition, or create a Bullet hierarchy whose instances are passed to the function.
I think a Bullet class hierarchy with the associated polymorphism is the way to go. You can effectively overload the base class constructor by using a metaclass so that calling the base class results in the creation of the appropriate subclass object. Below is some sample code to illustrate the essence of what I mean.
Updated
The code has been modified to run under both Python 2 and 3 to keep it relevant. This was done in a way that avoids the use Python's explicit metaclass syntax, which varies between the two versions.
To accomplish that objective, a BulletMetaBase instance of the BulletMeta class is created by explicitly calling the metaclass when creating the Bullet baseclass (rather than using the __metaclass__= class attribute or via a metaclass keyword argument depending on the Python version).
class BulletMeta(type):
def __new__(cls, classname, bases, classdict):
""" Create Bullet class or a subclass of it. """
classobj = type.__new__(cls, classname, bases, classdict)
if classname != 'BulletMetaBase':
if classname == 'Bullet': # Base class definition?
classobj.registry = {} # Initialize subclass registry.
else:
try:
alias = classdict['alias']
except KeyError:
raise TypeError("Bullet subclass %s has no 'alias'" %
classname)
if alias in Bullet.registry: # unique?
raise TypeError("Bullet subclass %s's alias attribute "
"%r already in use" % (classname, alias))
# Register subclass under the specified alias.
classobj.registry[alias] = classobj
return classobj
def __call__(cls, alias, *args, **kwargs):
""" Bullet subclasses instance factory.
Subclasses should only be instantiated by calls to the base
class with their subclass' alias as the first arg.
"""
if cls != Bullet:
raise TypeError("Bullet subclass %r objects should not to "
"be explicitly constructed." % cls.__name__)
elif alias not in cls.registry: # Bullet subclass?
raise NotImplementedError("Unknown Bullet subclass %r" %
str(alias))
# Create designated subclass object (call its __init__ method).
subclass = cls.registry[alias]
return type.__call__(subclass, *args, **kwargs)
class Bullet(BulletMeta('BulletMetaBase', (object,), {})):
# Presumably you'd define some abstract methods that all here
# that would be supported by all subclasses.
# These definitions could just raise NotImplementedError() or
# implement the functionality is some sub-optimal generic way.
# For example:
def fire(self, *args, **kwargs):
raise NotImplementedError(self.__class__.__name__ + ".fire() method")
# Abstract base class's __init__ should never be called.
# If subclasses need to call super class's __init__() for some
# reason then it would need to be implemented.
def __init__(self, *args, **kwargs):
raise NotImplementedError("Bullet is an abstract base class")
# Subclass definitions.
class Bullet1(Bullet):
alias = 'B1'
def __init__(self, sprite, start, direction, speed):
print('creating %s object' % self.__class__.__name__)
def fire(self, trajectory):
print('Bullet1 object fired with %s trajectory' % trajectory)
class Bullet2(Bullet):
alias = 'B2'
def __init__(self, sprite, start, headto, spead, acceleration):
print('creating %s object' % self.__class__.__name__)
class Bullet3(Bullet):
alias = 'B3'
def __init__(self, sprite, script): # script controlled bullets
print('creating %s object' % self.__class__.__name__)
class Bullet4(Bullet):
alias = 'B4'
def __init__(self, sprite, curve, speed): # for bullets with curved paths
print('creating %s object' % self.__class__.__name__)
class Sprite: pass
class Curve: pass
b1 = Bullet('B1', Sprite(), (10,20,30), 90, 600)
b2 = Bullet('B2', Sprite(), (-30,17,94), (1,-1,-1), 600, 10)
b3 = Bullet('B3', Sprite(), 'bullet42.script')
b4 = Bullet('B4', Sprite(), Curve(), 720)
b1.fire('uniform gravity')
b2.fire('uniform gravity')
Output:
creating Bullet1 object
creating Bullet2 object
creating Bullet3 object
creating Bullet4 object
Bullet1 object fired with uniform gravity trajectory
Traceback (most recent call last):
File "python-function-overloading.py", line 93, in <module>
b2.fire('uniform gravity') # NotImplementedError: Bullet2.fire() method
File "python-function-overloading.py", line 49, in fire
raise NotImplementedError(self.__class__.__name__ + ".fire() method")
NotImplementedError: Bullet2.fire() method
You can easily implement function overloading in Python. Here is an example using floats and integers:
class OverloadedFunction:
def __init__(self):
self.router = {int : self.f_int ,
float: self.f_float}
def __call__(self, x):
return self.router[type(x)](x)
def f_int(self, x):
print('Integer Function')
return x**2
def f_float(self, x):
print('Float Function (Overloaded)')
return x**3
# f is our overloaded function
f = OverloadedFunction()
print(f(3 ))
print(f(3.))
# Output:
# Integer Function
# 9
# Float Function (Overloaded)
# 27.0
The main idea behind the code is that a class holds the different (overloaded) functions that you would like to implement, and a Dictionary works as a router, directing your code towards the right function depending on the input type(x).
PS1. In case of custom classes, like Bullet1, you can initialize the internal dictionary following a similar pattern, such as self.D = {Bullet1: self.f_Bullet1, ...}. The rest of the code is the same.
PS2. The time/space complexity of the proposed solution is fairly good as well, with an average cost of O(1) per operation.
Use keyword arguments with defaults. E.g.
def add_bullet(sprite, start=default, direction=default, script=default, speed=default):
In the case of a straight bullet versus a curved bullet, I'd add two functions: add_bullet_straight and add_bullet_curved.
Overloading methods is tricky in Python. However, there could be usage of passing the dict, list or primitive variables.
I have tried something for my use cases, and this could help here to understand people to overload the methods.
Let's take your example:
A class overload method with call the methods from different class.
def add_bullet(sprite=None, start=None, headto=None, spead=None, acceleration=None):
Pass the arguments from the remote class:
add_bullet(sprite = 'test', start=Yes,headto={'lat':10.6666,'long':10.6666},accelaration=10.6}
Or
add_bullet(sprite = 'test', start=Yes, headto={'lat':10.6666,'long':10.6666},speed=['10','20,'30']}
So, handling is being achieved for list, Dictionary or primitive variables from method overloading.
Try it out for your code.
Plum supports it in a straightforward pythonic way. Copying an example from the README below.
from plum import dispatch
#dispatch
def f(x: str):
return "This is a string!"
#dispatch
def f(x: int):
return "This is an integer!"
>>> f("1")
'This is a string!'
>>> f(1)
'This is an integer!'
You can also try this code. We can try any number of arguments
# Finding the average of given number of arguments
def avg(*args): # args is the argument name we give
sum = 0
for i in args:
sum += i
average = sum/len(args) # Will find length of arguments we given
print("Avg: ", average)
# call function with different number of arguments
avg(1,2)
avg(5,6,4,7)
avg(11,23,54,111,76)

How is lazy evaluation implemented (in ORMs for example)

Im curious to know how lazy evaluation is implemented at higher levels, ie in libraries, etc. For example, how does the Django ORM or ActiveRecord defer evaluation of query until it is actually used?
Let's have a look at some methods for django's django.db.models.query.QuerySet class:
class QuerySet(object):
"""
Represents a lazy database lookup for a set of objects.
"""
def __init__(self, model=None, query=None, using=None):
...
self._result_cache = None
...
def __len__(self):
if self._result_cache is None:
...
elif self._iter:
...
return len(self._result_cache)
def __iter__(self):
if self._result_cache is None:
...
if self._iter:
...
return iter(self._result_cache)
def __nonzero__(self):
if self._result_cache is not None:
...
def __contains__(self, val):
if self._result_cache is not None:
...
else:
...
...
def __getitem__(self, k):
...
if self._result_cache is not None:
...
...
The pattern that these methods follow is that no queries are executed until some method that really needs to return some result is called. At that point, the result is stored in self._result_cache and any subsequent call to the same method returns the cached value.
In Python, one object may "exist" - but its intrinsic value will only be known by the outer world at the moment it is used with one of the operators - since the operators are defined in the class by the magic names with double underscores, if a class writes the appropriate code to execute the deferred code when the operator is called, it is just fine.
That means, if the object's value is, for example, to be used like a string, any part of the program that will use the object will call, at some point, the "__str__" coercion method.
For example, let's create an object that behaves like a string, but tells the current time. Strings can be concatenated to other strings(__add__), can have their length requested (__len__), and so on. If we want it to fit perfectly in the place of a string, we'd have to override all methods. The idea is to retrieve the actual value just when one of the operators is called - otherwise, the actual object can freely be assigned to variables, and passed around. It will only be evaluated when its value is needed
Then, one can have some code like this:
class timestr(object):
def __init__(self):
self.value = None
def __str__(self):
self._getvalue()
return self.value
def __len__(self):
self._getvalue()
return len(self.value)
def __add__(self, other):
self._getvalue()
return self.value + other
def _getvalue(self):
timet = time.localtime()
self.value = " %s:%s:%s " % (timet.tm_hour, timet.tm_min, timet.tm_sec)
And using it on the console, you may have:
>>> a = timestr()
>>> b = timestr()
>>> print b
17:16:22
>>> print a
17:16:25
If the value for which you want a lazy evaluation is an attribute of your object (like Peson.name ) instead of what your object actually behaves like - it is even easier. Because Python allows all object attributes to be of a special type - called a descriptor -- which actually has a method called each time the attribute will be accessed. Therefore, one just has to create a class with a proper method named __get__ to fetch the actual value. This method will be called only when the attribute is needed.
Python even has an utility for easy descriptor creation - the "property" keyword, that makes this even easier - you pass a method that is the code to generate the attribute as the first parameter to property.
So, having an Event class with a lazy (and live) evaluated time, is just a matter of writting:
import time
class Event(object):
#property
def time(self):
timet = time.localtime()
return " %s:%s:%s " % (timet.tm_hour, timet.tm_min, timet.tm_sec)
And use it as in:
>>> e= Event()
>>> e.time
' 17:25:8 '
>>> e.time
' 17:25:10 '
The mechanism is quite simple:
class Lazy:
def __init__(self, evaluate):
self.evaluate = evaluate
self.computed = False
def getresult(self):
if not self.computed:
self.result = self.evaluate()
self.computed = True
return self.result
Then, this utility can be used as:
def some_computation(a, b):
return ...
# bind the computation to its operands, but don't evaluate it yet.
lazy = Lazy(lambda: some_computation(1, 2))
# "some_computation()" is evaluated now.
print lazy.getresult()
# use the cached result again without re-computing.
print lazy.getresult()
This implementation uses callables to represent the computation, but there are many variations on this theme (e.g. a base class that requires you to imlement an evaluate() method, etc.).
Not sure about the specifics about which library you talking about but, from an algorithm standpoint, I've always used/undertsood it as follows: (psuedo code from a python novice)
class Object:
#... Other stuff ...
_actual_property = None;
def interface():
if _actual_property is None:
# Execute query and load up _actual_property
return _actual_property
Essentially because the interface and implementation are separated, you can define behaviors to execute upon request.

String construction using OOP and Proxy pattern

I find it very interesting the way how SQLAlchemy constructing query strings, eg:
(Session.query(model.User)
.filter(model.User.age > 18)
.order_by(model.User.age)
.all())
As far as I can see, there applied some kind of Proxy Pattern. In my small project I need to make similar string construction using OOP approach. So, I tried to reconstitute this behavior.
Firstly, some kind of object, one of plenty similar objects:
class SomeObject(object):
items = None
def __init__(self):
self.items = []
def __call__(self):
return ' '.join(self.items) if self.items is not None else ''
def a(self):
self.items.append('a')
return self
def b(self):
self.items.append('b')
return self
All methods of this object return self, so I can call them in any order and unlimited number of times.
Secondly, proxy object, that will call subject's methods if it's not a perform method, which calls object to see the resulting string.
import operator
class Proxy(object):
def __init__(self, some_object):
self.some_object = some_object
def __getattr__(self, name):
self.method = operator.methodcaller(name)
return self
def __call__(self, *args, **kw):
self.some_object = self.method(self.some_object, *args, **kw)
return self
def perform(self):
return self.some_object()
And finally:
>>> obj = SomeObject()
>>> p = Proxy(obj)
>>> print p.a().a().b().perform()
a a b
What can you say about this implementation? Is there better ways to make the desirable amount of classes that would make such a string cunstructing with the same syntax?
PS: Sorry for my english, it's not my primary language.
Actually what you are looking at is not a proxy pattern but the builder pattern, and yes your implementation is IMHO is the classic one (using the Fluent interface pattern).
I don't know what SQLAlchemy does, but I would implement the interface by having the Session.query() method return a Query object with methods like filter(), order_by(), all() etc. Each of these methods simply returns a new Query object taking into account the applied changes. This allows for method chaining as in your first example.
Your own code example has numerous problems. One example
obj = SomeObject()
p = Proxy(obj)
a = p.a
b = p.b
print a().perform() # prints b

python lazy variables? or, delayed expensive computation

I have a set of arrays that are very large and expensive to compute, and not all will necessarily be needed by my code on any given run. I would like to make their declaration optional, but ideally without having to rewrite my whole code.
Example of how it is now:
x = function_that_generates_huge_array_slowly(0)
y = function_that_generates_huge_array_slowly(1)
Example of what I'd like to do:
x = lambda: function_that_generates_huge_array_slowly(0)
y = lambda: function_that_generates_huge_array_slowly(1)
z = x * 5 # this doesn't work because lambda is a function
# is there something that would make this line behave like
# z = x() * 5?
g = x * 6
While using lambda as above achieves one of the desired effects - computation of the array is delayed until it is needed - if you use the variable "x" more than once, it has to be computed each time. I'd like to compute it only once.
EDIT:
After some additional searching, it looks like it is possible to do what I want (approximately) with "lazy" attributes in a class (e.g. http://code.activestate.com/recipes/131495-lazy-attributes/). I don't suppose there's any way to do something similar without making a separate class?
EDIT2: I'm trying to implement some of the solutions, but I'm running in to an issue because I don't understand the difference between:
class sample(object):
def __init__(self):
class one(object):
def __get__(self, obj, type=None):
print "computing ..."
obj.one = 1
return 1
self.one = one()
and
class sample(object):
class one(object):
def __get__(self, obj, type=None):
print "computing ... "
obj.one = 1
return 1
one = one()
I think some variation on these is what I'm looking for, since the expensive variables are intended to be part of a class.
The first half of your problem (reusing the value) is easily solved:
class LazyWrapper(object):
def __init__(self, func):
self.func = func
self.value = None
def __call__(self):
if self.value is None:
self.value = self.func()
return self.value
lazy_wrapper = LazyWrapper(lambda: function_that_generates_huge_array_slowly(0))
But you still have to use it as lazy_wrapper() not lazy_wrapper.
If you're going to be accessing some of the variables many times, it may be faster to use:
class LazyWrapper(object):
def __init__(self, func):
self.func = func
def __call__(self):
try:
return self.value
except AttributeError:
self.value = self.func()
return self.value
Which will make the first call slower and subsequent uses faster.
Edit: I see you found a similar solution that requires you to use attributes on a class. Either way requires you rewrite every lazy variable access, so just pick whichever you like.
Edit 2: You can also do:
class YourClass(object)
def __init__(self, func):
self.func = func
#property
def x(self):
try:
return self.value
except AttributeError:
self.value = self.func()
return self.value
If you want to access x as an instance attribute. No additional class is needed. If you don't want to change the class signature (by making it require func), you can hard code the function call into the property.
Writing a class is more robust, but optimizing for simplicity (which I think you are asking for), I came up with the following solution:
cache = {}
def expensive_calc(factor):
print 'calculating...'
return [1, 2, 3] * factor
def lookup(name):
return ( cache[name] if name in cache
else cache.setdefault(name, expensive_calc(2)) )
print 'run one'
print lookup('x') * 2
print 'run two'
print lookup('x') * 2
Python 3.2 and greater implement an LRU algorithm in the functools module to handle simple cases of caching/memoization:
import functools
#functools.lru_cache(maxsize=128) #cache at most 128 items
def f(x):
print("I'm being called with %r" % x)
return x + 1
z = f(9) + f(9)**2
You can't make a simple name, like x, to really evaluate lazily. A name is just an entry in a hash table (e.g. in that which locals() or globals() return). Unless you patch access methods of these system tables, you cannot attach execution of your code to simple name resolution.
But you can wrap functions in caching wrappers in different ways.
This is an OO way:
class CachedSlowCalculation(object):
cache = {} # our results
def __init__(self, func):
self.func = func
def __call__(self, param):
already_known = self.cache.get(param, None)
if already_known:
return already_known
value = self.func(param)
self.cache[param] = value
return value
calc = CachedSlowCalculation(function_that_generates_huge_array_slowly)
z = calc(1) + calc(1)**2 # only calculates things once
This is a classless way:
def cached(func):
func.__cache = {} # we can attach attrs to objects, functions are objects
def wrapped(param):
cache = func.__cache
already_known = cache.get(param, None)
if already_known:
return already_known
value = func(param)
cache[param] = value
return value
return wrapped
#cached
def f(x):
print "I'm being called with %r" % x
return x + 1
z = f(9) + f(9)**2 # see f called only once
In real world you'll add some logic to keep the cache to a reasonable size, possibly using a LRU algorithm.
To me, it seems that the proper solution for your problem is subclassing a dict and using it.
class LazyDict(dict):
def __init__(self, lazy_variables):
self.lazy_vars = lazy_variables
def __getitem__(self, key):
if key not in self and key in self.lazy_vars:
self[key] = self.lazy_vars[key]()
return super().__getitem__(key)
def generate_a():
print("generate var a lazily..")
return "<a_large_array>"
# You can add as many variables as you want here
lazy_vars = {'a': generate_a}
lazy = LazyDict(lazy_vars)
# retrieve the variable you need from `lazy`
a = lazy['a']
print("Got a:", a)
And you can actually evaluate a variable lazily if you use exec to run your code. The solution is just using a custom globals.
your_code = "print('inside exec');print(a)"
exec(your_code, lazy)
If you did your_code = open(your_file).read(), you could actually run your code and achieve what you want. But I think the more practical approach would be the former one.

Python memoising/deferred lookup property decorator

Recently I've gone through an existing code base containing many classes where instance attributes reflect values stored in a database. I've refactored a lot of these attributes to have their database lookups be deferred, ie. not be initialised in the constructor but only upon first read. These attributes do not change over the lifetime of the instance, but they're a real bottleneck to calculate that first time and only really accessed for special cases. Hence they can also be cached after they've been retrieved from the database (this therefore fits the definition of memoisation where the input is simply "no input").
I find myself typing the following snippet of code over and over again for various attributes across various classes:
class testA(object):
def __init__(self):
self._a = None
self._b = None
#property
def a(self):
if self._a is None:
# Calculate the attribute now
self._a = 7
return self._a
#property
def b(self):
#etc
Is there an existing decorator to do this already in Python that I'm simply unaware of? Or, is there a reasonably simple way to define a decorator that does this?
I'm working under Python 2.5, but 2.6 answers might still be interesting if they are significantly different.
Note
This question was asked before Python included a lot of ready-made decorators for this. I have updated it only to correct terminology.
Here is an example implementation of a lazy property decorator:
import functools
def lazyprop(fn):
attr_name = '_lazy_' + fn.__name__
#property
#functools.wraps(fn)
def _lazyprop(self):
if not hasattr(self, attr_name):
setattr(self, attr_name, fn(self))
return getattr(self, attr_name)
return _lazyprop
class Test(object):
#lazyprop
def a(self):
print 'generating "a"'
return range(5)
Interactive session:
>>> t = Test()
>>> t.__dict__
{}
>>> t.a
generating "a"
[0, 1, 2, 3, 4]
>>> t.__dict__
{'_lazy_a': [0, 1, 2, 3, 4]}
>>> t.a
[0, 1, 2, 3, 4]
I wrote this one for myself... To be used for true one-time calculated lazy properties. I like it because it avoids sticking extra attributes on objects, and once activated does not waste time checking for attribute presence, etc.:
import functools
class lazy_property(object):
'''
meant to be used for lazy evaluation of an object attribute.
property should represent non-mutable data, as it replaces itself.
'''
def __init__(self, fget):
self.fget = fget
# copy the getter function's docstring and other attributes
functools.update_wrapper(self, fget)
def __get__(self, obj, cls):
if obj is None:
return self
value = self.fget(obj)
setattr(obj, self.fget.__name__, value)
return value
class Test(object):
#lazy_property
def results(self):
calcs = 1 # Do a lot of calculation here
return calcs
Note: The lazy_property class is a non-data descriptor, which means it is read-only. Adding a __set__ method would prevent it from working correctly.
For all sorts of great utilities I'm using boltons.
As part of that library you have cachedproperty:
from boltons.cacheutils import cachedproperty
class Foo(object):
def __init__(self):
self.value = 4
#cachedproperty
def cached_prop(self):
self.value += 1
return self.value
f = Foo()
print(f.value) # initial value
print(f.cached_prop) # cached property is calculated
f.value = 1
print(f.cached_prop) # same value for the cached property - it isn't calculated again
print(f.value) # the backing value is different (it's essentially unrelated value)
property is a class. A descriptor to be exact. Simply derive from it and implement the desired behavior.
class lazyproperty(property):
....
class testA(object):
....
a = lazyproperty('_a')
b = lazyproperty('_b')
Here's a callable that takes an optional timeout argument, in the __call__ you could also copy over the __name__, __doc__, __module__ from func's namespace:
import time
class Lazyproperty(object):
def __init__(self, timeout=None):
self.timeout = timeout
self._cache = {}
def __call__(self, func):
self.func = func
return self
def __get__(self, obj, objcls):
if obj not in self._cache or \
(self.timeout and time.time() - self._cache[key][1] > self.timeout):
self._cache[obj] = (self.func(obj), time.time())
return self._cache[obj]
ex:
class Foo(object):
#Lazyproperty(10)
def bar(self):
print('calculating')
return 'bar'
>>> x = Foo()
>>> print(x.bar)
calculating
bar
>>> print(x.bar)
bar
...(waiting 10 seconds)...
>>> print(x.bar)
calculating
bar
What you really want is the reify (source linked!) decorator from Pyramid:
Use as a class method decorator. It operates almost exactly like the Python #property decorator, but it puts the result of the method it decorates into the instance dict after the first call, effectively replacing the function it decorates with an instance variable. It is, in Python parlance, a non-data descriptor. The following is an example and its usage:
>>> from pyramid.decorator import reify
>>> class Foo(object):
... #reify
... def jammy(self):
... print('jammy called')
... return 1
>>> f = Foo()
>>> v = f.jammy
jammy called
>>> print(v)
1
>>> f.jammy
1
>>> # jammy func not called the second time; it replaced itself with 1
>>> # Note: reassignment is possible
>>> f.jammy = 2
>>> f.jammy
2
They added exactly what you're looking for in python 3.8
Transform a method of a class into a property whose value is computed once and then cached as a normal attribute for the life of the instance.
Similar to property(), with the addition of caching.
Use it just like #property :
#cached_property
def a(self):
self._a = 7
return self._a
There is a mix up of terms and/or confusion of concepts both in question and in answers so far.
Lazy evaluation just means that something is evaluated at runtime at the last possible moment when a value is needed. The standard #property decorator does just that.(*) The decorated function is evaluated only and every time you need the value of that property. (see wikipedia article about lazy evaluation)
(*)Actually a true lazy evaluation (compare e.g. haskell) is very hard to achieve in python (and results in code which is far from idiomatic).
Memoization is the correct term for what the asker seems to be looking for. Pure functions that do not depend on side effects for return value evaluation can be safely memoized and there is actually a decorator in functools #functools.lru_cache so no need for writing own decorators unless you need specialized behavior.
You can do this nice and easily by building a class from Python native property:
class cached_property(property):
def __init__(self, func, name=None, doc=None):
self.__name__ = name or func.__name__
self.__module__ = func.__module__
self.__doc__ = doc or func.__doc__
self.func = func
def __set__(self, obj, value):
obj.__dict__[self.__name__] = value
def __get__(self, obj, type=None):
if obj is None:
return self
value = obj.__dict__.get(self.__name__, None)
if value is None:
value = self.func(obj)
obj.__dict__[self.__name__] = value
return value
We can use this property class like regular class property ( It's also support item assignment as you can see)
class SampleClass():
#cached_property
def cached_property(self):
print('I am calculating value')
return 'My calculated value'
c = SampleClass()
print(c.cached_property)
print(c.cached_property)
c.cached_property = 2
print(c.cached_property)
print(c.cached_property)
Value only calculated first time and after that we used our saved value
Output:
I am calculating value
My calculated value
My calculated value
2
2
I agree with #jason
When I think about lazy evaluation, Asyncio immediately comes to mind.
The possibility of delaying the expensive calculation till the last minute is the sole benefit of lazy evaluation.
Caching / memozition on the other hand could be beneficial but on the expense that the calculation is static and won't change with time / inputs.
A practice I often do for expensive calculations of these sorts is to calculate then cache with TTL.

Categories