Python lazy variables? Or, delayed expensive computation

I have a set of arrays that are very large and expensive to compute, and not all will necessarily be needed by my code on any given run. I would like to make their declaration optional, but ideally without having to rewrite my whole code.
Example of how it is now:
x = function_that_generates_huge_array_slowly(0)
y = function_that_generates_huge_array_slowly(1)
Example of what I'd like to do:
x = lambda: function_that_generates_huge_array_slowly(0)
y = lambda: function_that_generates_huge_array_slowly(1)
z = x * 5 # this doesn't work because lambda is a function
# is there something that would make this line behave like
# z = x() * 5?
g = x * 6
While using lambda as above achieves one of the desired effects - computation of the array is delayed until it is needed - if you use the variable "x" more than once, it has to be computed each time. I'd like to compute it only once.
EDIT:
After some additional searching, it looks like it is possible to do what I want (approximately) with "lazy" attributes in a class (e.g. http://code.activestate.com/recipes/131495-lazy-attributes/). I don't suppose there's any way to do something similar without making a separate class?
EDIT2: I'm trying to implement some of the solutions, but I'm running into an issue because I don't understand the difference between:
class sample(object):
    def __init__(self):
        class one(object):
            def __get__(self, obj, type=None):
                print("computing ...")
                obj.one = 1
                return 1
        self.one = one()
and
class sample(object):
    class one(object):
        def __get__(self, obj, type=None):
            print("computing ... ")
            obj.one = 1
            return 1
    one = one()
I think some variation on these is what I'm looking for, since the expensive variables are intended to be part of a class.

The first half of your problem (reusing the value) is easily solved:
class LazyWrapper(object):
    def __init__(self, func):
        self.func = func
        self.value = None
    def __call__(self):
        if self.value is None:
            self.value = self.func()
        return self.value
lazy_wrapper = LazyWrapper(lambda: function_that_generates_huge_array_slowly(0))
But you still have to use it as lazy_wrapper() not lazy_wrapper.
If you're going to be accessing some of the variables many times, it may be faster to use:
class LazyWrapper(object):
    def __init__(self, func):
        self.func = func
    def __call__(self):
        try:
            return self.value
        except AttributeError:
            self.value = self.func()
            return self.value
Which will make the first call slower and subsequent uses faster.
Edit: I see you found a similar solution that requires you to use attributes on a class. Either way requires you rewrite every lazy variable access, so just pick whichever you like.
Edit 2: You can also do:
class YourClass(object):
    def __init__(self, func):
        self.func = func
    @property
    def x(self):
        try:
            return self.value
        except AttributeError:
            self.value = self.func()
            return self.value
That lets you access x as an instance attribute, and no additional class is needed. If you don't want to change the class signature (by making it require func), you can hard-code the function call into the property.
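Here is a minimal sketch of the same caching idea packaged as a reusable decorator, assuming the expensive values live on a class. The names lazy_attribute and Sample are made up for illustration. It relies on the descriptor protocol, which Python applies only to objects found on the class and not to ones assigned to self inside __init__; that is probably why the first variant in EDIT2 never triggers __get__.

```python
class lazy_attribute(object):
    """Non-data descriptor: compute a value once, then cache it on the instance."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # accessed on the class itself, not on an instance
        value = self.func(obj)
        # Store the result under the same name in the instance __dict__.
        # This descriptor defines no __set__, so the instance attribute
        # shadows it and later lookups never reach __get__ again.
        obj.__dict__[self.name] = value
        return value


class Sample(object):
    calls = 0  # counts how often the expensive function actually runs

    @lazy_attribute
    def x(self):
        Sample.calls += 1
        return [0] * 10  # stand-in for the huge array


s = Sample()
first = s.x   # computed here
second = s.x  # served from s.__dict__
```

On Python 3.8 and later, functools.cached_property offers much the same behaviour without writing the descriptor yourself.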

Writing a class is more robust, but optimizing for simplicity (which I think you are asking for), I came up with the following solution:
cache = {}
def expensive_calc(factor):
    print('calculating...')
    return [1, 2, 3] * factor
def lookup(name):
    return (cache[name] if name in cache
            else cache.setdefault(name, expensive_calc(2)))
print('run one')
print(lookup('x') * 2)
print('run two')
print(lookup('x') * 2)

Python 3.2 and greater implement an LRU algorithm in the functools module to handle simple cases of caching/memoization:
import functools
@functools.lru_cache(maxsize=128)  # cache at most 128 items
def f(x):
    print("I'm being called with %r" % x)
    return x + 1
z = f(9) + f(9)**2

You can't make a simple name, like x, to really evaluate lazily. A name is just an entry in a hash table (e.g. in that which locals() or globals() return). Unless you patch access methods of these system tables, you cannot attach execution of your code to simple name resolution.
But you can wrap functions in caching wrappers in different ways.
This is an OO way:
class CachedSlowCalculation(object):
    cache = {} # our results
    def __init__(self, func):
        self.func = func
    def __call__(self, param):
        if param in self.cache: # membership test, so falsy results (0, empty arrays) are cached too
            return self.cache[param]
        value = self.func(param)
        self.cache[param] = value
        return value
calc = CachedSlowCalculation(function_that_generates_huge_array_slowly)
z = calc(1) + calc(1)**2 # only calculates things once
This is a classless way:
def cached(func):
    func.__cache = {} # we can attach attrs to objects, functions are objects
    def wrapped(param):
        cache = func.__cache
        if param in cache: # membership test, so falsy results (0, empty arrays) are cached too
            return cache[param]
        value = func(param)
        cache[param] = value
        return value
    return wrapped
@cached
def f(x):
    print("I'm being called with %r" % x)
    return x + 1
z = f(9) + f(9)**2 # see f called only once
In the real world you'd add some logic to keep the cache to a reasonable size, possibly using an LRU algorithm.
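As a rough sketch of what such size-limiting logic could look like, here is a small LRU variant of the decorator built on collections.OrderedDict. The name lru_cached and its maxsize parameter are invented for this example; on Python 3.2+ the built-in functools.lru_cache is the more practical choice.

```python
from collections import OrderedDict


def lru_cached(maxsize=128):
    """Hypothetical decorator: memoize a one-argument function, keeping at most maxsize results."""
    def decorator(func):
        cache = OrderedDict()

        def wrapped(param):
            if param in cache:
                cache.move_to_end(param)  # mark as most recently used
                return cache[param]
            value = func(param)
            cache[param] = value
            if len(cache) > maxsize:
                cache.popitem(last=False)  # evict the least recently used entry
            return value

        wrapped.cache = cache  # exposed only so the cache can be inspected
        return wrapped
    return decorator


calls = []  # records every real (uncached) call


@lru_cached(maxsize=2)
def f(x):
    calls.append(x)
    return x + 1


f(1)
f(2)
f(1)  # cache hit, 1 becomes the most recently used key
f(3)  # cache is full, so 2 (least recently used) is evicted
f(2)  # has to be recomputed
```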

To me, it seems that the proper solution for your problem is subclassing a dict and using it.
class LazyDict(dict):
    def __init__(self, lazy_variables):
        self.lazy_vars = lazy_variables
    def __getitem__(self, key):
        if key not in self and key in self.lazy_vars:
            self[key] = self.lazy_vars[key]()
        return super().__getitem__(key)
def generate_a():
    print("generate var a lazily..")
    return "<a_large_array>"
# You can add as many variables as you want here
lazy_vars = {'a': generate_a}
lazy = LazyDict(lazy_vars)
# retrieve the variable you need from `lazy`
a = lazy['a']
print("Got a:", a)
And you can actually evaluate a variable lazily if you use exec to run your code. The solution is just using a custom globals.
your_code = "print('inside exec');print(a)"
exec(your_code, lazy)
If you did your_code = open(your_file).read(), you could actually run your code and achieve what you want. But I think the more practical approach would be the former one.

Related

How is this Python design pattern called?

In the ffmpeg-python docs, they used following design pattern for their examples:
(
    ffmpeg
    .input('dummy.mp4')
    .filter('fps', fps=25, round='up')
    .output('dummy2.mp4')
    .run()
)
What is this design pattern called, where can I find more information about it, and what are its pros and cons?
This design pattern is called builder.
Basically, every command (except run) changes the object and returns the object itself, which lets you "build" the object as you go.
In my opinion it is very useful, especially for building queries, and it can simplify code.
Think about a DB query you want to build; let's say we use SQL.
# let's say we implement a builder called query
class query:
    def __init__(self):
        ...
    def from_(self, db_name):  # "from" is a reserved word in Python, hence the trailing underscore
        self.db_name = db_name
        return self
    ...
q = (
    query()
    .from_("db_name")  # notice every line changes something (here it sets query.db_name to "db_name")
    .fields("username")
    .where(id=2)
    .execute()  # this line runs the query on the server and returns the output
)
This is called 'method chaining' or 'function chaining'. You can chain method calls together because each method call returns the underlying object itself (represented by self in Python, or this in other languages).
It's a technique used in the Gang of Four builder design pattern where you construct an initial object and then chain additional property setters, for example: car().withColor('red').withDoors(2).withSunroof().
Here's an example:
class Arithmetic:
    def __init__(self):
        self.value = 0
    def total(self, *args):
        self.value = sum(args)
        return self
    def double(self):
        self.value *= 2
        return self
    def add(self, x):
        self.value += x
        return self
    def subtract(self, x):
        self.value -= x
        return self
    def __str__(self):
        return f"{self.value}"
a = Arithmetic().total(1, 2, 3)
print(a) # 6
a = Arithmetic().total(1, 2, 3).double()
print(a) # 12
a = Arithmetic().total(1, 2, 3).double().subtract(3)
print(a) # 9

how to implement a function like sum(2)(3)(4)......(n) in python?

How do I implement a function that can be invoked as sum_numbers(2)(3)(4)......(n) in Python?
The result should be 2+3+4+.....+n
The hint I have is that, since functions are objects in Python, there is a way to do this using nested functions, but I am not sure how.
def sum_number(x):
    def sum_number_2(y):
        def sum_number_3(z):
            ....................
                def sum_number_n(n):
                    return n
                return sum_number_n
            return sum_number_3
        return sum_number_2
    return sum_number
But instead of writing so many nested functions we should be able to do it in couple nested functions to compute sum of n values when invoked in the following way sum_numbers(2)(3)(4)......(n)
Use Python's data model features to convert the result into the desired type.
class sum_number(object):
    def __init__(self, val):
        self.val = val
    def __call__(self, val):
        self.val += val
        return self
    def __float__(self):
        return float(self.val)
    def __int__(self):
        return int(self.val)
print('{}'.format(int(sum_number(2)(3)(8))))
print('{}'.format(float(sum_number(2)(3)(8))))
You could create a subclass of int that is callable:
class sum_numbers(int):
    def __new__(cls, *args, **kwargs):
        return super().__new__(cls, *args, **kwargs)
    def __call__(self, val):
        return sum_numbers(self + val)
That way, you have full compatibility with a normal integer (since objects of that type are normal integers), so the following examples work:
>>> sum_numbers(2)(3)(4)(5)
14
>>> isinstance(sum_numbers(2)(3), int)
True
>>> sum_numbers(2)(3) + 4
9
Of course, you may want to override additional methods, e.g. __add__ so that adding a normal integer will still return an object of your type. Otherwise, you would have to call the type with the result, e.g.:
>>> sum_numbers(sum_numbers(2)(3) + 5)(6)
16
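A possible sketch of that __add__ override, assuming you only care about addition. It re-wraps the plain int that int.__add__ returns, so a chain can continue after a normal + without calling the type again. This is one way to do it, not the only one.

```python
class sum_numbers(int):
    def __call__(self, val):
        return sum_numbers(self + val)

    def __add__(self, other):
        # int.__add__ returns a plain int, or NotImplemented for unsupported types
        result = int.__add__(self, other)
        if result is NotImplemented:
            return result
        return sum_numbers(result)  # re-wrap so the result stays callable

    __radd__ = __add__  # also covers 5 + sum_numbers(2)


total = (sum_numbers(2)(3) + 5)(6)  # no outer sum_numbers(...) call needed now
```

Other operators (subtraction, multiplication and so on) would still return plain integers unless they are overridden the same way.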
If your function is returning another function, you can't just chain calls together and expect a human readable result. If you want a function that does what you want without the final result, this works:
def sum(x):
    def f(y):
        return sum(x+y)
    return f
If you're fine with printing out the operations you can try this:
def sum(x):
    print(x)
    def f(y):
        return sum(x+y)
    return f
If you absolutely, absolutely need a return value then this is a dirty, horrible hack you could try:
def sum(x, v):
    v[0] = x
    def f(y, v):
        return sum(x+y, v)
    return f
v = [0]
sum(1,v)(2,v)(3,v)
print(v[0]) # Should return 6
Here's another solution that uses classes:
class sum(object):
    def __init__(self, x=0):
        self.x=x
    def __call__(self, *y):
        if len(y) > 0:
            self.x += y[0]
            return self
        return self.x
print(sum(1)(2)(3)()) # Prints 6
What you're asking for is not possible in Python since you aren't providing a way to determine the end of the call chain, as cricket_007 mentions in the comments. However, if you do provide a way to indicate that there are no more calls then the function is easy to code. One way to indicate the end of the chain is to make the last call with no arguments.
I'm using rsum (recursive sum) as the name of the function in my code because sum is a built-in function and unnecessarily shadowing the Python built-ins is not a good coding practice: it makes the code potentially confusing, or at least harder to read because you have to keep remembering that the name isn't referring to what you normally expect it to refer to, and can lead to subtle bugs.
def rsum(val=None, tot=0):
    if val is None:
        return tot
    tot += val
    return lambda val=None, tot=tot: rsum(val, tot)
print(rsum(42)())
print(rsum(1)(2)())
print(rsum(4)(3)(2)(1)())
print(rsum(4100)(310000)(9)(50)())
output
42
3
10
314159
class MetaSum(type):
    def __repr__(cls):
        sum_str = str(cls.sum)
        cls.sum = 0
        return sum_str
    def __call__(cls, *args):
        for arg in args:
            cls.sum += arg
        return cls
class sum_numbers(object, metaclass = MetaSum):
    sum = 0
print (sum_numbers(2)(3)(4)) # this only works in python 3

Deferring a computation in custom object until data available

I have a custom class like the below. The idea, as the naming suggests, is that I want to evaluate a token stream in a parser-type tool. Once a bunch of constructs have been parsed out and put into data structures, certain sequences of tokens will evaluate to an int, but when the data structures aren't available yet, a function just returns None instead. This single complex data structure 'constructs' gets passed around pretty much everywhere in the program.
class Toks(list):
    def __init__(self, seq, constructs=Constructs()):
        list.__init__(self, seq)
        self.constructs = constructs
    @property
    def as_value(self):
        val = tokens_as_value(self, self.constructs)
        return val if val is not None else self
At points in the code, I want to assign this maybe-computable value to a name, e.g.:
mything.val = Toks(tokens[start:end], constructs).as_value
Well, this gives mything.val either an actual int value or a funny thing that allows us to compute a value later. But this requires a later pass to actually perform the computation, similar to:
if not isinstance(mything.val, int):
mything.val = mything.val.as_value
As it happens, I can do this in my program. However, what I'd really like is to avoid the second pass altogether, and have access to the property perform the computation and give the computed value if it's computable at that point (and perhaps evaluate to some sentinel if it's not possible to compute).
Any ideas?
To clarify: Depending on the case I get "value" differently; actual code is more like:
if tok.type == 'NUMBER':
mything.val = tok.value # A simple integer value
else:
mything.val = Toks(tokens[start:end], constructs).as_value
There are additional cases: sometimes I know the actual value early, and sometimes I'm not sure whether I'll only know it later.
I realize I can defer calling (a bit more compactly than @dana suggests) with:
return val if val is not None else lambda: self.as_value
However, that makes later access inconsistent between mything.val and mything.val(), so I'd still have to guard it with an if to see which style to use. It's the same inconvenience whether I need to fall back to mything.val.as_value or to mything.val() after the type check.
You could easily do something like:
import time
class NaiveLazy(object):
    def __init__(self, ctor):
        self.ctor = ctor
        self._value = None
    @property
    def value(self):
        if self._value is None:
            self._value = self.ctor()
        return self._value
mything = NaiveLazy(lambda: time.sleep(5) or 10)  # sleep() returns None, so "or" yields 10
And then always use mything.value (example to demonstrate evaluation):
print(mything.value) # will wait 5 seconds and print 10
print(mything.value) # will print 10
I've seen some utility libraries create a special object for undefined in case ctor returns None. If you eventually want to extend your code beyond ints, you should think about that:
class Undefined(object): pass
UNDEFINED = Undefined()
#...
self._value = UNDEFINED
#...
if self._value is UNDEFINED: self._value = self.ctor()
For your example specifically:
def toks_ctor(seq, constructs=Constructs()):
    return lambda l=list(seq): tokens_as_value(l, constructs) or UNDEFINED
mything = NaiveLazy(toks_ctor(tokens[start:end], constructs))
If you're using Python 3.2+, consider a Future object (from the concurrent.futures module). This tool lets you run any number of calculations in the background. You can wait for a single future to be completed, and use its value. Or, you can "stream" the results one at a time as they're completed.
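A small sketch of both styles with the standard concurrent.futures module. slow_square is only a placeholder for the real computation, and note that a Future helps with work that can already start running, which is not quite the same as a value that is not computable yet.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def slow_square(n):
    # stand-in for an expensive calculation
    return n * n


with ThreadPoolExecutor(max_workers=2) as pool:
    # submit() returns a Future immediately; the work runs in the background
    future = pool.submit(slow_square, 7)
    single = future.result()  # blocks until that one value is ready

    # "stream" results in completion order, which may differ from submission order
    futures = [pool.submit(slow_square, n) for n in (1, 2, 3)]
    streamed = sorted(f.result() for f in as_completed(futures))
```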
You could return a callable object from as_value, which would let you check for the real return value automatically. The one drawback is that you'd need to use mything.val() instead of mything.val:
def tokens_as_value(toks, constructs):
    if constructs.constructed:
        return "some value"
    else:
        return None
class Constructs(object):
    def __init__(self):
        self.constructed = False
class Toks(list):
    def __init__(self, seq, constructs=Constructs()):
        list.__init__(self, seq)
        self.constructs = constructs
    @property
    def as_value(self):
        return FutureVal(tokens_as_value, self, self.constructs)
class FutureVal(object):
    def __init__(self, func, *args, **kwargs):
        self.func = func
        self._val = None
        self.args = args
        self.kwargs = kwargs
    def __call__(self):
        if self._val is None:
            self._val = self.func(*self.args, **self.kwargs)
        return self._val
Just for the purposes of the example, Constructs just contains a boolean that indicates whether or not a real value should be returned from tokens_as_value.
Usage:
>>> t = test.Toks([])
>>> z = t.as_value
>>> z
<test.FutureVal object at 0x7f7292c96150>
>>> print(z())
None
>>> t.constructs.constructed = True
>>> print(z())
some value

Storing a reference to a reference in Python?

Using Python, is there any way to store a reference to a reference, so that I can change what that reference refers to in another context? For example, suppose I have the following class:
class Foo:
    def __init__(self):
        self.standalone = 3
        self.lst = [4, 5, 6]
I would like to create something analogous to the following:
class Reassigner:
    def __init__(self, target):
        self.target = target
    def reassign(self, value):
        # not sure what to do here, but reassigns the reference given by target to value
        pass
Such that the following code
f = Foo()
rStandalone = Reassigner(f.standalone) # presumably this syntax might change
rIndex = Reassigner(f.lst[1])
rStandalone.reassign(7)
rIndex.reassign(9)
Would result in f.standalone equal to 7 and f.lst equal to [4, 9, 6].
Essentially, this would be an analogue to a pointer-to-pointer.
In short, it's not possible. At all. The closest equivalent is storing a reference to the object whose member/item you want to reassign, plus the attribute name/index/key, and then use setattr/setitem. However, this yields quite different syntax, and you have to differentiate between the two:
class AttributeReassigner:
    def __init__(self, obj, attr):
        self.obj = obj
        self.attr = attr
    def reassign(self, val):
        setattr(self.obj, self.attr, val)
class ItemReassigner:
    def __init__(self, obj, key):
        self.obj = obj
        self.key = key
    def reassign(self, val):
        self.obj[self.key] = val
f = Foo()
rStandalone = AttributeReassigner(f, 'standalone')
rIndex = ItemReassigner(f.lst, 1)
rStandalone.reassign(7)
rIndex.reassign(9)
I've actually used something very similar, but the valid use cases are few and far between.
For globals/module members, you can use either the module object or globals(), depending on whether you're inside the module or outside of it. There is no equivalent for local variables at all -- the result of locals() cannot be used to change locals reliably, it's only useful for inspecting.
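To illustrate, here is a hedged sketch of the three cases. The helper rebind_global and the on-the-fly module "settings" are invented for the example (a real program would use an imported module), and the locals() behaviour shown is what CPython does.

```python
import types

counter = 3  # module-level name we want to rebind indirectly


def rebind_global(name, value):
    # inside the module: globals() is the module's real, writable namespace
    globals()[name] = value


rebind_global('counter', 7)

# From outside a module you hold the module object instead. "settings" is a
# hypothetical module created here so the example is self-contained.
settings = types.ModuleType('settings')
settings.standalone = 3
setattr(settings, 'standalone', 9)


def try_locals():
    x = 1
    locals()['x'] = 2  # writes to a dict, not to the actual local variable
    return x           # still 1 in CPython
```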
Simple answer: You can't.
Complicated answer: You can use lambdas. Sort of.
class Reassigner:
    def __init__(self, target):
        self.reassign = target
f = Foo()
rIndex = Reassigner(lambda value: f.lst.__setitem__(1, value))
rStandalone = Reassigner(lambda value: setattr(f, 'standalone', value))
rF = Reassigner(lambda value: locals().__setitem__('f', value))
If you need to defer assignments; you could use functools.partial (or just lambda):
from functools import partial
set_standalone = partial(setattr, f, "standalone")
set_item = partial(f.lst.__setitem__, 1)
set_standalone(7)
set_item(9)
If reassign is the only operation; you don't need a new class.
Functions are first-class citizens in Python: you can assign them to a variable, store in a list, pass as arguments, etc.
This would work for the contents of container objects. If you don't mind adding one level of indirection to your variables (which you'd need in the C pointer-to-pointer case anyway), you could:
class Container(object):
    def __init__(self, val):
        self.val = val
class Foo(object):
    def __init__(self):
        self.standalone = Container(3)
        self.lst = [Container(4), Container(5), Container(6)]
And you wouldn't really need the reassigner object at all.
class Reassigner(object):
    def __init__(self, target):
        self.target = target
    def reassign(self, value):
        self.target.val = value

Dynamic/runtime method creation (code generation) in Python

I need to generate code for a method at runtime. It's important to be able to run arbitrary code and have a docstring.
I came up with a solution combining exec and setattr, here's a dummy example:
class Viking(object):
    def __init__(self):
        code = '''
            def dynamo(self, arg):
                """ dynamo's a dynamic method!
                """
                self.weight += 1
                return arg * self.weight
            '''
        self.weight = 50
        d = {}
        exec(code.strip(), d)
        setattr(self.__class__, 'dynamo', d['dynamo'])
if __name__ == "__main__":
    v = Viking()
    print(v.dynamo(10))
    print(v.dynamo(10))
    print(v.dynamo.__doc__)
Is there a better / safer / more idiomatic way of achieving the same result?
Based on Theran's code, but extending it to methods on classes:
class Dynamo(object):
    pass
def add_dynamo(cls,i):
    def innerdynamo(self):
        print("in dynamo %d" % i)
    innerdynamo.__doc__ = "docstring for dynamo%d" % i
    innerdynamo.__name__ = "dynamo%d" % i
    setattr(cls,innerdynamo.__name__,innerdynamo)
for i in range(2):
    add_dynamo(Dynamo, i)
d=Dynamo()
d.dynamo0()
d.dynamo1()
Which should print:
in dynamo 0
in dynamo 1
Function docstrings and names are mutable properties. You can do anything you want in the inner function, or even have multiple versions of the inner function that makedynamo() chooses between. No need to build any code out of strings.
Here's a snippet out of the interpreter:
>>> def makedynamo(i):
...     def innerdynamo():
...         print("in dynamo %d" % i)
...     innerdynamo.__doc__ = "docstring for dynamo%d" % i
...     innerdynamo.__name__ = "dynamo%d" % i
...     return innerdynamo
>>> dynamo10 = makedynamo(10)
>>> help(dynamo10)
Help on function dynamo10 in module __main__:
dynamo10()
    docstring for dynamo10
Python will let you declare a function in a function, so you don't have to do the exec trickery.
def __init__(self):
    def dynamo(self, arg):
        """ dynamo's a dynamic method!
        """
        self.weight += 1
        return arg * self.weight
    self.weight = 50
    setattr(self.__class__, 'dynamo', dynamo)
If you want to have several versions of the function, you can put all of this in a loop and vary what you name them in the setattr function:
def __init__(self):
    for i in range(0,10):
        def dynamo(self, arg, i=i):
            """ dynamo's a dynamic method!
            """
            self.weight += i
            return arg * self.weight
        setattr(self.__class__, 'dynamo_%d' % i, dynamo)
    self.weight = 50
(I know this isn't great code, but it gets the point across). As far as setting the docstring, I know that's possible but I'd have to look it up in the documentation.
Edit: You can set the docstring via dynamo.__doc__, so you could do something like this in your loop body:
dynamo.__doc__ = "Adds %s to the weight" % i
Another Edit: With help from @eliben and @bobince, the closure problem should be solved.
class Dynamo(object):
    def __init__(self):
        pass
    @staticmethod
    def init(initData=None):
        if initData is not None:
            dynamo = Dynamo()
            for name, value in initData.items():
                code = '''
def %s(self, *args, **kwargs):
    %s
''' % (name, value)
                result = {}
                exec(code.strip(), result)
                setattr(dynamo.__class__, name, result[name])
            return dynamo
        return None
service = Dynamo.init({'fnc1':'pass'})
service.fnc1()
A bit more general solution:
You can call any method of an instance of class Dummy.
The docstring is generated based on the methods name.
The handling of any input arguments is demonstrated, by just returning them.
Code
class Dummy(object):
    def _mirror(self, method, *args, **kwargs):
        """doc _mirror"""
        return args, kwargs
    def __getattr__(self, method):
        "doc __getattr__"
        def tmp(*args, **kwargs):
            """doc tmp"""
            return self._mirror(method, *args, **kwargs)
        tmp.__doc__ = (
            'generated docstring, access by {:}.__doc__'
            .format(method))
        return tmp
d = Dummy()
print(d.test2('asd', level=0), d.test.__doc__)
print(d.whatever_method(7, 99, par=None), d.whatever_method.__doc__)
Output
(('asd',), {'level': 0}) generated docstring, access by test.__doc__
((7, 99), {'par': None}) generated docstring, access by whatever_method.__doc__
Pardon me for my bad English.
I recently needed to generate dynamic functions to bind each menu item to open a particular frame in wxPython. Here is what I did.
First, I create a list that maps each menu item to its frame.
menus = [(self.menuItemFile, FileFrame), (self.menuItemEdit, EditFrame)]
The first item of each pair is the menu item and the second is the frame to be opened. Next, I bind the wx.EVT_MENU event of each menu item to its frame.
for menu in menus:
    f = genfunc(self, menu[1])
    self.Bind(wx.EVT_MENU, f, menu[0])
The genfunc function is the dynamic function builder; here is the code:
def genfunc(parent, form):
    def OnClick(event):
        f = form(parent)
        f.Maximize()
        f.Show()
    return OnClick
