Conditional performance in Python closures

In the code example below, I have two higher-order functions, factory1 and factory2, that produce a function with identical behavior. The first factory, factory1, avoids having to explicitly define two different functions by letting the returned function change behavior based on a boolean captured from the factory. The usefulness of this is not obvious in this example, but if the function to be produced were more complex, writing out two almost identical copies of it, as is done in factory2, would hurt both readability and maintainability.
However, the factory2 implementation is faster, as the timing results below show.
Is there a way to achieve the performance of factory2 without explicitly defining two alternative functions?
def factory1(condition):
    def fn():
        if condition:
            return "foo"
        else:
            return "bar"
    return fn

def factory2(condition):
    def foo_fn():
        return "foo"
    def bar_fn():
        return "bar"
    if condition:
        return foo_fn
    else:
        return bar_fn

def test1():
    fn = factory1(True)
    for _ in range(1000):
        fn()

def test2():
    fn = factory2(True)
    for _ in range(1000):
        fn()

if __name__ == '__main__':
    import timeit
    print(timeit.timeit("test1()", setup="from __main__ import test1"))
    # >>> 62.458039999
    print(timeit.timeit("test2()", setup="from __main__ import test2"))
    # >>> 49.203676939
EDIT: Some more context
The reason I am asking is that I am trying to produce a function that looks something like this:
def function(data):
    data = some_transform(data)
    if condition:
        # condition should be considered invariant at time of definition
        data = transform1(data)
    else:
        data = transform2(data)
    data = yet_another_transform(data)
    return data

Depending on what you mean by "explicitly defining two functions", note that you don't have to execute a def statement until after you check the condition:
def factory3(condition):
    if condition:
        def fn():
            return "foo"
    else:
        def fn():
            return "bar"
    return fn
One might object that this still has to compile two code objects before determining which one gets used to define the function at run time. In that case, you might fall back on using exec on a dynamically constructed string. NOTE: this needs to be done carefully for anything other than the static example shown here. See the old definition of namedtuple for a good(?) example.
def factory4(condition):
    code = 'def fn():\n    return "{}"\n'.format("foo" if condition else "bar")
    namespace = {}
    exec(code, namespace)  # exec into an explicit namespace so fn can be retrieved in Python 3
    return namespace['fn']
A safer alternative might be to use a closure:
def factory5(condition):
    def make_fun(val):
        def _():
            return val
        return _
    if condition:
        return make_fun("foo")
    else:
        return make_fun("bar")
make_fun can be defined outside of factory5 as well, since it doesn't rely on condition at all.
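For illustration, a minimal sketch of that refactoring (the factory6 name is just for this example; the behavior matches factory5):

def make_fun(val):
    # a generic closure factory; it knows nothing about condition
    def _():
        return val
    return _

def factory6(condition):
    return make_fun("foo" if condition else "bar")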
Based on your edit, I think you are just looking to implement dependency injection. Don't put an if statement inside your function; pass transform1 or transform2 as an argument:
def function(transform):
    def _(data):
        data = some_transform(data)
        data = transform(data)
        data = yet_another_transform(data)
        return data
    return _

if condition:
    thing = function(transform1)
else:
    thing = function(transform2)

Related

Python: is there a way to list all the functions defined inside a function?

In Python you can do fname.__code__.co_names to retrieve the names of the functions and global things a function references. If I do fname.__code__.co_varnames, this includes the names of inner functions, I believe.
Is there a way to essentially do inner.__code__.co_names, starting from a string like 'inner' as returned by co_varnames?
In Python 3.4+ you can get the names using dis.get_instructions. To support nested functions as well you need to recursively loop over each code object you encounter:
import dis

def get_names(f):
    ins = dis.get_instructions(f)
    for x in ins:
        try:
            # An inner function shows up as LOAD_CONST (opcode 100) of a code
            # object, then LOAD_CONST of its qualified name (which contains
            # '<locals>'), then MAKE_FUNCTION (opcode 132); the following
            # store instruction carries the function's name in argrepr.
            if x.opcode == 100 and '<locals>' in next(ins).argval \
                    and next(ins).opcode == 132:
                yield next(ins).argrepr
                yield from get_names(x.argval)
        except Exception:
            pass
Demo:
def func():
    x = 1
    y = 2
    print('foo')

    class A:
        def method(self):
            pass

    def f1():
        z = 3
        print('bar')

        def f2():
            a = 4

            def f3():
                b = [1, 2, 3]

    def f4():
        pass
print(list(get_names(func)))
Outputs:
['f1', 'f2', 'f3', 'f4']
I don't think you can inspect the code object, because inner functions are lazy and their code objects are only created just in time. What you probably want to look at instead is the ast module. Here's a quick example:
import ast
import inspect

# this is the test scenario
def function1():
    f1_var1 = 42
    def function2():
        f2_var1 = 42
        f2_var2 = 42
        def function3():
            f3_var1 = 42

# derive source code for top-level function
src = inspect.getsource(function1)

# derive abstract syntax tree rooted at top-level function
node = ast.parse(src)

# next, ast's walk method takes all the difficulty out of tree-traversal for us
for x in ast.walk(node):
    # functions have names whereas variables have ids;
    # nested classes may all use different terminology.
    # you'll have to look at the various node types to
    # get this part exactly right
    name_or_id = getattr(x, 'name', getattr(x, 'id', None))
    if name_or_id:
        print(name_or_id)
The results are: function1, function2, f1_var1, function3, f2_var1, f2_var2, f3_var1. Obligatory disclaimer: there's probably not a good reason for doing this type of thing... but have fun :)
Oh, and if you only want the names of the inner functions?
print({x.name: x for x in ast.walk(ast.parse(inspect.getsource(some_function)))
       if isinstance(x, ast.FunctionDef)})

Parameters of parsy parser

Consider the following code, which parses and evaluates strings like 567 + 323 in Python.
import parsy as pr
from parsy import generate

def lex(p):
    return p << pr.regex(r'\s*')

numberP = lex(pr.regex(r'[0-9]+').map(int))

@generate
def sumP():
    a = yield numberP
    yield lex(pr.string('+'))
    b = yield numberP
    return a + b

exp = sumP.parse('567 + 323')
print(exp)
The @generate decorator is a total mystery to me. Does anyone have more information on how that trick works? It does allow us to write in a style similar to Haskell's monadic do notation. Is code reflection needed to make your own @generate, or is there a clever way to interpret that code literally?
Now here comes my main problem: I want to generalize sumP to opP, which also takes an operator symbol and a combinator function:
import parsy as pr
from parsy import generate

def lex(p):
    return p << pr.regex(r'\s*')

numberP = lex(pr.regex(r'[0-9]+').map(int))

@generate
def opP(symbol, f):
    a = yield numberP
    yield lex(pr.string(symbol))
    b = yield numberP
    return f(a, b)

exp = opP('+', lambda x, y: x + y).parse('567 + 323')
print(exp)
This gives an error. It seems that the generated opP already has two arguments, which I do not know how to deal with.
The way that decorators work in Python is that they're functions that are called with the decorated function as an argument, and their return value is then assigned to that function's name. In other words this:

@foo
def bar():
    bla
Is equivalent to this:
def bar():
    bla

bar = foo(bar)
Here foo can do anything it wants with bar. It may wrap it in something, it may introspect its code, it may call it.
What @generate does is wrap the given function in a parser object. The parser object, when parsing, will call the function without arguments, which is why you get an error about missing arguments when you apply @generate to a function that takes arguments.
To create parameterized rules, you can apply @generate to an inner zero-argument function and return that:
def opP(symbol, f):
    @generate
    def op():
        a = yield numberP
        yield lex(pr.string(symbol))
        b = yield numberP
        return f(a, b)
    return op
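With this version, the call site from the question works unchanged (assuming the lex and numberP definitions above):

exp = opP('+', lambda x, y: x + y).parse('567 + 323')
print(exp)  # 890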

Is it OK to replace a method by a plain function?

This works as expected, but I am somehow unsure about this approach. Is it safe? Is it pythonic?
class Example:
    def __init__(self, parameter):
        if parameter == 0:
            # trivial case, the result is always zero
            self.calc = lambda x: 0.0  # <== replacing a method
        self._parameter = parameter

    def calc(self, x):
        # ... long calculation of result ...
        return result
(If there is any difference between Python2 and Python3, I'm using Python3 only.)
This is very confusing. If someone else reads it, they won't understand what is going on. Just put an if statement at the beginning of your method:
def calc(self, x):
    if self._parameter == 0:
        return 0.0
    # ... long calculation of result ...
    return result
Also, if you change self._parameter after it was initialized with 0, your replaced method wouldn't reflect that anymore.
You'll have a problem should parameter ever change, so I don't consider it good practice.
Instead, I think you should do this:
class Example:
    def __init__(self, parameter):
        self._parameter = parameter

    def calc(self, x):
        if not self._parameter:
            return 0.0
        # ... long calculation of result ...
        return result
I decided to post a summary of several comments and answers. Please do not vote for this summary, but give +1 to the original authors instead.
the approach is safe, except for special __methods__, which are looked up on the class rather than the instance (see the sketch after the code below)
the approach is deemed unpythonic, undesirable, unnecessary, etc.
the parameter determining which function to use must be constant; if that is not the case, this approach makes no sense at all
from several suggestions, I prefer the code below for general cases and the obvious if cond: return 0.0 for simple cases:
class Example:
    def __init__(self, parameter):
        if parameter == 0:
            self.calc = self._calc_trivial
        else:
            # ... pre-compute data if necessary ...
            self.calc = self._calc_regular
        self._parameter = parameter

    def _calc_regular(self, x):
        # ... long calculation of result ...
        return result

    @staticmethod
    def _calc_trivial(x):
        return 0.0
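On the first point above: special methods are looked up on the type, not the instance, so per-instance replacement silently fails for them. A minimal sketch of the pitfall:

class C:
    def __call__(self):
        return "class"

c = C()
c.__call__ = lambda: "instance"  # per-instance replacement of a special method
print(c())            # "class" -- implicit special-method lookup bypasses the instance
print(c.__call__())   # "instance" -- only explicit attribute access sees it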

Python: Wrapping functions in the loop

What is the correct way to wrap multiple functions in a loop in Python?
I have a universal wrapper and a list of functions. I need to wrap all the functions from the list in one loop, but for f in funcs: doesn't work for me.
What would be the correct way to do this?
def orig_func1(x, y):
    print "x=%i y=%i" % (x, y)

def orig_func2(a, b):
    print "a=%i b=%i" % (a, b)

def wrapper(func):
    f_name = func.__name__
    print 'adding hook for function [%s]' % f_name
    def inner(*args, **kwargs):
        print 'I am before original function'
        ret = func(*args, **kwargs)
        print 'I am after original function'
        return ret
    return inner

funcs = [orig_func1, orig_func2]
print funcs
for f in funcs:
    f = wrapper(f)
print funcs
and the results show that the functions in the list are not changed:
[<function orig_func1 at 0x022F78F0>, <function orig_func2 at 0x022F7930>]
adding hook for function [orig_func1]
adding hook for function [orig_func2]
[<function orig_func1 at 0x022F78F0>, <function orig_func2 at 0x022F7930>]
x=1 y=2
a=3 b=4
Inside that loop, f is nothing but a local variable. You're not changing anything meaningful unless you modify the list directly. Instead of:
for f in funcs:
    f = wrapper(f)
You should do this:
for i, f in enumerate(funcs):
    funcs[i] = wrapper(f)
This will change the functions in your list to new, wrapped ones that you can use. But it still won't change the ultimate definition of the function. Nothing will, once it's been defined, short of a complete redefinition or a wrapper used right above the function definition; calling orig_func1 directly will net the same results before and after the for loop. If you want to modify a function at runtime, you'll have to keep referring to this wrapped version of the function that you've just created.
Instead of looping over functions trying to wrap them, you should be using Python Decorators. They are the correct way to modify the behavior of your functions, rather than your current looping method. If the official docs don't make it clear enough, here and here are a couple of tutorials that helped me quite a bit.
Your existing code actually looks a lot like some of the code snippets from my first tutorial link. You should replace your loop with the #decorator syntax instead of the manual wrapping.
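For example, applying the question's wrapper with decorator syntax hooks the function once, at definition time (same Python 2 code as the question):

@wrapper
def orig_func1(x, y):
    print "x=%i y=%i" % (x, y)

orig_func1(1, 2)  # now prints the before/after lines around the original output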
That being said, you can accomplish what you originally intended with a comprehension. Replace your loop with this:
funcs = [wrapper(func) for func in funcs]
The other comments and answers were correct that your modification of f in the loop wouldn't work because it had a scope local to that loop and was not modifying your list.
I've implemented the required behavior using the following approach:
import Module1, Module2

funcs = ['Module1.func1', 'Module1.func2', 'Module2.func1', 'Module2.func2']
hooks = {}

def wrapper(func, f_name):
    if not hooks.has_key(f_name):
        hooks[f_name] = {'before': [],
                         'after': []}
    def inner(*args, **kwargs):
        for f in hooks[f_name]['before']:
            f(*args, **kwargs)
        ret = func(*args, **kwargs)
        for f in hooks[f_name]['after']:
            f(*args, **kwargs)
        return ret
    return inner

def implementHooks():
    for f in funcs:
        obj_name, func_name = f.rsplit('.', 1)
        obj = globals()[obj_name]
        func = getattr(obj, func_name)
        setattr(obj, func_name, wrapper(func, f))

implementHooks()

def module1_func1_hook():
    print 'Before running module1.func1()'

hooks['Module1.func1']['before'] += [module1_func1_hook]

Function which computes once, caches the result, and returns from cache infinitely (Python)

I have a function which performs an expensive operation and is called often, but the operation only needs to be performed once; its result could be cached.
I tried making an infinite generator but I didn't get the results I expected:
>>> def g():
...     result = "foo"
...     while True:
...         yield result
...
>>> g()
<generator object g at 0x1093db230>  # why didn't it give me "foo"?
Why isn't g a generator?
>>> g
<function g at 0x1093de488>
Edit: it's fine if this approach doesn't work, but I need something which performs exactly like a regular function, like so:
>>> [g() for x in range(3)]
["foo", "foo", "foo"]
g is a generator function. Calling it returns a generator object. You then need to use that generator to get your values, by looping over it, for example, or by calling next() on it:
gen = g()
value = next(gen)
Note that calling g() again will calculate the same value again and produce a new generator.
You may just want to use a global to cache the value. Storing it as an attribute on the function could work:
def g():
    if not hasattr(g, '_cache'):
        g._cache = 'foo'
    return g._cache
A better way: @functools.lru_cache(maxsize=None). It's been backported to Python 2.7, or you could just write your own.
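For instance, a minimal sketch applying it to g from the question (lru_cache keys on the arguments; with no arguments there is exactly one cache entry, so the body runs only once):

import functools

@functools.lru_cache(maxsize=None)
def g():
    return "foo"  # stands in for the expensive one-time computation

print([g() for x in range(3)])  # ['foo', 'foo', 'foo']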
I am occasionally guilty of doing:
def foo():
    if hasattr(foo, 'cache'):
        return foo.cache
    # do work
    foo.cache = result
    return result
Here's a dead-simple caching decorator. It doesn't take into account any variations in parameters, it just returns the same result after the first call. There are fancier ones out there that cache the result for each combination of inputs ("memoization").
import functools

def callonce(func):
    result = []
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if not result:
            result.append(func(*args, **kwargs))
        return result[0]
    return wrapper
Usage:
@callonce
def long_running_function(x, y, z):
    # do something expensive with x, y, and z, producing result
    return result
If you would prefer to write your function as a generator for some reason (perhaps the result is slightly different on each call, but there's still a time-consuming initial setup, or else you just want C-style static variables that allow your function to remember some bit of state from one call to the next), you can use this decorator:
import functools

def gen2func(generator):
    gen = []
    @functools.wraps(generator)
    def wrapper(*args, **kwargs):
        if not gen:
            gen.append(generator(*args, **kwargs))
        return next(gen[0])
    return wrapper
Usage:
@gen2func
def long_running_function_in_generator_form(x, y, z):
    # do something expensive with x, y, and z, producing result
    while True:
        yield result
        result += 1  # for example
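To make the effect concrete, here is a small hypothetical example built on gen2func: a counter that keeps state between plain function calls:

@gen2func
def counter(start):
    value = start  # one-time setup on the first call
    while True:
        yield value
        value += 1  # "static" state carried over to the next call

print(counter(10))  # 10
print(counter(10))  # 11 -- arguments are ignored after the first call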
A Python 2.5 or later version that uses .send() to allow parameters to be passed to each iteration of the generator is as follows (note that **kwargs are not supported):
import functools

def gen2func(generator):
    gen = []
    @functools.wraps(generator)
    def wrapper(*args):
        if not gen:
            gen.append(generator(*args))
            return next(gen[0])
        return gen[0].send(args)
    return wrapper
@gen2func
def function_with_static_vars(a, b, c):
    # time-consuming initial setup goes here
    # also initialize any "static" vars here
    while True:
        # do something with a, b, c
        a, b, c = yield  # get next a, b, c
A better option would be to use memoization. You can create a memoize decorator that you can use to wrap any function that you want to cache the results for. You can find some good implementations here.
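A minimal sketch of such a memoize decorator, assuming hashable positional arguments only:

import functools

def memoize(func):
    cache = {}
    @functools.wraps(func)
    def wrapper(*args):
        # compute once per distinct argument tuple, then serve from the cache
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]
    return wrapper

@memoize
def expensive(x):
    print("computing...")
    return x * x

expensive(3)  # prints "computing...", returns 9
expensive(3)  # served from the cache, returns 9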
You can also leverage Beaker and its cache. It also has tons of extensions.
