Say I have some function fun, the actual code body of which is out of my control. I can create a new function which does some preprocessing before calling fun, i.e.
def process(x):
    x += 1
    return fun(x)
If I now want process to take the place of fun for all future calls to fun, I need to do something like
# Does not work
fun = process
This does not work, however: it creates infinite recursion, because the name fun inside the body of process now refers to process itself, so calling fun calls process, which calls process again, and so on. One solution I have found is to reference a copy of fun inside process, like so:
# Works
import copy
fun_cp = copy.copy(fun)
def process(x):
    x += 1
    return fun_cp(x)
fun = process
but this solution bothers me as I don't really know how Python constructs a copy of a function. I guess my problem is identical to that of extending a class method using inheritance and the super function, but here I have no class.
How can I do this properly? I would think that this is a common enough task that some more or less idiomatic solution should exist, but I have had no luck finding it.
Python is not constructing a copy of your function. copy.copy(fun) just returns fun; the difference is that you saved that to the fun_cp variable, a different variable from the one you saved process to, so it's still in fun_cp when process tries to look for it.
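You can check that no real copying happens; copy.copy returns function objects unchanged (a quick demonstration, using a stand-in fun):

import copy

def fun(x):
    return x * 2  # stand-in for the real function

print(copy.copy(fun) is fun)  # True: it's the very same object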
I'd do something similar to what you did, saving the original function to a different variable, just without the "copy":
original_fun = fun

def fun(x):
    x += 1
    return original_fun(x)
If you want to apply the same wrapping to multiple functions, defining a decorator and doing fun = decorate(fun) is more reusable, but for a one-off, it's more work than necessary and an extra level of indentation.
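For reference, the reusable decorator form might look like this (a sketch; the name add_one_first is made up):

def add_one_first(f):
    def wrapper(x):
        x += 1
        return f(x)
    return wrapper

fun = add_one_first(fun)  # same effect as the manual rebinding above
# ...and the same wrapping can be applied to any other function.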
This looks like a use case for Python's closures. Have a function return your function.
def getprocess(f):
    def process(x):
        x += 1
        return f(x)  # f is referenced from the enclosing scope.
    return process
myprocess = getprocess(fun)
myprocess = getprocess(myprocess)
Credit to coldspeed for the idea of using a closure. A fully working and polished solution is
import functools

def getprocess(f):
    @functools.wraps(f)
    def process(x):
        x += 1
        return f(x)
    return process

fun = getprocess(fun)
Note that this is 100% equivalent to applying a decorator (getprocess) to fun. I couldn't come up with this solution myself because the dedicated decorator syntax @getprocess can only be used at the place where the function (here fun) is defined. To apply it to an existing function, though, just do fun = getprocess(fun).
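For completeness: if you did control the definition site, the decorator syntax would read like this, which is exactly equivalent to the assignment above:

@getprocess
def fun(x):
    return x * 2  # illustrative body; the real fun is defined elsewhere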
Related
Issue: I have two functions that both require the same nested functions to operate, so those helpers are currently copy-pasted into each function. The functions cannot be combined, as the second function relies on calling the first function twice, and un-nesting the helpers would add too many parameters.
Question: Is it better to run the nested functions in the first function and append their values to an object to be fed into the 2nd function, or is it better to copy and paste the nested functions?
Example:
def func_A(thing):
    def sub_func_A(thing):
        thing += 1
        return thing
    return sub_func_A(thing)
def func_B(thing):
    def sub_func_B(thing):
        thing += 1
        return thing
    val_A, val_B = func_A(5), func_A(5)
    return sub_func_B(val_A), sub_func_B(val_B)
Imagine these functions couldn't be combined, and the nested function relied on so many parameters that moving it outside and calling it would be too cluttered.
The "better option" depends on a few factors -:
The type of optimization you want to achieve.
The time taken by the functions to execute.
If the optimization target is the time taken to execute the second function, then it depends on how long the nested function takes to run: if that time is less than the time needed to store its output when the first function calls it, it is better to copy-paste the nested functions.
Conversely, if the nested function takes longer to execute than storing its output does, it is better to execute it once and store its output for future use.
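A rough way to compare the two costs (illustrative only; sub_func stands in for the real nested helper):

import timeit

def sub_func(thing):
    thing += 1
    return thing

# cost of recomputing the helper every time
recompute = timeit.timeit(lambda: sub_func(5), number=100000)

# cost of computing once and reusing the stored result
store = {5: sub_func(5)}
reuse = timeit.timeit(lambda: store[5], number=100000)

print(recompute, reuse)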
Further, as mentioned by @DarylG in the comments, a class-based approach can also be used, in which the nested function (the subfunction) becomes a private method (accessible only from the class's own code), while the two functions (func_A and func_B) are public methods that can be used and accessed widely from the outside. In code it might look something like this:
class MyClass:
    def __init__(self):
        ...  # initialization elided

    def __subfunc(self, thing):
        # PRIVATE SUBFUNC
        thing += 1
        return thing

    def func_A(self, thing):
        # PUBLIC FUNC A
        return self.__subfunc(thing)

    def func_B(self, thing):
        # PUBLIC FUNC B
        val_A, val_B = self.func_A(5), self.func_A(5)
        return self.__subfunc(val_A), self.__subfunc(val_B)
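Used like this (the values follow from the +1 steps above):

obj = MyClass()
print(obj.func_A(5))  # 6
print(obj.func_B(0))  # (7, 7); the argument is unused in this sketch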
I'm currently trying to code a Python (3.4.4) GUI with tkinter which should allow fitting an arbitrary function to some datapoints. To start easy, I'd like to create some input function and evaluate it. Later, I would like to plot and fit it using curve_fit from scipy.
In order to do so, I would like to create a dynamic (fitting) function from a user-input-string. I found and read about exec, but people say that (1) it is not safe to use and (2) there is always a better alternative (e.g. here and in many other places). So, I was wondering what would be the alternative in this case?
Here is some example code with two nested functions which works but it's not dynamic:
def buttonfit_press():
    def f(x):
        return x+1
    return f

print(buttonfit_press()(4))
And here is some code that gives rise to NameError: name 'f' is not defined before I can even start to use xval:
def buttonfit_press2(xval):
    actfitfunc = "f(x)=x+1"
    execstr = "def {}:\n return {}\n".format(actfitfunc.split("=")[0], actfitfunc.split("=")[1])
    exec(execstr)
    return f

print(buttonfit_press2(4))
An alternative approach with types.FunctionType discussed here (10303248) wasn't successful either...
So, my question is: Is there a good alternative I could use for this scenario? Or if not, how can I make the code with exec run?
I hope it's understandable and not too vague. Thanks in advance for your ideas and input.
@Gábor Erdős:
Either I don't understand or I disagree. If I code the same segment in the mainloop, it recognizes f and I can execute the code segment from execstr:
actfitfunc = "f(x)=x+1"
execstr = "def {}:\n return {}\n".format(actfitfunc.split("=")[0], actfitfunc.split("=")[1])
exec(execstr)
print(f(4))
# prints 5
@Łukasz Rogalski:
Printing execstr seems fine to me:
def f(x):
    return x+1
Indentation error is unlikely due to my editor, but I double-checked - it's fine.
Introducing my_locals, passing it to exec and printing it afterwards shows:
{'f': <function f at 0x000000000348D8C8>}
However, I still get NameError: name 'f' is not defined.
@user3691475:
Your example is very similar to my first example. But this is not "dynamic" in my understanding, i.e. one can not change the output of the function while the code is running.
@Dunes:
I think this is going in the right direction, thanks. However, I don't understand yet how I can evaluate and use this function in the next step. What I mean is: in order to be able to fit it, I have to extract the fitting variables (e.g. a and b in f(x)=a*x+b) or evaluate the function at various x values (e.g. print(f(3.14))).
The problem with exec/eval, is that they can execute arbitrary code. So to use exec or eval you need to either carefully parse the code fragment to ensure it doesn't contain malicious code (an incredibly hard task), or be sure that the source of the code can be trusted. If you're making a small program for personal use then that's fine. A big program that's responsible for sensitive data or money, definitely not. It would seem your use case counts as having a trusted source.
If all you want is to create an arbitrary function at runtime, then just use a combination of the lambda expression and eval, e.g.
func_str = "lambda x: x + 1" # equates to f(x)=x+1
func = eval(func_str)
assert func(4) == 5
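Since the end goal is fitting, the same trick extends to a parametrised function for scipy's curve_fit (a sketch with made-up sample data):

import numpy as np
from scipy.optimize import curve_fit

func_str = "lambda x, a, b: a * x + b"  # user input meaning f(x) = a*x + b
func = eval(func_str)

xdata = np.array([0.0, 1.0, 2.0, 3.0])
ydata = np.array([1.1, 2.9, 5.2, 7.1])  # made-up data points
popt, pcov = curve_fit(func, xdata, ydata)  # popt holds the fitted a and b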
The reason why your attempt isn't working is that locals(), in the context of a function, creates a copy of the local namespace. Mutations to the resulting dictionary do not affect the current local namespace. You would need to do something like:
def g():
    src = """
def f(x):
    return x + 1
"""
    exec_namespace = {}  # exec will place the function f in this dictionary
    exec(src, exec_namespace)
    return exec_namespace['f']  # retrieve f
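The retrieved function can then be used like any other:

f = g()
print(f(4))  # 5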
I'm not sure what exactly you're trying to do, i.e. what functions are allowed, what operations are permitted, etc.
Here is an example of a function generator with one dynamic parameter:
>>> def generator(n):
...     def f(x):
...         return x+n
...     return f
...
>>> plus_one = generator(1)
>>> print(plus_one(4))
5
I often do interactive work in Python that involves some expensive operations that I don't want to repeat often. I'm generally running whatever Python file I'm working on frequently.
If I write:
import functools32

@functools32.lru_cache()
def square(x):
    print "Squaring", x
    return x*x
I get this behavior:
>>> square(10)
Squaring 10
100
>>> square(10)
100
>>> runfile(...)
>>> square(10)
Squaring 10
100
That is, rerunning the file clears the cache. This works:
try:
    safe_square
except NameError:
    @functools32.lru_cache()
    def safe_square(x):
        print "Squaring", x
        return x*x
but when the function is long it feels strange to have its definition inside a try block. I can do this instead:
def _square(x):
    print "Squaring", x
    return x*x

try:
    safe_square_2
except NameError:
    safe_square_2 = functools32.lru_cache()(_square)
but it feels pretty contrived (for example, in calling the decorator without an '@' sign)
Is there a simple way to handle this, something like:
@non_resetting_lru_cache()
def square(x):
    print "Squaring", x
    return x*x
?
Writing a script to be executed repeatedly in the same session is an odd thing to do.
I can see why you'd want to do it, but it's still odd, and I don't think it's unreasonable for the code to expose that oddness by looking a little odd, and having a comment explaining it.
However, you've made things uglier than necessary.
First, you can just do this:
@functools32.lru_cache()
def _square(x):
    print "Squaring", x
    return x*x

try:
    safe_square_2
except NameError:
    safe_square_2 = _square
There is no harm in attaching a cache to the new _square definition. It won't waste any time, or more than a few bytes of storage, and, most importantly, it won't affect the cache on the previous _square definition. That's the whole point of closures.
There is a potential problem here with recursive functions. It's already inherent in the way you're working, and the cache doesn't add to it in any way, but you might only notice it because of the cache, so I'll explain it and show how to fix it. Consider this function:
@lru_cache()
def _fact(n):
    if n < 2:
        return 1
    return _fact(n-1) * n
When you re-exec the script, even if you have a reference to the old _fact, it's going to end up calling the new _fact, because it's accessing _fact as a global name. It has nothing to do with the @lru_cache; remove that, and the old function will still end up calling the new _fact.
But if you're using the renaming trick above, you can just call the renamed version:
@lru_cache()
def _fact(n):
    if n < 2:
        return 1
    return fact(n-1) * n
Now the old _fact will call fact, which is still the old _fact. Again, this works identically with or without the cache decorator.
Beyond that initial trick, you can factor that whole pattern out into a simple decorator. I'll explain step by step below, or see this blog post.
Anyway, even with the less-ugly version, it's still a bit ugly and verbose. And if you're doing this dozens of times, my "well, it should look a bit ugly" justification will wear thin pretty fast. So, you'll want to handle this the same way you always factor out ugliness: wrap it in a function.
You can't really pass names around as objects in Python. And you don't want to use a hideous frame hack just to deal with this. So you'll have to pass the names around as strings, like this:
globals().setdefault('fact', _fact)
The globals function just returns the current scope's global dictionary. Which is a dict, which means it has the setdefault method, which means this will set the global name fact to the value _fact if it didn't already have a value, but do nothing if it did. Which is exactly what you wanted. (You could also use setattr on the current module, but I think this way emphasizes that the script is meant to be (repeatedly) executed in someone else's scope, not used as a module.)
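A quick illustration of that behaviour:

fact = "old"
globals().setdefault('fact', 'new')
print(fact)   # 'old': the existing binding wins
globals().setdefault('fact2', 'new')
print(fact2)  # 'new': the name didn't exist, so it was created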
So, here that is wrapped up in a function:
def new_bind(name, value):
    globals().setdefault(name, value)
… which you can turn into a decorator almost trivially:
def new_bind(name):
    def wrap(func):
        globals().setdefault(name, func)
        return func
    return wrap
Which you can use like this:
@new_bind('foo')
def _foo():
    print(1)
But wait, there's more! The func that new_bind gets is going to have a __name__, right? If you stick to a naming convention, like that the "private" name must be the "public" name with a _ prefixed, we can do this:
def new_bind(func):
    assert func.__name__[0] == '_'
    globals().setdefault(func.__name__[1:], func)
    return func
And you can see where this is going:
@new_bind
@lru_cache()
def _square(x):
    print "Squaring", x
    return x*x
There is one minor problem: if you use any other decorators that don't wrap the function properly, they will break your naming convention. So… just don't do that. :)
And I think this works exactly the way you want in every edge case. In particular, if you've edited the source and want to force the new definition with a new cache, you just del square before rerunning the file, and it works.
And of course if you want to merge those two decorators into one, it's trivial to do so, and call it non_resetting_lru_cache.
However, I'd keep them separate. I think it's more obvious what they do. And if you ever want to wrap another decorator around @lru_cache, you're probably still going to want @new_bind to be the outermost decorator, right?
What if you want to put new_bind into a module that you can import? Then it's not going to work, because it will be referring to the globals of that module, not the one you're currently writing.
You can fix that by explicitly passing your globals dict, or your module object, or your module name as an argument, like @new_bind(__name__), so it can find your globals instead of its own. But that's ugly and repetitive.
You can also fix it with an ugly frame hack. At least in CPython, sys._getframe() can be used to get your caller's frame, and frame objects have a reference to their globals namespace, so:
import sys

def new_bind(func):
    assert func.__name__[0] == '_'
    g = sys._getframe(1).f_globals  # the caller's globals, not ours
    g.setdefault(func.__name__[1:], func)
    return func
Notice the big box in the docs that tells you this is an "implementation detail" that may only apply to CPython and is "for internal and specialized purposes only". Take this seriously. Whenever someone has a cool idea for the stdlib or builtins that could be implemented in pure Python, but only by using _getframe, it's generally treated almost the same as an idea that can't be implemented in pure Python at all. But if you know what you're doing, and you want to use this, and you only care about present-day versions of CPython, it will work.
There is no persistent_lru_cache in the stdlib. But you can build one pretty easily.
The functools source is linked directly from the docs, because this is one of those modules that's as useful as sample code as it is for using it directly.
As you can see, the cache is just a dict. If you replace that with, say, a shelf, it will become persistent automatically:
def persistent_lru_cache(filename, maxsize=128, typed=False):
    """new docstring explaining what filename does"""
    # same code as before up to here
    def decorating_function(user_function):
        cache = shelve.open(filename)
        # same code as before from here on.
Of course that only works if your arguments are strings. And it could be a little slow.
So, you might want to instead keep it as an in-memory dict, and just write code that pickles it to a file atexit, and restores it from a file if present at startup:
def decorating_function(user_function):
    # ...
    try:
        with open(filename, 'rb') as f:
            cache = pickle.load(f)
    except:
        cache = {}

    def cache_save():
        with lock:
            with open(filename, 'wb') as f:
                pickle.dump(cache, f)
    atexit.register(cache_save)

    # …
    wrapper.cache_save = cache_save
    wrapper.cache_filename = filename
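Pieced together, a minimal self-contained version of that idea might look like this (a sketch: it skips lru_cache's locking, maxsize and typed handling, and the name persistent_cache is made up):

import atexit
import pickle

def persistent_cache(filename):
    def decorating_function(user_function):
        try:
            with open(filename, 'rb') as f:
                cache = pickle.load(f)  # restore the cache if the file exists
        except (IOError, EOFError):
            cache = {}

        def cache_save():
            with open(filename, 'wb') as f:
                pickle.dump(cache, f)
        atexit.register(cache_save)  # write the cache back at exit

        def wrapper(*args):
            if args not in cache:
                cache[args] = user_function(*args)
            return cache[args]
        wrapper.cache_save = cache_save
        wrapper.cache_filename = filename
        return wrapper
    return decorating_function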
Or, if you want it to write every N new values (so you don't lose the whole cache on, say, an _exit or a segfault or someone pulling the cord), add this to the second and third versions of wrapper, right after the misses += 1:
if misses % N == 0:
    cache_save()
See here for a working version of everything up to this point (using save_every as the "N" argument, and defaulting to 1, which you probably don't want in real life).
If you want to be really clever, maybe copy the cache and save that in a background thread.
You might want to extend the cache_info to include something like number of cache writes, number of misses since last cache write, number of entries in the cache at startup, …
And there are probably other ways to improve this.
From a quick test, with save_every=1, this makes the cache on both get_pep and fib (from the functools docs) persistent, with no measurable slowdown to get_pep and a very small slowdown to fib the first time (note that fib(100) has 100097 hits vs. 101 misses…), and of course a large speedup to get_pep (but not fib) when you re-run it. So, just what you'd expect.
I can't say I won't just use @abarnert's "ugly frame hack", but here is the version that requires you to pass in the calling module's globals dict. I think it's worth posting given that decorator functions with arguments are tricky and meaningfully different from those without arguments.
def create_if_not_exists_2(my_globals):
    def wrap(func):
        if "_" != func.__name__[0]:
            raise Exception("Function names used in cine must begin with '_'")
        my_globals.setdefault(func.__name__[1:], func)
        def wrapped(*args):
            return func(*args)  # forward the call (and its return value)
        return wrapped
    return wrap
Which you can then use in a different module like this:
from functools32 import lru_cache
from cine import create_if_not_exists_2

@create_if_not_exists_2(globals())
@lru_cache()
def _square(x):
    print "Squaring", x
    return x*x

assert "_square" in globals()
assert "square" in globals()
I've gained enough familiarity with decorators during this process that I was comfortable taking a swing at solving the problem another way:
from functools32 import lru_cache

try:
    my_cine
except NameError:
    class my_cine(object):
        _reg_funcs = {}

        @classmethod
        def func_key(cls, f):
            try:
                name = f.func_name
            except AttributeError:
                name = f.__name__
            return (f.__module__, name)

        def __init__(self, f):
            k = self.func_key(f)
            self._f = self._reg_funcs.setdefault(k, f)

        def __call__(self, *args, **kwargs):
            return self._f(*args, **kwargs)

if __name__ == "__main__":
    @my_cine
    @lru_cache()
    def fact_my_cine(n):
        print "In fact_my_cine for", n
        if n < 2:
            return 1
        return fact_my_cine(n-1) * n

    x = fact_my_cine(10)
    print "The answer is", x
@abarnert, if you are still watching, I'd be curious to hear your assessment of the downsides of this method. I know of two:
You have to know in advance what attributes to look in for a name to associate with the function. My first stab at it only looked at func_name which failed when passed an lru_cache object.
Resetting a function is painful: del my_cine._reg_funcs[('__main__', 'fact_my_cine')], and the swing I took at adding a __delitem__ was unsuccessful.
There were several discussions on "returning multiple values in Python", e.g. [1], [2].
This is not the "multiple-value-return" pattern I'm trying to find here.
No matter what you use (tuple, list, dict, an object), it is still a single return value and you need to parse that return value (structure) somehow.
The real benefit of multiple return values is in the upgrade process. For example,
originally, you have
def func():
    return 1

print func() + func()
Then you decided that func() can return some extra information but you don't want to break previous code (or modify them one by one). It looks like
def func():
    return 1, "extra info"

value, extra = func()
print value  # 1 (expected)
print extra  # extra info (expected)
print func() + func()  # (1, 'extra info', 1, 'extra info') (not expected; we want the previous behaviour, i.e. 2)
The previous code (func() + func()) is broken. You have to fix it.
I don't know whether I made the question clear... You can see the CLISP example. Is there an equivalent way to implement this pattern in Python?
EDIT: I put the above CLISP snippets online for your quick reference.
Let me put two use cases here for the multiple-return-value pattern. Probably someone can have alternative solutions for the two cases:
Better support for smooth upgrades. This is shown in the above example.
Have simpler client-side code. See the following alternative solutions I have so far. Using exceptions can make the upgrade process smooth, but it costs more code.
Current alternatives: (they are not "multi-value-return" constructions, but they can be engineering solutions that satisfy some of the points listed above)
tuple, list, dict, an object. As is said, you need certain parsing on the client side, e.g. if ret.success == True: blabla. You need ret = func() before that. It's much cleaner to write if func() == True: blabla.
Use Exception. As is discussed in this thread, when the "False" case is rare, it's a nice solution. Even in this case, the client side code is still too heavy.
Use an arg, e.g. def func(main_arg, detail=[]). The detail can be a list or dict or even an object depending on your design. func() returns only the original simple value; details go into the detail argument. The problem is that the client needs to create a variable before the invocation in order to hold the details.
Use a "verbose" indicator, e.g. def func(main_arg, verbose=False). When verbose == False (default; and the way client is using func()), return original simple value. When verbose == True, return an object which contains simple value and the details.
Use a "version" indicator. Same as "verbose" but we extend the idea there. In this way, you can upgrade the returned object for multiple times.
Use global detail_msg. This is like the old C-style error_msg. In this way, functions can always return simple values. The client side can refer to detail_msg when necessary. One can put detail_msg in global scope, class scope, or object scope depending on the use cases.
Use a generator. yield simple_return and then yield detailed_return. This solution is nice on the callee's side. However, the caller has to do something like func().next() and func().next().next(). You can wrap it with an object and override __call__ to simplify it a bit, e.g. func()(), but it looks unnatural from the caller's side.
Use a wrapper class for the return value. Override the class's methods to mimic the behaviour of the original simple return value, and put the detailed data in the class. We have adopted this alternative in our project for dealing with bool return types; see the relevant commit: https://github.com/fqj1994/snsapi/commit/589f0097912782ca670568fe027830f21ed1f6fc (I don't have enough reputation to put more links in the post... -_-//) A minimal sketch of this idea is shown below.
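A minimal sketch of that wrapper-class idea (the class name and detail attribute are illustrative, not taken from the linked commit):

class BoolWithDetail(object):
    def __init__(self, value, detail):
        self.value = bool(value)
        self.detail = detail
    def __bool__(self):            # truth testing in Python 3
        return self.value
    __nonzero__ = __bool__         # truth testing in Python 2
    def __eq__(self, other):
        return self.value == other

def func():
    return BoolWithDetail(True, "extra info")

ret = func()
if ret == True:        # old client code keeps working
    print(ret.detail)  # new client code can read the details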
Here are some solutions:
Based on @yupbank's answer, I formalized it into a decorator; see github.com/hupili/multiret
The 8th alternative above wraps the return value in a class. This is the current engineering solution we adopted. In order to wrap more complex return values, we may use a metaclass to generate the required wrapper class on demand. We have not tried this, but it sounds like a robust solution.
Try inspect? I did some experimenting; it's not very elegant, but at least it's doable, and it works :)
import inspect
import re
from functools import wraps

def f1(*args):
    return 2

def f2(*args):
    return 3, 3

PATTERN = dict()
PATTERN[re.compile(r'(\w+) f()')] = f1
PATTERN[re.compile(r'(\w+), (\w+) = f()')] = f2

def execute_method_for(call_str):
    for regex, f in PATTERN.iteritems():
        if regex.findall(call_str):
            return f()

def multi(f1, f2):
    def liu(func):
        @wraps(func)
        def _(*args, **kwargs):
            frame, filename, line_number, function_name, lines, index = \
                inspect.getouterframes(inspect.currentframe())[1]
            call_str = lines[0].strip()
            return execute_method_for(call_str)
        return _
    return liu

@multi(f1, f2)
def f():
    return 1

if __name__ == '__main__':
    print f()
    a, b = f()
    print a, b
Your case does need code editing. However, if you need a hack, you can use function attributes to return extra values without modifying the return values themselves.
def attr_store(varname, value):
    def decorate(func):
        setattr(func, varname, value)
        return func
    return decorate

@attr_store('extra', None)
def func(input_str):
    func.extra = {'hello': input_str + " ,How r you?", 'num': 2}
    return 1

print(func("John") + func("Matt"))
print(func.extra)
Demo: http://codepad.org/0hJOVFcC
However, be aware that function attributes behave like static variables: you need to assign values to them with care, since appends and other in-place modifications will act on previously saved values.
The trick is to avoid applying the actual operation directly when you process the result, and instead pass the operation in as a parameter. For your case, you can use the following code:
def x():
    # return 1
    return 1, 'x'*1

def f(op, f1, f2):
    print eval(str(f1) + op + str(f2))

f('+', x(), x())
If you want a generic solution for more complicated situations, you can extend the f function and specify the processing operation via the op parameter.
The idea is that when a new function is written, its name is appended to a list automatically.
Just to note, I realise I can just use mylist.append(whatever) but I'm specifically looking for a way to automatically append, rather than manually.
So, if we start with...
def function1(*args):
    print "string"

def function2(*args):
    print "string 2"

mylist = []
...is there a way to append 'function1' and 'function2' to mylist automatically so that it would end up like this...
mylist = [function1, function2]
Specifically, I'd like to have the variable name listed, not a string (e.g. "function1").
I'm learning Python and just experimenting, so this doesn't serve any particular purpose at the moment, I just want to know if it's possible.
Thanks in advance for any suggestions and happy answer any questions if I've not been clear.
Just add the function object to the list:
mylist = [function1, function2]
or use .append():
mylist.append(function1)
mylist.append(function2)
Python functions are first-class objects. They are values, just like classes and strings and integers.
If you want to automate this for a whole module, you can use the globals() function to quickly list all functions defined in the module so far, with a little help from the inspect.isfunction() predicate:
import inspect

mylist = [v for v in globals().itervalues()
          if inspect.isfunction(v) and v.__module__ == __name__]
The v.__module__ == __name__ test ensures we only list functions from the current module, not anything we imported.
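For reference, the Python 3 spelling drops itervalues() in favour of values():

import inspect

mylist = [v for v in globals().values()
          if inspect.isfunction(v) and v.__module__ == __name__]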
However, explicit is still better than implicit. Either add mylist.append(functionname) below each function, or use a decorator:
mylist = []

def listed(func):
    mylist.append(func)
    return func

@listed
def function1():
    pass

@listed
def function2():
    pass
Each function you 'mark' with the @listed decorator is added to the mylist list.
In principle, you could do that with a decorator, which would probably qualify as a semi-automatic solution:
@gather
def function1():
    print "function 1"

@gather
def function2():
    print "function 2"
One implementation of such a decorator is essentially a function which gets a function as a parameter:
function_list = []

def gather(func):
    function_list.append(func)  # or .append(func.__name__)
    return func
In this simple incarnation it is probably not useful at all, but popular libraries and frameworks often employ a somewhat enhanced version of this technique. As an example, see Flask's @app.route decorator for specifying functions that handle specific HTTP requests.
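A stripped-down sketch of that enhanced pattern, in the spirit of Flask's @app.route (this is not Flask's actual implementation):

handlers = {}

def route(path):
    def decorator(func):
        handlers[path] = func  # register the handler under its path
        return func
    return decorator

@route('/hello')
def hello():
    return 'Hello, world!'

print(handlers)  # {'/hello': <function hello ...>}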