Python: Redefining function from within the function - python

I have some expensive function f(x) that I want to only calculate once, but is called rather frequently. In essence, the first time the function is called, it should compute a whole bunch of values for a range of x since it will be integrated over anyway and then interpolate that one with splines, and cache the coefficients somehow, possibly in a file for further use.
My idea was to do something like the following, since it would be pretty easy to implement. First time the function is called, it does something, then redefines itself, then does something else from then on. However, it does not work as expected and might in general be bad practice.
def f():
    def g():
        print(2)
    print(1)
    f = g

f()
f()
Expected output:
1
2
Actual output:
1
1
Defining g() outside of f() does not help. Why does this not work? Other than that, the only solution I can think of right now is to use some global variable. Or does it make sense to somehow write a class for this?

This is overly complicated. Instead, use memoization:
def memoized(f):
    res = []  # holds the cached result once it has been computed
    def resf():
        if len(res) == 0:
            res.append(f())
        return res[0]
    return resf
and then simply
@memoized
def f():
    # expensive calculation here ...
    return calculated_value
In Python 3, you can replace memoized with functools.lru_cache.
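A minimal sketch of the functools.lru_cache variant mentioned above, assuming Python 3.2 or later. The function body and the call_log list are hypothetical stand-ins, used only to show that the body runs once.

```python
import functools

call_log = []  # hypothetical: records each real execution of the body


@functools.lru_cache(maxsize=None)
def f():
    # Stand-in for the expensive calculation (an assumption, not the asker's code).
    call_log.append("computed")
    return sum(i * i for i in range(1000))


first = f()   # runs the body
second = f()  # served from the cache; the body does not run again
print(len(call_log))  # 1
```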

Simply add global f at the beginning of the f function, otherwise python creates a local f variable.
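One way to see the local variable this answer refers to is to inspect the compiled function. The sketch below reuses the asker's f and only looks at its code object; the name local_names is made up for illustration.

```python
def f():
    def g():
        print(2)
    print(1)
    f = g  # binds a *local* name f; the module-level f is untouched


# Any name assigned inside the body is compiled as a local variable.
local_names = f.__code__.co_varnames
print("f" in local_names)  # True
```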

Assigning to f inside f's scope does not affect the name outside the function. If you want to rebind f, you can use global:
>>> def f():
...     print(1)
...     global f
...     f = lambda: print(2)
...
>>> f()
1
>>> f()
2
>>> f()
2

You can use memoization and decoration to cache the result. See an example here. A separate question on memoization that might prove useful can be found here.

What you're describing is the kind of problem caching was invented for. Why not just have a buffer to hold the result; before doing the expensive calculation, check if the buffer is already filled; if so, return the buffered result, otherwise, execute the calculation, fill the buffer, and then return the result. No need to go all fancy with self-modifying code for this.
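The buffer idea could be sketched roughly as follows for the tabulate-and-interpolate case in the question. This is only an illustration under stated assumptions: expensive() is a hypothetical stand-in, linear interpolation is used in place of splines to stay within the standard library, and the grid parameters lo, hi and n are invented defaults.

```python
import bisect
import math


def expensive(x):
    # Hypothetical stand-in for the costly function.
    expensive.calls += 1
    return math.sin(x)


expensive.calls = 0
_table = None  # the buffer; None means "not computed yet"


def f(x, lo=0.0, hi=1.0, n=101):
    """Interpolate expensive() on [lo, hi], tabulating it on first use only."""
    global _table
    if _table is None:
        xs = [lo + (hi - lo) * k / (n - 1) for k in range(n)]
        _table = (xs, [expensive(v) for v in xs])
    xs, ys = _table
    # Index of the grid interval containing x, clamped to the table.
    i = min(max(bisect.bisect_right(xs, x) - 1, 0), len(xs) - 2)
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])


f(0.25)
f(0.75)
print(expensive.calls)  # 101: the table was filled once, on the first call
```

Note that in this sketch the grid is fixed by the first call; later values of lo, hi and n are ignored once the buffer is filled.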

Related

Python - multiple functions - output of one to the next

I know this is super basic and I have been searching everywhere but I am still very confused by everything I'm seeing and am not sure the best way to do this and am having a hard time wrapping my head around it.
I have a script where I have multiple functions. I would like the first function to pass its output to the second, then the second to pass its output to the third, etc. Each does its own step in an overall process on the starting dataset.
For example, very simplified with bad names but this is to just get the basic structure:
#!/usr/bin/python
# script called process.py
import sys
infile = sys.argv[1]
def function_one():
    do things
    return function_one_output
def function_two():
    take output from function_one, and do more things
    return function_two_output
def function_three():
    take output from function_two, do more things
    return/print function_three_output
I want this to run as one script and print the output/write to new file or whatever which I know how to do. Just am unclear on how to pass the intermediate outputs of each function to the next etc.
infile -> function_one -> (intermediate1) -> function_two -> (intermediate2) -> function_three -> final result/outfile
I know I need to use return, but I am unsure how to call this at the end to get my final output
Individually?
function_one(infile)
function_two()
function_three()
or within each other?
function_three(function_two(function_one(infile)))
or within the actual function?
def function_one():
    do things
    return function_one_output
def function_two():
    input_for_this_function = function_one()
    # etc etc etc
Thank you friends, I am over complicating this and need a very simple way to understand it.
You could define a data streaming helper function
from functools import reduce
def flow(seed, *funcs):
    return reduce(lambda arg, func: func(arg), funcs, seed)
flow(infile, function_one, function_two, function_three)
#for example
flow('HELLO', str.lower, str.capitalize, str.swapcase)
#returns 'hELLO'
edit
I would now suggest that a more "pythonic" way to implement the flow function above is:
def flow(seed, *funcs):
    for func in funcs:
        seed = func(seed)
    return seed
As ZdaR mentioned, you can run each function and store the result in a variable then pass it to the next function.
def function_one(file):
    do things on file
    return function_one_output
def function_two(myData):
    doThings on myData
    return function_two_output
def function_three(moreData):
    doMoreThings on moreData
    return/print function_three_output
def Main():
    firstData = function_one(infile)
    secondData = function_two(firstData)
    function_three(secondData)
This is assuming your function_three would write to a file or doesn't need to return anything. Another method, if these three functions will always run together, is to call them inside function_three. For example...
def function_three(file):
    firstStep = function_one(file)
    secondStep = function_two(firstStep)
    doThings on secondStep
    return/print to file
Then all you have to do is call function_three in your main and pass it the file.
For safety, readability and debugging ease, I would temporarily store the results of each function.
def function_one():
    do things
    return function_one_output
def function_two(function_one_output):
    take function_one_output and do more things
    return function_two_output
def function_three(function_two_output):
    take function_two_output and do more things
    return/print function_three_output
result_one = function_one()
result_two = function_two(result_one)
result_three = function_three(result_two)
The added benefit here is that you can then check that each function is correct. If the end result isn't what you expected, just print the results you're getting or perform some other check to verify them. (also if you're running on the interpreter they will stay in namespace after the script ends for you to interactively test them)
result_one = function_one()
print(result_one)
result_two = function_two(result_one)
print(result_two)
result_three = function_three(result_two)
print(result_three)
Note: I used multiple result variables, but as PM 2Ring notes in a comment you could just reuse the name result over and over. That'd be particularly helpful if the results would be large variables.
It's always better (for readability, testability and maintainability) to keep your function as decoupled as possible, and to write them so the output only depends on the input whenever possible.
So in your case, the best way is to write each function independently, ie:
def function_one(arg):
    do_something()
    return function_one_result
def function_two(arg):
    do_something_else()
    return function_two_result
def function_three(arg):
    do_yet_something_else()
    return function_three_result
Once you're there, you can of course directly chain the calls:
result = function_three(function_two(function_one(arg)))
but you can also use intermediate variables and try/except blocks if needed for logging / debugging / error handling etc:
r1 = function_one(arg)
logger.debug("function_one returned %s", r1)
try:
    r2 = function_two(r1)
except SomePossibleException as e:
    logger.exception("function_two raised %s for %s", e, r1)
    # either return, re-raise, ask the user what to do etc
    return 42 # when in doubt, always return 42 !
else:
    r3 = function_three(r2)
    print("Yay ! result is %s" % r3)
As an extra bonus, you can now reuse these three functions anywhere, each on its own and in any order.
NB : of course there ARE cases where it just makes sense to call a function from another function... Like, if you end up writing:
result = function_three(function_two(function_one(arg)))
everywhere in your code AND it's not an accidental repetition, it might be time to wrap the whole in a single function:
def call_them_all(arg):
    return function_three(function_two(function_one(arg)))
Note that in this case it might be better to decompose the calls, as you'll find out when you'll have to debug it...
I'd do it this way:
def function_one(x):
    # do things
    output = x ** 1
    return output
def function_two(x):
    output = x ** 2
    return output
def function_three(x):
    output = x ** 3
    return output
Note that I have modified the functions to accept a single argument, x, and added a basic operation to each.
This has the advantage that each function is independent of the others (loosely coupled), which allows them to be reused in other ways. In the example above, function_two() returns the square of its argument, and function_three() the cube of its argument. Each can be called independently from elsewhere in your code, without being entangled in a hardcoded call chain such as you would have if you called one function from another.
You can still call them like this:
>>> x = function_one(3)
>>> x
3
>>> x = function_two(x)
>>> x
9
>>> x = function_three(x)
>>> x
729
which lends itself to error checking, as others have pointed out.
Or like this:
>>> function_three(function_two(function_one(2)))
64
if you are sure that it's safe to do so.
And if you ever wanted to calculate the square or cube of a number, you can call function_two() or function_three() directly (but, of course, you would name the functions appropriately).
With d6tflow you can easily chain together complex data flows and execute them. You can quickly load input and output data for each task. It makes your workflow very clear and intuitive.
import d6tflow

class Function_one(d6tflow.tasks.TaskCache):
    def run(self):
        function_one_output = do_things()
        self.save(function_one_output) # instead of return

@d6tflow.requires(Function_one)
class Function_two(d6tflow.tasks.TaskCache):
    def run(self):
        output_from_function_one = self.inputLoad() # load function input
        function_two_output = do_more_things()
        self.save(function_two_output)

@d6tflow.requires(Function_two)
class Function_three(d6tflow.tasks.TaskCache):
    def run(self):
        output_from_function_two = self.inputLoad()
        function_three_output = do_more_things()
        self.save(function_three_output)

d6tflow.run(Function_three()) # executes all functions
function_one_output = Function_one().outputLoad() # get function output
function_three_output = Function_three().outputLoad()
It has many more useful features like parameter management, persistence, intelligent workflow management. See https://d6tflow.readthedocs.io/en/latest/
Chaining the calls, function_three(function_two(function_one(infile))), would be best: you do not need global variables and each function is completely independent of the others.
Edited to add:
I would also say that function_three should not print anything; if you want to print the returned result, use:
print(function_three(function_two(function_one(infile))))
or something like:
output = function_three(function_two(function_one(infile)))
print(output)
Use parameters to pass the values:
def function1():
    foo = do_stuff()
    return function2(foo)
def function2(foo):
    bar = do_more_stuff(foo)
    return function3(bar)
def function3(bar):
    baz = do_even_more_stuff(bar)
    return baz
def main():
    thing = function1()
    print(thing)

How to make two functions share the same non global variable (Python)

Is there a way to make function B able to access a non-global variable that was declared only in function A, without return statements from function A?
As asked, the question:
Define two functions:
p: prints the value of a variable
q: increments the variable
such that
Initial value of the variable is 0. You can't define the variable in the global environment.
Variable is not located in the global environment and the only way to change it is by invoking q().
The global environment should know only p() and q().
Tip: 1) In Python, a function can return more than 1 value. 2) A function can be assigned to a variable.
# Example:
>>> p()
0
>>> q()
>>> q()
>>> p()
2
The question says the global environment should know only p and q.
So, taking that literally, it could be done inline using a single function scope:
>>> p, q = (lambda x=[0]: (lambda: print(x[0]), lambda: x.__setitem__(0, x[0] + 1)))()
>>> p()
0
>>> q()
>>> q()
>>> p()
2
Using the tips provided as clues, it could be done something like this:
def make_p_and_q():
    context = {'local_var': 0}
    def p():
        print('{}'.format(context['local_var']))
    def q():
        context['local_var'] += 1
    return p, q
p, q = make_p_and_q()
p() # --> 0
q()
q()
p() # --> 2
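In Python 3 the same closure can be written without the dictionary by declaring the variable nonlocal. A minimal sketch, assuming Python 3; the name value is chosen here for illustration:

```python
def make_p_and_q():
    value = 0  # lives in the enclosing function's scope, not in globals

    def p():
        print(value)

    def q():
        nonlocal value  # rebind the enclosing variable instead of creating a local
        value += 1

    return p, q


p, q = make_p_and_q()
p()  # prints 0
q()
q()
p()  # prints 2
```

Only p and q end up in the global namespace; value is reachable solely through the two closures.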
The collection of things that a function can access is generally called its scope. One interpretation of your question is whether B can access a "local variable" of A; that is, one that is defined normally as
def A():
    x = 1
The answer here is "not easily": Python lets you do a lot, but local variables are one of the things that are not meant to be accessed from outside a function.
I suspect what your teacher is getting at is that A can modify things outside of its scope, in order to send information out without sending it through the return value. (Whether this is good coding practise is another matter.) For example, functions are themselves Python objects, and you can assign arbitrary properties to Python objects, so you can actually store values on the function object and read them from outside it.
def a():
    a.key = "value"
a()
print(a.key)
Introspection and hacking with function objects
In fact, you can sort of get at the constant values defined in A by looking at the compiled Python object generated when you define a function. For example, in the example above, "value" is a constant, and constants are stored on the code object:
In [9]: a.func_code.co_consts
Out[9]: (None, 'value')
This is probably not what you meant.
Firstly, it's bad practise to do so. Such variables make debugging difficult and are easy to lose track of, especially in complex code.
Having said that, you can accomplish what you want by declaring a variable as global:
def funcA():
    global foo
    foo = 3
def funcB():
    print(foo) # output is 3, provided funcA() was called first
That's one weird homework assignment; especially the tips make me suspect that you've misunderstood or left out something.
Anyway, here's a simpler solution than the accepted answer: Since calls to q increment the value of the variable, it must be a persistent ("static") variable of some sort. Store it somewhere other than the global namespace, and tell p about it. The obvious place to store it is as an attribute of q:
def q():
    q.x += 1
q.x = 0 # Initialize
def p():
    print(q.x)

Python - using a shared variable in a recursive function

I'm using a recursive function to sort a list in Python, and I want to keep track of the number of sorts/merges as the function continues. However, when I declare/initialize the variable inside the function, it becomes a local variable inside each successive call of the function. If I declare the variable outside the function, the function thinks it doesn't exist (i.e. has no access to it). How can I share this value across different calls of the function?
I tried to use the "global" variable tag inside and outside the function like this:
global invcount ## I tried here, with and without the global tag
def inv_sort (listIn):
    global invcount ## and here, with and without the global tag
    if (invcount == undefined): ## can't figure this part out
        invcount = 0
    #do stuff
But I cannot figure out how to check for the undefined status of the global variable and give it a value on the first recursion call (because on all successive recursions it should have a value and be defined).
My first thought was to return the variable out of each call of the function, but I can't figure out how to pass two objects out of the function, and I already have to pass the list out for the recursion sort to work. My second attempt to resolve this issue involved me adding the variable invcount to the list I'm passing as the last element with an identifier, like "i27". Then I could just check for the presence of the identifier (the letter i in this example) in the last element and if present pop() it off at the beginning of the function call and re-add it during the recursion. In practice this is becoming really convoluted and while it may work eventually, I'm wondering if there is a more practical or easier solution.
Is there a way to share a variable without directly passing/returning it?
There are a couple of things you can do. Taking your example, you should modify it like this:
invcount = 0
def inv_sort (listIn):
    global invcount
    invcount += 1
    # do stuff
But this approach means that you have to reset invcount to zero before each call to inv_sort.
So it's actually better to return invcount as part of the result, for example using tuples like this:
def inv_sort(listIn):
    #somewhere in your code recursive call
    recursive_result, recursive_invcount = inv_sort(argument)
    # this_call_invcount includes recursive_invcount
    return this_call_result, this_call_invcount
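The tuple-returning outline above could be filled in along these lines. This is a sketch under an assumption: it treats invcount as the number of inversions found during a merge sort, which the question does not spell out.

```python
def inv_sort(items):
    """Return (sorted_list, inversion_count) using a recursive merge sort."""
    if len(items) <= 1:
        return list(items), 0
    mid = len(items) // 2
    left, left_count = inv_sort(items[:mid])
    right, right_count = inv_sort(items[mid:])
    merged = []
    count = left_count + right_count  # counts from the recursive calls
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
            # Every element still waiting in left is larger than right[j - 1].
            count += len(left) - i
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, count


print(inv_sort([3, 1, 2]))  # ([1, 2, 3], 2)
```

No global state is involved, so the count starts from zero on every top-level call.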
There's no such thing as an "undefined" variable in Python, and you don't need one.
Outside the function, set the variable to 0. Inside the function, use the global keyword, then increment.
invcount = 0
def inv_sort (listIn):
    global invcount
    ... do stuff ...
    invcount += 1
An alternative might be using a default argument, e.g.:
def inv_sort(listIn, invcount=0):
    ...
    invcount += 1
    ...
    listIn, invcount = inv_sort(listIn, invcount)
    ...
    return listIn, invcount
The downside of this is that your calls get slightly less neat:
l, _ = inv_sort(l) # i.e. ignore the second returned parameter
But this does mean that invcount automatically gets reset each time the function is called with a single argument (and also provides the opportunity to inject a value of invcount if necessary for testing: assert (result, 6) == inv_sort(test, 5)).
Assuming that you don't need to know the count inside the function, I would approach this using a decorator function:
import functools
def count_calls(f):
    @functools.wraps(f)
    def func(*args):
        func.count += 1
        return f(*args)
    func.count = 0
    return func
You can now decorate your recursive function:
@count_calls
def inv_sort(...):
    ...
And check or reset the count before or after calling it:
inv_sort.count = 0
l = inv_sort(l)
print(inv_sort.count)
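A small runnable illustration of this decorator on a recursive function. The function halve is a hypothetical toy used instead of the sort; recursive calls are counted too, because the name halve inside the body refers to the decorated wrapper.

```python
import functools


def count_calls(f):
    @functools.wraps(f)
    def func(*args):
        func.count += 1
        return f(*args)
    func.count = 0
    return func


@count_calls
def halve(n):
    # Toy recursion: keep halving until n is at most 1.
    if n <= 1:
        return n
    return halve(n // 2)


halve.count = 0
halve(8)            # calls for n = 8, 4, 2, 1
print(halve.count)  # 4
```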

How to use iter(v,w) with a function?

I was studying iter(). Its official documentation says I can call iter(v, w), so that iter() will call v until it returns the value w, and then stop.
But I tried for half an hour and still can't work out a function that can return multiple results.
Here is my code; I expect it to return 1,2,3,4,5:
def x():
    for i in range(10):
        return i
a = iter(x, 5)
a.next()
I know that when I return i, I actually quit the function.
Maybe it's impossible for a function to return a result multiple times.
But how should I write a function so that iter(x,5) works properly?
iter() calls the function each time. Your function returns the same value on each call (the first number in the range(10) list).
You could change the function to use a global to illustrate how iter() with two arguments works:
i = 0
def f():
    global i
    i += 1
    return i
for x in iter(f, 5):
    print(x)
Now each time f() is called, a new number is returned. You could use a default argument, or an instance with state and a method on that instance, too. As long as the function returns something different when called more than once it'll fit the iter(a, b) usecase.
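The "instance with state and a method" variant mentioned above might look like this. The Counter class is a hypothetical example, not part of any library.

```python
class Counter:
    """Keeps its state on the instance instead of in a global."""

    def __init__(self):
        self.value = 0

    def step(self):
        self.value += 1
        return self.value


# iter() calls the bound method repeatedly and stops when it returns 5.
values = list(iter(Counter().step, 5))
print(values)  # [1, 2, 3, 4]
```

The sentinel itself is never yielded, which is why the list stops at 4.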
iter() with two arguments is most often called with a method, where the state of an instance changes with each call. The .readline() method on a file object, for example:
for line in iter(fileobject.readline, ''):
which would work exactly like iterating over the fileobject iterable directly, except it wouldn't use the internal file iteration buffer. That could sometimes be a requirement (see the file.next() method for more information on the file iteration buffer).
You can of course pass in a lambda function too:
for chunk in iter(lambda: fileobject.read(2048), ''):
Now we are reading the file object in chunks of up to 2048 bytes instead of line by line.
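A self-contained sketch of the chunked-read pattern, assuming Python 3 and a binary stream, where the end-of-file sentinel is b"" rather than ''. An in-memory io.BytesIO stands in for a real file, and functools.partial replaces the lambda.

```python
import io
from functools import partial

# Hypothetical stand-in for a file opened in binary mode.
fileobject = io.BytesIO(b"x" * 5000)

# read(2048) returns b"" at end of file, which matches the sentinel and ends the loop.
sizes = [len(chunk) for chunk in iter(partial(fileobject.read, 2048), b"")]
print(sizes)  # [2048, 2048, 904]
```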
After @Martijn Pieters's answer, I've got the idea.
And this is the piece of code I wrote which can use iter(v,w) correctly:
import random
def x():
    return random.randrange(1,10)
a = iter(x,5)
while True:
    print(next(a))
In this code, next(a) returns the value a gets from x() until x() returns 5; at that point the iterator raises StopIteration and the loop stops.
You could also use a generator function, via the yield keyword. Example:
def x():
    for i in range(10):
        yield i
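To get behaviour similar to iter(x, 5) from a generator, one option is itertools.takewhile. A sketch, assuming the generator should start at 1 as in the asker's expected output:

```python
import itertools


def x():
    for i in range(1, 10):
        yield i


# Stop as soon as the value 5 shows up; like the iter() sentinel, 5 itself is not included.
values = list(itertools.takewhile(lambda i: i != 5, x()))
print(values)  # [1, 2, 3, 4]
```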

Python lazy evaluator

Is there a Pythonic way to encapsulate a lazy function call, whereby on first use of the function f(), it calls a previously bound function g(Z) and on the successive calls f() returns a cached value?
Please note that memoization might not be a perfect fit.
I have:
f = g(Z)
if x:
    return 5
elif y:
    return f
elif z:
    return h(f)
The code works, but I want to restructure it so that g(Z) is only called if the value is used. I don't want to change the definition of g(...), and Z is a bit big to cache.
EDIT: I assumed that f would have to be a function, but that may not be the case.
I'm a bit confused whether you seek caching or lazy evaluation. For the latter, check out the module lazy.py by Alberto Bertogli.
Try using this decorator:
class Memoize:
    def __init__ (self, f):
        self.f = f
        self.mem = {}
    def __call__ (self, *args, **kwargs):
        if (args, str(kwargs)) in self.mem:
            return self.mem[args, str(kwargs)]
        else:
            tmp = self.f(*args, **kwargs)
            self.mem[args, str(kwargs)] = tmp
            return tmp
(extracted from dead link: http://snippets.dzone.com/posts/show/4840 / https://web.archive.org/web/20081026130601/http://snippets.dzone.com/posts/show/4840)
(Found here: Is there a decorator to simply cache function return values? by Alex Martelli)
EDIT: Here's another in form of properties (using __get__) http://code.activestate.com/recipes/363602/
You can employ a cache decorator. Let's see an example:
from functools import wraps
class FuncCache(object):
    def __init__(self):
        self.cache = {}
    def __call__(self, func):
        @wraps(func)
        def callee(*args, **kwargs):
            key = (args, str(kwargs))
            # see if there is already a result in the cache
            if key in self.cache:
                result = self.cache.get(key)
            else:
                result = func(*args, **kwargs)
                self.cache[key] = result
            return result
        return callee
With the cache decorator, here you can write
my_cache = FuncCache()
@my_cache
def foo(n):
    """Expensive calculation
    """
    total = 0
    for i in range(n):
        total += i
    print('called foo with result', total)
    return total
print(foo(10000))
print(foo(10000))
print(foo(1234))
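As you can see from the output
called foo with result 49995000
49995000
49995000
called foo with result 760761
760761
foo runs only once for each distinct argument: the second foo(10000) call is served from the cache. You don't have to change any line of your function foo. That's the power of decorators.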
There are quite a few decorators out there for memoization:
http://wiki.python.org/moin/PythonDecoratorLibrary#Memoize
http://code.activestate.com/recipes/498110-memoize-decorator-with-o1-length-limited-lru-cache/
http://code.activestate.com/recipes/496879-memoize-decorator-function-with-cache-size-limit/
Coming up with a completely general solution is harder than you might think. For instance, you need to watch out for non-hashable function arguments and you need to make sure the cache doesn't grow too large.
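Both pitfalls can be seen with the standard library's functools.lru_cache (Python 3.2+), which bounds the cache via maxsize and requires hashable arguments. A short sketch; square is a hypothetical example function.

```python
import functools


@functools.lru_cache(maxsize=2)
def square(n):
    return n * n


for n in (1, 2, 3):
    square(n)

# Only the two most recently used results are kept.
info = square.cache_info()
print(info.currsize, info.misses)  # 2 3

# Unhashable arguments cannot be used as cache keys.
try:
    square([1])
    rejected = False
except TypeError:
    rejected = True
print(rejected)  # True
```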
If you're really looking for a lazy function call (one where the function is only actually evaluated if and when the value is needed), you could probably use generators for that.
EDIT: So I guess what you want really is lazy evaluation after all. Here's a library that's probably what you're looking for:
http://pypi.python.org/pypi/lazypy/0.5
Just for completeness, here is a link to my lazy-evaluator decorator recipe:
https://bitbucket.org/jsbueno/metapython/src/f48d6bd388fd/lazy_decorator.py
Here's a pretty brief lazy-decorator, though it lacks @functools.wraps (and actually returns an instance of Lazy, plus some other potential pitfalls):
class Lazy(object):
    def __init__(self, calculate_function):
        self._calculate = calculate_function
    def __get__(self, obj, _=None):
        if obj is None:
            return self
        value = self._calculate(obj)
        # Store the value on the instance so it shadows this descriptor next time.
        setattr(obj, self._calculate.__name__, value)
        return value
# Sample use:
class SomeClass(object):
    @Lazy
    def someprop(self):
        print('Actually calculating value')
        return 13
o = SomeClass()
o.someprop
o.someprop
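On Python 3.8 and later, the standard library's functools.cached_property covers the same use case. A minimal sketch; the calculations counter is added here only to make the single evaluation visible.

```python
from functools import cached_property


class SomeClass:
    def __init__(self):
        self.calculations = 0  # hypothetical counter for illustration

    @cached_property
    def someprop(self):
        self.calculations += 1
        return 13


o = SomeClass()
first = o.someprop   # computes and stores the value on the instance
second = o.someprop  # read back from the instance, no recomputation
print(first, second, o.calculations)  # 13 13 1
```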
Curious why you don't just use a lambda in this scenario?
f = lambda: g(Z)
if x:
    return 5
if y:
    return f()
if z:
    return h(f())
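If the deferred value might be needed more than once, the lambda can be wrapped so that g runs at most one time. The sketch below is one possible shape; LazyValue, choose, and the bodies of g and h are hypothetical stand-ins, not the asker's code.

```python
class LazyValue:
    """Defer func(*args, **kwargs) until get() is first called, then reuse the result."""

    _UNSET = object()  # marker meaning "not evaluated yet"

    def __init__(self, func, *args, **kwargs):
        self._func = func
        self._args = args
        self._kwargs = kwargs
        self._value = LazyValue._UNSET

    def get(self):
        if self._value is LazyValue._UNSET:
            self._value = self._func(*self._args, **self._kwargs)
        return self._value


calls = []  # records each real call to g


def g(Z):
    calls.append(Z)
    return Z * 2


def h(value):
    return value + 1


def choose(x, y, z, Z):
    f = LazyValue(g, Z)
    if x:
        return 5          # g is never called on this path
    elif y:
        return f.get()
    elif z:
        return h(f.get())


print(choose(True, False, False, 10), calls)   # 5 []
print(choose(False, True, False, 10), calls)   # 20 [10]
```

The wrapper only holds a reference to Z until the value is requested; it does not copy or cache Z itself.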
Even after your edit, and the series of comments with detly, I still don't really understand. In your first sentence, you say the first call to f() is supposed to call g(), but subsequently return cached values. But then in your comments, you say "g() doesn't get called no matter what" (emphasis mine). I'm not sure what you're negating: Are you saying g() should never be called (doesn't make much sense; why does g() exist?); or that g() might be called, but might not (well, that still contradicts that g() is called on the first call to f()). You then give a snippet that doesn't involve g() at all, and really doesn't relate to either the first sentence of your question, or to the comment thread with detly.
In case you go editing it again, here is the snippet I am responding to:
I have:
    a = f(Z)
    if x:
        return 5
    elif y:
        return a
    elif z:
        return h(a)
The code works, but I want to restructure it so that f(Z) is only called if the value is used. I don't want to change the definition of f(...), and Z is a bit big to cache.
If that is really your question, then the answer is simply
if x:
    return 5
elif y:
    return f(Z)
elif z:
    return h(f(Z))
That is how to achieve "f(Z) is only called if the value is used".
I don't fully understand "Z is a bit big to cache". If you mean there will be too many different values of Z over the course of program execution that memoization is useless, then maybe you have to resort to precalculating all the values of f(Z) and just looking them up at run time. If you can't do this (because you can't know the values of Z that your program will encounter) then you are back to memoization. If that's still too slow, then your only real option is to use something faster than Python (try Psyco, Cython, ShedSkin, or hand-coded C module).
