Intermediate results from recursion - python

I have a problem where I need to produce something which is naturally computed recursively, but where I also need to be able to interrogate the intermediate steps in the recursion if needed.
I know I can do this by passing and mutating a list or similar structure. However, this looks ugly to me and I'm sure there must be a neater way, e.g. using generators. What I would ideally love to be able to do is something like:
intermediate_results = [f(x) for x in range(T)]
final_result = intermediate_results[T-1]
in an efficient way. While my solution is not performance critical, I can't justify the massive amount of redundant effort in that first line. It looks to me like a generator would be perfect for this except for the fact that f is fundamentally much more suited to recursion in my case (which at least in my mind is the complete opposite of a generator, but maybe I'm just not thinking far enough outside of the box).
Is there a neat Pythonic way of doing something like this that I just don't know about, or do I just need to just capitulate and pollute my function f by passing it an intermediate_results list which I then mutate as a side-effect?

I have a generic solution for you using a decorator. We create a Memoize class which stores the results of previous times the function is executed (including in recursive calls). If the arguments given have already been seen, the cached versions are used to quickly lookup the result.
The custom class has the benefit over an lru_cache in that you can see the results.
from functools import wraps
class Memoize:
def __init__(self):
self.store = {}
def save(self, fun):
#wraps(fun)
def wrapper(*args):
if args not in self.store:
self.store[args] = fun(*args)
return self.store[args]
return wrapper
m = Memoize()
#m.save
def fibo(n):
if n <= 0: return 0
elif n == 1: return 1
else: return fibo(n-1) + fibo(n-2)
Then after running different things you can see what the cache contains. When you run future function calls, m.store will be used as a lookup so calculation doesn't need to be redone.
>>> f(8)
21
>>> m.store
{(1,): 1,
(0,): 0,
(2,): 1,
(3,): 2,
(4,): 3,
(5,): 5,
(6,): 8,
(7,): 13,
(8,): 21}
You could modify the save function to use the name of the function and the args as the key, so that multiple function results can be stored in the same Memoize class.

You can use your existing solution that makes many "redundant" calls to f, but employ the use of function caching to save the results to previous calls to f.
In other words, when f(x1) is called, it's input arguments and corresponding return values are saved, and the next time it is called, the result is simply pulled from the cache
see functools.lru_cache for the standard library solution to this
ie:
from functools import lru_cache
#lru_cache
intermediate_results = [f(x) for x in range(T)]
final_result = intermediate_results[T-1]
Note, however, f must be a pure function (no side-effects, 1-to-1 mapping) for this to work properly

Having considered your comments, I'll now try to give another perspective on the problem.
So, let's consider a concrete example:
def f(x):
a = 2
return g(x) + a if x != 0 else 0
def g(x):
b = 1
return h(x) - b
def h(x):
c = 1/2
return f(x-1)*(1+c)
I
First of all, it should be mentioned that (in our particular case) the algorithm has form of: f(x) = p(f(x - 1)) for some p. It follows that f(x) = p^x(f(0)) = p^x(0). That means we should just apply p to 0 x times to get the desired result, which can be done in an iterative process, so this can be written without recursion. Though I believe that your real case is much harder. Moreover, it would be too boring and uninformative to stop here)
II
Generally speaking, we can divide all possible solutions into two groups: the ones that require refactoring (i.e. rewriting functions f, g, h) and the ones that do not. I have little to offer from the latter one (and I don't think anyone can). Consider the following, however:
def fk(x, k):
a = 2
return k(gk(x, k) + a if x != 0 else 0)
def gk(x, k):
b = 1
return k(hk(x, k) - b)
def hk(x, k):
c = 1/2
return k(fk(x-1, k)*(1+c))
def printret(x):
print(x)
return x
f(4, printret) # see what happens
Inspired by continuation-passing style, but that's totally not it.
What's the point? It's something between your idea of passing a list to write down all the computations and memoizing. This k carries additional behavior with it, such as printing or writing to list (you can make a function that writes to some list, why not?). But if you look carefully you'll see that it lefts inner code of these functions practically untouched (only input and output to function are affected), so one can produce a decorator associated with a function like printret that does essentially the same thing for f, g, h.
Pros: no need to modify code, much more flexible than passing a list, no additional work (like in memoizing).
Cons: Impure (printing or modifying sth), not so flexible as we would like.
III
Now let's see how modifying function bodies can help. Don't be afraid of what's written below, take your time and play with that thing a little.
class Logger:
def __init__(self, lst, cur_val):
self.lst = lst
self.cur_val = cur_val
def bind(self, f):
res = f(self.cur_val)
return Logger([self.cur_val] + res.lst + self.lst, res.cur_val)
def __repr__(self):
return "Logger( " + repr({'value' : self.cur_val,'lst' : self.lst}) + " )"
def unit(x):
return Logger([], x)
# you can also play with lala
def lala(x):
if x <= 0:
return unit(1)
else:
return lala(x - 1).bind(lambda y: unit(2*y))
def f(x):
a = 2
if x == 0:
return unit(0)
else:
return g(x).bind(lambda y: unit(y + a))
def g(x):
b = 1
return h(x).bind(lambda y: unit(y - b))
def h(x):
c = 1/2
return f(x-1).bind(lambda y: unit(y*(1+c)))
f(4) # see for yourself
Logger is called a monad. I'm not very familiar with this concept myself, but I guess I'm doing everything right) f, g, h are functions that take a number and return a Logger instance. Logger's bind takes in a function (like f) and returns Logger with new value (computed by f) and updated 'logs'. The key point - as I see it - is the ability to do whatever we want with collected functions in the order the resulting value was calculated.
Afterword
I'm not at all some kind of 'guru' of functional programming, I believe I'm missing a lot of things here. But what I've understood is that functional programming is about inversing the flow of the program. That's why, for instance, I totally agree with your opinion about generators being opposed to functional programming. When we use generator gen in, say, function func, we yield values one by one to func and func does sth with them in e.g. a loop. The functional approach would be to make gen a function taking func as a parameter and make func perform computations on 'yielded' values. It's like gen and func exchanged their places. So the flow is inversed! And there are plenty of other ways of inversing the flow. Monads are one of them.

itertools islice gets a generator, start value and stop value. it will give you the elements between the start value and stop value as a generator. if islice is not clear you can check the docs here https://docs.python.org/3/library/itertools.html
intermediate_result = map(f, range(T))
final_result = next(itertools.islice(intermediate_result, start=T-1, stop=T))

Related

Is there a way to change the argument name of a nested function?

So the easiest way to illustrate my question is through an example. Let's say I want to write a function function(a=None,b=None,c=None), this function will only take two arguments at a time and the third one will be left blank. For any two arguments that one gives to the function it will return the missing one such that a+b=c. So for example function(a=5, c=15) would return 10 and for function(a = 5, b = 10) it would return 15. Now for the sake of the argument let's say that a could not be written as a function of b and c, or its simply too complicated to find a closed solution (this is clearly not the case here because to find a, I could simply say a = c-b). Anyway, if I was to write such a function I'd do something like this:
#import a root finder
from scipy.optimize import newton
def function(a= None, b= None, c = None):
#find the missing parameter:
vars = [a,b,c]
comp = vars.index(None)
if comp == 0:
def aux_fun(a):
return (a+b-c)
elif comp == 1:
def aux_fun(b):
return (a+b-c)
else:
def aux_fun(c):
return (a+b-c)
return newton(aux_fun, 0)
I have not found a solution to this other than writing 3 different functions and calling the correct one in newton. This works for this small example but if I have a bigger problem let's say with 100 variables writing 100 functions is not pretty.
My question is: is there any way such that I only have to write aux_fun once and change its parameter based on the missing parameter from function
Thanks a lot for your answers!
I haven't fully grasped what you are trying to do, but I don't think you quite understand how newton works.
optimize.newton is going to call your function with
func(x, a, b, c ...)
where x is a scalar or array of the size of the initial value, the x0. The other arguments are passed via the args tuple. This pattern of passing an iteration variable and args to the function is widely used in these scipy.optimize functions. I've answered a number of questions regarding these arguments.
newton(func, x0, args=(a,b,c))
These aren't keyword arguments.
Read and experiment with the examples in the docs. People often mess up the args tuple. And then explore a few small examples of your own before seriously trying to do this 'renaming'.
edit
This might work - I haven't tested it:
def function(a= None, b= None, c = None):
#find the missing parameter:
vars = [a,b,c]
comp = vars.index(None)
def aux_fun(x):
vars[comp] = x
a,b,c = vars
return (a+b-c)
return newton(aux_fun, 0)
What you want is impossible. Given a generic function f(a, b, c, d) = 0, there is no way to generically turn it into a set of functions a = f1(b, c, d), b = f2(a, c, d), etc. You are entering the realm of symbolic computation. You would need your code to understand trigonometry, algebra, calculus, exponentiation and logarithms.
Updating based on comments below.
So you want something like:
def aux_func(x):
vars[comp] = x
return f(*vars)

SICP "streams as signals" in Python

I have found some nice examples (here, here) of implementing SICP-like streams in Python. But I am still not sure how to handle an example like the integral found in SICP 3.5.3 "Streams as signals."
The Scheme code found there is
(define (integral integrand initial-value dt)
(define int
(cons-stream initial-value
(add-streams (scale-stream integrand dt)
int)))
int)
What is tricky about this one is that the returned stream int is defined in terms of itself (i.e., the stream int is used in the definition of the stream int).
I believe Python could have something similarly expressive and succinct... but not sure how. So my question is, what is an analogous stream-y construct in Python? (What I mean by a stream is the subject of 3.5 in SICP, but briefly, a construct (like a Python generator) that returns successive elements of a sequence of indefinite length, and can be combined and processed with operations such as add-streams and scale-stream that respect streams' lazy character.)
There are two ways to read your question. The first is simply: How do you use Stream constructs, perhaps the ones from your second link, but with a recursive definition? That can be done, though it is a little clumsy in Python.
In Python you can represent looped data structures but not directly. You can't write:
l = [l]
but you can write:
l = [None]
l[0] = l
Similarly you can't write:
def integral(integrand,initial_value,dt):
int_rec = cons_stream(initial_value,
add_streams(scale_stream(integrand,dt),
int_rec))
return int_rec
but you can write:
def integral(integrand,initial_value,dt):
placeholder = Stream(initial_value,lambda : None)
int_rec = cons_stream(initial_value,
add_streams(scale_stream(integrand,dt),
placeholder))
placeholder._compute_rest = lambda:int_rec
return int_rec
Note that we need to clumsily pre-compute the first element of placeholder and then only fix up the recursion for the rest of the stream. But this does all work (alongside appropriate definitions of all the rest of the code - I'll stick it all at the bottom of this answer).
However, the second part of your question seems to be asking how to do this naturally in Python. You ask for an "analogous stream-y construct in Python". Clearly the answer to that is exactly the generator. The generator naturally provides the lazy evaluation of the stream concept. It differs by not being naturally expressed recursively but then Python does not support that as well as Scheme, as we will see.
In other words, the strict stream concept can be expressed in Python (as in the link and above) but the idiomatic way to do it is to use generators.
It is more or less possible to replicate the Scheme example by a kind of direct mechanical transformation of stream to generator (but avoiding the built-in int):
def integral_rec(integrand,initial_value,dt):
def int_rec():
for x in cons_stream(initial_value,
add_streams(scale_stream(integrand,dt),int_rec())):
yield x
for x in int_rec():
yield x
def cons_stream(a,b):
yield a
for x in b:
yield x
def add_streams(a,b):
while True:
yield next(a) + next(b)
def scale_stream(a,b):
for x in a:
yield x * b
The only tricky thing here is to realise that you need to eagerly call the recursive use of int_rec as an argument to add_streams. Calling it doesn't start it yielding values - it just creates the generator ready to yield them lazily when needed.
This works nicely for small integrands, though it's not very pythonic. The Scheme version works by optimising the tail recursion - the Python version will exceed the max stack depth if your integrand is too long. So this is not really appropriate in Python.
A direct and natural pythonic version would look something like this, I think:
def integral(integrand,initial_value,dt):
value = initial_value
yield value
for x in integrand:
value += dt * x
yield value
This works efficiently and correctly treats the integrand lazily as a "stream". However, it uses iteration rather than recursion to unpack the integrand iterable, which is more the Python way.
In moving to natural Python I have also removed the stream combination functions - for example, replaced add_streams with +=. But we could still use them if we wanted a sort of halfway house version:
def accum(initial_value,a):
value = initial_value
yield value
for x in a:
value += x
yield value
def integral_hybrid(integrand,initial_value,dt):
for x in accum(initial_value,scale_stream(integrand,dt)):
yield x
This hybrid version uses the stream combinations from the Scheme and avoids only the tail recursion. This is still pythonic and python includes various other nice ways to work with iterables in the itertools module. They all "respect streams' lazy character" as you ask.
Finally here is all the code for the first recursive stream example, much of it taken from the Berkeley reference:
class Stream(object):
"""A lazily computed recursive list."""
def __init__(self, first, compute_rest, empty=False):
self.first = first
self._compute_rest = compute_rest
self.empty = empty
self._rest = None
self._computed = False
#property
def rest(self):
"""Return the rest of the stream, computing it if necessary."""
assert not self.empty, 'Empty streams have no rest.'
if not self._computed:
self._rest = self._compute_rest()
self._computed = True
return self._rest
def __repr__(self):
if self.empty:
return '<empty stream>'
return 'Stream({0}, <compute_rest>)'.format(repr(self.first))
Stream.empty = Stream(None, None, True)
def cons_stream(a,b):
return Stream(a,lambda : b)
def add_streams(a,b):
if a.empty or b.empty:
return Stream.empty
def compute_rest():
return add_streams(a.rest,b.rest)
return Stream(a.first+b.first,compute_rest)
def scale_stream(a,scale):
if a.empty:
return Stream.empty
def compute_rest():
return scale_stream(a.rest,scale)
return Stream(a.first*scale,compute_rest)
def make_integer_stream(first=1):
def compute_rest():
return make_integer_stream(first+1)
return Stream(first, compute_rest)
def truncate_stream(s, k):
if s.empty or k == 0:
return Stream.empty
def compute_rest():
return truncate_stream(s.rest, k-1)
return Stream(s.first, compute_rest)
def stream_to_list(s):
r = []
while not s.empty:
r.append(s.first)
s = s.rest
return r
def integral(integrand,initial_value,dt):
placeholder = Stream(initial_value,lambda : None)
int_rec = cons_stream(initial_value,
add_streams(scale_stream(integrand,dt),
placeholder))
placeholder._compute_rest = lambda:int_rec
return int_rec
a = truncate_stream(make_integer_stream(),5)
print(stream_to_list(integral(a,8,.5)))

calculating current value based on previous value

i would like to perform a calculation using python, where the current value (i) of the equation is based on the previous value of the equation (i-1), which is really easy to do in a spreadsheet but i would rather learn to code it
i have noticed that there is loads of information on finding the previous value from a list, but i don't have a list i need to create it! my equation is shown below.
h=(2*b)-h[i-1]
can anyone give me tell me a method to do this ?
i tried this sort of thing, but that will not work as when i try to do the equation i'm calling a value i haven't created yet, if i set h=0 then i get an error that i am out of index range
i = 1
for i in range(1, len(b)):
h=[]
h=(2*b)-h[i-1]
x+=1
h = [b[0]]
for val in b[1:]:
h.append(2 * val - h[-1]) # As you add to h, you keep up with its tail
for large b list (brr, one-letter identifier), to avoid creating large slice
from itertools import islice # For big list it will keep code less wasteful
for val in islice(b, 1, None):
....
As pointed out by #pad, you simply need to handle the base case of receiving the first sample.
However, your equation makes no use of i other than to retrieve the previous result. It's looking more like a running filter than something which needs to maintain a list of past values (with an array which might never stop growing).
If that is the case, and you only ever want the most recent value,then you might want to go with a generator instead.
def gen():
def eqn(b):
eqn.h = 2*b - eqn.h
return eqn.h
eqn.h = 0
return eqn
And then use thus
>>> f = gen()
>>> f(2)
4
>>> f(3)
2
>>> f(2)
0
>>>
The same effect could be acheived with a true generator using yield and send.
First of, do you need all the intermediate values? That is, do you want a list h from 0 to i? Or do you just want h[i]?
If you just need the i-th value you could us recursion:
def get_h(i):
if i>0:
return (2*b) - get_h(i-1)
else:
return h_0
But be aware that this will not work for large i, as it will exceed the maximum recursion depth. (Thanks for pointing this out kdopen) In that case a simple for-loop or a generator is better.
Even better is to use a (mathematically) closed form of the equation (for your example that is possible, it might not be in other cases):
def get_h(i):
if i%2 == 0:
return h_0
else:
return (2*b)-h_0
In both cases h_0 is the initial value that you start out with.
h = []
for i in range(len(b)):
if i>0:
h.append(2*b - h[i-1])
else:
# handle i=0 case here
You are successively applying a function (equation) to the result of a previous application of that function - the process needs a seed to start it. Your result looks like this [seed, f(seed), f(f(seed)), f(f(f(seed)), ...]. This concept is function composition. You can create a generalized function that will do this for any sequence of functions, in Python functions are first class objects and can be passed around just like any other object. If you need to preserve the intermediate results use a generator.
def composition(functions, x):
""" yields f(x), f(f(x)), f(f(f(x)) ....
for each f in functions
functions is an iterable of callables taking one argument
"""
for f in functions:
x = f(x)
yield x
Your specs require a seed and a constant,
seed = 0
b = 10
The equation/function,
def f(x, b = b):
return 2*b - x
f is applied b times.
functions = [f]*b
Usage
print list(composition(functions, seed))
If the intermediate results are not needed composition can be redefined as
def composition(functions, x):
""" Returns f(x), g(f(x)), h(g(f(x)) ....
for each function in functions
functions is an iterable of callables taking one argument
"""
for f in functions:
x = f(x)
return x
print composition(functions, seed)
Or more generally, with no limitations on call signature:
def compose(funcs):
'''Return a callable composed of successive application of functions
funcs is an iterable producing callables
for [f, g, h] returns f(g(h(*args, **kwargs)))
'''
def outer(f, g):
def inner(*args, **kwargs):
return f(g(*args, **kwargs))
return inner
return reduce(outer, funcs)
def plus2(x):
return x + 2
def times2(x):
return x * 2
def mod16(x):
return x % 16
funcs = (mod16, plus2, times2)
eq = compose(funcs) # mod16(plus2(times2(x)))
print eq(15)
While the process definition appears to be recursive, I resisted the temptation so I could stay out of maximum recursion depth hades.
I got curious, searched SO for function composition and, of course, there are numerous relavent Q&A's.

Automatically use list comprehension/map() recursion if a function is given a list

As a Mathematica user, I like functions that automatically "threads over lists" (as the Mathematica people call it - see http://reference.wolfram.com/mathematica/ref/Listable.html). That means that if a function is given a list instead of a single value, it automatically uses each list entry as an argument and returns a list of the results - e.g.
myfunc([1,2,3,4]) -> [myfunc(1),myfunc(2),myfunc(3),myfunc(4)]
I implemented this principle in Python like this:
def myfunc(x):
if isinstance(x,list):
return [myfunc(thisx) for thisx in x]
#rest of the function
Is this a good way to do it? Can you think of any downsides of this implementation or the strategy overall?
If this is something you're going to do in a lot of functions, you could use a Python decorator. Here's a simple but useful one.
def threads_over_lists(fn):
def wrapped(x, *args, **kwargs):
if isinstance(x, list):
return [fn(e, *args, **kwargs) for e in x]
return fn(x, *args, **kwargs)
return wrapped
This way, just adding the line #threads_over_lists before your function would make it behave this way. For example:
#threads_over_lists
def add_1(val):
return val + 1
print add_1(10)
print add_1([10, 15, 20])
# if there are multiple arguments, threads only over the first element,
# keeping others the same
#threads_over_lists
def add_2_numbers(x, y):
return x + y
print add_2_numbers(1, 10)
print add_2_numbers([1, 2, 3], 10)
You should also consider whether you want this to vectorize only over lists, or also over other iterable objects like tuples and generators. This is a useful StackOverflow question for determining that. Be careful, though- a string is iterable, but you probably won't want your function operating on each character within it.
That's a good way to do it.
However, you would have to do it for each function you write.
To avoid that, you could use a decorator like this one :
def threads(fun):
def wrapper(element_or_list):
if isinstance(element_or_list, list):
return [fun(element) for element in element_or_list]
else:
return fun(element_or_list)
return wrapper
#threads
def plusone(e):
return e + 1
print(plusone(1))
print(plusone([1, 2, 3]))

Function closure performance

I thought that I improve performance when I replace this code:
def f(a, b):
return math.sqrt(a) * b
result = []
a = 100
for b in range(1000000):
result.append(f(a, b))
with:
def g(a):
def f(b):
return math.sqrt(a) * b
return f
result = []
a = 100
func = g(a)
for b in range(1000000):
result.append(func(b))
I assumed that since a is fixed when the closure is performed, the interpreter would precompute everything that involves a, and so math.sqrt(a) would be repeated just once instead of 1000000 times.
Is my understanding always correct, or always incorrect, or correct/incorrect depending on the implementation?
I noticed that the code object for func is built (at least in CPython) before runtime, and is immutable. The code object then seems to use global environment to achieve the closure. This seems to suggest that the optimization I hoped for does not happen.
I assumed that since a is fixed when the closure is performed, the interpreter would precompute everything that involves a, and so
math.sqrt(a) would be repeated just once instead of 1000000 times.
That assumption is wrong, I don't know where it came from. A closure just captures variable bindings, in your case it captures the value of a, but that doesn't mean that any more magic is going on: The expression math.sqrt(a) is still evaluated every time f is called.
After all, it has to be computed every time because the interpreter doesn't know that sqrt is "pure" (the return value is only dependent on the argument and no side-effects are performed). Optimizations like the ones you expect are practical in functional languages (referential transparency and static typing help a lot here), but would be very hard to implement in Python, which is an imperative and dynamically typed language.
That said, if you want to precompute the value of math.sqrt(a), you need to do that explicitly:
def g(a):
s = math.sqrt(a)
def f(b):
return s * b
return f
Or using lambda:
def g(a):
s = math.sqrt(a)
return lambda b: s * b
Now that g really returns a function with 1 parameter, you have to call the result with only one argument.
The code is not evaluated statically; the code inside the function is still calculated each time. The function object contains all the byte code which expresses the code in the function; it doesn't evaluate any of it. You could improve matters by calculating the expensive value once:
def g(a):
root_a = math.sqrt(a)
def f(b):
return root_a * b
return f
result = []
a = 100
func = g(a)
for b in range(1000000):
result.append(func(b))
Naturally, in this trivial example, you could improve performance much more:
a = 100
root_a = math.sqrt(a)
result = [root_a * b for b in range(1000000)]
But I presume you're working with a more complex example than that where that doesn't scale?
As usual, the timeit module is your friend. Try some things and see how it goes. If you don't care about writing ugly code, this might help a little as well:
def g(a):
def f(b,_local_func=math.sqrt):
return _local_func(a)*b
Apparently python takes a performance penalty whenever it tries to access a "global" variable/function. If you can make that access local, you can shave off a little time.

Categories