This is a question about scope and closures in Python, motivated by an exercise in SICP. Much thanks for your time if you read this!
A question (3.2) in SICP asks one to create a procedure "make-monitored" that takes a function f (of one parameter) as input and returns a procedure that keeps track of how many times f has been called. (If the input to this new procedure is "num-calls" it returns the number of times f has been called; if it is "reset" it resets the counter to 0; for anything else, it applies f to the input and returns the result, after appropriately incrementing the counter.)
Here is code in Scheme that I wrote that works:
(define (make-monitored f)
  (let ((counter 0))
    (define (number-calls) counter)
    (define (reset-count)
      (set! counter 0))
    (define (call-f input)
      (begin (set! counter (+ 1 counter))
             (f input)))
    (define (dispatch message)
      (cond ((eq? message 'num-calls) (number-calls))
            ((eq? message 'reset) (reset-count))
            (else (call-f message))))
    dispatch))
My question, however, is about how to write this in a "pythonic" way. My attempt below is obviously a direct translation of my Scheme code, and I realize that though it is fine for an impure functional language like Scheme, it's probably not the cleanest or best way to do it in Python. How does one solve a general problem like this in Python, where you want a higher-order procedure to dispatch on type and remember local state?
Below is my noobish attempt, which works (earlier I had said it did not, but the problem was that an earlier version of the program was still loaded in the terminal session). (In Python 2 it seems hard to create the nonlocal variable binding.)
def make_monitored(func):
    counter = 0

    def dispatch(message):
        if message == "num-calls":
            return num_calls()
        elif message == "reset":
            reset()
        else:
            nonlocal counter
            counter += 1
            return func(message)

    def num_calls():
        nonlocal counter
        return counter

    def reset():
        nonlocal counter
        counter = 0

    return dispatch
PS: This question is related to this same set of exercises in SICP but my question is really about Python best practice and not the concept of closures or Scheme...
I think writing a decorator wrapping the function in a class would be more pythonic:
from functools import wraps

def make_monitored(func):
    class wrapper:
        def __init__(self, f):
            self.func = f
            self.counter = 0

        def __call__(self, *args, **kwargs):
            self.counter += 1
            return self.func(*args, **kwargs)

    return wraps(func)(wrapper(func))
This has the advantage that it mimics the original function as closely as possible, and just adds a counter field to it:
In [25]: msqrt = make_monitored(math.sqrt)
In [26]: msqrt(2)
Out[26]: 1.4142135623730951
In [29]: msqrt.counter
Out[29]: 1
In [30]: msqrt(235)
Out[30]: 15.329709716755891
In [31]: msqrt.counter
Out[31]: 2
In [32]: @make_monitored
...: def f(a):
...: """Adding the answer"""
...: return a + 42
In [33]: f(0)
Out[33]: 42
In [34]: f(1)
Out[34]: 43
In [35]: f.counter
Out[35]: 2
In [36]: f.__name__
Out[36]: 'f'
In [37]: f.__doc__
Out[37]: 'Adding the answer'
For f, you also see the usage as a decorator, and how the wrapper keeps the original name and docstring (which would not be the case without functools.wraps).
Defining reset is left as an exercise to the reader, but quite trivial.
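For what it's worth, one way that reset might look is sketched below (based on the wrapper class above; the method name reset is my own choice, not part of the original answer):

from functools import wraps

def make_monitored(func):
    class wrapper:
        def __init__(self, f):
            self.func = f
            self.counter = 0

        def __call__(self, *args, **kwargs):
            self.counter += 1
            return self.func(*args, **kwargs)

        def reset(self):
            # Set the call count back to zero.
            self.counter = 0

    return wraps(func)(wrapper(func))

With that, msqrt.reset() would simply clear msqrt.counter.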
Related
I have a problem where I need to produce something which is naturally computed recursively, but where I also need to be able to interrogate the intermediate steps in the recursion if needed.
I know I can do this by passing and mutating a list or similar structure. However, this looks ugly to me and I'm sure there must be a neater way, e.g. using generators. What I would ideally love to be able to do is something like:
intermediate_results = [f(x) for x in range(T)]
final_result = intermediate_results[T-1]
in an efficient way. While my solution is not performance critical, I can't justify the massive amount of redundant effort in that first line. It looks to me like a generator would be perfect for this except for the fact that f is fundamentally much more suited to recursion in my case (which at least in my mind is the complete opposite of a generator, but maybe I'm just not thinking far enough outside of the box).
Is there a neat Pythonic way of doing something like this that I just don't know about, or do I need to capitulate and pollute my function f by passing it an intermediate_results list which I then mutate as a side effect?
I have a generic solution for you using a decorator. We create a Memoize class which stores the results of previous executions of the function (including recursive calls). If the given arguments have already been seen, the cached value is used to look up the result quickly.
The custom class has the benefit over lru_cache that you can inspect the stored results.
from functools import wraps

class Memoize:
    def __init__(self):
        self.store = {}

    def save(self, fun):
        @wraps(fun)
        def wrapper(*args):
            if args not in self.store:
                self.store[args] = fun(*args)
            return self.store[args]
        return wrapper
m = Memoize()

@m.save
def fibo(n):
    if n <= 0: return 0
    elif n == 1: return 1
    else: return fibo(n-1) + fibo(n-2)
Then after running different things you can see what the cache contains. When you run future function calls, m.store will be used as a lookup so calculation doesn't need to be redone.
>>> fibo(8)
21
>>> m.store
{(1,): 1,
(0,): 0,
(2,): 1,
(3,): 2,
(4,): 3,
(5,): 5,
(6,): 8,
(7,): 13,
(8,): 21}
You could modify the save function to use the name of the function and the args as the key, so that multiple function results can be stored in the same Memoize class.
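That modification might look roughly like this (a sketch I am adding for illustration, not part of the original answer):

from functools import wraps

class Memoize:
    def __init__(self):
        self.store = {}

    def save(self, fun):
        @wraps(fun)
        def wrapper(*args):
            # Prefix the key with the function name so several decorated
            # functions can share one Memoize instance without collisions.
            key = (fun.__name__,) + args
            if key not in self.store:
                self.store[key] = fun(*args)
            return self.store[key]
        return wrapper

With this, m.store would hold entries like ('fibo', 8): 21.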
You can keep your existing solution that makes many "redundant" calls to f, but employ function caching to save the results of previous calls to f.
In other words, when f(x1) is called, its input arguments and the corresponding return value are saved, and the next time it is called with the same arguments, the result is simply pulled from the cache.
See functools.lru_cache for the standard-library solution to this, e.g.:
from functools import lru_cache

@lru_cache(maxsize=None)
def f(x):
    ...  # your recursive function goes here, unchanged

intermediate_results = [f(x) for x in range(T)]
final_result = intermediate_results[T-1]
Note, however, that f must be a pure function (no side effects, and the same input always produces the same output) for this to work properly.
Having considered your comments, I'll now try to give another perspective on the problem.
So, let's consider a concrete example:
def f(x):
    a = 2
    return g(x) + a if x != 0 else 0

def g(x):
    b = 1
    return h(x) - b

def h(x):
    c = 1/2
    return f(x-1)*(1+c)
I
First of all, it should be mentioned that (in our particular case) the algorithm has the form f(x) = p(f(x - 1)) for some p. It follows that f(x) = p^x(f(0)) = p^x(0). That means we should just apply p to 0 x times to get the desired result, which can be done in an iterative process, so this can be written without recursion. I believe, though, that your real case is much harder, and it would be too boring and uninformative to stop here.
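For the concrete f, g, h above, one pass through h, g and f amounts to p(v) = 1.5*v + 1, so the iterative version might look like this (a sketch I am adding for illustration, not part of the original answer):

def f_iter(x):
    # Apply p(v) = 1.5*v + 1 to 0 exactly x times; for non-negative
    # integer x this matches the recursive f above (f(0) == 0).
    val = 0
    for _ in range(x):
        val = 1.5 * val + 1
    return val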
II
Generally speaking, we can divide all possible solutions into two groups: the ones that require refactoring (i.e. rewriting the functions f, g, h) and the ones that do not. I have little to offer from the latter group (and I don't think anyone does). Consider the following, however:
def fk(x, k):
    a = 2
    return k(gk(x, k) + a if x != 0 else 0)

def gk(x, k):
    b = 1
    return k(hk(x, k) - b)

def hk(x, k):
    c = 1/2
    return k(fk(x-1, k)*(1+c))

def printret(x):
    print(x)
    return x

fk(4, printret)  # see what happens
Inspired by continuation-passing style, but that's totally not it.
What's the point? It's something between your idea of passing a list to write down all the computations and memoizing. This k carries additional behavior with it, such as printing or writing to a list (you can make a function that writes to some list, why not? see the sketch after the pros and cons below). But if you look carefully you'll see that it leaves the inner code of these functions practically untouched (only the input to and output from each function are affected), so one can produce a decorator associated with a function like printret that does essentially the same thing for f, g, h.
Pros: no need to modify code, much more flexible than passing a list, no additional work (like in memoizing).
Cons: impure (printing or modifying something), not as flexible as we would like.
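For example, here is a hypothetical k that records the intermediate values in a list instead of printing them (my own sketch, reusing the fk/gk/hk definitions above; the names log and recorder are mine):

log = []

def recorder(x):
    # Write every intermediate value down, innermost call first.
    log.append(x)
    return x

fk(4, recorder)
print(log)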
III
Now let's see how modifying function bodies can help. Don't be afraid of what's written below, take your time and play with that thing a little.
class Logger:
    def __init__(self, lst, cur_val):
        self.lst = lst
        self.cur_val = cur_val

    def bind(self, f):
        res = f(self.cur_val)
        return Logger([self.cur_val] + res.lst + self.lst, res.cur_val)

    def __repr__(self):
        return "Logger( " + repr({'value': self.cur_val, 'lst': self.lst}) + " )"

def unit(x):
    return Logger([], x)

# you can also play with lala
def lala(x):
    if x <= 0:
        return unit(1)
    else:
        return lala(x - 1).bind(lambda y: unit(2*y))

def f(x):
    a = 2
    if x == 0:
        return unit(0)
    else:
        return g(x).bind(lambda y: unit(y + a))

def g(x):
    b = 1
    return h(x).bind(lambda y: unit(y - b))

def h(x):
    c = 1/2
    return f(x-1).bind(lambda y: unit(y*(1+c)))

f(4)  # see for yourself
Logger is what is called a monad. I'm not very familiar with this concept myself, but I believe I'm doing everything right. f, g, h are functions that take a number and return a Logger instance. Logger's bind takes a function (like f) and returns a Logger with the new value (computed by f) and updated 'logs'. The key point, as I see it, is the ability to do whatever we want with the collected values in the order the resulting value was calculated.
Afterword
I'm not at all some kind of 'guru' of functional programming, and I believe I'm missing a lot of things here. But what I've understood is that functional programming is about inverting the flow of the program. That's why, for instance, I totally agree with your opinion about generators being opposed to functional programming. When we use a generator gen in, say, a function func, we yield values one by one to func and func does something with them in, e.g., a loop. The functional approach would be to make gen a function taking func as a parameter and have func perform the computations on the 'yielded' values. It's as if gen and func exchanged places. So the flow is inverted! And there are plenty of other ways of inverting the flow. Monads are one of them.
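A minimal sketch of that inversion (my own illustration, not from the original answer):

# Conventional flow: the consumer pulls values out of a generator.
def gen():
    for i in range(3):
        yield i

def func(values):
    for v in values:
        print(v)

func(gen())

# Inverted flow: the producer drives and calls func on each value.
def gen_inverted(func):
    for i in range(3):
        func(i)

gen_inverted(print)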
itertools.islice takes an iterable, a start value and a stop value, and gives you the elements between start and stop as an iterator. If islice is not clear you can check the docs here https://docs.python.org/3/library/itertools.html
import itertools
intermediate_result = map(f, range(T))
final_result = next(itertools.islice(intermediate_result, T-1, T))
I need to get consecutive numbers while an input number doesn't change.
So I get give(5) -> 1, give(5) -> 2, and so on, but then give(6) -> 1 again, restarting the count.
So far I solved it with an iterator function count() and a function give(num) like this:
def count(start=1):
    n = start
    while True:
        yield n
        n += 1

def give(num):
    global last
    global a
    if num == last:
        ret = a.next()
    else:
        a = count()
        ret = a.next()
    last = num
    return ret
It works, but it's ugly: I have two globals and have to set them before I call give(num). I'd like to be able to call give(num) without previously setting the 'a=count()' and 'last=999' variables. I'm positive there's a better way to do this...
Edit: Thank you all for the incredibly fast and varied responses; I've got a lot to study here.
The obvious thing to do is to make give into an object rather than a function.* Any object can be made callable by defining a __call__ method.
While we're at it, your code can be simplified quite a bit, so let's do that.
class Giver(object):
    def __init__(self):
        self.last, self.a = object(), count()

    def __call__(self, num):
        if num != self.last:
            self.a = count(1)
            self.last = num
        return self.a.next()
give = Giver()
So:
>>> give(5)
1
>>> give(5)
2
>>> give(6)
1
>>> give(5)
1
This also lets you create multiple separate givers, each with its own, separate current state, if you have any need to do that.
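For instance (a quick illustration I am adding, not part of the original answer):

g1, g2 = Giver(), Giver()
g1(5)  # 1
g1(5)  # 2
g2(5)  # 1 -- g2 keeps its own, independent state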
If you want to expand it with more state, the state just goes into the instance variables. For example, you can replace last and a with a dictionary mapping previously-seen values to counters:
from collections import defaultdict

class Giver(object):
    def __init__(self):
        # count() here is the generator from the question (it starts at 1)
        self.counters = defaultdict(count)

    def __call__(self, num):
        return next(self.counters[num])
And now:
>>> give(5)
1
>>> give(5)
2
>>> give(6)
1
>>> give(5)
3
* I sort of skipped a step here. You can always remove globals by putting the variables and everything that uses them (which may just be one function) inside a function or other scope, so they end up as free variables in the function's closure. But in your case, I think this would just make your code look "uglier" (in the same sense you thought it was ugly). But remember that objects and closures are effectively equivalent in what they can do, but different in what they look like—so when one looks horribly ugly, try the other.
Just keep track of the last returned value for each input. You can do this with an ordinary dict:
_counter = {}

def give(n):
    _counter[n] = _counter.get(n, 0) + 1
    return _counter[n]
The standard library has a Counter class that makes things a bit easier:
import collections

_counter = collections.Counter()

def give(n):
    _counter[n] += 1
    return _counter[n]
collections.defaultdict(int) works too.
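A minimal sketch of that defaultdict variant, for completeness (my addition):

import collections

_counter = collections.defaultdict(int)

def give(n):
    _counter[n] += 1
    return _counter[n]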
You can achieve this with something like this:
def count(start=1):
    n = start
    while True:
        yield n
        n += 1

def give(num):
    if num not in give.memo:
        give.memo[num] = count()
    return next(give.memo[num])

give.memo = {}
Which produces:
>>> give(5)
1
>>> give(5)
2
>>> give(5)
3
>>> give(6)
1
>>> give(5)
4
>>>
The two key points are using a dict to keep track of multiple iterators simultaneously, and setting a variable on the function itself. You can do this because functions are themselves objects in Python. This is the equivalent of a static local variable in C.
You can basically get what you want via combination of defaultdict and itertools.count:
from collections import defaultdict
from itertools import count
_counters = defaultdict(count)
next(_counters[5])
Out[116]: 0
next(_counters[5])
Out[117]: 1
next(_counters[5])
Out[118]: 2
next(_counters[5])
Out[119]: 3
next(_counters[6])
Out[120]: 0
next(_counters[6])
Out[121]: 1
next(_counters[6])
Out[122]: 2
If you need the counter to start at one, you can get that via functools.partial:
from functools import partial
_counters = defaultdict(partial(count,1))
next(_counters[5])
Out[125]: 1
next(_counters[5])
Out[126]: 2
next(_counters[5])
Out[127]: 3
next(_counters[6])
Out[128]: 1
Adding a second answer because this is rather radically different from my first.
What you are basically trying to accomplish is a coroutine: a generator that preserves state and into which values can be sent at arbitrary times. PEP 342 gives us a way to do that with the "yield expression". I'll jump right into how it looks:
from collections import defaultdict
from itertools import count
from functools import partial
def gen(x):
    _counters = defaultdict(partial(count, 1))
    while True:
        out = next(_counters[x])
        sent = yield out
        if sent:
            x = sent
If the _counters line is confusing, see my other answer.
With a coroutine, you can send data into the generator. So you can do something like the following:
g = gen(5)
next(g)
Out[159]: 1
next(g)
Out[160]: 2
g.send(6)
Out[161]: 1
next(g)
Out[162]: 2
next(g)
Out[163]: 3
next(g)
Out[164]: 4
g.send(5)
Out[165]: 3
Notice how the generator preserves state and can switch between counters at will.
In my first answer, I suggested that one solution was to transform the closure into an object. But I skipped a step—you're using global variables, not a closure, and that's part of what you didn't like about it.
Here's a simple way to transform any global state into encapsulated state:
def make_give():
    last, a = None, None
    def give(num):
        nonlocal last
        nonlocal a
        if num != last:
            a = count()
            last = num
        return next(a)
    return give
give = make_give()
Or, adapting my final version of Giver:
def make_giver():
    counters = defaultdict(count)
    def give(num):
        return next(counters[num])
    return give
If you're curious how this works:
>>> give.__closure__
(<cell at 0x10f0e2398: NoneType object at 0x10b40fc50>, <cell at 0x10f0e23d0: NoneType object at 0x10b40fc50>)
>>> give.__code__.co_freevars
('a', 'last')
Those cell objects are essentially references into the stack frame of the make_give call that created the give function.
This doesn't work quite as well in Python 2.x as in 3.x. While closure cells work the same way, a variable you assign to inside the function body automatically becomes local unless there's a global or nonlocal statement, and Python 2 has no nonlocal statement. So the second version works fine, but for the first version you'd have to do something like state = {'a': None, 'last': None} and then write state['a'] = count() instead of a = count().
This trick—creating a closure just to hide local variables—is very common in a few other languages, like JavaScript. In Python (partly because of the long history without the nonlocal statement, and partly because Python has alternatives that other languages don't), it's less common. It's usually more idiomatic to stash the state in a mutable default parameter value, or an attribute on the function—or, if there's a reasonable class to make the function a method of, as an attribute on the class instances. There are plenty of cases where a closure is pythonic, this just isn't usually one of them.
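As a concrete sketch of the mutable-default-parameter idiom mentioned above (my own illustration, not from the original answer):

from itertools import count

def give(num, _state={'last': None, 'it': None}):
    # The default dict is created once and persists across calls.
    if num != _state['last']:
        _state['it'] = count(1)
        _state['last'] = num
    return next(_state['it'])

give(5)  # 1
give(5)  # 2
give(6)  # 1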
Python has an elegant way of automatically generating a counter variable in for loops: the enumerate function. This saves the need to initialize and increment a counter variable. Counter variables are also ugly because they are often useless once the loop is finished, yet their scope is not the scope of the loop, so they occupy the namespace without need (although I am not sure whether enumerate actually solves this).
My question is, whether there is a similar pythonic solution for while loops. enumerate won't work for while loops since enumerate returns an iterator. Ideally, the solution should be "pythonic" and not require function definitions.
For example:
x = 0
c = 0
while x < 10:
    x = int(raw_input())
    print x, c
    c += 1
In this case we would want to avoid initializing and incrementing c.
Clarification:
This can be done with an endless for loop with manual termination as some have suggested, but I am looking for a solution that makes the code clearer, and I don't think that solution makes the code clearer in this case.
Improvement (in readability, I'd say) to Ignacio's answer:
import itertools

x = 0
for c in itertools.takewhile(lambda c: x < 10, itertools.count()):
    x = int(raw_input())
    print x, c
Advantages:
Only the while loop condition is in the loop header, not the side-effect raw_input.
The loop condition can depend on any condition that a normal while loop could. It's not necessary to "import" the variables referenced into the takewhile, as they are already visible in the lambda scope. Additionally it can depend on the count if you want, though not in this case.
Simplified: enumerate no longer appears at all.
Again with the itertools...
import itertools

for c, x in enumerate(
        itertools.takewhile(lambda v: v < 10,
                            (int(raw_input()) for z in itertools.count())
        )):
    print c, x
If you want zero initialization before the while loop, you can use a Singleton with a counter:
class Singleton(object):
    _instance = None
    def __new__(cls, *args, **kwargs):
        if not cls._instance:
            cls._instance = super(Singleton, cls).__new__(
                cls, *args, **kwargs)
            cls.count = 0
        else:
            cls.count += 1
        return cls._instance
Then there will only be one instance of Singleton and each additional instance just adds one:
>>> Singleton().count # initial instance
0
>>> Singleton().count
1
>>> Singleton().count
2
>>> Singleton().count
3
Then your while loop becomes:
while Singleton():
    x = int(raw_input('x: '))
    if x > 10: break

print 'While loop executed', Singleton().count, 'times'
Entering 1,2,3,11 it prints:
x: 1
x: 2
x: 3
x: 11
While loop executed 4 times
If you do not mind a single line of initialization before the while loop, you can just subclass an iterator:
import collections

class WhileEnum(collections.Iterator):
    def __init__(self, stop=None):
        self.stop = stop
        self.count = 0

    def next(self):  # '__next__' on Py 3, 'next' on Py 2
        if self.stop is not None:
            self.remaining = self.stop - self.count
            if self.count >= self.stop: return False
        self.count += 1
        return True

    def __call__(self):
        return self.next()
Then your while loop becomes:
enu = WhileEnum()
while enu():
    i = int(raw_input('x: '))
    if i > 10: break

print enu.count
I think the second is by far the better approach. You can have multiple enumerators, and you can also set a limit on how many iterations to allow:
limited_enum=WhileEnum(5)
I don't think it's possible to do what you want in the exact way you want it. If I understand right, you want a while loop that increments a counter each time through, without actually exposing a visible counter outside the scope of the loop. I think the way to do this would be to rewrite your while loop as a nonterminating for loop, and check the end condition manually. For your example code:
import itertools

x = 0
for c in itertools.count():
    x = int(raw_input())
    print x, c
    if x >= 10:
        break
The problem is that fundamentally you're doing iteration, with the counter. If you don't want to expose that counter, it needs to come from the loop construct. Without defining a new function, you're stuck with a standard loop and an explicit check.
On the other hand, you could probably also define a generator for this. You'd still be iterating, but you could at least wrap the check up in the loop construct.
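Something along these lines, for instance (a hypothetical sketch in the question's Python 2 style; the name read_until is mine):

import itertools

def read_until(limit):
    # Yield (counter, value) pairs; stop after a value >= limit is seen.
    for c in itertools.count():
        x = int(raw_input())
        yield c, x
        if x >= limit:
            return

for c, x in read_until(10):
    print x, c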
I thought I would improve performance by replacing this code:
import math

def f(a, b):
    return math.sqrt(a) * b

result = []
a = 100
for b in range(1000000):
    result.append(f(a, b))
with:
def g(a):
    def f(b):
        return math.sqrt(a) * b
    return f

result = []
a = 100
func = g(a)
for b in range(1000000):
    result.append(func(b))
I assumed that since a is fixed when the closure is performed, the interpreter would precompute everything that involves a, and so math.sqrt(a) would be repeated just once instead of 1000000 times.
Is my understanding always correct, or always incorrect, or correct/incorrect depending on the implementation?
I noticed that the code object for func is built (at least in CPython) before runtime and is immutable. The code object then seems to use the global environment to achieve the closure. This seems to suggest that the optimization I hoped for does not happen.
I assumed that since a is fixed when the closure is performed, the interpreter would precompute everything that involves a, and so math.sqrt(a) would be repeated just once instead of 1000000 times.
That assumption is wrong; I don't know where it came from. A closure just captures variable bindings; in your case it captures the value of a, but that doesn't mean any more magic is going on: the expression math.sqrt(a) is still evaluated every time f is called.
After all, it has to be computed every time because the interpreter doesn't know that sqrt is "pure" (the return value is only dependent on the argument and no side-effects are performed). Optimizations like the ones you expect are practical in functional languages (referential transparency and static typing help a lot here), but would be very hard to implement in Python, which is an imperative and dynamically typed language.
That said, if you want to precompute the value of math.sqrt(a), you need to do that explicitly:
def g(a):
    s = math.sqrt(a)
    def f(b):
        return s * b
    return f
Or using lambda:
def g(a):
    s = math.sqrt(a)
    return lambda b: s * b
Now that g really returns a function with 1 parameter, you have to call the result with only one argument.
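For example, with the g above (a quick usage sketch of my own):

func = g(100)
func(3)   # 30.0, i.e. math.sqrt(100) * 3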
The code is not evaluated statically; the code inside the function is still calculated each time. The function object contains all the byte code which expresses the code in the function; it doesn't evaluate any of it. You could improve matters by calculating the expensive value once:
def g(a):
    root_a = math.sqrt(a)
    def f(b):
        return root_a * b
    return f

result = []
a = 100
func = g(a)
for b in range(1000000):
    result.append(func(b))
Naturally, in this trivial example, you could improve performance much more:
a = 100
root_a = math.sqrt(a)
result = [root_a * b for b in range(1000000)]
But I presume you're working with a more complex example than that where that doesn't scale?
As usual, the timeit module is your friend. Try some things and see how it goes. If you don't care about writing ugly code, this might help a little as well:
def g(a):
    def f(b, _local_func=math.sqrt):
        return _local_func(a) * b
    return f
Apparently Python takes a performance penalty whenever it tries to access a "global" variable/function. If you can make that access local, you can shave off a little time.
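As a rough sketch of how one might compare the two approaches with timeit (the helper names g_global and g_local are mine, and the exact numbers will vary by machine and Python version):

import math
import timeit

def g_global(a):
    def f(b):
        return math.sqrt(a) * b
    return f

def g_local(a):
    def f(b, _sqrt=math.sqrt):
        return _sqrt(a) * b
    return f

# Bind each closure once, then time just the calls.
print(timeit.timeit("f(7)", setup="from __main__ import g_global; f = g_global(100)", number=1000000))
print(timeit.timeit("f(7)", setup="from __main__ import g_local; f = g_local(100)", number=1000000))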
I have a method within which I need to pass an ever-increasing integer to another function.
I can do this like so:
def foo(i):
    print i

def bar():
    class Incrementer(object):
        def __init__(self, start=0):
            self.i = start
        def __get__(self):
            j = self.i
            self.i += 1
            return j

    number = Incrementer()
    foo(number)
    foo(number)
    foo(number)
which correctly outputs 0 1 2 ... but I feel like I'm overlooking a much easier (or built-in) way of doing this?
Try itertools.count() -- it does exactly what you need:
>>> c = itertools.count()
>>> next(c)
0
>>> next(c)
1
>>> next(c)
2
In general, if you need to retain state between one call to a function and the next, what you want is either an object (your solution) or a generator. In some cases one will be simpler than the other, but there's nothing wrong with how you've done it, in principle (though you seem to have some issues with the implementation).
Sven's suggestion, itertools.count(), is a generator. Its implementation is something like this:
def count():
    i = 0
    while True:
        yield i
        i += 1
Now, if you wanted it to be callable like a function, rather than having to do next(c), you could define a wrapper that made it so:
def count(c=itertools.count()):
    return next(c)
Or the inevitable one-line lambda:
count = lambda c=itertools.count(): next(c)
Then count() returns the next integer each time you call it.
Of course, if you want to be able to create any number of callable functions, each with their own counter, you can write a factory for that:
def counter():
    return lambda c=itertools.count(): next(c)
Then it's:
c = counter()
print c() # 0
print c() # 1
# etc
This still seems simpler to me than an object, but not by much. If your state or logic were any more complex, the encapsulation of the object might win out.
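If you did go the object route, a minimal version might look like this (my own sketch, using __call__ rather than the __get__ from the question):

class Counter(object):
    def __init__(self, start=0):
        self.i = start

    def __call__(self):
        # Return the current value, then advance.
        j = self.i
        self.i += 1
        return j

c = Counter()
print c()  # 0
print c()  # 1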