Hey. This example is pretty specific but I think it could apply to a broad range of functions.
It's taken from some online programming contest.
There is a game with a simple winning condition. Draw is not possible. Game cannot go on forever because every move takes you closer to the terminating condition. The function should, given a state, determine if the player who is to move now has a winning strategy.
In the example, the state is an integer. A player chooses a non-zero digit and subtracts it from the number: the new state is the new integer. The winner is the player who reaches zero.
I coded this:
from Memoize import Memoize

@Memoize
def Game(x):
    if x == 0: return True
    for digit in str(x):
        if digit != '0' and not Game(x-int(digit)):
            return True
    return False
I think it's clear how it works. I also realize that for this specific game there's probably a much smarter solution, but my question is general. However, this makes Python go crazy even for relatively small inputs. Is there any way to make this code work with a loop?
Thanks
This is what I mean by translating into a loop:
def fac(x):
    if x <= 1: return x
    else: return x*fac(x-1)

def fac_loop(x):
    result = 1
    for i in xrange(1,x+1):
        result *= i
    return result

## dont try: fac(10000)
print fac_loop(10000) % 100 ## works
In general, a recursive function converts directly into a simple loop only when it calls itself at most once in its body (linear recursion, as in the factorial example). Your function calls itself multiple times. Such a function really needs a stack. It is possible to make the stack explicit, e.g. with lists. One reformulation of your algorithm using an explicit stack is
def Game(x):
    # x, str(x), position
    stack = [(x,str(x),0)]
    # return value
    res = None

    while stack:
        if res is not None:
            # we have a return value
            if not res:
                stack.pop()
                res = True
                continue
            # res is True, continue to search
            res = None
        x, s, pos = stack.pop()
        if x == 0:
            res = True
            continue
        if pos == len(s):
            # end of loop, return False
            res = False
            continue
        stack.append((x,s,pos+1))
        digit = s[pos]
        if digit == '0':
            continue
        x -= int(digit)
        # recurse, starting with position 0
        stack.append((x,str(x),0))

    return res
Basically, you need to make each local variable an element of a stack frame; the local variables here are x, str(x), and the iteration counter of the loop. Doing return values is a bit tricky - I chose to set res to not-None if a function has just returned.
By "go crazy" I assume you mean:
>>> Game(10000)
# stuff skipped
RuntimeError: maximum recursion depth exceeded in cmp
You could start at the bottom instead -- a crude change would be:
# after defining Game()
for i in range(10000):
    Game(i)

# Now this will work:
print Game(10000)
This is because, if you start with a high number, you have to recurse a long way before you reach the bottom (0), so your memoization decorator doesn't help the way it should.
By starting from the bottom, you ensure that every recursive call hits the dictionary of results immediately. You probably use extra space, but you don't recurse far.
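The same bottom-up idea can also be written as a plain loop that fills a table, with no recursion and no decorator. This is only a sketch in Python 3: the name game_table is made up for illustration, and it keeps the convention of the question's code that state 0 evaluates to True.

```python
def game_table(limit):
    """Return a list whose entry x equals the question's Game(x), for 0 <= x <= limit."""
    # Convention taken from the question's code: Game(0) is True.
    wins = [True] * (limit + 1)
    for x in range(1, limit + 1):
        # Every entry below x is already final, so a lookup replaces the recursive call.
        wins[x] = any(not wins[x - int(d)] for d in str(x) if d != '0')
    return wins


table = game_table(10000)
print(table[10000])
```

The table costs one boolean per state up to the limit, which is the extra space mentioned above, but the call depth never grows.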
You can turn any recursive function into an iterative function by using a loop and a stack -- essentially running the call stack by hand. See this question or this question, for example, for some discussion. There may be a more elegant loop-based solution here, but it doesn't leap out to me.
Well, recursion is mostly about being able to execute some code without losing previous contexts and their order. In particular, function frames are pushed onto the call stack during recursion, which constrains the recursion depth because the stack size is limited. You can 'increase' your recursion depth by managing the required information for each recursive call yourself, in a state stack kept in heap memory. Usually, far more heap memory is available than stack memory. Think of good quicksort implementations: they eliminate the recursion into the larger side by creating an outer loop with ever-changing state variables (the lower/upper array boundaries and the pivot, in the quicksort example).
While I was typing this, Martin v. Löwis posted a good answer about converting recursive functions into loops.
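The quicksort remark can be sketched roughly as follows. This is a minimal illustration, not a tuned implementation: the names quicksort and partition are chosen here, and the Lomuto partition scheme is just one possible choice. The function recurses only into the smaller partition and lets the while loop take over the larger one, so the call depth stays logarithmic in the input size.

```python
def partition(a, lo, hi):
    """Lomuto partition: put a[hi] at its final index and return that index."""
    pivot = a[hi]
    i = lo
    for j in range(lo, hi):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]
    return i


def quicksort(a, lo=0, hi=None):
    """Sort the list a in place between indices lo and hi (inclusive)."""
    if hi is None:
        hi = len(a) - 1
    while lo < hi:
        p = partition(a, lo, hi)
        if p - lo < hi - p:
            quicksort(a, lo, p - 1)   # recurse into the smaller left side
            lo = p + 1                # loop on the larger right side
        else:
            quicksort(a, p + 1, hi)   # recurse into the smaller right side
            hi = p - 1                # loop on the larger left side


data = [5, 2, 9, 1, 5, 6]
quicksort(data)
print(data)  # [1, 2, 5, 5, 6, 9]
```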
You could modify your recursive version a bit:
def Game(x):
    if x == 0: return True
    s = set(digit for digit in str(x) if digit != '0')
    return any(not Game(x-int(digit)) for digit in s)
This way, you don't examine digits multiple times. For example, if you are doing 111, you don't have to look at 110 three times.
I'm not sure if this counts as an iterative version of the original algorithm you presented, but here is a memoized iterative version:
import Queue

def Game2(x):
    memo = {}
    memo[0] = True
    calc_dep = {}
    must_calc = Queue.Queue()
    must_calc.put(x)
    while not must_calc.empty():
        n = must_calc.get()
        if n and n not in calc_dep:
            s = set(int(c) for c in str(n) if c != '0')
            elems = [n - digit for digit in s]
            calc_dep[n] = elems
            for new_elem in elems:
                if new_elem not in calc_dep:
                    must_calc.put(new_elem)
    for k in sorted(calc_dep.keys()):
        v = calc_dep[k]
        #print k, v
        memo[k] = any(not memo[i] for i in v)
    return memo[x]
It first calculates the set of numbers that x, the input, depends on. Then it calculates those numbers, starting at the bottom and going towards x.
The code is so fast because of the test for calc_dep, which keeps the dependencies of any number from being calculated more than once. As a result, it can do Game2(10000) in under 400 milliseconds whereas the original takes -- I don't know how long. A long time.
Here are performance measurements:
Elapsed: 1000 0:00:00.029000
Elapsed: 2000 0:00:00.044000
Elapsed: 4000 0:00:00.086000
Elapsed: 8000 0:00:00.197000
Elapsed: 16000 0:00:00.461000
Elapsed: 32000 0:00:00.969000
Elapsed: 64000 0:00:01.774000
Elapsed: 128000 0:00:03.708000
Elapsed: 256000 0:00:07.951000
Elapsed: 512000 0:00:19.148000
Elapsed: 1024000 0:00:34.960000
Elapsed: 2048000 0:01:17.960000
Elapsed: 4096000 0:02:55.013000
It's reasonably zippy.
Here's the question: https://leetcode.com/problems/coin-change/
I'm having some trouble understanding two different methods of dynamic programming used to solve this problem. I'm currently going through the Grokking Dynamic Programming course from educative.io, and their approach is to use subsets to search for each combination. They go about testing if a coin is viable, if so, then try it in the DFS. If not, skip the coin and go to the next index and try the next coin.
Here's Grokking's approach with memoization:
def coinChange(self, coins: List[int], amount: int) -> int:
    def dfs(i, total, memo):
        key = (i, total)
        if key in memo:
            return memo[key]
        if total == 0:
            return 0
        if len(coins) == 0 or i >= len(coins):
            return inf
        count = inf
        if coins[i] <= total:
            res = dfs(i, total - coins[i], memo)
            if res != inf:
                count = res + 1
        memo[key] = min(count, dfs(i + 1, total, memo))
        return memo[key]
    return dfs(0, amount, {}) if dfs(0, amount, {}) != inf else -1
It doesn't do very well on Leetcode; it runs very slowly (but passes, nonetheless). The efficient algorithm that was in the discussions was this:
def coinChange(self, coins: List[int], amount: int) -> int:
    @lru_cache(None)
    def dp(sum):
        if sum == 0: return 0
        if sum < 0: return float("inf")
        count = float('inf')
        for coin in coins:
            count = min(count, dp(sum - coin))
        return count + 1
    return dp(amount) if dp(amount) != float("inf") else -1
Does this second code have the same logic as "testing the subsets of coins?" What's the difference between the two? Is the for-loop a way of testing the different subsets, like with backtracking?
I tested the second algorithm with memoization in a dictionary, like the first, using sum as the key, and it tanked in efficiency. But then I tried using @lru_cache with the first algorithm, and it didn't help.
Could anyone explain why the second algorithm is so much faster? Is it my memoization that sucks?
Does this second code have the same logic as "testing the subsets of coins?"
If with subset you mean the subset of the coins that is still available for selection, then: no. The second algorithm does not reduce the problem in terms of coins; it reasons that at any time any coin can be selected, irrespective of previous selections. Although this may seem inefficient as it tries to take the same combinations in all possible permutations, this downside is minimised by the effect of memoization.
What's the difference between the two?
The first one takes coins in the order they are given, never going back to take an earlier coin once it has decided to go to the next one. So doing, it tries to reduce the problem in terms of available coins. The second one doesn't care about the order and looks at any permutation, it only reduces the problem in terms of amount.
This first one has a larger memoization collection because the index is part of the key, whereas the second uses a memoization collection that is only keyed by the amount.
The first one makes a recursive call even when no coin is selected (the one at the end of the inner function), since that fits in the logic of reducing the problem to fewer coins. The second one only makes a recursive call when the amount is further reduced.
Is the for-loop a way of testing the different subsets, like with backtracking?
If with subset you mean that the problem is reduced to fewer coins, then no: the second algorithm doesn't attempt to apply that methodology.
The for loop is just a way to consider every coin. It doesn't reduce the problem size in terms of available coins, only in terms of remaining amount.
Could anyone explain why the second algorithm is so much faster?
It is faster because the memoization key is smaller, leading to more hits, leading to fewer recursive calls. You can experiment with this and add global counters that count the number of executions of both inner functions (dfs and dp) and you'll see a dramatic difference there.
Is it my memoization that sucks?
You could say that, but it is too harsh.
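The counter experiment suggested above could look roughly like this. It is a sketch under a few assumptions: the wrapper count_executions and the calls dictionary are names invented here, both inner functions are lifted out of the LeetCode class, and a call counts only when the function body actually runs past its cache lookup.

```python
from functools import lru_cache
from math import inf


def count_executions(coins, amount):
    """Run both approaches and count how often each function body executes."""
    calls = {"dfs": 0, "dp": 0}
    memo = {}

    def dfs(i, total):
        key = (i, total)
        if key in memo:
            return memo[key]
        calls["dfs"] += 1  # counted only when the memo lookup missed
        if total == 0:
            return 0
        if i >= len(coins):
            return inf
        count = inf
        if coins[i] <= total:
            res = dfs(i, total - coins[i])
            if res != inf:
                count = res + 1
        memo[key] = min(count, dfs(i + 1, total))
        return memo[key]

    @lru_cache(maxsize=None)
    def dp(total):
        calls["dp"] += 1  # lru_cache runs the body only on a cache miss
        if total == 0:
            return 0
        if total < 0:
            return inf
        return min((dp(total - coin) for coin in coins), default=inf) + 1

    return dfs(0, amount), dp(amount), calls


first, second, calls = count_executions([1, 2, 5], 100)
print(first, second, calls)
```

On this input both approaches return the same coin count, and the dfs counter should come out noticeably larger than the dp counter, because dfs visits (index, amount) pairs while dp only visits amounts.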
I am studying Python with the book "a beginner guide to python 3" written by Mr. John Hunt. In chapter 8, which is about recursion, there is an exercise that asks for code that tests whether a number is prime by recursion. I wrote the first code below independently, but the answer key is structured differently. Because I am very unsure about recursion, what is your analysis of these two? Which is more recursive?
My code:
def is_prime(n, holder = 1):
    if n == 2:
        return True
    else:
        if (n-1 + holder)%(n-1) == 0:
            return False
        else:
            return is_prime(n-1, holder+1)

print('is_prime(9):', is_prime(9))
print('is_prime(31):', is_prime(31))
Answer key:
def is_prime(n, i=2):
    # Base cases
    if n <= 2:
        return True if (n == 2) else False
    if n % i == 0:
        return False
    if i * i > n:
        return True
    # Check for next divisor
    return is_prime(n, i + 1)

print('is_prime(9):', is_prime(9))
print('is_prime(31):', is_prime(31))
My suggestion in this case would be not to use recursion at all. Whilst I understand that you want to use this as a learning example of how to use recursion, it is also important to learn when to use recursion.
Recursion has a maximum allowed depth, because the deeper the recursion, the more items need to be put on the call stack. As such, this is not a good example to use recursion for, because it is easy to reach the maximum in this case. Even the "model" example code suffers from this. The exact maximum recursion depth may be implementation-dependent, but for example, if I try to use it to compute is_prime(1046527) then I get an error:
RecursionError: maximum recursion depth exceeded while calling a Python object
and inserting a print(i) statement shows that it is encountered when i=998.
A simple non-recursive equivalent of the "model" example will not have this problem. (There are more efficient solutions, but this one is trying to stay close to the model solution apart from not using recursion.)
def is_prime(n):
    if n == 2:
        return True
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True
(In practice you would probably also want to handle n<2 cases.)
If you want a better example of a problem to practise recursive programming, check out the Tower of Hanoi problem. In this case, you will find that using recursion allows you to make a simpler and cleaner solution than is possible without it, while being unlikely to involve exceeding the maximum recursion depth (you are unlikely to need to consider a tower 1000 disks high, because the solution would require a vast number of moves, 2^1000-1 or about 10^301).
As another good example of where recursion can be usefully employed, try using turtle graphics to draw a Koch snowflake.
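For reference, a recursive Tower of Hanoi solution can be sketched in a few lines. The function name hanoi, the peg labels, and the choice to collect the moves in a list are all illustrative assumptions, not something taken from the book.

```python
def hanoi(n, source, target, spare, moves=None):
    """Return the list of (from_peg, to_peg) moves that shifts n disks from source to target."""
    if moves is None:
        moves = []
    if n > 0:
        # Move the top n-1 disks out of the way, onto the spare peg.
        hanoi(n - 1, source, spare, target, moves)
        # Move the largest remaining disk to its destination.
        moves.append((source, target))
        # Move the n-1 disks from the spare peg onto the largest disk.
        hanoi(n - 1, spare, target, source, moves)
    return moves


moves = hanoi(3, 'A', 'C', 'B')
print(len(moves))  # 7
print(moves[0])    # ('A', 'C')
```

The recursion depth here is only n, while the number of moves is 2^n - 1, which is why the depth limit is not the practical constraint for this problem.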
I'd say the Answer Key needs improvement. We can make it faster and handle the base cases more cleanly:
def is_prime(n, i=3):
    # Base cases
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    if i * i > n:
        return True
    if n % i == 0:
        return False
    # Check for next divisor
    return is_prime(n, i + 2)
The original answer key starts at 2 and counts up by 1 -- here we start at 3 and count up by 2.
As far as your answer goes, there's a different flaw to consider. Python's default stack depth is 1,000 frames, and your function fails shortly above input of 1,000. The solution above uses recursion more sparingly and can handle input of up to nearly 4,000,000 before hitting up against Python's default stack limit.
Yes, your example seems to work correctly. Note, however, that by the nature of the implementation, the answer key is more efficient. To verify that a number n is prime, your algorithm uses up to n-1 function calls, while the provided answer stops once the divisor being tried exceeds sqrt(n). Checking higher numbers generally makes no sense, since if n is divisible without remainder by a value a > sqrt(n), it also has to be divisible by b = n / a, which is smaller than sqrt(n).
Furthermore, your code raises an exception when evaluated at n = 1, since the modulo by 0 is not defined.
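The pairing argument can be checked directly with a small helper. This is a sketch only: divisor_pairs is a name made up here, and math.isqrt assumes Python 3.8 or newer.

```python
from math import isqrt


def divisor_pairs(n):
    """List the pairs (a, b) with a * b == n and a <= b."""
    pairs = []
    # Trying a only up to the integer square root of n finds every pair,
    # because the partner b = n // a covers the divisors above sqrt(n).
    for a in range(1, isqrt(n) + 1):
        if n % a == 0:
            pairs.append((a, n // a))
    return pairs


print(divisor_pairs(36))  # [(1, 36), (2, 18), (3, 12), (4, 9), (6, 6)]
print(divisor_pairs(31))  # [(1, 31)]
```

A number greater than 1 is prime exactly when the only pair found is (1, n), so no candidate above sqrt(n) ever needs to be tried.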
I am trying to write a function which calculates multiple iterated hashes of a specific value (and outputs each iteration along the way).
However, I can't get my head around how to apply, for instance, the md5 hash function to its own output multiple times. For instance:
a = hashlib.md5('fun').hexdigest()
b = hashlib.md5(a).hexdigest()
c = hashlib.md5(b).hexdigest()
d = hashlib.md5(c).hexdigest()
.......
I think the recursion is the solution, but I just can't seem to implement it properly. This is the general factorial recursion example, but how do I adapt it to hashes:
def factorial(n):
    if n == 0:
        return 1
    else:
        return n * factorial(n - 1)
This is a classic application of generators. CPython caps the recursion depth (the default limit is 1000 frames), and its function calls are comparatively expensive. For anything which might be executed anywhere near that many times, iteration will often be faster. Using a generator allows you to break after any desired number of executions and allows flat usage of the desired logic in your code. The following example prints the output of 10 such executions.
import hashlib
from itertools import islice

def hashes(n):
    while True:
        # On Python 3, md5 needs bytes: hashlib.md5(n.encode()).hexdigest()
        n = hashlib.md5(n).hexdigest()
        yield n

for h in islice(hashes('fun'), 10):
    print(h)
In general, you are looking for a loop like
while True:
    x = f(x)
where you repeatedly replace the input with the result of the most recent application.
For your specific example,
def iterated_hash(x):
    while True:
        x = hashlib.md5(x).hexdigest()
    return x
However, since you don't really want to do this an infinite number of times, you need to supply a count:
def iterated_hash(x, n):
    while True:
        if n == 0:
            return x
        x = hashlib.md5(x).hexdigest()
        n -= 1  # count down, otherwise the loop never ends
or with a for loop,
def iterated_hash(x, n):
    for _ in range(n):
        x = hashlib.md5(x).hexdigest()
    return x
(Practically speaking, you want to use the for loop, but it's nice to see how the for loop is just a finite special case of the more general infinite loop.)
Just iterate as many times as needed:
def make_hash(text, iterations):
    a = hashlib.md5(text).hexdigest()
    for _ in range(iterations):
        a = hashlib.md5(a).hexdigest()
    return a

a = make_hash('fun', 5) # 5 iterations
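The snippets in this thread pass str objects straight to hashlib.md5, which works on Python 2. On Python 3 the input has to be bytes, so a version that also keeps every intermediate digest might look like the sketch below. The name iterated_md5 and the UTF-8 encoding are assumptions made for illustration.

```python
import hashlib


def iterated_md5(text, n):
    """Hash text n times, feeding each hex digest back in; return all n digests."""
    digests = []
    for _ in range(n):
        # Python 3: encode the str to bytes before hashing.
        text = hashlib.md5(text.encode("utf-8")).hexdigest()
        digests.append(text)
    return digests


for digest in iterated_md5('fun', 3):
    print(digest)
```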
My code is as follows.
I tried coding out for each case first, so given n = 4, my code looks like this:
a = overlay_frac(0,blank_bb,scale(1/4,rune))
b = overlay_frac(1/4,blank_bb,scale(1/2,rune))
c = overlay_frac(1/2,blank_bb,scale(3/4,rune))
d = overlay_frac(3/4,blank_bb,scale(1,rune))
show (overlay(a,(overlay(b,(overlay(c,d))))))
My understanding is that the recursion pattern is:
a = overlay_frac((1/n)-(1/n),blank_bb,scale(1/n,rune))
b = overlay_frac((2/n)-(1/n),blank_bb,scale(2/n,rune))
c = overlay_frac((3/n)-(1/n),blank_bb,scale(3/n,rune))
d = overlay_frac((4/n)-(1/n),blank_bb,scale(4/n,rune))
Hence, the recursion pattern that I came up with is:
def tree(n,rune):
    if n==1:
        return rune
    else:
        for i in range(n+1):
            return overlay(overlay_frac(1-(1/n),blank_bb,scale(i/n,rune)),tree(n-1,rune))
When I hardcode this, everything turns out just fine, but I suspect I'm not doing the recursion properly. Where have I gone wrong?
You are in fact trying to do an iteration within a recursive call. Instead of using a loop, you can use an inner function to keep track of your state. The coefficient you defined changes with both n and i, but for a given n it changes with i only. The state you need to carry in the inner function is therefore i, which is the same as looping through i.
You can still achieve your goal by doing so
def f(i, n):
    return overlay_frac((i/n)-(1/n),blank_bb,scale(i/n,rune))

# for each iteration, you check if i is equal to n
# if yes, return the result (base case)
# otherwise, you overlay the next coefficient, f(n-i, n), on the previous result
# you start with i = 1 and increase by one every iteration until i reaches n (base case)
# notice how similar this recursive call looks to a loop
# the only difference is that the state is updated within the function call itself
# therefore you will not have the problem of an early return
def recursion(n):
    def iteration(i, out):
        if i == n:
            return out
        else:
            return iteration(i+1, overlay(f(n-i, n), out))
    return iteration(1, f(n, n))
Here, n is the number of layers. When n = 1, no overlay is applied and the result is just the last coefficient f(n, n). When n = 2, overlay is applied once, on the coefficient with i = n - 1 and the coefficient with i = n.
This way avoids the earlier return inside your loop.
In fact you can omit the inner function by adding an additional argument to your outer function. Then you need to assign a default initial value to i. The inner function is not really necessary here. The key is to use the function argument to keep track of the state (variable i in this case).
def f(i, n):
    return overlay_frac((i/n)-(1/n),blank_bb,scale(i/n,rune))

def recursion(n, i=1):
    if i == n:
        return f(n, n)
    else:
        return overlay(f(i, n), recursion(n, i+1))
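To see which nesting such a recursion produces without the runes library, the drawing functions can be replaced by stand-ins that only record their arguments. Everything below is hypothetical scaffolding: layer stands in for overlay_frac((k-1)/n, blank_bb, scale(k/n, rune)), and a tuple tagged "overlay" stands in for a call to overlay.

```python
from fractions import Fraction


def layer(k, n):
    # Stand-in for one layer: records only (overlay fraction, scale factor).
    return (Fraction(k - 1, n), Fraction(k, n))


def nest(n, k=1):
    # The argument k carries the loop state from one call to the next.
    if k == n:
        return layer(n, n)
    # Stand-in for overlay(a, b): a tuple that records the nesting order.
    return ("overlay", layer(k, n), nest(n, k + 1))


print(nest(4))
```

For n = 4 the printed structure has the same shape as overlay(a, overlay(b, overlay(c, d))) from the question, with the fractions 0, 1/4, 1/2, 3/4 paired with the scales 1/4, 1/2, 3/4, 1.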
Your first two code blocks don't correspond to the same operations. This would be equivalent to your first block (in Python 3).
def overlayer(n, rune):
    def layer(k):
        # Scale decreases linearly with k
        return overlay_frac((1 - (k+1)/n), blank_bb, scale(1-k/n, rune))
    result = layer(0)
    for i in range(1, n):
        # Overlay on top of previous layers
        result = overlay(layer(i), result)
    return result

show(overlayer(4, rune))
Let's look at your equations again:
a = overlay_frac(0,blank_bb,scale(1/4,rune))
b = overlay_frac(1/4,blank_bb,scale(1/2,rune))
c = overlay_frac(1/2,blank_bb,scale(3/4,rune))
d = overlay_frac(3/4,blank_bb,scale(1,rune))
show (overlay(a,(overlay(b,(overlay(c,d))))))
What you wrote as "recursion" is not a recursion formula. If you compare your formulas for the recursion with the ones you gave us, you can infer n=4 which makes no sense. For a recursion pattern you need to describe your inner variables as a manifestation of the same expression with only a different parameter. That is, you should replace:
f_n = overlay_frac((1/4)*(n-1),blank_bb,scale(n/4,rune))
such that f_1=a, f_2=b etc...
Then the recursion formula that you want to calculate translates to:
show (overlay(f_1,(overlay(f_2,(overlay(f_3,f_4))))))
You can write the function f_n as f(n) (and maybe other parameters) in your code and then do
def recurse(n):
    if n == 4:
        return f(4)
    else:
        return overlay(f(n),recurse(n+1))
then call:
show( recurse (1))
You need to assert that n is an integer less than 5, otherwise you'll end up in an infinite recursion.
There may still be some mistake, but it should be along those lines. Once you've actually written it like this however, it (maybe) doesn't really make sense to do a recursion anyways. If you only want to do it for n_max=4, that is. Just call the function in one line by replacing a,b,c,d with f_1,f_2,f_3,f_4
I have a line of code like this -
while someMethod(n) < length and List[someMethod(n)] == 0:
    # do something
    n += 1
where someMethod(arg) does some computation on the number n. The problem with this code is that I'm doing the same computation twice, which is something I need to avoid.
One option is to do this -
x = someMethod(n)
while x < length and List[x] == 0:
    # do something
    n += 1
    x = someMethod(n)
I am storing the value of someMethod(n) in a variable x and then using it later. However, the problem with this approach is that the code is inside a recursive method which is called multiple times. As a result, a lot of excess instances of variables x are being created which slows the code down.
Here's a snippet of the code -
def recursion(x, n, i):
    while someMethod(n) < length and List[someMethod(n)] == 0:
        # do something
        n += 1
    # some condition
    recursion(x - 1, n, someList(i + 1))
and this recursion method is called many times throughout the code and the recursion is quite deep.
Is there some alternative available to deal with a problem like this?
Please try to be language independent if possible.
You can use memoization via a decorator:
def memoize(f):
    memo = dict()
    def wrapper(x):
        if x not in memo:
            memo[x] = f(x)
        return memo[x]
    return wrapper

@memoize
def someMethod(x):
    return <your computations with x>
As I understand your code, you are looking for some sort of memoization.
https://en.wikipedia.org/wiki/Memoization
It means that on every recursive call you save as many past calculations as possible, so that you can reuse them in the current calculation.
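In Python specifically, the standard library already provides such a cache as functools.lru_cache. The sketch below uses made-up stand-ins: some_method returns 2 * n only as a placeholder computation, call_count exists purely to show how often the body runs, and advance mirrors the while loop from the question.

```python
from functools import lru_cache

call_count = 0


@lru_cache(maxsize=None)
def some_method(n):
    """Placeholder for the expensive computation from the question."""
    global call_count
    call_count += 1
    return 2 * n


def advance(values, n):
    """Advance n while some_method(n) is a valid index that points at a 0."""
    length = len(values)
    # some_method(n) appears twice, but the second call is served from the cache.
    while some_method(n) < length and values[some_method(n)] == 0:
        n += 1
    return n


print(advance([0, 0, 0, 0, 1, 0], 0))  # 2
print(call_count)                      # 3
```

The cache lives with the function rather than with each recursive call, so deep recursion does not create extra copies of it. It does assume that the computation depends only on its argument.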