I am creating a chess engine in Python using a minimax algorithm with alpha-beta pruning. It is, however, very slow at the moment, and I found that doing a deepcopy each iteration in minimax is as slow as all my other functions combined.
Is there any way to get around the deepcopy, or to make it faster? Below is my minimax function as of today. It can only think 3-4 moves ahead or so, which doesn't make a very good engine... Any suggestions on speeding the algorithm up are very appreciated.
def minimax(board, depth, alpha, beta, maximizing_player):
    board.is_human_turn = not maximizing_player
    children = board.get_all_possible_moves()
    if depth == 0 or board.is_draw or board.is_check_mate:
        return None, evaluate(board)
    best_move = random.choice(children)
    if maximizing_player:
        max_eval = -math.inf
        for child in children:
            board_copy = copy.deepcopy(board)
            board_copy.move(child[0][0], child[0][1], child[1][0], child[1][1])
            current_eval = minimax(board_copy, depth - 1, alpha, beta, False)[1]
            if current_eval > max_eval:
                max_eval = current_eval
                best_move = child
            alpha = max(alpha, current_eval)
            if beta <= alpha:
                break
        return best_move, max_eval
    else:
        min_eval = math.inf
        for child in children:
            board_copy = copy.deepcopy(board)
            board_copy.move(child[0][0], child[0][1], child[1][0], child[1][1])
            current_eval = minimax(board_copy, depth - 1, alpha, beta, True)[1]
            if current_eval < min_eval:
                min_eval = current_eval
                best_move = child
            beta = min(beta, current_eval)
            if beta <= alpha:
                break
        return best_move, min_eval
Some ideas on how to optimize your program (in no particular order):
1) Make the check if depth == 0 or board.is_draw or board.is_check_mate the first thing you do in the minimax function. Right now you call board.get_all_possible_moves() first, which might be redundant (e.g. in the case depth == 0).
2) I don't see how the get_all_possible_moves() method is implemented, and I assume that it doesn't do any kind of sorting. It's good practice to order moves for the minimax algorithm so that you loop over them from the best one to the worst (this way you are likely to prune more nodes and speed up the program).
3) The child variable in the for child in children loop is a two-dimensional matrix. I also guess that board is a multi-dimensional array as well. Multi-dimensional arrays can be slower than one-dimensional ones because of their memory layout (e.g. if you iterate over them column-wise). Use one-dimensional arrays if possible (e.g. you can represent a two-dimensional array as a "concatenation" of one-dimensional arrays).
4) Use generators for lazy evaluation. For instance, you can turn your get_all_possible_moves() into a generator and iterate over it without creating lists and consuming extra memory. If the alpha/beta pruning condition triggers early, you won't need to expand the whole list of children in the position.
5) Avoid deepcopying of the board by making and unmaking the current move. In this case you don't create copies of the board but reuse the original board which might be faster:
current_move_info = ... # collect and store info necessary to make/unmake the current move
board.move(current_move_info)
current_eval = minimax(board, ...)[1]
board.unmake_move(current_move_info) # restore the board position by undoing the last move
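For example, a minimal sketch of how the maximizing loop could look with this approach (unmake_move and move_info are assumptions about your Board API, you would have to implement them yourself):
for child in children:
    move_info = board.move(child[0][0], child[0][1], child[1][0], child[1][1])
    # assumption: move() is changed to return whatever unmake_move() needs
    # (captured piece, previous turn, castling/en-passant state, ...)
    current_eval = minimax(board, depth - 1, alpha, beta, False)[1]
    board.unmake_move(move_info)  # hypothetical method that restores the previous position
    if current_eval > max_eval:
        max_eval = current_eval
        best_move = child
    alpha = max(alpha, current_eval)
    if beta <= alpha:
        break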
6) Add more classical chess-engine optimizations such as iterative deepening, principal variation search, transposition tables, bitboards, etc.
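As a small illustration of the first of those, here is a hedged sketch of an iterative-deepening wrapper around your existing minimax (find_best_move, max_depth and time_limit are my names, not part of your code):
import math
import time

def find_best_move(board, max_depth, time_limit=5.0):
    # Search depth 1, 2, 3, ... until the time budget is spent.
    # The result of the previous depth is always available as a fallback,
    # and with move ordering / transposition tables the shallow searches
    # make the deeper ones prune much more aggressively.
    start = time.time()
    best_move = None
    for depth in range(1, max_depth + 1):
        move, score = minimax(board, depth, -math.inf, math.inf, True)
        best_move = move
        if time.time() - start > time_limit:
            break
    return best_move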
I have a sparse 60000x10000 matrix M where each element is either a 1 or 0. Each column in the matrix is a different combination of signals (ie. 1s and 0s). I want to choose five column vectors from M and take the Hadamard (ie. element-wise) product of them; I call the resulting vector the strategy vector. After this step, I compute the dot product of this strategy vector with a target vector (that does not change). The target vector is filled with 1s and -1s such that having a 1 in a specific row of the strategy vector is either rewarded or penalised.
Is there some heuristic or linear algebra method that I could use to help me pick the five vectors from the matrix M that result in a high dot product? I don't have any experience with Google's OR-Tools or SciPy's optimization methods, so I am not too sure if they can be applied to my problem. Advice on this would be much appreciated! :)
Note: the five column vectors given as the solution do not need to be the optimal ones; I'd rather have something that does not take months/years to run.
First of all, thanks for a good question. I don't get to practice numpy that often. Also, I don't have much experience in posting to SE, so any feedback, code critique, and opinions relating to the answer are welcome.
This was an attempt at finding an optimal solution at first, but I didn't manage to deal with the complexity. The algorithm should, however, give you a greedy solution that might prove to be adequate.
Colab Notebook (Python code + Octave validation)
Core Idea
Note: During runtime, I've transposed the matrix. So, the column vectors in the question correspond to row vectors in the algorithm.
Notice that you can multiply the target with one vector at a time, effectively getting a new target, but with some 0s in it. Those entries will never change, so you can save some computation by removing those rows (columns, in the algorithm) entirely from further computations, both from the target and from the matrix. You're then left with a valid target again (only 1s and -1s in it).
That's the basic idea of the algorithm. Given:
n: number of vectors you need to pick
b: number of best vectors to check
m: complexity of matrix operations to check one vector
Do an exponentially-complex O((n*m)^b) depth-first search, but decrease the complexity of the calculations in deeper layers by reducing target/matrix size, while cutting down a few search paths with some heuristics.
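For concreteness, here is a tiny NumPy illustration of that reduction step (toy data; these names are only for this snippet, not part of the real code below):
import numpy as np

target = np.array([1, -1, 1, -1, 1], dtype=np.int8)
vector = np.array([1, 0, 1, 1, 0], dtype=np.int8)   # one picked row vector
new_target = target * vector        # rows where the vector has a 0 become 0
keep = new_target != 0              # those rows can never change the score again
reduced_target = new_target[keep]   # a valid +/-1 target again, just shorter
print(reduced_target)               # [ 1  1 -1]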
Heuristics used
The best score achieved so far is known in every recursion step. Compute an optimistic vector (turn -1 to 0) and check what scores can still be achieved. Do not search in levels where the score cannot be surpassed.
This is useless if the best vectors in the matrix have 1s and 0s equally distributed. The optimistic scores are just too high. However, it gets better with more sparsity.
Ignore duplicates. Basically, do not check duplicate vectors in the same layer. Because we reduce the matrix size, the chance for ending up with duplicates increases in deeper recursion levels.
Further Thoughts on Heuristics
The most valuable ones are those that eliminate vector choices at the start. There's probably a way to find vectors that are worse-or-equal than others with respect to their effect on the target. Say, if v1 only differs from v2 by an extra 1, and the target has a -1 in that row, then v1 is worse-or-equal than v2.
The problem is that we need to find more than 1 vector, and can't readily discard the rest. If we have 10 vectors, each worse-or-equal than the one before, we still have to keep 5 at the start (in case they're still the best option), then 4 in the next recursion level, 3 in the following, etc.
Maybe it's possible to produce a tree and pass it on into the recursion? Still, that doesn't help trim down the search space at the start... Maybe it would help to only consider 1 or 2 of the vectors in the worse-or-equal chain? That would explore more diverse solutions, but it doesn't guarantee that the result is more optimal.
Warning: note that the MATRIX and TARGET in the example are int8. If you use these for the dot product, it will overflow. Though I think all operations in the algorithm create new variables, so they are not affected.
Code
# Given:
import numpy as np

TARGET = np.random.choice([1, -1], size=60000).astype(np.int8)
MATRIX = np.random.randint(0, 2, size=(10000, 60000), dtype=np.int8)

# Tunable - increase to search more vectors, at the cost of time.
# Performs better if the best vectors in the matrix are sparse
MAX_BRANCHES = 3  # can give more for sparser matrices

# Usage
score, picked_vectors_idx = pick_vectors(TARGET, MATRIX, 5)
# Function
def pick_vectors(init_target, init_matrix, vectors_left_to_pick: int, best_prev_result=float("-inf")):
    assert vectors_left_to_pick >= 1
    if init_target.shape == (0, ) or len(init_matrix.shape) <= 1 or init_matrix.shape[0] == 0 or init_matrix.shape[1] == 0:
        return float("inf"), None
    target = init_target.copy()
    matrix = init_matrix.copy()

    neg_matrix = np.multiply(target, matrix)
    neg_matrix_sum = neg_matrix.sum(axis=1)

    if vectors_left_to_pick == 1:
        picked_id = np.argmax(neg_matrix_sum)
        score = neg_matrix[picked_id].sum()
        return score, [picked_id]
    else:
        sort_order = np.argsort(neg_matrix_sum)[::-1]
        sorted_sums = neg_matrix_sum[sort_order]
        sorted_neg_matrix = neg_matrix[sort_order]
        sorted_matrix = matrix[sort_order]

        best_score = best_prev_result
        best_picked_vector_idx = None

        # Heuristic 1 (H1) - optimistic target.
        # Set a maximum score that can still be achieved
        optimistic_target = target.copy()
        optimistic_target[target == -1] = 0
        if optimistic_target.sum() <= best_score:
            # This check can be removed - the scores are too high at this point
            return float("-inf"), None

        # Heuristic 2 (H2) - ignore duplicates
        vecs_tried = set()

        # MAIN GOAL: for picked_id, picked_vector in enumerate(sorted_matrix):
        for picked_id, picked_vector in enumerate(sorted_matrix[:MAX_BRANCHES]):
            # H2
            picked_tuple = tuple(picked_vector)
            if picked_tuple in vecs_tried:
                continue
            else:
                vecs_tried.add(picked_tuple)

            # Discard picked vector
            new_matrix = np.delete(sorted_matrix, picked_id, axis=0)

            # Discard matrix and target rows where vector is 0
            ones = np.argwhere(picked_vector == 1).squeeze()
            new_matrix = new_matrix[:, ones]
            new_target = target[ones]
            if len(new_matrix.shape) <= 1 or new_matrix.shape[0] == 0:
                return float("-inf"), None

            # H1: Do not compute if best score cannot be improved
            new_optimistic_target = optimistic_target[ones]
            optimistic_matrix = np.multiply(new_matrix, new_optimistic_target)
            optimistic_sums = optimistic_matrix.sum(axis=1)
            optimistic_viable_vector_idx = optimistic_sums > best_score
            if optimistic_sums.max() <= best_score:
                continue
            new_matrix = new_matrix[optimistic_viable_vector_idx]

            score, next_picked_vector_idx = pick_vectors(new_target, new_matrix, vectors_left_to_pick - 1, best_prev_result=best_score)
            if score <= best_score:
                continue

            # Convert idx of trimmed-down matrix into sorted matrix IDs
            for i, returned_id in enumerate(next_picked_vector_idx):
                # H1: Loop until you hit the required number of 'True'
                values_passed = 0
                j = 0
                while True:
                    value_picked: bool = optimistic_viable_vector_idx[j]
                    if value_picked:
                        values_passed += 1
                        if values_passed - 1 == returned_id:
                            next_picked_vector_idx[i] = j
                            break
                    j += 1
                # picked_vector index
                if returned_id >= picked_id:
                    next_picked_vector_idx[i] += 1

            best_score = score

            # Convert from sorted matrix to input matrix IDs before returning
            matrix_id = sort_order[picked_id]
            next_picked_vector_idx = [sort_order[x] for x in next_picked_vector_idx]
            best_picked_vector_idx = [matrix_id] + next_picked_vector_idx

        return best_score, best_picked_vector_idx
Maybe it's too naive, but the first thing that occurs to me is to choose the 5 columns with the shortest distance to the target:
import scipy.sparse
import numpy as np
from sklearn.metrics.pairwise import pairwise_distances
def sparse_prod_axis0(A):
    """Sparse equivalent of np.prod(arr, axis=0)
    From https://stackoverflow.com/a/44321026/3381305
    """
    valid_mask = A.getnnz(axis=0) == A.shape[0]
    out = np.zeros(A.shape[1], dtype=A.dtype)
    out[valid_mask] = np.prod(A[:, valid_mask].A, axis=0)
    return np.matrix(out)

def get_strategy(M, target, n=5):
    """Guess n best vectors.
    """
    dists = np.squeeze(pairwise_distances(X=M, Y=target))
    idx = np.argsort(dists)[:n]
    return sparse_prod_axis0(M[idx])
# Example data.
M = scipy.sparse.rand(m=6000, n=1000, density=0.5, format='csr').astype('bool')
target = np.atleast_2d(np.random.choice([-1, 1], size=1000))

# Try it.
strategy = get_strategy(M, target, n=5)
result = strategy @ target.T
It strikes me that you could add another step of taking the top few percent from the M–target distances and check their mutual distances — but this could be quite expensive.
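A hedged sketch of what that extra step might look like, reusing the pieces from get_strategy above (top_frac and everything else here is an assumption, not tested code):
top_frac = 0.02                                      # e.g. keep the closest 2 %
k = max(5, int(top_frac * M.shape[0]))
dists = np.squeeze(pairwise_distances(X=M, Y=target))
top_idx = np.argsort(dists)[:k]
mutual = pairwise_distances(M[top_idx])              # k x k candidate-to-candidate distances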
I have not checked how this compares to an exhaustive search.
I am implementing a Checkers game using the Minimax algorithm and Python. There are two players - both are computers. I was looking for a solution to a similar problem, but I could not find any, and I have been struggling with it for a few days. My entry point is this function:
def run_game(board):
    players = board.players
    is_move_possible = True
    move = 0
    while is_move_possible:
        is_move_possible = move_piece_minimax(board, players[move % 2])
        move += 1
It starts the game and calls the next function, which is supposed to make the best move, based on the MiniMax algorithm, for the first player. After that first move, it calls this function for the second player, and this loop will end once the game is won by one of the players. This function looks as follows:
def move_piece_minimax(board, player):
    best_move = minimax(copy.deepcopy(board), player, 0)
    if best_move.score == +infinity or best_move.score == -infinity:
        return False
    move_single_piece(board.fields, player, best_move)
    return True
The first line calls the MiniMax algorithm, which I will describe later, and is supposed to find the best possible move for the player. I am passing a deep copy of the whole board here, as I do not want the original one to be edited during execution of the MiniMax algorithm. The condition checks for a win, i.e. whether either the maximizing or the minimizing player has won. If neither of them has won, best_move is performed. Moving on to the main problem, I implemented the MiniMax algorithm as follows:
def minimax(board, player, depth):
    best_move = Move(-1, -1, -infinity if player.name == PLAYER_NAMES['P1'] else +infinity)
    if depth == MAX_SEARCH_DEPTH or game_over(board):
        score = evaluate(board)
        return Move(-1, -1, score)
    for correct_move in get_all_correct_moves(player, board.fields):
        x, y, piece = correct_move.x, correct_move.y, correct_move.piece
        move_single_piece(board.fields, player, correct_move)
        player_to_move = get_player_to_move(board, player)
        move = minimax(board, player_to_move, depth + 1)  # <--- here is a recursion
        move.x = x
        move.y = y
        move.piece = piece
        if player.name == PLAYER_NAMES['P1']:
            if move.score > best_move.score:
                best_move = move  # max value
        else:
            if move.score < best_move.score:
                best_move = move  # min value
    return best_move
I decided that player 'P1' is a maximizing player and player 'P2' is a minimizing one. Starting from the first line, best_move variable holds a reference to a Move object which has the following fields:
class Move:
    def __init__(self, x, y, score, piece=None):
        self.x = x
        self.y = y
        self.score = score
        self.piece = piece
I am initializing best_move.score to -Infinity in case of the maximizing player and to +Infinity otherwise.
The first condition checks whether depth has reached the maximal level (for testing purposes it is set to 2) or the game is over. If so, it evaluates the current board position and returns a Move object holding the current board's score. Otherwise, my algorithm finds all legal/correct moves for the player and performs the first one.
After performing it, the function is called recursively, but with an incremented depth and with the other player to move. The function runs again with the changed parameters until the first if condition is met.
Once the execution reaches that branch, the board's evaluation score is returned, and later, in the for loop after the recursive call, the coordinates x and y and the piece that was moved are assigned to the Move object.
The last conditions check whether the new score is a better score for that specific player. If it is the maximizing player, so in my case P1, it checks whether the new score is higher than the previous one. In the case of the minimizing player, the algorithm looks for the lowest score.
After performing all correct moves for that player, my algorithm should return one best_move.
Expected result
A single object of the Move class with coordinates x and y, an evaluated board score, which is +Infinity/-Infinity only in the case of a win for one of the players, and an object of the Piece class which will be moved to the [x, y] coordinates.
Actual result
A single object of the Move class with coordinates x and y and an evaluated board score which is equal to +Infinity after the first call of the MiniMax function. None of the pieces changed its position, so the game is not over yet. However, the score is +Infinity, so the function move_piece_minimax() will return False - meaning no more moves are possible. Therefore, my program stops execution with no changes on the board. Here is the screenshot of the initial and final board states - nothing is changed during execution, as the first call returns +Infinity.
My question is, what did I miss during the implementation of the MiniMax algorithm? Did I make any mistake? I am also open to any code improvements or suggestions. If any additional functions are needed for you to understand my implementation, I will provide them. Thank you!
In the minimax function, you should do either of the following
1. Make a copy of your board before placing pieces
2. Remove the placed piece after recursively calling your minimax function
Otherwise your board will keep getting filled with pieces as the recursion goes deeper, and you'll receive an error stating that there are no moves left. Minimax is meant to do an in-depth search by placing pieces, so you should implement it in a way that doesn't modify your original board.
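A rough sketch of option 2 applied to the posted minimax loop (undo_single_piece is a hypothetical helper you would have to write; it must put the piece back and restore any captured piece):
for correct_move in get_all_correct_moves(player, board.fields):
    x, y, piece = correct_move.x, correct_move.y, correct_move.piece
    undo_info = move_single_piece(board.fields, player, correct_move)
    # assumption: move_single_piece is changed to return what is needed to undo the move
    player_to_move = get_player_to_move(board, player)
    move = minimax(board, player_to_move, depth + 1)
    undo_single_piece(board.fields, player, correct_move, undo_info)  # hypothetical undo helper
    ...  # the score bookkeeping stays exactly as in your code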
I'm writing a simple A* algorithm for finding the shortest path. But I need something more complicated: the agent can only go forward and rotate (90 degrees). Will this affect the path, or can I use plain A*?
Thanks to all.
def astar(maze, start, end):
    start_node = Node(None, start)
    start_node.g = start_node.h = start_node.f = 0
    end_node = Node(None, end)
    end_node.g = end_node.h = end_node.f = 0

    open_list = []
    closed_list = []
    open_list.append(start_node)

    while len(open_list) > 0:
        current_node = open_list[0]
        current_index = 0
        for index, item in enumerate(open_list):
            if item.f < current_node.f:
                current_node = item
                current_index = index

        open_list.pop(current_index)
        closed_list.append(current_node)

        if current_node == end_node:
            path = []
            current = current_node
            while current is not None:
                path.append(current.position)
                current = current.parent
            return path[::-1]

        children = []
        for new_position in [(0, -1), (0, 1), (-1, 0), (1, 0), (-1, -1), (-1, 1), (1, -1), (1, 1)]:
            node_position = (current_node.position[0] + new_position[0], current_node.position[1] + new_position[1])
            if node_position[0] > (len(maze) - 1) or node_position[0] < 0 or node_position[1] > (len(maze[len(maze)-1]) - 1) or node_position[1] < 0:
                continue
            if maze[node_position[0]][node_position[1]] != 0:
                continue
            new_node = Node(current_node, node_position)
            children.append(new_node)

        for child in children:
            for closed_child in closed_list:
                if child == closed_child:
                    continue
            child.g = current_node.g + 1
            child.h = ((child.position[0] - end_node.position[0]) ** 2) + ((child.position[1] - end_node.position[1]) ** 2)
            child.f = child.g + child.h
            for open_node in open_list:
                if child == open_node and child.g > open_node.g:
                    continue
            open_list.append(child)
The main issue here is that the A* algorithm needs to be made aware of the fact that there's a difference between "being at position (x, y) facing in direction d1" and "being at position (x, y) facing in direction d2." If the algorithm doesn't know this, it can't give you back the best series of instructions to follow.
One way that you can solve this is to imagine that your world isn't a 2D grid, but actually a 3D space consisting of four copies of the 2D grid stacked on top of one another. Just as your position in the 2D space consists of your current (x, y) coordinate, your position in the 3D space consists of your current (x, y) coordinate, combined with the direction that you're facing.
Imagine what it means to move around in this space. You have two options for the action you can take - "move" or "turn 90°". The "move" action will move you one step forward in the current 2D slice you're in, where "forward" has a different meaning based on what slice you're in. In other words, "move" will cause you to move purely within the X/Y plane, keeping your direction fixed. The "turn" operation will keep your X/Y position fixed, but will change which plane you're in (based on which direction you happen to be facing at the time).
If you encode this information explicitly into your search, then A* can use it to find the best path for you to take. You'll need to define some new heuristic that estimates the distance to the goal. One option would be to assume there are no walls and to determine how many steps you'd have to take, plus the number of rotations needed, to get to the goal. At that point, A* can give you the best path to follow, using your heuristic to guide its search.
More generally, many problems of the form "I have a position in the world, plus some extra state (orientation, velocity, etc.)" can be converted from pathfinding in 2D space to pathfinding in 3D space, where your third dimension would be that extra piece of information. Or perhaps it'll go up to 4D or 5D space if you have multiple extra pieces of information.
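Here is a hedged sketch of that encoding (all names are mine; treat it as a template for your own successor and heuristic functions):
# A state is (row, col, heading); heading indexes into DIRS.
DIRS = [(-1, 0), (0, 1), (1, 0), (0, -1)]  # up, right, down, left

def neighbors(state, maze):
    r, c, h = state
    # Turning costs one step and only changes the heading (the "slice" you're in).
    yield (r, c, (h + 1) % 4), 1
    yield (r, c, (h - 1) % 4), 1
    # Moving costs one step and keeps the heading.
    dr, dc = DIRS[h]
    nr, nc = r + dr, c + dc
    if 0 <= nr < len(maze) and 0 <= nc < len(maze[0]) and maze[nr][nc] == 0:
        yield (nr, nc, h), 1

def heuristic(state, goal):
    r, c, h = state
    gr, gc = goal
    # Wall-free estimate: Manhattan distance, plus at least one turn if both
    # the row and the column still have to change (never overestimates).
    turns = 1 if (r != gr and c != gc) else 0
    return abs(r - gr) + abs(c - gc) + turns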
Hope this helps!
The first thing to do is to divide the source code into two functions: the forward model of the agent and the (path)planning algorithm itself. The forward model specifies that the agent has three possible actions (forward, rotleft, rotright). The A* planner creates the game tree for the agent. A possible sequence is: left, forward, forward.
It's not a classical path-planning problem but an AI planning problem. Realizing such a two-layer system is more complicated than a normal A* wayfinding algorithm, because the rotation action doesn't bring the agent nearer to the goal. It's harder to decide whether a simulated action has improved the situation or was a step backward.
In a real-life scenario, even the described separation into forward model and planner wouldn't solve the issue, because the state space of a map would grow very fast. Additional heuristics are needed to guide the A* planner in the optimal direction.
I have a list of intervals and I need to return the ones that overlap with an interval passed in a query. What is special is that in a typical query around a third or even half of the intervals will overlap with the one given in the query. Also, the ratio of the shortest interval to the longest is not more than 1:5. I implemented my own interval tree (augmented red-black tree) - I did not want to use existing implementations because I needed support for closed intervals and some special features. I tested the query speed with 6000 queries in a tree with 6000 intervals (so n=6000 and m=3000 (app.)). It turned out that brute force is just as good as using the tree:
Computation time - loop: 125.220461 s
Tree setup: 0.05064 s
Tree Queries: 123.167337 s
Let me use asymptotic analysis. n: number of queries; n: number of intervals; app. n/2: number of intervals returned in a query:
time complexity brute force: n*n
time complexity tree: n*(log(n) + n/2) --> 1/2*n*n + n*log(n) --> n*n
So the result is saying that the two should be roughly the same for a large n. Still, one would somehow expect the tree to be noticeably faster, given the constant 1/2 in front of n*n. So there are three possible reasons I can imagine for the results I got (a rough worked example with n = 6000 follows after this list):
a) My implementation is wrong. (Should I be using BFS like below?)
b) My implementation is right, but I made things cumbersome for Python so it needs more time to deal with the tree than to deal with brute force.
c) everything is OK - it is just how things should behave for a large n
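Here is the rough arithmetic with n = 6000 mentioned above (a back-of-the-envelope check, ignoring all constant factors except the 1/2):
import math

n = 6000
brute_force = n * n                    # 36,000,000 overlap checks
tree = n * (math.log2(n) + n / 2)      # about 18 million, i.e. only roughly 2x fewer
print(brute_force, int(tree))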
My query function looks like this:
from collections import deque
def query(self, low, high):
    result = []
    q = deque([self.root])  # this is the root node in the tree
    append_result = result.append
    append_q = q.append
    pop_left = q.popleft
    while q:
        node = pop_left()  # look at the next node
        if node.overlap(low, high):  # some overlap?
            append_result(node.interval)
        if node.low != None and low <= node.get_low_max():  # en-q left node
            append_q(node.low)
        if node.high != None and node.get_high_min() <= high:  # en-q right node
            append_q(node.high)
I build the tree like this:
def build(self, intervals):
    """
    Function which is recursively called to build the tree.
    """
    if intervals is None:
        return None
    if len(intervals) > 2:  # intervals is always sorted in increasing order
        mid = len(intervals)//2
        # split intervals into three parts:
        # central element (median)
        center = intervals[mid]
        # left half (<= median)
        new_low = intervals[:mid]
        # right half (>= median)
        new_high = intervals[mid+1:]
        # compute max on the lower side (left):
        max_low = max([n.get_high() for n in new_low])
        # store min on the higher side (right):
        min_high = new_high[0].get_low()
    elif len(intervals) == 2:
        center = intervals[1]
        new_low = [intervals[0]]
        new_high = None
        max_low = intervals[0].get_high()
        min_high = None
    elif len(intervals) == 1:
        center = intervals[0]
        new_low = None
        new_high = None
        max_low = None
        min_high = None
    else:
        raise Exception('The tree is not behaving as it should...')
    return Node(center, self.build(new_low), self.build(new_high),
                max_low, min_high)
EDIT:
A node is represented like this:
class Node:
    def __init__(self, interval, low, high, max_low, min_high):
        self.interval = interval  # pointer to corresponding interval object
        self.low = low            # pointer to node containing intervals to the left
        self.high = high          # pointer to node containing intervals to the right
        self.max_low = max_low    # maximum value on the left side
        self.min_high = min_high  # minimum value on the right side
All the nodes in a subtree can be obtained like this:
def subtree(current):
    node_list = []
    if current.low != None:
        node_list += subtree(current.low)
    node_list += [current]
    if current.high != None:
        node_list += subtree(current.high)
    return node_list
p.s. note that by exploiting that there is so much overlap and that all intervals have comparable lengths, I managed to implement a simple method based on sorting and bisection that completed in 80 s, but I would say this is over-fitting... Amusingly, by using asymptotic analysis, I found it should have app. the same runtime as using the tree...
If I correctly understand your problem, you are trying to speed up your process.
If so, try to create a real tree instead of manipulating lists.
Something that looks like:
class IntervalTreeNode():
    def __init__(self, parent, min, max):
        self.value = (min, max)
        self.parent = parent
        self.leftBranch = None
        self.rightBranch = None

    def insert(self, interval):
        ...

    def asList(self):
        """ return the list that is this node and all the subtree nodes """
        left = []
        if (self.leftBranch != None):
            left = self.leftBranch.asList()
        right = []
        if (self.rightBranch != None):
            right = self.rightBranch.asList()
        return [self.value] + left + right
And then at the start create an IntervalTreeNode and insert all your intervals into it.
This way, if you really need a list, you can build one each time you need a result, and not each time you make a step in your recursive iteration using [x:] or [:x], since list slicing is a costly operation in Python. It is also possible to work directly with the nodes instead of a list, which will greatly speed up the process, as you just have to return a reference to the node instead of doing list concatenation.
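As a sketch of the "work with the nodes directly" idea (inorder is my name; it assumes the IntervalTreeNode class above and a root node you have already built):
def inorder(node):
    """Yield the node and all subtree nodes lazily, without building lists."""
    if node is None:
        return
    yield from inorder(node.leftBranch)
    yield node
    yield from inorder(node.rightBranch)

# Only materialize a list when you really need one:
# intervals = [n.value for n in inorder(root)]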
Hey. This example is pretty specific but I think it could apply to a broad range of functions.
It's taken from some online programming contest.
There is a game with a simple winning condition. Draw is not possible. Game cannot go on forever because every move takes you closer to the terminating condition. The function should, given a state, determine if the player who is to move now has a winning strategy.
In the example, the state is an integer. A player chooses a non-zero digit and subtracts it from the number: the new state is the new integer. The winner is the player who reaches zero.
I coded this:
from Memoize import Memoize

@Memoize
def Game(x):
    if x == 0: return True
    for digit in str(x):
        if digit != '0' and not Game(x-int(digit)):
            return True
    return False
I think it's clear how it works. I also realize that for this specific game there's probably a much smarter solution, but my question is general. However, this makes Python go crazy even for relatively small inputs. Is there any way to make this code work with a loop?
Thanks
This is what I mean by translating into a loop:
def fac(x):
    if x <= 1: return x
    else: return x*fac(x-1)

def fac_loop(x):
    result = 1
    for i in xrange(1, x+1):
        result *= i
    return result

## dont try: fac(10000)
print fac_loop(10000) % 100  ## works
In general, it is only possible to convert recursive functions into loops when they are primitive-recursive; this basically means that they call themselves only once in the body. Your function calls itself multiple times. Such a function really needs a stack. It is possible to make the stack explicit, e.g. with lists. One reformulation of your algorithm using an explicit stack is
def Game(x):
    # x, str(x), position
    stack = [(x, str(x), 0)]
    # return value
    res = None
    while stack:
        if res is not None:
            # we have a return value
            if not res:
                stack.pop()
                res = True
                continue
            # res is True, continue to search
            res = None
        x, s, pos = stack.pop()
        if x == 0:
            res = True
            continue
        if pos == len(s):
            # end of loop, return False
            res = False
            continue
        stack.append((x, s, pos+1))
        digit = s[pos]
        if digit == '0':
            continue
        x -= int(digit)
        # recurse, starting with position 0
        stack.append((x, str(x), 0))
    return res
Basically, you need to make each local variable an element of a stack frame; the local variables here are x, str(x), and the iteration counter of the loop. Doing return values is a bit tricky - I chose to set res to not-None if a function has just returned.
By "go crazy" I assume you mean:
>>> Game(10000)
# stuff skipped
RuntimeError: maximum recursion depth exceeded in cmp
You could start at the bottom instead -- a crude change would be:
# after defining Game()
for i in range(10000):
    Game(i)

# Now this will work:
print Game(10000)
This is because, if you start with a high number, you have to recurse a long way before you reach the bottom (0), so your memoization decorator doesn't help the way it should.
By starting from the bottom, you ensure that every recursive call hits the dictionary of results immediately. You probably use extra space, but you don't recurse far.
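If you want to avoid the recursion entirely, the same bottom-up idea can be written as a plain loop (game_table is my name; it fills in the same recurrence as Game, starting from 0):
def game_table(n):
    # win[x] is True iff the player to move at x can force a win.
    win = [False] * (n + 1)
    win[0] = True
    for x in range(1, n + 1):
        win[x] = any(not win[x - int(d)] for d in set(str(x)) if d != '0')
    return win[n]

# game_table(10000) gives the same answer as Game(10000)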
You can turn any recursive function into an iterative function by using a loop and a stack -- essentially running the call stack by hand. See this question or this question, for example, for some discussion. There may be a more elegant loop-based solution here, but it doesn't leap out to me.
Well, recursion is mostly about being able to execute some code without losing the previous contexts and their order. In particular, function frames are pushed onto the call stack during recursion, which constrains the recursion depth because the stack size is limited. You can 'increase' your recursion depth by manually managing/saving the required information for each recursive call in a state stack on the heap; usually the amount of available heap memory is larger than that of the stack. Think of how good quicksort implementations eliminate recursion into the larger side by creating an outer loop with ever-changing state variables (lower/upper array boundaries and the pivot, in the quicksort example).
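For example, a small sketch of that quicksort trick (nothing to do with the Game code, just to illustrate the point):
def partition(a, lo, hi):
    # Lomuto partition around the last element.
    pivot = a[hi]
    i = lo
    for j in range(lo, hi):
        if a[j] <= pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]
    return i

def quicksort(a, lo=0, hi=None):
    # Recurse only into the smaller partition and loop over the larger one,
    # which keeps the recursion depth at O(log n) even in the worst case.
    if hi is None:
        hi = len(a) - 1
    while lo < hi:
        p = partition(a, lo, hi)
        if p - lo < hi - p:
            quicksort(a, lo, p - 1)
            lo = p + 1
        else:
            quicksort(a, p + 1, hi)
            hi = p - 1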
While I was typing this, Martin v. Löwis posted a good answer about converting recursive functions into loops.
You could modify your recursive version a bit:
def Game(x):
    if x == 0: return True
    s = set(digit for digit in str(x) if digit != '0')
    return any(not Game(x-int(digit)) for digit in s)
This way, you don't examine digits multiple times. For example, if you are doing 111, you don't have to look at 110 three times.
I'm not sure if this counts as an iterative version of the original algorithm you presented, but here is a memoized iterative version:
import Queue

def Game2(x):
    memo = {}
    memo[0] = True
    calc_dep = {}
    must_calc = Queue.Queue()
    must_calc.put(x)
    while not must_calc.empty():
        n = must_calc.get()
        if n and n not in calc_dep:
            s = set(int(c) for c in str(n) if c != '0')
            elems = [n - digit for digit in s]
            calc_dep[n] = elems
            for new_elem in elems:
                if new_elem not in calc_dep:
                    must_calc.put(new_elem)
    for k in sorted(calc_dep.keys()):
        v = calc_dep[k]
        #print k, v
        memo[k] = any(not memo[i] for i in v)
    return memo[x]
It first calculates the set of numbers that x, the input, depends on. Then it calculates those numbers, starting at the bottom and going towards x.
The code is so fast because of the test against calc_dep. It avoids calculating the same dependencies multiple times. As a result, it can do Game(10000) in under 400 milliseconds, whereas the original takes -- I don't know how long. A long time.
Here are performance measurements:
Elapsed: 1000 0:00:00.029000
Elapsed: 2000 0:00:00.044000
Elapsed: 4000 0:00:00.086000
Elapsed: 8000 0:00:00.197000
Elapsed: 16000 0:00:00.461000
Elapsed: 32000 0:00:00.969000
Elapsed: 64000 0:00:01.774000
Elapsed: 128000 0:00:03.708000
Elapsed: 256000 0:00:07.951000
Elapsed: 512000 0:00:19.148000
Elapsed: 1024000 0:00:34.960000
Elapsed: 2048000 0:01:17.960000
Elapsed: 4096000 0:02:55.013000
It's reasonably zippy.