I'm evaluating chess positions; the implementation isn't really relevant. I've inserted print statements to see how many paths I'm able to prune, but nothing is printed, meaning I don't actually prune anything.
I've understood the algorithm and followed the pseudocode to the letter. Does anyone have any idea what's going wrong?
def alphabeta(self, node, depth, white, alpha, beta):
    ch = Chessgen()
    if depth == 0 or self.is_end(node):
        return self.stockfish_evaluation(node.board)
    if white:
        value = Cp(-10000)
        for child in ch.chessgen(node):
            value = max(value, self.alphabeta(child, depth - 1, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:
                print("Pruned white")
                break
        return value
    else:
        value = Cp(10000)
        for child in ch.chessgen(node):
            value = min(value, self.alphabeta(child, depth - 1, True, alpha, beta))
            beta = min(beta, value)
            if beta <= alpha:
                print("Pruned black")
                break
        return value
What is your pseudocode? The one I found gives slightly different code. Since I do not have your full code, I cannot run it:
def alphabeta(self, node, depth, white, alpha, beta):
    ch = Chessgen()  # can you do the init somewhere else to speed up the code?
    if depth == 0 or self.is_end(node):
        return self.stockfish_evaluation(node.board)
    if white:
        value = Cp(-10000)
        for child in ch.chessgen(node):
            value = max(value, self.alphabeta(child, depth - 1, False, alpha, beta))
            if value >= beta:
                print("Pruned white")
                return value
            alpha = max(alpha, value)
        return value
    else:
        value = Cp(10000)
        for child in ch.chessgen(node):
            value = min(value, self.alphabeta(child, depth - 1, True, alpha, beta))
            if value <= alpha:
                print("Pruned black")
                return value
            beta = min(beta, value)
        return value
A full working simple chess program can be found here:
https://andreasstckl.medium.com/writing-a-chess-program-in-one-day-30daff4610ec
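For what it's worth, the same pruning structure can be exercised on a toy tree of plain integers. This is a minimal, self-contained sketch (the nested-list tree and names are illustrative, not the asker's Chessgen/Cp classes) showing that a cutoff does fire when the bounds cross:

```python
def alphabeta(node, depth, maximizing, alpha, beta, pruned):
    # Leaves are plain ints; internal nodes are lists of children.
    if depth == 0 or isinstance(node, int):
        return node
    if maximizing:
        value = -10000
        for child in node:
            value = max(value, alphabeta(child, depth - 1, False, alpha, beta, pruned))
            if value >= beta:
                pruned.append(child)  # cutoff: remaining siblings are skipped
                return value
            alpha = max(alpha, value)
        return value
    else:
        value = 10000
        for child in node:
            value = min(value, alphabeta(child, depth - 1, True, alpha, beta, pruned))
            if value <= alpha:
                pruned.append(child)  # cutoff in the minimizing branch
                return value
            beta = min(beta, value)
        return value

pruned = []
tree = [[3, 5], [6, 9], [1, 2]]  # three min-nodes under a max root
best = alphabeta(tree, 2, True, -10000, 10000, pruned)
print(best, len(pruned))  # 6 1 -- the last subtree is cut off after seeing 1
```

If an equivalent toy run over your own move generator never reaches a cutoff, the usual suspects are alpha/beta not being passed down, or the evaluation returning objects that don't compare the way you expect.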
I have programmed this ID3 algorithm, and for some reason the predicted value always seems to be None. I cannot figure out why the code does not enter the if statement in the predict function, but I have narrowed the problem down to that area.
I have tried changing the predict function multiple times and debugging, but cannot find out why the issue persists and the feature value is not in tree[root_node]. Can someone please help with this?
def predict(tree, instance):
    if not isinstance(tree, dict):
        return tree
    else:
        root_node = next(iter(tree))
        feat_val = instance[root_node]
        if feat_val in tree[root_node]:
            return predict(tree[root_node][feat_val], instance)
        else:
            return None

def evaluate(tree, test_data_m, label):
    correct_predict = 0
    wrong_predict = 0
    for index, row in test_data_m.iterrows():  # for each row in the dataset
        result = predict(tree, test_data_m.loc[index])
        if result == test_data_m[label][index]:
            correct_predict += 1  # increase correct count
        else:
            wrong_predict += 1  # increase incorrect count
    accuracy = correct_predict / (correct_predict + wrong_predict)
    return accuracy
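For reference, here is predict run standalone on a toy tree (the nested-dict shape is inferred from the code above). It makes the one path that yields None visible: a feature value that never appeared in training has no branch in the dict, so the lookup falls through:

```python
def predict(tree, instance):
    # Leaves are labels; internal nodes are {feature_name: {feature_value: subtree}}
    if not isinstance(tree, dict):
        return tree
    root_node = next(iter(tree))
    feat_val = instance[root_node]
    if feat_val in tree[root_node]:
        return predict(tree[root_node][feat_val], instance)
    return None  # unseen feature value: no branch to follow

tree = {"outlook": {"sunny": "no",
                    "rain": {"wind": {"weak": "yes", "strong": "no"}}}}

print(predict(tree, {"outlook": "sunny"}))                 # no
print(predict(tree, {"outlook": "rain", "wind": "weak"}))  # yes
print(predict(tree, {"outlook": "overcast"}))              # None -- value unseen in training
```

If most predictions are None, it usually means the test rows carry values (or dtypes, e.g. str vs int after loading) that don't match the keys stored in the tree.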
I am trying to write a chess engine in Python. I can find the best move from a given position, but I am struggling to collect the principal variation from that position. The following is what I have tried so far:
def alphabeta(board, alpha, beta, depth, pvtable):
    if depth == 0:
        return evaluate.eval(board)
    for move in board.legal_moves:
        board.push(move)
        score = -alphabeta(board, -beta, -alpha, depth - 1, pvtable)
        board.pop()
        if score >= beta:
            return beta
        if score > alpha:
            alpha = score
            pvtable[depth - 1] = str(move)
    return alpha
I am using pvtable[depth - 1] = str(move) to record moves, but in the end pvtable contains inconsistent moves, things like ['g1h3', 'g8h6', 'h3g5', 'd8g5'] for the starting position.
I know similar questions have been asked, but I still haven't figured out how to solve this problem.
I think your moves are overwritten when the search reaches the same depth again (in a different branch of the game tree).
This site explains quite well how to retrieve the principal variation: https://web.archive.org/web/20071031100114/http://www.brucemo.com:80/compchess/programming/pv.htm
Applied to your code example, it should be something like this (I didn't test it):
def alphabeta(board, alpha, beta, depth, pline):
    line = []
    if depth == 0:
        return evaluate.eval(board)
    for move in board.legal_moves:
        board.push(move)
        score = -alphabeta(board, -beta, -alpha, depth - 1, line)
        board.pop()
        if score >= beta:
            return beta
        if score > alpha:
            alpha = score
            pline[:] = [str(move)] + line
    return alpha
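The detail doing the work here is pline[:]: slice assignment overwrites the contents of the caller's list, so the line found deeper in the tree propagates back up through the recursion. A plain pline = ... would only rebind the local name and the caller would see nothing. A tiny demonstration:

```python
def rebind(lst):
    lst = ["new"]            # rebinds the local name only; the caller's list is untouched

def mutate(lst):
    lst[:] = ["e2e4"] + lst  # slice assignment edits the caller's list in place

pv = ["e7e5"]
rebind(pv)
print(pv)  # ['e7e5'] -- unchanged
mutate(pv)
print(pv)  # ['e2e4', 'e7e5'] -- the new move was prepended in place
```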
I can't seem to figure out why I keep getting this error. The command line error traceback looks like this:
The point of the following code is essentially to give artificial intelligence to Pacman: keep him away from unscared ghosts while eating all the food and capsules on a map. Most of the code for this was given by the prof for an AI class, which can be found here.
The evaluationFunction method returns a very simple heuristic value that takes into consideration the distances to ghosts, food and capsules. The getAction function is found within my ExpectimaxAgent class (the passed arg is MultiAgentSearchAgent), and it gathers all relevant information together, iterates through all possible actions and passes the info along to expectimax. The expectimax function is supposed to calculate a heuristic value which, when returned to getAction, is compared to the other action-heuristic values, and the one with the highest heuristic is chosen as the best action.
This should be all the relevant code for this error (if not I'll add more; also a quick apology for the noob mistakes in this question, I'm a first-time poster):
class ReflexAgent(Agent):
    def getAction(self, gameState):
        # Collect legal moves and successor states
        legalMoves = gameState.getLegalActions()
        # Choose one of the best actions
        scores = [self.evaluationFunction(gameState, action) for action in legalMoves]
        bestScore = max(scores)
        bestIndices = [index for index in range(len(scores)) if scores[index] == bestScore]
        chosenIndex = random.choice(bestIndices)  # Pick randomly among the best
        return legalMoves[chosenIndex]

    def evaluationFunction(self, currentGameState, action):
        successorGameState = currentGameState.generatePacmanSuccessor(action)
        oldPos = currentGameState.getPacmanPosition()
        newPos = successorGameState.getPacmanPosition()
        newFood = successorGameState.getFood()
        newGhostStates = successorGameState.getGhostStates()
        # heuristic baseline
        heuristic = 0.0
        # ghost heuristic
        for ghost in newGhostStates:
            ghostDist = manhattanDistance(ghost.getPosition(), newPos)
            if ghostDist <= 1:
                if ghost.scaredTimer != 0:
                    heuristic += 2000
                else:
                    heuristic -= 200
        # capsule heuristic
        for capsule in currentGameState.getCapsules():
            capsuleDist = manhattanDistance(capsule, newPos)
            if capsuleDist == 0:
                heuristic += 100
            else:
                heuristic += 10.0 / capsuleDist
        # food heuristic
        for x in xrange(newFood.width):
            for y in xrange(newFood.height):
                if newFood[x][y]:
                    foodDist = manhattanDistance(newPos, (x, y))
                    if foodDist == 0:
                        heuristic += 100
                    else:
                        heuristic += 1.0 / (foodDist ** 2)
        if currentGameState.getNumFood() > successorGameState.getNumFood():
            heuristic += 100
        if action == Directions.STOP:
            heuristic -= 5
        return heuristic

def scoreEvaluationFunction(currentGameState):
    return currentGameState.getScore()

class MultiAgentSearchAgent(Agent):
    def __init__(self, evalFn='scoreEvaluationFunction', depth='2'):
        self.index = 0  # Pacman is always agent index 0
        self.evaluationFunction = util.lookup(evalFn, globals())
        self.depth = int(depth)

class ExpectimaxAgent(MultiAgentSearchAgent):
    def getAction(self, gameState):
        # Set v to smallest float value (-infinity)
        v = float("-inf")
        bestAction = []
        # Pacman is agent == 0
        agent = 0
        # All legal actions which Pacman can make from his current location
        actions = gameState.getLegalActions(agent)
        # All successors determined from all the legal actions
        successors = [(action, gameState.generateSuccessor(agent, action)) for action in actions]
        # Iterate through all successors
        for successor in successors:
            # Expectimax call (actor = 1, agentList = total number of agents,
            # state = successor[1], depth = self.depth, evalFunct = self.evaluationFunction)
            temp = expectimax(1, range(gameState.getNumAgents()), successor[1], self.depth, self.evaluationFunction)
            # temp is greater than -infinity (or the previously set value)
            if temp > v:
                v = temp
                bestAction = successor[0]
        return bestAction

def expectimax(agent, agentList, state, depth, evalFunct):
    # Check if won, lost, or depth is less than/equal to 0
    if depth <= 0 or state.isWin() == True or state.isLose() == True:
        # return evalFunct
        return evalFunct
    # Check to see if agent is Pacman
    if agent == 0:
        # Set v to smallest float value (-infinity)
        v = float("-inf")
    # Otherwise, agent is a ghost
    else:
        v = 0
    # All possible legal actions for Pacman/Ghost(s)
    actions = state.getLegalActions(agent)
    # All successors determined from the legal actions for the passed actor
    successors = [state.generateSuccessor(agent, action) for action in actions]
    # Each successor is equally likely
    p = 1.0 / len(successors)
    # Iterate through the successors
    for j in range(len(successors)):
        successor = successors[j]
        # Pacman: maximize over children
        if agent == 0:
            v = max(v, expectimax(agentList[agent + 1], agentList, successor, depth, evalFunct))
        # Last ghost: decrement depth and wrap back around to Pacman
        elif agent == agentList[-1]:
            v += expectimax(agentList[0], agentList, successor, depth - 1, evalFunct) * p
        # Intermediate ghost: move on to the next agent at the same depth
        else:
            v += expectimax(agentList[agent + 1], agentList, successor, depth, evalFunct) * p
    return v
I've looked at a few other posts on here, and around the inter-webs, but have found nothing that seems to be similar to my issue. I've tried to pass the value to a temp variable, even tried to move the multiplication before the function call, but those changes have given me the exact same error, on the exact same line.
The error was the first return inside the expectimax function. Instead of:

def expectimax(agent, agentList, state, depth, evalFunct):
    # Check if won, lost, or depth is less than/equal to 0
    if depth <= 0 or state.isWin() == True or state.isLose() == True:
        return evalFunct  # <-- cause of the error

it should have been:

        return evalFunct(state)
This is because (as was pointed out) evalFunct only points to the evaluation function the user chose (from the command line arguments).
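The underlying Python point is that a function object and the value it returns are different things: returning the bare name hands the caller something it cannot use in arithmetic like v += f * p. A tiny illustration (score_state is a made-up stand-in for the chosen evaluation function):

```python
def score_state(state):
    # stand-in evaluation: just the length of the state
    return len(state)

evalFunct = score_state   # a reference to the function itself, not a number
print(evalFunct)          # <function score_state at 0x...> -- arithmetic on this fails
print(evalFunct("abc"))   # 3 -- calling it yields the numeric value the search needs
```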
I was trying to understand the effective difference between these two pieces of code. They were both written for an assignment I got at school, but only the first one works as it should. I've been unable to understand what goes wrong in the second one, so I'd be fantastically grateful if someone could shine some light on this problem.
First code:
def classify(self, obj):
    if sum([c[0].classify(obj) * c[1] for c in self.classifiers]) > 0:
        return 1
    else:
        return -1

def update_weights(self, best_error, best_classifier):
    w = self.data_weights
    for index in range(len(self.data_weights)):
        if self.standard.classify(self.data[index]) == best_classifier.classify(self.data[index]):
            s = -1
        else:
            s = 1
        self.data_weights[index] = self.data_weights[index] * math.exp(s * error_to_alpha(best_error))
Second code:
def classify(self, obj):
    score = 0
    for c, alpha in self.classifiers:
        score += alpha * c.classify(obj)
    if score > 0:
        return 1
    else:
        return -1

def update_weights(self, best_error, best_classifier):
    alpha = error_to_alpha(best_error)
    for d, w in zip(self.data, self.data_weights):
        if self.standard.classify(d) == best_classifier.classify(d):
            w *= w * math.exp(alpha)
        else:
            w *= w * math.exp(-1.0 * alpha)
The second doesn't modify the weights.
In the first, you explicitly modify the weights array with the line
self.data_weights[index] = ...
but in the second you are only modifying w:
w *= ...
(and you have an extra factor of w). In the second case, w is a variable that is initialised from data_weights, but it is a new variable. It is not the same thing as the array entry, and changing its value does not change the array itself.
So when you later look at data_weights in the second case, it will not have been updated.
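A minimal illustration of the difference, with a fixed loop that writes back through the index (a sketch of the weight update only, not the asker's full class):

```python
import math

weights = [1.0, 2.0]

# Broken: w is a fresh local name on each iteration; the list never changes.
for w in weights:
    w *= math.exp(0.5)
print(weights)  # [1.0, 2.0]

# Fixed: assign back into the list through the index.
for i, w in enumerate(weights):
    weights[i] = w * math.exp(0.5)
print(weights)  # [1.648..., 3.297...]
```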
I am working on implementing an iterative deepening depth-first search to find solutions for the 8-puzzle problem. I am not interested in finding the actual search paths themselves, but rather just in timing how long the program takes to run. (I have not yet implemented the timing function.)
However, I am having some issues trying to implement the actual search function (scroll down to see). I pasted all the code I have so far, so if you copy and paste this, you can run it as well. That may be the best way to describe the problems I'm having. I just don't understand why I'm getting infinite loops during the recursion, e.g. in the test for puzzle 2 (p2), where the first expansion should yield a solution. I thought it might have something to do with not adding a return in front of one of the lines of code (it's commented below). When I add the return, I can pass the test for puzzle 2, but something more complex like puzzle 3 fails, since it appears that the code then only expands the leftmost branch.
I've been at this for hours and am giving up hope. I would really appreciate another set of eyes on this, and if you could point out my error(s). Thank you!
# Classic 8 puzzle game
# Data structure: [0,1,2,3,4,5,6,7,8], which is the goal state. 0 represents the blank
# We also want to ignore "backward" moves (reversing the previous action)

p1 = [0,1,2,3,4,5,6,7,8]
p2 = [3,1,2,0,4,5,6,7,8]
p3 = [3,1,2,4,5,8,6,0,7]

def z(p):  # returns the location of the blank cell, which is represented by 0
    return p.index(0)

def left(p):
    zeroLoc = z(p)
    p[zeroLoc] = p[zeroLoc-1]
    p[zeroLoc-1] = 0
    return p

def up(p):
    zeroLoc = z(p)
    p[zeroLoc] = p[zeroLoc-3]
    p[zeroLoc-3] = 0
    return p

def right(p):
    zeroLoc = z(p)
    p[zeroLoc] = p[zeroLoc+1]
    p[zeroLoc+1] = 0
    return p

def down(p):
    zeroLoc = z(p)
    p[zeroLoc] = p[zeroLoc+3]
    p[zeroLoc+3] = 0
    return p

def expand1(p):  # version 1, which generates all successors at once by copying the parent
    x = z(p)
    # p[:] will make a copy of the parent puzzle
    s = []  # set s of successors
    if x == 0:
        s.append(right(p[:]))
        s.append(down(p[:]))
    elif x == 1:
        s.append(left(p[:]))
        s.append(right(p[:]))
        s.append(down(p[:]))
    elif x == 2:
        s.append(left(p[:]))
        s.append(down(p[:]))
    elif x == 3:
        s.append(up(p[:]))
        s.append(right(p[:]))
        s.append(down(p[:]))
    elif x == 4:
        s.append(left(p[:]))
        s.append(up(p[:]))
        s.append(right(p[:]))
        s.append(down(p[:]))
    elif x == 5:
        s.append(left(p[:]))
        s.append(up(p[:]))
        s.append(down(p[:]))
    elif x == 6:
        s.append(up(p[:]))
        s.append(right(p[:]))
    elif x == 7:
        s.append(left(p[:]))
        s.append(up(p[:]))
        s.append(right(p[:]))
    else:  # x == 8
        s.append(left(p[:]))
        s.append(up(p[:]))
    # returns the set of all possible successors
    return s
goal = [0,1,2,3,4,5,6,7,8]

def DFS(root, goal):  # iterative deepening DFS
    limit = 0
    while True:
        result = DLS(root, goal, limit)
        if result == goal:
            return result
        limit = limit + 1

visited = []

def DLS(node, goal, limit):  # limited DFS
    if limit == 0 and node == goal:
        print "hi"
        return node
    elif limit > 0:
        visited.append(node)
        children = [x for x in expand1(node) if x not in visited]
        print "\n limit =", limit, "---", children  # for testing purposes only
        for child in children:
            DLS(child, goal, limit - 1)  # if I add "return" in front of this line, p2 passes the test below, but p3 fails (only the leftmost branch of the tree gets expanded)
    else:
        return "No Solution"

# Below are tests
print "\ninput: ", p1
print "output: ", DFS(p1, goal)
print "\ninput: ", p2
print "output: ", DLS(p2, goal, 1)
#print "output: ", DFS(p2, goal)
print "\ninput: ", p3
print "output: ", DLS(p3, goal, 2)
#print "output: ", DFS(p2, goal)
The immediate issue you're having with your recursion is that you're not returning anything when you hit your recursive step. However, unconditionally returning the value from the first recursive call won't work either, since the first child isn't guaranteed to be the one that finds the solution. Instead, you need to test to see which (if any) of the recursive searches you're doing on your child states is successful. Here's how I'd change the end of your DLS function:
for child in children:
    child_result = DLS(child, goal, limit - 1)
    if child_result != "No Solution":
        return child_result
# note: "else" removed here, so you can fall through to the final return
return "No Solution"
A slightly more "pythonic" (and faster) way of doing this would be to use None as the sentinel value rather than the "No Solution" string. Then your test would simply be if child_result: return child_result and you could optionally leave off the return statement for the failed searches (since None is the default return value of a function).
There are some other issues going on with your code that you'll run into once this recursion issue is fixed. For instance, using a global visited variable is problematic, unless you reset it each time you restart another recursive search. But I'll leave those to you!
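Putting both suggestions together, None as the sentinel and the visited bookkeeping passed down rather than kept global, the recursive step could look like this hedged, generic sketch (the expand successor function and the numeric toy example are made up for illustration, not the asker's expand1):

```python
def dls(node, goal, limit, expand, path=()):
    """Depth-limited DFS; returns the goal node, or None as the failure sentinel."""
    if node == goal:
        return node
    if limit <= 0:
        return None
    for child in expand(node):
        if child in path:  # cycle check against ancestors only, not a global list
            continue
        result = dls(child, goal, limit - 1, expand, path + (node,))
        if result is not None:  # propagate the first successful branch upward
            return result
    return None

def iddfs(root, goal, expand, max_limit=50):
    # Restart the depth-limited search with a fresh path each iteration.
    for limit in range(max_limit + 1):
        result = dls(root, goal, limit, expand)
        if result is not None:
            return result
    return None

# Toy state space: from n you can move to n+1 or n*2.
expand = lambda n: [n + 1, n * 2]
print(iddfs(1, 10, expand))  # 10
```

Keeping the visited set as the path of ancestors (rather than one global list) is what makes the restarted searches at higher limits behave correctly.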
Use classes for your states! This should make things much easier. I don't want to post the whole solution right now, but here is something to get you started.
# example usage
cur = initialPuzzle
for k in range(0, 5):  # for 5 iterations; this will cycle through, so there is some coding to do
    allsucc = cur.succ()  # get all successors as puzzle instances
    cur = allsucc[0]  # expand the first one
    print 'expand ', cur
import copy

class puzzle:
    '''
    orientation
    [0, 1, 2
     3, 4, 5
     6, 7, 8]
    '''
    def __init__(self, p):
        self.p = p

    def z(self):
        ''' returns the location of the blank cell, which is represented by 0 '''
        return self.p.index(0)

    def swap(self, a, b):
        self.p[a] = self.p[b]
        self.p[b] = 0

    def left(self):
        self.swap(self.z(), self.z()+1)  # FIXME: raise exception if not allowed

    def up(self):
        self.swap(self.z(), self.z()+3)

    def right(self):
        self.swap(self.z(), self.z()-1)

    def down(self):
        self.swap(self.z(), self.z()-3)

    def __str__(self):
        return str(self.p)

    def copyApply(self, func):
        cpy = self.copy()
        func(cpy)
        return cpy

    def makeCopies(self, s):
        ''' some bookkeeping '''
        flist = list()
        if 'U' in s:
            flist.append(self.copyApply(puzzle.up))
        if 'L' in s:
            flist.append(self.copyApply(puzzle.left))
        if 'R' in s:
            flist.append(self.copyApply(puzzle.right))
        if 'D' in s:
            flist.append(self.copyApply(puzzle.down))
        return flist

    def succ(self):
        # return all successor states for this puzzle state
        # shorthand of allowed successor states, indexed by blank location
        m = ['UL','ULR','UR','UDR','ULRD','UDL','DL','LRD','DR']
        ss = self.makeCopies(m[self.z()])  # map them to copies of puzzles
        return ss

    def copy(self):
        return copy.deepcopy(self)

# some initial state
p1 = [0,1,2,3,4,5,6,7,8]
print '*'*20
pz = puzzle(p1)
print pz
a, b = pz.succ()
print a, b