Python Negamax Algorithm - python

I have about as simple of a negamax algorithm as possible, for evaluating positions in Tic Tac Toe. The state of the game is stored as an array in numpy, with X's pieces represented by 1, and O's pieces represented by four.
I was testing this just now, and found:
a = np.zeros(9).reshape(3,3)
negaMax(a, 6, 1) # Returned zero as it should
negaMax(a, 7, 1) # Returns 100
Meaning that my algorithm thinks it has found a way for X to win in seven plies in a game of Tic Tac Toe, which is obviously impossible against decent play. I can't work out how to have it print the best moves it has found, so am having real trouble debugging this. What am I doing wrong?
def winCheck(state):
    """Takes a position, and returns the outcome of that game"""
    # Sums which correspond to a line across a column
    winNums = list(state.sum(axis=0))
    # Sums which correspond to a line across a row
    winNums.extend(list(state.sum(axis=1)))
    # Sums which correspond to a line across the main diagonal
    winNums.append(state.trace())
    # Sums which correspond to a line across the off diagonal
    winNums.append(np.flipud(state).trace())
    if Square.m in winNums:
        return 'X'
    elif (Square.m**2 + Square.m) in winNums:
        return 'O'
    elif np.count_nonzero(state) == Square.m**2:
        return 'D'
    else:
        return None

def moveFind(state):
    """Takes a position as an nparray and determines the legal moves"""
    moveChoices = []
    # Iterate over state, to determine which squares are empty
    it = np.nditer(state, flags=['multi_index'])
    while not it.finished:
        if it[0] == 0:
            moveChoices.append(it.multi_index)
        it.iternext()
    return moveChoices

def moveSim(state, move, player):
    """Create the state of the player having moved without interfering with the board"""
    simState = state.copy()
    if player == 1:
        simState[move] = 1
    else:
        simState[move] = gamecfg.n + 1
    return simState

def positionScore(state):
    """The game is either won or lost"""
    if winCheck(state) == 'X':
        return 100
    elif winCheck(state) == 'O':
        return -100
    else:
        return 0

def negaMax(state, depth, colour):
    """Recursively find the best move via a negamax search"""
    if depth == 0:
        return positionScore(state) * colour
    highScore = -100
    moveList = moveFind(state)
    for move in moveList:
        score = -negaMax(moveSim(state, move, colour), depth -1, colour * -1)
        highScore = max(score, highScore)
    return highScore

Your code does not consider the game to stop when a line of 3 symbols is made.
This means that it is playing a variant of tic-tac-toe where X wins if he makes a line of 3 even after O has made a line of 3.
For this variant, the program has correctly found that it is possible for X to always win!
(I came across the same situation with a chess program I made where the computer was happy to sacrifice its king if it would reach checkmate a little later...)
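A minimal sketch of the fix, assuming the question's encoding (1 for X, 4 for O on a 3x3 board): the search checks whether the game is already over before it expands any moves. The constants X, O and N and the snake_case names are stand-ins of mine for Square.m and gamecfg.n, which the question does not show.

```python
import numpy as np

X, O = 1, 4   # piece values used in the question
N = 3         # board size (assumed; the question keeps it in Square.m)

def win_check(state):
    """Return 'X', 'O', 'D' (draw) or None (game still running)."""
    sums = list(state.sum(axis=0)) + list(state.sum(axis=1))
    sums.append(state.trace())
    sums.append(np.flipud(state).trace())
    if N * X in sums:
        return 'X'
    if N * O in sums:
        return 'O'
    if np.count_nonzero(state) == N * N:
        return 'D'
    return None

def negamax(state, depth, colour):
    """Negamax that stops as soon as the game is decided."""
    outcome = win_check(state)
    if outcome == 'X':
        return 100 * colour
    if outcome == 'O':
        return -100 * colour
    if outcome == 'D' or depth == 0:
        return 0
    best = -100
    # every empty square is a legal move
    for move in zip(*np.nonzero(state == 0)):
        child = state.copy()
        child[move] = X if colour == 1 else O
        best = max(best, -negamax(child, depth - 1, -colour))
    return best
```

With this check in place, a position that O has already won scores -100 from X's point of view no matter how many plies remain, so X can no longer "win later".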

Related

Average time to hit a given line on 2D random walk on a unit grid

I am trying to simulate the following problem:
Given a 2D random walk (on a lattice grid) starting from the origin, what is the average waiting time to hit the line y=1-x?
import numpy as np
from tqdm import tqdm
N=5*10**3
results=[]
for _ in tqdm(range(N)):
    current = [0,0]
    step=0
    while (current[1]+current[0] != 1):
        step += 1
        a = np.random.randint(0,4)
        if (a==0):
            current[0] += 1
        elif (a==1):
            current[0] -= 1
        elif (a==2):
            current[1] += 1
        elif (a==3):
            current[1] -= 1
    results.append(step)
This code is slow even for N<10**4. I am not sure how to optimize it, or how to change it so that it properly simulates the problem.
Instead of simulating a bunch of random walks sequentially, let's simulate multiple paths at the same time and track the probability of each one happening. For instance, we start at position 0 with probability 1:
states = {0+0j: 1}
and the possible moves along with their associated probabilities would be something like this:
moves = {1+0j: 0.25, 0+1j: 0.25, -1+0j: 0.25, 0-1j: 0.25}
# moves = {1: 0.5, -1:0.5} # this would basically be equivalent
With this construct we can update to new states by going over the combination of each state and each move and update probabilities accordingly
def simulate_one_step(current_states):
    newStates = {}
    for cur_pos, prob_of_being_here in current_states.items():
        for movement_dist,prob_of_moving_this_way in moves.items():
            newStates.setdefault(cur_pos+movement_dist, 0)
            newStates[cur_pos+movement_dist] += prob_of_being_here*prob_of_moving_this_way
    return newStates
Then we just iterate this, popping out all winning states at each step:
for stepIdx in range(1, 100):
    states = simulate_one_step(states)
    winning_chances = 0
    # iterate over a copy of the items so we can delete entries from states as we go.
    for pos, prob in list(states.items()):
        # if y = 1-x
        if pos.imag == 1 - pos.real:
            winning_chances += prob
            # we no longer consider this a state that propagated because the path stops here.
            del states[pos]
    print(f"probability of winning after {stepIdx} moves is: {winning_chances}")
You would also be able to look at states for an idea of the distribution of possible positions, although totalling it in terms of distance from the line simplifies the data. Anyway, the final step would be to weight each step count by the probability of finishing in exactly that many steps, add the results up, and see if the sum converges:
total_average_num_moves += stepIdx * winning_chances
But we might be able to gather more insight by using symbolic variables! (Note that I'm simplifying this to a 1D problem; I describe how at the bottom.)
import sympy
x = sympy.Symbol("x") # will sub in 1/2 later
moves = {
1: x, # assume x is the chances for us to move towards the target
-1: 1-x # and therefore 1-x is the chance of moving away
}
This with the exact code as written above gives us this sequence:
probability of winning after 1 moves is: x
probability of winning after 2 moves is: 0
probability of winning after 3 moves is: x**2*(1 - x)
probability of winning after 4 moves is: 0
probability of winning after 5 moves is: 2*x**3*(1 - x)**2
probability of winning after 6 moves is: 0
probability of winning after 7 moves is: 5*x**4*(1 - x)**3
probability of winning after 8 moves is: 0
probability of winning after 9 moves is: 14*x**5*(1 - x)**4
probability of winning after 10 moves is: 0
probability of winning after 11 moves is: 42*x**6*(1 - x)**5
probability of winning after 12 moves is: 0
probability of winning after 13 moves is: 132*x**7*(1 - x)**6
And if we ask the OEIS what the sequence 1,2,5,14,42,132... means, it tells us those are the Catalan numbers, with the formula (2n)!/(n!(n+1)!), so we can write a function for the non-zero terms in that series as:
f(n,x) = (2n)! / (n! * (n+1)!) * x^(n+1) * (1-x)^n
or in actual code:
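import math
def probability_of_winning_after_2n_plus_1_steps(n, prob_of_moving_forward = 0.5):
    # integer division keeps the Catalan number exact and avoids float overflow in the factorials
    catalan = math.factorial(2*n) // (math.factorial(n) * math.factorial(n+1))
    return (catalan
            * prob_of_moving_forward**(n+1) * (1-prob_of_moving_forward)**n)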
which now gives us a near-instant way of calculating the relevant values for any length or, more usefully, of asking Wolfram Alpha what the average would be (it diverges).
Note that we can simplify this to a 1D problem by considering x+y as one variable: "we start at x+y = 0 and move such that x+y either increases or decreases by 1 each move with equal chance, and we are interested in when x+y = 1". The line y = 1-x is exactly the set of points where x+y = 1, so we can consider the 1D case by substituting z = x+y.
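The divergence can also be checked numerically. The sketch below is my own illustration for the symmetric case x = 1/2: instead of evaluating huge factorials it uses the Catalan recurrence C(n+1) = C(n) * 2(2n+1)/(n+2), so each probability is obtained from the previous one. The function name partial_sums is made up for this example.

```python
def partial_sums(n_terms):
    """Sum the first n_terms non-zero terms for x = 1/2.

    Returns (total probability so far, partial sum of steps * probability).
    """
    p = 0.5              # probability of finishing after 1 step (n = 0)
    total_prob = 0.0
    partial_mean = 0.0
    for n in range(n_terms):
        total_prob += p
        partial_mean += (2 * n + 1) * p
        # Catalan recurrence times the extra factor x*(1-x) = 1/4
        p *= (2 * n + 1) / (2 * (n + 2))
    return total_prob, partial_mean

for n_terms in (100, 10_000, 1_000_000):
    prob, mean = partial_sums(n_terms)
    print(n_terms, prob, mean)
```

The total probability creeps towards 1 (the walk hits the line eventually), while the weighted sum keeps growing as more terms are added, which is what a divergent average looks like.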
Vectorisation would result in much faster code, roughly 90K times faster. Here is a function that generates a trajectory on the 2D grid with unit steps, starting from (0,0), and returns the step at which it first hits the line y=1-x.
import numpy as np
def _random_walk_2D(sim_steps):
    """ Walk on 2D unit steps
    return x_sim, y_sim, trajectory, number_of_steps_first_hit to y=1-x """
    random_moves_x = np.insert(np.random.choice([1,0,-1], sim_steps), 0, 0)
    random_moves_y = np.insert(np.random.choice([1,0,-1], sim_steps), 0, 0)
    x_sim = np.cumsum(random_moves_x)
    y_sim = np.cumsum(random_moves_y)
    trajectory = np.array((x_sim,y_sim)).T
    y_hat = 1-x_sim # checking if hit y=1-x
    y_hit = y_hat-y_sim
    hit_steps = np.where(y_hit == 0)
    number_of_steps_first_hit = -1
    if hit_steps[0].shape[0] > 0:
        number_of_steps_first_hit = hit_steps[0][0]
    return x_sim, y_sim, trajectory, number_of_steps_first_hit
If number_of_steps_first_hit is -1, the trajectory did not hit the line.
A longer simulation with more repetitions might give the average behaviour, but the following run suggests that, when the walk does not escape to infinity, it hits the line after ~84 steps on average.
sim_steps= 5*10**3 # 5K steps
#Repeat
nrepeat = 40000
hit_step = [_random_walk_2D(sim_steps)[3] for _ in range(nrepeat)]
hit_step = [h for h in hit_step if h > -1]
np.mean(hit_step) # ~84 step
A much longer sim_steps will change the result, though.
PS:
Good exercise. I hope this wasn't homework; if it was, please cite this answer if you use it.
Edit
As discussed in the comments, the current _random_walk_2D allows moves in 8 directions. To restrict it to the cardinal directions, we could do the following filtering:
cardinal_x_y = [(t[0], t[1]) for t in zip(random_moves_x, random_moves_y)
                if np.abs(t[0]) != np.abs(t[1])]
random_moves_x = [t[0] for t in cardinal_x_y]
random_moves_y = [t[1] for t in cardinal_x_y]
This would slow the function down a bit, but it will still be much faster than the for-loop solutions.
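As an alternative to filtering, here is a sketch (not the code from the answer above) that draws cardinal moves directly and runs many walks at once. The names first_hit_steps and MOVES are made up for this example, and it uses NumPy's Generator API (np.random.default_rng) rather than np.random.choice.

```python
import numpy as np

# the four cardinal unit moves: right, left, up, down
MOVES = np.array([(1, 0), (-1, 0), (0, 1), (0, -1)])

def first_hit_steps(n_walks, sim_steps, rng=None):
    """For each walk, return the first step at which x + y == 1, or -1 if never."""
    rng = np.random.default_rng() if rng is None else rng
    choices = rng.integers(0, 4, size=(n_walks, sim_steps))
    steps = MOVES[choices]                      # shape (n_walks, sim_steps, 2)
    # x + y after each step; the line y = 1 - x is where this equals 1
    x_plus_y = steps.sum(axis=2).cumsum(axis=1)
    hit = x_plus_y == 1
    first = hit.argmax(axis=1) + 1              # argmax finds the first True
    first[~hit.any(axis=1)] = -1                # walks that never hit the line
    return first

hits = first_hit_steps(n_walks=2000, sim_steps=200)
print((hits > 0).mean())        # fraction of walks that hit within sim_steps
```

As with the original function, the mean over the walks that did hit depends strongly on sim_steps, so it should not be read as the true average.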

Converting If statement to a loop

I am working on a practice problem where we pass a list representing a tic tac toe board to a function, which returns the outcome of the board. That is, X wins, O wins, Draw, or None (empty string).
I have it solved, but I was wondering if there is a way I could turn my algorithm into a loop, as it was recommended to use a loop to compare each element of the main diagonal with all the elements of its intersecting row and column, and then check the two diagonals. I'm new to python, so my solution might be a bit longer than it needs to be. How could a loop be implemented to check the outcome of the tic tac toe board?
def gameState (List):
    xcounter=0
    ocounter=0
    if List[0][0]==List[0][1] and List[0][0]==List[0][2]:
        return List[0][0]
    elif List[0][0]==List[1][0] and List[0][0]==List[2][0]:
        return List[0][0]
    elif List[0][0]==List[1][1] and List[0][0]==List[2][2]:
        return List[0][0]
    elif List[1][1]==List[1][2] and List[1][1]==List[1][0] :
        return List[1][1]
    elif List[1][1]==List[0][1] and List[1][1]==List[2][1]:
        return List[1][1]
    elif List[1][1]==List[0][0] and List[1][1]==List[2][2]:
        return List[1][1]
    elif List[2][2]==List[2][0] and List[2][2]==List[2][1]:
        return List[2][2]
    elif List[2][2]==List[1][2] and List[2][2]==List[0][2]:
        return List[2][2]
    elif List[2][2]==List[1][1] and List[2][2]==List[0][0]:
        return List[2][2]
    for listt in List:
        for elm in listt:
            if elm=="X" or elm=="x":
                xcounter+=1
            elif elm=="O" or elm=="o":
                ocounter+=1
    if xcounter==5 or ocounter==5:
        return "D"
    else:
        return ''
First up, there are only eight ways to win at TicTacToe. You have nine compare-and-return statements so one is superfluous. In fact, on further examination, you check 00, 11, 22 three times (cases 3, 6 and 9) and totally miss the 02, 11, 20 case.
In terms of checking with a loop, you can split out the row/column checks from the diagonals as follows:
# Check all three rows and columns.
for idx in range(3):
    if List[0][idx] != ' ':
        if List[0][idx] == List[1][idx] and List[0][idx] == List[2][idx]:
            return List[0][idx]
    if List[idx][0] != ' ':
        if List[idx][0] == List[idx][1] and List[idx][0] == List[idx][2]:
            return List[idx][0]
# Check two diagonals.
if List[1][1] != ' ':
    if List[1][1] == List[0][0] and List[1][1] == List[2][2]:
        return List[1][1]
    if List[1][1] == List[0][2] and List[1][1] == List[2][0]:
        return List[1][1]
# No winner yet.
return ' '
Note that this ensures a row of empty cells isn't immediately picked up as a win by nobody. You need to check only for wins by a "real" player. By that, I mean you don't want to detect three empty cells in the first row and return an indication based on that if the second row has an actual winner.
Of course, there are numerous ways to refactor such code to make it more easily read and understood. One way is to separate out the logic for checking a single line and then call that for each line:
# Detect a winning line. First cell must be filled in
# and other cells must be equal to first.
def isWinLine(brd, x1, y1, x2, y2, x3, y3):
    if brd[x1][y1] == ' ': return False
    return brd[x1][y1] == brd[x2][y2] and brd[x1][y1] == brd[x3][y3]

# Get winner of game by checking each possible line for a winner,
# return contents of one of the cells if so. Otherwise return
# empty value.
def getWinner(brd):
    # Rows and columns first.
    for idx in range(3):
        if isWinLine(brd, idx, 0, idx, 1, idx, 2): return brd[idx][0]
        if isWinLine(brd, 0, idx, 1, idx, 2, idx): return brd[0][idx]
    # Then diagonals.
    if isWinLine(brd, 0, 0, 1, 1, 2, 2): return brd[1][1]
    if isWinLine(brd, 2, 0, 1, 1, 0, 2): return brd[1][1]
    # No winner yet.
    return ' '
Then you can just use:
winner = getWinner(List)
in your code and you'll either get back the winner or an empty indication if there isn't one.
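Taking the loop idea one step further, the eight lines can be generated as data and checked in a single loop. This is only a sketch of that variation, not part of the code above; the name get_winner and the empty parameter are made up here, and it assumes a square board that marks empty cells with ' '.

```python
def get_winner(board, empty=' '):
    """Return the symbol that fills a whole line, or `empty` if there is none."""
    n = len(board)
    # build every line as a list of (row, col) pairs
    lines = [[(r, c) for c in range(n)] for r in range(n)]      # rows
    lines += [[(r, c) for r in range(n)] for c in range(n)]     # columns
    lines.append([(i, i) for i in range(n)])                    # main diagonal
    lines.append([(i, n - 1 - i) for i in range(n)])            # anti-diagonal
    for line in lines:
        first_r, first_c = line[0]
        first = board[first_r][first_c]
        # a line only counts if it is filled with one real symbol
        if first != empty and all(board[r][c] == first for r, c in line):
            return first
    return empty

board = [['X', 'O', ' '],
         ['O', 'X', ' '],
         [' ', ' ', 'X']]
print(get_winner(board))
```

Because the lines are generated from the board size, the same function would work for boards larger than 3x3.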

How to deal with very big Bitboards

I'm working on a 2-player board game (e.g. connect 4), with parametric board size h, w. I want to check for winning condition using hw-sized bitboards.
In games like chess, where the board size is fixed, bitboards are usually represented with some sort of 64-bit integer. When h and w are not constant and may be very big (say 30*30), are bitboards a good idea? If so, are there any data types in C/C++ that can handle big bitboards while keeping their performance?
Since I'm currently working in Python, a solution in this language is appreciated too! :)
Thanks in advance
I wrote this code a while ago just to play around with the game concept. There is no intelligent behaviour involved, just random moves to demonstrate the game. I guess this is not important for you, since you are only looking for a fast check of the winning conditions. This implementation is fast because I did my best to avoid for loops and use only built-in python/numpy functions (with some tricks).
import numpy as np
row_size = 6
col_size = 7
symbols = {1:'A', -1:'B', 0:' '}

def has_four_in_a_row(line, P):
    # slide a window of 4 cells over the line and count P's cells in each window;
    # a count of 4 means that four consecutive cells belong to P
    window_counts = np.convolve((line == P).astype(int), np.ones(4, dtype=int))
    return window_counts.max() >= 4

def was_winning_move(S, P, current_row_idx,current_col_idx):
    #****** Column Win ******
    current_col = S[:,current_col_idx]
    if has_four_in_a_row(current_col, P):
        return True
    #****** Row Win ******
    current_row = S[current_row_idx,:]
    if has_four_in_a_row(current_row, P):
        return True
    #****** Diag Win ******
    offeset_from_diag = current_col_idx - current_row_idx
    current_diag = S.diagonal(offeset_from_diag)
    if has_four_in_a_row(current_diag, P):
        return True
    #****** off-Diag Win ******
    #here 1) reverse rows, 2) find new index, 3) find offset and proceed as diag
    reversed_rows = S[::-1,:] #1
    new_row_idx = row_size - 1 - current_row_idx #2
    offeset_from_diag = current_col_idx - new_row_idx #3
    current_off_diag = reversed_rows.diagonal(offeset_from_diag)
    if has_four_in_a_row(current_off_diag, P):
        return True
    return False

def move_at_random(S,P):
    #choose at random among the columns whose top cell is still empty
    open_cols = np.where(S[0,:] == 0)[0]
    selected_col_idx = np.random.choice(open_cols)
    #we should fill in the matrix from bottom to top, so find the topmost filled row in the column and fill the row above it
    last_filled_row = np.where(S[:,selected_col_idx] != 0)[0]
    #the column may still be empty, e.g. at the beginning of the game;
    #in that case we start with the bottom row
    if last_filled_row.size != 0:
        current_row_idx = last_filled_row[0] - 1
    else:
        current_row_idx = row_size - 1
    S[current_row_idx, selected_col_idx] = P
    return (S,current_row_idx,selected_col_idx)

def move_still_possible(S):
    return not (S[S==0].size == 0)

def print_game_state(S):
    B = np.copy(S).astype(object)
    for n in [-1, 0, 1]:
        B[B==n] = symbols[n]
    print(B)

def play_game():
    #initiate game state
    game_state = np.zeros((row_size,col_size),dtype=int)
    player = 1
    mvcntr = 1
    no_winner_yet = True
    while no_winner_yet and move_still_possible(game_state):
        #get player symbol
        name = symbols[player]
        game_state, current_row, current_col = move_at_random(game_state, player)
        #print current game state
        print_game_state(game_state)
        #check if the move was a winning move
        if was_winning_move(game_state,player,current_row, current_col):
            print('player %s wins after %d moves' % (name, mvcntr))
            no_winner_yet = False
        # switch player and increase move counter
        player *= -1
        mvcntr += 1
    if no_winner_yet:
        print('game ended in a draw')
        player = 0
    return game_state,player,mvcntr

if __name__ == '__main__':
    S, P, mvcntr = play_game()
Let me know if you have any questions.
UPDATE: Explanation:
At each move, look at the column, row, diagonal and secondary diagonal that pass through the current cell, and look for consecutive cells holding the current symbol. This avoids scanning the whole board.
Extracting the cells in each direction:
Column:
current_col = S[:,current_col_idx]
Row:
current_row = S[current_row_idx,:]
Diagonal:
Find the offset of the desired diagonal from the main diagonal:
diag_offset = current_col_idx - current_row_idx
current_diag = S.diagonal(diag_offset)
Off-diagonal:
Reverse the rows of the matrix:
S_reversed_rows = S[::-1,:]
Find the row index in the new matrix:
new_row_idx = row_size - 1 - current_row_idx
Then take the diagonal of the reversed matrix that passes through the new position:
offdiag_offset = current_col_idx - new_row_idx
current_offdiag = S_reversed_rows.diagonal(offdiag_offset)
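On the original question about large bitboards in Python: Python integers have arbitrary precision, so a single int can hold an h*w board of any size. The sketch below is not part of the answer above. It assumes the common connect-four layout in which each column takes h+1 bits (the extra top bit always stays 0 and keeps lines from wrapping into the next column), and the names bit and has_four are made up for this example.

```python
def bit(col, row, h):
    """Mask with the single bit for (col, row) set; row 0 is the bottom row."""
    return 1 << (col * (h + 1) + row)

def has_four(bits, h):
    """True if one player's bitboard contains four cells in a line."""
    # shift 1: vertical, h+1: horizontal, h and h+2: the two diagonals
    for shift in (1, h + 1, h, h + 2):
        pairs = bits & (bits >> shift)          # two neighbours in this direction
        if pairs & (pairs >> (2 * shift)):      # two such pairs, two cells apart
            return True
    return False

h, w = 30, 30
board = 0
for col in range(10, 14):
    board |= bit(col, 5, h)     # four cells side by side in row 5
print(has_four(board, h))
```

Each player gets their own integer, and the check costs a handful of shifts and ANDs regardless of the board size, although big-int operations are not constant time the way 64-bit ones are.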

Touring a chess board with multiple pieces [Python]

I have been trying to solve this for 2 days straight and I just can't find a valid algorithm. Given a chess board and a number of pieces, I have to check whether said board can be toured by the pieces, with the condition that each piece can only visit a square once. I know it's some kind of multiple backtracking, but I can't get it to work. (I have only been able to implement a general knight's tour for individual pieces.)
tablero is a class of the board, that holds a name, a list of pieces, a list with prohibited positions, a list with the free positions, and a tuple with the dimensions of the board.
ficha is the class of a piece, it holds a name (nombre), a tuple with its position (posicion), a list with its valid movements (movimientos) (For example, a pawn's list would be [ [0,1] ], meaning it can only move forward 1)
Any insight is welcome.
Here are the classes (Feel free to add/remove any method).
def legal(pos,dimensiones):
    # each coordinate is checked against its own dimension
    if pos[0] >= 0 and pos[0] < dimensiones[0] and pos[1] >= 0 and pos[1] < dimensiones[1]:
        return True
    else:
        return False

class board:
    def __init__(self,name,pieces,dimention,prohibited_positions):
        self.name = name
        self.pieces = pieces
        self.dimention = dimention
        self.prohibited_positions = prohibited_positions
        self.free_positions = []
        for x in range(dimention[0]):
            for y in range(dimention[1]):
                self.free_positions.append([x,y])
        for x,y in self.prohibited_positions:
            if [x,y] in self.free_positions:
                self.free_positions.remove([x,y])
        for piece in self.pieces:
            if piece.position in self.free_positions:
                self.free_positions.remove(piece.position)
    def append(self,piece):
        pos = piece.position
        if pos in self.free_positions:
            self.pieces.append(piece)
            self.free_positions.remove(pos)

class piece:
    def __init__(self,name,position,move_offsets):
        self.name=name
        self.position=position
        self.move_offsets=move_offsets
        self.possible_movements=move_offsets
    def setPos(self,pos):
        self.position=pos
    def ValidMovements(self,dim,free_positions,prohibited_positions):
        aux = []
        # keep the moves that stay on the board
        for i in self.possible_movements:
            newX = self.position[0] + i[0]
            newY = self.position[1] + i[1]
            newPos = [newX,newY]
            if legal(newPos,dim):
                aux.append(newPos)
        # drop the squares that are not free
        for i in list(aux):
            if i not in free_positions:
                aux.remove(i)
        return aux

Abstract Collision Detection

I've been struggling conceptually with how to implement simple square collision detection within a game I am writing while avoiding Pygame; I want to learn how to do it without cheating. The structure of the program as intended is this:
The game loads a text file containing a level. Each level consists of 25 rows of 25 digits (for a total of 625 digits). It is extracted into a 2D array to emulate a cartesian grid which will correspond with the screen. From there the program draws a 32x32 block at the proper place on the screen. For example, if the digit at location [2][5] is a 1, it will draw a white square at pixel coordinate (96,192) (the counting of the squares starts at zero since it is an array). It also generates a collision array consisting of True or False for each location corresponding to the original array.
I have a player object that moves freely along the grid, not confined to the 32x32 squares. My question is this: how would I implement square collision detection? I've tried a number of methods but I'm not quite sure where I'm getting stuck. I'll post my latest incarnation and the relevant code below.
Collision code:
def checkPlayerEnvCollision(self,player):
    p = player
    c = self.cLayer #this is the collision grid generated when loading the level
    for row in range(25):
        for col in range (25):
            print("checkEnvCollision")
            if c[row][col] != False:
                tileleftx = row*32
                tilerightx = tileleftx + 32
                tilelefty = col*32
                tilerighty = tilelefty+32
                if (abs(tileleftx - p.x) * 2 < (tilerightx + (p.x + 32))) and (abs(tilelefty - p.y) * 2 < (tilerighty + (p.y + 32))):
                    print("OH NO, COLLISION")
The code that loads the tiles from the text file into the array:
def loadLevel(self, level):
    print("Loading Level")
    levelFile = open(level)
    count=0
    for line in levelFile:
        tempArray = []
        if line.startswith("|"):
            dirs = line.split('|')
            self.north = dirs[1]
            self.south = dirs[2]
            self.east = dirs[3]
            self.west = dirs[4]
            continue
        for item in line:
            if item in self.tileValues:
                tempArray.append(int(item))
        self.tileLayer[count] = tempArray
        count+=1
    for items in self.tileLayer:
        if len(items) > 25:
            print("Error: Loaded Level Too Large")
    count = 0
    for line in self.tileLayer:
        tempArray = []
        for item in line:
            if self.tilePassableValues[item] == False:
                tempArray.append(False)
            else:
                tempArray.append(True)
        self.collisionLayer[count] = tempArray
        count += 1
Not sure if this is useful, but here is a simple demonstration of the drawing method:
def levelTiles(self, level):
    row = 0
    for t in level:
        col = 0
        for r in t:
            color = "white"
            if r == 0:
                col+=1
                continue
            elif r == 1:
                color = "red"
            elif r == 2:
                color = "white"
            elif r == 3:
                color = "green"
            self.Canvas.create_rectangle(row*32, col*32, row*32+32, col*32+32, fill=color, width=1,tags='block')
            col += 1
        row += 1
Lastly, here is the text file I have been testing it with:
1111111111111111111111111
1222222222222222222222221
1222222222222222222222221
1222222222222222222222221
1222222222222222222222221
1222222222222222222222221
1222233332222222222222221
1222233332222222222222221
1222222222222222222222221
1222222222222222222222221
1222222222222222222222221
1222222222222222222222221
1222222222222222222222221
1222222222233332222222221
1222222222333332332222221
1222222222222222332222221
1222222222222222332222221
1222222222222222222222221
1222222222222222222222221
1222222222222222222222221
1222222222222222222222221
1222222222222222222222221
1222222222222222222222221
1222222222222222222222221
1111111111111111111111111
|onescreen2|onescreen2|onescreen2|onescreen2
(The last line is what will load the map to the north, south, east and west when reaching the edge of the level; you can ignore it.)
Thanks for the help. It's a lot to ask, but I'm stuck on this one!
If the player is tied to the grid, why not just test the grid positions:
if grid[player.x][player.y] == some_collidable_thing:
    pass  # there was a collision
If not, I also provided an answer to an almost identical question:
def check_col(self, rect):
    # element_width and element_height are the tile size in pixels (32 in the question)
    for row_idx, row in enumerate(self.cLayer):
        for col_idx, cell in enumerate(row):
            # assumption: a truthy entry marks a tile that can be collided with
            if not cell:
                continue
            grid_position = (row_idx*element_width, col_idx*element_height)
            # the rectangles overlap on an axis only if both conditions hold
            collide_x = (rect.x + rect.w > grid_position[0] and
                         rect.x < grid_position[0] + element_width)
            collide_y = (rect.y + rect.h > grid_position[1] and
                         rect.y < grid_position[1] + element_height)
            # act on a collision on both axes
            if collide_x and collide_y:
                return True
    # no tile overlapped the rectangle
    return False
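Since the tiles sit on a regular grid, it is not necessary to visit all 625 of them. The sketch below is an illustration under a few assumptions: TILE = 32 as in the question, the player is a 32x32 square with its top-left corner at (px, py), and solid[i][j] is True when the tile whose top-left corner is (i*TILE, j*TILE) blocks movement, which matches the question's row*32 for x. The names rects_overlap and player_hits_solid are made up here.

```python
TILE = 32  # tile size in pixels, as in the question

def rects_overlap(ax, ay, aw, ah, bx, by, bw, bh):
    """Axis-aligned overlap test; rectangles that only touch do not overlap."""
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def player_hits_solid(px, py, solid, size=TILE):
    """Check only the tiles the player's square can reach."""
    n_i, n_j = len(solid), len(solid[0])
    # index range of the tiles under the player's bounding box
    i0, i1 = int(px // TILE), int((px + size) // TILE)
    j0, j1 = int(py // TILE), int((py + size) // TILE)
    for i in range(max(i0, 0), min(i1, n_i - 1) + 1):
        for j in range(max(j0, 0), min(j1, n_j - 1) + 1):
            if solid[i][j] and rects_overlap(px, py, size, size,
                                             i * TILE, j * TILE, TILE, TILE):
                return True
    return False

# a 3x3 room: walls all around, free tile in the middle
solid = [[True, True, True],
         [True, False, True],
         [True, True, True]]
print(player_hits_solid(32, 32, solid))   # player exactly on the free tile
print(player_hits_solid(33, 32, solid))   # one pixel into the wall
```

At most four tiles are examined per check, so the cost no longer depends on the size of the level.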
An easier way to do this would be to define vectors for the player's movement, and lines for the boundaries of the objects. Then you check whether the vector collides with any line (there should not be many lines to check) as follows (I'm assuming that the player/object can be on the boundary of the other object):
Take the triangle formed by the movement vector and one endpoint of the line you're checking, and compute its signed area via a determinant. Compare it with the signed area of the triangle formed with the other endpoint. If both are positive or both are negative, there is no intersection. If their signs differ, there MIGHT be an intersection.
If their signs differ, do the same thing as above, but using the endpoints of the movement vector instead of the endpoints of the line (and the whole line instead of the movement vector).
If these signs differ as well, then there is definitely an intersection; if they are the same, there is no intersection.
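A minimal sketch of that test, assuming points are (x, y) tuples; the function names are made up here. The determinant gives twice the signed area of the triangle, and only its sign matters. Touching counts as no intersection, in line with the assumption that the player may sit on a boundary.

```python
def signed_area2(a, b, c):
    """Twice the signed area of triangle abc (the 2x2 determinant)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def segments_cross(p1, p2, q1, q2):
    """True if segment p1-p2 (the movement) strictly crosses segment q1-q2."""
    # endpoints of the line relative to the movement vector
    d1 = signed_area2(p1, p2, q1)
    d2 = signed_area2(p1, p2, q2)
    # endpoints of the movement vector relative to the line
    d3 = signed_area2(q1, q2, p1)
    d4 = signed_area2(q1, q2, p2)
    # both pairs must have opposite signs
    return d1 * d2 < 0 and d3 * d4 < 0

print(segments_cross((0, 0), (2, 2), (0, 2), (2, 0)))   # the two diagonals of a square
print(segments_cross((0, 0), (1, 1), (2, 0), (2, 5)))   # movement stops short of the wall
```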
I hope this helps (and just comment if it does not make sense).
