Improving minimax algorithm for Tic Tac Toe


I have coded a minimax algorithm with alpha-beta pruning for a 4x4 Tic Tac Toe, in which the player who gets 3 marks in a row wins. However, it seems that the first player to move always wins this game, so minimax doesn't even try to make it harder for that player to win, because no possible move can improve the outcome (minimax assumes optimal play from both sides). Therefore, I added the condition that the algorithm chooses the best move that also maximizes the game length (while assuming that the other player tries to win in the fewest moves). I tried to do that by adding two more "alpha" and "beta" variables, but for the game length.
My game worked fine without this new condition, but when I add it, the algorithm no longer avoids losing, which minimax should have prevented. Below is the part of the code which solves for the optimal move.
Note that I used 1 and -1 to represent the players, that is, o = 1 and x = -1, and these values are also the score of that player's win. A draw returns 0.
def optimalmove(board, player, alpha, beta, optlength, movementlength, worlength):
    stator = checkstate4(board)  # Contains the state of the game. stator[0] is whether the game has reached the end or not,
    # stator[1] is who has won (or draw), stator[2] are the possible moves
    if len(stator[2]) == boardsize ** 2:
        # if it is the first move, pick the corner as it has already been calculated
        # to be the optimal, in order to reduce time
        return [0, [0, 0], movementlength]
    if stator[0]:  # If at the end return which one has won
        return [stator[1], [], movementlength]
    movement = []  # best movement
    for move in stator[2]:  # loops through each possible move
        localboard = deepcopy(board)  # creates a copy of the board to not change the original
        localboard[move[0]][move[1]] = player
        quality = optimalmove(localboard, player * (-1), alpha, beta, optlength, movementlength + 1, worlength)
        # since it made a move, increase the move list length by one
        if player == o:  # 'O' maximizes the score and wants to reduce the length of the game
            if (quality[0] > alpha) or ((quality[0] >= alpha) and (quality[2] < worlength)):
                # If it finds a better move, so be it. But if it finds a move as good as the previously best seen,
                # take the one that will reach the end the fastest
                alpha = quality[0]
                movement = move
                worlength = quality[2]  # set the shortest game length seen to the current one
        elif player == x:  # 'X' minimizes the score and wants to increase the length of the game
            if (quality[0] < beta) or ((quality[0] <= beta) and (quality[2] > optlength)):
                # If it finds a better move, so be it. But if it finds a move as good as the previously best seen,
                # take the one that will reach the end the longest
                beta = quality[0]
                movement = move
                optlength = quality[2]  # set the longest game length seen to the current one
        if alpha >= beta:
            # prunes
            break
    if player == o:
        return [alpha, movement, worlength]
    else:
        return [beta, movement, optlength]

Related

NEAT AI not controlling individual genomes in population

I am trying to implement NEAT in a Pong game (made with pygame), but the individual paddles (called geeks here) don't seem to behave the way they should.
I have a population size of 5, with 5 input nodes and 3 output nodes.
Since the ball's speed and angle are constant, the only inputs I have are the ball's x/y position, the ball's x/y direction, and the geek's y position.
The outputs make the geek go up, go down, or stay stationary.
I use only a single ball. Every time the ball reaches the x position where a geek would hit it back, instead of using something like colliderect, I check whether the ball is outside the y range covered by the geek. If it is, that geek should get removed from the population of the current generation.
Removing the geeks from the population seems to be the problem. It looks like I am having trouble addressing each geek individually. I checked this by turning a geek that is being removed from the generation white instead of its initial green, but they all turn white and not all of them get removed. In fact, on the first miss 3 get removed from the population, then 1 more, and then 1 more, before all are gone and the next generation starts. This happens every time.
Furthermore, the random behavior I expected in early generations never occurs. The geeks always make the same odd movements. They won't split up; they all stay in the same spot.
I tried splitting up the for loops handling the movement, rendering, and my pseudo-collision detection, but the geeks still seem to behave as one entity. I am unsure what to do.
This is the link to the full project and config file: https://www.mediafire.com/folder/baqsg6ddv5cv6/Pong
Side Note: I got the initial pong game from geeksforgeeks.org
Snippet of my code handling NEAT:
def remove(index):
    AIgeeks.pop(index)
    ge.pop(index)
    nets.pop(index)
# Game Manager
def eval_genomes(genomes, config):
    running = True
    global geekAIYFac
    geekAIYFac = 0  # sets initial movement of future geeks controlled by NEAT to 0
    # Defining the objects
    geek1 = Striker(20, 0, 10, 100, 10, GREEN)  # purely visual opponent, will lock onto Y of the ball
    geekAI = Striker(WIDTH-30, 0, 10, 50, 7, GREEN)  # is the object that is the paddles (here they're called geeks) that will be controlled by NEAT
    ball = Ball(WIDTH//2, HEIGHT//2, 7, 7, WHITE)
    global AIgeeks, nets, ge, previous_time, current_time
    AIgeeks = []
    nets = []
    ge = []
    # Initial parameters of the players
    for genome_id, genome in genomes:  # basic setup to fill AIgeeks array and assign networks
        AIgeeks.append(geekAI)
        ge.append(genome)
        net = neat.nn.FeedForwardNetwork.create(genome, config)
        nets.append(net)
        genome.fitness = 0
    while running:
        screen.fill(BLACK)
        # Event handling
        for event in pygame.event.get():
            if event.type == pygame.QUIT:
                running = False
                pygame.quit()
                quit()
        if len(AIgeeks) == 0:  # checks if the list of geeks is empty -> breaks, starts new generation
            previous_time = pygame.time.get_ticks()  # gets time at the end of generation so it can be subtracted from the total time for fitness evaluation later
            break
        for i, geekAI in enumerate(AIgeeks):
            output = nets[i].activate((ball.posy, ball.posx, geekAI.posy, ball.yFac, ball.xFac))  # takes in the input nodes: geeks y position, balls position (x and y), if ball going left/right or up/down (that's the yFac/xFac, speed is constant, that's why no input node for that needed)
            if output[0] > 0.5:  # defines first output nodes: output 0 makes geek go up, output 1 makes geek go down, output 2 makes geek stationary
                geekAIYFac = -1
            if output[1] > 0.5:
                geekAIYFac = 1
            if output[2] > 0.5:
                geekAIYFac = 0
            fitness_time = (pygame.time.get_ticks() - previous_time) / 50  # sets the fitness for each geek to the current time subtracted by the time when the generation started, divided by 50 for smaller numbers
            ge[i].fitness = fitness_time
        for i, geekAI in enumerate(AIgeeks):
            if ball.posx == WIDTH - 30:  # checks if ball is on same X as geeks
                if geekAI.posy >= ball.posy or geekAI.posy <= ball.posy - geekAI.height:  # checks if the ball is outside the Y range covered by each geek, if so ball wouldn't be hit in a real game -> geek gets terminated for this generation
                    geekAI.color = WHITE  # sets colour to white before technically removing object
                    remove(i)
            geekAI.update(geekAIYFac)  # renders every geek on screen from AIgeeks list
            geekAI.display()
        for geek in AIgeeks:
            print(geek.color)
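A possible explanation for the paddles acting as one entity, assuming the snippet above is representative of the project: `AIgeeks.append(geekAI)` appends the same `Striker` object once per genome, so every list entry refers to one paddle, and `geekAIYFac` is a single global shared by all networks. Popping from a list inside a forward `enumerate` loop also skips elements. The sketch below illustrates both effects with a stand-in `Paddle` class and a `remove_missed` helper; both names are hypothetical and not part of the project.

```python
class Paddle:
    """Hypothetical stand-in for the project's Striker class."""
    def __init__(self, y):
        self.y = y

    def update(self, direction):
        self.y += direction

# Appending one object several times stores several references to it.
shared = Paddle(0)
aliased = [shared for _ in range(3)]
aliased[0].update(1)          # moves "all three", because they are one object

# Creating a new object per genome keeps the paddles independent.
independent = [Paddle(0) for _ in range(3)]
independent[0].update(1)      # only the first paddle moves

def remove_missed(paddles, missed):
    """Remove the paddles whose index is in `missed`, iterating backwards so
    earlier pops do not shift the indices that are still to be visited."""
    for i in reversed(range(len(paddles))):
        if i in missed:
            paddles.pop(i)
    return paddles
```

With five paddles that all miss, popping inside a forward `enumerate` loop removes the entries at original indices 0, 2, and 4 on the first pass, then one on each later pass, which matches the 3, 1, 1 pattern described in the question. In the real loop, the matching entries in `ge` and `nets` would need to be removed alongside each paddle, and each paddle would need its own movement value instead of the shared global.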

Minimax algorithm incorrectly acting upon recognizing win/loss scenarios

The following minimax algorithm in python is meant to find the values of possible moves in a given connect four board. It is called by another function, computer_move, which is also shown below.
Expected behavior: minimax will return evaluations of a position based on a board state and a player.
If the player is one, it returns the evaluation of the highest value move.
If the player is two, it returns the evaluation of the lowest value move.
minimax is called by computer_move, which gets the returned values and chooses the best from them.
Actual behavior: computer_move seems to successfully choose the best move from minimax's returned values, but minimax does not properly evaluate different moves. Specifically, winning cases or cases where a win can be prevented are not properly evaluated. Intermediate situations have unknown behavior because the static evaluation function has not been implemented yet. However, in cases where either player is one move away from winning, the algorithm fails to react correctly.
I've tried switching various signs and swapping mins/maxes in both functions, but this did not seem to resolve the issue. Careful use of print statements also showed me that computer_move is correctly processing returned values, but minimax is returning incorrect values, suggesting that there's some error with the algorithm. However, it seems to be a textbook minimax algorithm, at least as far as I can see.
Does anyone have some suggestions for what the issue might be? Thanks!
Minimax function:
def minimax(board, depth, alpha, beta, player, move):
    # Base cases
    if check_winning(board, 1) or check_winning(board, 2):
        player = player*2-3  # Convert to +/-1
        return player*math.inf
    elif depth==0:
        return static_eval(board)
    # Player 1 finds highest value move
    if player==1:
        max_eval = -math.inf
        for move in range(COLUMNS):
            if valid_move(board, move):
                make_move(board, move, player)
                eval = minimax(board, depth-1, alpha, beta, 2, move)
                unmake_move(board, move)
                max_eval = max(max_eval, eval)
                alpha = max(alpha, eval)
                if beta <= alpha:
                    break
        return max_eval
    # Player 2 finds lowest (for player one) value move
    else:
        min_eval = math.inf
        for move in range(COLUMNS):
            if valid_move(board, move):
                make_move(board, move, player)
                #print_board(board)
                eval = minimax(board, depth-1, alpha, beta, 1, move)
                unmake_move(board, move)
                min_eval = min(min_eval, eval)
                beta = min(beta, eval)
                if beta <= alpha:
                    break
        return min_eval
Calling the minimax function and implementing the best move:
# Get list of move values, and choose highest/lowest depending on player
def computer_move(board, player, difficulty=3):
    move_vals=[]
    for move in range(COLUMNS):
        if(valid_move(board, move)):
            move_vals.append(minimax(board,
                                     difficulty,
                                     -math.inf,
                                     math.inf,
                                     player,
                                     move))
        else:
            move_vals.append(-math.inf)
    min_val = min(move_vals)
    max_val = max(move_vals)
    if(player==1):
        move = move_vals.index(max_val)
    else:
        move = move_vals.index(min_val)
    make_move(board, move, player)
    print('Here is the computer\'s move:')
    print_board(board)

One problem is here:
    if check_winning(board, 1) or check_winning(board, 2):
        player = player*2-3  # Convert to +/-1
        return player*math.inf
In your case you are sending back a score of -infinity if player 1 is winning and it is their turn. Split the cases into separate statements instead, which will also make it easier to debug. Also, you don't need depth == 0; you can check whether the board is full instead, in which case it is a draw.
    if check_winning(board, current_player):
        return math.inf
    if check_winning(board, the_other_player):
        return -math.inf
    if is_board_full(board):
        return 0
You probably also want to include the depth so that the search always goes for the shortest possible win or the longest possible loss, and you don't need values as large as inf:
    if check_winning(board, current_player):
        return (10+depth)
    if check_winning(board, the_other_player):
        return -(10+depth)
    if is_board_full(board):
        return 0

It turns out there were two issues:
Because the minimax function is hardcoded to have player one maximize and player two minimize, player one winning must return a positive value, and player two winning must always return a negative value
The computer_move function needs to add the potential moves to the board before calling the minimax function, and unmake them after
Here is the updated code:
def minimax(board, depth, alpha, beta, player, previous_move):
    # Change 1
    if check_winning(board, 1):
        return 100+depth
    if check_winning(board, 2):
        return -(100+depth)
    elif depth==0:
        return static_eval(board)
    # Player 1 finds highest value move
    if player==1:
        max_move_eval = -math.inf
        for move in range(COLUMNS):
            if valid_move(board, move):
                make_move(board, move, player)
                move_eval = minimax(board, depth-1, alpha, beta, 2, move)
                unmake_move(board, move)
                max_move_eval = max(max_move_eval, move_eval)
                alpha = max(alpha, move_eval)
                if beta <= alpha:
                    break
        return max_move_eval
    # Player 2 finds lowest (for player one) value move
    else:
        min_move_eval = math.inf
        for move in range(COLUMNS):
            if valid_move(board, move):
                #print_board(board)
                make_move(board, move, player)
                move_eval = minimax(board, depth-1, alpha, beta, 1, move)
                unmake_move(board, move)
                min_move_eval = min(min_move_eval, move_eval)
                beta = min(beta, move_eval)
                if beta <= alpha:
                    break
        return min_move_eval
def computer_move(board, difficulty, player):
    other_player = int(not bool(player-1))+1
    move_vals=[]
    for move in range(COLUMNS):
        if(valid_move(board, move)):
            # Change 2
            make_move(board, move, player)
            move_vals.append(minimax(board,
                                     difficulty,
                                     -math.inf,
                                     math.inf,
                                     other_player,
                                     move))
            unmake_move(board, move)
        else:
            # Invalid column: give it the worst value for this player so it is never chosen
            # (player 1 picks the maximum, player 2 picks the minimum)
            if(player==1):
                move_vals.append(-math.inf)
            if(player==2):
                move_vals.append(math.inf)
    min_val = min(move_vals)
    max_val = max(move_vals)
    if(player==1):
        move = move_vals.index(max_val)
    else:
        move = move_vals.index(min_val)
    make_move(board, move, player)
    print('Here is the computer\'s move:')
    print_board(board)

Minimax algorithm and checkers game

I am implementing a Checkers game using the Minimax algorithm in Python. There are two players, and both are computers. I looked for a solution to a similar problem but could not find one, and I have been struggling with this for a few days. My entry point is this function:
def run_game(board):
    players = board.players
    is_move_possible = True
    move = 0
    while is_move_possible:
        is_move_possible = move_piece_minimax(board, players[move % 2])
        move += 1
It starts the game and calls the next function, which is supposed to make the best move for the first player based on the MiniMax algorithm. After that first move, it calls the function for the second player, and the loop ends once the game is won by one of the players. The function looks as follows:
def move_piece_minimax(board, player):
    best_move = minimax(copy.deepcopy(board), player, 0)
    if best_move.score == +infinity or best_move.score == -infinity:
        return False
    move_single_piece(board.fields, player, best_move)
    return True
The first line calls the MiniMax algorithm, which I will describe later, and is supposed to find the best possible move for the player. I am passing a deep copy of the whole board here, as I do not want the original one to be edited during execution of the MiniMax algorithm. The condition checks for a win, that is, whether either the maximizing or the minimizing player has won. If neither has won, best_move is performed. Moving to the main problem, I implemented the MiniMax algorithm as follows:
def minimax(board, player, depth):
    best_move = Move(-1, -1, -infinity if player.name == PLAYER_NAMES['P1'] else +infinity)
    if depth == MAX_SEARCH_DEPTH or game_over(board):
        score = evaluate(board)
        return Move(-1, -1, score)
    for correct_move in get_all_correct_moves(player, board.fields):
        x, y, piece = correct_move.x, correct_move.y, correct_move.piece
        move_single_piece(board.fields, player, correct_move)
        player_to_move = get_player_to_move(board, player)
        move = minimax(board, player_to_move, depth + 1)  # <--- here is a recursion
        move.x = x
        move.y = y
        move.piece = piece
        if player.name == PLAYER_NAMES['P1']:
            if move.score > best_move.score:
                best_move = move  # max value
        else:
            if move.score < best_move.score:
                best_move = move  # min value
    return best_move
I decided that player 'P1' is a maximizing player and player 'P2' is a minimizing one. Starting from the first line, best_move variable holds a reference to a Move object which has the following fields:
class Move:
    def __init__(self, x, y, score, piece=None):
        self.x = x
        self.y = y
        self.score = score
        self.piece = piece
I am initializing best_move.score to -Infinity in case of the maximizing player and to +Infinity otherwise.
The first condition checks whether depth has reached the maximum level (for testing purposes it is set to 2) or the game is over. If so, it evaluates the current board and returns a Move object holding the board's score. Otherwise, my algorithm looks up all legal moves for the player and performs the first one.
After performing it, the function calls itself recursively with an incremented depth and the other player to move. The recursion continues with changing parameters until the first if condition is met.
Once execution reaches that branch, the board's evaluation score is returned and then, in the for loop after the recursive call, the coordinates x, y and the piece that was moved are assigned to the Move object.
The last conditions check whether the new score is better for that specific player. For the maximizing player, P1 in my case, it checks whether the new score is higher than the previous one. For the minimizing player, the algorithm looks for the lowest score.
After trying all legal moves for that player, my algorithm should return one best_move.
Expected result
A single object of the Move class with coordinates x and y, the evaluated board score, which is +Infinity/-Infinity only when one of the players has won, and an object of the Piece class which will be moved to the [x, y] coordinates.
Actual result
A single object of the Move class with coordinates x and y and an evaluated board score equal to +Infinity after the first call of the MiniMax function. None of the pieces has changed position, so the game is not over yet. However, the score is +Infinity, so move_piece_minimax() returns False, meaning no more moves are possible. Therefore, my program stops with no changes on the board. Here is the screenshot of the initial and final board states; nothing changes during execution, as the first call returns +Infinity.
My question is: what did I miss in my implementation of the MiniMax algorithm? Did I make a mistake? I am also open to any code improvements or suggestions. If you need any additional functions to understand my implementation, I will provide them. Thank you!

In the minimax function, you should do one of the following:
1. Make a copy of your board before placing pieces
2. Remove the placed piece after recursively calling your minimax function
Otherwise, every recursive call keeps filling the same board with pieces, and you'll eventually be told that there are no moves left. Minimax searches in depth by placing pieces, so you need a mechanism that keeps it from modifying your original board.
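The second option (make the move, recurse, then undo it) can be sketched on a much smaller game than checkers. The example below assumes a toy take-away game, chosen only for illustration: players alternately remove 1 to 3 stones from `piles[0]`, and whoever takes the last stone wins. The `minimax` signature here is hypothetical and unrelated to the one in the question.

```python
def minimax(piles, maximizing):
    """Return +1 if the maximizing player wins with best play, else -1.

    The same list object is shared by every recursive call, so each move
    must be undone before the next candidate move is tried.
    """
    if piles[0] == 0:
        # The previous player took the last stone, so the side to move has lost.
        return -1 if maximizing else 1
    scores = []
    for take in (1, 2, 3):
        if take > piles[0]:
            break
        piles[0] -= take                               # make the move
        scores.append(minimax(piles, not maximizing))
        piles[0] += take                               # undo the move
    return max(scores) if maximizing else min(scores)
```

After the search returns, the list passed in is unchanged, which is the property the checkers search is missing: the deep copy made in `move_piece_minimax` protects the real board, but every branch of the recursion still plays on the same copy.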

Pygame Raycasting for line of sight

I am making a 2d top-down shooter game and ideally I would like the enemies to only shoot at the player when they see him/her (so the player could hide behind a crate etc.)
I have done research and I think the best way to do this would be raycasting. I have not been able to find a good example of raycasting in pygame.
Alternatively, I saw this piece of code on a different stackoverflow question ( Pygame Line of Sight from Fixed Position )
def isInLine(player, person):
    deltaX = person[0] - player[0]
    deltaY = person[1] - player[1]
    if (person[0] == player[0]) or (person[1] == player[1]) or (abs(deltaX) == abs(deltaY)):
        return True
but I am not sure whether it would accomplish what I want, and if it would, I'm not sure how I would implement it.
What I am asking is, firstly, whether this code would accomplish what I want and, if so, how I would implement it, and secondly, whether there is a better way to do it.

I am assuming the variables 'player' and 'person' are the positions of the player and enemy? If so, the code you have added checks whether the two objects:
are in the same x position (person[0] == player[0])
are in the same y position (person[1] == player[1])
have equal x and y differences, i.e. the objects are at 45 degrees to each other ( abs(deltaX) == abs(deltaY) ).
This doesn't seem like what you want, however.
What might work is to check whether:
the angle between the enemy and barrier is equal to the angle between the enemy and the player. One way to do that is to use tan(angle) = opposite / adjacent, or deltaY / deltaX.
the enemy is further from the player than from the barricade. This can be done using Pythagoras' theorem.
Here is a function for this which might help:
import math
def isInLine(enemy_pos, player_pos, barrier_pos):
    # get x and y displacements from enemy to objects
    playerDX = player_pos[0] - enemy_pos[0]
    playerDY = player_pos[1] - enemy_pos[1]
    barrierDX = barrier_pos[0] - enemy_pos[0]
    barrierDY = barrier_pos[1] - enemy_pos[1]
    # atan2 avoids a division by zero when deltaX is 0 and tells opposite directions apart;
    # the result is converted to degrees, as math uses radians
    playerAngle = math.degrees(math.atan2(playerDY, playerDX))
    barrierAngle = math.degrees(math.atan2(barrierDY, barrierDX))
    # use pythagoras to find distances
    playerDist = math.sqrt(playerDX**2 + playerDY**2)
    barrierDist = math.sqrt(barrierDX**2 + barrierDY**2)
    return (playerAngle == barrierAngle) and (playerDist > barrierDist)
So if the angles of the player and barrier from the enemy are equal, they lie along the same line. If the enemy is also further from the player than from the barricade, the player is behind the barricade from the enemy's point of view.
EDIT: Actually this will only work if the line from the enemy to the barrier is exactly the same as the line from the enemy to the player. It might need editing to take the extent of the barrier into account.
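To take the size of the barrier into account, one option is to test whether the straight segment from the enemy to the player crosses the barrier's rectangle. The sketch below is plain Python with no pygame dependency and assumes barriers are axis-aligned rectangles given as `(left, top, width, height)` tuples; `segment_hits_rect` and `has_line_of_sight` are hypothetical helper names, not pygame functions.

```python
def segment_hits_rect(p0, p1, rect):
    """Return True if the segment p0 -> p1 touches the axis-aligned
    rectangle rect = (left, top, width, height)."""
    x0, y0 = p0
    x1, y1 = p1
    left, top, width, height = rect
    right, bottom = left + width, top + height
    dx, dy = x1 - x0, y1 - y0
    t_enter, t_exit = 0.0, 1.0   # part of the segment still inside both slabs
    for delta, start, low, high in ((dx, x0, left, right), (dy, y0, top, bottom)):
        if delta == 0:
            # Segment is parallel to this slab, so it must start inside it.
            if start < low or start > high:
                return False
        else:
            t_a = (low - start) / delta
            t_b = (high - start) / delta
            t_enter = max(t_enter, min(t_a, t_b))
            t_exit = min(t_exit, max(t_a, t_b))
            if t_enter > t_exit:
                return False
    return True

def has_line_of_sight(enemy_pos, player_pos, barriers):
    """True if no barrier rectangle blocks the straight line between the two."""
    return not any(segment_hits_rect(enemy_pos, player_pos, r) for r in barriers)
```

For example, with a crate at `(40, 40, 20, 20)`, an enemy at `(0, 50)` cannot see a player at `(100, 50)`, but can see one at `(30, 50)`. In pygame 2, `Rect.clipline` performs a similar segment-versus-rectangle test and could likely replace the first helper.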

Chomp Game- do I have winning strategy?

For those who don't know the game: http://en.wikipedia.org/wiki/Chomp
Short review: you have an n x m chocolate bar with one poisoned cube in the bottom-left corner.
The player's goal is to avoid eating that cube. On each turn, a player chooses one cube and eats it together with all the cubes above it and to its right.
I am given an n x m matrix and a list that represents the configuration of the game.
Each index holds the number of cubes in that column. For example: [3,3,2,2] (the player chose to eat the second cube from the right)
P=poisoned
X X
X X X X
P X X X
(the numbers are just for the order)
I need to recursively determine whether there is a winning strategy for the player whose turn it is. I thought about this: if I know that after my move the opponent has no winning strategy, then I have a winning strategy. The stopping conditions are: if only one cube is left, return false; if two cubes are left, return true. I'm just having a hard time working out how to write the recursion so that it answers that question.
I thought about calculating all the moves that the player can make, but I couldn't implement that solution... any help?
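The idea described above (a position is winning if some move leaves the opponent in a position with no winning strategy) can be sketched as a memoized recursion. This is one possible formulation, not a reference solution: it assumes the configuration is a tuple of column heights with the poisoned cube at column 0, row 0, and `has_winning_strategy` is a hypothetical name.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def has_winning_strategy(columns):
    """columns: tuple of column heights, left to right.
    Return True if the player to move can force a win."""
    for i, height in enumerate(columns):
        for j in range(height):
            if i == 0 and j == 0:
                continue  # eating the poisoned cube is never a winning move
            # Eating cube (i, j) removes every cube at row >= j in columns >= i.
            after = columns[:i] + tuple(min(h, j) for h in columns[i:])
            if not has_winning_strategy(after):
                return True  # this move leaves the opponent without a strategy
    # Either only the poison is left, or every safe move lets the opponent win.
    return False
```

The single base case covers the "one cube left" rule from the question: when only the poisoned cube remains, the loops find no safe move and the function returns False. The "two cubes left" case then follows from the recursion instead of needing its own rule. A list such as `[3, 3, 2, 2]` would be passed as `tuple([3, 3, 2, 2])` so that results can be cached.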
