I'm trying to write a game in pygame, involving a moving object with a "turret" that swivels to follow a mouse. As of now, I'm mostly trying to expand upon examples, so the code's not entirely mine (credit to Sean J. McKiernan for his sample programs); however, this portion is. Below is my code; I use the center of rect (the "base" shape and the point around which the "turret" swivels) as the base point, and the position of the mouse as the other point. By subtracting the mouse's displacement from the displacement of the "center," I effectively get a vector between the two points and find the angle between that vector and the x-axis with atan2. Below is the code I use to do that:
def get_angle(self, mouse):
x_off = (mouse[0]-self.rect.centerx)
y_off = (mouse[1]-self.rect.centery)
self.angle = math.degrees(math.atan2(-y_off, x_off) % 2*math.pi)
self.hand = pg.transform.rotate(self.original_hand, self.angle)
self.hand_rect = self.hand.get_rect(center=self.hand_rect.center)
According to multiple tutorials I've reviewed, this SHOULD be the correct code; however, I discovered later that those tutorials (and, in fact, this tutorial) were all for Python 2.7, while I am trying to write in Python 3.6. I don't think that should make a difference in this scenario, though. As it stands, the view appears to depend entirely upon the "character's" position on the screen. If the "character" is in one corner, the reaction of the "turret" is different than the reaction if the "character" is in the middle of the screen. However, this shouldn't matter; the position of the "character" relative to the mouse is the exact same no matter where on the screen they are. Any ideas, or do I need to supply more code?
Edit: Apparently, more code is required. Rather than attempt to extricate only the entirely necessary parts, I've provided the entire code sample, so everyone can run it. As a side note, the "Bolt" things (intended to fire simple yellow blocks) don't work either, but I'm just trying to get the arm working before I start in on debugging that.
Edit the second: I have discovered that the "Bolt" system works within a certain distance of the origin (0,0 in the window coordinate system), and that the arm also works within a much lesser distance. I added the line Block(pg.Color("chocolate"), (0,0,100,100)) under the "walls" grouping as a decision point, and the block was positioned in the top left corner. I've corrected Bolt by changing screen_rect to viewport in the control loop; however, I don't know why the "arm" swinging is dependent on adjacency to the origin. The positions of the mouse and "character" SHOULD be absolute. Am I missing something?
"""
Basic moving platforms using only rectangle collision.
-Written by Sean J. McKiernan 'Mekire'
Edited for a test of "arms"
"""
import os
import sys
import math
import pygame as pg
CAPTION = "Moving Platforms"
SCREEN_SIZE = (700,700)
BACKGROUND_COLOR = (50, 50, 50)
COLOR_KEY = (255, 255, 255)
class _Physics(object):
"""A simplified physics class. Psuedo-gravity is often good enough."""
def __init__(self):
"""You can experiment with different gravity here."""
self.x_vel = self.y_vel = 0
self.grav = 0.4
self.fall = False
def physics_update(self):
"""If the player is falling, add gravity to the current y velocity."""
if self.fall:
self.y_vel += self.grav
else:
self.y_vel = 0
class Player(_Physics, object):
def __init__(self,location,speed):
_Physics.__init__(self)
HAND = pg.image.load("playertst2.png").convert()
HAND.set_colorkey(COLOR_KEY)
self.image = pg.image.load('playertst.png').convert()
self.rect = self.image.get_rect(topleft=location)
self.speed = speed
self.jump_power = -9.0
self.jump_cut_magnitude = -3.0
self.on_moving = False
self.collide_below = False
self.original_hand = HAND
self.fake_hand = self.original_hand.copy()
self.hand = self.original_hand.copy()
self.hand_rect = self.hand.get_rect(topleft=location)
self.angle = self.get_angle(pg.mouse.get_pos())
def check_keys(self, keys):
"""Find the player's self.x_vel based on currently held keys."""
self.x_vel = 0
if keys[pg.K_LEFT] or keys[pg.K_a]:
self.x_vel -= self.speed
if keys[pg.K_RIGHT] or keys[pg.K_d]:
self.x_vel += self.speed
def get_position(self, obstacles):
"""Calculate the player's position this frame, including collisions."""
if not self.fall:
self.check_falling(obstacles)
else:
self.fall = self.check_collisions((0,self.y_vel), 1, obstacles)
if self.x_vel:
self.check_collisions((self.x_vel,0), 0, obstacles)
def check_falling(self, obstacles):
"""If player is not contacting the ground, enter fall state."""
if not self.collide_below:
self.fall = True
self.on_moving = False
def check_moving(self,obstacles):
"""
Check if the player is standing on a moving platform.
If the player is in contact with multiple platforms, the prevously
detected platform will take presidence.
"""
if not self.fall:
now_moving = self.on_moving
any_moving, any_non_moving = [], []
for collide in self.collide_below:
if collide.type == "moving":
self.on_moving = collide
any_moving.append(collide)
else:
any_non_moving.append(collide)
if not any_moving:
self.on_moving = False
elif any_non_moving or now_moving in any_moving:
self.on_moving = now_moving
def check_collisions(self, offset, index, obstacles):
"""
This function checks if a collision would occur after moving offset
pixels. If a collision is detected, the position is decremented by one
pixel and retested. This continues until we find exactly how far we can
safely move, or we decide we can't move.
"""
unaltered = True
self.rect[index] += offset[index]
self.hand_rect[index] += offset[index]
while pg.sprite.spritecollideany(self, obstacles):
self.rect[index] += (1 if offset[index]<0 else -1)
self.hand_rect[index] += (1 if offset[index]<0 else -1)
unaltered = False
return unaltered
def check_above(self, obstacles):
"""When jumping, don't enter fall state if there is no room to jump."""
self.rect.move_ip(0, -1)
collide = pg.sprite.spritecollideany(self, obstacles)
self.rect.move_ip(0, 1)
return collide
def check_below(self, obstacles):
"""Check to see if the player is contacting the ground."""
self.rect.move_ip((0,1))
collide = pg.sprite.spritecollide(self, obstacles, False)
self.rect.move_ip((0,-1))
return collide
def jump(self, obstacles):
"""Called when the user presses the jump button."""
if not self.fall and not self.check_above(obstacles):
self.y_vel = self.jump_power
self.fall = True
self.on_moving = False
def jump_cut(self):
"""Called if player releases the jump key before maximum height."""
if self.fall:
if self.y_vel < self.jump_cut_magnitude:
self.y_vel = self.jump_cut_magnitude
def get_angle(self, mouse):
x_off = (mouse[0]-self.rect.centerx)
y_off = (mouse[1]-self.rect.centery)
self.angle = math.degrees(math.atan2(-y_off, x_off) % (2*math.pi))
self.hand = pg.transform.rotate(self.original_hand, self.angle)
self.hand_rect = self.hand.get_rect(center=self.hand_rect.center)
"""
offset = (mouse[1]-self.hand_rect.centery, mouse[0]-self.hand_rect.centerx)
self.angle = math.atan2(-offset[0], offset[1]) % (2 * math.pi)
self.angle = math.degrees(self.angle)
self.hand = pg.transform.rotate(self.original_hand, self.angle)
self.hand_rect = self.hand.get_rect(center=self.rect.center)
self.angle = 135-math.degrees(math.atan2(*offset))
self.hand = pg.transform.rotate(self.original_hand, self.angle)
self.hand_rect = self.hand.get_rect(topleft=self.rect.topleft)
"""
def pre_update(self, obstacles):
"""Ran before platforms are updated."""
self.collide_below = self.check_below(obstacles)
self.check_moving(obstacles)
def update(self, obstacles, keys):
"""Everything we need to stay updated; ran after platforms update."""
self.check_keys(keys)
self.get_position(obstacles)
self.physics_update()
def get_event(self, event, bolts):
if event.type == pg.MOUSEBUTTONDOWN and event.button == 1:
bolts.add(Bolt(self.rect.center))
elif event.type == pg.MOUSEMOTION:
self.get_angle(event.pos)
def draw(self, surface):
"""Blit the player to the target surface."""
surface.blit(self.image, self.rect)
surface.blit(self.hand, self.hand_rect)
class Bolt(pg.sprite.Sprite):
def __init__(self, location):
pg.sprite.Sprite.__init__(self)
"""self.original_bolt = pg.image.load('bolt.png')"""
"""self.angle = -math.radians(angle-135)"""
"""self.image = pg.transform.rotate(self.original_bolt, angle)"""
"""self.image = self.original_bolt"""
self.image=pg.Surface((5,10)).convert()
self.image.fill(pg.Color("yellow"))
self.rect = self.image.get_rect(center=location)
self.move = [self.rect.x, self.rect.y]
self.speed_magnitude = 5
"""self.speed = (self.speed_magnitude*math.cos(self.angle), self.speed_magnitude*math.sin(self.angle))"""
"""self.speed = (5,0)"""
self.done = False
def update(self, screen_rect, obstacles):
self.move[0] += self.speed_magnitude
"""self.move[1] += self.speed[1]"""
self.rect.topleft = self.move
self.remove(screen_rect, obstacles)
def remove(self, screen_rect, obstacles):
if not self.rect.colliderect(screen_rect):
self.kill()
class Block(pg.sprite.Sprite):
"""A class representing solid obstacles."""
def __init__(self, color, rect):
"""The color is an (r,g,b) tuple; rect is a rect-style argument."""
pg.sprite.Sprite.__init__(self)
self.rect = pg.Rect(rect)
self.image = pg.Surface(self.rect.size).convert()
self.image.fill(color)
self.type = "normal"
class MovingBlock(Block):
"""A class to represent horizontally and vertically moving blocks."""
def __init__(self, color, rect, end, axis, delay=500, speed=2, start=None):
"""
The moving block will travel in the direction of axis (0 or 1)
between rect.topleft and end. The delay argument is the amount of time
(in miliseconds) to pause when reaching an endpoint; speed is the
platforms speed in pixels/frame; if specified start is the place
within the blocks path to start (defaulting to rect.topleft).
"""
Block.__init__(self, color, rect)
self.start = self.rect[axis]
if start:
self.rect[axis] = start
self.axis = axis
self.end = end
self.timer = 0.0
self.delay = delay
self.speed = speed
self.waiting = False
self.type = "moving"
def update(self, player, obstacles):
"""Update position. This should be done before moving any actors."""
obstacles = obstacles.copy()
obstacles.remove(self)
now = pg.time.get_ticks()
if not self.waiting:
speed = self.speed
start_passed = self.start >= self.rect[self.axis]+speed
end_passed = self.end <= self.rect[self.axis]+speed
if start_passed or end_passed:
if start_passed:
speed = self.start-self.rect[self.axis]
else:
speed = self.end-self.rect[self.axis]
self.change_direction(now)
self.rect[self.axis] += speed
self.move_player(now, player, obstacles, speed)
elif now-self.timer > self.delay:
self.waiting = False
def move_player(self, now, player, obstacles, speed):
"""
Moves the player both when on top of, or bumped by the platform.
Collision checks are in place to prevent the block pushing the player
through a wall.
"""
if player.on_moving is self or pg.sprite.collide_rect(self,player):
axis = self.axis
offset = (speed, speed)
player.check_collisions(offset, axis, obstacles)
if pg.sprite.collide_rect(self, player):
if self.speed > 0:
self.rect[axis] = player.rect[axis]-self.rect.size[axis]
else:
self.rect[axis] = player.rect[axis]+player.rect.size[axis]
self.change_direction(now)
def change_direction(self, now):
"""Called when the platform reaches an endpoint or has no more room."""
self.waiting = True
self.timer = now
self.speed *= -1
"""class Spell(pg.sprite.Sprite):
def __init__(self, location, angle)"""
class Control(object):
"""Class for managing event loop and game states."""
def __init__(self):
"""Initalize the display and prepare game objects."""
self.screen = pg.display.get_surface()
self.screen_rect = self.screen.get_rect()
self.clock = pg.time.Clock()
self.fps = 60.0
self.keys = pg.key.get_pressed()
self.done = False
self.player = Player((50,875), 4)
self.viewport = self.screen.get_rect()
self.level = pg.Surface((1000,1000)).convert()
self.level_rect = self.level.get_rect()
self.win_text,self.win_rect = self.make_text()
self.obstacles = self.make_obstacles()
self.bolts = pg.sprite.Group()
def make_text(self):
"""Renders a text object. Text is only rendered once."""
font = pg.font.Font(None, 100)
message = "You win. Celebrate."
text = font.render(message, True, (100,100,175))
rect = text.get_rect(centerx=self.level_rect.centerx, y=100)
return text, rect
def make_obstacles(self):
"""Adds some arbitrarily placed obstacles to a sprite.Group."""
walls = [Block(pg.Color("chocolate"), (0,980,1000,20)),
Block(pg.Color("chocolate"), (0,0,20,1000)),
Block(pg.Color("chocolate"), (980,0,20,1000))]
static = [Block(pg.Color("darkgreen"), (250,780,200,100)),
Block(pg.Color("darkgreen"), (600,880,200,100)),
Block(pg.Color("darkgreen"), (20,360,880,40)),
Block(pg.Color("darkgreen"), (950,400,30,20)),
Block(pg.Color("darkgreen"), (20,630,50,20)),
Block(pg.Color("darkgreen"), (80,530,50,20)),
Block(pg.Color("darkgreen"), (130,470,200,215)),
Block(pg.Color("darkgreen"), (20,760,30,20)),
Block(pg.Color("darkgreen"), (400,740,30,40))]
moving = [MovingBlock(pg.Color("olivedrab"), (20,740,75,20), 325, 0),
MovingBlock(pg.Color("olivedrab"), (600,500,100,20), 880, 0),
MovingBlock(pg.Color("olivedrab"),
(420,430,100,20), 550, 1, speed=3, delay=200),
MovingBlock(pg.Color("olivedrab"),
(450,700,50,20), 930, 1, start=930),
MovingBlock(pg.Color("olivedrab"),
(500,700,50,20), 730, 0, start=730),
MovingBlock(pg.Color("olivedrab"),
(780,700,50,20), 895, 0, speed=-1)]
return pg.sprite.Group(walls, static, moving)
def update_viewport(self):
"""
The viewport will stay centered on the player unless the player
approaches the edge of the map.
"""
self.viewport.center = self.player.rect.center
self.viewport.clamp_ip(self.level_rect)
def event_loop(self):
"""We can always quit, and the player can sometimes jump."""
for event in pg.event.get():
if event.type == pg.QUIT or self.keys[pg.K_ESCAPE]:
self.done = True
elif event.type == pg.KEYDOWN:
if event.key == pg.K_SPACE:
self.player.jump(self.obstacles)
elif event.type == pg.KEYUP:
if event.key == pg.K_SPACE:
self.player.jump_cut()
elif event.type == pg.MOUSEMOTION or event.type == pg.MOUSEBUTTONDOWN:
self.player.get_event(event, self.bolts)
def update(self):
"""Update the player, obstacles, and current viewport."""
self.keys = pg.key.get_pressed()
self.player.pre_update(self.obstacles)
self.obstacles.update(self.player, self.obstacles)
self.player.update(self.obstacles, self.keys)
self.update_viewport()
self.bolts.update(self.screen_rect, self.obstacles)
def draw(self):
"""
Draw all necessary objects to the level surface, and then draw
the viewport section of the level to the display surface.
"""
self.level.fill(pg.Color("lightblue"))
self.obstacles.draw(self.level)
self.level.blit(self.win_text, self.win_rect)
self.player.draw(self.level)
self.bolts.draw(self.level)
self.screen.blit(self.level, (0,0), self.viewport)
def display_fps(self):
"""Show the programs FPS in the window handle."""
caption = "{} - FPS: {:.2f}".format(CAPTION, self.clock.get_fps())
pg.display.set_caption(caption)
def main_loop(self):
"""As simple as it gets."""
while not self.done:
self.event_loop()
self.update()
self.draw()
pg.display.update()
self.clock.tick(self.fps)
self.display_fps()
if __name__ == "__main__":
os.environ['SDL_VIDEO_CENTERED'] = '1'
pg.init()
pg.display.set_caption(CAPTION)
pg.display.set_mode(SCREEN_SIZE)
PLAYERIMG = pg.image.load("playertst.png").convert()
PLAYERIMG.set_colorkey(COLOR_KEY)
run_it = Control()
run_it.main_loop()
pg.quit()
sys.exit()
The % 2*pi unnecessary, and your get_angle function has no return value, but you do an assignment to self.angle = self.get_angle, but that is not the issue. The issue is that the mouse position is relative to the screen (i.e. clicking in the top right area of your game screen will always yield (0,480) if your screen is 640x480), while the position of the (character) rectangle is given in your game play area, which is larger than the screen, ergo if you move the character and thus the view shifts, you are getting coordinates in two different coordinate systems. You will have to keep track of where the view is in your game play area and add the offset to the mouse coordinates.
I've written a program (listed below) which plays Tic Tic Toe with a Tkinter GUI. If I invoke it like this:
root = tk.Tk()
root.title("Tic Tac Toe")
player1 = QPlayer(mark="X")
player2 = QPlayer(mark="O")
human_player = HumanPlayer(mark="X")
player2.epsilon = 0 # For playing the actual match, disable exploratory moves
game = Game(root, player1=human_player, player2=player2)
game.play()
root.mainloop()
it works as expected and the HumanPlayer can play against player2, which is a computer player (specifically, a QPlayer). The figure below shows how the HumanPlayer (with mark "X") easily wins.
In order to improve the performance of the QPlayer, I'd like to 'train' it by allowing it to play against an instance of itself before playing against the human player. I've tried modifying the above code as follows:
root = tk.Tk()
root.title("Tic Tac Toe")
player1 = QPlayer(mark="X")
player2 = QPlayer(mark="O")
for _ in range(1): # Play a couple of training games
training_game = Game(root, player1, player2)
training_game.play()
training_game.reset()
human_player = HumanPlayer(mark="X")
player2.epsilon = 0 # For playing the actual match, disable exploratory moves
game = Game(root, player1=human_player, player2=player2)
game.play()
root.mainloop()
What I then find, however, is that the Tkinter window contains two Tic Tac Toe boards (depicted below), and the buttons of the second board are unresponsive.
In the above code, the reset() method is the same one as used in the callback of the "Reset" button, which usually makes the board blank again to start over. I don't understand why I'm seeing two boards (of which one is unresponsive) instead of a single, responsive board?
For reference, the full code of the Tic Tac Toe program is listed below (with the 'offensive' lines of code commented out):
import numpy as np
import Tkinter as tk
import copy
class Game:
def __init__(self, master, player1, player2, Q_learn=None, Q={}, alpha=0.3, gamma=0.9):
frame = tk.Frame()
frame.grid()
self.master = master
self.player1 = player1
self.player2 = player2
self.current_player = player1
self.other_player = player2
self.empty_text = ""
self.board = Board()
self.buttons = [[None for _ in range(3)] for _ in range(3)]
for i in range(3):
for j in range(3):
self.buttons[i][j] = tk.Button(frame, height=3, width=3, text=self.empty_text, command=lambda i=i, j=j: self.callback(self.buttons[i][j]))
self.buttons[i][j].grid(row=i, column=j)
self.reset_button = tk.Button(text="Reset", command=self.reset)
self.reset_button.grid(row=3)
self.Q_learn = Q_learn
self.Q_learn_or_not()
if self.Q_learn:
self.Q = Q
self.alpha = alpha # Learning rate
self.gamma = gamma # Discount rate
self.share_Q_with_players()
def Q_learn_or_not(self): # If either player is a QPlayer, turn on Q-learning
if self.Q_learn is None:
if isinstance(self.player1, QPlayer) or isinstance(self.player2, QPlayer):
self.Q_learn = True
def share_Q_with_players(self): # The action value table Q is shared with the QPlayers to help them make their move decisions
if isinstance(self.player1, QPlayer):
self.player1.Q = self.Q
if isinstance(self.player2, QPlayer):
self.player2.Q = self.Q
def callback(self, button):
if self.board.over():
pass # Do nothing if the game is already over
else:
if isinstance(self.current_player, HumanPlayer) and isinstance(self.other_player, HumanPlayer):
if self.empty(button):
move = self.get_move(button)
self.handle_move(move)
elif isinstance(self.current_player, HumanPlayer) and isinstance(self.other_player, ComputerPlayer):
computer_player = self.other_player
if self.empty(button):
human_move = self.get_move(button)
self.handle_move(human_move)
if not self.board.over(): # Trigger the computer's next move
computer_move = computer_player.get_move(self.board)
self.handle_move(computer_move)
def empty(self, button):
return button["text"] == self.empty_text
def get_move(self, button):
info = button.grid_info()
move = (info["row"], info["column"]) # Get move coordinates from the button's metadata
return move
def handle_move(self, move):
try:
if self.Q_learn:
self.learn_Q(move)
i, j = move # Get row and column number of the corresponding button
self.buttons[i][j].configure(text=self.current_player.mark) # Change the label on the button to the current player's mark
self.board.place_mark(move, self.current_player.mark) # Update the board
if self.board.over():
self.declare_outcome()
else:
self.switch_players()
except:
print "There was an error handling the move."
pass # This might occur if no moves are available and the game is already over
def declare_outcome(self):
if self.board.winner() is None:
print "Cat's game."
else:
print "The game is over. The player with mark %s won!" % self.current_player.mark
def reset(self):
print "Resetting..."
for i in range(3):
for j in range(3):
self.buttons[i][j].configure(text=self.empty_text)
self.board = Board(grid=np.ones((3,3))*np.nan)
self.current_player = self.player1
self.other_player = self.player2
# np.random.seed(seed=0) # Set the random seed to zero to see the Q-learning 'in action' or for debugging purposes
self.play()
def switch_players(self):
if self.current_player == self.player1:
self.current_player = self.player2
self.other_player = self.player1
else:
self.current_player = self.player1
self.other_player = self.player2
def play(self):
if isinstance(self.player1, HumanPlayer) and isinstance(self.player2, HumanPlayer):
pass # For human vs. human, play relies on the callback from button presses
elif isinstance(self.player1, HumanPlayer) and isinstance(self.player2, ComputerPlayer):
pass
elif isinstance(self.player1, ComputerPlayer) and isinstance(self.player2, HumanPlayer):
first_computer_move = player1.get_move(self.board) # If player 1 is a computer, it needs to be triggered to make the first move.
self.handle_move(first_computer_move)
elif isinstance(self.player1, ComputerPlayer) and isinstance(self.player2, ComputerPlayer):
while not self.board.over(): # Make the two computer players play against each other without button presses
move = self.current_player.get_move(self.board)
self.handle_move(move)
def learn_Q(self, move): # If Q-learning is toggled on, "learn_Q" should be called after receiving a move from an instance of Player and before implementing the move (using Board's "place_mark" method)
state_key = QPlayer.make_and_maybe_add_key(self.board, self.current_player.mark, self.Q)
next_board = self.board.get_next_board(move, self.current_player.mark)
reward = next_board.give_reward()
next_state_key = QPlayer.make_and_maybe_add_key(next_board, self.other_player.mark, self.Q)
if next_board.over():
expected = reward
else:
next_Qs = self.Q[next_state_key] # The Q values represent the expected future reward for player X for each available move in the next state (after the move has been made)
if self.current_player.mark == "X":
expected = reward + (self.gamma * min(next_Qs.values())) # If the current player is X, the next player is O, and the move with the minimum Q value should be chosen according to our "sign convention"
elif self.current_player.mark == "O":
expected = reward + (self.gamma * max(next_Qs.values())) # If the current player is O, the next player is X, and the move with the maximum Q vlue should be chosen
change = self.alpha * (expected - self.Q[state_key][move])
self.Q[state_key][move] += change
class Board:
def __init__(self, grid=np.ones((3,3))*np.nan):
self.grid = grid
def winner(self):
rows = [self.grid[i,:] for i in range(3)]
cols = [self.grid[:,j] for j in range(3)]
diag = [np.array([self.grid[i,i] for i in range(3)])]
cross_diag = [np.array([self.grid[2-i,i] for i in range(3)])]
lanes = np.concatenate((rows, cols, diag, cross_diag)) # A "lane" is defined as a row, column, diagonal, or cross-diagonal
any_lane = lambda x: any([np.array_equal(lane, x) for lane in lanes]) # Returns true if any lane is equal to the input argument "x"
if any_lane(np.ones(3)):
return "X"
elif any_lane(np.zeros(3)):
return "O"
def over(self): # The game is over if there is a winner or if no squares remain empty (cat's game)
return (not np.any(np.isnan(self.grid))) or (self.winner() is not None)
def place_mark(self, move, mark): # Place a mark on the board
num = Board.mark2num(mark)
self.grid[tuple(move)] = num
#staticmethod
def mark2num(mark): # Convert's a player's mark to a number to be inserted in the Numpy array representing the board. The mark must be either "X" or "O".
d = {"X": 1, "O": 0}
return d[mark]
def available_moves(self):
return [(i,j) for i in range(3) for j in range(3) if np.isnan(self.grid[i][j])]
def get_next_board(self, move, mark):
next_board = copy.deepcopy(self)
next_board.place_mark(move, mark)
return next_board
def make_key(self, mark): # For Q-learning, returns a 10-character string representing the state of the board and the player whose turn it is
fill_value = 9
filled_grid = copy.deepcopy(self.grid)
np.place(filled_grid, np.isnan(filled_grid), fill_value)
return "".join(map(str, (map(int, filled_grid.flatten())))) + mark
def give_reward(self): # Assign a reward for the player with mark X in the current board position.
if self.over():
if self.winner() is not None:
if self.winner() == "X":
return 1.0 # Player X won -> positive reward
elif self.winner() == "O":
return -1.0 # Player O won -> negative reward
else:
return 0.5 # A smaller positive reward for cat's game
else:
return 0.0 # No reward if the game is not yet finished
class Player(object):
def __init__(self, mark):
self.mark = mark
self.get_opponent_mark()
def get_opponent_mark(self):
if self.mark == 'X':
self.opponent_mark = 'O'
elif self.mark == 'O':
self.opponent_mark = 'X'
else:
print "The player's mark must be either 'X' or 'O'."
class HumanPlayer(Player):
def __init__(self, mark):
super(HumanPlayer, self).__init__(mark=mark)
class ComputerPlayer(Player):
def __init__(self, mark):
super(ComputerPlayer, self).__init__(mark=mark)
class RandomPlayer(ComputerPlayer):
def __init__(self, mark):
super(RandomPlayer, self).__init__(mark=mark)
#staticmethod
def get_move(board):
moves = board.available_moves()
if moves: # If "moves" is not an empty list (as it would be if cat's game were reached)
return moves[np.random.choice(len(moves))] # Apply random selection to the index, as otherwise it will be seen as a 2D array
class THandPlayer(ComputerPlayer):
def __init__(self, mark):
super(THandPlayer, self).__init__(mark=mark)
def get_move(self, board):
moves = board.available_moves()
if moves:
for move in moves:
if THandPlayer.next_move_winner(board, move, self.mark):
return move
elif THandPlayer.next_move_winner(board, move, self.opponent_mark):
return move
else:
return RandomPlayer.get_move(board)
#staticmethod
def next_move_winner(board, move, mark):
return board.get_next_board(move, mark).winner() == mark
class QPlayer(ComputerPlayer):
def __init__(self, mark, Q={}, epsilon=0.2):
super(QPlayer, self).__init__(mark=mark)
self.Q = Q
self.epsilon = epsilon
def get_move(self, board):
if np.random.uniform() < self.epsilon: # With probability epsilon, choose a move at random ("epsilon-greedy" exploration)
return RandomPlayer.get_move(board)
else:
state_key = QPlayer.make_and_maybe_add_key(board, self.mark, self.Q)
Qs = self.Q[state_key]
if self.mark == "X":
return QPlayer.stochastic_argminmax(Qs, max)
elif self.mark == "O":
return QPlayer.stochastic_argminmax(Qs, min)
#staticmethod
def make_and_maybe_add_key(board, mark, Q): # Make a dictionary key for the current state (board + player turn) and if Q does not yet have it, add it to Q
state_key = board.make_key(mark)
if Q.get(state_key) is None:
moves = board.available_moves()
Q[state_key] = {move: 0.0 for move in moves} # The available moves in each state are initially given a default value of zero
return state_key
#staticmethod
def stochastic_argminmax(Qs, min_or_max): # Determines either the argmin or argmax of the array Qs such that if there are 'ties', one is chosen at random
min_or_maxQ = min_or_max(Qs.values())
if Qs.values().count(min_or_maxQ) > 1: # If there is more than one move corresponding to the maximum Q-value, choose one at random
best_options = [move for move in Qs.keys() if Qs[move] == min_or_maxQ]
move = best_options[np.random.choice(len(best_options))]
else:
move = min_or_max(Qs, key=Qs.get)
return move
root = tk.Tk()
root.title("Tic Tac Toe")
player1 = QPlayer(mark="X")
player2 = QPlayer(mark="O")
# for _ in range(1): # Play a couple of training games
# training_game = Game(root, player1, player2)
# training_game.play()
# training_game.reset()
human_player = HumanPlayer(mark="X")
player2.epsilon = 0 # For playing the actual match, disable exploratory moves
game = Game(root, player1=human_player, player2=player2)
game.play()
root.mainloop()
It looks like you only need to create the board one time as the reset method resets it for the new players. Each type you create a Game instance, you create a new Tk frame so you either need to destroy the old one or you can reuse the windows by not creating a new Game instance each time.
A minor change to the main code at the bottom of the file seems to fix this:
player1 = QPlayer(mark="X")
player2 = QPlayer(mark="O")
game = Game(root, player1, player2)
for _ in range(1): # Play a couple of training games
game.play()
game.reset()
human_player = HumanPlayer(mark="X")
player2.epsilon = 0 # For playing the actual match, disable exploratory moves
game.player1 = human_player
game.player2 = player2
game.play()
I've noticed in this code that if you were to use it in python 3.2.3 or similar editions all of the print statements would need to be enclosed by brackets, and you'd need to add tkinter in the program by importing it.