Related
This is a game where you have 12 cards and you pick you until you choose 3 from the same group. I am attempting to find the probability of choosing each group. The script that I have created works, but it is extremely slow. My coworker created a similar script in R without the functions and his script takes 1/100th the time that mine takes. I am just trying to figure out why. Any ideas would be greatly appreciated.
from collections import Counter
import pandas as pd
from datetime import datetime
weight = pd.read_excel('V01Weights.xlsx')
Weight looks like the following:
Symb Weight
Grand 170000
Grand 170000
Grand 105
Major 170000
Major 170000
Major 215
Minor 150000
Minor 150000
Minor 12000
Bonus 105000
Bonus 105000
Bonus 105000
Max Picks represents the total number of different "cards". Total Picks represents the max number of user choices. This is because after 8 choices, you are guaranteed to have 2 of each type so on the 9th pick, you are guaranteed to have 3 matching.
TotalPicks = 9
MaxPicks = 12
This should have been named PickedProbabilities.
Picks = {0:0,1:0,2:0,3:0}
This is my simple version of the timeit class because I don't like the timeit class
def Time_It(function):
start =datetime.now()
x = function()
finish = datetime.now()
TotalTime = finish - start
Minutes = int(TotalTime.seconds/60)
Seconds = TotalTime.seconds % 60
print('It took ' + str(Minutes) + ' minutes and ' + str(Seconds) + ' seconds')
return(x)
Given x(my picks in order) I find the probability. These picks are done without replacement
def Get_Prob(x,weight):
prob = 1
weights = weight.iloc[:,1]
for index in x:
num = weights[index]
denom = sum(weights)
prob *= num/denom
weights.drop(index, inplace = True)
# print(weights)
return(prob)
This is used to determine if there are duplicates in my loop because that is not allowed
def Is_Allowed(x):
return(len(x) == len(set(x)))
This determines if a win is present in all of the cards present thus far.
def Is_Win(x):
global Picks
WinTypes = [[0,1,2],[3,4,5],[6,7,8],[9,10,11]]
IsWin = False
for index,item in enumerate(WinTypes):
# print(index)
if set(item).issubset(set(x)):
IsWin = True
Picks[index] += Get_Prob(x,weight)
# print(Picks[index])
print(sum(Picks.values()))
break
return(IsWin)
This is my main function that cycles through all of the cards. I attempted to do this using recursion but I eventually gave up. I can't use itertools to create all of the permutations because for example [0,1,2,3,4] will be created by itertools but this is not possible because once you get 3 matching, the game ends.
def Cycle():
for a in range(MaxPicks):
x = [a]
for b in range(MaxPicks):
x = [a,b]
if Is_Allowed(x):
for c in range(MaxPicks):
x = [a,b,c]
if Is_Allowed(x):
if Is_Win(x):
# print(x)
continue
for d in range(MaxPicks):
x = [a,b,c,d]
if Is_Allowed(x):
if Is_Win(x):
# print(x)
continue
for e in range(MaxPicks):
x = [a,b,c,d,e]
if Is_Allowed(x):
if Is_Win(x):
continue
for f in range(MaxPicks):
x = [a,b,c,d,e,f]
if Is_Allowed(x):
if Is_Win(x):
continue
for g in range(MaxPicks):
x = [a,b,c,d,e,f,g]
if Is_Allowed(x):
if Is_Win(x):
continue
for h in range(MaxPicks):
x = [a,b,c,d,e,f,g,h]
if Is_Allowed(x):
if Is_Win(x):
continue
for i in range(MaxPicks):
if Is_Allowed(x):
if Is_Win(x):
continue
Calls the main function
x = Time_It(Cycle)
print(x)
writes the probabilities to a text file
with open('result.txt','w') as file:
# file.write(pickle.dumps(x))
for item in x:
file.write(str(item) + ',' + str(x[item]) + '\n')
My coworker created a similar script in R without the functions and his script takes 1/100th the time that mine takes.
Two easy optimizations:
1) In-line the function calls like Is_Allowed() because Python have a lot of function call overhead (such as creating a new stackframe and argument tuples).
2) Run the code in using pypy which is really good at optimizing functions like this one.
Ok, this time I hope I got your problem right:)
There are two insights (I guess you have them, just for the sake of the completeness) needed in order to speed up your program algorithmically:
The probabilities for the sequence (card_1, card_2) and (card_2, card_1) are not equal, so we cannot use the results from the urn problem, and it looks like we need to try out all permutations.
However, given a set of cards we picked so far, we don't really need the information in which sequence they where picked - it is all the same for the future course of the game. So it is enough to use dynamic programming and calculate the probabilities for every subset to be traversed during the game (thus we need to check 2^N instead of N! states).
For a set of picked cards set the probability to pick a card i in the next turn is:
norm:=sum Wi for i in set
P(i|set)=Wi/norm if i not in set else 0.0
The recursion for calculating P(set) - the probability that a set of picked card occured during the game is:
set_without_i:=set/{i}
P(set)=sum P(set_without_i)*P(i|set_without_i) for i in set
However this should be done only for set_without_i for which the game not ended yet, i.e. no group has 3 cards picked.
This can be done by means of recursion+memoization or, as my version does, by using bottom-up dynamic programming. It also uses binary representation of integers for representations of sets and (most important part!) returns the result almost instantly [('Grand', 0.0014104762718021384), ('Major', 0.0028878988709489244), ('Minor', 0.15321793072867956), ('Bonus', 0.84248369412856905)]:
#calculates probability to end the game with 3 cards of a type
N=12
#set representation int->list
def decode_set(encoded):
decoded=[False]*N
for i in xrange(N):
if encoded&(1<<i):
decoded[i]=True
return decoded
weights = [170000, 170000, 105, 170000, 170000, 215, 150000, 150000, 12000, 105000, 105000, 105000]
def get_probs(decoded_set):
denom=float(sum((w for w,is_taken in zip(weights, decoded_set) if not is_taken)))
return [w/denom if not is_taken else 0.0 for w,is_taken in zip(weights, decoded_set)]
def end_group(encoded_set):
for i in xrange(4):
whole_group = 7<<(3*i) #7=..000111, 56=00111000 and so on
if (encoded_set & whole_group)==whole_group:
return i
return None
#MAIN: dynamic program:
MAX=(1<<N)#max possible set is 1<<N-1
probs=[0.0]*MAX
#we always start with the empty set:
probs[0]=1.0
#building bottom-up
for current_set in xrange(MAX):
if end_group(current_set) is None: #game not ended yet!
decoded_set=decode_set(current_set)
trans_probs=get_probs(decoded_set)
for i, is_set in enumerate(decoded_set):
if not is_set:
new_set=current_set | (1<<i)
probs[new_set]+=probs[current_set]*trans_probs[i]
#filtering wins:
group_probs=[0.0]*4
for current_set in xrange(MAX):
group_won=end_group(current_set)
if group_won is not None:
group_probs[group_won]+=probs[current_set]
print zip(["Grand", "Major", "Minor", "Bonus"], group_probs)
Some explanation of the "tricks" used in code:
A pretty standard trick is to use integer's binary representation to encode a set. Let's say we have objects [a,b,c], so we could represent the set {b,c} as 110, which would mean a (first in the list corresponds to 0- the lowest digit) - not in the set, b(1) in the set, c(1) in the set. However, 110 read as integer it is 6.
The current_set - for loop simulates the game and best understood while playing. Let's play with two cards [a,b] with weights [2,1].
We start the game with an empty set, 0 as integer, so the probability vector (given set, its binary representation and as integer mapped onto probability):
probs=[{}=00=0->1.0, 01={a}=1->0.0, {b}=10=2->0.0, {a,b}=11=3->0.0]
We process the current_set=0, there are two possibilities 66% to take card a and 33% to take cardb, so the probabilities become after the processing:
probs=[{}=00=0->1.0, 01={a}=1->0.66, {b}=10=2->0.33, {a,b}=11=3->0.0]
Now we process the current_set=1={a} the only possibility is to take b so we will end with set {a,b}. So we need to update its (3={a,b}) probability via our formula and we get:
probs=[{}=00=0->1.0, 01={a}=1->0.66, {b}=10=2->0.33, {a,b}=11=3->0.66]
In the next step we process 2, and given set {b} the only possibility is to pick card a, so probability of set {a,b} needs to be updated again
probs=[{}=00=0->1.0, 01={a}=1->0.66, {b}=10=2->0.33, {a,b}=11=3->1.0]
We can get to {a,b} on two different paths - this could be seen in our algorithm. The probability to go through set {a,b} at some point in our game is obviously 1.0.
Another important thing: all paths that leads to {a,b} are taken care of before we process this set (it would be the next step).
Edit: I misunderstood the original problem, the here presented solution is for the following problem:
Given 4 groups with 3 different cards with a different score for every card, we pick up cards as long as we don't have picked 3 cards from the same group. What is the expected score(sum of scores of picked cards) in the end of the game.
I leave the solution as it is, because it was such a joy to work it out after so many probability-theory-less years and I just cannot delete it:)
See my other answer for handling of the original problem
There are two possibilities to improve the performance: making the code faster (and before starting this, one should profile in order to know which part of the program should be optimized, otherwise the time is spent optimizing things that don't count) or improving the algorithm. I propose to do the second.
Ok, this problem seems to be more complex as at the first site. Let's start with some observations.
All you need to know is the expected number of the picked cards at the end of the game:
If Pi is the probability that the card i is picked somewhere during the game, then we are looking for the expected value of the score E(Score)=P1*W1+P2*W2+...Pn*Wn. However, if we look at the cards of a group, we can state that because of the symmetry the probabilities for the cards of this group are the same, e.g. P1=P2=P3=:Pgrand in your case. Thus our expectation can be calculated:
E(Score)=3*Pgrand*(W1+W2+W3)/3+...+3*Pbonus*(W10+W11+W12)/3
We call averageWgrand:=(W1+W2+W3)/3 and note that E(#grand)=3*Pgrand - the expected number of picked grand card at the end of the game. With this our formula becomes:
E(Score)=E(#grand)*averageWgrand+...+E(#bonus)*averageWbonus
In your example we can go even further: The number of cards in every group is equal, so because of the symmetry we can claim: E(#grand)=E(#minor)=E(#major)=E(#grand)=:(E#group). For the sake of simplicity, in the following we consider only this special case (but the outlined solution could be extended also to the general case). This lead to the following simplification:
E(Score)=4*E(#group)(averageWgrand+...+averageWbonus)/4
We call averageW:=(averageWgrand+...+averageWbonus)/4 and note that E(#cards)=4*E(#grand) is the expected number of picked card at the end of the game.
Thus, E(Score)=E(#cards)*averageW, so our task is reduced to calculating the expected value of the number of cards at the end of the game:
E(#cards)=P(1)*1+P(2)*2+...P(n)*n
where P(i) denotes the probability, that the game ends with exact i cards. The probabilities P(1),P(2) and P(k), k>9 are easy to see - they are 0.
Calculation of the probability of ending the game with i picked cards -P(i):
Let's play a slightly different game: we pick exactly i cards and win if and only if:
There is exactly one group with 3 cards picked. We call this group full_group.
The last picked (i-th) card was from the full_group.
It is easy to see, that the probability to win this game P(win) is exactly the probability we are looking for - P(i). Once again we can use the symmetry, because all groups are equal (P(win, full=grand) means the probability that we what and that the full_group=grand):
P(win)=P(win, grand)+P(win, minor)+P(win, major)+P(win, bonus)
=4*P(win, grand)
P(win, grand) is the probability that:
after picking i-1 cards the number of picked grand cards is 2, i.e. `#grand=2' and
after picking i-1 cards, for every group the number of picked cards is less than 3 and
we pick a grand-card in the last round. Given the first two constraints hold, this (conditional) probability is 1/(n-i+1) (there are n-i+1 cards left and only one of them is "right").
From the urn problem we know the probability for
P(#grand=u, #minor=x, #major=y, #bonus=z) = binom(3,u)*binom(3,x)*binom(3,y)*binom(3,z)/binom(12, u+x+y+z)
with binom(n,k)=n!/k!/(n-k)!. Thus P(win, grand) can be calculated as:
P(win, grand) = 1/(n-i+1)*sum P(#grand=2, #minor=x, #major=y, #bonus=z)
where x<=2, y<=2, z<=2 and 2+x+y+z=i-1
And now the code:
import math
def binom(n,k):
return math.factorial(n)//math.factorial(k)//math.factorial(n-k)
#expected number of cards:
n=12 #there are 12 cards
probs=[0]*n
for minor in xrange(3):
for major in xrange(3):
for bonus in xrange(3):
i = 3 + minor +major +bonus
P_urn = binom(3,2)*binom(3,minor)*binom(3,major)*binom(3,bonus)/float(binom(n, n-i+1))
P_right_last_card = 1.0/(n-i+1)
probs[i]+=4*P_urn*P_right_last_card #factor 4 from symmetry
print "Expected number of cards:", sum((prob*card_cnt for card_cnt, prob in enumerate(probs)))
As result I get 6.94285714286 as the expected number of cards in the end of the game. And very fast - almost instantly. Not sure whether it is right though...
Conclusion:
Obviously, if you like to handle a more general case (more groups, number cards in a group different) you have to extend the code (recursion, memoization of binom) and the theory.
But the most crucial part: with this approach you (almost) don't care in which order the cards were picked - and thus the number of states you have to inspect is down by factor of (k-1)! where k is the maximal possible number of cards in the end of the game. In your example k=9 and thus the approach is faster by factor 40000 (I don't even consider the speed-up from the exploited symmetry, because it might not be possible in general case).
Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 8 years ago.
Improve this question
I have been seriously trying to create a genetic program that will evolve to play tic-tac-toe in an acceptable way. I'm using the genome to generate a function that will then take the board as input and output the result... But it's not working.
Can this program be written in less than 500 lines of code (including blank lines and documentation)? Perhaps my problem is that I'm generating AIs that are too simple.
My research
A Genetic Algorithm for Tic-Tac-Toe (very different from my approach).
http://www.tropicalcoder.com/GeneticAlgorithm.htm (too abstract).
In quite a good number of websites there are references to 'neural networks'. Are they really required?
Important disclaimers
This is NOT homework of any kind, just a personal project for the sake of learning something cool.
This is NOT a 'give me the codz plz', I am looking for high level suggestions. I explicitly don't want ready-made solutions as the answers.
Please give me some help and insight into this 'genetic programming' concept applied to this specific simple problem.
#OnABauer: I think that I am using genetic programming because quoting Wikipedia
In artificial intelligence, genetic programming (GP) is an
evolutionary algorithm-based methodology inspired by biological
evolution to find computer programs that perform a user-defined task.
And I am trying to generate a program (in this case function) that will perform the task of playing tic-tac-toe, you can see it because the output of the most important genetic_process function is a genome that will then be converted to a function, thus if I understood correctly this is genetic programming because the output is a function.
Program introspection and possible bugs/problems
The code runs with no errors nor crashes. The problem is that in the end what I get is an incompetent AI, that will attempt to make illegal move and be punished with losing each and every time. It is no better than random.
Possibly it is because my AI function is so simple: just making calculations on the stored values of the squares with no conditionals.
High level description
What is your chromosome meant to represent?
My cromosome rapresents a list of functions that will then be used to reduce over the array of the board stored as trinary. OK let me make an example:
* Cromosome is: amMMMdsa (length of cromosome must be 8).
1. The first step is converting this to functions following the lookup at the top called LETTERS_TO_FUNCTIONS, this give the functions: [op.add,op.mul,op.mod,op.mod,op.mod,op.floordiv,op.sub,op.add]
The second step is converting the board to a trinary rapresentation. So let's say the board is "OX XOX " we will get [2, 3, 1, 1, 3, 2, 3, 1, 1]
The third step is reducing the trinary raprentation using the functions obtained above. That is best explained by the function down below:
def reduce_by_all_functions(numbers_list,functions_list):
"""
Applies all the functions in a list to a list of numbers.
>>> reduce_by_all_functions([3,4,6],[op.add,op.sub])
1
>>> reduce_by_all_functions([6,2,4],[op.mul,op.floordiv])
3
"""
if len(functions_list) != len(numbers_list) - 1:
raise ValueError("The functions must be exactly one less than the numbers")
result = numbers_list[0]
for index,n in enumerate(numbers_list[1:]):
result = functions_list[index](result,n)
return result
Thus yielding the result of: 0 that means that the ai decided to go in the first square
What is your fitness function?
Luckly this is easy to answer.
def ai_fitness(genome,accuracy):
"""
Returns how good an ai is by letting it play against a random ai many times.
The higher the value, the best the ai
"""
ai = from_genome_to_ai(genome)
return decide_best_ai(ai,random_ai,accuracy)
How does your mutation work?
The son ereditates 80% of the genes from the father and 20% of genes from the mother. There is no kind of random mutation besides that.
And how is that reduce_by_all_functions() being used? I see that it
takes a board and a chromosome and returns a number. How is that
number used, what is it meant to represent, and... why is it being
returned modulo 9?
reduce_by_all_functions() is used to actually apply the functions previously obtained by the cromosome. The number is the square the ai wants to take. It is modulo 9 because it must be between 0 and 8 because the board is 9 spaces.
My code so far:
import doctest
import random
import operator as op
SPACE = ' '
MARK_OF_PLAYER_1 = "X"
MARK_OF_PLAYER_2 = "O"
EMPTY_MARK = SPACE
BOARD_NUMBERS = """
The moves are numbered as follows:
0 | 1 | 2
---------
3 | 4 | 5
---------
6 | 7 | 8
"""
WINNING_TRIPLETS = [ (0,1,2), (3,4,5), (6,7,8),
(0,3,6), (1,4,7), (2,5,8),
(0,4,8), (2,4,6) ]
LETTERS_TO_FUNCTIONS = {
'a': op.add,
'm': op.mul,
'M': op.mod,
's': op.sub,
'd': op.floordiv
}
def encode_board_as_trinary(board):
"""
Given a board, replaces the symbols with the numbers
1,2,3 in order to make further processing easier.
>>> encode_board_as_trinary("OX XOX ")
[2, 3, 1, 1, 3, 2, 3, 1, 1]
>>> encode_board_as_trinary(" OOOXXX")
[1, 1, 1, 2, 2, 2, 3, 3, 3]
"""
board = ''.join(board)
board = board.replace(MARK_OF_PLAYER_1,'3')
board = board.replace(MARK_OF_PLAYER_2,'2')
board = board.replace(EMPTY_MARK,'1')
return list((int(square) for square in board))
def create_random_genome(length):
"""
Creates a random genome (that is a sequences of genes, from which
the ai will be generated. It consists of randoom letters taken
from the keys of LETTERS_TO_FUNCTIONS
>>> random.seed("EXAMPLE")
# Test is not possible because even with the same
# seed it gives different results each run...
"""
letters = [letter for letter in LETTERS_TO_FUNCTIONS]
return [random.choice(letters) for _ in range(length)]
def reduce_by_all_functions(numbers_list,functions_list):
"""
Applies all the functions in a list to a list of numbers.
>>> reduce_by_all_functions([3,4,6],[op.add,op.sub])
1
>>> reduce_by_all_functions([6,2,4],[op.mul,op.floordiv])
3
"""
if len(functions_list) != len(numbers_list) - 1:
raise ValueError("The functions must be exactly one less than the numbers")
result = numbers_list[0]
for index,n in enumerate(numbers_list[1:]):
result = functions_list[index](result,n)
return result
def from_genome_to_ai(genome):
"""
Creates an AI following the rules written in the genome (the same as DNA does).
Each letter corresponds to a function as written in LETTERS_TO_FUNCTIONS.
The resulting ai will reduce the board using the functions obtained.
>>> ai = from_genome_to_ai("amMaMMss")
>>> ai("XOX OXO")
4
"""
functions = [LETTERS_TO_FUNCTIONS[gene] for gene in genome]
def ai(board):
return reduce_by_all_functions(encode_board_as_trinary(board),functions) % 9
return ai
def take_first_empty_ai(board):
"""
Very simple example ai for tic-tac-toe
that takes the first empty square.
>>> take_first_empty_ai(' OX O XXO')
0
>>> take_first_empty_ai('XOX O XXO')
3
"""
return board.index(SPACE)
def get_legal_moves(board):
"""
Given a tic-tac-toe board returns the indexes of all
the squares in which it is possible to play, i.e.
the empty squares.
>>> list(get_legal_moves('XOX O XXO'))
[3, 5]
>>> list(get_legal_moves('X O XXO'))
[1, 2, 3, 5]
"""
for index,square in enumerate(board):
if square == SPACE:
yield index
def random_ai(board):
"""
The simplest possible tic-tac-toe 'ai', just
randomly choses a random move.
>>> random.seed("EXAMPLE")
>>> random_ai('X O XXO')
3
"""
legal_moves = list(get_legal_moves(board))
return random.choice(legal_moves)
def printable_board(board):
"""
User Interface function:
returns an easy to understand representation
of the board.
"""
return """
{} | {} | {}
---------
{} | {} | {}
---------
{} | {} | {}""".format(*board)
def human_interface(board):
"""
Allows the user to play tic-tac-toe.
Shows him the board, the board numbers and then asks
him to select a move.
"""
print("The board is:")
print(printable_board(board))
print(BOARD_NUMBERS)
return(int(input("Your move is: ")))
def end_result(board):
"""
Evaluates a board returning:
0.5 if it is a tie
1 if MARK_OF_PLAYER_1 won # default to 'X'
0 if MARK_OF_PLAYER_2 won # default to 'O'
else if nothing of the above applies return None
>>> end_result('XXX OXO')
1
>>> end_result(' O X X O')
None
>>> end_result('OXOXXOXOO')
0.5
"""
if SPACE not in board:
return 0.5
for triplet in WINNING_TRIPLETS:
if all(board[square] == 'X' for square in triplet):
return 1
elif all(board[square] == 'O' for square in triplet):
return 0
def game_ended(board):
"""
Small syntactic sugar function to if the game is ended
i.e. no tie nor win occured
"""
return end_result(board) is not None
def play_ai_tic_tac_toe(ai_1,ai_2):
"""
Plays a game between two different ai-s, returning the result.
It should be noted that this function can be used also to let the user
play against an ai, just call it like: play_ai_tic_tac_toe(random_ai,human_interface)
>>> play_ai_tic_tac_toe(take_first_empty_ai,take_first_empty_ai)
1
"""
board = [SPACE for _ in range(9)]
PLAYER_1_WIN = 1
PLAYER_1_LOSS = 0
while True:
for ai,check in ( (ai_1,MARK_OF_PLAYER_1), (ai_2,MARK_OF_PLAYER_2) ):
move = ai(board)
# If move is invalid you lose
if board[move] != EMPTY_MARK:
if check == MARK_OF_PLAYER_1:
return PLAYER_1_LOSS
else:
return PLAYER_1_WIN
board[move] = check
if game_ended(board):
return end_result(board)
def loop_play_ai_tic_tac_toe(ai_1,ai_2,games_number):
"""
Plays games number games between ai_1 and ai_2
"""
return sum(( play_ai_tic_tac_toe(ai_1,ai_2)) for _ in range(games_number))
def decide_best_ai(ai_1,ai_2,accuracy):
"""
Returns the number of times the first ai is better than the second:
ex. if the ouput is 1.4, the first ai is 1.4 times better than the second.
>>> decide_best_ai(take_first_empty_ai,random_ai,100) > 0.80
True
"""
return sum((loop_play_ai_tic_tac_toe(ai_1,ai_2,accuracy//2),
loop_play_ai_tic_tac_toe(ai_2,ai_1,accuracy//2))) / (accuracy // 2)
def ai_fitness(genome,accuracy):
"""
Returns how good an ai is by lettting it play against a random ai many times.
The higher the value, the best the ai
"""
ai = from_genome_to_ai(genome)
return decide_best_ai(ai,random_ai,accuracy)
def sort_by_fitness(genomes,accuracy):
"""
Syntactic sugar for sorting a list of genomes based on the fitness.
High accuracy will yield a more accurate ordering but at the cost of more
computation time.
"""
def fit(genome):
return ai_fitness(genome,accuracy)
return list(sorted(genomes, key=fit, reverse=True))
# probable bug-fix because high fitness means better individual
def make_child(a,b):
"""
Returns a mix of cromosome a and cromosome b.
There is a bias towards cromosome a because I think that
a too weird soon is going to be bad.
"""
result = []
for index,char_a in enumerate(a):
char_b = b[index]
if random.random() > 0.8:
result.append(char_a)
else:
result.append(char_b)
return result
def genetic_process(population_size,generation_number,accuracy,elite_number):
"""
A full genetic process yielding a good tic-tac-toe ai. # not yet
# Parameters:
# population_size: the number of ai-s that you allow to be alive
at once
# generation_number: the number of generations of the gentetic
# accuracy: how well the ai-s are ordered,
low accuracy means that a good ai may be considered bad or
viceversa. High accuarcy is computationally costly
# elite_number: the number of best programmes that get to reproduce
at each generation.
# Return:
# A genome for a tic-tac-toe ai
"""
pool = [create_random_genome(9-1) for _ in range(population_size)]
for generation in range(generation_number):
best_individuals = sort_by_fitness(pool,accuracy)[:elite_number]
the_best = best_individuals[0]
for good_individual in best_individuals:
pool.append(make_child(the_best,good_individual))
pool = sort_by_fitness(pool,accuracy)[:population_size]
return the_best
def _test():
"""
Tests all the script by running the
>>> 2 + 2 # code formatted like this
4
"""
doctest.testmod()
def main():
"""
A simple demo to let the user play against a genetic opponent.
"""
print("The genetic ai is being created... please wait.")
genetic_ai = from_genome_to_ai(genetic_process(50,4,40,25))
play_ai_tic_tac_toe(genetic_ai,human_interface)
if __name__ == '__main__':
main()
First and foremost, I am obligated to say that Tic Tac Toe is really too simple a problem to reasonably attack with a genetic program. You simply don't need the power of a GP to win Tic Tac Toe; you can solve it with a brute force lookup table, or a simple game tree.
That said, if I understand correctly, your basic notion is this:
1) Create chromosomes of length 8, where each gene is an arithmetic operation, and the 8-gene chromosome acts on each board as a board evaluation function. That is, a chromosome takes in a board representation, and spits out a number representing the goodness of that board.
It's not perfectly clear that this is what you're doing, because your board representations are each 9 integers (1, 2, 3 only) but your examples are given in terms of the "winning triples" which are 2 integers (0 through 8).
2) Start the AI up and, on the AI's turn it should get a list of all legal moves, evaluate the board per its chromosome for each legal move and... take that number, modulo 9, and use that as the next move? Surely there's some code in there to handle the case where that move is illegal....
3) Let a bunch of these chromosome representations either play a standard implementation, or play each other, and determine the fitness based on the number of wins.
4) Once a whole generation of chromosomes has been evaluated, create a new generation. It's not clear to me how you are selecting the parents from the pool, but once the parents are selected, a child is produced by just taking individual genes from the parents by an 80-20 rule.
Your overall high level strategy is sound, but there are a lot of conceptual and implementation flaws in the execution. First, let's talk about fully observable games and simple ways to make AIs for them. If the game is very simple (such as Tic Tac Toe) you can simply make a brute force minimax game tree, such as this. TTT is simple enough that even your cell phone can go all the way to the bottom of the tree very quickly. You can even solve it by brute force with a look up table: Just make a list of all board positions and the response to each one.
When the games get larger-- think checkers, chess, go-- that is no longer true, and one of the ways around this is to develop what's called a board evaluation function. It is a function which takes a board position and returns a number, usually with higher being better for one player and lower being better for the other. One then executes a search to certain acceptable depth and aims for the highest (say) board evaluation function.
This begs the question: How do we come up with the board evaluation function? Originally, one asked experts at the game to develop these function for you. There is a great paper by Chellapilla and Fogel which is similar to what you want to do for checkers-- they use neural networks to determine the board evaluation functions, and, critically, these neural networks are encoded as genomes and evolved. They are then used in search depth 4 trees. The end results are very competitive against human players.
You should read that paper.
What you are trying to do, I think, is very similar, except instead of coding a neural network as a chromosome, you're trying to code up a very restricted algebraic expression, always of the form:
((((((((arg1 op1 arg2) op2 arg3) op3 arg4) op4 arg5) op5 arg6) op6 arg7) op7 arg8) op8 arg)
... and then you're using it mod 9 to pick a move.
Now let's talk about genetic algorithms, genetic programs, and the creation of new children. The whole idea in evolutionary techniques is to combine the best attributes of two hopefully-good solutions in the hopes that they will be even better, without getting stuck in a local maximum.
Generally, this is done by touranment selection, crossover, and mutation. Tournament selection means selecting the parents proportionally to their fitness. Crossover means dividing the chromosomes into two usually contiguous regions and taking one region from one parent and the other region from the other parent. (Why contiguous? Because Holland's Schema Theorem) Mutation means occasionally changing a gene, as a means of maintaining population diversity.
Now let's look at what you're doing:
1) Your board evaluation function-- the function that your chromosome turns into, which acts on the board positions-- is highly restricted and very arbitrary. There's not much rhyme or reason to assigning 1, 2, and 3 as those numbers, but that might possibly be okay. The bigger flaw is that your functions are a terribly restricted part of the overall space of functions. They are always the same length, and the parse tree always looks the same.
There's no reason to expect anything useful to be in this restrictive space. What's needed is to come up with a scheme which allows for a much more general set of parse trees, including crossover and mutation schemes. You should look up some papers or books by John Koza for ideas on this topic.
Note that Chellapilla and Fogel have fixed forms of functions as well, but their chromosomes are substantially larger than their board representations. A checkers board has 32 playable spaces, and each space can have 5 states. But their neural network had some 85 nodes, and the chromosome comprised the connection weights of those nodes-- hundreds, if not thousands, of values.
2) Then there's this whole modulo 9 thing. I don't understand why you're doing that. Don't do that. You're just scrambling whatever information might be in your chromosomes.
3) Your function to make new children is bad. Even as a genetic algorithm, you should be dividing the chromosomes in two (at random points) and taking part of one parent from one side, and the other part from the other parent on the other side. For genetic programming, which is what you're doing, there are analogous strategies for doing crossovers on parse trees. See Koza.
You must include mutation, or you will almost certainly get sub-optimal results.
4a) If you evaluate the fitness by playing against a competent AI, then realize that your chromosomes will never, ever win. They will lose, or they will draw. A competent AI will never lose. Moreover, it is likely that your AIs will lose all the time and initial generations may all come out as equally (catastrophically) poor players. It's not impossible to get yourelf out of that hole, but it will be hard.
4b) On the other hand, if, like Chellapilla and Fogel, you play the AIs against them selves, then you'd better make certain that the AIs can play either X or O. Otherwise you're never going to make any progress at all.
5) Finally, even if all these concerns are addressed, I'm not convinced this will get great results. Note that the checkers example searches to a depth of 4, which is not a big horizon in a game of checkers that might last 20 or 30 moves.
TTT can only ever last 9 moves.
If you don't do a search tree and just go for the highest board evaluation function, you might get something that works. You might not. I'm not sure. If you search to depth 4, you might as well skip to a full search to level 9 and do this conventionally.
This question already has answers here:
Random weighted choice
(7 answers)
Closed 8 years ago.
I am making a text-based RPG. I have an algorithm that determines the damage dealt by the player to the enemy which is based off the values of two variables. I am not sure how the first part of the algorithm will work quite yet, but that isn't important.
(AttackStrength is an attribute of the player that represents generally how strong his attacks are. WeaponStrength is an attribute of swords the player wields and represents generally how strong attacks are with the weapon.)
Here is how the algorithm will go:
import random
Damage = AttackStrength (Do some math operation to WeaponStrength) WeaponStrength
DamageDealt = randrange(DamageDealt - 4, DamageDealt + 1) #Bad pseudocode, sorry
What I am trying to do with the last line is get a random integer inside a range of integers with the minimum bound as 4 less than Damage, and the maximum bound as 1 more than Damage. But, that's not all. I want to assign probabilities that:
X% of the time DamageDealt will equal Damage
Y% of the time DamageDealt will equal one less than Damage
Z% of the time DamageDealt will equal two less than Damage
A% of the time DamageDealt will equal three less than Damage
B% of the time DamageDealt will equal three less than Damage
C% of the time DamageDealt will equal one more than Damage
I hope I haven't over-complicated all of this thank you!
I think the easiest way to do random weighted probability when you have nice integer probabilities like that is to simply populate a list with multiple copies of your choices - in the right ratios - then choose one element from it, randomly.
Let's do it from -3 to 1 with your (original) weights of 10,10,25,25,30 percent. These share a gcd of 5, so you only need a list of length 20 to hold your choices:
choices = [-3]*2 + [-2]*2 + [-1]*5 + [0]*5 + [1]*6
And implementation done, just choose randomly from that. Demo showing 100 trials:
trials = [random.choice(choices) for _ in range(100)]
[trials.count(i) for i in range(-3,2)]
Out[18]: [11, 7, 27, 22, 33]
Essentially, what you're trying to accomplish is simulation of a loaded die: you have six possibilities and want to assign different probabilities to each one. This is a fairly interesting problem, mathematically speaking, and here is a wonderful piece on the subject.
Still, you're probably looking for something a little less verbose, and the easiest pattern to implement here would be via roulette wheel selection. Given a dictionary where keys are the various 'sides' (in this case, your possible damage formulae) and the values are the probabilities that each side can occur (.3, .25, etc.), the method looks like this:
def weighted_random_choice(choices):
max = sum(choices.values())
pick = random.uniform(0, max)
current = 0
for key, value in choices.items():
current += value
if current > pick:
return key
Suppose that we wanted to have these relative weights for the outcomes:
a = (10, 15, 15, 25, 25, 30)
Then we create a list of partial sums b and a function c:
import random
b = [sum(a[:i+1]) for i,x in enumerate(a)]
def c():
n = random.randrange(sum(a))
for i, v in enumerate(b):_
if n < v: return i
The function c will return an integer from 0 to len(a)-1 with probability proportional to the weights specified in a.
This can be a tricky problem with a lot of different probabilities. Since you want to impose probabilities on the outcomes it's not really fair to call them "random". It always helps to imagine how you might represent your data. One way would be to keep a tuple of tuples like
probs = ((10, +1), (30, 0), (25, -1), (25, -2), (15, -3))
You will notice I have adjusted the series to put the highest adjustment first and so on. I have also removed the duplicate "15, -3) that your question implies because (I imagine) of a line duplicated by accident. One very useful test is to ensure that your probabilities add up to 100 (since I've represented them as integer percentages). This reveals a data fault:
>>> sum(prob[0] for prob in probs)
105
This needn't be an issue unless you really want your probabilities to sum to a sensible value. If this isn't necessary you can just treat them as weightings and select random numbers from (0, 104) instead of (0, 99). This is the course I will follow, but the adjustment should be relatively simple.
Given probs and a random number between 0 and (in your case) 104, you can iterate over the probs structure, accumulating probabilities until you find the bin this particular random number belongs to. This would look (something) like:
def damage_offset(N):
prob = random.randint(0, N-1)
cum_prob = 0
for prob, offset in probs:
cum_prob += prob
if cum_prob >= prob:
return offset
This should always terminate if you get your data right (hence my paranoid check on your weightings - I've been doing this quite a while).
Of course it's often possible to trade memory for speed. If the above needs to work faster then it's relatively easy to create a structure that maps random integer choices direct to their results. One way to construct such a mapping would be
damage_offsets = []
for i in range(N):
damage_offsets.append(damage_offset(i))
Then all you have to do after you've picked your random number r between 1 and N is to look up damage_offsets[r-1] for the particular value of r1 and you have created an O(1) operation. As I mentioned at the start, this isn't likely going to be terribly useful unless your probability list becomes huge (but if it does then you really will need to to avoid O(N) operations when you have large N for the number of probability buckets).
Apologies for untested code.
Given the final score of a basketball game, how i can count the number of possible scoring sequences that lead to the final score.
Each score can be one of: 3 point, 2 point, 1 point score by either the visiting or home team. For example:
basketball(3,0)=4
Because these are the 4 possible scoring sequences:
V3
V2, V1
V1, V2
V1, V1, V1
And:
basketball(88,90)=2207953060635688897371828702457752716253346841271355073695508308144982465636968075
Also I need to do it in a recursive way and without any global variables(dictionary is allowed and probably is the way to solve this)
Also, the function can get only the result as an argument (basketball(m,n)).
for those who asked for the solution:
basketballDic={}
def basketball(n,m):
count=0;
if n==0 and m==0:
return 1;
if (n,m) in basketballDic:
return basketballDic[(n,m)]
if n>=3:
count+= basketball(n-3,m)
if n>=2:
count+= basketball(n-2,m)
if n>=1:
count+= basketball(n-1,m)
if m>=3:
count+= basketball(n,m-3)
if m>=2:
count+= basketball(n,m-2)
if m>=1:
count+= basketball(n,m-1)
basketballDic[(n,m)]=count;
return count;
When you're considering a recursive algorithm, there are two things you need to figure out.
What is the base case, where the recursion ends.
What is the recursive case. That is, how can you calculate one value from one or more previous values?
For your basketball problem, the base case is pretty simple. When there's no score, there's exactly one possible set of baskets that has happened to get there (it's the empty set). So basketball(0,0) needs to return 1.
The recursive case is a little more tricky to think about. You need to reduce a given score, say (M,N), step by step until you get to (0,0), counting up the different ways to get each score on the way. There are six possible ways for the score to have changed to get to (M,N) from whatever it was previously (1, 2 and 3-point baskets for each team) so the number of ways to get to (M,N) is the sum of the ways to get to (M-1,N), (M-2,N), (M-3,N), (M,N-1), (M,N-2) and (M,N-3). So those are the recursive calls you'll want to make (after perhaps some bounds checking).
You'll find that a naive recursive implementation takes a very long time to solve for high scores. This is because it calculates the same values over and over (for instance, it may calculate that there's only one way to get to a score of (1,0) hundreds of separate times). Memoization can help prevent the duplicate work by remembering previously calculated results. It's also worth noting that the problem is symmetric (there are the same number of ways of getting a score of (M,N) as there are of getting (N,M)) so you can save a little work by remembering not only the current result, but also its reverse.
There are two ways this can be done, and neither come close to matching your specified outputs. The less relevant way would be to count the maximum possible number of scoring plays. Since basketball has 1 point scores, this will always be equal to the sum of both inputs to our basketball() function. The second technique is counting the minimum number of scoring plays. This can be done trivially with recursion, like so:
def team(x):
if x:
score = 3
if x < 3:
score = 2
if x < 2:
score = 1
return 1 + team(x - score)
else:
return 0
def basketball(x, y):
return team(x) + team(y)
Can this be done more tersely and even more elegantly? Certainly, but this should give you a decent starting point for the kind of stateless, recursive solution you are working on.
tried to reduce from the given result (every possible play- 1,2,3 points) using recursion until i get to 0 but for that i need global variable and i cant use one.
Maybe this is where you reveal what you need. You can avoid a global by passing the current count and/or returning the used count (or remaining count) as needed.
In your case I think you would just pass the points to the recursive function and have it return the counts. The return values would be added so the final total would roll-up as the recursion unwinds.
Edit
I wrote a function that was able to generate correct results. This question is tagged "memoization", using it gives a huge performance boost. Without it, the same sub-sequences are processed again and again. I used a decorator to implement memoization.
I liked #Maxwell's separate handling of teams, but that approach will not generate the numbers you are looking for. (Probably because your original wording was not at all clear, I've since rewritten your problem description). I wound up processing the 6 home and visitor scoring possibilities in a single function.
My counts were wrong until I realized what I needed to count was the number of times I hit the terminal condition.
Solution
Other solutions have been posted. Here's my (not very readable) one-liner:
def bball(vscore, hscore):
return 1 if vscore == 0 and hscore == 0 else \
sum([bball(vscore-s, hscore) for s in range(1,4) if s <= vscore]) + \
sum([bball(vscore, hscore-s) for s in range(1,4) if s <= hscore])
Actually I also have this line just before the function definition:
#memoize
I used Michele Simionato's decorator module and memoize sample. Though as #Blckknght mentioned the function is commutative so you could customize memoize to take advantage of this.
While I like the separation of concerns provided by generic memoization, I'm also tempted to initialize the cache with (something like):
cache = {(0, 0): 1}
and remove the special case check for 0,0 args in the function.
How would you create an algorithm to solve the following puzzle, "Mastermind"?
Your opponent has chosen four different colours from a set of six (yellow, blue, green, red, orange, purple). You must guess which they have chosen, and in what order. After each guess, your opponent tells you how many (but not which) of the colours you guessed were the right colour in the right place ["blacks"] and how many (but not which) were the right colour but in the wrong place ["whites"]. The game ends when you guess correctly (4 blacks, 0 whites).
For example, if your opponent has chosen (blue, green, orange, red), and you guess (yellow, blue, green, red), you will get one "black" (for the red), and two whites (for the blue and green). You would get the same score for guessing (blue, orange, red, purple).
I'm interested in what algorithm you would choose, and (optionally) how you translate that into code (preferably Python). I'm interested in coded solutions that are:
Clear (easily understood)
Concise
Efficient (fast in making a guess)
Effective (least number of guesses to solve the puzzle)
Flexible (can easily answer questions about the algorithm, e.g. what is its worst case?)
General (can be easily adapted to other types of puzzle than Mastermind)
I'm happy with an algorithm that's very effective but not very efficient (provided it's not just poorly implemented!); however, a very efficient and effective algorithm implemented inflexibly and impenetrably is not of use.
I have my own (detailed) solution in Python which I have posted, but this is by no means the only or best approach, so please post more! I'm not expecting an essay ;)
Key tools: entropy, greediness, branch-and-bound; Python, generators, itertools, decorate-undecorate pattern
In answering this question, I wanted to build up a language of useful functions to explore the problem. I will go through these functions, describing them and their intent. Originally, these had extensive docs, with small embedded unit tests tested using doctest; I can't praise this methodology highly enough as a brilliant way to implement test-driven-development. However, it does not translate well to StackOverflow, so I will not present it this way.
Firstly, I will be needing several standard modules and future imports (I work with Python 2.6).
from __future__ import division # No need to cast to float when dividing
import collections, itertools, math
I will need a scoring function. Originally, this returned a tuple (blacks, whites), but I found output a little clearer if I used a namedtuple:
Pegs = collections.namedtuple('Pegs', 'black white')
def mastermindScore(g1,g2):
matching = len(set(g1) & set(g2))
blacks = sum(1 for v1, v2 in itertools.izip(g1,g2) if v1 == v2)
return Pegs(blacks, matching-blacks)
To make my solution general, I pass in anything specific to the Mastermind problem as keyword arguments. I have therefore made a function that creates these arguments once, and use the **kwargs syntax to pass it around. This also allows me to easily add new attributes if I need them later. Note that I allow guesses to contain repeats, but constrain the opponent to pick distinct colours; to change this, I only need change G below. (If I wanted to allow repeats in the opponent's secret, I would need to change the scoring function as well.)
def mastermind(colours, holes):
return dict(
G = set(itertools.product(colours,repeat=holes)),
V = set(itertools.permutations(colours, holes)),
score = mastermindScore,
endstates = (Pegs(holes, 0),))
def mediumGame():
return mastermind(("Yellow", "Blue", "Green", "Red", "Orange", "Purple"), 4)
Sometimes I will need to partition a set based on the result of applying a function to each element in the set. For instance, the numbers 1..10 can be partitioned into even and odd numbers by the function n % 2 (odds give 1, evens give 0). The following function returns such a partition, implemented as a map from the result of the function call to the set of elements that gave that result (e.g. { 0: evens, 1: odds }).
def partition(S, func, *args, **kwargs):
partition = collections.defaultdict(set)
for v in S: partition[func(v, *args, **kwargs)].add(v)
return partition
I decided to explore a solver that uses a greedy entropic approach. At each step, it calculates the information that could be obtained from each possible guess, and selects the most informative guess. As the numbers of possibilities grow, this will scale badly (quadratically), but let's give it a try! First, I need a method to calculate the entropy (information) of a set of probabilities. This is just -∑p log p. For convenience, however, I will allow input that are not normalized, i.e. do not add up to 1:
def entropy(P):
total = sum(P)
return -sum(p*math.log(p, 2) for p in (v/total for v in P if v))
So how am I going to use this function? Well, for a given set of possibilities, V, and a given guess, g, the information we get from that guess can only come from the scoring function: more specifically, how that scoring function partitions our set of possibilities. We want to make a guess that distinguishes best among the remaining possibilites — divides them into the largest number of small sets — because that means we are much closer to the answer. This is exactly what the entropy function above is putting a number to: a large number of small sets will score higher than a small number of large sets. All we need to do is plumb it in.
def decisionEntropy(V, g, score):
return entropy(collections.Counter(score(gi, g) for gi in V).values())
Of course, at any given step what we will actually have is a set of remaining possibilities, V, and a set of possible guesses we could make, G, and we will need to pick the guess which maximizes the entropy. Additionally, if several guesses have the same entropy, prefer to pick one which could also be a valid solution; this guarantees the approach will terminate. I use the standard python decorate-undecorate pattern together with the built-in max method to do this:
def bestDecision(V, G, score):
return max((decisionEntropy(V, g, score), g in V, g) for g in G)[2]
Now all I need to do is repeatedly call this function until the right result is guessed. I went through a number of implementations of this algorithm until I found one that seemed right. Several of my functions will want to approach this in different ways: some enumerate all possible sequences of decisions (one per guess the opponent may have made), while others are only interested in a single path through the tree (if the opponent has already chosen a secret, and we are just trying to reach the solution). My solution is a "lazy tree", where each part of the tree is a generator that can be evaluated or not, allowing the user to avoid costly calculations they won't need. I also ended up using two more namedtuples, again for clarity of code.
Node = collections.namedtuple('Node', 'decision branches')
Branch = collections.namedtuple('Branch', 'result subtree')
def lazySolutionTree(G, V, score, endstates, **kwargs):
decision = bestDecision(V, G, score)
branches = (Branch(result, None if result in endstates else
lazySolutionTree(G, pV, score=score, endstates=endstates))
for (result, pV) in partition(V, score, decision).iteritems())
yield Node(decision, branches) # Lazy evaluation
The following function evaluates a single path through this tree, based on a supplied scoring function:
def solver(scorer, **kwargs):
lazyTree = lazySolutionTree(**kwargs)
steps = []
while lazyTree is not None:
t = lazyTree.next() # Evaluate node
result = scorer(t.decision)
steps.append((t.decision, result))
subtrees = [b.subtree for b in t.branches if b.result == result]
if len(subtrees) == 0:
raise Exception("No solution possible for given scores")
lazyTree = subtrees[0]
assert(result in endstates)
return steps
This can now be used to build an interactive game of Mastermind where the user scores the computer's guesses. Playing around with this reveals some interesting things. For example, the most informative first guess is of the form (yellow, yellow, blue, green), not (yellow, blue, green, red). Extra information is gained by using exactly half the available colours. This also holds for 6-colour 3-hole Mastermind — (yellow, blue, green) — and 8-colour 5-hole Mastermind — (yellow, yellow, blue, green, red).
But there are still many questions that are not easily answered with an interactive solver. For instance, what is the most number of steps needed by the greedy entropic approach? And how many inputs take this many steps? To make answering these questions easier, I first produce a simple function that turns the lazy tree of above into a set of paths through this tree, i.e. for each possible secret, a list of guesses and scores.
def allSolutions(**kwargs):
def solutions(lazyTree):
return ((((t.decision, b.result),) + solution
for t in lazyTree for b in t.branches
for solution in solutions(b.subtree))
if lazyTree else ((),))
return solutions(lazySolutionTree(**kwargs))
Finding the worst case is a simple matter of finding the longest solution:
def worstCaseSolution(**kwargs):
return max((len(s), s) for s in allSolutions(**kwargs)) [1]
It turns out that this solver will always complete in 5 steps or fewer. Five steps! I know that when I played Mastermind as a child, I often took longer than this. However, since creating this solver and playing around with it, I have greatly improved my technique, and 5 steps is indeed an achievable goal even when you don't have time to calculate the entropically ideal guess at each step ;)
How likely is it that the solver will take 5 steps? Will it ever finish in 1, or 2, steps? To find that out, I created another simple little function that calculates the solution length distribution:
def solutionLengthDistribution(**kwargs):
return collections.Counter(len(s) for s in allSolutions(**kwargs))
For the greedy entropic approach, with repeats allowed: 7 cases take 2 steps; 55 cases take 3 steps; 229 cases take 4 steps; and 69 cases take the maximum of 5 steps.
Of course, there's no guarantee that the greedy entropic approach minimizes the worst-case number of steps. The final part of my general-purpose language is an algorithm that decides whether or not there are any solutions for a given worst-case bound. This will tell us whether greedy entropic is ideal or not. To do this, I adopt a branch-and-bound strategy:
def solutionExists(maxsteps, G, V, score, **kwargs):
if len(V) == 1: return True
partitions = [partition(V, score, g).values() for g in G]
maxSize = max(len(P) for P in partitions) ** (maxsteps - 2)
partitions = (P for P in partitions if max(len(s) for s in P) <= maxSize)
return any(all(solutionExists(maxsteps-1,G,s,score) for l,s in
sorted((-len(s), s) for s in P)) for i,P in
sorted((-entropy(len(s) for s in P), P) for P in partitions))
This is definitely a complex function, so a bit more explanation is in order. The first step is to partition the remaining solutions based on their score after a guess, as before, but this time we don't know what guess we're going to make, so we store all partitions. Now we could just recurse into every one of these, effectively enumerating the entire universe of possible decision trees, but this would take a horrifically long time. Instead I observe that, if at this point there is no partition that divides the remaining solutions into more than n sets, then there can be no such partition at any future step either. If we have k steps left, that means we can distinguish between at most nk-1 solutions before we run out of guesses (on the last step, we must always guess correctly). Thus we can discard any partitions that contain a score mapped to more than this many solutions. This is the next two lines of code.
The final line of code does the recursion, using Python's any and all functions for clarity, and trying the highest-entropy decisions first to hopefully minimize runtime in the positive case. It also recurses into the largest part of the partition first, as this is the most likely to fail quickly if the decision was wrong. Once again, I use the standard decorate-undecorate pattern, this time to wrap Python's sorted function.
def lowerBoundOnWorstCaseSolution(**kwargs):
for steps in itertools.count(1):
if solutionExists(maxsteps=steps, **kwargs):
return steps
By calling solutionExists repeatedly with an increasing number of steps, we get a strict lower bound on the number of steps needed in the worst case for a Mastermind solution: 5 steps. The greedy entropic approach is indeed optimal.
Out of curiosity, I invented another guessing game, which I nicknamed "twoD". In this, you try to guess a pair of numbers; at each step, you get told if your answer is correct, if the numbers you guessed are no less than the corresponding ones in the secret, and if the numbers are no greater.
Comparison = collections.namedtuple('Comparison', 'less greater equal')
def twoDScorer(x, y):
return Comparison(all(r[0] <= r[1] for r in zip(x, y)),
all(r[0] >= r[1] for r in zip(x, y)),
x == y)
def twoD():
G = set(itertools.product(xrange(5), repeat=2))
return dict(G = G, V = G, score = twoDScorer,
endstates = set(Comparison(True, True, True)))
For this game, the greedy entropic approach has a worst case of five steps, but there is a better solution possible with a worst case of four steps, confirming my intuition that myopic greediness is only coincidentally ideal for Mastermind. More importantly, this has shown how flexible my language is: all the same methods work for this new guessing game as did for Mastermind, letting me explore other games with a minimum of extra coding.
What about performance? Obviously, being implemented in Python, this code is not going to be blazingly fast. I've also dropped some possible optimizations in favour of clear code.
One cheap optimization is to observe that, on the first move, most guesses are basically identical: (yellow, blue, green, red) is really no different from (blue, red, green, yellow), or (orange, yellow, red, purple). This greatly reduces the number of guesses we need consider on the first step — otherwise the most costly decision in the game.
However, because of the large runtime growth rate of this problem, I was not able to solve the 8-colour, 5-hole Mastermind problem, even with this optimization. Instead, I ported the algorithms to C++, keeping the general structure the same and employing bitwise operations to boost performance in the critical inner loops, for a speedup of many orders of magnitude. I leave this as an exercise to the reader :)
Addendum, 2018: It turns out the greedy entropic approach is not optimal for the 8-colour, 4-hole Mastermind problem either, with a worst-case length of 7 steps when an algorithm exists that takes at most 6!
I once wrote a "Jotto" solver which is essentially "Master Mind" with words. (We each pick a word and we take turns guessing at each other's word, scoring "right on" (exact) matches and "elsewhere" (correct letter/color, but wrong placement).
The key to solving such a problem is the realization that the scoring function is symmetric.
In other words if score(myguess) == (1,2) then I can use the same score() function to compare my previous guess with any other possibility and eliminate any that don't give exactly the same score.
Let me give an example: The hidden word (target) is "score" ... the current guess is "fools" --- the score is 1,1 (one letter, 'o', is "right on"; another letter, 's', is "elsewhere"). I can eliminate the word "guess" because the `score("guess") (against "fools") returns (1,0) (the final 's' matches, but nothing else does). So the word "guess" is not consistent with "fools" and a score against some unknown word that gave returned a score of (1,1).
So I now can walk through every five letter word (or combination of five colors/letters/digits etc) and eliminate anything that doesn't score 1,1 against "fools." Do that at each iteration and you'll very rapidly converge on the target. (For five letter words I was able to get within 6 tries every time ... and usually only 3 or 4). Of course there's only 6000 or so "words" and you're eliminating close to 95% for each guess.
Note: for the following discussion I'm talking about five letter "combination" rather than four elements of six colors. The same algorithms apply; however, the problem is orders of magnitude smaller for the old "Master Mind" game ... there are only 1296 combinations (6**4) of colored pegs in the classic "Master Mind" program, assuming duplicates are allowed. The line of reasoning that leads to the convergence involves some combinatorics: there are 20 non-winning possible scores for a five element target (n = [(a,b) for a in range(5) for b in range(6) if a+b <= 5] to see all of them if you're curious. We would, therefore, expect that any random valid selection would have a roughly 5% chance of matching our score ... the other 95% won't and therefore will be eliminated for each scored guess. This doesn't account for possible clustering in word patterns but the real world behavior is close enough for words and definitely even closer for "Master Mind" rules. However, with only 6 colors in 4 slots we only have 14 possible non-winning scores so our convergence isn't quite as fast).
For Jotto the two minor challenges are: generating a good world list (awk -f 'length($0)==5' /usr/share/dict/words or similar on a UNIX system) and what to do if the user has picked a word that not in our dictionary (generate every letter combination, 'aaaaa' through 'zzzzz' --- which is 26 ** 5 ... or ~1.1 million). A trivial combination generator in Python takes about 1 minute to generate all those strings ... an optimized one should to far better. (I can also add a requirement that every "word" have at least one vowel ... but this constraint doesn't help much --- 5 vowels * 5 possible locations for that and then multiplied by 26 ** 4 other combinations).
For Master Mind you use the same combination generator ... but with only 4 or 5 "letters" (colors). Every 6-color combination (15,625 of them) can be generated in under a second (using the same combination generator as I used above).
If I was writing this "Jotto" program today, in Python for example, I would "cheat" by having a thread generating all the letter combos in the background while I was still eliminated words from the dictionary (while my opponent was scoring me, guessing, etc). As I generated them I'd also eliminate against all guesses thus far. Thus I would, after I'd eliminated all known words, have a relatively small list of possibilities and against a human player I've "hidden" most of my computation lag by doing it in parallel to their input. (And, if I wrote a web server version of such a program I'd have my web engine talk to a local daemon to ask for sequences consistent with a set of scores. The daemon would keep a locally generated list of all letter combinations and would use a select.select() model to feed possibilities back to each of the running instances of the game --- each would feed my daemon word/score pairs which my daemon would apply as a filter on the possibilities it feeds back to that client).
(By comparison I wrote my version of "Jotto" about 20 years ago on an XT using Borland TurboPascal ... and it could do each elimination iteration --- starting with its compiled in list of a few thousand words --- in well under a second. I build its word list by writing a simple letter combination generator (see below) ... saving the results to a moderately large file, then running my word processor's spell check on that with a macro to delete everything that was "mis-spelled" --- then I used another macro to wrap all the remaining lines in the correct punctuation to make them valid static assignments to my array, which was a #include file to my program. All that let me build a standalone game program that "knew" just about every valid English 5 letter word; the program was a .COM --- less than 50KB if I recall correctly).
For other reasons I've recently written a simple arbitrary combination generator in Python. It's about 35 lines of code and I've posted that to my "trite snippets" wiki on bitbucket.org ... it's not a "generator" in the Python sense ... but a class you can instantiate to an infinite sequence of "numeric" or "symbolic" combination of elements (essentially counting in any positive integer base).
You can find it at: Trite Snippets: Arbitrary Sequence Combination Generator
For the exact match part of our score() function you can just use this:
def score(this, that):
'''Simple "Master Mind" scoring function'''
exact = len([x for x,y in zip(this, that) if x==y])
### Calculating "other" (white pegs) goes here:
### ...
###
return (exact,other)
I think this exemplifies some of the beauty of Python: zip() up the two sequences,
return any that match, and take the length of the results).
Finding the matches in "other" locations is deceptively more complicated. If no repeats were allowed then you could simply use sets to find the intersections.
[In my earlier edit of this message, when I realized how I could use zip() for exact matches, I erroneously thought we could get away with other = len([x for x,y in zip(sorted(x), sorted(y)) if x==y]) - exact ... but it was late and I was tired. As I slept on it I realized that the method was flawed. Bad, Jim! Don't post without adequate testing!* (Tested several cases that happened to work)].
In the past the approach I used was to sort both lists, compare the heads of each: if the heads are equal, increment the count and pop new items from both lists. otherwise pop a new value into the lesser of the two heads and try again. Break as soon as either list is empty.
This does work; but it's fairly verbose. The best I can come up with using that approach is just over a dozen lines of code:
other = 0
x = sorted(this) ## Implicitly converts to a list!
y = sorted(that)
while len(x) and len(y):
if x[0] == y[0]:
other += 1
x.pop(0)
y.pop(0)
elif x[0] < y[0]:
x.pop(0)
else:
y.pop(0)
other -= exact
Using a dictionary I can trim that down to about nine:
other = 0
counters = dict()
for i in this:
counters[i] = counters.get(i,0) + 1
for i in that:
if counters.get(i,0) > 0:
other += 1
counters[i] -= 1
other -= exact
(Using the new "collections.Counter" class (Python3 and slated for Python 2.7?) I could presumably reduce this a little more; three lines here are initializing the counters collection).
It's important to decrement the "counter" when we find a match and it's vital to test for counter greater than zero in our test. If a given letter/symbol appears in "this" once and "that" twice then it must only be counted as a match once.
The first approach is definitely a bit trickier to write (one must be careful to avoid boundaries). Also in a couple of quick benchmarks (testing a million randomly generated pairs of letter patterns) the first approach takes about 70% longer as the one using dictionaries. (Generating the million pairs of strings using random.shuffle() took over twice as long as the slower of the scoring functions, on the other hand).
A formal analysis of the performance of these two functions would be complicated. The first method has two sorts, so that would be 2 * O(nlog(n)) ... and it iterates through at least one of the two strings and possibly has to iterate all the way to the end of the other string (best case O(n), worst case O(2n)) -- force I'm mis-using big-O notation here, but this is just a rough estimate. The second case depends entirely on the perfomance characteristics of the dictionary. If we were using b-trees then the performance would be roughly O(nlog(n) for creation and finding each element from the other string therein would be another O(n*log(n)) operation. However, Python dictionaries are very efficient and these operations should be close to constant time (very few hash collisions). Thus we'd expect a performance of roughly O(2n) ... which of course simplifies to O(n). That roughly matches my benchmark results.
Glancing over the Wikipedia article on "Master Mind" I see that Donald Knuth used an approach which starts similarly to mine (and 10 years earlier) but he added one significant optimization. After gathering every remaining possibility he selects whichever one would eliminate the largest number of possibilities on the next round. I considered such an enhancement to my own program and rejected the idea for practical reasons. In his case he was searching for an optimal (mathematical) solution. In my case I was concerned about playability (on an XT, preferably using less than 64KB of RAM, though I could switch to .EXE format and use up to 640KB). I wanted to keep the response time down in the realm of one or two seconds (which was easy with my approach but which would be much more difficult with the further speculative scoring). (Remember I was working in Pascal, under MS-DOS ... no threads, though I did implement support for crude asynchronous polling of the UI which turned out to be unnecessary)
If I were writing such a thing today I'd add a thread to do the better selection, too. This would allow me to give the best guess I'd found within a certain time constraint, to guarantee that my player didn't have to wait too long for my guess. Naturally my selection/elimination would be running while waiting for my opponent's guesses.
Have you seem Raymond Hettingers attempt? They certainly match up to some of your requirements.
I wonder how his solutions compares to yours.
There is a great site about MasterMind strategy here. The author starts off with very simple MasterMind problems (using numbers rather than letters, and ignoring order and repetition) and gradually builds up to a full MasterMind problem (using colours, which can be repeated, in any order, even with the possibility of errors in the clues).
The seven tutorials that are presented are as follows:
Tutorial 1 - The simplest game setting (no errors, fixed order, no repetition)
Tutorial 2 - Code may contain blank spaces (no errors, fixed order, no repetition)
Tutorial 3 - Hints may contain errors (fixed order, no repetition)
Tutorial 4 - Game started from the middle (no errors, fixed order, no repetition)
Tutorial 5 - Digits / colours may be repeated (no errors, fixed order, each colour repeated at most 4 times)
Tutorial 6 - Digits / colours arranged in random order (no errors, random order, no repetition)
Tutorial 7 - Putting it all together (no errors, random order, each colour repeated at most 4 times)
Just thought I'd contribute my 90 odd lines of code. I've build upon #Jim Dennis' answer, mostly taking away the hint on symetric scoring. I've implemented the minimax algorithm as described on the Mastermind wikipedia article by Knuth, with one exception: I restrict my next move to current list of possible solutions, as I found performance deteriorated badly when taking all possible solutions into account at each step. The current approach leaves me with a worst case of 6 guesses for any combination, each found in well under a second.
It's perhaps important to note that I make no restriction whatsoever on the hidden sequence, allowing for any number of repeats.
from itertools import product, tee
from random import choice
COLORS = 'red ', 'green', 'blue', 'yellow', 'purple', 'pink'#, 'grey', 'white', 'black', 'orange', 'brown', 'mauve', '-gap-'
HOLES = 4
def random_solution():
"""Generate a random solution."""
return tuple(choice(COLORS) for i in range(HOLES))
def all_solutions():
"""Generate all possible solutions."""
for solution in product(*tee(COLORS, HOLES)):
yield solution
def filter_matching_result(solution_space, guess, result):
"""Filter solutions for matches that produce a specific result for a guess."""
for solution in solution_space:
if score(guess, solution) == result:
yield solution
def score(actual, guess):
"""Calculate score of guess against actual."""
result = []
#Black pin for every color at right position
actual_list = list(actual)
guess_list = list(guess)
black_positions = [number for number, pair in enumerate(zip(actual_list, guess_list)) if pair[0] == pair[1]]
for number in reversed(black_positions):
del actual_list[number]
del guess_list[number]
result.append('black')
#White pin for every color at wrong position
for color in guess_list:
if color in actual_list:
#Remove the match so we can't score it again for duplicate colors
actual_list.remove(color)
result.append('white')
#Return a tuple, which is suitable as a dictionary key
return tuple(result)
def minimal_eliminated(solution_space, solution):
"""For solution calculate how many possibilities from S would be eliminated for each possible colored/white score.
The score of the guess is the least of such values."""
result_counter = {}
for option in solution_space:
result = score(solution, option)
if result not in result_counter.keys():
result_counter[result] = 1
else:
result_counter[result] += 1
return len(solution_space) - max(result_counter.values())
def best_move(solution_space):
"""Determine the best move in the solution space, being the one that restricts the number of hits the most."""
elim_for_solution = dict((minimal_eliminated(solution_space, solution), solution) for solution in solution_space)
max_elimintated = max(elim_for_solution.keys())
return elim_for_solution[max_elimintated]
def main(actual = None):
"""Solve a game of mastermind."""
#Generate random 'hidden' sequence if actual is None
if actual == None:
actual = random_solution()
#Start the game of by choosing n unique colors
current_guess = COLORS[:HOLES]
#Initialize solution space to all solutions
solution_space = all_solutions()
guesses = 1
while True:
#Calculate current score
current_score = score(actual, current_guess)
#print '\t'.join(current_guess), '\t->\t', '\t'.join(current_score)
if current_score == tuple(['black'] * HOLES):
print guesses, 'guesses for\t', '\t'.join(actual)
return guesses
#Restrict solution space to exactly those hits that have current_score against current_guess
solution_space = tuple(filter_matching_result(solution_space, current_guess, current_score))
#Pick the candidate that will limit the search space most
current_guess = best_move(solution_space)
guesses += 1
if __name__ == '__main__':
print max(main(sol) for sol in all_solutions())
Should anyone spot any possible improvements to the above code than I would be very much interested in your suggestions.
To work out the "worst" case, instead of using entropic I am looking to the partition that has the maximum number of elements, then select the try that is a minimum for this maximum => This will give me the minimum number of remaining possibility when I am not lucky (which happens in the worst case).
This always solve standard case in 5 attempts, but it is not a full proof that 5 attempts are really needed because it could happen that for next step a bigger set possibilities would have given a better result than a smaller one (because easier to distinguish between).
Though for the "Standard game" with 1680 I have a simple formal proof:
For the first step the try that gives the minimum for the partition with the maximum number is 0,0,1,1: 256. Playing 0,0,1,2 is not as good: 276.
For each subsequent try there are 14 outcomes (1 not placed and 3 placed is impossible) and 4 placed is giving a partition of 1. This means that in the best case (all partition same size) we will get a maximum partition that is a minimum of (number of possibilities - 1)/13 (rounded up because we have integer so necessarily some will be less and other more, so that the maximum is rounded up).
If I apply this:
After first play (0,0,1,1) I am getting 256 left.
After second try: 20 = (256-1)/13
After third try : 2 = (20-1)/13
Then I have no choice but to try one of the two left for the 4th try.
If I am unlucky a fifth try is needed.
This proves we need at least 5 tries (but not that this is enough).
Here is a generic algorithm I wrote that uses numbers to represent the different colours. Easy to change, but I find numbers to be a lot easier to work with than strings.
You can feel free to use any whole or part of this algorithm, as long as credit is given accordingly.
Please note I'm only a Grade 12 Computer Science student, so I am willing to bet that there are definitely more optimized solutions available.
Regardless, here's the code:
import random
def main():
userAns = raw_input("Enter your tuple, and I will crack it in six moves or less: ")
play(ans=eval("("+userAns+")"),guess=(0,0,0,0),previousGuess=[])
def play(ans=(6,1,3,5),guess=(0,0,0,0),previousGuess=[]):
if(guess==(0,0,0,0)):
guess = genGuess(guess,ans)
else:
checker = -1
while(checker==-1):
guess,checker = genLogicalGuess(guess,previousGuess,ans)
print guess, ans
if not(guess==ans):
previousGuess.append(guess)
base = check(ans,guess)
play(ans=ans,guess=base,previousGuess=previousGuess)
else:
print "Found it!"
def genGuess(guess,ans):
guess = []
for i in range(0,len(ans),1):
guess.append(random.randint(1,6))
return tuple(guess)
def genLogicalGuess(guess,previousGuess,ans):
newGuess = list(guess)
count = 0
#Generate guess
for i in range(0,len(newGuess),1):
if(newGuess[i]==-1):
newGuess.insert(i,random.randint(1,6))
newGuess.pop(i+1)
for item in previousGuess:
for i in range(0,len(newGuess),1):
if((newGuess[i]==item[i]) and (newGuess[i]!=ans[i])):
newGuess.insert(i,-1)
newGuess.pop(i+1)
count+=1
if(count>0):
return guess,-1
else:
guess = tuple(newGuess)
return guess,0
def check(ans,guess):
base = []
for i in range(0,len(zip(ans,guess)),1):
if not(zip(ans,guess)[i][0] == zip(ans,guess)[i][1]):
base.append(-1)
else:
base.append(zip(ans,guess)[i][1])
return tuple(base)
main()
Here's a link to pure Python solver for Mastermind(tm): http://code.activestate.com/recipes/496907-mastermind-style-code-breaking/ It has a simple version, a way to experiment with various guessing strategies, performance measurement, and an optional C accelerator.
The core of the recipe is short and sweet:
import random
from itertools import izip, imap
digits = 4
fmt = '%0' + str(digits) + 'd'
searchspace = tuple([tuple(map(int,fmt % i)) for i in range(0,10**digits)])
def compare(a, b, imap=imap, sum=sum, izip=izip, min=min):
count1 = [0] * 10
count2 = [0] * 10
strikes = 0
for dig1, dig2 in izip(a,b):
if dig1 == dig2:
strikes += 1
count1[dig1] += 1
count2[dig2] += 1
balls = sum(imap(min, count1, count2)) - strikes
return (strikes, balls)
def rungame(target, strategy, verbose=True, maxtries=15):
possibles = list(searchspace)
for i in xrange(maxtries):
g = strategy(i, possibles)
if verbose:
print "Out of %7d possibilities. I'll guess %r" % (len(possibles), g),
score = compare(g, target)
if verbose:
print ' ---> ', score
if score[0] == digits:
if verbose:
print "That's it. After %d tries, I won." % (i+1,)
break
possibles = [n for n in possibles if compare(g, n) == score]
return i+1
def strategy_allrand(i, possibles):
return random.choice(possibles)
if __name__ == '__main__':
hidden_code = random.choice(searchspace)
rungame(hidden_code, strategy_allrand)
Here is what the output looks like:
Out of 10000 possibilities. I'll guess (6, 4, 0, 9) ---> (1, 0)
Out of 1372 possibilities. I'll guess (7, 4, 5, 8) ---> (1, 1)
Out of 204 possibilities. I'll guess (1, 4, 2, 7) ---> (2, 1)
Out of 11 possibilities. I'll guess (1, 4, 7, 1) ---> (3, 0)
Out of 2 possibilities. I'll guess (1, 4, 7, 4) ---> (4, 0)
That's it. After 5 tries, I won.
My friend was considering relatively simple case - 8 colors, no repeats, no blanks.
With no repeats, there's no need for the max entropy consideration, all guesses have the same entropy and first or random guessing all work fine.
Here's the full code to solve that variant:
# SET UP
import random
import itertools
colors = ('red', 'orange', 'yellow', 'green', 'blue', 'indigo', 'violet', 'ultra')
# ONE FUNCTION REQUIRED
def EvaluateCode(guess, secret_code):
key = []
for i in range(0, 4):
for j in range(0, 4):
if guess[i] == secret_code[j]:
key += ['black'] if i == j else ['white']
return key
# MAIN CODE
# choose secret code
secret_code = random.sample(colors, 4)
print ('(shh - secret code is: ', secret_code, ')\n', sep='')
# create the full list of permutations
full_code_list = list(itertools.permutations(colors, 4))
N_guess = 0
while True:
N_guess += 1
print ('Attempt #', N_guess, '\n-----------', sep='')
# make a random guess
guess = random.choice(full_code_list)
print ('guess:', guess)
# evaluate the guess and get the key
key = EvaluateCode(guess, secret_code)
print ('key:', key)
if key == ['black', 'black', 'black', 'black']:
break
# remove codes from the code list that don't match the key
full_code_list2 = []
for i in range(0, len(full_code_list)):
if EvaluateCode(guess, full_code_list[i]) == key:
full_code_list2 += [full_code_list[i]]
full_code_list = full_code_list2
print ('N remaining: ', len(full_code_list), '\n', full_code_list, '\n', sep='')
print ('\nMATCH after', N_guess, 'guesses\n')