sub-string text with a running number

This should be very simple and short, but I cannot think of a good, short way of doing it.
I have a string, for instance:
'How many roads must a man walk down Before you call him a man? How
many seas must a white dove sail Before she sleeps in the sand? Yes,
and how many times must the cannon balls fly Before they're forever
banned?'
and I want to substitute a word, say "how", with a running number so I get:
'[1] many roads must a man walk down Before you call him a man? [2]
many seas must a white dove sail Before she sleeps in the sand? Yes,
and [3] many times must the cannon balls fly Before they're forever
banned?'

You can utilise itertools.count and a function as the replacement argument, eg:
import re
from itertools import count
text = '''How many roads must a man walk down Before you call him a man? How many seas must a white dove sail Before she sleeps in the sand? Yes, and how many times must the cannon balls fly Before they're forever banned?'''
result = re.sub(r'(?i)\bhow\b', lambda m, c=count(1): '[{}]'.format(next(c)), text)
# [1] many roads must a man walk down Before you call him a man? [2] many seas must a white dove sail Before she sleeps in the sand? Yes, and [3] many times must the cannon balls fly Before they're forever banned?

You can use re.sub with a replacement function. The function looks up in a dictionary how often that word has been seen so far and returns the corresponding number.
import collections
import re
counts = collections.defaultdict(int)
def subst_count(match):
    word = match.group().lower()
    counts[word] += 1
    return "[%d]" % counts[word]
Example:
>>> text = "How many ...? How many ...? Yes, and how many ...?"
>>> re.sub(r"\bhow\b", subst_count, text, flags=re.I)
'[1] many ...? [2] many ...? Yes, and [3] many ...?'
Note: This uses different counts for each word to replace (in case you use a regex that matched more than one word), but will not reset counts between calls to re.sub.
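If the counts should start again at 1 on every call, one option is to create a fresh counter inside a small wrapper. This is only a sketch: the name number_occurrences and its parameters are made up for illustration and are not part of the answer above.

```python
import re
from itertools import count


def number_occurrences(text, word):
    """Replace each whole-word, case-insensitive match of `word` with [1], [2], ..."""
    counter = count(1)  # fresh counter for every call, so numbering restarts at 1
    pattern = re.compile(r"\b%s\b" % re.escape(word), re.IGNORECASE)
    return pattern.sub(lambda match: "[%d]" % next(counter), text)


text = "How many ...? How many ...? Yes, and how many ...?"
print(number_occurrences(text, "how"))
# Calling it again gives the same numbering, because no state is shared.
print(number_occurrences(text, "how"))
```

Because of the \b anchors, a word such as "show" is left untouched.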

Here's another way to do it with re.sub with a replacement function. But rather than using a global object to keep track of the count this code uses a function attribute.
import re
def count_replace():
    def replace(m):
        replace.count += 1
        return '[%d]' % replace.count
    replace.count = 0
    return replace
src = '''How many roads must a man walk down Before you call him a man? How many seas must a white dove sail Before she sleeps in the sand? Yes, and how many times must the cannon balls fly Before they're forever banned?'''
pat = re.compile('how', re.I)
print(pat.sub(count_replace(), src))
output
[1] many roads must a man walk down Before you call him a man? [2]
many seas must a white dove sail Before she sleeps in the sand? Yes,
and [3] many times must the cannon balls fly Before they're forever
banned?
If you need to only replace complete words and not partial words, then you'll need a smarter regex, eg r"\bhow\b".

Test = 'How many roads must a man walk down Before you call him a man? How many seas must a white dove sail Before she sleeps in the sand? Yes, and how many times must the cannon balls fly Before theyre forever banned?'
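i = 1
while "How" in Test:
    # Replace only the first remaining "How" with the current number.
    Test = Test.replace("How", "[%d]" % i, 1)
    i += 1
print(Test)
Output (the match is case-sensitive, so the lowercase "how" is left alone)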
[1] many roads must a man walk down Before you call him a man? [2] many seas must a white dove sail Before she sleeps in the sand? Yes, and how many times must the cannon balls fly Before theyre forever banned?

Just for fun, I wanted to see if I could solve this using recursion, and this is what I got:
def count_replace(s, to_replace, leng=0, count=1, replaced=None):
    if replaced is None:
        replaced = []  # avoid sharing one default list between separate calls
    if s.find(' ') == -1:
        replaced.append(s)
        return ' '.join(replaced)
    else:
        if s[0:s.find(' ')].lower() == to_replace.lower():
            replaced.append('[%d]' % count)
            count += 1
            leng = len(to_replace)
        else:
            replaced.append(s[0:s.find(' ')])
            leng = s.find(' ')
        return count_replace(s[leng + 1:], to_replace, leng, count, replaced)
It goes without saying that I wouldn't recommend this as it's ridiculously inefficient on top of the fact that it's also overly complicated, but I thought I'd share it anyway!

How can I concatenate these string variables in a for loop? [duplicate]

string1 = "The wind, "
string2 = "which had hitherto carried us along with amazing rapidity, "
string3 = "sank at sunset to a light breeze; "
string4 = "the soft air just ruffled the water and "
string5 = "caused a pleasant motion among the trees as we approached the shore, "
string6 = "from which it wafted the most delightful scent of flowers and hay."
I tried:
for i in range(6):
    message += string(i)
but it didn't work and raised the error: string is undefined.
I want to work with the variables directly. I know that putting them in a list is much easier, but imagine having 1000 strings; it would be tedious to type each one into the list.

Using join():
cList = [string1, string2, string3, string4, string5, string6]
print("".join(cList))
What I'd suggest: instead of n separate variables, keep the strings in a list:
x = ["The wind, ", "which had hitherto carried us along with amazing rapidity, ", "sank at sunset to a light breeze; ", "the soft air just ruffled the water and ", "caused a pleasant motion among the trees as we approached the shore, ", "from which it wafted the most delightful scent of flowers and hay."]
print("".join(x))
One-liner:
print("".join([string1, string2, string3, string4, string5, string6]))

If you can put the strings in a list in the first place, so much the better. Otherwise:
message = ""
for s in [string1, string2, string3, string4, string5, string6]:
    message += s

Each string is stored in a separate variable, so string(i) does not refer to any of them; that syntax tries to call a function named string. You would have to name each variable individually to concatenate them. Put them in a list instead, and then you can loop over it or index into it.

maybe you were looking for eval ?
message = ''.join([eval('string'+str(i)) for i in range(1,7)])
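If the numbered names really cannot be avoided, the lookup can also be done without eval by indexing a mapping of names to values. The sketch below is an assumption about what might be wanted: join_numbered and its parameters are hypothetical names, and the example uses a plain dict with shortened strings. For module-level variables, passing globals() as the namespace should work the same way.

```python
def join_numbered(namespace, prefix, first, last):
    """Concatenate namespace[prefix + str(i)] for i in first..last (inclusive)."""
    return "".join(namespace[f"{prefix}{i}"] for i in range(first, last + 1))


# Stand-in for the numbered variables; globals() could be passed instead.
variables = {
    "string1": "The wind, ",
    "string2": "sank at sunset ",
    "string3": "to a light breeze.",
}

message = join_numbered(variables, "string", 1, 3)
print(message)
```

Unlike eval, this never executes arbitrary text as code; a missing name simply raises KeyError.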

There are various solutions for this problem; one was given in the comments. The reason you are getting that error is that string doesn't exist. You are calling string(i). What exactly are you expecting that to do?
Also, the loop doesn't have the right logic. When code doesn't produce the expected result, the first line of defence is to debug. In this case, understand what you are looping through, which is just numbers. Go ahead and print the i variable so you can see what's happening. You are not accessing your stringX variables at all. They need to be contained in an iterable in order for you to iterate over them. Not to mention that the range is off: range(x) provides numbers from 0 to x-1, which in your case would be 0 1 2 3 4 5, while your variables are numbered 1 to 6. You would have known that if you had debugged. Part of coding, I must say, is debugging. It's something good to get used to doing.
Here's the documentation on Python about strings.
string.join(words[, sep])
Concatenate a list or tuple of words with intervening occurrences of sep. The default value for sep is a single space character. It is always true that string.join(string.split(s, sep), sep) equals s.
That means you can use the string method join to concatenate strings. The method requires you to pass it an iterable of strings, such as a list. Your code would look like this:
string1 = "The wind, "
string2 = "which had hitherto carried us along with amazing rapidity, "
string3 = "sank at sunset to a light breeze; "
string4 = "the soft air just ruffled the water and "
string5 = "caused a pleasant motion among the trees as we approached the shore, "
string6 = "from which it wafted the most delightful scent of flowers and hay."
message = "".join([string1, string2, string3, string4, string5, string6])
print(message)
Output:
The wind, which had hitherto carried us along with amazing rapidity, sank at sunset to
a light breeze; the soft air just ruffled the water and caused a pleasant motion among the
trees as we approached the shore, from which it wafted the most delightful scent of
flowers and hay.
Since I am not sure what your goal is, I am going to assume, for the sake of having something different, that you have an arbitrary number of string variables passed to you. That's even easier to handle, because if you define a function called join_strings with a *strings parameter, the values passed in are already collected into a tuple. Neat, right? So your code would be something like:
def join_strings(*strings):
    return "".join(strings)
Incredibly short and sweet, isn't it? Then you would call that function like this:
join_strings(string1, string2, string3, string4, string5, string6)
The cool thing is that tomorrow you might have only 3 strings, or 8, and that still works. Of course, it'd be more helpful to know why you are even saving the strings like that, since I'm sure you can use a better suited data structure for your needs (like using a list to begin with).
Next time you post to StackOverflow, it's good to try and show some effort of what you have tried to do to understand your problem instead of just pasting the problem.

Just write your story in one go (note the lack of commas between the strings):
message = ("The wind, "
"which had hitherto carried us along with amazing rapidity, "
"sank at sunset to a light breeze; "
"the soft air just ruffled the water and "
"caused a pleasant motion among the trees as we approached the shore, "
"from which it wafted the most delightful scent of flowers and hay.")
Or if you want to type fewer quotes:
import inspect
message = """
The wind,
which had hitherto carried us along with amazing rapidity,
sank at sunset to a light breeze;
the soft air just ruffled the water and
caused a pleasant motion among the trees as we approached the shore,
from which it wafted the most delightful scent of flowers and hay.
"""
message = inspect.cleandoc(message) # remove unwanted indentation
message = message.replace('\n', ' ') # remove the newlines

Too many lists of Unique Words

This is a homework project from last week. I had problems, so I did not turn it in, but I like to go back and see if I can make these things work. I now have it printing the right words in alphabetical order. The problem is that it prints 3 separate lists of unique words, each with a different number of words. How can I fix this?
import string
def process_line(line_str,word_set):
    line_str=line_str.strip()
    list_of_words=line_str.split()
    for word in list_of_words:
        if word!="--":
            word=word.strip()
            word=word.strip(string.punctuation)
            word=word.lower()
            word_set.add(word)
def pretty_print(word_set):
    list_of_words=[]
    for w in word_set:
        list_of_words.append(w)
    list_of_words.sort()
    for w in list_of_words:
        print(w,end=" ")
word_set=set([])
fObject=open("gettysburg.txt")
for line_str in fObject:
    process_line(line_str,word_set)
    print("\nlength of the word set: ",len(word_set))
    print("\nUnique words in set: ")
    pretty_print(word_set)
Below is the output I get, I only want it to give me the last one with the 138 words. Appreciate any help.
length of the word set: 29
Unique words in set:
a ago all and are brought conceived continent created dedicated equal fathers forth four in liberty men nation new on our proposition score seven that the this to years
length of the word set: 71
Unique words in set:
a ago all altogether and any are as battlefield brought can civil come conceived continent created dedicate dedicated do endure engaged equal fathers field final fitting for forth four gave great have here in is it liberty live lives long men met might nation new now of on or our place portion proper proposition resting score seven should so testing that the their this those to war we whether who years
length of the word set: 138
Unique words in set:
a above add advanced ago all altogether and any are as battlefield be before birth brave brought but by can cause civil come conceived consecrate consecrated continent created dead dedicate dedicated detract devotion did died do earth endure engaged equal far fathers field final fitting for forget forth fought four freedom from full gave god government great ground hallow have here highly honored in increased is it larger last liberty little live lives living long measure men met might nation never new nobly nor not note now of on or our people perish place poor portion power proper proposition rather remaining remember resolve resting say score sense seven shall should so struggled take task testing that the their these they this those thus to under unfinished us vain war we what whether which who will work world years

Take the last 3 lines out of the for loop, so they run once after every line has been processed:
....
for line_str in fObject:
    process_line(line_str,word_set)
print("\nlength of the word set: ",len(word_set))
print("\nUnique words in set: ")
pretty_print(word_set)
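For comparison, here is a compact sketch of the same idea that collects the words first and reports only once at the end. The function name unique_words and the two sample lines are assumptions for illustration; the original reads its lines from gettysburg.txt.

```python
import string


def unique_words(lines):
    """Return the sorted unique words found in an iterable of text lines."""
    words = set()
    for line in lines:
        for word in line.split():
            word = word.strip(string.punctuation).lower()
            if word:  # skips tokens such as "--" that are pure punctuation
                words.add(word)
    return sorted(words)


sample = [
    "Four score and seven years ago -- our fathers",
    "brought forth, on this continent, a new nation.",
]
result = unique_words(sample)
# Report once, after the loop over all lines has finished.
print("length of the word set:", len(result))
print(" ".join(result))
```

Any iterable of lines works here, so an open file object can be passed in place of the sample list.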

Calculating the Letter Frequency in Python

I need to define a function that will slice a string according to a certain character, sum up those indices, divide by the number of times the character occurs in the string and then divide all that by the length of the text.
Here's what I have so far:
def ave_index(char):
    passage = "string"
    if char in passage:
        word = passage.split(char)
        words = len(word)
        number = passage.count(char)
        answer = word / number / len(passage)
        return(answer)
    elif char not in passage:
        return False
So far, the answers I've gotten when running this have been quite off the mark.
EDIT: The passage we were given to use as a string -
'Call me Ishmael. Some years ago - never mind how long precisely - having little or no money in my purse, and nothing particular to interest me on shore, I thought I would sail about a little and see the watery part of the world. It is a way I have of driving off the spleen and regulating the circulation. Whenever I find myself growing grim about the mouth; whenever it is a damp, drizzly November in my soul; whenever I find myself involuntarily pausing before coffin warehouses, and bringing up the rear of every funeral I meet; and especially whenever my hypos get such an upper hand of me, that it requires a strong moral principle to prevent me from deliberately stepping into the street, and methodically knocking people's hats off - then, I account it high time to get to sea as soon as I can. This is my substitute for pistol and ball. With a philosophical flourish Cato throws himself upon his sword; I quietly take to the ship. There is nothing surprising in this. If they but knew it, almost all men in their degree, some time or other, cherish very nearly the same feelings towards the ocean with me.'
when char = 's' the answer should be 0.5809489252885479

You can use Counter to check frequencies:
from collections import Counter
words = 'The passage we were given to use as a string - Call me Ishmael. Some years ago - never mind how long precisely - having little or no money in my purse, and nothing particular to interest me on shore, I thought I would sail about a little and see the watery part of the world. It is a way I have of driving off the spleen and regulating the circulation. Whenever I find myself growing grim about the mouth; whenever it is a damp, drizzly November in my soul; whenever I find myself involuntarily pausing before coffin warehouses, and bringing up the rear of every funeral I meet; and especially whenever my hypos get such an upper hand of me, that it requires a strong moral principle to prevent me from deliberately stepping into the street, and methodically knocking people\'s hats off - then, I account it high time to get to sea as soon as I can. This is my substitute for pistol and ball. With a philosophical flourish Cato throws himself upon his sword; I quietly take to the ship. There is nothing surprising in this. If they but knew it, almost all men in their degree, some time or other, cherish very nearly the same feelings towards the ocean with me.'
freqs = Counter(list(words)) # list(words) returns a list of all the characters in words, then Counter will calculate the frequencies
print(float(freqs['s']) / len(words))

The problem is how you are counting the letters. Take the string hello world and suppose you want to count how many l characters it contains. We know there are 3, but if you do a split:
>>> s = 'hello world'
>>> s.split('l')
['he', '', 'o wor', 'd']
This gives 4 pieces, one more than the number of occurrences. Further, we have to get the position of each instance of the character in the string.
The enumerate built-in helps us out here:
>>> s = 'hello world'
>>> c = 'l' # The letter we are looking for
>>> results = [k for k,v in enumerate(s) if v == c]
>>> results
[2, 3, 9]
Now we have the total number of occurrences len(results), and the positions in the string where the letter occurs.
The final "trick" to this problem is to make sure you divide by a float, in order to get the proper result (in Python 2, dividing one integer by another truncates).
Working against your sample text (stored in s):
>>> c = 's'
>>> results = [k for k,v in enumerate(s) if v == c]
>>> results_sum = sum(results)
>>> (results_sum / len(results)) / float(len(s))
0.5804132973944295
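The steps above can be wrapped into a function. This is a sketch under some assumptions: it targets Python 3, where / is always true division, it takes the passage as a parameter, and it returns None when the character is absent. Note that in Python 2 the expression results_sum / len(results) above is an integer division, so the mean index is truncated there; that may account for the small gap from the value expected in the question.

```python
def ave_index(passage, char):
    """Mean index of `char` in `passage`, divided by the length of `passage`."""
    positions = [i for i, c in enumerate(passage) if c == char]
    if not positions:
        return None  # the character does not occur at all
    mean_position = sum(positions) / len(positions)
    return mean_position / len(passage)


# 'l' occurs at indices 2, 3 and 9 of an 11-character string.
print(ave_index("hello world", "l"))  # about 0.4242, i.e. (14 / 3) / 11
```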

Puzzle Solver - Determining what variables belong together based on given facts & constraints by calculating probabilities

Before I say anything else - the problem may look lengthy but it's actually just 1 problem in the core, that kind of gets repeated. I'm trying to create a "puzzle solver" that has to do with probability. This is a model where the person solving the problem has to do it without computer help, but I want to be able to have a script, where I can just change the variables for different conditions. We use this test for potential employees and we have to come up with the answers. It's pretty tedious, so I was hoping someone could give me a hand trying to do it in Python? I'm still learning Python but because I see how easily everyone can "manipulate" it to give them what they need I was hoping to learn how to do that, so I don't have to do this over and over again by hand.
Here's an example of the test -
Given facts:
There are 5 houses in a row, numbered from left to right as 1, 2, 3, 4, 5.
Each house is painted a different color: Red, Orange, Yellow, Green, or Blue.
Each house is owned by a different person: Ann, Bob, Carl, Dorothy, or Ed.
Each house has a different number of windows: one, two, three, four, five
Each house was built in a different year: 1970, 1980, 1990, 2000, or 2010.
Each person knows a different language: Spanish, French, Latin, German, Italian.
Then we give them the constraints:
Bob lives in the yellow house.
Ann knows Latin.
There are 5 windows on the orange house.
Dorothy lives in a house with 2 windows.
The orange house is immediately to the right of the green house.
The German speaker lives in the house built in 2000.
The red house was built in 1970.
The middle house has 3 windows.
Carl lives in the first house.
The house built in 1980 is next to the house of the Italian speaker.
The house built in 1970 is next to the house where the French speaker lives.
The house built in 1990 has 1 window.
Ed lives in a house built in 2010.
Carl lives next to the blue house.
The potential employee has to figure out:
for each house numbered 1 to 5, who lives there, what color is the house, how many windows it has, when was it built and what language does the occupant speak.
And that is exactly what I want to put into Python!
I gave it a go and here's my reasoning behind it:
def permutations(x):
    outlist = []
    for a in x:
        for b in x:
            if b == a:
                continue
            for c in x:
                if c == b or c == a:
                    continue
                for d in x:
                    if d == a or d == b or d == c:
                        continue
                    for e in x:
                        if e == a or e==b or e==c or e==d:
                            continue
                        outlist.append([a,b,c,d,e])
    return outlist
The "checks" in the loop are so that the loop continues if an entry would be repeated, so that the inner loops don’t have to execute unless the early loops are valid - saves time!
Given a list x of five elements, this function returns a list of lists, each of which is a permutation of the original five elements where no one element is equal to another.
So, if the list input is x = [1,2,3,4,5], the returned output is a list of possible permutations of this:
Outlist = [[1,2,3,4,5],[1,2,3,5,4],[1,2,4,3,5],[1,2,4,5,3], ...]
which will have 5! = 120 elements.
So, I know how it works in theory, but writing it down in Python proved too much for me to "translate".
I assigned the name variables (Ann,Bob,Carl,Dorothy,Ed) one of these permutations (say [1,2,5,4,3]), which means that Ann lives in house 1, Bob lives in house 2, Carl lives in house 5, Dorothy lives in house 4, Ed lives in house 3.
Similarly, I know you can assign to the color variables (Red,Orange,Yellow,Green,Blue) another of these permutations (say [5,4,3,1,2]) which means that house 5 is red, house 4 is orange, house 3 is yellow, house 1 is green and house 2 is blue.
You can assign the same or another permutation to the number of windows (one,two,three,four,five), the year the house was built (Seven,Eight,Nine,Zero,Ten) and the language spoken.
And this is where I get really lost because I'm having a hard time understanding how the same numbers can be reused - don't they get written over in such cases?
First things first though - it's better (more time efficient) if we first check whether the clues are true for this assignment. If not, the person taking the test can go to another assignment!
Coding-wise this is how I imagined it but my limited knowledge of Python didn't really help me write "proper code":
a) check if Bob lives in the yellow house, by Bob == Yellow
(that is, the house number assigned to Bob is the same as the house number assigned to Yellow).
b) check if the house built in 1970 is next to the house where the French speaker lives, do absolute value calculation ->
abs(Seven - French) == 1
Meaning the house numbers assigned to Seven and French differ by exactly 1.
Further on I know there are additional checks and all of the them must be passed as True for the five permutation assignments to be the solution of the puzzle.
Then I made an assignment of the permutation to the variables using a loop:
for a in outlist:
    (Ann,Bob,Carl,Dorothy,Ed) = a
It will assign Ann the value a[0], Bob the value a[1], Carl the value a[2], Dorothy the value a[3] and Ed the value a[4]. Because we loop through all permutations in outlist (the list of permutation lists returned by the function), every possible assignment gets tried.
Another problem - making a list of lists... showing to be a bit of a struggle.
I know I have to write five nested loops of assignment to the variables of interest. To verify whether the assignment satisfies the clues, I thought about checking a subset of the clues incrementally in each loop as soon as I have a partial assignment, so that I wouldn't enumerate the inner loops unless that subset of clues is satisfied. Again, this lets the program run faster and be more efficient.
Here's an attempt at the first loop, which is (for instance) over names. I know Carl must live in house 1, and the other loops don't get executed if this is not true! On paper, you have to keep repeating the process until Carl == 1!
Attempt at writing code:
for a in outlist:
    (Ann,Bob,Carl,Dorothy,Ed) = a
    if Carl != 1:
        continue
    for b in outlist:
        (Red,Orange,Yellow,Green,Blue) = b
        if ...
With this code, the inner four loops only execute when Carl == 1.
I have to continue I know, but the overlap of variables is a problem here too.
AND FINALLY - I was advised to "time the function" by using the time module
time.time().
As far as I understand, time.time() returns the current time in seconds (as a float), so the difference has to be multiplied by 1000 to get milliseconds. Not sure HOW to get the right code tho.
import time
start = time.time()
#CODE
end = time.time()
print('Running Time: {} msecs'.format((end - start) * 1000))
Thank you for getting to the end of this! I find it very overwhelming and don't know where to start but I would sure love to have something like this doing all my permutations for me!

Use itertools.permutations() to generate permutations:
from itertools import permutations
for colors in permutations(range(5)):
    # colors is one permutation of the 5 integers 0 to 4, inclusive.
    pass  # test the colour constraints here
This uses numbers from 0 to 4 as that comes more natural to Python, but the principle is the same; if you want you can use range(1, 6) instead to generate permutations of integers 1 through to 5, inclusive.
Now nest your permutation loops. The outer loop is for the colour choices; each number representing the colour for that house. Test the constraints, eliminate all that don't fit (any combo that has orange not next to green doesn't fit). Where the constraints fit, loop over permutations for the owner, eliminate those that don't fit, loop over window counts for those that do, etc.
Use one function to test constraints, allowing for missing aspects, to keep testing simple.
You'll find that you can eliminate most combos very quickly very early.
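The nested loops described above could look roughly like this for the example puzzle from the question. It is a sketch rather than a definitive implementation: the helper names next_to and solve, the variable names and the dictionary layout are all assumptions, and house numbers run from 1 to 5 as in the question. Every variable holds a house number, and each group of five variables is unpacked from its own loop, so reusing the numbers 1 to 5 across groups does not overwrite anything.

```python
from itertools import permutations

HOUSES = range(1, 6)  # house numbers 1..5, left to right


def next_to(a, b):
    """True when two house numbers are adjacent."""
    return abs(a - b) == 1


def solve():
    """Return every assignment (attribute -> house number) that fits all clues."""
    solutions = []
    for ann, bob, carl, dorothy, ed in permutations(HOUSES):
        if carl != 1:                      # Carl lives in the first house
            continue
        for red, orange, yellow, green, blue in permutations(HOUSES):
            if bob != yellow:              # Bob lives in the yellow house
                continue
            if orange != green + 1:        # orange is immediately right of green
                continue
            if not next_to(carl, blue):    # Carl lives next to the blue house
                continue
            # one..five: the house that has that many windows
            for one, two, three, four, five in permutations(HOUSES):
                if five != orange or two != dorothy or three != 3:
                    continue
                for y1970, y1980, y1990, y2000, y2010 in permutations(HOUSES):
                    if y1970 != red or y1990 != one or y2010 != ed:
                        continue
                    for spanish, french, latin, german, italian in permutations(HOUSES):
                        if (latin == ann and german == y2000
                                and next_to(y1980, italian)
                                and next_to(y1970, french)):
                            solutions.append({
                                "Ann": ann, "Bob": bob, "Carl": carl,
                                "Dorothy": dorothy, "Ed": ed,
                                "red": red, "orange": orange, "yellow": yellow,
                                "green": green, "blue": blue,
                                "1 window": one, "2 windows": two,
                                "3 windows": three, "4 windows": four,
                                "5 windows": five,
                                "1970": y1970, "1980": y1980, "1990": y1990,
                                "2000": y2000, "2010": y2010,
                                "Spanish": spanish, "French": french,
                                "Latin": latin, "German": german,
                                "Italian": italian,
                            })
    return solutions


for solution in solve():
    for house in HOUSES:
        facts = [name for name, number in solution.items() if number == house]
        print(house, facts)
```

Because each clue is checked as soon as the variables it mentions are assigned, most of the 120 permutations at every level are discarded before the inner loops run.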

How to solve the "Mastermind" guessing game?

How would you create an algorithm to solve the following puzzle, "Mastermind"?
Your opponent has chosen four different colours from a set of six (yellow, blue, green, red, orange, purple). You must guess which they have chosen, and in what order. After each guess, your opponent tells you how many (but not which) of the colours you guessed were the right colour in the right place ["blacks"] and how many (but not which) were the right colour but in the wrong place ["whites"]. The game ends when you guess correctly (4 blacks, 0 whites).
For example, if your opponent has chosen (blue, green, orange, red), and you guess (yellow, blue, green, red), you will get one "black" (for the red), and two whites (for the blue and green). You would get the same score for guessing (blue, orange, red, purple).
I'm interested in what algorithm you would choose, and (optionally) how you translate that into code (preferably Python). I'm interested in coded solutions that are:
Clear (easily understood)
Concise
Efficient (fast in making a guess)
Effective (least number of guesses to solve the puzzle)
Flexible (can easily answer questions about the algorithm, e.g. what is its worst case?)
General (can be easily adapted to other types of puzzle than Mastermind)
I'm happy with an algorithm that's very effective but not very efficient (provided it's not just poorly implemented!); however, a very efficient and effective algorithm implemented inflexibly and impenetrably is not of use.
I have my own (detailed) solution in Python which I have posted, but this is by no means the only or best approach, so please post more! I'm not expecting an essay ;)

Key tools: entropy, greediness, branch-and-bound; Python, generators, itertools, decorate-undecorate pattern
In answering this question, I wanted to build up a language of useful functions to explore the problem. I will go through these functions, describing them and their intent. Originally, these had extensive docs, with small embedded unit tests tested using doctest; I can't praise this methodology highly enough as a brilliant way to implement test-driven-development. However, it does not translate well to StackOverflow, so I will not present it this way.
Firstly, I will be needing several standard modules and future imports (I work with Python 2.6).
from __future__ import division # No need to cast to float when dividing
import collections, itertools, math
I will need a scoring function. Originally, this returned a tuple (blacks, whites), but I found output a little clearer if I used a namedtuple:
Pegs = collections.namedtuple('Pegs', 'black white')
def mastermindScore(g1,g2):
    matching = len(set(g1) & set(g2))
    blacks = sum(1 for v1, v2 in itertools.izip(g1,g2) if v1 == v2)
    return Pegs(blacks, matching-blacks)
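For readers on Python 3, a rough equivalent of the scoring function is sketched below, checked against the worked example from the question. The snake_case name mastermind_score is an assumption, and the built-in zip stands in for itertools.izip. As in the original, the set intersection is only a valid count of shared colours when the secret contains no repeated colours.

```python
from collections import namedtuple

Pegs = namedtuple("Pegs", "black white")


def mastermind_score(secret, guess):
    """Score a guess against a secret whose colours are all distinct."""
    matching = len(set(secret) & set(guess))                 # colours in common
    blacks = sum(1 for s, g in zip(secret, guess) if s == g)  # right colour, right place
    return Pegs(blacks, matching - blacks)


secret = ("blue", "green", "orange", "red")
print(mastermind_score(secret, ("yellow", "blue", "green", "red")))
print(mastermind_score(secret, ("blue", "orange", "red", "purple")))
# Both guesses score one black and two whites, as described in the question.
```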
To make my solution general, I pass in anything specific to the Mastermind problem as keyword arguments. I have therefore made a function that creates these arguments once, and use the **kwargs syntax to pass it around. This also allows me to easily add new attributes if I need them later. Note that I allow guesses to contain repeats, but constrain the opponent to pick distinct colours; to change this, I only need change G below. (If I wanted to allow repeats in the opponent's secret, I would need to change the scoring function as well.)
def mastermind(colours, holes):
    return dict(
        G = set(itertools.product(colours,repeat=holes)),
        V = set(itertools.permutations(colours, holes)),
        score = mastermindScore,
        endstates = (Pegs(holes, 0),))
def mediumGame():
    return mastermind(("Yellow", "Blue", "Green", "Red", "Orange", "Purple"), 4)
Sometimes I will need to partition a set based on the result of applying a function to each element in the set. For instance, the numbers 1..10 can be partitioned into even and odd numbers by the function n % 2 (odds give 1, evens give 0). The following function returns such a partition, implemented as a map from the result of the function call to the set of elements that gave that result (e.g. { 0: evens, 1: odds }).
def partition(S, func, *args, **kwargs):
    partition = collections.defaultdict(set)
    for v in S: partition[func(v, *args, **kwargs)].add(v)
    return partition
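The even/odd example mentioned above can be run directly. This sketch repeats the function so that it is self-contained, and renames the local variable to groups (an assumption made here only so that it does not shadow the function name).

```python
import collections


def partition(S, func, *args, **kwargs):
    """Map each result of func to the set of elements of S that produced it."""
    groups = collections.defaultdict(set)
    for v in S:
        groups[func(v, *args, **kwargs)].add(v)
    return groups


parts = partition(range(1, 11), lambda n: n % 2)
print(dict(parts))  # key 0 holds the even numbers, key 1 the odd numbers
```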
I decided to explore a solver that uses a greedy entropic approach. At each step, it calculates the information that could be obtained from each possible guess, and selects the most informative guess. As the numbers of possibilities grow, this will scale badly (quadratically), but let's give it a try! First, I need a method to calculate the entropy (information) of a set of probabilities. This is just -∑p log p. For convenience, however, I will allow input that are not normalized, i.e. do not add up to 1:
def entropy(P):
    total = sum(P)
    return -sum(p*math.log(p, 2) for p in (v/total for v in P if v))
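A few values make the behaviour concrete. This is a Python 3 sketch of the same formula; using math.log2 instead of math.log(p, 2) is a substitution made here, not the original code.

```python
import math


def entropy(P):
    """Shannon entropy in bits of a list of (possibly unnormalized) weights."""
    total = sum(P)
    return -sum(p * math.log2(p) for p in (v / total for v in P if v))


print(entropy([1, 1]))        # two equally likely outcomes carry one bit
print(entropy([5, 5, 5, 5]))  # four equally likely outcomes carry two bits
print(entropy([3, 1]))        # an uneven split carries less than one bit
```

The second call shows that the weights need not sum to 1: only their proportions matter.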
So how am I going to use this function? Well, for a given set of possibilities, V, and a given guess, g, the information we get from that guess can only come from the scoring function: more specifically, how that scoring function partitions our set of possibilities. We want to make a guess that distinguishes best among the remaining possibilities (one that divides them into the largest number of small sets), because that means we are much closer to the answer. This is exactly what the entropy function above is putting a number to: a large number of small sets will score higher than a small number of large sets. All we need to do is plumb it in.
def decisionEntropy(V, g, score):
    return entropy(collections.Counter(score(gi, g) for gi in V).values())
Of course, at any given step what we will actually have is a set of remaining possibilities, V, and a set of possible guesses we could make, G, and we will need to pick the guess which maximizes the entropy. Additionally, if several guesses have the same entropy, prefer to pick one which could also be a valid solution; this guarantees the approach will terminate. I use the standard python decorate-undecorate pattern together with the built-in max method to do this:
def bestDecision(V, G, score):
    return max((decisionEntropy(V, g, score), g in V, g) for g in G)[2]
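To see the selection at work on something small enough to check by hand, here is a Python 3 sketch on a toy game with three colours and two holes. The toy game, the snake_case names and the plain-tuple score are assumptions for illustration; the logic mirrors decisionEntropy and bestDecision above.

```python
import math
from collections import Counter
from itertools import permutations, product


def score(secret, guess):
    """(blacks, whites) for a secret with distinct colours."""
    matching = len(set(secret) & set(guess))
    blacks = sum(1 for s, g in zip(secret, guess) if s == g)
    return (blacks, matching - blacks)


def entropy(counts):
    total = sum(counts)
    return -sum(p * math.log2(p) for p in (c / total for c in counts if c))


def decision_entropy(candidates, guess):
    """Entropy of the split that `guess` induces on the remaining candidates."""
    return entropy(Counter(score(secret, guess) for secret in candidates).values())


def best_decision(candidates, guesses):
    # Decorate each guess with (entropy, is-it-a-possible-secret), take the max,
    # then undecorate by picking the guess back out of the tuple.
    return max((decision_entropy(candidates, g), g in candidates, g)
               for g in guesses)[2]


colours = "ABC"
V = set(permutations(colours, 2))      # possible secrets: distinct colours
G = set(product(colours, repeat=2))    # allowed guesses: repeats permitted
first_guess = best_decision(V, G)
print(first_guess, decision_entropy(V, first_guess))
```

In this toy game a guess with two distinct colours splits the six candidates into groups of sizes 1, 1, 2 and 2 (about 1.92 bits), while a repeated-colour guess such as ('A', 'A') only splits them 4 to 2 (about 0.92 bits), so the chosen guess is one of the possible secrets.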
Now all I need to do is repeatedly call this function until the right result is guessed. I went through a number of implementations of this algorithm until I found one that seemed right. Several of my functions will want to approach this in different ways: some enumerate all possible sequences of decisions (one per guess the opponent may have made), while others are only interested in a single path through the tree (if the opponent has already chosen a secret, and we are just trying to reach the solution). My solution is a "lazy tree", where each part of the tree is a generator that can be evaluated or not, allowing the user to avoid costly calculations they won't need. I also ended up using two more namedtuples, again for clarity of code.
Node = collections.namedtuple('Node', 'decision branches')
Branch = collections.namedtuple('Branch', 'result subtree')
def lazySolutionTree(G, V, score, endstates, **kwargs):
    decision = bestDecision(V, G, score)
    branches = (Branch(result, None if result in endstates else
                       lazySolutionTree(G, pV, score=score, endstates=endstates))
                for (result, pV) in partition(V, score, decision).iteritems())
    yield Node(decision, branches) # Lazy evaluation
The following function evaluates a single path through this tree, based on a supplied scoring function:
def solver(scorer, **kwargs):
    lazyTree = lazySolutionTree(**kwargs)
    steps = []
    while lazyTree is not None:
        t = next(lazyTree) # Evaluate node
        result = scorer(t.decision)
        steps.append((t.decision, result))
        subtrees = [b.subtree for b in t.branches if b.result == result]
        if len(subtrees) == 0:
            raise Exception("No solution possible for given scores")
        lazyTree = subtrees[0]
    # endstates is only reachable through kwargs inside this function
    assert(result in kwargs['endstates'])
    return steps
This can now be used to build an interactive game of Mastermind where the user scores the computer's guesses. Playing around with this reveals some interesting things. For example, the most informative first guess is of the form (yellow, yellow, blue, green), not (yellow, blue, green, red). Extra information is gained by using exactly half the available colours. This also holds for 6-colour 3-hole Mastermind — (yellow, blue, green) — and 8-colour 5-hole Mastermind — (yellow, yellow, blue, green, red).
But there are still many questions that are not easily answered with an interactive solver. For instance, what is the greatest number of steps needed by the greedy entropic approach? And how many inputs take this many steps? To make answering these questions easier, I first produce a simple function that turns the lazy tree above into a set of paths through this tree, i.e. for each possible secret, a list of guesses and scores.
def allSolutions(**kwargs):
    def solutions(lazyTree):
        return ((((t.decision, b.result),) + solution
                 for t in lazyTree for b in t.branches
                 for solution in solutions(b.subtree))
                if lazyTree else ((),))
    return solutions(lazySolutionTree(**kwargs))
Finding the worst case is a simple matter of finding the longest solution:
def worstCaseSolution(**kwargs):
    return max((len(s), s) for s in allSolutions(**kwargs))[1]
It turns out that this solver will always complete in 5 steps or fewer. Five steps! I know that when I played Mastermind as a child, I often took longer than this. However, since creating this solver and playing around with it, I have greatly improved my technique, and 5 steps is indeed an achievable goal even when you don't have time to calculate the entropically ideal guess at each step ;)
How likely is it that the solver will take 5 steps? Will it ever finish in 1, or 2, steps? To find that out, I created another simple little function that calculates the solution length distribution:
def solutionLengthDistribution(**kwargs):
    return collections.Counter(len(s) for s in allSolutions(**kwargs))
For the greedy entropic approach, with repeats allowed: 7 cases take 2 steps; 55 cases take 3 steps; 229 cases take 4 steps; and 69 cases take the maximum of 5 steps.
Of course, there's no guarantee that the greedy entropic approach minimizes the worst-case number of steps. The final part of my general-purpose language is an algorithm that decides whether or not there are any solutions for a given worst-case bound. This will tell us whether greedy entropic is ideal or not. To do this, I adopt a branch-and-bound strategy:
def solutionExists(maxsteps, G, V, score, **kwargs):
    if len(V) == 1: return True
    partitions = [partition(V, score, g).values() for g in G]
    maxSize = max(len(P) for P in partitions) ** (maxsteps - 2)
    partitions = (P for P in partitions if max(len(s) for s in P) <= maxSize)
    return any(all(solutionExists(maxsteps-1,G,s,score) for l,s in
                   sorted((-len(s), s) for s in P)) for i,P in
               sorted((-entropy(len(s) for s in P), P) for P in partitions))
This is definitely a complex function, so a bit more explanation is in order. The first step is to partition the remaining solutions based on their score after a guess, as before, but this time we don't know what guess we're going to make, so we store all partitions. Now we could just recurse into every one of these, effectively enumerating the entire universe of possible decision trees, but this would take a horrifically long time. Instead I observe that, if at this point there is no partition that divides the remaining solutions into more than n sets, then there can be no such partition at any future step either. If we have k steps left, that means we can distinguish between at most n^(k-1) solutions before we run out of guesses (on the last step, we must always guess correctly). Thus we can discard any partitions that contain a score mapped to more than this many solutions. This is the next two lines of code.
The final line of code does the recursion, using Python's any and all functions for clarity, and trying the highest-entropy decisions first to hopefully minimize runtime in the positive case. It also recurses into the largest part of the partition first, as this is the most likely to fail quickly if the decision was wrong. Once again, I use the standard decorate-undecorate pattern, this time to wrap Python's sorted function.
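The pruning bound described above can be sketched in isolation. This is only an illustration with hypothetical helper names (max_distinguishable, prune) and toy data; here k counts the guesses left after the current one, which corresponds to maxsteps - 1 in solutionExists.

```python
def max_distinguishable(n, k):
    """Upper bound on the candidates that k remaining guesses can resolve when
    no guess splits them into more than n groups (the last guess must be right)."""
    return n ** (k - 1)

def prune(partitions, k):
    """partitions: list of partitions, each a list of candidate groups.
    Drop every partition containing a group too large to be resolved in time."""
    n = max(len(p) for p in partitions)
    limit = max_distinguishable(n, k)
    return [p for p in partitions if max(len(group) for group in p) <= limit]

partitions = [
    [[1, 2, 3], [4]],   # largest group has 3 candidates
    [[1, 2], [3, 4]],   # largest group has 2 candidates
    [[1, 2, 3, 4]],     # guess tells us nothing
]
print(prune(partitions, 2))  # limit is 2 ** 1 = 2
print(prune(partitions, 3))  # limit is 2 ** 2 = 4
```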
def lowerBoundOnWorstCaseSolution(**kwargs):
    for steps in itertools.count(1):
        if solutionExists(maxsteps=steps, **kwargs):
            return steps
By calling solutionExists repeatedly with an increasing number of steps, we get a strict lower bound on the number of steps needed in the worst case for a Mastermind solution: 5 steps. The greedy entropic approach is indeed optimal.
Out of curiosity, I invented another guessing game, which I nicknamed "twoD". In this, you try to guess a pair of numbers; at each step, you get told if your answer is correct, if the numbers you guessed are no less than the corresponding ones in the secret, and if the numbers are no greater.
Comparison = collections.namedtuple('Comparison', 'less greater equal')
def twoDScorer(x, y):
    return Comparison(all(r[0] <= r[1] for r in zip(x, y)),
                      all(r[0] >= r[1] for r in zip(x, y)),
                      x == y)
def twoD():
    G = set(itertools.product(xrange(5), repeat=2))
    # Wrap the namedtuple in a list: set(Comparison(...)) would iterate over its fields
    return dict(G = G, V = G, score = twoDScorer,
                endstates = set([Comparison(True, True, True)]))
For this game, the greedy entropic approach has a worst case of five steps, but there is a better solution possible with a worst case of four steps, confirming my intuition that myopic greediness is only coincidentally ideal for Mastermind. More importantly, this has shown how flexible my language is: all the same methods work for this new guessing game as did for Mastermind, letting me explore other games with a minimum of extra coding.
What about performance? Obviously, being implemented in Python, this code is not going to be blazingly fast. I've also dropped some possible optimizations in favour of clear code.
One cheap optimization is to observe that, on the first move, most guesses are basically identical: (yellow, blue, green, red) is really no different from (blue, red, green, yellow), or (orange, yellow, red, purple). This greatly reduces the number of guesses we need consider on the first step — otherwise the most costly decision in the game.
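One way to sketch this first-move reduction is to group guesses by their "shape", meaning how many times each colour is repeated. The function names below (pattern, first_move_representatives) are made up for this example, and the trick is only valid while the candidate set is still the complete, symmetric set of codes.

```python
import collections
import itertools

def pattern(guess):
    """Canonical shape of a guess: colour multiplicities, largest first.
    Two first guesses with the same shape differ only by renaming colours
    and reordering holes, which cannot matter before any feedback exists."""
    return tuple(sorted(collections.Counter(guess).values(), reverse=True))

def first_move_representatives(colours, holes):
    """Keep one guess per shape (the first one met in product order)."""
    reps = {}
    for guess in itertools.product(colours, repeat=holes):
        reps.setdefault(pattern(guess), guess)
    return reps

reps = first_move_representatives(range(6), 4)
print(len(reps))  # 5 shapes to evaluate instead of 1296 guesses
for shape, guess in sorted(reps.items()):
    print(shape, guess)
```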
However, because of the large runtime growth rate of this problem, I was not able to solve the 8-colour, 5-hole Mastermind problem, even with this optimization. Instead, I ported the algorithms to C++, keeping the general structure the same and employing bitwise operations to boost performance in the critical inner loops, for a speedup of many orders of magnitude. I leave this as an exercise to the reader :)
Addendum, 2018: It turns out the greedy entropic approach is not optimal for the 8-colour, 4-hole Mastermind problem either, with a worst-case length of 7 steps when an algorithm exists that takes at most 6!

I once wrote a "Jotto" solver which is essentially "Master Mind" with words. (We each pick a word and we take turns guessing at each other's word, scoring "right on" (exact) matches and "elsewhere" (correct letter/color, but wrong placement).)
The key to solving such a problem is the realization that the scoring function is symmetric.
In other words if score(myguess) == (1,2) then I can use the same score() function to compare my previous guess with any other possibility and eliminate any that don't give exactly the same score.
Let me give an example: The hidden word (target) is "score" ... the current guess is "fools" --- the score is 1,1 (one letter, 'o', is "right on"; another letter, 's', is "elsewhere"). I can eliminate the word "guess" because score("guess") against "fools" returns (1,0) (the final 's' matches, but nothing else does). So the word "guess" is not consistent with "fools" having scored (1,1) against the unknown word.
So I now can walk through every five letter word (or combination of five colors/letters/digits etc) and eliminate anything that doesn't score 1,1 against "fools." Do that at each iteration and you'll very rapidly converge on the target. (For five letter words I was able to get within 6 tries every time ... and usually only 3 or 4). Of course there's only 6000 or so "words" and you're eliminating close to 95% for each guess.
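The elimination step can be sketched as below. The score function here is a compact stand-in written for this example (it is not the code from my old program), and the five-word list is only a toy dictionary.

```python
import collections

def score(guess, target):
    """(exact, elsewhere) for two equal-length words; a compact stand-in."""
    exact = sum(g == t for g, t in zip(guess, target))
    shared = sum((collections.Counter(guess) & collections.Counter(target)).values())
    return (exact, shared - exact)

def consistent(candidates, guess, observed):
    """Keep the candidates that would have produced the observed score for guess."""
    return [w for w in candidates if score(w, guess) == observed]

words = ["score", "guess", "fools", "store", "shore"]
observed = score("fools", "score")   # what the opponent reports: (1, 1)
remaining = consistent(words, "fools", observed)
print(observed, remaining)           # "guess" and "fools" itself are eliminated
```

Because the scoring is symmetric, the same function that the opponent applies to my guess can be applied to every candidate, with no knowledge of the target.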
Note: for the following discussion I'm talking about five letter "combinations" rather than four elements of six colors. The same algorithms apply; however, the problem is orders of magnitude smaller for the old "Master Mind" game ... there are only 1296 combinations (6**4) of colored pegs in the classic "Master Mind" game, assuming duplicates are allowed. The line of reasoning that leads to the convergence involves some combinatorics: there are 20 non-winning possible scores for a five element target (evaluate n = [(a,b) for a in range(5) for b in range(6) if a+b <= 5] to see all of them if you're curious). We would, therefore, expect that any random valid selection would have a roughly 5% chance of matching our score ... the other 95% won't and therefore will be eliminated for each scored guess. This doesn't account for possible clustering in word patterns but the real world behavior is close enough for words and definitely even closer for "Master Mind" rules. However, with only 6 colors in 4 slots we only have 14 possible non-winning scores so our convergence isn't quite as fast.
For Jotto the two minor challenges are: generating a good word list (awk 'length($0)==5' /usr/share/dict/words or similar on a UNIX system) and what to do if the user has picked a word that is not in our dictionary (generate every letter combination, 'aaaaa' through 'zzzzz' --- which is 26 ** 5 ... or ~11.9 million). A trivial combination generator in Python takes about 1 minute to generate all those strings ... an optimized one should do far better. (I can also add a requirement that every "word" have at least one vowel ... but this constraint doesn't help much --- 5 vowels * 5 possible locations for that and then multiplied by 26 ** 4 other combinations).
For Master Mind you use the same combination generator ... but with only 4 or 5 "letters" (colors). Every 6-color combination (15,625 of them) can be generated in under a second (using the same combination generator as I used above).
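For illustration, a lazy combination generator can be sketched with itertools.product. This is not the class I mention elsewhere in this answer, just a minimal stand-in; the names combinations_of and letter_strings are made up for the example.

```python
import itertools
import string

def combinations_of(symbols, length):
    """Lazily yield every sequence of the given length over symbols, in counting order."""
    for combo in itertools.product(symbols, repeat=length):
        yield combo

def letter_strings(length=5, alphabet=string.ascii_lowercase):
    """Yield 'aaaaa', 'aaaab', ... without building the whole list in memory."""
    for combo in combinations_of(alphabet, length):
        yield "".join(combo)

first_three = list(itertools.islice(letter_strings(), 3))
pegs = sum(1 for _ in combinations_of(range(6), 4))   # 6 colours in 4 slots
print(first_three, pegs, 26 ** 5)
```

Since nothing is materialized up front, the same generator can feed an elimination filter one candidate at a time.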
If I was writing this "Jotto" program today, in Python for example, I would "cheat" by having a thread generating all the letter combos in the background while I was still eliminating words from the dictionary (while my opponent was scoring me, guessing, etc). As I generated them I'd also eliminate against all guesses thus far. Thus I would, after I'd eliminated all known words, have a relatively small list of possibilities and against a human player I've "hidden" most of my computation lag by doing it in parallel to their input. (And, if I wrote a web server version of such a program I'd have my web engine talk to a local daemon to ask for sequences consistent with a set of scores. The daemon would keep a locally generated list of all letter combinations and would use a select.select() model to feed possibilities back to each of the running instances of the game --- each would feed my daemon word/score pairs which my daemon would apply as a filter on the possibilities it feeds back to that client).
(By comparison I wrote my version of "Jotto" about 20 years ago on an XT using Borland TurboPascal ... and it could do each elimination iteration --- starting with its compiled-in list of a few thousand words --- in well under a second. I built its word list by writing a simple letter combination generator (see below) ... saving the results to a moderately large file, then running my word processor's spell check on that with a macro to delete everything that was "mis-spelled" --- then I used another macro to wrap all the remaining lines in the correct punctuation to make them valid static assignments to my array, which was a #include file to my program. All that let me build a standalone game program that "knew" just about every valid English 5 letter word; the program was a .COM --- less than 50KB if I recall correctly).
For other reasons I've recently written a simple arbitrary combination generator in Python. It's about 35 lines of code and I've posted that to my "trite snippets" wiki on bitbucket.org ... it's not a "generator" in the Python sense ... but a class you can instantiate to an infinite sequence of "numeric" or "symbolic" combination of elements (essentially counting in any positive integer base).
You can find it at: Trite Snippets: Arbitrary Sequence Combination Generator
For the exact match part of our score() function you can just use this:
def score(this, that):
    '''Simple "Master Mind" scoring function'''
    exact = len([x for x,y in zip(this, that) if x==y])
    ### Calculating "other" (white pegs) goes here:
    ### ...
    ###
    return (exact,other)
I think this exemplifies some of the beauty of Python: zip() up the two sequences, return any that match, and take the length of the results.
Finding the matches in "other" locations is deceptively more complicated. If no repeats were allowed then you could simply use sets to find the intersections.
[In my earlier edit of this message, when I realized how I could use zip() for exact matches, I erroneously thought we could get away with other = len([x for x,y in zip(sorted(x), sorted(y)) if x==y]) - exact ... but it was late and I was tired. As I slept on it I realized that the method was flawed. Bad, Jim! Don't post without adequate testing! (Tested several cases that happened to work)].
In the past the approach I used was to sort both lists and compare the heads of each: if the heads are equal, increment the count and pop new items from both lists; otherwise pop the lesser of the two heads and try again. Break as soon as either list is empty.
This does work; but it's fairly verbose. The best I can come up with using that approach is just over a dozen lines of code:
other = 0
x = sorted(this) ## Implicitly converts to a list!
y = sorted(that)
while len(x) and len(y):
    if x[0] == y[0]:
        other += 1
        x.pop(0)
        y.pop(0)
    elif x[0] < y[0]:
        x.pop(0)
    else:
        y.pop(0)
other -= exact
Using a dictionary I can trim that down to about nine:
other = 0
counters = dict()
for i in this:
    counters[i] = counters.get(i,0) + 1
for i in that:
    if counters.get(i,0) > 0:
        other += 1
        counters[i] -= 1
other -= exact
(Using the "collections.Counter" class, available since Python 2.7 and 3.1, I could presumably reduce this a little more; three lines here are initializing the counters collection).
It's important to decrement the "counter" when we find a match and it's vital to test for counter greater than zero in our test. If a given letter/symbol appears in "this" once and "that" twice then it must only be counted as a match once.
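As a sketch of how collections.Counter could shorten this: the & operator on two Counters keeps the minimum count of each symbol, which is exactly the "count each match only once" rule. This is a variant written for illustration, not the code I benchmarked.

```python
from collections import Counter

def score(this, that):
    """Return (exact, other) for two equal-length sequences."""
    exact = sum(1 for x, y in zip(this, that) if x == y)
    # Counter & Counter keeps the minimum count of each symbol, so a symbol
    # appearing once in `this` and twice in `that` is counted only once.
    shared = sum((Counter(this) & Counter(that)).values())
    return (exact, shared - exact)

print(score("fools", "score"))
print(score("guess", "fools"))
```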
The first approach is definitely a bit trickier to write (one must be careful to avoid boundaries). Also in a couple of quick benchmarks (testing a million randomly generated pairs of letter patterns) the first approach takes about 70% longer than the one using dictionaries. (Generating the million pairs of strings using random.shuffle() took over twice as long as the slower of the scoring functions, on the other hand).
A formal analysis of the performance of these two functions would be complicated. The first method has two sorts, so that would be 2 * O(n*log(n)) ... and it iterates through at least one of the two strings and possibly has to iterate all the way to the end of the other string (best case O(n), worst case O(2n)) -- of course I'm mis-using big-O notation here, but this is just a rough estimate. The second case depends entirely on the performance characteristics of the dictionary. If we were using b-trees then the performance would be roughly O(n*log(n)) for creation and finding each element from the other string therein would be another O(n*log(n)) operation. However, Python dictionaries are very efficient and these operations should be close to constant time (very few hash collisions). Thus we'd expect a performance of roughly O(2n) ... which of course simplifies to O(n). That roughly matches my benchmark results.
Glancing over the Wikipedia article on "Master Mind" I see that Donald Knuth used an approach which starts similarly to mine (and 10 years earlier) but he added one significant optimization. After gathering every remaining possibility he selects whichever one would eliminate the largest number of possibilities on the next round. I considered such an enhancement to my own program and rejected the idea for practical reasons. In his case he was searching for an optimal (mathematical) solution. In my case I was concerned about playability (on an XT, preferably using less than 64KB of RAM, though I could switch to .EXE format and use up to 640KB). I wanted to keep the response time down in the realm of one or two seconds (which was easy with my approach but which would be much more difficult with the further speculative scoring). (Remember I was working in Pascal, under MS-DOS ... no threads, though I did implement support for crude asynchronous polling of the UI which turned out to be unnecessary)
If I were writing such a thing today I'd add a thread to do the better selection, too. This would allow me to give the best guess I'd found within a certain time constraint, to guarantee that my player didn't have to wait too long for my guess. Naturally my selection/elimination would be running while waiting for my opponent's guesses.

Have you seen Raymond Hettinger's attempt? It certainly matches up to some of your requirements.
I wonder how his solution compares to yours.

There is a great site about MasterMind strategy here. The author starts off with very simple MasterMind problems (using numbers rather than letters, and ignoring order and repetition) and gradually builds up to a full MasterMind problem (using colours, which can be repeated, in any order, even with the possibility of errors in the clues).
The seven tutorials that are presented are as follows:
Tutorial 1 - The simplest game setting (no errors, fixed order, no repetition)
Tutorial 2 - Code may contain blank spaces (no errors, fixed order, no repetition)
Tutorial 3 - Hints may contain errors (fixed order, no repetition)
Tutorial 4 - Game started from the middle (no errors, fixed order, no repetition)
Tutorial 5 - Digits / colours may be repeated (no errors, fixed order, each colour repeated at most 4 times)
Tutorial 6 - Digits / colours arranged in random order (no errors, random order, no repetition)
Tutorial 7 - Putting it all together (no errors, random order, each colour repeated at most 4 times)

Just thought I'd contribute my 90-odd lines of code. I've built upon @Jim Dennis' answer, mostly taking away the hint on symmetric scoring. I've implemented the minimax algorithm by Knuth as described in the Mastermind Wikipedia article, with one exception: I restrict my next move to the current list of possible solutions, as I found performance deteriorated badly when taking all possible solutions into account at each step. The current approach leaves me with a worst case of 6 guesses for any combination, each found in well under a second.
It's perhaps important to note that I make no restriction whatsoever on the hidden sequence, allowing for any number of repeats.
from itertools import product, tee
from random import choice

COLORS = 'red ', 'green', 'blue', 'yellow', 'purple', 'pink'#, 'grey', 'white', 'black', 'orange', 'brown', 'mauve', '-gap-'
HOLES = 4

def random_solution():
    """Generate a random solution."""
    return tuple(choice(COLORS) for i in range(HOLES))

def all_solutions():
    """Generate all possible solutions."""
    for solution in product(*tee(COLORS, HOLES)):
        yield solution

def filter_matching_result(solution_space, guess, result):
    """Filter solutions for matches that produce a specific result for a guess."""
    for solution in solution_space:
        if score(guess, solution) == result:
            yield solution

def score(actual, guess):
    """Calculate score of guess against actual."""
    result = []
    #Black pin for every color at right position
    actual_list = list(actual)
    guess_list = list(guess)
    black_positions = [number for number, pair in enumerate(zip(actual_list, guess_list)) if pair[0] == pair[1]]
    for number in reversed(black_positions):
        del actual_list[number]
        del guess_list[number]
        result.append('black')
    #White pin for every color at wrong position
    for color in guess_list:
        if color in actual_list:
            #Remove the match so we can't score it again for duplicate colors
            actual_list.remove(color)
            result.append('white')
    #Return a tuple, which is suitable as a dictionary key
    return tuple(result)

def minimal_eliminated(solution_space, solution):
    """For solution calculate how many possibilities from S would be eliminated for each possible colored/white score.
    The score of the guess is the least of such values."""
    result_counter = {}
    for option in solution_space:
        result = score(solution, option)
        if result not in result_counter.keys():
            result_counter[result] = 1
        else:
            result_counter[result] += 1
    return len(solution_space) - max(result_counter.values())

def best_move(solution_space):
    """Determine the best move in the solution space, being the one that restricts the number of hits the most."""
    elim_for_solution = dict((minimal_eliminated(solution_space, solution), solution) for solution in solution_space)
    max_elimintated = max(elim_for_solution.keys())
    return elim_for_solution[max_elimintated]

def main(actual = None):
    """Solve a game of mastermind."""
    #Generate random 'hidden' sequence if actual is None
    if actual == None:
        actual = random_solution()
    #Start the game off by choosing n unique colors
    current_guess = COLORS[:HOLES]
    #Initialize solution space to all solutions
    solution_space = all_solutions()
    guesses = 1
    while True:
        #Calculate current score
        current_score = score(actual, current_guess)
        #print '\t'.join(current_guess), '\t->\t', '\t'.join(current_score)
        if current_score == tuple(['black'] * HOLES):
            print guesses, 'guesses for\t', '\t'.join(actual)
            return guesses
        #Restrict solution space to exactly those hits that have current_score against current_guess
        solution_space = tuple(filter_matching_result(solution_space, current_guess, current_score))
        #Pick the candidate that will limit the search space most
        current_guess = best_move(solution_space)
        guesses += 1

if __name__ == '__main__':
    print max(main(sol) for sol in all_solutions())
Should anyone spot any possible improvements to the above code, then I would be very much interested in your suggestions.

To work out the "worst" case, instead of using entropy I look at the partition that has the maximum number of elements, then select the try that minimizes this maximum. This gives me the minimum number of remaining possibilities when I am not lucky (which is what happens in the worst case).
This always solves the standard case in 5 attempts, but that is not a full proof that 5 attempts are really needed, because it could happen that at the next step a bigger set of possibilities would give a better result than a smaller one (because its elements are easier to distinguish between).
Though for the "Standard game" with 1680 I have a simple formal proof:
For the first step, the try that gives the minimum for the partition with the maximum number is 0,0,1,1: 256. Playing 0,0,1,2 is not as good: 276.
For each subsequent try there are 14 outcomes (1 not placed and 3 placed is impossible), and the "4 placed" outcome gives a partition of size 1. This means that in the best case (all partitions the same size) the maximum partition is at least (number of possibilities - 1)/13, rounded up: the sizes are integers, so some partitions will be smaller and others larger, and the maximum is therefore rounded up.
If I apply this:
After first play (0,0,1,1) I am getting 256 left.
After second try: 20 = (256-1)/13, rounded up
After third try: 2 = (20-1)/13, rounded up
Then I have no choice but to try one of the two left for the 4th try.
If I am unlucky a fifth try is needed.
This proves we need at least 5 tries (but not that this is enough).
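The arithmetic above can be checked with a short sketch. The scoring function and the helper names (worst_partition, best_case_after) are assumptions made for this example; colours are numbered 0 to 5 and the candidate set is the 1296 codes of the 6-colour, 4-hole game with repeats.

```python
import collections
import itertools

def score(secret, guess):
    """(placed, not placed) for two codes of equal length."""
    exact = sum(s == g for s, g in zip(secret, guess))
    shared = sum((collections.Counter(secret) & collections.Counter(guess)).values())
    return (exact, shared - exact)

def worst_partition(candidates, guess):
    """Size of the largest group of candidates sharing one score for guess."""
    return max(collections.Counter(score(c, guess) for c in candidates).values())

def best_case_after(n, outcomes=13):
    """Lower bound on the largest partition after one more try: the winning
    outcome holds one code and the other n - 1 spread over `outcomes` scores."""
    return -(-(n - 1) // outcomes)   # ceiling division

codes = list(itertools.product(range(6), repeat=4))
after_first = worst_partition(codes, (0, 0, 1, 1))
after_second = best_case_after(after_first)
after_third = best_case_after(after_second)
print(after_first, after_second, after_third)
print(worst_partition(codes, (0, 0, 1, 2)))
```

With at least two possibilities still open after the third try, the fourth try can fail, which is where the fifth try comes from.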

Here is a generic algorithm I wrote that uses numbers to represent the different colours. Easy to change, but I find numbers to be a lot easier to work with than strings.
You can feel free to use any whole or part of this algorithm, as long as credit is given accordingly.
Please note I'm only a Grade 12 Computer Science student, so I am willing to bet that there are definitely more optimized solutions available.
Regardless, here's the code:
import random

def main():
    userAns = raw_input("Enter your tuple, and I will crack it in six moves or less: ")
    play(ans=eval("("+userAns+")"),guess=(0,0,0,0),previousGuess=[])

def play(ans=(6,1,3,5),guess=(0,0,0,0),previousGuess=[]):
    if(guess==(0,0,0,0)):
        guess = genGuess(guess,ans)
    else:
        checker = -1
        while(checker==-1):
            guess,checker = genLogicalGuess(guess,previousGuess,ans)
    print guess, ans
    if not(guess==ans):
        previousGuess.append(guess)
        base = check(ans,guess)
        play(ans=ans,guess=base,previousGuess=previousGuess)
    else:
        print "Found it!"

def genGuess(guess,ans):
    guess = []
    for i in range(0,len(ans),1):
        guess.append(random.randint(1,6))
    return tuple(guess)

def genLogicalGuess(guess,previousGuess,ans):
    newGuess = list(guess)
    count = 0
    #Generate guess
    for i in range(0,len(newGuess),1):
        if(newGuess[i]==-1):
            newGuess.insert(i,random.randint(1,6))
            newGuess.pop(i+1)
    for item in previousGuess:
        for i in range(0,len(newGuess),1):
            if((newGuess[i]==item[i]) and (newGuess[i]!=ans[i])):
                newGuess.insert(i,-1)
                newGuess.pop(i+1)
                count+=1
    if(count>0):
        return guess,-1
    else:
        guess = tuple(newGuess)
        return guess,0

def check(ans,guess):
    base = []
    for i in range(0,len(zip(ans,guess)),1):
        if not(zip(ans,guess)[i][0] == zip(ans,guess)[i][1]):
            base.append(-1)
        else:
            base.append(zip(ans,guess)[i][1])
    return tuple(base)

main()

Here's a link to a pure Python solver for Mastermind(tm): http://code.activestate.com/recipes/496907-mastermind-style-code-breaking/ It has a simple version, a way to experiment with various guessing strategies, performance measurement, and an optional C accelerator.
The core of the recipe is short and sweet:
import random
from itertools import izip, imap

digits = 4
fmt = '%0' + str(digits) + 'd'
searchspace = tuple([tuple(map(int,fmt % i)) for i in range(0,10**digits)])

def compare(a, b, imap=imap, sum=sum, izip=izip, min=min):
    count1 = [0] * 10
    count2 = [0] * 10
    strikes = 0
    for dig1, dig2 in izip(a,b):
        if dig1 == dig2:
            strikes += 1
        count1[dig1] += 1
        count2[dig2] += 1
    balls = sum(imap(min, count1, count2)) - strikes
    return (strikes, balls)

def rungame(target, strategy, verbose=True, maxtries=15):
    possibles = list(searchspace)
    for i in xrange(maxtries):
        g = strategy(i, possibles)
        if verbose:
            print "Out of %7d possibilities. I'll guess %r" % (len(possibles), g),
        score = compare(g, target)
        if verbose:
            print ' ---> ', score
        if score[0] == digits:
            if verbose:
                print "That's it. After %d tries, I won." % (i+1,)
            break
        possibles = [n for n in possibles if compare(g, n) == score]
    return i+1

def strategy_allrand(i, possibles):
    return random.choice(possibles)

if __name__ == '__main__':
    hidden_code = random.choice(searchspace)
    rungame(hidden_code, strategy_allrand)
Here is what the output looks like:
Out of 10000 possibilities. I'll guess (6, 4, 0, 9) ---> (1, 0)
Out of 1372 possibilities. I'll guess (7, 4, 5, 8) ---> (1, 1)
Out of 204 possibilities. I'll guess (1, 4, 2, 7) ---> (2, 1)
Out of 11 possibilities. I'll guess (1, 4, 7, 1) ---> (3, 0)
Out of 2 possibilities. I'll guess (1, 4, 7, 4) ---> (4, 0)
That's it. After 5 tries, I won.

My friend was considering a relatively simple case: 8 colors, no repeats, no blanks.
With no repeats, there's no need for the max entropy consideration: all guesses have the same entropy, and picking the first candidate or a random one works fine.
Here's the full code to solve that variant:
# SET UP
import random
import itertools
colors = ('red', 'orange', 'yellow', 'green', 'blue', 'indigo', 'violet', 'ultra')

# ONE FUNCTION REQUIRED
def EvaluateCode(guess, secret_code):
    key = []
    for i in range(0, 4):
        for j in range(0, 4):
            if guess[i] == secret_code[j]:
                key += ['black'] if i == j else ['white']
    # sort so the order of the key pegs does not leak which position matched
    return sorted(key)

# MAIN CODE
# choose secret code
secret_code = random.sample(colors, 4)
print ('(shh - secret code is: ', secret_code, ')\n', sep='')
# create the full list of permutations
full_code_list = list(itertools.permutations(colors, 4))
N_guess = 0
while True:
    N_guess += 1
    print ('Attempt #', N_guess, '\n-----------', sep='')
    # make a random guess
    guess = random.choice(full_code_list)
    print ('guess:', guess)
    # evaluate the guess and get the key
    key = EvaluateCode(guess, secret_code)
    print ('key:', key)
    if key == ['black', 'black', 'black', 'black']:
        break
    # remove codes from the code list that don't match the key
    full_code_list2 = []
    for i in range(0, len(full_code_list)):
        if EvaluateCode(guess, full_code_list[i]) == key:
            full_code_list2 += [full_code_list[i]]
    full_code_list = full_code_list2
    print ('N remaining: ', len(full_code_list), '\n', full_code_list, '\n', sep='')
print ('\nMATCH after', N_guess, 'guesses\n')
