How to avoid calculating 36k x 36k x 36k permutations - python

I have a nice planning exercise that I just can't get my head around (except by brute force, which is too big to handle). We are organizing a golf trip for 12 people. We play 4 days of golf, with 3 flights every day (so 12 flights in total).
We want to:
maximize the number of unique players that play each other in a flight,
but also minimize the number of double occurrences (2 players playing each other more than once).
With 12 players and 4 players per flight, I can create roughly 36k player combinations per flight per day, so it becomes pretty compute-intensive. Is there a smarter way to solve this? My gut feeling says Fibonacci can help out, but I'm not sure how exactly.
This is the code I have so far:
import random
import itertools
import pandas as pd

def make_player_combi(day):
    # list every pair of players that share a flight on this day
    player_combis = []
    for flight in day:
        #print flight
        for c in itertools.combinations(flight, 2):
            combi = list(sorted(c))
            player_combis.append('-'.join(combi))
    return player_combis

def make_score(a, b, c):
    df = pd.DataFrame(a + b + c, columns=['player_combi'])['player_combi']
    combi_counts = df.value_counts()
    pairs_playing = len(combi_counts)
    double_plays = combi_counts.value_counts().sort_index()
    return pairs_playing, double_plays

players = ['A','B','C', 'D', 'E', 'F', 'G', 'H', 'I', 'J','K','L']
available_players = players[:]
n = 0
combinations_per_day = []
for players_in_flight_1 in itertools.combinations(players, 4):
    available_players_flight_1 = players[:]
    available_players_flight_2 = players[:]
    available_players_flight_3 = players[:]
    for player in players_in_flight_1:
        # players in flight1 can no longer be used in flight 2 or 3
        available_players_flight_2.remove(player)
        available_players_flight_3.remove(player)
    for players_in_flight_2 in itertools.combinations(available_players_flight_2, 4):
        players_in_flight_3 = available_players_flight_3[:]
        for player in players_in_flight_2:
            players_in_flight_3.remove(player)
        n = n + 1
        print(n, players_in_flight_1, players_in_flight_2, tuple(players_in_flight_3))
        combinations_per_day.append([players_in_flight_1, players_in_flight_2, tuple(players_in_flight_3)])

n_max = 100 # limit to 100 entries max per day to save calculations
winning_list = []
max_score = 0
for day_1 in range(0, len(combinations_per_day[0:n_max])):
    print(day_1)
    for day_2 in range(0, len(combinations_per_day[0:n_max])):
        for day_3 in range(0, len(combinations_per_day[0:n_max])):
            a = make_player_combi(combinations_per_day[day_1])
            b = make_player_combi(combinations_per_day[day_2])
            c = make_player_combi(combinations_per_day[day_3])  # day 3 pairs, needed by make_score
            x, y = make_score(a, b, c)
            if x >= max_score:
                max_score = x
                my_result = {'pairs_playing' : x,
                             'double_plays' : y,
                             'max_day_1' : day_1,
                             'max_day_2' : day_2,
                             'max_day_3' : day_3
                             }
                winning_list.append(my_result)

I chose the suboptimal (but good enough) solution of running 5 million random samples and taking the lowest outcome. Thanks, all, for thinking along.
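The random-sampling approach mentioned in that comment could look roughly like the sketch below (Python 3). The comment does not say what was scored, so this assumes "lowest outcome" means the fewest pairs that meet on more than one day; PLAYERS, random_day, repeated_pairs and random_search are hypothetical names, not the poster's code.

```python
import itertools
import random
from collections import Counter

PLAYERS = "abcdefghijkl"  # hypothetical labels for the 12 players

def random_day(rng):
    """Shuffle the players and cut them into three flights of four."""
    order = list(PLAYERS)
    rng.shuffle(order)
    return tuple("".join(sorted(order[i:i + 4])) for i in (0, 4, 8))

def repeated_pairs(days):
    """Count pairs of players that share a flight on more than one day."""
    meetings = Counter(
        pair
        for day in days
        for flight in day
        for pair in itertools.combinations(flight, 2)
    )
    return sum(1 for times in meetings.values() if times > 1)

def random_search(n_days=4, samples=2000, seed=0):
    """Keep the sampled schedule with the fewest repeated pairs (assumed score)."""
    rng = random.Random(seed)
    best_days, best_repeats = None, None
    for _ in range(samples):
        days = [random_day(rng) for _ in range(n_days)]
        repeats = repeated_pairs(days)
        if best_repeats is None or repeats < best_repeats:
            best_days, best_repeats = days, repeats
    return best_days, best_repeats

if __name__ == "__main__":
    print(random_search())
```

Raising samples trades run time for a better chance of hitting a good schedule; nothing here guarantees an optimum.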

You can brute-force this by eliminating symmetries. Let's call the 12 players a,b,c,d,e,f,g,h,i,j,k,l, write a flight as its 4 players concatenated together (e.g. degj), and write a day's schedule as its three flights (e.g. abcd-efgh-ijkl).
The first day's flights are arbitrary: say the 3 flights are abcd-efgh-ijkl.
On the second and third day, you have fewer than (12 choose 4) * (8 choose 4) possibilities, because that counts each distinct schedule 6 times. For example, abcd-efgh-ijkl, efgh-abcd-ijkl and ijkl-efgh-abcd are all counted as separate, but they are essentially the same. In fact, you have (12 choose 4) * (8 choose 4) / 6 = 5775 different schedules.
Overall, this gives you 5775 * 5775 = 33350625, a manageable 33 million to check.
We can do a little better: we might as well assume that the day 2 and day 3 schedules are different, and not count twice the schedules that differ only by swapping day 2 and day 3. This gives us another factor of almost exactly 2.
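As a quick check of the arithmetic above, the counts can be reproduced with math.comb (Python 3.8+). The function name schedule_counts is only illustrative.

```python
from math import comb

def schedule_counts():
    """Reproduce the counting argument for 12 players in 3 flights of 4."""
    ordered = comb(12, 4) * comb(8, 4)   # choose flight 1, then flight 2; flight 3 is forced
    per_day = ordered // 6               # 3! orderings of the same three flights
    day2_day3 = per_day ** 2             # independent choices for day 2 and day 3
    unordered = comb(per_day, 2)         # distinct day 2 / day 3 schedules, order ignored
    return ordered, per_day, day2_day3, unordered

print(schedule_counts())  # (34650, 5775, 33350625, 16672425)
```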
Here's code that does all that:
import itertools
import collections

# schedules generates all possible schedules for a given day, up
# to symmetry.
def schedules(players):
    for c in itertools.combinations(players, 4):
        for d in itertools.combinations(players, 4):
            if set(c) & set(d):
                continue
            e = set(players) - set(c) - set(d)
            sc = ''.join(sorted(c))
            sd = ''.join(sorted(d))
            se = ''.join(sorted(e))
            if sc < sd < se:
                yield sc, sd, se

# all_schedules generates all possible (different up to symmetry) schedules for
# the three days.
def all_schedules(players):
    s = list(schedules(players))
    d1 = s[0]
    for d2, d3 in itertools.combinations(s, 2):
        yield d1, d2, d3

# pcount returns a Counter that records how often each pair
# of players play each other.
def pcount(*days):
    players = collections.Counter()
    for d in days:
        for flight in d:
            for p1, p2 in itertools.combinations(flight, 2):
                players[p1+p2] += 1
    return players

def score(*days):
    p = pcount(*days)
    return len(p), sum(-1 for v in p.values() if v > 1)

best = None
for days in all_schedules('abcdefghijkl'):
    s = score(*days)
    if best is None or s > best:
        best = s
        print(days, s)
It still takes some time to run (about 10-15 minutes on my computer), and produces this as the last line of output:
abcd-efgh-ijkl abei-cfgj-dhkl abhj-cekl-dfgi (48, -3)
That means there are 48 unique pairings over the three days, and there are 3 pairs who play each other more than once (ab, fg and kl).
Note that each of the three pairs who play each other more than once play each other on every day. That's unfortunate, and probably means you need to tweak your idea of how to score schedules. For example, excluding schedules where the same pair plays more than twice, and taking the soft-min of the number of players each player sees, gives this solution:
abcd-efgh-ijkl abei-cfgj-dhkl acfk-begl-dhij
This has 45 unique pairings, and 9 pairs that play each other more than once. But every player meets at least 7 different players, and no pair meets more than twice, so it might be preferable in practice to the "optimal" solution above.
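To compare schedules on more than one criterion, a small helper like the following can report the figures discussed above. It is a sketch in Python 3: schedule_stats is a hypothetical name, and it uses a plain minimum over players rather than the soft-min the answer mentions.

```python
import itertools
from collections import Counter

def schedule_stats(*days):
    """Return (unique pairs, repeated pairs, most meetings of any one pair,
    fewest distinct opponents any player gets) for days like 'abcd-efgh-ijkl'."""
    meetings = Counter()
    for day in days:
        for flight in day.split("-"):
            for p1, p2 in itertools.combinations(sorted(flight), 2):
                meetings[p1 + p2] += 1
    # Each distinct pair contributes one opponent to both of its players.
    opponents = Counter()
    for pair in meetings:
        for player in pair:
            opponents[player] += 1
    repeated = sum(1 for times in meetings.values() if times > 1)
    return len(meetings), repeated, max(meetings.values()), min(opponents.values())

print(schedule_stats("abcd-efgh-ijkl", "abei-cfgj-dhkl", "abhj-cekl-dfgi"))  # (48, 3, 3, 7)
print(schedule_stats("abcd-efgh-ijkl", "abei-cfgj-dhkl", "acfk-begl-dhij"))  # (45, 9, 2, 7)
```

The third number is what separates the two schedules: in the first, the three repeated pairs meet on all three days, while in the second no pair meets more than twice.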

Related

Paradox python algorithm

I am trying to solve a version of the birthday paradox question: I need to find the number of people n for which the probability is 0.5 that at least 4 of them have their birthdays within a week of each other.
I have written code that simulates the case where 2 people have their birthdays on the same day.
import numpy
import matplotlib.pylab as plt

no_of_simulations = 1000
milestone_probabilities = [50, 75, 90, 99]
milestone_current = 0

def birthday_paradox(no_of_people, simulations):
    global milestone_probabilities, milestone_current
    same_birthday_four_people = 0
    #We assume that there are 365 days in all years.
    for sim in range(simulations):
        birthdays = numpy.random.choice(365, no_of_people, replace=True)
        unique_birthdays = set(birthdays)
        if len(unique_birthdays) < no_of_people:
            same_birthday_four_people += 1
    success_fraction = same_birthday_four_people/simulations
    if milestone_current < len(milestone_probabilities) and success_fraction*100 > milestone_probabilities[milestone_current]:
        print("P(Four people sharing birthday in a room with " + str(no_of_people) + " people) = " + str(success_fraction))
        milestone_current += 1
    return success_fraction

def main():
    day = []
    success = []
    for i in range(1, 366): #Executing for all possible cases where can have unique birthdays, i.e. from 1 person to a maximum of 365 people in a room
        day.append(i)
        success.append(birthday_paradox(i, no_of_simulations))
    plt.plot(day, success)
    plt.show()

main()
I am looking to modify the code to look for sets of 4 instead of 2, and then check that the difference between them is less than or equal to 7, in order to answer the question.
Am I going down the right path or should I approach the question differently?
The key part of your algorithm is in these lines:
unique_birthdays = set(birthdays)
if len(unique_birthdays) < no_of_people:
    same_birthday_four_people += 1
Comparing the number of unique birthdays to the number of people did the job when you tested whether two different people had the same birthday, but it won't do for your new test.
Define a new function that receives the birthday array and returns True or False after checking whether 4 different people have birthdays within a range of 7 days:
def four_birthdays_same_week(birthdays):
    # fill this function code
def birthday_paradox(no_of_people, simulations):
    ...
(this function can be defined outside the birthday_paradox function)
Then switch this code:
if len(unique_birthdays) < no_of_people:
    same_birthday_four_people += 1
into:
if four_birthdays_same_week(birthdays):
    same_birthday_four_people += 1
Regarding the algorithm for checking whether there are 4 different birthdays in the same week: a basic idea would be to sort the array of birthdays, then for every group of 4 birthdays check whether the day range between them is less than or equal to 7:
if it is, the function can immediately return True.
(I am sure this algorithm can be vastly improved.)
If after scanning the whole array we didn't return True, the function can return False.
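One possible way to fill in that function is sketched below (Python 3). It relies on the observation that, once the birthdays are sorted, only windows of 4 consecutive entries need checking. The group and max_span parameters are additions for illustration, and the sketch assumes days are numbered within a single year with no wrap-around from late December to early January.

```python
def four_birthdays_same_week(birthdays, group=4, max_span=7):
    """True if some `group` birthdays lie within `max_span` days of each other.

    Assumption: no wrap-around, so day 364 and day 1 count as far apart.
    """
    days = sorted(int(d) for d in birthdays)
    # After sorting, the tightest group whose earliest member is days[i]
    # is days[i : i + group], so only these adjacent windows need checking.
    for i in range(len(days) - group + 1):
        if days[i + group - 1] - days[i] <= max_span:
            return True
    return False

print(four_birthdays_same_week([10, 200, 11, 12, 300, 13]))  # True
print(four_birthdays_same_week([1, 2, 3, 9]))                # False
```

Sorting dominates the cost, so each simulated room takes O(n log n) time instead of examining every group of 4.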

Python algorithm to generate set of teams where players are with other given players an equal number of times

I'm looking to write a python programme that takes as input n (a number of players) and splits them into two teams of equal size for a number w (weeks) such that each player is with any other given player an equal amount of the time.
e.g. 6 players over 4 weeks generates the 4 team pairings ((1,2,3)(4,5,6) , (1,3,5),(2,4,6) , (1,5,6),(2,3,4) , (1,4,6),(2,3,5)) such that 1 is with 2 as equally as possible as 1 is with 5 or any other pair of players.
There is no distinction between the first and second team and both teams are always of equal size.
Basically, I used the itertools library to generate all permutations of the players and checked whether each team had already been picked. If you have any questions, just ask.
from itertools import permutations

list_of_players = ["Mark", "Willy", "Josh", "Rob"]
N = len(list_of_players)
good = []
seen = []  # teams already used (named `seen` so the builtin all() is not shadowed)
for perm in permutations(list_of_players):
    if sorted(perm[:N // 2]) not in seen and sorted(perm[N // 2:]) not in seen:
        good.append((sorted(perm[:N // 2]), sorted(perm[N // 2:])))
        seen.append(sorted(perm[:N // 2]))
        seen.append(sorted(perm[N // 2:]))
for i in range(len(good)):
    print("week:", i + 1, good[i])
Output:
week: 1 (['Mark', 'Willy'], ['Josh', 'Rob'])
week: 2 (['Josh', 'Mark'], ['Rob', 'Willy'])
week: 3 (['Mark', 'Rob'], ['Josh', 'Willy'])
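A variation that avoids walking through all N! permutations is to pin the first player to one team and choose that player's teammates with itertools.combinations. The sketch below (Python 3, with the hypothetical helper distinct_splits) assumes an even number of distinct players. It lists each split exactly once; choosing a balanced subset of w weeks from that list is a separate step not shown here.

```python
from itertools import combinations

def distinct_splits(players):
    """Every way to split an even number of distinct players into two equal teams."""
    first, rest = players[0], tuple(players[1:])
    half = len(players) // 2
    splits = []
    # Pinning the first player to team A stops (A, B) and (B, A)
    # from both being generated.
    for mates in combinations(rest, half - 1):
        team_a = (first,) + mates
        team_b = tuple(p for p in rest if p not in mates)
        splits.append((team_a, team_b))
    return splits

for week, split in enumerate(distinct_splits([1, 2, 3, 4, 5, 6]), start=1):
    print("week:", week, split)
```

For 6 players this yields 10 splits, and across all 10 every pair of players is on the same team equally often (4 times), so any schedule that uses the full list is perfectly balanced.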

How to find the number of ways to get 21 in Blackjack?

Some assumptions:
One deck of 52 cards is used
Picture cards count as 10
Aces count as 1 or 11
The order is not important (ie. Ace + Queen is the same as Queen + Ace)
I thought I would then just sequentially try all the possible combinations and see which ones add up to 21, but there are way too many ways to mix the cards (52! ways). This approach also does not take into account that order is not important, nor does it account for the fact that there are at most 4 of any one card (Spade, Club, Diamond, Heart).
Now I am thinking of the problem like this:
We have 11 "slots". Each of these slots can have 53 possible things inside them: 1 of 52 cards or no card at all. The reason it is 11 slots is because 11 cards is the maximum amount of cards that can be dealt and still add up to 21; more than 11 cards would have to add up to more than 21.
Then the "leftmost" slot would be incremented up by one and all 11 slots would be checked to see if they add up to 21 (0 would represent no card in the slot). If not, the next slot to the right would be incremented, and the next, and so on.
Once the first 4 slots contain the same "card" (after four increments, the first 4 slots would all be 1), the fifth slot could not be that number as well since there are 4 numbers of any type. The fifth slot would then become the next lowest number in the remaining available cards; in the case of four 1s, the fifth slot would become a 2 and so on.
How would you approach this?
Divide and conquer, leveraging the knowledge that if you have 13 and pick a 10, you only have to pick cards that sum to 3 from what is left. Be forewarned: this solution might be slow (it took about 180 seconds on my box, so it is definitely non-optimal).
def sum_to(x, cards):
    if x == 0:  # if there is nothing left to sum to
        yield []
    for i in range(1, 12):  # for each point value 1..11 (inclusive)
        if i > x: break  # if i is bigger than what's left we are done
        card_v = 11 if i == 1 else i
        if card_v not in cards: continue  # if there is no more of this card
        new_deck = cards[:]  # create a copy of the deck (we do not want to modify the original)
        if i == 1:  # one is clearly an ace...
            new_deck.remove(11)
        else:  # remove the value
            new_deck.remove(i)
        # on the recursive call we need to subtract our recent pick
        for result in sum_to(x - i, new_deck):
            yield [i] + result  # append each further combination to our solutions
Set up your cards as follows:
deck = []
for i in range(2, 11):  # two through ten (with 4 of each)
    deck.extend([i] * 4)
deck.extend([10] * 4)  # jacks
deck.extend([10] * 4)  # queens
deck.extend([10] * 4)  # kings
deck.extend([11] * 4)  # aces
Then just call your function:
for combination in sum_to(21, deck):
    print(combination)
Unfortunately this does allow some duplicates to sneak in.
To get unique entries you need to change it a little bit: in sum_to, change the last line to
# sort our solutions so we can later eliminate duplicates
yield sorted([i] + result)  # append each further combination to our solutions
Then, when you collect your combinations, you have to do some deep dark voodoo-style Python:
unique_combinations = sorted(set(map(tuple, sum_to(21, deck))), key=len, reverse=0)
for combo in unique_combinations:
    print(combo)
From this cool question I have learned the following (keep in mind that in real play the dealer and other players would also be removing cards from the same deck):
there are 416 unique combinations of a deck of cards that make 21
there are 300433 non-unique combinations!!!
the number of ways to make 21, by number of cards, is as follows
with 11 cards there is 1 way
[(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3)]
with 10 cards there are 7 ways
with 9 cards there are 26 ways
with 8 cards there are 54 ways
with 7 cards there are 84 ways
with 6 cards there are 94 ways
with 5 cards there are 83 ways
with 4 cards there are 49 ways
with 3 cards there are 17 ways
with 2 cards there is 1 way
[(10, 11)]
there are 54 ways in which all 4 aces are used in making 21!!
there are 106 ways of making 21 in which NO aces are used !!!
keep in mind these are often suboptimal plays (i.e. considering A,10 -> 1,10 and hitting)
Before worrying about the suits and the different cards with value 10, let's figure out how many different value combinations add up to 21. For example, 5, 5, 10, 1 is one such combination. The following function takes limit, which is the target value; start, which is the lowest value that can be picked; and used, which is the list of picked values:
def combinations(limit, start, used):
    # Base case
    if limit == 0:
        return 1
    # Start iteration from lowest card to picked so far
    # so that we're only going to pick cards 3 & 7 in order 3,7
    res = 0
    for i in range(start, min(12, limit + 1)):
        # Aces are at index 1 no matter if value 11 or 1 is used
        index = i if i != 11 else 1
        # There are 16 cards with value of 10 (T, J, Q, K) and 4 with every
        # other value
        available = 16 if index == 10 else 4
        if used[index] < available:
            # Mark the card used and go through combinations starting from
            # current card and limit lowered by the value
            used[index] += 1
            res += combinations(limit - i, i, used)
            used[index] -= 1
    return res

print(combinations(21, 1, [0] * 11))  # 416
Since we're interested in different card combinations rather than different value combinations, the base case above should be modified to return the number of different card combinations that can produce a given value combination. Luckily that's quite an easy task: the binomial coefficient tells us how many different combinations of k items can be picked from n items.
Once the number of different card combinations for each value in used is known, they can simply be multiplied together for the final result. So for the example of 5, 5, 10, 1, value 5 gives bcoef(4, 2) == 6, value 10 gives bcoef(16, 1) == 16 and value 1 gives bcoef(4, 1) == 4. For all the other values bcoef(x, 0) is 1. Multiplying those values gives 6 * 16 * 4 == 384, which is then returned:
import operator
from functools import reduce
from math import factorial

def bcoef(n, k):
    # integer division keeps the result an exact int
    return factorial(n) // (factorial(k) * factorial(n - k))

def combinations(limit, start, used):
    if limit == 0:
        combs = (bcoef(4 if i != 10 else 16, x) for i, x in enumerate(used))
        res = reduce(operator.mul, combs, 1)
        return res
    res = 0
    for i in range(start, min(12, limit + 1)):
        index = i if i != 11 else 1
        available = 16 if index == 10 else 4
        if used[index] < available:
            used[index] += 1
            res += combinations(limit - i, i, used)
            used[index] -= 1
    return res

print(combinations(21, 1, [0] * 11))  # 186184
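The worked example 5, 5, 10, 1 can be checked in isolation. This sketch (Python 3.8+, using math.comb and math.prod in place of the hand-written bcoef) computes only the base-case product for one value combination; card_combinations is a hypothetical name.

```python
from math import comb, prod

def card_combinations(used):
    """Number of distinct card picks behind one value combination.

    used[v] is how many cards of value v were picked (index 1 is the ace,
    index 10 covers T, J, Q and K), as in the `used` list of the answer.
    """
    return prod(comb(16 if value == 10 else 4, count)
                for value, count in enumerate(used))

used = [0] * 11
used[5], used[10], used[1] = 2, 1, 1   # the hand 5, 5, 10, 1
print(card_combinations(used))          # 6 * 16 * 4 = 384
```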
So I decided to write a script that checks every possible viable hand. The total number comes out to be 188052. Since I checked every possible combination, this is the exact number (as opposed to an estimate):
import itertools as it

big_list = []

def deck_set_up(m):
    special = {8:'a23456789TJQK', 9:'a23456789', 10:'a2345678', 11:'a23'}
    if m in special:
        return [x+y for x,y in list(it.product(special[m], 'shdc'))]
    else:
        return [x+y for x,y in list(it.product('a23456789TJQKA', 'shdc'))]

deck_dict = {'as':1,'ah':1,'ad':1,'ac':1,
             '2s':2,'2h':2,'2d':2,'2c':2,
             '3s':3,'3h':3,'3d':3,'3c':3,
             '4s':4,'4h':4,'4d':4,'4c':4,
             '5s':5,'5h':5,'5d':5,'5c':5,
             '6s':6,'6h':6,'6d':6,'6c':6,
             '7s':7,'7h':7,'7d':7,'7c':7,
             '8s':8,'8h':8,'8d':8,'8c':8,
             '9s':9,'9h':9,'9d':9,'9c':9,
             'Ts':10,'Th':10,'Td':10,'Tc':10,
             'Js':10,'Jh':10,'Jd':10,'Jc':10,
             'Qs':10,'Qh':10,'Qd':10,'Qc':10,
             'Ks':10,'Kh':10,'Kd':10,'Kc':10,
             'As':11,'Ah':11,'Ad':11,'Ac':11}
stop_here = {2:'As', 3:'8s', 4:'6s', 5:'4h', 6:'3c', 7:'3s', 8:'2h', 9:'2s', 10:'2s', 11:'2s'}

for n in range(2, 12):  # n is number of cards in the draw
    combos = it.combinations(deck_set_up(n), n)
    stop_point = stop_here[n]
    while True:
        try:
            pick = next(combos)
        except StopIteration:
            break
        if pick[0] == stop_point:
            break
        if n < 8:
            # skip hands that hold the same ace as both low (a) and high (A)
            if len(set([item.upper() for item in pick])) != n:
                continue
        if sum([deck_dict[card] for card in pick]) == 21:
            big_list.append(pick)
    print(n, len(big_list))  # Total number hands that can equal 21 is 188052
In the output, the first column is the number of cards in the draw, and the second column is the cumulative count. So the number after "3" in the output is the total count of hands that equal 21 for a 2-card draw and a 3-card draw combined. The lowercase a is a low ace (1 point), and the uppercase A is a high ace. I have a line (the one with the set command) that throws out any hand containing a duplicate card.
The script takes 36 minutes to run, so there is definitely a trade-off between execution time and accuracy. The "big_list" contains the solutions (i.e. every hand where the sum is 21).
>>>
================== RESTART: C:\Users\JBM\Desktop\bj3.py ==================
2 64
3 2100
4 14804
5 53296
6 111776
7 160132
8 182452
9 187616
10 188048
11 188052 # <-- This is the total count, as these numbers are cumulative
>>>

python tuple over writing previous data

I am trying to create a function that will start the loop and add a day to current day count, it will ask 3 questions then combine that data to equal Total_Output. I then want 'n' to represent the end of the tuple, and in the next step add the Total_Output to the end of the tuple. But when I run the function it seems like it is creating a new tuple.
Example:
Good Morninghi
This is Day: 1
How much weight did you use?40
How many reps did you do?20
How many sets did you do?6
Day: 1
[4800.0]
This is Day: 2
How much weight did you use?50
How many reps did you do?20
How many sets did you do?6
Day: 2
[6000.0, 6000.0]
This is Day: 3
How much weight did you use?40
How many reps did you do?20
How many sets did you do?6
Day: 3
[4800.0, 4800.0, 4800.0]
failed
Here is the function:
def Start_Work(x):
    Num_Days = 0
    Total_Output = 0
    Wght = 0
    Reps = 0
    Sets = 0
    Day = []
    while x == 1 and Num_Days < 6: ##will be doing in cycles of 6 days
        Num_Days += 1 ##increase day count with each loop
        print("This is Day:", Num_Days)
        Wght = float(input("How much weight did you use?"))
        Reps = float(input("How many reps did you do?"))
        Sets = float(input("How many sets did you do?"))
        Total_Output = Wght * Reps * Sets
        n = Day[:-1] ##go to end of tuple
        Day = [Total_Output for n in range(Num_Days)] ##add data (Total_Output to end of tuple
        print("Day:", Num_Days)
        print(Day)
    else:
        print("failed")

Input = input("Good Morning")
if Input.lower() in ('hi', 'start', 'good morning'):
    Start_Work(1)
else:
    print("Good Bye")
n = Day[:-1] ##go to end of tuple
Day = [Total_Output for n in range(Num_Days)] ##add data (Total_Output to end of tuple
does not do what you think it does. You assign n but never use it (the n in the comprehension is bound by the for n in), and it only holds a copy of Day without its last element.
You then set Day to be [Total_Output] * Num_Days, so you make a new list of Num_Days occurrences of Total_Output.
You want:
Day.append(Total_Output)
to replace both of those lines.
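A stripped-down illustration of the difference, with the interactive prompts replaced by a fixed list of totals so it runs on its own (Python 3). The function name record_days and the snapshot list are illustrative and not part of the original program.

```python
def record_days(outputs):
    """Append each day's total to one growing list, as Day.append does."""
    day = []
    history = []
    for total_output in outputs:
        day.append(total_output)   # earlier days stay untouched
        history.append(list(day))  # snapshot of what would be printed that day
    return history

print(record_days([4800.0, 6000.0, 4800.0]))
# [[4800.0], [4800.0, 6000.0], [4800.0, 6000.0, 4800.0]]
```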

Loop For working too long

I have two lists of dicts: prices_distincts and prices.
They are connected through hash_brand_artnum, and both are sorted by hash_brand_artnum.
I do not understand why the loop takes so long:
If the length of prices_distincts is 100,000, it runs for 30 min.
But if the length of prices_distincts is 10,000, it runs for 10 sec.
Code:
prices_distincts = [{'hash_brand_artnum':1202},...,..]
prices = [{'hash_brand_artnum':1202,'price':12.077},...,...]
for prices_distinct in prices_distincts:
    for price in list(prices):
        if prices_distinct['hash_brand_artnum'] == price['hash_brand_artnum']:
            print(price['hash_brand_artnum'])
            #print prices
            del prices[0]
        else:
            continue
I need to look for items with the same prices. The relation between prices_distincts and prices is one to many, and I want to group the prices that share the same price['hash_brand_artnum'].
It takes so long because your algorithm is O(N^2): 100000^2 = 10000000000 and 10000^2 = 100000000. So the factor between the two numbers is 100, and the factor between 30 min and 10 sec is about 180, which is the same order of magnitude.
EDIT: It's hard to say from your code and such a small amount of data, and I don't know what your task is, but I think that your dictionaries are not very useful.
Maybe try this:
>>> prices_distincts = [{'hash_brand_artnum':1202}, {'hash_brand_artnum':14}]
>>> prices = [{'hash_brand_artnum':1202, 'price':12.077}, {'hash_brand_artnum':14, 'price':15}]
# turning first list of dicts into simple list of numbers
>>> dist = [x['hash_brand_artnum'] for x in prices_distincts]
# turning second list of dicts into dict where number is a key and price is a value
>>> pr = {x['hash_brand_artnum']:x["price"] for x in prices}
Now you can iterate through your numbers and get the prices:
>>> for d in dist:
...     print(d, pr[d])
As @RomanPekar mentioned, your algorithm is running slowly because its complexity is O(n^2). To fix it, you should write it as an O(n) algorithm:
for price, prices_distinct in zip(prices, prices_distincts):
    if prices_distinct['hash_brand_artnum'] == price['hash_brand_artnum']:
        pass  # do stuff
If prices grows more or less with prices_distincts, then if you multiply the size of prices_distincts by 10, your original 10 seconds will be multiplied by 10, then again by 10 (second for loop), and then by ~2 because of the "list(prices)" (which, by the way, should definitely be done outside the loop):
10sec*10*10*2 = 2000sec = 33min
This conversion is usually expensive.
prices_distincts = [{'hash_brand_artnum':1202},...,..]
prices = [{'hash_brand_artnum':1202,'price':12.077},...,...]
list_prices = list(prices)
for prices_distinct in prices_distincts:
    for price in list_prices:
        if prices_distinct['hash_brand_artnum'] == price['hash_brand_artnum']:
            print(price['hash_brand_artnum'])
            #print prices
            del prices[0]
        else:
            continue
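For the one-to-many grouping the question describes, a dictionary keyed on hash_brand_artnum avoids the nested loop altogether. The following is a sketch in Python 3 and not any answerer's code: group_prices is a hypothetical name, and it assumes every entry in prices carries both a 'hash_brand_artnum' and a 'price' key.

```python
from collections import defaultdict

def group_prices(prices_distincts, prices):
    """Map each distinct hash_brand_artnum to the list of its prices."""
    by_hash = defaultdict(list)
    for price in prices:                      # one pass over prices
        by_hash[price['hash_brand_artnum']].append(price['price'])
    return {d['hash_brand_artnum']: by_hash.get(d['hash_brand_artnum'], [])
            for d in prices_distincts}        # one pass over the distinct keys

prices_distincts = [{'hash_brand_artnum': 1202}, {'hash_brand_artnum': 14}]
prices = [{'hash_brand_artnum': 1202, 'price': 12.077},
          {'hash_brand_artnum': 1202, 'price': 13.5},
          {'hash_brand_artnum': 14, 'price': 15}]
print(group_prices(prices_distincts, prices))
# {1202: [12.077, 13.5], 14: [15]}
```

Each list is traversed once, so the running time grows linearly with the input size, and the lists do not need to be sorted.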
