Where to hold static information for game logic? - python

The context for this question is:
A Google App Engine backend for a two-person multiplayer turn-based card game
The game revolves around different combinations of cards giving rise to different scores in the game
Obviously, one would store the state of a game in the GAE datastore, but I'm not sure on the approach for the design of the game logic itself. It seems I might have two choices:
Store entries in the datastore with a key that is a sorted list of the valid combinations of cards that can be played. These will then map to the score values. When a player tries to play a combination of cards, the server-side Python will sort the combination appropriately and look up the key. If it succeeds, it can do the necessary updates for the score; if it fails, then the combination wasn't valid.
Store the valid combinations as a python dictionary written into the server-side code and perform the same lookups as above to test the validity/get the score but without a trip to the datastore.
From a cost point of view (datastore lookups aren't free), option 2 seems like it would be better. But then there is the performance of the instance itself - will the startup time, processing time, memory usage start to tip me into greater expense?
There's also the code maintenance issue of constructing that Python dictionary, but I can bash together some scripts to help me write the code for that on the infrequent occasions that the logic changes. I think there will be on the order of 1000 card combinations (that can produce a score) of between 2 and 6 cards if that helps anyone who wants to quantify the problem.
I'm starting out with this design, and the summary of the above is whether it is sensible to store the static logic of this kind of game in the datastore, or simply keep it as part of the CPU bound logic? What are the pros and cons of both approaches?

So just for comparison to the information above:
In a standard deck of cards there are 52 unique cards. With 5 cards in a hand, there are 2,598,960 unique hands possible to get dealt.
Here's a break down of the combinatorial math:
n = 52 cards total
r = 5 cards in hand
number of combinations = n! / (r! * (n-r)!)
= 52! / (5! * 47!)
= (52 * 51 * 50 * 49 * 48) / (5 * 4 * 3 * 2 * 1)
= 2,598,960
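The arithmetic above can be checked in a couple of lines. This is only a quick sanity check, assuming Python 3.8 or later (where `math.comb` is available in the standard library):

```python
import math

n, r = 52, 5

# Built-in binomial coefficient: n! / (r! * (n - r)!)
hands = math.comb(n, r)

# The same value written out with factorials, mirroring the steps above.
manual = math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

print(hands, manual)
```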
And to simplify the example a bit more, let's compare the numbers to 5 card stud poker. Poker has 9 different types of "scoring" hands (Royal Flush, Straight Flush, Four of a kind, Full House, Flush, Straight, Three of a kind, Two Pairs, One Pair).
The odds that you have nothing in your hand are 50.11%
The odds that you have any of the above combinations is 49.89%.
Although the number of possible hand combinations in 5 card stud is huge, there are only 4 suits of cards, 13 numbers on the cards, and 9 different "scoring" combination types.
I hope that this example clearly illustrates that it would be a huge undertaking to generate and store all of the possible "scoring" combinations of 5 card stud in a database.
What this means to you:
Since I do not know the rules of your game, the main thing you need to consider is the number of different "scoring" combination types.
You didn't really specify how unique / different your possible scoring combinations are in your game. The closer the number of combination types is to the number of combinations, the more custom "scoring" rules there are.
Using the 52 card deck example above, if each unique hand had its own unique score, you would quickly have upwards of 3 million database entries. In which case I would heavily suggest you get many databases that support a MapReduce query capability (e.g. Cassandra + Hadoop) that would allow you to easily scale your infrastructure to reduce query times. I would imagine that having 3 million unique scoring combinations is very unlikely though. That would make the rules of the game tremendously complex, and probably make your game unplayable.
Since you said that your game will have around 1000 hand combinations of 2-6 cards, let's simplify the example and get a ballpark number. Using the largest hand possible (6 cards in hand), there are 3,003 possible hands in a 14 card deck. Assuming that the number of different combination types, suits, and numbered cards scales evenly (there is some fantasy math in here), you are looking at having around ~1,500 "scoring" combinations.
The bottom line:
The application logic needed to "score" winning hands for your game is the very same logic that the players of your game will need to understand in order to play (assuming that this game requires any skill / understanding at all, and it does not purely rely on luck). The more complicated it is, the harder the game will be for the players. I can only assume that the game logic isn't that complicated.
I would find it unlikely that you only have 16 cards in your deck. It would seem reasonable to have a couple hundred cards with several grouping types used for uniqueness (e.g. suit on a poker card, or mana type on a Magic the Gathering card). Assuming that you have more cards and combination types than a basic poker game, then it would seem very reasonable to conclude that it would take far more effort to store the various combinations than to include the logic of your game in the code. Also, every time you add a new rule your storage requirements would jump up in orders of magnitude, rather than scale linearly.
Since you will inevitably develop the code necessary to implement the rules of the card game, you might as well see how long it takes to "score" an arbitrary hand of cards. I would caution against premature optimization, and suggest that you prototype your design.
I would suggest you have a configurable logic module that allows you to easily alter the rules of your game as needed. Once the rules are solidified and unlikely to change, then I would look at optimizing your application code as needed. From the maintainability perspective, storing all of your application logic in a database is nuts (I think most maintenance programmers would agree with me on this). After you try to "fix" (e.g. normalize, migrate, transform) the data generated by a few scripts you just "bashed together" (your words) you will end up bashing your head into your keyboard.
As far as GAE pricing is concerned, the number of instances you will need will be based on your users / demand. Generally the limiting factor on scaling systems is Disk IOPS not the CPU. In the long run, I bet you would take a much bigger hit on performance as well as pricing by storing all of your application logic in a database.
Sources:
1) A Combinatorial Calculator: http://www.calculatorsoup.com/calculators/discretemathematics/combinations.php
2) Poker Odds: http://www.durangobill.com/Poker.html

If the logic is fixed, keep it in your code. Maybe you can procedurally generate the dicts on startup. If there is a dynamic component to the logic (something you want to update frequently), a data store might be a better bet, but it sounds like that's not applicable here. Unless the number of combinations runs over the millions, and you'd want to trade speed in favour of a lower memory footprint, stick with putting it in the application itself.
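A minimal sketch of the in-code lookup described in the question and recommended here. The card names and score values are made up for illustration, and `SCORES` and `score_play` are hypothetical names, not part of any GAE API:

```python
# Hypothetical scoring table: card names and scores are invented for this sketch.
RAW_SCORES = {
    ("A", "K"): 10,
    ("A", "K", "Q"): 25,
    ("2", "2", "7", "9"): 40,
}

# Normalise the keys once at import time so lookups do not depend on card order.
SCORES = {tuple(sorted(combo)): score for combo, score in RAW_SCORES.items()}


def score_play(cards):
    """Return the score for a played combination, or None if it is not valid."""
    return SCORES.get(tuple(sorted(cards)))


print(score_play(["K", "A"]))   # a valid combination
print(score_play(["A", "Q"]))   # not in the table
```

With around 1000 entries, a dictionary like this is built once per instance and each lookup is a single hash probe, with no datastore round trip.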

Related

Using multiple data sets to balance teams in Python?

I am new to coding so please forgive if I overlook anything simple. I am writing a program to make four teams of four. Each player has a certain point value for 11 different categories (eg. speed, agility, strength, etc.). I know I could average these categories together and just balance off that, but that leaves some teams wildly unbalanced in certain categories.
I have a separate program that takes in one set of point values, iterates through all possible combinations of teams, and returns the set for which the teams have the lowest standard deviation. I have also written some code myself that calculates the difference between the average score for each category and each player's score for that category, to get each player's point differential for each category.
However, I do not know how to use this data to get what I want: teams balanced off each category. I assume the best way to do this would be, for each team, minimizing the sum of the absolute values of each team's collective point differential. I have included a simplification of my code below:
d1_game1 = player1_points - average_points # this is repeated for each player and each game
game1Differentials = [d1_game1, d2_game1, d3_game1, ..., d16_game1] # There are 11 of these, one for each category
team1Differential = sum(abs([game1Differentials, game2Differentials, ..., game11Differentials]))
This team1Differential value is what is tripping me up; how do I take player differentials and convert them to team differentials? Would I have to try every combination of players?
values_to_minimize = [team1Differential, team2Differential, team3Differential, team4Differential]
I assume that this approach combined with the function from the code I linked above is almost all the way there, but how could this be applied to multiple metrics? I feel that it is a simple option that I am overlooking. This problem has been stopping me for days and I would really appreciate any help. Also if I am looking at this the wrong way and there is an easier method to get what I want, please let me know.
The problem you're having here is that you're calculating your differential from all players, when the actual operation is a 16-choose-4 operation. Therefore, your actual optimization function looks like this:
team1_players = random.sample(players, 4)
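skill_1_differential = abs(sum(x.skill_1 - average_skill_1 for x in team1_players)) # abs() wraps the team's summed differential; summing per-player abs values would give the same grand total for every assignment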
team_1_differential = sum([skill_1_differential, ..., skill_11_differential])
and then you'll repeat this for each of the remaining teams. You'll have to be careful to "remove" players from the pool before calling random.sample between the calculation for each team because if you don't you might wind up with the same player on multiple teams. Once you have all of these then you can sum them up:
balance = team_1_differential + team_2_differential + team_3_differential + team_4_differential
which will give you a balance parameter. From here, there are a number of ways you could handle this:
The simplest would be to calculate all possible team combinations but as there are 63,063,000 unique combinations, you'll be waiting for a very long time.
You could use a stochastic algorithm, such as simulated annealing to choose the team assignments that reduce the balance score closest to zero. This will give you a good, but not perfect, balance in a reasonable time.
You could modify how your players are created so that any combination of four players is approximately balanced. This is the easiest but if you don't have control over the creation process then this won't work.
You could choose teams at random and have them play actual games, giving the players a score that is increased when they win and decreased when they lose. After doing this, you will easily be able to create balanced teams by choosing players with similar scores or scores that balance out. This will still be a knapsack problem, but a much easier one because you'll be balancing on one variable instead of eleven. For more information, see this question. It looks like Sabermetrics or Elo could be useful here as well.
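As a rough illustration of the stochastic option, here is a sketch that uses plain hill climbing (swap two players, keep the swap if the balance does not get worse), which is a simpler relative of simulated annealing rather than annealing itself. It assumes each player is a tuple of category scores and that the number of players is a multiple of the team size; `balance` and `hill_climb` are hypothetical names. The balance takes the absolute value of each team's summed differential per category, so that strong and weak players on the same team can cancel out.

```python
import random


def balance(teams, averages):
    """Sum over teams and categories of |team's total differential|."""
    total = 0.0
    for team in teams:
        for cat, avg in enumerate(averages):
            total += abs(sum(player[cat] - avg for player in team))
    return total


def hill_climb(players, team_size=4, steps=5000, seed=0):
    """Randomly swap players between teams, keeping swaps that do not hurt."""
    rng = random.Random(seed)
    n_cats = len(players[0])
    averages = [sum(p[c] for p in players) / len(players) for c in range(n_cats)]

    order = players[:]
    rng.shuffle(order)
    teams = [order[i:i + team_size] for i in range(0, len(order), team_size)]
    best = balance(teams, averages)

    for _ in range(steps):
        a, b = rng.sample(range(len(teams)), 2)          # two different teams
        i, j = rng.randrange(team_size), rng.randrange(team_size)
        teams[a][i], teams[b][j] = teams[b][j], teams[a][i]
        score = balance(teams, averages)
        if score <= best:
            best = score
        else:
            # Undo the swap: it made the balance worse.
            teams[a][i], teams[b][j] = teams[b][j], teams[a][i]
    return teams, best
```

Hill climbing can get stuck in a local minimum; running it from several seeds, or accepting occasional worse swaps as simulated annealing does, is the usual remedy.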

fastest Connect 4 win checking method

I am trying to make an AI following the alpha-beta pruning method for Connect 4. I need to make checking a win as fast as possible, as the AI will go through many different possible game states. Right now I have thought of 2 approaches, neither of which is very efficient.
Create a large tuple for scoring every possible 4 in a row win conditions, and loop through that.
Using for loops, check horizontally, vertically, diag facing left, and diag facing right. This seems like it would be much slower than #1.
How would someone recommend doing it?
From your question, it's a bit unclear how your approaches would be implemented. But from the alpha-beta pruning, it seems as if you want to look at a lot of different game states, and in the recursion determine a "score" for each one.
One very important observation is that recursion ends once a 4-in-a-row has been found. That means that at the start of a recursion step, the game board does not have any 4-in-a-row instances.
Using this, we can intuitively see that the new piece placed in said recursion step must be a part of any 4-in-a-row instance created during the recursion step. This greatly reduces the search space for solutions from a total of 69 (21 vertical, 24 horizontal, 12+12 diagonals) 4-in-a-row positions to a maximum of 13 (3 vertical, 4 horizontal, 3+3 diagonal).
This should be the baseline for your second approach. It will require a maximum of 52 (13*4) checks for a naive implementation, or 25 (6+7+6+6) checks for a faster algorithm.
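A minimal sketch of that baseline, checking only the lines that pass through the newly placed piece. It assumes a 6x7 board stored as a list of rows with `None` for empty cells; `is_win`, `ROWS`, `COLS` and `DIRECTIONS` are names chosen for this example:

```python
ROWS, COLS = 6, 7

# One (d_row, d_col) step per line orientation:
# horizontal, vertical, and the two diagonals.
DIRECTIONS = [(0, 1), (1, 0), (1, 1), (1, -1)]


def is_win(board, row, col):
    """Return True if the piece just placed at (row, col) completes four in a row."""
    piece = board[row][col]
    for d_row, d_col in DIRECTIONS:
        count = 1  # the newly placed piece itself
        for sign in (1, -1):  # walk both ways along the line
            r, c = row + sign * d_row, col + sign * d_col
            while 0 <= r < ROWS and 0 <= c < COLS and board[r][c] == piece:
                count += 1
                r += sign * d_row
                c += sign * d_col
        if count >= 4:
            return True
    return False
```

Each walk stops at the first cell that is off the board or holds a different value, so the function never inspects more than the cells on the four lines through `(row, col)`.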
Now it's pretty hard to beat 25 boolean checks for this win-check I'd say, but I'm guessing that your #1 approach trades some extra memory-usage to enable less calculation per recursion step. The simplest way of doing this would be to store 8 integers (single byte is fine for this application) which represent the longest chains of same-color chips that can be found in any of the 8 directions.
Using this, a check for win can be reduced to 8 boolean checks and 4 additions. Simply get the chain lengths on opposite sides of the newly placed chip, check if they're the same color as the chip, and if they are, add their lengths and add 1 (for the newly placed chip).
From this calculation, it seems as if your #1 approach might be the most efficient. However, it has a much larger overhead of maintaining the data structure, and uses more memory, something that should be avoided unless you can pass by reference. Also (assuming that boolean checks and additions are similar in speed) the much harder approach only wins by a factor 2 even when ignoring the overhead.
I've made some simplifications, and some explanations maybe weren't crystal clear, but ask if you have any further questions.

My algorithm isn't correct. Why so?

I am trying to solve the following problem:
The group of people consists of N members. Every member has one or more friends in the group. You are to write program that divides this group into two teams. Every member of each team must have friends in another team.
Input:
The first line of input contains the only number N (N ≤ 100). Members are numbered from 1 to N. The second, the third,…and the (N+1)th line contain list of friends of the first, the second, …and the Nth member respectively. This list is finished by zero. Remember that friendship is always mutual in this group.
Output:
The first line of output should contain the number of people in the first team or zero if it is impossible to divide people into two teams. If the solution exists you should write the list of the first group into the second line of output. Numbers should be divided by single space. If there are more than one solution you may find any of them.
My algorithm looks like this:
create a dictionary where each player maps to a list of friends
team1 = ['1']
team2 = []
left = []
for player in dictionary:
    if its friend in team1:
        add to team2
    elif its friend in team2:
        add to team1
    else:
        add it to left
But still it isn't correct. There may be cycles in the dictionary where the friend of 6 would be 7 and the only friend of 7 would be 6. What should I do in such a case? I do not know how long such a cycle may be. Since I have a while loop around my code, I am currently running into an infinite loop. I am also trying to add players from left to teams, but it's not working since they have cycles among them. I don't know how to solve this problem.
Thanks.
Since this is a competition problem and it's clear you want to learn from it, I'll be a little sparse on details and explain more about how I thought about the problem.
First, consider a connected friendship component, then pick any vertex. Since the friendship relationship is symmetric, it's easy to see that adding an edge means that both the vertices are "solved". This seems to suggest something like finding a perfect matching.
However, finding a perfect matching is not sufficient, as for the complete graph with three vertices, a perfect matching doesn't exist, yet it can be solved. So thinking about it a little more, it seems that a Hamiltonian path is sufficient, because you can just alternate teams.
If you consider a sufficiently large tree though, it should be clear that there's no Hamiltonian path, but the obvious splitting of teams by even or odd height produces the right result. So the answer seems to be that if you can find a spanning tree, that tree can be used to split the teams into two.
This can be repeated for each component, and just playing around with graphs, it should be convincing enough for a competition, as every component has a spanning tree, so there's nowhere else to expand to. I'm not sure what would be a graph with no possible assignment. Maybe if you have an unconnected node, that's considered invalid?
Update: I found an even simpler solution. The original answer is at the bottom. This one is cleaner and comes with a proof ;)
We will be building the solution incrementally. The initial state is that all the people are unallocated, and both teams are empty. We will extend the solution using one of two actions below. After each step, the division will be legal, meaning every allocated person will have a friend allocated to the other team.
Action 1: pick any two unallocated guys that are friends. Put one of them in Team A, the other in Team B. The invariant holds, because newly allocated people know each other and are on separate teams.
Action 2: pick any guy who has an allocated friend, and place him on the other team. The invariant holds, because the one allocated person was allocated in such a way to satisfy it.
So at every step you pick any doable action and execute it. Repeat until there are no more possible actions. When does this happen? It would mean that none of the unallocated people has any friends. Since we assumed that everyone has at least one friend, you will be able to execute the actions until there is nobody left.
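The two actions can be sketched directly in code. This is one possible reading of the procedure, assuming the input is a dictionary mapping each person to a non-empty set of friends, with friendship mutual; `split_teams` is a hypothetical name:

```python
def split_teams(friends):
    """Assign every person to team "A" or "B" so each has a friend on the other team.

    friends maps each person to a set of friends (mutual, at least one each).
    """
    team = {}                    # person -> "A" or "B"
    unallocated = set(friends)
    while unallocated:
        # Action 2: find an unallocated person who has an allocated friend.
        for person in unallocated:
            placed = next((f for f in friends[person] if f in team), None)
            if placed is not None:
                team[person] = "B" if team[placed] == "A" else "A"
                break
        else:
            # Action 1: nobody unallocated has an allocated friend, so any
            # friend of an unallocated person must be unallocated as well.
            person = next(iter(unallocated))
            if not friends[person]:
                raise ValueError(f"{person!r} has no friends; no valid split")
            buddy = next(iter(friends[person]))
            team[person], team[buddy] = "A", "B"
            unallocated.discard(buddy)
        unallocated.discard(person)
    return team
```

This version rescans the unallocated set on every step, which is quadratic at worst but more than fast enough for N ≤ 100.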
Original answer:
The problem seems complicated at first, but in fact does not require rocket science. The constraint on the division is rather loose - everyone needs just one friend on the other team.
Consider a simpler case first. Let's say you are given two teams of people and one extra player that arrived late to the party and needs to be allocated to one of the two existing teams. If he has no friends at all, that's impossible. But if he does have any friends, you pick one of his friends and allocate the newcomer to the other team.
The outcome? If you could start with some small teams and then arrange the rest of the people in such a way that they always know someone who came before, you're golden. This means we reduced the initial big problem to two smaller ones.
Tackling the first one is easy. In order to bootstrap the teams, just pick any two guys that know each other, put one in Team A, the other in Team B, and it works.
Now, the second: adding the rest of the people. Take a look at all the people that are already allocated to teams and see if they have any unallocated friends. Case 1: one of already allocated guys has an unallocated friend. You can easily add him somewhere. Case 2: all the friends of the allocated guys are already allocated, too. This means the initial friendship graph was not connected and doesn't hurt at all - just take any random unallocated guy and place him anywhere.

Program, that chooses the best out of 10

I need to make a program in python that chooses cars from an array, that is filled with the mass of 10 cars. The idea is that it fills a barge, that can hold ~8 tonnes most effectively and minimum space is left unfilled. My idea is, that it makes variations of the masses and chooses one, that is closest to the max weight. But since I'm new to algorithms, I don't have a clue how to do it
I'd solve this exercise with dynamic programming. You should be able to get the optimal solution in O(m*n) operations (n being the number of cars, m being the total mass).
That will only work however if the masses are all integers.
In general you have a binary linear programming problem. Those are very hard in general (NP-complete).
However, both ways lead to algorithms which I wouldn't consider to be beginner's material. You might be better off with trial and error (as you suggested) or simply trying every possible combination.
This is a 1D bin-packing problem. It is NP-hard, so no efficient algorithm is known that always finds the optimal solution. However, a greedy algorithm will give you an approximate one. Most likely you want to try my bin-packing solver at phpclasses.org (bin-packing).
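For readers curious what the dynamic-programming approach mentioned in this answer might look like, here is a small sketch. It assumes integer masses (as the answer requires) and that each car can be loaded at most once; `best_load` is a hypothetical name:

```python
def best_load(weights, capacity):
    """Return (total, indexes) of the heaviest set of cars not exceeding capacity."""
    # reachable[t] holds one list of car indexes whose weights sum to exactly t.
    reachable = {0: []}
    for i, w in enumerate(weights):
        # Iterate over a snapshot so car i is added to each earlier total only once.
        for total, chosen in list(reachable.items()):
            new_total = total + w
            if new_total <= capacity and new_total not in reachable:
                reachable[new_total] = chosen + [i]
    best = max(reachable)
    return best, reachable[best]


print(best_load([3, 5, 4], 8))
```

The table never holds more than `capacity + 1` totals, and each car triggers one pass over it, which is where the O(m*n) bound comes from.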
If I have an unweighted, undirected graph in which each node is connected with every other node, then I have (n^2-n)/2 pairs of nodes and overall n^2-n possibilities/combinations:
1,2,3,4,5,...,64
2,1,X,X,X,...,X
3,X,1,X,X,...,X
4,X,X,1,X,...,X
5,X,X,X,1,...,X
.,X,X,X,X,1,.,X
.,X,X,X,X,X,1,X
64,X,X,X,X,X,X,1
Isn't this the same with 10 cars? (45 pairs of cars and 90 combinations/possibilities). Did I forget something? Where is the error?
A problem like you have here is similar to the classic traveling salesman problem, which asks for the most efficient way for a salesman to visit a list of cities. The difference is that it is conceivable that you might not need every car to fill the barge, whereas the salesman must visit every city. But the problem is similar. The brute-force way to solve the problem is to investigate every possible combination of cars, from 1 car to all 10. We will assume that it is valid to have any number of each car (i.e. if car 2 is a Ford Focus, you could have three Ford Foci). This is easy to change if the car list is an exact list of specific cars, however, and you can use only 1 of each.
Now, this quickly begins to consume a lot of time. As the number of cars goes up, the number of possible combinations of cars goes up geometrically, which means that with a number smaller than you expect, it will take longer to run the program than there is time left in your life. 10 should be manageable, though (it turns out to be a little more than 700,000 combinations, or 1024 if you can only have one of each item).
The first thing is to define the weight of each car and the maximum weight the barge can carry.
weights = [1, 2, 1, 3, 1, 2, 2, 4, 1, 2, 2]
capacity = 8
Now we need some way to find each possible combination. Python's itertools module has a function that will give us every combination of a given length, but we want all lengths. So we will write one loop that goes from 1 to 10 and calls itertools.combinations_with_replacement for each length. We can then find out the total weight of each combination, and if it is higher than any weight we have already found, yet still within capacity, we will remember it as the best we have found so far.
The only real trick here is that we don't want combinations of the weights -- we want combinations of the indexes of the weights, because at the end we want to know which cars to put on the barge, not their weights. So combinations_with_replacement(range(10), ...) rather than combinations_with_replacement(weights, ...). Inside the loop you will want to get the weight of each car in the combination with weights[i] to sum it up.
I originally had code here, but took it out since this is homework. :-) (It wasn't originally tagged as such, but I should have known -- I blame the time change.)
If you wanted to allow only one of each car, you'd use itertools.combinations instead of combinations_with_replacement.
A shortcut is possible since you mention elsewhere that cars weigh from 1-2 tonnes. This means that you know you will need at least 4 of them (4 * 2 = 8), so you can skip all the combinations of 1-3 cars. However, this wouldn't generalize well if the prof changed the parameters on you.
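For readers who want to see the loop described above spelled out, here is a minimal sketch of the one-of-each-car variant using `itertools.combinations`. It reuses the example `weights` and `capacity` values from this answer; `best_combination` is a hypothetical name, and this is not the code the answer originally contained:

```python
from itertools import combinations

weights = [1, 2, 1, 3, 1, 2, 2, 4, 1, 2, 2]
capacity = 8


def best_combination(weights, capacity):
    """Try every set of car indexes and keep the heaviest one that still fits."""
    best_total, best_cars = 0, ()
    for size in range(1, len(weights) + 1):
        # Combine indexes, not weights, so we know which cars were chosen.
        for cars in combinations(range(len(weights)), size):
            total = sum(weights[i] for i in cars)
            if best_total < total <= capacity:
                best_total, best_cars = total, cars
    return best_total, best_cars


total, cars = best_combination(weights, capacity)
print(total, cars)
```

Swapping `combinations` for `combinations_with_replacement` gives the any-number-of-each-car variant discussed above, at the cost of many more candidates.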

writing optimization function

I'm trying to write a tennis reservation system and I got stuck with this problem.
Let's say you have players with their prefs regarding court number, day and hour.
Also every player is ranked so if there is day/hour slot and there are several players
with preferences for this slot the one with top priority should be chosen.
I'm thinking about using some optimization algorithms to solve this problem but I'm not sure what would be the best cost function and/or algorithm to use.
Any advice?
One more thing I would prefer to use Python but some language-agnostic advice would be welcome also.
Thanks!
edit:
some clarifications-
the one with better priority wins and loser is moved to nearest slot,
rather flexible time slots question
yes, maximizing the number of people getting their most highly preferred times
The basic Algorithm
I'd sort the players by their rank, as the high ranked ones always push away the low ranked ones. Then you start with the player with the highest rank, give him what he asked for (if he really is the highest, he will always win, thus you can as well give him whatever he requested). Then I would start with the second highest one. If he requested something already taken by the highest, try to find a slot nearby and assign this slot to him. Now comes the third highest one. If he requested something already taken by the highest one, move him to a slot nearby. If this slot is already taken by the second highest one, move him to a slot some further away. Continue with all other players.
Some tunings to consider:
If multiple players can have the same rank, you may need to implement some "fairness". All players with equal rank will have a random order relative to each other if you sort them e.g. using QuickSort. You can get some fairness if you don't do it player for player, but rank for rank. You start with the highest rank and the first player of this rank. Process his first request. However, before you process his second request, process the first request of the next player having the highest rank and then of the third player having the highest rank. The algorithm is the same as above, but assuming you have 10 players and players 1-4 are highest rank and players 5-7 are low and players 8-10 are very low, and every player made 3 requests, you process them as
Player 1 - Request 1
Player 2 - Request 1
Player 3 - Request 1
Player 4 - Request 1
Player 1 - Request 2
Player 2 - Request 2
:
That way you have some fairness. You could also choose randomly within a ranking class each time, this could also provide some fairness.
You could implement fairness even across ranks. E.g. if you have 4 ranks, you could say
Rank 1 - 50%
Rank 2 - 25%
Rank 3 - 12.5%
Rank 4 - 6.25%
(Just example values, you may use a different key than always multiplying by 0.5, e.g. multiplying by 0.8, causing the numbers to decrease slower)
Now you can say, you start processing with Rank 1, however once 50% of all Rank 1 requests have been fulfilled, you move on to Rank 2 and make sure 25% of their requests are fulfilled and so on. This way even a Rank 4 user can win over a Rank 1 user, somewhat defeating the initial algorithm, however you offer some fairness. Even a Rank 4 player can sometimes get his request, he won't "run dry". Otherwise a Rank 1 player scheduling every request at the same time as a Rank 4 player will make sure a Rank 4 player has no chance to ever get a single request. This way there is at least a small chance he may get one.
After you made sure everyone had their minimal percentage processed (and the higher the rank, the more this is), you go back to top, starting with Rank 1 again and process the rest of their requests, then the rest of the Rank 2 requests and so on.
Last but not least: You may want to define a maximum slot offset. If a slot is taken, the application should search for the nearest slot still free. However, what if this nearest slot is very far away? If I request a slot Monday at 4 PM and the application finds the next free one to be Wednesday on 9 AM, that's not really helpful for me, is it? I might have no time on Wednesday at all. So you may limit slot search to the same day and saying the slot might be at most 3 hours off. If no slot is found within that range, cancel the request. In that case you need to inform the player "We are sorry, but we could not find any nearby slot for you; please request a slot on another date/time and we will see if we can find a suitable slot there for you".
This is an NP-complete problem, I think, so it'll be impossible to have a very fast algorithm for any large data sets.
There's also the problem where you might have a schedule that is impossible to make. Given that that's not the case, something like this pseudocode is probably your best bet:
sort players by priority, highest to lowest
start with empty schedule
for player in players:
    for timeslot in player.preferences():
        if timeslot is free:
            schedule.fillslot(timeslot, player)
            break
    else:
        # if we get here, it means this player couldn't be accommodated at all.
        # you'll have to go through the slots that were filled and move another (higher-priority) player's time slot
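The pseudocode above translates almost line for line into Python. The sketch below makes a few assumptions that are not in the original: players are `(name, priority, preferences)` tuples, a larger priority number means a higher rank, and slots are any hashable value such as `(court, day, hour)`. Instead of reshuffling earlier bookings, it simply collects the players who could not be placed:

```python
def schedule_players(players):
    """Greedy allocation: best-ranked players pick their preferred free slot first."""
    schedule = {}   # slot -> player name
    unplaced = []   # players none of whose preferences were free
    for name, _priority, prefs in sorted(players, key=lambda p: p[1], reverse=True):
        for slot in prefs:
            if slot not in schedule:
                schedule[slot] = name
                break
        else:
            # No preferred slot was free; a fuller version would try nearby
            # slots or move an earlier booking here.
            unplaced.append(name)
    return schedule, unplaced


players = [
    ("cat", 1, [("court 1", "Mon", 18)]),
    ("ann", 3, [("court 1", "Mon", 18)]),
    ("bob", 2, [("court 1", "Mon", 18), ("court 1", "Mon", 19)]),
]
schedule, unplaced = schedule_players(players)
print(schedule, unplaced)
```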
You are describing a matching problem. Possible references are the Stony Brook algorithm repository and Algorithm Design by Kleinberg and Tardos. If the number of players is equal to the number of courts you can reach a stable matching - The Stable Marriage Problem. Other formulations become harder.
There are several questions I'd ask before answering this question:
what happens if there is a conflict, i.e. a worse player books first, then a better player books the same court? Who wins? what happens for the loser?
do you let the best players play as long as the match runs, or do you have fixed time slots?
how often is the scheduling run - is it run interactively - so potentially someone could be told they can play, only to be told they can't; or is it run in a more batch manner - you put in requests, then get told later if you can have your slot. Or do users set up a number of preferred times, and then the system has to maximise the number of people getting their most highly preferred times?
As an aside, you can make it slightly less complex by re-writing the times as integer indexes (so you're dealing with integers rather than times).
I would advise using a scoring algorithm. Basically construct a formula that pulls all the values you described into a single number. Who ever has the highest final score wins that slot. For example a simple formula might be:
FinalScore = ( PlayerRanking * N1 ) + ( PlayerPreference * N2 )
Where N1, N2 are weights to control the formula.
This will allow you to get good (not perfect) results very quickly. We use this approach on a much more complex system with very good results.
You can add more variety to this by adding in factors for how many times the player has won or lost slots, or (as someone suggested) how much the player paid.
Also, you can use multiple passes to assign slots in the day. Use one strategy where it goes chronologically, one reverse chronologically, one that does the morning first, one that does the afternoon first, etc. Then sum the scores of the players that got the spots, and then you can decide which strategy provided the best results.
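The weighted score from this answer can be sketched as follows. The function names, the tuple layout of a request and the default weights are assumptions made for the example; only the formula itself comes from the answer:

```python
def final_score(ranking, preference, n1=1.0, n2=1.0):
    """FinalScore = (PlayerRanking * N1) + (PlayerPreference * N2)."""
    return ranking * n1 + preference * n2


def pick_winner(requests, n1=1.0, n2=1.0):
    """requests: list of (player, ranking, preference) competing for one slot."""
    return max(requests, key=lambda r: final_score(r[1], r[2], n1, n2))[0]


requests = [("ann", 5, 1), ("bob", 3, 4)]
print(pick_winner(requests))           # equal weights
print(pick_winner(requests, n1=2.0))   # ranking counts double
```

Changing `n1` and `n2` shifts the outcome between favouring rank and favouring how strongly a player wants the slot, which is the tuning knob the answer describes.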
Basically, you have the advantage that players have priorities; therefore, you sort the players by descending priority, and then you start allocating slots to them. The first gets their preferred slot, then the next takes his preferred among the free ones and so on. Apart from the sort, it's a single O(N) pass over the players.
I think you should use genetic algorithm because:
It is best suited for large problem instances.
It yields reduced time complexity at the price of an inexact answer (not the ultimate best).
You can specify constraints & preferences easily by adjusting fitness punishments for not met ones.
You can specify time limit for program execution.
The quality of the solution depends on how much time you intend to spend solving the problem.
Genetic Algorithms Definition
Genetic Algorithms Tutorial
Class scheduling project with GA
Also take a look at: a similar question and another one
Money. Allocate time slots based on who pays the most. In case of a draw don't let any of them have the slot.
