I am trying to make an AI using the alpha-beta pruning method for tic-tac-toe. I need to make checking for a win as fast as possible, as the AI will go through many different possible game states. Right now I have thought of 2 approaches, neither of which is very efficient:
1. Create a large tuple containing every possible 4-in-a-row win condition, and loop through it.
2. Using for loops, check horizontally, vertically, diagonally facing left, and diagonally facing right. This seems like it would be much slower than #1.
How would someone recommend doing it?
From your question, it's a bit unclear how your approaches would be implemented, but given the alpha-beta pruning, it seems as if you want to look at a lot of different game states and, in the recursion, determine a "score" for each one.
One very important observation is that recursion ends once a 4-in-a-row has been found. That means that at the start of a recursion step, the game board does not have any 4-in-a-row instances.
Using this, we can intuitively see that the new piece placed in said recursion step must be a part of any 4-in-a-row instance created during the recursion step. This greatly reduces the search space for solutions from a total of 69 (21 vertical, 24 horizontal, 12+12 diagonals) 4-in-a-row positions to a maximum of 13 (3 vertical, 4 horizontal, 3+3 diagonal).
This should be the baseline for your second approach. It will require a maximum of 52 (13*4) checks for a naive implementation, or 25 (6+7+6+6) checks for a faster algorithm.
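To make that concrete, here is a minimal sketch of the "only check lines through the newly placed piece" idea, assuming a 2D list board with 0 for empty cells and the Connect-Four-style dimensions used in the counts above (names and board layout are just illustrative):

ROWS, COLS = 6, 7  # assumed board size

def wins(board, row, col, player):
    # Check only the four lines that pass through the piece just placed at (row, col).
    for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
        count = 1                          # the newly placed piece itself
        for sign in (1, -1):               # walk outward in both directions
            r, c = row + sign * dr, col + sign * dc
            while 0 <= r < ROWS and 0 <= c < COLS and board[r][c] == player:
                count += 1
                r, c = r + sign * dr, c + sign * dc
        if count >= 4:
            return True
    return False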
Now it's pretty hard to beat 25 boolean checks for this win check, I'd say, but I'm guessing that your #1 approach trades some extra memory usage for less calculation per recursion step. The simplest way of doing this would be to store 8 integers (a single byte each is fine for this application) representing the longest chains of same-color chips that can be found in each of the 8 directions.
Using this, a check for win can be reduced to 8 boolean checks and 4 additions. Simply get the chain lengths on opposite sides of the newly placed chip, check if they're the same color as the chip, and if they are, add their lengths and add 1 (for the newly placed chip).
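A rough sketch of how such stored chain lengths could be used for the win check. The dictionary layout is an assumption, maintaining it as pieces are placed is omitted, and here the colour check is folded into the stored value (0 if the neighbouring chip is a different colour or empty):

AXES = ((0, 1), (1, 0), (1, 1), (1, -1))   # the four lines; each has two opposite directions

def wins_from_chains(chains, row, col, player):
    # chains[(row, col)][(dr, dc)] is assumed to hold the length of the chain of
    # `player`'s pieces extending away from (row, col) in direction (dr, dc).
    for dr, dc in AXES:
        forward = chains[(row, col)].get((dr, dc), 0)
        backward = chains[(row, col)].get((-dr, -dc), 0)
        if forward + backward + 1 >= 4:    # +1 for the newly placed piece
            return True
    return False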
From this calculation, it seems as if your #1 approach might be the most efficient. However, it has a much larger overhead of maintaining the data structure and uses more memory, something that should be avoided unless you can pass by reference. Also (assuming that boolean checks and additions are similar in speed), the much harder approach only wins by a factor of 2 even when ignoring the overhead.
I've made some simplifications, and some explanations maybe weren't crystal clear, but ask if you have any further questions.
I was given a problem in which you are supposed to write a Python program that distributes a number of different weights among 4 boxes.
Logically we can't expect a perfect distribution: if we are given weights like 10, 65, 30, 40, 50 and 60 kilograms, there is no way of grouping those numbers without making one box heavier than another. But we can aim for the most homogeneous distribution, e.g. ((60), (40, 30), (65), (50, 10)).
I can't even think of an algorithm to complete this task, let alone turn it into Python code. Any ideas about the subject would be appreciated.
The problem you're describing is similar to the "fair teams" problem, so I'd suggest looking there first.
Because a simple greedy algorithm where weights are added to the lightest box won't work, the most straightforward solution would be a brute force recursive backtracking algorithm that keeps track of the best solution it has found while iterating over all possible combinations.
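A minimal sketch of such a backtracking search, assuming the goal is to minimise the gap between the heaviest and lightest box (other objectives, such as the sum of squared deviations from the mean, would slot in the same way):

def distribute(weights, n_boxes=4):
    totals = [0] * n_boxes
    assignment = [[] for _ in range(n_boxes)]
    best = {"spread": float("inf"), "boxes": None}

    def backtrack(i):
        if i == len(weights):
            spread = max(totals) - min(totals)
            if spread < best["spread"]:
                best["spread"] = spread
                best["boxes"] = [list(box) for box in assignment]
            return
        for b in range(n_boxes):            # try the next weight in every box
            totals[b] += weights[i]
            assignment[b].append(weights[i])
            backtrack(i + 1)
            assignment[b].pop()              # undo and try the next box
            totals[b] -= weights[i]

    backtrack(0)
    return best["boxes"]

print(distribute([10, 65, 30, 40, 50, 60]))  # prints one grouping with the smallest possible spread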
As stated in #j_random_hacker's response, this is not going to be easy. My best idea right now is to find some baseline: I define the baseline as the object with the largest value, since it cannot be subdivided. Using that, you can start trying to match the rest of the data to that value, which would only take about three iterations. The first and second would create a list of every possible combination, and the third can go over that list and compare the different options by taking the average of each group and storing the closest average value to your baseline.
Using your example, 65 is the baseline, and since you cannot subdivide it, you know it has to be the minimum bound on your data grouping, so you would try to match all of the rest of the values to it. It won't be great, but it does give you something to start with.
As j_random_hacker notes, the partition problem is NP-complete. This problem is also NP-complete by a reduction from the 4-partition problem (the article also contains a link to a paper by Garey and Johnson that proves that 4-partition itself is NP-complete).
In particular, given a list to 4-partition, you could feed that list as an input to a function that solves your box distribution problem. If each box had the same weight in it, a 4-partition would exist, otherwise not.
Your best bet would be to create an exponential-time algorithm that uses backtracking to iterate over the 4^n possible assignments, because unless P = NP (highly unlikely), no polynomial-time algorithm exists for this problem.
I'm attempting to write a program which finds the 'pits' in a list of integers.
A pit is any integer x where x is less than or equal to the integers immediately preceding and following it. If the integer is at the start or end of the list it is only compared on the inward side.
For example in:
[2,1,3] 1 is a pit.
[1,1,1] all elements are pits.
[4,3,4,3,4] the elements at 1 and 3 are pits.
I know how to work this out by taking a linear approach and walking along the list, however I am curious about how to apply divide-and-conquer techniques to do this comparatively quickly. I am quite inexperienced and am not really sure where to start; I feel like something similar to a binary tree could be applied?
If it's pertinent, I'm working in Python 3.
Thanks for your time :).
Without any additional information on the distribution of the values in the list, it is not possible to achieve any algorithmic complexity of less than O(x), where x is the number of elements in the list.
Logically, if the dataset is random, such as Brownian noise, a pit can happen anywhere, requiring a full 1:1 sampling of the list in order to correctly find every pit.
Even if one just wants to find the absolute lowest pit in the sequence, that would not be possible to achieve in sub-linear time without repercussions on the correctness of the results.
Optimizations can be considered, such as parallelization or skipping the values neighbouring a pit, but the overall complexity would stay the same.
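For reference, the linear scan itself is short (a sketch, following the definition in the question):

def find_pits(values):
    pits = []
    for i, v in enumerate(values):
        left_ok = i == 0 or v <= values[i - 1]
        right_ok = i == len(values) - 1 or v <= values[i + 1]
        if left_ok and right_ok:
            pits.append(i)
    return pits

print(find_pits([4, 3, 4, 3, 4]))  # [1, 3]
print(find_pits([1, 1, 1]))        # [0, 1, 2]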
I have a tree. It has a flat bottom. We're only interested in the bottom-most leaves, but this is roughly how many leaves there are at the bottom...
2 x 1600 x 1600 x 10 x 4 x 1600 x 10 x 4
That's ~13,107,200,000,000 leaves? Because of the size (the calculation performed on each leaf seems unlikely to be optimised to ever take less than one second) I've given up thinking it will be possible to visit every leaf.
So I'm thinking I'll build a 'smart' leaf crawler which inspects the most "likely" nodes first (based on results from the ones around it). So it's reasonable to expect the leaves to be evaluated in branches/groups of neighbours, but the groups will vary in size and distribution.
What's the smartest way to record which leaves have been visited and which have not?
You don't give a lot of information, but I would suggest tuning your search algorithm to help you keep track of what it's seen. If you had a global way of ranking leaves by "likelihood", you wouldn't have a problem since you could just visit leaves in descending order of likelihood. But if I understand you correctly, you're just doing a sort of hill climbing, right? You can reduce storage requirements by searching complete subtrees (e.g., all 1600 x 10 x 4 leaves in a cluster that was chosen as "likely"), and keeping track of clusters rather than individual leaves.
It sounds like your tree geometry is consistent, so depending on how your search works, it should be easy to merge your nodes upwards... e.g., keep track of level 1 nodes whose leaves have all been examined, and when all children of a level 2 node are in your list, drop the children and keep their parent. This might also be a good way to choose what to examine: If three children of a level 3 node have been examined, the fourth and last one is probably worth examining too.
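A rough sketch of that "merge upwards" bookkeeping (the node identifiers and branching factor are assumptions about your tree, not a prescription):

visited_children = {}   # parent id -> set of child indices already examined
completed = set()       # ids of nodes whose entire subtree has been examined

def mark_examined(parent_id, child_index, branching_factor):
    children = visited_children.setdefault(parent_id, set())
    children.add(child_index)
    if len(children) == branching_factor:
        # Every child is done: drop the per-child detail and keep just the parent.
        completed.add(parent_id)
        del visited_children[parent_id]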
Finally, a thought: Are you really, really sure that there's no way to exclude some solutions in groups (without examining every individual one)? Problems like sudoku have an astronomically large search space, but a good brute-force solver eliminates large blocks of possibilities without examining every possible 9 x 9 board. Given the scale of your problem, this would be the most practical way to attack it.
It seems that you're looking for a quick and memory-efficient way to do a membership test. If so, and if you can cope with some false positives, go for a Bloom filter.
Bottom line: use a Bloom filter in situations where your data set is really big AND all you need is to check whether a particular element exists in the set AND a small chance of false positives is tolerable.
Some implementations for Python should exist.
Hope this helps.
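If it helps to see the idea, here is a toy Bloom filter (purely illustrative; a real library would size the bit array and choose hash functions more carefully):

import hashlib

class BloomFilter:
    def __init__(self, num_bits=10_000_000, num_hashes=5):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive several bit positions from independent-ish hashes of the item.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))

bf = BloomFilter()
bf.add("leaf-42")
print("leaf-42" in bf)   # True
print("leaf-43" in bf)   # almost certainly False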
Maybe this is too obvious, but you could store your results in a similar tree. Since your computation is slow, the results tree should not grow out of hand too quickly. Then just look up if you have results for a given node.
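For example (a trivial sketch, keying results by the path from the root as a tuple of child indices):

results = {}   # path from root (tuple of child indices) -> computed result

def get_or_compute(path, compute):
    if path not in results:
        results[path] = compute(path)   # the slow per-leaf calculation runs at most once
    return results[path]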
I'm working on a problem and one solution would require as input every possible 14x10 matrix made up of 1's and 0's... how can I generate these so that I can feed every possible 14x10 matrix into another function? Thank you!
Added March 21: It looks like I didn't word my post appropriately. Sorry. What I'm trying to do is optimize the output of 10 different production units (given different speeds and amounts of downtime) for several scenarios. My goal is to place blocks of downtime so as to minimize the differences in production on a day-to-day basis. The amount and frequency of downtime each unit is allowed is given. I am currently trying to evaluate a three-week cycle, meaning every three weeks each production unit is taken down for a given number of hours. I was asking the computer to determine the order in which the units would be taken down, under the constraint that the lines come down only once every 3 weeks and the difference in daily production is the smallest possible.
My first approach was to use Excel (as I tried to describe above) and it didn't work (no surprise there), where 1 = running, 0 = off, and these are summed to calculate production. The calculated production is subtracted from a set maximum daily production. Then these differences were compared going from Mon-Tues, Tues-Wed, etc. over a three-week time frame and minimized using Solver. My next approach was to write a Matlab code where the input was a tolerance (the allowed day-to-day variation). Is there a program that already does this, or an easier approach to doing this? It seems simple enough, but I'm still thinking through the different ways to go about this. Any insight would be much appreciated.
The actual implementation depends heavily on how you want to represent matrices… But assuming the matrix can be represented by a 14 * 10 = 140 element list:
from itertools import product
for matrix in product([0, 1], repeat=140):
    # ... do stuff with the matrix ...
    pass
Of course, as other posters have noted, this probably isn't what you want to do… But if it really is what you want to do, that's the best code (given your requirements) to do it.
Generating every possible 14*10 matrix of 1's and 0's would produce 2**140 matrices. I don't believe you would have enough lifetime for that; I don't know if the sun would still be shining before you finished. That is why it is impossible to generate all those matrices, and why you must look for some other solution; this approach is pure brute force.
This is absolutely impossible! The number of possible matrices is 2^140, which is around 1.4e42. However, consider the following...
If you were to generate two 14-by-10 matrices at random, the odds that they would be the same are 1 in 1.4e42.
If you were to generate 1 billion unique 14-by-10 matrices, then the odds that the next one you generate would be the same as one of those would still be exceedingly slim: 1 in 1.4e33.
The default random number stream in MATLAB uses a Mersenne twister algorithm that has a period of 2^19937-1. Therefore, the random number generator shouldn't start repeating itself any time this eon.
Your approach should be thus:
Find a computer no one ever wants to use again.
Give it as much storage space as possible to save your results.
Install MATLAB on it and fire it up.
Start computing matrices at random like so:
while true
    newMatrix = randi([0 1],14,10);
    %# Process the matrix and output your results to disk
end
Walk away
Since there are so many combinations, you don't have to compare newMatrix with any of the previous matrices since the length of time before a repeat is likely to occur is astronomically large. Your processing is more likely to stop due to other reasons first, such as (in order of likely occurrence):
You run out of disk space to store your results.
There's a power outage.
Your computer suffers a fatal hardware failure.
You pass away.
The Earth passes away.
The Universe dies a slow heat death.
NOTE: Although I injected some humor into the above answer, I think I have illustrated one useful alternative. If you simply want to sample a small subset of the possible combinations (where even 1 billion could be considered "small" due to the sheer number of combinations), then you don't have to go through the extra time- and memory-consuming steps of saving all of the matrices you've already processed and comparing new ones against them to make sure you aren't repeating matrices. Since the odds of repeating a combination are so low, you could safely do this:
for iLoop = 1:whateverBigNumberYouWant
    newMatrix = randi([0 1],14,10);  %# Generate a new matrix
    %# Process the matrix and save your results
end
Are you sure you want every possible 14x10 matrix? There are 140 elements in each matrix, and each element can be on or off. Therefore there are 2^140 possible matrices. I suggest you reconsider what you really want.
Edit: I noticed you mentioned in a comment that you are trying to minimize something. There is an entire mathematical field called optimization devoted to doing this type of thing. The reason this field exists is because quite often it is not possible to exhaustively examine every solution in anything resembling a reasonable amount of time.
Trying this:
import numpy
for i in xrange(int(1e9)): a = numpy.random.random_integers(0,1,(14,10))
(which is much, much, much smaller than what you require) should be enough to convince you that this is not feasible. It also shows you how to generate one, or a few, such random matrices (even up to a million is pretty fast).
EDIT: changed to xrange to "improve speed and memory requirements" :)
You don't have to store them all in order to iterate over them:
def everyPossibleMatrix(x, y):
    N = x * y
    for i in range(2**N):
        b = "{:0{}b}".format(i, N)
        yield '\n'.join(b[j*x:(j+1)*x] for j in range(y))
Depending on what you want to accomplish with the generated matrices, you might be better off generating a random sample and running a number of simulations. Something like:
import numpy

matrix_samples = []
# generate 10 matrices
for i in range(10):
    sample = numpy.random.binomial(1, .5, 14*10)
    sample.shape = (14, 10)
    matrix_samples.append(sample)
You could do this a number of times to see how results vary across simulations. Of course, you could also modify the code to ensure that there are no repeats in a sample set, again depending on what you're trying to accomplish.
Are you saying that you have a table with 140 cells and each value can be 1 or 0 and you'd like to generate every possible output? If so, you would have 2^140 possible combinations...which is quite a large number.
Instead of just saying that this is infeasible, I would suggest considering a scheme that samples an important subset of all possible combinations instead of applying a brute-force approach. As one of your replies suggested, you are doing minimization. There are numerical techniques for this, such as simulated annealing and Monte Carlo sampling, as well as traditional minimization algorithms. You might want to look into whether one is appropriate in your case.
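As a very rough illustration of what such a sampling scheme might look like, here is a simulated-annealing sketch over a 14x10 0/1 matrix; objective() is a made-up placeholder, not your actual production criterion, and all names are illustrative:

import math
import random
import numpy

def objective(m):
    # Placeholder objective: distance of the total number of 1's from 70.
    return abs(int(m.sum()) - 70)

def anneal(steps=10000, temp=1.0, cooling=0.999):
    current = numpy.random.randint(0, 2, (14, 10))
    best, best_score = current.copy(), objective(current)
    for _ in range(steps):
        candidate = current.copy()
        i, j = random.randrange(14), random.randrange(10)
        candidate[i, j] ^= 1                               # flip a single cell
        delta = objective(candidate) - objective(current)
        if delta <= 0 or random.random() < math.exp(-delta / temp):
            current = candidate
            if objective(current) < best_score:
                best, best_score = current.copy(), objective(current)
        temp *= cooling
    return best, best_score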
I was actually much more pessimistic to begin with, but consider:
from math import log, e

def timeInYears(totalOpsNeeded=2**140, currentOpsPerSecond=10**9, doublingPeriodInYears=1.5):
    secondsPerYear = 365.25 * 24 * 60 * 60
    doublingPeriodInSeconds = doublingPeriodInYears * secondsPerYear
    k = log(2, e) / doublingPeriodInSeconds  # time-proportionality constant
    timeInSeconds = log(1 + k*totalOpsNeeded/currentOpsPerSecond, e) / k
    return timeInSeconds / secondsPerYear
If we assume that computer processing power continues to double every 18 months, and you can currently do a billion combinations per second (optimistic, but for the sake of argument), and you start today, your calculation will be complete on or about April 29th 2137.
Here is an efficient way to get started in Matlab:
First generate all 1024 possible rows of length 10 containing only zeros and ones:
dec2bin(0:2^10-1)
Now you have all possible rows, and you can sample from them as you wish. For example by calling the following line a few times:
randperm(1024,14)
I need to make a program in Python that chooses cars from an array filled with the masses of 10 cars. The idea is that it fills a barge, which can hold ~8 tonnes, as effectively as possible, so that minimum capacity is left unused. My idea is that it generates combinations of the masses and chooses the one that is closest to the max weight. But since I'm new to algorithms, I don't have a clue how to do it.
I'd solve this exercise with dynamic programming. You should be able to get the optimal solution in O(m*n) operations (n being the number of cars, m being the total mass).
That will only work, however, if the masses are all integers.
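A small sketch of that dynamic-programming idea, assuming the masses are integers (e.g. expressed in kilograms); the numbers below are made up:

def best_load(masses, capacity):
    reachable = {0: []}                     # achievable total mass -> list of car indices
    for car, mass in enumerate(masses):
        for total, cars in list(reachable.items()):
            new_total = total + mass
            if new_total <= capacity and new_total not in reachable:
                reachable[new_total] = cars + [car]
    best = max(reachable)
    return best, reachable[best]

masses_kg = [1100, 1500, 2000, 1200, 1800, 1100, 1600, 1900, 1400, 1300]  # made-up masses
print(best_load(masses_kg, 8000))           # heaviest achievable load <= 8000 kg, and which cars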
In general you have a binary linear programming problem. Those are very hard in general (NP-complete).
However, both ways lead to algorithms which I wouldn't consider beginner material. You might be better off with trial and error (as you suggested) or simply trying every possible combination.
This is a 1D bin-packing problem. It's NP-hard, so there is no known efficient algorithm that is guaranteed to find the optimal solution; however, a greedy algorithm gives a reasonable approximation. Most likely you want to try my bin-packing solver at phpclasses.org (bin-packing).
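A minimal greedy sketch in that spirit (not the linked solver): take cars heaviest-first and load each one that still fits. It's fast but not guaranteed to be optimal.

def greedy_load(masses, capacity):
    load, chosen = 0, []
    for i in sorted(range(len(masses)), key=lambda j: masses[j], reverse=True):
        if load + masses[i] <= capacity:
            load += masses[i]
            chosen.append(i)
    return load, chosen

print(greedy_load([1.1, 1.5, 2.0, 1.2, 1.8, 1.1, 1.6, 1.9, 1.4, 1.3], 8.0))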
If I have an unweighted, undirected graph and each node is connected to every other node, then I have (n^2-n)/2 pairs of nodes and overall n^2-n possibilities/combinations:
1,2,3,4,5,...,64
2,1,X,X,X,...,X
3,X,1,X,X,...,X
4,X,X,1,X,...,X
5,X,X,X,1,...,X
.,X,X,X,X,1,.,X
.,X,X,X,X,X,1,X
64,X,X,X,X,X,X,1
Isn't this the same with 10 cars (45 pairs of cars and 90 combinations/possibilities)? Did I forget something? Where is the error?
A problem like you have here is similar to the classic traveling salesman problem, which asks for the most efficient way for a salesman to visit a list of cities. The difference is that it is conceivable that you might not need every car to fill the barge, whereas the salesman must visit every city. But the problem is similar. The brute-force way to solve the problem is to investigate every possible combination of cars, from 1 car to all 10. We will assume that it is valid to have any number of each car (i.e. if car 2 is a Ford Focus, you could have three Ford Foci). This is easy to change if the car list is an exact list of specific cars, however, and you can use only 1 of each.
Now, this quickly begins to consume a lot of time. As the number of cars goes up, the number of possible combinations of cars goes up geometrically, which means that with a number smaller than you expect, it will take longer to run the program than there is time left in your life. 10 should be manageable, though (it turns out to be a little more than 700,000 combinations, or 1024 if you can only have one of each item).
The first thing is to define the weight of each car and the maximum weight the barge can carry.
weights = [1, 2, 1, 3, 1, 2, 2, 4, 1, 2]
capacity = 8
Now we need some way to find each possible combination. Python's itertools module has a function that will give us every combination of a given length, but we want all lengths. So we will write one loop that goes from 1 to 10 and calls itertools.combinations_with_replacement for each length. We can then find out the total weight of each combination, and if it is higher than any weight we have already found, yet still within capacity, we will remember it as the best we have found so far.
The only real trick here is that we don't want combinations of the weights -- we want combinations of the indexes of the weights, because at the end we want to know which cars to put on the barge, not their weights. So combinations_with_replacement(range(10), ...) rather than combinations_with_replacement(weights, ...). Inside the loop you will want to get the weight of each car in the combination with weights[i] to sum it up.
I originally had code here, but took it out since this is homework. :-) (It wasn't originally tagged as such, but I should have known -- I blame the time change.)
If you wanted to allow only one of each car, you'd use itertools.combinations instead of combinations_with_replacement.
A shortcut is possible since you mention elsewhere that cars weigh from 1-2 tonnes. This means that you know you will need at least 4 of them (4 * 2 = 8), so you can skip all the combinations of 1-3 cars. However, this wouldn't generalize well if the prof changed the parameters on you.