4-Way MergeSort Challenge - Python

4-Way MergeSort Challenge - Python - python

I'm trying to write a MergeSort function that splits the array into 4 arrays and not into 2 like the regular MergeSort..
I tried to follow the 2-way mergeSort and implement it to the 4-way but I keep getting stuck in recursive calls and I can't understand where's the problem..
I wrote a merge_sort_4 function that calls itself 4 times and merge4 function that should merge 4 arrays and not 2.
I know some people in my class solved it with 3 calls to the regular Merge function but I think it kinda misses the point of this challenge..
If the only way of solving it is by using the regular Merge please tell me, if not, please help me find the problem
Here's my code
def merge_sort_4(lst, start, end):
if start < end:
quarter1 = (start + end) // 4
quarter2 = (start + end) // 2
quarter3 = (end - quarter1 - 1)
merge_sort_4(lst, start, quarter1)
merge_sort_4(lst, quarter1 + 1, quarter2)
merge_sort_4(lst, quarter2 + 1, quarter3)
merge_sort_4(lst, quarter3 + 1, end)
merge4(lst, start, quarter1, quarter2, quarter3, end)
def merge4(lst, start, q1, q2, q3, end):
first_q_list = lst[start:q1 + 1]
sec_q_list = lst[q1 + 1:q2 + 1]
third_q_list = lst[q2 + 1:q3 + 1]
last_q_list = lst[q3 + 1:end + 1]
first_q_list.append(float('inf'))
sec_q_list.append(float('inf'))
third_q_list.append(float('inf'))
last_q_list.append(float('inf'))
i = 0 # first sublist index
j = 0 # sec sublist index
m = 0 # third sublist index
n = 0 # last sublist index
for k in range(start, end + 1):
if first_q_list[i] <= sec_q_list[j] and first_q_list[i] <= third_q_list[m] and first_q_list[i] <= last_q_list[
n]:
lst[k] = first_q_list[i]
i += 1
elif sec_q_list[j] <= third_q_list[m] and sec_q_list[j] <= last_q_list[n]:
lst[k] = sec_q_list[j]
j += 1
elif third_q_list[m] <= last_q_list[n]:
lst[k] = third_q_list[m]
m += 1
else:
lst[k] = last_q_list[n]
n += 1
thanks in advance for your help!

Okay, so I have some ideas about things that are probably wrong here, but you've not really given much indication of what help you need, so first I want to point you in the right direction for how to figure this out yourself.
Before you even write any code to go inside merge4 and merge_sort_4, can you describe what are their preconditions and postconditions? I.e., what do you require to be true before one of them is called, both about the state of the list and the arguments passed to it? What is its job? What does it guarantee to have done when it has finished? What values can start have, and what values can end have?
I'm hoping that you're going to describe some constraints such as that 0 <= start <= end < len(lst) for calling merge_sort_4 and that 0 <= start <= q1 <= q2 <= q3 <= end <= len(lst) for calling merge4, and that merge4 requires the four sublists to be sorted and will guarantee that the entire range from list index start to end is sorted.
Now - are those things actually true? Can you walk through the code with example values and see what would happen? Can you add assertions into the algorithm to catch the first time one of them isn't true? Can you test the pieces of the algorithm separately on small pieces of test data to see if they behave as you expect?
Try and see if you can figure it out on your own, but if not, here's one place I suggest you start looking...
...
...
I would expect that quarter1, quarter2 and quarter3 should all fall within the range of start to end but I think the calculations you have there might not be doing what you expect. Try some different values of start and end and see if you are surprised what comes out.

There is an issue with using sentinel values (float('inf')). Consider the case when the ends of multiple runs have been reached, for example when first_q_list[i] == sec_q_list[j] == float('inf'), in which case the float('inf') at first_q_list[i] is copied to lst[k] and i is incremented beyond the end of first_q_list[].
The code needs to check for reaching the end of a run each time it copies an element, and if so, drop down to a 3 way merge of the remaining runs. Later dropping to 2 way merge and then finally a copy of the rest of the one remaining run.
The code could be improved, consider the case where third_q_lst[] or last_q_lst[] has the smallest element, it takes 6 compares to get to those cases. By using nested if + else, this can be reduced to just 3 compares along any path to determine which run has the smallest element. Example for 3 way merge to find smallest of a, b, c, with only 2 compares for any of the 3 possible cases:
if(a <= b)
if(a <= c)
a is smallest
else
c is smallest
else
if(b <= c)
b is smallest
else
c is smallest
Link to java example of a 4 way top down hybrid merge sort + insertion sort (for small runs). I haven't bothered porting it to Python because Python is so slow, and due to Python being an interpretive language, 4 way merge sort would probably be slower in Python than 2 way (with a compiled language, 4 way is about 15% faster than 2 way).
How can I implement the Merge Sort algorithm with 4-way partition without the error ArrayIndexOutOfBoundsException?

Related

How do I modify this Fibonacci Sequence Problem?

I am trying to figure out how I can modify the code below to help solve the question given. However, instead of only being able to take 1 or 2 steps at a time, I want to make it so I can also take 3 steps.
You have a ladder of N steps (rungs). You can go up the ladder by taking either 1 step or two steps at a time, in any combination. How many different routes are there (combinations of 1 steps or 2 steps) to make it up the ladder?
Here is some code that I'm trying to modify:
def countP(n):
if (n == 1 or n == 2):
return n
return countP(n-1) + countP(n-2)
I've already tried this so far and I am not getting the correct answer:
def countP(n):
if (n == 1 or n == 2 or n == 3):
return n
return countP(n-1) + countP(n-2) + countP(n-3)
Any help of guidance would be of great help! Thanks

Your base case in the recursion for n = 3 is wrong.
For n = 3, the correct answer should be 4, but you are returning 3.
I suggest you to simplify the base case by using the following observations:
Utmost base case is when n <= 1 i.e when we have only one stair or no stair, hence the only way to climb is taking a step of 1 unit or 0 unit. Hence, number of ways is countP(0) = 1 and countP(1) = 1.
What happens when n > 1 ? Well, we have three options to take for first step - we can take m units (1 unit, 2 units or 3 units step) as first step provided m <= n.
If we can take 1 unit step as first step, we can reduce the sub-problem from countP(n) to countP(n-1).
If we can take 2 units step as first step, we can reduce the sub-problem from countP(n) to countP(n-2).
If we can take 3 units step as first step, we can reduce the sub-problem from countP(n) to countP(n-3).
So, our final count will be : countP(n - m) for all m <= n
Code will be as follows:
def countP(n):
if (n == 0 or n == 1):
return 1
count = 0
for m in [1, 2, 3]:
if m <= n:
count += countP(n - m)
return count

The line return n is correct for the first problem but not the second. Keep in mind that the result is supposed to be the number of possible routes you can take.
If you can take either one or two steps at a time, then when you have one rung left, there is only one thing to do: take one step. If you have two rungs left, you have two options: either take two steps, or take one step (one rung), then another (the second rung). So, somewhat by coindicence, for this version of the problem the number of routes in the base case happens to equal the number of rungs left.
If you can take either one, two or three steps at a time, then the number of routes when you have three rungs remaining is not three; there are more than three options. You will have to count how many options there are, and return that in the case where n == 3.

Project Euler #25 - Can the performance be improved further?

The 12th term, F12, is the first term to contain three digits.
What is the index of the first term in the Fibonacci sequence to contain 1000 digits?
a = 1
b = 1
i = 2
while(1):
c = a + b
i += 1
length = len(str(c))
if length == 1000:
print(i)
break
a = b
b = c
I got the answer(works fast enough). Just looking if there's a better way for this question

If you've answered the question, you'll find plenty of explanations on answers in the problem thread. The solution you posted is pretty much okay. You may get a slight speedup by simply checking that your c>=10^999 at every step instead of first converting it to a string.
The better method is to use the fact that when the Fibonacci numbers become large enough, the Fibonacci numbers converge to round(phi**n/(5**.5)) where phi=1.6180... is the golden ratio and round(x) rounds x to the nearest integer. Let's consider the general case of finding the first Fibonacci number with more than m digits. We are then looking for n such that round(phi**n/(5**.5)) >= 10**(m-1)
We can easily solve that by just taking the log of both sides and observe that
log(phi)*n - log(5)/2 >= m-1 and then solve for n.
If you're wondering "well how do I know that it has converged by the nth number?" Well, you can check for yourself, or you can look online.
Also, I think questions like these either belong on the Code Review SE or the Computer Science SE. Even Math Overflow might be a good place for Project Euler questions, since many are rooted in number theory.

Your solution is completely fine for #25 on project euler. However, if you really want to optimize for speed here you can try to calculate fibonacci using the identities I have written about in this blog post: https://sloperium.github.io/calculating-the-last-digits-of-large-fibonacci-numbers.html
from functools import lru_cache
#lru_cache(maxsize=None)
def fib4(n):
if n <= 1:
return n
if n % 2:
m = (n + 1) // 2
return fib4(m) ** 2 + fib4(m - 1) ** 2
else:
m = n // 2
return (2 * fib4(m - 1) + fib4(m)) * fib4(m)
def binarySearch( length):
first = 0
last = 10**5
found = False
while first <= last and not found:
midpoint = (first + last) // 2
length_string = len(str(fib4(midpoint)))
if length_string == length:
return midpoint -1
else:
if length < length_string:
last = midpoint - 1
else:
first = midpoint + 1
print(binarySearch(1000))
This code tests about 12 times faster than your solution. (it does require an initial guess about max size though)

Processing big list using python

So I'm trying to solve a challenge and have come across a dead end. My solution works when the list is small or medium but when it is over 50000. It just "time out"
a = int(input().strip())
b = list(map(int,input().split()))
result = []
flag = []
for i in range(len(b)):
temp = a - b[i]
if(temp >=0 and temp in flag):
if(temp<b[i]):
result.append((temp,b[i]))
else:
result.append((b[i],temp))
flag.remove(temp)
else:
flag.append(b[i])
result.sort()
for i in result:
print(i[0],i[1])
Where
a = 10
and b = [ 2, 4 ,6 ,8, 5 ]
Solution sum any two element in b which matches a
**Edit: ** Updated full code

flag is a list, of potentially the same order of magnitude as b. So, when you do temp in flag that's a linear search: it has to check every value in flag to see if that value is == temp. So, that's 50000 comparisons. And you're doing that once per loop in a linear walk over b. So, your total time is quadratic: 50,000 * 50,000 = 2,500,000,000. (And flag.remove is also linear time.)
If you replace flag with a set, you can test it for membership (and remove from it) in constant time. So your total time drops from quadratic to linear, or 50,000 steps, which is a lot faster than 2 billion:
flagset = set(flag)
for i in range(len(b)):
temp = a - b[i]
if(temp >=0 and temp in flagset):
if(temp<b[i]):
result.append((temp,b[i]))
else:
result.append((b[i],temp))
flagset.remove(temp)
else:
flagset.add(b[i])
flag = list(flagset)
If flag needs to retain duplicate values, then it's a multiset, not a set, which means you can implement with Counter:
flagset = collections.Counter(flag)
for i in range(len(b)):
temp = a - b[i]
if(temp >=0 and flagset[temp]):
if(temp<b[i]):
result.append((temp,b[i]))
else:
result.append((b[i],temp))
flagset[temp] -= 1
else:
flagset[temp] += 1
flag = list(flagset.elements())
In your edited code, you’ve got another list that’s potentially of the same size, result, and you’re sorting that list every time through the loop.
Sorting takes log-linear time. Since you do it up to 50,000 times, that’s around log(50;000) * 50,000 * 50,000, or around 30 billion steps.
If you needed to keep result in order throughout the operation, you’d want to use a logarithmic data structure, like a binary search tree or a skiplist, so you could insert a new element in the right place in logarithmic time, which would mean just 800.000 steps.
But you don’t need it in order until the end. So, much more simply, just move the result.sort out of the loop and do it at the end.

k-greatest double selection

Imagine you have two sacks (A and B) with N and M balls respectively in it. Each ball with a known numeric value (profit). You are asked to extract (with replacement) the pair of balls with the maximum total profit (given by the multiplication of the selected balls).
The best extraction is obvious: Select the greatest valued ball from A as well as from B.
The problem comes when you are asked to give the 2nd or kth best selection. Following the previous approach you should select the greatest valued balls from A and B without repeating selections.
This can be clumsily solved calculating the value of every possible selection, ordering and ordering it (example in python):
def solution(A,B,K):
if K < 1:
return 0
pool = []
for a in A:
for b in B:
pool.append(a*b)
pool.sort(reverse=True)
if K>len(pool):
return 0
return pool[K-1]
This works but its worst time complexity is O(N*M*Log(M*M)) and I bet there are better solutions.
I reached a solution based on a table where A and B elements are sorted from higher value to lower and each of these values has associated an index representing the next value to test from the other column. Initially this table would look like:
The first element from A is 25 and it has to be tested (index 2 select from b = 0) against 20 so 25*20=500 is the first greatest selection and, after increasing the indexes to check, the table changes to:
Using these indexes we have a swift way to get the best selection candidates:
25 * 20 = 500 #first from A and second from B
20 * 20 = 400 #second from A and first from B
I tried to code this solution:
def solution(A,B,K):
if K < 1:
return 0
sa = sorted(A,reverse=true)
sb = sorted(B,reverse=true)
for k in xrange(K):
i = xfrom
j = yfrom
if i >= n and j >= n:
ret = 0
break
best = None
while i < n and j < n:
selected = False
#From left
nexti = i
nextj = sa[i][1]
a = sa[nexti][0]
b = sb[nextj][0]
if best is None or best[2]<a*b:
selected = True
best = [nexti,nextj,a*b,'l']
#From right
nexti = sb[j][1]
nextj = j
a = sa[nexti][0]
b = sb[nextj][0]
if best is None or best[2]<a*b:
selected = True
best = [nexti,nextj,a*b,'r']
#Keep looking?
if not selected or abs(best[0]-best[1])<2:
break
i = min(best[:2])+1
j = i
print("Continue with: ", best, selected,i,j)
#go,go,go
print(best)
if best[3] == 'l':
dx[best[0]][1] = best[1]+1
dy[best[1]][1] += 1
else:
dx[best[0]][1] += 1
dy[best[1]][1] = best[0]+1
if dx[best[0]][1]>= n:
xfrom = best[0]+1
if dy[best[1]][1]>= n:
yfrom = best[1]+1
ret = best[2]
return ret
But it did not work for the on-line Codility judge (Did I mention this is part of the solution to an, already expired, Codility challenge? Sillicium 2014)
My questions are:
Is the second approach an unfinished good solution? If that is the case, any clue on what I may be missing?
Do you know any better approach for the problem?

You need to maintain a priority queue.
You start with (sa[0], sb[0]), then move onto (sa[0], sb[1]) and (sa[1], sb[0]). If (sa[0] * sb[1]) > (sa[1] * sb[0]), can we say anything about the comparative sizes of (sa[0], sb[2]) and (sa[1], sb[0])?
The answer is no. Thus we must maintain a priority queue, and after removing each (sa[i], sb[j]) (such that sa[i] * sb[j] is the biggest in the queue), we must add to the priority queue (sa[i - 1], sb[j]) and (sa[i], sb[j - 1]), and repeat this k times.
Incidentally, I gave this algorithm as an answer to a different question. The algorithm may seem to be different at first, but essentially it's solving the same problem.

I'm not sure I understand the "with replacement" bit...
...but assuming this is in fact the same as "How to find pair with kth largest sum?", then the key to the solution is to consider the matrix S of all the sums (or products, in your case), constructed from A and B (once they are sorted) -- this paper (referenced by #EvgenyKluev) gives this clue.
(You want A*B rather than A+B... but the answer is the same -- though negative numbers complicate but (I think) do not invalidate the approach.)
An example shows what is going on:
for A = (2, 3, 5, 8, 13)
and B = (4, 8, 12, 16)
we have the (notional) array S, where S[r, c] = A[r] + B[c], in this case:
6 ( 2+4), 10 ( 2+8), 14 ( 2+12), 18 ( 2+16)
7 ( 3+4), 11 ( 3+8), 15 ( 3+12), 19 ( 3+16)
9 ( 5+4), 13 ( 5+8), 17 ( 5+12), 21 ( 5+16)
12 ( 8+4), 16 ( 8+8), 20 ( 8+12), 14 ( 8+16)
17 (13+4), 21 (13+8), 25 (13+12), 29 (13+16)
(As the referenced paper points out, we don't need to construct the array S, we can generate the value of an item in S if or when we need it.)
The really interesting thing is that each column of S contains values in ascending order (of course), so we can extract the values from S in descending order by doing a merge of the columns (reading from the bottom).
Of course, merging the columns can be done using a priority queue (heap) -- hence the max-heap solution. The simplest approach being to start the heap with the bottom row of S, marking each heap item with the column it came from. Then pop the top of the heap, and push the next item from the same column as the one just popped, until you pop the kth item. (Since the bottom row is sorted, it is a trivial matter to seed the heap with it.)
The complexity of this is O(k log n) -- where 'n' is the number of columns. The procedure works equally well if you process the rows... so if there are 'm' rows and 'n' columns, you can choose the smaller of the two !
NB: the complexity is not O(k log k)... and since for a given pair of A and B the 'n' is constant, O(k log n) is really O(k) !!
If you want to do many probes for different 'k', then the trick might be to cache the state of the process every now and then, so that future 'k's can be done by restarting from the nearest check-point. In the limit, one would run the merge to completion and store all possible values, for O(1) lookup !

KenKen puzzle addends: REDUX A (corrected) non-recursive algorithm

This question relates to those parts of the KenKen Latin Square puzzles which ask you to find all possible combinations of ncells numbers with values x such that 1 <= x <= maxval and x(1) + ... + x(ncells) = targetsum. Having tested several of the more promising answers, I'm going to award the answer-prize to Lennart Regebro, because:
his routine is as fast as mine (+-5%), and
he pointed out that my original routine had a bug somewhere, which led me to see what it was really trying to do. Thanks, Lennart.
chrispy contributed an algorithm that seems equivalent to Lennart's, but 5 hrs later, sooo, first to the wire gets it.
A remark: Alex Martelli's bare-bones recursive algorithm is an example of making every possible combination and throwing them all at a sieve and seeing which go through the holes. This approach takes 20+ times longer than Lennart's or mine. (Jack up the input to max_val = 100, n_cells = 5, target_sum = 250 and on my box it's 18 secs vs. 8+ mins.) Moral: Not generating every possible combination is good.
Another remark: Lennart's and my routines generate the same answers in the same order. Are they in fact the same algorithm seen from different angles? I don't know.
Something occurs to me. If you sort the answers, starting, say, with (8,8,2,1,1) and ending with (4,4,4,4,4) (what you get with max_val=8, n_cells=5, target_sum=20), the series forms kind of a "slowest descent", with the first ones being "hot" and the last one being "cold" and the greatest possible number of stages in between. Is this related to "informational entropy"? What's the proper metric for looking at it? Is there an algorithm that producs the combinations in descending (or ascending) order of heat? (This one doesn't, as far as I can see, although it's close over short stretches, looking at normalized std. dev.)
Here's the Python routine:
#!/usr/bin/env python
#filename: makeAddCombos.07.py -- stripped for StackOverflow
def initialize_combo( max_val, n_cells, target_sum):
"""returns combo
Starting from left, fills combo to max_val or an intermediate value from 1 up.
E.g.: Given max_val = 5, n_cells=4, target_sum = 11, creates [5,4,1,1].
"""
combo = []
#Put 1 in each cell.
combo += [1] * n_cells
need = target_sum - sum(combo)
#Fill as many cells as possible to max_val.
n_full_cells = need //(max_val - 1)
top_up = max_val - 1
for i in range( n_full_cells): combo[i] += top_up
need = target_sum - sum(combo)
# Then add the rest to next item.
if need > 0:
combo[n_full_cells] += need
return combo
#def initialize_combo()
def scrunch_left( combo):
"""returns (new_combo,done)
done Boolean; if True, ignore new_combo, all done;
if Falso, new_combo is valid.
Starts a new combo list. Scanning from right to left, looks for first
element at least 2 greater than right-end element.
If one is found, decrements it, then scrunches all available counts on its
right up against its right-hand side. Returns the modified combo.
If none found, (that is, either no step or single step of 1), process
done.
"""
new_combo = []
right_end = combo[-1]
length = len(combo)
c_range = range(length-1, -1, -1)
found_step_gt_1 = False
for index in c_range:
value = combo[index]
if (value - right_end) > 1:
found_step_gt_1 = True
break
if not found_step_gt_1:
return ( new_combo,True)
if index > 0:
new_combo += combo[:index]
ceil = combo[index] - 1
new_combo += [ceil]
new_combo += [1] * ((length - 1) - index)
need = sum(combo[index:]) - sum(new_combo[index:])
fill_height = ceil - 1
ndivf = need // fill_height
nmodf = need % fill_height
if ndivf > 0:
for j in range(index + 1, index + ndivf + 1):
new_combo[j] += fill_height
if nmodf > 0:
new_combo[index + ndivf + 1] += nmodf
return (new_combo, False)
#def scrunch_left()
def make_combos_n_cells_ge_two( combos, max_val, n_cells, target_sum):
"""
Build combos, list of tuples of 2 or more addends.
"""
combo = initialize_combo( max_val, n_cells, target_sum)
combos.append( tuple( combo))
while True:
(combo, done) = scrunch_left( combo)
if done:
break
else:
combos.append( tuple( combo))
return combos
#def make_combos_n_cells_ge_two()
if __name__ == '__main__':
combos = []
max_val = 8
n_cells = 5
target_sum = 20
if n_cells == 1: combos.append( (target_sum,))
else:
combos = make_combos_n_cells_ge_two( combos, max_val, n_cells, target_sum)
import pprint
pprint.pprint( combos)

Your algorithm seems pretty good at first blush, and I don't think OO or another language would improve the code. I can't say if recursion would have helped but I admire the non-recursive approach. I bet it was harder to get working and it's harder to read but it likely is more efficient and it's definitely quite clever. To be honest I didn't analyze the algorithm in detail but it certainly looks like something that took a long while to get working correctly. I bet there were lots of off-by-1 errors and weird edge cases you had to think through, eh?
Given all that, basically all I tried to do was pretty up your code as best I could by replacing the numerous C-isms with more idiomatic Python-isms. Often times what requires a loop in C can be done in one line in Python. Also I tried to rename things to follow Python naming conventions better and cleaned up the comments a bit. Hope I don't offend you with any of my changes. You can take what you want and leave the rest. :-)
Here are the notes I took as I worked:
Changed the code that initializes tmp to a bunch of 1's to the more idiomatic tmp = [1] * n_cells.
Changed for loop that sums up tmp_sum to idiomatic sum(tmp).
Then replaced all the loops with a tmp = <list> + <list> one-liner.
Moved raise doneException to init_tmp_new_ceiling and got rid of the succeeded flag.
The check in init_tmp_new_ceiling actually seems unnecessary. Removing it, the only raises left were in make_combos_n_cells, so I just changed those to regular returns and dropped doneException entirely.
Normalized mix of 4 spaces and 8 spaces for indentation.
Removed unnecessary parentheses around your if conditions.
tmp[p2] - tmp[p1] == 0 is the same thing as tmp[p2] == tmp[p1].
Changed while True: if new_ceiling_flag: break to while not new_ceiling_flag.
You don't need to initialize variables to 0 at the top of your functions.
Removed combos list and changed function to yield its tuples as they are generated.
Renamed tmp to combo.
Renamed new_ceiling_flag to ceiling_changed.
And here's the code for your perusal:
def initial_combo(ceiling=5, target_sum=13, num_cells=4):
"""
Returns a list of possible addends, probably to be modified further.
Starts a new combo list, then, starting from left, fills items to ceiling
or intermediate between 1 and ceiling or just 1. E.g.:
Given ceiling = 5, target_sum = 13, num_cells = 4: creates [5,5,2,1].
"""
num_full_cells = (target_sum - num_cells) // (ceiling - 1)
combo = [ceiling] * num_full_cells \
+ [1] * (num_cells - num_full_cells)
if num_cells > num_full_cells:
combo[num_full_cells] += target_sum - sum(combo)
return combo
def all_combos(ceiling, target_sum, num_cells):
# p0 points at the rightmost item and moves left under some conditions
# p1 starts out at rightmost items and steps left
# p2 starts out immediately to the left of p1 and steps left as p1 does
# So, combo[p2] and combo[p1] always point at a pair of adjacent items.
# d combo[p2] - combo[p1]; immediate difference
# cd combo[p2] - combo[p0]; cumulative difference
# The ceiling decreases by 1 each iteration.
while True:
combo = initial_combo(ceiling, target_sum, num_cells)
yield tuple(combo)
ceiling_changed = False
# Generate all of the remaining combos with this ceiling.
while not ceiling_changed:
p2, p1, p0 = -2, -1, -1
while combo[p2] == combo[p1] and abs(p2) <= num_cells:
# 3,3,3,3
if abs(p2) == num_cells:
return
p2 -= 1
p1 -= 1
p0 -= 1
cd = 0
# slide_ptrs_left loop
while abs(p2) <= num_cells:
d = combo[p2] - combo[p1]
cd += d
# 5,5,3,3 or 5,5,4,3
if cd > 1:
if abs(p2) < num_cells:
# 5,5,3,3 --> 5,4,4,3
if d > 1:
combo[p2] -= 1
combo[p1] += 1
# d == 1; 5,5,4,3 --> 5,4,4,4
else:
combo[p2] -= 1
combo[p0] += 1
yield tuple(combo)
# abs(p2) == num_cells; 5,4,4,3
else:
ceiling -= 1
ceiling_changed = True
# Resume at make_combo_same_ceiling while
# and follow branch.
break
# 4,3,3,3 or 4,4,3,3
elif cd == 1:
if abs(p2) == num_cells:
return
p1 -= 1
p2 -= 1
if __name__ == '__main__':
print list(all_combos(ceiling=6, target_sum=12, num_cells=4))

First of all, I'd use variable names that mean something, so that the code gets comprehensible. Then, after I understood the problem, it's clearly a recursive problem, as once you have chosen one number, the question of finding the possible values for the rest of the squares are exactly the same problem, but with different values in.
So I would do it like this:
from __future__ import division
from math import ceil
def make_combos(max_val,target_sum,n_cells):
combos = []
# The highest possible value of the next cell is whatever is
# largest of the max_val, or the target_sum minus the number
# of remaining cells (as you can't enter 0).
highest = min(max_val, target_sum - n_cells + 1)
# The lowest is the lowest number you can have that will add upp to
# target_sum if you multiply it with n_cells.
lowest = int(ceil(target_sum/n_cells))
for x in range(highest, lowest-1, -1):
if n_cells == 1: # This is the last cell, no more recursion.
combos.append((x,))
break
# Recurse to get the next cell:
# Set the max to x (or we'll get duplicates like
# (6,3,2,1) and (6,2,3,1), which is pointless.
# Reduce the target_sum with x to keep the sum correct.
# Reduce the number of cells with 1.
for combo in make_combos(x, target_sum-x, n_cells-1):
combos.append((x,)+combo)
return combos
if __name__ == '__main__':
import pprint
# And by using pprint the output gets easier to read
pprint.pprint(make_combos( 6,12,4))
I also notice that your solution still seems buggy. For the values max_val=8, target_sum=20 and n_cells=5 your code doesn't find the solution (8,6,4,1,1,), as an example. I'm not sure if that means I've missed a rule in this or not, but as I understand the rules that should be a valid option.
Here's a version using generators, It saves a couple of lines, and memory if the values are really big, but as recursion, generators can be tricky to "get".
from __future__ import division
from math import ceil
def make_combos(max_val,target_sum,n_cells):
highest = min(max_val, target_sum - n_cells + 1)
lowest = int(ceil(target_sum/n_cells))
for x in xrange(highest, lowest-1, -1):
if n_cells == 1:
yield (x,)
break
for combo in make_combos(x, target_sum-x, n_cells-1):
yield (x,)+combo
if __name__ == '__main__':
import pprint
pprint.pprint(list(make_combos( 6,12,4)))

Here's the simplest recursive solution that I can think of to "find all possible combinations of n numbers with values x such that 1 <= x <= max_val and x(1) + ... + x(n) = target". I'm developing it from scratch. Here's a version without any optimization at all, just for simplicity:
def apcnx(n, max_val, target, xsofar=(), sumsofar=0):
if n==0:
if sumsofar==target:
yield xsofar
return
if xsofar:
minx = xsofar[-1] - 1
else:
minx = 0
for x in xrange(minx, max_val):
for xposs in apcnx(n-1, max_val, target, xsofar + (x+1,), sumsofar+x+1):
yield xposs
for xs in apcnx(4, 6, 12):
print xs
The base case n==0 (where we can't yield any more numbers) either yield the tuple so far if it satisfies the condition, or nothing, then finishes (returns).
If we're supposed to yield longer tuples than we've built so far, the if/else makes sure we only yield non-decreasing tuples, to avoid repetition (you did say "combination" rather than "permutation").
The for tries all possibilities for "this" item and loops over whatever the next-lower-down level of recursion is still able to yield.
The output I see is:
(1, 1, 4, 6)
(1, 1, 5, 5)
(1, 2, 3, 6)
(1, 2, 4, 5)
(1, 3, 3, 5)
(1, 3, 4, 4)
(2, 2, 2, 6)
(2, 2, 3, 5)
(2, 2, 4, 4)
(2, 3, 3, 4)
(3, 3, 3, 3)
which seems correct.
There are a bazillion possible optimizations, but, remember:
First make it work, then make it fast
I corresponded with Kent Beck to properly attribute this quote in "Python in a Nutshell", and he tells me he got it from his dad, whose job was actually unrelated to programming;-).
In this case, it seems to me that the key issue is understanding what's going on, and any optimization might interfere, so I'm going all out for "simple and understandable"; we can, if need be!, optimize the socks off it once the OP confirms they can understand what's going on in this sheer, unoptimized version!

Sorry to say, your code is kind of long and not particularly readable. If you can try to summarize it somehow, maybe someone can help you write it more clearly.
As for the problem itself, my first thought would be to use recursion. (For all I know, you're already doing that. Sorry again for my inability to read your code.) Think of a way that you can reduce the problem to a smaller easier version of the same problem, repeatedly, until you have a trivial case with a very simple answer.
To be a bit more concrete, you have these three parameters, max_val, target_sum, and n_cells. Can you set one of those numbers to some particular value, in order to give you an extremely simple problem requiring no thought at all? Once you have that, can you reduce the slightly harder version of the problem to the already solved one?
EDIT: Here is my code. I don't like the way it does de-duplication. I'm sure there's a more Pythonic way. Also, it disallows using the same number twice in one combination. To undo this behavior, just take out the line if n not in numlist:. I'm not sure if this is completely correct, but it seems to work and is (IMHO) more readable. You could easily add memoization and that would probably speed it up quite a bit.
def get_combos(max_val, target, n_cells):
if target <= 0:
return []
if n_cells is 1:
if target > max_val:
return []
else:
return [[target]]
else:
combos = []
for n in range(1, max_val+1, 1):
for numlist in get_combos(max_val, target-n, n_cells-1):
if n not in numlist:
combos.append(numlist + [n])
return combos
def deduplicate(combos):
for numlist in combos:
numlist.sort()
answer = [tuple(numlist) for numlist in combos]
return set(answer)
def kenken(max_val, target, n_cells):
return deduplicate(get_combos(max_val, target, n_cells))

First of all, I am learning Python myself so this solution won't be great but this is just an attempt at solving this. I have tried to solve it recursively and I think a recursive solution would be ideal for this kind of problem although THAT recursive solution might not be this one:
def GetFactors(maxVal, noOfCells, targetSum):
l = []
while(maxVal != 0):
remCells = noOfCells - 1
if(remCells > 2):
retList = GetFactors(maxVal, remCells, targetSum - maxVal)
#Append the returned List to the original List
#But first, add the maxVal to the start of every elem of returned list.
for i in retList:
i.insert(0, maxVal)
l.extend(retList)
else:
remTotal = targetSum - maxVal
for i in range(1, remTotal/2 + 1):
itemToInsert = remTotal - i;
if (i > maxVal or itemToInsert > maxVal):
continue
l.append([maxVal, i, remTotal - i])
maxVal -= 1
return l
if __name__ == "__main__":
l = GetFactors(5, 5, 15)
print l

Here a simple solution in C/C++:
const int max = 6;
int sol[N_CELLS];
void enum_solutions(int target, int n, int min) {
if (target == 0 && n == 0)
report_solution(); /* sol[0]..sol[N_CELLS-1] is a solution */
if (target <= 0 || n == 0) return; /* nothing further to explore */
sol[n - 1] = min; /* remember */
for (int i = min; i <= max; i++)
enum_solutions(target - i, n - 1, i);
}
enum_solutions(12, 4, 1);

Little bit offtopic, but still might help at programming kenken.
I got good results using DLX algorhitm for solving Killer Sudoku (very simmilar as KenKen it has cages, but only sums). It took less than second for most of problems and it was implemented in MATLAB language.
reference this forum
http://www.setbb.com/phpbb/viewtopic.php?t=1274&highlight=&mforum=sudoku
killer sudoku
"look at wikipedia, cant post hyper link" damt spammers

Here is a naive, but succinct, solution using generators:
def descending(v):
"""Decide if a square contains values in descending order"""
return list(reversed(v)) == sorted(v)
def latinSquares(max_val, target_sum, n_cells):
"""Return all descending n_cells-dimensional squares,
no cell larger than max_val, sum equal to target_sum."""
possibilities = itertools.product(range(1,max_val+1),repeat=n_cells)
for square in possibilities:
if descending(square) and sum(square) == target_sum:
yield square
I could have optimized this code by directly enumerating the list of descending grids, but I find itertools.product much clearer for a first-pass solution. Finally, calling the function:
for m in latinSquares(6, 12, 4):
print m

And here is another recursive, generator-based solution, but this time using some simple math to calculate ranges at each step, avoiding needless recursion:
def latinSquares(max_val, target_sum, n_cells):
if n_cells == 1:
assert(max_val >= target_sum >= 1)
return ((target_sum,),)
else:
lower_bound = max(-(-target_sum / n_cells), 1)
upper_bound = min(max_val, target_sum - n_cells + 1)
assert(lower_bound <= upper_bound)
return ((v,) + w for v in xrange(upper_bound, lower_bound - 1, -1)
for w in latinSquares(v, target_sum - v, n_cells - 1))
This code will fail with an AssertionError if you supply parameters that are impossible to satisfy; this is a side-effect of my "correctness criterion" that we never do an unnecessary recursion. If you don't want that side-effect, remove the assertions.
Note the use of -(-x/y) to round up after division. There may be a more pythonic way to write that. Note also I'm using generator expressions instead of yield.
for m in latinSquares(6,12,4):
print m

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

4-Way MergeSort Challenge - Python - python

Related

How do I modify this Fibonacci Sequence Problem?

Project Euler #25 - Can the performance be improved further?

Processing big list using python

k-greatest double selection

KenKen puzzle addends: REDUX A (corrected) non-recursive algorithm

Categories

Resources