Project Euler Problem 18: What mistake did I make in my code?

Project Euler Problem 18: What mistake did I make in my code? - python

https://projecteuler.net/problem=18
Given a triangle of integers, the problem is to find the maximum path sum from top to bottom (where all the numbers in the path must be adjacent).
I had an idea for an algorithm: starting at the very top, calculate the sum of the left and right paths (left all the way down, and right all the way down), if the left sum is greater, jump to the left-adjacent number, if the right sum is greater, jump to the right-adjacent number, repeat the algorithm starting from the current number, and so on until you get to the bottom row.
triangle = ['75', '9564', '174782', '18358710', '2004824765', '190123750334', '88027773076367', '9965042806167092', '414126568340807033', '41487233473237169429', '5371446525439152975114', '701133287773177839681757', '91715238171491435850272948', '6366046889536730731669874031', '046298272309709873933853600423']
maximumPath = [75]
maxSum = 75 #Start it with the starting element of the triangle.
def triNum(row, index): #Returns the number at given row, number in row
return(int(triangle[row][2*index:2*(index+1)])) #Nota bene: returns an integer.
def options(row, index): #Rows start at 0, index starts at 0
return(triNum(row+1, index), triNum(row+1, index+1))
def criticalPathSum(startRow, startIndex, direction):
critPath = []
if direction == 'left':
directionNum = 0
else:
directionNum = 1
sum = triNum(startRow, startIndex) #Starting sum of left and right paths is just the number at the start of both paths.
for i in range(startRow + 1, len(triangle)):
startIndex += directionNum
sum += triNum(i, startIndex)
critPath.append(triNum(i, startIndex))
#print(triNum(i, startIndex + directionNum))
return(sum, critPath)
pathIndex = 0
for row in range(0, len(triangle)-1):
print('These are my options: ' + str(options(row, pathIndex)))
print('Left Sum: ' + str(criticalPathSum(row, pathIndex, 'left')) + ', ' + 'Right Sum: ' + str(criticalPathSum(row, pathIndex, 'right')))
if criticalPathSum(row, pathIndex, 'left') > criticalPathSum(row, pathIndex, 'right'):
maximumPath.append(triNum(row + 1, pathIndex))
print('Left. ' + str(triNum(row + 1, pathIndex)))
else:
print('Right. ' + str(triNum(row + 1, pathIndex + 1)))
pathIndex += 1
maximumPath.append(triNum(row + 1, pathIndex))
maxSum += triNum(row + 1, pathIndex)
print('_______________________________')
print('\n')
print(maximumPath)
print(maxSum)
The answer is 1067, but I get 883. This is the maximum path, according to the algorithm:
[75, 95, 17, 35, 82, 75, 7, 16, 80, 37, 91, 17, 91, 67, 98].

Your algorithm is wrong: in a triangle like
1
1 4
1 4 1
9 1 4 1
it is too tempted by the 9 (always going left) to see the optimal 1+4+4+4 = 13 route.

This is one of the many wrong algorithms that the test data was designed to defeat. You chose what is called a "greedy" algorithm, one that takes the maximum value for any single step, without considering long-term characteristics of the problem.
Rather, you need to work from the bottom of the triangle upward. At each juncture, note which of the two paths gives the maximum sum to the bottom of the triangle, and record that as the optimum result from that node. When you hit the apex, you will have the desired answer as the larger of the two elements underneath.
For instance, given the triangle
1
2 1
2 1 9
1 2 1 9
your algorithm will be greedy and take the path 1-2-2-2; the lower choice of a 1 in row two cuts off that branch from the short-sighted logic.
Rather, you need to build up the totals from the bottom, taking the best of two paths at each node:
20
6 19
4 3 18
1 2 1 9
In case this isn't clear enough ...
the bottom row has no further choices; each value is its own best path to the end. For the row above, let's go right to left, considering each value and its two "children":
2 1 9
1 2 1 9
The 2 has two values below, 1 and 2. Obviously, the 2 is better, so the best path from there is 2+2 = 4.
The 1 likewise has a 2 and 1 below; again, the better path is 1+2, giving 3.
The 9 has children of 1 and 9; we choose 9+9=18. The rows now appear as
1
2 1
4 3 18
1 2 1 9
Now we move up one row, cosindering the two choices for 2 1 there.
The 2 has 4 and 3; the 1 has 3 and 18. Again taking the higher value in each case and adding the node value, we get 2+4 = 6 and 1+18 = 19:
1
6 19
4 3 18
1 2 1 9
Finally, the top node chooses the larger value of 19, giving a total of 20 along the path 1-1-9-9.
Got it?

listOfLists = [
[75],
[95, 64],
[17, 47, 82],
[18, 35, 87, 10],
[20, 4, 82, 47, 65],
[19, 1, 23, 75, 3, 34],
[88, 2, 77, 73, 7, 63, 67],
[99, 65, 4, 28, 6, 16, 70, 92],
[41, 41, 26, 56, 83, 40, 80, 70, 33],
[41, 48, 72, 33, 47, 32, 37, 16, 94, 29],
[53, 71, 44, 65, 25, 43, 91, 52, 97, 51, 14],
[70, 11, 33, 28, 77, 73, 17, 78, 39, 68, 17, 57],
[91, 71, 52, 38, 17, 14, 91, 43, 58, 50, 27, 29, 48],
[63, 66, 4, 68, 89, 53, 67, 30, 73, 16, 69, 87, 40, 31],
[4, 62, 98, 27, 23, 9, 70, 98, 73, 93, 38, 53, 60, 4, 23]]
for i in range(len(listOfLists) - 2, -1, -1):
for j in range(len(listOfLists[i])):
listOfLists[i][j] += max(listOfLists[i+1][j], listOfLists[i+1][j+1])
print(listOfLists[0][0])
The bottom - top approach and it gets the right answer according to PE.
Thanks!

Related

How to find the smallest threshold from one list to fit another list

I have two lists of marks for the same set of students. For example:
A = [22, 2, 88, 3, 93, 84]
B = [66, 0, 6, 33, 99, 45]
If I accept only students above a threshold according to list A then I can look up their marks in list B. For example, if I only accept students with at least a mark of 80 from list A then their marks in list B are [6, 99, 45].
I would like to compute the smallest threshold for A which gives at least 90% of students in the derived set in B getting at least 50. In this example the threshold will have to be 93 which gives the list [99] for B.
Another example:
A = [3, 36, 66, 88, 99, 52, 55, 42, 10, 70]
B = [5, 30, 60, 80, 80, 60, 45, 45, 15, 60]
In this case we have to set the threshold to 66 which then gives 100% of [60, 80, 80, 60] getting at least 50.

This is an O(nlogn + m) approach (due to sorting) where n is the length of A and m is the length of B:
from operator import itemgetter
from itertools import accumulate
def find_threshold(lst_a, lst_b):
# find the order of the indices of lst_a according to the marks
indices, values = zip(*sorted(enumerate(lst_a), key=itemgetter(1)))
# find the cumulative sum of the elements of lst_b above 50 sorted by indices
cumulative = list(accumulate(int(lst_b[j] > 50) for j in indices))
for i, value in enumerate(values):
# find the index where the elements above 50 is greater than 90%
if cumulative[-1] - cumulative[i - 1] > 0.9 * (len(values) - i):
return value
return None
print(find_threshold([22, 2, 88, 3, 93, 84], [66, 0, 6, 33, 99, 45]))
print(find_threshold([3, 36, 66, 88, 99, 52, 55, 42, 10, 70], [5, 30, 60, 80, 80, 60, 45, 45, 15, 60]))
Output
93
66

First, define a function that will tell you if 90% of students in a set scored more than 50:
def setb_90pc_pass(b):
return sum(score >= 50 for score in b) >= len(b) * 0.9
Next, loop over scores in a in ascending order, setting each of them as the threshold. Filter out your lists according that threshold, and check if they fulfill your condition:
for threshold in sorted(A):
filtered_a, filtered_b = [], []
for ai, bi in zip(A, B):
if ai >= threshold:
filtered_a.append(ai)
filtered_b.append(bi)
if setb_90pc_pass(filtered_b):
break
print(threshold)

Faster/lazier way to evenly and randomly split m*n into n group (each has m elements) in python

I want to split m*n elements (e.g., 1, 2, ..., m*n) into n group randomly and evenly such that each group has m random elements. Each group will process k (k>=1) elements at one time from its own group and at the same speed (via some synchronization mechanism), until all group has processed all their own elements. Actually each group is in an independent process/thread.
I use numpy.random.choice(m*n, m*n, replace=False) to generate the permutation first, and then index the permuted result from each group.
The problem is that when m*n is very large (e.g., >=1e8), the speed is very slow (tens of seconds or minutes).
Is there any faster/lazier way to do this? I think maybe this can be done in a lazier way, which is not generating the permuted result in the first time, but generate a generator first, and in each group, generate k elements at each time, and its effect should be identical to the method I currently use. But I don't know how to achieve this lazy way. And I am not sure whether this can be implemented actually.

You can make a generator that will progressively shuffle (a copy of) the list and lazily yield distinct groups:
import random
def rndGroups(A,size):
A = A.copy() # work on a copy (if needed)
p = len(A) # target position of random item
for _ in range(0,len(A),size): # work in chunks of group size
for _ in range(size): # Create one group
i = random.randrange(p) # random index in remaining items
p -= 1 # update randomized position
A[i],A[p] = A[p],A[i] # swap items
yield A[p:p+size] # return shuffled sub-range
Output:
A = list(range(100))
iG = iter(rndGroups(A,10)) # 10 groups of 10 items
s = set() # set to validate uniqueness
for _ in range(10): # 10 groups
g = next(iG) # get the next group from generator
s.update(g) # to check that all items are distinct
print(g)
print(len(s)) # must get 100 distinct values from groups
[87, 19, 85, 90, 35, 55, 86, 58, 96, 68]
[38, 92, 93, 78, 39, 62, 43, 20, 66, 44]
[34, 75, 72, 50, 42, 52, 60, 81, 80, 41]
[13, 14, 83, 28, 53, 5, 94, 67, 79, 95]
[9, 33, 0, 76, 4, 23, 2, 3, 32, 65]
[61, 24, 31, 77, 36, 40, 47, 49, 7, 97]
[63, 15, 29, 25, 11, 82, 71, 89, 91, 30]
[12, 22, 99, 37, 73, 69, 45, 1, 88, 51]
[74, 70, 98, 26, 59, 6, 64, 46, 27, 21]
[48, 17, 18, 8, 54, 10, 57, 84, 16, 56]
100
This will take just as long as pre-shuffling the list (if not longer) but it will let you start/feed threads as you go, thus augmenting the parallelism

Sum of multiple of 3+5 < 100

New to Jupyter Notebook, computing this code to return sum of values that are a multiple of 3 and 5, AND less than 100 in my list range 1, 100. I've got a feeling that I'm truncating the code by removing 3 and 5 from the equation. Not sure how/where to include that.
print(list(range(1, 100)))
multiple35 = 0
for i in range (1, 100):
if i % 15 == 0 and multiple35 <= 100:
multiple35 += i
print(multiple35)
My print line returns the range, Plus the 3 correct multiples less than 100. BUT ALSO prints 150, which is greater than and should be excluded from the result.
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]
15
45
90
150
Appreciate your help here.

BUT ALSO prints 150, which is greater than and should be excluded from the result.
The reason is simple. You are testing multiple35 <= 100 before the addition (multiple35 += i). So the sum is printed first and then tested in the next round. Therefore the output ends after the first occurrence that is bigger than 100.
By the way, it is useless to go through all natural numbers and only do anything on each 15th element (because of i % 15 == 0). You can use a tailored range instead:
>>> list(range(15,100,15))
[15, 30, 45, 60, 75, 90]
So a simplified loop which would stop printing when reaching 100, could look like:
multiple35 = 0
for i in range (15, 100, 15):
multiple35 += i
if multiple35 > 100:
break # no reason to continue the loop, the sum will never go back below 100
print(multiple35)

You have to check if the final target will exceed the threshold, because in your loop, when i=75, it satisfies the condition 75%15==0 and also satisfies 75<=100 and since it satisfies both, we let it into the next block which adds 75 to it and gives 150 which exceeds the threshold. The solution is to not even allow a number inside the adding part if when added, crosses the threshold,
There are simpler solutions like above by #Melebius but, I wanted to explain this in the way OP has written
multiple35 = 0
for i in range (1, 100):
if i % 15 == 0 and multiple35+i<=100:
multiple35 += i
print(multiple35)

There are multiple flaws in your logic. I am not going to address them all, but instead suggest an alternate way to solve your issue.
Simply notice that multiples of 3 and 5 are exactly the multiples of 15. So there is no need to range over all numbers from 0 to 100.
for x in range(15, 100, 15):
print(x)
# 15
# 30
# 45
# 60
# 75
# 90
You also mention that you want to sum all numbers. In python, you can sum over any iterator with sum, including a range.
print(sum(range(15, 100, 15)))
# 315

You can use list comprehension also here
values = [i for i in range(1,100) if i%5==0 if i%3==0]
print("Numbers divisible by 3 and 5:",values)
sum_of_numbers = 0
for i,items in enumerate(values):
sum_of_numbers = sum_of_numbers+items
if sum_of_numbers>100:
break
print(values[:i])

How to find all possible daughters in a sequence of numbers stored in dataframe

I have a python dataframe which one of its column such as column1 contains series of numbers. I have to mention that each these numbers are the result of cell mutation so cell with number n deviates to two cells with following numbers: 2*n and 2*n+1. I want to search in this column to find all rows corresponds to daughters of specific number k. I mean the rows which contains all possible {2*k, 2*k+1, 2*(2*k), 2*(2*k+1), ... } in their column1. I don't want to use tree structure, how can I approach the solution ? thanks

The two sequences look like the numbers who's binary expansion starts with 10 and the numbers for which the binary expansion starts with 11.
Both sequences can be found directly:
import math
def f(n=2):
while True:
yield int(n + 2**math.floor(math.log(n,2)))
n += 1
def g(n=2):
while True:
yield int(n + 2 * 2**math.floor(math.log(n,2)))
n += 1
a, b = f(), g()
print [a.next() for i in range(15)]
print [b.next() for i in range(15)]
>>> [4, 5, 8, 9, 10, 11, 16, 17, 18, 19, 20, 21, 22, 23, 32]
>>> [6, 7, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31, 48]
EDIT:
For an arbitrary starting point, you can do the following, which I think meets your criteria.
import Queue
def f(k):
q = Queue.Queue()
q.put(k)
while not q.empty():
p = q.get()
a, b = 2*p, 2*p+1
q.put(a)
q.put(b)
yield a
yield b
a = f(4)
print [a.next() for i in range(16)]
>>> [8, 9, 16, 17, 18, 19, 32, 33, 34, 35, 36, 37, 38, 39, 64, 65] # ...
a = f(5)
print [a.next() for i in range(16)]
>>> [10, 11, 20, 21, 22, 23, 40, 41, 42, 43, 44, 45, 46, 47, 80, 81] # ...
Checking those sequences against OEIS:
f(2) - Starting 10 - A004754
f(3) - Starting 11 - A004755
f(4) - Starting 100 - A004756
f(5) - Starting 101 - A004756
f(6) - Starting 110 - A004758
f(7) - Starting 111 - A004759
...
Which means you can simply do:
import math
def f(k, n=2):
while True:
yield int(n + (k-1) * 2**math.floor(math.log(n, 2)))
n+=1
for i in range(2,8):
a = f(i)
print i, [a.next() for j in range(16)]
>>> 2 [4, 5, 8, 9, 10, 11, 16, 17, 18, 19, 20, 21, 22, 23, 32]
>>> 3 [6, 7, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31, 48]
>>> 4 [8, 9, 16, 17, 18, 19, 32, 33, 34, 35, 36, 37, 38, 39, 64]
>>> 5 [10, 11, 20, 21, 22, 23, 40, 41, 42, 43, 44, 45, 46, 47, 80]
>>> 6 [12, 13, 24, 25, 26, 27, 48, 49, 50, 51, 52, 53, 54, 55, 96]
>>> 7 [14, 15, 28, 29, 30, 31, 56, 57, 58, 59, 60, 61, 62, 63, 112]
# ... where the first number is shown for clarity.

Ugly but seems to work alright. What I think you might have needed to know is the newer yield from construction. Used twice in this code. Never thought I would.
from fractions import Fraction
from itertools import count
def daughters(k):
print ('daughters of cell', k)
if k<=0:
return
if k==1:
yield from count(1)
def locateK():
cells = 1
newCells = 2
generation = 1
while True:
generation += 1
previousCells = cells
cells += newCells
newCells *= 2
if k > previousCells and k <= cells :
break
return ( generation, k - previousCells )
parentGeneration, parentCell = locateK()
cells = 1
newCells = 2
generation = 1
while True:
generation += 1
previousCells = cells
if generation > parentGeneration:
if parentCell%2:
firstChildCell=previousCells+int(Fraction(parentCell-1, 2**parentGeneration)*newCells)+1
else:
firstChildCell=previousCells+int(Fraction(parentCell, 2**parentGeneration)*newCells)+1
yield from range(firstChildCell, firstChildCell+int(newCells*Fraction(1,2)))
cells += newCells
newCells *= 2
for n, d in enumerate(daughters(2)):
print (d)
if n > 15:
break
Couple of representative results:
daughters of cell 2
4
5
8
9
10
11
16
17
18
19
20
21
22
23
32
33
34
daughters of cell 3
6
7
12
13
14
15
24
25
26
27
28
29
30
31
48
49
50

using the in operator and changing lists

Im trying to produce a table with one row with numbers increasing by one and another with the respective composites with the limit being 100 like:
Numbers----------composites
x---------------numbers 1-100 divisible by x
x+1---------------numbers 1-100 divisible by x+1 but aren't in x
x+2---------------numbers 1-100 divisible by x+2 but aren't in x or x+1 x+3---------------numbers 1-100 divisible by x+3 but aren't in x,x+1,or x+2 etc
Numbers is a permanent list that starts off as 2-100 I whittle down as I pull out every composite number within the function, at the end it should only contain prime numbers.
composites is a list I fill with composites of a certain number (2,3,4 etc) that I then wish to check with the current numbers list to make sure there are no duplicates. I print whats left, empty the list and increase the current variable by 1 and repeat.
This is the coding ive come up with, I understand its very sloppy but I literally know nothing about the subject and my professor likes us to learn trial by fire and this is what ive managed to scrounge up from the textbook. The main issue of my concern is the adding and removing of elements from certain lists
def main():
x=2
n=2
print("numbers"" ""composite")
print("------------------------")
cross_out(n,x)
def cross_out(n,x):
composites=[]
prime=[]
numbers=[]
while x<101:
numbers.append(x)
x=x+1
x=2
for x in range(2,102):
if x==101:
search=composites[0]
index=0
while index<=len(composites):
if search in numbers:
search=search+1
index=index+1
else:
if search in composites:
composites.remove(search)
else:
pass
print(n,"--->",composites)
x=2
composites=[]
n=n+1
index=0
elif x%n==0:
composites.append(x)
if x in numbers:
numbers.remove(x)
else:
pass
x=x+1
else:
x=x+1
main()
cross_out()

I think I'm understanding your description correctly, and this is what I came up with.
I used a set to keep track of the number you have added to the composites already. This makes the problem pretty simple. Also, advice when writing functions is to not overwrite your parameters. For example, in cross_out, you are doing x = <value> and n = <value> several times.
def cross_out(n,x):
composites=[]
seen = set()
numbers = range(x, 101)
primes = []
for num in numbers:
for val in numbers:
if val % num == 0 and val not in seen:
composites.append(val)
seen.add(val)
if composites:
primes.append(composites[0])
print(num,'--->',composites)
composites = []
print(primes)
def main():
print("numbers composite")
print("------------------------")
cross_out(2, 2)
main()
Sample Output
numbers composite
------------------------
2 ---> [2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100]
3 ---> [3, 9, 15, 21, 27, 33, 39, 45, 51, 57, 63, 69, 75, 81, 87, 93, 99]
4 ---> []
5 ---> [5, 25, 35, 55, 65, 85, 95]
6 ---> []
7 ---> [7, 49, 77, 91]
8 ---> []
9 ---> []
10 ---> []
11 ---> [11]
12 ---> []
13 ---> [13]
14 ---> []
15 ---> []
16 ---> []
17 ---> [17]
18 ---> []
19 ---> [19]
20 ---> []
21 ---> []
22 ---> []
23 ---> [23]
24 ---> []
25 ---> []
Primes
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.