A specific constraint in OR-Tools is very slow (Python)

I have an assignment problem to solve, and OR-Tools looked like a great fit. I managed to solve it, but the solve is very slow and I need it to be fast.
The problem is basically a bunch of stores selling the same items at different prices. I need to choose which store to buy each item from so that the total price is as low as possible while using no more than 4 stores.
This is the code I have, but it is slow when the supplied cost matrix has more than 4 stores. The problem lies in the constraint that limits the number of stores. Is there any way to code it differently to improve speed?
import numpy as np
from ortools.sat.python import cp_model
from ortools.linear_solver import pywraplp
#cost matrix, where j are stores, i are items
C = np.array([[38, 13, 73, 10, 76, 6, 80, 65, 17, 2],
[77, 72, 7, 26, 51, 21, 19, 85, 12, 29],
[30, 15, 51, 69, 88, 88, 95, 97, 87, 14],
[10, 8, 64, 62, 23, 58, 2, 1, 61, 82],
[ 9, 89, 14, 48, 73, 31, 72, 4, 71, 22],
[50, 58, 4, 69, 25, 44, 77, 27, 53, 81],
[42, 83, 16, 65, 69, 26, 99, 88, 8, 27],
[26, 23, 10, 68, 24, 28, 38, 58, 84, 39],
[ 9, 33, 35, 11, 24, 16, 88, 26, 72, 93],
[75, 63, 47, 33, 89, 24, 56, 66, 78, 4],
[ 1, 78, 7, 53, 86, 71, 3, 77, 92, 22],
[76, 8, 78, 73, 76, 77, 44, 21, 31, 37],
[ 8, 46, 69, 58, 83, 97, 14, 11, 24, 82],
[ 8, 25, 75, 93, 21, 33, 13, 66, 95, 61],
[25, 83, 98, 3, 93, 99, 11, 55, 97, 83],
[87, 71, 67, 72, 49, 55, 16, 6, 18, 43],
[21, 49, 23, 14, 98, 54, 85, 11, 97, 56],
[62, 57, 90, 22, 97, 84, 26, 15, 14, 85],
[44, 7, 78, 57, 60, 16, 25, 10, 67, 72],
[54, 70, 37, 22, 41, 78, 92, 50, 48, 78]])
# the solver func
def Solve_Cost_Matrix_2(cost):
    model = cp_model.CpModel()
    max_stops = 4
    # generate ranges
    num_items = len(cost)
    num_shops = len(cost[0])
    all_items = range(num_items)
    all_shops = range(num_shops)
    # Create bool Variable matrix
    x = []
    for i in all_items:
        t = []
        for j in all_shops:
            t.append(model.NewBoolVar(f'i{i}_j{j}'))
        x.append(t)
    # Constraints
    # Each item is assigned to exactly one store.
    for i in all_items:
        model.Add(sum(x[i][j] for j in all_shops) == 1)
    # Intermediate variables s[j] (store j is used) to constrain the number of stores.
    s = []
    for j in all_shops:
        s.append(model.NewBoolVar(f's_{j}'))
    for j in all_shops:
        model.Add(sum(x[i][j] for i in all_items) >= 1).OnlyEnforceIf(s[j])
        model.Add(sum(x[i][j] for i in all_items) == 0).OnlyEnforceIf(s[j].Not())
    model.Add(sum(s[j] for j in all_shops) <= max_stops)
    # Create the Objective function Variable
    total_cost = model.NewIntVar(0, 1000000, 'total_cost')
    # Create the Objective function, Minimize (Sum of cost)
    model.Add(total_cost == (sum(x[i][j] * cost[i][j] for j in all_shops for i in all_items)))
    model.Minimize(total_cost)
    # Initialize the Solver ...
    solver = cp_model.CpSolver()
    status = solver.Solve(model)
    print(solver.ResponseStats())
    Total_Cost, senario_cost = 0, 0
    # printing the solution
    if status == cp_model.OPTIMAL:
        senario_cost = {'Items': [], 'Assigned_to': [], 'Item_cost': [], 'Num_stops': 0, 'cost': []}
        Total_Cost = solver.ObjectiveValue()
        for i in range(num_items):
            for j in range(num_shops):
                if solver.Value(x[i][j]) == 1:
                    senario_cost['Items'].append(i)
                    senario_cost['Assigned_to'].append(j)
                    senario_cost['Item_cost'].append(cost[i][j])
        senario_cost['Num_stops'] = len(set(senario_cost['Assigned_to']))
        senario_cost['cost'] = cost
        return Total_Cost, senario_cost
    else:
        return None, None
I get this when I run it:
CpSolverResponse:
status: OPTIMAL
objective: 213
best_bound: 213
booleans: 210
conflicts: 106343
branches: 158796
propagations: 4242079
integer_propagations: 7844526
restarts: 878
lp_iterations: 0
walltime: 6.90529
usertime: 6.90529
deterministic_time: 4.67974
primal_integral: 0
CPU times: user 6.86 s, sys: 41 ms, total: 6.9 s
Wall time: 6.95 s

When I run the supplied code on the master branch, without parallelism, I get:
CpSolverResponse:
status: OPTIMAL
objective: 213
best_bound: 213
booleans: 210
conflicts: 31
branches: 617
propagations: 5226
integer_propagations: 8220
restarts: 428
lp_iterations: 130
walltime: 0.021303
usertime: 0.021303
deterministic_time: 0.0011162
primal_integral: 0.00536794
Do you get a different result?
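One way to cross-check the objective reported above independently of the solver version is plain enumeration. The sketch below is not CP-SAT and not the asker's model: it assumes the number of stores is small enough to enumerate every subset of max_stops stores, and the name best_assignment_cost is hypothetical. For each subset, every item is simply bought at its cheapest store within that subset.

```python
from itertools import combinations

def best_assignment_cost(cost, max_stops):
    """Brute-force cross-check: cheapest total using at most max_stops stores.

    cost[i][j] is the price of item i at store j (same layout as C above).
    Hypothetical helper for verification only; it enumerates store subsets.
    """
    num_shops = len(cost[0])
    best_total, best_shops = None, None
    k = min(max_stops, num_shops)
    # Allowing exactly k stores is never worse than allowing fewer,
    # so enumerating subsets of size k is enough.
    for shops in combinations(range(num_shops), k):
        total = sum(min(row[j] for j in shops) for row in cost)
        if best_total is None or total < best_total:
            best_total, best_shops = total, shops
    return best_total, best_shops

# Tiny example: two items, two stores.
small = [[1, 5],
         [5, 1]]
print(best_assignment_cost(small, 1))  # (6, (0,))
print(best_assignment_cost(small, 2))  # (2, (0, 1))
```

With the 20x10 matrix C from the question there are only 210 subsets of 4 stores, so best_assignment_cost(C.tolist(), 4) should finish almost instantly and can be compared with the objective in the solver log.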

Related

Why is parallelized code with concurrent.futures slower than regular code?

I tried parallelizing with concurrent.futures, expecting the parallelized code to be faster.
I wrote a baseline script to test the parallelization. What the code does is not important; I'm mainly interested in the speed of the baseline versus the parallelized version. All it does is compute the correlation between the lists in sigs and data_mat and store the values in corr_coefs. The plain code is below:
from time import time
import numpy as np
sigs = [
[91, 43, 44, 49, 64, 37, 61, 31, 73],
[59, 94, 91, 12, 47, 44, 93, 7, 84],
[47, 76, 24, 87, 2, 83, 77, 60, 36],
[83, 68, 3, 49, 14, 12, 51, 36, 22]
]
data_mat = [
[83, 68, 3, 49, 14, 12, 51, 36, 22],
[8, 78, 44, 40, 39, 67, 63, 64, 34],
[49, 24, 77, 91, 66, 44, 83, 30, 99],
[97, 40, 69, 7, 24, 70, 63, 52, 81],
[26, 62, 53, 36, 72, 54, 85, 94, 31],
[99, 52, 87, 52, 50, 9, 22, 72, 62],
[91, 15, 54, 84, 89, 15, 43, 31, 9],
[39, 26, 36, 81, 65, 50, 67, 12, 19],
[67, 22, 86, 24, 38, 30, 45, 94, 44],
# etc.
]
execution_time_start = time()
corr_coefs = []
for sig in sigs:
    for data_mat_row in data_mat:
        corr = np.corrcoef(np.square(sig), np.square(data_mat_row))
        corr_coefs.append(corr[0, 1])
execution_time_end = time()
elapsed_time = execution_time_end - execution_time_start
print(f'Execution time (without parallelization): = {elapsed_time:.20f} s')
I tried to parallelize this code using concurrent.futures. The data_mat and sigs lists are the same (I only rewrote the loop):
from time import time
import numpy as np
import concurrent.futures
sigs = [
[91, 43, 44, 49, 64, 37, 61, 31, 73],
[59, 94, 91, 12, 47, 44, 93, 7, 84],
[47, 76, 24, 87, 2, 83, 77, 60, 36],
[83, 68, 3, 49, 14, 12, 51, 36, 22]
]
data_mat = [
[83, 68, 3, 49, 14, 12, 51, 36, 22],
[8, 78, 44, 40, 39, 67, 63, 64, 34],
[49, 24, 77, 91, 66, 44, 83, 30, 99],
[97, 40, 69, 7, 24, 70, 63, 52, 81],
[26, 62, 53, 36, 72, 54, 85, 94, 31],
[99, 52, 87, 52, 50, 9, 22, 72, 62],
[91, 15, 54, 84, 89, 15, 43, 31, 9],
[39, 26, 36, 81, 65, 50, 67, 12, 19],
[67, 22, 86, 24, 38, 30, 45, 94, 44],
# etc.
]
execution_time_start = time()
corr_coefs = []
with concurrent.futures.ThreadPoolExecutor() as executor:
    future_corr_coefs = {executor.submit(np.corrcoef, np.square(sig), np.square(data_mat_row)): (sig, data_mat_row)
                         for sig in sigs for data_mat_row in data_mat}
    for future in concurrent.futures.as_completed(future_corr_coefs):
        sig, data_mat_row = future_corr_coefs[future]
        corr = future.result()
        corr_coefs.append(corr[0, 1])
execution_time_end = time()
elapsed_time = execution_time_end - execution_time_start
print(f'Execution time (with parallelization): = {elapsed_time:.20f} s')
I expected the rewritten code to be faster, but I got these outputs:
Execution time (without parallelization): = 1.30910301208496093750 s
Execution time (with parallelization): = 2.38465380668640136719 s
I also tried a larger data set by extending the list data_mat, but the parallel code is still slower. Does anyone have any advice? I suspected it might be overhead, but I am not able to explain how...
I found the answer. The parallel code can be faster, but sigs and data_mat need to be much larger for it to pay off. If the input data set is small, using concurrent.futures is pointless because the overhead of parallelizing the code increases the computation time. If the data set is large and the work inside the loops is more complex, the parallelized version is faster.
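A different angle from threads, and not the fix the asker settled on: when each task is as small as one np.corrcoef call on nine numbers, it can help to hand NumPy all the rows at once instead of scheduling one task per pair. The sketch below assumes every row has the same length; cross_correlations is a hypothetical name, and the data is a shortened copy of the lists in the question.

```python
import numpy as np

def cross_correlations(sigs, data_mat):
    """Correlate every squared sig with every squared data_mat row in one call."""
    a = np.square(np.asarray(sigs, dtype=float))
    b = np.square(np.asarray(data_mat, dtype=float))
    full = np.corrcoef(a, b)        # rows of a and b are treated as variables
    return full[:len(a), len(a):]   # top-right block: sig rows vs data rows

sigs = [[91, 43, 44, 49, 64, 37, 61, 31, 73],
        [59, 94, 91, 12, 47, 44, 93, 7, 84]]
data_mat = [[83, 68, 3, 49, 14, 12, 51, 36, 22],
            [8, 78, 44, 40, 39, 67, 63, 64, 34],
            [49, 24, 77, 91, 66, 44, 83, 30, 99]]

block = cross_correlations(sigs, data_mat)

# Same values as the nested loop, in the same order.
looped = [np.corrcoef(np.square(s), np.square(r))[0, 1]
          for s in sigs for r in data_mat]
print(np.allclose(block.ravel(), looped))  # True
```

block[a, b] is the coefficient for sigs[a] and data_mat[b], so block.ravel() matches the order in which the original loop fills corr_coefs.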

Why does my loop not exit when I return after an if condition?

In this code I need to exit the loop on a certain condition: if position + 1 == len(triangle).
Maybe I am not good at Python and don't clearly understand its behaviour.
It ignores my return and keeps calling the same function instead of leaving the loop.
The only other thing I tried was calling break in the loop itself when the same condition is met, but that does not work either.
def max_value(list, index):
    for _ in range(len(list)):
        dictionary = dict()
        maximum = max(list[index], list[index + 1])
        dictionary['max'] = maximum
        if maximum == list[index]:
            dictionary['next_index'] = index
        else:
            dictionary['next_index'] = index + 1
        return dictionary
total = 0
index = 0
skip = False
position = 0
def sliding_triangle(triangle):
    global total
    global index
    global skip
    global position
    if not skip:
        skip = True
        total += triangle[0][0]
        total += max_value(triangle[1], index).get("max")
        index = max_value(triangle[1], index).get("next_index")
        position = 2
        sliding_triangle(triangle)
    if position + 1 == len(triangle): return  # <----- HERE I AM EXPECTING IT TO EXIT
    for row in range(position, len(triangle)):
        values = max_value(triangle[row], index)
        total += values.get("max")
        index = values.get("next_index")
        print(position, int(len(triangle)), index, values.get("max"), total)
        position += 1
        sliding_triangle(triangle)
    return total
print(sliding_triangle([
[75],
[95, 64],
[17, 47, 82],
[18, 35, 87, 10],
[20, 4, 82, 47, 65],
[19, 1, 23, 75, 3, 34],
[88, 2, 77, 73, 7, 63, 67],
[99, 65, 4, 28, 6, 16, 70, 92],
[41, 41, 26, 56, 83, 40, 80, 70, 33],
[41, 48, 72, 33, 47, 32, 37, 16, 94, 29],
[53, 71, 44, 65, 25, 43, 91, 52, 97, 51, 14],
[70, 11, 33, 28, 77, 73, 17, 78, 39, 68, 17, 57],
[91, 71, 52, 38, 17, 14, 91, 43, 58, 50, 27, 29, 48],
[63, 66, 4, 68, 89, 53, 67, 30, 73, 16, 69, 87, 40, 31],
[ 4, 62, 98, 27, 23, 9, 70, 98, 73, 93, 38, 53, 60, 4, 23],
]))
Hehey, got it working finally. The solution was to break from the loop earlier.
I had to put the condition at the beginning of the loop; otherwise it kept repeating the same process because the condition was checked in the wrong place.
total = 0
index = 0
skip = False
position = 0
def max_value(list, index):
    for _ in range(len(list)):
        dictionary = dict()
        maximum = max(list[index], list[index + 1])
        dictionary['max'] = maximum
        if maximum == list[index]:
            dictionary['next_index'] = index
        else:
            dictionary['next_index'] = index + 1
        return dictionary
def sliding_triangle(triangle):
    global total
    global index
    global skip
    global position
    if not skip:
        skip = True
        total += triangle[0][0]
        total += max_value(triangle[1], index).get("max")
        index = max_value(triangle[1], index).get("next_index")
        position = 2
        sliding_triangle(triangle)
    for row in range(position, len(triangle)):
        if position == int(len(triangle)): break  # <<<--- I HAD TO CALL BREAK EARLIER, OTHERWISE THE FOR LOOP KEPT WORKING INSTEAD OF STOPPING
        values = max_value(triangle[row], index)
        total += values.get("max")
        index = values.get("next_index")
        position += 1
        sliding_triangle(triangle)
    return total
print(sliding_triangle([
[75],
[95, 64],
[17, 47, 82],
[18, 35, 87, 10],
[20, 4, 82, 47, 65],
[19, 1, 23, 75, 3, 34],
[88, 2, 77, 73, 7, 63, 67],
[99, 65, 4, 28, 6, 16, 70, 92],
[41, 41, 26, 56, 83, 40, 80, 70, 33],
[41, 48, 72, 33, 47, 32, 37, 16, 94, 29],
[53, 71, 44, 65, 25, 43, 91, 52, 97, 51, 14],
[70, 11, 33, 28, 77, 73, 17, 78, 39, 68, 17, 57],
[91, 71, 52, 38, 17, 14, 91, 43, 58, 50, 27, 29, 48],
[63, 66, 4, 68, 89, 53, 67, 30, 73, 16, 69, 87, 40, 31],
[ 4, 62, 98, 27, 23, 9, 70, 98, 73, 93, 38, 53, 60, 4, 23],
]))
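One thing worth checking with this approach: as far as I can tell, the code above follows the larger of the two adjacent values row by row, and that kind of greedy walk is not guaranteed to find the maximum total. A minimal sketch on a made-up three-row triangle (greedy_sum, best_sum and tricky are hypothetical names, not from the posts above):

```python
def greedy_sum(triangle):
    """Follow the larger adjacent value row by row (ties keep the same index)."""
    index, total = 0, triangle[0][0]
    for row in triangle[1:]:
        if row[index + 1] > row[index]:
            index += 1
        total += row[index]
    return total

def best_sum(triangle):
    """Exhaustive maximum over all top-to-bottom paths (fine for tiny inputs)."""
    def walk(r, i):
        if r == len(triangle):
            return 0
        return triangle[r][i] + max(walk(r + 1, i), walk(r + 1, i + 1))
    return walk(0, 0)

tricky = [[1],
          [2, 3],
          [100, 1, 1]]
print(greedy_sum(tricky))  # 5   (1 + 3 + 1)
print(best_sum(tricky))    # 103 (1 + 2 + 100)
```

Choosing 3 over 2 in the second row looks better locally but cuts off the 100 below, which is why the approaches that look at whole subtrees can give a different total.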
Recursive brute force solution
def sliding_triangle(triangle, row = 0, index = 0):
    if row >= len(triangle) or index >= len(triangle[row]):
        return 0  # row or index out of bounds
    # Add parent value to max of child triangles
    return triangle[row][index] + max(sliding_triangle(triangle, row+1, index), sliding_triangle(triangle, row+1, index+1))
Tests
print(sliding_triangle([[3], [7, 4], [2, 4, 6], [8, 5, 9, 3]]))
# Output: 23
print(sliding_triangle([
[75],
[95, 64],
[17, 47, 82],
[18, 35, 87, 10],
[20, 4, 82, 47, 65],
[19, 1, 23, 75, 3, 34],
[88, 2, 77, 73, 7, 63, 67],
[99, 65, 4, 28, 6, 16, 70, 92],
[41, 41, 26, 56, 83, 40, 80, 70, 33],
[41, 48, 72, 33, 47, 32, 37, 16, 94, 29],
[53, 71, 44, 65, 25, 43, 91, 52, 97, 51, 14],
[70, 11, 33, 28, 77, 73, 17, 78, 39, 68, 17, 57],
[91, 71, 52, 38, 17, 14, 91, 43, 58, 50, 27, 29, 48],
[63, 66, 4, 68, 89, 53, 67, 30, 73, 16, 69, 87, 40, 31],
[ 4, 62, 98, 27, 23, 9, 70, 98, 73, 93, 38, 53, 60, 4, 23],
]))
# Output: 1074
However, the brute-force approach times out on large datasets.
Optimized Solution
Apply memoization to brute force solution.
Uses cache to avoid repeatedly solving for subpaths of a parent triangle node
Code
def sliding_triangle(triangle):
    ' Wrapper setup function '
    def sliding_triangle_(row, index):
        ' Memoized function which does the calcs'
        if row >= len(triangle) or index >= len(triangle[row]):
            return 0
        if (row, index) not in cache:
            # Update cache
            cache[(row, index)] = (triangle[row][index] +
                                   max(sliding_triangle_(row+1, index),
                                       sliding_triangle_(row+1, index+1)))
        return cache[(row, index)]
    cache = {}  # init cache
    return sliding_triangle_(0, 0)  # calculate starting from top most node
Tests
Same results as brute force solution for simple test cases
Works on large dataset i.e. https://projecteuler.net/project/resources/p067_triangle.txt
Find and Show Optimal Path
Modify Brute Force to Return Path
Show highlighted path in triangle
Code
####### Main function
def sliding_triangle_path(triangle, row = 0, index = 0, path = None):
    '''
    Finds highest scoring path (using brute force)
    '''
    if path is None:
        path = [(0, 0)]  # Init path with top most triangle node
    if row >= len(triangle) or index >= len(triangle[row]):
        path.pop()  # drop last item since it is out of bounds
        return path
    # Select best path of child nodes
    path_ = max(sliding_triangle_path(triangle, row+1, index, path + [(row+1, index)]),
                sliding_triangle_path(triangle, row+1, index+1, path + [(row+1, index+1)]),
                key = lambda p: score(triangle, p))
    return path_
####### Utils
def getter(x, args):
    '''
    Gets element of multidimensional array using tuple as index
    Source (Modified): https://stackoverflow.com/questions/40258083/recursive-itemgetter-for-python
    '''
    try:
        for k in args:
            x = x[k]
        return x
    except IndexError:
        return 0
def score(tri, path):
    ' Score for a path through triangle tri '
    return sum(getter(tri, t) for t in path)
def colored(r, g, b, text):
    '''
    Use rgb code to color text
    Source: https://www.codegrepper.com/code-examples/python/how+to+print+highlighted+text+in+python
    '''
    return "\033[38;2;{};{};{}m{} \033[38;2;255;255;255m".format(r, g, b, text)
def highlight_path(triangle, path):
    ' Creates a string that highlights the path in red through the triangle'
    result = ""  # output string
    for p in path:  # Loop over path tuples
        row, index = p
        values = triangle[row]  # corresponding values in row 'row' of triangle
        # Color in red path value at index, other values are in black (color using rgb)
        row_str = ' '.join([colored(255, 0, 0, str(v)) if i == index else colored(0, 0, 0, str(v)) for i, v in enumerate(values)])
        result += row_str + '\n'
    return result
Test
# Test
triangle = ([
[75],
[95, 64],
[17, 47, 82],
[18, 35, 87, 10],
[20, 4, 82, 47, 65],
[19, 1, 23, 75, 3, 34],
[88, 2, 77, 73, 7, 63, 67],
[99, 65, 4, 28, 6, 16, 70, 92],
[41, 41, 26, 56, 83, 40, 80, 70, 33],
[41, 48, 72, 33, 47, 32, 37, 16, 94, 29],
[53, 71, 44, 65, 25, 43, 91, 52, 97, 51, 14],
[70, 11, 33, 28, 77, 73, 17, 78, 39, 68, 17, 57],
[91, 71, 52, 38, 17, 14, 91, 43, 58, 50, 27, 29, 48],
[63, 66, 4, 68, 89, 53, 67, 30, 73, 16, 69, 87, 40, 31],
[ 4, 62, 98, 27, 23, 9, 70, 98, 73, 93, 38, 53, 60, 4, 23],
])
path = sliding_triangle_path(triangle)
print(f'Score: {score(triangle, path)}')
print(f"Path\n {'->'.join(map(str,path))}")
print(f'Highlighted path\n {highlight_path(triangle, path)}')
Output
Score: 1074
Path
(0, 0)->(1, 1)->(2, 2)->(3, 2)->(4, 2)->(5, 3)->(6, 3)->(7, 3)->(8, 4)->(9, 5)->(10, 6)->(11, 7)->(12, 8)->(13, 8)->(14, 9)
Got my own correct answer for the kata, which can handle big triangles and passed all tests
def longest_slide_down(triangle):
    temp_arr = []
    first = triangle[-2]
    second = triangle[-1]
    if len(triangle) > 2:
        for i in range(len(first)):
            for _ in range(len(second)):
                summary = first[i] + max(second[i], second[i + 1])
                temp_arr.append(summary)
                break
        del triangle[-2:]
        triangle.append(temp_arr)
        return longest_slide_down(triangle)
    summary = triangle[0][0] + max(triangle[1][0], triangle[1][1])
    return summary
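Note that the function above deletes rows from the list it is given, so the caller's triangle is consumed. A minimal sketch of the same bottom-up idea that leaves the input untouched, assuming a non-empty triangle (longest_slide_down_iter is a hypothetical name):

```python
def longest_slide_down_iter(triangle):
    """Bottom-up maximum path sum that leaves `triangle` untouched."""
    best = list(triangle[-1])                 # copy of the last row
    for row in reversed(triangle[:-1]):       # walk upwards, row by row
        # best[i] becomes the best total reachable from position i of this row
        best = [v + max(best[i], best[i + 1]) for i, v in enumerate(row)]
    return best[0]

tri = [[3], [7, 4], [2, 4, 6], [8, 5, 9, 3]]
print(longest_slide_down_iter(tri))  # 23
print(len(tri))                      # 4, the input still has all its rows
```

Only one row of running totals is kept at a time, so memory stays proportional to the widest row.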
You can try using an else and a pass, like so:
def max_value():
    ...  # code
def sliding_triangle():
    if not skip:
        ...  # code
    if position + 1 == len(triangle):
        pass
    else:
        for row in range(position, len(triangle)):
            ...  # code
    return total
print(sliding_triangle())
Python, like Java, does allow a function to return from more than one point, but a return only ends the current call; in recursive code the calls further up the stack carry on with their own loops. So instead of the early return, you can place a condition: if it holds, you skip straight to the final return; otherwise, you continue with the loop.
I condensed your code to make the logic easier to follow, but I can write it out in full if needed.
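A minimal sketch of how return behaves inside recursion, which may explain why the function seemed to ignore it: the return ends only the innermost call, and every outer call then resumes where it left off. The names countdown and calls are made up for this illustration.

```python
calls = []

def countdown(n):
    if n == 0:
        return              # ends only this innermost call
    countdown(n - 1)
    calls.append(n)         # still runs in every outer call once the inner one returns

countdown(3)
print(calls)  # [1, 2, 3]
```

In the question's code the outer calls are each in the middle of a for loop, so after the deepest call returns they keep iterating.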

Numpy blocks reshaping

I am looking for a way to reshape the following 1d-numpy array:
# dimensions
n = 2 # int : 1 ... N
h = 2 # int : 1 ... N
m = n*(2*h+1)
input_data = np.arange(0,(n*(2*h+1))**2)
The expected output should be reshaped into (2*h+1)**2 blocks of shape (n,n) such as:
input_data.reshape(((2*h+1)**2,n,n))
>>> array([[[ 0 1]
[ 2 3]]
[[ 4 5]
[ 6 7]]
...
[[92 93]
[94 95]]
[[96 97]
[98 99]]]
These blocks finally need to be reshaped into a (m,m) matrix so that they are stacked in rows of 2*h+1 blocks:
>>> array([[ 0, 1, 4, 5, 8, 9, 12, 13, 16, 17],
[ 2, 3, 6, 7, 10, 11, 14, 15, 18, 19],
...
[80, 81, 84, 85, 88, 89, 92, 93, 96, 97],
[82, 83, 86, 87, 90, 91, 94, 95, 98, 99]])
My problem is that I can't seem to find the proper axis permutation after the first reshape into (n,n) blocks. I have looked at several answers such as this one, but in vain.
As the real dimensions n and h are considerably larger and this operation takes place in an iterative process, I am looking for an efficient reshaping operation.
I don't think you can do this with reshape and transpose alone (although I'd love to be proven wrong). Using np.block works, but it's a bit messy:
np.block([list(i) for i in input_data.reshape( (2*h+1), (2*h+1), n, n )])
array([[ 0, 1, 4, 5, 8, 9, 12, 13, 16, 17],
[ 2, 3, 6, 7, 10, 11, 14, 15, 18, 19],
[20, 21, 24, 25, 28, 29, 32, 33, 36, 37],
[22, 23, 26, 27, 30, 31, 34, 35, 38, 39],
[40, 41, 44, 45, 48, 49, 52, 53, 56, 57],
[42, 43, 46, 47, 50, 51, 54, 55, 58, 59],
[60, 61, 64, 65, 68, 69, 72, 73, 76, 77],
[62, 63, 66, 67, 70, 71, 74, 75, 78, 79],
[80, 81, 84, 85, 88, 89, 92, 93, 96, 97],
[82, 83, 86, 87, 90, 91, 94, 95, 98, 99]])
EDIT: Never mind, you can do without np.block:
input_data.reshape( (2*h+1), (2*h+1), n, n).transpose(0, 2, 1, 3).reshape(10, 10)
array([[ 0, 1, 4, 5, 8, 9, 12, 13, 16, 17],
[ 2, 3, 6, 7, 10, 11, 14, 15, 18, 19],
[20, 21, 24, 25, 28, 29, 32, 33, 36, 37],
[22, 23, 26, 27, 30, 31, 34, 35, 38, 39],
[40, 41, 44, 45, 48, 49, 52, 53, 56, 57],
[42, 43, 46, 47, 50, 51, 54, 55, 58, 59],
[60, 61, 64, 65, 68, 69, 72, 73, 76, 77],
[62, 63, 66, 67, 70, 71, 74, 75, 78, 79],
[80, 81, 84, 85, 88, 89, 92, 93, 96, 97],
[82, 83, 86, 87, 90, 91, 94, 95, 98, 99]])
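The one-liner above hard-codes the final shape as (10, 10). A sketch of the same reshape/transpose/reshape chain written for general n and h, with the meaning of each axis spelled out (blocks_to_matrix is a hypothetical name):

```python
import numpy as np

def blocks_to_matrix(input_data, n, h):
    """Lay out (2h+1)**2 consecutive (n, n) blocks as an (m, m) matrix, one row of blocks at a time."""
    k = 2 * h + 1                      # blocks per row and per column
    return (input_data
            .reshape(k, k, n, n)       # (block_row, block_col, row_in_block, col_in_block)
            .transpose(0, 2, 1, 3)     # (block_row, row_in_block, block_col, col_in_block)
            .reshape(k * n, k * n))    # merge each pair of axes into one

n, h = 2, 2
out = blocks_to_matrix(np.arange((n * (2 * h + 1)) ** 2), n, h)
print(out[0].tolist())   # [0, 1, 4, 5, 8, 9, 12, 13, 16, 17]
print(out[-1].tolist())  # [82, 83, 86, 87, 90, 91, 94, 95, 98, 99]
```

The transpose makes the array non-contiguous, so the final reshape has to copy the data once; there is no Python-level loop over blocks.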

Can you explain why I get an error globalizing a variable?

Hey!
This relates to problem 18 from Project Euler (https://projecteuler.net/problem=18).
This code solves it, but I get an error (4th line):
Undefined variable: 'ans'Python(undefined-variable)
I want to understand why this happens.
Also, let me know if there are any flaws in my code.
Thanks in advance
def brute(i, j, sum):
    global ans
    if i > len(l) - 1:
        if sum > ans:
            ans = sum
        return None
    brute(i + 1, j, sum + l[i][j])
    brute(i + 1, j + 1, sum + l[i][j])
l = [
[75],
[95, 64],
[17, 47, 82],
[18, 35, 87, 10],
[20, 4, 82, 47, 65],
[19, 1, 23, 75, 3, 34],
[88, 2, 77, 73, 7, 63, 67],
[99, 65, 4, 28, 6, 16, 70, 92],
[41, 41, 26, 56, 83, 40, 80, 70, 33],
[41, 48, 72, 33, 47, 32, 37, 16, 94, 29],
[53, 71, 44, 65, 25, 43, 91, 52, 97, 51, 14],
[70, 11, 33, 28, 77, 73, 17, 78, 39, 68, 17, 57],
[91, 71, 52, 38, 17, 14, 91, 43, 58, 50, 27, 29, 48],
[63, 66, 4, 68, 89, 53, 67, 30, 73, 16, 69, 87, 40, 31],
[4, 62, 98, 27, 23, 9, 70, 98, 73, 93, 38, 53, 60, 4, 23],
]
ans = 0
brute(0, 0, 0)
print(ans)
IMHO this is not a good use case for globals; it would be better to refactor the code like so:
def brute(i, j):
    if i > len(l) - 1:
        return 0
    return l[i][j] + max(brute(i + 1, j), brute(i + 1, j + 1))
I've flipped the logic around to accomplish this: each call returns the maximum sum obtainable from its subtree.
Ideally, save global variables for system-wide settings and such.
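On the original question about the message itself: since the asker reports that the code runs and solves the problem, the message looks like a static-analysis diagnostic rather than a runtime exception, though that is an assumption on my part. At runtime a global name is looked up when the function is called, not when it is defined, so assigning it after the def but before the first call works. A small sketch with made-up names (bump, counter):

```python
def bump():
    global counter      # refers to the module-level name, looked up when bump() runs
    counter += 1

counter = 0             # defined after the function, but before the first call
bump()
bump()
print(counter)  # 2
```

Moving the assignment above the function definition may be enough to quiet such a warning, but that depends on the tool.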

List Index out of range error in python

I'm trying to solve the max path sum problem from project euler.
CODE:
def main():
    data = [map(int,row.split()) for row in open("Triangle.txt")]
    print data
    for i in range(len(data)-2,-1,-1):
        for j in range(i+1):
            data[i][j] += max([data[i+1][j],data[i+1][j+1]]) #list out of range error
    print (data[0][0])
if __name__ == '__main__':
    main()
The data value has 16 internal lists as follows:
[[75], [95, 64], [17, 47, 82], [18, 35, 87, 10], [20, 4, 82, 47, 65], [19, 1, 23, 75, 3, 34], [88, 2, 77, 73, 7, 63, 67], [99, 65, 4, 28, 6, 16, 70, 92], [41, 41, 26, 56, 83, 40, 80, 70, 33], [41, 48, 72, 33, 47, 32, 37, 16, 94, 29], [53, 71, 44, 65, 25, 43, 91, 52, 97, 51, 14], [70, 11, 33, 28, 77, 73, 17, 78, 39, 68, 17, 57], [91, 71, 52, 38, 17, 14, 91, 43, 58, 50, 27, 29, 48], [63, 66, 4, 68, 89, 53, 67, 30, 73, 16, 69, 87, 40, 31], [4, 62, 98, 27, 23, 9, 70, 98, 73, 93, 38, 53, 60, 4, 23], []]
And I am getting list index out of range error in the line:
data[i][j] += max([data[i+1][j],data[i+1][j+1]])
IndexError: list index out of range
How can I get rid of this error?
Thanks in advance...
The problem is the last item in data. It's an empty list. Try removing it and executing the script, as follows:
In [392]: data[-1]
Out[392]: []
In [393]: data = data[:-1]
In [394]: for i in range(len(data)-2,-1,-1):
   .....:     for j in range(i+1):
   .....:         data[i][j] += max([data[i+1][j],data[i+1][j+1]]) #list out of range error
   .....: print (data[0][0])
   .....:
1074
To eliminate the error altogether, without having to alter the contents of data by hand, you can read the file correctly in the first place by skipping blank lines:
data = [map(int,row.split()) for row in open("Triangle.txt") if row.strip()]
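The code in this thread is Python 2. In Python 3, map returns a lazy iterator, so data[i][j] would fail with a TypeError even after the blank line is filtered out; a list comprehension builds real lists. A sketch of a Python 3 version follows, where parse_triangle and max_path_sum are hypothetical names and a string stands in for Triangle.txt:

```python
def parse_triangle(lines):
    """Turn text lines into rows of ints, skipping blank lines."""
    return [[int(tok) for tok in line.split()] for line in lines if line.strip()]

def max_path_sum(data):
    """Bottom-up reduction from the question; modifies `data` in place."""
    for i in range(len(data) - 2, -1, -1):
        for j in range(i + 1):
            data[i][j] += max(data[i + 1][j], data[i + 1][j + 1])
    return data[0][0]

# Stand-in for open("Triangle.txt"), with a trailing blank line like the one causing the error.
text = "3\n7 4\n2 4 6\n8 5 9 3\n\n"
rows = parse_triangle(text.splitlines())
print(rows)                # [[3], [7, 4], [2, 4, 6], [8, 5, 9, 3]]
print(max_path_sum(rows))  # 23
```

With a real file, passing the open file object (ideally inside a with block) to parse_triangle should behave the same way, since iterating a file yields its lines.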
