I am creating a linear optimization model in Python using the PuLP package. I am wondering if there is a simple way to add constraints to a model without hard coding every variable. For example, I am currently using a for loop to create the following set partitioning constraints:
for j in range(0, len(excel_data_df)):
    i = j * 3
    OptModel += x[i] + x[i + 1] + x[i + 2] == 1
This works for smaller problems. However, as the variable i gets larger it becomes very time consuming to write out all of the indices in the constraint.
Would it be possible to loop through all of the values which i can take and then generate the OptModel += line of code automatically? For example, if the variable i goes up to 100 I would want the code to generate the following without having to manually add each x[i] variable:
for j in range(0, len(excel_data_df)):
    for i in range(0, 100):
        OptModel += x[0] + x[1] + x[2] + ...... + x[100] == 1
You can use the lpSum function, which allows you to sum over lists of variables - so all you need to do is have a way of generating the indices you want to sum over. In your second example you could do:
OptModel += lpSum([x[i] for i in range(0, 100)]) == 1
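For the original groups-of-three constraints the same idea applies inside the loop. A minimal self-contained sketch (OptModel and x mirror the question; n_rows and the binary variable category are stand-ins for your actual data):
from pulp import LpProblem, LpMinimize, LpVariable, lpSum

n_rows = 4  # stand-in for len(excel_data_df)
OptModel = LpProblem("partition", LpMinimize)
x = [LpVariable("x_%d" % i, cat="Binary") for i in range(3 * n_rows)]

# one set-partitioning constraint per row, three variables each
for j in range(n_rows):
    OptModel += lpSum(x[3 * j + k] for k in range(3)) == 1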
I am finally (!) switching from coding mostly in Fortran to Python. I have heard that Python enables efficient vectorization. I am wondering how this works. Say I want to do the following:
for each i
    skip the first 3 lines
    for each j
        calculate something
    end
    calculate average over all j
end
calculate average over all i
This is possible but laborious in Fortran. How can it be done efficiently in Python?
for k in range(i):
    if k >= 3:
        for z in range(j):
            calc3 += (j/3)  # Replace (j/3) with "Something"
        calc2 += calc3
    calc1 += calc2
sum_i = []
for i in range(<from>, <to>, <step>):
    sum_j = []
    if not i <= 3:
        for j in range(<from>, <to>, <step>):
            sum_j.append(<something calculated>)
        average_j = sum(sum_j) / len(sum_j)
    sum_i.append(<the i related value you want the average from>)
average_i = sum(sum_i) / len(sum_i)
EDIT: This is not vectorization (it is more a direct translation of the given pseudocode into Python).
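For a genuinely vectorized version, NumPy lets the loops become whole-array operations. A minimal sketch, assuming the values sit in a 2-D array called data, with rows playing the role of i and columns the role of j:
import numpy as np

data = np.random.rand(10, 5)    # stand-in for the real values
inner = data[3:]                # "skip the first 3 lines" (rows)
average_j = inner.mean(axis=1)  # average over all j, for each remaining i
average_i = average_j.mean()    # average over all i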
I'm writing an LP problem in PuLP (Python). I'm not new to LP but I am new to PuLP. So far I have a couple of constraints that are implemented correctly. They are simple and I know how they work. The problem is about assigning containers to voyages:
# All containers assigned to only 1 voyage
for i in cntrs:
    prob += lpSum([x[(i,v)] for v in voyages]) <= 1

# Container to right destination
for v in voyages:
    prob += lpSum([x[(i,v)] * posibleDest.loc[i,v] for i in cntrs]) == 1

# Weight capacity of voyages
for v in voyages:
    for b in barges:
        prob += lpSum([weight[i] * x[(i,v)] for i in cntrs]) <= voyWCap[v]

# Type capacity of voyages
for c in cats:
    for v in voyages:
        prob += cntrCat.loc[i,c] * x[(i,v)] <= bargeCATCAP.loc[c,b] * voyBarge.loc[b,v]

# TEU cap of voyages
for v in voyages:
    for b in barges:
        prob += lpSum([cntrTEU[i] * x[(i,v)] for i in cntrs]) <= voyTEUCap[v]
I tested the program and it works just fine; however, I'm stuck at a particular part. I want to add a parameter 'Tardy' which, if the container arrives too late/early, gives the container a 'penalty value'. My objective function is to minimize unused space, so adding the sum of penalties times a big number should 'push' the program into getting everything in the right time window.
Now my problem: I know this idea works, I just don't know how to program it.
What I've done so far:
My objective function is as follows:
prob += lpSum([(TEUcap[b] * voyBarge.loc[b,v]) - (x[(i,v)] * cntrTEU[i]) + Tardy[i] * M]
              for i in cntrs
              for b in barges
              for v in voyages)
Where M is a very big number
I've created a dictionary (Tardy) filled with 0's and a loop to fill that dictionary:
Tardy = dict.fromkeys(cntrs, 0)
for i in cntrs:
    for v in voyages:
        if cntrDest.dot(voyArive).loc[i,v] != 0:
            if cntrDest.dot(voyArive).loc[i,v] * x[(i,v)] <= (cntrOpen.dot(voyDest)).loc[i,v] * x[(i,v)]:
                Tardy[i] = 1
            elif cntrDest.dot(voyArive).loc[i,v] * x[(i,v)] >= (cntrClose.dot(voyDest)).loc[i,v] * x[(i,v)]:
                Tardy[i] = 1
            else:
                Tardy[i] = 0
In words: most of my parameters are matrices. If there is a non-zero value for
cntrDest.dot(voyArive).loc[i,v]
it means there is an arrival datetime for container i on voyage v; if this value is greater than the close datetime, or smaller than the open datetime, that container should get a penalty (Tardy[container] = 1).
Because x[(i,v)] is an LpVariable, it is always 0 before the problem is solved, and therefore Tardy is always 1.
I think I have to add a prob += statement somewhere, but I can't figure out how to make the program take it into account. If anyone could help me make it work, or has another suggestion on how to program it, that would be greatly appreciated!
Kind regards
You can't formulate your model "conditionally". Tardy is a variable in your model, and you cannot assign values to it within a conditional statement (if-elif-else) inside a linear program, because the values of the decision variables (x in this case) are not known when the problem is formulated and handed over to the solver. So we need to try something else and re-formulate that.
It isn't totally clear how you are handling times within your model, but it appears that the containers have due dates and the voyages have arrival times, which would be the basis for calculating Tardy. So you should introduce Tardy[i] as a non-negative real variable and constrain it to be larger than the difference between the arrival time and the due date. That only applies if container 'i' goes on that particular voyage 'v', so we multiply that delta by the selection binary variable x to make it apply only in the case of selection. In pseudo-code:
Tardy[i] >= (arrival_time[v] - due_time[i]) * x[i,v]
and then build that into your pulp model for each i,v in the model
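A small self-contained sketch of that reformulation in PuLP (all the data here - cntrs, voyages, arrival_time, due_time, M - are toy stand-ins for the question's matrices, just to show the shape of the constraints):
from pulp import LpProblem, LpMinimize, LpVariable, lpSum

# toy stand-ins for the question's data
cntrs = ["c1", "c2"]
voyages = ["v1", "v2"]
arrival_time = {"v1": 5, "v2": 9}   # when each voyage arrives
due_time = {"c1": 6, "c2": 7}       # when each container is due
M = 1000

prob = LpProblem("toy", LpMinimize)
x = {(i, v): LpVariable("x_%s_%s" % (i, v), cat="Binary")
     for i in cntrs for v in voyages}

# one non-negative tardiness variable per container
tardy = {i: LpVariable("tardy_%s" % i, lowBound=0) for i in cntrs}

# tardy[i] must cover the lateness of whichever voyage actually carries i
for i in cntrs:
    for v in voyages:
        prob += tardy[i] >= (arrival_time[v] - due_time[i]) * x[(i, v)]

# penalty term; in the real model this is added alongside the unused-space terms
prob += M * lpSum(tardy[i] for i in cntrs)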
I am trying to implement a relaxation iterative solver for a project. The function we create should take two inputs, a matrix A and a vector b, and should return the iterative vectors x that approximate the solution of Ax = b.
Pseudo Code from the book is here:
[pseudocode image from the book]
I am new to Python so I am struggling quite a bit with implementing this method. Here is my code:
def SOR_1(A, b):
    k = 1
    n = len(A)
    xo = np.zeros_like(b)
    x = np.zeros_like(b)
    omega = 1.25
    while (k <= N):
        for i in range(n-1):
            x[i] = (1.0-omega)*xo[i] + (1.0/A[i][i])[omega(-np.sum(A[i][j]*x[j]))
                   -np.sum(A[i][j]*xo[j] + b[i])]
            if (np.linalg.norm(x - xo) < 1e-9):
                print(x)
            k = k + 1.0
        for i in range(n-1):
            xo[i] = x[i]
    return x
My question is how to implement the for loop and generate the arrays correctly based on the pseudocode.
Welcome to Python!
Variable names in Python are case sensitive, so n is defined but N is not. If they are supposed to be different variables, I don't see where you set a value for N.
You are off to a good start, but the following line is still pseudocode for the most part:
x[i] = (1.0-omega)*xo[i] + (1.0/A[i][i])[omega(-np.sum(A[i][j]*x[j]))
-np.sum(A[i][j]*xo[j] + b[i])]
In the textbook's pseudocode, square brackets are being used as a grouping symbol, but in Python they are reserved for creating and indexing lists (Python's closest built-in equivalent of arrays). Also, there is no implicit multiplication in Python, so you have to write things like (1 + 2)*(3*(4+5)) rather than (1 + 2)[3(4+5)].
The other major issue is that you don't define j. You probably need a for loop that would either look like:
for j in range(1, i):
or if you want to do it inline
sum(A[i][j]*x[j] for j in range(1, i))
Note that range has two arguments: where to start and what value to stop before, so range(1, i) is equivalent to the summation from 1 to i - 1.
I think you are struggling with that line because there's far too much going on in that line. See if you can figure out parts of it using separate variables or offload some of the work to separate functions.
Something like x[i] = a + b*c*d() - e(), but give a, b, c, d and e meaningful names. You'd then have to correctly set each variable and define each function, but at least you are solving separate problems rather than one huge complex one.
Also, make sure you have your tabs correct. k = k + 1.0 should not be inside that for loop, just inside the while loop.
Coding is an iterative process. First get the while loop working. Don't try to do anything in it (except print out the variable so you can see that it is working). Next get the for loop working inside the while loop (again, just printing the variables). Next get (1.0-omega)*xo[i] working. Along the way, you'll discover and solve issues such as (1.0-omega)*xo[i] evaluating to 0 because xo is a NumPy array initialized with all zeros.
You'd start with something like:
k = 1
N = 3
n = 3
xo = [1, 2, 3]
while (k <= N):
    for i in range(n):
        print(k, i)
        omega = 1.25
        print((1.0-omega)*xo[i])
    k += 1
And slowly work more and more of the relaxation solver in until you have everything working.
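Once each piece works, the assembled solver might look something like the following sketch (my own reconstruction of a standard SOR loop, not the book's exact pseudocode; A is assumed to be a square NumPy array and b a vector):
import numpy as np

def sor(A, b, omega=1.25, max_iter=100, tol=1e-9):
    n = len(A)
    x = np.zeros_like(b, dtype=float)
    for _ in range(max_iter):
        x_old = x.copy()
        for i in range(n):
            # updated values for j < i, previous values for j > i
            s1 = sum(A[i][j] * x[j] for j in range(i))
            s2 = sum(A[i][j] * x_old[j] for j in range(i + 1, n))
            x[i] = (1.0 - omega) * x_old[i] + (omega / A[i][i]) * (b[i] - s1 - s2)
        if np.linalg.norm(x - x_old) < tol:
            break
    return x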
I wrote the following code for this problem.
prof = sorted([int(input()) for x in range(int(input()))])
student = sorted([int(input()) for x in range(int(input()))])
prof_dates = len(prof)
stud_dates = len(student)
amount = 0
prof_index = 0
stud_index = 0
while stud_index < stud_dates and prof_index < prof_dates:
    if student[stud_index] == prof[prof_index]:
        amount += 1
        stud_index += 1
    elif student[stud_index] > prof[prof_index]:
        prof_index += 1
    elif student[stud_index] < prof[prof_index]:
        stud_index += 1
print(amount)
But the code is producing a Time Limit Exceeded error. Earlier I had tried an in membership check for every item in student, but it produced a TLE and I believe that's because the in statement on a list is O(n). So I wrote this code, whose number of steps is roughly equal to the sum of the lengths of both lists. But this also produces a TLE. What changes should I make to my code? Is there some particular part which has a high time expense?
Thanks.
You are using sorting + merging. This takes O(N log N + M log M + N + M) time.
But you can put professor data in a set, check every student year value (from an unsorted list) and get O(M + N) complexity (on average).
Note that this approach eliminates the long operation of student list sorting.
Addition: Python has built-in sets. For languages that have no such facility, note that the professor's list is already sorted, so you can just use binary search for every year. The complexity would then be O(N log M).
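A minimal sketch of that set-based approach, keeping the question's input format (a count followed by that many values, first for professors and then for students):
prof = {int(input()) for _ in range(int(input()))}
n_student = int(input())
amount = 0
for _ in range(n_student):
    if int(input()) in prof:
        amount += 1
print(amount)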
As the problem is basically to find the intersection of two sets of integers, the following code solves it in O(M + N), assuming that a set lookup is possible in O(1):
prof = set([int(input()) for x in range(int(input()))])
student = set([int(input()) for x in range(int(input()))])
equals_dates = len(prof.intersection(student))
I need to generate all the partitions of a given integer.
I found this algorithm by Jerome Kelleher, which is stated to be the most efficient one:
def accelAsc(n):
    a = [0 for i in range(n + 1)]
    k = 1
    a[0] = 0
    y = n - 1
    while k != 0:
        x = a[k - 1] + 1
        k -= 1
        while 2*x <= y:
            a[k] = x
            y -= x
            k += 1
        l = k + 1
        while x <= y:
            a[k] = x
            a[l] = y
            yield a[:k + 2]
            x += 1
            y -= 1
        a[k] = x + y
        y = x + y - 1
        yield a[:k + 1]
reference: http://homepages.ed.ac.uk/jkellehe/partitions.php
However, it is not that efficient: for an input like 40 it nearly freezes my whole system for a few seconds before giving its output.
If it were a recursive algorithm I would try to decorate it with a caching function or something to improve its efficiency, but as it stands I can't figure out what to do.
Do you have some suggestions about how to speed up this algorithm? Or can you suggest another one, or a different approach to write one from scratch?
To generate compositions directly you can use the following algorithm:
def ruleGen(n, m, sigma):
    """
    Generates all interpart restricted compositions of n with first part
    >= m using restriction function sigma. See Kelleher 2006, 'Encoding
    partitions as ascending compositions' chapters 3 and 4 for details.
    """
    a = [0 for i in range(n + 1)]
    k = 1
    a[0] = m - 1
    a[1] = n - m + 1
    while k != 0:
        x = a[k - 1] + 1
        y = a[k] - 1
        k -= 1
        while sigma(x) <= y:
            a[k] = x
            x = sigma(x)
            y -= x
            k += 1
        a[k] = x + y
        yield a[:k + 1]
This algorithm is very general, and can generate partitions and compositions of many different types. For your case, use
ruleGen(n, 1, lambda x: 1)
to generate all unrestricted compositions. The third argument is known as the restriction function, and describes the type of composition/partition that you require. The method is efficient: the amount of effort required to generate each composition is constant when you average over all the compositions generated. If you would like to make it slightly faster in Python, it's easy to replace the calls to the function sigma with the constant 1.
It's worth noting here as well that for any constant amortised time algorithm, what you actually do with the generated objects will almost certainly dominate the cost of generating them. For example, if you store all the partitions in a list, then the time spent managing the memory for this big list will be far greater than the time spent generating the partitions.
Say, for some reason, you want to take the product of each partition. If you take a naive approach to this, then the processing involved is linear in the number of parts, whereas the cost of generation is constant. It's quite difficult to think of an application of a combinatorial generation algorithm in which the processing doesn't dominate the cost of generation. So, in practice, there'll be no measurable difference between using the simpler and more general ruleGen with sigma(x) = x and the specialised accelAsc.
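For instance, a quick usage sketch: with sigma(x) = x, ruleGen yields the ascending partitions (matching accelAsc), while sigma(x) = 1 yields all compositions:
# partitions of 5 as ascending compositions
print(list(ruleGen(5, 1, lambda x: x)))
# [[1, 1, 1, 1, 1], [1, 1, 1, 2], [1, 1, 3], [1, 2, 2], [1, 4], [2, 3], [5]]

# all compositions of 3
print(list(ruleGen(3, 1, lambda x: 1)))
# [[1, 1, 1], [1, 2], [2, 1], [3]]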
If you are going to use this function repeatedly for the same inputs, it could still be worth caching the return values (if you are going to use it across separate runs, you could store the results in a file).
If you can't find a significantly faster algorithm, then it should be possible to speed this up by an order of magnitude or two by moving the code into a C extension (this is probably easiest using cython), or alternatively by using PyPy instead of CPython (PyPy has its downsides - it does not yet support Python 3, or some commonly-used libraries like numpy and scipy).
The reason for this is, since python is dynamically typed, the interpreter is probably spending most of its time checking the types of the variables - for all the interpreter knows, one of the operations could turn x into a string, in which case expressions like x + y would suddenly have very different meanings. Cython gets around this problem by allowing you to statically declare the variables as integers, while PyPy has a just-in-time compiler which minimises redundant type checks.
Testing with n=75 I get:
PyPy 1.8:
w:\>c:\pypy-1.8\pypy.exe pstst.py
1.04800009727 secs.
CPython 2.6:
w:\>python pstst.py
5.86199998856 secs.
Cython + mingw + gcc 4.6.2:
w:\pstst> python -c "import pstst;pstst.run()"
4.06399989128
I saw no difference with Psyco(?)
The run function:
def run():
    import time
    start = time.time()
    for p in accelAsc(75):
        pass
    print time.time() - start, 'secs.'
If I change the definition of accelAsc for Cython to start with:
def accelAsc(int n):
    cdef int x, y, k
    # no more changes..
I get the Cython time down to 2.27 secs.
I'd say that your performance issue is somewhere else.
I didn't compare it with other approaches, but it does seem efficient to me:
import time
start = time.time()
partitions = list(accelAsc(40))
print('time: {:.5f} sec'.format(time.time() - start))
print('length:', len(partitions))
Gave:
time: 0.03636 sec
length: 37338