I want to make changes to the coefficients in an existing model. Currently (with the Python API) I'm looping through the constraints and calling model.chgCoeff but it's quite slow. Is there a faster way, perhaps accessing the constraint matrix directly, in the Python and/or C API?
Example code below. The reason it's slow seems to be mostly because of the loop itself; replacing chgCoeff with any other operation is still slow. Normally I would get around this by using vector operations rather than for loops, but without access to the constraint matrix I don't think I can do that.
from __future__ import division
import gurobipy as gp
import numpy as np
import time
N = 300
M = 2000
m = gp.Model()
m.setParam('OutputFlag', False)
masks = [np.random.rand(N) for i in range(M)]
p = 1/np.random.rand(N)
rets = [p * masks[i] - 1 for i in range(M)]
v = np.random.rand(N)*10000 * np.round(np.random.rand(N))
t = m.addVar()
x = [m.addVar(vtype=gp.GRB.SEMICONT, lb=1000, ub=v[i]) for i in range(N)]
m.update()
cons = [m.addConstr(t <= gp.LinExpr(ret, x)) for ret in rets]
m.setObjective(t, gp.GRB.MAXIMIZE)
m.update()
start_time = time.time()
m.optimize()
solve_ms = int(((time.time() - start_time)*1000))
print('First solve took %s ms' % solve_ms)
p = 1/np.random.rand(N)
rets = [p * masks[i] - 1 for i in range(M)]
start_time = time.time()
for i in range(M):
    for j in range(N):
        if rets[i][j] != -1:
            m.chgCoeff(cons[i], x[j], -rets[i][j])
m.update()
update_ms = int(((time.time() - start_time)*1000))
print('Model update took %s ms' % update_ms)
start_time = time.time()
m.optimize()
solve_ms = int(((time.time() - start_time)*1000))
print('Second solve took %s ms' % solve_ms)
k = 2
start_time = time.time()
for i in range(M):
    for j in range(N):
        if rets[i][j] != -1:
            k *= rets[i][j]
solve_ms = int(((time.time() - start_time)*1000))
print('Plain loop took %s ms' % solve_ms)
R = np.array(rets)
start_time = time.time()
S = np.copy(R)
copy_ms = int(((time.time() - start_time)*1000))
print('np.copy() took %s ms' % copy_ms)
Output:
First solve took 1767 ms
Model update took 2051 ms
Second solve took 1872 ms
Plain loop took 1103 ms
np.copy() took 3 ms
A call to np.copy on the size (2000, 300) constraint matrix takes 3ms. Is there a fundamental reason I'm missing that the whole model update can't be that fast?
You can't access the constraint matrix directly in Gurobi through the Python interface. Even if you could, you couldn't do an np.copy operation, because the matrix is stored in CSR (sparse) format, not a dense format. To make this sort of wholesale change to a constraint, it is better to remove the old constraint and add a new one. In your case, the changes to each constraint are so significant that you aren't going to get much benefit from a warm start, so you won't lose anything by not keeping the same constraint objects.
Assuming you adjust the rets array in the code above for the -1 special case, the following code will do what you want and be much faster:
for con in cons:
    m.remove(con)
new_cons = [m.addConstr(t <= gp.LinExpr(ret, x)) for ret in rets]
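To plug that into the question's timing harness, a quick sketch (using the list form of Model.remove, which also accepts a list of constraints) might look like this; it assumes the m, t, x, cons and updated rets from the code above:

# Sketch: drop the old constraints in one call and add the new ones in bulk.
start_time = time.time()
m.remove(cons)                    # Model.remove accepts a list of constraints
cons = [m.addConstr(t <= gp.LinExpr(ret, x)) for ret in rets]
m.update()
rebuild_ms = int((time.time() - start_time) * 1000)
print('Constraint rebuild took %s ms' % rebuild_ms)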
I've been using a pre-existing assignment tool for allocating undergraduate students to projects. I am now keen to try to build an assignment tool in Python which will allow me to add and adjust constraints, as we are faced with extraordinary pressures on space usage due to COVID.
The basis of the task is to place as many students as possible with their favored supervisors. I have 85 students who provided a 1-to-5 rank order of preferred supervisors, which allows me to tune a cost variable. In addition, there are 40 supervisors, each with varying levels of capacity; some can take 2 students, some 3, etc., with a total capacity of ca. 100.
I have been using Google OR-Tools (Python implementation) and have attempted the "Assignment with Task Sizes" strategy so far, with both CP-SAT and MIP. I can produce a solution using CP-SAT for a small dummy data set, but when I use last year's data, with a cost matrix of size 85x40, I haven't been able to generate an assignment solution. In contrast, the MIP solver approach produces an assignment with "0 cost" and no actual assignments. I have also started to construct a minimum-cost flow model, but so far I have not been able to get the program to run on my input data.
My general question is: what is the best general approach for such an assignment problem, with more agents than tasks, where a task can accept more than one agent?
Happy to provide what code I have if useful.
Thanks, Dave
EDIT
The code I have been using to attempt a CP-SAT solution is below. I initially create a NumPy array with dimensions given by the number of students and the number of staff, and fill it with the value 100; this is the value used for elements of the cost matrix that are not one of the student's choices. I have taken the students' choices (1-to-5) and squared them to give a bit of differentiation:
from __future__ import print_function
from ortools.sat.python import cp_model
import time
import pandas as pd
import numpy as np
capacity=pd.read_excel(r'C:\Users\Dave\Documents\assign_2019.xlsx',
index_col=0, sheet_name='capacity')
capacity=capacity.reset_index()
choices=pd.read_excel(r'C:\Users\Dave\Documents\assign_2019.xlsx',
index_col=0, sheet_name='choices')
choices=choices.reset_index()
array=np.empty((len(choices), len(capacity)))
array.fill(100)
cost = pd.DataFrame(data=array)
cost.index=choices['student']
cost.columns=capacity['staff']
choices=choices.set_index(['student'])
for i in choices.index:
    for j in choices.columns:
        s = choices.loc[i, j]
        cost.loc[i, s] = j**2
cost=cost.to_numpy()
cost=cost.astype(int)
sizes = capacity['capacity']
sizes=sizes.to_numpy()
sizes=sizes.astype(int)
def main():
    model = cp_model.CpModel()
    start = time.time()
    num_workers = len(cost)
    num_tasks = len(cost[1])
    # Variables
    x = []
    for i in range(num_workers):
        t = []
        for j in range(num_tasks):
            t.append(model.NewIntVar(0, 1, "x[%i,%i]" % (i, j)))
        x.append(t)
    x_array = [x[i][j] for i in range(num_workers) for j in range(num_tasks)]
    # Constraints
    # Each staff is allocated no more than capacity.
    [model.Add(sum(x[i][j] for i in range(num_workers)) <= sizes[j])
     for j in range(num_tasks)]
    # Number of projects allocated to a student is 1.
    [model.Add(sum(x[i][j] for j in range(num_tasks)) == 1)
     for i in range(num_workers)]
    model.Minimize(sum([np.dot(x_row, cost_row) for (x_row, cost_row) in
                        zip(x, cost)]))
    solver = cp_model.CpSolver()
    status = solver.Solve(model)
    if status == cp_model.OPTIMAL:
        print('Minimum cost = %i' % solver.ObjectiveValue())
        print()
        for i in range(num_workers):
            for j in range(num_tasks):
                if solver.Value(x[i][j]) >= 1:
                    print('Worker ', i, ' assigned to task ', j, ' Cost = ',
                          cost[i][j])
        print()
    end = time.time()
    print("Time = ", round(end - start, 4), "seconds")

if __name__ == '__main__':
    main()
EDIT 2:
Code used for MIP solver (data input is the same as the CP-SAT attempt described above):
from ortools.linear_solver import pywraplp

def main():
    solver = pywraplp.Solver('SolveAssignmentProblem',
                             pywraplp.Solver.CBC_MIXED_INTEGER_PROGRAMMING)
    start = time.time()
    num_workers = len(cost)
    num_tasks = len(cost[1])
    # Variables
    x = {}
    for i in range(num_workers):
        for j in range(num_tasks):
            x[i, j] = solver.IntVar(0, 1, 'x[%i,%i]' % (i, j))
    # Constraints
    # Staff can accept students up to capacity.
    for i in range(num_workers):
        solver.Add(solver.Sum([x[i, j] for j in range(num_tasks)]) <= sizes[j])
    # Student is allocated 1 project.
    for j in range(num_tasks):
        solver.Add(solver.Sum([x[i, j] for i in range(num_workers)]) == 1)
    solver.Minimize(solver.Sum([cost[i][j] * x[i, j]
                                for i in range(num_workers)
                                for j in range(num_tasks)]))
    print('Minimum cost = ', solver.Objective().Value())
    print()
    for i in range(num_workers):
        for j in range(num_tasks):
            if x[i, j].solution_value() > 0:
                print('Worker', i, ' assigned to task', j, ' Cost = ',
                      cost[i][j])
    print()
    end = time.time()
    print("Time = ", round(end - start, 4), "seconds")

if __name__ == '__main__':
    main()
The outcome of the MIP attempt is:
Minimum cost = 0.0
Time = 0.553 seconds
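For reference, here is a minimal sketch of the standard capacitated-assignment formulation in CP-SAT (illustrative only, not the exact code from the post above); it assumes the cost matrix and sizes array built earlier and converts entries to plain Python ints to avoid NumPy-type issues:

from ortools.sat.python import cp_model

def solve_assignment(cost, sizes):
    # cost: students x staff matrix of ints; sizes: capacity per staff member
    model = cp_model.CpModel()
    num_students, num_staff = len(cost), len(cost[0])
    x = [[model.NewBoolVar('x[%i,%i]' % (i, j)) for j in range(num_staff)]
         for i in range(num_students)]
    # each student is assigned to exactly one supervisor
    for i in range(num_students):
        model.Add(sum(x[i]) == 1)
    # each supervisor stays within capacity
    for j in range(num_staff):
        model.Add(sum(x[i][j] for i in range(num_students)) <= int(sizes[j]))
    # minimize total preference cost
    model.Minimize(sum(int(cost[i][j]) * x[i][j]
                       for i in range(num_students) for j in range(num_staff)))
    solver = cp_model.CpSolver()
    status = solver.Solve(model)
    if status in (cp_model.OPTIMAL, cp_model.FEASIBLE):
        return [(i, j) for i in range(num_students) for j in range(num_staff)
                if solver.Value(x[i][j])]
    return None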
I am currently working on a data mining project that creates a similarity matrix that is 18000x18000.
Here are the two methods which build the matrix:
def CreateSimilarityMatrix(dbSubsetData, distancePairsList):
    global matrix
    matrix = [[0.0 for y in range(dbSubsetData.shape[0])] for x in range(dbSubsetData.shape[0])]
    for i in range(len(dbSubsetData)):  # record1
        SimilarityArray = []
        start = time.time()
        for j in range(i+1, len(dbSubsetData)):  # record2
            Similarity = GetDistanceBetweenTwoRecords(dbSubsetData, i, j, distancePairsList)
            # The similarities are all very small numbers which might be why the preference value needs to be so precise.
            # Let's multiply the value by a scalar 10 to give the values more range.
            matrix[i][j] = Similarity * 10.0
            matrix[j][i] = Similarity * 10.0
        end = time.time()
    return matrix
def GetDistanceBetweenTwoRecords(dbSubsetData, i, j, distancePairsList):
    Record1 = dbSubsetData.iloc[i]
    Record2 = dbSubsetData.iloc[j]
    columns = dbSubsetData.columns
    distancer = 0.0
    distancec = 0.0
    for i in range(len(Record1)):
        columnName = columns[i]
        Record1Value = Record1[i]
        Record2Value = Record2[i]
        if Record1Value != Record2Value:
            ob = distancePairsList[distancePairsDict[columnName]-1]
            if ob.attributeType == "String":
                strValue = Record1Value + ":" + Record2Value
                strValue2 = Record2Value + ":" + Record1Value
                if strValue in ob.distancePairs:
                    val = (ob.distancePairs[strValue])**2
                    val = val * -1
                    distancec = distancec + val
                elif strValue2 in ob.distancePairs:
                    val = (ob.distancePairs[strValue2])**2
                    val = val * -1
                    distancec = distancec + val
            elif ob.attributeType == "Number":
                val = ((Record1Value - Record2Value) * ob.getSignificance())**2
                val = val * -1
                distancer = distancer + val
    distance = distancer + distancec
    return distance
Each iteration is looping 18000x19 times (18000 for each row and 19 times for each attribute). The total number of iterations is (18000x18000x19)/2 since it is symmetric and therefore I only have to do one half of the matrix. This will take around 36 hours to complete, which is a timeframe I obviously want to shave down.
I figured multiprocessing is the trick. Since each row independently generates numbers and fits them into the matrix, I could run CreateSimilarityMatrix in multiple processes. So I created this code, which creates my processes:
matrix = [[0.0 for y in range(SubsetDBNormalizedAttributes.shape[0])] for x in range(SubsetDBNormalizedAttributes.shape[0])]

if __name__ == '__main__':
    procs = []
    for i in range(4):
        proc = Process(target=CreateSimilarityMatrix, args=(SubsetDBNormalizedAttributes, distancePairsList, i, 4))
        procs.append(proc)
        proc.start()
        proc.join()
CreateSimilarityMatrix is now changed to
def CreateSimilarityMatrix(dbSubsetData, distancePairsList, counter=0, iteration=1):
    global Matrix
    for i in range(counter, len(dbSubsetData), iteration):  # record1
        SimilarityArray = []
        start = time.time()
        for j in range(i+1, len(dbSubsetData)):  # record2
            Similarity = GetDistanceBetweenTwoRecords(dbSubsetData, i, j, distancePairsList)
            # print("Similarity Between Records",i,":",j," is ", Similarity)
            # The similarities are all very small numbers which might be why the preference value needs to be so precise.
            # Let's multiply the value by a scalar 10 to give the values more range.
            Matrix[i][j] = Similarity * 10.0
            Matrix[j][i] = Similarity * 10.0
        end = time.time()
        print("Iteration", i, "took", end - start, "(s)")
Currently this goes s-l-o-w. It's really slow. It takes minutes to start one process, then it takes minutes to start the next one. I thought these were supposed to run concurrently? Is my application of the process incorrect?
If you are using CPython, there is something called the global interpreter lock (GIL) that makes it difficult to get a real speed-up from multithreading; it can instead slow things down substantially.
If you are dealing with matrices, use numpy, which is definitely a lot faster than regular Python.
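As a rough illustration of that point (a sketch, not the poster's actual code), the purely numeric part of the distance can be computed with broadcasting, one column at a time so memory stays at a single n x n array; the string-valued attributes would still need their dictionary lookups. Here values and significance are assumed stand-ins for the numeric attribute matrix and the per-column significance weights:

import numpy as np

def numeric_similarity(values, significance):
    # values: (n_records, n_numeric_cols) float array; significance: per-column weights
    scaled = values * significance
    n = scaled.shape[0]
    sim = np.zeros((n, n))
    for col in range(scaled.shape[1]):            # ~19 iterations instead of 18000**2
        diff = scaled[:, col][:, None] - scaled[:, col][None, :]
        sim -= diff ** 2                          # matches the loop's -((r1 - r2) * significance)**2
    return sim * 10.0                             # same *10 scaling as the original matrix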
I wrote physics simulation code in Python using numpy and then rewrote it in C++. In C++ it takes only 0.5 seconds, while in Python it takes around 40 s. Can someone please help me find what I did horribly wrong?
import numpy as np

def myFunc(i):
    uH = np.copy(u)
    for j in range(1, xmax-1):
        u[i][j] = a*uH[i][j-1] + (1-2*a)*uH[i][j] + a*uH[i][j+1]
    u[i][0] = u[i][0]/b
    for x in range(1, xmax):
        u[i][x] = (u[i][x] + a*u[i][x-1])/(b + a*c[x-1])
    for x in range(xmax-2, -1, -1):
        u[i][x] = u[i][x] - c[x]*u[i][x+1]
xmax = 101
tmax = 2000
# All other variables are defined here but I removed that for visibility
uH = np.zeros((xmax, xmax))
u = np.zeros((xmax, xmax))
c = np.full(xmax, -a)
uH[50][50] = 10000
for t in range(1, tmax):
    if t % 2 == 0:
        for i in range(0, xmax):
            myFunc(i)
    else:
        for i in range(0, xmax):
            myFunc(i)
In case someone wants to run it, here is the whole code: http://pastebin.com/20ZSpBqQ
EDIT: all variables are defined in the whole code, which can be found on Pastebin. Sorry for the confusion; I thought removing all the clutter would make the code easier to understand.
Fundamentally, C is a compiled language while Python is an interpreted one: speed against ease of use.
Numpy can fill the gap, but you must avoid for loops over individual items, which often takes some skill.
For example,
def block1():
    for i in range(xmax):
        for j in range(1, xmax-1):
            u[i][j] = a*uH[i][j-1] + (1-2*a)*uH[i][j] + a*uH[i][j+1]
in numpy style becomes:
def block2():
    u[:, 1:-1] += a*np.diff(u, 2)
which is shorter and faster (and easier to read and understand?):
In [37]: %timeit block1()
10 loops, best of 3: 25.8 ms per loop
In [38]: %timeit block2()
10000 loops, best of 3: 123 µs per loop
Finally, you can speed up numpy code with Just-In-Time compilation, which Numba provides. Just change the beginning of your code like this:
import numba

@numba.jit
def myFunc(u, i):
    ...
and change the calls to myFunc(u, i) at the end of the script (u must be a parameter so the types can be determined automatically), and you will reach the same performance (0.4 s on my PC).
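A fuller sketch of that Numba variant (my reading of the suggestion, not the exact code from the answer) could look like the following; the scalars a, b and the c array are passed in explicitly so Numba can infer their types:

import numba

@numba.jit
def myFunc(u, c, a, b, i):
    xmax = u.shape[1]
    uH = u.copy()
    for j in range(1, xmax - 1):
        u[i, j] = a*uH[i, j-1] + (1 - 2*a)*uH[i, j] + a*uH[i, j+1]
    u[i, 0] = u[i, 0]/b
    for x in range(1, xmax):
        u[i, x] = (u[i, x] + a*u[i, x-1])/(b + a*c[x-1])
    for x in range(xmax - 2, -1, -1):
        u[i, x] = u[i, x] - c[x]*u[i, x+1]

# in the time loop, call myFunc(u, c, a, b, i) instead of myFunc(i)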
When I ran your numpy Python code it took four minutes; once I removed the numpy code and replaced it with standard Python code, it only took one minute! (I have a not-so-fast computer.)
Here's that code:
#import numpy as np
def impl(i, row):
    if row:
        uH = [r[:] for r in u]  # copy each row of 'u' (u[:][:] only copies the outer list)
        for j in range(1, xmax-1):
            u[i][j] = a*uH[i][j-1] + (1-2*a)*uH[i][j] + a*uH[i][j+1]
        u[i][0] = u[i][0]/b
        for x in range(1, xmax):
            u[i][x] = (u[i][x] + a*u[i][x-1])/(b + a*c[x-1])
        for x in range(xmax-2, -1, -1):
            u[i][x] = u[i][x] - c[x]*u[i][x+1]
    else:
        uH = [r[:] for r in u]  # copy each row of 'u'
        for j in range(1, xmax-1):
            u[j][i] = a*uH[j-1][i] + (1-2*a)*uH[j][i] + a*uH[j+1][i]
        u[0][i] = u[0][i]/b
        for y in range(1, xmax):
            u[y][i] = (u[y][i] + a*u[y-1][i])/(b + a*c[y-1])
        for y in range(xmax-2, -1, -1):
            u[y][i] = u[y][i] - c[y]*u[y+1][i]

# Init
xmax = 101
tmax = 2000
D = 0.5
l = 1
tSec = 0.1
uH = [[0.0]*xmax for _ in range(xmax)]  # np.zeros((xmax,xmax)); rows must be independent lists
u = [[0.0]*xmax for _ in range(xmax)]   # np.zeros((xmax,xmax))
dx = l / xmax
dt = tSec / tmax
a = (D*dt)/(dx*dx)
b = 1 + 2*a
print("dx==" + str(dx))
print("dt==" + str(dt))
print(" a==" + str(a))
# coefficient c of the tridiagonal matrix
c = [-a]*xmax  # np.full(xmax,-a)
c[0] = c[0]/b
for i in range(1, xmax):
    c[i] = c[i]/(b + a*c[i-1])
uH[50][50] = 10000
u = uH
for t in range(1, tmax):
    if t % 2 == 0:
        for i in range(0, xmax):
            impl(i, False)
    else:
        for i in range(0, xmax):
            impl(i, True)
I believe this could be much faster still if numpy were used the correct way (whole-array operations) rather than as a drop-in substitute for plain lists; even so, simply not using numpy arrays for element-by-element access cut the time to a quarter of the original.
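One pitfall worth showing explicitly (a side note, not part of the answer above): [[0.0]*xmax]*xmax repeats a reference to a single inner list, so every "row" is the same object; a list comprehension (or np.zeros) gives independent rows:

aliased = [[0.0] * 3] * 3                     # three references to the SAME inner list
aliased[0][0] = 1.0
print(aliased)       # [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]

independent = [[0.0] * 3 for _ in range(3)]   # three distinct inner lists
independent[0][0] = 1.0
print(independent)   # [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]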
While attempting to implement Automatic Peak Detection in Noisy Periodic and Quasi-Periodic Signals, by Felix Scholkmann, Jens Boss and Martin Wolf, in Python, I've hit a stumbling block in the implementation.
Upon attempting to optimise, I've noticed that the nested for loops are creating a bottleneck in processing time (taking 115394 ms on average to complete).
Is there a more efficient means of constructing the nested for loop?
N.B.:
The parameter signal is a list of values which the algorithm will process; it is of the form:
-48701.0
-20914.0
-1757.0
-49278.0
-106781.0
-88139.0
-13587.0
28071.0
11880.0
-13375.0
-18056.0
-15248.0
-12476.0
-9832.0
-26365.0
-65734.0
-81657.0
-41566.0
6382.0
872.0
-30666.0
-20261.0
17543.0
6278.0
...
The list is 32768 lines long.
The function returns the indexes of the detected peaks, which are then processed in another function.
import math
import time
from operator import itemgetter

import numpy as np

def ampd(signal):
    s_time = range(1, len(signal)+1)
    [fitPolynomial, fitError] = np.polyfit(s_time, signal, 1)
    fitSignal = np.polyval([fitPolynomial, fitError], s_time)
    dtrSignal = signal - fitSignal
    N = len(dtrSignal)
    L = math.ceil(N/2.0)-1
    creation_start = time.time()
    np.random.seed(1969)
    LSM = np.random.uniform(0, 2, size=(L, N))
    creation_elapsedTime = time.time() - creation_start
    print('LSM created in %s ms' % int(creation_elapsedTime * 1000))
    loop_start = time.time()
    for k in range(1, L):
        for i in range(k+2, N-k+1):
            if signal[i-1] > signal[i-k-1] and signal[i-1] > signal[i+k-1]:
                LSM[k, i] = 0
    loop_elapsedTime = time.time() - loop_start
    print('Loop completed in %s ms' % int(loop_elapsedTime * 1000))
    G = np.sum(LSM, axis=1)
    l = min(enumerate(G), key=itemgetter(1))[0]
    MLSM = LSM[0:l]
    S = np.std(MLSM, ddof=1)
    found_indices = np.where(MLSM == ((S-1) == 0))
    del LSM
    del MLSM
    return found_indices[1]
Here is a solution which uses only one loop
for k in range(1, L):
    mat = 1 - ((signal[k+1:N-k] > signal[1:N-2*k]) & (signal[k+1:N-k] > signal[2*k+1:N]))
    LSM[k, k+2:N-k+1] *= mat
It's faster and seems to give the same solutions. You compare slices (as suggested by Ami Tavory) and combine the comparisons with &, which gives a True/False array; the 1 - ... operation turns that into ones and zeros, the zeros corresponding to where the conditions are met. Lastly, you multiply the row by the result.
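As a quick sanity check (an illustrative snippet, not part of the answer), the two variants can be compared on a small random signal; note that signal must be a NumPy array for the sliced comparisons to work element-wise:

import numpy as np

np.random.seed(0)
signal = np.random.rand(500)              # must be an ndarray, not a list
N = len(signal)
L = int(np.ceil(N / 2.0)) - 1

LSM_loop = np.random.uniform(0, 2, size=(L, N))
LSM_vec = LSM_loop.copy()

# original double loop
for k in range(1, L):
    for i in range(k + 2, N - k + 1):
        if signal[i - 1] > signal[i - k - 1] and signal[i - 1] > signal[i + k - 1]:
            LSM_loop[k, i] = 0

# single-loop version from the answer
for k in range(1, L):
    mat = 1 - ((signal[k + 1:N - k] > signal[1:N - 2 * k]) &
               (signal[k + 1:N - k] > signal[2 * k + 1:N]))
    LSM_vec[k, k + 2:N - k + 1] *= mat

print(np.array_equal(LSM_loop, LSM_vec))  # expected: True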
I have two large vectors (~133000 values) of different lengths. They are each sorted from small to large values. I want to find values that are similar within a given tolerance. This is my solution, but it is very slow. Is there a way to speed this up?
import numpy as np

for lv in range(np.size(vector1)):
    for lv_2 in range(np.size(vector2)):
        if np.abs(vector1[lv_2]-vector2[lv]) < .02:
            print(vector1[lv_2], vector2[lv], lv, lv_2)
            break
Your algorithm is far from optimal. You compare way too many values. Assume you are at a certain position in vector1 and the current value in vector2 is already more than 0.02 bigger. Why would you compare against the rest of vector2?
Start with something like
pos1 = 0
pos2 = 0
Now compare the values at those positions in your vectors. If the difference is too big, move the position of the smaller one forward and check again. Continue until you reach the end of one vector.
I haven't tested it, but the following should work. The idea is to exploit the fact that the vectors are sorted:
lv_1, lv_2 = 0, 0
while lv_1 < len(vector1) and lv_2 < len(vector2):
    if np.abs(vector1[lv_1] - vector2[lv_2]) < .02:
        print(vector1[lv_1], vector2[lv_2], lv_1, lv_2)
        lv_1 += 1
        lv_2 += 1
    elif vector1[lv_1] < vector2[lv_2]:
        lv_1 += 1
    else:
        lv_2 += 1
The following code gives a nice increase in performance that depends upon how dense the numbers are. Using a set of 1000 random numbers, sampled uniformly between 0 and 100, it runs about 30 times faster than your implementation.
pos1_start = 0
for i in range(np.size(vector1)):
    for j in range(pos1_start, np.size(vector2)):
        if np.abs(vector1[i] - vector2[j]) < .02:
            results1 += [(vector1[i], vector2[j], i, j)]
        else:
            if vector2[j] < vector1[i]:
                pos1_start += 1
            else:
                break
The timing:
time new method: 0.112464904785
time old method: 3.59720897675
Which is produced by the following script:
import random
import numpy as np
import time

# initialize the vectors to be compared
vector1 = [random.uniform(0, 40) for i in range(1000)]
vector2 = [random.uniform(0, 40) for i in range(1000)]
vector1.sort()
vector2.sort()

# the arrays that will contain the results for the first method
results1 = []
# the arrays that will contain the results for the second method
results2 = []

pos1_start = 0
t_start = time.time()
for i in range(np.size(vector1)):
    for j in range(pos1_start, np.size(vector2)):
        if np.abs(vector1[i] - vector2[j]) < .02:
            results1 += [(vector1[i], vector2[j], i, j)]
        else:
            if vector2[j] < vector1[i]:
                pos1_start += 1
            else:
                break
t1 = time.time() - t_start
print("time new method:", t1)

t = time.time()
for lv1 in range(np.size(vector1)):
    for lv2 in range(np.size(vector2)):
        if np.abs(vector1[lv1] - vector2[lv2]) < .02:
            results2 += [(vector1[lv1], vector2[lv2], lv1, lv2)]
t2 = time.time() - t
print("time old method:", t2)

# sort the results
results1.sort()
results2.sort()
print(np.allclose(results1, results2))
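Since both vectors are sorted, another option (a sketch, not from the answers above) is to use np.searchsorted to find, for each element of vector1, the window of vector2 values within the tolerance, avoiding the inner Python loop entirely. Note that boundary values exactly at the tolerance are included here, unlike the strict < in the loop versions:

import numpy as np

v1 = np.sort(np.random.uniform(0, 40, 1000))
v2 = np.sort(np.random.uniform(0, 40, 1000))
tol = 0.02

lo = np.searchsorted(v2, v1 - tol, side='left')   # first v2 index within tolerance of each v1 value
hi = np.searchsorted(v2, v1 + tol, side='right')  # one past the last such index
pairs = [(v1[i], v2[j], i, j)
         for i in range(len(v1)) for j in range(lo[i], hi[i])]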