CPLEX and epgap - python

I have some code that generates 150 different lineups. I want to make CPLEX use as close to a greedy solution as possible. I read that if you make the epgap large enough it will mimic a greedy approach. Is this true? And if so, what should I set the epgap to?
import pulp
from pulp import *
from pulp.solvers import CPLEX_PY
from pydfs_lineup_optimizer import get_optimizer, Site, Sport, CSVLineupExporter
from pydfs_lineup_optimizer.solvers.pulp_solver import PuLPSolver
import time

start_time = time.time()

class CustomPuLPSolver(PuLPSolver):
    LP_SOLVER = pulp.CPLEX_PY(msg=0, epgap=.1)

optimizer = get_optimizer(Site.FANDUEL, Sport.BASEBALL, solver=CustomPuLPSolver)
optimizer.load_players_from_csv("/Users/austi/Desktop/MLB/PLAYERS_LIST.csv")
optimizer.restrict_positions_for_opposing_team(['P'], ['1B', 'C', '2B', '3B', 'SS', 'OF', 'UTIL'])
optimizer.set_spacing_for_positions(['SS', 'C', '1B', '3B', 'OF', '2B'], 4)
optimizer.set_team_stacking([4])
optimizer.set_max_repeating_players(7)

lineups = list(optimizer.optimize(n=150))
for lineup in lineups:
    print(lineup)

exporter = CSVLineupExporter(lineups)
exporter.export('MLB_result.csv')
print(round((time.time() - start_time) / 60), "minutes run time")

No, this is not true (where did you read that?). Setting parameter epgap to N tells CPLEX to stop as soon as the relative difference between the best known feasible solution and the lower bound (for minimization problems) on an optimal solution falls below N.
This does not say anything about how the best known feasible solution was found. It could have come from any heuristic or even from an integral node.
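For intuition, the relative gap that gets compared against epgap is essentially the quantity sketched below for a minimization problem (the exact tolerance term CPLEX adds to the denominator is an implementation detail):

def relative_mip_gap(best_integer, best_bound):
    # relative difference between the incumbent objective and the best lower bound
    return abs(best_bound - best_integer) / (1e-10 + abs(best_integer))

# With epgap=0.1 the solver may stop as soon as this drops below 0.1, e.g. an
# incumbent of 105 against a lower bound of 100 is already within a 10% gap:
print(relative_mip_gap(105.0, 100.0))  # ~0.0476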
If you explicitly need a greedy solution then you have two options:
Compute that greedy solution yourself
Modify your model so that the only feasible solution is the greedy solution.
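If you go with the first option, a greedy lineup can be computed outside the optimizer in a few lines. This is purely a hypothetical sketch: the player dicts, position slots and salary cap below are made up and are not the pydfs_lineup_optimizer API.

# Hypothetical greedy sketch: fill slots in order of projected points per salary.
players = [
    {'name': 'A', 'position': 'OF', 'salary': 3500, 'points': 11.2},
    {'name': 'B', 'position': 'SS', 'salary': 2900, 'points': 8.4},
    # ... rest of the player pool ...
]
slots = {'P': 1, 'C': 1, '1B': 1, '2B': 1, '3B': 1, 'SS': 1, 'OF': 3}
salary_cap = 35000

lineup, spent = [], 0
for p in sorted(players, key=lambda p: p['points'] / p['salary'], reverse=True):
    if slots.get(p['position'], 0) > 0 and spent + p['salary'] <= salary_cap:
        lineup.append(p)
        slots[p['position']] -= 1
        spent += p['salary']
print([p['name'] for p in lineup], spent)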

Python CPLEX warm starts from infeasible solution

I currently have a solved (integer) LP problem which includes, amongst others, the following constraint (as pseudocode):
Packages_T1 + Packages_T2 + Packages_T3 + RPackages = 25
It represents three package trucks (T1, T2 and T3) to each of which packages can be assigned plus a residual/spilled package variable which is used in the objective function. The current value of 25 represents the total package demand.
Let's say I want to re-solve this problem but change the current demand of 25 packages to 35 packages. When I warm start from the previous solution with 25 packages, CPLEX errors out, stating that the provided solution is infeasible, which makes perfect sense. However, it subsequently fails to repair the previous solution even though the most straightforward way to do this would be to "up" the RPackages variable for each of these constraints.
My question is whether there is any possibility to still use the information from the previously solved problem as a warm start for the new one. Is there a way to, for example, drop all RPackages from the solution and have them recalculated to fit the new constraint right-hand side? A "last resort" effort I thought of would be to manually recalculate all these RPackages values myself and replace them in the old solution, but a more automated solution to this problem would be preferred. I am using the standard CPLEX Python API, for reference.
Thank you in advance.
Even if the warm start is not feasible, CPLEX can use some of its information.
Let me use the zoo example from
https://www.linkedin.com/pulse/making-optimization-simple-python-alex-fleischer/
from docplex.mp.model import Model
mdl = Model(name='buses')
nbbus40 = mdl.integer_var(name='nbBus40')
nbbus30 = mdl.integer_var(name='nbBus30')
mdl.add_constraint(nbbus40*40 + nbbus30*30 >= 300, 'kids')
mdl.minimize(nbbus40*500 + nbbus30*400)
warmstart=mdl.new_solution()
warmstart.add_var_value(nbbus40,4)
warmstart.add_var_value(nbbus30,0)
mdl.add_mip_start(warmstart)
sol=mdl.solve(log_output=True)
for v in mdl.iter_integer_vars():
    print(v, " = ", v.solution_value)
gives
Warning: No solution found from 1 MIP starts.
Retaining values of one MIP start for possible repair.
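With the standard CPLEX Python API you can also pass a partial MIP start, for example only the truck variables and not RPackages, and ask CPLEX to repair or complete it. A rough sketch, where the model file name and the variable values are placeholders:

import cplex

cpx = cplex.Cplex()
cpx.read('packages.lp')  # hypothetical model file with the new demand of 35

# Give CPLEX only the truck variables from the old solution and leave RPackages
# free, so the start can be completed/repaired against the new right-hand side.
names = ['Packages_T1', 'Packages_T2', 'Packages_T3']
values = [10.0, 8.0, 7.0]  # values taken from the previous solve (made up here)

cpx.parameters.mip.limits.repairtries.set(10)  # let CPLEX attempt to repair starts
cpx.MIP_starts.add(cplex.SparsePair(ind=names, val=values),
                   cpx.MIP_starts.effort_level.repair)
cpx.solve()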

How can I retrieve the current best solution from the Gurobi solver when using PulP?

If I use Gurobi's interactive shell, I can do the following:
gurobi> m = read('model.mp')
gurobi> m.optimize()
[...]
Found heuristic solution: objective 821425.00000
Then abort and get the current solution via
gurobi> m.printAttr('X')
I want to have the same behavior in pulp. In particular, after having called:
prob = pulp.LpProblem(name="MIPProblem", sense=pulp.LpMaximize)
[...]
status = prob.solve(pulp.GUROBI_CMD(msg=True, keepFiles=1))
I want to wait until the first heuristic solution is found/abort after a certain timespan and then obtain the current best solution found by Gurobi. How would I do that?
You can either use pulp.GUROBI or pulp.GUROBI_CMD.
The main difference is that pulp.GUROBI is a wrapper around gurobipy (the Gurobi Python interface), while pulp.GUROBI_CMD uses the command line (i.e., it writes the LP/ILP to a file and then calls the solver).
Let's distinguish the two cases:
Case 1: pulp.GUROBI:
In this case you can access the solverModel object (the underlying gurobipy model). For the available attributes you can refer directly to the Gurobi documentation. For your specific use case, the relevant attributes are ObjVal (objective of the best solution found so far) and ObjBound (best proven bound).
A Small Example:
import pulp
...
status = prob.solve(pulp.GUROBI(timeLimit=1))
print(pulp.LpStatus[status]) # status
print(prob.solverModel.ObjVal) # objective of the best solution found so far
print(prob.solverModel.ObjBound) # best proven bound on the objective
Case 2: pulp.GUROBI_CMD:
In this case, the problem is solved via the command line and the solution is read back from the Gurobi result file.
import pulp
...
status = prob.solve(pulp.GUROBI_CMD(options=[('TimeLimit','1')]))
print(pulp.LpStatus[status]) # status
print(pulp.value(prob.objective)) # best objective found
Note that GUROBI_CMD does not provide a reliable solution status, so you may read Optimal even though the solution is not actually optimal, since the Gurobi solution file contains no information about the status.
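In either case, once the solve returns (for example after hitting the time limit) you can read the incumbent back through pulp itself. A small sketch, assuming a pulp version whose GUROBI wrapper accepts timeLimit:

import pulp

prob = pulp.LpProblem(name="MIPProblem", sense=pulp.LpMaximize)
# ... build variables, objective and constraints here ...

status = prob.solve(pulp.GUROBI(msg=True, timeLimit=60))
print(pulp.LpStatus[status])       # solver status
print(pulp.value(prob.objective))  # objective of the best solution found
for v in prob.variables():         # incumbent variable values
    print(v.name, v.varValue)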

Should I linearize or try to solve the MINLP in python with gurobi or try a completely different approach?

I'm fairly new to this, so I'm just going to shoot and hope I'm as precise as possible and you'll think it warrants an answer.
I'm trying to optimize (minimize) a cost/quantity model, where both are continuous variables. Global cost should be minimized, but is dependent on total quantity, which is dependent on specific cost.
My code looks like this so far:
# create model
m = Model('Szenario1')

# create variables
X_WP = {}
X_IWP = {}
P_WP = {}
P_IWP = {}
for year in df1.index:
    X_WP[year] = m.addVar(vtype=GRB.CONTINUOUS, name="Wärmepumpe%d" % year)
    X_IWP[year] = m.addVar(vtype=GRB.CONTINUOUS, name="Industrielle Wärmepumpe%d" % year)
    # Price in year i = Base.price * ((Sum of newly installed capacity + sum of historical capacity)^(math.log(LearningRate)/math.log(2)))
    P_WP[year] = P_WP0 * (quicksum(X_WP[year] for year in df1.index) ** learning_factor)
    P_IWP[year] = m.addVar(vtype=GRB.CONTINUOUS, name="Preis Industrielle Wärmepumpe%d" % year)

X_WP[2016] = 0
X_IWP[2016] = 0

# Constraints and Objectives
for year in df1.index:
    m.addConstr((X_WP[year]*VLST_WP + X_IWP[year]*VLST_IWP == Wärmemenge[year]), name="Demand(%d)" % year)

obj = quicksum(
    ((X_WP[year] - X_WP[year-1])*P_WP[year] + X_WP[year]*Strompreis_WP*VLST_WP) +
    ((X_IWP[year] - X_IWP[year-1])*P_IWP[year] + X_IWP[year]*Strompreis_EHK*VLST_IWP)
    for year in Wärmemenge.index)
m.setObjective(obj, GRB.MINIMIZE)
m.update()
m.optimize()
X is quantity and P is price. WP and IWP are two different technologies (more will be added later). Since X and P are multiplied, the problem is nonlinear, and so far I haven't found a way to feed Gurobi an objective it can handle.
My research online and on Stack Overflow basically led me to the conclusion that I can either linearize and solve with Gurobi, find another solver that can solve this MINLP, or formulate my objective in a way that Gurobi can solve. Since I've already made myself familiar with Gurobi, that would be my preferred choice.
Any advice on what's best at this point?
Would be highly appreciated!
I'd suggest rewriting your Python code using Pyomo.
This is a general-purpose optimization modeling framework for Python which can construct valid inputs for Gurobi as well as a number of other optimization tools.
In particular, it will allow you to use Ipopt as a backend, which does solve (at least some) nonlinear problems. Even if Ipopt cannot solve your nonlinear problem, using Pyomo will allow you to test that quickly and then easily move back to a linearized representation in Gurobi if things don't work out.
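To make that concrete, here is a toy Pyomo model with a bilinear cost term solved by Ipopt; the variables and the price/quantity relation below are placeholders, not your actual model:

from pyomo.environ import (ConcreteModel, Var, Objective, Constraint,
                           SolverFactory, NonNegativeReals, minimize)

m = ConcreteModel()
m.x = Var(domain=NonNegativeReals, initialize=1.0)   # quantity
m.p = Var(domain=NonNegativeReals, initialize=1.0)   # price
m.demand = Constraint(expr=m.x >= 10)                # toy demand constraint
m.price_link = Constraint(expr=m.p == 5 * m.x**0.8)  # nonlinear price/quantity link
m.cost = Objective(expr=m.x * m.p, sense=minimize)   # bilinear cost term

SolverFactory('ipopt').solve(m, tee=True)            # requires an Ipopt installation
print(m.x(), m.p())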

Optimizing a multithreaded numpy array function

Given 2 large arrays of 3D points (I'll call the first "source" and the second "destination"), I needed a function that would return, for each element of "source", the index of its closest match in "destination", with this limitation: I can only use numpy... so no scipy, pandas, numexpr, cython...
To do this I wrote a function based on the "brute force" answer to this question. I iterate over the elements of source, find the closest element in destination and return its index. Due to performance concerns, and again because I can only use numpy, I tried multithreading to speed it up. Here are both the threaded and unthreaded functions and how they compare in speed on an 8-core machine.
import timeit
import numpy as np
from numpy.core.umath_tests import inner1d
from multiprocessing.pool import ThreadPool

def threaded(sources, destinations):
    # Define worker function
    def worker(point):
        dlt = (destinations - point)  # delta between destinations and given point
        d = inner1d(dlt, dlt)         # get distances
        return np.argmin(d)           # return closest index
    # Multithread!
    p = ThreadPool()
    return p.map(worker, sources)

def unthreaded(sources, destinations):
    results = []
    #for p in sources:
    for i in range(len(sources)):
        dlt = (destinations - sources[i])  # difference between destinations and given point
        d = inner1d(dlt, dlt)              # get distances
        results.append(np.argmin(d))       # append closest index
    return results

# Setup the data
n_destinations = 10000  # 10k random destinations
n_sources = 10000       # 10k random sources
destinations = np.random.rand(n_destinations, 3) * 100
sources = np.random.rand(n_sources, 3) * 100

# Compare!
print 'threaded: %s' % timeit.Timer(lambda: threaded(sources, destinations)).repeat(1, 1)[0]
print 'unthreaded: %s' % timeit.Timer(lambda: unthreaded(sources, destinations)).repeat(1, 1)[0]
Results:
threaded: 0.894030461056
unthreaded: 1.97295164054
Multithreading seems beneficial, but I was hoping for more than a 2X increase given that the real-life datasets I deal with are much larger.
All recommendations to improve performance (within the limitations described above) will be greatly appreciated!
OK, I've been reading the Maya documentation on Python and I came to these conclusions/guesses:
They're probably using CPython inside (several references to that documentation and not any other).
They're not fond of threads (lots of non-thread safe methods)
Given the above, I'd say it's better to avoid threads. Because of the GIL, this is a common problem, and there are several ways to work around it:
Try to build a C/C++ extension. Once that is done, use threads in C/C++. Personally, I'd try to get SIP working and then move on.
Use multiprocessing. Even if your custom Python distribution doesn't include it, you can get to a working version since it's all pure Python code. multiprocessing is not affected by the GIL since it spawns separate processes (see the sketch at the end of this answer).
The above should've worked out for you. If not, try another parallel tool (after some serious praying).
On a side note, if you're using outside modules, be mindful of matching Maya's Python version. This may be the reason you couldn't build scipy. Of course, scipy has a huge codebase and the Windows platform is not the easiest to build things on.
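For the multiprocessing route, a minimal sketch of the worker above could look like the following. The worker has to be a module-level function so it can be pickled, and inner1d is replaced with np.einsum, which computes the same row-wise dot product without the private umath_tests import. Note that the per-process serialization overhead can eat into the speed-up for large arrays.

import numpy as np
from functools import partial
from multiprocessing import Pool

def closest_index(point, destinations):
    # squared Euclidean distance from one source point to every destination
    dlt = destinations - point
    return int(np.argmin(np.einsum('ij,ij->i', dlt, dlt)))

def multiprocessed(sources, destinations):
    # the worker is bound to the destinations array via functools.partial
    pool = Pool()
    try:
        return pool.map(partial(closest_index, destinations=destinations), list(sources))
    finally:
        pool.close()
        pool.join()

if __name__ == '__main__':
    destinations = np.random.rand(10000, 3) * 100
    sources = np.random.rand(10000, 3) * 100
    print(multiprocessed(sources, destinations)[:5])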

Does Python/Scipy have a firls( ) replacement (i.e. a weighted, least squares, FIR filter design)?

I am porting code from Matlab to Python and am having trouble finding a replacement for the firls( ) routine. It is used for least-squares, linear-phase Finite Impulse Response (FIR) filter design.
I looked at scipy.signal and nothing there looked like it would do the trick. Of course I was able to replace my remez and freqz algorithms, so that's good.
On one blog I found an algorithm that implemented this filter without weighting, but I need one with weights.
Thanks, David
The firls equivalent in python now appears to be implemented as part of the signal package:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.firls.html#scipy.signal.firls
Also I agree with everything that @pev hall stated in his answer below, especially how firls is optimum in many situations (such as when overall signal-to-noise is being optimized for a given number of taps), and with his advice not to use the boxcar window; they are not equivalent at all! firls generally outperforms all window and frequency-sampling approaches to filter design when designing traditional FIR filters.
To my current understanding, scipy.signal and Octave only support odd-length (even-order) least-squares filters. In cases where I need an even-length (Type II or Type IV) filter, I resort to the windowed design approach, with a Kaiser window specifically. I have found the Kaiser window solution to come quite close to the optimum least-squares solution.
This blog post contains code detailing how to use scipy.signal to implement FIR filters.
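For reference, a minimal weighted least-squares design with scipy looks roughly like this (band edges are normalized to the Nyquist frequency, which is scipy's default, and the numbers are just illustrative):

import numpy as np
from scipy.signal import firls, freqz

numtaps = 73                      # firls requires an odd number of taps
bands   = [0.0, 0.2, 0.3, 1.0]    # pass band 0-0.2, stop band 0.3-1.0
desired = [1.0, 1.0, 0.0, 0.0]    # desired gain at each band edge
weight  = [1.0, 10.0]             # weight the stop band 10x more heavily
taps = firls(numtaps, bands, desired, weight=weight)

w, h = freqz(taps)                # inspect the resulting frequency response
print(20 * np.log10(np.maximum(np.abs(h), 1e-12))[w > 0.3 * np.pi].max())  # rough worst-case stop-band level in dB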
Obviously, this post is somewhat dated, but maybe it is still interesting for some:
I think there are two near-equivalents to firls in Python:
You can try the firwin function with window='boxcar'. This is similar to Matlab, where fir1 with a boxcar window delivers the same (or at least very similar) results as firls.
You could also try the firwin2 method (the frequency-sampling method, similar to fir2 in Matlab), again using window='boxcar'.
I did try one example from the Matlab firls reference and achieved near-identical results for:
Matlab:
F = [0 0.3 0.4 0.6 0.7 0.9];
A = [0 1 0 0 0.5 0.5];
b = firls(24,F,A,'hilbert');
Python:
import scipy.signal as sig
F = [0, 0.3, 0.4, 0.6, 0.7, 0.9, 1]
A = [0, 1, 0, 0, 0.5, 0.5, 0]
bb = sig.firwin2(25, F, A, window='boxcar', antisymmetric=True)
I had to specify the length as 25 taps (Matlab's order 24 corresponds to 25 coefficients) and I also had to add another data point (F = 1, A = 0), which Python insisted upon; the option antisymmetric=True is only necessary for this special case (a Hilbert filter).
This post is really in response to
You can try the firwin function with window='boxcar'...
Don't use boxcar: it means no window at all (it is ideal but only works "ideally" with an infinite number of multipliers - a sinc in time). The whole purpose of using a window is to reduce the number of multipliers required to get good stop-band attenuation. See Window function.
When comparing filters, please use a dB/log scale.
Scipy not having firls (FIR least squares filter) function is a large limitation (as it generates the optimum filter in many situations).
REMEZ has its place, but the flat roll-off is a real killer when you're trying to get the best results (and not just meeting some manager's spec). (Warning: the scipy remez implementation can give amplification in the stop band.)
If you are using Python (or need to use some window) I recommend using the Kaiser window, which gets very good results and can easily be tweaked for your attenuation vs. transition vs. multipliers requirement (attenuation (in dB) ≈ 2.285 * (multipliers - 1) * pi * width + 7.95). Its performance is not quite as good as firls, but it has the benefit of being fast and easy to calculate (great if you don't store the coefficients).
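That rule of thumb is essentially what scipy's kaiserord helper applies, so you don't have to do it by hand. A small sketch (the cutoff, attenuation and transition width below are made-up numbers):

from scipy.signal import kaiserord, firwin

atten_db = 60.0    # desired stop-band attenuation in dB
width = 0.1        # transition width, normalized so that 1.0 is the Nyquist frequency
numtaps, beta = kaiserord(atten_db, width)
taps = firwin(numtaps, 0.3, window=('kaiser', beta))  # low-pass with cutoff at 0.3
print(numtaps, beta)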
I found a firls() implementation attached here in SciPy ticket 648
Minor changes to get it working:
Swap the following two lines:
bands, desired, weight = array(bands), array(desired), array(weight)
if weight==None : weight = ones(len(bands)/2)
import roots from numpy instead of scipy.signal
Since version 0.18 (July 2016), scipy includes an implementation of firls, as scipy.signal.firls.
It seems unlikely that you'll find exactly what you seek already written in Python, but perhaps the Matlab function's help page gives or references a description of the algorithm?
