Long Vector Linear Programming in R?

Hello and thanks in advance. Fresh off the heels of this question, I acquired some more RAM and now have enough memory to fit all the matrices I need to run a linear programming solver. Now the problem is that none of the linear programming packages in R seem to support long vectors (i.e. large matrices).
I've tried the functions Rsymphony_solve_LP, Rglpk_solve_LP and lp from the packages Rsymphony, Rglpk and lpSolve respectively. All report an error similar to the one below:
Error in rbind(const.mat, const.dir.num, const.rhs) :
long vectors not supported yet: bind.c:1544
I also have my code below in case that helps. The constraint matrix mat is my big matrix (7062 rows by 364520 columns), created using the bigmemory package. When I run the line below, the matrix is pulled into memory, and after a while the errors appear.
Rsymph <- Rsymphony_solve_LP(obj,mat[1:nrow(mat),1:ncol(mat)],dir,rhs,types=types,max=max, write_lp=T)
I'm guessing this limitation is hard-coded into each of the three functions? Is there currently a linear programming solver in R, or even Python, that supports long vectors? Should I contact the package maintainers or just edit the code myself? Thanks!

The package lpSolveAPI can solve long-vector linear programming problems. You first declare a linear programming object, then add the constraints:
library(lpSolveAPI)
#Generate Linear Programming Object
lprec <- make.lp(nrow = 0 # Number of Constraints
, ncol = ncol(mat) # Number of Decision Variables
)
#Set Objective Function to Minimize
set.objfn(lprec, obj)
#Add Constraints
#Note Direction and RHS is included along with Constraint Value
for(i in 1:nrow(mat) ){
add.constraint(lprec,mat[i,], dir[i], rhs[i])
print(i)
}
#Set Decision Variable Type
set.type(lprec, c(1:ncol(mat)), type = c("binary"))
#Solve Model
solve(lprec)
#Obtain Solutions
get.total.iter(lprec)
get.objective(lprec)
get.variables(lprec)
There's a good introduction to this package here.
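As for the Python part of the question: scipy.optimize.linprog accepts SciPy sparse matrices for its constraints when using the HiGHS methods, which avoids materializing the huge dense matrix at all (binary decision variables, as used above, would need a MIP interface such as scipy.optimize.milp instead). A minimal sketch, assuming the real constraint matrix is sparse enough to build in CSR format:
import numpy as np
from scipy.optimize import linprog
from scipy.sparse import random as sparse_random

# Toy stand-ins for obj/mat/rhs -- the real matrix is 7062 x 364520;
# a small random sparse matrix is used here just to show the interface
n_con, n_var = 100, 5000
A = sparse_random(n_con, n_var, density=0.01, format="csr")
obj = np.ones(n_var)
rhs = np.full(n_con, 10.0)

# all constraints written as A x <= rhs for simplicity (the real dir mixes directions)
res = linprog(obj, A_ub=A, b_ub=rhs, bounds=(0, 1), method="highs")
print(res.status, res.message)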

Related

free(): invalid pointer Aborted (core dumped)

I'm trying to run my Python program. It seems that it should run smoothly, however I encounter an error that I haven't seen before:
free(): invalid pointer
Aborted (core dumped)
However, I'm not sure how to fix the error, since it doesn't give me much information about the problem itself.
At first I thought it was a problem with the sizes of the tensors in my network, but they are completely fine. I've googled the problem a little and can see it's a problem with allocating memory where I shouldn't, but I don't know how to fix it.
My code is divided into two different files, and I use two libraries: one for the Sinkhorn loss function and one to randomly sample a mesh.
import argparse
import point_cloud_utils as pcu
import time
import numpy as np
import torch
import torch.nn as nn
from fml.nn import SinkhornLoss
import common

def main():
    # x is a tensor of shape [n, 3] containing the positions of the vertices
    x = torch.from_numpy(common.loadpointcloud("sphere.txt"))
    # t is a tensor of shape [n, 3] containing a set of nicely distributed samples in the unit cube
    v, f = common.unit_cube()
    t = torch.from_numpy(pcu.lloyd(v, f, x.shape[0]).astype(np.float32))  # sample randomly a point cloud (cube for now?)
    # The model is a simple fully connected network mapping a 3D parameter point to 3D
    phi = common.MLP(in_dim=3, out_dim=3)
    # Eps is 1/lambda and max_iters is the maximum number of Sinkhorn iterations to do
    emd_loss_fun = SinkhornLoss(eps=1e-3, max_iters=20,
                                stop_thresh=1e-3, return_transport_matrix=True)
    mse_loss_fun = torch.nn.MSELoss()
    # Adam optimizer at first
    optimizer = torch.optim.Adam(phi.parameters(), lr=10e-3)
    fit_start_time = time.time()
    for epoch in range(100):
        optimizer.zero_grad()
        # Do the forward pass of the neural net, evaluating the function at the parametric points
        y = phi(t)
        # Compute the Sinkhorn divergence between the reconstruction (using the fml library) and the target
        # NOTE: The Sinkhorn function expects a batch of b point sets (i.e. tensors of shape [b, n, 3])
        # since we only have 1, we unsqueeze so x and y have dimension [1, n, 3]
        with torch.no_grad():
            _, P = emd_loss_fun(phi(t).unsqueeze(0), x.unsqueeze(0))
        # Project the transport matrix onto the space of permutation matrices and compute the L-2 loss
        # between the permuted points
        loss = mse_loss_fun(y[P.squeeze().max(0)[1], :], x)
        # loss = mse_loss_fun(P.squeeze() @ y, x)  # Use the transport matrix directly
        # Take an optimizer step
        loss.backward()
        optimizer.step()
        print("Epoch %d, loss = %f" % (epoch, loss.item()))
    fit_end_time = time.time()
    print("Total time = %f" % (fit_end_time - fit_start_time))
    # Plot the ground truth, reconstructed points, and a mesh representing the fitted function, phi
    common.visualitation(x, t, phi)

if __name__ == "__main__":
    main()
The error message is:
free(): invalid pointer
Aborted (core dumped)
That again doesn't help me much. I'd appreciate it a lot if someone has any idea what is happening, or knows more about this error.
There is a known issue with importing both open3d and PyTorch; the root cause was unknown when this answer was first written: https://github.com/pytorch/pytorch/issues/19739
Edit: the cause is now actually known, and the recommended solution is to build both packages from source.
A few possible workarounds exist:
(1) Some people have found that changing the order in which you import the two packages can resolve the issue (see the sketch after this list), though in my personal testing both ways crash.
(2) Other people have found compiling both packages from source to help.
(3) Still others have found that moving open3d and PyTorch to be called from separate scripts resolves the issue.
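For example, workaround (1) is nothing more than swapping the two import lines at the top of the script; whether it helps, and which order works, appears to be system-dependent:
# Workaround (1) sketch: try open3d before torch (or the reverse);
# on some systems one of the two orders avoids the crash
import open3d
import torch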
Note for future readers: This bug was filed as issue #21018.
This is not a problem in your Python code. It is a bug in PyTorch (probably) or in Python itself (unlikely, but possible).
free(3) is a C function that releases dynamically allocated memory when it is no longer needed. You cannot (easily) call it from Python, because memory management is a low-level implementation detail normally handled by the Python interpreter. However, you are also using PyTorch, which is written in C++ and C, and does have the ability to directly allocate and free memory.
In this case, some C code has tried to release a block of memory, but the block of memory it tried to release was not dynamically allocated in the first place, which is an error. You should report this behavior to the PyTorch developers. Include as much detail as possible, including the shortest code you can find that reproduces the problem, and the complete output of that program.
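For illustration only, here is a minimal sketch that provokes the same class of error deliberately from Python via ctypes (assuming Linux with glibc; do not run this in a session you care about). Handing free() a pointer that was never returned by malloc() aborts the process with exactly this message:
import ctypes

libc = ctypes.CDLL("libc.so.6")        # assumes Linux/glibc
buf = ctypes.create_string_buffer(16)  # memory NOT allocated by malloc()
libc.free(buf)                         # invalid free -> "free(): invalid pointer", then abort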

SVD calculation error from lapack function when using scikit-learn's Linear Discriminant Analysis class

I'm classifying 2-class, 1-D data using scikit-learn's LDA classifier in a machine learning pipeline I've created. The following exception occurred:
ValueError: Internal work array size computation failed: -10
at the following line:
LinearDiscriminantAnalysis.fit(X,y)
where X = [-5e15, -5e15, -5e15, 5.7e16] and y = [0, 0, 0, 1], both float64 data-type
Additionally the following error was printed to console:
Intel MKL ERROR: Parameter 10 was incorrect on entry to DGESDD
A quick Google search shows that dgesdd is a function in LAPACK, which scikit-learn relies upon. The dgesdd documentation tells us that the function computes the singular value decomposition (SVD) of a real M-by-N matrix A.
Going back to the original exception, I found it was raised in scipy.linalg.lapack.py, in the _compute_lwork function. This function takes another function as input, which in this case I believe is dgesdd. Searching for "-10" on the dgesdd documentation page gives the logic behind this error code, but I don't know Fortran, so I'm not exactly sure what it means.
My bet is that the SVD calculation is failing due to either (1) the large values in the X array, or (2) the fact that three of the values in the X array are exactly the same number.
I will keep reading into SVD and its limitations. Any insight on how to avoid this error would be tremendously appreciated.
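For reference, a minimal snippet reproducing the setup described above (assuming scikit-learn with an MKL-backed SciPy; on affected MKL versions the fit raises the ValueError):
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X = np.array([-5e15, -5e15, -5e15, 5.7e16]).reshape(-1, 1)  # the 2-class, 1-D data
y = np.array([0, 0, 0, 1])
LinearDiscriminantAnalysis().fit(X, y)  # the default solver='svd' ends up in dgesdd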
This is the definition of DGESDD:
subroutine dgesdd (JOBZ, M, N, A, LDA, S, U, LDU, VT, LDVT, WORK, LWORK, IWORK, INFO)
The error you have indicates that the value passed to MKL's implementation of the routine for the 10th parameter, LDVT, the leading dimension of the V**T matrix, does not comply with that routine's expectations.
This could be a bug in Intel's implementation (rather unlikely, assuming there is a battery of tests stress-testing these routines, but not impossible; which version of MKL is this?), or, more likely, a bug in the LDA code:
LDVT is INTEGER
The leading dimension of the array VT. LDVT >= 1;
if JOBZ = 'A' or JOBZ = 'O' and M >= N, LDVT >= N;
if JOBZ = 'S', LDVT >= min(M,N).
Would you please print M, N, LDA, LDU and LDVT?
If you set LDVT properly the workspace analysis will run just fine.
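If you want to run the workspace query outside of scikit-learn, SciPy exposes the LAPACK wrapper directly; a sketch (dgesdd_lwork is the same query that scipy's _compute_lwork performs internally):
from scipy.linalg import lapack

# workspace-size query for a 4x1 matrix, as in the question;
# a negative info (e.g. -10) identifies the offending parameter (LDVT)
work, info = lapack.dgesdd_lwork(4, 1, compute_uv=1, full_matrices=1)
print(work, info)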
Regarding the "Intel MKL ERROR: Parameter 10 was incorrect on entry to DGESDD" problem: it was actually fixed in MKL v2018 Update 4 (Sep 2018). Here is the link to the MKL 2018 bug fix list.
The easiest way to check which version of MKL you are using is to set the environment variable MKL_VERBOSE=1 and look at the output, which will contain this kind of info. E.g.:
MKL_VERBOSE Intel(R) MKL 2019.0 Update 2 Product build 20190118 for Intel(R) 64 architecture Intel(R) Advanced Vector Extensions (Intel(R) AVX) enabled processors, Lnx 2.80GHz lp64 intel_thread
MKL_VERBOSE ZGETRF(85,85,0x13e66f0,85,0x13e1080,0) 6.18ms CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:20
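In Python this amounts to setting the variable before NumPy (and therefore MKL) is loaded; a minimal sketch:
import os
os.environ["MKL_VERBOSE"] = "1"  # must be set before MKL is loaded

import numpy as np
# any LAPACK-backed call now prints MKL_VERBOSE lines including the MKL version
np.linalg.svd(np.random.rand(50, 50))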

Function definition and function call produce syntax error in python 3, even when copied and pasted

I am trying to solve a least-squares fit of a power law spliced to a third-order polynomial in Python using gradient descent. I have computed the gradients with respect to the parameters in Matlab, and the boundary conditions by hand. I am running into a syntax error in my chi-squared minimization algorithm, which must take the boundary conditions into account, and I am not sure how to overcome it. I am doing this for a machine learning class as a somewhat self-directed, self-proposed long-term project. I will not get class credit for it; it is simply something to put on my resume.
def polypowerderiv(x,a1,b1,c1,a2,b2,c2,d2,boundaryx,ydat):
    #need to minimize square of ydat-polypower
    #from Mathematica, to be careful
    gradd2=2*(d2+c2*x+b2*x**2+a2*x**3-ydat)
    gradc2=gradd2*x
    gradb2=gradc2*x
    grada2=gradb2*x
    #again from Mathematica, to be careful
    gradc1=2(c+a1*x**b1-ydat)
    grada1=gradc1*x**b1
    gradb1=grada1*a1*log(x)
    return [np.sum(grada1),np.sum(gradb1),\
            np.sum(gradc1),np.sum(grada2),np.sum(gradb2),\
            np.sum(gradc2),np.sum(gradd2)]

def manualleastabsolutedifference(xdat, ydat, params,seed, maxiter, learningrate):
    chisq=0 #chisq is the L2 error of the fit relative to the ydata
    dof=len(xdat)-len(params)
    xparams=seed
    for step in np.arange(maxiter):
        a1,b1,c1,a2,b2,c2,d2=params
        chisq=polypowerlaw(xdat,params)
        for i in np.arange(len(xdat)):
            grad=np.zeros(len(seed))
            for i in np.arange(seed):
                polypowerlawboundarysolver=\
                    polypowerboundaryconstraint(xdat,a1,b1,c1,a2,b2,c2)
                boundaryx=minimize(polypowerlawboundarysolver,x0=1000)
                #hard coded to be half of len(xdat)
                chisq+=abs(ydat-\
                    polypower(xdat,a1,b1,c1,a2,b2,c2,d2,boundaryx)
                grad=\
                    polypowerderiv(xdat,a1,b1,c1,\
                    a2,b2,c2,d2,boundaryx,ydat)
                params+=learningrate*grad
    return params
The error I get is:
File "", line 14
grad=polypowerderiv(xdat,a1,b1,c1,a2,b2,c2,d2,boundaryx,ydat)
^
SyntaxError: invalid syntax
Also, I'm having some small trouble with formatting; please help. This is one of my first few posts to Stack Overflow ever, after many years of up- and downvotes. Thank you for your extensive help, community.
As per Alan-Fey, you forgot a closing bracket:
chisq+=abs(ydat-\
polypower(xdat,a1,b1,c1,a2,b2,c2,d2,boundaryx)
should be
chisq+=abs(ydat-\
polypower(xdat,a1,b1,c1,a2,b2,c2,d2,boundaryx))

Should I linearize or try to solve the MINLP in python with gurobi or try a completely different approach?

I'm fairly new to this, so I'm just going to ask, hope I'm as precise as possible, and hope you'll think it warrants an answer.
I'm trying to optimize (minimize) a cost/quantity model, where both are continuous variables. Global cost should be minimized, but is dependent on total quantity, which is dependent on specific cost.
My code looks like this so far:
# create model
m = Model('Szenario1')

# create variables
X_WP = {}
X_IWP = {}
P_WP = {}
P_IWP = {}
for year in df1.index:
    X_WP[year] = m.addVar(vtype=GRB.CONTINUOUS, name="Wärmepumpe%d" % year)
    X_IWP[year] = m.addVar(vtype=GRB.CONTINUOUS, name="Industrielle Wärmepumpe%d" % year)
    # Price in year i = Base.price * ((Sum of newly installed capacity + sum of historical capacity)^(math.log(LearningRate)/math.log(2)))
    P_WP[year] = P_WP0 * (quicksum(X_WP[year] for year in df1.index) ** learning_factor)
    P_IWP[year] = m.addVar(vtype=GRB.CONTINUOUS, name="Preis Industrielle Wärmepumpe%d" % year)
X_WP[2016] = 0
X_IWP[2016] = 0

# Constraints and Objectives
for year in df1.index:
    m.addConstr((X_WP[year]*VLST_WP + X_IWP[year]*VLST_IWP == Wärmemenge[year]), name="Demand(%d)" % year)

obj = quicksum(
    ((X_WP[year]-X_WP[year-1])*P_WP[year] + X_WP[year]*Strompreis_WP*VLST_WP) +
    ((X_IWP[year]-X_IWP[year-1])*P_IWP[year] + X_IWP[year]*Strompreis_EHK*VLST_IWP)
    for year in Wärmemenge.index)
m.setObjective(obj, GRB.MINIMIZE)
m.update()
m.optimize()
X is quantity and P is price. WP and IWP are two different technologies (more will be added later). Since X and P are multiplied, the problem is nonlinear, and so far I haven't found a way to feed Gurobi an objective that it can handle.
My research online and on Stack Overflow basically led me to the conclusion that I can either linearize and solve with Gurobi, find another solver that can solve this MINLP, or formulate my objective in a way that Gurobi can solve. Since I've already made myself familiar with Gurobi, that would be my preferred choice.
Any advice on what's best at this point?
Would be highly appreciated!
I'd suggest rewriting your Python code using Pyomo.
This is a general-purpose optimization modeling framework for Python which can construct valid inputs for Gurobi as well as a number of other optimization tools.
In particular, it will allow you to use Ipopt as a backend, which does solve (at least some) nonlinear problems. Even if Ipopt cannot solve your nonlinear problem, using Pyomo will allow you to test that quickly and then easily move back to a linearized representation in Gurobi if things don't work out.
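A minimal sketch of what that could look like; the variables, numbers and the learning-curve constraint below are placeholders rather than your actual model, and it assumes pyomo plus an ipopt binary are installed:
from pyomo.environ import (ConcreteModel, Var, Constraint, Objective,
                           SolverFactory, NonNegativeReals, minimize)

m = ConcreteModel()
m.x = Var(domain=NonNegativeReals)  # quantity, like X_WP
m.p = Var(domain=NonNegativeReals)  # specific cost, like P_WP

m.demand = Constraint(expr=m.x >= 10)                       # placeholder demand
m.learning = Constraint(expr=m.p == 5 * (m.x + 1) ** -0.2)  # placeholder learning curve
m.cost = Objective(expr=m.x * m.p, sense=minimize)          # the bilinear X*P term

SolverFactory("ipopt").solve(m)
print(m.x(), m.p())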

scipy.integrate.ode with numpy.view

I am trying to solve two coupled systems of equations, called here system A and system B. One of these two systems is an ODE system.
To avoid copying the shared data between the two systems, I would like a structure with pointers. For that, I use numpy's view mechanism.
A bit of code :
import numpy as np
import scipy.integrate

t0, t1, dt = 0.0, 5.0, 1.0
data = np.ones((5,2))
data[:,1] *= 2
y = np.array([0.0,0.0])  ### no matter default value
r = scipy.integrate.ode(f)  # f is the ODE right-hand side (its definition is omitted here)
r.set_integrator('dopri5', rtol=1e-3, atol=1e-6)
r.set_f_params(0.05)
#r.set_initial_value(y, t0); r._y = data[2]  ### Apparently equivalent
r.set_initial_value(data[2], t0)  ### Apparently equivalent
print(np.shares_memory(r.y, y))
print(np.shares_memory(r.y, data))
Here, in the initial state, r.y (system A) and data[2] (the variable named data holds the data of system B) are synchronized: if I modify one, the other is also modified, and vice versa. Typing the command r.y.base confirms that r.y is just a view of the array named data. That is the behavior I want.
Now the problem starts. If I advance my ODE system:
while r.successful() and r.t < t1:
    r.integrate(r.t+dt, step=True)
    print(r.t+dt, r.y)
    print(np.shares_memory(r.y, data))
    print(data)
data and r.y are no longer synchronized; r.y is no longer a view of data.
It looks like the integrate function creates a new instance for its attribute r.y rather than updating it in place. I have read the source code of this function
https://github.com/scipy/scipy/blob/v0.19.1/scipy/integrate/_ode.py#L396
but it quickly drops into Fortran code, and my understanding stops there.
How can I solve (or get around) this problem other than by copying the data between r.y and data (which also implies managing the synchronization manually)?
Or is this possibly a bug in scipy?
Thanks for your help.
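A possible workaround, sketched on the loop above: keep data authoritative and copy r.y back into it in place after every step. It costs one small copy per step, but every other view of data stays valid:
while r.successful() and r.t < t1:
    r.integrate(r.t + dt)
    data[2][:] = r.y  # in-place copy: data keeps its own memory
    print(r.t, data[2])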
