I have an optimization problem I wish to solve that has some special characteristics. I have been trying to figure out how to fit it into the mold that SciPy optimize wants, but have been having some trouble. Could someone familiar with the package help me understand how to get what I want out of it?
My optimization formula is
min_A   sum_i P(y_i = 1 | A)
s.t.    A^T A = I

where A is a matrix.
So I make a function opt_funct for the minimization function, but how do I pass it the matrix? Do I need to optimize a vector and then reshape the vector into the matrix within the optimization function?
For the constraint, I can make a function that returns A.T*A - eye(d), but I need to check that this is all zeros. Should I also reshape it as a vector, and will the constraint section of optimize know that every part of that vector needs to be 0?
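For the mechanics, one common pattern is to hand scipy.optimize.minimize the flattened matrix and reshape it back inside both the objective and the constraint. Below is a minimal sketch (not your actual problem): objective_on_matrix is a hypothetical placeholder for sum_i P(y_i = 1 | A), and SLSQP is used because it accepts vector-valued equality constraints.

import numpy as np
from scipy.optimize import minimize

d = 3

def objective_on_matrix(A):
    # placeholder objective; swap in sum_i P(y_i = 1 | A) here
    return np.sum(A)

def opt_funct(a_vec):
    A = a_vec.reshape(d, d)            # rebuild the matrix inside the callback
    return objective_on_matrix(A)

def orth_constraint(a_vec):
    A = a_vec.reshape(d, d)
    G = A.T @ A - np.eye(d)
    return G[np.triu_indices(d)]       # upper triangle suffices: G is symmetric

a0 = np.eye(d).ravel()                 # feasible starting point
res = minimize(opt_funct, a0,
               method="SLSQP",
               constraints=[{"type": "eq", "fun": orth_constraint}])
A_opt = res.x.reshape(d, d)

So yes: optimize the vector and reshape it inside the functions, and yes: the constraint function may return a vector. SLSQP treats each component of that vector as a separate equality, so every returned entry of A.T*A - eye(d) is driven to zero; the redundant lower-triangle entries are dropped here only because the matrix is symmetric.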
I have a large sparse square non-normal matrix: 73080 rows, but only 6 nonzero entries per row (all equal to 1.0). I'd like to compute the two largest eigenvalues, as well as the operator (2-)norm, ideally with Python. The natural way for me to store this matrix is with scipy's csr_matrix, especially since I'll be multiplying it by other sparse matrices. However, I don't see a good way to compute the relevant quantities: scipy.sparse.linalg's norm method doesn't implement the 2-norm, converting to a dense matrix seems like a bad idea, and scipy.sparse.linalg.eigs seems to run extremely, maybe prohibitively, slowly, while computing lots of data that I just don't need. I suppose I could subtract off the spectral projector corresponding to the top eigenvalue, but then I'd still need the top eigenvalue of the new matrix, which I'd like to obtain with an out-of-the-box method if at all possible; in any event this wouldn't continue to work after multiplying with other large sparse matrices.
However, these kinds of computations seem to be doable: the top of page 6 of this paper seems to have data on the eigenvalues of ~10000-row matrices. If this is not feasible in Python, is there another way I should try to do this? Thanks in advance.
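For what it's worth, here is a minimal sketch of the out-of-the-box route, run on a small random stand-in matrix rather than your actual data: eigs with k=2 restricts ARPACK to the two eigenvalues of largest magnitude, and the operator 2-norm is the largest singular value, which svds computes without ever densifying. Loosening tol on either call trades accuracy for speed.

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# small random stand-in for the 73080-row matrix with ~6 ones per row
n = 5000
M = sp.random(n, n, density=6.0 / n, format="csr", data_rvs=lambda k: np.ones(k))

# two eigenvalues of largest magnitude (complex in general, since M is non-normal)
top_two = spla.eigs(M, k=2, which="LM", return_eigenvectors=False)

# operator 2-norm = largest singular value
two_norm = spla.svds(M, k=1, return_singular_vectors=False)[0]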
I'm solving a non-linear elliptic PDE via linearization + iteration and a finite difference method: basically it comes down to solving a matrix equation Ax = b, where A is a banded matrix. Due to the large size of A (typically ~8 billion elements) I have been using a sparse solver (scipy.sparse.linalg.spsolve). In my code I compute a residual value, which measures the deviation from the true non-linear solution and decreases over successive iterations. It turns out that the values the sparse solver produces differ from what scipy.linalg.solve gives.
Output of normal solver:
Output of sparse solver:
The only difference in my code is the replacement of the solver. I don't think this is down to floating-point errors, since the discrepancy creeps up to the 2nd decimal place (in the last iteration - but the order of magnitude also decreases... so I'm not sure). Any insights on why this might be happening? The final solution, it seems, is not affected qualitatively - but I wonder whether this can create problems.
(No code has been included since the only difference is in the creation of the sparse matrix and the call to the sparse solver. However, if you feel you need to check some part of it, please ask and I will include it.)
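One way to narrow this down, sketched here on a small banded stand-in system (the real A is far too large to densify), is to compare the residuals of the two solutions directly: if both ||Ax - b|| values are tiny, both solvers are satisfying the linear system to comparable accuracy, and a second-decimal difference in a derived quantity would point to ill-conditioning of A rather than a faulty solver.

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla
import scipy.linalg as la

# small tridiagonal stand-in for the banded system
n = 2000
A = sp.diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csc")
b = np.ones(n)

x_sparse = spla.spsolve(A, b)
x_dense = la.solve(A.toarray(), b)   # only feasible at this small test size

print("||A x_sparse - b|| =", np.linalg.norm(A @ x_sparse - b))
print("||A x_dense  - b|| =", np.linalg.norm(A @ x_dense - b))
print("max |x_sparse - x_dense| =", np.max(np.abs(x_sparse - x_dense)))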
I have a Python function that takes a bunch (1 or 2) of arguments and returns a 2D array. I have been trying to use scipy's curve_fit and least_squares to optimize the input arguments so that the resulting 2D array matches another 2D array that has been pre-made. I ran into the problem of both methods returning the initial guess as the converged solution. After ripping much hair from my head, I figured out that the issue is that the small increment applied to the initial guess is too small to make any difference in the 2D array that my function returns (the cell values in the array are quantized, not continuous), so scipy assumes that it has reached convergence (or a local minimum) at the initial guess.
I was wondering if there is a way around this (such as forcing it to use a bigger increment while guessing).
Thanks.
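On the "bigger increment" part of the question: scipy.optimize.least_squares takes a diff_step argument that scales the relative step used for the finite-difference Jacobian. Here is a toy sketch with a hypothetical quantized model; model2d and target2d are stand-ins, not the poster's code. A larger step will not fix a truly stair-stepped objective on its own, but it at least makes each probe big enough that the quantized output actually changes.

import numpy as np
from scipy.optimize import least_squares

x, y = np.meshgrid(np.arange(10), np.arange(10))

def model2d(a, b):
    return np.round(a * x + b * y)     # quantized output, like the real function

target2d = model2d(1.7, 0.4)           # stand-in for the pre-made 2D array

def residuals(params):
    a, b = params
    return (model2d(a, b) - target2d).ravel()

# diff_step sets the *relative* finite-difference step (default ~sqrt(machine eps))
res = least_squares(residuals, x0=[1.0, 1.0], diff_step=0.5)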
I ran into a very similar problem recently, and it turns out that these kinds of optimizers only work for continuously differentiable functions. That's why they return the initial parameters: the function you want to fit cannot be differentiated. In my case, I could manually make my fit function differentiable by first fitting a polynomial function to it before plugging it into the curve_fit optimizer.
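A toy illustration of that polynomial idea (my own sketch with made-up 1D data, not the answerer's code): here the non-differentiability comes from np.round, so a polynomial is fitted to that step once up front, and the smooth surrogate is what curve_fit differentiates.

import numpy as np
from scipy.optimize import curve_fit

# pre-fit a smooth polynomial stand-in for the quantizing step
grid = np.linspace(0.0, 20.0, 400)
smooth_round = np.poly1d(np.polyfit(grid, np.round(grid), deg=5))

def raw_model(x, a, b):
    return np.round(a * x + b)         # not differentiable in a, b

def smooth_model(x, a, b):
    return smooth_round(a * x + b)     # differentiable surrogate

xdata = np.linspace(0.0, 10.0, 50)
ydata = raw_model(xdata, 1.8, 0.7)
popt, _ = curve_fit(smooth_model, xdata, ydata, p0=[1.0, 0.0])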
I am preconditioning a matrix using spilu; however, to pass this preconditioner into cg (the built-in conjugate gradient method) it is necessary to use LinearOperator. Can someone explain the matvec parameter to me, and why I need to use it? Below is my current code:
Ainv=scla.spilu(A,drop_tol= 1e-7)
Ainv=scla.LinearOperator(Ainv.shape,matvec=Ainv)
scla.cg(A,b,maxiter=maxIterations, M = Ainv)
However, this doesn't work and I am given the error TypeError: 'SuperLU' object is not callable. I have played around and tried
Ainv=scla.LinearOperator(Ainv.shape,matvec=Ainv.solve)
instead. This seems to work, but I want to know why matvec needs Ainv.solve rather than just Ainv, and whether that is the right thing to feed LinearOperator.
Thanks for your time
Without having much experience with this part of scipy, some comments:
According to the docs you don't have to use LinearOperator: the preconditioner parameter is documented as M : {sparse matrix, dense matrix, LinearOperator}, so you can pass an explicit matrix too!
The idea/advantage of the LinearOperator:
Many iterative methods (e.g. cg, gmres) do not need to know the individual entries of a matrix to solve a linear system A*x=b; such solvers only require the computation of matrix-vector products (see the docs)
Depending on the task, sometimes even matrix-free approaches are available which can be much more efficient
The working approach you presented is indeed the correct one (other sources do it similarly, and some course materials do it like that)
The point of passing solve() here rather than an inverse matrix is to avoid forming the inverse explicitly (which might be very costly)
A similar idea is very common in BFGS-based optimization algorithms, although the wiki article might not give much insight here
scipy even has an extra LinearOperator for this kind of implicit-inverse idea (although I think it's only used for statistics / finishing off some optimization; I have successfully built LBFGS-based optimizers with it)
(Source: a scicomp.stackexchange discussion of this that doesn't touch scipy)
And because of that, I would assume spilu is going for exactly this approach too (returning an object with a solve method instead of an explicit inverse)
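Putting the working approach together as a self-contained sketch (with a small stand-in system, not the asker's A): spilu returns a SuperLU factorization object, which is not callable, hence the TypeError; its solve method is what applies the approximate inverse to a vector, and that action is exactly what matvec is, namely the function LinearOperator calls whenever cg needs "preconditioner times a vector".

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as scla

# small SPD stand-in system; the wiring pattern is what matters here
n = 1000
A = sp.diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csc")
b = np.ones(n)

ilu = scla.spilu(A, drop_tol=1e-7)                  # SuperLU object with a .solve method
M = scla.LinearOperator(A.shape, matvec=ilu.solve)  # matvec = "apply the preconditioner to v"

x, info = scla.cg(A, b, M=M, maxiter=200)
print("converged" if info == 0 else "stopped, info = %d" % info)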
I have the following problem. Let's say we have x_{jk}, the expression value of gene j in sample k. It is the average of the expression levels s_{ij} across the cell types, weighted by the respective proportions a_{ki} (i = 1, ..., N, where N is the number of disease types):

x_{jk} = sum_{i=1}^{N} a_{ki} * s_{ij}

Generally this can be expressed in matrix form as X = A*S, with X_{kj} = x_{jk}, A_{ki} = a_{ki} and S_{ij} = s_{ij}.

What I want to do is to solve this equation.
How can it be done using Theano?
You can do this in Theano or not do it in Theano. The only thing Theano can help you with here is the gradient of the Euclidean norm, which it can calculate for you, but which is also easy to write by hand. The algorithm that solves the problem you have to implement yourself: either you write down the Lagrangian and solve the dual problem by gradient ascent with projection onto the constraint, or you solve the primal problem directly by gradient descent with projection onto the constraints. You need to program these optimization steps yourself, which is also the case for any other optimization you do in Theano.
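Since the exact constraints were not shown above, here is only a rough plain-NumPy sketch of the "primal gradient descent plus projection" route this answer describes, under the assumption that S is known, the unknown is the proportion matrix A whose rows are non-negative and sum to one, and the objective is the squared Euclidean (Frobenius) norm ||X - A*S||^2. In Theano, only the hand-written gradient line would be replaced by a theano.grad call.

import numpy as np

def project_rows_to_simplex(A):
    # Euclidean projection of each row of A onto {a : a >= 0, sum(a) = 1}
    out = np.empty_like(A)
    for r, v in enumerate(A):
        u = np.sort(v)[::-1]
        css = np.cumsum(u)
        rho = np.nonzero(u + (1.0 - css) / (np.arange(len(v)) + 1) > 0)[0][-1]
        theta = (1.0 - css[rho]) / (rho + 1)
        out[r] = np.maximum(v + theta, 0.0)
    return out

rng = np.random.default_rng(0)
K, N, G = 20, 5, 100                        # samples, types, genes (made-up sizes)
S = rng.random((N, G))                      # assumed-known signature matrix
A_true = project_rows_to_simplex(rng.random((K, N)))
X = A_true @ S                              # observed sample-by-gene matrix

A = np.full((K, N), 1.0 / N)                # start from uniform proportions
lr = 1e-3
for _ in range(5000):
    grad = 2.0 * (A @ S - X) @ S.T          # gradient of ||X - A S||^2 w.r.t. A
    A = project_rows_to_simplex(A - lr * grad)

print("reconstruction error:", np.linalg.norm(X - A @ S))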