I am trying to find the best set of parameters for a model of coupled partial differential equations, i.e. the objective function is not analytical. The underlying rate equations are analytical and must be integrated, so the results depend on the history sent to the set of equations. There are up to 16 parameters. They are bounded, but there are interdependencies that are unknown (otherwise I would encode them as constraints). I have done my best to come up with constant bounds, but there are instances where the optimizer chooses parameters that result in division by zero or infinite values.
I have already tried wrapping the call in try/except, to no avail. Does anyone know of a way I can get scipy.optimize.minimize to reject/ignore a run if these numerical issues show up?
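What I have in mind is something like the following guard, sketched here with a toy objective standing in for the real PDE integration: any run that raises or returns a non-finite value is replaced by a large finite penalty, so the optimizer steps away from it instead of crashing.

```python
import numpy as np
from scipy.optimize import minimize

def model_objective(params):
    # Toy stand-in for the real PDE integration: some parameter
    # combinations divide by zero or blow up to infinity.
    return params[0] ** 2 + 1.0 / (params[0] - params[1])

def guarded_objective(params):
    try:
        value = model_objective(params)
    except (ZeroDivisionError, FloatingPointError, OverflowError):
        return 1e12  # large finite penalty instead of a crash
    # NumPy often returns inf/nan with a warning rather than raising,
    # so check the result explicitly as well.
    return value if np.isfinite(value) else 1e12

res = minimize(guarded_objective, x0=[2.0, 0.0],
               bounds=[(-5, 5), (-5, 5)])
print(res.x, res.fun)
```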
A problem I'm currently working on requires me to optimize some dimension parameters for a structure in order to prevent buckling while still not being over-engineered. I've been able to solve it using iterative (semi-brute-forced) methods; however, I am wondering if there is a way to implement a gradient descent method to optimize the parameters. More background is given below:
Let's say we are trying to optimize three length/thickness parameters, $(t_1, t_2, t_3)$.
We initialize these parameters with some random guess $(t_1, t_2, t_3)_g$. Through some transformation of each of these parameters (weights and biases), the aim is to obtain $(t_1, t_2, t_3)_{ideal}$ such that three main criteria $(R_1, R_2, R_3)_{ideal}$ are met. The criteria are calculated by feeding $(t_1, t_2, t_3)_i$ into some structural equations, where $i$ denotes the values at iteration $i$. Following this, some kind of loss function could be implemented to calculate the error, $(R_1, R_2, R_3)_i - (R_1, R_2, R_3)_{ideal}$.
My confusion lies in the fact that traditionally $(t_1, t_2, t_3)_{ideal}$ would be known, the cost would be a function of the error between $(t_1, t_2, t_3)_{ideal}$ and $(t_1, t_2, t_3)_i$, and subsequent iterations would follow. However, in a case where $(t_1, t_2, t_3)_{ideal}$ is unknown and the known targets $(R_1, R_2, R_3)_{ideal}$ are an indirect function of the inputs, how would gradient descent be implemented? How would minimizing the cost relate to the step change in $(t_1, t_2, t_3)_i$?
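To make the setup concrete, here is a sketch with hypothetical structural equations: the loss is taken over the error in $(R_1, R_2, R_3)$, and a finite-difference gradient propagates it back to $(t_1, t_2, t_3)$.

```python
import numpy as np

R_ideal = np.array([1.0, 2.0, 0.5])  # known targets (hypothetical values)

def structural_R(t):
    # Hypothetical stand-in for the structural equations R(t1, t2, t3).
    return np.array([t[0] * t[1], t[1] + t[2] ** 2, t[0] / (1.0 + t[2])])

def loss(t):
    # Squared error between achieved and target criteria.
    return np.sum((structural_R(t) - R_ideal) ** 2)

def numerical_grad(f, t, h=1e-6):
    # Central finite differences: no analytic dR/dt needed.
    g = np.zeros_like(t)
    for k in range(t.size):
        step = np.zeros_like(t)
        step[k] = h
        g[k] = (f(t + step) - f(t - step)) / (2.0 * h)
    return g

t = np.array([0.5, 0.5, 0.5])  # initial guess (t1, t2, t3)_g
for _ in range(500):
    t -= 0.05 * numerical_grad(loss, t)  # gradient descent step
print(t, loss(t))
```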
I'm having some difficulty understanding how the constraints you're describing are calculated. I'd imagine the quantity you're trying to minimize is the total material used or the cost of construction, not the "error" you describe?
I don't know the details of your specific problem, but it's probably a safe bet that the cost function isn't convex. Any gradient-based optimization algorithm carries the risk of getting stuck in a local minimum. If the cost function isn't computationally intensive to evaluate then I'd recommend you use an algorithm like differential evolution that starts with a population of initial guesses scattered throughout the parameter space. SciPy has a nice implementation of it that allows for constraints (and includes a final gradient-based "polishing" step).
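For instance, a sketch of how that looks in SciPy, with a hypothetical material-cost objective and buckling constraints standing in for your structural equations:

```python
import numpy as np
from scipy.optimize import differential_evolution, NonlinearConstraint

def material_cost(t):
    # Hypothetical objective: total material used.
    return t[0] + t[1] + t[2]

def buckling_margin(t):
    # Hypothetical constraints: each entry must stay >= 0 (no buckling).
    return np.array([t[0] * t[1] - 1.0, t[2] - 0.1, t[0] + t[2] - 0.5])

nlc = NonlinearConstraint(buckling_margin, 0.0, np.inf)
bounds = [(0.01, 5.0)] * 3

# polish=True (the default) runs a final gradient-based refinement.
res = differential_evolution(material_cost, bounds,
                             constraints=(nlc,), seed=1, polish=True)
print(res.x, res.fun)
```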
I have written a constrained optimization function using scipy.optimize.minimize. However, when I run the exact same optimization problem multiple times I get variable results.
I know that it is possible when generating a (pseudo) random number, for example, to provide a seed so that the same numbers are generated on every simulation.
Is there a similar provision for running a constrained optimization problem in scipy?
Or, to put it another way, is there any random component to the scipy.optimize.minimize function?
I am using the SLSQP method and I am running the optimization with the same starting point.
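A quick sanity check, sketched with a toy deterministic objective: if the objective and constraints have no random component, two runs from the same starting point should return identical results.

```python
import numpy as np
from scipy.optimize import minimize

def objective(x):
    # Toy deterministic objective standing in for the real problem.
    return (x[0] - 1.0) ** 2 + (x[1] + 0.5) ** 2

x0 = np.array([3.0, -2.0])
res1 = minimize(objective, x0, method="SLSQP")
res2 = minimize(objective, x0, method="SLSQP")

# With a deterministic objective and constraints, both runs should
# land on exactly the same point.
print(np.array_equal(res1.x, res2.x))
```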
I am looking for an optimisation tool for Python, like pyswarm, which tries to find polynomial coefficients by trying different coefficient values. To guide the search I want to add additional constraints, e.g. the values of the polynomial need to stay between limits, or its gradient may not exceed a certain value. If the optimizer chooses coefficients that don't satisfy these constraints, new coefficients should be chosen immediately instead of using the resulting polynomial.
Background information:
The polynomial produced by the optimizer will then be used by a different program to generate a data set, which is compared against an existing data set. The goal is to match the data as closely as possible.
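To make the idea concrete, here is a sketch of how it could be set up with pyswarm's pso, which takes inequality constraints through f_ieqcons; the target data, coefficient bounds, and limits below are placeholders.

```python
import numpy as np
from pyswarm import pso

x_grid = np.linspace(0.0, 1.0, 50)
target = np.sin(2.0 * np.pi * x_grid)  # placeholder for the existing data set

def objective(c):
    # Placeholder mismatch measure; in reality the external program
    # would generate the data set from the polynomial.
    return np.sum((np.polyval(c, x_grid) - target) ** 2)

def constraints(c):
    # Each entry must be >= 0 for a feasible particle: polynomial
    # bounded within [-2, 2] and |slope| <= 20 everywhere on the grid.
    y = np.polyval(c, x_grid)
    dy = np.gradient(y, x_grid)
    return np.concatenate([2.0 - y, y + 2.0, 20.0 - np.abs(dy)])

lb, ub = [-5.0] * 4, [5.0] * 4  # bounds on 4 cubic coefficients
xopt, fopt = pso(objective, lb, ub, f_ieqcons=constraints)
print(xopt, fopt)
```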
Thanks
I've been running some linear/logistic regression models recently, and I wanted to know how you can output the cost function for each iteration. One of the parameters in scikit-learn's LinearRegression is max_iter, but in reality you need to see cost vs. iteration to find out what this value really needs to be, i.e. whether the benefit is worth the computational time of running more iterations, etc.
I'm sure I'm missing something but I would have thought there was a method that outputted this information?
Thanks in advance!
When fitting any estimator, one has to understand whether there is any iteration (implying a cost function being computed) or an exact analytical solution.
Linear Regression
In fact, Linear Regression, i.e. minimization of the ordinary least squares, is not an algorithm but a minimization problem that can be solved using different techniques. Without getting into the details of the statistical part described here:
There are at least three methods used in practice for computing least-squares solutions: the normal equations, QR decomposition, and singular value decomposition.
As far as I got into the details of the code, it seems that the computational time is spent computing the exact analytical solution, not iterating over a cost function. But I bet the details depend on your system being under-, well- or over-determined, as well as the language and library you are using.
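To illustrate that there is no iteration to report, here is a sketch comparing scikit-learn's fit against a direct least-squares solve with NumPy; the two give the same coefficients.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.5, -2.0, 0.3]) + rng.normal(scale=0.1, size=100)

# scikit-learn: a single direct solve, no iterations to monitor.
coef_sklearn = LinearRegression().fit(X, y).coef_

# The same solve done explicitly via NumPy's least squares
# (an intercept column is appended to match fit_intercept=True).
X1 = np.column_stack([X, np.ones(len(X))])
coef_lstsq, *_ = np.linalg.lstsq(X1, y, rcond=None)

print(coef_sklearn, coef_lstsq[:3])
```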
Logistic Regression
Like Linear Regression, Logistic Regression is a minimization problem that can be solved using different techniques, which, for scikit-learn, are: newton-cg, lbfgs, liblinear and sag.
As you mentioned, sklearn.linear_model.LogisticRegression includes the max_iter argument, meaning it involves iterations*. These stop either when the updated parameters no longer change, up to a certain epsilon value, or when the maximum number of iterations is reached.
*As mentioned in the doc, it involves iterations only for some of the solvers:
Useful only for the newton-cg, sag and lbfgs solvers. Maximum number of iterations taken for the solvers to converge.
In fact, each solver involves its own implementation, such as here for the liblinear solver.
I would recommend using the verbose argument, set to 2 or 3 for maximum verbosity. Depending on the solver, it may print the cost function error. However, I don't understand how you are planning to use this information.
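A minimal sketch of that (the exact output depends on the solver and may or may not include the objective value):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, random_state=0)
# verbose is honored by some solvers (e.g. liblinear, lbfgs) and
# prints solver progress to stdout during fit.
clf = LogisticRegression(solver="liblinear", verbose=2)
clf.fit(X, y)
```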
Another solution might be to code your own solver and print the cost function at each iteration.
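If you go that route, a minimal sketch: plain batch gradient descent on the logistic loss with the cost printed at each iteration (toy data, for illustration only).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_gd(X, y, lr=0.1, n_iter=100):
    # Plain batch gradient descent on the logistic loss,
    # printing the cost at each iteration.
    w = np.zeros(X.shape[1])
    for it in range(n_iter):
        p = sigmoid(X @ w)
        eps = 1e-12  # guard against log(0)
        cost = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
        print(f"iter {it:3d}  cost {cost:.6f}")
        w -= lr * X.T @ (p - y) / len(y)
    return w

# Toy data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)
w = logistic_gd(X, y)
```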
Curiosity killed the cat, but I checked the source code of scikit-learn, which involves much more than this.
First, sklearn.linear_model.LinearRegression uses fit to train its parameters.
Then, in the source code of fit, it uses NumPy's ordinary least squares solver (source).
Finally, NumPy's least squares function uses the function scipy.linalg.lapack.dgelsd, a wrapper to the LAPACK (Linear Algebra PACKage) routine DGELSD, written in Fortran (source).
That is to say, getting at the error calculation, if there is any, is not easy even for scikit-learn developers. However, in the various uses of LinearRegression (and many others) I have had, the trade-off between cost function and iteration time is well addressed.
I am trying to solve a system of coupled iterative equations, each of which contains many integrals and derivatives.
First I used maxima (embedded in Sage) to solve it analytically, but the solution was too dependent on the initial guesses I had made for my unknown functions: constant initial guesses returned almost immediately, while symbolic functions used as initial guesses sent the system deep into calculations that sometimes seemed never-ending.
However, what I tried with Sage was actually a simplified version of my original equations, so I thought I might have no choice but to treat the integrations and derivatives numerically. However, I ran into some problems that couldn't be ignored:
Integrations were only allowed to have numerical limits; variables were not allowed as, e.g., their upper limits. (I thought a numerical algorithm might be faster than the analytic one even if I left a variable or parameter in its calculation, but it just didn't work that way.)
Integrands also couldn't take extra variables and parameters that are not being integrated over.
The derivative function was itself a big obstacle, as I wasn't able to compute partial derivatives or to use a derivative inside the integrand of an integral.
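For concreteness, the kind of construction I need, sketched here with scipy.integrate.quad (hypothetical integrand): a variable upper limit, an extra parameter passed through args, and a derivative of the resulting integral.

```python
import numpy as np
from scipy.integrate import quad

def integrand(t, a):
    # Hypothetical integrand; `a` is not integrated over and is
    # passed through via quad's `args`.
    return np.exp(-a * t) * t

def F(x, a):
    # Variable upper limit: F(x) = integral of integrand from 0 to x.
    value, abserr = quad(integrand, 0.0, x, args=(a,))
    return value

def dF_dx(x, a, h=1e-6):
    # Central finite difference for the partial derivative dF/dx.
    return (F(x + h, a) - F(x - h, a)) / (2.0 * h)

print(F(2.0, 1.5), dF_dx(2.0, 1.5))
```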
To get rid of the problems with the numerical derivative, I substituted the symbolic diff() function for it; the speed improvement was still promising, but the problems with numerical integration persisted.
Now I have three questions:
a- Is it right to conclude there is no other way for me than to discretize the equations and do a completely numerical treatment instead of a mixed one?
b- If so, is there any way to do this automatically? My equations are not differential equations, so I can't use odeint or the like; they are iterative equations, and I have integrals and derivatives only to update my unknowns at each step to their new values.
c- If my calculations are very large, is there any advice on switching from Python to Fortran or something similar?
Best Regards