I am using the minimize function from the scipy.optimize library.
Is there a way to print some values during the optimization procedure? Values like the current x, objective function value, number of iterations and number of gradient evaluations.
I know there are options to save these values and return them after the optimization is over. But can I see them at each step?
The minimize function takes an options dict as a keyword argument. Accepted keys for this dict include disp, which should be set to True to print the progress of the minimization. How much is printed depends on the solver you choose.
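If you want output at every step rather than only what disp prints, the callback argument of minimize is one option. The following is a minimal sketch, assuming the plain callback(xk) signature; the names report and history are made up for illustration, and the Rosenbrock function stands in for your own objective.

```python
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der

history = []  # hypothetical container for the per-iteration values

def report(xk):
    # Called by minimize after each iteration with the current x.
    # Note: this re-evaluates the objective just for printing.
    fval = rosen(xk)
    history.append((xk.copy(), fval))
    print(f"iter {len(history):3d}  f = {fval:.6e}  x = {xk}")

x0 = np.array([1.3, 0.7, 0.8, 1.9, 1.2])
res = minimize(rosen, x0, jac=rosen_der, method="BFGS", callback=report)

# Totals are available on the result once the run has finished.
print(res.nit, res.nfev, res.njev)
```

If the objective is expensive, you could instead print from inside the objective function itself, which avoids the extra evaluation in the callback.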
I am looking for a differential evolution algorithm (hopefully the one from SciPy) that I could use in an unorthodox way. For each generation, I would like the DE to give me all the child members of the new generation in advance, so that I can evaluate them all at once in my objective function.
The reason is that my objective function calls COMSOL. I can run a batch of calculations in COMSOL, which COMSOL will parallelize carefully, so I don't want the DE to parallelize it itself. In the end, I want to calculate all the members in one call to COMSOL. Do you know of a Python package with this kind of freedom?
Thank you for your help!
You can vectorise differential_evolution by using the ability of the workers keyword to accept a map-like callable that is sent the entire population and is expected to return an array with the function values evaluated for the entire population:
from scipy.optimize import rosen, differential_evolution
bounds = [(0, 10), (0, 10)]
def maplike_fun(func, x):
    # x.shape == (S, N), where S is the size of the population and N
    # is the number of parameters. This is where you'd call out from
    # Python to COMSOL, instead of the following line.
    return func(x.T)
res = differential_evolution(rosen, bounds, workers=maplike_fun, polish=False, updating='deferred')
Since SciPy 1.9 there is also a vectorized keyword, which sends the entire population to the objective function at each iteration.
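A rough sketch of the vectorized variant, assuming SciPy 1.9 or later. With vectorized=True the objective receives an array of shape (N, S), one column per candidate, and must return an array of shape (S,). The name batch_objective and the batch_shapes list are illustrative only, and a vectorised Rosenbrock function stands in for the single external batch run.

```python
import numpy as np
from scipy.optimize import differential_evolution

batch_shapes = []  # only here to show what the objective receives

def batch_objective(x):
    # With vectorized=True, x has shape (N, S): one column per candidate.
    batch_shapes.append(x.shape)
    # Stand-in for one external batch run (e.g. a single COMSOL call):
    # a vectorised Rosenbrock function returning an array of shape (S,).
    return np.sum(100.0 * (x[1:] - x[:-1] ** 2) ** 2 + (1.0 - x[:-1]) ** 2, axis=0)

bounds = [(0, 10), (0, 10)]
res = differential_evolution(batch_objective, bounds, vectorized=True,
                             updating='deferred', polish=False, seed=1)
print(res.x, res.fun, len(batch_shapes))
```

Setting polish=False keeps the final local refinement from issuing additional single-point evaluations outside the batches.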
I'm using lmfit to fit some data to a two reaction system in order to estimate the rate constants. My data are the changes in concentration of x1, x2, and x3 species in x1 -> x2 -> x3
In other tools, I generally use a global optimizer followed by a local optimizer so I can more easily get access to the Hessian etc. In lmfit I thought I could do something like:
minimizer = lmfit.Minimizer(self._residuals, params)
result = minimizer.minimize(method='differential_evolution')
result = minimizer.minimize(method='leastsq')
I assumed that the parameters fitted by the differential evolution would remain in the minimizer object and get picked up automatically by the second minimize call.
However, I am not sure, because a colleague of mine suggested otherwise. If anyone knows the lmfit package better: does the second minimize pick up the parameters where the first minimize left off?
Update 1: I added the Minimizer call to show that only one Minimizer object is created. My current tests appear to indicate that the parameter values do get passed from one minimize call to another (which is what I'd expect).
Update 2: Further experiments indicate that if the system is non-identifiable, then there is a difference, meaning that the first call to minimize doesn't appear to pass on its fitted parameters to the second minimize call.
No, the fitted parameters from the first method will not be used in the second minimization with the code you provided.
If you don't pass params to minimizer.minimize(), it will start from the params you supplied when initializing the Minimizer class. The code below should do what you want:
minimizer = lmfit.Minimizer(self._residuals, params)
result_de = minimizer.minimize(method='differential_evolution')
result = minimizer.minimize(params=result_de.params, method='leastsq')
(I am assuming here that self._residuals is your fitting function, i.e., what you want to be minimized.) Please check the lmfit documentation for details.
I am trying to use scipy.optimize.least_squares(fun=my_fun, jac=my_jac, max_nfev=1000) with two callable functions: my_fun and my_jac.
Both functions, my_fun and my_jac, use external software to evaluate their values. This is very time consuming, so I would like to control the number of evaluations of both.
The trf method uses my_fun to evaluate whether the trust region is adequate, and my_jac to determine both the cost function and the Jacobian matrix.
There is an input parameter max_nfev. Does this parameter count only the fun evaluations, or does it also count the jac evaluations?
Moreover, in MATLAB there are two parameters for the lsqnonlin function, MaxIterations and MaxFunctionEvaluations. Do equivalents exist in scipy.optimize.least_squares?
Thanks
Alon
According to the help of scipy.optimize.least_squares, max_nfev is the maximum number of function evaluations before the program exits:
max_nfev : None or int, optional
Maximum number of function evaluations before the termination.
If None (default), the value is chosen automatically:
Again according to the help, there is no MaxIterations argument, but you can set termination tolerances instead: ftol (on the change of the cost function you are minimizing) and xtol (on the change of the solution x).
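One way to check what is being counted on your SciPy version is to count the calls yourself and compare them with the nfev and njev fields of the result. This is only a sketch on synthetic data: the exponential model, the calls dict, and the bodies of my_fun and my_jac are assumptions standing in for the external software.

```python
import numpy as np
from scipy.optimize import least_squares

t = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(-1.5 * t)  # synthetic, noise-free data

calls = {"fun": 0, "jac": 0}  # our own counters

def my_fun(p):
    # Residual vector; stands in for the expensive external evaluation.
    calls["fun"] += 1
    return p[0] * np.exp(p[1] * t) - y

def my_jac(p):
    # Jacobian of the residuals with respect to p, shape (len(t), 2).
    calls["jac"] += 1
    e = np.exp(p[1] * t)
    return np.column_stack((e, p[0] * t * e))

res = least_squares(my_fun, x0=[1.0, -1.0], jac=my_jac, method="trf", max_nfev=50)
print("reported:", res.nfev, res.njev)
print("counted: ", calls)
```

The result reports function and Jacobian evaluations separately, so comparing both against your own counters shows which of them the max_nfev limit applies to.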
You can also use scipy.optimize.minimize(). There you can set maxiter, which goes in the options dictionary.
If you do so, beware that the function you pass must be the scalar cost function, meaning that you will have to code the sum of squares of your residuals yourself.
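A minimal sketch of that alternative, again on made-up synthetic data: the residuals are wrapped in a scalar cost, and the iteration count is capped through options. The model and the names residuals and cost are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

t = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(-1.5 * t)  # synthetic, noise-free data

def residuals(p):
    return p[0] * np.exp(p[1] * t) - y

def cost(p):
    # Scalar least-squares cost: half the sum of squared residuals.
    r = residuals(p)
    return 0.5 * np.dot(r, r)

res = minimize(cost, x0=[1.0, -1.0], method="BFGS", options={"maxiter": 5})
print(res.nit, res.nfev, res.message)
```

Here maxiter limits iterations, not function evaluations, so a single iteration may still trigger several calls to cost (for the line search and the numerical gradient).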
I hope it will be clear and useful to you
EDIT: It looks like this was already answered before here.
It didn't show up in my searches because I didn't know the right nomenclature. I'll leave the question here for now in case someone arrives here because of the constraints.
I'm trying to optimize a function which is flat on almost all points ("steps function", but in a higher dimension).
The objective is to optimize a set of weights, that must sum to one, and are the parameters of a function which I need to minimize.
The problem is that, as the function is flat at most points, gradient techniques fail because they immediately converge on the starting "guess".
My hypothesis is that this could be solved with (a) Annealing or (b) Genetic Algos. Scipy sends me to basinhopping. However, I cannot find any way to use the constraint (the weights must sum to 1) or ranges (weights must be between 0 and 1) using scipy.
Actual question: How can I solve a minimization problem without gradients, and also use constraints and ranges for the input variables?
The following is a toy example (evidently this one could be solved using the gradient):
# import minimize
from scipy.optimize import minimize
# define a toy function to minimize
def my_small_func(g):
    x = g[0]
    y = g[1]
    return x**2 - 2*y + 1
# define the starting guess
start_guess = [.5, .5]
# define the acceptable ranges (for [g1, g2] respectively)
my_ranges = ((0, 1), (0, 1))
# define the constraint (they must always sum to 1)
def constraint(g):
    return g[0] + g[1] - 1
cons = {'type': 'eq', 'fun': constraint}
# minimize
minimize(my_small_func, x0=start_guess, method='SLSQP',
         bounds=my_ranges, constraints=cons)
I usually use R so maybe this is a bad answer, but anyway here goes.
You can solve optimization problems like this using a global optimizer. An example of this is Differential Evolution. The linked method does not use gradients. As for constraints, I usually build them manually. That looks something like this:
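# some dummy function to minimize
def objective_function(params):
    a, b = params
    # if some constraint is not met (compared with a tolerance, since these are floats)
    if abs(a + b - 1) > 1e-9:
        # return a very high value, indicating a very bad fit
        return 1e90
    else:
        # do actual stuff of interest (fit_value is a placeholder for your own computation)
        return fit_value(a, b)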
Then you simply feed this function to the differential evolution routine and that should do the trick. Methods like differential evolution are designed in particular for very high dimensional problems. However, the constraint you mentioned can be a problem, as it will likely result in very many invalid parameter configurations. This is not necessarily a problem for the algorithm, but it simply means you need to do a lot of tweaking and should expect a lot of waiting time. Depending on your problem, you could try optimizing weights/parameters in blocks. That means: optimize the parameters given a set of weights, then optimize the weights given the previous set of parameters, and repeat that many times.
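For the specific case of weights that must sum to one, a different trick than the penalty above is to let the optimizer search over unconstrained positive numbers and normalize them inside the objective, so every candidate is valid by construction. This is a sketch under assumptions: step_objective is an invented piecewise-constant toy, and to_weights and wrapped are hypothetical helper names.

```python
import numpy as np
from scipy.optimize import differential_evolution

def step_objective(w):
    # Piecewise-constant toy: gradients are zero almost everywhere.
    target = np.array([0.2, 0.3, 0.5])
    return float(np.sum(np.abs(np.round(w, 1) - target)))

def to_weights(u):
    # Map positive numbers to weights in [0, 1] that sum to 1.
    return u / u.sum()

def wrapped(u):
    return step_objective(to_weights(u))

# Lower bound slightly above zero so the sum can never be zero.
bounds = [(1e-6, 1.0)] * 3
res = differential_evolution(wrapped, bounds, seed=0, polish=False)
weights = to_weights(res.x)
print(weights, res.fun)
```

The mapping is not one-to-one (scaling u leaves the weights unchanged), which is harmless for a population-based method but means res.x itself has no direct meaning; only the normalized weights do.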
Hope this helps :)
I'm trying to use scipy.optimize.minimize to minimize a complicated function. I noticed in hindsight that the minimize function takes the objective and derivative functions as separate arguments. Unfortunately, I've already defined a function which returns the objective function value and first-derivative values together -- because the two are computed simultaneously in a for loop. I don't think there is a good way to separate my function into two without the program essentially running the same for loop twice.
Is there a way to pass this combined function to minimize?
(FYI, I'm writing an artificial neural network backpropagation algorithm, so the for loop is used to loop over training data. The objective and derivatives are accumulated concurrently.)
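Yes, you can pass them in a single function. Setting jac=True tells minimize that the function returns both the objective value and the gradient:
import numpy as np
from scipy.optimize import minimize
def f(x):
    # returns (objective value, derivative) in one call
    return np.sin(x) + x**2, np.cos(x) + 2*x
sol = minimize(f, [0], jac=True, method='L-BFGS-B')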
Something that might work is to memoize the function, meaning that if it gets called with the same inputs a second time, it simply returns the outputs corresponding to those inputs without doing any actual work the second time. Behind the scenes, the results are cached. In the context of a nonlinear program, there could be thousands of calls, which implies a large cache. Memoizers usually let you specify a cache size limit, with old entries evicted as new ones arrive. In other words, you still benefit fully in your particular case, because the inputs will be the same only when the function value and the derivative are requested at the same point, around the same time. So a small cache should suffice.
You don't say whether you are using Python 2 or Python 3. In Python 3.2+, you can use functools.lru_cache as a decorator to provide this memoization. Then, you write your code like this:
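import functools
@functools.lru_cache(maxsize=16)
def original_fn(x):
    # x must be hashable (e.g. a tuple), because lru_cache uses it as the cache key
    blah
    return fnvalue, fnderiv
def new_fn_value(x):
    fnvalue, fnderiv = original_fn(tuple(x))
    return fnvalue
def new_fn_deriv(x):
    fnvalue, fnderiv = original_fn(tuple(x))
    return fnderiv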
Then you pass each of the new functions to minimize. You still have a penalty because of the second call, but it will do no work if x is unchanged. You will need to research what unchanged means in the context of floating point numbers, particularly since the change in x will fall away as the minimization begins to converge.
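To make the idea concrete, here is a small runnable sketch using a simple analytic function in place of the real computation. The names value_and_grad, fun, jac and the calls counter are illustrative; the NumPy array is converted to a tuple because lru_cache needs hashable arguments.

```python
import functools
import numpy as np
from scipy.optimize import minimize

calls = {"n": 0}  # counts how often the real work is done

@functools.lru_cache(maxsize=4)
def value_and_grad(x_key):
    # x_key is a tuple so that it can serve as a cache key.
    calls["n"] += 1
    x = np.asarray(x_key)
    value = float(np.sum(np.sin(x) + x**2))
    grad = np.cos(x) + 2 * x
    return value, grad

def fun(x):
    return value_and_grad(tuple(x))[0]

def jac(x):
    # Copy so the optimizer cannot modify the cached array.
    return value_and_grad(tuple(x))[1].copy()

res = minimize(fun, [0.0], jac=jac, method="BFGS")
info = value_and_grad.cache_info()
print(res.x, calls["n"], info.hits)
```

The cache key here is the exact floating point value of x, so a hit only occurs when the optimizer asks for the value and the derivative at precisely the same point.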
There are lots of recipes for memoization in py2.x if you look around a bit.
Did I make any sense at all?