I am trying to use scipy.optimize.least_squares(fun=my_fun, jac=my_jac, max_nfev=1000) with two callables, my_fun and my_jac.
Both functions, my_fun and my_jac, call external software to evaluate their values. This is very time consuming, so I would like to control the number of evaluations of each.
The trf method uses the my_fun function to evaluate whether the trust region is adequate, and the my_jac function to determine both the cost function and the Jacobian matrix.
There is an input parameter max_nfev. Does this parameter count only the fun evaluations, or does it also count the jac evaluations?
Moreover, in MATLAB the lsqnonlin function has two separate parameters, MaxIterations and MaxFunctionEvaluations. Does anything equivalent exist in scipy.optimize.least_squares?
Thanks
Alon
According to the scipy.optimize.least_squares documentation, max_nfev is the maximum number of function evaluations before the program exits:
max_nfev : None or int, optional
Maximum number of function evaluations before the termination.
If None (default), the value is chosen automatically:
Again, according to the documentation, there is no MaxIterations argument, but you can set a tolerance on f (ftol), the function you want to minimize, or on x (xtol), the solution, to control when the code exits.
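For illustration, here is a minimal sketch of such a call; the residual and Jacobian below are toy stand-ins for the expensive external evaluations, and the tolerance values are arbitrary:

import numpy as np
from scipy.optimize import least_squares

def my_fun(x):
    # toy stand-in for the external residual evaluation
    return np.array([x[0] - 1.0, 10.0 * (x[1] - x[0] ** 2)])

def my_jac(x):
    # toy stand-in for the external Jacobian evaluation
    return np.array([[1.0, 0.0],
                     [-20.0 * x[0], 10.0]])

x0 = np.array([-1.0, 2.0])
res = least_squares(my_fun, x0, jac=my_jac, method='trf',
                    max_nfev=1000, ftol=1e-8, xtol=1e-8)
print(res.nfev, res.njev)

Note that the returned result object reports nfev (residual evaluations) and njev (Jacobian evaluations) separately, so you can at least check how many of each were actually performed.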
You can also use scipy.optimize.minimize(). There you can set a maxiter entry in the options dictionary.
If you do so, beware that the function you pass must be your cost function, meaning that you will have to code the least-squares objective yourself.
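A minimal sketch of that approach, again with a toy residual standing in for the expensive external call:

import numpy as np
from scipy.optimize import minimize

def cost(x):
    # least-squares objective: sum of squared residuals
    r = np.array([x[0] - 1.0, 10.0 * (x[1] - x[0] ** 2)])
    return np.dot(r, r)

# maxiter is passed through the options dictionary
res = minimize(cost, np.array([-1.0, 2.0]), method='BFGS',
               options={'maxiter': 100})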
I hope this is clear and useful to you.
I had no idea how to phrase the title of this question, so apologies for any confusion there. I am using the pymanopt package for optimization and would like to be able to create a function/method that allows for a generalized input (a variable number of input arrays). To use pymanopt, one has to provide a cost function defined in terms of the arrays that are to be optimized in order to minimize the cost.
For example, a cost function could be:
@pymanopt.function.Autograd
def f(A, B):
    return ((X - A @ B.T) ** 2).sum()
To do the optimization, the variable X is defined prior to f, then f is supplied as the cost function to the pymanopt solver. Optimization is done with respect to the arguments of f and these arrays are returned by pymanopt with values that minimize the cost function.
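Put together, the setup described above looks roughly like the following sketch, assuming the pymanopt 0.2-era API that matches the decorator style used here; the manifold choice (a product of two Euclidean factors) and the array sizes are illustrative assumptions, not taken from the question:

import autograd.numpy as anp
import numpy as np
import pymanopt
from pymanopt import Problem
from pymanopt.manifolds import Euclidean, Product
from pymanopt.solvers import SteepestDescent

m, n, k = 5, 4, 2
X = np.random.randn(m, n)  # fixed data, defined before the cost function

manifold = Product([Euclidean(m, k), Euclidean(n, k)])

@pymanopt.function.Autograd
def cost(A, B):
    return anp.sum((X - anp.dot(A, B.T)) ** 2)

problem = Problem(manifold=manifold, cost=cost)
solver = SteepestDescent()
result = solver.solve(problem)  # point on the product manifold, one array per factor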
Ideally, I would like to be able to do this definition more dynamically. So instead of defining a function in terms of hard-coded arrays, I would like to be able to supply a list of variables to be optimized. So if my cost function were instead:
@pymanopt.function.Autograd
def f(L):
    return ((X - np.linalg.multi_dot(L)) ** 2).sum()
Where the arrays A,B,...,C would be stored in a list, L. However, as far as I can tell, the variables to be optimized have to be directly defined as individual arrays in the cost function supplied to the solver.
The only thing I can think of doing is to define the cost function by creating a string that contains the 'hard coded' function and executing it via exec() with something like this:
args = ','.join(['A{}'.format(i) for i in range(len(L))])
exec('@pymanopt.function.Autograd\n'
     'def f({}):\n'
     '    return ((X - np.linalg.multi_dot([{}]))**2).sum()'.format(args, args))
but I understand that using this method should be avoided if possible. Any advice for navigating this sort of problem is greatly appreciated - thanks! Please let me know if anything is unclear/doesn't make sense.
I am setting up to use SciPy's basin-hopping global optimizer. Its documentation for parameter T states
T: float, optional
The “temperature” parameter for the accept or reject criterion. Higher “temperatures” mean that larger jumps in function value will be accepted. For best results T should be comparable to the separation (in function value) between local minima.
When it says "function value", does that mean the expected return value of the cost function func? Or the value passed to it? Or something else?
I read the source, and I see where T is passed to the Metropolis acceptance criterion, but I do not understand how it is used when converted to "beta".
I'm unfamiliar with the algorithm, but if you keep reading the documentation on the link you posted you'll find this:
Choosing T: The parameter T is the “temperature” used in the Metropolis criterion. Basinhopping steps are always accepted if func(xnew) < func(xold). Otherwise, they are accepted with probability: exp( -(func(xnew) - func(xold)) / T ). So, for best results, T should be comparable to the typical difference (in function values) between local minima. (The height of “walls” between local minima is irrelevant.)
So I believe T should be set based on the values of the function you are trying to optimize, func, specifically the typical difference between its local minima. This makes sense if you look at that probability expression: you are comparing a difference in function values to what is meant to be a kind of "upper bound" for the step. For example, if one local minimum is func = 10 and another is func = 14, you might consider T = 4 to be an appropriate value.
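To make that concrete, here is the acceptance probability for that hypothetical pair of minima (the numbers are just the ones from the example above):

import numpy as np

f_old, f_new, T = 10.0, 14.0, 4.0
# always accepted if f_new < f_old; otherwise accepted with this probability
p_accept = np.exp(-(f_new - f_old) / T)
print(p_accept)  # about 0.37, so an uphill move of roughly size T is still accepted fairly often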
I am using the minimize function from scipy.optimize library.
Is there a way to print some values during the optimization procedure? Values like the current x, objective function value, number of iterations and number of gradient evaluations.
I know there are options to save these values and return them after the optimization is over. But can I see them at each step?
The minimize function takes an options dict as a keyword argument. Accepted keys for this dict include disp, which should be set to True to print the progress of the minimization.
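A minimal sketch of that, together with the callback argument from the minimize signature, which is called with the current x after each iteration and can therefore print per-step values (the Rosenbrock test function and starting point are just illustrative):

import numpy as np
from scipy.optimize import minimize, rosen

def report(xk):
    # called after each iteration with the current parameter vector
    print("current x:", xk)

res = minimize(rosen, np.array([1.3, 0.7]), method='BFGS',
               callback=report, options={'disp': True})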
I'm trying to use scipy.optimize.minimize to minimize a complicated function. I noticed in hindsight that the minimize function takes the objective and derivative functions as separate arguments. Unfortunately, I've already defined a function which returns the objective function value and first-derivative values together -- because the two are computed simultaneously in a for loop. I don't think there is a good way to separate my function into two without the program essentially running the same for loop twice.
Is there a way to pass this combined function to minimize?
(FYI, I'm writing an artificial neural network backpropagation algorithm, so the for loop is used to loop over training data. The objective and derivatives are accumulated concurrently.)
Yes, you can pass them in a single function:
import numpy as np
from scipy.optimize import minimize

def f(x):
    # with jac=True, minimize expects f to return (objective value, gradient)
    return np.sin(x) + x**2, np.cos(x) + 2*x

sol = minimize(f, [0], jac=True, method='L-BFGS-B')
Something that might work: you can memoize the function, meaning that if it gets called with the same inputs a second time, it will simply return the same outputs corresponding to those inputs without doing any actual work the second time. What is happening behind the scenes is that the results are getting cached. In the context of a nonlinear program there could be thousands of calls, which implies a large cache. Often with memoizers you can specify a cache limit, and the population will be managed FIFO-style. IOW, you still benefit fully in your particular case, because the inputs will be the same only when you need to return the function value and the derivative around the same point in time. So what I'm getting at is that a small cache should suffice.
You don't say whether you are using Python 2 or Python 3. In Python 3.2+, you can use functools.lru_cache as a decorator to provide this memoization. Then you write your code like this:
import functools

@functools.lru_cache(maxsize=8)  # a small cache is enough here
def original_fn(x):
    # ... expensive computation of both the value and the derivative ...
    return fnvalue, fnderiv

def new_fn_value(x):
    fnvalue, fnderiv = original_fn(x)
    return fnvalue

def new_fn_deriv(x):
    fnvalue, fnderiv = original_fn(x)
    return fnderiv
Then you pass each of the new functions to minimize. You still have a penalty because of the second call, but it will do no work if x is unchanged. You will need to research what unchanged means in the context of floating point numbers, particularly since the change in x will fall away as the minimization begins to converge.
There are lots of recipes for memoization in py2.x if you look around a bit.
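One practical wrinkle: functools.lru_cache hashes its arguments, and minimize passes a NumPy array, which is not hashable, so you may instead need a small hand-rolled cache that simply remembers the last x. A sketch of that variation (the expensive function here is a toy stand-in):

import numpy as np
from scipy.optimize import minimize

def expensive(x):
    # stand-in for the loop that accumulates value and derivative together
    return np.sum(np.sin(x) + x ** 2), np.cos(x) + 2 * x

_last = {'x': None, 'result': None}

def _cached(x):
    # recompute only when x actually changes
    if _last['x'] is None or not np.array_equal(x, _last['x']):
        _last['x'] = np.copy(x)
        _last['result'] = expensive(x)
    return _last['result']

def new_fn_value(x):
    return _cached(x)[0]

def new_fn_deriv(x):
    return _cached(x)[1]

sol = minimize(new_fn_value, np.array([3.0]), jac=new_fn_deriv, method='L-BFGS-B')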
Did I make any sense at all?
TL;DR: How to minimize a fairly smooth function that returns an integer value (not a float)?
>>> import scipy.optimize as opt
>>> opt.fmin(lambda p: 0.1*p[0]**2 + 0.1*p[1]**2, (-10, 9))
Optimization terminated successfully.
         Current function value: 0.000000
         Iterations: 49
         Function evaluations: 92
array([ -3.23188819e-05,  -1.45087583e-06])
>>> opt.fmin(lambda p: int(0.1*p[0]**2 + 0.1*p[1]**2), (-10, 9))
Optimization terminated successfully.
         Current function value: 17.000000
         Iterations: 17
         Function evaluations: 60
array([-9.5 ,  9.45])
Trying to minimize a function that accepts floating-point parameters but returns an integer, I'm running into a problem where the solver terminates too early. This effect is demonstrated in the examples above: notice that when the returned value is rounded to an int, the evaluation terminates prematurely.
I assume that this is happening because it detects no change in the derivative, i.e. the first time it changes a parameter, the change it makes is too small, the difference between the first result and the second is exactly 0.00000000000, and this incorrectly indicates that a minimum has been found.
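For instance, the integer cast creates wide flat plateaus; around the point where fmin stopped above, nearby inputs all map to the same value (a quick illustrative check):

>>> g = lambda p: int(0.1*p[0]**2 + 0.1*p[1]**2)
>>> g([-9.5, 9.45]), g([-9.4, 9.40]), g([-9.3, 9.35])
(17, 17, 17)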
I've had better luck with optimize.anneal, but despite its integer-valued return I've plotted some regions of the function in three dimensions and it's actually pretty smooth. Therefore, I was hoping that a derivative-aware minimizer would work better.
I've reverted to manually graphing to explore the space, but I'd like to introduce a couple more parameters so it'd be great if I could get this working.
The function I'm trying to minimize can't be made to return a float. It's counting the number of successful hits from a cross-validation, and I'm having the optimizer alter parameters on the model.
Any ideas?
UPDATE
Found a similar question: How to force larger steps on scipy.optimize functions?
In general, minimization over an integer space is an entirely different field called integer programming (or discrete optimization). The addition of integer constraints creates quite a few algorithmic difficulties that render continuous methods unfit. Look into scipy.optimize.anneal.