Custom convergence criterion in scipy.optimize - python

I am optimising a function using scipy.optimize in the following manner:
yEst=minimize(myFunction, y0, method='L-BFGS-B', tol=1e-6).x
My problem is that I don't want to stop simply when the change falls below a tolerance (e.g. stop at the nth iteration if |y_n - y_(n-1)| < tol). Instead I have a slightly more complex function of y_n and y_(n-1), say tolFun, and I want to stop when tolFun(y_n, y_(n-1)) < tol.
To give more detail, my tolerance function is the following. It partitions y into chunks and checks whether any individual partition has a norm difference within tolerance; if any does, the minimisation should stop.
import numpy as np

# Takes in the previous and current iteration values and a pre-specified fixed scalar r.
def tolFun(yprev, ycurr, r):
    # The minimum norm so far (initialised to a big value)
    minnorm = 5000
    for i in np.arange(r):
        # Work out the norm of the difference over the ith partition/block of entries
        norm = np.linalg.norm(yprev[np.arange(r) + i*r] - ycurr[np.arange(r) + i*r])
        # Update minimum norm
        minnorm = min(norm, minnorm)
    return minnorm
My question is similar to this question here, but differs in that the linked user needed only the current iteration's value of y, whereas my custom tolerance function needs both the current iteration's value of y and the previous one. Does anyone know how I could do this?

You cannot do directly what you want, since the callback function receives only the current parameter vector. To solve your problem you can modify the second solution from https://stackoverflow.com/a/30365576/8033585 (which I prefer to the first solution that uses global) in the following way:
class Callback:
    def __init__(self, tolfun, tol=1e-8):
        self._tolf = tolfun
        self._tol = tol
        self._xk_prev = None
    def __call__(self, xk):
        if self._xk_prev is not None and self._tolf(xk, self._xk_prev) < self._tol:
            return True
        self._xk_prev = xk
        return False
cb = Callback(tolfun=tolFun, tol=tol) # set tol here to control convergence
yEst = minimize(myFunction, y0, method='L-BFGS-B', tol=0, callback=cb)
or
yEst = optimize.minimize(
myFunction, y0, method='L-BFGS-B',
callback=cb, options={'gtol': 0, 'ftol': 0}
)
You can find available options for a solver/method using:
optimize.show_options('minimize', 'L-BFGS-B')
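For completeness, here is a minimal sketch of how the pieces fit together (a toy quadratic objective stands in for myFunction; since tolFun also takes the block count r, it is wrapped in a lambda). Note that whether returning True from the callback actually terminates the run depends on the method and SciPy version, so check the documentation of your version:
import numpy as np
from scipy.optimize import minimize

r = 4                          # number of blocks (and block size), so y has length r*r
y0 = np.random.rand(r * r)     # starting point

def myFunction(y):
    # toy objective with minimum at y = 1
    return np.sum((y - 1.0) ** 2)

cb = Callback(tolfun=lambda a, b: tolFun(a, b, r), tol=1e-6)
yEst = minimize(myFunction, y0, method='L-BFGS-B', tol=0, callback=cb).x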

Related

Parameter Optimization Using Minimize in Python

I have written the following two functions to calibrate a model.
The main function is:
def function_Price(para,y,t,T,tau,N,C):
    # y = price array
    # C = auto- and cross-correlation array
    # a = parameters that need to be calibrated
    a = para[0:]
    temp = 0
    for j in range(N):
        price_j = a[j]*C[j]*P[t:T-tau,j]
        temp = temp + price_j
    Price = temp
    return Price
The objective function is:
def GError_function_Price(para,y,k,t,T,tau,N,C):
    # k is the price series to be fitted
    return sum((function_Price(para,y,t,T,tau,N,C)-k[t+tau:T]) ** 2)
Now, I am calling these two functions to do the optimization of the model:
import numpy as np
from scipy.optimize import minimize

# Prices (example)
y = np.array([[1,2,3,4,5,4], [4,5,6,7,8,9], [6,7,8,7,8,6], [13,14,15,11,12,19]])

# Correlation (example)
Corr = np.array([[1,2,3,4,5,4], [4,5,6,7,8,9], [6,7,8,7,8,6], [13,14,15,11,12,19], [1,2,3,4,5,4], [6,7,8,7,8,6]])

# Define
tau = 1
Size = y.shape
N = Size[1]
T = Size[0]
t = 0

# Initial values
para = np.zeros(N)

# Bounds
B = np.zeros(shape=(N,2))
for n in range(N):
    B[n][0] = float('-inf')
    B[n][1] = float('inf')

# Calibration
A = np.zeros(shape=(N,N))
for i in range(N):
    k = y[:,i]      # the series to be fitted
    C = Corr[i,:]
    parag = minimize(GError_function_Price(para,y,k,t,T,tau,N,C), para, method='SLSQP', bounds=B)
    A[i,:] = parag.x
Once I run the model, it should produce an N-by-N array of optimized parameter values, but except for the first column it keeps zeros for the rest. Something is wrong.
Can you help me fix the problem, please?
I know how to do it in Matlab.
The following is Matlab Code :
The main function:
function Price = function_Price(para,P,t,T,tau,N,C)
    a = para(:,:);
    temp = 0;
    for j = 1:N
        price_j = a(j).*C(j).*P(t:T-tau,j);
        temp = temp + price_j;
    end
    Price = temp;
end
The objective function:
function gerr = GError_function_Price(para,P,Y,t,T,tau,N,C)
    gerr = sum((function_Price(para,P,t,T,tau,N,C)-Y(t+tau:T)).^2);
end
Now, I call these two functions in the following way:
P = [1,2,3,4,5,4;4,5,6,7,8,9;6,7,8,7,8,6;13,14,15,11,12,19];
AutoAndCrossCorr= [1,2,3,4,5,4;4,5,6,7,8,9;6,7,8,7,8,6;13,14,15,11,12,19;1,2,3,4,5,4;6,7,8,7,8,6];
tau=1;
Size = size(P);
N =6;
T =4;
t=1;
for i = 1:N
    Y = P(:,i);   % fitted one
    C = AutoAndCrossCorr(i,:);
    para = zeros(1,N);
    lb = repmat(-inf,N,1);
    ub = repmat(inf,N,1);
    parag = fminsearchbnd(@(para)abs(GError_function_Price(para,P,Y,t,T,tau,N,C)),para,lb,ub);
    a(i,:) = parag;
end
The problem seems to be that you're passing the result of a function call to minimize, rather than the function itself. The arguments get passed by the args parameter. So instead of:
minimize(GError_function_Price(para,y,k,t,T,tau,N,C),para,method='SLSQP',bounds=B)
the following should work:
minimize(GError_function_Price,para,args=(y,k,t,T,tau,N,C),method='SLSQP',bounds=B)
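To see the args mechanism in isolation, here is a tiny self-contained sketch with a hypothetical least-squares objective (sse, X and target are made-up names, not from the question):
import numpy as np
from scipy.optimize import minimize

def sse(params, X, target):
    # sum of squared errors between X @ params and target
    return np.sum((X @ params - target) ** 2)

X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
target = np.array([1.0, 2.0, 3.0])

# minimize calls sse(params, X, target); the extra arguments come from args
res = minimize(sse, np.zeros(2), args=(X, target), method='SLSQP')
print(res.x)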

How to pass the i-th value of an array to scipy.integrate using dopri5 (Runge-Kutta 45)?

I am trying to use the scipy.integrate Python module to solve a simple first-order-in-time ODE for V (the voltage on a capacitor):
def f(t, V, Vin, vtime, R):
    return (Vin[vtime==t][0]-V)/(R*get_alpha(V))

R = .5
V0 = 0.
t0, dt, tmax = vtime[0], vtime[1]-vtime[0], vtime[-1]
result, time = np.zeros_like(V), vtime
r = ode(f).set_integrator('dopri5')
r.set_initial_value(V0, t0).set_f_params(V, vtime, R)
i = 0
while (r.successful()) & (r.t < tmax):
    result[i] = r.integrate(r.t+dt)[0]
    i += 1
As you can see, my right-hand-side function depends on an input voltage (stored in an array Vin) and a constant resistance (R), both of which I need to pass to the function inside the solver at each time step.
The example given on the scipy documentation page is not clear enough to me as I am not able to simply call r.set_f_params(Vin, R).
What is the proper way to set those parameters?
You will need to implement some interpolation formula for Vin. In the simplest case,
def getVin(t):
    k = int((t - t0Vin) / dtVin)
    return Vin[k]
where of course you have to provide the parameters t0Vin, dtVin of the sampling times of the Vin samples.
For more general situations use the interpolation function numpy.interp or scipy.interpolate.interp1d.
Parameters other than the time and the state can be set with r.set_initial_value(V0, t0).set_f_params(Vin, vtime, R); they are then passed to f positionally, after t and V, in the order given.
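Putting this together, here is a hedged sketch of the whole pattern with toy values for Vin, vtime and R, and with a plain constant C standing in for the question's get_alpha(V):
import numpy as np
from scipy.integrate import ode

vtime = np.linspace(0.0, 1.0, 101)      # sampling times of the input
Vin = np.sin(2*np.pi*5*vtime)           # toy input-voltage samples
R, C = 0.5, 1.0                         # toy resistance and capacitance

def f(t, V, Vin, vtime, R):
    # interpolate the input at the solver's (continuous) time t
    vin_t = np.interp(t, vtime, Vin)
    return (vin_t - V)/(R*C)

t0, dt, tmax = vtime[0], vtime[1]-vtime[0], vtime[-1]
r = ode(f).set_integrator('dopri5')
r.set_initial_value(0.0, t0).set_f_params(Vin, vtime, R)   # extra args follow (t, V)

result = []
while r.successful() and r.t < tmax:
    result.append(r.integrate(r.t + dt)[0])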

Avoid evaluating the function with the same input multiple times

I am trying to use scipy.optimize.fsolve to find the root of a function. I noticed that the function is evaluated with the same value multiple times at the beginning and the end of the iteration steps. For example, when the following code is evaluated:
from scipy.optimize import fsolve
def yy(x):
    print(x)
    return x**2 + 9*x + 20
y = fsolve(yy,22.)
print(y)
The following output is obtained:
[ 22.]
[ 22.]
[ 22.]
[ 22.00000033]
[ 8.75471707]
[ 4.34171812]
[ 0.81508685]
[-1.16277103]
[-2.42105811]
[-3.17288066]
[-3.61657372]
[-3.85653348]
[-3.96397335]
[-3.99561793]
[-3.99984826]
[-3.99999934]
[-4.]
[-4.]
[-4.]
Therefore the function is evaluated with 22. three times, which is unnecessary.
This is especially annoying when the function requires substantial evaluation time. Could anyone please explain this and suggest how to avoid this issue?
The first evaluation is done only to check the shape and data type of the output of the function. Specifically, fsolve calls _root_hybr which contains the line
shape, dtype = _check_func('fsolve', 'func', func, x0, args, n, (n,))
Naturally, _check_func calls the function:
res = atleast_1d(thefunc(*((x0[:numinputs],) + args)))
Since only the shape and data type are retained from this evaluation, the solver will be calling the function with the value x0 again when the actual root-finding process begins.
The above accounts for one extraneous call (out of two). I did not track down the other one, but it's conceivable that the FORTRAN code does some kind of preliminary check of its own. This sort of thing happens when algorithms written long ago get wrapped over and over again.
If you really want to save these two evaluations of an expensive function yy, one way is to compute the value yy(x0) separately and store it. For example:
def yy(x):
    if x == x0 and y0 is not None:
        return y0
    print(x)
    return x**2 + 9*x + 20

x0 = 22.
y0 = None
y0 = yy(x0)
y = fsolve(yy, x0)
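A more general variant (a sketch of my own, not part of the answer above) is to memoize every evaluation in a dictionary, so no repeated input is ever recomputed, whichever calls the solver repeats:
import numpy as np
from scipy.optimize import fsolve

_cache = {}

def yy_cached(x):
    key = float(np.ravel(x)[0])   # x arrives as a 1-element array
    if key not in _cache:
        print(x)                  # only printed for genuinely new inputs
        _cache[key] = key**2 + 9*key + 20
    return _cache[key]

y = fsolve(yy_cached, 22.)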
I realised that an important reason for this issue is that fsolve is not meant for such a problem: solvers should be chosen wisely :)
multivariate: fmin, fmin_powell, fmin_cg, fmin_bfgs, fmin_ncg
nonlinear: leastsq
constrained: fmin_l_bfgs_b, fmin_tnc, fmin_cobyla
global: basinhopping, brute, differential_evolution
local: fminbound, brent, golden, bracket
n-dimensional: fsolve
one-dimensional: brenth, ridder, bisect, newton
scalar: fixed_point

Scipy odeint giving index out of bounds errors

I am trying to solve a differential equation in Python using Scipy's odeint function. The equation is of the form dy/dt = w(t) where w(t) = w1*(1+A*sin(w2*t)) for some parameters w1, w2, and A. The code I've written works for some parameters, but for others I get index out of bounds errors.
Here's some example code that works
import numpy as np
import scipy.integrate as integrate
t = np.arange(1000)
w1 = 2*np.pi
w2 = 0.016*np.pi
A = 1.0
w = w1*(1+A*np.sin(w2*t))
def f(y, t0):
    return w[t0]

y = integrate.odeint(f, 0, t)
Here's some example code that doesn't work
import numpy as np
import scipy.integrate as integrate
t = np.arange(1000)
w1 = 0.3*np.pi
w2 = 0.005*np.pi
A = 0.15
w = w1*(1+A*np.sin(w2*t))
def f(y, t0):
    return w[t0]

y = integrate.odeint(f, 0, t)
The only thing that changes between these is that the three parameters w1, w2, and A are smaller in the second, but the second one always gives me the following error
line 13, in f
return w[t0]
IndexError: index 1001 is out of bounds for axis 0 with size 1000
This error continues even after restarting Python and running the second code first. I've tried other parameters; some seem to work, but others give me different index out of bounds errors. Some say 1001 is out of bounds, some say 1000, some say 1008, etc.
Changing the initial condition on y (the second input to odeint, which I have as 0 in the above code) also changes the number in the index error, so it might be that I'm misunderstanding what to put here. I wasn't told what the initial conditions should be, other than that y is used as the phase of a signal, so I presumed it to be initially 0.
What you want to do is
def w(t):
    return w1*(1 + A*np.sin(w2*t))

def f(y, t0):
    return w(t0)
Array indices are typically integers; time arguments and values of solutions of differential equations are typically real numbers. Thus there is some conceptual difficulty in invoking w[t0].
You might also try to integrate the function w directly; there is no inherent difficulty in this example.
As for coupled systems, you solve them as coupled systems.
def w(t):
    return w1*(1 + A*np.sin(w2*t))

def f(y, t):
    wt = w(t)
    return np.array([wt, wt*np.sin(y[1] - y[0])])
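For reference, here is a minimal runnable version of the single-equation case, using the parameters from the failing example but evaluating w(t) instead of indexing the array:
import numpy as np
import scipy.integrate as integrate

t = np.arange(1000)
w1, w2, A = 0.3*np.pi, 0.005*np.pi, 0.15

def w(t):
    return w1*(1 + A*np.sin(w2*t))

def f(y, t0):
    return w(t0)

y = integrate.odeint(f, 0.0, t)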

How to define a general deterministic function in PyMC

In my model, I need to obtain the value of my deterministic variable from a set of parent variables using a complicated python function.
Is it possible to do that?
The following is PyMC3 code which shows what I am trying to do in a simplified case.
import numpy as np
import pymc as pm

# Predefine values on a two-parameter grid (x, w) for a set of i values (1, 2, 3)
idata = np.array([1,2,3])
size = 20
gridlength = size*size
Grid = np.empty((gridlength, 2 + len(idata)))
for x in range(size):
    for w in range(size):
        # A silly version of my real model evaluated on the grid.
        Grid[x*size+w,:] = np.array([x,w] + [(x**i + w**i) for i in idata])

# A function to find the nearest value in Grid and return its product with the third variable z
def FindFromGrid(x, w, z):
    return Grid[int(x)*size + int(w), 2:] * z

# Generate fake Y data with error
yerror = np.random.normal(loc=0.0, scale=9.0, size=len(idata))
ydata = Grid[16*size+12, 2:]*3.6 + yerror   # i.e. true x=16, w=12 and z=3.6

with pm.Model() as model:
    # Priors
    x = pm.Uniform('x', lower=0, upper=size)
    w = pm.Uniform('w', lower=0, upper=size)
    z = pm.Uniform('z', lower=-5, upper=10)

    # Expected value
    y_hat = pm.Deterministic('y_hat', FindFromGrid(x, w, z))

    # Data likelihood
    ysigmas = np.ones(len(idata))*9.0
    y_like = pm.Normal('y_like', mu=y_hat, sd=ysigmas, observed=ydata)

    # Inference...
    start = pm.find_MAP()          # find starting value by optimization
    step = pm.NUTS(state=start)    # instantiate MCMC sampling algorithm
    trace = pm.sample(1000, step, start=start, progressbar=False)   # draw 1000 posterior samples using NUTS sampling

print('The trace plot')
fig = pm.traceplot(trace, lines={'x': 16, 'w': 12, 'z': 3.6})
fig.show()
When I run this code, I get an error at the y_hat stage, because the int() function inside FindFromGrid(x,w,z) needs an integer, not a FreeRV.
Finding y_hat from a pre-calculated grid is important because my real model for y_hat does not have an analytical form.
I have earlier tried to use OpenBUGS, but I found out here that it is not possible to do this in OpenBUGS. Is it possible in PyMC?
Update
Based on an example on the pyMC GitHub page, I found that I need to add the following decorator to my FindFromGrid(x,w,z) function.
@pm.theano.compile.ops.as_op(itypes=[t.dscalar, t.dscalar, t.dscalar], otypes=[t.dvector])
This seems to solve the above-mentioned issue, but I cannot use the NUTS sampler any more since it needs a gradient.
Metropolis does not seem to be converging.
Which step method should I use in a scenario like this?
You found the correct solution with as_op.
Regarding the convergence: are you using pm.Metropolis() instead of pm.NUTS() by any chance? One reason this might not converge is that Metropolis() by default samples in the joint space, while Gibbs-within-Metropolis is often more effective (and this was the default in pymc2). Having said that, I just merged https://github.com/pymc-devs/pymc/pull/587 which changes the default behavior of the Metropolis and Slice samplers to be non-blocked (i.e. within Gibbs). Other samplers like NUTS that are primarily designed to sample the joint space still default to blocked. You can always set this explicitly with the kwarg blocked=True.
Anyway, update pymc with the most recent master and see if convergence improves. If not, try the Slice sampler.
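As a hedged sketch of trying that advice on the question's model (API names as in PyMC3-era code; adjust to whatever version you have installed):
# continue from the model defined in the question
with model:
    step = pm.Slice()    # or pm.Metropolis() after updating to a non-blocked default
    trace = pm.sample(5000, step=step, start=start, progressbar=False)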
