I am trying to set up an implicit solver for a pendulum. F and dF are defined as:
def func(y, t):
    ### Simplified the function to remove friction since it canceled out
    x, v = y[:3], y[3:6]
    grav = np.array([0., 0., -9.8])
    lambd = (grav.dot(x) + v.dot(v))/x.dot(x)
    return np.concatenate([v, grav - lambd*x])
def dF_matrix(y):
    n = y.size
    dF = np.zeros((6,6))
    xp = np.array([y[3], y[4], y[5]])[np.newaxis]
    mass = 1.
    F1 = 0.
    F2 = 0.
    F3 = -mass*9.8
    F = np.array([F1, F2, F3])[np.newaxis]
    phix = 2.*y[0]
    phiy = 2.*y[1]
    phiz = 2.*y[2]
    G = np.array([phix, phiy, phiz])[np.newaxis]
    H = 2.*np.eye(3)
    lambd = (mass*np.dot(xp, np.dot(H, xp.T)) + np.dot(F, G.T))/np.dot(G, G.T)
    dF[0,3] = 1
    dF[1,4] = 1
    dF[2,5] = 1
    dF[3,0] = (y[0]*F1 + 2*lambd)/mass
    dF[3,1] = (y[0]*F2)/mass
    dF[3,2] = (y[0]*F3)/mass
    dF[3,3] = phix*y[3]
    dF[3,4] = phix*y[4]
    dF[3,5] = phix*y[5]
    dF[4,0] = (y[1]*F1)/mass
    dF[4,1] = (y[1]*F2 + 2*lambd)/mass
    dF[4,2] = (y[1]*F3)/mass
    dF[4,3] = phiy*y[3]
    dF[4,4] = phiy*y[4]
    dF[4,5] = phiy*y[5]
    dF[5,0] = (y[2]*F1)/mass
    dF[5,1] = (y[2]*F2)/mass
    dF[5,2] = (y[2]*F3 + 2*lambd)/mass
    dF[5,3] = phiz*y[3]
    dF[5,4] = phiz*y[4]
    dF[5,5] = phiz*y[5]
    return dF
I am trying to set up an implicit solver that uses the function and its Jacobian dF to iterate on guesses, in order to create a more accurate solver for my pendulum function. The error that I get is the following:
Traceback (most recent call last):
line 206, in test_function(backwards_Euler)
line 186, in test_function y1 = test_function(func, y0, t)
line 166, in backwards_Euler F = np.asarray(y[i] + dt * function(zold, time[i+1])-zold)
line 13, in func lambd = (grav.dot(x)+v.dot(v))/x.dot(x)
ValueError: shapes (3,6) and (3,6) not aligned: 6 (dim 1) != 3 (dim 0)
I think it is because one of the calculations I am doing is diverging to infinity, but I cannot figure out the issue that might be causing this.
Here is my current backward Euler function:
### Backwards Euler Method
def backwards_Euler(function, y_matrix, time):
    y = np.zeros((np.size(time), np.size(y_matrix)))
    y[0, :] = y_matrix
    dt = time[1] - time[0]
    for i in range(len(time)-1):
        err = 1
        zold = y[i] + dt*function(y[i], time[i])  ## guess with forward Euler
        I = 0
        while err > 10**(-10) and I < 5:
            F = y[i] + dt * function(zold, time[i+1]) - zold  ## Here is where my error occurs
            dF = dt*dF_matrix(y[i+1]) - 1
            znew = zold - F/dF
            zold = znew
            I += 1
        y[i+1] = znew
    return y
Here is how I am testing the function, although currently I have no output:
# initial condition
y0 = np.array([0.0, 1.0, 0.0, 0.8, 0.0, 1.2])

def test_function(test_function):
    print(test_function.__name__ + "...")
    nt = 2500
    time_start = process_time()
    # time points
    t = np.linspace(0, 25, nt)
    # solve ODE
    y1 = test_function(func, y0, t)
    time_elapsed = (process_time() - time_start)
    print('elapsed time', time_elapsed)
    # compute residual:
    r = y1[:, 0] ** 2 + y1[:, 1] ** 2 + y1[:, 2] ** 2 - 1
    rmax1 = np.max(np.abs(r))
    print('error', rmax1)
    fig = plt.figure()
    ax = plt.axes(projection='3d')
    ax.plot3D(y1[:, 0], y1[:, 1], y1[:, 2], 'gray')
    plt.show()

test_function(backwards_Euler)
No, the error is that the matrix shapes in some matrix operations don't match. The immediate cause is that x.dot(x) works when x is a simple one-dimensional array, but not for two-dimensional arrays, where the rules of matrix multiplication have to be strictly obeyed.
However, there is no reason for a matrix to appear in place of x; you do not call the function in a "vectorized" fashion. The true reason is that Python is not Matlab: the division in
znew = zold - F/dF
does not do what you think it does. Instead it performs elementwise division, repeating the smaller object to fill the shape of the larger one. The same happens with the subtraction, so that znew in the end has the same shape as the matrix dF, with the observed results in the next ODE function evaluation.
You need to explicitly invoke a linear solver. The standard one in np.linalg (there are others there and in scipy.sparse.linalg using other matrix factorizations) is simply called solve(A, b) and computes the solution of A*x = b.
znew = zold - np.linalg.solve(dF, F)
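For reference, here is a minimal sketch of how the corrected Newton loop could look inside the backward Euler method, assuming the six-dimensional pendulum state and the dF_matrix above (note the identity matrix in place of the scalar 1 when forming the Jacobian of the defect):

def backwards_Euler_fixed(function, y_init, time):
    y = np.zeros((np.size(time), np.size(y_init)))
    y[0, :] = y_init
    Id = np.eye(len(y_init))                           # identity, not the scalar 1
    dt = time[1] - time[0]
    for i in range(len(time)-1):
        z = y[i] + dt*function(y[i], time[i])          # forward Euler guess
        for I in range(5):
            F = y[i] + dt*function(z, time[i+1]) - z   # defect of implicit Euler
            dF = dt*dF_matrix(z) - Id                  # Jacobian of the defect
            z = z - np.linalg.solve(dF, F)             # Newton step via linear solver
        y[i+1] = z
    return y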
Implicit Euler gives a diverging solution: the length of the pendulum increases rapidly. Applying the same ideas to the similar implicit trapezoidal method, which is also the Adams-Moulton method of 2nd order, gives the code
def Adams_Moulton_2nd(function, y_init, time):
    # solve F(z)=0 with simplified Newton, where
    #   F = y + 0.5*dt*(f(y,t)+f(z,t+dt)) - z
    # is the defect of the implicit trapezium method. Using the derivative
    #   dF_z = 0.5*dt*df_z(z,t+dt) - I
    # the non-linear solver step is the solution of the linear system
    #   dF_z*dz = -F
    y = np.zeros((np.size(time), np.size(y_init)))
    y[0, :] = y_init
    Id = np.eye(len(y_init))
    dt = time[1] - time[0]
    dt4 = dt**4
    for i in range(len(time)-1):
        f_0 = function(y[i], time[i])
        norm_0 = sum(f_0**2)**0.5 + 1e-6
        z = y[i] + dt*f_0  ## guess with forward Euler
        dF = 0.5*dt*dF_matrix(z) - Id
        for I in range(5):
            F = y[i] + 0.5*dt*(f_0 + function(z, time[i+1])) - z
            dz = -np.linalg.solve(dF, F)
            if max(abs(dz)) < norm_0*dt4: break
            z += dz
        y[i+1] = z
    return y  ### Adams_Moulton_2nd
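This solver can be exercised through the same test harness as the backward Euler version above:

test_function(Adams_Moulton_2nd)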
I am trying to assign an infinite value to a gekko variable. I have tried numpy's infinite value and Python's own infinity, but it is still not working, because gekko does not recognize it.
The main objective of this idea is to force a variable to be strictly equal to 0, at least in the first iteration of the solver.
from gekko import GEKKO
from numpy import Inf
model=GEKKO()
R=model.FV(value=Inf)
T=model.Array(model.Var,2)
Q=model.FV()
model.Equation(Q==(T[1]-T[0])/R)
model.solve()
And the error I am getting:
Exception: #error: Model Expression
*** Error in syntax of function string: Invalid element: inf
Moreover, sometimes other variables are also required to be infinite, again variables that are located in the denominator of a model equation. This is quite useful for trying different scenarios of the simulation I am working with and checking the system's behavior.
I hope you can help me, thank you.
The large-scale NLP and MINLP solvers don't know how to compute gradients with a np.nan value, so initializing with NaN generally doesn't help. Please post example code that demonstrates the issue you are observing with improved performance from NaN initialization.
Below, four unconstrained optimization methods are compared on the same sample problem. The algorithms do not benefit from NaN as initialization; some solvers substitute NaN with 0 or with a very high or low number. I suggest that you try giving np.nan as an initial condition to these solution methods to see how it affects the search for the minimum.
import matplotlib
import numpy as np
import matplotlib.pyplot as plt
# define objective function
def f(x):
    x1 = x[0]
    x2 = x[1]
    obj = x1**2 - 2.0 * x1 * x2 + 4 * x2**2
    return obj

# define objective gradient
def dfdx(x):
    x1 = x[0]
    x2 = x[1]
    grad = []
    grad.append(2.0 * x1 - 2.0 * x2)
    grad.append(-2.0 * x1 + 8.0 * x2)
    return grad
# Exact 2nd derivatives (hessian)
H = [[2.0, -2.0],[-2.0, 8.0]]
# Start location
x_start = [-3.0, 2.0]
# Design variables at mesh points
i1 = np.arange(-4.0, 4.0, 0.1)
i2 = np.arange(-4.0, 4.0, 0.1)
x1_mesh, x2_mesh = np.meshgrid(i1, i2)
f_mesh = x1_mesh**2 - 2.0 * x1_mesh * x2_mesh + 4 * x2_mesh**2
# Create a contour plot
plt.figure()
# Specify contour lines
lines = range(2,52,2)
# Plot contours
CS = plt.contour(x1_mesh, x2_mesh, f_mesh,lines)
# Label contours
plt.clabel(CS, inline=1, fontsize=10)
# Add some text to the plot
plt.title(r'$f(x)=x_1^2 - 2x_1x_2 + 4x_2^2$')
plt.xlabel(r'$x_1$')
plt.ylabel(r'$x_2$')
##################################################
# Newton's method
##################################################
xn = np.zeros((2,2))
xn[0] = x_start
# Get gradient at start location (df/dx or grad(f))
gn = dfdx(xn[0])
# Compute search direction and magnitude (dx)
# with dx = -inv(H) * grad
delta_xn = np.empty((1,2))
delta_xn = -np.linalg.solve(H,gn)
xn[1] = xn[0]+delta_xn
plt.plot(xn[:,0],xn[:,1],'k-o')
##################################################
# Steepest descent method
##################################################
# Number of iterations
n = 8
# Use this alpha for every line search
alpha = 0.15
# Initialize xs
xs = np.zeros((n+1,2))
xs[0] = x_start
# Get gradient at start location (df/dx or grad(f))
for i in range(n):
    gs = dfdx(xs[i])
    # Compute search direction and magnitude (dx)
    # with dx = - grad but no line searching
    xs[i+1] = xs[i] - np.dot(alpha, gs)
plt.plot(xs[:,0],xs[:,1],'g-o')
##################################################
# Conjugate gradient method
##################################################
# Number of iterations
n = 8
# Use this alpha for the first line search
alpha = 0.15
neg = [[-1.0,0.0],[0.0,-1.0]]
# Initialize xc
xc = np.zeros((n+1,2))
xc[0] = x_start
# Initialize delta_gc
delta_cg = np.zeros((n+1,2))
# Initialize gc
gc = np.zeros((n+1,2))
# Get gradient at start location (df/dx or grad(f))
for i in range(n):
    gc[i] = dfdx(xc[i])
    # Compute search direction and magnitude (dx)
    # with dx = - grad but no line searching
    if i == 0:
        beta = 0
        delta_cg[i] = -np.dot(alpha, dfdx(xc[i]))
    else:
        beta = np.dot(gc[i], gc[i]) / np.dot(gc[i-1], gc[i-1])
        delta_cg[i] = alpha * np.dot(neg, dfdx(xc[i])) + beta * delta_cg[i-1]
    xc[i+1] = xc[i] + delta_cg[i]
plt.plot(xc[:,0],xc[:,1],'y-o')
##################################################
# Quasi-Newton method
##################################################
# Number of iterations
n = 8
# Use this alpha for every line search
alpha = np.linspace(0.1,1.0,n)
# Initialize delta_xq and gamma
delta_xq = np.zeros((2,1))
gamma = np.zeros((2,1))
part1 = np.zeros((2,2))
part2 = np.zeros((2,2))
part3 = np.zeros((2,2))
part4 = np.zeros((2,2))
part5 = np.zeros((2,2))
part6 = np.zeros((2,1))
part7 = np.zeros((1,1))
part8 = np.zeros((2,2))
part9 = np.zeros((2,2))
# Initialize xq
xq = np.zeros((n+1,2))
xq[0] = x_start
# Initialize gradient storage
g = np.zeros((n+1,2))
g[0] = dfdx(xq[0])
# Initialize hessian storage
h = np.zeros((n+1,2,2))
h[0] = [[1, 0.0],[0.0, 1]]
for i in range(n):
    # Compute search direction and magnitude (dx)
    # with dx = -alpha * inv(h) * grad
    delta_xq = -np.dot(alpha[i], np.linalg.solve(h[i], g[i]))
    xq[i+1] = xq[i] + delta_xq
    # Get gradient update for next step
    g[i+1] = dfdx(xq[i+1])
    # Get hessian update for next step
    gamma = g[i+1] - g[i]
    part1 = np.outer(gamma, gamma)
    part2 = np.outer(gamma, delta_xq)
    part3 = np.dot(np.linalg.pinv(part2), part1)
    part4 = np.outer(delta_xq, delta_xq)
    part5 = np.dot(h[i], part4)
    part6 = np.dot(part5, h[i])
    part7 = np.dot(delta_xq, h[i])
    part8 = np.dot(part7, delta_xq)
    part9 = np.dot(part6, 1/part8)
    h[i+1] = h[i] + part3 - part9
plt.plot(xq[:,0],xq[:,1],'r-o')
plt.tight_layout()
plt.savefig('contour.png',dpi=600)
plt.show()
More information is available in the design optimization course.
Response to Edit
Thanks for clarifying the question and for including a source code example. While it isn't possible to include Inf as a guess, an equivalent form with an additional variable x may accomplish the desired behavior. This sets the term (T[1]-T[0])/R equal to zero at the initial iteration.
from gekko import GEKKO
from numpy import Inf
model=GEKKO()
R=model.FV(value=1e20)
T=model.Array(model.Var,2)
x=model.Var(value=0)
Q=model.FV()
model.Equations([x==(T[1]-T[0])/R,
                 Q==x])
model.solve()
I am having some trouble with a model I want to analyze. I am trying to plot two differential equations; however, I am very new to doing this and am not getting it to work. Any help is appreciated.
#Polyaneuploid cell development during cancer
#two eqns
#Fixed Points:
#13.37526
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt
def modelC(C,t):
    λc = 0.0601
    K = 2000
    α = 1 * (10**-4)
    ν = 1 * (10**-6)
    λp = 0.1
    γ = 2

def modelP(P,t):
    λc = 0.0601
    K = 2000
    α = 1 * (10**-4)
    ν = 1 * (10**-6)
    λp = 0.1
    γ = 2
    #returning odes
    dPdt = ((λp))*P(1-(C+(γ*P))/K)+ (α*C)
    dCdt = ((λc)*C)(1-(C+(γ*P))/K)-(α*C) + (ν*P)
    return dPdt, dCdt
#initial conditions
C0= 256
P0 = 0
#time points
t = np.linspace(0,30)
#solve odes
P = odeint(modelP,t,P0, args = (C0,))
C = odeint(modelC,t,C0, args= (P0,))
#P = odeint(modelP, P0 , t)
#P = P[:, 2]
#C = odeint(modelC, C0 , t)
#C = C[:, 2]
#plot results
plt.plot(t,np.log10(C0))
plt.plot(t,np.log10(P0))
plt.xlabel('time in days')
plt.ylabel('x(t)')
plt.show()
This is just what I have so far, and currently I am getting this error: ValueError: diff requires input that is at least one dimensional
Any tips on how to get the graphs to show?
You need to put your initial conditions in a list, and odeint expects them before the time points, like so:
initial_conditions = [C0, P0]
P = odeint(modelP, initial_conditions, t)
You still have an error in your P function, where you try to access C, which is neither defined in the local scope of your function nor passed as an argument.
UPDATED
def modelP(P,t,C):
    λc = 0.0601
    K = 2000
    α = 1 * (10**-4)
    ν = 1 * (10**-6)
    λp = 0.1
    γ = 2
    #returning odes (note the explicit * that was missing in the original expressions)
    dPdt = λp*P*(1-(C+(γ*P))/K) + (α*C)
    dCdt = λc*C*(1-(C+(γ*P))/K) - (α*C) + (ν*P)
    return dPdt, dCdt
#initial conditions
C0= 256
P0 = 0
Pconds = [P0]
#time points
t = np.linspace(0,30)
#solve odes
P = odeint(modelP, Pconds, t, args=(C0,))
The solver deals with flat arrays that carry no inherent meaning in their components. You need to add that meaning (unpack the input vector into the state objects) at the start of the model function, and remove it again (reduce the state to a flat array or list) at the end of the model function.
Here this is simple: the state consists of 2 scalars. Thus a structure for the model function is
def model(X,t):
    P, C = X
    ....
    return dPdt, dCdt
Then integrate as
X = odeint(model,(P0,C0),t)
P,C = X.T
plt.plot(t,P)
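Putting the pieces together with the equations and parameter values from the question gives a sketch like the following (the multiplications that were implicit in the original expressions are written out explicitly):

import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt

def model(X, t):
    P, C = X
    λc, K, α, ν, λp, γ = 0.0601, 2000, 1e-4, 1e-6, 0.1, 2
    dPdt = λp*P*(1 - (C + γ*P)/K) + α*C
    dCdt = λc*C*(1 - (C + γ*P)/K) - α*C + ν*P
    return dPdt, dCdt

t = np.linspace(0, 30)
X = odeint(model, (0, 256), t)   # P0 = 0, C0 = 256
P, C = X.T
plt.plot(t, P, label='P')
plt.plot(t, C, label='C')
plt.xlabel('time in days')
plt.ylabel('x(t)')
plt.legend()
plt.show()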
I am trying to model a system of coupled ODEs which represents a three-box ocean model of phosphorus concentration (y) in the low-latitude surface ocean (Box 1), high-latitude deep ocean (Box 2), and deep ocean (Box 3). The ODEs are given below:
dy1dt = (Q/V1)*(y3-y1) - y1/tau # dP/dt in Box 1
dy2dt = (Q/V2)*(y1-y2) + (qh/V2)*(y3-y2) - (fh/V2) # dP/dt in Box 2
dy3dt = (Q/V3)*(y2-y3) + (qh/V3)*(y2-y3) + (V1*y1)/(V3*tau) + (fh/V3) # dP/dt in Box 3
The constants and box volumes are given by:
### Define Constants
tau = 86400 # s
VT = 1.37e18 # m3
Q = 25e6 # m3/s
qh = 38e6 # m3/s
fh = 0.0022 # mol/m3
avp = 0.00215 # mol/m3
### Calculate Surface Area of Ocean
r = 6.4e6 # m
earth = 4*np.pi*(r**2) # m2
ocean = .70*earth # m2
### Calculate Volumes for Each Box
V1 = .85*100*ocean # m3
V2 = .15*250*ocean # m3
V3 = VT-V1-V2 # m3
This can be put into matrix form y' = Ay + f, where y = [y1, y2, y3]. I have provided the matrices and initial conditions below:
A = np.array([[(-Q/V1)-(1/tau), 0, (Q/V1)],
              [(Q/V2), (-Q/V2)-(qh/V2), (qh/V2)],
              [(V1/(V3*tau)), ((Q+qh)/V3), ((-Q-qh)/V3)]])

f = np.array([[0], [-fh/V2], [fh/V3]])
y1 = y2 = y3 = 0.00215 # mol/m3
I am having trouble adapting the Forward Euler method to apply to a system of linear ODEs, rather than just one. This is what I have come up with so far (it runs with no issues but doesn't work, if that makes sense; I think it has something to do with the initial conditions?):
### Define a Function for the Euler-Forward Scheme
import numpy as np
from tqdm import trange

def ForwardEuler(t0, y0, N, dt):
    N = 100
    dt = 0.1
    # Create empty 2D arrays for t and y
    t = np.zeros([N+1, 3, 3])  # steps, # variables, # solutions
    y = np.zeros([N+1, 3, 3])
    # Assign each ODE to a vector element
    y1 = y[0]
    y2 = y[1]
    y3 = y[2]
    # Set initial conditions for each solution
    t[0, 0] = t0[0]
    y[0, 0] = y0[0]
    t[0, 1] = t0[1]
    y[0, 1] = y0[1]
    t[0, 2] = t0[2]
    y[0, 2] = y0[2]
    for i in trange(int(N)):
        t[i+1] = t[i] + t[i]*dt
        y1[i+1] = y1[i] + ((Q/V1)*(y3[i]-y1[i]) - (y1[i]/tau))*dt
        y2[i+1] = y2[i] + ((Q/V2)*(y1[i]-y2[i]) + (qh/V2)*(y3[i]-y2[i]) - (fh/V2))*dt
        y3[i+1] = y3[i] + ((Q/V3)*(y2[i]-y3[i]) + (qh/V3)*(y2[i]-y3[i]) + (V1*y1[i])/(V3*tau) + (fh/V3))*dt
    return t, y1, y2, y3
Any help on this is greatly appreciated. I have not found any resources online that go through the Forward Euler method for a system of 3 ODEs, and I am at a loss. I am happy to explain further if there are more questions.
As Lutz Lehmann pointed out, you need to treat the ODEs as one system. You can define the whole ODE system inside a single function as follows:
import numpy as np
def fun(t, RHS):
    # get current values of the state variables
    y1 = RHS[0]
    y2 = RHS[1]
    y3 = RHS[2]
    # calculate the rate of each variable
    dy1dt = (Q/V1)*(y3-y1) - y1/tau
    dy2dt = (Q/V2)*(y1-y2) + (qh/V2)*(y3-y2) - (fh/V2)
    dy3dt = (Q/V3)*(y2-y3) + (qh/V3)*(y2-y3) + (V1*y1)/(V3*tau) + (fh/V3)
    # left-hand side of the ODE system
    LHS = np.zeros([3,])
    LHS[0] = dy1dt
    LHS[1] = dy2dt
    LHS[2] = dy3dt
    return LHS
In the above function, we get the time t as an argument and the current values of y1, y2, and y3 as a list in the variable RHS, which is then unpacked into the respective variables. Afterward, the rate equation of each variable is defined. In the end, the calculated rates are returned, also as a list, in the variable LHS.
Now, we can define a simple Euler Forward method to solve this ODE system as follows:
def ForwardEuler(fun, t0, y0, N, dt):
    t = np.arange(t0, N+dt, dt)
    y = np.zeros([len(t), len(y0)])
    y[0] = y0
    for i in range(len(t)-1):
        y[i+1,:] = y[i,:] + fun(t[i], y[i,:]) * dt
    return t, y
Here, we create a time range from 0 to 100 with a step size of 0.1. Afterward, we create an array of zeros with the shape (len(t), len(y0)), which in this case is (1001, 3). We need to do this because we want to solve fun for the whole range of t (1001 points), and the RHS argument of fun has shape (3,) ([y1, y2, y3]). So for each and every point in t, we solve for the three variables of RHS, which are returned as LHS.
In the end, we can solve this ODE system as follows:
dt = 0.1
N = 100
y0 = [0.00215, 0.00215, 0.00215]
t0 = 0
t,y = ForwardEuler(fun,t0,y0,N,dt)
Solution using scipy.integrate
As Lutz Lehmann also pointed out, you can use scipy.integrate for this purpose as well, which is far easier. Here you can reuse the fun defined above and simply solve the ODE as follows:
import numpy as np
from scipy.integrate import odeint
dt = 0.1
N = 100
t = np.linspace(0,N,int(N/dt))
y0 = [0.00215, 0.00215, 0.00215]
res = odeint(fun, y0, t, tfirst=True)
print(res)
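Since the question already defines A and f, the same right-hand side can also be written in matrix form. A small sketch, assuming the A and f arrays from the question (f is flattened to shape (3,) so it broadcasts against y):

def fun_matrix(t, y):
    # y' = A*y + f, equivalent to the component-wise version above
    return A @ y + f.ravel()

res = odeint(fun_matrix, y0, t, tfirst=True)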
I am trying to find the minimum of a natural cubic spline. I have written the following code to compute the natural cubic spline. (I have been given test data and have confirmed this method is correct.) Now I cannot figure out how to find the minimum of this function.
This is the data
xdata = np.linspace(0.25, 2, 8)
ydata = 10**(-12) * np.array([1,2,1,2,3,1,1,2])
This is the function
import scipy as sp
import numpy as np
import math
from numpy.linalg import inv
from scipy.optimize import fmin_slsqp
from scipy.optimize import minimize, rosen, rosen_der
def phi(x, xd, yd):
    n = len(xd)
    h = np.array(xd[1:n] - xd[0:n-1])
    f = np.divide(yd[1:n] - yd[0:(n-1)], h)
    q = [0]*(n-2)
    for i in range(n-2):
        q[i] = 3*(f[i+1] - f[i])
    A = np.zeros(((n-2),(n-2)))
    # define A for j=0
    A[0,0] = 2*(h[0] + h[1])
    A[0,1] = h[1]
    # define A for j = n-2
    A[-1,-2] = h[-2]
    A[-1,-1] = 2*(h[-2] + h[-1])
    # define A for j in the middle
    for j in range(1,(n-3)):
        A[j,j-1] = h[j]
        A[j,j] = 2*(h[j] + h[j+1])
        A[j,j+1] = h[j+1]
    Ainv = inv(A)
    B = Ainv.dot(q)
    b = n*[0]
    b[1:(n-1)] = B
    # now we find a, b, c and d
    a = [0]*(n-1)
    c = [0]*(n-1)
    d = [0]*(n-1)
    s = [0]*(n-1)
    for r in range(n-1):
        a[r] = 1/(3*h[r]) * (b[r+1] - b[r])
        c[r] = f[r] - h[r]*((2*b[r] + b[r+1])/3)
        d[r] = yd[r]
    # solution 1 start
    for m in range(n-1):
        if xd[m] <= x <= xd[m+1]:
            s = a[m]*(x - xd[m])**3 + b[m]*(x-xd[m])**2 + c[m]*(x-xd[m]) + d[m]
            return(s)
    # solution 1 end
I want to find the minimum on the domain of my xdata, so fmin didn't work, as you cannot define bounds there. I tried both fmin_slsqp and minimize. They are not compatible with the phi function I wrote, so I rewrote phi(x, xd, yd) with an extra variable, making it phi(x, xd, yd, m). Here m indicates in which piece of the spline we are calculating a solution (from x_m to x_(m+1)). In the code, we replaced #solution 1 by the following:
# solution 2 start
return(a[m]*(x - xd[m])**3 + b[m]*(x-xd[m])**2 + c[m]*(x-xd[m]) + d[m])
# solution 2 end
To find the minimum on a domain x_m to x_(m+1), we use the following code (here for the instance m=0, so x from 0.25 to 0.5; the initial guess is 0.3):
fmin_slsqp(phi, x0 = 0.3, bounds=([(0.25,0.5)]), args=(xdata, ydata, 0))
What I would then do (I know it's crude) is iterate this with a for loop to find the minimum on all subdomains and then take the overall minimum. However, fmin_slsqp constantly returns the initial guess as the minimum, so something is wrong that I do not know how to fix. If you could help me, it would be greatly appreciated. Thanks for reading this far.
When I plot your function phi with the data you feed in, I see that its range is of the order of 1e-12. However, fmin_slsqp is unable to handle that level of precision and fails to detect any change in your objective.
The solution I propose is scaling the return value of your objective by the inverse of that order of magnitude, like so:
return(s*1e12)
Then you get good results.
>>> sol = fmin_slsqp(phi, x0=0.3, bounds=([(0.25, 0.5)]), args=(xdata, ydata))
>>> print(sol)
Optimization terminated successfully. (Exit mode 0)
Current function value: 1.0
Iterations: 2
Function evaluations: 6
Gradient evaluations: 2
[ 0.25]
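With that scaling in place, the brute-force loop over subintervals that the question describes could look like the following sketch, assuming the modified phi(x, xd, yd, m) from "solution 2" with its return value scaled by 1e12:

candidates = []
for m in range(len(xdata) - 1):
    x_guess = 0.5*(xdata[m] + xdata[m+1])          # start in the middle of the piece
    xmin = fmin_slsqp(phi, x0=x_guess,
                      bounds=[(xdata[m], xdata[m+1])],
                      args=(xdata, ydata, m), iprint=0)
    candidates.append((phi(xmin[0], xdata, ydata, m), xmin[0]))

fval, xbest = min(candidates)                      # overall minimum over all pieces
print('minimum at x =', xbest, 'with value', fval*1e-12)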
I am trying to find a vector that minimizes the residual sum of squares when it multiplies a matrix.
I know of scipy's optimize package (which has a minimize function). However, there is an extra constraint for my code: the sum of all entries of w (see the function below) must equal 1, and no entry of w can be less than 0. Is there a package that does this for me? If not, how can I do this?
Trying to minimize w:
def w_rss(w, x0, x1):
    predictions = np.dot(x0, w)
    errors = x1 - predictions
    rss = np.dot(errors.transpose(), errors).item(0)
    return rss

X0 = np.array([[3,4,5,3],
               [1,2,2,4],
               [6,5,3,7],
               [1,0,5,2]])

X1 = np.array([[4],
               [2],
               [4],
               [2]])

W = np.array([[.0],
              [.5],
              [.5],
              [.0]])

print(w_rss(W, X0, X1))
So far this is my best attempt at looping through possible values of w, but it's not working properly.
def get_w(x0, x1):
    J = x0.shape[1]
    W0 = np.matrix([[1.0/J]*J]).transpose()
    rss0 = w_rss(W0, x0, x1)
    loop = list(range(J))
    for i in loop:
        W1 = W0
        rss1 = rss0
        while rss0 == rss1:
            den = len(loop) - 1
            W1[i][0] += 0.01
            for j in loop:
                if i == j:
                    continue
                W1[j][0] -= 0.01/den
                if W1[j][0] <= 0:
                    loop.remove(j)
            rss1 = w_rss(W1, x0, x1)
            if rss1 < rss0:
                #print W1
                W0 = W1
                rss0 = rss1
    print('--')
    print(rss0)
    print(W0)
    return W0, rss0
The SLSQP code in scipy can do this. You can use scipy.optimize.minimize with method='SLSQP', or you can use the function fmin_slsqp directly. In the following, I use fmin_slsqp.
The scipy solvers generally pass a one-dimensional array to the objective function, so to be consistent, I'll change W and X1 to 1-d arrays, and I'll write the objective function (now called w_rss1) to expect a 1-d argument w.
The condition that all the elements of w must be between 0 and 1 is specified using the bounds argument, and the condition that the sum must be 1 is specified using the f_eqcons argument. The constraint function returns np.sum(w) - 1, so it is 0 when the sum of the elements is 1.
Here's the code:
import numpy as np
from scipy.optimize import fmin_slsqp
def w_rss1(w, x0, x1):
    predictions = np.dot(x0, w)
    errors = x1 - predictions
    rss = (errors**2).sum()
    return rss

def sum1constraint(w, x0, x1):
    return np.sum(w) - 1

X0 = np.array([[3,4,5,3],
               [1,2,2,4],
               [6,5,3,7],
               [1,0,5,2]])
X1 = np.array([4, 2, 4, 2])
W = np.array([.0, .5, .5, .0])

result = fmin_slsqp(w_rss1, W, f_eqcons=sum1constraint, bounds=[(0.0, 1.0)]*len(W),
                    args=(X0, X1), disp=False, full_output=True)
Wopt, fW, its, imode, smode = result
if imode != 0:
    print("Optimization failed: " + smode)
else:
    print(Wopt)
When I run this, the output is
[ 0.05172414 0.55172414 0.39655172 0. ]
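For completeness, here is a sketch of the same problem solved through scipy.optimize.minimize with method='SLSQP', where a constraints dictionary plays the role of f_eqcons:

from scipy.optimize import minimize

cons = {'type': 'eq', 'fun': lambda w: np.sum(w) - 1}
res = minimize(w_rss1, W, args=(X0, X1), method='SLSQP',
               bounds=[(0.0, 1.0)]*len(W), constraints=[cons])
print(res.x)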