Numerical gradient for nonlinear function in numpy/scipy - python

I'm trying to implement an numerical gradient calculation in numpy to be used as the callback function for the gradient in cyipopt. My understanding of the numpy gradient function is that it should return the gradient calculated at a point based on a finite different approximation.
I don't understand how I would able to implement the gradient of a nonlinear function with this module. The sample problem given appears to be a linear function.
>>> f = np.array([1, 2, 4, 7, 11, 16], dtype=np.float)
>>> np.gradient(f)
array([ 1. , 1.5, 2.5, 3.5, 4.5, 5. ])
>>> np.gradient(f, 2)
array([ 0.5 , 0.75, 1.25, 1.75, 2.25, 2.5 ])
My code snippet is as follows:
import numpy as np
# Hock & Schittkowski test problem #40
x = np.mgrid[0.75:0.85:0.01, 0.75:0.8:0.01, 0.75:0.8:0.01, 0.75:0.8:0.01]
# target is evaluation at x = [0.8, 0.8, 0.8, 0.8]
f = -x[0] * x[1] * x[2] * x[3]
g = np.gradient(f)
print g
The other downside of this is that I have to evaluate x at several points (and it returns the gradient at several points)
Is there a better option in numpy/scipy for the gradient to be numerically evaluated at a single point so I can implement this as a callback function?

First of all, some warnings:
numerical-optimization is hard to do right
ipopt is very complex software
combining ipopt with numerical-differentiation sounds like you are asking for trouble, but that depends on your problem of course
ipopt is almost always based on automatic-differentiation tools and not numerical-differentiation!
And some more:
as this is a complex task and the state of python + ipopt is not as nice as in some other languages (julia + JuMP for example), it's a bit of work
And some alternatives:
use pyomo which wraps ipopt and has automatic-differentiation
use casadi which also wraps ipopt and has automatic-differentiation
use autograd to automatically calculate gradients on a subset of numpy-code
then use cyipopt to add those
scipy.minimize with solvers SLSQP or COBYLA which can do everything for you (SLSQP can use equality and inequality constraints; COBYLA only inequality-constraints, where emulating equality-constraints by x >= y + x <= y can work)
Approaching your task with your tools
Your complete example-problem is defined in Test Examples for Nonlinear Programming Codes:
Here is some code, based on numerical-differentiation, solving your test-problem, including the official setup (function, gradients, start-point, bounds, ...)
import numpy as np
import scipy.sparse as sps
import ipopt
from scipy.optimize import approx_fprime
class Problem40(object):
""" # Hock & Schittkowski test problem #40
Basic structure follows:
- cyipopt example from https://pythonhosted.org/ipopt/tutorial.html#defining-the-problem
- which follows ipopt's docs from: https://www.coin-or.org/Ipopt/documentation/node22.html
Changes:
- numerical-diff using scipy for function & constraints
- removal of hessian-calculation
- we will use limited-memory approximation
- ipopt docs: https://www.coin-or.org/Ipopt/documentation/node31.html
- (because i'm too lazy to reason about the math; lagrange and co.)
"""
def __init__(self):
self.num_diff_eps = 1e-8 # maybe tuning needed!
def objective(self, x):
# callback for objective
return -np.prod(x) # -x1 x2 x3 x4
def constraint_0(self, x):
return np.array([x[0]**3 + x[1]**2 -1])
def constraint_1(self, x):
return np.array([x[0]**2 * x[3] - x[2]])
def constraint_2(self, x):
return np.array([x[3]**2 - x[1]])
def constraints(self, x):
# callback for constraints
return np.concatenate([self.constraint_0(x),
self.constraint_1(x),
self.constraint_2(x)])
def gradient(self, x):
# callback for gradient
return approx_fprime(x, self.objective, self.num_diff_eps)
def jacobian(self, x):
# callback for jacobian
return np.concatenate([
approx_fprime(x, self.constraint_0, self.num_diff_eps),
approx_fprime(x, self.constraint_1, self.num_diff_eps),
approx_fprime(x, self.constraint_2, self.num_diff_eps)])
def hessian(self, x, lagrange, obj_factor):
return False # we will use quasi-newton approaches to use hessian-info
# progress callback
def intermediate(
self,
alg_mod,
iter_count,
obj_value,
inf_pr,
inf_du,
mu,
d_norm,
regularization_size,
alpha_du,
alpha_pr,
ls_trials
):
print("Objective value at iteration #%d is - %g" % (iter_count, obj_value))
# Remaining problem definition; still following official source:
# http://www.ai7.uni-bayreuth.de/test_problem_coll.pdf
# start-point -> infeasible
x0 = [0.8, 0.8, 0.8, 0.8]
# variable-bounds -> empty => np.inf-approach deviates from cyipopt docs!
lb = [-np.inf, -np.inf, -np.inf, -np.inf]
ub = [np.inf, np.inf, np.inf, np.inf]
# constraint bounds -> c == 0 needed -> both bounds = 0
cl = [0, 0, 0]
cu = [0, 0, 0]
nlp = ipopt.problem(
n=len(x0),
m=len(cl),
problem_obj=Problem40(),
lb=lb,
ub=ub,
cl=cl,
cu=cu
)
# IMPORTANT: need to use limited-memory / lbfgs here as we didn't give a valid hessian-callback
nlp.addOption(b'hessian_approximation', b'limited-memory')
x, info = nlp.solve(x0)
print(x)
print(info)
# CORRECT RESULT & SUCCESSFUL STATE
Output:
******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
Ipopt is released as open source code under the Eclipse Public License (EPL).
For more information visit http://projects.coin-or.org/Ipopt
******************************************************************************
This is Ipopt version 3.12.8, running with linear solver mumps.
NOTE: Other linear solvers might be more efficient (see Ipopt documentation).
Number of nonzeros in equality constraint Jacobian...: 12
Number of nonzeros in inequality constraint Jacobian.: 0
Number of nonzeros in Lagrangian Hessian.............: 0
Total number of variables............................: 4
variables with only lower bounds: 0
variables with lower and upper bounds: 0
variables with only upper bounds: 0
Total number of equality constraints.................: 3
Total number of inequality constraints...............: 0
inequality constraints with only lower bounds: 0
inequality constraints with lower and upper bounds: 0
inequality constraints with only upper bounds: 0
Objective value at iteration #0 is - -0.4096
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
0 -4.0960000e-01 2.88e-01 2.53e-02 0.0 0.00e+00 - 0.00e+00 0.00e+00 0
Objective value at iteration #1 is - -0.255391
1 -2.5539060e-01 1.28e-02 2.98e-01 -11.0 2.51e-01 - 1.00e+00 1.00e+00h 1
Objective value at iteration #2 is - -0.249299
2 -2.4929898e-01 8.29e-05 3.73e-01 -11.0 7.77e-03 - 1.00e+00 1.00e+00h 1
Objective value at iteration #3 is - -0.25077
3 -2.5076955e-01 1.32e-03 3.28e-01 -11.0 2.46e-02 - 1.00e+00 1.00e+00h 1
Objective value at iteration #4 is - -0.250025
4 -2.5002535e-01 4.06e-05 1.93e-02 -11.0 4.65e-03 - 1.00e+00 1.00e+00h 1
Objective value at iteration #5 is - -0.25
5 -2.5000038e-01 6.57e-07 1.70e-04 -11.0 5.46e-04 - 1.00e+00 1.00e+00h 1
Objective value at iteration #6 is - -0.25
6 -2.5000001e-01 2.18e-08 2.20e-06 -11.0 9.69e-05 - 1.00e+00 1.00e+00h 1
Objective value at iteration #7 is - -0.25
7 -2.5000000e-01 3.73e-12 4.42e-10 -11.0 1.27e-06 - 1.00e+00 1.00e+00h 1
Number of Iterations....: 7
(scaled) (unscaled)
Objective...............: -2.5000000000225586e-01 -2.5000000000225586e-01
Dual infeasibility......: 4.4218750883118219e-10 4.4218750883118219e-10
Constraint violation....: 3.7250202922223252e-12 3.7250202922223252e-12
Complementarity.........: 0.0000000000000000e+00 0.0000000000000000e+00
Overall NLP error.......: 4.4218750883118219e-10 4.4218750883118219e-10
Number of objective function evaluations = 8
Number of objective gradient evaluations = 8
Number of equality constraint evaluations = 8
Number of inequality constraint evaluations = 0
Number of equality constraint Jacobian evaluations = 8
Number of inequality constraint Jacobian evaluations = 0
Number of Lagrangian Hessian evaluations = 0
Total CPU secs in IPOPT (w/o function evaluations) = 0.016
Total CPU secs in NLP function evaluations = 0.000
EXIT: Optimal Solution Found.
[ 0.79370053 0.70710678 0.52973155 0.84089641]
{'x': array([ 0.79370053, 0.70710678, 0.52973155, 0.84089641]), 'g': array([ 3.72502029e-12, -3.93685085e-13, 5.86974913e-13]), 'obj_val': -0.25000000000225586, 'mult_g': array([ 0.49999999, -0.47193715, 0.35355339]), 'mult_x_L': array([ 0., 0., 0., 0.]), 'mult_x_U': array([ 0., 0., 0., 0.]), 'status': 0, 'status_msg': b'Algorithm terminated successfully at a locally optimal point, satisfying the convergence tolerances (can be specified by options).'}
Remarks about the code
We use scipy's approx_fprime which basically was added for all those gradient-based optimizers in scipy.optimize
As stated in the sources; i did not take care about ipopt's need for the hessian and we used ipopts hessian-approximation
the basic idea is described at wiki: LBFGS
I did ignore ipopts need for sparsity structure of the Jacobian of the constraints
a default-assumption: the default hessian structure is of a lower triangular matrix is used and i won't give any guarantees on what can happen here (bad performance vs. breaking everything)

I think you have some kind of misunderstanding about what is a mathematical function and what is its numerical implementation.
You should define your function as:
def func(x1, x2, x3, x4):
return -x1*x2*x3*x4
Now you want to evaluate your function at specific points, which you can do using the np.mgrid you provided.
If you want to compute your gradient, use copy.misc.derivative(https://docs.scipy.org/doc/scipy/reference/generated/scipy.misc.derivative.html) (watch out the default parameters for dx is usually bad, change it to 1e-5. There is no difference between linear and non-linear gradient for the numerical evaluation, only that for non linear function the gradient won't be the same everywhere.
What you did was with np.gradient was actually to compute the gradient from the point in your array, the definition of your function being hidden by your definition of f, thus not allowing for multiple gradient evaluation at different points. Also using your method makes you dependant of your discretisation step.

Related

Constrained optimisation in scipy enters restricted area

I am trying to solve multivariate optimisation problem using python with scipy.
Let me define enviroment I am working in:
searched parameters:
and the problem itself:
(In my case logL function is complex, so i will substitute it with the trivial one, generating similar issue. Therefore in this example I am not using function parameters fully, but I am including those, for problem consistency).
I am using following convention on storing parameters in single, flat array:
Here is the script, that was supposed to solve my problem.
import numpy as np
from scipy import optimize as opt
from pprint import pprint
from typing import List
_d = 2
_tmax = 500.0
_T = [[1,2,3,4,5], [6,7,8,9]]
def logL(args: List[float], T : List[List[float]], tmax : float):
# simplified - normaly using T in computation, here only to determine dimension
d = len(T)
# trivially forcing args to go 'out-of constrains'
return -sum([(args[2 * i] + args[2 * i + 1] * tmax)**2 for i in range(d)])
def gradientForIthDimension(i, d, t_max):
g = np.zeros(2 * d + 2 * d**2)
g[2 * i] = 1.0
g[2 * i + 1] = t_max + 1.0
return g
def zerosWithOneOnJth(j, l):
r = [0.0 for _ in range(l)]
r[j] = 1.0
return r
new_lin_const = {
'type': 'ineq',
'fun' : lambda x: np.array(
[x[2 * i] + x[2 * i + 1] * (_tmax + 1.0) for i in range(_d)]
+ [x[j] for j in range(2*_d + 2*_d**2) if j not in [2 * i + 1 for i in range(_d)]]
),
'jac' : lambda x: np.array(
[gradientForIthDimension(i, _d, _tmax) for i in range(_d)]
+ [zerosWithOneOnJth(j, 2*_d + 2*_d**2) for j in range(2*_d + 2*_d**2) if j not in [2 * i + 1 for i in range(_d)]]
)
}
and finally optimisation
logArgs = [2 for _ in range(2 * (_d ** 2) + 2 * _d)]
# addditional bounds, not mentioned in a problem, but suppose a'priori knowledge
bds = [(0.0, 10.0) for _ in range(2 * (_d ** 2) + 2 * _d)]
for i in range(_d):
bds[2*i + 1] = (-10.0, 10.0)
res = opt.minimize(lambda x, args: -logL(x, args[0], args[1]),
constraints=new_lin_const, x0 = logArgs, args=([_T, _tmax]), method='SLSQP', options={'disp': True}, bounds=bds)
But when checking for result, i am getting:
pprint(res)
# fun: 2.2124712864600578e-05
# jac: array([0.00665204, 3.32973738, 0.00665204, 3.32973738, 0. ,
# 0. , 0. , 0. , 0. , 0. ,
# 0. , 0. ])
# message: 'Optimization terminated successfully'
# nfev: 40
# nit: 3
# njev: 3
# status: 0
# success: True
# x: array([ 1.66633206, -0.00332601, 1.66633206, -0.00332601, 2. ,
# 2. , 2. , 2. , 2. , 2. ,
# 2. , 2. ])
particullary:
print(res.x[0] + res.x[1]*(501.0))
# -3.2529534621517087e-13
so result is out of constrained area...
I was trying to follow documentation, but for me it does not work. I will be happy to hear any advice on what is wrong.
First of all, please stop posting the same question multiple times. This question is basically the same as your other one here. Next time, just edit your question instead of posting a new one.
That being said, your code is needlessly complicated given that your optimization problem is quite simple. It should be your goal that reading your code is as simple as reading the mathematical optimization problem. A more than welcome side effect is that it's much easier to debug your code then in case it's not working as expected.
For this purpose, it's highly recommended that you make yourself familiar with numpy and its vectorized operations (as already mentioned in the comments of your previous question). For instance, you don't need loops to implement your objective, the constraint function or the jacobian. Packing all the optimization variables into one large vector x is the right approach. However, you can simply unpack x into its components lambda, gamma, alpha and beta again. This makes it easier for you to write your functions and easier to read, too.
Well, instead of cutting my way through your code, you can find a simplified and working implementation below. By evaluating the functions and comparing the outputs to the evaluated functions in your code snippet, you should get an idea of what's going wrong on your side.
Edit: It seems like most of the algorithms under the hood of scipy.minimize fail to converge to a local minimizer while preserving strict feasibility of the constraints. If you're open to using another package, I'd recommend using the state-of-the-art NLP solver Ipopt. You can use it by means of the cyipopt package and thanks to its minimize_ipopt method, you can use it similar to scipy.optimize.minimize:
import numpy as np
#from scipy.optimize import minimize
from cyipopt import minimize_ipopt as minimize
d = 2
tmax = 500.0
N = 2*d + 2*d**2
def logL(x, d, tmax):
lambda_, gamma, alpha, beta = np.split(x, np.cumsum([d, d, d**2]))
return np.sum((lambda_ + tmax*gamma)**2)
def con_fun(x, d, tmax):
# split the packed variable x = (lambda_, gamma, alpha, beta)
lambda_, gamma, alpha, beta = np.split(x, np.cumsum([d, d, d**2]))
return lambda_ + (tmax + 1.0) * gamma
def con_jac(x, d, tmax):
jac = np.block([np.eye(d), (tmax + 1.0)*np.eye(d), np.zeros((d, 2*d**2))])
return jac
constr = {
'type': 'ineq',
'fun': lambda x: con_fun(x, d, tmax),
'jac': lambda x: con_jac(x, d, tmax)
}
bounds = [(0, 10.0)]*N + [(-10.0, 10.0)]*N + [(0.0, 10.0)]*2*d**2
x0 = np.full(N, 2.0)
res = minimize(lambda x: logL(x, d, tmax), x0=x0, constraints=constr,
method='SLSQP', options={'disp': True}, bounds=bounds)
print(res)
yields
******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
Ipopt is released as open source code under the Eclipse Public License (EPL).
For more information visit https://github.com/coin-or/Ipopt
******************************************************************************
fun: 0.00014085582293562834
info: {'x': array([ 2.0037865 , 2.0037865 , -0.00399079, -0.00399079, 2.00700641,
2.00700641, 2.00700641, 2.00700641, 2.00700641, 2.00700641,
2.00700641, 2.00700641]), 'g': array([0.00440135, 0.00440135]), 'obj_val': 0.00014085582293562834, 'mult_g': array([-0.01675576, -0.01675576]), 'mult_x_L': array([5.00053270e-08, 5.00053270e-08, 1.00240003e-08, 1.00240003e-08,
4.99251018e-08, 4.99251018e-08, 4.99251018e-08, 4.99251018e-08,
4.99251018e-08, 4.99251018e-08, 4.99251018e-08, 4.99251018e-08]), 'mult_x_U': array([1.25309309e-08, 1.25309309e-08, 1.00160027e-08, 1.00160027e-08,
1.25359789e-08, 1.25359789e-08, 1.25359789e-08, 1.25359789e-08,
1.25359789e-08, 1.25359789e-08, 1.25359789e-08, 1.25359789e-08]), 'status': 0, 'status_msg': b'Algorithm terminated successfully at a locally optimal point, satisfying the convergence tolerances (can be specified by options).'}
message: b'Algorithm terminated successfully at a locally optimal point, satisfying the convergence tolerances (can be specified by options).'
nfev: 15
nit: 14
njev: 16
status: 0
success: True
x: array([ 2.0037865 , 2.0037865 , -0.00399079, -0.00399079, 2.00700641,
2.00700641, 2.00700641, 2.00700641, 2.00700641, 2.00700641,
2.00700641, 2.00700641])
and evaluating the constraint function at the found solution yields
In [17]: print(constr['fun'](res.x))
[0.00440135 0.00440135]
Consequently, the constraints are fulfilled.

Genetic Algorithm Population Individual as Array

I don't have much experience using Genetic Algorithms, so I would like to ask the community for some useful comments. I want to apologize for my terminology errors. Please, correct me if it's needed.
The problem I want to optimize is optimal power flow in an islanded microgrid. In the simple microgrid we have 2 diesel generators (DG), 1 PV array, 1 Energy Storage System (ESS) and Load. Let's assume we know Load and PV array output power values for next periods.
So, the objective function should be minimized is OPEX as sum of every microgrid component operational cost at each moment t in period T:
where a, b are some operational cost coefficients, is diesel generator binary (0/1 or ON/OFF) status variable and P is output power of the microgrid component at the time t.
And here are some of constraints (the real problem is hardly and nonlinearly constrained so I wrote down only three of constraints):
Power balance
ESS' Maximum depth of disharge
Diesel gensets power limit
So, it's mixed integer problem with nonlinear constraints. I tried to adapt the problem for solving it using Genetic Algorithm. I used pymoo Python library for multiobjective optimization with NSGA2 algorithm. Let's consider and for this T we have some Load and PV power data:
from pymoo.model.problem import FunctionalProblem
from pymoo.factory import get_sampling, get_crossover, get_mutation
from pymoo.operators.mixed_variable_operator import MixedVariableSampling, MixedVariableMutation, MixedVariableCrossover
from pymoo.algorithms.nsga2 import NSGA2
from pymoo.factory import get_sampling, get_crossover, get_mutation, get_termination
from pymoo.optimize import minimize
PV = np.array([10, 19.8, 16, 25, 7.8, 42.8, 10]) #PV inverter output power, kW
Load = np.array([100, 108, 150, 150, 90, 16, 170]) #Load, kW
balance_eps = 0.001 #equality constraint relaxing coefficient
DG1_pmin = 0.3 #DG1 min power
DG2_pmin = 0.3 #DG2 min power
P_dg1 = 75 #DG1 rated power, kW
P_dg2 = 75 #DG1 rated power, kW
P_PV_inv = 50 #PV inverter rated power, kW
P_ESS_inv = 30 #ESS bidirectional inverter absolute rated discharge/charge power, kW
ESS_c = 100 #ESS capacity, kWh
SOC_min = 30
SOC_max = 100
objs = [lambda x: x[0]*x[2]*200 + x[1]*x[3]*200 + x[4]*0.002 #objective function]
constr_eq = [lambda x: ((Load[t] - x[0]*x[2] - x[1]*x[3] - x[4] - PV[t] )**2)]
constr_ieq = [lambda x: -SOC_t + 100*x[4]/ESS_c + SOC_min,
lambda x: SOC_t - 100*x[4]/ESS_c - SOC_max]
problem = FunctionalProblem(n_var=n_var, objs, constr_eq=constr_eq, constr_eq_eps=1e-03, constr_ieq=constr_ieq,
xl=np.array([0, 0, DG1_pmin*P_dg1, DG2_pmin*P_dg2, -P_ESS_inv]), xu=np.array([1, 1, P_dg1, P_dg2, P_ESS_inv]))
mask = ["int", "int", "real", "real", "real"]
sampling = MixedVariableSampling(mask, {
"real": get_sampling("real_random"),
"int": get_sampling("int_random")})
crossover = MixedVariableCrossover(mask, {
"real": get_crossover("real_sbx", prob=1.0, eta=3.0),
"int": get_crossover("int_sbx", prob=1.0, eta=3.0)})
mutation = MixedVariableMutation(mask, {
"real": get_mutation("real_pm", eta=3.0),
"int": get_mutation("int_pm", eta=3.0)})
algorithm = NSGA2(
pop_size=150,
sampling=sampling,
crossover=crossover,
mutation=mutation,
eliminate_duplicates=True)
We have n_var = 5 decision variables which are being optimized: . We should also have an access to the previous value of SOC.
I wrote a recursive code to implement a consecutive optimization chain:
x=[]
s=[]
SOC_t = 100 #SOC at t = -1
for t in range (0, 7):
res = minimize(
problem,
algorithm,
seed=1,
termination = get_termination("n_gen", 300),
save_history=True, verbose=False)
SOC_t = SOC_t - 100*res.X[4]/ESS_c
print(res.X[:2], np.around(res.X[2:].astype(np.double), 3), np.around(SOC_t, 2))
x.append(res.X)
s.append(SOC_t)
So, we have initialized populations with size 150 for every time step t and individuals in that populations looked like . Running this code I get these optimization results found:
[1 1] [27.272 34.635 28.071] 71.93
[0 1] [28.127 58.168 30. ] 41.93
[1 1] [50.95 71.423 11.599] 30.33
[1 1] [53.966 70.97 0.034] 30.3
[1 1] [24.636 59.236 -1.702] 32.0
[0 0] [40.831 29.184 -26.832] 58.83
[1 1] [68.299 63.148 28.572] 30.26
Even my little experience in Genetic Algorithms allows me to state, that such approach is inappropriate and unefficient.
So, here is my question (if you're still reading my post :)
Is there a way to optimize such problem using not consecutive optimization of a particular variables set at t, but defining individuals in population as arrays with size (T, n_var)?
For the problem described an individual in population may look like
Is it possible to implement such approach? If yes, how to do it in pymoo?
Thank you very much for your time! Any comments and suggestions will be appreciated.

Trying to solve this non linear optimization using GEKKO, getting this error

#Error: setting an array element with a sequence
I am trying to mninimize the downside risk.
I have a two dimensional array of returns shape(1000, 10), and the portfolio starts with $100. Compound that 10 times by each return in a row. Do that for all the rows. Compare that last cell's value for each row with mean of last column's values. Keep the value if it's less than mean or else zero. So we will have an array of (1000, 1). At the end I am finding the standard deviation of that.
Objective is to minimize the standard deviation.
Constraints: weights need to be less than 1
the expected return i.e. wt*ret should be equal to a value like 7%. I have to do that for couple of values like 7%, 8% or 10%.
wt = np.array([0.4, 0.3, 0.3])
cov = array([[0.00026566, 0.00016167, 0.00011949],
[0.00016167, 0.00065866, 0.00021662],
[0.00011949, 0.00021662, 0.00043748]])
ret =[.098, 0.0620,.0720]
iterations = 10000
return_sim = np.random.multivariate_normal(ret, cov, iterations)
def simulations(wt):
downside =[]
fund_ret =np.zeros((1000,10))
prt_ret = np.dot(return_sim , wt)
re_ret = np.array(prt_ret).reshape(1000, 10) #10 years
for m in range(len(re_ret)):
fund_ret[m][0] = 100 * (1 + re_ret[m][0]) #start with $100
for n in range(9):
fund_ret[m][n+1] = fund_ret[m][n]* (1 + re_ret[m][n+1])
mean = np.mean(fund_ret[:,-1]) #just need the last column and all rows
for i in range(1000):
downside.append(np.maximum((mean - fund_ret[i,-1]), 0))
return np.std(downside)
b = GEKKO()
w = b.Array(b.Var,3,value=0.33,lb=1e-5, ub=1)
b.Equation(b.sum(w)<=1)
b.Equation(np.dot(w,ret) == .07)
b.Minimize(simulations(w))
b.solve(disp=False)
#simulations(wt)
If you comment out the gekko section and call the simulation function at the bottom, it works fine
In this case, you would want to consider a different optimizer such as scipy.minimize.optimize. The function np.std() is not currently supported in Gekko. Gekko compiles the model into byte-code for automatic differentiation so you need to fit the problem into a form that is supported. Gekko's approach has several advantages, especially for large-scale or non-linear problems. For small problems with fewer than 100 variables and nearly linear constraints, an optimizer such as scipy.minimize.optimize is often a viable option. Here is your problem with a solution:
import numpy as np
from scipy.optimize import minimize
wt = np.array([0.4, 0.3, 0.3])
cov = np.array([[0.00026566, 0.00016167, 0.00011949],
[0.00016167, 0.00065866, 0.00021662],
[0.00011949, 0.00021662, 0.00043748]])
ret =[.098, 0.0620,.0720]
iterations = 10000
return_sim = np.random.multivariate_normal(ret, cov, iterations)
def simulations(wt):
downside =[]
fund_ret =np.zeros((1000,10))
prt_ret = np.dot(return_sim , wt)
re_ret = np.array(prt_ret).reshape(1000, 10) #10 years
for m in range(len(re_ret)):
fund_ret[m][0] = 100 * (1 + re_ret[m][0]) #start with $100
for n in range(9):
fund_ret[m][n+1] = fund_ret[m][n]* (1+re_ret[m][n+1])
#just need the last column and all rows
mean = np.mean(fund_ret[:,-1])
for i in range(1000):
downside.append(np.maximum((mean - fund_ret[i,-1]), 0))
return np.std(downside)
b = (1e-5,1); bnds=(b,b,b)
cons = ({'type': 'ineq', 'fun': lambda x: sum(x)-1},\
{'type': 'eq', 'fun': lambda x: np.dot(x,ret)-.07})
sol = minimize(simulations,wt,bounds=bnds,constraints=cons)
w = sol.x
print(w)
This produces the solution sol with optimal values w=sol.x:
fun: 6.139162309118155
jac: array([ 8.02691203, 10.04863131, 9.49171901])
message: 'Optimization terminated successfully.'
nfev: 33
nit: 6
njev: 6
status: 0
success: True
x: array([0.09741111, 0.45326888, 0.44932001])

Least squares fit in python for 3d surface

I would like to fit my surface equation to some data. I already tried scipy.optimize.leastsq but as I cannot specify the bounds it gives me an unusable results. I also tried scipy.optimize.least_squares but it gives me an error:
ValueError: too many values to unpack
My equation is:
f(x,y,z)=(x-A+y-B)/2+sqrt(((x-A-y+B)/2)^2+C*z^2)
parameters A, B, C should be found so that the equation above would be as close as possible to zero when the following points are used for x,y,z:
[
[-0.071, -0.85, 0.401],
[-0.138, -1.111, 0.494],
[-0.317, -0.317, -0.317],
[-0.351, -2.048, 0.848]
]
The bounds would be A > 0, B > 0, C > 1
How I should obtain such a fit? What is the best tool in python to do that. I searched for examples on how to fit 3d surfaces but most of examples involving function fitting is about line or flat surface fits.
I've edited this answer to provide a more general example of how this problem can be solved with scipy's general optimize.minimize method as well as scipy's optimize.least_squares method.
First lets set up the problem:
import numpy as np
import scipy.optimize
# ===============================================
# SETUP: define common compoments of the problem
def our_function(coeff, data):
"""
The function we care to optimize.
Args:
coeff (np.ndarray): are the parameters that we care to optimize.
data (np.ndarray): the input data
"""
A, B, C = coeff
x, y, z = data.T
return (x - A + y - B) / 2 + np.sqrt(((x - A - y + B) / 2) ** 2 + C * z ** 2)
# Define some training data
data = np.array([
[-0.071, -0.85, 0.401],
[-0.138, -1.111, 0.494],
[-0.317, -0.317, -0.317],
[-0.351, -2.048, 0.848]
])
# Define training target
# This is what we want the target function to be equal to
target = 0
# Make an initial guess as to the parameters
# either a constant or random guess is typically fine
num_coeff = 3
coeff_0 = np.ones(num_coeff)
# coeff_0 = np.random.rand(num_coeff)
This isn't strictly least squares, but how about something like this?
This solution is like throwing a sledge hammer at the problem. There probably is a way to use least squares to get a solution more efficiently using an SVD solver, but if you're just looking for an answer scipy.optimize.minimize will find you one.
# ===============================================
# FORMULATION #1: a general minimization problem
# Here the bounds and error are all specified within the general objective function
def general_objective(coeff, data, target):
"""
General function that simply returns a value to be minimized.
The coeff will be modified to minimize whatever the output of this function
may be.
"""
# Constraints to keep coeff above 0
if np.any(coeff < 0):
# If any constraint is violated return infinity
return np.inf
# The function we care about
prediction = our_function(coeff, data)
# (optional) L2 regularization to keep coeff small
# (optional) reg_amount = 0.0
# (optional) reg = reg_amount * np.sqrt((coeff ** 2).sum())
losses = (prediction - target) ** 2
# (optional) losses += reg
# Return the average squared error
loss = losses.sum()
return loss
general_result = scipy.optimize.minimize(general_objective, coeff_0,
method='Nelder-Mead',
args=(data, target))
# Test what the squared error of the returned result is
coeff = general_result.x
general_output = our_function(coeff, data)
print('====================')
print('general_result =\n%s' % (general_result,))
print('---------------------')
print('general_output = %r' % (general_output,))
print('====================')
The output looks like this:
====================
general_result =
final_simplex: (array([[ 2.45700466e-01, 7.93719271e-09, 1.71257109e+00],
[ 2.45692680e-01, 3.31991619e-08, 1.71255150e+00],
[ 2.45726858e-01, 6.52636219e-08, 1.71263360e+00],
[ 2.45713989e-01, 8.06971686e-08, 1.71260234e+00]]), array([ 0.00012404, 0.00012404, 0.00012404, 0.00012404]))
fun: 0.00012404137498459109
message: 'Optimization terminated successfully.'
nfev: 431
nit: 240
status: 0
success: True
x: array([ 2.45700466e-01, 7.93719271e-09, 1.71257109e+00])
---------------------
general_output = array([ 0.00527974, -0.00561568, -0.00719941, 0.00357748])
====================
I found in the documentation that all you need to do to adapt this to actual least squares is to specify the function that computes the residuals.
# ===============================================
# FORMULATION #2: a special least squares problem
# Here all that is needeed is a function that computes the vector of residuals
# the optimization function takes care of the rest
def least_squares_residuals(coeff, data, target):
"""
Function that returns the vector of residuals between the predicted values
and the target value. Here we want each predicted value to be close to zero
"""
A, B, C = coeff
x, y, z = data.T
prediction = our_function(coeff, data)
vector_of_residuals = (prediction - target)
return vector_of_residuals
# Here the bounds are specified in the optimization call
bound_gt = np.full(shape=num_coeff, fill_value=0, dtype=np.float)
bound_lt = np.full(shape=num_coeff, fill_value=np.inf, dtype=np.float)
bounds = (bound_gt, bound_lt)
lst_sqrs_result = scipy.optimize.least_squares(least_squares_residuals, coeff_0,
args=(data, target), bounds=bounds)
# Test what the squared error of the returned result is
coeff = lst_sqrs_result.x
lst_sqrs_output = our_function(coeff, data)
print('====================')
print('lst_sqrs_result =\n%s' % (lst_sqrs_result,))
print('---------------------')
print('lst_sqrs_output = %r' % (lst_sqrs_output,))
print('====================')
The output here is:
====================
lst_sqrs_result =
active_mask: array([ 0, -1, 0])
cost: 6.197329866927735e-05
fun: array([ 0.00518416, -0.00564099, -0.00710112, 0.00385024])
grad: array([ -4.61826888e-09, 3.70771396e-03, 1.26659198e-09])
jac: array([[-0.72611025, -0.27388975, 0.13653112],
[-0.74479565, -0.25520435, 0.1644325 ],
[-0.35777232, -0.64222767, 0.11601263],
[-0.77338046, -0.22661953, 0.27104366]])
message: '`gtol` termination condition is satisfied.'
nfev: 13
njev: 13
optimality: 4.6182688779976278e-09
status: 1
success: True
x: array([ 2.46392438e-01, 5.39025298e-17, 1.71555150e+00])
---------------------
lst_sqrs_output = array([ 0.00518416, -0.00564099, -0.00710112, 0.00385024])
====================

IDL's INT_TABULATE - SciPy equivalent?

I am working on moving some code from IDL into python. One IDL call is to INT_TABULATE which performs integration on a fixed range.
The INT_TABULATED function integrates a tabulated set of data { xi , fi } on the closed interval [MIN(x) , MAX(x)], using a five-point Newton-Cotes integration formula.
Result = INT_TABULATED( X, F [, /DOUBLE] [, /SORT] )
Where result is the area under the curve.
IDL DOCS
My question is, does Numpy/SciPy offer a similar form of integration? I see that [scipy.integrate.newton_cotes] exists, but it appears to return "weights and error coefficient for Newton-Cotes integration instead of area".
Scipy does not provide such a high order integrator for tabulated data by default. The closest you have available without coding it yourself is scipy.integrate.simps, which uses a 3 point Newton-Cotes method.
If you simply want to get comparable integration precision, you could split your x and f arrays into 5 point chunks and integrate them one at a time, using the weights returned by scipy.integrate.newton_cotes doing something along the lines of:
def idl_tabulate(x, f, p=5) :
def newton_cotes(x, f) :
if x.shape[0] < 2 :
return 0
rn = (x.shape[0] - 1) * (x - x[0]) / (x[-1] - x[0])
weights = scipy.integrate.newton_cotes(rn)[0]
return (x[-1] - x[0]) / (x.shape[0] - 1) * np.dot(weights, f)
ret = 0
for idx in xrange(0, x.shape[0], p - 1) :
ret += newton_cotes(x[idx:idx + p], f[idx:idx + p])
return ret
This does 5-point Newton-Cotes on all intervals, except perhaps the last, where it will do a Newton-Cotes of the number of points remaining. Unfortunately, this will not give you the same results as IDL_TABULATE because the internal methods are different:
Scipy calculates the weights for points not equally spaced using what seems like a least-sqaures fit, I don't fully understand what is going on, but the code is pure python, you can find it in your Scipy installation in file scipy\integrate\quadrature.py.
INT_TABULATED always performs 5-point Newton-Cotes on equispaced data. If the data are not equispaced, it builds an equispaced grid, using a cubic spline to interpolate the values at those points. You can check the code here.
For the example in the INT_TABULATED docstring, which is suppossed to return 1.6271 using the original code, and have an exact solution of 1.6405, the above function returns:
>>> x = np.array([0.0, 0.12, 0.22, 0.32, 0.36, 0.40, 0.44, 0.54, 0.64,
... 0.70, 0.80])
>>> f = np.array([0.200000, 1.30973, 1.30524, 1.74339, 2.07490, 2.45600,
... 2.84299, 3.50730, 3.18194, 2.36302, 0.231964])
>>> idl_tabulate(x, f)
1.641998154242472

Categories