Constrained optimisation in scipy enters restricted area - python

I am trying to solve a multivariate optimisation problem in Python with scipy.
Let me define the environment I am working in:
searched parameters:
and the problem itself:
(In my case the logL function is complex, so I will substitute it with a trivial one that generates a similar issue. In this example I am therefore not using the function parameters fully, but I include them for consistency with the problem.)
I am using the following convention for storing the parameters in a single flat array:
Here is the script that was supposed to solve my problem.
import numpy as np
from scipy import optimize as opt
from pprint import pprint
from typing import List
_d = 2
_tmax = 500.0
_T = [[1,2,3,4,5], [6,7,8,9]]
def logL(args: List[float], T : List[List[float]], tmax : float):
    # simplified - normally using T in the computation, here only to determine the dimension
    d = len(T)
    # trivially forcing args to go 'out of constraints'
    return -sum([(args[2 * i] + args[2 * i + 1] * tmax)**2 for i in range(d)])

def gradientForIthDimension(i, d, t_max):
    g = np.zeros(2 * d + 2 * d**2)
    g[2 * i] = 1.0
    g[2 * i + 1] = t_max + 1.0
    return g

def zerosWithOneOnJth(j, l):
    r = [0.0 for _ in range(l)]
    r[j] = 1.0
    return r

new_lin_const = {
    'type': 'ineq',
    'fun' : lambda x: np.array(
        [x[2 * i] + x[2 * i + 1] * (_tmax + 1.0) for i in range(_d)]
        + [x[j] for j in range(2*_d + 2*_d**2) if j not in [2 * i + 1 for i in range(_d)]]
    ),
    'jac' : lambda x: np.array(
        [gradientForIthDimension(i, _d, _tmax) for i in range(_d)]
        + [zerosWithOneOnJth(j, 2*_d + 2*_d**2) for j in range(2*_d + 2*_d**2) if j not in [2 * i + 1 for i in range(_d)]]
    )
}
and finally the optimisation:
logArgs = [2 for _ in range(2 * (_d ** 2) + 2 * _d)]

# additional bounds, not mentioned in the problem, but assume a priori knowledge
bds = [(0.0, 10.0) for _ in range(2 * (_d ** 2) + 2 * _d)]
for i in range(_d):
    bds[2*i + 1] = (-10.0, 10.0)

res = opt.minimize(lambda x, args: -logL(x, args[0], args[1]),
                   constraints=new_lin_const, x0=logArgs, args=([_T, _tmax]),
                   method='SLSQP', options={'disp': True}, bounds=bds)
But when checking the result, I get:
pprint(res)
# fun: 2.2124712864600578e-05
# jac: array([0.00665204, 3.32973738, 0.00665204, 3.32973738, 0. ,
# 0. , 0. , 0. , 0. , 0. ,
# 0. , 0. ])
# message: 'Optimization terminated successfully'
# nfev: 40
# nit: 3
# njev: 3
# status: 0
# success: True
# x: array([ 1.66633206, -0.00332601, 1.66633206, -0.00332601, 2. ,
# 2. , 2. , 2. , 2. , 2. ,
# 2. , 2. ])
In particular:
print(res.x[0] + res.x[1]*(501.0))
# -3.2529534621517087e-13
so the result is outside the constrained area...
I was trying to follow the documentation, but it does not work for me. I would be happy to hear any advice on what is wrong.

First of all, please stop posting the same question multiple times. This question is basically the same as your other one here. Next time, just edit your question instead of posting a new one.
That being said, your code is needlessly complicated given that your optimization problem is quite simple. It should be your goal that reading your code is as easy as reading the mathematical optimization problem. A more than welcome side effect is that it's much easier to debug your code when it's not working as expected.
For this purpose, it's highly recommended that you make yourself familiar with numpy and its vectorized operations (as already mentioned in the comments of your previous question). For instance, you don't need loops to implement your objective, the constraint function or the jacobian. Packing all the optimization variables into one large vector x is the right approach. However, you can simply unpack x into its components lambda, gamma, alpha and beta again. This makes it easier for you to write your functions and easier to read, too.
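For illustration, this is all the unpacking amounts to (a small sketch of the idea, using the same np.split call as the code below):
import numpy as np

d = 2
x = np.arange(2*d + 2*d**2, dtype=float)  # packed vector of length 2d + 2d**2 = 12
# split after d, 2d and 2d + d**2 entries -> four blocks of sizes d, d, d**2, d**2
lambda_, gamma, alpha, beta = np.split(x, np.cumsum([d, d, d**2]))
print(lambda_, gamma, alpha, beta)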
Well, instead of cutting my way through your code, you can find a simplified and working implementation below. By evaluating the functions and comparing the outputs to the evaluated functions in your code snippet, you should get an idea of what's going wrong on your side.
Edit: It seems like most of the algorithms under the hood of scipy.minimize fail to converge to a local minimizer while preserving strict feasibility of the constraints. If you're open to using another package, I'd recommend using the state-of-the-art NLP solver Ipopt. You can use it by means of the cyipopt package and thanks to its minimize_ipopt method, you can use it similar to scipy.optimize.minimize:
import numpy as np
#from scipy.optimize import minimize
from cyipopt import minimize_ipopt as minimize

d = 2
tmax = 500.0
N = 2*d + 2*d**2

def logL(x, d, tmax):
    lambda_, gamma, alpha, beta = np.split(x, np.cumsum([d, d, d**2]))
    return np.sum((lambda_ + tmax*gamma)**2)

def con_fun(x, d, tmax):
    # split the packed variable x = (lambda_, gamma, alpha, beta)
    lambda_, gamma, alpha, beta = np.split(x, np.cumsum([d, d, d**2]))
    return lambda_ + (tmax + 1.0) * gamma

def con_jac(x, d, tmax):
    jac = np.block([np.eye(d), (tmax + 1.0)*np.eye(d), np.zeros((d, 2*d**2))])
    return jac

constr = {
    'type': 'ineq',
    'fun': lambda x: con_fun(x, d, tmax),
    'jac': lambda x: con_jac(x, d, tmax)
}

# lambda in [0, 10], gamma in [-10, 10], alpha and beta in [0, 10]
bounds = [(0.0, 10.0)]*d + [(-10.0, 10.0)]*d + [(0.0, 10.0)]*(2*d**2)

x0 = np.full(N, 2.0)

res = minimize(lambda x: logL(x, d, tmax), x0=x0, constraints=constr,
               method='SLSQP', options={'disp': True}, bounds=bounds)
print(res)
yields
******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
Ipopt is released as open source code under the Eclipse Public License (EPL).
For more information visit https://github.com/coin-or/Ipopt
******************************************************************************
fun: 0.00014085582293562834
info: {'x': array([ 2.0037865 , 2.0037865 , -0.00399079, -0.00399079, 2.00700641,
2.00700641, 2.00700641, 2.00700641, 2.00700641, 2.00700641,
2.00700641, 2.00700641]), 'g': array([0.00440135, 0.00440135]), 'obj_val': 0.00014085582293562834, 'mult_g': array([-0.01675576, -0.01675576]), 'mult_x_L': array([5.00053270e-08, 5.00053270e-08, 1.00240003e-08, 1.00240003e-08,
4.99251018e-08, 4.99251018e-08, 4.99251018e-08, 4.99251018e-08,
4.99251018e-08, 4.99251018e-08, 4.99251018e-08, 4.99251018e-08]), 'mult_x_U': array([1.25309309e-08, 1.25309309e-08, 1.00160027e-08, 1.00160027e-08,
1.25359789e-08, 1.25359789e-08, 1.25359789e-08, 1.25359789e-08,
1.25359789e-08, 1.25359789e-08, 1.25359789e-08, 1.25359789e-08]), 'status': 0, 'status_msg': b'Algorithm terminated successfully at a locally optimal point, satisfying the convergence tolerances (can be specified by options).'}
message: b'Algorithm terminated successfully at a locally optimal point, satisfying the convergence tolerances (can be specified by options).'
nfev: 15
nit: 14
njev: 16
status: 0
success: True
x: array([ 2.0037865 , 2.0037865 , -0.00399079, -0.00399079, 2.00700641,
2.00700641, 2.00700641, 2.00700641, 2.00700641, 2.00700641,
2.00700641, 2.00700641])
and evaluating the constraint function at the found solution yields
In [17]: print(constr['fun'](res.x))
[0.00440135 0.00440135]
Consequently, the constraints are fulfilled.

Related

Find the value of variables to maximize return of function in Python

I'd like to achieve a similar result to what the Solver function in Excel does. I've been reading about SciPy optimization and have been trying to build a function which outputs the quantity I would like to maximize. The equation is based on four different variables; see my code below:
import pandas as pd
import numpy as np
from scipy import optimize
cols = {
    'Dividend2': [9390, 7448, 177],
    'Probability': [341, 376, 452],
    'EV': [0.53, 0.60, 0.55],
    'Dividend': [185, 55, 755],
    'EV2': [123, 139, 544],
}
df = pd.DataFrame(cols)

def myFunc(params):
    """myFunc metric."""
    (ev, bv, vc, dv) = params
    df['Number'] = np.where(df['Dividend2'] <= vc, 1, 0) \
        + np.where(df['EV2'] <= dv, 1, 0)
    df['Return'] = np.where(
        df['EV'] <= ev, 0, np.where(
            df['Probability'] >= bv, 0, df['Number'] * df['Dividend'] - (vc + dv)
        )
    )
    return -1 * (df['Return'].sum())
b1 = [(0.2,4), (300,600), (0,1000), (0,1000)]
start = [0.2, 600, 1000, 1000]
result = optimize.minimize(fun=myFunc, bounds=b1, x0=start)
print(result)
So I'd like to find the maximum value of the column Return in df when changing the variables ev, bv, vc and dv. I'd like them to be within the intervals ev: 0.2-4, bv: 300-600, vc: 0-1000 and dv: 0-1000.
When running my code it seems like the function stops at x0.
Solution
I will use the optuna library to give you a solution to the type of problem you are trying to solve. I have tried using scipy.optimize.minimize, and it appears that the loss landscape is probably quite flat in most places, so the tolerances force the minimizing algorithm (L-BFGS-B) to stop prematurely.
Optuna Docs: https://optuna.readthedocs.io/en/stable/index.html
With optuna, it is rather straightforward. Optuna only requires an objective function and a study. The study sends various trials to the objective function, which in turn evaluates the metric of your choice.
I have defined another metric function, myFunc2, mostly by removing the np.where calls, as you can do away with them (this reduces the number of steps) and it makes the function slightly faster.
# install optuna with pip
pip install -Uqq optuna
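To illustrate the basic workflow before diving into the full code, here is a minimal sketch with a made-up one-dimensional quadratic objective (not the metric from this question):
import optuna

def demo_objective(trial):
    # optuna passes a trial object; we ask it to suggest a value for each parameter
    x = trial.suggest_float("x", -10.0, 10.0)
    return (x - 2.0) ** 2  # the metric to minimize

study = optuna.create_study(direction="minimize")
study.optimize(demo_objective, n_trials=50)
print(study.best_params)  # should end up close to {'x': 2.0}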
Apart from using a smoother metric, it is sometimes necessary to visualize the loss landscape itself. Sections A and B elaborate on visualization. And what if you want to use a smoother metric function? Section D sheds some light on this.
Order of code-execution should be:
Sections: C >> B >> B.1 >> B.2 >> B.3 >> A.1 >> A.2 >> D
A. Building Intuition
If you create a hiplot (also known as a plot with parallel-coordinates) with all the possible parameter values as mentioned in the search_space for Section B.2, and plot the lowest 50 outputs of myFunc2, it would look like this:
Plotting all such points from the search_space would look like this:
A.1. Loss Landscape Views for Various Parameter-Pairs
These figures show that the loss landscape is mostly flat for any pair of the four parameters (ev, bv, vc, dv). This could be a reason why only GridSampler (which brute-forces the search process) does better compared to the other two samplers (TPESampler and RandomSampler). It could also be the reason why scipy.optimize.minimize(method="L-BFGS-B") fails right off the bat.
01. dv-vc
02. dv-bv
03. dv-ev
04. bv-ev
05. vc-ev
06. vc-bv
# Create contour plots for parameter-pairs
study_name = "GridSampler"
study = studies.get(study_name)

views = [("dv", "vc"), ("dv", "bv"), ("dv", "ev"),
         ("bv", "ev"), ("vc", "ev"), ("vc", "bv")]

for i, (x, y) in enumerate(views):
    print(f"Figure: {i}/{len(views)}")
    study_contour_plot(study=study, params=(x, y))
A.2. Parameter Importance
study_name = "GridSampler"
study = studies.get(study_name)
fig = optuna.visualization.plot_param_importances(study)
fig.update_layout(title=f'Hyperparameter Importances: {study.study_name}',
autosize=False,
width=800, height=500,
margin=dict(l=65, r=50, b=65, t=90))
fig.show()
B. Code
Section B.2. finds the lowest metric -88.333 for:
{'ev': 0.2, 'bv': 500.0, 'vc': 222.2222, 'dv': 0.0}
import warnings
from functools import partial
from typing import Iterable, Optional, Callable, List, Tuple, Union, Dict

import pandas as pd
import numpy as np
import optuna
from tqdm.notebook import tqdm

warnings.filterwarnings("ignore", category=optuna.exceptions.ExperimentalWarning)
optuna.logging.set_verbosity(optuna.logging.WARNING)

PARAM_NAMES: List[str] = ["ev", "bv", "vc", "dv",]

def myFunc2(params):
    """myFunc metric v2 with fewer steps."""
    global df  # use the module-level DataFrame df
    (ev, bv, vc, dv) = params
    df['Number'] = (df['Dividend2'] <= vc) * 1 + (df['EV2'] <= dv) * 1
    df['Return'] = (
        (df['EV'] > ev)
        * (df['Probability'] < bv)
        * (df['Number'] * df['Dividend'] - (vc + dv))
    )
    return -1 * (df['Return'].sum())

# NOTE: this must come after the definition of myFunc2
DEFAULT_METRIC_FUNC: Callable = myFunc2
def make_param_grid(
        bounds: List[Tuple[float, float]],
        param_names: Optional[List[str]]=None,
        num_points: int=10,
        as_dict: bool=True,
    ) -> Union[pd.DataFrame, Dict[str, List[float]]]:
    """
    Create parameter search space.

    Example:
        grid = make_param_grid(bounds=b1, num_points=10, as_dict=True)
    """
    if param_names is None:
        param_names = PARAM_NAMES  # ["ev", "bv", "vc", "dv"]
    bounds = np.array(bounds)
    grid = np.linspace(start=bounds[:, 0],
                       stop=bounds[:, 1],
                       num=num_points,
                       endpoint=True,
                       axis=0)
    grid = pd.DataFrame(grid, columns=param_names)
    if as_dict:
        grid = grid.to_dict()
        for k, v in grid.items():
            grid.update({k: list(v.values())})
    return grid
def objective(trial,
              bounds: Optional[Iterable]=None,
              func: Optional[Callable]=None,
              param_names: Optional[List[str]]=None):
    """Objective function, necessary for optimizing with optuna."""
    if param_names is None:
        param_names = PARAM_NAMES
    if (bounds is None):
        bounds = ((-10, 10) for _ in param_names)
    if not isinstance(bounds, dict):
        bounds = dict((p, (min(b), max(b)))
                      for p, b in zip(param_names, bounds))
    if func is None:
        func = DEFAULT_METRIC_FUNC
    params = dict(
        (p, trial.suggest_float(p, bounds.get(p)[0], bounds.get(p)[1]))
        for p in param_names
    )
    # x = trial.suggest_float('x', -10, 10)
    return func((params[p] for p in param_names))
def optimize(objective: Callable,
             sampler: Optional[optuna.samplers.BaseSampler]=None,
             func: Optional[Callable]=None,
             n_trials: int=2,
             study_direction: str="minimize",
             study_name: Optional[str]=None,
             formatstr: str=".4f",
             verbose: bool=True):
    """Optimizing function using optuna: creates a study."""
    if func is None:
        func = DEFAULT_METRIC_FUNC
    study = optuna.create_study(
        direction=study_direction,
        sampler=sampler,
        study_name=study_name)
    study.optimize(
        objective,
        n_trials=n_trials,
        show_progress_bar=True,
        n_jobs=1,
    )
    if verbose:
        metric = eval_metric(study.best_params, func=myFunc2)
        msg = format_result(study.best_params, metric,
                            header=study.study_name,
                            format=formatstr)
        print(msg)
    return study
def format_dict(d: Dict[str, float], format: str=".4f") -> Dict[str, float]:
    """
    Returns formatted output for a dictionary with
    string keys and float values.
    """
    return dict((k, float(f'{v:{format}}')) for k, v in d.items())

def format_result(d: Dict[str, float],
                  metric_value: float,
                  header: str='',
                  format: str=".4f"):
    """Returns formatted result."""
    msg = f"""Study Name: {header}\n{'='*30}
✅ study.best_params: \n\t{format_dict(d)}
✅ metric: {metric_value}
"""
    return msg
def study_contour_plot(study: optuna.Study,
                       params: Optional[List[str]]=None,
                       width: int=560,
                       height: int=500):
    """
    Create contour plots for a study, given a list or
    tuple of two parameter names.
    """
    if params is None:
        params = ["dv", "vc"]
    fig = optuna.visualization.plot_contour(study, params=params)
    fig.update_layout(
        title=f'Contour Plot: {study.study_name} ({params[0]}, {params[1]})',
        autosize=False,
        width=width,
        height=height,
        margin=dict(l=65, r=50, b=65, t=90))
    fig.show()
bounds = [(0.2, 4), (300, 600), (0, 1000), (0, 1000)]
param_names = PARAM_NAMES # ["ev", "bv", "vc", "dv",]
pobjective = partial(objective, bounds=bounds)
# Create an empty dict to contain
# various subsequent studies.
studies = dict()
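One caveat: the helper eval_metric called inside optimize() is not defined in the snippet above. A minimal stand-in that is consistent with how it is called (a best_params dict in, a metric value out) could look like this:
def eval_metric(best_params: Dict[str, float], func: Optional[Callable]=None) -> float:
    """Evaluate the metric function on a study's best_params dictionary."""
    if func is None:
        func = DEFAULT_METRIC_FUNC
    return func(tuple(best_params[p] for p in PARAM_NAMES))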
Optuna comes with a few different types of samplers. Samplers define the strategy for how optuna samples points from the parameter space and evaluates the objective function.
https://optuna.readthedocs.io/en/stable/reference/samplers.html
B.1 Use TPESampler
from optuna.samplers import TPESampler
sampler = TPESampler(seed=42)
study_name = "TPESampler"
studies[study_name] = optimize(
    pobjective,
    sampler=sampler,
    n_trials=100,
    study_name=study_name,
)
# Study Name: TPESampler
# ==============================
#
# ✅ study.best_params:
# {'ev': 1.6233, 'bv': 585.2143, 'vc': 731.9939, 'dv': 598.6585}
# ✅ metric: -0.0
B.2. Use GridSampler
GridSampler requires a parameter search grid. Here we are using the following search_space.
from optuna.samplers import GridSampler
# create search-space
search_space = make_param_grid(bounds=bounds, num_points=10, as_dict=True)
sampler = GridSampler(search_space)
study_name = "GridSampler"
studies[study_name] = optimize(
    pobjective,
    sampler=sampler,
    n_trials=2000,
    study_name=study_name,
)
# Study Name: GridSampler
# ==============================
#
# ✅ study.best_params:
# {'ev': 0.2, 'bv': 500.0, 'vc': 222.2222, 'dv': 0.0}
# ✅ metric: -88.33333333333337
B.3. Use RandomSampler
from optuna.samplers import RandomSampler
sampler = RandomSampler(seed=42)
study_name = "RandomSampler"
studies[study_name] = optimize(
    pobjective,
    sampler=sampler,
    n_trials=300,
    study_name=study_name,
)
# Study Name: RandomSampler
# ==============================
#
# ✅ study.best_params:
# {'ev': 1.6233, 'bv': 585.2143, 'vc': 731.9939, 'dv': 598.6585}
# ✅ metric: -0.0
C. Dummy Data
For the sake of reproducibility, I am keeping a record of the dummy data used here.
import pandas as pd
import numpy as np
from scipy import optimize
cols = {
    'Dividend2': [9390, 7448, 177],
    'Probability': [341, 376, 452],
    'EV': [0.53, 0.60, 0.55],
    'Dividend': [185, 55, 755],
    'EV2': [123, 139, 544],
}
df = pd.DataFrame(cols)

def myFunc(params):
    """myFunc metric."""
    (ev, bv, vc, dv) = params
    df['Number'] = np.where(df['Dividend2'] <= vc, 1, 0) \
        + np.where(df['EV2'] <= dv, 1, 0)
    df['Return'] = np.where(
        df['EV'] <= ev, 0, np.where(
            df['Probability'] >= bv, 0, df['Number'] * df['Dividend'] - (vc + dv)
        )
    )
    return -1 * (df['Return'].sum())
b1 = [(0.2,4), (300,600), (0,1000), (0,1000)]
start = [0.2, 600, 1000, 1000]
result = optimize.minimize(fun=myFunc, bounds=b1, x0=start)
print(result)
C.1. An Observation
So, at first glance it seems that the code executed properly and did not throw any error, and it says it succeeded in finding the minimized solution.
fun: -0.0
hess_inv: <4x4 LbfgsInvHessProduct with dtype=float64>
jac: array([0., 0., 3., 3.])
message: b'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL' # 💡
nfev: 35
nit: 2
status: 0
success: True
x: array([2.e-01, 6.e+02, 0.e+00, 0.e+00]) # 🔥
A close observation reveals that the objective at the solution (see 🔥) is no better than at the starting point [0.2, 600, 1000, 1000]. So it seems like nothing useful really happened and the algorithm just terminated prematurely.
Now look at the message above (see 💡). If you run a google search on it, you can find something like this:
Summary
b'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL'
If the loss landscape does not have a smoothly changing topography, a gradient-descent algorithm will soon find that not much changes from one iteration to the next, and hence it will stop searching. Likewise, if the loss landscape is rather flat, the search can meet a similar fate and terminate early.
scipy-optimize-minimize does not perform the optimization - CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL
D. Making the Loss Landscape Smoother
A binary evaluation of value = 1 if x > 5 else 0 is essentially a step function that assigns 1 to all values of x greater than 5 and 0 otherwise. But this introduces a kink, a discontinuity in smoothness, and that can cause problems when traversing the loss landscape.
What if we use a sigmoid function to introduce some smoothness?
# Define sigmoid function
def sigmoid(x):
    """Sigmoid function."""
    return 1 / (1 + np.exp(-x))
For the above example, we could replace the hard threshold at x = 5 with sigmoid(x - 5).
You can additionally introduce a scale factor (gamma: γ), i.e. sigmoid((x - 5) / gamma), and tune it to make the landscape smoother. By controlling gamma you control how quickly the function changes around x = 5.
The above figure is created with the following code-snippet.
import matplotlib.pyplot as plt
%matplotlib inline
%config InlineBackend.figure_format = 'svg' # 'svg', 'retina'
plt.style.use('seaborn-white')

def make_figure(figtitle: str="Sigmoid Function"):
    """Make the demo figure for using sigmoid."""
    x = np.arange(-20, 20.01, 0.01)
    y1 = sigmoid(x)
    y2 = sigmoid(x - 5)
    y3 = sigmoid((x - 5)/3)
    y4 = sigmoid((x - 5)/0.3)
    fig, ax = plt.subplots(figsize=(10, 5))
    plt.sca(ax)
    plt.plot(x, y1, ls="-", label=r"$\sigma(x)$")
    plt.plot(x, y2, ls="--", label=r"$\sigma(x - 5)$")
    plt.plot(x, y3, ls="-.", label=r"$\sigma((x - 5) / 3)$")
    plt.plot(x, y4, ls=":", label=r"$\sigma((x - 5) / 0.3)$")
    plt.axvline(x=0, ls="-", lw=1.3, color="cyan", alpha=0.9)
    plt.axvline(x=5, ls="-", lw=1.3, color="magenta", alpha=0.9)
    plt.legend()
    plt.title(figtitle)
    plt.show()

make_figure()
D.1. Example of Metric Smoothing
The following is an example of how you could apply function smoothing.
from functools import partial

def sig(x, gamma: float=1.):
    return sigmoid(x/gamma)

def myFunc3(params, gamma: float=0.5):
    """myFunc metric v3 with smoother metric."""
    (ev, bv, vc, dv) = params
    _sig = partial(sig, gamma=gamma)
    df['Number'] = _sig(x = -(df['Dividend2'] - vc)) * 1 \
        + _sig(x = -(df['EV2'] - dv)) * 1
    df['Return'] = (
        _sig(x = df['EV'] - ev)
        * _sig(x = -(df['Probability'] - bv))
        * _sig(x = df['Number'] * df['Dividend'] - (vc + dv))
    )
    return -1 * (df['Return'].sum())
As already mentioned in my comment, the crucial problem is that np.where() is neither differentiable nor continuous. Consequently, your objective function violates the mathematical assumptions behind most of the (derivative-based) algorithms under the hood of scipy.optimize.minimize.
So, basically, you've got three options:
Use a derivative-free algorithm and hope for the best.
Replace np.where() with a smooth approximation such that your objective is continuously differentiable.
Reformulate your problem as a MIP.
Since CypherX's answer pursues approach 1, I'd like to focus on approach 2. Here, the main idea is to approximate the np.where function. One possible approximation is
def smooth_if_then(x):
    eps = 1e-12
    return 0.5 + x/(2*np.sqrt(eps + x*x))
which is continuous and differentiable. Then, given an np.ndarray arr and a scalar value x, the expression np.where(arr <= x, 1, 0) is equivalent to smooth_if_then(x - arr).
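A quick check of this equivalence (assuming smooth_if_then from above is in scope; note that exactly at arr == x the approximation returns 0.5 instead of 1):
import numpy as np

arr = np.array([1.0, 5.0, 10.0])
x = 6.0
print(np.where(arr <= x, 1, 0))   # [1 1 0]
print(smooth_if_then(x - arr))    # approximately [1. 1. 0.]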
Hence, the objective function becomes:
div = df['Dividend'].values
div2 = df['Dividend2'].values
ev2 = df['EV2'].values
ev = df['EV'].values
prob = df['Probability'].values

def objective(x, *params):
    ev, bv, vc, dv = x
    div_vals, div2_vals, ev2_vals, ev_vals, prob_vals = params
    number = smooth_if_then(vc - div2_vals) + smooth_if_then(dv - ev2_vals)
    part1 = smooth_if_then(bv - prob_vals) * (number * div_vals - (vc + dv))
    part2 = smooth_if_then(-1*(ev - ev_vals)) * part1
    return -1 * part2.sum()
and using the trust-constr algorithm (which is the most robust one inside scipy.optimize.minimize), yields:
from scipy.optimize import minimize

res = minimize(lambda x: objective(x, div, div2, ev2, ev, prob), x0=start, bounds=b1, method="trust-constr")
barrier_parameter: 1.0240000000000006e-08
barrier_tolerance: 1.0240000000000006e-08
cg_niter: 5
cg_stop_cond: 0
constr: [array([8.54635975e-01, 5.99253512e+02, 9.95614973e+02, 9.95614973e+02])]
constr_nfev: [0]
constr_nhev: [0]
constr_njev: [0]
constr_penalty: 1.0
constr_violation: 0.0
execution_time: 0.2951819896697998
fun: 1.3046631387761482e-08
grad: array([0.00000000e+00, 0.00000000e+00, 8.92175218e-12, 8.92175218e-12])
jac: [<4x4 sparse matrix of type '<class 'numpy.float64'>'
with 4 stored elements in Compressed Sparse Row format>]
lagrangian_grad: array([-3.60651033e-09, 4.89643010e-09, 2.21847918e-09, 2.21847918e-09])
message: '`gtol` termination condition is satisfied.'
method: 'tr_interior_point'
nfev: 20
nhev: 0
nit: 14
niter: 14
njev: 4
optimality: 4.896430096425101e-09
status: 1
success: True
tr_radius: 478515625.0
v: [array([-3.60651033e-09, 4.89643010e-09, 2.20955743e-09, 2.20955743e-09])]
x: array([8.54635975e-01, 5.99253512e+02, 9.95614973e+02, 9.95614973e+02])
Last but not least: using smooth approximations is a common way to achieve differentiability. However, it's worth mentioning that these approximations are not convex. In practice, this means that your optimization problem is not convex and thus you have no guarantee that a found stationary point (local minimizer) is a global optimum. For that, one either needs to use a global optimization algorithm or formulate the problem as a MIP. The latter is the recommended approach, both from a mathematical and a practical point of view.

Trying to solve this nonlinear optimization using GEKKO, getting this error

#Error: setting an array element with a sequence
I am trying to minimize the downside risk.
I have a two-dimensional array of returns with shape (1000, 10), and the portfolio starts with $100. That amount is compounded 10 times by the returns in a row, and this is done for all rows. Each row's final value is then compared with the mean of the last column's values; the value is kept if it's less than the mean and set to zero otherwise. So we end up with an array of shape (1000, 1). At the end I take the standard deviation of that.
The objective is to minimize that standard deviation.
Constraints: the weights need to sum to at most 1,
and the expected return, i.e. wt*ret, should be equal to a value like 7%. I have to do that for a couple of values like 7%, 8% or 10%.
import numpy as np
from gekko import GEKKO

wt = np.array([0.4, 0.3, 0.3])
cov = np.array([[0.00026566, 0.00016167, 0.00011949],
                [0.00016167, 0.00065866, 0.00021662],
                [0.00011949, 0.00021662, 0.00043748]])
ret = [.098, 0.0620, .0720]
iterations = 10000
return_sim = np.random.multivariate_normal(ret, cov, iterations)

def simulations(wt):
    downside = []
    fund_ret = np.zeros((1000, 10))
    prt_ret = np.dot(return_sim, wt)
    re_ret = np.array(prt_ret).reshape(1000, 10)  # 10 years
    for m in range(len(re_ret)):
        fund_ret[m][0] = 100 * (1 + re_ret[m][0])  # start with $100
        for n in range(9):
            fund_ret[m][n+1] = fund_ret[m][n] * (1 + re_ret[m][n+1])
    mean = np.mean(fund_ret[:, -1])  # just need the last column and all rows
    for i in range(1000):
        downside.append(np.maximum((mean - fund_ret[i, -1]), 0))
    return np.std(downside)

b = GEKKO()
w = b.Array(b.Var, 3, value=0.33, lb=1e-5, ub=1)
b.Equation(b.sum(w) <= 1)
b.Equation(np.dot(w, ret) == .07)
b.Minimize(simulations(w))
b.solve(disp=False)
#simulations(wt)
If you comment out the gekko section and call the simulation function at the bottom, it works fine
In this case, you would want to consider a different optimizer such as scipy.optimize.minimize. The function np.std() is not currently supported in Gekko. Gekko compiles the model into byte-code for automatic differentiation, so you need to fit the problem into a form that is supported. Gekko's approach has several advantages, especially for large-scale or nonlinear problems. For small problems with fewer than 100 variables and nearly linear constraints, an optimizer such as scipy.optimize.minimize is often a viable option. Here is your problem with a solution:
import numpy as np
from scipy.optimize import minimize

wt = np.array([0.4, 0.3, 0.3])
cov = np.array([[0.00026566, 0.00016167, 0.00011949],
                [0.00016167, 0.00065866, 0.00021662],
                [0.00011949, 0.00021662, 0.00043748]])
ret = [.098, 0.0620, .0720]
iterations = 10000
return_sim = np.random.multivariate_normal(ret, cov, iterations)

def simulations(wt):
    downside = []
    fund_ret = np.zeros((1000, 10))
    prt_ret = np.dot(return_sim, wt)
    re_ret = np.array(prt_ret).reshape(1000, 10)  # 10 years
    for m in range(len(re_ret)):
        fund_ret[m][0] = 100 * (1 + re_ret[m][0])  # start with $100
        for n in range(9):
            fund_ret[m][n+1] = fund_ret[m][n] * (1 + re_ret[m][n+1])
    # just need the last column and all rows
    mean = np.mean(fund_ret[:, -1])
    for i in range(1000):
        downside.append(np.maximum((mean - fund_ret[i, -1]), 0))
    return np.std(downside)

b = (1e-5, 1); bnds = (b, b, b)
cons = ({'type': 'ineq', 'fun': lambda x: sum(x)-1},
        {'type': 'eq', 'fun': lambda x: np.dot(x, ret)-.07})
sol = minimize(simulations, wt, bounds=bnds, constraints=cons)
w = sol.x
print(w)
This produces the solution sol with optimal values w=sol.x:
fun: 6.139162309118155
jac: array([ 8.02691203, 10.04863131, 9.49171901])
message: 'Optimization terminated successfully.'
nfev: 33
nit: 6
njev: 6
status: 0
success: True
x: array([0.09741111, 0.45326888, 0.44932001])
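As a side note, the two Python loops inside simulations can be replaced by vectorized NumPy operations. A sketch that computes the same downside measure (assuming return_sim from above is in scope):
def simulations_vectorized(wt):
    re_ret = np.dot(return_sim, wt).reshape(1000, 10)     # 10 yearly returns per path
    fund_ret = 100.0 * np.cumprod(1.0 + re_ret, axis=1)   # compound $100 along each row
    final = fund_ret[:, -1]                                # portfolio value after year 10
    downside = np.maximum(final.mean() - final, 0.0)       # shortfall below the mean, else 0
    return np.std(downside)
This avoids the per-row loops and is typically much faster, which matters when the optimizer calls the objective many times.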

Sample from array of Beta distributions in Tensorflow2

I have an array params with shape (N, 2), where each pair corresponds to a Beta distribution, e.g. (alpha, beta). I need to sample a value from each of these distributions; is there a way to do this without looping in Python? I know I can do this:
u = [tfp.distributions.Beta(a, b).sample() for (a, b) in params]
# Or
u = [np.random.beta(a, b) for (a, b) in params]
but I imagine this to be very slow. I am using tensorflow, tensorflow_probability and numpy btw.
EDIT:
Thanks to JST99's answer, I found out that it is possible to pass a vector of parameters to tfp.distributions.Beta as well:
u = tfp.distributions.Beta(params[:, 0], params[:, 1]).sample()
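For example, with a small batch of parameters (a sketch; params here is just dummy data of shape (N, 2)):
import numpy as np
import tensorflow_probability as tfp

params = np.random.uniform(0.5, 5.0, size=(4, 2)).astype("float32")
dist = tfp.distributions.Beta(params[:, 0], params[:, 1])  # a batch of N Beta distributions
u = dist.sample()                                          # one draw per distribution
print(u.shape)  # (4,)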
You can separate out the alpha and the beta parameters using zip, then use np.random.beta in a vectorized fashion.
alpha, beta = zip(*params)
u = np.random.beta(alpha, beta)
Here is an example with some dummy parameters.
>>> params = [(i, i) for i in range(1, 100)]
>>> alpha, beta = zip(*params)
>>> u = np.random.beta(alpha, beta)
>>> u
array([0.88027947, 0.22507079, 0.46668932, 0.65091097, 0.62278597,
0.37450051, 0.53237829, 0.5589561 , 0.55190015, 0.64352003,
0.61396155, 0.58559066, 0.59525124, 0.49827492, 0.45065234,
0.53716919, 0.52950708, 0.41751582, 0.44912503, 0.57043946,
0.51909876, 0.34834858, 0.55753122, 0.41586101, 0.46762533,
0.4905744 , 0.53927006, 0.5234163 , 0.56215437, 0.38265575,
0.4940874 , 0.45066854, 0.53654453, 0.40955841, 0.49478651,
0.52974175, 0.43218663, 0.49791192, 0.47176042, 0.46717939,
0.45576387, 0.58941562, 0.44112651, 0.45401485, 0.48990107,
0.5640564 , 0.46720441, 0.439157 , 0.56098725, 0.43914691,
0.44654769, 0.5639682 , 0.41962566, 0.53689739, 0.46501042,
0.52775508, 0.55992535, 0.4948104 , 0.54856768, 0.4711496 ,
0.44694159, 0.54769584, 0.51792418, 0.48669042, 0.51969972,
0.51599904, 0.4818758 , 0.47555456, 0.47581746, 0.43417686,
0.49156854, 0.51359563, 0.52830314, 0.50988281, 0.47357164,
0.47619267, 0.52755645, 0.50141785, 0.48280575, 0.47817313,
0.47954096, 0.53885494, 0.5218641 , 0.50253071, 0.58804552,
0.50788384, 0.49429312, 0.47677202, 0.45542669, 0.47169082,
0.58838068, 0.4992328 , 0.5098452 , 0.44761298, 0.45971338,
0.4841432 , 0.47673295, 0.48205439, 0.4799415 ])

Numerical gradient for nonlinear function in numpy/scipy

I'm trying to implement a numerical gradient calculation in numpy to be used as the callback function for the gradient in cyipopt. My understanding of the numpy gradient function is that it should return the gradient calculated at a point based on a finite-difference approximation.
I don't understand how I would be able to implement the gradient of a nonlinear function with this module. The sample problem given appears to be a linear function.
>>> f = np.array([1, 2, 4, 7, 11, 16], dtype=np.float)
>>> np.gradient(f)
array([ 1. , 1.5, 2.5, 3.5, 4.5, 5. ])
>>> np.gradient(f, 2)
array([ 0.5 , 0.75, 1.25, 1.75, 2.25, 2.5 ])
My code snippet is as follows:
import numpy as np
# Hock & Schittkowski test problem #40
x = np.mgrid[0.75:0.85:0.01, 0.75:0.8:0.01, 0.75:0.8:0.01, 0.75:0.8:0.01]
# target is evaluation at x = [0.8, 0.8, 0.8, 0.8]
f = -x[0] * x[1] * x[2] * x[3]
g = np.gradient(f)
print g
The other downside of this is that I have to evaluate x at several points (and it returns the gradient at several points)
Is there a better option in numpy/scipy for the gradient to be numerically evaluated at a single point so I can implement this as a callback function?
First of all, some warnings:
numerical-optimization is hard to do right
ipopt is very complex software
combining ipopt with numerical-differentiation sounds like you are asking for trouble, but that depends on your problem of course
ipopt is almost always based on automatic-differentiation tools and not numerical-differentiation!
And some more:
as this is a complex task and the state of python + ipopt is not as nice as in some other languages (julia + JuMP for example), it's a bit of work
And some alternatives:
use pyomo which wraps ipopt and has automatic-differentiation
use casadi which also wraps ipopt and has automatic-differentiation
use autograd to automatically calculate gradients on a subset of numpy-code
then use cyipopt to add those
scipy.optimize.minimize with solvers SLSQP or COBYLA, which can do everything for you (SLSQP can use equality and inequality constraints; COBYLA only inequality constraints, where emulating an equality constraint by the pair x >= y and x <= y can work); a short SLSQP sketch for your test problem follows this list
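To make the last alternative concrete, here is a minimal sketch of your test problem (Hock & Schittkowski #40, see below) solved with SLSQP; the known optimum is roughly x = (0.7937, 0.7071, 0.5297, 0.8409) with objective -0.25:
import numpy as np
from scipy.optimize import minimize

def obj(x):
    return -np.prod(x)  # -x1*x2*x3*x4

cons = [
    {'type': 'eq', 'fun': lambda x: x[0]**3 + x[1]**2 - 1},
    {'type': 'eq', 'fun': lambda x: x[0]**2 * x[3] - x[2]},
    {'type': 'eq', 'fun': lambda x: x[3]**2 - x[1]},
]

res = minimize(obj, x0=[0.8, 0.8, 0.8, 0.8], method='SLSQP', constraints=cons)
print(res.x)  # expected to land close to [0.7937, 0.7071, 0.5297, 0.8409]
Since no gradients are supplied, SLSQP estimates them by finite differences internally.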
Approaching your task with your tools
Your complete example-problem is defined in Test Examples for Nonlinear Programming Codes:
Here is some code, based on numerical-differentiation, solving your test-problem, including the official setup (function, gradients, start-point, bounds, ...)
import numpy as np
import scipy.sparse as sps
import ipopt
from scipy.optimize import approx_fprime
class Problem40(object):
    """ Hock & Schittkowski test problem #40

    Basic structure follows:
    - cyipopt example from https://pythonhosted.org/ipopt/tutorial.html#defining-the-problem
    - which follows ipopt's docs from: https://www.coin-or.org/Ipopt/documentation/node22.html

    Changes:
    - numerical-diff using scipy for function & constraints
    - removal of hessian-calculation
        - we will use limited-memory approximation
        - ipopt docs: https://www.coin-or.org/Ipopt/documentation/node31.html
        - (because i'm too lazy to reason about the math; lagrange and co.)
    """
    def __init__(self):
        self.num_diff_eps = 1e-8  # maybe tuning needed!

    def objective(self, x):
        # callback for objective
        return -np.prod(x)  # -x1 x2 x3 x4

    def constraint_0(self, x):
        return np.array([x[0]**3 + x[1]**2 - 1])

    def constraint_1(self, x):
        return np.array([x[0]**2 * x[3] - x[2]])

    def constraint_2(self, x):
        return np.array([x[3]**2 - x[1]])

    def constraints(self, x):
        # callback for constraints
        return np.concatenate([self.constraint_0(x),
                               self.constraint_1(x),
                               self.constraint_2(x)])

    def gradient(self, x):
        # callback for gradient
        return approx_fprime(x, self.objective, self.num_diff_eps)

    def jacobian(self, x):
        # callback for jacobian
        return np.concatenate([
            approx_fprime(x, self.constraint_0, self.num_diff_eps),
            approx_fprime(x, self.constraint_1, self.num_diff_eps),
            approx_fprime(x, self.constraint_2, self.num_diff_eps)])

    def hessian(self, x, lagrange, obj_factor):
        return False  # we will use quasi-newton approaches to use hessian-info

    # progress callback
    def intermediate(
            self,
            alg_mod,
            iter_count,
            obj_value,
            inf_pr,
            inf_du,
            mu,
            d_norm,
            regularization_size,
            alpha_du,
            alpha_pr,
            ls_trials
            ):
        print("Objective value at iteration #%d is - %g" % (iter_count, obj_value))
# Remaining problem definition; still following official source:
# http://www.ai7.uni-bayreuth.de/test_problem_coll.pdf
# start-point -> infeasible
x0 = [0.8, 0.8, 0.8, 0.8]
# variable-bounds -> empty => np.inf-approach deviates from cyipopt docs!
lb = [-np.inf, -np.inf, -np.inf, -np.inf]
ub = [np.inf, np.inf, np.inf, np.inf]
# constraint bounds -> c == 0 needed -> both bounds = 0
cl = [0, 0, 0]
cu = [0, 0, 0]
nlp = ipopt.problem(
    n=len(x0),
    m=len(cl),
    problem_obj=Problem40(),
    lb=lb,
    ub=ub,
    cl=cl,
    cu=cu
)
# IMPORTANT: need to use limited-memory / lbfgs here as we didn't give a valid hessian-callback
nlp.addOption(b'hessian_approximation', b'limited-memory')
x, info = nlp.solve(x0)
print(x)
print(info)
# CORRECT RESULT & SUCCESSFUL STATE
Output:
******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
Ipopt is released as open source code under the Eclipse Public License (EPL).
For more information visit http://projects.coin-or.org/Ipopt
******************************************************************************
This is Ipopt version 3.12.8, running with linear solver mumps.
NOTE: Other linear solvers might be more efficient (see Ipopt documentation).
Number of nonzeros in equality constraint Jacobian...: 12
Number of nonzeros in inequality constraint Jacobian.: 0
Number of nonzeros in Lagrangian Hessian.............: 0
Total number of variables............................: 4
variables with only lower bounds: 0
variables with lower and upper bounds: 0
variables with only upper bounds: 0
Total number of equality constraints.................: 3
Total number of inequality constraints...............: 0
inequality constraints with only lower bounds: 0
inequality constraints with lower and upper bounds: 0
inequality constraints with only upper bounds: 0
Objective value at iteration #0 is - -0.4096
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
0 -4.0960000e-01 2.88e-01 2.53e-02 0.0 0.00e+00 - 0.00e+00 0.00e+00 0
Objective value at iteration #1 is - -0.255391
1 -2.5539060e-01 1.28e-02 2.98e-01 -11.0 2.51e-01 - 1.00e+00 1.00e+00h 1
Objective value at iteration #2 is - -0.249299
2 -2.4929898e-01 8.29e-05 3.73e-01 -11.0 7.77e-03 - 1.00e+00 1.00e+00h 1
Objective value at iteration #3 is - -0.25077
3 -2.5076955e-01 1.32e-03 3.28e-01 -11.0 2.46e-02 - 1.00e+00 1.00e+00h 1
Objective value at iteration #4 is - -0.250025
4 -2.5002535e-01 4.06e-05 1.93e-02 -11.0 4.65e-03 - 1.00e+00 1.00e+00h 1
Objective value at iteration #5 is - -0.25
5 -2.5000038e-01 6.57e-07 1.70e-04 -11.0 5.46e-04 - 1.00e+00 1.00e+00h 1
Objective value at iteration #6 is - -0.25
6 -2.5000001e-01 2.18e-08 2.20e-06 -11.0 9.69e-05 - 1.00e+00 1.00e+00h 1
Objective value at iteration #7 is - -0.25
7 -2.5000000e-01 3.73e-12 4.42e-10 -11.0 1.27e-06 - 1.00e+00 1.00e+00h 1
Number of Iterations....: 7
(scaled) (unscaled)
Objective...............: -2.5000000000225586e-01 -2.5000000000225586e-01
Dual infeasibility......: 4.4218750883118219e-10 4.4218750883118219e-10
Constraint violation....: 3.7250202922223252e-12 3.7250202922223252e-12
Complementarity.........: 0.0000000000000000e+00 0.0000000000000000e+00
Overall NLP error.......: 4.4218750883118219e-10 4.4218750883118219e-10
Number of objective function evaluations = 8
Number of objective gradient evaluations = 8
Number of equality constraint evaluations = 8
Number of inequality constraint evaluations = 0
Number of equality constraint Jacobian evaluations = 8
Number of inequality constraint Jacobian evaluations = 0
Number of Lagrangian Hessian evaluations = 0
Total CPU secs in IPOPT (w/o function evaluations) = 0.016
Total CPU secs in NLP function evaluations = 0.000
EXIT: Optimal Solution Found.
[ 0.79370053 0.70710678 0.52973155 0.84089641]
{'x': array([ 0.79370053, 0.70710678, 0.52973155, 0.84089641]), 'g': array([ 3.72502029e-12, -3.93685085e-13, 5.86974913e-13]), 'obj_val': -0.25000000000225586, 'mult_g': array([ 0.49999999, -0.47193715, 0.35355339]), 'mult_x_L': array([ 0., 0., 0., 0.]), 'mult_x_U': array([ 0., 0., 0., 0.]), 'status': 0, 'status_msg': b'Algorithm terminated successfully at a locally optimal point, satisfying the convergence tolerances (can be specified by options).'}
Remarks about the code
We use scipy's approx_fprime, which was basically added for all the gradient-based optimizers in scipy.optimize (a short usage example follows these remarks)
As stated in the sources, I did not take care of Ipopt's need for the Hessian, and we use Ipopt's Hessian approximation instead
the basic idea is described at wiki: LBFGS
I ignored Ipopt's need for the sparsity structure of the Jacobian of the constraints
a default assumption is used: the default Hessian structure is a lower triangular matrix, and I won't give any guarantees on what can happen here (bad performance vs. breaking everything)
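For reference, approx_fprime just evaluates the function at perturbed points; a minimal usage sketch on the objective of this problem:
import numpy as np
from scipy.optimize import approx_fprime

def objective(x):
    return -np.prod(x)  # -x1*x2*x3*x4

x0 = np.array([0.8, 0.8, 0.8, 0.8])
print(approx_fprime(x0, objective, 1e-8))  # approximately [-0.512, -0.512, -0.512, -0.512]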
I think you have some kind of misunderstanding about what a mathematical function is and what its numerical implementation is.
You should define your function as:
def func(x1, x2, x3, x4):
    return -x1*x2*x3*x4
Now you want to evaluate your function at specific points, which you can do using the np.mgrid you provided.
If you want to compute your gradient, use scipy.misc.derivative (https://docs.scipy.org/doc/scipy/reference/generated/scipy.misc.derivative.html); watch out, the default value of the dx parameter is usually bad, change it to 1e-5. There is no difference between linear and non-linear functions for the numerical evaluation of the gradient, only that for a non-linear function the gradient won't be the same everywhere.
What you did with np.gradient was actually to compute the gradient from the points in your array, with the definition of your function hidden by your definition of f, thus not allowing for multiple gradient evaluations at different points. Using your method also makes you dependent on your discretisation step.
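For completeness, here is a minimal central-difference sketch that evaluates the gradient of such a function at a single point (a hand-rolled alternative to scipy.misc.derivative, which only handles one variable at a time):
import numpy as np

def func(x1, x2, x3, x4):
    return -x1 * x2 * x3 * x4

def numerical_gradient(f, point, dx=1e-5):
    """Central-difference gradient of f at a single point."""
    point = np.asarray(point, dtype=float)
    grad = np.zeros_like(point)
    for i in range(point.size):
        step = np.zeros_like(point)
        step[i] = dx
        grad[i] = (f(*(point + step)) - f(*(point - step))) / (2 * dx)
    return grad

print(numerical_gradient(func, [0.8, 0.8, 0.8, 0.8]))  # ~[-0.512, -0.512, -0.512, -0.512]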

IDL's INT_TABULATE - SciPy equivalent?

I am working on moving some code from IDL into Python. One IDL call is to INT_TABULATED, which performs integration over a fixed range.
The INT_TABULATED function integrates a tabulated set of data { xi , fi } on the closed interval [MIN(x) , MAX(x)], using a five-point Newton-Cotes integration formula.
Result = INT_TABULATED( X, F [, /DOUBLE] [, /SORT] )
Where result is the area under the curve.
IDL DOCS
My question is: does NumPy/SciPy offer a similar form of integration? I see that scipy.integrate.newton_cotes exists, but it appears to return "weights and error coefficient for Newton-Cotes integration" instead of the area.
SciPy does not provide such a high-order integrator for tabulated data by default. The closest you have available without coding it yourself is scipy.integrate.simps, which uses a 3-point Newton-Cotes method.
If you simply want comparable integration precision, you could split your x and f arrays into 5-point chunks and integrate them one at a time, using the weights returned by scipy.integrate.newton_cotes, doing something along the lines of:
import numpy as np
import scipy.integrate

def idl_tabulate(x, f, p=5):
    def newton_cotes(x, f):
        if x.shape[0] < 2:
            return 0
        rn = (x.shape[0] - 1) * (x - x[0]) / (x[-1] - x[0])
        weights = scipy.integrate.newton_cotes(rn)[0]
        return (x[-1] - x[0]) / (x.shape[0] - 1) * np.dot(weights, f)
    ret = 0
    for idx in range(0, x.shape[0], p - 1):
        ret += newton_cotes(x[idx:idx + p], f[idx:idx + p])
    return ret
This does 5-point Newton-Cotes on all intervals, except perhaps the last, where it does a Newton-Cotes of whatever number of points remains. Unfortunately, this will not give you the same results as INT_TABULATED because the internal methods are different:
SciPy calculates the weights for points that are not equally spaced using what looks like a least-squares fit; I don't fully understand what is going on, but the code is pure Python, and you can find it in your SciPy installation in the file scipy\integrate\quadrature.py.
INT_TABULATED always performs 5-point Newton-Cotes on equispaced data. If the data are not equispaced, it builds an equispaced grid, using a cubic spline to interpolate the values at those points. You can check the code here.
For the example in the INT_TABULATED docstring, which is supposed to return 1.6271 using the original code and has an exact solution of 1.6405, the above function returns:
>>> x = np.array([0.0, 0.12, 0.22, 0.32, 0.36, 0.40, 0.44, 0.54, 0.64,
... 0.70, 0.80])
>>> f = np.array([0.200000, 1.30973, 1.30524, 1.74339, 2.07490, 2.45600,
... 2.84299, 3.50730, 3.18194, 2.36302, 0.231964])
>>> idl_tabulate(x, f)
1.641998154242472
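To mimic INT_TABULATED more closely, one could first resample onto an equispaced grid with a cubic spline and then apply the composite 5-point (Boole's) rule. This is only a sketch of that idea, not a reproduction of IDL's exact segmenting logic:
import numpy as np
from scipy.interpolate import CubicSpline

def idl_like_tabulate(x, f, p=5):
    # resample onto an equispaced grid whose interval count is a multiple of p - 1
    n = x.shape[0]
    segments = -(-(n - 1) // (p - 1))              # ceil((n - 1) / (p - 1))
    m = segments * (p - 1) + 1
    xs = np.linspace(x[0], x[-1], m)
    fs = CubicSpline(x, f)(xs)
    h = (x[-1] - x[0]) / (m - 1)
    w = 2 * h / 45 * np.array([7, 32, 12, 32, 7])  # Boole's rule weights for 5 equispaced points
    return sum(np.dot(w, fs[i:i + p]) for i in range(0, m - 1, p - 1))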
