Normalization for optimization in Python

During optimization, it is often helpful to normalize the input parameters so that they are on the same order of magnitude, which can improve convergence considerably. For example, if we want to minimize f(x) and a reasonable initial guess is x0=[1e3, 1e-4], it might be helpful to normalize x0[0] and x0[1] to about the same order of magnitude (often O(1)).
My question: I have been using scipy.optimize, specifically the L-BFGS-B algorithm. Do I need to normalize manually by writing a function, or does the algorithm already do it for me?
Thank you!

I wrote a quick small program to test your question.
To summarize: if the parameters are within a couple of orders of magnitude of each other, the algorithm handles it (it converges successfully and does not need significantly more function evaluations).
On the other hand, once the factor reaches about 1e5, the algorithm starts to break down: most runs error out, and some that report success (factors 1000000 and 300000000 below) stop at a point that is not the minimum.
Here is the program:
import scipy.optimize

def test_lbfgsb():
    def surface(x):
        # factor is read from the enclosing scope each time surface is called
        return (x[0] - 3.0) ** 2 + (factor * x[1] - 4.0) ** 2

    factor = None
    for exponent in range(0, 9):
        magnitude = 10 ** exponent
        factors = [x * magnitude for x in [1, 3]]
        for factor in factors:
            optimization_result = scipy.optimize.minimize(surface, [0, 0], method='l-bfgs-b')
            desc = 'at factor %d' % (factor)
            if not optimization_result.success:
                print('%s FAILURE (%s)' % (desc, optimization_result.message))
            else:
                print('%s, found min at %s, after %d evaluations' % (
                    desc, optimization_result.x, optimization_result.nfev))

test_lbfgsb()
Here is its output:
at factor 1, found min at [ 3.00000048 4.00000013], after 12 evaluations
at factor 3, found min at [ 2.99999958 1.33333351], after 36 evaluations
at factor 10, found min at [ 3.00000059 0.39999999], after 28 evaluations
at factor 30, found min at [ 2.99999994 0.13333333], after 36 evaluations
at factor 100, found min at [ 3.00000013 0.03999999], after 40 evaluations
at factor 300, found min at [ 3. 0.01333333], after 52 evaluations
at factor 1000, found min at [ 3. 0.00399999], after 64 evaluations
at factor 3000, found min at [ 3.00000006e+00 1.33332833e-03], after 72 evaluations
at factor 10000, found min at [ 3.00002680e+00 3.99998309e-04], after 92 evaluations
at factor 30000, found min at [ 3.00000002e+00 1.33328333e-04], after 104 evaluations
at factor 100000 FAILURE (ABNORMAL_TERMINATION_IN_LNSRCH)
at factor 300000, found min at [ 3.00013621e+00 1.33292531e-05], after 196 evaluations
at factor 1000000, found min at [ 3.00000348e-12 3.99500004e-06], after 60 evaluations
at factor 3000000 FAILURE (ABNORMAL_TERMINATION_IN_LNSRCH)
at factor 10000000 FAILURE (ABNORMAL_TERMINATION_IN_LNSRCH)
at factor 30000000 FAILURE (ABNORMAL_TERMINATION_IN_LNSRCH)
at factor 100000000 FAILURE (ABNORMAL_TERMINATION_IN_LNSRCH)
at factor 300000000, found min at [ 3.33333330e-17 8.33333350e-09], after 72 evaluations
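Since the failures above come from the difference in scale, one option is to rescale by hand before calling the optimizer. The following is only a minimal sketch, assuming you know a typical magnitude for each parameter; the names scale and scaled_surface are illustrative and not part of SciPy. The optimizer works on z = x / scale, which is O(1) in every coordinate, and the result is mapped back afterwards.

```python
import numpy as np
import scipy.optimize

factor = 1e6  # the second parameter lives at the 1e-6 scale

def surface(x):
    return (x[0] - 3.0) ** 2 + (factor * x[1] - 4.0) ** 2

# Hypothetical typical magnitude of each parameter (an assumption of this sketch)
scale = np.array([1.0, 1.0 / factor])

def scaled_surface(z):
    # z is the O(1) variable the optimizer sees; x = z * scale is the original one
    return surface(z * scale)

result = scipy.optimize.minimize(scaled_surface, [0.0, 0.0], method='L-BFGS-B')
x_opt = result.x * scale  # map the solution back to the original units
print(result.success, x_opt)
```

With this change of variables the problem the optimizer sees is the same as the factor 1 case, whatever the value of factor.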

Related

Simulating for loop 100 times with a matrix

I am currently trying to run a simulation of MBSs 100 times. I have written the for loop that I need, but I need to make this loop run 100 times. After consulting with a friend, I believe that I need to specify the "size" parameter in np.random.normal as a matrix; however, my coding skills are limited and I would greatly appreciate your help with doing so. Specifically, for a sequence of values of the correlation parameter ρ (rho) between 0 and 1, I need to simulate 100 MBSs and report the average payoff of each tranche across the simulations. Below is my code with notes.
EDIT:
I appreciate the code proposed in the answers and it is indeed very useful. I have one last hurdle: how do I include a payment structure that is sequential? Specifically, the junior tranche is the first to absorb losses from the underlying collateral pool and does so until the portfolio loss exceeds 5% (i.e. the proportion of defaults exceeds 10%), at which point the junior tranche becomes worthless. The mezzanine tranche begins to absorb losses once the portfolio loss exceeds 5% and continues to do so until the portfolio loss reaches 10% (i.e. the proportion of defaults exceeds 20%). Finally, the senior tranche absorbs portfolio losses in excess of 10%. As of now we've only considered an average payoff given a specific share of the total payoff.
import numpy as np

rho_list = [0, 0.2, 0.5, 0.6, 1]
# parameters
n_borrowers = 10
payoff_default = 0.5
payoff_nodefault = 1
threshold = -1.65
# draw of income shocks (I have to draw new values of s and eps for each simulation)
s = np.random.normal(0, 1, size=1)  # common s for all borrowers
eps = np.random.normal(0, 1, size=n_borrowers)  # each borrower has their own eps
for rho in rho_list:
    # compute the borrower's income
    x = np.sqrt(rho) * s + np.sqrt(1 - rho) * eps
    # which borrower defaults?
    loan_payoff = (x < threshold) * payoff_default + (x >= threshold) * payoff_nodefault
    # total pool
    total_payoff = np.sum(loan_payoff)
    # how much does each investor receive from total_payoff?
    senior_payoff = 0.82 * total_payoff
    mezz_payoff = 0.12 * total_payoff
    junior_payoff = 0.06 * total_payoff  # 0.06 so that the three shares sum to 1
    print('total: {}, senior: {}, mezz: {}, junior: {}'.format(total_payoff, senior_payoff, mezz_payoff, junior_payoff))
# next steps (this is what I need help with)
# repeat this for 100 simulations and compute the average payoff to each investor
# is it possible to generate the income for all simulations in one step?
# Idea: specify the "size" parameter in np.random.normal as a matrix
I'm not certain I've understood the problem entirely correctly, but I think this does what you want without using any for loops by taking advantage of numpy's broadcasting. I'm by no means an expert in numpy, and multidimensional calculations are something I'm not super comfortable with but I believe my logic is sound. I'd be more than happy for any feedback.
Solution
# Setup
import pandas as pd
import numpy as np
investors = np.array([0.82, 0.12, 0.06])
rhos = np.linspace(0, 1, 11)[..., None, None]
default = 0.5
nodefault = 1
thresh = -1.65
n_sims = 100
n_borrowers = 10
s = np.random.normal(0, 1, size=(1, n_sims, 1))
eps = np.random.normal(0, 1, size=(n_sims, n_borrowers))
# Solution
x = np.sqrt(rhos) * s + np.sqrt(1 - rhos) * eps
payoffs = (x < thresh) * default + (x >= thresh) * nodefault
avgs = payoffs.sum(axis=2).mean(axis=1)
investor_payouts = avgs[..., None] * investors[None, ...]
data = np.hstack([rhos.reshape(-1, 1), investor_payouts])
df = pd.DataFrame(data, columns=["rho", "senior", "mezz", "junior"])
Output:
    rho  senior    mezz  junior
0   0.0  8.0155  1.1730  0.5865
1   0.1  7.9909  1.1694  0.5847
2   0.2  7.9991  1.1706  0.5853
3   0.3  7.9868  1.1688  0.5844
4   0.4  7.9991  1.1706  0.5853
5   0.5  7.9786  1.1676  0.5838
6   0.6  7.9745  1.1670  0.5835
7   0.7  7.9745  1.1670  0.5835
8   0.8  7.9458  1.1628  0.5814
9   0.9  7.9007  1.1562  0.5781
10  1.0  7.8310  1.1460  0.5730
Rationale
With n_sims = 100 and n_borrowers = 10, and rhos = np.linspace(0, 1, 11) we have the shapes
>>> rhos.shape
(11, 1, 1)
>>> s.shape
(1, 100, 1)
>>> eps.shape
(100, 10)
The shapes of rhos and s are chosen so that broadcasting can be done more easily.
For each simulation, we essentially need to calculate the payoffs for each ρ. In essence, we want an array of shape (11, 100, 10) where along the first axis are the values of ρ, and the second and third axis are one hundred simulations of 10 borrowers.
The first term of your equation is sqrt(ρ) * s, and we want (11, 100, 1) so that we can broadcast later.
np.sqrt(rhos) * s
# shapes (11, 1, 1) * (1, 100, 1) = (11, 100, 1)
This gives us the same 100 simulated values for s, each multiplied by a different value of sqrt(ρ) (e.g., for ρ=0, which is the first value in rhos, the first row of this (11, 100) matrix is all zeros). We've added an extra dimension to get (11, 100, 1) in order to add to the second term.
The second term follows similar logic: we want the values of sqrt(1 - ρ) to be multiplied across 100 simulations of 10 borrowers. Since eps.shape == (100, 10), the raw np.linspace(0, 1, 11) has shape (11,), and we want (11, 100, 10), we need to add two new axes to rhos (which is why rhos.shape == (11, 1, 1) above):
np.sqrt(1 - rhos) * eps
# shapes (11, 1, 1) * (100, 10) = (11, 100, 10)
Now we want to combine those two terms for a final array of shape (11, 100, 10). This is why we gave the first term a new axis to get (11, 100, 1), which allows us to broadcast the values of the first term over the second term's last axis:
np.sqrt(rhos) * s + np.sqrt(1 - rhos) * eps
# shapes (11, 100, 1) + (11, 100, 10) = (11, 100, 10)
We're doing this because, in your original code, you are taking a scalar s and broadcasting it over eps, which was an array of length 10. In order to do that, numpy needed to broadcast s into an array of shape (10,) to match the shape of eps. We're doing the same thing here, except we're trying to do it for 100 simulations AND 11 different ρ values.
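To check the broadcasting argument above, the same array can be built with explicit loops and compared. This is only a sanity-check sketch; the seeded generator rng and the array x_loop are illustrative names that do not appear in the answer.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only to make the check repeatable
n_sims, n_borrowers = 100, 10
rhos = np.linspace(0, 1, 11)[:, None, None]   # shape (11, 1, 1)
s = rng.normal(size=(1, n_sims, 1))           # shape (1, 100, 1)
eps = rng.normal(size=(n_sims, n_borrowers))  # shape (100, 10)

# Broadcast version, as in the answer: shape (11, 100, 10)
x = np.sqrt(rhos) * s + np.sqrt(1 - rhos) * eps

# Loop version: one rho and one simulation at a time
x_loop = np.empty((11, n_sims, n_borrowers))
for i, rho in enumerate(rhos.ravel()):
    for j in range(n_sims):
        x_loop[i, j] = np.sqrt(rho) * s[0, j, 0] + np.sqrt(1 - rho) * eps[j]

print(x.shape, np.allclose(x, x_loop))
```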
After all that nasty broadcasting, we arrive at an array which we can now collapse into a sum across borrowers, (total_payoff = np.sum(loan_payoff) in your original code), and then an average across all 100 simulations, which is achieved by the axis arguments to those respective functions; axis 2 has 10 elements, representing the borrowers; axis 1 has 100 elements, representing each simulation. So we use
payoffs.sum(axis=2).mean(axis=1)
Note that the calculation of the intermediary x is the same as in your original code.
At this point, we've obtained the average total payoff for 100 simulations across 10 borrowers, for 11 different values of ρ. From here we want to break out the average payoff by investor. In other words, we have 11 average payoffs (one for each ρ), and 3 investor rates, and we want to broadcast the 3 investor rates over the 11 average payoffs to get an array of shape (11, 3).
Right now avgs.shape == (11,) and investors.shape == (3,) so we need to add some axes to get our desired result:
investor_payouts = avgs[..., None] * investors[None, ...]
# shapes (11, 1) * (1, 3) = (11, 3)
Finally, the np.hstack stuff isn't necessary, that's just me stacking the ρ values with the results so that I could put everything in a dataframe. You could just as easily create the resultant dataframe in a number of other ways, depending on what you need.
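The sequential payment structure described in the question's edit is not covered by the code above. One way it could be sketched is shown below, under assumptions that are not stated in the source: each loan has a face value of 1, the tranche sizes follow from the 5% and 10% loss thresholds (junior 5%, mezzanine 5%, senior 90% of the pool), and the helper name tranche_payoffs is hypothetical.

```python
import numpy as np

def tranche_payoffs(total_payoff, n_borrowers, junior_cap=0.05, mezz_cap=0.10):
    """Split pool payoffs sequentially.

    Returns an array with a last axis of length 3 (senior, mezz, junior),
    expressed as fractions of the pool's face value.
    """
    # portfolio loss as a fraction of face value (assumes face value 1 per loan)
    loss = 1.0 - np.asarray(total_payoff, dtype=float) / n_borrowers
    junior_loss = np.minimum(loss, junior_cap)                           # first 5% of losses
    mezz_loss = np.clip(loss - junior_cap, 0.0, mezz_cap - junior_cap)   # losses between 5% and 10%
    senior_loss = np.maximum(loss - mezz_cap, 0.0)                       # losses beyond 10%
    junior = junior_cap - junior_loss
    mezz = (mezz_cap - junior_cap) - mezz_loss
    senior = (1.0 - mezz_cap) - senior_loss
    return np.stack([senior, mezz, junior], axis=-1)

# 0, 2 and 5 defaults out of 10 borrowers, each default paying 0.5 instead of 1
example = tranche_payoffs(np.array([10.0, 9.0, 7.5]), n_borrowers=10)
print(example)
```

With the vectorized answer, payoffs.sum(axis=2) has shape (11, 100); passing it to such a function and averaging over axis 1 would give one average payoff per tranche for each ρ.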

Genetic Algorithm Population Individual as Array

I don't have much experience using Genetic Algorithms, so I would like to ask the community for some useful comments. I want to apologize for my terminology errors. Please, correct me if it's needed.
The problem I want to optimize is optimal power flow in an islanded microgrid. In the simple microgrid we have 2 diesel generators (DG), 1 PV array, 1 Energy Storage System (ESS) and Load. Let's assume we know the Load and PV array output power values for the next periods.
So the objective function to be minimized is OPEX, the sum of every microgrid component's operational cost at each moment t in period T:
where a, b are operational cost coefficients, the diesel generator status variable is binary (0/1 or ON/OFF), and P is the output power of the microgrid component at time t.
Here are some of the constraints (the real problem is heavily and nonlinearly constrained, so I wrote down only three of them):
Power balance
ESS maximum depth of discharge
Diesel genset power limits
So it's a mixed-integer problem with nonlinear constraints. I tried to adapt the problem so it can be solved with a genetic algorithm, using the pymoo Python library for multiobjective optimization with the NSGA2 algorithm. Let's consider a period T for which we have some Load and PV power data:
import numpy as np
from pymoo.model.problem import FunctionalProblem
from pymoo.factory import get_sampling, get_crossover, get_mutation, get_termination
from pymoo.operators.mixed_variable_operator import MixedVariableSampling, MixedVariableMutation, MixedVariableCrossover
from pymoo.algorithms.nsga2 import NSGA2
from pymoo.optimize import minimize
PV = np.array([10, 19.8, 16, 25, 7.8, 42.8, 10]) #PV inverter output power, kW
Load = np.array([100, 108, 150, 150, 90, 16, 170]) #Load, kW
balance_eps = 0.001 #equality constraint relaxing coefficient
DG1_pmin = 0.3 #DG1 min power
DG2_pmin = 0.3 #DG2 min power
P_dg1 = 75 #DG1 rated power, kW
P_dg2 = 75 #DG2 rated power, kW
P_PV_inv = 50 #PV inverter rated power, kW
P_ESS_inv = 30 #ESS bidirectional inverter absolute rated discharge/charge power, kW
ESS_c = 100 #ESS capacity, kWh
SOC_min = 30
SOC_max = 100
n_var = 5 #number of decision variables
objs = [lambda x: x[0]*x[2]*200 + x[1]*x[3]*200 + x[4]*0.002] #objective function
#t and SOC_t are globals that the time loop sets before each call to minimize
constr_eq = [lambda x: ((Load[t] - x[0]*x[2] - x[1]*x[3] - x[4] - PV[t] )**2)]
constr_ieq = [lambda x: -SOC_t + 100*x[4]/ESS_c + SOC_min,
              lambda x: SOC_t - 100*x[4]/ESS_c - SOC_max]
problem = FunctionalProblem(n_var, objs, constr_eq=constr_eq, constr_eq_eps=1e-03, constr_ieq=constr_ieq,
                            xl=np.array([0, 0, DG1_pmin*P_dg1, DG2_pmin*P_dg2, -P_ESS_inv]), xu=np.array([1, 1, P_dg1, P_dg2, P_ESS_inv]))
mask = ["int", "int", "real", "real", "real"]
sampling = MixedVariableSampling(mask, {
    "real": get_sampling("real_random"),
    "int": get_sampling("int_random")})
crossover = MixedVariableCrossover(mask, {
    "real": get_crossover("real_sbx", prob=1.0, eta=3.0),
    "int": get_crossover("int_sbx", prob=1.0, eta=3.0)})
mutation = MixedVariableMutation(mask, {
    "real": get_mutation("real_pm", eta=3.0),
    "int": get_mutation("int_pm", eta=3.0)})
algorithm = NSGA2(
    pop_size=150,
    sampling=sampling,
    crossover=crossover,
    mutation=mutation,
    eliminate_duplicates=True)
We have n_var = 5 decision variables being optimized: the two DG on/off statuses, the two DG output powers and the ESS power. We also need access to the previous value of SOC.
I wrote a loop that runs the optimizations consecutively, one per time step:
x=[]
s=[]
SOC_t = 100 #SOC at t = -1
for t in range (0, 7):
    res = minimize(
        problem,
        algorithm,
        seed=1,
        termination = get_termination("n_gen", 300),
        save_history=True, verbose=False)
    SOC_t = SOC_t - 100*res.X[4]/ESS_c
    print(res.X[:2], np.around(res.X[2:].astype(np.double), 3), np.around(SOC_t, 2))
    x.append(res.X)
    s.append(SOC_t)
So we initialize a population of size 150 at every time step t, and each individual in these populations is a vector of the n_var = 5 variables. Running this code I get the following optimization results:
[1 1] [27.272 34.635 28.071] 71.93
[0 1] [28.127 58.168 30. ] 41.93
[1 1] [50.95 71.423 11.599] 30.33
[1 1] [53.966 70.97 0.034] 30.3
[1 1] [24.636 59.236 -1.702] 32.0
[0 0] [40.831 29.184 -26.832] 58.83
[1 1] [68.299 63.148 28.572] 30.26
Even my limited experience with genetic algorithms allows me to state that such an approach is inappropriate and inefficient.
So, here is my question (if you're still reading my post :)
Is there a way to optimize such a problem not by consecutively optimizing a particular set of variables at each t, but by defining individuals in the population as arrays of size (T, n_var)?
For the problem described, an individual in the population would then hold one row of n_var decision variables per time step.
Is it possible to implement such an approach? If yes, how can it be done in pymoo?
Thank you very much for your time! Any comments and suggestions will be appreciated.
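One possible way to treat the whole horizon as a single individual, offered here only as a sketch and not as a confirmed pymoo recipe, is to flatten the (T, n_var) array into one vector of length T*n_var and evaluate the objective and constraints for all time steps at once, chaining the SOC with a cumulative sum. The helper name evaluate and the constant SOC0 are hypothetical; the cost coefficients and constraints are copied from the lambdas in the question.

```python
import numpy as np

PV = np.array([10, 19.8, 16, 25, 7.8, 42.8, 10])    # PV inverter output power, kW
Load = np.array([100, 108, 150, 150, 90, 16, 170])  # Load, kW
ESS_c = 100                  # ESS capacity, kWh
SOC0 = 100                   # SOC before the first step (hypothetical name)
SOC_min, SOC_max = 30, 100
T, n_var = 7, 5

def evaluate(flat):
    """Evaluate one individual given as a flat vector of length T * n_var."""
    X = np.asarray(flat, dtype=float).reshape(T, n_var)
    u1, u2, p1, p2, p_ess = X.T                      # one column per decision variable
    cost = np.sum(u1 * p1 * 200 + u2 * p2 * 200 + p_ess * 0.002)
    balance = Load - u1 * p1 - u2 * p2 - p_ess - PV  # should be 0 at every t
    soc = SOC0 - 100 * np.cumsum(p_ess) / ESS_c      # SOC after each step
    g = np.concatenate([SOC_min - soc, soc - SOC_max])  # feasible when every entry <= 0
    return cost, balance, g

# Example individual: both generators off, ESS idle at every step
individual = np.tile([0, 0, 22.5, 22.5, 0.0], T)
cost, balance, g = evaluate(individual)
print(cost, balance, g.max())
```

In such a setup the problem would be declared with T*n_var variables, with the bounds and the variable-type mask repeated T times; how exactly to wire that into pymoo depends on the pymoo version and is left out here.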

Trying to solve this non linear optimization using GEKKO, getting this error

#Error: setting an array element with a sequence
I am trying to minimize the downside risk.
I have a two-dimensional array of returns of shape (1000, 10), and the portfolio starts with $100. Compound that 10 times, once for each return in a row, and do that for all the rows. Compare the last cell's value in each row with the mean of the last column's values: keep the shortfall (mean minus value) if the value is less than the mean, or else zero. So we will have an array of shape (1000, 1). At the end I find the standard deviation of that.
The objective is to minimize the standard deviation.
Constraints: the weights need to sum to at most 1,
and the expected return, i.e. wt*ret, should be equal to a value like 7%. I have to do that for a couple of values like 7%, 8% or 10%.
import numpy as np
from gekko import GEKKO

wt = np.array([0.4, 0.3, 0.3])
cov = np.array([[0.00026566, 0.00016167, 0.00011949],
                [0.00016167, 0.00065866, 0.00021662],
                [0.00011949, 0.00021662, 0.00043748]])
ret = [.098, 0.0620, .0720]
iterations = 10000
return_sim = np.random.multivariate_normal(ret, cov, iterations)

def simulations(wt):
    downside = []
    fund_ret = np.zeros((1000, 10))
    prt_ret = np.dot(return_sim, wt)
    re_ret = np.array(prt_ret).reshape(1000, 10) #10 years
    for m in range(len(re_ret)):
        fund_ret[m][0] = 100 * (1 + re_ret[m][0]) #start with $100
        for n in range(9):
            fund_ret[m][n+1] = fund_ret[m][n] * (1 + re_ret[m][n+1])
    mean = np.mean(fund_ret[:,-1]) #just need the last column and all rows
    for i in range(1000):
        downside.append(np.maximum((mean - fund_ret[i,-1]), 0))
    return np.std(downside)

b = GEKKO()
w = b.Array(b.Var,3,value=0.33,lb=1e-5, ub=1)
b.Equation(b.sum(w)<=1)
b.Equation(np.dot(w,ret) == .07)
b.Minimize(simulations(w))
b.solve(disp=False)
#simulations(wt)
If you comment out the gekko section and call the simulation function at the bottom, it works fine.
In this case, you would want to consider a different optimizer such as scipy.optimize.minimize. The function np.std() is not currently supported in Gekko. Gekko compiles the model into byte-code for automatic differentiation, so you need to fit the problem into a form that is supported. Gekko's approach has several advantages, especially for large-scale or non-linear problems. For small problems with fewer than 100 variables and nearly linear constraints, an optimizer such as scipy.optimize.minimize is often a viable option. Here is your problem with a solution:
import numpy as np
from scipy.optimize import minimize

wt = np.array([0.4, 0.3, 0.3])
cov = np.array([[0.00026566, 0.00016167, 0.00011949],
                [0.00016167, 0.00065866, 0.00021662],
                [0.00011949, 0.00021662, 0.00043748]])
ret = [.098, 0.0620, .0720]
iterations = 10000
return_sim = np.random.multivariate_normal(ret, cov, iterations)

def simulations(wt):
    downside = []
    fund_ret = np.zeros((1000, 10))
    prt_ret = np.dot(return_sim, wt)
    re_ret = np.array(prt_ret).reshape(1000, 10) #10 years
    for m in range(len(re_ret)):
        fund_ret[m][0] = 100 * (1 + re_ret[m][0]) #start with $100
        for n in range(9):
            fund_ret[m][n+1] = fund_ret[m][n] * (1 + re_ret[m][n+1])
    #just need the last column and all rows
    mean = np.mean(fund_ret[:,-1])
    for i in range(1000):
        downside.append(np.maximum((mean - fund_ret[i,-1]), 0))
    return np.std(downside)

b = (1e-5,1); bnds=(b,b,b)
# Note: in scipy an 'ineq' constraint means fun(x) >= 0, so the first entry
# enforces sum(x) >= 1; use lambda x: 1-sum(x) to require sum(x) <= 1 instead.
cons = ({'type': 'ineq', 'fun': lambda x: sum(x)-1},\
        {'type': 'eq', 'fun': lambda x: np.dot(x,ret)-.07})
sol = minimize(simulations,wt,bounds=bnds,constraints=cons)
w = sol.x
print(w)
This produces the solution sol with optimal values w=sol.x:
fun: 6.139162309118155
jac: array([ 8.02691203, 10.04863131, 9.49171901])
message: 'Optimization terminated successfully.'
nfev: 33
nit: 6
njev: 6
status: 0
success: True
x: array([0.09741111, 0.45326888, 0.44932001])
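The compounding loops in simulations can also be written without Python loops, which matters because the optimizer calls the function many times. The following is a sketch of the same calculation using a product along each row; the function name downside_std and the seeded generator are illustrative choices, not part of the original answer.

```python
import numpy as np

def downside_std(return_sim, wt, n_paths=1000, n_years=10, start=100.0):
    """Standard deviation of the shortfall below the mean terminal value."""
    port = np.asarray(return_sim) @ np.asarray(wt)      # portfolio return per draw
    paths = port.reshape(n_paths, n_years)              # one row per 10-year path
    terminal = start * np.prod(1.0 + paths, axis=1)     # compound along each row
    shortfall = np.maximum(terminal.mean() - terminal, 0.0)
    return shortfall.std()

rng = np.random.default_rng(0)  # seeded so repeated calls see the same draws
ret = [0.098, 0.0620, 0.0720]
cov = np.array([[0.00026566, 0.00016167, 0.00011949],
                [0.00016167, 0.00065866, 0.00021662],
                [0.00011949, 0.00021662, 0.00043748]])
return_sim = rng.multivariate_normal(ret, cov, 10000)
risk = downside_std(return_sim, [0.4, 0.3, 0.3])
print(risk)
```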

Numerical gradient for nonlinear function in numpy/scipy

I'm trying to implement a numerical gradient calculation in numpy to be used as the callback function for the gradient in cyipopt. My understanding of the numpy gradient function is that it should return the gradient calculated at a point based on a finite difference approximation.
I don't understand how I would be able to implement the gradient of a nonlinear function with this module. The sample problem given appears to be a linear function.
>>> f = np.array([1, 2, 4, 7, 11, 16], dtype=float)
>>> np.gradient(f)
array([ 1. , 1.5, 2.5, 3.5, 4.5, 5. ])
>>> np.gradient(f, 2)
array([ 0.5 , 0.75, 1.25, 1.75, 2.25, 2.5 ])
My code snippet is as follows:
import numpy as np
# Hock & Schittkowski test problem #40
x = np.mgrid[0.75:0.85:0.01, 0.75:0.8:0.01, 0.75:0.8:0.01, 0.75:0.8:0.01]
# target is evaluation at x = [0.8, 0.8, 0.8, 0.8]
f = -x[0] * x[1] * x[2] * x[3]
g = np.gradient(f)
print(g)
The other downside of this is that I have to evaluate x at several points (and it returns the gradient at several points)
Is there a better option in numpy/scipy for the gradient to be numerically evaluated at a single point so I can implement this as a callback function?
First of all, some warnings:
- numerical optimization is hard to do right
- ipopt is very complex software
- combining ipopt with numerical differentiation sounds like you are asking for trouble, but that depends on your problem of course
- ipopt is almost always used with automatic-differentiation tools and not numerical differentiation!
And some more:
- as this is a complex task and the state of python + ipopt is not as nice as in some other languages (julia + JuMP for example), it's a bit of work
And some alternatives:
- use pyomo, which wraps ipopt and has automatic differentiation
- use casadi, which also wraps ipopt and has automatic differentiation
- use autograd to automatically calculate gradients on a subset of numpy code
  - then use cyipopt to add those
- use scipy.optimize.minimize with the solvers SLSQP or COBYLA, which can do everything for you (SLSQP can use equality and inequality constraints; COBYLA only inequality constraints, where emulating an equality constraint by the pair x >= y and x <= y can work)
Approaching your task with your tools
Your complete example-problem is defined in Test Examples for Nonlinear Programming Codes:
Here is some code, based on numerical-differentiation, solving your test-problem, including the official setup (function, gradients, start-point, bounds, ...)
import numpy as np
import scipy.sparse as sps
import ipopt
from scipy.optimize import approx_fprime

class Problem40(object):
    """ # Hock & Schittkowski test problem #40
        Basic structure follows:
        - cyipopt example from https://pythonhosted.org/ipopt/tutorial.html#defining-the-problem
        - which follows ipopt's docs from: https://www.coin-or.org/Ipopt/documentation/node22.html
        Changes:
        - numerical-diff using scipy for function & constraints
        - removal of hessian-calculation
          - we will use limited-memory approximation
            - ipopt docs: https://www.coin-or.org/Ipopt/documentation/node31.html
          - (because i'm too lazy to reason about the math; lagrange and co.)
    """
    def __init__(self):
        self.num_diff_eps = 1e-8  # maybe tuning needed!

    def objective(self, x):
        # callback for objective
        return -np.prod(x)  # -x1 x2 x3 x4

    def constraint_0(self, x):
        return np.array([x[0]**3 + x[1]**2 -1])

    def constraint_1(self, x):
        return np.array([x[0]**2 * x[3] - x[2]])

    def constraint_2(self, x):
        return np.array([x[3]**2 - x[1]])

    def constraints(self, x):
        # callback for constraints
        return np.concatenate([self.constraint_0(x),
                               self.constraint_1(x),
                               self.constraint_2(x)])

    def gradient(self, x):
        # callback for gradient
        return approx_fprime(x, self.objective, self.num_diff_eps)

    def jacobian(self, x):
        # callback for jacobian
        return np.concatenate([
            approx_fprime(x, self.constraint_0, self.num_diff_eps),
            approx_fprime(x, self.constraint_1, self.num_diff_eps),
            approx_fprime(x, self.constraint_2, self.num_diff_eps)])

    def hessian(self, x, lagrange, obj_factor):
        return False  # we will use quasi-newton approaches to use hessian-info

    # progress callback
    def intermediate(
            self,
            alg_mod,
            iter_count,
            obj_value,
            inf_pr,
            inf_du,
            mu,
            d_norm,
            regularization_size,
            alpha_du,
            alpha_pr,
            ls_trials
            ):
        print("Objective value at iteration #%d is - %g" % (iter_count, obj_value))

# Remaining problem definition; still following official source:
# http://www.ai7.uni-bayreuth.de/test_problem_coll.pdf
# start-point -> infeasible
x0 = [0.8, 0.8, 0.8, 0.8]
# variable-bounds -> empty => np.inf-approach deviates from cyipopt docs!
lb = [-np.inf, -np.inf, -np.inf, -np.inf]
ub = [np.inf, np.inf, np.inf, np.inf]
# constraint bounds -> c == 0 needed -> both bounds = 0
cl = [0, 0, 0]
cu = [0, 0, 0]
nlp = ipopt.problem(
    n=len(x0),
    m=len(cl),
    problem_obj=Problem40(),
    lb=lb,
    ub=ub,
    cl=cl,
    cu=cu
    )
# IMPORTANT: need to use limited-memory / lbfgs here as we didn't give a valid hessian-callback
nlp.addOption(b'hessian_approximation', b'limited-memory')
x, info = nlp.solve(x0)
print(x)
print(info)
# CORRECT RESULT & SUCCESSFUL STATE
Output:
******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
Ipopt is released as open source code under the Eclipse Public License (EPL).
For more information visit http://projects.coin-or.org/Ipopt
******************************************************************************
This is Ipopt version 3.12.8, running with linear solver mumps.
NOTE: Other linear solvers might be more efficient (see Ipopt documentation).
Number of nonzeros in equality constraint Jacobian...: 12
Number of nonzeros in inequality constraint Jacobian.: 0
Number of nonzeros in Lagrangian Hessian.............: 0
Total number of variables............................: 4
variables with only lower bounds: 0
variables with lower and upper bounds: 0
variables with only upper bounds: 0
Total number of equality constraints.................: 3
Total number of inequality constraints...............: 0
inequality constraints with only lower bounds: 0
inequality constraints with lower and upper bounds: 0
inequality constraints with only upper bounds: 0
Objective value at iteration #0 is - -0.4096
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
0 -4.0960000e-01 2.88e-01 2.53e-02 0.0 0.00e+00 - 0.00e+00 0.00e+00 0
Objective value at iteration #1 is - -0.255391
1 -2.5539060e-01 1.28e-02 2.98e-01 -11.0 2.51e-01 - 1.00e+00 1.00e+00h 1
Objective value at iteration #2 is - -0.249299
2 -2.4929898e-01 8.29e-05 3.73e-01 -11.0 7.77e-03 - 1.00e+00 1.00e+00h 1
Objective value at iteration #3 is - -0.25077
3 -2.5076955e-01 1.32e-03 3.28e-01 -11.0 2.46e-02 - 1.00e+00 1.00e+00h 1
Objective value at iteration #4 is - -0.250025
4 -2.5002535e-01 4.06e-05 1.93e-02 -11.0 4.65e-03 - 1.00e+00 1.00e+00h 1
Objective value at iteration #5 is - -0.25
5 -2.5000038e-01 6.57e-07 1.70e-04 -11.0 5.46e-04 - 1.00e+00 1.00e+00h 1
Objective value at iteration #6 is - -0.25
6 -2.5000001e-01 2.18e-08 2.20e-06 -11.0 9.69e-05 - 1.00e+00 1.00e+00h 1
Objective value at iteration #7 is - -0.25
7 -2.5000000e-01 3.73e-12 4.42e-10 -11.0 1.27e-06 - 1.00e+00 1.00e+00h 1
Number of Iterations....: 7
(scaled) (unscaled)
Objective...............: -2.5000000000225586e-01 -2.5000000000225586e-01
Dual infeasibility......: 4.4218750883118219e-10 4.4218750883118219e-10
Constraint violation....: 3.7250202922223252e-12 3.7250202922223252e-12
Complementarity.........: 0.0000000000000000e+00 0.0000000000000000e+00
Overall NLP error.......: 4.4218750883118219e-10 4.4218750883118219e-10
Number of objective function evaluations = 8
Number of objective gradient evaluations = 8
Number of equality constraint evaluations = 8
Number of inequality constraint evaluations = 0
Number of equality constraint Jacobian evaluations = 8
Number of inequality constraint Jacobian evaluations = 0
Number of Lagrangian Hessian evaluations = 0
Total CPU secs in IPOPT (w/o function evaluations) = 0.016
Total CPU secs in NLP function evaluations = 0.000
EXIT: Optimal Solution Found.
[ 0.79370053 0.70710678 0.52973155 0.84089641]
{'x': array([ 0.79370053, 0.70710678, 0.52973155, 0.84089641]), 'g': array([ 3.72502029e-12, -3.93685085e-13, 5.86974913e-13]), 'obj_val': -0.25000000000225586, 'mult_g': array([ 0.49999999, -0.47193715, 0.35355339]), 'mult_x_L': array([ 0., 0., 0., 0.]), 'mult_x_U': array([ 0., 0., 0., 0.]), 'status': 0, 'status_msg': b'Algorithm terminated successfully at a locally optimal point, satisfying the convergence tolerances (can be specified by options).'}
Remarks about the code
- We use scipy's approx_fprime, which was basically added for all those gradient-based optimizers in scipy.optimize
- As stated in the sources, I did not take care of ipopt's need for the Hessian; we used ipopt's Hessian approximation
  - the basic idea is described at wiki: LBFGS
- I did ignore ipopt's need for the sparsity structure of the Jacobian of the constraints
  - a default assumption is used (the docs describe the default Hessian structure as that of a lower triangular matrix), and I won't give any guarantees on what can happen here (bad performance vs. breaking everything)
I think you have some kind of misunderstanding about what a mathematical function is and what its numerical implementation is.
You should define your function as:
def func(x1, x2, x3, x4):
    return -x1*x2*x3*x4
Now you want to evaluate your function at specific points, which you can do using the np.mgrid you provided.
If you want to compute your gradient, use scipy.misc.derivative (https://docs.scipy.org/doc/scipy/reference/generated/scipy.misc.derivative.html). Watch out: the default value of dx is usually bad, so change it to something like 1e-5. There is no difference between linear and nonlinear functions as far as the numerical evaluation of the gradient is concerned, only that for a nonlinear function the gradient won't be the same everywhere.
What you did with np.gradient was actually to compute the gradient from the points in your array, with the definition of your function hidden inside your definition of f, which does not allow the gradient to be evaluated at other points. Your method also makes you dependent on your discretisation step.
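To make the single-point evaluation concrete, here is a small central-difference sketch written in plain numpy. The function name central_diff_gradient and the step h=1e-6 are assumptions of this example, not something prescribed by numpy or cyipopt; the result is compared with the analytic gradient of -x1*x2*x3*x4.

```python
import numpy as np

def objective(x):
    return -np.prod(x)  # -x1 * x2 * x3 * x4

def central_diff_gradient(f, x, h=1e-6):
    """Approximate the gradient of f at the single point x."""
    x = np.asarray(x, dtype=float)
    grad = np.empty_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = h
        # symmetric difference along coordinate i
        grad[i] = (f(x + step) - f(x - step)) / (2.0 * h)
    return grad

x0 = np.array([0.8, 0.8, 0.8, 0.8])
grad = central_diff_gradient(objective, x0)
# analytic gradient: component i is minus the product of the other coordinates
analytic = np.array([-np.prod(np.delete(x0, i)) for i in range(x0.size)])
print(grad, analytic)
```

A function like this takes one point and returns one gradient vector, which is the shape a gradient callback needs.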

Python: how to make a histogram with equally *sized* bins

I have a set of data and want to make a histogram of it. I need the bins to have the same size, by which I mean that they must contain the same number of objects, rather than the more common (numpy.histogram) case of equally spaced bins.
This will naturally come at the expense of the bin widths, which can, and in general will, be different.
I will specify the number of desired bins and the data set, obtaining the bin edges in return.
Example:
data = numpy.array([1., 1.2, 1.3, 2.0, 2.1, 2.12])
bins_edges = somefunc(data, nbins=3)
print(bins_edges)
>> [1.,1.3,2.1,2.12]
So the bins all contain 2 points, but their widths (0.3, 0.8, 0.02) are different.
There are two limitations:
- if a group of data is identical, the bin containing them could be bigger.
- if there are N data points and M bins are requested, there will be M bins of N//M points each, plus one extra bin if N%M is not 0.
This piece of code is some cruft I've written, which worked nicely for small data sets. What if I have 10**9+ points and want to speed up the process?
import numpy as np

def def_equbin(in_distr, binsize=None, bin_num=None):
    distr_size = len(in_distr)

    bin_size = distr_size // bin_num  # integer division: points per bin
    odd_bin_size = distr_size % bin_num

    args = in_distr.argsort()

    hist = np.zeros((bin_num, bin_size))

    for i in range(bin_num):
        hist[i, :] = in_distr[args[i * bin_size: (i + 1) * bin_size]]

    if odd_bin_size == 0:
        odd_bin = None
        bins_limits = np.arange(bin_num) * bin_size
        bins_limits = args[bins_limits]
        bins_limits = np.concatenate((in_distr[bins_limits],
                                      [in_distr[args[-1]]]))
    else:
        odd_bin = in_distr[args[bin_num * bin_size:]]
        bins_limits = np.arange(bin_num + 1) * bin_size
        bins_limits = args[bins_limits]
        bins_limits = in_distr[bins_limits]
        bins_limits = np.concatenate((bins_limits, [in_distr[args[-1]]]))

    return (hist, odd_bin, bins_limits)
Using your example case (bins of 2 points, 6 total data points):
from scipy import stats
bin_edges = stats.mstats.mquantiles(data, [0, 2./6, 4./6, 1])
>> array([1. , 1.24666667, 2.05333333, 2.12])
I would like to mention also the existence of pandas.qcut, which does equi-populated binning in quite an efficient way. In your case it would work something like
data = np.array([1., 1.2, 1.3, 2.0, 2.1, 2.12])
# parameter q specifies the number of bins
qc = pd.qcut(data, q=3, precision=1)
# bin definition
bins = qc.categories
print(bins)
>> Index(['[1, 1.3]', '(1.3, 2.03]', '(2.03, 2.1]'], dtype='object')
# bin corresponding to each point in data
codes = qc.codes
print(codes)
>> array([0, 0, 1, 1, 2, 2], dtype=int8)
Update for skewed distributions:
I came across the same problem as @astabada, wanting to create bins each containing an equal number of samples. When applying the solution proposed by @aganders3, I found that it didn't work particularly well for skewed distributions. In the case of skewed data (for example something with a whole lot of zeros), stats.mstats.mquantiles for a predefined number of quantiles will not guarantee an equal number of samples in each bin. You will get bin edges that look like this:
[0. 0. 4. 9.]
In which case the first bin will be empty.
In order to deal with skewed cases, I created a function that calls stats.mstats.mquantiles and then dynamically modifies the number of bins if the sample counts are not equal within a certain tolerance (30% of the smallest sample count in the example code). If the counts are not equal between bins, the code reduces the number of equally spaced quantiles by 1 and calls stats.mstats.mquantiles again, until the counts are equal or only one bin exists.
I hard-coded the tolerance in the example, but this could be turned into a keyword argument if desired.
I also prefer giving the number of equally spaced quantiles as an argument to my function instead of giving user-defined quantiles to stats.mstats.mquantiles, in order to reduce accidental errors (i.e. something like [0., 0.25, 0.7, 1.]).
Here's the code:
import numpy as np
from scipy import stats

def equibins(dat, binnum, **kwargs):
    numin = binnum
    while numin>1.:
        qtls = np.linspace(0.,1.0,num=numin,endpoint=False)
        ebins =stats.mstats.mquantiles(dat,qtls,alphap=kwargs['alpha'],betap=kwargs['beta'])
        allhist, allbin = np.histogram(dat, bins = ebins)
        if (np.unique(ebins).shape!=ebins.shape or tolerence(allhist,0.3)==False) and numin>2:
            numin= numin-1
            del qtls, ebins
        else:
            numin=0
    return ebins

def tolerence(narray, percent):
    if percent>1.0:
        per = percent/100.
    else:
        per = percent
    lev_tol = per*narray.min()
    tolerate = np.all(narray[1:]-narray[0]<lev_tol)
    return tolerate
Just sort the data and divide it into fixed-length bins! Obviously you can never divide the data into exactly equally populated bins if the number of samples is not a multiple of the number of bins.
import math
import numpy as np
data = np.array([2,3,5,6,8,5,5,6,3,2,3,7,8,9,8,6,6,8,9,9,0,7,5,3,3,4,5,6,7])
data_sorted = np.sort(data)
nbins = 3
# bin length; rounding up means that at most nbins bins are produced
step = math.ceil(len(data_sorted) / nbins)
binned_data = []
for i in range(0,len(data_sorted),step):
    binned_data.append(data_sorted[i:i+step])
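The sort-and-split idea can also return the bin edges asked for in the question. This is a minimal sketch using np.array_split, which makes the bin populations differ by at most one; the function name equal_count_bins is made up for this example, and it assumes nbins is not larger than the number of data points.

```python
import numpy as np

def equal_count_bins(data, nbins):
    """Sort the data and split it into nbins chunks of (nearly) equal length."""
    data_sorted = np.sort(np.asarray(data))
    chunks = np.array_split(data_sorted, nbins)  # chunk lengths differ by at most 1
    # left edge of every chunk, plus the overall maximum as the last edge
    edges = np.array([c[0] for c in chunks] + [data_sorted[-1]])
    return chunks, edges

data = np.array([1., 1.2, 1.3, 2.0, 2.1, 2.12])
chunks, edges = equal_count_bins(data, 3)
print([len(c) for c in chunks])
print(edges)
```

For the six-point example from the question this gives three bins of two points each, with edges at 1.0, 1.3, 2.1 and 2.12.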
