I'm trying to run an optimization with scipy.optimize.differential_evolution. The code calls for bounds for each variable in x. But I want a solution where some components of x must be integers, while others can range freely as floats. The relevant part of my code looks like
bounds = [(0,3),(0,3),(0,3),???,???]
result = differential_evolution(func, bounds)
What do I replace the ???'s with to force those variables to be ints in a given range?
As noted in the comments, there isn't direct support for an "integer constraint".
You could however minimize a modified objective function, e.g.:
def func1(x):
    return func(x) + K * (x[3] - round(x[3]))**2
and this will force x[3] towards an integer value (unfortunately you have to tune the K parameter).
An alternative is to round (some of) the real-valued parameters before evaluating the objective function:
import numpy as np

def func1(x):
    z = np.copy(x)       # copy: a plain 'z = x' would alias x and mutate it in place
    z[3] = round(z[3])
    return func(z)
Both are common techniques to approach a discrete optimization problem using Differential Evolution and they work quite well.
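For illustration, a minimal self-contained sketch of the rounding variant (func here is a hypothetical stand-in for the real objective, and the fourth variable is the one forced to an integer):

import numpy as np
from scipy.optimize import differential_evolution

def func(x):
    # hypothetical objective: distance to an arbitrary target point
    return (x[0] - 1.2)**2 + (x[1] - 0.7)**2 + (x[2] - 2.1)**2 + (x[3] - 1.9)**2

def func1(x):
    z = np.copy(x)
    z[3] = round(z[3])            # x[3] is treated as an integer variable
    return func(z)

bounds = [(0, 3), (0, 3), (0, 3), (0, 3)]
result = differential_evolution(func1, bounds, seed=1)
x_opt = np.copy(result.x)
x_opt[3] = round(x_opt[3])        # the reported solution must be rounded too
print(x_opt, func(x_opt))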
Differential evolution can support integer constraints, but the current scipy implementation would need to be changed.
From the scipy source code, it appears that their DE is based on Storn, R. and Price, K., "Differential Evolution - a Simple and Efficient Heuristic for Global Optimization over Continuous Spaces", Journal of Global Optimization, 1997.
However, there has been progress in this field, as pointed out by the review paper "Recent advances in differential evolution – An updated survey".
There are a few papers which introduce changes to the algorithm so that it can handle ints. I have not had time to look at all the options but perhaps this paper could help.
An Improved Differential Evolution Algorithm for Mixed Integer Programming Problems
What do I replace the ???'s with to force those variables to be ints in a given range?
wrapdisc is a package providing a thin wrapper that lets you optimize bounded integer variables alongside floats with various scipy.optimize optimizers. There is a usage example in its readme. With it, you don't have to adapt your objective function at all. It internally uses rounding in order to support integers, although this detail is hidden from the user.
Related
I am trying to solve an equation that can include truncations in Python with a numerical approach. I am wondering what the best library and approach would be? Following is more detail about the problem:
The equation changes every time. From a human perspective, the equations should be pretty simple; they include common operators such as +, -, *, /, and they also sometimes have truncation functions (truncate to integer), limit functions (limit the value in parentheses between two provided bounds), or (rarely) multiple variables. A couple of examples (these being separate examples, not a system of equations) would be:
TRUNCATE(VAR_1 + 300) - 50.4 = 200
(VAR_2 + VAR_3)*3 = 35
LIMIT(3,5)(VAR_4) = 8
VAR_5 = 34
(This is not exactly what the equations look like, because I am writing them in postfix notation, but I have a calculator to determine their value with provided input values.)
All I need for these equations is some value for each variable that would solve each equation; I do not need to know every solution.
Some additional things to note are that a) these variables all have maximum and minimum values, b) while perfection would be nice, occasional errors are acceptable, and c) some of the variables are integers, which I expect really complicates things. Right now, I'm handling this very sloppily, but mostly acceptably for my case, by rounding the integer values to the nearest int.
In an attempt to solve this problem, I tried solving analytically with Sympy (which as you might expect didn't work on truncations and was difficult to implement), and I also tried using Scipy minimize as follows:
minimize(minimization, x0, method = 'SLSQP', constraints = cons, tol = 1e-3, options={'ftol': 1e-3, 'disp':True, 'maxiter': 100, "eps":.1}, args = (x_vals, postfix, const_values, value))
This one got stuck on truncations, presumably because it didn't know which direction to move in, unless I set the step to 1, which decreased accuracy. For some reason, it also didn't seem to respect ftol: it would give acceptable answers within the tolerance but would just keep going until the iteration limit.
I am considering using something that does random walks like the "Markov Chain Monte Carlo" method, but I really don't know much about this and was eager to hear other thoughts.
I ended up solving the problem in two slightly different ways. Both of them used the Powell solver, as suggested by joni in the comments on the original question, and for both of them I had to multiply the output of the function that gets passed to the "fun" parameter (a function that I named minimize) by 100, because I could never get the tolerance adjusted in the solver function call.
When the equation had only one variable, I removed the truncation from the minimize function. This worked for my purposes because the equations I was using were generally being truncated so that they would equal an integer value. So, when the equation output is an integer and there is only one variable, I believe the correct solution can be obtained by just pretending the truncation function does not exist in the solver (though remember to be wary of floating-point math). (And if the numbers outside of the truncation don't combine to an integer, the equation may not have a solution anyway.)
In cases with multiple variables, my solution was to a) include the truncation function in the minimize function and b) round the x values suggested by the solver the same way I planned to round them in the end (e.g., round them to an integer if they were meant to be integer-valued).
Anyway, this solution worked for the problem defined above, but it otherwise has some limitations. It is not guaranteed to always find the correct output, especially with the second approach. Another approach people with this problem may wish to consider is some sort of integer programming, if they have linear equations.
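For concreteness, a minimal sketch of the multiple-variable approach (the evaluate function is a hypothetical stand-in for my postfix calculator; the 100x scaling and the rounding mirror the description above):

import numpy as np
from scipy.optimize import minimize

TARGET = 35.0

def evaluate(x):
    # hypothetical stand-in: (VAR_2 + VAR_3) * 3, with VAR_3 an integer
    return (x[0] + round(x[1])) * 3

def objective(x):
    # scale by 100 because adjusting the solver tolerance never worked for me
    return 100.0 * abs(evaluate(x) - TARGET)

result = minimize(objective, x0=np.array([1.0, 1.0]), method='Powell')
var_2, var_3 = result.x[0], round(result.x[1])
print(var_2, var_3, (var_2 + var_3) * 3)   # should evaluate to roughly 35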
I have a non-linear minimization problem that takes a combination of continuous and binary variables as input. Think of it as a network flow problem with valves, for which the throughput can be controlled, and with pumps, for which you can change the direction.
A "natural," minimalistic formulation could be:
argmin f(x1, y2, y3) s.t.
x1 ∈ [0,1] // a continuous variable
y2, y3 ∈ {0,1} // two binary variables
The objective function is deterministic, but expensive to solve. If I leave away the binary variables, Scipy's differential evolution algorithm turns out to be a useful solution approach for my problem (converging faster than basin hopping).
There is already some evidence available with regard to the inclusion of integer variables in differential evolution-based minimization problems. The suggested approaches turn y2, y3 into continuous variables x2, x3 ∈ [0,1], and then modify the objective function as follows:
(i) f(x1, round(x2), round(x3))
(ii) f(x1,x2,x3) + K( (x2-round(x2))^2 + (x3-round(x3))^2 )
with K a tuning parameter
A third, and probably naive, approach would be to combine the binary variables into a single continuous variable z ∈ [0,1], and thereby to reduce the number of optimization variables.
For instance,
if z < 0.25:
    y2, y3 = 0, 0
elif z < 0.5:
    y2, y3 = 1, 0
elif z < 0.75:
    y2, y3 = 0, 1
else:
    y2, y3 = 1, 1
Which one of the above should be preferred, and why? I'd be very curious to hear how binary variables can be integrated in a continuous differential evolution algorithm (such as Scipy's) in a smart way.
PS. I'm aware that there's some literature available that proposes dedicated mixed-integer evolutionary algorithms. For now, I'd like to stay with Scipy.
I'd be very curious to hear how binary variables can be integrated in a continuous differential evolution algorithm
wrapdisc is a package providing a thin wrapper that lets you optimize binary variables alongside floats with various scipy.optimize optimizers. There is a usage example in its readme. With it, you don't have to adapt your objective function at all.
As of v2.0.0, it has two possible encodings for binary:
ChoiceVar: This uses one-hot max encoding. Two floats are used to represent the binary variable.
GridVar: This uses rounding. One float is used to represent the binary variable.
Although neither of these two variable types was made specifically for binary variables, both can support them just the same. On average, GridVar requires fewer function evaluations because it uses one fewer float than ChoiceVar.
When scipy 1.9 is released, the differential_evolution function will gain an integrality parameter that will allow the user to indicate which parameters should be considered integers. For binary selection, one would use bounds of (0, 1) for an integer parameter.
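A short sketch of what that will look like (the objective here is hypothetical; the first variable is continuous and the last two are binary):

from scipy.optimize import differential_evolution

def f(x):
    # hypothetical objective mixing one continuous and two binary variables
    return (x[0] - 0.3)**2 + 0.5 * x[1] + 2.0 * x[2]

bounds = [(0, 1), (0, 1), (0, 1)]
result = differential_evolution(f, bounds, integrality=[False, True, True])
print(result.x)   # x[1] and x[2] will come back as 0.0 or 1.0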
I am trying to solve a system of coupled iterative equations, each of which containing lots of integrations and derivatives.
First I used Maxima (embedded in Sage) to solve it analytically, but the solution was too dependent on the initial guesses I had made for my unknown functions: constant initial guesses yielded an answer almost immediately, while symbolic functions used as initial guesses sent the system deep into calculations, sometimes seemingly never-ending ones.
However, what I tried with Sage was actually a simplified version of my original equations, so I began to think I had no choice but to treat the integrations and derivatives numerically. That, however, brought some problems I could not ignore:
integrations were only allowed to have numerical limits, and variables were not allowed as, e.g., their upper limits (I thought a numerical algorithm might be faster than the analytic one even if I left a variable or parameter in its calculations, but it just didn't work that way).
integrands also couldn't contain extra variables or parameters that were not being integrated over.
the derivative function was itself a big obstacle, as I wasn't able to compute partial derivatives or to use a derivative in the integrand of an integral.
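(For comparison, SciPy's quad does allow both of the first two patterns; a minimal sketch with a hypothetical integrand, where p is an extra parameter that is not integrated over and t is a variable upper limit:)

from scipy.integrate import quad

def integrand(x, p):
    # hypothetical integrand with an extra parameter p
    return x**2 + p

def F(t, p):
    # variable upper limit: wrap the quad call in a function of t
    value, abserr = quad(integrand, 0.0, t, args=(p,))
    return value

print(F(2.0, 1.0))   # integral of x**2 + 1 over [0, 2] = 8/3 + 2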
To get rid of all the problems with the numerical derivative, I substituted the symbolic diff() function for it, and the speed improvement was still hopeful; but the problems with numerical integration persisted.
Now I have three questions:
a- Is it right to conclude there is no way for me other than to discretize the equations and do a completely numerical treatment instead of a mixed one?
b- If so, is there any way to do this automatically? My equations are not differential equations, so I can't use odeint or the like; they are iterative equations, and I have integrations and derivatives only to update my unknowns at each step to their newer values.
c- If my calculations are so huge in size, is there any suggestion on switching from Python to Fortran or something like that?
Best Regards
[Frontmatter] (skip this if you just want the question):
I'm currently looking at using Shannon-Weaver Mutual Information and normalized redundancy to measure the degree of information masking between bags of discrete and continuous feature values, organized by feature. Using this method, it is my goal to construct an algorithm that looks very similar to ID3, but instead of using Shannon entropy, the algorithm will seek (as a loop constraint) to maximize or minimize shared information between a single feature and a collection of features based on the complete input feature space, adding new features to the latter collection if (and only if) they increase or decrease mutual information, respectively. This, in effect, moves ID3's decision algorithm into pairspace, stapling an ensemble approach to it with all of the expected time and space complexities of both methods.
[/Frontmatter]
On to the question: I'm trying to get a continuous integrator working in Python using SciPy. Because I'm working with comparisons of discrete and continuous variables, my current strategy for each comparison for feature-feature pairs is as follows:
Discrete feature versus discrete feature: use the discrete form of mutual information. This results in a double summation of the probabilities, which my code handles without issue.
All other cases (discrete versus continuous, the inverse, and continuous versus continuous): use the continuous form, using a Gaussian estimator to smooth out the probability density functions.
It is possible for me to perform some kind of discretization for the latter cases, but because my input data sets are not inherently linear, this is potentially needlessly complex.
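For reference, a tiny sketch of the discrete form used in the first case (the joint-distribution representation here is hypothetical; my actual code organizes the probabilities by feature):

import math

def discrete_mutual_info(pxy):
    # pxy: dict mapping (x, y) pairs to joint probabilities p(x, y)
    px, py = {}, {}
    for (a, b), p in pxy.items():
        px[a] = px.get(a, 0.0) + p
        py[b] = py.get(b, 0.0) + p
    # MI(X;Y) = sum over x, y of p(x,y) * log(p(x,y) / (p(x) * p(y)))
    return sum(p * math.log(p / (px[a] * py[b]))
               for (a, b), p in pxy.items() if p > 0.0)

print(discrete_mutual_info({(0, 0): 0.4, (1, 1): 0.4, (0, 1): 0.1, (1, 0): 0.1}))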
Here's the salient code:
import math
import numpy
import scipy
from scipy.stats import gaussian_kde
from scipy.integrate import dblquad
# Constants
MIN_DOUBLE = 4.9406564584124654e-324
# The minimum size of a Float64; used here to prevent the
# logarithmic function from hitting its undefined region
# at its asymptote of 0.
INF = float('inf') # The floating-point representation for "infinity"
# x and y are previously defined as collections of
# floating point values with the same length
# Kernel estimation
gkde_x = gaussian_kde(x)
gkde_y = gaussian_kde(y)
# binned_x and binned_y are presumably discretized copies of x and y,
# defined earlier in the full program
if len(binned_x) != len(binned_y) and len(binned_x) != len(x):
    x.append(x[0])
    y.append(y[0])
gkde_xy = gaussian_kde([x,y])
mutual_info = lambda a, b: gkde_xy([a, b]) * \
    math.log((gkde_xy([a, b]) / (gkde_x(a) * gkde_y(b))) + MIN_DOUBLE)
# Compute MI(X,Y)
(minfo_xy, err_xy) = \
    dblquad(mutual_info, -INF, INF, lambda a: 0, lambda a: INF)
print('minfo_xy = ', minfo_xy)
Note that overcounting exactly one point is done deliberately to prevent a singularity in SciPy's gaussian_kde class. As the sizes of x and y jointly approach infinity, this effect becomes negligible.
My current snag is in trying to get multiple integration working against a Gaussian kernel density estimate in SciPy. I've been trying to use SciPy's dblquad to perform the integration, but in the latter case, I receive an astounding spew of the following messages.
When I set numpy.seterr ( all='ignore' ):
Warning: The occurrence of roundoff error is detected, which prevents
the requested tolerance from being achieved. The error may be
underestimated.
And when I set it to 'call' using an error handler:
Floating point error (underflow), with flag 4
Floating point error (invalid value), with flag 8
Pretty easy to figure out what's going on, right? Well, almost: IEEE 754-2008 and SciPy only tell me what's going on here, not why or how to work around it.
The upshot: minfo_xy generally resolves to nan; its sampling is insufficient to prevent information from becoming lost or invalid when performing Float64 math.
Is there a general workaround for this problem when using SciPy?
Even better: if there is a robust, canned implementation of continuous mutual information for Python with an interface that takes two collections of floating point values or a merged collection of pairs, it would resolve this complete problem. Please link it if you know of one that exists.
Thank you in advance.
Edit: this resolves the nan propagation issue in the example above:
mutual_info = lambda a, b: gkde_xy([a, b]) * \
    math.log((gkde_xy([a, b]) / ((gkde_x(a) * gkde_y(b)) + MIN_DOUBLE)) \
    + MIN_DOUBLE)
However, the question of roundoff correction remains, as does the request for a more robust implementation. Any help in either domain would be greatly appreciated.
Before trying more radical solutions like reframing the problem or using different integration tools, see if this helps. Replace INF=float('INF') with INF=1E12 or some other large number -- that may eliminate NaN results created by simple arithmetic operations on the input variables.
No promises on this one, but it is sometimes helpful to try a quick fix before engaging in a significant algorithmic rewrite or substitution of alternate tools.
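Applied to the dblquad call above, that quick fix might look like this (in practice, limits a few standard deviations beyond the data range are usually a safer choice than a blindly huge number, since the quadrature may otherwise never sample the region where the density is concentrated):

BIG = 1e12   # finite stand-in for infinity; or derive it from the data range
(minfo_xy, err_xy) = \
    dblquad(mutual_info, -BIG, BIG, lambda a: 0.0, lambda a: BIG)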
I have a graph between 2 functions f and g.
I know it follows a power law function with exponential cutoff.
f(x) = x**(-alpha)*e**(-lambda*x)
How do I find the value of exponent alpha?
If you have sufficiently close x points (for example one every 0.1), you can try the following:
ln(f(x)) = -alpha ln(x) - lambda x
ln(f(x))' = - alpha / x - lambda
So depending on where you have your points:
If you have a lot of points near 0, you can try:
h(x) = x ln(f(x))' = -alpha - lambda x
So the limit of the function h when x goes to 0 is -alpha
If you have large values of x, the function x -> (ln(f(x)))' tends toward -lambda as x goes to infinity, so you can estimate lambda from that limit and use pwdyson's expression.
If you don't have close x points, the numerical derivative will be very noisy, so I would try to guess lambda as the limit of -ln(f(x))/x for large x's...
If you don't have large values of x, but a large number of x's, you can try a minimization of
sum_i (ln(y_i) + alpha*ln(x_i) + lambda*x_i)^2
over both alpha and lambda (I guess it would be more precise than the initial expression)...
It is a simple least-squares regression (numpy.linalg.lstsq will do the job).
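A minimal sketch of that regression (xs and ys here are hypothetical sample arrays with ys[i] = f(xs[i]) and xs > 0):

import numpy as np

xs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
ys = xs**(-1.5) * np.exp(-0.2 * xs)        # synthetic data: alpha=1.5, lambda=0.2

# ln(y) = -alpha*ln(x) - lambda*x, so solve [ln(x), x] @ c = ln(y) for c
A = np.column_stack([np.log(xs), xs])
coef, _, _, _ = np.linalg.lstsq(A, np.log(ys), rcond=None)
alpha, lam = -coef[0], -coef[1]
print(alpha, lam)                          # approximately 1.5 and 0.2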
So you have plenty of methods; the one to choose really depends on your inputs.
The usual and general way of doing what you want is to perform a non-linear regression (even though, as noted in another response, it is possible to linearize the problem). Python can do this quite easily with the help of the SciPy package, which is used by many scientists.
The routine you are looking for is its least-square optimization routine (scipy.optimize.leastsq). Once you wrap your head around the way this general optimization procedure works (see the example), you will probably find many other opportunities to use it. Basically, you calculate the list of differences between your measurements and their ideal value f(x), and you ask SciPy to find the parameters that make these differences as small as possible, so that your data fits the model as well as possible. This then gives you the parameter you are looking for.
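A minimal sketch of that fit (xs and ys are hypothetical measurement arrays; the starting guess is arbitrary):

import numpy as np
from scipy.optimize import leastsq

def residuals(params, xs, ys):
    alpha, lam = params
    return ys - xs**(-alpha) * np.exp(-lam * xs)

xs = np.linspace(0.5, 10.0, 50)
ys = xs**(-1.5) * np.exp(-0.2 * xs)        # synthetic data: alpha=1.5, lambda=0.2

params, ier = leastsq(residuals, (1.0, 0.1), args=(xs, ys))
alpha_fit, lam_fit = params
print(alpha_fit, lam_fit)                  # approximately 1.5 and 0.2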
It sounds like you might be trying to fit a power-law to a distribution with an exponential cutoff at the low end due to incompleteness - but I may be reading too far into your problem.
If that is the problem you're dealing with, this website (and accompanying publication) addresses the issue: http://tuvalu.santafe.edu/~aaronc/powerlaws/. I wrote the python implementation of the power-law fitter on that page; it is linked from there.
If you know that the points follow this law exactly, then invert the equation and put in an x and its corresponding f(x) value:
import math

# note: 'lambda' is a reserved word in Python, so call the decay rate 'lam'
alpha = -(lam * x + math.log(f(x))) / math.log(x)
But if the points do not exactly fit the equation, you will need to do some sort of regression to determine alpha.
EDIT: OK, so they don't fit exactly. This is getting beyond a Python question, but there may be something in numpy that can handle it. Here is a numpy linear regression recipe, but your equation can't be fit by linear regression directly (short of taking logarithms, as other answers do), so you'll have to look into non-linear regression.