I have been playing around with Python and math lately, and I ran in to something I have yet to be able to figure out. Namely, is it possible, given an arbitrary lambda, to return the inverse of that lambda for mathematical operations? That is, invertLambda such that invertLambda(lambda x:(x+2))(2) = 0. The fact that lambdas are restricted to expressions gives me hope, but so far I have not been able to make it work. I understand that any result would have problems with functions that lose information, but I am willing to restrict users and myself to lossless functions if I have to.
Of course not: if lambda is not an injective function, you cannot invert it. Example: you cannot invert lambda mapping x to x*x, since the sign of the original x is lost.
Leaving injectivity aside, there are functions which are computationally very complex to invert. Consider, for example, restoring the original value from its md5 hash. (For a lambda calculating md5 hash, inverted function must break md5 in cryptological sense!)
Edit:
indeed, we can theoretically make lambdas invertable if we restrict the expressions which can be used there. For example, if the lambda is a linear function of 1 argument, we can easily invert it. If it's a polynomial of degree > 4, we have a problem with algebraically exact solution.
Of course, we could refrain from exact solution, and just invert the function numerically. This is possible, using, well, any method of numerical solving of the equation lambda(x) = value will do (the simplest be binary search).
I am a bit late, but I just published a python package that does this precisely. You may want to borrow some ideas from it:
https://pypi.python.org/pypi/pynverse
It essentially follows this strategy:
Figure out if the function is increasing or decreasing. For this two reference points ref1 and ref2 are needed:
In case of a finite interval, the points ref points are 1/4 and 3/4 through the interval.
In an infinite interval any two values work really.
If f(ref1) < f(ref2), the function is increasing, otherwise is decreasing.
Figure out the image of the function in the interval.
If values are provided, then those are used.
In a closed interval just calculate f(a) and f(b), where a and b are the ends of the interval.
In an open interval try to calculate f(a) and f(b), if this works those are used, otherwise it will be assume to be (-Inf, Inf).
Built a bounded function with the following conditions:
bounded_f(x):
return -Inf if x below interval, and f is increasing.
return +Inf if x below interval, and f is decreasing.
return +Inf if x above interval, and f is increasing.
return -Inf if x above interval, and f is decreasing.
return f(x) otherwise
If the required number y0 for the inverse is outside the image, raise an exception.
Find roots for bounded_f(x)-y0, by minimizing (bounded_f(x)-y0)**2, using the Brent method, making sure that the algorithm for minimising starts in a point inside the original interval by setting ref1, ref2 as brackets. As soon as if goes outside the allowed intervals, bounded_f returns infinite, forcing the algorithm to go back to search inside the interval.
Check that the solutions are accurate and they meet f(x0)=y0 to some desired precision, raising a warning otherwise.
Of course, as Vlad pointed out, the function has to be invertible for the inverse to exist, and also continuous in the domain for this to work.
Related
I'm looking for a Python algorithm to find the root of a function f(x) using bisection, equivalent to scipy.optimize.bisect, but allowing for discontinuities (jumps) in f. The function f is weakly monotonous.
It would be nice but not necessary for the algorithm to flag if the crossing (root) is directly 'at' a jump, and in this case to return the exact value x at which the relevant jump occurs (i.e. say the x for which sign(f(x-e)) != sign(f(x+e)) and abs(f(x-e)-f(x+e)>a for infinitesimal e>0 and non-infinitesimal a>0). It is also okay if instead the algorithm, for example, simply returns an x within a certain tolerance in this case.
As the function is only weakly monotonous, it can have flat areas, and theoretically these can occur 'at' the root, i.e. where f=0: f(x)=0 for an entire range, x in [x_0,x_1]. In this case again, nice but not necessary for the algo to flag this particularity, and to, say, ensure an x from the range [x_0,x_1] is returned.
As long as you supply (possibly very small) strictly positive positives for xtol and rtol, the function will work with discontinuities:
import numpy as np
>>> optimize.bisect(f=np.sign, a=-1, b=1, xtol=0.0001, rtol=0.001)
0.0
If you look in the scipy codebase at the C source code implementation of the function you can see that this is a very simple function that makes no assumptions on continuity. It basically takes two points which have a sign change, and switches to a smaller range with a sign change, until the iterations run out or the tolerances are met.
Given your requirements that functions might be discontinuous / flat, it is in fact necessary (for any algorithm) to supply these tolerances. Without them, it could be impossible for an optimization function to converge to a solution.
I'm working through some exercises on improper integrals and I've stumbled across an issue I can't resolve. I'm attempting to use the limit() function on the following problem:
Here N(x) is the cumulative distribution function of the standard normal variable.
The limit() function so far hasn't caused any problems, including problems which require L'Hôpital's rule be applied. However, I'm struggling to get compute the correct answer for this particular problem and can't work out why. The following code yields an incorrect answer
from sympy import *
x, y = symbols('x y')
init_printing(use_unicode=False) #Print the answers in unicode characters
cum_distribution = (1/sqrt(2*pi)*(integrate(exp(-y**2/2), (y, -oo, x))))
func = (cum_distribution -(1/2)-(x/sqrt(2*pi)))/(x**3)
limit(func, x, 0)
If I apply L'Hôpital's rule, i get the correct
l_hopital = diff((cum_distribution -(1/2)-(x/sqrt(2*pi))), x)/diff(x**3, x)
limit(l_hopital, x, 0)
I looked through the limit() function source code and my understanding is that L'Hôpital's rule isn't applied? In this case, can this problem be solved using the limit() function without applying this rule?
At present, a limit involving the function erf (known as the error function, related to normal CDF) can only be evaluated when the argument of erf tends to positive infinity. Limits at other places are either not evaluated, or evaluated incorrectly. (Related PR). This includes the limit
limit(-(sqrt(2)*x - sqrt(pi)*erf(sqrt(2)*x/2))/(2*sqrt(pi)*x**3), x, 0)
which returns unevaluated (though I would not call this incorrect). As a workaround, you can compute the Taylor series of this function with one term (the constant term), which gives the correct value of the limit:
series(func, x, 0, 1).removeO()
returns -sqrt(2)/(12*sqrt(pi)).
As in calculus practice, L'Hopital's rule is inferior to power series techniques when it comes to algorithmic computations, and SymPy relies primarily on the latter. The algorithm it uses is devised and explained in On Computing Limits in a Symbolic Manipulation System by Dominik Gruntz.
I have a function I want to minimize with scipy.optimize.fmin. Note that I force a print when my function is evaluated.
My problem is, when I start the minimization, the value printed decreases untill it reaches a certain point (the value 46700222.800). There it continues to decrease by very small bites, e.g., 46700222.797,46700222.765,46700222.745,46700222.699,46700222.688,46700222.678
So intuitively, I feel I have reached the minimum, since the length of each step are minus then 1. But the algorithm keeps running untill I get a "Maximum number of function evaluations has been exceeded" error.
My question is: how can I force my algorithm to accept the value of the parameter when the function evaluation reaches a value from where it does not really evolve anymore (let say, I don't gain more than 1 after an iteration). I read that the options ftol could be used but it has absolutely no effect on my code. In fact, I don't even know what value to put for ftol. I tried everything from 0.00001 to 10000 and there is still no convergence.
There is actually no need to see your code to explain what is happening. I will answer point by point quoting you.
My problem is, when I start the minimization, the value printed decreases
untill it reaches a certain point (the value 46700222.800). There it
continues to decrease by very small bites, e.g.,
46700222.797,46700222.765,46700222.745,46700222.699,46700222.688,46700222.678
Notice that the difference between the last 2 values is -0.009999997913837433, i.e. about 1e-2. In the convention of minimization algorithm, what you call values is usually labelled x. The algorithm stops if these 2 conditions are respected AT THE SAME TIME at the n-th iteration:
convergence on x: the absolute value of the difference between x[n] and the next iteration x[n+1] is smaller than xtol
convergence on f(x): the absolute value of the difference between f[n] and f[n+1] is smaller than ftol.
Moreover, the algorithm stops also if the maximum number of iterations is reached.
Now notice that xtol defaults to a value of 1e-4, about 100 times smaller than the value 1e-2 that appears for your case. The algorithm then does not stop, because the first condition on xtol is not respected, until it reaches the maximum number of iterations.
I read that the options ftol could be used but it has absolutely no
effect on my code. In fact, I don't even know what value to put for
ftol. I tried everything from 0.00001 to 10000 and there is still no
convergence.
This helped you respecting the second condition on ftol, but again the first condition was never reached.
To reach your aim, increase also xtol.
The following methods will also help you more in general when debugging the convergence of an optimization routine.
inside the function you want to minimize, print the value of x and the value of f(x) before returning it. Then run the optimization routine. From these prints you can decide sensible values for xtol and ftol.
consider nondimensionalizing the problem. There is a reason if ftol and xtol default both to 1e-4. They expect you to formulate the problem so that x and f(x) are of order O(1) or O(10), say numbers between -100 and +100. If you carry out the nondimensionalization you handle a simpler problem, in the way that you often know what values to expect and what tolerances you are after.
if you are interested just in a rough calculation and can't estimate typical values for xtol and ftol, and you know (or you hope) that your problem is well behaved, i.e. that it will converge, you can run fmin in a try block, pass to fmin only maxiter=20 (say), and catch the error regarding the Maximum number of function evaluations has been exceeded.
I just spent three hours digging into the source code of scipy.minimize. In it, the "while" loop in function "_minimize_neldermead" deals with the convergence rule:
if (numpy.max(numpy.ravel(numpy.abs(sim[1:] - sim[0]))) <= xtol and
numpy.max(numpy.abs(fsim[0] - fsim[1:])) <= ftol):
break"
"fsim" is the variable that stores results from functional evaluation. However, I found that fsim[0] = f(x0) which is the function evaluation of the initial value, and it never changes during the "while" loop. fsim[1:] updates itself all the time. The second condition of the while loop was never satisfied. It might be a bug. But my knowledge of mathematical optimization is far from enough to judge it.
My current solution: design your own system to control the convergence. Add this in your function:
global x_old, Q_old
if (np.absolute(x_old-x).sum() <= 1e-4) and (np.absolute(Q_old-Q).sum() <= 1e-4):
return None
x_old = x; Q_old = Q
Here Q=f(x). Don't forget to give them an initial value.
Update 01/30/15:
I got it! This should be the correct code for the second line of the if function (i.e. remove numpy.absolute):
numpy.max(fsim[0] - fsim[1:]) <= ftol)
btw, this is my first debugging of a open source software. I just created an issue on GitHub.
Update 01/31/15 - 1:
I don't think my previous update is correct. Nevertheless, this is the a screenshot of the iterations of a function using the original code.
It prints the values of sim and fsim variable for each iteration. As you can see, the changes of each iteration is less than both of xtol and ftol values, but it just kept going without stopping. The original code compares the difference between fsim[0] and the rest of fsim values, i.e. the value here is always 87.63228689 - 87.61312213 = .01916476, which is greater than ftol=1e-2.
Update 01/31/15 - 2:
Here is the data and code that I used to reproduce the previous results. It includes two data files and one iPython Notebook file.
From the documentation it looks like you DO want to change the ftol arg.
Post your code so we can look at your progress.
edit: Try increasing xtol as well.
Your question is a bit ambiguous. Are you printing the value of your function, or the point where it is evaluated?
My understanding of xtol and ftol is as follows. The iteration stops
when the change in the value of the function between iterations is less than ftol
AND
when the change in x between successive iterations is less than xtol
When you say "...accept the value of the parameter...", this suggests you should change xtol.
I'm trying to use scipy.optimize.minimize to minimize a complicated function. I noticed in hindsight that the minimize function takes the objective and derivative functions as separate arguments. Unfortunately, I've already defined a function which returns the objective function value and first-derivative values together -- because the two are computed simultaneously in a for loop. I don't think there is a good way to separate my function into two without the program essentially running the same for loop twice.
Is there a way to pass this combined function to minimize?
(FYI, I'm writing an artificial neural network backpropagation algorithm, so the for loop is used to loop over training data. The objective and derivatives are accumulated concurrently.)
Yes, you can pass them in a single function:
import numpy as np
from scipy.optimize import minimize
def f(x):
return np.sin(x) + x**2, np.cos(x) + 2*x
sol = minimize(f, [0], jac=True, method='L-BFGS-B')
Something that might work is: you can memoize the function, meaning that if it gets called with the same inputs a second time, it will simply return the same outputs corresponding to those inputs without doing any actual work the second time. What is happening behind the scenes is that the results are getting cached. In the context of a nonlinear program, there could be thousands of calls which implies a large cache. Often with memoizers(?), you can specify a cache limit and the population will be managed FIFO. IOW you still benefit fully for your particular case because the inputs will be the same only when you are needing to return function value and derivative around the same point in time. So what I'm getting at is that a small cache should suffice.
You don't say whether you are using py2 or py3. In Py 3.2+, you can use functools.lru_cache as a decorator to provide this memoization. Then, you write your code like this:
#functools.lru_cache
def original_fn(x):
blah
return fnvalue, fnderiv
def new_fn_value(x):
fnvalue, fnderiv = original_fn(x)
return fnvalue
def new_fn_deriv(x):
fnvalue, fnderiv = original_fn(x)
return fnderiv
Then you pass each of the new functions to minimize. You still have a penalty because of the second call, but it will do no work if x is unchanged. You will need to research what unchanged means in the context of floating point numbers, particularly since the change in x will fall away as the minimization begins to converge.
There are lots of recipes for memoization in py2.x if you look around a bit.
Did I make any sense at all?
TL;DR: How to minimize a fairly smooth function that returns an integer value (not a float)?
>>> import scipy.optimize as opt
>>> opt.fmin(lambda (x,y): (0.1*x**2+0.1*(y**2)), (-10, 9))
Optimization terminated successfully.
Current function value: 0.000000
Iterations: 49
Function evaluations: 92
array([ -3.23188819e-05, -1.45087583e-06])
>>> opt.fmin(lambda (x,y): int(0.1*x**2+0.1*(y**2)), (-10, 9))
Optimization terminated successfully.
Current function value: 17.000000
Iterations: 17
Function evaluations: 60
array([-9.5 , 9.45])
Trying to minimize a function that accepts floating point parameters but returns an integer, I'm running into a problem that the solver terminates immediately. This effect is demonstrated in the examples above - notice that the when the value returned is rounded as an int, the evaluation terminates prematurely.
I assume that this is happening because it detects no change in the derivative, i.e. the first time it changes a parameter, the change it makes is too small and the difference between first result and second is 0.00000000000, incorrectly indicating a minimum has been found.
I've had better luck with optimize.anneal, but despite its integer valued return I've plotted some regions of the function in three dimensions and it's actually pretty smooth. Therefore, I was hoping that when a derivative-aware minimizer would work better.
I've reverted to manually graphing to explore the space, but I'd like to introduce a couple more parameters so it'd be great if I could get this working.
The function I'm trying to minimize can't be made to return a float. It's counting the number of successful hits from a cross-validation, and I'm having the optimizer alter parameters on the model.
Any ideas?
UPDATE
Found a similar question: How to force larger steps on scipy.optimize functions?
In general, minimization on the integer space is an entirely different field called integer programming (or discrete optimization). The addition of integer constraints actually creates quite a few algorithmic difficulties that render continuous methods unfit. Look into scipy.optimize.anneal