I am working through Structure and Interpretation of Computer Programs.
On p. 73, it uses Newton's method as an example of how to construct a higher-order procedure.
Here is my code:
def deriv(g):
    dx = 0.00001
    return lambda x: (g(x + dx) - g(x)) / dx

def newton_transform(g):
    return lambda x: x - g(x) / deriv(g)(x)

def fixed_point(f, guess):
    def close_enough(a, b):
        tolerance = 0.00001
        return abs(a - b) < tolerance
    def a_try(guess):
        next = f(guess)
        if close_enough(guess, next):
            return next
        else:
            return a_try(next)
    return a_try(guess)

def newton_method(g, guess):
    return fixed_point(newton_transform(g), guess)

def sqrt(x):
    return newton_method(lambda y: x / y, 1.0)

print sqrt(2)
The code crashes and gives me a ZeroDivisionError. I know where it crashes, but I don't understand why it behaves the way it does.
In my a_try function, every next is double the value of guess. When I print out guess and next on every iteration, the next guess just keeps doubling, so eventually the numbers blow up.
Why? What's wrong with my code? What's wrong with my logic?
Thanks for your time. Please help.
To use Newton's method to find, say, sqrt(2)—that is, y**2 == 2—you first write g(y) = y**2 - 2, and then iterate over that with newton_transform until it converges.
Your deriv and newton_transform are fine, and your fixed_point does in fact iterate over newton_transform until it converges—or until you hit the recursion limit, or until you underflow a float operation. In your case, it's the last one.
Why? Well, look at your g(y): it's 2/y. I don't know where you got that from, but g/g' is just -y, so the Newton transform is y - (-y) = 2y, which obviously isn't going to converge: it just doubles the guess forever.
But if you plug in y**2 - 2, the transformed function will converge to the root (at least for most starting values).
So:
def sqrt(x):
    return newton_method(lambda y: y**2 - x, 1.0)
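With that g, the program converges; a quick sanity check (the exact digits printed depend on the dx and tolerance constants above):

print sqrt(2)  # approximately 1.41421356..., the square root of 2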
from sympy import diff
from math import log

def newt_method(f, a, b, e):
    if f(a) * f(b) >= 0:
        raise Exception("No roots present.")
    iterations3 = 0
    if diff(diff(f(a))) * f(a) > 0:
        x0 = a
    else:
        x0 = b
    while f(x0) > e:
        x0 = x0 - (f(x0) / diff(f(x0)))
        iterations3 = iterations3 + 1
    return x0, iterations3

a = 0.1
b = 2.1
e = 0.001

def F(a): return log(a, 10) + (0.5 * a)

newt_method(F, a, b, e)
As far as I know, the log function should not return complex data; however, the program raises "TypeError: Cannot convert complex to float". I would be grateful for any advice.
Traceback:
  line 12, in newt_method
    while f(x0) > e:
  line 20, in F
    def F(a): return log(a, 10) + (0.5 * a)
    raise TypeError("Cannot convert complex to float")
Both a and x0 are plain numbers, so f(a) is a number, and symbolic differentiation of a constant expression will at best return zero. Use, for instance, central difference quotients of first and second order, as sketched below.
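A minimal sketch of such difference quotients (the step sizes h are illustrative and should be tuned to the scale of f):

def dF(f, x, h=1e-6):
    # first derivative via the central difference quotient
    return (f(x + h) - f(x - h)) / (2.0 * h)

def d2F(f, x, h=1e-4):
    # second derivative via the second-order central difference quotient
    return (f(x + h) - 2.0*f(x) + f(x - h)) / (h*h)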
Why do you care about a bracketing interval if an initial guess is all that is needed for the Newton iteration?
The test for starting at the correct end of the interval is only valid if both the first and the second derivative do not change sign over the interval. So there is maybe a 10% (rough guess) chance that this test actually improves the initial guess.
You should always compare the absolute value abs(f(x0)) against the error level, not f(x0) itself.
If you go forward with the bracketing idea, for instance to ensure that all values stay inside the domain of the function, you will have to actually test that each root of a tangent (i.e., each Newton iterate) falls inside the current bracketing interval. If that fails, fall back to the secant root or to the midpoint as in bisection. Then actually shrink the bracketing interval; a sketch follows.
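A minimal sketch of such a safeguarded iteration (the function name, the max_iter cap, and the plain-midpoint fallback are assumptions, not a definitive implementation):

def safeguarded_newton(f, df, a, b, e, max_iter=100):
    # assumes f(a) and f(b) have opposite signs, i.e. [a, b] brackets a root
    x = 0.5 * (a + b)
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < e:
            return x
        # shrink the bracket so that it still contains the sign change
        if f(a) * fx < 0:
            b = x
        else:
            a = x
        d = df(x)
        # accept the Newton step only if it stays inside the bracket,
        # otherwise fall back to the midpoint as in bisection
        if d != 0 and a < x - fx / d < b:
            x = x - fx / d
        else:
            x = 0.5 * (a + b)
    return x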
I'm trying to do one task, but I just can't figure it out.
This is my function:
1/(x**(1/n)) + 1/(y**(1/n)) + 1/(z**(1/n)) - 1
I want that sum to be as close to 1 as possible.
And these are my input variables (x,y,z):
test = np.array([1.42, 5.29, 7.75])
So n is the only decision variable.
To summarize:
I have a situation like this right now:
1/(1.42**(1/1)) + 1/(5.29**(1/1)) + 1/(7.75**(1/1)) = 1.02229
And I want to get the following:
1/(1.42^(1/0.972782944446024)) + 1/(5.29^(1/0.972782944446024)) + 1/(7.75^(1/0.972782944446024)) = 0.999625
So far I have roughly nothing, and any help is welcome.
import numpy as np
from scipy.optimize import minimize

def objectiv(xyz):
    x = xyz[0]
    y = xyz[1]
    z = xyz[2]
    n = 1
    return 1/(x**(1/n)) + 1/(y**(1/n)) + 1/(z**(1/n))

test = np.array([1.42, 5.29, 7.75])
print(objectiv(test))

OUTPUT: 1.0222935270013889
How to properly define a constraint?
def conconstraint(xyz):
    x = xyz[0]
    y = xyz[1]
    z = xyz[2]
    n = 1
    return 1/(x**(1/n)) + 1/(y**(1/n)) + 1/(z**(1/n)) - 1
And it is not at all clear to me what I should do with n, or how.
EDIT
I managed to do the following:
def objective(n, *args):
    x = odds[0]
    y = odds[1]
    z = odds[2]
    return abs((1/(x**(1/n)) + 1/(y**(1/n)) + 1/(z**(1/n))) - 1)

odds = [1.42, 5.29, 7.75]
solve = minimize(objective, 1.0, args=(odds))
And my output:
fun: -0.9999999931706812
x: array([0.01864994])
And indeed, when I plug it into the formula:
(1/(1.42^(1/0.01864994)) + 1/(5.29^(1/0.01864994)) + 1/(7.75^(1/0.01864994))) - 1 = -0.999999993171
Unfortunately I need the sum itself to be close to a positive 1, and I have no idea what to change.
We want to find n that gets our result, for fixed x, y, and z, as close as possible to 1. minimize tries to get the lowest possible value for something, with no lower bound: -3 is better than -2, and so on.
So what we actually want is called least-squares optimization. Similar idea, though. This documentation is a bit hard to understand, so I'll try to clarify:
All these optimization functions have a common design where you pass in a callable that takes at least one parameter, the one you want to optimize for (in your case, n). Then you can have it take more parameters, whose values will be fixed according to what you pass in.
In your case, you want to be able to solve the optimization problem for different values of x, y and z. So you make your callback accept n, x, y, and z, and pass the x, y, and z values to use when you call scipy.optimize.least_squares. You pass these using the args keyword argument (notice that it is not *args). We can also supply an initial guess of 1 for the n value, which the algorithm will refine.
The rest is customization that is not relevant for our purposes.
So, first let us make the callback:
def objective(n, x, y, z):
    # residual: how far the sum of reciprocals is from the target 1
    return 1/(x**(1/n)) + 1/(y**(1/n)) + 1/(z**(1/n)) - 1
Now our call looks like:
best_n = least_squares(objective, 1.0, args=np.array([1.42, 5.29, 7.75]))
(You can call minimize the same way, and it will instead look for an n value that makes the objective function return as low a value as possible. If I am thinking clearly: the guess for n should trend towards zero, making the exponents 1/n blow up, making the denominators increase without bound, and thus making the sum of the reciprocals go towards zero; negative values are not possible. However, it will stop when it gets close to zero, according to the default values for ftol, xtol and gtol. Understanding this part properly is beyond the scope of this answer; please try math.stackexchange.com.)
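Putting it together, a minimal runnable sketch (note that least_squares returns a result object, so the optimized n lives in its .x attribute):

from scipy.optimize import least_squares

def objective(n, x, y, z):
    # residual: how far the sum of reciprocals is from the target 1
    return 1/(x**(1/n)) + 1/(y**(1/n)) + 1/(z**(1/n)) - 1

result = least_squares(objective, 1.0, args=(1.42, 5.29, 7.75))
print(result.x[0])  # the fitted n, somewhere around 0.97 for these inputs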
I am trying to get the root of a function and have been recommended to try using Newton's method.
I have tried to do the following:
def newton(f, Df, x0, epsilon, max_iter):
    '''Approximate solution of f(x)=0 by Newton's method.

    Parameters
    ----------
    f : function
        Function for which we are searching for a solution f(x)=0.
    Df : function
        Derivative of f(x).
    x0 : number
        Initial guess for a solution f(x)=0.
    epsilon : number
        Stopping criterion is abs(f(x)) < epsilon.
    max_iter : integer
        Maximum number of iterations of Newton's method.

    Returns
    -------
    xn : number
        Implement Newton's method: compute the linear approximation
        of f(x) at xn and find the x intercept by the formula
            x = xn - f(xn)/Df(xn)
        Continue until abs(f(xn)) < epsilon and return xn.
        If Df(xn) == 0, return None. If the number of iterations
        exceeds max_iter, then return None.

    Examples
    --------
    >>> f = lambda x: x**2 - x - 1
    >>> Df = lambda x: 2*x - 1
    >>> newton(f,Df,1,1e-8,10)
    Found solution after 5 iterations.
    1.618033988749989
    '''
    xn = x0
    for n in range(0, max_iter):
        fxn = f(xn)
        if abs(fxn) < epsilon:
            print('Found solution after', n, 'iterations.')
            return xn
        Dfxn = Df(xn)
        if Dfxn == 0:
            print('Zero derivative. No solution found.')
            return None
        xn = xn - fxn/Dfxn
    print('Exceeded maximum iterations. No solution found.')
    return None

f = lambda x: 1.03078 - (((x + 1.08804)**(23/252))*((2*x + 1.08804)**(37/252))*((3*x + 1.08804)**(19/126)))
But I need Df to be the first derivative of f. I have tried using scipy and sympy to get it, but a symbolic derivative is a different data type, so it does not work with the function above.
If not this way, could anyone recommend a different method?
Thanks
I'm not sure of an analytic way to calculate the derivative, but I think an approximation would not change the result of your function. Try replacing
Dfxn = Df(xn)
with
Dfxn = (f(xn+delta) - f(xn))/delta
for some small delta. It depends on the nature of your function, but I'd say anything less than 0.1 should be fine.
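For concreteness, here is a sketch of the same solver with the derivative replaced by a forward difference (the default delta of 1e-6 is an assumption; tune it to the scale of your function):

def newton_numeric(f, x0, epsilon, max_iter, delta=1e-6):
    # Newton's method with a forward-difference approximation of f'(x)
    xn = x0
    for n in range(max_iter):
        fxn = f(xn)
        if abs(fxn) < epsilon:
            return xn
        dfxn = (f(xn + delta) - fxn) / delta  # approximate Df(xn)
        if dfxn == 0:
            return None  # flat spot, no progress possible
        xn = xn - fxn / dfxn
    return None  # exceeded max_iter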
I'm trying to create a function that takes a value x, and creates a pattern like this with n+1 square root terms: sqrt(x)^sqrt(x)^sqrt(x)^sqrt(x)^sqrt(x)...
def func(x,n):
    a = x**0.5
    i = 0
    while i < n:
        a = a ** (x**0.5)
        i += 1
    print a
For example, using x = 2 the function does not converge to 2, but instead grows exponentially in some way, and I don't understand why.
On the first iteration (i=0) it seems correct, as it calculates sqrt(2)^sqrt(2), but on the second iteration (i=1) it gives me 2.0, and it keeps increasing after that.
Thanks!
The answer by @interjay explains what the problem is with the iterative method. As an alternative, you could also use a recursive method to calculate this:
from math import sqrt

def fun(x,n):
    if n == 0:
        return sqrt(x)
    else:
        return sqrt(x) ** fun(x, n-1)
>>> fun(2,2)
1.7608395558800285
>>> fun(2,3)
1.8409108692910108
>>> fun(2,10)
1.988711773413954
>>> fun(2,100)
2.0000000000000004
For sqrt(x)^sqrt(x)^sqrt(x)... to converge, the exponentiation needs to be treated as right-associative, i.e. sqrt(x)^(sqrt(x)^(sqrt(x)^...)). But your code is calculating it as left-associative: ((...^sqrt(x))^sqrt(x))^sqrt(x).
You need to switch the order of terms in
a = a ** (x**0.5)
to
a = (x**0.5) ** a
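For reference, here is the questioner's loop with that swap applied (returning the value instead of printing, so it can be compared with the recursive version above):

def func_fixed(x, n):
    # right-associative tower: sqrt(x) ** (sqrt(x) ** (... ** sqrt(x)))
    a = x ** 0.5
    for _ in range(n):
        a = (x ** 0.5) ** a
    return a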
I'm solving a one-dimensional non-linear equation with Newton's method. I'm trying to figure out why one of the implementations of Newton's method converges exactly within floating-point precision, whereas another does not.
The following algorithm does not converge:

x_{n+1} = x_n - f(x_n) / f'(x_n)

whereas the following does converge:

x_{n+1} = (x_n * f'(x_n) - f(x_n)) / f'(x_n)

Mathematically the two update formulas are identical; in floating-point arithmetic they are not.
You may assume that the functions f and f' are smooth and well behaved. The best explanation I was able to come up with is that this is somehow related to what's called iterative improvement (Golub and Van Loan, 1989). Any further insight would be greatly appreciated!
Here is a simple Python example illustrating the issue:

def f(x):
    return x*x - 2.

def fp(x):
    return 2.*x

xprev = 0.

# converges
x = 1.  # guess
while x != xprev:
    xprev = x
    x = (x*fp(x) - f(x))/fp(x)
print(x)

# does not converge
x = 1.  # guess
while x != xprev:
    xprev = x
    dx = -f(x)/fp(x)
    x = x + dx
print(x)
Note: I'm aware of how floating-point numbers work (please don't post your favourite link to a website telling me never to compare two floating-point numbers). Also, I'm not looking for a solution to the problem but for an explanation as to why one of the algorithms converges and the other does not.
Update:
As @uhoh pointed out, there are many cases where the second method does not converge. However, I still don't know why the second method converges so much more easily in my real-world scenario than the first. All the test cases have very simple functions f, whereas the real-world f has several hundred lines of code (which is why I don't want to post it). So maybe the complexity of f is important. If you have any additional insight into this, let me know!
Neither method is perfect:
One situation in which both methods will tend to fail is when the root lies almost exactly midway between two consecutive floating-point numbers f1 and f2. Then both methods, having arrived at f1, will try to compute that intermediate value and have a good chance of turning up f2, and vice versa.
                       /f(x)
                      /
                     /
                    /
                   /
  f1              /
--+--------------/--------+------> x
                /          f2
               /
              /
             /
"I'm aware of how floating point numbers work...". Perhaps the workings of floating-point arithmetic are more complicated than imagined.
This is a classic example of cycling of iterates in Newton's method. Comparing a difference to an epsilon is "mathematical thinking" that can burn you when using floating point. In your example, you visit several floating-point values for x, and then you are trapped in a cycle between two numbers. "Floating-point thinking" is better formulated as the following (sorry, my preferred language is C++):
#include <set>

std::set<double> visited;
double xprev = 0.0;
double x = 1.0;
while (x != xprev)
{
    xprev = x;
    double dx = -F(x) / DF(x);
    x = x + dx;
    if (visited.find(x) != visited.end())
    {
        break;  // found a cycle
    }
    visited.insert(x);
}
I'm trying to figure out why one of the implementations of Newton's method is converging exactly within floating point precision, whereas another is not.
Technically, it doesn't converge to the correct value. Try printing more digits, or using float.hex.
The first one gives
>>> print "%.16f" % x
1.4142135623730949
>>> float.hex(x)
'0x1.6a09e667f3bccp+0'
whereas the correctly rounded value is the next floating point value:
>>> print "%.16f" % math.sqrt(2)
1.4142135623730951
>>> float.hex(math.sqrt(2))
'0x1.6a09e667f3bcdp+0'
The second algorithm is actually alternating between the two values, so it doesn't converge.
The problem is due to catastrophic cancellation in f(x): as x*x will be very close to 2, when you subtract 2 the result is dominated by the rounding error incurred in computing x*x.
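To see that cancellation concretely, a quick check (assuming IEEE double precision; the printed value is purely the rounding error of x*x):

import math
x = math.sqrt(2)     # 1.4142135623730951, the correctly rounded root
print x*x - 2.       # about 4.44e-16, the rounding error incurred in x*x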
I think trying to force an exact equality (instead of err < small) is always going to fail frequently. In your example, for 100,000 random numbers between 1 and 10 (instead of your 2.0), the first method fails about 1/3 of the time and the second method about 1/6 of the time. I'll bet there's a way to predict that!
This takes ~30 seconds to run, and the results are cute!:
def f(x, a):
    return x*x - a

def fp(x):
    return 2.*x

def A(a):
    xprev = 0.
    x = 1.
    n = 0
    while x != xprev:
        xprev = x
        x = (x * fp(x) - f(x, a)) / fp(x)
        n += 1
        if n > 100:
            return n, x
    return n, x

def B(a):
    xprev = 0.
    x = 1.
    n = 0
    while x != xprev:
        xprev = x
        dx = - f(x, a) / fp(x)
        x = x + dx
        n += 1
        if n > 100:
            return n, x
    return n, x

import numpy as np
import matplotlib.pyplot as plt

n = 100000
aa = 1. + 9. * np.random.random(n)

data_A = np.zeros((2, n))
data_B = np.zeros((2, n))
for i, a in enumerate(aa):
    data_A[:, i] = A(a)
    data_B[:, i] = B(a)

bins = np.linspace(0, 110, 12)
hist_A = np.histogram(data_A[0], bins=bins)  # histogram the iteration counts
hist_B = np.histogram(data_B[0], bins=bins)

print "A: n<10: ", hist_A[0][0], " n>=100: ", hist_A[0][-1]
print "B: n<10: ", hist_B[0][0], " n>=100: ", hist_B[0][-1]

plt.figure()
plt.subplot(1, 2, 1)
plt.scatter(aa, data_A[0])
plt.subplot(1, 2, 2)
plt.scatter(aa, data_B[0])
plt.show()