I'm solving a one dimensional non-linear equation with Newton's method. I'm trying to figure out why one of the implementations of Newton's method is converging exactly within floating point precision, wheres another is not.
The following algorithm does not converge:
whereas the following does converge:
You may assume that the functions f and f' are smooth and well behaved. The best explanation I was able to come up with is that this is somehow related to what's called iterative improvement (Golub and Van Loan, 1989). Any further insight would be greatly appreciated!
Here is a simple python example illustrating the issue
# Python
def f(x):
return x*x-2.
def fp(x):
return 2.*x
xprev = 0.
# converges
x = 1. # guess
while x != xprev:
xprev = x
x = (x*fp(x)-f(x))/fp(x)
print(x)
# does not converge
x = 1. # guess
while x != xprev:
xprev = x
dx = -f(x)/fp(x)
x = x + dx
print(x)
Note: I'm aware of how floating point numbers work (please don't post your favourite link to a website telling me to never compare two floating point numbers). Also, I'm not looking for a solution to a problem but for an explanation as to why one of the algorithms converges but not the other.
Update:
As #uhoh pointed out, there are many cases where the second method does not converge. However, I still don't know why the second method converges so much more easily in my real world scenario than the first. All the test cases have very simple functions f whereas the real world f has several hundred lines of code (which is why I don't want to post it). So maybe the complexity of f is important. If you have any additional insight into this, let me know!
None of the methods is perfect:
One situation in which both methods will tend to fail is if the root is about exactly midway between two consecutive floating-point numbers f1 and f2. Then both methods, having arrived to f1, will try to compute that intermediate value and have a good chance of turning up f2, and vice versa.
/f(x)
/
/
/
/
f1 /
--+----------------------+------> x
/ f2
/
/
/
"I'm aware of how floating point numbers work...". Perhaps the workings of floating-point arithmetic are more complicated than imagined.
This is a classic example of cycling of iterates using Newton's method. The comparison of a difference to an epsilon is "mathematical thinking" and can burn you when using floating-point. In your example, you visit several floating-point values for x, and then you are trapped in a cycle between two numbers. The "floating-point thinking" is better formulated as the following (sorry, my preferred language is C++)
std::set<double> visited;
xprev = 0.0;
x = 1.0;
while (x != prev)
{
xprev = x;
dx = -F(x)/DF(x);
x = x + dx;
if (visited.find(x) != visited.end())
{
break; // found a cycle
}
visited.insert(x);
}
I'm trying to figure out why one of the implementations of Newton's method is converging exactly within floating point precision, wheres another is not.
Technically, it doesn't converge to the correct value. Try printing more digits, or using float.hex.
The first one gives
>>> print "%.16f" % x
1.4142135623730949
>>> float.hex(x)
'0x1.6a09e667f3bccp+0'
whereas the correctly rounded value is the next floating point value:
>>> print "%.16f" % math.sqrt(2)
1.4142135623730951
>>> float.hex(math.sqrt(2))
'0x1.6a09e667f3bcdp+0'
The second algorithm is actually alternating between the two values, so doesn't converge.
The problem is due to catastrophic cancellation in f(x): as x*x will be very close to 2, when you subtract 2, the result will be dominated by the rounding error incurred in computing x*x.
I think trying to force an exact equal (instead of err < small) is always going to fail frequently. In your example, for 100,000 random numbers between 1 and 10 (instead of your 2.0) the first method fails about 1/3 of the time, the second method about 1/6 of the time. I'll bet there's a way to predict that!
This takes ~30 seconds to run, and the results are cute!:
def f(x, a):
return x*x - a
def fp(x):
return 2.*x
def A(a):
xprev = 0.
x = 1.
n = 0
while x != xprev:
xprev = x
x = (x * fp(x) - f(x,a)) / fp(x)
n += 1
if n >100:
return n, x
return n, x
def B(a):
xprev = 0.
x = 1.
n = 0
while x != xprev:
xprev = x
dx = - f(x,a) / fp(x)
x = x + dx
n += 1
if n >100:
return n, x
return n, x
import numpy as np
import matplotlib.pyplot as plt
n = 100000
aa = 1. + 9. * np.random.random(n)
data_A = np.zeros((2, n))
data_B = np.zeros((2, n))
for i, a in enumerate(aa):
data_A[:,i] = A(a)
data_B[:,i] = B(a)
bins = np.linspace(0, 110, 12)
hist_A = np.histogram(data_A, bins=bins)
hist_B = np.histogram(data_B, bins=bins)
print "A: n<10: ", hist_A[0][0], " n>=100: ", hist_A[0][-1]
print "B: n<10: ", hist_B[0][0], " n>=100: ", hist_B[0][-1]
plt.figure()
plt.subplot(1,2,1)
plt.scatter(aa, data_A[0])
plt.subplot(1,2,2)
plt.scatter(aa, data_B[0])
plt.show()
Related
I'm trying to do one task, but I just can't figure it out.
This is my function:
1/(x**1/n) + 1/(y**1/n) + 1/(z**1/n) - 1
I want that sum to be as close to 1 as possible.
And these are my input variables (x,y,z):
test = np.array([1.42, 5.29, 7.75])
So n is the only decision variable.
To summarize:
I have a situation like this right now:
1/(1.42**1/1) + 1/(5.29**1/1) + 1/(7.75**1/1) = 1.02229
And I want to get the following:
1/(1.42^(1/0.972782944446024)) + 1/(5.29^(1/0.972782944446024)) + 1/(7.75^(1/0.972782944446024)) = 0.999625
So far I have roughly nothing, and any help is welcome.
import numpy as np
from scipy.optimize import minimize
def objectiv(xyz):
x = xyz[0]
y = xyz[1]
z = xyz[2]
n = 1
return 1/(x**(1/n)) + 1/(y**(1/n)) + 1/(z**(1/n))
test = np.array([1.42, 5.29, 7.75])
print(objectiv(test))
OUTPUT: 1.0222935270013889
How to properly define a constraint?
def conconstraint(xyz):
x = xyz[0]
y = xyz[1]
z = xyz[2]
n = 1
return 1/(x**(1/n)) + 1/(y**(1/n)) + 1/(z**(1/n)) - 1
And it is not at all clear to me how and what to do with n?
EDIT
I managed to do the following:
def objective(n,*args):
x = odds[0]
y = odds[1]
z = odds[2]
return abs((1/(x**(1/n)) + 1/(y**(1/n)) + 1/(z**(1/n))) - 1)
odds = [1.42,5.29,7.75]
solve = minimize(objective,1.0,args=(odds))
And my output:
fun: -0.9999999931706812
x: array([0.01864994])
And really when put in the formula:
(1/(1.42^(1/0.01864994)) + 1/(5.29^(1/0.01864994)) + 1/(7.75^(1/0.01864994))) -1 = -0.999999993171
Unfortunately I need a positive 1 and I have no idea what to change.
We want to find n that gets our result for a fixed x, y, and z as close as possible to 1. minimize tries to get the lowest possible value for something, without negative bound; -3 is better than -2, and so on.
So what we actually want is called least-squares optimization. Similar idea, though. This documentation is a bit hard to understand, so I'll try to clarify:
All these optimization functions have a common design where you pass in a callable that takes at least one parameter, the one you want to optimize for (in your case, n). Then you can have it take more parameters, whose values will be fixed according to what you pass in.
In your case, you want to be able to solve the optimization problem for different values of x, y and z. So you make your callback accept n, x, y, and z, and pass the x, y, and z values to use when you call scipy.optimize.least_squares. You pass these using the args keyword argument (notice that it is not *args). We can also supply an initial guess of 1 for the n value, which the algorithm will refine.
The rest is customization that is not relevant for our purposes.
So, first let us make the callback:
def objective(n, x, y, z):
return 1/(x**(1/n)) + 1/(y**(1/n)) + 1/(z**(1/n))
Now our call looks like:
best_n = least_squares(objective, 1.0, args=np.array([1.42, 5.29, 7.75]))
(You can call minimize the same way, and it will instead look for an n value to make the objective function return as low a value as possible. If I am thinking clearly: the guess for n should trend towards zero, making the denominators increase without bound, making the sum of the reciprocals go towards zero; negative values are not possible. However, it will stop when it gets close to zero, according to the default values for ftol, xtol and gtol. To understand this part properly is beyond the scope of this answer; please try on math.stackexchange.com.)
I have a lot of measurements, with measurement uncertainties. To speed up the process of reporting all the measurements and their uncertainties, I'm writing a script to print them for me.
For example, when I have measurement x = 0.012345 with uncertainty dx = 0.000321, I want my python script to print '0.0123(3)', with (3) being the uncertainty on the last digit of the rounded x
I feel like this should be very easy, but so far, I'm coming up with incredibly ugly solutions, iterating over numbers as a string. What would be a good pythonic way to do this?
from math import log10, floor
x = 0.012345
dx = 0.000321
print('{:.{prec}f}({:.0f})'.format(x, floor(dx / (10**floor(log10(dx)))), prec=-floor(log10(dx))))
This code prints
0.0123(3)
You can do it like this:
print("{:.4f}({:.0f})".format(x, dx*10**4))
However that will only work if your error is in the order of 10**-4 otherwise it will output more numbers and if it is smaller it will output (0). S you might have to deal with the powers of your values. So this works for your specific example. Not sure how flexible it is supposed to be. BUt this could be a start for an idea on how to do it.
Based on the very clever solution from Yaroslav Kornachevskyi, I wrote a little function to do this for all numbers, rounding correctly. Probably nobody will ever use this but me:
from math import log10, floor, ceil
x = 0.012345
dx = 0.0000968
def get_number(x, dx):
""" Returns a string of the measurement value"""
""" together with the measurement error"""
""" x: measurement value"""
""" dx: measurment error"""
# Power of dx
power_err = log10(dx)
# Digits of dx in format a.bcd
n_err = dx / (10**floor(power_err))
# If the second digit in dx is >=5
# round the 1st digit in dx up
if n_err % 1 >= 0.5:
# If the first digit of dx is 9
# the precision is one digit less
if int(n_err) == 9:
err = 1
# The precision of x is determined by the precision of dx
prec=int(-floor(log10(dx))) - 1
else:
err = ceil(n_err)
# The precision of x is determined by the precision of dx
prec=int(-floor(log10(dx)))
# Otherwise round down
else:
err = floor(n_err)
# The precision of x is determined by the precision of dx
prec=int(-floor(log10(dx)))
return '{:.{prec}f}({:.0f})'.format(x, err, prec = prec)
print(get_number(x, dx))
I'm trying to create a function that takes a value x, and creates a pattern like this with n+1 square root terms: sqrt(x)^sqrt(x)^sqrt(x)^sqrt(x)^sqrt(x)...
def func(x,n):
a = x**0.5
i = 0
while i < n:
a = a ** (x**0.5)
i += 1
print a
For example using x = 2, the function does not converge (to 2), but increases exponentially in some way, I don't understand why.
For the first iteration (i=0) it seems to be correct, as it calculates, sqrt(2)^sqrt(2), but for the second iteration (i=1) it gives me 2.0, and it keeps increasing.
Thanks!
The above answer by #interjay illustrates what the problem is with the iterative method. As an alternative, you could also use a recursive method to calculate this
from math import sqrt
def fun(x,n):
if n == 0:
return sqrt(x)
else:
return sqrt(x) ** fun(x, n-1)
>>> fun(2,2)
1.7608395558800285
>>> fun(2,3)
1.8409108692910108
>>> fun(2,10)
1.988711773413954
>>> fun(2,100)
2.0000000000000004
For sqrt(x)^sqrt(x)^sqrt(x)... to converge, the exponentiation needs to be treated as right-associative, i.e. sqrt(x)^(sqrt(x)^(sqrt(x)^...)). But your code is calculating it as left-associative: ((...^sqrt(x))^sqrt(x))^sqrt(x).
You need to switch the order of terms in
a = a ** (x**0.5)
to
a = (x**0.5) ** a
I wish to calculate the standard error of a series of numbers. Suppose the numbers are x[i] where i = 1 ... N. To do this
I set
averageX = 0.0
averageXSquared = 0.0
I then loop over all i=1,...N and for each I calculate
averageX += x[i]
averageXSquared += x[i]**2
I then divide by N
averageX = averageXC / N
averageXSquared = averageXSquared/N
I then take the square root of the difference
stdX = math.sqrt(averageXSquared - averageX * averageX)
The argument here is sure to always be >=0.
However if I set all x[i] = 0.07 (for example) then I get a math domain error as the argument of the root function is negative. There seems to be some loss of precision.
The argument is of the order of 10e-15.
This does not look encouraging. I now have to check myself to see if the result is negative before taking the root.
Or have I done something wrong.
This is not a python problem, but a problem with finite precision in general. If you set all numbers to the same value, the standard error is mathematically 0, but not for a computer. The correct way to handle this, is to set very small values <0 to 0.
x = [0.7, 0.7, 0.7]
average = sum(x) / len(x)
sqav = sum(y**2 for y in x) / len(x)
stderr = math.sqrt(max(sqav - average**2, 0))
The correct way, of course is never subtract large numbers. Have another pass, which guarantees non-negativity (you need to do some algebra to realize that the result is mathematically the same):
y = [ v - average for v in x ]
dev = sum(v*v for v in y) / len(x)
stderr = math.sqrt(dev)
Does anyone know why the below doesn't equal 0?
import numpy as np
np.sin(np.radians(180))
or:
np.sin(np.pi)
When I enter it into python it gives me 1.22e-16.
The number π cannot be represented exactly as a floating-point number. So, np.radians(180) doesn't give you π, it gives you 3.1415926535897931.
And sin(3.1415926535897931) is in fact something like 1.22e-16.
So, how do you deal with this?
You have to work out, or at least guess at, appropriate absolute and/or relative error bounds, and then instead of x == y, you write:
abs(y - x) < abs_bounds and abs(y-x) < rel_bounds * y
(This also means that you have to organize your computation so that the relative error is larger relative to y than to x. In your case, because y is the constant 0, that's trivial—just do it backward.)
Numpy provides a function that does this for you across a whole array, allclose:
np.allclose(x, y, rel_bounds, abs_bounds)
(This actually checks abs(y - x) < abs_ bounds + rel_bounds * y), but that's almost always sufficient, and you can easily reorganize your code when it's not.)
In your case:
np.allclose(0, np.sin(np.radians(180)), rel_bounds, abs_bounds)
So, how do you know what the right bounds are? There's no way to teach you enough error analysis in an SO answer. Propagation of uncertainty at Wikipedia gives a high-level overview. If you really have no clue, you can use the defaults, which are 1e-5 relative and 1e-8 absolute.
One solution is to switch to sympy when calculating sin's and cos's, then to switch back to numpy using sp.N(...) function:
>>> # Numpy not exactly zero
>>> import numpy as np
>>> value = np.cos(np.pi/2)
6.123233995736766e-17
# Sympy workaround
>>> import sympy as sp
>>> def scos(x): return sp.N(sp.cos(x))
>>> def ssin(x): return sp.N(sp.sin(x))
>>> value = scos(sp.pi/2)
0
just remember to use sp.pi instead of sp.np when using scos and ssin functions.
Faced same problem,
import numpy as np
print(np.cos(math.radians(90)))
>> 6.123233995736766e-17
and tried this,
print(np.around(np.cos(math.radians(90)), decimals=5))
>> 0
Worked in my case. I set decimal 5 not lose too many information. As you can think of round function get rid of after 5 digit values.
Try this... it zeros anything below a given tiny-ness value...
import numpy as np
def zero_tiny(x, threshold):
if (x.dtype == complex):
x_real = x.real
x_imag = x.imag
if (np.abs(x_real) < threshold): x_real = 0
if (np.abs(x_imag) < threshold): x_imag = 0
return x_real + 1j*x_imag
else:
return x if (np.abs(x) > threshold) else 0
value = np.cos(np.pi/2)
print(value)
value = zero_tiny(value, 10e-10)
print(value)
value = np.exp(-1j*np.pi/2)
print(value)
value = zero_tiny(value, 10e-10)
print(value)
Python uses the normal taylor expansion theory it solve its trig functions and since this expansion theory has infinite terms, its results doesn't reach exact but it only approximates.
For e.g
sin(x) = x - x³/3! + x⁵/5! - ...
=> Sin(180) = 180 - ... Never 0 bout approaches 0.
That is my own reason by prove.
Simple.
np.sin(np.pi).astype(int)
np.sin(np.pi/2).astype(int)
np.sin(3 * np.pi / 2).astype(int)
np.sin(2 * np.pi).astype(int)
returns
0
1
0
-1