Why is math.sqrt() incorrect for large numbers?

Why does the math module return the wrong result?
First test
from math import sqrt
A = 12345678917
print 'A =',A
B = sqrt(A**2)
print 'B =',int(B)
Result
A = 12345678917
B = 12345678917
Here, the result is correct.
Second test
A = 123456758365483459347856
print 'A =',A
B = sqrt(A**2)
print 'B =',int(B)
Result
A = 123456758365483459347856
B = 123456758365483467538432
Here the result is incorrect.
Why is that the case?

Because math.sqrt(..) first converts its argument to a floating-point number, and a float has a limited mantissa: it can represent only the leading part of a large number exactly. So float(A**2) is not equal to A**2. The square root is then computed from that rounded value, so it is itself only approximately correct.
Most floating-point functions will not agree exactly with their integer counterparts for large inputs. Floating-point calculations are inherently approximate.
If one calculates A**2 one gets:
>>> 12345678917**2
152415787921658292889L
Now if one converts it to a float(..), one gets:
>>> float(12345678917**2)
1.5241578792165828e+20
But if you now ask whether the two are equal:
>>> float(12345678917**2) == 12345678917**2
False
So information has been lost while converting it to a float.
You can read more about how floats work and why they are approximate in the Wikipedia article on IEEE-754, the formal specification of floating-point arithmetic.
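To make the mantissa limit concrete, here is a minimal sketch in Python 3. The helper name survives_float_roundtrip is made up for this illustration, and the sketch assumes the usual IEEE-754 double, whose mantissa width is reported by sys.float_info.mant_dig.

```python
import sys

def survives_float_roundtrip(n):
    """Return True if the integer n is unchanged by int -> float -> int."""
    return int(float(n)) == n

# Number of mantissa bits in a Python float (53 for an IEEE-754 double).
MANT_BITS = sys.float_info.mant_dig

print(survives_float_roundtrip(2 ** MANT_BITS))      # still exact
print(survives_float_roundtrip(2 ** MANT_BITS + 1))  # needs one bit too many
print(survives_float_roundtrip(12345678917 ** 2))    # the A**2 from the first test
```

Any odd integer above 2**MANT_BITS fails the round trip, which is why A**2 cannot be stored exactly.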

The documentation for the math module states "It provides access to the mathematical functions defined by the C standard." It also states "Except when explicitly noted otherwise, all return values are floats."
Those together mean that the parameter to the square root function is a float value. In most systems that means a floating point value that fits into 8 bytes, which is called "double" in the C language. Your code converts your integer value into such a value before calculating the square root, then returns such a value.
However, the 8-byte floating point value can store at most 15 to 17 significant decimal digits. That is what you are getting in your results.
If you want better precision in your square roots, use a function that is guaranteed to give full precision for an integer argument. Just do a web search and you will find several. Those usually use a variation of the Newton-Raphson method to iterate until they reach the correct answer. Be aware that this is significantly slower than the math module's sqrt function.
Here is a routine that I modified from the internet. I can't cite the source right now. This version also works for non-integer arguments but just returns the integer part of the square root.
def isqrt(x):
    """Return the integer part of the square root of x, even for very
    large values."""
    if x < 0:
        raise ValueError('square root not defined for negative numbers')
    n = int(x)
    if n == 0:
        return 0
    a, b = divmod(n.bit_length(), 2)
    x = (1 << (a+b)) - 1
    while True:
        y = (x + n//x) // 2
        if y >= x:
            return x
        x = y
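On Python 3.8 and later, the standard library also ships math.isqrt, which computes the floor of the square root on arbitrary-size integers without going through a float. A short sketch using the number from the question (the variable names are only for illustration):

```python
import math

A = 123456758365483459347856

exact_root = math.isqrt(A ** 2)      # integer arithmetic only
float_root = int(math.sqrt(A ** 2))  # goes through a 64-bit float

print(exact_root == A)  # True
print(float_root == A)  # False: the float result is off in the low digits
```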

If you want to calculate sqrt of really large numbers and you need exact results, you can use sympy:
import sympy
num = sympy.Integer(123456758365483459347856)
print(int(num) == int(sympy.sqrt(num**2)))

The way floating-point numbers are stored in memory makes calculations with them prone to slight errors that can nevertheless be significant when exact results are needed. As mentioned in one of the comments, the decimal library can help you here:
>>> from decimal import Decimal
>>> A = Decimal(123456758365483459347856)
>>> A
Decimal('123456758365483459347856')
>>> B = A.sqrt()**2
>>> B
Decimal('123456758365483459347856.0000')
>>> A == B
True
>>> int(B)
123456758365483459347856
I use version 3.6, which has no hardcoded limit on the size of integers. I don't know if, in 2.7, casting B as an int would cause overflow, but decimal is incredibly useful regardless.

Related

I'm making mistakes dividing large numbers

I am trying to write a program in python 2.7 that will first see if a number divides the other evenly, and if it does get the result of the division.
However, I am getting some interesting results when I use large numbers.
Currently I am using:
from __future__ import division
import math
a=82348972389472433334783
b=2
if a/b==math.trunc(a/b):
    answer=a/b
    print 'True' #to quickly see if the if loop was invoked
When I run this I get:
True
But 82348972389472433334783 is clearly not even.
Any help would be appreciated.

That's a crazy way to do it. Just use the remainder operator.
if a % b == 0:
    # then b divides a evenly
    quotient = a // b

The true division implicitly converts the input to floats which don't provide the precision to store the value of a accurately. E.g. on my machine
>>> int(1E15+1)
1000000000000001
>>> int(1E16+1)
10000000000000000
hence you lose precision. A similar thing happens with your big number (compare int(float(a))-a).
Now, if you check your division, you see the result "is" actually found to be an integer
>>> (a/b).is_integer()
True
which is again not really expected beforehand.
The math.trunc function does something similar (from the docs):
Return the Real value x truncated to an Integral (usually a long integer).
The duck typing nature of python allows a comparison of the long integer and float, see
Checking if float is equivalent to an integer value in python and
Comparing a float and an int in Python.
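The effect described above can be reproduced with the numbers from the question. This is a small Python 3 sketch (where / is already true division); the variable names quotient, remainder and as_float are chosen here for illustration.

```python
a = 82348972389472433334783
b = 2

quotient, remainder = divmod(a, b)  # exact integer arithmetic
print(remainder)                    # 1, so b does not divide a evenly

as_float = a / b                    # true division returns a float
print(as_float.is_integer())        # True: floats this large have no fractional bits
print(int(float(a)) - a)            # non-zero: a itself is not representable as a float
```

The integer operators never leave exact arithmetic, so the divisibility test should be done with % or divmod rather than with /.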

Why don't you use the modulus operator instead to check if a number can be divided evenly?
n % x == 0

Python equal operator for finite precision [duplicate]

I have been asked to test a library provided by a 3rd party. The library is known to be accurate to n significant figures. Any less-significant errors can safely be ignored. I want to write a function to help me compare the results:
def nearlyequal( a, b, sigfig=5 ):
The purpose of this function is to determine if two floating-point numbers (a and b) are approximately equal. The function will return True if a==b (exact match) or if a and b have the same value when rounded to sigfig significant-figures when written in decimal.
Can anybody suggest a good implementation? I've written a mini unit-test. Unless you can see a bug in my tests then a good implementation should pass the following:
assert nearlyequal(1, 1, 5)
assert nearlyequal(1.0, 1.0, 5)
assert nearlyequal(1.0, 1.0, 5)
assert nearlyequal(-1e-9, 1e-9, 5)
assert nearlyequal(1e9, 1e9 + 1 , 5)
assert not nearlyequal( 1e4, 1e4 + 1, 5)
assert nearlyequal( 0.0, 1e-15, 5 )
assert not nearlyequal( 0.0, 1e-4, 6 )
Additional notes:
Values a and b might be of type int, float or numpy.float64. Values a and b will always be of the same type. It's vital that conversion does not introduce additional error into the function.
Lets keep this numerical, so functions that convert to strings or use non-mathematical tricks are not ideal. This program will be audited by somebody who is a mathematician who will want to be able to prove that the function does what it is supposed to do.
Speed... I've got to compare a lot of numbers so the faster the better.
I've got numpy, scipy and the standard-library. Anything else will be hard for me to get, especially for such a small part of the project.

As of Python 3.5, the standard way to do this (using the standard library) is with the math.isclose function.
It has the following signature:
isclose(a, b, rel_tol=1e-9, abs_tol=0.0)
An example of usage with absolute error tolerance:
from math import isclose
a = 1.0
b = 1.00000001
assert isclose(a, b, abs_tol=1e-8)
If you want the values to agree to roughly n significant digits, use a relative tolerance instead, because abs_tol only bounds the absolute difference:
assert isclose(a, b, rel_tol=10**-n)
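Building on isclose, here is a minimal sketch of the nearlyequal function from the question. It is a tolerance-based approximation, not a literal "round to sigfig digits and compare". Using the same value for rel_tol and abs_tol is an assumption made here because it satisfies the tests listed in the question; a real application may want to choose the two tolerances separately.

```python
from math import isclose

def nearlyequal(a, b, sigfig=5):
    """Approximate 'equal to sigfig significant figures' with tolerances."""
    tol = 10.0 ** -sigfig
    # rel_tol handles ordinary magnitudes; abs_tol handles values near zero,
    # where a purely relative comparison can never succeed.
    return isclose(a, b, rel_tol=tol, abs_tol=tol)

print(nearlyequal(1e9, 1e9 + 1, 5))  # True
print(nearlyequal(1e4, 1e4 + 1, 5))  # False
print(nearlyequal(0.0, 1e-15, 5))    # True
```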

There is a function assert_approx_equal in numpy.testing (source here) which may be a good starting point.
def assert_approx_equal(actual,desired,significant=7,err_msg='',verbose=True):
    """
    Raise an assertion if two items are not equal up to significant digits.
    .. note:: It is recommended to use one of `assert_allclose`,
              `assert_array_almost_equal_nulp` or `assert_array_max_ulp`
              instead of this function for more consistent floating point
              comparisons.
    Given two numbers, check that they are approximately equal.
    Approximately equal is defined as the number of significant digits
    that agree.

Here's a take.
def nearly_equal(a,b,sig_fig=5):
    return ( a==b or
             int(a*10**sig_fig) == int(b*10**sig_fig)
           )

I believe your question is not defined well enough, and the unit-tests you present prove it:
If by 'round to N sig-fig decimal places' you mean 'N decimal places to the right of the decimal point', then the test assert nearlyequal(1e9, 1e9 + 1 , 5) should fail, because even when you round 1000000000 and 1000000001 to 0.00001 accuracy, they are still different.
And if by 'round to N sig-fig decimal places' you mean 'The N most significant digits, regardless of the decimal point', then the test assert nearlyequal(-1e-9, 1e-9, 5) should fail, because 0.000000001 and -0.000000001 are totally different when viewed this way.
If you meant the first definition, then the first answer on this page (by Triptych) is good.
If you meant the second definition, please say it, I promise to think about it :-)

There are already plenty of great answers, but here's a thought:
import math
def closeness(a, b):
    """Returns measure of equality (for two floats), in unit
    of decimal significant figures."""
    if a == b:
        return float("infinity")
    difference = abs(a - b)
    avg = abs(a + b)/2.0  # magnitude of the mean; assumes a != -b
    return math.log10( avg / difference )
if closeness(1000, 1000.1) > 3:
    print "Joy!"

This is a fairly common issue with floating point numbers. I solve it based on the discussion in Section 1.5 of Demmel[1]. (1) Calculate the roundoff error. (2) Check that the roundoff error is less than some epsilon. I haven't used python in some time and only have version 2.4.3, but I'll try to get this correct.
Step 1. Roundoff error
def roundoff_error(exact, approximate):
    return abs(approximate/exact - 1.0)
Step 2. Floating point equality
def float_equal(float1, float2, epsilon=2.0e-9):
    return (roundoff_error(float1, float2) < epsilon)
There are a couple obvious deficiencies with this code.
Division by zero error if the exact value is Zero.
Does not verify that the arguments are floating point values.
Revision 1.
def roundoff_error(exact, approximate):
    if (exact == 0.0 or approximate == 0.0):
        return abs(exact + approximate)
    else:
        return abs(approximate/exact - 1.0)
def float_equal(float1, float2, epsilon=2.0e-9):
    if not isinstance(float1,float):
        raise TypeError("First argument is not a float.")
    elif not isinstance(float2,float):
        raise TypeError("Second argument is not a float.")
    else:
        return (roundoff_error(float1, float2) < epsilon)
That's a little better. If either the exact or the approximate value is zero, then the error is equal to the value of the other. If something besides a floating point value is provided, a TypeError is raised.
At this point, the only difficult thing is setting the correct value for epsilon. I noticed in the documentation for version 2.6.1 that there is an epsilon attribute in sys.float_info, so I would use twice that value as the default epsilon. But the correct value depends on both your application and your algorithm.
[1] James W. Demmel, Applied Numerical Linear Algebra, SIAM, 1997.

"Significant figures" in decimal is a matter of adjusting the decimal point and truncating to an integer.
>>> int(3.1415926 * 10**3)
3141
>>> int(1234567 * 10**-3)
1234
>>>

Oren Shemesh identified part of the problem with the question as stated, but there's more:
assert nearlyequal( 0.0, 1e-15, 5 )
also fails the second definition (and that's the definition I learned in school.)
No matter how many digits you are looking at, 0 will not equal a not-zero. This could prove to be a headache for such tests if you have a case whose correct answer is zero.

There is an interesting solution to this by B. Dawson (with C++ code) at "Comparing Floating Point Numbers". His approach relies on strict IEEE representation of two numbers and the enforced lexicographical ordering when said numbers are represented as unsigned integers.
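The idea of comparing floats through their bit patterns can be sketched in Python with the struct module. This is not Dawson's code; it is a simplified illustration that assumes IEEE-754 doubles and ignores NaN and infinities, and the names ordered_bits and ulp_distance are invented here.

```python
import struct
import sys

def ordered_bits(x):
    """Map a float to an int whose ordering matches the float ordering."""
    (bits,) = struct.unpack("<q", struct.pack("<d", x))
    # Negative floats are stored as sign plus magnitude; mirror them so the
    # mapping is monotonic and -0.0 lands on the same value as +0.0.
    return bits if bits >= 0 else -(bits & 0x7FFFFFFFFFFFFFFF)

def ulp_distance(a, b):
    """Number of representable doubles between a and b."""
    return abs(ordered_bits(a) - ordered_bits(b))

print(ulp_distance(1.0, 1.0 + sys.float_info.epsilon))  # adjacent doubles
print(ulp_distance(0.0, -0.0))                          # same value, two encodings
```

Two floats can then be treated as "nearly equal" when their distance is below a small threshold chosen by the caller.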

I have been asked to test a library provided by a 3rd party
If you are using the default Python unittest framework, you can use assertAlmostEqual
self.assertAlmostEqual(a, b, places=5)

There are lots of ways of comparing two numbers to see if they agree to N significant digits. Roughly speaking you just want to make sure that their difference is less than 10^-N times the largest of the two numbers being compared. That's easy enough.
But, what if one of the numbers is zero? The whole concept of relative-differences or significant-digits falls down when comparing against zero. To handle that case you need to have an absolute-difference as well, which should be specified differently from the relative-difference.
I discuss the problems of comparing floating-point numbers -- including a specific case of handling zero -- in this blog post:
http://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/

Converting An "Infinite" Float To An Int [duplicate]

I'm trying to check if a number is a perfect square. However, I am dealing with extraordinarily large numbers, so Python thinks the result is infinity for some reason. It gets up to 1.1 x 10^154 before the code returns "Inf". Is there any way to get around this? Here is the code; the lst variable just holds a bunch of really, really big numbers.
import math
from decimal import Decimal
def main():
    for i in lst:
        root = math.sqrt(Decimal(i))
        print(root)
        if int(root + 0.5) ** 2 == i:
            print(str(i) + " True")

Replace math.sqrt(Decimal(i)) with Decimal(i).sqrt() to prevent your Decimals from decaying into floats.
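Note that Decimal.sqrt() rounds to the precision of the current context (28 digits by default), so the precision has to grow with the input. A minimal Python 3 sketch follows; the function name is_perfect_square and the headroom of 10 digits are choices made for this illustration.

```python
from decimal import Decimal, localcontext

def is_perfect_square(n):
    """Check a non-negative int for squareness using Decimal.sqrt()."""
    with localcontext() as ctx:
        # The root has about half as many digits as n; leave some headroom.
        ctx.prec = len(str(n)) // 2 + 10
        root = int(Decimal(n).sqrt())
    # Confirm in exact integer arithmetic, so rounding cannot produce a false positive.
    return root * root == n

big = (10 ** 80 + 7) ** 2
print(is_perfect_square(big))      # True
print(is_perfect_square(big + 1))  # False
```

The final comparison uses plain integers, so the Decimal step only has to produce a good candidate root.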

I think that you need to take a look at the BigFloat module, e.g.:
import bigfloat as bf
b = bf.BigFloat('1e1000', bf.precision(21))
print bf.sqrt(b)
Prints BigFloat.exact('9.9999993810013282e+499', precision=53)

@casevh has the right answer -- use a library that can do math on arbitrarily large integers. Since you're looking for squares, you presumably are working with integers, and one could argue that using floating point types (including decimal.Decimal) is, in some sense, inelegant.
You definitely shouldn't use Python's float type; it has limited precision (about 16 significant decimal digits). If you do use decimal.Decimal, be careful to specify the precision (which will depend on how big your numbers are).
Since Python has a big integer type, one can write a reasonably simple algorithm to check for squareness; see my implementation of such an algorithm, along with illustrations of problems with float, and how you could use decimal.Decimal, below.
import math
import decimal
def makendigit(n):
    """Return an arbitraryish n-digit number"""
    return sum((j%9+1)*10**i for i,j in enumerate(range(n)))
x=makendigit(30)
# it looks like float will work...
print 'math.sqrt(x*x) - x: %.17g' % (math.sqrt(x*x) - x)
# ...but actually it won't
print 'math.sqrt(x*x+1) - x: %.17g' % (math.sqrt(x*x+1) - x)
# by default Decimal won't be sufficient...
print 'decimal.Decimal(x*x).sqrt() - x:',decimal.Decimal(x*x).sqrt() - x
# ...you need to specify the precision
print 'decimal.Decimal(x*x).sqrt(decimal.Context(prec=100)) - x:',decimal.Decimal(x*x).sqrt(decimal.Context(prec=100)) - x
def issquare_decimal(y,prec=1000):
    x=decimal.Decimal(y).sqrt(decimal.Context(prec=prec))
    return x==x.to_integral_value()
print 'issquare_decimal(x*x):',issquare_decimal(x*x)
print 'issquare_decimal(x*x+1):',issquare_decimal(x*x+1)
# you can check for "squareness" without going to floating point.
# one option is a bisection search; this Newton's method approach
# should be faster.
# For "industrial use" you should use gmpy2 or some similar "big
# integer" library.
def isqrt(y):
    """Find largest integer <= sqrt(y)"""
    if not isinstance(y,(int,long)):
        raise ValueError('arg must be an integer')
    if y<0:
        raise ValueError('arg must be positive')
    if y in (0,1):
        return y
    x0=y//2
    while True:
        # newton's rule
        x1= (x0**2+y)//2//x0
        # we don't always converge to x0=x1, e.g., for y=3
        if abs(x1-x0)<=1:
            # nearly converged; find biggest
            # integer satisfying our condition
            x=max(x0,x1)
            if x**2>y:
                while x**2>y:
                    x-=1
            else:
                while (x+1)**2<=y:
                    x+=1
            return x
        x0=x1
def issquare(y):
    """Return true if non-negative integer y is a perfect square"""
    return y==isqrt(y)**2
print 'isqrt(x*x)-x:',isqrt(x*x)-x
print 'issquare(x*x):',issquare(x*x)
print 'issquare(x*x+1):',issquare(x*x+1)

math.sqrt() converts the argument to a Python float which has a maximum value around 10^308.
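The float range limit is easy to observe directly. The following Python 3 sketch wraps math.sqrt in a hypothetical helper, float_sqrt_or_none, that reports when an integer is too large to be converted to a float at all.

```python
import math
import sys

def float_sqrt_or_none(n):
    """Return math.sqrt(n), or None if n does not fit in a float."""
    try:
        return math.sqrt(n)
    except OverflowError:
        return None

print(sys.float_info.max)              # largest finite float, about 1.8e308
print(float_sqrt_or_none(10 ** 300))   # fits in a float
print(float_sqrt_or_none(10 ** 400))   # None: too large to convert
```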
You should probably look at using the gmpy2 library. gmpy2 provides very fast multiple-precision arithmetic.
If you want to check for arbitrary powers, the function gmpy2.is_power() will return True if a number is a perfect power. It may be a cube or fifth power, so you will need to check for the power you are interested in.
>>> import gmpy2
>>> gmpy2.is_power(456789**372)
True
You can use gmpy2.isqrt_rem() to check if it is an exact square.
>>> gmpy2.isqrt_rem(9)
(mpz(3), mpz(0))
>>> gmpy2.isqrt_rem(10)
(mpz(3), mpz(1))
You can use gmpy2.iroot_rem() to check for arbitrary powers.
>>> gmpy2.iroot_rem(13**7 + 1, 7)
(mpz(13), mpz(1))

How to print floating point numbers as it is without any truncation in python?

I have some number 0.0000002345E^-60. I want to print the floating point value as it is.
What is the way to do it?
print %f truncates it to 6 digits. Also %n.nf gives fixed numbers. What is the way to print without truncation.

Like this?
>>> print('{:.100f}'.format(0.0000002345E-60))
0.0000000000000000000000000000000000000000000000000000000000000000002344999999999999860343602938602754
As you might notice from the output, it’s not really that clear how you want to do it. Due to the float representation you lose precision and can’t really represent the number precisely. As such it’s not really clear where you want the number to stop displaying.
Also note that the exponential representation is often used to more explicitly show the number of significant digits the number has.
You could also use decimal to not lose the precision due to binary float truncation:
>>> from decimal import Decimal
>>> d = Decimal('0.0000002345E-60')
>>> p = abs(d.as_tuple().exponent)
>>> print(('{:.%df}' % p).format(d))
0.0000000000000000000000000000000000000000000000000000000000000000002345

You can use decimal.Decimal:
>>> from decimal import Decimal
>>> str(Decimal(0.0000002345e-60))
'2.344999999999999860343602938602754401109865640550232148836753621775217856801120686600683401464097113374472942165409862789978024748827516129306833728589548440037314681709534891496105046826414763927459716796875E-67'
This is the actual value of float created by literal 0.0000002345e-60. Its value is a number representable as python float which is closest to actual 0.0000002345 * 10**-60.
float should generally be used for approximate calculations. If you want accurate results you should use something else, such as the Decimal type mentioned above.
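If the goal is to see the stored value without exponent notation, the Decimal conversion can be combined with the "f" format code. A minimal sketch in Python 3, with exact_decimal_string as a made-up helper name:

```python
from decimal import Decimal

def exact_decimal_string(x):
    """Return the exact value stored in float x, in non-exponent notation."""
    # Decimal(x) converts the binary float exactly; "f" suppresses the exponent.
    return format(Decimal(x), "f")

print(exact_decimal_string(0.5))  # 0.5
print(exact_decimal_string(0.1))  # 0.1000000000000000055511151231257827021181583404541015625
print(exact_decimal_string(0.0000002345E-60))
```

The output is long because it shows every digit of the binary value, not just the digits that were typed in the literal.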

If I understand, you want to print a float?
The problem is, you cannot print a float.
You can only print a string representation of a float. So, in short, you cannot print a float, that is your answer.
If you accept that you need to print a string representation of a float, and your question is how specify your preferred format for the string representations of your floats, then judging by the comments you have been very unclear in your question.
If you would like to print the string representations of your floats in exponent notation, then the format specification language allows this:
Use {:g} or {:G} (depending on whether or not you want the E in the output to be capitalized). This gets around the default precision for e and E types, which leads to unwanted trailing 0s in the part before the exponent symbol.
Assuming your value is my_float, "{:G}".format(my_float) would print the output the way that the Python interpreter prints it. You could probably just print the number without any formatting and get the same exact result.
If your goal is to print the string representation of the float with its current precision, in non-exponentiated form, User poke describes a good way to do this by casting the float to a Decimal object.
If, for some reason, you do not want to do this, you can do something like is mentioned in this answer. However, you should set 'max_digits' to sys.float_info.max_10_exp, instead of 14 used in the answer. This requires you to import sys at some point prior in the code.
A full example of this would be:
import math
import sys
def precision_and_scale(x):
    max_digits = sys.float_info.max_10_exp
    int_part = int(abs(x))
    magnitude = 1 if int_part == 0 else int(math.log10(int_part)) + 1
    if magnitude >= max_digits:
        return (magnitude, 0)
    frac_part = abs(x) - int_part
    multiplier = 10 ** (max_digits - magnitude)
    frac_digits = multiplier + int(multiplier * frac_part + 0.5)
    while frac_digits % 10 == 0:
        frac_digits //= 10  # integer division keeps frac_digits an int
    scale = int(math.log10(frac_digits))
    return (magnitude + scale, scale)
f = 0.0000002345E-60
p, s = precision_and_scale(f)
print "{:.{p}f}".format(f, p=p)
But I think the method involving casting to Decimal is probably better, overall.

Prevent Rounding to Zero in Python

I have a program meant to approximate pi using the Chudnovsky Algorithm, but a term in my equation that is very small keeps being rounded to zero.
Here is the algorithm:
import math
from decimal import *
getcontext().prec = 100
pi = Decimal(0.0)
C = Decimal(12/(math.sqrt(640320**3)))
k = 0
x = Decimal(0.0)
result = Decimal(0.0)
sign = 1
while k<10:
    r = Decimal(math.factorial(6*k)/((math.factorial(k)**3)*math.factorial(3*k)))
    s = Decimal((13591409+545140134*k)/((640320**3)**k))
    x += Decimal(sign*r*s)
    sign = sign*(-1)
    k += 1
result = Decimal(C*x)
pi = Decimal(1/result)
print Decimal(pi)
The equations may be clearer without the "decimal" terms.
import math
pi = 0.0
C = 12/(math.sqrt(640320**3))
k = 0
x = 0.0
result = 0.0
sign = 1
while k<10:
    r = math.factorial(6*k)/((math.factorial(k)**3)*math.factorial(3*k))
    s = (13591409+545140134*k)/((640320**3)**k)
    x += sign*r*s
    sign = sign*(-1)
    k += 1
result = C*x
pi = 1/result
print pi
The issue is with the "s" variable. For k>0, it always comes to zero. e.g. at k=1, s should equal about 2.1e-9, but instead it is just zero. Because of this all of my terms after the first =0. How do I get python to calculate the exact value of s instead of rounding it down to 0?

Try:
s = Decimal((13591409+545140134*k)) / Decimal(((640320**3)**k))
The arithmetic you're doing is native python - by allowing the Decimal object to perform your division, you should eliminate your error.
You can do the same, then, when computing r.

A couple of comments.
If you are using Python 2.x, / returns an integer result when both operands are integers. If you want a Decimal result, convert at least one side to Decimal first.
math.sqrt() only returns ~16 digits of precision. Since your value for C will only be accurate to ~16 digits, your final result will only be accurate to ~16 digits.
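Putting both comments together, here is one possible Python 3 sketch that keeps every step in exact integers or in Decimal. It is a restructured variant, not the asker's code: the function name chudnovsky_pi, the 10 guard digits, and the rewriting of the constant as 426880 * sqrt(10005) (which equals 640320**1.5 / 12) are choices made for this illustration.

```python
from decimal import Decimal, localcontext
from math import factorial

def chudnovsky_pi(digits=50, terms=10):
    """Approximate pi with the Chudnovsky series using Decimal arithmetic."""
    with localcontext() as ctx:
        ctx.prec = digits + 10  # a few guard digits
        total = Decimal(0)
        for k in range(terms):
            # Build numerator and denominator as exact Python ints first.
            numerator = factorial(6 * k) * (13591409 + 545140134 * k)
            denominator = (factorial(3 * k) * factorial(k) ** 3
                           * (-640320) ** (3 * k))  # sign alternates with k
            total += Decimal(numerator) / Decimal(denominator)
        # Decimal.sqrt() honours the context precision, unlike math.sqrt().
        constant = Decimal(426880) * Decimal(10005).sqrt()
        return constant / total

print(chudnovsky_pi(50))
```

Each term of the series contributes roughly 14 digits, so 10 terms are more than enough for 50 digits.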

If you're doing maths in Python 2.x, you should probably be putting this line into every module:
from __future__ import division
This changes the meaning of the division operator so that it will return a floating point number if needed to give a (closer to) precise answer. The historical behaviour is for x / y to return an int if both x and y are ints, which usually forces the answer to be rounded down.
Returning a float if necessary is generally regarded as a better way to handle division in a language like Python where duck typing is encouraged, since you can just worry about the value of your numbers rather than getting different behaviour for different types.
In Python 3 this is in fact the default, but since old programs relied on the historical behaviour of the division operator it was felt the change was too backwards-incompatible to be made in Python 2. This is why you have to explicitly turn it on with the __future__ import. I would recommend always adding that import in any module that might be doing any mathematics (or just any module at all, if you can be bothered). You'll almost never be upset that it's there, but not having it there has been the cause of a number of obscure bugs I've had to chase.
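The difference between the two kinds of division can be seen on the k=1 term of the series from the question. This sketch runs under Python 3, where / is already true division and // reproduces the historical integer behaviour; fractions.Fraction is shown as an exact alternative.

```python
from fractions import Fraction

k = 1
numerator = 13591409 + 545140134 * k
denominator = (640320 ** 3) ** k

print(numerator // denominator)          # 0: floor division, what Python 2 did for two ints
print(numerator / denominator)           # about 2.1e-09 with true division
print(Fraction(numerator, denominator))  # exact rational value, no rounding at all
```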

I feel that the problem with 's' is that all terms are integers, thus you are doing integer maths. A very simple workaround, would be to use 3.0 in the denominator. It only takes one float in the calculation to get a float returned.
