Cast float to int in Python results wrong answer [duplicate] - python

I have an algorithm that is calculating:
result = int(14949283383840498/5262*27115)
The correct result should be 77033412951888085, but Python3.8 gives me 77033412951888080
I also have tried the following:
>>> result = 77033412951888085
>>> print(result)
77033412951888085
>>> print(int(result))
77033412951888085
>>> print(float(result))
7.703341295188808e+16
>>> print(int(float(result)))
77033412951888080
It seems the problem occurs when I cast the float to int. What am I missing?
PS: I have found that using result = 14949283383840498//5262*27115 I get the right answer!

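Casting is not the issue. Floating-point arithmetic has limited precision: on typical platforms a Python float is a 64-bit binary double with a 53-bit significand, so it cannot hold every integer above 2**53 exactly. See https://docs.python.org/3/tutorial/floatingpoint.html
You need to either use integer division or use the decimal module, which defaults to 28 significant digits of precision.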
Using integer division
result = 14949283383840498 // 5262 * 27115
print(result)
Output:
77033412951888085
Using decimal module
from decimal import Decimal
result = Decimal(14949283383840498) / 5262 * 27115
print(result)
Output:
77033412951888085
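To see why integer arithmetic helps here, a minimal sketch (the variable names a, b, c are illustrative, not from the original post) compares the float path with a few exact alternatives. Note that // floors the quotient, so it matches the true result only because 5262 happens to divide 14949283383840498 evenly; multiplying before dividing, or using fractions.Fraction, avoids relying on that.

```python
from fractions import Fraction

a, b, c = 14949283383840498, 5262, 27115

# Float path: a / b * c is evaluated in 64-bit binary floating point.
float_result = int(a / b * c)

# Exact paths: Python ints have arbitrary precision.
floor_first = a // b * c       # exact only when b divides a
multiply_first = a * c // b    # a single floor at the very end
as_fraction = Fraction(a, b) * c

print(float_result)    # 77033412951888080
print(floor_first)     # 77033412951888085
print(multiply_first)  # 77033412951888085
print(as_fraction)     # 77033412951888085
```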

It is a precision limitation:
result = 14949283383840498/5262*27115
result
7.703341295188808e+16
In this case, result is a float.
You can see that the float carries only about 15 to 16 significant decimal digits, fewer than the 17 digits of the exact answer.
When you convert it to int, the last nonzero digit is 8, which matches what the float shows when printed.
Try the following:
import sys
print(sys.float_info.dig)
15
dig is the maximum number of decimal digits that can be faithfully represented in a float.
A very good explanation regarding this issue is available here.
But there are ways to do better with Python. From the Python documentation:
For use cases which require exact decimal representation, try using
the decimal module which implements decimal arithmetic suitable for
accounting applications and high-precision applications.
Another form of exact arithmetic is supported by the fractions module
which implements arithmetic based on rational numbers (so the numbers
like 1/3 can be represented exactly).
If you are a heavy user of floating point operations you should take a
look at the NumPy package and many other packages for mathematical and
statistical operations supplied by the SciPy project
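As a rough illustration of that digit limit (a sketch assuming the usual IEEE 754 double behind Python's float), the snippet below shows where whole numbers stop being exactly representable and reproduces the value from the question.

```python
import sys

print(sys.float_info.mant_dig)  # bits in the significand, 53 for an IEEE 754 double
print(sys.float_info.dig)       # decimal digits that always survive a round trip

# Integers up to 2**53 are exact; beyond that, neighbouring floats are more than 1 apart.
limit = 2 ** 53
print(float(limit) == limit)          # True
print(float(limit + 1) == limit + 1)  # False

exact = 77033412951888085
print(int(float(exact)))  # 77033412951888080
```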

Related

Having problems with Decimal library of python [duplicate]

So I was trying to minimize floating point errors when doing arithmetic in Python, and I stumbled upon Python's decimal module. It worked great at first, up until this operation.
from decimal import *
getcontext().prec = 100
test_x = Decimal(str(3.25)).quantize(Decimal('0.000001'), rounding=ROUND_HALF_UP)
test_y = Decimal(str(2196.646351)).quantize(Decimal('0.000001'), rounding=ROUND_HALF_UP)
print((test_y)*(test_x**Decimal('2')))
The above code outputs 23202.077082437500000000 instead of 23202.07708, which is what a conventional calculator shows. How can I get output like the calculator's, rounded to 6 decimal places? Also, are there better ways to do arithmetic calculations in Python?
I have tried Python's round() function, but it is off limits for me because I am dealing with very large numbers that reach the maximum length of numbers that round() supports.
Adding further context to the code: I can't change the value of getcontext().prec or the .quantize(Decimal('0.000001')) call because I am dealing with numbers like 109796940503037.6545639765, and I get errors if I don't set getcontext().prec to a high number.
I can't change getcontext().prec to, let's say, 6 because it always gives the error:
InvalidOperation: [<class 'decimal.InvalidOperation'>]
If you do Decimal(str(123312.12321221332)), the literal is first stored as a float, then converted to a string and passed to Decimal, so you can lose precision during that conversion.
Do Decimal('123312.12321221332') instead.
Also keep in mind that:
Unlike hardware based binary floating point, the decimal module has a user alterable precision (defaulting to 28 places) which can be as large as needed for a given problem
https://docs.python.org/3/library/decimal.html
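One way to get the six-decimal output the question asks for, sketched under the assumption that only the final result needs rounding, is to keep the high context precision for the arithmetic and call quantize once at the end. The sketch also shows why a low precision such as 6 fails: quantize signals InvalidOperation when the rounded coefficient would need more digits than the context allows. The names SIX_PLACES, raw, rounded and failed are illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP, InvalidOperation, localcontext

SIX_PLACES = Decimal("0.000001")

with localcontext() as ctx:
    ctx.prec = 100  # plenty of room for the intermediate product
    test_x = Decimal("3.25")
    test_y = Decimal("2196.646351")
    raw = test_y * test_x ** 2
    # Round once, at the end, to six digits after the decimal point.
    rounded = raw.quantize(SIX_PLACES, rounding=ROUND_HALF_UP)

print(raw)      # 23202.0770824375
print(rounded)  # 23202.077082

# With prec = 6 the quantized value would need 21 digits, so quantize fails.
with localcontext() as ctx:
    ctx.prec = 6
    try:
        Decimal("109796940503037.6545639765").quantize(SIX_PLACES)
        failed = False
    except InvalidOperation:
        failed = True

print(failed)  # True
```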

Mitigating Floating Point Approximation Issues with Numpy

My code is quite simple, and only 1 line is causing an issue:
np.tan(np.radians(rotation))
Instead of my expected output for rotation = 45 as 1, I get 0.9999999999999999. I understand that 0 and a ton of 9's is 1. In my use case, however, it seems like the type of thing that will definitely build up over iterations.
What is causing the floating point error: np.tan or np.radians, and how do I get the problem function to come out correctly regardless of floating point inaccuracies?
Edit:
I should clarify that I am familiar with floating point inaccuracies. My concern is that as that number gets multiplied, added, and compared, the 1e-6 error suddenly becomes a tangible issue. I've normally been able to safely ignore floating point issues, but now I am far more concerned about the build up of error. I would like to reduce the possibility of such an error.
Edit 2:
My current solution is to just round to 8 decimal places because that's most likely enough. It's sort of a temporary solution because I'd much prefer a way to get around the IEEE decimal representations.
What is causing the floating point error: np.tan or np.radians, and how do I get the problem function to come out correctly regardless of floating point inaccuracies?
Both functions incur rounding error, since in neither case is the exact result representable in floating point.
My current solution is to just round to 8 decimal places because that's most likely enough. It's sort of a temporary solution because I'd much prefer a way to get around the IEEE decimal representations.
The problem has nothing to do with decimal representation, and this will give worse results outside of the exact case you mention above, e.g.
>>> np.tan(np.radians(60))
1.7320508075688767
>>> round(np.tan(np.radians(60)), 8)
1.73205081
>>> np.sqrt(3) # sqrt is correctly rounded, so this is the closest float to the true result
1.7320508075688772
If you absolutely need higher accuracy than the 15 decimal digits you would get from code above, then you can use an arbitrary precision library like gmpy2.
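If the real concern is comparing results that carry rounding error, a common alternative to rounding is a tolerance-based comparison. The sketch below uses the standard library's math module rather than NumPy (np.isclose plays a similar role for arrays); the tolerance value is an illustrative assumption, not a recommendation from the answer.

```python
import math

slope = math.tan(math.radians(45))

# Exact equality is fragile; compare within a relative tolerance instead.
print(slope == 1.0)                             # may be False, depending on the math library
print(math.isclose(slope, 1.0, rel_tol=1e-12))  # True

# Rounding to 8 places discards good digits in the general case:
# the rounded value is further from sqrt(3) than the unrounded one.
tan60 = math.tan(math.radians(60))
error_rounded = abs(round(tan60, 8) - math.sqrt(3))
error_plain = abs(tan60 - math.sqrt(3))
print(error_rounded > error_plain)  # True
```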
Take a look here: https://docs.scipy.org/doc/numpy/user/basics.types.html .
Standard dtypes in numpy do not go beyond 64 bits precision. From the docs:
Be warned that even if np.longdouble offers more precision than python
float, it is easy to lose that extra precision, since python often
forces values to pass through float. For example, the % formatting
operator requires its arguments to be converted to standard python
types, and it is therefore impossible to preserve extended precision
even if many decimal places are requested. It can be useful to test
your code with the value 1 + np.finfo(np.longdouble).eps.
You can increase precision with np.longdouble, but this is platform dependent.
In spyder (windows):
np.finfo(np.longdouble).eps #same precision as float
>> 2.220446049250313e-16
np.finfo(np.longdouble).precision
>> 15
In google colab:
np.finfo(np.longdouble).eps #larger precision
>> 1.084202172485504434e-19
np.finfo(np.longdouble).precision
>> 18
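# np.float and np.longfloat are no longer available in recent NumPy releases; np.float64 and np.longdouble are the equivalents
print(np.tan(np.radians(45, dtype=np.float64), dtype=np.float64) - 1)
print(np.tan(np.radians(45, dtype=np.longdouble), dtype=np.longdouble) - 1)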
>> -1.1102230246251565e-16
0.0

Decimal in Python

I am using Python for programming and then Gurobi for solving my optimization problems. As part of my code I read the data from a text file (called "Feed2"), then do some calculations on it.
with open('Feed2.txt', 'r') as Fee:
    for i in range(C):
        Feed = Fee.readline()
        for s in L11:
            A[i,s] = float(Feed)
        for s in L12:
            A[i,s] = float(Feed)*1.28
        for s in L13:
            A[i,s] = float(Feed)*0.95
print(A)
The result shows that some of the numbers have many digits after the decimal point (such as 106.51209999999999 or 1029.4144000000001), which creates a problem for Gurobi when reading them, and those digits are not really useful to me. So I want to set the number of digits after the decimal point to 5 for my entire program. I followed the method explained in https://docs.python.org/3/library/decimal.html (code below), but nothing changed.
from decimal import *
getcontext().prec = 5
The documentation for the decimal module offers an explanation:
Unlike hardware based binary floating point, the decimal module has a user alterable precision (defaulting to 28 places) which can be as large as needed for a given problem.
When you did:
from decimal import *
getcontext().prec = 5
You only changed the precision used with Decimal objects from the decimal module. You didn't change the precision of Python's built-in floating point numbers.
As said in the comments, the behavior you are experiencing is not new. It's simply a side effect of the way floating point numbers are stored in memory. If you really need the numbers to keep a specific precision, use the decimal.Decimal class, e.g.:
>>> from decimal import Decimal
>>> Decimal.from_float(0.1)
Decimal('0.1000000000000000055511151231257827021181583404541015625')
>>> Decimal('0.1')
Decimal('0.1')
>>> Decimal('0.1') / Decimal('0.5')
Decimal('0.2')
If you simply need to round the number to a specific precision for display, use str.format in the form:
'{:<minimum total width>.<number of digits after decimal point>f}'.format(value)
Or with old style formatting:
'%<minimum total width>.<number of digits after decimal point>f' % (value)
Recommended reading: What Every Computer Scientist Should Know About Floating-Point Arithmetic.
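To make the distinction concrete, here is a small sketch using one of the values from the question. It assumes the goal is five digits after the decimal point. Note that the context precision counts significant digits, not decimal places, and that it never touches built-in floats.

```python
from decimal import Decimal, localcontext

value = 106.51209999999999  # a float, like the ones produced by float(Feed) * factor

with localcontext() as ctx:
    ctx.prec = 5
    float_product = value * 1                         # floats ignore the decimal context
    decimal_value = +Decimal("106.51209999999999")    # unary plus applies the context

print(float_product)  # 106.51209999999999
print(decimal_value)  # 106.51 (five significant digits, not five decimal places)

# Options for floats: round the stored value, or format only when printing.
rounded = round(value, 5)
text = format(value, ".5f")
print(rounded)  # 106.5121
print(text)     # 106.51210
```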
If you just need to print a number with, for example, only two decimals:
print("%.2f" % (A,))
or the newer
print("{0:.2f}".format(A))

Handling very large numbers

I need to write a simple program that calculates a mathematical formula.
The only problem here is that one of the variables can take the value 10^100.
Because of this I can not write this program in C++/C (I can't use external libraries like gmp).
A few hours ago I read that Python is capable of calculating such values.
My question is:
Why
print("%.10f"%(10.25**100))
is returning the number "118137163510621843218803309161687290343217035128100169109374848108012122824436799009169146127891562496.0000000000"
instead of
"118137163510621850716311252946961817841741635398513936935237985161753371506358048089333490072379307296.453937046171461"?
By default, Python uses a fixed precision floating-point data type to represent fractional numbers (just like double in C). You can work with precise rational numbers, though:
>>> from fractions import Fraction
>>> Fraction("10.25")
Fraction(41, 4)
>>> x = Fraction("10.25")
>>> x**100
Fraction(189839102486063226543090986563273122284619337618944664609359292215966165735102377674211649585188827411673346619890309129617784863285653302296666895356073140724001, 1606938044258990275541962092341162602522202993782792835301376)
You can also use the decimal module if you want arbitrary precision decimals (only numbers that are representable as finite decimals are supported, though):
>>> from decimal import *
>>> getcontext().prec = 150
>>> Decimal("10.25")**100
Decimal('118137163510621850716311252946961817841741635398513936935237985161753371506358048089333490072379307296.453937046171460995169093650913476028229144848989')
Python is capable of handling arbitrarily large integers, but not floating point values. They can get pretty large, but as you noticed, you lose precision in the low digits.
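The difference between Python's integers and floats at this scale can be sketched as follows. The value 10**100 is the one from the question; the +1 probe and the choice of 15 fractional digits are only illustrations.

```python
from fractions import Fraction

big = 10 ** 100

# Integers are exact at any size.
int_diff = (big + 1) - big
print(int_diff)    # 1

# A float keeps roughly 16 significant digits, so the +1 is lost.
float_diff = (float(big) + 1) - float(big)
print(float_diff)  # 0.0

# Fractions stay exact too: 10.25 is 41/4, so the power is 41**100 / 4**100.
x = Fraction("10.25") ** 100
print(x.numerator == 41 ** 100, x.denominator == 4 ** 100)  # True True

# Integer arithmetic can then produce as many exact decimal digits as needed
# (truncated, not rounded, after the 15th fractional digit).
scaled = x.numerator * 10 ** 15 // x.denominator
whole, frac = divmod(scaled, 10 ** 15)
print(f"{whole}.{frac:015d}")
```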

Rounding ** 0.5 and math.sqrt

In Python, are either
n**0.5 # or
math.sqrt(n)
recognized when a number is a perfect square? Specifically, should I worry that when I use
int(n**0.5) # instead of
int(n**0.5 + 0.000000001)
I might accidentally end up with the number one less than the actual square root due to precision error?
As several answers have suggested integer arithmetic, I'll recommend the gmpy2 library. It provides functions for checking if a number is a perfect power, calculating integer square roots, and integer square root with remainder.
>>> import gmpy2
>>> gmpy2.is_power(9)
True
>>> gmpy2.is_power(10)
False
>>> gmpy2.isqrt(10)
mpz(3)
>>> gmpy2.isqrt_rem(10)
(mpz(3), mpz(1))
Disclaimer: I maintain gmpy2.
Yes, you should worry:
In [11]: int((100000000000000000000000000000000000**2) ** 0.5)
Out[11]: 99999999999999996863366107917975552L
In [12]: int(math.sqrt(100000000000000000000000000000000000**2))
Out[12]: 99999999999999996863366107917975552L
Obviously, adding the 0.000000001 doesn't help here either...
As @DSM points out, you can use the decimal library:
In [21]: from decimal import Decimal
In [22]: x = Decimal('100000000000000000000000000000000000')
In [23]: (x ** 2).sqrt() == x
Out[23]: True
For numbers over 10**999999999, provided you keep a check on the precision (configurable), it'll throw an error rather than an incorrect answer...
Both **0.5 and math.sqrt() perform the calculation using floating point arithmetic. The input is converted to float before the square root is calculated.
Do these calculations recognize when the input value is a perfect square?
No they do not. Floating arithmetic has no concept of perfect squares.
Large integers may not be representable when the number has more significant digits than are available in the floating-point mantissa. It's easy to see, therefore, that for non-representable input values, n**0.5 may be inaccurate. And your proposed fix of adding a small value will not, in general, fix the problem.
If your input is an integer then you should consider performing your calculation using integer arithmetic. That ultimately is the right way to deal with this.
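For integer inputs, one integer-only approach is math.isqrt from the standard library (available from Python 3.8). The helper name is_perfect_square below is hypothetical, chosen for illustration.

```python
import math

def is_perfect_square(n: int) -> bool:
    """Return True if n is the square of an integer, using only integer arithmetic."""
    if n < 0:
        return False
    root = math.isqrt(n)  # floor of the exact square root, no floats involved
    return root * root == n

big = 10 ** 35

print(math.isqrt(big ** 2) == big)     # True
print(int((big ** 2) ** 0.5) == big)   # False: 10**35 is not representable as a float
print(is_perfect_square(big ** 2))     # True
print(is_perfect_square(big ** 2 + 1)) # False
```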
You can use round(number, ndigits) before converting to an int; note that the second argument is a number of decimal places, not significant figures, and that int() truncates toward zero rather than rounding.
In any case, since python uses floating point arithmetic, all the pitfalls apply. See:
http://docs.python.org/2/tutorial/floatingpoint.html
Perfect-square values will have no fractional components, so your main worry would be very large values, and for such values a difference of 1 or 2 being significant means you're going to want a specific numerical library that supports such high precision (as DSM mentions, the Decimal library, standard since Python 2.4, should be able to do what you want as it supports arbitrary precision).
http://docs.python.org/library/decimal.html
sqrt is one of the easier math library functions to implement, and any math library of reasonable quality will implement it with faithful rounding (sub-ULP accuracy). If the input is a perfect square, its square root is representable (in a reasonable floating-point format). In this case, faithful rounding guarantees the result is exact.
This addresses only the value actually passed to sqrt. Whether a number can be converted without error from another format to the floating-point input for sqrt is a separate issue.
