Please look at the below Python code that I've entered into a Python 3.6 interpreter:
>>> 0.00225 * 100.0
0.22499999999999998
>>> '{:.2f}'.format(0.00225 * 100.0)
'0.22'
>>> '{:.2f}'.format(0.225)
'0.23'
>>> '{:.2f}'.format(round(0.00225 * 100.0, 10))
'0.23'
Hopefully you can immediately understand why I'm frustrated. I am attempting to display value * 100.0 on my GUI, storing the full precision behind a cell but only displaying 2 decimal points (or whatever the user's precision setting is). The GUI is similar to an Excel spreadsheet.
I'd prefer not to lose the precision of something like 0.22222444937645 by rounding it to 10 decimal places, but I also don't want a value such as 0.00225 * 100.0 displaying as 0.22.
I'm interested in hearing about a standard way of approaching a situation like this or a remedy for my specific situation. Thanks ahead of time for any help.
Consider using the Decimal module, which "provides support for fast correctly-rounded decimal floating point arithmetic." The primary advantages of Decimal relevant to your use case are:
Decimal numbers can be represented exactly. In contrast, numbers like 1.1 and 2.2 do not have exact representations in binary floating point. End users typically would not expect 1.1 + 2.2 to display as 3.3000000000000003 as it does with binary floating point.
The exactness carries over into arithmetic. In decimal floating point, 0.1 + 0.1 + 0.1 - 0.3 is exactly equal to zero. In binary floating point, the result is 5.5511151231257827e-017. While near to zero, the differences prevent reliable equality testing and differences can accumulate. For this reason, decimal is preferred in accounting applications which have strict equality invariants.
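A quick illustration of that last point in the interpreter:
>>> from decimal import Decimal
>>> Decimal('0.1') + Decimal('0.1') + Decimal('0.1') - Decimal('0.3')
Decimal('0.0')
>>> 0.1 + 0.1 + 0.1 - 0.3
5.551115123125783e-17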
Based on the information you've provided in the question, I cannot say how much of an overhaul migrating to Decimal would require. However, if you're creating a spreadsheet-like application and always want to preserve maximal precision, then you will probably want to refactor to use Decimal sooner or later to avoid unexpected numbers in your user-facing GUI.
To get the behavior you desire, you may need to change the rounding mode (which defaults to ROUND_HALF_EVEN) for Decimal instances.
from decimal import Decimal, getcontext, ROUND_HALF_UP
getcontext().rounding = ROUND_HALF_UP
n = round(Decimal('0.00225') * Decimal('100'), 2)
print(n)  # 0.23
m = round(Decimal('0.00225') * 100, 2)
print(m)  # 0.23
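For the display side of your question, one possible approach (a sketch only, assuming the cell keeps storing a plain float and only the formatting layer changes; the display helper below is hypothetical, not something from the decimal docs) is to convert through the float's shortest decimal form at display time:

from decimal import Decimal, ROUND_HALF_UP

def display(value, places=2):
    """Format value * 100 for the GUI with half-up rounding, without
    touching the full-precision float stored behind the cell."""
    d = Decimal(repr(value)) * 100      # repr() gives the shortest decimal form of the float
    exp = Decimal(1).scaleb(-places)    # e.g. Decimal('0.01') when places == 2
    return str(d.quantize(exp, rounding=ROUND_HALF_UP))

print(display(0.00225))           # 0.23
print(display(0.22222444937645))  # 22.22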
perhaps use decimal? docs.python.org/2/library/decimal.html
from decimal import *
getcontext().prec = 2
n = Decimal.from_float(0.00225)
m = n * 100
print(n, m)
print(m.quantize(Decimal('.01'), rounding=ROUND_DOWN))
print(m.quantize(Decimal('.01'), rounding=ROUND_UP))
I'm working with floating point numbers. If I do:
import numpy as np
np.round(100.045, 2)
I get:
Out[15]: 100.04
Obviously, this should be 100.05. I know about the existence of IEEE 754 and that the way that floating point numbers are stored is the cause of this rounding error.
My question is: how can I avoid this error?
You are partly right: often the cause of this "incorrect rounding" is the way floating point numbers are stored. Some decimal literals can be represented exactly as floating point numbers while others cannot.
>>> a = 100.045
>>> a.as_integer_ratio() # not exact
(7040041011254395, 70368744177664)
>>> a = 0.25
>>> a.as_integer_ratio() # exact
(1, 4)
It's also important to know that there is no way you can restore the literal you used (100.045) from the resulting floating point number. So the only thing you can do is to use an arbitrary precision data type instead of the literal. For example you could use Fraction or Decimal (just to mention two built-in types).
I mentioned that you cannot restore the literal once it is parsed as float - so you have to input it as string or something else that represents the number exactly and is supported by these data types:
>>> from fractions import Fraction
>>> f = Fraction(100045, 1000)
>>> f
Fraction(20009, 200)
>>> f = Fraction("100.045")
>>> f
Fraction(20009, 200)
>>> from decimal import Decimal
>>> Decimal("100.045")
Decimal('100.045')
However, these don't work well with NumPy, and even if you get them to work at all, they will almost certainly be very slow compared to basic floating point operations.
>>> import numpy as np
>>> a = np.array([Decimal("100.045") for _ in range(1000)])
>>> np.round(a)
AttributeError: 'decimal.Decimal' object has no attribute 'rint'
In the beginning I said that you are only partly right. There is another twist!
You mentioned that rounding 100.045 will obviously give 100.05. But that's not obvious at all; in this context it is even wrong (it would be true for everyday hand calculation, but not for floating point math in programming). In many programming languages a "half" value (where the digit after the place you're rounding to is 5) isn't always rounded up. For example, Python (and NumPy) use a "round half to even" approach because it is less biased: 0.5 is rounded to 0 while 1.5 is rounded to 2.
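A quick check in the interpreter:
>>> round(0.5), round(1.5), round(2.5)
(0, 2, 2)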
So even if 100.045 could be represented exactly as float - it would still round to 100.04 because of that rounding rule!
>>> round(Fraction("100.045"), 2)
Fraction(2501, 25)
>>> 2501 / 25
100.04
>>> d = Decimal("100.045")
>>> round(d, 2)
Decimal('100.04')
This is even mentioned in the NumPy docs for numpy.around:
Notes
For values exactly halfway between rounded decimal values, NumPy rounds to the nearest even value. Thus 1.5 and 2.5 round to 2.0, -0.5 and 0.5 round to 0.0, etc. Results may also be surprising due to the inexact representation of decimal fractions in the IEEE floating point standard [R1011] and errors introduced when scaling by powers of ten.
(Emphasis mine.)
The only numeric type in Python (at least that I know of) that allows setting the rounding rule manually is Decimal, via ROUND_HALF_UP:
>>> from decimal import Decimal, getcontext, ROUND_HALF_UP
>>> dc = getcontext()
>>> dc.rounding = ROUND_HALF_UP
>>> d = Decimal("100.045")
>>> round(d, 2)
Decimal('100.05')
Summary
So to avoid the "error" you have to:
prevent Python from parsing the value as a floating point number, and
use a data type that can represent it exactly (such as Fraction or Decimal);
then you have to manually override the default rounding mode so that halves are rounded up
(and abandon NumPy, because it doesn't have arbitrary precision data types).
Basically there is no general solution for this problem IMO, unless you have a general rule for all the different cases (see Floating Point Arithmetic: Issues and Limitations). However, in this case you can round the decimal part separately:
In [24]: dec, integ = np.modf(100.045)
In [25]: integ + np.round(dec, 2)
Out[25]: 100.05
The reason for this behavior is not that separating the integer from the decimal part changes round()'s logic. It's that np.modf gives you the fractional part as its own float, which is itself only a rounded representation of the true fractional part, and in this case it happens to come out slightly above 0.045.
In this case here is what dec is:
In [30]: dec
Out[30]: 0.045000000000001705
And you can check that round gives the same (rounded-down) result for a plain 0.045:
In [31]: round(0.045, 2)
Out[31]: 0.04
Now if you try another number like 100.0333, the fractional part comes out slightly smaller instead; as I mentioned, the result you get depends on how these representation errors fall and on your rounding policy.
In [37]: dec, i = np.modf(100.0333)
In [38]: dec
Out[38]: 0.033299999999997
There are also modules like fractions and decimal that provide support for correctly-rounded decimal floating point and rational arithmetic, which you can use in situations like this.
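For instance, Fraction arithmetic is exact (a small illustration):
>>> from fractions import Fraction
>>> Fraction(1, 10) + Fraction(1, 10) + Fraction(1, 10) == Fraction(3, 10)
True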
This is not a bug, but a feature )))
you can simply use this trick:
import math

def myround(val):
    """Fix Python's round for exact halves."""
    d, v = math.modf(val)      # d is the fractional part, v the integer part
    if d == 0.5:
        val += 0.000000001     # nudge exact halves upward
    return round(val)
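For illustration (not part of the original snippet), here is how the hack compares to the built-in round on exact halves:
>>> myround(0.5), round(0.5)
(1, 0)
>>> myround(2.5), round(2.5)
(3, 2)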
I was investigating different rounding methods using Python's built-in round and some external libraries such as SymPy, and while doing so I stumbled upon some cases that I need help understanding.
Ex-1:
print(round(1.0065,3))
output:
1.006
In the first case, using the Python built-in round function, the output was 1.006 instead of 1.007. I can understand that this is not a mistake, as Python rounds to the nearest even digit, which is known as banker's rounding.
And this is why I started searching from the beginning for another way to control the rounding behaviour. With a quick search I found the decimal.Decimal module, which can easily handle decimal values and round them using quantize(), as in this example:
from decimal import Decimal, getcontext, ROUND_HALF_UP
context = getcontext()
context.rounding = ROUND_HALF_UP
print(Decimal('1.0065').quantize(Decimal('.001')))
output: 1.007
This is a very good solution, but the only problem is that it is not easy to hard-code in long math expressions: I need to convert every number to a string, and then pass the precision to quantize() in the form '0.001' instead of writing 3 directly as with the built-in round.
While searching for another solution I found that SymPy, which I already use a lot in my scripts, offers some very powerful functions that might help but when I tried it the output was not as I expected.
Ex-1 using SymPy sympify():
print(sympify(1.0065).evalf(3))
output: 1.01
Ex-2 using SymPy N (normalize):
print(N(1.0065,3))
output: 1.01
At first the output looked a little weird, but after investigating I realized that N and sympify are in fact rounding correctly, just to significant figures rather than to decimal places.
And here the questions come:
Just as I can use getcontext().rounding = ROUND_HALF_UP with Decimal objects to change the rounding behaviour, is there a way to change the rounding behaviour of N and sympify to decimal places instead of significant figures?
Instead of re-implementing decimal rounding in SymPy, perhaps use decimal to do the rounding, but hide the calculation in a utility function:
import sympy as sym
import decimal
from decimal import Decimal as D

def dround(d, ndigits, rounding=decimal.ROUND_HALF_UP):
    result = D(str(d)).quantize(D('0.1') ** ndigits, rounding=rounding)
    # result = sym.sympify(result)  # if you want a SymPy Float
    return result

for x in [0.0065, 1.0065, 10.0065, 100.0065]:
    print(dround(x, 3))
prints
0.007
1.007
10.007
100.007
The n of evalf gives the first n significant digits of x (measured from the left). If you use x.round(n), it will round x to the nth digit relative to the decimal point; n can be positive (right of the decimal point) or negative (left of it).
>>> from sympy import S
>>> for x in '0.0065, 1.0065, 10.0065, 100.0065'.split(', '):
...     print(S(x).round(3))
0.006
1.006
10.007
100.007
>>> int(S(12345).round(-2))
12300
First of all, N and evalf are essentially the same thing; N(x, n) amounts to sympify(x).evalf(n). In your case, since x is a Python float, it's easier to use N because it sympifies the input.
To get three digits after the decimal point, use N(x, 3 + log(x, 10) + 1). The adjustment log(x, 10) + 1 is effectively 0 when x is between 0.1 and 1; in that case the number of significant digits is the same as the number of digits after the decimal point. If x is larger, we get more significant digits.
Example:
from sympy import N, log

for x in [0.0065, 1.0065, 10.0065, 100.0065]:
    print(N(x, 3 + log(x, 10) + 1))
prints
0.006
1.007
10.007
100.007
The transition from 6 to 7 is curious, but not entirely surprising. These numbers are not represented exactly in the binary system, so the truncation to the nearest double-precision float may be a factor here. I've made a few additional observations on this effect on my blog.
Interesting fact on Python 2.7.6 running on macosx:
You have a very small number like:
0.000000000000000000001
You can represent it as:
>>> 0.1 / (10 ** 20)
1.0000000000000001e-21
But you can see the floating point error at the end. What we really have is something like this:
0.0000000000000000000010000000000000001
So we got an error in the number, but that's ok. The problem is the following:
As expected:
>>> 0.1 / (10 ** 20) == 0
False
But, wait, what's this?
>>> 0.1 / (10 ** 20) + 1 == 1
True
>>> repr(0.1 / (10 ** 20) + 1)
'1.0'
It looks like Python is using another data type to represent my number, as this only happens around the 16th significant digit and beyond. Also, why did Python decide to automatically turn my number into 0 in the addition? Shouldn't I be dealing with a number that has a very small fractional part plus a floating point error?
I know this problem falls under the floating point errors umbrella, and the usual advice is not to trust floating point for this kind of calculation, but I would like to understand more about what's happening under the hood.
Consider what happens when you attempt to add 7.00 and .001 and are only permitted to use three digits for the answer. 7.001 is not allowed. What do you do?
Of the values you can return, such as 6.99, 7.00, and 7.01, the closest to the correct answer is 7.00. So, when asked to add 7.00 and .001, you return 7.00.
The computer has the same issue when you ask it to add a number near 1e-21 and 1. It has only 53 bits to use for the fraction portion of the floating-point value, and 1e-21 is almost 70 bits below 1. So, it rounds the correct result to the closest value it can, 1, and returns that.
Floating point numbers work like scientific notation. 1.0000000000000001e-21 fits within the 53 bits of significand/mantissa that a 64-bit float allows. Adding 1 to that, many orders of magnitude larger than the tiny fraction, causes the minor detail to be discarded and exactly 1 to be stored instead.
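A quick interpreter check of the 53-bit claim:
>>> import sys
>>> sys.float_info.mant_dig
53
>>> 1.0 + 2**-53 == 1.0   # exactly half an ulp of 1.0: the tie goes to the even value, 1.0
True
>>> 1.0 + 2**-52 == 1.0   # one full ulp survives
False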
The reason I'm asking this is that there is a validation in OpenERP that is driving me crazy:
>>> round(1.2 / 0.01) * 0.01
1.2
>>> round(12.2 / 0.01) * 0.01
12.200000000000001
>>> round(122.2 / 0.01) * 0.01
122.2
>>> round(1222.2 / 0.01) * 0.01
1222.2
As you can see, the second round is returning an odd value.
Can someone explain to me why is this happening?
This has in fact nothing to do with round; you can witness the exact same problem if you just do 1220 * 0.01:
>>> 1220*0.01
12.200000000000001
What you see here is a standard floating point issue.
You might want to read what Wikipedia has to say about floating point accuracy problems:
The fact that floating-point numbers cannot precisely represent all real numbers, and that floating-point operations cannot precisely represent true arithmetic operations, leads to many surprising situations. This is related to the finite precision with which computers generally represent numbers.
Also see:
Numerical analysis
Numerical stability
A simple example of numerical instability with floating point:
The numbers are finite. Let's say a given computer or language keeps 4 digits after the decimal point.
Then 0.0001 multiplied by 0.0001 gives something smaller than 0.0001, so the result cannot be stored at all!
In this case, if you calculate (0.0001 x 0.0001) / 0.0001, which should be 0.0001, this simple computer will fail to be accurate because it multiplies first and only afterwards divides. In JavaScript, dividing by fractions leads to similar inaccuracies.
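A concrete Python analogue of the same instability, using real double-precision underflow instead of the hypothetical 4-digit format:
>>> (1e-200 * 1e-200) / 1e-200   # the product underflows to 0.0 before the division
0.0
>>> 1e-200 * (1e-200 / 1e-200)   # divide first and the small value survives
1e-200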
The float type that you are using stores binary floating point numbers. Not every decimal number is exactly representable as a float. In particular there is no exact representation of 1.2 or 0.01, so the actual number stored in the computer will differ very slightly from the value written in the source code. This representation error can cause calculations to give slightly different results from the exact mathematical result.
It is important to be aware of the possibility of small errors whenever you use floating point arithmetic, and write your code to work well even when the values calculated are not exactly correct. For example, you should consider rounding values to a certain number of decimal places when displaying them to the user.
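For example, you could keep the computed value at full precision and round it only for display (a small sketch, re-using the problematic value from the question):
>>> value = round(12.2 / 0.01) * 0.01
>>> value
12.200000000000001
>>> '{:.2f}'.format(value)
'12.20'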
You could also consider using the decimal type, which stores decimal floating point numbers. If you use decimal then 1.2 can be stored exactly. However, working with decimal will reduce the performance of your code. You should only use it if exact representation of decimal numbers is important. You should also be aware that decimal does not mean that you'll never have any problems. For example, 1/3 = 0.33333... has no exact representation as a decimal.
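A brief illustration of both points:
>>> from decimal import Decimal
>>> Decimal('1.2')              # stored exactly, no binary rounding
Decimal('1.2')
>>> Decimal(1) / Decimal(3)     # but 1/3 still has to be rounded, just in base 10
Decimal('0.3333333333333333333333333333')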
There is a loss of accuracy from the division due to the way floating point numbers are stored, so you see that this identity doesn't hold
>>> 12.2 / 0.01 * 0.01 == 12.2
False
bArmageddon has provided a bunch of links that you should read, but I believe the takeaway message is: don't expect floats to give exact results unless you fully understand the limits of the representation.
In particular, don't use floats to represent amounts of money! That is a pretty common mistake.
Python also has the decimal module, which may be useful to you
Others have answered your question and mentioned that many numbers don't have an exact binary fractional representation. If you are accustomed to working only with decimal numbers, it can seem deeply weird that a nice, "round" number like 0.01 could be a non-terminating number in some other base. In the spirit of "seeing is believing," here's a little Python program that will print out a binary representation of any number to any desired number of digits.
from decimal import Decimal

n = Decimal("0.01")  # the number to print the binary equivalent of
m = 1000             # maximum number of digits to print

p = -1
r = []
w = int(n)
n = abs(n) - abs(w)
while n and -p < m:
    s = Decimal(2) ** p
    if n >= s:
        r.append("1")
        n -= s
    else:
        r.append("0")
    p -= 1

print("%s%s.%s%s" % ("-" if w < 0 else "", bin(abs(w))[2:],
                     "".join(r), "..." if n else ""))
The built-in Python str() function outputs some weird results when passing in floats with many decimals. This is what happens:
>>> str(19.9999999999999999)
'20.0'
I'm expecting to get:
'19.9999999999999999'
Does anyone know why? and maybe workaround it?
Thanks!
It's not str() that rounds, it's the fact that you're using floats in the first place. Float types are fast, but have limited precision; in other words, they are imprecise by design. This applies to all programming languages. For more details on float quirks, please read "What Every Programmer Should Know About Floating-Point Arithmetic"
If you want to store and operate on precise numbers, use the decimal module:
>>> from decimal import Decimal
>>> str(Decimal('19.9999999999999999'))
'19.9999999999999999'
A Python float is a C double, which is 64 bits. One of those bits is allocated for the sign, some are allocated for the exponent, and the rest are allocated for the mantissa. You can't fit every decimal number to an unlimited number of digits into 64 bits, so floating point numbers rely heavily on rounding.
If you try str(19.998), it will give you something at least close to 19.998 because 64 bits have enough precision for that, but something like 19.9999999999999999 is too precise to represent in 64 bits, so it rounds to the nearest representable value, which happens to be 20.
Please note that this is a problem of understanding floating point (fixed-length) numbers. Most languages do exactly (or very similar to) what Python does.
Python float is IEEE 754 64-bit binary floating point. It is limited to 53 bits of precision i.e. slightly less than 16 decimal digits of precision. 19.9999999999999999 contains 18 decimal digits; it cannot be represented exactly as a float. float("19.9999999999999999") produces the nearest floating point value, which happens to be the same as float("20.0").
>>> float("19.9999999999999999") == float("20.0")
True
If by "many decimals" you mean "many digits after the decimal point", please be aware that the same "weird" results happen when there are many decimal digits before the decimal point:
>>> float("199999999999999999")
2e+17
If you want the full float precision, don't use str(), use repr():
>>> x = 1. / 3.
>>> str(x)
'0.333333333333'
>>> str(x).count('3')
12
>>> repr(x)
'0.3333333333333333'
>>> repr(x).count('3')
16
>>>
Update: It's interesting how often decimal is prescribed as a cure-all for float-induced astonishment. This is often accompanied by simple examples like 0.1 + 0.1 + 0.1 != 0.3. Nobody stops to point out that decimal has its share of deficiencies, e.g.
>>> (1.0 / 3.0) * 3.0
1.0
>>> (Decimal('1.0') / Decimal('3.0')) * Decimal('3.0')
Decimal('0.9999999999999999999999999999')
>>>
True, float is limited to 53 binary digits of precision. By default, decimal is limited to 28 decimal digits of precision.
>>> Decimal(2) / Decimal(3)
Decimal('0.6666666666666666666666666667')
>>>
You can change the limit, but it's still limited precision. You still need to know the characteristics of the number format to use it effectively without "astonishing" results, and the extra precision is bought by slower operation (unless you use the 3rd-party cdecimal module).
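For example, changing the context precision just moves where the rounding happens:
>>> from decimal import Decimal, getcontext
>>> getcontext().prec = 6
>>> Decimal(2) / Decimal(3)
Decimal('0.666667')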
For any given binary floating point number, there is an infinite set of decimal fractions that, on input, round to that number. Python's str goes to some trouble to produce the shortest decimal fraction from this set; see GLS's paper http://kurtstephens.com/files/p372-steele.pdf for the general algorithm (IIRC they use a refinement that avoids arbitrary-precision math in most cases). You happened to input a decimal fraction that rounds to a float (IEEE double) whose shortest possible decimal fraction is not the same as the one you entered.
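For instance, these two very different-looking literals produce exactly the same double, and Python prints the shortest form when converting it back to a string:
>>> 0.1 == 0.1000000000000000055511151231257827
True
>>> str(0.1)
'0.1'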