Python representation of floating point numbers [duplicate] - python

This question already has answers here:
Floating Point Limitations [duplicate]
(3 answers)
Closed 9 years ago.
I spent an hour today trying to figure out why
return abs(val-desired) <= 0.1
was occasionally returning False, despite val and desired having an absolute difference of <=0.1. After some debugging, I found out that -13.2 + 13.3 = 0.10000000000000142. Now I understand that CPUs cannot easily represent most real numbers, but this is an exception, because you can subtract 0.00000000000000142 and get 0.1, so it can be represented in Python.
I am running Python 2.7 on Intel Core architecture CPUs (this is all I have been able to test it on). I'm curious to know how I can store a value of 0.1 despite not being able to apply arithmetic to particular floating point values. val and desired are float values.

Yes, this can be a bit surprising:
>>> +13.3
13.300000000000001
>>> -13.2
-13.199999999999999
>>> 0.1
0.10000000000000001
All these numbers can be represented with some 16 digits of accuracy. So why:
>>> 13.3-13.2
0.10000000000000142
Why only 14 digits of accuracy in that case?
Well, that's because 13.3 and -13.2 have 16 digits of accuracy, which means 14 decimal points, since there are two digits before the decimal point. So the result also have 14 decimal points of accuracy. Even though the computer can represent numbers with 16 digits.
If we make the numbers bigger, the accuracy of the result decreases further:
>>> 13000.3-13000.2
0.099999999998544808
>>> 1.33E10-13.2E10
-118700000000.0
In short, the accuracy of the result depends on the accuracy of the input.

"Now I understand that CPUs cannot easily represent most floating point numbers with high resolution", the fact you asked this question indicates that you don't understand. None of the real values 13.2, 13.3 nor 0.1 can be represented exactly as floating point numbers:
>>> "{:.20f}".format(13.2)
'13.19999999999999928946'
>>> "{:.20f}".format(13.3)
'13.30000000000000071054'
>>> "{:.20f}".format(0.1)
'0.10000000000000000555'

To directly address your question of "how do I store a value like 0.1 and do an exact comparison to it when I have imprecise floating-point numbers," the answer is to use a different type to represent your numbers. Python has a decimal module for doing decimal fixed-point and floating-point math instead of binary -- in decimal, obviously, 0.1, -13.2, and 13.3 can all be represented exactly instead of approximately; or you can set a specific level of precision when doing calculations using decimal and discard digits below that level of significance.
val = decimal.Decimal(some calculation)
desired = decimal.Decimal(some other calculation)
return abs(val-desired) <= decimal.Decimal('0.1')
The other common alternative is to use integers instead of floats by artificially multiplying by some power of ten.
return not int(abs(val-desired)*10)

Related

Why does the floating-point value of 4*0.1 look nice in Python 3 but 3*0.1 doesn't?

I know that most decimals don't have an exact floating point representation (Is floating point math broken?).
But I don't see why 4*0.1 is printed nicely as 0.4, but 3*0.1 isn't, when
both values actually have ugly decimal representations:
>>> 3*0.1
0.30000000000000004
>>> 4*0.1
0.4
>>> from decimal import Decimal
>>> Decimal(3*0.1)
Decimal('0.3000000000000000444089209850062616169452667236328125')
>>> Decimal(4*0.1)
Decimal('0.40000000000000002220446049250313080847263336181640625')
The simple answer is because 3*0.1 != 0.3 due to quantization (roundoff) error (whereas 4*0.1 == 0.4 because multiplying by a power of two is usually an "exact" operation). Python tries to find the shortest string that would round to the desired value, so it can display 4*0.1 as 0.4 as these are equal, but it cannot display 3*0.1 as 0.3 because these are not equal.
You can use the .hex method in Python to view the internal representation of a number (basically, the exact binary floating point value, rather than the base-10 approximation). This can help to explain what's going on under the hood.
>>> (0.1).hex()
'0x1.999999999999ap-4'
>>> (0.3).hex()
'0x1.3333333333333p-2'
>>> (0.1*3).hex()
'0x1.3333333333334p-2'
>>> (0.4).hex()
'0x1.999999999999ap-2'
>>> (0.1*4).hex()
'0x1.999999999999ap-2'
0.1 is 0x1.999999999999a times 2^-4. The "a" at the end means the digit 10 - in other words, 0.1 in binary floating point is very slightly larger than the "exact" value of 0.1 (because the final 0x0.99 is rounded up to 0x0.a). When you multiply this by 4, a power of two, the exponent shifts up (from 2^-4 to 2^-2) but the number is otherwise unchanged, so 4*0.1 == 0.4.
However, when you multiply by 3, the tiny little difference between 0x0.99 and 0x0.a0 (0x0.07) magnifies into a 0x0.15 error, which shows up as a one-digit error in the last position. This causes 0.1*3 to be very slightly larger than the rounded value of 0.3.
Python 3's float repr is designed to be round-trippable, that is, the value shown should be exactly convertible into the original value (float(repr(f)) == f for all floats f). Therefore, it cannot display 0.3 and 0.1*3 exactly the same way, or the two different numbers would end up the same after round-tripping. Consequently, Python 3's repr engine chooses to display one with a slight apparent error.
repr (and str in Python 3) will put out as many digits as required to make the value unambiguous. In this case the result of the multiplication 3*0.1 isn't the closest value to 0.3 (0x1.3333333333333p-2 in hex), it's actually one LSB higher (0x1.3333333333334p-2) so it needs more digits to distinguish it from 0.3.
On the other hand, the multiplication 4*0.1 does get the closest value to 0.4 (0x1.999999999999ap-2 in hex), so it doesn't need any additional digits.
You can verify this quite easily:
>>> 3*0.1 == 0.3
False
>>> 4*0.1 == 0.4
True
I used hex notation above because it's nice and compact and shows the bit difference between the two values. You can do this yourself using e.g. (3*0.1).hex(). If you'd rather see them in all their decimal glory, here you go:
>>> Decimal(3*0.1)
Decimal('0.3000000000000000444089209850062616169452667236328125')
>>> Decimal(0.3)
Decimal('0.299999999999999988897769753748434595763683319091796875')
>>> Decimal(4*0.1)
Decimal('0.40000000000000002220446049250313080847263336181640625')
>>> Decimal(0.4)
Decimal('0.40000000000000002220446049250313080847263336181640625')
Here's a simplified conclusion from other answers.
If you check a float on Python's command line or print it, it goes through function repr which creates its string representation.
Starting with version 3.2, Python's str and repr use a complex rounding scheme, which prefers
nice-looking decimals if possible, but uses more digits where
necessary to guarantee bijective (one-to-one) mapping between floats
and their string representations.
This scheme guarantees that value of repr(float(s)) looks nice for simple
decimals, even if they can't be
represented precisely as floats (eg. when s = "0.1").
At the same time it guarantees that float(repr(x)) == x holds for every float x
Not really specific to Python's implementation but should apply to any float to decimal string functions.
A floating point number is essentially a binary number, but in scientific notation with a fixed limit of significant figures.
The inverse of any number that has a prime number factor that is not shared with the base will always result in a recurring dot point representation. For example 1/7 has a prime factor, 7, that is not shared with 10, and therefore has a recurring decimal representation, and the same is true for 1/10 with prime factors 2 and 5, the latter not being shared with 2; this means that 0.1 cannot be exactly represented by a finite number of bits after the dot point.
Since 0.1 has no exact representation, a function that converts the approximation to a decimal point string will usually try to approximate certain values so that they don't get unintuitive results like 0.1000000000004121.
Since the floating point is in scientific notation, any multiplication by a power of the base only affects the exponent part of the number. For example 1.231e+2 * 100 = 1.231e+4 for decimal notation, and likewise, 1.00101010e11 * 100 = 1.00101010e101 in binary notation. If I multiply by a non-power of the base, the significant digits will also be affected. For example 1.2e1 * 3 = 3.6e1
Depending on the algorithm used, it may try to guess common decimals based on the significant figures only. Both 0.1 and 0.4 have the same significant figures in binary, because their floats are essentially truncations of (8/5)(2^-4) and (8/5)(2^-6) respectively. If the algorithm identifies the 8/5 sigfig pattern as the decimal 1.6, then it will work on 0.1, 0.2, 0.4, 0.8, etc. It may also have magic sigfig patterns for other combinations, such as the float 3 divided by float 10 and other magic patterns statistically likely to be formed by division by 10.
In the case of 3*0.1, the last few significant figures will likely be different from dividing a float 3 by float 10, causing the algorithm to fail to recognize the magic number for the 0.3 constant depending on its tolerance for precision loss.
Edit:
https://docs.python.org/3.1/tutorial/floatingpoint.html
Interestingly, there are many different decimal numbers that share the same nearest approximate binary fraction. For example, the numbers 0.1 and 0.10000000000000001 and 0.1000000000000000055511151231257827021181583404541015625 are all approximated by 3602879701896397 / 2 ** 55. Since all of these decimal values share the same approximation, any one of them could be displayed while still preserving the invariant eval(repr(x)) == x.
There is no tolerance for precision loss, if float x (0.3) is not exactly equal to float y (0.1*3), then repr(x) is not exactly equal to repr(y).

In python, how do I preserve decimal places in numbers? [duplicate]

This question already has answers here:
How can I force division to be floating point? Division keeps rounding down to 0?
(11 answers)
Closed 9 years ago.
I'd like to pass numbers around between functions, while preserving the decimal places for the numbers.
I've discovered that if I pass a float like '10.00' in to a function, then the decimal places don't get used. This messes an operation like calculating percentages.
For example, x * (10 / 100) will always return 0.
But if I manage to preserve the decimal places, I end up doing x * (10.00 / 100). This returns an accurate result.
I'd like to have a technique that enables consistency when I'm working with numbers that decimal places that can hold zeroes.
When you write
10 / 100
you are performing integer division. That's because both operands are integers. The result is 0.
If you want to perform floating point division, make one of the operands be a floating point value. For instance:
10.0 / 100
or
float(10) / 100
Do beware also that
10.0 / 100
results in a binary floating point value and binary floating data types cannot represent the true result value of 0.1. So if you want to represent the result accurately you may need to use a decimal data type. The decimal module has the functionality needed for that.
Division in python for float and int works differently, take a look at this question and it's answers: Python division.
Moreover, if you are looking for a solution to format a decimal floating point of your figures into string, you might need to use %f.
Python
# '1.000000'
"%f" % (1.0)
# '1.00'
"%.2f" % (1.0)
# ' 1.00'
"%6.2f" % (1.0)
Python 2.x will use integer division when dividing two integers unless you explicitly tell it to do otherwise. Two integers in --> one integer out.
Python 3 onwards will return, to quote PEP 238 http://www.python.org/dev/peps/pep-0238/ a reasonable approximation of the result of the division approximation, i.e. it will perform a floating point division and return the result without rounding.
To enable this behaviour in earlier version of Python you can use:
from __future__ import division
At the very top of the module, this should get you the consistent results you want.
You should use the decimal module. Each number knows how many significant digits it has.
If you're trying to preserve significant digits, the decimal module is has everything you need. Example:
>>> from decimal import Decimal
>>> num = Decimal('10.00')
>>> num
Decimal('10.00')
>>> num / 10
Decimal('1.00')

A "round"ed number multiplied by 0.01 results in x.y00000000000001 and not x.y?

The reason I'm asking this is because there is a validation in OpenERP that it's driving me crazy:
>>> round(1.2 / 0.01) * 0.01
1.2
>>> round(12.2 / 0.01) * 0.01
12.200000000000001
>>> round(122.2 / 0.01) * 0.01
122.2
>>> round(1222.2 / 0.01) * 0.01
1222.2
As you can see, the second round is returning an odd value.
Can someone explain to me why is this happening?
This has in fact nothing to with round, you can witness the exact same problem if you just do 1220 * 0.01:
>>> 1220*0.01
12.200000000000001
What you see here is a standard floating point issue.
You might want to read what Wikipedia has to say about floating point accuracy problems:
The fact that floating-point numbers cannot precisely represent all real numbers, and that floating-point operations cannot precisely represent true arithmetic operations, leads to many surprising situations. This is related to the finite precision with which computers generally represent numbers.
Also see:
Numerical analysis
Numerical stability
A simple example for numerical instability with floating-point:
the numbers are finite. lets say we save 4 digits after the dot in a given computer or language.
0.0001 multiplied with 0.0001 would result something lower than 0.0001, and therefore it is impossible to save this result!
In this case if you calculate (0.0001 x 0.0001) / 0.0001 = 0.0001, this simple computer will fail in being accurate because it tries to multiply first and only afterwards to divide. In javascript, dividing with fractions leads to similar inaccuracies.
The float type that you are using stores binary floating point numbers. Not every decimal number is exactly representable as a float. In particular there is no exact representation of 1.2 or 0.01, so the actual number stored in the computer will differ very slightly from the value written in the source code. This representation error can cause calculations to give slightly different results from the exact mathematical result.
It is important to be aware of the possibility of small errors whenever you use floating point arithmetic, and write your code to work well even when the values calculated are not exactly correct. For example, you should consider rounding values to a certain number of decimal places when displaying them to the user.
You could also consider using the decimal type which stores decimal floating point numbers. If you use decimal then 1.2 can be stored exactly. However, working with decimal will reduce the performance of your code. You should only use it if exact representation of decimal numbers is important. You should also be aware that decimal does not mean that you'll never have any problems. For example 0.33333... has no exact representation as a decimal.
There is a loss of accuracy from the division due to the way floating point numbers are stored, so you see that this identity doesn't hold
>>> 12.2 / 0.01 * 0.01 == 12.2
False
bArmageddon, has provided a bunch of links which you should read, but I believe the takeaway message is don't expect floats to give exact results unless you fully understand the limits of the representation.
Especially don't use floats to represent amounts of money! which is a pretty common mistake
Python also has the decimal module, which may be useful to you
Others have answered your question and mentioned that many numbers don't have an exact binary fractional representation. If you are accustomed to working only with decimal numbers, it can seem deeply weird that a nice, "round" number like 0.01 could be a non-terminating number in some other base. In the spirit of "seeing is believing," here's a little Python program that will print out a binary representation of any number to any desired number of digits.
from decimal import Decimal
n = Decimal("0.01") # the number to print the binary equivalent of
m = 1000 # maximum number of digits to print
p = -1
r = []
w = int(n)
n = abs(n) - abs(w)
while n and -p < m:
s = Decimal(2) ** p
if n >= s:
r.append("1")
n -= s
else:
r.append("0")
p -= 1
print "%s.%s%s" % ("-" if w < 0 else "", bin(abs(w))[2:],
"".join(r), "..." if n else "")

Python rounding error with float numbers [duplicate]

This question already has answers here:
Why does floating-point arithmetic not give exact results when adding decimal fractions?
(31 answers)
Is floating point arbitrary precision available?
(5 answers)
Closed 7 years ago.
I don't know if this is an obvious bug, but while running a Python script for varying the parameters of a simulation, I realized the results with delta = 0.29 and delta = 0.58 were missing. On investigation, I noticed that the following Python code:
for i_delta in range(0, 101, 1):
delta = float(i_delta) / 100
(...)
filename = 'foo' + str(int(delta * 100)) + '.dat'
generated identical files for delta = 0.28 and 0.29, same with .57 and .58, the reason being that python returns float(29)/100 as 0.28999999999999998. But that isn't a systematic error, not in the sense it happens to every integer. So I created the following Python script:
import sys
n = int(sys.argv[1])
for i in range(0, n + 1):
a = int(100 * (float(i) / 100))
if i != a: print i, a
And I can't see any pattern in the numbers for which this rounding error happens. Why does this happen with those particular numbers?
Any number that can't be built from exact powers of two can't be represented exactly as a floating point number; it needs to be approximated. Sometimes the closest approximation will be less than the actual number.
Read What Every Computer Scientist Should Know About Floating-Point Arithmetic.
Its very well known due to the nature of floating point numbers.
If you want to do decimal arithmetic not floating point arithmatic there are libraries to do this.
E.g.,
>>> from decimal import Decimal
>>> Decimal(29)/Decimal(100)
Decimal('0.29')
>>> Decimal('0.29')*100
Decimal('29')
>>> int(Decimal('29'))
29
In general decimal is probably going overboard and still will have rounding errors in rare cases when the number does not have a finite decimal representation (for example any fraction where the denominator is not 1 or divisible by 2 or 5 - the factors of the decimal base (10)). For example:
>>> s = Decimal(7)
>>> Decimal(1)/s/s/s/s/s/s/s*s*s*s*s*s*s*s
Decimal('0.9999999999999999999999999996')
>>> int(Decimal('0.9999999999999999999999999996'))
0
So its best to always just round before casting floating points to ints, unless you want a floor function.
>>> int(1.9999)
1
>>> int(round(1.999))
2
Another alternative is to use the Fraction class from the fractions library which doesn't approximate. (It justs keeps adding/subtracting and multiplying the integer numerators and denominators as necessary).

Why does str() round up floats?

The built-in Python str() function outputs some weird results when passing in floats with many decimals. This is what happens:
>>> str(19.9999999999999999)
>>> '20.0'
I'm expecting to get:
>>> '19.9999999999999999'
Does anyone know why? and maybe workaround it?
Thanks!
It's not str() that rounds, it's the fact that you're using floats in the first place. Float types are fast, but have limited precision; in other words, they are imprecise by design. This applies to all programming languages. For more details on float quirks, please read "What Every Programmer Should Know About Floating-Point Arithmetic"
If you want to store and operate on precise numbers, use the decimal module:
>>> from decimal import Decimal
>>> str(Decimal('19.9999999999999999'))
'19.9999999999999999'
A float has 32 bits (in C at least). One of those bits is allocated for the sign, a few allocated for the mantissa, and a few allocated for the exponent. You can't fit every single decimal to an infinite number of digits into 32 bits. Therefore floating point numbers are heavily based on rounding.
If you try str(19.998), it will probably give you something at least close to 19.998 because 32 bits have enough precision to estimate that, but something like 19.999999999999999 is too precise to estimate in 32 bits, so it rounds to the nearest possible value, which happens to be 20.
Please note that this is a problem of understanding floating point (fixed-length) numbers. Most languages do exactly (or very similar to) what Python does.
Python float is IEEE 754 64-bit binary floating point. It is limited to 53 bits of precision i.e. slightly less than 16 decimal digits of precision. 19.9999999999999999 contains 18 decimal digits; it cannot be represented exactly as a float. float("19.9999999999999999") produces the nearest floating point value, which happens to be the same as float("20.0").
>>> float("19.9999999999999999") == float("20.0")
True
If by "many decimals" you mean "many digits after the decimal point", please be aware that the same "weird" results happen when there are many decimal digits before the decimal point:
>>> float("199999999999999999")
2e+17
If you want the full float precision, don't use str(), use repr():
>>> x = 1. / 3.
>>> str(x)
'0.333333333333'
>>> str(x).count('3')
12
>>> repr(x)
'0.3333333333333333'
>>> repr(x).count('3')
16
>>>
Update It's interesting how often decimal is prescribed as a cure-all for float-induced astonishment. This is often accompanied by simple examples like 0.1 + 0.1 + 0.1 != 0.3. Nobody stops to point out that decimal has its share of deficiencies e.g.
>>> (1.0 / 3.0) * 3.0
1.0
>>> (Decimal('1.0') / Decimal('3.0')) * Decimal('3.0')
Decimal('0.9999999999999999999999999999')
>>>
True, float is limited to 53 binary digits of precision. By default, decimal is limited to 28 decimal digits of precision.
>>> Decimal(2) / Decimal(3)
Decimal('0.6666666666666666666666666667')
>>>
You can change the limit, but it's still limited precision. You still need to know the characteristics of the number format to use it effectively without "astonishing" results, and the extra precision is bought by slower operation (unless you use the 3rd-party cdecimal module).
For any given binary floating point number, there is an infinite set of decimal fractions that, on input, round to that number. Python's str goes to some trouble to produce the shortest decimal fraction from this set; see GLS's paper http://kurtstephens.com/files/p372-steele.pdf for the general algorithm (IIRC they use a refinement that avoids arbitrary-precision math in most cases). You happened to input a decimal fraction that rounds to a float (IEEE double) whose shortest possible decimal fraction is not the same as the one you entered.

Categories