Division by 3 in Python - python

I am new to Python and while experimenting with operators, I came across this:
>>> 7.0 / 3
2.3333333333333335
Shouldn't the result be 2.3333333333333333 or maybe 2.3333333333333334? Why is it rounding the number in such a way?
Also, with regard to floor division in Python 2.7 my results were:
>>> 5 / 2
2
>>> 5 // 2
2
>>> 5.0 / 2
2.5
>>> 5.0 // 2
2.0
So my observation is that floor division returns the integer quotient even in the case of floating-point numbers, while normal division returns the decimal value. Is this true?

Take a look at this 0.30000000000000004.com
Your language isn't broken, it's doing floating point math. Computers can only natively store integers, so they need some way of representing decimal numbers. This representation comes with some degree of inaccuracy. That's why, more often than not, .1 + .2 != .3.
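A minimal sketch of the .1 + .2 case, assuming Python 3.5 or later for math.isclose. The tolerance below is an arbitrary value chosen for illustration, not a recommendation.

```python
import math

total = 0.1 + 0.2

# The sum lands one step above the double nearest to 0.3.
print(total)          # 0.30000000000000004
print(total == 0.3)   # False

# Comparing with a tolerance is a common workaround.
# rel_tol=1e-9 is only an illustrative choice.
print(math.isclose(total, 0.3, rel_tol=1e-9))   # True
```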

Shouldn't the result be 2.3333333333333333 or maybe 2.3333333333333334. Why is it rounding the number in such a way?
The key is that the number is being rounded twice.
The first rounding is part of the division operation, which rounds the result to the nearest double-precision floating point value. This rounding happens in binary, not in decimal.
The second rounding is part of converting the floating point number to a decimal representation for display. It is possible to represent the exact value of any binary fraction in decimal, but it is usually not desirable as in most applications doing so will simply result in many digits of false-precision. Python instead outputs the shortest decimal approximation that will round-trip to the correct floating point value.
We can better see what is going on by using the Fraction and Decimal types. Unlike converting directly to a string, converting a floating point number to a Fraction or Decimal gives the exact value. We can also use the Fraction type to determine the error in our calculation.
>>> from fractions import Fraction
>>> from decimal import Decimal
>>> 7.0 / 3
2.3333333333333335
>>> Decimal(7.0 / 3)
Decimal('2.333333333333333481363069950020872056484222412109375')
>>> Fraction(7.0 / 3)
Fraction(5254199565265579, 2251799813685248)
>>> Fraction(7,3) - Fraction(7.0 / 3)
Fraction(-1, 6755399441055744)
The conversion via type Decimal shows us the exact value of the floating point number and demonstrates the many digits of false-precision that typically result from exact conversion of a floating point value to decimal.
The conversion to a Fraction is also interesting: the denominator is 2251799813685248, which is equal to 2^51. This makes perfect sense, since a double-precision floating point number has 53 effective bits of mantissa and we need two of those for the integral part of the result, leaving 51 for the fractional part.
The error in our floating point calculation is 1/6755399441055744, or 1/3 * 2^-51. This error is less than half our precision step of 2^-51, so the answer was indeed correctly rounded to a double-precision floating point value.
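The rounding check can be replayed with exact rational arithmetic. This is only a sketch: the names computed, exact and step are mine, and it assumes the platform float is an IEEE 754 double, so the spacing between neighbouring floats in the range 2 to 4 is 2^-51.

```python
from fractions import Fraction

computed = Fraction(7.0 / 3)    # exact value of the stored double
exact = Fraction(7, 3)          # the true quotient
step = Fraction(1, 2**51)       # spacing of doubles between 2 and 4 (assumed IEEE 754)

error = abs(exact - computed)
print(error)                             # 1/6755399441055744
print(error == Fraction(1, 3) * step)    # True
print(error <= step / 2)                 # True: within half a step

# The neighbouring double below is further away from 7/3.
below = computed - step
print(abs(exact - below) > error)        # True
```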

Related

Does float type in python only represent approximations to real numbers?

I'm new to programming and studying the basics now. I'm wondering: does the float type in Python only represent approximations to real numbers? I know float uses binary fractions, but are the floats 0.5, 0.25, 0.125, etc. still approximations? I tried:
sum([0.1] * 10) == 1
it returned False.
But
sum([0.5] * 10) == 5
It returned True.
Finally I tried:
for i in range(1, 8):
    answer = sum([1 / 2 ** i] * 10)
    print(answer == 1 / 2 ** i * 10)
The answer is all True.
Does that mean some floats in Python are exactly the real number, not approximations?
Each floating-point object represents one number (or special value such as NaN) exactly. Floating-point objects do not represent approximations.
The correct way to think about floating-point is that floating-point values are exact numbers, but floating-point operations approximate real arithmetic.
Python does not specify floating-point arithmetic precisely; each Python implementation may use the underlying arithmetic of the platform it is implemented on. Commonly, IEEE 754 formats are used, although the operations may not conform to IEEE 754 completely. To illustrate what is happening with your code, I will use IEEE-754 basic 64-bit binary floating-point.
When the source text 0.5 is processed, it is converted to floating-point. Note that conversion is an operation, just as addition or multiplication are operations. The characters are interpreted as a decimal numeral, and the conversion produces the floating-point number that is closest to the number represented by the decimal numeral. In this case, 0.5 represents one-half, and that is exactly representable in binary floating-point, so the result is exactly 0.5.
Then [0.5] * 10 produces a list containing ten copies of 0.5, and sum adds those. All of the additions performed in this summation are exact, because the floating-point format can exactly represent 0.5, 1, 1.5, 2, and so on. So the result is 5, exactly, and comparing this to 5 produces true.
On the other hand, when the source text 0.1 is processed, that decimal numeral represents one-tenth, which cannot be represented exactly. The conversion produces the nearest representable value, which is 0.1000000000000000055511151231257827021181583404541015625.
When sum adds the ten copies of this, the addition cannot always be performed exactly. Adding the first two is exact: adding 0.1000000000000000055511151231257827021181583404541015625 to 0.1000000000000000055511151231257827021181583404541015625 produces 0.200000000000000011102230246251565404236316680908203125. However, when 0.200000000000000011102230246251565404236316680908203125 is added to 0.1000000000000000055511151231257827021181583404541015625, the result is 0.3000000000000000444089209850062616169452667236328125. During this addition, the bits carried to a new position (the operands are under ¼, but the result is over ¼, so the addition carried into the ¼ position). Since the floating-point format has only a fixed number of bits (53) available for the value, the operation had to discard the low bit. In doing so, it changed the result slightly. So this addition is only approximate.
As these additions go on, the final value is 0.99999999999999988897769753748434595763683319091796875. When this is compared to 1, the result is false.
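The running total can be inspected step by step. The sketch below assumes IEEE 754 doubles and uses an explicit loop rather than the built-in sum, because, as far as I know, recent CPython releases (3.12 and later) use a more accurate summation inside sum() and may return exactly 1.0 there.

```python
import math
from decimal import Decimal

total = 0.0
for step in range(1, 11):
    total += 0.1                   # one rounded floating-point addition per step
    print(step, Decimal(total))    # Decimal(float) shows the exact stored value

print(total == 1.0)                    # False
# math.fsum tracks the lost low-order bits and rounds only once at the end.
print(math.fsum([0.1] * 10) == 1.0)    # True
```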
Python represents floating point numbers as binary fractions. Therefore numbers like 0.5 can be represented accurately, whereas 0.1 for example can not.
Floating point numbers in Python are just approximations, if they can not be exactly represented using binary fractions.
If you need more accuracy when dealing with floating point arithmetic, I would suggest taking a look at decimals: https://docs.python.org/3/library/decimal.html
Additionally, a good resource on floating point numbers in Python can be found here: https://docs.python.org/3/tutorial/floatingpoint.html
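As a rough illustration of the decimal suggestion, here is the same ten-fold sum done with float, Decimal and Fraction. The Decimal values are built from strings on purpose; building them from the float 0.1 would carry the binary approximation along.

```python
from decimal import Decimal
from fractions import Fraction

float_sum = 0.0
for _ in range(10):
    float_sum += 0.1               # binary floating point, rounded at each step

decimal_sum = sum([Decimal('0.1')] * 10)    # exact in base 10
fraction_sum = sum([Fraction(1, 10)] * 10)  # exact rational arithmetic

print(float_sum == 1)      # False
print(decimal_sum == 1)    # True
print(fraction_sum == 1)   # True
```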

Why does the floating-point value of 4*0.1 look nice in Python 3 but 3*0.1 doesn't?

I know that most decimals don't have an exact floating point representation (Is floating point math broken?).
But I don't see why 4*0.1 is printed nicely as 0.4, but 3*0.1 isn't, when
both values actually have ugly decimal representations:
>>> 3*0.1
0.30000000000000004
>>> 4*0.1
0.4
>>> from decimal import Decimal
>>> Decimal(3*0.1)
Decimal('0.3000000000000000444089209850062616169452667236328125')
>>> Decimal(4*0.1)
Decimal('0.40000000000000002220446049250313080847263336181640625')
The simple answer is because 3*0.1 != 0.3 due to quantization (roundoff) error (whereas 4*0.1 == 0.4 because multiplying by a power of two is usually an "exact" operation). Python tries to find the shortest string that would round to the desired value, so it can display 4*0.1 as 0.4 as these are equal, but it cannot display 3*0.1 as 0.3 because these are not equal.
You can use the .hex method in Python to view the internal representation of a number (basically, the exact binary floating point value, rather than the base-10 approximation). This can help to explain what's going on under the hood.
>>> (0.1).hex()
'0x1.999999999999ap-4'
>>> (0.3).hex()
'0x1.3333333333333p-2'
>>> (0.1*3).hex()
'0x1.3333333333334p-2'
>>> (0.4).hex()
'0x1.999999999999ap-2'
>>> (0.1*4).hex()
'0x1.999999999999ap-2'
0.1 is 0x1.999999999999a times 2^-4. The "a" at the end means the digit 10 - in other words, 0.1 in binary floating point is very slightly larger than the "exact" value of 0.1 (because the final 0x0.99 is rounded up to 0x0.a). When you multiply this by 4, a power of two, the exponent shifts up (from 2^-4 to 2^-2) but the number is otherwise unchanged, so 4*0.1 == 0.4.
However, when you multiply by 3, the tiny little difference between 0x0.99 and 0x0.a0 (0x0.07) magnifies into a 0x0.15 error, which shows up as a one-digit error in the last position. This causes 0.1*3 to be very slightly larger than the rounded value of 0.3.
Python 3's float repr is designed to be round-trippable, that is, the value shown should be exactly convertible into the original value (float(repr(f)) == f for all floats f). Therefore, it cannot display 0.3 and 0.1*3 exactly the same way, or the two different numbers would end up the same after round-tripping. Consequently, Python 3's repr engine chooses to display one with a slight apparent error.
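A quick check of the round-trip property, assuming Python 3 and IEEE 754 doubles. The '.16g' format is used only to show what goes wrong when one digit too few is printed.

```python
a = 0.3
b = 0.1 * 3

# repr always converts back to the same float.
for x in (a, b, 0.1, 4 * 0.1, 7.0 / 3):
    assert float(repr(x)) == x

print(repr(a), repr(b))   # 0.3 0.30000000000000004

# With only 16 significant digits, b would print as 0.3 and no longer round-trip.
short = format(b, '.16g')
print(short)               # 0.3
print(float(short) == b)   # False
```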
repr (and str in Python 3) will put out as many digits as required to make the value unambiguous. In this case the result of the multiplication 3*0.1 isn't the closest value to 0.3 (0x1.3333333333333p-2 in hex), it's actually one LSB higher (0x1.3333333333334p-2) so it needs more digits to distinguish it from 0.3.
On the other hand, the multiplication 4*0.1 does get the closest value to 0.4 (0x1.999999999999ap-2 in hex), so it doesn't need any additional digits.
You can verify this quite easily:
>>> 3*0.1 == 0.3
False
>>> 4*0.1 == 0.4
True
I used hex notation above because it's nice and compact and shows the bit difference between the two values. You can do this yourself using e.g. (3*0.1).hex(). If you'd rather see them in all their decimal glory, here you go:
>>> Decimal(3*0.1)
Decimal('0.3000000000000000444089209850062616169452667236328125')
>>> Decimal(0.3)
Decimal('0.299999999999999988897769753748434595763683319091796875')
>>> Decimal(4*0.1)
Decimal('0.40000000000000002220446049250313080847263336181640625')
>>> Decimal(0.4)
Decimal('0.40000000000000002220446049250313080847263336181640625')
Here's a simplified conclusion from other answers.
If you check a float on Python's command line or print it, it goes through function repr which creates its string representation.
Starting with version 3.2, Python's str and repr use a complex rounding scheme, which prefers nice-looking decimals if possible, but uses more digits where necessary to guarantee a bijective (one-to-one) mapping between floats and their string representations.
This scheme guarantees that the value of repr(float(s)) looks nice for simple decimals, even if they can't be represented precisely as floats (e.g. when s = "0.1").
At the same time it guarantees that float(repr(x)) == x holds for every float x.
Not really specific to Python's implementation but should apply to any float to decimal string functions.
A floating point number is essentially a binary number, but in scientific notation with a fixed limit of significant figures.
The reciprocal of any number with a prime factor that is not shared with the base always has a recurring representation in that base. For example, 1/7 has a prime factor, 7, that is not shared with 10, and therefore has a recurring decimal representation. The same is true in binary for 1/10, whose prime factors are 2 and 5, the latter not being shared with 2; this means that 0.1 cannot be exactly represented by a finite number of bits after the binary point.
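The prime-factor rule can be sketched as a small test. The function name terminates is hypothetical; it strips from the denominator every factor it shares with the base and checks whether anything is left.

```python
from fractions import Fraction
from math import gcd

def terminates(value: Fraction, base: int) -> bool:
    """Return True if value has a finite expansion in the given base."""
    den = value.denominator
    g = gcd(den, base)
    while g > 1:
        # Remove every factor the denominator shares with the base.
        while den % g == 0:
            den //= g
        g = gcd(den, base)
    # Anything left over is a prime factor the base does not have.
    return den == 1

print(terminates(Fraction(1, 10), 10))   # True:  0.1 in decimal
print(terminates(Fraction(1, 10), 2))    # False: recurring in binary
print(terminates(Fraction(1, 7), 10))    # False: recurring in decimal
print(terminates(Fraction(1, 8), 2))     # True:  0.001 in binary
```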
Since 0.1 has no exact representation, a function that converts the approximation to a decimal point string will usually try to approximate certain values so that they don't get unintuitive results like 0.1000000000004121.
Since the floating point is in scientific notation, any multiplication by a power of the base only affects the exponent part of the number. For example 1.231e+2 * 100 = 1.231e+4 for decimal notation, and likewise, 1.00101010e11 * 100 = 1.00101010e101 in binary notation. If I multiply by a non-power of the base, the significant digits will also be affected. For example 1.2e1 * 3 = 3.6e1
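The same effect can be seen on actual floats with math.frexp, which splits a float into a mantissa and a power-of-two exponent. A small sketch, assuming IEEE 754 doubles:

```python
import math

# frexp(x) returns (m, e) with x == m * 2**e and 0.5 <= m < 1.
m1, e1 = math.frexp(0.1)
m4, e4 = math.frexp(0.1 * 4)    # multiply by a power of the base
m3, e3 = math.frexp(0.1 * 3)    # multiply by a non-power of the base
m03, e03 = math.frexp(0.3)

print(m1 == m4, e4 - e1)      # True 2: only the exponent moved
print(m3 == m03, e3 == e03)   # False True: the significant bits differ
```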
Depending on the algorithm used, it may try to guess common decimals based on the significant figures only. Both 0.1 and 0.4 have the same significant figures in binary, because their floats are essentially truncations of (8/5)(2^-4) and (8/5)(2^-2) respectively. If the algorithm identifies the 8/5 sigfig pattern as the decimal 1.6, then it will work on 0.1, 0.2, 0.4, 0.8, etc. It may also have magic sigfig patterns for other combinations, such as the float 3 divided by float 10 and other magic patterns statistically likely to be formed by division by 10.
In the case of 3*0.1, the last few significant figures will likely be different from dividing a float 3 by float 10, causing the algorithm to fail to recognize the magic number for the 0.3 constant depending on its tolerance for precision loss.
Edit:
https://docs.python.org/3.1/tutorial/floatingpoint.html
Interestingly, there are many different decimal numbers that share the same nearest approximate binary fraction. For example, the numbers 0.1 and 0.10000000000000001 and 0.1000000000000000055511151231257827021181583404541015625 are all approximated by 3602879701896397 / 2 ** 55. Since all of these decimal values share the same approximation, any one of them could be displayed while still preserving the invariant eval(repr(x)) == x.
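The quoted claim is easy to check, assuming IEEE 754 doubles. The list name candidates is just for illustration.

```python
from fractions import Fraction

candidates = [
    '0.1',
    '0.10000000000000001',
    '0.1000000000000000055511151231257827021181583404541015625',
]
floats = [float(text) for text in candidates]

# All three decimal strings convert to the same double.
print(len(set(floats)))   # 1
print(Fraction(floats[0]) == Fraction(3602879701896397, 2 ** 55))   # True

# repr picks the shortest of the strings that map to this double.
print(repr(floats[0]))    # 0.1
```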
There is no tolerance for precision loss, if float x (0.3) is not exactly equal to float y (0.1*3), then repr(x) is not exactly equal to repr(y).

Convert to Float without Rounding Decimal Places

I have a list and it contains a certain number '5.74536541' in it which I convert to a float.
I am printing it out in Python 3 using ("%0.2f" % (variable)) but it always prints out 5.75 instead of 5.74.
I know you're thinking who cares, but it is for a currency converter program and I don't want the currencies to round up/down but to be exact.
How can I keep it from rounding but also keep the 2 decimal places?
You shouldn't use floating point numbers for currency, due to rounding errors like you mentioned.
Your best bet is to use a fixed-precision decimal where you also have full control over how rounding and truncation works. From the docs:
>>> from decimal import *
>>> getcontext()
Context(prec=28, rounding=ROUND_HALF_EVEN, Emin=-999999999, Emax=999999999,
capitals=1, flags=[], traps=[Overflow, DivisionByZero,
InvalidOperation])
>>> getcontext().prec = 6
>>> Decimal('3.0')
Decimal('3.0')
>>> Decimal('3.1415926535')
Decimal('3.1415926535')
>>> Decimal('3.1415926535') + Decimal('2.7182818285')
Decimal('5.85987')
>>> getcontext().rounding = ROUND_UP
>>> Decimal('3.1415926535') + Decimal('2.7182818285')
Decimal('5.85988')
You should represent all currency-based values internally as Decimals with a high precision (the standard level of precision should be fine in your case - just leave the prec alone!). If you want to print a nicely formatted dollars and cents value to the user, using the locale module is a straightforward way to do this.
Be careful when printing as you will have to quantize the Decimal down to the correct number of places for display or the rounding will not be based on your Decimal context! You should only perform the quantize step for final display or for a single, final value - all intermediate steps should use high-precision Decimals to make any operations as accurate as possible.
>>> from decimal import *
>>> import locale
>>> locale.setlocale(locale.LC_ALL, '')
'en_AU.UTF-8'
>>> getcontext().rounding = ROUND_DOWN
>>> TWOPLACES = Decimal(10) ** -2
>>> var = Decimal('5.74536541')
>>> var
Decimal('5.74536541')
>>> var.quantize(TWOPLACES)
Decimal('5.74')
>>> locale.currency(var.quantize(TWOPLACES))
'$5.74'
If you're dealing with currency and accuracy matters, don't use float, use decimal.
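A minimal sketch of truncating to two places with decimal, without touching the global context. The helper name truncate_to_cents is hypothetical, and the input is kept as a string so it never passes through a float.

```python
from decimal import Decimal, ROUND_DOWN

CENTS = Decimal('0.01')

def truncate_to_cents(amount: str) -> Decimal:
    """Cut a decimal string down to two places without rounding up."""
    return Decimal(amount).quantize(CENTS, rounding=ROUND_DOWN)

print(truncate_to_cents('5.74536541'))   # 5.74
print('%0.2f' % float('5.74536541'))     # 5.75, the behaviour from the question
```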
Take away the number mod 0.01
i.e.
rounded = number - (number % 0.01)
then print it the same as before.
This said, rounding down is not more accurate. Are you trying the old steal money from a bank by exploiting rounding errors scheme?
Floating point values are known as "useful approximations". Whatever you do to a floating point number—round it, truncate it, whatever—if the result is a floating point value, you don't get to decide how many digits to the right of the decimal point it has.
Never use floating point values for currency. See pydoc decimal, for example. Python's decimal module supports decimal fixed point and decimal floating point arithmetic.
Python docs warn about rounding floats.
Note The behavior of round() for floats can be surprising: for
example, round(2.675, 2) gives 2.67 instead of the expected 2.68. This
is not a bug: it’s a result of the fact that most decimal fractions
can’t be represented exactly as a float.
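The 2.675 case from the note can be looked at directly. This sketch assumes IEEE 754 doubles; the ROUND_HALF_UP line shows one way to get the schoolbook result when the value starts out as a decimal string.

```python
from decimal import Decimal, ROUND_HALF_UP

# The float nearest to 2.675 is slightly below it.
print(Decimal(2.675))
# 2.67499999999999982236431605997495353221893310546875

print(round(2.675, 2))   # 2.67

# Starting from the string keeps the exact decimal value.
print(Decimal('2.675').quantize(Decimal('0.01'), rounding=ROUND_HALF_UP))   # 2.68
```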
If you're not careful, you'll be misled by the value that appears at the interpreter prompt.
Python only prints a decimal approximation to the true decimal value
of the binary approximation stored by the machine.
And
It’s important to realize that this is, in a real sense, an illusion:
the value in the machine is not exactly 1/10, you’re simply rounding
the display of the true machine value. This fact becomes apparent as
soon as you try to do arithmetic with these values
If the number is a string, truncate the string to only 2 characters after the decimal point and then convert it to a float.
Otherwise, multiply it by 10^n, where n is the number of digits you want to keep after the decimal point, truncate the result to an integer, and then divide by 10^n.
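Both suggestions might look roughly like this. The helper names truncate_str and truncate_float are made up for the example, and the float version inherits the usual caveat: for some inputs the product x * 10^n lands just below the intended integer, so the result can be one unit too low.

```python
import math

def truncate_str(text: str, places: int = 2) -> float:
    """Cut the string after `places` digits, then convert to float."""
    whole, _, frac = text.partition('.')
    return float(whole + '.' + frac[:places].ljust(places, '0'))

def truncate_float(x: float, places: int = 2) -> float:
    """Scale up, drop the fractional part, scale back down."""
    factor = 10 ** places
    return math.trunc(x * factor) / factor

print(truncate_str('5.74536541'))               # 5.74
print(truncate_float(5.74536541))               # 5.74
print('%0.2f' % truncate_float(5.74536541))     # 5.74
```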

Why is Python's Decimal function defaulting to 54 places?

After inputting
from decimal import *
getcontext().prec = 6
Decimal (1) / Decimal (7)
I get the value
Decimal('0.142857')
However if I enter Decimal (1.0/7) I get
Decimal('0.142857142857142849212692681248881854116916656494140625')
The expression 1.0 / 7 computes a binary floating point number with 53 bits of precision (roughly 16 to 17 significant decimal digits). This happens before the Decimal constructor sees it:
>>> d = 1.0 / 7
>>> type(d)
<type 'float'>
>>> d.as_integer_ratio()
(2573485501354569, 18014398509481984)
The binary fraction, 2573485501354569 / 18014398509481984 is as close as binary floating point can get using 53 bits of precision. It is not exactly 1/7th, but it's pretty close.
The Decimal constructor then converts the binary fraction to as many places as necessary to get an exact decimal equivalent. The result you are seeing is what you get when you evaluate 2573485501354569 / 18014398509481984 exactly:
>>> from decimal import Decimal, getcontext
>>> getcontext().prec = 100
>>> Decimal(2573485501354569) / Decimal(18014398509481984)
Decimal('0.142857142857142849212692681248881854116916656494140625')
Learning point 1: Binary floating point computes binary fractions to 53 bits of precision. The result is rounded if necessary.
Learning point 2: The Decimal constructor converts binary floating point numbers to decimals losslessly (no rounding). This tends to result in many more digits of precision than you might expect (See the 6th question in the Decimal FAQ).
Learning point 3: The decimal module is designed to treat all numbers as being exact. Only the results of computations get rounded to the context precision. The binary floating point input is converted to decimal exactly and context precision isn't applied until you do a computation with the number (See the final question and answer in the Decimal FAQ for details).
Executive summary: Don't do binary floating point division before handing the numbers to the decimal module. Let it do the work to your desired precision.
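The three learning points can be seen side by side in a short sketch. It uses localcontext so the precision change stays local; the variable names are only for illustration.

```python
from decimal import Decimal, localcontext

with localcontext() as ctx:
    ctx.prec = 6

    a = Decimal(1) / Decimal(7)   # a computation: rounded to 6 digits
    b = Decimal(1.0 / 7)          # construction from a float: exact, context ignored
    c = +b                        # unary plus is a computation, so the context applies

print(a)   # 0.142857
print(b)   # 0.142857142857142849212692681248881854116916656494140625
print(c)   # 0.142857
```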
Hope this helps :-)

Why does str() round up floats?

The built-in Python str() function outputs some weird results when passing in floats with many decimals. This is what happens:
>>> str(19.9999999999999999)
'20.0'
I'm expecting to get:
'19.9999999999999999'
Does anyone know why? and maybe workaround it?
Thanks!
It's not str() that rounds, it's the fact that you're using floats in the first place. Float types are fast, but have limited precision; in other words, they are imprecise by design. This applies to all programming languages. For more details on float quirks, please read "What Every Programmer Should Know About Floating-Point Arithmetic"
If you want to store and operate on precise numbers, use the decimal module:
>>> from decimal import Decimal
>>> str(Decimal('19.9999999999999999'))
'19.9999999999999999'
A float in C has 32 bits, but Python's float is normally implemented as a C double, which has 64. One of those bits is allocated for the sign, 11 for the exponent, and 52 for the mantissa. You can't fit every single decimal to an infinite number of digits into 64 bits. Therefore floating point numbers are heavily based on rounding.
If you try str(19.998), it will give you something at least close to 19.998 because 64 bits have enough precision to estimate that, but something like 19.999999999999999 is too precise to estimate in 64 bits, so it rounds to the nearest possible value, which happens to be 20.
Please note that this is a problem of understanding floating point (fixed-length) numbers. Most languages do exactly (or very similar to) what Python does.
Python float is IEEE 754 64-bit binary floating point. It is limited to 53 bits of precision i.e. slightly less than 16 decimal digits of precision. 19.9999999999999999 contains 18 decimal digits; it cannot be represented exactly as a float. float("19.9999999999999999") produces the nearest floating point value, which happens to be the same as float("20.0").
>>> float("19.9999999999999999") == float("20.0")
True
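To see why the literal collapses to 20.0, compare it with the neighbouring double just below 20. This is a sketch that assumes IEEE 754 doubles, where the spacing between floats in the range 16 to 32 is 2^-48.

```python
import sys
from decimal import Decimal

print(sys.float_info.mant_dig)   # 53 bits of precision
print(sys.float_info.dig)        # 15 decimal digits always survive a round trip

x = float("19.9999999999999999")
below = 20.0 - 2.0 ** -48        # the next double below 20.0 (assumed spacing)

print(x == 20.0)                 # True
print(Decimal(below))            # exact value of that neighbour

# The literal is much closer to 20 than to the neighbour below it.
literal = Decimal("19.9999999999999999")
print(abs(literal - 20) < abs(literal - Decimal(below)))   # True
```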
If by "many decimals" you mean "many digits after the decimal point", please be aware that the same "weird" results happen when there are many decimal digits before the decimal point:
>>> float("199999999999999999")
2e+17
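If you want the full float precision, don't use str(), use repr() (this matters in Python 2; in Python 3.2 and later, str() and repr() of a float give the same string):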
>>> x = 1. / 3.
>>> str(x)
'0.333333333333'
>>> str(x).count('3')
12
>>> repr(x)
'0.3333333333333333'
>>> repr(x).count('3')
16
>>>
Update It's interesting how often decimal is prescribed as a cure-all for float-induced astonishment. This is often accompanied by simple examples like 0.1 + 0.1 + 0.1 != 0.3. Nobody stops to point out that decimal has its share of deficiencies e.g.
>>> (1.0 / 3.0) * 3.0
1.0
>>> (Decimal('1.0') / Decimal('3.0')) * Decimal('3.0')
Decimal('0.9999999999999999999999999999')
>>>
True, float is limited to 53 binary digits of precision. By default, decimal is limited to 28 decimal digits of precision.
>>> Decimal(2) / Decimal(3)
Decimal('0.6666666666666666666666666667')
>>>
You can change the limit, but it's still limited precision. You still need to know the characteristics of the number format to use it effectively without "astonishing" results, and the extra precision is bought by slower operation (unless you use the 3rd-party cdecimal module).
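For instance, raising the precision only moves the problem further out. The sketch below uses 50 digits as an arbitrary choice and adds Fraction for contrast, since it stores 1/3 exactly.

```python
from decimal import Decimal, localcontext
from fractions import Fraction

with localcontext() as ctx:
    ctx.prec = 50                      # arbitrary, just more than the default 28
    third = Decimal(1) / Decimal(3)    # rounded to 50 digits
    product = third * 3

print(product == 1)               # False: still a string of nines
print(Fraction(1, 3) * 3 == 1)    # True: exact rational arithmetic
```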
For any given binary floating point number, there is an infinite set of decimal fractions that, on input, round to that number. Python's str goes to some trouble to produce the shortest decimal fraction from this set; see GLS's paper http://kurtstephens.com/files/p372-steele.pdf for the general algorithm (IIRC they use a refinement that avoids arbitrary-precision math in most cases). You happened to input a decimal fraction that rounds to a float (IEEE double) whose shortest possible decimal fraction is not the same as the one you entered.
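The idea of "shortest decimal fraction that rounds back to the same float" can be imitated with a brute-force search. This is not the algorithm from the paper or the one CPython uses, only a slow sketch with a hypothetical name, and its output format can differ from str (for 20.0 it returns '2e+01').

```python
def shortest_round_trip(x: float) -> str:
    """Try 1, 2, ... 17 significant digits until the text converts back to x."""
    for digits in range(1, 18):
        text = format(x, '.{}g'.format(digits))
        if float(text) == x:
            return text
    # Fallback for values such as NaN that never compare equal.
    return repr(x)

print(shortest_round_trip(0.1))        # 0.1
print(shortest_round_trip(3 * 0.1))    # 0.30000000000000004
print(shortest_round_trip(7.0 / 3))    # 2.3333333333333335
print(float(shortest_round_trip(19.9999999999999999)) == 20.0)   # True
```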
