Python behaviour when approaching 0.0 - python

I have written a Python program that needs to run until the initial value gets to 0 or until another case is met. I wanted to say anything useful about the amount of loops the program would go through until it would reach 0 as this is necessary to find the solution to another program. But for some reason I get the following results, depending on the input variables and searching the Internet thus far has not helped solve my problem. I wrote the following code to simulate my problem.
temp = 1
cool = 0.4999 # This is the important variable
count = 0
while temp != 0.0:
temp *= cool
count += 1
print(count, temp)
print(count)
So, at some point I'd expect the script to stop and print the amount of loops necessary to get to 0.0. And with the code above that is indeed the case. After 1075 loops the program stops and returns 1075. However, if I change the value of cool to something above 0.5 (for example 0.5001) the program seems to run indefinitely. Why is this the case?

The default behavior in many implementations of floating-point arithmetic is to round the real-number arithmetic result to the nearest number representable in floating-point arithmetic.
In any floating-point number format, there is some smallest positive representable value, say s. Consider how s * .25 would be evaluated. The real number result is .25•s. This is not a representable value, so it must be rounded to the nearest representable number. It is between 0 and s, so those are the two closest representable values. Clearly, it is closer to 0 than it is to s, so it is rounded to 0, and that is the floating-point result of the operation.
Similarly, when s is multiplied by any number between 0 and ½, the result is rounded to zero.
In contrast, consider s * .75. The real number result is .75•s. This is closer to s than it is to 0, so it is rounded to s, and that is the floating-point result of the operation.
Similarly, when s is multiplied by any number between ½ and 1, the result is rounded to s.
Thus, if you start with a positive number and continually multiply by some fraction between 0 and 1, the number will get smaller and smaller until it reaches s, and then it will either jump to 0 or remain at s depending on whether the fraction is less than or greater than ½.
If the fraction is exactly ½, .5•s is the same distance from 0 and s. The usual rule for breaking ties favors the number with the even low bit in its fraction portion, so the result is rounded to 0.
Note that Python does not fully specific floating-point semantics, so the behaviors may vary from implementation to implementation. Most commonly, IEEE-754 binary64 is used, in which the smallest representable positive number is 2−1074.

Related

Why python print() prints a rounded value rather than the exact value for non-representable float

The value 0.1 is not representable as a 64 bits floats.
The exact value is roughly equals to 0.10000000000000000555
https://www.exploringbinary.com/why-0-point-1-does-not-exist-in-floating-point/
You can highlight this behavior with this simple code:
timestep = 0.1
iterations = 1_000_000
total = 0
for _ in range(iterations):
total += timestep
print(total - timestep * iterations) # output is not zero but 1.3328826753422618e-06
I totally understand why 0.1 is not representable as an exact value as a float 64, but what I don't get is why when I do print(0.1), it outputs 0.1 and not the underlying value as a float 64.
Of course, the underlying value has many more digits on a base 10 system so there should be some rounding involved, but I am looking for the specification for all values and how to control that.
I had the issue with some application storing data in database:
the python app (using str(0.1)) would show 0.1
another database client UI would show 0.10000000000000000555, which would throw off the end user
P-S: I had other issues with other values
Regards,
First, you are right, floats (single, double, whatever) have an exact value.
For 64 bits IEEE-754 double, the nearest representable value to 0.1 would be exactly 0.1000000000000000055511151231257827021181583404541015625, quite long as you can see. But representable floating point values all have a finite number of decimal digits, because the base (2) is a divisor of some power of 10.
For a REPL language like python, it is essential to have this property:
the printed representation of the float shall be reinterpreted as the same value
A consequence is that
every two different float shall have different printed representation
For obtaining those properties, there are several possbilities:
print the exact value. That can be many digits, and for the vast majority of humans, just noise.
print enough digits so that every two different float have a different representation. For double precision, that's 17 digits in the worse case. So a naive implementation for representing floating point values would be to always print 17 significant digits.
print the shortest representation that would be reinterpreted unchanged.
Python, and many other languages have chosen the 3rd solution, because it is considered annoying to print 0.10000000000000001 when user have entered 0.1. Human users generally choose the shorter representation and printed representation is for human consumption. The shorter, the better.
The bad property is that it could give the false impression that those floating point values are storing exact decimal values like 1/10. That's a knowledge that is evangelized here and in many places now.

When dividing a number by its divisor in floating point math, when if ever is the result not an integral float

When I run this code in python
def is_cool(n):
if (n/7).is_integer():
return True
else:
return False
for i in range(0,1000000,7):
if not is_cool(i):
print(i, " is where the error is")
It doesn't print anything. I know there are some places where floating point math will always be correct. Is this one of them?
IEEE-754 division of a number by one of its divisors returns an exact result.
IEEE 754-2008 4.3 says:
… Except where stated otherwise, every operation shall be performed as if it first produced an intermediate result correct to infinite precision and with unbounded range, and then rounded that result according to one of the attributes in this clause.
When an intermediate result is representable, all of the rounding attributes round it to itself; rounding changes a value only when it is not representable. Rules for division are given in 5.4, and they do not state exceptions for the above.
The quotient resulting from dividing a representable number by a representable divisor must be representable, as it can have no more significant bits than the numerator. Therefore, dividing a number by one of its divisors returns an exact result.
Note that this rule applies to the numbers that are the actual operands of the division. When you have some numerals in source code, such as 1234567890123456890 / 7, those numerals are first converted to the numeric format. If they are not representable in that format, some approximation must be produced. That is a separate issue from how the division operates.
(n/7).is_integer() returns True only when n%7===0
Now you run your loop starting from 0 with a step size of 7
Your i will be 0, 7, 14, 21, ...
is_cool(i) will always return True for every value of i as stated above but in your if condition you have stated if not is_cool(i) will always be False hence code will not print anything

Floating point math accuracy in Python 2.7

I am having a bit of trouble understanding some results I am getting while operating with Python 2.7.
>>> x=1
>>> e=1e-20
>>> x+e
1.0
>>> x+e-x
0.0
>>> e+x-x
0.0
>>> x-x+e
1e-20
This is copied directly from Python. I am having a class on how to program on Python and I do not understand the disparity of results (x+e==1, x-x+e==1e-20 but x+e-x==0 and e+x-x==0).
I have already read the Python tutorial on Representation Errors, but I believe none of that was mentioned there
Thanks in advance
Floating-point addition is not associative.
x+e-x is grouped as (x+e)-x. It adds x and e, rounds the result to the nearest representable number (which is 1), then subtracts x from the result and rounds again, producing 0.
x-x+e is grouped as (x-x)+e. It subtracts x from x, producing 0, and rounds it to the nearest representable number, which is 0. It then adds e to 0, producing e, and rounds it to the nearest representable number, which is e.
This is because of the way that computers represent floating point numbers.
This is all really in binary format but let's pretend that it works with base 10 numbers because that's a lot easier for us to relate to.
A floating point number is expressed on the form 0.x*10^y where x is a 10-digit number (I'm omitting trailing zeroes here) and y is the exponent. This means that the number 1.0 is expressed as 0.1*10^1 and the number 0.1 as 0.1*10^0.
To add these two numbers together we need to make sure that they have the same exponent. We can do this easily by shifting the numbers back and forth, i.e. we change 0.1*10^0 to 0.01*10^1 and then we add the together to get 0.11*10^1.
When we have 0.1*10^1 and 0.1*10^-19 (1e-20) we will shift 0.1*10^-19 20 steps, meaning that the 1 will fall outside the range of our 10 digit number so we will end up with 0.1*10^1 + 0.0*10^1 = 0.1*10^1.
The reason you end up with 1e-20 in your last example is because addition is done from left to right, so we subtract 0.1*10^1 from 0.1*10^1 ending up with 0.0*10^0 and add 0.1*10^-19 to that, which is a special case where we don't need to shift any of them because one of them is exactly zero.

why (0.0006*100000)%10 is 10

When I did (0.0006*100000)%10 and (0.0003*100000)%10 in python it returned 9.999999999999993 respectively, but actually it has to be 0.
Similarly in c++ fmod(0.0003*100000,10) gives the value as 10. Can someone help me out where i'm getting wrong.
The closest IEEE 754 64-bit binary number to 0.0003 is 0.0002999999999999999737189393389513725196593441069126129150390625. The closest representable number to the result of multiplying it by 100000 is 29.999999999999996447286321199499070644378662109375.
There are a number of operations, such as floor and mod, that can make very low significance differences very visible. You need to be careful using them in connection with floating point numbers - remember that, in many cases, you have a very, very close approximation to the infinite precision value, not the infinite precision value itself. The actual value can be slightly high or, as in this case, slightly low.
Just to give the obvious answer: 0.0006 and 0.0003 are not representable in a machine double (at least on modern machines). So you didn't actually multiply by those values, but by some value very close. Slightly more, or slightly less, depending on how the compiler rounded them.
May I suggest using the remainder function in C?
It will compute the remainder after rounding the quotient to nearest integer, with exact computation (no rounding error):
remainder = dividend - round(dividend/divisor)*divisor
This way, your result will be in [-divisor/2,+divisor/2] interval.
This will still emphasize the fact that you don't get a float exactly equal to 6/10,000 , but maybe in a less surprising way when you expect a null remainder:
remainder(0.0006*100000,10.0) -> -7.105427357601002e-15
remainder(0.0003*100000,10.0) -> -3.552713678800501e-15
I don't know of such remainder function support in python, but there seems to be a match in gnulib-python module (to be verified...)
https://github.com/ghostmansd/gnulib-python/blob/master/modules/remainder
EDIT
Why does it apparently work with every other N/10,000 in [1,9] interval but 3 and 6?
It's not completely lucky, this is somehow good properties of IEEE 754 in default rounding mode (round to nearest, tie to even).
The result of a floating point operation is rounded to nearest floating point value.
Instead of N/D you thus get (N/D+err) where the absolute error err is given by this snippet (I'm more comfortable in Smalltalk, but I'm sure you will find equivalent in Python):
| d |
d := 10000.
^(1 to: 9) collect: [:n | ((n/d) asFloat asFraction - (n/d)) asFloat]
It gives you something like:
#(4.79217360238593e-21 9.58434720477186e-21 -2.6281060661048628e-20 1.916869440954372e-20 1.0408340855860843e-20 -5.2562121322097256e-20 -7.11236625150491e-21 3.833738881908744e-20 -2.4633073358870662e-20)
Changing the last bit of a floating point significand leads to a small difference named the unit of least precision (ulp), and it might be good to express the error in term of ulp:
| d |
d := 10000.
^(1 to: 9) collect: [:n | ((n/d) asFloat asFraction - (n/d)) / (n/d) asFloat ulp]
the number of ulp off the exact fraction is thus:
#(0.3536 0.3536 -0.4848 0.3536 0.096 -0.4848 -0.0656 0.3536 -0.2272)
The error is the same for N=1,2,4,8 because they are essentially the same floating point - same significand, just the exponent changes.
It's also the same for N=3 and 6 for same reason, but very near the maximum error for a single operation which is 0.5 ulp (unluckily the number can be half way between two floats).
For N=9, the relative error is smaller than for N=1, and for 5 and 7, the error is very small.
Now when we multiply these approximation by 10000 which is exactly representable as a float, (N/D+err)D is N+Derr, and it's then rounded to nearest float. If D*err is less than half distance to next float, then this is rounded to N and the rounding error vanishes.
| d |
d := 10000.
^(1 to: 9) collect: [:n | ((n/d) asFloat asFraction - (n/d)) * d / n asFloat ulp]
OK, we were unlucky for N=3 and 6, the already high rounding error magnitude has become greater than 0.5 ulp:
#(0.2158203125 0.2158203125 -0.591796875 0.2158203125 0.1171875 -0.591796875 -0.080078125 0.2158203125 -0.138671875)
Beware, the distance is not symmetric for exact powers of two, the next float after 1.0 is 1.0+2^-52, but before 1.0 it's 1.0-2^-53.
Nonetheless, what we see here, is that after the second rounding operation, the error did annihilate in four cases, and did cumulate only in a single case (counting only the cases with different significands).
We can generalize that result. As long as we do not sum numbers with very different exponents, but just use muliply/divide operations, while the error bound can be high after P operations, the statistical distribution of cumulated errors has a remarkably narrow peak compared to this bound, and the result are somehow surprisingly good w.r.t. what we regularly read about float imprecision. See my answer to The number of correct decimal digits in a product of doubles with a large number of terms for example.
I just wanted to mention that yes, float are inexact, but they sometimes do such a decent job, that they are fostering the illusion of exactness. Finding a few outliers like mentionned in this post is then surprising. The sooner surprise, the least surprise. Ah, if only float were implemented less carefully, there would be less questions in this category...

Incremented floats do not equal each other [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Why can't decimal numbers be represented exactly in binary?
Program not entering if statement
So I'm trying to run a program that has two variables, when one variable is equal to another, it performs a function. In this case, printing spam. However, for some reason, when I run this program, I'm not getting any output even though I know they are equal.
g=0.0
b=3.0
while g < 30.0:
if g==b:
print "Hi"
g+=.1
print g, b
You are assuming that adding .1 enough times to 0.0 will produce 3.0. These are floating point numbers, they are inaccurate. Rounding errors make it so that the value is never exactly equal to 3.0. You should almost never use == to test floating point numbers.
A good way to do this is to count with integer values (e.g., loop with i from 0 to 300 by 1) and scale the counter only when the float value is used (e.g., set f = i * .1). When you do this, the loop counter is always exact, so you get exactly the iterations you want, and there is only one floating-point rounding, which does not accumulate from iteration to iteration.
The loop counter is most commonly an integer type, so that addition is easily seen to be exact (until overflow is reached). However, the loop counter may also be a floating-point type, provided you are sure that the values and operations for it are exact. (The common 32-bit floating-point format represents integers exactly from -224 to +224. Outside that, it does not have the precision to represent integers exactly. It does not represent .1 exactly, so you cannot count with increments of .1. But you could count with increments of .5, .25, .375, or other small multiples of moderate powers of two, which are represented exactly.)
To expand on Karoly Horvath's comment, what you can do to test near-equality is choose some value (let's call it epsilon) that is very, very small relative to the minimum increment. Let's say epsilon is 1.0 * 10^-6, five orders of magnitude smaller than your increment. (It should probably be based on the average rounding error of your floating point representation, but that varies, and this is simply an example).
What you then do is check if g and b are less than epsilon different - if they are close enough that they are practically equal, the difference between practically and actually being the rounding error, which you're approximating with epsilon.
Check for
abs(g - b) < epsilon
and you'll have your almost-but-not-quite equality check, which should be good enough for most purposes.

Categories