Incremented floats do not equal each other [duplicate] - python

Possible Duplicate:
Why can't decimal numbers be represented exactly in binary?
Program not entering if statement
So I'm trying to run a program that has two variables; when one variable is equal to the other, it performs a function (in this case, printing). However, for some reason, when I run this program, I'm not getting any output even though I know they are equal.
g = 0.0
b = 3.0
while g < 30.0:
    if g == b:
        print "Hi"
    g += .1
    print g, b

You are assuming that adding .1 enough times to 0.0 will produce exactly 3.0. These are floating-point numbers, and they are inexact: rounding errors mean that the accumulated value is never exactly equal to 3.0. You should almost never use == to test floating-point numbers.
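For illustration, a minimal sketch of what the accumulated value actually looks like:
g = 0.0
for _ in range(30):
    g += 0.1           # each addition rounds to the nearest representable double
print(repr(g))         # close to 3.0, but not exactly 3.0
print(g == 3.0)        # False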

A good way to do this is to count with integer values (e.g., loop with i from 0 to 300 by 1) and scale the counter only when the float value is used (e.g., set f = i * .1). When you do this, the loop counter is always exact, so you get exactly the iterations you want, and there is only one floating-point rounding, which does not accumulate from iteration to iteration.
The loop counter is most commonly an integer type, so that addition is easily seen to be exact (until overflow is reached). However, the loop counter may also be a floating-point type, provided you are sure that the values and operations for it are exact. (The common 32-bit floating-point format represents integers exactly from -2²⁴ to +2²⁴. Outside that, it does not have the precision to represent integers exactly. It does not represent .1 exactly, so you cannot count with increments of .1. But you could count with increments of .5, .25, .375, or other small multiples of moderate powers of two, which are represented exactly.)
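As a hedged sketch of that integer-counter approach, applied to the loop in the question:
for i in range(301):        # exact integer counter: 0, 1, ..., 300
    g = i * 0.1             # scale only where the float value is needed; a single rounding, no accumulation
    if i == 30:             # the test is on the exact integer counter, not on the float
        print("Hi")         # fires exactly once, however i * 0.1 happens to round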

To expand on Karoly Horvath's comment, what you can do to test near-equality is choose some value (call it epsilon) that is very, very small relative to the minimum increment. Let's say epsilon is 1.0 * 10^-6, five orders of magnitude smaller than your increment. (It should probably be based on the average rounding error of your floating-point representation, but that varies, and this is simply an example.)
What you then do is check whether g and b differ by less than epsilon. If they do, they are close enough to be practically equal; the difference between "practically" and "actually" equal is the rounding error, which you are approximating with epsilon.
Check for
abs(g - b) < epsilon
and you'll have your almost-but-not-quite equality check, which should be good enough for most purposes.
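For instance, a minimal sketch of the question's loop using the epsilon test instead of == (the standard library's math.isclose offers a similar, relative-tolerance check):
epsilon = 1e-6
g = 0.0
b = 3.0
while g < 30.0:
    if abs(g - b) < epsilon:   # near-equality instead of exact equality
        print("Hi")            # fires once, when g is within epsilon of 3.0
    g += 0.1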

Related

Almost correct output, but not quite right. Math help in Python

I am trying to match an expected output of "13031.157014219536" exactly, and I have attempted three times to get the value with different methods, detailed below, and come extremely close to the value, but not close enough. What is happening in these code snippets that is causing the deviation? Is it rounding error in the calculation? Did I do something wrong?
Attempts:
item_worth = 8000
years = range(10)
for year in years:
    item_worth += (item_worth * 0.05)
print(item_worth)

value = item_worth * (1 + ((0.05)**10))
print(value)

cost = 8000
for x in range(10):
    cost *= 1.05
print(cost)
Expected Output:
13031.157014219536
Actual Outputs:
13031.157014219529
13031.157014220802
13031.157014219538
On most machines, floats are represented as fp64, i.e. double-precision floats.
You can check the precision of that representation for your number this way (not a method to be used for real computation, just out of curiosity):
import struct
struct.pack('d', 13031.157014219536)
# bytes representing that number in fp64 b'\xf5\xbc\n\x19\x94s\xc9#'
# or, in a more humanely understandable way
struct.unpack('q', struct.pack('d', 13031.157014219536))[0]
# 4668389568658717941
# That number has no meaning, except that this integer is represented by the same bytes as your float.
# Now, let's see what float is "next in line"
struct.unpack('d', struct.pack('q', 4668389568658717941+1))[0]
# 13031.157014219538
Note that this code works on most machines, but it is not fully reliable. First of all, it relies on the significand bits not being all 1s; otherwise, it would give a totally unrelated number. Secondly, it assumes that ints are little-endian (LE). But well, it gave me what I wanted:
the information that the smallest number bigger than 13031.157014219536 is 13031.157014219538.
(Or, said more accurately for this kind of conversation: the smallest float bigger than 13031.157014219536 whose representation is not the same as the representation of 13031.157014219536 has the same representation as 13031.157014219538)
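A more portable way to get the same next-float information, assuming Python 3.9 or later, is math.nextafter:
import math
print(math.nextafter(13031.157014219536, math.inf))   # 13031.157014219538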
So, my point is that you are flirting with the representation limit. You can't expect the sum of 10 numbers to be any more accurate than that.
I could also have put it this way: the biggest power of 2 smaller than your number is 8192 = 2¹³.
So 13 is the exponent of your float in its representation. And since you have 53 significant bits, the precision of such a number is 2**(13-53+1) = 1.8×10⁻¹² (which is indeed also the result of 13031.157014219538 - 13031.157014219536). Hence the reason why 12 decimal places are printed. But not all combinations of those digits can exist, and the last one is not insignificant, yet not fully significant either.
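A quick hedged check of that figure, using math.ulp (also Python 3.9+):
import math
print(math.ulp(13031.157014219536))   # 2**-39, about 1.8e-12
print(2.0 ** (13 - 53 + 1))           # the same value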
If your computation is the result of the sum of 10 such numbers, you could even have an error 10 times bigger than that before having the right to complain :D

Python behaviour when approaching 0.0

I have written a Python program that needs to run until the initial value gets to 0 or until another condition is met. I wanted to say something useful about the number of loop iterations the program would go through before reaching 0, as this is needed to find the solution to another program. But for some reason I get the following results, depending on the input variables, and searching the Internet so far has not helped solve my problem. I wrote the following code to simulate my problem.
temp = 1
cool = 0.4999 # This is the important variable
count = 0
while temp != 0.0:
    temp *= cool
    count += 1
    print(count, temp)
print(count)
So, at some point I'd expect the script to stop and print the amount of loops necessary to get to 0.0. And with the code above that is indeed the case. After 1075 loops the program stops and returns 1075. However, if I change the value of cool to something above 0.5 (for example 0.5001) the program seems to run indefinitely. Why is this the case?
The default behavior in many implementations of floating-point arithmetic is to round the real-number arithmetic result to the nearest number representable in floating-point arithmetic.
In any floating-point number format, there is some smallest positive representable value, say s. Consider how s * .25 would be evaluated. The real number result is .25•s. This is not a representable value, so it must be rounded to the nearest representable number. It is between 0 and s, so those are the two closest representable values. Clearly, it is closer to 0 than it is to s, so it is rounded to 0, and that is the floating-point result of the operation.
Similarly, when s is multiplied by any number between 0 and ½, the result is rounded to zero.
In contrast, consider s * .75. The real number result is .75•s. This is closer to s than it is to 0, so it is rounded to s, and that is the floating-point result of the operation.
Similarly, when s is multiplied by any number between ½ and 1, the result is rounded to s.
Thus, if you start with a positive number and continually multiply by some fraction between 0 and 1, the number will get smaller and smaller until it reaches s, and then it will either jump to 0 or remain at s depending on whether the fraction is less than or greater than ½.
If the fraction is exactly ½, .5•s is the same distance from 0 and s. The usual rule for breaking ties favors the number with the even low bit in its fraction portion, so the result is rounded to 0.
Note that Python does not fully specify floating-point semantics, so the behaviors may vary from implementation to implementation. Most commonly, IEEE-754 binary64 is used, in which the smallest representable positive number is 2⁻¹⁰⁷⁴.
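A hedged demonstration of this behavior, assuming IEEE-754 binary64 (the usual case for CPython):
s = 2.0 ** -1074       # smallest positive representable double (a subnormal)
print(s * 0.4999)      # rounds down to 0.0
print(s * 0.5001)      # rounds back up to s (5e-324)
print(s * 0.5)         # exact tie: round-half-to-even gives 0.0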

Extract significant digit from 0.0007

I want to extract the significant digit 7 from XX = 0.0007.
The code is as follows
import numpy as np

XX = 0.0007
enX1 = XX // 10**np.floor(np.log10(XX))
But enX1 becomes 6, not 7. Can anyone help me?
In some sense, you were lucky to start out with the value 0.0007. As it turns out, that value is one of the (many!) decimal values that cannot be represented exactly in a floating point format.
A floating point number usually gets stored in the common IEEE-754 format as powers of 2. Just like a whole number such as 165 is stored as the sum of bits with increasing power-of-two values (165 = 128 + 32 + 4 + 1), fractions are stored as a sum of 1/power-of-two numbers. That means that values such as 1/2, 1/4, and 1/65536 can be stored exactly (and sums thereof, such as 3/4), but your 0.0007 cannot. Its closest representable value is actually 0.000699999999999999992887633748495. ("Closest" in the sense that the next representable value up is slightly larger than 0.0007, and farther away from 0.0007 than this lower one is.)
In your calculation, you use the double slash //, which instructs Python to do a floor division and discard the fractional part. So the intermediate quotient comes out as something like 6.99999..., which then gets truncated, and you end up with 6.
If you use a single slash, the result will keep its (exact!) decimals but Python will represent it as 7.0000, give or take a few zeroes. By default, Python displays only a small number of decimals.
Note that this still "is" not the exact value 7. The calculation starts out with an imprecise number, and although there may be some intermediate rounding here and there, there is only a small chance you end up with a precise integer. Again, not for all decimals, but for a large number of them. Other fractional values may be stored fractionally larger than the value you enter – 0.0004, for exampleᵃ – but the underlying 'problem' of accuracy is also present there. It's just not as visible as with yours.
If you want a nearest-integer result, use a single slash for the exact calculation, followed by round() to force the number to the nearest integer anyway.
ᵃ To be precise, it is stored as approximately 0.000400000000000000019168694409544. After your routine, Python will display it as 4, but internally it's still just a bit larger than that.
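A hedged sketch of that suggestion (true division followed by round()):
import numpy as np

XX = 0.0007
quotient = XX / 10**np.floor(np.log10(XX))   # single slash: very close to 7
enX1 = int(round(quotient))                  # force to the nearest integer
print(enX1)   # 7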

Determine whether two complex numbers are equal

The following code causes the print statements to be executed:
import numpy as np
import math
foo = np.array([1/math.sqrt(2), 1/math.sqrt(2)], dtype=np.complex_)
total = complex(0, 0)
one = complex(1, 0)
for f in foo:
    total = total + pow(np.abs(f), 2)
if(total != one):
    print str(total) + " vs " + str(one)
    print "NOT EQUAL"
However, my input of [1/math.sqrt(2), 1/math.sqrt(2)] results in a total that prints as one, and yet the statements are executed:
(1+0j) vs (1+0j) NOT EQUAL
Is it something to do with mixing NumPy with Python's complex type?
When using floating-point numbers it is important to keep in mind that computations with them are generally not exact, and are therefore always subject to rounding errors. This is inherent to the design of floating-point arithmetic, which is currently the most practicable way to do high-precision mathematics on computers with limited resources. You can't compute exactly using floats (and you have practically no alternative), because your numbers have to be cut off somewhere to fit into a reasonable amount of memory (in most cases at most 64 bits), and this cut-off is done by rounding (see below for an example).
To deal correctly with these shortcomings you should never compare floats for equality, but for closeness. NumPy provides two functions for that: np.isclose for comparing single values (or for item-wise comparison of arrays) and np.allclose for whole arrays. The latter is equivalent to np.all(np.isclose(a, b)), so you get a single value for an array.
>>> np.isclose(np.float32('1.000001'), np.float32('0.999999'))
True
But sometimes the rounding works out and matches our analytical expectation; see for example:
>>> np.float(1) == np.square(np.sqrt(1))
True
After squaring, the value is reduced again to fit into the available memory, so in this case it is rounded to exactly what we would expect.
These two functions have built-in absolute and relative tolerances (you can also pass them as parameters) that are used to compare the two values. By default they are rtol=1e-05 and atol=1e-08.
Also, don't mix different packages and their types. If you use NumPy, use NumPy types and NumPy functions. This will also reduce your rounding errors.
Btw: rounding errors have an even bigger impact when working with numbers whose exponents differ widely.
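Applied to the question's example, a minimal sketch might look like this (dtype=complex is used here so it runs across NumPy versions):
import math
import numpy as np

foo = np.array([1/math.sqrt(2), 1/math.sqrt(2)], dtype=complex)
total = np.sum(np.abs(foo) ** 2)     # slightly below 1.0 (0.9999999999999998 here)
print(total == 1.0)                  # False
print(np.isclose(total, 1.0))        # True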
I guess the same considerations as for real numbers apply: never assume two floats are equal, only that they are close enough:
eps = 0.000001
if abs(a - b) < eps:
    print "Equal"

Scheduling events when time is a floating point value

The following example highlights a pitfall with regard to the use of floating point numbers:
available_again = 0
for i in range(0, 15):
    time = 0.1*i
    if time < available_again:
        print("failed to schedule at " + str(time))
    available_again = time + 0.1
This code outputs the following:
failed to schedule at 1.3
I wasn't expecting this error; however, I do understand why it occurs. What options do I have to address this problem?
One fix in my code would be:
available_again = 0.1*(i+1)
I'm wondering if this is the correct route. My particular application involves the scheduling of events where the time at which events occur is dictated by complex mathematical functions, for example: sinc(2*pi*f*t). The duration of events will be such that events may overlap each other, in which case I will need to send them on separate channels.
One fix in my code would be:
available_again = 0.1*(i+1)
This fix is correct and will make your code work as long as time remains small enough for the floating-point resolution to be better than 0.1 (up to about 2⁵⁰).
It works because the floating-point number 0.1*(i+1) computed at iteration i is exactly the same as the floating-point number computed as 0.1*i with i having been incremented by one at the next iteration, and because as long as integers n and m remain lower than about 2⁵⁰, no two 0.1*n and 0.1*m are equal for different values of n and m.
The reason is that floating-point arithmetic is deterministic. The floating-point operation 0.1 * n may produce a counter-intuitive result for some integral values of n, but it always produces the same result for the same n.
If, in addition, it is important for you that time is as close as possible to the mathematical quotient i / 10, then you should compute time as i / 10.0 and, logically, compute available_again as (i+1) / 10.0.
This continues to work for the same reason as above, and it has the additional property always to compute the floating-point number nearest to the intended quotient, whereas 0.1 * i magnifies the representation error between the floating-point number 0.1 and the rational 1/10.
In neither case will two consecutive values of time always be separated by the same interval. With the i/10.0 computation, the floating-point value will float around the rational i/10. With 0.1*i, it will float around i*0.1000000000000000055511151231257827021181583404541015625. If you have the freedom to pick the sampling frequency, choose it so that the factor between i and time is a power of two (say 1/64 or 1/128). Then you will have the additional property that time is computed exactly and that every time interval is exactly the same.
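A hedged sketch of that last suggestion, with a step of 1/64 instead of 0.1; every value of time and available_again is then computed exactly, so nothing is printed:
available_again = 0.0
step = 1.0 / 64.0                     # a power of two: exactly representable
for i in range(15):
    time = step * i                   # exact: a small integer times a power of two
    if time < available_again:
        print("failed to schedule at " + str(time))
    available_again = time + step     # also exact at these magnitudes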
