Scheduling events when time is a floating point value - python

The following example highlights a pitfall with regard to the use of floating point numbers:
available_again = 0
for i in range(0,15):
    time = 0.1*i
    if time < available_again:
        print("failed to schedule at " + str(time))
    available_again = time + 0.1
This code outputs the following:
failed to schedule at 1.3
I wasn't expecting this error; however, I do understand why it occurs. What options do I have to address this problem?
One fix in my code would be:
available_again = 0.1*(i+1)
I'm wondering if this is the correct route. My particular application involves the scheduling of events where the time at which events occur is dictated by complex mathematical functions, for example: sinc(2*pi*f*t). The duration of events will be such that events may overlap each other, in which case I will need to send them on separate channels.

One fix in my code would be:
available_again = 0.1*(i+1)
This fix is correct and will make your code work as long as time remains small enough for the floating-point resolution to be better than 0.1 (up to about 2^50).
It works because the floating-point number 0.1*(i+1) computed at iteration i is exactly the same as the floating-point number computed as 0.1*i with i having been incremented by one at the next iteration, and because, as long as the integers n and m remain lower than about 2^50, no two values 0.1*n and 0.1*m are equal for different n and m.
The reason is that floating-point arithmetic is deterministic. The floating-point operation 0.1 * n may produce a counter-intuitive result for some integral values of n, but it always produces the same result for the same n.
If, in addition, it is important for you that time be as close as possible to the mathematical quotient i / 10, then you should compute time as i / 10.0 and, logically, compute available_again as (i+1) / 10.0.
This continues to work for the same reason as above, and it has the additional property always to compute the floating-point number nearest to the intended quotient, whereas 0.1 * i magnifies the representation error between the floating-point number 0.1 and the rational 1/10.
In neither case will two consecutive values of time always be separated by the same interval. With the i/10.0 computation, the floating-point value will float around the rational i/10. With 0.1*i, it will float around i*0.1000000000000000055511151231257827021181583404541015625. If you have the freedom to pick the sampling frequency, choose it so that the factor between i and time is a power of two (say 1/64 or 1/128). Then you will have the additional property that time is computed exactly and that every time interval is exactly the same.
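To make the fix concrete, here is a minimal sketch of the scheduling loop rewritten along these lines (the 1/128 step at the end is only an illustration of the power-of-two suggestion):

available_again = 0.0
for i in range(0, 15):
    time = i / 10.0                    # nearest double to the exact quotient i/10
    if time < available_again:
        print("failed to schedule at " + str(time))
    available_again = (i + 1) / 10.0   # exactly the value the next iteration computes as time

# If the sampling interval is free to choose, a power-of-two step makes every value exact:
times = [i / 128.0 for i in range(15)]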

Related

Python behaviour when approaching 0.0

I have written a Python program that needs to run until the initial value gets to 0 or until another condition is met. I wanted to say something useful about the number of loops the program would go through until it reached 0, as this is necessary to find the solution to another program. But for some reason I get the following results, depending on the input variables, and searching the Internet thus far has not helped solve my problem. I wrote the following code to simulate my problem.
temp = 1
cool = 0.4999 # This is the important variable
count = 0
while temp != 0.0:
    temp *= cool
    count += 1
    print(count, temp)
print(count)
So, at some point I'd expect the script to stop and print the amount of loops necessary to get to 0.0. And with the code above that is indeed the case. After 1075 loops the program stops and returns 1075. However, if I change the value of cool to something above 0.5 (for example 0.5001) the program seems to run indefinitely. Why is this the case?
The default behavior in many implementations of floating-point arithmetic is to round the real-number arithmetic result to the nearest number representable in floating-point arithmetic.
In any floating-point number format, there is some smallest positive representable value, say s. Consider how s * .25 would be evaluated. The real number result is .25•s. This is not a representable value, so it must be rounded to the nearest representable number. It is between 0 and s, so those are the two closest representable values. Clearly, it is closer to 0 than it is to s, so it is rounded to 0, and that is the floating-point result of the operation.
Similarly, when s is multiplied by any number between 0 and ½, the result is rounded to zero.
In contrast, consider s * .75. The real number result is .75•s. This is closer to s than it is to 0, so it is rounded to s, and that is the floating-point result of the operation.
Similarly, when s is multiplied by any number between ½ and 1, the result is rounded to s.
Thus, if you start with a positive number and continually multiply by some fraction between 0 and 1, the number will get smaller and smaller until it reaches s, and then it will either jump to 0 or remain at s depending on whether the fraction is less than or greater than ½.
If the fraction is exactly ½, .5•s is the same distance from 0 and s. The usual rule for breaking ties favors the number with the even low bit in its fraction portion, so the result is rounded to 0.
Note that Python does not fully specify floating-point semantics, so the behaviors may vary from implementation to implementation. Most commonly, IEEE-754 binary64 is used, in which the smallest representable positive number is 2^-1074.
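A quick way to observe this behaviour, assuming the usual IEEE-754 binary64 format where the smallest positive value is 5e-324 (that is, 2^-1074):

s = 5e-324          # smallest positive subnormal double
print(s * 0.4999)   # 0.0     -- below s/2, rounds down to zero
print(s * 0.5001)   # 5e-324  -- above s/2, rounds back up to s
print(s * 0.5)      # 0.0     -- exact tie, round-half-to-even picks zero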

Python Numerical Differentiation and the minimum value for h

I calculate the first derivative using the following code:
import numpy as np

h = 1e-5   # step size; not defined in the original snippet, shown here as an example value

def f(x):
    f = np.exp(x)
    return f

def dfdx(x):
    Df = (f(x+h) - f(x-h)) / (2*h)
    return Df
For example, for x == 10 this works fine. But when I set h to around 10E-14 or below, Df starts
to get values that are really far away from the expected value f(10) and the relative error between the expected value and Df becomes huge.
Why is that? What is happening here?
The evaluation of f(x) has, at best, a rounding error of |f(x)|*mu where mu is the machine constant of the floating point type. The total error of the central difference formula is thus approximately
2*|f(x)|*mu/(2*h) + |f'''(x)|/6 * h^2
In the present case, the exponential function is equal to all of its derivatives, so that the error is proportional to
mu/h + h^2/6
which has a minimum at h = (3*mu)^(1/3), which for the double format with mu=1e-16 is around h=1e-5.
The precision is increased if, instead of 2*h, the actual difference (x+h)-(x-h) between the evaluation points is used in the denominator. This can be seen in a loglog plot of the distance to the exact derivative.
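A short sweep over h (a sketch, not part of the original answer) makes the trade-off visible; with f = exp and x = 10 the relative error is smallest around h = 1e-5 and grows again for very small h:

import numpy as np

x = 10.0
exact = np.exp(x)   # the derivative of exp is exp itself
for h in [1e-2, 1e-4, 1e-5, 1e-6, 1e-8, 1e-11, 1e-14]:
    approx = (np.exp(x + h) - np.exp(x - h)) / (2 * h)
    print(f"h = {h:8.0e}   relative error = {abs(approx - exact) / exact:.2e}")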
You are probably encountering some numerical instability, as for x = 10 and h =~ 1E-13, the argument for np.exp is very close to 10 whether h is added or subtracted, so small approximation errors in the value of np.exp are scaled significantly by the division with the very small 2 * h.
In addition to the answer by @LutzL, I will add some info from the great book Numerical Recipes, 3rd Edition: The Art of Scientific Computing, chapter 5.7 on Numerical Derivatives, especially about the choice of the optimal h value for a given x:
Always choose h so that h and x differ by an exactly representable number. Funny stuff like 1/3 should be avoided, except when x is equal to something along the lines of 14.3333333.
Round-off error is approximately epsilon * |f(x) / h|, where epsilon is the floating point accuracy; Python represents floating point numbers with double precision, so it's 1e-16. It may differ for more complicated functions (where precision errors accumulate further), though that's not your case.
Choice of optimal h: without getting into details, it would be sqrt(epsilon) * x for the simple forward case, except when your x is near zero (you will find more information in the book), which is your case. You may want to use higher x values in such cases; a complementary answer is already provided. In the case of f(x+h) - f(x-h), as in your example, it would amount to epsilon ** (1/3) * x, so approximately 5e-6 times x, a choice that might be a little difficult in the case of small values like yours. Quite close (if one can say so, bearing in mind floating point arithmetic...) to the practical results posted by @LutzL, though.
You may use other derivative formulas besides the symmetric one you are using. You may want to use forward or backward evaluation if the function is costly to evaluate and you have calculated f(x) beforehand. If your function is cheap to evaluate, you may want to evaluate it multiple times using higher order methods to make the precision error smaller (see the five-point stencil on Wikipedia, as provided in the comment to your question).
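For reference, a sketch of the five-point stencil mentioned above (a standard fourth-order central formula, not code from the question):

import numpy as np

def f(x):
    return np.exp(x)

def dfdx_5point(x, h=1e-3):
    # fourth-order central difference; truncation error is O(h**4)
    return (-f(x + 2*h) + 8*f(x + h) - 8*f(x - h) + f(x - 2*h)) / (12*h)

print(dfdx_5point(10.0))   # very close to np.exp(10.0)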
This Python tutorial explains the reason behind the limited precision. In summary, decimals are ultimately represented in binary and the precision is about 17 significant digits. So, you are right that it gets fuzzy beyond 10E-14.

How to prevent float imprecision from affecting numpy.arange?

Because numpy.arange() uses ceil((stop - start)/step) to determine the number of items, a small float imprecision (stop = .400000001) can add an unintended value to the list.
Example
The first case does not include the stop point (intended)
>>> print(np.arange(.1,.3,.1))
[0.1 0.2]
The second case includes the stop point (not intended)
>>> print(np.arange(.1,.4,.1))
[0.1 0.2 0.3 0.4]
numpy.linspace() fixes this problem (np.linspace(.1, .4-.1, 3)), but requires you to know the number of steps. Using np.linspace(start, stop-step, np.ceil((stop-step)/step)) leads to the same inconsistencies.
Question
How can I generate a reliable float range without knowing the # of elements in the range?
Extreme Case
Consider the case in which I want to generate a float index of unknown precision
np.arange(2.00(...)001,2.00(...)021,.00(...)001)
Your goal is to calculate what ceil((stop - start)/step) would be if the values had been calculated with exact mathematics.
This is impossible to do given only floating-point values of start, stop, and step that are the results of operations in which some rounding errors may have occurred. Rounding removes information, and there is simply no way to create information from lack of information.
Therefore, this problem is only solvable if you have additional information about start, stop, and step.
Suppose step is exact, but start and stop have some accumulated errors bounded by e0 and e1. That is, you know start is at most e0 away from its ideal mathematical value (in either direction), and stop is at most e1 away from its ideal value (in either direction). Then the ideal value of (stop-start)/step could range from (stop-start-e0-e1)/step to (stop-start+e0+e1)/step.
Suppose there is an integer between (stop-start-e0-e1)/step and (stop-start+e0+e1)/step. Then it is impossible to know whether the ideal ceil result should be the lesser integer or the greater just from the floating-point values of start, stop, and step and the bounds e0 and e1.
However, from the examples you have given, the ideal (stop-start)/step could be exactly an integer, as in (.4-.1)/.1. If so, any non-zero error bounds could result in the error interval straddling an integer, making the problem impossible to solve from the information we have so far.
Therefore, in order to solve the problem, you must have more information than just simple bounds on the errors. You must know, for example, that (stop-start)/step is exactly an integer or is otherwise quantized. For example, if you knew that the ideal calculation of the number of steps would produce a multiple of .1, such as 3.8, 3.9, 4.0, 4.1, or 4.2, but never 4.05, and the errors were sufficiently small that the floating-point calculation (stop-start)/step had a final error less than .05, then it would be possible to round (stop-start)/step to the nearest qualifying multiple and then to apply ceil to that.
If you have such information, you can update the question with what you know about the errors in start, stop, and step (e.g., perhaps each of them is the result of a single conversion from decimal to floating-point) and the possible values of the ideal (stop-start)/step. If you do not have such information, there is no solution.
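For instance, if you knew that the ideal (stop-start)/step is exactly an integer and that the accumulated floating-point error is well below 0.5, you could round instead of taking ceil; a sketch (the helper name is just illustrative):

import numpy as np

def arange_known_integer_count(start, stop, step):
    # assumes the ideal (stop - start) / step is an integer and the
    # rounding error in computing it is much smaller than 0.5
    n = round((stop - start) / step)
    return start + step * np.arange(n)

print(arange_known_integer_count(0.1, 0.4, 0.1))   # [0.1 0.2 0.3]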
If you are guaranteed that (stop-start) is a multiple of step, then you can use the decimal module to compute the number of steps, i.e.
import numpy as np
from decimal import Decimal

def arange(start, stop, step):
    steps = (Decimal(stop) - Decimal(start)) / Decimal(step)
    if steps % 1 != 0:
        raise ValueError("step is not a multiple of stop-start")
    return np.linspace(float(start), float(stop), int(steps), endpoint=False)

print(arange('0.1', '0.4', '0.1'))
If you have an exact representation of your ends and step and if they are rational you can use the fractions module:
>>> from fractions import Fraction
>>>
>>> a = Fraction('1.0000000100000000042')
>>> b = Fraction('1.0000002100000000002')
>>> c = Fraction('0.0000000099999999998') * 5 / 3
>>>
>>> float(a) + float(c) * np.arange(int((b-a)/c))
array([1.00000001, 1.00000003, 1.00000004, 1.00000006, 1.00000008,
1.00000009, 1.00000011, 1.00000013, 1.00000014, 1.00000016,
1.00000018, 1.00000019])
>>>
>>> eps = Fraction(1, 10**100)
>>> b2 = b - eps
>>> float(a) + float(c) * np.arange(int((b2-a)/c))
array([1.00000001, 1.00000003, 1.00000004, 1.00000006, 1.00000008,
1.00000009, 1.00000011, 1.00000013, 1.00000014, 1.00000016,
1.00000018])
If not, you'll have to settle for some form of cutoff:
>>> a = 1.0
>>> b = 1.003999999
>>> c = 0.001
>>>
# cut off at 4 decimals
>>> round(float((b-a)/c), 4)
4.0
# cut off at 6 decimals
>>> round(float((b-a)/c), 6)
3.999999
You can round numbers to arbitrary degrees of precision in Python using the format function.
For example, if you want the first three digits of e after the decimal place, you can run
float(format(np.e, '.3f'))
Use this to eliminate float imprecisions and you should be good to go.

why (0.0006*100000)%10 is 10

When I did (0.0006*100000)%10 and (0.0003*100000)%10 in Python, they returned 9.999999999999993 and 9.999999999999996 respectively, but actually both should be 0.
Similarly, in C++, fmod(0.0003*100000, 10) gives the value as 10. Can someone help me figure out where I'm going wrong?
The closest IEEE 754 64-bit binary number to 0.0003 is 0.0002999999999999999737189393389513725196593441069126129150390625. The closest representable number to the result of multiplying it by 100000 is 29.999999999999996447286321199499070644378662109375.
There are a number of operations, such as floor and mod, that can make very low significance differences very visible. You need to be careful using them in connection with floating point numbers - remember that, in many cases, you have a very, very close approximation to the infinite precision value, not the infinite precision value itself. The actual value can be slightly high or, as in this case, slightly low.
Just to give the obvious answer: 0.0006 and 0.0003 are not representable in a machine double (at least on modern machines). So you didn't actually multiply by those values, but by some value very close. Slightly more, or slightly less, depending on how the compiler rounded them.
May I suggest using the remainder function in C?
It will compute the remainder after rounding the quotient to nearest integer, with exact computation (no rounding error):
remainder = dividend - round(dividend/divisor)*divisor
This way, your result will be in [-divisor/2,+divisor/2] interval.
This will still emphasize the fact that you don't get a float exactly equal to 6/10,000, but maybe in a less surprising way when you expect a null remainder:
remainder(0.0006*100000,10.0) -> -7.105427357601002e-15
remainder(0.0003*100000,10.0) -> -3.552713678800501e-15
I don't know of such remainder function support in Python, but there seems to be a match in the gnulib-python module (to be verified...)
https://github.com/ghostmansd/gnulib-python/blob/master/modules/remainder
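Assuming Python 3.7 or later, math.remainder implements the same IEEE-754 remainder operation, so the values above can be reproduced directly:

import math

print(math.remainder(0.0006 * 100000, 10.0))   # about -7.1e-15, not 0
print(math.remainder(0.0003 * 100000, 10.0))   # about -3.6e-15, not 0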
EDIT
Why does it apparently work for every other N/10,000 with N in the [1,9] interval, but not for 3 and 6?
It's not complete luck; this reflects good properties of IEEE 754 in the default rounding mode (round to nearest, ties to even).
The result of a floating point operation is rounded to nearest floating point value.
Instead of N/D you thus get (N/D + err), where the absolute error err is given by this snippet (I'm more comfortable in Smalltalk, but I'm sure you will find the equivalent in Python):
| d |
d := 10000.
^(1 to: 9) collect: [:n | ((n/d) asFloat asFraction - (n/d)) asFloat]
It gives you something like:
#(4.79217360238593e-21 9.58434720477186e-21 -2.6281060661048628e-20 1.916869440954372e-20 1.0408340855860843e-20 -5.2562121322097256e-20 -7.11236625150491e-21 3.833738881908744e-20 -2.4633073358870662e-20)
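A Python equivalent of the snippet above, using the fractions module (Fraction(x) on a float recovers its exact rational value), would be along these lines:

from fractions import Fraction

d = 10000
errors = [float(Fraction(n / d) - Fraction(n, d)) for n in range(1, 10)]
print(errors)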
Changing the last bit of a floating point significand leads to a small difference named the unit of least precision (ulp), and it might be good to express the error in term of ulp:
| d |
d := 10000.
^(1 to: 9) collect: [:n | ((n/d) asFloat asFraction - (n/d)) / (n/d) asFloat ulp]
The number of ulps by which each value is off from the exact fraction is thus:
#(0.3536 0.3536 -0.4848 0.3536 0.096 -0.4848 -0.0656 0.3536 -0.2272)
The error is the same for N=1,2,4,8 because they are essentially the same floating point - same significand, just the exponent changes.
It's also the same for N=3 and 6 for same reason, but very near the maximum error for a single operation which is 0.5 ulp (unluckily the number can be half way between two floats).
For N=9, the relative error is smaller than for N=1, and for 5 and 7, the error is very small.
Now when we multiply these approximations by 10000, which is exactly representable as a float, (N/D + err)*D is N + D*err, and it's then rounded to the nearest float. If D*err is less than half the distance to the next float, then this is rounded to N and the rounding error vanishes.
| d |
d := 10000.
^(1 to: 9) collect: [:n | ((n/d) asFloat asFraction - (n/d)) * d / n asFloat ulp]
OK, we were unlucky for N=3 and 6, the already high rounding error magnitude has become greater than 0.5 ulp:
#(0.2158203125 0.2158203125 -0.591796875 0.2158203125 0.1171875 -0.591796875 -0.080078125 0.2158203125 -0.138671875)
Beware, the distance is not symmetric for exact powers of two, the next float after 1.0 is 1.0+2^-52, but before 1.0 it's 1.0-2^-53.
Nonetheless, what we see here is that after the second rounding operation, the error vanished in four cases and accumulated in only a single case (counting only the cases with different significands).
We can generalize that result. As long as we do not sum numbers with very different exponents, but just use multiply/divide operations, then while the error bound can be high after P operations, the statistical distribution of accumulated errors has a remarkably narrow peak compared to this bound, and the results are surprisingly good with respect to what we regularly read about float imprecision. See my answer to The number of correct decimal digits in a product of doubles with a large number of terms for example.
I just wanted to mention that yes, floats are inexact, but they sometimes do such a decent job that they foster an illusion of exactness. Finding a few outliers like the ones mentioned in this post is then surprising. The sooner the surprise, the smaller the surprise. Ah, if only floats were implemented less carefully, there would be fewer questions in this category...

Incremented floats do not equal each other [duplicate]

Possible Duplicate:
Why can't decimal numbers be represented exactly in binary?
Program not entering if statement
So I'm trying to run a program that has two variables; when one variable is equal to the other, it performs a function (in this case, printing spam). However, for some reason, when I run this program, I'm not getting any output even though I know they are equal.
g = 0.0
b = 3.0
while g < 30.0:
    if g == b:
        print "Hi"
    g += .1
    print g, b
You are assuming that adding .1 enough times to 0.0 will produce 3.0. These are floating point numbers, they are inaccurate. Rounding errors make it so that the value is never exactly equal to 3.0. You should almost never use == to test floating point numbers.
A good way to do this is to count with integer values (e.g., loop with i from 0 to 300 by 1) and scale the counter only when the float value is used (e.g., set f = i * .1). When you do this, the loop counter is always exact, so you get exactly the iterations you want, and there is only one floating-point rounding, which does not accumulate from iteration to iteration.
The loop counter is most commonly an integer type, so that addition is easily seen to be exact (until overflow is reached). However, the loop counter may also be a floating-point type, provided you are sure that the values and operations for it are exact. (The common 32-bit floating-point format represents integers exactly from -2^24 to +2^24. Outside that, it does not have the precision to represent integers exactly. It does not represent .1 exactly, so you cannot count with increments of .1. But you could count with increments of .5, .25, .375, or other small multiples of moderate powers of two, which are represented exactly.)
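A sketch of that approach applied to the loop from the question (the equality test is done on the exact integer counter, not on the derived float):

b = 3.0
for i in range(301):        # exact integer counter: 0, 1, ..., 300
    g = i * 0.1             # one rounding per value, no accumulation
    if i == 30:             # compare the exact counters instead of the floats
        print("Hi", g, b)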
To expand on Karoly Horvath's comment, what you can do to test near-equality is choose some value (let's call it epsilon) that is very, very small relative to the minimum increment. Let's say epsilon is 1.0 * 10^-6, five orders of magnitude smaller than your increment. (It should probably be based on the average rounding error of your floating point representation, but that varies, and this is simply an example).
What you then do is check whether g and b differ by less than epsilon. If they do, they are close enough to be practically equal, the difference between practically and actually being the rounding error, which you're approximating with epsilon.
Check for
abs(g - b) < epsilon
and you'll have your almost-but-not-quite equality check, which should be good enough for most purposes.
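Applied to the loop in the question, a minimal sketch (with epsilon chosen as in the example above):

epsilon = 1.0e-6   # tolerance, several orders of magnitude below the 0.1 increment

g = 0.0
b = 3.0
while g < 30.0:
    if abs(g - b) < epsilon:   # near-equality check instead of ==
        print("Hi")
    g += .1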
