Related
I'm working with floating point numbers. If I do:
import numpy as np
np.round(100.045, 2)
I get:
Out[15]: 100.04
Obviously, this should be 100.05. I know about the existence of IEEE 754 and that the way that floating point numbers are stored is the cause of this rounding error.
My question is: how can I avoid this error?
You are partly right, often the cause of this "incorrect rounding" is because of the way floating point numbers are stored. Some float literals can be represented exactly as floating point numbers while others cannot.
>>> a = 100.045
>>> a.as_integer_ratio() # not exact
(7040041011254395, 70368744177664)
>>> a = 0.25
>>> a.as_integer_ratio() # exact
(1, 4)
It's also important to know that there is no way you can restore the literal you used (100.045) from the resulting floating point number. So the only thing you can do is to use an arbitrary precision data type instead of the literal. For example you could use Fraction or Decimal (just to mention two built-in types).
I mentioned that you cannot restore the literal once it is parsed as float - so you have to input it as string or something else that represents the number exactly and is supported by these data types:
>>> from fractions import Fraction
>>> f = Fraction(100045, 100)
>>> f
Fraction(20009, 20)
>>> f = Fraction("100.045")
>>> f
Fraction(20009, 20)
>>> from decimal import Decimal
>>> Decimal("100.045")
Decimal('100.045')
However these don't work well with NumPy and even if you get it to work at all - it will almost certainly be very slow compared to basic floating point operations.
>>> import numpy as np
>>> a = np.array([Decimal("100.045") for _ in range(1000)])
>>> np.round(a)
AttributeError: 'decimal.Decimal' object has no attribute 'rint'
In the beginning I said that you're are only partly right. There is another twist!
You mentioned that rounding 100.045 will obviously give 100.05. But that's not obvious at all, in your case it is even wrong (in the context of floating point math in programming - it would be true for "normal calculations"). In many programming languages a "half" value (where the number after the decimal you're rounding is 5) isn't always rounded up - for example Python (and NumPy) use a "round half to even" approach because it's less biased. For example 0.5 will be rounded to 0 while 1.5 will be rounded to 2.
So even if 100.045 could be represented exactly as float - it would still round to 100.04 because of that rounding rule!
>>> round(Fraction("100.045"), 1)
Fraction(5002, 5)
>>> 5002 / 5
1000.4
>>> d = Decimal("100.045")
>>> round(d, 2)
Decimal('100.04')
This is even mentioned in the NumPy docs for numpy.around:
Notes
For values exactly halfway between rounded decimal values, NumPy rounds to the nearest even value. Thus 1.5 and 2.5 round to 2.0, -0.5 and 0.5 round to 0.0, etc. Results may also be surprising due to the inexact representation of decimal fractions in the IEEE floating point standard [R1011] and errors introduced when scaling by powers of ten.
(Emphasis mine.)
The only (at least that I know) numeric type in Python that allows setting the rounding rule manually is Decimal - via ROUND_HALF_UP:
>>> from decimal import Decimal, getcontext, ROUND_HALF_UP
>>> dc = getcontext()
>>> dc.rounding = ROUND_HALF_UP
>>> d = Decimal("100.045")
>>> round(d, 2)
Decimal('100.05')
Summary
So to avoid the "error" you have to:
Prevent Python from parsing it as floating point value and
use a data type that can represent it exactly
then you have to manually override the default rounding mode so that you will get rounding up for "halves".
(abandon NumPy because it doesn't have arbitrary precision data types)
Basically there is no general solution for this problem IMO, unless you have a general rule for all the different cases (see Floating Point Arithmetic: Issues and Limitation). However, in this case you can round the decimal part separately:
In [24]: dec, integ = np.modf(100.045)
In [25]: integ + np.round(dec, 2)
Out[25]: 100.05
The reason for such behavior is not because separating integer from decimal part makes any difference on round()'s logic. It's because when you use fmod it gives you a more realistic version of the decimal part of the number which is actually a rounded representation.
In this case here is what dec is:
In [30]: dec
Out[30]: 0.045000000000001705
And you can check that round gives same result with 0.045:
In [31]: round(0.045, 2)
Out[31]: 0.04
Now if you try with another number like 100.0333, the decimal part is a slightly smaller version which as I mentioned, the result you want depends on your rounding policies.
In [37]: dec, i = np.modf(100.0333)
In [38]: dec
Out[38]: 0.033299999999997
There are also modules like fractions and decimal that provide support for fast correctly-rounded decimal floating point and rational arithmetic, that you can use in situations as such.
This is not a bug, but a feature )))
you can simple use this trick:
def myround(val):
"Fix pythons round"
d,v = math.modf(val)
if d==0.5:
val += 0.000000001
return round(val)
I'm pretty new to python, and I've made a table which calculates T=1+2^-n-1 and C=2^n, which both give the same values from n=40 to n=52, but for n=52 to n=61 I get 0.0 for T, whereas C gives me progressively smaller decimals each time - why is this?
I think I understand why T becomes 0.0, because of python using binary floating point and because of the machine epsilon value - but I'm slightly confused as to why C doesn't also become 0.0.
import numpy as np
import math
t=np.zeros(21)
c=np.zeros(21)
for n in range(40,61):
m=n-40
t[m]=1+2**(-n)-1
c[m]=2**(-n)
print (n,t[m],c[m])
The "floating" in floating point means that values are represented by storing a fixed number of leading digits and a scale factor, rather than assuming a fixed scale (which would be fixed point).
2**-53 only takes one (binary) digit to represent (not including the scale), but 1+2**-53 would take 54 to represent exactly. Python floats only have 53 binary digits of precision; 2**-53 can be represented exactly, but 1+2**-53 gets rounded to exactly 1, and subtracting 1 from that gives exactly 0. Thus, we have
>>> 2**-53
1.1102230246251565e-16
>>> 1+(2**-53)-1
0.0
Postscript: you might wonder why 2**-53 displays as a value not equal to the exact mathematical value when I said it was exact. That's due to the float->string conversion logic, which only keeps enough decimal digits to reconstruct the original float (instead of printing a bunch of digits at the end that are usually just noise).
The difference between both is indeed due to floating-point representation. Indeed, if you perform 1 + X where X is a very very small number, then the floating-point representation sets its exponent value to 0 and the precision is ensured by the mantissa, which is 52-bit on a 64-bit computer. Therefore, 1 + 2^(-X) if X > 52 is equal to 1. However, even 2^-100 can be represented in double-precision floating-point, so you can see C decrease for a larger number of samples.
I am new to Python and while experimenting with operators, I came across this:
>>> 7.0 / 3
2.3333333333333335
Shouldn't the result be 2.3333333333333333 or maybe 2.3333333333333334. Why is it rounding the number in such a way?
Also, with regard to floor division in Python 2.7 my results were:
>>> 5 / 2
2
>>> 5 // 2
2
>>> 5.0 / 2
2.5
>>> 5.0 // 2
2.0
So my observation is that floor division returns the integer quotient even in case of floating numbers, while normal division return the decimal value. Is this true?
Take a look at this 0.30000000000000004.com
Your language isn't broken, it's doing floating point math. Computers can only natively store integers, so they need some way of representing decimal numbers. This representation comes with some degree of inaccuracy. That's why, more often than not, .1 + .2 != .3.
Shouldn't the result be 2.3333333333333333 or maybe 2.3333333333333334. Why is it rounding the number in such a way?
The key is the number is being rounded twice.
The first rounding is part of the division operation, rounding the number to the nearest double-precision floating point value. This is a binary operation not a decimal one.
The second rounding is part of converting the floating point number to a decimal representation for display. It is possible to represent the exact value of any binary fraction in decimal, but it is usually not desirable as in most applications doing so will simply result in many digits of false-precision. Python instead outputs the shortest decimal approximation that will round-trip to the correct floating point value.
We can better see what is going on by using the Fraction and Decimal types, unlike converting directly to a string converting a floating point number to a Fraction or Decimal will give the exact value. We can also use the Fraction type to determine the error in our calculation.
>>> from fractions import Fraction
>>> from decimal import Decimal
>>> 7.0 / 3
2.3333333333333335
>>> Decimal(7.0 / 3)
Decimal('2.333333333333333481363069950020872056484222412109375')
>>> Fraction(7.0 / 3)
Fraction(5254199565265579, 2251799813685248)
>>> Fraction(7,3) - Fraction(7.0 / 3)
Fraction(-1, 6755399441055744)
The conversion via type Decimal shows us the exact value of the floating point number and demonstrates the many digits of false-precision that typically result from exact conversion of a floating point value to decimal.
The conversion to a Fraction is also interesting, the denominator is 2251799813685248 which is equivalent to 251. This makes perfect sense, a Double precision floating point has 53 effective bits of mantissa and we need two of those for the integral part of the result leaving 51 for the fractional part.
The error in our floating point calculation is 1/6755399441055744 or ⅓ * 2-51. This error is less than half our precision step of 2-51 so the answer was indeed correctly rounded to a double precision floating point value.
Is there a way to round a python float to x decimals? For example:
>>> x = roundfloat(66.66666666666, 4)
66.6667
>>> x = roundfloat(1.29578293, 6)
1.295783
I've found ways to trim/truncate them (66.666666666 --> 66.6666), but not round (66.666666666 --> 66.6667).
I feel compelled to provide a counterpoint to Ashwini Chaudhary's answer. Despite appearances, the two-argument form of the round function does not round a Python float to a given number of decimal places, and it's often not the solution you want, even when you think it is. Let me explain...
The ability to round a (Python) float to some number of decimal places is something that's frequently requested, but turns out to be rarely what's actually needed. The beguilingly simple answer round(x, number_of_places) is something of an attractive nuisance: it looks as though it does what you want, but thanks to the fact that Python floats are stored internally in binary, it's doing something rather subtler. Consider the following example:
>>> round(52.15, 1)
52.1
With a naive understanding of what round does, this looks wrong: surely it should be rounding up to 52.2 rather than down to 52.1? To understand why such behaviours can't be relied upon, you need to appreciate that while this looks like a simple decimal-to-decimal operation, it's far from simple.
So here's what's really happening in the example above. (deep breath) We're displaying a decimal representation of the nearest binary floating-point number to the nearest n-digits-after-the-point decimal number to a binary floating-point approximation of a numeric literal written in decimal. So to get from the original numeric literal to the displayed output, the underlying machinery has made four separate conversions between binary and decimal formats, two in each direction. Breaking it down (and with the usual disclaimers about assuming IEEE 754 binary64 format, round-ties-to-even rounding, and IEEE 754 rules):
First the numeric literal 52.15 gets parsed and converted to a Python float. The actual number stored is 7339460017730355 * 2**-47, or 52.14999999999999857891452847979962825775146484375.
Internally as the first step of the round operation, Python computes the closest 1-digit-after-the-point decimal string to the stored number. Since that stored number is a touch under the original value of 52.15, we end up rounding down and getting a string 52.1. This explains why we're getting 52.1 as the final output instead of 52.2.
Then in the second step of the round operation, Python turns that string back into a float, getting the closest binary floating-point number to 52.1, which is now 7332423143312589 * 2**-47, or 52.10000000000000142108547152020037174224853515625.
Finally, as part of Python's read-eval-print loop (REPL), the floating-point value is displayed (in decimal). That involves converting the binary value back to a decimal string, getting 52.1 as the final output.
In Python 2.7 and later, we have the pleasant situation that the two conversions in step 3 and 4 cancel each other out. That's due to Python's choice of repr implementation, which produces the shortest decimal value guaranteed to round correctly to the actual float. One consequence of that choice is that if you start with any (not too large, not too small) decimal literal with 15 or fewer significant digits then the corresponding float will be displayed showing those exact same digits:
>>> x = 15.34509809234
>>> x
15.34509809234
Unfortunately, this furthers the illusion that Python is storing values in decimal. Not so in Python 2.6, though! Here's the original example executed in Python 2.6:
>>> round(52.15, 1)
52.200000000000003
Not only do we round in the opposite direction, getting 52.2 instead of 52.1, but the displayed value doesn't even print as 52.2! This behaviour has caused numerous reports to the Python bug tracker along the lines of "round is broken!". But it's not round that's broken, it's user expectations. (Okay, okay, round is a little bit broken in Python 2.6, in that it doesn't use correct rounding.)
Short version: if you're using two-argument round, and you're expecting predictable behaviour from a binary approximation to a decimal round of a binary approximation to a decimal halfway case, you're asking for trouble.
So enough with the "two-argument round is bad" argument. What should you be using instead? There are a few possibilities, depending on what you're trying to do.
If you're rounding for display purposes, then you don't want a float result at all; you want a string. In that case the answer is to use string formatting:
>>> format(66.66666666666, '.4f')
'66.6667'
>>> format(1.29578293, '.6f')
'1.295783'
Even then, one has to be aware of the internal binary representation in order not to be surprised by the behaviour of apparent decimal halfway cases.
>>> format(52.15, '.1f')
'52.1'
If you're operating in a context where it matters which direction decimal halfway cases are rounded (for example, in some financial contexts), you might want to represent your numbers using the Decimal type. Doing a decimal round on the Decimal type makes a lot more sense than on a binary type (equally, rounding to a fixed number of binary places makes perfect sense on a binary type). Moreover, the decimal module gives you better control of the rounding mode. In Python 3, round does the job directly. In Python 2, you need the quantize method.
>>> Decimal('66.66666666666').quantize(Decimal('1e-4'))
Decimal('66.6667')
>>> Decimal('1.29578293').quantize(Decimal('1e-6'))
Decimal('1.295783')
In rare cases, the two-argument version of round really is what you want: perhaps you're binning floats into bins of size 0.01, and you don't particularly care which way border cases go. However, these cases are rare, and it's difficult to justify the existence of the two-argument version of the round builtin based on those cases alone.
Use the built-in function round():
In [23]: round(66.66666666666,4)
Out[23]: 66.6667
In [24]: round(1.29578293,6)
Out[24]: 1.295783
help on round():
round(number[, ndigits]) -> floating point number
Round a number to a given precision in decimal digits (default 0
digits). This always returns a floating point number. Precision may
be negative.
Default rounding in python and numpy:
In: [round(i) for i in np.arange(10) + .5]
Out: [0, 2, 2, 4, 4, 6, 6, 8, 8, 10]
I used this to get integer rounding to be applied to a pandas series:
import decimal
and use this line to set the rounding to "half up" a.k.a rounding as taught in school:
decimal.getcontext().rounding = decimal.ROUND_HALF_UP
Finally I made this function to apply it to a pandas series object
def roundint(value):
return value.apply(lambda x: int(decimal.Decimal(x).to_integral_value()))
So now you can do roundint(df.columnname)
And for numbers:
In: [int(decimal.Decimal(i).to_integral_value()) for i in np.arange(10) + .5]
Out: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Credit: kares
The Mark Dickinson answer, although complete, didn't work with the float(52.15) case. After some tests, there is the solution that I'm using:
import decimal
def value_to_decimal(value, decimal_places):
decimal.getcontext().rounding = decimal.ROUND_HALF_UP # define rounding method
return decimal.Decimal(str(float(value))).quantize(decimal.Decimal('1e-{}'.format(decimal_places)))
(The conversion of the 'value' to float and then string is very important, that way, 'value' can be of the type float, decimal, integer or string!)
Hope this helps anyone.
I coded a function (used in Django project for DecimalField) but it can be used in Python project :
This code :
Manage integers digits to avoid too high number
Manage decimals digits to avoid too low number
Manage signed and unsigned numbers
Code with tests :
def convert_decimal_to_right(value, max_digits, decimal_places, signed=True):
integer_digits = max_digits - decimal_places
max_value = float((10**integer_digits)-float(float(1)/float((10**decimal_places))))
if signed:
min_value = max_value*-1
else:
min_value = 0
if value > max_value:
value = max_value
if value < min_value:
value = min_value
return round(value, decimal_places)
value = 12.12345
nb = convert_decimal_to_right(value, 4, 2)
# nb : 12.12
value = 12.126
nb = convert_decimal_to_right(value, 4, 2)
# nb : 12.13
value = 1234.123
nb = convert_decimal_to_right(value, 4, 2)
# nb : 99.99
value = -1234.123
nb = convert_decimal_to_right(value, 4, 2)
# nb : -99.99
value = -1234.123
nb = convert_decimal_to_right(value, 4, 2, signed = False)
# nb : 0
value = 12.123
nb = convert_decimal_to_right(value, 8, 4)
# nb : 12.123
def trim_to_a_point(num, dec_point):
factor = 10**dec_point # number of points to trim
num = num*factor # multiple
num = int(num) # use the trimming of int
num = num/factor #divide by the same factor of 10s you multiplied
return num
#test
a = 14.1234567
trim_to_a_point(a, 5)
output
========
14.12345
multiple by 10^ decimal point you want
truncate with int() method
divide by the same number you multiplied before
done!
Just posted this for educational reasons i think it is correct though :)
I experienced this phenomenon in Python first, but it turned out that it is the common answer, for example MS Excel gives this. Wolfram Alpha gives an interesting schizoid answer, where it states that the rational approximation of zero is 1/5. ( 1.0 mod 0.1 )
On the other hand, if I implement the definition by hand it gives me the 'right' answer (0).
def myFmod(a,n):
return a - floor(a/n) * n
What is going on here. Do I miss something?
Because 0.1 isn't 0.1; that value isn't representable in double precision, so it gets rounded to the nearest double-precision number, which is exactly:
0.1000000000000000055511151231257827021181583404541015625
When you call fmod, you get the remainder of division by the value listed above, which is exactly:
0.0999999999999999500399638918679556809365749359130859375
which rounds to 0.1 (or maybe 0.09999999999999995) when you print it.
In other words, fmod works perfectly, but you're not giving it the input that you think you are.
Edit: Your own implementation gives you the correct answer because it is less accurate, believe it or not. First off, note that fmod computes the remainder without any rounding error; the only source of inaccuracy is the representation error introduced by using the value 0.1. Now, let's walk through your implementation, and see how the rounding error that it incurs exactly cancels out the representation error.
Evaluate a - floor(a/n) * n one step at a time, keeping track of the exact values computed at each stage:
First we evaluate 1.0/n, where n is the closest double-precision approximation to 0.1 as shown above. The result of this division is approximately:
9.999999999999999444888487687421760603063276150363492645647081359...
Note that this value is not a representable double precision number -- so it gets rounded. To see how this rounding happens, let's look at the number in binary instead of decimal:
1001.1111111111111111111111111111111111111111111111111 10110000000...
The space indicates where the rounding to double precision occurs. Since the part after the round point is larger than the exact half-way point, this value rounds up to exactly 10.
floor(10.0) is, predictably, 10.0. So all that's left is to compute 1.0 - 10.0*0.1.
In binary, the exact value of 10.0 * 0.1 is:
1.0000000000000000000000000000000000000000000000000000 0100
again, this value is not representable as a double, and so is rounded at the position indicated by a space. This time it rounds down to exactly 1.0, and so the final computation is 1.0 - 1.0, which is of course 0.0.
Your implementation contains two rounding errors, which happen to exactly cancel out the representation error of the value 0.1 in this case. fmod, by contrast, is always exact (at least on platforms with a good numerics library), and exposes the representation error of 0.1.
This result is due to machine floating point representation. In your method, you are 'casting' (kinda) the float to an int and do not have this issue. The 'best' way to avoid such issues (esp for mod) is to multiply by a sufficiently large enough int (only 10 is needed in your case) and perform the operation again.
fmod(1.0,0.1)
fmod(10.0,1.0) = 0
From man fmod:
The fmod() function computes the floating-point remainder of dividing x
by y. The return value is x - n * y, where n is the quotient of x / y,
rounded towards zero to an integer.
So what happens is:
In fmod(1.0, 0.1), the 0.1 is actually slightly larger than 0.1 because the value cannot be exactly represented as a float.
So n = x / y = 1.0 / 0.1000something = 9.9999something
When rounded towards 0, n actually becomes 9
x - n * y = 1.0 - 9 * 0.1 = 0.1
Edit: As for why it works with floor(x/y), as far as I can tell this seems to be an FPU quirk. On x86, fmod uses the fprem instruction, whereas x/y will use fdiv. Curiously 1.0/0.1 seems to return exactly 10.0:
>>> struct.pack('d', 1.0/0.1) == struct.pack('d', 10.0)
True
I suppose fdiv uses a more precise algorithm than fprem. Some discussion can be found here: http://www.rapideuphoria.com/cgi-bin/esearch.exu?thread=1&fromMonth=A&fromYear=8&toMonth=C&toYear=8&keywords=%22Remainder%22
fmod returns x-i*y, which is less than y, and i is an integer. 0.09.... is because of floating point precision. try fmod(0.3, 0.1) -> 0.09... but fmod(0.4, 0.1) -> 0.0 because 0.3 is 0.2999999... as a float.
fmod(1/(2.**n), 1/(2.**m) will never produce anything but 0.0 for integer n>=m.
This gives the right answer:
a = 1.0
b = 0.1
a1,a2 = a.as_integer_ratio()
b1,b2 = b.as_integer_ratio()
div = float(a1*b2) / float(a2*b1)
mod = a - b*div
print mod
# 0.0
I think it works because by it uses rational equivalents of the two floating point numbers which provides a more accurate answer.
The Python divmod function is instructive here. It tells you both the quotient and remainder of a division operation.
$ python
>>> 0.1
0.10000000000000001
>>> divmod(1.0, 0.1)
(9.0, 0.09999999999999995)
When you type 0.1, the computer can't represent that exact value in binary floating-point arithmetic, so it chooses the closest number that it can represent, 0.10000000000000001. Then when you perform the division operation, floating-point arithmetic decides that the quotient has to be 9, since 0.10000000000000001 * 10 is larger than 1.0. This leaves you with a remainder that is slightly less than 0.1.
I would want to use the new Python fractions module to get exact answers.
>>> from fractions import Fraction
>>> Fraction(1, 1) % Fraction(1, 10)
Fraction(0, 1)
IOW, (1/1) mod (1/10) = (0/1), which is equivalent to 1 mod 0.1 = 0.
Another option is to implement the modulus operator yourself, allowing you to specify your own policy.
>>> x = 1.0
>>> y = 0.1
>>> x / y - math.floor(x / y)
0.0