I'm working with floating point numbers. If I do:
import numpy as np
np.round(100.045, 2)
I get:
Out[15]: 100.04
Obviously, this should be 100.05. I know about the existence of IEEE 754 and that the way that floating point numbers are stored is the cause of this rounding error.
My question is: how can I avoid this error?
You are partly right: often the cause of this "incorrect rounding" is the way floating point numbers are stored. Some float literals can be represented exactly as floating point numbers while others cannot.
>>> a = 100.045
>>> a.as_integer_ratio() # not exact
(7040041011254395, 70368744177664)
>>> a = 0.25
>>> a.as_integer_ratio() # exact
(1, 4)
It's also important to know that there is no way to restore the literal you typed (100.045) from the resulting floating point number. So the only thing you can do is to use an exact, arbitrary-precision data type instead of the float literal. For example, you could use Fraction or Decimal (to mention two standard-library types).
I mentioned that you cannot restore the literal once it is parsed as a float, so you have to input it as a string or something else that represents the number exactly and is supported by these data types:
>>> from fractions import Fraction
>>> f = Fraction(100045, 100)
>>> f
Fraction(20009, 20)
>>> f = Fraction("100.045")
>>> f
Fraction(20009, 20)
>>> from decimal import Decimal
>>> Decimal("100.045")
Decimal('100.045')
However, these don't work well with NumPy, and even where you can get them to work at all, the operations will almost certainly be very slow compared to basic floating point operations.
>>> import numpy as np
>>> a = np.array([Decimal("100.045") for _ in range(1000)])
>>> np.round(a)
AttributeError: 'decimal.Decimal' object has no attribute 'rint'
At the beginning I said that you're only partly right. There is another twist!
You mentioned that rounding 100.045 should obviously give 100.05. But that's not obvious at all; in the context of floating point math in programming it is even wrong (though it would be true for ordinary pen-and-paper arithmetic). In many programming languages a "half" value (where the digit after the place you're rounding to is 5) isn't always rounded up. Python (and NumPy) use a "round half to even" approach because it's less biased: for example, 0.5 is rounded to 0 while 1.5 is rounded to 2.
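You can see this tie-breaking rule directly at the prompt (all of these halves are exactly representable in binary, so the outputs are deterministic):
>>> round(0.5), round(1.5), round(2.5), round(3.5)
(0, 2, 2, 4)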
So even if 100.045 could be represented exactly as float - it would still round to 100.04 because of that rounding rule!
>>> round(Fraction("100.045"), 2)
Fraction(2501, 25)
>>> 2501 / 25
100.04
>>> d = Decimal("100.045")
>>> round(d, 2)
Decimal('100.04')
This is even mentioned in the NumPy docs for numpy.around:
Notes
For values exactly halfway between rounded decimal values, NumPy rounds to the nearest even value. Thus 1.5 and 2.5 round to 2.0, -0.5 and 0.5 round to 0.0, etc. Results may also be surprising due to the inexact representation of decimal fractions in the IEEE floating point standard [R1011] and errors introduced when scaling by powers of ten.
The only (at least that I know) numeric type in Python that allows setting the rounding rule manually is Decimal - via ROUND_HALF_UP:
>>> from decimal import Decimal, getcontext, ROUND_HALF_UP
>>> dc = getcontext()
>>> dc.rounding = ROUND_HALF_UP
>>> d = Decimal("100.045")
>>> round(d, 2)
Decimal('100.05')
Summary
So to avoid the "error" you have to:
prevent Python from parsing the value as a floating point literal, and
use a data type that can represent it exactly;
then you have to manually override the default rounding mode so that halves are rounded up
(and abandon NumPy, because it doesn't have arbitrary-precision data types).
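Putting those steps together, a minimal sketch in plain Python (the sample strings and the two-place quantum Decimal("0.01") are my own choices):
from decimal import Decimal, ROUND_HALF_UP

values = ["100.045", "0.5", "1.5"]  # keep the literals as strings, never floats
rounded = [Decimal(v).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP) for v in values]
print(rounded)  # [Decimal('100.05'), Decimal('0.50'), Decimal('1.50')]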
Basically there is no general solution to this problem IMO, unless you have a general rule for all the different cases (see Floating Point Arithmetic: Issues and Limitations). However, in this case you can round the decimal part separately:
In [24]: dec, integ = np.modf(100.045)
In [25]: integ + np.round(dec, 2)
Out[25]: 100.05
The reason this behaves differently is not that separating the integer from the decimal part changes round()'s logic. It's that np.modf hands you the fractional part on its own, where the tiny excess above 0.045 in the stored value survives and tips the rounding upward; in the full number 100.045 that excess is too small to survive the scaling that round performs.
In this case here is what dec is:
In [30]: dec
Out[30]: 0.045000000000001705
Compare that with the bare literal 0.045, whose stored value is slightly below 0.045 and therefore rounds down:
In [31]: round(0.045, 2)
Out[31]: 0.04
Now if you try another number like 100.0333, the fractional part comes out slightly smaller than the literal, and, as mentioned, which result you get depends on your rounding policy.
In [37]: dec, i = np.modf(100.0333)
In [38]: dec
Out[38]: 0.033299999999997
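If you want this trick in one place, a small helper might look like this (round_via_modf is a made-up name, and it inherits the same caveat: the result still depends on how each fractional part happens to be represented):
import numpy as np

def round_via_modf(x, decimals=2):
    # Round only the fractional part, then reattach the integer part.
    frac, whole = np.modf(x)
    return whole + np.round(frac, decimals)

print(round_via_modf(100.045))   # 100.05
print(round_via_modf(100.0333))  # 100.03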
There are also the fractions and decimal modules, which provide support for rational arithmetic and for fast correctly-rounded decimal floating point arithmetic, and which you can use in situations like this.
This is not a bug, but a feature )))
You can simply use this trick:
import math

def myround(val):
    """Work around Python's round-half-to-even for positive halves."""
    frac, whole = math.modf(val)
    if frac == 0.5:
        # Nudge the value just past the halfway point so round() goes up.
        val += 0.000000001
    return round(val)
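Used at the prompt (note the nudge only handles positive halves; negative halves fall through to the default round-half-to-even):
>>> myround(0.5), myround(2.5)
(1, 3)
>>> round(0.5), round(2.5)
(0, 2)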
I can't figure out how to specify my decimal precision for logarithms; importing decimal and setting the context does not affect the log functions.
from decimal import *
getcontext().prec = 54
print(Decimal(197)/ Decimal(83))
2.37349397590361445783132530120481927710843373493975904
print(math.log(Decimal(197)))
5.2832037287379885
I would like to set a high precision for functions other than fractions. Python 3 btw.
Possible solutions:
String formatting:
f-strings have the benefit of not requiring another package to be imported
f-Strings: A New and Improved Way to Format Strings in Python
Extend the decimal places shown to the same extent as the decimal module
x = 4567.09710599898797936589076897
y = 2445.89790870380808990080797897
Lengthen:
print(f'{(x/y):.054f}')
>>> 1.867247643389702504990168563381303101778030395507812500
calculation = math.log(197)
print(f'{calculation:.050f}')
>>> 5.28320372873798849155946300015784800052642822265625
Shorten:
print(f'{(x/y):.02f}')
>>> 1.87
numpy:
numpy.round
Shorten:
print(np.round(x/y, 2))
>>> 1.87
Lengthen:
numpy does not extend the precision shown beyond what's shown by Python.
print(np.around(x/y, 54))
>>> 1.8672476433897025
print(x/y)
>>> 1.8672476433897025
decimal Module:
Decimal fixed point and floating point arithmetic
Example from the question:
print(math.log(197))
>>> 5.2832037287379885
print(math.log(Decimal(197.0)))
>>> 5.2832037287379885
print(Decimal(math.log(197)))
>>> 5.28320372873798849155946300015784800052642822265625
print(Decimal(197).ln())
>>> 5.283203728737988506779797329
print(f'{math.log(197):.050f}')
>>> 5.28320372873798849155946300015784800052642822265625
Notes:
Either method can be used to format the number to the required decimal place, prior to writing into a log.
Caveat: due to the way numbers are represented in computers, I'm doubtful that increasing the number of decimal places shown actually increases the precision; the extra digits just spell out the nearest binary float.
Floating Point Arithmetic: Issues and Limitations
Using f-strings provided the same final output precision as using the decimal module.
Please look at the below Python code that I've entered into a Python 3.6 interpreter:
>>> 0.00225 * 100.0
0.22499999999999998
>>> '{:.2f}'.format(0.00225 * 100.0)
'0.22'
>>> '{:.2f}'.format(0.225)
'0.23'
>>> '{:.2f}'.format(round(0.00225 * 100.0, 10))
'0.23'
Hopefully you can immediately understand why I'm frustrated. I am attempting to display value * 100.0 in my GUI, storing the full precision behind a cell but only displaying 2 decimal points (or whatever the user's precision setting is). The GUI is similar to an Excel spreadsheet.
I'd prefer not to lose the precision of something like 0.22222444937645 and round by 10, but I also don't want a value such as 0.00225 * 100.0 displaying as 0.22.
I'm interested in hearing about a standard way of approaching a situation like this or a remedy for my specific situation. Thanks ahead of time for any help.
Consider using the Decimal module, which "provides support for fast correctly-rounded decimal floating point arithmetic." The primary advantages of Decimal relevant to your use case are:
Decimal numbers can be represented exactly. In contrast, numbers like 1.1 and 2.2 do not have exact representations in binary floating point. End users typically would not expect 1.1 + 2.2 to display as 3.3000000000000003 as it does with binary floating point.
The exactness carries over into arithmetic. In decimal floating point, 0.1 + 0.1 + 0.1 - 0.3 is exactly equal to zero. In binary floating point, the result is 5.5511151231257827e-017. While near to zero, the differences prevent reliable equality testing and differences can accumulate. For this reason, decimal is preferred in accounting applications which have strict equality invariants.
Based on the information you've provided in the question, I cannot say how much of an overhaul migrating to Decimal would require. However, if you're creating a spreadsheet-like application and always want to preserve maximal precision, then you will probably want to refactor to use Decimal sooner or later to avoid unexpected numbers in your user-facing GUI.
To get the behavior you desire, you may need to change the rounding mode (which defaults to ROUND_HALF_EVEN) for Decimal instances.
from decimal import Decimal, getcontext, ROUND_HALF_UP

getcontext().rounding = ROUND_HALF_UP
n = round(Decimal('0.00225') * Decimal('100'), 2)
print(n)  # prints 0.23
m = round(Decimal('0.00225') * 100, 2)
print(m)  # prints 0.23
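For the spreadsheet-like GUI, a hedged sketch of the display step (the quantum Decimal('0.01') stands in for the user's two-place precision setting): keep the full-precision Decimal behind the cell and quantize only when rendering.
from decimal import Decimal, ROUND_HALF_UP

stored = Decimal('0.00225') * 100  # full precision stays behind the cell
shown = stored.quantize(Decimal('0.01'), rounding=ROUND_HALF_UP)
print(shown)  # 0.23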
Perhaps use decimal? docs.python.org/2/library/decimal.html
from decimal import *
getcontext().prec = 2
n = Decimal.from_float(0.00225)
m = n * 100
print(n, m)
print(m.quantize(Decimal('.01'), rounding=ROUND_DOWN))
print(m.quantize(Decimal('.01'), rounding=ROUND_UP))
I have a list that contains a certain number, '5.74536541', which I convert to a float.
I am printing it out in Python 3 using ("%0.2f" % (variable)) but it always prints out 5.75 instead of 5.74.
I know you're thinking who cares, but it is for a currency converter program and I don't want the currencies to round up/down but to be exact.
How can I keep it from rounding but also keep the 2 decimal places?
You shouldn't use floating point numbers for currency, due to rounding errors like you mentioned.
Your best bet is to use a fixed-precision decimal where you also have full control over how rounding and truncation works. From the docs:
>>> from decimal import *
>>> getcontext()
Context(prec=28, rounding=ROUND_HALF_EVEN, Emin=-999999999, Emax=999999999,
capitals=1, flags=[], traps=[Overflow, DivisionByZero,
InvalidOperation])
>>> getcontext().prec = 6
>>> Decimal('3.0')
Decimal('3.0')
>>> Decimal('3.1415926535')
Decimal('3.1415926535')
>>> Decimal('3.1415926535') + Decimal('2.7182818285')
Decimal('5.85987')
>>> getcontext().rounding = ROUND_UP
>>> Decimal('3.1415926535') + Decimal('2.7182818285')
Decimal('5.85988')
You should represent all currency-based values internally as Decimals with a high precision (the standard level of precision should be fine in your case - just leave the prec alone!). If you want to print a nicely formatted dollars and cents value to the user, using the locale module is a straightforward way to do this.
Be careful when printing as you will have to quantize the Decimal down to the correct number of places for display or the rounding will not be based on your Decimal context! You should only perform the quantize step for final display or for a single, final value - all intermediate steps should use high-precision Decimals to make any operations as accurate as possible.
>>> from decimal import *
>>> import locale
>>> locale.setlocale(locale.LC_ALL, '')
'en_AU.UTF-8'
>>> getcontext().rounding = ROUND_DOWN
>>> TWOPLACES = Decimal(10) ** -2
>>> var = Decimal('5.74536541')
>>> var.quantize(TWOPLACES)
Decimal('5.74')
>>> locale.currency(var.quantize(TWOPLACES))
'$5.74'
If you're dealing with currency and accuracy matters, don't use float, use decimal.
Take away the number mod 0.01
i.e.
rounded = number - (number % 0.01)
then print it the same as before.
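For example (a minimal sketch; note that % truncates here rather than rounding, and the operands are still binary floats):
number = 5.74536541
rounded = number - (number % 0.01)  # drop everything below the 2nd decimal place
print("%0.2f" % rounded)            # 5.74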
This said, rounding down is not more accurate. Are you trying the old steal money from a bank by exploiting rounding errors scheme?
Floating point values are known as "useful approximations". Whatever you do to a floating point number—round it, truncate it, whatever—if the result is a floating point value, you don't get to decide how many digits to the right of the decimal point it has.
Never use floating point values for currency. See pydoc decimal, for example. Python's decimal module supports decimal fixed point and decimal floating point arithmetic.
Python docs warn about rounding floats.
Note: The behavior of round() for floats can be surprising: for example, round(2.675, 2) gives 2.67 instead of the expected 2.68. This is not a bug: it's a result of the fact that most decimal fractions can't be represented exactly as a float.
If you're not careful, you'll be misled by the value that appears at the interpreter prompt.
Python only prints a decimal approximation to the true decimal value of the binary approximation stored by the machine.
And:
It's important to realize that this is, in a real sense, an illusion: the value in the machine is not exactly 1/10; you're simply rounding the display of the true machine value. This fact becomes apparent as soon as you try to do arithmetic with these values.
If the number is a string, truncate the string to only n characters after the decimal point and then convert it to a float.
Otherwise multiply it by 10^n, where n is the number of digits you want after the decimal point, truncate the result to an integer, and then divide by 10^n, as sketched below.
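A minimal sketch of both branches (the helper name truncate is mine):
def truncate(value, n):
    if isinstance(value, str):
        # String branch: keep only n characters after the decimal point.
        whole, _, frac = value.partition('.')
        return float(whole + '.' + frac[:n])
    # Numeric branch: scale up, chop with int(), scale back down.
    factor = 10 ** n
    return int(value * factor) / factor

print(truncate("5.74536541", 2))  # 5.74
print(truncate(5.74536541, 2))    # 5.74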
Is there a way to round a python float to x decimals? For example:
>>> x = roundfloat(66.66666666666, 4)
66.6667
>>> x = roundfloat(1.29578293, 6)
1.295783
I've found ways to trim/truncate them (66.666666666 --> 66.6666), but not round (66.666666666 --> 66.6667).
I feel compelled to provide a counterpoint to Ashwini Chaudhary's answer. Despite appearances, the two-argument form of the round function does not round a Python float to a given number of decimal places, and it's often not the solution you want, even when you think it is. Let me explain...
The ability to round a (Python) float to some number of decimal places is something that's frequently requested, but turns out to be rarely what's actually needed. The beguilingly simple answer round(x, number_of_places) is something of an attractive nuisance: it looks as though it does what you want, but thanks to the fact that Python floats are stored internally in binary, it's doing something rather subtler. Consider the following example:
>>> round(52.15, 1)
52.1
With a naive understanding of what round does, this looks wrong: surely it should be rounding up to 52.2 rather than down to 52.1? To understand why such behaviours can't be relied upon, you need to appreciate that while this looks like a simple decimal-to-decimal operation, it's far from simple.
So here's what's really happening in the example above. (deep breath) We're displaying a decimal representation of the nearest binary floating-point number to the nearest n-digits-after-the-point decimal number to a binary floating-point approximation of a numeric literal written in decimal. So to get from the original numeric literal to the displayed output, the underlying machinery has made four separate conversions between binary and decimal formats, two in each direction. Breaking it down (and with the usual disclaimers about assuming IEEE 754 binary64 format, round-ties-to-even rounding, and IEEE 754 rules):
1. First the numeric literal 52.15 gets parsed and converted to a Python float. The actual number stored is 7339460017730355 * 2**-47, or 52.14999999999999857891452847979962825775146484375.
2. Internally, as the first step of the round operation, Python computes the closest 1-digit-after-the-point decimal string to the stored number. Since that stored number is a touch under the original value of 52.15, we end up rounding down and getting the string 52.1. This explains why we're getting 52.1 as the final output instead of 52.2.
3. Then, as the second step of the round operation, Python turns that string back into a float, getting the closest binary floating-point number to 52.1, which is now 7332423143312589 * 2**-47, or 52.10000000000000142108547152020037174224853515625.
4. Finally, as part of Python's read-eval-print loop (REPL), the floating-point value is displayed (in decimal). That involves converting the binary value back to a decimal string, getting 52.1 as the final output.
In Python 2.7 and later, we have the pleasant situation that the two conversions in steps 3 and 4 cancel each other out. That's due to Python's choice of repr implementation, which produces the shortest decimal value guaranteed to round correctly to the actual float. One consequence of that choice is that if you start with any (not too large, not too small) decimal literal with 15 or fewer significant digits then the corresponding float will be displayed showing those exact same digits:
>>> x = 15.34509809234
>>> x
15.34509809234
Unfortunately, this furthers the illusion that Python is storing values in decimal. Not so in Python 2.6, though! Here's the original example executed in Python 2.6:
>>> round(52.15, 1)
52.200000000000003
Not only do we round in the opposite direction, getting 52.2 instead of 52.1, but the displayed value doesn't even print as 52.2! This behaviour has caused numerous reports to the Python bug tracker along the lines of "round is broken!". But it's not round that's broken, it's user expectations. (Okay, okay, round is a little bit broken in Python 2.6, in that it doesn't use correct rounding.)
Short version: if you're using two-argument round, and you're expecting predictable behaviour from a binary approximation to a decimal round of a binary approximation to a decimal halfway case, you're asking for trouble.
So enough with the "two-argument round is bad" argument. What should you be using instead? There are a few possibilities, depending on what you're trying to do.
If you're rounding for display purposes, then you don't want a float result at all; you want a string. In that case the answer is to use string formatting:
>>> format(66.66666666666, '.4f')
'66.6667'
>>> format(1.29578293, '.6f')
'1.295783'
Even then, one has to be aware of the internal binary representation in order not to be surprised by the behaviour of apparent decimal halfway cases.
>>> format(52.15, '.1f')
'52.1'
If you're operating in a context where it matters which direction decimal halfway cases are rounded (for example, in some financial contexts), you might want to represent your numbers using the Decimal type. Doing a decimal round on the Decimal type makes a lot more sense than on a binary type (equally, rounding to a fixed number of binary places makes perfect sense on a binary type). Moreover, the decimal module gives you better control of the rounding mode. In Python 3, round does the job directly. In Python 2, you need the quantize method.
>>> Decimal('66.66666666666').quantize(Decimal('1e-4'))
Decimal('66.6667')
>>> Decimal('1.29578293').quantize(Decimal('1e-6'))
Decimal('1.295783')
In rare cases, the two-argument version of round really is what you want: perhaps you're binning floats into bins of size 0.01, and you don't particularly care which way border cases go. However, these cases are rare, and it's difficult to justify the existence of the two-argument version of the round builtin based on those cases alone.
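For instance, binning arbitrary floats into 0.01-wide bins (the sample values are mine):
>>> [round(v, 2) for v in [0.123, 0.456, 0.789]]
[0.12, 0.46, 0.79]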
Use the built-in function round():
In [23]: round(66.66666666666,4)
Out[23]: 66.6667
In [24]: round(1.29578293,6)
Out[24]: 1.295783
help on round():
round(number[, ndigits]) -> floating point number
Round a number to a given precision in decimal digits (default 0
digits). This always returns a floating point number. Precision may
be negative.
Default rounding in python and numpy:
In: [round(i) for i in np.arange(10) + .5]
Out: [0, 2, 2, 4, 4, 6, 6, 8, 8, 10]
I used this to get integer rounding to be applied to a pandas series:
import decimal
and use this line to set the rounding to "half up", a.k.a. rounding as taught in school:
decimal.getcontext().rounding = decimal.ROUND_HALF_UP
Finally, I made this function to apply it to a pandas Series object:
def roundint(value):
    return value.apply(lambda x: int(decimal.Decimal(x).to_integral_value()))
So now you can do roundint(df.columnname).
And for numbers:
In: [int(decimal.Decimal(i).to_integral_value()) for i in np.arange(10) + .5]
Out: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Credit: kares
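For completeness, a self-contained sketch of the pandas part (the sample Series is mine; numpy floats subclass Python float, so Decimal accepts them directly):
import decimal
import pandas as pd

decimal.getcontext().rounding = decimal.ROUND_HALF_UP

def roundint(value):
    return value.apply(lambda x: int(decimal.Decimal(x).to_integral_value()))

s = pd.Series([0.5, 1.5, 2.5])
print(roundint(s).tolist())  # [1, 2, 3]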
The Mark Dickinson answer, although complete, didn't work with the float(52.15) case. After some tests, here is the solution that I'm using:
import decimal

def value_to_decimal(value, decimal_places):
    decimal.getcontext().rounding = decimal.ROUND_HALF_UP  # define rounding method
    return decimal.Decimal(str(float(value))).quantize(decimal.Decimal('1e-{}'.format(decimal_places)))
(The conversion of 'value' to float and then to string is very important; that way, 'value' can be a float, Decimal, integer or string!)
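Checked against the 52.15 example from earlier on this page:
print(value_to_decimal(52.15, 1))    # 52.2
print(value_to_decimal('52.15', 1))  # 52.2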
Hope this helps anyone.
I coded a function (used in a Django project for a DecimalField) but it can be used in any Python project:
This code:
clamps values that are too high for the allowed integer digits,
rounds to the requested number of decimal places,
and handles signed (clamping at -max) and unsigned (clamping at 0) numbers.
Code with tests :
def convert_decimal_to_right(value, max_digits, decimal_places, signed=True):
    integer_digits = max_digits - decimal_places
    # Largest representable value, e.g. 99.99 for max_digits=4, decimal_places=2.
    max_value = float(10**integer_digits) - 1.0 / (10**decimal_places)
    min_value = -max_value if signed else 0
    # Clamp into range, then round to the requested number of places.
    if value > max_value:
        value = max_value
    if value < min_value:
        value = min_value
    return round(value, decimal_places)
value = 12.12345
nb = convert_decimal_to_right(value, 4, 2)
# nb : 12.12
value = 12.126
nb = convert_decimal_to_right(value, 4, 2)
# nb : 12.13
value = 1234.123
nb = convert_decimal_to_right(value, 4, 2)
# nb : 99.99
value = -1234.123
nb = convert_decimal_to_right(value, 4, 2)
# nb : -99.99
value = -1234.123
nb = convert_decimal_to_right(value, 4, 2, signed = False)
# nb : 0
value = 12.123
nb = convert_decimal_to_right(value, 8, 4)
# nb : 12.123
def trim_to_a_point(num, dec_point):
    factor = 10**dec_point  # scale factor for the digits to keep
    num = num * factor      # shift the wanted digits left of the point
    num = int(num)          # int() truncates the rest
    num = num / factor      # shift back down
    return num
# test
a = 14.1234567
print(trim_to_a_point(a, 5))
# output: 14.12345
Multiply by 10^(the number of decimal places you want),
truncate with int(),
divide by the same factor of 10 you multiplied by,
done!
Just posted this for educational reasons; I think it is correct though :)
The built-in Python str() function outputs some weird results when passing in floats with many decimals. This is what happens:
>>> str(19.9999999999999999)
'20.0'
I'm expecting to get:
'19.9999999999999999'
Does anyone know why? and maybe workaround it?
Thanks!
It's not str() that rounds, it's the fact that you're using floats in the first place. Float types are fast, but have limited precision; in other words, they are imprecise by design. This applies to all programming languages. For more details on float quirks, please read "What Every Programmer Should Know About Floating-Point Arithmetic"
If you want to store and operate on precise numbers, use the decimal module:
>>> from decimal import Decimal
>>> str(Decimal('19.9999999999999999'))
'19.9999999999999999'
A Python float is a C double under the hood: 64 bits, not 32. One of those bits is allocated for the sign, 52 for the mantissa, and 11 for the exponent. You can't fit every decimal, to an unlimited number of digits, into 64 bits; therefore floating point numbers are heavily based on rounding.
If you try str(19.998), it will give you 19.998 because 64 bits have enough precision to represent that many digits, but something like 19.9999999999999999 is too precise to represent in 64 bits, so it rounds to the nearest representable value, which happens to be 20.0.
Please note that this is a problem of understanding floating point (fixed-length) numbers. Most languages do exactly (or something very similar to) what Python does.
Python float is IEEE 754 64-bit binary floating point. It is limited to 53 bits of precision i.e. slightly less than 16 decimal digits of precision. 19.9999999999999999 contains 18 decimal digits; it cannot be represented exactly as a float. float("19.9999999999999999") produces the nearest floating point value, which happens to be the same as float("20.0").
>>> float("19.9999999999999999") == float("20.0")
True
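You can confirm those limits from sys.float_info:
>>> import sys
>>> sys.float_info.mant_dig   # bits of precision
53
>>> sys.float_info.dig        # decimal digits reliably representable
15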
If by "many decimals" you mean "many digits after the decimal point", please be aware that the same "weird" results happen when there are many decimal digits before the decimal point:
>>> float("199999999999999999")
2e+17
If you want the full float precision, don't use str(), use repr() (this matters in Python 2, where str() truncates to 12 significant digits; in Python 3, str() and repr() of a float are identical):
>>> x = 1. / 3.
>>> str(x)
'0.333333333333'
>>> str(x).count('3')
12
>>> repr(x)
'0.3333333333333333'
>>> repr(x).count('3')
16
>>>
Update: It's interesting how often decimal is prescribed as a cure-all for float-induced astonishment, often accompanied by simple examples like 0.1 + 0.1 + 0.1 != 0.3. Nobody stops to point out that decimal has its share of deficiencies, e.g.
>>> (1.0 / 3.0) * 3.0
1.0
>>> (Decimal('1.0') / Decimal('3.0')) * Decimal('3.0')
Decimal('0.9999999999999999999999999999')
>>>
True, float is limited to 53 binary digits of precision. By default, decimal is limited to 28 decimal digits of precision.
>>> Decimal(2) / Decimal(3)
Decimal('0.6666666666666666666666666667')
>>>
You can change the limit, but it's still limited precision. You still need to know the characteristics of the number format to use it effectively without "astonishing" results, and the extra precision is bought by slower operation (unless you use the 3rd-party cdecimal module).
For any given binary floating point number, there is an infinite set of decimal fractions that, on input, round to that number. Python's str goes to some trouble to produce the shortest decimal fraction from this set; see GLS's paper http://kurtstephens.com/files/p372-steele.pdf for the general algorithm (IIRC they use a refinement that avoids arbitrary-precision math in most cases). You happened to input a decimal fraction that rounds to a float (IEEE double) whose shortest possible decimal fraction is not the same as the one you entered.
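You can see both at once: repr (and str, in Python 3) gives the shortest form, while constructing a Decimal from the float shows the exact stored value:
>>> 0.1
0.1
>>> from decimal import Decimal
>>> Decimal(0.1)
Decimal('0.1000000000000000055511151231257827021181583404541015625')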