When calculating, I get the wrong last digit of the number. At first I simply calculated to one digit more precision than I needed and then stripped the last (rounded) digit with a slice. But then I noticed that Decimal sometimes rounds more than one digit. Is it possible to calculate without rounding?
For example
from decimal import Decimal as dec, getcontext
from math import log

def sqr(x):
    return x * x

def pi(n):
    # Gauss-Legendre iteration, carried out with n+1 significant digits
    getcontext().prec = n + 1
    a = p = 1
    b = dec(1) / dec(2).sqrt()
    t = dec(1) / dec(4)
    for _ in range(int(log(n, 2))):
        an = (a + b) / 2
        b = (a * b).sqrt()
        t -= p * sqr(a - an)
        p *= 2
        a = an
    return sqr(a + b) / (4 * t)
If I try pi(12) I get "3.141592653591" (the last two digits are wrong), but if I try pi(13), they both change to the correct ones: "3.1415926535899".
This is called roundoff error, and it is unavoidable when working with floating-point arithmetic. You can type the following into your Python REPL and, interestingly, you should get False:
0.2 + 0.1 == 0.3 # False
That's because the last bits of a floating-point result are, effectively, noise. One way you can work around this is by computing with more precision (or more terms in your series) than you need and then rounding the result to the wanted precision.
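For example, here is a minimal sketch of that workaround with Decimal guard digits (the values and the 5-digit margin are just illustrative, not a rule):

from decimal import Decimal, getcontext

getcontext().prec = 17   # 5 guard digits beyond the 12 we actually want
x = Decimal(2).sqrt()    # computed with the guard digits in place
print(x)                 # 1.4142135623730950
getcontext().prec = 12   # drop back to the target precision
print(+x)                # 1.41421356237 (unary plus re-rounds to the current precision)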
If you want to understand this more deeply, you can read the two links I've attached and, maybe, a numerical computing textbook.
I am trying to match an expected output of "13031.157014219536" exactly. I have made three attempts to compute the value with different methods, detailed below, and come extremely close, but not close enough. What is happening in these code snippets that is causing the deviation? Is it rounding error in the calculation? Did I do something wrong?
Attempts:

item_worth = 8000
years = range(10)
for year in years:
    item_worth += (item_worth * 0.05)
print(item_worth)

value = item_worth * (1 + ((0.05) ** 10))
print(value)

cost = 8000
for x in range(10):
    cost *= 1.05
print(cost)
Expected Output:
13031.157014219536
Actual Outputs:
13031.157014219529
13031.157014220802
13031.157014219538
On most machines, floats are represented as fp64, i.e. IEEE-754 double-precision floats.
You can inspect the representation of your number this way (not a method to be used for real computation, just out of curiosity):
import struct

struct.pack('d', 13031.157014219536)
# b'\xf5\xbc\n\x19\x94s\xc9#' -- the bytes representing that number in fp64

# Or, in a more humanly understandable way:
struct.unpack('q', struct.pack('d', 13031.157014219536))[0]
# 4668389568658717941
# That integer has no meaning in itself, except that it is represented by
# the same bytes as your float.

# Now, let's see which float is "next in line":
struct.unpack('d', struct.pack('q', 4668389568658717941 + 1))[0]
# 13031.157014219538
Note that this code works on most machines but is not fully reliable. First, it relies on the significand bits not being all 1s; otherwise incrementing the integer would carry into the exponent and give a totally unrelated number. Second, it assumes that integers and floats share the same little-endian byte order. But it gave me what I wanted: the smallest float bigger than 13031.157014219536 is 13031.157014219538.
(Or, said more accurately for this kind of discussion: the smallest float bigger than 13031.157014219536 whose representation differs from that of 13031.157014219536 has the same representation as 13031.157014219538.)
So, my point is that you are flirting with the representation limit. You can't expect the result of ten chained operations to be any more accurate.
I could also have put it this way: the biggest power of 2 smaller than your number is 8192 = 2¹³. So 13 is the exponent of your float in its representation. And since you have 53 significant bits, the precision of such a number is 2**(13 - 53 + 1) = 1.8×10⁻¹² (which is indeed also the result of 13031.157014219538 - 13031.157014219536). Hence 12 decimal places are printed. But not all combinations of those decimal digits can exist, and the last one is not insignificant, yet not fully significant either.
Since your computation is the result of ten such operations, you could even have an error ten times bigger than that before having the right to complain :D
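On Python 3.9+ the standard library exposes the same information directly, without the struct round-trip (shown here just as a cross-check):

import math

x = 13031.157014219536
print(math.ulp(x))                  # 1.8189894035458565e-12, the gap to the next float
print(math.nextafter(x, math.inf))  # 13031.157014219538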
The value 0.1 is not representable as a 64-bit float.
The exact stored value is roughly equal to 0.10000000000000000555.
https://www.exploringbinary.com/why-0-point-1-does-not-exist-in-floating-point/
You can highlight this behavior with this simple code:
timestep = 0.1
iterations = 1_000_000
total = 0
for _ in range(iterations):
    total += timestep
print(total - timestep * iterations) # output is not zero but 1.3328826753422618e-06
I totally understand why 0.1 is not representable exactly as a 64-bit float, but what I don't get is why print(0.1) outputs 0.1 and not the underlying stored value.
Of course, the underlying value has many more digits in base 10, so some rounding must be involved, but I am looking for the specification of this behaviour for all values, and for how to control it.
I ran into this issue with an application storing data in a database:
- the Python app (using str(0.1)) would show 0.1
- another database client UI would show 0.10000000000000000555, which would throw off the end user
P.S.: I had similar issues with other values.
First, you are right: floats (single, double, whatever) have an exact value.
For a 64-bit IEEE-754 double, the nearest representable value to 0.1 is exactly 0.1000000000000000055511151231257827021181583404541015625, quite long as you can see. But representable floating-point values all have a finite number of decimal digits, because the base (2) divides a power of 10.
For a REPL language like Python, it is essential to have this property:
the printed representation of a float shall be reinterpreted as the same value
A consequence is that
any two different floats shall have different printed representations
To obtain those properties, there are several possibilities:
1. print the exact value. That can be many digits, and for the vast majority of humans, just noise.
2. print enough digits so that any two different floats have different representations. For double precision, that's 17 significant digits in the worst case, so a naive implementation would always print 17 significant digits.
3. print the shortest representation that would be reinterpreted unchanged.
Python, like many other languages, chose the third solution, because it is considered annoying to print 0.10000000000000001 when the user entered 0.1. Humans generally choose the shortest representation, and the printed representation is for human consumption: the shorter, the better.
The bad property is that it can give the false impression that those floating-point values store exact decimal values like 1/10. Dispelling that impression is something that gets evangelized here and in many other places now.
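A quick way to see all three representations side by side, using only the standard library:

from decimal import Decimal

print(0.1)                  # 0.1 -- the shortest string that round-trips
print(format(0.1, '.17g'))  # 0.10000000000000001 -- 17 significant digits
print(Decimal(0.1))         # 0.1000000000000000055511151231257827021181583404541015625 -- the exact value
print(float('0.1') == 0.1)  # True: the short form reinterprets to the same float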
I was investigating different rounding methods using Python's built-in solution and some external libraries such as SymPy, and while doing so I stumbled upon some cases that I need help understanding.
Ex-1:
print(round(1.0065,3))
output:
1.006
In the first case, using the Python built-in round function, the output was 1.006 instead of 1.007, and I can understand that this is not a mistake, as Python rounds to the nearest even digit, which is known as banker's rounding.
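You can check both effects from the REPL; note in particular that the double actually stored for 1.0065 is slightly below 1.0065, which also pushes round() downward:

from decimal import Decimal

# Exact halves show round-half-even ("banker's rounding"):
print(round(0.5), round(1.5), round(2.5))   # 0 2 2
# 1.0065 has no exact binary representation; the nearest double is slightly low:
print(Decimal(1.0065) < Decimal('1.0065'))  # True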
This is why I started searching for a way to control the rounding behaviour in the first place. With a quick search I found the decimal.Decimal module, which can easily handle decimal values and round them using quantize(), as in this example:
from decimal import Decimal, getcontext, ROUND_HALF_UP

context = getcontext()
context.rounding = ROUND_HALF_UP
print(Decimal('1.0065').quantize(Decimal('.001')))
output: 1.007
This is a very good solution; the only problem is that it is not easy to hardcode into long math expressions, since I need to convert every number to a string and then pass the precision to quantize() in the form '0.001' instead of writing 3 directly, as with the built-in round.
While searching for another solution I found that SymPy, which I already use a lot in my scripts, offers some very powerful functions that might help, but when I tried them the output was not what I expected.
Ex-1 using SymPy sympify():
print(sympify(1.0065).evalf(3))
output: 1.01
Ex-2 using SymPy N (normalize):
print(N(1.0065,3))
output: 1.01
At first the output looked a little weird, but after investigating I realized that N and sympify are rounding correctly, just to significant figures rather than to decimal places.
And here is my question:
Just as I can use getcontext().rounding = ROUND_HALF_UP to change the rounding behaviour of Decimal objects, is there a way to change the rounding behaviour of N and sympify to round to decimal places instead of significant figures?
Instead of re-implementing decimal rounding in SymPy, perhaps use decimal to do the rounding, but hide the calculation in a utility function:
import sympy as sym
import decimal
from decimal import Decimal as D

def dround(d, ndigits, rounding=decimal.ROUND_HALF_UP):
    result = D(str(d)).quantize(D('0.1') ** ndigits, rounding=rounding)
    # result = sym.sympify(result)  # if you want a SymPy Float
    return result

for x in [0.0065, 1.0065, 10.0065, 100.0065]:
    print(dround(x, 3))
prints
0.007
1.007
10.007
100.007
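The rounding parameter also lets you switch modes per call; for example (reusing the dround above), half-even restores the banker's behaviour:

print(dround(1.0065, 3))                           # 1.007 (ROUND_HALF_UP, the default)
print(dround(1.0065, 3, decimal.ROUND_HALF_EVEN))  # 1.006 (banker's rounding)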
The n of evalf gives the first n significant digits of x (measured from the left). If you use x.round(n) instead, it rounds x at the nth digit counted from the decimal point, and n can be positive (right of the decimal point) or negative (left of it).
>>> for x in '0.0065, 1.0065, 10.0065, 100.0065'.split(', '):
... print S(x).round(3)
0.006
1.006
10.007
100.007
>>> int(S(12345).round(-2))
12300
First of all, N and evalf are essentially the same thing; N(x, n) amounts to sympify(x).evalf(n). In your case, since x is a Python float, it's easier to use N because it sympifies the input.
To get three digits after the decimal dot, use N(x, 3 + floor(log(x, 10)) + 1). The adjustment floor(log(x, 10)) + 1 is 0 when x is between 0.1 and 1; in that case the number of significant digits is the same as the number of digits after the decimal dot. If x is larger, we get more significant digits.
Example:
from math import floor, log
from sympy import N

for x in [0.0065, 1.0065, 10.0065, 100.0065]:
    print(N(x, 3 + floor(log(x, 10)) + 1))
prints
0.006
1.007
10.007
100.007
The transition from 6 to 7 is curious, but not entirely surprising: these numbers are not exactly representable in binary, so truncation to the nearest double-precision float may be a factor here. I've made a few additional observations on this effect on my blog.
NB: this question is about significant figures. It is not a question about "digits after the decimal point" or anything like that.
EDIT: This question is not a duplicate of Significant figures in the decimal module. The two questions are asking about entirely different problems. I want to know why the function above does not return the desired value for a specific input. None of the answers to Significant figures in the decimal module address this question.
The following function is supposed to return a string representation of a float with the specified number of significant figures:
import decimal

def to_sigfigs(value, sigfigs):
    return str(decimal.Context(prec=sigfigs).create_decimal(value))
At first glance, it seems to work:
print to_sigfigs(0.000003141592653589793, 5)
# 0.0000031416
print to_sigfigs(0.000001, 5)
# 0.0000010000
print to_sigfigs(3.141592653589793, 5)
# 3.1416
...but
print to_sigfigs(1.0, 5)
# 1
The desired output for the last expression (IOW, the 5-significant figure representation of 1.0) is the string '1.0000'. The actual output is the string '1'.
Am I misunderstanding something or is this a bug in decimal?
The precision of a context is a maximum precision; if an operation would produce a Decimal with fewer digits than the context's precision, it is not padded out to the context's precision.
When you call to_sigfigs(0.000001, 5), 0.000001 already has some rounding error due to the conversion from source code to binary floating point. It's actually 9.99999999999999954748111825886258685613938723690807819366455078125E-7. Rounding that to 5 significant figures gives decimal.Decimal("0.0000010000").
On the other hand, 1 is exactly representable in binary floating point, so 1.0 is exactly 1. Since only 1 digit is needed to represent this in decimal, the context's precision doesn't require any rounding, and you get a 1-digit Decimal result.
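You can see the difference directly by converting both inputs to Decimal with no context rounding at all:

import decimal

print(decimal.Decimal(0.000001))  # 9.99999999999999954748111825886258685613938723690807819366455078125E-7
print(decimal.Decimal(1.0))       # 1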
Is it a bug? I don't know, I don't think the documentation is tight enough to make that determination. It certainly is a surprising result.
It is possible to fix your own function with a little more logic.
def to_sigfigs(value, sigfigs):
    sign, digits, exponent = decimal.Context(prec=sigfigs).create_decimal(value).as_tuple()
    if len(digits) < sigfigs:
        missing = sigfigs - len(digits)
        digits = digits + (0,) * missing
        exponent -= missing
    return str(decimal.Decimal((sign, digits, exponent)))
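A quick check of the padded version against the earlier failing case (print is parenthesized here so it runs on either Python version):

print(to_sigfigs(1.0, 5))
# 1.0000
print(to_sigfigs(0.000003141592653589793, 5))
# 0.0000031416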
I am having a bit of trouble understanding some results I am getting while operating with Python 2.7.
>>> x=1
>>> e=1e-20
>>> x+e
1.0
>>> x+e-x
0.0
>>> e+x-x
0.0
>>> x-x+e
1e-20
This is copied directly from Python. I am taking a class on programming in Python, and I do not understand the disparity of results: x+e == 1.0 and x-x+e == 1e-20, but x+e-x == 0.0 and e+x-x == 0.0.
I have already read the Python tutorial on representation errors, but I believe none of this was mentioned there.
Floating-point addition is not associative.
x+e-x is grouped as (x+e)-x. It adds x and e, rounds the result to the nearest representable number (which is 1), then subtracts x from the result and rounds again, producing 0.
x-x+e is grouped as (x-x)+e. It subtracts x from x, producing 0, and rounds it to the nearest representable number, which is 0. It then adds e to 0, producing e, and rounds it to the nearest representable number, which is e.
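A minimal demonstration of the grouping, together with math.fsum, which tracks the lost low-order bits across the whole sum:

import math

x, e = 1.0, 1e-20
print((x + e) - x)            # 0.0   -- x+e has already been rounded to 1.0
print((x - x) + e)            # 1e-20 -- x-x is exactly 0.0, so e survives
print(math.fsum([x, e, -x]))  # 1e-20 -- fsum keeps the exact running sum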
This is because of the way that computers represent floating point numbers.
This all really happens in binary, but let's pretend it works with base-10 numbers, because that is a lot easier for us to relate to.
A floating-point number is expressed in the form 0.x*10^y, where x is a 10-digit number (I'm omitting trailing zeroes here) and y is the exponent. This means that the number 1.0 is expressed as 0.1*10^1 and the number 0.1 as 0.1*10^0.
To add two such numbers together, we first need to make sure they have the same exponent. We can do this easily by shifting the numbers back and forth, i.e. we change 0.1*10^0 to 0.01*10^1 and then add them together to get 0.11*10^1.
When we have 0.1*10^1 and 0.1*10^-19 (1e-20), we must shift 0.1*10^-19 by 20 steps, meaning that the 1 falls outside the range of our 10-digit number, so we end up with 0.1*10^1 + 0.0*10^1 = 0.1*10^1.
The reason you end up with 1e-20 in your last example is that addition is done from left to right: we subtract 0.1*10^1 from 0.1*10^1, ending up with 0.0*10^0, and then add 0.1*10^-19 to that, which is a special case where no shifting is needed because one operand is exactly zero.
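You can reproduce this base-10 analogy exactly with the decimal module by limiting the context to 10 significant digits (the precision of 10 simply mirrors the example above):

from decimal import Decimal, getcontext

getcontext().prec = 10   # keep only 10 significant digits, as in the analogy
one = Decimal(1)
tiny = Decimal('1e-20')

print(one + tiny)        # 1.000000000 -- tiny is shifted out of the 10 digits
print(one + tiny - one)  # 0E-9 (zero) -- the information was already lost
print(one - one + tiny)  # 1E-20       -- subtracting first leaves exact zero, so tiny survives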