Python integer division gives wrong result - python

I'm not sure if this is a bug or if I'm just misunderstanding how integer division is supposed to work.
Consider the following code:
import math
a = 18000
b = 5500
c = (a-b)/9 # c = 1388.(8)
d = (a-b)/c
e = (a-b)//c
f = math.floor(d)
print(f"(a-b)/c={d}, (a-b)//c={e}, floor={f}") # outputs (a-b)/c=9.0, (a-b)//c=8.0, floor=9
Why is e different from d? As far as I understand, num1//num2 should be equal to math.floor(num1/num2).
Using Python 3.8.10 32bit on Windows 10 Pro.

The difference here is the use of intermediate calculations.
See this question for more details on floating point representation.
(a-b)/9 is not exactly representable in floating point. In python it is stored as 1388.888888888888914152630604803562164306640625
from decimal import Decimal
print(Decimal(c)) # 1388.888888888888914152630604803562164306640625
Observe that c is actually a little larger than the true value of (a-b)/9. Because of this, the true value of (a-b)/c is slightly less than 9. Floor-divide in python gives the floor of the true value of the division, therefore (a-b)//c correctly evaluates to 8.
On the other hand, (a-b)/c results in another floating point precision error. Though the true value is slightly less than 9, the closest value that can be represented is exactly 9. Applying the floor operation to exactly 9 results in 9, as expected.

Because c is a non-integer number of float type, the integer-division // can lead to unintuitive results. Best to only use // when dividing 2 integers with the purpose of creating an int as output type.

Related

What is the difference between a//b and int(a/b)?

They seem to be equal for small numbers but different for larger.
For example:
a = int(1267650600228229401496703205376/10)
b = 1267650600228229401496703205376 // 10
print(a - b) # prints 7036874417767
a = int(1493845793475/10)
b = 1493845793475 // 10
print(a - b) # prints 0
How come?
Complementing the existing answers it seems worthwile mentioning that you needn't go that far out to observe a difference:
>>> -1//2
-1
>>> int(-1/2)
0
In Python 3 / performs float division, which has 53 bits of precision; // does floor division, which has no precision limit when both operands are integers (apart from limitations imposed by available RAM).
You can get the Python 3 behaviour in Python 2 by using the true_division __future__ import.
As #khelwood explained, in Python 3, a/b performs float division. Try typing 1/2 into an interpreter -- you'll get 0.5, not 0.
So in your example, 1267650600228229401496703205376 / 10 in reality is equal to 126765060022822940149670320537.6 = 1.267650600228229401496703205376e+29 (floating point division), but due to the inaccuracy of floats, Python evaluates it as 1.2676506002282295e+29, so you've lost precision, which accounts for the difference.
See PEP 238.
The result of the integer division a//b
a=4//3=1
The result of the float division is a/b
a=4/3=1.33333

Why is math.sqrt() incorrect for large numbers?

Why does the math module return the wrong result?
First test
A = 12345678917
print 'A =',A
B = sqrt(A**2)
print 'B =',int(B)
Result
A = 12345678917
B = 12345678917
Here, the result is correct.
Second test
A = 123456758365483459347856
print 'A =',A
B = sqrt(A**2)
print 'B =',int(B)
Result
A = 123456758365483459347856
B = 123456758365483467538432
Here the result is incorrect.
Why is that the case?
Because math.sqrt(..) first casts the number to a floating point and floating points have a limited mantissa: it can only represent part of the number correctly. So float(A**2) is not equal to A**2. Next it calculates the math.sqrt which is also approximately correct.
Most functions working with floating points will never be fully correct to their integer counterparts. Floating point calculations are almost inherently approximative.
If one calculates A**2 one gets:
>>> 12345678917**2
152415787921658292889L
Now if one converts it to a float(..), one gets:
>>> float(12345678917**2)
1.5241578792165828e+20
But if you now ask whether the two are equal:
>>> float(12345678917**2) == 12345678917**2
False
So information has been lost while converting it to a float.
You can read more about how floats work and why these are approximative in the Wikipedia article about IEEE-754, the formal definition on how floating points work.
The documentation for the math module states "It provides access to the mathematical functions defined by the C standard." It also states "Except when explicitly noted otherwise, all return values are floats."
Those together mean that the parameter to the square root function is a float value. In most systems that means a floating point value that fits into 8 bytes, which is called "double" in the C language. Your code converts your integer value into such a value before calculating the square root, then returns such a value.
However, the 8-byte floating point value can store at most 15 to 17 significant decimal digits. That is what you are getting in your results.
If you want better precision in your square roots, use a function that is guaranteed to give full precision for an integer argument. Just do a web search and you will find several. Those usually do a variation of the Newton-Raphson method to iterate and eventually end at the correct answer. Be aware that this is significantly slower that the math module's sqrt function.
Here is a routine that I modified from the internet. I can't cite the source right now. This version also works for non-integer arguments but just returns the integer part of the square root.
def isqrt(x):
"""Return the integer part of the square root of x, even for very
large values."""
if x < 0:
raise ValueError('square root not defined for negative numbers')
n = int(x)
if n == 0:
return 0
a, b = divmod(n.bit_length(), 2)
x = (1 << (a+b)) - 1
while True:
y = (x + n//x) // 2
if y >= x:
return x
x = y
If you want to calculate sqrt of really large numbers and you need exact results, you can use sympy:
import sympy
num = sympy.Integer(123456758365483459347856)
print(int(num) == int(sympy.sqrt(num**2)))
The way floating-point numbers are stored in memory makes calculations with them prone to slight errors that can nevertheless be significant when exact results are needed. As mentioned in one of the comments, the decimal library can help you here:
>>> A = Decimal(12345678917)
>>> A
Decimal('123456758365483459347856')
>>> B = A.sqrt()**2
>>> B
Decimal('123456758365483459347856.0000')
>>> A == B
True
>>> int(B)
123456758365483459347856
I use version 3.6, which has no hardcoded limit on the size of integers. I don't know if, in 2.7, casting B as an int would cause overflow, but decimal is incredibly useful regardless.

I'm making mistakes dividing large numbers

I am trying to write a program in python 2.7 that will first see if a number divides the other evenly, and if it does get the result of the division.
However, I am getting some interesting results when I use large numbers.
Currently I am using:
from __future__ import division
import math
a=82348972389472433334783
b=2
if a/b==math.trunc(a/b):
answer=a/b
print 'True' #to quickly see if the if loop was invoked
When I run this I get:
True
But 82348972389472433334783 is clearly not even.
Any help would be appreciated.
That's a crazy way to do it. Just use the remainder operator.
if a % b == 0:
# then b divides a evenly
quotient = a // b
The true division implicitly converts the input to floats which don't provide the precision to store the value of a accurately. E.g. on my machine
>>> int(1E15+1)
1000000000000001
>>> int(1E16+1)
10000000000000000
hence you loose precision. A similar thing happens with your big number (compare int(float(a))-a).
Now, if you check your division, you see the result "is" actually found to be an integer
>>> (a/b).is_integer()
True
which is again not really expected beforehand.
The math.trunc function does something similar (from the docs):
Return the Real value x truncated to an Integral (usually a long integer).
The duck typing nature of python allows a comparison of the long integer and float, see
Checking if float is equivalent to an integer value in python and
Comparing a float and an int in Python.
Why don't you use the modulus operator instead to check if a number can be divided evenly?
n % x == 0

Prevent Rounding to Zero in Python

I have a program meant to approximate pi using the Chudnovsky Algorithm, but a term in my equation that is very small keeps being rounded to zero.
Here is the algorithm:
import math
from decimal import *
getcontext().prec = 100
pi = Decimal(0.0)
C = Decimal(12/(math.sqrt(640320**3)))
k = 0
x = Decimal(0.0)
result = Decimal(0.0)
sign = 1
while k<10:
r = Decimal(math.factorial(6*k)/((math.factorial(k)**3)*math.factorial(3*k)))
s = Decimal((13591409+545140134*k)/((640320**3)**k))
x += Decimal(sign*r*s)
sign = sign*(-1)
k += 1
result = Decimal(C*x)
pi = Decimal(1/result)
print Decimal(pi)
The equations may be clearer without the "decimal" terms.
import math
pi = 0.0
C = 12/(math.sqrt(640320**3))
k = 0
x = 0.0
result = 0.0
sign = 1
while k<10:
r = math.factorial(6*k)/((math.factorial(k)**3)*math.factorial(3*k))
s = (13591409+545140134*k)/((640320**3)**k)
x += sign*r*s
sign = sign*(-1)
k += 1
result = C*x
pi = 1/result
print pi
The issue is with the "s" variable. For k>0, it always comes to zero. e.g. at k=1, s should equal about 2.1e-9, but instead it is just zero. Because of this all of my terms after the first =0. How do I get python to calculate the exact value of s instead of rounding it down to 0?
Try:
s = Decimal((13591409+545140134*k)) / Decimal(((640320**3)**k))
The arithmetic you're doing is native python - by allowing the Decimal object to perform your division, you should eliminate your error.
You can do the same, then, when computing r.
A couple of comments.
If you are using Python 2.x, the / returns an integer result. If you want a Decimal result, you convert at least one side to Decimal first.
math.sqrt() only return ~16 digits of precision. Since your value for C will only be accurate to ~16 digits, your final result will only be accurate to 16 digits.
If you're doing maths in Python 2.x, you should probably be putting this line into every module:
from __future__ import division
This changes the meaning of the division operator so that it will return a floating point number if needed to give a (closer to) precise answer. The historical behaviour is for x / y to return an int if both x and y are ints, which usually forces the answer to be rounded down.
Returning a float if necessary is generally regarded as a better way to handle division in a language like Python where duck typing is encouraged, since you can just worry about the value of your numbers rather than getting different behaviour for different types.
In Python 3 this is in fact the default, but since old programs relied on the historical behaviour of the division operator it was felt the change was too backwards-incompatible to be made in Python 2. This is why you have to explicitly turn it on with the __future__ import. I would recommend always adding that import in any module that might be doing any mathematics (or just any module at all, if you can be bothered). You'll almost never be upset that it's there, but not having it there has been the cause of a number of obscure bugs I've had to chase.
I feel that the problem with 's' is that all terms are integers, thus you are doing integer maths. A very simple workaround, would be to use 3.0 in the denominator. It only takes one float in the calculation to get a float returned.

Round float to x decimals?

Is there a way to round a python float to x decimals? For example:
>>> x = roundfloat(66.66666666666, 4)
66.6667
>>> x = roundfloat(1.29578293, 6)
1.295783
I've found ways to trim/truncate them (66.666666666 --> 66.6666), but not round (66.666666666 --> 66.6667).
I feel compelled to provide a counterpoint to Ashwini Chaudhary's answer. Despite appearances, the two-argument form of the round function does not round a Python float to a given number of decimal places, and it's often not the solution you want, even when you think it is. Let me explain...
The ability to round a (Python) float to some number of decimal places is something that's frequently requested, but turns out to be rarely what's actually needed. The beguilingly simple answer round(x, number_of_places) is something of an attractive nuisance: it looks as though it does what you want, but thanks to the fact that Python floats are stored internally in binary, it's doing something rather subtler. Consider the following example:
>>> round(52.15, 1)
52.1
With a naive understanding of what round does, this looks wrong: surely it should be rounding up to 52.2 rather than down to 52.1? To understand why such behaviours can't be relied upon, you need to appreciate that while this looks like a simple decimal-to-decimal operation, it's far from simple.
So here's what's really happening in the example above. (deep breath) We're displaying a decimal representation of the nearest binary floating-point number to the nearest n-digits-after-the-point decimal number to a binary floating-point approximation of a numeric literal written in decimal. So to get from the original numeric literal to the displayed output, the underlying machinery has made four separate conversions between binary and decimal formats, two in each direction. Breaking it down (and with the usual disclaimers about assuming IEEE 754 binary64 format, round-ties-to-even rounding, and IEEE 754 rules):
First the numeric literal 52.15 gets parsed and converted to a Python float. The actual number stored is 7339460017730355 * 2**-47, or 52.14999999999999857891452847979962825775146484375.
Internally as the first step of the round operation, Python computes the closest 1-digit-after-the-point decimal string to the stored number. Since that stored number is a touch under the original value of 52.15, we end up rounding down and getting a string 52.1. This explains why we're getting 52.1 as the final output instead of 52.2.
Then in the second step of the round operation, Python turns that string back into a float, getting the closest binary floating-point number to 52.1, which is now 7332423143312589 * 2**-47, or 52.10000000000000142108547152020037174224853515625.
Finally, as part of Python's read-eval-print loop (REPL), the floating-point value is displayed (in decimal). That involves converting the binary value back to a decimal string, getting 52.1 as the final output.
In Python 2.7 and later, we have the pleasant situation that the two conversions in step 3 and 4 cancel each other out. That's due to Python's choice of repr implementation, which produces the shortest decimal value guaranteed to round correctly to the actual float. One consequence of that choice is that if you start with any (not too large, not too small) decimal literal with 15 or fewer significant digits then the corresponding float will be displayed showing those exact same digits:
>>> x = 15.34509809234
>>> x
15.34509809234
Unfortunately, this furthers the illusion that Python is storing values in decimal. Not so in Python 2.6, though! Here's the original example executed in Python 2.6:
>>> round(52.15, 1)
52.200000000000003
Not only do we round in the opposite direction, getting 52.2 instead of 52.1, but the displayed value doesn't even print as 52.2! This behaviour has caused numerous reports to the Python bug tracker along the lines of "round is broken!". But it's not round that's broken, it's user expectations. (Okay, okay, round is a little bit broken in Python 2.6, in that it doesn't use correct rounding.)
Short version: if you're using two-argument round, and you're expecting predictable behaviour from a binary approximation to a decimal round of a binary approximation to a decimal halfway case, you're asking for trouble.
So enough with the "two-argument round is bad" argument. What should you be using instead? There are a few possibilities, depending on what you're trying to do.
If you're rounding for display purposes, then you don't want a float result at all; you want a string. In that case the answer is to use string formatting:
>>> format(66.66666666666, '.4f')
'66.6667'
>>> format(1.29578293, '.6f')
'1.295783'
Even then, one has to be aware of the internal binary representation in order not to be surprised by the behaviour of apparent decimal halfway cases.
>>> format(52.15, '.1f')
'52.1'
If you're operating in a context where it matters which direction decimal halfway cases are rounded (for example, in some financial contexts), you might want to represent your numbers using the Decimal type. Doing a decimal round on the Decimal type makes a lot more sense than on a binary type (equally, rounding to a fixed number of binary places makes perfect sense on a binary type). Moreover, the decimal module gives you better control of the rounding mode. In Python 3, round does the job directly. In Python 2, you need the quantize method.
>>> Decimal('66.66666666666').quantize(Decimal('1e-4'))
Decimal('66.6667')
>>> Decimal('1.29578293').quantize(Decimal('1e-6'))
Decimal('1.295783')
In rare cases, the two-argument version of round really is what you want: perhaps you're binning floats into bins of size 0.01, and you don't particularly care which way border cases go. However, these cases are rare, and it's difficult to justify the existence of the two-argument version of the round builtin based on those cases alone.
Use the built-in function round():
In [23]: round(66.66666666666,4)
Out[23]: 66.6667
In [24]: round(1.29578293,6)
Out[24]: 1.295783
help on round():
round(number[, ndigits]) -> floating point number
Round a number to a given precision in decimal digits (default 0
digits). This always returns a floating point number. Precision may
be negative.
Default rounding in python and numpy:
In: [round(i) for i in np.arange(10) + .5]
Out: [0, 2, 2, 4, 4, 6, 6, 8, 8, 10]
I used this to get integer rounding to be applied to a pandas series:
import decimal
and use this line to set the rounding to "half up" a.k.a rounding as taught in school:
decimal.getcontext().rounding = decimal.ROUND_HALF_UP
Finally I made this function to apply it to a pandas series object
def roundint(value):
return value.apply(lambda x: int(decimal.Decimal(x).to_integral_value()))
So now you can do roundint(df.columnname)
And for numbers:
In: [int(decimal.Decimal(i).to_integral_value()) for i in np.arange(10) + .5]
Out: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Credit: kares
The Mark Dickinson answer, although complete, didn't work with the float(52.15) case. After some tests, there is the solution that I'm using:
import decimal
def value_to_decimal(value, decimal_places):
decimal.getcontext().rounding = decimal.ROUND_HALF_UP # define rounding method
return decimal.Decimal(str(float(value))).quantize(decimal.Decimal('1e-{}'.format(decimal_places)))
(The conversion of the 'value' to float and then string is very important, that way, 'value' can be of the type float, decimal, integer or string!)
Hope this helps anyone.
I coded a function (used in Django project for DecimalField) but it can be used in Python project :
This code :
Manage integers digits to avoid too high number
Manage decimals digits to avoid too low number
Manage signed and unsigned numbers
Code with tests :
def convert_decimal_to_right(value, max_digits, decimal_places, signed=True):
integer_digits = max_digits - decimal_places
max_value = float((10**integer_digits)-float(float(1)/float((10**decimal_places))))
if signed:
min_value = max_value*-1
else:
min_value = 0
if value > max_value:
value = max_value
if value < min_value:
value = min_value
return round(value, decimal_places)
value = 12.12345
nb = convert_decimal_to_right(value, 4, 2)
# nb : 12.12
value = 12.126
nb = convert_decimal_to_right(value, 4, 2)
# nb : 12.13
value = 1234.123
nb = convert_decimal_to_right(value, 4, 2)
# nb : 99.99
value = -1234.123
nb = convert_decimal_to_right(value, 4, 2)
# nb : -99.99
value = -1234.123
nb = convert_decimal_to_right(value, 4, 2, signed = False)
# nb : 0
value = 12.123
nb = convert_decimal_to_right(value, 8, 4)
# nb : 12.123
def trim_to_a_point(num, dec_point):
factor = 10**dec_point # number of points to trim
num = num*factor # multiple
num = int(num) # use the trimming of int
num = num/factor #divide by the same factor of 10s you multiplied
return num
#test
a = 14.1234567
trim_to_a_point(a, 5)
output
========
14.12345
multiple by 10^ decimal point you want
truncate with int() method
divide by the same number you multiplied before
done!
Just posted this for educational reasons i think it is correct though :)

Categories