Floating Point Concepts in Python

Why does -22/10 return -3 in Python? Any pointers regarding this would be helpful.

Because it's integer division by default, and integer division rounds towards minus infinity. Take a look:
>>> -22/10
-3
>>> -22/10.0
-2.2000000000000002
Positive:
>>> 22/10
2
>>> 22/10.0
2.2000000000000002
Regarding the seeming "inaccuracy" of floating point, this is a great article to read: Why are floating point calculations so inaccurate?

PEP 238, "Changing the Division Operator", explains the issues well, I think. In brief: when Python was designed, it adopted the "integer result" meaning for / between integers, simply because most other programming languages had done so ever since the first FORTRAN compiler was launched in 1957 (all-uppercase language name and all ;-). (One widespread language that didn't adopt this meaning, using / to produce a floating point result and div for the integer quotient, was Pascal.) Note that Python's integer division is floor division, rounding towards minus infinity, unlike C's, which truncates towards zero.
In 2001 it was decided that this choice was not optimal (to quote the PEP, "This makes expressions expecting float or complex results error-prone when integers are not expected but possible as inputs"), and to switch to a new operator // for floor division, changing the meaning of / to produce a float result ("true division").
You can explicitly request this behavior by putting the statement
from __future__ import division
at the start of a module (the command-line switch -Q to the Python interpreter can also control the behavior of division). Without such an "import from the future" (or the command-line switch), Python 2.x, for all values of x, always uses "classic division" (i.e., / between ints is floor division).
Python 3, however, always uses "true division" (/ between ints produces a float).
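For illustration (this example is mine, not part of the original answer), here is how the two operators treat the numbers from the question in a Python 3 interpreter:
>>> -22 / 10     # true division: always a float
-2.2
>>> -22 // 10    # floor division: rounds towards minus infinity
-3
>>> 22 // 10
2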
Note a curious corollary (in Python 3)...:
>>> from fractions import Fraction
>>> Fraction(1/2)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Library/Frameworks/Python.framework/Versions/3.1/lib/python3.1/fractions.py", line 100, in __new__
raise TypeError("argument should be a string "
TypeError: argument should be a string or a Rational instance
since / produces a float, it's not acceptable as the argument to Fraction (otherwise precision might be silently lost). You must use a string, or pass numerator and denominator as separate arguments:
>>> Fraction(1, 2)
Fraction(1, 2)
>>> Fraction('1/2')
Fraction(1, 2)
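(A side note, added here: since Python 3.2, Fraction also accepts a float directly and converts it exactly, so any precision loss happens visibly in the literal rather than silently inside Fraction.)
>>> from fractions import Fraction
>>> Fraction(0.5)      # 0.5 is exactly representable in binary
Fraction(1, 2)
>>> Fraction(1.1)      # 1.1 is not, so you get the exact value of the stored double
Fraction(2476979795053773, 2251799813685248)
>>> Fraction('1.1')    # the string form gives the rational you probably meant
Fraction(11, 10)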
gmpy uses a different, more tolerant approach to building mpqs, its equivalent of Python 3 Fractions...:
>>> import gmpy
>>> gmpy.mpq(1/2)
mpq(1,2)
Specifically (see lines 3168 and following in the source), gmpy uses a Stern-Brocot tree to get the "best practical approximation" of the floating point argument as a rational (of course, this can mask a loss of precision).

By default, Python 2.x gives an integer result when both operands of an arithmetic operator are integers (Python 3.x does not; see above). However, there is a way to change this behaviour.
from __future__ import division
print(22/10)
Outputs
2.2
(Older interactive interpreters would display the full repr, 2.2000000000000002.)
Of course, a simpler way is to make one of the operands a float, as described in the previous answers.

Because you're doing integer division. If you write -22.0/10 instead, you'll get the float result you expect (-2.2).

This happens because integer division returns the floor of the quotient: equivalently, the quotient times the divisor is the largest multiple of the divisor that is no larger than the number you divided.
This is exactly why 22/10 gives 2: 10*2 = 20, which is the largest multiple of 10 that is no larger than 22.
When this goes negative, your operation becomes -22/10 and your result is -3. Applying the same logic as in the previous case, 10*(-3) = -30, which is the largest multiple of 10 that is no larger than -22.
This is why you get a slightly unexpected answer when dealing with negative numbers.
Hope that helps
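A minimal illustration (added for clarity) of the floor rule, together with the remainder that keeps the identity a == (a // b) * b + a % b:
>>> divmod(22, 10)     # quotient 2, remainder 2:  22 == 2*10 + 2
(2, 2)
>>> divmod(-22, 10)    # quotient -3, remainder 8: -22 == (-3)*10 + 8
(-3, 8)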

Related

Big number division leads to inexact result

I am testing a Solidity program, which deals with very large numbers (1e17).
The formula I'm testing is the following: int(1e17*(100-3)/(100-1))
WolframAlpha and the Solidity language tell me that it's equivalent to 97979797979797979.
The test fails however because Python returns 97979797979797984.
How can I get the right value with Python ?
The fractions module works well here, since division between two Fractions is exact, preserving the full precision of the result.
>>> from fractions import Fraction
>>> int(Fraction(10)**17*(100-3)/(100-1))
97979797979797979
(Here we avoid notation like 1e17 because it may resolve to an inexact float value.)
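Plain integer arithmetic is another option here (a sketch of mine, not from the original answer): write 1e17 as the exact integer 10**17 and do the multiplication before the floor division; the result is exact because Python ints have arbitrary precision:
>>> 10**17 * (100 - 3) // (100 - 1)
97979797979797979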

Get a decimal result with up to n digits of precision in Python

I have an app with some complex mathematical operations whose results are very small amounts, like 0.000028. If I perform a calculation such as (29631/1073741824) in JavaScript, it gives me 0.000027596019208431244, whereas the same calculation in Python gives me 0.0. How can I increase the number of decimal places in Python's results?
The problem is not in a lack of decimal points, it's that in Python 2 integer division produces an integer result. This means that 29631/1073741824 yields 0 rather than the float you are expecting. To work around this, you can use a float for either operand:
>>> 29631.0/1073741824
2.7596019208431244e-05
This changed in Python 3, where the division operator does the expected thing. You can use a from __future__ import to turn on the new behavior in Python 2:
>>> from __future__ import division
>>> 29631/1073741824
2.7596019208431244e-05
If you're using Python 3.x, plain / already returns a float, so you can simply write:
print(29631/1073741824)
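If the goal really is a configurable number of digits (rather than just avoiding integer division), the decimal module is one way to sketch it; the precision chosen below is arbitrary:
>>> from decimal import Decimal, getcontext
>>> getcontext().prec = 10                    # 10 significant digits; pick what you need
>>> Decimal(29631) / Decimal(1073741824)
Decimal('0.00002759601921')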

1 == 2 for large numbers of 1

I'm wondering what causes this behaviour. I haven't been able to find an answer that covers this. It is probably something simple and obvious, but it is not to me. I am using Python 2.7.3 on Ubuntu.
In [1]: 2 == 1.9999999999999999
Out[1]: True
In [2]: 2 == 1.999999999999999
Out[2]: False
EDIT:
To clarify my question: is there a documented maximum number of 9's for which Python will evaluate the expression above as equal to 2?
Python uses floating point representation
A floating point value is actually a fixed-width binary number (called the "significand") plus a small integer that tells you how many powers of two to shift that value by (the "exponent"), plus a sign bit. Just like scientific notation, but in base 2 instead of 10.
The closest 64 bit floating point value to 1.9999999999999999 is 2.0, because 64 bit floating point values (so-called "double precision") use 52 explicit bits of significand (53 counting the implicit leading bit), which is equivalent to about 15-16 decimal places. So the literal 1.9999999999999999 is just another way of writing 2.0. However, the closest value to 1.999999999999999 is less than 2.0 (I think it's 1.9999999999999988897769753748434595763683319091796875 exactly, but I'm too lazy to check that's correct; I'm just relying on Python's formatting code to be exact).
I don't actually know whether the use specifically of 64 bit floats is required by the Python language, or is an implementation detail of CPython. But whatever size is used, the important thing is not specifically the number of decimal places, it is where the closest floating-point value of that size lies to your decimal literal. It will be closer for some literals than others.
Hence, 1.9999999999999999 == 2 for the same reason that 2.0 == 2 (Python allows mixed-type numeric operations including comparison, and the integer 2 is equal to the float 2.0). Whereas 1.999999999999999 != 2.
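If you want to poke at this yourself, a small sketch (math.nextafter requires Python 3.9+):
>>> 1.9999999999999999 == 2.0     # this literal rounds to exactly 2.0
True
>>> import math
>>> math.nextafter(2.0, 0.0)      # the largest float strictly below 2.0
1.9999999999999998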
Type coercion
>>> 2 == 2.0
True
And a consequence of the maximum number of decimal digits a float can reliably represent in Python:
>>> import sys
>>> sys.float_info.dig
15
>>> 1.9999999999999999
2.0
More from the docs:
>>> float('9876543211234567')
9876543211234568.0
Note the ...68 at the end instead of the expected ...67.
This is due to the way floats are implemented in Python. To keep it short and simple: since floats are almost always an approximation, and thus have more digits than most people find useful, the Python interpreter displays a rounded value.
In more detail, floats are stored in binary. This means that they're stored as fractions to the base 2, unlike decimal, where you can write a value as fractions to the base 10. Most decimal fractions don't have an exact representation in binary, so they are typically stored with a precision of 53 bits. This can lead to surprising results in arithmetic operations, e.g.:
>>> 0.1 + 0.2
0.30000000000000004
>>> round(2.675, 2)
2.67
See the docs on floating point arithmetic as well.
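One way to see the exact value a float literal actually stores (illustration added here; the digits are for a standard IEEE-754 double):
>>> from decimal import Decimal
>>> Decimal(0.1)    # the double nearest to 0.1, shown exactly
Decimal('0.1000000000000000055511151231257827021181583404541015625')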
Mathematically speaking, 2.0 does equal 1.9999... forever. They are two different ways of writing the same number.
However, in software, it's important to never compare two floats or decimals for equality - instead, subtract them, take the absolute value, and verify that the (always positive) difference is sufficiently low for your purposes.
E.g.:
if abs(value1 - value2) < 1e-10:
    ...  # they are close enough
else:
    ...  # they are not
You should probably set EPSILON = 1e-10 and use the symbolic constant instead of scattering 1e-10 throughout your code, or better still use a comparison function.
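The standard library has had such a comparison function built in since Python 3.5: math.isclose. A sketch of using it instead of a hand-rolled epsilon:
>>> import math
>>> math.isclose(0.1 + 0.2, 0.3)                   # default relative tolerance of 1e-09
True
>>> math.isclose(0.1 + 0.2, 0.3, abs_tol=1e-12)    # add an absolute tolerance for values near zero
True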

Rounding ** 0.5 and math.sqrt

In Python, is either of
n**0.5 # or
math.sqrt(n)
exact when the number is a perfect square? Specifically, should I worry that when I use
int(n**0.5) # instead of
int(n**0.5 + 0.000000001)
I might accidentally end up with the number one less than the actual square root due to precision error?
As several answers have suggested integer arithmetic, I'll recommend the gmpy2 library. It provides functions for checking if a number is a perfect power, calculating integer square roots, and integer square root with remainder.
>>> import gmpy2
>>> gmpy2.is_power(9)
True
>>> gmpy2.is_power(10)
False
>>> gmpy2.isqrt(10)
mpz(3)
>>> gmpy2.isqrt_rem(10)
(mpz(3), mpz(1))
Disclaimer: I maintain gmpy2.
Yes, you should worry:
In [11]: int((100000000000000000000000000000000000**2) ** 0.5)
Out[11]: 99999999999999996863366107917975552L
In [12]: int(math.sqrt(100000000000000000000000000000000000**2))
Out[12]: 99999999999999996863366107917975552L
obviously adding the 0.000000001 doesn't help here either...
As @DSM points out, you can use the decimal library:
In [21]: from decimal import Decimal
In [22]: x = Decimal('100000000000000000000000000000000000')
In [23]: (x ** 2).sqrt() == x
Out[23]: True
For very large numbers (over 10**999999999 or so), provided you keep a check on the precision (which is configurable), it'll raise an error rather than give an incorrect answer...
Both **0.5 and math.sqrt() perform the calculation using floating point arithmetic. The input is converted to float before the square root is calculated.
Do these calculations recognize when the input value is a perfect square?
No, they do not. Floating point arithmetic has no concept of perfect squares.
Large integers may not be representable: when a number has more significant digits than the floating point mantissa can hold, it cannot be stored exactly. It's easy to see, therefore, that for non-representable input values n**0.5 may be inaccurate. And your proposed fix of adding a small value will not, in general, fix the problem.
If your input is an integer then you should consider performing your calculation using integer arithmetic. That ultimately is the right way to deal with this.
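As a concrete sketch of that integer-arithmetic approach using only the standard library (math.isqrt is available from Python 3.8):
>>> import math
>>> x = 100000000000000000000000000000000000
>>> math.isqrt(x ** 2) == x        # exact integer square root, no floats involved
True
>>> math.isqrt(x ** 2 - 1) == x    # one less is not a perfect square; isqrt floors to x - 1
False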
You can use round(number, ndigits) before converting to an int; for the record, int() truncates towards zero when converting a float to an integer.
In any case, since python uses floating point arithmetic, all the pitfalls apply. See:
http://docs.python.org/2/tutorial/floatingpoint.html
Perfect-square values will have no fractional components, so your main worry would be very large values, where a difference of 1 or 2 is significant. For those you'll want a numerical library that supports high precision (as DSM mentions, the decimal library, standard since Python 2.4, should be able to do what you want, as it supports arbitrary precision).
http://docs.python.org/library/decimal.html
sqrt is one of the easier math library functions to implement, and any math library of reasonable quality will implement it with faithful rounding (sub-ULP accuracy). If the input is a perfect square, its square root is representable (in a reasonable floating-point format). In this case, faithful rounding guarantees the result is exact.
This addresses only the value actually passed to sqrt. Whether a number can be converted without error from another format to the floating-point input for sqrt is a separate issue.
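A small illustration of that distinction (added here; it assumes a standard IEEE-754 double and a correctly rounded sqrt, which CPython provides on common platforms):
>>> import math
>>> math.sqrt(12345 ** 2) == 12345    # 152399025 fits exactly in a double, so the root is exact
True
>>> math.sqrt(12345678901234567 ** 2) == 12345678901234567
False
In the second case both the square and the root need more than 53 significant bits, so precision is lost in the conversion to float before sqrt even runs.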

Why do Python's math.ceil() and math.floor() operations return floats instead of integers?

Can someone explain this (straight from the docs- emphasis mine):
math.ceil(x) Return the ceiling of x as a float, the smallest integer value greater than or equal to x.
math.floor(x) Return the floor of x as a float, the largest integer value less than or equal to x.
Why would .ceil and .floor return floats when they are by definition supposed to calculate integers?
EDIT:
Well, this got some very good arguments as to why they should return floats, and I was just getting used to the idea when @jcollado pointed out that they in fact do return ints in Python 3...
As pointed out by other answers, in Python 2 they return floats, probably for historical reasons (to prevent overflow problems). However, they return integers in Python 3.
>>> import math
>>> type(math.floor(3.1))
<class 'int'>
>>> type(math.ceil(3.1))
<class 'int'>
You can find more information in PEP 3141.
The range of floating point numbers usually exceeds the range of integers. By returning a floating point value, the functions can return a sensible value for input values that lie outside the representable range of integers.
Consider: If floor() returned an integer, what should floor(1.0e30) return?
While Python's integers are now arbitrary precision, that wasn't always the case. The standard library functions are thin wrappers around the equivalent C library functions.
Because Python's math library is a thin wrapper around the C math library, which returns floats.
The source of your confusion is evident in your comment:
The whole point of ceil/floor operations is to convert floats to integers!
The point of the ceil and floor operations is to round floating-point data to integral values. Not to do a type conversion. Users who need to get integer values can do an explicit conversion following the operation.
Note that it would not be possible to implement rounding to an integral value as trivially if all you had available were a ceil or floor operation that returned an integer. You would need to first check that the input is within the representable integer range, then call the function; and you would need to handle NaN and infinities in a separate code path.
Additionally, you must have versions of ceil and floor which return floating-point numbers if you want to conform to IEEE 754.
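To make the trade-off concrete, here is what Python 3's int-returning versions actually do with the special values (observed behavior; tracebacks abbreviated):
>>> import math
>>> math.floor(2.5)
2
>>> math.floor(float('inf'))
Traceback (most recent call last):
  ...
OverflowError: cannot convert float infinity to integer
>>> math.floor(float('nan'))
Traceback (most recent call last):
  ...
ValueError: cannot convert float NaN to integer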
Before Python 2.4, an integer couldn't hold the full range of truncated real numbers.
http://docs.python.org/whatsnew/2.4.html#pep-237-unifying-long-integers-and-integers
Because the range for floats is greater than that of integers -- returning an integer could overflow
This is a very interesting question! Since a float requires some bits to store the exponent (=bits_for_exponent), any floating point number greater than 2**(float_size - bits_for_exponent) will always be an integral value! At the other extreme, a float with a negative exponent will floor or ceil to one of 1, 0 or -1. This makes the discussion of integer range versus float range moot, because these functions will simply return the original number whenever the number is outside the range of the integer type. The Python functions are wrappers of the C functions, so this is really a deficiency of the C functions, where they should have returned an integer and forced the programmer to do the range/NaN/Inf check before calling ceil/floor.
Thus the logical answer is that the only time these functions are useful is when they would return a value within integer range, so the fact that they return a float is a mistake, and you are very smart for realizing this!
Maybe because other languages do this as well, so it is generally-accepted behavior. (For good reasons, as shown in the other answers)
This totally caught me off guard recently. This is because I've programmed in C since the 1970's and I'm only now learning the fine details of Python. Like this curious behavior of math.floor().
The math library of Python is how you access the C standard math library. And the C standard math library is a collection of floating point numerical functions, like sin(), cos() and sqrt(). The floor() function in the context of numerical calculations has ALWAYS returned a float. For 50 YEARS now. It's part of the standards for numerical computation. For those of us familiar with the math library of C, we don't understand it to be just "math functions". We understand it to be a collection of floating-point algorithms. It would be better named something like NFPAL - Numerical Floating Point Algorithms Library. :)
Those of us that understand the history instantly see the python math module as just a wrapper for the long-established C floating-point library. So we expect without a second thought, that math.floor() is the same function as the C standard library floor() which takes a float argument and returns a float value.
The use of floor() as a numerical math concept goes back to 1798 per the Wikipedia page on the subject: https://en.wikipedia.org/wiki/Floor_and_ceiling_functions#Notation
It has never been a computer-science convert-floating-point-to-integer-storage-format function, even though logically it's a similar concept.
The floor() function in this context has always been a floating-point numerical calculation as all(most) the functions in the math library. Floating-point goes beyond what integers can do. They include the special values of +inf, -inf, and Nan (not a number) which are all well defined as to how they propagate through floating-point numerical calculations. Floor() has always CORRECTLY preserved values like Nan and +inf and -inf in numerical calculations. If Floor returns an int, it totally breaks the entire concept of what the numerical floor() function was meant to do. math.floor(float("nan")) must return "nan" if it is to be a true floating-point numerical floor() function.
When I recently saw a Python education video telling us to use:
i = math.floor(12.34/3)
to get an integer I laughed to myself at how clueless the instructor was. But before writing a snarkish comment, I did some testing and to my shock, I found the numerical algorithms library in Python was returning an int. And even stranger, what I thought was the obvious answer to getting an int from a divide, was to use:
i = 12.34 // 3
Why not use the built-in integer divide to get the integer you are looking for! From my C background, it was the obvious right answer. But lo and behold, integer divide in Python returns a FLOAT in this case! Wow! What a strange upside-down world Python can be.
A better answer in Python is that if you really NEED an int type, you should just be explicit and ask for int in python:
i = int(12.34/3)
Keep in mind, however, that floor() rounds towards negative infinity while int() rounds towards zero, so they give different answers for negative numbers. If negative values are possible, you must use the function that gives the results you need for your application.
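A quick illustration of that difference, using a negated version of the 12.34/3 example above:
>>> import math
>>> math.floor(-12.34 / 3)    # floor rounds towards negative infinity
-5
>>> int(-12.34 / 3)           # int() truncates towards zero
-4
>>> -12.34 // 3               # // also floors, but keeps the float type here
-5.0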
Python, however, is a different beast, for good reasons. It's trying to address a different problem set than C. The dynamic typing of Python is great for fast prototyping and development, but it can create some very complex and hard-to-find bugs when code that was tested with one type of object, like floats, fails in subtle ways when passed an int argument. And because of this, a lot of interesting choices were made for Python that put the need to minimize surprise errors above other historic norms.
Changing the divide to always return a float (or some form of non int) was a move in the right direction for this. And in this same light, it's logical to make // be a floor(a/b) function, and not an "int divide".
Making float division by zero raise an exception instead of returning float("inf") is likewise wise because, in MOST Python code, a divide by zero is not a numerical calculation but a programming bug where the math is wrong or there is an off-by-one error. It's more important for average Python code to catch that bug when it happens, instead of propagating a hidden error in the form of an "inf" which causes a blow-up miles away from the actual bug.
And as long as the rest of the language is doing a good job of casting ints to floats when needed, such as in divide, or math.sqrt(), it's logical to have math.floor() return an int, because if it is needed as a float later, it will be converted correctly back to a float. And if the programmer needed an int, well then the function gave them what they needed. math.floor(a/b) and a//b should act the same way, but the fact that they don't I guess is just a matter of history not yet adjusted for consistency. And maybe too hard to "fix" due to backward compatibility issues. And maybe not that important???
In Python, if you want to write hard-core numerical algorithms, the correct answer is to use NumPy and SciPy, not the built-in Python math module.
import numpy as np
nan = np.float64(0.0) / 0.0 # gives a warning and returns float64 nan
nan = np.floor(nan) # returns float64 nan
Python is different, for good reasons, and it takes a bit of time to understand it. And we can see in this case, the OP, who didn't understand the history of the numerical floor() function, needed and expected it to return an int from their thinking about mathematical integers and reals. Now Python is doing what our mathematical (vs computer science) training implies. Which makes it more likely to do what a beginner expects it to do while still covering all the more complex needs of advanced numerical algorithms with NumPy and SciPy. I'm constantly impressed with how Python has evolved, even if at times I'm totally caught off guard.
