When I convert a float to decimal.Decimal in Python and afterwards call Decimal.shift it may give me completely wrong and unexpected results depending on the float. Why is this the case?
Converting 123.5 and shifting it:
from decimal import Decimal
a = Decimal(123.5)
print(a.shift(1)) # Gives expected result
The code above prints the expected result of 1235.0.
If I instead convert and shift 123.4:
from decimal import Decimal
a = Decimal(123.4)
print(a.shift(1)) # Gives UNexpected result
it gives me 3.418860808014869689941406250E-18 (approx. 0) which is completely unexpected and wrong.
Why is this the case?
Note:
I understand the floating-point imprecision because of the representation of floats in memory. However, I can't explain why this should give me such a completely wrong result.
Edit:
Yes, in general it would be best to not convert the floats to decimals but convert strings to decimals instead. However, this is not the point of my question. I want to understand why the shifting after float conversion gives such a completely wrong result. So if I print(Decimal(123.4)) it gives 123.40000000000000568434188608080148696899414062 so after shifting I would expect it to be 1234.0000000000000568434188608080148696899414062 and not nearly zero.
You need to change the Decimal constructor input to use strings instead of floats.
a = Decimal('123.5')
print(a.shift(1))
a = Decimal('123.4')
print(a.shift(1))
or
a = Decimal(str(123.5))
print(a.shift(1))
a = Decimal(str(123.4))
print(a.shift(1))
The output will be as expected.
>>> 1235.0
>>> 1234.0
Decimal instances can be constructed from integers, strings, floats, or tuples. Construction from an integer or a float performs an exact conversion of the value of that integer or float.
For floats, Decimal calls Decimal.from_float()
Note that Decimal.from_float(0.1) is not the same as Decimal('0.1'). Since 0.1 is not exactly representable in binary floating point, the value is stored as the nearest representable value which is 0x1.999999999999ap-4. The exact equivalent of the value in decimal is 0.1000000000000000055511151231257827021181583404541015625.
Internally, the Python decimal library converts a float into two integers representing the numerator and denominator of a fraction that yields the float.
n, d = abs(123.4).as_integer_ratio()
It then calculates the bit length of the denominator, which is the number of bits required to represent the number in binary.
k = d.bit_length() - 1
And then from there the bit length k is used to record the coefficient of the decimal number by multiplying the numerator * 5 to the power of the bit length of the denominator.
coeff = str(n*5**k)
The resulting values are used to create a new Decimal object with constructor arguments of sign, coefficient, and exponent using this values.
For the float 123.5 these values are
>>> 1 1235 -1
and for the float 123.4 these values are
1 123400000000000005684341886080801486968994140625 -45
So far, nothing is amiss.
However when you call shift, the Decimal library has to calculate how much to pad the number with zeroes based on the shift you've specified. To do this internally it takes the precision subtracted by length of the coefficient.
amount_to_pad = context.prec - len(coeff)
The default precision is only 28 and with a float like 123.4 the coefficient becomes much longer than the default precision as noted above. This creates a negative amount to pad with zeroes and makes the number very tiny as you noted.
A way around this is to increase the precision to the length of the exponent + the length of the number you started with (45 + 4).
from decimal import Decimal, getcontext
getcontext().prec = 49
a = Decimal(123.4)
print(a)
print(a.shift(1))
>>> 123.400000000000005684341886080801486968994140625
>>> 1234.000000000000056843418860808014869689941406250
The documentation for shift hints that the precision is important for this calculation:
The second operand must be an integer in the range -precision through precision.
However it does not explain this caveat for floats that don't play nice with memory limitations.
I would expect this to raise some kind of error and prompt you to change your input or increase the precision, but at least you know!
#MarkDickinson noted in a comment above that you can view this Python bug tracker for more information: https://bugs.python.org/issue7233
I'm trying to convert strings of numbers that come from the output of another program into floating point numbers with two forced decimal places (including trailing zeros).
Right now I'm converting the strings to floats, then separately specifying precision (two decimal places), then converting back to float to do numeral comparisons on later.
# convert to float
float1 = float(output_string[6])
# this doesn't guarantee two decimal places in my output
# eg: -36.55, -36.55, -40.34, -36.55, -35.7 (no trailing zero on the last number)
nice_float = float('{0:.2f}'.format(float1))
# this works but then I later need to convert back into a float
# string->float->string->float is not super clean
nice_string = '{0:.2f}'.format(float1)
Edit for clarity:
I have a problem with the display in that I need that to show exactly two decimal places.
Is there a way to convert a string to a floating point number rounded to two decimal places that's cleaner than my implementation which involves converting a string to a float, then the float back into a formatted string?
How would I convert:
s='8.833167174e+11' (str)
==> 883316717400 (int)
I tried doing int(s) or some other 'casting' but it wasn't effective.
As your string is a float digit you need to first convert it to float:
>>> int(float(s))
883316717400
float([x])
Return a floating point number constructed from a number or string x.
If the argument is a string, it must contain a possibly signed decimal or floating point number, possibly embedded in whitespace. The argument may also be [+|-]nan or [+|-]inf. Otherwise, the argument may be a plain or long integer or a floating point number, and a floating point number with the same value (within Python’s floating point precision) is returned. If no argument is given, returns 0.0.
Here is example
In > int('1.5')
Out > 1
In > int('10.5')
Out > 10
But I want to keep values intact. How do you do it?
Integers are only numbers that have no decimals.
-4,-3,-2,-1,0,1,2,3,4,...,65535 etc...
Floating point numbers or Decimal numbers are allowed to represent fractions and more precise numbers
10.5, 4.9999999
If you want to take a string and get a numerical type for non-whole numbers, use float()
float('10.5')
Here is a very simple elementary school explanation of integers
Here is the python documentation of numerical types
foo = 10.5
foo2 = int(foo)
print foo, foo2
10.5, 10
Integer can one represent whole number.
If you have a known consistent number of digest after the the comma, I recommend multiplying the number by 10 to the power of X.
Or round the number to the nearest whole number
Python's math module contain handy functions like floor & ceil. These functions take a floating point number and return the nearest integer below or above it. However these functions return the answer as a floating point number. For example:
import math
f=math.floor(2.3)
Now f returns:
2.0
What is the safest way to get an integer out of this float, without running the risk of rounding errors (for example if the float is the equivalent of 1.99999) or perhaps I should use another function altogether?
All integers that can be represented by floating point numbers have an exact representation. So you can safely use int on the result. Inexact representations occur only if you are trying to represent a rational number with a denominator that is not a power of two.
That this works is not trivial at all! It's a property of the IEEE floating point representation that int∘floor = ⌊⋅⌋ if the magnitude of the numbers in question is small enough, but different representations are possible where int(floor(2.3)) might be 1.
To quote from Wikipedia,
Any integer with absolute value less than or equal to 224 can be exactly represented in the single precision format, and any integer with absolute value less than or equal to 253 can be exactly represented in the double precision format.
Use int(your non integer number) will nail it.
print int(2.3) # "2"
print int(math.sqrt(5)) # "2"
You could use the round function. If you use no second parameter (# of significant digits) then I think you will get the behavior you want.
IDLE output.
>>> round(2.99999999999)
3
>>> round(2.6)
3
>>> round(2.5)
3
>>> round(2.4)
2
Combining two of the previous results, we have:
int(round(some_float))
This converts a float to an integer fairly dependably.
That this works is not trivial at all! It's a property of the IEEE floating point representation that int∘floor = ⌊⋅⌋ if the magnitude of the numbers in question is small enough, but different representations are possible where int(floor(2.3)) might be 1.
This post explains why it works in that range.
In a double, you can represent 32bit integers without any problems. There cannot be any rounding issues. More precisely, doubles can represent all integers between and including 253 and -253.
Short explanation: A double can store up to 53 binary digits. When you require more, the number is padded with zeroes on the right.
It follows that 53 ones is the largest number that can be stored without padding. Naturally, all (integer) numbers requiring less digits can be stored accurately.
Adding one to 111(omitted)111 (53 ones) yields 100...000, (53 zeroes). As we know, we can store 53 digits, that makes the rightmost zero padding.
This is where 253 comes from.
More detail: We need to consider how IEEE-754 floating point works.
1 bit 11 / 8 52 / 23 # bits double/single precision
[ sign | exponent | mantissa ]
The number is then calculated as follows (excluding special cases that are irrelevant here):
-1sign × 1.mantissa ×2exponent - bias
where bias = 2exponent - 1 - 1, i.e. 1023 and 127 for double/single precision respectively.
Knowing that multiplying by 2X simply shifts all bits X places to the left, it's easy to see that any integer must have all bits in the mantissa that end up right of the decimal point to zero.
Any integer except zero has the following form in binary:
1x...x where the x-es represent the bits to the right of the MSB (most significant bit).
Because we excluded zero, there will always be a MSB that is one—which is why it's not stored. To store the integer, we must bring it into the aforementioned form: -1sign × 1.mantissa ×2exponent - bias.
That's saying the same as shifting the bits over the decimal point until there's only the MSB towards the left of the MSB. All the bits right of the decimal point are then stored in the mantissa.
From this, we can see that we can store at most 52 binary digits apart from the MSB.
It follows that the highest number where all bits are explicitly stored is
111(omitted)111. that's 53 ones (52 + implicit 1) in the case of doubles.
For this, we need to set the exponent, such that the decimal point will be shifted 52 places. If we were to increase the exponent by one, we cannot know the digit right to the left after the decimal point.
111(omitted)111x.
By convention, it's 0. Setting the entire mantissa to zero, we receive the following number:
100(omitted)00x. = 100(omitted)000.
That's a 1 followed by 53 zeroes, 52 stored and 1 added due to the exponent.
It represents 253, which marks the boundary (both negative and positive) between which we can accurately represent all integers. If we wanted to add one to 253, we would have to set the implicit zero (denoted by the x) to one, but that's impossible.
If you need to convert a string float to an int you can use this method.
Example: '38.0' to 38
In order to convert this to an int you can cast it as a float then an int. This will also work for float strings or integer strings.
>>> int(float('38.0'))
38
>>> int(float('38'))
38
Note: This will strip any numbers after the decimal.
>>> int(float('38.2'))
38
math.floor will always return an integer number and thus int(math.floor(some_float)) will never introduce rounding errors.
The rounding error might already be introduced in math.floor(some_large_float), though, or even when storing a large number in a float in the first place. (Large numbers may lose precision when stored in floats.)
Another code sample to convert a real/float to an integer using variables.
"vel" is a real/float number and converted to the next highest INTEGER, "newvel".
import arcpy.math, os, sys, arcpy.da
.
.
with arcpy.da.SearchCursor(densifybkp,[floseg,vel,Length]) as cursor:
for row in cursor:
curvel = float(row[1])
newvel = int(math.ceil(curvel))
Since you're asking for the 'safest' way, I'll provide another answer other than the top answer.
An easy way to make sure you don't lose any precision is to check if the values would be equal after you convert them.
if int(some_value) == some_value:
some_value = int(some_value)
If the float is 1.0 for example, 1.0 is equal to 1. So the conversion to int will execute. And if the float is 1.1, int(1.1) equates to 1, and 1.1 != 1. So the value will remain a float and you won't lose any precision.