Python3 Infinity/NaN: Decimal vs. float - python

Given (Python3):
>>> float('inf') == Decimal('inf')
True
>>> float('-inf') <= float('nan') <= float('inf')
False
>>> float('-inf') <= Decimal(1) <= float('inf')
True
Why are the following invalid? I have read Special values.
Invalid
>>> Decimal('-inf') <= Decimal('nan') <= Decimal('inf')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
decimal.InvalidOperation: [<class 'decimal.InvalidOperation'>]
>>> Decimal('-inf') <= float('nan') <= Decimal('inf')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
decimal.InvalidOperation: [<class 'decimal.InvalidOperation'>]
>>> float('-inf') <= Decimal('nan') <= float('inf')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
decimal.InvalidOperation: [<class 'decimal.InvalidOperation'>]

From the decimal.py source code:
# Note: The Decimal standard doesn't cover rich comparisons for
# Decimals. In particular, the specification is silent on the
# subject of what should happen for a comparison involving a NaN.
# We take the following approach:
#
# == comparisons involving a quiet NaN always return False
# != comparisons involving a quiet NaN always return True
# == or != comparisons involving a signaling NaN signal
# InvalidOperation, and return False or True as above if the
# InvalidOperation is not trapped.
# <, >, <= and >= comparisons involving a (quiet or signaling)
# NaN signal InvalidOperation, and return False if the
# InvalidOperation is not trapped.
#
# This behavior is designed to conform as closely as possible to
# that specified by IEEE 754.
And from the Special values section you say you read:
An attempt to compare two Decimals using any of the <, <=, > or >= operators will raise the InvalidOperation signal if either operand is a NaN, and return False if this signal is not trapped.
Note that IEEE 754 uses NaN as a floating point exception value; e.g. you did something that cannot be computed and you got an exception instead. It is a signal value and should be seen as an error, not something to compare other floats against, which is why in the IEEE 754 standard it is unequal to anything else.
Moreover, the Special values section mentions:
Note that the General Decimal Arithmetic specification does not specify the behavior of direct comparisons; these rules for comparisons involving a NaN were taken from the IEEE 854 standard (see Table 3 in section 5.7).
and looking at IEEE 854 section 5.7 we find:
In addition to the true/false response, an invalid operation exception (see 7.1) shall be signaled
when, as indicated in the last column of Table 3, “unordered” operands are compared using one of the predicates
involving “<” or “>” but not “?.” (Here the symbol “?” signifies “unordered.” )
with comparisons with NaN classified as unordered.
By default InvalidOperation is trapped, so a Python exception is raised when using <= and >= against Decimal('NaN'). This is a logical extension; Python has actual exceptions so if you compare against the NaN exception value, you can expect an exception being raised.
You could disable trapping by using a Decimal.localcontext():
>>> from decimal import localcontext, Decimal, InvalidOperation
>>> with localcontext() as ctx:
... ctx.traps[InvalidOperation] = 0
... Decimal('-inf') <= Decimal('nan') <= Decimal('inf')
...
False

Related

How to convert bin to float using 64bit IEE754, [duplicate]

I want to convert a binary number into a float number. Here's an example of a possibility:
>>> float(-0b1110)
gives me the correct output:
-14.0
Unfortunately, I am working with binary strings, i.e., I need something like float('-0b1110').
However, this doesn't work:
>>> float('-0b1110')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: invalid literal for float(): -0b1110
I tried to use binascii.a2b_qp(string[, header]) which converts a block of quoted-printable data back to binary and returns the binary data. But eventually, I get the same error:
>>> float(binascii.a2b_qp('-0b1110'))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: invalid literal for float(): -0b1110
I understand the cases where the output number is an integer but what if I want to obtain the number 12.546? What would the function call for the binary string look like then?
In one of your comments you indicated that the binary number represents a float in 8 byte long IEEE 754 binary64 format. However that is inconsistent with the -0b1110 value you showed as an example, so I've ignored it and used my own which is in the proper format as example input data for testing the answer shown below.
Essentially what is done is first the binary string is converted into an integer value, then next into a string of raw bytes which is passed to struct.unpack() for final conversion to a floating point value. The bin_to_float() function shown below drives the process. Although not illustrated, binary input string arguments can be prefixed with '0b'.
from codecs import decode
import struct
def bin_to_float(b):
""" Convert binary string to a float. """
bf = int_to_bytes(int(b, 2), 8) # 8 bytes needed for IEEE 754 binary64.
return struct.unpack('>d', bf)[0]
def int_to_bytes(n, length): # Helper function
""" Int/long to byte string.
Python 3.2+ has a built-in int.to_bytes() method that could be used
instead, but the following works in earlier versions including 2.x.
"""
return decode('%%0%dx' % (length << 1) % n, 'hex')[-length:]
def float_to_bin(value): # For testing.
""" Convert float to 64-bit binary string. """
[d] = struct.unpack(">Q", struct.pack(">d", value))
return '{:064b}'.format(d)
if __name__ == '__main__':
for f in 0.0, 1.0, -14.0, 12.546, 3.141593:
print('Test value: %f' % f)
binary = float_to_bin(f)
print(' float_to_bin: %r' % binary)
floating_point = bin_to_float(binary) # Round trip.
print(' bin_to_float: %f\n' % floating_point)
Output:
Test value: 0.000000
float_to_bin: '0000000000000000000000000000000000000000000000000000000000000000'
bin_to_float: 0.000000
Test value: 1.000000
float_to_bin: '0011111111110000000000000000000000000000000000000000000000000000'
bin_to_float: 1.000000
Test value: -14.000000
float_to_bin: '1100000000101100000000000000000000000000000000000000000000000000'
bin_to_float: -14.000000
Test value: 12.546000
float_to_bin: '0100000000101001000101111000110101001111110111110011101101100100'
bin_to_float: 12.546000
Test value: 3.141593
float_to_bin: '0100000000001001001000011111101110000010110000101011110101111111'
bin_to_float: 3.141593
This works for me.
Tested with Python3.4:
def float_to_bin(num):
return bin(struct.unpack('!I', struct.pack('!f', num))[0])[2:].zfill(32)
def bin_to_float(binary):
return struct.unpack('!f',struct.pack('!I', int(binary, 2)))[0]
float_to_bin(bin_to_float(float_to_bin(123.123))) == float_to_bin(123.123)
>>> True
Another option is to do
from ast import literal_eval
float_str = "-0b101010101"
result = float(literal_eval(float_str))
Unlike the built-in "eval", literal_eval is safe to be run even on user inputs, as it can only parse Python literals - and will not execute expressions, which means it will not call functions as well.
float(int('-0b1110',0))
That works for me.
If you have a 64-bit string that represents a floating point number rather than an integer, you can do a three-step conversion - the first step turns the string into an integer, the second converts it into an 8-byte string, and the third re-interprets those bits as a float.
>>> import struct
>>> s = '0b0100000000101001000101111000110101001111110111110011101101100100'
>>> q = int(s, 0)
>>> b8 = struct.pack('Q', q)
>>> struct.unpack('d', b8)[0]
12.546
Of course you can combine all those steps into a single line.
>>> s2 = '0b1100000000101100000000000000000000000000000000000000000000000000'
>>> struct.unpack('d', struct.pack('Q', int(s2, 0)))[0]
-14.0
You could use eval('') and then cast it as a float if needed. Example:
>> eval('-0b1110')
-14
>> float(eval('-0b1110'))
-14.0
You can convert a binary number in string form to an int by setting the base to 2 in the built-in int([x[, base]]) function, however you need to get rid of the 0b first. You can then pass the result into float() to get your final result:
>>> s = '-0b1110'
>>> float(int(s.replace('0b', ''), 2))
-14.0
edit: Apparently getting rid of the 0b is only necessary on Python 2.5 and below, Mark's answer works fine for me on Python 2.6 but here is what I see on Python 2.5:
>>> int('-0b1110', 2)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 2: '-0b1110'
def bin_to_float(binary):
di = binary.find('.')
if di == -1:
return int(binary,2)
else:
whole = binary[:di] or '0'
fraction = binary [di+1:] or '0'
return int(whole,2) + int(whole,2)/abs(int(whole,2)) * int(fraction,2) / 2** len(fraction)
samples = [
'-0b1110.010101011',
'-0b1001110',
'101',
'0b1110010.11010111011',
'1100.',
'-00011',
'+0b1100.0011',
]
for binary in samples:
print(binary,'=',bin_to_float(binary))
Returns
-0b1110.010101011 = -14.333984375
-0b1001110 = -78
101 = 5
0b1110010.11010111011 = 114.84130859375
1100. = 12.0
-00011 = -3
+0b1100.0011 = 12.1875

Pandas bitwise comparisons throws exception when using multiple conditions

I am working with a large data, and I want to extract a subset.
In SQL representation this is what I want to achieve. I would like to do this using pandas/numpy.
select * from Data where cpty_type = 'INTERBRANCH' and (settlementDate >= '2017-04-18 00:00:00.000' or settlementDate = '1899-12-30 00:00:00.000'))
These two statements on their own work:
#1. unionX1 = data[data.cpty_type == 'INTERBRANCH']
#2. unionX1 = data[data.settlementDate >= '2017-04-18 00:00:00.000']
My versions (combining both does not work):
unionX1 = data[data.cpty_type == 'INTERBRANCH' & (data.settlementDate >= '2017-04-18' | data.settlementDate == '2017-04-18')]
I get the following exception when I run it:
I think it is cause by the bit-wise comparison
Any suggestion on what I am doing wrong here?
Traceback (most recent call last):
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\ops.py", line 877, in na_op
result = op(x, y)
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\ops.py", line 127, in <lambda>
ror_=bool_method(lambda x, y: operator.or_(y, x),
TypeError: ufunc 'bitwise_or' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\ops.py", line 895, in na_op
result = lib.scalar_binop(x, y, op)
File "pandas\lib.pyx", line 912, in pandas.lib.scalar_binop (pandas\lib.c:16177)
ValueError: cannot include dtype 'M' in a buffer
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/Karunyan/PycharmProjects/RECON/criteria/distinct_matched_trades.py", line 18, in <module>
unionX1 = data[data.cpty_type == 'INTERBRANCH' & (data.settlementDate >= '2017-04-18' | data.settlementDate == '1899-12-30')]
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\ops.py", line 929, in wrapper
na_op(self.values, other),
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\ops.py", line 899, in na_op
x.dtype, type(y).__name__))
TypeError: cannot compare a dtyped [datetime64[ns]] array with a scalar of type [bool]
In Python, bitwise operations like |, &, and ^ have higher precedence than comparison operations like <, >, ==, etc. You need to use parentheses in your expressions to force the correct evaluation order.
For example, if you write A < B & C < D, it will be evaluated as A < (B & C) < D which will produce an error in the case of Pandas series. You need to explicitly write (A < B) & (C < D) to make it work as you expect.
In your case, you can do this:
unionX1 = data[(data.cpty_type == 'INTERBRANCH') & ((data.settlementDate >= '2017-04-18') | (data.settlementDate == '2017-04-18'))]
You need to enclose multiple conditions in braces due to operator precedence and use the bitwise and (&) and or (|) operators:
unionX1 = data[(data.cpty_type == 'INTERBRANCH') &
((data.settlementDate >='2017-04-18') | (data.settlementDate =='2017-04-18'))]
If you want to sidestep the operator precedence oddity, you can use numpy's logical_or and logical_and functions.
unionX1 = data[logical_and(
data.cpty_type == 'INTERBRANCH',
np.logical_or(
data.settlementDate >= '2017-04-18',
data.settlementDate == '2017-04-18'
)
)]
The grouping is explicit, so you get the behavior you intend without having to remember the precedence of binary operators.

Is there any way to make Python .format() thow an exception if the data won't fit the field?

I want to normalize floating-point numbers to nn.nn strings, and to do some special handling if the number is out of range.
try:
norm = '{:5.2f}'.format(f)
except ValueError:
norm = 'BadData' # actually a bit more complex than this
except it doesn't work: .format silently overflows the 5-character width. Obviously I could length-check norm and raise my own ValueError, but have I missed any way to force format (or the older % formatting) to raise an exception on field-width overflow?
You can not achieve this with format(). You have to create your custom formatter which raises the exception. For example:
def format_float(num, max_int=5, decimal=2):
if len(str(num).split('.')[0])>max_int:
raise ValueError('Integer part of float can have maximum {} digits'.format(max_int))
return "{:.2f}".format(num)
Sample run:
>>> format_float(123.456)
'123.46'
>>> format_float(123.4)
'123.40'
>>> format_float(123789.456) # Error since integer part is having length more than 5
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 3, in format_float
ValueError: Integer part of float can have maximum 5 digits

Float value is equal -1.#IND

A function returns a list which contains of float values. If I plot this list, I see that some of the float values are equal -1.#IND. I also checked the type of those -1.#IND values. And they are also of float type.
But how can I understand this -1.#IND values? What do they represent or stand for?
-1.#IND means indefinite, the result of a floating point equation that doesn't have a solution. On other platforms, you'd get NaN instead, meaning 'not a number', -1.#IND is specific to Windows. On Python 2.5 on Linux I get:
>>> 1e300 * 1e300 * 0
-nan
You'll only find this on python versions 2.5 and before, on Windows platforms. The float() code was improved in python 2.6 and consistently uses float('nan') for such results; mostly because there was no way to turn 1.#INF and -1.#IND back into an actual float() instance again:
>>> repr(inf)
'1.#INF'
>>> float(_)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: invalid literal for float(): 1.#INF
>>> repr(nan)
'-1.#IND'
>>> float(_)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: invalid literal for float(): -1.#IND
On versions 2.6 and newer this has all been cleaned up and made consistent:
>>> 1e300 * 1e300 * 0
nan
>>> 1e300 * 1e300
inf
>>> 1e300 * 1e300 * -1
-inf

How to convert a binary (string) into a float value?

I want to convert a binary number into a float number. Here's an example of a possibility:
>>> float(-0b1110)
gives me the correct output:
-14.0
Unfortunately, I am working with binary strings, i.e., I need something like float('-0b1110').
However, this doesn't work:
>>> float('-0b1110')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: invalid literal for float(): -0b1110
I tried to use binascii.a2b_qp(string[, header]) which converts a block of quoted-printable data back to binary and returns the binary data. But eventually, I get the same error:
>>> float(binascii.a2b_qp('-0b1110'))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: invalid literal for float(): -0b1110
I understand the cases where the output number is an integer but what if I want to obtain the number 12.546? What would the function call for the binary string look like then?
In one of your comments you indicated that the binary number represents a float in 8 byte long IEEE 754 binary64 format. However that is inconsistent with the -0b1110 value you showed as an example, so I've ignored it and used my own which is in the proper format as example input data for testing the answer shown below.
Essentially what is done is first the binary string is converted into an integer value, then next into a string of raw bytes which is passed to struct.unpack() for final conversion to a floating point value. The bin_to_float() function shown below drives the process. Although not illustrated, binary input string arguments can be prefixed with '0b'.
from codecs import decode
import struct
def bin_to_float(b):
""" Convert binary string to a float. """
bf = int_to_bytes(int(b, 2), 8) # 8 bytes needed for IEEE 754 binary64.
return struct.unpack('>d', bf)[0]
def int_to_bytes(n, length): # Helper function
""" Int/long to byte string.
Python 3.2+ has a built-in int.to_bytes() method that could be used
instead, but the following works in earlier versions including 2.x.
"""
return decode('%%0%dx' % (length << 1) % n, 'hex')[-length:]
def float_to_bin(value): # For testing.
""" Convert float to 64-bit binary string. """
[d] = struct.unpack(">Q", struct.pack(">d", value))
return '{:064b}'.format(d)
if __name__ == '__main__':
for f in 0.0, 1.0, -14.0, 12.546, 3.141593:
print('Test value: %f' % f)
binary = float_to_bin(f)
print(' float_to_bin: %r' % binary)
floating_point = bin_to_float(binary) # Round trip.
print(' bin_to_float: %f\n' % floating_point)
Output:
Test value: 0.000000
float_to_bin: '0000000000000000000000000000000000000000000000000000000000000000'
bin_to_float: 0.000000
Test value: 1.000000
float_to_bin: '0011111111110000000000000000000000000000000000000000000000000000'
bin_to_float: 1.000000
Test value: -14.000000
float_to_bin: '1100000000101100000000000000000000000000000000000000000000000000'
bin_to_float: -14.000000
Test value: 12.546000
float_to_bin: '0100000000101001000101111000110101001111110111110011101101100100'
bin_to_float: 12.546000
Test value: 3.141593
float_to_bin: '0100000000001001001000011111101110000010110000101011110101111111'
bin_to_float: 3.141593
This works for me.
Tested with Python3.4:
def float_to_bin(num):
return bin(struct.unpack('!I', struct.pack('!f', num))[0])[2:].zfill(32)
def bin_to_float(binary):
return struct.unpack('!f',struct.pack('!I', int(binary, 2)))[0]
float_to_bin(bin_to_float(float_to_bin(123.123))) == float_to_bin(123.123)
>>> True
Another option is to do
from ast import literal_eval
float_str = "-0b101010101"
result = float(literal_eval(float_str))
Unlike the built-in "eval", literal_eval is safe to be run even on user inputs, as it can only parse Python literals - and will not execute expressions, which means it will not call functions as well.
float(int('-0b1110',0))
That works for me.
If you have a 64-bit string that represents a floating point number rather than an integer, you can do a three-step conversion - the first step turns the string into an integer, the second converts it into an 8-byte string, and the third re-interprets those bits as a float.
>>> import struct
>>> s = '0b0100000000101001000101111000110101001111110111110011101101100100'
>>> q = int(s, 0)
>>> b8 = struct.pack('Q', q)
>>> struct.unpack('d', b8)[0]
12.546
Of course you can combine all those steps into a single line.
>>> s2 = '0b1100000000101100000000000000000000000000000000000000000000000000'
>>> struct.unpack('d', struct.pack('Q', int(s2, 0)))[0]
-14.0
You could use eval('') and then cast it as a float if needed. Example:
>> eval('-0b1110')
-14
>> float(eval('-0b1110'))
-14.0
You can convert a binary number in string form to an int by setting the base to 2 in the built-in int([x[, base]]) function, however you need to get rid of the 0b first. You can then pass the result into float() to get your final result:
>>> s = '-0b1110'
>>> float(int(s.replace('0b', ''), 2))
-14.0
edit: Apparently getting rid of the 0b is only necessary on Python 2.5 and below, Mark's answer works fine for me on Python 2.6 but here is what I see on Python 2.5:
>>> int('-0b1110', 2)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 2: '-0b1110'
def bin_to_float(binary):
di = binary.find('.')
if di == -1:
return int(binary,2)
else:
whole = binary[:di] or '0'
fraction = binary [di+1:] or '0'
return int(whole,2) + int(whole,2)/abs(int(whole,2)) * int(fraction,2) / 2** len(fraction)
samples = [
'-0b1110.010101011',
'-0b1001110',
'101',
'0b1110010.11010111011',
'1100.',
'-00011',
'+0b1100.0011',
]
for binary in samples:
print(binary,'=',bin_to_float(binary))
Returns
-0b1110.010101011 = -14.333984375
-0b1001110 = -78
101 = 5
0b1110010.11010111011 = 114.84130859375
1100. = 12.0
-00011 = -3
+0b1100.0011 = 12.1875

Categories