Converting float.hex() value to binary in Python - python

I am wondering how to convert the result returned by float.hex() to binary, for example, from 0x1.a000000000000p+2 to 110.1.
Can anyone please help? Thanks.

def float_to_binary(num):
    exponent=0
    shifted_num=num
    while shifted_num != int(shifted_num):
        shifted_num*=2
        exponent+=1
    if exponent==0:
        return '{0:0b}'.format(int(shifted_num))
    binary='{0:0{1}b}'.format(int(shifted_num),exponent+1)
    integer_part=binary[:-exponent]
    fractional_part=binary[-exponent:].rstrip('0')
    return '{0}.{1}'.format(integer_part,fractional_part)
def floathex_to_binary(floathex):
    num = float.fromhex(floathex)
    return float_to_binary(num)
print(floathex_to_binary('0x1.a000000000000p+2'))
# 110.1
print(floathex_to_binary('0x1.b5c2000000000p+1'))
# 11.01101011100001
Explanation:
float.fromhex returns a float num. We'd like its binary representation.
'{0:b}'.format(...) returns binary representations of integers, but not floats.
But if we multiply the float by enough powers of 2, that is, shift the binary representation to the left enough places, we end up with an integer, shifted_num.
Once we have that integer, we are home free, because now we can use '{0:b}'.format(...).
We can re-insert the decimal point (err, binary point?) by using a bit of string slicing based on the number of places we had shifted to the left (exponent).
Technical point: The number of digits in the binary representation of shifted_num may be no larger than exponent. In that case, we need to pad the binary representation with more 0's on the left, so that slicing with binary[:-exponent] won't be empty. We manage that with '{0:0{1}b}'.format(...). The 0{1} in the format string sets the width of the formatted string to {1}, padded on the left with zeros. (The {1} gets replaced by exponent+1.)
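To make the padding point concrete, here is a small sketch that repeats the function above and runs it on 0.125, a value whose shifted integer (1) has fewer binary digits than the exponent (3). The choice of 0.125 is only an illustration.

```python
def float_to_binary(num):
    exponent = 0
    shifted_num = num
    # shift left until no fractional part remains
    while shifted_num != int(shifted_num):
        shifted_num *= 2
        exponent += 1
    if exponent == 0:
        return '{0:0b}'.format(int(shifted_num))
    # pad to exponent+1 digits so the integer part is never empty
    binary = '{0:0{1}b}'.format(int(shifted_num), exponent + 1)
    integer_part = binary[:-exponent]
    fractional_part = binary[-exponent:].rstrip('0')
    return '{0}.{1}'.format(integer_part, fractional_part)

# 0.125 * 2**3 == 1, so shifted_num is 1 and exponent is 3
padded = '{0:0{1}b}'.format(1, 3 + 1)   # '0001'
unpadded = '{0:b}'.format(1)            # '1'
print(padded[:-3], unpadded[:-3] == '')  # '0' and True
print(float_to_binary(0.125))            # 0.001
```

Without the padding, the slice for the integer part would be the empty string and the result would start with a bare dot.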

Note that the binary form of 0x1.a000000000000p+2 isn't 101.1 (or more exactly 0b101.1 )
but 0b110.1 (in my Python 2.7, binary numbers are displayed like that)
.
First, float instances have a useful method, float.hex(), and its inverse, the float class method float.fromhex():
fh = 12.34.hex()
print fh
print float.fromhex(fh)
result
0x1.8ae147ae147aep+3 # hexadecimal representation of a float
12.34
"Note that float.hex() is an instance method, while float.fromhex() is a class method."
http://docs.python.org/library/stdtypes.html#float.fromhex
.
Secondly, I didn't find a Python function that transforms the hexadecimal representation of a float into its binary representation, that is to say one with a dot (nor one that directly transforms the decimal representation of a float into a binary one).
So I created a function for that purpose.
Beforehand, this function transforms the hexadecimal representation into a decimal representation (string) of the input float.
Then there are two problems:
How to transform the part before the dot?
This part being an integer, it's easy to use bin().
How to transform the part after the dot?
The problem of this transformation has been asked several times on SO, but I didn't understand the solutions, so I wrote my own.
So here's the function you asked for, Qiang Li:
def hexf2binf(x):
    '''Transforms an hexadecimal float with a dot into a binary float with a dot'''
    a,_,p = str(float.fromhex(x)).partition('.')
    # the following part transforms the part after the dot into a binary after the dot
    tinies = [ Decimal(1) / Decimal(2**i) for i in xrange(1,400)]
    bits = []
    pdec = Decimal('.'+p)
    for tin in tinies:
        if pdec-tin==0:
            bits.append('1')
            break
        elif pdec-tin>0:
            bits.append('1')
            pdec -= tin
        else:
            bits.append('0')
    pbin = ''.join(bits) # it's the binary after the dot
    # the integer before the dot is easily transformed into a binary
    return '.'.join((bin(int(a)),pbin))
.
In order to perform verification, I wrote a function to transform the part of a binary float after a dot into its decimal representation:
from decimal import Decimal, getcontext
getcontext().prec = 500
# precision == 500 , to be large !
tinies = [ Decimal(1) / Decimal(2**i) for i in xrange(1,400)]
com = dict((i,tin) for i,tin in enumerate(tinies,1))
def afterdotbinary2float(sbin, com = com):
    '''Transforms a binary lying after a dot into a float after a dot'''
    if sbin.startswith('0b.') or sbin.startswith('.'):
        sbin = sbin.split('.')[1]
    if all(c in '01' for c in sbin):
        return sum(int(c)*com[i] for i,c in enumerate(sbin,1))
    else:
        return None
.
.
Finally, applying these functions:
from decimal import Decimal, getcontext
getcontext().prec = 500
# precision == 500 , to be large !
tinies = [ Decimal(1) / Decimal(2**i) for i in xrange(1,400)]
com = dict((i,tin) for i,tin in enumerate(tinies,1))
def afterdotbinary2float(sbin, com = com):
    '''Transforms a binary lying after a dot into a float after a dot'''
    if sbin.startswith('0b.') or sbin.startswith('.'):
        sbin = sbin.split('.')[1]
    if all(c in '01' for c in sbin):
        return sum(int(c)*com[i] for i,c in enumerate(sbin,1))
    else:
        return None
def hexf2binf(x):
    '''Transforms an hexadecimal float with a dot into a binary float with a dot'''
    a,_,p = str(float.fromhex(x)).partition('.')
    # the following part transforms the float after the dot into a binary after the dot
    tinies = [ Decimal(1) / Decimal(2**i) for i in xrange(1,400)]
    bits = []
    pdec = Decimal('.'+p)
    for tin in tinies:
        if pdec-tin==0:
            bits.append('1')
            break
        elif pdec-tin>0:
            bits.append('1')
            pdec -= tin
        else:
            bits.append('0')
    pbin = ''.join(bits) # it's the binary after the dot
    # the float before the dot is easily transformed into a binary
    return '.'.join((bin(int(a)),pbin))
for n in (45.625 , 780.2265625 , 1022.796875):
    print 'n ==',n,' transformed with its method hex() to:'
    nhexed = n.hex()
    print 'nhexed = n.hex() ==',nhexed
    print '\nhexf2binf(nhexed) ==',hexf2binf(nhexed)
    print "\nVerification:\nbefore,_,after = hexf2binf(nhexed).partition('.')"
    before,_,after = hexf2binf(nhexed).partition('.')
    print 'before ==',before,' after ==',after
    print 'int(before,2) ==',int(before,2)
    print 'afterdotbinary2float(after) ==',afterdotbinary2float(after)
    print '\n---------------------------------------------------------------\n'
result
n == 45.625 transformed with its method hex() to:
nhexed = n.hex() == 0x1.6d00000000000p+5
hexf2binf(nhexed) == 0b101101.101
Verification:
before,_,after = hexf2binf(nhexed).partition('.')
before == 0b101101 after == 101
int(before,2) == 45
afterdotbinary2float(after) == 0.625
---------------------------------------------------------------
n == 780.2265625 transformed with its method hex() to:
nhexed = n.hex() == 0x1.861d000000000p+9
hexf2binf(nhexed) == 0b1100001100.0011101
Verification:
before,_,after = hexf2binf(nhexed).partition('.')
before == 0b1100001100 after == 0011101
int(before,2) == 780
afterdotbinary2float(after) == 0.2265625
---------------------------------------------------------------
n == 1022.796875 transformed with its method hex() to:
nhexed = n.hex() == 0x1.ff66000000000p+9
hexf2binf(nhexed) == 0b1111111110.110011
Verification:
before,_,after = hexf2binf(nhexed).partition('.')
before == 0b1111111110 after == 110011
int(before,2) == 1022
afterdotbinary2float(after) == 0.796875
---------------------------------------------------------------
.
For the two numbers:
from decimal import Decimal
tinies = [ Decimal(1) / Decimal(2**i) for i in xrange(1,400)]
com = dict((i,tin) for i,tin in enumerate(tinies,1))
def hexf2binf(x, tinies = tinies):
    '''Transforms an hexadecimal float with a dot into a binary float with a dot'''
    a,_,p = str(float.fromhex(x)).partition('.')
    # the following part transforms the float after the dot into a binary after the dot
    bits = []
    pdec = Decimal('.'+p)
    for tin in tinies:
        if pdec-tin==0:
            bits.append('1')
            break
        elif pdec-tin>0:
            bits.append('1')
            pdec -= tin
        else:
            bits.append('0')
    pbin = ''.join(bits) # it's the binary after the dot
    # the float before the dot is easily transformed into a binary
    return '.'.join((bin(int(a)),pbin))
print hexf2binf('0x1.a000000000000p+2')
print
print hexf2binf('0x1.b5c2000000000p+1')
as a result, this displays:
0b110.1
0b11.011010111000010000000000000000000000010000011111100001111111101001000110110000101101110000000001111110011100110101011100011110011101111010001011111001011101011000000011001111111010111011000101000111100100110000010110001000111010111101110111011111000100100110110101011100101001110011000100000000010001101111111010001110100100101110111000111111001011101101010011011111011001010011111111010101011010110

Given a hexadecimal string h, you can find the corresponding float with
x = float.fromhex(h)
So really you're interested in being able to produce a "fixed point" binary representation of any float. Since the expansion can be long, you probably want to restrict the number of fractional bits it can have (e.g. the stored value of math.pi needs dozens of them).
So something like the following might work
def binaryRepresentation(x, n=8):
    # the base and remainder ... handle negatives as well ...
    # (the sign of values strictly between -1 and 0 is not preserved,
    # and rounding a fraction very close to 1 is not carried into the base)
    base = int(x)
    fraction = abs(int(round( (x - base) * (2**n) )))
    # zero-pad the fraction to n bits, then remove redundant trailing zeros
    return "{0:b}.{1:0{2}b}".format(base, fraction, n).rstrip("0")

Big EDIT about big numbers
.
The following code shows a problem with my solution in my other answer.
Note that I changed the parameter of my function hexf2binf(floathex) from x to floathex, to make it the same as the parameter used by unutbu in his function floathex_to_binary(floathex)
from decimal import Decimal,getcontext
getcontext().prec = 500
tinies = [ Decimal(1) / Decimal(2**i) for i in xrange(1,400)]
com = dict((i,tin) for i,tin in enumerate(tinies,1))
def hexf2binf(floathex, tinies = tinies):
    fromh = float.fromhex(floathex)
    print 'fromh = float.fromhex(h) DONE'
    print 'fromh ==',fromh
    print "str(float.fromhex(floathex)) ==",str(float.fromhex(floathex))
    a,_,p = str(float.fromhex(floathex)).partition('.')
    print 'before the dot ==',a
    print 'after the dot ==',p
    # the following part transforms the float after the dot into a binary after the dot
    bits = []
    pdec = Decimal('.'+p)
    for tin in tinies:
        if pdec-tin==0:
            bits.append('1')
            break
        elif pdec-tin>0:
            bits.append('1')
            pdec -= tin
        else:
            bits.append('0')
    pbin = ''.join(bits) # it's the binary after the dot
    # the float before the dot is easily transformed into a binary
    return '.'.join((bin(int(a)),pbin))
x = 123456789012345685803008.0
print ' x = {:f}'.format(x)
h = x.hex()
print ' h = x.hex() ==',h
print '\nENTERING hexf2binf(floathex) with h as argument'
v = hexf2binf(h)
print '\nhexf2binf(x)==',v
result
x = 123456789012345685803008.000000
h = x.hex() == 0x1.a249b1f10a06dp+76
ENTERING hexf2binf(floathex) with h as argument
fromh = float.fromhex(h) DONE
fromh == 1.23456789012e+23
str(float.fromhex(floathex)) == 1.23456789012e+23
before the dot == 1
after the dot == 23456789012e+23
hexf2binf(x)== 0b1.111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
The problem is caused by str(float.fromhex(x)) in the instruction a,_,p = str(float.fromhex(x)).partition('.'), which, for a big number, produces a representation of float.fromhex(x) with an exponent.
Then THE PARTS BEFORE THE DOT (a as ante) AND AFTER THE DOT (p as post) ARE WRONG.
Correcting this is easy: replace the inaccurate instruction with this one:
a,_,p = '{:f}'.format(float.fromhex(x)).partition('.')
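The difference between the two instructions can be checked in isolation. The following is a minimal sketch in Python 3 syntax (the answer itself targets Python 2.7, where str() shows fewer digits but still uses an exponent for this value); the variable names are only illustrative.

```python
x = float.fromhex('0x1.a249b1f10a06dp+76')

# str() switches to exponent notation for large floats,
# so splitting on '.' cuts through the mantissa
wrong_before, _, wrong_after = str(x).partition('.')
print(wrong_before, wrong_after)

# fixed-point formatting never uses an exponent
a, _, p = '{:f}'.format(x).partition('.')
print(a, p)
```

With '{:f}' the part before the dot is the full integer part of the float, and the part after the dot is the default six decimal places.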
.
Nota bene:
On a typical machine running Python, there are 53 bits of precision
available for a Python float, so the value stored internally when you
enter the decimal number 0.1 is the binary fraction
0.00011001100110011001100110011001100110011001100110011010
http://docs.python.org/tutorial/floatingpoint.html
That means that when a big value for a float is written in a code, its internal representation is in fact an approximation of the written value.
That is shown by the following code:
x1 = 123456789012345685803008.0
print 'x1 == 123456789012345685803008.0'
h1 = x1.hex()
print 'h1 = x1.hex() ==',h1
y1 = float.fromhex(h1)
print 'y1 = float.fromhex(h1) == {:f}'.format(y1)
print
x2 = 123456789012345678901234.64655
print 'x2 == 123456789012345678901234.64655'
h2 = x2.hex()
print 'h2 = x2.hex() ==',h2
y2 = float.fromhex(h2)
print 'y2 = float.fromhex(h2) == {:f}'.format(y2)
print
result
x1 == 123456789012345685803008.0
h1 = x1.hex() == 0x1.a249b1f10a06dp+76
y1 = float.fromhex(h1) == 123456789012345685803008.000000
x2 == 123456789012345678901234.64655
h2 = x2.hex() == 0x1.a249b1f10a06dp+76
y2 = float.fromhex(h2) == 123456789012345685803008.000000
The values of h1 and h2 are the same because, though different values are assigned to the identifiers x1 and x2 in the script, the OBJECTS x1 and x2 are represented by the same approximation in the machine.
The value 123456789012345685803008.0 can be stored exactly, so its internal representation is the exact value. The value 123456789012345678901234.64655 cannot, so it is rounded to that same internal representation. Hence deriving h1 and h2 from x1 and x2 gives the same value for both.
This problem exists when we write a number in decimal representation in a script. It doesn't exist when we write a number directly in hexadecimal or binary representation.
What I wanted to underline
is that I wrote a function afterdotbinary2float(sbin, com = com) to perform verification on the results yielded by hexf2binf( ). This verification works well when the number passed to hexf2binf( ) isn't big, but because of the internal approximation of big numbers (= having a lot of digits), I wonder if this verification isn't distorted. Indeed, when a big number arrives in the function, it has already been approximated : the digits after the dot have been transformed into a series of zeros;
as it is shown here after:
from decimal import Decimal, getcontext
getcontext().prec = 500
tinies = [ Decimal(1) / Decimal(2**i) for i in xrange(1,400)]
com = dict((i,tin) for i,tin in enumerate(tinies,1))
def afterdotbinary2float(sbin, com = com):
    '''Transforms a binary lying after a dot into a float after a dot'''
    if sbin.startswith('0b.') or sbin.startswith('.'):
        sbin = sbin.split('.')[1]
    if all(c in '01' for c in sbin):
        return sum(int(c)*com[i] for i,c in enumerate(sbin,1))
    else:
        return None
def hexf2binf(floathex, tinies = tinies):
    '''Transforms an hexadecimal float with a dot into a binary float with a dot'''
    a,_,p = '{:.400f}'.format(float.fromhex(floathex)).partition('.')
    # the following part transforms the float after the dot into a binary after the dot
    bits = []
    pdec = Decimal('.'+p)
    for tin in tinies:
        if pdec-tin==0:
            bits.append('1')
            break
        elif pdec-tin>0:
            bits.append('1')
            pdec -= tin
        else:
            bits.append('0')
    pbin = ''.join(bits) # it's the binary after the dot
    # the float before the dot is easily transformed into a binary
    return '.'.join((bin(int(a)),pbin))
for n in (123456789012345685803008.0, 123456789012345678901234.64655, Decimal('123456789012345.2546') ):
    print 'n == {:f} transformed with its method hex() to:'.format(n)
    nhexed = n.hex()
    print 'nhexed = n.hex() ==',nhexed
    print '\nhexf2binf(nhexed) ==',hexf2binf(nhexed)
    print "\nVerification:\nbefore,_,after = hexf2binf(nhexed).partition('.')"
    before,_,after = hexf2binf(nhexed).partition('.')
    print 'before ==',before,' after ==',after
    print 'int(before,2) ==',int(before,2)
    print 'afterdotbinary2float(after) ==',afterdotbinary2float(after)
    print '\n---------------------------------------------------------------\n'
result
n == 123456789012345685803008.000000 transformed with its method hex() to:
nhexed = n.hex() == 0x1.a249b1f10a06dp+76
hexf2binf(nhexed) == 0b11010001001001001101100011111000100001010000001101101000000000000000000000000.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Verification:
before,_,after = hexf2binf(nhexed).partition('.')
before == 0b11010001001001001101100011111000100001010000001101101000000000000000000000000 after == 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
int(before,2) == 123456789012345685803008
afterdotbinary2float(after) == 0E-399
---------------------------------------------------------------
n == 123456789012345685803008.000000 transformed with its method hex() to:
nhexed = n.hex() == 0x1.a249b1f10a06dp+76
hexf2binf(nhexed) == 0b11010001001001001101100011111000100001010000001101101000000000000000000000000.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Verification:
before,_,after = hexf2binf(nhexed).partition('.')
before == 0b11010001001001001101100011111000100001010000001101101000000000000000000000000 after == 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
int(before,2) == 123456789012345685803008
afterdotbinary2float(after) == 0E-399
---------------------------------------------------------------
n == 123456789012345.2546 transformed with its method hex() to:
Traceback (most recent call last):
File "I:\verfitruc.py", line 41, in <module>
nhexed = n.hex()
AttributeError: 'Decimal' object has no attribute 'hex'
Conclusion: testing with the numbers 123456789012345685803008.0 and 123456789012345678901234.64655 makes no difference, because both are stored as the same float, so the comparison is of no interest.
So, I wanted to test un-approximated numbers, and I passed a Decimal float number. As you see, the problem is that such an instance hasn't the hex() method.
.
Finally, I'm not entirely sure of my function for the big numbers, but it works correctly for common numbers after I corrected the inaccurate instruction.
.
EDIT
I added '.400' in the instruction:
a,_,p = '{:.400f}'.format(fromh).partition('.')
otherwise the value of p could be truncated, thus giving a binary representation of a slightly different number than the one passed to the function.
I put 400 because it is the length I defined for the list tinies that contains the Decimal instances corresponding to 1/2 , 1/4 , 1/8 , 1/16 etc
However, though it is rare that a number with more than 400 digits after the decimal point makes any sense, this addition remains unsatisfactory to me: the code isn't absolutely general, unlike unutbu's code.
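For comparison, a fully general conversion is possible without Decimal and without a fixed digit limit, because every finite float is an exact ratio of an integer to a power of two. The sketch below is not taken from either answer; the function name float_to_binary_exact is hypothetical, and it relies only on the standard float.as_integer_ratio() and int.bit_length() methods.

```python
def float_to_binary_exact(x):
    """Exact binary expansion of a finite float, e.g. 6.5 -> '110.1'."""
    # as_integer_ratio() returns the fraction in lowest terms;
    # for a float the denominator is always a power of two
    num, den = abs(x).as_integer_ratio()
    frac_bits = den.bit_length() - 1      # den == 2**frac_bits
    int_part, frac_part = divmod(num, den)
    text = format(int_part, 'b')
    if frac_bits:
        # zero-pad so leading zeros after the point are kept
        text += '.' + format(frac_part, '0{0}b'.format(frac_bits))
    return ('-' if x < 0 else '') + text

print(float_to_binary_exact(float.fromhex('0x1.a000000000000p+2')))
print(float_to_binary_exact(float.fromhex('0x1.b5c2000000000p+1')))
print(float_to_binary_exact(float.fromhex('0x1.a249b1f10a06dp+76')))
```

Because the ratio is in lowest terms, the fractional part never ends in a zero, so no stripping is needed. Infinities and NaN are not handled.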

Related

Find least significant digit in a double in Python

I have a lot of financial data stored as floating point doubles and I'm trying to find the least significant digit so that I can convert the data to integers with exponent.
All the data is finite, e.g. 1234.23 or 0.0001234 but because it's stored in doubles it can be 123.23000000001 or 0.00012339999999 etc
Is there an easy or proper approach to this or will I just have to botch it?
You have a couple of options,
Firstly and most preferably, use the stdlib Decimal, not the builtin float.
This fixes most errors related to floats, but not the infamous 0.1 + 0.2 = 0.30000000000000004 when the Decimal values are themselves built from floats:
from decimal import Decimal
print(0.1 + 0.2) # 0.30000000000000004
print(Decimal(0.1) + Decimal(0.2)) # 0.3000000000000000166533453694
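The residue in the last line comes from passing floats to Decimal, which copies their binary error. A short sketch of the contrast, assuming the values are available as strings (for example, as read from a CSV file or database column):

```python
from decimal import Decimal

# built from floats: each operand already carries the binary error
from_floats = Decimal(0.1) + Decimal(0.2)

# built from strings: the operands are exactly 0.1 and 0.2
total = Decimal('0.1') + Decimal('0.2')

print(from_floats == Decimal('0.3'))  # False
print(total == Decimal('0.3'))        # True
print(total)                          # 0.3
```

If the data already exists only as doubles, the error is baked in and some form of rounding is still needed before converting.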
An alternative option if that isn't possible, is setting a tolerance for number of repeated digits after the decimal point.
For example:
import re
repeated_digit_tolerance = 8 # Change to an appropriate value for your dataset
repeated_digit_pattern = re.compile(r"(.)\1{2,}")
def longest_repeated_digit_re(s: str):
    match = repeated_digit_pattern.search(s)
    if match is None:
        # no run of three or more identical characters was found
        return 0, s
    string = match.string
    span = match.span()
    substr_len = span[1] - span[0]
    return substr_len, string
def fix_rounding(num: float) -> float:
    num_str = str(num)
    pre_dp = num_str[:num_str.index(".")]
    post_dp = num_str[num_str.index(".") + 1:]
    repetition_length, string = longest_repeated_digit_re(post_dp)
    if repetition_length > repeated_digit_tolerance:
        shortened_string = string[:repeated_digit_tolerance-1]
        return float(".".join([pre_dp, shortened_string]))
    # nothing to fix: return the number unchanged rather than None
    return num
print(0.1 + 0.2) # 0.30000000000000004
print(0.2 + 0.4) # 0.6000000000000001
print(fix_rounding(0.1 + 0.2)) # 0.3
print(fix_rounding(0.2 + 0.4)) # 0.6
It's perfectly functioning code, but Decimal is practically always the better option of the two, even if Decimal(0.1) + Decimal(0.2) built from floats still isn't exactly 0.3.
Here is my botch using strings. It works adequately at the moment for what I need but I haven't fully tested it.
print (int_sci_notation(0.1+0.2)) will print the tuple (3, -1)
def int_sci_notation(decimal_value):
    #decimal value is finite value stored in double precision
    #convert to scientific string (cannot prevent E notation so force all numbers to E notation)
    tostr = format(decimal_value, ".14E")
    #get exponent from string
    if tostr[-3] == '-':
        exp = -int(tostr[-2:])
    else:
        exp = int(tostr[-2:])
    #get significant figures as an integer
    frac = tostr[1:-4].strip('0')
    sf = tostr[0]+frac[1:]
    #return the integer 'mantissa' and the exponent
    return int(sf), -int(len(sf)-1-exp)
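The same idea can be expressed without manual string slicing by letting Decimal parse the rounded scientific string. This is only a sketch: the name mantissa_exponent and the default of 15 significant digits are assumptions chosen for illustration, not part of the answer above.

```python
from decimal import Decimal

def mantissa_exponent(value, sig_digits=15):
    """Return (integer mantissa, base-10 exponent) for a finite float."""
    # round to sig_digits significant digits via scientific formatting
    text = format(value, '.{0}e'.format(sig_digits - 1))
    # normalize() drops trailing zeros; as_tuple() exposes digits and exponent
    sign, digits, exponent = Decimal(text).normalize().as_tuple()
    mantissa = int(''.join(str(d) for d in digits))
    return (-mantissa if sign else mantissa), exponent

print(mantissa_exponent(0.1 + 0.2))
print(mantissa_exponent(1234.23))
print(mantissa_exponent(0.0001234))
```

Unlike the slicing version, this does not depend on the exponent being exactly two digits wide, and it keeps the sign of negative values.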

int 111 to binary 111(decimal 7)

Problem: Take a number, for example 37 (binary 100101).
Count the binary 1s, create a binary number made of that many 1s (111), and print the decimal value of that binary (7).
num = bin(int(input()))
st = str(num)
count=0
for i in st:
    if i == "1":
        count +=1
del st
vt = ""
for i in range(count):
    vt = vt + "1"
vt = int(vt)
print(vt)
I am a newbie and stuck here.
I wouldn't recommend your approach, but to show where you went wrong:
num = bin(int(input()))
st = str(num)
count = 0
for i in st:
    if i == "1":
        count += 1
del st
# start the string representation of the binary value correctly
vt = "0b"
for i in range(count):
    vt = vt + "1"
# tell the `int()` function that it should consider the string as a binary number (base 2)
vt = int(vt, 2)
print(vt)
Note that the code below does the exact same thing as yours, but a bit more concisely so:
ones = bin(int(input())).count('1')
vt = int('0b' + '1' * ones, 2)
print(vt)
It uses the standard method count() on the string to get the number of ones in ones and it uses Python's ability to repeat a string a number of times using the multiplication operator *.
Try this once you got the required binary.
def binaryToDecimal(binary):
    binary1 = binary
    decimal, i, n = 0, 0, 0
    while(binary != 0):
        dec = binary % 10
        decimal = decimal + dec * pow(2, i)
        binary = binary//10
        i += 1
    print(decimal)
In one line:
print(int(format(int(input()), 'b').count('1') * '1', 2))
Let's break it down, inside out:
format(int(input()), 'b')
This built-in function takes an integer number from the input, and returns a formatted string according to the Format Specification Mini-Language. In this case, the argument 'b' gives us a binary format.
Then, we have
.count('1')
This str method returns the total number of occurrences of '1' in the string returned by the format function.
In Python, you can multiply a string times a number to get the same string repeatedly concatenated n times:
x = 'a' * 3
print(x) # prints 'aaa'
Thus, if we take the number returned by the count method and multiply it by the string '1' we get a string that only contains ones and only the same amount of ones as our original input number in binary. Now, we can express this number in binary by casting it in base 2, like this:
int(number_string, 2)
So, we have
int(format(int(input()), 'b').count('1') * '1', 2)
Finally, let's print the whole thing:
print(int(format(int(input()), 'b').count('1') * '1', 2))
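The string round trip can also be avoided, since a binary number made of k ones equals 2**k - 1. A small sketch of that arithmetic shortcut; the function name ones_packed is hypothetical and it takes the number as an argument instead of reading input():

```python
def ones_packed(n):
    """Count the 1 bits of n and return the number whose binary form is that many 1s."""
    ones = bin(n).count('1')
    # k ones in binary is 2**k - 1, e.g. 0b111 == 2**3 - 1 == 7
    return (1 << ones) - 1

print(ones_packed(37))  # 37 is 0b100101, three ones, so 7
```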

How do I format a number using SI prefix (micro, milli, Mega, Giga, etc)?

I have numbers that range from very small to very large, and I'd like to format them using "engineering notation" with a magnitude and a suffix:
n.nnn S
where 1.0 <= n.nnn < 1000., and S is a metric (SI) prefix. So:
1234.5e+13 => 12.35P
12345678 => 12.35M
1234 => 1.234K
1.234 => 1.234
0.1234 => 123.4m
1234.5e-16 => 1.235f
etc. How can I do that, e.g. using Python?
(Posted here in Q&A style because I keep re-inventing this code, and others might find it helpful. Feel free to tweak it if you see improvements...)
Here is one implementation that lets you choose a long suffix (e.g. "peta") or a short suffix (e.g. "P"), and also lets you choose how many total digits are displayed (i.e. the precision):
import math
def si_classifier(val):
    suffixes = {
        24:{'long_suffix':'yotta', 'short_suffix':'Y', 'scalar':10**24},
        21:{'long_suffix':'zetta', 'short_suffix':'Z', 'scalar':10**21},
        18:{'long_suffix':'exa', 'short_suffix':'E', 'scalar':10**18},
        15:{'long_suffix':'peta', 'short_suffix':'P', 'scalar':10**15},
        12:{'long_suffix':'tera', 'short_suffix':'T', 'scalar':10**12},
        9:{'long_suffix':'giga', 'short_suffix':'G', 'scalar':10**9},
        6:{'long_suffix':'mega', 'short_suffix':'M', 'scalar':10**6},
        3:{'long_suffix':'kilo', 'short_suffix':'k', 'scalar':10**3},
        0:{'long_suffix':'', 'short_suffix':'', 'scalar':10**0},
        -3:{'long_suffix':'milli', 'short_suffix':'m', 'scalar':10**-3},
        -6:{'long_suffix':'micro', 'short_suffix':'µ', 'scalar':10**-6},
        -9:{'long_suffix':'nano', 'short_suffix':'n', 'scalar':10**-9},
        -12:{'long_suffix':'pico', 'short_suffix':'p', 'scalar':10**-12},
        -15:{'long_suffix':'femto', 'short_suffix':'f', 'scalar':10**-15},
        -18:{'long_suffix':'atto', 'short_suffix':'a', 'scalar':10**-18},
        -21:{'long_suffix':'zepto', 'short_suffix':'z', 'scalar':10**-21},
        -24:{'long_suffix':'yocto', 'short_suffix':'y', 'scalar':10**-24}
    }
    if val == 0:
        # log10(0) is undefined; treat zero as unscaled
        return suffixes[0]
    exponent = int(math.floor(math.log10(abs(val))/3.0)*3)
    return suffixes.get(exponent, None)
def si_formatter(value):
    '''
    Return a triple of scaled value, short suffix, long suffix, or None if
    the value cannot be classified.
    '''
    classifier = si_classifier(value)
    if classifier is None:
        # Don't know how to classify this value
        return None
    scaled = value / classifier['scalar']
    return (scaled, classifier['short_suffix'], classifier['long_suffix'])
def si_format(value, precision=4, long_form=False, separator=''):
    '''
    "SI prefix" formatted string: return a string with the given precision
    and an appropriate order-of-3-magnitudes suffix, e.g.:
        si_format(1001.0) => '1.001k'
        si_format(0.00000000123, long_form=True, separator=' ') => '1.230 nano'
    '''
    formatted = si_formatter(value)
    if formatted is None:
        # Don't know how to format this value
        return value
    scaled, short_suffix, long_suffix = formatted
    suffix = long_suffix if long_form else short_suffix
    if abs(scaled) < 10:
        precision = precision - 1
    elif abs(scaled) < 100:
        precision = precision - 2
    else:
        precision = precision - 3
    return '{scaled:.{precision}f}{separator}{suffix}'.format(
        scaled=scaled, precision=precision, separator=separator, suffix=suffix)
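When only short suffixes are needed, the lookup table can be condensed into a string indexed by the exponent. The following is a rough sketch of that variant, not a replacement for the code above: the name si_short and the digits parameter are assumptions, and it uses the 'g' format type to get a fixed number of significant digits.

```python
import math

# index 8 (the space) is the empty prefix for 10**0
PREFIXES = 'yzafpnµm kMGTPEZY'

def si_short(value, digits=4):
    """Format value with `digits` significant digits and a short SI prefix."""
    if value == 0:
        return '0'
    # round the decimal exponent down to a multiple of 3
    exp3 = int(math.floor(math.log10(abs(value)) / 3.0)) * 3
    # clamp to the range covered by the prefix string
    exp3 = max(-24, min(24, exp3))
    scaled = value / 10.0 ** exp3
    text = '{0:.{1}g}'.format(scaled, digits)
    return text + PREFIXES[exp3 // 3 + 8].strip()

for v in (12345678, 1234, 1.234, 0.1234):
    print(v, '=>', si_short(v))
```

Values that round up to 1000 at the chosen precision (such as 999.99 with four digits) are not re-scaled to the next prefix in this sketch.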
You can use Prefixed which has a float type with additional formatting options.
>>> from prefixed import Float
>>> f'{Float(1234.5e+13):.2h}'
'12.35P'
>>> f'{Float(12345678):.2h}'
'12.35M'
>>> f'{Float(1234):.2h}'
'1.23k'
>>> f'{Float(1.234):.2h}'
'1.23'
>>> f'{Float(0.1234):.2h}'
'123.40m'
>>> f'{Float(1234.5e-16):.2h}'
'123.45f'

Python Decimal - engineering notation for mili (10e-3) and micro (10e-6)

Here is the example which is bothering me:
>>> x = decimal.Decimal('0.0001')
>>> print x.normalize()
>>> print x.normalize().to_eng_string()
0.0001
0.0001
Is there a way to have engineering notation for representing milli (1e-3) and micro (1e-6)?
Here's a function that does things explicitly, and also has support for using SI suffixes for the exponent:
import math
def eng_string( x, format='%s', si=False):
    '''
    Returns float/int value <x> formatted in a simplified engineering format -
    using an exponent that is a multiple of 3.
    format: printf-style string used to format the value before the exponent.
    si: if true, use SI suffix for exponent, e.g. k instead of e3, n instead of
    e-9 etc.
    E.g. with format='%.2f':
        1.23e-08 => 12.30e-9
        123 => 123.00
        1230.0 => 1.23e3
        -1230000.0 => -1.23e6
    and with si=True:
        1230.0 => 1.23k
        -1230000.0 => -1.23M
    '''
    sign = ''
    if x < 0:
        x = -x
        sign = '-'
    exp = int( math.floor( math.log10( x)))
    exp3 = exp - ( exp % 3)
    x3 = x / ( 10 ** exp3)
    if si and exp3 >= -24 and exp3 <= 24 and exp3 != 0:
        # integer division, so the index is an int on Python 3 as well
        exp3_text = 'yzafpnum kMGTPEZY'[ ( exp3 - (-24)) // 3]
    elif exp3 == 0:
        exp3_text = ''
    else:
        exp3_text = 'e%s' % exp3
    return ( '%s'+format+'%s') % ( sign, x3, exp3_text)
EDIT:
Matplotlib implements an engineering formatter, so one option is to use Matplotlib's formatter directly, e.g.:
import matplotlib.ticker
formatter = matplotlib.ticker.EngFormatter()
formatter(10000)
result: '10 k'
Original answer:
Based on Julian Smith's excellent answer (and this answer), I changed the function to improve on the following points:
Python3 compatible (integer division)
Compatible for 0 input
Rounding to significant number of digits, by default 3, no trailing zeros printed
so here's the updated function:
import math
def eng_string( x, sig_figs=3, si=True):
    """
    Returns float/int value <x> formatted in a simplified engineering format -
    using an exponent that is a multiple of 3.
    sig_figs: number of significant figures
    si: if true, use SI suffix for exponent, e.g. k instead of e3, n instead of
    e-9 etc.
    """
    x = float(x)
    sign = ''
    if x < 0:
        x = -x
        sign = '-'
    if x == 0:
        exp = 0
        exp3 = 0
        x3 = 0
    else:
        exp = int(math.floor(math.log10( x )))
        exp3 = exp - ( exp % 3)
        x3 = x / ( 10 ** exp3)
        x3 = round( x3, -int( math.floor(math.log10( x3 )) - (sig_figs-1)) )
        if x3 == int(x3): # prevent from displaying .0
            x3 = int(x3)
    if si and exp3 >= -24 and exp3 <= 24 and exp3 != 0:
        exp3_text = 'yzafpnum kMGTPEZY'[ exp3 // 3 + 8]
    elif exp3 == 0:
        exp3_text = ''
    else:
        exp3_text = 'e%s' % exp3
    return ( '%s%s%s') % ( sign, x3, exp3_text)
The decimal module is following the Decimal Arithmetic Specification, which states:
This is outdated - see below
to-scientific-string – conversion to numeric string
[...]
The coefficient is first converted to a string in base ten using the characters 0 through 9 with no leading zeros (except if its value is zero, in which case a single 0 character is used).
Next, the adjusted exponent is calculated; this is the exponent, plus the number of characters in the converted coefficient, less one. That is, exponent+(clength-1), where clength is the length of the coefficient in decimal digits.
If the exponent is less than or equal to zero and the adjusted exponent is greater than or equal to -6, the number will be converted
to a character form without using exponential notation.
[...]
to-engineering-string – conversion to numeric string
This operation converts a number to a string, using engineering
notation if an exponent is needed.
The conversion exactly follows the rules for conversion to scientific
numeric string except in the case of finite numbers where exponential
notation is used. In this case, the converted exponent is adjusted to be a multiple of three (engineering notation) by positioning the decimal point with one, two, or three characters preceding it (that is, the part before the decimal point will range from 1 through 999).
This may require the addition of either one or two trailing zeros.
If after the adjustment the decimal point would not be followed by a digit then it is not added. If the final exponent is zero then no indicator letter and exponent is suffixed.
Examples:
For each abstract representation [sign, coefficient, exponent] on the left, the resulting string is shown on the right.
Representation    String
[0,123,1]         "1.23E+3"
[0,123,3]         "123E+3"
[0,123,-10]       "12.3E-9"
[1,123,-12]       "-123E-12"
[0,7,-7]          "700E-9"
[0,7,1]           "70"
Or, in other words:
>>> for n in (10 ** e for e in range(-1, -8, -1)):
...     d = Decimal(str(n))
...     print d.to_eng_string()
...
0.1
0.01
0.001
0.0001
0.00001
0.000001
100E-9
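The examples from the specification quoted above can also be checked directly, since Decimal accepts a (sign, digits, exponent) tuple. This is a small Python 3 sketch; the names SPEC_EXAMPLES and check_spec_examples are made up for this illustration.

```python
from decimal import Decimal

# (sign, digits, exponent) triples and expected strings, copied from the
# specification's examples.
SPEC_EXAMPLES = [
    ((0, (1, 2, 3), 1), "1.23E+3"),
    ((0, (1, 2, 3), 3), "123E+3"),
    ((0, (1, 2, 3), -10), "12.3E-9"),
    ((1, (1, 2, 3), -12), "-123E-12"),
    ((0, (7,), -7), "700E-9"),
    ((0, (7,), 1), "70"),
]

def check_spec_examples():
    """Return the to_eng_string() result for every triple."""
    return [Decimal(triple).to_eng_string() for triple, _ in SPEC_EXAMPLES]

for (triple, expected), got in zip(SPEC_EXAMPLES, check_spec_examples()):
    print(triple, "->", got, "(expected %s)" % expected)
```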
I realize that this is an old thread, but it does come near the top of a search for python engineering notation and it seems prudent to have this information located here.
I am an engineer who likes the "engineering 101" engineering units. I don't even like designations such as 0.1uF, I want that to read 100nF. I played with the Decimal class and didn't really like its behavior over the range of possible values, so I rolled a package called engineering_notation that is pip-installable.
pip install engineering_notation
From within Python:
>>> from engineering_notation import EngNumber
>>> EngNumber('1000000')
1M
>>> EngNumber(1000000)
1M
>>> EngNumber(1000000.0)
1M
>>> EngNumber('0.1u')
100n
>>> EngNumber('1000m')
1
This package also supports comparisons and other simple numerical operations.
https://github.com/slightlynybbled/engineering_notation
The «full» quote shows what is wrong!
The decimal module is indeed following the proprietary (IBM) Decimal Arithmetic Specification.
Quoting this IBM specification in its entirety clearly shows what is wrong with decimal.to_eng_string() (emphasis added):
to-engineering-string – conversion to numeric string
This operation converts a number to a string, using engineering
notation if an exponent is needed.
The conversion exactly follows the rules for conversion to scientific
numeric string except in the case of finite numbers where exponential
notation is used. In this case, the converted exponent is adjusted to be a multiple of three (engineering notation) by positioning the decimal point with one, two, or three characters preceding it (that is, the part before the decimal point will range from 1 through 999). This may require the addition of either one or two trailing zeros.
If after the adjustment the decimal point would not be followed by a digit then it is not added. If the final exponent is zero then no indicator letter and exponent is suffixed.
This proprietary IBM specification actually admits to not applying the engineering notation for numbers with an infinite decimal representation, for which ordinary scientific notation is used instead! This is obviously incorrect behaviour for which a Python bug report was opened.
Solution
from math import floor, log10
def powerise10(x):
    """ Returns x as a*10**b with 1 <= |a| < 10 (or 0, 0 when x is 0)
    """
    if x == 0: return 0,0
    Neg = x < 0
    if Neg: x = -x
    a = 1.0 * x / 10**(floor(log10(x)))
    b = int(floor(log10(x)))
    if Neg: a = -a
    return a,b
def eng(x):
    """Return a string representing x in an engineer friendly notation"""
    a,b = powerise10(x)
    if -3 < b < 3: return "%.4g" % x
    a = a * 10**(b % 3)
    b = b - b % 3
    return "%.4gE%s" % (a,b)
Source: https://code.activestate.com/recipes/578238-engineering-notation/
Test result
>>> eng(0.0001)
100E-6
Like the answers above, but a bit more compact:
from math import log10, floor
def eng_format(x,precision=3):
    """Returns string in engineering format, i.e. 100.1e-3"""
    x = float(x) # work on a float copy of the input
    if x == 0:
        a,b = 0,0
    else:
        sgn = 1.0 if x > 0 else -1.0
        mag = abs(x) # keep x itself signed so small negative values keep their sign
        a = sgn * mag / 10**(floor(log10(mag)))
        b = int(floor(log10(mag)))
    if -3 < b < 3:
        return ("%." + str(precision) + "g") % x
    else:
        a = a * 10**(b % 3)
        b = b - b % 3
        return ("%." + str(precision) + "gE%s") % (a,b)
Trial:
In [10]: eng_format(-1.2345e-4,precision=5)
Out[10]: '-123.45E-6'

drop trailing zeros from decimal

I have a long list of Decimals that I have to adjust by factors of 10, 100, 1000, ..., 1000000 depending on certain conditions. When I multiply them there is sometimes a useless trailing zero (though not always) that I want to get rid of. For example:
from decimal import Decimal
# outputs 25.0, PROBLEM! I would like it to output 25
print Decimal('2.5') * 10
# outputs 2567.8000, PROBLEM! I would like it to output 2567.8
print Decimal('2.5678') * 1000
Is there a function that tells the decimal object to drop these insignificant zeros? The only way I can think of doing this is to convert to a string and replace them using regular expressions.
Should probably mention that I am using python 2.6.5
EDIT
senderle's fine answer made me realize that I occasionally get a number like 250.0, which when normalized produces 2.5E+2. I guess in these cases I could try to sort them out and convert to an int.
You can use the normalize method to remove extra precision.
>>> print decimal.Decimal('5.500')
5.500
>>> print decimal.Decimal('5.500').normalize()
5.5
To avoid stripping zeros to the left of the decimal point, you could do this:
def normalize_fraction(d):
    normalized = d.normalize()
    sign, digits, exponent = normalized.as_tuple()
    if exponent > 0:
        return decimal.Decimal((sign, digits + (0,) * exponent, 0))
    else:
        return normalized
Or more compactly, using quantize as suggested by user7116:
def normalize_fraction(d):
    normalized = d.normalize()
    sign, digit, exponent = normalized.as_tuple()
    return normalized if exponent <= 0 else normalized.quantize(1)
You could also use to_integral() as shown here but I think using as_tuple this way is more self-documenting.
I tested these both against a few cases; please leave a comment if you find something that doesn't work.
>>> normalize_fraction(decimal.Decimal('55.5'))
Decimal('55.5')
>>> normalize_fraction(decimal.Decimal('55.500'))
Decimal('55.5')
>>> normalize_fraction(decimal.Decimal('55500'))
Decimal('55500')
>>> normalize_fraction(decimal.Decimal('555E2'))
Decimal('55500')
There's probably a better way of doing this, but you could use .rstrip('0').rstrip('.') to achieve the result that you want.
Using your numbers as an example:
>>> s = str(Decimal('2.5') * 10)
>>> print s.rstrip('0').rstrip('.') if '.' in s else s
25
>>> s = str(Decimal('2.5678') * 1000)
>>> print s.rstrip('0').rstrip('.') if '.' in s else s
2567.8
And here's the fix for the problem that @gerrit pointed out in the comments:
>>> s = str(Decimal('1500'))
>>> print s.rstrip('0').rstrip('.') if '.' in s else s
1500
Answer from the Decimal FAQ in the documentation:
>>> def remove_exponent(d):
...     return d.quantize(Decimal(1)) if d == d.to_integral() else d.normalize()
>>> remove_exponent(Decimal('5.00'))
Decimal('5')
>>> remove_exponent(Decimal('5.500'))
Decimal('5.5')
>>> remove_exponent(Decimal('5E+3'))
Decimal('5000')
The answer is given in the FAQ (https://docs.python.org/2/library/decimal.html#decimal-faq), but the FAQ does not explain how it works.
To drop trailing zeros from the fractional part, use normalize:
>>> Decimal('100.2000').normalize()
Decimal('100.2')
>>> Decimal('0.2000').normalize()
Decimal('0.2')
But this behaves differently for numbers whose integer part ends in zeros and whose fractional part is all zeros:
>>> Decimal('100.0000').normalize()
Decimal('1E+2')
In this case we should use to_integral:
>>> Decimal('100.000').to_integral()
Decimal('100')
So we could check if there's a fraction part:
>>> Decimal('100.2000') == Decimal('100.2000').to_integral()
False
>>> Decimal('100.0000') == Decimal('100.0000').to_integral()
True
And use appropriate method then:
def remove_exponent(num):
    return num.to_integral() if num == num.to_integral() else num.normalize()
Try it:
>>> remove_exponent(Decimal('100.2000'))
Decimal('100.2')
>>> remove_exponent(Decimal('100.0000'))
Decimal('100')
>>> remove_exponent(Decimal('0.2000'))
Decimal('0.2')
Now we're done.
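One caveat, sketched here in Python 3: to_integral() returns a value unchanged when its exponent is already positive, so the to_integral-based variant leaves Decimal('1E+3') as it is, whereas the quantize-based version from the FAQ expands it. The two function names below are illustrative only, chosen to keep the variants apart.

```python
from decimal import Decimal

def remove_exponent_to_integral(num):
    # Variant from this answer: relies on to_integral() for whole numbers.
    return num.to_integral() if num == num.to_integral() else num.normalize()

def remove_exponent_quantize(d):
    # Variant from the Decimal FAQ: quantize forces the exponent to 0.
    return d.quantize(Decimal(1)) if d == d.to_integral() else d.normalize()

print(remove_exponent_to_integral(Decimal('100.0000')))  # 100
print(remove_exponent_to_integral(Decimal('1E+3')))      # 1E+3
print(remove_exponent_quantize(Decimal('1E+3')))         # 1000
```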
Use the format specifier %g. It seems to remove trailing zeros.
>>> "%g" % (Decimal('2.5') * 10)
'25'
>>> "%g" % (Decimal('2.5678') * 1000)
'2567.8'
It also works without the Decimal function
>>> "%g" % (2.5 * 10)
'25'
>>> "%g" % (2.5678 * 1000)
'2567.8'
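A word of caution with %g, shown as a short Python 3 sketch: the value is converted to a float and formatted with 6 significant digits by default, so longer numbers are rounded and switch to exponent notation unless a precision is given. The variable names are only for illustration.

```python
from decimal import Decimal

short = '%g' % Decimal('2567.8000')       # fits within 6 significant digits
rounded = '%g' % Decimal('1234567.89')    # more than 6 digits: rounded, exponent form
wider = '%.12g' % Decimal('1234567.89')   # explicit precision keeps all the digits

print(short)    # 2567.8
print(rounded)  # 1.23457e+06
print(wider)    # 1234567.89
```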
I ended up doing this:
import decimal
def dropzeros(number):
    mynum = decimal.Decimal(number).normalize()
    # e.g 22000 --> Decimal('2.2E+4')
    return mynum.__trunc__() if not mynum % 1 else float(mynum)
print dropzeros(22000.000)
22000
print dropzeros(2567.8000)
2567.8
note: casting the return value as a string will limit you to 12 significant digits
Slightly modified version of A-IV's answer
NOTE that Decimal('0.99999999999999999999999999995').normalize() will round to Decimal('1')
import decimal
def trailing(s: str, char="0"):
    return len(s) - len(s.rstrip(char))
def decimal_to_str(value: decimal.Decimal):
    """Convert decimal to str
    * Uses exponential notation when there are more than 4 trailing zeros
    * Handles decimal.InvalidOperation
    """
    # to_integral_value() removes decimals
    if value == value.to_integral_value():
        try:
            value = value.quantize(decimal.Decimal(1))
        except decimal.InvalidOperation:
            pass
        uncast = str(value)
        # use exponential notation if there are more than 4 zeros
        return str(value.normalize()) if trailing(uncast) > 4 else uncast
    else:
        # normalize values with decimal places
        return str(value.normalize())
        # or str(value).rstrip('0') if rounding edgecases are a concern
You could use :g to achieve this:
'{:g}'.format(3.140)
gives
'3.14'
This should work:
'{:f}'.format(decimal.Decimal('2.5') * 10).rstrip('0').rstrip('.')
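Note that the one-liner strips zeros unconditionally, so a Decimal without a fractional part (for example Decimal('1500')) would lose its integer zeros. A guarded variant might look like the following Python 3 sketch; the helper name format_plain is an assumption made for this example.

```python
from decimal import Decimal

def format_plain(d):
    """Fixed-point string for d without trailing fractional zeros."""
    s = format(d, 'f')          # 'f' never uses exponent notation
    if '.' in s:                # only strip when there is a fractional part
        s = s.rstrip('0').rstrip('.')
    return s

print(format_plain(Decimal('2.5') * 10))       # 25
print(format_plain(Decimal('2.5678') * 1000))  # 2567.8
print(format_plain(Decimal('1500')))           # 1500
print(format_plain(Decimal('1E+3')))           # 1000
```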
Just to show a different possibility, I used as_tuple() to achieve the same result.
from decimal import Decimal
def my_normalize(dec):
    """
    >>> my_normalize(Decimal("12.500"))
    Decimal('12.5')
    >>> my_normalize(Decimal("-0.12500"))
    Decimal('-0.125')
    >>> my_normalize(Decimal("0.125"))
    Decimal('0.125')
    >>> my_normalize(Decimal("0.00125"))
    Decimal('0.00125')
    >>> my_normalize(Decimal("125.00"))
    Decimal('125')
    >>> my_normalize(Decimal("12500"))
    Decimal('12500')
    >>> my_normalize(Decimal("0.000"))
    Decimal('0')
    """
    if dec is None:
        return None
    sign, digs, exp = dec.as_tuple()
    # drop trailing zero digits while the exponent is still negative
    for i in list(reversed(digs)):
        if exp >= 0 or i != 0:
            break
        exp += 1
        digs = digs[:-1]
    if not digs and exp < 0:
        exp = 0
    return Decimal((sign, digs, exp))
Why not take the value times 10 modulo 10 to check whether there is a remainder? No remainder means you can safely convert with int()
if (x * 10) % 10 == 0:
    x = int(x)
x = 2/1
Output: 2
x = 3/2
Output: 1.5
