Python - Convert scientific notation string to float retaining decimal places

I am using Python to read some float values from a file. The values read are fed as input to an Ada program.
The values being read are in different formats (scientific, decimal) and I would like to retain the format.
Everything works well with a simple float() operation, except when converting '1.0e-5' to float.
>>> float('1.0e-5')
# returns 1e-05
1e-05, when used in the Ada program, gives:
error: negative exponent not allowed for integer literal
1.0e-5 works with the Ada program.
I know if I use format I can get the decimal point back:
>>> "{:.1E}".format(float('1.0e-5'))
# returns '1.0E-05'
But this changes the format of the other read values too, as my reading/manipulation function is shared by all of them.
How should I approach this problem?
And if
float('1.0')
# returns 1.0
why is the same behaviour not followed when converting a scientific notation string to float?
(My reading/manipulation function is common to all values, so using a format string would change the formatting of the other read values as well.)

You could use a custom float-to-string conversion function which checks, with a regular expression, whether the number will be accepted by Ada (i.e. whether there are only non-dot characters before the exponent character), and only converts with format in that case:
import re

def ada_compliant_float_as_string(f):
    # reformat only if the default string form has no dot before the exponent
    return "{:.1e}".format(f) if re.match(r"^-?[^.]*e", str(f)) else str(f)

for f in [-1e-5, 1e-5, 1.4e-5, -12e4, 1, 1.0]:
    print(ada_compliant_float_as_string(f))
prints:
-1.0e-05
1.0e-05
1.4e-05
-120000.0
1
1.0
Only the first two values are corrected; the other values are just the unchanged string representation of a float.
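An alternative sketch, not from the answer above: since the values are read from a file as strings, one could repair the string itself and never round-trip through float at all, which leaves every other format untouched (the function name here is made up for illustration):

import re

def fix_mantissa(s):
    # insert '.0' before the exponent if the mantissa lacks a decimal point
    return re.sub(r'^(-?\d+)([eE])', r'\1.0\2', s)

for s in ['1.0e-5', '1e-5', '3.14', '42']:
    print(fix_mantissa(s))  # 1.0e-5, 1.0e-5, 3.14, 42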

Related

python how convert a binary string to a binary number [duplicate]

I can't quite find a solution for this.
Basically, what I've done so far is create a string which represents the binary version of x characters, each padded to show all 8 bits.
E.g. if x = 2 then I have 0101100110010001, so 16 digits in total. Now I have 2 strings of the same length which I want to XOR together, but Python keeps treating them as plain strings. If I use bin() it complains because the value is a string, which it is. And if I cast to int, it removes the leading 0's.
So I've already got the binary representation of what I'm after; I just need to let Python know it's binary. Any suggestions?
The current function I'm using to create my binary string is here:
for i in origAsci:
    origBin = origBin + '{0:08b}'.format(i)
Thanks in advance!
Use Python's int() function to convert the string to an integer, passing 2 as the base parameter since binary uses base 2:
binary_str = '10010110'  # binary string
num = int(binary_str, 2)
# num is now 150
Next, use the bin() function to convert the integer back to a binary string:
binary_num = bin(num)
# binary_num is '0b10010110'
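The asker's actual goal - XORing two equal-length padded binary strings - then becomes a two-liner: convert both to int with base 2, XOR the ints, and format the result back to the original width so the leading zeros survive. A minimal sketch (the second string is made up for illustration):

a = '0101100110010001'
b = '1100001010101010'  # hypothetical second operand of the same length
xored = int(a, 2) ^ int(b, 2)
print('{0:0{1}b}'.format(xored, len(a)))  # '1001101100111011'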

Convert Python Series of strings into float with 18 decimals

I have the following pandas Series:
my_series = ['150000000000000000000000', '45064744242514231410', '2618611848503168287542', '7673975728717793369']
Every number in the list has 18 decimal places (that's what dictates what number exactly it is, prior to seeing any formatting).
my_series[0], therefore, is 150,000.000000000000000000 (one hundred and fifty thousand).
my_series[1], therefore, is 45.064744242514231410 (forty-five...).
And so on.
I basically want Python to recognize the strings and turn them into the correct float for me to make calculations with this Series later.
I don't need to print the correctly formatted number; rather, I need Python to recognize it's 150,000 instead of 1,500,000,000 and so on.
Example for my_series[2] of what the correct float would be:
2,618.61
My current code:
[float("{:.18f}".format(int(item) for item in my_series))]
Which yields me the following error:
TypeError: unsupported format string passed to generator.__format__
How do I format the strings in the Series according to my requirements above and get the correct float?
You can convert the string to float and then apply formatting:
my_series = ['150000000000000000000000', '45064744242514231410',
             '2618611848503168287542', '7673975728717793369']
["{:,.2f}".format(float(item) / 10**18) for item in my_series]
# ['150,000.00', '45.06', '2,618.61', '7.67']
Note that this may lose some precision when converting the string to float. If that is a problem for you, then you may want to either:
separate the integer part and the decimal part and combine them when printing, or
use the Decimal class (see the sketch below).
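A small sketch of the Decimal route just mentioned: Decimal keeps all 18 decimal places exactly instead of rounding through a binary float:

from decimal import Decimal

my_series = ['150000000000000000000000', '45064744242514231410']
exact = [Decimal(item) / Decimal(10**18) for item in my_series]
print(exact[0])  # 150000
print(exact[1])  # 45.06474424251423141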
After a few iterations, I think I understand what the OP was going for, so I changed my example. The OP does not seem to be worried about loss of precision and was getting ValueErrors (probably due to invalid fields coming in as part of the Series). I've modified my sample to be closer to how it would happen in pandas by adding some deliberately fake inputs.
my_series = [
    "not a number",
    "",
    "150000000000000000000000",
    "45064744242514231410",
    "2618611848503168287542",
    "7673975728717793369",
]

def convert_to_float(number):
    # splice a decimal point in front of the last 18 digits, then parse
    try:
        float_string = f"{int(number[:-18])}.{number[-18:]}"
        my_float = float(float_string)
    except ValueError as e:
        print(e)
        return None
    return my_float

numbers = list(map(convert_to_float, my_series))
for num in numbers:
    if num:
        print(f"{num:.18f}")

Convert a scientific notation to decimal number, and still keeping the data format as float64

I have a simple question.
I am using the numpy.std function to calculate the standard deviation. However, the result comes out as a number in scientific notation.
Therefore I am asking: how can I convert a number in scientific notation to a decimal number, while keeping the data type float64?
Or is there any workaround to get the initial result as a decimal number?
Solutions such as this:
stdev = format(stdev, '.10f')
convert the data into a string object, which I don't want.
Here is my code:
stdev = numpy.std(dataset)
print(stdev)
Result: 4.999999999999449e-05
print(stdev.dtype)
Result: float64
Expected result:
I want the result as a decimal number in float64 format.
I suppose it is just a matter of representation, since the data type, as you showed, is still float64.
If this is right, you can just print it as a formatted string:
>>> x = 123456789e-15
>>> print(x)
1.23456789e-07
>>> print("{:.15f}".format(x))
0.000000123456789
The .15f specifies that you want a float format with 15 decimals.
From Python 3.6 you can use the f-string format:
>>> print(f"{x:.15f}")
0.000000123456789
Have a look at the string format documentation.
You are confusing representation with internal storage. By default, Python shows the value in scientific form; you can force another form of representation, but you cannot affect the value itself.
Vanilla Python does not have a notion of float64 (it is not that great for exact calculations), but numpy does.
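If the value lives in a numpy array, another representation-only knob is numpy's own print options; this sketch assumes only numpy itself and suppresses scientific notation in array display without touching the dtype:

import numpy as np

stdev = np.float64(4.999999999999449e-05)
np.set_printoptions(suppress=True)  # show fixed-point instead of scientific
print(np.array([stdev]))            # prints something like [0.00005]
print(stdev.dtype)                  # float64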

How to create a custom NaN (single precision) in python without setting the 23rd bit?

I'm trying to create floating-point NaNs by choosing the fraction bits. But it seems that Python always sets the 23rd fraction bit (IEEE 754 single precision) when it interprets a NaN.
So my question is: is it possible to define a float NaN in Python without it setting the 23rd bit?
(I'm using Python 2.7.)
NaNs in IEEE 754 have this format:
sign = either 0 or 1.
biased exponent = all 1 bits.
fraction = anything except all 0 bits (since all 0 bits represent infinity).
So, a hex representation for a NaN could be 0x7F800001, but interpreting this int as a float and converting it back to an int gives 0x7FC00001.
1st try: struct.pack/unpack:

import struct

def hex_to_float(value):
    return struct.unpack('@f', struct.pack('@L', value))[0]

def float_to_hex(value):
    return struct.unpack('@L', struct.pack('@f', value))[0]

print hex(float_to_hex(hex_to_float(0x7f800001)))
# 0x7fc00001
2nd try: ctypes:

import ctypes

def float2hex(float_input):
    INTP = ctypes.POINTER(ctypes.c_uint)
    float_value = ctypes.c_float(float_input)
    my_pointer = ctypes.cast(ctypes.addressof(float_value), INTP)
    return my_pointer.contents.value

def hex2float(hex_input):
    FLOATP = ctypes.POINTER(ctypes.c_float)
    int_value = ctypes.c_uint(hex_input)
    my_pointer = ctypes.cast(ctypes.addressof(int_value), FLOATP)
    return my_pointer.contents.value

print hex(float2hex(hex2float(0x7f800001)))
# 0x7fc00001L
3rd try: xdrlib packers. Same result.
The underlying problem is that you convert a C float (which has 32 bits) to a Python float (which has 64 bits, i.e. a double in C parlance) and then back to a C float.
Executing both conversions one after the other doesn't always lead back to the original input; you are witnessing such a case.
If the exact bit pattern is important, you should avoid the above conversions at any cost.
Here are some gory details.
When struct.unpack('=f', some_bytes) is executed (please note that I use the standard-size '=' format character, as compared to your usage of native size ('@'); for example, '@L' means different things on Windows and Linux), the following happens:
unpack_float is called, which calls
_PyFloat_Unpack4, which interprets the data as a
32-bit C float, i.e. float,
but converts it to double (because the function returns a double) while returning.
On x86-64 the last conversion means the operation VCVTSS2SD (i.e. Convert Scalar Single-Precision Floating-Point Value to Scalar Double-Precision Floating-Point Value), and this operation results in
0x7f800001 becoming 0x7ff8000020000000.
As you can see, the result of the operation struct.unpack('=f', struct.pack('=L', value))[0] is already not what was put in.
However, calling struct.pack('=f', value) for a Python float value (which is a wrapper around C's double) gets us to _PyFloat_Pack4, where the conversion from double to float happens, i.e. CVTSD2SS (Convert Scalar Double-Precision Floating-Point Value to Scalar Single-Precision Floating-Point Value) is called, and
0x7ff8000020000000 becomes 0x7fc00001.
What are you really trying to do?
Any Python code consuming floats will ignore a "specially crafted" NaN at best, and crash at worst.
If you are passing this value to something outside Python code - serializing it, or calling a C API - just define it with the exact bytes you want using struct, and send those bytes to your desired destination, as in the sketch below.
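A minimal sketch of that "exact bytes" idea (Python 2 print syntax, matching the question): build the bit pattern as an integer and pack it directly, never round-tripping through a Python float, so the quiet bit stays exactly as you set it:

import struct

nan_bits = 0x7F800001                  # signalling NaN: quiet bit left clear
payload = struct.pack('<I', nan_bits)  # the exact 4 bytes, little-endian
print repr(payload)                    # '\x01\x00\x80\x7f'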
Also, if you are using NumPy, then yes, you can create the special NaNs and expect them to be retained within an ndarray - but the way to do that is also by dictating the exact bytes you want with struct, and converting the data type while preserving the buffer contents.
Check this answer on building 80-bit double numbers to use with NumPy to get hold of a workaround: Longdouble(1e3000) becomes inf: What can I do?
I tried numpy.frombuffer here, and it interprets the byte sequence you crafted as a 32-bit float, if that will suit you:
import numpy as np
import binascii

a = "7f800001"
b = binascii.unhexlify(a)  # in Python 2, a.decode("hex") would work as well
# little-endian format, so we need to reverse the byte order
c = b[::-1]  # slicing works on both str (Python 2) and bytes (Python 3)
x = np.frombuffer(c, dtype="float32")
x.tobytes()
will print the original -
'\x01\x00\x80\x7f'
And checking the array x will show it is actually a NaN:
>>> x
array([nan], dtype=float32)
However, for the reasons above, if you extract the value from the numpy array with x[0], it will be converted to a "pasteurized" float64 NaN, with the default payload.

How to take an integer array and convert it into other types?

I'm currently trying to take integer arrays that actually represent other data types and convert them into the correct data type.
So, for example, if I had the integer array [1196773188, 542327116], I might discover from some other function that this integer array represents a string, convert it, and realize it represents the string "DOUGLAS". The first number translates to the hexadecimal number 0x47554F44 and the second number to the hexadecimal number 0x2053414C. Using a hex-to-string converter, these correspond to the strings 'GUOD' and ' SAL' respectively, spelling DOUGLAS in a little-endian manner. The way the letters are backwards in individual elements of the array likely stems from the bytes being stored in a little-endian manner, although I might be mistaken on that.
These integer arrays could represent a number of datatypes, including strings, booleans, and floats.
I need to use Python 2.7, so I unfortunately can't use the bytes function.
Is there a simple way to convert an integer array to its corresponding datatype?
It seems that the struct module is the best way to go when converting between different types like this:

import struct

bufferstr = ""
dougarray = [1196773188, 542327116]
for num in dougarray:
    bufferstr += struct.pack("i", num)
print bufferstr  # prints 'DOUGLAS'

From this point on we can easily convert 'DOUGLAS' to any datatype we want using struct.unpack():
print struct.unpack("f", bufferstr[0:4])  # prints (54607.265625,)
We can only unpack a certain number of bytes at a time, however. Thank you all for the suggestions!
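As a follow-up sketch (Python 2, matching the answer's print syntax): the whole integer array can be packed in one call, and the same bytes can then be reinterpreted as other types; the explicit '<' prefix pins the byte order and item size, which the bare "i" format leaves platform-dependent:

import struct

dougarray = [1196773188, 542327116]
buf = struct.pack("<2i", *dougarray)  # two little-endian 4-byte ints
print buf                             # 'DOUGLAS ' (0x20 gives a trailing space)
print struct.unpack("<2f", buf)       # the same 8 bytes read as two float32s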
