How to convert bytearray in Python 2.7 to decimal string?

I have the following byte array
>>> string_ba
bytearray(b'4\x00/\t\xb5')
which is easily converted to hex string with the next 2 lines:
hex_string = [chr(x).encode('hex') for x in string_ba]
hex_string = ''.join(hex_string)
which return
>>> hex_string.lower()
'34002f09b5'
which is expected. (This is an RFID card signature)
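As an aside (not part of the original question): in Python 2.7 the two conversion lines above can be collapsed into one with binascii, which should produce the same string:
import binascii
hex_string = binascii.hexlify(str(string_ba))  # '34002f09b5'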
I convert this to decimal by doing the above and then converting from hex string to decimal string (padded with zeros) with the next line. I have a limit of 10 characters in the string, so I'm forced to remove the first 2 characters of the hex string to be able to convert it to, at most, a 10-character decimal number.
dec_string = str(int(hex_string[2:], 16)).zfill(10)
>>> dec_string
'0003082677'
which is correct, as I tested this with an online converter (hex: 002f09b5, dec: 3082677)
The question is if there's a way to skip converting from bytearray to hex_string and obtain a decimal string directly. In other words, to go straight from bytearray to dec_string.
This will be running on Python 2.7.15.
>>> sys.version
'2.7.15rc1 (default, Apr 15 2018, 21:51:34) \n[GCC 7.3.0]'
I've tried removing the first element from bytearray and then converting it to string directly and joining. But this does not provide the desired result.
string_ba = string_ba[1:]
test_string = [str(x) for x in string_ba]
test_dec_string = ''.join(test_string).zfill(10)
>>> test_dec_string
'0000479181'
To repeat the question: is there a way to go straight from bytearray to decimal string?

You can use the struct library to convert a bytearray to a decimal value. This Code Review question may help you: https://codereview.stackexchange.com/questions/142915/converting-a-bytearray-into-an-integer
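Building on that suggestion, here is a minimal sketch for Python 2.7 (assuming, as in the question, that the first byte is dropped so the result fits in 10 decimal digits):

import struct

string_ba = bytearray(b'4\x00/\t\xb5')
# '>I' reads the remaining 4 bytes as one big-endian unsigned integer,
# skipping the intermediate hex string entirely
val = struct.unpack('>I', str(string_ba[1:]))[0]
dec_string = str(val).zfill(10)
print(dec_string)  # '0003082677'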

A number (let's call it X) consisting of n digits (in a base, let's refer to it as B) is written as:
D_(n-1) D_(n-2) D_(n-3) ... D_2 D_1 D_0 (where each D_i is a digit in base B)
and its value can be computed with the formula:
V_X = sum(i = 0 .. n-1) of B^i * D_i (notice that in this example the number is traversed from right to left - the traversal direction doesn't affect the final value).
As an example, the number 2468 (base 10) = 10^0 * 8 + 10^1 * 6 + 10^2 * 4 + 10^3 * 2 (= 8 + 60 + 400 + 2000).
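For instance, a quick check of that formula in Python, with the digits of 2468 listed from right to left:

>>> sum(d * 10**i for i, d in enumerate([8, 6, 4, 2]))
2468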
An ASCII string is actually a number in base 256 (0x100), where each char (byte) is a digit.
Here's an alternative based on the above:
It only performs mathematical operations on integers (the conversion to string is done only at the end)
The right-to-left traversal (from above) works well with the restriction (the final (decimal) number must fit in a given number of digits, and in case of overflow the most significant ones are to be ignored)
The algorithm is simple: starting from the right, accumulate the partial number value until reaching the maximum allowed value or exhausting the digits (the bytes)
code.py:
#!/usr/bin/env python

import sys

DEFAULT_MAX_DIGITS = 10


def convert(array, max_digits=DEFAULT_MAX_DIGITS):
    max_val = 10 ** max_digits
    number_val = 0
    for idx, digit in enumerate(reversed(array)):
        cur_val = 256 ** idx * digit
        if number_val + cur_val >= max_val:  # >=: a sum equal to 10 ** max_digits already has too many digits
            break
        number_val += cur_val
    return str(number_val).zfill(max_digits)


def main():
    b = bytearray("4\x00/\t\xb5")
    print("b: {:}\n".format(repr(b)))
    for max_digits in range(6, 15, 2):
        print("Conversion of b (with max {:02d} digits): {:}{:s}".format(
            max_digits, convert(b, max_digits=max_digits),
            " (!!! Default case - required in the question)" if max_digits == DEFAULT_MAX_DIGITS else ""
        ))


if __name__ == "__main__":
    print("Python {:s} on {:s}\n".format(sys.version, sys.platform))
    main()
Output:
(py_064_02.07.15_test0) e:\Work\Dev\StackOverflow\q054091895>"e:\Work\Dev\VEnvs\py_064_02.07.15_test0\Scripts\python.exe" code.py
Python 2.7.15 (v2.7.15:ca079a3ea3, Apr 30 2018, 16:30:26) [MSC v.1500 64 bit (AMD64)] on win32
b: bytearray(b'4\x00/\t\xb5')
Conversion of b (with max 06 digits): 002485
Conversion of b (with max 08 digits): 03082677
Conversion of b (with max 10 digits): 0003082677 (!!! Default case - required in the question)
Conversion of b (with max 12 digits): 223341382069
Conversion of b (with max 14 digits): 00223341382069

Related

How to convert a number to base 16 with math (not using strings)?

Say I have
1009732533765201
and I want:
0x1009732533765201, which is
1155581383011619329
You can do this in programming languages with strings, like this:
int('1009732533765201', 16)
but I want the pure math way: to convert 1009732533765201 to its base-16 reinterpretation, 1155581383011619329.
I tried int('1009732533765201', 16), but this uses a string and is slow for large numbers; I'm looking for a math angle only.
Here is a math way I know how to write it:
0x1009732533765201 = 1155581383011619329
Here is a Python way to do it:
int('1009732533765201', 16)
but I can only do the first, math version manually. How can this be accomplished without building a string (concatenating '0x' to '1009732533765201') and without using eval?
Is there any way to take 1009732533765201 and convert it to the same output as 0x1009732533765201, to get its integer, without using int('1009732533765201', 16)? My goal is to find a faster approach.
ANSWERED BY PARTHIAN SHOT. Here is the result of his approach, which is exactly what I was looking for - a way to do this without int():
orig = 1009732533765201
num = orig
result = 0
i = 0
while num != 0:
    result += (num % 10) * (16 ** i)
    num //= 10
    i += 1
print(orig, num, result, "%x" % (result))
1009732533765201 0 1155581383011619329 1009732533765201
As I said in my comment, Python knows, out of the box, how to deal with base 16 numbers. Just go ahead and assign the base 16 value to a variable.
Here is an example:
Python 3.7.4 (default, Aug 12 2019, 14:45:07)
[GCC 9.1.1 20190605 (Red Hat 9.1.1-2)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> i=0x16
>>> i
22
>>> i=0xAA
>>> i
170
>>>
And as I said, that works for other bases, like base 2:
>>> i=0b1010
>>> i
10
>>>
And base 8:
>>> i=0o12
>>> i
10
>>>
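A side note, not from the original answer: when such a literal arrives as a string, int() with base 0 accepts the same prefixes:

>>> int('0x16', 0)
22
>>> int('0b1010', 0)
10
>>> int('0o12', 0)
10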
The idea here is that we're looking at each digit of the original decimal number one at a time, from right to left, and multiplying it by 16 ** i instead of 10 ** i, then adding it to the new number.
The result is the equivalent of interpreting the original decimal representation of the number as if it were hexadecimal.
#!/usr/bin/env python
orig = 34029235
num = orig
result = 0
i = 0
while num != 0:
    result += (num % 10) * (16 ** i)
    num //= 10
    i += 1
print(orig, num, result, "%x" % (result))
And running that code gets us...
bash$ ./main.py
(34029235, 0, 872583733, '34029235')
bash$
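To make the recipe reusable, here is a minimal generalization of the loop above (a sketch; the name reinterpret_digits is mine, not from the answer). It reinterprets the base-10 digits of n as digits in another base:

def reinterpret_digits(n, base=16):
    result, i = 0, 0
    while n:
        n, digit = divmod(n, 10)     # peel off the lowest decimal digit
        result += digit * base ** i  # weight it by the target base instead of 10
        i += 1
    return result

assert reinterpret_digits(34029235) == 0x34029235
assert reinterpret_digits(1009732533765201) == 0x1009732533765201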

Convert amount (int) to BCD

I need to convert an int (an amount) to BCD, left-padded to 6 bytes, in Python.
int = 145
expect = "\x00\x00\x00\x00\x01\x45"
The closest I've come is this code (but it would need to loop over byte pairs):
def TO_BCD(value):
    return chr((((value / 10) << 4) & 0xF0) + ((value % 10) & 0x0F))
int = 145
TO_BCD(int) # => "\x00\x00\x00\x00\x01\x45" (expected)
This seems fairly simple, and gets the answer you were looking for. Just isolate each pair of digits and convert to ASCII.
If I were doing this in high volume then I'd probably build a table (perhaps in numpy) of all the possible 100 values per byte and index it with each pair of digits in the input (a sketch of that idea follows the output below).
m = 145
print(''.join(f"\\x{m // 10**i % 10}{m // 10**(i-1) % 10}" for i in range(11, -1, -2)))
Output, although it's just a string, not any internal BCD representation
\x00\x00\x00\x00\x01\x45
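Here is a minimal sketch of that lookup-table idea in plain Python rather than numpy (BCD_TABLE and to_bcd6 are names invented here, not from the answer):

# Precompute the packed-BCD byte for every two-digit value 0..99,
# then index into the table once per byte pair.
BCD_TABLE = bytes((d // 10 << 4) | (d % 10) for d in range(100))

def to_bcd6(value):
    """Pack a non-negative int into 6 packed-BCD bytes (12 digits)."""
    out = bytearray(6)
    for i in range(5, -1, -1):
        value, pair = divmod(value, 100)  # peel off two decimal digits
        out[i] = BCD_TABLE[pair]
    return bytes(out)

print(to_bcd6(145))  # b'\x00\x00\x00\x00\x01E' (0x45 displays as 'E')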
Along the same lines, you can pack the BCD into a byte string. When printed, Python will interpret BCD 45 as a capital E
import struct
m = 145
packed = struct.pack('6B', *[(m // 10**i % 10 << 4) + (m // 10**(i-1) % 10) for i in range(11, -1, -2)])
print(packed)
print(''.join(f"\\{p:02x}" for p in packed))
Output
b'\x00\x00\x00\x00\x01E'
\00\00\00\00\01\45
Here's an example.
script0.py:
#!/usr/bin/env python3
import sys
def bcd(value, length=0, pad='\x00'):
    ret = ""
    while value:
        value, ls4b = divmod(value, 10)
        value, ms4b = divmod(value, 10)
        ret = chr((ms4b << 4) + ls4b) + ret
    return pad * (length - len(ret)) + ret


def bcd_str(value, length=0, pad='\x00'):
    value_str = str(value)
    value_str = ("0" if len(value_str) % 2 else "") + value_str
    ret = ""
    for i in range(0, len(value_str), 2):
        ms4b = ord(value_str[i]) - 0x30
        ls4b = ord(value_str[i + 1]) - 0x30
        ret += chr((ms4b << 4) + ls4b)
    return pad * (length - len(ret)) + ret


def main():
    values = [
        145,
        5,
        123456,
    ]
    for value in values:
        print("{0:d} - [{1:s}] - [{2:s}]".format(value, repr(bcd(value, length=6)), repr(bcd_str(value, length=6))))
    # Bonus
    speed_test = 1
    if speed_test:
        import timeit  # Anti pattern: only import at the beginning of the file
        print("\nTesting speed:")
        stmt = "bcd({0:d})".format(1234567890 ** 32)
        count = 100000
        for func_name in ["bcd", "bcd_str"]:
            print("  {0:s}: {1:.03f} secs".format(func_name, timeit.timeit(stmt, setup="from __main__ import {0:s} as bcd".format(func_name), number=count)))


if __name__ == "__main__":
    print("Python {0:s} {1:d}bit on {2:s}\n".format(" ".join(item.strip() for item in sys.version.split("\n")), 64 if sys.maxsize > 0x100000000 else 32, sys.platform))
    main()
    print("\nDone.")
Output:
[cfati#CFATI-5510-0:e:\Work\Dev\StackOverflow\q057476837]> "e:\Work\Dev\VEnvs\py_064_03.07.03_test0\Scripts\python.exe" script0.py
Python 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 22:22:05) [MSC v.1916 64 bit (AMD64)] 64bit on win32
145 - ['\x00\x00\x00\x00\x01E'] - ['\x00\x00\x00\x00\x01E']
5 - ['\x00\x00\x00\x00\x00\x05'] - ['\x00\x00\x00\x00\x00\x05']
123456 - ['\x00\x00\x00\x124V'] - ['\x00\x00\x00\x124V']
Testing speed:
bcd: 17.107 secs
bcd_str: 8.021 secs
Done.
Notes:
Since you're working with packed BCD, each digit will be stored in 4 bits, and thus 2 digits will take one byte
The algorithm is simple: split the number into 2-digit groups; in each group, the 1st (most significant) digit is shifted to the left by 4 bits, and then the 2nd (least significant) one is added - this gives the char's ASCII code
The output might look a bit different than what you're expecting, but that's only due to display formatting: for example, the capital letter 'E' has ASCII code 0x45 (69), and can also be written as '\x45', so the output is correct
There are 2 implementations:
bcd - uses arithmetical operations
bcd_str - uses operations on strings
The speed test (at the end of main) yields a surprising result: the 2nd (string) variant is faster, by a factor of ~2. A brief explanation is that (in Python) the modulo operation is expensive (slow) on large numbers.

Python Decimal - engineering notation for milli (10e-3) and micro (10e-6)

Here is the example which is bothering me:
>>> x = decimal.Decimal('0.0001')
>>> print x.normalize()
0.0001
>>> print x.normalize().to_eng_string()
0.0001
Is there a way to have engineering notation for representing milli (10e-3) and micro (10e-6)?
Here's a function that does things explicitly, and also has support for using SI suffixes for the exponent:
import math

def eng_string( x, format='%s', si=False):
    '''
    Returns float/int value <x> formatted in a simplified engineering format -
    using an exponent that is a multiple of 3.

    format: printf-style string used to format the value before the exponent.

    si: if true, use SI suffix for exponent, e.g. k instead of e3, n instead of
    e-9 etc.

    E.g. with format='%.2f':
        1.23e-08 => 12.30e-9
             123 => 123.00
          1230.0 => 1.23e3
      -1230000.0 => -1.23e6

    and with si=True:
          1230.0 => 1.23k
      -1230000.0 => -1.23M
    '''
    sign = ''
    if x < 0:
        x = -x
        sign = '-'
    exp = int( math.floor( math.log10( x)))
    exp3 = exp - ( exp % 3)
    x3 = x / ( 10 ** exp3)

    if si and exp3 >= -24 and exp3 <= 24 and exp3 != 0:
        exp3_text = 'yzafpnum kMGTPEZY'[ ( exp3 - (-24)) / 3]
    elif exp3 == 0:
        exp3_text = ''
    else:
        exp3_text = 'e%s' % exp3

    return ( '%s'+format+'%s') % ( sign, x3, exp3_text)
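For example, under Python 2 (this original version relies on the integer / division when indexing the SI suffix string):

>>> eng_string(1.23e-08, format='%.2f', si=True)
'12.30n'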
EDIT:
Matplotlib implemented the engineering formatter, so one option is to use Matplotlib's formatter directly, e.g.:
import matplotlib as mpl
formatter = mpl.ticker.EngFormatter()
formatter(10000)
result: '10 k'
Original answer:
Based on Julian Smith's excellent answer (and this answer), I changed the function to improve on the following points:
Python3 compatible (integer division)
Compatible for 0 input
Rounding to significant number of digits, by default 3, no trailing zeros printed
so here's the updated function:
import math

def eng_string( x, sig_figs=3, si=True):
    """
    Returns float/int value <x> formatted in a simplified engineering format -
    using an exponent that is a multiple of 3.

    sig_figs: number of significant figures

    si: if true, use SI suffix for exponent, e.g. k instead of e3, n instead of
    e-9 etc.
    """
    x = float(x)
    sign = ''
    if x < 0:
        x = -x
        sign = '-'
    if x == 0:
        exp = 0
        exp3 = 0
        x3 = 0
    else:
        exp = int(math.floor(math.log10( x )))
        exp3 = exp - ( exp % 3)
        x3 = x / ( 10 ** exp3)
        x3 = round( x3, -int( math.floor(math.log10( x3 )) - (sig_figs-1)) )
        if x3 == int(x3):  # prevent from displaying .0
            x3 = int(x3)

    if si and exp3 >= -24 and exp3 <= 24 and exp3 != 0:
        exp3_text = 'yzafpnum kMGTPEZY'[ exp3 // 3 + 8]
    elif exp3 == 0:
        exp3_text = ''
    else:
        exp3_text = 'e%s' % exp3

    return ( '%s%s%s') % ( sign, x3, exp3_text)
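A few quick checks of the updated function (assuming it is run as-is under Python 3):

>>> eng_string(0.0001)
'100u'
>>> eng_string(0)
'0'
>>> eng_string(-1234567, sig_figs=4)
'-1.235M'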
The decimal module is following the Decimal Arithmetic Specification, which states:
This is outdated - see below
to-scientific-string – conversion to numeric string
[...]
The coefficient is first converted to a string in base ten using the characters 0 through 9 with no leading zeros (except if its value is zero, in which case a single 0 character is used).
Next, the adjusted exponent is calculated; this is the exponent, plus the number of characters in the converted coefficient, less one. That is, exponent+(clength-1), where clength is the length of the coefficient in decimal digits.
If the exponent is less than or equal to zero and the adjusted exponent is greater than or equal to -6, the number will be converted
to a character form without using exponential notation.
[...]
to-engineering-string – conversion to numeric string
This operation converts a number to a string, using engineering
notation if an exponent is needed.
The conversion exactly follows the rules for conversion to scientific
numeric string except in the case of finite numbers where exponential
notation is used. In this case, the converted exponent is adjusted to be a multiple of three (engineering notation) by positioning the decimal point with one, two, or three characters preceding it (that is, the part before the decimal point will range from 1 through 999).
This may require the addition of either one or two trailing zeros.
If after the adjustment the decimal point would not be followed by a digit then it is not added. If the final exponent is zero then no indicator letter and exponent is suffixed.
Examples:
For each abstract representation [sign, coefficient, exponent] on the left, the resulting string is shown on the right.
Representation    String
[0,123,1]         "1.23E+3"
[0,123,3]         "123E+3"
[0,123,-10]       "12.3E-9"
[1,123,-12]       "-123E-12"
[0,7,-7]          "700E-9"
[0,7,1]           "70"
Or, in other words:
>>> for n in (10 ** e for e in range(-1, -8, -1)):
...     d = Decimal(str(n))
...     print d.to_eng_string()
...
0.1
0.01
0.001
0.0001
0.00001
0.000001
100E-9
I realize that this is an old thread, but it does come near the top of a search for python engineering notation and it seems prudent to have this information located here.
I am an engineer who likes the "engineering 101" engineering units. I don't even like designations such as 0.1uF, I want that to read 100nF. I played with the Decimal class and didn't really like its behavior over the range of possible values, so I rolled a package called engineering_notation that is pip-installable.
pip install engineering_notation
From within Python:
>>> from engineering_notation import EngNumber
>>> EngNumber('1000000')
1M
>>> EngNumber(1000000)
1M
>>> EngNumber(1000000.0)
1M
>>> EngNumber('0.1u')
100n
>>> EngNumber('1000m')
1
This package also supports comparisons and other simple numerical operations.
https://github.com/slightlynybbled/engineering_notation
The «full» quote shows what is wrong!
The decimal module is indeed following the proprietary (IBM) Decimal Arithmetic Specification.
Quoting this IBM specification in its entirety clearly shows what is wrong with decimal.to_eng_string() (emphasis added):
to-engineering-string – conversion to numeric string
This operation converts a number to a string, using engineering
notation if an exponent is needed.
The conversion exactly follows the rules for conversion to scientific
numeric string except in the case of finite numbers where exponential
notation is used. In this case, the converted exponent is adjusted to be a multiple of three (engineering notation) by positioning the decimal point with one, two, or three characters preceding it (that is, the part before the decimal point will range from 1 through 999). This may require the addition of either one or two trailing zeros.
If after the adjustment the decimal point would not be followed by a digit then it is not added. If the final exponent is zero then no indicator letter and exponent is suffixed.
This proprietary IBM specification actually admits to not applying the engineering notation for numbers with an infinite decimal representation, for which ordinary scientific notation is used instead! This is obviously incorrect behaviour for which a Python bug report was opened.
Solution
from math import floor, log10

def powerise10(x):
    """ Returns x as a*10**b with 0 <= a < 10
    """
    if x == 0: return 0, 0
    Neg = x < 0
    if Neg: x = -x
    a = 1.0 * x / 10**(floor(log10(x)))
    b = int(floor(log10(x)))
    if Neg: a = -a
    return a, b

def eng(x):
    """Return a string representing x in an engineer friendly notation"""
    a, b = powerise10(x)
    if -3 < b < 3: return "%.4g" % x
    a = a * 10**(b % 3)
    b = b - b % 3
    return "%.4gE%s" % (a, b)
Source: https://code.activestate.com/recipes/578238-engineering-notation/
Test result
>>> eng(0.0001)
100E-6
Like the answers above, but a bit more compact:
from math import log10, floor

def eng_format(x, precision=3):
    """Returns string in engineering format, i.e. 100.1e-3"""
    x = float(x)  # inplace copy
    if x == 0:
        a, b = 0, 0
    else:
        sgn = 1.0 if x > 0 else -1.0
        x = abs(x)
        a = sgn * x / 10**(floor(log10(x)))
        b = int(floor(log10(x)))

    if -3 < b < 3:
        return ("%." + str(precision) + "g") % x
    else:
        a = a * 10**(b % 3)
        b = b - b % 3
        return ("%." + str(precision) + "gE%s") % (a, b)
Trial:
In [10]: eng_format(-1.2345e-4,precision=5)
Out[10]: '-123.45E-6'

python: unpack IBM 32-bit floating point

I was reading a binary file in python like this:
from struct import unpack

ns = 1000
f = open("binary_file", 'rb')
while True:
    data = f.read(ns * 4)
    if data == '':
        break
    unpacked = unpack(">%sf" % ns, data)
    print str(unpacked)
when I realized that unpack(">f", ...) is for unpacking IEEE floating point numbers, while my data are IBM 32-bit floating point numbers.
My question is:
How can I implement my unpack to handle IBM 32-bit floating point numbers?
I don't mind using something like ctypes to extend Python to get better performance.
EDIT: I did some searching:
http://mail.scipy.org/pipermail/scipy-user/2009-January/019392.html
This looks very promising, but I want something more efficient: there are potentially tens of thousands of loops.
EDIT: posted answer below. Thanks for the tip.
I think I understood it:
first unpack the string to an unsigned 4-byte integer, and then use this function:
def ibm2ieee(ibm):
    """
    Converts an IBM floating point number into IEEE format.
    :param: ibm - 32 bit unsigned integer: unpack('>L', f.read(4))
    """
    if ibm == 0:
        return 0.0
    sign = ibm >> 31 & 0x01
    exponent = ibm >> 24 & 0x7f
    mantissa = (ibm & 0x00ffffff) / float(pow(2, 24))
    return (1 - 2 * sign) * mantissa * pow(16, exponent - 64)
Thanks for all who helped!
IBM Floating Point Architecture, how to encode and decode:
http://en.wikipedia.org/wiki/IBM_Floating_Point_Architecture
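As a quick sanity check of ibm2ieee against the worked example on that Wikipedia page (the bit pattern encoding -118.625):

>>> ibm2ieee(int('11000010011101101010000000000000', 2))
-118.625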
My solution:
I wrote a class; I think that this way it can be a bit faster, because it uses a Struct object, so the unpack fmt is compiled only once.
EDIT: also because it's unpacking size*4 bytes all at once, and unpacking can be an expensive operation.
from struct import Struct

class StructIBM32(object):
    """
    see example in:
    http://en.wikipedia.org/wiki/IBM_Floating_Point_Architecture#An_Example

    >>> import struct
    >>> c = StructIBM32(1)
    >>> bit = '11000010011101101010000000000000'
    >>> c.unpack(struct.pack('>L', int(bit, 2)))
    [-118.625]
    """
    def __init__(self, size):
        self.p24 = float(pow(2, 24))
        self.unpack32int = Struct(">%sL" % size).unpack

    def unpack(self, data):
        int32 = self.unpack32int(data)
        return [self.ibm2ieee(i) for i in int32]

    def ibm2ieee(self, int32):
        if int32 == 0:
            return 0.0
        sign = int32 >> 31 & 0x01
        exponent = int32 >> 24 & 0x7f
        mantissa = (int32 & 0x00ffffff) / self.p24
        return (1 - 2 * sign) * mantissa * pow(16, exponent - 64)

if __name__ == "__main__":
    import doctest
    doctest.testmod()

Hex string to signed int in Python

How do I convert a hex string to a signed int in Python 3?
The best I can come up with is
h = '9DA92DAB'
b = bytes(h, 'utf-8')
ba = binascii.a2b_hex(b)
print(int.from_bytes(ba, byteorder='big', signed=True))
Is there a simpler way? Unsigned is so much easier: int(h, 16)
BTW, the origin of the question is itunes persistent id - music library xml version and iTunes hex version
In n-bit two's complement, bits have value:
bit 0 = 2^0
bit 1 = 2^1
...
bit n-2 = 2^(n-2)
bit n-1 = -2^(n-1)
But bit n-1 has value 2^(n-1) when unsigned, so the number is 2^n too high. Subtract 2^n if bit n-1 is set:
def twos_complement(hexstr, bits):
    value = int(hexstr, 16)
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value
print(twos_complement('FFFE', 16))
print(twos_complement('7FFF', 16))
print(twos_complement('7F', 8))
print(twos_complement('FF', 8))
Output:
-2
32767
127
-1
For Python 3 (with comments' help):
import struct
h = '9DA92DAB'
struct.unpack('>i', bytes.fromhex(h))
For Python 2:
h = '9DA92DAB'
struct.unpack('>i', h.decode('hex'))
or if it is little endian:
h = '9DA92DAB'
struct.unpack('<i', h.decode('hex'))
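The same struct pattern extends to the other fixed widths, e.g. in Python 3:

>>> struct.unpack('>h', bytes.fromhex('FFFE'))[0]
-2
>>> struct.unpack('>q', bytes.fromhex('FFFFFFFFFFFFFFFE'))[0]
-2

Note that struct only handles the sizes it knows ('b'/'h'/'i'/'q' for 1/2/4/8 bytes); for arbitrary hex lengths, see the mask-based functions below.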
Here's a general function you can use for hex of any size:
import math

# hex string to signed integer
def htosi(val):
    uintval = int(val, 16)
    bits = 4 * (len(val) - 2)  # assumes a leading '0x' prefix, as produced by hex()
    if uintval >= math.pow(2, bits - 1):
        uintval = int(0 - (math.pow(2, bits) - uintval))
    return uintval
And to use it:
h = str(hex(-5))
h2 = str(hex(-13589))
x = htosi(h)
x2 = htosi(h2)
This works for 16-bit signed ints; you can extend it for 32-bit ints. It uses the basic definition of 2's complement signed numbers. Also note that XORing with all ones is the same as a binary negate.
# convert to unsigned
x = int('ffbf', 16)  # example (-65)
# check sign bit
if (x & 0x8000) == 0x8000:
    # if set, invert and add one to get the negative value, then add the negative sign
    x = -((x ^ 0xffff) + 1)
It's a very late answer, but here's a function to do the above. This will extend for whatever length you provide. Credit for portions of this to another SO answer (I lost the link, so please provide it if you find it).
def hex_to_signed(source):
    """Convert a string hex value to a signed integer.

    This assumes that source is the proper length, and the sign bit
    is the first bit in the first byte of the correct length.

    hex_to_signed("F") should return -1.
    hex_to_signed("0F") should return 15.
    """
    if not isinstance(source, str):
        raise ValueError("string type required")
    if 0 == len(source):
        raise ValueError("string is empty")
    sign_bit_mask = 1 << (len(source)*4-1)
    other_bits_mask = sign_bit_mask - 1
    value = int(source, 16)
    return -(value & sign_bit_mask) | (value & other_bits_mask)
