Python: Convert tuple of hex strings into an integer

I need to convert a tuple of hex strings in little-endian byte order into an integer. How do I do so?
Example:
myTuple = ['0xD4', '0x51', '0x1', '0x0']
I need to convert it into an integer (86484).

Just convert each hex string into an int with int(hex_string, 16) and use the struct module to merge the 4 bytes into one unsigned int:
import struct
myTuple = ['0xD4', '0x51', '0x1', '0x0']
myResult = struct.unpack("<I", bytearray((int(x, 16) for x in myTuple)))
print(myResult[0])
< is the byte order (little-endian) and I is an unsigned int (4 bytes)
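For illustration, a small sketch (variable names are my own) showing that unpack returns a tuple and how the byte-order character changes the result:
import struct

data = bytearray([0xD4, 0x51, 0x01, 0x00])
print(struct.unpack("<I", data)[0])   # 86484       -> bytes read little-endian
print(struct.unpack(">I", data)[0])   # 3562078464  -> same bytes read big-endian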

On Python 3, using int.from_bytes:
int.from_bytes(bytearray((int(x, 16) for x in myTuple)), byteorder='little')
# 86484
Or explicitly summing up each value after shifting it
sum(int(e, 16) << (i * 8) for i,e in enumerate(myTuple))
# 86484
Or using reduce
from functools import reduce  # needed on Python 3; reduce is a builtin on Python 2
reduce(lambda x, y: (x<<8) + int(y,16), [0]+myTuple[::-1])
# 86484

Related

Converting struct format string to range of allowable int values

The Python struct library has a set of format characters, each corresponding to a C type ("h": int16, "H": uint16).
Is there a simple way to go from a format string (e.g. "h", "H", etc.) to the range of possible values (e.g. -32768 to 32767, 0 to 65535, etc.)?
I see the struct library provides calcsize, but what I really want is something like calcrange.
Is there a built-in solution, or an elegant solution I am neglecting? I am also open to third party libraries.
I have made a DIY calcrange below, but it only covers a limited number of possible format strings and makes some non-generalizable assumptions.
from struct import calcsize
from typing import Tuple

def calcrange(fmt: str) -> Tuple[int, int]:
    """Calculate the min and max possible value of a given struct format string."""
    size: int = calcsize(fmt)
    unsigned_max = int("0x" + "FF" * size, 16)
    if fmt.islower():
        # Signed case
        min_ = -1 * int("0x80" + "00" * (calcsize(fmt) - 1), 16)
        return min_, unsigned_max + min_
    # Unsigned case
    return 0, unsigned_max
The math can be simplified. If b is the bit width, then unsigned values run from 0 to 2^b - 1 and signed values from -2^(b-1) to 2^(b-1) - 1. This only works for the integer format codes.
Here's the simplified version:
import struct

def calcrange(intcode):
    b = struct.calcsize(intcode) * 8
    if intcode.islower():
        return -2**(b - 1), 2**(b - 1) - 1
    else:
        return 0, 2**b - 1

for code in 'bBhHiIlLqQnN':
    s, e = calcrange(code)
    print(f'{code} {s:26,} to {e:26,}')
Output:
b -128 to 127
B 0 to 255
h -32,768 to 32,767
H 0 to 65,535
i -2,147,483,648 to 2,147,483,647
I 0 to 4,294,967,295
l -2,147,483,648 to 2,147,483,647
L 0 to 4,294,967,295
q -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807
Q 0 to 18,446,744,073,709,551,615
n -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807
N 0 to 18,446,744,073,709,551,615
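Note that 'l', 'L', 'n', and 'N' use native sizes, so their ranges depend on the platform; with a standard-size prefix such as '=' or '<', 'l' and 'L' are fixed at 4 bytes ('n' and 'N' are only available in native mode). A quick check:
import struct

print(struct.calcsize('l'))    # native size: platform-dependent (4 or 8)
print(struct.calcsize('=l'))   # standard size: always 4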

Create SHA1 hash of int32 list with elements bigger than 255

I have an int32 list of size 10 with elements that are bigger than 255. E.g. lst = [21443324, 435654454, 3242234, ..., 434343623]. How can I get the SHA1 hash of that list?
hash1 = hashlib.sha1(bytearray(lst))
obviously does not work, since the values of lst would need to be in the range 0 to 255.
Use struct.pack. For big-endian int32, use:
endianness = '>'
b = struct.pack(f'{endianness}{len(lst)}I', *lst)
hash1 = hashlib.sha1(b)
For little-endian int32, use endianness = '<'.
If you want to use the lowest byte only of each int32, use
hash1 = hashlib.sha1(bytes(x & 0xff for x in lst))
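For example, end to end with the big-endian variant (the sample values here are just illustrative):
import hashlib
import struct

lst = [21443324, 435654454, 3242234, 434343623]
b = struct.pack(f'>{len(lst)}I', *lst)
print(hashlib.sha1(b).hexdigest())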
I hope this helps (note that this hashes the textual representation of the list, not the numeric values):
hash1 = hashlib.sha1(bytearray(b"[21443324, 435654454, 3242234, 52158998, 54587878, 12123623, 54215525, 434343623, 54888787, 65966427]"))
Output (the HASH object itself; call hash1.hexdigest() to get the digest):
<sha1 _hashlib.HASH object @ 0x0000023A66AD6810>
How about this:
def hash_func(r):
    p = str(r)
    return hash(p)

hash_func([21443324, 435654454, 3242234, 52158998, 54587878, 12123623, 54215525, 434343623, 54888787, 65966427])
Output:
870403843331539247
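Note that the built-in hash() is not SHA-1 and, for strings, is randomized per interpreter process by default, so the number above will differ between runs. If a stable digest of the textual representation is acceptable, hashlib can be used on the encoded string (a sketch):
import hashlib

lst = [21443324, 435654454, 3242234, 434343623]
print(hashlib.sha1(str(lst).encode()).hexdigest())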

Calculate CRC-CCITT (0xFFFF) for HEX string

I'm trying to calculate CRC-CCITT (0xFFFF) for a hex string and get the result back as a hex string. I tried binascii and crc16 but I get int values, and when I convert them to hex it's not the value I expected. I need this:
hex_string = "AA01"
crc_string = crccitt(hex_string)
print("CRC: ", crc_string)
>>> CRC: FF9B
You can use str.format / format to convert the int value to hexadecimal (the crc16 package is used here to compute the CRC):
>>> import binascii
>>> import crc16
>>> hex_string = 'AA01'
>>> crc = crc16.crc16xmodem(binascii.unhexlify(hex_string), 0xffff)
>>> '{:04X}'.format(crc & 0xffff)
'FF9B'
>>> format(crc & 0xffff, '04X')
'FF9B'
or using % operator:
>>> '%04X' % (crc & 0xffff)
'FF9B'
import binascii
import crc16
def crccitt(hex_string):
    byte_seq = binascii.unhexlify(hex_string)
    crc = crc16.crc16xmodem(byte_seq, 0xffff)
    return '{:04X}'.format(crc & 0xffff)
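Usage, reproducing the expected output from the question (assuming the crc16 package is installed):
print("CRC:", crccitt("AA01"))
# CRC: FF9B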

Hex string to signed int in Python

How do I convert a hex string to a signed int in Python 3?
The best I can come up with is
h = '9DA92DAB'
b = bytes(h, 'utf-8')
ba = binascii.a2b_hex(b)
print(int.from_bytes(ba, byteorder='big', signed=True))
Is there a simpler way? Unsigned is so much easier: int(h, 16)
BTW, the origin of the question is itunes persistent id - music library xml version and iTunes hex version
In n-bit two's complement, the bits have these values:
bit 0 = 2^0
bit 1 = 2^1
...
bit n-2 = 2^(n-2)
bit n-1 = -2^(n-1)
But bit n-1 has value 2^(n-1) when read as unsigned, so the number comes out 2^n too high. Subtract 2^n if bit n-1 is set:
def twos_complement(hexstr, bits):
    value = int(hexstr, 16)
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value

print(twos_complement('FFFE', 16))
print(twos_complement('7FFF', 16))
print(twos_complement('7F', 8))
print(twos_complement('FF', 8))
Output:
-2
32767
127
-1
import struct
For Python 3 (with comments' help):
h = '9DA92DAB'
struct.unpack('>i', bytes.fromhex(h))
For Python 2:
h = '9DA92DAB'
struct.unpack('>i', h.decode('hex'))
or if it is little endian:
h = '9DA92DAB'
struct.unpack('<i', h.decode('hex'))
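On Python 3, where str has no .decode method, the little-endian version would be (a sketch; unpack returns a tuple, hence the [0]):
import struct

h = '9DA92DAB'
print(struct.unpack('<i', bytes.fromhex(h))[0])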
Here's a general function you can use for hex of any size:
import math

# hex string to signed integer
def htosi(val):
    # assumes val includes the '0x' prefix
    uintval = int(val, 16)
    bits = 4 * (len(val) - 2)
    if uintval >= math.pow(2, bits - 1):
        uintval = int(0 - (math.pow(2, bits) - uintval))
    return uintval
And to use it:
h = str(hex(-5))
h2 = str(hex(-13589))
x = htosi(h)
x2 = htosi(h2)
This works for 16-bit signed ints; you can extend it to 32-bit ints (see the sketch after the snippet below). It uses the basic definition of two's complement signed numbers. Also note that XOR with all ones (0xffff here) is the same as a bitwise negation.
# convert to unsigned
x = int('ffbf', 16)  # example (-65)
# check sign bit
if (x & 0x8000) == 0x8000:
    # if set, invert and add one to get the negative value, then add the negative sign
    x = -((x ^ 0xffff) + 1)
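A sketch of the same idea extended to 32-bit values, with the mask and sign bit widened accordingly:
# convert to unsigned
x = int('ffffffbf', 16)  # example (-65 as a 32-bit value)
# check sign bit
if (x & 0x80000000) == 0x80000000:
    # if set, invert and add one, then apply the negative sign
    x = -((x ^ 0xffffffff) + 1)
print(x)  # -65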
It's a very late answer, but here's a function to do the above. This will extend for whatever length you provide. Credit for portions of this to another SO answer (I lost the link, so please provide it if you find it).
def hex_to_signed(source):
    """Convert a hex string to a signed integer value.

    This assumes that source is the proper length, and the sign bit
    is the first bit in the first byte of the correct length.

    hex_to_signed("F") should return -1.
    hex_to_signed("0F") should return 15.
    """
    if not isinstance(source, str):
        raise ValueError("string type required")
    if 0 == len(source):
        raise ValueError("string is empty")
    sign_bit_mask = 1 << (len(source) * 4 - 1)
    other_bits_mask = sign_bit_mask - 1
    value = int(source, 16)
    return -(value & sign_bit_mask) | (value & other_bits_mask)
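Usage, matching the examples in the docstring:
print(hex_to_signed("F"))    # -1
print(hex_to_signed("0F"))   # 15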

Is there a faster way to convert an arbitrary large integer to a big endian sequence of bytes?

I have this Python code to do this:
from struct import pack as _pack

def packl(lnum, pad = 1):
    if lnum < 0:
        raise RangeError("Cannot use packl to convert a negative integer "
                         "to a string.")
    count = 0
    l = []
    while lnum > 0:
        l.append(lnum & 0xffffffffffffffffL)
        count += 1
        lnum >>= 64
    if count <= 0:
        return '\0' * pad
    elif pad >= 8:
        lens = 8 * count % pad
        pad = ((lens != 0) and (pad - lens)) or 0
        l.append('>' + 'x' * pad + 'Q' * count)
        l.reverse()
        return _pack(*l)
    else:
        l.append('>' + 'Q' * count)
        l.reverse()
        s = _pack(*l).lstrip('\0')
        lens = len(s)
        if (lens % pad) != 0:
            return '\0' * (pad - lens % pad) + s
        else:
            return s
This takes approximately 174 usec to convert 2**9700 - 1 to a string of bytes on my machine. If I'm willing to use the Python 2.7 and Python 3.x specific bit_length method, I can shorten that to 159 usecs by pre-allocating the l array to be the exact right size at the very beginning and using l[something] = syntax instead of l.append.
Is there anything I can do that will make this faster? This will be used to convert large prime numbers used in cryptography as well as some (but not many) smaller numbers.
Edit
This is currently the fastest option in Python < 3.2; it takes about half the time in either direction compared to the accepted answer:
import binascii

def packl(lnum, padmultiple=1):
    """Packs the lnum (which must be convertable to a long) into a
    byte string 0 padded to a multiple of padmultiple bytes in size. 0
    means no padding whatsoever, so that packing 0 results in an empty
    string. The resulting byte string is the big-endian two's
    complement representation of the passed in long."""
    if lnum == 0:
        return b'\0' * padmultiple
    elif lnum < 0:
        raise ValueError("Can only convert non-negative numbers.")
    s = hex(lnum)[2:]
    s = s.rstrip('L')
    if len(s) & 1:
        s = '0' + s
    s = binascii.unhexlify(s)
    if (padmultiple != 1) and (padmultiple != 0):
        filled_so_far = len(s) % padmultiple
        if filled_so_far != 0:
            s = b'\0' * (padmultiple - filled_so_far) + s
    return s

def unpackl(bytestr):
    """Treats a byte string as a sequence of base 256 digits
    representing an unsigned integer in big-endian format and converts
    that representation into a Python integer."""
    return int(binascii.hexlify(bytestr), 16) if len(bytestr) > 0 else 0
In Python 3.2 the int class gained to_bytes and from_bytes functions that can accomplish this much more quickly than the method given above.
Here is a solution calling the Python/C API via ctypes. Currently, it uses NumPy, but if NumPy is not an option, it could be done purely with ctypes.
import numpy
import ctypes
PyLong_AsByteArray = ctypes.pythonapi._PyLong_AsByteArray
PyLong_AsByteArray.argtypes = [ctypes.py_object,
                               numpy.ctypeslib.ndpointer(numpy.uint8),
                               ctypes.c_size_t,
                               ctypes.c_int,
                               ctypes.c_int]

def packl_ctypes_numpy(lnum):
    a = numpy.zeros(lnum.bit_length() // 8 + 1, dtype=numpy.uint8)
    PyLong_AsByteArray(lnum, a, a.size, 0, 1)
    return a
On my machine, this is 15 times faster than your approach.
Edit: Here is the same code using ctypes only and returning a string instead of a NumPy array:
import ctypes
PyLong_AsByteArray = ctypes.pythonapi._PyLong_AsByteArray
PyLong_AsByteArray.argtypes = [ctypes.py_object,
                               ctypes.c_char_p,
                               ctypes.c_size_t,
                               ctypes.c_int,
                               ctypes.c_int]

def packl_ctypes(lnum):
    a = ctypes.create_string_buffer(lnum.bit_length() // 8 + 1)
    PyLong_AsByteArray(lnum, a, len(a), 0, 1)
    return a.raw
This is another two times faster, totalling to a speed-up factor of 30 on my machine.
For completeness and for future readers of this question:
Starting in Python 3.2, there are functions int.from_bytes() and int.to_bytes() that perform the conversion between bytes and int objects in a choice of byte orders.
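For example, a minimal sketch of that approach for the non-negative case:
n = 2**9700 - 1
b = n.to_bytes((n.bit_length() + 7) // 8, byteorder='big')
assert int.from_bytes(b, byteorder='big') == n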
I suppose you really should just be using numpy, which I'm sure has something or other built in for this. It might also be faster to hack around with the array module. But I'll take a stab at it anyway.
IMX, creating a generator and using a list comprehension and/or built-in summation is faster than a loop that appends to a list, because the appending can be done internally. Oh, and 'lstrip' on a large string has got to be costly.
Also, some style points: special cases aren't special enough; and you appear not to have gotten the memo about the new x if y else z construct. :) Although we don't need it anyway. ;)
from struct import pack as _pack

Q_size = 64
Q_bitmask = (1L << Q_size) - 1L

def quads_gen(a_long):
    while a_long:
        yield a_long & Q_bitmask
        a_long >>= Q_size

def pack_long_big_endian(a_long, pad = 1):
    if a_long < 0:
        raise ValueError("Cannot use packl to convert a negative integer "
                         "to a string.")
    # reversed() needs a sequence, so materialize the generator first
    qs = list(quads_gen(a_long))[::-1]
    # Pack the first one separately so we can lstrip nicely.
    first = _pack('>Q', qs[0]).lstrip('\x00')
    rest = _pack('>%sQ' % (len(qs) - 1), *qs[1:])
    count = len(first) + len(rest)
    # A little math trick that depends on Python's behaviour of modulus
    # for negative numbers - but it's well-defined and documented
    return '\x00' * (-count % pad) + first + rest
Just wanted to post a follow-up to Sven's answer (which works great). The opposite operation, going from an arbitrarily long bytes object to a Python integer object, requires the following (because there is no PyLong_FromByteArray() C API function that I can find):
import binascii

def unpack_bytes(stringbytes):
    # binascii.hexlify will be obsolete in python3 soon
    # They will add a .tohex() method to bytes class
    # Issue 3532 bugs.python.org
    return int(binascii.hexlify(stringbytes), 16)
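A quick round-trip check against the packl from the edit above (my own sanity test, assuming both functions are defined):
n = 2**9700 - 1
assert unpack_bytes(packl(n)) == n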
