How do I convert a string of escaped hex characters to a single hex number?
Reading from a socket I get a string of \xFF\xFF\xFF.., etc. I want to convert this to a hex number, 0xFFFFFF, keeping any insignificant 0s, so \x00\xFF should be 0x00FF. I have tried various functions from binascii, but I have not had any luck.
Using struct.unpack:
>>> import struct
>>> struct.unpack('>I', '\xFF\xFF\xFF\xFF') # >, !: big (network) endian
(4294967295,)
>>> hex(struct.unpack('>I', '\xFF\xFF\xFF\xFF')[0])
'0xffffffff'
>>> struct.unpack('>H', '\x00\xff')
(255,)
>>> '0x{:04x}'.format(struct.unpack('>H', '\x00\xff')[0])
'0x00ff'
>>> '0x{:04X}'.format(struct.unpack('>H', '\x00\xff')[0])
'0x00FF'
Format characters used:
I: 4-byte unsigned int
H: 2-byte unsigned int
UPDATE
If you intend to convert an arbitrary binary string into a hex string, you can use binascii.hexlify:
>>> import binascii
>>> '0x' + binascii.hexlify('\xFF\xFF\xFF')
'0xffffff'
>>> '0x' + binascii.hexlify('\x00\x00\xFF')
'0x0000ff'
Related
I wrote this small socket program to send a UDP packet and receive the response:
sock.sendto(data, (MCAST_GRP, MCAST_PORT))
msgFromServer = sock.recvfrom(1024)
banner=msgFromServer[0]
print(msgFromServer[0])
#name = msgFromServer[0].decode('ascii', 'ignore')
#print(name)
Response is
b'\xff\xff\xff\xffI\x11server banner\x00map\x00game\x00Counter-Strike: Global Offensive\x00\xda\x02\x00\x10\x00dl\x01\x011.38.2.2\x00\xa1\x87iempty,secure\x00\xda\x02\x00\x00\x00\x00\x00\x00'
Now I want to convert all the hex values to decimal.
I tried decode, but then I end up losing all the hex values.
How can I convert all the hex values to decimal in my case?
Example: \x13 = 19
EDIT: I guess a better way to phrase my question is:
How do I convert only the hex values to decimal in the given response?
There are two problems here:
handling the non-ASCII bytes
handling \xhh sequences which are legitimate characters in Python strings
We can address both with a mix of regular expressions and string methods.
First, decode the bytes to ASCII using the backslashreplace error handler to avoid losing the non-ASCII bytes.
>>> import re
>>>
>>> decoded = msgFromServer[0].decode('ascii', errors='backslashreplace')
>>> decoded
'\\xff\\xff\\xff\\xffI\x11server banner\x00map\x00game\x00Counter-Strike: Global Offensive\x00\\xda\x02\x00\x10\x00dl\x01\x011.38.2.2\x00\\xa1\\x87iempty,secure\x00\\xda\x02\x00\x00\x00\x00\x00\x00'
Next, use a regular expression to replace the non-ASCII '\\xhh' sequences with their numeric equivalents:
>>> temp = re.sub(r'\\x([a-fA-F0-9]{2})', lambda m: str(int(m.group(1), 16)), decoded)
>>> temp
'255255255255I\x11server banner\x00map\x00game\x00Counter-Strike: Global Offensive\x00218\x02\x00\x10\x00dl\x01\x011.38.2.2\x00161135iempty,secure\x00218\x02\x00\x00\x00\x00\x00\x00'
Finally, map the remaining control characters (code points 0 to 31, which Python displays as \xhh escapes) to their decimal values using str.translate:
>>> tt = str.maketrans({x: str(x) for x in range(32)})
>>> final = temp.translate(tt)
>>> final
'255255255255I17server banner0map0game0Counter-Strike: Global Offensive021820160dl111.38.2.20161135iempty,secure02182000000'
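The three steps above can also be collapsed into a single pass over the raw bytes, since iterating over a bytes object in Python 3 yields integers. This is only a sketch of that idea, not the method used above; the function name is hypothetical, and it keeps bytes 32 to 127 as characters to match the output of the regex and translate steps:

```python
def control_and_high_bytes_to_decimal(data):
    """Keep bytes 32..127 as characters, render every other byte as decimal."""
    parts = []
    for value in data:  # each item is an int in the range 0..255
        if 32 <= value <= 127:
            parts.append(chr(value))
        else:
            parts.append(str(value))
    return ''.join(parts)


sample = b'\xff\xffI\x11server banner\x00map\x00'
print(control_and_high_bytes_to_decimal(sample))  # 255255I17server banner0map0
```

As in the output above, the decimal digits run together with the surrounding text, so the result is suitable for display but cannot be parsed back unambiguously.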
You can first convert the bytes to a hex string using the bytes.hex method and then parse that string as an integer in the appropriate base with int(x, base):
>>> b'\x13'.hex()
'13'
>>> int(b'\x13'.hex(), 16)
19
Assuming v contains the response, what you are asking for is
[int(i) for i in v]
I suspect it's not what you want, but it is what I read from the question.
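For reference, iterating over a bytes object in Python 3 already yields integers, so list(v) gives the same result as the comprehension. A short sketch with a made-up sample value for v:

```python
v = b'\xff\x13A\x00'  # hypothetical sample; the real response would go here

as_list = list(v)  # each byte becomes an int, printable or not
print(as_list)     # [255, 19, 65, 0]
```

Note that printable bytes such as A are converted too (65 here), which may or may not be what the question intends.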
I have a problem.
I get data like this:
hex_num='0EE6'
data_decode=str(codecs.decode(hex_num, 'hex'))[(0):(80)]
print(data_decode)
>>>b'\x0e\xe6'
And I want to encode it like this:
data_enc=str(codecs.encode(data_decode, 'hex'))[(2):(6)]
print(str(int(data_enc,16)))
>>>TypeError: encoding with 'hex' codec failed (TypeError: a bytes-like object is required, not 'str')
If I write this:
data_enc=str(codecs.encode(b'\x0e\xe6', 'hex'))[(2):(6)]
print(str(int(data_enc,16)))
>>>3814
it returns the number I want (3814).
Please help.
You can remove the quotation marks like this: data = b'\x0e\xe6'
The Python 3 documentation states:
Bytes literals are always prefixed with 'b' or 'B'; they produce an instance of the bytes type instead of the str type. They may only contain ASCII characters; bytes with a numeric value of 128 or greater must be expressed with escapes.
When the b sits inside a string, it no longer acts as a bytes literal prefix, so you have to remove the quotation marks for the literal to work and yield bytes directly.
Corrected code:
import codecs
data = b'\x0e\xe6'
data_enc=str(codecs.encode(data, 'hex'))[(2):(6)]
print(str(int(data_enc,16)))
Output:
3814
To change from a hex string to binary data, binascii.unhexlify is a convenient method, e.g.:
>>> hex_num='0EE6'
>>> import binascii
>>> binascii.unhexlify(hex_num)
b'\x0e\xe6'
Then, to convert the binary data to an integer, int.from_bytes gives you control over the endianness of the data and whether it is signed, e.g.:
>>> bytes_data = b'\x0e\xe6'
>>> int.from_bytes(bytes_data, byteorder='little', signed=False)
58894
>>> int.from_bytes(bytes_data, byteorder='big', signed=False)
3814
I want to convert a decimal integer into a \xLO\xHI hex string while keeping the "\x" prefix and without rendering printable bytes as their ASCII characters.
What I want to achieve:
>>> dec_to_hex(512)
"\x00\x02"
The following solutions I found while searching for an answer aren't good enough, and I'll explain why:
This one doesn't add the "\x" prefix and doesn't translate the number to bytes:
>>> hex(512)
'0x200'
This example is really close, but it takes hexadecimal input (I need decimal) and renders printable bytes as ASCII characters:
>>> from binascii import unhexlify
>>> unhexlify('65004100430005FF70000000')
'e\x00A\x00C\x00\x05\xffp\x00\x00\x00'
This one renders printable bytes as ASCII characters:
>>> import struct
>>> struct.pack('<h', 512)
'\x00\x02'
>>> struct.pack('<h', 97)
'a\x00'
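One possible reading of the request is that the result should be literal text made of backslash-x pairs, not raw bytes (whose repr always shows printable bytes as characters). A hedged sketch under that assumption, written for Python 3 and further assuming the value fits in an unsigned 16-bit little-endian field as in the 512 example:

```python
import struct


def dec_to_hex(value):
    """Format a 16-bit integer as literal '\\xLO\\xHI' text (low byte first)."""
    packed = struct.pack('<H', value)  # 2 bytes, little-endian, unsigned
    # Format every byte explicitly so printable bytes are never shown as chars.
    return ''.join('\\x{:02x}'.format(byte) for byte in packed)


print(dec_to_hex(512))  # \x00\x02
print(dec_to_hex(97))   # \x61\x00
```

The returned string has 8 characters, not 2 bytes, so it is meant for display or logging; struct.pack alone already produces the correct bytes for sending over a wire.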
I have a string of four hex numbers like:
"0x00 0x01 0x13 0x00"
which is equivalent to 0x00011300. How can I convert this hex value to an integer?
Since you have a string of hex values, you can remove ' 0x' from the string to get a single hex number string. For example:
my_str = "0x00 0x01 0x13 0x00"
hex_str = my_str.replace(' 0x', '')
where the value held by hex_str will be:
'0x00011300' # `hex` string
Now you can convert the hex string to an int:
>>> int(hex_str, 16)
70400
>>> s="0x00 0x01 0x13 0x00"
>>> a=0
>>> for b in s.split():
... a = a*256 + int(b,16)
...
>>> a
70400
>>> hex(a)
'0x11300'
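The loop above multiplies by 256 for each token, which is the same as treating the tokens as big-endian bytes. On Python 3 that can be sketched with int.from_bytes, assuming every token is at most 0xFF; it also shows how to keep the leading zeros that hex() drops:

```python
s = "0x00 0x01 0x13 0x00"

# int(part, 16) accepts the '0x' prefix; bytes() requires each value <= 255.
raw = bytes(int(part, 16) for part in s.split())
value = int.from_bytes(raw, byteorder='big')

print(value)                     # 70400
print('0x{:08x}'.format(value))  # 0x00011300
print('0x' + raw.hex())          # 0x00011300
```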
The answer is here: Convert hex string to int in Python.
Without the 0x prefix, you need to specify the base explicitly, otherwise there's no way to tell:
x = int("deadbeef", 16)
With the 0x prefix, Python can distinguish hex and decimal automatically.
print int("0xdeadbeef", 0)
3735928559
print int("10", 0)
10
(You must specify 0 as the base in order to invoke this prefix-guessing behavior; omitting the second parameter means to assume base-10. See the comments for more details.)
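The print statements above are Python 2 syntax. A small Python 3 sketch of the same base-0 behavior, where the wrapper name parse_int_auto is made up for illustration:

```python
def parse_int_auto(text):
    """Parse an integer, letting a 0x, 0o, or 0b prefix select the base."""
    return int(text, 0)


print(parse_int_auto("0xdeadbeef"))  # 3735928559
print(parse_int_auto("10"))          # 10
print(parse_int_auto("0b101"))       # 5

try:
    parse_int_auto("deadbeef")  # no prefix, so base 0 cannot treat it as hex
except ValueError as exc:
    print("rejected:", exc)
```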
Let us use the character Latin Capital Letter A with Ogonek (U+0104) as an example.
I have an int that represents its UTF-8 encoded form:
my_int = 0xC484
# Decimal: `50308`
# Binary: `0b1100010010000100`
If I use the unichr function, I get: \uC484 or 쒄 (U+C484)
But, I need it to output: Ą
How do I convert my_int to a Unicode code point?
To convert the integer 0xC484 to the bytestring '\xc4\x84' (the UTF-8 representation of the Unicode character Ą), you can use struct.pack():
>>> import struct
>>> struct.pack(">H", 0xC484)
'\xc4\x84'
... where > in the format string represents big-endian, and H represents unsigned short int.
Once you have your UTF-8 bytestring, you can decode it to Unicode as usual:
>>> struct.pack(">H", 0xC484).decode("utf8")
u'\u0104'
>>> print struct.pack(">H", 0xC484).decode("utf8")
Ą
>>> int2bytes(0xC484).decode('utf-8')
u'\u0104'
>>> print(_)
Ą
where int2bytes() is defined here.
Encode the number to a hex string, using hex() or %x. Then you can interpret that as a series of hex bytes using the hex decoder. Finally use the utf-8 decoder to get a unicode string:
def weird_utf8_integer_to_unicode(n):
    s = '%x' % n
    if len(s) % 2:
        s = '0' + s
    return s.decode('hex').decode('utf-8')
The len check is in case the first byte is in the range 0x1–0xF, which would leave it missing a leading zero. This should be able to cope with any length string and any character (however, encoding a byte sequence in an integer like this would be unable to preserve leading zero bytes).
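The answers above target Python 2 (unichr, print statements, and str.decode('hex') do not exist in Python 3). A rough Python 3 equivalent can be sketched with int.to_bytes; the function name is hypothetical, and the byte length is derived from the integer's bit length, so leading zero bytes are still lost, as noted above:

```python
def utf8_int_to_str(n):
    """Interpret a non-negative int as big-endian UTF-8 bytes and decode them."""
    length = max(1, (n.bit_length() + 7) // 8)  # bytes needed, at least one
    return n.to_bytes(length, byteorder='big').decode('utf-8')


print(utf8_int_to_str(0xC484))    # Ą
print(utf8_int_to_str(0x41))      # A
print(utf8_int_to_str(0xE282AC))  # €
```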