How to get the raw bytes from a hex string in python

I have the following problem in Python.
I have the value 0x402d4a in hex and would like to convert it to bytes, so I use .to_bytes(3, 'little'), which gives me b'J-@' when I print it. I am aware that this is just a representation of the bytes, but I later need a string for the output. Using str() gives me b'J-@', but I need \x4a\x2d\x40. How can I convert the bytes object to a string so that I get the raw binary data as a string?
My code is as follows:
addr = "0x402d4a"  # hex string; int(addr, 16) needs a str, not an int
addr = int(addr,16)
addr = str(addr.to_bytes(3,'little'))
print(addr)
and my expected output is
\x4a\x2d\x40
Thanks in advance

There is no direct way to get \x4a\x2d and so forth from a string, or from bytes for that matter.
What you should do:
1. Convert the int to bytes (you've done this, good).
2. Loop over the bytes and use an f-string to format each one as hexadecimal with the "\\x" prefix.
3. join() them.
Steps 2 and 3 can be folded neatly into one generator expression, e.g.:
rslt = "".join(
f"\\x{b:02x}" for b in value_as_bytes
)
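Putting the three steps together, a minimal end-to-end sketch might look like this. It assumes the address arrives as a hex string (as the question title suggests); the name hex_string is made up for illustration.

```python
# Assumption: the input is a hex string such as "0x402d4a".
hex_string = "0x402d4a"

# Step 1: parse the string and convert the int to 3 little-endian bytes.
value = int(hex_string, 16)
value_as_bytes = value.to_bytes(3, "little")

# Steps 2 and 3: format each byte as \xNN and join the pieces.
rslt = "".join(f"\\x{b:02x}" for b in value_as_bytes)

print(rslt)  # \x4a\x2d\x40
```

Note that rslt is a 12-character text string containing literal backslashes, not the three raw bytes themselves.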

Related

Convert hex to decimal/string in python

I wrote this small socket program to send a UDP packet and receive the response:
sock.sendto(data, (MCAST_GRP, MCAST_PORT))
msgFromServer = sock.recvfrom(1024)
banner=msgFromServer[0]
print(msgFromServer[0])
#name = msgFromServer[0].decode('ascii', 'ignore')
#print(name)
Response is
b'\xff\xff\xff\xffI\x11server banner\x00map\x00game\x00Counter-Strike: Global Offensive\x00\xda\x02\x00\x10\x00dl\x01\x011.38.2.2\x00\xa1\x87iempty,secure\x00\xda\x02\x00\x00\x00\x00\x00\x00'
Now the thing is, I want to convert all the hex values to decimal.
I tried decode, but then I end up losing all the hex values.
How can I convert all the hex values to decimal in my case?
Example: \x13 = 19
EDIT: I guess a better way to phrase my question is:
How do I convert only the hex values to decimal in the given response?

There are two problems here:
handling the non-ASCII bytes
handling \xhh sequences which are legitimate characters in Python strings
We can address both with a mix of regular expressions and string methods.
First, decode the bytes to ASCII using the backslashreplace error handler to avoid losing the non-ASCII bytes.
>>> import re
>>>
>>> decoded = msgFromServer[0].decode('ascii', errors='backslashreplace')
>>> decoded
'\\xff\\xff\\xff\\xffI\x11server banner\x00map\x00game\x00Counter-Strike: Global Offensive\x00\\xda\x02\x00\x10\x00dl\x01\x011.38.2.2\x00\\xa1\\x87iempty,secure\x00\\xda\x02\x00\x00\x00\x00\x00\x00'
Next, use a regular expression to replace the non-ASCII '\\xhh' sequences with their numeric equivalents:
>>> temp = re.sub(r'\\x([a-fA-F0-9]{2})', lambda m: str(int(m.group(1), 16)), decoded)
>>> temp
'255255255255I\x11server banner\x00map\x00game\x00Counter-Strike: Global Offensive\x00218\x02\x00\x10\x00dl\x01\x011.38.2.2\x00161135iempty,secure\x00218\x02\x00\x00\x00\x00\x00\x00'
Finally, map the remaining control characters (code points below 32, which Python displays as \xhh escapes) to their decimal values using str.translate:
>>> tt = str.maketrans({x: str(x) for x in range(32)})
>>> final = temp.translate(tt)
>>> final
'255255255255I17server banner0map0game0Counter-Strike: Global Offensive021820160dl111.38.2.20161135iempty,secure02182000000'
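For comparison, the same kind of result can probably be reached in a single pass over the bytes, without regular expressions. This is only a sketch: the function name bytes_to_decimal_text is made up, and it assumes that bytes 32 to 127 should stay as characters while everything else becomes a decimal number, which mirrors what the steps above produce.

```python
def bytes_to_decimal_text(data: bytes) -> str:
    """Keep bytes 32..127 as characters; render all other bytes in decimal."""
    # Iterating over a bytes object yields ints in range(256).
    return "".join(chr(b) if 32 <= b <= 127 else str(b) for b in data)


# Short sample modelled on the response in the question.
sample = b"\xffI\x11server\x00\xda\x02"
print(bytes_to_decimal_text(sample))  # 255I17server02182
```

As in the answer above, the decimal values run together with the surrounding text, so the output cannot be converted back to the original bytes.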

You can first convert the bytes to a hex string using the bytes.hex method and then convert that to an integer with the appropriate base using int(x, base):
>>> b'\x13'.hex()
'13'
>>> int(b'\x13'.hex(), 16)
19

Assuming v contains the response, what you are asking for is:
[int(i) for i in v]
I suspect that's not what you want, but it is how I read the question.
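To illustrate why that works: in Python 3, iterating over (or indexing) a bytes object already yields integers, so no explicit hex conversion is needed. A short sketch, with v standing in for a shortened response:

```python
# v is a shortened stand-in for the response bytes.
v = b"\xff\xffI\x11"

# Each element of a bytes object is an int in range(256).
decimals = list(v)

print(decimals)  # [255, 255, 73, 17]
print(v[3])      # 17
```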

Converting Byte to String and Back Properly in Python3?

Given random bytes (i.e. not only numbers/characters!), I need to convert them to a string and then back to the initial bytes without losing information. This seems like a basic task, but I ran into the following problems:
Assuming:
rnd_bytes = b'w\x12\x96\xb8'
len(rnd_bytes)
prints: 4
Now, converting it to a string. Note: I need to set backslashreplace, as it otherwise raises a UnicodeDecodeError or would lose information with another error handler.
my_str = rnd_bytes.decode('utf-8' , 'backslashreplace')
Now I have the string.
I want to convert it back to exactly the original bytes (size 4!).
According to Python resources and this answer, there are different possibilities:
conv_bytes = bytes(my_str, 'utf-8')
conv_bytes = my_str.encode('utf-8')
But len(conv_bytes) returns 10.
I tried to analyse the outcome:
>>> repr(rnd_bytes)
"b'w\\x12\\x96\\xb8'"
>>> repr(my_str)
"'w\\x12\\\\x96\\\\xb8'"
>>> repr(conv_bytes)
"b'w\\x12\\\\x96\\\\xb8'"
It would make sense to replace '\\\\'. But my_str.replace('\\\\','\\') doesn't change anything, probably because four backslashes represent only two. So my_str.replace('\\','\') would find the '\\\\', but leads to
SyntaxError: EOL while scanning string literal
due to the last argument '\'. This has been discussed here, where the following suggestion came up:
>>> my_str2=my_str.encode('utf_8').decode('unicode_escape')
>>> repr(my_str2)
"'w\\x12\\x96¸'"
This replaces the '\\\\' but seems to add / change some other characters:
>>> conv_bytes2 = my_str2.encode('utf8')
>>> len(conv_bytes2)
6
>>> repr(conv_bytes2)
"b'w\\x12\\xc2\\x96\\xc2\\xb8'"
There must be a proper way to convert (arbitrary) bytes to a string and back. How can I achieve that?

Note: some of this code was found on the Internet.
You could try converting it to hex format. Then it is easy to convert it back to bytes.
Sample code to convert bytes to a string:
hex_str = rnd_bytes.hex()
Here is what hex_str looks like:
'771296b8'
And code for converting it back to bytes:
new_rnd_bytes = bytes.fromhex(hex_str)
The result is:
b'w\x12\x96\xb8'
For processing you can use:
readable_str = ''.join(chr(int(hex_str[i:i+2], 16)) for i in range(0, len(hex_str), 2))
But never try to encode the readable string. Here is what the readable string looks like:
'w\x12\x96¸'
After processing the readable string, convert it back to hex format before converting it back to bytes:
hex_str = ''.join(f'{ord(c):02x}' for c in readable_str)  # two hex digits per character
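The hex round trip described above can be checked with a few lines. This is a minimal sketch using only bytes.hex() and bytes.fromhex(); the variable names restored and small are illustrative.

```python
rnd_bytes = b"w\x12\x96\xb8"

# bytes -> str: every byte becomes exactly two hex digits.
hex_str = rnd_bytes.hex()
print(hex_str)  # 771296b8

# str -> bytes: the inverse operation.
restored = bytes.fromhex(hex_str)
print(restored == rnd_bytes)  # True

# Small byte values keep their leading zero, so the length is always even.
small = bytes([0, 5, 255]).hex()
print(small)  # 0005ff
```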

Quoting the question: "Now, converting it to a string. Note: I need to set backslashreplace, as it otherwise raises a UnicodeDecodeError or would lose information with another error handler."
The UTF-8 encoding cannot interpret every possible sequence of bytes as a string. Using backslashreplace gives you a string that preserves the information for bytes that couldn't be converted:
>>> rnd_bytes = b'w\x12\x96\xb8'
>>> rnd_bytes.decode('utf-8', 'backslashreplace')
'w\x12\\x96\\xb8'
but that representation is not very useful for converting back.
Instead, use an encoding that does interpret every possible sequence of bytes as a string. The most straightforward of these is ISO-8859-1, which simply maps each byte one at a time to the first 256 Unicode code points respectively.
>>> rnd_bytes.decode('iso-8859-1')
'w\x12\x96¸'
>>> rnd_bytes.decode('iso-8859-1').encode('iso-8859-1') == rnd_bytes
True
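As a quick sanity check of that claim, the sketch below round-trips all 256 possible byte values through ISO-8859-1. The variable names are illustrative.

```python
# Every possible byte value, 0x00 through 0xff.
all_bytes = bytes(range(256))

# ISO-8859-1 maps byte n to Unicode code point n, so decoding never fails.
as_text = all_bytes.decode("iso-8859-1")
round_tripped = as_text.encode("iso-8859-1")

print(len(as_text))                # 256
print(round_tripped == all_bytes)  # True
```

The resulting string must be encoded with the same codec again; encoding it as UTF-8 would produce different (longer) bytes, as seen in the question.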

Python: Converting HEX string to bytes

I'm trying to make a byte frame which I will send via UDP. I have a class Frame which has the attributes sync, frameSize, data, checksum etc. I'm using hex strings to represent the values, like this:
testFrame = Frame("AA01","0034","44853600","D43F")
Now I need to concatenate these hex values and convert them to a byte array, like this:
from base64 import b16decode

def convertToBits(self):
    stringMessage = self.sync + self.frameSize + self.data + self.chk
    return b16decode(stringMessage)
But when I print the converted value, I don't get the same values, or I don't know how to read the Python notation correctly:
This is sync: AA01
This is frame size: 0034
This is data:44853600
This is checksum: D43F
b'\xaa\x01\x004D\x856\x00\xd4?'
So the first word is converted correctly (AA01 -> \xaa\x01), but the second (0034 -> \x004D) is not the same. I tried bytearray.fromhex because it allows spaces between bytes, but I got the same result. Can you help me send the same hex words via UDP?

Python displays any byte that can represent a printable ASCII character as that character. 4 is the same byte as \x34, but Python opted to print the ASCII character in the representation.
So \x004 is really the same as \x00\x34, D\x856\x00 is the same as \x44\x85\x36\x00, and \xd4? is the same as \xd4\x3f, because:
>>> b'\x34'
b'4'
>>> b'\x44'
b'D'
>>> b'\x36'
b'6'
>>> b'\x3f'
b'?'
This is just the representation of the bytes value; the value is entirely correct and you don't need to do anything else.
If it helps, you can visualise the bytes values as hex again using binascii.hexlify():
>>> import binascii
>>> binascii.hexlify(b'\xaa\x01\x004D\x856\x00\xd4?')
b'aa01003444853600d43f'
and you'll see that 4, D, 6 and ? are once again represented by the correct hexadecimal characters.
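To make this concrete, here is a small sketch that builds the frame from the hex words in the question and shows that the bytes are exactly the ones intended. It uses bytes.fromhex() instead of b16decode purely for illustration; the variable name frame is made up.

```python
# The four hex words from the question, concatenated.
frame = bytes.fromhex("AA01" + "0034" + "44853600" + "D43F")

# The repr mixes \xNN escapes with printable ASCII characters.
print(frame)  # b'\xaa\x01\x004D\x856\x00\xd4?'

# Viewing the same value as hex, or as integers, shows nothing was lost.
print(frame.hex())  # aa01003444853600d43f
print(len(frame))   # 10
```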

How to get escaped hex value unescaped in Python

I implemented a simple file seek and read in Python:
>>> f = open("<filepath>", "rb")
>>> f.seek(0x20)  # offset 0x20
>>> byte = f.read(4)  # 4 byte space
I ended up with
>>> byte
'\xe0\x00\x00\x00'
which is the expected result, but I need to use it as a hex value without escapes for further calculations.
How can I convert such a string into an unescaped hex value? (In the above example '\xe0\x00\x00\x00' should transform into 'e0000000' or '0xe0000000'.)

Use encode('hex') (this works on Python 2 only; on Python 3.5+ the equivalent is byte.hex()):
>>> byte.encode('hex')
'e0000000'
# convert it to int
>>> int(byte.encode('hex'), 16)
3758096384

You could use byte.encode('hex') to get the hex value.

You could use the struct module to unpack the bytes into a number and then format that number the way you wish.
import struct
print('{:08x}'.format(struct.unpack('>I', byte)[0]))
Output:
e0000000
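The answers above target Python 2, where f.read() returns a str. On Python 3 the read returns a bytes object, and a rough equivalent could look like the following sketch. The literal value of byte is assumed here in place of the file read.

```python
import struct

# Assumed stand-in for f.read(4) on a file opened in "rb" mode.
byte = b"\xe0\x00\x00\x00"

# Hex text without escapes.
hex_text = byte.hex()
print(hex_text)  # e0000000

# Integer value, interpreting the bytes as big-endian.
as_int = int.from_bytes(byte, "big")
print(as_int)  # 3758096384

# The struct approach gives the same number.
(unpacked,) = struct.unpack(">I", byte)
print(f"0x{unpacked:08x}")  # 0xe0000000
```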

passing large number of arguments to struct.pack

I am using the struct.pack method, which takes a variable number of arguments. I want to convert a string to bytes. If the string is short (e.g. 'name') I can do it like this:
bytes = struct.pack('4c','n','a','m','e')
But what to do when my string is 80 characters long?
I have tried the format string 's', instead of '80c' for struct.pack, but the result is not the same as that of above call.

Use "80s", not just "s". The input is a single string, rather than a series of characters. i.e.
bytes = struct.pack('4s','name')
Note that if you specify a length greater than that of the input, the output will be null-padded.

That doesn't make much sense. Strings are already bytes in Python 2.x, so you could just do:
my_string = 'I am some big string'
my_bytes = my_string
On Python 3, strings are Unicode objects by default. To get bytes you have to encode the string.
my_bytes = my_string.encode('utf-8')
If you really want to use struct.pack, you'd use the * syntax as described in the tutorial:
my_bytes = struct.pack('20c', *my_string)
or
my_bytes = struct.pack('20s', my_string)
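The snippets above are written for Python 2. On Python 3, struct.pack expects bytes for both the 's' and 'c' formats, so a sketch of the same idea might look like this (the variable names are illustrative):

```python
import struct

# Python 3: encode the text first, because 's' and 'c' take bytes, not str.
name = "name".encode("ascii")

# One argument with the 's' format ...
packed_s = struct.pack("4s", name)

# ... or one single-byte argument per 'c', unpacked with the * syntax.
packed_c = struct.pack("4c", *(bytes([b]) for b in name))

print(packed_s == packed_c)  # True

# A count larger than the input null-pads the result.
padded = struct.pack("8s", name)
print(padded)  # b'name\x00\x00\x00\x00'
```

For an 80-character string the 's' form scales without change: struct.pack("80s", text.encode("ascii")).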
