Can you help me understand packed binary data in Python?

I followed an example to convert 24-bit audio to bytes.
For example:
struct.pack('<i', 4000000)
Gives:
b'\x00\t=\x00'
Can you help me understand the packed binary data
\x00\t=\x00
How do we interpret it?
Many thanks.

We can get the hexadecimal representation of 4000000 like this:
In [14]: x = 4000000
In [15]: print("%08X" % x)
003D0900
The '<' in the format string asks for little-endian byte order, that is, least significant byte first (the same order x86 and x64 machines use in memory), so the packed bytes are
00 09 3D 00
Those translate to the following characters:
In [46]: print(b"%c%c%c%c" % (0x00, 0x09, 0x3D, 0x00))
b'\x00\t=\x00'
So 0x09 is \t (tab) and 0x3D is =. The two 0x00 bytes have no printable character, so Python shows them as \x00.
We can recreate the original value using ord, some bit shifts, and addition:
In [52]: (ord('=') << 16) + (ord('\t') << 8)
Out[52]: 4000000
In the comments, you asked "why it isn't just \t=". Hopefully this example answers that question:
In [8]: struct.unpack('<i', b'\t=')
---------------------------------------------------------------------------
error Traceback (most recent call last)
<ipython-input-8-c3b2a260fbdf> in <module>
----> 1 struct.unpack('<i', b'\t=')
error: unpack requires a buffer of 4 bytes
The '<i' tells pack and unpack that they are working with a 4-byte little-endian integer, so we need to give them exactly 4 bytes.
You also asked "why not \x00\t\x00=\x00". This time we promise 4 bytes but deliver 5:
In [10]: struct.unpack('<i', b'\x00\t\x00=\x00')
---------------------------------------------------------------------------
error Traceback (most recent call last)
<ipython-input-10-8f3448548a2a> in <module>
----> 1 struct.unpack('<i', b'\x00\t\x00=\x00')
error: unpack requires a buffer of 4 bytes
Maybe I confused you by leaving out the 0x00 bytes in the reassembly:
In [11]: (0 << 24) + (ord('=') << 16) + (ord('\t') << 8) + 0
Out[11]: 4000000
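For anyone who wants to poke at the bytes directly, here is a minimal Python 3 sketch of the same interpretation. The variable names are made up for illustration, and the last step assumes that a 24-bit audio sample is simply the three low-order bytes of the 4-byte little-endian value, which the question does not confirm.

```python
import struct

packed = struct.pack('<i', 4000000)

# Each element of a bytes object is an integer from 0 to 255.
byte_values = list(packed)                        # [0, 9, 61, 0]
hex_view = ' '.join(f'{b:02X}' for b in packed)   # '00 09 3D 00'

# Two equivalent ways to get the integer back.
via_struct = struct.unpack('<i', packed)[0]
via_int = int.from_bytes(packed, byteorder='little', signed=True)

# Assumption: a 24-bit sample is the three low-order bytes.
sample24 = packed[:3]
sample_value = int.from_bytes(sample24, byteorder='little', signed=True)

print(packed, byte_values, hex_view)
print(via_struct, via_int, sample24, sample_value)
```

The printable bytes (\t and =) and the escaped ones (\x00) are only a display choice; list(packed) shows that all four are plain numbers.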

Related

error while receiving data and converting it into f64

I am trying to send data over a TCP connection from Rust to Python. However, while receiving the data in Python I get the following error when trying to convert it from bytes to f64:
Traceback (most recent call last):
File "server.py", line 36, in <module>
[x] = struct.unpack('f', data)
struct.error: unpack requires a buffer of 4 bytes
I am using the following method to convert the data from bytes to f64:
[x] = struct.unpack('f', data)
print(x)
My data looks like this, which I am sending over TCP:
x: 0.011399809271097183 (f64 from Rust)
and I receive something like
b'?t=*\x00\x00\x00\x00?s\xbd\xc6\x80\x00\x00\x00?q\xd5s\x00\x00\x00\x00?|\xae\x85\x80\x00\x00\x00?e\xb5\xc3\x00\x00\x00\x00?yp;\x80\x00\x00\x00?p\x7f\x98\x00\x00\x00\x00?hG|\x00\x00\x00\x00?o\x8d&\x00\x00\x00\x00?cv[\x00\x00\x00\x00?s\xdf\x97\x80\x00\x00\x00?{\x0e\xde\x80\x00\x00\x00?n\xec\xbf\x00\x00\x00\x00?n\xd8E\x00\x00\x00\x00?y+\xdd\x80\x00\x00\x00?r\xd90\x80\x00\x00\x00?r\xc2\x89\x00\x00\x00\x00?q\xc2i\x00\x00\x00\x00?kq"\x00\x00\x00\x00?t5\xec\x80\x00\x00\x00?|\xaak\x80\x00\x00\x00?z\x10\x9d\x00\x00\x00\x00?o\xeb\xde\x00\x00\x00\x01?m6\xfc\x00\x00\x00\x00'
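The format 'f' describes a single 4-byte float, but the buffer holds 192 bytes, hence the error. It appears that 24 8-byte floats (the double type in C, f64 in Rust) were sent in big-endian format. The decoded values at least look reasonable, although none of them matches the single value listed in the question: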
>>> data = b'?t=*\x00\x00\x00\x00?s\xbd\xc6\x80\x00\x00\x00?q\xd5s\x00\x00\x00\x00?|\xae\x85\x80\x00\x00\x00?e\xb5\xc3\x00\x00\x00\x00?yp;\x80\x00\x00\x00?p\x7f\x98\x00\x00\x00\x00?hG|\x00\x00\x00\x00?o\x8d&\x00\x00\x00\x00?cv[\x00\x00\x00\x00?s\xdf\x97\x80\x00\x00\x00?{\x0e\xde\x80\x00\x00\x00?n\xec\xbf\x00\x00\x00\x00?n\xd8E\x00\x00\x00\x00?y+\xdd\x80\x00\x00\x00?r\xd90\x80\x00\x00\x00?r\xc2\x89\x00\x00\x00\x00?q\xc2i\x00\x00\x00\x00?kq"\x00\x00\x00\x00?t5\xec\x80\x00\x00\x00?|\xaak\x80\x00\x00\x00?z\x10\x9d\x00\x00\x00\x00?o\xeb\xde\x00\x00\x00\x01?m6\xfc\x00\x00\x00\x00'
>>> len(data)
192
>>> len(data)/8
24.0
>>> import struct
>>> struct.unpack('>24d', data)
(0.004941143095493317, 0.004819655790925026, 0.004353951662778854, 0.007002374157309532, 0.0026501473039388657, 0.0062105488032102585, 0.00402793288230896, 0.0029637739062309265, 0.0038514845073223114, 0.0023757722228765488, 0.004851905629038811, 0.006605977192521095, 0.0037749987095594406, 0.003765234723687172, 0.006145348772406578, 0.004601659253239632, 0.004580054432153702, 0.004335794597864151, 0.003349844366312027, 0.0049342382699251175, 0.006998462602496147, 0.0063634999096393585, 0.003896649926900864, 0.003566257655620575)
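If the number of values is not known in advance, the count can be derived from the buffer length. The following is only a sketch: unpack_doubles is a hypothetical helper, and the payload is built locally from made-up values rather than received from the Rust sender, whose exact encoding is not shown in the question.

```python
import struct

def unpack_doubles(buffer, byteorder='>'):
    """Decode a buffer of back-to-back 8-byte floats (hypothetical helper)."""
    size = struct.calcsize(byteorder + 'd')   # 8 bytes per double
    if len(buffer) % size:
        raise ValueError(
            f'buffer length {len(buffer)} is not a multiple of {size}')
    count = len(buffer) // size
    return struct.unpack(f'{byteorder}{count}d', buffer)

# Round trip with made-up values standing in for the TCP payload.
payload = struct.pack('>3d', 0.5, -1.25, 0.011399809271097183)
values = unpack_doubles(payload)
print(values)
```

A single recv call on a TCP socket may return only part of a message, so a buffer whose length is not a multiple of 8 may simply mean that more bytes are still on the way.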

Python struct.unpack byte length issues

I have the following code:
msg = b'0,[\x00\x01\x86\xec\x96N'
print(struct.unpack("<"+"I",msg))
However, every time I try this it says
struct.error: unpack requires a buffer of 4 bytes
What I tried to do is the following:
times = int(len(msg)/4)
struct.unpack("<"+"I" * times,msg)
But it doesn't always work; I think it fails on uneven lengths. How can I get the correct size so I don't encounter these issues?
struct.unpack requires that the length of the buffer being consumed is exactly the size of the format. [1]
Use struct.unpack_from instead, which requires that the length of the buffer being consumed is at least the size of the format. [2]
>>> msg = b'0,[\x00\x01\x86\xec\x96N'
>>> import struct
>>> print(struct.unpack("<"+"I", msg))
Traceback (most recent call last):
File "<input>", line 1, in <module>
struct.error: unpack requires a buffer of 4 bytes
>>> print(struct.unpack_from("<"+"I", msg))
(5975088,)
Any additional bytes are ignored by unpack_from.
[1] https://docs.python.org/3/library/struct.html#struct.unpack
[2] https://docs.python.org/3/library/struct.html#struct.unpack_from
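To read every whole integer in the buffer rather than just the first, the count can be computed with struct.calcsize. This is a sketch using the msg from the question; the names item_size, count and leftover are illustrative only.

```python
import struct

msg = b'0,[\x00\x01\x86\xec\x96N'

item_size = struct.calcsize('<I')        # 4 bytes per unsigned int
count = len(msg) // item_size            # whole integers that fit
values = struct.unpack_from(f'<{count}I', msg)
leftover = msg[count * item_size:]       # bytes that did not fill an integer

print(values, leftover)
```

Here the 9-byte message yields two integers and one leftover byte, which the caller can keep and prepend to the next chunk if the data arrives in pieces.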

Convert Python's binascii.crc_hqx() back to ascii

I'm using the standard Python3 lib binascii, and specifically the crc_hqx() function
binascii.crc_hqx(data, value)
Compute a 16-bit CRC value of data, starting with value as the initial CRC, and return the result. This uses the CRC-CCITT polynomial x^16 + x^12 + x^5 + 1, often represented as 0x1021. This CRC is used in the binhex4 format.
I'm able to convert to CRC with this code:
import binascii
t = 'abcd'
z = binascii.crc_hqx(t.encode('ascii'), 0)
print(t,z)
which, as expected, prints the line
abcd 43062
But how do I convert back to ASCII?
I've tried variations with the a2b_hqx() function
binascii.a2b_hqx(string)
Convert binhex4 formatted ASCII data to binary, without doing RLE-decompression. The string should contain a complete number of binary bytes, or (in case of the last portion of the binhex4 data) have the remaining bits zero.
The simplest version would be:
y = binascii.a2b_hqx(str(z))
But I've also tried variations with bytearray() and str.encode(), etc.
For this code:
import binascii
t = 'abcd'
z = binascii.crc_hqx(t.encode('ascii'), 0)
print(t,z)
y = binascii.a2b_hqx(str(z))
the Traceback:
abcd 43062
Traceback (most recent call last):
File "test.py", line 5, in <module>
y = binascii.a2b_hqx(str(z))
binascii.Incomplete: String has incomplete number of bytes
And with this code:
y = binascii.a2b_hqx(bytearray(z))
This Traceback:
binascii.Error: Illegal char
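What crc_hqx generates is a checksum, and a checksum cannot be converted back to ASCII. It is a 16-bit summary of the data, so many different inputs produce the same value and the original text cannot be recovered from it.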
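What a checksum can do is confirm that data has not changed: the receiver recomputes the CRC and compares it with the one that was sent. A minimal sketch, in which crc_matches is a hypothetical helper and not part of binascii:

```python
import binascii

def crc_matches(data, expected_crc):
    """Return True if data produces the expected CRC (hypothetical helper)."""
    return binascii.crc_hqx(data, 0) == expected_crc

original = 'abcd'.encode('ascii')
checksum = binascii.crc_hqx(original, 0)

print(crc_matches(original, checksum))   # same data
print(crc_matches(b'abce', checksum))    # one bit changed
```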

Error in packing 128 byte structure using Struct in python

I want to pack 128 bytes of different data types. The structure is as follows:
4 bytes - 0x12345678,
2 bytes - 0x1234,
120 bytes - 0x00 (repeated 120 times),
2 bytes - 0x99
I tried the code below, but it fails:
struct.pack('<LH120BH',0x12345678,0x1234,0x00,0x99 )
It gives this error:
Traceback (most recent call last):
File "<pyshell#10>", line 1, in <module>
struct.pack('<LH120BH',0x12345678,0x1234,0x00,0x99 )
struct.error: pack expected 123 items for packing (got 4)
Please help me. Thanks in advance.
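The format 120B expects 120 separate values, one per byte, which together with the other three fields makes the 123 items named in the error. Put the 0x00 values in a list of length 120 and unpack it with * when you call struct.pack, something like this: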
struct.pack('<LH120BH',0x12345678,0x1234,*[0x00] * 120,0x99)
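When the 120 bytes are always zero, the 'x' pad-byte format code is another option, since it takes no arguments and is written as zero bytes. The sketch below compares the two spellings; the variable names are illustrative.

```python
import struct

zeros = [0x00] * 120
explicit = struct.pack('<LH120BH', 0x12345678, 0x1234, *zeros, 0x99)

# 'x' is a pad byte: it takes no argument and is written as zero.
padded = struct.pack('<LH120xH', 0x12345678, 0x1234, 0x99)

print(len(explicit), explicit == padded)
```

The B form is still the one to use if the 120 bytes may later carry real data, because pad bytes are skipped when unpacking.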

from hex string to int and back

I'm using Scapy to forge packets in Python, but I need to manually modify a sequence of bits (that scapy doesn't support) inside a specific packet, so I do the following:
Given a packet p, I convert it to a hex string, then to base 10 and finally to a binary number. I modify the bits I'm interested in, then I convert it back to a packet. I have trouble converting it back to the same format of hex string...
# I create a packet with Scapy
In [3]: p = IP(dst="www.google.com") / TCP(sport=10000, dport=10001) / "asdasdasd"
In [6]: p
Out[6]: <IP frag=0 proto=tcp dst=Net('www.google.com') |<TCP sport=webmin dport=10001 |<Raw load='asdasdasd' |>>>
# I convert it to a hex string
In [7]: p_str = str(p)
In [8]: p_str
Out[8]: "E\x00\x001\x00\x01\x00\x00#\x06Q\x1c\x86;\x81\x99\xad\xc2t\x13'\x10'\x11\x00\x00\x00\x00\x00\x00\x00\x00P\x02 \x00\x19a\x00\x00asdasdasd"
# I convert it to an integer
In [9]: p_int = int(p_str.encode('hex'), 16)
In [10]: p_int
Out[10]: 2718738542629841457617712654487115358609175161220115024628433766520503527612013312415911474170471993202533513363026788L
# Finally, I convert it to a binary number
In [11]: p_bin = bin(p_int)
In [11]: p_bin
Out[11]: '0b1000101000000000000000000110001000000000000000100000000000000000100000000000110010100010001110010000110001110111000000110011001101011011100001001110100000100110010011100010000001001110001000100000000000000000000000000000000000000000000000000000000000000000101000000000010001000000000000000011001011000010000000000000000011000010111001101100100011000010111001101100100011000010111001101100100'
# ... (I modify some bits in p_bin, for instance the last three)...
In [12]: p_bin_modified = p_bin[:-3] + '000'
# I convert it back to a packet!
# First to int
In [13]: p_int_modified = int(p_bin_modified, 2)
In [14]: p_int_modified
Out[14]: 2718738542629841457617712654487115358609175161220115024628433766520503527612013312415911474170471993202533513363026784L
# Then to a hex string
In [38]: hex(p_int_modified)
Out[38]: '0x45000031000100004006511c863b8199adc274132710271100000000000000005002200019610000617364617364617360L'
Oops! It doesn't really look like the format of the original hex string. Any ideas on how to do it?
EDIT:
OK, I found decode('hex'), which works on a hex string, but it breaks the reflexivity of the whole conversion...
In [73]: hex(int(bin(int(str(p).encode('hex'), 16)), 2)).decode('hex')
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-73-f5b9d74d557f> in <module>()
----> 1 hex(int(bin(int(str(p).encode('hex'), 16)), 2)).decode('hex')
/usr/lib/python2.7/encodings/hex_codec.pyc in hex_decode(input, errors)
40 """
41 assert errors == 'strict'
---> 42 output = binascii.a2b_hex(input)
43 return (output, len(input))
44
TypeError: Odd-length string
EDIT2: I get the same error if I remove the conversion to a binary number...
In [13]: hex(int(str(p).encode('hex'), 16)).decode('hex')
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
/home/ricky/<ipython-input-13-47ae9c87a5d2> in <module>()
----> 1 hex(int(str(p).encode('hex'), 16)).decode('hex')
/usr/lib/python2.7/encodings/hex_codec.pyc in hex_decode(input, errors)
40 """
41 assert errors == 'strict'
---> 42 output = binascii.a2b_hex(input)
43 return (output, len(input))
44
TypeError: Odd-length string
OK, I solved it.
I have to strip the trailing L of the long int and the leading 0x from the hex representation:
In [76]: binascii.unhexlify(hex(int(binascii.hexlify(str(p)), 16)).lstrip('0x').rstrip('L')) == str(p)
Out[76]: True
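One caveat with the hex round trip: a leading zero byte or zero nibble disappears when the data passes through an integer, which brings back the odd-length error. In Python 3 the same bit edit can be sketched with int.from_bytes and int.to_bytes, which keep the original length. This is not the code from the question (that is Python 2); clear_low_bits is a hypothetical helper and raw is a stand-in for a packet that has already been serialized to bytes.

```python
def clear_low_bits(raw, nbits):
    """Return raw with its last nbits bits set to zero (hypothetical helper)."""
    value = int.from_bytes(raw, byteorder='big')
    value &= ~((1 << nbits) - 1)
    # Passing the original length keeps leading zero bytes intact.
    return value.to_bytes(len(raw), byteorder='big')

raw = b'E\x00\x001asdasdasd'   # stand-in for the serialized packet
modified = clear_low_bits(raw, 3)
print(modified)
```

Clearing the last three bits turns the final d (0x64) into a backtick (0x60), matching the ...7360 ending of the hex output in the question.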
