Porting struct.unpack from python 2.7 to 3 - python

The following code works fine in python 2.7:
def GetMaxNoise(data, max_noise):
    for byte in data:
        noise = ComputeNoise(struct.unpack('=B',byte)[0])
        if max_noise < noise:
            max_noise = noise
    return max_noise
where data is a string holding binary data (taken from a network packet).
I'm trying to port it to Python 3 and I get this:
File "Desktop/Test.py", line 2374, in GetMaxNoise
noise = ComputeNoise(struct.unpack('=B',byte)[0])
TypeError: 'int' does not support the buffer interface
How can I convert "data" to the appropriate type needed by unpack()?

Assuming the data variable holds bytes that you read from a binary file or a network packet, it is not processed the same way in Python 2 and Python 3.
In Python 2, it is a string. When you iterate over it, you get single-byte strings, which you convert to int with struct.unpack('=B', byte)[0].
In Python 3, it is a bytes object. When you iterate over it, you directly get integers! So you should directly use:
def GetMaxNoise(data, max_noise):
    for byte in data:
        noise = ComputeNoise(byte)  # byte is already the int value of the byte...
        if max_noise < noise:
            max_noise = noise
    return max_noise

From the docs of the struct module https://docs.python.org/3.4/library/struct.html I see that the unpack method expects its second argument to implement the buffer protocol, so it generally expects bytes.
Your data object seems to be of type bytes, since it is read from somewhere. When you iterate over it with the for loop, the byte variable ends up being a single int value.
I don't know what your code is supposed to do and how, but maybe change the way you iterate over your data object so that you handle bytes of length 1 instead of ints?
for i in range(len(data)):
    byte = data[i:i+1]
    print(byte)
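Both suggestions above can be checked with a short sketch. The payload here is made up for illustration; it is not taken from the question.

```python
import struct

# Hypothetical 3-byte payload standing in for the packet data.
data = b"\x01\xff\x10"

# Python 3: iterating over a bytes object yields ints directly.
values_from_iteration = [byte for byte in data]

# Slicing keeps the bytes type, so struct.unpack still accepts each piece.
values_from_slices = [
    struct.unpack("=B", data[i:i + 1])[0] for i in range(len(data))
]

print(values_from_iteration)
print(values_from_slices)
```

Both lists hold the same integers, so in Python 3 the struct.unpack call is only needed if you want to keep one code path that also works on slices.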

Related

How to convert a byte array to float in Python

I have a byte array, which originally was converted from a float array in Scala. I need to convert it back to a float array in Python.
This is the code I used to convert the float array in Scala:
val float_ary_len = float_ary.size
val bb = java.nio.ByteBuffer.allocate(float_ary_len * 4)
for (each_float <- float_ary) {
  bb.putFloat(each_float)
}
val bytes_ary = bb.array()
Then in Python, I can get this byte array and I need to convert it back to a float array.
I have tried the following code in Python, but it didn't give me the right float.
print(list(bytes_ary[0:4]))
#['\xc2', '\xda', 't', 'Z']
struct.unpack('f', bytes_ary[0:4])
# it gave me 1.7230105268977664e+16, but it should be -109.22725
How can I get the right float?
Apparently the Scala code that encodes the value uses a different byte order than the Python code that decodes it.
Make sure you use the same byte order (endianness) in both programs.
In Python, you can change the byte order used to decode the value by using >f or <f instead of f. See https://docs.python.org/3/library/struct.html#struct-alignment.
>>> b = b'\xc2\xdatZ'
>>> struct.unpack('f', b) # native byte order (little-endian on my machine)
(1.7230105268977664e+16,)
>>> struct.unpack('>f', b) # big-endian
(-109.22724914550781,)
It could be because of the endian encoding.
You should try big endian:
struct.unpack('>f', bytes_ary[0:4])
or little endian:
struct.unpack('<f', bytes_ary[0:4])
Depends on your byte array.
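Since java.nio.ByteBuffer writes in big-endian order unless told otherwise, '>' is the likely candidate here. Below is a minimal sketch for decoding the whole array in one call. The name bytes_to_floats is made up for this example, and it assumes the buffer holds nothing but 4-byte floats.

```python
import struct


def bytes_to_floats(buf):
    """Decode a buffer of big-endian 4-byte floats into a list."""
    count = len(buf) // 4  # number of complete floats in the buffer
    # A repeat count in the format string unpacks all values at once.
    return list(struct.unpack(">%df" % count, buf[:count * 4]))


# The four bytes from the question, followed by a packed 1.5.
sample = b"\xc2\xdatZ" + struct.pack(">f", 1.5)
floats = bytes_to_floats(sample)
print(floats)
```

The first value comes out as roughly -109.22725, the number the question expected.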
If print(byte_array_of_old_float) returns bytearray(b'684210'), then this should work:
floatvar=float(byte_array_of_old_float)
In my case the byte array came from a MariaDB select call, and I did the conversion like that.

struct.pack error in Python 3 - struct.error: argument for 's' must be a bytes object

I know this question has been asked before, and some of the suggestions seem to be about needing a b prefix to make the string a bytes literal. However, I'm passing hex code to the function as 0x414243 to save it as ABC.
def _pack(_data, size):
    numofbytes = size/8
    print("Chars Expected: " + str(numofbytes))
    formatString = "{}s".format(int(numofbytes))
    print("Formatted String:" + formatString)
    struct.pack(formatString,_data)
_pack(0x414243,24)
I'm not sure what to change here; I'm wondering if it's a problem with how I'm using the formatString variable. I want the function to work out how many chars are in the passed data from the size. In this case 24 bits = 3 bytes, so it formats 3s and passes 0x414243 to convert to ABC.
Can anyone advise how to get past the error?
As the error message says, struct.pack() wants a group of bytes and you're giving it an integer.
If you want to be able to pass the data in as an integer, convert it to bytes before packing it:
_data = _data.to_bytes(numofbytes, "big") # or "little", depending on endianness
Or just pass the data in as bytes when you call it:
_pack(b"\x41\x42\x43", 24)
If you have a string containing hexadecimal, such as "0x414243", you can convert it to an integer, then on to bytes:
_data = int(_data, 16).to_bytes(numofbytes, "big")
You might use isinstance() to allow your function to accept any of these formats:
if isinstance(_data, str):
    _data = int(_data, 16)
if isinstance(_data, int):
    _data = _data.to_bytes(numofbytes, "big")
By the way, your calculation of the number of bytes will produce a floating-point answer if size is not a multiple of 8. A fractional number of bytes is an error. To address this:
numofbytes = size // 8 + bool(size % 8)
The + bool(size % 8) bit adds one to the result of the integer division if there are any bits left over.
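Putting those pieces together, one possible version of the function looks like this. It is only a sketch: pack_value is a name chosen for this example, and big-endian order is an assumption that matches the 0x414243 to ABC case in the question.

```python
import struct


def pack_value(data, size_bits):
    """Pack data (int, hex string, or bytes) into size_bits worth of bytes."""
    # Round up so that leftover bits still get a full byte.
    num_bytes = size_bits // 8 + bool(size_bits % 8)
    if isinstance(data, str):
        data = int(data, 16)  # accepts "0x414243" as well as "414243"
    if isinstance(data, int):
        data = data.to_bytes(num_bytes, "big")  # assumed byte order
    return struct.pack("{}s".format(num_bytes), data)


print(pack_value(0x414243, 24))
print(pack_value("0x414243", 24))
print(pack_value(b"\x41\x42\x43", 24))
```

All three calls give b'ABC'. Note that to_bytes() raises OverflowError if the integer does not fit in the requested number of bytes.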

Python3 reading a binary file, 4 bytes at a time and xor it with a 4 byte long key

I want to read a binary file, get the content four bytes by four bytes and perform int operations on these packets.
Using a dummy binary file, opened this way:
with open('MEM_10001000_0000B000.mem', 'br') as f:
    for byte in f.read():
        print (hex(byte))
I want to perform an encryption with a 4 byte long key, 0x9485A347 for example.
Is there a simple way I can read my files 4 bytes at a time and get them as int or do I need to put them in a temporary result using a counter?
My original idea is the following:
current_tmp = []
for byte in data:
    current_tmp.append(int(byte))
    if (len(current_tmp) == 4):
        print (current_tmp)
        # but current_tmp is an array not a single int
        current_tmp = []
In my example, instead of having [132, 4, 240, 215] I would rather have 0x8404f0d7
Just use the "amount" parameter of read to read 4 bytes at a time, and the "from_bytes" constructor of Python 3's int to get it going:
with open('MEM_10001000_0000B000.mem', 'br') as f:
    data = f.read(4)
    while data:
        number = int.from_bytes(data, "big")
        ...
        data = f.read(4)
If you are not using Python 3 yet for some reason, int won't feature a from_bytes method - then you could resort to use the struct module:
import struct
...
number = struct.unpack(">i", data)[0]
...
These methods, however, are good for a couple of iterations and could get slow for a large file. Python offers a way to fill an array of 4-byte integers directly in memory from an open file, which is more likely what you should be using:
import array, os
numbers = array.array("i")
with open('MEM_10001000_0000B000.mem', 'br') as f:
    numbers.fromfile(f, os.stat('MEM_10001000_0000B000.mem').st_size // numbers.itemsize)
numbers.byteswap()
This will give you a numbers sequence with all numbers in your file read in as 4-byte, big-endian, signed ints (the byteswap() call assumes a little-endian machine).
Once you have the array, you can XOR its elements together, starting from the key, with something like
from functools import reduce  # not needed in Python 2.7
result = reduce(lambda result, input: result ^ input, numbers, key)
Note that this folds the whole array into a single integer; it does not XOR each element with the key.
If your file is not a multiple of 4 bytes, the first two methods might need some adjustment; fixing the while condition will be enough.
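If the goal is to XOR every 4-byte word with the key, a sketch along these lines might do. The function name xor_words and the handling of a short trailing chunk are assumptions made for this example, and an in-memory io.BytesIO stands in for the real file.

```python
import io

KEY = 0x9485A347  # example key from the question


def xor_words(stream, key=KEY):
    """XOR each 4-byte big-endian word read from stream with key."""
    out = bytearray()
    while True:
        chunk = stream.read(4)
        if not chunk:
            break
        word = int.from_bytes(chunk, "big")
        # A trailing chunk shorter than 4 bytes is XORed with the
        # leading bytes of the key (an assumption; adapt as needed).
        key_part = key >> (8 * (4 - len(chunk)))
        out += (word ^ key_part).to_bytes(len(chunk), "big")
    return bytes(out)


# In-memory stand-in for the file; the first four bytes are 0x8404f0d7.
plain = bytes([132, 4, 240, 215, 1, 2])
encrypted = xor_words(io.BytesIO(plain))
print(encrypted.hex())
```

Because XOR is its own inverse, running xor_words over the encrypted bytes returns the original data. With a real file, pass the object returned by open(..., 'rb') instead of the BytesIO.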

python convert raw binary data to array of floats

I'm new to python but i need to get this project done in it. I'm using telnetlib to get some raw data from a device, and this is what the data looks like (this is only part of the output i get, the real one is about 10x bigger)
\xc2\xb2\xdd\x0f\xc2\xb2x/\xc2\xb2\x08\xb2M\xcf\xc2\xb2\xc5S\xc2\xb2\xd6[\xc2\xb2qw\xc2\xb1\xafK\xc2\xb1n+\xc2\xb2?\x83\xc2\xb1\xe3\xb7\xc2\xb0\xe8\x87\xc2\xb0\xf1\x8f\xc2\xb1x\xbf\xc2\xb1\xcbO\xc2\xb1\x98\x93\xc2\xb1\xd4\xc3\xc2\xb1\xf7\x9f\xc2\xb1\xb3\x97\xc2\xb1\xe7;\xc2\xb2\x97\xcb\xc2\xb2\xd3\xf3\xc2\xb2f\x8b\xc2\xb1\xc6\xdb\xc2\xb1\xadC\xc2\xb1t\xcf\xc2\xb1\x9c\xdf\xc2\xb1\xb7\x1b\xc2\xb1\xa3\xc2\xb1\t_\xc2\xb1v\xc3\xc2\xb1\xeb
The documentation of the device says that this is
raw data: binary. An array of float values in big-endian format (not as a string).
The question is how can i convert this data into an array of float numbers?
the code:
import telnetlib
tn = telnetlib.Telnet(hostIP)
tn.read_until("connected")
tn.write("getData\r\n")
data = tn.read_until("\r\n")
print data
When i execute this script from terminal i get some binary "garbage"
²\f²▒▒²▒V²▒²▒
³▒▒³u▒³:v³▒>³;>²W▒²O^²Xf²▒▒±▒▒²P▒²▒j²▒²▒³Pv³▒▒²▒n²:Z²▒±▒F±▒±7▒±#▒±t^±▒▒±▒▒²5:±▒"±▒~±ю±±*±▒°▒▒°{n°a▒°▒:°Q▒°[°cj°0▒¯▒▒¯▒▒r¯ޒ°▒°▒¯▒▒¯a▒¯▒°E▒°▒r°q*¯▒¯▒
If i do the same from python shell i get the \xc2\xb2\xdd\x0f\xc2... values
You need to know in advance the number of elements in the array, or somehow infer the count, i.e. by counting the number of bytes and then dividing by the float size. You then use the struct module to unpack the binary data.
import struct
# '>d' reads 8-byte big-endian doubles; 4-byte floats would need '>f' and a step of 4.
assert len(data) % 8 == 0, "Data length not a multiple of 8"
L = []
for i in range(0, len(data), 8):
    # unpack() returns a 1-tuple, so take its only element
    L.append(struct.unpack('>d', data[i:i+8])[0])
Complementing @vz0's answer, there is also struct.iter_unpack(), which, according to the docs, will:
Iteratively unpack from the buffer buffer according to the format string format.
So we can convert without any trouble:
import struct
import numpy as np
# Choose operators from https://docs.python.org/3/library/struct.html#format-strings
Byte_Order = '<' # little-endian
Format_Characters = 'f' # float (4 bytes)
data_format = Byte_Order + Format_Characters
r = np.array(list(struct.iter_unpack(data_format, data)), dtype=float)
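For the big-endian data described in the question, a NumPy-free variant could look like the sketch below. It assumes 4-byte floats and that any trailing bytes that do not fill a whole float (such as a line terminator) can be dropped; floats_from_buffer is a made-up name.

```python
import struct


def floats_from_buffer(buf, fmt=">f"):
    """Return a flat list of floats decoded from buf using fmt."""
    size = struct.calcsize(fmt)
    # iter_unpack() requires a length that is a multiple of the item size.
    usable = len(buf) - len(buf) % size
    # Each item yielded by iter_unpack() is a 1-tuple; unpack it inline.
    return [value for (value,) in struct.iter_unpack(fmt, buf[:usable])]


# Made-up test data: three big-endian floats followed by "\r\n".
sample = struct.pack(">3f", 1.0, -2.5, 0.25) + b"\r\n"
print(floats_from_buffer(sample))
```

Unpacking each 1-tuple this way gives a flat list, whereas list(struct.iter_unpack(...)) on its own gives a list of tuples.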

Convert little endian string to integer [duplicate]

I have read samples out of a wave file using the wave module, but it gives the samples as a string, it's out of wave so it's little endian (for example, \x00).
What is the easiest way to convert this into a python integer, or a numpy.int16 type? (It will eventually become a numpy.int16, so going directly there is fine).
Code needs to work on little endian and big endian processors.
The struct module converts packed data to Python values, and vice-versa.
>>> import struct
>>> struct.unpack("<h", "\x00\x05")
(1280,)
>>> struct.unpack("<h", "\x00\x06")
(1536,)
>>> struct.unpack("<h", "\x01\x06")
(1537,)
"h" means a short int, or 16-bit int. "<" means use little-endian.
struct is fine if you have to convert one or a small number of 2-byte strings to integers, but array and numpy itself are better options. Specifically, numpy.fromstring (called with the appropriate dtype argument) can directly convert the bytes from your string to an array of whatever that dtype is; newer NumPy versions deprecate fromstring for binary data in favor of numpy.frombuffer. If numpy.little_endian is false, you'll then have to swap the bytes: basically you'll want to call the byteswap method on the array object you just built.
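The array route mentioned above can be sketched with the standard library alone. This assumes the samples are signed 16-bit values, as in a typical 16-bit wave file; samples_from_little_endian is a name invented for the example.

```python
import array
import sys


def samples_from_little_endian(raw):
    """Decode little-endian signed 16-bit samples into a list of ints."""
    samples = array.array("h")  # "h" is a signed short, 2 bytes
    samples.frombytes(raw)      # interprets the bytes in native order
    if sys.byteorder == "big":
        samples.byteswap()      # only needed on big-endian machines
    return samples.tolist()


print(samples_from_little_endian(b"\x00\x04\x01\x05\x03\x04"))
```

The byte-order check is what makes the same code work on both little-endian and big-endian processors.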
Kevin Burke's answer to this question works great when your binary string represents a single short integer, but if your string holds binary data representing multiple integers, you will need to add an additional 'h' for each additional integer that the string represents.
For Python 2
Convert Little Endian String that represents 2 integers
import struct
iValues = struct.unpack("<hh", "\x00\x04\x01\x05")
print(iValues)
Output: (1024, 1281)
Convert Little Endian String that represents 3 integers
import struct
iValues = struct.unpack("<hhh", "\x00\x04\x01\x05\x03\x04")
print(iValues)
Output: (1024, 1281, 1027)
Obviously, it's not realistic to always guess how many "h" characters are needed, so:
import struct
# A string that holds some unknown quantity of integers in binary form
strBinary_Values = "\x00\x04\x01\x05\x03\x04"
# Calculate the number of integers that are represented by binary string data
iQty_of_Values = len(strBinary_Values)/2
# Produce the string of required "h" values
h = "h" * int(iQty_of_Values)
iValues = struct.unpack("<"+h, strBinary_Values)
print(iValues)
Output: (1024, 1281, 1027)
For Python 3
import struct
# A bytes object that holds some unknown quantity of integers in binary form
strBinary_Values = b"\x00\x04\x01\x05\x03\x04"
# Calculate the number of integers that are represented by the binary data
iQty_of_Values = len(strBinary_Values) // 2
# Produce the string of required "h" values
h = "h" * iQty_of_Values
iValues = struct.unpack("<"+h, strBinary_Values)
print(iValues)
Output: (1024, 1281, 1027)
int(value[::-1].hex(), 16)
By example:
value = b'\xfd\xff\x00\x00\x00\x00\x00\x00'
print(int(value[::-1].hex(), 16))
65533
[::-1] reverses the byte order (the input is little endian), .hex() converts the bytes to a hex string, and int(..., 16) parses that hex string as a base-16 integer.
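In Python 3, int.from_bytes() does the same job without the detour through a hex string, and it can also treat the value as signed. A short sketch using the same bytes:

```python
value = b"\xfd\xff\x00\x00\x00\x00\x00\x00"

# Same result as int(value[::-1].hex(), 16).
unsigned_value = int.from_bytes(value, "little")

# A 2-byte sample read as a signed 16-bit integer.
signed_sample = int.from_bytes(b"\xfd\xff", "little", signed=True)

print(unsigned_value)  # 65533
print(signed_sample)   # -3
```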
