Python struct as networking data packets (unknown byte sequence)

I am working on a server engine in Python for my game made in GameMaker Studio 2, and I'm having trouble building and sending packets.
I've managed to establish a connection and send the first packet, but I can't work out how to decode data conditionally: if the first value in the packed struct equals a given message ID, the rest of the data should be unpacked using the sequence that belongs to that ID.
Example:
types = 'hhh' #(message_id, x, y) example
message_id = 0
x = 200
y = 200
buffer = pack(types, 0,x, y)
On the server side:
data = conn.recv(BUFFER_SIZE)
mid = unpack('h', data)[0]
if not data: break
if mid == 0:
    sequence = 'hhh'
    x = unpack(sequence, data)[1]
    y = unpack(sequence, data)[2]

It looks like your subsequent decoding is going to vary based on the message ID?
If so, you will likely want to use unpack_from which allows you to pull only the first member from the data (as written now, your initial unpack call will generate an exception because the buffer you're handing it is not the right size). You can then have code that varies the unpacking format string based on the message ID. That code could look something like this:
from struct import pack, unpack, unpack_from
while True:
    data = conn.recv(BUFFER_SIZE)
    # End of file, bail out of loop
    if not data: break
    mid = unpack_from('!h', data)[0]
    if mid == 0:
        # Message ID 0
        types = '!hhh'
        _, x, y = unpack(types, data)
        # Process message type 0
        ...
    elif mid == 1:
        types = '!hIIq'
        _, v, w, z = unpack(types, data)
        # Process message type 1
        ...
    elif mid == 2:
        ...
Note that we're unpacking the message ID again in each case along with the ID-specific parameters. You could avoid that if you like by using the optional offset argument to unpack_from:
x, y = unpack_from('!hh', data, offset=2)
One other note of explanation: If you are sending messages between two different machines, you should consider the "endianness" (byte ordering). Not all machines are "little-endian" like x86. Accordingly it's conventional to send integers and other structured numerics in a certain defined byte order - traditionally that has been "network byte order" (which is big-endian) but either is okay as long as you're consistent. You can easily do that by prepending each format string with '!' or '<' as shown above (you'll need to do that for every format string on both sides).
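To make the byte-order point concrete, here is a small sketch that packs the question's three 16-bit values with both explicit prefixes. The values come from the question; the variable names are only for illustration.

```python
from struct import pack, unpack

# The same three 16-bit values packed with two explicit byte orders.
big = pack('!hhh', 0, 200, 200)      # network (big-endian) order
little = pack('<hhh', 0, 200, 200)   # little-endian order

print(big.hex())     # 000000c800c8
print(little.hex())  # 0000c800c800

# Decoding with the wrong byte order does not raise an error;
# it silently produces wrong values.
wrong = unpack('<hhh', big)
print(wrong)         # (0, -14336, -14336)
```

Because a mismatch fails silently, it is worth fixing the prefix in one place (for example a shared constant) rather than repeating it by hand.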
Finally, the above code probably works fine for a simple "toy" application but as your program increases in scope and complexity, you should be aware that there is no guarantee that your single recv call actually receives all the bytes that were sent and no other bytes (such as bytes from a subsequently sent buffer). In other words, it's often necessary to add a buffering layer, or otherwise ensure that you have received and are operating on exactly the number of bytes you intended.
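One possible shape for such a buffering layer is sketched below. It assumes every message ID maps to a fixed-size body; the `BODIES` table, `recv_exact`, and `recv_message` are hypothetical names introduced for this example, and the field layouts simply mirror the format strings used above.

```python
import socket
import struct

HEADER = struct.Struct('!h')          # message ID only
# Hypothetical message table: ID -> format of the fields that follow the ID.
BODIES = {
    0: struct.Struct('!hh'),          # x, y
    1: struct.Struct('!IIq'),         # v, w, z
}

def recv_exact(conn, count):
    """Read exactly `count` bytes, or raise if the peer closes early."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = conn.recv(remaining)
        if not chunk:
            raise ConnectionError('peer closed the connection mid-message')
        chunks.append(chunk)
        remaining -= len(chunk)
    return b''.join(chunks)

def recv_message(conn):
    """Return (message_id, fields) for the next complete message."""
    (mid,) = HEADER.unpack(recv_exact(conn, HEADER.size))
    body = BODIES[mid]                # KeyError here means an unknown message ID
    return mid, body.unpack(recv_exact(conn, body.size))

# Demonstration over a local socket pair: two messages sent back to back.
a, b = socket.socketpair()
a.sendall(struct.pack('!hhh', 0, 200, 200) + struct.pack('!hIIq', 1, 7, 8, 9))
first = recv_message(b)
second = recv_message(b)
print(first, second)
a.close()
b.close()
```

Reading the ID first and then exactly the number of bytes that ID requires means that two messages arriving in one recv call, or one message split across several, are both handled the same way.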

Could you unpack the whole message into a list once and then check its elements? Why unpack it three times? You could unpack once and work with that list: check that it is not empty, check the first element, and if it equals the expected ID, continue parsing the rest of the list. Did you try that?

Related

Python C# datatypes for Sockets

I am creating a socket server to connect and speak with a C# program over TCP. Currently I am trying to create a way to convert the hex sent over the TCP socket to specific variables (the variable type will be in the packet header, and yes, I do know TCP is a stream and does not technically send packets, but I am designing it like this). Currently I have all of the C# integral numeric types converting to and from bytearrays/integers correctly via the code below. (All of the different types are the same, with a couple of edits to fit the C# type.)
## SBYTE Type Class definition
## C#/Unity "sbyte" can be from -128 to 127
##
## Usage:
##
## Constructor
## variable = sbyte(integer)
## variable = sbyte(bytearray)
##
## Variables
## sbyte.integer (Returns integer representation)
## sbyte.bytes (Returns bytearray representation)
class sbyte:
    def __init__(self, input):
        if type(input) == type(int()):
            self.integer = input
            self.bytes = self.__toBytes(input)
        elif type(input) == type(bytearray()):
            self.bytes = input
            self.integer = self.__toInt(input)
        else:
            raise TypeError(f"sbyte constructor can take integer or bytearray type not {type(input)}")
    ## Return Integer from Bytes Array
    def __toInt(self, byteArray):
        ## Check that there is only 1 byte
        if len(byteArray) != 1:
            raise OverflowError(f"sbyte.__toInt length can only be 1 byte not {len(byteArray)} bytes")
        ## Return signed integer
        return int.from_bytes(byteArray, byteorder='little', signed=True)
    ## Return Bytes Array from Integer
    def __toBytes(self, integer):
        ## Check that the passed integer is not larger than 127 and is not smaller than -128
        if integer > 127 or integer < -128:
            raise ValueError(f"sbyte.__toBytes can only take an integer less than or equal to 127, and greater than or equal to -128, not \"{integer}\"")
        ## Convert the passed integer to Bytes
        return integer.to_bytes(1, byteorder='little', signed=True)
This is working for all the types I have implemented so far, but I wonder if there is a better way to handle this, such as using ctypes or some other Python library. Since this will be a socket server with potentially many connections, handling this as fast as possible is best. If there is anything else you see that I can improve, I would love to know.

If all you want is an integer value from a byte array, simply index the byte array:
>>> b = bytearray.fromhex('1E')
>>> b[0]
30

After testing the differences between from_bytes, struct.unpack, and numpy.frombuffer with the following code:
import timeit
setup1 = """\
byteArray = bytearray.fromhex('1E')
"""
setup2 = """\
import struct
byteArray = bytearray.fromhex('1E')
"""
setup3 = """\
import numpy as np
type = np.dtype(np.byte)
byteArray = bytearray.fromhex('1E')
"""
stmt1 = "int.from_bytes(byteArray, byteorder='little', signed=True)"
stmt2 = "struct.unpack('b', byteArray)"
stmt3 = "np.frombuffer(byteArray, type)"
print(f"First statement execution time = {timeit.timeit(stmt=stmt1, setup=setup1, number=10**8)}")
print(f"Second statement execution time = {timeit.timeit(stmt=stmt2, setup=setup2, number=10**8)}")
print(f"Third statement execution time = {timeit.timeit(stmt=stmt3, setup=setup3, number=10**8)}")
Results:
First statement execution time = 14.456886599999999
Second statement execution time = 6.671141799999999
Third statement execution time = 21.8327342
From the initial results it looks like struct is the fastest way to accomplish this. Unless there are other libraries I am missing.
EDIT:
Per AKX's suggestion I added the following test for signed byte:
stmt4 = """\
if byteArray[0] <= 127:
    byteArray[0]
else:
    byteArray[0] - 256
"""
and got the following execution time:
Fourth statement execution time = 4.581732600000002
This approach is the fastest, although only slightly faster than using struct. I will have to test each type to find the fastest way to convert to and from bytes, but this question has given me four different approaches to test. Thanks!
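Since struct came out well in the timings above, one way to cover all the C# integral types without a class per type is a table of precompiled struct.Struct objects. This is only a sketch: the `CSHARP_TYPES` table and the two helper functions are hypothetical names, little-endian order is assumed to match the byteorder='little' used in the question, and it has not been benchmarked against the other approaches.

```python
import struct

# C# integral type -> precompiled little-endian struct format.
CSHARP_TYPES = {
    'sbyte':  struct.Struct('<b'),
    'byte':   struct.Struct('<B'),
    'short':  struct.Struct('<h'),
    'ushort': struct.Struct('<H'),
    'int':    struct.Struct('<i'),
    'uint':   struct.Struct('<I'),
    'long':   struct.Struct('<q'),
    'ulong':  struct.Struct('<Q'),
}

def to_bytes(type_name, value):
    """Encode `value`; struct.error is raised if it is out of range."""
    return CSHARP_TYPES[type_name].pack(value)

def from_bytes(type_name, data):
    """Decode `data`; struct.error is raised if the length is wrong."""
    return CSHARP_TYPES[type_name].unpack(data)[0]

print(from_bytes('sbyte', bytearray.fromhex('1E')))   # 30
print(from_bytes('sbyte', b'\x80'))                   # -128
print(to_bytes('short', -2))                          # b'\xfe\xff'
```

The range and length checks that the sbyte class performs by hand are done by struct itself here, which raises struct.error in both cases.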

How to index an unpickled list?

I'm making a LAN multiplayer arena shooter using pygame and sockets, and I'm having trouble transferring pickled object data from server to client. I have 2 objects, player and projectile (the bullets). I don't know how to send multiple objects at once, so I decided to put the 2 objects in a list and pickle them. But when unpickling, I can't index the list, as I keep getting 'EOFError: Ran out of input'.
So I want to unpickle the list that I receive and separate out the 2 objects in that list, but Python won't let me index the list after I unpickle it. Any help would be much appreciated.
Here's my code:
#instantiating instances for Player and Projectile classes
players=[Player(450,400,"sprite1.png",1),Player(500,400,"sprite2.png",2)]
bullets=[Projectile(50,50,5,"right","projectile.png",0)]
def threaded_client(conn, player):
    player_data=(players[player])
    bullet_data=(bullets[player])
    alldata=[player_data,bullet_data] #putting the 2 objects in a list.
    conn.send(pickle.dumps(alldata)) #pickling list
    reply = ""
    while True:
        try:
            alldata = pickle.loads(conn.recv(2048))
            players[player] = alldata[0]
...
self.client.connect(self.addr)
alldata=pickle.loads(self.client.recv(2048)) #unpickling the list
return alldata[0] #trying to return the first object

You need to make arrangements to ensure that you have the entire object before you unpickle. You're doing a conn.recv(XXX) but that does not mean you actually received all XXX bytes. On success, it means you got somewhere between 1 and XXX bytes (inclusive). If it's a small buffer, you often get the entire thing in one chunk but you should never count on that.
Generally, you'll want to send the byte count in a fixed-size binary format (typically using the struct module), then after retrieving the byte count, keep receiving until you got all the expected bytes or you get an error (i.e. your peer disconnected).
Something like this on the sending side:
import pickle
import struct
pickled_bytes = pickle.dumps(thing_youre_sending)
p_size = len(pickled_bytes) # Size of pickled buffer
p_size_buf = struct.pack("!I", p_size) # Packed binary size (4 byte field)
conn.sendall(p_size_buf) # Send length (Note sendall!)
conn.sendall(pickled_bytes) # Send actual pickled object
On the receiving side, you'll do something like this:
import pickle
import struct
...
class ClientDisconnected(Exception):
    """ Raised when the peer closes the connection mid-message """
def recv_all(conn, rlen):
    """ Function to receive all bytes """
    recvd = 0
    buf = b''
    while recvd < rlen:
        rbuf = conn.recv(rlen - recvd)
        if not rbuf:
            # Client disconnected. Handle error in whatever way makes sense
            raise ClientDisconnected()
        recvd += len(rbuf)
        buf += rbuf
    return buf
...
p_size_buf = recv_all(conn, 4) # Receive entire binary length field
p_size = struct.unpack("!I", p_size_buf)[0] # (unpack returns a tuple)
pickled_bytes = recv_all(conn, p_size) # Receive actual pickled object
thing_you_sent = pickle.loads(pickled_bytes)

unable to unpack information between custom Preamble in Python and telnetlib

I have an industrial sensor which provides me information via telnet over port 10001.
It has a Data Format as follows:
Also the manual:
All the measuring values are transmitted int32 or uint32 or float depending on the sensors
Code
import telnetlib
import struct
import time
# IP Address, Port, timeout for Telnet
tn = telnetlib.Telnet("169.254.168.150", 10001, 10)
while True:
    op = tn.read_eager() # currently read information limit this till preamble
    print(op[::-1]) # make little-endian
    if not len(op[::-1]) == 0: # initially an empty bit starts (b'')
        data = struct.unpack('!4c', op[::-1]) # unpacking `MEAS`
    time.sleep(0.1)
my initial attempt:
Connect to the sensor
read data
make it to little-endian
OUTPUT
b''
b'MEAS\x85\x8c\x8c\x07\xa7\x9d\x01\x0c\x15\x04\xf6MEAS'
b'\x04\xf6MEAS\x86\x8c\x8c\x07\xa7\x9e\x01\x0c\x15\x04\xf6'
b'\x15\x04\xf6MEAS\x85\x8c\x8c\x07\xa7\x9f\x01\x0c\x15'
b'\x15\x04\xf6MEAS\x87\x8c\x8c\x07\xa7\xa0\x01\x0c'
b'\xa7\xa2\x01\x0c\x15\x04\xf6MEAS\x87\x8c\x8c\x07\xa7\xa1\x01\x0c'
b'\x8c\x07\xa7\xa3\x01\x0c\x15\x04\xf6MEAS\x87\x8c\x8c\x07'
b'\x88\x8c\x8c\x07\xa7\xa4\x01\x0c\x15\x04\xf6MEAS\x88\x8c'
b'MEAS\x8b\x8c\x8c\x07\xa7\xa5\x01\x0c\x15\x04\xf6MEAS'
b'\x04\xf6MEAS\x8b\x8c\x8c\x07\xa7\xa6\x01\x0c\x15\x04\xf6'
b'\x15\x04\xf6MEAS\x8a\x8c\x8c\x07\xa7\xa7\x01\x0c\x15'
b'\x15\x04\xf6MEAS\x88\x8c\x8c\x07\xa7\xa8\x01\x0c'
b'\x01\x0c\x15\x04\xf6MEAS\x88\x8c\x8c\x07\xa7\xa9\x01\x0c'
b'\x8c\x07\xa7\xab\x01\x0c\x15\x04\xf6MEAS\x8b\x8c\x8c\x07\xa7\xaa'
b'\x8c\x8c\x07\xa7\xac\x01\x0c\x15\x04\xf6MEAS\x8c\x8c'
b'AS\x89\x8c\x8c\x07\xa7\xad\x01\x0c\x15\x04\xf6MEAS\x8a'
b'MEAS\x88\x8c\x8c\x07\xa7\xae\x01\x0c\x15\x04\xf6ME'
b'\x15\x04\xf6MEAS\x87\x8c\x8c\x07\xa7\xaf\x01\x0c\x15\x04\xf6'
b'\x15\x04\xf6MEAS\x8a\x8c\x8c\x07\xa7\xb0\x01\x0c'
b'\x0c\x15\x04\xf6MEAS\x8a\x8c\x8c\x07\xa7\xb1\x01\x0c'
b'\x07\xa7\xb3\x01\x0c\x15\x04\xf6MEAS\x89\x8c\x8c\x07\xa7\xb2\x01'
b'\x8c\x8c\x07\xa7\xb4\x01\x0c\x15\x04\xf6MEAS\x89\x8c\x8c'
b'\x85\x8c\x8c\x07\xa7\xb5\x01\x0c\x15\x04\xf6MEAS\x84'
b'MEAS\x87\x8c\x8c\x07\xa7\xb6\x01\x0c\x15\x04\xf6MEAS'
b'\x04\xf6MEAS\x8b\x8c\x8c\x07\xa7\xb7\x01\x0c\x15\x04\xf6'
b'\x15\x04\xf6MEAS\x8b\x8c\x8c\x07\xa7\xb8\x01\x0c\x15'
b'\x15\x04\xf6MEAS\x8a\x8c\x8c\x07\xa7\xb9\x01\x0c'
b'\xa7\xbb\x01\x0c\x15\x04\xf6MEAS\x87\x8c\x8c\x07\xa7\xba\x01\x0c'
try to unpack the preamble !?
How do I read information like Article number, Serial number, Channel, Status, Measuring Value between the preamble?
The payload size seems to be fixed here for 22 Bytes (via Wireshark)

Parsing the reversed buffer is just weird; please use struct's support for endianness. Using big-endian '!' in a little-endian context is also odd.
The first four bytes are a text constant. Ok, fine perhaps you'll need to reverse those. But just those, please.
After that, use struct.unpack to parse out 'IIQI'. So far, that was kind of working OK with your approach, since all fields consume 4 bytes or a pair of 4 bytes. But finding frame M's length is the fly in the ointment since it is just 2 bytes, so parse it with 'H', giving you a combined 'IIQIH'. After that, you'll need to advance by only that many bytes, and then expect another 'MEAS' text constant once you've exhausted that set of measurements.
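Because reads from the socket do not line up with frame boundaries (as the question's output shows), one way to "expect another preamble" is to accumulate bytes in a buffer and cut it at each preamble. The sketch below does only that; `extract_frames` is a hypothetical helper, the preamble is assumed to appear on the wire as b'SAEM' (the question's printed output is reversed), and the test payload bytes are made up. It also assumes the preamble bytes never occur inside the measurement data, which the sensor manual would need to confirm.

```python
def extract_frames(buffer, preamble=b'SAEM'):
    """Cut complete frames out of `buffer` (a bytearray), in place.

    A frame is taken to be the bytes between two consecutive preambles.
    Bytes before the first preamble are discarded; the last, possibly
    incomplete frame stays in the buffer until more data arrives.
    """
    frames = []
    start = buffer.find(preamble)
    if start < 0:
        # No preamble yet: keep only a tail that could be a split preamble.
        keep = len(preamble) - 1
        if len(buffer) > keep:
            del buffer[:len(buffer) - keep]
        return frames
    del buffer[:start]                       # drop bytes before the preamble
    while True:
        nxt = buffer.find(preamble, len(preamble))
        if nxt < 0:
            return frames                    # wait for the next preamble
        frames.append(bytes(buffer[len(preamble):nxt]))
        del buffer[:nxt]

# Made-up chunks that split a frame and a preamble across reads.
buf = bytearray()
received = []
for chunk in (b'\x01\x02SAEM\xaa\xbb', b'\xccSA', b'EM\xdd\xeeSAEM'):
    buf.extend(chunk)
    received.extend(extract_frames(buf))
print(received)   # [b'\xaa\xbb\xcc', b'\xdd\xee']
```

Each returned frame can then be handed to struct.unpack_from with the little-endian layout from the sensor manual.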

I managed to avoid telnetlib altogether and created a TCP client using Python 3. I already had the payload size from my Wireshark dump (22 bytes), so I keep receiving 22 bytes at a time. Apparently the module sends two distinct 22-byte payloads:
The first (frame) payload has the preamble, serial, article, and channel information.
The second (frame) payload has information such as bytes per frame, measuring value counter, and the measuring values for channels 1, 2, and 3.
The values are int32 and need a formula to be converted to real readings (given in the instruction manual).
(The unpacking follows @J_H's answer, with small changes.)
Code
import socket
import time
import struct
DRANGEMIN = 3261
DRANGEMAX = 15853
MEASRANGE = 50
OFFSET = 35
# Create a TCP/IP socket
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server_address = ('169.254.168.150', 10001)
print('connecting to %s port %s' % server_address)
sock.connect(server_address)
def value_mm(raw_val):
    return (((raw_val - DRANGEMIN) * MEASRANGE) / (DRANGEMAX - DRANGEMIN) + OFFSET)
if __name__ == '__main__':
    while True:
        Laser_Value = 0
        data = sock.recv(22)
        preamble, article, serial, x1, x2 = struct.unpack('<4sIIQH', data)
        if not preamble == b'SAEM':
            status, bpf, mValCounter, CH1, CH2, CH3 = struct.unpack('<hIIIII',data)
            #print(CH1, CH2, CH3)
            Laser_Value = CH3
            print(str(value_mm(Laser_Value)) + " mm")
            #print('RAW: ' + str(len(data)))
            print('\n')
            #time.sleep(0.1)
Sure enough, this gives me the information I need; I verified it against the proprietary software that the company provides.

Deserializing byte stream to objects

I am building a python project that receives bytes from a serial port. The bytes are responses to commands sent (also via serial port). The responses have no identifying marks, i.e. from the bytes alone, I don't know which command response this is. The decoder would of course need to know in advance which command this is a response to.
I would like to have the incoming byte sequence represented as a nested object, indicating the frame, header, payload, decoded payload, etc. I would much prefer to push 1 byte at a time to the decoder and have it call a callback once it has received enough bytes for a full object (or errorCallback if there are errors or timeout).
The actual frame has a start byte and an end byte. It has a header with a few bytes (some id, command status (basically ok/fail)), one is a data length byte. This is followed by the data which is followed by a checksum (single byte). The data is the response to the command.
The response is predictable in that the previous bytes decide what the coming bytes mean.
Example response:
aa:00:0c:00:01:00:00:d3:8d:d4:5c:50:01:04:e0:6e:bb
Broken down:
aa: start frame
00: id
0c: data length (incl status): 12 bytes
00: command status (byte 1)
01: 1 data frame (byte 2)
00:00: flags of first data frame (byte 3-4)
d3:8d:d4:5c:50:01:04:e0: first data (aa and bb could be in it) (byte 5-12)
6e: checksum (includes all bytes except start, checksum, end bytes)
bb: end frame
This being serial port communication, bytes may be lost (and extra produced) and I expect to use timeout to handle resets (no responses are expected without first a command being sent).
I really would like an object oriented approach where the decoder would produce an object that when serialized, would produce the same byte sequence again. I am using python 2.7, but really any object oriented language would do (as long as I could convert it to python).
I am just not sure how to structure the decoder to make it neat looking. I am not looking for a full solution, just something that would get me going in the right direction (right direction being somewhat subjective here).

I don't completely understand what you want to do but if you want to receive fixed-length responses from some device and make them attributes of some object, would something like this be okay?:
import time
import serial
START_FRAME = 0xAA
END_FRAME = 0xBB
TIMEOUT = 2
class Response:
    def __init__(self, response):
        # verify that it's a valid response: 7 fixed bytes plus 10 bytes per data frame
        if (len(response) - 7) % 10 == 0 and response[0] == START_FRAME and response[-1] == END_FRAME:
            self.header = {} # build header
            self.header['id'] = response[1]
            self.header['length'] = response[2]
            self.header['status'] = response[3]
            self.r_checksum = response[-2] # get checksum from response
            self.checksum = self.header['id'] ^ self.header['length'] ^ self.header['status'] # start computing the checksum
            self.payload = response[4:-2] # get raw payload slice
            self.decode(self.payload) # parse payload
            if self.r_checksum == self.checksum: # final check
                self.parsed = True
            else:
                self.parsed = False
        else: # if response didn't follow the pattern
            self.parsed = False
    def decode(self, packet): # decode payload
        self.frm_count = packet[0] # get number of data frames
        self.checksum ^= self.frm_count
        self.decoded = [] # hold decoded payload
        frames = packet[1:]
        for c in range(self.frm_count): # iterate through data frames
            flags = frames[(c*10):(c*10 + 2)]
            for f in flags:
                self.checksum ^= f
            data = frames[(c*10 + 2):(c+1)*10]
            for d in data:
                self.checksum ^= d
            self.decoded.append({'frame': c+1, 'flags': flags, 'data':data})
    def serialize(self): # reconstruct response
        res = bytearray()
        res.append(START_FRAME)
        res.extend([self.header['id'], self.header['length'], self.header['status']])
        res.extend(self.payload)
        res.extend([self.checksum, END_FRAME])
        return res
response = bytearray()
ser = serial.Serial('COM3', 9600)
last_read = time.time()
while time.time() - last_read < TIMEOUT: # give up after 2 seconds without data
    while ser.inWaiting() > 0:
        response.extend(ser.read()) # read() returns a single byte
        last_read = time.time()
decoded_res = Response(response)
if decoded_res.parsed:
    pass # do stuff
else:
    print('Invalid response!')
This code assumes there may be more than one data frame, with the data frames immediately preceded by a byte indicating the number of data frames.
Parsing a packet is fast compared to the time taken for serial comms (even at 115200 baud). The whole thing is roughly O(n), I think.
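The question also asked for a decoder that accepts one byte at a time and fires a callback when a frame is complete. A minimal state-machine sketch of that idea follows, written for Python 3. `FrameDecoder` and its callback names are hypothetical, and the checksum is assumed to be the XOR of the id, length, and data bytes; that assumption does reproduce the 0x6e checksum of the example frame in the question, but it is not confirmed by any specification.

```python
START, END = 0xAA, 0xBB

class FrameDecoder:
    """Feed one byte at a time; calls on_frame(dict) or on_error(str)."""

    def __init__(self, on_frame, on_error):
        self.on_frame = on_frame
        self.on_error = on_error
        self.reset()

    def reset(self):
        """Return to the initial state, e.g. after an error or a timeout."""
        self.state = 'start'
        self.body = bytearray()   # id, length and data bytes (checksum input)
        self.length = 0

    def feed(self, byte):
        if self.state == 'start':
            if byte == START:          # ignore noise until a start byte arrives
                self.state = 'id'
        elif self.state == 'id':
            self.body.append(byte)
            self.state = 'length'
        elif self.state == 'length':
            self.body.append(byte)
            self.length = byte
            self.state = 'data' if byte else 'checksum'
        elif self.state == 'data':
            self.body.append(byte)
            if len(self.body) == 2 + self.length:
                self.state = 'checksum'
        elif self.state == 'checksum':
            expected = 0
            for b in self.body:
                expected ^= b          # assumption: XOR of id, length and data
            if byte != expected:
                self.on_error('bad checksum')
                self.reset()
            else:
                self.state = 'end'
        elif self.state == 'end':
            if byte == END:
                self.on_frame({'id': self.body[0],
                               'status': self.body[2] if self.length else None,
                               'data': bytes(self.body[3:])})
            else:
                self.on_error('missing end byte')
            self.reset()

# Feed the example response from the question, byte by byte.
frames, errors = [], []
decoder = FrameDecoder(frames.append, errors.append)
raw = bytes.fromhex('aa:00:0c:00:01:00:00:d3:8d:d4:5c:50:01:04:e0:6e:bb'.replace(':', ''))
for value in raw:
    decoder.feed(value)
print(frames, errors)
```

Because the data state counts bytes rather than looking for 0xBB, start and end values inside the data do not confuse the decoder; a caller that detects a timeout can call reset() before sending the next command.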

Unspecified byte lengths in Python

I'm writing a client for a P2P application at the minute and the spec for the protocol says that the header for each packet should have each field with a particular byte length like so:
Version: 1 Byte
Type: 1 Byte
Length: 2 Bytes
And then the data
I've got the way of packing and unpacking the header fields (I think) like this:
packed = struct.pack('cch' , '1' , '1' , 26)
This constructs a header for a packet with a data length of 26, but when it comes to unpacking the data I'm unsure how to go about getting the rest of the data afterwards. To unpack we need to know the size of all the fields, unless I'm missing something? I guess to pack the data I'd use a format indicator 'cch26s' meaning:
1 Byte char
1 Byte char
2 Byte short
26 Byte char array
But how do I unpack the data when I don't know how much data will be included in the packet first?

The way you're describing the protocol, you should unpack the first four bytes first, and extract Length (a 16-bit int). This tells you how many bytes to unpack in a second step.
version, type, length = struct.unpack("cch", packed[:4])
content, = struct.unpack("%ds" % length, packed[4:])
This is if everything checks out. unpack() requires that the packed buffer contain exactly as much data as you unpack. Also, check whether the 4 header bytes are included in the length count.
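The two-step approach can be sketched in Python 3 as below. This is an illustration under stated assumptions rather than the protocol's actual definition: version and type are treated as unsigned one-byte integers, the length is assumed to be big-endian and to count only the data, and `build_packet` and `parse_packet` are hypothetical helper names.

```python
import struct

# version (1 byte), type (1 byte), data length (2 bytes, assumed big-endian)
HEADER = struct.Struct('!BBH')

def build_packet(version, msg_type, payload):
    """Prefix `payload` (bytes) with the 4-byte header."""
    return HEADER.pack(version, msg_type, len(payload)) + payload

def parse_packet(packet):
    """Read the header first, then slice out exactly `length` data bytes."""
    version, msg_type, length = HEADER.unpack_from(packet, 0)
    payload = packet[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError('packet is shorter than its header claims')
    return version, msg_type, payload

packet = build_packet(1, 1, b'abcdefghijklmnopqrstuvwxyz')
print(parse_packet(packet))   # (1, 1, b'abcdefghijklmnopqrstuvwxyz')
```

Slicing the data out directly avoids building a second format string, and unpack_from tolerates extra bytes after the header where unpack would raise an error.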

You can surmise the number of characters to unpack by inspecting len(data).
Here is a helper function which does this for you:
def unpack(fmt, astr):
    """
    Return struct.unpack(fmt, astr) with the optional single * in fmt replaced with
    the appropriate number, given the length of astr.
    """
    # http://stackoverflow.com/a/7867892/190597
    try:
        return struct.unpack(fmt, astr)
    except struct.error:
        flen = struct.calcsize(fmt.replace('*', ''))
        alen = len(astr)
        idx = fmt.find('*')
        before_char = fmt[idx-1]
        # integer division, so that n is an int on Python 3 as well
        n = (alen-flen)//struct.calcsize(before_char)+1
        fmt = ''.join((fmt[:idx-1], str(n), before_char, fmt[idx+1:]))
        return struct.unpack(fmt, astr)
You can use it like this:
unpack('cchs*', data)
