I am trying to implement a custom UDP protocol using Python sockets. Each packet has a header and data: the header, defined as a ctypes struct, describes the structure of the data, which is stored in an 'h' (signed short) array.
I tried sending the header and data using separate calls to "socket.sendto", like this:
s.sendto(header, addr)
s.sendto(data, addr)
But I am not able to receive this as a single datagram: "socket.recvfrom" only fetches the header. If I called "socket.recvfrom" again I would presumably get the data as well, but that's not what I need; I need the full packet in one datagram. I think concatenating the header and data on the sending side might fix this.
So I tried different combinations of the following to concatenate the two:
Converting the header to a "c_char_p" array.
Converting the data to a bytearray.
Using numpy.concatenate.
Standard array concatenation after converting the header to a 'B' array.
Converting the header to an 'h' array. (This failed with "string length not a multiple of item size".)
All of the above failed for one reason or another.
If I need to convert either the header or the data, I would prefer it to be the header, as it is smaller. But if there is no way around it, I am OK with converting the data.
Would appreciate any help.
Relevant code snippets:
sdata = array("h")
.
.
.
header=protocol.HEADER(1,50,1,50,1,50,1,50,1,10,1,12,1,1,1,1,1,18,19,10,35,60,85,24,25)
.
.
s.sendto(header+sdata, addr)
You can copy the header struct into a ctypes array of bytes:
>>> buf = (ctypes.c_char * ctypes.sizeof(header)).from_buffer_copy(header)
Now, in Python 2,
>>> buf.raw + sdata.tostring()
should give you what you're looking for.
In Python 3, it would be
>>> buf.raw + sdata.tobytes()
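Putting it all together, a minimal end-to-end sketch (the HEADER layout here is a hypothetical stand-in for protocol.HEADER):

import ctypes
import socket
from array import array

class HEADER(ctypes.Structure):
    # Hypothetical layout; the real one comes from the protocol module
    _fields_ = [("msg_type", ctypes.c_uint16),
                ("count", ctypes.c_uint16)]

header = HEADER(1, 3)
sdata = array("h", [10, 20, 30])

# Copy the struct into a ctypes byte buffer, then append the array bytes
buf = (ctypes.c_char * ctypes.sizeof(header)).from_buffer_copy(header)
packet = buf.raw + sdata.tobytes()  # Python 3

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.sendto(packet, ("127.0.0.1", 9999))  # one datagram: header + data

In Python 3 you can also skip the intermediate buffer entirely, since bytes(header) returns the struct's raw bytes.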
In my project, I am using WebRTC to connect two clients using the aiortc package.
I am using this example code and it works, but it seems I can't send non-string data over the data channel.
This is what I send in the data channel (modified code in the start function of the client.js file):
dc.onopen = function() {
    dataChannelLog.textContent += '- open\n';
    dcInterval = setInterval(function() {
        let message = new DataObject(/*All the parameters*/);
        dataChannelLog.textContent += '> ' + message + '\n';
        dc.send(message);
    }, 100);
};
Where DataObject is a class I created that contains the data I want to send.
The Python client receives [object Object] as a string. I expected it would send the bytes representing the object, which I could convert back into a normal class in Python.
I know that a workaround is to convert the object to a string format (like JSON), but I would prefer not to, because I am sending the objects very frequently (and every object contains a large array), and I am sure it would lead to performance issues.
So my question is: how can I send the object through the data channel without converting it to a string?
EDIT: If it helps, I can use an array instead of an object to represent my data. But again, it is still sent and received as a string.
You need some sort of serializer function to convert your JavaScript object into a stream of bytes; those bytes don't have to be readable as text. You can't just send a JavaScript object.
The built-in robust and secure serializer is, of course, JSON.stringify(). As you've pointed out, JSON is a verbose format.
To avoid using JSON, you'll need to write your own serializer in JavaScript and deserializer in Python. Those will most likely be custom code for your particular object type. For best results, copy the attributes of your object, one by one, into a Uint8Array, then send that.
You didn't tell us anything about your object, so it's hard to help you further.
If this were my project I'd get everything working with JSON and then do the custom serialization as a performance enhancement.
Thanks o-jones for the detailed answer.
In my case it was fairly simple, because I was able to represent all my data as an array.
The main issue I had is that I didn't know the send function has an "overload" that accepts a byte array…
After realizing that, I created a Float32Array in JavaScript to hold my data and sent it.
On the Python side, I read this data and converted it to a float array using the struct.unpack function.
Something like this:
JavaScript side:
dc.onopen = function() {
    dataChannelLog.textContent += '- open\n';
    dcInterval = setInterval(function() {
        let dataObj = new DataObject(/*All the parameters*/);
        let data = new Float32Array(3); // Build an array (in my case, 3 floats)
        // Insert the data into the array
        data[0] = dataObj.value1;
        data[1] = dataObj.value2;
        data[2] = dataObj.value3;
        // Send the data
        dc.send(data);
    }, 100);
};
Python side:
import struct

def message_received(message: str | bytes) -> None:
    if isinstance(message, str):
        return  # Don't handle string messages
    # Read 3 floats from the bytes we received
    data = struct.unpack('3f', message)
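One caveat worth noting: '3f' unpacks using the receiving machine's native byte order, while Float32Array writes in the sender's native order, which is little-endian on virtually all platforms running a browser. If the two ends might disagree, pin the byte order explicitly:

data = struct.unpack('<3f', message)  # '<' forces little-endian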
I am reading data from a socket (UDP), which will be in network byte order (always big-endian, per IETF RFC 1700), modifying some parameters, and then re-transmitting it on another socket. To avoid making lots of copies of the bytearray (data is coming in very frequently), I have been using a memoryview object and passing it between functions so the data is modified in place:
full_msg, addr = in_port.recvfrom(4096)
data = memoryview(full_msg)
if data:
    msg_id = data[:4]
    if msg_id == b'QRT1':
        Controller.QuaternionMsg.process(data)
    elif msg_id == b'QRT2':
        # etc.
The data I am processing is a series of 4D float vectors, so the easiest way to read them is to cast the memoryview so it reads them as floats:
>>> msg_data = data[4:].cast('f')
>>> msg_data[0]
0.12345
However, because the native byte order on my PC is Intel (little-endian), these are being read wrongly. I tried a syntax similar to the struct module's, but it isn't supported:
>>> msg_data = data[4:].cast('!f')
ValueError: memoryview: destination format must be a native single character format prefixed with an optional '#'
Is there a way to force the memoryview object to read this in a particular byte order? I know that I could use the struct module and make copies, but I am trying to avoid that if possible.
Another Option
I am also looking at numpy as an option, but I'm worried about any additional overhead it might bring. If I used:
network_float = np.dtype(float)
network_float = network_float.newbyteorder('>')
data = np.frombuffer(full_msg, dtype=network_float, offset=4)  # maybe not writeable?
would this allow me to do the same thing without much extra overhead, or does np.frombuffer make a copy if it's not pointed at an existing numpy array?
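For what it's worth, a quick self-contained check shows that np.frombuffer does not copy; it returns a read-only view over the buffer's memory:

import numpy as np

# Build a fake message: 4-byte ID followed by big-endian doubles
full_msg = b'QRT1' + np.arange(4, dtype='>f8').tobytes()

network_float = np.dtype(float).newbyteorder('>')
data = np.frombuffer(full_msg, dtype=network_float, offset=4)

print(data.flags['OWNDATA'])    # False: the array shares full_msg's memory
print(data.flags['WRITEABLE'])  # False: bytes objects are immutable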
I'm trying to serialize a Record in Delphi using MessagePack and then send it to a Python server over ZeroMQ's TCP transport. This is the received payload:
b'\xa8DataType\x01\xa4data\xbf{"major":1,"minor":0,"build":2}\x00'
I'm having trouble deserializing it on the server side. Any ideas why this is happening? Is it some kind of encoding issue? Thanks!
Update #1:
I use the MessagePack library "QMsgPack" recommended on www.msgpack.org. Here's some Delphi code. My user-defined records and an enum:
Version = Record
  major : Integer;
  minor : Integer;
  build : Integer;
end;

TDataType = (dtUnset, dtVersion, dtEntityDescription, dtEntityDescriptionVector, dtEntityState, dtEntityStateVector, dtCommandContinue, dtCommandComplete);

TPacket = Record
  DataType : TDataType;
  data : string;
end;
And the code to serialize the object:
begin
  dmVersion.major := 1;
  dmVersion.minor := 1;
  dmVersion.build := 1;
  lvMsg := TQMsgPack.Create;
  lvMsg.FromRecord(dmVersion);
  lvMsgString := lvMsg.ToString();
  packet.DataType := dtVersion;
  packet.data := lvMsgString;
  lvMsg.Clear;
  lvMsg.FromRecord(packet);
  lvbytes := lvMsg.Encode;
  ZeroMQ.zSendByteArray(skt, lvbytes);
I then try to deserialize the received byte array in the Python server. It looks like this:
b'\xa8DataType\x01\xa4data\xbf{"major":1,"minor":0,"build":2}\x00'
by using umsgpack.unpackb() and then printing the result like this:
packet_packed = command.recv()
# Unpack the packet
umsgpack.compatibility = True
packet = umsgpack.unpackb(packet_packed)
print(packet)
for item in packet:
    print(item)
and this is what I get printed out on the screen:
b'DataType'
68
97
116
97
84
121
112
101
I hope this helps! Thanks!
Update #2
Here is some server code on the Python side. VDS_PACKET_VERSION is a constant int set to 1.
# Make sure it's a version packet
if VDS_PACKET_VERSION == packet[0]:
    # Unpack the data portion of the packet
    version = umsgpack.unpackb(packet[1])
    roster = []
    if (VDS_VERSION_MAJOR == version[0]) and (VDS_VERSION_MINOR == version[1]) and (VDS_VERSION_BUILD == version[2]):
        dostuff()
With the current serialized object
b'\x82\xa8DataType\x01\xa4data\xbf{"major":1,"minor":1,"build":1}'
I get
KeyError: 0 on packet[0]
Why is that?
The packed data appears to be invalid.
>>> packet = { "DataType": 1, "data": "{\"major\":1,\"minor\":0,\"build\":2}"}
>>> umsgpack.packb(packet)
b'\x82\xa4data\xbf{"major":1,"minor":0,"build":2}\xa8DataType\x01'
The first byte is \x82 which, as can be seen in the specification, is a two entry fixmap.
Your packed data is missing that information, and launches straight into a fixstr. So, yes, there could be a mismatch between your Delphi-based packer and the Python-based unpacker. However, when I take your Delphi code, using the latest qmsgpack from the repo, it produces the following bytes:
82A8446174615479706501A464617461
BF7B226D616A6F72223A312C226D696E
6F72223A312C226275696C64223A317D
Let's convert that into a Python bytes object. It looks like this:
b'\x82\xa8DataType\x01\xa4data\xbf{"major":1,"minor":1,"build":1}'
Now, that's quite different from what you report. And umsgpack can unpack it just fine. Note that the first byte is \x82, a two entry fixmap, just as expected. Yes, the entries are in a different order, but that's just fine. Order is not significant for a map.
So, I've been able to encode using qmsgpack in Delphi, and decode using umsgpack in Python. Which then suggests that this issue is really in the transmission. It looks to me as though there has been an off-by-one error. Instead of transmitting bytes 0 to N-1, bytes 1 to N have been transmitted. Note the spurious trailing zero in your received data.
In the comments you observe that the data field is being encoded as JSON and passed as a string, but you'd rather have that data encoded using MessagePack. So here's what to do:
1. In the Delphi code, change the data field's type from string to TBytes. That's because we are going to put a byte array in there.
2. Populate the data field using Encode, like this: packet.data := lvMsg.Encode.
3. On the Python side, when you unpack data you'll find that it is an array of integers. Convert that to bytes and then unpack: umsgpack.unpackb(bytes(data)).
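As for the KeyError: 0 in Update #2: the outer object decodes to a map (a Python dict keyed by 'DataType' and 'data'), not an array, so packet[0] has no entry; index it by key instead. A rough sketch of the receiving side under the TBytes scheme above (the keys become bytes, e.g. b'DataType', if compatibility mode is on):

import umsgpack

def unpack_packet(packet_packed):
    outer = umsgpack.unpackb(packet_packed)  # dict: {'DataType': ..., 'data': [...]}
    # qmsgpack encodes TBytes as an array of integers, so rebuild the
    # bytes object before the inner unpack
    version = umsgpack.unpackb(bytes(outer['data']))
    return outer['DataType'], version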
I need to send an array of namedtuples over a socket.
To create the array of namedtuples I use the following:
listaPeers = []
for i in range(200):
    ipPuerto = collections.namedtuple('ipPuerto', 'ip, puerto')
    ipPuerto.ip = "121.231.334.22"
    ipPuerto.puerto = "8988"
    listaPeers.append(ipPuerto)
Now that it is filled, I need to pack the 200-element "listaPeers".
How can I do it?
Something like this?:
packedData = struct.pack('XXXX',listaPeers)
First of all, you are using namedtuple incorrectly. It should look something like this:
# ipPuerto is a type
ipPuerto = collections.namedtuple('ipPuerto', 'ip, puerto')
# theTuple is a tuple object
theTuple = ipPuerto("121.231.334.22", "8988")
As for packing, it depends on what you want to use on the other end. If the data will be read by Python, you can just use the pickle module:
import cPickle as Pickle
pickledTuple = Pickle.dumps(theTuple)
You can pickle the whole array of them at once.
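A minimal Python 3 version of that approach (pickle replaces cPickle there), reusing the asker's field names:

import collections
import pickle

ipPuerto = collections.namedtuple('ipPuerto', 'ip, puerto')
listaPeers = [ipPuerto("121.231.334.22", "8988") for _ in range(200)]

payload = pickle.dumps(listaPeers)   # bytes, ready to send over the socket
restored = pickle.loads(payload)     # back to a list of ipPuerto tuples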
It is not that simple - yes, for integers and simple numbers, it is possible to pack straight from named tuples to the binary data provided by the struct module.
However, you are holding your data as strings, not as numbers. Converting the port is simple - it is a plain integer - but the IP requires some juggling:
def ipv4_from_str(ip_str):
    parts = ip_str.split(".")
    result = 0
    for part in parts:
        result <<= 8
        result += int(part)
    return result

def ip_puerto_gen(list_of_ips):
    for ip_puerto in list_of_ips:
        yield ipv4_from_str(ip_puerto.ip)
        yield int(ip_puerto.puerto)

def pack(list_of_ips):
    return struct.pack(">" + "II" * len(list_of_ips),
                       *ip_puerto_gen(list_of_ips))
You can then use the "pack" function from here to pack your structure as you seem to want.
But first, note that you are creating your "listaPiers" incorrectly (your example code will simply fail with an IndexError) - use an empty list, and the append method on it, to insert a new namedtuple with an ip/port pair as each element:
listaPiers = []
ipPuerto = collections.namedtuple('ipPuerto', 'ip, puerto')
for x in range(200):
    new_element = ipPuerto("123.123.123.123", "8192")
    listaPiers.append(new_element)

data = pack(listaPiers)
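For completeness (not part of the original answer), a sketch of the inverse on the receiving side:

import struct

def unpack_ips(payload):
    # Inverse of pack() above: read back alternating (ip, port) uint32s
    count = len(payload) // 8
    numbers = struct.unpack(">" + "II" * count, payload)
    for ip_int, port in zip(numbers[0::2], numbers[1::2]):
        ip = ".".join(str((ip_int >> shift) & 0xFF) for shift in (24, 16, 8, 0))
        yield ip, port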
ISTR (I seem to recall) that pickle is considered insecure in server processes, if the server process is receiving pickled data from untrusted clients.
You might want to come up with some sort of separator character(s) for the records and fields (perhaps \0 and \001, or \376 and \377). Then putting together a message is like a text file broken up into records and fields separated by spaces and newlines. Or, for that matter, you could use spaces and newlines, if your normal data doesn't include these.
I find this module very valuable for framing data in socket-based protocols:
http://stromberg.dnsalias.org/~strombrg/bufsock.html
It lets you do things like "read up until the next null byte" or "read the next 10 characters" - without needing to worry about the complexities of IP aggregating or splitting packets.
I have a socket opened and I'd like to read some JSON data from it. The problem is that the json module from the standard library can only parse from strings (load just reads the whole file and calls loads internally). It even looks like everything inside the module depends on the parameter being a string.
This is a real problem with sockets, since you can never read it all into a string, and you don't know how many bytes to read before you actually parse it.
So my questions are: Is there a (simple and elegant) workaround? Is there another JSON library that can parse data incrementally? Is it worth writing one myself?
Edit: It is the XBMC JSON-RPC API. There are no message envelopes, and I have no control over the format. Each message may be on a single line or spread over several lines.
I could write a simple parser that needs only a getc-style function in some form and feed it using s.recv(1), but that doesn't seem like a very pythonic solution and I'm a little lazy to do that :-)
Edit: given that you aren't defining the protocol, this isn't useful to you, but it might be useful in other contexts.
Assuming it's a stream (TCP) socket, you need to implement your own message framing mechanism (or use an existing higher-level protocol that does so). One straightforward way is to define each message as a 32-bit integer length field, followed by that many bytes of data.
Sender: take the length of the JSON packet, pack it into 4 bytes with the struct module, send it on the socket, then send the JSON packet itself.
Receiver: repeatedly read from the socket until you have at least 4 bytes of data, then use struct.unpack to unpack the length. Read from the socket until you have at least that much data; that's your JSON packet, and anything left over belongs to the next message.
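A minimal sketch of this framing scheme (the function names are illustrative):

import json
import struct

def send_json(sock, obj):
    # 4-byte big-endian length header, then the JSON payload
    payload = json.dumps(obj).encode('utf-8')
    sock.sendall(struct.pack('>I', len(payload)) + payload)

def recv_exactly(sock, n):
    buf = b''
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError('socket closed mid-message')
        buf += chunk
    return buf

def recv_json(sock):
    (length,) = struct.unpack('>I', recv_exactly(sock, 4))
    return json.loads(recv_exactly(sock, length))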
If at some point you're going to want to send messages that consist of something other than JSON over the same socket, you may want to send a message type code between the length and the data payload; congratulations, you've invented yet another protocol.
Another, slightly more standard, method is DJB's Netstrings protocol; it's very similar to the system proposed above, but with text-encoded lengths instead of binary; it's directly supported by frameworks such as Twisted.
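For reference, a netstring frames a payload as "<decimal length>:<payload>,"; a one-function encoder sketch:

def netstring_encode(payload: bytes) -> bytes:
    # b'hello' -> b'5:hello,'
    return str(len(payload)).encode('ascii') + b':' + payload + b','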
If you're getting the JSON from an HTTP stream, use the Content-Length header to get the length of the JSON data. For example:
import httplib  # Python 2; use http.client in Python 3
import json

h = httplib.HTTPConnection('graph.facebook.com')
h.request('GET', '/19292868552')
response = h.getresponse()
content_length = int(response.getheader('Content-Length', '0'))

# Read data until we've read Content-Length bytes or the socket is closed
data = ''
while content_length == 0 or len(data) < content_length:
    if content_length:
        s = response.read(content_length - len(data))
    else:
        s = response.read(4096)  # length unknown: read until the socket closes
    if not s:
        break
    data += s

# We now have the full data -- decode it
j = json.loads(data)
print j
What you want(ed) is ijson, an incremental JSON parser.
It is available here: https://pypi.python.org/pypi/ijson/ . Usage is as simple as (adapted from that page):

import ijson.backends.python as ijson

# the prefix argument selects which nodes in the document to yield
for item in ijson.items(file_obj, prefix):
    # ...
(For those who prefer something self-contained, in the sense that it relies only on the standard library: I wrote a small wrapper around json yesterday - but only because I didn't know about ijson. It is probably much less efficient.)
EDIT: since I found out that (a cythonized version of) my approach was in fact much more efficient than ijson, I have packaged it as an independent library - see here, also for some rough benchmarks: http://pietrobattiston.it/jsaone
Do you have control over the JSON? Try writing each object on a single line. Then do a readline call on the socket, as described here:
infile = sock.makefile()
while True:
    line = infile.readline()
    if not line:
        break
    # ...
    result = json.loads(line)
Skimming the XBMC JSON RPC docs, I think you want an existing JSON-RPC library - you could take a look at:
http://www.freenet.org.nz/dojo/pyjson/
If that's not suitable for whatever reason, it looks to me like each request and response is contained in a JSON object (rather than a loose JSON primitive that might be a string, array, or number), so the envelope you're looking for is the '{ ... }' that defines a JSON object.
I would, therefore, try something like (pseudocode):

while not dead:
    read from the socket and append it to a string buffer
    set a depth counter to zero
    walk each character in the string buffer:
        if you encounter a '{':
            increment depth
        if you encounter a '}':
            decrement depth
            if depth is zero:
                remove what you have read so far from the buffer
                pass that to json.loads()
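In real Python, that might look something like this (a rough sketch, assuming the stream starts on an object boundary and that no braces appear inside JSON string values):

import json

def read_json_objects(sock):
    # Naive brace-counting framer following the pseudocode above
    buf = ''
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            return
        buf += chunk.decode('utf-8')
        depth = 0
        start = 0
        for i, ch in enumerate(buf):
            if ch == '{':
                depth += 1
            elif ch == '}':
                depth -= 1
                if depth == 0:
                    # A complete object: parse it and mark it consumed
                    yield json.loads(buf[start:i + 1])
                    start = i + 1
        buf = buf[start:]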
You may find JSON-RPC useful for this situation. It is a remote procedure call protocol that should allow you to call the methods exposed by the XBMC JSON-RPC. You can find the specification on Trac.
res = str(s.recv(4096), 'utf-8')  # Decode the response into a string
res_lines = res.splitlines()      # Split the string into a list of lines
last_line = res_lines[-1]         # Normally, the last one is the JSON data
pair = json.loads(last_line)
https://github.com/A1vinSmith/arbitrary-python/blob/master/sockets/loopHost.py