We (as a project group) are currently stuck on the issue of how to handle live data sent to our server.
We receive data updates every second, and we would like to insert them into our database (security is currently not an issue, because it is a school project). We tried Python's SocketServer and asyncio modules to create a TCP server to which the data can be sent.
We got this working with different libraries, but we are stuck on the fact that if we keep an open connection with the client (in this case hardware which sends data every second), we can't split the different JSON or XML messages; they all run together.
We know why: TCP only guarantees ordering, not message boundaries.
Any thoughts on how to handle this, so that every message sent can be separated from the others?
Recreating the socket won't be the right option if I recall correctly.
What you will have to do is ensure that there is a clear delimiter for each message. For example, the first 6 characters of every message could be the length of the message: whatever reads from the socket decodes the length, then reads that number of bytes and passes the data to whatever needs it. Another way would be, if there is a character/byte which never appears in the content, to send it immediately before a message - for example control-A (binary value 1) could be the leadin character, and control-B (binary value 2) the leadout. Again, the server looks for these characters framing a message.
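A minimal sketch of that leadin/leadout idea, assuming control-A (0x01) starts a message and control-B (0x02) ends it; the function name, buffer size and generator approach are my own illustration, not part of the original question:

    import socket

    LEADIN, LEADOUT = b"\x01", b"\x02"

    def read_messages(conn: socket.socket):
        """Yield every complete message framed by LEADIN ... LEADOUT."""
        buffer = b""
        while True:
            chunk = conn.recv(4096)
            if not chunk:                # peer closed the connection
                return
            buffer += chunk
            # Extract every complete frame currently sitting in the buffer.
            while True:
                start = buffer.find(LEADIN)
                end = buffer.find(LEADOUT, start + 1)
                if start == -1 or end == -1:
                    break
                yield buffer[start + 1:end]
                buffer = buffer[end + 1:]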
If you can't change the client side (the thing sending the data), then you are going to have to parse the input. You can't just add a delimiter to something that you don't control.
An alternative is to use a header that encodes the size of the message that will be sent. Let's say you use a header of 4 bytes. The client first sends the server a header with the size of the message to come, then sends the message itself (up to 4 GiB or thereabouts). The server knows that it must first read 4 bytes (the header). It decodes the size n that the header contained, then reads n bytes from the socket buffer. You are guaranteed to have read only your message. Using special delimiters is dangerous, as you MUST know all possible values that a client can send.
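A minimal sketch of that 4-byte header scheme, assuming the header is an unsigned 32-bit integer in network byte order; the helper names are mine:

    import socket
    import struct

    HEADER = struct.Struct("!I")   # 4 bytes, network byte order

    def recv_exact(conn: socket.socket, n: int) -> bytes:
        """Read exactly n bytes, or raise if the connection closes early."""
        data = b""
        while len(data) < n:
            chunk = conn.recv(n - len(data))
            if not chunk:
                raise ConnectionError("socket closed mid-message")
            data += chunk
        return data

    def recv_message(conn: socket.socket) -> bytes:
        (length,) = HEADER.unpack(recv_exact(conn, HEADER.size))
        return recv_exact(conn, length)

    def send_message(conn: socket.socket, payload: bytes) -> None:
        conn.sendall(HEADER.pack(len(payload)) + payload)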
It really depends on the type of data you are receiving, the type of connection, the latency, and so on. If you have a pause of one second between packets and your connection is consistent, you could probably get away with first reading the entire buffer once to clear it, then, as soon as data becomes available, reading it and clearing the buffer again. Not a great approach, but it might work for what you need, and no parsing is involved.
The documentation of socket.send says:
Returns the number of bytes sent. Applications are responsible for checking that all data has been sent; if only some of the data was transmitted, the application needs to attempt delivery of the remaining data.
I believe I have observed an instance of this in production. The fix is easy: Use sendall instead. However I'm struggling to reproduce the issue for a test.
How can I get send to not send all the data passed to it?
I've tried repeatedly writing to a socket that is never read, but it will just fill the buffer and then block.
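One way to provoke a partial send(), assuming a non-blocking socket whose send buffer is shrunk and whose peer never reads; the exact point at which the partial send happens is OS-dependent, so the sizes here are just values to tune:

    import socket

    server = socket.socket()
    server.bind(("127.0.0.1", 0))
    server.listen(1)

    client = socket.socket()
    client.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 4096)
    client.connect(server.getsockname())
    conn, _ = server.accept()           # accepted, but never read from

    client.setblocking(False)
    payload = b"x" * 1_000_000
    try:
        sent = client.send(payload)     # typically returns fewer bytes than len(payload)
        print(f"sent {sent} of {len(payload)} bytes")
    except BlockingIOError:
        print("send buffer already full; nothing was sent")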
I want to send a text file from one serial device (slcan0) to another serial device (slcan1). Can this operation be performed with SocketCAN? The serial CAN device I am using is the CANtact toolkit. Or can the same operation be done in Python-can?
When you want to send a text file over the CAN bus, you have to decide which CAN ID you want to use for sending and receiving.
Most likely your text file is larger than 8 bytes, so you would have to use a higher level protocol on CAN.
ISO-TP will allow 4095 bytes of data in one message.
If this is still not enough, you would have to invent another protocol for sending and receiving the data. E.g. first send the length of data, then send the data in chunks of 4095 bytes.
Once you have figured this out, it does not really matter whether you use SocketCAN, Python-CAN, pyvit or anything else.
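A rough sketch of the "length first, then chunks" idea using python-can and plain 8-byte frames (i.e. without ISO-TP); the channel name, the CAN ID 0x123 and the file name are assumptions for illustration:

    import can

    CAN_ID = 0x123

    with open("data.txt", "rb") as f:
        payload = f.read()

    bus = can.interface.Bus(channel="slcan0", bustype="socketcan")

    # 1. Announce the total length (4 bytes, big-endian) so the receiver
    #    knows how many data bytes to expect.
    bus.send(can.Message(arbitration_id=CAN_ID,
                         data=len(payload).to_bytes(4, "big"),
                         is_extended_id=False))

    # 2. Stream the file in chunks that fit a classic CAN frame.
    for i in range(0, len(payload), 8):
        bus.send(can.Message(arbitration_id=CAN_ID,
                             data=payload[i:i + 8],
                             is_extended_id=False))

    bus.shutdown()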
Why, if I have two send() calls on the server and two recv() calls on the client, does the first recv() sometimes also get the content of the second send() from the server, instead of taking just the content of the first one and letting the other recv() take the "due and proper" content of the other send()?
How can I make this work the other way?
This is by design.
A TCP stream is a channel on which you can send bytes between two endpoints but the transmission is stream-based, not message based.
If you want to send messages then you need to encode them... for example by prepending a "size" field that will inform the receiver how many bytes to expect for the body.
If you send 100 bytes and then another 100 bytes, it is entirely possible that the receiver will instead see 200 bytes at once, or even 50 + 150 in two different read calls. If you want message boundaries then you have to put them in the data yourself.
There is a lower layer (datagrams) that lets you send messages, however they are limited in size and delivery is not guaranteed (i.e. it is possible that a message will get lost, that it will be duplicated, or that two messages you send will arrive in a different order).
The TCP stream is built on top of this datagram service and implements all the logic needed to transfer data reliably between the two endpoints.
As an alternative there are libraries designed to provide reliable message-passing between endpoints, like ZeroMQ.
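A hedged sketch of the ZeroMQ alternative, using a PUSH/PULL pair in one process just to show that message boundaries are preserved; the endpoint and payloads are arbitrary:

    import zmq

    ctx = zmq.Context()

    sender = ctx.socket(zmq.PUSH)
    sender.bind("tcp://127.0.0.1:5555")

    receiver = ctx.socket(zmq.PULL)
    receiver.connect("tcp://127.0.0.1:5555")

    sender.send(b"first message")
    sender.send(b"second message")

    # Each recv() returns exactly one message, never a concatenation.
    print(receiver.recv())   # b'first message'
    print(receiver.recv())   # b'second message'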
Most probably you use a SOCK_STREAM type socket. This is a TCP socket, which means that you push data in on one side and it comes out on the other side in the same order and without missing chunks, but there are no delimiters. So send() just sends data, and recv() receives all the data available at the current moment.
You can use SOCK_DGRAM instead, and then UDP will be used. In that case every send() sends one datagram and every recv() receives one. But you are not guaranteed that your datagrams will not be reordered or lost, so you will have to deal with such problems yourself. There is also a limit on the maximum datagram size.
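A small sketch of that SOCK_DGRAM behaviour: each sendto() becomes one datagram and each recvfrom() returns exactly one of them (assuming nothing is lost on loopback); the address and port are placeholders:

    import socket

    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 9999))

    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender.sendto(b"first", ("127.0.0.1", 9999))
    sender.sendto(b"second", ("127.0.0.1", 9999))

    print(receiver.recvfrom(2048)[0])   # b'first'
    print(receiver.recvfrom(2048)[0])   # b'second', never b'firstsecond'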
Or you can stick to a TCP connection, but then you have to add delimiters yourself.
I have an application in PyQt (a UDP client) that sends some parameters over UDP/IP to an application on a Raspberry Pi (a UDP server).
This Qt application has several fields, like PID parameters, motor speed, sensor presets, and so on.
Currently, the UDP client builds a string by getting the value from each field in the Qt application and appending it to the string with a separator character (','), always in the same sequence. For instance: "142.0, 10.0, 2.0, negative, positive".
The UDP server receives this message, splits the message and moves each item of the list to the respective variable.
It works, but it is not smart: all parameters are sent even when one of them has not changed.
What would be a smart way to send only specific parameters, without depending on the right sequence? Or only the changed ones?
Maybe some encapsulation protocol on top of the UDP message?
If you really want to keep things simple and change the existing code the least, you can include empty values for parameters whose values didn't change. E.g. if you have five parameters, then assuming they all changed you'd send 142,10,2,negative,positive, but if only the first two changed you'd send 142,10,,,. But such ad-hoc schemes should IMHO be discouraged.
You could use json with very short member strings, e.g. {"a":142,"b":10}. You'd have to keep a human-readable mapping between the short string keys and their meaning separate from the data. Since the keys can be any Unicode character, you have quite a way to go before you run out of single characters to use. Also, Python natively supports json.
If you don't care much about the length of the packet, then you don't even need short member strings: make your packets self-documenting by using meaningful keys, such as {"velocity":142,"acceleration":10}.
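A sketch of that self-documenting approach, sending only the fields that changed as one JSON datagram; the host, port, field names and helper function are assumptions, not part of the original setup:

    import json
    import socket

    RASPBERRY = ("192.168.1.50", 5005)   # placeholder address of the UDP server

    def send_changed(sock: socket.socket, changed: dict) -> None:
        """Encode only the parameters that changed and send them as one datagram."""
        sock.sendto(json.dumps(changed).encode("utf-8"), RASPBERRY)

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    send_changed(sock, {"velocity": 142.0, "acceleration": 10.0})

    # On the server side, order no longer matters and absent keys keep their
    # previous values, e.g.: params.update(json.loads(datagram.decode("utf-8")))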
In Twisted, when implementing the dataReceived method, there don't seem to be any examples that refer to packets being fragmented. In every other language this is something you implement manually, so I was wondering whether Twisted already does this for you. If not, do I need to prefix my packets with a length header, or do I have to handle the framing manually? If so, what would be the way to do that?
In the dataReceived method you get back the data as a string of indeterminate length, meaning that it may be a whole message in your protocol or it may be only part of the message that some 'client' sent to you. You will have to inspect the data to see if it comprises a whole message in your protocol.
I'm currently using Twisted on one of my projects to implement a protocol and decided to use the struct module to pack/unpack my data. The protocol I am implementing has a fixed header size, so I don't construct any messages until I've read at least HEADER_SIZE bytes. The total message size is declared in this header portion.
I guess you don't really need to define a message length as part of your protocol, but it helps. If you didn't define one, you would need a special delimiter that determines where a message begins/ends, sort of how the FIX protocol uses the SOH byte to delimit fields (though FIX does have a required field that tells you how long a message is, just not how many fields are in a message).
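A rough sketch of that fixed-header buffering pattern in a Twisted Protocol; the 4-byte "!I" header layout and the messageReceived hook are my assumptions, not the poster's actual protocol:

    import struct
    from twisted.internet.protocol import Protocol

    HEADER = struct.Struct("!I")   # assumed: 4-byte big-endian body length

    class FramedProtocol(Protocol):
        def connectionMade(self):
            self._buffer = b""

        def dataReceived(self, data):
            # TCP may hand us half a message or several at once, so accumulate
            # bytes and only act once a complete frame is available.
            self._buffer += data
            while len(self._buffer) >= HEADER.size:
                (body_len,) = HEADER.unpack_from(self._buffer)
                if len(self._buffer) < HEADER.size + body_len:
                    break                          # wait for the rest of the body
                body = self._buffer[HEADER.size:HEADER.size + body_len]
                self._buffer = self._buffer[HEADER.size + body_len:]
                self.messageReceived(body)

        def messageReceived(self, body):
            print("got message:", body)

Twisted also ships ready-made length-prefixed protocols (e.g. Int32StringReceiver in twisted.protocols.basic) that implement this same pattern for you.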
When dealing with TCP, you should really forget all notion of 'packets'. TCP is a stream protocol - you stream data in and data streams out the other side. Once the data is sent, it is allowed to arrive in as many or as few blocks as it wants, as long as the data all arrives in the right order. You'll have to manually do the delimitation as with other languages, with a length field, or a message type field, or a special delimiter character, etc.
You can also use a LineReceiver protocol if each of your messages ends with a line delimiter.
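A minimal sketch of the LineReceiver suggestion: Twisted splits the stream on the delimiter for you, so each lineReceived call is one complete message. The port number is an arbitrary choice:

    from twisted.internet import reactor
    from twisted.internet.protocol import Factory
    from twisted.protocols.basic import LineReceiver

    class LineProtocol(LineReceiver):
        delimiter = b"\n"          # LineReceiver's default is b"\r\n"

        def lineReceived(self, line):
            print("got one complete message:", line)

    class LineFactory(Factory):
        protocol = LineProtocol

    reactor.listenTCP(8007, LineFactory())
    reactor.run()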