Under which circumstances does socket.send not send all data?

The documentation of socket.send says:
Returns the number of bytes sent. Applications are responsible for checking that all data has been sent; if only some of the data was transmitted, the application needs to attempt delivery of the remaining data.
I believe I have observed an instance of this in production. The fix is easy: use sendall instead. However, I'm struggling to reproduce the issue in a test.
How can I get send to not send all the data passed to it?
I've tried repeatedly writing to a socket that is never read, but it will just fill the buffer and then block.
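One setup that should produce a short count on demand is a non-blocking socket whose send buffer is smaller than the payload: send copies whatever fits and returns that number instead of blocking. Below is a minimal sketch, assuming a platform where socket.socketpair() returns a connected stream pair. The helper name partial_send_demo and the sizes are made up for illustration.

```python
import socket


def partial_send_demo(payload_size=8 * 1024 * 1024):
    """Return (bytes_sent, payload_size) for one send() on a non-blocking socket."""
    a, b = socket.socketpair()  # connected stream sockets; b is never read
    try:
        # Request a small send buffer so the payload cannot fit
        # (the OS is free to round this value up).
        a.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 4096)
        a.setblocking(False)
        payload = b"x" * payload_size
        sent = a.send(payload)  # copies only what fits, then returns
        return sent, len(payload)
    finally:
        a.close()
        b.close()
```

A socket with a timeout set should behave similarly, since the call gives up waiting once some of the data has been accepted. A fully blocking socket normally waits until everything is buffered, which matches the blocking behavior described in the question.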

Related

How to send and receive a file in SocketCAN or Python-can?

I want to send a text file from one serial device (slcan0) to another serial device (slcan1). Can this operation be performed in SocketCAN? The serial CAN device I am using is the CANtact toolkit. Or can the same operation be done in Python-can?

When you want to send a text file over the CAN bus, you have to decide which CAN IDs you want to use for sending and receiving.
Most likely your text file is larger than 8 bytes, so you would have to use a higher-level protocol on top of CAN.
ISO-TP allows up to 4095 bytes of data in one message.
If this is still not enough, you would have to invent another protocol for sending and receiving the data, e.g. first send the length of the data, then send the data in chunks of 4095 bytes.
Once you have figured this out, it does not really matter whether you use SocketCAN, Python-CAN, pyvit or anything else.
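The "length first, then chunks of 4095 bytes" idea can be sketched independently of any CAN library. The function names and the 4-byte length header below are assumptions made for illustration. Actually transmitting each message over ISO-TP is left out.

```python
import struct

MAX_ISOTP_PAYLOAD = 4095  # per-message limit mentioned above


def split_for_transfer(data, chunk_size=MAX_ISOTP_PAYLOAD):
    """Return a list of messages: a 4-byte length header, then the data chunks."""
    messages = [struct.pack("!I", len(data))]
    for offset in range(0, len(data), chunk_size):
        messages.append(data[offset:offset + chunk_size])
    return messages


def reassemble(messages):
    """Inverse of split_for_transfer: check the announced length, join the chunks."""
    (expected,) = struct.unpack("!I", messages[0])
    body = b"".join(messages[1:])
    if len(body) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(body)}")
    return body
```

The receiver reads the first message to learn the total length and then keeps collecting messages until that many bytes have arrived.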

Weird behavior of send() and recv()

Why is it that, if I have two send() calls on the server and two recv() calls on the client, the first recv() sometimes also receives the content of the second send(), instead of taking just the content of the first one and leaving the rest for the second recv()?
How can I make this work some other way?

This is by design.
A TCP stream is a channel on which you can send bytes between two endpoints, but the transmission is stream-based, not message-based.
If you want to send messages, you need to encode them, for example by prepending a "size" field that tells the receiver how many bytes to expect for the body.
If you send 100 bytes and then another 100 bytes, it's entirely possible that the receiver will instead see 200 at once, or even 50 + 150 in two different read calls. If you want message boundaries, you have to put them in the data yourself.
There is a lower layer (datagrams) that lets you send discrete messages; however, they are limited in size and delivery is not guaranteed (i.e. a message may get lost, may be duplicated, or two messages you send may arrive in a different order).
A TCP stream is built on top of this datagram service and implements all the logic needed to transfer data reliably between the two endpoints.
As an alternative there are libraries designed to provide reliable message-passing between endpoints, like ZeroMQ.
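The "size" field approach can be sketched with plain blocking sockets as follows. The names send_msg, recv_exact and recv_msg, and the choice of a 4-byte big-endian length, are assumptions for illustration rather than a standard API.

```python
import socket
import struct

HEADER = struct.Struct("!I")  # 4-byte unsigned length, network byte order


def send_msg(sock, payload):
    """Send one length-prefixed message."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes, looping because recv() may return fewer."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_msg(sock):
    """Receive one message that was sent with send_msg."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)
```

With this in place, two send_msg calls on the server always map to two recv_msg results on the client, no matter how the bytes were grouped in transit.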

Most probably you are using a SOCK_STREAM socket. This is a TCP socket, which means that data you push in on one side comes out on the other side in the same order and without missing chunks, but there are no delimiters. So send() just sends data and recv() receives whatever data is available at that moment.
You can use SOCK_DGRAM instead, and then UDP will be used. In that case every send() sends one datagram and each recv() receives one. But you are not guaranteed that your datagrams will not be reordered or lost, so you will have to deal with such problems yourself. There is also a limit on the maximum datagram size.
Or you can stick to a TCP connection, but then you have to send delimiters yourself.

Python TCP socket for a lot of data

We (as project group) are currently stuck on the issue of how to handle live data to our server.
We are getting updates on data every second, and we would like to insert this into our database (security is currently not an issue, because it is a school project). The problem is that we tried Python's SocketServer and asyncio to create a TCP server to which the data can be sent.
We got this working with different libraries etc. But we are stuck on the fact that if we keep an open connection with the client (in this case hardware which sends data every second), we can't split the different JSON or XML messages. They all run together.
We know why: TCP only guarantees ordering, not message boundaries.
Any thoughts on how to handle this? So that every message sent will get split from the others.
Recreating the socket won't be the right option if I recall correctly.

What you will have to do is ensure that there is a clear delimiter for each message. For example, the first 6 characters of every message could be the length of the message: whatever reads from the socket decodes the length, then reads that number of bytes, and passes the data to whatever needs it. Another way, if there is a character/byte which never appears in the content, is to send it immediately before each message. For example, control-A (binary value 1) could be the lead-in character, with control-B (binary value 2) sent as the lead-out. Again, the server looks for these framing a message.
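The control-A / control-B framing could look roughly like this on the receiving side. This is a sketch that assumes neither byte ever occurs inside a message. The class name DelimitedParser is hypothetical.

```python
SOM = b"\x01"  # control-A: start of message
EOM = b"\x02"  # control-B: end of message


class DelimitedParser:
    """Collect bytes from a stream and extract complete SOM...EOM messages."""

    def __init__(self):
        self._buffer = bytearray()

    def feed(self, data):
        """Add newly received bytes and return a list of complete message bodies."""
        self._buffer += data
        messages = []
        while True:
            start = self._buffer.find(SOM)
            if start == -1:
                self._buffer.clear()  # no lead-in seen: discard stray bytes
                break
            end = self._buffer.find(EOM, start + 1)
            if end == -1:
                del self._buffer[:start]  # keep the partial message for later
                break
            messages.append(bytes(self._buffer[start + 1:end]))
            del self._buffer[:end + 1]
        return messages
```

Each chunk returned by recv() is passed to feed(); a message that is split across two reads is simply completed by the next call.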

If you can't change the client side (the thing sending the data), then you are going to have to parse the input. You can't just add a delimiter to something that you don't control.
An alternative is to use a header that encodes the size of the message that will be sent. Let's say you use a header of 4 bytes. The client first sends the server a header with the size of the message to come. The client then sends the message (up to about 4 GiB). The server knows that it must first read 4 bytes (the header). It decodes the size n from the header, then reads n bytes from the socket buffer. You are guaranteed to have read only your message. Using special delimiters is dangerous, as you MUST know all possible values that a client can send.
It really depends on the type of data you are receiving, the type of connection, the latency, and so on. If you have a pause of 1 second between packets and your connection is consistent, you could probably get away with first reading the entire buffer once to clear it, then, as soon as data is available, reading it and clearing the buffer again. It's not a great approach, but it might work for what you need, with no parsing involved.
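Since the question mentions asyncio, here is a rough sketch of the 4-byte header scheme using asyncio's StreamReader.readexactly. The function names read_frame and encode_frame and the big-endian header layout are assumptions. The client would have to use the same layout.

```python
import asyncio
import struct


async def read_frame(reader):
    """Read one message preceded by a 4-byte big-endian length header."""
    header = await reader.readexactly(4)
    (size,) = struct.unpack("!I", header)
    return await reader.readexactly(size)


def encode_frame(payload):
    """Client side: prepend the 4-byte length header to the payload."""
    return struct.pack("!I", len(payload)) + payload
```

In a server, read_frame would be awaited in a loop inside the connection handler. readexactly raises asyncio.IncompleteReadError if the connection closes before the requested number of bytes has arrived.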

what does it mean when python socket.sendall returns successfully?

In my code I wrote something like this:
    try:
        s.sendall(data)
    except Exception as e:
        print(e)
Now, can I assume that if there wasn't any exception thrown by sendall that the other side of the socket (its kernel) did receive 'data'? If not then that means I need to send an application ack which seems unreasonable to me.
If I can assume that the other side's kernel did receive 'data' then that means that 'sendall' returns only when it sees tcp ack for all the bytes I have put in 'data' but I couldn't see any documentation for this, on the contrary, from searching the web I got the feeling that I cannot assume an ack was received.

can I assume that if there wasn't any exception thrown by sendall that the other side of the socket (its kernel) did receive 'data'?
No, you can't. All it tells you is that the system successfully sent the data (that is, handed it to the local network stack). It will not wait for the peer to ACK the data (i.e. for the data to be received by the peer's OS kernel), let alone wait until the data has been processed by the peer application. This behavior is not specific to Python.
And usually it does not matter much whether the peer system's kernel received the data and put it into the application's socket buffer. What really counts is whether the data was received and processed inside the application, which might involve complex things like inserting the data into a database and waiting for a successful commit, or even forwarding the data to yet another system. And since it is up to the application to decide when the data has really been processed, you have to implement an application-specific ACK to signal successful processing.
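An application-specific ACK can be quite small. The following is only a sketch: the one-byte ACK value, the 4-byte length header and all function names are assumptions, and a real protocol would probably also want a way to report failure.

```python
import socket
import struct

ACK = b"\x06"  # hypothetical one-byte acknowledgement


def _recv_exact(sock, n):
    """Read exactly n bytes or raise if the connection closes first."""
    data = bytearray()
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise ConnectionError("connection closed early")
        data += chunk
    return bytes(data)


def send_and_wait_for_ack(sock, payload, timeout=5.0):
    """Send a length-prefixed payload, then wait until the peer acknowledges it."""
    sock.settimeout(timeout)  # a timeout error is raised if no ACK arrives in time
    sock.sendall(struct.pack("!I", len(payload)) + payload)
    return _recv_exact(sock, 1) == ACK


def receive_and_ack(sock, process):
    """Read one payload, process it, and only then acknowledge."""
    (size,) = struct.unpack("!I", _recv_exact(sock, 4))
    payload = _recv_exact(sock, size)
    process(payload)  # e.g. a database commit; if this raises, no ACK is sent
    sock.sendall(ACK)
```

The point of the design is that the ACK is sent after process() returns, so the sender learns about successful processing rather than mere arrival.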

Yes you can :)
According to the socket.sendall docs:
socket.sendall(string[, flags]) Send data to the socket. The socket
must be connected to a remote socket. The optional flags argument has
the same meaning as for recv() above. Unlike send(), this method
continues to send data from string until either all data has been sent
or an error occurs. None is returned on success. On error, an
exception is raised, and there is no way to determine how much data,
if any, was successfully sent.
Specifically:
socket.sendall() will continue to send all data until it has completed or an error has occurred.
Update: To answer your comment about what's going on under the hood:
Looking at the socketmodule.c source code it looks like it repeatedly tries to "send all data" until there is no more data left to send. You can see this on L3611 } while (len > 0);. Hopefully this answers your question.
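Expressed in Python, that loop behaves roughly like the sketch below. This is an illustration of the idea, not the actual C implementation, and send_all is a hypothetical name.

```python
def send_all(sock, data):
    """Keep calling send() until every byte of data has been handed to the OS."""
    view = memoryview(data)
    while len(view) > 0:
        sent = sock.send(view)  # may accept fewer bytes than offered
        view = view[sent:]      # retry with whatever is left
```

Note that the loop only ends once the operating system has accepted all the bytes. It contains nothing that waits for a response from the peer.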

Python/Twisted - TCP packet fragmentation?

In Twisted, when implementing the dataReceived method, there don't seem to be any examples that deal with packets being fragmented. In every other language this is something you implement manually, so I was wondering whether this is already done for you in Twisted. If so, do I need to prefix my packets with a length header? Or do I have to do this manually? If so, how would I go about it?

In the dataReceived method you get the data as a string of indeterminate length, meaning that it may be a whole message in your protocol or only part of a message that some client sent to you. You will have to inspect the data to see whether it comprises a whole message in your protocol.
I'm currently using Twisted on one of my projects to implement a protocol and decided to use the struct module to pack/unpack my data. The protocol I am implementing has a fixed header size, so I don't construct any messages until I've read at least HEADER_SIZE bytes. The total message size is declared in this header.
I guess you don't strictly need to define a message length as part of your protocol, but it helps. If you didn't define one, you would need a special delimiter that determines where a message begins and ends, similar to how the FIX protocol uses the SOH byte to delimit fields. FIX does, however, have a required field that tells you how long a message is (just not how many fields are in it).
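The buffering described in this answer can be sketched without importing Twisted at all. The class below is a plain-Python illustration whose data_received method would be called from a protocol's dataReceived. The header layout (a 2-byte message type plus a 4-byte body length) and the name MessageAssembler are assumptions, not the answerer's actual protocol.

```python
import struct

HEADER = struct.Struct("!HI")  # hypothetical header: message type, body length
HEADER_SIZE = HEADER.size


class MessageAssembler:
    """Accumulate stream fragments and emit (msg_type, body) once complete."""

    def __init__(self):
        self._buffer = b""

    def data_received(self, data):
        """Call with each fragment; returns the messages it completed."""
        self._buffer += data
        messages = []
        while len(self._buffer) >= HEADER_SIZE:
            msg_type, body_len = HEADER.unpack_from(self._buffer)
            end = HEADER_SIZE + body_len
            if len(self._buffer) < end:
                break  # body not fully received yet
            messages.append((msg_type, self._buffer[HEADER_SIZE:end]))
            self._buffer = self._buffer[end:]
        return messages
```

A fragment that ends in the middle of a header or body simply stays in the buffer until the next call supplies the rest.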

When dealing with TCP, you should really forget all notion of 'packets'. TCP is a stream protocol - you stream data in and data streams out the other side. Once the data is sent, it is allowed to arrive in as many or as few blocks as it wants, as long as the data all arrives in the right order. You'll have to manually do the delimitation as with other languages, with a length field, or a message type field, or a special delimiter character, etc.

You can also use a LineReceiver protocol
