Need some clarification on how socket.recv behaves - python

I'm trying to write an IRC bot but I'm not exactly sure how the receiving of data works. What I currently have:
while True:
    data = socket.recv(1024)
    # process data
Let's say that, for whatever reason, it takes more time to process the data. What would happen if something is sent during that time? Will it get skipped, or will it be added to some sort of queue and processed after the current chunk is done?

Depending upon the protocol type the behavior will be different.
TCP:
The TCP RFC clearly states:
TCP provides a means for the receiver to govern the amount of data
sent by the sender. This is achieved by returning a "window" with
every ACK indicating a range of acceptable sequence numbers beyond
the last segment successfully received. The window indicates an
allowed number of octets that the sender may transmit before
receiving further permission.
The information from Wikipedia is similar:
TCP uses an end-to-end flow control protocol to avoid having the
sender send data too fast for the TCP receiver to receive and process
it reliably. For example, if a PC sends data to a smartphone that is
slowly processing received data, the smartphone must regulate the data
flow so as not to be overwhelmed. TCP uses a sliding window flow
control protocol. In each TCP segment, the receiver specifies in the
receive window field the amount of additionally received data (in
bytes) that it is willing to buffer for the connection. The sending
host can send only up to that amount of data before it must wait for
an acknowledgment and window update from the receiving host.
UDP:
UDP doesn't have a flow-control mechanism like TCP's. However, there are other protocols built on top of UDP, such as RUDP, that provide some of TCP's features, including flow control.
Here is another interesting link on the differences between TCP and UDP.
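To make the buffering point concrete for the original question: bytes that arrive while your loop is still busy processing are held in the kernel's receive buffer and returned by a later recv(); they are not skipped. A minimal local sketch (socket.socketpair() is used here purely for demonstration, and the exact chunking of the output is not guaranteed):

import socket
import time

a, b = socket.socketpair()      # two connected stream sockets in one process

a.sendall(b"first message ")
a.sendall(b"second message")

time.sleep(2)                   # simulate slow processing on the receiving side

# Both payloads are still waiting in the kernel buffer. Note that they may
# come back together in a single recv() call, because TCP has no message
# boundaries.
print(b.recv(1024))

a.close()
b.close()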

Why does my TCP/IP Socket receive inconsistent amounts of data per read?

I am working on a project with TCP/IP sockets (a C# server and a Python client).
When streaming video, after a while the data returned by the receiving socket is split.
My payload is buff = 22000 bytes; when it is split it arrives as, for example:
buff = 1460
buff = 20600
I don't know why. I have researched MTU, fragmentation, window size, and so on, but without result.
Notably, if I call setsockopt the splitting happens less often:
self.sk.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 1048576)
[Screenshot: the received data being split]
This is my receive code:
buff = self.sk.recv(1048576)
print("BUFF RECEIVE ::: ::::: ---->>>>> ", len(buff))
if buff == b'':
    self.sk = None
    buff = None
return buff
Note: this only happens in the Chrome browser (meaning it cannot keep streaming the video when data is lost). In Firefox it does not; the video seems to blink for a moment when data is lost, but then the stream continues.
[Screenshot: the behavior in Chrome vs. Firefox]
That is just the way TCP works. It is a streaming protocol with no built-in message framing, so it is free to place bytes into packets in any amounts it chooses — all it guarantees is that the bytes will still be in the correct order when they are read() by the receiving socket.
Therefore, if your program requires a certain number of bytes before it can proceed, it is up to your program to do the necessary buffering to assemble those bytes together.
As for why TCP might behave the way you observed: it is likely reacting to network conditions (dropped packets, feedback from the receiving host's TCP stack, etc.), and trying to make transmission as efficient as possible given its current environment. It's 100% up to the TCP stack how it wants to transmit the data, and different TCP stacks may behave differently, which is fine as long as they follow the rules of the TCP specification.
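A minimal sketch of the buffering the answer describes; recv_exact is a hypothetical helper name, not something provided by the socket module:

def recv_exact(sock, nbytes):
    # Keep calling recv() until exactly nbytes have been collected.
    # TCP may hand the data back in arbitrary chunks (1460 + 20600, etc.),
    # so the application has to reassemble them itself.
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:   # the peer closed the connection early
            raise ConnectionError("socket closed before the full message arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

With a helper like this, the streaming code above could call recv_exact(self.sk, 22000) instead of a single recv() and would no longer see partial buffers.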
After a long time, I have found the answer to my issue.
Solution for the TCP/IP client socket message boundary problem:
1/ When you send a package from the server to the client with send/write, the receive on the client side will sometimes not get the full data. It does not mean that send/write on the server did not send enough data; it is just how the TCP/IP protocol works: a single receive is not guaranteed to return everything, and the package can arrive fragmented on the client side (in your code).
2/ You can solve this issue by adding a pattern on the send/write server side. For example, send(data) becomes send(pattern + data), and on the client side you can use the pattern to check the data (see the sketch below).
3/ A limitation of this method is that after fragmentation the pieces will sometimes be combined back together and sometimes not. For example, you send data = 4000 bytes and on the client side you receive 1460 + 2540.
This is what I understood about my issue.
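One way to read point 2/ is a small header in front of every payload: a fixed marker so the receiver can detect that it is in sync, plus the payload length so it knows how much to read. A sketch under that assumption, reusing the recv_exact helper sketched earlier in this thread (the MAGIC value and field sizes are made up for illustration):

import struct

MAGIC = b"VID0"                  # made-up 4-byte marker, purely for illustration

def send_framed(sock, payload):
    # Prefix each payload with the marker and its length so the receiver
    # can tell where one message ends and the next begins, even if TCP
    # splits it into 1460 + 20600 or similar chunks.
    sock.sendall(MAGIC + struct.pack("!I", len(payload)) + payload)

def recv_framed(sock):
    header = recv_exact(sock, 8)     # 4-byte marker + 4-byte big-endian length
    if header[:4] != MAGIC:
        raise ValueError("stream out of sync: marker not found")
    (length,) = struct.unpack("!I", header[4:])
    return recv_exact(sock, length)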

Can TCP really guarantee delivery?

I am reading a networking book and from what I have read about the TCP protocol, it makes sure the data will be sent. I want to write some code to do a file transfer. Before getting to that, I also read in the Python documents this passage:
"Applications are responsible for checking that all data has been
sent; if only some of the data was transmitted, the application needs
to attempt delivery of the remaining data"
This seems to contradict what I read in the networking book. The passage above says applications are responsible for the lost data.
I may be misunderstanding so I want to ask some questions:
1- If I have to check that the data is sent, then why use TCP?
2- I read in the networking book that TCP does the math to make sure that the data is there. So why isn't using TCP a waste of time?
3- The Python docs didn't specify a buffer size. What is the maximum size of the buffer to send at a time?
4- I read in the networking book that the server can increase the amount of data that it can send if it knows the client can receive it. Can this change the size of the buffer beyond that maximum?
Here is my code attempt so far:
Server code:
import socket
s = socket.socket()
host = socket.gethostname()
port = 3000
s.bind((host, port))
s.listen(1)
c, addr = s.accept()
with open("Filetosend", "rb") as File:
    data = File.read(1024)
    while data:
        c.send(data)
        data = File.read(1024)
s.close()
Client code:
import socket
s = socket.socket()
host = socket.gethostname()
port = 3000
s.connect((host, port))
with open("Filetowrite", "wb") as File:
    data = s.recv(1024)
    while data:
        File.write(data)
        data = s.recv(1024)
s.close()
TCP tries to guarantee that if the data is delivered, it's correct and in order. It uses checksums to ensure data isn't corrupted, and sequence numbers to ensure that data is delivered in order and with no gaps. And it uses acknowledgements so the sender will know that data has been received.
But suppose there's a network failure in the middle of a transmission. If it happens after the data segment is received, but before the acknowledgement is sent back, the sender will not know that the data was received. The sender will keep trying to resend the data, and will eventually time out and report an error to the application.
Most TCP APIs don't allow the application to find out precisely where in the communication the error happened. If you sent a megabyte, and get an error, it could have happened at the beginning, when hardly anything was sent, or at the end when most of the data was sent. It could even have happened after all the data was sent -- maybe just the last ACK was lost.
Furthermore, the write() system call generally just puts the data in a kernel buffer. It doesn't wait for the data to be sent to the network, and doesn't wait for the receiver to acknowledge it.
Even if you successfully close the connection, you can't be totally sure. When you close the connection, the sender sends a message to the recipient saying they're done sending data. But closing the connection just queues this in the network stack, it doesn't wait for the other system to acknowledge it.
This is why application protocols have their own level of acknowledgements, on top of the basic TCP protocol. For instance, in the SMTP protocol, the client sends the message contents, followed by a line with a . to indicate the end, then waits for the server to send back a response code that indicates that the message was received successfully and is being delivered or queued. TCP's checking ensures that if you receive this response, the message contents were sent intact.
Regarding the general ability of any protocol to guarantee perfect delivery of all messages, you should read about the Two Generals' Problem. No matter what you do, there's no way to verify delivery of all messages in any communication, because the only way to confirm that the last message was delivered is by sending another message in reply, and now that reply is the last message, and needs confirmation.
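To tie this back to the file-transfer code in the question, here is a sketch of an application-level acknowledgement on top of TCP, in the spirit of the SMTP example above. The b"OK" token and the helper name are made up; any agreed-upon response works:

import socket

def send_with_ack(sock, payload):
    # sendall() retries until every byte has been handed to the kernel,
    # but that alone does not prove the peer received anything.
    sock.sendall(payload)
    # Half-close to tell the peer we are done sending; we can still read.
    sock.shutdown(socket.SHUT_WR)
    # Wait for the peer's explicit confirmation before declaring success.
    # A real protocol would read until a delimiter; recv(16) keeps the sketch short.
    reply = sock.recv(16)
    if reply.strip() != b"OK":
        raise RuntimeError("peer never confirmed receipt")

The receiving side would write the file and send back b"OK" only after the data has actually been written. As the Two Generals' Problem says, even this cannot prove to the receiver that its OK arrived, but it does let the sender distinguish "definitely delivered" from "unknown".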

Weird behavior of send() and recv()

SORRY FOR BAD ENGLISH
Why, if I have two send() calls on the server and two recv() calls on the client, does the first recv() sometimes get the content of the second send() from the server as well, instead of taking just the content of the first send() and letting the other recv() take the corresponding content of the other send()?
How can I get this to work differently?
This is by design.
A TCP stream is a channel on which you can send bytes between two endpoints but the transmission is stream-based, not message based.
If you want to send messages then you need to encode them... for example by prepending a "size" field that will inform the receiver how many bytes to expect for the body.
If you send 100 bytes and then another 100 bytes, it's quite possible that the receiver will instead see 200 at once, or even 50 + 150 in two different read calls. If you want message boundaries then you have to put them in the data yourself.
There is a lower layer (datagrams) that allows sending messages, but they are limited in size and delivery is not guaranteed (i.e. it's possible that a message will get lost, that it will be duplicated, or that two messages you send will arrive in a different order).
TCP stream is built on top of this datagram service and implements all the logic needed to transfer data reliably between the two endpoints.
As an alternative there are libraries designed to provide reliable message-passing between endpoints, like ZeroMQ.
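A minimal sketch of the size-field framing described above (the 4-byte big-endian length header is one common convention, not the only one):

import struct

def send_msg(sock, payload):
    # 4-byte big-endian length header, then the message body.
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_msg(sock):
    (length,) = struct.unpack("!I", _recv_all(sock, 4))
    return _recv_all(sock, length)

def _recv_all(sock, n):
    # Collect exactly n bytes, however TCP decides to chunk them.
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("connection closed mid-message")
        buf.extend(chunk)
    return bytes(buf)

With this in place, two send_msg() calls on the server always come back as two separate recv_msg() results on the client, regardless of how TCP packs the bytes.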
Most probably you are using a SOCK_STREAM socket. That is a TCP socket, which means you push data in on one side and it comes out the other side in the same order and without missing chunks, but there are no delimiters: send() just sends data, and recv() receives all the data available at the current moment.
You can use SOCK_DGRAM instead, in which case UDP will be used. Then every send() sends one datagram and every recv() receives one. But you are not guaranteed that your datagrams will not be reordered or lost, so you will have to deal with such problems yourself. There is also a limit on the maximum datagram size.
Or you can stick with the TCP connection, but then you have to send the delimiters yourself.
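A small sketch of the SOCK_DGRAM behavior mentioned above: datagram sockets preserve message boundaries, so two sendto() calls come back as two separate recvfrom() results (over loopback, where loss and reordering are unlikely; over a real network neither is guaranteed):

import socket

recv_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
recv_sock.bind(("127.0.0.1", 0))            # let the OS pick a free port
addr = recv_sock.getsockname()

send_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send_sock.sendto(b"first", addr)
send_sock.sendto(b"second", addr)

print(recv_sock.recvfrom(2048))             # (b'first', sender_address)
print(recv_sock.recvfrom(2048))             # (b'second', sender_address) -- never merged

send_sock.close()
recv_sock.close()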

Python UDP socket misses packets

I'm implementing client-server communication using UDP, used for FTP-style file transfer. First off, you don't need to tell me that UDP is unreliable, I know. My approach is: the client asks for a file, the server blasts the client with UDP packets with sequence numbers, then says "what'd you miss?" and resends those. On a local network, packet loss is < 1%. I'm pretty new to socket programming, so I'm not familiar with all the socket options (most examples found on Google are for TCP).
My problem is with how my client receives this data:
PACKET_SIZE = 9216

mysocket.sendto(b'GO!', server_addr)
while True:
    resp = mysocket.recv(PACKET_SIZE)
    worker_thread.enqeue_packet(resp)
But by the time it gets back up to .recv(), it has missed a few UDP packets (which I've confirmed are being sent, using Wireshark). I can fix this by making the server send slightly slower (actually, including logging statements adds enough delay to make everything work).
How can i make sure that socket.recv doesn't miss anything in the time it takes to process a packet? I've tried pushing the data out to a separate thread that pushes it into a queue, but it's still not enough.
Any ideas? select, recv_into, setblocking?
While you already know that UDP is not reliable, you may have missed the other advantages of TCP. Relevant for you is that TCP has flow control and automatically scales down if the receiver is unable to cope with the sender's speed (e.g. on packet loss). So for normal connections TCP should be preferred for data transfer. For high-latency connections (e.g. a satellite link) it behaves badly in the default configuration, so some people design their own custom transfer protocols (mostly on top of UDP), while others just tune the existing TCP stack.
I don't know why you use UDP, but if you want to continue using it you should add some kind of back channel to the sender to inform it of the current packet loss, so that it can scale down; a sketch of that idea follows. Maybe you should also have a look at RTCP, which accompanies RTP (used for VoIP etc.).
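A rough sketch of such a back channel, assuming the sender prefixes every datagram with a 4-byte sequence number and understands a made-up "NACK" report; none of this is part of your existing protocol:

import socket
import struct

PACKET_SIZE = 9216
REPORT_EVERY = 64                 # report gaps back to the sender every N packets

def receive_loop(sock, server_addr, worker_thread):
    expected = 0                  # next sequence number we expect to see
    missing = set()
    received = 0
    while True:
        packet = sock.recv(PACKET_SIZE)
        seq = struct.unpack("!I", packet[:4])[0]
        missing.update(range(expected, seq))      # anything skipped over is a gap
        missing.discard(seq)                      # a late packet fills its gap
        expected = max(expected, seq + 1)
        worker_thread.enqeue_packet(packet[4:])   # hand the payload off quickly
        received += 1
        if received % REPORT_EVERY == 0:
            # Tell the sender which sequence numbers never arrived so it can
            # retransmit them and/or slow down.
            report = struct.pack("!%dI" % len(missing), *sorted(missing))
            sock.sendto(b"NACK" + report, server_addr)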

How to test server behavior under network loss at every possible packet

I'm working with mobile, so I expect network loss to be common. I'm doing payments, so each request matters.
I would like to be able to test my server to see precisely how it will behave with client network loss at different points in the request cycle -- specifically between any given packet send/receive during the entire network communication.
I suspect that the server will behave slightly differently if the communication is lost while sending the response vs. while waiting for a FIN-ACK, and I want to know which timings of disconnections I can distinguish.
I tried simulating an http request using scapy, and stopping communication between each TCP packet. (I.e.: first send SYN then disappear; then send SYN and receive SYN-ACK and then disappear; then send SYN and receive SYN-ACK and send ACK and then disappear; etc.) However, I quickly got bogged down in the details of trying to reproduce a functional TCP stack.
Is there a good existing tool to automate/enable this kind of testing?
Unless your application is actually responding to and generating its own IP packets (which would be incredibly silly), you probably don't need to do testing at that layer. Simply testing at the TCP layer (e.g, connect(), send(), recv(), shutdown()) will probably be sufficient, as those events are the only ones which your server will be aware of.
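In that spirit, here is a sketch of a test client that drops the connection at a few distinct TCP-layer points; the host, port, and request bytes are placeholders for your payment endpoint:

import socket
import struct

HOST, PORT = "payments.example.test", 8080     # placeholder endpoint
REQUEST = b"POST /charge HTTP/1.1\r\nHost: payments.example.test\r\n\r\n"

def connect_then_abort():
    # Case 1: open the connection, then abort immediately.
    # SO_LINGER with a zero timeout makes close() send an RST instead of a FIN.
    s = socket.create_connection((HOST, PORT))
    s.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    s.close()

def send_partial_then_close():
    # Case 2: send only part of the request, then close cleanly (FIN).
    s = socket.create_connection((HOST, PORT))
    s.sendall(REQUEST[: len(REQUEST) // 2])
    s.close()

def send_all_then_ignore_response():
    # Case 3: send the full request but disappear before reading the reply.
    s = socket.create_connection((HOST, PORT))
    s.sendall(REQUEST)
    s.close()

Comparing the server-side logs and the resulting payment state across these cases tells you which client-disconnect timings your server can actually distinguish.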
