Python socket recv taking ages to deliver packet

I have a Python 3 program which sends short commands to a host and gets short responses back (both 20 bytes). It's not doing anything complicated.
The socket is opened like this:
self.conn = socket.create_connection( ( self.host, self.port ) )
self.conn.settimeout( POLL_TIME )
and used like this:
while True:
    buf = self.conn.recv( 256 )
    # append buffer to bigger buffer, parse packet once we've got enough bytes
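For concreteness, that accumulate-and-parse step looks roughly like this (a sketch reusing self.conn from the snippet above; PACKET_LEN and parse_packet stand in for the real protocol):

PACKET_LEN = 20                      # both commands and responses are 20 bytes
pending = b""
while True:
    buf = self.conn.recv(256)        # returns as soon as any bytes arrive
    if not buf:
        break                        # peer closed the connection
    pending += buf
    while len(pending) >= PACKET_LEN:
        packet, pending = pending[:PACKET_LEN], pending[PACKET_LEN:]
        parse_packet(packet)         # stand-in for the real parser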
After my program has been running for a while (hours, usually), it sometimes goes into a strange mode: if I use tcpdump, I can see a response packet arriving at the local machine, but recv doesn't give me the packet until 30s (Windows) to 1m (Linux) later. The delay is random, +/- about ten seconds. I wondered if the packet was being held back until the next packet arrived, but this doesn't seem to be true.
In the meantime, the same program is also operating a second socket connection using the same code on a different thread, which continues to work normally.
This doesn't happen all the time, but it's happened several times in a month. Sometimes it's preceded by a few seconds of packets taking longer and longer to arrive, but most of the time it just goes straight from OK to completely broken. Most of the time it stays broken for hours until I restart the server, but last night I noticed it recovering and going back to normal operation, so it's not irrecoverable.
CPU usage is almost zero, and nothing else is running on the same machine.
The weirdest thing is that this happens both on the Windows Subsystem for Linux (two different laptops) and on native Linux (a tiny AWS instance running Amazon Linux).
I had a look at the CPython implementation of socket.recv() using GDB. From the source code, socket.recv() passes calls straight through to the underlying recv(). However, while the outer function sock_recv() (which implements socket.recv()) gets called frequently, it only calls recv() when there is actually data to read from the socket: the sock_call() helper uses poll()/select() to check whether any data is waiting. The calls to recv() happen immediately before my app receives a packet, so the delay is somewhere before that point, rather than between recv() and my code.
Any ideas on how to troubleshoot this?
(Both the Linux and Windows machines are updated to the most recent everything, and the Python is Python 3.6.2)
[edit] The issue gets even weirder. I got fed up and wrote a method to detect the issue (looking for ten late-arriving packets in a row with near-identical roundtrip times), drop the connection and reconnect (by closing the previous connection and creating a new socket object) ... and it didn't work. Even with a new socket object, the delayed packets remain delayed by exactly the same amount. So I altered the method to completely kill the thread that was running that code and restart it, reasoning that perhaps there was some thread-local state. That still didn't work. The only resort I have left is killing the entire program and having a watchdog to restart it...
[edit2] Killing the entire program and restarting it with an external watchdog worked. It's a terrible solution, but at least it's a solution.
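For reference, a minimal external watchdog along these lines (a sketch; "my_program.py" is a placeholder, and the worker is assumed to exit, e.g. via sys.exit(1), when it detects the stuck socket):

import subprocess
import time

while True:
    # Run the worker; it exits on its own when it detects the
    # stuck-socket condition described above.
    result = subprocess.run(["python3", "my_program.py"])
    print("worker exited with code", result.returncode, "- restarting in 5s")
    time.sleep(5)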

Related

Python pygame Client/Server runs slow

I found a basic Space Invaders pygame on YouTube, and I modified it so that the server does all the processing and drawing while the client only sends keyboard input (all run on localhost). The problem is that the game is no longer responsive after I implemented this mechanism: there is about a 1-second delay between pressing a key and the ship actually moving (when starting the game from PyCharm; when starting it from cmd it's much worse).
I don't have any idea why this is happening because there isn't really anything heavy to process and I could really use your help.
I also monitored the Ethernet traffic in Wireshark, and about 60-70 packets seem to be sent each second.
Here is the GitHub link with all the necessary things: https://github.com/PaaulFarcas/C-S-Game
I would expect this code in the main loop to be the issue:
recv = conn.recv(661)
keys = pickle.loads(recv)
The socket function conn.recv(661) will block until some data is received (returning up to 661 bytes), or until there is some socket event (like the connection being closed). So your program is blocking on every iteration of the main loop, waiting for data to arrive.
You could try using socket.setblocking( False ) as per the manual.
However I prefer to use the select module (manual link), as I like the better level of control it gives. Basically you can use it to know if any data has arrived on the socket (or if there's an error). This gives you a simple select-read-buffer type logic loop:
procedure receiveSocketData
    Use select on the socket, with an immediate timeout.
    Did select indicate any data arrived on my socket?
        Read the data, appending it to the Rx-buffer
        Does the Rx-buffer contain enough for a whole packet?
            Take the packet-chunk from the head of the Rx-buffer
            Decode & return it
        Else
            Keep the Rx-buffer somewhere safe
            Return None
    Did any errors happen on my socket?
        Clear the Rx-buffer
        Close the socket
        Return error
I guess with an unknown-sized packet you could try to un-pickle whatever has accumulated and return it once that succeeds... this is quite inefficient though. I would use a fixed-size packet and the struct module to pack and unpack it in network byte order, as sketched below.
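Here is a minimal sketch of that select-read-buffer loop, assuming a fixed-size packet of four key-state flags (the "!????" format and the 4096-byte read are illustrative, not the game's actual protocol):

import select
import struct

PACKET_FORMAT = "!????"                 # hypothetical: four key-state flags
PACKET_SIZE = struct.calcsize(PACKET_FORMAT)

rx_buffer = b""

def receive_socket_data(sock):
    """Poll the socket; return one decoded packet, or None if not ready."""
    global rx_buffer
    readable, _, errored = select.select([sock], [], [sock], 0)
    if errored:
        rx_buffer = b""                 # clear the Rx-buffer and give up
        sock.close()
        raise ConnectionError("socket error")
    if readable:
        rx_buffer += sock.recv(4096)    # append whatever has arrived
    if len(rx_buffer) >= PACKET_SIZE:   # whole packet available?
        packet, rx_buffer = rx_buffer[:PACKET_SIZE], rx_buffer[PACKET_SIZE:]
        return struct.unpack(PACKET_FORMAT, packet)
    return None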

Why does sending consecutive UDP messages cause messages to arrive late?

I've written a Python server script on Windows 7 that sends UDP packets over Ethernet to a UNIX system running a C client program, which sends each message back to the server. However, sometimes (not always) a message on the last port Python sends to (and it is always the last port) won't arrive until the next batch of 4 messages is sent. This makes the received timing for the last port wrong relative to when it was sent, and I cannot have two messages back to back on the same port.
I have been able to verify this in Wireshark: two messages arrive at around the same time, because the one that wasn't received gets processed together with the next one. I have also checked the timing right after the recv() function, and it shows one long delay followed by one very short delay, because two packets were effectively received at once.
Things I have done to try to fix this; none of them solved it, but they helped me characterize the problem:
- Adding a delay between each sendto() makes every message arrive with correct timing, but I want the test to work the way I've written it below.
- Increasing the priority of the receive thread, in case the Ethernet receive wasn't being signalled to pick up the packet or some other process was taking too long, didn't help; and 20ms should be WAY more than necessary to process the data.
- Removing ports C and D just makes port B miss messages instead (a single port shows no issues), so reducing the number of ports doesn't improve the timing.
- Sending to a dummy PORTE immediately after PORTD lets me receive all of the messages with correct timing (I assume the problem is transferred to PORTE).
- Reproducing the Python script in C in a UNIX environment shows the same issue, which points me at a receiving problem.
- Setting my recv function to time out every 1ms, hoping it could somehow recover even if the timing would be a bit off, still showed messages back to back.
- I've also checked that no UDP packets have been dropped and that the receive buffer is large enough to hold those 4 messages.
Any new ideas would help.
This is the core of the code: the Python script sends 4 packets, one 20-byte message to each corresponding waiting thread in C, then delays for 20ms.
A representation of the python code looks something like
msg_cnt = 5000
cnt = 0
while cnt < msg_cnt:
    UDPsocket.sendto(data, (IP, PORTA))
    UDPsocket.sendto(data, (IP, PORTB))
    UDPsocket.sendto(data, (IP, PORTC))
    UDPsocket.sendto(data, (IP, PORTD))
    time.sleep(.02)
    cnt += 1
The C code has 4 threads, each waiting to receive on its corresponding port. Essentially each thread should receive its packet, process it, and send a reply back to the server, all in less than the 20ms before the next set of messages arrives.
void *receiveEthernetThread(void *arg){
    uint8_t ethRxBuff[1024];
    ssize_t byteCnt;

    if((byteCnt = recv(socketForPort, ethRxBuff, 1024, 0)) < 0){
        perror("recv");
    }else{
        // Process data; cannot have back-to-back messages on the same port
        // Send the message back to the server
    }
}
I found out the reason I was missing messages a while back and wanted to answer my question. I was running the program on a Zynq-7000 and didn't realize this would be an issue.
In the Xilinx Zynq-7000 TRM, there is a known issue described as follows:
" It is possible to have the last frame(s) stuck in the RX FIFO with the software having no way to get the last frame(s) out of there. The GEM only initiates a descriptor request when it receives another frame. Therefore, when a frame is in the FIFO and a descriptor only becomes available at a later time and no new frames arrive, there is no way to get that frame out or even know that it exists.
This issue does not occur under typical operating conditions. Typical operating conditions are when the system always has incoming Ethernet frames. The above mentioned issue occurs when the MAC stops receiving the Ethernet frames.
Workaround: There is no workaround in the software except to ensure a continual flow of Ethernet frames."
It was fixed, basically, by ensuring a continuous flow of incoming Ethernet traffic. Sorry for omitting that crucial information.
"This causes the timing of the message received for the last port incorrect to when it was sent, and I cannot have two messages back to back on the same port."
The short explanation is you are using UDP, and that protocol gives no guarantees about delivery or order.
That aside, what you are describing most definitely sounds like a buffering issue. Unfortunately, there is no real way to "flush" a socket.
You will either need to use a protocol that guarantees what you need (TCP), or implement your needs on top of UDP.
I believe your real problem though is how the data is being parsed on the server side. If your application is completely dependent on a 20ms interval of four separate packets from the network, that's just asking for trouble. If it's possible, I would fix that rather than attempt fixing (normal) socket buffering issues.
A hacky solution though, since I love hacky things:
Set up a fifth socket on the server. After sending your four time-sensitive packets, send "enough" packets to the fifth port to force any remaining time-sensitive packets through. What is "enough" packets is up to you. You can send a static number, assuming it will work, or have the fifth port send you a message back the moment it starts recv-ing.
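A sketch of that hack grafted onto the question's sender loop (the addresses and ports are placeholders; PORTE is the dummy port, and the filler count is an assumed "enough" to tune experimentally):

import socket
import time

IP = "192.168.0.2"                                          # placeholder target
PORTA, PORTB, PORTC, PORTD = 50001, 50002, 50003, 50004     # placeholders
PORTE = 50005                       # dummy flush port
FILLER_COUNT = 4                    # assumed "enough"; tune experimentally
data = b"\x00" * 20                 # 20-byte message, as in the question

UDPsocket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

cnt, msg_cnt = 0, 5000
while cnt < msg_cnt:
    UDPsocket.sendto(data, (IP, PORTA))
    UDPsocket.sendto(data, (IP, PORTB))
    UDPsocket.sendto(data, (IP, PORTC))
    UDPsocket.sendto(data, (IP, PORTD))
    # Filler traffic: keep frames flowing so the time-sensitive packets
    # can't get stranded in the receiver's RX FIFO.
    for _ in range(FILLER_COUNT):
        UDPsocket.sendto(b"\x00" * 20, (IP, PORTE))
    time.sleep(.02)
    cnt += 1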

Significant Delay in Receiving Data using Python Select()

I have a Python script that is used to receive data associated with a radio station audio event (such as a song or commercial) from the machine playing the audio. The script will parse and process the data and then send portions of it to various other destinations.
First the socket is set up:
client_socket_1 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    print 'trying to open socket 1'
    client_socket_1.connect((TCP_RCV_IP_CR1, TCP_RCV_PORT_CR1))
    client_socket_1.setblocking(0)
except socket.error, e:
    print 'Error', e, TCP_RCV_IP_CR1, '\n\n\n'
else:
    SOCK1 = 1
    print 'Successful connection to ', TCP_RCV_IP_CR1, '\n'
Now we wait until data is available to be read. I used select() and when the socket is ready to be read, the thread that parses and processes the data is spawned.
ready_1 = select.select([client_socket_1], [], [], 1)  # select tells us when data is available at the socket
if ready_1[0] and SOCK1:  # Don't run this code if there is no connection on client_socket_1 or no data available
    t1 = Thread(target=processdata1)  # Set up the thread
    t1.start()  # Call the process to process available data as a thread
It is important that the data be read as quickly as possible as it will be transported via TCP or UDP (depending on the particular data chunks and program specifications) along with the associated audio, and the function of one of the data items we are handling can create an on-air 'hiccup' in the audio if not received in a timely fashion. (TMI: It causes a 'replacement' commercial to play at the receiving end which is supposed to 'cover' the commercial audio we are sending. If the replacement spot doesn't start quickly enough listeners will hear the beginning of the commercial we are sending, then the local replacement one will start when our data is received and it sounds like a hiccup on the air.)
To confirm that my script is not always receiving the data quickly enough I telnetted to the port it is listening to and watched the data as it is received in the telnet window, then look at the Python output (which sends received data to stdout as soon as it is received) and I see about a 1.5-second delay between the telnet output and the Python output. This is the same amount of delay we have observed in normal on-air operation.
I chose to use select() because I was asked to multi-thread the script and I thought that would be a good way to know when to trigger a data-processing thread. My original idea was to simply loop through attempting to read data from each of the three systems we are monitoring and, when data is found, process it.
The thought was that if data is being processed from one system when another system has data ready to be read, it might cause a delay in processing and sending out the data from that machine. However, I can't see that delay being as significant as what I am experiencing now. I am considering going back to the original plan.
I would rather stick with what I have which is working flawlessly as long as data is received in a timely fashion. Any thoughts on why the excessively long delay?
I think it has to do with your timeout parameter in combination with the wlist and xlist parameters. Consider this piece of code:
write_list = []
exception_list = []
select.select([client_socket_1], write_list, exception_list)
It takes an optional timeout parameter, as you use it. The documentation says:
"select() also takes an optional fourth parameter which is the number of seconds to wait before breaking off monitoring if no channels have become active. Using a timeout value lets a main program call select() as part of a larger processing loop, taking other actions in between checking for network input."
It might be that the call will always wait one second before returning because of the empty lists. Try
ready_1 = select.select(
    [client_socket_1],
    [client_socket_1],
    [client_socket_1],
    1
)
Or you can use a timeout value of 0, which "specifies a poll and never blocks".

Speeding up Scapy scans?

I'm writing a port scanner using scapy, but I'm finding that it is horrendously slow. I use a single line of code to actually do the scan:
ans, unans = sr(IP(dst=targetIP)/TCP(dport=(1, 49151), flags='S'))
And it takes about 15 minutes to run, even though I'm on the same LAN as the computer I'm scanning. Heck, I'm plugged into the same SWITCH as my target!
I tried multi-threading, but that actually made it slower. Using multiple processes is faster, but only to a certain point. Either scapy's sniffer can't keep up and is losing packets, or the network itself is dropping them (not likely, considering nmap works fine). In either case, with 5 processes I got the TCP scan time down to about 5-6 minutes, which, while a third of the single-process time, is still much slower than the ~10 seconds nmap takes.
Anyone know any other tricks to speed up Scapy port scans of large ranges?
Note that in your example you had forgotten the timeout parameter, which is crucial: without it, Scapy will wait to have received an answer for each packet you have sent, which in your case will never happen!
As of 2018 (2.3.3dev, the GitHub version), running
ans, unans = sr(IP(dst=targetIP)/TCP(dport=(1, 49151), flags='S'), timeout=2)
takes approximately 90 sec. The pending PR https://github.com/secdev/scapy/pull/1142 speeds that up to around 50 sec.

Interact with long running python process

I have a long-running Python process running headless on a Raspberry Pi (controlling a garden), like so:
from time import sleep

def run_garden():
    while 1:
        # do work
        sleep(60)

if __name__ == "__main__":
    run_garden()
The 60-second sleep period is plenty of time for anything happening in my garden (humidity, air temp, turning on the pump, turning off the fan, etc.), BUT what if I want to manually override these things?
Currently, in my do-work loop, I first call out to another server where I keep config variables, and I can update those config variables via a web console, but it lacks any sort of real-time feel, because it relies on the 60-second loop (e.g. you might update the web console and then wait 45 seconds for the desired change to take effect).
The Raspberry Pi running run_garden() is dedicated to the garden, and it is basically the only thing taking up resources, so I know I have room to do something; I just don't know what.
Once the loop picks up the fact that a config var has been updated, it could do exponential backoff to keep checking for interaction rather than waiting 60 seconds, but that doesn't feel like a whole lot better.
Is there a better way to basically jump into this long running process?
Listen on a socket in your main loop. Use a timeout (e.g. of 60 seconds, the time until the next garden update should be performed) on your socket read calls so you get back to your normal functionality at least every minute when there are no commands coming in.
If you need garden-tending updates to happen no faster than every minute you need to check the time since the last update, since read calls will complete significantly faster when there are commands coming in.
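A minimal sketch of that pattern with a UDP command socket (the port number, handle_command and do_garden_work are placeholders, not part of the original setup):

import socket
import time

COMMAND_PORT = 9999          # hypothetical control port
GARDEN_INTERVAL = 60.0       # seconds between garden updates

def handle_command(cmd):
    print("manual override:", cmd)       # placeholder

def do_garden_work():
    print("tending garden")              # placeholder

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("", COMMAND_PORT))

last_update = 0.0
while True:
    # Wait for a command, but never longer than the time remaining
    # until the next scheduled garden update.
    remaining = last_update + GARDEN_INTERVAL - time.time()
    sock.settimeout(max(remaining, 0.001))
    try:
        cmd, addr = sock.recvfrom(1024)
        handle_command(cmd)
    except socket.timeout:
        pass
    if time.time() - last_update >= GARDEN_INTERVAL:
        do_garden_work()
        last_update = time.time()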
Python's select module sounds like it might be helpful.
If you've ever used the unix analog (for example in socket programming maybe?), then it'll be familiar.
If not, here is the select section of a C sockets reference I often recommend. And here is what looks like a nice writeup of the module.
Warning: the first reference is specifically about C, not Python, but the concept of the select system call is the same, so the discussion might be helpful.
Basically, it allows you to tell it what events you're interested in (for example, socket data arrival, keyboard event), and it'll block either forever, or until a timeout you specify elapses.
If you're using sockets, then adding the socket and stdin to the list of events you're interested in is easy. If you're just looking for a way to "conditionally sleep" for 60 seconds unless/until a keypress is detected, this would work just as well.
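For the "conditionally sleep unless socket data or a keypress arrives" case, a sketch (Unix-only, since select() on sys.stdin doesn't work on Windows; the command port is a placeholder):

import select
import socket
import sys

# A command socket; the port is a placeholder.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("", 9999))

# Block for up to 60 seconds, waking early if the command socket or
# the keyboard (stdin) has something for us.
readable, _, _ = select.select([sock, sys.stdin], [], [], 60)
if sock in readable:
    command = sock.recv(1024)        # handle a remote override
if sys.stdin in readable:
    line = sys.stdin.readline()      # handle a local keypress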
EDIT:
Another way to solve this would be to have your raspberry-pi "register" with the server running the web console. This could involve a little bit extra work, but it would give you the realtime effect you're looking for.
Basically, the raspberry-pi "registers" itself, by alerting the server about itself, and the server stores the address of the device. If using TCP, you could keep a connection open (which might be important if you have firewalls to deal with). If using UDP you could bind the port on the device before registering, allowing the server to respond to the source address of the "announcement".
Once announced, when config options change on the server, one of two things usually happens (see the sketch after this list):
A) You send a tiny "ping" (in the general sense, not the ICMP host detection protocol) to the device alerting it that config options have changed. At this point the host would immediately request the full config. set, acquiring the update with it.
B) You send the updated config. option (or maybe the entire config. set) back to the device. This decreases the number of messages between the device and server, but would probably take more work as it seems like more a deviation from your current setup.
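A bare-bones sketch of option A over UDP, from the device side (the ports, message strings, SERVER_IP and fetch_full_config are all invented for illustration):

import socket

SERVER_IP = "192.168.0.1"    # placeholder: the web-console server
ANNOUNCE_PORT = 9998         # hypothetical port the server listens on
PING_PORT = 9997             # hypothetical port the device listens on

def fetch_full_config():
    pass                     # placeholder: pull the updated config set

# Bind before announcing so the server can reply to our source address.
dev = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
dev.bind(("", PING_PORT))
dev.sendto(b"REGISTER", (SERVER_IP, ANNOUNCE_PORT))

# Later, in the main loop: a "PING" means the config changed.
msg, addr = dev.recvfrom(1024)
if msg == b"PING":
    fetch_full_config()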
Why not use an event-based loop instead of sleeping for a fixed amount of time?
That way your loop will only run when a change is detected, and it will always run when a change is detected (which is the point of your question?).
You can do such a thing by using:
python event objects
Just wait for one or all of your event objects to be triggered and run the loop. You can also wait for X events to be done, etc., depending on whether you expect one variable to be updated a lot.
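A minimal sketch with threading.Event, assuming some other thread (e.g. a listener for the web console) sets the event when a config variable changes; do_garden_work is a placeholder:

import threading

config_changed = threading.Event()

def do_garden_work():
    pass                             # placeholder for the real loop body

def run_garden():
    while True:
        do_garden_work()
        # Sleep up to 60s, but wake immediately if another thread
        # sets the event.
        if config_changed.wait(timeout=60):
            config_changed.clear()   # change detected: run again right away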
Or even a system like broadcasting events.
