First off, I've got the following code that... works. Apparently.
while not self.socket_connected:
    try:
        client_socket.connect((self.hostname, self.port))
        self.socket_connected = True
    except OSError:
        sleep(0.5)

while self.socket_connected:
    message = client_socket.recv(4096)
    if message == b'':
        client_socket.close()
        self.socket_connected = False
        break
    # ...do stuff
I say "apparently" because I'm reading conflicting sources about how one ought to implement sockets in Python.
Firstly, you've got information as here and here that would have you believe an empty buffer is a disconnected socket. That must've been what I read first (the code above is a few months old at this point, and my first serious attempt at sockets in Python).
However, there's also this post that seems a little better informed. That is, if the buffer is empty, it just means you've read everything available for now. Kind of like how I understand TCP to work in the first place. And maybe I missed it, but is that even mentioned in the docs?
Anyway... what I realized about my code is that, every time the buffer is empty, I drop the client-side socket and reconnect to read new information. That's obviously not ideal, and I'd like to change it.
In C, if recv returns zero, the buffer is empty. If it returns <0, something's gone wrong and you can destroy the file descriptor and attempt to reestablish the connection. How is one supposed to do the same in Python?
EDIT: Just as a bit more context - I've got the first five bytes of the messages being received here encoded to the size of the overall message, so I'll be able to test for 'done-ness' internally, provided that I can distinguish between an empty buffer and a dropped socket.
EDIT 2: What I'm asking specifically is how to check Python sockets for both an empty buffer as well as a dropped connection. Both should be handled differently, of course, and I need to make sure I'm getting the full message by possibly doing multiple recv() calls.
By default socket.recv is a blocking call: it suspends the thread until at least some data has been received, then returns up to the requested number of bytes from the buffer.
When the peer disconnects, recv stops blocking and returns the empty bytes object, b''.
A non-blocking socket with nothing to read raises an exception (BlockingIOError) when you call .recv.
So, to answer what I believe the question is: by default, socket.recv will return b'' when the client disconnects.
edit: To see if a socket is empty, you could disable blocking and then catch the exception raised by recv when the buffer is empty.
Alternatively, use the select module to sort your sockets into 3 lists: ready to read, ready to write, and sockets with errors.
https://docs.python.org/3/library/select.html#select.select
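The select-based approach can be sketched as follows. This is a minimal illustration, not code from the thread: the helper name poll_socket and the 4096-byte read size are my own choices. select tells us whether data is waiting, and a recv that then returns b'' means the peer has performed an orderly shutdown.

```python
import select
import socket

def poll_socket(sock, timeout=0.0):
    """Return "empty", "closed", or the received bytes.

    Assumes `sock` is a connected TCP socket; the name and the
    string return convention are just for illustration.
    """
    readable, _, errored = select.select([sock], [], [sock], timeout)
    if errored:
        return "closed"       # exceptional condition on the socket
    if not readable:
        return "empty"        # nothing to read right now; still connected
    data = sock.recv(4096)
    if data == b"":
        return "closed"       # peer closed the connection
    return data
```

This cleanly separates the "no data yet" case from the "dropped connection" case without tearing down and rebuilding the socket.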
I found a basic Space Invaders pygame on YouTube and modified it so that the server does all the processing and drawing while the client only sends keyboard input (all run on localhost). The problem is that the game is no longer responsive after I implemented this mechanism: there is about a one-second delay between pressing a key and the ship actually moving (when starting the game from PyCharm; when it starts from cmd it's much worse).
I have no idea why this is happening because there isn't really anything heavy to process, and I could really use your help.
I also monitored the Ethernet traffic in Wireshark, and there seem to be about 60-70 packets sent each second.
Here is the GitHub link with all the necessary things: https://github.com/PaaulFarcas/C-S-Game
I would expect this code in the main loop is the issue:
recv = conn.recv(661)
keys = pickle.loads(recv)
The socket call conn.recv(661) will block until at least some data has arrived (or there is some socket event, like the connection being closed), and it returns at most 661 bytes. So your program is blocking on every iteration of the main loop, waiting for data to arrive.
You could try using socket.setblocking( False ) as per the manual.
However I prefer to use the select module (manual link), as I like the better level of control it gives. Basically you can use it to know if any data has arrived on the socket (or if there's an error). This gives you a simple select-read-buffer type logic loop:
procedure receiveSocketData:
    Use select on the socket, with an immediate timeout.
    Did select indicate any data arrived on my socket?
        Read the data, appending it to a Rx-buffer
        Does the Rx-buffer contain enough for a whole packet?
            Take the packet-chunk from the head of the Rx-buffer
            Decode & return it
        Else:
            Keep the Rx-buffer somewhere safe
            Return None
    Did any errors happen on my socket?
        Clear the Rx-buffer
        Close the socket
        Return error
I guess with an unknown-sized packet you could try to un-pickle it and return OK when successful... this is quite inefficient, though. I would use a fixed-size packet and the struct module to pack and unpack it in network byte order.
Here's a basic idea of the threads that I am creating in my program:
Main thread
|
ListenerCreator(The WebSocketServer thread) ---> Several listener threads(using log())
So the main thread creates a ListenerCreator thread, which connects to a number of clients and creates a listener thread for each client. Here's briefly what a listener thread does:
EDIT1 :
I'm using WebSockets to read/write data off my client. I've made my own server for this purpose. There is a framing protocol which the standard specifies -- and I am using that. On the client side I am simply using WebSocket.send() and "unmasking" the messages according to the instructions given in the protocol(see section 5.3 in the link above).
I would be willing to provide the server code if someone requests it, however, here's a brief outline:
class WebSocketServer:
    def start():
        # Open server socket, bind to host:port
        while True:
            # Accept client socket, start a new listener thread for self.log(client)

    def log(client):
        # Receive data using socket.socket.recv(1024)
        # Unmask data as per the protocol
        # Decode using data.decode("utf-8")
        # Append to data_q while holding data_q_lock
There are other methods - those to facilitate sending, closing, handshaking and so on.
Meanwhile in the main thread:
while breaking != len(client_list):
    #time.sleep(0.5)
    with data_q_lock:
        for i in range(len(data_q)):
            mes = data_q.pop()
            for m in client_list:
                if "#DONE" == mes:
                    breaking += 1
                if mes[:len("#COUNT:")] == "#COUNT:":
                    print(mes)
So basically what this loop does is: loop through data_q; if a message starts with "#COUNT", print it; and after getting a certain number of "#DONE" messages, exit the loop.
If the time.sleep is uncommented, this code works; however, without the time.sleep I get a UnicodeDecodeError in the log function.
Also, I only get the error sometimes; other times the program works perfectly.
(The client is sending the same data every time, by the way.)
So, my question is: why is the time.sleep required?
I thought it had something to do with the GIL in Python, since time.sleep releases the GIL. However, even after reading about it I couldn't solve the question.
Currently there is no information about how the listener is reading data off the socket. It seems likely however that this is being caused by the usual misunderstanding of sockets.
Data sent down a socket is not "framed" in any way by the socket. Imagine if I sent the message "hello" three times down a socket. Then, like writing to a file without line breaks, the following would flow on the socket:
hellohellohello
Now consider the reader... when reading the data, how does it know where one message ("hello") ends and the next begins? It cannot, unless the sender and receiver agree on how the data should be "framed". This could be done by agreeing on some protocol like:
null-terminating data; or
fixed size messages; or
size prefixed messages.
It gets more complicated, of course: even once you've decided how the data should be framed, you cannot guarantee that socket.recv will return a "whole" message... it will simply return whatever data happens to be in the buffer at the time. It may be half a message, or a message and a half. It's your job to collate the data read from the socket and divide it into messages.
Turning to your problem: you are sending utf-8 data. How does the reader know it has read a full utf-8 message? Most likely, what is happening here is that you have only received a partial message... there is still more to arrive.
In particular, a valid utf-8 character may consist of more than one byte. So if your partial message ends in the middle of a multi-byte utf-8 representation of a character, you certainly cannot decode it.
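To make the point concrete, here is a small sketch of the "collate and divide" step. It assumes newline-framed utf-8 text, which is just one illustrative framing choice, and the helper name extract_messages is mine. Bytes after the last newline, including a half-received multi-byte character, simply stay in the buffer until more data arrives:

```python
def extract_messages(rx_buffer, chunk):
    """Collate recv() chunks and split out whole newline-framed messages.

    Returns (decoded_messages, leftover_buffer). Only complete lines are
    decoded, so we never try to decode a partial multi-byte character.
    """
    rx_buffer += chunk
    *complete, rx_buffer = rx_buffer.split(b"\n")
    return [m.decode("utf-8") for m in complete], rx_buffer
```

Your listener thread would call this on every recv and only append fully decoded messages to data_q, which removes the race that the time.sleep was accidentally papering over.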
This is a simple client-server example where the server returns whatever the client sends, but reversed.
Server:
import socketserver

class MyTCPHandler(socketserver.BaseRequestHandler):
    def handle(self):
        self.data = self.request.recv(1024)
        print('RECEIVED: ' + str(self.data))
        self.request.sendall(str(self.data)[::-1].encode('utf-8'))

server = socketserver.TCPServer(('localhost', 9999), MyTCPHandler)
server.serve_forever()
Client:
import socket
import threading

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('localhost', 9999))

def readData():
    while True:
        data = s.recv(1024)
        if data:
            print('Received: ' + data.decode('utf-8'))

t1 = threading.Thread(target=readData)
t1.start()

def sendData():
    while True:
        intxt = input()
        s.send(intxt.encode('utf-8'))

t2 = threading.Thread(target=sendData)
t2.start()
I took the server from an example I found on Google, but the client was written from scratch. The idea was having a client that can keep sending and receiving data from the server indefinitely.
Sending the first message with the client works. But when I try to send a second message, I get this error:
ConnectionAbortedError: [WinError 10053] An established connection was
aborted by the software in your host machine
What am I doing wrong?
For TCPServer, the handle method of the handler gets called once to handle the entire session. This may not be entirely clear from the documentation, but socketserver is, like many libraries in the stdlib, meant to serve as clear sample code as well as to be used directly, which is why the docs link to the source, where you can clearly see that it's only going to call handle once per connection (TCPServer.get_request is defined as just calling accept on the socket).
So, your server receives one buffer, sends back a response, and then the handler returns, which closes the connection.
To fix this, you need to use a loop:
def handle(self):
    while True:
        self.data = self.request.recv(1024)
        if not self.data:
            print('DISCONNECTED')
            break
        print('RECEIVED: ' + str(self.data))
        self.request.sendall(str(self.data)[::-1].encode('utf-8'))
A few side notes:
First, using BaseRequestHandler on its own only allows you to handle one client connection at a time. As the introduction in the docs says:
These four classes process requests synchronously; each request must be completed before the next request can be started. This isn’t suitable if each request takes a long time to complete, because it requires a lot of computation, or because it returns a lot of data which the client is slow to process. The solution is to create a separate process or thread to handle each request; the ForkingMixIn and ThreadingMixIn mix-in classes can be used to support asynchronous behaviour.
Those mixin classes are described further in the rest of the introduction, and farther down the page, and at the bottom, with a nice example at the end. The docs don't make it clear, but if you need to do any CPU-intensive work in your handler, you want ForkingMixIn; if you need to share data between handlers, you want ThreadingMixIn; otherwise it doesn't matter much which you choose.
Note that if you're trying to handle a large number of simultaneous clients (more than a couple dozen), neither forking nor threading is really appropriate—which means TCPServer isn't really appropriate. For that case, you probably want asyncio, or a third-party library (Twisted, gevent, etc.).
Calling str(self.data) is a bad idea. You're just going to get the source-code-compatible representation of the byte string, like b'spam\n'. What you want is to decode the byte string into the equivalent Unicode string: self.data.decode('utf8').
There's no guarantee that each sendall on one side will match up with a single recv on the other side. TCP is a stream of bytes, not a stream of messages; it's perfectly possible to get half a message in one recv, and two and a half messages in the next one. When testing with a single connection on localhost with the system under light load, it will probably appear to "work", but as soon as you try to deploy any code that assumes that each recv gets exactly one message, your code will break. See Sockets are byte streams, not message streams for more details. Note that if your messages are just lines of text (as they are in your example), using StreamRequestHandler and its rfile attribute, instead of BaseRequestHandler and its request attribute, solves this problem trivially.
You probably want to set server.allow_reuse_address = True. Otherwise, if you quit the server and re-launch it again too quickly, it'll fail with an error like OSError: [Errno 48] Address already in use.
I was wondering if there is a way I can tell python to wait until it gets a response from a server to continue running.
I am writing a turn based game. I make the first move and it sends the move to the server and then the server to the other computer. The problem comes here. As it is no longer my turn I want my game to wait until it gets a response from the server (wait until the other player makes a move). But my line:
data=self.sock.recv(1024)
hangs because (I think) it's not getting anything immediately. So I want to know how I can make it wait for something to happen and then keep going.
Thanks in advance.
The socket programming howto is relevant to this question, specifically this part:
Now we come to the major stumbling block of sockets - send and recv operate on the
network buffers. They do not necessarily handle all the bytes you hand them (or expect
from them), because their major focus is handling the network buffers. In general, they
return when the associated network buffers have been filled (send) or emptied (recv).
They then tell you how many bytes they handled. It is your responsibility to call them
again until your message has been completely dealt with.
...
One complication to be aware of: if your conversational protocol allows multiple
messages to be sent back to back (without some kind of reply), and you pass recv an
arbitrary chunk size, you may end up reading the start of a following message. You’ll
need to put that aside and hold onto it, until it's needed.
Prefixing the message with its length (say, as 5 numeric characters) gets more complex,
because (believe it or not), you may not get all 5 characters in one recv. In playing
around, you’ll get away with it; but in high network loads, your code will very quickly
break unless you use two recv loops - the first to determine the length, the second to
get the data part of the message. Nasty. This is also when you’ll discover that send
does not always manage to get rid of everything in one pass. And despite having read
this, you will eventually get bit by it!
The main takeaways from this are:
you'll need to establish either a FIXED message size, OR you'll need to send the size of the message at the beginning of the message
when calling socket.recv, pass the number of bytes you actually want (and I'm guessing you don't actually want 1024 bytes). Then use LOOPs, because you are not guaranteed to get all you want in a single call.
That line, sock.recv(1024), blocks until at least one byte has been received or the OS detects a socket error; it returns at most 1024 bytes, not necessarily a whole message. You need some way to know the message size -- this is why HTTP messages include a Content-Length header.
You can set a timeout with socket.settimeout to abort reading entirely if the expected number of bytes doesn't arrive before a timeout.
You can also explore Python's non-blocking sockets using setblocking(0).
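For example, settimeout turns the blocking wait into one that gives up after a deadline; the wrapper name recv_with_timeout here is just for illustration:

```python
import socket

def recv_with_timeout(sock, nbytes, timeout):
    """Abort the read if nothing arrives within `timeout` seconds."""
    sock.settimeout(timeout)
    try:
        return sock.recv(nbytes)
    except socket.timeout:
        return None            # caller decides: retry, or treat as an error
    finally:
        sock.settimeout(None)  # restore blocking mode
```

A None return lets the game loop do other work (redraw the screen, poll for quit events) between attempts instead of freezing.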
I have two applications interacting over a TCP/IP connection; now I need them to be able to interact over a serial connection as well.
There are a few differences between socket IO and serial IO that make porting less trivial than I hoped for.
One of the differences concerns the semantics of send/write timeouts and the assumptions an application may make about the amount of data successfully passed down the connection. Knowing this amount, the application also knows what leftover data it needs to transmit later, should it choose to.
Socket.send
A call like socket.send(string) may produce the following results:
1. The entire string has been accepted by the TCP/IP stack, and the length of the string is returned.
2. A part of the string has been accepted by the TCP/IP stack, and the length of that part is returned. The application may transmit the rest of the string later.
3. A socket.timeout exception is raised if the socket is configured to use timeouts and the sender overwhelms the connection with data. This means (if I understand it correctly) that no bytes of the string have been accepted by the TCP/IP stack, and hence the application may try to send the entire string later.
4. A socket.error exception is raised because of some issues with the connection.
PySerial.Serial.write
The PySerial API documentation says the following about Serial.write(string):
write(data)
Parameters:
data – Data to send.
Returns:
Number of bytes written.
Raises
SerialTimeoutException:
In case a write timeout is configured for the port and the time is exceeded.
Changed in version 2.5: Write returned None in previous versions.
This spec leaves a few questions uncertain to me:
1. In which circumstances may "write(data)" return fewer bytes written than the length of the data? Is it only possible in non-blocking mode (writeTimeout=0)?
2. If I use a positive writeTimeout and the SerialTimeoutException is raised, how do I know how many bytes went into the connection?
I also observe some behaviors of serial.write that I did not expect.
The test tries sending a long string over a slow connection. The sending port uses 9600,8,N,1 and no flow control. The receiving port is open too but no attempts to read data from it are being made.
If the writeTimeout is positive but not large enough, the sender expectedly gets the SerialTimeoutException.
If the writeTimeout is set large enough, the sender expectedly gets all data written successfully (the receiver does not care to read, and neither do we).
If the writeTimeout is set to None, the sender unexpectedly gets the SerialTimeoutException instead of blocking until all data goes down the connection. Am I missing something?
I do not know if that behavior is typical.
In case that matters, I experiment with PySerial on Windows 7 64-bit using two USB-to-COM adapters connected via a null-modem cable; that setup seems to be operational as two instances of Tera Term can talk to each other over it.
It would be helpful to know if people handle serial write timeouts in any way other than aborting the connection and notifying the user of the problem.
Since I currently do not know the amount of data written before the timeout has occurred, I am thinking of a workaround using non-blocking writes and maintaining the socket-like timeout semantics myself above that level. I do not expect this to be a terrifically efficient solution (:-)), but luckily my applications exchange relatively infrequent and short messages so the performance should be within the acceptable range.
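The workaround described above could be sketched as follows, under the assumption that the port was opened in non-blocking mode (write_timeout=0) so that port.write returns the number of bytes it could queue immediately; write_with_timeout is a hypothetical name:

```python
import time

def write_with_timeout(port, data, timeout):
    """Socket-like send semantics on top of non-blocking serial writes.

    Assumes `port` is a non-blocking serial port whose write() returns
    the number of bytes it accepted. Returns the number of bytes actually
    written before the deadline, so the caller knows what leftover data
    it still needs to transmit.
    """
    deadline = time.monotonic() + timeout
    written = 0
    while written < len(data):
        written += port.write(data[written:])
        if written < len(data):
            if time.monotonic() >= deadline:
                break
            time.sleep(0.01)  # give the UART a moment to drain
    return written
```

Unlike Serial.write with a positive writeTimeout, a partial result here tells the caller exactly how much remains to be retransmitted.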
[EDITED]
A closer look at non-blocking serial writes
I wrote a simple program to see if I understand how the non-blocking write works:
import serial
p1 = serial.Serial("COM11") # My USB-to-COM adapters appear at these high port numbers
p2 = serial.Serial("COM12")
message = "Hello! " * 10
print "%d bytes in the whole message: %r" % (len(message), message)
p1.writeTimeout = 0 # enabling non-blocking mode
bytes_written = p1.write(message)
print "Written %d bytes of the message: %r" % (bytes_written, message[:bytes_written])
print "Receiving back %d bytes of the message" % len(message)
message_read_back = p2.read(len(message))
print "Received back %d bytes of the message: %r" % (len(message_read_back), message_read_back)
p1.close()
p2.close()
The output I get is this:
70 bytes in the whole message: 'Hello! Hello! Hello! Hello! Hello! Hello! Hello! Hello! Hello! Hello! '
Written 0 bytes of the message: ''
Receiving back 70 bytes of the message
Received back 70 bytes of the message: 'Hello! Hello! Hello! Hello! Hello! Hello! Hello! Hello! Hello! Hello! '
I am very confused: the sender thinks no data was sent yet the receiver got it all. I must be missing something very fundamental here...
Any comments / suggestions / questions are very welcome!
Since it isn't documented, let's look at the source code. I only looked at the POSIX and Win32 implementations, but it's pretty obvious that on at least those two platforms:
There are no circumstances when write(data) may return fewer bytes written than the length of the data, timeout or otherwise; it always either returns the full len(data), or raises an exception.
If you use a positive writeTimeout and the SerialTimeoutException is raised, there is no way at all to tell how many bytes were sent.
In particular, on POSIX, the number of bytes sent so far is only stored on a local variable that's lost as soon as the exception is raised; on Windows, it just does a single overlapped WriteFile and raises an exception for anything but a successful "wrote everything".
I assume that you care about at least one of those two platforms. (And if not, you're probably not writing cross-platform code, and can look at the one platform you do care about.) So, there is no direct solution to your problem.
If the workaround you described is acceptable, or a different one (like writing exactly one byte at a time—which is probably even less efficient, but maybe simpler), do that.
Alternatively, you will have to edit the write implementations you care about (whether you do this by forking the package and editing your fork, monkeypatching Serial.write at runtime, or just writing a serial_write function and calling serial_write(port, data) instead of port.write(data) in your script) to provide the information you want.
That doesn't look too hard. For example, in the POSIX version, you just have to stash len(data)-t somewhere before either of the raise writeTimeoutError lines. You could stick it in an attribute of the Serial object, or pass it as an extra argument to the exception constructor. (Of course if you're trying to write a cross-platform program, and you don't know all of the platforms well enough to write the appropriate implementations, that isn't likely to be a good answer.)
And really, given that it's not that hard to implement what you want, you might want to add a feature request (ideally with a patch) on the pyserial tracker.