Tornado websocket client loosing response messages?

Tornado websocket client loosing response messages? - python

I need to process frames from a webcam and send a few selected frames to a remote websocket server. The server answers immediately with a confirmation message (much like an echo server).
Frame processing is slow and cpu intensive so I want to do it using a separate thread pool (producer) to use all the available cores. So the client (consumer) just sits idle until the pool has something to send.
My current implementation, see below, works fine only if I add a small sleep inside the producer test loop. If I remove this delay I stop receiving any answer from the server (both the echo server and from my real server). Even the first answer is lost, so I do not think this is a flood protection mechanism.
What am I doing wrong?
import tornado
from tornado.websocket import websocket_connect
from tornado import gen, queues
import time
class TornadoClient(object):
url = None
onMessageReceived = None
onMessageSent = None
ioloop = tornado.ioloop.IOLoop.current()
q = queues.Queue()
def __init__(self, url, onMessageReceived, onMessageSent):
self.url = url
self.onMessageReceived = onMessageReceived
self.onMessageSent = onMessageSent
def enqueueMessage(self, msgData, binary=False):
print("TornadoClient.enqueueMessage")
self.ioloop.add_callback(self.addToQueue, (msgData, binary))
print("TornadoClient.enqueueMessage done")
#gen.coroutine
def addToQueue(self, msgTuple):
yield self.q.put(msgTuple)
#gen.coroutine
def main_loop(self):
connection = None
try:
while True:
while connection is None:
try:
print("Connecting...")
connection = yield websocket_connect(self.url)
print("Connected " + str(connection))
except Exception, e:
print("Exception on connection " + str(e))
connection = None
print("Retry in a few seconds...")
yield gen.Task(self.ioloop.add_timeout, time.time() + 3)
try:
print("Waiting for data to send...")
msgData, binaryVal = yield self.q.get()
print("Writing...")
sendFuture = connection.write_message(msgData, binary=binaryVal)
print("Write scheduled...")
finally:
self.q.task_done()
yield sendFuture
self.onMessageSent("Sent ok")
print("Write done. Reading...")
msg = yield connection.read_message()
print("Got msg.")
self.onMessageReceived(msg)
if msg is None:
print("Connection lost")
connection = None
print("main loop completed")
except Exception, e:
print("ExceptionExceptionException")
print(e)
connection = None
print("Exit main_loop function")
def start(self):
self.ioloop.run_sync(self.main_loop)
print("Main loop completed")
######### TEST METHODS #########
def sendMessages(client):
time.sleep(2) #TEST only: wait for client startup
while True:
client.enqueueMessage("msgData", binary=False)
time.sleep(1) # <--- comment this line to break it
def testPrintMessage(msg):
print("Received: " + str(msg))
def testPrintSentMessage(msg):
print("Sent: " + msg)
if __name__=='__main__':
from threading import Thread
client = TornadoClient("ws://echo.websocket.org", testPrintMessage, testPrintSentMessage)
thread = Thread(target = sendMessages, args = (client, ))
thread.start()
client.start()
My real problem
In my real program I use a "window like" mechanism to protect the consumer (an autobahn.twisted.websocket server): the producer can send up to a maximum number of un-acknowledge messages (the webcam frames), then stops waiting for half of the window to free up.
The consumer sends a "PROCESSED" message back acknowleding one or more messages (just a counter, not by id).
What I see on the consumer log is that the messages are processed and the answer is sent back but these acks vanish somewhere in the network.
I have little experience with asynchio so I wanted to be sure that I'm not missing any yield, annotation or something else.
This is the consumer side log:
2017-05-13 18:59:54+0200 [-] TX Frame to tcp4:192.168.0.5:48964 : fin = True, rsv = 0, opcode = 1, mask = -, length = 21, repeat_length = None, chopsize = None, sync = False, payload = {"type": "PROCESSED"}
2017-05-13 18:59:54+0200 [-] TX Octets to tcp4:192.168.0.5:48964 : sync = False, octets = 81157b2274797065223a202250524f434553534544227d

This is neat code. I believe the reason you need a sleep in your sendMessages thread is because, otherwise, it keeps calling enqueueMessage as fast as possible, millions of times per second. Since enqueueMessage does not wait for the enqueued message to be processed, it keeps calling IOLoop.add_callback as fast as it can, without giving the loop enough opportunity to execute the callbacks.
The loop might make some progress running on the main thread, since you're not actually blocking it. But the sendMessages thread adds callbacks much faster than the loop can handle them. By the time the loop has popped one message from the queue and has begun to process it, millions of new callbacks are added already, which the loop must execute before it can advance to the next stage of message-processing.
Therefore, for your test code, I think it's correct to sleep between calls to enqueueMessage on the thread.

Related

ZMQ pair (for signaling) is blocking because of bad connection

I have two threads. One is a Worker Thread, the other a Communication Thread.
The Worker Thread is reading data off a serial port, doing some processing, and then enqueueing the results to be sent to a server.
The Communication Tthread is reading the results off the queue, and sending it. The challenge is that connectivity is wireless, and although usually present, it can be spotty (dropping in and out of range for a few minutes), and I don't want to block Worker Thread if I lose connectivity.
The pattern I have chosen for this, is as follows:
Worker Thread has an enqueue method which adds the message to a Queue, then send a signal to inproc://signal using a zmq.PAIR.
Communication Thread uses zmq.DEALER to communicate to the server (a zmq.ROUTER), but polls the inproc://signal pair in order to register whether there is a new message needing sending or not.
The following is a simplified example of the pattern:
import Queue
import zmq
import time
import threading
import simplejson
class ZmqPattern():
def __init__(self):
self.q_out = Queue.Queue()
self.q_in = Queue.Queue()
self.signal = None
self.API_KEY = 'SOMETHINGCOMPLEX'
self.zmq_comm_thr = None
def start_zmq_signal(self):
self.context = zmq.Context()
# signal socket for waking the zmq thread to send messages to the relay
self.signal = self.context.socket(zmq.PAIR)
self.signal.bind("inproc://signal")
def enqueue(self, msg):
print("> pre-enqueue")
self.q_out.put(msg)
print("< post-enqueue")
print(") send sig")
self.signal.send(b"")
print("( sig sent")
def communication_thread(self, q_out):
poll = zmq.Poller()
self.endpoint_url = 'tcp://' + '127.0.0.1' + ':' + '9001'
wake = self.context.socket(zmq.PAIR)
wake.connect("inproc://signal")
poll.register(wake, zmq.POLLIN)
self.socket = self.context.socket(zmq.DEALER)
self.socket.setsockopt(zmq.IDENTITY, self.API_KEY)
self.socket.connect(self.endpoint_url)
poll.register(self.socket, zmq.POLLIN)
while True:
sockets = dict(poll.poll())
if self.socket in sockets:
message = self.socket.recv()
message = simplejson.loads(message)
# Incomming messages which need to be handled on the worker thread
self.q_in.put(message)
if wake in sockets:
wake.recv()
while not q_out.empty():
print(">> Popping off Queue")
message = q_out.get()
print(">>> Popped off Queue")
message = simplejson.dumps(message)
print("<<< About to be sent")
self.socket.send(message)
print("<< Sent")
def start(self):
self.start_zmq_signal()
# ZMQ Thread
self.zmq_comm_thr = threading.Thread(target=self.communication_thread, args=([self.q_out]))
self.zmq_comm_thr.daemon = True
self.zmq_comm_thr.name = "ZMQ Thread"
self.zmq_comm_thr.start()
if __name__ == '__main__':
test = ZmqPattern()
test.start()
print '###############################################'
print '############## Starting comms #################'
print "###############################################"
last_debug = time.time()
test_msg = {}
for c in xrange(1000):
key = 'something{}'.format(c)
val = 'important{}'.format(c)
test_msg[key] = val
while True:
test.enqueue(test_msg)
if time.time() - last_debug > 1:
last_debug = time.time()
print "Still alive..."
If you run this, you'll see the dealer blocks as there is no router on the other end, and shortly after, the pair blocks as the Communication Thread isn't receiving
How should I best set up the inproc zmq to not block Worker Thread.
FYI, the most the entire system would need to buffer is in the order of 200k messages, and each message is around 256 bytes.

The dealer socket has a limit on the number of messages it will store, called the high water mark. Right below your dealer socket creation, try:
self.socket = self.context.socket(zmq.DEALER)
self.socket.setsockopt(zmq.SNDHWM, 200000)
And set that number as high as you dare; the limit is your machine's memory.
EDIT:
Some good discussion of high water marks in this question:
Majordomo broker: handling large number of connections

What is the best way to kill a looping python thread on exception?

I wrote a program that uses threads to keep a connection alive while the main program loops until it either has an exception or is manually closed. My program runs in 1 hour intervals and the timeout for the connection is 20 minutes, thus I spawn a thread for every connection element that exist inside of my architecture. Thus, if we have two servers to connect to it connects to both these serves and stays connected and loops through each server retrieving data.
the program I wrote works correctly, however I can't seem to find a way to handle when the program it's self throws an exception. This is to say I can't find an appropriate way to dispose of the threads when the main program excepts. When the program excepts it will just hang open because of the thread not excepting as well and it won't close correctly and will have to be closed manually.
Any suggestions on how to handle cleaning up threads on program exit?
This is my thread:
def keep_vc_alive(vcenter,credentials, api):
vm_url = str(vcenter._proxy.binding.url).split('/')[2]
while True:
try:
logging.info('staying connected %s' % str(vm_url))
vcenter.keep_session_alive()
except:
logging.info('unable to call current time of vcenter %s attempting to reconnect.' % str(vm_url))
try:
vcenter = None
connected,api_version,uuid,vcenter = vcenter_open(60, api, * credentials)
except:
logging.critical('unable to call current time of vcenter %s killing application, please have administrator restart the module.' % str(vm_url))
break
time.sleep(60*10)
Then my exception clean up code is as follows, obviously I know.stop() doesn't work, but I honestly have no idea how to do what it is im trying to do.
except Abort: # Exit without clearing the semaphore
logging.exception('ApplicationError')
try:
config_values_vc = metering_config('VSphere',['vcenter-ip','username','password','api-version'])
for k in xrange(0, len(config_values_vc['username'])): # Loop through each vcenter server
vc_thread[config_values_vc['vcenter-ip'][k]].stop()
except:
pass
#disconnect vcenter
try:
for vcenter in list_of_vc_connections:
list_of_vc_connections[vcenter].disconnect()
except:
pass
try: # Close the db is it is open (db is defined)
db.close()
except:
pass
sys.exit(1)
except SystemExit:
raise
except:
logging.exception('ApplicationError')
semaphore('ComputeLoader', False)
logging.critical('Unexpected error: %s' % sys.exc_info()[0])
raise

Instead of sleeping, wait on a threading.Event():
def keep_vc_alive(vcenter,credentials, api, event): # event is a threading.Event()
vm_url = str(vcenter._proxy.binding.url).split('/')[2]
while not event.is_set(): # If the event got set, we exit the thread
try:
logging.info('staying connected %s' % str(vm_url))
vcenter.keep_session_alive()
except:
logging.info('unable to call current time of vcenter %s attempting to reconnect.' % str(vm_url))
try:
vcenter = None
connected,api_version,uuid,vcenter = vcenter_open(60, api, * credentials)
except:
logging.critical('unable to call current time of vcenter %s killing application, please have administrator restart the module.' % str(vm_url))
break
event.wait(timeout=60*10) # Wait until the timeout expires, or the event is set.
Then, in your main thread, set the event in the exception handling code:
except Abort: # Exit without clearing the semaphore
logging.exception('ApplicationError')
event.set() # keep_alive thread will wake up, see that the event is set, and exit

The generally accepted way to stop threads in python is to use the threading.Event object.
The algorithm followed usually is something like the following:
import threading
...
threads = []
#in the main program
stop_event = threading.Event()
#create thread and store thread and stop_event together
thread = threading.Thread(target=keep_vc_alive, args=(stop_event))
threads.append((thread, stop_event))
#execute thread
thread.start()
...
#in thread (i.e. keep_vc_alive)
# check is_set in stop_event
while not stop_event.is_set():
#receive data from server, etc
...
...
#in exception handler
except Abort:
#set the stop_events
for thread, stop_event in threads:
stop_event.set()
#wait for threads to stop
while 1:
#check for any alive threads
all_finished = True
for thread in threads:
if thread.is_alive():
all_finished = False
#keep cpu down
time.sleep(1)

Have a function time out if a certain condition is not fulfilled in time

The issue I have is that my chat client is supposed to recieve and print data from server when the server sends it, and then allow the client to reply.
This works fine, except that the entire process stops when the client is prompted to reply. So messages pile up until you type something, and after you do that, then it prints all the recieved messages.
Not sure how to fix this, so I decided why not have the client's time to type a reply timeout after 5 seconds, so that the replies can come through regardless. It's pretty flawed, because the input will reset itself, but it works better anyways.
Here's the function that needs to have a timeout:
# now for outgoing data
def outgoing():
global out_buffer
while 1:
user_input=input("your message: ")+"\n"
if user_input:
out_buffer += [user_input.encode()]
# for i in wlist:
s.send(out_buffer[0])
out_buffer = []
How should I go about using a timeout? I was thinking of using time.sleep, but that just pauses the entire operation.
I tried looking for documentation. But I didn't find anything that would help me make the program count up to a set limit, then continue.
Any idea's about how to solve this? (Doesn't need to use a timeout, just needs to stop the message pileup before the clients reply can be sent) (Thanks to all who helped me get this far)
For Ionut Hulub:
from socket import *
import threading
import json
import select
import signal # for trying to create timeout
print("client")
HOST = input("connect to: ")
PORT = int(input("on port: "))
# create the socket
s = socket(AF_INET, SOCK_STREAM)
s.connect((HOST, PORT))
print("connected to:", HOST)
#--------- need 2 threads for handling incoming and outgoing messages--
# 1: create out_buffer:
out_buffer = []
# for incoming data
def incoming():
rlist,wlist,xlist = select.select([s], out_buffer, [])
while 1:
for i in rlist:
data = i.recv(1024)
if data:
print("\nreceived:", data.decode())
# now for outgoing data
def outgoing():
global out_buffer
while 1:
user_input=input("your message: ")+"\n"
if user_input:
out_buffer += [user_input.encode()]
# for i in wlist:
s.send(out_buffer[0])
out_buffer = []
thread_in = threading.Thread(target=incoming, args=())
thread_out = threading.Thread(target=outgoing, args=())
thread_in.start() # this causes the thread to run
thread_out.start()
thread_in.join() # this waits until the thread has completed
thread_out.join()

We can use signals for the same. I think the below example will be useful for you.
import signal
def timeout(signum, frame):
raise Exception
#this is an infinite loop, never ending under normal circumstances
def main():
print 'Starting Main ',
while 1:
print 'in main ',
#SIGALRM is only usable on a unix platform
signal.signal(signal.SIGALRM, timeout)
#change 5 to however many seconds you need
signal.alarm(5)
try:
main()
except:
print "whoops"

Thread synchronization in Python

I am currently working on a school project where the assignment, among other things, is to set up a threaded server/client system. Each client in the system is supposed to be assigned its own thread on the server when connecting to it. In addition i would like the server to run other threads, one concerning input from the command line and another concerning broadcasting messages to all clients. However, I can't get this to run as i want to. It seems like the threads are blocking each other. I would like my program to take inputs from the command line, at the "same time" as the server listens to connected clients, and so on.
I am new to python programming and multithreading, and allthough I think my idea is good, I'm not suprised my code doesn't work. Thing is I'm not exactly sure how I'm going to implement the message passing between the different threads. Nor am I sure exactly how to implement the resource lock commands properly. I'm going to post the code for my server file and my client file here, and I hope someone could help me with this. I think this actually should be two relative simple scripts. I have tried to comment on my code as good as possible to some extend.
import select
import socket
import sys
import threading
import client
class Server:
#initializing server socket
def __init__(self, event):
self.host = 'localhost'
self.port = 50000
self.backlog = 5
self.size = 1024
self.server = None
self.server_running = False
self.listen_threads = []
self.local_threads = []
self.clients = []
self.serverSocketLock = None
self.cmdLock = None
#here i have also declared some events for the command line input
#and the receive function respectively, not sure if correct
self.cmd_event = event
self.socket_event = event
def openSocket(self):
#binding server to port
try:
self.server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
self.server.bind((self.host, self.port))
self.server.listen(5)
print "Listening to port " + str(self.port) + "..."
except socket.error, (value,message):
if self.server:
self.server.close()
print "Could not open socket: " + message
sys.exit(1)
def run(self):
self.openSocket()
#making Rlocks for the socket and for the command line input
self.serverSocketLock = threading.RLock()
self.cmdLock = threading.RLock()
#set blocking to non-blocking
self.server.setblocking(0)
#making two threads always running on the server,
#one for the command line input, and one for broadcasting (sending)
cmd_thread = threading.Thread(target=self.server_cmd)
broadcast_thread = threading.Thread(target=self.broadcast,args=[self.clients])
cmd_thread.daemon = True
broadcast_thread.daemon = True
#append the threads to thread list
self.local_threads.append(cmd_thread)
self.local_threads.append(broadcast_thread)
cmd_thread.start()
broadcast_thread.start()
self.server_running = True
while self.server_running:
#connecting to "knocking" clients
try:
c = client.Client(self.server.accept())
self.clients.append(c)
print "Client " + str(c.address) + " connected"
#making a thread for each clientn and appending it to client list
listen_thread = threading.Thread(target=self.listenToClient,args=[c])
self.listen_threads.append(listen_thread)
listen_thread.daemon = True
listen_thread.start()
#setting event "client has connected"
self.socket_event.set()
except socket.error, (value, message):
continue
#close threads
self.server.close()
print "Closing client threads"
for c in self.listen_threads:
c.join()
def listenToClient(self, c):
while self.server_running:
#the idea here is to wait until the thread gets the message "client
#has connected"
self.socket_event.wait()
#then clear the event immidiately...
self.socket_event.clear()
#and aquire the socket resource
self.serverSocketLock.acquire()
#the below is the receive thingy
try:
recvd_data = c.client.recv(self.size)
if recvd_data == "" or recvd_data == "close\n":
print "Client " + str(c.address) + (" disconnected...")
self.socket_event.clear()
self.serverSocketLock.release()
return
print recvd_data
#I put these here to avoid locking the resource if no message
#has been received
self.socket_event.clear()
self.serverSocketLock.release()
except socket.error, (value, message):
continue
def server_cmd(self):
#this is a simple command line utility
while self.server_running:
#got to have a smart way to make this work
self.cmd_event.wait()
self.cmd_event.clear()
self.cmdLock.acquire()
cmd = sys.stdin.readline()
if cmd == "":
continue
if cmd == "close\n":
print "Server shutting down..."
self.server_running = False
self.cmdLock.release()
def broadcast(self, clients):
while self.server_running:
#this function will broadcast a message received from one
#client, to all other clients, but i guess any thread
#aspects applied to the above, will work here also
try:
send_data = sys.stdin.readline()
if send_data == "":
continue
else:
for c in clients:
c.client.send(send_data)
self.serverSocketLock.release()
self.cmdLock.release()
except socket.error, (value, message):
continue
if __name__ == "__main__":
e = threading.Event()
s = Server(e)
s.run()
And then the client file
import select
import socket
import sys
import server
import threading
class Client(threading.Thread):
#initializing client socket
def __init__(self,(client,address)):
threading.Thread.__init__(self)
self.client = client
self.address = address
self.size = 1024
self.client_running = False
self.running_threads = []
self.ClientSocketLock = None
def run(self):
#connect to server
self.client.connect(('localhost',50000))
#making a lock for the socket resource
self.clientSocketLock = threading.Lock()
self.client.setblocking(0)
self.client_running = True
#making two threads, one for receiving messages from server...
listen = threading.Thread(target=self.listenToServer)
#...and one for sending messages to server
speak = threading.Thread(target=self.speakToServer)
#not actually sure wat daemon means
listen.daemon = True
speak.daemon = True
#appending the threads to the thread-list
self.running_threads.append(listen)
self.running_threads.append(speak)
listen.start()
speak.start()
#this while-loop is just for avoiding the script terminating
while self.client_running:
dummy = 1
#closing the threads if the client goes down
print "Client operating on its own"
self.client.close()
#close threads
for t in self.running_threads:
t.join()
return
#defining "listen"-function
def listenToServer(self):
while self.client_running:
#here i acquire the socket to this function, but i realize I also
#should have a message passing wait()-function or something
#somewhere
self.clientSocketLock.acquire()
try:
data_recvd = self.client.recv(self.size)
print data_recvd
except socket.error, (value,message):
continue
#releasing the socket resource
self.clientSocketLock.release()
#defining "speak"-function, doing much the same as for the above function
def speakToServer(self):
while self.client_running:
self.clientSocketLock.acquire()
try:
send_data = sys.stdin.readline()
if send_data == "close\n":
print "Disconnecting..."
self.client_running = False
else:
self.client.send(send_data)
except socket.error, (value,message):
continue
self.clientSocketLock.release()
if __name__ == "__main__":
c = Client((socket.socket(socket.AF_INET, socket.SOCK_STREAM),'localhost'))
c.run()
I realize this is quite a few code lines for you to read through, but as I said, I think the concept and the script in it self should be quite simple to understand. It would be very much appriciated if someone could help me synchronize my threads in a proper way =)
Thanks in advance
---Edit---
OK. So I now have simplified my code to just containing send and receive functions in both the server and the client modules. The clients connecting to the server gets their own threads, and the send and receive functions in both modules operetes in their own separate threads. This works like a charm, with the broadcast function in the server module echoing strings it gets from one client to all clients. So far so good!
The next thing i want my script to do, is taking specific commands, i.e. "close", in the client module to shut down the client, and join all running threads in the thread list. Im using an event flag to notify the listenToServer and the main thread that the speakToServer thread has read the input "close". It seems like the main thread jumps out of its while loop and starts the for loop that is supposed to join the other threads. But here it hangs. It seems like the while loop in the listenToServer thread never stops even though server_running should be set to False when the event flag is set.
I'm posting only the client module here, because I guess an answer to get these two threads to synchronize will relate to synchronizing more threads in both the client and the server module also.
import select
import socket
import sys
import server_bygg0203
import threading
from time import sleep
class Client(threading.Thread):
#initializing client socket
def __init__(self,(client,address)):
threading.Thread.__init__(self)
self.client = client
self.address = address
self.size = 1024
self.client_running = False
self.running_threads = []
self.ClientSocketLock = None
self.disconnected = threading.Event()
def run(self):
#connect to server
self.client.connect(('localhost',50000))
#self.client.setblocking(0)
self.client_running = True
#making two threads, one for receiving messages from server...
listen = threading.Thread(target=self.listenToServer)
#...and one for sending messages to server
speak = threading.Thread(target=self.speakToServer)
#not actually sure what daemon means
listen.daemon = True
speak.daemon = True
#appending the threads to the thread-list
self.running_threads.append((listen,"listen"))
self.running_threads.append((speak, "speak"))
listen.start()
speak.start()
while self.client_running:
#check if event is set, and if it is
#set while statement to false
if self.disconnected.isSet():
self.client_running = False
#closing the threads if the client goes down
print "Client operating on its own"
self.client.shutdown(1)
self.client.close()
#close threads
#the script hangs at the for-loop below, and
#refuses to close the listen-thread (and possibly
#also the speak thread, but it never gets that far)
for t in self.running_threads:
print "Waiting for " + t[1] + " to close..."
t[0].join()
self.disconnected.clear()
return
#defining "speak"-function
def speakToServer(self):
#sends strings to server
while self.client_running:
try:
send_data = sys.stdin.readline()
self.client.send(send_data)
#I want the "close" command
#to set an event flag, which is being read by all other threads,
#and, at the same time set the while statement to false
if send_data == "close\n":
print "Disconnecting..."
self.disconnected.set()
self.client_running = False
except socket.error, (value,message):
continue
return
#defining "listen"-function
def listenToServer(self):
#receives strings from server
while self.client_running:
#check if event is set, and if it is
#set while statement to false
if self.disconnected.isSet():
self.client_running = False
try:
data_recvd = self.client.recv(self.size)
print data_recvd
except socket.error, (value,message):
continue
return
if __name__ == "__main__":
c = Client((socket.socket(socket.AF_INET, socket.SOCK_STREAM),'localhost'))
c.run()
Later on, when I get this server/client system up and running, I will use this system on some elevator models we have here on the lab, with each client receiving floor orders or "up" and "down" calls. The server will be running an distribution algorithm and updating the elevator queues on the clients that are most appropriate for the requested order. I realize it's a long way to go, but I guess one should just take one step at the time =)
Hope someone has the time to look into this. Thanks in advance.

The biggest problem I see with this code is that you have far too much going on right away to easily debug your problem. Threading can get extremely complicated because of how non-linear the logic becomes. Especially when you have to worry about synchronizing with locks.
The reason you are seeing clients blocking on each other is because of the way you are using your serverSocketLock in your listenToClient() loop in the server. To be honest this isn't exactly your problem right now with your code, but it became the problem when I started to debug it and turned the sockets into blocking sockets. If you are putting each connection into its own thread and reading from them, then there is no reason to use a global server lock here. They can all read from their own sockets at the same time, which is the purpose of the thread.
Here is my recommendation to you:
Get rid of all the locks and extra threads that you don't need, and start from the beginning
Have the clients connect as you do, and put them in their thread as you do. And simply have them send data every second. Verify that you can get more than one client connecting and sending, and that your server is looping and receiving. Once you have this part working, you can move on to the next part.
Right now you have your sockets set to non-blocking. This is causing them all to spin really fast over their loops when data is not ready. Since you are threading, you should set them to block. Then the reader threads will simply sit and wait for data and respond immediately.
Locks are used when threads will be accessing shared resources. You obviously need to for any time a thread will try and modify a server attribute like a list or a value. But not when they are working on their own private sockets.
The event you are using to trigger your readers doesn't seem necessary here. You have received the client, and you start the thread afterwards. So it is ready to go.
In a nutshell...simplify and test one bit at a time. When its working, add more. There are too many threads and locks right now.
Here is a simplified example of your listenToClient method:
def listenToClient(self, c):
while self.server_running:
try:
recvd_data = c.client.recv(self.size)
print "received:", c, recvd_data
if recvd_data == "" or recvd_data == "close\n":
print "Client " + str(c.address) + (" disconnected...")
return
print recvd_data
except socket.error, (value, message):
if value == 35:
continue
else:
print "Error:", value, message

Backup your work, then toss it - partially.
You need to implement your program in pieces, and test each piece as you go. First, tackle the input part of your program. Don't worry about how to broadcast the input you received. Instead worry that you are able to successfully and repeatedly receive input over your socket. So far - so good.
Now, I assume you would like to react to this input by broadcasting to the other attached clients. Well too bad, you can't do that yet! Because, I left one minor detail out of the paragraph above. You have to design a PROTOCOL.
What is a protocol? It's a set of rules for communication. How does your server know when the client had finished sending it's data? Is it terminated by some special character? Or perhaps you encode the size of the message to be sent as the first byte or two of the message.
This is turning out to be a lot of work, isn't it? :-)
What's a simple protocol. A line-oriented protocol is simple. Read 1 character at a time until you get to the end of record terminator - '\n'. So, clients would send records like this to your server --
HELO\n
MSG DAVE Where Are Your Kids?\n
So, assuming you have this simple protocol designed, implement it. For now, DON'T WORRY ABOUT THE MULTITHREADING STUFF! Just worry about making it work.
Your current protocol is to read 1024 bytes. Which may not be bad, just make sure you send 1024 byte messages from the client.
Once you have the protocol stuff setup, move on to reacting to the input. But for now you need something that will read input. Once that is done, we can worry about doing something with it.
jdi is right, you have too much program to work with. Pieces are easier to fix.

RabbitMQ, Pika and reconnection strategy

I'm using Pika to process data from RabbitMQ.
As I seemed to run into different kind of problems I decided to write a small test application to see how I can handle disconnects.
I wrote this test app which does following:
Connect to Broker, retry until successful
When connected create a queue.
Consume this queue and put result into a python Queue.Queue(0)
Get item from Queue.Queue(0) and produce it back into the broker queue.
What I noticed were 2 issues:
When I run my script from one host connecting to rabbitmq on another host (inside a vm) then this scripts exits on random moments without producing an error.
When I run my script on the same host on which RabbitMQ is installed it runs fine and keeps running.
This might be explained because of network issues, packets dropped although I find the connection not really robust.
When the script runs locally on the RabbitMQ server and I kill the RabbitMQ then the script exits with error: "ERROR pika SelectConnection: Socket Error on 3: 104"
So it looks like I can't get the reconnection strategy working as it should be. Could someone have a look at the code so see what I'm doing wrong?
Thanks,
Jay
#!/bin/python
import logging
import threading
import Queue
import pika
from pika.reconnection_strategies import SimpleReconnectionStrategy
from pika.adapters import SelectConnection
import time
from threading import Lock
class Broker(threading.Thread):
def __init__(self):
threading.Thread.__init__(self)
self.logging = logging.getLogger(__name__)
self.to_broker = Queue.Queue(0)
self.from_broker = Queue.Queue(0)
self.parameters = pika.ConnectionParameters(host='sandbox',heartbeat=True)
self.srs = SimpleReconnectionStrategy()
self.properties = pika.BasicProperties(delivery_mode=2)
self.connection = None
while True:
try:
self.connection = SelectConnection(self.parameters, self.on_connected, reconnection_strategy=self.srs)
break
except Exception as err:
self.logging.warning('Cant connect. Reason: %s' % err)
time.sleep(1)
self.daemon=True
def run(self):
while True:
self.submitData(self.from_broker.get(block=True))
pass
def on_connected(self,connection):
connection.channel(self.on_channel_open)
def on_channel_open(self,new_channel):
self.channel = new_channel
self.channel.queue_declare(queue='sandbox', durable=True)
self.channel.basic_consume(self.processData, queue='sandbox')
def processData(self, ch, method, properties, body):
self.logging.info('Received data from broker')
self.channel.basic_ack(delivery_tag=method.delivery_tag)
self.from_broker.put(body)
def submitData(self,data):
self.logging.info('Submitting data to broker.')
self.channel.basic_publish(exchange='',
routing_key='sandbox',
body=data,
properties=self.properties)
if __name__ == '__main__':
format=('%(asctime)s %(levelname)s %(name)s %(message)s')
logging.basicConfig(level=logging.DEBUG, format=format)
broker=Broker()
broker.start()
try:
broker.connection.ioloop.start()
except Exception as err:
print err

The main problem with your script is that it is interacting with a single channel from both your main thread (where the ioloop is running) and the "Broker" thread (calls submitData in a loop). This is not safe.
Also, SimpleReconnectionStrategy does not seem to do anything useful. It does not cause a reconnect if the connection is interrupted. I believe this is a bug in Pika: https://github.com/pika/pika/issues/120
I attempted to refactor your code to make it work as I think you wanted it to, but ran into another problem. Pika does not appear to have a way to detect delivery failure, which means that data may be lost if the connection drops. This seems like such an obvious requirement! How can there be no way to detect that basic_publish failed? I tried all kinds of stuff including transactions and add_on_return_callback (all of which seemed clunky and overly complicated), but came up with nothing. If there truly is no way then pika only seems to be useful in situations that can tolerate loss of data sent to RabbitMQ, or in programs that only need to consume from RabbitMQ.
This is not reliable, but for reference, here's some code that solves your multi-thread problem:
import logging
import pika
import Queue
import sys
import threading
import time
from functools import partial
from pika.adapters import SelectConnection, BlockingConnection
from pika.exceptions import AMQPConnectionError
from pika.reconnection_strategies import SimpleReconnectionStrategy
log = logging.getLogger(__name__)
DEFAULT_PROPERTIES = pika.BasicProperties(delivery_mode=2)
class Broker(object):
def __init__(self, parameters, on_channel_open, name='broker'):
self.parameters = parameters
self.on_channel_open = on_channel_open
self.name = name
def connect(self, forever=False):
name = self.name
while True:
try:
connection = SelectConnection(
self.parameters, self.on_connected)
log.debug('%s connected', name)
except Exception:
if not forever:
raise
log.warning('%s cannot connect', name, exc_info=True)
time.sleep(10)
continue
try:
connection.ioloop.start()
finally:
try:
connection.close()
connection.ioloop.start() # allow connection to close
except Exception:
pass
if not forever:
break
def on_connected(self, connection):
connection.channel(self.on_channel_open)
def setup_submitter(channel, data_queue, properties=DEFAULT_PROPERTIES):
def on_queue_declared(frame):
# PROBLEM pika does not appear to have a way to detect delivery
# failure, which means that data could be lost if the connection
# drops...
channel.confirm_delivery(on_delivered)
submit_data()
def on_delivered(frame):
if frame.method.NAME in ['Confirm.SelectOk', 'Basic.Ack']:
log.info('submission confirmed %r', frame)
# increasing this value seems to cause a higher failure rate
time.sleep(0)
submit_data()
else:
log.warn('submission failed: %r', frame)
#data_queue.put(...)
def submit_data():
log.info('waiting on data queue')
data = data_queue.get()
log.info('got data to submit')
channel.basic_publish(exchange='',
routing_key='sandbox',
body=data,
properties=properties,
mandatory=True)
log.info('submitted data to broker')
channel.queue_declare(
queue='sandbox', durable=True, callback=on_queue_declared)
def blocking_submitter(parameters, data_queue,
properties=DEFAULT_PROPERTIES):
while True:
try:
connection = BlockingConnection(parameters)
channel = connection.channel()
channel.queue_declare(queue='sandbox', durable=True)
except Exception:
log.error('connection failure', exc_info=True)
time.sleep(1)
continue
while True:
log.info('waiting on data queue')
try:
data = data_queue.get(timeout=1)
except Queue.Empty:
try:
connection.process_data_events()
except AMQPConnectionError:
break
continue
log.info('got data to submit')
try:
channel.basic_publish(exchange='',
routing_key='sandbox',
body=data,
properties=properties,
mandatory=True)
except Exception:
log.error('submission failed', exc_info=True)
data_queue.put(data)
break
log.info('submitted data to broker')
def setup_receiver(channel, data_queue):
def process_data(channel, method, properties, body):
log.info('received data from broker')
data_queue.put(body)
channel.basic_ack(delivery_tag=method.delivery_tag)
def on_queue_declared(frame):
channel.basic_consume(process_data, queue='sandbox')
channel.queue_declare(
queue='sandbox', durable=True, callback=on_queue_declared)
if __name__ == '__main__':
if len(sys.argv) != 2:
print 'usage: %s RABBITMQ_HOST' % sys.argv[0]
sys.exit()
format=('%(asctime)s %(levelname)s %(name)s %(message)s')
logging.basicConfig(level=logging.DEBUG, format=format)
host = sys.argv[1]
log.info('connecting to host: %s', host)
parameters = pika.ConnectionParameters(host=host, heartbeat=True)
data_queue = Queue.Queue(0)
data_queue.put('message') # prime the pump
# run submitter in a thread
setup = partial(setup_submitter, data_queue=data_queue)
broker = Broker(parameters, setup, 'submitter')
thread = threading.Thread(target=
partial(broker.connect, forever=True))
# uncomment these lines to use the blocking variant of the submitter
#thread = threading.Thread(target=
# partial(blocking_submitter, parameters, data_queue))
thread.daemon = True
thread.start()
# run receiver in main thread
setup = partial(setup_receiver, data_queue=data_queue)
broker = Broker(parameters, setup, 'receiver')
broker.connect(forever=True)

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.