Making sure a worker process always terminate in zeroMQ - python

I am implementing a pipeline pattern with zeroMQ using the python bindings.
tasks are fanned out to workers which listen for new tasks with an infinite loop like this:
while True:
socks = dict(self.poller.poll())
if self.receiver in socks and socks[self.receiver] == zmq.POLLIN:
msg = self.receiver.recv_unicode(encoding='utf-8')
self.process(msg)
if self.hear in socks and socks[self.hear] == zmq.POLLIN:
msg = self.hear.recv()
print self.pid,":", msg
sys.exit(0)
they exit when they get a message from the sink node, confirming having received all the results expected.
however, worker may miss such a message and not finish. What is the best way to have workers always finish, when they have no way to know (other than through the already mentioned message, that there are no further tasks to process).
Here is the testing code I wrote for checking the workers status:
#-*- coding:utf-8 -*-
"""
Test module containing tests for all modules of pypln
"""
import unittest
from servers.ventilator import Ventilator
from subprocess import Popen, PIPE
import time
class testWorkerModules(unittest.TestCase):
def setUp(self):
self.nw = 4
#spawn 4 workers
self.ws = [Popen(['python', 'workers/dummy_worker.py'], stdout=None) for i in range(self.nw)]
#spawn a sink
self.sink = Popen(['python', 'sinks/dummy_sink.py'], stdout=None)
#start a ventilator
self.V = Ventilator()
# wait for workers and sinks to connect
time.sleep(1)
def test_send_unicode(self):
'''
Pushing unicode strings through workers to sinks.
'''
self.V.push_load([u'são joão' for i in xrange(80)])
time.sleep(1)
#[p.wait() for p in self.ws]#wait for the workers to terminate
wsr = [p.poll() for p in self.ws]
while None in wsr:
print wsr, [p.pid for p in self.ws if p.poll() == None] #these are the unfinished workers
time.sleep(0.5)
wsr = [p.poll() for p in self.ws]
self.sink.wait()
self.sink = self.sink.returncode
self.assertEqual([0]*self.nw, wsr)
self.assertEqual(0, self.sink)
if __name__ == '__main__':
unittest.main()

All the messaging stuff eventually ends up with heartbeats. If you (as a worker or a sink or whatever) discover that a component you need to work with is dead, you can basically either try to connect somewhere else or kill yourself. So if you as a worker discover that the sink is there no more, just exit. This also means that you may exit even though the sink is still there but the connection is broken. But I am not sure you can do more, perhaps set all the timeouts more reasonably...

Related

How to kill threads that are listening to message queue elegantly

In my Python application, I have a function that consumes message from Amazon SQS FIFO queue.
def consume_msgs():
sqs = boto3.client('sqs',
region_name='us-east-1',
aws_access_key_id=AWS_ACCESS_KEY_ID,
aws_secret_access_key=AWS_SECRET_ACCESS_KEY)
print('STARTING WORKER listening on {}'.format(QUEUE_URL))
while 1:
response = sqs.receive_message(
QueueUrl=QUEUE_URL,
MaxNumberOfMessages=1,
WaitTimeSeconds=10,
)
messages = response.get('Messages', [])
for message in messages:
try:
print('{} > {}'.format(threading.currentThread().getName(), message.get('Body')))
body = json.loads(message.get('Body'))
sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=message.get('ReceiptHandle'))
except Exception as e:
print('Exception in worker > ', e)
sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=message.get('ReceiptHandle'))
time.sleep(10)
In order to scale up, I am using multi threading to process messages.
if __name__ == '__main__:
for i in range(3):
t = threading.Thread(target=consume_msgs, name='worker-%s' % i)
t.setDaemon(True)
t.start()
while True:
print('Waiting')
time.sleep(5)
The application runs as service. If I need to deploy new release, it has to be restarted. Is there a way have the threads exist gracefully when main process is being terminated? In stead of killing the threads abruptly, they finish with current message first and stop receiving the next messages.
Since your threads keep looping, you cannot just join them, but you need to signal them it's time to break out of the loop too in order to be able to do that. This docs hint might be useful:
Daemon threads are abruptly stopped at shutdown. Their resources (such as open files, database transactions, etc.) may not be released properly. If you want your threads to stop gracefully, make them non-daemonic and use a suitable signalling mechanism such as an Event.
With that, I've put the following example together, which can hopefully help a bit:
from threading import Thread, Event
from time import sleep
def fce(ident, wrap_up_event):
cnt = 0
while True:
print(f"{ident}: {cnt}", wrap_up_event.is_set())
sleep(3)
cnt += 1
if wrap_up_event.is_set():
break
print(f"{ident}: Wrapped up")
if __name__ == '__main__':
wanna_exit = Event()
for i in range(3):
t = Thread(target=fce, args=(i, wanna_exit))
t.start()
sleep(5)
wanna_exit.set()
A single event instance is passed to fce which would just keep running endlessly, but when done with each iteration, before going back to the top check, if the event has been set to True. And before exiting from the script, we set this event to True from the controlling thread. Since the threads are no longer marked as daemon threads, we do not have to explicitly join them.
Depending on how exactly you want to shutdown your script, you will need to handle the incoming signal (SIGTERM perhaps) or KeyboardInterrupt exception for SIGINT. And perform your clean-up before exiting, the mechanics of which remain the same. Apart from not letting python just stop execution right away, you need to let your threads know they should not re-enter the loop and wait for them to be joined.
The SIGINT is a bit simpler, because it's exposed as a python exception and you could do for instance this for the "main" bit:
if __name__ == '__main__':
wanna_exit = Event()
for i in range(3):
t = Thread(target=fce, args=(i, wanna_exit))
t.start()
try:
while True:
sleep(5)
print('Waiting')
except KeyboardInterrupt:
pass
wanna_exit.set()
You can of course send SIGINT to a process with kill and not only from the controlling terminal.

RQ Timeout does not kill multi-threaded jobs

I'm having problems running multithreaded tasks using python RQ (tested on v0.5.6 and v0.6.0).
Consider the following piece of code, as a simplified version of what I'm trying to achieve:
thing.py
from threading import Thread
class MyThing(object):
def say_hello(self):
while True:
print "Hello World"
def hello_task(self):
t = Thread(target=self.say_hello)
t.daemon = True # seems like it makes no difference
t.start()
t.join()
main.py
from rq import Queue
from redis import Redis
from thing import MyThing
conn = Redis()
q = Queue(connection=conn)
q.enqueue(MyThing().say_hello, timeout=5)
When executing main.py (while rqworker is running in background), the job breaks as expected by timeout, within 5 seconds.
Problem is, when I'm setting a task containing thread/s such as MyThing().hello_task, the thread runs forever and nothing happens when the 5 seconds timeout is over.
How can I run a multithreaded task with RQ, such that the timeout will kill the task, its sons, grandsons and their wives?
When you run t.join(), the hello_task thread blocks and waits until the say_hello thread returns - thus not receiving the timeout signal from rq. You can allow the main thread to run and properly receive the timeout signal by using Thread.join with a set amount of time to wait, while waiting for the thread to finish running. Like so:
def hello_task(self):
t = Thread(target=self.say_hello)
t.start()
while t.isAlive():
t.join(1) # Block for 1 second
That way you could also catch the timeout exception and handle it, if you wish:
def hello_task(self):
t = Thread(target=self.say_hello)
t.start()
try:
while t.isAlive():
t.join(1) # Block for 1 second
except JobTimeoutException: # From rq.timeouts.JobTimeoutException
print "Thread killed due to timeout"
raise

Python multithreading, Queues for messaging and avoiding processor hogging

I have a process sends messages between threads using Queues.
# receiver.py
class Receiver(threading.Thread):
def __init__(self):
threading.Thread.__init__(self)
self.daemon = True
self.inbox = Queue.Queue()
def run(self):
while True:
if not self.inbox.empty():
msg = self.inbox.get()
# do other stuff
# main.py
def main():
R1 = Receiver()
R2 = Receiver()
R1.start()
R2.start()
# spin up child threads that can also stuff messages into Receiver() inboxes
while True:
msg = "You're hogging processor time"
R1.inbox.put(msg)
R2.inbox.put(msg)
# do a whole bunch more fancy stuff
if __name__ == '__main__':
main()
When I look at the processor time % alloted to this process, it's usually pinned at > 90%.
Is there a better paradigm besides a while-True-check-inbox? I've tried sleeps, but the threads need to respond immediately.
Queue.get will wait (block) until there's something in the queue. During this wait, the thread will sleep, allowing other threads (and processes) to run.
So just remove your check for self.inbox.empty():
def run(self):
while True:
msg = self.inbox.get()
# do other stuff

proper threading in python

I am writing a home automation helpers - they are basically small daemon-like python applications. They can run each as a separate process but since there will be made I decided that I will put up a small dispatcher that will spawn each of the daemons in their own threads and be able to act shall a thread die in the future.
This is what it looks like (working with two classes):
from daemons import mosquitto_daemon, gtalk_daemon
from threading import Thread
print('Starting daemons')
mq_client = mosquitto_daemon.Client()
gt_client = gtalk_daemon.Client()
print('Starting MQ')
mq = Thread(target=mq_client.run)
mq.start()
print('Starting GT')
gt = Thread(target=gt_client.run)
gt.start()
while mq.isAlive() and gt.isAlive():
pass
print('something died')
The problem is that MQ daemon (moquitto) will work fine shall I run it directly:
mq_client = mosquitto_daemon.Client()
mq_client.run()
It will start and hang in there listening to all the messages that hit relevant topics - exactly what I'm looking for.
However, run within the dispatcher makes it act weirdly - it will receive a single message and then stop acting yet the thread is reported to be alive. Given it works fine without the threading woodoo I'm assuming I'm doing something wrong in the dispatcher.
I'm quoting the MQ client code just in case:
import mosquitto
import config
import sys
import logging
class Client():
mc = None
def __init__(self):
logging.basicConfig(format=u'%(filename)s:%(lineno)d %(levelname)-8s [%(asctime)s] %(message)s', level=logging.DEBUG)
logging.debug('Class initialization...')
if not Client.mc:
logging.info('Creating an instance of MQ client...')
try:
Client.mc = mosquitto.Mosquitto(config.DEVICE_NAME)
Client.mc.connect(host=config.MQ_BROKER_ADDRESS)
logging.debug('Successfully created MQ client...')
logging.debug('Subscribing to topics...')
for topic in config.MQ_TOPICS:
result, some_number = Client.mc.subscribe(topic, 0)
if result == 0:
logging.debug('Subscription to topic "%s" successful' % topic)
else:
logging.error('Failed to subscribe to topic "%s": %s' % (topic, result))
logging.debug('Settings up callbacks...')
self.mc.on_message = self.on_message
logging.info('Finished initialization')
except Exception as e:
logging.critical('Failed to complete creating MQ client: %s' % e.message)
self.mc = None
else:
logging.critical('Instance of MQ Client exists - passing...')
sys.exit(status=1)
def run(self):
self.mc.loop_forever()
def on_message(self, mosq, obj, msg):
print('meesage!!111')
logging.info('Message received on topic %s: %s' % (msg.topic, msg.payload))
You are passing Thread another class instance's run method... It doesn't really know what to do with it.
threading.Thread can be used in two general ways: spawn a Thread wrapped independent function, or as a base class for a class with a run method.
In your case it appears like baseclass is the way to go, since your Client class has a run method.
Replace the following in your MQ class and it should work:
from threading import Thread
class Client(Thread):
mc = None
def __init__(self):
Thread.__init__(self) # initialize the Thread instance
...
...
def stop(self):
# some sort of command to stop mc
self.mc.stop() # not sure what the actual command is, if one exists at all...
Then when calling it, do it without Thread:
mq_client = mosquitto_daemon.Client()
mq_client.start()
print 'Print this line to be sure we get here after starting the thread loop...'
Several things to consider:
zeromq hates being initialized in 1 thread and run in another. You can rewrite Client() to be a Thread as suggested, or write your own function that will create a Client and run that function in a thread.
Client() has a class level variable mc. I assume that mosquitto_daemon and gtalk_daemon both use the same Client and so they are in contention for which Client.mc wins.
"while mq.isAlive() and gt.isAlive(): pass" will eat an entire processor because it just keeps polling over and over without sleep. Considering that python is only quasi-threaded (the Global Interpreter Lock (GIL) allows only 1 thread to run at a single time), this will stall out your "daemons".
Also considering the GIL, the orignal daemon implementation is likely to perform better.

Threading in python using queue

I wanted to use threading in python to download lot of webpages and went through the following code which uses queues in one of the website.
it puts a infinite while loop. Does each of thread run continuously with out ending till all of them are complete? Am I missing something.
#!/usr/bin/env python
import Queue
import threading
import urllib2
import time
hosts = ["http://yahoo.com", "http://google.com", "http://amazon.com",
"http://ibm.com", "http://apple.com"]
queue = Queue.Queue()
class ThreadUrl(threading.Thread):
"""Threaded Url Grab"""
def __init__(self, queue):
threading.Thread.__init__(self)
self.queue = queue
def run(self):
while True:
#grabs host from queue
host = self.queue.get()
#grabs urls of hosts and prints first 1024 bytes of page
url = urllib2.urlopen(host)
print url.read(1024)
#signals to queue job is done
self.queue.task_done()
start = time.time()
def main():
#spawn a pool of threads, and pass them queue instance
for i in range(5):
t = ThreadUrl(queue)
t.setDaemon(True)
t.start()
#populate queue with data
for host in hosts:
queue.put(host)
#wait on the queue until everything has been processed
queue.join()
main()
print "Elapsed Time: %s" % (time.time() - start)
Setting the thread's to be daemon threads causes them to exit when the main is done. But, yes you are correct in that your threads will run continuously for as long as there is something in the queue else it will block.
The documentation explains this detail Queue docs
The python Threading documentation explains the daemon part as well.
The entire Python program exits when no alive non-daemon threads are left.
So, when the queue is emptied and the queue.join resumes when the interpreter exits the threads will then die.
EDIT: Correction on default behavior for Queue
Your script works fine for me, so I assume you are asking what is going on so you can understand it better. Yes, your subclass puts each thread in an infinite loop, waiting on something to be put in the queue. When something is found, it grabs it and does its thing. Then, the critical part, it notifies the queue that it's done with queue.task_done, and resumes waiting for another item in the queue.
While all this is going on with the worker threads, the main thread is waiting (join) until all the tasks in the queue are done, which will be when the threads have sent the queue.task_done flag the same number of times as messages in the queue . At that point the main thread finishes and exits. Since these are deamon threads, they close down too.
This is cool stuff, threads and queues. It's one of the really good parts of Python. You will hear all kinds of stuff about how threading in Python is screwed up with the GIL and such. But if you know where to use them (like in this case with network I/O), they will really speed things up for you. The general rule is if you are I/O bound, try and test threads; if you are cpu bound, threads are probably not a good idea, maybe try processes instead.
good luck,
Mike
I don't think Queue is necessary in this case. Using only Thread:
import threading, urllib2, time
hosts = ["http://yahoo.com", "http://google.com", "http://amazon.com",
"http://ibm.com", "http://apple.com"]
class ThreadUrl(threading.Thread):
"""Threaded Url Grab"""
def __init__(self, host):
threading.Thread.__init__(self)
self.host = host
def run(self):
#grabs urls of hosts and prints first 1024 bytes of page
url = urllib2.urlopen(self.host)
print url.read(1024)
start = time.time()
def main():
#spawn a pool of threads
for i in range(len(hosts)):
t = ThreadUrl(hosts[i])
t.start()
main()
print "Elapsed Time: %s" % (time.time() - start)

Categories