When using the threading module and the Thread class, SIGINT (Ctrl + C in the console) cannot be caught.
Why and what can I do?
Simple test program:
#!/usr/bin/env python
import threading

def test(suffix):
    while True:
        print "test", suffix

def main():
    for i in (1, 2, 3, 4, 5):
        threading.Thread(target=test, args=(i, )).start()

if __name__ == "__main__":
    main()
When I hit Ctrl + C, nothing happens.
Threads and signals don't mix. In Python this is even more the case than elsewhere: signals are only ever delivered to one thread (the main thread); the other threads never get the message. There's nothing you can do to interrupt threads other than the main thread; they're out of your control.
The only thing you can do here is introduce a communication channel between the main thread and whatever threads you start, using the queue module. You can then send a message to the thread and have it terminate (or do whatever else you want) when it sees the message.
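For example, a minimal sketch of that pattern (Python 3 names; the per-thread inbox queue and the "stop" message are illustrative, not part of the original question):

import queue
import threading
import time

def test(suffix, inbox):
    while True:
        print("test", suffix)
        try:
            # Poll the inbox; the timeout doubles as the loop's delay.
            msg = inbox.get(timeout=0.5)
        except queue.Empty:
            continue
        if msg == "stop":   # illustrative message value
            break

def main():
    inboxes = []
    for i in (1, 2, 3, 4, 5):
        inbox = queue.Queue()
        threading.Thread(target=test, args=(i, inbox)).start()
        inboxes.append(inbox)
    try:
        while True:
            time.sleep(1)   # keep the main thread alive and interruptible
    except KeyboardInterrupt:
        for inbox in inboxes:
            inbox.put("stop")

if __name__ == "__main__":
    main()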
Alternatively, and it is often a very good alternative, you can avoid using threads altogether. What to use instead depends greatly on what you're trying to achieve, however.
Basically, each worker can check whether the parent issued a signal by reading a queue during its work. If the parent receives a SIGINT, it sends a message across each queue (in this case, any value will do) and the children wrap up their work and exit:
import queue
import threading
import time

def fun(arg1, thread_no, q):
    while True:
        # ... do the work (assumed to collect output lines in `lines`
        # and track an `errors` count) ...
        if not q.empty() or errors == 0:
            print('thread ', thread_no, ' exiting...')
            with open('output_%i' % thread_no, 'w') as f:
                for line in lines:
                    f.write(line)
            return

threads = []
for i, item in enumerate(items):
    threads.append(dict())
    q = queue.Queue()
    threads[i]['queue'] = q
    threads[i]['thread'] = threading.Thread(target=fun, args=(arg1, i, q))
    threads[i]['thread'].start()

try:
    time.sleep(10000)
except KeyboardInterrupt:
    for thread in threads:
        thread['queue'].put('TERMINATING')
I'm trying to run multiple API requests in parallel with multiprocessing.Process and requests. I put the URLs to parse into a JoinableQueue instance and put the content back into a Queue instance. I've noticed that putting response.content into the Queue somehow prevents the process from terminating.
Here's a simplified example with just 1 process (Python 3.5):
import multiprocessing as mp
import queue
import requests
import time

class ChildProcess(mp.Process):
    def __init__(self, qin, qout):
        super().__init__()
        self.qin = qin
        self.qout = qout
        self.daemon = True

    def run(self):
        while True:
            try:
                url = self.qin.get(block=False)
                r = requests.get(url, verify=False)
                self.qout.put(r.content)
                self.qin.task_done()
            except queue.Empty:
                break
            except requests.exceptions.RequestException as e:
                print(self.name, e)
                self.qin.task_done()
        print("Infinite loop terminates")

if __name__ == '__main__':
    qin = mp.JoinableQueue()
    qout = mp.Queue()

    for _ in range(5):
        qin.put('http://en.wikipedia.org')

    w = ChildProcess(qin, qout)
    w.start()
    qin.join()

    time.sleep(1)
    print(w.name, w.is_alive())
After running the code I get:
Infinite loop terminates
ChildProcess-1 True
Please help me understand why the process doesn't terminate after the run function exits.
Update: added print statement to show the loop terminates
As noted in the Pipes and Queues documentation:
if a child process has put items on a queue (and it has not used JoinableQueue.cancel_join_thread), then that process will not terminate until all buffered items have been flushed to the pipe. This means that if you try joining that process you may get a deadlock unless you are sure that all items which have been put on the queue have been consumed.
...
Note that a queue created using a manager does not have this issue.
If you switch over to a manager queue, then the process terminates successfully:
import multiprocessing as mp
import queue
import requests
import time

class ChildProcess(mp.Process):
    def __init__(self, qin, qout):
        super().__init__()
        self.qin = qin
        self.qout = qout
        self.daemon = True

    def run(self):
        while True:
            try:
                url = self.qin.get(block=False)
                r = requests.get(url, verify=False)
                self.qout.put(r.content)
                self.qin.task_done()
            except queue.Empty:
                break
            except requests.exceptions.RequestException as e:
                print(self.name, e)
                self.qin.task_done()
        print("Infinite loop terminates")

if __name__ == '__main__':
    manager = mp.Manager()
    qin = mp.JoinableQueue()
    qout = manager.Queue()

    for _ in range(5):
        qin.put('http://en.wikipedia.org')

    w = ChildProcess(qin, qout)
    w.start()
    qin.join()

    time.sleep(1)
    print(w.name, w.is_alive())
It's a bit hard to figure this out based on the Queue documentation - I struggled with the same problem.
The key concept here is that before a process that has put data on a queue can terminate, it implicitly joins that queue's feeder thread; that join blocks until everything the process put on the queue has been flushed to the pipe, which only happens once the queue has been drained. So basically, before your ChildProcess can exit, someone has to consume all the stuff it put into the queue!
There is some documentation of the Queue.cancel_join_thread function, which is supposed to circumvent this problem, but I couldn't get it to have any effect - maybe I'm not using it correctly.
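For reference, a sketch of how cancel_join_thread appears intended to be used - called inside the producing process, on the queue it wrote to, before exiting (the docs warn buffered data may be lost; as noted above, I couldn't confirm the effect):

    def run(self):
        # ... same loop as above ...
        # Tell qout's feeder thread not to block this process's exit;
        # items that haven't reached the pipe yet may be discarded.
        self.qout.cancel_join_thread()
        print("Infinite loop terminates")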
Here's an example modification you can make that should fix the issue:
if __name__ == '__main__':
    qin = mp.JoinableQueue()
    qout = mp.Queue()

    for _ in range(5):
        qin.put('http://en.wikipedia.org')

    w = ChildProcess(qin, qout)
    w.start()
    qin.join()

    while True:
        try:
            # Throw away remaining stuff in qout (or process it, or whatever;
            # just get it out of the queue so the queue's background thread
            # can terminate, so your ChildProcess can terminate).
            qout.get(True, 0.1)
        except queue.Empty:
            break

    w.join()  # Wait for your ChildProcess to finish up.
    # time.sleep(1)  # Not necessary since we've joined the ChildProcess
    print(w.name, w.is_alive())
Add a call to w.terminate() above the print message.
Regarding why the process doesn't terminate by itself: your run function is an infinite loop, so it never returns. Calling terminate signals the process to kill itself.
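In other words, something along these lines (a sketch against the question's __main__ block; the w.join() is an extra addition so that is_alive() reflects the kill):

    qin.join()
    time.sleep(1)
    w.terminate()  # forcibly stop the child process
    w.join()       # reap it so is_alive() reports False
    print(w.name, w.is_alive())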
I have a small piece of code that I made to test out and hopefully debug the problem without having to modify the code in my main applet in Python. This led me to build this code:
#!/usr/bin/env python
import sys, threading, time

def loop1():
    count = 0
    while True:
        sys.stdout.write('\r thread 1: ' + str(count))
        sys.stdout.flush()
        count = count + 1
        time.sleep(.3)

def loop2():
    count = 0
    print ""
    while True:
        sys.stdout.write('\r thread 2: ' + str(count))
        sys.stdout.flush()
        count = count + 2
        time.sleep(.3)

if __name__ == '__main__':
    try:
        th = threading.Thread(target=loop1)
        th.start()
        th1 = threading.Thread(target=loop2)
        th1.start()
    except KeyboardInterrupt:
        print ""
My goal with this code is to have both of these threads display their output to stdout at the same time (with flushing), side by side or something similar. The problem, I assume, is that since each thread is flushing a carriage-return line, it overwrites the other's string by default. I don't quite know how to get this to work, if it is even possible.
If you just run one of the threads, it works fine. However, I want to be able to run both threads with their own string running at the same time in the terminal output. Here is a picture displaying what I'm getting:
terminal screenshot
Let me know if you need more info. Thanks in advance.
Instead of allowing each thread to output to stdout, a better solution is to have one thread control stdout exclusively. Then provide a threadsafe channel for the other threads to dispatch data to be output.
One good method to achieve this is to share a Queue between all threads. Ensure that only the output thread is accessing data after it has been added to the queue.
The output thread can store the last message from each other thread and use that data to format stdout nicely. This can include clearing the output to display something like the following, updating it as each thread generates new data.
Threads
#1: 0
#2: 0
Example
Some decisions were made to simplify this example:
There are gotchas to be wary of when giving arguments to threads.
Daemon threads terminate themselves when the main thread exits. They are used to avoid adding complexity to this answer. Using them in long-running or large applications can pose problems. Other questions discuss how to exit a multithreaded application without leaking memory or locking system resources. You will need to think about how your program needs to signal an exit. Consider using asyncio to save yourself these considerations.
No newlines are used because \r carriage returns cannot clear the whole console. They only allow the current line to be rewritten.
import queue, threading
import time, sys

q = queue.Queue()
keepRunning = True

def loop_output():
    thread_outputs = dict()
    while keepRunning:
        try:
            thread_id, data = q.get_nowait()
            thread_outputs[thread_id] = data
        except queue.Empty:
            # Because the queue is only used for updates, there's no need
            # to wait or block on it.
            pass
        pretty_output = ""
        for thread_id, data in thread_outputs.items():
            pretty_output += '({}:{}) '.format(thread_id, str(data))
        sys.stdout.write('\r' + pretty_output)
        sys.stdout.flush()
        time.sleep(1)

def loop_count(thread_id, increment):
    count = 0
    while keepRunning:
        msg = (thread_id, count)
        try:
            q.put_nowait(msg)
        except queue.Full:
            pass
        count = count + increment
        time.sleep(.3)

if __name__ == '__main__':
    try:
        th_out = threading.Thread(target=loop_output)
        th_out.start()

        # Make sure to use args, not pass arguments directly.
        th0 = threading.Thread(target=loop_count, args=("Thread0", 1))
        th0.daemon = True
        th0.start()

        th1 = threading.Thread(target=loop_count, args=("Thread1", 3))
        th1.daemon = True
        th1.start()

        # Keep the main thread alive to wait for KeyboardInterrupt.
        while True:
            time.sleep(.1)
    except KeyboardInterrupt:
        print("Ended by keyboard stroke")
        keepRunning = False
        for th in [th0, th1]:
            th.join()
Example Output:
(Thread0:110) (Thread1:330)
This is my first try with threads in Python.
I wrote the following program as a very simple example. It just gets a list and prints it using some threads. However, whenever there is an error, the program just hangs in Ubuntu, and I can't seem to do anything to get the command prompt back, so I have to start another SSH session to get back in.
I also have no idea what the issue with my program is.
Is there some kind of error handling I can put in to ensure it doesn't hang?
Also, any idea why Ctrl + C doesn't work (I don't have a Break key)?
from Queue import Queue
from threading import Thread
import HAInstances
import logging

log = logging.getLogger()
logging.basicConfig()

class GetHAInstances:
    def oraHAInstanceData(self):
        log.info('Getting HA instance routing data')
        # HAData = SolrGetHAInstances.TalkToOracle.main()
        HAData = HAInstances.main()
        log.info('Query fetched ' + str(len(HAData)) + ' HA Instances to query')
        # for row in HAData:
        #     print row
        return(HAData)

def do_stuff(q):
    while True:
        print q.get()
        print threading.current_thread().name
        q.task_done()

oraHAInstances = GetHAInstances()
mainHAData = oraHAInstances.oraHAInstanceData()

q = Queue(maxsize=0)
num_threads = 10

for i in range(num_threads):
    worker = Thread(target=do_stuff, args=(q,))
    worker.setDaemon(True)
    worker.start()

for row in mainHAData:
    # print str(row[0]) + ':' + str(row[1]) + ':' + str(row[2]) + ':' + str(row[3])
    q.put((row[0], row[1], row[2], row[3]))

q.join()
In your thread method, it is recommended to use "try ... except ... finally". This structure guarantees that control returns to the main thread even when errors occur.
def do_stuff(q):
    while True:
        try:
            pass  # do your work here
        except Exception:
            pass  # log the error
        finally:
            q.task_done()
Also, in case you want to kill your program, find the PID of your main process and use kill <pid> to kill it. In Ubuntu or Mint, use ps -Ao pid,cmd; in the output, you can find the PID (first column) by searching for the command (second column) you typed to run your Python script.
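A small convenience if you take this route: have the script print its own PID at startup so you don't have to hunt through the ps output (os.getpid() is standard library):

import os
print 'main pid:', os.getpid()  # run `kill <this pid>` from another terminal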
Your q is hanging because your worker has errored, so your q.task_done() never got called.
You also need to
import threading
to use
print threading.current_thread().name
I'm trying to set up a simple producer-consumer system in Gevent but my script doesn't exit:
import gevent
from gevent.queue import *
import time
import random

q = Queue()
workers = []

def do_work(wid, value):
    """
    Actual blocking function
    """
    gevent.sleep(random.randint(0, 2))
    print 'Task', value, 'done', wid
    return

def worker(wid):
    """
    Consumer
    """
    while True:
        item = q.get()
        do_work(wid, item)

def producer():
    """
    Producer
    """
    for i in range(4):
        workers.append(gevent.spawn(worker, random.randint(1, 100000)))
    for item in range(1, 9):
        q.put(item)

producer()
gevent.joinall(workers)
I haven't been able to find good examples/tutorials on using Gevent, so what I've pasted above is what I've cobbled together from the internet.
Multiple workers get activated and the items go into the queue, but even when everything in the queue finishes, the main program doesn't exit. I have to press Ctrl + C.
What am I doing wrong?
Thanks.
On a side note: if there is anything in my script that could be improved, please let me know - simple things like checking when the Queue is empty, etc.
I think you should use JoinableQueue, as in the example from the documentation.
import gevent
from gevent.queue import *
import time
import random

q = JoinableQueue()
workers = []

def do_work(wid, value):
    gevent.sleep(random.randint(0, 2))
    print 'Task', value, 'done', wid

def worker(wid):
    while True:
        item = q.get()
        try:
            do_work(wid, item)
        finally:
            q.task_done()

def producer():
    for i in range(4):
        workers.append(gevent.spawn(worker, random.randint(1, 100000)))
    for item in range(1, 9):
        q.put(item)

producer()
q.join()
In your worker, you activate a loop that will run forever.
As a side note, an imho more elegant "forever loop" can be written with just:
for work_unit in q:
    pass  # Do work, etc.
gevent.joinall() waits for the workers to finish; but they never do, so your program will wait forever. This is what causes it to not exit.
If you don't care about the workers anymore, you can just kill them instead:
gevent.killall(workers)
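Applied to the question's code, that might look like this (the sleep is a crude stand-in for knowing when the queue has drained; the JoinableQueue answer above avoids that guesswork):

producer()
gevent.sleep(5)          # crude: give the workers time to drain the queue
gevent.killall(workers)  # then kill them instead of joining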
An alternative is to put a 'special' item in the queue. When a worker receives this item, it recognises it as different from normal work and stops working.
for worker in workers:
    q.put("TimeToDie")
The worker side then becomes:
def worker(wid):
    for work_unit in q:
        if work_unit == "TimeToDie":
            break
        do_work(wid, work_unit)
Or you could even use gevent's Event to do this kind of pattern.
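A hedged sketch of that Event-based variant (the names and the one-second timeout are illustrative; do_work is the question's function):

import gevent
from gevent.event import Event
from gevent.queue import Queue, Empty

stop = Event()
q = Queue()

def worker(wid):
    # Wake up periodically so the flag gets re-checked even when idle.
    while not stop.is_set():
        try:
            item = q.get(timeout=1)
        except Empty:
            continue
        do_work(wid, item)

# Spawn workers and put items as before; to shut down:
#     stop.set()
#     gevent.joinall(workers)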
If I have a program that uses threading and Queue, how do I get exceptions to stop execution? Here is an example program, which is not possible to stop with ctrl-c (basically ripped from the python docs).
from threading import Thread
from Queue import Queue
from time import sleep

def do_work(item):
    sleep(0.5)
    print "working", item

def worker():
    while True:
        item = q.get()
        do_work(item)
        q.task_done()

q = Queue()
num_worker_threads = 10

for i in range(num_worker_threads):
    t = Thread(target=worker)
    # t.setDaemon(True)
    t.start()

for item in range(1, 10000):
    q.put(item)

q.join()  # block until all tasks are done
The simplest way is to start all the worker threads as daemon threads, then just have your main loop be
while True:
    sleep(1)
Hitting Ctrl+C will throw an exception in your main thread, and all of the daemon threads will exit when the interpreter exits. This assumes you don't want to perform cleanup in all of those threads before they exit.
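Applied to the question's code, a sketch (reusing worker, q, and num_worker_threads from above; the only changes are the daemon flag and the interruptible main loop):

for i in range(num_worker_threads):
    t = Thread(target=worker)
    t.setDaemon(True)   # daemon threads die when the interpreter exits
    t.start()

for item in range(1, 10000):
    q.put(item)

try:
    while True:
        sleep(1)        # Ctrl+C raises KeyboardInterrupt here
except KeyboardInterrupt:
    pass                # daemons are abandoned on exit; no cleanup runs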
A more complex way is to have a global stopped Event:
from threading import Event
from Queue import Empty  # the Empty exception lives in the Queue module

stopped = Event()

def worker():
    while not stopped.is_set():
        try:
            item = q.get_nowait()
            do_work(item)
        except Empty:
            stopped.wait(1)
Then your main loop can set the stopped Event when it gets a KeyboardInterrupt:
try:
    while not stopped.is_set():
        stopped.wait(1)
except KeyboardInterrupt:
    stopped.set()
This lets your worker threads finish what they're doing when you want them to stop, instead of having every worker thread be a daemon and exit in the middle of execution. You can also do whatever cleanup you want.
Note that this example doesn't make use of q.join() - this makes things more complex, though you can still use it. If you do then your best bet is to use signal handlers instead of exceptions to detect KeyboardInterrupts. For example:
from signal import signal, SIGINT

def stop(signum, frame):
    stopped.set()

signal(SIGINT, stop)
This lets you define what happens when you hit Ctrl+C without affecting whatever your main loop is in the middle of. So you can keep doing q.join() without worrying about being interrupted by a Ctrl+C. Of course, with my above examples, you don't need to be joining, but you might have some other reason for doing so.
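Putting those pieces together, one way to keep q.join() and still honour Ctrl+C is the untested variation below: workers keep calling task_done() even after the event is set, so the queue drains quickly and the join can return:

from signal import signal, SIGINT
from threading import Event, Thread
from Queue import Queue
from time import sleep

stopped = Event()

def stop(signum, frame):
    stopped.set()

signal(SIGINT, stop)

def do_work(item):
    sleep(0.5)
    print "working", item

def worker():
    while True:
        item = q.get()
        if not stopped.is_set():
            do_work(item)   # after Ctrl+C: skip the work, just drain
        q.task_done()

q = Queue()
for i in range(10):
    t = Thread(target=worker)
    t.setDaemon(True)
    t.start()

for item in range(1, 10000):
    q.put(item)

q.join()  # returns once every item is marked done, worked or not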