I have a multithreaded application on Python 2.7.
I use the "threading" and "thread" libraries.
The main thread starts 10 other threads, which do some work. They share one class with data (a singleton). I don't use any locking, and it seems to work fine.
The main thread starts these threads with the "start" and "join" methods of the threading.Thread class.
One of the ten threads is started every 10 seconds and does some math calculation.
When the work is complete, the thread invokes "thread.exit()".
Sometimes the main thread does not get the result from that one thread.
The thread has ended, and all of its lines of code have run, but the main thread stops on the "join" call and does not respond.
P.S. I'm not a native English speaker, and describing this problem was very difficult. Please be tolerant.
Code example:
while True:
    all_result = check_is_all_results()
    time.sleep(1)
    if (all_result):
        print app_data.task_table
        app_data.flag_of_close = True
        time.sleep(2)  # Delay, just in case
    if (app_data.flag_of_close):
        terminate()
    print u"TEST"
    if len(app_data.ip_table[app_data.cfg.MY_IP]['tasks']):
        if (app_data.cfg.MULTITHREADING or app_data.complete_task.is_set()):
            job = Worker(app_data, SRV.taskResultSendToSlaves, app_data.ip_table[app_data.cfg.MY_IP]['tasks'].pop())
            job.setDaemon(True)
            job.start()
###########################################################
class Worker(threading.Thread):
    def __init__(self, data, sender, taskname):
        self.data = data
        self.sender = sender
        self.taskname = taskname
        threading.Thread.__init__(self)
    def run(self):
        import thread
        self.data.complete_task.clear()
        tick_before = time.time()
        startupinfo = subprocess.STARTUPINFO()
        startupinfo.dwFlags |= subprocess.STARTF_USESHOWWINDOW
        startupinfo.wShowWindow = subprocess.SW_HIDE
        p = subprocess.Popen(self.data.cfg.PATH_INTERPRETER + " " + self.data.cfg.PATH_TASKS + self.taskname, startupinfo=startupinfo, shell=False, stdout=subprocess.PIPE)
        job_result, err = p.communicate()
        tick_after = time.time()
        work_time = tick_after - tick_before
        self.data.task_table[self.taskname]['status'] = 'complete'
        self.data.task_table[self.taskname]['result'] = job_result
        self.data.task_table[self.taskname]['time'] = work_time
        tr = threading.Thread(target=self.sender, name="SENDER", args=(self.taskname, ))
        tr.setDaemon(True)
        tr.start()
        tr.join()
        self.data.complete_task.set()
        thread.exit()
Sometimes the main infinite loop, which creates the Worker threads, does not print "TEST" and stops responding.
Your worker threads are spawning subprocesses. Unfortunately, this is not reliable: on POSIX systems a subprocess is first created with a fork, which copies only the executing thread from the parent process. Sorry, but your program will not be reliable until you restructure it. Here is some background information with links to more information:
https://stackoverflow.com/a/32107436/3577601
Status of mixing multiprocessing and threading in Python
https://stackoverflow.com/a/6079669/3577601
I've read that it's considered bad practice to kill a thread. (Is there any way to kill a Thread?) There are a LOT of answers there, and I'm wondering if even using a thread in the first place is the right answer for me.
I have a bunch of multiprocessing.Process objects. Essentially, each Process is doing this:
while some_condition:
    result = self.function_to_execute(i, **kwargs_i)
    # outQ is a multiprocessing.Queue shared between all Processes
    self.outQ.put(Result(i, result))
The problem is that I need a way to interrupt function_to_execute, but can't modify the function itself. Initially, I was thinking of simply calling process.terminate(), but that appears to be unsafe with multiprocessing.Queue.
Most likely (but not guaranteed), if I need to kill a thread, the 'main' program is going to be done soon. Is my safest option to do something like this? Or perhaps there is a more elegant solution than using a thread in the first place?
def thread_task():
    while some_condition:
        result = self.function_to_execute(i, **kwargs_i)
        if (this_thread_is_not_daemonized):
            self.outQ.put(Result(i, result))

t = Thread(target=thread_task)
t.start()
if end_early:
    t.daemon = True
I believe the end result of this is that the Process that spawned the thread will continue to waste CPU cycles on a task I no longer care about the output for, but if the main program finishes, it'll clean up all my memory nicely.
The main problem with daemonizing a thread is that the main program could potentially continue for 30+ minutes even when I don't care about the output of that thread anymore.
From the threading docs:
If you want your threads to stop gracefully, make them non-daemonic
and use a suitable signalling mechanism such as an Event
Here is a contrived example of what I was thinking - no idea if it mimics what you are doing or can be adapted for your situation. Another caveat: I've never written any real concurrent code.
Create an Event object in the main process and pass it all the way to the thread.
Design the thread so that it loops until the Event object is set. Once you don't need the processing anymore SET the Event object in the main process. No need to modify the function being run in the thread.
from multiprocessing import Process, Queue, Event
from threading import Thread
import time, random, os

def f_to_run():
    time.sleep(.2)
    return random.randint(1,10)

class T(Thread):
    def __init__(self, evt,q, func, parent):
        self.evt = evt
        self.q = q
        self.func = func
        self.parent = parent
        super().__init__()
    def run(self):
        while not self.evt.is_set():
            n = self.func()
            self.q.put(f'PID {self.parent}-{self.name}: {n}')

def f(T,evt,q,func):
    pid = os.getpid()
    t = T(evt,q,func,pid)
    t.start()
    t.join()
    q.put(f'PID {pid}-{t.name} is alive - {t.is_alive()}')
    q.put(f'PID {pid}:DONE')
    return 'foo done'

if __name__ == '__main__':
    results = []
    q = Queue()
    evt = Event()
    # two processes each with one thread
    p= Process(target=f, args=(T, evt, q, f_to_run))
    p1 = Process(target=f, args=(T, evt, q, f_to_run))
    p.start()
    p1.start()
    while len(results) < 40:
        results.append(q.get())
        print('.',end='')
    print('')
    evt.set()
    p.join()
    p1.join()
    while not q.empty():
        results.append(q.get_nowait())
    for thing in results:
        print(thing)
I initially tried to use threading.Event but the multiprocessing module complained that it couldn't be pickled. I was actually surprised that the multiprocessing.Queue and multiprocessing.Event worked AND could be accessed by the thread.
Not sure why I started with a Thread subclass. I think I thought it would be easier to control and specify what happens in its run method. But it can also be done with a function.
from multiprocessing import Process, Queue, Event
from threading import Thread
import time, random

def f_to_run():
    time.sleep(.2)
    return random.randint(1,10)

def t1(evt,q, func):
    while not evt.is_set():
        n = func()
        q.put(n)

def g(t1,evt,q,func):
    t = Thread(target=t1,args=(evt,q,func))
    t.start()
    t.join()
    q.put(f'{t.name} is alive - {t.is_alive()}')
    return 'foo'

if __name__ == '__main__':
    q = Queue()
    evt = Event()
    p= Process(target=g, args=(t1, evt, q, f_to_run))
    p.start()
    time.sleep(5)
    evt.set()
    p.join()
I am building a watchdog timer that runs another Python program, and if it fails to find a check-in from any of the threads, shuts down the whole program. This is so it will, eventually, be able to take control of needed communication ports. The code for the timer is as follows:
from multiprocessing import Process, Queue
from time import sleep
from copy import deepcopy

PATH_TO_FILE = r'.\test_program.py'
WATCHDOG_TIMEOUT = 2

class Watchdog:
    def __init__(self, filepath, timeout):
        self.filepath = filepath
        self.timeout = timeout
        self.threadIdQ = Queue()
        self.knownThreads = {}

    def start(self):
        threadIdQ = self.threadIdQ
        process = Process(target = self._executeFile)
        process.start()
        try:
            while True:
                unaccountedThreads = deepcopy(self.knownThreads)
                # Empty queue since last wake. Add new thread IDs to knownThreads, and account for all known thread IDs
                # in queue
                while not threadIdQ.empty():
                    threadId = threadIdQ.get()
                    if threadId in self.knownThreads:
                        unaccountedThreads.pop(threadId, None)
                    else:
                        print('New threadId < {} > discovered'.format(threadId))
                        self.knownThreads[threadId] = False
                # If there is a known thread that is unaccounted for, then it has either hung or crashed.
                # Shut everything down.
                if len(unaccountedThreads) > 0:
                    print('The following threads are unaccounted for:\n')
                    for threadId in unaccountedThreads:
                        print(threadId)
                    print('\nShutting down!!!')
                    break
                else:
                    print('No unaccounted threads...')
                sleep(self.timeout)
        # Account for any exceptions thrown in the watchdog timer itself
        except:
            process.terminate()
            raise
        process.terminate()

    def _executeFile(self):
        with open(self.filepath, 'r') as f:
            exec(f.read(), {'wdQueue' : self.threadIdQ})

if __name__ == '__main__':
    wd = Watchdog(PATH_TO_FILE, WATCHDOG_TIMEOUT)
    wd.start()
I also have a small program to test the watchdog functionality
from time import sleep
from threading import Thread
from queue import SimpleQueue

Q_TO_Q_DELAY = 0.013

class QToQ:
    def __init__(self, processQueue, threadQueue):
        self.processQueue = processQueue
        self.threadQueue = threadQueue
        Thread(name='queueToQueue', target=self._run).start()

    def _run(self):
        pQ = self.processQueue
        tQ = self.threadQueue
        while True:
            while not tQ.empty():
                sleep(Q_TO_Q_DELAY)
                pQ.put(tQ.get())

def fastThread(q):
    while True:
        print('Fast thread, checking in!')
        q.put('fastID')
        sleep(0.5)

def slowThread(q):
    while True:
        print('Slow thread, checking in...')
        q.put('slowID')
        sleep(1.5)

def hangThread(q):
    print('Hanging thread, checked in')
    q.put('hangID')
    while True:
        pass

print('Hello! I am a program that spawns threads!\n\n')
threadQ = SimpleQueue()
Thread(name='fastThread', target=fastThread, args=(threadQ,)).start()
Thread(name='slowThread', target=slowThread, args=(threadQ,)).start()
Thread(name='hangThread', target=hangThread, args=(threadQ,)).start()
QToQ(wdQueue, threadQ)
As you can see, I need to have the threads put into a queue.Queue, while a separate object slowly feeds the output of the queue.Queue into the multiprocessing queue. If instead I have the threads put directly into the multiprocessing queue, or do not have the QToQ object sleep in between puts, the multiprocessing queue will lock up, and will appear to always be empty on the watchdog side.
Now, as the multiprocessing queue is supposed to be thread and process safe, I can only assume I have messed something up in the implementation. My solution seems to work, but also feels hacky enough that I feel I should fix it.
I am using Python 3.7.2, if it matters.
I suspect that test_program.py exits.
I changed the last few lines to this:
tq = threadQ
# tq = wdQueue # option to send messages direct to WD
t1 = Thread(name='fastThread', target=fastThread, args=(tq,))
t2 = Thread(name='slowThread', target=slowThread, args=(tq,))
t3 = Thread(name='hangThread', target=hangThread, args=(tq,))
t1.start()
t2.start()
t3.start()
QToQ(wdQueue, threadQ)
print('Joining with threads...')
t1.join()
t2.join()
t3.join()
print('test_program exit')
The calls to join() mean that the test program never exits by itself, since none of the threads ever exit.
So, as is, t3 hangs, and the watchdog program detects the unaccounted-for thread and stops the test program.
If t3 is removed from the above program, then the other two threads are well behaved and the watchdog program allows the test program to continue indefinitely.
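As a side note on the watchdog's drain loop: the multiprocessing documentation describes Queue.empty() as not reliable, so a drain built on get_nowait() may be worth trying. The following is only a minimal sketch under that assumption, not a confirmed fix for the lock-up described in the question. The helper names drain_queue and unaccounted are hypothetical and do not appear in the original code.

```python
import queue


def drain_queue(q):
    """Return every item that can be taken from q right now, without blocking.

    Works with queue.Queue and multiprocessing.Queue alike, since both
    raise queue.Empty from get_nowait() when nothing is available.
    """
    items = []
    while True:
        try:
            items.append(q.get_nowait())
        except queue.Empty:
            return items


def unaccounted(known_ids, checked_in_ids):
    """Return the known thread IDs that did not check in during this cycle."""
    return set(known_ids) - set(checked_in_ids)
```

In the watchdog loop, the result of drain_queue(threadIdQ) would play the role of the inner `while not threadIdQ.empty()` loop. With a multiprocessing.Queue, an item that was put a moment ago may not be retrievable yet, so a check-in can land in the next cycle rather than the current one.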
I'm using Python multiprocessing for RabbitMQ consumers.
On Application Start I create 4 WorkerProcesses.
def start_workers(num=4):
    for i in xrange(num):
        process = WorkerProcess()
        process.start()
Below you will find my worker class.
The logic works so far: I create 4 parallel consumer processes.
The problem arises after a process gets killed, when I want to create a new one. With the logic below, the new process is created as a child of the old one, and after a while the machine runs out of memory.
Is there any way with Python multiprocessing to start a new process and kill the old one correctly?
class WorkerProcess(multiprocessing.Process):

    def ___init__(self):
        app.logger.info('%s: Starting new Thread!', self.name)
        super(multiprocessing.Process, self).__init__()

    def shutdown(self):
        process = WorkerProcess()
        process.start()
        return True

    def kill(self):
        start_workers(1)
        self.terminate()

    def run(self):
        try:
            # Connect to RabbitMQ
            credentials = pika.PlainCredentials(app.config.get('RABBIT_USER'), app.config.get('RABBIT_PASS'))
            connection = pika.BlockingConnection(
                pika.ConnectionParameters(host=app.config.get('RABBITMQ_SERVER'), port=5672, credentials=credentials))
            channel = connection.channel()
            # Declare the Queue
            channel.queue_declare(queue='screenshotlayer',
                                  auto_delete=False,
                                  durable=True)
            app.logger.info('%s: Start to consume from RabbitMQ.', self.name)
            channel.basic_qos(prefetch_count=1)
            channel.basic_consume(callback, queue='screenshotlayer')
            channel.start_consuming()
            app.logger.info('%s: Thread is going to sleep!', self.name)
            # do what channel.start_consuming() does but with stopping signal
            #while self.stop_working.is_set():
            #    channel.transport.connection.process_data_events()
            channel.stop_consuming()
            connection.close()
        except Exception as e:
            self.shutdown()
        return 0
Thank You
In the main process, keep track of your subprocesses (in a list) and loop over them with .join(timeout=50) (https://docs.python.org/2/library/multiprocessing.html#multiprocessing.Process.join).
Then check whether each one is alive (https://docs.python.org/2/library/multiprocessing.html#multiprocessing.Process.is_alive).
If it is not, replace it with a fresh one.
def start_workers(n):
    wks = []
    for _ in range(n):
        wks.append(WorkerProcess())
        wks[-1].start()
    while True:
        # Wait up to 50 seconds on each worker instead of spinning
        for p in wks:
            p.join(timeout=50)
        # Remove all terminated processes
        wks = [p for p in wks if p.is_alive()]
        # Start new processes
        for i in range(n-len(wks)):
            wks.append(WorkerProcess())
            wks[-1].start()
I would not handle the process pool management myself. Instead, I would use the ProcessPoolExecutor from the concurrent.futures module.
There is no need for WorkerProcess to inherit from the Process class. Just write your actual code in the class and then submit it to a process pool executor. The executor keeps a pool of processes ready to execute your tasks.
This way you can keep things simple and save yourself some headaches.
You can read more about in my blog post here: http://masnun.com/2016/03/29/python-a-quick-introduction-to-the-concurrent-futures-module.html
Example Code:
from concurrent.futures import ProcessPoolExecutor
from time import sleep

def return_after_5_secs(message):
    sleep(5)
    return message

# The guard is needed on platforms that start worker processes with spawn
if __name__ == '__main__':
    pool = ProcessPoolExecutor(3)
    future = pool.submit(return_after_5_secs, ("hello"))
    print(future.done())
    sleep(5)
    print(future.done())
    print("Result: " + future.result())
Here is my code, it launches a subprocess, waits till it ends and returns stdout, or a timeout happens and it raises exception. Common use is print(Run('python --version').execute())
from subprocess import Popen, PIPE, STDOUT
from threading import Thread

class Run(object):
    def __init__(self, cmd, timeout=2*60*60):
        self.cmd = cmd.split()
        self.timeout = timeout
        self._stdout = b''
        self.dt = 10
        self.p = None

    def execute(self):
        print("Execute command: {}".format(' '.join(self.cmd)))
        def target():
            self.p = Popen(self.cmd, stdout=PIPE, stderr=STDOUT)
            self._stdout = self.p.communicate()[0]
        thread = Thread(target=target)
        thread.start()
        t = 0
        while t < self.timeout:
            thread.join(self.dt)
            if thread.is_alive():
                t += self.dt
                print("Running for: {} seconds".format(t))
            else:
                ret_code = self.p.poll()
                if ret_code:
                    raise AssertionError("{} failed.\nretcode={}\nstdout:\n{}".format(
                        self.cmd, ret_code, self._stdout))
                return self._stdout
        else:
            print('Timeout {} reached, kill task, pid={}'.format(self.timeout, self.p.pid))
            self.p.terminate()
            thread.join()
            raise AssertionError("Timeout")
The problem is the following case. The process that I launch spawns more child processes. When the timeout is reached, I kill the main process (the one I started using my class) with self.p.terminate(), but the children remain and my code hangs on the line self._stdout = self.p.communicate()[0]. Execution continues only if I manually kill all the child processes.
I tried a solution where, instead of self.p.terminate(), I kill the whole process tree.
This also does not work if the main process finished by itself and its children live on independently: I have no way to find and kill them, but they still block self.p.communicate().
Is there a way to solve this effectively?
You could use the ProcessWrapper from the PySys framework. It offers a lot of this functionality as a cross-platform abstraction, e.g.:
import sys, os
from pysys.constants import *
from pysys.process.helper import ProcessWrapper
from pysys.exceptions import ProcessTimeout

command=sys.executable
arguments=['--version']
try:
    process = ProcessWrapper(command, arguments=arguments, environs=os.environ, workingDir=os.getcwd(), stdout='stdout.log', stderr='stderr.log', state=FOREGROUND, timeout=5.0)
    process.start()
except ProcessTimeout:
    print "Process timeout"
    process.stop()
It's at SourceForge (http://sourceforge.net/projects/pysys/files/ and http://pysys.sourceforge.net/) if of interest.
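For the case described in the question, where descendants keep the stdout pipe open, another option with only the standard library is to put the child in its own process group and signal the whole group on timeout. The following is a minimal sketch, assuming Python 3.3+ on a POSIX system and assuming the descendants do not move themselves into a different process group. The function name run_with_timeout is hypothetical.

```python
import os
import signal
import subprocess
import sys


def run_with_timeout(cmd, timeout):
    """Run cmd (a list of arguments) and return its combined stdout/stderr bytes.

    On timeout, kill the whole process group and raise TimeoutError.
    POSIX only: relies on start_new_session and os.killpg.
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        start_new_session=True,  # the child becomes leader of a new process group
    )
    try:
        out, _ = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Signal every process in the group, so descendants that still hold
        # the pipe open are killed together with the direct child.
        os.killpg(proc.pid, signal.SIGKILL)
        proc.communicate()  # reap the child and drain what is left in the pipe
        raise TimeoutError("command timed out after {} seconds".format(timeout))
    if proc.returncode:
        raise RuntimeError("command failed with return code {}".format(proc.returncode))
    return out
```

Because the group ID equals the PID of the direct child, the group can still be signalled after that child has exited on its own, as long as at least one descendant remains in the group.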
I have a service that is running (Twisted jsonrpc server). When I make a call to "run_procs" the service will look at a bunch of objects and inspect their timestamp property to see if they should run. If they should, they get added to a thread_pool (list) and then every item in the thread_pool gets the start() method called.
I have used this setup for several other applications where I wanted to run a function within my class with threading. However, when I use a subprocess.Popen call in the function called by each thread, the calls run one at a time instead of running concurrently as I would expect.
Here is some sample code:
class ProcService(jsonrpc.JSONRPC):
    self.thread_pool = []
    self.running_threads = []
    self.lock = threading.Lock()

    def clean_pool(self, thread_pool, join=False):
        for th in [x for x in thread_pool if not x.isAlive()]:
            if join: th.join()
            thread_pool.remove(th)
            del th
        return thread_pool

    def run_threads(self, parallel=10):
        while len(self.running_threads)+len(self.thread_pool) > 0:
            self.clean_pool(self.running_threads, join=True)
            n = min(max(parallel - len(self.running_threads), 0), len(self.thread_pool))
            if n > 0:
                for th in self.thread_pool[0:n]: th.start()
                self.running_threads.extend(self.thread_pool[0:n])
                del self.thread_pool[0:n]
            time.sleep(.01)
        for th in self.running_threads+self.thread_pool: th.join()

    def jsonrpc_run_procs(self):
        for i, item in enumerate(self.items):
            if item.should_run():
                self.thread_pool.append(threading.Thread(target=self.run_proc, args=tuple([item])))
        self.run_threads(5)

    def run_proc(self, proc):
        self.lock.acquire()
        print "\nSubprocess started"
        p = subprocess.Popen('%s/program_to_run.py %s' %(os.getcwd(), proc.data), shell=True, stdin=subprocess.PIPE, stdout=subprocess.PIPE,)
        stdout_value = proc.communicate('through stdin to stdout')[0]
        self.lock.release()
Any help/suggestions are appreciated.
* EDIT *
OK. So now I want to read back the output from the stdout pipe. This works some of the time, but also fails with select.error: (4, 'Interrupted system call'). I assume this is because sometimes the process has already terminated before I try to run the communicate method. The code in the run_proc method has been changed to:
def run_proc(self, proc):
    self.lock.acquire()
    p = subprocess.Popen( #etc
    self.running_procs.append([p, proc.data.id])
    self.lock.release()
After I call self.run_threads(5), I call self.check_procs().
The check_procs method iterates over the list of running_procs and checks whether poll() is not None. How can I get the output from the pipe? I have tried both of the following.
Calling check_procs once:
def check_procs(self):
    for proc_details in self.running_procs:
        proc = proc_details[0]
        while (proc.poll() == None):
            time.sleep(0.1)
        stdout_value = proc.communicate('through stdin to stdout')[0]
        self.running_procs.remove(proc_details)
        print proc_details[1], stdout_value
        del proc_details
Calling check_procs in a while loop, like:
while len(self.running_procs) > 0:
    self.check_procs()

def check_procs(self):
    for proc_details in self.running_procs:
        if (proc.poll() is not None):
            stdout_value = proc.communicate('through stdin to stdout')[0]
            self.running_procs.remove(proc_details)
            print proc_details[1], stdout_value
            del proc_details
I think the key code is:
self.lock.acquire()
print "\nSubprocess started"
p = subprocess.Popen( # etc
stdout_value = proc.communicate('through stdin to stdout')[0]
self.lock.release()
The explicit calls to acquire and release should guarantee serialization. Don't you observe serialization just as invariably if you do other things in this block instead of the subprocess use?
Edit: all silence here, so I'll add the suggestion to remove the locking and instead put each stdout_value on a Queue.Queue() instance. Queue is intrinsically thread-safe (it deals with its own locking), so you can get (or get_nowait, etc.) results from it once they're ready and have been put there. In general, Queue is the best way to arrange thread communication (and often synchronization too) in Python, any time it can feasibly be arranged to do things that way.
Specifically: add import Queue at the start; give up making, acquiring and releasing self.lock (just delete those three lines); add self.q = Queue.Queue() to the __init__; right after the call stdout_value = proc.communicate(... add one statement self.q.put(stdout_value); now, e.g., finish the jsonrpc_run_procs method with
while not self.q.empty():
    result = self.q.get()
    print 'One result is %r' % result
to confirm that all the results are there. (Normally the empty method of queues is not reliable, but in this case all threads putting to the queue are already finished, so you should be fine).
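The queue-based approach above can be sketched as a small standalone program. This is only an illustration under assumptions, not the asker's service: it is written for Python 3 (where the module is named queue rather than Queue), the functions run_proc and run_all are simplified stand-ins, and the child command is a hypothetical one-liner that echoes its stdin in upper case.

```python
import queue
import subprocess
import sys
import threading


def run_proc(data, results):
    """Run one subprocess, feed it data on stdin, and queue its stdout."""
    # Hypothetical child process: echoes stdin in upper case.
    child = [sys.executable, "-c",
             "import sys; sys.stdout.write(sys.stdin.read().upper())"]
    p = subprocess.Popen(child, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    stdout_value = p.communicate(data.encode())[0]
    # No lock is needed: the queue does its own locking.
    results.put((data, stdout_value.decode()))


def run_all(items):
    """Run one thread per item and return the collected (input, output) pairs."""
    results = queue.Queue()
    threads = [threading.Thread(target=run_proc, args=(item, results))
               for item in items]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    # Every producer thread has finished, so empty() can be trusted here.
    collected = []
    while not results.empty():
        collected.append(results.get())
    return collected
```

Results arrive in completion order, which is why each one is queued together with its input.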
Your specific problem is probably caused by the line stdout_value = proc.communicate('through stdin to stdout')[0]. Popen.communicate will "Wait for process to terminate", which, when used inside a lock, makes the subprocesses run one at a time.
What you can do is simply add the p variable to a list and use the subprocess API to wait for the subprocesses to finish. Periodically poll each subprocess in your main thread.
On second look, it looks like you may have an issue on this line as well: for th in self.running_threads+self.thread_pool: th.join(). Thread.join() is another method that will wait for the thread to finish.
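The polling approach suggested in this answer can be sketched without any threads. This is a minimal illustration for Python 3, with a hypothetical function name run_concurrently. It assumes each child writes only a small amount of output: a child that fills the pipe buffer would block before exiting and poll() would never report it as finished, so larger outputs would need files or reader threads.

```python
import subprocess
import sys
import time


def run_concurrently(commands, poll_interval=0.05):
    """Start every command at once, then poll until each one has exited.

    Returns a list of stdout bytes in the same order as commands.
    """
    procs = [subprocess.Popen(cmd, stdout=subprocess.PIPE) for cmd in commands]
    outputs = [None] * len(procs)
    pending = set(range(len(procs)))
    while pending:
        for i in sorted(pending):
            if procs[i].poll() is not None:
                # The process has exited, so communicate() only drains the pipe.
                outputs[i] = procs[i].communicate()[0]
                pending.discard(i)
        time.sleep(poll_interval)
    return outputs
```

All subprocesses are started before the first poll, so they run concurrently even though the waiting happens in a single thread.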