Python multiprocessing in Nuke causing Nuke to hang

I have the following code that causes Nuke to hang. Basically, what I'm trying to do is get a list of files and folders from the file system, and I am trying to speed it up through parallel processing. This works perfectly outside of Nuke, but as I said before, running this in Nuke will cause Nuke to hang. Is there a better way to do this that will cause Nuke to not hang? Preferably, I'd like to fix this through Python's standard library, or packages that are platform agnostic. But, if there's no way to do that, then I'm fine with that. Worst case, I will have to go back to not using parallel processing and find other optimizations.
Also, when I run this code in Nuke, I get the following error in the console:
Unknown units in -c from multiprocessing.forking import main; main()
The code:
#!/bin/env python
import multiprocessing
import os

CPU_COUNT = multiprocessing.cpu_count()

def _threaded_master(root):
    in_queue = multiprocessing.JoinableQueue()
    folder_queue = multiprocessing.JoinableQueue()
    file_queue = multiprocessing.JoinableQueue()

    in_queue.put(root)

    for _ in xrange(CPU_COUNT):
        multiprocessing.Process(target=_threaded_slave, args=(in_queue, folder_queue, file_queue)).start()

    in_queue.join()

    return {"folders": folder_queue, "files": file_queue}

def _threaded_slave(in_queue, folder_queue, file_queue):
    while True:
        path_item = in_queue.get()

        if os.path.isdir(path_item):
            for item in os.listdir(path_item):
                path = os.path.join(path_item, item)
                in_queue.put(path)

        in_queue.task_done()

if __name__ == "__main__":
    print _threaded_master(r"/path/to/root")

Here's my code to scan through a large tree of directories using several threads.
I'd originally written the code to use good old multiprocessing.Pool(), because it's very easy and gives you the results of the functions; input and output queues are not needed. Another difference is that it uses processes instead of threads, which has some tradeoffs.
The Pool has a big drawback: it assumes you have a static list of items to process.
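For reference, a rough sketch of what a Pool-based version can look like when the list of directories is known up front (this sketch is an illustration, not the original Pool code):

import multiprocessing, os

def scan_dir_flat(topdir):
    # Return (path, isdir) for the direct children of a single directory.
    out = []
    try:
        for name in os.listdir(topdir):
            path = os.path.join(topdir, name)
            out.append((path, os.path.isdir(path)))
    except OSError:
        pass
    return out

if __name__ == "__main__":
    dirs = [os.path.expanduser('~'), '/tmp']   # a static list, known in advance
    pool = multiprocessing.Pool(multiprocessing.cpu_count())
    results = pool.map(scan_dir_flat, dirs)    # one list of results per input dir
    pool.close()
    pool.join()
    print(sum(len(r) for r in results))

This only works because the inputs are fixed ahead of time; it cannot feed newly discovered subdirectories back into the pool, which is exactly the problem with recursive scanning.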
So, I rewrote the code following your original example: an input queue of directories to process and an output queue of results. The caller has to explicitly grab items from the output queue.
For grins I ran a timing comparison with good old os.walk() and... at least on my machine the traditional solution was faster. The two solutions produced quite different numbers of files, which I can't explain.
Have fun!
source
#!/bin/env python
import multiprocessing, threading, time
import logging, os, Queue, sys

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)-4s %(levelname)s %(threadName)s %(message)s",
    datefmt="%H:%M:%S",
    stream=sys.stderr,
)

def scan_dir(topdir):
    try:
        for name in os.listdir(topdir):
            path = os.path.join(topdir, name)
            yield (path, os.path.isdir(path))
    except OSError:
        logging.error('uhoh: %s', topdir)

def scan_dir_queue(inqueue, outqueue):
    logging.info('start')
    while True:
        try:
            dir_item = inqueue.get_nowait()
        except Queue.Empty:
            break
        res = list( scan_dir(dir_item) )
        logging.debug('- %d paths', len(res))
        for path, isdir in res:
            outqueue.put( (path, isdir) )
            if isdir:
                inqueue.put(path)
    logging.info('done')

def thread_master(root):
    dir_queue = Queue.Queue()   # pylint: disable=E1101
    dir_queue.put(root)
    result_queue = Queue.Queue()
    threads = [
        threading.Thread(
            target=scan_dir_queue, args=[dir_queue, result_queue]
        )
        for _ in range(multiprocessing.cpu_count())
    ]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return result_queue.queue

if __name__ == "__main__":
    topdir = os.path.expanduser('~')

    start = time.time()
    res = thread_master(topdir)
    print 'threaded:', time.time() - start
    print len(res), 'paths'

    def mywalk(topdir):
        for (dirpath, _dirnames, filenames) in os.walk(topdir):
            for name in filenames:
                yield os.path.join(dirpath, name)

    start = time.time()
    res = list(mywalk(topdir))
    print 'os.walk:', time.time() - start
    print len(res), 'paths'
output
11:56:35 INFO Thread-1 start
11:56:35 INFO Thread-2 start
11:56:35 INFO Thread-3 start
11:56:35 INFO Thread-4 start
11:56:35 INFO Thread-2 done
11:56:35 INFO Thread-3 done
11:56:35 INFO Thread-4 done
11:56:42 INFO Thread-1 done
threaded: 6.49218010902
299230 paths
os.walk: 1.6940600872
175741 paths

Here's a link to refer to: https://learn.foundry.com/nuke/developers/63/pythondevguide/threading.html
What's notable is the warning mentioned in there: nuke.executeInMainThread and nuke.executeInMainThreadWithResult should always be run from a child thread. If run from within the main thread, they freeze NUKE.
So, spawn a new child thread, and do your stuff there.
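For example, a minimal sketch of that pattern (the scanning body here is a stand-in, plain os.walk in a worker thread; nuke.executeInMainThread is the call described in the guide linked above):

import os
import threading
import nuke

def scan_and_report(root):
    # Do the slow file-system work in this child thread, off Nuke's main thread.
    folders = [dirpath for dirpath, _dirs, _files in os.walk(root)]
    # Anything that must touch the Nuke UI has to go back to the main thread.
    nuke.executeInMainThread(nuke.message, args=("Found %d folders" % len(folders),))

threading.Thread(target=scan_and_report, args=("/path/to/root",)).start()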

Related

Why does my Python process with input and output queues not join once it is done?

This simple Python3 program using multiprocessing does not seem to work as expected.
All the input processes share an input queue from which they consume data. They all share an output queue where they write a result once they are fully done. I find that this program hangs at the process join(). Why is that?
#!/usr/bin/env python3
import multiprocessing

def worker_func(in_q, out_q):
    print("A worker has started")
    w_results = {}
    while not in_q.empty():
        v = in_q.get()
        w_results[v] = v
    out_q.put(w_results)
    print("A worker has finished")

def main():
    # Input queue to share among processes
    fpaths = [str(i) for i in range(10000)]
    in_q = multiprocessing.Queue()
    for fpath in fpaths:
        in_q.put(fpath)

    # Create processes and start them
    N_PROC = 2
    out_q = multiprocessing.Queue()
    workers = []
    for _ in range(N_PROC):
        w = multiprocessing.Process(target=worker_func, args=(in_q, out_q,))
        w.start()
        workers.append(w)
    print("Done adding workers")

    # Wait for processes to finish
    for w in workers:
        w.join()
    print("Done join of workers")

    # Collate worker results
    out_results = {}
    while not out_q.empty():
        out_results.update(out_q.get())

if __name__ == "__main__":
    main()
I get this result from this program when N_PROC = 2:
$ python3 test.py
Done adding workers
A worker has started
A worker has started
A worker has finished
<---- I do not get "A worker has finished" from second worker
<---- I do not get "Done join of workers"
It does not work even with a single child process N_PROC = 1:
$ python3 test.py
Done adding workers
A worker has started
A worker has finished
<---- I do not get "Done join of workers"
If I try a smaller input queue with say 1000 items, everything works fine.
I am aware of some old StackOverflow questions that say that the Queue has a limit. Why is this not documented in the Python3 docs?
What is an alternative solution I can use? I want to use multi-processing (not threading), to split the input among N processes. Once their shared input queue is empty, I want each process to collect its results (can be a big/complex data structure like dict) and return it back to the parent process. How to do this?
This is a classic deadlock caused by your design. When the workers terminate, they stall because they have not been able to flush all of their data into out_q; this has to do with the size of the pipe buffer underlying the queue.
When you use a multiprocessing.Queue, you should drain it before trying to join the feeding processes, to make sure no Process stalls waiting for all of the objects it put to be flushed into the Queue. So moving your out_q.get calls before the join of the processes should solve your problem. You can use a sentinel pattern to detect the end of the computation:
#!/usr/bin/env python3
import multiprocessing
from multiprocessing.queues import Empty

def worker_func(in_q, out_q):
    print("A worker has started")
    w_results = {}
    while not in_q.empty():
        try:
            v = in_q.get(timeout=1)
            w_results[v] = v
        except Empty:
            pass
    out_q.put(w_results)
    out_q.put(None)
    print("A worker has finished")

def main():
    # Input queue to share among processes
    fpaths = [str(i) for i in range(10000)]
    in_q = multiprocessing.Queue()
    for fpath in fpaths:
        in_q.put(fpath)

    # Create processes and start them
    N_PROC = 2
    out_q = multiprocessing.Queue()
    workers = []
    for _ in range(N_PROC):
        w = multiprocessing.Process(target=worker_func, args=(in_q, out_q,))
        w.start()
        workers.append(w)
    print("Done adding workers")

    # Collate worker results
    out_results = {}
    n_proc_end = 0
    while not n_proc_end == N_PROC:
        res = out_q.get()
        if res is None:
            n_proc_end += 1
        else:
            out_results.update(res)

    # Wait for processes to finish
    for w in workers:
        w.join()
    print("Done join of workers")

if __name__ == "__main__":
    main()
Also, note that your code has a race condition in it: the queue in_q can be emptied between the moment you check not in_q.empty() and the get. You should use a non-blocking (or timed) get to make sure you don't deadlock waiting on an empty queue.
Finally, you are trying to implement something that looks like a multiprocessing.Pool, which handles this kind of communication in a more robust way. You can also look at the concurrent.futures API, which is even more robust and, in some sense, better designed.
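For illustration, here is a minimal sketch of the same split-and-collect pattern using concurrent.futures (the chunking strategy is an assumption for the example, not something from the original code):

#!/usr/bin/env python3
import concurrent.futures

def worker_func(chunk):
    # Each worker builds its own partial dict from its slice of the input.
    return {v: v for v in chunk}

def main():
    fpaths = [str(i) for i in range(10000)]
    N_PROC = 2
    # Split the input into N_PROC roughly equal chunks up front.
    chunks = [fpaths[i::N_PROC] for i in range(N_PROC)]

    out_results = {}
    with concurrent.futures.ProcessPoolExecutor(max_workers=N_PROC) as executor:
        for partial in executor.map(worker_func, chunks):
            out_results.update(partial)
    print("Collected", len(out_results), "results")

if __name__ == "__main__":
    main()

The executor handles the result passing and the joining of the worker processes, so none of the queue-draining pitfalls above apply.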

Python Threading and interpreter shutdown - Is this fixable, or is it Python Issue #14623?

I have a Python script that uploads files to a cloud account. It was working for a while, but out of nowhere I started getting the 'Exception in thread Thread-1 (most likely raised during interpreter shutdown)' error. After researching, I found this Python issue http://bugs.python.org/issue14623 which states the issue will not get fixed.
However, I'm not exactly sure this would apply to me, and I am hoping someone could point out a fix. I would like to stay with Python's threading and try to avoid multiprocessing since this is I/O bound. This is the stripped-down version (which also has the issue), but in the full version upload.py has a list I'd like to share, so I want it to run in the same memory space.
It always breaks only after it completes and all the files are uploaded. I tried removing 't.daemon = True' and it will just hang (instead of breaking) at that same point (after all the files are uploaded). I also tried removing q.join() along with 't.daemon = True' and it will just hang after completion. Without t.daemon = True and q.join(), I think it is blocking at item = q.get() when it comes to the end of the script execution (just a guess).
main:
import logging
import os
import sys
import json
from os.path import expanduser
from Queue import Queue
from threading import Thread

from auth import Authenticate
from getinfo import get_containers, get_files, get_link
from upload import upload_file
from container_util import create_containers
from filter import MyFilter

home = expanduser("~") + '/'
directory = home + "krunchuploader_logs"
if not os.path.exists(directory):
    os.makedirs(directory)

debug = directory + "/krunchuploader__debug_" + str(os.getpid())
error = directory + "/krunchuploader__error_" + str(os.getpid())
info = directory + "/krunchuploader__info_" + str(os.getpid())

os.open(debug, os.O_CREAT | os.O_EXCL)
os.open(error, os.O_CREAT | os.O_EXCL)
os.open(info, os.O_CREAT | os.O_EXCL)

formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s')

logging.basicConfig(level=logging.DEBUG,
                    format='%(asctime)s - %(levelname)s - %(message)s',
                    filename=debug,
                    filemode='w')

logger = logging.getLogger("krunch")

fh_error = logging.FileHandler(error)
fh_error.setLevel(logging.ERROR)
fh_error.setFormatter(formatter)
fh_error.addFilter(MyFilter(logging.ERROR))

fh_info = logging.FileHandler(info)
fh_info.setLevel(logging.INFO)
fh_info.setFormatter(formatter)
fh_info.addFilter(MyFilter(logging.INFO))

std_out_error = logging.StreamHandler()
std_out_error.setLevel(logging.ERROR)
std_out_info = logging.StreamHandler()
std_out_info.setLevel(logging.INFO)

logger.addHandler(fh_error)
logger.addHandler(fh_info)
logger.addHandler(std_out_error)
logger.addHandler(std_out_info)

def main():
    sys.stdout.write("\x1b[2J\x1b[H")
    print title
    authenticate = Authenticate()
    cloud_url = get_link(authenticate.jsonresp)
    #per 1 million files the list will take
    #approx 300MB of memory.
    file_container_list, file_list = get_files(authenticate, cloud_url)
    cloud_container_list = get_containers(authenticate, cloud_url)
    create_containers(cloud_container_list,
                      file_container_list, authenticate, cloud_url)
    return file_list

def do_the_uploads(file_list):
    def worker():
        while True:
            item = q.get()
            upload_file(item)
            q.task_done()

    q = Queue()
    for i in range(5):
        t = Thread(target=worker)
        t.daemon = True
        t.start()

    for item in file_list:
        q.put(item)

    q.join()

if __name__ == '__main__':
    file_list = main()
    value = raw_input("\nProceed to upload files? Enter [Y/y] for yes: ").upper()
    if value == "Y":
        do_the_uploads(file_list)
upload.py:
import requests  # not shown in the original snippet, but needed by upload_file

def upload_file(file_obj):
    absolute_path_filename, filename, dir_name, token, url = file_obj
    url = url + dir_name + '/' + filename
    header_collection = {
        "X-Auth-Token": token}

    print "Uploading " + absolute_path_filename
    with open(absolute_path_filename) as f:
        r = requests.put(url, data=f, headers=header_collection)
    print "done"
Error output:
Fetching Cloud Container List... Got it!
All containers exist, none need to be added
Proceed to upload files? Enter [Y/y] for yes: y
Uploading /home/one/huh/one/green
Uploading /home/one/huh/one/red
Uploading /home/one/huh/two/white
Uploading /home/one/huh/one/blue
Uploading /home/one/huh/two/yellow
done
Uploading /home/one/huh/two/purple
done
done
done
done
done
Exception in thread Thread-1 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
File "/usr/lib64/python2.7/threading.py", line 808, in __bootstrap_inner
File "/usr/lib64/python2.7/threading.py", line 761, in run
File "krunchuploader.py", line 97, in worker
File "/usr/lib64/python2.7/Queue.py", line 168, in get
File "/usr/lib64/python2.7/threading.py", line 332, in wait
<type 'exceptions.TypeError'>: 'NoneType' object is not callable
UPDATE: I placed a time.sleep(2) at the end of the script, which seems to have fixed the issue. I guess the sleep allows the daemon threads to finish before the script reaches the end of its life and closes? I would have thought the main process would have to wait for the daemons to finish.
You can use a "poison pill" to kill the workers gracefully: after putting all the work in the queue, add one special object per worker that the workers recognize and quit on. You can make the threads non-daemonic so Python will wait for them to finish before shutting down the process.
A concise way to make the workers recognize the poison pill and quit is to use the two-argument form of the iter() builtin in a for loop:
def do_the_uploads(file_list):
    def worker():
        for item in iter(q.get, poison):
            upload_file(item)

    poison = object()
    num_workers = 5
    q = Queue()
    for i in range(num_workers):
        t = Thread(target=worker)
        t.start()

    for item in file_list:
        q.put(item)

    for i in range(num_workers):
        q.put(poison)
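If the caller also needs to block until every upload has finished, one variant (a sketch on top of the snippet above, not part of the original answer; it reuses Queue, Thread, and upload_file from the question's imports) keeps the Thread objects around and joins them after the poison pills are queued:

def do_the_uploads(file_list):
    def worker():
        for item in iter(q.get, poison):
            upload_file(item)

    poison = object()
    num_workers = 5
    q = Queue()
    threads = [Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()

    for item in file_list:
        q.put(item)
    for _ in range(num_workers):
        q.put(poison)

    # Non-daemon threads keep the interpreter alive anyway, but an explicit
    # join lets code after this call assume that all uploads are finished.
    for t in threads:
        t.join()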

how to kill zombie processes created by multiprocessing module?

I'm very new to the multiprocessing module, and I just tried to create the following: I have one process whose job is to get messages from RabbitMQ and pass them to an internal queue (multiprocessing.Queue). Then what I want to do is spawn a process when a new message comes in. It works, but after the job is finished it leaves a zombie process that is not terminated by its parent. Here is my code:
Main Process:
#!/usr/bin/env python
import multiprocessing
import logging
import consumer
import producer
import worker
import time
import base

conf = base.get_settings()
logger = base.logger(identity='launcher')

request_order_q = multiprocessing.Queue()
result_order_q = multiprocessing.Queue()
request_status_q = multiprocessing.Queue()
result_status_q = multiprocessing.Queue()

CONSUMER_KEYS = [{'queue':'product.order',
                  'routing_key':'product.order',
                  'internal_q':request_order_q}]
#                 {'queue':'product.status',
#                  'routing_key':'product.status',
#                  'internal_q':request_status_q}]

def main():
    # Launch consumers
    for key in CONSUMER_KEYS:
        cons = consumer.RabbitConsumer(rabbit_q=key['queue'],
                                       routing_key=key['routing_key'],
                                       internal_q=key['internal_q'])
        cons.start()

    # Check request_order_q; if not empty, spawn a process to handle the message
    while True:
        time.sleep(0.5)
        if not request_order_q.empty():
            handler = worker.Worker(request_order_q.get())
            logger.info('Launching Worker')
            handler.start()

if __name__ == "__main__":
    main()
And here is my Worker:
import multiprocessing
import sys
import time
import base

conf = base.get_settings()
logger = base.logger(identity='worker')

class Worker(multiprocessing.Process):

    def __init__(self, msg):
        super(Worker, self).__init__()
        self.msg = msg
        self.daemon = True

    def run(self):
        logger.info('%s' % self.msg)
        time.sleep(10)
        sys.exit(1)
So after all the messages get processed, I can still see the processes with the ps aux command. But I would really like them to be terminated once finished.
Thanks.
Using multiprocessing.active_children is better than Process.join. The function active_children cleans any zombies created since the last call to active_children. The method join awaits the selected process. During that time, other processes can terminate and become zombies, but the parent process will not notice, until the awaited method is joined. To see this in action:
import multiprocessing as mp
import time

def main():
    n = 3
    c = list()
    for i in range(n):
        d = dict(i=i)
        p = mp.Process(target=count, kwargs=d)
        p.start()
        c.append(p)
    for p in reversed(c):
        p.join()
        print('joined')

def count(i):
    print(f'{i} going to sleep')
    time.sleep(i * 10)
    print(f'{i} woke up')

if __name__ == '__main__':
    main()
The above will create 3 processes that terminate 10 seconds apart each. As the code is, the last process is joined first, so the other two, which terminated earlier, will be zombies for 20 seconds. You can see them with:
ps aux | grep Z
There will be no zombies if the processes are awaited in the sequence that they will terminate. Remove the call to the function reversed to see this case. However, in real applications we rarely know the sequence that children will terminate, so using the method multiprocessing.Process.join will result in some zombies.
The alternative active_children does not leave any zombies.
In the above example, replace the loop for p in reversed(c): with:
while True:
    time.sleep(1)
    if not mp.active_children():
        break
and see what happens.
A couple of things:
Make sure the parent joins its children, to avoid zombies. See Python Multiprocessing Kill Processes
You can check whether a child is still running with the is_alive() member function. See http://docs.python.org/2/library/multiprocessing.html#multiprocessing.Process
Use active_children.
multiprocessing.active_children
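A small self-contained sketch of that, separate from the question's code (the job function and timings here are made up for illustration):

import multiprocessing as mp
import time

def job(n):
    time.sleep(n)

if __name__ == '__main__':
    for n in range(3):
        mp.Process(target=job, args=(n,)).start()
    # active_children() both reports and reaps children that have already
    # exited, so no zombies accumulate while the parent keeps polling.
    while mp.active_children():
        time.sleep(0.5)
    print('all children reaped')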

python threading : cannot switch thread to Daemon

I would expect the following code to execute simultaneously, and all filenames from the os.walk iterations that got a random timeout of 0 to end up in the result list. All threads that got some timeout would be put into daemon mode and killed as soon as the script reaches its end. However, the script respects all timeouts for each thread.
Why is this happening? Shouldn't it put all threads in the background and kill them if they have not finished and returned a result before the end of the script execution? Thank you.
import threading
import os
import time
import random

def check_file(file_name, timeout):
    time.sleep(timeout)
    print file_name
    result.append(file_name)

result = []
for home, dirs, files in os.walk("."):
    for ifile in files:
        filename = '/'.join([home, ifile])
        t = threading.Thread(target=check_file(filename, random.randint(0,5)))
        t.setDaemon(True)
        t.start()
print result
Solution: I found my mistake:
t = threading.Thread(target=check_file(filename, random.randint(0,5)))
has to be
t = threading.Thread(target=check_file, args=(filename, random.randint(0,5)))
In this case, threading will spawn a thread with the function as an object and give it the arguments. In my initial example, the function call with its args is resolved BEFORE the thread spawns, and that is fair.
However, the example above works for me on 2.7.3, but on 2.7.2 I cannot get it working. I am getting an exception that
the function check_file accepts exactly 1 argument (34 given).
Solution:
On 2.7.2 I had to put a trailing comma in the args tuple, given that I have only one variable (without the comma, args=(filename) is just the string itself, so Thread unpacks it into one argument per character, hence the 34). God knows why this does not affect the 2.7.3 version. It was
t = threading.Thread(target=check_file, args=(filename))
and it started to work with
t = threading.Thread(target=check_file, args=(filename,))
I understand what you were trying to do, but you're not using the right format for threading. I fixed your example...look up the Queue class on how to do this properly.
Secondly, never ever do string manipulation on file paths. Use the os.path module; there's a lot more to it than adding separators between strings, things you and I don't think about most of the time.
Good luck!
import threading
import os
import time
import random
import Queue

def check_file():
    while True:
        item = q.get()
        time.sleep(item[1])
        print item
        q.task_done()

q = Queue.Queue()
result = []

for home, dirs, files in os.walk("."):
    for ifile in files:
        filename = os.path.join(home, ifile)
        q.put((filename, random.randint(0, 5)))

number_of_threads = 25
for i in range(number_of_threads):
    t = threading.Thread(target=check_file)
    t.daemon = True
    t.start()

q.join()
print result

python3 multiprocessing example crashed my pc :(

I am new to multiprocessing.
I have run example code for two 'highly recommended' multiprocessing examples given in response to other Stack Overflow multiprocessing questions. Here is an example of one (which I dare not run again!):
test2.py (running from pydev)
import multiprocessing

class MyFancyClass(object):

    def __init__(self, name):
        self.name = name

    def do_something(self):
        proc_name = multiprocessing.current_process().name
        print(proc_name, self.name)

def worker(q):
    obj = q.get()
    obj.do_something()

queue = multiprocessing.Queue()
p = multiprocessing.Process(target=worker, args=(queue,))
p.start()

queue.put(MyFancyClass('Fancy Dan'))

# Wait for the worker to finish
queue.close()
queue.join_thread()
p.join()
When I run this, my computer slows down immediately. It gets incrementally slower. After some time I managed to get into the task manager, only to see MANY, MANY python.exe entries under the processes tab. After trying to end some of them, my mouse stopped moving. It was the second time I was forced to reboot.
I am too scared to attempt a third example...
Running: Intel(R) Core(TM) i7 CPU 870 @ 2.93GHz (8 CPUs), ~2.9GHz, on Win7 64-bit.
If anyone knows what the issue is and can provide a VERY SIMPLE example of multiprocessing (send a string to a subprocess, alter it, and send it back for printing), I would be very grateful.
From the docs:
Make sure that the main module can be safely imported by a new Python interpreter without causing unintended side effects (such as starting a new process).
Thus, on Windows, you must wrap your code inside an if __name__ == '__main__': block.
For example, this sends a string to the worker process, the string is reversed and the result is printed by the main process:
import multiprocessing as mp

def worker(inq, outq):
    obj = inq.get()
    obj = obj[::-1]
    outq.put(obj)

if __name__ == '__main__':
    inq = mp.Queue()
    outq = mp.Queue()
    p = mp.Process(target=worker, args=(inq, outq))
    p.start()
    inq.put('Fancy Dan')

    # Wait for the worker to finish
    p.join()

    result = outq.get()
    print(result)
Because of the way multiprocessing works on Windows (child processes import the __main__ module) the __main__ module cannot actually run anything when imported -- any code that should execute when run directly must be protected by the if __name__ == '__main__' idiom. Your corrected code:
import multiprocessing

class MyFancyClass(object):

    def __init__(self, name):
        self.name = name

    def do_something(self):
        proc_name = multiprocessing.current_process().name
        print(proc_name, self.name)

def worker(q):
    obj = q.get()
    obj.do_something()

if __name__ == '__main__':
    queue = multiprocessing.Queue()
    p = multiprocessing.Process(target=worker, args=(queue,))
    p.start()

    queue.put(MyFancyClass('Fancy Dan'))

    # Wait for the worker to finish
    queue.close()
    queue.join_thread()
    p.join()
Might I suggest this link? It's using threads, instead of multiprocessing, but many of the principles are the same.
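For comparison, a minimal threaded sketch of the same string-reversal round trip (this mirrors the process-based example above; it is not taken from the linked article):

import threading
try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2

def worker(inq, outq):
    obj = inq.get()
    outq.put(obj[::-1])

inq = queue.Queue()
outq = queue.Queue()
t = threading.Thread(target=worker, args=(inq, outq))
t.start()
inq.put('Fancy Dan')

# Wait for the worker thread to finish, then read the reversed string.
t.join()
print(outq.get())

Because threads share the parent's memory and no child interpreter re-imports the main module, the if __name__ == '__main__' guard is not required here, although it remains good practice.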
