Python multiprocessing with dill.load - python

I'm having great difficulties in getting functions executed in MultiProcessing Pool method that are loaded through
dill.load('somefile.sav','rb')
The code is something like this:
import dill as dill
import multiprocessing_on_dill as mp
dill_func = dill.load('somefile.sav','rb')
def some_mp_func(x):
dill_func(x)
if (__name__ == '__main__'):
__spec__ = "ModuleSpec(name='builtins', loader=<class '_frozen_importlib.BuiltinImporter'>)"
x="test"
pool = mp.Pool(processes = (8))
pool.map(some_mp_func, x)
pool.close()
pool.join()
dill_func is an SKlearn Pipeline.
The output is:
NameError: name 'Y' is not defined
Where 'Y' is a function within dill_func, part of a Class in dill_func.
Running some_mp_func(x) without Multiprocessing runs perfectly fine, no Name errors. Any suggestions?

SOLUTION: when dumping use this setting
dill.settings['recurse']=True

Related

RunTime Error: Python multiprocessing not using fork to start your child processes and forgotten to use the proper idiom in the main module? [duplicate]

I am trying my very first formal python program using Threading and Multiprocessing on a windows machine. I am unable to launch the processes though, with python giving the following message. The thing is, I am not launching my threads in the main module. The threads are handled in a separate module inside a class.
EDIT: By the way this code runs fine on ubuntu. Not quite on windows
RuntimeError:
Attempt to start a new process before the current process
has finished its bootstrapping phase.
This probably means that you are on Windows and you have
forgotten to use the proper idiom in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce a Windows executable.
My original code is pretty long, but I was able to reproduce the error in an abridged version of the code. It is split in two files, the first is the main module and does very little other than import the module which handles processes/threads and calls a method. The second module is where the meat of the code is.
testMain.py:
import parallelTestModule
extractor = parallelTestModule.ParallelExtractor()
extractor.runInParallel(numProcesses=2, numThreads=4)
parallelTestModule.py:
import multiprocessing
from multiprocessing import Process
import threading
class ThreadRunner(threading.Thread):
""" This class represents a single instance of a running thread"""
def __init__(self, name):
threading.Thread.__init__(self)
self.name = name
def run(self):
print self.name,'\n'
class ProcessRunner:
""" This class represents a single instance of a running process """
def runp(self, pid, numThreads):
mythreads = []
for tid in range(numThreads):
name = "Proc-"+str(pid)+"-Thread-"+str(tid)
th = ThreadRunner(name)
mythreads.append(th)
for i in mythreads:
i.start()
for i in mythreads:
i.join()
class ParallelExtractor:
def runInParallel(self, numProcesses, numThreads):
myprocs = []
prunner = ProcessRunner()
for pid in range(numProcesses):
pr = Process(target=prunner.runp, args=(pid, numThreads))
myprocs.append(pr)
# if __name__ == 'parallelTestModule': #This didnt work
# if __name__ == '__main__': #This obviously doesnt work
# multiprocessing.freeze_support() #added after seeing error to no avail
for i in myprocs:
i.start()
for i in myprocs:
i.join()
On Windows the subprocesses will import (i.e. execute) the main module at start. You need to insert an if __name__ == '__main__': guard in the main module to avoid creating subprocesses recursively.
Modified testMain.py:
import parallelTestModule
if __name__ == '__main__':
extractor = parallelTestModule.ParallelExtractor()
extractor.runInParallel(numProcesses=2, numThreads=4)
Try putting your code inside a main function in testMain.py
import parallelTestModule
if __name__ == '__main__':
extractor = parallelTestModule.ParallelExtractor()
extractor.runInParallel(numProcesses=2, numThreads=4)
See the docs:
"For an explanation of why (on Windows) the if __name__ == '__main__'
part is necessary, see Programming guidelines."
which say
"Make sure that the main module can be safely imported by a new Python
interpreter without causing unintended side effects (such a starting a
new process)."
... by using if __name__ == '__main__'
Though the earlier answers are correct, there's a small complication it would help to remark on.
In case your main module imports another module in which global variables or class member variables are defined and initialized to (or using) some new objects, you may have to condition that import in the same way:
if __name__ == '__main__':
import my_module
As #Ofer said, when you are using another libraries or modules, you should import all of them inside the if __name__ == '__main__':
So, in my case, ended like this:
if __name__ == '__main__':
import librosa
import os
import pandas as pd
run_my_program()
hello here is my structure for multi process
from multiprocessing import Process
import time
start = time.perf_counter()
def do_something(time_for_sleep):
print(f'Sleeping {time_for_sleep} second...')
time.sleep(time_for_sleep)
print('Done Sleeping...')
p1 = Process(target=do_something, args=[1])
p2 = Process(target=do_something, args=[2])
if __name__ == '__main__':
p1.start()
p2.start()
p1.join()
p2.join()
finish = time.perf_counter()
print(f'Finished in {round(finish-start,2 )} second(s)')
you don't have to put imports in the if __name__ == '__main__':, just running the program you wish to running inside
In yolo v5 with python 3.8.5
if __name__ == '__main__':
from yolov5 import train
train.run()
In my case it was a simple bug in the code, using a variable before it was created. Worth checking that out before trying the above solutions. Why I got this particular error message, Lord knows.
The below solution should work for both python multiprocessing and pytorch multiprocessing.
As other answers mentioned that the fix is to have if __name__ == '__main__': but I faced several issues in identifying where to start because I am using several scripts and modules. When I can call my first function inside main then everything before it started to create multiple processes (not sure why).
Putting it at the very first line (even before the import) worked. Only calling the first function return timeout error. The below is the first file of my code and multiprocessing is used after calling several functions but putting main in the first seems to be the only fix here.
if __name__ == '__main__':
from mjrl.utils.gym_env import GymEnv
from mjrl.policies.gaussian_mlp import MLP
from mjrl.baselines.quadratic_baseline import QuadraticBaseline
from mjrl.baselines.mlp_baseline import MLPBaseline
from mjrl.algos.npg_cg import NPG
from mjrl.algos.dapg import DAPG
from mjrl.algos.behavior_cloning import BC
from mjrl.utils.train_agent import train_agent
from mjrl.samplers.core import sample_paths
import os
import json
import mjrl.envs
import mj_envs
import time as timer
import pickle
import argparse
import numpy as np
# ===============================================================================
# Get command line arguments
# ===============================================================================
parser = argparse.ArgumentParser(description='Policy gradient algorithms with demonstration data.')
parser.add_argument('--output', type=str, required=True, help='location to store results')
parser.add_argument('--config', type=str, required=True, help='path to config file with exp params')
args = parser.parse_args()
JOB_DIR = args.output
if not os.path.exists(JOB_DIR):
os.mkdir(JOB_DIR)
with open(args.config, 'r') as f:
job_data = eval(f.read())
assert 'algorithm' in job_data.keys()
assert any([job_data['algorithm'] == a for a in ['NPG', 'BCRL', 'DAPG']])
job_data['lam_0'] = 0.0 if 'lam_0' not in job_data.keys() else job_data['lam_0']
job_data['lam_1'] = 0.0 if 'lam_1' not in job_data.keys() else job_data['lam_1']
EXP_FILE = JOB_DIR + '/job_config.json'
with open(EXP_FILE, 'w') as f:
json.dump(job_data, f, indent=4)
# ===============================================================================
# Train Loop
# ===============================================================================
e = GymEnv(job_data['env'])
policy = MLP(e.spec, hidden_sizes=job_data['policy_size'], seed=job_data['seed'])
baseline = MLPBaseline(e.spec, reg_coef=1e-3, batch_size=job_data['vf_batch_size'],
epochs=job_data['vf_epochs'], learn_rate=job_data['vf_learn_rate'])
# Get demonstration data if necessary and behavior clone
if job_data['algorithm'] != 'NPG':
print("========================================")
print("Collecting expert demonstrations")
print("========================================")
demo_paths = pickle.load(open(job_data['demo_file'], 'rb'))
########################################################################################
demo_paths = demo_paths[0:3]
print (job_data['demo_file'], len(demo_paths))
for d in range(len(demo_paths)):
feats = demo_paths[d]['features']
feats = np.vstack(feats)
demo_paths[d]['observations'] = feats
########################################################################################
bc_agent = BC(demo_paths, policy=policy, epochs=job_data['bc_epochs'], batch_size=job_data['bc_batch_size'],
lr=job_data['bc_learn_rate'], loss_type='MSE', set_transforms=False)
in_shift, in_scale, out_shift, out_scale = bc_agent.compute_transformations()
bc_agent.set_transformations(in_shift, in_scale, out_shift, out_scale)
bc_agent.set_variance_with_data(out_scale)
ts = timer.time()
print("========================================")
print("Running BC with expert demonstrations")
print("========================================")
bc_agent.train()
print("========================================")
print("BC training complete !!!")
print("time taken = %f" % (timer.time() - ts))
print("========================================")
# if job_data['eval_rollouts'] >= 1:
# score = e.evaluate_policy(policy, num_episodes=job_data['eval_rollouts'], mean_action=True)
# print("Score with behavior cloning = %f" % score[0][0])
if job_data['algorithm'] != 'DAPG':
# We throw away the demo data when training from scratch or fine-tuning with RL without explicit augmentation
demo_paths = None
# ===============================================================================
# RL Loop
# ===============================================================================
rl_agent = DAPG(e, policy, baseline, demo_paths,
normalized_step_size=job_data['rl_step_size'],
lam_0=job_data['lam_0'], lam_1=job_data['lam_1'],
seed=job_data['seed'], save_logs=True
)
print("========================================")
print("Starting reinforcement learning phase")
print("========================================")
ts = timer.time()
train_agent(job_name=JOB_DIR,
agent=rl_agent,
seed=job_data['seed'],
niter=job_data['rl_num_iter'],
gamma=job_data['rl_gamma'],
gae_lambda=job_data['rl_gae'],
num_cpu=job_data['num_cpu'],
sample_mode='trajectories',
num_traj=job_data['rl_num_traj'],
num_samples= job_data['rl_num_samples'],
save_freq=job_data['save_freq'],
evaluation_rollouts=job_data['eval_rollouts'])
print("time taken = %f" % (timer.time()-ts))
I ran into the same problem. #ofter method is correct because there are some details to pay attention to. The following is the successful debugging code I modified for your reference:
if __name__ == '__main__':
import matplotlib.pyplot as plt
import numpy as np
def imgshow(img):
img = img / 2 + 0.5
np_img = img.numpy()
plt.imshow(np.transpose(np_img, (1, 2, 0)))
plt.show()
dataiter = iter(train_loader)
images, labels = dataiter.next()
imgshow(torchvision.utils.make_grid(images))
print(' '.join('%5s' % classes[labels[i]] for i in range(4)))
For the record, I don't have a subroutine, I just have a main program, but I have the same problem as you. This demonstrates that when importing a Python library file in the middle of a program segment, we should add:
if __name__ == '__main__':
I tried the tricks mentioned above on the following very simple code. but I still cannot stop it from resetting on any of my Window machines with Python 3.8/3.10. I would very much appreciate it if you could tell me where I am wrong.
print('script reset')
def do_something(inp):
print('Done!')
if __name__ == '__main__':
from multiprocessing import Process, get_start_method
print('main reset')
print(get_start_method())
Process(target=do_something, args=[1]).start()
print('Finished')
output displays:
script reset
main reset
spawn
Finished
script reset
Done!
Update:
As far as I understand, you guys are not preventing either the script containing the __main__ or the .start() from resetting (which doesn't happen in Linux), rather you are suggesting workarounds so that we don't see the reset. One has to make all imports minimal and put them in each function separately, but it is still, relative to Linux, slow.

How to use `Manager` when using "spawn"

Here is some pseudocode for what I'm doing
import multiprocessing as mp
from multiprocessing import Manager
from tqdm import tqdm
def loop(arg):
# do stuff
# ...
results.append(result_of_stuff)
if __name__ == '__main__':
manager = Manager()
results = manager.list()
with mp.get_context('spawn').Pool(4) as pool:
list(tqdm(pool.imap(loop, ls), total=len(ls)))
# do stuff with `results`
# ...
So the issue here is that loop doesn't know about results. I have one working way to do this and it's by using "fork" instead of "spawn". But I need to use "spawn" for reasons beyond the scope of my question..
So what is the minimal change I need to make for this to work? And I really want to keep tqdm hence the use of imap
PS: I'm on Linux
You can use functools.partial to add the extra parameters:
import multiprocessing as mp
import os
from functools import partial
from multiprocessing import Manager
from tqdm import tqdm
def loop(results, arg):
results.append(len(arg))
def main():
ctx = mp.get_context("spawn")
manager = Manager()
l = manager.list()
partial_loop = partial(loop, l)
ls = os.listdir("/tmp")
with ctx.Pool() as pool:
results = list(tqdm(pool.imap(partial_loop, ls), total=len(ls)))
print(f"Sum: {sum(l)}")
if __name__ == "__main__":
main()
There is some overhead with this approach as it will spawn a child process to host the Manager server.
Since you will process the results in the main process anyway I would do something like this instead (but this depends on your circumstances of course):
import multiprocessing as mp
import os
from tqdm import tqdm
def loop(arg):
return len(arg)
def main():
ctx = mp.get_context("spawn")
ls = os.listdir("/tmp")
with ctx.Pool() as pool:
results = list(tqdm(pool.imap(loop, ls), total=len(ls)))
print(f"Sum: {sum(results)}")
if __name__ == "__main__":
main()
I know you have already accepted an answer, but let me add my "two cents":
The other way of solving your issue is by initializing each process in your pool with the global variable results as originally intended. The problem was that when using spawn newly created processes do not inherit the address space of the main process (which included the definition of results). Instead execution starts from the top of the program. But the code that creates results never gets executed because of the if __name__ == '__main__' check. But that is a good thing because you do not want a separate instance of this list anyway.
So how do we share the same instance of global variable results across all processes? This is accomplished by using a pool initializer as follows. Also, if you want an accurate progress bar, you should really use imap_unordered instead of imap so that the progress bar is updated in task-completion order rather than in the order in which tasks were submitted. For example, if the first task submitted happens to be the last task to complete, then using imap would result in the progress bar not progressing until all the tasks completed and then it would shoot to 100% all at once.
But Note: The doumentation for imap_unordered only states that the results will be returned in arbitrary order, not completion order. It does however seem that when a chunksize argument of 1 is used (the default if not explicitly specified), the results are returned in completion order. If you do not want to rely on this, then use instead apply_async specifying a callback function that will update the progrss bar. See the last code example.
import multiprocessing as mp
from multiprocessing import Manager
from tqdm import tqdm
def init_pool(the_results):
global results
results = the_results
def loop(arg):
import time
# do stuff
# ...
time.sleep(1)
results.append(arg ** 2)
if __name__ == '__main__':
manager = Manager()
results = manager.list()
ls = list(range(1, 10))
with mp.get_context('spawn').Pool(4, initializer=init_pool, initargs=(results,)) as pool:
list(tqdm(pool.imap_unordered(loop, ls), total=len(ls)))
print(results)
Update: Another (Better) Way
import multiprocessing as mp
from tqdm import tqdm
def loop(arg):
import time
# do stuff
# ...
time.sleep(1)
return arg ** 2
if __name__ == '__main__':
results = []
ls = list(range(1, 10))
with mp.get_context('spawn').Pool(4) as pool:
with tqdm(total=len(ls)) as pbar:
for v in pool.imap_unordered(loop, ls):
results.append(v)
pbar.update(1)
print(results)
Update: The Safest Way
import multiprocessing as mp
from tqdm import tqdm
def loop(arg):
import time
# do stuff
# ...
time.sleep(1)
return arg ** 2
def my_callback(v):
results.append(v)
pbar.update(1)
if __name__ == '__main__':
results = []
ls = list(range(1, 10))
with mp.get_context('spawn').Pool(4) as pool:
with tqdm(total=len(ls)) as pbar:
for arg in ls:
pool.apply_async(loop, args=(arg,), callback=(my_callback))
pool.close()
pool.join()
print(results)

Python extract element from iterator

I have the following code but cannot get out the results from the iterator
from multiprocess import freeze_support
from pathos.multiprocessing import ProcessPool
if __name__ == "__main__":
freeze_support()
pool = ProcessPool(nodes=4)
results = pool.uimap(pow, [1,2,3,4], [5,6,7,8])
print("...")
print(list(results))
The code does not error it just hangs.
There are a couple subtleties to get this to work, but the short version is imap or uimap are iterators unlike map in the python multiprocessing example. To extract the results it needs to be in a for loop. If inside a class you also need the called method to be a #staticmethod
from multiprocessing import freeze_support
from multiprocessing import Pool
def f(vars):
return vars[0]**vars[1]
if __name__ == "__main__":
freeze_support()
pool = Pool(4)
for run in pool.imap(f, [(1,5), (2,8), (3,9)]):
print(run)

Multiprocessing 'Pool' option deployed with python 3 on windowns 10 not working giving error

I'm trying to use the multiprocessing library for Python 3. The module is imported correctly and does not show any error, however when using it, I get an error.
Here is my code:
from multiprocessing import Pool
import time
start_time = time.process_time()
p = Pool(10)
def print_range():
for i in range(10000):
print('Something')
end_time = time.process_time()
print(end_time-start_time)
p.map(print_range())
However I get this error:
ImportError: cannot import name 'Pool' from 'multiprocessing' (C: ...path file)
Has anyone encountered this error, and have a solution for it? Thanks
Might be related to safely importing main. See this section in the documentation. Specifically, you'd need to change your code like this:
from multiprocessing import Pool
import time
def print_range():
for i in range(10000):
print('Something')
if __name__ == '__main__':
start_time = time.process_time()
p = Pool(10)
end_time = time.process_time()
print(end_time-start_time)
p.map(print_range()) # incorrect usage
In addition, your usage of map() is not correct. See the documentation for examples, or use p.apply(print_range) instead.

Using Multiprocessing with Modules

I am writing a module such that in one function I want to use the Pool function from the multiprocessing library in Python 3.6. I have done some research on the problem and the it seems that you cannot use if __name__=="__main__" as the code is not being run from main. I have also noticed that the python pool processes get initialized in my task manager but essentially are stuck.
So for example:
class myClass()
...
lots of different functions here
...
def multiprocessFunc()
do stuff in here
def funcThatCallsMultiprocessFunc()
array=[array of filenames to be called]
if __name__=="__main__":
p = Pool(processes=20)
p.map_async(multiprocessFunc,array)
I tried to remove the if __name__=="__main__" part but still no dice. any help would appreciated.
It seems to me that your have just missed out a self. from your code. I should think this will work:
class myClass():
...
# lots of different functions here
...
def multiprocessFunc(self, file):
# do stuff in here
def funcThatCallsMultiprocessFunc(self):
array = [array of filenames to be called]
p = Pool(processes=20)
p.map_async(self.multiprocessFunc, array) #added self. here
Now having done some experiments, I see that map_async could take quite some time to start up (I think because multiprocessing creates processes) and any test code might call funcThatCallsMultiprocessFunc and then quit before the Pool has got started.
In my tests I had to wait for over 10 seconds after funcThatCallsMultiprocessFunc before calls to multiprocessFunc started. But once started, they seemed to run just fine.
This is the actual code I've used:
MyClass.py
from multiprocessing import Pool
import time
import string
class myClass():
def __init__(self):
self.result = None
def multiprocessFunc(self, f):
time.sleep(1)
print(f)
return f
def funcThatCallsMultiprocessFunc(self):
array = [c for c in string.ascii_lowercase]
print(array)
p = Pool(processes=20)
p.map_async(self.multiprocessFunc, array, callback=self.done)
p.close()
def done(self, arg):
self.result = 'Done'
print('done', arg)
Run.py
from MyClass import myClass
import time
def main():
c = myClass()
c.funcThatCallsMultiprocessFunc()
for i in range(30):
print(i, c.result)
time.sleep(1)
if __name__=="__main__":
main()
The if __name__=='__main__' construct is an import protection. You want to use it, to stop multiprocessing from running your setup on import.
In your case, you can leave out this protection in the class setup. Be sure to protect the execution points of the class in the calling file like this:
def apply_async_with_callback():
pool = mp.Pool(processes=30)
for i in range(z):
pool.apply_async(parallel_function, args = (i,x,y, ), callback = callback_function)
pool.close()
pool.join()
print "Multiprocessing done!"
if __name__ == '__main__':
apply_async_with_callback()

Categories