Why does multiprocessing.Pool give such a strange error in Python? [duplicate]

Consider the following simple code:
from multiprocessing import Process, freeze_support
def foo():
    print('hello')
if __name__ == '__main__':
    freeze_support()
    p = Process(target=foo)
    p.start()
It works on Linux and on Windows with Python 3.3, but fails on Windows with Python 2.7:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "c:\Python27\lib\multiprocessing\forking.py", line 346, in main
prepare(preparation_data)
File "c:\Python27\lib\multiprocessing\forking.py", line 454, in prepare
assert main_name not in sys.modules, main_name
AssertionError: thread
Generally speaking, all the multiprocessing examples I tried fail on that setup. Why?

This is a known bug:
http://bugs.python.org/issue10845
I am not sure whether the fix will ever be backported to 2.7.x.

Related

Multiprocessing code fails when run with pdb?

Related to "Python Multiprocessing error: AttributeError: module '__main__' has no attribute '__spec__'", but arising from different circumstances.
I'm encountering an issue in Python 3.7.4 when I try to run multiprocessing code with pdb. The issue replicates with the basic multiprocessing example from https://docs.python.org/3.6/library/multiprocessing.html :
from multiprocessing import Pool
def f(x):
    return x*x
if __name__ == '__main__':
    with Pool(5) as p:
        print(p.map(f, [1, 2, 3]))
This runs fine (outputs [1, 4, 9]) when run directly from Python via python.exe testcase.py. However, it does not work under pdb; python.exe -m pdb testcase.py fails with an error:
Traceback (most recent call last):
File "c:\python37\lib\pdb.py", line 1697, in main
pdb._runscript(mainpyfile)
File "c:\python37\lib\pdb.py", line 1566, in _runscript
self.run(statement)
File "c:\python37\lib\bdb.py", line 585, in run
exec(cmd, globals, locals)
File "<string>", line 1, in <module>
File "c:\users\max\desktop\projects\errortest.py", line 1, in <module>
from multiprocessing import Pool
File "c:\python37\lib\multiprocessing\context.py", line 119, in Pool
context=self.get_context())
File "c:\python37\lib\multiprocessing\pool.py", line 176, in __init__
self._repopulate_pool()
File "c:\python37\lib\multiprocessing\pool.py", line 241, in _repopulate_pool
w.start()
File "c:\python37\lib\multiprocessing\process.py", line 112, in start
self._popen = self._Popen(self)
File "c:\python37\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "c:\python37\lib\multiprocessing\popen_spawn_win32.py", line 33, in __init__
prep_data = spawn.get_preparation_data(process_obj._name)
File "c:\python37\lib\multiprocessing\spawn.py", line 172, in get_preparation_data
main_mod_name = getattr(main_module.__spec__, "name", None)
AttributeError: module '__main__' has no attribute '__spec__'
Uncaught exception. Entering post mortem debugging
Running 'cont' or 'step' will restart the program
> c:\python37\lib\multiprocessing\spawn.py(172)get_preparation_data()
-> main_mod_name = getattr(main_module.__spec__, "name", None)
I hesitate to think that I've found a bug in a pair of modules that have been important parts of Python for over a decade. Is something incorrect here?
This is a limitation of multiprocessing on Windows. This question contains a good explanation of why this is so. A quick Google search suggests that PuDB may be able to help with debugging multiprocessing code, but I have not used it before.
The following is from the python docs:
Functionality within this package requires that the main module be importable by the children. This is covered in Programming guidelines however it is worth pointing out here.
This means that some examples, such as the multiprocessing.pool.Pool examples will not work in the interactive interpreter.
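A commonly cited workaround, which is not part of the answer above and should be treated as an assumption rather than an official fix, is to define __spec__ in the script before creating the pool. The traceback shows multiprocessing reading main_module.__spec__, and that attribute is missing when the script runs under pdb. Setting it to None appears to let multiprocessing fall back to locating the main module through its file path. A minimal sketch based on the example from the question:

```python
from multiprocessing import Pool


def f(x):
    return x * x


if __name__ == '__main__':
    # Assumption: under "python -m pdb" the __main__ module has no __spec__
    # attribute, which is what triggers the AttributeError in the traceback.
    # Defining it as None mirrors what a normally launched script has.
    __spec__ = None
    with Pool(5) as p:
        print(p.map(f, [1, 2, 3]))
```

When the script is started directly with python.exe testcase.py, the extra assignment should have no effect, because __spec__ is already None in that case. It is probably not appropriate for code launched with python -m package.module, where __spec__ carries real information.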

RuntimeError on Windows trying to run a simple program

I have this simple program:
from PIL import Image
import pyscreenshot as ImageGrab
print "hi"
im=ImageGrab.grab()
im.show()
This works perfectly fine on Ubuntu, but it gives the following error on Windows:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Python27\lib\multiprocessing\forking.py", line 380, in main
prepare(preparation_data)
File "C:\Python27\lib\multiprocessing\forking.py", line 509, in prepare
'__parents_main__', file, path_name, etc
File "C:\Users\Administrator\Downloads\sample.py", line 5, in <module>
im=ImageGrab.grab()
File "C:\Python27\lib\site-packages\pyscreenshot\__init__.py", line 46, in grab
return _grab(to_file=False, childprocess=childprocess, backend=backend, bbox=bbox)
File "C:\Python27\lib\site-packages\pyscreenshot\__init__.py", line 29, in _grab
return run_in_childprocess(_grab_simple, imcodec.codec, to_file, backend, bbox, filename)
File "C:\Python27\lib\site-packages\pyscreenshot\procutil.py", line 28, in run_in_childprocess
p.start()
File "C:\Python27\lib\multiprocessing\process.py", line 130, in start
self._popen = Popen(self)
File "C:\Python27\lib\multiprocessing\forking.py", line 258, in __init__
cmd = get_command_line() + [rhandle]
File "C:\Python27\lib\multiprocessing\forking.py", line 358, in get_command_line
is not going to be frozen to produce a Windows executable.''')
RuntimeError:
Attempt to start a new process before the current process
has finished its bootstrapping phase.
This probably means that you are on Windows and you have
forgotten to use the proper idiom in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce a Windows executable.
My code does not use multiprocessing. I saw some other answers, but they did not help.
Can someone please suggest what the problem might be?
There's a known problem with the multiprocessing module on Windows (to elaborate on roganosh's remark): code that uses multiprocessing must run inside a function or under the if __name__ == "__main__": guard (after all the imports have run), not at the top level of the script. This is because Windows starts each child by launching a new Python interpreter that re-imports the main script (hence the "bootstrapping phase" error). There is no such issue on Linux. This looks very much like the same issue as "RuntimeError on windows trying python multiprocessing".
Try changing the code to this:
from PIL import Image
import pyscreenshot as ImageGrab
if __name__ == "__main__":
    im=ImageGrab.grab()
    im.show()
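The effect of the guard can be seen in a small script that does not depend on pyscreenshot. This is only a sketch: work and main are hypothetical names, and Pool stands in for whatever a library starts in the background. On Windows the module-level print should run once in the parent and once more in every child, because each child re-imports the file. The pool itself is created only in the process where __name__ is "__main__".

```python
import multiprocessing

# Module-level code runs in the parent and, on Windows, again in every
# child process, because each child re-imports this file.
print("module loaded as", __name__)


def work(number):
    """Hypothetical stand-in for what a library runs in its child process."""
    return number * 2


def main():
    # Starting processes is only safe here, behind the guard below.
    with multiprocessing.Pool(processes=2) as pool:
        return pool.map(work, [1, 2, 3])


if __name__ == "__main__":
    # Has no effect unless the program is frozen into a Windows executable.
    multiprocessing.freeze_support()
    print(main())
```

If the call to main() were moved to the top level of the file, each child would try to start its own pool while it is still being imported, which is the situation the RuntimeError describes.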
The traceback indicates that multiprocessing is being used in the background, not explicitly in your own code. Specifically, it is being called by pyscreenshot\procutil.py. The relevant lines of the traceback:
File "C:\Python27\lib\site-packages\pyscreenshot\procutil.py", line 28, in run_in_childprocess
p.start()
File "C:\Python27\lib\multiprocessing\process.py", line 130, in start
self._popen = Popen(self)
Since the issue is in the library, there is nothing you can do except modify the library yourself. However, this page says that pyscreenshot is a "Replacement for the ImageGrab Module, which works on Windows only". So instead, you should install the ImageGrab library, which seems to do exactly the same thing but is only compatible with Windows and macOS (see here).

python multiprocessing returning error 'module' object has no attribute 'myfunc'

First off, I am very new to multiprocessing, and I can't seem to make a very simple and straightforward example work. This is the example I am working with:
import multiprocessing
def worker():
    """worker function"""
    print 'Worker'
    return
if __name__ == '__main__':
    jobs = []
    for i in range(5):
        p = multiprocessing.Process(target=worker)
        jobs.append(p)
        p.start()
Every time I run the code, I get this error multiple times:
C:\Anaconda2\lib\site-packages\IPython\utils\traitlets.py:5: UserWarning: IPython.utils.traitlets has moved to a top-level traitlets package.
warn("IPython.utils.traitlets has moved to a top-level traitlets package.")
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Anaconda2\lib\multiprocessing\forking.py", line 381, in main
self = load(from_parent)
File "C:\Anaconda2\lib\pickle.py", line 1384, in load
return Unpickler(file).load()
File "C:\Anaconda2\lib\pickle.py", line 864, in load
dispatch[key](self)
File "C:\Anaconda2\lib\pickle.py", line 1096, in load_global
klass = self.find_class(module, name)
File "C:\Anaconda2\lib\pickle.py", line 1132, in find_class
klass = getattr(mod, name)
AttributeError: 'module' object has no attribute 'worker'
I know that this question is very vague, but if anyone could point me in the right direction I would appreciate it.
I am on Windows and run it in Anaconda with Python 2.7. The code is exactly the same as above, nothing more, nothing less. I run it directly in the console in the IDE.
EDIT: It looks like the code works just fine when I run it directly from the command prompt, but it won't work in the Anaconda console. Does anybody know why?
The interactive console in Anaconda's IDE doesn't work well with multiprocessing, as explained in this answer.
From the answer:
This is because multiprocessing does not work well in the interactive interpreter. The main reason is that there is no fork() function on Windows. It is explained on their web page itself.
Thank you!

python joblib Parallel on Windows not working even "if __name__ == '__main__':" is added

I'm running parallel processing in Python on Windows. Here's my code:
from math import sqrt
from joblib import Parallel, delayed
def f(x):
    return sqrt(x)
if __name__ == '__main__':
    a = Parallel(n_jobs=2)(delayed(f)(i) for i in range(10))
Here's the error message:
Process PoolWorker-2:
Process PoolWorker-1:
Traceback (most recent call last):
File "C:\Users\yoyo__000.BIGBLACK\AppData\Local\Enthought\Canopy\App\appdata\canopy-1.5.4.3105.win-x86_64\lib\multiprocessing\process.py", line 258, in _bootstrap
self.run()
File "C:\Users\yoyo__000.BIGBLACK\AppData\Local\Enthought\Canopy\App\appdata\canopy-1.5.4.3105.win-x86_64\lib\multiprocessing\process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "C:\Users\yoyo__000.BIGBLACK\AppData\Local\Enthought\Canopy\App\appdata\canopy-1.5.4.3105.win-x86_64\lib\multiprocessing\pool.py", line 102, in worker
task = get()
File "C:\Users\yoyo__000.BIGBLACK\AppData\Local\Enthought\Canopy\User\lib\site-packages\joblib\pool.py", line 363, in get
return recv()
AttributeError: 'module' object has no attribute 'f'
According to this site, the problem is Windows-specific:
Yes: under linux we are forking, thus there is no need to pickle the
function, and it works fine. Under windows, the function needs to be
pickleable, ie it needs to be imported from another file. This is
actually good practice: making modules pushes for reuse.
I've tried your code and it works flawlessly under Linux.
Under Windows it runs OK if it is run from a script, like python script_with_your_code.py. But it fails when run in an interactive Python session. It worked for me when I saved the f function in a separate module and imported it into my interactive session.
NOT WORKING:
Interactive session:
>>> from math import sqrt
>>> from joblib import Parallel, delayed
>>> def f(x):
...     return sqrt(x)
...
>>> if __name__ == '__main__':
...     a = Parallel(n_jobs=2)(delayed(f)(i) for i in range(10))
...
Process PoolWorker-1:
Traceback (most recent call last):
File "C:\Python27\lib\multiprocessing\process.py", line 258, in _bootstrap
self.run()
File "C:\Python27\lib\multiprocessing\process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "C:\Python27\lib\multiprocessing\pool.py", line 102, in worker
task = get()
File "C:\Python27\lib\site-packages\joblib\pool.py", line 359, in get
return recv()
AttributeError: 'module' object has no attribute 'f'
WORKING:
fun.py
from math import sqrt
def f(x):
    return sqrt(x)
Interactive session:
>>> from joblib import Parallel, delayed
>>> from fun import f
>>> if __name__ == '__main__':
...     a = Parallel(n_jobs=2)(delayed(f)(i) for i in range(10))
...
>>> a
[0.0, 1.0, 1.4142135623730951, 1.7320508075688772, 2.0, 2.23606797749979, 2.449489742783178, 2.6457513110645907, 2.8284271247461903, 3.0]
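The reason the separate module helps can be illustrated with pickle alone. This is a sketch of the general mechanism, not of joblib's internals: pickle stores a function as a reference to its module and name rather than as code, so the receiving process has to be able to import that module. The example uses math.sqrt as the importable function and a lambda as a stand-in for a function that cannot be found by name.

```python
import math
import pickle

# A function from an importable module is pickled by reference: the payload
# names the module and the function, it does not contain the function's code.
payload = pickle.dumps(math.sqrt)
print(payload)

# Unpickling imports the module again and looks the name up in it.
restored = pickle.loads(payload)
print(restored is math.sqrt)

# A function that cannot be looked up by name in a module cannot be pickled.
try:
    pickle.dumps(lambda x: math.sqrt(x))
except (pickle.PicklingError, AttributeError) as error:
    print("cannot pickle:", error)
```

A function typed into an interactive session is pickled as a reference to __main__, and a freshly started worker process on Windows has no such function in its own __main__, which matches the AttributeError in the traceback above.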
Joblib works on my Windows 10 machine with numpy 1.19.5. I upgraded all outdated packages; to do that, list them and then upgrade each one:
pip list --outdated
pip install --upgrade <package-name>
Alternatively, use pip_upgrade_outdated, which upgrades all outdated packages: run pip install pip-upgrade-outdated and then pip-upgrade-outdated, according to this site.

