cProfiler working weirdly with multiprocessing - python

I got an error for this code:
from pathos.multiprocessing import ProcessingPool
def diePlz(im):
    print('Whoopdepoop!')

def caller():
    im = 1
    pool = ProcessingPool()
    pool.map(diePlz,[im,im,im,im])

if __name__=='__main__':
    caller()
when I ran it under cProfile (python3 -m cProfile testProfiler.py):
multiprocess.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/home/rohit/.local/lib/python3.6/site-packages/multiprocess/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/home/rohit/.local/lib/python3.6/site-packages/multiprocess/pool.py", line 44, in mapstar
return list(map(*args))
File "/home/rohit/.local/lib/python3.6/site-packages/pathos/helpers/mp_helper.py", line 15, in <lambda>
func = lambda args: f(*args)
File "testProfiler.py", line 3, in diePlz
print('Whoopdepoop!')
NameError: name 'print' is not defined
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/usr/lib/python3.6/cProfile.py", line 160, in <module>
main()
File "/usr/lib/python3.6/cProfile.py", line 153, in main
runctx(code, globs, None, options.outfile, options.sort)
File "/usr/lib/python3.6/cProfile.py", line 20, in runctx
filename, sort)
File "/usr/lib/python3.6/profile.py", line 64, in runctx
prof.runctx(statement, globals, locals)
File "/usr/lib/python3.6/cProfile.py", line 100, in runctx
exec(cmd, globals, locals)
File "testProfiler.py", line 11, in <module>
caller()
File "testProfiler.py", line 8, in caller
pool.map(diePlz,[im,im,im,im])
File "/home/rohit/.local/lib/python3.6/site-packages/pathos/multiprocessing.py", line 137, in map
return _pool.map(star(f), zip(*args)) # chunksize
File "/home/rohit/.local/lib/python3.6/site-packages/multiprocess/pool.py", line 266, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/home/rohit/.local/lib/python3.6/site-packages/multiprocess/pool.py", line 644, in get
raise self._value
NameError: name 'print' is not defined
But when I ran it without the cProfiler:
$ python3 testProfiler.py
Whoopdepoop!
Whoopdepoop!
Whoopdepoop!
Whoopdepoop!
The code above is a minimal working example of the problem. There is a much larger program that I want to debug, but I can't because cProfile keeps raising strange errors.
The important part here is
NameError: name 'print' is not defined
which means Python 3 cannot resolve the built-in print itself. In my real code, it could not resolve range.

I realize this is long after the original post, but I have this exact same issue.
In my case I was getting the same error as the original post: Python built-in functions such as print() or len() resulted in errors like this:
NameError: name 'len' is not defined
I'm currently running multiprocess version 0.70.11.1 and dill version 0.3.3 (the components of pathos that make process-based parallelism work).
In an issue comment (https://github.com/uqfoundation/pathos/issues/129#issuecomment-536081859), one of the package authors recommends trying:
import dill
dill.settings['recurse'] = True
At least in my case, the above fixed the error!
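For profiling the larger program, one option that does not go through the python3 -m cProfile launcher is to drive cProfile from inside the script. The following is a minimal sketch using only the standard library. The names profile_call and busy_work are hypothetical, introduced here for illustration, and I have not verified that this route avoids the NameError with pathos.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, report_text)."""
    profiler = cProfile.Profile()
    # runcall profiles a single call and returns that call's result.
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Show the ten entries with the highest cumulative time.
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()


def busy_work(n):
    # Hypothetical stand-in for the real workload.
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    result, report = profile_call(busy_work, 100000)
    print(result)
    print(report)
```

Keep in mind that cProfile only records the process it runs in, so to see time spent in pool workers a helper like this would have to be called inside the worker function.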

Related

I am getting an error while using "zappa init"

I am using Python 3.8 and zappa 0.51.0. I have installed zappa in a virtual environment and also created an AWS account, but when I run the command "zappa init" it shows the error below:
(.env) D:\rough work\crud>zappa init
Traceback (most recent call last):
File "c:\users\dwipal shrirao\appdata\local\programs\python\python38\Lib\runpy.py", line 192, in _run_module_as_main
return _run_code(code, main_globals, None,
File "c:\users\dwipal shrirao\appdata\local\programs\python\python38\Lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "D:\rough work\crud\.env\Scripts\zappa.exe\__main__.py", line 4, in <module>
File "d:\rough work\crud\.env\lib\site-packages\zappa\cli.py", line 44, in <module>
from .core import Zappa, logger, API_GATEWAY_REGIONS
File "d:\rough work\crud\.env\lib\site-packages\zappa\core.py", line 33, in <module>
import troposphere
File "d:\rough work\crud\.env\lib\site-packages\troposphere\__init__.py", line 586, in <module>
class Template(object):
File "d:\rough work\crud\.env\lib\site-packages\troposphere\__init__.py", line 588, in Template
'AWSTemplateFormatVersion': (basestring, False),
NameError: name 'basestring' is not defined
What is happening, and how can I get rid of this error?
The builtin basestring abstract type was removed. Use str instead. The str and bytes types don’t have functionality enough in common to warrant a shared base class. The 2to3 tool (see below) replaces every occurrence of basestring with str.
Since you are on Python 3.8, use str instead.
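As an illustration of the replacement, here is a small sketch of a compatibility shim for code you control. The names string_types and is_text are hypothetical and are not part of zappa or troposphere. For third-party code, it is usually better to look for a version of the package that supports Python 3 than to edit files under site-packages.

```python
# On Python 2 the name basestring exists; on Python 3 looking it up
# raises NameError, so we fall back to str.
try:
    string_types = basestring  # Python 2
except NameError:
    string_types = str  # Python 3


def is_text(value):
    """Return True if value is a text string on the running interpreter."""
    return isinstance(value, string_types)


if __name__ == "__main__":
    print(is_text("AWSTemplateFormatVersion"))
    print(is_text(42))
```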

flake8 internal error in regular expression engine

I'm trying to run flake8 on a Dockerized Django project built as described here (tutorial page).
When building the Docker image, I get an error from flake8, which is run from a docker-compose file like so:
$ flake8 --ignore=E501,F401 .
multiprocessing.pool.RemoteTraceback:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **
File "/usr/local/lib/python3.8/multiprocessing/pool.py", line 48, in
return list(map(*
File "/usr/local/lib/python3.8/site-packages/flake8/checker.py", line 666, in
return checker.run_checks()
File "/usr/local/lib/python3.8/site-packages/flake8/checker.py", line 598, in run_checks
self.run_ast_checks()
File "/usr/local/lib/python3.8/site-packages/flake8/checker.py", line 495, in run_ast_checks
checker = self.run_check(plugin, tree=ast)
File "/usr/local/lib/python3.8/site-packages/flake8/checker.py", line 426, in run_check
self.processor.keyword_arguments_for(
File "/usr/local/lib/python3.8/site-packages/flake8/processor.py", line 241, in keyword_arguments_for
arguments[param] = getattr(self, param)
File "/usr/local/lib/python3.8/site-packages/flake8/processor.py", line 119, in file_tokens
self._file_tokens = list(
File "/usr/local/lib/python3.8/tokenize.py", line 525, in _tokenize
pseudomatch = _compile(PseudoToken).match(line, pos)
RuntimeError: internal error in regular expression engine
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/bin/flake8", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.8/site-packages/flake8/main/cli.py", line 18, in main
app.run(argv)
File "/usr/local/lib/python3.8/site-packages/flake8/main/application.py", line 393, in run
self._run(argv)
File "/usr/local/lib/python3.8/site-packages/flake8/main/application.py", line 381, in _run
self.run_checks()
File "/usr/local/lib/python3.8/site-packages/flake8/main/application.py", line 300, in run_checks
self.file_checker_manager.run()
File "/usr/local/lib/python3.8/site-packages/flake8/checker.py", line 329, in run
self.run_parallel()
File "/usr/local/lib/python3.8/site-packages/flake8/checker.py", line 293, in run_parallel
for ret in pool_map:
File "/usr/local/lib/python3.8/multiprocessing/pool.py", line 448, in <genexpr>
return (item for chunk in result for item in chunk)
File "/usr/local/lib/python3.8/multiprocessing/pool.py", line 865, in next
raise value
RuntimeError: internal error in regular expression engine
When I run flake8 with the --verbose flag, I get an error like this:
Fatal Python error: deletion of interned string failed
Python runtime state: initialized
KeyError: 'FILENAME_RE'
from tokenize.py.
Does anyone know how to solve this?
Additional Data:
Running docker-compose v1.25.4 on a Raspberry Pi 3 with Buster Lite.
Python 3.8.2 was compiled and installed from source with the flag --enableloadable-sqlite
Thanks for helping!
If you don't care for flake8, then just delete that line. It will work. I had the same problem.
I know this isn't an answer. I would add a comment instead if I could.
I encountered a very similar error profile after a refactoring in which a lot of data that had formerly been at the top of the project ended up in what became the package folder. There were 33621 files, including .venv, .nox, .pytest_cache, and .coverage.
File "/usr/lib/python3.8/multiprocessing/pool.py", line 448, in <genexpr>
return (item for chunk in result for item in chunk)
File "/usr/lib/python3.8/multiprocessing/pool.py", line 868, in next
raise value
IndexError: string index out of range
If you see a similar signature (the parallel processing engine being overwhelmed in some way and throwing an exception), you may want to review your project directory structure and make sure that you did not put this kind of data in your sources directories.
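To review the directory structure, a quick file count per top-level directory can show where stray data has ended up. This is a minimal sketch using only the standard library. The function name count_files_by_top_dir is hypothetical and has nothing to do with flake8 itself.

```python
import os
from collections import Counter


def count_files_by_top_dir(root):
    """Count files under each immediate child of root, recursively.

    Files that sit directly in root are counted under the key ".".
    """
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        # The first path component identifies the top-level directory.
        top = "." if rel == "." else rel.split(os.sep)[0]
        counts[top] += len(filenames)
    return counts
```

Calling count_files_by_top_dir(".").most_common(10) from the sources directory lists the ten largest entries. If something like .venv shows up there, moving it out or excluding it (for example with flake8's --exclude option) is worth trying.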

Pathos (Python module) behaves differently in IDE and shell

I am trying to understand how to use the Pathos package to run a function that calls a function. It was my understanding that an advantage of Pathos over the main multiprocessing package was that it allowed functions inside functions. However, I can't seem to make it work. Here is the simplest example I could come up with:
def testf(x):
    print(x)

import dill
import pathos
from pathos.multiprocessing import ProcessingPool
pool = ProcessingPool(nodes=3)
out2 = pool.map(testf, [1,2,3,4,5,6,7,8,9])
pool.close()
Output:
multiprocess.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/home/james/.local/lib/python3.6/site-packages/multiprocess/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/home/james/.local/lib/python3.6/site-packages/multiprocess/pool.py", line 44, in mapstar
return list(map(*args))
File "/home/james/.local/lib/python3.6/site-packages/pathos/helpers/mp_helper.py", line 14, in <lambda>
func = lambda args: f(*args)
File "<input>", line 2, in testf
NameError: name 'print' is not defined
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<input>", line 5, in <module>
File "/home/james/.local/lib/python3.6/site-packages/pathos/multiprocessing.py", line 136, in map
return _pool.map(star(f), zip(*args)) # chunksize
File "/home/james/.local/lib/python3.6/site-packages/multiprocess/pool.py", line 260, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/home/james/.local/lib/python3.6/site-packages/multiprocess/pool.py", line 608, in get
raise self._value
NameError: name 'print' is not defined
Edit: It seems that if I paste this code into a shell console, it works fine. No dice if I run it from my IDE of choice, PyCharm. So now my question is why the same code would behave differently in the same version of the Python interpreter (3.6.1) depending on whether it is run from a shell or the in-app console.

Multiprocessing in Python 3.4 does not work with imported module

I am using multiprocessing in Python 3.4.3 to speed up my code. I have a problem with getting back my results. I have tried the following simple code, which works just fine.
import numpy
from multiprocessing import Pool
from functools import partial
from OpenDutchWordnet import Wn_grid_parser, le, les, synset, relation

def funct(arg1, value):
    return arg1 * value

if __name__ == '__main__':
    #------FOR TESTING-------
    t=[1,2,3,4]
    arg1=4
    pool=Pool(processes=1)
    func=partial(funct, arg1)
    print("func: ", func)
    m4=pool.map(func,t)
    print(m4)
    #------/FOR TESTING-------
Of course, I would like to run more than one process. The code I would actually like to run is the following:
import numpy
from multiprocessing import Pool
from functools import partial
from OpenDutchWordnet import Wn_grid_parser, le, les, synset, relation

def funct2(arg1, value):
    return arg1.get_relations(value)

if __name__ == '__main__':
    myparser= Wn_grid_parser(Wn_grid_parser.odwn)
    l_sensesofwoord = myparser.lemma_get_generator("man")
    sense=l_sensesofwoord[0]
    synsetid_sense=sense.get_synset_id()
    t=["has_hyperonym", "has_holonym"]
    arg1=myparser.synsets_find_synset(synsetid_sense)
    f=partial(funct2, arg1)
    print("f is: ", f)
    m1=pool.map(f,t)
When running this code, I get the following error message:
f is: functools.partial(<function funct2 at 0x00000000046D5378>, <synset.Synset object at 0x000000005011DDA0>)
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "C:\Python34\lib\multiprocessing\pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "C:\Python34\lib\multiprocessing\pool.py", line 44, in mapstar
return list(map(*args))
File "C:\Users\UTRSB\AppData\Local\Continuum\Anaconda3\mycode\multi.py", line 14, in funct2
return numpy.asarray(arg1.get_relations(value))
File "C:\Python34\lib\site-packages\OpenDutchWordnet\synset.py", line 98, in get_relations
for relation_el in self.synset_el.iterfind(xml_query)]
File "C:\Python34\lib\site-packages\OpenDutchWordnet\synset.py", line 97, in <listcomp>
return [Relation(relation_el)
File "C:\Python34\lib\site-packages\lxml\_elementpath.py", line 156, in select
for elem in result:
File "C:\Python34\lib\site-packages\lxml\_elementpath.py", line 88, in select
for elem in result:
File "C:\Python34\lib\site-packages\lxml\_elementpath.py", line 89, in select
for e in elem.iterchildren(tag):
File "lxml.etree.pyx", line 1363, in lxml.etree._Element.iterchildren (src\lxml\lxml.etree.c:50501)
File "lxml.etree.pyx", line 2730, in lxml.etree.ElementChildIterator.__cinit__ (src\lxml\lxml.etree.c:66739)
File "apihelpers.pxi", line 24, in lxml.etree._assertValidNode (src\lxml\lxml.etree.c:14133)
AssertionError: invalid Element proxy at 53353160
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:/Users/UTRSB/AppData/Local/Continuum/Anaconda3/mycode/multi.py", line 52, in <module>
m1=pool.map(f,t)
File "C:\Python34\lib\multiprocessing\pool.py", line 260, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "C:\Python34\lib\multiprocessing\pool.py", line 599, in get
raise self._value
AssertionError: invalid Element proxy at 53353160
I have also tried another way: result=pool.apply_async(geefAlleGloss,[p])
This call itself works fine, but when I use get() to obtain the results (answer=result.get()), I end up with the same error.
I think the error is somewhere in the map function. At first, I thought it had something to do with the modules I import from OpenDutchWordnet. But since the partial function works, the error should be caused by the get() and/or map() function.
I would appreciate any help.

Check if a variable exists

I am using the following function, isset(var), to determine whether a variable exists.
def isset(variable):
    try:
        variable
    except NameError:
        return False
    else:
        return True
It returns True if the variable exists. But if the variable doesn't exist, I get the following:
Traceback (most recent call last):
File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/home/lenovo/pyth/master/vzero/__main__.py", line 26, in <module>
ss.run()
File "vzero/ss.py", line 4, in run
snap()
File "vzero/ss.py", line 7, in snap
core.display()
File "vzero/core.py", line 77, in display
stdout(session(username()))
File "vzero/core.py", line 95, in session
if isset(ghi): #current_sessions[user]):
NameError: global name 'ghi' is not defined
I don't want all these errors. I just want it to return False, with no output. How can I do this?
Instead of writing a complex helper function isset and calling it
if not isset('variable_name'):
    # handle the situation
do this at the place where you want to check for the variable:
try:
    # some code with the variable in question
except NameError:
    # handle the situation
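The traceback in the question appears because isset(ghi) evaluates ghi at the call site, before isset ever runs, so the try block inside the helper never sees the NameError. Here is a minimal sketch of two ways around that. Both lookup_or_default and this name-based version of isset are hypothetical helpers written for illustration, and ghi is deliberately left undefined.

```python
def lookup_or_default(default=None):
    """Return the global ghi if it exists, otherwise default."""
    try:
        # ghi is looked up right here, inside the try block,
        # so a missing name is caught below.
        return ghi
    except NameError:
        return default


def isset(name, namespace):
    """Check for a variable by name (a string) in a namespace dict."""
    return name in namespace


if __name__ == "__main__":
    print(lookup_or_default("no session"))
    print(isset("ghi", globals()))
```

The name-based variant takes the variable name as a string plus a namespace such as globals() or locals(), which is why nothing is evaluated before the check happens.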
