Mysterious pickle error while profiling a multi-process Python script [duplicate] - python

This question already has an answer here:
cProfile causes pickling error when running multiprocessing Python code
(1 answer)
Closed 11 months ago.
I"m using the multiprocessing module, and I'm using UpdateMessage objects (my own class), sent through multiprocessing.Queue objects, to communicate between processes. Here's the class:
class UpdateMessage:
    def __init__(self, arrayref, rowslice, colslice, newval):
        self.arrayref = arrayref
        self.rowslice = rowslice
        self.colslice = colslice
        self.newval = newval

    def do_update(self):
        if self.arrayref == 'uL':
            arr = uL
        elif self.arrayref == 'uR':
            arr = uR
        else:
            raise Exception('UpdateMessage.arrayref neither uL nor uR')
        arr[self.rowslice, self.colslice] = self.newval
When I run the script, it works perfectly fine. However, when I run it with either cProfile or profile, it gives the following error:
_pickle.PicklingError: Can't pickle <class '__main__.UpdateMessage'>: attribute lookup __main__.UpdateMessage failed
It seems to be trying to pickle the class, but I can't see why this is happening. My code doesn't do this, and it works just fine without it, so it's probably the multiprocessing module. But why would it need to pickle UpdateMessage, and how can I fix the error?
EDIT: here's part of the code that's sending the UpdateMessage (multiple parts of the script do this, but all in the same way):
msg = UpdateMessage(uLref, refer[0] + marker[0] - 2,
                    slice(uL.shape[1]), ustar.copy())
queue.put(msg)
The traceback isn't very helpful:
Traceback (most recent call last):
File "/usr/lib/python3.2/multiprocessing/queues.py", line 272, in _feed
send(obj)

I do not know what your processes look like, but:
'__main__.UpdateMessage'
refers to UpdateMessage in the module that was launched as the script.
When another process is not started from the module that defines UpdateMessage, the class will not be available there.
You have to import UpdateMessage from the module that contains it, so that
UpdateMessage.__module__ is not '__main__'.
Then pickle can find UpdateMessage in the other processes as well.
I recommend making the main script (__main__.py) look like this:
import UpdateMessage_module
UpdateMessage_module.main()
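A hedged sketch of the idea: move UpdateMessage into its own module and import it from the main script (the module name update_message and the values passed are illustrative):

# update_message.py
class UpdateMessage:
    def __init__(self, arrayref, rowslice, colslice, newval):
        self.arrayref = arrayref
        self.rowslice = rowslice
        self.colslice = colslice
        self.newval = newval

# main script
import multiprocessing
from update_message import UpdateMessage  # __module__ is 'update_message', not '__main__'

if __name__ == '__main__':
    queue = multiprocessing.Queue()
    # pickled by reference as update_message.UpdateMessage, which worker processes can import
    queue.put(UpdateMessage('uL', 0, slice(None), 42))
    print(queue.get().arrayref)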

I had the same problem, and solved it by defining the class to be pickled in its own file.

I'm assuming the pickling attempt on your class is happening before your class is sent to another process. The easiest solution is probably just to implement the pickle protocol explicitly on your class...
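A hedged sketch of what that could look like, using __reduce__ (note the class still has to live in a module every process can import; this only makes the what-to-pickle part explicit):

class UpdateMessage:
    def __init__(self, arrayref, rowslice, colslice, newval):
        self.arrayref = arrayref
        self.rowslice = rowslice
        self.colslice = colslice
        self.newval = newval

    def __reduce__(self):
        # Recreate the instance from its constructor arguments on unpickling.
        return (UpdateMessage,
                (self.arrayref, self.rowslice, self.colslice, self.newval))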

Related

Error when trying to use pickle and multiprocessing

I'm trying to use multiprocessing to run my code in Python 3.7, but I ran into a problem. There is an error when I try to run my code:
Can't pickle local object 'mm_prepare_run.<locals>...
I understand it's a problem with pickle, but I didn't find a proper answer on how to resolve the issue.
My simple code is below. Could you advise how I can solve the problem?
import multiprocessing
import copy
from pathlib import Path

proc_mrg = multiprocessing.Manager()
num_cpu = 8  # number of CPU

def prepare_run(config):
    din['config'] = config
    din_temp = copy.deepcopy(din)
    dout_list.append(proc_mrg.dict({}))
    #process = multiprocessing.Process(target=Run_IDEAS_instance_get_trajectory,args=(din_temp, dout_list[-1]))
    process = multiprocessing.Process(target=Run_IDEAS_instance_get_trajectory(din_temp, dout_list[-1]))
    proc_list.append(process)

for job in proc_list:
    job.start()
When you create a Process in prepare_run, you are calling Run_IDEAS_instance_get_trajectory instead of just passing it as a reference.
And since that function does not return a result, the target of the Process is None.
Use this instead:
process = multiprocessing.Process(
    target=Run_IDEAS_instance_get_trajectory,
    args=(din_temp, dout_list[-1])
)
Functions in Python are first class objects of the callable type.
See the "Data model" chapter in the Python language reference.
Edit:
From your comment, I can see that you are running this code on MS Windows.
On this platform it is required that you run process creation inside an if __name__ == "__main__" block! Because of how multiprocessing works on this platform, Python has to be able to import your script without side effects such as starting a new process. See the "programming guidelines" section in the documentation for multiprocessing.
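A minimal sketch of the whole script restructured that way (run_instance is a stand-in for Run_IDEAS_instance_get_trajectory, and the dictionaries are simplified):

import multiprocessing

def run_instance(din_temp, dout):
    # stand-in for Run_IDEAS_instance_get_trajectory
    dout["result"] = din_temp["config"] * 2

def main():
    manager = multiprocessing.Manager()
    proc_list = []
    dout_list = []
    for config in range(4):
        din_temp = {"config": config}
        dout_list.append(manager.dict())
        proc_list.append(multiprocessing.Process(
            target=run_instance,             # pass the function itself...
            args=(din_temp, dout_list[-1]),  # ...and its arguments separately
        ))
    for job in proc_list:
        job.start()
    for job in proc_list:
        job.join()
    print([dict(d) for d in dout_list])

if __name__ == "__main__":  # required on Windows so the module can be imported without side effects
    main()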

Python how to print full stack, including magic methods (dunder methods) used?

I am trying to debug a Python built-in class. My debugging has brought me into the realm of magic methods (aka dunder methods).
I am trying to figure out which dunder methods are called, if any. Normally I would do something like this:
import sys
import traceback
# This would be located at the point I'm currently debugging
traceback.print_stack(file=sys.stdout)
However, traceback.print_stack does not give me the level of detail of printing which dunder methods are used in its vicinity.
Is there some way I can print out, in a very verbose manner, what is actually happening inside a block of code?
Sample Code
#!/usr/bin/env python3.6
import sys
import traceback
from enum import Enum

class TestEnum(Enum):
    """Test enum."""

    A = "A"

def main():
    for enum_member in TestEnum:
        traceback.print_stack(file=sys.stdout)
        print(f"enum member = {enum_member}.")

if __name__ == "__main__":
    main()
I would like the above sample code to print out any dunder methods used (ex: __iter__).
Currently it prints out the path to the call to traceback.print_stack:
/path/to/venv/bin/python /path/to/file.py
File "/path/to/file.py", line 56, in <module>
main()
File "/path/to/file.py", line 51, in main
traceback.print_stack(file=sys.stdout)
enum member = TestEnum.A.
P.S. I'm not interested in going to the byte code level given by dis.dis.
I think the stack trace has you looking in the wrong place. When you call print_stack from a spot that is only reached via a dunder method, that method is very much included in the output.
I tried this code to verify:
import sys
import traceback
from enum import Enum

class TestEnum(Enum):
    """Test enum."""

    A = "A"

class MyIter:
    def __init__(self):
        self.i = 0

    def __next__(self):
        self.i += 1
        if self.i <= 1:
            traceback.print_stack(file=sys.stdout)
            return TestEnum.A
        raise StopIteration

    def __iter__(self):
        return self

def main():
    for enum_member in MyIter():
        print(f"enum member = {enum_member}.")

if __name__ == "__main__":
    main()
The last line of the stack trace is printed as
File "/home/lydia/playground/demo.py", line 21, in __next__
traceback.print_stack(file=sys.stdout)
In your original code, you are getting the stack trace at a time when all dunder methods have already returned. Thus they have been removed from the stack.
So I think you want to look at a call graph instead. I know that IntelliJ / PyCharm can do this nicely, at least in the paid editions.
There are other tools that you may want to try. How does pycallgraph look to you?
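For completeness, a minimal pycallgraph sketch, assuming the package and Graphviz are installed (main is the function from the sample code above; the output file name is illustrative):

from pycallgraph import PyCallGraph
from pycallgraph.output import GraphvizOutput

# Writes a call-graph image that includes dunder calls such as __iter__/__next__.
with PyCallGraph(output=GraphvizOutput(output_file='callgraph.png')):
    main()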
Update:
Python makes it actually pretty easy to dump a plain list of all the function calls.
Basically all you need to do is
import sys
sys.setprofile(tracefunc)
Write the tracefunc depending on your needs. Find a working example at this SO question: How do I print functions as they are called
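A minimal sketch of such a tracefunc (the filtering is up to you; this one just prints every Python-level call, which includes dunder methods):

import sys

def tracefunc(frame, event, arg):
    # sys.setprofile reports 'call'/'return' for Python functions and
    # 'c_call'/'c_return'/'c_exception' for C functions.
    if event == "call":
        code = frame.f_code
        print(f"call: {code.co_name} ({code.co_filename}:{frame.f_lineno})")

sys.setprofile(tracefunc)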
Warning: I needed to start the script from an external shell. Starting it by using the play button in my IDE meant that the script would never terminate but write more and more lines. I assume it collides with the internal profiling done by my IDE.
The official documentation of sys.setprofile: https://docs.python.org/3/library/sys.html#sys.setprofile
And a random tutorial about tracing in Python: https://pymotw.com/2/sys/tracing.html
Note however, that by my experience you can get the best insights into the questions "who is calling whom?" or "where does this value even come from?" by using a plain-old debugger.
I also did some research on the subject matter, as information in @LydiaVanDyke's answer fueled better searches.
Printing Entire Call Stack
As @LydiaVanDyke points out, an IDE debugger is a really great way to do this. I use PyCharm, and found that was my favorite solution, because one can:
Follow function calls + exact line numbers in the code
Read code around the calls, better understanding typing
Skip over calls one doesn't care to investigate
Another way is the standard library's trace module. It offers both command-line and embeddable methods for printing the entire call stack.
And yet another one is Python's built-in debugger module, pdb. This (invoked via pdb.set_trace()) really changed the game for me.
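A minimal sketch of the trace option above, assuming the sample script's main function:

# Command line: python -m trace --trace file.py
# Embedded equivalent:
import trace

tracer = trace.Trace(count=False, trace=True)
tracer.runfunc(main)  # prints each executed line, including those inside dunder methods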
Visualization of Profiler Output
gprof2dot is another useful profiler visualization tool.
source code
useful tutorial
Finding Source Code
One of my other problems was not actually seeing the real source code, due to my IDE's stub files (PyCharm).
How to retrieve source code of Python functions details two methods of actually printing source code
With all this tooling, one feels quite empowered!

A timeout decorator class with multiprocessing gives a pickling error

So on Windows, the signal and thread approaches are in general bad ideas / don't work for timing out functions.
I've made the following timeout code, which throws a timeout exception from multiprocessing when the code takes too long. This is exactly what I want.
def timeout(timeout, func, *arg):
    with Pool(processes=1) as pool:
        result = pool.apply_async(func, (*arg,))
        return result.get(timeout=timeout)
I'm now trying to get this into a decorator style so that I can add it to a wide range of functions, especially those where external services are called and I have no control over the code or duration. My current attempt is below:
class TimeWrapper(object):
    def __init__(self, timeout=10):
        """Timing decorator"""
        self.timeout = timeout

    def __call__(self, f):
        def wrapped_f(*args):
            with Pool(processes=1) as pool:
                result = pool.apply_async(f, (*args,))
                return result.get(timeout=self.timeout)
        return wrapped_f
It gives a pickling error:
@TimeWrapper(7)
def func2(x, y):
    time.sleep(5)
    return x*y
File "C:\Users\rmenk\AppData\Local\Continuum\anaconda3\lib\multiprocessing\reduction.py", line 51, in dumps
cls(buf, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <function func2 at 0x000000770C8E4730>: it's not the same object as __main__.func2
I suspect this is due to multiprocessing and the decorator not playing nicely together, but I don't actually know how to make them play nicely. Any ideas on how to fix this?
PS: I've done some extensive research on this site and other places but haven't found any answers that work, be it with pebble, threading, as a function decorator or otherwise. If you have a solution that you know works on Windows and Python 3.5, I'd be very happy to just use that.
What you are trying to achieve is particularly cumbersome on Windows. The core issue is that when you decorate a function, you shadow it. This happens to work just fine on UNIX because it uses the fork strategy to create a new process.
On Windows, though, the new process will be a blank one where a brand new Python interpreter is started and loads your module. When the module gets loaded, the decorator hides the real function, making it hard for the pickle protocol to find.
The only way to get it right is to rely on a trampoline function set up during the decoration. You can take a look at how this is done in pebble but, unless you're doing it as an exercise, I'd recommend using pebble directly, as it already offers what you are looking for.
from pebble import concurrent

@concurrent.process(timeout=60)
def my_function(var, keyvar=0):
    return var + keyvar

future = my_function(1, keyvar=2)
future.result()
The only problem you have here is that you tested the decorated function in the main context. Move it out to a different module and it will probably work.
I wrote the wrapt_timeout_decorator, which uses wrapt & dill & multiprocess & pipes instead of pickle & multiprocessing & queue, because it can serialize more datatypes.
It might look simple at first, but under Windows a reliable timeout decorator is quite tricky. You might use mine; it's quite mature and tested:
https://github.com/bitranox/wrapt_timeout_decorator
On Windows the main module is imported again (but with __name__ != '__main__') because Python is trying to simulate forking-like behavior on a system that doesn't support forking. multiprocessing tries to create an environment similar to your main process by importing the main module again under a different name. That's why you need to shield the entry point of your program with the famous if __name__ == '__main__':
import lib_foo

def some_module():
    lib_foo.function_foo()

def main():
    some_module()

# here the subprocess stops loading, because __name__ is NOT '__main__'
if __name__ == '__main__':
    main()
This is a Windows problem, because the Windows operating system does not support "fork".
You can find more information on that here:
Workaround for using __name__=='__main__' in Python multiprocessing
https://docs.python.org/2/library/multiprocessing.html#windows
Since main.py is loaded again under a name other than "__main__", the decorated function now points to objects that no longer exist, so you need to put the decorated classes and functions into another module. In general (especially on Windows), the main script should contain nothing but the main() function; the real work should happen in the modules. I am also used to putting all settings and configuration in a separate file, so all processes and threads can access them (and to keep them together in one place, not to mention type hints and name completion in your favorite editor).
The "dill" serializer is able to serialize the main context as well; that means the objects in our example are pickled as "__main__.lib_foo", "__main__.some_module", "__main__.main", etc. We would not have this limitation when using "pickle", with the downside that "pickle" cannot serialize the following types:
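A hedged sketch of that layout (file names are illustrative; the timeout import shown here is an assumption based on the library's documented usage):

# my_functions.py
from wrapt_timeout_decorator import timeout

@timeout(5)
def mytest(message):
    print(message)

# main.py
from my_functions import mytest

def main():
    mytest('hello')

if __name__ == '__main__':
    main()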
functions with yields, nested functions, lambdas, cell, method, unboundmethod, module, code, methodwrapper, dictproxy, methoddescriptor, getsetdescriptor, memberdescriptor, wrapperdescriptor, xrange, slice, notimplemented, ellipsis, quit
Additionally, dill supports:
saving and loading Python interpreter sessions, saving and extracting source code from functions and classes, and interactively diagnosing pickling errors
To support more types with the decorator, we selected dill as the serializer, with the small downside that methods and classes cannot be decorated in the main context but need to reside in a module.
You can find more information on that here: Serializing an object in __main__ with pickle or dill

IPC with a Python subprocess

I'm trying to do some simple IPC in Python as follows: One Python process launches another with subprocess. The child process sends some data into a pipe and the parent process receives it.
Here's my current implementation:
# parent.py
import pickle
import os
import subprocess
import sys
read_fd, write_fd = os.pipe()
if hasattr(os, 'set_inheritable'):
    os.set_inheritable(write_fd, True)
child = subprocess.Popen((sys.executable, 'child.py', str(write_fd)), close_fds=False)
try:
    with os.fdopen(read_fd, 'rb') as reader:
        data = pickle.load(reader)
finally:
    child.wait()
assert data == 'This is the data.'
# child.py
import pickle
import os
import sys
with os.fdopen(int(sys.argv[1]), 'wb') as writer:
    pickle.dump('This is the data.', writer)
On Unix this works as expected, but if I run this code on Windows, I get the following error, after which the program hangs until interrupted:
Traceback (most recent call last):
File "child.py", line 4, in <module>
with os.fdopen(int(sys.argv[1]), 'wb') as writer:
File "C:\Python34\lib\os.py", line 978, in fdopen
return io.open(fd, *args, **kwargs)
OSError: [Errno 9] Bad file descriptor
I suspect the problem is that the child process isn't inheriting the write_fd file descriptor. How can I fix this?
The code needs to be compatible with Python 2.7, 3.2, and all subsequent versions. This means that the solution can't depend on either the presence or the absence of the changes to file descriptor inheritance specified in PEP 446. As implied above, it also needs to run on both Unix and Windows.
(To answer a couple of obvious questions: The reason I'm not using multiprocessing is because, in my real-life non-simplified code, the two Python programs are part of Django projects with different settings modules. This means they can't share any global state. Also, the child process's standard streams are being used for other purposes and are not available for this.)
UPDATE: After setting the close_fds parameter, the code now works in all versions of Python on Unix. However, it still fails on Windows.
subprocess.PIPE is implemented for all platforms. Why don't you just use this?
If you want to manually create and use an os.pipe(), you need to take care of the fact that Windows does not support fork(). It rather uses CreateProcess(), which by default does not make the child inherit open files. But there is a way: each single file descriptor can be made explicitly inheritable. This requires calling the Win32 API. I have implemented this in gipc; see the _pre/post_createprocess_windows() methods here.
As #Jan-Philip Gehrcke suggested, you could use subprocess.PIPE instead of os.pipe():
#!/usr/bin/env python
# parent.py
import sys
from subprocess import check_output
data = check_output([sys.executable or 'python', 'child.py'])
assert data.decode().strip() == 'This is the data.'
check_output() uses stdout=subprocess.PIPE internally.
You could use obj = pickle.loads(data) if child.py uses data = pickle.dumps(obj).
And the child.py could be simplified:
#!/usr/bin/env python
# child.py
print('This is the data.')
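A minimal sketch of the pickled variant mentioned above (Python 3 shown for brevity; the child writes raw bytes to its stdout buffer):

# parent.py
import pickle
import sys
from subprocess import check_output

obj = pickle.loads(check_output([sys.executable, 'child.py']))
assert obj == 'This is the data.'

# child.py
import pickle
import sys

sys.stdout.buffer.write(pickle.dumps('This is the data.'))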
If the child process is written in Python then, for greater flexibility, you could import the child script as a module and call its functions instead of using subprocess. You could use the multiprocessing or concurrent.futures modules if you need to run some Python code in a different process.
If you can't use the standard streams then your Django applications could use sockets to talk to one another.
The reason I'm not using multiprocessing is because, in my real-life non-simplified code, the two Python programs are part of Django projects with different settings modules. This means they can't share any global state.
This seems bogus. Under the hood, multiprocessing may also use the subprocess module. If you don't want to share global state, then don't share it; that is the default for multiple processes. You should probably ask a more specific question about your particular case and how to organize the communication between the various parts of your project.

Really weird issue with shelve (python)

I create a file called foo_module.py containing the following code:
import shelve, whichdb, os
from foo_package.g import g
g.shelf = shelve.open("foo_path")
g.shelf.close()
print whichdb.whichdb("foo_path") # => dbhash
os.remove("foo_path")
Next to that file I create a directory called foo_package that contains an empty __init__.py file and a file called g.py that just contains:
class g:
    pass
Now when I run foo_module.py I get a weird error message:
Exception TypeError: "'NoneType' object is not callable" in ignored
But then, if I rename the directory from foo_package to foo, and change the import line in foo_module.py, I don't get any error. Wtf is going on here?
Running Python 2.6.4 on WinXP.
I think you've hit a minor bug in 2.6.4's code related to the cleanup at end of program. If you run python -v you can see exactly at what point of the cleanup the error comes:
# cleanup[1] foo_package.g
Exception TypeError: "'NoneType' object is not callable" in ignored
Python sets references to None during the cleanup at the end of program, and it looks like it's getting confused about the status of g.shelf. As a workaround you could set g.shelf = None after the close. I would also recommend opening a bug in Python's bug tracker!
After days of hair loss, I finally had success using an atexit function:
import atexit
...
cache = shelve.open(path)
atexit.register(cache.close)
It's most appropriate to register right after opening. This works with multiple concurrent shelves.
(python 2.6.5 on lucid)
This is indeed a Python bug, and I've posted a patch to the tracker issue you opened (thanks for doing that).
The problem is that shelve's __del__ method calls its close method, but if the shelve module has already been through cleanup, the close method fails with the message you see.
You can avoid the message in your code by adding del g.shelf after g.shelf.close(). As long as g.shelf is the only reference to the shelf, this will result in CPython calling the shelve's __del__ method right away, before the interpreter cleanup phase, and thus avoid the error message.
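In the question's terms, the workaround looks like this:

g.shelf = shelve.open("foo_path")
g.shelf.close()
del g.shelf  # drop the last reference so shelve's __del__ runs before interpreter cleanup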
It seems to be an exception in a shutdown function registered by the shelve module. The "ignored" part is from the shutdown system, and might get its wording improved sometime, per Issue 6294. I'm still hoping for an answer on how to eliminate the exception itself, though...
For me, a simple close() on the unclosed shelf did the job.
shelve.open('somefile') returns a "persistent dictionary for reading and writing" object which I used throughout the app's runtime.
When I terminated the app I received the "TypeError" exception as mentioned.
I placed a close() call in my termination sequence and that seemed to fix the problem.
e.g.
shelveObj = shelve.open('fileName')
...
shelveObj.close()
OverByThere commented on Jul 17, 2018
This seems to be fixable.
In short open /usr/lib/python3.5/weakref.py and change line 109 to:
def remove(wr, selfref=ref(self), _atomic_removal=_remove_dead_weakref):
And line 117 to:
_atomic_removal(d, wr.key)
Note you need to do this with spaces, not tabs, as tabs will cause other errors.
