Good day to you,
Today I was moving code from threading to multiprocessing. Everything seemed okay, until I got the following error:
Error
Traceback (most recent call last):
  File "run.py", line 93, in <module>
    main()
  File "run.py", line 82, in main
    emenu.executemenu(components, _path)
  File "/home/s1810979/paellego/lib/execute/execute_menu.py", line 29, in executemenu
    e.executeall(installed, _path)
  File "/home/s1810979/paellego/lib/execute/execute.py", line 153, in executeall
    pool.starmap(phase2, args)
  File "/usr/lib64/python3.4/multiprocessing/pool.py", line 268, in starmap
    return self._map_async(func, iterable, starmapstar, chunksize).get()
  File "/usr/lib64/python3.4/multiprocessing/pool.py", line 608, in get
    raise self._value
  File "/usr/lib64/python3.4/multiprocessing/pool.py", line 385, in _handle_tasks
    put(task)
  File "/usr/lib64/python3.4/multiprocessing/connection.py", line 206, in send
    self._send_bytes(ForkingPickler.dumps(obj))
  File "/usr/lib64/python3.4/multiprocessing/reduction.py", line 50, in dumps
    cls(buf, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <class 'module'>: attribute lookup module on builtins failed
Code
execute.py
def executeall(components, _path):
    args = []
    manager = multiprocessing.Manager()
    q = manager.Queue()
    resultloc = '/some/result.log'
    for component in components:
        for apkpath, resultpath in zip(execonfig.apkpaths, execonfig.resultpaths):
            args.append((component, apkpath, resultpath, q,))  # Args for subprocesses
    cores = askcores()
    with multiprocessing.Pool(processes=cores) as pool:
        watcher = pool.apply_async(lgr.log, (resultloc+'/results.txt', q,))
        pool.starmap(phase2, args)
component.py
class Component(object):
    def __init__(self, installmodule, runmodule, installerloc, installationloc, dependencyloc):
        self.installmodule = installmodule
        self.runmodule = runmodule
        self.installerloc = installerloc
        self.installationloc = installationloc
        self.dependencyloc = dependencyloc
        self.config = icnf.Installconfiguration(installerloc+'/conf.conf')
    # lots of functions...
installconfig.py
from enum import Enum

class State(Enum):
    BEGIN = 0    # Look for units
    UNIT = 1     # Look for unit keypairs
    KEYPAIR = 3

class Phase(Enum):
    NONE = 0
    DEPS = 1
    PKGS = 2

class Installconfiguration(object):
    def __init__(self, config):
        dictionary = self.reader(config)  # Fill a dictionary
        # dictionary (key: Phase, value: (dictionary key: str, job))
        self.deps = dictionary[Phase.DEPS]
        self.pkgs = dictionary[Phase.PKGS]
job.py
class Job(object):
    def __init__(self, directory=None, url=None):
        self.directory = directory if directory else ''
        self.url = url if url else ''
As you can see, I pass a Component as an argument to the function phase2(component, str, str, multiprocessing.Manager().Queue()).
The second and third arguments of the Component constructor are modules imported with importlib.
What I tried
I am new to Python, but not to programming. Here is what I tried:
Because the error itself did not point out exactly what the problem was, I tried removing arguments one by one to find out which one cannot be pickled: remove component, and everything works fine, so it appears to be the cause of the trouble. However, I need this object passed to my processes.
I searched around the internet for hours, but found nothing beyond basic tutorials about multiprocessing and explanations of how pickle works. I did find this post saying it should work, except on Windows. However, it does not work on Unix (which I use) either.
My ideas
As I understand it, nothing suggests I cannot send a class containing two importlib modules. I do not know what exactly the problem is with the Component class, but the importlib modules as members are the only unusual thing about it. This is why I believe the problem occurs there.
Question
Do you know why a class containing modules is unsuitable for 'pickling'? How can one get a better idea of why and where Can't pickle <class 'module'> errors occur?
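To narrow down where such an error comes from, one trick (my addition, not part of the original question) is to pickle each attribute of the suspect object separately and see which ones fail:

import pickle

def find_unpicklable(obj):
    # Try to pickle every attribute individually and report the failures.
    for name, value in vars(obj).items():
        try:
            pickle.dumps(value)
        except Exception as e:
            print('cannot pickle attribute %r: %s' % (name, e))

# e.g. find_unpicklable(component) would point at installmodule/runmodule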
More code
Full source code for this can be found on https://github.com/Sebastiaan-Alvarez-Rodriguez/paellego
Questions to me
Please leave comments requesting clarifications/more code snippets/??? if you would like me to edit this question
A last request
I would like solutions to use the Python standard library only, preferably Python 3.3. Also, a requirement of my code is that it runs on Unix systems.
Thanks in advance
Edit
As requested, here is a minimal example which greatly simplifies the problem:
main.py (you could execute as python main.py foo)
#!/usr/bin/env python
import sys
import importlib
import multiprocessing

class clazz(object):
    def __init__(self, moduly):
        self.moduly = moduly

    def foopass(self, stringy):
        self.moduly.foo(stringy)

    def barpass(self, stringy, numbery):
        self.moduly.bar(stringy)
        print('Second argument: '+str(numbery))

def worker(clazzy, numbery):
    clazzy.barpass('wow', numbery)

def main():
    clazzy = clazz(importlib.import_module(sys.argv[1]))
    clazzy.foopass('init')
    args = [(clazzy, 2,)]
    with multiprocessing.Pool(processes=2) as pool:
        pool.starmap(worker, args)

if __name__ == "__main__":
    main()
foo.py (needs to be in the same directory for the above call suggestion):
#!/usr/bin/env python
globaly = 0

def foo(stringy):
    print('foo '+stringy)
    global globaly
    globaly = 5

def bar(stringy):
    print('bar '+stringy)
    print(str(globaly))
Running this gives the error: TypeError: can't pickle module objects
Now we know that pickling module objects is (sadly) not possible.
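You can reproduce this without multiprocessing at all; pickling any module directly fails the same way (a quick check I added; the exact exception type varies between pickle implementations):

import pickle
import sys

try:
    pickle.dumps(sys)
except Exception as e:
    print(type(e).__name__, e)  # modules cannot be pickled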
In order to get rid of the error, let clazz not take a module as an attribute, however convenient that is, but let it take modpathy, the string importlib needs in order to import the module specified by the user.
It looks like this (foo.py remains exactly the same as above):
#!/usr/bin/env python
import sys
import importlib
import multiprocessing

class clazz(object):
    def __init__(self, modpathy):
        self.modpathy = modpathy

    def foopass(self, stringy):
        moduly = importlib.import_module(self.modpathy)
        moduly.foo(stringy)

    def barpass(self, stringy, numbery):
        moduly = importlib.import_module(self.modpathy)
        moduly.bar(stringy)
        print('Second argument: '+str(numbery))

def worker(clazzy, number):
    clazzy.barpass('wow', number)

def main():
    clazzy = clazz(sys.argv[1])
    clazzy.foopass('init')
    args = [(clazzy, 2,)]
    with multiprocessing.Pool(processes=2) as pool:
        pool.starmap(worker, args)

if __name__ == "__main__":
    main()
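A note on the cost of this approach (my addition): importlib.import_module only does the full import once per process; subsequent calls return the cached module from sys.modules, so calling it in every method is essentially a dictionary lookup:

import importlib

m1 = importlib.import_module('foo')
m2 = importlib.import_module('foo')
assert m1 is m2  # same object: the import is cached in sys.modules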
If you require that your globals, such as globaly, are guaranteed to maintain state, then you need to pass a mutable object (e.g. a list or dictionary) to hold this data, thanks @DavisHerring:
Module attributes are called “global variables” in Python, but they are no more persistent or accessible than any other data. Why not just use dictionaries?
The example code would look like this:
#!/usr/bin/env python
import sys
import importlib
import multiprocessing

class clazz(object):
    def __init__(self, modpathy):
        self.modpathy = modpathy
        self.dictionary = {}

    def foopass(self, stringy):
        moduly = importlib.import_module(self.modpathy)
        moduly.foo(stringy, self.dictionary)

    def barpass(self, stringy, numbery):
        moduly = importlib.import_module(self.modpathy)
        moduly.bar(stringy, self.dictionary)
        print('Second argument: '+str(numbery))

def worker(clazzy, number):
    clazzy.barpass('wow', number)

def main():
    clazzy = clazz(sys.argv[1])
    clazzy.foopass('init')
    args = [(clazzy, 2,)]
    with multiprocessing.Pool(processes=2) as pool:
        pool.starmap(worker, args)

if __name__ == "__main__":
    main()
foo.py (no more globals):
#!/usr/bin/env python
def foo(stringy, dictionary):
    print('foo '+stringy)
    globaly = 5
    dictionary['globaly'] = globaly

def bar(stringy, dictionary):
    print('bar '+stringy)
    globaly = dictionary['globaly']
    print(str(globaly))
This way you can work around the problem without annoying can't pickle ... errors, while maintaining state.
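An alternative sketch, not from the original post: keep the module attribute for convenience, but implement __getstate__/__setstate__ so pickle drops it on the way out and re-imports it on the way in:

#!/usr/bin/env python
import importlib

class clazz(object):
    def __init__(self, modpathy):
        self.modpathy = modpathy
        self.moduly = importlib.import_module(modpathy)

    def __getstate__(self):
        # Copy the instance dict, but leave out the unpicklable module.
        state = self.__dict__.copy()
        del state['moduly']
        return state

    def __setstate__(self, state):
        # Restore the plain attributes, then re-import in the new process.
        self.__dict__.update(state)
        self.moduly = importlib.import_module(self.modpathy)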
I'd like to add an action to my Notification with a callback. I'm using PyGObject with the following code:
import logging
from time import sleep

import gi
gi.require_version('Notify', '0.7')
from gi.repository import Notify

def callback(*args, **kwargs):
    print("Got callback")
    print(locals())

def main():
    Notify.init("Hello World")
    notification = Notify.Notification.new("Testing")
    notification.add_action("my action", "Submit", callback)
    notification.show()
    sleep(5)

if __name__ == '__main__':
    logging.basicConfig(level=logging.DEBUG)
    main()
When I run the script, I see the notification with the "Submit" button, but when I click the button the callback isn't run (as far as I can tell).
When I use ipython to inspect things, I get this help for add_action:
In [65]: Notify.Notification.add_action?
Type:        FunctionInfo
String form: gi.FunctionInfo(add_action)
File:        /usr/lib/python3.5/site-packages/gi/__init__.py
Docstring:   add_action(self, action:str, label:str, callback:Notify.ActionCallback, user_data=None)
So I see that the callback should be an ActionCallback? I then inspect that class:
In [67]: Notify.ActionCallback
---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-67-aa40d4997598> in <module>()
----> 1 Notify.ActionCallback

/usr/lib/python3.5/site-packages/gi/module.py in __getattr__(self, name)
    231                     wrapper = info.get_value()
    232                 else:
--> 233                     raise NotImplementedError(info)
    234
    235                 # Cache the newly created wrapper which will then be

NotImplementedError: gi.CallbackInfo(ActionCallback)
...and I get a NotImplementedError. So are notification actions just not implemented in PyGObject? Or am I doing something wrong in passing my callback to the add_action method?
I'm on Arch Linux, using the package python-gobject 3.22.0-1, running with Python 3.5.2.
It turns out I needed to run the Gtk main loop:
from gi.repository import Gtk
Gtk.main()
Then the callback was called just fine.
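Putting the question's code and the fix together, a minimal self-contained sketch (the Gtk version pin and the main_quit call are my additions):

import gi
gi.require_version('Notify', '0.7')
gi.require_version('Gtk', '3.0')
from gi.repository import Gtk, Notify

def callback(*args, **kwargs):
    print("Got callback")
    Gtk.main_quit()  # leave the loop once the action has fired

def main():
    Notify.init("Hello World")
    notification = Notify.Notification.new("Testing")
    notification.add_action("my action", "Submit", callback)
    notification.show()
    Gtk.main()  # without a running main loop, the callback never fires

if __name__ == '__main__':
    main()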
I have 4 D-Bus service Python scripts (a.py, b.py, c.py, d.py) and run them in main.py.
The reason I want to merge the D-Bus services is the memory usage per running process: 2.0% of memory per D-Bus service, and I will be creating 15 of them.
main.py
#!/usr/bin/env python
import sys
import gobject
from dbus.mainloop.glib import DBusGMainLoop

listofdbusfilenames = ['a', 'b', 'c', 'd']

def importDbusServices():
    for dbusfilename in listofdbusfilenames:
        globals()[dbusfilename] = __import__(dbusfilename)

def callservices():
    for dbusfilename in listofdbusfilenames:
        globals()[dbusfilename + '_var'] = eval(dbusfilename + '.ServiceClass()')

if __name__ == '__main__':
    importDbusServices()
    DBusGMainLoop(set_as_default=True)
    callservices()
    loop = gobject.MainLoop()
    loop.run()
a.py wants to get the return value of a method from b.py.
b.py will get the return values of methods from c.py and d.py:
-a.py
--|b.py
----|c.py
----|d.py
The PROBLEMS:
I can't get the return value via get_dbus_method; an Introspect error appears.
I tried signals and receivers, but it takes longer than get_dbus_method.
my D-Bus service format:
import dbus
import dbus.service

class ServiceClass(dbus.service.Object):
    def __init__(self):
        busName = dbus.service.BusName('test.a', bus=dbus.SystemBus())
        dbus.service.Object.__init__(self, busName, '/test/a')

    @dbus.service.method('test.a')
    def aMethod1(self):
        # get the b.py method value 'get_dbus_method' here
        return  # value from b.py method
Is there any other way to get the method directly?
Thanks in advance for the response and for reading this. :D
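Since after the merge all services live in one process, one workaround (my suggestion, not from the original post; the b_service parameter and bMethod1 are hypothetical names) is to hand each ServiceClass a direct reference to the others and make plain Python calls, bypassing get_dbus_method and Introspect entirely:

# a.py
import dbus
import dbus.service

class ServiceClass(dbus.service.Object):
    def __init__(self, b_service):
        busName = dbus.service.BusName('test.a', bus=dbus.SystemBus())
        dbus.service.Object.__init__(self, busName, '/test/a')
        self.b_service = b_service  # plain Python reference, no D-Bus proxy

    @dbus.service.method('test.a')
    def aMethod1(self):
        # Ordinary method call instead of get_dbus_method:
        return self.b_service.bMethod1()

# main.py would then wire the instances together:
#     b_var = b.ServiceClass()
#     a_var = a.ServiceClass(b_service=b_var)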
I'm trying to write a signal handler that will call methods from a class variable.
I have code that looks like this:
import daemon

class bar():
    def func():
        print "Hello World!\n"

def sigusr1_handler(signum, frame):
    foo.func()

def main():
    foo = bar()
    context = daemon.DaemonContext(stdout=sys.stdout)
    context.signal_map = {
        signal.SIGUSR1: sigusr1_handler
    }
    with context:
        if (__name__="__main__"):
            main()
This doesn't work. Python throws a NameError exception when I do a kill -USR1 on the daemon.
I also tried defining functions inside main that would handle the exception and calling those functions from the signal handlers, but that didn't work either.
Does anybody have ideas on how to implement this?
One option would be to import the bar class inside your sigusr1_handler function. It's probably a good idea to have it in a different file anyway.
Do you import signal? Because if I run your code I get:
Traceback (most recent call last):
  File "pydaemon.py", line 16, in <module>
    signal.SIGUSR1: sigusr1_handler
NameError: name 'signal' is not defined
You might fix this with:
import signal
And have a look at your string comparison operator:
with context:
    if (__name__="__main__"):
        main()
I generally use the '==' operator instead of '='.
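For completeness, a sketch that combines the fixes mentioned above (the missing imports, ==, and moving foo to module level so the handler can resolve the name); the signal.pause() idle loop is my assumption about how the daemon should wait:

#!/usr/bin/env python
import signal
import sys

import daemon

class bar():
    def func(self):
        print("Hello World!\n")

foo = bar()  # module level, so sigusr1_handler can find it

def sigusr1_handler(signum, frame):
    foo.func()

def main():
    context = daemon.DaemonContext(stdout=sys.stdout)
    context.signal_map = {
        signal.SIGUSR1: sigusr1_handler,
    }
    with context:
        while True:
            signal.pause()  # sleep until a signal arrives

if __name__ == "__main__":
    main()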
I have a function that is responsible for killing a child process when the program ends:
import os
import signal

class MySingleton:
    def __init__(self):
        import atexit
        atexit.register(self.stop)

    def stop(self):
        os.kill(self.sel_server_pid, signal.SIGTERM)
However, I get an error message when this function is called:
Traceback (most recent call last):
  File "/usr/lib/python2.5/atexit.py", line 24, in _run_exitfuncs
    func(*targs, **kargs)
  File "/home/commando/Development/Diploma/streaminatr/stream/selenium_tests.py", line 66, in stop
    os.kill(self.sel_server_pid, signal.SIGTERM)
AttributeError: 'NoneType' object has no attribute 'kill'
Looks like the os and signal modules get unloaded before atexit is called. Re-importing them solves the problem, but this behaviour seems weird to me: these modules are imported before I register my handler, so why are they unloaded before my own exit handler runs?
There are no strong guarantees about the order in which things are destroyed at program termination time, so it's best to ensure atexit-registered functions are self-contained. E.g., in your case:
class MySingleton:
    def __init__(self):
        import atexit
        atexit.register(self.stop)
        self._dokill = os.kill
        self._thesig = signal.SIGTERM

    def stop(self):
        self._dokill(self.sel_server_pid, self._thesig)
This is preferable to re-importing modules (which could conceivably cause slowdown of program termination and even unending loops, though that risk is lesser for "system-supplied" modules such as os).
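A related idiom (my addition, not from the original answer) binds the needed objects as default parameter values; defaults are evaluated once at definition time, so they survive module teardown just like the instance attributes above:

import os
import signal

class MySingleton:
    def stop(self, _kill=os.kill, _sig=signal.SIGTERM):
        # _kill and _sig were captured when the method was defined,
        # so they remain usable even if the os/signal globals are cleared.
        _kill(self.sel_server_pid, _sig)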