I ran into this problem in my work code, so I can't show the original. Instead I wrote a short example that reproduces the error exactly and cuts out the redundant logic.
The example has two files: Example.py and ImportedExample.py.
Example.py
from multiprocessing import Process
from ImportedExample import Imported
class Example:
    def __init__(self, number):
        self.imported = Imported(number)
def func(example: Example):
    print(example)
if __name__ == "__main__":
    ex = Example(3)
    p = Process(target=func, args=(ex,))
    p.start()
ImportedExample.py
class Imported:
    def __init__(self, number):
        self.number = number
        self.ref = self.__private_method
    def __private_method(self):
        print(self.number)
And Traceback looks like this:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Python\Python36\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "C:\Python\Python36\lib\multiprocessing\spawn.py", line 115, in _main
self = reduction.pickle.load(from_parent)
AttributeError: 'Imported' object has no attribute '__private_method'
The main detail is that when I make __private_method() non-private (renaming it to private_method()), everything works fine.
I don't understand why this happens. Any suggestions?
The multiprocessing module uses pickle to transfer objects between processes.
For an object to be picklable, it has to be accessible by name. A bound method is pickled as its instance plus the method's name, and because of private name mangling the method is actually stored on the class as _Imported__private_method, so looking it up as __private_method in the child process fails.
I suggest making the method protected, that is, naming it with only one leading underscore. By convention, protected methods should be treated just like private methods, but they are not subject to name mangling.
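To make the mismatch visible without starting a second process, here is a minimal sketch. It assumes the behaviour behind the Python 3.6 traceback above, where the method is looked up again by its plain name; newer Python releases may handle this differently. The extra method _protected_method and the attribute names private_ref and protected_ref are added only for illustration.

```python
class Imported:
    def __init__(self, number):
        self.number = number
        # Inside the class body, self.__private_method is compiled as
        # self._Imported__private_method (name mangling).
        self.private_ref = self.__private_method
        self.protected_ref = self._protected_method

    def __private_method(self):
        return self.number

    def _protected_method(self):  # hypothetical single-underscore variant
        return self.number


obj = Imported(3)

# The function keeps its unmangled name ...
private_name = obj.private_ref.__func__.__name__
# ... but the attribute is only reachable under the mangled name.
found_by_plain_name = hasattr(obj, private_name)
found_by_mangled_name = hasattr(obj, "_Imported__private_method")

# A single-underscore method is stored under the same name it reports.
protected_name = obj.protected_ref.__func__.__name__
found_protected = hasattr(obj, protected_name)

print(private_name, found_by_plain_name, found_by_mangled_name)
print(protected_name, found_protected)
```

The name the function reports and the name it is stored under differ only for the double-underscore method, which is why renaming it makes the lookup succeed.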
Let's say I have this code:
class StaticParent:
    def _print(self):
        print("I'm StaticParent")
class StaticChild(StaticParent):
    def _print(self):
        print('StaticChild saying: ')
        super()._print()
def _parent_print_proto(self):
    print("I'm DynamicParent")
def _child_print_proto(self):
    print('DynamicChild saying: ')
    super()._print()
DynamicParent = type('DynamicParent', tuple([]), {"_print": _parent_print_proto})
DynamicChild = type('DynamicChild', (DynamicParent,), {"_print": _child_print_proto})
sc = StaticChild()
sc._print()
dc = DynamicChild()
dc._print()
And its output is:
StaticChild saying:
I'm StaticParent
DynamicChild saying:
Traceback (most recent call last):
File "/tmp/example.py", line 28, in <module>
dc._print()
File "/tmp/example.py", line 17, in _child_print_proto
super()._print()
RuntimeError: super(): __class__ cell not found
So the question is: how do I create a prototype method, shared by many classes, that calls super()?
PS: I tried to implement the method with a lambda, but that does not work either:
DynamicChildWithLambda = type('DynamicChild', (DynamicParent,), {"_print": lambda self : print('Lambda saying: ', super()._print())})
Traceback (most recent call last):
File "/tmp/example.py", line 30, in <module>
dcwl._print()
File "/tmp/example.py", line 23, in <lambda>
DynamicChildWithLambda = type('DynamicChild', (DynamicParent,), {"_print": lambda self : print('Lambda saying: ', super()._print())})
RuntimeError: super(): __class__ cell not found
PS2 Also I tried this way:
class StaticParent:
    def _print(self):
        print("I'm StaticParent")
def _child_print_proto(self):
    print('DynamicChild saying: ')
    super(StaticParent, self)._print()
DynamicChild = type('DynamicChild', (StaticParent,), {"_print": _child_print_proto})
dc = DynamicChild()
dc._print()
DynamicChild saying:
Traceback (most recent call last):
File "/tmp/example.py", line 13, in <module>
dc._print()
File "/tmp/example.py", line 8, in _child_print_proto
super(StaticParent, self)._print()
AttributeError: 'super' object has no attribute '_print'
Methods defined within a class get a fake closure scope that automatically provides the class they were defined in to no-arg super(). A function defined outside a class can't get this (because clearly no class is being defined at the time you define it). But you can still make a closure the old-fashioned way, by writing an enclosing function in which you manually define __class__ appropriately:
class StaticParent:
    def _print(self):
        print("I'm StaticParent")
class StaticChild(StaticParent):
    def _print(self):
        print('StaticChild saying: ')
        super()._print()
def _parent_print_proto(self):
    print("I'm DynamicParent")
# Nesting allows us to make the inner function have an appropriate __class__
# defined for use by no-arg super
def _make_child_print_proto(cls):
    __class__ = cls
    def _child_print_proto(self):
        print('DynamicChild saying: ')
        super()._print()
    return _child_print_proto
DynamicParent = type('DynamicParent', tuple([]), {"_print": _parent_print_proto})
DynamicChild = type('DynamicChild', (DynamicParent,), {})
# Need DynamicChild to exist to use it as __class__, bind after creation
DynamicChild._print = _make_child_print_proto(DynamicChild)
sc = StaticChild()
sc._print()
dc = DynamicChild()
dc._print()
Yes, it's hacky and awful. In real code, I'd just use the less common explicit two-arg super:
def _child_print_proto(self):
    print('DynamicChild saying: ')
    super(DynamicChild, self)._print()
That's all super() does anyway; Python hides the class it was defined in in closure scope as __class__, super() pulls it and the first positional argument, assumed to be self, and implicitly does the same thing the two-arg form did explicitly.
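For reference, a self-contained sketch of the two-arg form applied to classes built with type(). The methods here return lists of lines instead of printing, purely so the result is easy to inspect; that change and the DynamicGrandChild class are additions for illustration.

```python
def _parent_print_proto(self):
    return ["I'm DynamicParent"]


def _child_print_proto(self):
    # DynamicChild is looked up as a module-level name when the method runs,
    # so it only has to exist by the time the method is called.
    return ["DynamicChild saying:"] + super(DynamicChild, self)._print()


DynamicParent = type("DynamicParent", (), {"_print": _parent_print_proto})
DynamicChild = type("DynamicChild", (DynamicParent,), {"_print": _child_print_proto})


class DynamicGrandChild(DynamicChild):  # subclassing keeps working
    pass


child_lines = DynamicChild()._print()
grandchild_lines = DynamicGrandChild()._print()
print(child_lines)
print(grandchild_lines)
```

Because the first argument names the class the method belongs to, rather than the run-time type of self, the lookup starts at the same place for DynamicChild and for any subclass.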
To be clear: You cannot pass self.__class__ manually as the first argument to two-arg super(), to simulate the __class__ that is bound in closure scope. It appears to work, and it does work, until you actually make a child of the class with that method and try to call the method on it (and the whole point of super() is you might have an arbitrarily complex class hierarchy to navigate; you can't just say "Oh, but my class is special enough to never be subclassed again"). If you do something as simple as adding:
class DynamicGrandChild(DynamicChild):
    pass
dgc = DynamicGrandChild()
dgc._print()
to the self.__class__-using code from Epsi95's answer, you're going to see:
StaticChild saying:
I'm StaticParent
DynamicChild saying:
I'm DynamicParent
DynamicChild saying:
DynamicChild saying:
DynamicChild saying:
DynamicChild saying:
DynamicChild saying:
DynamicChild saying:
DynamicChild saying:
DynamicChild saying:
... repeats a thousand times or so ...
DynamicChild saying:
DynamicChild saying:
Traceback (most recent call last):
File ".code.tio", line 31, in <module>
dgc._print()
File ".code.tio", line 15, in _child_print_proto
super(self.__class__, self)._print()
File ".code.tio", line 15, in _child_print_proto
super(self.__class__, self)._print()
File ".code.tio", line 15, in _child_print_proto
super(self.__class__, self)._print()
[Previous line repeated 994 more times]
File ".code.tio", line 14, in _child_print_proto
print('DynamicChild saying: ')
RecursionError: maximum recursion depth exceeded while calling a Python object
super() is used when you're designing for inheritance. super(self.__class__, self) is used only when you're sabotaging inheritance. The only safe ways to do this involve some sort of static linkage to the class (the one that the method will be attached to, not the run time type of whatever instance it is called with), and my two solutions above are the two reasonable ways to do this (there's only one other option really, which is making the closure, and still passing the class and self explicitly, e.g. instead of defining __class__, just use super(cls, self) to explicitly use a closure variable; not sufficiently distinct, and kinda the worst of both worlds).
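For completeness, the "other option" mentioned in passing would look roughly like the sketch below: a closure that captures the class in an ordinary variable and passes it to two-arg super. The names make_child_print and DynamicGrandChild are made up for this example, and the methods return lists rather than printing so the result can be checked.

```python
def make_child_print(cls):
    # cls is an ordinary closure variable, fixed when the method is created,
    # so it names the class the method is attached to rather than type(self).
    def _print(self):
        return ["DynamicChild saying:"] + super(cls, self)._print()
    return _print


DynamicParent = type("DynamicParent", (), {"_print": lambda self: ["I'm DynamicParent"]})
DynamicChild = type("DynamicChild", (DynamicParent,), {})
# The class must exist before it can be captured, so bind after creation.
DynamicChild._print = make_child_print(DynamicChild)


class DynamicGrandChild(DynamicChild):
    pass


result = DynamicGrandChild()._print()
print(result)
```

It avoids the recursion problem for the same reason the other safe approaches do: the class is linked statically at binding time.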
So, I said "there's three ways", but in fact there is a slightly nicer (but also less portable, because the API for types.FunctionType has changed over time, even though closures defined normally haven't) solution, which lets you make a utility function to bind arbitrary functions to arbitrary classes, instead of requiring you to wrap each such function in a closure-maker. And that's to literally rebuild the function as a closure directly:
import types
# Define function normally, with super(). The function can't actually be used as is though
def _child_print_proto(self):
    print('DynamicChild saying: ')
    super()._print()
# Utility function binding arbitrary function to arbitrary class
def bind_to_class(cls, func):
    # This translates as:
    # Make a new function using the same code object as the passed function,
    # but tell it it has one closure scoped variable named __class__, and
    # provide the class as the value to associate with it.
    # This requires Python 3.8 or later (code objects didn't have a replace method
    # until then, so doing this would be even uglier than it already is)
    return types.FunctionType(func.__code__.replace(co_freevars=('__class__',)), func.__globals__, closure=(types.CellType(cls),))
DynamicParent = type('DynamicParent', tuple([]), {"_print": _parent_print_proto})
DynamicChild = type('DynamicChild', (DynamicParent,), {})
# Rebind the function to a new function that believes it was defined in DynamicChild
DynamicChild._print = bind_to_class(DynamicChild, _child_print_proto)
Like I said, it's super-ugly and would need rewrites pre-3.8, but it does have the mild advantage of allowing you to use bind_to_class with arbitrary super() using functions to bind them to arbitrary classes. I still think manually calling two-arg super is the safe/obvious way to go, but now you've got all the options.
Good day to you,
Today I was moving code from threading to multiprocessing. Everything seemed okay until I got the following error:
Error
Traceback (most recent call last):
File "run.py", line 93, in <module>
main()
File "run.py", line 82, in main
emenu.executemenu(components, _path)
File "/home/s1810979/paellego/lib/execute/execute_menu.py", line 29, in executemenu
e.executeall(installed, _path)
File "/home/s1810979/paellego/lib/execute/execute.py", line 153, in executeall
pool.starmap(phase2, args)
File "/usr/lib64/python3.4/multiprocessing/pool.py", line 268, in starmap
return self._map_async(func, iterable, starmapstar, chunksize).get()
File "/usr/lib64/python3.4/multiprocessing/pool.py", line 608, in get
raise self._value
File "/usr/lib64/python3.4/multiprocessing/pool.py", line 385, in _handle_tasks
put(task)
File "/usr/lib64/python3.4/multiprocessing/connection.py", line 206, in send
self._send_bytes(ForkingPickler.dumps(obj))
File "/usr/lib64/python3.4/multiprocessing/reduction.py", line 50, in dumps
cls(buf, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <class 'module'>: attribute lookup module on builtins failed
Code
execute.py
def executeall(components, _path):
    args = []
    manager = multiprocessing.Manager()
    q = manager.Queue()
    resultloc = '/some/result.log'
    for component in components:
        for apkpath, resultpath in zip(execonfig.apkpaths, execonfig.resultpaths):
            args.append((component,apkpath,resultpath,q,)) #Args for subprocesses
    cores = askcores()
    with multiprocessing.Pool(processes=cores) as pool:
        watcher = pool.apply_async(lgr.log, (resultloc+'/results.txt', q,))
        pool.starmap(phase2, args)
component.py
class Component(object):
    def __init__(self, installmodule, runmodule, installerloc, installationloc, dependencyloc):
        self.installmodule = installmodule
        self.runmodule = runmodule
        self.installerloc = installerloc
        self.installationloc = installationloc
        self.dependencyloc = dependencyloc
        self.config = icnf.Installconfiguration(installerloc+'/conf.conf')
    #lots of functions...
installconfig.py
class State(Enum):
    BEGIN=0 #Look for units
    UNIT=1 #Look for unit keypairs
    KEYPAIR=3
class Phase(Enum):
    NONE=0
    DEPS=1
    PKGS=2
class Installconfiguration(object):
    def __init__(self, config):
        dictionary = self.reader(config) #Fill a dictionary
        #dictionary (key:Phase, value: (dictionary key: str, job))
        self.deps = dictionary[Phase.DEPS]
        self.pkgs = dictionary[Phase.PKGS]
job.py
class Job(object):
    def __init__(self, directory=None, url=None):
        self.directory = directory if directory else ''
        self.url = url if url else ''
As you can see, I pass a component as an argument to the function phase2(component, str, str, multiprocessing.Manager().Queue()).
The second and third arguments of the Component constructor are modules imported with importlib.
What I tried
I am new to python, but not to programming. Here is what I tried:
Because the error itself did not point out what exactly the problem was, I tried removing args one at a time to find out which one can't be pickled. With component removed, everything works fine, so it appears to be the cause of the trouble. However, I need this object passed to my processes.
I searched around the internet for hours, but did not find anything but basic tutorials about multiprocessing and explanations of how pickle works. I did find this saying it should work, but not on Windows or something. However, it does not work on Unix (which I use).
My ideas
As I understood it, nothing suggests I cannot send a class containing two importlib modules. I do not know what the exact problem is with the component class, but the importlib modules stored as members are the only unusual thing about it. This is why I believe the problem occurs here.
Question
Do you know why a class containing modules is unsuitable for 'pickling'? How can one get a better idea why and where Can't pickle <class 'module'> errors occur?
More code
Full source code for this can be found on https://github.com/Sebastiaan-Alvarez-Rodriguez/paellego
Questions to me
Please leave comments requesting clarifications/more code snippets/??? if you would like me to edit this question
A last request
I would like solutions to use python standard library only, python 3.3 preferably. Also, a requirement of my code is that it runs on Unix systems.
Thanks in advance
Edit
As requested, here is a minimal example which greatly simplifies the problem:
main.py (you could execute as python main.py foo)
#!/usr/bin/env python
import sys
import importlib
import multiprocessing
class clazz(object):
    def __init__(self, moduly):
        self.moduly = moduly
    def foopass(self, stringy):
        self.moduly.foo(stringy)
    def barpass(self, stringy, numbery):
        self.moduly.bar(stringy)
        print('Second argument: '+str(numbery))
def worker(clazzy, numbery):
    clazzy.barpass('wow', numbery)
def main():
    clazzy = clazz(importlib.import_module(sys.argv[1]))
    clazzy.foopass('init')
    args = [(clazzy, 2,)]
    with multiprocessing.Pool(processes=2) as pool:
        pool.starmap(worker, args)
if __name__ == "__main__":
    main()
foo.py (needs to be in same directory for above call suggestion):
#!/usr/bin/env python
globaly = 0
def foo(stringy):
    print('foo '+stringy)
    global globaly
    globaly = 5
def bar(stringy):
    print('bar '+stringy)
    print(str(globaly))
This gives an error upon running: TypeError: can't pickle module objects
Now we know that pickling module objects is (sadly) not possible.
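A quick way to convince yourself, sketched here with the standard json module standing in for the user-supplied module (that choice is only for illustration): the module object is rejected by pickle, but its import name is a plain string that survives the round trip and can be imported again on the other side.

```python
import importlib
import json
import pickle

# Pickling the module object itself is rejected.
try:
    pickle.dumps(json)
    module_pickled = True
except (TypeError, pickle.PicklingError):
    module_pickled = False

# The module's import name is a plain string, which pickles fine.
payload = pickle.dumps(json.__name__)
restored = importlib.import_module(pickle.loads(payload))

print(module_pickled, restored is json)
```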
To get rid of the error, don't let clazz hold a module as an attribute, however convenient. Instead let it hold "modpath", the string that importlib needs to import the module specified by the user.
It looks like this (foo.py remains exactly the same as above):
#!/usr/bin/env python
import sys
import importlib
import multiprocessing
class clazz(object):
    def __init__(self, modpathy):
        self.modpathy = modpathy
    def foopass(self, stringy):
        moduly = importlib.import_module(self.modpathy)
        moduly.foo(stringy)
    def barpass(self, stringy, numbery):
        moduly = importlib.import_module(self.modpathy)
        moduly.bar(stringy)
        print('Second argument: '+str(numbery))
def worker(clazzy, number):
    clazzy.barpass('wow', number)
def main():
    clazzy = clazz(sys.argv[1])
    clazzy.foopass('init')
    args = [(clazzy, 2,)]
    with multiprocessing.Pool(processes=2) as pool:
        pool.starmap(worker, args)
if __name__ == "__main__":
    main()
If you need your globals, such as globaly, to maintain their state, then you need to pass a mutable object (e.g. a list or dictionary) to hold this data. Thanks to #DavisHerring, who commented:
Module attributes are called “global variables” in Python, but they are no more persistent or accessible than any other data. Why not just use dictionaries?
The example code would look like this:
#!/usr/bin/env python
import sys
import importlib
import multiprocessing
class clazz(object):
    def __init__(self, modpathy):
        self.modpathy = modpathy
        self.dictionary = {}
    def foopass(self, stringy):
        moduly = importlib.import_module(self.modpathy)
        moduly.foo(stringy, self.dictionary)
    def barpass(self, stringy, numbery):
        moduly = importlib.import_module(self.modpathy)
        moduly.bar(stringy, self.dictionary)
        print('Second argument: '+str(numbery))
def worker(clazzy, number):
    clazzy.barpass('wow', number)
def main():
    clazzy = clazz(sys.argv[1])
    clazzy.foopass('init')
    args = [(clazzy, 2,)]
    with multiprocessing.Pool(processes=2) as pool:
        pool.starmap(worker, args)
if __name__ == "__main__":
    main()
foo.py (no more globals):
#!/usr/bin/env python
def foo(stringy, dictionary):
    print('foo '+stringy)
    globaly = 5
    dictionary['globaly'] = globaly
def bar(stringy, dictionary):
    print('bar '+stringy)
    globaly = dictionary['globaly']
    print(str(globaly))
This way you can work around the problem without annoying can't pickle ... errors, while still maintaining state.
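As for finding out where a can't pickle error comes from: one rough approach is to try pickling each instance attribute on its own and collect the failures. The helper below is only a sketch. find_unpicklable and Holder are names invented here, json stands in for a module loaded with importlib, and only the top level of attributes is checked (nested objects would need a recursive version).

```python
import json
import pickle


def find_unpicklable(obj):
    """Return {attribute name: error} for instance attributes pickle rejects."""
    problems = {}
    for name, value in vars(obj).items():
        try:
            pickle.dumps(value)
        except Exception as exc:  # pickle raises several exception types
            problems[name] = "{}: {}".format(type(exc).__name__, exc)
    return problems


class Holder(object):  # hypothetical stand-in for the Component class
    def __init__(self):
        self.runmodule = json        # a module object, cannot be pickled
        self.installerloc = "/tmp"   # a plain string, pickles fine


problems = find_unpicklable(Holder())
print(problems)
```

Running it against the object you intend to send to a worker narrows the error down to a specific attribute before multiprocessing gets involved.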
I'm having a Python problem with a simple program. The program is supposed to let the user make a Cow() instance and give the cow a name as a parameter.
class Cow():
    def __init__(self, name):
        self.name = name
        if self.name == None:
            raise NoNameCowError("Your cow must have a name")
    def speak(self):
        print self.name, "says moo"
Now when I do
cow.Cow("Toby")
I get the error
Traceback (most recent call last):
File "<pyshell#32>", line 1, in <module>
cow.Cow("Toby")
File "C:\Users\Samga_000\Documents\MyPrograms\cow.py", line 8, in __init__
self.name = name
AttributeError: Cow instance has no attribute 'name'
Help? I originally thought I did something wrong with the exception but it doesn't seem to be that. Thanks in advance.
I think you modified your source code and didn't reload the module:
Buggy version:
class Cow():
    def __init__(self, name):
        if self.name == None:
            raise NoNameCowError("Your cow must have a name")
    def speak(self):
        print self.name, "says moo"
>>> import so
Error raised as expected:
>>> so.Cow('abc1')
Traceback (most recent call last):
File "<ipython-input-4-80383f90b571>", line 1, in <module>
so.Cow('abc1')
File "so.py", line 3, in __init__
if self.name == None:
AttributeError: Cow instance has no attribute 'name'
Now let's modify the source code and add this line self.name = name:
>>> import so
>>> so.Cow('abc1')
Traceback (most recent call last):
File "<ipython-input-6-80383f90b571>", line 1, in <module>
so.Cow('abc1')
File "so.py", line 3, in __init__
self.name = name
AttributeError: Cow instance has no attribute 'name'
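Eh! Still the same error? That's because Python is still using the cached module object: a second import of an already-imported module does not re-read the file. The traceback shows the new source line only because tracebacks read the source text from disk, while the code actually running is the old version. Just reload the module and the updated code works fine: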
>>> reload(so)
<module 'so' from 'so.py'>
>>> so.Cow('dsfds')
<so.Cow instance at 0x8b78e8c>
From docs:
Note For efficiency reasons, each module is only imported once per
interpreter session. Therefore, if you change your modules, you must
restart the interpreter – or, if it’s just one module you want to test
interactively, use reload(), e.g. reload(modulename).
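The reload() quoted above is the Python 2 builtin; in Python 3 the equivalent is importlib.reload(). Here is a small self-contained sketch of the same effect in Python 3. The module name cow_demo and the temporary directory are made up for the demonstration.

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True  # keep stale .pyc files out of this demo

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "cow_demo.py")  # hypothetical module name
    sys.path.insert(0, tmp)
    try:
        with open(path, "w") as f:
            f.write("def speak():\n    return 'moo'\n")
        import cow_demo
        before = cow_demo.speak()

        # Edit the source on disk, as you would in an editor.
        with open(path, "w") as f:
            f.write("def speak():\n    return 'MOO, louder'\n")

        import cow_demo  # no effect: the module is already in sys.modules
        after_import = cow_demo.speak()

        importlib.reload(cow_demo)  # re-executes the edited source
        after_reload = cow_demo.speak()
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("cow_demo", None)

print(before, after_import, after_reload)
```

The second import statement leaves the old function in place; only the explicit reload picks up the edit.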
A better version of your code:
class Cow():
    def __init__(self, name=None): #Use a default value
        self.name = name
        if self.name is None: #use `is` for testing against `None`
            raise NoNameCowError("Your cow must have a name")
I'm staring at the name check; this object requires the name argument does it not?
if self.name == None:
    raise NoNameCowError("Your cow must have a name")
I'm a little Python thick, but that looks like a required arg.
When you're declaring Cow, you need to pass object in the parentheses:
class Cow(object):
    rest of code.
This way Python knows the class you're declaring is an object with attributes and methods.
I get an AttributeError I can't seem to work out.
I'm working with two classes.
The first class goes something like this.
class Partie:
    def __init__(self):
        # deleted lines
        self.interface = Interface(jeu=self)
    def evaluerProposition(self):
        # computations
        self.interface.afficherReponse()
Introducing second class (in a separate file).
class Interface:
    def __init__(self, jeu):
        self.jeu = jeu
        self.root = tkinter.Tk()
        # stuff
    def onClick(self, event):
        # talk
        self.jeu.evaluerProposition()
    def afficherReponse(self):
        # stuff
I start the whole thing by
partie = Partie()
All manipulations on my widget work fine until some click event causes
Exception in Tkinter callback
Traceback (most recent call last):
File "C:\Python33\lib\tkinter\__init__.py", line 1442, in __call__
return self.func(*args)
File "C:\Users\Canard\Documents\My Dropbox\Python\AtelierPython\Mastermind\classeInterface.py", line 197, in clic
self.jeu.evaluerProposition()
File "C:\Users\Canard\Documents\My Dropbox\Python\AtelierPython\Mastermind\classeJeu.py", line 55, in evaluerProposition
self.interface.afficherReponse()
AttributeError: 'Partie' object has no attribute 'interface'
I typed in the interpreter
>>> dir(partie)
and got a long list in return with 'interface' among the attributes.
Also typed
>>> partie.interface
<classeInterface.Interface object at 0x02C39E50>
so the attribute seems to exist.
Following the advice in some former post, I checked the instance names do not coincide with module names.
I am confused.
Most likely, in some code that you're not showing us, you're doing something like this:
self.some_button = tkinter.Button(..., command=self.interface.onClick())
Note the trailing () on onClick(). This would cause the onClick method to be called at the time the button is created, which is probably before your constructor is done constructing the instance of the Partie class.
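The difference between passing a callback and calling it can be shown without a GUI. In the sketch below, FakeButton is a made-up stand-in for tkinter.Button that only stores its command; it is not part of tkinter.

```python
class FakeButton:
    """Hypothetical stand-in for tkinter.Button: stores a callback for later."""

    def __init__(self, command=None):
        self.command = command

    def click(self):
        return self.command()


calls = []


def on_click():
    calls.append("clicked")
    return "handled"


# Wrong: on_click() runs immediately and its return value is stored.
wrong = FakeButton(command=on_click())
calls_after_wrong = list(calls)

# Right: the function object is stored and runs only on click.
calls.clear()
right = FakeButton(command=on_click)
calls_before_click = list(calls)
result = right.click()

print(calls_after_wrong, wrong.command, calls_before_click, result)
```

With the trailing parentheses the handler fires during construction, which in the question's setup would be before self.interface has been assigned.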
Below is a snippet from a class that I am trying to instantiate, which gives an error:
class FoF(object):
    def __init__(self,path):
        filepath=[]
        filepath.append(self.FileOrFolder(path))
Executing it gives the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "PathOps.py", line 6, in __init__
def __init__(self,path):
NameError: global name 'filepath' is not defined
After which I tried:
filepath=[]
class FoF(object):
    def __init__(self,path):
        global filepath.append(self.FileOrFolder(path))
And again:
File "<stdin>", line 1, in <module>
File "PathOps.py", line 6, in __init__
global filepath.append(self.FileOrFolder(path))
NameError: global name 'filepath' is not defined
What is causing the error and how do I fix it?
Try using self instead of global.
Something like this:
class FoF(object):
    def __init__(self,path):
        self.filepath=[]
        self.filepath.append(self.FileOrFolder(path))
The error comes up because Python thinks you're trying to do one of two things:
Either you're trying to reference a global variable called filepath, which is clearly not what you want.
Or, less obviously, you could be trying to define a class attribute called filepath. The only problem is that you can't define a class attribute with a bare name from inside a method of that class; you can only do so in the class body, outside any method.
So in order to store a value on the instance from within a method, you have to put self. before the name.
Edit: if you want it to be an attribute of the class, as I'm assuming you meant, you could do it like this:
class FoF(object):
    filepath=[]
    def __init__(self,path):
        self.filepath.append(self.FileOrFolder(path))
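One thing to keep in mind with the class-attribute version: the list belongs to the class, so every instance appends to the same list. A short sketch of the difference, using the made-up class names Shared and PerInstance and plain strings instead of FileOrFolder results:

```python
class Shared(object):
    filepath = []                      # one list, owned by the class

    def __init__(self, path):
        self.filepath.append(path)     # mutates the shared class-level list


class PerInstance(object):
    def __init__(self, path):
        self.filepath = [path]         # a fresh list for each instance


a, b = Shared("a.txt"), Shared("b.txt")
c, d = PerInstance("c.txt"), PerInstance("d.txt")
print(a.filepath, b.filepath)
print(c.filepath, d.filepath)
```

Which one you want depends on whether the paths are meant to be collected across all FoF objects or kept separately for each.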
I don't think you're giving us enough information. For example:
>>> class FoF(object):
...     def __init__(self, path):
...         junk = []
...         junk.append(path)
...
>>> foo = FoF('bar/path')
produces no error.
What, exactly, are you trying to do?