AttributeError when unpickling an object - python

I'm trying to pickle an instance of a class in one module, and unpickle it in another.
Here's where I pickle:
import cPickle
def pickleObject():
    object = Foo()
    savefile = open('path/to/file', 'w')
    cPickle.dump(object, savefile, cPickle.HIGHEST_PROTOCOL)
class Foo(object):
    (...)
and here's where I try to unpickle:
savefile = open('path/to/file', 'r')
object = cPickle.load(savefile)
On that second line, I get AttributeError: 'module' object has no attribute 'Foo'
Anyone see what I'm doing wrong?

The class Foo must be importable via the same path in the unpickling environment so that the pickled object can be reinstantiated.
I think your issue is that you define Foo in the module that you are executing as main (__name__ == "__main__"). Pickle serializes the path to Foo (not the class object or its definition) as being in the main module. When you unpickle from a different script, that script is the main module, and Foo is not an attribute of it.
In this example, you could redefine class Foo in the unpickling script and it should unpickle just fine. But the intention is really to have a common library, shared between the two scripts, that is available under the same path in both. Example: define Foo in foo.py
Simple Example:
$PROJECT_DIR/foo.py
class Foo(object):
    pass
$PROJECT_DIR/picklefoo.py
import cPickle
from foo import Foo
def pickleObject():
    obj = Foo()
    # HIGHEST_PROTOCOL is a binary format, so open the file in binary mode
    savefile = open('pickle.txt', 'wb')
    cPickle.dump(obj, savefile, cPickle.HIGHEST_PROTOCOL)
    savefile.close()
pickleObject()
$PROJECT_DIR/unpicklefoo.py
import cPickle
savefile = open('pickle.txt', 'rb')
obj = cPickle.load(savefile)
...

Jeremy Brown had the right answer; here is a more concrete version of the same point:
import cPickle
import myFooDefiningModule
def pickleObject():
    object = myFooDefiningModule.Foo()
    savefile = open('path/to/file', 'w')
    cPickle.dump(object, savefile)
and:
import cPickle
import myFooDefiningModule
savefile = open('path/to/file', 'r')
object = cPickle.load(savefile)
such that Foo lives in the same namespace in each piece of code.
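To see why the import path matters, it can help to look at what actually ends up in the pickle. The following is a minimal sketch, written for Python 3's pickle module rather than cPickle; the marker attribute and the value attribute are hypothetical names added only for illustration.

```python
import pickle

class Foo(object):
    # Hypothetical class attribute, used only to show that it is NOT stored
    marker = "defined-on-the-class"

    def __init__(self):
        self.value = 42  # instance state, which IS stored

payload = pickle.dumps(Foo(), protocol=2)

# The pickle records where to find the class: module name plus class name
print(Foo.__module__.encode() in payload)
print(b"Foo" in payload)
# The class body itself is not serialized
print(b"defined-on-the-class" in payload)
# Instance attributes are serialized
print(b"value" in payload)
```

When this runs as a script, Foo.__module__ is "__main__", which is exactly the name another script will fail to resolve when it loads the file.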

Related

reading file by pickle module

Good afternoon!
I am saving a list(dict(), dict(), dict()) structure with the pickle module, but when I read it back I get <class 'function'> and <function lesson at 0x00000278BA3A0D30>.
What am I doing wrong?
def lesson(user, date):
    with open(user+"_"+date+".data", 'wb') as file:
        pickle.dump(lesson, file)
        file.close()
def read(user, date):
    with open(user+"_"+date+".data", 'rb') as file:
        lesson = pickle.load(file)
        file.close()
    return(lesson)
I am using Python 3.10.7.
"saving list(dict(),dict(),dict()) struct with pickle module". No, you're not. You're saving the lesson function. See line 3 of your code.

Is there a way to remove unwanted imports in Python when an object is serialized

Imagine I have a.py:
import tensorflow as tf
class A:
    name = "Class A"
dump.py:
import pickle
from a import A
a = A()
with open("a.pickled", "wb") as f:
    f.write(pickle.dumps(a))
load.py:
import pickle
with open("a.pickled", "rb") as f:
    a = pickle.loads(f.read())
When Python pickles the object a, it also pulls in tensorflow. As a result, when the object is deserialized in load.py, it imports tensorflow and triggers a lot of expensive tensorflow initialization. I am wondering whether we can remove the unwanted tensorflow import in dump.py without modifying a.py.
I tried something like this in dump.py, but it doesn't work for me:
import pickle
from a import A
# Delete tensorflow related import from cache
import sys
names = [name for name in sys.modules.keys() if "tensorflow" in name]
for name in names:
    del sys.modules[name]
a = A()
with open("a.pickled", "wb") as f:
    f.write(pickle.dumps(a))
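A likely reason this has no effect is that the pickle only stores the name of the defining module, and the loading process imports that module itself when it unpickles. The sketch below illustrates that behaviour under stated assumptions: demo_a is a hypothetical stand-in for a.py, written to a temporary directory, and it deliberately contains no tensorflow import.

```python
import importlib
import os
import pickle
import sys
import tempfile

# Write a small stand-in module (hypothetical name demo_a) to a temp directory
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "demo_a.py"), "w") as f:
    f.write("class A:\n    name = 'Class A'\n")

sys.path.insert(0, module_dir)
importlib.invalidate_caches()
demo_a = importlib.import_module("demo_a")
payload = pickle.dumps(demo_a.A())

# Simulate a fresh process that has not imported the module yet
del sys.modules["demo_a"]

restored = pickle.loads(payload)
# True: unpickling imported demo_a again, running its top-level code
print("demo_a" in sys.modules)
```

Anything imported at the top of the defining module is therefore imported again at load time, regardless of what dump.py removed from its own sys.modules.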

Unpickling a class raises attribute error

I wrote a class A in the Class_A.py file
class A():
    def square(self, num):
        return num*num
Next in pickle_A.py, I saved the class in a file 'class_A_imported_from_class_A'
import pickle
from Class_A import A
if __name__ == '__main__':
    file_name = 'class_A_imported_from_class_A'
    with open(file_name,'wb') as f:
        pickle.dump(A,f,pickle.HIGHEST_PROTOCOL)
Then I moved the file to another location, where I ran Unpickling_imported_class.py to unpickle this file.
import pickle
file_name = 'class_A_imported_from_class_A'
with open(file_name,'rb') as f:
    B = pickle.load(f)
So, I get the error:
B = pickle.load(f)
builtins.ModuleNotFoundError: No module named 'Class_A'
Now, I know that the error will go away if I copy Class_A.py into this folder. The constraint is that I cannot.
I was able to do this using cloudpickle, but in that case I also have to pickle the file using cloudpickle.
My work demands that I be able to unpickle classes directly, i.e. if there's a pickled file that has the data for a class, I should be able to read the class from it directly. Is there a way that I can do it?

How do I test that I'm calling pickle.dump() correctly?

I want to test this method:
class Data(object):
    def save(self, filename=''):
        if filename:
            self.filename = filename
        if not self.filename:
            raise ValueError('Please provide a path to save to')
        with open(self.filename, 'w') as f:
            pickle.dump(self, f)
I can set up the test to make sure pickle.dump gets called, and that the first argument is the object:
@patch('pickle.dump')
def test_pickle_called(self, dump):
    self.data.save('foo.pkl')
    self.assertTrue(dump.called)
    self.assertEqual(self.data, dump.call_args[0][0])
I'm not sure what to do for the second argument, though. If I open a new file for the test, it's not going to be the same as what gets called for the execution. I'd at least like to be sure I'm opening the right file. Would I just mock open and make sure that gets called with the right name at some point?
Patch open() and return a writable StringIO instance from it. Then load the pickled data from that StringIO and test its structure and values (test that it's equivalent to self.data). Something like this:
import builtins # or __builtin__ for Python 2
builtins.open = open = Mock()
# save() uses sio in a with block, so sio must still be readable after close()
open.return_value = sio = StringIO()
self.data.save('foo.pkl')
# getvalue() returns the pickled data itself, so use loads() rather than load()
new_data = pickle.loads(sio.getvalue())
self.assertEqual(self.data, new_data)
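A self-contained version of this idea for Python 3 might look like the following sketch. It assumes unittest.mock is available and that save() opens the file in binary mode. Two names are hypothetical and introduced only for illustration: a values attribute on Data, and a KeepOpenBytesIO helper whose close() does nothing, so the buffer can still be read after save() leaves its with block.

```python
import io
import pickle
import unittest
from unittest import mock

class Data(object):
    def __init__(self, values=None):
        self.values = values  # hypothetical payload attribute
        self.filename = ''

    def save(self, filename=''):
        if filename:
            self.filename = filename
        if not self.filename:
            raise ValueError('Please provide a path to save to')
        with open(self.filename, 'wb') as f:
            pickle.dump(self, f)

class KeepOpenBytesIO(io.BytesIO):
    """BytesIO whose contents stay readable after the with block exits."""
    def close(self):
        pass

class DataSaveTest(unittest.TestCase):
    def test_save_writes_pickle_to_named_file(self):
        data = Data(values=[1, 2, 3])
        buffer = KeepOpenBytesIO()
        # Replace the built-in open() only for the duration of the call
        with mock.patch('builtins.open', return_value=buffer) as fake_open:
            data.save('foo.pkl')
        # The right file name and mode were requested
        fake_open.assert_called_once_with('foo.pkl', 'wb')
        # What was written unpickles to an equivalent object
        restored = pickle.loads(buffer.getvalue())
        self.assertEqual(restored.values, data.values)
        self.assertEqual(restored.filename, 'foo.pkl')
```

Saved as a test module, this can be run with python -m unittest. Using mock.patch as a context manager restores the real open() afterwards, which assigning to builtins.open directly does not.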

Load pickled object in different file - Attribute error

I have some trouble with loading a pickled file in a module that is different from the module where I pickled the file. I am aware of the following thread: Unable to load files using pickle and multipile modules. I've tried the proposed solution of importing the class into the module where I am unpickling my file, but it keeps giving me the same error:
AttributeError: Can't get attribute 'Document' on <module '__main__' from ''>
The basic structure of what I am trying to do:
Util file that pickles and unpickles objects, utils.py:
import pickle
def save_document(doc):
    from class_def import Document
    write_file = open(file_path, 'wb')
    pickle.dump(doc, write_file)
def load_document(file_path):
    from class_def import Document
    doc_file = open(file_path, 'rb')
    return pickle.load(doc_file)
File where Document object is defined and the save util method is called, class_def.py:
import utils
class Document(object):
    data = ""
if __name__ == '__main__':
    doc = Document()
    utils.save_document(doc)
File where the load util method is called, process.py:
import utils
if __name__ == '__main__':
    utils.load_document(file_path)
Running process.py gives the mentioned AttributeError. If I import the class_def.py file into process.py and run its main method as mentioned in the original thread it works, but I want to be able to run these two modules separately, since the class_def file is a preprocessing step that takes quite some time. How could I solve this?
In your class_def.py file you have this code:
if __name__ == '__main__':
    doc = Document()
    utils.save_document(doc)
This means that doc will be a __main__.Document object, so when it is pickled, pickle expects to be able to get the Document class from the main module at load time. To fix this, you need to use the definition of Document from a module called class_def, which means adding an import here:
(in general you can just do from <own module name> import * right inside the if __name__ == "__main__")
if __name__ == '__main__':
    from class_def import Document
    # ^ so that it is using the Document class defined under the class_def module
    doc = Document()
    utils.save_document(doc)
That way the class_def.py file is run twice, once as __main__ and once as class_def, but the data will be pickled as a class_def.Document object, so loading it will retrieve the class from the correct place. Otherwise, if you have a way of constructing one Document object from another, you can do something like this in utils.py:
def save_document(doc):
    if doc.__class__.__module__ == "__main__":
        from class_def import Document #get the class from the reference-able module
        doc = Document(doc) #convert it to the class we are able to use
    write_file = open(file_path, 'wb')
    pickle.dump(doc, write_file)
Although usually I'd prefer the first way.
I had a similar problem and only just realized the differences between our implementations.
Your file structure:
util.py
    define pickle functions
class_def.py
    import util
    define class
    make instance
    call save pickle
process.py
    import util
    load pickle
My mistake (using your file names) was first:
util_and_class.py
    define class
    define pickle funcs
    make instance
    call save pickle
process.py
    import util_and_class
    call load pickle << ERROR
What solved my pickle import problem:
util_and_class.py
    define class
    define pickle funcs
pickle_init.py
    import util_and_class
    make instance
    call save pickle
process.py
    call load pickle
This had the welcome side effect that I didn't need to import the util_and_class file explicitly in process.py: the pickle records the module name, and pickle imports that module automatically on load (so it must still be importable). Creating the instance and saving the pickle from a separate file resolved the __name__ issue of "loading a pickled file in a module that is different from the module where I pickled the file."
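The layout that works can be tried end to end with a sketch like the one below. It writes three small files to a temporary directory and runs the two scripts as separate processes. The file contents, the doc.pkl file name and the "preprocessed" value are hypothetical and chosen for illustration; the sketch assumes Python 3.7 or later.

```python
import os
import subprocess
import sys
import tempfile
import textwrap

project_dir = tempfile.mkdtemp()

files = {
    # The class lives in its own module and is never run as __main__
    "class_def.py": """
        class Document(object):
            data = ""
    """,
    # Separate entry point that creates and pickles the instance
    "pickle_init.py": """
        import pickle
        from class_def import Document

        doc = Document()
        doc.data = "preprocessed"
        with open("doc.pkl", "wb") as f:
            pickle.dump(doc, f)
    """,
    # The loader needs no explicit import; pickle imports class_def itself
    "process.py": """
        import pickle

        with open("doc.pkl", "rb") as f:
            doc = pickle.load(f)
        print(type(doc).__module__, doc.data)
    """,
}
for name, source in files.items():
    with open(os.path.join(project_dir, name), "w") as f:
        f.write(textwrap.dedent(source))

# Run the pickling step and the loading step as two independent processes
for script in ("pickle_init.py", "process.py"):
    result = subprocess.run(
        [sys.executable, script], cwd=project_dir,
        capture_output=True, text=True, check=True,
    )
output = result.stdout.strip()
print(output)  # class_def preprocessed
```

Because Document is always reached as class_def.Document, the slow preprocessing step and the loading step can run separately, which is what the question asked for.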
