Unpickling a class raises attribute error - python

I wrote a class A in the Class_A.py file
class A():
    def square(self, num):
        return num*num
Next in pickle_A.py, I saved the class in a file 'class_A_imported_from_class_A'
import pickle
from Class_A import A
if __name__ == '__main__':
    file_name = 'class_A_imported_from_class_A'
    with open(file_name,'wb') as f:
        pickle.dump(A,f,pickle.HIGHEST_PROTOCOL)
Then I moved the file to another location, where I ran Unpickling_imported_class.py to unpickle this file.
import pickle
file_name = 'class_A_imported_from_class_A'
with open(file_name,'rb') as f:
    B = pickle.load(f)
So, I get the error:
B = pickle.load(f)
builtins.ModuleNotFoundError: No module named 'Class_A'
Now, I know that the error goes away if I copy Class_A.py into this folder. The constraint is that I cannot.
I was able to do this using cloudpickle, but then the file also has to be pickled with cloudpickle.
My work demands that I be able to unpickle classes directly, i.e. if there's a pickled file that holds the data for a class, I should be able to read the class from it directly. Is there a way to do that?
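For context on why the load fails, here is a minimal sketch of what the standard pickle module stores for a class. It assumes an in-memory stand-in module named Class_A (built with types.ModuleType purely for illustration, it is not part of the question's setup). The sketch suggests the file holds only the module and class names, so the module has to be importable wherever the file is loaded.

```python
import pickle
import sys
import types

# Hypothetical in-memory stand-in for Class_A.py, so the sketch is self-contained.
class_a = types.ModuleType("Class_A")
exec("class A():\n    def square(self, num):\n        return num*num\n", class_a.__dict__)
sys.modules["Class_A"] = class_a

payload = pickle.dumps(class_a.A, pickle.HIGHEST_PROTOCOL)

# The payload holds the names needed to import the class again, not its body.
names_only = b"Class_A" in payload and b"square" not in payload

# While Class_A is importable, loading returns the very same class object.
same_class = pickle.loads(payload) is class_a.A

# Once the module cannot be imported, loading fails as in the question.
del sys.modules["Class_A"]
load_error = None
try:
    pickle.loads(payload)
except ImportError as exc:  # ModuleNotFoundError is a subclass of ImportError
    load_error = exc

print(names_only, same_class, type(load_error).__name__)
```

This is why a by-value serializer such as cloudpickle has to be used on the writing side as well: a file written by plain pickle simply does not contain the method bodies.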

Creating Python submodule

I want to create a tool called unifile for saving and opening files, used like this: unifile.open.yaml("file.yaml").
This is my structure:
unifile
├── open
│   └── __init__.py
└── save
    └── __init__.py
Code that calls my module:
import unifile
a = unifile.open.yaml("file.yaml")
open/__init__.py
import yaml
class open():
    def yml(self, file_path):
        try:
            with open(file_path, "r", encoding="utf-8") as yaml_conf:
                yaml_file = yaml.safe_load(yaml_conf)
            return yaml_file
        except OSError:
            print("Can't load yaml")
Error 1: when I import unifile, it always says:
module unifile has no attribute open
Error 2: in __init__.py I can't open the file:
[pylint] Context manager 'open' doesn't implement enter and exit. [not-context-manager]
Here is a solution to your problem. Restructure your project as follows:
add a unifile/__init__.py file in unifile itself, not only in its subpackages.
Then the content of the unifile/open/_open.py file:
import yaml
class Open():
    def __init__(self):
        pass
    def yml(self, file_path):
        try:
            with open(file_path, "r", encoding="utf-8") as yaml_conf:
                yaml_file = yaml.safe_load(yaml_conf)
            return yaml_file
        except OSError:
            print("Can't load yaml")
Content of the unifile/__init__.py file:
from .open._open import Open
Then run the program from the terminal.
Also, it is better to create an instance of the class first and then call its methods.
Two issues, two answers.
First, you should add an __init__.py file in unifile. With this, Python will understand that unifile is a package with a subpackage.
Second, open is a built-in function and you shadow it by naming your class open. Change your class name and it should work.
You are getting this error because unifile is not a package. There isn't any __init__.py file at the top level, as there is in open and save. You also cannot call open.yml directly, because open is a class in the package open, so you will have to import open from open, create an instance of it, and then call yml on that instance.
from open import open
a = open().yml('file.yml')
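You are getting the second error because your class shadows Python's built-in function open, which you should avoid doing. open is a built-in name rather than a reserved keyword, so Python lets you rebind it, but inside your module every call to open then refers to your class. Name your class anything that does not collide with a built-in.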
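The shadowing problem described in these answers can be reproduced in a few lines. The sketch below is only illustrative: load_with_shadowed_open is a hypothetical helper, the shadowing class is kept inside a function so it does not leak, and the file is read as plain text instead of going through yaml.safe_load so that nothing beyond the standard library is needed.

```python
import os
import tempfile


def load_with_shadowed_open(path):
    """Reproduce the question's bug inside a function scope (hypothetical helper)."""
    class open:  # same name as the built-in, as in the question
        def yml(self, file_path):
            # "open" here resolves to the class above, not to the built-in function
            with open(file_path, "r", encoding="utf-8") as conf:
                return conf.read()
    try:
        return open().yml(path)
    except TypeError as exc:
        return "TypeError: %s" % exc


class Open:  # renamed, so the built-in open stays reachable
    def yml(self, file_path):
        with open(file_path, "r", encoding="utf-8") as conf:
            return conf.read()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file.yaml")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("key: value\n")
    shadowed_result = load_with_shadowed_open(path)
    renamed_result = Open().yml(path)

print(shadowed_result)
print(repr(renamed_result))
```

With the class named open, the call inside yml tries to construct the class with file arguments and fails before any file is touched. With the class renamed, the same method body works unchanged.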

How do I test that I'm calling pickle.dump() correctly?

I want to test this method:
import pickle
class Data(object):
    def save(self, filename=''):
        if filename:
            self.filename = filename
        if not self.filename:
            raise ValueError('Please provide a path to save to')
        # pickle writes bytes, so Python 3 needs the file opened in binary mode
        with open(self.filename, 'wb') as f:
            pickle.dump(self, f)
I can set up the test to make sure pickle.dump gets called, and that the first argument is the object:
@patch('pickle.dump')
def test_pickle_called(self, dump):
    self.data.save('foo.pkl')
    self.assertTrue(dump.called)
    self.assertEqual(self.data, dump.call_args[0][0])
I'm not sure what to do for the second argument, though. If I open a new file for the test, it's not going to be the same as what gets called for the execution. I'd at least like to be sure I'm opening the right file. Would I just mock open and make sure that gets called with the right name at some point?
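Patch open() and return a writable in-memory buffer from it (StringIO in Python 2, BytesIO in Python 3). Load the pickled data back from that buffer and test its structure and values (test that it's equivalent to self.data). Something like this:
import builtins # or __builtin__ for Python 2
import pickle
from io import BytesIO
from unittest.mock import Mock
class NonClosingBytesIO(BytesIO):
    def close(self):
        pass # keep the buffer readable after save() leaves its with block
builtins.open = open = Mock() # restore the real open after the test
open.return_value = sio = NonClosingBytesIO()
self.data.save('foo.pkl')
new_data = pickle.loads(sio.getvalue()) # loads() takes bytes; load() takes a file object
self.assertEqual(self.data, new_data)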
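If the goal is only to confirm that the right file is opened and that its handle is what reaches pickle.dump, as the question asks, unittest.mock.mock_open may be enough. The following is a rough sketch, not the answerer's method: Data here is a stand-in for the question's class, check_save_opens_right_file is a hypothetical helper, and the 'wb' mode is an assumption about how the file is opened.

```python
import pickle
from unittest.mock import mock_open, patch


class Data:
    """Stand-in for the Data class from the question (assumed to open in 'wb' mode)."""
    filename = ''

    def save(self, filename=''):
        if filename:
            self.filename = filename
        if not self.filename:
            raise ValueError('Please provide a path to save to')
        with open(self.filename, 'wb') as f:
            pickle.dump(self, f)


def check_save_opens_right_file():
    """Hypothetical check: the right path is opened and its handle reaches pickle.dump."""
    data = Data()
    fake_open = mock_open()
    # Patch both open() and pickle.dump so nothing touches the disk.
    with patch('builtins.open', fake_open), patch('pickle.dump') as dump:
        data.save('foo.pkl')
    fake_open.assert_called_once_with('foo.pkl', 'wb')
    obj_arg, file_arg = dump.call_args[0]
    assert obj_arg is data
    # mock_open makes the handle returned by open() the one the with block yields
    assert file_arg is fake_open.return_value
    return True


print(check_save_opens_right_file())
```

Using patch as a context manager also restores the real open and pickle.dump automatically, which assigning to builtins.open directly does not.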

Load pickled object in different file - Attribute error

I have some trouble with loading a pickled file in a module that is different from the module where I pickled the file. I am aware of the following thread: Unable to load files using pickle and multipile modules. I've tried the proposed solution of importing the class into the module where I am unpickling my file, but it keeps giving me the same error:
AttributeError: Can't get attribute 'Document' on <module '__main__' from ''>
The basic structure of what I am trying to do:
Util file that pickles and unpickles objects, utils.py:
import pickle
def save_document(doc):
    from class_def import Document
    write_file = open(file_path, 'wb')
    pickle.dump(doc, write_file)
def load_document(file_path):
    from class_def import Document
    doc_file = open(file_path, 'rb')
    return pickle.load(doc_file)
File where Document object is defined and the save util method is called, class_def.py:
import utils
class Document(object):
    data = ""
if __name__ == '__main__':
    doc = Document()
    utils.save_document(doc)
File where the load util method is called, process.py:
import utils
if __name__ == '__main__':
    utils.load_document(file_path)
Running process.py gives the mentioned AttributeError. If I import the class_def.py file into process.py and run its main method as mentioned in the original thread it works, but I want to be able to run these two modules separately, since the class_def file is a preprocessing step that takes quite some time. How could I solve this?
In your class_def.py file you have this code:
if __name__ == '__main__':
    doc = Document()
    utils.save_document(doc)
This means that doc will be a __main__.Document object, so when it is unpickled, pickle expects to find a Document class in the main module. To fix this, use the definition of Document from a module called class_def, which means adding an import here:
(In general you can just do from <own module name> import * right inside the if __name__ == "__main__" block.)
if __name__ == '__main__':
    from class_def import Document
    # ^ so that it is using the Document class defined under the class_def module
    doc = Document()
    utils.save_document(doc)
That way the class_def.py file runs twice, once as __main__ and once as class_def, but the data will be pickled as a class_def.Document object, so loading it will retrieve the class from the correct place. Otherwise, if you have a way of constructing one Document object from another, you can do something like this in utils.py:
def save_document(doc):
    if doc.__class__.__module__ == "__main__":
        from class_def import Document #get the class from the reference-able module
        doc = Document(doc) #convert it to the class we are able to use
    write_file = open(file_path, 'wb')
    pickle.dump(doc, write_file)
Although usually I'd prefer the first way.
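To check which module a pickle will look for at load time, the payload can be inspected with pickletools. This is a small sketch under stated assumptions: class_def is an in-memory stand-in module created only for the demonstration, referenced_globals is a hypothetical helper, and protocol 2 is chosen because it records the class in a single GLOBAL opcode that is easy to read back.

```python
import pickle
import pickletools
import sys
import types


def referenced_globals(payload):
    """Return the 'module name' strings that a protocol-2 pickle refers to."""
    return [arg for opcode, arg, _pos in pickletools.genops(payload)
            if opcode.name == "GLOBAL"]


# Hypothetical in-memory stand-in for class_def.py.
class_def = types.ModuleType("class_def")
exec("class Document(object):\n    data = ''\n", class_def.__dict__)
sys.modules["class_def"] = class_def

payload = pickle.dumps(class_def.Document(), protocol=2)
refs = referenced_globals(payload)
print(refs)
```

If the same instance were created while class_def.py runs as a script, the recorded module would be __main__ instead of class_def, which is the mismatch the answer above works around.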
I had a similar problem and only just realized the differences between our implementations.
Your file structure:
util.py
    define pickle functions
class_def.py
    import util
    define class
    make instance
    call save pickle
process.py
    import util
    load pickle
My mistake (using your file names) was first:
util_and_class.py
    define class
    define pickle funcs
    make instance
    call save pickle
process.py
    import util_and_class
    call load pickle << ERROR
What solved my pickle import problem:
util_and_class.py
    define class
    define pickle funcs
pickle_init.py
    import util_and_class
    make instance
    call save pickle
process.py
    call load pickle
This had the welcome side effect that I didn't need to import util_and_class for the class itself in process.py: the pickle file records the module name, and pickle imports that module when loading, so it only has to be importable. Creating the instance and saving the pickle from a separate file resolved the __name__ issue of "loading a pickled file in a module that is different from the module where I pickled the file."

Python import library in submodule

I have the following structure:
run.py
app/hdfs_lib_test.py
app/src/HDFSFileReader.py
This is a Flask app.
The HDFSFileReader.py contains:
from hdfs import Config  # import implied by the ImportError reported below
class HDFSFileReader:
    """This class represents a reader for accessing files in a HDFS."""
    def __init__(self):
        pass
    @classmethod
    def read(cls, file_path):
        lines = []
        try:
            client = Config().get_client('dev')
            with client.read('Serien_de', encoding='utf-8', delimiter='\n') as reader:
                for line in reader:
                    lines.append(line)
        except Exception:
            print("ERROR: Could not read from HDFS.")
            raise
        return lines
When I run ./run.py I get the ImportError: No module named 'hdfs'. However the library is installed and I can call python hdfs_lib_test.py, which contains the following:
from hdfs import Config
try:
    client = Config().get_client('dev')
    with client.read('Serien_de',encoding='utf-8',delimiter='\n') as reader:
        for line in reader:
            print(line)
except:
    raise
So to me it seems that the difference is that HDFSFileReader is part of a submodule and hdfs_lib_test.py isn't: the former doesn't work, but the latter does. But how do I fix the import problem?
It can't be a circular import issue: I searched the whole project for imports of hdfs, and hdfs_lib_test.py is not used by the actual project.
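One thing that may be worth checking (an assumption, the thread does not confirm it) is whether ./run.py and python hdfs_lib_test.py actually run under the same interpreter, since a shebang line can select a different Python than the one on the command line. The sketch below uses a hypothetical helper, describe_import, that reports the running interpreter and whether a top-level module can be found from it.

```python
import importlib.util
import sys


def describe_import(name):
    """Report which interpreter is running and whether a top-level module is visible to it."""
    spec = importlib.util.find_spec(name)
    return {
        "interpreter": sys.executable,
        "module": name,
        "found": spec is not None,
        "origin": getattr(spec, "origin", None),
    }


# Calling this from both entry points and comparing the output
# shows whether they see the same installation.
print(describe_import("hdfs"))
```

If the two entry points print different interpreter paths, installing the library for the interpreter that run.py uses (or changing the shebang) would be the thing to try.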

AttributeError when unpickling an object

I'm trying to pickle an instance of a class in one module, and unpickle it in another.
Here's where I pickle:
import cPickle
def pickleObject():
    object = Foo()
    savefile = open('path/to/file', 'wb')  # binary mode, needed for HIGHEST_PROTOCOL
    cPickle.dump(object, savefile, cPickle.HIGHEST_PROTOCOL)
class Foo(object):
    (...)
and here's where I try to unpickle:
savefile = open('path/to/file', 'rb')  # binary mode, to match a binary pickle
object = cPickle.load(savefile)
On that second line, I get AttributeError: 'module' object has no attribute 'Foo'
Anyone see what I'm doing wrong?
class Foo must be importable via the same path in the unpickling environment so that the pickled object can be reinstantiated.
I think your issue is that you define Foo in the module that you are executing as main (__name__ == "__main__"). Pickle serializes the import path to Foo (not the class object or its definition), and records it as living in the main module. Foo is not an attribute of the main unpickling script.
In this example, you could redefine class Foo in the unpickling script and it should unpickle just fine. But the intention is really to have a common library that is shared between the two scripts that will be available by the same path. Example: define Foo in foo.py
Simple Example:
$PROJECT_DIR/foo.py
class Foo(object):
    pass
$PROJECT_DIR/picklefoo.py
import cPickle
from foo import Foo
def pickleObject():
    obj = Foo()
    savefile = open('pickle.txt', 'wb')  # binary mode, needed for HIGHEST_PROTOCOL
    cPickle.dump(obj, savefile, cPickle.HIGHEST_PROTOCOL)
pickleObject()
$PROJECT_DIR/unpicklefoo.py
import cPickle
savefile = open('pickle.txt', 'rb')
obj = cPickle.load(savefile)
...
Jeremy Brown had the right answer, here is a more concrete version of the same point:
import cPickle
import myFooDefiningModule
def pickleObject():
    object = myFooDefiningModule.Foo()
    savefile = open('path/to/file', 'w')
    cPickle.dump(object, savefile)
and:
import cPickle
import myFooDefiningModule
savefile = open('path/to/file', 'r')
object = cPickle.load(savefile)
such that Foo lives in the same namespace in each piece of code.
