I have trouble loading a pickled file in a module other than the one where I pickled it. I am aware of the following thread: Unable to load files using pickle and multiple modules. I tried the proposed solution of importing the class into the module where I unpickle the file, but I keep getting the same error:
AttributeError: Can't get attribute 'Document' on <module '__main__' from ''>
The basic structure of what I am trying to do:
Util file that pickles and unpickles objects, utils.py:
import pickle
def save_document(doc):
    from class_def import Document
    write_file = open(file_path, 'wb')
    pickle.dump(doc, write_file)
def load_document(file_path):
    from class_def import Document
    doc_file = open(file_path, 'rb')
    return pickle.load(doc_file)
File where Document object is defined and the save util method is called, class_def.py:
import utils
class Document(object):
    data = ""
if __name__ == '__main__':
    doc = Document()
    utils.save_document(doc)
File where the load util method is called, process.py:
import utils
if __name__ == '__main__':
    utils.load_document(file_path)
Running process.py gives the mentioned AttributeError. If I import the class_def.py file into process.py and run its main method as mentioned in the original thread it works, but I want to be able to run these two modules separately, since the class_def file is a preprocessing step that takes quite some time. How could I solve this?
In your class_def.py file you have this code:
if __name__ == '__main__':
    doc = Document()
    utils.save_document(doc)
This means that doc is a __main__.Document object, so the pickle records that the class lives in the main module, and unpickling expects to find Document there. To fix this, create the object from the Document defined in the module named class_def, which means adding an import here:
(In general, you can just do from <own module name> import * right inside the if __name__ == "__main__" block.)
if __name__ == '__main__':
    from class_def import Document
    # ^ so that it is using the Document class defined under the class_def module
    doc = Document()
    utils.save_document(doc)
This way the class_def.py file is executed twice, once as __main__ and once as class_def, but the data is pickled as a class_def.Document object, so loading it retrieves the class from the correct place. Alternatively, if you have a way of constructing one Document object from another, you can do something like this in utils.py:
def save_document(doc):
    if doc.__class__.__module__ == "__main__":
        from class_def import Document  # get the class from the referenceable module
        doc = Document(doc)  # convert it to the class we are able to use
    write_file = open(file_path, 'wb')
    pickle.dump(doc, write_file)
Although usually I'd prefer the first way.
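To see which module name ends up in the pickle, here is a minimal sketch. It assumes a throwaway copy of class_def.py written to a temporary directory and uses runpy.run_path to mimic `python class_def.py`; the file contents and the names as_script and as_module are illustrative, not the asker's real code.

```python
import importlib
import pickle
import runpy
import sys
import tempfile
from pathlib import Path

# Illustrative stand-in for class_def.py: it pickles a Document when executed.
SOURCE = '''
import pickle

class Document(object):
    data = ""

payload = pickle.dumps(Document())
'''

with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp, "class_def.py")
    script.write_text(SOURCE)

    # Case 1: execute the file as a script, like `python class_def.py`.
    as_script = runpy.run_path(str(script), run_name="__main__")["payload"]

    # Case 2: import the very same file as the module `class_def`.
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        as_module = importlib.import_module("class_def").payload
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("class_def", None)

# The pickle stores a module name plus a class name, not the class itself.
print(b"__main__" in as_script)
print(b"class_def" in as_module)
```

The same class definition produces two different pickles; only the one that names class_def can be loaded from another script that has class_def on its import path.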
I had a similar problem and only just realized the differences between our implementations.
Your file structure:
util.py
    define pickle functions
class_def.py
    import util
    define class
    make instance
    call save pickle
process.py
    import util
    load pickle
My mistake (using your file names) was first:
util_and_class.py
    define class
    define pickle funcs
    make instance
    call save pickle
process.py
    import util_and_class
    call load pickle << ERROR
What solved my pickle import problem:
util_and_class.py
    define class
    define pickle funcs
pickle_init.py
    import util_and_class
    make instance
    call save pickle
process.py
    call load pickle
This had the welcome side effect that I didn't need to import util_and_class explicitly in process.py: the pickle records the class's module name, and pickle imports that module on load (so the module still has to be importable). Creating the instance and saving the pickle in a separate file resolved the __name__ issue of "loading a pickled file in a module that is different from the module where I pickled the file."
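The working layout above can be sketched end to end as follows. The file contents, the doc.pkl file name, and the "preprocessed" value are assumptions made up for illustration; the sketch writes the three files to a temporary directory and runs the two scripts as separate processes.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical file contents following the three-file layout described above.
FILES = {
    "util_and_class.py": '''
import pickle

class Document(object):
    data = ""

def save_document(doc, file_path):
    with open(file_path, "wb") as write_file:
        pickle.dump(doc, write_file)
''',
    "pickle_init.py": '''
import util_and_class

doc = util_and_class.Document()
doc.data = "preprocessed"
util_and_class.save_document(doc, "doc.pkl")
''',
    "process.py": '''
import pickle

# No explicit import of util_and_class: pickle imports it by name on load.
with open("doc.pkl", "rb") as doc_file:
    doc = pickle.load(doc_file)
print(type(doc).__module__, doc.data)
''',
}

with tempfile.TemporaryDirectory() as tmp:
    for name, source in FILES.items():
        Path(tmp, name).write_text(source)
    # Step 1: the slow preprocessing script, run on its own.
    subprocess.run([sys.executable, "pickle_init.py"], cwd=tmp, check=True)
    # Step 2: the consumer, run later in a separate process.
    result = subprocess.run(
        [sys.executable, "process.py"],
        cwd=tmp, check=True, capture_output=True, text=True,
    )

output = result.stdout.strip()
print(output)
```

Because the instance is created in pickle_init.py from the imported module, the pickle refers to util_and_class.Document rather than __main__.Document, and process.py can load it as long as util_and_class.py is importable.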
Related
Imagine I have a.py:
import tensorflow as tf
class A:
    name = "Class A"
dump.py:
import pickle
from a import A
a = A()
with open("a.pickled", "wb") as f:
    pickle.dump(a, f)
load.py:
import pickle
with open("a.pickled", "rb") as f:
    a = pickle.load(f)
When Python pickles the object a, it effectively also pulls in tensorflow. As a result, when the object is deserialized in load.py, tensorflow is imported, along with a lot of expensive initialization. I am wondering whether we can remove the unwanted tensorflow import in dump.py without modifying a.py.
I tried something like this in dump.py, but it doesn't work for me:
import pickle
from a import A
# Delete tensorflow related import from cache
import sys
names = [name for name in sys.modules.keys() if "tensorflow" in name]
for name in names:
    del sys.modules[name]
a = A()
with open("a.pickled", "wb") as f:
    pickle.dump(a, f)
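A likely reason the sys.modules cleanup in dump.py has no effect: the pickle only stores a reference to the class by module and name, and the loading process imports that module afresh, which runs its top-level imports again. The sketch below illustrates this with two hypothetical stand-in modules, heavy_dep (playing the role of tensorflow) and a_mod (playing the role of a.py).

```python
import importlib
import pickle
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # heavy_dep.py stands in for tensorflow, a_mod.py for a.py (both hypothetical).
    Path(tmp, "heavy_dep.py").write_text("LOADED = True\n")
    Path(tmp, "a_mod.py").write_text(
        "import heavy_dep\n\nclass A:\n    name = 'Class A'\n"
    )
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        a_mod = importlib.import_module("a_mod")
        payload = pickle.dumps(a_mod.A())

        # Simulate a fresh process: forget both modules before loading.
        for name in ("a_mod", "heavy_dep"):
            sys.modules.pop(name, None)

        # Unpickling imports a_mod again, which imports heavy_dep again.
        pickle.loads(payload)
        reimported = "heavy_dep" in sys.modules
    finally:
        sys.path.remove(tmp)
        for name in ("a_mod", "heavy_dep"):
            sys.modules.pop(name, None)

print(reimported)
```

Whatever is removed from sys.modules in the dumping process, the import happens in the loading process, so it is the top-level import in the class's module that decides whether the dependency is loaded.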
I wrote a class A in the Class_A.py file
class A():
    def square(self, num):
        return num*num
Next in pickle_A.py, I saved the class in a file 'class_A_imported_from_class_A'
import pickle
from Class_A import A
if __name__ == '__main__':
    file_name = 'class_A_imported_from_class_A'
    with open(file_name,'wb') as f:
        pickle.dump(A,f,pickle.HIGHEST_PROTOCOL)
Then I moved the file to another location, where I ran Unpickling_imported_class.py to unpickle this file.
import pickle
file_name = 'class_A_imported_from_class_A'
with open(file_name,'rb') as f:
    B = pickle.load(f)
So, I get the error:
B = pickle.load(f)
builtins.ModuleNotFoundError: No module named 'Class_A'
Now, I know that the error will go away if I copy Class_A into this folder. The constraint is that I cannot.
I was able to do this using cloudpickle, but then the file also has to be pickled with cloudpickle.
My work demands that I should be able to unpickle classes directly, i.e. if there's a pickled file that has the data for a class, I should be able to read a class directly. Is there a way that I can do it?
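For context on why the standard pickle file alone is not enough here, this minimal sketch shows that pickling a class stores only a reference to it (module name plus class name), not its code. It recreates Class_A.py in a temporary directory; the variable names are illustrative.

```python
import importlib
import pickle
import sys
import tempfile
from pathlib import Path

CLASS_SOURCE = '''
class A():
    def square(self, num):
        return num * num
'''

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "Class_A.py").write_text(CLASS_SOURCE)
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        Class_A = importlib.import_module("Class_A")
        payload = pickle.dumps(Class_A.A, pickle.HIGHEST_PROTOCOL)
    finally:
        # Make the module unavailable again, as in the "other location".
        sys.path.remove(tmp)
        sys.modules.pop("Class_A", None)

# The payload names the module, but contains nothing of the method body.
stores_module_name = b"Class_A" in payload
stores_method_name = b"square" in payload

# With the module gone, loading fails in the same way as in the question.
try:
    pickle.loads(payload)
    load_error = None
except ModuleNotFoundError as exc:
    load_error = exc

print(stores_module_name, stores_method_name, type(load_error).__name__)
```

Since the method definitions never reach the file, nothing on the loading side can rebuild the class from a standard pickle; the pickling side has to serialize the class by value, which is what cloudpickle does.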
In Python I have a .pyd shared library that is encrypted to .epyd, which I read and decrypt with:
with open('src_nuitka/src.epyd', 'rb') as f:
    my_pyd_module = decrypt(f.read())
Now I would like to import the module using the <class 'bytes'> object my_pyd_module directly without writing to disk first. How can I do this? Since it is not a Python code string, I cannot use exec. Is there an import hook available for this task? All examples of writing import hooks did this using exec or by instantiating the classes directly as in https://dev.to/dangerontheranger/dependency-injection-with-import-hooks-in-python-3-5hap.
So here is my first try using the ideas of @a_guest and https://dev.to/dangerontheranger/dependency-injection-with-import-hooks-in-python-3-5hap (and no en-/decrypting yet):
import importlib.abc
import importlib.machinery
import sys
class DependencyInjectorFinder(importlib.abc.MetaPathFinder):
    def __init__(self, loader):
        self._loader: DependencyInjectorLoader = loader
    def find_spec(self, fullname, path, target=None):
        if fullname == 'src2':
            return importlib.machinery.ModuleSpec(fullname, self._loader)
class DependencyInjectorLoader(importlib.machinery.ExtensionFileLoader):
    def get_data(self, path):
        with open('src_packaged/src_dist/src.pyd', 'rb') as f:
            module = f.read()
        return module
sys.meta_path.append(DependencyInjectorFinder(DependencyInjectorLoader('src2', 'src2')))
import src2
which results in
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: bad argument type for built-in operation
for the last line.
In a main Python file, I import other Python files, say file1, file2, and file3, each of which defines a function named scrape(). I am trying to choose which file's scrape() runs based on user input, like the following:
python main.py file1
Here is the relevant part of my code:
import file1
import file2
import file3
fileName = sys.argv[1]
for func in ['%s.scrape' % fileName]:
    meta, infos = func()
However, I get this error message:
Traceback (most recent call last):
  File "main.py", line 50, in <module>
    meta, infos = func()
TypeError: 'str' object is not callable
Note that it works when I use for func in [file1.scrape]: instead. I just can't use the user input as the name of the imported file. Can someone tell me how to do it?
You are trying to call func as a function, when it's really a string you built from the command-line argument.
For your purposes, as also mentioned in prashant's linked post, you might want to use something like the imp module (deprecated since Python 3.4 and removed in Python 3.12, where importlib takes its place).
Here's a quick example:
import sys
import imp
# `imp.load_source` requires the full path to the module
# This will load the module provided as `user_selection`
# You can then either `import user_selection`, or use the `mod` to access the package internals directly
mod = imp.load_source("user_selection", "/<mypath>/site-packages/pytz/__init__.py")
# I'm using `user_selection` and `mod` instead of `pytz`
import user_selection
print(user_selection.all_timezones)
print(mod.all_timezones)
In your case, you might have to use imp.find_module to get the full path from just the name, or provide the full paths directly on the command line.
This should be a starting point:
import sys
import imp
file_name = sys.argv[1]
f, filename, desc = imp.find_module(file_name, ['/path/where/modules/live'])
mod = imp.load_module("selected_module", f, filename, desc)
mod.scrape()
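On Python versions where imp is no longer available, the same idea can be sketched with importlib.import_module and getattr. The helper name load_callable and its allowed parameter are hypothetical, introduced here only for illustration; the demo call uses the standard-library math module so that the sketch runs without the asker's file1, file2, and file3.

```python
import importlib


def load_callable(module_name, attr_name="scrape", allowed=None):
    """Import module_name by its string name and return its attr_name attribute."""
    # Restrict user input to known module names when a whitelist is given.
    if allowed is not None and module_name not in allowed:
        raise ValueError(f"unknown module: {module_name!r}")
    module = importlib.import_module(module_name)
    func = getattr(module, attr_name)
    if not callable(func):
        raise TypeError(f"{module_name}.{attr_name} is not callable")
    return func


# Demonstrated with a standard-library module so the sketch runs anywhere.
sqrt = load_callable("math", "sqrt", allowed={"math"})
print(sqrt(9.0))
```

In main.py this would be used as scrape = load_callable(sys.argv[1], allowed={"file1", "file2", "file3"}) followed by meta, infos = scrape(), assuming the three files sit on the import path.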
I have added my scripts' path to sys.path, and now when I want to use my defs or classes I do:
import myFile as f
print f.FILE.NAME
But this sometimes makes my code look more confusing, and I want to get rid of the "f". I made the class FILE to use it like an enum in C++, so that it is useful and easy to read. How can I import myFile so that I can use my defs and classes like this:
import myFile
print FILE.NAME
# Error: NameError: file <maya console> line 3: name 'FILE' is not defined #
You can bind to names from a module in your import:
from myFile import FILE
Importing adds one or more names to your namespace. Using import <module> binds the module name; adding as lets you pick your own name for the module.
Using from <module> import <object> you get to bind any of the module attributes to a name (by default the name of the <object> attribute); you can still use as to pick a different name.
Only the names you import are available. When you use just import myFile, only myFile is set, not any of the attributes on myFile.
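The binding rules above can be illustrated with a short sketch. It uses the standard-library math module in place of myFile, and the names m and PI are arbitrary choices for the example.

```python
import math                  # binds only the name `math`
import math as m             # binds the same module object to `m`
from math import pi          # binds only the attribute `pi`
from math import pi as PI    # binds the same attribute under a chosen name

# All of these names refer to the very same objects.
same_module = m is math
same_value = pi is math.pi and PI is math.pi

print(same_module, same_value)
```

Applied to the question, from myFile import FILE makes FILE.NAME available directly, while import myFile alone only makes myFile.FILE.NAME available.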
I have just tested this for you, buddy. Here are the two files in the same directory:
File 1#: main.py
def hey():
    print 'Hey'
File 2#: main2.py
import main
main.hey()
#or
from main import hey
hey()
C:\Users\Baby>python main2.py
output : Hey
I think it will work for you.