Python cPickle import error

I am trying to run a python project. Some part of the code calls a serializer with the following code:
try:
    fo = open(data_file, "rb")
except IOError:
    print "Couldn't open data file: %s" % data_file
    return
try:
    myobject = pickle.load(fo)
except:
    fo.close()
    print "Unexpected error:", sys.exc_info()[0]
    raise
fo.close()
return myobject
When this part of the code is run, I get an error on
myobject = pickle.load(fo)
The error is:
    myobject = pickle.load(fo)
  File "/cs/local/lib/pkg/epd-7.3.1/lib/python2.7/pickle.py", line 1378, in load
    return Unpickler(file).load()
  File "/cs/local/lib/pkg/epd-7.3.1/lib/python2.7/pickle.py", line 858, in load
    dispatch[key](self)
  File "/cs/local/lib/pkg/epd-7.3.1/lib/python2.7/pickle.py", line 1090, in load_global
    klass = self.find_class(module, name)
  File "/cs/local/lib/pkg/epd-7.3.1/lib/python2.7/pickle.py", line 1124, in find_class
    __import__(module)
ImportError: No module named label
I have looked at "Import Error using cPickle in Python", but I can't use any of the solutions there because:
"You can open the file binarily and replace options with the module you replaced the old module options with." => I don't know which binary file the solution is referring to. I don't seem to have any binary file in my package.
In my package, I don't have a module named label that I could import.
I'm very lost and would appreciate any help or suggestions.

When pickle serializes an object, it serializes modules by reference: it records the name of the module (and of the class or function inside it), not the code itself. So if you have a serialized class, class instance, function, or especially a closure, the source code used to build that object might contain an import label, and the pickle then refers to the module label, which cannot be found. A pickled object is a set of instructions telling Python how to turn binary data back into a Python object. If some of the pieces those instructions need are missing, such as a module (which, again, pickle stores by reference), then your unpickle will fail.
You could either try to install the label module, or ask the party who serialized the object to serialize it with a serializer that stores the module contents themselves instead of a reference. I think you can do this with the dill serializer.
If the person who serialized the object had label in their globals and a closure was being serialized, the serializer may have included everything in globals, so label might not even be relevant to your object, but you'd still need it to unserialize the object. You could also ask for a re-pickle with a serializer that is more cautious about including globals, like dill or cloudpickle.
That's basically what Import Error using cPickle in Python is saying in a less general way.
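To make the "by reference" point concrete, here is a minimal sketch (Python 3, standard library only). It builds a throwaway module named label, pickles an instance of a class from it, removes the module, and shows that loading then fails with an ImportError. The class names Label and Replacement and the RemappingUnpickler subclass are hypothetical illustrations, not part of the original project; overriding Unpickler.find_class is one possible way to redirect a missing module to a class you do have, assuming the attributes are compatible.

```python
import io
import pickle
import sys
import types

# Build a throwaway module called "label" (a stand-in for the missing one)
# and define a class inside it.
label = types.ModuleType("label")
exec(
    "class Label:\n"
    "    def __init__(self, text):\n"
    "        self.text = text\n",
    label.__dict__,
)
sys.modules["label"] = label

# The pickle records the names "label" and "Label", not the class body.
payload = pickle.dumps(label.Label("spam"))

# Simulate the receiving machine, where no module named "label" is importable.
del sys.modules["label"]


def try_load(data):
    """Return the unpickled object, or the ImportError raised while loading."""
    try:
        return pickle.loads(data)
    except ImportError as exc:  # ModuleNotFoundError is a subclass on Python 3
        return exc


failure = try_load(payload)


class Replacement:
    """Hypothetical local class standing in for label.Label."""


class RemappingUnpickler(pickle.Unpickler):
    """Redirect references to the missing module to a class we do have."""

    def find_class(self, module, name):
        if (module, name) == ("label", "Label"):
            return Replacement
        return super().find_class(module, name)


restored = RemappingUnpickler(io.BytesIO(payload)).load()
```

Here failure holds the import error, while restored is a Replacement instance carrying the original attributes. Only unpickle data you trust, since find_class can import arbitrary modules.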

Related

Python get line number of classes from imported module

I have been using importlib to get the module from an imported python file and would like to get the line number where each class is defined in the python file.
For example I have something like this:
testSpec = importlib.util.spec_from_file_location("", old_file)
testModule = importlib.util.module_from_spec(testSpec)
testSpec.loader.exec_module(testModule)
Where old_file is a python file (let's call it old.py). Then using the inspect library I can find all of the classes from old.py in this way:
for name, obj in inspect.getmembers(testModule):
    if inspect.isclass(obj):
        print(name)
This gives me all of the created class names from old.py which is correct. However, what I would like to do is also get the line number where this class appears in old.py. I have tried to add this line in the if statement right after the print statement: print(inspect.getsourcelines(obj))
However, this errors out in this way:
  File "old.py", line 665, in getfile
    raise TypeError('{!r} is a built-in class'.format(object))
TypeError: <class '.exampleClassName'> is a built-in class
I am not sure why it considers this user-created class a built-in class, but is there another way that I can just get the line number where this class is defined in old.py? For example if old.py looks like this:
#test comment line 1
#test comment line 2
class exampleClassName:
    test = 0
Then when I have the class object exampleClassName I would expect to print out 4 from inspect.getsourcelines(obj) since it is defined on line 4.
An option would be to loop through the file with open and .readline(), and see where the line matches class, then save the line (count) number and class name into a dict.
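A variant of that file scan, sketched here with the standard ast module instead of string matching so that the word "class" inside comments or strings is not picked up. The helper name class_line_numbers and the sample source are made up for illustration; the function takes the file contents as a string, so for a real file you would pass in the result of reading old.py.

```python
import ast


def class_line_numbers(source):
    """Map each class name in `source` to the 1-based line of its definition."""
    tree = ast.parse(source)
    return {
        node.name: node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.ClassDef)
    }


# Sample source with the class statement on line 4.
sample = (
    "#test comment line 1\n"
    "#test comment line 2\n"
    "\n"
    "class exampleClassName:\n"
    "    test = 0\n"
)
lines = class_line_numbers(sample)
```

Because this parses the source rather than importing it, it does not depend on how the module was loaded with importlib.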

How to serialize a scandir.DirEntry in Python for sending through a network socket?

I have server and client programs that communicate with each other through a network socket.
What I want is to send a directory entry (scandir.DirEntry) obtained from scandir.scandir() through the socket.
For now I am using pickle and cPickle modules and have come up with the following (excerpt only):
import scandir, pickle
s = scandir.scandir("D:\\PYTHON")
entry = s.next()
data = pickle.dumps(entry)
However, I am getting the following error stack:
  File "untitled.py", line 5, in <module>
    data = pickle.dumps(item)
  File "C:\Python27\Lib\pickle.py", line 1374, in dumps
    Pickler(file, protocol).dump(obj)
  File "C:\Python27\Lib\pickle.py", line 224, in dump
    self.save(obj)
  File "C:\Python27\Lib\pickle.py", line 306, in save
    rv = reduce(self.proto)
  File "C:\Python27\Lib\copy_reg.py", line 70, in _reduce_ex
    raise TypeError, "can't pickle %s objects" % base.__name__
TypeError: can't pickle DirEntry objects
How can I get rid of this error?
I have heard of using marshal or JSON.
UPDATE: JSON is not dumping all the data within the object.
Is there any completely different way to do so to send the object through the socket?
Thanks in advance for any help.
Yes, os.DirEntry objects are intended to be short-lived, not really kept around or serialized. If you need the data in them to be serialized, looks like you've figured that out in your own answer -- serialize (pickle) a dict version of the attributes you need.
To deserialize into an object that walks and quacks like an os.DirEntry instance, create a PseudoDirEntry class that mimics the things you need.
Note that you can directly serialize the stat object already, which saves you picking the fields out of that.
Combined, that would look like this:
class PseudoDirEntry:
    def __init__(self, name, path, is_dir, stat):
        self.name = name
        self.path = path
        self._is_dir = is_dir
        self._stat = stat
    def is_dir(self):
        return self._is_dir
    def stat(self):
        return self._stat
And then:
>>> import os, pickle
>>> entry = list(os.scandir())[0]
>>> pickled = pickle.dumps({'name': entry.name, 'path': entry.path, 'is_dir': entry.is_dir(), 'stat': entry.stat()})
>>> loaded = pickle.loads(pickled)
>>> pseudo = PseudoDirEntry(loaded['name'], loaded['path'], loaded['is_dir'], loaded['stat'])
>>> pseudo.name
'.DS_Store'
>>> pseudo.is_dir()
False
>>> pseudo.stat()
os.stat_result(st_mode=33188, st_ino=8370294, st_dev=16777220, st_nlink=1, st_uid=502, st_gid=20, st_size=8196, st_atime=1478356967, st_mtime=1477601172, st_ctime=1477601172)
Well, I have figured out myself that for instances of non-standard classes like scandir.DirEntry, the best way is to convert the class member data into a (possibly nested) combination of standard objects (list, dict, etc.).
For example, in the particular case of scandir.DirEntry, it can be done as follows.
import scandir, pickle
s = scandir.scandir("D:\\PYTHON")
entry = s.next()
# first convert the stat object to st_
st = entry.stat()
st_ = {'st_mode': st.st_mode, 'st_size': st.st_size,
       'st_atime': st.st_atime, 'st_mtime': st.st_mtime,
       'st_ctime': st.st_ctime}
# now convert the entry object to entry_
entry_ = {'name': entry.name, 'is_dir': entry.is_dir(),
          'path': entry.path, 'stat': st_}
# one may need some other class member data also as necessary
# now pickle the converted entry_
data = pickle.dumps(entry_)
For my purpose I only require the data after unpickling at the other end, but one may need to reconstruct a scandir.DirEntry-like object 'entry' from the unpickled entry_. However, I have yet to figure out how to reconstruct the class instance and set the data so that methods like is_dir() and stat() behave correctly.
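One possible way to do that reconstruction, as a rough sketch for Python 3 using os.scandir from the standard library rather than the scandir backport: convert the entry to a nested dict as above, pickle it, and on the receiving side wrap the dict in a small class whose is_dir() and stat() return the stored values. The names entry_to_dict, RebuiltEntry and STAT_FIELDS are hypothetical, and the temporary directory only exists to make the example self-contained.

```python
import os
import pickle
import tempfile
from types import SimpleNamespace

# Stat fields to carry across; extend as needed.
STAT_FIELDS = ("st_mode", "st_size", "st_atime", "st_mtime", "st_ctime")


def entry_to_dict(entry):
    """Flatten a DirEntry into plain, picklable built-in types."""
    st = entry.stat()
    return {
        "name": entry.name,
        "path": entry.path,
        "is_dir": entry.is_dir(),
        "stat": {field: getattr(st, field) for field in STAT_FIELDS},
    }


class RebuiltEntry:
    """Hypothetical stand-in that mimics the parts of DirEntry we stored."""

    def __init__(self, data):
        self.name = data["name"]
        self.path = data["path"]
        self._is_dir = data["is_dir"]
        # SimpleNamespace gives attribute access (st.st_size) like a stat result.
        self._stat = SimpleNamespace(**data["stat"])

    def is_dir(self):
        return self._is_dir

    def stat(self):
        return self._stat


# Sender side: scan a temporary directory containing one small file.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "example.txt"), "w") as fh:
        fh.write("hello")
    with os.scandir(tmp) as it:
        wire_bytes = pickle.dumps(entry_to_dict(next(it)))

# Receiver side: these bytes are what would travel over the socket.
rebuilt = RebuiltEntry(pickle.loads(wire_bytes))
```

The rebuilt object only reports what was captured at scan time; it does not touch the file system again.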

Deserialize Protobuf in python from class name

How can I deserialize a protocol buffer message, knowing only the string name of the protoc-generated class?
For some reason, the fully qualified name of the message that I get with DESCRIPTOR.full_name does not match the actual location of the python class, so I am unable to deserialize it with the following function:
def get_class( kls ):
    """Get class given a fully qualified name of a class"""
    parts = kls.split('.')
    module = ".".join(parts[:-1])
    m = __import__( module )
    for comp in parts[1:]:
        m = getattr(m, comp)
    return m
I just get ImportError no module (name).
Any help appreciated.
P.S.: In case it helps, the bigger problem I am trying to solve is to serialize the protobuf message to the database, and then deserialize it back out (I am using postgresql with sqlalchemy in this case). Since regular python pickle doesn't work, I am hoping to store a tuple (message_cls_name, message_binary), where message_cls_name is the fully qualified name of the protobuf message and message_binary is the result of calling SerializeToString on the message. The part I am not clear about is how to get the message out and deserialize it into the proper protobuf class.
Here is an example solution:
from ocmg_common.protobuf.rpc_pb2 import Exception as PBException
from importlib import import_module
pb_message = PBException(message='hello world')
pb_message_string = pb_message.SerializeToString()
def deserialize(message, typ):
    module_, class_ = typ.rsplit('.', 1)
    class_ = getattr(import_module(module_), class_)
    rv = class_()
    rv.ParseFromString(message)
    return rv
print(deserialize(pb_message_string, 'ocmg_common.protobuf.rpc_pb2.Exception'))
will output
$ python test.py
message: "hello world"
If you don't know the module but only have DESCRIPTOR.full_name, you have to name them the same way, or create a function that maps a full descriptor name to a module. Otherwise you are stuck and don't know where to import the class from.
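The mapping function mentioned above could look roughly like this. It is only a sketch: the registry DESCRIPTOR_TO_PYTHON and the descriptor name "demo.Fraction" are invented for illustration, and a standard-library class stands in for a protoc-generated message class so the example runs without protobuf installed.

```python
from importlib import import_module

# Hypothetical registry: descriptor full name -> dotted Python path of the class.
# In a real project the values would point at generated *_pb2 modules.
DESCRIPTOR_TO_PYTHON = {
    "demo.Fraction": "fractions.Fraction",
}


def resolve_class(full_name, registry=DESCRIPTOR_TO_PYTHON):
    """Look up the Python class registered for a descriptor full name."""
    dotted_path = registry[full_name]  # KeyError if the name is unknown
    module_name, class_name = dotted_path.rsplit(".", 1)
    return getattr(import_module(module_name), class_name)


cls = resolve_class("demo.Fraction")
```

With a real message class, the result would then be instantiated and fed to ParseFromString as in the deserialize function above.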
I hope it will help you.. ;)
Best of luck..

How is calling module and function by string handled in python?

The question "Calling a function of a module from a string with the function's name in Python" shows us how to call a function by using getattr(foo, "bar")(), but this assumes that we have the module foo imported already.
How would we then go about executing "foo.bar", assuming that we probably also have to perform the import of foo (or from bar import foo)?
Use the __import__(....) function:
http://docs.python.org/library/functions.html#import
(David almost had it, but I think his example is more appropriate for what to do if you want to redefine the normal import process - to e.g. load from a zip file)
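You can use find_module and load_module from the imp module to load a module whose name and/or location is determined at execution time. (Note that imp was deprecated in Python 3.4 and removed in Python 3.12; importlib is its replacement.)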
The example at the end of the documentation topic explains how:
import imp
import sys
def __import__(name, globals=None, locals=None, fromlist=None):
    # Fast path: see if the module has already been imported.
    try:
        return sys.modules[name]
    except KeyError:
        pass
    # If any of the following calls raises an exception,
    # there's a problem we can't handle -- let the caller handle it.
    fp, pathname, description = imp.find_module(name)
    try:
        return imp.load_module(name, fp, pathname, description)
    finally:
        # Since we may exit via an exception, close fp explicitly.
        if fp:
            fp.close()
Here is what I finally came up with to get the function I wanted back out from a dotted name:
from string import join
def dotsplit(dottedname):
    module = join(dottedname.split('.')[:-1],'.')
    function = dottedname.split('.')[-1]
    return module, function
def load(dottedname):
    mod, func = dotsplit(dottedname)
    try:
        mod = __import__(mod, globals(), locals(), [func,], -1)
        return getattr(mod,func)
    except (ImportError, AttributeError):
        return dottedname
Is the solution from "Calling a function of a module from a string with the function's name in Python" not working?
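On Python 3 the same idea can be sketched with importlib.import_module, which returns the named submodule directly, so no fromlist trick is needed. The function name load mirrors the one above; unlike that version, this sketch lets ImportError and AttributeError propagate instead of returning the string.

```python
from importlib import import_module


def load(dottedname):
    """Import the module part of 'pkg.mod.func' and return the named attribute."""
    module_name, _, attr_name = dottedname.rpartition(".")
    if not module_name:
        raise ValueError("expected 'module.attribute', got %r" % dottedname)
    return getattr(import_module(module_name), attr_name)


# Resolve and call functions that were never imported by name in this file.
sqrt = load("math.sqrt")
join = load("os.path.join")
root = sqrt(16)
```

Only pass dotted names from a trusted source, since importing a module executes its top-level code.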

PicklingError: Can't pickle <class 'decimal.Decimal'>: it's not the same object as decimal.Decimal

This is the error I got today at filmaster.com (http://filmaster.com):
PicklingError: Can't pickle <class 'decimal.Decimal'>: it's not the same object as decimal.Decimal
What does that exactly mean? It does not seem to be making a lot of sense...
It seems to be connected with django caching. You can see the whole traceback here:
Traceback (most recent call last):
  File "/home/filmaster/django-trunk/django/core/handlers/base.py", line 92, in get_response
    response = callback(request, *callback_args, **callback_kwargs)
  File "/home/filmaster/film20/film20/core/film_views.py", line 193, in show_film
    workflow.set_data_for_authenticated_user()
  File "/home/filmaster/film20/film20/core/film_views.py", line 518, in set_data_for_authenticated_user
    object_id = self.the_film.parent.id)
  File "/home/filmaster/film20/film20/core/film_helper.py", line 179, in get_others_ratings
    set_cache(CACHE_OTHERS_RATINGS, str(object_id) + "_" + str(user_id), userratings)
  File "/home/filmaster/film20/film20/utils/cache_helper.py", line 80, in set_cache
    return cache.set(CACHE_MIDDLEWARE_KEY_PREFIX + full_path, result, get_time(cache_string))
  File "/home/filmaster/django-trunk/django/core/cache/backends/memcached.py", line 37, in set
    self._cache.set(smart_str(key), value, timeout or self.default_timeout)
  File "/usr/lib/python2.5/site-packages/cmemcache.py", line 128, in set
    val, flags = self._convert(val)
  File "/usr/lib/python2.5/site-packages/cmemcache.py", line 112, in _convert
    val = pickle.dumps(val, 2)
PicklingError: Can't pickle <class 'decimal.Decimal'>: it's not the same object as decimal.Decimal
And the source code for Filmaster can be downloaded from here: bitbucket.org/filmaster/filmaster-test
Any help will be greatly appreciated.
I got this error when running in a Jupyter notebook. I think the problem was that I was using %load_ext autoreload and %autoreload 2. Restarting my kernel and rerunning solved the problem.
One oddity of pickle is that the way you import a class before you pickle one of its instances can subtly change the pickled object. Pickle requires you to have imported the object identically both before you pickle it and before you unpickle it.
So for example:
from a.b import c
C = c()
pickler.dump(C)
will make a subtly different object (sometimes) to:
from a import b
C = b.c()
pickler.dump(C)
Try fiddling with your imports, it might correct the problem.
I will demonstrate the problem with simple Python classes in Python 2.7:
In [13]: class A: pass
In [14]: class B: pass
In [15]: A
Out[15]: <class __main__.A at 0x7f4089235738>
In [16]: B
Out[16]: <class __main__.B at 0x7f408939eb48>
In [17]: A.__name__ = "B"
In [18]: pickle.dumps(A)
---------------------------------------------------------------------------
PicklingError: Can't pickle <class __main__.B at 0x7f4089235738>: it's not the same object as __main__.B
This error is shown because we are trying to dump A, but since we changed its name to that of another object, "B", pickle cannot tell which object to dump: class A or class B. The pickle authors anticipated this and put in a check for it.
Solution:
Check whether the object you are trying to dump has a name that conflicts with another object.
I have demonstrated debugging for the case presented above with ipython and ipdb below:
PicklingError: Can't pickle <class __main__.B at 0x7f4089235738>: it's not the same object as __main__.B
In [19]: debug
> /<path to pickle dir>/pickle.py(789)save_global()
787 raise PicklingError(
788 "Can't pickle %r: it's not the same object as %s.%s" %
--> 789 (obj, module, name))
790
791 if self.proto >= 2:
ipdb> pp (obj, module, name) **<------------- you are trying to dump obj which is class A from the pickle.dumps(A) call.**
(<class __main__.B at 0x7f4089235738>, '__main__', 'B')
ipdb> getattr(sys.modules[module], name) **<------------- this is the conflicting definition in the module (__main__ here) with same name ('B' here).**
<class __main__.B at 0x7f408939eb48>
I hope this saves some headaches! Adios!!
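The check performed in that debugging session can be packaged as a small diagnostic, sketched here for Python 3. It looks up the class by the module and qualified name that pickle would record and compares the result with the class itself. The helper names lookup_by_reference and same_object_check and the module name rename_demo are invented for this example. Note that Python 3 pickles classes by __qualname__, so the demo changes that attribute rather than __name__.

```python
import decimal
import importlib
import sys
import types


def lookup_by_reference(cls):
    """Return whatever lives at cls.__module__ + cls.__qualname__ right now."""
    target = importlib.import_module(cls.__module__)
    for part in cls.__qualname__.split("."):
        target = getattr(target, part)
    return target


def same_object_check(cls):
    """True if looking the class up by name yields the very same object."""
    try:
        return lookup_by_reference(cls) is cls
    except (ImportError, AttributeError):
        return False


# A throwaway module with two classes, mirroring the A/B demonstration above.
demo = types.ModuleType("rename_demo")
sys.modules["rename_demo"] = demo
exec("class A: pass\nclass B: pass\n", demo.__dict__)

healthy = same_object_check(decimal.Decimal)
before_rename = same_object_check(demo.A)
demo.A.__qualname__ = "B"  # A now claims to be B, which is a different object
after_rename = same_object_check(demo.A)
```

When the check returns False for a class, pickling it or its instances by reference is expected to fail in the way described above.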
I can't explain why this is failing either, but my own solution to fix this was to change all my code from doing
from point import Point
to
import point
this one change and it worked. I'd love to know why... hth
There can be issues starting a process with multiprocessing by calling __init__. Here's a demo:
import multiprocessing as mp

class SubProcClass:
    def __init__(self, pipe, startloop=False):
        self.pipe = pipe
        if startloop:
            self.do_loop()

    def do_loop(self):
        while True:
            req = self.pipe.recv()
            self.pipe.send(req * req)

class ProcessInitTest:
    def __init__(self, spawn=False):
        if spawn:
            mp.set_start_method('spawn')
        (self.msg_pipe_child, self.msg_pipe_parent) = mp.Pipe(duplex=True)

    def start_process(self):
        subproc = SubProcClass(self.msg_pipe_child)
        self.trig_proc = mp.Process(target=subproc.do_loop, args=())
        self.trig_proc.daemon = True
        self.trig_proc.start()

    def start_process_fail(self):
        self.trig_proc = mp.Process(target=SubProcClass.__init__, args=(self.msg_pipe_child,))
        self.trig_proc.daemon = True
        self.trig_proc.start()

    def do_square(self, num):
        # Note: this is a synchronous usage of mp,
        # which doesn't make sense. But this is just for demo
        self.msg_pipe_parent.send(num)
        msg = self.msg_pipe_parent.recv()
        print('{}^2 = {}'.format(num, msg))
Now, with the above code, if we run this:
if __name__ == '__main__':
    t = ProcessInitTest(spawn=True)
    t.start_process_fail()
    for i in range(1000):
        t.do_square(i)
We get this error:
Traceback (most recent call last):
  File "start_class_process1.py", line 40, in <module>
    t.start_process_fail()
  File "start_class_process1.py", line 29, in start_process_fail
    self.trig_proc.start()
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/multiprocessing/process.py", line 105, in start
    self._popen = self._Popen(self)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/multiprocessing/context.py", line 212, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/multiprocessing/context.py", line 274, in _Popen
    return Popen(process_obj)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/multiprocessing/popen_spawn_posix.py", line 33, in __init__
    super().__init__(process_obj)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/multiprocessing/popen_fork.py", line 21, in __init__
    self._launch(process_obj)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/multiprocessing/popen_spawn_posix.py", line 48, in _launch
    reduction.dump(process_obj, fp)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/multiprocessing/reduction.py", line 59, in dump
    ForkingPickler(file, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <function SubProcClass.__init__ at 0x10073e510>: it's not the same object as __main__.__init__
And if we change it to use fork instead of spawn:
if __name__ == '__main__':
    t = ProcessInitTest(spawn=False)
    t.start_process_fail()
    for i in range(1000):
        t.do_square(i)
We get this error:
Process Process-1:
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/multiprocessing/process.py", line 254, in _bootstrap
    self.run()
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
TypeError: __init__() missing 1 required positional argument: 'pipe'
But if we call the start_process method, which doesn't call __init__ in the mp.Process target, like this:
if __name__ == '__main__':
    t = ProcessInitTest(spawn=False)
    t.start_process()
    for i in range(1000):
        t.do_square(i)
It works as expected (whether we use spawn or fork).
Did you somehow reload(decimal), or monkeypatch the decimal module to change the Decimal class? These are the two things most likely to produce such a problem.
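The effect of a reload can be reproduced in a few lines, sketched here for Python 3. A class is defined in a throwaway module, a reference to it is kept, and the class statement is then executed again, which is roughly what reload() or autoreload does. The old class object still exists, but the module-level name now points at a new one, so pickle refuses it. The module name shapes_demo and the class Point are made up for the demonstration.

```python
import pickle
import sys
import types

# A throwaway module registered under a made-up name.
mod = types.ModuleType("shapes_demo")
sys.modules["shapes_demo"] = mod

CLASS_SOURCE = "class Point:\n    pass\n"
exec(CLASS_SOURCE, mod.__dict__)
old_point = mod.Point  # keep a reference to the first class object

# Executing the class statement again mimics a reload: the name
# shapes_demo.Point is rebound to a brand-new class object.
exec(CLASS_SOURCE, mod.__dict__)


def try_dumps(obj):
    """Return None on success, or the PicklingError message on failure."""
    try:
        pickle.dumps(obj)
        return None
    except pickle.PicklingError as exc:
        return str(exc)


error_old_class = try_dumps(old_point)       # stale class: fails
error_old_instance = try_dumps(old_point())  # instance of stale class: fails
error_new_instance = try_dumps(mod.Point())  # current class: works
```

This is consistent with the reports above that restarting the kernel helps: after a restart, no stale objects created from the old class remain.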
Same happened to me
Restarting the kernel worked for me
Because of the reputation restrictions I cannot comment, but the answer of Salim Fahedy and following its debugging path helped me identify a cause for this error, even when using dill instead of pickle:
Under the hood, dill also calls some functions of pickle, and pickle._Pickler.save_global() performs an import. To me this seems more of a "hack" than a real solution, as the method fails as soon as the class of the instance you are trying to pickle is not imported from the lowest level of the package the class is in. Sorry for the bad explanation; maybe examples are more suitable:
The following example would fail:
from oemof import solph
...
(some code here, giving you the object 'es')
...
model = solph.Model(es)
pickle.dump(model, open('file.pickle', 'wb'))
It fails, because while you can use solph.Model, the class actually is oemof.solph.models.Model for example. The save_global() resolves that (or some function before that which passes it to save_global()), but then imports Model from oemof.solph.models and throws an error, because it's not the same import as from oemof import solph.Model (or something like that, I'm not 100% sure about the workings).
The following example would work:
from oemof.solph.models import Model
...
(some code here, giving you the object 'es')
...
model = Model(es)
pickle.dump(model, open('file.pickle', 'wb'))
It works, because now the Model object is imported from the same place, the pickle._Pickler.save_global() imports the comparison object (obj2) from.
Long story short: When pickling an object, make sure to import the class from the lowest possible level.
Addition: This also seems to apply to objects stored in the attributes of the class-instance you want to pickle. If for example model had an attribute es that itself is an object of the class oemof.solph.energysystems.EnergySystem, we would need to import it as:
from oemof.solph.energysystems import EnergySystem
es = EnergySystem()
My issue was that I had a function with the same name defined twice in a file. So I guess it was confused about which one it was trying to pickle.
I had the same problem while debugging (Spyder). Everything worked normally if I ran the program, but as soon as I started debugging I got the PicklingError.
However, once I chose the option "Execute in dedicated console" under "Run configuration per file" (shortcut: Ctrl+F6), everything worked as expected. I do not know exactly why that makes a difference.
Note: In my script I have many imports like
from PyQt5.QtWidgets import *
from PyQt5.Qt import *
from matplotlib.backends.backend_qt5agg import FigureCanvasQTAgg as FigureCanvas
import os, sys, re, math
My basic understanding was that the star (*) imports were causing this PicklingError.
I had a problem that no one has mentioned yet. I have a package with an __init__.py file that does, among other things:
from .mymodule import cls
Then my top-level code says:
import mypkg
obj = mypkg.cls()
The problem with this is that in my top-level code, the type appears to be mypkg.cls, but it's actually mypkg.mymodule.cls. Using the full path:
obj = mypkg.mymodule.cls()
avoids the error.
I had the same error in Spyder. It turned out to be simple in my case. I defined a class named "Class" in a file also named "Class". I changed the name of the class in the definition to "Class_obj". pickle.dump(Class_obj,fileh) works, but pickle.dump(Class,fileh) does not when it's saved in a file named "Class".
This function solves the mentioned error, but for me it led to another error, 'permission denied', which came out of the blue. However, it might help someone find a solution, so I am still posting the function:
import tempfile
from tensorflow.keras.models import save_model, load_model, Model

# Hotfix function
def make_keras_picklable():
    def __getstate__(self):
        model_str = ""
        # Note: reopening fd.name while fd is still open works on Unix
        # but not on Windows.
        with tempfile.NamedTemporaryFile(suffix='.hdf5', delete=True) as fd:
            save_model(self, fd.name, overwrite=True)
            model_str = fd.read()
        d = {'model_str': model_str}
        return d

    def __setstate__(self, state):
        with tempfile.NamedTemporaryFile(suffix='.hdf5', delete=True) as fd:
            fd.write(state['model_str'])
            fd.flush()
            model = load_model(fd.name)
        self.__dict__ = model.__dict__

    cls = Model
    cls.__getstate__ = __getstate__
    cls.__setstate__ = __setstate__

# Run the function
make_keras_picklable()
### create then save your model here ###
