Saving and loading objects and using pickle - python

I´m trying to save and load objects using pickle module.
First I declare my objects:
>>> class Fruits:pass
...
>>> banana = Fruits()
>>> banana.color = 'yellow'
>>> banana.value = 30
After that I open a file called 'Fruits.obj'(previously I created a new .txt file and I renamed 'Fruits.obj'):
>>> import pickle
>>> filehandler = open(b"Fruits.obj","wb")
>>> pickle.dump(banana,filehandler)
After do this I close my session and I began a new one and I put the next (trying to access to the object that it supposed to be saved):
file = open("Fruits.obj",'r')
object_file = pickle.load(file)
But I have this message:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python31\lib\pickle.py", line 1365, in load
encoding=encoding, errors=errors).load()
ValueError: read() from the underlying stream did notreturn bytes
I don´t know what to do because I don´t understand this message.
Does anyone know How I can load my object 'banana'?
Thank you!
EDIT:
As some of you have sugested I put:
>>> import pickle
>>> file = open("Fruits.obj",'rb')
There were no problem, but the next I put was:
>>> object_file = pickle.load(file)
And I have error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python31\lib\pickle.py", line 1365, in load
encoding=encoding, errors=errors).load()
EOFError

As for your second problem:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python31\lib\pickle.py", line
1365, in load encoding=encoding,
errors=errors).load() EOFError
After you have read the contents of the file, the file pointer will be at the end of the file - there will be no further data to read. You have to rewind the file so that it will be read from the beginning again:
file.seek(0)
What you usually want to do though, is to use a context manager to open the file and read data from it. This way, the file will be automatically closed after the block finishes executing, which will also help you organize your file operations into meaningful chunks.
Finally, cPickle is a faster implementation of the pickle module in C. So:
In [1]: import _pickle as cPickle
In [2]: d = {"a": 1, "b": 2}
In [4]: with open(r"someobject.pickle", "wb") as output_file:
...: cPickle.dump(d, output_file)
...:
# pickle_file will be closed at this point, preventing your from accessing it any further
In [5]: with open(r"someobject.pickle", "rb") as input_file:
...: e = cPickle.load(input_file)
...:
In [7]: print e
------> print(e)
{'a': 1, 'b': 2}

The following works for me:
class Fruits: pass
banana = Fruits()
banana.color = 'yellow'
banana.value = 30
import pickle
filehandler = open("Fruits.obj","wb")
pickle.dump(banana,filehandler)
filehandler.close()
file = open("Fruits.obj",'rb')
object_file = pickle.load(file)
file.close()
print(object_file.color, object_file.value, sep=', ')
# yellow, 30

You're forgetting to read it as binary too.
In your write part you have:
open(b"Fruits.obj","wb") # Note the wb part (Write Binary)
In the read part you have:
file = open("Fruits.obj",'r') # Note the r part, there should be a b too
So replace it with:
file = open("Fruits.obj",'rb')
And it will work :)
As for your second error, it is most likely cause by not closing/syncing the file properly.
Try this bit of code to write:
>>> import pickle
>>> filehandler = open(b"Fruits.obj","wb")
>>> pickle.dump(banana,filehandler)
>>> filehandler.close()
And this (unchanged) to read:
>>> import pickle
>>> file = open("Fruits.obj",'rb')
>>> object_file = pickle.load(file)
A neater version would be using the with statement.
For writing:
>>> import pickle
>>> with open('Fruits.obj', 'wb') as fp:
>>> pickle.dump(banana, fp)
For reading:
>>> import pickle
>>> with open('Fruits.obj', 'rb') as fp:
>>> banana = pickle.load(fp)

Always open in binary mode, in this case
file = open("Fruits.obj",'rb')

You can use anycache to do the job for you. Assuming you have a function myfunc which creates the instance:
from anycache import anycache
class Fruits:pass
#anycache(cachedir='/path/to/your/cache')
def myfunc()
banana = Fruits()
banana.color = 'yellow'
banana.value = 30
return banana
Anycache calls myfunc at the first time and pickles the result to a
file in cachedir using an unique identifier (depending on the the function name and the arguments) as filename.
On any consecutive run, the pickled object is loaded.
If the cachedir is preserved between python runs, the pickled object is taken from the previous python run.
The function arguments are also taken into account.
A refactored implementation works likewise:
from anycache import anycache
class Fruits:pass
#anycache(cachedir='/path/to/your/cache')
def myfunc(color, value)
fruit = Fruits()
fruit.color = color
fruit.value = value
return fruit

You didn't open the file in binary mode.
open("Fruits.obj",'rb')
Should work.
For your second error, the file is most likely empty, which mean you inadvertently emptied it or used the wrong filename or something.
(This is assuming you really did close your session. If not, then it's because you didn't close the file between the write and the read).
I tested your code, and it works.

It seems you want to save your class instances across sessions, and using pickle is a decent way to do this. However, there's a package called klepto that abstracts the saving of objects to a dictionary interface, so you can choose to pickle objects and save them to a file (as shown below), or pickle the objects and save them to a database, or instead of use pickle use json, or many other options. The nice thing about klepto is that by abstracting to a common interface, it makes it easy so you don't have to remember the low-level details of how to save via pickling to a file, or otherwise.
Note that It works for dynamically added class attributes, which pickle cannot do...
dude#hilbert>$ python
Python 2.7.6 (default, Nov 12 2013, 13:26:39)
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from klepto.archives import file_archive
>>> db = file_archive('fruits.txt')
>>> class Fruits: pass
...
>>> banana = Fruits()
>>> banana.color = 'yellow'
>>> banana.value = 30
>>>
>>> db['banana'] = banana
>>> db.dump()
>>>
Then we restart…
dude#hilbert>$ python
Python 2.7.6 (default, Nov 12 2013, 13:26:39)
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from klepto.archives import file_archive
>>> db = file_archive('fruits.txt')
>>> db.load()
>>>
>>> db['banana'].color
'yellow'
>>>
Klepto works on python2 and python3.
Get the code here:
https://github.com/uqfoundation

Related

'ZipExtFile' object has no attribute 'open'

I have a file f that i pulled from a url response:
<zipfile.ZipExtFile name='some_data_here.csv' mode='r' compress_type=deflate>
But for some reason i can't do f.open('some_data_here.csv')
I get an error saying:
'ZipExtFile' object has no attribute 'open'
I'm really confused because isn't that one of the attributes of ZipExtFile?
I'm really confused because isn't that one of the attributes of ZipExtFile?
A ZipExtFile object is what you get when you call open on a ZipFile:
>>> import zipfile
>>> z = zipfile.ZipFile('arduino-ide_2.0.3_Linux_64bit.zip')
>>> f = z.open('arduino-ide_2.0.3_Linux_64bit/LICENSE.electron.txt')
>>> f
<zipfile.ZipExtFile name='arduino-ide_2.0.3_Linux_64bit/LICENSE.electron.txt' mode='r' compress_type=deflate>
There is no open method on a ZipExtFile because it's already an open file. You can read or readline from it:
>>> f.readline()
b'Copyright (c) Electron contributors\n'

Porting pickle py2 to py3 strings become bytes

I have a pickle file that was created with python 2.7 that I'm trying to port to python 3.6. The file is saved in py 2.7 via pickle.dumps(self.saved_objects, -1)
and loaded in python 3.6 via loads(data, encoding="bytes") (from a file opened in rb mode). If I try opening in r mode and pass encoding=latin1 to loads I get UnicodeDecode errors. When I open it as a byte stream it loads, but literally every string is now a byte string. Every object's __dict__ keys are all b"a_variable_name" which then generates attribute errors when calling an_object.a_variable_name because __getattr__ passes a string and __dict__ only contains bytes. I feel like I've tried every combination of arguments and pickle protocols already. Apart from forcibly converting all objects' __dict__ keys to strings I'm at a loss. Any ideas?
** Skip to 4/28/17 update for better example
-------------------------------------------------------------------------------------------------------------
** Update 4/27/17
This minimum example illustrates my problem:
From py 2.7.13
import pickle
class test(object):
def __init__(self):
self.x = u"test ¢" # including a unicode str breaks things
t = test()
dumpstr = pickle.dumps(t)
>>> dumpstr
"ccopy_reg\n_reconstructor\np0\n(c__main__\ntest\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n(dp5\nS'x'\np6\nVtest \xa2\np7\nsb."
From py 3.6.1
import pickle
class test(object):
def __init__(self):
self.x = "xyz"
dumpstr = b"ccopy_reg\n_reconstructor\np0\n(c__main__\ntest\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n(dp5\nS'x'\np6\nVtest \xa2\np7\nsb."
t = pickle.loads(dumpstr, encoding="bytes")
>>> t
<__main__.test object at 0x040E3DF0>
>>> t.x
Traceback (most recent call last):
File "<pyshell#15>", line 1, in <module>
t.x
AttributeError: 'test' object has no attribute 'x'
>>> t.__dict__
{b'x': 'test ¢'}
>>>
-------------------------------------------------------------------------------------------------------------
Update 4/28/17
To re-create my issue I'm posting my actual raw pickle data here
The pickle file was created in python 2.7.13, windows 10 using
with open("raw_data.pkl", "wb") as fileobj:
pickle.dump(library, fileobj, protocol=0)
(protocol 0 so it's human readable)
To run it you'll need classes.py
# classes.py
class Library(object): pass
class Book(object): pass
class Student(object): pass
class RentalDetails(object): pass
And the test script here:
# load_pickle.py
import pickle, sys, itertools, os
raw_pkl = "raw_data.pkl"
is_py3 = sys.version_info.major == 3
read_modes = ["rb"]
encodings = ["bytes", "utf-8", "latin-1"]
fix_imports_choices = [True, False]
files = ["raw_data_%s.pkl" % x for x in range(3)]
def py2_test():
with open(raw_pkl, "rb") as fileobj:
loaded_object = pickle.load(fileobj)
print("library dict: %s" % (loaded_object.__dict__.keys()))
return loaded_object
def py2_dumps():
library = py2_test()
for protcol, path in enumerate(files):
print("dumping library to %s, protocol=%s" % (path, protcol))
with open(path, "wb") as writeobj:
pickle.dump(library, writeobj, protocol=protcol)
def py3_test():
# this test iterates over the different options trying to load
# the data pickled with py2 into a py3 environment
print("starting py3 test")
for (read_mode, encoding, fix_import, path) in itertools.product(read_modes, encodings, fix_imports_choices, files):
py3_load(path, read_mode=read_mode, fix_imports=fix_import, encoding=encoding)
def py3_load(path, read_mode, fix_imports, encoding):
from traceback import print_exc
print("-" * 50)
print("path=%s, read_mode = %s fix_imports = %s, encoding = %s" % (path, read_mode, fix_imports, encoding))
if not os.path.exists(path):
print("start this file with py2 first")
return
try:
with open(path, read_mode) as fileobj:
loaded_object = pickle.load(fileobj, fix_imports=fix_imports, encoding=encoding)
# print the object's __dict__
print("library dict: %s" % (loaded_object.__dict__.keys()))
# consider the test a failure if any member attributes are saved as bytes
test_passed = not any((isinstance(k, bytes) for k in loaded_object.__dict__.keys()))
print("Test %s" % ("Passed!" if test_passed else "Failed"))
except Exception:
print_exc()
print("Test Failed")
input("Press Enter to continue...")
print("-" * 50)
if is_py3:
py3_test()
else:
# py2_test()
py2_dumps()
put all 3 in the same directory and run c:\python27\python load_pickle.py first which will create 1 pickle file for each of the 3 protocols. Then run the same command with python 3 and notice that it version converts the __dict__ keys to bytes. I had it working for about 6 hours, but for the life of me I can't figure out how I broke it again.
In short, you're hitting bug 22005 with datetime.date objects in the RentalDetails objects.
That can be worked around with the encoding='bytes' parameter, but that leaves your classes with __dict__ containing bytes:
>>> library = pickle.loads(pickle_data, encoding='bytes')
>>> dir(library)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: '<' not supported between instances of 'str' and 'bytes'
It's possible to manually fix that based on your specific data:
def fix_object(obj):
"""Decode obj.__dict__ containing bytes keys"""
obj.__dict__ = dict((k.decode("ascii"), v) for k, v in obj.__dict__.items())
def fix_library(library):
"""Walk all library objects and decode __dict__ keys"""
fix_object(library)
for student in library.students:
fix_object(student)
for book in library.books:
fix_object(book)
for rental in book.rentals:
fix_object(rental)
But that's fragile and enough of a pain you should be looking for a better option.
1) Implement __getstate__/__setstate__ that maps datetime objects to a non-broken representation, for instance:
class Event(object):
"""Example class working around datetime pickling bug"""
def __init__(self):
self.date = datetime.date.today()
def __getstate__(self):
state = self.__dict__.copy()
state["date"] = state["date"].toordinal()
return state
def __setstate__(self, state):
self.__dict__.update(state)
self.date = datetime.date.fromordinal(self.date)
2) Don't use pickle at all. Along the lines of __getstate__/__setstate__, you can just implement to_dict/from_dict methods or similar in your classes for saving their content as json or some other plain format.
A final note, having a backreference to library in each object shouldn't be required.
You should treat pickle data as specific to the (major) version of Python that created it.
(See Gregory Smith's message w.r.t. issue 22005.)
The best way to get around this is to write a Python 2.7 program to read the pickled data, and write it out in a neutral format.
Taking a quick look at your actual data, it seems to me that an SQLite database is appropriate as an interchange format, since the Books contain references to a Library and RentalDetails. You could create separate tables for each.
Question: Porting pickle py2 to py3 strings become bytes
The given encoding='latin-1' below, is ok.
Your Problem with b'' are the result of using encoding='bytes'.
This will result in dict-keys being unpickled as bytes instead of as str.
The Problem data are the datetime.date values '\x07á\x02\x10', starting at line 56 in raw-data.pkl.
It's a konwn Issue, as pointed already.
Unpickling python2 datetime under python3
http://bugs.python.org/issue22005
For a workaround, I have patched pickle.py and got unpickled object, e.g.
book.library.books[0].rentals[0].rental_date=2017-02-16
This will work for me:
t = pickle.loads(dumpstr, encoding="latin-1")
Output:
<main.test object at 0xf7095fec>
t.__dict__={'x': 'test ¢'}
test ¢
Tested with Python:3.4.2

how to simulate const variable in python

Hello i am trying to create a const in python using this example found from Creating constant in Python (in the first answer from the link) and use instance as module.
The first file const.py has
# Put in const.py...:
class _const:
class ConstError(TypeError): pass
def __setattr__(self,name,value):
if self.__dict__ in (name):
raise self.ConstError("Can't rebind const(%s)"%name)
self.__dict__[name]=value
import sys
sys.modules[__name__]=_const()
And the rest goes to test.py for example.
# that's all -- now any client-code can
import const
# and bind an attribute ONCE:
const.magic = 23
# but NOT re-bind it:
const.magic = 88 # raises const.ConstError
# you may also want to add the obvious __delattr__
Although i have made 2 changes cause i am using python 3 i still get errors
Traceback (most recent call last):
File "E:\Const_in_python\test.py", line 4, in <module>
const.magic = 23
File "E:\Const_in_python\const.py", line 5, in __setattr__
if self.__dict__ in (name):
TypeError: 'in <string>' requires string as left operand, not dict
I dont understand what the line 5 error is. Can anyone explain? Correcting the example would also be nice. Thanks in advance.
This looks weird (where did it come from?)
if self.__dict__ in (name):
shouldn't it be
if name in self.__dict__:
That fixes your example
Python 3.2.3 (default, May 3 2012, 15:51:42)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import const
>>> const.magic = 23
>>> const.magic = 88
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "const.py", line 6, in __setattr__
raise self.ConstError("Can't rebind const(%s)"%name)
const.ConstError: Can't rebind const(magic)
Do you really need this const hack? Lots of Python code seems to somehow work without it
This line:
if self.__dict__ in (name):
should be
if name in self.__dict__:
... you want to know if the attribute is in the dict, not if the dict is in the attribute name (which doesn't work, because strings contain strings, not dictionaries).
Maybe kkconst - pypi is what you search.
support str, int, float, datetime
the const field instance will keep its base type behavior.
Like orm model definition, BaseConst is Constant Helper which manage const field.
For example:
from __future__ import print_function
from kkconst import (
BaseConst,
ConstFloatField,
)
class MathConst(BaseConst):
PI = ConstFloatField(3.1415926, verbose_name=u"Pi")
E = ConstFloatField(2.7182818284, verbose_name=u"mathematical constant") # Euler's number"
GOLDEN_RATIO = ConstFloatField(0.6180339887, verbose_name=u"Golden Ratio")
magic_num = MathConst.GOLDEN_RATIO
assert isinstance(magic_num, ConstFloatField)
assert isinstance(magic_num, float)
print(magic_num) # 0.6180339887
print(magic_num.verbose_name) # Golden Ratio
# MathConst.GOLDEN_RATIO = 1024 # raise Error, because assignment allow only once
more details usage you can read the pypi url:
pypi or github
same answer: Creating constant in Python

Django '<object> matching query does not exist' when I can see it in the database

My model looks like this:
class Staff(models.Model):
StaffNumber = models.CharField(max_length=20,primary_key=True)
NameFirst = models.CharField(max_length=30,blank=True,null=True)
NameLast = models.CharField(max_length=30)
SchoolID = models.CharField(max_length=10,blank=True,null=True)
AutocompleteName = models.CharField(max_length=100, blank=True,null=True)
I'm using MySQL, in case that matters.
From the manage.py shell:
root#django:/var/www/django-sites/apps# python manage.py shell
Python 2.5.2 (r252:60911, Jan 20 2010, 21:48:48)
[GCC 4.2.4 (Ubuntu 4.2.4-1ubuntu3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> from disciplineform.models import Staff
>>> s = Staff.objects.all()
>>> len(s)
406
So I know there are 406 "Staff" objects in there. I can also see them in the database. I check one of the values:
>>> s[0].NameFirst
u'"ANDREA"'
That also matches what I see in the database. Now I try to 'get' this object.
>>> a = Staff.objects.get(NameFirst='ANDREA')
Traceback (most recent call last):
File "<console>", line 1, in <module>
File "/var/lib/python-support/python2.5/django/db/models/manager.py", line 93, in get
return self.get_query_set().get(*args, **kwargs)
File "/var/lib/python-support/python2.5/django/db/models/query.py", line 309, in get
% self.model._meta.object_name)
DoesNotExist: Staff matching query does not exist.
Huh? This is happening for all the values of all the columns I've tested. I'm getting the same result in my view.py code.
I'm obviously doing something dumb. What is it?
Try
a = Staff.objects.get(NameFirst=u'"ANDREA"')
The u tells Python/Django it's a Unicode string, not a plain old str, and in your s[0].NameFirst sample, it's showing the value as containing double quotes.
I ran into the same issue, here's the solution:
from django.db import reset_queries, close_connection
close_connection()
reset_queries()
Name was stored in database with an extra redundant doublequotes. So, if you want to catch that record, correct code is:
a = Staff.objects.get(NameFirst='"ANDREA"')
…instead of:
a = Staff.objects.get(NameFirst='ANDREA')
If you can't be sure, that query will return result, you must to add exception handling, something like that:
try:
a = Staff.objects.get(NameFirst='"ANDREA"')
except Exception:
a = None
I have run into similar issues before.
I'm not entirely sure why but the raw 'get' tends to give me problems. So, I usually end up using 'filter' instead, then grabbing the first result.
a = Staff.objects.filter(NameFirst='ANDREA')
result = a[0]
A simliar thing happened to me. I was able to solve it by putting something in the db for the query to return.
This isn't exactly like yours, but it was the first hit on Google, so I thought I'd share.

PicklingError: Can't pickle <class 'decimal.Decimal'>: it's not the same object as decimal.Decimal

This is the error I got today at <a href"http://filmaster.com">filmaster.com:
PicklingError: Can't pickle <class
'decimal.Decimal'>: it's not the same
object as decimal.Decimal
What does that exactly mean? It does not seem to be making a lot of sense...
It seems to be connected with django caching. You can see the whole traceback here:
Traceback (most recent call last):
File
"/home/filmaster/django-trunk/django/core/handlers/base.py",
line 92, in get_response response =
callback(request, *callback_args,
**callback_kwargs)
File
"/home/filmaster/film20/film20/core/film_views.py",
line 193, in show_film
workflow.set_data_for_authenticated_user()
File
"/home/filmaster/film20/film20/core/film_views.py",
line 518, in
set_data_for_authenticated_user
object_id = self.the_film.parent.id)
File
"/home/filmaster/film20/film20/core/film_helper.py",
line 179, in get_others_ratings
set_cache(CACHE_OTHERS_RATINGS,
str(object_id) + "_" + str(user_id),
userratings)
File
"/home/filmaster/film20/film20/utils/cache_helper.py",
line 80, in set_cache return
cache.set(CACHE_MIDDLEWARE_KEY_PREFIX
+ full_path, result, get_time(cache_string))
File
"/home/filmaster/django-trunk/django/core/cache/backends/memcached.py",
line 37, in set
self._cache.set(smart_str(key), value,
timeout or self.default_timeout)
File
"/usr/lib/python2.5/site-packages/cmemcache.py",
line 128, in set val, flags =
self._convert(val)
File
"/usr/lib/python2.5/site-packages/cmemcache.py",
line 112, in _convert val =
pickle.dumps(val, 2)
PicklingError: Can't pickle <class
'decimal.Decimal'>: it's not the same
object as decimal.Decimal
And the source code for Filmaster can be downloaded from here: bitbucket.org/filmaster/filmaster-test
Any help will be greatly appreciated.
I got this error when running in an jupyter notebook. I think the problem was that I was using %load_ext autoreload autoreload 2. Restarting my kernel and rerunning solved the problem.
One oddity of Pickle is that the way you import a class before you pickle one of it's instances can subtly change the pickled object. Pickle requires you to have imported the object identically both before you pickle it and before you unpickle it.
So for example:
from a.b import c
C = c()
pickler.dump(C)
will make a subtly different object (sometimes) to:
from a import b
C = b.c()
pickler.dump(C)
Try fiddling with your imports, it might correct the problem.
I will demonstrate the problem with simple Python classes in Python2.7:
In [13]: class A: pass
In [14]: class B: pass
In [15]: A
Out[15]: <class __main__.A at 0x7f4089235738>
In [16]: B
Out[16]: <class __main__.B at 0x7f408939eb48>
In [17]: A.__name__ = "B"
In [18]: pickle.dumps(A)
---------------------------------------------------------------------------
PicklingError: Can't pickle <class __main__.B at 0x7f4089235738>: it's not the same object as __main__.B
This error is shown because we are trying to dump A, but because we changed its name to refer to another object "B", pickle is actually confused with which object to dump - class A or B. Apparently, pickle guys are very smart and they have already put a check on this behavior.
Solution:
Check if the object you are trying to dump has conflicting name with another object.
I have demonstrated debugging for the case presented above with ipython and ipdb below:
PicklingError: Can't pickle <class __main__.B at 0x7f4089235738>: it's not the same object as __main__.B
In [19]: debug
> /<path to pickle dir>/pickle.py(789)save_global()
787 raise PicklingError(
788 "Can't pickle %r: it's not the same object as %s.%s" %
--> 789 (obj, module, name))
790
791 if self.proto >= 2:
ipdb> pp (obj, module, name) **<------------- you are trying to dump obj which is class A from the pickle.dumps(A) call.**
(<class __main__.B at 0x7f4089235738>, '__main__', 'B')
ipdb> getattr(sys.modules[module], name) **<------------- this is the conflicting definition in the module (__main__ here) with same name ('B' here).**
<class __main__.B at 0x7f408939eb48>
I hope this saves some headaches! Adios!!
I can't explain why this is failing either, but my own solution to fix this was to change all my code from doing
from point import Point
to
import point
this one change and it worked. I'd love to know why... hth
There can be issues starting a process with multiprocessing by calling __init__. Here's a demo:
import multiprocessing as mp
class SubProcClass:
def __init__(self, pipe, startloop=False):
self.pipe = pipe
if startloop:
self.do_loop()
def do_loop(self):
while True:
req = self.pipe.recv()
self.pipe.send(req * req)
class ProcessInitTest:
def __init__(self, spawn=False):
if spawn:
mp.set_start_method('spawn')
(self.msg_pipe_child, self.msg_pipe_parent) = mp.Pipe(duplex=True)
def start_process(self):
subproc = SubProcClass(self.msg_pipe_child)
self.trig_proc = mp.Process(target=subproc.do_loop, args=())
self.trig_proc.daemon = True
self.trig_proc.start()
def start_process_fail(self):
self.trig_proc = mp.Process(target=SubProcClass.__init__, args=(self.msg_pipe_child,))
self.trig_proc.daemon = True
self.trig_proc.start()
def do_square(self, num):
# Note: this is an synchronous usage of mp,
# which doesn't make sense. But this is just for demo
self.msg_pipe_parent.send(num)
msg = self.msg_pipe_parent.recv()
print('{}^2 = {}'.format(num, msg))
Now, with the above code, if we run this:
if __name__ == '__main__':
t = ProcessInitTest(spawn=True)
t.start_process_fail()
for i in range(1000):
t.do_square(i)
We get this error:
Traceback (most recent call last):
File "start_class_process1.py", line 40, in <module>
t.start_process_fail()
File "start_class_process1.py", line 29, in start_process_fail
self.trig_proc.start()
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/multiprocessing/process.py", line 105, in start
self._popen = self._Popen(self)
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/multiprocessing/context.py", line 212, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/multiprocessing/context.py", line 274, in _Popen
return Popen(process_obj)
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/multiprocessing/popen_spawn_posix.py", line 33, in __init__
super().__init__(process_obj)
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/multiprocessing/popen_fork.py", line 21, in __init__
self._launch(process_obj)
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/multiprocessing/popen_spawn_posix.py", line 48, in _launch
reduction.dump(process_obj, fp)
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/multiprocessing/reduction.py", line 59, in dump
ForkingPickler(file, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <function SubProcClass.__init__ at 0x10073e510>: it's not the same object as __main__.__init__
And if we change it to use fork instead of spawn:
if __name__ == '__main__':
t = ProcessInitTest(spawn=False)
t.start_process_fail()
for i in range(1000):
t.do_square(i)
We get this error:
Process Process-1:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/multiprocessing/process.py", line 254, in _bootstrap
self.run()
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
TypeError: __init__() missing 1 required positional argument: 'pipe'
But if we call the start_process method, which doesn't call __init__ in the mp.Process target, like this:
if __name__ == '__main__':
t = ProcessInitTest(spawn=False)
t.start_process()
for i in range(1000):
t.do_square(i)
It works as expected (whether we use spawn or fork).
Did you somehow reload(decimal), or monkeypatch the decimal module to change the Decimal class? These are the two things most likely to produce such a problem.
Same happened to me
Restarting the kernel worked for me
Due to the restrictions based upon reputation I cannot comment, but the answer of Salim Fahedy and following the debugging-path set me up to identify a cause for this error, even when using dill instead of pickle:
Under the hood, dill also accesses some functions of dill. And in pickle._Pickler.save_global() there is an import happening. To me it seems, that this is more of a "hack" than a real solution as this method fails as soon as the class of the instance you are trying to pickle is not imported from the lowest level of the package the class is in. Sorry for the bad explanation, maybe examples are more suitable:
The following example would fail:
from oemof import solph
...
(some code here, giving you the object 'es')
...
model = solph.Model(es)
pickle.dump(model, open('file.pickle', 'wb))
It fails, because while you can use solph.Model, the class actually is oemof.solph.models.Model for example. The save_global() resolves that (or some function before that which passes it to save_global()), but then imports Model from oemof.solph.models and throws an error, because it's not the same import as from oemof import solph.Model (or something like that, I'm not 100% sure about the workings).
The following example would work:
from oemof.solph.models import Model
...
some code here, giving you the object 'es')
...
model = Model(es)
pickle.dump(model, open('file.pickle', 'wb'))
It works, because now the Model object is imported from the same place, the pickle._Pickler.save_global() imports the comparison object (obj2) from.
Long story short: When pickling an object, make sure to import the class from the lowest possible level.
Addition: This also seems to apply to objects stored in the attributes of the class-instance you want to pickle. If for example model had an attribute es that itself is an object of the class oemof.solph.energysystems.EnergySystem, we would need to import it as:
from oemof.solph.energysystems import EnergySystem
es = EnergySystem()
My issue was that I had a function with the same name defined twice in a file. So I guess it was confused about which one it was trying to pickle.
I had same problem while debugging (Spyder). Everything worked normally if run the program. But, if I start to debug I faced the picklingError.
But, once I chose the option Execute in dedicated console in Run configuration per file (short-cut: ctrl+F6) everything worked normally as expected. I do not know exactly how it is adapting.
Note: In my script I have many imports like
from PyQt5.QtWidgets import *
from PyQt5.Qt import *
from matplotlib.backends.backend_qt5agg import FigureCanvasQTAgg as FigureCanvas
import os, sys, re, math
My basic understanding was, because of star (*) I was getting this picklingError.
I had a problem that no one has mentioned yet. I have a package with a __init__ file that does, among other things:
from .mymodule import cls
Then my top-level code says:
import mypkg
obj = mypkg.cls()
The problem with this is that in my top-level code, the type appears to be mypkg.cls, but it's actually mypkg.mymodule.cls. Using the full path:
obj = mypkg.mymodule.cls()
avoids the error.
I had the same error in Spyder. Turned out to be simple in my case. I defined a class named "Class" in a file also named "Class". I changed the name of the class in the definition to "Class_obj". pickle.dump(Class_obj,fileh) works, but pickle.dump(Class,fileh) does not when its saved in a file named "Class".
This miraculous function solves the mentioned error, but for me it turned out to another error 'permission denied' which comes out of the blue. However, I guess it might help someone find a solution so I am still posting the function:
import tempfile
import time
from tensorflow.keras.models import save_model, Model
# Hotfix function
def make_keras_picklable():
def __getstate__(self):
model_str = ""
with tempfile.NamedTemporaryFile(suffix='.hdf5', delete=True) as fd:
save_model(self, fd.name, overwrite=True)
model_str = fd.read()
d = {'model_str': model_str}
return d
def __setstate__(self, state):
with tempfile.NamedTemporaryFile(suffix='.hdf5', delete=True) as fd:
fd.write(state['model_str'])
fd.flush()
model = load_model(fd.name)
self.__dict__ = model.__dict__
cls = Model
cls.__getstate__ = __getstate__
cls.__setstate__ = __setstate__
# Run the function
make_keras_picklable()
### create then save your model here ###

Categories