Python: Is there a function similar to ast.literal_eval()?

I have a problem using ast.literal_eval(). In the example below, I only want to convert the string (myText) to a dictionary, but ast.literal_eval() tries to evaluate <__main__.myClass instance at 0x0000000052D64D88> and raises an error. I completely understand the error, but I would like to know whether there is a way to avoid it (with another function, or with another way of using ast.literal_eval).
import ast
myText = "{<__main__.myClass instance at 0x0000000052D64D88>: value}"
ast.literal_eval(myText)
# Error: invalid syntax
# Traceback (most recent call last):
# File "<maya console>", line 4, in <module>
# File "C:\Program Files\Autodesk\Maya2016\bin\python27.zip\ast.py", line 49, in literal_eval
# node_or_string = parse(node_or_string, mode='eval')
# File "C:\Program Files\Autodesk\Maya2016\bin\python27.zip\ast.py", line 37, in parse
# return compile(source, filename, mode, PyCF_ONLY_AST)
# File "<unknown>", line 1
# {<__main__.myClass instance at 0x0000000052D64D88>: value}
# ^
# SyntaxError: invalid syntax #
Thank you in advance for your help !

What you really want to do is dump your data using pickle.dump and load it using pickle.load (or equivalent, such as json, etc.). Using repr(data) to dump the data will cause problems like this.
If you just need to salvage the data you have already generated, you might get away with something like the following:
def my_literal_eval(s):
    s = re.sub(r"<__main__.myClass instance at 0x([^>]+)>", r'"<\1>"', s)
    dct = ast.literal_eval(s)
    return {myClass(): v for v in dct.itervalues()}
Example of usage:
>>> import ast, re
>>> class myClass(object): pass
...
>>> myText = "{<__main__.myClass instance at 0x0000000052D64D88>: {'name': 'theName'}, <__main__.myClass instance at 0x0000000052D73F48>: {'name': 'theName'}}"
>>> my_literal_eval(myText)
{<__main__.myClass object at 0x7fbdc00a4b90>: {'name': 'theName'}, <__main__.myClass object at 0x7fbdc0035550>: {'name': 'theName'}}
This will work only if the myClass instances don't have any useful information, but are only needed for identity. The idea is to first fix up the string by replacing the <__main__.myClass instance ...> strings with something that can be parsed by ast.literal_eval, and then replace those with actual myClass instances - provided these can be constructed without arguments, which hinges on the above assumption.
If this initial assumption doesn't hold, then your data is, as Ignacio put it, irreversibly damaged, and no amount of clever parsing will retrieve the lost bits.
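As a rough sketch, the same salvage idea can be written for Python 3, where dict.itervalues() no longer exists. The names MyClass, INSTANCE_REPR and salvage_dict are hypothetical, and the code still assumes the instances carry no state and can be built without arguments:

```python
import ast
import re


class MyClass:
    """Stand-in for the original class; assumed to need no constructor arguments."""


# Matches the Python 2 style repr, e.g. <__main__.myClass instance at 0x52D64D88>.
INSTANCE_REPR = re.compile(r"<__main__\.myClass instance at 0x([0-9A-Fa-f]+)>")


def salvage_dict(text):
    # Turn each unparsable repr into a quoted string literal holding its address.
    quoted = INSTANCE_REPR.sub(r'"<\1>"', text)
    parsed = ast.literal_eval(quoted)
    # Build one fresh instance per distinct address; the old state is lost.
    return {MyClass(): value for value in parsed.values()}


my_text = (
    "{<__main__.myClass instance at 0x0000000052D64D88>: {'name': 'theName'}, "
    "<__main__.myClass instance at 0x0000000052D73F48>: {'name': 'theName'}}"
)
restored = salvage_dict(my_text)
```

Two reprs with the same address collapse into a single key here, which matches how the original dictionary would have behaved.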

Related

Python jsonpickle error: 'OrderedDict' object has no attribute '_OrderedDict__root'

I'm hitting this exception with jsonpickle, when trying to pickle a rather complex object that unfortunately I'm not sure how to describe here. I know that makes it tough to say much, but for what it's worth:
>>> frozen = jsonpickle.encode(my_complex_object_instance)
>>> thawed = jsonpickle.decode(frozen)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Library/Python/2.7/site-packages/jsonpickle/__init__.py",
line 152, in decode
return unpickler.decode(string, backend=backend, keys=keys)
:
:
File "/Library/Python/2.7/site-packages/jsonpickle/unpickler.py",
line 336, in _restore_from_dict
instance[k] = value
File "/Library/Python/2.7/site-packages/botocore/vendored/requests/packages/urllib3/packages/ordered_dict.py",
line 49, in __setitem__
root = self.__root
AttributeError: 'OrderedDict' object has no attribute '_OrderedDict__root'
I don't find much of assistance when googling the error. I do see what looks like the same issue was resolved at some time past for simpler objects:
https://github.com/jsonpickle/jsonpickle/issues/33
The cited example in that report works for me:
>>> jsonpickle.decode(jsonpickle.encode(collections.OrderedDict()))
OrderedDict()
>>> jsonpickle.decode(jsonpickle.encode(collections.OrderedDict(a=1)))
OrderedDict([(u'a', 1)])
Has anyone ever run into this themselves and found a solution? I ask with the understanding that my case may be "differently idiosyncratic" than another known example.
The requests module for me seems to be running into problems when I .decode(). After looking at the jsonpickle code a bit, I decided to fork it and change the following lines to see what was going on (and I ended up keeping a private copy of jsonpickle with the changes so I can move forward).
In jsonpickle/unpickler.py (in my version it's line 368), search for the if statement section in the method _restore_from_dict():
if (util.is_noncomplex(instance) or
        util.is_dictionary_subclass(instance)):
    instance[k] = value
else:
    setattr(instance, k, value)
and change it to this (it logs an error for each assignment that fails; you can then either keep the code in place or replace the OrderedDict implementation with one that has __root):
if (util.is_noncomplex(instance) or
        util.is_dictionary_subclass(instance)):
    # Currently requests.adapters.HTTPAdapter is using a non-standard
    # version of OrderedDict which doesn't have a _OrderedDict__root
    # attribute
    try:
        instance[k] = value
    except AttributeError as e:
        import logging
        import pprint
        warnmsg = 'Unable to unpickle {}[{}]={}'.format(pprint.pformat(instance), pprint.pformat(k), pprint.pformat(value))
        logging.error(warnmsg)
else:
    setattr(instance, k, value)

How to serialize a scandir.DirEntry in Python for sending through a network socket?

I have server and client programs that communicate with each other through a network socket.
What I want is to send a directory entry (scandir.DirEntry) obtained from scandir.scandir() through the socket.
For now I am using pickle and cPickle modules and have come up with the following (excerpt only):
import scandir, pickle
s = scandir.scandir("D:\\PYTHON")
entry = s.next()
data = pickle.dumps(entry)
However, I am getting the following error stack:
File "untitled.py", line 5, in <module>
data = pickle.dumps(item)
File "C:\Python27\Lib\pickle.py", line 1374, in dumps
Pickler(file, protocol).dump(obj)
File "C:\Python27\Lib\pickle.py", line 224, in dump
self.save(obj)
File "C:\Python27\Lib\pickle.py", line 306, in save
rv = reduce(self.proto)
File "C:\Python27\Lib\copy_reg.py", line 70, in _reduce_ex
raise TypeError, "can't pickle %s objects" % base.__name__
TypeError: can't pickle DirEntry objects
How can I get rid of this error?
I have heard of using marshal or JSON.
UPDATE: JSON is not dumping all the data within the object.
Is there any completely different way to do so to send the object through the socket?
Thanks in advance for any help.
Yes, os.DirEntry objects are intended to be short-lived, not really kept around or serialized. If you need the data in them to be serialized, looks like you've figured that out in your own answer -- serialize (pickle) a dict version of the attributes you need.
To deserialize into an object that walks and quacks like an os.DirEntry instance, create a PseudoDirEntry class that mimics the things you need.
Note that you can directly serialize the stat object already, which saves you picking the fields out of that.
Combined, that would look like this:
class PseudoDirEntry:
    def __init__(self, name, path, is_dir, stat):
        self.name = name
        self.path = path
        self._is_dir = is_dir
        self._stat = stat

    def is_dir(self):
        return self._is_dir

    def stat(self):
        return self._stat
And then:
>>> import os, pickle
>>> entry = list(os.scandir())[0]
>>> pickled = pickle.dumps({'name': entry.name, 'path': entry.path, 'is_dir': entry.is_dir(), 'stat': entry.stat()})
>>> loaded = pickle.loads(pickled)
>>> pseudo = PseudoDirEntry(loaded['name'], loaded['path'], loaded['is_dir'], loaded['stat'])
>>> pseudo.name
'.DS_Store'
>>> pseudo.is_dir()
False
>>> pseudo.stat()
os.stat_result(st_mode=33188, st_ino=8370294, st_dev=16777220, st_nlink=1, st_uid=502, st_gid=20, st_size=8196, st_atime=1478356967, st_mtime=1477601172, st_ctime=1477601172)
Well, I have figured out for myself that for instances of non-standard classes like scandir.DirEntry, the best way is to convert the class member data into a (possibly nested) combination of standard objects (list, dict, etc.).
For example, in the particular case of scandir.DirEntry, it can be done as follows.
import scandir, pickle
s = scandir.scandir("D:\\PYTHON")
entry = s.next()
# first convert the stat object to st_
st = entry.stat()
st_ = {'st_mode': st.st_mode, 'st_size': st.st_size,
       'st_atime': st.st_atime, 'st_mtime': st.st_mtime,
       'st_ctime': st.st_ctime}
# now convert the entry object to entry_
entry_ = {'name': entry.name, 'is_dir': entry.is_dir(),
          'path': entry.path, 'stat': st_}
# one may need some other class member data also as necessary
# now pickle the converted entry_
data = pickle.dumps(entry_)
Although for my purpose I only require the data after unpickling at the other end, one may need to reconstruct a scandir.DirEntry-like object 'entry' from the unpickled entry_. However, I have yet to figure out how to reconstruct the class instance and restore the behaviour of methods like is_dir() and stat().
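The two answers can be combined into a small round trip. This is only a sketch for Python 3.6+, using os.scandir from the standard library instead of the scandir backport; dump_entry and load_entry are hypothetical helper names, and the temporary directory exists only to make the example self-contained:

```python
import os
import pickle
import tempfile


class PseudoDirEntry:
    """Mimics the parts of os.DirEntry that the receiving side needs."""

    def __init__(self, name, path, is_dir, stat):
        self.name = name
        self.path = path
        self._is_dir = is_dir
        self._stat = stat

    def is_dir(self):
        return self._is_dir

    def stat(self):
        return self._stat


def dump_entry(entry):
    # Pickle a plain dict of the needed fields; os.stat_result pickles as-is.
    fields = {'name': entry.name, 'path': entry.path,
              'is_dir': entry.is_dir(), 'stat': entry.stat()}
    return pickle.dumps(fields)


def load_entry(data):
    # Rebuild a DirEntry look-alike from the bytes received over the socket.
    return PseudoDirEntry(**pickle.loads(data))


with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, 'example.txt'), 'w') as handle:
        handle.write('hello')
    with os.scandir(tmp) as entries:
        payload = dump_entry(next(entries))

clone = load_entry(payload)
```

The bytes in payload are what would be sent through the socket; the receiver only needs the PseudoDirEntry class and load_entry.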

How do I use python-WikEdDiff?

I recently installed python-WikEdDiff package to my system. I understand it is a python extension of the original JavaScript WikEdDiff tool. I tried to use it but I couldn't find any documentation for it. I am stuck at using WikEdDiff.diff(). I wish to use the other functions of this class, such as getFragments() and others, but on checking, it shows the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.4/dist-packages/WikEdDiff/diff.py", line 1123, in detectBlocks
self.getSameBlocks()
File "/usr/local/lib/python3.4/dist-packages/WikEdDiff/diff.py", line 1211, in getSameBlocks
while j is not None and self.oldText.tokens[j].link is None:
IndexError: list index out of range
On checking, I found out that the tokens[] structure in the object remains empty whereas it should have been initialized.
Is there an initialize function that I need to call apart from the default constructor? Or does it have something to do with the WikEdDiffConfig config structure I passed to the constructor?
You get this error because the WikEdDiff object was cleared internally inside diff(), as shown in this section of the code:
def diff( self, oldString, newString ):
    ...
    # Free memory
    self.newText.tokens.clear()
    self.oldText.tokens.clear()
    # Assemble blocks into fragment table
    fragments = self.getDiffFragments()
    # Free memory
    self.blocks.clear()
    self.groups.clear()
    self.sections.clear()
    ...
    return fragments
If you just need the fragments, use the returned variable of diff() like this:
import WikEdDiff as WED
config=WED.WikEdDiffConfig()
w = WED.WikEdDiff(config)
f = w.diff("abc", "efg")
# do whatever you want with f, but don't use w
print(' '.join([i.text+i.type for i in f]))
# outputs '{ [ (> abc- ) abc< efg+ ] }'

Python sys.argv TypeErrors with printing function results?

I have been trying to learn how to use sys.argv properly, while calling an executable file from the command line.
I wanted the function's results printed to the command line when passing the filename and an argument, but I get a TypeError.
So far I have:
#! /usr/bin/env python
import mechanize
from BeautifulSoup import BeautifulSoup
import sys
def dictionary(word):
    br = mechanize.Browser()
    response = br.open('http://www.dictionary.reference.com')
    br.select_form(nr=0)
    br.form['q'] = sys.argv
    br.submit()
    definition = BeautifulSoup(br.response().read())
    trans = definition.findAll('td',{'class':'td3n2'})
    fin = [i.text for i in trans]
    query = {}
    for i in fin:
        query[fin.index(i)] = i
    return query

print dictionary(sys.argv)
When I call this:
./this_file.py 'pass'
I am left with this error message:
Traceback (most recent call last):
File "./hot.py", line 20, in <module>
print dictionary(sys.argv)
File "./hot.py", line 10, in dictionary
br.form['q'] = sys.argv
File "/usr/local/lib/python2.7/dist-packages/mechanize/_form.py", line 2782, in __setitem__
control.value = value
File "/usr/local/lib/python2.7/dist-packages/mechanize/_form.py", line 1217, in __setattr__
raise TypeError("must assign a string")
TypeError: must assign a string
With
br.form['q'] = sys.argv
you are assigning a list of strings here instead of a string.
>>> type(sys.argv)
<type 'list'>
>>> type(sys.argv[0])
<type 'str'>
>>>
You want to identify a specific string to assign via an index.
Most likely it will be index 1 given what you have in your post (since index 0 is the name of the script). So perhaps
br.form['q'] = sys.argv[1]
will do for you. Of course it could be another index too, depending on your particular application/needs.
Note, as @Dougal observes in a helpful comment below, that the function parameter word is not being used. You are calling your dictionary function with sys.argv and then ought to refer to word inside the function. The type doesn't change, only the name by which you refer to the command-line arguments inside your function. The idea of word is good, as it avoids the use of global variables. If you prefer to use globals (not really encouraged), then removing word is recommended, as it will be confusing to have it there.
So your statement should really read
br.form['q'] = word[1]
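An alternative layout, sketched here for Python 3, is to pick the single argument out of the list before calling the function, so that the function only ever sees a string. The lookup body below is a hypothetical placeholder for the mechanize code, and main is an assumed helper name:

```python
def dictionary(word):
    # Placeholder for the real lookup; it only records the word it was given.
    if not isinstance(word, str):
        raise TypeError("must assign a string")
    return {'q': word}


def main(argv):
    # argv[0] is the script name, argv[1] is the first command-line argument.
    if len(argv) < 2:
        raise SystemExit('usage: %s WORD' % argv[0])
    return dictionary(argv[1])


# Equivalent to running: ./this_file.py pass
result = main(['this_file.py', 'pass'])
```

In a real script, the last line would be main(sys.argv) after import sys.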

PicklingError: Can't pickle <class 'decimal.Decimal'>: it's not the same object as decimal.Decimal

This is the error I got today at filmaster.com (http://filmaster.com):
PicklingError: Can't pickle <class 'decimal.Decimal'>: it's not the same object as decimal.Decimal
What does that exactly mean? It does not seem to be making a lot of sense...
It seems to be connected with django caching. You can see the whole traceback here:
Traceback (most recent call last):
  File "/home/filmaster/django-trunk/django/core/handlers/base.py", line 92, in get_response
    response = callback(request, *callback_args, **callback_kwargs)
  File "/home/filmaster/film20/film20/core/film_views.py", line 193, in show_film
    workflow.set_data_for_authenticated_user()
  File "/home/filmaster/film20/film20/core/film_views.py", line 518, in set_data_for_authenticated_user
    object_id = self.the_film.parent.id)
  File "/home/filmaster/film20/film20/core/film_helper.py", line 179, in get_others_ratings
    set_cache(CACHE_OTHERS_RATINGS, str(object_id) + "_" + str(user_id), userratings)
  File "/home/filmaster/film20/film20/utils/cache_helper.py", line 80, in set_cache
    return cache.set(CACHE_MIDDLEWARE_KEY_PREFIX + full_path, result, get_time(cache_string))
  File "/home/filmaster/django-trunk/django/core/cache/backends/memcached.py", line 37, in set
    self._cache.set(smart_str(key), value, timeout or self.default_timeout)
  File "/usr/lib/python2.5/site-packages/cmemcache.py", line 128, in set
    val, flags = self._convert(val)
  File "/usr/lib/python2.5/site-packages/cmemcache.py", line 112, in _convert
    val = pickle.dumps(val, 2)
PicklingError: Can't pickle <class 'decimal.Decimal'>: it's not the same object as decimal.Decimal
And the source code for Filmaster can be downloaded from here: bitbucket.org/filmaster/filmaster-test
Any help will be greatly appreciated.
I got this error when running in a Jupyter notebook. I think the problem was that I was using %load_ext autoreload and %autoreload 2. Restarting my kernel and rerunning solved the problem.
One oddity of pickle is that the way you import a class before you pickle one of its instances can subtly change the pickled object. Pickle requires you to have imported the object identically both before you pickle it and before you unpickle it.
So for example:
from a.b import c
C = c()
pickler.dump(C)
will make a subtly different object (sometimes) to:
from a import b
C = b.c()
pickler.dump(C)
Try fiddling with your imports, it might correct the problem.
I will demonstrate the problem with simple Python classes in Python2.7:
In [13]: class A: pass
In [14]: class B: pass
In [15]: A
Out[15]: <class __main__.A at 0x7f4089235738>
In [16]: B
Out[16]: <class __main__.B at 0x7f408939eb48>
In [17]: A.__name__ = "B"
In [18]: pickle.dumps(A)
---------------------------------------------------------------------------
PicklingError: Can't pickle <class __main__.B at 0x7f4089235738>: it's not the same object as __main__.B
This error is shown because we are trying to dump A, but since we changed its name to "B", which refers to another object, pickle cannot tell which object to dump: class A or class B. The pickle developers anticipated this situation and added a check for it.
Solution:
Check if the object you are trying to dump has conflicting name with another object.
I have demonstrated debugging for the case presented above with ipython and ipdb below:
PicklingError: Can't pickle <class __main__.B at 0x7f4089235738>: it's not the same object as __main__.B
In [19]: debug
> /<path to pickle dir>/pickle.py(789)save_global()
787 raise PicklingError(
788 "Can't pickle %r: it's not the same object as %s.%s" %
--> 789 (obj, module, name))
790
791 if self.proto >= 2:
ipdb> pp (obj, module, name)  # <-- obj is class A, which the pickle.dumps(A) call is trying to dump
(<class __main__.B at 0x7f4089235738>, '__main__', 'B')
ipdb> getattr(sys.modules[module], name)  # <-- the conflicting definition in the module (__main__ here) with the same name ('B' here)
<class __main__.B at 0x7f408939eb48>
I hope this saves some headaches! Adios!!
I can't explain why this is failing either, but my own solution to fix this was to change all my code from doing
from point import Point
to
import point
this one change and it worked. I'd love to know why... hth
There can be issues starting a process with multiprocessing by calling __init__. Here's a demo:
import multiprocessing as mp


class SubProcClass:
    def __init__(self, pipe, startloop=False):
        self.pipe = pipe
        if startloop:
            self.do_loop()

    def do_loop(self):
        while True:
            req = self.pipe.recv()
            self.pipe.send(req * req)


class ProcessInitTest:
    def __init__(self, spawn=False):
        if spawn:
            mp.set_start_method('spawn')
        (self.msg_pipe_child, self.msg_pipe_parent) = mp.Pipe(duplex=True)

    def start_process(self):
        subproc = SubProcClass(self.msg_pipe_child)
        self.trig_proc = mp.Process(target=subproc.do_loop, args=())
        self.trig_proc.daemon = True
        self.trig_proc.start()

    def start_process_fail(self):
        self.trig_proc = mp.Process(target=SubProcClass.__init__, args=(self.msg_pipe_child,))
        self.trig_proc.daemon = True
        self.trig_proc.start()

    def do_square(self, num):
        # Note: this is a synchronous usage of mp,
        # which doesn't make sense. But this is just for demo
        self.msg_pipe_parent.send(num)
        msg = self.msg_pipe_parent.recv()
        print('{}^2 = {}'.format(num, msg))
Now, with the above code, if we run this:
if __name__ == '__main__':
    t = ProcessInitTest(spawn=True)
    t.start_process_fail()
    for i in range(1000):
        t.do_square(i)
We get this error:
Traceback (most recent call last):
File "start_class_process1.py", line 40, in <module>
t.start_process_fail()
File "start_class_process1.py", line 29, in start_process_fail
self.trig_proc.start()
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/multiprocessing/process.py", line 105, in start
self._popen = self._Popen(self)
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/multiprocessing/context.py", line 212, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/multiprocessing/context.py", line 274, in _Popen
return Popen(process_obj)
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/multiprocessing/popen_spawn_posix.py", line 33, in __init__
super().__init__(process_obj)
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/multiprocessing/popen_fork.py", line 21, in __init__
self._launch(process_obj)
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/multiprocessing/popen_spawn_posix.py", line 48, in _launch
reduction.dump(process_obj, fp)
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/multiprocessing/reduction.py", line 59, in dump
ForkingPickler(file, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <function SubProcClass.__init__ at 0x10073e510>: it's not the same object as __main__.__init__
And if we change it to use fork instead of spawn:
if __name__ == '__main__':
    t = ProcessInitTest(spawn=False)
    t.start_process_fail()
    for i in range(1000):
        t.do_square(i)
We get this error:
Process Process-1:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/multiprocessing/process.py", line 254, in _bootstrap
self.run()
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
TypeError: __init__() missing 1 required positional argument: 'pipe'
But if we call the start_process method, which doesn't call __init__ in the mp.Process target, like this:
if __name__ == '__main__':
    t = ProcessInitTest(spawn=False)
    t.start_process()
    for i in range(1000):
        t.do_square(i)
It works as expected (whether we use spawn or fork).
Did you somehow reload(decimal), or monkeypatch the decimal module to change the Decimal class? These are the two things most likely to produce such a problem.
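The effect of a reload can be imitated without touching the real decimal module. The following is a sketch for Python 3 under stated assumptions: demo_mod, Decimalish and define_class are hypothetical names, and the throwaway module is registered in sys.modules by hand so that pickle can look the class up by name:

```python
import pickle
import sys
import types

# A throwaway module that pickle can find through sys.modules.
demo_mod = types.ModuleType('demo_mod')
sys.modules['demo_mod'] = demo_mod


def define_class():
    # Each call creates a brand-new class object with the same qualified name.
    return type('Decimalish', (object,), {'__module__': 'demo_mod'})


demo_mod.Decimalish = define_class()
value = demo_mod.Decimalish()

# Works: the name demo_mod.Decimalish still points at the class of value.
first_dump = pickle.dumps(value)

# Imitate a reload: the name is rebound to a new class, value keeps the old one.
demo_mod.Decimalish = define_class()

try:
    pickle.dumps(value)
    error = None
except pickle.PicklingError as exc:
    error = str(exc)
```

After the rebinding, pickle looks up demo_mod.Decimalish, finds a class that is not type(value), and refuses to write a reference that would unpickle to the wrong class. Restarting the interpreter or kernel helps because it removes the stale instances.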
Same happened to me
Restarting the kernel worked for me
Due to reputation restrictions I cannot comment, but Salim Fahedy's answer and the debugging path it describes led me to a cause of this error, which occurs even when using dill instead of pickle:
Under the hood, dill also calls some functions of pickle, and pickle._Pickler.save_global() performs an import. To me this seems more of a "hack" than a real solution, as the method fails as soon as the class of the instance you are trying to pickle is not imported from the lowest level of the package the class is in. Sorry for the bad explanation; maybe examples are more suitable:
The following example would fail:
from oemof import solph
...
(some code here, giving you the object 'es')
...
model = solph.Model(es)
pickle.dump(model, open('file.pickle', 'wb'))
It fails, because while you can use solph.Model, the class actually is oemof.solph.models.Model for example. The save_global() resolves that (or some function before that which passes it to save_global()), but then imports Model from oemof.solph.models and throws an error, because it's not the same import as from oemof import solph.Model (or something like that, I'm not 100% sure about the workings).
The following example would work:
from oemof.solph.models import Model
...
(some code here, giving you the object 'es')
...
model = Model(es)
pickle.dump(model, open('file.pickle', 'wb'))
It works, because now the Model object is imported from the same place, the pickle._Pickler.save_global() imports the comparison object (obj2) from.
Long story short: When pickling an object, make sure to import the class from the lowest possible level.
Addition: This also seems to apply to objects stored in the attributes of the class-instance you want to pickle. If for example model had an attribute es that itself is an object of the class oemof.solph.energysystems.EnergySystem, we would need to import it as:
from oemof.solph.energysystems import EnergySystem
es = EnergySystem()
My issue was that I had a function with the same name defined twice in a file. So I guess it was confused about which one it was trying to pickle.
I had the same problem while debugging in Spyder. Everything worked normally when I ran the program, but as soon as I started debugging I got the PicklingError.
However, once I chose the option "Execute in dedicated console" in "Run configuration per file" (shortcut: Ctrl+F6), everything worked as expected. I do not know exactly why this helps.
Note: In my script I have many imports like
from PyQt5.QtWidgets import *
from PyQt5.Qt import *
from matplotlib.backends.backend_qt5agg import FigureCanvasQTAgg as FigureCanvas
import os, sys, re, math
My basic understanding was, because of star (*) I was getting this picklingError.
I had a problem that no one has mentioned yet. I have a package with an __init__.py file that does, among other things:
from .mymodule import cls
Then my top-level code says:
import mypkg
obj = mypkg.cls()
The problem with this is that in my top-level code, the type appears to be mypkg.cls, but it's actually mypkg.mymodule.cls. Using the full path:
obj = mypkg.mymodule.cls()
avoids the error.
I had the same error in Spyder. It turned out to be simple in my case: I had defined a class named "Class" in a file also named "Class". I changed the name of the class in the definition to "Class_obj". pickle.dump(Class_obj,fileh) works, but pickle.dump(Class,fileh) does not when it's saved in a file named "Class".
This miraculous function solves the mentioned error, but for me it led to another error, 'permission denied', which came out of the blue. However, I guess it might help someone find a solution, so I am still posting the function:
import tempfile
import time
from tensorflow.keras.models import save_model, load_model, Model

# Hotfix function
def make_keras_picklable():
    def __getstate__(self):
        model_str = ""
        # Save the model to a temporary HDF5 file and keep its raw bytes
        with tempfile.NamedTemporaryFile(suffix='.hdf5', delete=True) as fd:
            save_model(self, fd.name, overwrite=True)
            model_str = fd.read()
        d = {'model_str': model_str}
        return d

    def __setstate__(self, state):
        # Write the bytes back to a temporary file and load the model from it
        with tempfile.NamedTemporaryFile(suffix='.hdf5', delete=True) as fd:
            fd.write(state['model_str'])
            fd.flush()
            model = load_model(fd.name)
        self.__dict__ = model.__dict__

    cls = Model
    cls.__getstate__ = __getstate__
    cls.__setstate__ = __setstate__

# Run the function
make_keras_picklable()
### create then save your model here ###
