What's the reason for the changes to the copy function in UserDict.py? - python

The copy function is defined like this:
def copy(self):
    if self.__class__ is UserDict:
        return UserDict(self.data.copy())
    import copy
    data = self.data
    # why use try? use return copy.copy(self) instead
    try:
        self.data = {}
        c = copy.copy(self)
    finally:
        self.data = data
    c.update(self)
    return c
Why is try-finally used here? Is self.data cleared first? What exception could be raised here?

If you ignore the try / finally, the code is:
data = self.data
self.data = {}
c = copy.copy(self)
self.data = data
c.update(self)
Note the self.data = {} line. For some reason, the person who wrote this code felt that the copy would work better if self.data was set to an empty dictionary before calling copy.copy(), and then the actual data was copied over using update().
The point of the finally is to ensure that self.data is restored to its original value, no matter what happens in copy.copy().
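One plausible reason for the temporary self.data = {} (an assumption; the source does not state the author's intent) is that copy.copy() is shallow, so a plain copy would share the same underlying dict with the original. A minimal sketch of the difference, using a hypothetical Box class as a stand-in for a UserDict subclass:

```python
import copy


class Box:
    """Hypothetical stand-in for a UserDict subclass; not the real UserDict."""

    def __init__(self, data=None):
        self.data = dict(data or {})

    def copy_naive(self):
        # Shallow copy: the new object's .data is the very same dict object.
        return copy.copy(self)

    def copy_safe(self):
        # Same pattern as the code above: swap in an empty dict, copy,
        # always restore the original, then fill the copy's own dict.
        data = self.data
        try:
            self.data = {}
            c = copy.copy(self)
        finally:
            self.data = data
        c.data.update(data)
        return c


box = Box({"a": 1})
safe = box.copy_safe()
naive = box.copy_naive()
naive.data["b"] = 2  # also changes box.data, because the dict is shared
```

Here safe.data is an independent dict, while naive.data and box.data are one object. The finally clause only matters if copy.copy() raises, for example when the subclass holds an attribute that cannot be copied.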

Related

How can I override global variables just for the scope of callees of a function in Python?

I'm writing a decorator which needs to pass data to other utility functions; something like:
STORE = []
def utility(message):
    STORE.append(message)
def decorator(func):
    def decorator_wrap(*args, **kwargs):
        global STORE
        saved_STORE = STORE
        STORE = list()
        func(*args, **kwargs)
        for line in STORE:
            print(line)
        STORE = saved_STORE
    return decorator_wrap
@decorator
def foo(x):
    # ...
    utility(x)
    # ...
But that's kind of yuck, and not thread safe. Is there a way to override utility()'s view of STORE for the duration of decorator_wrap()? Or some other way to signal to utility() that there's an alternate STORE it should use?
Alternatively, I could present a different utility() to foo() and all its callees, but that seems like exactly the same problem.
From this answer I find that I can implement it this way:
import inspect
STORE = []
def utility(message):
    global STORE
    store = STORE
    frame = inspect.currentframe()
    while frame:
        if 'LOCAL_STORE' in frame.f_locals:
            store = frame.f_locals['LOCAL_STORE']
            break
        frame = frame.f_back
    store.append(message)
def decorator(func):
    def decorator_wrap(*args, **kwargs):
        LOCAL_STORE = []
        func(*args, **kwargs)
        for line in LOCAL_STORE:
            print(line)
    return decorator_wrap
But while reading the documentation I see that f_globals is present in every stack frame. I think the more efficient method would be to inject my local into my callee's f_globals. This would be similar to setting an environment variable before executing another command, but I don't know if it's legal.

Is it really impossible to unpickle a Python class if the original python file has been deleted?

Suppose you have the following:
file = 'hey.py'
import pickle
class hey:
    def __init__(self):
        self.you = 1
ins = hey()
temp = open("cool_class", "wb")
pickle.dump(ins, temp)
temp.close()
Now suppose you delete the file hey.py and you run the following code:
pkl_file = open("cool_class", 'rb')
obj = pickle.load(pkl_file)
pkl_file.close()
You'll get an error. I understand that without hey.py, with the class defined at its top level, pickle probably can't load the object directly. But surely I can find out what the attributes of the serialized class are, reconstruct the deleted file, and then load the object. I have pickles that are 2 years old, I have deleted the file that I used to construct them, and I just need to find out what the attributes of those classes are so that I can reopen these pickles.
#####UPDATE
I know from the error messages the name of the module that originally contained the old class; let's just call it 'hey.py'. And I know the name of the class; let's call it 'you'. But even after recreating the module and building a class called 'you' I still can't get the pickle to open. So I wrote this code in the hey.py module:
class hey:
    def __init__(self):
        self.hey = 1
    def __setstate__(self):
        self.__dict__ = ''
        self.you = 1
But I get the error message: TypeError: __init__() takes 1 positional argument but 2 were given
#########UPDATE 2:
I changed the code from
class hey:
to
class hey():
I then got an AttributeError but it doesn't tell me what attribute is missing. I then performed
obj= pickletools.dis(file)
And got an error on the pickletools.py file here
def _genops(data, yield_end_pos=False):
    if isinstance(data, bytes_types):
        data = io.BytesIO(data)
    if hasattr(data, "tell"):
        getpos = data.tell
    else:
        getpos = lambda: None
    while True:
        pos = getpos()
        code = data.read(1)
        opcode = code2op.get(code.decode("latin-1"))
        if opcode is None:
            if code == b"":
                raise ValueError("pickle exhausted before seeing STOP")
            else:
                raise ValueError("at position %s, opcode %r unknown" % (
                                 "<unknown>" if pos is None else pos,
                                 code))
        if opcode.arg is None:
            arg = None
        else:
            arg = opcode.arg.reader(data)
        if yield_end_pos:
            yield opcode, arg, pos, getpos()
        else:
            yield opcode, arg, pos
        if code == b'.':
            assert opcode.name == 'STOP'
            break
At this line:
code = data.read(1)
saying: AttributeError: 'str' object has no attribute 'read'
I will now try the other functions in pickletools.
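The AttributeError above is consistent with a path string being passed where pickletools expects pickle bytes or a binary file object (the _genops source shown only wraps bytes and otherwise calls data.read). A minimal sketch, assuming a throwaway file named example.pkl created just for the demonstration:

```python
import io
import os
import pickle
import pickletools
import tempfile

payload = pickle.dumps({"a": 1, "b": [1, 2]})

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file name, written only so there is something to open.
    path = os.path.join(tmp, "example.pkl")
    with open(path, "wb") as f:
        f.write(payload)

    # Pass the open binary file (or the raw bytes), not the path string.
    out = io.StringIO()
    with open(path, "rb") as f:
        pickletools.dis(f, out=out)

listing = out.getvalue()

# genops yields (opcode, arg, pos) tuples and also accepts bytes directly.
opcode_names = [opcode.name for opcode, arg, pos in pickletools.genops(payload)]
```

Neither dis nor genops imports the pickled classes, so they can be used on a pickle whose defining module is gone.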
########### UPDATE 3
I wanted to see what happened when I saved an object composed mostly of dictionaries, where some of the values in the dictionaries were class instances. This is the class that was saved:
class fss(frozenset):
    def __init__(self, *args, **kwargs):
        super(frozenset, self).__init__()
    def __str__(self):
        str1 = lbr + "{}" + rbr
        return str1.format(','.join(str(x) for x in self))
Now keep in mind that the object pickled is mostly a dictionary and that class exists within the dictionary. After performing
obj= pickletools.genops(file)
I get the following output:
[two screenshots of the genops output; images not available]
I don't see how I would be able to construct the class referred to with that data if I hadn't known what the class was.
############### UPDATE #4
@AKX
Thanks for helping me out. I can see how your code works, but I cannot open my pickled file, which was saved 2 years ago and whose module and class have long since been deleted, as a bytes-like object, which seems to be required.
So the path of the file is
file ='hey.pkl'
pkl_file = open(file, 'rb')
x = MagicUnpickler(io.BytesIO(pkl_file)).load()
This returns the error:
TypeError: a bytes-like object is required, not '_io.BufferedReader'
But I thought the object was a bytes object since I opened it with open(file, 'rb')
############ UPDATE #5
Actually, I think with AKX's help I've solved the problem.
So using the code:
pkl_file = open(name, 'rb')
x = MagicUnpickler(pkl_file).load()
I then created two blank modules which once contained the classes found in the saved pickle, but I did not have to put the classes in them. I was getting an error in the file pickle.py here:
def load_reduce(self):
    stack = self.stack
    args = stack.pop()
    func = stack[-1]
    try:
        stack[-1] = func(*args)
    except TypeError:
        pass
dispatch[REDUCE[0]] = load_reduce
So after excepting that error, everything worked. I really want to thank AKX for helping me out. I have actually been trying to solve this problem for about 5 years because I use pickles far more often than most programmers. I used to not understand that if you alter a class then that ruins any pickled files saved with that class so I ran into this problem again and again. But now that I'm going back over some code which is 2 years old and it looks like some of the files were deleted, I'm going to need this code a lot in the future. So I really appreciate your help in getting this problem solved.
Well, with a bit of hacking and magic, sure, you can hydrate missing classes, but I'm not guaranteeing this will work for all pickle data you may encounter; for one, this doesn't touch the __setstate__/__reduce__ protocols, so I don't know if they work.
Given a script file (so72863050.py in my case):
import io
import pickle
import types
from logging import Formatter
# Create a couple empty classes. Could've just used `class C1`,
# but we're coming back to this syntax later.
C1 = type('C1', (), {})
C2 = type('C2', (), {})
# Create an instance or two, add some data...
inst = C1()
inst.child1 = C2()
inst.child1.magic = 42
inst.child2 = C2()
inst.child2.mystery = 'spooky'
inst.child2.log_formatter = Formatter('heyyyy %(message)s') # To prove we can unpickle regular classes still
inst.other_data = 'hello'
inst.some_dict = {'a': 1, 'b': 2}
# Pickle the data!
pickle_bytes = pickle.dumps(inst)
# Let's erase our memory of these two classes:
del C1
del C2
try:
    print(pickle.loads(pickle_bytes))
except Exception as exc:
    pass  # Can't get attribute 'C1' on <module '__main__'>: yep, it certainly isn't there!
We have now successfully created some pickle data that we can't load anymore, since we forgot about those two classes. Now, since the unpickling mechanism is customizable, we can derive a magic unpickler that, in the face of certain defeat (or at least an AttributeError), synthesizes a simple class from thin air:
# Could derive from Unpickler, but that may be a C class, so our tracebacks would be less helpful
class MagicUnpickler(pickle._Unpickler):
    def __init__(self, fp):
        super().__init__(fp)
        self._magic_classes = {}
    def find_class(self, module, name):
        try:
            return super().find_class(module, name)
        except AttributeError:
            return self._create_magic_class(module, name)
    def _create_magic_class(self, module, name):
        cache_key = (module, name)
        if cache_key not in self._magic_classes:
            cls = type(f'<<Emulated Class {module}:{name}>>', (types.SimpleNamespace,), {})
            self._magic_classes[cache_key] = cls
        return self._magic_classes[cache_key]
Now, when we run that magic unpickler against a stream from the aforebuilt pickle_bytes that plain ol' pickle.loads() couldn't load...
x = MagicUnpickler(io.BytesIO(pickle_bytes)).load()
print(x)
print(x.child1.magic)
print(x.child2.mystery)
print(x.child2.log_formatter._style._fmt)
prints out
<<Emulated Class __main__:C1>>(child1=<<Emulated Class __main__:C2>>(magic=42), child2=<<Emulated Class __main__:C2>>(mystery='spooky'), other_data='hello', some_dict={'a': 1, 'b': 2})
42
spooky
heyyyy %(message)s
Hey, magic!
The error in the function load_reduce(self) can be reproduced with:
class Y(set):
    pass
pickle_bytes = io.BytesIO(pickle.dumps(Y([2, 3, 4, 5])))
del Y
print(MagicUnpickler(pickle_bytes).load())
AKX's answer does not solve cases where the class inherits from built-in base classes such as set, dict, list, ...
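A possible variation that tolerates the set case is sketched below. It is an illustration under assumptions, not part of either answer: the stand-in class accepts whatever constructor arguments the pickle replays and keeps them in a made-up attribute, reduce_args, and it also catches ImportError so a module that no longer exists at all is handled. The module name hypothetical_gone_module is invented to simulate a deleted file.

```python
import io
import pickle
import sys
import types


class TolerantUnpickler(pickle._Unpickler):
    """Sketch: stand-in classes that accept replayed constructor arguments."""

    def __init__(self, fp):
        super().__init__(fp)
        self._stand_ins = {}

    def find_class(self, module, name):
        try:
            return super().find_class(module, name)
        except (AttributeError, ImportError):
            # AttributeError: module exists but the class is gone.
            # ImportError: the module itself is gone.
            return self._stand_in(module, name)

    def _stand_in(self, module, name):
        key = (module, name)
        if key not in self._stand_ins:
            def _init(self, *args, **kwargs):
                # Keep the arguments instead of rejecting them.
                self.reduce_args = args
                self.__dict__.update(kwargs)
            self._stand_ins[key] = type(f"Emulated_{name}", (), {"__init__": _init})
        return self._stand_ins[key]


# Simulate a class whose module was deleted after pickling.
mod = types.ModuleType("hypothetical_gone_module")


class Y(set):
    pass


Y.__module__ = mod.__name__
Y.__qualname__ = "Y"
mod.Y = Y
sys.modules[mod.__name__] = mod
pickle_bytes = pickle.dumps(Y([2, 3]))
del sys.modules[mod.__name__]  # the module can no longer be imported

restored = TolerantUnpickler(io.BytesIO(pickle_bytes)).load()
```

The restored object is not a set; the original elements are only available as restored.reduce_args[0]. Pickles that rebuild an object through a built-in base's __new__ (for example older protocols with list or dict subclasses) are not covered by this sketch.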

How can I create a dead weakref in python?

Is there a better way of doing this than:
import weakref
def create_expired_weakref():
    class Tmp: pass
    ref = weakref.ref(Tmp())
    assert ref() is None
    return ref
Context: I want a default state for my weakref, so that my class can do:
def __init__(self):
    self._ref = create_expired_weakref()
def get_thing(self):
    r = self._ref() # I need an empty weakref for this to work the first time
    if r is None:
        r = SomethingExpensive()
        self._ref = weakref.ref(r)
    return r
Another approach is to use duck typing here. If all you care about is that it behaves like a dead weakref with respect to the self._ref() call, then you can do
self._ref = lambda : None
This is what I ended up using when I had a similar desire to have a property that would return a cached value if it was available, but None otherwise. I initialized it with this lambda function. Then the property was
@property
def ref(self):
    return self._ref()
Update: Credit to @Aran-Fey, who I see posted this idea as a comment to the question, rather than as an answer.
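Put together with the get_thing() method from the question, the duck-typed default might look like the following sketch. Expensive and Holder are hypothetical names standing in for SomethingExpensive and the asker's class:

```python
import weakref


class Expensive:
    """Hypothetical stand-in for SomethingExpensive."""


class Holder:
    def __init__(self):
        # Not a weakref at all, but calling it returns None,
        # which is all get_thing() relies on.
        self._ref = lambda: None

    def get_thing(self):
        r = self._ref()
        if r is None:
            r = Expensive()
            self._ref = weakref.ref(r)
        return r


holder = Holder()
first = holder.get_thing()
second = holder.get_thing()  # same object while `first` keeps it alive
```

The trade-off is that self._ref is not an instance of weakref.ref until the first call, so code that checks its type would need the real expired weakref instead.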
Use you a weakref.finalize for great good:
import weakref
def create_expired_weakref(type_=type("", (object,), {'__slots__': ('__weakref__',)})):
    obj = type_()
    ref = weakref.ref(obj)
    collected = False
    def on_collect():
        nonlocal collected
        collected = True
    final = weakref.finalize(obj, on_collect)
    del obj
    while not collected:
        pass
    return ref
This might block the thread for a while if you're debugging, and might even deadlock in some obscure situations, but it's guaranteed to return an expired weakref.

Python Pickle not saving entire object

I'm trying to pickle a list of objects where each object contains a list. When I open the pickled file I can see all the data in my objects except the list. I'm putting code below so this makes more sense.
Object that contains a list:
class TestPickle:
    testNumber = None
    testList = []
    def addNumber(self, value):
        self.testNumber = value
    def getNumber(self):
        return self.testNumber
    def addTestList(self, value):
        self.testList.append(value)
    def getTestList(self):
        return self.testList
In this example I create a list of the above object (I'm adding one object to keep it brief):
import os
import pickle
testPKL = TestPickle()
testList = []
testPKL.addNumber(12)
testPKL.addTestList(1)
testPKL.addTestList(2)
testList.append(testPKL)
with open(os.path.join(os.path.curdir, 'test.pkl'), 'wb') as f:
    pickle.dump(testList, f)
Here is an example of me opening the pickled file and trying to access the data. I can only retrieve the testNumber from above; the testList comes back as an empty list.
pklResult = None
with open(os.path.join(os.path.curdir, 'test.pkl'), 'rb') as f:
    pklResult = pickle.load(f)
for result in pklResult:
    print(result.getNumber()) # prints 12
    print(result.testNumber) # prints 12
    print(result.getTestList()) # prints []
    print(result.testList) # prints []
I think i'm missing something obvious here but I'm not having any luck spotting it. Thanks for any guidance.
testNumber and testList are both class attributes initially. addNumber assigns to self.testNumber, and that assignment creates a new instance attribute. addTestList only calls append on the existing list, which modifies it in place, so no instance attribute is created and testList remains a class attribute.
You can verify it:
print(testPKL.__dict__)
{'testNumber': 12}
print(result.__dict__)
{'testNumber': 12}
So when you access result.testList, it looks for class attribute TestPickle.testList, which is [] in your case.
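The difference can be seen without pickle at all, since by default pickle stores an instance's own __dict__ and not the attributes of its class. A minimal sketch with two hypothetical classes, Shared and PerInstance:

```python
class Shared:
    items = []  # class attribute: one list shared by every instance

    def add(self, value):
        self.items.append(value)  # mutates the class-level list in place


class PerInstance:
    def __init__(self):
        self.items = []  # instance attribute: stored in the instance __dict__

    def add(self, value):
        self.items.append(value)


a = Shared()
a.add(1)
b = PerInstance()
b.add(1)

shared_state = dict(a.__dict__)        # nothing for pickle to save
per_instance_state = dict(b.__dict__)  # the list travels with the instance
```

Only the second form puts the list where pickle will find it.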
Solution
You are storing an instance in the pickle, so use instance attributes. Modify the TestPickle class as below:
class TestPickle:
    def __init__(self):
        self.testNumber = None
        self.testList = []
    def addNumber(self, value):
        self.testNumber = value
    def getNumber(self):
        return self.testNumber
    def addTestList(self, value):
        self.testList.append(value)
    def getTestList(self):
        return self.testList

Unpickling "None" object in Python

I am using redis to try to save a request's session object. Based on how to store a complex object in redis (using redis-py), I have:
def get_object_redis(key, r):
    saved = r.get(key)
    obj = pickle.loads(saved)
    return obj
redis = Redis()
s = get_object_redis('saved', redis)
I have situations where there is no saved session and 'saved' evaluates to None. In this case I get:
TypeError: must be string or buffer, not None
What's the best way to deal with this?
There are several ways to deal with it. This is what they would have in common:
def get_object_redis(key, r):
    saved = r.get(key)
    if saved is None:
        # maybe add code here
        return ... # return something you expect
    obj = pickle.loads(saved)
    return obj
You need to make it clear what you expect if a key is not found.
Version 1
An example would be you just return None:
def get_object_redis(key, r):
    saved = r.get(key)
    if saved is None:
        return None
    obj = pickle.loads(saved)
    return obj
redis = Redis()
s = get_object_redis('saved', redis)
s is then None. This may be bad because you need to handle that somewhere and you do not know whether it was not found or it was found and really None.
Version 2
You create an object, maybe based on the key, that you can construct because you know what lies behind a key.
class KeyWasNotFound(object):
    # just an example class
    # maybe you have something useful in mind
    def __init__(self, key):
        self.key = key
def get_object_redis(key, r):
    saved = r.get(key)
    if saved is None:
        return KeyWasNotFound(key)
    obj = pickle.loads(saved)
    return obj
Usually, if identity is important, you would store the object after you created it, to return the same object for the key.
Version 3
TypeError is a very generic error. You can create your own error class. This would be the preferred way for me, because I do not like version 1 and do not know which object would be useful to return.
class NoRedisObjectFoundForKey(KeyError):
    pass
def get_object_redis(key, r):
    saved = r.get(key)
    if saved is None:
        raise NoRedisObjectFoundForKey(key)
    obj = pickle.loads(saved)
    return obj
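On the calling side, version 3 might be used as in the sketch below. FakeRedis is a hypothetical in-memory stand-in with only get and set, included so the example runs without a Redis server; a real client is assumed to return None from get for a missing key, as described in the question.

```python
import pickle


class NoRedisObjectFoundForKey(KeyError):
    pass


class FakeRedis:
    """Hypothetical in-memory stand-in for a Redis client (get/set only)."""

    def __init__(self):
        self._values = {}

    def get(self, key):
        return self._values.get(key)  # None when the key is missing

    def set(self, key, value):
        self._values[key] = value


def get_object_redis(key, r):
    saved = r.get(key)
    if saved is None:
        raise NoRedisObjectFoundForKey(key)
    return pickle.loads(saved)


r = FakeRedis()
r.set("saved", pickle.dumps({"user": "alice"}))

session = get_object_redis("saved", r)

try:
    other = get_object_redis("missing", r)
except NoRedisObjectFoundForKey:
    other = {}  # the caller decides what a missing session means
```

Because the exception derives from KeyError, existing handlers that catch KeyError will also catch it.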
