It seems that Python has some limitations regarding instance methods.
Instance methods can't be copied.
Instance methods can't be pickled.
This is problematic for me because I work on a very object-oriented project in which I reference instance methods, and I use both deepcopying and pickling. The pickling is done mostly by the multiprocessing machinery.
What would be a good way to solve this? I have an ugly workaround for the copying issue, but I'm looking for a nicer solution to both problems.
Does anyone have any suggestions?
Update:
My use case: I have a tiny event system. Each event has an .action attribute that points to a function it's supposed to trigger, and sometimes that function is an instance method of some object.
You might be able to do this using copy_reg.pickle. In Python 2.6:
import copy_reg
import types
def reduce_method(m):
    return (getattr, (m.__self__, m.__func__.__name__))

copy_reg.pickle(types.MethodType, reduce_method)
This does not store the code of the method, just its name; but that will work correctly in the common case.
This makes both pickling and copying work!
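As a quick sanity check, here's a sketch of what should now work after the registration above (Counter is a made-up class; note that the instance itself must be picklable, since only the method's name travels with it):

import copy
import pickle

class Counter(object):          # hypothetical example class
    def __init__(self):
        self.n = 0
    def bump(self):
        self.n += 1

c = Counter()
m = pickle.loads(pickle.dumps(c.bump))   # round-trips as getattr(<unpickled c>, 'bump')
m()                                      # bound to the unpickled Counter
copied = copy.deepcopy(c.bump)           # deepcopy goes through the same reducer
assert copied.__self__ is not c          # bound to a fresh deep-copied instance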
REST - Representational State Transfer. Just send state, not methods.
To transfer an object X from A to B, we do this:

1. A encodes the state of X in some handy, easy-to-parse notation. JSON is popular.
2. A sends the JSON text to B.
3. B decodes the state of X from the JSON notation, reconstructing X.
B must have the class definition for X's class for this to work, along with all the functions and other class definitions on which X's class depends. In short, both A and B have all the definitions; only a representation of the object's state gets moved around.
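A minimal sketch of that round trip (Point is a hypothetical class that both A and B would define):

import json

class Point(object):                 # hypothetical class known to both A and B
    def __init__(self, x, y):
        self.x = x
        self.y = y

# Side A: encode only the state
payload = json.dumps(Point(3, 4).__dict__)    # e.g. '{"y": 4, "x": 3}'

# Side B: decode the state and reconstruct the object
state = json.loads(payload)
p = Point(state['x'], state['y'])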
See any article on REST.
http://en.wikipedia.org/wiki/Representational_State_Transfer
http://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm
Pickle the instance and then access the method after unpickling it. Pickling a method of an instance doesn't make sense on its own, because the method relies on the instance. If it doesn't, write it as an independent function.
import pickle
class A:
    def f(self):
        print 'hi'

x = A()
f = open('tmp', 'wb')
pickle.dump(x, f)
f.close()

f = open('tmp', 'rb')
pickled_x = pickle.load(f)
pickled_x.f()
I want to implement pickling support for objects belonging to my extension library. There is a global instance of class Service initialized at startup. All these objects are produced as a result of some Service method invocations and essentially belong to it. Service knows how to serialize them into binary buffers and how to deserialize buffers back into objects.

It appeared that Python's __reduce__ should serve my purpose of implementing pickling support. I started implementing it and realized that there is an issue with the unpickler (the first element of the tuple expected to be returned by __reduce__). This unpickle function needs an instance of Service to be able to convert an input buffer into an Object. Here is a bit of pseudo code to illustrate the issue:
class Service(object):
    ...
    def pickleObject(self, obj):
        # do serialization here and return buffer
        ...
    def unpickleObject(self, buffer):
        # do deserialization here and return new Object
        ...

class Object(object):
    ...
    def __reduce__(self):
        return self.service().unpickleObject, (self.service().pickleObject(self),)
Note the first element in the tuple. The Python pickler does not like it: it says it is an instancemethod and can't be pickled. Obviously the pickler is trying to store the routine into the output and wants the Service instance along with the function name, but this is not what I want to happen. I do not want (and really can't: Service is not picklable) to store the service along with all the objects. I want the Service instance to be created before pickle.load is invoked, and somehow have that instance get used during unpickling.
This is where I came across the copy_reg module. Again, it appeared that it should solve my problems. This module allows you to register pickler and unpickler routines per type dynamically, and these are supposed to be used later for objects of that type. So I added this registration to the Service constructor:
class Service(object):
    ...
    def __init__(self):
        ...
        import copy_reg
        copy_reg.pickle(mymodule.Object, self.pickleObject, self.unpickleObject)
self.unpickleObject is now a bound method taking the service as the (implicit) first parameter and the buffer as the second. self.pickleObject is also a bound method, taking the service and the object to pickle. copy_reg requires that the pickleObject routine follow reducer semantics and return a tuple similar to the one before. And here the problem arose again: what should I return as the first tuple element?
class Service(object):
    ...
    def pickleObject(self, obj):
        ...
        return self.unpickleObject, (self.serialize(obj),)
In this form pickle again complains that it can't pickle an instancemethod. I tried None - it does not like that either. I put in some dummy function. That works - meaning the serialization phase went through fine - but during unpickling it calls this dummy function instead of the unpickler I registered for the type mymodule.Object in the Service constructor.

So now I am at a loss. Sorry for the long explanation; I did not know how to ask this question in a few lines. I can summarize my questions like this:
Why do copy_reg semantics require me to return an unpickler routine from pickleObject if I am expected to register one independently?
Is there any reason to prefer the copy_reg.constructor interface for registering an unpickler routine?
How do I make pickle use the unpickler I registered instead of the one inside the stream?
What should I return as the first element in the tuple from pickleObject? Is there a "correct" value?
Do I approach this whole thing correctly? Is there a different/simpler solution?
First of all, the copy_reg module is unlikely to help you much here: it is primarily a way to add __reduce__-like features to classes that don't have that method (e.g. if you want to pickle objects from some library that doesn't natively support it), rather than offering any special abilities.
The callable returned by __reduce__ needs to be locatable in the environment where the object is to be unpickled, so an instance method isn't really appropriate. As mentioned in the pickle documentation:

In the unpickling environment this object must be either a class, a callable registered as a "safe constructor" (see below), or it must have an attribute __safe_for_unpickling__ with a true value.
So if you define a function (not method) as follows:

def _unpickle_service_object(buffer):
    # Grab the global service object, however that is accomplished
    service = get_global_service_object()
    return service.unpickleObject(buffer)

_unpickle_service_object.__safe_for_unpickling__ = True
You could now use this _unpickle_service_object function in the return value of your __reduce__ methods, so that your objects link to the new environment's global Service object when unpickled.
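Tying that back to the question's pseudo code, __reduce__ might then look something like this (a sketch that assumes, as in the question, that self.service() returns the global Service instance):

class Object(object):
    def __reduce__(self):
        # Ship only the serialized buffer; the unpickling side supplies
        # its own global Service inside _unpickle_service_object.
        return (_unpickle_service_object, (self.service().pickleObject(self),))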
I really hope this is not a question posed by millions of newbies, but my search didn't really give me a satisfying answer.

So my question is fairly simple: are classes basically containers for functions with their own namespace? What other purpose do they have besides providing a separate namespace and holding functions while making them callable as class attributes? I'm asking in a Python context.
Oh and thanks for the great help most of you have been!
More importantly than holding functions, class instances hold data attributes, allowing you to define new data types beyond what is built into the language; and they support inheritance and duck typing.
For example, here's a moderately useful class. Since Python files (created with open) don't remember their own name, let's make a file class that does.
class NamedFile(object):
    def __init__(self, name):
        self._f = open(name)
        self.name = name
    def readline(self):
        return self._f.readline()
Had Python not had classes, you'd probably be working with dicts instead:
def open_file(name):
    return {"name": name, "f": open(name)}
Needless to say, calling myfile["f"].readline() all the time will make your fingers hurt at some point. You could of course introduce a readline function in a NamedFile module (namespace), but then you'd always have to use that exact function. By contrast, NamedFile instances can be used anywhere you need an object with a readline method, so they are a plug-in replacement for file in many situations. That's called polymorphism, one of the biggest benefits of OO/class-based programming.
(Also, dict is a class, so using it violates the assumption that there are no classes :)
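To make the polymorphism point concrete, here's a small usage sketch (it assumes a file named data.txt exists; the helper name is made up):

def first_line(source):
    # Anything with a readline method will do: a real file object
    # or a NamedFile alike.
    return source.readline()

print first_line(open('data.txt'))        # plain file
print first_line(NamedFile('data.txt'))   # the wrapper, used interchangeably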
In most languages, classes are just pieces of code that describe how to produce an object. That's kinda true in Python too:
>>> class ObjectCreator(object):
... pass
...
>>> my_object = ObjectCreator()
>>> print my_object
<__main__.ObjectCreator object at 0x8974f2c>
But classes are more than that in Python. Classes are objects too.
Yes, objects.
As soon as you use the keyword class, Python executes it and creates an OBJECT. The instruction:
>>> class ObjectCreator(object):
... pass
...
creates in memory an object with the name ObjectCreator.
This object (the class) is itself capable of creating objects (the instances), and this is why it's a class.
But still, it's an object, and therefore:
you can assign it to a variable
you can copy it
you can add attributes to it
you can pass it as a function parameter
e.g.:
>>> print ObjectCreator # you can print a class because it's an object
<class '__main__.ObjectCreator'>
>>> def echo(o):
... print o
...
>>> echo(ObjectCreator) # you can pass a class as a parameter
<class '__main__.ObjectCreator'>
>>> print hasattr(ObjectCreator, 'new_attribute')
False
>>> ObjectCreator.new_attribute = 'foo' # you can add attributes to a class
>>> print hasattr(ObjectCreator, 'new_attribute')
True
>>> print ObjectCreator.new_attribute
foo
>>> ObjectCreatorMirror = ObjectCreator # you can assign a class to a variable
>>> print ObjectCreatorMirror.new_attribute
foo
>>> print ObjectCreatorMirror()
<__main__.ObjectCreator object at 0x8997b4c>
Classes (or objects) are used to provide encapsulation of data and of the operations that can be performed on that data.

They don't exist to provide namespacing in Python per se; module imports provide the same kind of namespacing, and a module can be entirely functional rather than object-oriented.
You might gain some benefit from looking at OOP With Python; Dive Into Python, Chapter 5 (Objects and Object-Orientation); or even just the Wikipedia article on object-oriented programming.
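For example, a plain module already gives you a namespace with no classes involved (geometry here is a made-up module name):

# geometry.py -- an entirely functional module, no classes
def area(width, height):
    return width * height

# client code elsewhere:
import geometry
print geometry.area(3, 4)    # namespaced access without any class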
A class is the definition of an object. In this sense, the class provides a namespace of sorts, but that is not the true purpose of a class. The true purpose is to define what the object will 'look like' - what the object is capable of doing (methods) and what it will know (properties).
Note that my answer is intended to provide a sense of understanding on a relatively non-technical level, which is what my initial trouble was with understanding classes. I'm sure there will be many other great answers to this question; I hope this one adds to your overall understanding.
Is there any way to get the original object from a weakproxy pointing to it? E.g., is there an inverse to weakref.proxy()?

A simplified example (Python 2.7):
import weakref
class C(object):
    def __init__(self, other):
        self.other = weakref.proxy(other)

class Other(object):
    pass
others = [Other() for i in xrange(3)]
my_list = [C(others[i % len(others)]) for i in xrange(10)]
I need to get the list of unique other members from my_list. The way I prefer for such tasks is to use a set:
unique_others = {x.other for x in my_list}
Unfortunately this throws TypeError: unhashable type: 'weakproxy'
I have managed to solve the specific problem in an imperative way (slow and dirty):

unique_others = []
for x in my_list:
    if x.other in unique_others:
        continue
    unique_others.append(x.other)
but the general problem noted in the title still stands.

What if I have only my_list under control, and the others are buried in some lib and someone may delete them at any time, and I want to prevent the deletion by collecting non-weak refs in a list?

Or I may want to get the repr() of the object itself, not <weakproxy at xx to Other at xx>.

I guess there should be something like weakref.unproxy that I'm not aware of.
I know this is an old question, but I was looking for an answer recently and came up with something. Like others said, there is no documented way to do it, and looking at the implementation of the weakproxy type confirms that there is no standard way to achieve this.
My solution uses the fact that all Python objects have a set of standard methods (like __repr__) and that bound method objects contain a reference to the instance (in the __self__ attribute).
Therefore, by dereferencing the proxy to get the method object, we can get a strong reference to the proxied object from the method object.
Example:
>>> def func():
... pass
...
>>> weakfunc = weakref.proxy(func)
>>> f = weakfunc.__repr__.__self__
>>> f is func
True
Another nice thing is that it will work for strong references as well:
>>> func.__repr__.__self__ is func
True
So there's no need for type checks when either a proxy or a strong reference could be expected.
Edit:
I just noticed that this doesn't work for proxies of classes, so it's not universal.
Basically there is something like weakref.unproxy, but it's just spelled weakref.ref(x)(). The proxy object is only there for delegation, and the implementation is rather shaky...
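For instance, a minimal sketch of that weakref.ref spelling (the class A here is just for illustration):

import weakref

class A(object):
    pass

a = A()
r = weakref.ref(a)    # keep a ref instead of a proxy
assert r() is a       # calling the ref "unproxies" it: you get the referent
del a                 # once the referent is collected (immediately in CPython)...
assert r() is None    # ...the ref returns None instead of raising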
The == operator doesn't work as you would expect:
>>> weakref.proxy(object) == object
False
>>> weakref.proxy(object) == weakref.proxy(object)
True
>>> weakref.proxy(object).__eq__(object)
True
However, I see that you don't want to call weakref.ref objects all the time. A good working proxy with dereference support would be nice.
But at the moment, this is just not possible. If you look into the Python builtin source code, you see that you need something like PyWeakref_GetObject, but there is just no call to this function at all (and it raises PyErr_BadInternalCall if the argument is wrong, so it seems to be an internal function). PyWeakref_GET_OBJECT is used much more, but there is no method in weakref.py that would be able to do that.

So, sorry to disappoint you, but weakref.proxy is just not what most people would want for their use cases. You can, however, make your own proxy implementation. It isn't too hard: just use weakref.ref internally and override __getattr__, __repr__, etc.
A little sidenote on how PyCharm is able to produce the normal repr output (because you mentioned that in a comment):
>>> class A(): pass
>>> a = A()
>>> weakref.proxy(a)
<weakproxy at 0x7fcf7885d470 to A at 0x1410990>
>>> weakref.proxy(a).__repr__()
'<__main__.A object at 0x1410990>'
>>> type( weakref.proxy(a))
<type 'weakproxy'>
As you can see, calling the original __repr__ can really help!
weakref.ref is hashable, whereas weakref.proxy is not. The API doesn't say anything about how you can actually get a handle on the object a proxy points to. With a weakref, it's easy: you can just call it. As such, you can roll your own proxy-like class... Here's a very basic attempt:
import weakref
class C(object):
    def __init__(self, obj):
        self.object = weakref.ref(obj)
    def __getattr__(self, key):
        # __getattr__ is only called when normal lookup fails; self.object
        # lives in __dict__, so it is found directly and never lands here.
        obj = object.__getattribute__(self, "object")()  # dereference the weakref
        return getattr(obj, key)

class Other(object):
    pass
others = [Other() for i in range(3)]
my_list = [C(others[i % len(others)]) for i in range(10)]
unique_list = {x.object for x in my_list}
Of course, now unique_list contains refs, not proxies, which is fundamentally different...
I know that this is an old question, but I've been bitten by it (so: there really is no 'unproxy' in the standard library) and wanted to share my solution...

The way I solved getting the real instance was just creating a property which returns it (although I suggest using weakref.ref instead of weakref.proxy, as code should really check whether the referent is still alive before accessing it, instead of having to remember to catch an exception whenever any attribute is accessed).
Anyways, if you still must use a proxy, the code to get the real instance is:
import weakref
class MyClass(object):
    @property
    def real_self(self):
        return self

instance = MyClass()
proxied = weakref.proxy(instance)
assert proxied.real_self is instance
In my code I'm trying to take copies of instances of a class using copy.deepcopy. The problem is that under some circumstances it errors out with the following:
TypeError: 'object.__new__(NotImplementedType) is not safe, use NotImplementedType.__new__()'
After much digging I have found that I am able to reproduce the error using the following code:
import copy
copy.deepcopy(__builtins__)
The problem appears to be that at some point it is trying to copy the NotImplementedType builtin. The question is: why is it doing this? I have not overridden __deepcopy__ in my class, and it doesn't happen all the time. Does anyone have any tips for tracking down where the request to copy this type comes from?

I've put some debugging code in the copy module itself to confirm that this is what's happening, but the point at which the problem occurs is so far down a recursive stack that it's very hard to make much of what I'm seeing.
In the end I did some digging in the copy source code and came up with the following solution:
from copy import deepcopy, _deepcopy_dispatch
from types import ModuleType
class MyType(object):
    def __init__(self):
        self.module = __builtins__

    def copy(self):
        '''Patch the deepcopy dispatcher to pass modules back unchanged.'''
        _deepcopy_dispatch[ModuleType] = lambda x, m: x
        result = deepcopy(self)
        del _deepcopy_dispatch[ModuleType]
        return result

MyType().copy()
I realise this uses a private API, but I couldn't find another clean way of achieving the same thing. I did a quick search on the web and found that other people had used the same API without any bother. If it changes in the future, I'll take the hit.

I'm also aware that this is not thread-safe (if a thread needed the old behaviour while I was doing a copy on another thread, I'd be screwed), but again, it's not a problem for me right now.
Hope that helps someone else out at some point.
You could override the __deepcopy__ method (from the Python documentation):
In order for a class to define its own copy implementation, it can define special methods __copy__() and __deepcopy__(). The former is called to implement the shallow copy operation; no additional arguments are passed. The latter is called to implement the deep copy operation; it is passed one argument, the memo dictionary. If the __deepcopy__() implementation needs to make a deep copy of a component, it should call the deepcopy() function with the component as first argument and the memo dictionary as second argument.
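A minimal sketch of such an override, assuming the module reference lives in an attribute called module (sys stands in here for whatever module is actually held):

import copy
import sys
from types import ModuleType

class MyType(object):
    def __init__(self):
        self.module = sys          # a module reference that must not be copied
        self.data = [1, 2, 3]

    def __deepcopy__(self, memo):
        clone = type(self).__new__(type(self))
        memo[id(self)] = clone
        for key, value in self.__dict__.items():
            # Pass module references through untouched; deep-copy the rest.
            if isinstance(value, ModuleType):
                clone.__dict__[key] = value
            else:
                clone.__dict__[key] = copy.deepcopy(value, memo)
        return clone

duplicate = copy.deepcopy(MyType())
assert duplicate.module is sys
assert duplicate.data == [1, 2, 3]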
Otherwise you could save the modules in a global list or something else.
You can override the deepcopy behaviour of the class that contains a pointer to a module by using the pickle protocol, which is supported by the copy module, as stated here. In particular, you can define __getstate__ and __setstate__ for that class. E.g.:
>>> class MyClass:
... def __getstate__(self):
... state = self.__dict__.copy()
... del state['some_module']
... return state
... def __setstate__(self, state):
... self.__dict__.update(state)
... self.some_module = some_module
I'd like to serialize Python objects to and from the plist format (this can be done with plistlib). My idea was to write a class PlistObject which wraps other objects:
def __init__(self, anObject):
    self.theObject = anObject
and provides a "write" method:
def write(self, pathOrFile):
    plistlib.writePlist(self.theObject.__dict__, pathOrFile)
Now it would be nice if the PlistObject behaved just like the wrapped object itself, meaning that all attributes and methods are somehow "forwarded" to the wrapped object. I realize that the methods __getattr__ and __setattr__ can be used for complex attribute operations:
def __getattr__(self, name):
    return self.theObject.__getattr__(name)
But then of course I run into the problem that the constructor now produces infinite recursion, since self.theObject = anObject also tries to access the wrapped object.
How can I avoid this? If the whole idea seems like a bad one, tell me too.
Unless I'm missing something, this will work just fine:
def __getattr__(self, name):
    return getattr(self.theObject, name)
Edit: for those thinking that the lookup of self.theObject will result in an infinite recursive call to __getattr__, let me show you:
>>> class Test:
... a = "a"
... def __init__(self):
... self.b = "b"
... def __getattr__(self, name):
... return 'Custom: %s' % name
...
>>> Test.a
'a'
>>> Test().a
'a'
>>> Test().b
'b'
>>> Test().c
'Custom: c'
__getattr__ is only called as a last resort. Since theObject can be found in __dict__, no issues arise.
But then of course I run into the problem that the constructor now produces an infinite recursion, since also self.theObject = anObject tries to access the wrapped object.
That's why the manual suggests that you do this for all "real" attribute accesses:
theobj = object.__getattribute__(self, "theObject")
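In context, that looks something like this (a sketch based on the question's PlistObject):

class PlistObject(object):
    def __init__(self, anObject):
        # Write through the base class so a custom __setattr__ (if any)
        # is never triggered during construction.
        object.__setattr__(self, "theObject", anObject)

    def __getattr__(self, name):
        # Fetch the wrapped object without re-entering __getattr__.
        theobj = object.__getattribute__(self, "theObject")
        return getattr(theobj, name)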
I'm glad to see others have been able to help you with the recursive call to __getattr__. Since you've asked for comments on the general approach of serializing to plist, I just wanted to chime in with a few thoughts.
Python's plist implementation handles basic types only and provides no extension mechanism for you to instruct it on serializing/deserializing complex types. If you define a custom class, for example, writePlist won't be able to help, as you've discovered, since you're passing the instance's __dict__ for serialization.
This has a couple of implications:

You won't be able to use this to serialize any objects that contain other objects of non-basic type without converting them to a __dict__, and so on recursively for the entire object graph.

If you roll your own object-graph walker to serialize all non-basic objects that can be reached, you'll have to worry about cycles in the graph, where one object holds another in a property, which in turn holds a reference back to the first, and so on.
Given that, you may wish to look at pickle instead, as it can handle all of these cases and more. If you need the plist format for other reasons, and you're sure you can stick to "simple" object dicts, then you may wish to just use a simple function: trying to have PlistObject mock every possible function of the contained object is an onion with potentially many layers, as you need to handle all the possibilities of the wrapped instance.
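For instance, pickle already copes with the reference cycles a hand-rolled graph walker would have to detect; a quick check:

import pickle

a = {}
b = {'a': a}
a['b'] = b                                # a cycle: a -> b -> a
restored = pickle.loads(pickle.dumps(a))
assert restored['b']['a'] is restored     # cycle preserved, no infinite loop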
Something as simple as this may be more pythonic, and keeps the usability of the wrapped object simpler by not wrapping it in the first place:

def to_plist(obj, f_handle):
    writePlist(obj.__dict__, f_handle)
I know that doesn't seem very sexy, but in my opinion it is a lot more maintainable than a wrapper, given the severe limits of the plist format, and certainly better than artificially forcing all objects in your application to inherit from a common base class when there's nothing in your business domain that actually indicates those disparate objects are related.