boost python enable_pickling expectation - python

Hi, I am using Python to instantiate a C++ class that is exposed to Python through the Boost.Python library. At the same time, I have a requirement to pickle the Python classes that use this Python-enabled C++ class.
So what I did was add enable_pickling() to an example class definition like this:
class_<pform::base::Price>("Price", init<double>())
.def(self == self)
.def(self_ns::str(self_ns::self)) // __str__
.def("get_value", &pform::base::Price::get_value)
It makes the class picklable. However, I get this error when unpickling it:
Boost.Python.ArgumentError: Python argument types in
Price.__init__(Price)
did not match C++ signature:
__init__(_object*, double)
So what is missing here?

A bit late, but I found the relevant boost documentation for this:
http://www.boost.org/doc/libs/1_64_0/libs/python/doc/html/reference/topics/pickle_support.html
The Pickle Interface
At the user level, the Boost.Python pickle interface involves three special methods:
__getinitargs__: When an instance of a Boost.Python extension class is pickled, the pickler tests if the instance has a __getinitargs__ method. This method must return a Python tuple (it is most convenient to use a boost::python::tuple). When the instance is restored by the unpickler, the contents of this tuple are used as the arguments for the class constructor. If __getinitargs__ is not defined, pickle.load will call the constructor (__init__) without arguments; i.e., the object must be default-constructible.
__getstate__: When an instance of a Boost.Python extension class is pickled, the pickler tests if the instance has a __getstate__ method. This method should return a Python object representing the state of the instance.
__setstate__: When an instance of a Boost.Python extension class is restored by the unpickler (pickle.load), it is first constructed using the result of __getinitargs__ as arguments (see above). Subsequently the unpickler tests if the new instance has a __setstate__ method. If so, this method is called with the result of __getstate__ (a Python object) as the argument.
The three special methods described above may be .def()'ed individually by the user. However, Boost.Python provides an easy to use high-level interface via the boost::python::pickle_suite class that also enforces consistency: __getstate__ and __setstate__ must be defined as pairs. Use of this interface is demonstrated by the following examples.
In your particular example the class is not default constructible as it needs a double argument (which I assume is the "value"). To wrap it for Python you would also need to define:
.def("__getinitargs__", +[](pform::base::Price const& self){
    return boost::python::make_tuple(self.get_value());
})
Now Boost.Python will initialize your class using "value" instead of trying to call the default constructor (pform::base::Price()).
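For readers without a Boost.Python build at hand, the same idea can be sketched in pure Python. This is only an analogue, not Boost.Python code: the Price class below is a hypothetical stand-in for the wrapped C++ class, and __reduce__ plays the role that __getinitargs__ plays in the Boost.Python pickle interface (telling the unpickler which constructor arguments to use).

```python
import pickle


class Price:
    """Hypothetical pure-Python stand-in for the wrapped pform::base::Price."""

    def __init__(self, value):
        # No default constructor: a value is required, as in the C++ class.
        self._value = float(value)

    def get_value(self):
        return self._value

    def __eq__(self, other):
        return isinstance(other, Price) and other.get_value() == self._value

    def __reduce__(self):
        # Tell pickle to rebuild the object by calling Price(value).
        return (type(self), (self.get_value(),))


original = Price(12.5)
restored = pickle.loads(pickle.dumps(original))
```

Without the __reduce__ method the object would still pickle in pure Python, but the sketch makes explicit the step that the C++ class needs: the constructor arguments have to be supplied by the object being pickled.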

Related

Intercept magic method calls in python class

I am trying to make a class that wraps a value that will be used across multiple other objects. For computational reasons, the aim is for this wrapped value to only be calculated once and the reference to the value passed around to its users. I don't believe this is possible in vanilla python due to its object container model. Instead, my approach is a wrapper class that is passed around, defined as follows:
from typing import Any

class DynamicProperty():
    def __init__(self, value = None):
        # Value of the property
        self.value: Any = value
    def __repr__(self):
        # Use value's repr instead
        return repr(self.value)
    def __getattr__(self, attr):
        # Doesn't exist in wrapper, get it from the value
        # instead
        return getattr(self.value, attr)
The following works as expected:
wrappedString = DynamicProperty("foo")
wrappedString.upper() # 'FOO'
wrappedFloat = DynamicProperty(1.5)
wrappedFloat.__add__(2) # 3.5
However, implicitly calling __add__ through normal syntax fails:
wrappedFloat + 2 # TypeError: unsupported operand type(s) for
# +: 'DynamicProperty' and 'float'
Is there a way to intercept these implicit method calls without explicitly defining magic methods for DynamicProperty to call the method on its value attribute?
Talking about "passing by reference" will only confuse you. Keep that terminology to languages where you can have a choice on that, and where it makes a difference. In Python you always pass objects around - and this passing is the equivalent of "passing by reference" - for all objects - from None to int to a live asyncio network connection pool instance.
With that out of the way: the algorithm the language follows to retrieve attributes from an object is complicated and has many details; implementing __getattr__ is just the tip of the iceberg. Reading the document called "Data Model" in its entirety will give you a better grasp of all the mechanisms involved in retrieving attributes.
That said, here is how it works for "magic" or "dunder" methods - (special functions with two underscores before and two after the name): when you use an operator that requires the existence of the method that implements it (like __add__ for +), the language checks the class of your object for the __add__ method - not the instance. And __getattr__ on the class can dynamically create attributes for instances of that class only.
But that is not the only problem: you could create a metaclass (inheriting from type) and put a __getattr__ method on this metaclass. For any query you make from Python code, it would look as if your object had __add__ (or any other dunder method) in its class. However, for dunder methods Python does not go through the normal attribute lookup mechanism: it looks directly at the class to see whether the dunder method is "physically" there. The memory structure that holds each class has slots for each of the possible dunder methods, and they either refer to the corresponding method or are "null" (this is visible when coding in C; on the Python side, the default dir will show these methods when they exist and omit them when they do not). If they are not there, Python will simply report that the object does not implement that operation, period.
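A small experiment can make this concrete. The sketch below is illustrative only; the names Meta and Thing are made up for the example. It puts __getattr__ on a metaclass so that explicit attribute queries on the class appear to find __add__, and then shows that the + operator still fails because the slot on the class was never filled.

```python
class Meta(type):
    def __getattr__(cls, name):
        # Only reached when normal lookup on the class fails.
        if name == "__add__":
            return lambda self, other: 42
        raise AttributeError(name)


class Thing(metaclass=Meta):
    pass


t = Thing()

# Explicit lookup goes through Meta.__getattr__ and "finds" __add__.
looks_present = hasattr(Thing, "__add__")
explicit_result = Thing.__add__(t, 1)

# The operator consults the type's slot directly and ignores __getattr__.
try:
    t + 1
    operator_failed = False
except TypeError:
    operator_failed = True
```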
The way to work around that with a proxy object like you want is to create a proxy class that either features the dunder methods from the class you want to wrap, or features all possible methods, and upon being called, check if the underlying object actually implements the called method.
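A minimal sketch of that workaround follows. It is one possible way to do it, not the code from any particular package: the helper _forward and the list of forwarded names are assumptions chosen for illustration, and a real proxy would need a much longer list of dunder methods.

```python
class DynamicProperty:
    def __init__(self, value=None):
        self.value = value

    def __repr__(self):
        return repr(self.value)

    def __getattr__(self, attr):
        # Ordinary (non-dunder) attributes are still delegated lazily.
        return getattr(self.value, attr)


def _forward(name):
    """Build a dunder method that delegates to the wrapped value."""
    def method(self, *args):
        target = getattr(type(self.value), name, None)
        if target is None:
            # The wrapped object does not implement this operation.
            raise TypeError(
                f"{type(self.value).__name__!r} object does not support {name}"
            )
        return target(self.value, *args)
    method.__name__ = name
    return method


# Dunder methods must exist on the class itself for operators to find them.
for _name in ("__add__", "__radd__", "__mul__", "__len__", "__getitem__"):
    setattr(DynamicProperty, _name, _forward(_name))

wrapped_float = DynamicProperty(1.5)
wrapped_string = DynamicProperty("foo")
```

With this in place, wrapped_float + 2 and 2 + wrapped_float both reach float's implementation, while len(wrapped_float) raises TypeError because float has no __len__.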
That is why "serious" code will rarely, if ever, offer true "transparent" proxy objects. There are exceptions, but from "Weakrefs", to "super()", to concurrent.futures, just to mention a few in the core language and stdlib, no one attempts a "fully working transparent proxy" - instead, the api is more like you call a ".value()" or ".result()" method on the wrapper to get to the original object itself.
However, it can be done, as I described above. I even have a small (long unmaintained) package on pypi that does that, wrapping a proxy for a future.
The code is at https://bitbucket.org/jsbueno/lelo/src/master/lelo/_lelo.py
The + operator in your case does not work, because DynamicProperty does not inherit from float. See:
>>> class Foo(float):
...     pass
...
>>> Foo(1.5) + 2
3.5
So, you'll need to do some kind of dynamic inheritance:
def get_dynamic_property(instance):
    base = type(instance)
    class DynamicProperty(base):
        pass
    return DynamicProperty(instance)

wrapped_string = get_dynamic_property("foo")
print(wrapped_string.upper())
wrapped_float = get_dynamic_property(1.5)
print(wrapped_float + 2)
Output:
FOO
3.5

Define @property on function

In JavaScript, we can do the following to any object or function
const myFn = () => {};
Object.defineProperties(myFn, {
  property: {
    get: () => console.log('property accessed')
  }
});
This will allow for a @property-like syntax by defining a getter function for the property named property.
myFn.property
// property accessed
Is there anything similar for functions in Python?
I know we can't use property since it's not a new-style class, and assigning a lambda with setattr will not work since it'll be a function.
Basically, what I want to achieve is for my_fn.property to return a new instance of another class on each access.
What I currently have with setattr is this
setattr(my_fn, 'property', OtherClass())
My hopes are to design an API that looks like this my_fn.property.some_other_function().
I would prefer using a function as my_fn and not an instance of a class, even though I realize that it might be easier to implement.
Below is the gist of what I'm trying to achieve
def my_fn():
    pass
my_fn = property('property', lambda: OtherClass())
my_fn.property
# will be a new instance of OtherClass on each call
It's not possible to do exactly what you want. The descriptor protocol that powers the property built-in is only invoked when:
The descriptor is defined on a class
The descriptor's name is accessed on an instance of said class
Problem is, the class behind functions defined in Python (aptly named function, exposed directly as types.FunctionType or indirectly by calling type() on any function defined at the Python layer) is a single shared, immutable class, so you can't add descriptors to it (and even if you could, they'd become attributes of every Python level function, not just one particular function).
The closest you can get to what you're attempting would be to define a callable class (defining __call__) that defines the descriptor you're interested in as well. Make a single instance of that class (you can throw away the class itself at this point) and it will behave as you expect. Make __call__ a staticmethod, and you'll avoid changing the signature to boot.
For example, the behavior you want could be achieved with:
class my_fn:
    # Note: Using the name "property" for a property has issues if you define
    # other properties later in the class; this is just for illustration
    @property
    def property(self):
        return OtherClass()
    @staticmethod
    def __call__(...whatever args apply; no need for self...):
        ... function behavior goes here ...
my_fn = my_fn()  # Replace class with instance of class that behaves like a function
Now you can call the "function" (really a functor, to use C++ parlance):
my_fn(...)
or access the property, getting a brand new OtherClass each time:
>>> type(my_fn.property) is type(my_fn.property)
True
>>> my_fn.property is my_fn.property
False
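For a version that can be run as-is, here is a sketch with the placeholders filled in. Everything filled in is an assumption made for the example: OtherClass and its some_other_function method are hypothetical stand-ins, and the (x, y) signature of the call is arbitrary.

```python
class OtherClass:
    """Hypothetical stand-in for the class the question wants to return."""

    def some_other_function(self):
        return "called"


class _MyFn:
    @property
    def property(self):
        # A fresh OtherClass on every attribute access.
        return OtherClass()

    @staticmethod
    def __call__(x, y=0):
        # Arbitrary example behavior; no self parameter thanks to staticmethod.
        return x + y


# Keep only the single instance; it is used like a plain function.
my_fn = _MyFn()

call_result = my_fn(1, 2)
api_result = my_fn.property.some_other_function()
```

The class is given a private name here so that later properties in the same class body could still use the built-in property decorator if it were not shadowed; with more than one property, renaming the attribute would be the simpler choice.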
No, this isn't what you asked for (you seem set on having a plain function do this for you), but you're asking for a very JavaScript specific thing which doesn't exist in Python.
What you want is not currently possible, because the property would have to be set on the function type to be invoked correctly. And you are not allowed to monkeypatch the function type:
>>> type(my_fn).property = 'anything else'
TypeError: can't set attributes of built-in/extension type 'function'
The solution: use a callable class instead.
Note: What you want may become possible in Python 3.8 if PEP 575 is accepted.

How to determine the method type of stdlib methods written in C

The classify_class_attrs function from the inspect module can be used to determine what kind of object each of a class's attributes is, including whether a function is an instance method, a class method, or a static method. Here is an example:
from inspect import classify_class_attrs
class Example(object):
    @classmethod
    def my_class_method(cls):
        pass
    @staticmethod
    def my_static_method():
        pass
    def my_instance_method(self):
        pass
print classify_class_attrs(Example)
This will output a list of Attribute objects, one for each attribute on Example, with metadata about the attribute. The relevant ones in this case are:
Attribute(name='my_class_method', kind='class method', defining_class=<class '__main__.Example'>, object=<classmethod object at 0x100535398>)
Attribute(name='my_instance_method', kind='method', defining_class=<class '__main__.Example'>, object=<unbound method Example.my_instance_method>)
Attribute(name='my_static_method', kind='static method', defining_class=<class '__main__.Example'>, object=<staticmethod object at 0x100535558>)
However, it seems that many objects in Python's standard library can't be introspected this way. I'm guessing this has something to do with the fact that many of them are implemented in C. For example, datetime.datetime.now is described with this Attribute object by inspect.classify_class_attrs:
Attribute(name='now', kind='method', defining_class=<type 'datetime.datetime'>, object=<method 'now' of 'datetime.datetime' objects>)
If we compare this to the metadata returned about the attributes on Example, you'd probably draw the conclusion that datetime.datetime.now is an instance method. But it actually behaves as a class method!
from datetime import datetime
print datetime.now() # called from the class: 2014-09-12 16:13:33.890742
print datetime.now().now() # called from a datetime instance: 2014-09-12 16:13:33.891161
Is there a reliable way to determine whether a method on a stdlib class is a static, class, or instance method?
I think you can get much of what you want, distinguishing five kinds, without relying on anything that isn't documented by inspect:
Python instance methods
Python class methods
Python static methods
Builtin instance methods
Builtin class methods or static methods
But you can't distinguish those last two from each other without using CPython-specific implementation details.
(As far as I know, only 3.x has any builtin static methods in the stdlib… but of course even in 2.x, someone could always define one in an extension module.)
The details of what's available in inspect and even what it means are a little different in each version of Python, partly because things have changed between 2.x and 3.x, partly because inspect is basically a bunch of heuristics that have gradually improved over time.
But at least for CPython 2.6 and 2.7 and 3.3-3.5, the simplest way to distinguish builtin instance methods from the other two types is isbuiltin on the method looked up from the class. For a static method or class method, this will be True; for an instance method, False. For example:
>>> inspect.isbuiltin(str.maketrans)
True
>>> inspect.isbuiltin(datetime.datetime.now)
True
>>> inspect.isbuiltin(datetime.datetime.ctime)
False
Why does this work? Well, isbuiltin will:
Return true if the object is a built-in function or a bound built-in method.
When looked up on an instance, either a regular method or a classmethod-like method is bound. But when looked up on the class, a regular method is unbound, while a classmethod-like method is bound (to the class). And of course a staticmethod-like method ends up as a plain-old function when looked up either way. So, it's a bit indirect, but it will always be correct.*
What about class methods vs. static methods?
In CPython 3.x, builtin static and class method descriptors both return the exact same type when looked up on their class, and none of the documented attributes can be used to distinguish them either. And even if this weren't true, I think the way the reference is written, it's guaranteed that no functions in inspect would be able to distinguish them.
What if we turn to the descriptors themselves? Yes, there are ways we can distinguish them… but I don't think it's something guaranteed by the language:
>>> callable(str.__dict__['maketrans'])
False
>>> callable(datetime.datetime.__dict__['now'])
True
Why does this work? Well, static methods just use a staticmethod descriptor, exactly like in Python (but wrapping a builtin function instead of a function). But class and instance methods use a special descriptor type, instead of using classmethod wrapping a (builtin) function and the (builtin) function itself, as Python class and instance methods do. These special descriptor types, classmethod_descriptor and method_descriptor, are unbound (class and instance) methods, as well as being the descriptors that bind them. There are historical/implementation reasons for this to be true, but I don't think there's anything in the language reference that requires it to be true, or even implies it.
And if you're willing to rely on implementation artifacts, isinstance(m, staticmethod) seems a lot simpler…
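As a sketch of what relying on those implementation artifacts could look like on a recent CPython 3 (3.7 or later, where the types module exposes the descriptor types), the helper below inspects the raw entry in the class __dict__. The function name classify_raw is made up for this example, and the results depend on CPython using its C implementations of str and datetime.

```python
import types
from datetime import datetime


def classify_raw(cls, name):
    """Classify a method by the raw descriptor stored in cls.__dict__."""
    raw = cls.__dict__[name]
    if isinstance(raw, staticmethod):
        return "static method"
    if isinstance(raw, (classmethod, types.ClassMethodDescriptorType)):
        return "class method"
    if isinstance(raw, (types.FunctionType, types.MethodDescriptorType)):
        return "method"
    return "other"


kinds = {
    "str.maketrans": classify_raw(str, "maketrans"),
    "datetime.now": classify_raw(datetime, "now"),
    "datetime.ctime": classify_raw(datetime, "ctime"),
}
```

Because it reads cls.__dict__ directly, the helper only sees attributes defined on that exact class, not inherited ones; walking cls.__mro__ would be needed for those.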
All that being said, are there any implementations besides CPython that have both builtin staticmethods and classmethods? If not, remember that practicality beats purity…
* What it's really testing for is whether the thing is callable without an extra argument, but that's basically the same thing as the documented "function or bound method"; either way, it's what you want.

Overriding the default type() metaclass before Python runs

Here be dragons. You've been warned.
I'm thinking about creating a new library that will attempt to help write a better test suite.
One of the features needed for that is a check that verifies that every object in use, other than the test runner and the system under test, has a test double (a mock object, a stub, a fake or a dummy). If the tester wants the live object, and thus reduced test isolation, they have to specify so explicitly.
The only way I see to do this is to override the builtin type() function which is the default metaclass.
The new default metaclass will check the test double registry dictionary to see if it has been replaced with a test double or if the live object was specified.
Of course this is not possible through Python itself:
>>> TypeError: can't set attributes of built-in/extension type 'type'
Is there a way to intervene with Python's metaclass lookup before the test suite will run (and probably Python)?
Maybe using bytecode manipulation? But how exactly?
The following is not advisable, and you'll hit plenty of problems and cornercases implementing your idea, but on Python 3.1 and onwards, you can hook into the custom class creation process by overriding the __build_class__ built-in hook:
import builtins
_orig_build_class = builtins.__build_class__
class SomeMockingMeta(type):
    # whatever
    pass

def my_build_class(func, name, *bases, **kwargs):
    if not any(isinstance(b, type) for b in bases):
        # a 'regular' class, not a metaclass
        if 'metaclass' in kwargs:
            if not isinstance(kwargs['metaclass'], type):
                # the metaclass is a callable, but not a class
                orig_meta = kwargs.pop('metaclass')
                class HookedMeta(SomeMockingMeta):
                    def __new__(meta, name, bases, attrs):
                        return orig_meta(name, bases, attrs)
                kwargs['metaclass'] = HookedMeta
            else:
                # There already is a metaclass, insert ours and hope for the best
                class SubclassedMeta(SomeMockingMeta, kwargs['metaclass']):
                    pass
                kwargs['metaclass'] = SubclassedMeta
        else:
            kwargs['metaclass'] = SomeMockingMeta
    return _orig_build_class(func, name, *bases, **kwargs)

builtins.__build_class__ = my_build_class
This is limited to custom classes only, but does give you an all-powerful hook.
For Python versions before 3.1, you can forget about hooking class creation. The C build_class function directly uses the C-type type() value if no metaclass has been defined; it never looks it up from the __builtin__ module, so you cannot override it.
I like your idea, but I think you're going slightly off course. What if the code calls a library function instead of a class? Your fake type() would never be called and you would never be advised that you failed to mock that library function. There are plenty of utility functions both in Django and in any real codebase.
I would advise you to write the interpreter-level support you need in the form of a patch to the Python sources. Or you might find it easier to add such a hook to PyPy's codebase, which is written in Python itself, instead of messing with Python's C sources.
I just realized that the Python interpreter includes a comprehensive set of tools to enable any piece of Python code to step through the execution of any other piece of code, checking what it does down to each function call, or even to each single Python line being executed, if needed.
sys.setprofile should be enough for your needs. With it you can install a hook (a callback) that will be notified of every function call being made by the target program. You cannot use it to change the behavior of the target program, but you can collect statistics about it, including your "mock coverage" metric.
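A minimal sketch of such a hook is shown below. It only counts calls to Python functions by name while a target callable runs; the helper count_calls and the sample functions work and helper are hypothetical, and a real "mock coverage" metric would need to inspect far more than the function name.

```python
import sys
from collections import Counter


def count_calls(target, *args, **kwargs):
    """Run target and count Python-level function calls made meanwhile."""
    calls = Counter()

    def profiler(frame, event, arg):
        # "call" fires for Python functions; C functions produce "c_call".
        if event == "call":
            calls[frame.f_code.co_name] += 1

    sys.setprofile(profiler)
    try:
        result = target(*args, **kwargs)
    finally:
        sys.setprofile(None)
    return result, calls


def helper(x):
    return x * 2


def work():
    return helper(1) + helper(2)


result, calls = count_calls(work)
```

The callback cannot alter what the target does; it only observes, which matches the statistics-gathering use described above.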
Python's documentation about the Profilers introduces a number of modules built upon sys.setprofile. You can study their sources to see how to use it effectively.
If that turns out not to be enough, there is still sys.settrace, a heavy-handed approach that allows you to step through every line of the target program, inspect its variables and modify its execution. The standard module bdb.py is built upon sys.settrace and implements the standard set of debugging tools (breakpoints, step into, step over, etc.) It is used by pdb.py which is the commandline debugger, and by other graphical debuggers.
With these two hooks, you should be all right.

Python's __reduce__/copy_reg semantic and stateful unpickler

I want to implement pickling support for objects belonging to my extension library. There is a global instance of class Service initialized at startup. All these objects are produced as a result of some Service method invocations and essentially belong to it. Service knows how to serialize them into binary buffers and how to deserialize buffers back into objects.
It appeared that Python's __reduce__ should serve my purpose of implementing pickling support. I started implementing one and realized that there is an issue with the unpickler (the first element of the tuple that __reduce__ is expected to return). This unpickle function needs an instance of Service to be able to convert the input buffer into an Object. Here is a bit of pseudocode to illustrate the issue:
class Service(object):
    ...
    def pickleObject(self,obj):
        # do serialization here and return buffer
        ...
    def unpickleObject(self,buffer):
        # do deserialization here and return new Object
        ...
class Object(object):
    ...
    def __reduce__(self):
        return self.service().unpickleObject, (self.service().pickleObject(self),)
Note the first element in the tuple. The Python pickler does not like it: it says it is an instancemethod and it can't be pickled. Obviously the pickler is trying to store the routine in the output and wants the Service instance along with the function name, but this is not what I want to happen. I do not want to store the service along with all the objects (and really can't: Service is not picklable). I want the service instance to be created before pickle.load is invoked and somehow have that instance used during unpickling.
This is where I came across the copy_reg module. Again, it appeared that it should solve my problems. This module allows registering pickler and unpickler routines per type dynamically, and these are supposed to be used later on for objects of this type. So I added this registration to the Service construction:
class Service(object):
    ...
    def __init__(self):
        ...
        import copy_reg
        copy_reg.pickle(mymodule.Object, self.pickleObject, self.unpickleObject)
self.unpickleObject is now a bound method taking the service as its first parameter and the buffer as its second. self.pickleObject is also a bound method, taking the service and the object to pickle. copy_reg requires that the pickleObject routine follow the reducer semantics and return a similar tuple as before. And here the problem arose again: what should I return as the first tuple element?
class Service(object):
    ...
    def pickleObject(self,obj):
        ...
        return self.unpickleObject, (self.serialize(obj),)
In this form pickle again complains that it can't pickle an instancemethod. I tried None, and it does not like that either. I put a dummy function there. This works in the sense that the serialization phase goes through fine, but during unpickling it calls this dummy function instead of the unpickler I registered for the type mymodule.Object in the Service constructor.
So now I am at a loss. Sorry for the long explanation: I did not know how to ask this question in a few lines. I can summarize my questions like this:
Why do the copy_reg semantics require me to return the unpickler routine from pickleObject, if I am expected to register one independently?
Is there any reason to prefer the copy_reg.constructor interface to register the unpickler routine?
How do I make pickle use the unpickler I registered instead of the one inside the stream?
What should I return as the first element of the tuple that pickleObject returns? Is there a "correct" value?
Am I approaching this whole thing correctly? Is there a different or simpler solution?
First of all, the copy_reg module is unlikely to help you much here: it is primarily a way to add __reduce__ like features to classes that don't have that method rather than offering any special abilities (e.g. if you want to pickle objects from some library that doesn't natively support it).
The callable returned by __reduce__ needs to be locatable in the environment where the object is to be unpickled, so an instance method isn't really appropriate. As mentioned in the Pickle documentation:
In the unpickling environment this object must be either a class, a callable registered as a
“safe constructor” (see below), or it must have an attribute
__safe_for_unpickling__ with a true value.
So if you defined a function (not method) as follows:
def _unpickle_service_object(buffer):
    # Grab the global service object, however that is accomplished
    service = get_global_service_object()
    return service.unpickleObject(buffer)

_unpickle_service_object.__safe_for_unpickling__ = True
You could now use this _unpickle_service_object function in the return value of your __reduce__ methods, so that your objects are linked to the new environment's global Service object when unpickled.
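Put together, the approach might look like the following Python 3 sketch. Several parts are assumptions made so the example runs: the module-level _SERVICE variable is one hypothetical way to implement get_global_service_object, the payload attribute and the UTF-8 "serialization" are placeholders for the real binary format, and the __safe_for_unpickling__ marker is left out because the Python 3 pickle module does not require it.

```python
import pickle

_SERVICE = None  # hypothetical module-level registry for the one Service


class Service:
    def __init__(self):
        global _SERVICE
        _SERVICE = self

    def pickleObject(self, obj):
        # Placeholder serialization: encode the payload as bytes.
        return obj.payload.encode("utf-8")

    def unpickleObject(self, buffer):
        # Placeholder deserialization: rebuild an Object owned by this service.
        return Object(self, buffer.decode("utf-8"))


def get_global_service_object():
    if _SERVICE is None:
        raise RuntimeError("Service must be created before unpickling")
    return _SERVICE


def _unpickle_service_object(buffer):
    # Module-level function: pickle stores only its importable name.
    return get_global_service_object().unpickleObject(buffer)


class Object:
    def __init__(self, service, payload):
        self.service = service
        self.payload = payload

    def __reduce__(self):
        return _unpickle_service_object, (self.service.pickleObject(self),)


service = Service()
original = Object(service, "hello")
restored = pickle.loads(pickle.dumps(original))
```

Only the buffer and the name of _unpickle_service_object end up in the pickle stream; the Service instance itself is never serialized and is looked up at load time.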
