Boost Python Hash - python

Is there a function in Boost::Python that lets you get the hash of a boost::python::object, a.k.a the equivalent of Python's hash function? I've been reading the docs, but it doesn't seem to mention anything.

hash in python is implemented with PyObject_Hash on the C side of things. If you have a random object obj, you can simply call:
long hash = PyObject_Hash(obj.ptr())
The ptr() method on a boost::python::object returns a PyObject * that has a borrowed reference to that object.
In general, there's tons of stuff in the CPython API that is not wrapped by boost::python. It's easy enough to just call it directly.

Related

Returning a function python-c-api

I am creating Python bindings for a C library.
In C the code to use the functions would look like this:
Ihandle *foo;
foo = MethFunc();
SetArribute(foo, 's');
I am trying to get this into Python. Where I have MethFunc() and SetAttribute() functions that could be used in my Python code:
import mymodule
foo = mymodule.MethFunc()
mymodule.SetAttribute(foo)
So far my C code to return the function looks like this:
static PyObject * _MethFunc(PyObject *self, PyObject *args) {
return Py_BuildValue("O", MethFunc());
}
But that fails by crashing (no errors)
I have also tried return MethFunc(); but that failed.
How can I return the function foo (or if what I am trying to achieve is completely wrong, how should I go about passing MethFunc() to SetAttribute())?
The problem here is that MethFunc() returns an IHandle *, but you're telling Python to treat it as a PyObject *. Presumably those are completely unrelated types.
A PyObject * (or any struct you or Python defines that starts with an appropriate HEAD macro) begins with pointers to a refcount and a type, and the first thing Python is going to do with any object you hand it is deal with those pointers. So, if you give it an object that instead starts with, say, two ints, Python is going to end up trying to access a type at 0x00020001 or similar, which is almost certain to segfault.
If you need to pass around a pointer to some C object, you have to wrap it up in a Python object. There are three ways to do this, from hackiest to most solid.
First, you can just cast the IHandle * to a size_t, then PyLong_FromSize_t it.
This is dead simple to implement. But it means these objects are going to look exactly like numbers from the Python side, because that's all they are.
Obviously you can't attach a method to this number; instead, your API has to be a free function that takes a number, then casts that number back to an IHandle* and calls a method.
It's more like, e.g., C's stdio, where you have to keep passing stdin or f as an argument to fread, instead of Python's io, where you call methods on sys.stdin or f.
But even worse, because there's no type checking, static or dynamic, to protect you from some Python code accidentally passing you the number 42. Which you'll then cast to an IHandle * and try to dereference, leading to a segfault…
And if you were hoping Python's garbage collector would help you know when the object is still referenced, you're out of luck. You need to make your users manually keep track of the number and call some CloseHandle function when they're done with it.
Really, this isn't that much better than accessing your code from ctypes, so hopefully that inspires you to keep reading.
A better solution is to cast the IHandle * to a void *, then PyCapsule_New it.
If you haven't read about capsules, you need to at least skim the main chapter. But the basic idea is that it wraps up a void* as a Python object.
So, it's almost as simple as passing around numbers, but solves most of the problems. Capsules are opaque values which your Python users can't accidentally do arithmetic on; they can't send you 42 in place of a capsule; you can attach a function that gets called when the last reference to a capsule goes away; you can even give it a nice name to show up in the repr.
But you still can't attach any behavior to capsules.
So, your API will still have to be a MethSetAttribute(mymodule, foo) instead of mymeth.SetAttribute(foo) if mymodule is a capsule, just as if it's an int. (Except now it's type-safe.)
Finally, you can build a new Python extension type for a struct that contains an IHandle *.
This is a lot more work. And if you haven't read the tutorial on Defining Extension Types, you need to go thoroughly read through that whole chapter.
But it means that you have an actual Python type, with everything that goes with it.
You can give it a SetAttribute method, and Python code can just call that method. You can give it whatever __str__ and __repr__ you want. You can give it a __doc__. Python code can do isinstance(mymodule, MyMeth). And so on.
If you're willing to use C++, or D, or Rust instead of C, there are some great libraries (PyCxx, boost::python, Pyd, rust-python, etc.) that can do most of the boilerplate for you. You just declare that you want a Python class and how you want its attributes and methods bound to your C attributes and methods and you get something you can use like a C++ class, except that it's actually a PyObject * under the covers. (And it'll even takes care of all the refcounting cruft for you via RAII, which will save you endless weekends debugging segfaults and memory leaks…)
Or you can use Cython, which lets you write C extension modules in a language that's basically Python, but extended to interface with C code. So your wrapper class is just a class, but with a special private cdef attribute that holds the IHandle *, and your SetAttribute(self, s) can just call the C SetAttribute function with that private attribute.
Or, as suggested by user, you can also use SWIG to generate the C bindings for you. For simple cases, it's pretty trivial—just feed it your C API, and it gives you back the code to build your Python .so. For less simple cases, I personally find it a lot more painful than something like PyCxx, but it definitely has a lower learning curve if you don't already know C++.

Python C API Boolean Objects

I am using Python C API 2.7.2 with my C++ console application. There is one doubt regarding Python C API Boolean Objects
I am using:
PyObject* myVariable = Py_True;
Do I need to deference myVariable with Py_DECREF(myVariable)?
The Python C API documentation says:-
The Python True object. This object has no methods. It needs to be
treated just like any other object with respect to reference counts.
I searched the questions but could not find a clear answer for it.
Thanks.
Although it isn't dynamically created, it must be reference counted because PyObject variables can hold ANY Python object. Otherwise there would need to be checks for Py_True and other special cases scattered throughout the Python runtime as well as any C/C++ code that uses the API. That would be messy and error prone.
It needs to be treated just like any other object with respect to reference counts.
This means that you must incref it when you take its reference
{
Py_INCREF(Py_True);
PyObject* myVariable = Py_True;
and you must decref it when you dispose of it.
Py_DECREF(myVariable);
}

How do you pass around a void pointer between Python and C when writing an extension?

I started on my first Python extension today and was only creating a very small wrapper around a C library as an exercise. As is typical with C libraries, you start of with an initialization function that yields a handler. You can pass that handler to functions and later you pass it to the cleanup function that frees memory.
When I started writing the wrapper I basically wanted to have a way to call each native C function from python. Quickly I hit the problem that I need to return an arbitrary pointer from C to Python only to give it from there to C again in another function. I doesn't matter how it looks as I don't use it in Python, I just store it and pass it around.
So how do you pass around a void pointer between Python and C?
Please note: I know it is not recommended to write such small wrappers using the extension system but rather ctypes and friends. This is just for practice right now.
PyLong_FromVoidPtr() and PyLong_AsVoidPtr() can be abused to inject malicious data into your program. I recommend against them.
Python has PyCapsule for exactly that job. Capsules provide a safe way to exchange void ptr between modules or Python space and C space. The capsules are type-safe, too. If you need some example, the socket / ssl modules and pyexpat / _elementtree modules use capsules to exchange CAPI structs.
http://docs.python.org/3/c-api/capsule.html
After some searching I found the functions PyLong_AsVoidPtr and PyLong_FromVoidPtr. This yields a nice way to convert between a void * and a PyObject:
# in init function
return PyLong_FromVoidPtr(handle);
# in function using handle
handle = PyLong_AsVoidPtr(python_handle);
The one problem now might be how to retrieve python_handle from the typical *args given to a function:
PyObject *python_handle;
PyArg_ParseTuple(args, "O", &python_handle);
Careful here: The argument given for the "O" object must be a pointer to a PyObject pointer: PyObject **. The "O" itself only denotes to pass this PyObject through without any handling and converting. And with this, you can pass around any pointers any way you like.
Note: I think this solution is not really pretty, because you now have to variables, one that is only needed for a short time.

What are the arguments to the types.CodeType() python call?

I'm currently trying to roll my own "marshal" code for python so i can store compiled python code on Google App Engine to serve scripts on a dynamic way. As you all can verify, "marshal" isn't supported on GAE and "pickle" can't serialize code objects.
I found out i can construct a code object with types.CodeType() but it expects 12 arguments.
As much as i've tried, i can't find any documentation on this call and i really need to construct the code object so i can exec() it. My question is, does anyone know what are the parameters for this types.CodeType() "constructor" or any way to introspect it? i have used the info() function defined here but it spits out just generic info!
Quick FAQ:
Q: Why compile the code?
A: CPU time costs real money on Google App Engine, and every bit of CPU cycles i can save counts.
Q: Why not use "marshal"?
A: That's one of the unsupported modules in Google App Engine.
Q: Why not use "pickle"?
A: Pickle doesn't support serialization of code objects.
UPDATE
Google App Engine infrastructure doesn't allow the instantiation of code objects as of 7th July 2011, so my argument here is moot. Hope this gets fixed in the future on GAE.
The question asked:
what are the parameters for this types.CodeType() "constructor"
From the python docs about the inspect module:
co_argcount: number of arguments (not including * or ** args)
co_code: string of raw compiled bytecode
co_consts: tuple of constants used in the bytecode
co_filename: name of file in which this code object was created
co_firstlineno: number of first line in Python source code
co_flags: bitmap: 1=optimized | 2=newlocals | 4=*arg | 8=**arg
co_lnotab: encoded mapping of line numbers to bytecode indices
co_name: name with which this code object was defined
co_names: tuple of names of local variables
co_nlocals: number of local variables
co_stacksize: virtual machine stack space required
co_varnames: tuple of names of arguments and local variables
This blog post has much more detailed explanation: http://tech.blog.aknin.name/2010/07/03/pythons-innards-code-objects/
Note: the blog post talks about python 3 while the quoted python docs above is python 2.7.
The C API function PyCode_New is (minimally) documented here: http://docs.python.org/c-api/code.html ­— the C source code of this function (Python 2.7) is here: http://hg.python.org/cpython/file/b5ac5e25d506/Objects/codeobject.c#l43
PyCodeObject *
PyCode_New(int argcount, int nlocals, int stacksize, int flags,
PyObject *code, PyObject *consts, PyObject *names,
PyObject *varnames, PyObject *freevars, PyObject *cellvars,
PyObject *filename, PyObject *name, int firstlineno,
PyObject *lnotab)
However, in the Python constructor, the last six arguments appear to be swapped around a little. This is the C code that extracts the arguments passed in by Python: http://hg.python.org/cpython/file/b5ac5e25d506/Objects/codeobject.c#l247
if (!PyArg_ParseTuple(args, "iiiiSO!O!O!SSiS|O!O!:code",
&argcount, &nlocals, &stacksize, &flags,
&code,
&PyTuple_Type, &consts,
&PyTuple_Type, &names,
&PyTuple_Type, &varnames,
&filename, &name,
&firstlineno, &lnotab,
&PyTuple_Type, &freevars,
&PyTuple_Type, &cellvars))
return NULL;
Pythonized:
def __init__(self, argcount, nlocals, stacksize, flags, code,
consts, names, varnames, filename, name,
firstlineno, lnotab, freevars=None, cellvars=None): # ...
I went and took the code found here and removed the dependency for the deprecated "new" module.
import types, copy_reg
def code_ctor(*args):
# delegate to new.code the construction of a new code object
return types.CodeType(*args)
def reduce_code(co):
# a reductor function must return a tuple with two items: first, the
# constructor function to be called to rebuild the argument object
# at a future de-serialization time; then, the tuple of arguments
# that will need to be passed to the constructor function.
if co.co_freevars or co.co_cellvars:
raise ValueError, "Sorry, cannot pickle code objects from closures"
return code_ctor, (co.co_argcount, co.co_nlocals, co.co_stacksize,
co.co_flags, co.co_code, co.co_consts, co.co_names,
co.co_varnames, co.co_filename, co.co_name, co.co_firstlineno,
co.co_lnotab)
# register the reductor to be used for pickling objects of type 'CodeType'
copy_reg.pickle(types.CodeType, reduce_code)
if __name__ == '__main__':
# example usage of our new ability to pickle code objects
import cPickle
# a function (which, inside, has a code object, of course)
def f(x): print 'Hello,', x
# serialize the function's code object to a string of bytes
pickled_code = cPickle.dumps(f.func_code)
# recover an equal code object from the string of bytes
recovered_code = cPickle.loads(pickled_code)
# build a new function around the rebuilt code object
g = types.FunctionType(recovered_code, globals( ))
# check what happens when the new function gets called
g('world')
Answering the question you need answered rather than the one you asked:
You can't execute arbitrary bytecode in the App Engine Python environment, currently. Although you may be able to access the bytecode or code objects, you can't load one.
You have an alternative, however: per-instance caching. Store a global dict mapping datastore keys (for your datastore entries that store the Python code) to the compiled code object. If the object doesn't exist in the cache, compile it from source and store it there. You'll have to do the compilation work on each instance, but you don't have to do it on each request, which should save you a lot of work.

Python reference count and ctypes

Hallo,
I have some troubles understanding the python reference count.
What I want to do is return a tuple from c++ to python using the ctypes module.
C++:
PyObject* foo(...)
{
...
return Py_BuildValue("(s, s)", value1, value2);
}
Python:
pointer = c_foo(...) # c_foo loaded with ctypes
obj = cast(pointer, py_object).value
I'm was not sure about the ref count of obj, so I tried sys.getrefcount()
and got 3. I think it should be 2 (the getrefcount functions makes one ref itself).
Now I can't make Py_DECREF() before the return in C++ because the object gets deleted. Can I decrease the ref count in python?
edit
What happens to the ref count when the cast function is called? I'm not really sure from the documentation below. http://docs.python.org/library/ctypes.html#ctypes.cast
ctypes.cast(obj, type)
This function is similar to the cast operator in C. It returns a new instance of type which points to the same memory block as obj. type must be a pointer type, and obj must be an object that can be interpreted as a pointer.
On further research I found out that one can specify the return type of the function.
http://docs.python.org/library/ctypes.html#callback-functions
This makes the cast obsolete and the ref count is no longer a problem.
clib = ctypes.cdll.LoadLibrary('some.so')
c_foo = clib.c_foo
c_foo.restype = ctypes.py_object
As no additional answers were given I accept my new solution as the answer.
Your c++ code seems to be a classic wrapper using the official C-API and it's a bit weird since ctypes is usually used for using classic c types in python (like int, float, etc...).
I use personnally the C-API "alone" (without ctypes) but on my personnal experience, you don't have to worry about the reference counter in this case since you are returning a native python type with Py_BuildValue. When a function returns an object, the ownership of the returned object is given to the calling function.
You have to worry about Py_XINCREF/Py_XDECREF (better than Py_INCREF/Py_DECREF because it accepts NULL pointers) only when you want to change ownership of the object :
For example, you have created a wrapper of a map in python (let's call the typed object py_map). The element are of c++ class Foo and you have created an other python wrapper for them (let's call it py_Foo). If you create a function that wrap the [] operator, you are going to return a py_Foo object in python :
F = py_Map["key"]
but since the ownership is given to the calling function, you will call the destructor when you delete F and the map in c++ contains a pointer to a deallocated objet !
The solution is to write in c++ in the wrapper of [] :
...
PyObject* result; // My py_Foo object
Py_XINCREF(result); // transfer the ownership
return result;
}
You should take a look at the notion of borrowed and owned reference in python. This is essential to understand properly the reference counter.

Categories