This question is related to a previous question I asked. What I want to do is expose a C array to Python using a Py_buffer wrapped in a memoryview object. I've gotten it to work using PyBuffer_FillInfo (that is, I can manipulate the data in Python and write it to stdout in C), but if I try to roll my own buffer I get a segfault after the C function returns.
I need to create my own buffer because PyBuffer_FillInfo assumes that the format is char, making the itemsize field 1. I need to be able to provide items of size 1, 2, 4 and 8.
Some code, this is a working example:
Py_buffer *buf = (Py_buffer *) malloc(sizeof(*buf));
int r = PyBuffer_FillInfo(buf, NULL, malloc(sizeof(char) * 4), 4, 0, PyBUF_CONTIG);
PyObject *mv = PyMemoryView_FromBuffer(buf);
//Pack the memoryview object into an argument list and call the Python function
for (Py_ssize_t i = 0; i < buf->len; i++)
    printf("%c\n", ((char *) buf->buf)[i]); // this prints the values I set in the Python function
Looking at the implementation of PyBuffer_FillInfo, which is really simple, I rolled my own function to be able to provide custom itemsizes:
//buffer creation function
Py_buffer *getReadWriteBuffer(int nitems, int itemsize, char *fmt) {
Py_buffer *buf = (Py_buffer *) malloc(sizeof(*buf));
buf->obj = NULL;
buf->buf = malloc(nitems * itemsize);
buf->len = nitems * itemsize;
buf->readonly = 0;
buf->itemsize = itemsize;
buf->format = fmt;
buf->ndim = 1;
buf->shape = NULL;
buf->strides = NULL;
buf->suboffsets = NULL;
buf->internal = NULL;
return buf;
}
How I use it:
Py_buffer *buf = getReadWriteBuffer(32, 2, "h");
PyObject *mv = PyMemoryView_FromBuffer(buf);
// pack the memoryview into an argument list and call the Python function as before
for (blah)
printf("%d\n", *buf->buf); //this prints all zeroes even though i modify the array in Python
return 0;
//the segfault happens somewhere after here
The result of using my own buffer object is a segfault after the C function returns. I really don't understand why this happens at all. Any help would be most appreciated.
EDIT
According to this question, which I failed to find before, itemsize > 1 might not even be supported at all. Which makes this question even more interesting. Maybe I could use PyBuffer_FillInfo with a large enough block of memory to hold what I want (32 C floats for example). In that case, the question is more about how to assign Python floats to the memoryview object in the Python function. Questions questions.
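On the Python side, one way to assign floats through a byte-oriented memoryview is to cast the view to the 'f' format. This is only a minimal sketch and rests on an assumption not stated above: Python 3.3 or later, where memoryview.cast exists. The bytearray raw_block is a stand-in for the memory block that the C code would expose with PyBuffer_FillInfo.

```python
import struct

# Stand-in for the block C would expose: room for 32 C floats, seen as plain bytes.
raw_block = bytearray(32 * struct.calcsize("f"))

byte_view = memoryview(raw_block)   # format 'B', itemsize 1
float_view = byte_view.cast("f")    # same memory, now indexed as C floats

# Assign Python floats; they are stored as C floats in the underlying bytes.
float_view[0] = 1.5
float_view[31] = -2.0

# Read the bytes back the way C would see them.
first_float = struct.unpack_from("f", raw_block, 0)[0]
last_float = struct.unpack_from("f", raw_block, 31 * float_view.itemsize)[0]
```

The cast does not copy anything, so whatever the Python function writes through float_view is visible to the owner of the original bytes.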
So, for lack of answers, I decided to take a different approach from the one I originally intended. I'm leaving this here in case someone else hits the same snag.
Basically, instead of creating a buffer (or bytearray, or equivalent) in C and passing it to Python for the extension user to modify, I redesigned the code slightly so that the user returns a bytearray (or any type that supports the buffer interface) from the Python callback function. This way I don't even need to worry about the size of the items since, in my case, all the C code does with the returned object is extract its buffer and copy it to another buffer with a simple memcpy.
Code:
PYGILSTATE_ACQUIRE; // a macro I made
PyObject *result = PyEval_CallObject(python_callback, NULL);
if (result == NULL || !PyObject_CheckBuffer(result))
    ; // raise exception
Py_buffer view; // lives on the stack, so there is nothing to free afterwards
int error = PyObject_GetBuffer(result, &view, PyBUF_SIMPLE);
if (error)
    ; // raise exception
memcpy(my_other_buffer, view.buf, view.len);
PyBuffer_Release(&view);
Py_DECREF(result);
PYGILSTATE_RELEASE; // another macro
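For illustration, here is a rough Python model of the same flow, showing what a user callback might return and why the item size stops mattering once the data is copied as raw bytes. The callback body and the helper copy_buffer are hypothetical; only the idea (get a simple byte view, then copy view.len bytes) comes from the C code above.

```python
import struct
from array import array


def python_callback():
    # Hypothetical user callback: returns any object that supports the buffer protocol.
    return array("f", [0.5 * i for i in range(32)])


def copy_buffer(result, destination):
    """Model of PyObject_GetBuffer(PyBUF_SIMPLE) followed by memcpy."""
    raw = memoryview(result).cast("B")  # flat byte view, whatever the item size was
    if raw.nbytes > len(destination):
        raise ValueError("destination buffer is too small")
    destination[:raw.nbytes] = raw
    return raw.nbytes


my_other_buffer = bytearray(32 * 4)
copied = copy_buffer(python_callback(), my_other_buffer)
first_three = struct.unpack_from("3f", my_other_buffer)
```

The size check is something the C version would also want before calling memcpy, since the callback decides how many bytes come back.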
I hope this helps someone.
Related
How can I pass a str value (containing 3000 {'0', '1'} bytes) obtained using python code as an argument to a python c extended function (extended using SWIG) which requires int * (fixed length int array) as an input argument? My code is such:
int *exposekey(int *bits) {
int a[1000];
for (int j=2000; j < 3000; j++) {
a[j - 2000] = bits[j];
}
return a;
}
What I've tried was to use ctypes (see below code):
import ctypes
ldpc = ctypes.cdll.LoadLibrary('./_ldpc.so')
arr = (ctypes.c_int * 3072)(<mentioned below>)
ldpc.exposekey(arr)
with 3072 {0, 1} entered in the position. Python returns syntax error : more than 255 arguments. This still doesn't help me to pass assigned str value instead of the initialized ctypes int array.
Other suggestion included using SWIG typemaps but how would that work for converting a str into int * ? Thanks in advance.
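As far as I know, the "more than 255 arguments" error only applies to arguments written out literally in the source, so one way around it is to build the ctypes array from the string at run time with argument unpacking. A minimal sketch; the bit string here is made-up sample data, not the real key:

```python
import ctypes

# Made-up sample input: 3072 characters, each '0' or '1'.
bits_string = "01" * 1536

# Array type with a fixed length, then fill it by unpacking the converted characters.
IntArray3072 = ctypes.c_int * 3072
arr = IntArray3072(*(int(ch) for ch in bits_string))
```

The resulting arr can be passed wherever an int * argument is expected by a ctypes-loaded function.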
Regarding my comment, here are some more details about returning arrays from functions: [SO]: Returning an array using C. In short, there are three ways to handle this:
Make the returned variable static
Dynamically allocate it (using malloc (family) or new)
Turn it into an additional argument for the function
Getting that piece of C code to run within the Python interpreter is possible in 2 ways:
[Python 3.Docs]: Extending Python with C or C++ - which creates a C written Python module
A way of doing that is using swig which offers a simple interface for generating the module ([SWIG]: SWIG Basics) saving you the trouble of writing it yourself using [Python 3.Docs]: Python/C API Reference Manual
The other way around, leaving the code in a standard dll which can be accessed via [Python 3.Docs]: ctypes - A foreign function library for Python
Since they both are doing the same thing, mixing them together makes no sense. So, pick the one that best fits your needs.
1. ctypes
This is what you started with
It's one of the ways of doing things using ctypes
ctypes_demo.c:
#include <stdio.h>
#if defined(_WIN32)
# define CTYPES_DEMO_EXPORT_API __declspec(dllexport)
#else
# define CTYPES_DEMO_EXPORT_API
#endif
CTYPES_DEMO_EXPORT_API int exposekey(char *bitsIn, char *bitsOut) {
int ret = 0;
printf("Message from C code...\n");
for (int j = 0; j < 1000; j++)
{
bitsOut[j] = bitsIn[j + 2000];
ret++;
}
return ret;
}
Notes:
Based on comments, I changed the types in the function from int* to char*, because it's 4 times more compact (although it's still ~700% inefficient since 7 bits of each char are ignored versus only one of them being used; that can be fixed, but requires bitwise processing)
I took a and turned it into the 2nd argument (bitsOut). I think this is best because it's the caller's responsibility to allocate and deallocate the array (the 3rd option from the beginning)
I also modified the index range (without changing functionality), because it makes more sense to work with low index values and add something to them in one place, instead of working with high index values and subtracting (the same) something in another place
The return value is the number of entries copied (obviously, 1000 in this case), but it's just an example
printf is just a dummy call, to show that the C code gets executed
When dealing with such arrays, it's recommended to pass their dimensions as well, to avoid out of bounds errors. Also, error handling is an important aspect
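The bitwise processing mentioned in the first note can be sketched on the Python side as follows. The helper names pack_bits and unpack_bits are hypothetical, and this is just one possible layout (most significant bit first, zero-padded at the end):

```python
def pack_bits(bits):
    """Pack a string of '0'/'1' characters into bytes, 8 bits per byte."""
    if not bits:
        return b""
    padded = bits + "0" * (-len(bits) % 8)  # pad on the right to a whole byte
    return int(padded, 2).to_bytes(len(padded) // 8, "big")


def unpack_bits(packed, count):
    """Recover the first `count` bits from bytes produced by pack_bits."""
    if count > 8 * len(packed):
        raise ValueError("not enough packed data")
    if not packed:
        return ""
    all_bits = bin(int.from_bytes(packed, "big"))[2:].zfill(8 * len(packed))
    return all_bits[:count]


bits_string = "010011000110101110101110101010010111011101101010101"
packed = pack_bits(bits_string)
restored = unpack_bits(packed, len(bits_string))
```

The bit count has to travel with the packed data, which is another instance of the advice above to pass array dimensions explicitly.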
test_ctypes.py:
from ctypes import CDLL, c_char, c_char_p, c_int, create_string_buffer
bits_string = "010011000110101110101110101010010111011101101010101"
def main():
    dll = CDLL("./ctypes_demo.dll")
    exposekey = dll.exposekey
    exposekey.argtypes = [c_char_p, c_char_p]
    exposekey.restype = c_int
    bits_in = create_string_buffer(b"\0" * 2000 + bits_string.encode())
    bits_out = create_string_buffer(1000)
    print("Before: [{}]".format(bits_out.raw[:len(bits_string)].decode()))
    ret = exposekey(bits_in, bits_out)
    print("After: [{}]".format(bits_out.raw[:len(bits_string)].decode()))
    print("Return code: {}".format(ret))
if __name__ == "__main__":
    main()
Notes:
1st, I want to mention that running your code didn't raise the error you got
Specifying function's argtypes and restype is mandatory, and also makes things easier (documented in the ctypes tutorial)
I am printing the bits_out array (only the first - and relevant - part, as the rest are 0) in order to prove that the C code did its job
I initialize bits_in array with 2000 dummy 0 at the beginning, as those values are not relevant here. Also, the input string (bits_string) is not 3000 characters long (for obvious reasons). If your bits_string is 3000 characters long you can simply initialize bits_in like: bits_in = create_string_buffer(bits_string.encode())
Do not forget to initialize bits_out to an array with a size large enough (in our example 1000) for its purpose, otherwise segfault might arise when trying to set its content past the size
For this (simple) function, the ctypes variant was easier (at least for me, since I don't use swig frequently), but for more complex functions / projects it will become an overkill and switching to swig would be the right thing to do
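The buffer handling from these notes can be checked without the DLL. The sketch below only exercises create_string_buffer; the 4-character bits_string is a shortened stand-in for the real input.

```python
from ctypes import create_string_buffer, sizeof

bits_string = "0101"  # shortened stand-in for the real bit string

# Initialised from bytes: the buffer holds the payload plus one terminating NUL.
bits_in = create_string_buffer(b"\0" * 2000 + bits_string.encode())
# Initialised from a size: that many zero bytes, writable by the C code.
bits_out = create_string_buffer(1000)

in_size = sizeof(bits_in)            # payload length + 1
relevant = bits_in.raw[2000:2004]    # .raw gives every byte, embedded NULs included
as_c_string = bits_in.value          # .value stops at the first NUL
out_is_zeroed = bits_out.raw == b"\0" * 1000
```

This is why the test script prints bits_out.raw rather than bits_out.value: with leading NUL bytes, .value would come back empty.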
Output (running with Python3.5 on Win):
c:\Work\Dev\StackOverflow\q47276327>"c:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" test_ctypes.py
Before: [ ]
Message from C code...
After: [010011000110101110101110101010010111011101101010101]
Return code: 1000
2. swig
Almost everything from the ctypes section, applies here as well
swig_demo.c:
#include <stdlib.h>
#include <stdio.h>
#include "swig_demo.h"
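char *exposekey(char *bitsIn) {
    // One extra byte for the terminating NUL, since the returned char * is treated as a C string
    char *bitsOut = (char*)malloc(sizeof(char) * 1001);
    printf("Message from C code...\n");
    for (int j = 0; j < 1000; j++) {
        bitsOut[j] = bitsIn[j + 2000];
    }
    bitsOut[1000] = '\0';
    return bitsOut;
}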
swig_demo.i:
%module swig_demo
%{
#include "swig_demo.h"
%}
%newobject exposekey;
%include "swig_demo.h"
swig_demo.h:
char *exposekey(char *bitsIn);
Notes:
Here I'm allocating the array and return it (the 2nd option from the beginning)
The .i file is a standard swig interface file
Defines the module, and its exports (via %include)
One thing that is worth mentioning is the %newobject directive that deallocates the pointer returned by exposekey to avoid memory leaks
The .h file just contains the function declaration, in order to be included by the .i file (it's not mandatory, but things are more elegant this way)
The rest is pretty much the same
test_swig.py:
from swig_demo import exposekey
bits_in = "010011000110101110101110101010010111011101101010101"
def main():
    bits_out = exposekey("\0" * 2000 + bits_in)
    print("C function returned: [{}]".format(bits_out))
if __name__ == "__main__":
    main()
Notes:
Things make much more sense from Python programmer's PoV
Code is a lot shorter (that is because swig did some "magic" behind the scenes):
The .c wrapper file generated from the .i file is ~120K
The generated swig_demo.py module is ~3K
I used the same technique with 2000 0 at the beginning of the string
Output:
c:\Work\Dev\StackOverflow\q47276327>"c:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" test_swig.py
Message from C code...
C function returned: [010011000110101110101110101010010111011101101010101]
3. Plain Python C API
I added this part as a personal exercise
This is what swig does, but "manually"
capi_demo.c:
#include "Python.h"
#include "swig_demo.h"
#define MOD_NAME "capi_demo"
static PyObject *PyExposekey(PyObject *self, PyObject *args) {
    PyObject *bitsInArg = NULL, *bitsInBytes = NULL, *bitsOutArg = NULL;
    char *bitsOut = NULL;
    if (!PyArg_ParseTuple(args, "O", &bitsInArg))
        return NULL;
    // Keep a reference to the encoded bytes object so it can be released afterwards
    bitsInBytes = PyUnicode_AsEncodedString(bitsInArg, "ascii", "strict");
    if (bitsInBytes == NULL)
        return NULL;
    bitsOut = exposekey(PyBytes_AS_STRING(bitsInBytes));
    Py_DECREF(bitsInBytes);
    bitsOutArg = PyUnicode_FromString(bitsOut);
    free(bitsOut);
    return bitsOutArg;
}
static PyMethodDef moduleMethods[] = {
{"exposekey", (PyCFunction)PyExposekey, METH_VARARGS, NULL},
{NULL}
};
static struct PyModuleDef moduleDef = {
PyModuleDef_HEAD_INIT, MOD_NAME, NULL, -1, moduleMethods
};
PyMODINIT_FUNC PyInit_capi_demo(void) {
return PyModule_Create(&moduleDef);
}
Notes:
It requires swig_demo.h and swig_demo.c (not going to duplicate their contents here)
It only works with Python 3 (actually I got quite some headaches making it work, especially because I was used to PyString_AsString which is no longer present)
Error handling is poor
test_capi.py is similar to test_swig.py with one (obvious) difference: from swig_demo import exposekey should be replaced by from capi_demo import exposekey
The output is also the same as for test_swig.py (again, not going to duplicate it here)
I need to construct the following data type in Python for passing to a C function:
struct {
unsigned a,b,c;
char data[8];
};
However, I need to actually pass a pointer to the data field to the function, not a pointer to a struct, and I can't figure out how to do this.
Here is what I have so far:
from ctypes import *
class MyStruct(Structure):
    _fields_ = [("a", c_uint), ("b", c_uint), ("c", c_uint), ("data", c_char * 8)]
mystruct = MyStruct(0,1,8,"ABCDEFGH")
external_c_function(mystruct.data)
Now in C I have this function:
int external_c_function(char *data) {
int a = ((unsigned *)data)[-1];
int b = ((unsigned *)data)[-2];
int c = ((unsigned *)data)[-3];
...
}
The problem is, when the function gets called, "data" correctly points to "ABCDEFGH", but when I try to get the rest of the struct data preceding it, it is garbage. What am I doing wrong? Isn't mystruct held sequentially in memory like a real C struct? I suspect something funny is going on with the array: am I actually doing something silly like this?
struct {
unsigned a,b,c;
char *data; // -> char[8]
};
and if so, how do I do it correctly?
You pass a pointer to an element of a structure by reference, using the offset of the element:
external_c_function(byref(mystruct,MyStruct.data.offset))
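To see that the offset really lands inside the original structure, the layout can be inspected from Python alone. This is a sketch under a few assumptions: Python 3 (so the char data is given as bytes), a 4-byte unsigned int, and no actual call to external_c_function, which is not available here.

```python
from ctypes import POINTER, Structure, addressof, byref, c_char, c_uint, c_void_p, cast


class MyStruct(Structure):
    _fields_ = [("a", c_uint), ("b", c_uint), ("c", c_uint), ("data", c_char * 8)]


mystruct = MyStruct(0, 1, 8, b"ABCDEFGH")

# What would be handed to the C function: a reference to the data field in place.
data_argument = byref(mystruct, MyStruct.data.offset)

# Do the same arithmetic the C function does: step backwards from data as unsigned ints.
data_address = addressof(mystruct) + MyStruct.data.offset
as_uints = cast(c_void_p(data_address), POINTER(c_uint))
preceding = (as_uints[-3], as_uints[-2], as_uints[-1])  # fields a, b, c in memory order

# Reading the field from Python yields a copy, not a pointer into the struct.
copied_data = mystruct.data
```

Note that in memory order the unsigned int directly before data is c, then b, then a, so the index -1 corresponds to c.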
It seems that when you reference mystruct.data, a copy of the data is made. I say this because the python command type(mystruct.data), returns str, rather than a C type.
I presume that you are not able to modify the external_c_function to accept the pointer at the start of the structure, as this would be the most obvious solution. Therefore you need to somehow do C style pointer arithmetic in python - i.e. get the address of mystruct (possibly using ctypes.pointer), then figure out a way to increment this pointer by the appropriate number of bytes.
I don't know how you can do such pointer arithmetic in python, or if it's even possible to do in any robust manner. However, you could always wrap external_c_function in another C function which does the necessary pointer arithmetic.
edit
Mark's answer solves the problem neatly. My comment about why the error occurs is still correct.
There is a libx.so which export 2 functions, and a struct,
typedef struct Tag {
int num;
char *name;
}Tag;
Tag *create(int n, char *s)
{
    Tag *t = malloc(sizeof(Tag));
    t->num = n;
    t->name = s;
    return t;
}
void use(Tag *t)
{
printf("%d, %s\n", t->num, t->name);
}
I want to call create in Python and then save the Tag *res returned by create, later I will call use and pass the Tag *res saved before to use, here is it (just to demonstrate):
>>>libx = ctypes.CDLL("./libx.so")
>>>res = libx.create(c_int(1), c_char_p("a"))
>>>libx.use(res)
The above code might be wrong, just to demonstrate what I want to do.
And my problem is that, how could I save the result returned by create? Because it returns a pointer to a user-defined struct, and I don't want to construct struct Tag's counterpart in Python, would c_void_p do the trick?
UPDATE
From @David's answer, I still don't quite understand one thing:
the pointer (c_char_p("a")) is only valid for the duration of the
call to create. As soon as create returns then that pointer is no
longer valid.
And I assign c_char_p("a") to t->name in create, when the call to create finishes, is t->name a dangling pointer? Because according to the quoted words, that pointer is no longer valid after create. Why c_char_p("a") is no longer valid?
The C code that you present is simply not going to work. You need to be much more precise about which party allocates and is responsible for the heap memory.
In your current example you pass c_char_p("a") to the C code. However, the pointer to that ctypes memory is only valid for the duration of the call to create. As soon as create returns then that pointer is no longer valid. But you took a copy of the pointer inside create. Thus the subsequent call to use is liable to fail.
You are going to need to take a copy of the contents of that string and store it in the struct. If you do that then you can use libx.create.restype = c_void_p safely.
But if you want the memory you allocated to be deallocated you will have to provide a destroy function to match the create function. With these changes the C code would look like this:
Tag *create(int n, char *s)
{
Tag *t = malloc(sizeof(Tag));
t->num = n;
t->name = strdup(s);
return t;
}
void destroy(Tag *t)
{
free(t->name);
free(t);
}
The Python code would look like this:
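libx = ctypes.CDLL("./libx.so")
libx.create.restype = c_void_p
# Without argtypes, the saved pointer would be passed as a C int and could be truncated on 64-bit platforms
libx.use.argtypes = [c_void_p]
libx.destroy.argtypes = [c_void_p]
res = libx.create(c_int(1), c_char_p(b"a"))
libx.use(res)
libx.destroy(res)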
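Two of the points above can be illustrated without libx.so. The sketch below is only a model: name_buffer plays the role of the string passed to create, and opaque plays the role of the saved Tag pointer. It shows that ctypes-owned memory can be read through a raw address while the owning Python object is alive, and that a pointer needs at least as much room as an int.

```python
import ctypes

# Memory owned by a Python object; it stays valid only while name_buffer is referenced.
name_buffer = ctypes.create_string_buffer(b"a")

# An opaque handle: just an address, with no knowledge of what it points to.
opaque = ctypes.c_void_p(ctypes.addressof(name_buffer))

# Reading through the raw address works because name_buffer is still alive here.
recovered = ctypes.string_at(opaque.value)

# Why c_void_p rather than the default int: a pointer may be wider than a C int.
pointer_bytes = ctypes.sizeof(ctypes.c_void_p)
int_bytes = ctypes.sizeof(ctypes.c_int)
```

Once the last reference to name_buffer is dropped, the address stored in opaque must not be used any more, which is the situation the strdup call in create avoids.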
Python does reference counting. You'll have to use Py_INCREF() and friends for objects that are returned from "external" libraries.
UPDATE: I don't know about .so loading by Python; maybe the method proposed by @David Hefferman does this automagically.
UPDATE2: delete me!
I ran into a situation with pure python and C python module.
To summarize, how can I accept and manipulate python object in C module?
My python part will look like this.
#!/usr/bin/env python
import os, sys
from c_hello import *
class Hello:
    busyHello = _sayhello_obj
class Man:
    def __init__(self, name):
        self.name = name
    def getName(self):
        return self.name
h = Hello()
h.busyHello( Man("John") )
in C, two things need to be resolved.
first, how can I receive object?
second, how can I call a method from the object?
static PyObject *
_sayhello_obj(PyObject *self, PyObject *args)
{
PyObject *obj;
// How can I fill obj?
char s[1024];
// How can I fill s, from obj.getName() ?
printf("Hello, %s\n", s);
return Py_None;
}
To extract an argument from an invocation of your method, you need to look at the functions documented in Parsing arguments and building values, such as PyArg_ParseTuple. (That's for if you're only taking positional args! There are others for positional-and-keyword args, etc.)
The object you get back from PyArg_ParseTuple doesn't have its reference count increased. For simple C functions, you probably don't need to worry about this. If you're interacting with other Python/C functions, or if you're releasing the global interpreter lock (i.e. allowing threading), you need to think very carefully about object ownership.
static PyObject *
_sayhello_obj(PyObject *self, PyObject *args)
{
    PyObject *obj = NULL;
    // How can I fill obj?
    static const char *fmt_string = "O"; // For "object"
    int parse_result = PyArg_ParseTuple(args, fmt_string, &obj);
    if(!parse_result)
    {
        // Don't worry about using PyErr_SetString, all the exception stuff should be
        // done in PyArg_ParseTuple()
        return NULL;
    }
    // Of course, at this point you need to do your own verification of whatever
    // constraints might be on your argument.
For calling a method on an object, you need to use either PyObject_CallMethod or PyObject_CallMethodObjArgs, depending on how you construct the argument list and method name. And see my comment in the code about object ownership!
Quick digression just to make sure you're not setting yourself up for a fall later: If you really are just getting the string out to print it, you're better off just getting the object reference and passing it to PyObject_Print. Of course, maybe this is just for illustration, or you know better than I do what you want to do with the data ;)
    char s[1024];
    // How can I fill s, from obj.getName() ?
    // Name of the method
    static const char *method_name = "getName";
    // No arguments? Score! We just need NULL here
    const char *method_fmt_string = NULL;
    PyObject *objname = PyObject_CallMethod(obj, method_name, method_fmt_string);
    // This is really important! What we have here now is a Python object with a newly
    // incremented reference count! This means you own it, and are responsible for
    // decrementing the ref count when you're done. See below.
    // If there's a failure, we'll get NULL
    if(objname == NULL)
    {
        // Again, this should just propagate the exception information
        return NULL;
    }
Now there are a number of functions in the String/Bytes Objects section of the Concrete Objects Layer docs; use whichever works best for you.
But do not forget this bit:
    // Now that we're done with the object we obtained, decrement the reference count
    Py_XDECREF(objname);
    // You didn't mention whether you wanted to return a value from here, so let's just
    // return the "None" singleton.
    // Note: this macro includes the "return" statement!
    Py_RETURN_NONE;
}
Note the use of Py_RETURN_NONE there, and note that it's not return Py_RETURN_NONE!
PS. The structure of this code is dictated to a great extent by personal style (eg. early returns, static char format strings inside the function, initialisation to NULL). Hopefully the important information is clear enough apart from stylistic conventions.
I'm trying to wrap the Patricia Tries (Perl's NET::Patricia) to be exposed in python. I am having difficulty with one of the classes.
Instances of the patricia node (below), as viewed from Python, have a "data" property. Reading it goes fine, but writing to it breaks.
typedef struct _patricia_node_t {
u_int bit; /* flag if this node used */
prefix_t *prefix; /* who we are in patricia tree */
struct _patricia_node_t *l, *r; /* left and right children */
struct _patricia_node_t *parent;/* may be used */
void *data; /* pointer to data */
void *user1; /* pointer to usr data (ex. route flap info) */
} patricia_node_t;
Specifically:
>>> N = patricia.patricia_node_t()
>>> assert N.data == None
>>> N.data = 1
TypeError: in method 'patricia_node_t_data_set', argument 2 of type 'void *'
Now my C is weak. From what I read in the SWIG book, I think this means I need to pass it a pointer to data. According to the book :
Also, if you need to pass the raw pointer value to some external python library, you can do it by casting the pointer object to an integer... However, the inverse operation is not possible, i.e., you can't build a Swig pointer object from a raw integer value.
Questions:
am I understanding this correctly?
how do I get around this? Is it %extend? A typemap? Specifics would be very helpful.
Notes:
I can't change the C source, but I can extend it in additional .h files or the interface .i file.
From what I understand, that "data" field should be able to contain "anything" for some reasonable value of "anything" that I don't really know.
I haven't used SWIG in a while, but I am pretty sure that you want to use a typemap that will take a PyObject* and cast it to the required void* and vice versa. Be sure to keep track of reference counts, of course.
It looks like you should pass SWIG a pointer to an integer. For example, if this was all in C, your error would be like this:
void set(struct _patricia_node_t *tree, void *data) {
tree->data = data;
}
...
int value = 1;
set(tree, &value); // OK! HOORAY!
set(tree, value); // NOT OK! FIRE SCORPIONS!
And it seems to me you're doing the Python equivalent of set(tree, value). Now I'm not an expert with SWIG but perhaps you could pass a tuple instead of an integer? Does N.data = (1,) work? This was the answer suggested by an Allegro CL + SWIG example, but I dunno how well it applies to Python.
An alternative is to use PyRadix, which uses the same underlying code.