I have a C++ class method like this:
class BinaryData
{
public:
...
void serialize(unsigned char* buf) const;
};
The serialize function simply writes the binary data into the unsigned char* buffer it is given.
I use SWIG to wrap this class.
I want to read the binary data as a byte array or an int array in Python.
Python Code:
buf = [1] * 1000;
binData.serialize(buf);
But this raises an exception saying the argument can't be converted to unsigned char*.
How can I call this function from Python?
The simplest thing to do is to convert it inside Python:
buf = [1] * 1000
binData.serialize(''.join(map(chr, buf)))  # join() needs strings, so convert each int first
This works out of the box, but is potentially inelegant depending on what Python users expect. You can work around that with SWIG, either inside the Python code, e.g. with:
%feature("shadow") BinaryData::serialize(unsigned char *) %{
def serialize(*args):
    #do something before
    args = (args[0], ''.join(args[1]))
    $action
    #do something after
%}
Or inside the generated interface code, e.g. using the buffer protocol:
%typemap(in) unsigned char *buf %{
// use PyObject_CheckBuffer and
// PyObject_GetBuffer to work with the underlying buffer
// AND/OR
// use PyIter_Check and
// PyObject_GetIter
%}
Where you prefer to do this is a personal choice based on your preferred programming language and other situation specific constraints.
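As a rough illustration of the order of checks sketched in the typemap comment above, here is the same decision written in plain Python: try the buffer protocol first, then fall back to iterating. The helper name as_byte_buffer is made up for this example and is not part of SWIG.

```python
def as_byte_buffer(obj):
    """Return the contents of obj as bytes.

    Hypothetical helper: objects that support the buffer protocol
    (bytes, bytearray, array.array, ...) are read directly; anything
    else is treated as an iterable of ints in range(256).
    """
    try:
        # memoryview() raises TypeError for objects without buffer support.
        return memoryview(obj).tobytes()
    except TypeError:
        pass
    # bytearray() accepts an iterable of ints and rejects values outside 0..255.
    return bytes(bytearray(obj))


print(as_byte_buffer([1] * 4))
print(as_byte_buffer(bytearray(b"ab")))
```

If serialize() writes into the buffer rather than reading from it, a mutable object such as bytearray(1000) is the kind of argument a buffer-protocol typemap would need.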
I have a library of C++ classes that I am building a Python interface for using SWIG. Many of these classes have methods that take in a double* array or int* array parameter without inputting a size. For example, there are many methods that have a declaration like one of the following:
void func(double* array);
void func2(double* array, double unrelated_parameter, ...);
I would like to be able to use these functions in Python, with the user passing in a Python numpy array. The size of these arrays is never given as a parameter to the function. The size of the input array is given in the constructor of the objects of these C++ classes, and it is assumed that every input array passed to these class methods has the same size. All of the numpy examples I have seen require me to add an int array_size parameter to the C++ method/function being wrapped.
Is there a way to wrap these C++ functions without having to change the API of my entire C++ library to include an int array_size parameter for every single function? Ideally, a user should pass in a Python numpy array and SWIG will automatically convert it to a double or int array on the C++ side.
I have already included numpy.i and followed the instructions here: https://numpy.org/doc/stable/reference/swig.interface-file.html but am getting errors like the following:
TypeError: in method 'func', argument 2 of type 'double *'
One way I can think of is to suppress the "no size" version of the function and extend the class to have a version with a throw-away dimension variable that uses the actual parameter in the class.
Example:
test.i
%module test
%{
#define SWIG_FILE_WITH_INIT
class Test {
public:
int _dim; // needs to be public, or have a public accessor.
Test(int dim) : _dim(dim) {}
double func(double* array) {
double sum = 0.0;
for(int i = 0; i < _dim; ++i)
sum += array[i];
return sum;
}
};
%}
%include "numpy.i"
%init %{
import_array();
%}
%apply (double* IN_ARRAY1, int DIM1) {(double* array, int /*unused*/)};
%ignore Test::func; // so the one-parameter version isn't wrapped
class Test {
public:
Test(int dim);
double func(double* array);
};
%rename("%s") Test::func; // unignore so the two-parameter version will be used.
%extend Test {
double func(double* array, int /*unused*/) {
return $self->func(array);
}
}
Demo:
>>> import test
>>> t = test.Test(5)
>>> import numpy as np
>>> a = np.array([1.5,2.0,2.5,3.75,4.25])
>>> t.func(a)
14.0
How can I pass a str value (containing 3000 {'0', '1'} bytes) obtained using python code as an argument to a python c extended function (extended using SWIG) which requires int * (fixed length int array) as an input argument? My code is such:
int *exposekey(int *bits) {
int a[1000];
for (int j=2000; j < 3000; j++) {
a[j - 2000] = bits[j];
}
return a;
}
What I've tried was to use ctypes (see below code):
import ctypes
ldpc = ctypes.cdll.LoadLibrary('./_ldpc.so')
arr = (ctypes.c_int * 3072)(<mentioned below>)
ldpc.exposekey(arr)
with 3072 {0, 1} entered in the position. Python returns syntax error : more than 255 arguments. This still doesn't help me to pass assigned str value instead of the initialized ctypes int array.
Other suggestion included using SWIG typemaps but how would that work for converting a str into int * ? Thanks in advance.
Regarding my comment, here are some more details about returning arrays from functions: [SO]: Returning an array using C. In short, there are three ways to handle this:
Make the returned variable static
Dynamically allocate it (using malloc (family) or new)
Turn it into an additional argument for the function
Getting that piece of C code to run within the Python interpreter is possible in 2 ways:
[Python 3.Docs]: Extending Python with C or C++ - which creates a C written Python module
A way of doing that is using swig which offers a simple interface for generating the module ([SWIG]: SWIG Basics) saving you the trouble of writing it yourself using [Python 3.Docs]: Python/C API Reference Manual
The other way around, leaving the code in a standard dll which can be accessed via [Python 3.Docs]: ctypes - A foreign function library for Python
Since they both are doing the same thing, mixing them together makes no sense. So, pick the one that best fits your needs.
1. ctypes
This is what you started with
It's one of the ways of doing things using ctypes
ctypes_demo.c:
#include <stdio.h>
#if defined(_WIN32)
# define CTYPES_DEMO_EXPORT_API __declspec(dllexport)
#else
# define CTYPES_DEMO_EXPORT_API
#endif
CTYPES_DEMO_EXPORT_API int exposekey(char *bitsIn, char *bitsOut) {
int ret = 0;
printf("Message from C code...\n");
for (int j = 0; j < 1000; j++)
{
bitsOut[j] = bitsIn[j + 2000];
ret++;
}
return ret;
}
Notes:
Based on comments, I changed the types in the function from int* to char*, because it's 4 times more compact (although it's still ~700% inefficient since 7 bits of each char are ignored versus only one of them being used; that can be fixed, but requires bitwise processing)
I took a and turned it into the 2nd argument (bitsOut). I think this is best because it's the caller's responsibility to allocate and deallocate the array (the 3rd option from the beginning)
I also modified the index range (without changing functionality), because it makes more sense to work with low index values and add something to them in one place, instead of a high index values and subtract (the same) something in another place
The return value is the number of bits set (obviously, 1000 in this case) but it's just an example
printf is just a dummy, to show that the C code gets executed
When dealing with such arrays, it's recommended to pass their dimensions as well, to avoid out of bounds errors. Also, error handling is an important aspect
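The bitwise processing mentioned in the first note could look roughly like this on the Python side. This is only a sketch: pack_bits and unpack_bits are hypothetical helper names, the bit order (first character goes into the most significant bit) is an assumption, and the C code would need a matching change to read individual bits.

```python
def pack_bits(bits):
    """Pack a string of '0'/'1' characters into bytes, 8 bits per byte."""
    if set(bits) - {"0", "1"}:
        raise ValueError("bits must contain only '0' and '1'")
    if not bits:
        return b""
    # Pad on the right with zeros up to a whole number of bytes.
    padded = bits + "0" * (-len(bits) % 8)
    return int(padded, 2).to_bytes(len(padded) // 8, "big")


def unpack_bits(data, count):
    """Inverse of pack_bits: return the first `count` bits as a string."""
    if not data:
        return ""
    as_text = bin(int.from_bytes(data, "big"))[2:].zfill(len(data) * 8)
    return as_text[:count]


packed = pack_bits("0100110001")
print(len(packed), unpack_bits(packed, 10))
```

With this packing a 3000 character bit string fits into 375 bytes.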
test_ctypes.py:
from ctypes import CDLL, c_char, c_char_p, c_int, create_string_buffer
bits_string = "010011000110101110101110101010010111011101101010101"
def main():
    dll = CDLL("./ctypes_demo.dll")
    exposekey = dll.exposekey
    exposekey.argtypes = [c_char_p, c_char_p]
    exposekey.restype = c_int
    bits_in = create_string_buffer(b"\0" * 2000 + bits_string.encode())
    bits_out = create_string_buffer(1000)
    print("Before: [{}]".format(bits_out.raw[:len(bits_string)].decode()))
    ret = exposekey(bits_in, bits_out)
    print("After: [{}]".format(bits_out.raw[:len(bits_string)].decode()))
    print("Return code: {}".format(ret))
if __name__ == "__main__":
    main()
Notes:
1st, I want to mention that running your code didn't raise the error you got
Specifying function's argtypes and restype is mandatory, and also makes things easier (documented in the ctypes tutorial)
I am printing the bits_out array (only the first - and relevant - part, as the rest are 0) in order to prove that the C code did its job
I initialize bits_in array with 2000 dummy 0 at the beginning, as those values are not relevant here. Also, the input string (bits_string) is not 3000 characters long (for obvious reasons). If your bits_string is 3000 characters long you can simply initialize bits_in like: bits_in = create_string_buffer(bits_string.encode())
Do not forget to initialize bits_out to an array with a size large enough (in our example 1000) for its purpose, otherwise segfault might arise when trying to set its content past the size
For this (simple) function, the ctypes variant was easier (at least for me, since I don't use swig frequently), but for more complex functions / projects it will become an overkill and switching to swig would be the right thing to do
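If the C function has to keep an int * parameter, the ctypes array can be built from the string instead of typing the values out as literal arguments, which sidesteps the "more than 255 arguments" syntax error from the question. A minimal sketch, where bits_to_c_int_array is a hypothetical helper name:

```python
import ctypes


def bits_to_c_int_array(bits):
    """Convert a string of '0'/'1' characters into a ctypes array of c_int."""
    values = []
    for ch in bits:
        if ch not in "01":
            raise ValueError("unexpected character: {!r}".format(ch))
        values.append(int(ch))
    # Star-unpacking a list is not subject to the limit on literal arguments.
    return (ctypes.c_int * len(values))(*values)


arr = bits_to_c_int_array("01" * 1500)
print(len(arr), list(arr[:4]))
```

The resulting array can be passed to a function whose argtypes entry is ctypes.POINTER(ctypes.c_int).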
Output (running with Python3.5 on Win):
c:\Work\Dev\StackOverflow\q47276327>"c:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" test_ctypes.py
Before: [ ]
Message from C code...
After: [010011000110101110101110101010010111011101101010101]
Return code: 1000
2. swig
Almost everything from the ctypes section, applies here as well
swig_demo.c:
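#include <stdlib.h>
#include <stdio.h>
#include "swig_demo.h"
char *exposekey(char *bitsIn) {
    /* 1000 characters plus the terminating NUL */
    char *bitsOut = (char*)malloc(sizeof(char) * 1001);
    if (bitsOut == NULL)
        return NULL;
    printf("Message from C code...\n");
    for (int j = 0; j < 1000; j++) {
        bitsOut[j] = bitsIn[j + 2000];
    }
    /* the returned char * is converted as a NUL-terminated string */
    bitsOut[1000] = '\0';
    return bitsOut;
}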
swig_demo.i:
%module swig_demo
%{
#include "swig_demo.h"
%}
%newobject exposekey;
%include "swig_demo.h"
swig_demo.h:
char *exposekey(char *bitsIn);
Notes:
Here I'm allocating the array and return it (the 2nd option from the beginning)
The .i file is a standard swig interface file
Defines the module, and its exports (via %include)
One thing that is worth mentioning is the %newobject directive that deallocates the pointer returned by exposekey to avoid memory leaks
The .h file just contains the function declaration, in order to be included by the .i file (it's not mandatory, but things are more elegant this way)
The rest is pretty much the same
test_swig.py:
from swig_demo import exposekey
bits_in = "010011000110101110101110101010010111011101101010101"
def main():
    bits_out = exposekey("\0" * 2000 + bits_in)
    print("C function returned: [{}]".format(bits_out))
if __name__ == "__main__":
    main()
Notes:
Things make much more sense from Python programmer's PoV
Code is a lot shorter (that is because swig did some "magic" behind the scenes):
The .c wrapper file generated from the .i file has ~120K
The swig_demo.py generated module has ~3K
I used the same technique with 2000 0 at the beginning of the string
Output:
c:\Work\Dev\StackOverflow\q47276327>"c:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" test_swig.py
Message from C code...
C function returned: [010011000110101110101110101010010111011101101010101]
3. Plain Python C API
I added this part as a personal exercise
This is what swig does, but "manually"
capi_demo.c:
#include "Python.h"
#include "swig_demo.h"
#define MOD_NAME "capi_demo"
static PyObject *PyExposekey(PyObject *self, PyObject *args) {
PyObject *bitsInArg = NULL, *bitsOutArg = NULL;
char *bitsIn = NULL, *bitsOut = NULL;
if (!PyArg_ParseTuple(args, "O", &bitsInArg))
return NULL;
bitsIn = PyBytes_AS_STRING(PyUnicode_AsEncodedString(bitsInArg, "ascii", "strict"));
bitsOut = exposekey(bitsIn);
bitsOutArg = PyUnicode_FromString(bitsOut);
free(bitsOut);
return bitsOutArg;
}
static PyMethodDef moduleMethods[] = {
{"exposekey", (PyCFunction)PyExposekey, METH_VARARGS, NULL},
{NULL}
};
static struct PyModuleDef moduleDef = {
PyModuleDef_HEAD_INIT, MOD_NAME, NULL, -1, moduleMethods
};
PyMODINIT_FUNC PyInit_capi_demo(void) {
return PyModule_Create(&moduleDef);
}
Notes:
It requires swig_demo.h and swig_demo.c (not going to duplicate their contents here)
It only works with Python 3 (actually I got quite some headaches making it work, especially because I was used to PyString_AsString which is no longer present)
Error handling is poor
test_capi.py is similar to test_swig.py with one (obvious) difference: from swig_demo import exposekey should be replaced by from capi_demo import exposekey
The output is also the same as for test_swig.py (again, not going to duplicate it here)
I'm a newbie in Python, and in embedding it too. I have one problem:
There is a function in my Python module that receives a buffer created with ctypes.create_string_buffer(size) and fills it with the content from some memory address:
def get_mem(self, address, size, data):
    self.mem.read_ram_block(address, size, data)
How should I call this method using a (char *) buffer? I want to fill my C++ buffer with the data received from Python.
If you only want to call the Python function ctypes.create_string_buffer(size), you could easily mirror the Python coding on the C++ side:
static PyObject* create_string_buffer(unsigned long size) {
PyObject *ctypes = PyImport_ImportModule("ctypes");
if (!ctypes) return 0;
PyObject *buf = PyObject_CallMethod(ctypes, "create_string_buffer", "k", size);
Py_DECREF(ctypes);
return buf;
}
If you'd like to use another type than unsigned long for the size, you'd need to change the format in PyObject_CallMethod as well. For example O is used for PyObject*. For a complete list of formats see the documentation for Building values.
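For reference, this is roughly what happens on the Python side once such a buffer is handed to a function that fills it. The read_ram_block below is a made-up stand-in for the asker's memory reader; only create_string_buffer, memmove and the raw attribute are real ctypes API.

```python
import ctypes


def read_ram_block(address, size, data):
    """Hypothetical stand-in: fill `data` with `size` bytes derived from `address`."""
    payload = bytes((address + i) & 0xFF for i in range(size))
    # Copy into the ctypes buffer in place; the caller owns the buffer.
    ctypes.memmove(data, payload, size)


buf = ctypes.create_string_buffer(8)  # 8 zero bytes, mutable
read_ram_block(0x10, 8, buf)
print(buf.raw)
```

After the call, the C++ side still holds the PyObject* for the buffer and can copy the bytes out of it into its own char array, for example through the buffer protocol.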
I am very new to swig and I am trying to create a swig wrapper in order to use a few C++ files in Python. I have the following C++ class.
The following is a snippet of the code that I am trying to convert:
/*packet_buffer.h*/
class CPacketBuffer {
public:
// construct based on given buffer, data is not copied
CPacketBuffer(uint8_t* data, uint32_t length) {
mpBuffer = data;
mLength = length;
mHead = 0;
mTail = length;
}
uint8_t* GetBuffer() {
return (mpBuffer + mHead);
}
void Add(const uint8_t* data, uint32_t length) {
if ((mTail + length) > mLength) {
length = (mLength - mTail);
}
//....
}
I have been trying all day to write an example.i file that accepts pointers to typedefs (uint8_t *), using the swig documentation for help, but I have been unsuccessful.
The following is a packet_buffer.i file that I have tried which doesn't work.
%module packet_buffer
%include typemaps.i
%apply unsigned char* {uint8_t*};
%apply unit8_t *INPUT {uint8_t *data};
%{
#define SWIG_FILE_WITH_INIT
#include "../include/packet_buffer.h"
%}
%include "../include/packet_buffer.h"
How do I write a swig code for member functions that take pointers to typedefs?
Can I write a common %apply that can be used across the code or will I have to write specifics for each INPUT, OUTPUT parameter?
If I've understood this correctly the problem you're facing isn't that they're pointers, it's that they're potentially unbounded arrays.
You can wrap an unbounded C array using carrays.i and the "%array_class" macro, e.g.:
%module packet
%include "stdint.i"
%{
#include "packet.h"
%}
%include "carrays.i"
%array_class(uint8_t, buffer);
%include "packet.h"
This would then allow you to write something like the following in Python:
a = packet.buffer(10000000)
p = packet.CPacketBuffer(a.cast(), 10000000)
Note that you'll need to ensure the life of the buffer is sufficient - if the Python object gets released without the C++ code being aware you'll end up with undefined behaviour.
You can convert uint8_t* pointers (unbounded arrays) to buffer instances in Python using the frompointer methods that the %array_class macro also creates, e.g.:
r = p.GetBuffer()  # GetBuffer is a method of the CPacketBuffer instance p
buf = packet.buffer_frompointer(r)
You can add additional Python code to automate/hide most of the conversion between buffers if desired, or use MemoryViews to integrate tighter with Python on the C API side.
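One way the extra Python code could handle the lifetime issue mentioned above is to keep the array object and the wrapped object together. A minimal sketch that assumes nothing about the generated module: PacketWithBuffer and its make_packet argument are hypothetical, and with the real module the factory would be something like lambda a, n: packet.CPacketBuffer(a.cast(), n).

```python
class PacketWithBuffer:
    """Pairs a wrapped packet object with the array it points into.

    Hypothetical helper: holding a reference to `backing` means the
    memory cannot be released while this object is still alive.
    """

    def __init__(self, make_packet, backing, length):
        self._backing = backing  # strong reference keeps the array alive
        self._packet = make_packet(backing, length)

    def __getattr__(self, name):
        # Only called for names not found on this object,
        # so everything else is forwarded to the wrapped packet.
        return getattr(self._packet, name)
```

Callers then hold a single object, and the array is released only after the packet wrapper itself goes away.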
In general though since this is C++ I'd suggest using std::vector for this - it's much nicer to use on the Python side than the unbounded arrays and the cost is minimal for the safety and simplicity it gives you.
I'm working on a Computer Vision system with OpenCV in C++. I wrote a small GUI for it by using Boost::Python and PyQT4. Since I don't want to introduce QT to the C++ project, I need a way to expose Mat::data (an unsigned char * member) to Python in order to create a QImage there.
First I tried it like this:
class_<cv::Mat>("Mat", init<>())
.add_property("data_", make_getter(&Mat::data))
but then I got this in Python: "TypeError: No to_python (by-value) converter found for C++ type: unsigned char*"
I couldn't write a converter for it because a PyBuf of course needs to know its size.
So my next approach was trying to create a proxy object like this:
struct uchar_array {
uchar *data;
size_t size;
bool copied;
static const bool debug = true;
// copy from byte array
uchar_array(uchar *ptr, size_t size, bool copy) {
this->size = size;
this->copied = copy;
if(copied) {
data = new uchar[size];
memcpy(data, ptr, size);
} else {
data = ptr;
}
if(debug) LOG_ERR("init %d bytes in #%p, mem #%p", size, this, data);
}
PyObject *py_ptr() {
if(debug) LOG_ERR("py_ptr");
return boost::python::incref(PyBuffer_FromMemory(data, size));
}
~uchar_array() {
if(debug) LOG_ERR("~uchar_array #%p", this);
if(copied) {
if(debug) LOG_ERR("free #%p, mem #%p", this, data);
delete [] data;
}
}
};
And exposing this via a non-member method:
uchar_array *getMatData(Mat &mat) {
size_t size = mat.rows * mat.cols * mat.elemSize();
uchar_array *arr = new uchar_array(mat.data, size, true);
return arr;
}
class_<cv::Mat>("Mat", init<>())
.def("data", getMatData, with_custodian_and_ward_postcall<1, 0, return_value_policy<manage_new_object> >())
class_<uchar_array, shared_ptr<uchar_array> >("uchar_array", no_init)
.def("ptr", &uchar_array::py_ptr);
This works and gets me the buffer into Python, but there are two problems with this approach:
I now have to use mat.data().ptr(), it would be nicer to just do mat.data
When doing mat.data().ptr(), it seems the temporary uchar_array gets destructed immediately after calling ptr(), thus freeing the memory while I still want to use it
I did several experiments with custodian_and_ward and other stuff, but got to a point where I no longer understood what was happening.
So, could anyone please tell me: What's the preferred way to export an unsigned char * to a PyBuf? In two variants, if possible: allocated for Python so should be freed by Python or as internal pointer where C++ frees it.
char* buffers are not really python friendly. On my project (which is not performance sensitive) I would use a std::vector or std::string, depending on what it was intended to contain. Both of these are nicely python friendly.
If you are not able to alter the underlying data structure, you can use add_property and a couple of getter and setter functions to convert data to a more convenient structure.