Passing numpy string array to C using Swig

Passing numpy string array to C using Swig - python

my C program needs a char** input which I store in python as a numpy object array of strings.
a = np.empty(2, dtype=object)
a[0] = 'hi you'
a[1] = 'goodbye'
What is the correct way to pass this to my C program considering that numpy.i only defines typemaps for char* arrays?

That's impossibru AFAIK, and as far as the docs go:
Some data types are not yet supported, like boolean arrays and string arrays.
You'll either have to write an intermediary function that takes the strings as separate arguments, put them in an array and pass that to your C function, or work out another way of doing things

So it is doable, but you need to convert the numpy object array to a list of python strings with a.tolist(). Then you can pass it to the C code with the following tutorial code as a char **
http://www.swig.org/Doc1.3/Python.html#Python_nn59
Edit: Turned out to be a real pain in the *** since the example above is for Python 2 but gives useless error messages in Python 3. Python 3 moved to unicode strings and I had to do some doc reading to make it work. Here is the python 3 equivalent of the above example.
// This tells SWIG to treat char ** as a special case
%typemap(in) char ** {
/* Check if is a list */
if (PyList_Check($input)) {
int size = PyList_Size($input);
Py_ssize_t i = 0;
$1 = (char **) malloc((size+1)*sizeof(char *));
for (i = 0; i < size; i++) {
PyObject *o = PyList_GetItem($input,i);
if (PyUnicode_Check(o))
$1[i] = PyUnicode_AsUTF8(PyList_GetItem($input,i));
else {
//PyErr_SetString(PyExc_TypeError,"list must contain strings");
PyErr_Format(PyExc_TypeError, "list must contain strings. %d/%d element was not string.", i, size);
free($1);
return NULL;
}
}
$1[i] = 0;
} else {
PyErr_SetString(PyExc_TypeError,"not a list");
return NULL;
}
}
// This cleans up the char ** array we malloc'd before the function call
%typemap(freearg) char ** {
free((char *) $1);
}
Essentially just had to replace PyString_Check with PyUnicode_Check and PyString_AsString with PyUnicode_AsUTF8 (introduced in python 3.3 and later)

Related

Pass Python bytes to C/C++ function using swig?

How do you pass a bytes value from Python (like data loaded from a file with open('my file.dat', 'rb').read()) to a C/C++ function using swig?
When I try using char * or uint8_t * and then a size parameter it gives me an error like this:
TypeError: in method 'processData', argument 3 of type 'char *'
I've tried using %pybuffer_mutable_binary and %pybuffer_binary but they don't seem to change the definition of the wrapper and I still get the same error.

Without code can't diagnose what is wrong, but likely you didn't declare %pybuffer lines before the function definitions. If declared after the generated wrappers won't use them when processing the functions, which would explain "they don't seem to change the definition of the wrapper".
Here's a working example. Note that passing an immutable item to a function that modifies the string will crash Python. It would be nice if the commands from pybuffer.i type-checked the Python object for mutability. If you want that don't use pybuffer.i.
test.i
%module test
%{
#include <stdlib.h>
#include <string.h>
%}
%include <pybuffer.i>
%pybuffer_mutable_string(char* str1)
%pybuffer_string(const char* str2)
%pybuffer_mutable_binary(char* str3, size_t size)
%pybuffer_binary(const char* str4, size_t size)
%inline %{
void funcms(char *str1) {
strupr(str1);
}
size_t funcs(const char *str2) {
return strlen(str2);
}
void funcmb(char* str3, size_t size) {
memset(str3,'A',size);
}
size_t funcb(const char* str4, size_t size) {
size_t tmp = 0;
for(size_t i = 0; i < size; ++i)
tmp += str4[i];
return tmp % 256;
}
%}
Demo:
>>> import test
>>> b=bytearray(b'abc') # mutable string (nul-terminated)
>>> test.funcms(b)
>>> b
bytearray(b'ABC')
>>> test.funcs(b'abc') # immutable string (nul-terminated)
3
>>> b=bytearray(b'ab\0cd\0ef') # mutable data (includes nulls)
>>> test.funcmb(b)
>>> b
bytearray(b'AAAAAAAA')
>>> test.funcb(b'ab\0cd\0ef') # immutable data (compute byte checksum)
85
>>> sum(b'ab\0cd\0ef')%256 # verify result
85

I think the best way to do this is a type map using the Python buffer interface. This passes a pointer to your data to the C/C++ function without any copying of data. For example:
%typemap(in, numinputs=1) (const char *data, unsigned long int size) {
Py_buffer view;
if (PyObject_CheckBuffer($input) != 1) {
PyErr_SetString(
PyExc_TypeError,
"in method '$symname', argument $argnum does not support the buffer interface");
SWIG_fail;
}
if (PyObject_GetBuffer($input, &view, PyBUF_SIMPLE) != 0) {
PyErr_SetString(
PyExc_TypeError,
"in method '$symname', argument $argnum does not export a simple buffer");
SWIG_fail;
}
$1 = view.buf;
$2 = view.len;
PyBuffer_Release(&view);
}
%typemap(doc) const char *data, unsigned long int size "$1_name: readable buffer (e.g. bytes)"

CPPYY/CTYPES passing array of strings as char* args[]

I only recently started using cppyy and ctypes, so this may be a bit of a silly question. I have the following C++ function:
float method(const char* args[]) {
...
}
and from Python I want to pass args as a list of strings, i.e.:
args = *magic*
x = cppyy.gbl.method(args)
I have previously found this, so I used
def setParameters(strParamList):
numParams = len(strParamList)
strArrayType = ct.c_char_p * numParams
strArray = strArrayType()
for i, param in enumerate(strParamList):
strArray[i] = param
lib.SetParams(numParams, strArray)
and from Python:
args = setParameters([b'hello', b'world'])
c_types.c_char_p expects a bytes array. However, when calling x = cppyy.gbl.method(args) I get
TypeError: could not convert argument 1 (could not convert argument to buffer or nullptr)
I'm not entirely sure why this would be wrong since the args is a <__main__.c_char_p_Array_2> object, which I believe should be converted to a const char* args[].

For the sake of having a concrete example, I'll use this as the .cpp file:
#include <cstdlib>
extern "C"
float method(const char* args[]) {
float sum = 0.0f;
const char **p = args;
while(*p) {
sum += std::atof(*p++);
}
return sum;
}
And I'll assume it was compiled with g++ method.cpp -fPIC -shared -o method.so. Given those assumptions, here's an example of how you could use it from Python:
#!/usr/bin/env python3
from ctypes import *
lib = CDLL("./method.so")
lib.method.restype = c_float
lib.method.argtypes = (POINTER(c_char_p),)
def method(args):
return lib.method((c_char_p * (len(args) + 1))(*args))
print(method([b'1.23', b'45.6']))
We make a C array to hold the Python arguments. len(args) + 1 makes sure there's room for the null pointer sentinel.

ctypes does not have a public API that is usable from C/C++ for extension writers, so the handling of ctypes by cppyy is by necessity somewhat clunky. What's going wrong, is that the generated ctypes array of const char* is of type const char*[2] not const char*[] and since cppyy does a direct type match for ctypes types, that fails.
As-is, some code somewhere needs to do a conversion of the Python strings to low-level C ones, and hold on to that memory for the duration of the call. Me, personally, I'd use a little C++ wrapper, rather than having to think things through on the Python side. The point being that an std::vector<std::string> can deal with the necessary conversions (so no bytes type needed, for example, but of course allowed if you want to) and it can hold the temporary memory.
So, if you're given some 3rd party interface like this (putting it inline for cppyy only for the sake of the example):
import cppyy
cppyy.cppdef("""
float method(const char* args[], int len) {
for (int i = 0; i < len; ++i)
std::cerr << args[i] << " ";
std::cerr << std::endl;
return 42.f;
}
""")
Then I'd generate a wrapper:
# write a C++ wrapper to hide C code
cppyy.cppdef("""
namespace MyCppAPI {
float method(const std::vector<std::string>& args) {
std::vector<const char*> v;
v.reserve(args.size());
for (auto& s : args) v.push_back(s.c_str());
return ::method(v.data(), v.size());
}
}
""")
Then replace the original C API with the C++ version:
# replace C version with C++ one for all Python users
cppyy.gbl.method = cppyy.gbl.MyCppAPI.method
and things will be as expected for any other person downstream:
# now use it as expected
cppyy.gbl.method(["aap", "noot", "mies"])
All that said, obviously there is no reason why cppyy couldn't do this bit of wrapping automatically. I created this issue: https://bitbucket.org/wlav/cppyy/issues/235/automatically-convert-python-tuple-of

OUT argument internally allocated to return an array of structs

I'm new to swig and I have the following function which i cant fix:
int get_list(IN const char * string, OUT struct entry ** results);
where struct entry is defined:
struct flux_entry
{
char * addr_str;
char cc[2];
};
the entry struct is properly converted to a python class.
I googled but couldn't find any explanation i could use.
I want to make it return a tuple of: (original get_list int return value, python list of entry python objects, based on the results buffer), but don't know how to convert the C entry to a python object in the argout code snippet.
I've managed to get thus far:
%typemap(argout) struct entry **
{
PyObject *o = PyList_New(0);
int i;
for(i=0; $1[i] ; i++)
{
PyList_Append(o, SWIG_HOW_TO_CONVERT_TO_PYOBJECT($1[i]));
}
$result = o;
}
what should i replace SWIG_HOW_TO_CONVERT_TO_PYOBJECT with?
passed results is supposed to be a pointer to a (struct entry *) type, set to NULL before calling get_list and should be set to an allocated array of struct entry * pointers. maybe a small wrapper function could make that easier?
the struct entry array is allocated within the C function using malloc, after calculating (inside get_list) how many elements are needed, and ends with a NULL pointer to indicate the end of the array.
i'd also like to make sure it's freed somewhere :)
thanks!

This should at least give you a starting point that works. I still wasn't sure how the data was returned, since to return an array of pointers so that the final one was NULL I'd think you'd need a struct entry ***, so I just set addr_str = NULL on the last one as a sentinel, and just put some dummy data partially based on the input string into the fields. Modify as needed to suit your needs:
%module example
// Insert the structure definition and function to wrap into the wrapper code.
%{
struct entry {
char* addr_str;
char cc[2];
};
int get_list(const char* string, struct entry** results)
{
*results = malloc(3 * sizeof(struct entry));
(*results)[0].addr_str = malloc(10);
strcpy((*results)[0].addr_str,"hello");
(*results)[0].cc[0] = string[0];
(*results)[0].cc[1] = string[1];
(*results)[1].addr_str = malloc(10);
strcpy((*results)[1].addr_str,"there");
(*results)[1].cc[0] = string[2];
(*results)[1].cc[1] = string[3];
(*results)[2].addr_str = NULL;
return 0;
}
%}
#include <typemaps.i>
// Define the structure for SWIG
struct entry {
char* addr_str;
char cc[2];
};
// Define a set of typemaps to be used for an output parameter.
// This typemap suppresses requiring the parameter as an input.
// A temp variable is created and passed instead.
%typemap(in,numinputs=0) struct entry **OUTPUT (struct entry* temp) %{
$1 = &temp;
%}
// Build a list of tuples containing the two entries from the struct.
// Append the new Python list object to the existing "int" result.
%typemap(argout) struct entry **OUTPUT {
int i = 0;
PyObject* out = PyList_New(0);
while((*$1)[i].addr_str != NULL)
{
//PyObject* t = PyTuple_New(2);
//PyTuple_SET_ITEM(t,0,PyBytes_FromString((*$1)[i].addr_str));
//PyTuple_SET_ITEM(t,1,PyBytes_FromStringAndSize((*$1)[i].cc,2));
//PyList_Append(out,t);
//Py_DECREF(t);
PyObject* s = SWIG_NewPointerObj(*$1+i,$descriptor(struct entry*),0);
PyList_Append(out,s);
Py_DECREF(s);
++i;
}
$result = SWIG_AppendOutput($result,out);
}
// Since a Python object was created and the data copied for each entry struct,
// free the memory returned in the structure.
//%typemap(freearg) struct entry **OUTPUT {
// int i=0;
// while((*$1)[i].addr_str != NULL) {
// free((*$1)[i].addr_str);
// ++i;
// }
// free(*$1);
//}
// Apply the OUTPUT typemap set to the "results" parameter.
%apply struct entry **OUTPUT {struct entry** results};
// Finally, define the function for SWIG
int get_list(const char* string, struct entry** results);
Demo (Python 3.3):
>>> import example
>>> example.get_list('abcd')
[0, [(b'hello', b'ab'), (b'there', b'cd')]]
Hope that helps.
Edit:
I commented out the tuple creation and just save the entry* proxy instead. This doesn't leak Python objects, but the memory malloced for use by an entry* is not freed. I'm not sure where to put that, although I'm experimenting with %extend.

Allocating a chunk of memory and immediately freeing it in C fails

In part of the code of a C module I am integrating with Python, I have a char** (array of strings) which is repeatedly allocated, filled with allocated strings, then freed and allocated again. The general pattern is that when a certain function is called (from Python) supplying the new contents of the array (as a list), it iterates through the array of strings, freeing each of them, then frees the array itself. It then allocates the array again to hold the contents of the new Python list, then allocates memory for each of the strings to hold.
All that to say that I am getting an error when attempting to free one of the strings in the list. This error is deterministic; it is always the same word from the same list of words at the same point in the program, but there is nothing extraordinary about that word or list of words. (It is just ["CCellEnv", "18", "34"], which is a similar format to many others) I tried adding some debug code to the loop that allocates the strings; here is the function that produces the error:
static PyObject* py_set_static_line(PyObject* self, PyObject* args)
{
int i;
//Free the old values of the allocated variables, if there are any
if (numStaticWords > 0)
{
for (i = 0; i < numStaticWords; i++)
{
printf("Freeing word %d = '%s'\n", i, staticWords[i]);
free(staticWords[i]);
}
free(staticWords);
free(staticWordMatches);
}
//Parse arguments
PyObject* wordList;
unsigned short numWords;
PyObject* wordMatchesList;
if (!PyArg_ParseTuple(args, "O!HO!", &PyList_Type, &wordList, &numWords, &PyList_Type, &wordMatchesList))
return NULL;
numStaticWords = numWords;
if (numStaticWords > 0)
{
staticWords = malloc(sizeof(char*) * numStaticWords);
staticWordMatches = malloc(sizeof(int) * numStaticWords);
PyObject* wordObj;
PyObject* matchObj;
char* word;
for (i = 0; i < numStaticWords; i++)
{
//wordList is the list of strings passed from Python
wordObj = PyList_GetItem(wordList, i);
word = PyString_AsString(wordObj); //word is "18" in the failing case
//staticWords is the char** array of strings, which has already been malloc'd
staticWords[i] = malloc(sizeof(char) * strlen(word));
//Test freeing the word to see if it crashes
free(staticWords[i]); //Crashes for one specific word
staticWords[i] = malloc(sizeof(char) * strlen(word));
strcpy(staticWords[i], word);
matchObj = PyList_GetItem(wordMatchesList, i);
if (matchObj == Py_None)
{
staticWordMatches[i] = -1;
}
else
{
staticWordMatches[i] = PyInt_AsLong(matchObj);
}
}
}
Py_RETURN_NONE;
}
So, somehow, always and only for this specific string, allocating the memory to put it in, then immediately freeing that memory causes an error. The actual text of the string is not even copied into the memory. What could be causing this mysterious behavior?

Here
staticWords[i] = malloc(sizeof(char) * strlen(word));
strcpy(staticWords[i], word);
you are missing to allocate the 0-termination for the "strings". So any operation on those character arrays as strings, most likely will lead to undefined behaviour.
Do it this way:
{
int isNull = !word;
staticWords[i] = calloc(sizeof(*staticWords[i]), (isNull ?0 :strlen(word)) + 1);
strcpy(staticWords[i], isNull ?"" :word);
}

Writing a Python C extension: how to correctly load a PyListObject?

While attempting to read a Python list filled with float numbers and to populate real channels[7] with their values (I'm using F2C, so real is just a typedef for float), all I am able to retrieve from it are zero values. Can you point out the error in the code below?
static PyObject *orbital_spectra(PyObject *self, PyObject *args) {
PyListObject *input = (PyListObject*)PyList_New(0);
real channels[7], coefficients[7], values[240];
int i;
if (!PyArg_ParseTuple(args, "O!", &PyList_Type, &input)) {
return NULL;
}
for (i = 0; i < PyList_Size(input); i++) {
printf("%f\n", PyList_GetItem(input, (Py_ssize_t)i)); // <--- Prints zeros
}
//....
}

PyList_GetItem will return a PyObject*. You need to convert that to a number C understands. Try changing your code to this:
printf("%f\n", PyFloat_AsDouble(PyList_GetItem(input, (Py_ssize_t)i)));

Few things I see in this code.
You leak a reference, don't create that empty list at the beginning, it's not needed.
You don't need to cast to PyListObject.
PyList_GetItem returns a PyObject, not a float. Use PyFloat_AsDouble to extract the value.
If PyList_GetItem returns NULL, then an exception has been thrown, and you should check for it.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Passing numpy string array to C using Swig - python

my C program needs a char** input which I store in python as a numpy object array of strings. a = np.empty(2, dtype=object) a[0] = 'hi you' a[1] = 'goodbye' What is the correct way to pass this to my C program considering that numpy.i only defines typemaps for char* arrays?

Related

Pass Python bytes to C/C++ function using swig?

CPPYY/CTYPES passing array of strings as char* args[]

OUT argument internally allocated to return an array of structs

Allocating a chunk of memory and immediately freeing it in C fails

Writing a Python C extension: how to correctly load a PyListObject?

Categories

Resources