Wrapping C++ classes that contain wxString with Cython - python

I'm working on a Python extension to tie in with a C++ application written using wxWidgets for the GUI. I'm using Cython, and have the basic system (build tools, plus a starter extension with appropriate version details etc) happily working.
I'm only interested in making backend (non-GUI) functionality available, such as file parsing and processing. However, all classes - not just the GUI ones - use wxString for string data, such as in the following minimal example:
#include <wx/string.h>
class MyClass{
    wxString name;
    wxString get_name(){
        return this->name;
    }
};
My question is what is the best way to go about wrapping such a class? Is there a simple way to interface between a Python string and a wxString instance? Or will I need to wrap the wxString class as well? Am I able to tie in somehow with the wxPython port to avoid re-inventing the wheel?

I got it to work by using the static wxString::FromUTF8() function to convert from Python to wxString, and the wxString.ToUTF8() to go in the other direction. The following is the code I came up with:
# Import the parts of wxString we want to use.
cdef extern from "wx/string.h":
    cdef cppclass wxString:
        char* ToUTF8()

# Import useful static functions from the class.
cdef extern from "wx/string.h" namespace "wxString":
    wxString FromUTF8(char*)

# Function to convert from Python string to wxString. This can be given either
# a unicode string, or a UTF-8 encoded byte string. Results with other encodings
# are undefined and will probably lead to errors.
cdef inline wxString from_python(python_string):
    # If it is a Python unicode string, encode it to a UTF-8 byte string as this
    # is how we will pass it to wxString.
    if isinstance(python_string, unicode):
        byte_string = python_string.encode('UTF-8')
    # It is already a byte string, and we have no choice but to assume it's valid
    # UTF-8 as there's no (sane/efficient) way to detect the encoding.
    else:
        byte_string = python_string
    # Turn the byte string (which is still a Python object) into a C-level char*
    # string.
    cdef char* c_string = byte_string
    # Use the static wxString::FromUTF8() function to get us a wxString.
    return FromUTF8(c_string)

# Function to convert a wxString to a UTF-8 encoded Python byte string.
cdef inline object to_python_utf8(wxString wx_string):
    return wx_string.ToUTF8()

# Function to convert a wxString to a Python unicode string.
cdef inline object to_python_unicode(wxString wx_string):
    # Since the wxString.ToUTF8() method returns a const char*, we'd have to try
    # and cast it if we wanted to do it all in here. I've tried this and can't
    # seem to get it to work. But calling the to_python_utf8() function
    # means Cython handles the conversions and it all just works. Plus, since
    # they are defined as inline functions this may well be simplified down when
    # compiled.
    byte_string = to_python_utf8(wx_string)
    # Decode it to a unicode string and we're done.
    return byte_string.decode('UTF-8')
Simply put this in a .pxd file (personally, I put it in a subdirectory as wx/string.pxd; make sure you also create a wx/__init__.pxd if you choose to do the same). Then cimport it and call the functions as appropriate:
cimport wx.string
wx_string = wx.string.from_python(python_string)
python_string = wx.string.to_python_unicode(wx_string)
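The encoding rule that from_python() applies can be tried out in plain Python before any wxWidgets code is involved. This is only a minimal sketch, assuming Python 3 (where str takes the place of unicode); the helper name to_utf8_bytes is hypothetical and is not part of the .pxd file above. Unlike the Cython version, it checks eagerly that incoming byte strings really are UTF-8.

```python
def to_utf8_bytes(python_string):
    """Return UTF-8 bytes for a str, or validated bytes unchanged."""
    if isinstance(python_string, str):
        # Text is encoded to UTF-8, the form FromUTF8() expects.
        return python_string.encode("utf-8")
    if isinstance(python_string, (bytes, bytearray)):
        raw = bytes(python_string)
        # Validate eagerly: decode() raises UnicodeDecodeError on invalid UTF-8.
        raw.decode("utf-8")
        return raw
    raise TypeError("expected str or bytes, got %s" % type(python_string).__name__)


encoded = to_utf8_bytes("na\u00efve")
round_tripped = encoded.decode("utf-8")
```

Failing early on badly encoded input may be easier to debug than handing undefined bytes to wxString.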

The first approach I would try is to use the wxString constructor:
wxString(const wxChar* psz, size_t nLength = wxSTRING_MAXLEN)
and pass the const char* string to it to create the object.
Then write some inline functions to convert from python string to wxString and vice versa.
PyObject* PyString_FromStringAndSize(const char *v, Py_ssize_t len)
Now the only downside I see is that the string might be duplicated, once in wxString and once in the Python world.
A second approach would be to subclass wxString and reimplement all the operations by hand so that they use the character buffer of Python's PyString object. Cython can help in coding such a subclass.

Related

Storing unsafe C derivative of temporary Python reference Cython

Although I have seen a few similar questions already asked, I couldn't get my head around how to fix this.
Basically I have this function:
Module one.pyx:
cdef char *isInData(data, dataLength):
    cdef char *ret = <char *>malloc(200)
    memset(ret, 0x00, 200)
    if (b"Hello" in data and b"World" in data):
        strcat(ret, b"Hello World!")
    return ret
Module two.pyx:
import one
from libc.stdlib cimport malloc,realloc, free
cdef char *tempo():
    cdef char *str
    str = one.isInData(b"Hello what is up World", 23)
    # do some stuff
    free(str)
The issue occurs on the line str = one.isInData(b"Hello what is up World", 23). I am assuming that as soon as isInData->ret is assigned to str, isInData->ret is deleted, which causes this issue. Can anyone help me fix this?
import one
This line does a Python import of one. It doesn't know about any cdef functions defined in one (or even that one is a Cython module...). Therefore it assumes that isInData is a Python object that it can look up and that it'll be a callable returning another Python object.
cdef char* str = some_python_function() is unsafe because str points into the Python object. However, the Python object is only a temporary and is most likely freed almost immediately.
What you mean is:
cimport one
which tells Cython that it's a Cython module that you're importing, and it should expect to find the declarations at compile-time. You'll probably need to write a pxd file to give the declarations.
I haven't looked in detail at the rest of your code so don't know if you are handling C strings right. But in general you may find it easier just to use Python strings rather than messing around with C string handling.
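To illustrate the last point, here is a rough sketch of the same check written with Python byte strings only, so there is no malloc, strcat or free to get wrong. The name is_in_data is hypothetical, and returning an empty byte string when there is no match is an assumption that mirrors the zero-filled buffer in the question.

```python
def is_in_data(data):
    """Return b"Hello World!" if both words occur in data, else b""."""
    if b"Hello" in data and b"World" in data:
        return b"Hello World!"
    return b""


result = is_in_data(b"Hello what is up World")
```

Because the result is an ordinary Python object, its lifetime is handled by reference counting and the caller never needs to free it.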

How to make an interaction b/w the C and the python files to copy the data?

In this code, I am trying to copy data from the first path to the second, using ctypes to call the C code from Python. I tried both sprintf and snprintf, but the data was not copied. I also tried both .so and .dll files, but nothing worked. When I run the code in C alone, it copies the data, but when I run it from Python, it gives me the following error.
Note: Please don't suggest copying the data using Python instead of C.
Error
hello_lib.transfer(ctypes.c_char_p("/home/bilal/Pictures/New"), ctypes.c_char_p("/home/bilal/Music"))
TypeError: bytes or integer address expected instead of str instance
practice.c
#include <stdio.h>
#include <stdlib.h>
char *transfer(char *f1, char *f2){
    char cmd[500];
    sprintf(cmd, "rsync -av %s %s", f1, f2);
    // snprintf(cmd, 500, "rsync -av %s %s", f1, f2);
    system(cmd);
}
int main(void){
    transfer("/home/bilal/Pictures/New", "/home/bilal/Music");
    return 0;
}
practice.py
from ctypes import cdll
from ctypes import c_char_p
import ctypes
hello_lib = cdll.LoadLibrary("/Path/to/practice.so")
# hello_lib = cdll.LoadLibrary("/Path/to/practice.dll")
hello_lib.transfer(ctypes.c_char_p("/home/bilal/Pictures/New"), ctypes.c_char_p("/home/bilal/Music"))
FYI, your transfer() function declares char* as a return value, but doesn't return anything. Change it to void and then use the following.
.argtypes declares the types of arguments, and ctypes uses it for checking that the number and type of parameters is correct. Without it, it might work for some types but won't check that the types match.
.restype declares the return type. Use None for void.
The c_char_p type requires byte strings, not Unicode strings. Note the leading b. You can also .encode() your Unicode strings to convert them to byte strings. Use wchar_t* in C and c_wchar_p in Python to use Unicode strings, but then you'd need to call _wsystem(), which is only available on Windows.
import ctypes as ct
hello_lib = ct.CDLL('/Path/to/practice.so')
hello_lib.transfer.argtypes = ct.c_char_p,ct.c_char_p
hello_lib.transfer.restype = None
hello_lib.transfer(b'/home/bilal/Pictures/New', b'/home/bilal/Music')
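The bytes-versus-str rule can be checked without loading any shared library. The following is a small sketch that reproduces the TypeError from the question and then shows the .encode() route; the variable names are illustrative only, and UTF-8 is assumed to be the right encoding for the paths.

```python
import ctypes

path_text = "/home/bilal/Pictures/New"

# Passing a str to c_char_p is rejected, as in the traceback above.
try:
    ctypes.c_char_p(path_text)
    accepted_str = True
except TypeError:
    accepted_str = False

# Encoding to bytes first gives c_char_p something it accepts.
path_bytes = path_text.encode("utf-8")
as_c_string = ctypes.c_char_p(path_bytes)
```

Paths that arrive as str (from user input, for example) can be encoded once at the boundary and passed on as bytes from there.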

Cython: std::function callbacks with custom parameter types

Please read this post before answering: Pass a closure from Cython to C++
In the accepted answers, it is neatly shown how a python function is converted into a std::function using Boost Python.
Following this example I'm able to wrap functions taking an std::function as an argument and call them using a python function as input. However, this only works when the std::function parameters are primitives like int, double, string etc.
Any guidance on how to make this work for custom types as well will be highly appreciated.
This won't be a complete answer - it assumes you can fill in the gaps from my previous answer that the question was based on. Unfortunately it is a little bit more complicated than that case.
Just to define the problem - assume you have a parameter of a custom C++ class, like:
class cpp_class {
// some non-trivial contents
};
and thus your C++ interface looks like this:
void call_some_std_func(std::function<void(cpp_class&)> callback) {
    cpp_class c;
    callback(c);
}
The first thing to do is to write a Cython wrapper for your C++ class (in principle you could make a Boost Python wrapper instead). Here you need to make a choice about "ownership" of the C++ object. The first choice is to make a copy:
cdef extern from "cpp_file.hpp":
    cppclass cpp_class:
        pass # details

cdef class CyWrapper:
    cdef cpp_class* ptr
    def __dealloc__(self):
        del self.ptr
    # other details following standard wrapper pattern

cdef public make_CyWrapper(cpp_class& x):
    obj = CyWrapper()
    obj.ptr = new cpp_class(x)
    return obj
I've created a wrapper class with a destructor that handles the memory and a publicly accessible constructor function that can be called from external code. This version is safe because your wrapper owns the object it holds and so there can be no writes to invalid memory. However, because it makes a copy, you can't make changes to the original C++ object.
A second option is to hold a pointer to an object you don't own. The code is basically identical except you remove the __dealloc__ and avoid making a copy in make_CyWrapper:
obj.ptr = &x # instead of new cpp_class(x)
This is unsafe - you need to ensure your C++ object outlives the Cython wrapper - but allows you to modify the object.
You could also imagine a few other options: you could take ownership of an existing object with your Cython wrapper (Such a scheme would have to pass by pointer rather than reference, or it could use move constructors); you could deconstruct your C++ class into a representation expressed in basic types and pass those to Python; you could use shared pointers to split the ownership; or you have a more elaborate way of marking your Cython wrapper as "invalid" once your held C++ instance is destructed.
What you do next depends on whether you're using Boost Python (for its convenient, callable wrapping of Python objects) or if you're making your own version. (I showed both possibilities in the previous answer.)
Assuming Boost Python, you need to do two things: tell it about the conversion, and make sure that it imports the module that your wrapper is defined in (if you don't do this you get exciting segmentation faults).
struct convert_to_PyWrapper {
    static PyObject* convert(const cpp_class& rhs) {
        // the const_cast here is a bit dodgy, but was needed to make it work
        return make_CyWrapper(const_cast<cpp_class&>(rhs));
    }
};

inline void setup_boost_python() {
    PyInit_your_module_name(); // named inityour_module_name in Python 2
    boost::python::to_python_converter<
        cpp_class,
        convert_to_PyWrapper>();
}
You need to make sure that your Python/Cython code calls "setup_boost_python" before attempting to use the callback (if you put it at module level it's done on import, which is ideal).
If you're following my "manual" scheme (avoiding the dependency on Boost Python) then you need to modify the call_obj Cython function that does the C++ to Cython type conversion.
cdef public void call_obj(obj, cpp_class& c):
    obj(make_CyWrapper(c))
You also need to ensure the wrapper Cython module is imported before use (otherwise you get segmentation faults). I did this in "py_object_wrapper.hpp" but providing it's done once somewhere you can place it where you like.
void operator()(cpp_class& a) {
    PyInit_your_module_name();
    if (held) {
        call_obj(held,a);
    }
}

Am I using ctypes correctly to pythonify this struct?

I'm trying to talk to this DLL using python's ctypes. Many of the functions take or return an HGRABBER type:
typedef struct HGRABBER_t__ { int unused; } HGRABBER_t;
#define HGRABBER HGRABBER_t*
(the full header file can be viewed here). Here's an example of a function prototype that returns an HGRABBER type:
HGRABBER __stdcall IC_CreateGrabber();
Here's my attempt at implementing this struct in python, and using it to call that function from the DLL:
import ctypes as C
class GrabberHandle(C.Structure):
    _fields_ = [('unused', C.c_int)]
dll = C.windll.LoadLibrary('tisgrabber_x64.dll')
dll.create_grabber = dll.IC_CreateGrabber
dll.create_grabber.argtypes = []
dll.create_grabber.restype = GrabberHandle
my_handle = dll.create_grabber()
This seems to work, but I'm worried that I'm doing this wrong. I'm not experienced with C, and I don't think I understand the typedef and #define statements which define the HGRABBER type. Am I calling IC_CreateGrabber correctly? Should I have defined GrabberHandle to be a pointer to a struct, instead of a struct?
Thanks for reading, please let me know if I can clarify my question somehow.
You're right that you actually want a POINTER to the Structure, not the Structure itself.
Translating the C into English, being very loose (in a way that would be dangerous if you were trying to learn C but is good enough for using ctypes):
The struct defines a type named struct HGRABBER_t__, as a structure with one int in it.
The typedef defines a type named HGRABBER_t, as a synonym for struct HGRABBER_t__.
The #define defines a type named HGRABBER as a pointer to HGRABBER_t.
So, your GrabberHandle is the equivalent of HGRABBER_t; the equivalent of HGRABBER is:
GrabberHandlePtr = C.POINTER(GrabberHandle)
So you want this:
dll.create_grabber.restype = GrabberHandlePtr
It may be hard to debug the difference. A C struct with nothing but an int in it looks identical to an int in memory. And on Win32, an int and a pointer are both 32-bit values. And an int named unused is likely to be filled with meaningless garbage, making it hard to distinguish it from a pointer you've accidentally treated as an int. So everything will look fine, until you segfault 30 lines later in your code and have no idea what's wrong. :)
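The difference between the structure and a pointer to it can be seen without the DLL. This is only a sketch: the handle here is built locally with ctypes.pointer() rather than returned by IC_CreateGrabber, and the value 42 is arbitrary.

```python
import ctypes


class GrabberHandle(ctypes.Structure):
    _fields_ = [("unused", ctypes.c_int)]


# Equivalent of the C type HGRABBER (a pointer to HGRABBER_t).
GrabberHandlePtr = ctypes.POINTER(GrabberHandle)

handle_struct = GrabberHandle(42)
handle_ptr = ctypes.pointer(handle_struct)  # an instance of GrabberHandlePtr

# The struct is as wide as an int; the pointer is as wide as void*.
struct_size = ctypes.sizeof(GrabberHandle)
pointer_size = ctypes.sizeof(GrabberHandlePtr)

# A default-constructed pointer is NULL and tests as False.
null_handle = GrabberHandlePtr()
```

With restype set to the pointer type, a NULL return from the DLL would also test as False, which gives a simple way to check whether a handle was actually created.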
This library does what you are trying to do: https://github.com/morefigs/py-ic-imaging-control :)
But to answer your question, the library uses the code:
from ctypes import *
import os
class GrabberHandle(Structure):
    pass
GrabberHandle._fields_ = [('unused', c_int)]
# set and check path
dll_path = os.path.join(os.path.expanduser('~'),
                        'Documents\\The Imaging Source Europe GmbH\\TIS Grabber DLL\\bin\\win32\\tisgrabber.dll')
with open(dll_path) as thefile:
    pass
# open DLL
_ic_grabber_dll = windll.LoadLibrary(dll_path)
# create grabber
create_grabber = _ic_grabber_dll.IC_CreateGrabber
create_grabber.restype = POINTER(GrabberHandle)
create_grabber.argtypes = None
# get handle
handle = create_grabber()
Edit: changed code to use a pointer to GrabberHandle as per abarnert's answer as this is correct. However, in this particular case I have found no practical difference (with the 32-bit DLL), probably because the GrabberHandle structure is so simple.

Import constants from .h file into python

I've been looking for a simple answer to this question, but it seems that I can't find one. I would prefer to stay away from any external libraries that aren't already included in Python 2.6/2.7.
I have 2 c header files that resemble the following:
//constants_a.h
const double constant1 = 2.25;
const double constant2 = -0.173;
const int constant3 = 13;
...
//constants_b.h
const double constant1 = 123.25;
const double constant2 = -0.12373;
const int constant3 = 14;
...
And I have a python class that I want to import these constants into:
#pythonclass.py
class MyObject(object):
    def __init__(self, mode):
        if mode == "a":
            # import from constants_a.h, like:
            # self.constant1 = constant1
            # self.constant2 = constant2
            pass
        elif mode == "b":
            # import from constants_b.h, like:
            # self.constant1 = constant1
            # self.constant2 = constant2
            pass
        ...
I have c code which uses the constants as well, and resembles this:
//computations.c
#include <stdio.h>
#include <math.h>
#include "constants_a.h"
// do some calculations, blah blah blah
How can I import the constants from the header file into the Python class?
The reason for the header files constants_a.h and constants_b.h is that I am using python to do most of the calculations using the constants, but at one point I need to use C to do more optimized calculations. At this point I am using ctypes to wrap the c code into Python. I want to keep the constants away from the code just in case I need to update or change them, and make my code much cleaner as well. I don't know if it helps to note I am also using NumPy, but other than that, no other non-standard Python extensions. I am also open to any suggestions regarding the design or architecture of this program.
In general, defining variables in C header file is poor style. The header file should only declare objects, leaving their definition for the appropriate ".c" source code file.
One thing you may want to do is to declare the library-global constants like extern const whatever_type_t foo; and define (or "implement") them (i.e. assigning values to them) somewhere in your C code (make sure you do this only once).
Anyway, let's ignore how you do it. Just suppose you've already defined the constants and made their symbols visible in your shared object file "libfoo.so". Let us suppose you want to access the symbol pi, defined as extern const double pi = 3.1415926; in libfoo, from your Python code.
Now you typically load your object file in Python using ctypes like this:
>>> import ctypes
>>> libfoo = ctypes.CDLL("path/to/libfoo.so")
But then you'll see ctypes thinks libfoo.pi is a function, not a symbol for constant data!
>>> libfoo.pi
<_FuncPtr object at 0x1c9c6d0>
To access its value, you have to do something rather awkward -- casting what ctypes thinks is a function back to a number.
>>> pi = ctypes.cast(libfoo.pi, ctypes.POINTER(ctypes.c_double))
>>> pi.contents.value
3.1415926
In C jargon, this vaguely corresponds to the following thing happening: You have a const double pi, but someone forces you to use it only via a function pointer:
typedef int (*view_anything_as_a_function_t)(void);
view_anything_as_a_function_t pi_view = &pi;
What do you do with the pointer pi_view in order to use the value of pi? You cast it back as a const double * and dereference it: *(const double *)(pi_view).
So this is all very awkward. Maybe I'm missing something, but I believe this is by design of the ctypes module: it's there chiefly for making foreign function calls, not for accessing "foreign" data. And exporting a pure data symbol in a loadable library is arguably rare.
And this will not work if the constants are only C macro definitions. There's in general no way you can access macro-defined data externally. They're macro-expanded at compile time, leaving no visible symbol in the generated library file, unless you also export their macro values in your C code.
I recommend using regular expressions (re module) to parse the information you want out of the files.
Building a full C parser would be huge, but if you only use the variables and the file is reasonably simple/predictable/under control, then what you need to write is straightforward.
Just watch out for 'gotcha' artifacts such as commented-out code!
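As a rough sketch of the regex approach, assuming the headers only contain single-line definitions of the form const double name = value; or const int name = value; (as in the question): the pattern and the function name parse_constants are illustrative, not a general C parser. Anchoring the pattern at the start of the line is what skips definitions that were commented out with //.

```python
import re

# Matches e.g. "const double constant1 = 2.25;" at the start of a line.
CONSTANT_RE = re.compile(
    r"^[ \t]*const[ \t]+(double|int)[ \t]+(\w+)[ \t]*=[ \t]*"
    r"([-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?)[ \t]*;",
    re.MULTILINE,
)


def parse_constants(header_text):
    """Return a dict mapping constant names to int or float values."""
    constants = {}
    for c_type, name, literal in CONSTANT_RE.findall(header_text):
        constants[name] = int(literal) if c_type == "int" else float(literal)
    return constants


header = """//constants_a.h
const double constant1 = 2.25;
const double constant2 = -0.173;
// const int constant3 = 99;
const int constant3 = 13;
"""
values = parse_constants(header)
```

Block comments (/* ... */) and definitions spread over several lines are not handled by this sketch and would need extra care.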
I would recommend using some kind of configuration file readable by both Python and C program, rather than storing constant values in headers. E.g. a simple csv, ini-file, or even your own simple format of 'key:value' pairs. And there will be no need to recompile the C program every time you'd like to change one of the values :)
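A minimal sketch of the ini-file variant, assuming Python 3's configparser module (named ConfigParser in Python 2) and a hypothetical layout with one section per mode. The text is inlined here to keep the example self-contained; in practice it would live in a file that the C program also reads.

```python
import configparser

# Hypothetical file contents: one section per mode from the question.
CONFIG_TEXT = """
[a]
constant1 = 2.25
constant2 = -0.173
constant3 = 13

[b]
constant1 = 123.25
constant2 = -0.12373
constant3 = 14
"""


def load_constants(text, mode):
    """Return the constants of one section, converted to float/int."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser[mode]
    return {
        "constant1": section.getfloat("constant1"),
        "constant2": section.getfloat("constant2"),
        "constant3": section.getint("constant3"),
    }


constants_b = load_constants(CONFIG_TEXT, "b")
```

In a class like MyObject from the question, the returned dict could be assigned to attributes inside __init__.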
I'd up-vote emilio, but I'm lacking rep!
Although you have requested to avoid other non-standard libraries, you may wish to take a look at Cython (Cython: C-Extensions for Python www.cython.org/), which offers the flexibility of Python coding and the raw speed of execution of C/C++-compiled code.
This way you can use regular Python for everything, but handle the expensive elements of code using its built-in C types. You can then convert your Python code into .c files too (or just wrap external C libraries themselves), which can then be compiled into a binary. I've achieved up to 10x speed-ups doing so for numerical routines. I also believe NumPy uses it.
