Shared library vs. opening a process performance - python

I have an existing Python code base (a Flask server). I need the server to perform a performance-critical operation, which I've decided to implement in C++. However, because the C++ part has other dependencies, my attempts with ctypes and Boost.Python failed (symbols not found, dependent libraries not found even after setting up the environment, and so on). I believe a suitable alternative would be to compile the C++ part into an executable (only a single function/procedure is required) and run it from Python using commands or subprocess, communicating through stdin/stdout, for example. My only worry is that this will slow the procedure down enough to matter, and since I can't build a shared object library and call its function from Python directly, I can't benchmark the difference.
When I compile the C++ code into an executable and run it with some sample data, the program takes ~5s to run. This does not account for opening the process from python, nor for passing data between the processes.
The question is: how big a speedup can one expect from using ctypes/Boost.Python with a shared object, compared to creating a new process to run the procedure? If the number is big enough, it would motivate me to solve the problems I encountered. Basically, I'm asking whether it's worth it.

If you're struggling to create bindings with Boost.Python, you can manually expose your API via C functions and call them via FFI.
Here's a simple example that briefly illustrates the idea. First, you create a shared library, adding some extra functions, which in the example I put into an extern "C" section. The extern "C" is necessary because otherwise the function names will be mangled, and their actual names will likely differ from the ones you declared:
#include <cstdint>
#include <cstdio>
#ifdef __GNUC__
#define EXPORT __attribute__ ((visibility("default")))
#else // __GNUC__
#error "Unsupported compiler"
#endif // __GNUC__
class data_processor {
public:
data_processor() = default;
void process_data(const std::uint8_t *data, std::size_t size) {
std::printf("processing %zu bytes of data at %p\n", size, data);
}
};
extern "C" {
EXPORT void *create_processor() {
return new data_processor();
}
EXPORT void free_processor(void *data) {
delete static_cast<data_processor *>(data);
}
EXPORT void process_data(void *object, const std::uint8_t *data, const std::uint32_t size) {
static_cast<data_processor *>(object)->process_data(data, size);
}
}
Then you create the function bindings in Python. As you can see, the function declarations are almost the same as in the cpp file above. I used built-in types only (void *, uint8_t and the like), but I believe FFI allows you to declare and use custom structs as well:
from cffi import FFI
mylib_api = FFI()
mylib_api.cdef("""
void *create_processor();
void free_processor(void *object);
void process_data(void *object, const uint8_t *data, uint32_t size);
""")
mylib = mylib_api.dlopen("mylib.so-file-location")
processor = mylib.create_processor()
try:
    buffer = b"buffer"
    mylib.process_data(processor, mylib_api.from_buffer("uint8_t[]", python_buffer=buffer), len(buffer))
finally:
    mylib.free_processor(processor)
And that's basically it.
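Because every handle returned by create_processor has to be handed back to free_processor, the try/finally pair above can be wrapped in a context manager. The following is only a sketch: the Processor class and the FakeLib stand-in are hypothetical names introduced here so the snippet runs without the compiled library. With the real library you would pass the object returned by dlopen and convert the buffer with from_buffer as shown above.

```python
class Processor:
    """Owns one native processor handle and frees it on exit (hypothetical helper)."""

    def __init__(self, lib):
        self._lib = lib
        self._handle = None

    def __enter__(self):
        self._handle = self._lib.create_processor()
        return self

    def __exit__(self, exc_type, exc, tb):
        # Always release the native object, even if processing raised.
        self._lib.free_processor(self._handle)
        self._handle = None
        return False  # do not swallow exceptions

    def process(self, data):
        self._lib.process_data(self._handle, data, len(data))


class FakeLib:
    """Stand-in for the object returned by dlopen, so the sketch runs anywhere."""

    def __init__(self):
        self.calls = []

    def create_processor(self):
        self.calls.append("create")
        return object()

    def free_processor(self, handle):
        self.calls.append("free")

    def process_data(self, handle, data, size):
        self.calls.append(("process", size))


lib = FakeLib()
with Processor(lib) as processor:
    processor.process(b"buffer")
print(lib.calls)
```

The design choice is simply that the free call lives in one place, so callers cannot forget it.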
In my opinion, inter-process communication should be the last resort, when nothing else works, since:
you need to put a lot of effort into implementing the details of your communication protocol, and even if you use something popular, there can be a lot of issues, especially on the C++ side;
inter-process communication is generally more expensive in terms of processor time.
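To put a rough number on the process-based alternative, the spawn and pipe costs can be measured directly and compared with the ~5 s the computation itself takes. This is a sketch under assumptions: the child here is the Python interpreter itself (sys.executable), used as a stand-in for the compiled C++ executable, and the helper names time_spawn and time_roundtrip are made up for this example. Interpreter startup is usually slower than starting a small native binary, so treat the figures as rough and machine-dependent.

```python
import subprocess
import sys
import time


def time_spawn(runs=5):
    """Average wall-clock seconds to start a trivial child process and wait for it."""
    start = time.perf_counter()
    for _ in range(runs):
        subprocess.run([sys.executable, "-c", "pass"], check=True)
    return (time.perf_counter() - start) / runs


def time_roundtrip(payload):
    """Seconds to pipe `payload` through a child that echoes stdin to stdout."""
    echo = "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())"
    start = time.perf_counter()
    result = subprocess.run(
        [sys.executable, "-c", echo],
        input=payload,
        stdout=subprocess.PIPE,
        check=True,
    )
    elapsed = time.perf_counter() - start
    assert result.stdout == payload  # the child returned exactly what it was sent
    return elapsed


spawn_seconds = time_spawn()
pipe_seconds = time_roundtrip(b"x" * 1_000_000)
print(f"spawn: {spawn_seconds:.4f}s, 1 MB round trip: {pipe_seconds:.4f}s")
```

If both numbers are small next to the run time of the computation, the per-call overhead of a separate process is unlikely to be what decides between the two approaches.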

Related

Pybind11 automatic type conversions memory management

I am using Pybind11 to make a Python C++ extension module, and have the following code:
#include <pybind11/pybind11.h>
#include <string.h>
namespace py = pybind11;
int getLength(char* arg) {
return strlen(arg);
}
PYBIND11_MODULE(example, m) {
m.def("getLength", &getLength, "Get length using strlen", py::arg("arg"));
}
From the type conversion documentation, when I call getLength from Python, Pybind11 converts Python's str type to C++ char*. I assume that the memory for the new char* argument is allocated on the heap. My question is: does Pybind11 deallocate this on return, or do I need to add delete[] arg; at the end of my function?
I know that if I were to change my function to accept py::str (Python's string type) and then manually convert it to a C++ char*, I would be responsible for (de)allocation. Does this hold for built-in conversions too?
Does Pybind11 handle memory management of the results of automatic conversions, or do I need to do that?
Though I couldn't find the answer in the docs, I link it anyway: https://pybind11.readthedocs.io/en/stable/advanced/cast/overview.html#list-of-all-builtin-conversions
In this case it will end up using the C-style string type caster, which in turn will instantiate a std::string here, and the memory will be cleaned up when this std::string's destructor is called (unless it's an optimized small string, which is allocated on the stack).
Although I am a little out of my depth here, it has to be a copy since python strings are (technically) immutable. But pybind11 also allows conversions to a std::string_view which does not make a copy.
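The immutability argument can be observed from pure Python. The sketch below uses ctypes only as an analogy (it is not pybind11's code path): handing a Python bytes object to mutable C storage requires a copy, and changing the copy leaves the original untouched.

```python
import ctypes

original = b"hello"

# create_string_buffer copies the bytes into a new, mutable, NUL-terminated C array.
buffer = ctypes.create_string_buffer(original)
buffer[0] = b"j"  # mutate the copy only

print(original, buffer.value, len(buffer))
```

The length is one more than the number of characters because of the trailing NUL byte.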

Ctypes. How to pass struct by reference?

I'm trying to write a Python wrapper for a C library using ctypes.
So far I have:
C.h
typedef struct
{
int erorrCode;
char * Key;
} A;
#ifdef __cplusplus
extern "C" {
#endif
EXPORT void __stdcall DestroyA(A &input);
#ifdef __cplusplus
}
#endif
C.cpp
EXPORT void __stdcall DestroyA(A &input)
{
delete []input.Key;
}
Python.py
import sys
import ctypes
class A(ctypes.Structure):
    _fields_ = [
        ("erorrCode", ctypes.c_int),
        ("Key", ctypes.c_char_p)]
try:
    libapi = ctypes.cdll.LoadLibrary('./lib.so')
except OSError:
    print("Unable to load RAPI library")
    sys.exit()
DestroyA = libapi.DestroyA
libapi.DestroyA.argtypes = [ctypes.POINTER(A)]
libapi.DestroyA.restype = None
a = A(1,b'random_string')
DestroyA(ctypes.byref(a)) #!!!here is segmentation fault
So, how can I fix the segmentation fault error?
Note: I'd rather not change the code on the C++ side if there is a way to fix this on the Python side.
Listing [Python.Docs]: ctypes - A foreign function library for Python.
You have here Undefined Behavior (UB).
Python has builtin memory management for its objects, including CTypes ones. So, every time an object is created (a PyObject, which is basically anything, including a Python int), Python invokes one of the malloc family of functions under the hood to allocate memory. Conversely, when the object is destroyed (manually or by GC), free is called.
What happened:
You created the object (behind the scenes, Python allocated some memory)
You then freed memory that Python allocated (DestroyA calls delete[] on input.Key), which is wrong, not to mention that you also crossed the .dll boundary
You need to call free only on pointers that you allocated. One such example: [SO]: python: ctypes, read POINTER(c_char) in python (@CristiFati's answer).
If you want to get rid of the object (and thus free the memory that it uses), let Python do it for you:
del a
Additional remarks:
You're using __stdcall functions with ctypes.CDLL. Again, that's UB (on 32bit). Use the "regular" calling convention (__cdecl)
You're passing a reference. That's C++ specific (although it's only a const ptr). To be C compatible, use:
EXPORT void destroyA(A *pInput);
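As a minimal sketch of the ownership rule above, here is the same structure handled entirely on the Python side. No native library is loaded; the point is only that both the struct and the key bytes are allocated by Python, so a C function receiving a pointer to them may read them but must never free them. The variable names seen_code and seen_key are illustrative.

```python
import ctypes


class A(ctypes.Structure):
    _fields_ = [
        ("erorrCode", ctypes.c_int),  # spelling kept from the C header
        ("Key", ctypes.c_char_p),
    ]


key = b"random_string"  # allocated and owned by Python
a = A(1, key)           # the struct memory is owned by Python too

ref = ctypes.pointer(a)  # a real A*; byref(a) is the lighter form for call arguments
seen_code = ref.contents.erorrCode
seen_key = ref.contents.Key
print(seen_code, seen_key)

# Releasing is Python's job: drop the references and let the interpreter free the memory.
del ref, a
```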

SWIG and Boost::variant

I'm in the middle of trying to wrap a c++ project into a python api using SWIG and I'm running into an issue with code that has the following format.
class A
{
//constructors and such.
};
class B
{
//constructors and such.
};
class C
{
//constructors and such.
};
typedef boost::variant<A,B,C> VariantType;
typedef std::vector<boost::variant<A,B,C>> VariantTypeList;
Classes A,B & C all come out in the python wrapper without a problem and seem to be usable. However when I try to add the following lines to the interface file
%template(VariantType) boost::variant<A,B,C>;
%template(VariantTypeList) std::vector<boost::variant<A,B,C>>;
I get an error that says
Boost\x64\include\boost\variant\variant.hpp(148): error : Syntax error in input(3).
So I went and looked at the error: it's a line containing a macro that is defined in another header file, specifically "boost/mpl/aux_/value_wknd.hpp". I added that header to the interface file with %include, and now SWIG.exe crashes with an error helpfully stating
Access Violation
So long story short is there a way to wrap a boost::variant template type? Unfortunately this template definition is baked into the core of our library and I can't change it now. Also if it matters I'm on the MSVC 2013 compiler.
If it isn't possible to wrap the template type directly is it possible to work around this? I'm reading through the SWIG documentation to see if there is some typemap magic that can be applied but I'm fairly new to SWIG in general.
You can do this. I thought for quite a while about what the neatest Python interface to boost::variant actually is. My conclusion was that 99% of the time a Python user shouldn't even realise there's a variant type being used - unions and variants are basically just somewhat constrained duck-typing for C++.
So my goals were this:
wherever possible benefit from existing typemaps - we don't want to have to write our own std::string, int, typemaps from scratch.
anywhere a C++ function takes a boost::variant we should transparently accept any of the types the variant can hold for that function argument.
anywhere a C++ function returns a boost::variant we should transparently return it as the type the variant was holding when we got it back into Python.
allow Python users to explicitly create a variant object, e.g. an empty one, but don't expect that to ever actually happen. (Maybe that would be useful for reference output arguments, but I've not gone that far in this currently).
I didn't do this, but it would be fairly simple to add visitors from where this interface currently stands using the directors feature of SWIG.
It's pretty fiddly to do all that without adding some machinery into things. I wrapped everything up in a reusable file, this is the final working version of my boost_variant.i:
%{
#include <boost/variant.hpp>
static PyObject *this_module = NULL;
%}
%init %{
// We need to "borrow" a reference to this for our typemaps to be able to look up the right functions
this_module = m; // borrow should be fine since we can only get called when our module is loaded right?
// Wouldn't it be nice if $module worked *anywhere*
%}
#define FE_0(...)
#define FE_1(action,a1) action(0,a1)
#define FE_2(action,a1,a2) action(0,a1); action(1,a2)
#define FE_3(action,a1,a2,a3) action(0,a1); action(1,a2); action(2,a3)
#define FE_4(action,a1,a2,a3,a4) action(0,a1); action(1,a2); action(2,a3); action(3,a4)
#define FE_5(action,a1,a2,a3,a4,a5) action(0,a1); action(1,a2); action(2,a3); action(3,a4); action(4,a5)
#define GET_MACRO(_1,_2,_3,_4,_5,NAME,...) NAME
%define FOR_EACH(action,...)
GET_MACRO(__VA_ARGS__, FE_5, FE_4, FE_3, FE_2, FE_1, FE_0)(action,__VA_ARGS__)
%enddef
#define in_helper(num,type) const type & convert_type ## num () { return boost::get<type>(*$self); }
#define constructor_helper(num,type) variant(const type&)
%define %boost_variant(Name, ...)
%rename(Name) boost::variant<__VA_ARGS__>;
namespace boost {
struct variant<__VA_ARGS__> {
variant();
variant(const boost::variant<__VA_ARGS__>&);
FOR_EACH(constructor_helper, __VA_ARGS__);
int which();
bool empty();
%extend {
FOR_EACH(in_helper, __VA_ARGS__);
}
};
}
%typemap(out) boost::variant<__VA_ARGS__> {
// Make our function output into a PyObject
PyObject *tmp = SWIG_NewPointerObj(&$1, $&1_descriptor, 0); // Python does not own this object...
// Pass that temporary PyObject into the helper function and get another PyObject back in exchange
const std::string func_name = "convert_type" + std::to_string($1.which());
$result = PyObject_CallMethod(tmp, func_name.c_str(), "");
Py_DECREF(tmp);
}
%typemap(in) const boost::variant<__VA_ARGS__>& (PyObject *tmp=NULL) {
// I don't much like having to "guess" the name of the make_variant we want to use here like this...
// But it's hard to support both -builtin and regular modes and generically find the right code.
PyObject *helper_func = PyObject_GetAttrString(this_module, "new_" #Name );
assert(helper_func);
// TODO: is O right, or should it be N?
tmp = PyObject_CallFunction(helper_func, "O", $input);
Py_DECREF(helper_func);
if (!tmp) SWIG_fail; // An exception is already pending
// TODO: if we cared, we could short-circuit things a lot for the case where our input really was a variant object
const int res = SWIG_ConvertPtr(tmp, (void**)&$1, $1_descriptor, 0);
if (!SWIG_IsOK(res)) {
SWIG_exception_fail(SWIG_ArgError(res), "Variant typemap failed, not sure if this can actually happen");
}
}
%typemap(freearg) const boost::variant<__VA_ARGS__>& %{
Py_DECREF(tmp$argnum);
%}
%enddef
This gives us a macro we can use in SWIG, %boost_variant. You can then use this in your interface file something like this:
%module test
%include "boost_variant.i"
%inline %{
struct A {};
struct B {};
%}
%include <std_string.i>
%boost_variant(TestVariant, A, B, std::string);
%inline %{
void idea(const boost::variant<A, B, std::string>&) {
}
boost::variant<A,B,std::string> make_me_a_thing() {
struct A a;
return a;
}
boost::variant<A,B,std::string> make_me_a_string() {
return "HELLO";
}
%}
Where the %boost_variant macro takes the first argument as a name for the type (much like %template would) and the remaining arguments as a list of all the types in the variant.
This is sufficient to allow us to run the following Python:
import test
a = test.A()
b = test.B()
test.idea(a)
test.idea(b)
print(test.make_me_a_thing())
print(test.make_me_a_string())
So how does that actually work?
We essentially duplicate SWIG's %template support here. (It's documented here as an option)
Most of the heavy lifting in my file is done using a FOR_EACH variadic macro. Largely that's the same as my previous answer on std::function, which was itself derived from several older Stack Overflow answers and adapted to work with SWIG's preprocessor.
Using the FOR_EACH macro we tell SWIG to wrap one constructor per type the variant can hold. This lets us explicitly construct variants from Python code, with two extra constructors added (the default constructor and the copy constructor).
By using constructors like this we can lean heavily on SWIG's overload resolution support. So given a Python object we can simply rely on SWIG to determine how to construct a variant from it, which saves us a bunch of extra work and uses the existing typemaps for each type within the variant.
The in typemap basically just delegates to the constructor, via a slightly convoluted route because it's surprisingly hard to find other functions in the same module programmatically. Once that delegation has happened we use the normal conversion of a function argument to just pass the temporary variant into the function as though it were what we were given.
We also synthesise a set of extra member functions, convert_typeN which internally just call boost::get<TYPE>(*this), where N and TYPE are the position of each type in the list of variant types.
Within the out typemap this then allows us to lookup a Python function, using which() to determine what the variant currently holds. We've then got largely SWIG generated code, using existing typemaps to make a given variant into a Python object of the underlying type. Again that saves us a lot of effort and makes everything plug and play.
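The lookup performed by the out typemap can be mimicked in a few lines of Python. This is only an illustration of the dispatch idea: FakeVariant is a hypothetical stand-in for the SWIG-generated proxy, and unwrap does in Python what the typemap does in C with PyObject_CallMethod.

```python
class FakeVariant:
    """Stand-in for the SWIG proxy of boost::variant<A, B, std::string> (hypothetical)."""

    def __init__(self, index, value):
        self._index = index
        self._value = value

    def which(self):
        # Position of the currently held type in the variant's type list.
        return self._index

    def _get(self, index):
        if index != self._index:
            raise TypeError("bad_get: variant holds a different type")
        return self._value

    # One accessor per type in the list, as the in_helper macro generates.
    def convert_type0(self):
        return self._get(0)

    def convert_type1(self):
        return self._get(1)

    def convert_type2(self):
        return self._get(2)


def unwrap(variant):
    """Build the accessor name from which() and call it, like the out typemap."""
    name = "convert_type" + str(variant.which())
    return getattr(variant, name)()


result = unwrap(FakeVariant(2, "HELLO"))
print(result)
```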
If you're decided on SWIG (which wasn't clear to me from your post as you said to be fairly new to SWIG, so I'm under the assumption that this is a new project), then stop reading and ignore this answer.
But in case the bindings technology to use isn't fixed yet and you only need to bind Python, no other languages, an alternative is to use cppyy (http://cppyy.org, and full disclaimer: I'm main author). With that, the boost::variant type is directly available in Python and then you can make it look/behave more Pythonistic by writing Python code rather than SWIG .i code.
Example (note that cppyy has wheels for Windows on PyPI but built with MSVC2017, not MSVC2013, so I'll keep that caveat as to whether MSVC2013 is modern enough to build the code as I haven't tried):
import cppyy
cppyy.include("boost/variant/variant.hpp")
cppyy.include("boost/variant/get.hpp")
cpp = cppyy.gbl
std = cpp.std
boost = cpp.boost
cppyy.cppdef("""
class A
{
//constructors and such.
};
class B
{
//constructors and such.
};
class C
{
//constructors and such.
};
""")
VariantType = boost.variant['A, B, C']
VariantTypeList = std.vector[VariantType]
v = VariantTypeList()
v.push_back(VariantType(cpp.A()))
print(v.back().which())
v.push_back(VariantType(cpp.B()))
print(v.back().which())
v.push_back(VariantType(cpp.C()))
print(v.back().which())
print(boost.get['A'](v[0]))
try:
    print(boost.get['B'](v[0]))
except Exception as e:
    print(e) # b/c of type-index mismatch above
print(boost.get['B'](v[1])) # now corrected
print(boost.get['C'](v[2]))
which produces the expected output of:
$ python variant.py
0
1
2
<cppyy.gbl.A object at 0x5053704>
Could not instantiate get<B>:
B& boost::get(boost::variant<A,B,C>& operand) =>
Exception: boost::bad_get: failed value get using boost::get (C++ exception)
<cppyy.gbl.B object at 0x505370c>
<cppyy.gbl.C object at 0x5053714>

Interoperability between C++ Boost Multiprecision and Python's mpmath

I have experience working with both Boost Multiprecision and with Python's mpmath, separately.
When it gets to making both communicate (for example to create Python extensions in C++), my attempts have always involved some sort of wasteful float-to-string and string-to-float conversion.
My question is: is it possible to make both communicate in a more performant (and elegant) way? And by that I mean, is there a way to directly have C++ Boost Multiprecision load from and export to a Python mpmath.mpf object in the same vein as C's mpp does via pybind11?
I have been searching for this for quite a bit. The only other similar question I found was about just exporting from Boost Multiprecision to Python (in general) using pybind11, not to a mpmath object directly. And in that question, the OP ended up using the same approach I am trying to avoid (that is, converting from/to strings when communicating from/to C++ and Python).
This only partially answers your question, because the direct answer is: no, it is not possible in a clean way without a wasteful conversion to string. mpmath is a pure Python library with no parts written in C or C++, so even if you try to skip the "wasteful conversion" by relying on some sort of binary compatibility, your code will be very fragile: it will break every time Python or mpmath internals change ever so slightly.
However, I needed exactly the same thing, so I settled for an automated conversion registered via boost::python, which checks and converts using strings. Inside Python you also create mpmath.mpf objects from strings, so it's very much the same, except that the code below is faster because it is written in C++.
So here's what works for me:
#include <boost/python.hpp>
#include <iomanip> // std::setprecision
#include <iostream>
#include <limits>
#include <sstream>
#include <boost/math/constants/constants.hpp>
#include <boost/multiprecision/cpp_bin_float.hpp>
namespace py = ::boost::python;
using Prec80 = boost::multiprecision::number<boost::multiprecision::cpp_bin_float<80>>;
template<typename ArbitraryReal>
struct ArbitraryReal_to_python {
static PyObject* convert(const ArbitraryReal& val){
std::stringstream ss{};
ss << std::setprecision(std::numeric_limits<ArbitraryReal>::digits10+1) << val;
py::object mpmath = py::import("mpmath");
mpmath.attr("mp").attr("dps")=int(std::numeric_limits<ArbitraryReal>::digits10+1);
py::object result = mpmath.attr("mpf")(ss.str());
return boost::python::incref( result.ptr() );
}
};
template<typename ArbitraryReal>
struct ArbitraryReal_from_python {
ArbitraryReal_from_python(){
boost::python::converter::registry::push_back(&convertible,&construct,boost::python::type_id<ArbitraryReal>());
}
static void* convertible(PyObject* obj_ptr){
// Accept whatever python is able to convert into float
// This works with mpmath numbers. However if you want to accept strings as numbers this checking code can be a little longer to verify if string is a valid number.
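PyFloat_AsDouble(obj_ptr); // only the success/failure of the conversion matters here
if (PyErr_Occurred() != nullptr) {
PyErr_Clear(); // not convertible: clear the pending error before rejecting
return nullptr;
}
return obj_ptr;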
}
static void construct(PyObject* obj_ptr, boost::python::converter::rvalue_from_python_stage1_data* data){
std::istringstream ss{ py::call_method<std::string>(obj_ptr, "__str__") };
void* storage=((boost::python::converter::rvalue_from_python_storage<ArbitraryReal>*)(data))->storage.bytes;
new (storage) ArbitraryReal;
ArbitraryReal* val=(ArbitraryReal*)storage;
ss >> *val;
data->convertible=storage;
}
};
struct Var
{
Prec80 value{"-71.23"};
Prec80 get() const { return value; };
void set(Prec80 val) { value = val; };
};
BOOST_PYTHON_MODULE(pysmall)
{
ArbitraryReal_from_python<Prec80>();
py::to_python_converter<Prec80,ArbitraryReal_to_python<Prec80>>();
py::class_<Var>("Var" )
.add_property("val", &Var::get, &Var::set);
}
Now you compile this code with this command:
g++ -O1 -g pysmall.cpp -o pysmall.so -std=gnu++17 -fPIC -shared -I/usr/include/python3.7m/ -lboost_python37 -lpython3.7m -Wl,-soname,"pysmall.so"
And here is an example python session:
In [1]: import pysmall
In [2]: a=pysmall.Var()
In [3]: a.val
Out[3]: mpf('-71.2299999999999999999999999999999999999999999999999999999999999999999999999999997072')
In [4]: a.val=123.12
In [5]: a.val
Out[5]: mpf('123.120000000000000000000000000000000000000000000000000000000000000000000000000000003')
The C++ code does not care whether mpmath is already imported in Python. If it is, it obtains the existing library handle; if it is not, it imports it.
If you find any room for improvement in this snippet please let me know!
Here's a couple of useful references when I was writing this:
https://misspent.wordpress.com/2009/09/27/how-to-write-boost-python-converters/
https://github.com/bluescarni/mppp/blob/master/include/mp%2B%2B/extra/pybind11.hpp (but I didn't want to use pybind11, just boost::python)
EDIT: I have now finished implementing this in YADE; it works with the EIGEN and CGAL libraries. The part concerning this question is in the file ToFromPythonConverter.hpp.
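The string round trip that the converters rely on can be tried out in plain Python. In this sketch the standard-library decimal module stands in for mpmath (an assumption made so the snippet has no third-party dependency; decimal is a decimal type, while mpf and cpp_bin_float are binary), and DIGITS mirrors the digits10 + 1 used in the C++ code for the 80-digit type.

```python
from decimal import Decimal, localcontext

DIGITS = 81  # mirrors digits10 + 1 for the 80-digit type in the C++ code

with localcontext() as ctx:
    ctx.prec = DIGITS
    value = Decimal(1) / Decimal(3)  # rounded to 81 significant digits

text = str(value)         # what would cross the language boundary
restored = Decimal(text)  # parsing a string is exact, independent of context precision

print(text)
print(restored == value)
```

It only shows the mechanism: a value written out with enough digits and parsed back on the other side compares equal to the original.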

Adding swig pythoncode to set thisown flag on Python object

I have a swigged C++ class container, MyContainer, holding objects of type MyObject, also a C++ class.
The following is the C++ header code (freemenot.h)
#ifndef freemenotH
#define freemenotH
#include <vector>
#include <string>
using std::string;
class MyObject
{
public:
MyObject(const string& lbl);
~MyObject();
string getLabel();
private:
string label;
};
class MyContainer
{
public:
MyContainer();
~MyContainer();
void addObject(MyObject* o);
MyObject* getObject(unsigned int t);
int getNrOfObjects();
private:
std::vector<MyObject*> mObjects;
};
#endif
and this is the source (freemenot.cpp)
#include "freemenot.h"
#include <iostream>
using namespace std;
/* MyObject source */
MyObject::MyObject(const string& lbl)
:
label(lbl)
{ cout<<"In object ctor"<<endl; }
MyObject::~MyObject() { cout<<"In object dtor"<<endl; }
string MyObject::getLabel() { return label; }
/* MyContainer source */
MyContainer::MyContainer() { cout<<"In container ctor"<<endl; }
MyContainer::~MyContainer()
{
cout<<"In container dtor"<<endl;
for(unsigned int i = 0; i < mObjects.size(); i++)
{
delete mObjects[i];
}
}
int MyContainer::getNrOfObjects() { return mObjects.size(); }
void MyContainer::addObject(MyObject* o) { mObjects.push_back(o); }
MyObject* MyContainer::getObject(unsigned int i) { return mObjects[i]; }
Observe that the objects are stored as RAW POINTERS in the vector. The class is designed this way, so the container is responsible for freeing the objects in its destructor, as is done in the destructor's for loop.
In C++ code like the following, an object o1 is added to the container c, which is returned to client code
MyContainer* getAContainerWithSomeObjects()
{
MyContainer* c = new MyContainer();
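MyObject* o1 = new MyObject("label"); // the only constructor takes a label
c->addObject(o1); // c is a pointer, and the method is named addObject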
return c;
}
The returned container owns its objects and is responsible for deallocating them when done. In C++, accessing the container's objects after the function above exits is fine.
Exposing the above classes to python, using Swig, will need an interface file. This interface file looks like this
%module freemenot
%{ #include "freemenot.h" %}
%include "std_string.i"
//Expose to Python
%include "freemenot.h"
And to generate a Python module, using CMake, the following CMake script was used.
cmake_minimum_required(VERSION 2.8)
project(freemenot)
find_package(SWIG REQUIRED)
include(UseSWIG)
find_package(PythonInterp)
find_package(PythonLibs)
get_filename_component(PYTHON_LIB_FOLDER ${PYTHON_LIBRARIES} DIRECTORY CACHE)
message("Python lib folder: " ${PYTHON_LIB_FOLDER})
message("Python include folder: " ${PYTHON_INCLUDE_DIRS})
message("Python libraries: " ${PYTHON_LIBRARIES})
set(PyModule "freemenot")
include_directories(
${PYTHON_INCLUDE_PATH}
${CMAKE_CURRENT_SOURCE_DIR}
)
link_directories( ${PYTHON_LIB_FOLDER})
set(CMAKE_MODULE_LINKER_FLAGS ${CMAKE_CURRENT_SOURCE_DIR}/${PyModule}.def)
set_source_files_properties(${PyModule}.i PROPERTIES CPLUSPLUS ON)
set_source_files_properties(${PyModule}.i PROPERTIES SWIG_FLAGS "-threads")
SWIG_ADD_LIBRARY(${PyModule}
MODULE LANGUAGE python
SOURCES ${PyModule}.i freemenot.cpp)
SWIG_LINK_LIBRARIES (${PyModule} ${PYTHON_LIB_FOLDER}/Python37_CG.lib )
# INSTALL PYTHON BINDINGS
# Get the python site packages directory by invoking python
execute_process(COMMAND python -c "import site; print(site.getsitepackages()[0])" OUTPUT_VARIABLE PYTHON_SITE_PACKAGES OUTPUT_STRIP_TRAILING_WHITESPACE)
message("PYTHON_SITE_PACKAGES = ${PYTHON_SITE_PACKAGES}")
install(
TARGETS _${PyModule}
DESTINATION ${PYTHON_SITE_PACKAGES})
install(
FILES ${CMAKE_CURRENT_BINARY_DIR}/${PyModule}.py
DESTINATION ${PYTHON_SITE_PACKAGES}
)
After generating the makefiles with CMake and compiling with Borland's bcc32 compiler, a Python module (freemenot) is generated and installed into a valid Python 3 site-packages folder.
Then, in Python, the following script can be used to illustrate the problem
import freemenot as fmn
def getContainer():
    c = fmn.MyContainer()
    o1 = fmn.MyObject("This is a label")
    o1.thisown = 0
    c.addObject(o1)
    return c
c = getContainer()
print (c.getNrOfObjects())
#if the thisown flag for objects in the getContainer function
#is equal to 1, the following call return an undefined object
#If the flag is equal to 0, the following call will return a valid object
a = c.getObject(0)
print (a.getLabel())
This Python code may look fine, but it doesn't work as expected. The problem is that when the function getContainer() returns, the memory for object o1 is freed unless the thisown flag is set to zero. Accessing the object after that point through the returned container will end in disaster. Observe that there is nothing wrong with this per se, as this is how Python's garbage collection works.
For the above use case, being able to set the Python object's thisown flag inside the addObject function would make the C++ objects usable in Python.
Requiring the user to set this flag is not a good solution.
One could also extend the Python class with an "addObject" function that modifies the thisown flag, thereby hiding this memory trick from the user.
Question is, how to get Swig to do this, without extending the class?
I'm looking for using a typemap, or perhaps %pythoncode, but I seem not able to find a good working example.
The above code is to be used by, and passed to, a C++ program that invokes the Python interpreter. The C++ program is responsible for managing the memory allocated in the Python function, even after Py_Finalize().
The above code can be downloaded from github https://github.com/TotteKarlsson/miniprojects
There are a number of different ways you could solve this problem, so I'll try and explain them each in turn, building on a few things along the way. Hopefully this is useful as a view into the options and innards of SWIG even if you only really need the first example.
Add Python code to modify thisown directly
The solution most like what you proposed relies on using SWIG's %pythonprepend directive to add some extra Python code. You can target it based on the C++ declaration of the overload you care about, e.g.:
%module freemenot
%{ #include "freemenot.h" %}
%include "std_string.i"
%pythonprepend MyContainer::addObject(MyObject*) %{
# mess with thisown
print('thisown was: %d' % args[0].thisown)
args[0].thisown = 0
%}
//Expose to Python
%include "freemenot.h"
Where the only notable quirk comes from the fact that the arguments are passed in using *args instead of named arguments, so we have to access it via position number.
There are several other places/methods to inject extra Python code (provided you're not using -builtin) in the SWIG Python documentation and monkey patching is always an option too.
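For the monkey-patching route mentioned above, a small decorator is enough. The sketch below uses plain Python stand-ins for the SWIG proxies so that it runs without the compiled module: the MyObject and MyContainer classes here are fakes, and disown_argument is a hypothetical helper name. With the real module the patch line would target fmn.MyContainer.addObject, and as noted it assumes -builtin is not in use.

```python
import functools


def disown_argument(method):
    """Wrap a method so the object passed as its first argument is disowned on success."""

    @functools.wraps(method)
    def wrapper(self, obj, *args, **kwargs):
        result = method(self, obj, *args, **kwargs)
        obj.thisown = 0  # the C++ container owns the object from now on
        return result

    return wrapper


# Stand-ins for the SWIG-generated proxy classes (assumptions for this sketch).
class MyObject:
    def __init__(self, label):
        self.label = label
        self.thisown = 1


class MyContainer:
    def __init__(self):
        self.objects = []

    def addObject(self, o):
        self.objects.append(o)


# The actual monkey patch: replace the method once, right after import.
MyContainer.addObject = disown_argument(MyContainer.addObject)

c = MyContainer()
o1 = MyObject("This is a label")
c.addObject(o1)
print(o1.thisown)
```

Unlike %pythonprepend, this version clears the flag only after the wrapped call has returned, so an object is not disowned if addObject raises.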
Use Python's C API to tweak thisown
The next possible option is to use a typemap that calls the Python C API to perform the equivalent functionality. In this instance I've matched on the argument type and argument name, which does mean the typemap would get applied to all functions that receive a MyObject * named o. (If that over-matches, the easiest solution is to make the names in the headers describe the intended semantics, which has the side benefit of making IDEs and documentation clearer.)
%module freemenot
%{ #include "freemenot.h" %}
%include "std_string.i"
%typemap(in) MyObject *o {
PyObject_SetAttrString($input, "thisown", PyInt_FromLong(0)); // As above, but C API
$typemap(in,MyObject*); // use the default typemap
}
//Expose to Python
%include "freemenot.h"
The most noteworthy point about this example other than the typemap matching is the use of $typemap here to 'paste' another typemap, specifically the default one for MyObject* into our own typemap. It's worth having a look inside the generated wrapper file at a before/after example of what this ends up looking like.
Use SWIG runtime to get at SwigPyObject struct's own member directly
Since we're already writing C++ instead of going via the setattr in the Python code we can adapt this typemap to use more of SWIG's internals and skip a round-trip from C to Python and back to C again.
Internally within SWIG there's a struct that contains the details of each instance, including the ownership, type etc.
We could just cast from PyObject* to SwigPyObject* ourselves directly, but that would require writing error handling/type checking (is this PyObject even a SWIG one?) ourselves and become dependent on the details of the various differing ways SWIG can produce Python interfaces. Instead there's a single function we can call which just handles all that for us, so we can write our typemap like this now:
%module freemenot
%{ #include "freemenot.h" %}
%include "std_string.i"
%typemap(in) MyObject *o {
// TODO: handle NULL pointer still
SWIG_Python_GetSwigThis($input)->own = 0; // Safely cast $input from PyObject* to SwigPyObject*
$typemap(in,MyObject*); // use the default typemap
}
//Expose to Python
%include "freemenot.h"
This is just an evolution of the previous answer really, but implemented purely in the SWIG C runtime.
Copy construct a new instance before adding
There are other ways to approach this kind of ownership problem. Firstly in this specific instance your MyContainer assumes it can always call delete on every instance it stores (and hence owns in these semantics).
The motivating example for this would be if we were also wrapping a function like this:
MyObject *getInstanceOfThing() {
static MyObject a;
return &a;
}
Which introduces a problem with our prior solutions - we set thisown to 0, but here it would already have been 0 and so we still can't legally call delete on the pointer when the container is released.
There's a simple way to deal with this that doesn't require knowing about SWIG proxy internals - assuming MyObject is copy constructible, you can simply make a new instance and be sure that no matter where it came from it's going to be legal for the container to delete it. We can do that by adapting our typemap a little:
%module freemenot
%{ #include "freemenot.h" %}
%include "std_string.i"
%typemap(in) MyObject *o {
$typemap(in,MyObject*); // use the default typemap as before
$1 = new $*1_type(*$1); // but afterwards call copy-ctor
}
//Expose to Python
%include "freemenot.h"
The point to note here is the use of several more SWIG features that let us know the type of the typemap inputs - $*1_type is the type of the typemap argument dereferenced once. We could have just written MyObject here, as that's what it resolves to but this lets you handle things like templates if your container is really a template, or re-use the typemap in other similar containers with %apply.
The thing to watch for now is leaks: if you had a C++ function that deliberately returned an instance without thisown set, on the assumption that the container would take ownership, that assumption no longer holds.
Give the container a chance to manage ownership
Finally, one of the other techniques I like using a lot isn't directly possible here as currently posed, but is worth mentioning for posterity. If you get the chance to store some additional data alongside each instance in the container, you can call Py_INCREF and retain a reference to the underlying PyObject* no matter where it came from. Provided you then get a callback at destruction time, you can call Py_DECREF there, so the Python runtime keeps the object alive for as long as the container does.
You can also do that even when it's not possible to keep a 1-1 MyObject*/PyObject* pairing alive, by keeping a shadow container alive somewhere also. That can be hard to do unless you're willing to add another object into the container, subclass it or can be very certain that the initial Python instance of the container will always live long enough.
You're looking for %newobject. Here's a small example:
%module test
%newobject create;
%delobject destroy;
%inline %{
#include <iostream>
struct Test
{
Test() { std::cout << "create" << std::endl; }
~Test() { std::cout << "destroy" << std::endl; }
};
Test* create() { return new Test; }
void destroy(Test* t) { delete t; }
%}
Use:
>>> import test
>>> t1 = test.create() # create a test object
create
>>> t2 = test.Test() # don't really need a create function :)
create
>>> t3 = test.create() # and another.
create
>>> test.destroy(t2) # explicitly destroy one
destroy
>>>
>>>
>>>
>>> ^Z # exit Python and the other two get destroyed.
destroy
destroy
I just wanted thisown to be set to zero in the constructor. I did it in two ways:
I added a one-line sed statement to my makefile that appends 'self.thisown = 0' to the end of my class's __init__() function.
Using %pythonappend. I found two caveats: (a) the %pythonappend statement has to be placed before the C++ class definition, and (b) C++ constructor overloads do not matter.
%pythonappend MyApp::MyApp() %{
self.thisown = 0
%}
%include <MyApp.hpp>
