I have a C++ library that manipulates (among other things) a list of wrappers that I have been working on converting to Python using pybind11. The rest of the library operates on a pointer to a list of pointers: std::list<Symbol*>*. The problem is that when attempting to autocast a Python list to this C++ list and then initializing a ParamMap, an object that holds the list on the C++ side, the pointers of the list get all messed up. Inspection in GDB reveals that the "next-object pointers" of all the objects are invalid, and this leads to segfaults when traversing the list.
There is no sign of the objects being deallocated on the C++ side, as neither the destructor of the list container (ParamMap) nor the destructors of the list objects (Symbol) are ever called. My guess is that Python is eagerly deleting objects that C++ is still using, but call policies like py::return_value_policy::reference and py::keep_alive haven't fixed the problem. What is going wrong here? Unfortunately, changing the list type on the C++ side is not an option, but I would really appreciate some help in making this work on the Python side. Thank you!
Here is some minimal reproduction code:
Symbol.hpp
#include <string>
class Symbol {
private:
    std::string val1;
    int val2;
public:
    Symbol(std::string con1, int con2) : val1(con1), val2(con2) {}
};
ParamMap.hpp
#include <list>
#include "Symbol.hpp"
class ParamMap {
private:
    std::list<Symbol*>* lst;
    int otherData;
public:
    ParamMap(std::list<Symbol*>* symbolList, int dat) : lst(symbolList), otherData(dat) {}
    std::list<Symbol*>* getSymbols() { return lst; }
    int getOtherData() { return otherData; }
};
Query.cpp
#include <iostream>
#include "ParamMap.hpp"
#include <pybind11/pybind11.h>
#include <pybind11/stl.h>
namespace py = pybind11;
void getSymbolListSize(ParamMap* map) {
    std::cout << "Entering query method" << std::endl;
    auto sz = map->getSymbols()->size(); // SEGFAULT OCCURS WHEN GETTING SIZE
    std::cout << "Got size successfully. Size = " << sz << std::endl;
}

PYBIND11_MODULE(list_test, handle) {
    handle.def("getSymbolListSize", &getSymbolListSize);
    py::class_<ParamMap>(handle, "ParamMap")
        .def(py::init<std::list<Symbol*>*, int>(), py::keep_alive<1, 2>())
        .def("getOtherData", &ParamMap::getOtherData)
        .def("getSymbols", &ParamMap::getSymbols);
    py::class_<Symbol>(handle, "Symbol")
        .def(py::init<std::string, int>());
}
test.py
import list_test as p
# Creating a list of some random symbols
symbol_list = []
symbol1 = p.Symbol("Hello", 1)
symbol_list.append(symbol1)
symbol2 = p.Symbol("World", 2)
symbol_list.append(symbol2)
# Creating a parammap and passing it the symbol list
pm = p.ParamMap(symbol_list, 71)
print("Symbol list and ParamMap init'd successfully")
# Here, calling Query.cpp's only method
sz = p.getSymbolListSize(pm)
print(sz)
I don't know a lot about how pybind11 works its magic, so I can't fully explain what is going on. However, my impression is that pybind11 builds a temporary std::list during argument conversion, even though your code only uses a pointer to the list, so the pointer your ParamMap stores ends up pointing at a list that no longer exists once the constructor call returns. If you consider this a pybind11 bug, you could post it as an issue on their GitHub page.
As per your code, doing something like this seems to work (although it's not very clean):
#include <list>
#include "Symbol.hpp"
class ParamMap {
private:
    std::list<Symbol*>* lst;
    int otherData;
public:
    ParamMap(std::list<Symbol*> *symbolList, int dat) : otherData(dat) {
        // Copy the elements out of the (temporary) list pybind11 passes in,
        // so that lst points at storage owned by the ParamMap itself.
        lst = new std::list<Symbol*>;
        for (auto s : *symbolList) {
            lst->push_back(s);
        }
    }
    ~ParamMap() {
        delete lst;
    }
    std::list<Symbol*>* getSymbols() { return lst; }
    int getOtherData() { return otherData; }
};
I don't know who's supposed to manage the lifetime of the pointed list, so you may want to remove the destructor in case someone else is supposed to deallocate the list.
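If modifying ParamMap itself is not desirable, another option (just a sketch of mine, not from the original answer) is to do the copy in the binding code instead, by constructing ParamMap through a factory lambda that builds a heap-allocated std::list from the converted elements. This assumes pybind11 2.2+ for factory-based py::init, and note that nothing ever deletes the heap-allocated list here either, so the ownership question above remains. It would replace the existing ParamMap binding in Query.cpp:

py::class_<ParamMap>(handle, "ParamMap")
    .def(py::init([](std::list<Symbol*> symbols, int dat) {
        // The caster-produced list only lives for the duration of this call,
        // so move it into a heap-allocated list that ParamMap can keep.
        auto *lst = new std::list<Symbol*>(std::move(symbols));
        return new ParamMap(lst, dat);
    }), py::keep_alive<1, 2>())  // keep the Python list (and its Symbols) alive with the ParamMap
    .def("getOtherData", &ParamMap::getOtherData)
    .def("getSymbols", &ParamMap::getSymbols);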
Related
I'm not the author, but there's a public software package I use that seems to be leaking memory (Github issue). I'm trying to figure out how to patch it to make it work correctly.
To narrow the problem down, there's a struct, call it xxx_t. First %extend is used to make a member of the struct available in Python:
%extend xxx_t {
char *surface;
}
Then there's a custom getter. What exactly it does here isn't important except that it uses new to create a char*.
%{
char* xxx_t_surface_get(xxx *n) {
char *s = new char [n->length + 1];
memcpy (s, n->surface, n->length);
s[n->length] = '\0';
return s;
}
%}
Currently the code has this line to handle garbage collection:
%newobject surface;
This does not seem to work as expected. %newobject xxx_t::surface; also doesn't work. If I replace it with %newobject xxx_t_surface_get; that doesn't work because the getter function is escaped (inside %{ ... %}).
What is the right way to tell SWIG about the char* so it gets freed?
Before getting started it's worth pointing out one thing: because you return char*, SWIG ends up using its normal string typemaps to produce a Python string.
Having said that let's understand what the code that currently gets generated looks like. We can start our investigation with the following SWIG interface definition to experiment with:
%module test
%inline %{
struct foobar {
};
%}
%extend foobar {
char *surface;
}
If we run something like this through SWIG we'll see a generated function which wraps your _surface_get code, something like this:
SWIGINTERN PyObject *_wrap_foobar_surface_get(PyObject *SWIGUNUSEDPARM(self), PyObject *args) {
  PyObject *resultobj = 0;
  foobar *arg1 = (foobar *) 0 ;
  void *argp1 = 0 ;
  int res1 = 0 ;
  PyObject * obj0 = 0 ;
  char *result = 0 ;

  if (!PyArg_ParseTuple(args,(char *)"O:foobar_surface_get",&obj0)) SWIG_fail;
  res1 = SWIG_ConvertPtr(obj0, &argp1,SWIGTYPE_p_foobar, 0 | 0 );
  if (!SWIG_IsOK(res1)) {
    SWIG_exception_fail(SWIG_ArgError(res1), "in method '" "foobar_surface_get" "', argument " "1"" of type '" "foobar *""'");
  }
  arg1 = reinterpret_cast< foobar * >(argp1);
  result = (char *)foobar_surface_get(arg1);
  resultobj = SWIG_FromCharPtr((const char *)result);
  /* result is never used again from here onwards */
  return resultobj;
fail:
  return NULL;
}
The thing to note here however is that the result of calling your getter is lost when this wrapper returns. That is to say that it isn't even tied to the lifespan of the Python string object that gets returned.
So there are several ways we could fix this:
One option would be to ensure that the generated wrapper calls delete[] on the result of calling your getter, after the SWIG_FromCharPtr has happened. This is exactly what %newobject does in this instance. (See below.)
Another alternative would be to keep the allocated buffer between calls, probably in some thread-local storage, and track its size to minimise allocations (a sketch of this appears further below).
Alternatively, we could use some kind of RAII-based object to own the temporary buffer and make sure it gets released. (We could even do something neat with operator void* if we wanted.)
If we change our interface to add %newobject like so:
%module test
%inline %{
struct foobar {
};
%}
%newobject surface;
%extend foobar {
char *surface;
}
Then we see that our generated code now looks like this:
// ....
result = (char *)foobar_surface_get(arg1);
resultobj = SWIG_FromCharPtr((const char *)result);
delete[] result;
We can see this in the real code from github too, so this isn't the bug that you're looking for.
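For completeness, the second option from the list above (reusing one buffer between calls) could look roughly like this. This is only a sketch of mine, assuming C++11 and using xxx_t as a stand-in for the real struct name; it works because the default char* typemap copies the characters into the Python string before the wrapper returns, so handing back a pointer into a buffer we still own is safe and nothing needs to be freed:

%{
#include <vector>

char *xxx_t_surface_get(xxx_t *n) {
    /* One buffer per thread, reused across calls; grows as needed. */
    static thread_local std::vector<char> buf;
    buf.assign(n->surface, n->surface + n->length);
    buf.push_back('\0');
    return buf.data();
}
%}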
Typically for C++ I'd lean towards the RAII option. And as it happens there's a neat way to do this from both a SWIG perspective and a C++ one: std::string. So we can fix your leak in a simple and clean way just by doing something like this:
%include <std_string.i> /* If you don't already have this... */
%extend xxx_t {
std::string surface;
}
%{
std::string xxx_t_surface_get(xxx *n) {
return std::string(n->surface, n->length);
}
%}
(You'll need to change the setter to match too though, unless you made it const so there is no setter)
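If there should be no setter at all, one way (a sketch of mine, not from the original answer) is to tell SWIG the attribute is read-only, so it only generates the getter and never looks for an xxx_t_surface_set function:

%include <std_string.i>

%immutable xxx_t::surface;   /* read-only: no setter is generated or required */
%extend xxx_t {
    std::string surface;
}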
The thing about the std::string approach, though, is that it still makes two allocations for the same output: first the std::string object makes one allocation, and then a second allocation occurs for the Python string object. That's all for data whose buffer already exists in C++ anyway. So whilst this change would be sufficient and correct to solve the leak, you can also go further and write a version that avoids the redundant copy:
%extend xxx_t {
PyObject *surface;
}
%{
PyObject *xxx_t_surface_get(xxx *n) {
return SWIG_FromCharPtrAndSize(n->surface, n->length);
}
%}
I am using pybind11 to implement bindings for my C++ project.
My problem is basically how to define a Python function in the interpreter
and call it from the C++ code.
The C++ interface passes data using a pointer (double*), and I don't know how to write the function on the Python side or how to convert it to a std::function so it can be evaluated:
// C++
//--------
double cpp_call( const std::array<double,N> &value, const std::function<double(double*)> &func)
{
return func(value.data());
}
// python binding with pybind11
// module definition...
...
m.def("py_call", &cpp_call);
//python interpreter
//-------------------
?
Please, could someone give some tip to me ?
You're most likely missing a couple of required headers to get this working: #include <pybind11/functional.h> (for the std::function support) and #include <pybind11/stl.h> (for the STL container support); neither header is included by default (to keep the core header smaller).
With those, your example almost works (it just needs a const added to the inner argument of the std::function, i.e. const std::function<double(const double *)> &func: the std::array is const and thus its .data() returns a const pointer).
Here's a full example showing this working:
#include <pybind11/pybind11.h>
#include <pybind11/functional.h>
#include <pybind11/stl.h>
double cpp_call(const std::array<double, 3> &values,
                const std::function<double(double *)> &func) {
    double ret = 0;
    for (auto d : values) ret += func(&d);
    return ret;
}

PYBIND11_MODULE(stack92, m) {
    m.def("sum", &cpp_call);
}
Python:
>>> import stack92
>>> def f(v): return v**.5
...
>>> print("1+2+3 =", stack92.sum([1, 4, 9], f))
1+2+3 = 6.0
According to the SWIG docs and the marvelous explanation in "SWIG in typemap works, but argout does not" by @Flexo, the argout typemap turns reference arguments into return values in Python.
I have a scenario in which I pass a dict, which is converted to an unordered_map in typemap(in) and then populated in the C++ lib. Stepping through the code, I can see that the mapping has changed after it returns from C++, so I wonder why there is no possibility to just convert the unordered_map back, in place, into the dict that was passed. Or is it possible by now and I'm just overlooking something?
Thanks!
I am a little confused as to what exactly you are asking, but my understanding is:
You have an "in" typemap to convert a Python dict to a C++ unordered_map for some function argument.
The function then modifies the unordered_map.
After completion of the function, you want the Python dict updated to the current unordered_map, and are somehow having trouble with this step.
Since you know how to convert a dict to an unordered_map, I assume you basically do know how to convert the unordered_map back to the dict using the Python C-API, but are somehow unsure into which SWIG typemap to put the code.
So, under these assumptions, I'll try to help:
"the argout typemap turns reference arguments into return values in Python". Not really, although it is mostly used for this purpose. An "argout" typemap simply supplies code to deal with some function argument (internally referred to as $1) that is inserted into the wrapper code after the C++ function is called. Compare this with an "in" typemap that supplies code to convert the supplied Python argument $input to a C++ argument $1, which is obviously inserted into the wrapper code before the C++ function is called.
The original passed Python dict can be referred to in the "argout" typemap as $input, and the modified unordered_map as $1 (see the SWIG docs linked above).
Therefore, all you need to do is write an "argout" typemap for the same argument signature as the "in" typemap that you already have, and insert the code (using the Python C-API) to update the contents of the Python dict ($input) to reflect the contents of the unordered_map ($1).
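For example, a rough sketch of such an in-place "argout" typemap, using the same std::unordered_map<std::string, std::string>* signature as the answer further below (Python 2 C-API to match that answer; adapt the signature to whatever your own "in" typemap uses):

%typemap(argout) std::unordered_map<std::string, std::string>* {
    /* Refill the dict that was passed in ($input) from the possibly
       modified C++ map ($1) instead of building a new return value. */
    PyDict_Clear($input);
    for (const auto& kv : *$1) {
        PyObject *v = PyString_FromString(kv.second.c_str());
        PyDict_SetItemString($input, kv.first.c_str(), v);
        Py_DECREF(v);  /* PyDict_SetItemString does not steal the reference */
    }
}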
Note that this is different from the classical use of "argout" typemaps, which would typically convert the $1 back to a new Python dict and append this to the Python return object, which you can refer to by $result.
I hope this helps. If you are still stuck at some point, please edit your question to make clear at which point you are having trouble.
I am well aware that the OP has already solved the issue, but here is a solution anyway. Some validation of the inputs could be added to guard against non-string values in the input dictionary.
Header file
// File: test.h
#pragma once
#include <iostream>
#include <string>
#include <unordered_map>
void method(std::unordered_map<std::string, std::string>* inout) {
    for (const auto& n : (*inout)) {
        std::cout << "Key:[" << n.first << "] Value:[" << n.second << "]\n";
    }
    (*inout)["BLACK"] = "#000000";
}
Interface file
// File : dictmap.i
%module dictmap
%{
#include "test.h"
%}
%include "typemaps.i"
%typemap(in) std::unordered_map<std::string, std::string>* (std::unordered_map<std::string, std::string> temp) {
    PyObject *key, *value;
    Py_ssize_t pos = 0;
    $1 = &temp;
    temp = std::unordered_map<std::string, std::string>();
    while (PyDict_Next($input, &pos, &key, &value)) {
        (*$1)[PyString_AsString(key)] = std::string(PyString_AsString(value));
    }
}
%typemap(argout) std::unordered_map<std::string, std::string>* {
    $result = PyDict_New();
    for (const auto& n : *$1) {
        PyObject *v = PyString_FromString(n.second.c_str());
        PyDict_SetItemString($result, n.first.c_str(), v);
        Py_DECREF(v);  /* PyDict_SetItemString does not steal the reference */
    }
}
%include "test.h"
Test
import dictmap
out = dictmap.method({'WHITE' : '#FFFFFF'})
Output is an updated dictionary
In [2]: out
Out[2]: {'BLACK': '#000000', 'WHITE': '#FFFFFF'}
I want to know the names of the network interfaces from Python, but it doesn't seem to be possible in pure Python, so I'm using this C code:
#include <Python.h>
#include <windows.h>
#include <iphlpapi.h>
#include <stdio.h>
#pragma comment(lib, "iphlpapi.lib")
PyObject* GetInterfaces(PyObject* self){
    ULONG buflen = sizeof(IP_ADAPTER_INFO);
    IP_ADAPTER_INFO *pAdapterInfo = (IP_ADAPTER_INFO *)malloc(buflen);
    if (GetAdaptersInfo(pAdapterInfo, &buflen) == ERROR_BUFFER_OVERFLOW) {
        free(pAdapterInfo);
        pAdapterInfo = (IP_ADAPTER_INFO *)malloc(buflen);
    }
    if (GetAdaptersInfo(pAdapterInfo, &buflen) == NO_ERROR) {
        for (IP_ADAPTER_INFO *pAdapter = pAdapterInfo; pAdapter; pAdapter = pAdapter->Next) {
            printf("%s (%s)\n", pAdapter->IpAddressList.IpAddress.String, pAdapter->Description);
        }
    }
    if (pAdapterInfo) free(pAdapterInfo);
    return 0;
}

static char interfaces_docs[] =
    "GetInterfaces( ): prints the interfaces name and IP\n";

static PyMethodDef interfaces_funcs[] = {
    {"GetInterfaces", (PyCFunction)GetInterfaces,
     METH_NOARGS, interfaces_docs},
    {NULL}
};

void initinterfaces(void)
{
    Py_InitModule3("interfaces", interfaces_funcs,
                   "Interfaces Module");
}
Is this good? And what are the steps to importing it into Python with ctypes? How can I do it? Also is there a way to return a list of tuples instead of printing it? Do I need to compile it? If I do how can I?
Is this good?
Almost. Never return 0/null as a PyObject* unless you're signaling an exception; instead incref Py_None and return it. And you may want to add actual error checking code as well.
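For instance, the end of GetInterfaces could look like this instead of return 0 (a minimal sketch):

    /* Success: hand back None rather than a null pointer. */
    Py_INCREF(Py_None);
    return Py_None;
    /* (Py_RETURN_NONE; is the usual shorthand for these two lines.) */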
And what are the steps to importing it into Python with ctypes? How can I do it?
What you've written doesn't need ctypes, since it's an actual Python module written in C. Use import interfaces after compiling the code to interfaces.pyd.
Also is there a way to return a list of tuples instead of printing it? Do I need to compile it? If I do how can I?
Yes: use the normal list and tuple C-API functions; create the objects and set or append each element in turn as required. You will need to compile it (as a C extension producing interfaces.pyd), since it uses the Python C-API rather than ctypes.
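As a rough sketch (reusing the allocation pattern from the question's code, with only minimal error handling), GetInterfaces could collect (description, ip) tuples into a list instead of printing them:

PyObject* GetInterfaces(PyObject* self)
{
    ULONG buflen = sizeof(IP_ADAPTER_INFO);
    IP_ADAPTER_INFO *pAdapterInfo = (IP_ADAPTER_INFO *)malloc(buflen);
    if (GetAdaptersInfo(pAdapterInfo, &buflen) == ERROR_BUFFER_OVERFLOW) {
        free(pAdapterInfo);
        pAdapterInfo = (IP_ADAPTER_INFO *)malloc(buflen);
    }

    PyObject *result = PyList_New(0);   /* the list we will return */
    if (!result) {
        free(pAdapterInfo);
        return NULL;
    }

    if (GetAdaptersInfo(pAdapterInfo, &buflen) == NO_ERROR) {
        for (IP_ADAPTER_INFO *pAdapter = pAdapterInfo; pAdapter; pAdapter = pAdapter->Next) {
            /* Build a (description, ip) tuple and append it to the list. */
            PyObject *item = Py_BuildValue("(ss)",
                                           pAdapter->Description,
                                           pAdapter->IpAddressList.IpAddress.String);
            if (!item) {
                Py_DECREF(result);
                free(pAdapterInfo);
                return NULL;
            }
            PyList_Append(result, item);   /* append does not steal the reference */
            Py_DECREF(item);
        }
    }
    free(pAdapterInfo);
    return result;
}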
I am using MSVC++ and Python 2.7. I have a DLL that returns a std::wstring. I am trying to wrap it in such a way that it is exposed as a C-style string for calls from Python via ctypes. I obviously do not understand something about how strings are handled between the two. I have simplified this into a minimal example to understand the passing mechanism. Here is what I have:
C++
#include <iostream>
class WideStringClass{
public:
    const wchar_t * testString;
};

extern "C" __declspec(dllexport) WideStringClass* WideStringTest()
{
    std::wstring testString = L"testString";
    WideStringClass* f = new WideStringClass();
    f->testString = testString.c_str();
    return f;
}
Python:
from ctypes import *
lib = cdll.LoadLibrary('./myTest.dll')
class WideStringTestResult(Structure):
    _fields_ = [("testString", c_wchar_p)]
lib.WideStringTest.restype = POINTER(WideStringTestResult)
wst = lib.WideStringTest()
print wst.contents.testString
And, the output:
????????????????????᐀㻔
What am I missing?
Edit:
Changing the C++ to the following solves the problem. Of course, I think I now have a memory leak. But, that can be solved.
#include <iostream>
class WideStringClass{
public:
    std::wstring testString;
    void setTestString()
    {
        this->testString = L"testString";
    }
};

class Wide_t_StringClass{
public:
    const wchar_t * testString;
};

extern "C" __declspec(dllexport) Wide_t_StringClass* WideStringTest()
{
    Wide_t_StringClass* wtsc = new Wide_t_StringClass();
    WideStringClass* wsc = new WideStringClass();
    wsc->setTestString();
    wtsc->testString = wsc->testString.c_str();
    return wtsc;
}
Thanks.
There is a big issue that is not related to Python:
f->testString = testString.c_str();
This is not correct, since testString (the std::wstring you declared) is a local variable; as soon as that function returns, testString is gone, which invalidates any attempt to use what c_str() returned.
So how do you fix this? I'm not a Python programmer, but the way character data is usually marshalled between two different languages is to copy the characters into a buffer created on either the receiver's side or the sender's side (preferably the former).
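For example, one common pattern (only a sketch; the function name and signature here are made up) is to have the DLL copy the characters into a buffer supplied by the caller and report how large the buffer needs to be:

#include <string>

// Copies the wide string into a caller-supplied buffer (if any) and returns
// the number of characters required, including the terminating null, so the
// caller can retry with a larger buffer when needed.
extern "C" __declspec(dllexport) int GetTestString(wchar_t *buffer, int bufferSize)
{
    std::wstring testString = L"testString";
    if (buffer != NULL && bufferSize > 0) {
        size_t n = testString.copy(buffer, (size_t)bufferSize - 1);
        buffer[n] = L'\0';
    }
    return (int)testString.size() + 1;
}

On the Python side, the caller would allocate the buffer with ctypes.create_unicode_buffer(size) and pass it in, which avoids any dangling pointers entirely.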