Calling C++ code via embedded Python - python

I have successfully created a Python module that appears to work in isolation, but doesn't affect the program that is running it.
I have the following module:
BOOST_PYTHON_MODULE(mandala)
{
    class_<state_mgr_t, state_mgr_t*, noncopyable>("states", no_init)
        .def("push", &state_mgr_t::push)
        .def("pop", &state_mgr_t::pop)
        .def("count", &state_mgr_t::count);

    scope().attr("states") = boost::python::object(boost::python::ptr(&states));
}
The states object references a global variable, states:
extern state_mgr_t states;
I can run the following script lines from within my program:
from mandala import states
states.count()
> 0
All of that is fine and whatnot, but I was expecting that running this Python script would affect the actual state of the program that is running it. It appears as though Python is actually just dealing with its own instance of states and not affecting the parent program.
Now I'm wondering if I've completely misunderstood what Boost.Python is capable of; I was expecting something similar to Lua, where I could modify the C++ program via scripts.
Is this not possible? Or am I doing something very wrong?
Thank you in advance!

If you are embedding Python into your C++ program, it should be possible to access the instance from your script. Obviously, I don't have your full code, but have you tried something like this?
// The module must be registered before the interpreter is initialized.
PyImport_AppendInittab("mandala", &initmandala);
Py_Initialize();

try {
    object main_module = import("__main__");
    object main_namespace = main_module.attr("__dict__");
    main_namespace["states"] = ptr(&states);  // inject the C++ global into __main__
    object ignored = exec("print states.count()", main_namespace);
} catch(const error_already_set&) {
    PyErr_Print();
}

Related

Call a C# DLL function in Python 3.6

I have this part of a C# DLL from a larger project. It has no dependencies and runs in either .NET 5.0 or .NET 6.0 without anything else in it. It looks something like this:
namespace Some.Namespace {
    public static class Rename {
        public static string RenameString(string stringToRename) {
            // Do some stuff to it
            return stringToRename;
        }
    }
}
However, no matter what way we attempt to add the reference to Python 3.6 (ArcGIS Pro Distribution) using Python.NET...
sys.path.append(os.path.join(os.getcwd(),"Project_Folder","bin"))
clr.AddReference("SomeDLL")
from Some.Namespace import Rename
It throws this error on the 3rd line.
Exception has occurred: ModuleNotFoundError No module named Some
We've tried just about every possible means of loading the DLL at this point and none of them have worked (yes, I know that 2.5.x Python.NET doesn't support .NET 6.0; we switched to 5.0 and it didn't work either). Exposing the function through Robert's DllExport and calling it via ctypes throws a CLR exception whenever the function is run, which we can't debug because it's cross-environment.
Here's that attempt.
// The DLL with Robert's DllExport
namespace Some.Namespace {
    public static class Rename {
        // We've also tried CallingConvention.Cdecl
        [DllExport("Rename", System.Runtime.InteropServices.CallingConvention.StdCall)]
        public static string RenameString(string stringToRename) {
            // Do some stuff to it
            return stringToRename;
        }
    }
}
# In Python
# We've tried WinDLL
dll_utils = CDLL(os.path.join(os.getcwd(), "project_folder", "bin", "SomeDLL.dll"))
# We've tried using CFUNCTYPE here too
encrypt_proto = ctypes.WINFUNCTYPE(ctypes.c_void_p, ctypes.c_void_p)
encrypt_args = (1, "p1", 0), (1, "p2", 0),
# It fails here when we add the encrypt_args flags to it, saying they don't match up,
# but there's no way to figure out how many it wants; we've tried many variations of
# the above
encrypt_call = encrypt_proto(("Rename", dll_utils), encrypt_args)
# So because of that, on one attempt we removed these two lines
p2 = ctypes.c_char_p(text_to_obfuscate)
p1 = ctypes.c_char_p(text_to_obfuscate)
# And the second argument here
return_value = encrypt_call(p1, p2)
#And...
OSError: [WinError -532462766] Windows Error 0xe0434352
Attempting to write a second function in the DLL that converts the C# string to a byte array and back, and using a ctypes pointer, doesn't work either; it throws the same error as above when it's called. Someone suggested in another question here trying IronPython next, which we want to avoid, or trying comtypes, which I've never used before. At one point we even tried a COM decoration, and it claimed that something I've seen in an answer here is obsolete. I've researched this for 2-3 days and haven't been able to solve it.
At this point I'm at a loss. Is there any effective, simple way to call a function in an external .NET C# DLL and pass it a string from Python, without installing any major dependencies? Just out of the box? There's gotta be something simple I'm missing.

Fatal Python Error: GC object already tracked - but which object is it? [duplicate]

My Python code has been crashing with the error 'GC object already tracked'. I'm trying to figure out the best approach to debug these crashes.
OS: Linux.
Is there a proper way to debug this issue?
There were a couple of suggestions in the following article:
Python memory debugging with GDB
Not sure which approach worked for the author.
Is there a way to generate memory dumps in such a scenario that could then be analyzed, like in the Windows world?
I found an article on this, but it doesn't entirely answer my question:
http://pfigue.github.io/blog/2012/12/28/where-is-my-core-dump-archlinux/
I found out the reason for this issue in my scenario (not necessarily the only reason for the 'GC object already tracked' crash).
I used GDB and core dumps to debug this issue.
I have Python and C extension code (in a shared object).
The Python code registers a callback routine with the C extension code.
In a certain workflow, a thread from the C extension code was calling the registered callback routine in the Python code.
This usually worked fine, but when multiple threads did the same thing concurrently, it resulted in the crash with 'GC object already tracked'.
Synchronizing access to the Python objects across the multiple threads does resolve this issue.
Thanks to anyone who responded to this.
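For illustration only (this is not the original code), such synchronization usually means taking the GIL around every call into Python from an extension thread. A minimal sketch, where registered_cb and the int argument are placeholders for whatever the real registration API stores and passes:
// Minimal sketch, not the original code: each extension thread takes the GIL
// before touching any Python object. 'registered_cb' is assumed to be stored
// (with Py_INCREF) when the Python side registers its callback.
#include <Python.h>

static PyObject *registered_cb;   // set by the registration entry point

static void invoke_callback_from_worker_thread(int event_code)
{
    PyGILState_STATE gstate = PyGILState_Ensure();   // serializes access to the interpreter

    PyObject *result = PyObject_CallFunction(registered_cb, "i", event_code);
    if (result == NULL)
        PyErr_Print();            // don't let a Python exception escape the C thread
    Py_XDECREF(result);

    PyGILState_Release(gstate);
}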
I ran into this problem using boost::python when our C++ code would trigger a Python callback. I would occasionally get "GC object already tracked" and the program would terminate.
I was able to attach GDB to the process prior to triggering the error. One interesting thing: in the Python code we were wrapping the callback with a functools partial, which was actually masking where the real error was occurring. After replacing the partial with a simple callable wrapper class, the "GC object already tracked" error no longer popped up; instead I was now just getting a segfault.
In our boost::python wrapper, we had lambda functions to handle a C++ callback, and the lambda captured the boost::python::object callback function. It turned out that, for whatever reason, the destructor of the lambda wasn't always properly acquiring the GIL when destroying the boost::python::object, which was causing the segfault.
The fix was to not use a lambda function, but instead create a functor that makes sure to acquire the GIL in its destructor prior to calling Py_DECREF() on the boost::python::object.
class callback_wrapper
{
public:
    callback_wrapper(object cb): _cb(cb), _destroyed(false) {
    }

    callback_wrapper(const callback_wrapper& other) {
        _destroyed = other._destroyed;
        Py_INCREF(other._cb.ptr());
        _cb = other._cb;
    }

    ~callback_wrapper() {
        std::lock_guard<std::recursive_mutex> guard(_mutex);
        // Hold the GIL while releasing the Python callback.
        PyGILState_STATE state = PyGILState_Ensure();
        Py_DECREF(_cb.ptr());
        PyGILState_Release(state);
        _destroyed = true;
    }

    void operator ()(topic_ptr topic) {
        std::lock_guard<std::recursive_mutex> guard(_mutex);
        if(_destroyed) {
            return;
        }
        // Hold the GIL while invoking the Python callback.
        PyGILState_STATE state = PyGILState_Ensure();
        try {
            _cb(topic);
        }
        catch(const error_already_set&) { PyErr_Print(); }
        PyGILState_Release(state);
    }

    object _cb;
    std::recursive_mutex _mutex;
    bool _destroyed;
};
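For context, a sketch (with placeholder names, not the actual registration API) of how the wrapper replaces the capturing lambda on the C++ side:
// Hypothetical wiring: 'topic_ptr' comes from the wrapper above; make_listener()
// and the std::function listener type are placeholders for the real callback plumbing.
#include <functional>

std::function<void(topic_ptr)> make_listener(boost::python::object py_cb)
{
    // Previously: return [py_cb](topic_ptr t) { py_cb(t); };  // destructor may run without the GIL
    return callback_wrapper(py_cb);   // destructor now takes the GIL before Py_DECREF
}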
The problem is that you try to add an object to Python's cyclic garbage collector tracking twice.
Check out this bug, specifically:
The documentation for supporting cyclic garbage collection in Python
My documentation patch for the issue and
My explanation in the bug report itself
Long story short: if you set Py_TPFLAGS_HAVE_GC and you are using Python's built-in memory allocation (standard tp_alloc/tp_free), you don't ever have to manually call PyObject_GC_Track() or PyObject_GC_UnTrack(). Python handles it all behind your back.
Unfortunately, this is not documented very well, at the moment. Once you've fixed the issue, feel free to chime in on the bug report (linked above) about better documentation of this behavior.
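To make the failure mode concrete, here is a minimal repro of that double tracking (my own example, not from the bug report); on CPython versions that emit this fatal error (e.g. 2.7), it aborts immediately:
// PyList_New() returns an object the interpreter has already registered with
// the cyclic GC, so tracking it a second time triggers
// "Fatal Python error: GC object already tracked".
#include <Python.h>

int main()
{
    Py_Initialize();
    PyObject *lst = PyList_New(0);   // already tracked on creation
    PyObject_GC_Track(lst);          // second track -> fatal error, process aborts here
    Py_DECREF(lst);                  // never reached
    Py_Finalize();
    return 0;
}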

Pointer function argument corrupted, Stack corruption?

I'm building a C++ library which gets called by Python via ctypes. The function takes a pointer to a struct as a parameter and passes it on to another (internal) function. It looks like this:
extern "C" __declspec(dllexport) void USBEndpoint_subscribe(USBEndpoint* endpoint, CALLBACK callback) {
try {
subscribe_internal(endpoint, callback);
} catch (int e_code) {
exception_report(__FILE__, __LINE__, e_code); \
} catch (...) {
exception_report(__FILE__, __LINE__);
}
void subscribe_internal(USBEndpoint* endpoint, CALLBACK callback)
{
...
}
The thing is that the value for endpoint received at USBEndpoint_subscribe is not the same value that enters subscribe_internal. For example:
While debugging I can see that USBEndpoint_subscribe receives:
endpoint=0x00000000088f0680 (and callback=0x0000000000540f50)
and as soon as I step into subscribe_internal, inside subscribe_internal the values are:
endpoint=0x000007fef15f7740 (and callback=0x0000000000000000).
I have not spawned other threads and I'm compiling the library in release mode**. My only suspect is stack corruption, but actually I have no clue what's going on.
Any hints of what can be happening are very welcome.
** I have compiled Python in debug mode, but it's slow to set up with Spyder and I lose the interactive console if I use my debug-compiled version, so I prefer sticking with release Python and compiling my library in release.
EDIT:
As suggested, I used my debug version of Python with a debug build of the library. But I still got the same behavior (endpoint: 0x0000000002be80e0->0x0000000000000000, callback: 0x0000000000320f88->0x00000000024d24b8). No other useful information appeared here that wasn't in release mode.
I was loading the library using CDLL("path"), which means ctypes calls it with the cdecl convention. And even though I'm compiling in VS2008 with /Gd, it seems another convention was being used. Appending __cdecl to the function declaration solves the issue.
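For reference, the fixed declaration with the convention spelled out would look roughly like this (same signature as above):
// __cdecl made explicit so the export matches what ctypes.CDLL expects,
// regardless of the project's /Gd or /Gz setting.
extern "C" __declspec(dllexport) void __cdecl USBEndpoint_subscribe(USBEndpoint* endpoint, CALLBACK callback);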

static openCL class not properly released in python module using boost.python

EDIT: Ok, all the edits made the layout of the question a bit confusing so I will try to rewrite the question (not changing the content, but improving its structure).
The issue in short
I have an OpenCL program that works fine if I compile it as an executable. Now I am trying to make it callable from Python using boost.python. However, as soon as I exit Python (after importing my module), Python crashes.
The reason seems to have something to do with statically storing CommandQueues for GPU devices only, and with their release mechanism when the program terminates.
MWE and setup
Setup
IDE used: Visual Studio 2015
OS used: Windows 7 64bit
Python version: 3.5
AMD OpenCL APP 3.0 headers
cl2.hpp directly from Khronos as suggested here: empty openCL program throws deprecation warning
Also I have an Intel CPU with integrated graphics hardware and no other dedicated graphics card
I use version 1.60 of the boost library compiled as 64-bit versions
The boost dll I use is called: boost_python-vc140-mt-1_60.dll
The openCL program without python works fine
The python module without openCL works fine
MWE
#include <vector>

#define CL_HPP_ENABLE_EXCEPTIONS
#define CL_HPP_TARGET_OPENCL_VERSION 200
#define CL_HPP_MINIMUM_OPENCL_VERSION 200 // I have the same issue for 100 and 110
#include "cl2.hpp"

#include <boost/python.hpp>

using namespace std;

class TestClass
{
private:
    std::vector<cl::CommandQueue> queues;
    TestClass();

public:
    static const TestClass& getInstance()
    {
        static TestClass instance;
        return instance;
    }
};

TestClass::TestClass()
{
    std::vector<cl::Device> devices;
    vector<cl::Platform> platforms;
    cl::Platform::get(&platforms);

    // remove non 2.0 platforms (as suggested by doqtor)
    platforms.erase(
        std::remove_if(platforms.begin(), platforms.end(),
            [](const cl::Platform& platform)
            {
                int v = cl::detail::getPlatformVersion(platform());
                short version_major = v >> 16;
                return !(version_major >= 2);
            }),
        platforms.end());

    // Get all available GPUs
    for (const cl::Platform& pl : platforms)
    {
        vector<cl::Device> plDevices;
        try {
            pl.getDevices(CL_DEVICE_TYPE_GPU, &plDevices);
        }
        catch (cl::Error&)
        {
            // Doesn't matter. No GPU is available on the current machine for
            // this platform. Just check afterwards, that you have at least one
            // device
            continue;
        }
        devices.insert(end(devices), begin(plDevices), end(plDevices));
    }

    cl::Context context(devices[0]);
    cl::CommandQueue queue(context, devices[0]);
    queues.push_back(queue);
}

int main()
{
    TestClass::getInstance();
    return 0;
}

BOOST_PYTHON_MODULE(FrameWork)
{
    TestClass::getInstance();
}
Calling program
So after compiling the program as a DLL, I start Python and run the following program:
import FrameWork
exit()
While the import works without issues, Python crashes on exit(). So I click on Debug and Visual Studio tells me there was an exception in the following code section (in cl2.hpp):
template <>
struct ReferenceHandler<cl_command_queue>
{
    static cl_int retain(cl_command_queue queue)
    { return ::clRetainCommandQueue(queue); }
    static cl_int release(cl_command_queue queue) // -- HERE --
    { return ::clReleaseCommandQueue(queue); }
};
If you compile the above code instead as a simple executable, it works without issues. Also the code works if one of the following is true:
CL_DEVICE_TYPE_GPU is replaced by CL_DEVICE_TYPE_ALL
the line queues.push_back(queue) is removed
Question
So what could be the reason for this and what are possible solutions? I suspect it has something to do with the fact that my TestClass is static, but since it works with the executable I am at a loss as to what is causing it.
I came across a similar problem in the past.
clRetain* functions are supported from OpenCL 1.2 onwards.
When getting devices for the first GPU platform (platforms[0].getDevices(...) for CL_DEVICE_TYPE_GPU), in your case that platform must happen to be a pre-OpenCL 1.2 one, hence you get a crash. When getting devices of any type (GPU/CPU/...), your first platform changes to an OpenCL 1.2+ one and everything is fine.
To fix the problem set:
#define CL_HPP_MINIMUM_OPENCL_VERSION 110
This will ensure calls to clRetain* aren't made for unsupported platforms (pre OpenCL 1.2).
Update: I think there is a bug in cl2.hpp which, despite the minimum OpenCL version being set to 1.1, still tries to use clRetain* on pre-OpenCL 1.2 devices when creating a command queue.
Setting the minimum OpenCL version to 110 and filtering platforms by version works fine for me.
Complete working example:
#include "stdafx.h"
#include <vector>
#define CL_HPP_ENABLE_EXCEPTIONS
#define CL_HPP_TARGET_OPENCL_VERSION 200
#define CL_HPP_MINIMUM_OPENCL_VERSION 110
#include <CL/cl2.hpp>
using namespace std;
class TestClass
{
private:
std::vector<cl::CommandQueue> queues;
TestClass();
public:
static const TestClass& getInstance()
{
static TestClass instance;
return instance;
}
};
TestClass::TestClass()
{
std::vector<cl::Device> devices;
vector<cl::Platform> platforms;
cl::Platform::get(&platforms);
size_t x = 0;
for (; x < platforms.size(); ++x)
{
cl::Platform &p = platforms[x];
int v = cl::detail::getPlatformVersion(p());
short version_major = v >> 16;
if (version_major >= 2) // OpenCL 2.x
break;
}
if (x == platforms.size())
return; // no OpenCL 2.0 platform available
platforms[x].getDevices(CL_DEVICE_TYPE_GPU, &devices);
cl::Context context(devices);
cl::CommandQueue queue(context, devices[0]);
queues.push_back(queue);
}
int main()
{
TestClass::getInstance();
return 0;
}
Update2:
So what could be the reason for this and what are possible solutions? I suspect it has something to do with the fact that my TestClass is static, but since it works with the executable I am at a loss as to what is causing it.
TestClass being static seems to be the reason. It looks like memory is being released in the wrong order when run from Python. To fix that, you may want to add a method which has to be called explicitly to release the OpenCL objects before Python starts releasing memory.
static TestClass& getInstance() // <- const removed
{
    static TestClass instance;
    return instance;
}

void release()
{
    queues.clear();
}

BOOST_PYTHON_MODULE(FrameWork)
{
    TestClass::getInstance();
    TestClass::getInstance().release();
}
"I would appreciate an answer that explains to me what the problem actually is and if there are ways to fix it."
First, let me say that doqtor already answered how to fix the issue -- by ensuring a well-defined destruction time of all used OpenCL resources. IMO, this is not a "hack", but the right thing to do. Trying to rely on static init/cleanup magic to do the right thing -- and watching it fail to do so -- is the real hack!
Second, some thoughts about the issue: the actual problem is even more complex than the common static initialization order fiasco stories. It involves DLL loading/unloading order, both in connection with python loading your custom dll at runtime and (more importantly) with OpenCL's installable client driver (ICD) model.
What DLLs are involved when running an application/DLL that uses OpenCL? To the application, the only relevant DLL is the opencl.dll you link against. It is loaded into process memory during application startup (or when your custom DLL which needs OpenCL is dynamically loaded in Python).
Then, at the time when you first call clGetPlatformInfo() or similar in your code, the ICD logic kicks in: opencl.dll will look for installed drivers (on Windows, those are mentioned somewhere in the registry) and dynamically load their respective DLLs (using something like the LoadLibrary() system call). That may be e.g. nvopencl.dll for NVIDIA, or some other DLL for the Intel driver you have installed. Now, in contrast to the relatively simple opencl.dll, this ICD DLL can and will have a multitude of dependencies of its own -- probably using Intel IPP, or TBB, or whatever. So by now, things have become really messy already.
Now, during shutdown, the Windows loader must decide which DLLs to unload in which order. When you compile your example into a single executable, the number and order of DLLs being loaded/unloaded will certainly be different than in the "Python loads your custom DLL at runtime" scenario. And that could well be the reason why you experience the problem only in the latter case, and only if you still have an OpenCL context + command queue alive during shutdown of your custom DLL. The destruction of your queue (triggered via clRelease... during static destruction of your TestClass instance) is delegated to the Intel ICD DLL, so this DLL must still be fully functional at that time. If, for some reason, that is not the case (perhaps because the loader chose to unload it or one of the DLLs it needs), you crash.
That line of thought reminded me of this article:
https://blogs.msdn.microsoft.com/larryosterman/2004/06/10/dll_process_detach-is-the-last-thing-my-dlls-going-to-see-right/
There's a paragraph, talking about "COM objects", which might be equally applicable to "OpenCL resources":
"So consider the case where you have a DLL that instantiates a COM object at some point during its lifetime. If that DLL keeps a reference to the COM object in a global variable, and doesn’t release the COM object until the DLL_PROCESS_DETACH, then the DLL that implements the COM object will be kept in memory during the lifetime of the COM object. Effectively the DLL implementing the COM object has become dependant on the DLL that holds the reference to the COM object. But the loader has no way of knowing about this dependency. All it knows is that the DLL’s are loaded into memory."
Now, I wrote a lot of words without coming to a definitive proof of what's actually going wrong. The main lesson I learned from bugs like these is: don't enter that snake pit, and do your resource-cleanup in a well-defined place like doqtor suggested. Good night.

Wrapping c++ functions in python with ctypes on windows : function not found

I need to run a series of Python scripts performing various calculations; they are working fine, but one of them runs very slowly and has to be done in C++.
The C++ code is ready, but I need to find a way to call the C++ function from the Python script and get the return value.
I found information about SWIG, but didn't get it to work on Windows with Visual Studio (I have to do it in VS on Windows because of other constraints). I found ctypes much easier for my very simple function with standard input and output values.
So I did many tests with ctypes, tried all the examples I could find online, and each and every time, what happens is that I can load the DLL (built with the Visual Studio 2012 compiler), but when I try calling the function in Python, it just goes:
Traceback (most recent call last):
File "test.py", line 3, in <module>
print maLib.plop()
File "C:\Python27\lib\ctypes\__init__.py", line 378, in __getattr__
func = self.__getitem__(name)
File "C:\Python27\lib\ctypes\__init__.py", line 383, in __getitem__
func = self._FuncPtr((name_or_ordinal, self))
AttributeError: function 'plop' not found
The Python code is (test.py):
from ctypes import cdll
lib = cdll.LoadLibrary('test')
print lib.plop()
The C++ code is (test.cpp, compiled as test.dll, Visual Studio set to build a DLL):
extern "C"
{
int plop();
}
int plop() { return 4; }
As you can see, I tried to make it as simple as possible to avoid details making it fail. I have read the Python ctypes help and tutorials on how to use ctypes, trying exactly the same code they use, but I had to adapt a bit because I am using Visual Studio / Windows and most of the other users are using Linux.
All of the files are in the same folder.
I have tried multiple ways of loading the library: LoadLibrary('./name.dll'), WinDLL('name.dll'), giving the function a different name, void return type, etc...
Do you think I should use SWIG? Is my problem an obvious beginner mistake? I am a student, and new to most of what I'm using, but I put a lot of effort into this, even if I just need it for a particular single task.
When using SWIG, I had to make a wrapper with function pointers, or references, so I thought this was the problem, which made me want to try a simpler solution.
I am using : Python 2.7.7, Visual Studio 2012, Windows 8.1
Many thanks in advance
The error reported by your Python is very clear.
function 'plop' not found
This means that the DLL does not export a function of that name. So, you need to export the function. Either with a .def file, or using __declspec(dllexport):
extern "C"
{
__declspec(dllexport) int plop();
}
To inspect your DLL to debug issues like this, use dumpbin or Dependency Walker.
Note that since you do not specify calling convention, the default of __cdecl is used. That means that cdll is correct on the ctypes side.
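If you ever want the convention to be explicit (or you switch to __stdcall and windll), it can be spelled out in the export; with the defaults this is equivalent to the declaration above:
extern "C"
{
    // Explicit __cdecl documents the contract that ctypes.cdll relies on;
    // switching to __stdcall would mean using ctypes.windll instead.
    __declspec(dllexport) int __cdecl plop();
}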
You need to give your function "C" linkage to avoid name mangling. Declare it like this:
extern "C"
{
int plop();
}
and it will then properly be called "plop" rather than being name-mangled.
