Python C Extension using numpy and gdal giving undefined symbol at runtime - python

I'm writing a C++ extension for python to speed up a raster image viewer created in-house.
I've got working code, but noticed that the speed hadn't increased that much and after profiling a bit deeper realised that it was due to the gdal.ReadAsArray calls, which were python callbacks from the C extension.
To get around the overhead of Python C-API when calling python objects I decided I would use the C++ libraries for gdal rather than using the Python callback to the existing gdalDataset. (space isn't a problem).
However after implementing the code for this I ran into an error at runtime(the extension compiled fine)
which was
import getRasterImage_new
ImportError: /local1/data/scratch/lib/python2.7/site-packages
/getRasterImage_new.so: undefined symbol: _ZN11GDALDataset14GetRasterYSizeEv
The code below replicates the error(some edits may be needed to run on your machines(ignore the uninitialised variables, it's just whats needed to replicate the error).
Python:
#!/usr/bin/env python
import numpy
from osgeo import gdal
import PythonCTest
print("test starting")
PythonCTest.testFunction(1)
print("test complete")
C++:
#include "Python.h"
#include "numpy/arrayobject.h"
#include "gdal_priv.h"
#include <iostream>
extern "C"{
static PyObject* PythonCTest_testFunction(PyObject* args);
static PyMethodDef PythonCTest_newMethods[] = {
{"testFunction", (PyCFunction)PythonCTest_testFunction, METH_VARARGS,
"test function"},
{NULL,NULL,0,NULL}};
PyMODINIT_FUNC initPythonCTest(void){
(void)Py_InitModule("PythonCTest",PythonCTest_newMethods);
import_array();
}
}
GDALDataset* y;
static PyObject* PythonCTest_testFunction(PyObject* args){
std::cout << "in testFunction\n";
y->GetRasterYSize();
std::cout << "doing stuff" << "\n";
return Py_None;
}
Any suggestions would be very welcome.
EDIT
You can also remove the from osgeo import gdal and the error stull occurs(only just noticed that).
EDIT 2
I forgot to say that I'm compiling my extension using distutils
current setup.py is
#!/usr/bin/env python
from distutils.core import setup, Extension
import os
os.environ["CC"] = "g++"
setup(name='PythonCTest', version='1.0', \
ext_modules=[Extension('PythonCTest', ['PythonCTest.cpp'],
extra_compile_args=['--std=c++14','-l/usr/include/gdal', '-I/usr/include/gdal'])])

A Python extension module is a dynamically loadable (shared) library. When linking a shared library, you need to specify its library dependencies, such as -lgdal and, for that matter, -lpython2.7. Failing to do so results in a library with unresolved symbols, and if those symbols are not provided by the time it is loaded, the loading will fail, as reported by Python.
To resolve the error, you need to add libraries=['gdal'] to Extension constructor. Specifying -lgdal in extra_compile_args won't work because compile arguments, as the name implies, are used for compilation and not for linking.
Note that an unresolved symbol would not go undetected when linking an executable, where the build would fail with a linker error. To get the same diagnostics when linking shared libraries, include -Wl,-zdefs in link arguments.

Related

How I resolve Cppyy load_library giving Runtime error?

Okay so according to the answer I found here to the question titled "Calling C/C++ from Python?" here, and also on the cppyy documentation website, I made some sample classes in .h and .cpp files and tried to include them in Python. While the .h file gets included easily, when I try to use the cppyy.load_library() function, it gives me a runtime error for some reason. Can someone please help? I've tried to look for solutions online but apparently no one has got a similar problem before. This is what I'm running in Jupyter Notebook:
import cppyy
cppyy.include("foo.h")
cppyy.load_library("libfoo")
the final line is giving me the following error:
RuntimeError Traceback (most recent call last)
<ipython-input-3-eea6173ad08e> in <module>
----> 1 cppyy.load_library("libfoo")
~\anaconda3\lib\site-packages\cppyy\__init__.py in load_library(name)
219 sc = gSystem.Load(name)
220 if sc == -1:
--> 221 raise RuntimeError('Unable to load library "%s"%s' % (name, err.err))
222 return True
223
RuntimeError: Unable to load library "libfoo"
This is my .h file:
class Foo {
public:
void bar();
};
And here is my .cpp file:
#include "foo.h"
#include <iostream>
void Foo::bar() { std::cout << "Hello" << std::endl; }
I'm using the commands g++ -c -fPIC foo.cpp -o foo.o and g++ -shared -Wl,-soname,libfoo.so -o libfoo.so foo.o to compile my cpp code.
Please can someone help?
The example works just fine for me.
To debug, first make sure to start python in the same directory where libfoo.so is located, or to add the directory where libfoo.so lives to LD_LIBRARY_PATH (for any process to use), or call cppyy.add_library_path() with the path as argument to add it for use by cppyy only. As concerns the name, the .so extension is automatically added as needed, as is the lib part, so either one of foo, libfoo, foo.so, or libfoo.so is fine.
If that still fails, a reasonable way on Linux (only) of getting more information for what could be going wrong, is to use ctypes:
$ python
>>> import cypes
>>> lib = ctypes.CDLL("libfoo.so")
which will show you if there are different problems such as missing symbols or missing dependent libraries (but neither is the case here).

pybind with third-party C++ on Windows (ImportError: DLL load failed)

I'm trying to use a third-party c++ lib from python using pybind11.
I have managed to do it using pure C++ code, but, when I add the third-party code I get the following error:
ImportError: DLL load failed while importing recfusionbind
Here's my code:
#include <RecFusion.h>
#include <pybind11/pybind11.h>
using namespace RecFusion;
bool isValidLicense() {
return RecFusionSDK::setLicenseFile("License.dat");
}
namespace py = pybind11;
PYBIND11_MODULE(recfusionbind, m) {
m.def("isValidLicense", &isValidLicense, R"pbdoc(
Check if there is a valid license file.
)pbdoc");
#ifdef VERSION_INFO
m.attr("__version__") = VERSION_INFO;
#else
m.attr("__version__") = "dev";
#endif
}
If i change, for testing purposes, the return of the function to simply true it works just fine. But calling the third-party library gives the above mentioned error. I have put RecFusion.dll on the current directory (where the python script is), with same results.
Any tips on what I'm missing will be welcomed. Lib is 64 bits, same as my code.
This seems to me like a linking problem. Look at your dll's import table (with tools like PE explorer or IDA). Now validate that the function RecFusionSDK::setLicenseFile is indeed imported from a dll named RecFusion.dll and not some other dll name.
Also validate that the mangled function name in RecFusion.dll's export table and in your dll match. If not, this could be an ABI problem.

How do I import a dll-module written in C++ to Python using PyBind11

I've been having trouble getting pybind11 to work the way described here.
I was able to build the example DLL in Visual Studio 2019 but I don't understand how to actually get a Python script to recognize the module using an 'import' statement.
This the test module written in C++ (compiles without any issues):
#include <pybind11/pybind11.h>
int add(int i, int j) {
return i + j;
}
PYBIND11_MODULE(example, m) {
m.doc() = "pybind11 example plugin"; // optional module docstring
m.def("add", &add, "A function which adds two numbers");
This is my test Python script which attempts to call the module:
import example
print(example.add(1,2))
The above simply fails to identify the module - i.e. "ModuleNotFoundError: No module named 'example'"
I've placed my Python script in the same folder as the DLL build (containing various files - .obj, .iobj,.lib etc. )
I'm running Python 3.8 on Windows 10 if it matters.
I've been bashing my head against Boost.Python all day before finally finding pybind11 but I can't get it to work!!

Swig/python : when SWIG_init() is needed?

Hi everyone and thanks for trying to help me !
I encounter trouble when trying to import a python module generated by swig.
I have a basic library "example" containing few methods.
Next to it I have a main program dynamically linked to python.
This program imports the generated module and calls a function in it.
If my library example is a shared one, named _example.so, everything works perfectly, and I can import it in python.
But if my library is static, _example.a, and linked to the main program, then I will have the error "no module named _example was found" unless I add a call to SWIG_init() in the main function.
What exactly does SWIG_init() , and when should I use it ? It seems quite weird to me because it's never said in the documentation to do such a call.
I know that dealing with a .so shared library is better but I try to reproduce the behavior of what I have on a big project at work, so I really have to understand what happens when the module is static.
Here is my main file :
#include "Python.h"
#include <iostream>
#if PY_VERSION_HEX >= 0x03000000
# define SWIG_init PyInit__example
#else
# define SWIG_init init_example
#endif
#ifdef __cplusplus
extern "C"
#endif
#if PY_VERSION_HEX >= 0x03000000
PyObject*
#else
void
#endif
SWIG_init(void);
int main (int arc, char** argv)
{
Py_Initialize();
SWIG_init(); // needed only using the statically linked version of example ?
PyRun_SimpleString("print \"Hello world from Python !\"");
PyRun_SimpleString("import sys");
PyRun_SimpleString("sys.path.append(\"/path/to/my/module\")");
PyRun_SimpleString("import example");
PyRun_SimpleString("a = example.Example()");
PyRun_SimpleString("print a.fact(5)");
}
Here is how things are generated :
swig -c++ -python example.i
g++ -fpic -c example.cpp example_wrap.cxx -I/include/python2.7 -lstdc++
ar rvs libexample.a example.o example_wrap.o
// to generate dynamic instead of static : g++ -shared example.o example_wrap.o -o _example.so
g++ main.cpp -I/include/python2.7 libexample.a -lstdc++ -L/lib/python -lpython2.7 -o main
What you are calling is the init function of the native python module _example that is loaded by the SWIG generated python wrapper. For python 2 this function is named init_example, and for python 3 it is named PyInit__example.
Every python extension with C or C++ needs such a function, it basically initializes everything and registers the name of the module and all the methods available for it. In your case SWIG has generated this function for you.
The reason you have to call this function yourself when you compiled the library statically is simply that the python wrapper example imports the native module _example which is by the python convention a shared object, which you did not compile, and which is thus not found.
By calling SWIG_init, you "preload" the module, so python does not try to reimport it, so it works even though there is no shared object anywhere on the python module path.
If you have the shared object for your module, python will call this function for you after loading the shared object and you don't have to worry about this.

Building a DLL with MinGW and loading it with Python's ctypes module

I am trying to build a C DLL which can be loaded within python using ctypes.windll.loadlibrary(...)
I can create a DLL and a client program all in C which work following the MinGW tutorial at http://www.mingw.org/wiki/MSVC_and_MinGW_DLLs.
When I try to load the same DLL within python I get an error:
OSError: [WinErrror 193] %1 is not a valid Win32 application
Can someone give me some idea as to what I am doing incorrectly?
Here are the files:
noise_dll.h
#ifndef NOISE_DLL_H
#define NOISE_DLL_H
// declspec will identify which functions are to be exported when
// building the dll and imported when 'including' this header for a client
#ifdef BUILDING_NOISE_DLL
#define NOISE_DLL __declspec(dllexport)
#else
#define NOISE_DLL __declspec(dllimport)
#endif
//this is a test function to see if the dll is working
// __stdcall => use ctypes.windll ...
int __stdcall NOISE_DLL hello(const char *s);
#endif // NOISE_DLL_H
noise_dll.c
#include <stdio.h>
#include "noise_dll.h"
__stdcall int hello(const char *s)
{
printf("Hello %s\n", s);
return 0;
}
I build the DLL with:
gcc -c -D BUILDING_NOISE_DLL noise_dll.c
gcc -shared -o noise_dll.dll noise_dll.o -Wl,--out-implib,libnoise_dll.a
The python code is simply:
import ctypes
my_dll = ctypes.windll.LoadLibrary("noise_dll")
and I get the error above: '%1 is not not a valid Win32 application'
I know the DLL is not completely wrong, because if i create a client file:
noise_client.c
#include <stdio.h>
#include "noise_dll.h"
int main(void)
{
hello("DLL");
return 0;
}
and build with:
gcc -c noise_client.c
gcc -o noise_client.exe noise_client.o -L. -lnoise_dll
I get a working executable. I have some understanding of all of what goes on in the code, options and preprocessor directives above, but am still a little fuzzy on how the .dll file and the .a file are used. I know if I remove the .a file I can still build the client, so I am not even sure what its purpose is. All I know is that it is some kind of archive format for multiple object files
I can ctypes.windll.loadlibrary(...) an ordinary windows DLL found in the windows/system32 without an issue.
One final point:
I am using 64 bit python 3.3. I am using the version of minGW tat comes with the recommended installer (mingw-get-inst-20120426.exe). I am not sure if it is 32 bit, or if that matters.
Thanks!
Use 32-bit python - your DLL probably wasn't compiled as 64 bit code.

Categories