Find what the PyCUDA compilation error was

I'm trying to compile a PyCUDA kernel. The compilation fails with the error:
pycuda.driver.CompileError: nvcc preprocessing of C:\Users\itay\AppData\Local\Temp\tmp78b6tln1.cu failed
[command: nvcc --preprocess -arch sm_52 -m64 -Ic:\users\itay\sources\pythonmaps\env\lib\site-packages\pycuda\cuda C:\Users\itay\AppData\Local\Temp\tmp78b6tln1.cu --compiler-options -EP]
Running the same command from a command prompt works fine.
The CompileError exception has a stderr attribute which contains the stderr of the compilation, but it is empty. Only by placing a breakpoint inside PyCUDA did I find the actual error, which nvcc reported to stdout.
Is there a way to get the compilation output without placing breakpoints in PyCUDA?
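One workaround, until the exception carries stdout as well: re-run the failing command yourself and capture both streams. A minimal sketch (the run_and_capture helper and the stand-in command are illustrative; substitute the exact nvcc command line printed in the exception):

```python
import subprocess
import sys

def run_and_capture(cmd):
    """Run a command and return (returncode, stdout, stderr) as text."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, proc.stdout, proc.stderr

# Stand-in for the real nvcc invocation: a process that, like nvcc here,
# reports its error on stdout rather than stderr.
rc, out, err = run_and_capture(
    [sys.executable, "-c", "print('error: bad kernel'); raise SystemExit(1)"])
print(rc, out.strip())
```

Since nvcc wrote the diagnostic to stdout in this case, inspecting both captured streams reveals it without any breakpoints.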

Related

Error while creating dll from C code for Python code

I have some C code which uses "mex.h". I am trying to create a shared library (i.e. a dll file) so that it can be used in Python via the ctypes library.
Following is the command I tried for creating the dll file, but I am getting an error.
Command Used
gcc -fPIC -IC:\Progra~1\MATLAB\R2020b\extern\include -IC:\Progra~1\MATLAB\R2020b\extern\lib -lC:\Progra~1\MATLAB\R2020b\bin\win64\libmat.dll -lC:\Progra~1\MATLAB\R2020b\bin\win64\libmx.dll -shared -o whitsmw_Cd1.dll whitsmw_Cd1.c
Error
c:/mingw/bin/../lib/gcc/mingw32/6.3.0/../../../../mingw32/bin/ld.exe: cannot find -lC:\Progra~1\MATLAB\R2020b\bin\win64\libmat.dll
c:/mingw/bin/../lib/gcc/mingw32/6.3.0/../../../../mingw32/bin/ld.exe: cannot find -lC:\Progra~1\MATLAB\R2020b\bin\win64\libmx.dll
collect2.exe: error: ld returned 1 exit status
I have checked that the paths are correct. It would be helpful if someone could point out the problem.
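For what it's worth, the likely cause of the "cannot find" errors above is that `-l` expects a library *name* (which ld expands to forms like lib&lt;name&gt;.dll.a or lib&lt;name&gt;.a and searches along the `-L` paths), not a full path to a DLL. A sketch of the corrected command under that assumption, keeping the paths from the question:

```shell
gcc -fPIC -IC:\Progra~1\MATLAB\R2020b\extern\include -shared \
    -o whitsmw_Cd1.dll whitsmw_Cd1.c \
    -LC:\Progra~1\MATLAB\R2020b\bin\win64 -lmat -lmx
```

Note that the libraries are listed after the source file, since ld resolves symbols left to right.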

Pycharm debugger when using multiprocessing pool

My problem occurs with this setup:
PyCharm 2020.3 Pro
multiprocessing.Pool
MacBook Pro 2020 (M1)
Conda Python 3.8
Most of all, it happens when I use the PyCharm debugger.
The console shows the following message eight times (once per process):
Error loading: /Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/pydevd_attach_to_process/attach_x86_64.dylib
Every process executes and the results are correct; I can see the processes with the htop command. So it's only a debugger failure and doesn't really impact code execution (correct me if I am wrong).
This is the kind of code I run :
from multiprocessing import Pool

def func(x):
    return x + 10

if __name__ == '__main__':
    poo = Pool()
    x = [[i] for i in range(10)]
    res = poo.starmap(func, x)
    print(res)
I can ignore this flood of messages in my console for now, but it's not really convenient. If someone has an idea of how to get rid of them...
This sounds like something the JetBrains devs will need to work out for the M1 (consider filing a bug report with them). In the meantime, I suspect you can disable it under
PyCharm > Preferences > Build, Execution, Deployment > Python Debugger
by unchecking the box "Attach to subprocess automatically while debugging". See the pertinent docs for reference.
The solution for that error is to modify the file /Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/pydevd_attach_to_process/linux_and_mac/compile_mac.sh and replace all of its code with the following:
g++ -fPIC -D_REENTRANT -std=c++11 -arch arm64 -c -o attach_x86_64.o attach.cpp
g++ -dynamiclib -nostartfiles -arch arm64 -o attach_x86_64.dylib attach_x86_64.o -lc
rm attach_x86_64.o
mv attach_x86_64.dylib ../attach_x86_64.dylib
Then you can run the sh script, which will replace the attach_x86_64.dylib file.
Note: this change will be lost if you update PyCharm.
I faced this same issue on my 16" Mac M1 (2021), but only after I updated PyCharm to a newer version.
If you are facing this issue on an M1 Mac, head to this location:
/Applications/PyCharm CE.app/Contents/plugins/python-ce/helpers/pydev/pydevd_attach_to_process/linux_and_mac/compile_mac.sh
and add these lines to the compile_mac.sh file alongside the existing ones:
g++ -fPIC -D_REENTRANT -std=c++11 -arch arm64 -c -o attach_arm64.o attach.cpp
g++ -dynamiclib -nostartfiles -arch arm64 -o attach_arm64.dylib attach_arm64.o -lc
rm attach_arm64.o
mv attach_arm64.dylib ../attach_arm64.dylib
I found this link useful for solving this issue: youtrack.jetbrains.com

Python / C++ binding: how to link against a static C++ library (portaudio) with distutils?

I am trying to statically link the C++ portaudio library into my C++ demo module, which is a Python-callable library (module).
I'm doing this with distutils, and in order to perform the static linking, I've added libportaudio to the extra_objects argument, as follows:
module1 = Extension(
    "demo",
    sources=cppc,
    # TODO remove OS dependency
    extra_compile_args=gccArgs,
    # link against shared libraries
    # libraries=[""],
    # link against static libraries
    extra_objects=["./clib-3rd-portaudio/libportaudio.a"])  # << I've added the static lib here
Compiling with "python setup.py build" results in the following linker error:
/usr/bin/ld: ./clib-3rd-portaudio/libportaudio.a(pa_front.o): relocation R_X86_64_32 against `.rodata.str1.8' can not be used when making a shared object; recompile with -fPIC
./clib-3rd-portaudio/libportaudio.a: error adding symbols: Bad value
collect2: error: ld returned 1 exit status
So at this point I've tried the obvious and added the -fPIC flag to gccArgs (note extra_compile_args=gccArgs above), as follows:
gccArgs = [
    "-Icsrc",
    "-Icsrc/paExamples",
    "-Icinc-3rd-portaudio",
    "-Icinc-3rd-portaudio/common",
    "-Icinc-3rd-portaudio/linux",
    "-fPIC"]  # << I've added the -fPIC flag here
However, this results in exactly the same error, so I guess the -fPIC flag is not the root cause. I'm probably missing something trivial, but I'm a bit lost here; I hope somebody can help.
As the error message says, you need to recompile the external library libportaudio.a itself with the -fPIC flag, not your own code. That's why adding -fPIC to your extra_compile_args doesn't help.
Several other posts suggest that the shipped libportaudio.a cannot be used to build a shared library, probably because portaudio's default build settings don't include -fPIC.
To recompile portaudio correctly, download the source and try running ./configure with a shared/PIC option (or something similar). If you cannot find the proper option, modify the Makefile and append -fPIC to the extra compile options. You can also compile each object file manually and pack them into libportaudio.a.
Since your target file (libdemo.so) is a shared library, you must make sure that ANY object code included in it was compiled with the -fPIC option. To understand why you need this option, please refer to:
What does -fPIC mean when building a shared library? and Position Independent Code (PIC) in shared libraries
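As a concrete illustration of the rebuild step described above (a sketch; the exact configure options vary by portaudio version, so passing -fPIC through CFLAGS is the generic autoconf route):

```shell
cd portaudio
CFLAGS="-fPIC" CXXFLAGS="-fPIC" ./configure
make
# the rebuilt static archive typically ends up under lib/.libs/libportaudio.a
```

Point extra_objects at the rebuilt archive and the R_X86_64_32 relocation error should disappear.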

Random Forest Distance for Python fails to build (g++)

This implementation of RFD (http://www.cse.buffalo.edu/~jcorso/r/snippets.metric_learning.html)
fails to build for me. I run the setup.py within the Python package, and the following appears:
Building Swig Modules:
building librf...
/tmp/cctKDjwA.s: Assembler messages:
/tmp/cctKDjwA.s:12665: Error: no such instruction: `vfnmadd312ss 52(%r14),%xmm5,%xmm2'
/tmp/cctKDjwA.s:14338: Error: no such instruction: `vfnmadd312ss 84(%rdx),%xmm5,%xmm2'
/tmp/cctKDjwA.s:18244: Error: no such instruction: `vfnmadd312ss 228(%rsp),%xmm1,%xmm3'
/tmp/cctKDjwA.s:18389: Error: no such instruction: `vfmadd312ss 272(%rsp),%xmm1,%xmm0'
The line where it fails (checked separately):
os.system("g++ -march=native -fPIC -O3 -std=c++0x -c src/librf_wrap.cxx src/librf/*.cc src/librf/semaphores/*.cpp -I/usr/include/" + pyver)
I'm running Ubuntu 12.04 64-bit with an i5-4430. Apologies, but I'm unsure what additional info I should add; please suggest.
Thanks for your patience.
It's possible that -march=native is incorrectly detecting your CPU and generating instructions that are illegal for it.
Could you try without -march=native to see if that is the case?
Note that it's possible to see exactly what -march=native chooses; see this website.
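One way to make that inspection (a sketch; both are standard GCC invocations that print the target flags -march=native expands to on the machine at hand):

```shell
# dump the cc1 command line, which spells out the detected -march/-mtune
gcc -march=native -E -v - </dev/null 2>&1 | grep cc1

# or ask GCC directly which target options end up enabled
gcc -march=native -Q --help=target | grep -E 'march|mtune'
```

If the detected -march enables FMA instructions (as the vfnmadd errors suggest), the assembler shipped with Ubuntu 12.04 may simply be too old to know them.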

undefined symbol: PyOS_InputHook, from shared library

I wrote a C++ "python plugin" for a non-Python C++ application.
At some point, this plugin, which is a .so, initializes the Python interpreter and opens a Python console.
For convenience, the "readline" module is then imported, and we get this error:
ImportError: /usr/lib/python2.7/lib-dynload/readline.so: undefined symbol: PyOS_InputHook
The link command (generated by cmake) goes:
/usr/bin/c++ -fPIC -Wall -Wextra -O3 -DNDEBUG -Xlinker -export-dynamic -Wl,-fwhole-program /usr/lib/libpython2.7.a -shared -Wl,-soname,libMyplugin.so -o libMyplugin.so [sources] [qt libs] -lGLU -lGL -lX11 -lXext -lc -lc -lpython2.7 -Wl,-rpath,/src:/usr/local/Trolltech/Qt-4.8.4/lib:
nm libMyplugin.so gives the following python-related symbols:
U Py_Finalize
U Py_Initialize
00000000002114a8 B PyOS_InputHook
U PyRun_InteractiveLoopFlags
U PyRun_SimpleStringFlags
We observe that PyOS_InputHook is defined in the BSS section of the plugin, yet Python's readline.so fails to find it.
The question is why, and how to fix it.
The issue is in how the main application loads the plugin: it uses dlopen() without the RTLD_GLOBAL flag.
This means that symbols exported by the plugin (like PyOS_InputHook in this instance) are not placed in the global symbol namespace, and so cannot be resolved by shared libraries loaded afterwards (like readline.so in this instance).
To fix this, the RTLD_GLOBAL flag should be used when loading the plugin.
If there is no control over the main application (as in this instance) and over how it uses dlopen(), it is still possible to "reload" the plugin from within the plugin itself using dlopen() with the flags RTLD_NOLOAD | RTLD_GLOBAL, which promotes the already-loaded library's symbols to the global namespace.
Doing this solves the issue.
