Better Way of Debugging Cython Packages

I currently use Cython to build a module that is mostly written in C. I would like to be able to debug quickly by simply calling a Python file that imports the "new" Cython module and tests it. The problem is that I import GSL, and therefore pyximport will not work. So I'm left with "python setup.py build; python setup.py install" and then running my test script.
Is this the only way? I was wondering if anyone else uses any shortcuts or scripts to help them debug faster?

I usually just throw all the commands I need to build and test into a shell script, and run it when I want to test. It's a lot easier than futzing with crazy Python test runners.
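That answer suggests a shell script; an equivalent sketch in Python (so the same interpreter does both the building and the testing) might look like the following. The setup.py and test-script names are placeholders, not taken from the question.
# rebuild_and_test.py -- hypothetical helper: rebuild the extension in place,
# then run the test script against the fresh build.
import subprocess
import sys

subprocess.check_call([sys.executable, "setup.py", "build_ext", "--inplace"])
subprocess.check_call([sys.executable, "test_mymodule.py"])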


Embedding Cython in C++

I am trying to embed a piece of Cython code in a C++ project, such that I can compile a binary that has no dependencies on Python 2.7 (so users can run the executable without having Python installed). The Cython source is not pure Cython: There is also Python code in there.
I am compiling my Cython code using distutils in the following script (setup.py):
from distutils.core import setup
from Cython.Build import cythonize
setup(
ext_modules = cythonize("test.pyx")
)
I then run the script using python setup.py build_ext --inplace. This generates a couple of files: test.c, test.h, test.pyd and some library files: test.exp, test.obj and test.lib.
What would be the proper procedure to import this into C++? I managed to get it working by including test.c and test.h during compilation and test.lib during linking.
I am then able to call the Cython functions after I issue
Py_Initialize();
inittest();
in my C++ code.
The issue is that there are numerous dependencies on Python, both during compilation (e.g., in test.h) as well as during linking. The bottom line is that in order to run the executable, Python has to be installed (otherwise I get errors about a missing python27.dll).
Am I going in the right direction with this approach? There are so many options that I am just very confused about how to proceed. Conceptually, it also does not make sense that I should call Py_Initialize() if I want the whole thing to be Python-independent. Furthermore, this is apparently the 'Very High Level Embedding' method rather than a low-level Cython embedding, but this is just how I got it to work.
If anybody has any insights on this, that would be really appreciated.
Cython cannot make Python code Python-independent; it calls into the Python library in order to handle Python types and function calls. If you want your program to be Python-independent then you should not write any Python code.
(This is primarily extra detail to Ignacio Vazquez-Abrams's answer, which says that you can't eliminate the Python dependency.)
If you don't want to force your users to have Python installed themselves, you could always bundle python27.dll with your application (read the license agreement, but I'm almost certain it's fine!).
However, as soon as you do an import in your code, you either have to bundle the relevant module, or make sure it (and anything it imports!) is compiled with Cython. Unless you're doing something very trivial, you could end up spending a lot of time chasing dependencies. This includes the majority of the standard library.

How to install python binding of a C++ library

Imagine that we are given the finished C++ source code of a library called MyAwesomeLib. The goal is to expose some of its power to Python, so we create a wrapper using SWIG and generate a Python package called PyMyAwesomeLib.
The directory structure now looks like
root_dir
|-src/
|-lib/
| |- libMyAwesomeLib.so
| |- _PyMyAwesomeLib.so
|-swig/
| |- PyMyAwesomeLib.py
|-python/
| |- Script_using_myawesomelib.py
So far so good. Ideally, all we want to do next is to copy lib/*.so, swig/*.py and python/*.py into the corresponding directories under site-packages in a Pythonic way, i.e. using
python setup.py install
However, I got very confused when trying to achieve this simple goal using setuptools and distutils. Both tools handle the compilation of Python extensions through an internal system, where the source files, compiler flags, etc. are passed using setup(ext_modules=[Extension(...)]). But this is ridiculous, since MyAwesomeLib has a fully functioning build system based on Makefiles. Porting the logic embedded in the Makefiles would be redundant and completely unnecessary work.
After some research, it seems there are two options left: I can either override setuptools.command.build and setuptools.command.install to use the existing Makefile and copy the results directly, or I can somehow let setuptools know about these files and ask it to copy them during installation. The second way is more appealing, but it is what gives me the most headache. I have tried the following options without success:
package_data and include_package_data do not work because the *.so files are not under version control and are not inside any package.
data_files does not seem to work, since the files only get included when running python setup.py sdist but are ignored by python setup.py install. This is the opposite of what I want: the .so files should not be included in the source distribution, but should be copied during the installation step.
MANIFEST.in failed for the same reason as data_files.
eager_resources does not work either, but honestly I do not know the difference between eager_resources and data_files or MANIFEST.in.
I think this is actually a common situation, and I hope there is a simple solution to it. Any help would be greatly appreciated.
Porting the logic embedded in makefiles would be redundant and completely unnecessary work.
Unfortunately, that's exactly what I had to do. I've been struggling with this same issue for a while now.
Porting it over actually wasn't too bad. distutils does understand SWIG extensions, but this was implemented rather haphazardly on their part. Running SWIG creates Python files, and the current build order assumes that all Python files have been accounted for before running build_ext. That one wasn't too hard to fix, but it's annoying that they would claim to support SWIG without mentioning this. Distutils attempts to be cross-platform when compiling things, so there is still an advantage to using it.
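For reference, that build-order fix usually amounts to making build_ext (which runs SWIG and generates the .py wrapper) execute before build_py, so the generated Python files get picked up. Below is a rough sketch; the project and file names are only illustrative, not the actual project's setup.py.
from distutils.core import setup, Extension
from distutils.command.build import build

class build_ext_first(build):
    # Reorder the sub-commands so build_ext runs before build_py.
    sub_commands = ([c for c in build.sub_commands if c[0] == 'build_ext'] +
                    [c for c in build.sub_commands if c[0] != 'build_ext'])

setup(
    name='PyMyAwesomeLib',
    version='0.1',
    py_modules=['PyMyAwesomeLib'],
    ext_modules=[Extension('_PyMyAwesomeLib',
                           sources=['swig/PyMyAwesomeLib.i'],
                           swig_opts=['-c++'])],
    cmdclass={'build': build_ext_first},
)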
If you don't want to port your entire build system over, use the system's package manager. Many complex libraries do this (but they also try their best with setup.py). For example, to get numpy and lxml on Ubuntu you'd just do:
sudo apt-get install python-numpy python-lxml. No pip.
I realize you'd rather write one setup file instead of dealing with every package manager ever so this is probably not very helpful.
If you do try to go the setuptools route, there is one fatal flaw I ran into: dependencies.
For instance, if you are distributing a SWIG-based project, it's going to need libpython. If the user doesn't have it, an error like this happens:
#include <Python.h>
error: File not found
That's pretty unhelpful to the average user.
Even worse, if you require a shared library but the user's library is out of date, the user can get some crazy errors. You're at the mercy of their C++ compiler to output Google-friendly error messages so they can figure it out.
The long-term solution would be to get setuptools/distutils to get better at detecting non-python libraries, hopefully as good as Ruby's gem. I pretty much had to roll my own. For instance, in this setup.py I'm working on you can see a few functions at the top I hacked together for dependency detection (still doesn't work on all systems...definitely not Windows).
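Those hand-rolled checks were along these lines. The sketch below is only a generic illustration of the idea (not the actual code from that setup.py): fail early with a readable message instead of a compiler error. The library name being checked is just an example.
import os
import sys
from ctypes.util import find_library
from distutils import sysconfig

def check_python_headers():
    # Verify that Python.h is available before handing anything to the compiler.
    include_dir = sysconfig.get_python_inc()
    if not os.path.exists(os.path.join(include_dir, 'Python.h')):
        sys.exit("Python development headers not found in %s; "
                 "install your platform's python-dev package." % include_dir)

def check_shared_library(name):
    # Verify that a required shared library can be located on this system.
    if find_library(name) is None:
        sys.exit("Required library %r was not found; please install it first." % name)

check_python_headers()
check_shared_library('MyAwesomeLib')  # example dependency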

What does setup.py build do?

I read in the Python documentation:
The build command is responsible for putting the files to install into a build directory.
I fear this documentation may be incomplete. Does python setup.py build do anything else? I expect this step to generate object files with Python bytecode, which will be interpreted at execution time by the Python VM.
Also, I'm building an automated code check in my source code repository. I want to know if there is any benefit of running setup.py build (does it do any checks?) or is a static code/PEP8 checker such as Pylint good enough?
Does python setup.py build do anything else?
If your package contains C extensions (or defines some custom compilation tasks), they will be compiled too. If you only have Python files in your package, copying is all build does.
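For illustration, with a setup.py roughly like the sketch below (the names are hypothetical), python setup.py build copies the pure-Python package into the build directory and also compiles the C extension; without ext_modules, copying would be all it does.
from distutils.core import setup, Extension

setup(
    name='example',
    version='0.1',
    packages=['example'],                        # pure Python: only copied
    ext_modules=[Extension('example._speedups',  # C code: compiled by build
                           sources=['src/speedups.c'])],
)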
I expect this step to generate object files with Python bytecode, which will be interpreted at execution time by Python VM.
No, build does not do that. This happens at the install stage.
I want to know if there is any benefit of running setup.py build (Does it do any checks?) or is a static code/PEP8 checker such as Pylint good enough?
By all means, run pylint. build does not even check the syntax.

How do I run the Python 'sdist' command from within an automated Python script without using subprocess?

I am writing a script to automate the packaging of a 'home-made' python module and distributing it on a remote machine.
I am using pip and have created a setup.py file, but I then have to call the subprocess module to run the "python setup.py sdist" command.
I have looked at the "run_setup" method in distutils.core, but I am trying to avoid using the subprocess module altogether. (I see no point in opening a shell to run a Python command if I am already in Python...)
Is there a way to import the distutils module into my script, pass the setup information directly to one of its methods, and avoid using the shell command entirely? Any other suggestions that may help would also be welcome.
Thanks.
Just for the sake of completeness, I wanted to answer this since I came across it trying to find out how to do this myself. In my case, I wanted to be sure that the same Python version was being used to execute the command, which is why using subprocess was not a good option. (Edit: as pointed out in a comment, I could use sys.executable with subprocess, though programmatic execution is IMO still a cleaner approach -- and obviously pretty straightforward.)
(Using distutils.core.run_setup does not call subprocess, but uses exec in a controlled scope/environment.)
from distutils.core import run_setup
run_setup('setup.py', script_args=['sdist'])
Another option may be to use the setuptools commands directly, though I have not explored this to completion. Obviously, you still have to figure out how to avoid duplicating your project metadata.
from setuptools.dist import Distribution
from setuptools.command.sdist import sdist
dist = Distribution({'name': 'my-project', 'version': '1.0.0'}) # etc.
dist.script_name = 'setup.py'
cmd = sdist(dist)
cmd.ensure_finalized()
cmd.run() # TODO: error handling
Anyway, hopefully that will help someone in the right direction. There are plenty of valid reasons to want to perform packaging operations programmatically, after all.
If you don’t have a real reason to avoid subprocesses (i.e. lack of platform support, not just aesthetics (“I see no point”)), then I suggest you just not worry about it and run it in a subprocess. There are a few ways to achieve what you request, but they have their downsides (like having to catch exceptions and report errors).
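If a subprocess is acceptable after all, pinning the interpreter with sys.executable addresses the "same Python version" concern from the answer above; a minimal sketch:
import subprocess
import sys

# Run "setup.py sdist" with the exact interpreter executing this script.
subprocess.check_call([sys.executable, 'setup.py', 'sdist'])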

How can I use nosetests to run a shell script or another Python script

nose is a test runner which extends PyUnit. Is it possible to write, e.g.,
$ nosetests --with-shell myTest.py -myargs test
If not, is there a plugin, or do I need to develop it myself?
Any suggestions?
Nose is not a general test harness. It's specifically a Python harness which runs Python unit tests.
So, while you can write extensions for it to execute scripts and mark them as successes or failures based on the exit status or an output string, I think it's an attempt to shoehorn the harness into doing something it's not really meant to do.
You should package your tests as Python functions or classes, and have them use a library to run the external scripts, translating their output or behaviour into something nose can interpret, rather than extending nose to run scripts directly.
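In practice that suggestion often boils down to an ordinary test function that shells out to the script and asserts on its exit status; a small sketch, where the script name and arguments are placeholders:
import subprocess

def test_my_shell_script():
    # nose collects this like any other test; a non-zero exit status fails it.
    exit_code = subprocess.call(['sh', 'myTest.sh', 'test'])
    assert exit_code == 0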
Also, I've experimented with nose a bit and found its extension mechanism quite clumsy compared to py.test. You might want to give that a shot.
