How to install Python bindings of a C++ library - python

Imagine that we are given the finished C++ source code of a library called MyAwesomeLib. The goal is to expose some of its power to Python, so we create a wrapper using SWIG and generate a Python package called PyMyAwesomeLib.
The directory structure now looks like
root_dir
|-src/
|-lib/
| |- libMyAwesomeLib.so
| |- _PyMyAwesomeLib.so
|-swig/
| |- PyMyAwesomeLib.py
|-python/
|- Script_using_myawesomelib.py
So far so good. Ideally, all we want to do next is to copy lib/*.so, swig/*.py and python/*.py into the corresponding directories in site-packages in a pythonic way, i.e. using
python setup.py install
However, I got very confused when trying to achieve this simple goal using setuptools and distutils. Both tools handle the compilation of Python extensions through an internal system, where the source files, compiler flags etc. are passed using setup(ext_modules=[Extension(...)]). But this is ridiculous, since MyAwesomeLib already has a fully functioning build system based on makefiles. Porting the logic embedded in the makefiles would be redundant and completely unnecessary work.
After some research, it seems there are two options left: I can either override setuptools.command.build and setuptools.command.install to use the existing makefile and copy the results directly, or I can somehow let setuptools know about these files and ask it to copy them during installation. The second way is more appealing, but it is what gives me the most headache. I have tried the following options without success:
package_data, and include_package_data does not work because *.so files are not under version control and they are not inside of any package.
data_files does not seem to work, since the files only get included when running python setup.py sdist, but are ignored when running python setup.py install. This is the opposite of what I want: the .so files should not be included in the source distribution, but should get copied during the installation step.
MANIFEST.in failed for the same reason as data_files.
eager_resources does not work either, but honestly I do not know the difference between eager_resources and data_files or MANIFEST.in.
I think this is actually a common situation, and I hope there is a simple solution to it. Any help would be greatly appreciated.
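For concreteness, the first option I have in mind would look roughly like the sketch below. The make invocation and the copied paths are placeholders based on the layout above, not tested code:

import subprocess
from setuptools import setup
from setuptools.command.build_py import build_py
from setuptools.command.install import install

class BuildWithMake(build_py):
    """Run the existing makefile before the normal Python build steps."""
    def run(self):
        subprocess.check_call(["make"])  # placeholder: the real targets live in the makefile
        build_py.run(self)

class InstallPrebuilt(install):
    """Copy the prebuilt artifacts next to the installed modules."""
    def run(self):
        install.run(self)
        for path in ("lib/libMyAwesomeLib.so",
                     "lib/_PyMyAwesomeLib.so",
                     "swig/PyMyAwesomeLib.py"):
            self.copy_file(path, self.install_lib)

setup(
    name="PyMyAwesomeLib",
    version="0.1",  # illustrative metadata
    cmdclass={"build_py": BuildWithMake, "install": InstallPrebuilt},
)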

Porting the logic embedded in makefiles would be redundant and completely unnecessary work.
Unfortunately, that's exactly what I had to do. I've been struggling with this same issue for a while now.
Porting it over actually wasn't too bad. distutils does understand SWIG extensions, but this was implemented rather haphazardly on their part. Running SWIG creates Python files, and the current build order assumes that all Python files have been accounted for before running build_ext. That one wasn't too hard to fix, but it's annoying that they claim to support SWIG without mentioning this. distutils does attempt to be cross-platform when compiling things, so there is still an advantage to using it.
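To illustrate, here is a stripped-down sketch of such a port, adapted to the layout in the question (the names, paths and flags are assumptions): the .i interface file is declared as an extension source, and build_py is overridden so that build_ext (and therefore SWIG) runs first and the generated PyMyAwesomeLib.py gets picked up:

from setuptools import setup, Extension
from setuptools.command.build_py import build_py

class build_py_after_swig(build_py):
    """Run build_ext first so the SWIG-generated .py module exists
    before build_py collects Python sources."""
    def run(self):
        self.run_command("build_ext")
        build_py.run(self)

setup(
    name="PyMyAwesomeLib",
    version="0.1",  # illustrative metadata
    ext_modules=[
        Extension(
            "_PyMyAwesomeLib",
            sources=["swig/PyMyAwesomeLib.i"],  # distutils runs SWIG on .i sources
            swig_opts=["-c++"],
            libraries=["MyAwesomeLib"],
            library_dirs=["lib"],
        )
    ],
    # The SWIG-generated PyMyAwesomeLib.py lands next to the .i file.
    package_dir={"": "swig"},
    py_modules=["PyMyAwesomeLib"],
    cmdclass={"build_py": build_py_after_swig},
)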
If you don't want to port your entire build system over, use the system's package manager. Many complex libraries do this (but they also try their best with setup.py). For example, to get numpy and lxml on Ubuntu you'd just do:
sudo apt-get install python-numpy python-lxml. No pip.
I realize you'd rather write one setup file instead of dealing with every package manager ever so this is probably not very helpful.
If you do try to go the setuptools route there is one fatal flaw I ran into: dependencies.
For instance, if you are distributing a SWIG-based project, it's going to need libpython and the Python development headers. If the user doesn't have them, an error like this happens:
#include <Python.h>
error: File not found
That's pretty unhelpful to the average user.
Even worse, if you require a shared library but the user's library is out of date, the user can get some crazy errors. You're at the mercy of their C++ compiler to output Google-friendly error messages so they can figure it out.
The long-term solution would be for setuptools/distutils to get better at detecting non-Python libraries, hopefully becoming as good as Ruby's gem. I pretty much had to roll my own. For instance, in this setup.py I'm working on you can see a few functions at the top that I hacked together for dependency detection (it still doesn't work on all systems... definitely not Windows).
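To give a flavour of what rolling your own looks like, here is a heavily simplified sketch of that kind of dependency detection (purely illustrative; the real checks in that setup.py are more involved and still not portable everywhere):

import os
import sys
import sysconfig
from ctypes.util import find_library

def have_python_headers():
    """Check that Python.h exists before attempting to compile anything."""
    include_dir = sysconfig.get_paths()["include"]
    return os.path.exists(os.path.join(include_dir, "Python.h"))

def have_shared_library(name):
    """Check that a shared library (e.g. 'stdc++') can be located."""
    return find_library(name) is not None

if not have_python_headers():
    sys.exit("Python development headers not found; install python-dev / python-devel first.")
if not have_shared_library("stdc++"):
    sys.exit("libstdc++ not found; a C++ runtime is required.")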

Related

Manage 3rd party libraries installation C++

I am wondering how to intelligently manage the building and installation of some of our 3rd party C++ dependencies on Linux (Ubuntu). The way I currently have it set up is a git-lfs repository with all the necessary compressed 3rd party sources. I then use a shell script I wrote to install all the necessary system dependencies, and then unzip and build the desired library. This shell script also takes care of setting up all the paths so that our source code can easily link to the 3rd party libraries.
Example commands for our script are ./install opencv or ./install everything
However, over the months the script has gotten quite large, and it sometimes breaks when certain libraries are already installed or because of other minor issues. I would therefore like to replace it with something a bit more intelligent and useful. I have been looking into writing some kind of Python script, but just changing the language from shell to Python is not that big of an advantage, so I am looking for specific Python libraries that can help me manage these dependencies.
I have looked into things like Chef and other build-automation tools, but they are overkill for the small project I am working on.
I was wondering what other people use for this kind of 3rd party management, as sadly C++ does not have anything like pip.
I use jhbuild for this kind of thing (if I understand what you are doing correctly). It came out of the GNOME project (they use it to build the whole desktop from source), but it's easy to customize for any set of projects. The jhbuild packaged in recent Ubuntus works fine.
You write a little XML to describe each project: where to download the sources, what patches to apply, what configure flags to use, what projects it depends on, and so on; then when you enter jhbuild build mything it works out what to build and in what order and gets on with it. It's reasonably smart about changes, so if you edit a source file in one of the projects that makes up your stack, it'll only rebuild the affected parts.
For example, I have this for fftw3, the excellent fast Fourier transform library:
<autotools id="fftw3"
autogen-sh="configure"
autogenargs="--disable-static --enable-shared --disable-threads"
>
<branch
repo="fftw"
module="fftw-3.3.4.tar.gz"
/>
<dependencies>
<dep package="libiconv"/>
</dependencies>
</autotools>
With probably obvious meanings. That's building from a release tarball, but it can build from git as well. It's happy with cmake projects. jhbuild is written in Python, so it's simple to customize. Thanks to GNOME, many common libraries are included.
I actually use it to build Windows binaries. You can set it up to build everything with a cross-compiler, then put it inside Docker. It makes it a one-liner for anyone to be able to build a large and complex application on (almost) any platform. You can use it to do automated nightly builds as well, of course.
There are probably better things around, but it has worked well for me.

Python wheel force ABI to "none"

I would think this is an easy question, but I haven't found an answer yet, so I'm posting here.
I have a Python 3 app that I package into a platform wheel. I have the setup.py and everything works as expected. The only thing I can't figure out is that the generated wheel always includes an ABI tag (like "cp34m"), and when that's included I find that I can't actually install the wheel via pip. (My build script installs the latest pip, setuptools, and wheel before running.)
The solution is simple. I just change the file name of the wheel to change "cp34m" to "none". This is obviously easy to add into my build script, but I wonder if it's possible to set an option for bdist_wheel or something so that the .whl file that's generated has none set on its own?
The command I use to create the wheel is (for example on x64):
python setup.py bdist_wheel --plat-name win_amd64
That creates a wheel like:
mpf_mc-0.30.0.dev269-cp34-cp34m-win_amd64.whl
Which I then rename before uploading to PyPI to:
mpf_mc-0.30.0.dev269-cp34-none-win_amd64.whl
Everything seems to work fine by manually renaming it. But I wonder if this the right way to do it, or am I missing something?
This is difficult to answer without seeing your setup.py file, but I will give my thoughts:
The naming of the wheel suffix is a combination of the Python tag, the ABI tag, and the platform tag, as shown here. But it sounds like you already knew that.
This link explaining it all is less official, but clearer, and here is the definition of ABI.
In my understanding, the ABI (application binary interface) is the binary-level contract between a compiled extension and the interpreter it was built against.
For the purposes of Python wheels, the ABI tag is often the same as the Python tag, with some additional info like the m you have in your case, which indicates a CPython build with pymalloc enabled.
Reason Your Solution Works
So, I would guess the reason things do not work when you use the original name, but do work once you change the name, is one of the following:
pip will not let you install wheels carrying the m part of the ABI tag, probably because it cannot confirm your interpreter matches that ABI rather than because of any real problem with the wheel, and/or
you are not actually using anything that depends on that specific ABI.
Reason The ABI Tag Is Being Added
Now, if we assume that there is no real need for the m tag, as I said, then the reason it is probably being added is that you have a dependency in your setup file that needs it.
I am primarily a data scientist, so I will use a common example from there: numpy is compiled down to C code and has many Python calls to that lower level language to boost speed. Even if I write pure Python 3 code, and do not even use numpy anywhere, if I have a dependency in my setup file that asks for numpy, then I will have a platform-specific build. However, more common is that I am in fact using numpy, but I may not be using any aspects of it that rely upon C. (in numpy, that is rare, as all array initializations happen in C, but I hope you get the idea.) I do not know what forces ABI-specific builds, but I would guess you have some dependency asking for it, but which is never really used.
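If the compiled parts of the wheel really do not rely on a specific CPython ABI, one way to avoid renaming the file by hand is to override the tag from within setup.py. A minimal sketch, assuming the wheel package's bdist_wheel command and its get_tag hook are available (treat this as a sketch rather than an officially documented option):

from setuptools import setup
from wheel.bdist_wheel import bdist_wheel

class bdist_wheel_abi_none(bdist_wheel):
    """Tag the wheel with abi 'none' instead of e.g. 'cp34m'."""
    def get_tag(self):
        python, abi, plat = bdist_wheel.get_tag(self)
        return python, "none", plat

setup(
    name="mpf_mc",  # illustrative; keep your existing setup() arguments
    version="0.30.0.dev269",
    cmdclass={"bdist_wheel": bdist_wheel_abi_none},
)

With that cmdclass in place, python setup.py bdist_wheel --plat-name win_amd64 should produce the -none- ABI tag directly.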
It seems like something has changed since you posted your question - I just tried your command to create a wheel file on my own project:
~$ python setup.py bdist_wheel --plat-name win_amd64
and the resulting file is:
my_project-1.0.0-py2-none-win_amd64.whl

Install a CMake macro script from within a python package install script (using setup.py)

So I have a Python package – it’s all set up on PyPI, and on GitHub, no problem. This is something I’m relatively familiar with.
What is unknown to me is: the notion of installing a CMake script as part of the python package install process. The python package in question is a development tool – you use it to preprocess some of your C/C++/Obj-C/Obj-C++ source files and generate some predefined macros in a header – and it works well when it’s wrapped in a CMake macro (for example like so) and executed as part of a proper chain of dependencies.
For one, I am not sure how to approach this, as there seem to be significant differences between the setuptools sandbox stance and distutils’ willing systems-level installer integration – and then even if I did know how to go about setting things up correctly in setup.py, I can’t find a good precedent on where a CMake script pertaining to a Python package might live.
All thoughts and insights on the matter are welcome.
It took me a while to understand your question. If I understand correctly, what you are trying to do is provide IodSymbolize.cmake in the standard installation location of CMake, so that other users/projects who rely on your software (symbolizer) can use it in their build process. I think you are thinking in a good direction, trying to provide services for end users of your package. Good question!
Here is my understanding of how things work in the cmake world.
Say I am an end user who wants to use the "symbolizer" executable. What I would do is call
find_package(symbolizer). This would try to figure out the location of the executable and set certain variables which can be used in the build process.
You need to provide a Findsymbolizer.cmake file.
Please take a look at : http://www.cmake.org/Wiki/CMake:How_To_Find_Libraries
Also look at the Find*.cmake files provided in the /usr/share/cmake/Modules directory if you are on a Unix/Linux platform.
Once the Findsymbolizer.cmake file is working properly, send it to the cmake mailing list for review. Once accepted it can be packaged in the next release of cmake. Then your module is usable with cmake. Hope I answered your question. Please update if you need more info.
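If you also want setup.py itself to ship the .cmake file (rather than, or in addition to, upstreaming it to CMake), one hedged option is data_files; the destination directory below is just an illustrative choice, since there is no single blessed location for third-party CMake modules:

from setuptools import setup

setup(
    name="symbolizer",
    version="0.1",  # illustrative metadata
    # Install the CMake module under <prefix>/share/cmake/Modules;
    # the source path "cmake/IodSymbolize.cmake" is an assumed repository layout.
    data_files=[("share/cmake/Modules", ["cmake/IodSymbolize.cmake"])],
)

Consumers would then add that directory to CMAKE_MODULE_PATH before calling include() or find_package().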

distutils setup script under linux - permission issue

So I created a setup.py script for my Python program with distutils, and I think it behaves a bit strangely. First off, it installs all data_files into /usr/local/my_directory by default, which is a bit weird since this isn't really a common place to store data, is it?
I changed the path to /usr/share/my_directory/. But now I'm not able to write to the database inside that directory, and I can't set the required permissions from within setup.py either, since the actual database file has not been created when the setup script runs.
Is my approach wrong? Should I use another tool for distributing?
Because at least for Linux, writing a simple setup sh script seems easier to me at the moment.
The immediate solution is to invoke setup.py with --prefix=/the/path/you/want.
A better approach would be to include the data as package_data. This way it will be installed alongside your Python package, and you'll find it much easier to manage (finding paths etc.).
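A minimal sketch of the package_data approach (the package and file names are placeholders):

from setuptools import setup, find_packages

setup(
    name="my_program",  # placeholder name
    version="0.1",
    packages=find_packages(),
    # Ship the database template inside the package rather than via data_files,
    # so it ends up next to the code and is easy to locate at runtime.
    package_data={"my_program": ["data/my_database.db"]},
)

At runtime you can locate the file relative to the package (e.g. via os.path.dirname(__file__)) and, if it needs to be writable, copy it to a user-writable location first, since site-packages and /usr/share are usually not writable by ordinary users.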

Preparing a complex python project for submission to launchpad

I'm trying to wrap my head around the whole PPA thing and it seems to be as unnecessarily difficult as everybody is making it out to be. Let's take a project like http://docs.bokeh.org/ which has a node.js dependency and make a .deb out of it. Following this guide, and various posts here, I tried to use stdeb to do it:
pypi-download bokeh
tar xfz bokeh-0.7.0.tar.gz
cd bokeh-0.7.0/bokehjs/
npm install
grunt build
cd ..
python3 setup.py --command-packages=stdeb.command sdist_dsc
The end of the output is
dh clean --with python3 --buildsystem=python_distutils
dh_testdir -O--buildsystem=python_distutils
debian/rules override_dh_auto_clean
make[1]: Entering directory `/home/emre/Desktop/bokeh-0.7.0/deb_dist/bokeh-0.7.0'
python3 setup.py clean -a
/home/emre/Desktop/bokeh-0.7.0/deb_dist/bokeh-0.7.0/bokehjs
ERROR: Cannot install BokehJS: files missing in `./bokehjs/build`.
Please build BokehJS by running setup.py with the `--build_js` option.
Dev Guide: http://docs.bokeh.org/docs/dev_guide.html.
I just did that! Am I missing something? Is this building even necessary for something that's straight off pypi? The guides gloss over these things.
Making good debs can be complicated, yes, especially when you are not the upstream author and aren't sure exactly what their intentions were for installations of their software. The complication is necessary because well-behaved debs must conform to a fairly long list of policies and requirements so that users know what to expect from them in many different situations and cases. Source for debs needs to contain enough information that it can be built by automated systems (including installing any necessary build dependencies). Binary (built) debs must put their files in the right places on the system and not break any other packages and be able to clean up after themselves fully on uninstall. Debs should be installable without a user watching on an interactive terminal. Debs must declare all of their dependencies, and necessary versions of those dependencies, except for a few packages considered "required". Debs should not download anything from the internet during build or install. And so on, and so on. This strictness and the degree to which the community adheres to it is actually one of the most important benefits of running a Debian-based distribution.
Python source distributions such as those you find on PyPI, on the other hand, can pretty much do whatever they want. There are emerging best-practices for build and install commands with setup.py, but they're not always followed, and even when they are, there is still a lot of room for interpretation and variance. Some, such as the one you reference here, might arbitrarily require the user to call setup.py with a different nonstandard option before building normally. Some go ahead and download their own dependencies and put them wherever they want. Most packages beyond the trivial don't know how to uninstall themselves.
Both approaches are fine and are better in different contexts. But hopefully you can see now why it's not possible in the general case to make arbitrary Python source distributions automatically into working debs. There is just too much that the computer has to assume about how the Python will behave.
Having said all that, if you don't care about conforming to Ubuntu/Debian policy and you just want to be able to put something in a personal repository, the easiest path for you might be to change the Python source so that it does its --build_js thing automatically as necessary, rather than complaining and asking the user to do it.
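The general shape of that change is sketched below; it is only an illustration, since bokeh's real setup.py hooks and the exact JS build invocation may differ:

import os
import subprocess
from distutils.command.build import build as _build

class build(_build):
    """Build BokehJS automatically instead of asking the user to pass --build_js."""
    def run(self):
        if not os.path.isdir(os.path.join("bokehjs", "build")):
            # Hypothetical invocation; the real project may drive npm/grunt differently.
            subprocess.check_call(["grunt", "build"], cwd="bokehjs")
        _build.run(self)

Registered with cmdclass={"build": build} in setup(), this lets sdist_dsc and the Debian build run the JS build on their own whenever bokehjs/build is missing.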
