How to extend a python package with binary executables?

My package is written almost entirely in python. However, some functionality is based on executables that are run from python using subprocess. If I set up the package locally, I first need to compile the corresponding C++ project (managed by CMake) and ensure that the resulting binary executables are created in the bin folder. My python scripts can then call these utilities.
My project's folder structure resembles the following one:
root_dir
- bin
  - binary_tool1
  - binary_tool2
- cpp
  - CMakeLists.txt
  - tool1.cpp
  - tool2.cpp
- pkg_name
  - __init__.py
  - module1.py
  - module2.py
  - ...
- LICENSE
- README
- setup.py
I am now considering creating a distributable python package and publishing it via PyPI/pip. I therefore need to include the build step of the C++ project in the packaging procedure.
So far, I create the python package (without the binary "payload") as described in this tutorial. I now wonder how to extend the packaging procedure such that the C++ binary files are distributed along with the package.
Questions:
Is setuptools designed for such a use-case at all?
Is this "mixed" package approach feasible at all or are binary compatibility issues to be expected?
I believe that the canonical approach to extending a pure-python package with C code is to create "binary extensions" (e.g. using distutils, or as described here). In this case, however, the functionality is provided by standalone executables, not by wrappable C/C++ functions. I would like to avoid redesigning the C++ project to create binary extensions.

I found a number of half-answers to this but nothing complete, so here goes.
Quick and easy (single-platform)
I believe you'll need to remove the dash from your package name. The rest of this answer assumes it's been replaced with an underscore.
Starting with your directory structure, create a copy of bin under pkg_name (or move bin there). If you do not, you will end up installing files into both site-packages/pkg_name and site-packages/bin instead of having everything under site-packages/pkg_name.
Your minimal set of files needed for packaging should now be as follows:
- pkg_name/
  - __init__.py
  - module1.py
  - module2.py
  - bin/
    - binary_tool1
    - binary_tool2
- setup.py
To call your binary executable from the code, build its path relative to __file__:
import os
import subprocess

def run_binary_tool1(args):
    # Build an absolute path to the bundled binary from this module's location
    cmd = [os.path.join(os.path.dirname(__file__), 'bin', 'binary_tool1')] + args
    p = subprocess.Popen(cmd, ...)
    ...
In setup.py, reference your binaries in addition to your package folder:
from setuptools import setup

setup(
    name='pkg_name',
    version='0.1.0',
    package_data={
        'pkg_name': ['bin/binary_tool1', 'bin/binary_tool2']
    },
    packages=['pkg_name']
)
Do yourself a favor and create a Makefile:
# Makefile for pkg_name python wheel
# PKG_NAME and VERSION should match what is in setup.py
PKG_NAME=pkg_name
VERSION=0.1.0

# Shouldn't need to change anything below here

# determine the target wheel file
WHEEL_TARGET=dist/${PKG_NAME}-${VERSION}-py2.py3-none-any.whl

# help
help:
	@echo "Usage: make <setup|build|install|uninstall|clean>"

# install packaging utilities (only run this once)
setup:
	pip install wheel setuptools

# build the wheel
build: ${WHEEL_TARGET}

# install to local python environment
install: ${WHEEL_TARGET}
	pip install ${WHEEL_TARGET}

# uninstall from local python environment
uninstall:
	pip uninstall ${PKG_NAME}

# remove all build artifacts
clean:
	@rm -rf build dist ${PKG_NAME}.egg-info
	@find . -name __pycache__ -exec rm -rf {} \; 2>/dev/null

# build the wheel
${WHEEL_TARGET}: setup.py ${PKG_NAME}/__init__.py ${PKG_NAME}/module1.py ${PKG_NAME}/module2.py ${PKG_NAME}/bin/binary_tool1 ${PKG_NAME}/bin/binary_tool2
	python setup.py bdist_wheel --universal
Now you're ready to roll:
make setup # only run once if needed
make install # runs `make build` first
## optional:
# make uninstall
# make clean
and in python:
import pkg_name
pkg_name.run_binary_tool1(...)
...
Multi-platform
You'll almost certainly want to provide more info in your setup() call, so I won't go into detail on that here. More importantly, the above creates a wheel that purports to be universal but really is not. This might be sufficient for your needs if you are sure you will only be distributing on a single platform and you don't mind this mismatch, but would not be suitable for broader distribution.
For multi-platform distribution, you could go the obvious route and create platform-specific wheels (dropping the --universal flag in the above Makefile command, adjusting the wheel filename, etc.).
Alternatively, if you can compile a binary for every platform, you could package all of the binaries for all platforms in your one universal wheel, and let your python code figure out which binary to call (for example, by checking sys.platform and/or other available variables to determine the platform details).
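For instance, a minimal dispatch sketch along those lines (the per-platform suffix scheme and the helper name are hypothetical, purely for illustration):
import os
import sys

def _binary_path(tool_name):
    # Pick the bundled binary matching the running platform;
    # the naming convention below is an assumption, not part of the question.
    if sys.platform.startswith('linux'):
        suffix = 'linux'
    elif sys.platform == 'darwin':
        suffix = 'macos'
    elif sys.platform in ('win32', 'cygwin'):
        suffix = 'windows.exe'
    else:
        raise RuntimeError('unsupported platform: ' + sys.platform)
    return os.path.join(os.path.dirname(__file__), 'bin', tool_name + '_' + suffix)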
The advantages of this alternative approach are that the packaging process is still easy, the platform-dynamic code is some simple python, and you can easily reuse the same binary on multiple platforms provided that it actually works on those platforms. Now, I would not be surprised if the "all-binaries" approach is frowned on by at least some if not many, but hey, python users often say that developer time is king, so this approach has that argument in its favor, as does the overall idea of packaging a binary executable instead of going through all the brain damage of creating a python/C wrapper interface.

Related

How to force a platform wheel using build and pyproject.toml?

I am trying to force a Python3 non-universal wheel I'm building to be a platform wheel, despite not having any native build steps that happen during the distribution-packaging process.
The wheel will include an OS-specific shared library, but that library is built and copied into my package directory by a larger build system that my package knows nothing about. By the time my Python3 package is ready to be built into a wheel, my build system has already built the native shared library and copied it into the package directory.
This SO post details a solution that works for the now-deprecated setup.py approach, but I'm unsure how to accomplish the same result using the new and now-standard build / pyproject.toml system:
mypackage/
  mypackage.py     # Uses platform.system and importlib to load the local OS-specific library
  pyproject.toml
  mysharedlib.so   # Or .dylib on macOS, or .dll on Windows
Based on the host OS performing the build, I would like the resulting wheel to be manylinux, macos, or windows.
I build with python3 -m build --wheel, and that always emits mypackage-0.1-py3-none-any.whl.
What do I have to change to force the build to emit a platform wheel?
OK, after some research and reading of code, I can present a bit of information and a few solutions that might meet other people's needs, summarized here:
Firstly, pyproject.toml is not mutually exclusive with setup.py. setuptools will complain about deprecation if you create a distribution package via python3 setup.py ... and no pyproject.toml file is present.
However, setup.py is still around and available, but it's a mistake to duplicate project configuration values (name, version, etc.). So, put as much as your package will allow inside your pyproject.toml file, and use setup.py only for things like overriding the Distribution class or the bdist_wheel command class.
As far as creating platform wheels, there are a few approaches that work, with pros and cons:
1. Override the bdist_wheel command class in setup.py as described here and set self.root_is_pure to False in the finalize_options override. This forces the python tag (e.g. cp39) to be set, along with the platform tag.
2. Override the Distribution class in setup.py as described here and override has_ext_modules() to simply return True (see the sketch after this list). This also forces the python and platform tags to be set.
3. Add an unused minimal extension module to your packaging definition, as described here and here. This lengthens the build process and adds a useless "dummy" shared library to be dragged along wherever your wheel goes.
4. Add the argument -C=--build-option=--plat {your-platform-tag} to the build invocation (for my case that's python -m build -w -n, for example). This leaves the python tag untouched, but you have to supply your own platform tag; there's no way to say "use whatever the native platform is". You can discover the exact platform tag with the call wheel.bdist_wheel.get_platform(pathlib.Path('.')) after importing the pathlib and wheel.bdist_wheel packages, but that can be cumbersome because wheel isn't a standard-library package.
5. Simply rename your wheel from mypkg-py3-none-any.whl to mypkg-py3-none-macosx_13_0_x86_64.whl: it appears that the platform tag is only encoded into the filename, and not into any of the package metadata generated during the distribution-packaging process.
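For illustration, a minimal setup.py for option #2 might look like the following sketch (the class name is arbitrary; all project metadata is assumed to live in pyproject.toml):
# setup.py kept only to force a platform wheel; metadata lives in pyproject.toml
from setuptools import setup
from setuptools.dist import Distribution

class BinaryDistribution(Distribution):
    # Claiming to contain extension modules makes setuptools/wheel
    # emit python and platform tags instead of py3-none-any.
    def has_ext_modules(self):
        return True

setup(distclass=BinaryDistribution)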
In the end I chose option #4 because it required the least amount of work: no setup.py file needs to be introduced solely to accomplish this, and the build logs make it clear that a platform wheel (not a pure wheel) is being created.

How to use setuptools to create rpm packages for linux

I want to build an RPM package for my software. I am only familiar with the classic way of using the rpmbuild tool on Linux with spec files and a source directory. But I read in the distutils documentation that it can somehow create an RPM package. Setuptools is based on distutils, so I am guessing it also has some procedure to build RPMs.
Although I have never actually used either of the two modules, I always thought that they build their own standalone packages.
I have two questions. First, what is the exact procedure to create an RPM from setuptools? Second, is this way more organized than the rpmbuild utility?
What I have researched so far on the Internet:
Setuptools is mainly used to create a "wheel" package. It is similar to other packages like rpm or deb, except Linux will not directly understand it the way it does an RPM.
You need to pass the bdist_rpm flag during the build process to create an RPM package. (link)
I am quite confused by the concepts of building and distributing a package. I need some explanation of what I am getting wrong about setuptools versus rpm.
You can build an RPM package using bdist directly from setup.py: http://jeromebelleman.gitlab.io/posts/devops/setuppy/
Note that this method is easy but can produce only simple RPM packages. For example, you cannot put requires (or build requires) in the metadata; you have to remember to put them on the command line every time.
I would say that bdist is suitable just for initial work. If you want to ship and support the package, then creating a SPEC file is a must.
One more example: AFAIK you cannot specify %post or %pre scriptlets using bdist and setup.py.
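For reference, a minimal bdist_rpm invocation might look like this (the --requires value is only an illustration):
python setup.py bdist_rpm --requires=python3-requests
The built .rpm and .src.rpm files end up in the dist/ directory.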
Here is an example of python SPEC file: https://fedoraproject.org/wiki/Packaging:Python#Example_common_spec_file

Should PYTHONPATH include ./build/*?

Running
$ python setup.py build_ext
with the usual Cython extension configuration creates a build directory and places the compiled modules deep within it.
How is the Python interpreter supposed to find them now? Should PYTHONPATH include those sub-sub-directories? It seems kludgy to me. Perhaps this is meant to work differently?
You will find the information here: https://docs.python.org/3.5/install/
The build directory is an intermediate result before Python actually installs the module. Put on PYTHONPATH the path to the installed library, for instance:
<dir>/local/lib/python
if you use the "home" installation scheme and <dir> is the directory you have chosen, e.g.
/home/user2
Presumably, when you write a package containing Cython code, your setup.py will contain something similar to this:
from setuptools import setup
from Cython.Build import cythonize

setup(
    ext_modules=cythonize("example.pyx")
)
(there are some variations, but that's the general idea). When you run
python setup.py install
or
python setup.py install --user
you will see that it creates binary files (with an extension depending on your OS; on mine it is example.so) and copies them to the standard installation directory (which also depends on your OS).
These binary files are therefore already in the import path of your Python distribution, and it can import them like regular modules.
Consequently, you do not need to add the build directory to the path. Just install (possibly with --user, or use virtualenv, if you're developing), and let the extensions be imported the regular way.
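As a side note, if you only want to run the code from the source tree during development, you can also build the extensions in place (a standard build_ext option, not specific to Cython):
python setup.py build_ext --inplace
This copies the compiled modules next to their sources, so no PYTHONPATH changes are needed.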

How should I write a simple installer for python package?

I'm writing a simple python package with many helper functions to be used with other projects.
How should I handle packaging and make an installer for said package?
Directory layout
package:
package (/git/dev/package)
|
+-- __init__.py
|
+-- foo.py
|
+-- bar.py
project using said package:
project (/git/dev/project)
|
+-- project.py
How do I make this package available to every local python project (I don't need to distribute it publicly)? Should the installer add the current package location to the path, or use some other way?
Current preferred workflow:
1. checkout package from version control
2. do something so python finds and can use that said package
3. use package in some project
4. edit package (project should use edited package even before I push those changes to repo)
File contents
project.py:
# Doesn't work currently since package isn't added to path
from package import foo
from package import bar
foo.do_stuff()
bar.do_things()
How do I make this package available to every python project?
The standard way to distribute packages is as source distributions on PyPI (or a private pip-compatible repository). The details take up many pages, so I can't explain them all here, but the Python Packaging User Guide has everything you want to know.
The basic idea is that you create a setup.py file that tells Python how to install your program. Look at the official sample linked from the tutorial.
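For a small pure-python package like yours, a minimal setup.py might look like this sketch (name and version are placeholders):
from setuptools import setup, find_packages

setup(
    name='package',
    version='0.1.0',
    packages=find_packages(),
)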
Now, someone can just download and unzip your package and run python setup.py install, and it will install your program for them. Or, better, they can type pip install . from inside the unpacked directory.
But, even better, you can run python setup.py sdist to create a source distribution, test it out, upload it to PyPI, and then users don't have to download anything; they can just pip install mylib.
If you want binary installations for different platforms, setuptools knows how to make both Windows installers, which can be double-clicked and run, and wheel files, which can be installed with pip (in fact, it'll automatically find and use wheels instead of source distributions). If you don't have any compiled C extensions, python setup.py bdist_wheel --universal is enough to build a wheel that works everywhere.
Should the installer add the current package location to the path, or use some other way?
The installer should install the package into the system, user, or current-virtual-environment site-packages. If you use setuptools, pip will take care of this automatically.
If you're not trying to make this public, you can still use setuptools and pip. Just check the source distribution into source control, then you can install the latest version at any time like this:
pip install --upgrade git+https://github.com/mycompany/mylib
This also means you can skip a lot of the PyPI metadata in your setup.py file (e.g., nobody cares about your project's classifiers if it's not going to end up in the PyPI repository).
But you can still take advantage of all of the parts of the Python packaging system that make your life easier, while skipping the parts that aren't relevant.

What is the difference between an 'sdist' .tar.gz distribution and an python egg?

I am a bit confused. There seem to be two different kinds of Python packages: source distributions (setup.py sdist) and egg distributions (setup.py bdist_egg).
Both seem to be just archives with the same data, the python source files. One difference is that pip, the most recommended package manager, is not able to install eggs.
What is the difference between the two and what is 'the' way to do distribute my packages?
(Note: I do not want to distribute my packages through PyPI, but I do want to use a package manager that fetches my dependencies from PyPI.)
setup.py sdist creates a source distribution: it contains setup.py, the source files of your module/script (.py files or .c/.cpp for binary modules), your data files, etc. The result is an archive that can then be used to recompile everything on any platform.
setup.py bdist (and bdist_*) creates a built distribution: it includes .pyc files, .so/.dll/.dylib for binary modules, .exe if using py2exe on Windows, your data files... but no setup.py. The result is an archive that is specific to a platform (for example linux-x86_64) and to a version of Python, and that can be installed simply by extracting it into the root of your filesystem (executables are in /usr/bin (or equivalent), data files in /usr/share, modules in /usr/lib/pythonX.X/site-packages/...). You can even build rpm archives that can be directly installed using your package manager.
2021 update: the tools to build and use eggs no longer exist in Python.
There are many more than two different kinds of Python (distribution) packages. This command lists many subcommands:
$ python setup.py --help-commands
Notice the various different bdist types.
An egg was a new package type, introduced by setuptools but later adopted by the standard library. It is meant to be installed as a monolithic archive onto sys.path. This differs from an sdist package, which is meant to have setup.py install run on it, copying each file into place and perhaps taking other actions as well (building extension modules, running additional arbitrary Python code included in the package).
Eggs are largely obsolete at this point in time. EDIT: eggs are gone; they were used with the easy_install command, which has since been removed.
The favored packaging format now is the "wheel" format, notably used by "pip install".
Whether you create an sdist or an egg (or wheel) is independent of whether you'll be able to declare the package's dependencies (to be downloaded automatically from PyPI at installation time). All that's necessary for this dependency feature to work is for you to declare the dependencies using the extra APIs provided by distribute (the successor of setuptools) or distutils2 (the successor of distutils, otherwise known as packaging in the current development version of Python 3.x).
https://packaging.python.org/ is a good resource for further information about packaging. It covers some of the specifics of declaring dependencies (e.g. install_requires, but not extras_require, afaict).
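For example, dependencies declared with install_requires in setup.py are fetched automatically at installation time (the name, version, and dependency below are placeholders):
from setuptools import setup

setup(
    name='mylib',
    version='0.1.0',
    packages=['mylib'],
    install_requires=['requests>=2.0'],  # resolved from PyPI when the package is installed
)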
