Create a binary Python package that depends on system-wide installed libssl.so

I am trying to create a binary Python package that can be installed without compilation. The package consists of only one extension module, written in C using the Python API. The extension module depends on the stable Python ABI by defining Py_LIMITED_API as 0x03060000 (3.6). To my knowledge, this means the extension module works on all CPython versions not older than 3.6. I managed to create the sdist package, and I explored the dumb, egg and wheel formats. I managed to create binary packages in each, but none of them is perfect for my use case.
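For illustration, a minimal setup.py along these lines builds such an abi3 extension (the package, module and file names here are placeholders):

    from setuptools import setup, Extension

    setup(
        name="mypackage",  # placeholder
        version="1.0",
        ext_modules=[
            Extension(
                "mypackage._ext",  # placeholder module name
                sources=["src/ext.c"],  # placeholder source file
                # restrict the module to the stable ABI of Python 3.6+
                define_macros=[("Py_LIMITED_API", "0x03060000")],
                py_limited_api=True,  # tag the built extension as abi3
                libraries=["ssl"],  # link against the system libssl.so
            )
        ],
    )

Building with python setup.py bdist_wheel --py-limited-api cp36 then yields a single *-cp36-abi3-*.whl that should load on any CPython >= 3.6.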
The problem is that the extension module depends on libssl.so, and this is its only "external" dependency. Because the Python package itself is very stable, it doesn't require frequent releases. Therefore I would rather not bundle libssl.so and take on the burden of releasing new versions for every OpenSSL security update (not to mention educating users to update their Python package regularly). I think this rules out the wheel format: linux_*.whl packages cannot be uploaded to PyPI, and packages using the manylinux2014_* tag have to bundle libssl.so (and its dependencies). Dumb packages are not suitable for PyPI-based distribution, and the egg format is not supported by pip, so those are out as well.
Because of the stable ABI and the widespread availability of libssl.so, I think it should be possible to release a single binary package for most Linux distributions and multiple Python versions (similarly to the manylinux tags with wheel). Of course the package would require libssl to be installed on the machine, but that is something I can accept in exchange for better security. And that's where I am stuck.
My question is: how can I create a binary Python package which
contains a Python extension module,
depends on the system-wide installed libssl.so,
and can be uploaded to PyPI and installed via pip?
I tried to explore other possibilities, but I couldn't find anything else, so if you have any tips for other formats to look into, I would appreciate that as well.

Related

Does Python have a dependency on C compilers like GCC on Linux? What OS libraries does Python depend on?

I'm trying to understand which Linux OS libraries are needed to effectively run Python 3.9 and make installed pip packages work. Is there a requirement for GCC to be installed for pip modules with C extension modules to run? What system libraries does Python's interpreter (CPython) depend on?
I'm trying to understand which Linux OS libraries are needed to effectively run Python 3.9 and make installed pip packages work.
Your questions may have pretty broad answers and depend on a bunch of input factors you haven't mentioned.
Is there a requirement for GCC to be installed for pip modules with C extension modules to run?
It depends on how the package is built and shipped. If it is available only as a source distribution (sdist), then yes: a compiler is obviously needed to take the .c files and produce a loadable binary extension (ELF or DLL). Some packages ship binary distributions, where the publisher does the compilation for you. Of course this is more of a burden on the publisher, as they must support many possible target machines.
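If you want to rule out compilation at install time entirely, you can tell pip to refuse source distributions (somepackage is a placeholder):

    pip install --only-binary :all: somepackage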
What system libraries does Python's interpreter depend on?
It depends on a number of things, including which interpreter (there are multiple!) and how it was built and packaged. Even constraining the discussion to CPython (the canonical interpreter), this may vary widely.
The simplest thing to do is whatever your Linux distro has decided for you: just apt install python3 or the equivalent, and don't think too hard about it. Most distros ship dynamically-linked packages; these will depend on a small number of "common" libraries (e.g. libc, libz). Some distros statically link the Python library into the interpreter -- in other words, the python3 executable will not depend on libpython3.so. Other distros will dynamically link against libpython.
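One way to see how your particular CPython was linked is to query its build configuration; a quick sketch (the config variables below exist on typical Unix builds and may be empty elsewhere):

    import sysconfig

    # 1 if libpython was built as a shared library, 0 if linked statically
    print(sysconfig.get_config_var("Py_ENABLE_SHARED"))
    # name of the Python library, e.g. 'libpython3.9.so.1.0'
    print(sysconfig.get_config_var("LDLIBRARY"))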
What dependencies will external modules (e.g. from PyPI) have? Well that completely depends on the package in question!
Hopefully this helps you understand the limitations of your question. If you need more specific answers, you'll need to either do your own research, or provide a more specific question.
Python depends on compilers and a lot of other tools if you're going to compile the source (from the repository). This is from the official documentation, telling you what you need to compile it from source; check it out:
1.4. Install dependencies
This section explains how to install additional extensions (e.g. zlib) on Linux and macOS/OS X. On Windows, extensions are already included and built automatically.
1.4.1. Linux
For UNIX based systems, we try to use system libraries whenever available. This means optional components will only build if the relevant system headers are available. The best way to obtain the appropriate headers will vary by distribution, but the appropriate commands for some popular distributions are below.
However, if you just want to run Python programs, all you need is the python binary (and the libraries your script wants to use). The binary is usually at /usr/bin/python3 or /usr/bin/python3.9.
Python GitHub Repository
For individual packages, it depends on the package.
Further reading:
What is PIP?
Official: Managing application dependencies

How to structure and distribute a Pybind11 extension with stubs?

I'm trying to create and distribute (with pip) a Python package that has Python code, and C++ code compiled to a .pyd file with Pybind11 (using Visual Studio 2019). I also want to include .pyi stub files for VS Code and other editors. I can't find much documentation on doing this correctly.
I'd like to be able to just install the package via pip as normal and write from mymodule.mysubmodule import myfunc etc. like a normal Python package, including autocompletion, type annotations, VS Code IntelliSense etc. using the .pyi files I'd write.
My C++ code is in multiple .cpp and header files. It uses a few standard libraries and a few external libraries (such as Boost). It defines a single module and two submodules. I want to be able to distribute this on Windows and Linux, for both x86 and x64. I am currently targeting Python 3.9 and the C++17 standard.
How should I structure and distribute this package? Do I include the C++ source files and create a setup.py similar to the Pybind11 example? If so, how do I include the external libraries? And how do I structure the .pyi stub files? Does this mean whoever tries to install my package would need a C++ compiler as well?
Or should I compile my C++ to a .pyd/.so file for each platform and architecture? If so, is there a way to specify which one gets installed through pip? And again, how would I structure the .pyi stubs?
Generating .pyi stubs
The pybind11 issue mentions a couple of tools (1, 2) to generate stubs for binary modules. There could be more, but I'm not aware of others. Unfortunately, both are far from perfect, so you will probably still need to check and adjust the generated stubs manually.
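Whichever tool you use, the end result is an ordinary .pyi file. A hand-adjusted stub for the function from the question might look as simple as this (the signature is illustrative):

    # mymodule/mysubmodule.pyi
    def myfunc(x: int, y: int) -> int: ...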
Distribution of .pyi stubs
After correcting the stubs, you just include those .pyi files in your distribution (e.g. in the wheel or as sources) along with a py.typed marker file, or, alternatively, distribute them separately as a standalone package (e.g. mypackage-stubs).
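As a sketch, with setuptools the stubs and the marker can be shipped inside the wheel like this (assuming the top-level package is named mymodule):

    from setuptools import setup

    setup(
        name="mymodule",
        version="1.0",
        packages=["mymodule"],
        # include the PEP 561 marker and all stub files in the distribution
        package_data={"mymodule": ["py.typed", "*.pyi"]},
    )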
Building wheels
Wheels allow users of your library to install it in binary form, i.e. without compilation. Wheel builds use older compilers in order to be compatible with a greater number of systems/platforms, so you might face some trouble with a C++17 library. (C++11 is old enough and should pose no problems for wheels.)
Building wheels for various platforms is tedious; pybind11's python_example uses the cibuildwheel package to do that, and I would recommend this route if you are already using CI.
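For a taste, cibuildwheel can be driven by environment variables; a sketch of building only CPython 3.9 Linux wheels locally (requires Docker):

    pip install cibuildwheel
    CIBW_BUILD="cp39-*" cibuildwheel --platform linux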
If a wheel is missing for the target platform, pip will attempt to install from source. This requires a compiler, and the third-party libraries you are using must already be installed.
Maybe conda?
If your setup is complex and requires a number of third-party libraries, it might be worth writing a conda recipe and using conda-forge to generate binary versions of the package. Conda is superior to pip here, since it can manage non-Python dependencies as well.

Why do we need Python packaging (e.g. egg)? [duplicate]

This question already has answers here: What is a Python egg?
When I need a Python library, I use pip to fetch it from PyPI, and when I create a project I want to share, I just need a setup.py file in place to make it easily installable. So I was wondering: what is the use case for egg or wheel packages?
The Python Packaging User Guide has the following to say on this topic:
Wheel and Egg are both packaging formats that aim to support the use case of needing an install artifact that doesn’t require building or compilation, which can be costly in testing and production workflows.
These formats can be used to distribute packages that contain binary extension modules. These would otherwise require compilation during installation.
If no compilation is involved, a source distribution is in principle sufficient, but the user guide still recommends creating a wheel for performance reasons:
Minimally, you should create a Source Distribution:
python setup.py sdist
A “source distribution” is unbuilt (i.e., it's not a Built Distribution), and requires a build step when installed by pip. Even if the distribution is pure Python (i.e. contains no extensions), it still involves a build step to build out the installation metadata from setup.py.
[...]
You should also create a wheel for your project. A wheel is a built package that can be installed without needing to go through the “build” process. Installing wheels is substantially faster for the end user than installing from a source distribution.
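The corresponding command mirrors the sdist one (it needs the wheel package installed, e.g. via pip install wheel):

    python setup.py bdist_wheel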
In short, built packages are a convenience thing, mostly for the user.
Wheel packages unify the process of distributing and installing projects that contain pure python, platform dependent code, or compiled extensions. The user does not need to worry if the package is written in Python or in C - it just works.
Egg packages are an older standard; you should ignore them nowadays. Use pip install . instead of ./setup.py install to avoid creating them. (Addendum: they are also .zips in disguise, from which Python reads package data, which is not exactly the most performant solution.)
Wheel packages, on the other hand, are the new standard. They allow for the creation of portable binary packages for Windows, macOS, and Linux (yes, Linux!). Nowadays you can just do pip install PyQt5 (as an example) and it will just work: no C++ compiler and no Qt libraries are required on the system. Everything is pre-compiled and included in the wheel. Non-binary packages also benefit, because it's safer not to run setup.py (all the metadata is in the wheel). (Addendum: those are also .zips, but they are unpacked when installed.)

Allowing Python to use modules from another Python installation

I normally use Python 2.7.3, traditionally installed in /usr/local/bin, but I needed to rebuild Python 2.6.6 (which I did without using virtualenv) in another directory, ~/usr/local/, and rebuild numpy, scipy and all the libraries for which I needed different versions from what I had for Python 2.7.3.
But for all the other packages that I want exactly as they are in my default installation (meaning the same version), I don't know how to just use them with Python 2.6.6 without having to download the tarballs and build and install them using --prefix=/home/myself/usr/local/bin.
Is there a faster or simpler way of "re-using" those packages in my "local" Python 2.6.6?
Reinstall them. It may seem like a no-brainer to reuse modules (and in a lot of cases you can), but for modules that contain compiled code, this can be an utter nightmare for long-term systems administration.
Consider supporting multiple versions of Python across multiple versions and architectures of Linux. Some modules will reference libraries in /usr/local/lib, but those libraries can be the wrong architecture or the wrong version.
You're better off making a requirements.txt file and using pip to install the packages from source.
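For example, you can capture the package set from the existing installation and replay it under the other interpreter (which pip belongs to which Python is up to your PATH):

    # with the Python 2.7.3 installation active:
    pip freeze > requirements.txt
    # then, using the pip of the 2.6.6 installation:
    pip install -r requirements.txt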

How to share platform-dependent Python packages using source control?

Currently we keep project-related Python packages in a Subversion directory, so when someone adds or removes one, it is directly available to the others.
This method works well for Python packages that are not platform dependent.
However, I do have quite a few packages that are platform dependent, and worse, when you install them using easy_install they need a compiler to produce the .egg file.
I should mention that the package maintainers do not provide binaries for these modules, so I have to compile them manually. I've tried adding the .egg file to the shared directory, but Python doesn't pick it up by default.
Since only a few people on the entire team have compilers, how can we share the packages in an easy way?
To make the problem even more complex: even though 99% of the code runs on the same platform (Windows) with the same version of Python (2.5), we still have a few scripts that have to be executed on another platform (Linux).
