Can I exclude libraries from auditwheel repair?

I'm using manylinux2014_x86_64 to build some precompiled Linux wheels for a Python library that acts as an API to a C++ library involving CUDA. I create the wheels with pip wheel, then run auditwheel repair to include external shared libraries in the wheels (my C++ library, pybind11, etc.).
The problem is that it wants to package CUDA runtime and driver libraries into the wheel. Ideally I'd like to leave the CUDA installation up to the user rather than having to include it in the python wheel (I'm not even sure exactly how redistributable it is).
Is anyone aware of a way to blacklist the CUDA libs from auditwheel repair? Or is there perhaps a better way of doing this?

There is a way, but it kind of defeats the purpose of auditwheel repair.
You need to install auditwheel as a Python module, then import it in your own Python script and monkey-patch some of the values that specify the repair policies:
# Monkey patch to not ship libjvm.so in pypi wheels
import sys

from auditwheel.main import main
from auditwheel.policy import _POLICIES as POLICIES

# libjvm is loaded dynamically; do not include it
for p in POLICIES:
    p['lib_whitelist'].append('libjvm.so')

if __name__ == "__main__":
    sys.exit(main())
This snippet is from the diplib project, which does for Java libraries (libjvm.so) what you want to do for CUDA. You would need to modify the script to whitelist whichever libraries you want to keep out of the wheel.
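For the CUDA case in the question, an adapted version might look like the sketch below. The sonames listed are assumptions (use whichever libraries auditwheel is actually trying to bundle for you), and it relies on the same private _POLICIES structure as the snippet above, which can change between auditwheel versions:
# repair_no_cuda.py -- a hedged sketch adapting the snippet above to CUDA.
# The sonames below are assumptions; list the ones auditwheel reports for you.
import sys

from auditwheel.main import main
from auditwheel.policy import _POLICIES as POLICIES

# Leave the CUDA runtime/driver out of the wheel; the user provides them.
CUDA_LIBS = [
    'libcuda.so.1',
    'libcudart.so.11.0',
    'libcublas.so.11',
]

for p in POLICIES:
    p['lib_whitelist'].extend(CUDA_LIBS)

if __name__ == "__main__":
    sys.exit(main())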
This script then needs to be invoked with a Python 3.x interpreter or it will fail; you can still repair Python 2.7 wheels this way just fine if you need to. The diplib project also shows an example invocation, which has to happen inside a manylinux Docker container:
#!/bin/bash
# Run this in a manylinux2010 docker container with /io mounted to some local directory
# ...
/opt/python/cp37-cp37m/bin/python -m pip install cmake auditwheel # ignore "cmake"
# ...
export AUDITWHEEL=`pwd`/diplib/tools/travis/auditwheel # the monkey patch script
# ...
/opt/python/cp37-cp37m/bin/python $AUDITWHEEL repair pydip/staging/dist/*.whl
# ...

Related

How can I run cython'd shared libraries on Google Cloud Functions?

As the title says, I was wondering whether Google Cloud Functions (where I currently have some pure Python code) supports cython'd modules?
I guess, more specifically, I'm asking how I would use said modules. It's a private project; I'm using Cython via setup.py and cythonize(files), which creates a bunch of shared object modules (example.cpython-38-darwin.so, example1.cpython-38-darwin.so, example2.cpython-38-darwin.so).
Those are all for Mac, so won't work on Firebase.
Is there any way to get Cloud Functions to run the setup.py and compile some files? Or, better yet, is there some way to pre-compile those files for the appropriate OS and just deploy the shared libs?
I know a variety of libraries I'm installing via pip on Cloud Functions use Cython under the hood, but I don't really know the process of creating a wheel or other pip dependency...
You need to specify cython as a build-time dependency for your private project by adding a pyproject.toml file like:
[build-system]
requires = ["setuptools", "wheel", "cython"]
Then, when installing your package with a modern version of pip in the Cloud Functions runtime, cython will be installed into the isolated build environment before your setup.py script is run. (setuptools and wheel have to be listed explicitly, because a requires list in pyproject.toml replaces pip's implicit defaults.)
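A setup.py along these lines can then import Cython at build time (a sketch only: the module names come from the question, while the project name and version are placeholders):
# setup.py -- a minimal sketch; name and version are placeholders.
from setuptools import setup
from Cython.Build import cythonize  # importable because of the build requirement above

setup(
    name='myproject',
    version='0.1.0',
    ext_modules=cythonize(
        ["example.py", "example1.py", "example2.py"],
        compiler_directives={'language_level': 3},
    ),
)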
I appear to have been able to (eventually) solve this... I might have several unnecessary steps, but I think they improve my overall build system (again, with the intention of being able to use cython'd shared libraries on Firebase).
From Docker (or in my case, a Linux VM), in my private repo, I cythonize the important code, and turn everything into a wheel. From here, I run auditwheel show over the wheel, to check if it adheres to the manylinux1 tag (or whichever manylinux I want). In this case, it did adhere to manylinux1 off the bat, so there was no need to repair the wheel or do any shenanigans this time.
... .py # Other irrelevant .py files
magic.py # Source code that needs to be cython'd
setup.py
Simplified setup.py:
from setuptools import setup, find_packages
from Cython.Build import cythonize

setup(
    name='magiclib',
    version='0.1.0',
    packages=find_packages(),
    ext_modules=cythonize(
        "magic.py",
        compiler_directives={'language_level': 3}
    )
)
Running python setup.py bdist_wheel creates a wheel named dist/magiclib-0.1.0-cp37-cp37m-linux_x86_64.whl
From here, I run auditwheel show dist/magiclib-0.1.0-cp37-cp37m-linux_x86_64.whl which shows me that the code already adheres to the manylinux1 tag, but I nonetheless run auditwheel repair dist/magiclib-0.1.0-cp37-cp37m-linux_x86_64.whl which creates wheelhouse/magiclib-0.1.0-cp37-cp37m-manylinux1_x86_64.whl.
At this point, I bring this wheel into my GCF project, and use:
pip install -t magiclib magiclib-0.1.0-cp37-cp37m-manylinux1_x86_64.whl
which basically unzips the wheel into a sub-directory that I can vendor and deploy to Google Cloud and call from my Functions.
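For reference, importing the vendored module from a Function might look roughly like this (a sketch; main.py, handler and do_stuff are hypothetical names, not from the question):
# main.py in the Cloud Functions project -- a sketch with hypothetical names.
import os
import sys

# Make the vendored directory importable; it sits next to main.py.
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "magiclib"))

import magic  # the cython'd extension module unpacked from the wheel

def handler(request):
    # do_stuff() is a hypothetical function defined in magic.py
    return str(magic.do_stuff())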
Works fine on some of my simple code, and I'll be experimenting with some more involved code.

How to import or install pre-built python extension module (C++) (i.e. library not compiled via setuptools)?

I have a C++ project for which I am developing a Python interface. For now I am using pybind11 since it seems pretty neat, and has some nice tools for building the extension module with CMake, which is how the main C++ project is built.
Via CMake I managed to get a shared library containing the interface functions to build; however, now that I have it, I don't know how to tell Python that it exists and make it importable. I don't want to reconfigure the whole build of the project to be launched via Python (i.e. as described here with setuptools) because it is a big project and I am just providing a Python interface to part of it. So it would be preferable if I could just build the shared library for Python along with the rest of the C++ code, and then later run "setup.py install" to do whatever else needs to be done to make the shared library visible to Python.
Is this possible? Or do I need to do some other sort of refactoring, like make the main project build some other pure C++ libraries, which I then just link into the Python extension module library which is built separately via setuptools?
If you need to install just one binary module, you can create a simple installer just for that module. Let's assume that you have a binary module foo.so (or foo.pyd if you are working on Windows) that is already built by your CMake-generated build script. Then you can create a simple setup script:
from setuptools import setup

setup(
    name='foo',
    version='0.1.2.3',
    py_modules=['foo']
)
Then you need to add a MANIFEST.in file to pick up your binary module file:
include foo.so
So you need 3 files:
foo.so
MANIFEST.in
setup.py
Now you can do python setup.py install from your Python virtual environment, and your binary module will be installed into it. If you want to distribute your module, it's better to install the wheel package and create a .whl file with python setup.py bdist_wheel. Such a "wheel" can later be installed with pip. Note that binary modules must be installed on the same platform and Python version that were used to build them.
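One caveat: with only py_modules and MANIFEST.in, bdist_wheel may tag the wheel as pure Python. If you run into that, a variation that wraps the binary module in a small package and forces a platform-specific tag could look like the sketch below (the foo/__init__.py plus foo/foo.so layout, with __init__.py doing something like "from .foo import *", is an assumption, not part of the original answer):
# setup.py -- a sketch for shipping a prebuilt foo.so inside a foo/ package,
# so that bdist_wheel produces a platform-specific (not pure-Python) wheel.
from setuptools import setup
from setuptools.dist import Distribution

class BinaryDistribution(Distribution):
    """Tell setuptools this distribution contains compiled code."""
    def has_ext_modules(self):
        return True

setup(
    name='foo',
    version='0.1.2.3',
    packages=['foo'],
    package_data={'foo': ['*.so']},  # copies the prebuilt .so into the wheel
    distclass=BinaryDistribution,
)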

How should I write a simple installer for python package?

I'm writing a simple python package with many helper functions to be used with other projects.
How should I handle packaging and making an installer for said package?
Directory layout
package:
package (/git/dev/package)
|
+-- __init__.py
|
+-- foo.py
|
+-- bar.py
project using said package:
project (/git/dev/project)
|
+-- project.py
How do I make this package available to every local Python project (I don't need to distribute it publicly)? Should the installer add the current package location to the path, or use some other way?
Current preferred workflow:
1. checkout package from version control
2. do something so Python can find and use said package
3. use package in some project
4. edit package (project should use edited package even before I push those changes to repo)
File contents
project.py:
# Doesn't work currently since package isn't added to path
from package import foo
from package import bar
foo.do_stuff()
bar.do_things()
How do I make this package available to every python project?
The standard way to distribute packages is as source distributions on PyPI (or a private pip-compatible repository). The details take up many pages, so I can't explain them all here, but the Python Packaging User Guide has everything you want to know.
The basic idea is that you create a setup.py file that tells Python how to install your program. Look at the official sample linked from the tutorial.
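As a rough sketch (assuming you restructure slightly so that setup.py sits next to a package/ directory containing __init__.py, foo.py and bar.py; the metadata values are placeholders):
# setup.py at the repository root -- a minimal sketch.
from setuptools import setup, find_packages

setup(
    name='package',            # distribution name; pick something less generic in practice
    version='0.1.0',
    packages=find_packages(),  # finds package/ via its __init__.py
)
With that in place, pip install -e . from the repository root gives an editable install, so your project picks up edits to the package before you push them (step 4 of your workflow).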
Now, someone can just download and unzip your package and run python setup.py install, and it will install your program for them. Or, better, they can run pip install . from the unpacked directory.
But, even better, you can python setup.py sdist to create a source distribution, test it out, upload it to PyPI, and then users don't have to download anything, they can just pip install mylib.
If you want binary installations for different platforms, setuptools knows how to make both Windows installers, which can be double-clicked and run, and wheel files, which can be installed with pip (in fact, it'll automatically find and use wheels instead of source distributions). If you don't have any compiled C extensions, python setup.py bdist_wheel --universal is enough to build a wheel that works everywhere.
Should installer add current package location to path or use some other way?
The installer should install the package into the user's system or user (or current-virtual-environment-system) site-packages. If you use setuptools, pip will take care of this automatically.
If you're not trying to make this public, you can still use setuptools and pip. Just check the source distribution into source control, then you can install the latest version at any time like this:
pip install --upgrade git+https://github.com/mycompany/mylib
This also means you can skip a lot of the PyPI metadata in your setup.py file (e.g., nobody cares about your project's classifiers if it's not going to end up in the PyPI repository).
But you can still take advantage of all of the parts of the Python packaging system that make your life easier, while skipping the parts that aren't relevant.

How to properly deploy python webserver application with extension deps?

I developed my first webserver app in Python.
It's a bit unusual, because it does not only depend on Python modules (like tornado) but also on some proprietary C++ libs wrapped using SWIG.
And now it's time to deliver it (to a Linux platform).
Due to the dependency on the C++ lib, just sending the sources with a requirements.txt does not seem to be enough. The only workaround would be to have the exact same Linux installation to ensure binary compatibility of the lib, but in that case there will be problems with LD_LIBRARY_PATH etc.
Another option is to write setup.py to create sdist and then deploy it with pip install.
Unfortunately that would mean I have to kill all instances of the server before installing my package. The workaround would be to use virtualenv for each instance though.
But maybe I'm missing something much simpler?
If you need the package to be installed by some user, the easiest way is to write a setup.py, and not just with a simple setup() call like most installers: if you look at some packages, they have very complicated setup.py scripts that build many things, including C extensions, and run installation steps for external dependencies.
You can solve the LD_LIBRARY_PATH problem like this: if your application has an entry point, i.e. some script installed into Python's bin directory (or the system /usr/bin), you can override the search path there with export LD_LIBRARY_PATH="/my/path:$LD_LIBRARY_PATH".
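For the setup.py route, a sketch of building the SWIG wrapper as an extension might look like this (all module, library and path names below are placeholders, not taken from the question):
# setup.py -- a sketch only; names and paths are placeholders.
from setuptools import setup, Extension

proprietary = Extension(
    '_proprietary',
    sources=['proprietary.i'],           # build_ext runs SWIG on .i sources
    swig_opts=['-c++'],
    include_dirs=['/opt/proprietary/include'],
    library_dirs=['/opt/proprietary/lib'],
    libraries=['proprietary'],
    runtime_library_dirs=['/opt/proprietary/lib'],  # bakes an rpath, so LD_LIBRARY_PATH is not needed at run time
)

setup(
    name='mywebserver',
    version='1.0',
    py_modules=['server'],  # note: SWIG also generates a proprietary.py wrapper that has to be packaged
    ext_modules=[proprietary],
    install_requires=['tornado'],
)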
If your package is a system service, like a server or daemon, you can build a system package instead, for example a Debian package or an RPM. Debian has plenty of scripts and mechanisms for declaring dependencies between packages.
So if you need certain system libraries, you list them in the package source and Debian will install them when your package is installed. For example, if your package declares build dependencies on SWIG and the relevant -dev packages, your C extension will be built properly.

Moving a Python script to another computer

I am wondering what my options are if I write a Python script that makes use of installed libraries on my computer (like lxml for example) and I want to deploy this script on another computer.
Of course having Python installed on the other machine is a given but do I also have to install all the libraries I use in my script or can I just use the *.pyc file?
Are there any options for making an installer for this kind of problem that copies all the dependencies along with the script in question?
I am talking about Windows machines by the way.
Edit -------------------
After looking through your answers I thought I should add this:
The thing is... the library required for this to work won't come along quietly with either pip or easy_install, on account of it requiring either a Windows installer (which I found after some searching) or being rebuilt on the target computer from sources (which I'm trying to avoid).
I thought there was some way to port a .pyc file or something to another PC, so that the interpreter on that machine would not require the dependencies, on account of the code already being translated to bytecode.
If this is not possible, can anyone show me a guide of some sort for making Windows package installers?
You should create a setup.py for use with setuptools. Dependencies should be listed in the install_requires field. Afterwards, if you install the package using easy_install or pip, the dependencies will be downloaded and installed automatically.
You could also use plain distutils, but it does not understand install_requires, so setuptools is the better fit here. A basic setup.py looks as follows:
from setuptools import setup

setup(
    name='My App',
    version='1.0',
    # ... snip ...
    install_requires=[
        "somedependency >= 1.2.3"
    ],
)
For differences between distutils and setuptools see this question.
