How can I run cython'd shared libraries on Google Cloud Functions? - python

As the title says, I was wondering if Google Cloud Functions (where I currently have some pure python code) support cython'd modules?
I guess, more specifically, I'm asking about how I would use said modules? It's a private project, I'm using cython via setup.py and cythonize(files) which creates a bunch of shared object modules (example.cpython-38-darwin.so, example1.cpython-38-darwin.so, example2.cpython-38-darwin.so).
Those are all for Mac, so won't work on Firebase.
Is there any way to get Cloud Functions to run the setup.py and compile some files? Or, better yet, is there some way to pre-compile those files for the appropriate OS and just deploy the shared libs?
I know a variety of libraries I'm installing via pip on Cloud Functions use Cython under the hood, but I don't really know the process of creating a wheel or other pip dependency...

You need to specify cython as a build-time dependency for your private project by adding a pyproject.toml file like:
[build-system]
requires = ["setuptools", "wheel", "cython"]
Then when installing your package with the modern version of pip in the Cloud Functions runtime, cython will be installed into the build environment before your setup.py script is run.

I appear to have been able to (eventually) solve this... I might have several unnecessary steps, but I think they improve my overall build system (again, with the intention of being able to use cython'd shared libraries on Firebase).
From Docker (or in my case, a Linux VM), in my private repo, I cythonize the important code, and turn everything into a wheel. From here, I run auditwheel show over the wheel, to check if it adheres to the manylinux1 tag (or whichever manylinux I want). In this case, it did adhere to manylinux1 off the bat, so there was no need to repair the wheel or do any shenanigans this time.
... .py # Other irrelevant .py files
magic.py # Source code that needs to be cython'd
setup.py
Simplified setup.py:
from setuptools import setup, find_packages
from Cython.Build import cythonize

setup(
    name='magiclib',
    version='0.1.0',
    packages=find_packages(),
    ext_modules=cythonize(
        "magic.py",
        compiler_directives={'language_level': 3}
    )
)
Running python setup.py bdist_wheel creates a wheel named dist/magiclib-0.1.0-cp37-cp37m-linux_x86_64.whl
From here, I run auditwheel show dist/magiclib-0.1.0-cp37-cp37m-linux_x86_64.whl which shows me that the code already adheres to the manylinux1 tag, but I nonetheless run auditwheel repair dist/magiclib-0.1.0-cp37-cp37m-linux_x86_64.whl which creates wheelhouse/magiclib-0.1.0-cp37-cp37m-manylinux1_x86_64.whl.
At this point, I bring this wheel into my GCF project, and use:
pip install -t magiclib magiclib-0.1.0-cp37-cp37m-manylinux1_x86_64.whl
which basically unzips the wheel into a sub-directory that I can vendor and deploy to Google Cloud and call from my Functions.
Works fine on some of my simple code, and I'll be experimenting with some more involved code.
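One way to call the vendored package from a function's entry point, as a minimal sketch: it assumes the wheel was unpacked into a magiclib sub-directory next to main.py (as in the pip install -t step above) and that the compiled module is importable as magic. The helper names add_vendor_dir and load_vendored are made up for illustration.

```python
import importlib
import sys
from pathlib import Path


def add_vendor_dir(base_dir, subdir="magiclib"):
    """Put base_dir/subdir at the front of sys.path and return it as a string."""
    vendor = str(Path(base_dir).resolve() / subdir)
    if vendor not in sys.path:
        sys.path.insert(0, vendor)
    return vendor


def load_vendored(module_name, base_dir, subdir="magiclib"):
    """Import module_name after making the vendored directory importable."""
    add_vendor_dir(base_dir, subdir)
    # Make sure directories created after interpreter start-up are seen.
    importlib.invalidate_caches()
    return importlib.import_module(module_name)
```

In main.py this could be called as magic = load_vendored("magic", Path(__file__).parent) before the function handlers use it.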

Related

How to force a platform wheel using build and pyproject.toml?

I am trying to force a Python3 non-universal wheel I'm building to be a platform wheel, despite not having any native build steps that happen during the distribution-packaging process.
The wheel will include an OS-specific shared library, but that library is built and copied into my package directory by a larger build system that my package knows nothing about. By the time my Python3 package is ready to be built into a wheel, my build system has already built the native shared library and copied it into the package directory.
This SO post details a solution that works for the now-deprecated setup.py approach, but I'm unsure how to accomplish the same result using the new and now-standard build / pyproject.toml system:
mypackage/
mypackage.py # Uses platform.system and importlib to load the local OS-specific library
pyproject.toml
mysharedlib.so # Or .dylib on macOS, or .dll on Windows
Based on the host OS performing the build, I would like the resulting wheel to be manylinux, macos, or windows.
I build with python3 -m build --wheel, and that always emits mypackage-0.1-py3-none-any.whl.
What do I have to change to force the build to emit a platform wheel?
OK, after some research and reading of code, I can present a bit of information and a few solutions that might meet other people's needs, summarized here:
Firstly, pyproject.toml is not mutually exclusive from setup.py. setuptools will complain about deprecation if you create a distribution package via python3 setup.py ... and no pyproject.toml file is present.
However, setup.py is still around and available, but it's a mistake to duplicate project configuration values (name, version, etc). So, put as much as your package will allow inside your pyproject.toml file, and use setup.py for things like overriding the Distribution class, or overriding the bdist_wheel module, etc.
As far as creating platform wheels, there are a few approaches that work, with pros and cons:
1. Override the bdist_wheel command class in setup.py as described here and set self.root_is_pure to False in the finalize_options override. This forces the python tag (e.g. cp39) to be set, along with the platform tag.
2. Override the Distribution class in setup.py as described here and override has_ext_modules() to simply return True. This also forces the python and platform tags to be set.
3. Add an unused minimal extension module to your packaging definition, as described here and here. This lengthens the build process and adds a useless "dummy" shared library to be dragged along wherever your wheel goes.
4. Add the argument -C=--build-option=--plat {your-platform-tag} to the build invocation (in my case, appended to python -m build -w -n). This leaves the Python tag untouched, but you have to supply your own platform tag; there's no way to say "use whatever the native platform is". You can discover the exact platform tag by calling wheel.bdist_wheel.get_platform(pathlib.Path('.')) after importing pathlib and wheel.bdist_wheel, but that can be cumbersome because wheel isn't a standard library package.
5. Simply rename your wheel from mypkg-py3-none-any.whl to mypkg-py3-none-macosx_13_0_x86_64.whl. Installers read the platform tag from the filename; note that the WHEEL file inside the wheel's .dist-info directory also records a Tag entry, which a plain rename leaves unchanged.
In the end I chose option #4 because it required the least amount of work: no setup.py file needs to be introduced solely to accomplish this, and the build logs make it clear that a platform wheel (not a pure wheel) is being created.
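The platform-tag lookup and the rename approach can be sketched with the standard library alone. This is an approximation, not what the wheel package does internally: native_platform_tag only normalizes sysconfig.get_platform(), whereas wheel's own get_platform handles extra cases (for example macOS deployment targets). Both function names are hypothetical.

```python
import sysconfig


def native_platform_tag():
    """Approximate the platform tag for the interpreter running this code."""
    # sysconfig reports e.g. "linux-x86_64" or "macosx-13.0-x86_64";
    # wheel tags use underscores in place of "-" and ".".
    return sysconfig.get_platform().replace("-", "_").replace(".", "_")


def retag_wheel_filename(filename, platform_tag):
    """Return filename with its trailing platform tag replaced."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    # Layout is name-version[-build]-python-abi-platform, so the
    # platform tag is always the last dash-separated component.
    stem, _old_platform = filename[: -len(".whl")].rsplit("-", 1)
    return f"{stem}-{platform_tag}.whl"
```

For example, retag_wheel_filename("mypackage-0.1-py3-none-any.whl", native_platform_tag()) gives the name a renamed wheel would carry on the build host.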

What is the minimal setup.py needed to develop poetry packages?

I am developing a python package managed by poetry. The package has some complex requirements that are very difficult to install successfully on my system. I want the ability to install this in editable mode, with the ability to ignore dependencies (something which the developer of poetry frowns on). Unfortunately, I do not have the option of converting this package to a more mature packaging system.
Apparently the simple solution is to create a setup.py for the project and pip install -e that. Since unfortunately poetry has spread like a cancer to many projects now, I will have to employ such a workaround frequently. As such, I want to minimize the tedium by not copying over fields like description, which are irrelevant to developing the package.
What is the minimal setup.py file that I can use as a template for such poetry projects? I assume it must at least include the package name, version and location. Is there anything else?
I am also planning to not put any requirements in the setup.py file, since the whole point is to bypass the requirements defined by poetry and pyproject.toml. I am fine with manually resolving ModuleNotFoundError: No module named 'foo' errors by typing pip install foo.
It appears sufficient to create the following file:
# setuptools replaces distutils, which was removed from the standard library in Python 3.12
from setuptools import setup

setup(
    name="<PACKAGE_NAME>",
    version="<PACKAGE_VERSION>"
)
Also comment out the entire [build-system] block in the pyproject.toml file (see also "How do I configure git to ignore some files locally?" so that you don't accidentally commit that change).
I think the package name and version can be automatically pulled from the toml file as well, but not sure right now how to do it.
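A minimal sketch of pulling the name and version from the toml file, assuming the project keeps them in the [tool.poetry] table and that Python 3.11+ (or the tomli backport) is available. The function name read_poetry_metadata is hypothetical.

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # assumption: the tomli backport is installed


def read_poetry_metadata(pyproject_path="pyproject.toml"):
    """Return (name, version) from the [tool.poetry] table."""
    with open(pyproject_path, "rb") as fh:
        data = tomllib.load(fh)
    poetry = data["tool"]["poetry"]
    return poetry["name"], poetry["version"]


# In setup.py this would be used as:
#   name, version = read_poetry_metadata()
#   setup(name=name, version=version)
```

Projects that declare their metadata in a [project] table instead would need the lookup adjusted accordingly.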

Can I exclude libraries from auditwheel repair?

I'm using manylinux2014_x86_64 to build some precompiled Linux wheels for a Python library that acts as an API to a C++ library involving CUDA. I create the wheels with pip wheel, then run auditwheel repair to include external libraries in the wheels (my C++ library, pybind11, etc.).
The problem is that it wants to package CUDA runtime and driver libraries into the wheel. Ideally I'd like to leave the CUDA installation up to the user rather than having to include it in the python wheel (I'm not even sure exactly how redistributable it is).
Is anyone aware of a way to blacklist the cuda libs from auditwheel repair? Or perhaps another better way of doing this?
There is a way, but it kind of defeats the purpose of auditwheel repair.
You need to install auditwheel as a Python module, then import it in your own python script and monkey patch some values that specify repair policies.
# Monkey patch to not ship libjvm.so in pypi wheels
# (_POLICIES is a private auditwheel name and may not exist in newer releases)
import sys

from auditwheel.main import main
from auditwheel.policy import _POLICIES as POLICIES

# libjvm is loaded dynamically; do not include it
for p in POLICIES:
    p['lib_whitelist'].append('libjvm.so')

if __name__ == "__main__":
    sys.exit(main())
This snippet is from the diplib project, which does what you wish for Java libraries. You would need to modify this script to cover libraries you need to whitelist.
The script must be invoked with a Python 3.x interpreter or it will fail, but it can still repair Python 2.7 wheels if you need to. The diplib project also shows an example invocation, which needs to happen in a manylinux docker container.
#!/bin/bash
# Run this in a manylinux2010 docker container with /io mounted to some local directory
# ...
/opt/python/cp37-cp37m/bin/python -m pip install cmake auditwheel # ignore "cmake"
# ...
export AUDITWHEEL=`pwd`/diplib/tools/travis/auditwheel # the monkey patch script
# ...
/opt/python/cp37-cp37m/bin/python $AUDITWHEEL repair pydip/staging/dist/*.whl
# ...
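As an alternative to monkey patching, recent auditwheel releases accept an --exclude option on auditwheel repair that skips grafting a given SONAME. The sketch below assumes such a release is installed and on PATH; the CUDA library names in the usage note are illustrative guesses, and both function names are hypothetical.

```python
import subprocess


def build_repair_command(wheel_path, excluded_libs, wheel_dir="wheelhouse"):
    """Build an `auditwheel repair` command that skips the given SONAMEs."""
    cmd = ["auditwheel", "repair", "--wheel-dir", wheel_dir]
    for soname in excluded_libs:
        # --exclude may be given several times, once per library.
        cmd += ["--exclude", soname]
    cmd.append(wheel_path)
    return cmd


def repair(wheel_path, excluded_libs):
    """Run the repair; raises CalledProcessError if auditwheel fails."""
    subprocess.run(build_repair_command(wheel_path, excluded_libs), check=True)
```

A call such as repair("dist/mylib.whl", ["libcuda.so.1", "libcudart.so.11.0"]) would leave those libraries to the user's own CUDA installation, with the exact SONAMEs taken from what auditwheel show reports for the wheel.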

why run setup.py, can I just embed the code?

I am writing a CLI python application that has dependencies on a few libraries (Paramiko etc.).
If I download their source and just place them under my main application source, I can import them and everything works just fine.
Why would I ever need to run their setup.py installers or deal with python package managers?
I understand that when deploying server-side applications it is OK for an admin to run easy_install/pip commands etc. to install the prerequisites, but for scripts like CLI apps that have to be distributed as self-contained apps depending only on a python binary, what is the recommended approach?
Several reasons:
Not all packages are pure-python packages. It's easy to include C-extensions in your package and have setup.py automate the compilation process.
Automated dependency management; dependencies are declared and installed for you by the installer tools (pip, easy_install, zc.buildout). Dependencies can be declared dynamically too (try to import json, if that fails, declare a dependency on simplejson, etc.).
Custom resource installation setups. The installation process is highly configurable and dynamic. The same goes for dependency detection; the cx_Oracle package, for example, has to jump through quite some hoops to make installation straightforward across the various platforms and quirks of the Oracle library distribution options it needs to support.
Why would you still want to do this for CLI scripts? That depends on how crucial the CLI is to you; will you be maintaining this over the coming years? Then I'd still use a setup.py, because it documents what the dependencies are, including minimal version needs. You can add tests (python setup.py test), and deploy to new locations or upgrade dependencies with ease.
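The dynamic dependency declaration mentioned above (try to import json, otherwise depend on simplejson) can be sketched as a helper that a setup.py might call when building its install_requires list. The function name dynamic_requirements and its arguments are hypothetical.

```python
import importlib.util


def dynamic_requirements(base, fallbacks):
    """Extend base with a fallback requirement for each missing module.

    fallbacks maps a top-level module name to the requirement to declare
    when that module cannot be found on this interpreter.
    """
    requirements = list(base)
    for module_name, requirement in fallbacks.items():
        if importlib.util.find_spec(module_name) is None:
            requirements.append(requirement)
    return requirements
```

In a setup.py this would appear as install_requires=dynamic_requirements(["paramiko"], {"json": "simplejson"}), which only adds simplejson on interpreters that lack a json module.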

Moving a Python script to another computer

I am wondering what my options are if I write a Python script that makes use of installed libraries on my computer (like lxml for example) and I want to deploy this script on another computer.
Of course having Python installed on the other machine is a given but do I also have to install all the libraries I use in my script or can I just use the *.pyc file?
Are there any options for making an installer for this kind of problem that copies all the dependencies along with the script in question?
I am talking about Windows machines by the way.
Edit -------------------
After looking through your answers I thought I should add this:
The thing is, the library required for this to work won't install cleanly with either pip or easy_install, because it requires either a Windows installer (which I found after some searching) or being rebuilt from source on the target computer (which I'm trying to avoid).
I thought there was some way to port a .pyc file or something to another PC so that the interpreter on that machine would not require the dependencies, since the code has already been translated to bytecode.
If this is not possible, can anyone point me to a guide for making Windows package installers?
You should create setup.py for use with setuptools. Dependencies should be included in the field install_requires. Afterwards if you install that package using easy_install or pip, dependencies will be automatically downloaded and installed.
A basic setup.py looks as follows (install_requires is a setuptools feature, so setup is imported from setuptools rather than distutils):
from setuptools import setup

setup(
    name='My App',
    version='1.0',
    # ... snip ...
    install_requires=[
        "somedependency >= 1.2.3"
    ],
)
For differences between distutils and setuptools see this question.
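On the .pyc part of the question: a .pyc file only contains the bytecode of the script itself, so its imports are still resolved on the target machine and the libraries have to be installed there. A small sketch for checking which distributions are missing on the target, using only the standard library (Python 3.8+); the function name missing_distributions is hypothetical.

```python
from importlib import metadata


def missing_distributions(names):
    """Return the distribution names that are not installed here."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

Running missing_distributions(["lxml"]) on the other computer shows whether lxml still needs to be installed before the script can run.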
