install a pre-built wheel file as part of setup requirements - python

I have a python project where I am using the maskrcnn-benchmark project from Facebook Research. The problem is that the setup file for the Facebook project depends on pytorch, i.e. the setup file has an import line like:
import torch
So, I need to have pytorch pre-installed and this is causing me some problems. For me, the cleanest solution would be if I could prebuild the maskrcnn-benchmark project as a wheel with all its dependencies like pytorch and then add this wheel as a requirement in my setup.py file.
However, I could not find an easy way to do so. Is there some way to add a wheel file as an install_requires step in the setup file of a Python project?

The maskrcnn-benchmark project should have torch==1.0.1 (or whichever version) in install_requires= (along with any other requirements).
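For example, a minimal sketch of such a setup.py (the version numbers and everything outside install_requires are illustrative):
from setuptools import setup, find_packages

setup(
    name='maskrcnn-benchmark',
    version='0.1',                      # placeholder version
    packages=find_packages(),
    install_requires=[
        'torch==1.0.1',                 # or whichever version you need
        # ...any other runtime requirements
    ],
)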
Then, you can use
pip wheel . --wheel-dir /tmp/deps
to have pip gather up the wheels (for your current architecture!) in /tmp/deps. Then, to install the dependencies from the wheel dir,
pip install --find-links=/tmp/deps -e .
This technique works for other target types too, like -r requirements.txt.
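For example, to pre-build wheels for everything in a requirements file (assuming it sits next to setup.py):
pip wheel --wheel-dir /tmp/deps -r requirements.txt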
EDIT: If you also want to build a wheel for the project itself, that'd be python setup.py bdist_wheel, but that won't look for dependencies.

Related

Packaging a Wheel within a Wheel

Context
I have a wheel file of a Python CLI app (call it A) in an S3 bucket. Not the repo, not the source code, just the wheel, because "reasons". I also have access to the source code of a wrapper for A (call it B).
What's Needed
I would like to create a wheel file that will ideally install both A and B along with other dependencies available via PyPI, and distribute said wheel to multiple people many of whom may not (and need not) have access to the S3 bucket. I'm wondering if it's possible to package A within B's wheel such that when someone pip installs B.whl, A is picked up automatically.
What I have tried
I tried including a reference to A in B's setup.py under install_requires and the relative path to A (./deps/A.whl) under dependency_links, but that didn't work. The error I get is that pip could not find a version that satisfies the requirement of package A. I didn't know for certain whether that would work; I just tried using the path instead of a URL.
Build command: python setup.py bdist_wheel
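For reference, a sketch of roughly what that attempt looks like in B's setup.py (this is the approach that pip rejects; names and the wheel path follow the question, everything else is illustrative):
from setuptools import setup, find_packages

setup(
    name='B',
    version='0.1',                      # placeholder version
    packages=find_packages(),
    install_requires=[
        'A',                            # the CLI app that is only available as a local wheel
    ],
    dependency_links=[
        './deps/A.whl',                 # relative path to the bundled wheel
    ],
)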

pip does not install my package dependencies

I have developed a Python package on GitHub that I released on PyPI. It installs with pip install PACKAGENAME, but does not do anything with the dependencies that are stated in the "install_requires" of the setup.py file.
Weirdly enough, the zip file of the associated release does install all dependencies. I tried with different virtual environments and on different computers but it never installs the dependencies. Any help appreciated.
pip install pythutils downloads a wheel if it's available — and it's available for your package.
When generating a wheel, setuptools runs python setup.py locally but doesn't include setup.py in the wheel. Download your wheel file and unzip it (it's just a zip archive): there is your main package directory pythutils and a directory with metadata, pythutils-1.1.1.dist-info. In the metadata directory there is a file METADATA that usually lists static dependencies, but your file doesn't list any, because when you were generating the wheel all your dependencies had already been installed, so all your dynamic code paths were skipped.
The archive that you downloaded from the GitHub release installs dependencies because it's not a wheel, so pip runs python setup.py install and your dynamic dependencies work.
What can you do? My advice is to avoid dynamic dependencies. Declare static dependencies and let pip decide what versions to install:
install_requires=[
    'numpy==1.16.5; python_version>="2" and python_version<"3"',
    'numpy; python_version>="3"',
],
Another approach would be to create version-specific wheel files — one for Python 2 and another for Python 3 — with fixed dependencies.
Yet another approach is to not publish wheels at all and only publish an sdist (source distribution). Then pip is forced to run python setup.py install on the target machine. That's not the best approach, and it will certainly be problematic for packages with C extensions (the user must have a compiler and developer tools to install from source).
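For completeness, an sdist-only release is built and uploaded with something like the following (twine as the uploader is an assumption, not from the question):
python setup.py sdist
twine upload dist/*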
Your setup.py does a series of checks like
try:
    import numpy
except ImportError:
    if sys.version_info[0] == 2:
        install_requires.append('numpy==1.16.5')
    if sys.version_info[0] == 3:
        install_requires.append("numpy")
Presumably the system where you ran it had all the required modules already installed, so you ended up with an empty list in install_requires. But this is the wrong way to do it anyway; you should simply declare a static list (or two static lists, one each for Python 2 and Python 3, if you really want to support both in the same package).
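A sketch of the static declaration inside the full setup() call (name and version taken from the wheel metadata mentioned above; everything else is illustrative):
from setuptools import setup

setup(
    name='pythutils',
    version='1.1.1',
    # static requirements: written into the wheel's METADATA regardless of
    # what happens to be installed on the build machine
    install_requires=[
        'numpy==1.16.5; python_version>="2" and python_version<"3"',
        'numpy; python_version>="3"',
    ],
)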

How to force pip to get a wheel package (even for package dependencies)?

I'm trying to build a multi-stage Docker image with some Python packages. For some reason, the pip wheel command still downloads source files (.tar.gz) for a few packages even though .whl files exist on PyPI. For example, it does this for pandas and numpy.
Here is my requirements.txt:
# REST client
requests
# ETL
pandas
# SFTP
pysftp
paramiko
# LDAP
ldap3
# SMB
pysmb
First stage of the Dockerfile:
ARG IMAGE_TAG=3.7-alpine
FROM python:${IMAGE_TAG} as python-base
COPY ./requirements.txt /requirements.txt
RUN mkdir /wheels && \
    apk add build-base openssl-dev pkgconfig libffi-dev
RUN pip wheel --wheel-dir=/wheels --requirement /requirements.txt
ENTRYPOINT tail -f /dev/null
The output below shows that it downloads the source package for pandas but gets a wheel for the requests package. Also, surprisingly, it takes a lot of time (I really mean a lot of time) to download and build these packages!
Step 5/11 : RUN pip wheel --wheel-dir=/wheels --requirement /requirements.txt
---> Running in d7bd8b3bd471
Collecting requests (from -r /requirements.txt (line 4))
Downloading https://files.pythonhosted.org/packages/51/bd/23c926cd341ea6b7dd0b2a00aba99ae0f828be89d72b2190f27c11d4b7fb/requests-2.22.0-py2.py3-none-any.whl (57kB)
Saved /wheels/requests-2.22.0-py2.py3-none-any.whl
Collecting pandas (from -r /requirements.txt (line 7))
Downloading https://files.pythonhosted.org/packages/0b/1f/8fca0e1b66a632b62cc1ae38e197befe48c5cee78f895edf4bf8d340454d/pandas-0.25.0.tar.gz (12.6MB)
I would like to know how I can force it to get a wheel file for all the required packages, and also for the dependencies listed in these packages. I observed that some dependencies get a wheel file but others get the source packages.
NOTE: the code above is a combination of multiple online sources.
Any help to make this build process easier is greatly appreciated.
Thanks in advance.
You are using Alpine Linux. It is somewhat unique in that it uses musl as the underlying libc implementation, as opposed to most other Linux distros, which use glibc.
If a Python project implements C extensions (this is what e.g. numpy or pandas do), it has two options: either
offer a source dist (.tar.gz, .tar.bz2 or .zip) so that the C extensions are compiled using the C compiler/library found on the target system, or
offer a wheel that contains compiled C extensions. If the extensions are compiled against glibc, they will be unusable on systems using musl, and AFAIK vice versa too.
Now, Python defines the manylinux1 platform tag, which is specified in PEP 513 and updated in PEP 571. Basically, the name says it all: wheels with compiled C extensions should be built against glibc and will thus work on many distros (those that use glibc), but not on some (Alpine being one of them).
For you, it means that you have two possibilities: either build packages from source dists (this is what pip already does), or install the prebuilt packages via Alpine's package manager. E.g. for py3-pandas it would mean doing:
# echo "@edge http://dl-cdn.alpinelinux.org/alpine/edge/testing" >> /etc/apk/repositories
# apk update
# apk add py3-pandas@edge
However, I don't see a big issue with building packages from source. When done right, you capture it in a separate layer placed as high as possible in the image, so it is cached and not rebuilt each time.
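If you do build from source, a sketch of how the multi-stage image might keep that expensive layer cacheable (the second stage and the exact install flags are assumptions, not from the question):
ARG IMAGE_TAG=3.7-alpine
FROM python:${IMAGE_TAG} as python-base
COPY ./requirements.txt /requirements.txt
RUN mkdir /wheels && \
    apk add build-base openssl-dev pkgconfig libffi-dev
# this expensive layer is rebuilt only when requirements.txt changes
RUN pip wheel --wheel-dir=/wheels --requirement /requirements.txt

FROM python:${IMAGE_TAG}
COPY ./requirements.txt /requirements.txt
COPY --from=python-base /wheels /wheels
# install from the pre-built wheels only; no compiler needed in the final image
RUN pip install --no-index --find-links=/wheels --requirement /requirements.txt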
You might ask why there's no platform tag analogous to manylinux1 for musl-based distros. Because no one has written a PEP similar to PEP 513 that defines a musllinux platform tag yet. If you are interested in the current state of it, take a look at issue #37.
Update
PEP 656, which defines a musllinux platform tag, has now been accepted, so it (hopefully) won't be long before prebuilt wheels for Alpine start to ship. You can track the current implementation state in auditwheel#305.
For Python 3, your packages will be installed from wheels with an ordinary pip call:
pip install pandas numpy
From the docs:
Pip prefers Wheels where they are available. To disable this, use the --no-binary flag for pip install.
If no satisfactory wheels are found, pip will default to finding source archives.
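Conversely, if you would rather have pip fail than fall back to a source archive, there is an --only-binary option (a sketch; it does not create wheels that don't exist for your platform, so on a musl-based image it will simply error out for pandas):
pip install --only-binary=:all: pandas numpy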

rebuilding the opencv-python wheel installer

I am using the opencv-python project here. What I would like to do is recreate the wheel file. So what I did was something like:
python setup.py bdist_wheel
This creates a dist directory and adds the wheel file there which I then take and try to install in an Anaconda environment as follows:
pip install ~/opencv_python-3.4.2+5b36c37-cp36-cp36m-linux_x86_64.whl
This seems to install fine. But when I try to use it and do
import cv2
I get the error:
ImportError: libwebp.so.5: cannot open shared object file: No such file or directory
I thought that creating the wheel file would take care of all the dependencies, but I wonder if I have to do something else before generating the wheel to make sure everything is packaged correctly?
EDIT
I compared the wheel archive from the official sources with the one I generated, and I see that the third-party libraries are not included. So, my zip file's contents are:
['cv2/LICENSE-3RD-PARTY.txt',
'cv2/LICENSE.txt', 'cv2/__init__.py',
'cv2/cv2.cpython-36m-x86_64-linux-gnu.so']
I have omitted some XML files which are not relevant. Meanwhile, the official archive has:
['cv2/__init__.py',
'cv2/cv2.cpython-36m-i386-linux-gnu.so',
'cv2/.libs/libswresample-08248319.so.3.2.100',
'cv2/.libs/libavformat-d485f70f.so.58.17.101',
'cv2/.libs/libvpx-1b5256ac.so.5.0.0',
'cv2/.libs/libz-83853723.so.1.2.3',
'cv2/.libs/libQtGui-55070e59.so.4.8.7',
'cv2/.libs/libavcodec-3b67922d.so.58.21.104',
'cv2/.libs/libswscale-3bf29a6c.so.5.2.100',
'cv2/.libs/libQtTest-0cf8861e.so.4.8.7',
'cv2/.libs/libQtCore-ccf6d197.so.4.8.7',
'cv2/.libs/libavutil-403a4871.so.56.18.102']

Bundle two Python packages together

I have a Python package myapp which depends on a Python package theirapp.
theirapp is used by others and may update occasionally, but it is not hosted on PyPI.
I currently have my repository setup like this:
my-app/
    myapp/
        __init__.py
    requirements.txt
    their-app/
        setup.py
        theirapp/
            __init__.py
My requirements.txt file contains the following line (among others):
./their-app/
their-app is not hosted on PyPI but I want to make sure the latest version is installed. Up to this point I have been downloading a zip file containing my-app, typing pip install -U -r requirements.txt, and using the application manually.
I would like to make an installable Python package. Ideally I would like to download a my-app.zip file and type pip install my-app.zip to install myapp, theirapp and any other dependencies.
Is this possible? If not, what is the best way to handle this scenario?
You may just need to bundle theirapp as part of your project and import it as myapp.contrib.theirapp. If both projects are versioned in git you can implement it as a submodule, but it may increase complexity for maintainers.
How pip handles a similar problem:
https://github.com/pypa/pip/tree/develop/pip/_vendor
You can see pip imports bundled vendor packages as pip._vendor.theirapp.
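For illustration, a sketch of what the vendored layout and import could look like (the contrib directory follows the suggestion above; everything else mirrors the question's layout):
my-app/
    myapp/
        __init__.py
        contrib/
            __init__.py
            theirapp/        # copied (or added as a git submodule) from their-app/theirapp
                __init__.py
    requirements.txt

# inside myapp's code:
from myapp.contrib import theirapp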
