Installing pandas in docker Alpine - python

I am having a really hard time trying to install a stable data science package configuration in docker. This should be easier with such mainstream, relevant tools.
The following Dockerfile used to work, with a bit of a hack: pandas is removed from the core package list and installed separately, pinned to pandas<0.21.0 because higher versions reportedly conflict with numpy.
FROM alpine:3.6
ENV PACKAGES="\
dumb-init \
musl \
libc6-compat \
linux-headers \
build-base \
bash \
git \
ca-certificates \
freetype \
libgfortran \
libgcc \
libstdc++ \
openblas \
tcl \
tk \
libssl1.0 \
"
ENV PYTHON_PACKAGES="\
numpy \
matplotlib \
scipy \
scikit-learn \
nltk \
"
RUN apk add --no-cache --virtual build-dependencies python3 \
&& apk add --virtual build-runtime \
build-base python3-dev openblas-dev freetype-dev pkgconfig gfortran \
&& ln -s /usr/include/locale.h /usr/include/xlocale.h \
&& python3 -m ensurepip \
&& rm -r /usr/lib/python*/ensurepip \
&& pip3 install --upgrade pip setuptools \
&& ln -sf /usr/bin/python3 /usr/bin/python \
&& ln -sf pip3 /usr/bin/pip \
&& rm -r /root/.cache \
&& pip install --no-cache-dir $PYTHON_PACKAGES \
# <---------- PANDAS
&& pip3 install 'pandas<0.21.0' \
&& apk del build-runtime \
&& apk add --no-cache --virtual build-dependencies $PACKAGES \
&& rm -rf /var/cache/apk/*
# set working directory
WORKDIR /usr/src/app
# add and install requirements
# requirements other than data science packages go here
COPY ./requirements.txt /usr/src/app/requirements.txt
RUN pip install -r requirements.txt
# add entrypoint.sh
COPY ./entrypoint.sh /usr/src/app/entrypoint.sh
RUN chmod +x /usr/src/app/entrypoint.sh
# add app
COPY . /usr/src/app
# run server
CMD ["/usr/src/app/entrypoint.sh"]
The configuration above used to work. What happens now is that build does go through, but pandas fails at import with the following error:
ImportError: Missing required dependencies ['numpy']
Since numpy 1.16.1 was installed, I don't know which numpy pandas is trying to find anymore...
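One way to see which numpy (if any) the interpreter can actually resolve is to ask importlib directly; a quick diagnostic sketch you could run inside the container (not part of the original setup):

```python
# Diagnostic sketch: show which numpy, if any, this interpreter would import,
# without actually importing it.
import importlib.util
import sys

spec = importlib.util.find_spec("numpy")
print("interpreter:", sys.executable)
print("numpy found at:", spec.origin if spec else "nowhere on sys.path")
```

If this prints a path under a different Python's site-packages than the one pandas runs under, the two installs are out of sync.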
Does anyone know how to obtain a stable solution for this?
NOTE: A solution that pulls from a turnkey data science Docker image with at least the packages mentioned above, dropped into the Dockerfile above, would also be very welcome.
EDIT 1:
If I move install of data packages into requirements.txt, as suggested in the comments, like so:
requirements.txt
(...)
numpy==1.16.1 # or numpy==1.16.0
scikit-learn==0.20.2
scipy==1.2.1
nltk==3.4
pandas==0.24.1 # or pandas==0.23.4
matplotlib==3.0.2
(...)
and Dockerfile:
# add and install requirements
COPY ./requirements.txt /usr/src/app/requirements.txt
RUN pip install -r requirements.txt
It breaks again at pandas, complaining about numpy.
Collecting numpy==1.16.1 (from -r requirements.txt (line 61))
Downloading https://files.pythonhosted.org/packages/2b/26/07472b0de91851b6656cbc86e2f0d5d3a3128e7580f23295ef58b6862d6c/numpy-1.16.1.zip (5.1MB)
Collecting scikit-learn==0.20.2 (from -r requirements.txt (line 62))
Downloading https://files.pythonhosted.org/packages/49/0e/8312ac2d7f38537361b943c8cde4b16dadcc9389760bb855323b67bac091/scikit-learn-0.20.2.tar.gz (10.3MB)
Collecting scipy==1.2.1 (from -r requirements.txt (line 63))
Downloading https://files.pythonhosted.org/packages/a9/b4/5598a706697d1e2929eaf7fe68898ef4bea76e4950b9efbe1ef396b8813a/scipy-1.2.1.tar.gz (23.1MB)
Collecting nltk==3.4 (from -r requirements.txt (line 64))
Downloading https://files.pythonhosted.org/packages/6f/ed/9c755d357d33bc1931e157f537721efb5b88d2c583fe593cc09603076cc3/nltk-3.4.zip (1.4MB)
Collecting pandas==0.24.1 (from -r requirements.txt (line 65))
Downloading https://files.pythonhosted.org/packages/81/fd/b1f17f7dc914047cd1df9d6813b944ee446973baafe8106e4458bfb68884/pandas-0.24.1.tar.gz (11.8MB)
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/pkg_resources/__init__.py", line 359, in get_provider
module = sys.modules[moduleOrReq]
KeyError: 'numpy'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-install-_e5z6o6_/pandas/setup.py", line 732, in <module>
ext_modules=maybe_cythonize(extensions, compiler_directives=directives),
File "/tmp/pip-install-_e5z6o6_/pandas/setup.py", line 475, in maybe_cythonize
numpy_incl = pkg_resources.resource_filename('numpy', 'core/include')
File "/usr/local/lib/python3.7/site-packages/pkg_resources/__init__.py", line 1144, in resource_filename
return get_provider(package_or_requirement).get_resource_filename(
File "/usr/local/lib/python3.7/site-packages/pkg_resources/__init__.py", line 361, in get_provider
__import__(moduleOrReq)
ModuleNotFoundError: No module named 'numpy'
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-_e5z6o6_/pandas/
EDIT 2:
This seems like an open pandas issue. For more details please refer to:
pandas-dev github
"Unfortunately, this means that a requirements.txt file is insufficient for setting up a new environment with pandas installed (like in a docker container)".
**ImportError**:
IMPORTANT: PLEASE READ THIS FOR ADVICE ON HOW TO SOLVE THIS ISSUE!
Importing the multiarray numpy extension module failed. Most
likely you are trying to import a failed build of numpy.
Here is how to proceed:
- If you're working with a numpy git repository, try `git clean -xdf`
(removes all files not under version control) and rebuild numpy.
- If you are simply trying to use the numpy version that you have installed:
your installation is broken - please reinstall numpy.
- If you have already reinstalled and that did not fix the problem, then:
1. Check that you are using the Python you expect (you're using /usr/local/bin/python),
and that you have no directories in your PATH or PYTHONPATH that can
interfere with the Python and numpy versions you're trying to use.
2. If (1) looks fine, you can open a new issue at
https://github.com/numpy/numpy/issues. Please include details on:
- how you installed Python
- how you installed numpy
- your operating system
- whether or not you have multiple versions of Python installed
- if you built from source, your compiler versions and ideally a build log
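Until that pandas issue was resolved, the usual workaround was to install numpy in its own pip step before pandas (or before the rest of requirements.txt), so that pandas' setup.py can import it at build time. A hedged sketch, using the version pins from the edit above:

```dockerfile
# Workaround sketch: give pandas' setup.py a numpy to import by
# installing numpy in a separate, earlier step.
RUN pip install --no-cache-dir numpy==1.16.1 \
 && pip install --no-cache-dir -r requirements.txt
```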
EDIT 3
requirements.txt ---> https://pastebin.com/0icnx0iu
EDIT 4
As of 01/12/20, the accepted solution no longer works.
Now the build breaks not at pandas but at scipy (after numpy succeeds), while building scipy's wheel. This is the log:
----------------------------------------
ERROR: Failed building wheel for scipy
Running setup.py clean for scipy
ERROR: Command errored out with exit status 1:
command: /usr/bin/python3.6 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-s6nahssd/scipy/setup.py'"'"'; __file__='"'"'/tmp/pip-install-s6nahssd/scipy/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' clean --all
cwd: /tmp/pip-install-s6nahssd/scipy
Complete output (9 lines):
`setup.py clean` is not supported, use one of the following instead:
- `git clean -xdf` (cleans all files)
- `git clean -Xdf` (cleans all versioned files, doesn't touch
files that aren't checked into the git repo)
Add `--force` to your command to use it anyway if you must (unsupported).
----------------------------------------
ERROR: Failed cleaning build dir for scipy
Successfully built numpy
Failed to build scipy
ERROR: Could not build wheels for scipy which use PEP 517 and cannot be installed directly
From the error it seems that the build process is using python3.6, even though I use FROM alpine:3.7.
Full log here -> https://pastebin.com/Tw4ubxSA
And this is the current Dockerfile:
https://pastebin.com/3SftEufx

If you're not bound to Alpine 3.6, using Alpine 3.7 (or later) should work.
On Alpine 3.6, installing matplotlib failed for me with the following:
Collecting matplotlib
Downloading https://files.pythonhosted.org/packages/26/04/8b381d5b166508cc258632b225adbafec49bbe69aa9a4fa1f1b461428313/matplotlib-3.0.3.tar.gz (36.6MB)
Complete output from command python setup.py egg_info:
Download error on https://pypi.org/simple/numpy/: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:833) -- Some packages may not be found!
Couldn't find index page for 'numpy' (maybe misspelled?)
Download error on https://pypi.org/simple/: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:833) -- Some packages may not be found!
No local packages or working download links found for numpy>=1.10.0
However, on Alpine 3.7, it worked. This may be due to a numpy versioning issue (see here), but I'm not able to tell for sure. Past that problem, packages were built and installed successfully - taking a good while, about 30 minutes (since Alpine's musl libc is not compatible with the manylinux wheel format, all packages installed with pip have to be built from source).
Note that one important change is needed: you should only remove the build-runtime virtual package (apk del build-runtime) after pip install. Also, if applicable, you could replace numpy 1.16.1 with 1.16.2, which is the shipped version (otherwise 1.16.2 will be uninstalled and 1.16.1 built from source, further increasing the build time) - I haven't tried this, though.
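A minimal sketch of that ordering, reusing the package-list variables from the question's Dockerfile (not a drop-in replacement):

```dockerfile
# Sketch: build-time packages stay installed until pip is done,
# then get removed in the same RUN so they don't bloat the image.
RUN apk add --no-cache $PACKAGES \
 && apk add --no-cache --virtual build-runtime \
      build-base python3-dev openblas-dev freetype-dev pkgconfig gfortran \
 && pip install --no-cache-dir $PYTHON_PACKAGES \
 && apk del build-runtime
```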
For reference, here's my slightly modified Dockerfile and docker build output.
Note:
Usually Alpine is chosen as the base for minimizing the image size (Alpine is also otherwise very slick, but has compatibility issues with mainstream Linux apps due to the glibc/musl difference). Having to build Python packages from source rather defeats that purpose, since you get a very bloated image - 900MB before any cleanup - which also takes ages to build. The image could be greatly compacted by removing all intermediate compilation artifacts, build dependencies etc., but still.
If you can't get the Python package versions you need to work on Alpine, without having to build them from source, I would suggest trying other small and more compatible base images such as debian-slim, or even ubuntu.
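For comparison, a minimal debian-slim variant might look like the following (a sketch, not tested here; the package list is taken from the question). On a glibc base, pre-built manylinux wheels exist for all of these, so nothing needs compiling and the build takes minutes rather than half an hour:

```dockerfile
# Sketch: on a glibc base image, pip pulls pre-built manylinux wheels.
FROM python:3.7-slim
RUN pip install --no-cache-dir \
    numpy scipy scikit-learn pandas matplotlib nltk
```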
Edit:
Following "Edit 3" with the added requirements, here are the updated Dockerfile and Docker build output.
The following packages were added to satisfy build dependencies:
postgresql-dev libffi-dev libressl-dev libxml2 libxml2-dev libxslt libxslt-dev libjpeg-turbo-dev zlib-dev
For packages that failed to build due to specific headers, I used Alpine's package contents search to locate the missing package.
Specifically for cffi, the ffi.h header was missing, which needs the libffi-dev package: https://pkgs.alpinelinux.org/contents?file=ffi.h&path=&name=&branch=v3.7.
Alternatively, when a package's build failure is not very clear, you can refer to that package's own installation instructions, for example Pillow's.
The new image size, before any compaction, is 1.04GB. For cutting it down a bit, you could remove the Python and pip caches:
RUN apk del build-runtime && \
find -type d -name __pycache__ -prune -exec rm -rf {} \; && \
rm -rf ~/.cache/pip
This will bring image size down to 661MB, when using docker build --squash.

Try adding this to your requirements.txt file:
numpy==1.16.0
pandas==0.23.4
I've been facing the same error since yesterday and this change solved it for me.

An older, related Q&A: Why does it take ages to install Pandas on Alpine Linux.
If your aim is a stable solution without knowing the nuts and bolts, for Python 3 you can just build off the following (copied from my answer at https://stackoverflow.com/a/50443531/1021819):
FROM python:3.7-alpine
RUN echo "@testing http://dl-cdn.alpinelinux.org/alpine/edge/testing" >> /etc/apk/repositories
RUN apk add --update --no-cache py3-numpy py3-pandas@testing
If your goal is to understand how to achieve a stable build, the discussion there and related images might help too...

FROM python:3.8-alpine
RUN apk --update add gcc build-base freetype-dev libpng-dev openblas-dev
RUN pip install --no-cache-dir matplotlib pandas

This may not be completely relevant, but since this is the first answer that pops up when searching for numpy/pandas installation failures on Alpine, I am adding it here.
The following fix worked for me (though it makes installing pandas/numpy take longer):
apk update
apk --no-cache add curl gcc g++
ln -s /usr/include/locale.h /usr/include/xlocale.h
Now try installing pandas/numpy

Related

Build Error on apple silicon M1 with docker

I was trying to dockerize a flask application with a third-party cli (plastimatch) on my M1.
I used ubuntu:18.04 as the base image. The build on more recent versions would fail with the error message 'no installation candidate was found'. The first odd thing I noticed was that the exact same build succeeded on a Linux server.
I used a local venv to finalize the application and as I started to dockerize everything I got the following error:
#16 22.37 note: This error originates from a subprocess, and is likely not a problem with pip.
#16 22.37 ERROR: Failed building wheel for pylibjpeg-libjpeg
#16 22.37 Failed to build pylibjpeg-openjpeg pylibjpeg-libjpeg
#16 22.37 ERROR: Could not build wheels for pylibjpeg-openjpeg, pylibjpeg-libjpeg, which is required to install pyproject.toml-based projects
These Python packages are wrappers for different C++ libraries that handle images. The local build fails, while the build on our Linux server runs perfectly fine.
Has anyone noticed similar problems when dockerizing their applications locally in development? And are there any solutions?
Here is the reference of the used Dockerfile and requirements.txt (currently missing specific versions):
FROM ubuntu:18.04 as base
RUN apt-get update -y && apt-get install -y && apt-get upgrade -y
RUN apt-get install -y software-properties-common
RUN add-apt-repository ppa:deadsnakes/ppa
RUN apt-get install -y python3.8 python3-pip
RUN rm /usr/bin/python3 && ln -s /usr/bin/python3.8 /usr/bin/python3
RUN apt-get install -y \
plastimatch \
zlib1g \
cmake
WORKDIR /app
COPY requirements.txt requirements.txt
RUN python3 -m pip install -U --force-reinstall pip
RUN pip3 install --upgrade pip setuptools wheel
RUN pip3 install -r requirements.txt
ENV LC_ALL=C.UTF-8
ENV LANG=C.UTF-8
FROM base as upload-dev
RUN echo "Building dev version"
COPY requirements_dev.txt requirements_dev.txt
RUN pip3 install -r requirements_dev.txt
COPY . .
python-dotenv
cython
pynrrd
flask-cors
Flask
Werkzeug
httplib2
numpy
pydicom
highdicom
dicomweb-client
Update: 01 July 2022
I could track down the error.
The problem was a missing wheel for some third-party libraries. If no wheel can be located, pip fetches the source code and compiles it locally. This crashed on my machine during the installation of libraries that use C++ at their core.
An easy way to fix this problem is to use the linux/amd64 platform directly:
FROM --platform=linux/amd64 $YOUR_BASE_IMAGE
This will be a bit slower, but sufficient for most development environments.
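The underlying mechanism is pip's platform tag matching: a pre-built wheel is only used when its tag matches what the interpreter reports. A small diagnostic sketch (not from the original answer) to see those values on your machine:

```python
# Diagnostic sketch: print the platform identifiers pip consults when
# deciding whether a pre-built wheel matches this interpreter.
import platform
import sysconfig

print("machine:", platform.machine())             # e.g. 'arm64' on Apple Silicon
print("platform tag:", sysconfig.get_platform())  # e.g. 'linux-x86_64'
```

Under emulation with --platform=linux/amd64, these report x86_64 values, which is why the amd64 wheels then resolve.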
A detailed explanation: https://pythonspeed.com/articles/docker-build-problems-mac/
For me, the fix was to install Rosetta 2, which is included in the Docker documentation: https://docs.docker.com/desktop/mac/apple-silicon/#system-requirements
softwareupdate --install-rosetta

Minimal SciPy Dockerfile

I have a Dockerfile like the following, app code is omitted:
FROM python:3
# Binary dependencies
RUN apt update && apt install -y gfortran libopenblas-dev liblapack-dev
# Wanted Python packages
RUN python3 -m pip install mysqlclient numpy scipy pandas matplotlib
It works fine but produces an image 1.75 GB in size (while the code is about 50 MB). How can I reduce this huge size?
I also tried to use Alpine Linux, like this:
FROM python:3-alpine
# Binary dependencies for numpy & scipy; though second one doesn't work anyway
RUN apk add --no-cache --virtual build-dependencies \
gfortran gcc g++ libstdc++ \
musl-dev lapack-dev freetype-dev python3-dev
# For mysqlclient
RUN apk --no-cache add mariadb-dev
# Wanted Python packages
RUN python3 -m pip install mysqlclient numpy scipy pandas matplotlib
But Alpine leads to many different strange errors. This is the error from the code above:
File "scipy/odr/setup.py", line 28, in configuration
blas_info['define_macros'].extend(numpy_nodepr_api['define_macros'])
KeyError: 'define_macros'
So, how can one get the minimal possible (or at least a smaller) Python 3 image with the mentioned packages?
There are several things you can do to make your Docker image smaller.
Use the python:3-slim Docker image as a base. The -slim images do not include packages needed for compiling software.
Pin the Python version, let's say to 3.8. Some packages do not have wheel files for python 3.9 yet, so you might have to compile them. It is good practice, in general, to use a more specific tag because the python:3-slim tag will point to different versions of python at different points in time.
You can also omit the installation of gfortran, libopenblas-dev, and liblapack-dev. Those packages are necessary for building numpy/scipy, but if you install the wheel files, which are pre-compiled, you do not need to compile any code.
Use --no-cache-dir in pip install to disable the cache. If you do not include this, then pip's cache counts toward the Docker image size.
There are no linux wheels for mysqlclient, so you will have to compile it. You can install build dependencies, install the package, then remove build dependencies in a single RUN instruction. Keep in mind that libmariadb3 is a runtime dependency of this package.
Here is a Dockerfile that implements the suggestions above. It makes a Docker image 354 MB large.
FROM python:3.8-slim
# Install mysqlclient (must be compiled).
RUN apt-get update -qq \
&& apt-get install --no-install-recommends --yes \
build-essential \
default-libmysqlclient-dev \
# Necessary for mysqlclient runtime. Do not remove.
libmariadb3 \
&& rm -rf /var/lib/apt/lists/* \
&& python3 -m pip install --no-cache-dir mysqlclient \
&& apt-get autoremove --purge --yes \
build-essential \
default-libmysqlclient-dev
# Install packages that do not require compilation.
RUN python3 -m pip install --no-cache-dir \
numpy scipy pandas matplotlib
Using Alpine Linux was a good idea, but Alpine uses musl instead of glibc, so it is not compatible with most pip wheels. The result is that you would have to compile numpy/scipy.

How do I install Cython, cartopy and shapely in a python docker container?

I am trying to get Cython, cartopy and shapely running in a Docker container so I can leverage the Python library traffic. I am currently getting an error with Cython:
Collecting Cython==0.26 (from -r requirements.txt (line 1))
Downloading https://files.pythonhosted.org/packages/87/6c/53a9e636c9dbe7acd5c002422c1a7a48a367f3b4c0cf6490908f43398ca6/Cython-0.26-cp27-cp27mu-manylinux1_x86_64.whl (7.0MB)
Collecting geos (from -r requirements.txt (line 2))
Downloading https://files.pythonhosted.org/packages/11/9b/a190f02fb92f465a7640b9ee7da732d91610415a1102f6e9bb08125a3fef/geos-0.2.2.tar.gz (365kB)
Collecting cartopy (from -r requirements.txt (line 3))
Downloading https://files.pythonhosted.org/packages/e5/92/fe8838fa8158931906dfc4f16c5c1436b3dd2daf83592645b179581403ad/Cartopy-0.17.0.tar.gz (8.9MB)
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-build-Se89QB/cartopy/setup.py", line 42, in <module>
raise ImportError('Cython 0.15.1+ is required to install cartopy.')
ImportError: Cython 0.15.1+ is required to install cartopy.
----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-Se89QB/cartopy/
The command '/bin/sh -c pip install --no-cache-dir -r requirements.txt' returned a non-zero code: 1
Below is my setup:
Dockerfile:
FROM ubuntu:latest
WORKDIR /usr/src/app
#apt-get install -y build-essential -y python python-dev python-pip python-virtualenv libmysqlclient-dev curl&& \
RUN \
apt-get update && \
apt-get install -y build-essential python python-dev python-pip python-virtualenv libmysqlclient-dev curl && \
rm -rf /var/lib/apt/lists/*
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
# Install cron
RUN apt-get update
RUN apt-get install cron
# Add crontab file in the cron directory
ADD crontab /etc/cron.d/simple-cron
# Add shell script and grant execution rights
ADD script.sh /script.sh
RUN chmod +x /script.sh
# Give execution rights on the cron job
RUN chmod 0644 /etc/cron.d/simple-cron
# Create the log file to be able to run tail
RUN touch /var/log/cron.log
# Run the command on container startup
CMD cron && tail -f /var/log/cron.log
requirements.txt
Cython==0.26
geos
cartopy
shapely
traffic
Try installing Cython first with pip, before using your requirements.txt.
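In Dockerfile terms, that amounts to giving Cython its own pip step before the requirements file, so it is already importable when cartopy's setup.py runs; a sketch using the version pin from the question:

```dockerfile
# Sketch: Cython must be installed before cartopy's setup.py executes.
RUN pip install --no-cache-dir Cython==0.26
RUN pip install --no-cache-dir -r requirements.txt
```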
cartopy has a lot of dependencies, and some -- especially Proj -- may not be resolvable using PIP or apt-get. numpy and cython may be resolved by installing them separately and just prior to installing cartopy (like u/dopplershift suggests) -- but Proj will never resolve, grr.
My solution was to use conda install, which solves the dependencies for you. Unfortunately Docker and Conda don't play well together, but you can kind of work around it using miniconda. Try this:
FROM ubuntu:latest
FROM python:3.8.5
RUN mkdir /app
ADD . /app
WORKDIR /app
# cartopy cannot be installed using PIP because the proj never gets resolved.
# The proj dependency never gets resolved because there are two Python packages
# called proj, and PIP always loads the wrong one. The conda install command,
# however, using the conda-forge channel, does know how to resolve the dependency
# issues, including packages like numpy.
#
# Here we install miniconda, just so we can use the conda install command
# for cartopy.
FROM continuumio/miniconda3
RUN conda install -c conda-forge cartopy
Since 2022-09-09 this is much easier, because Cartopy v0.21.0 does not depend on PROJ.
Solution
Dockerfile:
FROM python:3.11-slim-bullseye
RUN apt update && apt install -y git gcc build-essential python3-dev libgeos-dev
RUN python3 -m pip install --upgrade pip setuptools wheel
ADD requirements.txt .
RUN python3 -m pip install --no-cache-dir --compile -r requirements.txt
# add files and set cmd/entrypoint down here
requirements.txt:
Cartopy==0.21.0
Test
docker build -t cartopy -f Dockerfile .
docker run -it cartopy pip freeze
Results in:
Cartopy==0.21.0
certifi==2022.12.7
contourpy==1.0.6
cycler==0.11.0
fonttools==4.38.0
kiwisolver==1.4.4
matplotlib==3.6.2
numpy==1.23.5
packaging==22.0
Pillow==9.3.0
pyparsing==3.0.9
pyproj==3.4.0
pyshp==2.3.1
python-dateutil==2.8.2
Shapely==1.8.5.post1
six==1.16.0

Docker pip dependencies installation error

I'm trying to build a Docker image for a Flask app I wrote, but I'm getting a pip related error when it's installing the build dependencies as you can see from the log below.
I'm using pipenv for dependency management, and I'm able to get the app running locally without any error using pipenv run python3 run.py
It seems it's not able to install bcrypt, but I can't figure out why.
Dockerfile:
FROM alpine:3.8
RUN apk add --no-cache python3-dev && pip3 install --upgrade pip
WORKDIR /app
COPY . /app
RUN pip3 --no-cache-dir install -r requirements.txt
EXPOSE 5000
ENTRYPOINT ["python3"]
CMD ["run.py"]
requirements.txt (generated with pipenv shell; pip freeze > requirements.txt)
bcrypt==3.1.6
blinker==1.4
cffi==1.11.5
Click==7.0
Flask==1.0.2
Flask-Bcrypt==0.7.1
Flask-Login==0.4.1
Flask-Mail==0.9.1
Flask-SQLAlchemy==2.3.2
Flask-WTF==0.14.2
itsdangerous==1.1.0
Jinja2==2.10
MarkupSafe==1.1.0
Pillow==5.4.1
pycparser==2.19
six==1.12.0
SQLAlchemy==1.2.17
Werkzeug==0.14.1
WTForms==2.2.1
Docker build image process log:
$ docker build -t flaskapp:latest .
Sending build context to Docker daemon 2.16MB
Step 1/8 : FROM alpine:3.8
---> 3f53bb00af94
Step 2/8 : RUN apk add --no-cache python3-dev && pip3 install --upgrade pip
---> Using cache
---> 3856c6d59bbe
Step 3/8 : WORKDIR /app
---> Using cache
---> 54ed0e7464e4
Step 4/8 : COPY . /app
---> Using cache
---> 9e045f4ce91c
Step 5/8 : RUN pip3 --no-cache-dir install -r requirements.txt
---> Running in 25909f37b071
Collecting bcrypt==3.1.6 (from -r requirements.txt (line 1))
Downloading https://files.pythonhosted.org/packages/ce/3a/3d540b9f5ee8d92ce757eebacf167b9deedb8e30aedec69a2a072b2399bb/bcrypt-3.1.6.tar.gz (42kB)
Installing build dependencies: started
Installing build dependencies: finished with status 'error'
Complete output from command /usr/bin/python3.6 /usr/lib/python3.6/site-packages/pip install --ignore-installed --no-user --prefix /tmp/pip-build-env-9iojppec/overlay --no-warn-script-location --no-binary :none: --only-binary :none: -i https://pypi.org/simple -- setuptools wheel "cffi>=1.1; python_implementation != 'PyPy'":
Collecting setuptools
Downloading https://files.pythonhosted.org/packages/bf/ae/a23db1762646069742cc21393833577d3fa438eecaa59d11fb04fa57fcd5/setuptools-40.7.1-py2.py3-none-any.whl (574kB)
Collecting wheel
Downloading https://files.pythonhosted.org/packages/ff/47/1dfa4795e24fd6f93d5d58602dd716c3f101cfd5a77cd9acbe519b44a0a9/wheel-0.32.3-py2.py3-none-any.whl
Collecting cffi>=1.1
Downloading https://files.pythonhosted.org/packages/e7/a7/4cd50e57cc6f436f1cc3a7e8fa700ff9b8b4d471620629074913e3735fb2/cffi-1.11.5.tar.gz (438kB)
Complete output from command python setup.py egg_info:
Package libffi was not found in the pkg-config search path.
Perhaps you should add the directory containing `libffi.pc'
to the PKG_CONFIG_PATH environment variable
Package 'libffi', required by 'virtual:world', not found
Package libffi was not found in the pkg-config search path.
Perhaps you should add the directory containing `libffi.pc'
to the PKG_CONFIG_PATH environment variable
Package 'libffi', required by 'virtual:world', not found
Package libffi was not found in the pkg-config search path.
Perhaps you should add the directory containing `libffi.pc'
to the PKG_CONFIG_PATH environment variable
Package 'libffi', required by 'virtual:world', not found
Package libffi was not found in the pkg-config search path.
Perhaps you should add the directory containing `libffi.pc'
to the PKG_CONFIG_PATH environment variable
Package 'libffi', required by 'virtual:world', not found
Package libffi was not found in the pkg-config search path.
Perhaps you should add the directory containing `libffi.pc'
to the PKG_CONFIG_PATH environment variable
Package 'libffi', required by 'virtual:world', not found
No working compiler found, or bogus compiler options passed to
the compiler from Python's standard "distutils" module. See
the error messages above. Likely, the problem is not related
to CFFI but generic to the setup.py of any Python package that
tries to compile C code. (Hints: on OS/X 10.8, for errors about
-mno-fused-madd see http://stackoverflow.com/questions/22313407/
Otherwise, see https://wiki.python.org/moin/CompLangPython or
EDIT:
After changing the RUN command in Dockerfile to:
RUN apk add --no-cache python3-dev openssl-dev libffi-dev gcc musl-dev && pip3 install --upgrade pip
I now get this error (relevant part posted):
...
Collecting SQLAlchemy==1.2.17 (from -r requirements.txt (line 17))
Downloading https://files.pythonhosted.org/packages/c6/52/73d1c92944cd294a5b165097038418abb6a235f5956d43d06f97254f73bf/SQLAlchemy-1.2.17.tar.gz (5.7MB)
Collecting Werkzeug==0.14.1 (from -r requirements.txt (line 18))
Downloading https://files.pythonhosted.org/packages/20/c4/12e3e56473e52375aa29c4764e70d1b8f3efa6682bef8d0aae04fe335243/Werkzeug-0.14.1-py2.py3-none-any.whl (322kB)
Collecting WTForms==2.2.1 (from -r requirements.txt (line 19))
Downloading https://files.pythonhosted.org/packages/9f/c8/dac5dce9908df1d9d48ec0e26e2a250839fa36ea2c602cc4f85ccfeb5c65/WTForms-2.2.1-py2.py3-none-any.whl (166kB)
Exception:
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/pip/_internal/cli/base_command.py", line 176, in main
status = self.run(options, args)
File "/usr/lib/python3.6/site-packages/pip/_internal/commands/install.py", line 346, in run
session=session, autobuilding=True
File "/usr/lib/python3.6/site-packages/pip/_internal/wheel.py", line 886, in build
assert have_directory_for_build
AssertionError
The command '/bin/sh -c pip3 --no-cache-dir install -r requirements.txt' returned a non-zero code: 2
EDIT2:
It seems it's an Alpine Linux known issue. More info here: have_directory_for_build AssertionError when installing with --no-cache-dir (19.0.1)
The problem is that cffi requires a compiler and development libraries to be available on install.
A simple way to get around this is by simply installing the required packages as part of the docker build process.
RUN apk add --no-cache python3-dev openssl-dev libffi-dev gcc && pip3 install --upgrade pip
This might not be the best long-term solution; something like a multi-stage build might be better.
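A hedged sketch of what such a multi-stage build could look like: wheels are compiled in a throwaway builder stage, and only the finished wheels are copied into the final image (base tags and package names here are illustrative, not from the question):

```dockerfile
# Builder stage: compilers and -dev headers live only here.
FROM python:3.6-alpine3.8 AS builder
RUN apk add --no-cache gcc musl-dev python3-dev libffi-dev openssl-dev
COPY requirements.txt .
RUN pip wheel --no-cache-dir --wheel-dir /wheels -r requirements.txt

# Final stage: no compilers, just runtime libs plus the pre-built wheels.
FROM python:3.6-alpine3.8
RUN apk add --no-cache libffi openssl
COPY --from=builder /wheels /wheels
RUN pip install --no-cache-dir /wheels/*.whl
```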
I tried pulling alpine:3.4 and it works, but with 3.7 and 3.9 the installation of the Python dependencies crashes every time. It seems to be something about the Alpine image.
SOLVED:
I had the error installing Fabric2; among its dependencies were bcrypt and cryptography, which threw many errors. Looking in the cryptography installation notes, it says you have to add this line to the Dockerfile:
RUN apk add --no-cache gcc musl-dev python3-dev libffi-dev openssl-dev make
The worst thing is that it increases the layer size by 231 MiB.
A better approach for a smaller image is to install those tools, install the Python libraries, and then uninstall the tools, all in one RUN clause.
The image size then shrinks to 83 MiB, so that's the best option.
I used the latest version of alpine in this case and it works fine now!
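The single-RUN pattern described above could look like this (a sketch; the package names follow the cryptography note earlier in this answer):

```dockerfile
# Sketch: add build tools, install the libraries, and remove the tools
# within one layer so the compilers never count toward the image size.
RUN apk add --no-cache --virtual .build-deps \
      gcc musl-dev python3-dev libffi-dev openssl-dev make \
 && pip3 install --no-cache-dir bcrypt cryptography \
 && apk del .build-deps
```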
You could use the official Python Docker image based on Alpine: Python 3.7 Alpine 3.8 Dockerfile
You can look there for the appropriate image for your Python version; it should remove the overhead of installing any missing Python dependencies. Then at the top of your Dockerfile you can just write:
FROM python:3.7.2-alpine3.8
If you don't want to go that route, adding libffi-dev and gcc to the following command should fix your problem. If you look at the linked Dockerfile, though, you can see there are a lot of additional dependencies included in the python image which may be needed.
RUN apk add --no-cache python3-dev libffi-dev gcc && pip3 install --upgrade pip
Hope this helps!
Try installing gcc into your image with apt before installing the Python requirements.

Cannot load CLoader with pyyaml

I'm working on a python project using pyyaml. I need to run it in a Docker container based on bitnami/minideb:jessie. Python version is 2.7.9.
The original code is using CLoader and I cannot change it currently.
Any reason CLoader fails to load but Loader is fine ?
>>> import yaml
>>> yaml.__version__
'3.12'
>>> from yaml import Loader
>>> from yaml import CLoader
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: cannot import name CLoader
>>>
I cannot figure out what I'm missing here. Any idea ?
Running it from the Docker image python:2.7.9, however, raises no error:
$ docker run -ti python:2.7.9 bash
#/ python
>>> from yaml import CLoader
>>> from yaml import Loader
>>>
By default, the setup.py script checks whether LibYAML is installed
and if so, builds and installs LibYAML bindings.
This is the minimum to get CLoader compiled and installed.
FROM ubuntu:20.04
RUN apt-get update && apt-get install -y \
python3 python3-dev python3-pip gcc libyaml-dev
RUN pip3 install pyyaml
# verify
RUN python3 -c "import yaml; yaml.CLoader"
I ran into the same problem. You need to install the libyaml-dev package, then install libyaml and pyyaml from source. Here's the complete Dockerfile for minideb:jessie:
FROM bitnami/minideb:jessie
RUN apt-get update
RUN apt-get install -y \
automake \
autoconf \
build-essential \
git-core \
libtool \
libyaml-dev \
make \
python \
python-dev \
python-pip
RUN pip install --upgrade pip
RUN pip install Cython==0.29.10
RUN mkdir /libyaml
WORKDIR /libyaml
RUN git clone https://github.com/yaml/libyaml.git . && \
git checkout dist-0.2.2 && \
autoreconf -f -i && \
./configure && \
make && \
make install
RUN mkdir /pyyaml
WORKDIR /pyyaml
RUN git clone https://github.com/yaml/pyyaml.git . && \
git checkout 5.1.1 && \
python setup.py install
RUN python -c "import yaml; from yaml import CLoader; print 'Loaded CLoader!'"
A couple of additions to others' solutions:
If you want the install command to hard-fail if the libyaml C extension won't build (instead of silently falling back to a pure-Python only install), you can pass the --with-libyaml global option, eg: python setup.py --with-libyaml install.
If you're doing this with something that might ever need to be upgraded (eg implicitly via another package's requirement for a higher pyyaml version), it's better to use pip instead of directly calling setup.py, as that (currently) uses a pure distutils installation, which pip will fail to uninstall later. You'll see an error like "ERROR: Cannot uninstall 'PyYAML'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall."
Doing the required extension build with pip looks something like pip install --global-option='--with-libyaml' pyyaml.
I'm just copying the developer's answer from the issue linked above, but this happens because pyyaml only installs the libyaml bindings (CLoader & co.) if it finds the libyaml-dev package (that's the debian package, anyway) at install time. If it doesn't find it, it prints a warning and skips the libyaml bindings.
So, install libyaml-dev before installing pyyaml.
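Application code can also guard against the binding being absent instead of hard-importing CLoader; a small sketch of the usual try/except fallback pattern (written for Python 3 and PyYAML 5+, unlike the Python 2 setup in the question):

```python
# Sketch: prefer the C-accelerated loader when the libyaml bindings were
# compiled in, and fall back to the pure-Python loader otherwise.
import yaml

try:
    from yaml import CSafeLoader as SafeLoaderImpl  # needs libyaml at build time
except ImportError:
    from yaml import SafeLoader as SafeLoaderImpl   # pure-Python fallback

print(yaml.load("a: 1", Loader=SafeLoaderImpl))  # {'a': 1}
```

The original code in the question hard-imports CLoader, which is exactly why it breaks on installs built without libyaml.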
I tried all the steps mentioned, and the following fixed my issue.
Install
apt-get install -y gcc libyaml-dev
pip install --ignore-installed --global-option='--with-libyaml' pyyaml
Test
python -c "import yaml; yaml.CLoader"
