WARNING: Running pip as the 'root' user - python

I am building a simple Docker image for my Python Django app, but at the end of the build pip prints the following warning (I am building on Ubuntu 20.04):
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead
Why does pip print this warning when I am installing the Python requirements inside my image? I build the image with:
sudo docker build -t my_app:1 .
Should I be worried about this warning, given that I know running pip as root can break a system?
Here is my Dockerfile:
FROM python:3.8-slim-buster
WORKDIR /app
COPY requirements.txt requirements.txt
RUN pip install -r requirements.txt
COPY . .
CMD ["python", "manage.py", "runserver", "0.0.0.0:8000"]

The way your container is built doesn't add a user, so everything is done as root.
You could create a user and install into that user's home directory by doing something like this:
FROM python:3.8.3-alpine
RUN pip install --upgrade pip
RUN adduser -D myuser
USER myuser
WORKDIR /home/myuser
COPY --chown=myuser:myuser requirements.txt requirements.txt
RUN pip install --user -r requirements.txt
ENV PATH="/home/myuser/.local/bin:${PATH}"
COPY --chown=myuser:myuser . .
CMD ["python", "manage.py", "runserver", "0.0.0.0:8000"]

This behavior was introduced in pip 21.1 as a "bug fix".
As of pip 22.1, you can now opt out of the warning using a parameter:
pip install --root-user-action=ignore
You can also silence it for the whole container by setting an environment variable:
ENV PIP_ROOT_USER_ACTION=ignore
#11035
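Applied to the Dockerfile from the question, a minimal sketch looks like this (only the ENV line is new):
FROM python:3.8-slim-buster
ENV PIP_ROOT_USER_ACTION=ignore
WORKDIR /app
COPY requirements.txt requirements.txt
RUN pip install -r requirements.txt
COPY . .
CMD ["python", "manage.py", "runserver", "0.0.0.0:8000"]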

UPDATE 220930
The good news in this answer is that you can simply ignore the warning, but for pip >= 22.1, ignoring it silently is no longer the best practice. At the time of writing this answer, the newer option for pip >= 22.1 was not known to me.
pip version >=22.1
Follow the answer by Maximilian Burszley. It was not known to me at the time of writing and lets you avoid the warning with a single parameter.
pip version >=21.1 and <22.1
You can ignore this warning, since you create the image for an isolated purpose and it is therefore organizationally as isolated as a virtual environment (not technically, but that does not matter here).
It is usually not worth the time to create a virtual environment in the image, or to add a user as in the other answer, just to avoid the warning; you should not run into any issues because of it. It may add noise while debugging, but it does not stop the code from working.
Just check pip -V and pip3 -V so that you do not mistakenly use the Python 2 pip when you want the Python 3 one. That should be it, and if you only install pip for Python 3, you will not have that problem anyway.
pip version <21.1
In these older versions the warning does not appear at all; see the other answer again. The age of the question also makes it clear that the warning did not show up back then.

I don't like ignoring warnings, as one day you will overlook an important one.
Here is a good explanation of Docker best practices with Python. Search for "Example with virtualenv" and you'll find this:
# temp stage
FROM python:3.9-slim as builder
WORKDIR /app
ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1
RUN apt-get update && \
apt-get install -y --no-install-recommends gcc
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY requirements.txt .
RUN pip install -r requirements.txt
# final stage
FROM python:3.9-slim
COPY --from=builder /opt/venv /opt/venv
WORKDIR /app
ENV PATH="/opt/venv/bin:$PATH"
Works like a charm, with no warnings or the like. By the way, they also recommend creating a non-root user for security reasons.
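For completeness, a minimal sketch of how the final stage could continue, copying the app and dropping root; the user name and run command are assumptions and not part of the linked example:
# final stage, continued (illustrative)
COPY . .
RUN useradd --create-home appuser && chown -R appuser /app
USER appuser
CMD ["python", "manage.py", "runserver", "0.0.0.0:8000"]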
EDIT: To get rid of all warnings you may also want to add the following entries to the builder part of your Dockerfile (applies to Debian 8.3.x):
ARG DEBIAN_FRONTEND=noninteractive
ARG DEBCONF_NOWARNINGS="yes"
RUN python -m pip install --upgrade pip && \
...

Related

How can I install packages inside a running Docker container so that they take effect without recreating the container?

I am running docker compose up, which starts multiple containers, one of which is Python 3.*, and all the containers have volumes attached to them.
I have also already created a requirements.txt file.
I entered the Python container, installed package x, and then ran
pip freeze > requirements.txt
I then stopped the containers and restarted them, but the Python container didn't start and the log said module x was not found. So I deleted the container and created a new one, and it worked.
My question is: is there any way to avoid deleting the container (which I think is overkill) but somehow still be able to install packages in the container?
Dockerfile
FROM python:3.6
RUN apt-get update
RUN apt-get install -y gettext
RUN mkdir -p /var/www/server
COPY src/requirements.txt /var/www/server/
WORKDIR /var/www/server
RUN pip install -r ./requirements.txt
EXPOSE 8100
ENTRYPOINT sleep 3 && python manage.py migrate && python manage.py runserver 0.0.0.0:8100
You should copy your project source files into the container during the build and, within it, run pip install -r requirements.txt.
Below is an example to give you an idea:
# ... preceding build commands ...
WORKDIR /usr/src/service
# Copy everything from the service folder/module into WORKDIR inside the image
COPY ./service .
RUN pip install --upgrade pip
RUN pip install --no-cache-dir -r requirements.txt
# ... further build commands ...
Finally, you will use docker-compose build service to build the defined service in the docker-compose.yml, pointing to the Dockerfile in the build context.
...
  build:
    context: .
    dockerfile: service/Dockerfile
...
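For context, a minimal sketch of the surrounding docker-compose.yml (the service name and published port are assumptions):
version: "3.8"
services:
  service:
    build:
      context: .
      dockerfile: service/Dockerfile
    ports:
      - "8100:8100"
With that in place, docker-compose build service rebuilds the image and docker-compose up service starts it.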
Broadly, set up your Dockerfile so that the least frequently changing and most time-costly work happens first (a concrete sketch follows the outline below):
FROM FOO
RUN get OS-level and build dependencies
COPY only exactly the files needed to identify dependencies
RUN install dependencies that take a long time
RUN install more frequently-changing dependencies
COPY the rest of your wanted content
ENTRYPOINT define me
As #coldly says in their answer, write your dependencies into a requirements file and install them during the container build!
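A minimal sketch of that ordering for a Python web app; the base image, the gcc package, and the entrypoint are assumptions used for illustration:
FROM python:3.10-slim
# OS-level and build dependencies (change rarely, so this layer is cached longest)
RUN apt-get update && apt-get install -y --no-install-recommends gcc && rm -rf /var/lib/apt/lists/*
WORKDIR /app
# Copy only the file that identifies dependencies, so the next layer is reused
# as long as requirements.txt is unchanged
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy the frequently-changing application code last
COPY . .
ENTRYPOINT ["python", "manage.py", "runserver", "0.0.0.0:8000"]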

Running the same docker config a few months later results in "ValueError: numpy.ndarray size changed"

We built a Docker image in the summer of 2021 and it ran fine until December.
Now, after having deleted the docker cache, it doesn't run anymore.
The error is:
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
There are quite a lot of related SO questions. The answers tend to be similar:
reinstall numpy
use numpy version X
use --no-cache-dir option on pip install
use a new pip version
We've tried all of these (upgrading pip by adding the latest version to requirements.txt). A specific numpy version was not in our requirements.txt; I thought that could be the cause, but after adding a few very different versions without success, I am out of ideas.
We're not even sure whether purging the docker cache played a role.
This is the minimal repro:
main.py:
from string_grouper import match_strings, match_most_similar
requirements.txt:
string_grouper==0.1.1
Dockerfile:
FROM amancevice/pandas:1.3.1-slim
# for string_grouper
RUN DEBIAN_FRONTEND=noninteractive apt-get update \
&& apt-get install g++ -y \
&& apt-get clean
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
COPY requirements.txt .
RUN python -m pip install --no-cache-dir -r requirements.txt
WORKDIR /app
COPY . /app
RUN useradd appuser && chown -R appuser /app
USER appuser
CMD ["python", "main.py"]
What else can we try to troubleshoot this?
EDIT: I believe the likeliest solution is "explicitly install numpy version X". In that case, the question would just be how to find the right one.
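One minimal sketch for testing that hypothesis is to pin numpy explicitly above string_grouper in the repro's requirements.txt and rebuild without cache; the version below is only a placeholder, not a known-good pin:
# requirements.txt -- the numpy pin is a placeholder to experiment with
numpy==1.21.0
string_grouper==0.1.1
and then rebuild with:
docker build --no-cache -t repro .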

How to invalidate Dockerfile cache when pip installing from repo

I have a Dockerfile that needs to install the latest package code from a private git repo. However, because the Dockerfile/URL/commit doesn't change (I just follow the latest master), Docker caches this step and won't pull the latest code.
I can disable build caching entirely, which fixes the issue, but results in a slow build.
How can I force Docker not to use the cache for just that one command?
FROM tiangolo/uvicorn-gunicorn-fastapi:python3.7
COPY ./requirements.txt /app
RUN pip install --upgrade pip
RUN pip install -r requirements.txt
# This needs to be separate to trigger to invalidate the build cache
RUN pip install -e git+https://TOKEN@github.com/user/private-package.git#egg=private_package
COPY ./main.py /app
COPY ./app /app/app
Add
ARG foo=bar
before the RUN pip install -e ... line in your Dockerfile.
Then, in the script that runs docker build ..., add the parameter
--build-arg foo="$(date +%s)"
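Putting it together, a minimal sketch (the ARG name and image tag are arbitrary; any value that changes between builds invalidates the cache from that instruction onward):
# Changing CACHEBUST invalidates the cache for everything below this line
ARG CACHEBUST=1
RUN pip install -e git+https://TOKEN@github.com/user/private-package.git#egg=private_package
and in the build script:
docker build --build-arg CACHEBUST="$(date +%s)" -t my-image .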

Python package not installable in docker container

I have a basic Python Dockerfile like this:
FROM python:3.8
RUN pip install --upgrade pip
EXPOSE 8000
ENV PYTHONDONTWRITEBYTECODE=1
# Turns off buffering for easier container logging
ENV PYTHONUNBUFFERED=1
# Install pip requirements
COPY requirements.txt .
RUN python -m pip install -r requirements.txt
WORKDIR /app
COPY . /app
RUN useradd appuser && chown -R appuser /app
USER appuser
CMD ["gunicorn", "--bind", "0.0.0.0:8000", "app:app"]
I want to run my Flask application in a Docker container using this definition file. Locally I can create a new virtual env and install everything via pip install -r requirements.txt on Python 3.8 without any failure.
When building the Docker image, however, it fails to install some packages from requirements.txt. For example, this package fails:
ERROR: Could not find a version that satisfies the requirement cvxopt==1.2.5.post1
ERROR: No matching distribution found for cvxopt==1.2.5.post1
When I comment out the package in requirements.txt, everything seems to work. The package itself claims to be compatible with Python >2.7. The same happens with the package pywin32==228.
Looking at the wheel files in the package, cvxopt 1.2.5.post1 only contains builds for Windows. For Linux (such as the Docker container), you should use cvxopt 1.2.5.
You should replace the version with 1.2.5 (pip install cvxopt==1.2.5)
The latest version, cvxopt 1.2.5.post1, does not ship wheels for all platforms: https://pypi.org/project/cvxopt/1.2.5.post1/#files
The previous release covers many more platforms and should be able to run in your Docker image: https://pypi.org/project/cvxopt/1.2.5/#files
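A minimal sketch of the corresponding requirements.txt change; the environment marker for pywin32 is a suggestion, since pywin32 is Windows-only and will likewise fail to install on Linux:
cvxopt==1.2.5
pywin32==228; sys_platform == "win32"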

pipenv --system option for docker. What is the suggested way to get all the python packages in docker

I use pipenv for my django app.
$ mkdir djangoapp && cd djangoapp
$ pipenv install django==2.1
$ pipenv shell
(djangoapp) $ django-admin startproject example_project .
(djangoapp) $ python manage.py runserver
Now I am shifting to a Docker environment.
As per my understanding, pipenv only installs packages inside a virtualenv.
You don't need a virtual env inside a container; a Docker container IS a virtual environment in itself.
Later, after going through many Dockerfiles, I found the --system option to install into the system.
For example, I found the following:
https://testdriven.io/blog/dockerizing-django-with-postgres-gunicorn-and-nginx/
COPY ./Pipfile /usr/src/app/Pipfile
RUN pipenv install --skip-lock --system --dev
https://hub.docker.com/r/kennethreitz/pipenv/dockerfile
# -- Install dependencies:
ONBUILD RUN set -ex && pipenv install --deploy --system
https://wsvincent.com/beginners-guide-to-docker/
# Set work directory
WORKDIR /code
# Copy Pipfile
COPY Pipfile /code
# Install dependencies
RUN pip install pipenv
RUN pipenv install --system
So is --system alone sufficient, or is --deploy --system the better way? And --skip-lock --system --dev is different again.
Can someone guide me on how to get my environment back in Docker?
A typical Docker deployment would involve having a requirements.txt file (a file where you store your pip dependencies, including Django itself), and then in your Dockerfile you do something like:
# or whatever Python version you need
FROM python:3.7
ADD requirements.txt /code/
WORKDIR /code
# install your Python dependencies
RUN pip install -r requirements.txt
# run Django
CMD [ "python", "./manage.py", "runserver", "0.0.0.0:8000"]
You don't need pipenv here at all, since, as you say, you no longer have a virtual environment.
Even better, you can configure a lot of that in a docker-compose.yml file and then use docker-compose to run and manage your services, not just Django.
Docker has a very good tutorial on dockerising Django. And if you're unsure what's going on in the Dockerfile itself, check the manual.
Whether in a Docker image, a CI pipeline, a production server, or even on your development workstation: you should always include the --deploy flag in your installs, unless you want to potentially relock all dependencies, e.g. while evolving your requirements. It checks that the lockfile is up to date and will never install anything that is not listed there.
As for the --system flag, you'd better drop it. There is no real harm in using a virtual environment inside Docker images, and there are some subtle benefits. See this comment by #anishtain4. Pipenv now recommends against system-wide installs: https://github.com/pypa/pipenv/pull/2762.
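A minimal sketch of a pipenv-based Dockerfile that keeps the virtualenv and uses --deploy; the paths, image tag and run command are assumptions:
FROM python:3.7
WORKDIR /code
RUN pip install pipenv
COPY Pipfile Pipfile.lock /code/
# Keep the virtualenv inside the project directory so it is easy to locate
ENV PIPENV_VENV_IN_PROJECT=1
# --deploy fails the build if Pipfile.lock is out of date
RUN pipenv install --deploy
COPY . /code
CMD ["pipenv", "run", "python", "manage.py", "runserver", "0.0.0.0:8000"]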
