I created a docker container with a python script. The python script takes an input file, does some processing and saves output file at some specified location.
docker run /app/script.py --input /data/input.csv --output /data/output.csv
Since the input file can be different every time I run the script, I want it to be outside the docker container. I also would like to store the output somewhere outside the container.
docker run /app/script.py --input /my/local/location/outside/docker/input.csv --output /my/local/location/outside/docker/output.csv
Does docker support this? If so, how would one be able to achieve it?
My Dockerfile looks like the following:
FROM phusion/baseimage
RUN apt-get update
RUN apt-get install -y build-essential
RUN apt-get install -y python-dev
RUN apt-get install -y python-pip
RUN apt-get install -y python-numpy && \
apt-get install -y python-scipy
COPY ./requirements.txt /app/requirements.txt
COPY ./src/script.py /app/script.py
WORKDIR /app
COPY . /app
You could mount a directory with the file inside as a Docker data volume using the -v option: https://docs.docker.com/engine/tutorials/dockervolumes/
docker run -d -P --name myapp -v /my/local/location:/data mydir/app python script.py
This will have the added benefit of allowing you to stop the container, make changes to the file, and start the container and see the change reflected within the container.
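For the use case in the question, a minimal sketch could look like this (mydir/app stands for whatever your image is called). The host directory is bind-mounted at /data inside the container, so the script reads its input from, and writes its output to, a location outside the container:
docker run --rm -v /my/local/location/outside/docker:/data mydir/app python /app/script.py --input /data/input.csv --output /data/output.csv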
So you should add to your Dockerfile a line like
ENTRYPOINT ["python", "/app/script.py"]
and a default argument list such as
CMD ["--input", "/data/input.csv", "--output", "/data/output.csv"]
or something similar.
Read "What is the difference between CMD and ENTRYPOINT in a Dockerfile?" and the docs about ENTRYPOINT and CMD:
https://docs.docker.com/engine/reference/builder/#entrypoint
https://docs.docker.com/engine/reference/builder/#cmd
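Putting the two together, a rough sketch (the default paths baked into CMD above are assumptions): running the image with no extra arguments uses the CMD defaults, and anything passed after the image name replaces CMD while keeping the ENTRYPOINT:
docker run --rm -v /my/local/location/outside/docker:/data mydir/app
docker run --rm -v /my/local/location/outside/docker:/data mydir/app --input /data/other.csv --output /data/other_output.csv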
When building a Dockerfile, I get the error:
/bin/sh: 1: apt-get: not found
docker file:
FROM python:3.8
FROM ubuntu:20.04
ENV PATH="/env/bin/activate"
RUN apt-get update -y && apt-get upgrade -y
WORKDIR /var/www/html/
COPY . .
RUN pip install -r requirements.txt
EXPOSE 8000
CMD ["python", "manage.py"]
You are setting PATH to /env/bin/activate, so that is then the only place where apt-get is searched for. There is no need to activate a virtual env inside the container; just get rid of that line. pip can install the packages in requirements.txt into the "system" Python without issues.
You cannot layer 2 images like you are attempting to do with multiple FROM statements. Just use FROM python:3.8 and drop the ubuntu one. Multiple FROM statements are used in multi-stage builds, where intermediate images produce artifacts that are copied into the final image.
So just do:
FROM python:3.8
RUN apt-get update -y && apt-get upgrade -y
WORKDIR /var/www/html/
COPY . .
RUN pip install -r requirements.txt
EXPOSE 8000
CMD ["python", "manage.py"]
Although why you would put Python code in /var/www/html beats me; you probably don't need to.
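For contrast, a minimal sketch of what multiple FROM statements are actually for, i.e. a multi-stage build (the stage name, paths and final module are illustrative, and it assumes the project can be built into a wheel):
FROM python:3.8 AS build
WORKDIR /src
COPY . .
# Build the project into a wheel; this is the artifact handed to the final stage
RUN pip wheel --no-deps -w /dist .

FROM python:3.8-slim
# Only the built artifact is copied; the build tooling stays behind in the build stage
COPY --from=build /dist/*.whl /tmp/
RUN pip install --no-cache-dir /tmp/*.whl
CMD ["python", "-m", "myapp"]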
I have a Flask API that connects to an Azure SQL database, deployed on Azure App Service in a Docker Image.
It works fine but I am trying to keep consistency between my development, staging and production environments using Alembic/Flask-Migrate to apply database upgrades.
I saw on Miguel Grinberg's Docker Deployment Tutorial, that this can be achieved by adding the flask db upgrade command to a boot.sh script, like so:
#!/bin/sh
flask db upgrade
exec gunicorn -w 4 -b :5000 --access-logfile - --error-logfile - app:app
My problem is that, when running the boot.sh script, I receive the error:
Usage: flask db [OPTIONS] COMMAND [ARGS]...
Try 'flask db --help' for help.
Error: No such command 'upgrade'.
Which indicates the script cannot find the Flask-Migrate library. This actually happens if I try other site-packages, such as just trying to run flask commands.
The weird thing is:
gunicorn works just fine
The API works just fine
I can run flask db upgrade with no problem if I fire up the container and open a terminal session with docker exec -i -t api /bin/sh
Obviously, there's a problem with my Dockerfile. I would massively appreciate any help here as I'm relatively new to Docker and Linux so I'm sure I'm missing something obvious:
EDIT: It also works just fine if I add the following line to my Dockerfile, just before the entrypoint CMD:
RUN flask db upgrade
Dockerfile
FROM python:3.8-alpine
# Dependencies for pyodbc on Linux
RUN apk update
RUN apk add curl sudo build-base unixodbc-dev unixodbc freetds-dev
RUN apk add gcc musl-dev libffi-dev openssl-dev
RUN apk add --no-cache tzdata
RUN rm -rf /var/cache/apk/*
RUN curl -O https://download.microsoft.com/download/e/4/e/e4e67866-dffd-428c-aac7-8d28ddafb39b/msodbcsql17_17.5.2.2-1_amd64.apk
RUN sudo apk add --allow-untrusted msodbcsql17_17.5.2.2-1_amd64.apk
RUN mkdir /code
WORKDIR /code
COPY requirements.txt requirements.txt
RUN python -m pip install --default-timeout=100 -r requirements.txt
RUN python -m pip install gunicorn
ADD . /code/
COPY boot.sh /usr/local/bin/
RUN chmod u+x /usr/local/bin/boot.sh
EXPOSE 5000
ENTRYPOINT ["sh", "boot.sh"]
I ended up making some major changes to my Dockerfile and boot.sh script. I'll share these as best I can below:
Problem 1: Entrypoint script cannot access directories
My main issue was that I had an inconsistent folder structure in my directory. There were 2 boot.sh scripts and the one being run on entrypoint either had the wrong permissions or was in the wrong place to find my site packages.
I simplified the copying of files from my local machine to the Docker image like so:
RUN mkdir /code
WORKDIR /code
COPY requirements.txt requirements.txt
RUN python -m venv venv
RUN venv/bin/pip install --default-timeout=100 -r requirements.txt
RUN venv/bin/pip install gunicorn
COPY app app
COPY migrations migrations
COPY api.py config.py boot.sh ./
RUN chmod u+x boot.sh
EXPOSE 5000
ENTRYPOINT ["./boot.sh"]
The changes involved:
Setting up a virtualenv and installing all site packages in there
Making sure the config.py, boot.sh, and api.py files were in the root directory of the application folder (./)
Changing the entrypoint command from ["bin/sh", "boot.sh"] to just ["./boot.sh"]
Moving migrations files into the relevant folder for the upgrade script
I was then able to activate the virtual environment in the entrypoint file and run the flask db upgrade command (NB: I had a problem with the line endings in boot.sh being CRLF instead of LF, so make sure to convert them if you are on Windows; see the sketch after the script):
#!/bin/bash
source venv/bin/activate
flask db upgrade
exec gunicorn -w 4 -b :5000 --access-logfile - --error-logfile - api:app
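A quick sketch for the CRLF issue mentioned above, assuming GNU sed on the build machine (dos2unix or your editor's line-ending setting work just as well):
# Strip carriage returns so the shell actually finds the interpreter and commands
sed -i 's/\r$//' boot.sh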
Problem 2: Alpine Linux Too Slow
My other issue was that my image was taking forever to build (upwards of 45 mins) on Alpine Linux. Turns out this is a pretty well-established issue when using some of the libraries in my API (Pandas, Numpy).
I switched to a Debian build so that I could make changes to my Docker image more quickly.
Including the installation of pyodbc to connect to Azure SQL Server, the first half of my Dockerfile now looks like:
FROM python:3.8-slim-buster
RUN apt-get update
RUN apt-get install -y apt-utils curl sudo gcc g++ gnupg2
RUN curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
RUN curl https://packages.microsoft.com/config/debian/10/prod.list > /etc/apt/sources.list.d/mssql-release.list
RUN apt-get install -y libffi-dev libgssapi-krb5-2 unixodbc-dev unixodbc freetds-dev
RUN sudo apt-get update
RUN sudo ACCEPT_EULA=Y apt-get install msodbcsql17
RUN apt-get clean -y
The curl commands and everything below them come from the official MS docs on installing pyodbc on Debian.
Full Dockerfile:
FROM python:3.8-slim-buster
RUN apt-get update
RUN apt-get install -y apt-utils curl sudo gcc g++ gnupg2
RUN curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
RUN curl https://packages.microsoft.com/config/debian/10/prod.list > /etc/apt/sources.list.d/mssql-release.list
RUN apt-get install -y libffi-dev libgssapi-krb5-2 unixodbc-dev unixodbc freetds-dev
RUN sudo apt-get update
RUN sudo ACCEPT_EULA=Y apt-get install msodbcsql17
RUN apt-get clean -y
RUN mkdir /code
WORKDIR /code
COPY requirements.txt requirements.txt
RUN python -m venv venv
RUN venv/bin/pip install --default-timeout=100 -r requirements.txt
RUN venv/bin/pip install gunicorn
COPY app app
COPY migrations migrations
COPY api.py config.py boot.sh ./
RUN chmod u+x boot.sh
EXPOSE 5000
ENTRYPOINT ["./boot.sh"]
I think this is the key information.
Which indicates the script cannot find the Flask-Migrate library. This actually happens if I try other site-packages, such as just trying to run flask commands.
To me this may indicate that the problem is not specific to Flask-Migrate but applies to all packages, as you write. This may mean one of the following two things.
First, it can mean that the packages are not correctly installed. However, this is unlikely as you write that it works when you manually start the container.
Second, something is wrong with how you execute your boot.sh script. For example, try changing
ENTRYPOINT ["sh", "boot.sh"]
to
ENTRYPOINT ["/bin/sh", "boot.sh"]
HTH!
My Dockerfile is:
FROM ubuntu:18.04
RUN apt-get -y update
RUN apt-get install -y software-properties-common
RUN add-apt-repository ppa:deadsnakes/ppa
RUN apt-get update -y
RUN apt-get install -y python3.7 build-essential python3-pip
ENV LC_ALL C.UTF-8
ENV LANG C.UTF-8
RUN pip3 install pipenv
COPY . /app
WORKDIR /app
RUN pipenv install
EXPOSE 5000
CMD ["pipenv", "run", "python3", "application.py"]
When I do docker build -t flask-sample:latest ., it builds fine (I think).
I run it with docker run -d -p 5000:5000 flask-sample and it looks okay.
But when I go to http://localhost:5000, nothing loads. What am I doing wrong?
Why do you need a virtual environment? And why use Ubuntu as the base image?
A simpler approach would be:
Dockerfile:
FROM python:3
WORKDIR /usr/src/
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt
COPY app.py .
ENTRYPOINT FLASK_APP=/usr/src/app.py flask run --host=0.0.0.0
Put the desired packages (e.g. flask) in your requirements.txt.
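For example, a minimal requirements.txt for this sketch could contain nothing more than:
flask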
Build image:
docker build -t dejdej/flasky:latest .
Start container:
docker run -it -p 5000:5000 dejdej/flasky
If it is mandatory to use a virtual environment, you can try it with virtualenv:
FROM python:2.7
RUN pip install virtualenv
RUN virtualenv /YOURENV
RUN /YOURENV/bin/pip install flask
COPY application.py .
CMD ["/YOURENV/bin/python", "application.py"]
Short answer:
Your container is running pipenv, not your application. You need to fix the last line.
CMD ["pipenv", "run", "python3", "application.py"] should be only CMD ["python3", "application.py"]
Right answer:
I completely agree that there isn't any reason to use pipenv. The better solution is to change your Dockerfile to use a Python image and forget pipenv. You are already in a container; there is no reason to use a virtual environment.
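A rough sketch of that, assuming you export the Pipfile dependencies into a requirements.txt (the slim tag and file names are just one possible choice):
FROM python:3.7-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 5000
CMD ["python3", "application.py"]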
I have made my first Docker container, and it works as per the Dockerfile.
FROM python:3.5-slim
RUN apt-get update && \
apt-get -y install gcc mono-mcs && \
apt-get -y install vim && \
apt-get -y install nano && \
rm -rf /var/lib/apt/lists/*
RUN mkdir -p /statics/js
VOLUME ["/statics/"]
WORKDIR /statics/js
COPY requirements.txt /opt/requirements.txt
RUN pip install -r /opt/requirements.txt
EXPOSE 8080
CMD ["python", "/statics/js/app.py"]
after running this command:
docker run -it -p 8080:8080 -v ~/Development/my-Docker-builds/pythonReact/statics/:/statics/ -d ciasto/pythonreact:v2
and when I open the page localhost:8080 I get the error:
A server error occurred. Please contact the administrator.
But if I run the application normally, i.e. not containerised, directly on my host machine, it works fine.
So I want to know what is causing the server error. How do I debug a Python app that runs in a container to find out why it does not work, or what I am doing wrong?
Mainly, this:
config.paths['static_files'] = 'statics'
Should be:
config.paths['static_files'] = '/statics'
I've got your application up and running with your 'Hello World'.
I made these changes:
1) The mentioned config.paths['static_files'] = '/statics'
2) This Dockerfile (removed VOLUME)
FROM python:3.5-slim
RUN apt-get update && \
apt-get -y install gcc mono-mcs && \
apt-get -y install vim && \
apt-get -y install nano && \
rm -rf /var/lib/apt/lists/*
COPY requirements.txt /opt/requirements.txt
RUN pip install -r /opt/requirements.txt
COPY ./statics/ /statics/
COPY app.py /app/app.py
WORKDIR /statics/js
EXPOSE 8080
CMD ["python", "/app/app.py"]
3) Moved the non-static app.py to a proper place: root of the project.
4) Run with: docker build . -t pyapp, then docker run -p 8080:8080 -it pyapp
You should see Serving on port 8080... in the terminal output, and Hello World in the browser.
I've forked your GitHub project and opened a pull request.
Edit:
If you need to make changes while you develop, run the container with a volume to override the app that is packed into the image. For example:
docker run -v $(pwd)/statics/js:/statics/js -p 8080:8080 -it pyapp
You can have as many volumes as you want, but the app is already packed in the image and ready to push somewhere.
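For instance, with a second (hypothetical) directory mounted alongside the first:
docker run -v $(pwd)/statics/js:/statics/js -v $(pwd)/statics/css:/statics/css -p 8080:8080 -it pyapp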
You can use pdb to debug Python code in CLI. To achieve this, you just have to import pdb and call pdb.set_trace() where you would like to have a breakpoint in your Python code. Basically you have to insert the following line where you want a breakpoint:
import pdb; pdb.set_trace()
Then you have to run your Python code interactively.
You could do that by running bash interactively in your container with
docker run -it -p 8080:8080 -v ~/Development/my-Docker-builds/pythonReact/statics/:/statics/ ciasto/pythonreact:v2 /bin/bash
and then running manually your app with
root@5910f24d0d8a:/statics/js# python /statics/js/app.py
When the code reaches the breakpoint, it will pause and show a prompt where you can type commands to inspect the execution.
For more detail about the available commands, take a look at the pdb commands documentation.
Also, I noted that you are building your image from the python:3.5-slim base image, which is a (very) light Python image that does not include everything commonly found in a Python distribution.
From the Python images page:
This image does not contain the common packages contained in the default tag and only contains the minimal packages needed to run python. Unless you are working in an environment where only the python image will be deployed and you have space constraints, we highly recommend using the default image of this repository.
Maybe using the standard python:3.5 image instead would solve your issue.
As a quick tip for debugging containerized applications: if your application fails and the container crashes or stops, just launch the container image with the CMD/ENTRYPOINT set to /bin/bash, then start the application manually. Once you have a shell in the container, you can debug the application as on a normal Linux system. CMD is straightforward to override; for ENTRYPOINT, use the --entrypoint flag with the docker run command.
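For example, with a placeholder image name:
# Everything after the image name overrides CMD
docker run -it myimage /bin/bash
# ENTRYPOINT has to be overridden explicitly
docker run -it --entrypoint /bin/bash myimage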
I am working on setting up a dockerised selenium grid. I can send my python tests [run with pytest] from a pytest container [see below] by attaching to it.
But I have set up another LAMP container that is going to control pytest.
So I want to make the pytest container standalone, running idle and waiting for commands from the LAMP container.
I have this Dockerfile:
# Starting from base image
FROM ubuntu
#-----------------------------------------------------
# Set the Github personal token
ENV GH_TOKEN blablabla
# Install Python & pip
RUN apt-get update
RUN apt-get upgrade -y
RUN apt-get install -y python python-pip python-dev && pip install --upgrade pip
# Install nano for #debugging
RUN apt-get install -y nano
# Install xvfb
RUN apt-get install -y xvfb
# Install GIT
RUN apt-get update -y && apt-get install git -y
# [in the / folder]
RUN git clone https://$GH_TOKEN:x-oauth-basic@github.com/user/project.git /project
# Install dependencies via pip
WORKDIR /project
RUN pip install -r dependencies.txt
#-----------------------------------------------------
#
CMD ["/bin/bash"]
I start the pytest container manually [for development] with this:
docker run -dit -v /project --name pytest repo/user:py
The thing is that I finished development and I want to have the pytest container launched from docker-compose and connect it to other containers [with link and volume].
I just cannot make it stay up.
I used this :
pytest:
image: repo/user:py
volumes:
- "/project"
command: "/bin/bash tail -f /dev/null"
but it didn't work.
So, inside the Dockerfile, should I use a specific CMD or ENTRYPOINT?
Should I use some command from the docker-compose file?
I just enabled this on one of my projects recently. I use a multi-stage build. At present I put tests in the same folder as the source (test_*.py). From my experience this doesn't feel natural; I prefer tests to be in their own folder that is excluded by default.
FROM python:3.7.6 AS build
WORKDIR /app
COPY requirements.txt .
RUN pip3 install --compile -r requirements.txt && rm -rf /root/.cache
COPY src /app
# TODO precompile
# Build stage test - run tests
FROM build AS test
RUN pip3 install pytest pytest-cov && rm -rf /root/.cache
RUN pytest --doctest-modules \
--junitxml=xunit-reports/xunit-result-all.xml \
--cov \
--cov-report=xml:coverage-reports/coverage.xml \
--cov-report=html:coverage-reports/
# Build stage 3 - Complete the build setting the executable
FROM build AS final
CMD [ "python", "./service.py" ]
In order to exclude the test files from coverage, a .coveragerc must be present:
[run]
omit = test_*
The test target runs the required tests and generates coverage and execution reports. These are NOT directly suitable for Azure DevOps and SonarQube; to make them suitable:
sed -i~ 's#/app#$(Build.SourcesDirectory)/app#' $(Pipeline.Workspace)/b/coverage-reports/coverage.xml
To run the tests:
#!/usr/bin/env bash
set -e
DOCKER_BUILDKIT=1 docker build . --target test --progress plain
I am not exactly sure how your tests execute, but I think I have a similar use case. You can see how I do this in my Envoy project in cmd.sh, and a sample test.
Here is how I run my tests. I'm using pytest as well, but that's not important:
1. Use docker-compose to bring up the stack, without the tests.
2. Wait for the stack to be ready for requests; for me this means polling for a 200 response.
3. Run the test container separately, but make sure it uses the same network as the compose stack.
This can be done in several ways. You can put this all in a Bash script and control it all from your host; a sketch of that approach follows below.
In my case I do all of this from a Python container. It's a little hard to wrap your head around, but the idea is that there is a Python test container which the host starts. That container then uses compose to bring up the stack back on the host (dockerception). Then, in the test container, we run the pytest tests. When it's done, it composes the stack down and propagates the return code.
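A minimal sketch of the Bash-script variant (the health-check URL and the network name are assumptions; docker-compose usually names the default network <project>_default):
#!/usr/bin/env bash
set -e
# 1. Bring up the stack, without the test container
docker-compose up -d
# 2. Wait until the app answers with HTTP 200
until curl -fsS http://localhost:5000/ > /dev/null; do
  sleep 2
done
# 3. Run the tests on the same network as the compose stack
docker run --rm --network myproject_default repo/user:py pytest
# Tear the stack down again
docker-compose down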
First, get the list of your images with "docker images".
Then check the list and make sure your image exists.
Then run your image with "docker run".
Do not forget this note: you need a CMD in your Dockerfile to run pytest inside the container.
My Dockerfile:
FROM python:3.6-slim
COPY . /python-test-calculator
WORKDIR /python-test-calculator
RUN pip freeze > requirements.txt
RUN pip install --no-cache-dir -r requirements.txt
RUN mkdir reports
# Only the last CMD takes effect, so run pytest directly
CMD ["python", "-m", "pytest", "--junitxml=reports/result.xml"]