install ffmpeg on amazon ecr linux python


I'm trying to install ffmpeg in a Docker image for an AWS Lambda function.
The Dockerfile is:
FROM public.ecr.aws/lambda/python:3.8
# Copy function code
COPY app.py ${LAMBDA_TASK_ROOT}
# Install the function's dependencies using file requirements.txt
# from your project folder.
COPY requirements.txt .
RUN yum install gcc -y
RUN pip3 install -r requirements.txt --target "${LAMBDA_TASK_ROOT}"
RUN yum install -y ffmpeg
# Set the CMD to your handler (could also be done as a parameter override outside of the Dockerfile)
CMD [ "app.handler" ]
I am getting an error:
> [6/6] RUN yum install -y ffmpeg:
#9 0.538 Loaded plugins: ovl
#9 1.814 No package ffmpeg available.
#9 1.843 Error: Nothing to do

Since the ffmpeg package is not available through the yum package manager, I installed ffmpeg manually and made it part of the container. Here are the steps:
Downloaded the static build from here (the build for the public.ecr.aws/lambda/python:3.8 image is ffmpeg-release-amd64-static.tar.xz).
Here is a bit more info on the topic.
Manually unarchived it in the root folder of my project (where my Dockerfile and app.py files are). I use a CodeCommit repo, but this is not mandatory of course.
Added the following line to my Dockerfile:
COPY ffmpeg-5.1.1-amd64-static /usr/local/bin/ffmpeg
In requirements.txt I added the following line (so that the Python package manager installs the ffmpeg-python package):
ffmpeg-python
And here is how I use it in my python code:
import ffmpeg
...
process1 = (ffmpeg
.input(sourceFilePath)
.output("pipe:", format="s16le", acodec="pcm_s16le", ac=1, ar="16k", loglevel="quiet")
.run_async(pipe_stdout=True, cmd=r"/usr/local/bin/ffmpeg/ffmpeg")
)
Note that for this to work, I needed to pass the cmd argument to the run method (or run_async in my case) with the location of the ffmpeg executable.
I was able to build the container, and ffmpeg is working properly for me.
FROM public.ecr.aws/lambda/python:3.8
# Copy function code
COPY app.py ${LAMBDA_TASK_ROOT}
COPY input_files ./input_files
COPY ffmpeg-5.1.1-amd64-static /usr/local/bin/ffmpeg
RUN chmod 777 -R /usr/local/bin/ffmpeg
# requirements.txt must be copied into the image before pip can read it
COPY requirements.txt .
RUN pip3 install -r requirements.txt --target "${LAMBDA_TASK_ROOT}"
# Set the CMD to your handler (could also be done as a parameter override outside of the Dockerfile)
CMD [ "app.lambda_handler" ]
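The need to pass an explicit cmd path can be softened a little by resolving the executable at startup. The following is only a sketch: the helper name resolve_ffmpeg and the DEFAULT_FFMPEG constant are hypothetical, and the fallback path simply mirrors the location used in the Dockerfile above.

```python
import os
import shutil

# Assumption: this mirrors the COPY destination used in the Dockerfile above.
DEFAULT_FFMPEG = "/usr/local/bin/ffmpeg/ffmpeg"


def resolve_ffmpeg(fallback=DEFAULT_FFMPEG):
    """Return a path to an ffmpeg executable, or raise if none is found."""
    # Prefer an executable named "ffmpeg" on PATH, if there is one.
    found = shutil.which("ffmpeg")
    if found:
        return found
    # Otherwise use the explicit location, provided it is an executable file.
    if os.path.isfile(fallback) and os.access(fallback, os.X_OK):
        return fallback
    raise FileNotFoundError("ffmpeg not found on PATH or at " + fallback)
```

The returned value could then be passed as the cmd argument of run or run_async, so the same code works whether ffmpeg is on PATH or only at the copied location.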

Related

Deploying multi-stage Docker image in Google Cloud Run

I created a multi-stage Docker image to reduce the size of my python application. The content of the Dockerfile is:
FROM python:3.10-slim AS image-compiling
# updates pip and prepare for installing wheels
RUN pip install --upgrade pip && \
pip install wheel && \
pip cache purge
# install needed modules from wheels
RUN pip install --user \
Flask==2.2.2 Flask-RESTful==0.3.9 Flask-Cors==3.0.10 keras==2.9.0 && \
pip cache purge
# brand new image
FROM python:3.10-slim AS image-deploy
# get python 'user environment'
COPY --from=image-compiling /root/.local /root/.local
# get app content
COPY app ./app
# define working directory and the standard command (run it detached)
WORKDIR /app
ENTRYPOINT ["python"]
CMD ["wsgi.py"]
Then I pushed the image my_image to the Google Container Registry of my project my_proj with:
$ docker push us.gcr.io/my_proj/my_image
When I try to start a Google Cloud Run service with this image, it fails with the Python error:
Error 2022-09-14 11:41:13.744 EDT Traceback (most recent call last):
  File "/app/wsgi.py", line 10, in <module>
    from flask import render_template
ModuleNotFoundError: No module named 'flask'
Warning 2022-09-14 11:41:13.805 EDT Container called exit(1).
But flask is installed in the image, and a container created locally from this image runs normally.
Why is the container in Google Cloud Run not able to find the flask module?
Is the image not created completely?

Cause of the problem:
The problem in this case is not that the image was created using the multi-stage approach. The issue is that the Python modules were installed using the --user option.
This option was included so that the needed Python modules were stored in the folder /root/.local, which could be copied directly to the new deploy image.
However, Python applications in containers in Google Cloud Run are not able to find the modules if they are in the folder /root/.local. Python derives the user site-packages directory from the home directory of the running user, so the likely cause is that the container there does not run with /root as its home directory (for example, because a user other than root is used). Thus, using the --user argument of pip does not work, which also rules out this multi-stage approach in this type of scenario.
Solution:
Build a single-stage image with globally installed Python modules, i.e., a Dockerfile like:
FROM python:3.10-slim
# updates pip and prepare for installing wheels
RUN pip install --upgrade pip && \
pip install wheel && \
pip cache purge
# install needed modules from wheels
RUN pip install \
Flask==2.2.2 Flask-RESTful==0.3.9 Flask-Cors==3.0.10 keras==2.9.0 && \
pip cache purge
# get app content
COPY app ./app
# define working directory and the standard command (run it detached)
WORKDIR /app
ENTRYPOINT ["python"]
CMD ["wsgi.py"]
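To check where Python actually looks for --user installs in a given environment, a small diagnostic can be run at container start. This is a sketch rather than part of the original answer; the function name describe_module_search is hypothetical, and it only uses the standard site, sys and os modules.

```python
import os
import site
import sys


def describe_module_search():
    """Collect the values that decide whether --user installs are visible."""
    user_site = site.getusersitepackages()
    return {
        "home": os.path.expanduser("~"),
        "user_site": user_site,
        "user_site_enabled": bool(site.ENABLE_USER_SITE),
        "user_site_on_path": user_site in sys.path,
    }


if __name__ == "__main__":
    for key, value in describe_module_search().items():
        print(f"{key}: {value}")
```

If user_site does not point inside /root/.local in the deployed container, the modules copied there will not be found, which would be consistent with the behaviour described above.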

can't find python packages installed into customized docker image

I am creating a Docker container that runs Python 3.6.15. The pip install step in my Dockerfile runs during the build, but when I run the container after the build completes and try to execute functions within it, the 'installed' packages do not exist.
For more context, here is my Dockerfile. I am building a Docker image that is uploaded to AWS ECR to be used in a Lambda function, but I don't think that's entirely relevant to this question (good for context though):
# Define function directory
ARG FUNCTION_DIR="/function"
FROM python:3.6 as build-image
# Install aws-lambda-cpp build dependencies
RUN apt-get clean && apt-get update && \
apt-get install -y \
g++ \
make \
cmake \
unzip \
libcurl4-openssl-dev \
ffmpeg
# Include global arg in this stage of the build
ARG FUNCTION_DIR
# Create function directory
RUN mkdir -p ${FUNCTION_DIR}
# Copy function code
COPY . ${FUNCTION_DIR}
# Install the runtime interface client
RUN /usr/local/bin/python -m pip install \
--target ${FUNCTION_DIR} \
awslambdaric
# Install the function's dependencies
COPY requirements.txt /requirements.txt
RUN /usr/local/bin/python -m pip install -r requirements.txt
# Multi-stage build: grab a fresh copy of the base image
FROM python:3.6
# Include global arg in this stage of the build
ARG FUNCTION_DIR
# Set working directory to function root directory
WORKDIR ${FUNCTION_DIR}
# Copy in the build image dependencies
COPY --from=build-image ${FUNCTION_DIR} ${FUNCTION_DIR}
COPY entry-point.sh /entry_script.sh
ADD aws-lambda-rie /usr/local/bin/aws-lambda-rie
ENTRYPOINT [ "/entry_script.sh" ]
CMD [ "app.handler" ]
When I run my docker run command in Terminal, I can see that it is collecting and installing the packages from the requirements.txt file that is in my project's root. I then try to run it and get an Import Module error. To troubleshoot, I ran some exec commands such as:
docker exec <container-id> bash -c "ls" # This returns the folder structure which looks great
docker exec <container-id> bash -c "pip freeze" # This only returns 'pip', 'wheel' and some other basic Python modules
The only way I could solve it was to run this command after building and running the container:
docker exec <container-id> bash -c "/usr/local/bin/python -m pip install -r requirements.txt"
This manually installs the modules, which then show up in the freeze command, and I can execute the code. This is not ideal, as I would like pip install to run correctly during the build process so there are fewer steps in the future as I make changes to the code.
Any pointers as to where I am going wrong would be great, thank you!

According to the Docker docs on multi-stage builds:
With multi-stage builds, you use multiple FROM statements in your
Dockerfile. Each FROM instruction can use a different base, and each
of them begins a new stage of the build. You can selectively copy
artifacts from one stage to another, leaving behind everything you
don’t want in the final image.
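So the second FROM python:3.6 in the Dockerfile starts a new stage from a fresh base image, which discards the module installations made in the first stage.
The subsequent COPY --from=build-image carries over what was in /function (the function code and the awslambdaric module), but not the other modules, which the second pip install saved to the system site-packages of the build stage. Installing the requirements with --target ${FUNCTION_DIR} as well would place them in the copied directory.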
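A quick way to confirm which dependencies survived into the final stage is to ask Python directly which modules it can locate. The helper below is a sketch; the name missing_modules is hypothetical, and it expects top-level import names (not the distribution names from requirements.txt, which can differ).

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Replace the list with the top-level modules your function imports.
    print(missing_modules(["json", "awslambdaric"]))
```

Running this inside the built container shows at a glance whether a module lives in a directory that was copied across stages.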

How to invalidate Dockerfile cache when pip installing from repo

I have a Dockerfile that needs to install the latest package code from a private git repo. However, because the Dockerfile/URL/commit doesn't change (I just follow the latest in master), Docker caches this step and won't pull the latest code.
I can disable build caching entirely, which fixes the issue, but this results in a slow build.
How can I force Docker not to use the cache for just this one command?
FROM tiangolo/uvicorn-gunicorn-fastapi:python3.7
COPY ./requirements.txt /app
RUN pip install --upgrade pip
RUN pip install -r requirements.txt
# This needs to be a separate step so that only its build cache is invalidated
RUN pip install -e git+https://TOKEN@github.com/user/private-package.git#egg=private_package
COPY ./main.py /app
COPY ./app /app/app

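Add
ARG foo=bar
before RUN pip install -e ... in your Dockerfile. A changed build argument value causes a cache miss for the RUN instructions that follow the ARG, while the earlier layers stay cached.
Then, in the script that runs docker build ..., add this parameter:
--build-arg foo="$(date +%s)"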
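If the build is driven from Python rather than a shell script, the same idea can be sketched as follows. The function name build_command and the tag "my-image" are hypothetical; the build argument name foo matches the ARG in the answer above, and the value is just a timestamp that differs on every call.

```python
import time


def build_command(tag, context=".", arg_name="foo"):
    """Return a docker build command whose build arg changes on every call."""
    # A nanosecond timestamp is enough to make the value unique per build.
    cache_bust = str(time.time_ns())
    return [
        "docker", "build",
        "--build-arg", f"{arg_name}={cache_bust}",
        "-t", tag,
        context,
    ]


# Example use (requires Docker to be installed):
#   import subprocess
#   subprocess.run(build_command("my-image"), check=True)
```

Passing the command as a list avoids shell quoting issues around the build argument.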

Rolling your own custom container docker image: how to create user defined command for local system GPU

I would like to use my computer's GPU to run a program. For that, I created a custom container following https://blog.softwaremill.com/setting-up-tensorflow-with-gpu-acceleration-the-quick-way-add80cd5c988 (please see the "Rolling your own custom container" section). I tried to use ENTRYPOINT ["mp"] to create a user-defined executable command 'mp'. However, I get the following error:
docker: Error response from daemon: OCI runtime create failed: container_linux.go:370: starting container process caused: exec: "mp": executable file not found in $PATH: unknown.
Using the mp command, I would like to run 'mp train'.
Link to my Dockerfile: https://github.com/sandeepsinghsengar/docker-file/blob/main/docker
I have also copied the Dockerfile's contents here.
FROM tensorflow/tensorflow:latest-gpu
# ^or just latest-gpu-jupyter if you need Jupyter
# My main Docker working directory is /tmp
WORKDIR /tmp
# Set desired Python version
ENV python_version 3.7
# Install desired Python version (the current TF image is based on Ubuntu at the moment)
RUN apt install -y python${python_version}
# Set default version for root user - modified version of this solution: https://jcutrer.com/linux/upgrade-python37-ubuntu1810
RUN update-alternatives --install /usr/local/bin/python python /usr/bin/python${python_version} 1
RUN /usr/local/bin/python -m pip install --upgrade pip
# Update pip: https://packaging.python.org/tutorials/installing-packages/#ensure-pip-setuptools-and-wheel-are-up-to-date
RUN python -m pip install --upgrade pip setuptools wheel
#RUN apt-get install python-six
# By copying over requirements first, we make sure that Docker will "cache"
# our installed requirements in a dedicated FS layer rather than reinstall
# them on every build
COPY requirements.txt requirements.txt
# Install the requirements
RUN python -m pip install -r requirements.txt
ENTRYPOINT ["mp"]
Normal Python commands run fine inside the container. Any suggestions would be highly appreciated.
Let me know if you need further information.

Add pip requirements to docker image in runtime

I want to be able to add some extra requirements to a Docker image I created myself. My strategy is to build the image from a Dockerfile with a CMD instruction that executes a "pip install -r" command at runtime, using a mounted volume.
This is my Dockerfile:
FROM ubuntu:14.04
RUN apt-get update
RUN apt-get install -y python-pip python-dev build-essential
RUN pip install --upgrade pip
WORKDIR /root
CMD ["pip install -r /root/sourceCode/requirements.txt"]
With that Dockerfile, I build the image:
sudo docker build -t test .
And finally I try to attach my new requirements using this command:
sudo docker run -v $(pwd)/sourceCode:/root/sourceCode -it test /bin/bash
My local folder "sourceCode" contains a valid requirements.txt file (it has only one line, with the value "gunicorn").
When I get the prompt, I can see that the requirements file is there, but if I execute a pip freeze command, the gunicorn package is not listed.
Why is the requirements.txt file attached correctly while the pip command is not working properly?

TLDR
The pip command isn't running because you are telling Docker to run /bin/bash instead.
docker run -v $(pwd)/sourceCode:/root/sourceCode -it test /bin/bash
                                                          ^ here
Longer explanation
The default ENTRYPOINT for a container is /bin/sh -c. You don't override that in the Dockerfile, so that remains. The default CMD instruction is probably nothing. You do override that in your Dockerfile. When you run (ignore the volume for brevity)
docker run -it test
what actually executes inside the container is
/bin/sh -c pip install -r /root/sourceCode/requirements.txt
Pretty straightforward: it looks like it will run pip when you start the container.
Now let's take a look at the command you used to start the container (again, ignoring volumes)
docker run -it test /bin/bash
what actually executes inside the container is
/bin/sh -c /bin/bash
The CMD arguments you specified in your Dockerfile get overridden by the COMMAND you specify on the command line. Recall that the docker run command takes this form
docker run [OPTIONS] IMAGE[:TAG|@DIGEST] [COMMAND] [ARG...]
Further reading
This answer has a really to-the-point explanation of what the CMD and ENTRYPOINT instructions do:
The ENTRYPOINT specifies a command that will always be executed when the container starts.
The CMD specifies arguments that will be fed to the ENTRYPOINT.
This blog post on the difference between the ENTRYPOINT and CMD instructions is also worth reading.
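The rule quoted above can be illustrated with a small model. This is a deliberately simplified sketch, not Docker's implementation: it covers only exec-form (JSON array) instructions, ignores the shell wrapping applied to shell-form instructions, and the function name effective_command is hypothetical.

```python
def effective_command(entrypoint, cmd, run_command=None):
    """Return the argv a container would start with (exec form only)."""
    # A COMMAND given to `docker run` replaces the image's CMD entirely.
    args = list(run_command) if run_command else list(cmd)
    # ENTRYPOINT, when set, is always kept and receives those arguments.
    return list(entrypoint) + args


pip_cmd = ["/usr/bin/pip", "install", "-r", "/root/sourceCode/requirements.txt"]

# No COMMAND on the command line: the image's CMD is used.
print(effective_command([], pip_cmd))
# COMMAND given on the command line: CMD is replaced, so pip never runs.
print(effective_command([], pip_cmd, ["/bin/bash"]))
```

In this model, appending /bin/bash to docker run is exactly what removes the pip invocation from the container's start command.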

You can change the last statement, i.e. CMD, to the one below, specifying the absolute path of pip:
CMD ["/usr/bin/pip", "install", "-r", "/root/sourceCode/requirements.txt"]
UPDATE: adding an additional answer based on the comments.
Note that if a customized image with additional requirements is needed, those requirements should be part of the image rather than installed at run time.
I used the base image below to test:
docker pull colstrom/python:legacy
So, packages should be installed using the RUN instruction of the Dockerfile.
CMD should be used for the app you actually want to run as a process inside the container.
First, check whether the base image has any pip packages by running the command below; it returns nothing.
docker run --rm --name=testpy colstrom/python:legacy /usr/bin/pip freeze
Here is a simple example to demonstrate this:
Dockerfile
FROM colstrom/python:legacy
COPY requirements.txt /requirements.txt
RUN ["/usr/bin/pip", "install", "-r", "/requirements.txt"]
CMD ["/usr/bin/pip", "freeze"]
requirements.txt
selenium
Build the image with the pip packages. Place the Dockerfile and the requirements.txt file in a fresh directory.
D:\dockers\py1>docker build -t pypiptest .
Sending build context to Docker daemon 3.072 kB
Step 1 : FROM colstrom/python:legacy
---> 640409fadf3d
Step 2 : COPY requirements.txt /requirements.txt
---> abbe03846376
Removing intermediate container c883642f06fb
Step 3 : RUN /usr/bin/pip install -r /requirements.txt
---> Running in 1987b5d47171
Collecting selenium (from -r /requirements.txt (line 1))
Downloading selenium-3.0.1-py2.py3-none-any.whl (913kB)
Installing collected packages: selenium
Successfully installed selenium-3.0.1
---> f0bc90e6ac94
Removing intermediate container 1987b5d47171
Step 4 : CMD /usr/bin/pip freeze
---> Running in 6c3435177a37
---> dc1925a4f36d
Removing intermediate container 6c3435177a37
Successfully built dc1925a4f36d
SECURITY WARNING: You are building a Docker image from Windows against a non-Windows Docker host. All files and directories added to build context will have '-rwxr-xr-x' permissions. It is recommended to double check and reset permissions for sensitive files and directories.
Now run the image.
If you do not pass any external command, the container takes its command from CMD, which just shows the list of pip packages; in this case, selenium.
D:\dockers\py1>docker run -itd --name testreq pypiptest
039972151eedbe388b50b2b4cd16af37b94e6d70febbcb5897ee58ef545b1435
D:\dockers\py1>docker logs testreq
selenium==3.0.1
So, the above shows that the package is installed successfully.
Hope this is helpful.

Using the concepts that @Rao and @ROMANARMY have explained in their answers, I finally found a way of doing what I wanted: adding extra Python requirements to a self-created Docker image.
My new Dockerfile is as follows:
FROM ubuntu:14.04
RUN apt-get update
RUN apt-get install -y python-pip python-dev build-essential
RUN pip install --upgrade pip
WORKDIR /root
COPY install_req.sh .
CMD ["/bin/bash" , "install_req.sh"]
As the first command, I've added the execution of a shell script with the following content:
#!/bin/bash
pip install -r /root/sourceCode/requirements.txt
pip freeze > /root/sourceCode/freeze.txt
And finally I build and run the image using these commands (note that docker run has no command at the end, so the CMD from the Dockerfile is executed):
docker build --tag test .
docker run -itd --name container_test -v $(pwd)/sourceCode:/root/sourceCode test
As I explained at the beginning of the post, I have a local folder named sourceCode that contains a valid requirements.txt file with only one line, "gunicorn".
So finally I have the ability to add some extra requirements (the gunicorn package in this example) to a given Docker image.
After building and running my experiment, if I check the logs (docker logs container_test) I see something like this:
Downloading gunicorn-19.6.0-py2.py3-none-any.whl (114kB)
100% |################################| 122kB 1.1MB/s
Installing collected packages: gunicorn
Furthermore, the container has created a freeze.txt file inside the mounted volume that contains all the installed pip packages, including the desired gunicorn:
chardet==2.0.1
colorama==0.2.5
gunicorn==19.6.0
html5lib==0.999
requests==2.2.1
six==1.5.2
urllib3==1.7.1
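Since freeze.txt ends up in the mounted volume, the host can check it programmatically. The parser below is a minimal sketch that handles only plain name==version lines (editable installs and URL requirements are skipped); the function name parse_freeze is hypothetical.

```python
def parse_freeze(text):
    """Map package name -> version for simple `name==version` lines."""
    versions = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not a pinned package.
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        versions[name.lower()] = version
    return versions


if __name__ == "__main__":
    with open("sourceCode/freeze.txt") as handle:
        installed = parse_freeze(handle.read())
    print("gunicorn" in installed)
```

The path sourceCode/freeze.txt assumes the script runs from the directory that contains the mounted sourceCode folder.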
Now I have other problems with the permissions of the newly created file, but that will probably be a separate post.
Thank you!
