Make SAM ignore requirements.txt - Python

So I am using AWS SAM to build and deploy some functions to AWS Lambda.
Because of my slow connection, uploading functions takes a very long time, so I decided to create a Layer with the requirements in it. That way, the next time I deploy a function I won't have to upload all 50 MB of requirements and can just use the already uploaded layer.
The problem is that I could not find any parameter that lets me ignore the requirements file and just deploy the source code.
Is it even possible?

I hope I understand your question correctly, but if you'd like to deploy a Lambda without any dependencies you can try two things:
not running sam build before running sam deploy
having an empty requirements.txt file. Then sam build simply does not include any dependencies for that Lambda function.
Of course, here I assume the layer is already present in AWS and is not included in the same template. If they are defined in the same template, you'd have to split them into two stacks: one with the layer, which can be deployed once, and one with the Lambda referencing that layer (see the sketch below).
Unfortunately, sam build has no flag to ignore requirements.txt as far as I know, since the core purpose of the command is to build dependencies.
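For illustration, the function can reference an already-published layer in template.yaml like this (the layer ARN, names, and region are placeholders, not values from the question):
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Resources:
  MyFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: src/                  # only the source code is packaged and uploaded
      Handler: app.lambda_handler
      Runtime: python3.8
      Layers:
        # ARN of the layer you already uploaded (placeholder value)
        - arn:aws:lambda:eu-west-1:123456789012:layer:my-requirements-layer:1
With an empty (or absent) requirements.txt next to the handler, sam build packages only your code, and the dependencies are resolved at runtime from the layer.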

For everyone using container images, this is the solution I have found. It drastically improves the workflow.
Dockerfile (the pip install step is skipped if requirements.txt is unchanged):
FROM public.ecr.aws/lambda/python:3.8 AS build
COPY requirements.txt ./
RUN python3.8 -m pip install -r requirements.txt -t .
COPY app.py ./
COPY model /opt/ml/model
CMD ["app.lambda_handler"]
What have I changed? The requirements are now copied and installed before app.py and the model, so Docker's layer cache reuses the pip install layer as long as requirements.txt is unchanged, and only the small source layers are rebuilt.
This was the default Dockerfile:
FROM public.ecr.aws/lambda/python:3.8
COPY app.py requirements.txt ./
COPY model /opt/ml/model
RUN python3.8 -m pip install -r requirements.txt -t .
CMD ["app.lambda_handler"]
This solution is based on https://stackoverflow.com/a/34399661/5723524

Related

How to use the tweepy Python library as an AWS Lambda layer?

I am following these instructions to create an AWS Lambda layer:
mkdir my-lambda-layer && cd my-lambda-layer
mkdir -p aws-layer/python/lib/python3.8/site-packages
pip3 install tweepy --target aws-layer/python/lib/python3.8/site-packages
cd aws-layer
I then zip the folder "python" (zip -r tweepy_layer.zip python/) and upload it to S3. This is what I see when I unzip the folder to double-check:
Unfortunately, I still get the following error, though the path should be the same as in the docs. I tried from both macOS and Ubuntu, though I do not think this plays a role for this particular library.
Essentially the problem turned out to be the cache. Yes, all those __pycache__ and .pyc files. Thanks to this other question I cleared the cache after installing the libraries by doing
pip3 install pyclean
pyclean .
After cleaning the cache, re-doing the zip, and uploading it to S3, the Lambda setup works perfectly.
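For reference, the whole sequence, with the cache cleanup folded in, looks roughly like this (same paths as above):
mkdir my-lambda-layer && cd my-lambda-layer
mkdir -p aws-layer/python/lib/python3.8/site-packages
pip3 install tweepy --target aws-layer/python/lib/python3.8/site-packages
pip3 install pyclean
pyclean aws-layer            # removes __pycache__ directories and .pyc files
cd aws-layer
zip -r tweepy_layer.zip python/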

Should I run tests during the docker build?

I have a Dockerfile like this:
FROM python:3.9
WORKDIR /app
RUN apt-get update && apt-get upgrade -y
RUN curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/install-poetry.py | python -
ENV PATH /root/.local/bin:$PATH
COPY pyproject.toml poetry.lock Makefile ./
COPY src ./src
COPY tests ./tests
RUN poetry install && poetry run pytest && make clean
CMD ["bash"]
As you can see, the tests are run during the build. It slows the build down a little, but ensures that my code runs in the Docker container.
If the tests pass on my local machine, that does not mean they will also pass in the Docker container.
Suppose I add a feature to my code that uses the chromedriver or ffmpeg binaries, which are present on my system, so the tests pass on my system.
But suppose I forget to install those dependencies in the Dockerfile; then the docker build will fail (as tests are running during the build).
What is the standard way of doing what I am trying to do?
Is my Dockerfile good, or should I do something differently?
Running pytest during image construction makes no sense to me. However, what you can do is run tests after the image is complete. In your pipeline you should have something like this:
Test your Python package locally
Build a wheel with poetry
Build the Docker image with your Python package
Run your Docker image to test that it works (running pytest, for example)
Publish your tested image to the container registry
If you use multiple stages, you can run docker build without running the tests, and if you want to run the tests afterwards, running docker build --target test will run them on what was previously built (a sketch follows below). This approach is described in the official Docker documentation.
This way we not only avoid running the build twice, thanks to Docker's caching mechanism, we also avoid shipping the test code in the final image.
A possible use of this setup in CI/CD is to run both commands when developing locally and in CI, and to skip the test command in CD, because the code being deployed has already been tested.
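A minimal sketch of that multi-stage layout, loosely adapted from the question's Dockerfile (the stage names and the exact poetry commands are illustrative, not prescribed by the documentation):
# Build stage: application code and dependencies, no tests
FROM python:3.9 AS build
WORKDIR /app
RUN pip install poetry
COPY pyproject.toml poetry.lock ./
COPY src ./src
RUN poetry install

# Test stage: only built when explicitly targeted
FROM build AS test
COPY tests ./tests
RUN poetry run pytest

# Final stage: the image that ships, without any test code
FROM build AS final
CMD ["bash"]
With BuildKit, docker build . only builds the stages the final stage depends on, so the tests are skipped; docker build --target test . reuses the cached build stage and runs the tests on top of it.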

How to create a Docker container with Python and Orange

Does anyone know how I can create a Docker container with Python and Orange, without installing the whole Anaconda package?
I managed to make it work with a container of 8.0 GB, but that is way too big.
From the GitHub project page, look at the README, and download the appropriate requirements-* files. Create a directory containing the file(s), and write a Dockerfile like this:
FROM python:3.7
RUN pip install PyQt5
COPY requirements-core.txt /tmp
RUN pip install -r requirements-core.txt
# repeat the previous two commands with other files, if needed
RUN pip install git+https://github.com/biolab/orange3
Add any other commands as needed, e.g. to COPY your source code.
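For example, a hypothetical finishing touch (the script name is made up for the example) could be:
# hypothetical continuation: copy your own script and make it the default command
COPY my_orange_script.py /app/
WORKDIR /app
CMD ["python", "my_orange_script.py"]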

Prevent Docker from installing python package requirements on every build (without requirements.txt)

I'm building an image from a Dockerfile where my main program is a python application that has a number of dependencies. The application is installed via setup.py and the dependencies are listed inside. There is no requirements.txt. I'm wondering if there is a way to avoid having to download and build all of the application dependencies, which rarely change, on every image build. I saw a number of solutions that use the requirements.txt file but I'd like to avoid having one if possible.
You can use requires.txt from the egg info to preinstall the requirements.
WORKDIR path/to/setup/script
RUN python setup.py egg_info
RUN pip install -r pkgname.egg-info/requires.txt
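Put into a fuller Dockerfile, the idea is to copy only setup.py before this step so the dependency layer stays cached; a rough sketch (the package name pkgname and the base image are placeholders, and if setup.py needs other files such as a README to run, copy those too):
FROM python:3.9
WORKDIR /app
# copy only the files needed to compute the dependencies
COPY setup.py ./
RUN python setup.py egg_info && pip install -r pkgname.egg-info/requires.txt
# now copy the rest of the source; changes here do not invalidate the layer above
COPY . .
RUN pip install .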
One solution: since those dependencies rarely change, as you said, you could build another image that already has those packages installed. You would create that image and then save it using docker save, so you end up with a new base image with the required dependencies. docker save creates a .tar with the image. You load it with docker load, and then in your Dockerfile you would do:
FROM <new image with all the dependencies>
# your stuff, with no need to run pip install
....
Hope it helps.
See the docker save documentation for details.
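The save/load round trip could look roughly like this (image and file names are placeholders):
# on the machine where the dependency image was built
docker build -t my-python-deps .
docker save -o my-python-deps.tar my-python-deps

# on the target machine
docker load -i my-python-deps.tar
# ...and then use FROM my-python-deps in the application Dockerfile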

Transfer virtualenv to docker image

Is it possible to transfer virtual environment data from a local host to a docker image via the ADD command?
Rather than doing pip installs inside the container, I would rather the user have all of that done locally and simply transfer the virtual environment into the container. Granted, all of the files have the same names locally as in the Docker container, and all directories are nested properly.
This would save minutes to hours if it were possible to transfer virtual environment settings into a Docker image. Maybe I am thinking about this at the wrong level of abstraction.
It just feels very inefficient doing pip installs via a requirements.txt that was passed into the container, as opposed to doing it all locally; otherwise, each time the image is started up it has to re-install the same dependencies, which have not changed since the image was built.
We ran into this problem earlier, and here are a few things we considered:
Consider building base images that have the common packages installed. The app containers can then use one of these base images and install only the differential.
Cache the pip packages on a local path that can be mounted into the container. That saves the time needed to download the packages (see the sketch below).
Depending on the complexity of your project, one may suit better than the other; you may also consider a hybrid approach to find the maximum optimization.
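For the caching option, one way to do it at build time is BuildKit's cache mount, which keeps pip's download cache on the host between builds (a sketch; the cache path is pip's default, not something specified in the answer):
# syntax=docker/dockerfile:1
FROM python:3.9
COPY requirements.txt ./
# the cache mount persists /root/.cache/pip across builds, so unchanged
# packages are not downloaded again even when this layer is rebuilt
RUN --mount=type=cache,target=/root/.cache/pip pip install -r requirements.txt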
While possible, it's not recommended.
Dependencies (library versions, globally installed packages) can differ between the host machine and the container.
Image builds will not be 100% reproducible on other hosts.
The impact of pip install is not big. Each RUN command creates its own layer, which is cached locally and also in the repository, so pip install will be re-run only when requirements.txt changes (or previous layers are rebuilt).
To trigger pip install only on requirements.txt changes, the Dockerfile should start this way:
...
COPY requirements.txt ./
RUN pip install -r requirements.txt
COPY src/ ./
...
Also, it will run only on image build, not on container startup.
If you have multiple containers with the same dependencies, you can build an intermediate image with all the dependencies and build the other images FROM it, as sketched below.
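As a sketch (the image name myapp-deps and the file name Dockerfile.deps are made up for the example), you'd build the dependency image once:
# Dockerfile.deps - build once with: docker build -f Dockerfile.deps -t myapp-deps .
FROM python:3.9
COPY requirements.txt ./
RUN pip install -r requirements.txt
and then start each application Dockerfile with FROM myapp-deps, copying only the source code on top.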
