How to use the tweepy Python library as an AWS Lambda layer?

I am following these instructions to create an AWS Lambda layer:
mkdir my-lambda-layer && cd my-lambda-layer
mkdir -p aws-layer/python/lib/python3.8/site-packages
pip3 install tweepy --target aws-layer/python/lib/python3.8/site-packages
cd aws-layer
I then zip the folder "python" (zip -r tweepy_layer.zip python/) and upload it to S3. This is what I see when I unzip the folder to double-check:
Unfortunately, I still get the following error, even though the path should be the same as in the docs. I tried from both macOS and Ubuntu, though I do not think the OS should matter for this particular library.

Essentially, the problem turned out to be the cache: all those __pycache__ directories and .pyc files. Thanks to this other question, I cleared the cache after installing the libraries by running
pip3 install pyclean
pyclean .
After cleaning the cache, re-creating the zip, and uploading it to S3, the Lambda setup works perfectly.
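Putting it all together, the whole sequence looks roughly like this (paths as in the question; the S3 bucket name is a placeholder):
mkdir -p aws-layer/python/lib/python3.8/site-packages
pip3 install tweepy --target aws-layer/python/lib/python3.8/site-packages
pip3 install pyclean
cd aws-layer
pyclean .                                      # strip __pycache__ directories and .pyc files
zip -r tweepy_layer.zip python/
aws s3 cp tweepy_layer.zip s3://your-bucket/   # bucket name is a placeholder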

Related

How to install package dependencies for a custom Airbyte connector?

I'm developing a custom connector for Airbyte, and it involves extracting files from different compressed formats, like .zip or .7z. My plan was to use patool for this, and indeed it works in local tests, running:
python main.py read --config valid_config.json --catalog configured_catalog_old.json
However, since Airbyte runs in docker containers, I need those containers to have packages like p7zip installed. So my question is, what is the proper way to do that?
I just downloaded and deployed Airbyte Open Source on my own machine using the recommended commands listed in the Airbyte documentation:
git clone https://github.com/airbytehq/airbyte.git
cd airbyte
docker compose up
I tried using docker exec -it CONTAINER_ID bash in the airbyte/worker and airbyte/connector-builder-server containers to install p7zip directly, but it isn't working yet. My connector calls patoolib from a Python script, but it is unable to process the given file because it fails to find a program to extract it. This is the log output:
> patool: Extracting /tmp/tmpan2mjkmn ...
> unknown archive format for file `/tmp/tmpan2mjkmn'
It turns out I had completely overlooked that the connector template comes with a Dockerfile, which is used precisely to configure the container that runs the connector code. So all I had to do was add this line to the Dockerfile:
RUN apt-get update && apt-get install -y file p7zip p7zip-full lzma lzma-dev
Specifically, to use patoolib I had to install the file package, so that it could detect the MIME type of archive files.
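After editing the Dockerfile, the connector image has to be rebuilt so the apt packages get baked in. Assuming the usual Airbyte tagging convention (the connector name below is just an example):
# rebuild the connector image from the connector's directory
docker build . -t airbyte/source-my-connector:dev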

How to make sam IGNORE requirements.txt

So I am using AWS SAM to build and deploy some functions to AWS Lambda.
Because of my slow connection, uploading functions takes a long time, so I decided to create a layer with the requirements in it. That way, the next time I deploy a function I will not have to upload all 50 MB of requirements; I can just use the already uploaded layer.
The problem is that I could not find any parameter that lets me ignore the requirements file and deploy only the source code.
Is it even possible?
I hope I understand your question correctly, but if you'd like to deploy a Lambda without any dependencies, you can try two things:
not running sam build before running sam deploy
having an empty requirements.txt file. Then sam build simply does not include any dependencies for that lambda function.
Of course, here I assume the layer is already present in AWS and is not included in the same template. If they are defined in the same template, you'd have to split them into two stacks: one with the layer, which can be deployed once, and one with the Lambda referencing that layer.
Unfortunately sam build has no flag to ignore requirements.txt as far as I know, since the core purpose of the command is to build dependencies.
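As a rough sketch of that split workflow (directory names, layer name, and runtime are illustrative, not taken from the question):
# 1. build and publish the dependency layer once
pip3 install -r requirements.txt -t layer/python
cd layer && zip -r deps_layer.zip python && cd ..
aws lambda publish-layer-version --layer-name deps-layer --zip-file fileb://layer/deps_layer.zip --compatible-runtimes python3.8
# 2. empty the function's requirements.txt, then sam build packages only your code
sam build && sam deploy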
For everyone using a container image, this is the solution I have found. It drastically improves the workflow.
Dockerfile (the pip install step is skipped if requirements.txt is unchanged):
FROM public.ecr.aws/lambda/python:3.8 AS build
COPY requirements.txt ./
RUN python3.8 -m pip install -r requirements.txt -t .
COPY app.py ./
COPY model /opt/ml/model
CMD ["app.lambda_handler"]
What have I changed?
This was the default Dockerfile:
FROM public.ecr.aws/lambda/python:3.8
COPY app.py requirements.txt ./
COPY model /opt/ml/model
RUN python3.8 -m pip install -r requirements.txt -t .
CMD ["app.lambda_handler"]
This solution is based on https://stackoverflow.com/a/34399661/5723524
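With that ordering, Docker's layer cache kicks in on rebuilds: as long as requirements.txt is byte-for-byte identical, the pip install step is reused and only the COPY app.py and COPY model steps run again (image tag below is just an example):
docker build -t my-lambda-image .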

How to test a Python AWS Lambda function locally when creating a zip with additional dependencies?

I want to create a Python function with some additional dependencies. I am following this document:
https://docs.aws.amazon.com/lambda/latest/dg/python-package.html#python-package-dependencies
Specifically, I need to use the "Updating a function with additional dependencies" section.
My question is: how can I be sure my Python script will use the libraries in the zip, and that the zip contains enough of them? In simple words, how do I force my Python script to use the dependencies from the new directory I am about to upload as a zip?
You need to install the packages in your deployment package before uploading it to Lambda.
For example, your deployment package looks like this:
lambda-package
-- lambda_function.py
-- some other files
cd into lambda-package and install packages inside it using
pip install package-name -t .
This installs the package; -t specifies the target directory, in this case the current directory.
Then zip all the contents of this folder and upload the zip to AWS Lambda.
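A rough end-to-end sketch, to also answer the local-testing part of the question (requests is just a stand-in for your real dependency): because the packages now sit next to lambda_function.py, running Python from inside the folder imports the bundled copies, which is a quick way to verify the zip will be self-contained.
cd lambda-package
pip install requests -t .
# imports resolve against the bundled packages when run from inside the folder
python -c "import lambda_function"
# zip the contents of the folder (not the folder itself) for upload
zip -r ../lambda-package.zip .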

Cannot run Google Vision API on AWS Lambda [duplicate]

I am trying to use Google Cloud Platform (specifically, the Vision API) for Python with AWS Lambda. Thus, I have to create a deployment package for my dependencies. However, when I try to create this deployment package, I get several compilation errors, regardless of the Python version (3.6 or 2.7). With version 3.6, I get the issue "Cannot import name 'cygrpc'". For 2.7, I get some unknown error with the .pth file. I am following the AWS Lambda Deployment Package instructions here. They recommend two options, and both do not work / result in the same issue. Is GCP just not compatible with AWS Lambda for some reason? What's the deal?
Neither Python 3.6 nor 2.7 work for me.
NOTE: I am posting this question here to answer it myself because it took me quite a while to find a solution, and I would like to share my solution.
TL;DR: You cannot compile the deployment package on your Mac or whatever PC you use. You have to do it on a specific OS/"setup", the same one that AWS Lambda uses to run your code. To do this, you have to use EC2.
I will provide here an answer on how to get Google Cloud Vision working on AWS Lambda for Python 2.7. This answer is potentially extendable to other APIs and other programming languages on AWS Lambda.
So my journey to a solution began with this initial posting on GitHub with others who have the same issue. One solution someone posted was:
I had the same issue " cannot import name 'cygrpc' " while running
the lambda. Solved it with pip install google-cloud-vision in the AMI
amzn-ami-hvm-2017.03.1.20170812-x86_64-gp2 instance and exported the
lib/python3.6/site-packages to aws lambda Thank you #tseaver
This is partially correct, unless I read it wrong, but regardless it led me on the right path. You will have to use EC2. Here are the steps I took:
Set up an EC2 instance by going to EC2 on Amazon. Do a quick read about AWS EC2 if you have not already. Set one up for amzn-ami-hvm-2018.03.0.20180811-x86_64-gp2 or something along those lines (i.e. the most updated one).
Get your EC2 .pem file. Go to your Terminal. cd into your folder where your .pem file is. ssh into your instance using
ssh -i "your-file-name-here.pem" ec2-user#ec2-ip-address-here.compute-1.amazonaws.com
Create the following folders on your instance using mkdir: google-cloud-vision, protobuf, google-api-python-client, httplib2, uritemplate, google-auth-httplib2.
On your EC2 instance, cd into google-cloud-vision. Run the command:
pip install google-cloud-vision -t .
Note: if you get "bash: pip: command not found", then run sudo easy_install pip.
Repeat step 4 with the following packages, while cd'ing into the respective folder: protobuf, google-api-python-client, httplib2, uritemplate, google-auth-httplib2.
Copy each folder onto your computer. You can do this using the scp command. Again, in your Terminal (not your EC2 instance, and not the Terminal window you used to access your EC2 instance), run the command (below is an example for the "google-cloud-vision" folder, but repeat this with every folder):
sudo scp -r -i your-pem-file-name.pem ec2-user@ec2-ip-address-here.compute-1.amazonaws.com:~/google-cloud-vision ~/Documents/your-local-directory/
Stop your EC2 instance from the AWS console so you don't get overcharged.
For your deployment package, you will need a single folder containing all your modules and your Python scripts. To begin combining all of the modules, create an empty folder titled "modules." Copy and paste all of the contents of the "google-cloud-vision" folder into the "modules" folder. Now place only the folder titled "protobuf" from the "protobuf" (sic) main folder in the "Google" folder of the "modules" folder. Also from the "protobuf" main folder, paste the Protobuf .pth file and the -info folder in the Google folder.
For each module after protobuf, copy and paste in the "modules" folder the folder titled with the module name, the .pth file, and the "-info" folder.
You now have all of your modules properly combined (almost). To finish combination, remove these two files from your "modules" folder: googleapis_common_protos-1.5.3-nspkg.pth and google_cloud_vision-0.34.0-py3.6-nspkg.pth. Copy and paste everything in the "modules" folder into your deployment package folder. Also, if you're using GCP, paste in your .json file for your credentials as well.
Finally, put your Python scripts in this folder, zip the contents (not the folder), upload to S3, and paste the link in your AWS Lambda function and get going!
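"Zip the contents, not the folder" means the handler and module folders must sit at the root of the archive, e.g. (folder name illustrative):
cd my-deployment-package
zip -r ../deployment-package.zip .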
If something here doesn't work as described, please forgive me and either message me or feel free to edit my answer. Hope this helps.
Building off the answer from @Josh Wolff (thanks a lot, btw!), this can be streamlined a bit by using a Docker image that replicates the Lambda build environment.
You can either bundle the libraries with your project source or, as I did below in a Makefile script, upload them as an AWS Lambda layer.
layer:
set -e ;\
docker run -v "$(PWD)/src":/var/task "lambci/lambda:build-python3.6" /bin/sh -c "rm -R python; pip install -r requirements.txt -t python/lib/python3.6/site-packages/; exit" ;\
pushd src ;\
zip -r my_lambda_layer.zip python > /dev/null ;\
rm -R python ;\
aws lambda publish-layer-version --layer-name my_lambda_layer --description "Lambda layer" --zip-file fileb://my_lambda_layer.zip --compatible-runtimes "python3.6" ;\
rm my_lambda_layer.zip ;\
popd ;
The above script will:
Pull the Docker image if you don't have it yet (above uses Python 3.6)
Delete the python directory (only useful for running a second time)
Install all requirements to the python directory, created in your projects /src directory
ZIP the python directory
Upload the AWS layer
Delete the python directory and zip file
Make sure your requirements.txt file includes the modules listed above by Josh: google-cloud-vision, protobuf, google-api-python-client, httplib2, uritemplate, google-auth-httplib2
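Assuming the target above lives in a Makefile at the project root (with requirements.txt inside src/), building and publishing the layer is then just:
make layer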
There's a fast solution that doesn't require much coding.
Cloud9 runs on an Amazon Linux AMI, so using pip in its virtual environment should make it work.
I created a Lambda from the Cloud9 UI and, from the console, activated the venv for the EC2 machine. I proceeded to install google-cloud-speech with pip. That was enough to fix the issue.
I was facing the same error using the google-ads API.
{
  "errorMessage": "Unable to import module 'lambda_function': cannot import name 'cygrpc' from 'grpc._cython' (/var/task/grpc/_cython/__init__.py)",
  "errorType": "Runtime.ImportModuleError",
  "stackTrace": []
}
My Lambda runtime was Python 3.9 and architecture x86_64.
If somebody encounters a similar ImportModuleError, see my answer here: Cannot import name 'cygrpc' from 'grpc._cython' - Google Ads API
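As a general note (this is a different technique from the one in the linked answer, so treat it as a hedged suggestion): the cygrpc error usually means the grpcio binary in the package was built for the wrong platform, and recent pip versions can fetch Lambda-compatible manylinux wheels directly, e.g.:
pip install --platform manylinux2014_x86_64 --implementation cp --python-version 3.9 --only-binary=:all: --target ./package grpcio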

Deploying a custom python package with `pip`

I have a custom Python package (call it MyProject) on my filesystem with a setup.py and a requirements.txt. This package needs to be used by a Flask server (which will be deployed on AWS/EC2/EB).
In my Flask project directory, I create a virtualenv and run pip install -e ../path/to/myProject.
But for some reason, MyProject's upstream git repo shows up in pip freeze:
...
llvmlite==0.19.0
-e git+https://github.com/USERNAME/MYPROJECT.git#{some-git-hash}
python-socketio==1.8.0
...
The reference to git is a problem, because the repository is private and the deployment server does not (and should not, and will never) have credentials to access it. The deployment server also doesn't even have git installed (and it seems extremely problematic that pip assumes without my permission that it does). There is nothing in MyProject's requirements.txt or setup.py that alludes to git, so I am not sure where the hell this is coming from.
I can duplicate the project into a subdirectory of the Flask project, and then put the following in MyFlaskProject's requirements.txt:
...
llvmlite==0.19.0
./MyProject
python-socketio==1.8.0
...
But this doesn't work, because the path is taken as relative to the working directory of the pip process when it is run, not to requirements.txt. Indeed, it seems pip is broken in this respect. In my case, EC2 runs its install scripts from some other directory (with a full path to requirements.txt specified), and as expected, this fails.
What is the proper way to deploy a custom python package as a dependency of another project?
To install your own Python package from a git repo, you might want to check this post.
To sort out the credential issue, why not have git installed on the EC2 instance? You could simply create an SSH key and share it with the MyProject repository.
I am using this solution on ECS instances deployed by Jenkins (with Habitus to hide Jenkins's SSH keys while building the image) and it works fine for me!
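If installing git on the server is really off the table, one hedged alternative (paths and wheel filename are illustrative) is to build MyProject into a wheel and ship that file with the Flask project, so the server installs from a local file instead of from git:
# build a wheel for MyProject without pulling in its dependencies
pip wheel ../path/to/myProject --no-deps -w ./wheels/
# on the server, install from the wheel using an absolute path, which sidesteps
# the "relative to pip's working directory" problem described in the question
pip install /full/path/to/wheels/MyProject-0.1.0-py3-none-any.whl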
