I want to convert a .docx file to .txt, and if the .docx has tables I want to preserve them cleanly in the .txt file, so I am using pypandoc for this purpose.
Locally this works like a charm.
When I zip it with all dependencies and put it in S3 to run via AWS Lambda, it fails with the error below:
No pandoc was found: either install pandoc and add it
to your PATH or or call pypandoc.download_pandoc(...) or
install pypandoc wheels with included pandoc
My code looks like this:
import boto3
import logging
import pypandoc

local_file_docx = '/tmp/' + prefix + 'german-de.docx'
local_file_txt = '/tmp/' + prefix + 'german-de.txt'

def lambda_handler(event, context):
    print(pypandoc.convert_file(local_file_docx, "plain+simple_tables",
                                format="docx", extra_args=(),
                                encoding='utf-8', outputfile=local_file_txt))
Any help appreciated in advance.
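One possible workaround (a sketch, not from the thread): pypandoc honours the PYPANDOC_PANDOC environment variable, so you can bundle a static pandoc binary inside the deployment zip and point pypandoc at it before the import. The bin/pandoc location below is a hypothetical layout, not something the question specifies.

```python
import os

# Hypothetical layout: the deployment zip carries a static pandoc binary at
# bin/pandoc. LAMBDA_TASK_ROOT is /var/task on AWS Lambda; fall back to "."
# for local testing.
task_root = os.environ.get("LAMBDA_TASK_ROOT", ".")
os.environ["PYPANDOC_PANDOC"] = os.path.join(task_root, "bin", "pandoc")

# import pypandoc  # imported *after* the variable is set, so it finds the binary
```

The binary must be built for Amazon Linux (e.g. taken from pandoc's static Linux release tarball), or it will not execute in the Lambda environment.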
I have a similar problem to this question:
How to create password encrypted zip file in python through AWS lambda
We have the exact same problem, but I already tried everything from the answers in that thread, to no avail.
I have a Lambda script that runs on Python 3.9. I need to compress the files in my S3 bucket into a password-protected zip file and put it in another S3 bucket.
This is how it goes:
from datetime import datetime
import boto3
import pyminizip

def zip_to_client():
    # reportTitles = os.listdir(tempDir)
    dateGenerated = datetime.now(tz=atz).strftime("%Y-%m-%d")
    pyminizip.compress("Daily_Booking_Report.csv",
                       subfolder + str(dateGenerated) + '/' + str(id) + '/',
                       "/tmp/test.zip", "awesomepassword", 9)
    s3 = boto3.resource('s3')
    s3.meta.client.upload_file(Filename='/tmp/test.zip', Bucket=bucket,
                               Key=subfolder + 'test.zip',
                               ExtraArgs={'Tagging': 'archive=90days'})
    print("SUCCESS: Transferred report into S3")
I'm not sure if it works, but I can't debug it because Lambda shows me this error:
Response
{
"errorMessage": "Unable to import module 'lambda_function': No module named 'pyminizip'",
"errorType": "Runtime.ImportModuleError",
"requestId": "0000111000",
"stackTrace": []
}
I made sure to import pyminizip as well as pip install it into the directory:
pip install pyminizip -t .
So far this is what the Lambda directory looks like:
https://ibb.co/ZGmLBbv
I've tried everything from putting it in a Lambda layer to pip installing different versions for Python 3.7 through 3.9.
This is a common case when you create a Lambda layer and get an import error. It occurs when the packages are not placed in the directory structure Lambda expects, such as python/lib/python3.8/site-packages/...
A second possible reason is that a native dependency is missing. In that case, use Docker and follow the steps here: https://www.geeksforgeeks.org/how-to-install-python-packages-for-aws-lambda-layers/.
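A quick way to sanity-check a layer zip before uploading (a standard-library sketch; the helper name is mine): every entry should sit under python/ at the archive root, which Lambda mounts at /opt/python and puts on sys.path.

```python
import io
import zipfile

def layer_layout_ok(zip_bytes: bytes) -> bool:
    """True if every entry in the layer zip lives under python/."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return all(name.startswith("python/") for name in zf.namelist())

# A tiny in-memory layer with the correct layout.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("python/pyminizip/__init__.py", "")
print(layer_layout_ok(buf.getvalue()))   # True

# A flat layout (package at the zip root) is what triggers the import error.
flat = io.BytesIO()
with zipfile.ZipFile(flat, "w") as zf:
    zf.writestr("pyminizip/__init__.py", "")
print(layer_layout_ok(flat.getvalue()))  # False
```

Note this only checks layout, not whether compiled extensions like pyminizip were built for Amazon Linux.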
In my Lambda function I'm using the magic library to determine a file's type.
I first deployed it to a local container to check that everything works.
My Dockerfile:
FROM lambci/lambda:build-python3.8
WORKDIR /app
RUN mkdir -p .aws
COPY requirements.txt ./
COPY credentials /app/.aws/
RUN mv /app/.aws/ ~/.aws/
RUN pip install --no-cache-dir -r requirements.txt
RUN pip install --no-cache-dir -r requirements.txt -t "/app/dependencies/"
WORKDIR /app/dependencies
RUN zip -r lambda.zip *
requirements.txt :
python-magic
libmagic
In my local container, when I ran tests on the Lambda logic, everything passed (including the part that uses the magic code).
I created a zip that contains the lambda.py code together with the Python dependencies (the last three lines in the Dockerfile).
When I upload the zip to AWS and test the Lambda, I get the following error:
{
"errorMessage": "Unable to import module 'lambda': failed to find libmagic. Check your installation",
"errorType": "Runtime.ImportModuleError"
}
As you can see, in my local container I'm using the base image lambci/lambda:build-python3.8, which should match what AWS uses when the Lambda launches.
I also tried adding python-magic-bin==0.4.14 to requirements.txt instead of python-magic and libmagic, but it didn't help either; that module appears to be Windows-only.
Into lambda.zip I also put lambda.py, which is the file that contains my Lambda function:
import boto3
import urllib.parse
from io import BytesIO
import magic

def lambda_handler(event, context):
    s3 = boto3.client("s3")
    if event:
        print("Event : ", event)
        event_data = event["Records"][0]
        file_name = urllib.parse.unquote_plus(event_data['s3']['object']['key'])
        print("getting file: {}".format(file_name))
        bucket_name = event_data['s3']['bucket']['name']
        file_from_s3 = s3.get_object(Bucket=bucket_name, Key=file_name)
        file_obj = BytesIO(file_from_s3['Body'].read())
        print(magic.from_buffer(file_obj.read(2048)))
What am I doing wrong?
While using filetype as suggested by other answers is much simpler, that library does not detect as many file types as magic does.
You can make python-magic work on aws lambda with python3.8 by doing the following:
Add libmagic.so.1 to a lib folder at the root of the lambda package. This lib folder will be automatically added to LD_LIBRARY_PATH on aws lambda. This library can be found in /usr/lib64/libmagic.so.1 on an amazon linux ec2 instance for example.
Create a magic file or take the one available on an amazon linux ec2 instance in /usr/share/misc/magic and add it to your lambda package.
The Magic constructor from python-magic takes a magic_file argument. Make this point to your magic file. You can then create the magic object with magic.Magic(magic_file='path_to_your_magic_file') and then call any function from python-magic you like on that object.
These steps are not necessary on the python3.7 runtime as those libraries are already present in aws lambda.
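Putting those steps together, the wiring might look like the sketch below. The file names (lib/libmagic.so.1 and a database file named magic at the zip root) follow the layout described above; the magic call itself is shown commented out because it needs the bundled native library at runtime.

```python
import os

# Assumed package layout: lib/libmagic.so.1 (picked up via LD_LIBRARY_PATH,
# since lib/ at the package root is added to it automatically) and a magic
# database file named "magic" at the root of the deployment zip.
task_root = os.environ.get("LAMBDA_TASK_ROOT", ".")
magic_db = os.path.join(task_root, "magic")

# import magic
# detector = magic.Magic(magic_file=magic_db, mime=True)
# mime_type = detector.from_buffer(first_2048_bytes)
```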
I didn't find a way to solve this issue, so I decided to use a different library called filetype.
Example of how to use it:
import gzip
import filetype
from zipfile import ZipFile

file_type = filetype.guess(file_object)
if file_type is not None:
    file_type = file_type.MIME
file_object.seek(0)
print("File type : {}".format(file_type))
if file_type == "application/gzip":
    file_binary_content = gzip.GzipFile(filename=None, mode='rb', fileobj=file_object).read()
elif file_type == "application/zip":
    zipfile = ZipFile(file_object)
    file_binary_content = zipfile.read(zipfile.namelist()[0])
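If you only care about a couple of formats, you can skip third-party detection entirely. A minimal standard-library sketch (the sniff helper is mine, not from the answer) that checks the gzip and zip magic bytes:

```python
import gzip
import io
import zipfile

def sniff(data: bytes) -> str:
    """Guess a MIME type from the file's leading magic bytes."""
    if data[:2] == b"\x1f\x8b":        # gzip magic number
        return "application/gzip"
    if data[:4] == b"PK\x03\x04":      # local-file-header signature of a zip
        return "application/zip"
    return "application/octet-stream"

print(sniff(gzip.compress(b"hello")))  # application/gzip

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.txt", "hello")
print(sniff(buf.getvalue()))           # application/zip
```

This obviously detects far fewer types than magic or filetype, but it needs no extra packaging at all.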
Maybe someone will find this helpful. This is how I made my Lambda become friends with python-magic. I used a local Ubuntu machine and Python 3.7 to create the zip package.
First, create a directory. The name doesn't matter.
mkdir lambda-package
cd lambda-package
Then install python-magic locally. I used pip3 for it:
pip3 install python-magic -t .
Now, in your lambda-package directory create (or copy, if you already have some code in your Lambda) a lambda_function.py file. Be aware that Lambda expects this exact name. After this step you should have the following directory structure inside lambda-package:
lambda-package
│ lambda_function.py
│ magic/
│ python_magic-0.4.24.dist-info/
Now, zip the contents of this directory. Remember to zip the contents only, not the folder itself. So, inside the lambda-package run:
zip -r lambda-package.zip .
Final step. Create a new Lambda function from scratch. Make sure to choose a proper runtime. When your function is created, click Upload from -> choose Zip file and upload lambda-package.zip file.
Now you will be able to import python-magic normally, like this:
import magic
That's it!
p.s. The error failed to find libmagic appears under Python 3.8 runtime. Python 3.7 works fine.
The simplest way to resolve this at the time of this writing:
Create a lib/ directory in the root of your Lambda package. If you're using a Lambda layer, create lib/ at the root of the layer zip (it is mounted at /opt/lib).
Download the binary package from here; it will have a filename something like file-libs-5.11-36.amzn2.0.1.x86_64.rpm
Unarchive that package. (If you're on macOS, 7zip will work. It will extract a .cpio file; extract the resulting file using the standard macOS unarchiver.)
Move the libmagic.so.1.0.0 file into the lib/ folder of your Lambda package (or the /opt/lib directory of the Lambda layer). DO NOT move the other file in the same folder named libmagic.so.1: it's a symlink, not a real file, and will not work.
Rename libmagic.so.1.0.0 to libmagic.so.1
Deploy
I know the concept of using a deployment package is relatively straightforward, but I've been banging my head on this issue for the last few hours. I am following the documentation from AWS on packaging up Lambda dependencies. I want to write a simple Lambda function to update an entry in a PostgreSQL table upon some event.
I first make a new directory to work in:
mkdir lambdas-deployment && cd lambdas-deployment
Then I make a new virtual environment and install my packages:
virtualenv v-env
source v-env/bin/activate
pip3 install sqlalchemy boto3 psycopg2
My trigger-yaml-parse.py function (it doesn't actually use the sqlalchemy library yet, but I'm just trying to import it successfully):
import logging
import json
import boto3
import sqlalchemy

def lambda_handler(event, context):
    records = event['Records']
    s3_records = filter(lambda record: record['eventSource'] == 'aws:s3', records)
    object_created_records = filter(lambda record: record['eventName'].startswith('ObjectCreated'), s3_records)
    for record in object_created_records:
        key = record['s3']['object']['key']
        print(key)
I've been following the instructions in the AWS documentation.
zip -r trigger-yaml-parse.zip $VIRTUAL_ENV/lib/python3.6/site-packages/
I then add in my function code:
zip -g trigger-yaml-parse.zip trigger-yaml-parse.py
I get an output of updating: trigger-yaml-parse.py (deflated 48%).
Then I upload my new zipped deployment to my S3 build bucket:
aws s3 cp trigger-yaml-parse.zip s3://lambda-build-bucket
I choose upload from S3 in the AWS Lambda console.
However, my Lambda function fails upon execution with the error:
START RequestId: 396c6c3c-3f5b-4df9-b7f1-057842a87eb3 Version: $LATEST
Unable to import module 'trigger-yaml-parse': No module named 'sqlalchemy'
What am I doing wrong? I've followed the documentation from AWS literally step for step.
I think your problem might be in this line:
zip -r trigger-yaml-parse.zip $VIRTUAL_ENV/lib/python3.6/site-packages/
When you create the zip file this way, the compressed files keep the complete path they had on your disk, and the Python runtime in Lambda will not be able to find the libraries.
Instead you should do something like this:
cd $VIRTUAL_ENV/lib/python3.6/site-packages/
zip -r /full/path/to/trigger-yaml-parse.zip .
Run unzip -t against both files and you will see the difference.
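The difference is easy to reproduce with the standard library (a sketch; the paths are illustrative): archive member names must start at the package root, not carry the site-packages prefix.

```python
import io
import zipfile

def build(names):
    """Create an in-memory zip containing empty files with the given names."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in names:
            zf.writestr(name, "")
    return buf

# Zipping from outside site-packages bakes the full path into each entry:
bad = build(["v-env/lib/python3.6/site-packages/sqlalchemy/__init__.py"])
# Zipping "." from inside site-packages keeps entries at the root:
good = build(["sqlalchemy/__init__.py"])

print(zipfile.ZipFile(bad).namelist())
# ['v-env/lib/python3.6/site-packages/sqlalchemy/__init__.py']
print(zipfile.ZipFile(good).namelist())
# ['sqlalchemy/__init__.py']
```

Lambda only imports from the root of the package, so only the second layout lets `import sqlalchemy` succeed.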
from AWS documentation:
"Zip packages uploaded with incorrect permissions may cause execution
failure. AWS Lambda requires global read permissions on code files and
any dependent libraries that comprise your deployment package"
So you can use zipinfo to check permissions:
zipinfo trigger-yaml-parse.zip
-r-------- means only the file owner has read permission.
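The same check can be scripted (a standard-library sketch; the helper name is mine): each zip member stores its Unix mode in the upper 16 bits of external_attr, and Lambda needs the world-read bit set.

```python
import io
import stat
import zipfile

def world_readable(zi: zipfile.ZipInfo) -> bool:
    mode = zi.external_attr >> 16   # upper 16 bits hold the Unix file mode
    return bool(mode & stat.S_IROTH)

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    ok = zipfile.ZipInfo("lambda_function.py")
    ok.external_attr = 0o644 << 16   # -rw-r--r-- : fine
    zf.writestr(ok, "def handler(e, c): pass")
    bad = zipfile.ZipInfo("secret.py")
    bad.external_attr = 0o600 << 16  # -rw------- : Lambda can't read this
    zf.writestr(bad, "")

with zipfile.ZipFile(buf) as zf:
    for zi in zf.infolist():
        print(zi.filename, world_readable(zi))
# lambda_function.py True
# secret.py False
```

Running `chmod -R o+r` on the tree before zipping avoids the problem in the first place.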
I am trying to upload a python lambda function with zipped dependencies but for some reason I am constantly getting
"errorMessage": "Unable to import module 'CreateThumbnail'"
whenever I test it.
Here are the steps I took which were almost identical to these docs.
Created and activate a virtualenv with virtualenv ~/lambda_env and source ~/lambda_env/bin/activate
Install Pillow and boto3 with pip install Pillow and pip install boto3
Zip dependencies with cd $VIRTUAL_ENV/lib/python2.7/site-packages and zip -r9 ~/CreateThumbnail.zip *
Add the actual python lambda function to the zip file with zip -g ~/CreateThumbnail.zip CreateThumbnail.py where CreateThumbnail.py is
from __future__ import print_function
import boto3
import os
import sys
import uuid
from PIL import Image
import PIL.Image

s3_client = boto3.client('s3')

def resize_image(image_path, resized_path):
    with Image.open(image_path) as image:
        image.thumbnail(tuple(x / 2 for x in image.size))
        image.save(resized_path)

def handler(event, context):
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        key = record['s3']['object']['key']
        download_path = '/tmp/{}{}'.format(uuid.uuid4(), key)
        upload_path = '/tmp/resized-{}'.format(key)
        s3_client.download_file(bucket, key, download_path)
        resize_image(download_path, upload_path)
        s3_client.upload_file(upload_path, '{}resized'.format(bucket), key)
Then in the console I set the handler to be CreateThumbnail.handler
Then I upload CreateThumbnail.zip via the aws console and click 'save & test' I get
"errorMessage": "Unable to import module 'CreateThumbnail'"
I am very confused by this because I feel like I am following the docs exactly. Can anyone tell me what I am doing wrong here?
Perhaps check out the lambda-uploader project... It handles the packaging of dependencies and is config-based.
https://github.com/rackerlabs/lambda-uploader/
Also these links may be helpful:
http://docs.aws.amazon.com/lambda/latest/dg/lambda-python-how-to-create-deployment-package.html
https://markn.ca/2015/10/python-extension-modules-in-aws-lambda/
http://www.perrygeo.com/running-python-with-compiled-code-on-aws-lambda.html
The problem lies in the packaging hierarchy. After you install the dependencies, zip the Lambda function as follows (in the example below, lambda_function is the name of my function).
Try this:
pip install requests -t .
zip -r9 lambda_function.zip .
zip -g lambda_function.zip lambda_function.py
Do not let your browser automatically unzip the lambda "project" file after downloading. This seems to corrupt the file when it is re-zipped and used.
The tutorial you pointed out uses Python 3.8, but you seem to be using Python 2.7. That may be the reason.
I am doing a similar tutorial; they provide the zip ready to upload, but warn to select Python 3.7 and not 3.8, or it will fail to run correctly.
I need to do a rest-call within a python script, that runs once per day.
I can't pack the "requests" package into my Python package for AWS Lambda. I get the error: "Unable to import module 'lambda_function': No module named lambda_function"
I broke it down to the hello_world predefined script. I can pack it into a zip and upload it. Everything works. As soon as I put "import requests" into the file, I get this error.
Here is what I already did:
The permissions of the zip and the project folder (including subfolders) are set to `chmod 777`. So permissions shouldn't be a problem.
The script itself is within the root folder. When you open the zip file, you directly see it.
I installed the requests package into the root-folder of the project using `sudo pip install requests -t PATH_TO_ROOT_FOLDER`
The naming of everything looks like this:
zip-file: lambda_function.zip
py-file: lambda_function.py
handler method: lambda_handler(event, context)
handler definition in the web config: lambda_function.lambda_handler
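The naming has to line up because Lambda splits the handler string on its last dot into a module name and a function name, then imports the module from the zip root. A simplified sketch of that resolution (my illustration, not Lambda's actual code):

```python
def split_handler(handler: str):
    """'lambda_function.lambda_handler' -> ('lambda_function', 'lambda_handler')"""
    module_name, func_name = handler.rsplit(".", 1)
    return module_name, func_name

print(split_handler("lambda_function.lambda_handler"))
# ('lambda_function', 'lambda_handler')
```

So lambda_function.py must sit at the zip root and define a top-level lambda_handler function, exactly as listed above.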
The file I want to run in the end looks like this:
import requests
import json

def lambda_handler(event, context):
    url = 'xxx.elasticbeanstalk.com/users/login'
    headers = {"content-type": "application/json", "Authorization": "Basic Zxxxxxxxxx3NjxxZxxxxzcw=="}
    response = requests.put(url, headers=headers, verify=False)
    return 'hello lambda_handler'
I'd be glad for ANY kind of help. I've already spent multiple hours on this issue.
EDIT: On Oct-21-2019 Botocore removed the vendored version of requests: https://github.com/boto/botocore/pull/1829.
EDIT 2: (March 10, 2020): The deprecation date for the Lambda service to bundle the requests module in the AWS SDK is now January 30, 2021. https://aws.amazon.com/blogs/compute/upcoming-changes-to-the-python-sdk-in-aws-lambda/
EDIT 3: (Nov 22, 2022): AWS cancelled the deprecation so you can continue to use requests as described below. AWS Blog
To use the requests module, you can simply import requests from botocore.vendored. For example:

from botocore.vendored import requests

def lambda_handler(event, context):
    response = requests.get("https://httpbin.org/get", timeout=10)
    print(response.json())
You can see this gist for more modules that can be imported directly in AWS Lambda.
If you're working with Python on AWS Lambda and need to make HTTP requests, you can use urllib3 instead; it is currently available on AWS Lambda and you can import it directly. Check the example on the urllib3 site:
import urllib3
http = urllib3.PoolManager()
r = http.request('GET', 'http://httpbin.org/robots.txt')
r.data
# b'User-agent: *\nDisallow: /deny\n'
r.status
# 200
I finally solved the problem: the structure in my zip file was broken. It is important that the Python script and the packed dependencies (as folders) are in the root of the zip file.
It's a bit depressing to find such easy errors after hours of trial and error.
I believe you have lambda_function.py on the Lambda console. You need to first create the Lambda function deployment package, and then use the console to upload the package.
Create a directory, for example project-dir, on your local system.
Create lambda_function.py in project-dir: copy the content of lambda_function.py from the Lambda console and paste it into project-dir/lambda_function.py.
pip install requests -t /path/to/project-dir
Zip the content of the project-dir directory, which is your deployment package (zip the directory's contents, not the directory itself).
Go to the Lambda console, select upload a zip file as the code entry type, and upload your deployment package. import requests should then work without any error.
Download the package into your working folder with this command:
pip install requests -t .
Run this command on your local machine, then zip your working directory and upload it to AWS.
Most of the answers are somewhat correct, but not informative enough for AWS beginners. Here is my long summary of what needs to be done to get requests working:
1. Create a root folder for the AWS Lambda function
% mkdir lambda-function
2. Go inside the created root folder
% cd lambda-function
3. Create the entry-point Python file for AWS Lambda
% vi lambda_function.py
4. Paste the code into lambda_function.py:

import requests

def lambda_handler(event, context):
    response = requests.get("https://www.test.com/")
    print(response.text)
    return response.text
5. Install the requests library. Note: a package folder is created
% pip install --target ./package requests
6. Go inside package
% cd package
7. Zip the package contents
% zip -r ../deployment-package.zip .
8. Go back into the parent folder
% cd ..
9. Add the Lambda function file to the deployment package
% zip -g deployment-package.zip lambda_function.py
In the AWS Lambda console, tap "Upload from" and pick ".zip file". Navigate to your zip package: deployment-package.zip.
After upload, all the files will be inside the AWS Lambda function.
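After step 9 you can verify the layout without uploading anything (a standard-library sketch; the in-memory zip below just simulates the package built above): lambda_function.py has to sit at the archive root, right next to the dependency folders.

```python
import io
import zipfile

# Simulate the deployment package: dependency folder from step 7 plus the
# entry-point file added in step 9, both at the archive root.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("requests/__init__.py", "")
    zf.writestr("lambda_function.py", "def lambda_handler(event, context): ...")

with zipfile.ZipFile(buf) as zf:
    names = zf.namelist()
print("lambda_function.py" in names)  # True
```

If lambda_function.py shows up under a subdirectory instead, the handler string lambda_function.lambda_handler will fail to resolve.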
Python 3.8, Windows 10.
Lambda looks for a specific folder structure, and we are going to recreate it in the steps below (https://docs.aws.amazon.com/lambda/latest/dg/configuration-layers.html#configuration-layers-create):
Make a folder on your desktop called "python", then open a cmd terminal and cd desktop
pip install --target python requests
Right-click your python folder, zip it, and rename the zip to 'requests.zip'. Now if you look inside the zip you should see the python folder.
aws console > lambda > layers > create layer => name layer/upload requests.zip
aws console > functions > create function => in the "designer" box select layers and then "add layers." Choose custom layers and select
your layer.
Go back to the function screen by clicking on the lambda symbol in the designer box. Now you can see "function code" again. Click lambda_function.py
Now you can import requests like this:
import json
import requests

def lambda_handler(event, context):
    # TODO implement
    response = requests.get('your_URL')
    return {
        'statusCode': 200,
        'body': json.dumps(response.json())
    }
Copy whatever you have in the lambda_function from the AWS Lambda console, paste it into a new Python script, and save it as lambda_function.py.
Make a new folder (I named it package) and install the requests module into it by running the following in a terminal: pip install -t package requests
Move lambda_function.py into the folder (package).
Go to the folder and select all content and zip them.
Go back to the AWS Lambda console. Select the function and, under the Function code section, click 'Action' (on the right side) and select Upload a .zip file.
Upload the zip; lambda_function should be loaded automatically.
Run and Enjoy.
Add a layer to your Lambda function by specifying this ARN (region ap-south-1):
arn:aws:lambda:ap-south-1:770693421928:layer:Klayers-p38-requests-html:10