Deploying additional files into AWS Lambda (Python)

I am attempting to upload an additional file containing an encryption secret to AWS Lambda and am having trouble. The file is meant to be read and processed by my Python script. I have tested this functionality locally, and it works just fine.
I package and upload the .zip correctly, as AWS has no problem running the script once it is uploaded. However, my code fails at the line where it reads the file, even though the file should be in the working directory.
Is it possible to include a file in the AWS zip deployment and have it read by the script?

I was surprised that this did not work, so I did some digging for anyone interested.
I created a simple function:
import json
import os
import random

def lambda_handler(event, context):
    selection = str(random.randint(1, 5))
    with open('mydata.csv') as dogs:
        for l in dogs:
            if selection == l.split(',')[0]:
                random_dog = l.split(',')[2]
    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda!'),
        'cwd': os.getcwd(),
        'ls': os.listdir(),
        'random_dog': random_dog
    }
A data file:
akc_popularity,year,breed
1,2019,Labrador Retriever
2,2019,German Shepherd Dog
3,2019,Golden Retriever
4,2019,French Bulldog
5,2019,Bulldog
Added them to a zip archive:
$ zip fileImport.zip importer.py
$ zip fileImport.zip mydata.csv
Created the function:
$ aws lambda create-function --function-name fileImport --zip-file fileb://fileImport.zip --handler importer.lambda_handler --runtime python3.8 --role arn:aws:iam::***************:role/Lambda_StackExchange
Triggered the function:
$ aws lambda invoke --function-name fileImport output.json
{
    "StatusCode": 200,
    "ExecutedVersion": "$LATEST"
}
$ jq . output.json
{
  "statusCode": 200,
  "body": "\"Hello from Lambda!\"",
  "cwd": "/var/task",
  "ls": [
    "importer.py",
    "mydata.csv"
  ],
  "random_dog": "Bulldog\n"
}
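If a plain relative open() like the one above still fails in your deployment, a defensive sketch (the file name secret.txt is a placeholder for whatever you bundled) is to resolve the path against the task root rather than the working directory:
import os

# LAMBDA_TASK_ROOT points at /var/task, where the zip contents are extracted;
# falling back to this module's own directory keeps local testing working too.
BASE_DIR = os.environ.get("LAMBDA_TASK_ROOT", os.path.dirname(os.path.abspath(__file__)))
SECRET_PATH = os.path.join(BASE_DIR, "secret.txt")

def lambda_handler(event, context):
    with open(SECRET_PATH) as f:
        secret = f.read().strip()
    return {"statusCode": 200, "secretLength": len(secret)}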
So, please share some code so we can dive in! FYI: I would highly recommend storing secrets in AWS Secrets Manager. It is very easy to use and keeps hardcoded secrets out of things like version control. Additionally, changing your secret will not require a redeployment of your function.
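As a rough sketch of that approach (the secret name my/app/secret is a placeholder, not something from the question), reading a secret with boto3 looks roughly like this:
import boto3

SECRET_ID = "my/app/secret"  # placeholder; use your secret's name or ARN

secrets_client = boto3.client("secretsmanager")

def get_secret():
    # Fetches the secret string stored in AWS Secrets Manager.
    return secrets_client.get_secret_value(SecretId=SECRET_ID)["SecretString"]

def lambda_handler(event, context):
    secret = get_secret()
    # ... use the secret; avoid logging it ...
    return {"statusCode": 200}
The function's execution role also needs secretsmanager:GetSecretValue permission on that secret.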

Related

How to get the path of .py files under Azure function in Azure portal

I am working on a Python Azure Function. Below is part of the code.
df1 = pd.DataFrame(df)
df2 = df1.loc[0, 'version']
ipversion = f"testversion{df2}.py"
start_path = r'E:\Azure\Azure_FUNC'
path_to_file = os.path.join(start_path, ipversion)
logging.info(f"path_to_file: {path_to_file}")
path = Path(path_to_file)
version = f"testversion{df2}"
if ip:
    if path.is_file():
        module = 'Azure_FUNC.' + version
        my_module = importlib.import_module(module)
        return func.HttpResponse(f"{my_module.add(ip)}")
    else:
        return func.HttpResponse(f"This HTTP triggered function executed successfully. Flex calculation = {default.mult(ip)}")
else:
    return func.HttpResponse(
        "This HTTP triggered function executed successfully.",
        status_code=200
    )
Azure_FUNC is my function name.
testversion1, testversion2 and default are 3 .py files under this function.
In the above code, based on the version provided in the API call, the code checks whether that version's .py file is available, imports the module for that particular version, and executes its code. If the given version's .py file is not available, it falls back to default.py.
This works fine locally. But when I deploy this function to Azure, I am unable to find the path of the testversion1 and testversion2 files in the Azure portal under the function app.
Please let me know how to get the path of these files and how to check for them based on the version provided in the API call.
Thank you.
If you deploy the Azure Python Function project to a Linux Function App, you can see the location of your trigger files (i.e., the .py files) as follows:
Open the Kudu site of your Function App > Click on SSH >
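On a Linux Function App the code is deployed under /home/site/wwwroot, so the hard-coded E:\Azure\Azure_FUNC path will not exist there. A small sketch of a more portable check (the helper name load_version_module is just for illustration), resolving the path from the function's own file instead:
import importlib
from pathlib import Path

# Directory containing this function's files, both locally and after deployment.
FUNC_DIR = Path(__file__).resolve().parent

def load_version_module(version: str):
    # e.g. version == "testversion1"; the .py file is assumed to sit in FUNC_DIR.
    if (FUNC_DIR / f"{version}.py").is_file():
        return importlib.import_module(f"Azure_FUNC.{version}")
    return None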

Resource handler returned message: "Unzipped size must be smaller than 262144000 bytes" - AWS Lambda and CDK, no serverless.yaml file

I'm attempting to get a Lambda function working with AWS CDK. I've implemented the Lambda function in Python and want to include external libraries in my Lambda code. Currently this is my CDK code:
import aws_cdk as core
from aws_cdk import (
    Stack,
    aws_lambda as _lambda,
    aws_apigateway as apigw,
)
from constructs import Construct

class SportsTeamGeneratorStack(Stack):
    def __init__(self, scope: Construct, id: str, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)
        my_lambda = _lambda.Function(self, 'HelloHandler',
            runtime=_lambda.Runtime.PYTHON_3_8,
            code=_lambda.Code.from_asset("lambda",
                bundling=core.BundlingOptions(
                    image=_lambda.Runtime.PYTHON_3_8.bundling_image,
                    command=[
                        "bash", "-c",
                        "pip install --no-cache -r requirements.txt -t /asset-output && cp -au . /asset-output"
                    ],
                ),
            ),
            handler='hello.handler',
        )
        apigw.LambdaRestApi(
            self, 'Endpoint',
            handler=my_lambda,
        )
And this is my Lambda code:
import json
import pandas
def handler(event, context):
print('request: {}'.format(json.dumps(event)))
return {
'statusCode': 200,
'headers': {
'Content-Type': 'text/plain'
},
'body': 'Hello, CDK! You have hit {}\n'.format(event['path'])
}
The CDK code is in a directory called sports_team_generator, and the Lambda code is in a hello.py file located in a directory called "lambda". Within the "lambda" directory, I also have my requirements.txt file, which contains the following:
aws-cdk-lib==2.19.0
constructs>=10.0.0,<11.0.0
pytz==2022.1
requests==2.27.1
sportsipy==0.6.0
numpy==1.22.3
pandas==1.4.2
pyquery >= 1.4.0
I am currently trying to avoid using ECR to upload Docker images and then link those images to Lambda functions in the console, as I want to do everything through the CDK. The Lambda itself is small, and I have no clue why it might be exceeding the size limit. It seems as though the requirements.txt is causing the problem, and I'm not sure if there is a workaround. Preferably I would fix this error, but if that is not possible I am open to creating a Docker image, uploading it to ECR, and linking that image to a Lambda function through the CDK. If anyone has a solution or suggestions, please let me know.
I'm afraid your only option is deploying the Lambda from a container image; see https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-limits.html
The maximum unzipped deployment package size for a zip upload is 250 MB. Lambda layers do not help either, as the combined unzipped size of the function and all of its layers must also stay below 250 MB.
However, the container image code package size limit is 10 GB. Therefore, if you cannot get your Lambda's package size under 250 MB, containers are the way to go. Guides on how to build the container image and use it for the Lambda deployment are at https://docs.aws.amazon.com/lambda/latest/dg/images-create.html and https://aws.amazon.com/blogs/aws/new-for-aws-lambda-container-image-support/
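If you do go the container route, it does not require manually pushing images to ECR from the console: with the CDK, aws_lambda.DockerImageFunction can build a local Dockerfile and publish the image as part of cdk deploy. A minimal sketch, assuming a Dockerfile lives in the existing "lambda" directory (that Dockerfile is an assumption, not something from the question):
from aws_cdk import Stack, aws_lambda as _lambda, aws_apigateway as apigw
from constructs import Construct

class SportsTeamGeneratorStack(Stack):
    def __init__(self, scope: Construct, id: str, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)
        # Builds ./lambda/Dockerfile locally and pushes the image to the
        # CDK-managed ECR asset repository during `cdk deploy`.
        my_lambda = _lambda.DockerImageFunction(
            self, "HelloHandler",
            code=_lambda.DockerImageCode.from_image_asset("lambda"),
        )
        apigw.LambdaRestApi(self, "Endpoint", handler=my_lambda)
The Dockerfile would typically start from an AWS Lambda Python base image, copy hello.py, install requirements.txt, and set the image CMD to hello.handler.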

Django Scheduled task using Django-q

I'm trying to run a scheduled task using Django-q. I followed the docs, but it's not running.
Here's my config:
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.db.DatabaseCache',
        'LOCATION': 'db_cache_table',
    }
}

Q_CLUSTER = {
    'name': 'DjangORM',
    'workers': 4,
    'timeout': 90,
    'retry': 120,
    'queue_limit': 50,
    'bulk': 10,
    'orm': 'default'
}
Here's my scheduled task:
Nothing is executing. Please help.
I also had problems with getting scheduled tasks processed in the first place, but finally found a workflow.
I run django-q on a Windows machine, using the Django ORM as a broker.
Before talking about the execution routine I came up with, let's quickly check out my modules, starting with ...
settings.py:
Q_CLUSTER = {
    "name": "austrian_energy_monthly",
    "workers": 1,
    "timeout": 10,
    "retry": 20,
    "queue_limit": 50,
    "bulk": 10,
    "orm": "default",
    "ack_failures": True,
    "max_attempts": 1,
    "attempt_count": 0,
}
.. and my folder structure:
As you can see, the folder of my Django project is inside the src folder. Further, there's a folder for the app I created for this project, which is simply called "app". Inside the app folder I have another folder called "cron", which includes the following files and functions related to the scheduling:
tasks.py
I do not use the schedule() method provided by django-q; instead I create the Schedule table entries directly (see: django-q official schedule docs).
from django.utils import timezone
from austrian_energy_monthly.app.cron.func import create_text_file
from django_q.models import Schedule

Schedule.objects.create(
    func="austrian_energy_monthly.app.cron.func.create_text_file",
    kwargs={"content": "Insert this into a text file"},
    hooks="austrian_energy_monthly.app.cron.hooks.print_result",
    name="Text file creation process",
    schedule_type=Schedule.ONCE,
    next_run=timezone.now(),
)
Make sure you assign the "right" dotted path to the "func" keyword. Just using "func.create_text_file" didn't work for me, even though these files live in the same folder. The same goes for the "hooks" keyword.
(NOTE: I've set up my project as a development package via setup.py, so that I can import it from anywhere inside my src folder.)
func.py:
Contains the function called by the schedule table object.
def create_text_file(content: str) -> str:
    file = open("copy.txt", "w")
    file.write(content)
    file.close()
    return "Created a text file"
hooks.py:
Contains the function called after the scheduled process finished.
def print_result(task):
    print(task.result)
Let's now see how I managed to get the executions running with the file examples described above:
First I scheduled the "Text file creation process". To do so I used "python manage.py shell" and imported the tasks.py module (you could probably schedule everything via the admin page as well, but I haven't tested this yet):
You can now see the scheduled task, with a question mark in the success column on the admin page (tab "Scheduled tasks", as in your picture):
After that I opened a new terminal and started the cluster with "python manage.py qcluster", resulting in the following output:
The successful execution can be seen in "13:22:17 [Q] INFO Processed [ten-virginia-potato-high]", alongside the hook's print output "Created a text file" in the terminal. You can also check it on the admin page, under the tab "Successful tasks", where you should see:
Hope that helped!
Django-q doesn't support Windows. :)

Is it possible to add paths to the PATH environment variable through serverless.yml?

When I create an AWS Lambda layer, all the contents/modules of my zip file end up under /opt/ when the Lambda executes. This quickly becomes cumbersome and frustrating because I have to use absolute file-path imports in all my Lambdas. Example:
import json
import os
import importlib.util

spec = importlib.util.spec_from_file_location("dynamodb_layer.customer", "/opt/dynamodb_layer/customer.py")
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)

def fetch(event, context):
    CustomerManager = module.CustomerManager
    customer_manager = CustomerManager()
    body = customer_manager.list_customers(event["queryStringParameters"]["acquirer"])
    response = {
        "statusCode": 200,
        "headers": {
            "Access-Control-Allow-Origin": "*"
        },
        "body": json.dumps(body)
    }
    return response
So I was wondering: is it possible to add these /opt/ paths to the PATH environment variable beforehand through serverless.yml? That way, I could just do from dynamodb_layer.customer import CustomerManager instead of that freakish ugliness.
I have a Lambda layer for the Python 3.6 runtime. My my_package.zip structure is:
my_package.zip
- python
  - lib
    - python3.6
      - site-packages
        - customer
All dependencies are in the build folder in the project root:
e.g. build/python/lib/python3.6/site-packages/customer
Relevant section of my serverless.yml
layers:
  my_package:
    path: build
    compatibleRuntimes:
      - python3.6
In my Lambda I import my package like I would any other package:
import customer
Have you tried setting your PYTHONPATH env var?
https://stackoverflow.com/a/5944201/6529424
Have you tried adding to sys.path?
https://stackoverflow.com/a/12257807/6529424
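For the sys.path suggestion, a minimal sketch of what that would look like in the Lambda from the question (assuming the layer unpacks to /opt/dynamodb_layer, as it does there):
import sys

# Layer contents are extracted under /opt; adding that root to sys.path
# lets the package be imported by name instead of via importlib.
sys.path.append("/opt")

from dynamodb_layer.customer import CustomerManager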
In the zip archive, the module needs to be placed in a python subdirectory so that when it is extracted as a layer in Lambda, it ends up in /opt/python. That way you'll be able to import your module directly without the need for importlib.
It's documented here, or see this detailed blog post from an AWS developer advocate for more.
Setting the PYTHONPATH variable is not required, as long as you place items correctly inside the zip file.
Simple modules and package directories should be placed inside a directory named "python", and then the whole python/ directory placed into the zip file for uploading to AWS as a layer. Don't forget to add the "compatible runtimes" (e.g. Python 3.6, 3.7, 3.8 ...) settings for the layer.
So as an example:
python/
- my_module.py
- my_package_dir
-- __init__.py
-- package_mod_1.py
-- package_mod_2.py
which then get included in the zip file.
zip -r my_layer_zip.zip python/
The modules can then be imported without any more ado, when accessed as a layer:
....
import my_module
from my_package_dir.package_mod_2 import mod_2_function
....
You can see the package structure from within the Lambda if you look at '/opt/python/', which will show my_module.py, my_package_dir/, etc. This is easily tested using the AWS Lambda test console, assuming the layer is attached to the function (otherwise the code will error):
import json
import os

def lambda_handler(event, context):
    # TODO implement
    dir_list = os.listdir('/opt/python/')
    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda!'),
        'event': json.dumps(event),
        '/opt/python/': dir_list
    }

Unable to import module in AWS Lambda (Python)

I have a Python script named foo.py. It has a Lambda handler function defined like this:
def handler(event, context):
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        key = record['s3']['object']['key']
        download_path = '/tmp/{}.gz'.format(key)
        csv_path = '/tmp/{}.csv'.format(key)
        ... proceed to proprietary stuff
This is in a zip file like so:
foo.zip
  - foo.py
  - dependencies
I have uploaded this zip file to AWS Lambda and configured an AWS Lambda Function to run foo.handler. However, every time I test it, I get "errorMessage": "Unable to import module 'foo'".
Any ideas what might be going on here?
stat --format '%a' foo.py shows 664
So, I was importing psycopg2 in my Lambda function, which requires libpq.so, which is installed with Postgres. Postgres isn't installed in the Lambda environment, so importing psycopg2 failed, which meant that, by extension, Amazon's import of my Lambda function also failed. Not a very helpful error message, though.
Thankfully, somebody has built a version of psycopg2 that works with AWS Lambda: https://github.com/jkehler/awslambda-psycopg2
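As a side note, when Lambda reports only "Unable to import module 'foo'", one way to surface the underlying ImportError (a diagnostic sketch, not part of the original answer) is to defer the risky import and re-raise it with detail inside the handler:
# Diagnostic sketch: capture the real ImportError so it shows up in the
# function's logs instead of the generic "Unable to import module" message.
try:
    import psycopg2  # the dependency suspected of failing to import
    IMPORT_ERROR = None
except ImportError as exc:
    IMPORT_ERROR = exc

def handler(event, context):
    if IMPORT_ERROR is not None:
        raise RuntimeError(f"Dependency failed to import: {IMPORT_ERROR!r}")
    # ... normal processing ...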
