I have Kinesis streams set up locally with LocalStack, and Lambda functions (in Python) set up locally with Serverless Offline. I cannot create an event source mapping between them because I get 404 and 500 errors.
Kinesis is set up with Docker-compose:
version: '3'
services:
  localstack:
    container_name: "localstack"
    image: localstack/localstack:latest
    environment:
      - DEFAULT_REGION=eu-central-1
      - SERVICES=kinesis
      - DOCKER_HOST=unix:///var/run/docker.sock
    ports:
      - "4566:4566"            # LocalStack Gateway
      - "4510-4559:4510-4559"  # external services port range
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock"
Streams are set up with boto3:
import boto3

if __name__ == '__main__':
    client = boto3.client(
        "kinesis",
        region_name="eu-central-1",
        endpoint_url="http://localhost:4566"
    )
    client.create_stream(StreamName="audience_events_local", ShardCount=1)
    client.create_stream(StreamName="audience_events_local_cache", ShardCount=1)
Lambda functions are set up with Serverless Offline: serverless offline --stage=local. Relevant part of serverless.yml:
serverless-offline:
  httpPort: 3000   # HTTP port to listen on
  lambdaPort: 3002 # Lambda HTTP port to listen on
I try to set up event sources with:
import boto3


def get_kinesis_stream_arns() -> list[str]:
    client = boto3.client(
        "kinesis",
        region_name="eu-central-1",
        endpoint_url="http://localhost:4566"
    )
    return [
        client.describe_stream(StreamName=stream_name)["StreamDescription"]["StreamARN"]
        for stream_name in ["audience_events_local", "audience_events_local_cache"]
    ]


def create_event_sources(stream_arns: list[str]) -> None:
    client = boto3.client(
        "lambda",
        region_name="eu-central-1",
        endpoint_url="http://localhost:4566"
    )
    for arn in stream_arns:
        # example:
        # arn:aws:kinesis:eu-central-1:000000000000:stream/audience_events_local
        # -> function_name = audience-events-local
        function_name = arn.split("/")[-1].replace("_", "-")
        client.create_event_source_mapping(
            EventSourceArn=arn,
            FunctionName=function_name,
            MaximumRetryAttempts=2
        )


if __name__ == '__main__':
    stream_arns = get_kinesis_stream_arns()
    print("Stream ARNs:", stream_arns)
    create_event_sources(stream_arns)
However, I get errors:
if I use endpoint_url="http://localhost:4566" in create_event_sources, I get botocore.exceptions.ClientError: An error occurred (500) when calling the CreateEventSourceMapping operation (reached max retries: 4):
if I use endpoint_url="http://localhost:3002", I get botocore.exceptions.ClientError: An error occurred (404) when calling the CreateEventSourceMapping operation: Not Found
How can I fix this?
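To narrow down which side is rejecting the call, one diagnostic I can run (a sketch, not a fix) is to ask each endpoint which functions it actually knows about; Serverless Offline's Lambda port appears to emulate only part of the Lambda API, so the second call may well fail, which is itself informative:

import boto3

for endpoint in ["http://localhost:4566", "http://localhost:3002"]:
    client = boto3.client(
        "lambda",
        region_name="eu-central-1",
        endpoint_url=endpoint
    )
    try:
        # List the functions registered with this endpoint.
        names = [f["FunctionName"] for f in client.list_functions()["Functions"]]
        print(endpoint, "->", names)
    except Exception as exc:
        print(endpoint, "-> failed:", exc)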
Related
My requirement is to stop the execution of the Cloud Run container once the execution of the Python code is complete.
The task is to fetch files from Cloud Storage, process them, and then export them back to Cloud Storage.
I am able to complete the task successfully, but the Cloud Build is ending with the error below.
"deploy-to-cloud-run": ERROR: (gcloud.run.deploy) Cloud Run error: Container failed to start. Failed to start and then listen on the port defined by the PORT environment variable. Logs for this revision might contain more info.
CloudBuild.yml
- name: 'gcr.io/cloud-builders/docker'
  args: ['build', '-t', 'gcr.io/$PROJECT_ID/reponame:latest', '-t', 'gcr.io/$PROJECT_ID/reponame:$COMMIT_SHA', '-t', 'gcr.io/$PROJECT_ID/reponame:$BUILD_ID', '.']
  id: 'build-image-reponame'
  waitFor: ['-'] # The '-' indicates that this step begins immediately.

- name: 'gcr.io/cloud-builders/docker'
  args: ['push', 'gcr.io/$PROJECT_ID/reponame:$COMMIT_SHA']
  id: 'push-image-to-container-registry'
  waitFor: ['build-image-reponame']

- name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
  entrypoint: gcloud
  args:
    - 'run'
    - 'deploy'
    - 'reponame'
    - '--image'
    - 'gcr.io/$PROJECT_ID/reponame:$COMMIT_SHA'
    - '--region'
    - 'us-east1'
    - '--platform'
    - 'managed'
  waitFor: ['push-image-to-container-registry']
  id: 'deploy-to-cloud-run'

images:
  - 'gcr.io/$PROJECT_ID/reponame:latest'
  - 'gcr.io/$PROJECT_ID/reponame:$COMMIT_SHA'
  - 'gcr.io/$PROJECT_ID/reponame:$BUILD_ID'
You can simply rely on the Getting Started tutorial.
For your code, do something like this (code taken from the tutorial, with some comments):
import os

from flask import Flask

app = Flask(__name__)


@app.route("/")
def hello_world():
    # name = os.environ.get("NAME", "World")
    # return "Hello {}!".format(name)
    # Put your logic here
    return "Done", 200  # return the response to the request nicely


if __name__ == "__main__":
    app.run(debug=True, host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))
Then, when you want to run your processing, call the / route and the process will be performed. That's all. If the processing takes more than 3 minutes, you can increase the Cloud Run timeout up to 60 minutes.
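To actually trigger the processing after deployment, something along these lines is enough; the service URL below is a placeholder for whatever gcloud run deploy prints (and if the service requires authentication you would also need to send an identity token):

import requests

# Placeholder URL - substitute the URL printed by `gcloud run deploy`.
SERVICE_URL = "https://reponame-xxxxxxxx-ue.a.run.app/"

# Give the client at least as much time as the Cloud Run request timeout.
resp = requests.get(SERVICE_URL, timeout=600)
print(resp.status_code, resp.text)  # expect: 200 Done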
I created a Lambda function using Serverless in the private subnets of a non-default VPC. I want to restart the app server of an Elastic Beanstalk application at a scheduled time. I used boto3; here is the reference: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/elasticbeanstalk.html
The problem is that when I run the function locally, it runs and restarts the application server. But when I deploy using sls deploy, it does not work and I get a null response back when I test it from the Lambda console.
Here is the code:
import json
from logging import log
from loguru import logger
import boto3
from datetime import datetime
import pytz


def main(event, context):
    try:
        client = boto3.client("elasticbeanstalk", region_name="us-west-1")
        applications = client.describe_environments()
        current_hour = datetime.now(pytz.timezone("US/Eastern")).hour
        for env in applications["Environments"]:
            applicationname = env["EnvironmentName"]
            if applicationname == "xxxxx-xxx":
                response = client.restart_app_server(
                    EnvironmentName=applicationname,
                )
                logger.info(response)
                print("restarted the application")
                return {"statusCode": 200, "body": json.dumps("restarted the instance")}
    except Exception as e:
        logger.exception(e)


if __name__ == "__main__":
    main("", "")
Here is the serverless.yml file:
service: beanstalk-starter

frameworkVersion: '2'

provider:
  name: aws
  runtime: python3.8
  lambdaHashingVersion: 20201221
  profile: xxxx-admin
  region: us-west-1
  memorySize: 512
  timeout: 15
  vpc:
    securityGroupIds:
      - sg-xxxxxxxxxxx # open on all ports for inbound
    subnetIds:
      - subnet-xxxxxxxxxxxxxxxx # private
      - subnet-xxxxxxxxxxxxxxxx # private

plugins:
  - serverless-python-requirements

custom:
  pythonRequirements:
    dockerizePip: non-linux

functions:
  main:
    handler: handler.main
    events:
      - schedule: rate(1 minute)
Response from lambda console:
The area below shows the result returned by your function execution. Learn more about returning results from your function.
null
Any help would be appreciated! Let me know what I'm missing here!
To solve this, I had to give these two permissions to my AWS Lambda role from the AWS Management Console. You can also set the permissions in the serverless.yml file:
AWSLambdaVPCAccessExecutionRole
AWSCodePipeline_FullAccess
(Make sure you use least privilege when granting permissions to a role.)
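If you prefer to attach them from code rather than through the console or serverless.yml, a boto3 sketch (the role name is a placeholder for the execution role that Serverless generated for the function):

import boto3

iam = boto3.client("iam")

# Placeholder: the execution role created by the Serverless framework.
ROLE_NAME = "beanstalk-starter-dev-us-west-1-lambdaRole"

for policy_arn in [
    "arn:aws:iam::aws:policy/service-role/AWSLambdaVPCAccessExecutionRole",
    "arn:aws:iam::aws:policy/AWSCodePipeline_FullAccess",
]:
    iam.attach_role_policy(RoleName=ROLE_NAME, PolicyArn=policy_arn)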
Thank you.
I have a web app built in Flask where tweets are captured (using the Tweepy library) and displayed on the front-end. I used Socket.IO to display the tweets live on the front-end.
My code works fine when I run it locally; the tweets appear instantly.
However, when I Dockerized the web app, the front-end doesn't update immediately. It takes some time to show the changes (and sometimes I think tweets are lost due to the slowness).
Below are code extracts from my website:
fortsocket.js
$(document).ready(function () {

    /************************************/
    /*********** My Functions ***********/
    /************************************/

    function stream_active_setup() {
        $("#favicon").attr("href", "/static/icons/fortnite-active.png");
        $("#stream-status-ic").attr("src", "/static/icons/stream-active.png");
        $("#stream-status-text").text("Live stream active");
    }

    function stream_inactive_setup() {
        $("#favicon").attr("href", "/static/icons/fortnite-inactive.png");
        $("#stream-status-ic").attr("src", "/static/icons/stream-inactive.png");
        $("#stream-status-text").text("Live stream inactive");
    }

    /*********************************/
    /*********** My Events ***********/
    /*********************************/

    // Socket connection to server
    // Prometheus
    //var socket = io.connect('http://104.131.173.145:8083');
    // Local
    var socket = io.connect(window.location.protocol + '//' + document.domain + ':' + location.port);
    // Heroku
    //var socket = io.connect('https://fortweet.herokuapp.com/');

    // Send a hello to know
    // if a stream is already active
    socket.on('connect', () => {
        socket.emit('hello-stream', 'hello-stream');
    });

    // Listener for the reply from hello
    socket.on('hello-reply', function (bool) {
        if (bool == true) {
            stream_active_setup()
        } else {
            stream_inactive_setup()
        }
    });

    // Listens for tweets
    socket.on('stream-results', function (results) {
        // Insert tweets in divs
        $('#live-tweet-container').prepend(`
            <div class="row justify-content-md-center mt-3">
                <div class="col-md-2">
                    <img width="56px" height="56px" src="${results.profile_pic !== "" ? results.profile_pic : "/static/icons/profile-pic.png"}" class="mx-auto d-block rounded" alt="">
                </div>
                <div class="col-md-8 my-auto">
                    <div><b>${results.author}</b></div>
                    <div>${results.message}</div>
                </div>
            </div>
        `);
    });

    // Listener for when a stream of tweets starts
    socket.on('stream-started', function (bool) {
        if (bool == true) {
            stream_active_setup()
        }
    });

    // Listener for when a stream of tweets ends
    socket.on('stream-ended', function (bool) {
        if (bool == true) {
            stream_inactive_setup()
        }
    });
});
__init__.py
# Create the app
app = create_app()

# JWT Configurations
jwt = JWTManager(app)

# Socket IO
socketio = SocketIO(app, cors_allowed_origins="*")

# CORS
CORS(app)
app.config["CORS_HEADERS"] = "Content-Type"

# Creates default admins and insert in db
create_default_admin()


# Main error handlers
@app.errorhandler(404)  # Handling HTTP 404 NOT FOUND
def page_not_found(e):
    return Err.ERROR_NOT_FOUND


# Listen for hello emit data
# from client
@socketio.on("hello-stream")
def is_stream_active(hello_stream):
    emit("hello-reply", streamer.StreamerInit.is_stream_active(), broadcast=True)
streamer.py
import time
import tweepy
import threading as Coroutine

import app.messages.constants as Const
import app.setup.settings as settings_mod
import app.models.tweet as tweet_mod
import app.services.logger as logger
import app


class FStreamListener(tweepy.StreamListener):
    def __init__(self):
        self.start_time = time.time()
        self.limit = settings_mod.TwitterSettings.get_instance().stream_time
        logger.get_logger().debug("Live capture has started")

        # Notify client that a live capture will start
        app.socketio.emit(
            "stream-started", True, broadcast=True,
        )

        super(FStreamListener, self).__init__()

    def on_status(self, status):
        if (time.time() - self.start_time) < self.limit:
            # Create tweet object
            forttweet = tweet_mod.TweetModel(
                status.source,
                status.user.name,
                status.user.profile_background_image_url_https,
                status.text,
                status.created_at,
                status.user.location,
            )

            # Emit to socket
            app.socketio.emit(
                "stream-results",
                {
                    "profile_pic": forttweet.profile_pic,
                    "author": forttweet.author,
                    "message": forttweet.message,
                },
                broadcast=True,
            )

            # Add to database
            forttweet.insert()
            return True
        else:
            logger.get_logger().debug("Live capture has ended")

            # Notify client that a live capture has ended
            app.socketio.emit(
                "stream-ended", True, broadcast=True,
            )

            # Stop the loop of streaming
            return False

    def on_error(self, status):
        logger.get_logger().debug(f"An error occurred while fetching tweets: {status}")
        raise Exception(f"An error occurred while fetching tweets: {status}")


class StreamerInit:
    # [Private] Twitter configurations
    def __twitterInstantiation(self):
        # Get settings instance
        settings = settings_mod.TwitterSettings.get_instance()

        # Auths
        auth = tweepy.OAuthHandler(settings.consumer_key, settings.consumer_secret,)
        auth.set_access_token(
            settings.access_token, settings.access_token_secret,
        )

        # Get API
        api = tweepy.API(auth)

        # Live Tweets Streaming
        myStreamListener = FStreamListener()
        myStream = tweepy.Stream(auth=api.auth, listener=myStreamListener)
        myStream.filter(track=settings.filters)

    def start(self):
        for coro in Coroutine.enumerate():
            if coro.name == Const.FLAG_TWEETS_LIVE_CAPTURE:
                return False
        stream = Coroutine.Thread(target=self.__twitterInstantiation)
        stream.setName(Const.FLAG_TWEETS_LIVE_CAPTURE)
        stream.start()
        return True

    @staticmethod
    def is_stream_active():
        for coro in Coroutine.enumerate():
            if coro.name == Const.FLAG_TWEETS_LIVE_CAPTURE:
                return True
        return False
The streamer.py module is called on a button click.
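For context, the button presumably hits a route that looks roughly like the hypothetical sketch below; the blueprint name, path, and import are assumptions, not the repository's actual code:

from flask import Blueprint, jsonify

import app.services.streamer as streamer  # assumed module path

stream_bp = Blueprint("stream", __name__)  # hypothetical blueprint


@stream_bp.route("/stream/start", methods=["POST"])  # hypothetical path
def start_stream():
    # start() returns False when a capture thread is already running.
    started = streamer.StreamerInit().start()
    return jsonify({"started": started}), 200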
Dockerfile
# Using Python 3.6.5 on Debian Stretch
FROM python:3.6.5-stretch
# Set the working directory to /app
WORKDIR /app
# Copy the current directory contents into the container at /app
ADD . /app
RUN apt-get update -y && apt-get upgrade -y && pip install -r requirements.txt
# Run the command
ENTRYPOINT ["uwsgi", "app.ini"]
#ENTRYPOINT ["./entry.sh"]
docker-compose.yml
version: "3.8"
services:
fortweet:
container_name: fortweet
image: mervin16/fortweet:dev
build: ./
env_file:
- secret.env
networks:
plutusnet:
ipv4_address: 172.16.0.10
expose:
- 8083
restart: always
nginx_fortweet:
image: nginx
container_name: nginx_fortweet
ports:
- "8083:80"
networks:
plutusnet:
ipv4_address: 172.16.0.100
volumes:
- ./nginx/nginx.conf:/etc/nginx/conf.d/default.conf
depends_on:
- fortweet
restart: always
networks:
plutusnet:
name: plutus_network
driver: bridge
ipam:
driver: default
config:
- subnet: 172.16.0.0/24
gateway: 172.16.0.1
app.ini
[uwsgi]
module = run:app
master = true
processes = 5
# Local & Prometheus
http-socket = 0.0.0.0:8083
http-websockets = true
chmod-socket = 660
vacuum = true
die-on-term = true
For the full, updated code, you can find it here under the branch dev/mervin.
Any help is appreciated.
In order to see if IPv6 is responsible, I would suggest you shut down everything.
Open /etc/sysctl.conf and add the following lines to disable IPv6:
net.ipv6.conf.all.disable_ipv6=1
net.ipv6.conf.default.disable_ipv6=1
net.ipv6.conf.lo.disable_ipv6=1
Run sudo sysctl -p so the changes can take effect.
Start nginx and the Docker containers again.
If you don't see any difference, you can just change the settings back to 0, rerun sysctl -p, and let me know.
Unfortunately I can't reproduce the issue without the configuration, so I can't verify my answer.
I was able to find a similar issue on JP's blog: Performance problems with Flask and Docker.
In short, it might be that having both IPv6 and IPv4 configs on the container is causing the issue.
In order to verify the issue:
Run the Docker container.
Go inside the running container and change the hosts file so that it won't map IPv6 to localhost.
Run the application again inside the container.
If the app runs smoothly, then you've identified your issue.
The solution would be to tweak the uwsgi parameters.
What the author did in the blog post:
CMD uwsgi -s /tmp/uwsgi.sock -w project:app --chown-socket=www-data:www-data --enable-threads & nginx -g 'daemon off;'
I am running a MinIO server in a container with docker-compose. I am trying to upload a file to the MinIO server in the container from the host machine (Ubuntu), rather than from a container, using a MinIO client (Python SDK).
I could not make it work as expected.
I am not sure whether it is because of my endpoint (URL) or due to a connection issue between the container and the host.
The endpoints I tried:
url_1 = 'http://minio:9000' # from my default setup for the minio link
url_2 = 'http://localhost:9000/minio/test' # from the MinIO browser
For url_1, what I got is: "botocore.exceptions.EndpointConnectionError: Could not connect to the endpoint URL: http://minio:9000/test".
The line with the error: s3.create_bucket(Bucket='test')
For url_2, what I got is: "All access to this bucket has been disabled."
The line with the error: s3.create_bucket(Bucket='test')
I tried a similar thing: activating my MinIO server and MinIO client both on my host machine, then uploading files from the MinIO client to the MinIO server. I can see those uploaded files in the MinIO browser on localhost.
######### python script uploading files
import boto3
from botocore.client import Config
import os
import getpass

my_url1 = 'http://minio:9000'                 # this is from os.environ['S3_URL']
my_url2 = 'http://localhost:9000/minio/test'  # this is from the browser

s3 = boto3.resource('s3',
                    endpoint_url=my_url2,
                    aws_access_key_id=os.environ['USER'],
                    aws_secret_access_key=getpass.getpass('Password:'),
                    config=Config(signature_version='s3v4'),
                    region_name='us-east-1')

print('********', s3)

s3.create_bucket(Bucket='test')

uploadfile = os.getcwd() + '/' + 'test.txt'
s3.Bucket('testBucket').upload_file(uploadfile, 'txt')
######### docker-yml file for Minio
minio:
  image: minio/minio
  entrypoint:
    - minio
    - server
    - /data
  ports:
    - "9000:9000"
  environment:
    minio_access_key = username
    minio_secret_key = password

mc:
  image: minio/mc
  environment:
    minio_access_key = username
    minio_secret_key = password
  entrypoint:
    /bin/sh -c
  depends_on:
    minio
I expected to see the uploaded files in the MinIO browser ('http://localhost:9000/minio/test'), just like when I activated the MinIO server and MinIO client both on the host.
With default Docker networking, you would access MinIO at http://localhost:9000 on your host, so you can just use this URL in your Python script. http://minio:9000 will only work from containers on the same Docker network as your MinIO server.
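Concretely, that means pointing the script from the question at the host-mapped port, roughly like this (the credentials are placeholders for whatever MINIO_ACCESS_KEY / MINIO_SECRET_KEY were set to):

import boto3
from botocore.client import Config

s3 = boto3.resource(
    's3',
    endpoint_url='http://localhost:9000',    # host-mapped port of the minio container
    aws_access_key_id='username',            # placeholder: MINIO_ACCESS_KEY value
    aws_secret_access_key='password',        # placeholder: MINIO_SECRET_KEY value
    config=Config(signature_version='s3v4'),
    region_name='us-east-1'
)

s3.create_bucket(Bucket='test')
s3.Bucket('test').upload_file('test.txt', 'test.txt')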
Try using the Pyminio client instead of boto3.
import os
import getpass

from pyminio import Pyminio

pyminio_client = Pyminio.from_credentials(
    endpoint='http://localhost:9000/',
    access_key=os.environ['USER'],
    secret_key=getpass.getpass('Password:')
)

pyminio_client.mkdirs('/test/')
pyminio_client.put_file(
    to_path='/test/',
    file_path=os.path.join(os.getcwd(), 'test.txt')
)
Use this configuration in your compose.yml file:
version: "3"
services:
minio:
image: "minio/minio"
container_name: mi
ports:
- "9000:9000"
environment:
- "MINIO_ACCESS_KEY=ACCRESS"
- "MINIO_SECRET_KEY=SECRET"
restart: always
command: server /data
mc:
image: minio/mc
container_name: mc
network_mode: host
entrypoint: >
/bin/sh -c "
/usr/bin/mc config host add minio http://127.0.0.1:9000 ACCESS SECRET;
/usr/bin/mc rm -r --force minio/psb-new;
/usr/bin/mc mb minio/psb-new;
/usr/bin/mc policy set public minio/psb-new;
exit 0;
"
networks:
elastic:
driver: bridge
I have a rather simple test app:
import redis
import os
import logging

log = logging.getLogger()
log.setLevel(logging.DEBUG)


def test_redis(event, context):
    redis_endpoint = None
    if "REDIS" in os.environ:
        redis_endpoint = os.environ["REDIS"]
        log.debug("redis: " + redis_endpoint)
    else:
        log.debug("cannot read REDIS config environment variable")
        return {
            'statusCode': 500
        }

    redis_conn = None
    try:
        redis_conn = redis.StrictRedis(host=redis_endpoint, port=6379, db=0)
        redis_conn.set("foo", "boo")
        redis_conn.get("foo")
    except:
        log.debug("failed to connect to redis")
        return {
            'statusCode': 500
        }
    finally:
        del redis_conn

    return {
        'statusCode': 200
    }
which I have deployed as an HTTP endpoint with Serverless:
#
# For full config options, check the docs:
#    docs.serverless.com
#

service: XXX

plugins:
  - serverless-aws-documentation
  - serverless-python-requirements

custom:
  pythonRequirements:
    dockerizePip: true

provider:
  name: aws
  stage: dev
  region: eu-central-1
  runtime: python3.6
  environment:
    # our cache
    REDIS: xx-xx-redis-001.xxx.euc1.cache.amazonaws.com

functions:
  hello:
    handler: hello/hello_world.say_hello
    events:
      - http:
          path: hello
          method: get
          # private: true # <-- Requires clients to add API keys values in the `x-api-key` header of their request
          # authorizer: # <-- An AWS API Gateway custom authorizer function

  testRedis:
    handler: test_redis/test_redis.test_redis
    events:
      - http:
          path: test-redis
          method: get
When I trigger the endpoint via API Gateway, the lambda just times out after about 7 seconds.
The environment variable is read properly; no error message is displayed.
I suppose there's a problem connecting to Redis, but the tutorials are quite explicit, so I'm not sure what the problem could be.
The problem might be the need to set up a NAT; I'm not sure how to accomplish this with Serverless.
I ran into this issue as well. For me, there were a few problems that had to be ironed out:
The lambda needs VPC permissions.
The ElastiCache security group needs an inbound rule from the Lambda security group that allows communication on the Redis port. I thought they could just be in the same security group.
And the real kicker: I had turned on encryption in transit. This meant that I needed to pass ssl=True to redis.StrictRedis(...). The redis-py page mentions that ssl_cert_reqs needs to be set to None for use with ElastiCache, but that didn't seem to be true in my case; I did, however, need to pass ssl=True.
It makes sense that ssl=True needed to be set, but the connection was just timing out, so I went round and round trying to figure out what the problem with the permissions/VPC/SG setup was.
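For reference, a minimal sketch of such a connection with in-transit encryption enabled (the endpoint is a placeholder; short socket timeouts make a bad connection fail fast instead of eating the whole Lambda timeout):

import redis

redis_conn = redis.StrictRedis(
    host="xx-xx-redis-001.xxx.euc1.cache.amazonaws.com",  # placeholder endpoint
    port=6379,
    db=0,
    ssl=True,                  # required when in-transit encryption is on
    socket_connect_timeout=5,  # fail fast instead of hitting the Lambda timeout
    socket_timeout=5,
)
redis_conn.set("foo", "boo")
print(redis_conn.get("foo"))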
Try having the Lambda in the same VPC and security group as your ElastiCache cluster.