What I am trying to do is use aws-lambda to import zipped SQL files into aws-rds. In my case, zipped SQL files are constantly being uploaded to s3 by some crawlers. What I want is: whenever an SQL file is uploaded to the s3 bucket, aws-lambda should use a mysql client to import it into aws-rds.
The way I have thought of doing this is by packaging a mysql client inside the zip for the aws-lambda handler. But I can't really figure out how to package the mysql binary inside a zip. Is this possible? If yes, then a list of steps to achieve this would be really helpful!
PS: I am using python-2.7 for writing the aws-lambda handler. I am not interested in using any python-mysql library to achieve this task. The reason is that I don't want to unzip the files, load them into memory and then execute them. These files can be very large, so I don't want to hold them in memory.
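Roughly what I have in mind for the handler, assuming I can bundle a statically linked mysql client at bin/mysql inside the deployment package and pass connection details through environment variables (MYSQL_HOST, MYSQL_USER, MYSQL_PASS, MYSQL_DB are names I made up). It downloads the archive to /tmp (so it is still subject to Lambda's /tmp size limit) but streams the inner .sql member to the mysql process in chunks, so the SQL itself is never held in memory:
import os
import shutil
import subprocess
import zipfile
import boto3

s3 = boto3.client('s3')

def handler(event, context):
    # S3 put event -> bucket/key of the uploaded zip
    record = event['Records'][0]['s3']
    bucket = record['bucket']['name']
    key = record['object']['key']

    local_zip = '/tmp/upload.zip'
    s3.download_file(bucket, key, local_zip)

    # mysql client bundled inside the deployment package
    mysql_bin = os.path.join(os.environ['LAMBDA_TASK_ROOT'], 'bin', 'mysql')
    cmd = [mysql_bin,
           '-h', os.environ['MYSQL_HOST'],
           '-u', os.environ['MYSQL_USER'],
           '-p' + os.environ['MYSQL_PASS'],
           os.environ['MYSQL_DB']]

    with zipfile.ZipFile(local_zip) as zf:
        for name in zf.namelist():
            if not name.endswith('.sql'):
                continue
            proc = subprocess.Popen(cmd, stdin=subprocess.PIPE)
            with zf.open(name) as sql_stream:
                # stream the decompressed SQL straight into the mysql client
                shutil.copyfileobj(sql_stream, proc.stdin)
            proc.stdin.close()
            if proc.wait() != 0:
                raise RuntimeError('mysql import failed for ' + name)
    os.remove(local_zip)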
Related
My problem is simple: I want to read a parquet file from s3 into a PCollection in Apache Beam using the Python SDK.
I know of the apache_beam.io.parquetio module but this one does not seem to be able to read from s3 directly (or does it?).
I know of the apache_beam.io.aws.s3io module but this one seems to return an s3 file object or something that is not a PCollection anyway (or does it?).
So what’s the best way to do this?
If you install Beam with the aws requirement:
pip install 'apache-beam[aws]'
you can just pass in an s3 filename to read from it:
filename = "s3://bucket-name/...
beam.io.ReadFromParquet(filenam)
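Putting it together, a minimal pipeline could look like this (the bucket path is a placeholder; with the aws extra installed, credentials should be picked up the usual boto way, e.g. from environment variables or ~/.aws):
import apache_beam as beam

with beam.Pipeline() as p:
    (p
     # bucket and path are placeholders
     | 'ReadParquet' >> beam.io.ReadFromParquet('s3://bucket-name/path/to/data.parquet')
     # each element is a dict-like record from the parquet file
     | 'Inspect' >> beam.Map(print))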
My requirement is to download files from an S3 bucket on a daily basis based on a date filter (e.g. day=2018-07-14). We are able to download successfully with the AWS CLI using the command below:
aws s3 cp s3://<bucketname>/day=2018-07-14 local_dir --recursive
But I would like to download using a Python script (maybe boto3). Can anyone suggest the steps to take, and in particular the configuration steps (I am using a Windows machine), to download using .py scripts?
Thanks in advance.
import boto3
This will unlock the Python functionality you desire:
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html
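To flesh that out a bit: once credentials are configured (run aws configure, or set the AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY / AWS_DEFAULT_REGION environment variables; both work on Windows), something along these lines should mirror the day=2018-07-14 prefix. The bucket name is a placeholder:
import os
import boto3

s3 = boto3.client('s3')
bucket = 'bucketname'            # placeholder
prefix = 'day=2018-07-14/'
local_dir = 'local_dir'

# list every object under the prefix, page by page, and download it
paginator = s3.get_paginator('list_objects_v2')
for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
    for obj in page.get('Contents', []):
        key = obj['Key']
        if key.endswith('/'):    # skip "folder" placeholder keys
            continue
        target = os.path.join(local_dir, os.path.relpath(key, prefix))
        if os.path.dirname(target):
            os.makedirs(os.path.dirname(target), exist_ok=True)
        s3.download_file(bucket, key, target)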
I am new to AWS Lambda, and I have a PhantomJS application to run there.
There is a Python script of 5 KB and a PhantomJS binary, which brings the whole uploadable zip to 32 MB.
And I have to upload this whole bundle every time. Is there any way to push the PhantomJS binary to the AWS Lambda /bin folder separately?
No, there is no way to accomplish this. Your Lambda function is always provisioned as a whole from the latest zipped package you provide (or S3 bucket/key if you choose that method).
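If the repeated uploads are the pain point, you can at least script the deployment: upload the package to S3 and then update the function code from there. A rough boto3 sketch (all the names below are placeholders):
import boto3

s3 = boto3.client('s3')
lam = boto3.client('lambda')

# push the full package to S3 once per change (names are placeholders)
s3.upload_file('package.zip', 'my-deploy-bucket', 'phantomjs-func/package.zip')

# point the Lambda function at the new package in S3
lam.update_function_code(
    FunctionName='phantomjs-func',
    S3Bucket='my-deploy-bucket',
    S3Key='phantomjs-func/package.zip')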
Is it possible to create a new excel spreadsheet file and save it to an Amazon S3 bucket without first saving to a local filesystem?
For example, I have a Ruby on Rails web application which currently generates Excel spreadsheets using the write_xlsx gem and saves them to the server's local file system. Internally, it looks like the gem uses Ruby's IO.copy_stream when it saves the spreadsheet. I'm not sure this will work when moving to Heroku and S3.
Has anyone done this before using Ruby or even Python?
I found this earlier question, Heroku + ephemeral filesystem + AWS S3. So it would seem this is not possible on Heroku. Theoretically, it would be possible using a service which allows attaching an Amazon EBS volume.
There is a dedicated Ruby gem to help you move files to Amazon S3:
https://rubygems.org/gems/aws-s3
If you want more details about the implementation, the gem's git repository is worth a look. The documentation on that page is very complete and explains how to move a file to S3. Hope it helps.
Once your xls file is created, the library helps you create an S3Object and store it in a bucket (which you can also create with the library).
S3Object.store('keyOfYourData', open('nameOfExcelFile.xls'), 'bucketName')
If you want more choice, Amazon also provides an official gem for this purpose: https://rubygems.org/gems/aws-sdk
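Since the question mentions Python too: the same idea, building the workbook entirely in memory and never touching the local filesystem, can be sketched with xlsxwriter (the Python counterpart of write_xlsx) and boto3. Bucket and key are placeholders:
import io
import boto3
import xlsxwriter

# build the workbook entirely in memory
buffer = io.BytesIO()
workbook = xlsxwriter.Workbook(buffer, {'in_memory': True})
worksheet = workbook.add_worksheet()
worksheet.write('A1', 'Hello from S3')
workbook.close()

# upload the bytes straight to S3 (bucket and key are placeholders)
buffer.seek(0)
s3 = boto3.client('s3')
s3.put_object(Bucket='my-bucket',
              Key='reports/report.xlsx',
              Body=buffer.getvalue())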
I have an application in which I need to zip folders hosted on S3. The zipping process will be triggered from the model's save method. The Django app is running on an EC2 instance. Any thoughts or leads on this?
I tried django_storages but haven't had a breakthrough.
From my understanding you can't zip files directly on S3. You would have to download the files you want to zip, zip them up, then upload the zipped file. I've done something similar before and used s3cmd to keep a locally synced copy of my S3 bucket, and since you're on an EC2 instance, network speed and latency will be pretty good.
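As a rough sketch of that download, zip, upload flow with boto3 (bucket name and prefix are placeholders, and everything is staged in a temp directory on the EC2 box):
import os
import tempfile
import zipfile
import boto3

s3 = boto3.client('s3')
bucket = 'my-bucket'            # placeholder
prefix = 'folder-to-zip/'       # the S3 "folder" to archive

workdir = tempfile.mkdtemp()
zip_path = os.path.join(workdir, 'archive.zip')
scratch = os.path.join(workdir, 'scratch')

# download everything under the prefix and write it into a local zip
with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as zf:
    paginator = s3.get_paginator('list_objects_v2')
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get('Contents', []):
            key = obj['Key']
            if key.endswith('/'):
                continue
            s3.download_file(bucket, key, scratch)
            zf.write(scratch, arcname=os.path.relpath(key, prefix))

# push the finished zip back to S3
s3.upload_file(zip_path, bucket, prefix.rstrip('/') + '.zip')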