I want to make a link to download a file stored in S3.
<a href="https://s3.region.amazonaws.com/bucket/file.txt" download>Download</a>
It only displays file.txt in the browser.
So I found a way to force the download: add Content-Disposition: attachment metadata to the file.
But I need to add this metadata to new files automatically, so I wrote a Lambda function in Python.
import json
import urllib.parse
import boto3
print('Loading function')
s3 = boto3.client('s3')
def lambda_handler(event, context):
    # print("Received event: " + json.dumps(event, indent=2))

    # Get the object from the event and show its content type
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')
    try:
        response = s3.get_object(Bucket=bucket, Key=key)
        print("CONTENT TYPE: " + response['ContentType'])
    except Exception as e:
        print(e)
        print('Error getting object {} from bucket {}. Make sure they exist and your bucket is in the same region as this function.'.format(key, bucket))
        raise e
    try:
        s3_2 = boto3.resource('s3')
        s3_object = s3_2.Object(bucket, key)
        print(s3_object.metadata)
        s3_object.metadata.update({'ContentDisposition': 'attachment'})
        print(bucket, key)
        s3_object.copy_from(CopySource={'Bucket': bucket, 'Key': key}, Metadata=s3_object.metadata, MetadataDirective='REPLACE')
    except:
        print(s3_object.metadata)
    return response['ContentType']
But this function adds user-defined metadata, not system metadata.
What should I do?
Content-Disposition is treated by S3 as (somewhat) more like system metadata than custom/user-defined metadata, so it has its own argument.
s3_object.copy_from(CopySource={'Bucket':bucket, 'Key':key}, ContentDisposition='attachment', Metadata=s3_object.metadata, MetadataDirective='REPLACE')
Note that you still need Metadata and MetadataDirective as shown for this to work, but s3_object.metadata.update() is not required since you are not changing the custom metadata.
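Putting that together, a minimal sketch of the whole copy with the system header applied (the helper name here is just for illustration):
import boto3

s3 = boto3.resource('s3')

def set_attachment_disposition(bucket, key):
    obj = s3.Object(bucket, key)
    # Copy the object onto itself, keeping any custom metadata but
    # setting the system-level Content-Disposition header.
    obj.copy_from(CopySource={'Bucket': bucket, 'Key': key},
                  ContentDisposition='attachment',
                  Metadata=obj.metadata,
                  MetadataDirective='REPLACE')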
Related
SITUATION
I'm using a Lambda function that takes a CSV attachment from an incoming email and places it into what is, in effect, a sub-folder of an S3 bucket. This part of the Lambda works well; however, there are other UDFs which I need to execute, within the same Lambda function, to perform subsequent tasks.
CODE
import urllib.parse
import boto3
import email
import base64
import math
import pickle
import numpy as np
import pandas as pd
import io

###############################
###   GET THE ATTACHMENT   ###
###############################

# s3 = boto3.client('s3')
FILE_MIMETYPE = 'text/csv'
# 'application/octet-stream'

# destination folder
S3_OUTPUT_BUCKETNAME = 'my_bucket'

print('Loading function')

s3 = boto3.client('s3')


def lambda_handler(event, context):
    # source email bucket
    inBucket = event['Records'][0]['s3']['bucket']['name']
    key = urllib.parse.quote(event['Records'][0]['s3']['object']['key'].encode('utf8'))

    try:
        response = s3.get_object(Bucket=inBucket, Key=key)
        msg = email.message_from_string(response['Body'].read().decode('utf-8'))
    except Exception as e:
        print(e)
        print('Error retrieving object {} from source bucket {}. Verify existence and ensure bucket is in same region as function.'.format(key, inBucket))
        raise e

    attachment_list = []

    try:
        # scan each part of email
        for message in msg.walk():
            # Check filename and email MIME type
            if (message.get_content_type() == FILE_MIMETYPE and message.get_filename() != None):
                attachment_list.append({'original_msg_key': key, 'attachment_filename': message.get_filename(), 'body': base64.b64decode(message.get_payload())})
    except Exception as e:
        print(e)
        print('Error processing email for CSV attachments')
        raise e

    # if multiple attachments send all to bucket
    for attachment in attachment_list:
        try:
            s3.put_object(Bucket=S3_OUTPUT_BUCKETNAME, Key='attachments/' + attachment['original_msg_key'] + '-' + attachment['attachment_filename'], Body=attachment['body'])
        except Exception as e:
            print(e)
            print('Error sending object {} to destination bucket {}. Verify existence and ensure bucket is in same region as function.'.format(attachment['attachment_filename'], S3_OUTPUT_BUCKETNAME))
            raise e


#################################
###   ADDITIONAL FUNCTIONS   ###
#################################

def my_function():
    print("Hello, this is another function")
OUTCOME
The CSV attachment is successfully retrieved and placed in the destination as specified by s3.put_object, however there is no evidence in the Cloudwatch logs that my_function runs.
WHAT I HAVE TRIED
I've tried using def my_function(event, context): in an attempt to ascertain whether the function requires the same arguments as the first function in order to be executed. I've also tried to include my_function() as part of the first function, but this does not appear to work either.
How can I ensure that both functions are executed within the Lambda?
Based on the comments.
The issue was that my_function was never called inside the Lambda handler.
The solution was to add a call to my_function() inside lambda_handler so that my_function is actually executed.
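In other words, the handler has to invoke the extra function explicitly; a minimal sketch of the change:
def lambda_handler(event, context):
    # ... existing attachment-processing code ...
    my_function()  # explicitly call the additional function
    return 'done'

def my_function():
    print("Hello, this is another function")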
I created an event-triggered Lambda function that copies a zip file from a source S3 bucket to a target S3 bucket whenever a zip file is uploaded to the source bucket. Below is the Lambda function in Python:
from __future__ import print_function

import json
import urllib
import boto3

print('Loading function')

s3 = boto3.client('s3')


def lambda_handler(event, context):
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = urllib.unquote_plus(event['Records'][0]['s3']['object']['key'].encode("utf8"))
    target_bucket = 'bucket name'
    copy_source = {'Bucket': bucket, 'Key': key}
    try:
        response = s3.get_object(Bucket=bucket, Key=key)
        print("CONTENT TYPE: " + response['ContentType'])
        return response['ContentType']
        print("copying from source to target")
        s3.copy_object(Bucket=target_bucket, Key=key, CopySource=copy_source)
    except Exception as e:
        print(e)
        print('Error getting object {} from bucket {}. Make sure they exist and your bucket is in the same region as this function.'.format(key, bucket))
        raise e
Below is the error message that I get by executing the code:
Traceback (most recent call last):
  File "/var/task/lambda_function.py", line 13, in lambda_handler
    bucket = event['Records'][0]['s3']['bucket']['name']
KeyError: 'Records'
Any help to solve this would be appreciated.
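For reference, the handler above expects the S3 notification structure sketched below; invoking the function with a test event that has no Records list (for example the default "Hello World" test event) produces exactly this KeyError. The bucket and key values here are only placeholders:
# A minimal, hand-built test event containing the fields the handler reads.
test_event = {
    'Records': [
        {
            's3': {
                'bucket': {'name': 'source-bucket-name'},
                'object': {'key': 'example.zip'}
            }
        }
    ]
}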
I'm trying to use a Python Lambda function to append a new line of text to an object stored in S3. Since objects stored in S3 are immutable, you must first download the file into '/tmp/', then modify it, then upload the new version back to S3. My code appends the data, however it will not append it with a new line.
import boto3
import botocore

BUCKET_NAME = 'mybucket'
KEY = 'test.txt'

s3 = boto3.resource('s3')


def lambda_handler(event, context):
    try:
        s3.Object(BUCKET_NAME, KEY).download_file('/tmp/test.txt')
    except botocore.exceptions.ClientError as e:
        if e.response['Error']['Code'] == "404":
            print("The object does not exist.")
        else:
            raise

    with open('/tmp/test.txt', 'a') as fd:
        fd.write("this is a new string\n")

    s3.meta.client.upload_file('/tmp/test.txt', BUCKET_NAME, KEY)
The file is always appended with the new string but never with a new line. Any ideas?
UPDATE: This problem does not occur on linux machines or on a Mac. Lambda functions run on linux containers, which means the file in /tmp/ is saved as a Unix-formatted text file. Some Windows applications will not show line breaks on Unix-formatted text files, which was the case here. I'm dumb.
You don't need to download and re-upload a file in order to overwrite an object in S3; to overwrite an existing object you can just upload a file with the same key and it will be replaced automatically (reference). Look into the put_object function (S3 doc).
So your code will look like this:
import boto3
import botocore

BUCKET_NAME = 'mybucket'
KEY = 'test.txt'

# Use .client() instead of .resource()
s3 = boto3.client('s3')


def lambda_handler(event, context):
    try:
        # (Optional) Read the object
        obj = s3.get_object(Bucket=BUCKET_NAME, Key=KEY)
        file_content = obj['Body'].read().decode('utf-8')

        # (Optional) Update the file content
        new_file_content = file_content + "this is a new string\n"

        # Write to the object
        s3.put_object(Bucket=BUCKET_NAME, Key=KEY, Body=str(new_file_content))
    except botocore.exceptions.ClientError as e:
        if e.response['Error']['Code'] == "404":
            print("The object does not exist.")
        else:
            raise
You need to specify the local file path
import boto3
import botocore
from botocore.exceptions import ClientError

BUCKET_NAME = 'mybucket'
KEY = 'test.txt'
LOCAL_FILE = '/tmp/test.txt'

s3 = boto3.resource('s3')


def lambda_handler(event, context):
    try:
        obj = s3.Bucket(BUCKET_NAME).download_file(LOCAL_FILE, KEY)
    except ClientError as e:
        if e.response['Error']['Code'] == "404":
            print("The object does not exist.")
        else:
            raise

    with open('/tmp/test.txt', 'a') as fd:
        fd.write("this is a new string\n")

    s3.meta.client.upload_file(LOCAL_FILE, BUCKET_NAME, KEY)
Boto3 doc reference: http://boto3.readthedocs.io/en/latest/reference/services/s3.html#S3.Bucket.download_file
Nice post!
Just an adjustment: you should swap the order of LOCAL_FILE and KEY in the parameters of the download_file method.
The correct syntax is:
obj = s3.Bucket(BUCKET_NAME).download_file(KEY, LOCAL_FILE)
It would also be a good idea to delete the local file when the object is not found in the bucket, because if we don't remove the local file (assuming it exists), we may end up appending the new line to an already existing local file.
With the help of this function:
def remove_local_file(filePath):
    import os
    # Check whether the file exists before trying to delete it
    if os.path.exists(filePath):
        os.remove(filePath)
    else:
        print("Can not delete the file as it doesn't exist")
the final code, starting at the 'try', could look like this:
try:
    obj = s3.Bucket(BUCKET_NAME).download_file(KEY, LOCAL_FILE)
except ClientError as e:
    if e.response['Error']['Code'] == "404":
        print("The object does not exist.")
        remove_local_file(LOCAL_FILE)
    else:
        raise

with open(LOCAL_FILE, 'a') as fd:
    fd.write("this is a new string\n")

s3.meta.client.upload_file(LOCAL_FILE, BUCKET_NAME, KEY)
Pretty basic, but I am not able to download files given an S3 path.
For example, I have this: s3://name1/name2/file_name.txt
import boto3

locations = ['s3://name1/name2/file_name.txt']

s3_client = boto3.client('s3')
bucket = 'name1'
prefix = 'name2'

for file in locations:
    s3_client.download_file(bucket, 'file_name.txt', 'my_local_folder')
I am getting this error: botocore.exceptions.ClientError: An error occurred (404) when calling the HeadObject operation: Not Found
The file does exist, since I can download it with the AWS CLI using the S3 path s3://name1/name2/file_name.txt.
You need a list of file key paths, then modify your code as shown in the documentation:
import os
import boto3
import botocore

files = ['name2/file_name.txt']
bucket = 'name1'

s3 = boto3.resource('s3')

for file in files:
    try:
        s3.Bucket(bucket).download_file(file, os.path.basename(file))
    except botocore.exceptions.ClientError as e:
        if e.response['Error']['Code'] == "404":
            print("The object does not exist.")
        else:
            raise
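If the starting point really is a full s3:// URI, one option is to split it into bucket and key first; a small sketch (split_s3_uri is a hypothetical helper, not part of boto3):
import os
import boto3

def split_s3_uri(uri):
    # 's3://name1/name2/file_name.txt' -> ('name1', 'name2/file_name.txt')
    bucket, _, key = uri[len('s3://'):].partition('/')
    return bucket, key

s3 = boto3.resource('s3')
for uri in ['s3://name1/name2/file_name.txt']:
    bucket, key = split_s3_uri(uri)
    s3.Bucket(bucket).download_file(key, os.path.basename(key))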
You may need to do this with some type of authentication. There are several methods, but creating a session is simple and fast:
from boto3.session import Session

bucket_name = 'your_bucket_name'
folder_prefix = 'your/path/to/download/files'
credentials = 'credentials.txt'

with open(credentials, 'r', encoding='utf-8') as f:
    line = f.readline().strip()
    access_key = line.split(':')[0]
    secret_key = line.split(':')[1]

session = Session(
    aws_access_key_id=access_key,
    aws_secret_access_key=secret_key
)
s3 = session.resource('s3')
bucket = s3.Bucket(bucket_name)

for s3_file in bucket.objects.filter(Prefix=folder_prefix):
    file_object = s3_file.key
    file_name = str(file_object.split('/')[-1])
    print('Downloading file {} ...'.format(file_object))
    bucket.download_file(file_object, '/tmp/{}'.format(file_name))
In the credentials.txt file you must add a single line that concatenates the access key ID and the secret key, for example:
~$ cat credentials.txt
AKIAIO5FODNN7EXAMPLE:ABCDEF+c2L7yXeGvUyrPgYsDnWRRC1AYEXAMPLE
Don't forget to protect this file well on your host: give read-only permissions to the user who runs this program. I hope it works for you; it works perfectly for me.
I'm trying to upload a .zip file to S3 using boto3 for Python, but the .zip file in my directory is not uploaded correctly.
The code downloads all emails of a given user, zips them in the same directory and uploads them to an S3 bucket.
The problem is that the file that gets uploaded is not the one I intend to upload. Instead, a file of only 18 kB appears.
Here's the code:
import sys
import imaplib
import getpass
import email
import shutil
import boto3
import os

username = input("Enter user's first name: ")
surname = input("Enter user's surname: ")
email_address = username + "." + surname + "#gmail.com"
password = getpass.getpass()
directory = username + surname + '/'


def download_emails(server):
    result, data = server.uid('search', None, "ALL")  # search all email and return their uids
    if result == 'OK':
        for num in data[0].split():
            result, data = server.uid('fetch', num, '(RFC822)')  # RFC is a standard for the format of ARPA Internet text messages
            if result == 'OK':
                email_message = email.message_from_bytes(data[0][1])  # raw email text including headers
                file_name = email_message['Subject']  # use dates and file names (can be changed)
                if not os.path.exists(directory):
                    os.makedirs(directory)  # create a dir for user's emails
                try:
                    email_file = open(directory + file_name + '.eml', 'wb')  # open a file for each email and insert the data.
                    email_file.write(data[0][1])
                    email_file.close()
                except:
                    pass


# function to zip all the emails
def archive(zipname, directory):
    return shutil.make_archive(zipname, 'zip', root_dir=directory, base_dir=None)


# function to upload zipped emails to AWS bucket
def upload_to_s3(file_name):
    s3 = boto3.resource('s3',
                        aws_access_key_id=accessKey,
                        aws_secret_access_key=secretKey,
                        aws_session_token=secretToken,
                        )
    s3.Bucket('user-backups').put_object(Key=username.title() + " " +
                                         surname.title() + "/" + file_name, Body=file_name)
    print("Uploaded")


def main():
    server = imaplib.IMAP4_SSL("imap.gmail.com", 993)  # connect to gmail's imap server
    server.login(email_address, password)  # enter creds
    result, data = server.select('"[Gmail]/All Mail"')  # get all emails (inbox, outbox etc)
    if result == 'OK':
        print("Downloading")
        download_emails(server)
        server.close()
    else:
        print("ERROR: Unable to open mailbox ", result)
    server.logout()
    archive(username + surname, directory)
    upload_to_s3(username + surname + ".zip")
    # os.remove(email_address + ".zip")
    # shutil.rmtree(email_address)
    print("Done")


if __name__ == "__main__":
    main()
You can check out this article for more information.
There are a number of ways to upload. Check out this boto3 document; I have listed the methods below (a short sketch follows the list):
The managed upload methods are exposed in both the client and resource interfaces of boto3:
S3.Client method to upload a file by name: S3.Client.upload_file()
S3.Client method to upload a readable file-like object: S3.Client.upload_fileobj()
S3.Bucket method to upload a file by name: S3.Bucket.upload_file()
S3.Bucket method to upload a readable file-like object: S3.Bucket.upload_fileobj()
S3.Object method to upload a file by name: S3.Object.upload_file()
S3.Object method to upload a readable file-like object: S3.Object.upload_fileobj()
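For example, a minimal sketch of the file-object variant on the client (the bucket name and key below are only placeholders):
import boto3

s3 = boto3.client('s3')

# Upload the archive as a readable binary file object.
with open('JohnSmith.zip', 'rb') as data:
    s3.upload_fileobj(data, 'user-backups', 'John Smith/JohnSmith.zip')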
I made it work using s3.client.upload_file.
upload_file(Filename, Bucket, Key, ExtraArgs=None, Callback=None, Config=None)
Upload a file to an S3 object.
import boto3

s3Resource = boto3.resource('s3')

try:
    s3Resource.meta.client.upload_file('/path/to/file', 'bucketName', 'keyName')
except Exception as err:
    print(err)
None of the above answers worked!
The following code worked for me:
import os
import boto3


def upload_file_zip(local_file_path):
    s3_client = boto3.client('s3')
    s3_path = os.path.join(os.path.basename(local_file_path))
    with open(local_file_path, mode='rb') as data:
        s3_client.upload_fileobj(data, BUCKET_NAME, s3_path)
Updated the code, since the s3_folder parameter is not required here.
The put_object function accepts Body, which is either a bytes object or a file object. You have currently passed just the plain filename (a string).
From the documentation:
Body (bytes or seekable file-like object) -- Object data.
So the fix is to pass a file object instead. Consult this to see how to do that.
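A minimal sketch of that fix, keeping the same bucket as the question (the key is simplified here):
import boto3

s3 = boto3.resource('s3')

def upload_to_s3(file_name):
    # Pass an open binary file object as Body instead of the filename string.
    with open(file_name, 'rb') as data:
        s3.Bucket('user-backups').put_object(Key=file_name, Body=data)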
Just use s3.client.upload_file.
upload_file(Filename, Bucket, Key, ExtraArgs=None, Callback=None, Config=None)
def upload_to_s3(file_name):
    s3 = boto3.client('s3')
    Key = username.title() + " " + surname.title() + "/" + file_name
    try:
        s3.meta.client.upload_file('/path/to/file', 'user-backups', Key)
    except Exception as e:
        print(e)
I managed to upload a .zip file by means of the following code:
import os
import boto3


def write_to_s3(filename, bucket, key):
    s3 = boto3.resource(service_name='s3',
                        aws_access_key_id=os.environ["AWS_USER1_ACCESS_KEY"],
                        aws_secret_access_key=os.environ["AWS_USER1_SECRET_ACCESS_KEY"])
    s3.meta.client.upload_file(filename, bucket, key)
Note: I had to use boto3.resource() instead of the boto3.client() approach answered above by mootmoot, as the latter threw an exception.
Exception thrown if boto3.client() is used:
AttributeError: 'ClientMeta' object has no attribute 'client'
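For what it's worth, the client interface also works if you drop the .meta.client indirection and call upload_file directly on the client; a minimal sketch:
import boto3

s3_client = boto3.client('s3')
# upload_file is a method of the client itself, so no .meta.client is needed.
s3_client.upload_file('/path/to/file.zip', 'bucketName', 'keyName')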