cross-account file upload in S3 bucket using boto3 and python - python

I have an S3 bucket with a given access_key and secret_access_key. I use the following code to upload files into my S3 bucket successfully.
import boto3
import os
client = boto3.client('s3',
aws_access_key_id = access_key,
aws_secret_access_key = secret_access_key)
upload_file_bucket = 'my-bucket'
upload_file_key = 'my_folder/' + str(my_file)
client.upload_file(file, upload_file_bucket, upload_file_key)
Now, I want to upload my_file into another bucket that is owned by a new team. Therefore, I do not have access to access_key and secret_access_key. What is the best practice to do cross-account file upload using boto3 and Python?

You can actually use the same code, but the owner of the other AWS Account would need to add a Bucket Policy to the destination bucket that permits access from your IAM User. It would look something like this:
{
"Version": "2012-10-17",
"Statement": [
{
"Action": [
"s3:PutObject",
"s3:PutObjectAcl"
],
"Effect": "Allow",
"Resource": "arn:aws:s3:::their-bucket/*",
"Principal": {
"AWS": [
"arn:aws:iam::YOUR-ACCOUNT-ID:user/username"
]
}
}
]
}
When uploading objects to a bucket owned by another AWS Account I recommend adding ACL= bucket-owner-full-control , like this:
client.upload_file(file, upload_file_bucket, upload_file_key, ExtraArgs={'ACL':'bucket-owner-full-control'})
This grants ownership of the object to the bucket owner, rather than the account that did the upload.

Related

AWS CLI Can List S3 Bucket But Access Denied For Python Lambda

I've used terraform to setup infra for an s3 bucket and my containerised lambda. I want to trigger the lambda to list the items in my s3 bucket. When I run the aws cli it's fine:
aws s3 ls
returns
2022-11-08 23:04:19 bucket-name
This is my lambda:
import logging
import boto3
LOGGER = logging.getLogger(__name__)
LOGGER.setLevel(logging.DEBUG)
s3 = boto3.resource('s3')
def lambda_handler(event, context):
LOGGER.info('Executing function...')
bucket = s3.Bucket('bucket-name')
total_objects = 0
for i in bucket.objects.all():
total_objects = total_objects + 1
return {'total_objects': total_objects}
When I run the test in the AWS console, I'm getting this:
[ERROR] ClientError: An error occurred (AccessDenied) when calling the ListObjects operation: Access Denied
No idea why this is happening. These are my terraform lambda policies, roles and the s3 setup:
resource "aws_s3_bucket" "statements_bucket" {
bucket = "bucket-name"
acl = "private"
}
resource "aws_s3_object" "object" {
bucket = aws_s3_bucket.statements_bucket.id
key = "excel/"
}
resource "aws_iam_role" "lambda" {
name = "${local.prefix}-lambda-role"
path = "/"
assume_role_policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Action": "sts:AssumeRole",
"Principal": {
"Service": "lambda.amazonaws.com"
},
"Effect": "Allow"
}
]
}
EOF
}
resource "aws_iam_policy" "lambda" {
name = "${local.prefix}-lambda-policy"
description = "S3 specified access"
policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:ListBucket"
],
"Resource": [
"arn:aws:s3:::bucket-name"
]
},
{
"Effect": "Allow",
"Action": [
"s3:PutObject",
"s3:GetObject",
"s3:DeleteObject"
],
"Resource": [
"arn:aws:s3:::bucket-name/*"
]
}
]
}
EOF
}
As far as I can tell, your Lambda function has the correct IAM role (the one indicated in your Terraform template) but that IAM role has no attached policies.
You need to attach the S3 policy, and any other IAM policies needed, to the IAM role. For example:
resource "aws_iam_role_policy_attachment" "lambda-attach" {
role = aws_iam_role.role.name
policy_arn = aws_iam_policy.policy.arn
}
In order to run aws s3 ls, you would need to authorize the action s3:ListAllMyBuckets. This is because aws s3 ls lists all of your buckets.
You should be able to list the contents of your bucket using aws s3 ls s3://bucket-name. However, you're probably going to have to add "arn:aws:s3:::bucket-name/*" to the resource list for your role as well. Edit: nevermind!

Downloading Files on a Public S3 Bucket Without Authentication Using Python

Im trying to download a file from an S3 bucket that is public and requires no authentication (meaning no need to hardcode Access and Secret keys to access nor store it in AWS CLI), yet I still cannot access it via boto3.
Python code
import boto3
import botocore
from botocore import UNSIGNED
from botocore.config import Config
BUCKET_NAME = 'converted-parquet-bucket'
PATH = 'json-to-parquet/names.snappy.parquet'
s3 = boto3.client('s3', config=Config(signature_version=UNSIGNED))
try:
s3.Bucket(BUCKET_NAME).download_file(PATH, 'names.snappy.parquet')
except botocore.exceptions.ClientError as e:
if e.response['Error']['Code'] == "404":
print("The object does not exist.")
else:
raise
I get this error code when I execute the code
AttributeError: 'S3' object has no attribute 'Bucket'
If it helps, here is my bucket public policy
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "PublicReadGetObject",
"Effect": "Allow",
"Principal": "*",
"Action": "s3:GetObject",
"Resource": "arn:aws:s3:::converted-parquet-bucket/*"
}
]
}
If your suggestion is to store keys, please dont, that is not what I'm trying to do.
try resource s3 = boto3.resource('s3') instead of s3 = boto3.client('s3')

Copy from S3 bucket in one account to S3 bucket in another account using Boto3 in AWS Lambda

I have created a S3 bucket and created a file under my aws account. My account has trust relationship established with another account and I am able to put objects into the bucket in another account using Boto3. How can I copy objects from bucket in my account to bucket in another account using Boto3?
I see "access denied" when I use the code below -
source_session = boto3.Session(region_name = 'us-east-1')
source_conn = source_session.resource('s3')
src_conn = source_session.client('s3')
dest_session = __aws_session(role_arn=assumed_role_arn, session_name='dest_session')
dest_conn = dest_session.client ( 's3' )
copy_source = { 'Bucket': bucket_name , 'Key': key_value }
dest_conn.copy ( copy_source, dest_bucket_name , dest_key,ExtraArgs={'ServerSideEncryption':'AES256'}, SourceClient = src_conn )
In my case , src_conn has access to source bucket and dest_conn has access to destination bucket.
I believe the only way to achieve this by downloading and uploading the files.
AWS Session
client = boto3.client('sts')
response = client.assume_role(RoleArn=role_arn, RoleSessionName=session_name)
session = boto3.Session(
aws_access_key_id=response['Credentials']['AccessKeyId'],
aws_secret_access_key=response['Credentials']['SecretAccessKey'],
aws_session_token=response['Credentials']['SessionToken'])
Another approach is to attach a policy to the destination bucket permitting access from the account hosting the source bucket. eg. something like the following should work (although you may want to tighten up the permissions as appropriate):
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::<source account ID>:root"
},
"Action": "s3:*",
"Resource": [
"arn:aws:s3:::dst_bucket",
"arn:aws:s3:::dst_bucket/*"
]
}
]
}
Then your Lambda hosted in your source AWS account should have no problems writing to the bucket(s) in the destination AWS account.

Boto3: aws credentials with limited permissions

I was provisioned some AWS keys. These keys give me access to certain directories in a s3 bucket. I want to use boto3 to interact with the directories that were exposed to me, however it seems that I can't actually do anything with the bucket at all, since I don't have access to the entire bucket.
This works for me from my terminal:
aws s3 ls s3://the_bucket/and/this/specific/path/
but if I do:
aws s3 ls s3://the_bucket/
I get:
An error occurred (AccessDenied) when calling the ListObjects
operation: Access Denied
which also happens when I try to access the directory via boto3.
session = boto3.Session(profile_name=my_creds)
client=session.client('s3')
list_of_objects = client.list_objects(Bucket='the_bucket', Prefix='and/this/specific/path', Delimiter='/')
Do I need to request access to the entire bucket for boto3 to be usable?
You need to set this Bucket Policy:
{
"Sid": "<SID>",
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::<account>:user/<user_name>"
},
"Action": [
"s3:GetBucketLocation",
"s3:ListBucket"
],
"Resource": "arn:aws:s3:::<bucket_name>"
}
For more information about Specifying Permissions in a Policy

AmazonS3 - connecting with Python Boto according specific permissions

I am trying to connect Amazon S3 via Boto 2.38.0 and python 3.4.3.
The S3 account is owned by another company and they grants just these permissions :
"Statement":
[
{
"Effect": "Allow",
"Action": "s3:ListBucket",
"Resource": "arn:axs:s3:::GA-Exports",
"Condition":{
"StringLike":
{
"s3.prefix": "Events_3112/*"
}
}
},{
"Effect": "Allow",
"Action":
[
"s3:GetObject",
"s3.GetObjectAcl",
"s3.GetBucketAcl"
],
"Resource": "arn:axs:s3:::GA-Exports/Events_3112/*",
"Condition": {}
}
]
I can connect and retrieve a specific file if I set the name. But I need to retrieve all data from S3 (for example to determine -through a script- which files I have not yet downloaded).
from boto.s3.connection import S3Connection
from boto.s3.connection import OrdinaryCallingFormat
s3_connection = S3Connection(access_key, secret_key,calling_format=OrdinaryCallingFormat())
bucket = s3_connection.get_bucket(__bucket_name, validate=False)
key = bucket.get_key(file_name)
works, but
all_buckets = s3_connection.get_all_buckets()
raise an error
S3ResponseError: S3ResponseError: 403 Forbidden
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>AccessDenied</Code><Message>Access Denied</Message><RequestId>19D20ADCFFC899ED</RequestId><HostId>eI4CzQqAvOnjcXJNZyyk+drFHjO9+yj0EtP+vJ5f/D7D4Dh2HFL3UvCacy9nP/wT</HostId></Error>
With the software S3 Browser, I can right click > "export file list", and get what I need. But how can I do this in python ?
EDIT :
Finally found the answer :
bucket_name = 'GA-Exports'
s3_connection = S3Connection(access_key, secret_key, calling_format=OrdinaryCallingFormat())
bucket = s3_connection.get_bucket(bucket_name, validate=False)
for key in bucket.list(prefix='Events_3112/DEV/'):
print(key.name, key.size, key.last_modified)
Thanks for your help! :)
You won't be allowed to get all buckets, permissions says that you are allowed to list bucket contents only for "GA-Exports":
from boto.s3.connection import S3Connection
from boto.s3.connection import OrdinaryCallingFormat
# this is to avoid a 301 mover permanently when used OrdinaryCallingFormat
if '.' in __bucket_name:
conn = S3Connection(access_key, secret_key, calling_format=OrdinaryCallingFormat())
else:
conn = S3Connection(access_key, secret_key)
bucket = conn.get_bucket(__bucket_name, validate=False)
l = bucket.list(prefix='Events_3112/') # now l is a list of objects within the bucket
# other option is to use bucket.get_all_keys()
for key in l:
print l # or whatever you want to do with each file name
# Recall this is only the filename not the file perse :-D
see complete bucket object reference in http://boto.readthedocs.org/en/latest/ref/s3.html#module-boto.s3.bucket
Edit: added a fix when a 301 moved permanently error is received when accessing S3 via ordinarycallingformat. Added #garnaat comment on prefix aswell (thx!)

Categories