I work with a group of non-developers who upload objects to an S3-style bucket through radosgw. All uploaded objects need to be publicly available, but the users cannot set this programmatically. Is there a way to make the default permission of an object public-read so it does not have to be set manually every time? There has to be a way to do this with boto, but I've yet to find any examples. There are a few floating around that use the AWS GUI, but that is not an option for me. :(
I am creating a bucket like this:
#!/usr/bin/env python
import boto
import boto.s3.connection
access_key = "SAMPLE3N84XBEHSAMPLE"
secret_key = "SAMPLEc4F3kfvVqHjMAnsALY8BCQFwTkI3SAMPLE"
conn = boto.connect_s3(
    aws_access_key_id=access_key,
    aws_secret_access_key=secret_key,
    host='10.1.1.10',
    is_secure=False,
    calling_format=boto.s3.connection.OrdinaryCallingFormat(),
)
bucket = conn.create_bucket('public-bucket', policy='public-read')
I am setting the policy to public-read, which seems to allow people to browse the bucket as a directory, but the objects within the bucket do not inherit this permission.
>>> print bucket.get_acl()
<Policy: http://acs.amazonaws.com/groups/global/AllUsers = READ, S3 Newbie (owner) = FULL_CONTROL>
To clarify, I do know I can resolve this on a per-object basis like this:
key = bucket.new_key('thefile.tgz')
key.set_contents_from_filename('/home/s3newbie/thefile.tgz')
key.set_canned_acl('public-read')
But my end users are not capable of doing this, so I need a way to make this the default permission of an uploaded file.
I found a solution to my problem.
First, many thanks to joshbean who posted this: https://github.com/awsdocs/aws-doc-sdk-examples/blob/master/python/example_code/s3/s3-python-example-put-bucket-policy.py
I noticed he was using the boto3 library, so I started using it for my connection.
import boto3
import json
access_key = "SAMPLE3N84XBEHSAMPLE"
secret_key = "SAMPLEc4F3kfvVqHjMAnsALY8BCQFwTkI3SAMPLE"
conn = boto3.client('s3', 'us-east-1',
                    endpoint_url="http://mycephinstance.net",
                    aws_access_key_id=access_key,
                    aws_secret_access_key=secret_key)
bucket = "public-bucket"
bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AddPerm",
            "Effect": "Allow",
            "Principal": "*",
            "Action": ["s3:GetObject"],
            "Resource": ["arn:aws:s3:::{0}/*".format(bucket)]
        }
    ]
}
bucket_policy = json.dumps(bucket_policy)
conn.put_bucket_policy(Bucket=bucket, Policy=bucket_policy)
Now when an object is uploaded in public-bucket, it can be anonymously downloaded without explicitly setting the key permission to public-read or generating a download URL.
If you're doing this, be REALLY REALLY certain that it's ok for ANYONE to download this stuff. Especially if your radosgw service is publicly accessible on the internet.
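The policy above can be wrapped in a small helper so the bucket name is not hard-coded in the dictionary. This is only a sketch and not part of the original answer: the function name build_public_read_policy is hypothetical, and it just builds the JSON string that put_bucket_policy expects.

```python
import json


def build_public_read_policy(bucket_name):
    """Return a JSON bucket policy that lets anyone download objects in bucket_name.

    Hypothetical helper for illustration; it mirrors the policy shown above.
    """
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AddPerm",
                "Effect": "Allow",
                "Principal": "*",
                # Only object downloads are granted by this statement.
                "Action": ["s3:GetObject"],
                "Resource": ["arn:aws:s3:::{0}/*".format(bucket_name)],
            }
        ],
    }
    return json.dumps(policy)
```

It would then be used as conn.put_bucket_policy(Bucket="public-bucket", Policy=build_public_read_policy("public-bucket")).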
I log in to the AWS console with MFA, using Google Authenticator.
I have an S3 DEV bucket. To access it, I have to switch roles, and only after switching can I access the bucket.
I need help achieving the same thing in Python with boto3.
There are many CSV files that I need to open as DataFrames, and I cannot proceed until this access problem is resolved.
I tried configuring the AWS credentials and config files and using them in my Python code, but that didn't help.
The AWS documentation is not clear about how to switch roles from Python.
import boto3
import s3fs
import pandas as pd
import boto.s3.connection
access_key = 'XXXXXXXXXXX'
secret_key = 'XXXXXXXXXXXXXXXXX'
# bucketName = 'XXXXXXXXXXXXXXXXX'
s3 = boto3.resource('s3')
for bucket in s3.buckets.all():
    print(bucket.name)
The expected result is to access that bucket from Python code after switching roles with MFA.
In general, it is bad for security to put credentials in your program code. It is better to store them in a configuration file. You can do this with the AWS Command-Line Interface (CLI) aws configure command.
Once the credentials are stored this way, any AWS SDK (eg boto3) will automatically retrieve the credentials without having to reference them in code.
See: Configuring the AWS CLI - AWS Command Line Interface
There is an additional capability with the configuration file, that allows you to store a role that you wish to assume. This can be done by specifying a profile with the Role ARN:
# In ~/.aws/credentials:
[development]
aws_access_key_id=foo
aws_secret_access_key=bar
# In ~/.aws/config
[profile crossaccount]
role_arn=arn:aws:iam:...
source_profile=development
The source_profile points to the profile that contains credentials that will be used to make the AssumeRole() call, and role_arn specifies the Role to assume.
See: Assume Role Provider
Finally, you can tell boto3 to use that particular profile for credentials:
import boto3

session = boto3.Session(profile_name='crossaccount')
# Any clients created from this session will use the temporary credentials
# obtained by assuming the role in [profile crossaccount] of ~/.aws/config.
dev_s3_client = session.client('s3')
An alternative to all the above (which boto3 does for you) is to call assume_role() in your code, then use the temporary credentials that are returned to define a new session that you can use to connect to a service. However, the above method using profiles is a lot easier.
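For the assume_role() route, a minimal sketch might look like the following. It assumes the role requires MFA, as in the question; the function names and the default session name are hypothetical and are not values from the original post.

```python
def session_kwargs_from_assume_role(response):
    """Map an STS AssumeRole response to boto3.Session keyword arguments."""
    creds = response["Credentials"]
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }


def assume_role_with_mfa(role_arn, mfa_serial, token_code, session_name="dev-session"):
    """Assume role_arn with an MFA code and return a boto3 Session for that role."""
    # Imported here so the pure helper above can be used without boto3 installed.
    import boto3

    sts = boto3.client("sts")  # uses your default long-term credentials
    response = sts.assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,
        SerialNumber=mfa_serial,  # ARN of the MFA device registered for your user
        TokenCode=token_code,     # current code from the authenticator app
    )
    return boto3.Session(**session_kwargs_from_assume_role(response))
```

Calling assume_role_with_mfa(role_arn, mfa_serial, input("MFA code: ")) with your own role ARN and MFA device ARN returns a session. session.client('s3') then works with the role's temporary credentials until they expire.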
With boto, I used to specify my credentials when connecting to S3 like this:
import boto
from boto.s3.connection import Key, S3Connection
S3 = S3Connection( settings.AWS_SERVER_PUBLIC_KEY, settings.AWS_SERVER_SECRET_KEY )
I could then use S3 to perform my operations (in my case deleting an object from a bucket).
With boto3, all the examples I found look like this:
import boto3
S3 = boto3.resource( 's3' )
S3.Object( bucket_name, key_name ).delete()
I couldn't specify my credentials, and thus all attempts fail with an InvalidAccessKeyId error.
How can I specify credentials with boto3?
You can create a session:
import boto3
session = boto3.Session(
    aws_access_key_id=settings.AWS_SERVER_PUBLIC_KEY,
    aws_secret_access_key=settings.AWS_SERVER_SECRET_KEY,
)
Then use that session to get an S3 resource:
s3 = session.resource('s3')
You can also pass the credentials directly when creating a client:
s3_client = boto3.client(
    's3',
    aws_access_key_id=settings.AWS_SERVER_PUBLIC_KEY,
    aws_secret_access_key=settings.AWS_SERVER_SECRET_KEY,
    region_name=REGION_NAME,
)
This question is older, but I'm adding this here for my own reference too. boto3.resource simply uses the default Session, so you can pass session details straight to boto3.resource.
Help on function resource in module boto3:

resource(*args, **kwargs)
    Create a resource service client by name using the default session.

    See :py:meth:`boto3.session.Session.resource`.
https://github.com/boto/boto3/blob/86392b5ca26da57ce6a776365a52d3cab8487d60/boto3/session.py#L265
You can see that it just takes the same arguments as boto3.session.Session.resource:
import boto3
S3 = boto3.resource('s3', region_name='us-west-2', aws_access_key_id=settings.AWS_SERVER_PUBLIC_KEY, aws_secret_access_key=settings.AWS_SERVER_SECRET_KEY)
S3.Object( bucket_name, key_name ).delete()
I'd like to expand on @JustAGuy's answer. The method I prefer is to use the AWS CLI to create a config file. The reason is that, with the config file, the CLI or the SDK will automatically look for credentials in the ~/.aws folder. And the good thing is that the AWS CLI is written in Python.
You can get the CLI from PyPI if you don't have it already. Here are the steps to set it up from a terminal:
$> pip install awscli  # can add the --user flag
$> aws configure
AWS Access Key ID [****************ABCD]:[enter your key here]
AWS Secret Access Key [****************xyz]:[enter your secret key here]
Default region name [us-west-2]:[enter your region here]
Default output format [None]:
After this you can use boto and any of the APIs without having to specify keys (unless you want to use different credentials).
If you rely on your .aws/credentials to store id and key for a user, it will be picked up automatically.
For instance
session = boto3.Session(profile_name='dev')
s3 = session.resource('s3')
This will pick up the dev profile (user) if your credentials file contains the following:
[dev]
aws_access_key_id = AAABBBCCCDDDEEEFFFGG
aws_secret_access_key = FooFooFoo
region=ap-southeast-2
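If a profile is not picked up, it can help to check which profile names the file actually defines. A minimal sketch using only the standard library; the helper name credential_profiles is hypothetical, and it assumes the file is plain INI as in the example above. boto3 also exposes the names it found through Session().available_profiles.

```python
import configparser


def credential_profiles(text):
    """Return the profile names defined in the text of a credentials file."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # Each [section] header in the file is one profile name.
    return parser.sections()


sample = """
[dev]
aws_access_key_id = AAABBBCCCDDDEEEFFFGG
aws_secret_access_key = FooFooFoo
"""
print(credential_profiles(sample))  # ['dev']
```

To inspect the real file, pass in the contents of ~/.aws/credentials instead of the sample string.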
There are numerous ways to store credentials while still using boto3.resource().
I'm using the AWS CLI method myself. It works perfectly.
https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html
You can also set the standard AWS environment variables for the access key and secret key (AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY). That way you don't need to change the default client creation code, though it is better to pass the credentials as parameters if you use non-default ones.
I am trying to upload an image to S3 through Python. My code looks like this:
import os
from PIL import Image
import boto
from boto.s3.key import Key
def upload_to_s3(aws_access_key_id, aws_secret_access_key, file, bucket, key, callback=None, md5=None, reduced_redundancy=False, content_type=None):
    conn = boto.connect_s3(aws_access_key_id, aws_secret_access_key)
    bucket = conn.get_bucket(bucket, validate=False)
    k = Key(bucket)
    k.key = key
    k.set_contents_from_file(file)
AWS_ACCESS_KEY = "...."
AWS_ACCESS_SECRET_KEY = "....."
filename = "images/image_0.jpg"
file = Image.open(filename)
key = "image"
bucket = 'images'
upload_to_s3(AWS_ACCESS_KEY, AWS_ACCESS_SECRET_KEY, file, bucket, key)
I am getting this error message:
S3ResponseError: S3ResponseError: 400 Bad Request
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>InvalidRequest</Code><Message> The authorization mechanism you have provided is not supported. Please use AWS4-HMAC-SHA256.</Message>
<RequestId>90593132BA5E6D6C</RequestId>
<HostId>...</HostId></Error>
This code is based on the tutorial from this website: http://stackabuse.com/example-upload-a-file-to-aws-s3/
I have tried k.set_contents_from_file as well as k.set_contents_from_filename, but both don't seem to work for me.
The error says something about using AWS4-HMAC-SHA256, but I am not sure how to do that. Is there another way to solve this problem besides using AWS4-HMAC-SHA256? If anyone can help me out, I would really appreciate it.
Thank you!
Just use:
import boto3
client = boto3.client('s3', region_name='us-west-2')
client.upload_file('images/image_0.jpg', 'mybucket', 'image_0.jpg')
Try to avoid putting your credentials in the code. Instead:
If you are running the code from an Amazon EC2 instance, simply assign an IAM Role to the instance with appropriate permissions. The credentials will automatically be used.
If you are running the code on your own computer, use the AWS Command-Line Interface (CLI) aws configure command to store your credentials in a file, which will be automatically used by your code.
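Note that the question's code passes a PIL Image object to the upload call, while upload_file expects a filename. If the image only exists in memory, boto3's upload_fileobj accepts a binary file-like object instead. A minimal sketch, assuming the image bytes are already encoded (for example as JPEG); the helper name upload_bytes is hypothetical, and client is expected to be a boto3 S3 client:

```python
import io
import mimetypes


def upload_bytes(client, data, bucket, key):
    """Upload raw bytes to bucket/key and return the content type that was set.

    client is expected to be a boto3 S3 client, e.g. boto3.client('s3').
    """
    # Guess the content type from the key's extension so browsers render the object.
    content_type = mimetypes.guess_type(key)[0] or "application/octet-stream"
    client.upload_fileobj(
        Fileobj=io.BytesIO(data),
        Bucket=bucket,
        Key=key,
        ExtraArgs={"ContentType": content_type},
    )
    return content_type
```

For example, read 'images/image_0.jpg' in binary mode and call upload_bytes(boto3.client('s3', region_name='us-west-2'), data, 'mybucket', 'image_0.jpg').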
I have a django web app and I want to allow it to download files from my s3 bucket.
The files are not public. I have an IAM policy to access them.
The problem is that I do NOT want to download the file on the django app server and then serve it to download on the client. That is like downloading twice. I want to be able to download directly on the client of the django app.
Also, I don't think it's safe to pass my IAM credentials in an http request so I think I need to use a temporary token.
I read:
http://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp_use-resources.html
but I just do not understand how to generate a temporary token on the fly.
A python solution (maybe using boto) would be appreciated.
With Boto (2), it should be really easy to generate time-limited download URLs, provided your IAM policy has the proper permissions. I am using this approach to serve videos to logged-in users from a private S3 bucket.
from boto.s3.connection import S3Connection
conn = S3Connection('<aws access key>', '<aws secret key>')
bucket = conn.get_bucket('mybucket')
key = bucket.get_key('mykey', validate=False)
url = key.generate_url(86400)
This would generate a download URL for the key mykey in the given bucket that is valid for 24 hours (86400 seconds). Without validate=False, Boto 2 first sends a request to check that the key actually exists in the bucket, and get_key returns None if it does not. With these server-controlled files that is often an unnecessary extra step, thus validate=False in the example.
In Boto3 the API is quite different:
import boto3

s3 = boto3.client('s3')
# Generate a URL to get 'mykey' from 'mybucket', valid for 24 hours
url = s3.generate_presigned_url(
    ClientMethod='get_object',
    Params={
        'Bucket': 'mybucket',
        'Key': 'mykey'
    },
    ExpiresIn=86400
)
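For the Django case in the question, the view only needs to produce such a URL and redirect the browser to it (for example with django.http.HttpResponseRedirect), so the file travels directly from S3 to the client. A minimal sketch of the URL part; the helper name presigned_download_url is hypothetical, client is expected to be a boto3 S3 client, and the seven-day cap reflects the limit for Signature Version 4 presigned URLs as far as I know:

```python
MAX_EXPIRY_SECONDS = 7 * 24 * 3600  # assumed upper bound for SigV4 presigned URLs


def presigned_download_url(client, bucket, key, expires_in=3600):
    """Return a time-limited GET URL for bucket/key using a boto3 S3 client."""
    if not 0 < expires_in <= MAX_EXPIRY_SECONDS:
        raise ValueError(
            "expires_in must be between 1 and %d seconds" % MAX_EXPIRY_SECONDS
        )
    # The URL is signed locally with the client's credentials.
    return client.generate_presigned_url(
        ClientMethod="get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )
```

Keeping the lifetime short (minutes rather than a day) limits how long a leaked link stays usable.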
I store images on my local server and then upload them to S3.
Now I want to change this so that the images are stored directly in Amazon S3.
But there is an error:
boto.exception.S3ResponseError: S3ResponseError: 403 Forbidden
Here is my settings.py:
AWS_ACCESS_KEY_ID = "XXXX"
AWS_SECRET_ACCESS_KEY = "XXXX"
IMAGES_STORE = 's3://how.are.you/'
Do I need to add something?
My Scrapy version: Scrapy==0.22.2
Please guide me, thank you!
AWS_ACCESS_KEY_ID = "xxxxxx"
AWS_SECRET_ACCESS_KEY = "xxxxxx"
IMAGES_STORE = "s3://bucketname/virtual_path/"
how.are.you should be an S3 bucket that exists in your S3 account; it will store the images you upload. If you want to store images under a virtual_path, you need to create that folder in your S3 bucket.
I found that the cause of the problem is the upload policy. Scrapy calls Key.set_contents_from_string() with the policy argument set to S3FileStore.POLICY. To work around this, modify the code in scrapy/contrib/pipeline/files.py and change
return threads.deferToThread(k.set_contents_from_string, buf.getvalue(),
                             headers=h, policy=self.POLICY)
to
return threads.deferToThread(k.set_contents_from_string, buf.getvalue(),
                             headers=h)
Maybe you can try it, and share the result here.
I think the problem is not in your code; it lies in permissions. Check your credentials first and make sure you have permission to access and write to the S3 bucket.
import boto
s3 = boto.connect_s3('access_key', 'secret_key')
bucket = s3.lookup('bucket_name')
key = bucket.new_key('testkey')
key.set_contents_from_string('This is a test')
key.delete()
If this test runs successfully, your credentials can write to the bucket. If it fails, look into your permissions; for setting permissions, you can look at the Amazon configuration.
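The same write test can be sketched for boto3. This is an illustration rather than part of the original answer: the helper name can_write and the default test key are hypothetical, and client is expected to be a boto3 S3 client created with the credentials you want to check.

```python
def can_write(client, bucket, key="permission-test-key"):
    """Try to write and delete a small object; return (ok, error_message)."""
    try:
        client.put_object(Bucket=bucket, Key=key, Body=b"This is a test")
        client.delete_object(Bucket=bucket, Key=key)
    except Exception as exc:
        # boto3 reports a 403 as botocore.exceptions.ClientError; the broad
        # except keeps this sketch free of a hard botocore import.
        return False, str(exc)
    return True, ""
```

A result of (False, ...) mentioning AccessDenied or 403 points at the bucket or IAM permissions rather than at the Scrapy settings.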