I'm trying to enable Transfer Acceleration for some AWS S3 buckets.
I start up my client session:
import boto3
from os import environ

client = boto3.client(
    "s3",
    aws_access_key_id=environ.get("AWS_ACCESS_KEY_ID"),
    aws_secret_access_key=environ.get("AWS_SECRET_ACCESS_KEY")
)
Then I turn Transfer Acceleration on through the S3 console, and I have also made sure it is enabled from the code as follows:
response = client.put_bucket_accelerate_configuration(
    Bucket='string',
    AccelerateConfiguration={
        'Status': 'Enabled'
    }
)
and
response = client.get_bucket_accelerate_configuration(
    Bucket='string'
)
both snippets come straight from boto3 docs. I am able to upload to the bucket successfully later on in the code with:
client.upload_fileobj(data, environ.get("AWS_S3_BUCKET"), 'key')
I tried setting the endpoint_url param while starting the client session, but this just created a new folder (with my bucket title) inside my bucket.
It seems that boto3 is the only SDK that doesn't have some sort of "use transfer acceleration endpoint" flag. I know it is enabled on the bucket, and I have proof of that, but I have no proof that it is actually using the endpoint.
I've tried going through client metadata, bucket metadata, and every other client method that returns any sort of data, and I can't find proof that it actually used the acceleration endpoint.
Am I missing something?
The question "Connect to S3 accelerate endpoint with boto3" mentions using:
Config(s3={"use_accelerate_endpoint": True})
This parameter is listed in Config Reference — botocore documentation:
s3 (dict)
use_accelerate_endpoint -- Refers to whether to use the S3 Accelerate endpoint. The value must be a boolean. If True, the client will use the S3 Accelerate endpoint. If the S3 Accelerate endpoint is being used then the addressing style will always be virtual.
So try using:
from botocore.config import Config

s3_client = boto3.client("s3", config=Config(s3={"use_accelerate_endpoint": True}))
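If you also want runtime proof that the accelerate endpoint is being used, the resolved endpoint is exposed on the client; a minimal sketch, assuming acceleration is already enabled on the bucket:

import boto3
from botocore.config import Config

# Client configured to use the S3 Transfer Acceleration endpoint.
accelerated = boto3.client(
    "s3",
    config=Config(s3={"use_accelerate_endpoint": True})
)

# Should print the accelerate endpoint (s3-accelerate.amazonaws.com)
# rather than a regional S3 endpoint.
print(accelerated.meta.endpoint_url)

Any upload_fileobj() call made through this client will then go through the accelerate endpoint.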
I have seen examples for checking whether an S3 bucket exists and have implemented them below. My bucket is located in the us-east-1 region, but the following code doesn't throw an exception. Is there a way to make the check region-specific depending on my session?
import boto3
from botocore.exceptions import ClientError

session = boto3.Session(
    profile_name='TEST',
    region_name='ap-south-1'
)
s3 = session.resource('s3')

bucket_name = 'TEST_BUCKET'
try:
    s3.meta.client.head_bucket(Bucket=bucket_name)
except ClientError as c:
    print(c)
It does not matter which S3 regional endpoint you send the request to. The underlying SDK (boto3) will redirect as needed. It's preferable, however, to target the correct region if you know it in advance, to save on redirects.
You can see this in detail if you use the awscli in debug mode:
aws s3api head-bucket --bucket mybucket --region ap-south-1 --debug
You will see debug output similar to this:
DEBUG - S3 client configured for region ap-south-1 but the bucket mybucket is in region us-east-1; Please configure the proper region to avoid multiple unnecessary redirects and signing attempts.
DEBUG - Switching signature version for service s3 to version s3v4 based on config file override.
DEBUG - Updating URI from https://s3.ap-south-1.amazonaws.com/mybucket to https://s3.us-east-1.amazonaws.com/mybucket
Note that the awscli uses the boto3 SDK, as does your Python script.
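If you do want to pin the client to the bucket's actual region up front (a sketch, not part of the original answer), you can look the region up first with get_bucket_location; note that S3 reports buckets in us-east-1 with an empty LocationConstraint:

import boto3

def bucket_region(bucket_name):
    # GetBucketLocation returns LocationConstraint=None for us-east-1 buckets.
    response = boto3.client('s3').get_bucket_location(Bucket=bucket_name)
    return response.get('LocationConstraint') or 'us-east-1'

region = bucket_region('mybucket')            # e.g. 'us-east-1'
s3 = boto3.client('s3', region_name=region)   # no redirect needed now
s3.head_bucket(Bucket='mybucket')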
I am posting this here because I found it really hard to find a function to get all objects from our S3 bucket using Python. When I searched for a get_object_data function, I was directed to the function for downloading an object.
So, how do we get the data of all the objects in our AWS S3 bucket using boto3 (the AWS SDK for Python)?
Import boto3 into your Python script.
Make a connection to your AWS account and specify the resource (the S3 bucket here) you want to access
(make sure that the IAM credentials you are giving have access to that resource).
Get the data required.
The code looks something like this:
import boto3

s3_resource = boto3.resource(service_name='s3',
                             region_name='<your bucket region>',
                             aws_access_key_id='<your access key id>',
                             aws_secret_access_key='<your secret access key>')

bucket = s3_resource.Bucket('<your bucket name>')
for obj in bucket.objects.all():
    # object URL
    print("https://<your bucket name>.s3.<your bucket region>.amazonaws.com/" + obj.key)
    # if you want to print all the data of the object, just print obj
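If by "data" you mean the object's contents rather than its URL, here is a small sketch (not from the original answer; the bucket name is a placeholder) that reads each object's bytes through the same resource API:

import boto3

s3_resource = boto3.resource('s3')
bucket = s3_resource.Bucket('<your bucket name>')

for obj in bucket.objects.all():
    # ObjectSummary.get() issues a GetObject call; Body is a streaming body.
    body = obj.get()['Body'].read()
    print(obj.key, len(body), "bytes")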
My login to the AWS console uses MFA, and for that I am using Google Authenticator.
I have an S3 DEV bucket, and to access it I have to switch roles; only after switching can I access the DEV bucket.
I need help achieving the same in Python with boto3.
There are many CSV files that I need to open in a dataframe, and without resolving that access I cannot proceed.
I tried configuring the AWS credentials and config files and using them in my Python code, but that didn't help.
The AWS documentation is not clear about how to switch roles while working in Python.
import boto3
import s3fs
import pandas as pd
import boto.s3.connection
access_key = 'XXXXXXXXXXX'
secret_key = 'XXXXXXXXXXXXXXXXX'
# bucketName = 'XXXXXXXXXXXXXXXXX'
s3 = boto3.resource('s3')
for bucket in s3.buckets.all():
    print(bucket.name)
The expected result is to be able to access that bucket after switching roles in the Python code, along with MFA.
In general, it is bad for security to put credentials in your program code. It is better to store them in a configuration file. You can do this by using the AWS Command-Line Interface (CLI) aws configure command.
Once the credentials are stored this way, any AWS SDK (eg boto3) will automatically retrieve the credentials without having to reference them in code.
See: Configuring the AWS CLI - AWS Command Line Interface
There is an additional capability with the configuration file that allows you to store a role that you wish to assume. This can be done by specifying a profile with the Role ARN:
# In ~/.aws/credentials:
[development]
aws_access_key_id=foo
aws_secret_access_key=bar
# In ~/.aws/config
[profile crossaccount]
role_arn=arn:aws:iam:...
source_profile=development
The source_profile points to the profile that contains credentials that will be used to make the AssumeRole() call, and role_arn specifies the Role to assume.
See: Assume Role Provider
Finally, you can tell boto3 to use that particular profile for credentials:
session = boto3.Session(profile_name='crossaccount')
# Any clients created from this session will use credentials
# from the [crossaccount] section of ~/.aws/credentials.
dev_s3_client = session.client('s3')
An alternative to all of the above (which boto3 otherwise does for you) is to call assume_role() in your code, then use the temporary credentials that are returned to define a new session that you can use to connect to a service. However, the profile-based method above is a lot easier.
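Since the question also involves MFA, here is a hedged sketch of that manual route: calling STS AssumeRole with your MFA device serial and the current Google Authenticator code, then building a session from the temporary credentials. The role ARN, MFA serial, and token code below are placeholders:

import boto3

sts = boto3.client('sts')

# SerialNumber is the ARN of your MFA device; TokenCode is the current
# 6-digit code from Google Authenticator. All values are placeholders.
response = sts.assume_role(
    RoleArn='arn:aws:iam::123456789012:role/DevRole',
    RoleSessionName='dev-session',
    SerialNumber='arn:aws:iam::123456789012:mfa/my-user',
    TokenCode='123456'
)

creds = response['Credentials']
session = boto3.Session(
    aws_access_key_id=creds['AccessKeyId'],
    aws_secret_access_key=creds['SecretAccessKey'],
    aws_session_token=creds['SessionToken']
)

# Clients and resources created from this session act as the assumed role.
s3 = session.resource('s3')
for bucket in s3.buckets.all():
    print(bucket.name)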
With boto I used to specify my credentials when connecting to S3 like this:
import boto
from boto.s3.connection import Key, S3Connection
S3 = S3Connection( settings.AWS_SERVER_PUBLIC_KEY, settings.AWS_SERVER_SECRET_KEY )
I could then use S3 to perform my operations (in my case deleting an object from a bucket).
With boto3, all the examples I found look like this:
import boto3
S3 = boto3.resource( 's3' )
S3.Object( bucket_name, key_name ).delete()
I couldn't specify my credentials, so all attempts failed with an InvalidAccessKeyId error.
How can I specify credentials with boto3?
You can create a session:
import boto3
session = boto3.Session(
    aws_access_key_id=settings.AWS_SERVER_PUBLIC_KEY,
    aws_secret_access_key=settings.AWS_SERVER_SECRET_KEY,
)
Then use that session to get an S3 resource:
s3 = session.resource('s3')
You can also get a client with a new session directly, like below:
s3_client = boto3.client(
    's3',
    aws_access_key_id=settings.AWS_SERVER_PUBLIC_KEY,
    aws_secret_access_key=settings.AWS_SERVER_SECRET_KEY,
    region_name=REGION_NAME
)
This is older, but I'm placing it here for my reference too. boto3.resource just uses the default Session, so you can pass the same session details through to boto3.resource.
Help on function resource in module boto3:
resource(*args, **kwargs)
Create a resource service client by name using the default session.
See :py:meth:`boto3.session.Session.resource`.
https://github.com/boto/boto3/blob/86392b5ca26da57ce6a776365a52d3cab8487d60/boto3/session.py#L265
You can see that it just takes the same arguments as boto3.Session:
import boto3
S3 = boto3.resource('s3', region_name='us-west-2', aws_access_key_id=settings.AWS_SERVER_PUBLIC_KEY, aws_secret_access_key=settings.AWS_SERVER_SECRET_KEY)
S3.Object( bucket_name, key_name ).delete()
I'd like to expand on #JustAGuy's answer. The method I prefer is to use the AWS CLI to create a config file. The reason is that, with the config file, the CLI or the SDK will automatically look for credentials in the ~/.aws folder. And the good thing is that the AWS CLI is written in Python.
You can get the CLI from PyPI if you don't have it already. Here are the steps to get the CLI set up from the terminal:
$> pip install awscli  # can add the --user flag
$> aws configure
AWS Access Key ID [****************ABCD]:[enter your key here]
AWS Secret Access Key [****************xyz]:[enter your secret key here]
Default region name [us-west-2]:[enter your region here]
Default output format [None]:
After this you can use boto3 and any of the APIs without having to specify keys (unless you want to use different credentials).
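For example, once aws configure has written ~/.aws/credentials, a client created with no explicit keys picks them up automatically (listing buckets here is just a sanity check, a sketch of the default-session path):

import boto3

# No keys passed here; boto3 falls back to the credentials written
# by `aws configure` in ~/.aws/credentials.
s3 = boto3.client('s3')
for b in s3.list_buckets()['Buckets']:
    print(b['Name'])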
If you rely on your .aws/credentials to store id and key for a user, it will be picked up automatically.
For instance
session = boto3.Session(profile_name='dev')
s3 = session.resource('s3')
This will pick up the dev profile (user) if your credentials file contains the following:
[dev]
aws_access_key_id = AAABBBCCCDDDEEEFFFGG
aws_secret_access_key = FooFooFoo
region=ap-southeast-2
There are numerous ways to store credentials while still using boto3.resource().
I'm using the AWS CLI method myself. It works perfectly.
https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html
You can set the default AWS environment variables for the secret and access keys; that way you don't need to change the default client-creation code, though it is better to pass credentials as parameters if you have non-default credentials.
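A minimal sketch of that approach (the values are placeholders; normally you would export these in the shell rather than set them in code): boto3 reads AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and optionally AWS_DEFAULT_REGION from the environment when the client is created with no explicit arguments.

import os
import boto3

# Placeholder values; in practice export these in your shell or deployment
# environment instead of hard-coding them.
os.environ["AWS_ACCESS_KEY_ID"] = "your-access-key-id"
os.environ["AWS_SECRET_ACCESS_KEY"] = "your-secret-access-key"
os.environ["AWS_DEFAULT_REGION"] = "us-west-2"

# No credentials passed here; boto3 picks them up from the environment.
s3 = boto3.client("s3")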
I have a Django web app and I want to allow it to download files from my S3 bucket.
The files are not public. I have an IAM policy to access them.
The problem is that I do NOT want to download the file to the Django app server and then serve it to the client. That is like downloading twice. I want the client of the Django app to be able to download it directly.
Also, I don't think it's safe to pass my IAM credentials in an HTTP request, so I think I need to use a temporary token.
I read:
http://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp_use-resources.html
but I just do not understand how to generate a temporary token on the fly.
A python solution (maybe using boto) would be appreciated.
With Boto (2), it should be really easy to generate time-limited download URLs, provided your IAM policy has the proper permissions. I am using this approach to serve videos to logged-in users from a private S3 bucket.
from boto.s3.connection import S3Connection
conn = S3Connection('<aws access key>', '<aws secret key>')
bucket = conn.get_bucket('mybucket')
key = bucket.get_key('mykey', validate=False)
url = key.generate_url(86400)
This would generate a download URL for the key mykey in the given bucket, valid for 24 hours (86400 seconds). Without validate=False, Boto 2 will first check that the key actually exists in the bucket and, if not, will throw an exception. With these server-controlled files that is often an unnecessary extra step, hence validate=False in the example.
In Boto3 the API is quite different:
import boto3

s3 = boto3.client('s3')

# Generate a presigned URL to get 'mykey' from 'mybucket',
# valid for 24 hours (86400 seconds)
url = s3.generate_presigned_url(
    ClientMethod='get_object',
    Params={
        'Bucket': 'mybucket',
        'Key': 'mykey'
    },
    ExpiresIn=86400
)
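In the Django view the app server then only hands out the presigned URL and never streams the file itself; a minimal sketch (the view, bucket, and key names are made up for illustration):

import boto3
from django.shortcuts import redirect

def download_file(request):
    # Hypothetical view: redirect the browser straight to S3 so the
    # file bytes never pass through the Django server.
    s3 = boto3.client('s3')
    url = s3.generate_presigned_url(
        ClientMethod='get_object',
        Params={'Bucket': 'mybucket', 'Key': 'mykey'},
        ExpiresIn=3600
    )
    return redirect(url)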