I can create a bucket using these parameters, but none of them is a custom header. It is also said that boto3 will not support this, because S3 does not currently allow setting arbitrary headers on buckets or objects.
In my case, however, I am using Cloudian as the storage backend. It supports an x-gmt-policyid header; this policy determines how data in the bucket is distributed and protected through either replication or erasure coding.
Any idea how to inject a custom header into boto3 bucket creation?
s3_resource.create_bucket(Bucket='foo-1')
My last two options:
1) Fork botocore and add this functionality, but I saw that it uses loaders.py to read everything from JSON files, which seems a bit complicated for a beginner.
2) Write a pure Python implementation using the requests module to create the S3 bucket.
Thanks for any suggestions.
My current solution is to call the S3-compatible Cloudian API directly. Signing the request is quite involved, so I use the requests-aws4auth library for that; I tried other libraries but failed.
Example of creating a bucket with the Cloudian x-gmt-policyid value:
import requests
from requests_aws4auth import AWS4Auth

endpoint = "http://awesome-bucket.my-s3.net"
auth = AWS4Auth(
    "00ac60d1a669fakekey",
    "S2/x9sRvb1Jys9n+fakekey",
    "eu-west-1",
    "s3",
)

headers = {
    "x-gmt-policyid": "9f934425b7f5de611c32fakeid",
    "x-amz-acl": "public-read",
}

response = requests.put(endpoint, auth=auth, headers=headers)
print(response.text)
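As an aside, if you would rather stay within boto3, its event system can inject extra headers into a request without forking botocore. This is only a sketch of that alternative; I have not verified it against a Cloudian endpoint, and the endpoint URL and policy ID below are placeholders:

import boto3

# Placeholder endpoint; credentials are assumed to come from the usual
# boto3 configuration (environment variables, ~/.aws/credentials, ...).
s3_client = boto3.client("s3", endpoint_url="http://my-s3.net")


def add_policy_header(params, **kwargs):
    # 'params' is the serialized request dict; add the header before it is signed and sent.
    params["headers"]["x-gmt-policyid"] = "9f934425b7f5de611c32fakeid"


# Only fire for the CreateBucket operation.
s3_client.meta.events.register("before-call.s3.CreateBucket", add_policy_header)

s3_client.create_bucket(Bucket="foo-1")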
I am trying to get multiple objects from an S3 bucket using Python, with the AWS CLI installed and configured. I can currently get a single file using this code.
import boto3

url = boto3.client('s3').generate_presigned_url(
    ClientMethod='get_object',
    Params={'Bucket': 'test-bucket', 'Key': '00001.png'},
    ExpiresIn=3600)
print(url)
However, I need to generate the same for 100 other image files. How can I do this?
Run the code 100 times -- seriously!
You should separate out the client generation, such as:
s3_client = boto3.client('s3')
url = s3_client.generate_presigned_url(...)
It's a very quick command and doesn't require a call to AWS, so you can repeat or loop through the last line many times.
Each object will require a separate pre-signed URL because permission is being generated for just one object at a time.
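For example, a minimal sketch of such a loop, assuming the keys follow the 00001.png naming pattern from the question (adjust the key generation to your actual file names):

import boto3

s3_client = boto3.client('s3')

urls = []
for i in range(1, 101):
    key = '{:05d}.png'.format(i)  # 00001.png, 00002.png, ...
    urls.append(s3_client.generate_presigned_url(
        ClientMethod='get_object',
        Params={'Bucket': 'test-bucket', 'Key': key},
        ExpiresIn=3600))

print(urls)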
Since Azure's Shared Access Signatures (SAS) are an open standard, I want to use them in my Django application, which has no links to Azure whatsoever. I basically want to create a piece of code that grants read permission for the next 24 hours on a specific URL I'm serving. So I've watched some videos on SAS and installed the Python library for it (pip install azure-storage-blob).
I read over the README here on GitHub, but as far as I can see it always requires an Azure account. Is it also possible to use SAS in my own (Python) application? I imagine it would create the hashes based on a pre-defined secret key. If this is possible, does anybody have example code showing how to create the URL and how to validate it? Preferably in Python, but example code in other languages would be welcome as well.
While the original blob storage SAS generation code exists here, I find the simplified code below more useful for your general purpose (inspired by this sample). Adjust it as needed. Below is the client-side SAS generation logic (an HMAC-SHA256 digest) using a secret key. Use the same logic on the server side to re-generate the signature from the URL params (sr, sig, se) and compare it (sig) with the one passed from the client; a server-side sketch follows the function below. Note that the shared secret key on both the client and server side is the main driver here.
import time
import urllib.parse
import hmac
import hashlib
import base64


def get_auth_token(url_base, resource, sas_name, sas_secret):
    """
    Returns an authorization token dictionary
    for making calls to Event Hubs REST API.
    """
    uri = urllib.parse.quote_plus("https://{}.something.com/{}"
                                  .format(url_base, resource))
    sas = sas_secret.encode('utf-8')
    expiry = str(int(time.time() + 10000))
    string_to_sign = (uri + '\n' + expiry).encode('utf-8')
    signed_hmac_sha256 = hmac.HMAC(sas, string_to_sign, hashlib.sha256)
    signature = urllib.parse.quote(base64.b64encode(signed_hmac_sha256.digest()))
    return {"url_base": url_base,
            "resource": resource,
            "token": 'SharedAccessSignature sr={}&sig={}&se={}&skn={}'
                     .format(uri, signature, expiry, sas_name)}
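As noted above, server-side validation is just the same computation run again. This is only a sketch, assuming you extract sr, sig and se from the incoming token exactly as they appear there (i.e. still percent-encoded) and hold the same shared secret; the function name is hypothetical:

import base64
import hashlib
import hmac
import time
import urllib.parse


def is_token_valid(sr, sig, se, sas_secret):
    """Re-compute the signature from sr and se and compare it with sig."""
    # Reject tokens whose expiry timestamp has passed.
    if int(se) < int(time.time()):
        return False
    string_to_sign = (sr + '\n' + se).encode('utf-8')
    expected = hmac.HMAC(sas_secret.encode('utf-8'), string_to_sign, hashlib.sha256)
    expected_sig = urllib.parse.quote(base64.b64encode(expected.digest()))
    # Constant-time comparison to avoid leaking information via timing.
    return hmac.compare_digest(expected_sig, sig)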
I was asked to perform an integration with an external Google Storage bucket, and I received a credentials JSON.
Running
gsutil ls gs://bucket_name (after configuring myself with the credentials JSON) returned a valid response, and uploading a file into the bucket also worked.
However, when I try to do it with Python 3 it does not work.
Using google-cloud-storage==1.16.0 (I also tried newer versions), I'm doing:
from google.cloud import storage
from google.oauth2 import service_account

project_id = credentials_dict.get("project_id")
credentials = service_account.Credentials.from_service_account_info(credentials_dict)
client = storage.Client(credentials=credentials, project=project_id)
bucket = client.get_bucket(bucket_name)
But on the get_bucket line, I get:
google.api_core.exceptions.Forbidden: 403 GET https://www.googleapis.com/storage/v1/b/BUCKET_NAME?projection=noAcl: USERNAME#PROJECT_ID.iam.gserviceaccount.com does not have storage.buckets.get access to the Google Cloud Storage bucket.
The external partner I'm integrating with says that the user is set up correctly, and as proof they point out that I can perform the action with gsutil.
Can you please assist? Any idea what might be the problem?
The answer was that the credentials were indeed wrong, but it did work when I called client.bucket(bucket_name) on the client instead of client.get_bucket(bucket_name).
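For reference, a minimal sketch of that difference: client.bucket() only builds a local reference (no API call, so no storage.buckets.get permission is needed), while client.get_bucket() issues a GET against the bucket metadata. The bucket, blob and file names below are placeholders:

from google.cloud import storage

client = storage.Client()

# No API call happens here, so bucket-level get permission is not required.
bucket = client.bucket("bucket_name")

# Object-level operations still work if the service account has object permissions.
blob = bucket.blob("some/object.png")
blob.upload_from_filename("local-file.png")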
Please follow these steps to correctly set up the Cloud Storage Client Library for Python. In general, the Cloud Storage libraries can use Application Default Credentials or environment variables for authentication.
Notice that the recommended method is to set up authentication using the environment variable (e.g. on Linux: export GOOGLE_APPLICATION_CREDENTIALS="/path/to/[service-account-credentials].json") and avoid the service_account.Credentials.from_service_account_info() method altogether:
from google.cloud import storage

storage_client = storage.Client(project='project-id-where-the-bucket-is')
bucket_name = "your-bucket"
bucket = storage_client.get_bucket(bucket_name)
should simply work because the authentication is handled by the client library via the environment variable.
Now, if you are interested in explicitly using the service account instead of the service_account.Credentials.from_service_account_info() method, you can use the from_service_account_json() method directly, in the following way:
from google.cloud import storage

# Explicitly use service account credentials by specifying the private key
# file.
storage_client = storage.Client.from_service_account_json(
    '/[service-account-credentials].json')
bucket_name = "your-bucket"
bucket = storage_client.get_bucket(bucket_name)
Find all the relevant details as to how to provide credentials to your application here.
tl;dr: don't use client.get_bucket at all.
See https://stackoverflow.com/a/51452170/705745 for a detailed explanation and solution.
I have been tasked with converting some bash scripting used by my team that performs various CloudFormation tasks into Python using the boto3 library. I am currently stuck on one item: I cannot seem to determine how to do a wildcard-type search where a CloudFormation stack name contains a string.
My bash version using the AWS CLI is as follows:
aws cloudformation --region us-east-1 describe-stacks --query "Stacks[?contains(StackName,'myString')].StackName" --output json > stacks.out
This works on the CLI, outputting the results to a JSON file, but I cannot find any examples online of doing a similar contains search using boto3 with Python. Is it possible?
Thanks!
Yes, it is possible. What you are looking for is the following:
import boto3
# create a boto3 client first
cloudformation = boto3.client('cloudformation', region_name='us-east-1')
# use client to make a particular API call
response = cloudformation.describe_stacks(StackName='myString')
print(response)
# as an aside, you'd need a different client to communicate
# with a different service
# ec2 = boto3.client('ec2', region_name='us-east-1')
# regions = ec2.describe_regions()
where response is a Python dictionary which, among other things, contains the description of the stack named "myString".
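If you specifically need the contains-style matching from the CLI query rather than an exact stack name, one option is to page through describe_stacks without a StackName and filter in Python. A minimal sketch of that idea:

import boto3

cloudformation = boto3.client('cloudformation', region_name='us-east-1')

# Page through all stacks and keep the names containing the substring.
matching_stack_names = []
paginator = cloudformation.get_paginator('describe_stacks')
for page in paginator.paginate():
    for stack in page['Stacks']:
        if 'myString' in stack['StackName']:
            matching_stack_names.append(stack['StackName'])

print(matching_stack_names)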
The boto3 documentation shows an example of how to migrate a connection from boto 2.x to boto3:
# Boto 2.x
import boto
s3_connection = boto.connect_s3()
# Boto 3
import boto3
s3 = boto3.resource('s3')
However, in boto, it is possible to pass a parameter https_connection_factory. What is the equivalent in boto3?
There's no direct equivalent. When creating a client or resource, you can make some very broad choices about SSL (use_ssl, verify). Both calls can also take a botocore.config.Config object, which lets you control timeouts and HTTP pooling behavior, among other options.
However, if you want full control of the SSL context, there doesn't appear to be any official support. Internally, boto uses a requests.Session to do all its work. You can see where the session is set up here. If you're okay with digging into botocore's internal implementation, you can reach into your resources/clients and mount a custom adapter for https:// paths, as described in the requests user guide; a sketch follows below. The path to the HTTP session object is <client>._endpoint.http_session or <resource>.meta.client._endpoint.http_session.
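To make that concrete, here is a sketch of mounting a custom adapter that supplies an ssl.SSLContext. It relies on the internal, undocumented attribute path mentioned above and only applies to botocore versions where the HTTP layer is a requests.Session, so treat it as illustrative rather than supported:

import ssl

import boto3
from requests.adapters import HTTPAdapter


class SSLContextAdapter(HTTPAdapter):
    """Transport adapter that injects a custom ssl.SSLContext."""

    def __init__(self, ssl_context=None, **kwargs):
        self._ssl_context = ssl_context
        super().__init__(**kwargs)

    def init_poolmanager(self, *args, **kwargs):
        kwargs['ssl_context'] = self._ssl_context
        return super().init_poolmanager(*args, **kwargs)


context = ssl.create_default_context()
# Customize the context here, e.g. context.load_cert_chain(...)

s3 = boto3.client('s3')
# Internal attribute path from the answer above; it may break across versions.
s3._endpoint.http_session.mount('https://', SSLContextAdapter(ssl_context=context))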