How to create a signed CloudFront URL with Python?

I would like to know how to create a signed URL for CloudFront. The current working solution is unsecured, and I would like to switch the system to secure URLs.
I have tried using Boto 2.5.2 and Django 1.4.
Is there a working example of how to use the boto.cloudfront.distribution.create_signed_url method, or any other solution that works?
I have tried the following code using the boto 2.5.2 API:
def get_signed_url():
    import boto, time, pprint
    from boto import cloudfront
    from boto.cloudfront import distribution

    AWS_ACCESS_KEY_ID = 'YOUR_AWS_ACCESS_KEY_ID'
    AWS_SECRET_ACCESS_KEY = 'YOUR_AWS_SECRET_ACCESS_KEY'
    KEYPAIR_ID = 'YOUR_KEYPAIR_ID'
    KEYPAIR_FILE = 'YOUR_FULL_PATH_TO_FILE.pem'
    CF_DISTRIBUTION_ID = 'E1V7I3IOVHUU02'

    my_connection = boto.cloudfront.CloudFrontConnection(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)
    distros = my_connection.get_all_streaming_distributions()
    oai = my_connection.create_origin_access_identity('my_oai', 'An OAI for testing')
    distribution_config = my_connection.get_streaming_distribution_config(CF_DISTRIBUTION_ID)
    distribution_info = my_connection.get_streaming_distribution_info(CF_DISTRIBUTION_ID)
    my_distro = boto.cloudfront.distribution.Distribution(
        connection=my_connection, config=distribution_config,
        domain_name=distribution_info.domain_name, id=CF_DISTRIBUTION_ID,
        last_modified_time=None, status='Active')

    s3 = boto.connect_s3()
    BUCKET_NAME = "YOUR_S3_BUCKET_NAME"
    bucket = s3.get_bucket(BUCKET_NAME)
    object_name = "FULL_URL_TO_MP4_EXCLUDING_S3_URL_DOMAIN_NAME (e.g. my/path/video.mp4)"
    key = bucket.get_key(object_name)
    key.add_user_grant("READ", oai.s3_user_id)

    SECS = 8000
    OBJECT_URL = 'FULL_S3_URL_TO_FILE.mp4'
    my_signed_url = my_distro.create_signed_url(
        OBJECT_URL, KEYPAIR_ID, expire_time=time.time() + SECS,
        valid_after_time=None, ip_address=None, policy_url=None,
        private_key_file=KEYPAIR_FILE, private_key_string=KEYPAIR_ID)
Everything seems fine until the call to create_signed_url, which returns an error:
Exception Value: Only specify the private_key_file or the private_key_string not both

Omit the private_key_string:
my_signed_url = my_distro.create_signed_url(OBJECT_URL, KEYPAIR_ID,
                                            expire_time=time.time() + SECS,
                                            private_key_file=KEYPAIR_FILE)
That parameter is used to pass the actual contents of the private key file, as a string. The comments in the source explain that only one of private_key_file or private_key_string should be passed.
You can also omit all the kwargs which are set to None, since None is the default.
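Pulling the pieces together, a trimmed-down sketch of the whole flow might look like the following. This is only an illustration: it assumes the distribution and the key pair already exist, that credentials come from the usual boto configuration, and the distribution ID is a placeholder.
import time

import boto

def get_signed_url(object_url, keypair_id, keypair_file, secs=8000):
    """Return a signed CloudFront URL that expires `secs` seconds from now."""
    conn = boto.connect_cloudfront()  # credentials from env vars or ~/.boto
    # Fetch the existing distribution instead of constructing one by hand.
    distro = conn.get_distribution_info('YOUR_CF_DISTRIBUTION_ID')
    return distro.create_signed_url(
        object_url,
        keypair_id,
        expire_time=int(time.time()) + secs,
        private_key_file=keypair_file,  # pass the file path OR private_key_string, never both
    )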

Related

Building Lambda Function in yaml - Issue

I'm building a CloudFormation deployment that includes a Lambda function written in Python 3.9. However, when I build the function, it will not allow me to keep the single quotes. This hasn't been an issue for most of the script, as I simply import json and double quotes (") work fine, but one section requires the single quotes.
Here is the code:
import boto3
import json
def lambda_handler(event, context):
    client = client_obj()
    associated = associated_list(client)
    response = client.list_resolver_query_log_configs(
        MaxResults=1,
    )
    config = response['ResolverQueryLogConfigs'][0]['Id']
    ec2 = boto3.client('ec2')
    vpc = ec2.describe_vpcs()
    vpcs = vpc['Vpcs']
    for v in vpcs:
        if v['VpcId'] not in associated:
            client.associate_resolver_query_log_config(
                ResolverQueryLogConfigId=f"{config}",
                ResourceId=f"{v['VpcId']}"
            )
        else:
            print(f"{v['VpcId']} is already linked.")

def client_obj():
    client = boto3.client('route53resolver')
    return client

def associated_list(client_object):
    associated = list()
    assoc = client_object.list_resolver_query_log_config_associations()
    for element in assoc['ResolverQueryLogConfigAssociations']:
        associated.append(element['ResourceId'])
    return associated
Any section that includes f"{v['VpcId']}" requires the single quotes inside the [] for the script to run properly. Since YAML requires the script to be encapsulated in single quotes for packaging, how can I fix this?
Example in yaml from another script:
CreateIAMUser:
  Type: 'AWS::Lambda::Function'
  Properties:
    Code:
      ZipFile: !Join
        - |+

        - - import boto3
          - 'import json'
          - 'from botocore.exceptions import ClientError'
          - ''
          - ''
          - 'def lambda_handler(event, context):'
          - ' iam_client = boto3.client("iam")'
          - ''
          - ' account_id = boto3.client("sts").get_caller_identity()["Account"]'
          - ''
I imagine I could re-arrange the script to avoid this, but I would like to use this opportunity to learn something new if possible.
Not sure what you are trying to do, but usually you just use a pipe (a literal block scalar) in YAML for that:
Code:
  ZipFile: |
    import boto3
    import json

    def lambda_handler(event, context):
        client = client_obj()
        associated = associated_list(client)
        response = client.list_resolver_query_log_configs(
            MaxResults=1,
        )
        config = response['ResolverQueryLogConfigs'][0]['Id']
        ec2 = boto3.client('ec2')
        vpc = ec2.describe_vpcs()
        vpcs = vpc['Vpcs']
        for v in vpcs:
            if v['VpcId'] not in associated:
                client.associate_resolver_query_log_config(
                    ResolverQueryLogConfigId=f"{config}",
                    ResourceId=f"{v['VpcId']}"
                )
            else:
                print(f"{v['VpcId']} is already linked.")

    def client_obj():
        client = boto3.client('route53resolver')
        return client

    def associated_list(client_object):
        associated = list()
        assoc = client_object.list_resolver_query_log_config_associations()
        for element in assoc['ResolverQueryLogConfigAssociations']:
            associated.append(element['ResourceId'])
        return associated
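If you want to convince yourself that the literal block keeps the embedded Python exactly as written, single quotes included, a quick check with PyYAML shows the behaviour; the snippet below is only an illustration and assumes PyYAML is installed:
import yaml  # PyYAML, used here only to demonstrate the parsing behaviour

snippet = """\
Code:
  ZipFile: |
    print(f"{v['VpcId']} is already linked.")
"""

parsed = yaml.safe_load(snippet)
# The literal block scalar (|) preserves the line verbatim:
print(parsed['Code']['ZipFile'])
# -> print(f"{v['VpcId']} is already linked.")
Note that the !Join form from the other template uses a CloudFormation-specific tag, so plain yaml.safe_load would reject it; the simple ZipFile: | form avoids that entirely.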

How to mock list of response objects from boto3?

I'd like to get all archives from a specific directory in an S3 bucket, like the following:
import boto3

def get_files_from_s3(bucket_name, s3_prefix):
    files = []
    s3_resource = boto3.resource("s3")
    bucket = s3_resource.Bucket(bucket_name)
    response = bucket.objects.filter(Prefix=s3_prefix)
    for obj in response:
        if obj.key.endswith('.zip'):
            # get all archives
            files.append(obj.key)
    return files
My question is about testing it: I'd like to mock the list of objects in the response so that I can iterate over it. Here is what I tried:
from unittest.mock import patch
from dataclasses import dataclass

@dataclass
class MockZip:
    key = 'file.zip'

@patch('module.boto3')
def test_get_files_from_s3(self, mock_boto3):
    bucket = mock_boto3.resource('s3').Bucket(self.bucket_name)
    response = bucket.objects.filter(Prefix=S3_PREFIX)
    response.return_value = [MockZip()]

    files = module.get_files_from_s3(BUCKET_NAME, S3_PREFIX)
    self.assertEqual(['file.zip'], files)
I get an assertion error like this: E AssertionError: ['file.zip'] != []
Does anyone have a better approach? I used a dataclass, but I don't think that is the problem; I guess I get an empty list because the response is not iterable. So how can I mock it to be a list of mock objects instead of just a MagicMock?
Thanks
You could use moto, an open-source library built specifically to mock boto3 calls. It allows you to work directly with boto3, without having to worry about setting up mocks manually.
The test function that you're currently using would look like this:
import os

import boto3
import pytest
from moto import mock_s3

@pytest.fixture(scope='function')
def aws_credentials():
    """Mocked AWS Credentials, to ensure we're not touching AWS directly"""
    os.environ['AWS_ACCESS_KEY_ID'] = 'testing'
    os.environ['AWS_SECRET_ACCESS_KEY'] = 'testing'
    os.environ['AWS_SECURITY_TOKEN'] = 'testing'
    os.environ['AWS_SESSION_TOKEN'] = 'testing'

@mock_s3
def test_get_files_from_s3(self, aws_credentials):
    s3 = boto3.resource('s3')
    bucket = s3.Bucket(self.bucket_name)
    # Create the bucket first, as we're interacting with an empty mocked 'AWS account'
    bucket.create()

    # Create some example files that are representative of what the S3 bucket would look like in production
    client = boto3.client('s3', region_name='us-east-1')
    client.put_object(Bucket=self.bucket_name, Key="file.zip", Body="...")
    client.put_object(Bucket=self.bucket_name, Key="file.nonzip", Body="...")

    # Retrieve the files again using whatever logic
    files = module.get_files_from_s3(BUCKET_NAME, S3_PREFIX)
    self.assertEqual(['file.zip'], files)
Full documentation for Moto can be found here:
http://docs.getmoto.org/en/latest/index.html
Disclaimer: I am a maintainer for Moto.
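If you would rather stay with unittest.mock, the original attempt can also be fixed. A MagicMock is iterable but yields nothing by default, which is why the assertion saw an empty list; the trick is to set return_value on the filter attribute itself rather than on the object a previous call returned. A minimal sketch, assuming the function under test lives in a module named module:
from dataclasses import dataclass
from unittest.mock import patch

import module  # the module that defines get_files_from_s3 (assumed name)

@dataclass
class MockZip:
    key: str = 'file.zip'

@patch('module.boto3')
def test_get_files_from_s3(mock_boto3):
    # Calling a MagicMock returns its .return_value regardless of arguments,
    # so configuring filter.return_value makes the code under test iterate our fakes.
    bucket = mock_boto3.resource.return_value.Bucket.return_value
    bucket.objects.filter.return_value = [MockZip()]

    files = module.get_files_from_s3('my-bucket', 'some/prefix')
    assert files == ['file.zip']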

Separating development/staging/production media buckets on S3 in Django

We are currently using AWS S3 buckets as storage for media files in a Django 1.11 project (using S3BotoStorage from the django-storages library). The relevant code is here:
# storage.py
from django.conf import settings
from storages.backends.s3boto import S3BotoStorage

class MediaRootS3BotoStorage(S3BotoStorage):
    """Storage for uploaded media files."""
    bucket_name = settings.AWS_MEDIA_STORAGE_BUCKET_NAME
    custom_domain = domain(settings.MEDIA_URL)

# common_settings.py
DEFAULT_FILE_STORAGE = 'storage.MediaRootS3BotoStorage'
AWS_MEDIA_STORAGE_BUCKET_NAME = 'xxxxxxxxxxxxxxxx'
MEDIA_URL = "//media.example.com/"

# models.py
import os
import uuid

from django.conf import settings
from django.db import models
from django.utils import timezone
from django.utils.module_loading import import_string

def upload_to_unique_filename(instance, filename):
    try:
        extension = os.path.splitext(filename)[1]
    except Exception:
        extension = ""
    now = timezone.now()
    return f'resume/{now.year}/{now.month}/{uuid.uuid4()}{extension}'

class Candidate(models.Model):
    [...]
    resume = models.FileField(
        storage=import_string(settings.DEFAULT_PRIVATE_FILE_STORAGE)(),
        upload_to=upload_to_unique_filename,
    )
    [...]
[...]
The issue is that the bucket name is hardcoded in the settings file, and since there are multiple developers plus one staging environment, all of the junk files that are uploaded for testing/QA purposes end up in the same S3 bucket as the real production data.
One obvious solution would be to override AWS_MEDIA_STORAGE_BUCKET_NAME in the staging_settings.py and development_settings.py files, but that would make the production data unavailable on staging and testing instances. To make this work, we would somehow have to sync the production bucket to the dev/staging one, which I'm unsure how to do efficiently and seamlessly.
Another option would be to use local filesystem for media storage in development and staging environments. This would also require the download of substantial amount of media files, and would exclude one part of the stack (django-storages and S3 API) from the testing/QA process.
How to handle this? Is the mixing of testing and production media files in the same bucket even an issue (I was sure it was until I started thinking about how to handle it)? What are some best practices about separating development/staging/production cloud storages in general?
In our case, the team uses one bucket for all environments, but we add some metadata to the uploaded static and media files. That way, to delete non-production S3 objects you can simply filter on that metadata using the AWS API and delete the matches.
This is possible by adding the following to settings.py:
ENVIRONMENT = "development/production/qa"

AWS_S3_OBJECT_PARAMETERS = {
    'CacheControl': 'max-age=86400',
    'Metadata': {
        'environment': ENVIRONMENT
    }
}
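To make the clean-up half of this concrete: listings do not include user metadata, so a sweep has to list the keys and then inspect each object with a HEAD request before deleting. A rough sketch (not from the original answer) along those lines:
import boto3

def delete_non_production_objects(bucket_name, keep_environment='production'):
    """Delete objects whose 'environment' metadata differs from keep_environment."""
    s3 = boto3.client('s3')
    for page in s3.get_paginator('list_objects_v2').paginate(Bucket=bucket_name):
        for obj in page.get('Contents', []):
            # User metadata is only returned by HeadObject, not by the listing.
            head = s3.head_object(Bucket=bucket_name, Key=obj['Key'])
            environment = head['Metadata'].get('environment', keep_environment)
            if environment != keep_environment:
                s3.delete_object(Bucket=bucket_name, Key=obj['Key'])
Objects uploaded without the metadata are left alone here, which errs on the safe side.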
We recently addressed this issue with a custom S3Storage class that supports two buckets instead of one. Each environment writes to their own bucket which means that the production bucket doesn't get polluted with files from the temporary environments (dev, staging, QA, ...). However, if a given environment needs a resource that it can't find in its own bucket, then it automatically tries to fetch it from the production bucket. Accordingly, we do not need to duplicate tons of mostly static resources that are already available in the production bucket.
In settings.py, we add two new variables and specify a custom storage class
# The alternate bucket (typically the production bucket) is used as a fallback when the primary one doesn't contain the resource requested.
AWS_STORAGE_ALTERNATE_BUCKET_NAME = os.getenv('AWS_STORAGE_ALTERNATE_BUCKET_NAME')
AWS_S3_ALTERNATE_CUSTOM_DOMAIN = f'{AWS_STORAGE_ALTERNATE_BUCKET_NAME}.s3.amazonaws.com'
# Custom storage class
STATICFILES_STORAGE = 'hello_django.storage_backends.StaticStorage'
Then, in the custom storage class, we override the url() method as follows:
from datetime import datetime, timedelta
from urllib.parse import urlencode

from django.utils.encoding import filepath_to_uri
from storages.backends.s3boto3 import S3Boto3Storage
from storages.utils import setting

class StaticStorage(S3Boto3Storage):
    location = 'static'
    default_acl = 'public-read'

    def __init__(self, **settings):
        super().__init__(**settings)

    def get_default_settings(self):
        settings_dict = super().get_default_settings()
        settings_dict.update({
            "alternate_bucket_name": setting("AWS_STORAGE_ALTERNATE_BUCKET_NAME"),
            "alternate_custom_domain": setting("AWS_S3_ALTERNATE_CUSTOM_DOMAIN"),
        })
        return settings_dict

    def url(self, name, parameters=None, expire=None, http_method=None):
        params = parameters.copy() if parameters else {}
        if self.exists(name):
            r = self._url(name, parameters=params, expire=expire, http_method=http_method)
        else:
            if self.alternate_bucket_name:
                params['Bucket'] = self.alternate_bucket_name
            r = self._url(name, parameters=params, expire=expire, http_method=http_method)
        return r

    def _url(self, name, parameters=None, expire=None, http_method=None):
        """
        Similar to super().url() except that it allows the caller to provide
        an alternate bucket name in parameters['Bucket'].
        """
        # Preserve the trailing slash after normalizing the path.
        name = self._normalize_name(self._clean_name(name))
        params = parameters.copy() if parameters else {}
        if expire is None:
            expire = self.querystring_expire

        if self.custom_domain:
            bucket_name = params.pop('Bucket', None)
            if bucket_name is None or self.alternate_custom_domain is None:
                custom_domain = self.custom_domain
            else:
                custom_domain = self.alternate_custom_domain
            url = '{}//{}/{}{}'.format(
                self.url_protocol,
                custom_domain,
                filepath_to_uri(name),
                '?{}'.format(urlencode(params)) if params else '',
            )
            if self.querystring_auth and self.cloudfront_signer:
                expiration = datetime.utcnow() + timedelta(seconds=expire)
                return self.cloudfront_signer.generate_presigned_url(url, date_less_than=expiration)
            return url

        if params.get('Bucket') is None:
            params['Bucket'] = self.bucket.name
        params['Key'] = name
        url = self.bucket.meta.client.generate_presigned_url('get_object', Params=params,
                                                             ExpiresIn=expire, HttpMethod=http_method)
        if self.querystring_auth:
            return url
        return self._strip_signing_parameters(url)
This sample project illustrates the approach.
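Since the original question is about media uploads rather than static files, the same fallback behaviour can be reused by subclassing; a minimal sketch under the same assumptions:
class MediaStorage(StaticStorage):
    """User uploads: private ACL, no overwrites, same production-bucket fallback for reads."""
    location = 'media'
    default_acl = 'private'
    file_overwrite = False
Point DEFAULT_FILE_STORAGE at this class in settings.py and each environment keeps its uploads in its own bucket while still being able to read existing production media.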

Deploying static site with s3 and cloudfront using python boto3

Trying to automate the deployment of a static website using boto3. I have a static website (angular/javascript/html) sitting in a bucket, and need to use the aws cloudfront CDN.
Anyway, it looks like creating the S3 bucket and copying in the HTML/JS is working fine.
import boto3

cf = boto3.client('cloudfront')
cf.create_distribution(DistributionConfig=dict(
    CallerReference='firstOne',
    Aliases=dict(Quantity=1, Items=['mydomain.com']),
    DefaultRootObject='index.html',
    Comment='Test distribution',
    Enabled=True,
    Origins=dict(
        Quantity=1,
        Items=[dict(
            Id='1',
            DomainName='mydomain.com.s3.amazonaws.com')
        ]),
    DefaultCacheBehavior=dict(
        TargetOriginId='1',
        ViewerProtocolPolicy='redirect-to-https',
        TrustedSigners=dict(Quantity=0, Enabled=False),
        ForwardedValues=dict(
            Cookies={'Forward': 'all'},
            Headers=dict(Quantity=0),
            QueryString=False,
            QueryStringCacheKeys=dict(Quantity=0),
        ),
        MinTTL=1000)
    )
)
When I try to create the cloudfront distribution, I get the following error:
InvalidOrigin: An error occurred (InvalidOrigin) when calling the CreateDistribution operation: The specified origin server does not exist or is not valid.
Interestingly, it seems to be complaining about the origin, mydomain.com.s3.amazonaws.com; however, when I create a distribution for the S3 bucket in the web console, it has no problem with the same origin domain name.
Update:
I can get this to work with boto with the following, but would rather use boto3:
import boto
c = boto.connect_cloudfront()
origin = boto.cloudfront.origin.S3Origin('mydomain.com.s3.amazonaws.com')
distro = c.create_distribution(origin=origin, enabled=False, comment='My new Distribution')
Turns out there is a required parameter that is not documented properly.
Since the origin is an S3 bucket, you must have S3OriginConfig = dict(OriginAccessIdentity = '') defined, even if the OriginAccessIdentity is not used and is an empty string.
The following command works. Note that you still need a bucket policy to make the objects accessible, and a Route 53 entry to alias the CNAME we want to the CloudFront-generated hostname.
cf.create_distribution(DistributionConfig=dict(
    CallerReference='firstOne',
    Aliases=dict(Quantity=1, Items=['mydomain.com']),
    DefaultRootObject='index.html',
    Comment='Test distribution',
    Enabled=True,
    Origins=dict(
        Quantity=1,
        Items=[dict(
            Id='1',
            DomainName='mydomain.com.s3.amazonaws.com',
            S3OriginConfig=dict(OriginAccessIdentity=''))
        ]),
    DefaultCacheBehavior=dict(
        TargetOriginId='1',
        ViewerProtocolPolicy='redirect-to-https',
        TrustedSigners=dict(Quantity=0, Enabled=False),
        ForwardedValues=dict(
            Cookies={'Forward': 'all'},
            Headers=dict(Quantity=0),
            QueryString=False,
            QueryStringCacheKeys=dict(Quantity=0),
        ),
        MinTTL=1000)
    )
)
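As a small follow-up (not part of the original answer), boto3 can also wait for the distribution to finish deploying and hand back the generated hostname for the Route 53 alias record; a sketch, assuming the DistributionConfig dict from above:
import boto3

def deploy_distribution(distribution_config):
    """Create the distribution and block until CloudFront reports it as Deployed."""
    cf = boto3.client('cloudfront')
    response = cf.create_distribution(DistributionConfig=distribution_config)
    dist_id = response['Distribution']['Id']
    domain_name = response['Distribution']['DomainName']  # target of the Route 53 alias record
    cf.get_waiter('distribution_deployed').wait(Id=dist_id)
    return dist_id, domain_name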

boto3 searching unused security groups

I am using the AWS Python SDK boto3 and I am trying to find out which security groups are unused. With boto 2 I did it as shown below, but I do not know how to do the same with boto3.
from boto.ec2.connection import EC2Connection
from boto.ec2.regioninfo import RegionInfo
import boto.sns
import sys
import logging

from security_groups_config import config

# Get settings from config.py
aws_access_key = config['aws_access_key']
aws_secret_key = config['aws_secret_key']
ec2_region_name = config['ec2_region_name']
ec2_region_endpoint = config['ec2_region_endpoint']

region = RegionInfo(name=ec2_region_name, endpoint=ec2_region_endpoint)

if aws_access_key:
    conn = EC2Connection(aws_access_key, aws_secret_key, region=region)
else:
    conn = EC2Connection(region=region)

sgs = conn.get_all_security_groups()

## Searching unused SG if the instances number is 0
def search_unused_sg(event, context):
    for sg in sgs:
        print sg.name, len(sg.instances())
Use the power of Boto3 and Python's list comprehension and sets to get what you want in 7 lines of code:
import boto3
ec2 = boto3.resource('ec2') #You have to change this line based on how you pass AWS credentials and AWS config
sgs = list(ec2.security_groups.all())
insts = list(ec2.instances.all())
all_sgs = set([sg.group_name for sg in sgs])
all_inst_sgs = set([sg['GroupName'] for inst in insts for sg in inst.security_groups])
unused_sgs = all_sgs - all_inst_sgs
Debug information
print 'Total SGs:', len(all_sgs)
print 'SGS attached to instances:', len(all_inst_sgs)
print 'Orphaned SGs:', len(unused_sgs)
print 'Unattached SG names:', unused_sgs
Output
Total SGs: 289
SGS attached to instances: 129
Orphaned SGs: 160
Unattached SG names: set(['mysg', '...
First, I suggest you take another look at how boto3 deals with credentials. It is better to use a generic AWS credentials file, so that in the future, when required, you can switch to IAM role-based credentials or AWS STS without changing your code.
import boto3

# You should use the credential profile file
ec2 = boto3.client("ec2")

# In boto3, if you have more than 1000 entries, you need to handle the pagination
# using the NextToken parameter, which is not shown here.
all_instances = ec2.describe_instances()
all_sg = ec2.describe_security_groups()

instance_sg_set = set()
sg_set = set()

for reservation in all_instances["Reservations"]:
    for instance in reservation["Instances"]:
        for sg in instance["SecurityGroups"]:
            instance_sg_set.add(sg["GroupName"])

for security_group in all_sg["SecurityGroups"]:
    sg_set.add(security_group["GroupName"])

idle_sg = sg_set - instance_sg_set
Note: the code is not tested. Please debug it as required.
Note: if you have an ASG (Auto Scaling group) that is currently at zero instances, the security groups it references will look orphaned, but as soon as the ASG starts adding instances it will use those groups again. Keep in mind that you need to check the ASG security groups as well.
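To address the pagination caveat in the comment above, the same set difference can be built with paginators, which handle NextToken automatically; an untested sketch:
import boto3

ec2 = boto3.client('ec2')

instance_sg_set = set()
for page in ec2.get_paginator('describe_instances').paginate():
    for reservation in page['Reservations']:
        for instance in reservation['Instances']:
            for sg in instance['SecurityGroups']:
                instance_sg_set.add(sg['GroupName'])

sg_set = set()
for page in ec2.get_paginator('describe_security_groups').paginate():
    for security_group in page['SecurityGroups']:
        sg_set.add(security_group['GroupName'])

idle_sg = sg_set - instance_sg_set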
I used an alternative approach. If we skip the credentials discussion and go back to the main question, "boto3 searching unused security groups", here is an option:
You enumerate the resource, in my case a network interface, because if you think about it, a security group has to be associated with a resource in order to be used.
My example:
client = boto3.client('ec2', region_name=region,
                      aws_access_key_id=newsession_id,
                      aws_secret_access_key=newsession_key,
                      aws_session_token=newsession_token)

response = client.describe_network_interfaces()

for i in response["NetworkInterfaces"]:
    # Check if the security group is attached
    if 'Attachment' in i and i['Attachment']['Status'] == 'attached':
        # Create a list with the attached SGs
        groups = [g['GroupId'] for g in i['Groups']]
I used the network interface resource because I needed to get public IPs for the accounts.
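To round that idea off, here is a rough, untested sketch of the ENI-based variant of the original question: collect every group attached to any network interface and subtract it from the full list of groups.
import boto3

def find_unused_security_groups(client):
    """Return the IDs of security groups not attached to any network interface."""
    used = set()
    for page in client.get_paginator('describe_network_interfaces').paginate():
        for eni in page['NetworkInterfaces']:
            used.update(group['GroupId'] for group in eni['Groups'])

    all_groups = set()
    for page in client.get_paginator('describe_security_groups').paginate():
        all_groups.update(sg['GroupId'] for sg in page['SecurityGroups'])

    return all_groups - used

unused = find_unused_security_groups(boto3.client('ec2'))
Because network interfaces are created for RDS instances, load balancers, VPC-attached Lambda functions, and so on, this catches usage that an instance-only check would miss.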
