I've just uploaded about 5 GB of data and would like to verify that the MD5 sums match. I've calculated them for my local copies of the files, but I'm having problems fetching ContentMD5 from Azure. So far I get an empty dict, though I can see the blob names. I've limited it to the first 10 items at the moment, just for debugging. I'm aware that the MD5 Azure stores differs from a typical md5sum call and have allowed for that locally. But currently I cannot see any blob properties at all, even though the properties (including ContentMD5) are there when I browse via the Azure portal.
Where am I going wrong?
Here's my code at the moment:
import os
import sys

from azure.storage.blob import BlobServiceClient


def remote_check(connection_str):
    blob_service_client = BlobServiceClient.from_connection_string(connection_str)
    container_name = "global"
    container = blob_service_client.get_container_client(container=container_name)
    blob_list = container.list_blobs()
    count = 0
    for blob in blob_list:
        if count < 10:
            blob_client = blob_service_client.get_blob_client(container=container_name, blob=blob)
            a = blob_client.get_blob_properties()
            print(a.metadata)
            print("Blob name: " + str(blob_client.blob_name))
            count = count + 1
        else:
            break


def main():
    try:
        CONNECTION_STRING = os.environ['AZURE_STORAGE_CONNECTION_STRING']
        remote_check(CONNECTION_STRING)
    except KeyError:
        print("AZURE_STORAGE_CONNECTION_STRING must be set.")
        sys.exit(1)


if __name__ == '__main__':
    main()
Please make sure you're using the latest version of the azure-storage-blob package (12.6.0).
Some properties are nested under content_settings; for example, to get content_md5 you should use the following code:
a = blob_client.get_blob_properties()
print(a.content_settings.content_md5)
You can also check the blob properties with a REST call (e.g., with a REST client like Postman), as described here:
https://learn.microsoft.com/en-us/rest/api/storageservices/get-blob-properties
The Content-MD5 value is returned as an HTTP response header.
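Note that content_settings.content_md5 comes back as raw bytes (a bytearray), while md5sum prints a hex digest, so compare the two in the same form. A minimal sketch of one way to do the comparison, assuming the "global" container from your code and placeholder blob/file names:

import hashlib
import os

from azure.storage.blob import BlobServiceClient

# Placeholder blob and local file names -- substitute your own.
connection_str = os.environ["AZURE_STORAGE_CONNECTION_STRING"]
blob_service_client = BlobServiceClient.from_connection_string(connection_str)
blob_client = blob_service_client.get_blob_client(container="global", blob="some/blob.bin")

# Azure stores the MD5 as raw bytes (it may be None if it was never set on upload).
remote_md5 = blob_client.get_blob_properties().content_settings.content_md5

# Hash the local copy and compare bytes to bytes (md5sum would print the hex form instead).
md5 = hashlib.md5()
with open("local/copy/of/some/blob.bin", "rb") as f:
    for chunk in iter(lambda: f.read(8192), b""):
        md5.update(chunk)

print("match" if remote_md5 is not None and bytes(remote_md5) == md5.digest() else "mismatch")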
I'm trying to use the sample provided by Microsoft to connect to an Azure Storage table using Python. The code below fails because tablestorageaccount is not found. What am I missing? I installed the azure package, but it still complains that the module can't be found.
import azure.common
from azure.storage import CloudStorageAccount
from tablestorageaccount import TableStorageAccount

print('Azure Table Storage samples for Python')

# Create the storage account object and specify its credentials
# to either point to the local Emulator or your Azure subscription
if IS_EMULATED:
    account = TableStorageAccount(is_emulated=True)
else:
    account_connection_string = STORAGE_CONNECTION_STRING
    # Split into key=value pairs removing empties, then split the pairs into a dict
    config = dict(s.split('=', 1) for s in account_connection_string.split(';') if s)

    # Authentication
    account_name = config.get('AccountName')
    account_key = config.get('AccountKey')

    # Basic URL Configuration
    endpoint_suffix = config.get('EndpointSuffix')
    if endpoint_suffix == None:
        table_endpoint = config.get('TableEndpoint')
        table_prefix = '.table.'
        start_index = table_endpoint.find(table_prefix)
        end_index = table_endpoint.endswith(':') and len(table_endpoint) or table_endpoint.rfind(':')
        endpoint_suffix = table_endpoint[start_index + len(table_prefix):end_index]

    account = TableStorageAccount(account_name=account_name, connection_string=account_connection_string, endpoint_suffix=endpoint_suffix)
I found the source sample code, and in it there is a custom module tablestorageaccount.py; it's just used to return a TableService. If you already have the storage connection string and want to run a quick test, you can connect to the table directly.
Sample:
from azure.storage.table import TableService, Entity
account_connection_string = 'DefaultEndpointsProtocol=https;AccountName=account name;AccountKey=account key;EndpointSuffix=core.windows.net'
tableservice=TableService(connection_string=account_connection_string)
You could also refer to the newer SDK to connect to Table storage. Here is the official tutorial: Get started with Azure Table storage.
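For a quick end-to-end check with the TableService approach above, here is a minimal sketch of creating a table and inserting/reading an entity (the table name, keys, and connection string are placeholders):

from azure.storage.table import TableService, Entity

# Placeholder connection string and table name -- substitute your own.
account_connection_string = 'DefaultEndpointsProtocol=https;AccountName=...;AccountKey=...;EndpointSuffix=core.windows.net'
tableservice = TableService(connection_string=account_connection_string)

tableservice.create_table('sampletable')

# Insert a simple entity keyed by PartitionKey/RowKey.
entity = Entity()
entity.PartitionKey = 'demo'
entity.RowKey = '001'
entity.message = 'hello'
tableservice.insert_or_replace_entity('sampletable', entity)

# Read it back.
fetched = tableservice.get_entity('sampletable', 'demo', '001')
print(fetched.message)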
I am currently looking for a way to upload a video to Azure Media Services (AMS v3) via the Python SDKs. I have followed its instructions and am able to connect to AMS successfully.
Example
credentials = AdalAuthentication(
    context.acquire_token_with_client_credentials,
    RESOURCE,
    CLIENT,
    KEY)
client = AzureMediaServices(credentials, SUBSCRIPTION_ID)  # Successful
I can also successfully get the details of all the videos uploaded via its portal:
for data in client.assets.list(RESOUCE_GROUP_NAME, ACCOUNT_NAME).get(0):
    print(f'Asset_name: {data.name}, file_name: {data.description}')

# Asset_name: 4f904060-d15c-4880-8c5a-xxxxxxxx, file_name: 夢想全紀錄.mp4
# Asset_name: 8f2e5e36-d043-4182-9634-xxxxxxxx, file_name: an552Qb_460svvp9.webm
# Asset_name: aef495c1-a3dd-49bb-8e3e-xxxxxxxx, file_name: world_war_2.webm
# Asset_name: b53d8152-6ecd-41a2-a59e-xxxxxxxx, file_name: an552Qb_460svvp9.webm - Media Encoder Standard encoded
However, when I tried to use the following method, it failed, since I have no idea what to pass as parameters (link to the Python SDKs):
create_or_update(resource_group_name, account_name, asset_name,
parameters, custom_headers=None, raw=False, **operation_config)
Therefore, I would like to ask the following questions (everything is done via the Python SDKs):
What kind of parameters does it expect?
Can a video be uploaded directly to AMS, or should it be uploaded to Blob Storage first?
Should an Asset contain only one video, or are multiple files fine?
The documentation for the REST version of that method is at https://learn.microsoft.com/en-us/rest/api/media/assets/createorupdate. This is effectively the same as the Python parameters.
Videos are stored in Azure Storage for Media Services. This is true for input assets, the assets that are encoded, and any streamed content. It all lives in Storage but is accessed through Media Services. You do need to create an asset in Media Services, which creates the Storage container. Once the Storage container exists, you upload via the Storage APIs to that Media Services-created container.
Technically multiple files are fine, but doing that raises a number of issues you may not expect. I'd recommend using 1 input video = 1 Media Services asset. On the encoding output side there will be more than one file in the asset: encoding output contains one or more videos, manifests, and metadata files.
I have found a way to work around this using the Python SDKs and REST; however, I am not quite sure it's the proper approach.
Log in to Azure Media Services and Blob Storage via the Python packages:
import adal
from msrestazure.azure_active_directory import AdalAuthentication
from msrestazure.azure_cloud import AZURE_PUBLIC_CLOUD
from azure.mgmt.media import AzureMediaServices
from azure.mgmt.media.models import MediaService
from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient
Create Assets for the original file and the encoded one by passing these parameters. Example of the original-file Asset creation:
asset_name = 'asset-myvideo'
asset_properties = {
    'properties': {
        'description': 'Original File Description',
        'storageAccountName': "storage-account-name"
    }
}
client.assets.create_or_update(RESOUCE_GROUP_NAME, ACCOUNT_NAME, asset_name, asset_properties)
Upload the video to the Blob Storage container derived from the created original Asset:
current_container = [data.container for data in client.assets.list(RESOUCE_GROUP_NAME, ACCOUNT_NAME).get(0) if data.name == asset_name][0] # Get Blob Storage location
file_name = "myvideo.mp4"
blob_client = blob_service_client.get_blob_client(container=current_container, blob=file_name)
with open('original_video.mp4', 'rb') as data:
    blob_client.upload_blob(data)
print(f'Video uploaded to {current_container}')
And after that, I do Transform, Job, and Streaming Locator to get the video Streaming Link successfully.
I was able to get this to work with the newer Python SDK. The Python documentation is mostly missing, so I constructed this mainly from the Python SDK source code and the C# examples.
azure-storage-blob==12.3.1
azure-mgmt-media==2.1.0
azure-mgmt-resource==9.0.0
adal~=1.2.2
msrestazure~=0.6.3
0) Import a lot of stuff
from azure.mgmt.media.models import (
    Asset, Transform, Job, BuiltInStandardEncoderPreset, TransformOutput,
    JobInputAsset, JobOutputAsset, AssetContainerSas, AssetContainerPermission,
    StreamingLocator, StreamingEndpoint)  # StreamingLocator/StreamingEndpoint are used in the final step
import adal
from msrestazure.azure_active_directory import AdalAuthentication
from msrestazure.azure_cloud import AZURE_PUBLIC_CLOUD
from azure.mgmt.media import AzureMediaServices
from azure.storage.blob import BlobServiceClient, ContainerClient
import datetime as dt
import time
LOGIN_ENDPOINT = AZURE_PUBLIC_CLOUD.endpoints.active_directory
RESOURCE = AZURE_PUBLIC_CLOUD.endpoints.active_directory_resource_id
# AzureSettings is a custom NamedTuple
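AzureSettings itself isn't shown above; here is a minimal sketch of what such a NamedTuple might look like, with the field names inferred from how it is used in the steps below:

from typing import NamedTuple


class AzureSettings(NamedTuple):
    # Field names are inferred from the usages below; adjust to your own configuration.
    AZURE_SUBSCRIPTION_ID: str
    AZURE_MEDIA_TENANT_ID: str
    AZURE_MEDIA_CLIENT_ID: str
    AZURE_MEDIA_SECRET: str
    AZURE_MEDIA_RESOURCE_GROUP_NAME: str
    AZURE_MEDIA_ACCOUNT_NAME: str
    AZURE_MEDIA_STORAGE_CONNECTION_STRING: str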
1) Log in to AMS:
def get_ams_client(settings: AzureSettings) -> AzureMediaServices:
    context = adal.AuthenticationContext(
        LOGIN_ENDPOINT + '/' + settings.AZURE_MEDIA_TENANT_ID)
    credentials = AdalAuthentication(
        context.acquire_token_with_client_credentials,
        RESOURCE,
        settings.AZURE_MEDIA_CLIENT_ID,
        settings.AZURE_MEDIA_SECRET
    )
    return AzureMediaServices(credentials, settings.AZURE_SUBSCRIPTION_ID)
2) Create an input and output asset
input_asset = create_or_update_asset(
    input_asset_name, "My Input Asset", client, azure_settings)
output_asset = create_or_update_asset(
    output_asset_name, "My Output Asset", client, azure_settings)
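create_or_update_asset is a small helper that isn't shown above; a minimal sketch of what such a helper might look like, assuming the assets.create_or_update signature from azure-mgmt-media 2.1.0 (treat the exact Asset parameters as an assumption):

def create_or_update_asset(
        asset_name: str, description: str,
        client: AzureMediaServices, settings: AzureSettings) -> Asset:
    # Create (or update) a Media Services asset; Media Services creates the
    # backing Storage container for the asset when it is created.
    return client.assets.create_or_update(
        resource_group_name=settings.AZURE_MEDIA_RESOURCE_GROUP_NAME,
        account_name=settings.AZURE_MEDIA_ACCOUNT_NAME,
        asset_name=asset_name,
        parameters=Asset(description=description)
    )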
3) Get the container name (most documentation refers to BlockBlobService, which seems to have been removed from the SDK):
def get_container_name(client: AzureMediaServices, asset_name: str, settings: AzureSettings):
    expiry_time = dt.datetime.now(dt.timezone.utc) + dt.timedelta(hours=4)
    container_list: AssetContainerSas = client.assets.list_container_sas(
        resource_group_name=settings.AZURE_MEDIA_RESOURCE_GROUP_NAME,
        account_name=settings.AZURE_MEDIA_ACCOUNT_NAME,
        asset_name=asset_name,
        permissions=AssetContainerPermission.read_write,
        expiry_time=expiry_time
    )
    sas_uri: str = container_list.asset_container_sas_urls[0]
    container_client: ContainerClient = ContainerClient.from_container_url(sas_uri)
    return container_client.container_name
4) Upload a file to the input asset container:
def upload_file_to_asset_container(
        container: str, local_file, uploaded_file_name, settings: AzureSettings):
    blob_service_client = BlobServiceClient.from_connection_string(settings.AZURE_MEDIA_STORAGE_CONNECTION_STRING)
    blob_client = blob_service_client.get_blob_client(container=container, blob=uploaded_file_name)
    with open(local_file, 'rb') as data:
        blob_client.upload_blob(data)
5) Create a transform (in my case, using the adaptive streaming preset):
def get_or_create_transform(
        client: AzureMediaServices,
        transform_name: str,
        settings: AzureSettings):
    transform_output = TransformOutput(preset=BuiltInStandardEncoderPreset(preset_name="AdaptiveStreaming"))
    transform: Transform = client.transforms.create_or_update(
        resource_group_name=settings.AZURE_MEDIA_RESOURCE_GROUP_NAME,
        account_name=settings.AZURE_MEDIA_ACCOUNT_NAME,
        transform_name=transform_name,
        outputs=[transform_output]
    )
    return transform
6) Submit the job:
def submit_job(
        client: AzureMediaServices,
        settings: AzureSettings,
        input_asset: Asset,
        output_asset: Asset,
        transform_name: str,
        correlation_data: dict) -> Job:
    job_input = JobInputAsset(asset_name=input_asset.name)
    job_outputs = [JobOutputAsset(asset_name=output_asset.name)]
    job: Job = client.jobs.create(
        resource_group_name=settings.AZURE_MEDIA_RESOURCE_GROUP_NAME,
        account_name=settings.AZURE_MEDIA_ACCOUNT_NAME,
        job_name=f"test_job_{UNIQUENESS}",  # UNIQUENESS is a unique suffix defined elsewhere
        transform_name=transform_name,
        parameters=Job(input=job_input,
                       outputs=job_outputs,
                       correlation_data=correlation_data)
    )
    return job
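The next step waits for an Event Grid notification; if you aren't wired into Event Grid, a simple polling loop is an alternative. A rough sketch, reusing the imports from step 0 (the poll interval is arbitrary):

def wait_for_job(client: AzureMediaServices, settings: AzureSettings,
                 transform_name: str, job_name: str, poll_seconds: int = 10) -> Job:
    # Poll the job until it reaches a terminal state instead of relying on Event Grid.
    while True:
        job = client.jobs.get(
            resource_group_name=settings.AZURE_MEDIA_RESOURCE_GROUP_NAME,
            account_name=settings.AZURE_MEDIA_ACCOUNT_NAME,
            transform_name=transform_name,
            job_name=job_name
        )
        if job.state in ("Finished", "Error", "Canceled"):
            return job
        time.sleep(poll_seconds)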
7) Then I get the URLs after Event Grid has told me the job is done:
# side-effect warning: this starts the streaming endpoint $$$
def get_urls(client: AzureMediaServices, settings: AzureSettings,
             output_asset_name: str, locator_name: str):
    try:
        locator: StreamingLocator = client.streaming_locators.create(
            resource_group_name=settings.AZURE_MEDIA_RESOURCE_GROUP_NAME,
            account_name=settings.AZURE_MEDIA_ACCOUNT_NAME,
            streaming_locator_name=locator_name,
            parameters=StreamingLocator(
                asset_name=output_asset_name,
                streaming_policy_name="Predefined_ClearStreamingOnly"
            )
        )
    except Exception as ex:
        print("ignoring existing")

    streaming_endpoint: StreamingEndpoint = client.streaming_endpoints.get(
        resource_group_name=settings.AZURE_MEDIA_RESOURCE_GROUP_NAME,
        account_name=settings.AZURE_MEDIA_ACCOUNT_NAME,
        streaming_endpoint_name="default")

    if streaming_endpoint:
        if streaming_endpoint.resource_state != "Running":
            client.streaming_endpoints.start(
                resource_group_name=settings.AZURE_MEDIA_RESOURCE_GROUP_NAME,
                account_name=settings.AZURE_MEDIA_ACCOUNT_NAME,
                streaming_endpoint_name="default"
            )

    paths = client.streaming_locators.list_paths(
        resource_group_name=settings.AZURE_MEDIA_RESOURCE_GROUP_NAME,
        account_name=settings.AZURE_MEDIA_ACCOUNT_NAME,
        streaming_locator_name=locator_name
    )
    return [f"https://{streaming_endpoint.host_name}{path.paths[0]}" for path in paths.streaming_paths]
I have a couple of Security Groups I'd like to attach to an EC2 instance.
I tried the following but failed:
sg_1 = 'sg-something'
sg_2 = 'sg-else'
response = instance.modify_attribute(Groups=sg_1, sg_2)
And something like this:
response = instance.modify_attribute(Groups=[sg_1, sg_2])
And something like this:
for sg in sg_1, sg_2:
response = instance.modify_attribute(Groups=[sg_1, sg_2])
It seems like it can only accept one security group at a time, but when I pass the second one it overwrites the previous one.
Any ideas?
Thanks
This worked fine for me:
import boto3

client = boto3.client('ec2')
response = client.modify_instance_attribute(InstanceId='i-1234', Groups=['sg-1111', 'sg-2222'])
Or using the resource version:
import boto3
ec2 = boto3.resource('ec2')
instance = ec2.Instance('i-1234')
instance.modify_attribute(Groups=['sg-1111','sg-2222'])
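Note that Groups replaces the entire set of security groups, which is why attaching them one at a time appeared to overwrite the previous one. If you want to add a group while keeping the ones already attached, here is a sketch (the instance and group IDs are placeholders):

import boto3

ec2 = boto3.resource('ec2')
instance = ec2.Instance('i-1234')  # placeholder instance ID

# Read the group IDs currently attached, append the new one, and re-apply the full list.
current_group_ids = [sg['GroupId'] for sg in instance.security_groups]
new_group_id = 'sg-3333'  # placeholder group ID
if new_group_id not in current_group_ids:
    instance.modify_attribute(Groups=current_group_ids + [new_group_id])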
I would like to know how to create a signed URL for CloudFront. The current working solution is unsecured, and I would like to switch the system to secure URLs.
I have tried using Boto 2.5.2 and Django 1.4.
Is there a working example of how to use the boto.cloudfront.distribution.create_signed_url method, or any other solution that works?
I have tried the following code using the Boto 2.5.2 API:
def get_signed_url():
    import boto, time, pprint
    from boto import cloudfront
    from boto.cloudfront import distribution

    AWS_ACCESS_KEY_ID = 'YOUR_AWS_ACCESS_KEY_ID'
    AWS_SECRET_ACCESS_KEY = 'YOUR_AWS_SECRET_ACCESS_KEY'
    KEYPAIR_ID = 'YOUR_KEYPAIR_ID'
    KEYPAIR_FILE = 'YOUR_FULL_PATH_TO_FILE.pem'
    CF_DISTRIBUTION_ID = 'E1V7I3IOVHUU02'

    my_connection = boto.cloudfront.CloudFrontConnection(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)
    distros = my_connection.get_all_streaming_distributions()
    oai = my_connection.create_origin_access_identity('my_oai', 'An OAI for testing')
    distribution_config = my_connection.get_streaming_distribution_config(CF_DISTRIBUTION_ID)
    distribution_info = my_connection.get_streaming_distribution_info(CF_DISTRIBUTION_ID)
    my_distro = boto.cloudfront.distribution.Distribution(connection=my_connection, config=distribution_config, domain_name=distribution_info.domain_name, id=CF_DISTRIBUTION_ID, last_modified_time=None, status='Active')

    s3 = boto.connect_s3()
    BUCKET_NAME = "YOUR_S3_BUCKET_NAME"
    bucket = s3.get_bucket(BUCKET_NAME)
    object_name = "FULL_URL_TO_MP4_EXCLUDING_S3_URL_DOMAIN_NAME EG( my/path/video.mp4)"
    key = bucket.get_key(object_name)
    key.add_user_grant("READ", oai.s3_user_id)

    SECS = 8000
    OBJECT_URL = 'FULL_S3_URL_TO_FILE.mp4'
    my_signed_url = my_distro.create_signed_url(OBJECT_URL, KEYPAIR_ID, expire_time=time.time() + SECS, valid_after_time=None, ip_address=None, policy_url=None, private_key_file=KEYPAIR_FILE, private_key_string=KEYPAIR_ID)
Everything seems fine until the call to create_signed_url, which raises an error:
Exception Value: Only specify the private_key_file or the private_key_string not both
Omit the private_key_string:
my_signed_url = my_distro.create_signed_url(OBJECT_URL, KEYPAIR_ID,
expire_time=time.time() + SECS, private_key_file=KEYPAIR_FILE)
That parameter is used to pass the actual contents of the private key file, as a string. The comments in the source explain that only one of private_key_file or private_key_string should be passed.
You can also omit all the kwargs which are set to None, since None is the default.
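Conversely, if you'd rather keep the key material in memory (for example, loaded from a settings store), you can read the .pem contents yourself and pass them via private_key_string instead, omitting private_key_file. A sketch using the same placeholder names as above:

# Read the .pem contents and pass them as a string instead of a file path.
with open(KEYPAIR_FILE) as key_file:
    private_key_string = key_file.read()

my_signed_url = my_distro.create_signed_url(
    OBJECT_URL, KEYPAIR_ID,
    expire_time=time.time() + SECS,
    private_key_string=private_key_string)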