List files on Azure Blob Storage with the Python API

This code tries to list the files in a blob storage container:
#!/usr/bin/env python3
import os
from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient, __version__
from datetime import datetime, timedelta
import azure.cli.core as az

print(f"Azure Blob storage v{__version__} - Python quickstart sample")

account_name = "my_account"
container_name = "my_container"
path_on_datastore = "test/path"

def _create_sas(expire=timedelta(seconds=10)) -> str:
    cli = az.get_default_cli()
    expire_date = datetime.utcnow() + expire
    expiry_string = datetime.strftime(expire_date, "%Y-%m-%dT%H:%M:%SZ")
    cmd = ["storage", "container", "generate-sas", "--name", container_name, "--account-name",
           account_name, "--permissions", "lr", "--expiry", expiry_string, "--auth-mode", "login", "--as-user"]
    if cli.invoke(cmd) != 0:
        raise RuntimeError("Could not receive a SAS token for user {}#{}".format(
            account_name, container_name))
    return cli.result.result

sas = _create_sas()
blob_service_client = BlobServiceClient(
    account_url=f"{account_name}.blob.core.windows.net", container_name=container_name, credential=sas)
container_client = blob_service_client.create_container(container_name)
blob_list = container_client.list_blobs()
for blob in blob_list:
    print("\t" + blob.name)
That code worked fine a few weeks ago, but now we always get this error:
azure.core.exceptions.ClientAuthenticationError: Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.
Does someone know what can be wrong?
PS: using the Azure blob storage package, version 12.3.2.
[Edit]
Because of security concerns we are not allowed to use account keys here.

I'm not entirely sure what is wrong with your code, but it looks like your SAS token is not in the expected format. Have you tested whether the SAS URL works in a browser?
Additionally, your _create_sas function seems to be creating the SAS signature with an Azure CLI command. I don't think you need to do this, because the azure-storage-blob package has methods such as generate_account_sas to generate a SAS signature. This eliminates a lot of complexity, because you don't need to worry about the SAS signature format.
from datetime import datetime, timedelta
from azure.storage.blob import (
    BlobServiceClient,
    generate_account_sas,
    ResourceTypes,
    AccountSasPermissions,
)
from azure.core.exceptions import ResourceExistsError

account_name = "<account name>"
account_url = f"https://{account_name}.blob.core.windows.net"
container_name = "<container name>"

# Create SAS token credential
sas_token = generate_account_sas(
    account_name=account_name,
    account_key="<account key>",
    resource_types=ResourceTypes(container=True),
    permission=AccountSasPermissions(read=True, write=True, list=True),
    expiry=datetime.utcnow() + timedelta(hours=1),
)
This gives the SAS signature read, write, and list permissions on blob containers, with an expiry time of one hour. You can change this to your liking.
We can then create the BlobServiceClient with this SAS signature as a credential, then create the container client to list the blobs.
# Create Blob service client to interact with storage account
# Use SAS token as credential
blob_service_client = BlobServiceClient(account_url=account_url, credential=sas_token)

# First try to create container
try:
    container_client = blob_service_client.create_container(name=container_name)
# If container already exists, fetch the client
except ResourceExistsError:
    container_client = blob_service_client.get_container_client(container=container_name)

# List blobs in container
for blob in container_client.list_blobs():
    print(blob.name)
Note: the above uses azure-storage-blob==12.5.0, which is the latest package at the time of writing. This is not too far ahead of your version, so I would update your code to work with the latest functionality, as also shown in the documentation.
Update
If you are unable to use account keys for security reasons, you can create a service principal and assign it the Storage Blob Data Contributor role on your storage account. This is created as an AAD application, which will have access to your storage account.
To get this setup, you can use this guide from the documentation.
Sample Code
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

token_credential = DefaultAzureCredential()
blob_service_client = BlobServiceClient(
    account_url="https://<my_account_name>.blob.core.windows.net",
    credential=token_credential
)
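With that credential in place, listing the blobs works the same way as before; a minimal sketch reusing the container client pattern from above (container name is a placeholder):

# fetch a client for the existing container and list its blobs
container_client = blob_service_client.get_container_client(container="<container name>")
for blob in container_client.list_blobs():
    print(blob.name)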

It looks like the azure meta-package is deprecated:
Starting with v5.0.0, the 'azure' meta-package is deprecated and cannot be
installed anymore. Please install the service specific packages prefixed by
azure needed for your application.
The complete list of available packages can be found at:
https://aka.ms/azsdk/python/all
A more comprehensive discussion of the rationale for this decision can be found
in the following issue:
https://github.com/Azure/azure-sdk-for-python/issues/10646
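In practice this means depending on the individual packages you need (for example azure-storage-blob and azure-identity) instead of azure, and importing from them directly; a quick sanity check, assuming azure-storage-blob is installed:

# confirm the service-specific package is importable (it replaces the deprecated meta-package)
from azure.storage.blob import __version__ as blob_version
print(blob_version)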

Related

Azure Python SDK: retrieve backup items from recoveryservices (backup)

I was told to move my bash script, which reports on VM backup status and also reports VMs that are not being backed up, to an Azure Automation account. I picked Python since Automation accounts don't support bash and I have written Python scripts before for sysadmin purposes. I am not a Python developer, and I need help navigating the Azure Python SDK classes.
I need to find the "Backup Items" shown in the portal through one of the Python SDK modules, to retrieve the VM information from the backup vault. I've tried azure.mgmt.recoveryservices and azure.mgmt.recoveryservicesbackup. I can get vault information from azure.mgmt.recoveryservices, which I can use to query more information about the vault, hopefully the VM information. My guess is azure.mgmt.recoveryservicesbackup, but I am lost in azure.mgmt.recoveryservicesbackup.jobs.models. Among hundreds of classes, I can't tell which one would give that information.
I'd like to compare the output of the vault backup against the list of VMs, to find out which ones are not being backed up.
I've looked at: https://learn.microsoft.com/en-us/python/api/azure-mgmt-recoveryservicesbackup/azure.mgmt.recoveryservicesbackup.activestamp?view=azure-python, https://azure.github.io/azure-sdk/releases/latest/all/python.html, and https://www.programcreek.com/python/?ClassName=azure.mgmt.recoveryservicesbackup&submit=Search.
Any help would be much appreciated!
Thanks.
Using Python to get the backup of a VM
You can use the below code snippet to get VM backup details.
from azure.common.credentials import ServicePrincipalCredentials
from azure.mgmt.resource import ResourceManagementClient
from azure.mgmt.compute import ComputeManagementClient
from azure.mgmt.network import NetworkManagementClient
import requests

SUBSCRIPTION_ID = '<Your Subscription ID>'
VM_NAME = '<VM Name>'

credentials = ServicePrincipalCredentials(
    client_id='<Client id>',
    secret='<Client Secret>',
    tenant='<Your Tenant Id>'
)

# Creating base URL
BASE_API_URL = "https://management.azure.com/Subscriptions/<Subscription>/resourceGroups/<Resource group name>/providers/Microsoft.RecoveryServices/vaults/your_vault_name/backupProtectedItems?api-version=2019-05-13&"

# Add the required filter to fetch the exact details
customFilter = "$filter=backupManagementType eq 'AzureIaasVM' and itemType eq 'VM' and policyName eq 'DailyPolicy'"

# Combine the base API URL with the custom filter
BASE_URL = BASE_API_URL + customFilter

header = {
    "Authorization": 'Bearer ' + credentials.token["access_token"]
}

response = requests.get(BASE_URL, headers=header)
# here you can handle the response to know the details of backup
print(response.content)
...
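If the call succeeds, the response body is JSON with a value array of protected items; a sketch of pulling out the VM names (the friendlyName and protectionStatus fields are assumptions based on the AzureIaasVM protected item schema):

# parse the REST response; field names assumed from the backupProtectedItems schema
for item in response.json().get("value", []):
    props = item.get("properties", {})
    print(props.get("friendlyName"), props.get("protectionStatus"))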
Refer here for how to achieve this using the Azure CLI.
What I was looking for is backup_protected_items, on the RecoveryServicesBackupClient; here is sample code.
from azure.mgmt.recoveryservicesbackup import RecoveryServicesBackupClient

# backup_client = RecoveryServicesBackupClient(credentials, subscription_id)  # construct with your own credentials
backup_items = backup_client.backup_protected_items.list(resource_group_name='rg_xxx', vault_name=var_vault)
# list() returns an iterable of protected items, so iterate to read each one
for item in backup_items:
    print(item.properties)

Azure BlobServiceClient Error InvalidResourceName

Having a problem uploading a file to an Azure blob storage container, using azure.storage.blob on Python 2.7. (I know I should use a newer Python, but it's part of a big ROS application, hence not so easy to upgrade it all.)
from azure.storage.blob import BlobServiceClient
...
container_name = "operationinput"
self.back_up_root = "~/backup/sql/lp/"
self.back_up_root = os.path.expanduser(self.back_up_root)
file = 'test.sql'
try:
    client = BlobServiceClient.from_connection_string(conn_str=connection_string)
    blob = client.get_blob_client(container='container_name', blob='datafile')
except Exception as err:
    print(str(err))
with open(self.back_up_root + file, "rb") as data:
    blob.upload_blob(data)
I get the following error:
azure.core.exceptions.HttpResponseError: The specifed resource name contains invalid characters.
RequestId:3fcb6c26-101e-007e-596d-1c7d61000000
Time:2022-02-07T21:58:17.1308670Z
ErrorCode:InvalidResourceName
All the posts I have found refer to people using capital letters or similar, but I have:
operationinput
datafile
Both should be within specification.
Any ideas?
We have tried the below sample code to upload files to an Azure blob storage container using a SAS token, and were able to achieve it successfully.
Code sample:
from azure.storage.blob import BlobClient

upload_file_path = "C:\\Users\\Desktop\\filename"
# full blob SAS URL: account endpoint, container, blob name, and SAS token
sas_url = "https://<account>.blob.core.windows.net/<container>/<blob>?<sas-token>"
client = BlobClient.from_blob_url(sas_url)
with open(upload_file_path, 'rb') as data:
    client.upload_blob(data)
print("**file uploaded**")
To generate the SAS URL and connection string, we selected the required options in the portal when creating the SAS token.
For more information, please refer to this Microsoft documentation: Allow or disallow public read access for a storage account

GCP Python Compute Engine - list VM's

I have the following Python3 script:
import os, json
import googleapiclient.discovery
from google.oauth2 import service_account
from google.cloud import storage

storage_client = storage.Client.from_service_account_json('gcp-sa.json')
buckets = list(storage_client.list_buckets())
print(buckets)

compute = googleapiclient.discovery.build('compute', 'v1')

def list_instances(compute, project, zone):
    result = compute.instances().list(project=project, zone=zone).execute()
    return result['items'] if 'items' in result else None

list_instances(compute, "my-project", "my-zone")
Listing only the buckets, without the rest, works fine; that tells me that my service account (which has read access to the whole project) should work. How can I now list VMs? Using the code above, I get
raise exceptions.DefaultCredentialsError(_HELP_MESSAGE)
google.auth.exceptions.DefaultCredentialsError: Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application. For more information, please see https://cloud.google.com/docs/authentication/getting-started
So that tells me that I somehow have to pass the service account JSON. How can I do that?
Thanks!!
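One way to do this (a sketch, assuming the same gcp-sa.json service account file should be reused): load explicit credentials with google.oauth2.service_account and pass them to discovery.build, instead of relying on GOOGLE_APPLICATION_CREDENTIALS:

from google.oauth2 import service_account
import googleapiclient.discovery

# build explicit credentials from the same service-account file used for the storage client
credentials = service_account.Credentials.from_service_account_file('gcp-sa.json')
# pass them to the discovery client so it no longer falls back to default credentials
compute = googleapiclient.discovery.build('compute', 'v1', credentials=credentials)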

How to access Azure storage account from VM using Python and managed identity / SAS credentials

OBJECTIVE
I have an Azure VM set up with a system-assigned managed identity. I want to be able to:
Allow users to access blobs inside storage account using the VM
Ensure the users are not able to access the blobs from outside the VM
Use Python: most of our users are Python literate but not PowerShell literate.
Setup details:
Storage account: sa030802util. Container: testutils. Blob: hello3.txt
Managed identity and roles: the VM has a system-assigned managed identity with the Contributor, Storage Account Contributor, and Storage Blob Data Contributor roles for sa030802util.
METHODS
I have tried four methods to solve this problem.
Partially successful method 1: Python. In Python, I have been able to access the sa030802util storage account using the below code, derived from link, link and link. The problem is that this uses the storage account and keys rather than relying solely on the managed identity for the VM. My fear is that this leaves the possibility that users could extract the storage keys and gain access to the blobs outside the VM.
Pro: in Python. Con: not using the managed identity to authenticate; BlockBlobService can't use MSI to authenticate (yet).
Partially successful method 2: Powershell. In Powershell, I have found two ways to access the blob using the managed identity. The challenge is that neither creates a credential that I can easily substitute into Python, as explained below. This first method is drawn from the Microsoft-taught Pluralsight course on Implementing Managed Identities for Microsoft Azure Resources (link). It uses the Az module.
Pros: uses managed identity, relatively simple. Cons: not in Python. Does not generate a credential that could be used in Python.
Partially successful method 3: Powershell. This method is drawn from link. It uses the VM managed identity to generate a SAS credential and access Azure Storage.
Pros: uses managed identity and generates SAS credential, which is potentially valuable as BlockBlobService in Python can accept a SAS token. Cons: not in Python. Overkill for Powershell itself given method 2 above achieves the same thing with less effort. I trialled it because I wanted to see if I could extract the SAS credential for use in Python.
Unsuccessful method 4: Python and Powershell. I thought I might be able to generate a SAS token in Powershell using method 3, then slot the token in to the BlockBlobService code from method 1. What I have isn't working. I suspect the reason is that the SAS credential was created for the testutils container, and the Python BlockBlobService needs a SAS credential for the sa030802util storage account.
Pro: would allow me to rely on the managed identity of the VM to access Azure Storage. Con: doesn't work!
QUESTIONS
My questions are:
Am I right in thinking it's better to rely on the VM managed identity and/or a SAS credential than on the account keys, if I want to make sure that users can only access the storage account inside the VM?
Is there a way to cobble together code that lets me use Python to access the data? Is method 4 promising or a waste of time?
CODE
Method 1: Python
from azure.mgmt.storage import StorageManagementClient
from azure.mgmt.storage.models import StorageAccountCreateParameters
from msrestazure.azure_active_directory import MSIAuthentication
from azure.mgmt.resource import SubscriptionClient
from azure.storage.blob import BlockBlobService

# find credentials and subscription id
credentials = MSIAuthentication()
subscription_client = SubscriptionClient(credentials)
subscription = next(subscription_client.subscriptions.list())
subscription_id = subscription.subscription_id

# find storage keys
storage_client = StorageManagementClient(credentials, subscription_id)
storage_account = storage_client.storage_accounts.get_properties("<resourcegroup>", "sa030802util")
storage_keys = storage_client.storage_accounts.list_keys("<resourcegroup>", "sa030802util")
storage_keys = {v.key_name: v.value for v in storage_keys.keys}

# create BlockBlobService and, for example, print blobs in container
account_name = "sa030802util"
account_key = storage_keys["key1"]
container_name = "testutils"
block_blob_service = BlockBlobService(account_name=account_name, account_key=account_key)
print("List blobs in container")
generator = block_blob_service.list_blobs(container_name)
for blob in generator:
    print("Blob name: " + blob.name)
The output of this code is:
List blobs in container
Blob name: hello3.txt
Method 2: Powershell
Connect-AzAccount -MSI -Subscription <subscriptionid>
$context = New-AzStorageContext -StorageAccountName sa030802util
Get-AzStorageBlob -Name testutils -Context $context
The output of this code is:
Name BlobType Length ContentType LastModified AccessTier SnapshotTime IsDeleted
---- -------- ------ ----------- ------------ ---------- ------------ ---------
hello3.txt BlockBlob 15 application/octet-stream 2019-08-02 05:45:33Z Hot False
Method 3: Powershell
# to get an access token using the VM's identity and use it to call Azure Resource Manager
$response = Invoke-WebRequest -Uri 'http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https%3A%2F%2Fmanagement.azure.com%2F' -Method GET -Headers @{Metadata="true"}
$content = $response.Content | ConvertFrom-Json
$ArmToken = $content.access_token

# to get SAS credential from Azure Resource Manager to make storage calls
## convert parameters to JSON
$params = @{canonicalizedResource="/blob/sa030802util/testutils"; signedResource="c"; signedPermission="rcwl"; signedProtocol="https"; signedExpiry="2019-08-30T00:00:00Z"}
$jsonParams = $params | ConvertTo-Json

## call storage listServiceSas endpoint to create SAS credential
$sasResponse = Invoke-WebRequest -Uri 'https://management.azure.com/subscriptions/<subscription_id>/resourceGroups/<resourceGroup>/providers/Microsoft.Storage/storageAccounts/sa030802util/listServiceSas/?api-version=2018-02-01' -Method POST -Body $jsonParams -Headers @{Authorization = "Bearer $ArmToken"} -UseBasicParsing

## extract SAS credential from response
$sasContent = $sasResponse.Content | ConvertFrom-Json
$sasCred = $sasContent.serviceSasToken
# as example, list contents of container
$context = New-AzStorageContext -StorageAccountName sa030802util -SasToken $sasCred
Get-AzStorageBlob -Name testutils -Context $context
The output of this code is the same as for Method 2.
Method 4: Python and Powershell
Powershell code
# to get an access token using the VM's identity and use it to call Azure Resource Manager
$response = Invoke-WebRequest -Uri 'http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https%3A%2F%2Fmanagement.azure.com%2F' -Method GET -Headers @{Metadata="true"}
$content = $response.Content | ConvertFrom-Json
$ArmToken = $content.access_token

# to get SAS credential from Azure Resource Manager to make storage calls
## convert parameters to JSON
$params = @{canonicalizedResource="/blob/sa030802util/testutils"; signedResource="c"; signedPermission="rcwl"; signedProtocol="https"; signedExpiry="2019-08-30T00:00:00Z"}
$jsonParams = $params | ConvertTo-Json

## call storage listServiceSas endpoint to create SAS credential
$sasResponse = Invoke-WebRequest -Uri 'https://management.azure.com/subscriptions/<subscription_id>/resourceGroups/<resourceGroup>/providers/Microsoft.Storage/storageAccounts/sa030802util/listServiceSas/?api-version=2018-02-01' -Method POST -Body $jsonParams -Headers @{Authorization = "Bearer $ArmToken"} -UseBasicParsing

## extract SAS credential from response
$sasContent = $sasResponse.Content | ConvertFrom-Json
$sasCred = $sasContent.serviceSasToken

# then export the SAS credential ready to be used in Python
Python code
from azure.storage.blob import BlockBlobService, PublicAccess
import os

# import SAS credential
with open("cred.txt") as f:
    line = f.readline()

# create BlockBlobService
block_blob_service = BlockBlobService(account_name="sa030802util", sas_token=line)

# print content of testutils container
generator = block_blob_service.list_blobs("testutils")
for blob in generator:
    print(blob.name)
The Python code returns the following error:
AzureHttpError: Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature. ErrorCode: AuthenticationFailed
<?xml version="1.0" encoding="utf-8"?><Error><Code>AuthenticationFailed</Code><Message>Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.
RequestId:<subscriptionid>
Time:2019-08-05T05:33:40.0175771Z</Message><AuthenticationErrorDetail>Signature did not match. String to sign used was rcwl
2019-08-30T00:00:00.0000000Z
/blob/sa030802util/testutils
https
2018-03-28
</AuthenticationErrorDetail></Error>
Very interesting post. Unfortunately I'm not a Python expert, but this may help: https://github.com/Azure-Samples/resource-manager-python-manage-resources-with-msi
if I want to make sure that users can only access the storage account inside the VM?
You can achieve this without MSI: https://learn.microsoft.com/en-us/azure/storage/common/storage-network-security
MSI does provide an additional layer of security, and it also somewhat simplifies management, as you don't need to manage keys/SAS tokens, but it's not an absolute requirement and you can build secure designs without it.
Good luck!
In the Azure SDK for Python, create a BlobServiceClient, then use its get_blob_client method to retrieve a BlobClient. Then use download_blob on that client to get at the blob contents.
BlobServiceClient takes a credential argument to which you can pass a managed identity credential.
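For example, with the newer azure-identity and azure-storage-blob packages (ManagedIdentityCredential being the track 2 counterpart of MSIAuthentication), a sketch using the names from the question:

from azure.identity import ManagedIdentityCredential
from azure.storage.blob import BlobServiceClient

# authenticate as the VM's system-assigned managed identity
credential = ManagedIdentityCredential()
service = BlobServiceClient(account_url="https://sa030802util.blob.core.windows.net", credential=credential)
# fetch a client for the blob and download its contents
blob_client = service.get_blob_client(container="testutils", blob="hello3.txt")
print(blob_client.download_blob().readall())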
You can use Azure Key Vault to store the connection string of the storage account as a secret, and retrieve the credentials from there to connect to your desired container.
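A sketch of that approach, assuming the connection string is stored in the vault under a hypothetical secret name storage-connection-string:

from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient
from azure.storage.blob import BlobServiceClient

# fetch the connection string from Key Vault (vault URL and secret name are placeholders)
secret_client = SecretClient(vault_url="https://<vault-name>.vault.azure.net/", credential=DefaultAzureCredential())
conn_str = secret_client.get_secret("storage-connection-string").value
# connect to the storage account without keeping the key in code
blob_service_client = BlobServiceClient.from_connection_string(conn_str)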

Python Azure sdk: How to retrieve secrets from keyvault?

I need to retrieve secrets from keyvault. This is my code so far:
from azure.mgmt.keyvault import KeyVaultManagementClient
from azure.common.credentials import ServicePrincipalCredentials

subscription_id = 'x'

# See above for details on creating different types of AAD credentials
credentials = ServicePrincipalCredentials(
    client_id='x',
    secret='x',
    tenant='x'
)

kv_client = KeyVaultManagementClient(credentials, subscription_id)
for vault in kv_client.vaults.list():
    print(vault)
But I am getting this error:
msrestazure.azure_exceptions.CloudError: Azure Error: AuthorizationFailed
Message: The client 'x' with object id 'x' does not have authorization to perform action 'Microsoft.Resources/subscriptions/resources/read' over scope '/subscriptions/x'.
Now, I am able to access the same key vault with the same credentials using C# code / PowerShell, so there is definitely nothing wrong with authorization. Not sure why it isn't working using the SDK. Please help.
If you are looking to access via a ServicePrincipalCredentials instance, you can just use:
from azure.keyvault import KeyVaultClient, KeyVaultAuthentication
from azure.common.credentials import ServicePrincipalCredentials

def auth_callback(server, resource, scope):
    credentials = ServicePrincipalCredentials(
        client_id='',
        secret='',
        tenant='',
        resource="https://vault.azure.net"
    )
    token = credentials.token
    return token['token_type'], token['access_token']

client = KeyVaultClient(KeyVaultAuthentication(auth_callback))
secret_bundle = client.get_secret("https://vault_url", "secret_id", "")
print(secret_bundle.value)
This assumes that you don't want to pass a version. If you do, you can substitute the last parameter for it.
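For example, pinning a specific version (the version string here is a placeholder):

secret_bundle = client.get_secret("https://vault_url", "secret_id", "<secret_version>")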
I ran your code sample above and it was able to list the key vaults without any issue, hence it is not a code issue.
I assigned the Contributor role to my AD application on the subscription where the key vault is provisioned, and set the Access Policies to allow GET & LIST permissions for Key and Secret operations for the AD application.
The versions of my Azure Python packages, running under the Python 3.6.2 runtime environment:
azure.common (1.1.8)
azure.mgmt.keyvault (0.40.0)
msrestazure (0.4.13)
I recommend you try the Python runtime version and Azure Python package versions verified as working above.
Addendum:
If the above Python runtime environment version and Azure Python package versions also do not work for you, you should probably consider creating a new issue on the Azure SDK for Python GitHub, since the same credentials work with the Azure .NET SDK as well as PowerShell.
You can also get a secret by the name of the secret instead of by ID:
secret_bundle = client.get_secret("<VAULT URL>", "<NAME>", "")
There are some good answers already, but the Azure SDK has since released new packages for working with Key Vault in Python that replace azure-keyvault:
azure-keyvault-certificates (Migration guide)
azure-keyvault-keys (Migration guide)
azure-keyvault-secrets (Migration guide)
azure-identity is also the package that should be used with these for authentication.
Documentation for working with the secrets library can be found on the azure-sdk-for-python GitHub repository, and here's a sample for retrieving secrets as you were doing:
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

credential = DefaultAzureCredential()
secret_client = SecretClient(
    vault_url="https://my-key-vault.vault.azure.net/",
    credential=credential
)
secret = secret_client.get_secret("secret-name")
You can provide the same credentials that you used for ServicePrincipalCredentials by setting environment variables corresponding to the client_id, secret, and tenant:
export AZURE_CLIENT_ID="client_id"
export AZURE_CLIENT_SECRET="secret"
export AZURE_TENANT_ID="tenant"
(I work on the Azure SDK in Python)
One can use the ClientSecretCredential class from azure.identity; find a code snippet below:
from azure.identity import ClientSecretCredential
from azure.keyvault.secrets import SecretClient

TENANT = "<TenantId-in-string>"
CLIENT_ID = "<ClientId-in-string>"
CLIENT_SECRET = "<ClientSecret-in-string>"
credential = ClientSecretCredential(TENANT, CLIENT_ID, CLIENT_SECRET)

VAULT_URL = "<AzureVault-url-in-string>"
client = SecretClient(vault_url=VAULT_URL, credential=credential)
print(client)

example_secret = client.get_secret("<secret_name_in_string>")
print(example_secret.value)
