Here is my problem: I am trying to create a linked service using the Python SDK, and I was successful when I provided the storage account name and key. But I would like to create the linked service with a Key Vault reference. The code below runs fine and creates the linked service; however, when I go to Data Factory and test the connection, it fails. Please help!
store = LinkedServiceReference(reference_name ='LS_keyVault_Dev')
storage_string = AzureKeyVaultSecretReference( store=store, secret_name = 'access_key')
ls_azure_storage = AzureStorageLinkedService(connection_string=storage_string)
ls = adf_client.linked_services.create_or_update(rg_name, df_name, ls_name, ls_azure_storage)
Error Message
Invalid storage connection string provided to 'AzureTableConnection'. Check the storage connection string in configuration. No valid combination of account information found.
I tested your code: it created the linked service successfully, and when I navigated to the portal to Test connection, it also worked. You can follow the steps below.
1.Navigate to the Azure Key Vault in the portal -> Secrets -> Create a secret. I'm not sure how you were able to use access_key as the name of the secret; per my test it is invalid (secret names may only contain alphanumeric characters and dashes). So in my sample I use accesskey as the name of the secret, then store the connection string of the storage account as its value (its format is shown after step 5 below).
2.Navigate to the Access policies of the key vault and add the MSI of your data factory with the correct secret permissions. If you have not enabled the MSI of the data factory, follow this link to generate it; it is used by the Azure Key Vault linked service to access your key vault secret.
3.Navigate to the Azure Key Vault linked service of your data factory and make sure the connection is successful.
4.Use the code below to create the storage linked service.
Version of the libraries:
azure-common==1.1.23
azure-mgmt-datafactory==0.9.0
from azure.common.credentials import ServicePrincipalCredentials
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import *
subscription_id = '<subscription-id>'
credentials = ServicePrincipalCredentials(client_id='<client-id>', secret='<client-secret>', tenant='<tenant-id>')
adf_client = DataFactoryManagementClient(credentials, subscription_id)
rg_name = '<resource-group-name>'
df_name = 'joyfactory'
ls_name = 'storageLinkedService'
store = LinkedServiceReference(reference_name ='AzureKeyVault1') # AzureKeyVault1 is the name of the Azure Key Vault linked service
storage_string = AzureKeyVaultSecretReference( store=store, secret_name = 'accesskey')
ls_azure_storage = AzureStorageLinkedService(connection_string=storage_string)
ls = adf_client.linked_services.create_or_update(rg_name, df_name, ls_name, ls_azure_storage)
print(ls)
5.Go back to the linked service page, refresh and test the connection, it works fine.
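A side note on step 1: the value stored in the accesskey secret should be the full connection string of the storage account, not just the account key. Its general shape (account name and key are placeholders) is:
DefaultEndpointsProtocol=https;AccountName=<account-name>;AccountKey=<account-key>;EndpointSuffix=core.windows.net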
I am new to Azure and the Azure Python SDK, and I would like to ask several questions. How do I use the Python SDK to do the following?
Given a VM, how do I get all the attached disks and their complete information?
Then how do I get backup history of a disk? How do I know what was the latest backup job executed?
Please explain clearly, with references if possible. Any help will be appreciated.
The code below was suggested by @Shui shengbao here, to list the disks inside the resource group:
from azure.common.credentials import ServicePrincipalCredentials
from azure.mgmt.compute import ComputeManagementClient
from azure.mgmt.resource import ResourceManagementClient, SubscriptionClient
# Tenant ID for your Azure Subscription
TENANT_ID = ''
# Your Service Principal App ID
CLIENT = ''
# Your Service Principal Password
KEY = ''
credentials = ServicePrincipalCredentials(
    client_id=CLIENT,
    secret=KEY,
    tenant=TENANT_ID
)
subscription_id = ''
compute_client = ComputeManagementClient(credentials, subscription_id)
rg = 'shuilinux'
disks = compute_client.disks.list_by_resource_group(rg)
for disk in disks:
    print(disk)
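To scope this to a single VM's attached disks rather than a whole resource group, here is a minimal sketch along the same lines (the VM name is a placeholder):
# Reuses compute_client and rg from the snippet above; 'myvm' is a placeholder VM name
vm = compute_client.virtual_machines.get(rg, 'myvm')
# The storage profile holds the OS disk and all attached data disks
print(vm.storage_profile.os_disk.name)
for data_disk in vm.storage_profile.data_disks:
    print(data_disk.name, data_disk.disk_size_gb, data_disk.lun)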
Also refer to this thread to fetch the backup details of an Azure VM using the Python SDK.
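For the backup-history part specifically, here is a rough sketch using the azure-mgmt-recoveryservicesbackup package; it assumes the VM is protected by a Recovery Services vault in the same resource group, and the vault name is a placeholder:
from azure.mgmt.recoveryservicesbackup import RecoveryServicesBackupClient
# Reuses credentials and subscription_id from the snippet above
backup_client = RecoveryServicesBackupClient(credentials, subscription_id)
vault_name = 'myvault'  # placeholder Recovery Services vault name
for job in backup_client.backup_jobs.list(vault_name, rg):
    # Each job record carries the operation type, status, and start/end times
    print(job.name, job.properties.operation, job.properties.status, job.properties.start_time)
The latest backup job for a VM would be the one with the most recent start_time among jobs whose entity_friendly_name matches that VM.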
I am trying to write a file from an Azure Synapse Notebook to ADLS Gen2 while authenticating with the account key.
When I use python and the DataLakeServiceClient, I can authenticate via key and write a file without a problem. If I try to authenticate with the same key for Spark, I get java.nio.file.AccessDeniedException: Operation failed: "This request is not authorized to perform this operation using this permission.", 403, PUT,.
With PySpark and authorization with the account key [NOT WORKING]:
myaccountname = ""
account_key = ""
spark.conf.set(f"fs.azure.account.key.{myaccountname}.dfs.core.windows.net", account_key)
dest_container = "container_name"
dest_storage_name = "storage_name"
destination_storage = f"abfss://{dest_container}@{dest_storage_name}.dfs.core.windows.net"
df.write.mode("append").parquet(destination_storage + "/raw/myfile.parquet")
But I can write a file with Python and the DataLakeServiceClient and also authorization with the account key [WORKING]:
from azure.storage.filedatalake import DataLakeServiceClient
# DAP ADLS configurations
storage_name = ""
account_key = ""
container_name = ""
directory_name = ""  # added: was undefined in the original snippet
file_name = ""       # added: was undefined in the original snippet
file_content = b""   # added: was undefined in the original snippet
service_client = DataLakeServiceClient(account_url=f"https://{storage_name}.dfs.core.windows.net", credential=account_key)
file_system_client = service_client.get_file_system_client(container_name)
dir_client = file_system_client.get_directory_client(directory_name)
dir_client.create_directory()
file_client = dir_client.get_file_client(file_name)
file_client.create_file()
file_client.append_data(file_content, offset=0, length=len(file_content))
file_client.flush_data(len(file_content))
What am I doing wrong? I was under the impression that setting the account key via spark.conf.set was enough?
I finally solved it by using a LinkedService. In the LinkedService I used the AccountKey (retrieved from a KeyVault).
For some reason, direct authentication with the account key in the code did not work in the Synapse notebook, despite the user having all required permissions.
UPDATE: According to Microsoft's third level tech support, authentication with an account key from within Synapse is not possible (!!!) You HAVE to use their LinkedServices.
If anyone else needs to authenticate:
linkedServiceName_var = "my_linked_service_name"
spark.conf.set("fs.azure.account.auth.type", "SAS")
spark.conf.set("fs.azure.sas.token.provider.type", "com.microsoft.azure.synapse.tokenlibrary.LinkedServiceBasedSASProvider")
spark.conf.set("spark.storage.synapse.linkedServiceName", linkedServiceName_var)
raw_container_name = "my_container"
raw_storageaccount_name = "my_storage_account"
filepath = "raw/myfile.parquet"  # placeholder; this was not defined in the original snippet
CONNECTION_STR = f"abfs://{raw_container_name}@{raw_storageaccount_name}.dfs.core.windows.net"
my_df = spark.read.parquet(CONNECTION_STR + "/" + filepath)
Update:
Can you double-check that you or the user running this has ADLS Gen2 access and the right permissions (Contributor role on the subscription, Storage Blob Data Owner at the storage account level, or Storage Blob Data Contributor role granted to the service principal in the scope of the Data Lake Storage Gen2 storage account), depending on your setup?
Make sure you have a valid account key copied from the Azure portal.
Just in case....
To enable other users to use the storage account after you create your workspace, you will have to perform the tasks below:
Assign other users to the Contributor role on workspace
Assign other users to a Workspace, SQL, or Spark admin role using Synapse Studio
Assign yourself and other users to the Storage Blob Data Contributor role on the storage account
Also, if you are using MSI for the Synapse workspace, make sure that you as a user have the same permission level in the notebook.
Going through the official MS docs on Azure Synapse connecting to an Azure storage account:
In case you have set up an account key and secret for the storage account, you can set forwardSparkAzureStorageCredentials to true, in which case the Azure Synapse connector automatically discovers the account access key set in the notebook session configuration or the global Hadoop configuration and forwards the storage account access key to the connected Azure Synapse instance by creating a temporary Azure database scoped credential.
Just add this option to your df.write call:
.option("forwardSparkAzureStorageCredentials", "true")
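For context, here is how that option would slot into the write from the question (same df and destination_storage as above; note this option is documented for the Azure Synapse connector, so treat this as a sketch rather than a guaranteed fix for a plain Parquet write):
df.write \
    .mode("append") \
    .option("forwardSparkAzureStorageCredentials", "true") \
    .parquet(destination_storage + "/raw/myfile.parquet")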
I'm using the Azure Python SDK to deploy an Azure VM. I can create a VM with a Network Security Group without any issue via the Azure portal. However, I failed to create a Network Security Group by using the API like:
async_nsg_create = network_client.network_security_groups.begin_create_or_update(
    GROUP_NAME,
    NSG_NAME,
    nsg_parameters
)
It always complains that the client "does not have authorization to perform action 'Microsoft.Network/networkSecurityGroups/write'".
However, I can create a Network Security Group via the Azure portal by clicking "create a resource" or adding a new resource in the resource group. I suspect I may have to create the NSG via ResourceManagementClient, but I couldn't find any useful info in the API doc: https://learn.microsoft.com/en-us/python/api/azure-mgmt-resource/azure.mgmt.resource.resourcemanagementclient?view=azure-python#models-api-version--2020-06-01--
I checked the solution in this issue, but failed at the step resource_client.providers.register('Microsoft.Compute'), which complains: "does not have authorization to perform action 'Microsoft.Compute/register/action'".
The error means your client does not have permission to do these operations; you need to assign it an RBAC role in your resource group/subscription.
However, I can create a Network Security Group via the Azure portal by clicking "create a resource" or add new source in Resource Group.
In the portal, you are using the account you logged in with; the code here uses the credentials of the service principal, which is different.
Here is a complete sample that works for me; follow the steps below.
1.Register an application with Azure AD and create a service principal.
2.Get values for signing in and create a new application secret.
3.Navigate to the resource group or the subscription -> Access control (IAM) -> Add -> add the service principal of the AD app to an RBAC role, e.g. Contributor; for details, follow this.
4.Then use the code below.
from azure.identity import ClientSecretCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.v2020_06_01.models import NetworkSecurityGroup
from azure.mgmt.network.v2020_06_01.models import SecurityRule
tenant_id = "<tenant-id>"
client_id = "<client-id>"
client_secret = "<client-secret>"
subscription_id = "<subscription-id>"
credential = ClientSecretCredential(tenant_id, client_id, client_secret)
network_client = NetworkManagementClient(credential, subscription_id)
resource_group_name = "<group-name>"
nsg_name = "testnsg"
nsg_params = NetworkSecurityGroup(id="testnsg", location="UK South", tags={"name": "testnsg"})
nsg = network_client.network_security_groups.begin_create_or_update(resource_group_name, nsg_name, parameters=nsg_params)
print(nsg.result().as_dict())
5.Check the result in the portal.
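As a side note, since SecurityRule is already imported, security rules can be attached when the NSG is created; a minimal sketch with placeholder rule values:
# Hypothetical inbound SSH rule; every value here is a placeholder to adjust
ssh_rule = SecurityRule(name="allow-ssh", protocol="Tcp", access="Allow", direction="Inbound",
    priority=100, source_address_prefix="*", source_port_range="*",
    destination_address_prefix="*", destination_port_range="22")
nsg_params = NetworkSecurityGroup(location="UK South", security_rules=[ssh_rule])
nsg = network_client.network_security_groups.begin_create_or_update(resource_group_name, nsg_name, parameters=nsg_params)
print(nsg.result().as_dict())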
Update:
If you want to use the user account, you just need to use AzureCliCredential.
1.Install the Azure CLI, then login your account with az login in a local terminal, e.g. powershell.
2.After login, change the code like below and run it.
from azure.identity import AzureCliCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.v2020_06_01.models import NetworkSecurityGroup
from azure.mgmt.network.v2020_06_01.models import SecurityRule
subscription_id = "<subscription-id>"
credential = AzureCliCredential()
network_client = NetworkManagementClient(credential, subscription_id)
resource_group_name = "<group-name>"
nsg_name = "testnsg"
nsg_params = NetworkSecurityGroup(id="testnsg", location="UK South", tags={"name": "testnsg"})
nsg = network_client.network_security_groups.begin_create_or_update(resource_group_name, nsg_name, parameters=nsg_params)
print(nsg.result().as_dict())
For a Python code base I would like to have developers accessing application secrets using Azure Key Vault, with the idea that when we deploy, the application also should be able to connect. Hence, I'm thinking Active Directory.
However, I cannot find any examples on the interweb that show this with the Python SDK. Initially, I would think to retrieve the CLI user:
from azure.common.credentials import get_azure_cli_credentials
credentials, subscription_id, tenant_id = get_azure_cli_credentials(with_tenant=True)
and then use this retrieved set of credentials to access the key vault:
from azure.keyvault import KeyVaultClient
vault_url = "https://########.vault.azure.net/"
secret_name = "########"
secret_version = "########"
client = KeyVaultClient(credentials)
secret = client.get_secret(vault_url, secret_name, secret_version)
print(secret)
However, I get an error:
azure.keyvault.v7_0.models.key_vault_error_py3.KeyVaultErrorException: Operation returned an invalid status code 'Unauthorized'
I can confirm that credentials, subscription_id, and tenant_id are correct, and that using the CLI I can successfully retrieve the secret content. So it must be something Python SDK-specific.
Any ideas?
It looks like this is a bug in the Python SDK.
https://github.com/Azure/azure-sdk-for-python/issues/5096
You can use your own AD username and password with the UserPassCredentials class. It's not the logged-in user, but it's probably as close as you'll get for now.
E.g.:
from azure.common.credentials import UserPassCredentials
credentials = UserPassCredentials('username','password')
client = KeyVaultClient(credentials)
secret = client.get_secret(vault_url, secret_name, secret_version)
print(secret)
I tried the same thing and got a different error ("...audience is invalid...") until I changed your first function call, adding the resource parameter:
credentials, subscription_id, tenant_id = get_azure_cli_credentials(resource='https://vault.azure.net', with_tenant=True)
With this change I was able to access secrets using the same code you show.
What about this code snippet? Comparing your code to the example, I don't see where you're setting the client_id or the tenant.
You’ll want to set the access policy for the key vault to allow the authenticated user to access secrets. This can be done in the portal. Bear in mind that key vault has an upper limit of 16 access definitions, so you’ll probably want to grant access to a group and add your users to that group.
As @8forty pointed out, adding a resource='https://vault.azure.net' parameter to your get_azure_cli_credentials call will resolve the issue.
However, there are new packages for working with Key Vault in Python that replace azure-keyvault:
azure-keyvault-certificates (Migration guide)
azure-keyvault-keys (Migration guide)
azure-keyvault-secrets (Migration guide)
azure-identity is also the package that should be used with these for authentication.
If you want to authenticate your Key Vault client with the credentials of the logged in CLI user, you can use the AzureCliCredential class:
from azure.identity import AzureCliCredential
from azure.keyvault.secrets import SecretClient
credential = AzureCliCredential()
vault_url = "https://{vault-name}.vault.azure.net"
secret_name = "secret-name"
client = SecretClient(vault_url, credential)
secret = client.get_secret(secret_name)
print(secret.value)
(I work on the Azure SDK in Python)
I'm trying to use Apache Libcloud, and reading the documentation on how to use it with Amazon EC2, I'm stuck on a step at the beginning.
On this step:
from libcloud.compute.types import Provider
from libcloud.compute.providers import get_driver
cls = get_driver(Provider.EC2)
driver = cls('temporary access key', 'temporary secret key',
token='temporary session token', region="us-west-1")
This step requires you to pass temporary access data, and the documentation tells you to read the Amazon docs; but even having read them, it is not clear to me what I have to do to get my temporary credentials.
The docs say you can interact with the AWS STS API to connect to the endpoint, but I don't understand how you get the credentials. Moreover, the example on the Libcloud site uses personal credentials:
ACCESS_ID = 'your access id'
SECRET_KEY = 'your secret key'
So I'm a bit lost. How can I get my temporary credentials to use in my code?
Thanks and regards.
If this code does not run on an EC2 instance, I suggest you go with static credentials:
ACCESS_ID = 'your access id'
SECRET_KEY = 'your secret key'
cls = get_driver(Provider.EC2)
driver = cls(ACCESS_ID, SECRET_KEY, region="us-west-1")
To create access credentials:
Sign in to the Identity and Access Management (IAM) console at https://console.aws.amazon.com/iam/.
In the navigation pane, choose Users.
Choose the name of the desired user, and then choose the Security Credentials tab.
If needed, expand the Access Keys section and do any of the following:
Choose Create Access Key and then choose Download Credentials to save the access key ID and secret access key to a CSV file on your computer. Store the file in a secure location. You will not have access to the secret access key again after this dialog box closes. After you have downloaded the CSV file, choose Close.
If you want to run your code from an EC2 machine, you can get temporary credentials by assuming an IAM role using the AWS SDK for Python (https://boto3.readthedocs.io/en/latest/guide/quickstart.html), calling assume_role() on the STS service (https://boto3.readthedocs.io/en/latest/reference/services/sts.html).
@Aker666, from what I have found on the web, you're still expected to use the regular AWS API to obtain this information.
The basic snippet that works for me is:
import boto3
from libcloud.compute.types import Provider
from libcloud.compute.providers import get_driver
boto3.setup_default_session(aws_access_key_id='somekey',aws_secret_access_key='somesecret',region_name="eu-west-1")
sts_client = boto3.client('sts')
assumed_role_object = sts_client.assume_role(
    RoleArn="arn:aws:iam::701********:role/iTerm_RO_from_TGT",
    RoleSessionName='update-cloud-hosts.aviadraviv@Aviads-MacBook-Pro.local'
)
cls = get_driver(Provider.EC2)
driver = cls(assumed_role_object['Credentials']['AccessKeyId'], assumed_role_object['Credentials']['SecretAccessKey'],
token=assumed_role_object['Credentials']['SessionToken'], region="eu-west-1")
nodes = driver.list_nodes()
print(nodes)
Hope this helps anyone.