Read Azure KeyVault Secret from Function App - python

This Python script is deployed to an Azure Function App on a Linux Consumption plan and is expected to read secrets from Azure Key Vault.
Apart from the code deployment, the following configurations are made:
1.) System-assigned managed identity is enabled for the Azure Function App.
2.) The Azure Key Vault's role assignments reference this Function App with the Reader role.
Here is the script from __init__.py:
import logging

import azure.functions as func
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

def main(req: func.HttpRequest) -> func.HttpResponse:
    logging.info('Python HTTP trigger function processed a request.')
    # Get url and filename from Postman by using the POST method
    # identity = ManagedIdentityCredential()
    credentials = DefaultAzureCredential()
    secretClient = SecretClient(vault_url="https://kvkkpbedpdev.vault.azure.net/", credential=credentials)
    secret = secretClient.get_secret(name='st-cs-kkpb-edp-dev')
This function app requires the following libraries, defined in the requirements.txt file:
azure-functions
azure-keyvault-secrets
azure-identity
The function runs and ends with the following exception:
warn: Function.Tide_GetFiles.User[0]
python | SharedTokenCacheCredential.get_token failed: SharedTokenCacheCredential authentication unavailable. No accounts were found in the cache.
python | Traceback (most recent call last):
python | File "/usr/local/lib/python3.8/site-packages/azure/identity/_internal/decorators.py", line 27, in wrapper
python | token = fn(*args, **kwargs)
python | File "/usr/local/lib/python3.8/site-packages/azure/identity/_credentials/shared_cache.py", line 88, in get_token
python | account = self._get_account(self._username, self._tenant_id)
python | File "/usr/local/lib/python3.8/site-packages/azure/identity/_internal/decorators.py", line 45, in wrapper
python | return fn(*args, **kwargs)
python | File "/usr/local/lib/python3.8/site-packages/azure/identity/_internal/shared_token_cache.py", line 166, in _get_account
python | raise CredentialUnavailableError(message=NO_ACCOUNTS)
python | azure.identity._exceptions.CredentialUnavailableError: SharedTokenCacheCredential authentication unavailable. No accounts were found in the cache.
python | info: Function.Tide_GetFiles.User[0]
python | DefaultAzureCredential - SharedTokenCacheCredential is unavailab
and the error:
fail: Function.Tide_GetFiles[3]
python | Executed 'Functions.Tide_GetFiles' (Failed, Id=9d514a1f-aeae-4625-9379-b2f0bc89f38f, Duration=1673ms)
python | Microsoft.Azure.WebJobs.Host.FunctionInvocationException: Exception while executing function: Functions.Tide_GetFiles
python | ---> Microsoft.Azure.WebJobs.Script.Workers.Rpc.RpcException: Result: Failure
python | Exception: ClientAuthenticationError: DefaultAzureCredential failed to retrieve a token from the included credentials.
python | Attempted credentials:
python | EnvironmentCredential: EnvironmentCredential authentication unavailable. Environment variables are not fully configured.
python | ManagedIdentityCredential: ManagedIdentityCredential authentication unavailable, no managed identity endpoint found.
python | SharedTokenCacheCredential: SharedTokenCacheCredential authentication unavailable. No accounts were found in the cache.
How can I figure this out?

From the error, it seems the managed identity is not applied to your Function App correctly. You should be able to verify that by going to the Identity blade of the Function App.
Additionally, you should add the required access policy (separate from role assignment in access control) (secret get here) to allow the identity (same name as the app) to access the Key Vault, if you are not using the new preview access control. Refer to How to set and get secrets from Azure Key Vault with Azure Managed Identities and Python.
Using the Azure Portal, go to the Key Vault's access policies and grant the required access to the Key Vault:
Search for your Key Vault in the "Search Resources" dialog box in the Azure Portal.
Select "Overview", and click on "Access policies".
Click on "Add Access Policy", and select the required permissions.
Click on "Select Principal", and add your account.
Save the access policies.
You can also create an Azure service principal through the Azure CLI, PowerShell or the portal and grant it the same access.
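Once the identity and Key Vault permissions are in place, a minimal sketch like the one below (reusing the vault URL and secret name from the question) can confirm the managed identity path on its own; forcing ManagedIdentityCredential keeps DefaultAzureCredential's other probes from masking the real error.

import logging

from azure.identity import ManagedIdentityCredential
from azure.keyvault.secrets import SecretClient

# Sketch: verify that the Function App's system-assigned identity can read the
# secret. ManagedIdentityCredential only works inside Azure, where the managed
# identity endpoint is available.
credential = ManagedIdentityCredential()
client = SecretClient(vault_url="https://kvkkpbedpdev.vault.azure.net/", credential=credential)
secret = client.get_secret("st-cs-kkpb-edp-dev")
logging.info("Retrieved secret %s", secret.name)  # log the name, not the value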

Related

Flask web app on Cloud Run - google.auth.exceptions.DefaultCredentialsError:

I'm hosting a Flask web app on Cloud Run. I'm also using Secret Manager to store Service Account keys. (I previously downloaded a JSON file with the keys)
In my code, I'm accessing the payload and then using os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = payload to authenticate. When I deploy the app and try to visit the page, I get an Internal Server Error. Reviewing the logs, I see:
File "/usr/local/lib/python3.10/site-packages/google/auth/_default.py", line 121, in load_credentials_from_file
raise exceptions.DefaultCredentialsError(
google.auth.exceptions.DefaultCredentialsError: File {"
I can access the secret through gcloud just fine with: gcloud secrets versions access 1 --secret="<secret_id>" while acting as the Service Account.
Here is my Python code:
import os

from flask import Flask
from google.cloud import secretmanager

app = Flask(__name__)

# Grabbing keys from Secret Manager
def access_secret_version():
    # Create the Secret Manager client.
    client = secretmanager.SecretManagerServiceClient()
    # Build the resource name of the secret version.
    name = f"projects/{project_id}/secrets/{secret_id}/versions/1"
    # Access the secret version.
    response = client.access_secret_version(request={"name": name})
    payload = response.payload.data.decode("UTF-8")
    return payload

@app.route('/page/page_two')
def some_random_func():
    # New way
    payload = access_secret_version()  # <---- calling the payload
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = payload
    # Old way
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "service-account-keys.json"
I'm not technically accessing a JSON file like I was before; the payload variable is storing the entire key. Is this why it's not working?
Your approach is incorrect.
When you run on a Google compute service like Cloud Run, the code runs under the identity of the compute service.
In this case, by default, Cloud Run uses the Compute Engine default service account, but it's good practice to create a service account for your service and specify it when you deploy to Cloud Run (see Service accounts).
This mechanism is one of the "legs" of Application Default Credentials. When your code is running on Google Cloud, you don't specify the environment variable (you also don't need to create a key); the Cloud Run service acquires the credentials from the metadata service:
import google.auth
credentials, project_id = google.auth.default()
See google.auth package
It is bad practice to define or set an environment variable within code. By their nature, environment variables should be provided by the environment. Doing this with GOOGLE_APPLICATION_CREDENTIALS means that your code always sets this value when it should only do so when the code is running off Google Cloud.
For completeness, if you need to create credentials from a JSON string rather than from a file containing a JSON string, you can use from_service_account_info (see google.oauth2.service_account).
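A minimal sketch of that approach (assuming the payload returned by access_secret_version is the full service account key JSON):

import json

from google.oauth2 import service_account

# Sketch: build credentials directly from the JSON string fetched from
# Secret Manager instead of writing it to disk or an environment variable.
payload = access_secret_version()  # JSON string of the key (assumed)
info = json.loads(payload)
credentials = service_account.Credentials.from_service_account_info(info)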

Using Secrets Manager to authenticate for Google API

I'm running a Flask app that will access BigQuery on behalf of users, using a service account they upload.
To store those service account credentials, I thought the following might be a good set up:
ENV Var: Stores my credentials for accessing google secrets manager
Secret & secret version: in google secrets manager for each user of the application. This will access the user's own bigquery instance on behalf of the user.
--
I'm still learning about secrets, but this seemed more appropriate than any way of storing credentials in my own database?
--
The google function for accessing secrets is:
from google.cloud import secretmanager

def access_secret_version(secret_id, version_id=version_id):
    # Create the Secret Manager client.
    client = secretmanager.SecretManagerServiceClient()
    # Build the resource name of the secret version.
    name = f"projects/{project_id}/secrets/{secret_id}/versions/{version_id}"
    # Access the secret version.
    response = client.access_secret_version(name=name)
    # Return the decoded payload.
    return response.payload.data.decode('UTF-8')
However, this returns the JSON as a string. When I then use it for BigQuery:
credentials = access_secret_version(secret_id, version_id=version_id)
BigQuery_client = bigquery.Client(credentials=json.dumps(credentials),
                                  project=project_id)
I get the error:
File "/Users/Desktop/application_name/venv/lib/python3.8/site-
packages/google/cloud/client/__init__.py", line 167, in __init__
raise ValueError(_GOOGLE_AUTH_CREDENTIALS_HELP)
ValueError: This library only supports credentials from google-auth-library-python.
See https://google-auth.readthedocs.io/en/latest/ for help on authentication with
this library.
Locally I'm storing the credentials and accessing them via an environment variable. But as I intend for this application to have multiple users from different organisations, I don't think that scales.
I think my question boils down to two pieces:
Is this a sensible method for storing and accessing credentials?
Can you authenticate to BigQuery using a string rather than the .json file indicated here?
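On the second point, a sketch of the string-based approach (assuming the secret payload is the full service account key JSON, and reusing the names from the snippets above):

import json

from google.cloud import bigquery
from google.oauth2 import service_account

# Sketch: turn the JSON string from Secret Manager into google-auth credentials
# and hand them to the BigQuery client (passing json.dumps of the string, as in
# the snippet above, is what triggers the ValueError).
payload = access_secret_version(secret_id, version_id=version_id)
credentials = service_account.Credentials.from_service_account_info(json.loads(payload))
BigQuery_client = bigquery.Client(credentials=credentials, project=project_id)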

AWS NoCredentials in training

I am attempting to run the example code for Amazon Sagemaker on a local GPU. I have copied the code from the Jupyter notebook to the following Python script:
import boto3
import subprocess
import sagemaker
from sagemaker.mxnet import MXNet
from mxnet import gluon
from sagemaker import get_execution_role
import os

sagemaker_session = sagemaker.Session()

instance_type = 'local'
if subprocess.call('nvidia-smi') == 0:
    # Set type to GPU if one is present
    instance_type = 'local_gpu'

# role = get_execution_role()

gluon.data.vision.MNIST('./data/train', train=True)
gluon.data.vision.MNIST('./data/test', train=False)

# successfully connects and uploads data
inputs = sagemaker_session.upload_data(path='data', key_prefix='data/mnist')

hyperparameters = {
    'batch_size': 100,
    'epochs': 20,
    'learning_rate': 0.1,
    'momentum': 0.9,
    'log_interval': 100
}

m = MXNet("mnist.py",
          role=role,
          train_instance_count=1,
          train_instance_type=instance_type,
          framework_version="1.1.0",
          hyperparameters=hyperparameters)

# fails in Docker container
m.fit(inputs)

predictor = m.deploy(initial_instance_count=1, instance_type=instance_type)

m.delete_endpoint()
where the referenced mnist.py file is exactly as specified on GitHub. The script fails on m.fit in the Docker container with the following error:
algo-1-1DUU4_1 | Downloading s3://<S3-BUCKET>/sagemaker-mxnet-2018-10-07-00-47-10-435/source/sourcedir.tar.gz to /tmp/script.tar.gz
algo-1-1DUU4_1 | 2018-10-07 00:47:29,219 ERROR - container_support.training - uncaught exception during training: Unable to locate credentials
algo-1-1DUU4_1 | Traceback (most recent call last):
algo-1-1DUU4_1 | File "/usr/local/lib/python2.7/dist-packages/container_support/training.py", line 36, in start
algo-1-1DUU4_1 | fw.train()
algo-1-1DUU4_1 | File "/usr/local/lib/python2.7/dist-packages/mxnet_container/train.py", line 169, in train
algo-1-1DUU4_1 | mxnet_env.download_user_module()
algo-1-1DUU4_1 | File "/usr/local/lib/python2.7/dist-packages/container_support/environment.py", line 89, in download_user_module
algo-1-1DUU4_1 | cs.download_s3_resource(self.user_script_archive, tmp)
algo-1-1DUU4_1 | File "/usr/local/lib/python2.7/dist-packages/container_support/utils.py", line 37, in download_s3_resource
algo-1-1DUU4_1 | script_bucket.download_file(script_key_name, target)
algo-1-1DUU4_1 | File "/usr/local/lib/python2.7/dist-packages/boto3/s3/inject.py", line 246, in bucket_download_file
algo-1-1DUU4_1 | ExtraArgs=ExtraArgs, Callback=Callback, Config=Config)
algo-1-1DUU4_1 | File "/usr/local/lib/python2.7/dist-packages/boto3/s3/inject.py", line 172, in download_file
algo-1-1DUU4_1 | extra_args=ExtraArgs, callback=Callback)
algo-1-1DUU4_1 | File "/usr/local/lib/python2.7/dist-packages/boto3/s3/transfer.py", line 307, in download_file
algo-1-1DUU4_1 | future.result()
algo-1-1DUU4_1 | File "/usr/local/lib/python2.7/dist-packages/s3transfer/futures.py", line 73, in result
algo-1-1DUU4_1 | return self._coordinator.result()
algo-1-1DUU4_1 | File "/usr/local/lib/python2.7/dist-packages/s3transfer/futures.py", line 233, in result
algo-1-1DUU4_1 | raise self._exception
algo-1-1DUU4_1 | NoCredentialsError: Unable to locate credentials
I am confused that I can authenticate to S3 outside of the container (to upload the training/test data) but I cannot within the Docker container. So I am guessing the issue has to do with passing the AWS credentials to the Docker container. Here is the generated docker-compose file:
networks:
  sagemaker-local:
    name: sagemaker-local
services:
  algo-1-1DUU4:
    command: train
    environment:
      - AWS_REGION=us-west-2
      - TRAINING_JOB_NAME=sagemaker-mxnet-2018-10-07-00-47-10-435
    image: 123456789012.dkr.ecr.us-west-2.amazonaws.com/sagemaker-mxnet:1.1.0-gpu-py2
    networks:
      sagemaker-local:
        aliases:
          - algo-1-1DUU4
    stdin_open: true
    tty: true
    volumes:
      - /tmp/tmpSkaR3x/algo-1-1DUU4/input:/opt/ml/input
      - /tmp/tmpSkaR3x/algo-1-1DUU4/output:/opt/ml/output
      - /tmp/tmpSkaR3x/algo-1-1DUU4/output/data:/opt/ml/output/data
      - /tmp/tmpSkaR3x/model:/opt/ml/model
version: '2.1'
Should the AWS credentials be passed in as environment variables?
I upgraded my sagemaker install after reading Using boto3 in install local mode?, but that had no effect. I checked the credentials that are being fetched in the SageMaker session (outside the container) and they appear to be blank, even though I have ~/.aws/config and ~/.aws/credentials files:
{'_token': None, '_time_fetcher': <function _local_now at 0x7f4dbbe75230>, '_access_key': None, '_frozen_credentials': None, '_refresh_using': <bound method AssumeRoleCredentialFetcher.fetch_credentials of <botocore.credentials.AssumeRoleCredentialFetcher object at 0x7f4d2de48bd0>>, '_secret_key': None, '_expiry_time': None, 'method': 'assume-role', '_refresh_lock': <thread.lock object at 0x7f4d9f2aafd0>}
I am new to AWS so I do not know how to diagnose the issue regarding AWS credentials. My .aws/config file has the following information (with placeholder values):
[default]
output = json
region = us-west-2
role_arn = arn:aws:iam::123456789012:role/SageMakers
source_profile = sagemaker-test
[profile sagemaker-test]
output = json
region = us-west-2
Where the sagemaker-test profile has AmazonSageMakerFullAccess in the IAM Management Console.
The .aws/credentials file has the following information (represented by placeholder values):
[default]
aws_access_key_id = 1234567890
aws_secret_access_key = zyxwvutsrqponmlkjihgfedcba
[sagemaker-test]
aws_access_key_id = 0987654321
aws_secret_access_key = abcdefghijklmopqrstuvwxyz
Lastly, these are versions of the applicable libraries from a pip freeze:
awscli==1.16.19
boto==2.48.0
boto3==1.9.18
botocore==1.12.18
docker==3.5.0
docker-compose==1.22.0
mxnet-cu91==1.1.0.post0
sagemaker==1.11.1
Please let me know if I left out any relevant information and thanks for any help/feedback that you can provide.
UPDATE: Thanks for your help, everyone! While attempting some of your suggested fixes, I noticed that boto3 was out of date and updated it (to boto3-1.9.26 and botocore-1.12.26), which appeared to resolve the issue. I was not able to find any documentation on that being an issue with boto3==1.9.18. If someone could help me understand what the issue was with boto3, I would be happy to mark their answer as correct.
SageMaker local mode is designed to pick up whatever credentials are available in your boto3 session, and pass them into the docker container as environment variables.
However, the version of the sagemaker sdk that you are using (1.11.1 and earlier) will ignore the credentials if they include a token, because that usually indicates short-lived credentials that won't remain valid long enough for a training job to complete or endpoint to be useful.
If you are using temporary credentials, try replacing them with permanent ones, or running from an ec2 instance (or SageMaker notebook!) that has an appropriate instance role assigned.
Also, the sagemaker sdk's handling of credentials changed in v1.11.2 and later -- temporary credentials will be passed to local mode containers, but with a warning message. So you could just upgrade to a newer version and try again (pip install -U sagemaker).
Also, boto3 behavior can change between releases, so try using the latest version.
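To see which credentials your boto3 session actually resolves (and whether they are temporary), a quick check along these lines can help; this is only a diagnostic sketch:

import boto3

# Sketch: inspect the credentials boto3 resolves before SageMaker local mode
# passes them to the container; a non-empty token means temporary credentials.
session = boto3.Session()
creds = session.get_credentials()
if creds is None:
    print("boto3 found no credentials at all")
else:
    print("access key set:", creds.access_key is not None)
    print("temporary (has token):", bool(creds.token))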
I just confirmed that this example works on my machine locally. Please make sure the role you are using has permission to use buckets whose names start with sagemaker. SageMaker by default creates buckets prefixed with sagemaker.
It looks like you have the credentials configured on your host at ~/.aws/credentials but are trying to access them on a docker container running on the host.
The simplest solution seems to be, mounting your aws credentials on the container at the expected location. You appear to be using the sagemaker-mxnet:1.1.0-gpu-py2 image, which appears to use the root user. Based on this, if you update the volumes in your docker-compose file for the algo-1-1DUU4 to include:
volumes:
  ...
  - ~/.aws/:/root/.aws/
this will mount your credentials on to the root user in your container, so that your python script should be able to access them.
I'll assume that the library you're using has boto3 at its core. boto3 advises that there are several methods of authentication available to you.
Passing credentials as parameters in the boto3.client() method
Passing credentials as parameters when creating a Session object
Environment variables
Shared credential file (~/.aws/credentials)
AWS config file (~/.aws/config)
Assume Role provider
Boto2 config file (/etc/boto.cfg and ~/.boto)
Instance metadata service on an Amazon EC2 instance that has an IAM role configured.
But it sounds like the Docker sandbox does not have access to your ~/.aws/credentials file, so I'd consider other options that may be available to you. As I'm unfamiliar with Docker, I can't give you a guaranteed solution for your scenario.
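One of those options, sketched below with the placeholder keys from the question, is to pass credentials explicitly to a boto3 session and hand that session to the SageMaker SDK, so nothing has to be read from files inside the container:

import boto3
import sagemaker

# Sketch: supply credentials explicitly instead of relying on ~/.aws files
# that are not mounted inside the container. The keys below are placeholders.
boto_session = boto3.Session(
    aws_access_key_id="1234567890",
    aws_secret_access_key="zyxwvutsrqponmlkjihgfedcba",
    region_name="us-west-2",
)
sagemaker_session = sagemaker.Session(boto_session=boto_session)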

Trying to connect to Google cloud storage (GCS) using python

I've built the following script:
import boto
import sys
import gcs_oauth2_boto_plugin

def check_size_lzo(ds):
    # URI scheme for Cloud Storage.
    CLIENT_ID = 'myclientid'
    CLIENT_SECRET = 'mysecret'
    GOOGLE_STORAGE = 'gs'
    dir_file = 'date_id={ds}/apollo_export_{ds}.lzo'.format(ds=ds)
    gcs_oauth2_boto_plugin.SetFallbackClientIdAndSecret(CLIENT_ID, CLIENT_SECRET)
    uri = boto.storage_uri('my_bucket/data/apollo/prod/' + dir_file, GOOGLE_STORAGE)
    key = uri.get_key()
    if key.size < 45379959:
        raise ValueError('umg lzo file is too small, investigate')
    else:
        print('umg lzo file is %sMB' % round((key.size / 1e6), 2))

if __name__ == "__main__":
    check_size_lzo(sys.argv[1])
It works fine locally, but when I try to run it on a Kubernetes cluster I get the following error:
boto.exception.GSResponseError: GSResponseError: 403 Access denied to 'gs://my_bucket/data/apollo/prod/date_id=20180628/apollo_export_20180628.lzo'
I have updated the .boto file on my cluster and added my OAuth client ID and secret, but I am still having the same issue.
Would really appreciate help resolving this issue.
Many thanks!
If it works in one environment and fails in another, I assume that you're getting your auth from a .boto file (or possibly from the OAUTH2_CLIENT_ID environment variable), but your kubernetes instance is lacking such a file. That you got a 403 instead of a 401 says that your remote server is correctly authenticating as somebody, but that somebody is not authorized to access the object, so presumably you're making the call as a different user.
Unless you've changed something, I'm guessing that you're getting the default Kubernetes Engine auth, which means a service account associated with your project. That service account probably hasn't been granted read permission for your object, which is why you're getting a 403. Grant it read/write permission for your GCS resources, and that should solve the problem.
Also note that by default the default credentials aren't scoped to include GCS, so you'll need to add that as well and then restart the instance.
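To check which scopes the instance's default service account actually has, you can query the metadata server from the node; this is only a diagnostic sketch and assumes the code runs on a GCE/GKE instance:

import requests

# Sketch: list the OAuth scopes of the instance's default service account.
# Look for a devstorage scope if you expect GCS access to work.
resp = requests.get(
    "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/scopes",
    headers={"Metadata-Flavor": "Google"},
)
print(resp.text)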

Google cloud speech api throwing 403 when trying to use it

I'm using Python with the Google Cloud Speech API. I did all the steps in "How to use google speech recognition api in python?" on Ubuntu and on Windows as well, and when I try to run the simple script from here - "https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/speech/api/speech_rest.py"
I get the following error:
<HttpError 403 when requesting https://speech.googleapis.com/$discovery/rest?version=v1beta1 returned "Google Cloud Speech API has not been used in project google.com:cloudsdktool before or it is disabled. Enable it by visiting https://console.developers.google.com/apis/api/speech.googleapis.com/overview?project=google.com:cloudsdktool then retry. If you enabled this API recently, wait a few minutes for the action to propagate to our systems and retry.">
What is weird is that I don't have a project by the name "cloudsdktool".
I ran "gcloud init" and linked the JSON file that I got when I created the service account key with the "gcloud auth activate-service-account --key-file=jsonfile" command.
I tried on Linux to create the Google credentials environment variable and I still get the same message.
So I found two ways to fix that problem:
1 - If using the Google Cloud SDK and Cloud Speech is in beta version, you need to run 'gcloud beta init' instead of 'gcloud init' and then provide the JSON file.
2 - If you don't want to use the Cloud SDK from Google, you can pass the JSON file straight into the Python app.
Here are the methods for this:
from oauth2client.client import GoogleCredentials
GoogleCredentials.from_stream('path/to/your/json')
Then you just create a scope on the creds and authorize, or if you are using gRPC (streaming) you pass it to the header just like in the example.
Here is the changed script for gRPC:
def make_channel(host, port):
    """Creates an SSL channel with auth credentials from the environment."""
    # In order to make an https call, use an ssl channel with defaults
    ssl_channel = implementations.ssl_channel_credentials(None, None, None)

    # Grab application default credentials from the environment
    creds = GoogleCredentials.from_stream('path/to/your/json').create_scoped([SPEECH_SCOPE])

    # Add a plugin to inject the creds into the header
    auth_header = (
        'Authorization',
        'Bearer ' + creds.get_access_token().access_token)
    auth_plugin = implementations.metadata_call_credentials(
        lambda _, cb: cb([auth_header], None),
        name='google_creds')

    # compose the two together for both ssl and google auth
    composite_channel = implementations.composite_channel_credentials(
        ssl_channel, auth_plugin)

    return implementations.secure_channel(host, port, composite_channel)
