I want to unit test some FastAPI endpoints which use Google Cloud Platform. I want to write the tests without using os.environ["GOOGLE_APPLICATION_CREDENTIALS"]='path_to_json_file.json' in the files to authenticate (as this service will be in the cloud soon). Is there a way to mock this?
It's slightly unclear from your question but it is unlikely that you would ever want to set GOOGLE_APPLICATION_CREDENTIALS from within your code, partly for this reason.
You should set the variable from the environment:
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/your/key.json
python3 your_code.py
Application Default Credentials (ADC) looks for credentials in 3 locations, in this order:
1) The GOOGLE_APPLICATION_CREDENTIALS environment variable
2) Credentials created by gcloud auth application-default login
3) The compute service's identity
For this reason, setting the variable explicitly in code overrides the possibility of #2 (less important) and #3 (more important).
If you set the variable outside of the code when you run the code for testing etc., the credentials will be found automatically and the code will be auth'd.
When you don't set the variable because the code is running on a compute service (e.g. Cloud Run, Compute Engine ...), the service's credentials will be used automatically by ADC and the code will be auth'd.
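For the testing part of the question, one option is to set the variable only for the duration of a test instead of in application code. The following is a minimal sketch using the standard library's unittest.mock.patch.dict; the helper name credentials_path and the key path are hypothetical, and this only scopes the environment variable, it does not produce valid credentials:

```python
import os
from unittest import mock

ENV_VAR = "GOOGLE_APPLICATION_CREDENTIALS"


def credentials_path():
    """Return the key path found in the environment, or None if unset."""
    return os.environ.get(ENV_VAR)


def test_variable_is_scoped_to_the_test():
    before = os.environ.get(ENV_VAR)
    # Hypothetical path; patch.dict restores os.environ when the block exits.
    with mock.patch.dict(os.environ, {ENV_VAR: "/tmp/fake-key.json"}):
        assert credentials_path() == "/tmp/fake-key.json"
    # Outside the block the environment is back to its previous state.
    assert os.environ.get(ENV_VAR) == before


test_variable_is_scoped_to_the_test()
```

If the tests should not touch credentials at all, the client object used by the endpoint could instead be replaced with a mock.MagicMock, so that no authentication is attempted.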
So I am trying to orchestrate a workflow in Airflow. One task is to read GCP Cloud Storage, which needs me to specify the Google Application Credentials.
I decided to create a new folder in the dag folder and put the JSON key there. Then I specified this in the dag.py file:
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "dags\support\keys\key.json"
Unfortunately, I am getting the error below:
google.auth.exceptions.DefaultCredentialsError: File dags\support\keys\dummy-surveillance-project-6915f229d012.json was not found
Can anyone help with how I should go about declaring the service account key?
Thank you.
You can create a connection to Google Cloud from the Admin menu of the Airflow webserver. In this menu you can pass the Service Account key file path.
In this picture, the keyfile Path is /usr/local/airflow/dags/gcp.json.
Beforehand you need to mount your key file as a volume in your Docker container with the previous path.
You can also paste the key's JSON content directly into the Airflow connection, in the Keyfile JSON field.
See the following links:
Airflow-connections
Airflow-with-google-cloud
Airflow-composer-managing-connections
If you are trying to download data from Google Cloud Storage using Airflow, you should use the GCSToLocalFilesystemOperator described here. It is already provided as part of the standard Airflow library (if you installed it), so you don't have to write the code yourself using the Python operator.
Also, if you use this operator you can enter the GCP credentials into the connections screen (where they should be). This is a better approach than putting your credentials in a folder with your DAGs, as this could lead to your credentials being committed into your version control system, which could lead to security issues.
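If a key file does have to be referenced by path from DAG code, the "was not found" error in the question is likely caused by a relative path with Windows separators being evaluated inside a Linux container. A minimal sketch, assuming the support/keys/key.json layout from the question sits next to the DAG file (the function name key_file_path is hypothetical):

```python
from pathlib import Path


def key_file_path(dag_file):
    """Build an absolute path to support/keys/key.json next to the DAG file."""
    dag_dir = Path(dag_file).resolve().parent
    # Joining with "/" lets pathlib use the right separator for the OS.
    return dag_dir / "support" / "keys" / "key.json"


# In a DAG file this would typically be called as key_file_path(__file__).
print(key_file_path("/usr/local/airflow/dags/dag.py"))
```

This removes the dependency on the working directory and on the path separator, but the Airflow connection remains the safer place for the credentials.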
I have an Azure DevOps pipeline set up. It gets some secrets via the YAML:
variables:
- group: GROUP_WITH_SECRET
Then in the later part of the pipeline I run a python script that gets that particular secret via
my_pat = os.environ["my_secret"]
That is then used in a library provided by Microsoft (msrest) as so:
BasicAuthentication("", my_pat)
If the variable in question is set to plain text in the ADO Library, the script works correctly. If I change it to a secret, the connection fails. If I set it back to plain text, it works again.
Question is, how can I make it work with a secret? I've tried printing the value out but since it's a secret it doesn't show me the actual value other than the
The user 'aaaaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaa' is not authorized to access this resource
To use a secret variable in Azure Pipelines, you need to explicitly map it into the environment of a task in the agent job.
Based on my test, the Python script task has no environment field to map the secret variables.
So you can add an environment variable in a PowerShell task to map the secret variable, and then set it as a pipeline variable for the next tasks.
Here is an example:
- powershell: |
    echo "##vso[task.setvariable variable=myPass]$env:myPass"
  displayName: 'PowerShell Script'
  env:
    myPass: $(myPass)
Then you can use the variable in the next tasks.
For more detailed info, you can refer to this doc: Secret Variable
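On the Python side, a small guard can make a missing mapping obvious instead of surfacing later as an authorization error. This is a sketch under assumptions: read_secret is a hypothetical helper, and the second check relies on Azure Pipelines leaving an unresolved $(name) macro as literal text:

```python
import os


def read_secret(name, environ=None):
    """Return a secret from the environment or raise a descriptive error."""
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name!r} is not set; "
            "secret variables must be mapped explicitly in the pipeline."
        )
    # An unresolved pipeline macro is passed through as literal text.
    if value.startswith("$(") and value.endswith(")"):
        raise RuntimeError(f"{name!r} holds an unexpanded macro: {value}")
    return value
```

In the script from the question this would replace the direct lookup, for example my_pat = read_secret("my_secret").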
I am getting an error while deploying an Azure function from my local system.
I went through some blogs, and they state that my function is unable to connect to the Azure storage account which holds the function's metadata.
Also, the function on the portal is showing the error: Azure Functions runtime is unreachable
Earlier my function was running, but after integrating the function with an Azure Premium App Service plan it has stopped working. My assumption is that my App Service plan has some restriction in its inbound/outbound traffic rules, and due to this it is unable to establish a connection with the function's associated storage account.
Also, I would like to highlight that if a function is using the Premium plan then we have to add a few other configuration properties.
WEBSITE_CONTENTAZUREFILECONNECTIONSTRING = "DefaultEndpointsProtocol=https;AccountName=blob_container_storage_acc;AccountKey=dummy_value==;EndpointSuffix=core.windows.net"
WEBSITE_CONTENTSHARE = "my-function-name"
For the WEBSITE_CONTENTSHARE property I have added the function app name, but I am not sure that this is the correct value.
Following is the Microsoft document reference for the function properties
Microsoft Function configuration properties link
Can you please help me resolve the issue?
Note: I am using python for the Azure functions.
I have created a new function app with Premium plan and selected the interpreter as Python. When we select Python, OS will be automatically Linux.
Below is the message we get to create functions for Premium plan function App:
Your app is currently in read only mode because Elastic Premium on Linux requires running from a package.
We need to create, deploy and run function apps from a package, refer to the documentation on how we can run functions from package.
Documentation
Make sure to add all your local.settings.json configurations to Application Settings in function app.
Not sure of what kind of Azure Function you are using but usually when there is a Storage Account associated, we need to specify the AzureWebJobsStorage field in the serviceDependencies.json file inside Properties folder. And when I had faced the same error, the cause was that while publishing the azure function from local, some settings from the local.settings.json were missing in the Application Settings of the app service under Configuration blade.
There are a few more things which you can recheck:
Whether the storage account you are trying to use still exists or has been deleted by any chance.
Whether the publish profile is correct when publishing the application from local using the web deploy method.
Disabling the function app and then stopping the app service before redeploying it.
I hope one of the points above helps you diagnose and solve the issue.
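The check for settings missing from the Application Settings can be sketched as a small comparison. This assumes the deployed settings have already been obtained as a name-to-value dictionary (how they are fetched is left out), and missing_app_settings is a hypothetical helper name:

```python
import json


def missing_app_settings(local_settings_text, app_settings):
    """Return names in local.settings.json "Values" that are not deployed."""
    local_values = json.loads(local_settings_text).get("Values", {})
    return sorted(set(local_values) - set(app_settings))


# Hypothetical inputs for illustration only.
local_text = json.dumps({
    "IsEncrypted": False,
    "Values": {
        "AzureWebJobsStorage": "...",
        "FUNCTIONS_WORKER_RUNTIME": "python",
    },
})
deployed = {"FUNCTIONS_WORKER_RUNTIME": "python"}
print(missing_app_settings(local_text, deployed))
```

Any name it reports would then need to be added under the Configuration blade of the function app.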
The thing is that there is a difference in how the function is deployed on a Consumption plan versus a Premium plan.
Consumption - works out of the box.
Premium - you need to add WEBSITE_RUN_FROM_PACKAGE = 1 in the function's Application settings. (see https://learn.microsoft.com/en-us/azure/azure-functions/run-functions-from-deployment-package for full details)
I am using Azure Managed Identity feature for my python Azure Functions App
and would like to be able to fetch currently assigned Client ID from within the Function App itself.
Searching through the documentation and the azure-identity Python sources did not give the result I expected.
Maybe I could:
Query Azure Instance Metadata Service myself to get this ID. (not really happy with this option)
Provision it as an env variable during the ARM deployment stage, or by hand later on. (seems good and efficient, but not sure what the best practice is here)
UPDATE
Managed to get it working with an ARM template and an env variable:
Deploys FunctionApp with System Identity
Provisions System Identity as env variable of this same FunctionApp
Idea is to use Microsoft.Resources/deployments subtemplate to update Function App configuration with:
{
"name": "AZURE_CLIENT_ID",
"value": "[reference(resourceId('Microsoft.Web/sites', variables('appName')), '2019-08-01', 'full').identity.principalId]"
},
The simplest option is to go to the identity tab for your Functions app, and turn on "System assigned managed identity".
You can then get the access token without having to provide the client_id, since the token request simply picks the system assigned identity if there is one for the Function app.
If you are using "user assigned managed identity", then you need to provide the client_id: either through env or directly in your code.
You may already be aware, but as an additional note: you also need to make sure you have given your managed identity access to the resource you are accessing, for example by going to the Azure resource your Function app needs to access and assigning an appropriate role to your managed identity.
Your option 1 (query Azure Instance Metadata Service) is only available on VMs.
UPDATE
Since you need the client_id for other purposes, you may also consider reading it from the response to your request for the access token: client_id is one of the parameters in the JSON token returned to you along with the access token, and its value is the client_id of the managed identity you used (in your case, the system-assigned managed identity)
Here is a sample token response to illustrate this:
{
access_token: <...>,
resource: <...>,
token_type: 'Bearer',
client_id: <client_id of the managed identity used to get this token>
}
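Reading the value out of such a response can be sketched as follows. This assumes the response body is JSON with the fields shown in the sample above; client_id_from_token_response is a hypothetical helper and the values are placeholders:

```python
import json


def client_id_from_token_response(body):
    """Extract client_id from a managed identity token response body."""
    payload = json.loads(body)
    client_id = payload.get("client_id")
    if client_id is None:
        raise KeyError("token response has no 'client_id' field")
    return client_id


# Placeholder values; a real response comes from the token request.
sample = json.dumps({
    "access_token": "<token>",
    "resource": "<resource>",
    "token_type": "Bearer",
    "client_id": "00000000-0000-0000-0000-000000000000",
})
print(client_id_from_token_response(sample))
```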
Please could someone help me with a query related to permissions on the Google cloud platform? I realise that this is only loosely programming related so I apologise if this is the wrong forum!
I have a project ("ProjectA") written in Python that uses Google's Cloud Storage and Compute Engine. The project has various buckets that are accessed using Python code from both compute instances and from my home computer. This project uses a service account which is a Project "owner"; I believe it has all APIs enabled, and the project works really well. The service account name is "master#projectA.iam.gserviceaccount.com".
Recently I started a new project that needs similar resources (storage, compute) etc, but I want to keep it separate. The new project is called "ProjectB" and I set up a new master service account called master#projectB.iam.gserviceaccount.com. My code in ProjectB generates an error related to access permissions and is demonstrated even if I strip the code down to these few lines:
The code from ProjectA looked like this:
from google.cloud import storage
client = storage.Client(project='projectA')
mybucket = storage.bucket.Bucket(client=client, name='projectA-bucket-name')
currentblob = mybucket.get_blob('somefile.txt')
The code from ProjectB looks like this:
from google.cloud import storage
client = storage.Client(project='projectB')
mybucket = storage.bucket.Bucket(client=client, name='projectB-bucket-name')
currentblob = mybucket.get_blob('somefile.txt')
Both buckets definitely exist, and obviously if "somefile.txt" does not exist then currentblob is None, which is fine, but when I execute this code I receive the following error:
Traceback (most recent call last):
File .... .py", line 6, in <module>
currentblob = mybucket.get_blob('somefile.txt')
File "C:\Python27\lib\site-packages\google\cloud\storage\bucket.py", line 599, in get_blob
_target_object=blob,
File "C:\Python27\lib\site-packages\google\cloud\_http.py", line 319, in api_request
raise exceptions.from_http_response(response)
google.api_core.exceptions.Forbidden: 403 GET https://www.googleapis.com/storage/v1/b/<ProjectB-bucket>/o/somefile.txt: master#ProjectA.iam.gserviceaccount.com does not have storage.objects.get access to projectB/somefile.txt.
Notice how the error message says "ProjectA" service account doesn't have ProjectB access - well, I would somewhat expect that but I was expecting to use the service account on ProjectB!
I have read the documentation and links such as this and this, but even removing and reinstating the service account or giving it limited scopes hasn't helped. I have tried a few things:
1) Make sure that my new service account was "activated" on my local machine (where the code is being run for now):
gcloud auth activate-service-account master#projectB.iam.gserviceaccount.com --key-file="C:\my-path-to-file\123456789.json"
This appears to be successful.
2) Verify the list of credentialled accounts:
gcloud auth list
This lists two accounts, one is my email address (that I use for gmail, etc), and the other is master#projectB.iam.gserviceaccount.com, so it appears that my account is "registered" properly.
3) Set the service account as the active account:
gcloud config set account master#projectB.iam.gserviceaccount.com
When I look at the auth list again, there is an asterisk "*" next to the service account, so presumably this is good.
4) Check that the project is set to ProjectB:
gcloud config set project projectB
This also appears to be ok.
It's strange that when I run the Python code, it is "using" the service account from my old project even though I have changed seemingly everything to refer to project B: I've activated the account, selected it, etc.
Please could someone point me in the direction of something that I might have missed? I don't recall going through this much pain when setting up my original project and I'm finding it so incredibly frustrating that something I thought would be simple is proving so difficult.
Thank you to anyone who can offer me any assistance.
I'm not entirely sure, but this answer is from a similar question on here:
Permission to Google Cloud Storage via service account in Python
Specify the account explicitly by pointing to the credentials in your code, as documented here:
Example from the documentation page:
def explicit():
    from google.cloud import storage

    # Explicitly use service account credentials by specifying the private key
    # file.
    storage_client = storage.Client.from_service_account_json(
        'service_account.json')

    # Make an authenticated API request
    buckets = list(storage_client.list_buckets())
    print(buckets)
Don't you have a GOOGLE_APPLICATION_CREDENTIALS env variable configured which points to project A's service account?
The default behavior of the Google SDK is to take the service account from the environment variable GOOGLE_APPLICATION_CREDENTIALS.
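The gcloud auth and gcloud config commands tried in the question configure the gcloud CLI, while the Python client library resolves its credentials through Application Default Credentials, so it can help to check which key file the variable points at. A minimal sketch, assuming a standard service account key file (which contains client_email and project_id fields); describe_default_key is a hypothetical helper:

```python
import json
import os


def describe_default_key(environ=None):
    """Return (client_email, project_id) of the key file named in the env."""
    environ = os.environ if environ is None else environ
    path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return None  # Variable unset: other credential sources would be used.
    with open(path, encoding="utf-8") as handle:
        key = json.load(handle)
    return key.get("client_email"), key.get("project_id")
```

Calling describe_default_key() on the machine that raised the 403 should show whether the variable still points at project A's key.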
If you want to change the account you can do something like:
import os
from google.cloud import storage
# Path to the service account JSON key, read from an environment variable
credentials_json_file = os.environ.get('env_var_with_path_to_account_json')
client = storage.Client.from_service_account_json(credentials_json_file)
The above assumes you have created a JSON account file as described in: https://cloud.google.com/iam/docs/creating-managing-service-account-keys
and that the path to the JSON account file is in the environment variable env_var_with_path_to_account_json
This way you can have 2 account files and decide which one to use.