I am trying to use the Healthcare API, specifically the Healthcare Natural Language API, for which there is this tutorial as well as this other one.
The tutorial outlines how to run the API on a string; I've been tasked with figuring out how to make use of the API with a dataset of medical text data. I am most comfortable in Python out of all GCP options, so I attempted to run the code through Colab.
I used a service account json key for authentication, but this isn't best practice. So, I had to delete the key since we are dealing with patient data and everyone on my team is new to GCP.
In order for me to continue exploring running the Healthcare NLP API on a dataset rather than one string, I need to figure out authentication through a different method. In this regard, I have the following questions:
Pros/cons of me trying to run this through Colab?
Should I look to shifting to running my code within the GCP interface?
Are my choices Colab (and being forced to use a json key) vs working within GCP shell/terminal (with a plethora of authentication options)? This is what I gather from my research, but I am quite new to using APIs, working with cloud computing, etc.
I've tried to look into related tutorials such as this, but they are not directly related to what I am doing (i.e. I can't find one where the API being used is Healthcare), and given my lack of familiarity with APIs and GCP, I don't really understand what is going on. I also keep seeing service accounts mentioned at one point or another, and I am not allowed to use service account keys for the time being.
Instead of using a service account, you can use your own credentials and supply them to your code using the "application default credentials". To set this up, make sure the GOOGLE_APPLICATION_CREDENTIALS environment variable is unset, then run gcloud auth application-default login (docs). After going through the login flow, you can either generate an auth token manually using gcloud auth application-default print-access-token, or use the Google client libraries, e.g.
from googleapiclient import discovery
api_version = "v1"
service_name = "healthcare"
# Returns an authorized API client by discovering the Healthcare API
# and authenticating with the application default credentials.
client = discovery.build(service_name, api_version)
Under the hood this uses the application default credentials; you can also do this with the google-auth Python package if you want.
You can find a summary of all the standard authentication methods here.
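For the manual-token route applied to a dataset rather than a single string, a minimal sketch using only the standard library could look like the following. It assumes gcloud is installed and that gcloud auth application-default login has already been run. PROJECT_ID and LOCATION are hypothetical placeholders, and the URL and the documentContent field follow the analyzeEntities REST method as I understand it, so check them against the current API reference before relying on this.

```python
import json
import subprocess
import urllib.request

# Hypothetical placeholders; replace with your own project and region.
PROJECT_ID = "my-project"
LOCATION = "us-central1"


def get_access_token() -> str:
    """Ask gcloud for a token backed by the application default credentials."""
    result = subprocess.run(
        ["gcloud", "auth", "application-default", "print-access-token"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def build_request(text: str, token: str) -> urllib.request.Request:
    """Build (but do not send) one analyzeEntities request for one document."""
    service = f"projects/{PROJECT_ID}/locations/{LOCATION}/services/nlp"
    url = f"https://healthcare.googleapis.com/v1/{service}:analyzeEntities"
    body = json.dumps({"documentContent": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def analyze_dataset(texts, token):
    """Send one request per document and yield the parsed JSON responses."""
    for text in texts:
        with urllib.request.urlopen(build_request(text, token)) as response:
            yield json.load(response)
```

Typical use would be token = get_access_token() followed by a loop over analyze_dataset(list_of_notes, token). No key file is written to disk at any point, which is the property the question asks for.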
Related
This is what I want to achieve:
1. Ask the user to authorize the collection of their data on a Google Analytics 4 property (or Universal Analytics but I would rather not)
2. Programmatically retrieve and store the data every n-hours
I was able to do (1) client-side by asking for authorization with Google's OAuth 2.0 and making a call to Reporting API v4 https://developers.google.com/analytics/devguides/reporting/core/v4 using gapi on the front-end.
However, I'm not sure how to do it on a schedule without user interaction. I've searched Google's API docs and I believe there's a way to do it in python https://developers.google.com/analytics/devguides/reporting/core/v4/quickstart/service-py but I am currently limited to Node and the browser. I guess I could make a server in python that does the data fetching and connects with the Node application, but that's yet another layer of complications that I'm trying to avoid. Is there a way to do everything in Node?
GCP APIs are all documented in a way which allows everyone to generate client libraries in a variety of languages, including node.js. The documentation for the node.js client for Analytics Reporting is here.
As for how to schedule this on GCP, I would recommend Cloud Scheduler. It will hit an endpoint running on Cloud Run, which will do the actual work. Alternatively, if you already have a service running somewhere else, you can simply add the required endpoints there and point Cloud Scheduler to it.
The overall design I would suggest goes something like this:
1. Build a site which takes the user through the OAuth 2.0 login process, requesting the relevant Google Analytics Reporting API scopes required to make the request.
2. Store the obtained credentials in a user database (preferably Firestore in Datastore mode).
3. Set up a Cloud Run service (or anything else) with two endpoints:
   - Iteration endpoint: iterates through the list of users and adds tasks to Cloud Tasks to hit the download endpoint for each one.
   - Download endpoint: takes a user ID (e.g. as a query parameter) and performs the download for this user. You will need to load the credentials for the user from the database and use them to access the Reporting API.
4. Store the downloaded data in the desired location, e.g. Cloud Storage, Firestore, Cloud SQL, etc.
5. Set up Cloud Scheduler to hit the iteration endpoint at the desired frequency.
For the GCP services mentioned above, basically everything other than Analytics, you may use the "cloud" clients for node.js, which are available here.
Note: The question you have asked is very broad, and this answer is just a suggestion. You may consider other designs, whichever works best for you.
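The control flow of the two endpoints can be sketched without any cloud services. The sketch below is in Python purely for illustration (the asker is limited to Node, where the same shape applies). InMemoryBackend, fetch_report and drain_queue are hypothetical stand-ins for the user database, the Reporting API call and Cloud Tasks respectively.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryBackend:
    """Stand-in for the user database, the task queue and the data store."""
    credentials: dict = field(default_factory=dict)  # user_id -> stored OAuth credentials
    task_queue: list = field(default_factory=list)   # pending download tasks
    reports: dict = field(default_factory=dict)      # user_id -> downloaded data


def iteration_endpoint(backend):
    """Called by the scheduler: enqueue one download task per known user."""
    for user_id in backend.credentials:
        backend.task_queue.append({"path": "/download", "user_id": user_id})
    return len(backend.task_queue)


def download_endpoint(backend, user_id, fetch_report):
    """Called once per task: load the user's credentials, fetch, then store."""
    creds = backend.credentials[user_id]
    backend.reports[user_id] = fetch_report(creds)


def drain_queue(backend, fetch_report):
    """Stand-in for the task queue delivering tasks to the download endpoint."""
    while backend.task_queue:
        task = backend.task_queue.pop(0)
        download_endpoint(backend, task["user_id"], fetch_report)
```

Splitting the work into one task per user means a failure for one user can be retried on its own instead of aborting the whole run.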
I have a python/Flask application, on our intranet, and I want people to authenticate to it using their Azure AD credentials. Pretty much every hit on Google/Bing/etc is about how to use AD to authenticate so you can subsequently use Microsoft APIs, such as Graph or Data Lake, or they are for .NET applications, or they are for stuff running on the Azure cloud.
The closest I've come to what I need is https://github.com/cicorias/python-flask-aad-v2, and the instructions refer to some older version of Azure. It would also be nice if I could specify whether an authenticated user should have access to this app, but I can live without it and simply have a list of allowed IDs in the app's back-end.
This cannot be that hard; I've done this in the past for both GCP and AWS, but I've hit the proverbial brick wall when it came to Azure. While this is not my first overall rodeo, it is my first Azure/AD rodeo, so to speak. I'm sure that part of my problem is that, being an Azure noob, I may not even be using the right search keywords.
Help?
Do not think in terms of the providers but in terms of the authentication standards. Since you have integrated Google Login in your app in the past, you must have used OAuth as the auth standard. Azure AD also supports OAuth. You can use a Python package called flask-azure-oauth to integrate it into your Flask app.
You can refer to below code samples available in Microsoft Identity Platform documentation (https://learn.microsoft.com/en-us/azure/active-directory/develop/sample-v2-code#web-applications)
Sign in users - https://github.com/Azure-Samples/ms-identity-python-flask-tutorial
Sign in users and call Microsoft Graph - https://github.com/Azure-Samples/ms-identity-python-webapp
These links are for Python (Flask). You can get code samples for other languages or scenarios from the same Microsoft Identity Platform documentation page.
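To make the "think in terms of the standard" point concrete, here is a minimal sketch of the first leg of the OAuth 2.0 authorization code flow against the Microsoft identity platform, plus the back-end allow-list the question mentions. TENANT_ID, CLIENT_ID, REDIRECT_URI and ALLOWED_USERS are hypothetical values; the real ones come from your app registration. Exchanging the returned code for tokens and validating them is deliberately left out, and is better delegated to a library such as MSAL or to the samples linked above.

```python
import secrets
from urllib.parse import urlencode

# Hypothetical values; the real ones come from the Azure AD app registration.
TENANT_ID = "contoso.onmicrosoft.com"
CLIENT_ID = "00000000-0000-0000-0000-000000000000"
REDIRECT_URI = "https://intranet.example/auth/callback"
ALLOWED_USERS = {"alice@contoso.com"}


def new_state() -> str:
    """Random value stored in the session and checked on the callback (CSRF guard)."""
    return secrets.token_urlsafe(16)


def build_authorize_url(state: str) -> str:
    """URL the Flask app redirects the browser to so the user can sign in."""
    params = {
        "client_id": CLIENT_ID,
        "response_type": "code",
        "redirect_uri": REDIRECT_URI,
        "response_mode": "query",
        "scope": "openid profile email",
        "state": state,
    }
    base = f"https://login.microsoftonline.com/{TENANT_ID}/oauth2/v2.0/authorize"
    return f"{base}?{urlencode(params)}"


def is_allowed(username: str) -> bool:
    """Back-end allow-list check applied after the user has been authenticated."""
    return username.lower() in ALLOWED_USERS
```

In a Flask view, the sign-in route would store new_state() in the session and redirect to build_authorize_url(state). The callback route would compare the returned state before trusting the code.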
I am writing an application that uses Google's python client for GCS.
https://cloud.google.com/storage/docs/reference/libraries#client-libraries-install-python
I've had no issues using this, until I needed to write my functional tests.
The way our organization tests integrations like this is to write a simple stub of the API endpoints I hit, and point the Google client library (in this case) to my stub, instead of needing to hit Google's live endpoints.
I'm using a service account for authentication and am able to point the client at my stub when fetching a token because it gets that value from the service account's json key that you get when you create the service account.
What I don't seem able to do is point the client library at my stubbed API instead of making calls directly to Google.
Some workarounds that I've thought of, but don't like, are:
- Allow the tests to hit the live endpoints.
- Put in some configuration that toggles using the real Google client library, or a mocked version of the library. I'd rather mock the API versus having mock code deployed to production.
Any help with this is greatly appreciated.
I did some research and it seems like there's nothing supported specifically for Cloud Storage using Python. I found this GitHub issue entry with a related discussion, but for Go.
I think you can open an issue on the public issue tracker asking for this functionality. I'm afraid that for now it's easier to keep using your second workaround.
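One way to keep the second workaround small is to hide storage behind a narrow interface and choose the implementation from configuration. The sketch below is only an illustration: InMemoryStorage, GcsStorage, make_storage and the STORAGE_BACKEND variable are hypothetical names. The real client is imported lazily, so the fake can be used on machines without google-cloud-storage installed.

```python
import os


class InMemoryStorage:
    """Tiny fake exposing only the operations this sketch needs."""

    def __init__(self):
        self._blobs = {}

    def upload(self, bucket, name, data):
        self._blobs[(bucket, name)] = data

    def download(self, bucket, name):
        return self._blobs[(bucket, name)]


class GcsStorage:
    """Thin wrapper around the real client, imported only when selected."""

    def __init__(self):
        from google.cloud import storage  # requires google-cloud-storage
        self._client = storage.Client()

    def upload(self, bucket, name, data):
        self._client.bucket(bucket).blob(name).upload_from_string(data)

    def download(self, bucket, name):
        return self._client.bucket(bucket).blob(name).download_as_bytes()


def make_storage():
    """Pick the backend from the (hypothetical) STORAGE_BACKEND variable."""
    if os.environ.get("STORAGE_BACKEND", "gcs") == "memory":
        return InMemoryStorage()
    return GcsStorage()
```

The fake could also live in the test package and be injected from there, which keeps mock code out of the deployed application.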
I need to use the Cloud Vision API in my Python solution. I've been relying on an API key for a while now, but at the moment I'm trying to give my Compute Engine instance's default service account the scope needed to call Vision, with little luck so far.
I have enabled the Vision API in my project via the Cloud Console, but I still get this 403 error:
Request had insufficient authentication scopes.
I would set access individually for each API from my GCE instance's edit details tab, but I couldn't find Vision listed alongside the other APIs.
The only way I managed to receive a correct response from the Vision API was by checking the "Allow full access to all Cloud APIs" checkbox, again from my GCE instance's edit details tab, but that doesn't sound too secure to me.
Hopefully there are better ways to do this, but I couldn't find any in Vision's documentation on authentication, nor in any question here on Stack Overflow (some covered a similar topic, but none of the proposed answers quite fit my case or provided a working solution).
Thank you in advance for your help.
EDIT
I'm adding the list of every API I can individually enable in my gce's default service account from cloud console:
BigQuery; Bigtable Admin; Bigtable Data; Cloud Datastore; Cloud Debugger; Cloud Pub/Sub; Cloud Source Repositories; Cloud SQL; Compute Engine; Service Control; Service Management; Stackdriver Logging API; Stackdriver Monitoring API; Stackdriver Trace; Storage; Task queue; User info
None of them seems useful to my needs, although the fact that enabling full access to them all solves my problem is pretty confusing to me.
EDIT #2
I'll try and state my question(s) more concisely:
How do I add https://www.googleapis.com/auth/cloud-vision to my gce instance's default account?
I'm looking for a way to do that via any of the following: GCP console, gcloud command line, or even through Python (at the moment I'm using googleapiclient.discovery.build, I don't know if there is any way to ask for vision api scope through the library).
Or is it OK to enable all the scopes as long as I limit the roles via IAM? And if that's the case, how do I do that?
I really can't find my way around the documentation, thank you once again.
Google Cloud APIs (Vision, Natural Language, Translation, etc.) do not need any special permissions; you just need to enable them in your project (from the API Library tab in the Console) and create an API key or a Service Account to access them.
Your decision to move from API keys to Service Accounts is the correct one, given that Service Accounts are the recommended approach for authentication with Google Cloud Platform services, and for security reasons, Google recommends to use them instead of API keys.
That being said, I see that you are using the old Python API Client Libraries, which make use of the googleapiclient.discovery.build service that you mentioned. The newer idiomatic Client Libraries have superseded the legacy API Client Libraries that you are using, so I would strongly encourage you to move in that direction. They are easier to use, more understandable, better documented, and are the recommended approach to access Cloud APIs programmatically.
Taking that as the starting point, I will divide this answer into two parts:
Using Client Libraries
If you decide to follow my advice and migrate to the new Client Libraries, authentication will be really easy, given that Client Libraries use Application Default Credentials (ADC) for authentication. ADC makes use of the default service account for Compute Engine to provide authentication, so you should not need to worry about it at all, as it will work by default.
Once that part is clear, you can move on to create a sample code (such as the one available in the documentation), and as soon as you test that everything is working as expected, you can move on to the complete Vision API Client Library reference page to get the information about how the library works.
Using (legacy) API Client Libraries
If, despite my words, you want to stick to the old API Client Libraries, you might be interested in this other documentation page, where there is some complete information about Authentication using the API Client Libraries. More specifically, there is a whole chapter devoted to explaining OAuth 2.0 authentication using Service Accounts.
With simple code like the following, you can use the google.oauth2.service_account module to load the credentials from the JSON key file of your preferred SA, specify the required scopes, and use them when building the Vision client by specifying credentials=credentials:
from google.oauth2 import service_account
import googleapiclient.discovery
SCOPES = ['https://www.googleapis.com/auth/cloud-vision']
SERVICE_ACCOUNT_FILE = '/path/to/SA_key.json'
credentials = service_account.Credentials.from_service_account_file(
    SERVICE_ACCOUNT_FILE, scopes=SCOPES)
vision = googleapiclient.discovery.build('vision', 'v1', credentials=credentials)
EDIT:
I forgot to add that in order for a Compute Engine instance to be able to work with Google APIs, it has to be granted the https://www.googleapis.com/auth/cloud-platform scope (in fact, this is the same as choosing Allow full access to all Cloud APIs). This is documented in the GCE Service Accounts best practices, but you are right that this would allow full access to all resources and services in the project.
Alternatively, if you are concerned about the implications of allowing "access-all" scopes, this other documentation page explains that you can allow full access and then restrict access with IAM roles.
In any case, if you want to grant only the Vision scope to the instance, you can do so by running the following gcloud command (the instance has to be stopped first):
gcloud compute instances set-service-account INSTANCE_NAME --zone=INSTANCE_ZONE --scopes=https://www.googleapis.com/auth/cloud-vision
The Cloud Vision API scope (https://www.googleapis.com/auth/cloud-vision) can be obtained, as for any other Cloud API, from this page.
Additionally, as explained in this section about SA permissions and access scopes, SA permissions have to be consistent with the instance scopes; the most restrictive of the two applies, so keep that in mind too.
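To check which scopes an instance actually ended up with, you can ask the metadata server from inside the VM. The sketch below uses only the standard library. can_call_vision is a hypothetical helper that encodes the point made above: either the Vision scope or the broad cloud-platform scope is enough at the scope level, while IAM roles still apply on top.

```python
import urllib.request

# Metadata server path listing the scopes of the instance's default service account.
METADATA_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/scopes"
)
VISION_SCOPES = {
    "https://www.googleapis.com/auth/cloud-vision",
    "https://www.googleapis.com/auth/cloud-platform",
}


def fetch_instance_scopes():
    """Return the scopes granted to the instance (only works on Compute Engine)."""
    request = urllib.request.Request(
        METADATA_URL, headers={"Metadata-Flavor": "Google"}
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.read().decode("utf-8").split()


def can_call_vision(scopes):
    """True if at least one granted scope covers the Vision API."""
    return bool(VISION_SCOPES.intersection(scopes))
```

Running can_call_vision(fetch_instance_scopes()) on the VM before and after changing the scopes is a quick way to confirm that the change took effect.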
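To set the access scopes from the Python client libraries, with the same effect as that radio button in the GUI:
from google.cloud import compute_v1

instance_client = compute_v1.InstancesClient()
# The instance being configured, as in the instance-creation tutorial
instance = compute_v1.Instance()
# $$$$$$$ stands for the project number of the default service account
instance.service_accounts = [
    compute_v1.ServiceAccount(
        email="$$$$$$$-compute@developer.gserviceaccount.com",
        scopes=[
            "https://www.googleapis.com/auth/compute",
            "https://www.googleapis.com/auth/cloud-platform",
        ],
    )
]
A tutorial for creating instances from Python is available here.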
I am trying to use Google cloud's natural language API at work, and I believe my corporate firewall is blocking communication between python and google cloud.
After entering the following in the terminal:
gcloud auth application-default login
My browser opens up to log into my google account successfully. After I log in, however, I get
ERROR: There was a problem with web authentication. Try running again with --no-launch-browser.
ERROR: (gcloud.auth.application-default.login) Could not reach the login server. A potential cause of this could be because you are behind a proxy. Please set the environment variables HTTPS_PROXY and HTTP_PROXY to the address of the proxy in the format "protocol://address:port" (without quotes) and try again.
Example: HTTPS_PROXY=https://192.168.0.1:8080
I believe I need to contact my IT department to add an exception to our firewall. Does anyone know what the address / port for google cloud's natural language processing API is?
I can't directly answer your question, but I can provide some general guidance that might work around your issue.
The command
gcloud auth application-default login
is a convenience helper for running sample code locally, but it's really not the best auth strategy for a variety of reasons. For one, it uses a special client ID that won't always have all your quota.
The way I would recommend using the API is Service Accounts. You can create a Service Account in the Cloud Console under API credentials, and then download a JSON key. Then you set the environment variable GOOGLE_APPLICATION_CREDENTIALS to point to your file, and it will automatically work assuming you are using Application Default Credentials (which most samples and client libraries use).
On App Engine and Compute Engine (assuming you created the VM with the correct scopes), Service Accounts exist by default, so you don't even need to download the JSON and set the environment variable.
The other way you can use the API is by creating an API key and then hitting the HTTP endpoints with ?key=api-key at the end of the URL. API keys are also less than ideal (the API has no idea who the client is, and there are no scopes), but they are a simple option.
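As a rough illustration of the API key option, the sketch below builds a sentiment request against the Natural Language REST endpoint using only the standard library. The api_key value is a placeholder, and the request is built but not sent. Note that the endpoint lives on language.googleapis.com over HTTPS (port 443), and that urllib picks up the HTTPS_PROXY and HTTP_PROXY environment variables mentioned in the error message when the request is actually opened.

```python
import json
import urllib.request
from urllib.parse import urlencode

ENDPOINT = "https://language.googleapis.com/v1/documents:analyzeSentiment"


def build_sentiment_request(text, api_key):
    """Build (but do not send) an analyzeSentiment request authenticated by API key."""
    body = {
        "document": {"type": "PLAIN_TEXT", "content": text},
        "encodingType": "UTF8",
    }
    url = f"{ENDPOINT}?{urlencode({'key': api_key})}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )
```

Sending it is then a matter of calling urllib.request.urlopen on the returned request and parsing the JSON body of the response.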
In your case, I'd recommend using JSON service account keys and the environment variable, but it's worth reading the official authentication guide.