I am trying to use Google Cloud's Natural Language API at work, and I believe my corporate firewall is blocking communication between Python and Google Cloud.
After entering the following in the terminal:
gcloud auth application-default login
My browser opens and I can log into my Google account successfully. After I log in, however, I get:
ERROR: There was a problem with web authentication. Try running again with --no-launch-browser.
ERROR: (gcloud.auth.application-default.login) Could not reach the login server. A potential cause of this could be because you are behind a proxy. Please set the environment variables HTTPS_PROXY and HTTP_PROXY to the address of the proxy in the format "protocol://address:port" (without quotes) and try again.
Example: HTTPS_PROXY=https://192.168.0.1:8080
I believe I need to contact my IT department to add an exception to our firewall. Does anyone know what the address / port for google cloud's natural language processing API is?
I can't directly answer your question, but I can provide some general guidance that might work around your issue.
The command
gcloud auth application-default login
is a convenience helper for running sample code locally, but it's really not the best auth strategy for a variety of reasons. For one, it uses a special client ID that won't always have all your quota.
The way I would recommend using the API is Service Accounts. You can create a Service Account in the Cloud Console under API credentials, and then download a JSON key. Then you set the environment variable GOOGLE_APPLICATION_CREDENTIALS to point to your file, and it will automatically work assuming you are using Application Default Credentials (which most samples and client libraries use).
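Before relying on the environment variable, it can help to confirm that it really points at a readable service account key. The following is a minimal sketch using only the standard library; the helper name `check_key_file` and the set of fields it checks are illustrative assumptions, not part of any Google library.

```python
import json
import os
from pathlib import Path

# Fields that a downloaded service account key normally contains.
REQUIRED_FIELDS = {"type", "project_id", "private_key", "client_email"}


def check_key_file(env_var: str = "GOOGLE_APPLICATION_CREDENTIALS") -> dict:
    """Load the key file named by env_var and sanity-check its shape."""
    path = os.environ.get(env_var)
    if not path:
        raise RuntimeError(f"{env_var} is not set")
    info = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = REQUIRED_FIELDS - info.keys()
    if info.get("type") != "service_account" or missing:
        raise ValueError(f"not a service account key; missing: {sorted(missing)}")
    return info
```

If this check passes, client libraries that use Application Default Credentials should pick up the same file without any further code.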
On App Engine and Compute Engine (assuming you created the VM with the correct scopes), Service Accounts exist by default, so you don't even need to download the JSON and set the environment variable.
The other way you can use the API is to create an API Key, then hit the HTTP endpoints with ?key=api-key at the end of the URL. API Keys are also less than ideal (no way to identify the client, no scopes), but are a simple option.
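As a rough illustration of the API Key approach, the sketch below builds (but does not send) a request with the key appended as a query parameter. The endpoint and request body follow the Natural Language REST API as I understand it and should be treated as assumptions; `YOUR_API_KEY` and the function name are placeholders.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint for sentiment analysis; check the API reference for your method.
ENDPOINT = "https://language.googleapis.com/v1/documents:analyzeSentiment"


def build_sentiment_request(text: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request authenticated with ?key=..."""
    url = f"{ENDPOINT}?{urllib.parse.urlencode({'key': api_key})}"
    body = {"document": {"type": "PLAIN_TEXT", "content": text}}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_sentiment_request("The service was excellent.", "YOUR_API_KEY")
print(request.full_url)
# Sending it would be: urllib.request.urlopen(request)
```

Behind a corporate proxy, `urllib` honors the HTTPS_PROXY environment variable, so the same proxy setting mentioned in the gcloud error applies here too.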
In your case, I'd recommend using JSON service account keys and the environment variable, but it's worth reading the official authentication guide.
I am trying to use the Healthcare API, specifically the Healthcare Natural Language API for which there is this tutorial as well as this other one
The tutorial outlines how to run the API on a string; I've been tasked with figuring out how to make use of the API with a dataset of medical text data. I am most comfortable in Python out of all GCP options, so I attempted to run the code through Colab.
I used a service account json key for authentication, but this isn't best practice. So, I had to delete the key since we are dealing with patient data and everyone on my team is new to GCP.
In order for me to continue exploring running the Healthcare NLP API on a dataset rather than one string, I need to figure out authentication through a different method. In this regard, I have the following questions:
Pros/cons of me trying to run this through Colab?
Should I look to shifting to running my code within the GCP interface?
Are my choices Colab (and being forced to use a json key) vs working within GCP shell/terminal (with a plethora of authentication options)? This is what I gather from my research, but I am quite new to using APIs, working with cloud computing, etc.
I've tried to look into related tutorials such as this, but because they don't relate directly to what I am doing (I can't find one that uses the Healthcare API) and because I am unfamiliar with APIs and GCP, I don't really understand what is going on. I also keep seeing service accounts mentioned at one point or another, and I am not allowed to use service account keys for the time being.
Instead of using a service account, you can use your own credentials and supply them to your code through Application Default Credentials. To set this up, make sure the GOOGLE_APPLICATION_CREDENTIALS environment variable is unset, then run gcloud auth application-default login (docs). After going through the login flow, you can either generate an auth token manually, using gcloud auth application-default print-access-token, or you can use the Google client libraries, e.g.
from googleapiclient import discovery

api_version = "v1"
service_name = "healthcare"
# Returns an authorized API client by discovering the Healthcare API.
# With no explicit credentials, the library falls back to Application
# Default Credentials (here, the ones created by the gcloud login above).
client = discovery.build(service_name, api_version)
Under the hood this uses the application default credentials. You can also do this with the google-auth Python package if you want.
You can find a summary of all the standard authentication methods here
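For the manual-token route, a minimal sketch follows. It shells out to the gcloud command mentioned above and attaches the token as a Bearer header, building one request per text in a dataset. The function names are hypothetical, the URL is supplied by the caller, and the `documentContent` payload field is my assumption about the Healthcare Natural Language request body; verify it against the API reference.

```python
import json
import subprocess
import urllib.request


def fetch_adc_token() -> str:
    """Ask gcloud for a short-lived token from the application default credentials."""
    result = subprocess.run(
        ["gcloud", "auth", "application-default", "print-access-token"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def build_authorized_request(url: str, token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST carrying the token as a Bearer header."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def requests_for_dataset(url: str, token: str, texts: list) -> list:
    """One request per text; 'documentContent' is an assumed field name."""
    return [build_authorized_request(url, token, {"documentContent": t}) for t in texts]
```

Typical use would be `requests_for_dataset(url, fetch_adc_token(), texts)` followed by `urllib.request.urlopen` on each request. Tokens expire, so a long-running job needs to fetch a fresh one periodically.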
I know that the Google Cloud environment has tons of solutions for encryption but I keep ultimately running in circles and finding myself holding my own key when it should be unknown to the application.
My current strategy is:
Access my google credentials as json held locally.
import os
os.environ['GOOGLE_APPLICATION_CREDENTIALS']='<my_google_user>.json'
Set my secret with payload in google cloud via Python (longer code omitted for conciseness)
Access my secret as dictionary to access directly later in code.
import json
from google.cloud import secretmanager

client = secretmanager.SecretManagerServiceClient()
sf_str = client.access_secret_version(request={"name": "<my_version_name>"}).payload.data.decode("utf-8")
sf_cred = json.loads(sf_str)
So my question is how can I encrypt my google application credentials json?
I am very new to this environment and workflows involving encryption so please be precise with python or cloud terminal examples. Feel free to knock the security of my strategy as a whole so that I may learn a better one.
P.S. I have created a Cryptographic Key in Google Cloud Platform if that is a step but don't know how to use/automate it in this 12hr recurring task I want to use in setting up this password access.
Not sure if I understand your question correctly.
You might not need the credentials file at all.
Your code in the cloud is executed by App Engine, or Cloud Function, or Cloud Run, etc. under some service account. It can be a default service account, or a specifically created service account. For example: Using the Default App Engine Service Account
In order to access a secret in the Secret Manager, it may be enough to add/assign relevant IAM roles to the service account which executes the code to access those secrets. For example - to add IAM roles to the default app engine service account.
Secret Manager IAM is described here: Access control
Most likely the roles/secretmanager.secretAccessor role will be enough. In that case your code will be able to get the secret value, and subsequently use it.
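To make the pieces concrete, here is a small sketch of the names involved: the default App Engine service account email, the IAM binding that grants it the accessor role, and the resource name passed to access_secret_version. The helper names and the example project and secret IDs are hypothetical; the sketch only builds strings and does not call any Google API.

```python
def default_app_engine_sa(project_id: str) -> str:
    """Email of the default App Engine service account for a project."""
    return f"{project_id}@appspot.gserviceaccount.com"


def accessor_binding(service_account_email: str) -> dict:
    """IAM policy binding that lets the service account read secret payloads."""
    return {
        "role": "roles/secretmanager.secretAccessor",
        "members": [f"serviceAccount:{service_account_email}"],
    }


def secret_version_name(project_id: str, secret_id: str, version: str = "latest") -> str:
    """Resource name to pass as 'name' when accessing a secret version."""
    return f"projects/{project_id}/secrets/{secret_id}/versions/{version}"


binding = accessor_binding(default_app_engine_sa("my-project"))
name = secret_version_name("my-project", "sf-credentials")
```

Once such a binding exists on the secret (or project), code running as that service account can read the secret without any local JSON credentials file.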
I need to use the Cloud Vision API in my Python solution. I've been relying on an API key for a while now, but at the moment I'm trying to give my Compute Engine instance's default service account the scope needed to call Vision, with little luck so far.
I have enabled vision API in my project via cloud console, but I still get that 403 error:
Request had insufficient authentication scopes.
I would set access individually for each API from my GCE instance's edit details tab, but I couldn't find Vision listed alongside the other APIs.
The only way I managed to receive a correct response from the Vision API is by ticking the "Allow full access to all Cloud APIs" checkbox, again from my GCE instance's edit details tab, but that doesn't sound too secure to me.
Hopefully there are better ways to do this, but I couldn't find any on Vision's documentation on authentication, nor in any question here on stack overflow (some had a close topic, but none of the proposed answers quite fitted my case, or provided a working solution).
Thank you in advance for your help.
EDIT
I'm adding the list of every API I can individually enable in my gce's default service account from cloud console:
BigQuery; Bigtable Admin; Bigtable Data; Cloud Datastore; Cloud Debugger; Cloud Pub/Sub; Cloud Source Repositories; Cloud SQL; Compute Engine; Service Control; Service Management; Stackdriver Logging API; Stackdriver Monitoring API; Stackdriver Trace; Storage; Task queue; User info
None of them seems useful to my needs, although the fact that enabling full access to them all solves my problem is pretty confusing to me.
EDIT #2
I'll try and state my question(s) more concisely:
How do I add https://www.googleapis.com/auth/cloud-vision to my gce instance's default account?
I'm looking for a way to do that via any of the following: GCP console, gcloud command line, or even through Python (at the moment I'm using googleapiclient.discovery.build, I don't know if there is any way to ask for vision api scope through the library).
Or is it OK to enable all the scopes as long as I limit the roles via IAM? And if that's the case, how do I do that?
I really can't find my way around the documentation, thank you once again.
Google Cloud APIs (Vision, Natural Language, Translation, etc) do not need any special permissions, you should just enable them in your project (going to the API Library tab in the Console) and create an API key or a Service account to access them.
Your decision to move from API keys to Service Accounts is the correct one: Service Accounts are the recommended approach for authentication with Google Cloud Platform services, and for security reasons Google recommends using them instead of API keys.
That being said, I see that you are using the old Python API Client Libraries, which rely on the googleapiclient.discovery.build function that you mentioned. The newer idiomatic Client Libraries have superseded the legacy API Client Libraries that you are using, so I would strongly encourage you to move in that direction. They are easier to use, more understandable, better documented, and are the recommended way to access Cloud APIs programmatically.
Taking that as the starting point, I will divide this answer into two parts:
Using Client Libraries
If you decide to follow my advice and migrate to the new Client Libraries, authentication will be really easy for you, given that Client Libraries use Application Default Credentials (ADC) for authentication. On Compute Engine, ADC uses the instance's default service account, so you should not need to worry about it at all, as it will work by default.
Once that part is clear, you can move on to create a sample code (such as the one available in the documentation), and as soon as you test that everything is working as expected, you can move on to the complete Vision API Client Library reference page to get the information about how the library works.
Using (legacy) API Client Libraries
If, despite my words, you want to stick to the old API Client Libraries, you might be interested in this other documentation page, where there is some complete information about Authentication using the API Client Libraries. More specifically, there is a whole chapter devoted to explaining OAuth 2.0 authentication using Service Accounts.
With a simple code like the one below, you can use the google.oauth2.service_account module in order to load the credentials from the JSON key file of your preferred SA, specify the required scopes, and use it when building the Vision client by specifying credentials=credentials:
from google.oauth2 import service_account
import googleapiclient.discovery

SCOPES = ['https://www.googleapis.com/auth/cloud-vision']
SERVICE_ACCOUNT_FILE = '/path/to/SA_key.json'

# Load the service account key and restrict it to the Vision scope.
credentials = service_account.Credentials.from_service_account_file(
    SERVICE_ACCOUNT_FILE, scopes=SCOPES)
# Build the Vision client with the explicit credentials.
vision = googleapiclient.discovery.build('vision', 'v1', credentials=credentials)
EDIT:
I forgot to add that, for a Compute Engine instance to be able to work with Google APIs, it has to be granted the https://www.googleapis.com/auth/cloud-platform scope (in fact, this is the same as choosing the Allow full access to all Cloud APIs option). This is documented in the GCE Service Accounts best practices, but you are right that this would allow full access to all resources and services in the project.
Alternatively, if you are concerned about the implications of allowing "access-all" scopes, this other documentation page explains that you can allow full access and then restrict it through IAM roles.
In any case, if you want to grant only the Vision scope to the instance, you can do so by running the following gcloud command:
gcloud compute instances set-service-account INSTANCE_NAME --zone=INSTANCE_ZONE --scopes=https://www.googleapis.com/auth/cloud-vision
The Cloud Vision API scope (https://www.googleapis.com/auth/cloud-vision) can be obtained, as for any other Cloud API, from this page.
Additionally, as explained in this section about SA permissions and access scopes, SA permissions must be compatible with the instance scopes; the most restrictive of the two applies, so you need to keep that in mind too.
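When debugging an "insufficient authentication scopes" error, it can be useful to see which scopes the instance actually has. The sketch below reads them from the Compute Engine metadata server (reachable only from inside a VM) and checks for either the Vision scope or the broad cloud-platform scope. The helper names are hypothetical, and the check deliberately ignores IAM roles, which are evaluated separately.

```python
import urllib.request

METADATA_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/scopes"
)
CLOUD_PLATFORM = "https://www.googleapis.com/auth/cloud-platform"
CLOUD_VISION = "https://www.googleapis.com/auth/cloud-vision"


def parse_scopes(text: str) -> list:
    """The metadata server returns one scope URL per line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def fetch_instance_scopes() -> list:
    """Read the default service account's scopes; works only on a GCE VM."""
    request = urllib.request.Request(METADATA_URL, headers={"Metadata-Flavor": "Google"})
    with urllib.request.urlopen(request, timeout=5) as response:
        return parse_scopes(response.read().decode("utf-8"))


def can_call_vision(scopes: list) -> bool:
    """True if the scopes include Vision or the all-APIs cloud-platform scope."""
    return CLOUD_PLATFORM in scopes or CLOUD_VISION in scopes
```

On the instance, `can_call_vision(fetch_instance_scopes())` returning False would be consistent with the 403 described in the question.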
To set the access scopes from the python client libraries with the same effect as that radio button in the GUI:
from google.cloud import compute_v1

instance_client = compute_v1.InstancesClient()
# The instance resource being created or updated.
instance = compute_v1.Instance()
instance.service_accounts = [
    compute_v1.ServiceAccount(
        email="$$$$$$$-compute@developer.gserviceaccount.com",
        scopes=[
            "https://www.googleapis.com/auth/compute",
            "https://www.googleapis.com/auth/cloud-platform",
        ],
    )
]
A tutorial for creating instances from Python is available here.
I'd like to save the files of my SaaS application to my Google Drive account. All the examples I've seen use OAuth2 authentication and need the end user to authenticate by opening the browser. I need to upload files from my server without any user interaction, sending files directly to my account!
I have tried many tutorials I found on the internet with no success, mainly the official
Google Drive API with Python
How can I authenticate programmatically from my server, upload files, and use API features such as sharing folders?
I'm using Python; the PyDrive library uses the same approach to authenticate.
You can do this, but need to use a Service Account, which is (or rather can be used as) an account for an application, and doesn't require a browser to open.
The documentation is here: https://developers.google.com/api-client-library/python/auth/service-accounts
And an example (without PyDrive, which is just a wrapper around all this, but makes service account a bit trickier):
from googleapiclient.discovery import build
from oauth2client.service_account import ServiceAccountCredentials
from httplib2 import Http

scopes = ['https://www.googleapis.com/auth/drive.readonly']
credentials = ServiceAccountCredentials.from_json_keyfile_name('YourDownloadedFile-5ahjaosi2df2d.json', scopes)
http_auth = credentials.authorize(Http())
drive = build('drive', 'v3', http=http_auth)

# Drive v3 returns the file list under the 'files' key ('items' was v2).
response = drive.files().list().execute()
files = response.get('files', [])
for f in files:
    print(f)
To add to andyhasit's answer, using a service account is the correct and easiest way to do this.
The problem with using the JSON key file is it becomes hard to deploy code anywhere else, because you don't want the file in version control. An easier solution is to use an environment variable like so:
https://benjames.io/2020/09/13/authorise-your-python-google-drive-api-the-easy-way/
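One way to do that (an assumption on my part, not necessarily what the linked post does) is to store the entire JSON key as the value of an environment variable and parse it at startup. The variable name `SERVICE_ACCOUNT_JSON` and the helper below are hypothetical.

```python
import json
import os


def load_key_from_env(var_name: str = "SERVICE_ACCOUNT_JSON") -> dict:
    """Parse a service account key stored as JSON text in an environment variable."""
    raw = os.environ.get(var_name)
    if raw is None:
        raise RuntimeError(f"environment variable {var_name} is not set")
    return json.loads(raw)


# With google-auth installed, the resulting dict can be passed to
# google.oauth2.service_account.Credentials.from_service_account_info(info, scopes=...)
```

This keeps the key out of version control, since the hosting platform's configuration holds the secret rather than a file in the repository.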
I know it's quite late for an answer, but this worked for me:
Run the same API flow you were already using, but this time on your own computer. It will generate a Storage.json file; deploying that file along with your scripts solves the issue (especially on read-only platforms like Heroku).
Check out Using OAuth 2.0 for Web Server Applications. It seems that's what you're looking for.
Any application that uses OAuth 2.0 to access Google APIs must have
authorization credentials that identify the application to Google's
OAuth 2.0 server. The following steps explain how to create
credentials for your project. Your applications can then use the
credentials to access APIs that you have enabled for that project.
Open the Credentials page in the API Console.
Click Create credentials > OAuth client ID.
Complete the form. Set the application type to Web application. Applications that use languages and frameworks like PHP, Java, Python, Ruby, and .NET must specify authorized redirect URIs. The redirect URIs are the endpoints to which the OAuth 2.0 server can send responses. For testing, you can specify URIs that refer to the local machine, such as http://localhost:8080.
We recommend that you design your app's auth endpoints so that your
application does not expose authorization codes to other resources on
the page.
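To show where the redirect URI fits, here is a minimal sketch that builds the authorization URL a web server application sends the user to. The function name, client ID, and state value are placeholders; the endpoint and parameter names follow Google's OAuth 2.0 web server flow as I understand it.

```python
import urllib.parse

AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"


def build_authorization_url(client_id: str, redirect_uri: str, scopes: list, state: str) -> str:
    """Build the URL that starts the OAuth 2.0 authorization code flow."""
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,  # must match an authorized redirect URI
        "response_type": "code",
        "scope": " ".join(scopes),
        "access_type": "offline",  # ask for a refresh token as well
        "state": state,
    }
    return f"{AUTH_ENDPOINT}?{urllib.parse.urlencode(params)}"


url = build_authorization_url(
    "YOUR_CLIENT_ID",
    "http://localhost:8080",
    ["https://www.googleapis.com/auth/drive.file"],
    "random-state-token",
)
```

Note that this flow still needs a user to approve access in a browser at least once, which is why the service account answers above fit the "no user interaction" requirement more directly.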
This might be a bit late, but I've been working with Google Drive from Python, JS and .NET, and here's one proposed solution (REST API) for once you have the authorization code:
How to refresh token in .net google api v3?
Please let me know if you have any questions
I have a website and I need to test it with 250 users. However, I am using google login via OAuth2. The website is hosted on Google App Engine.
I am stuck at this login part. After we log in, we get an access token from Google that is passed to Google APIs via the Authorization: Bearer header. We use the access token in the application to get user details and access other Google apps for that user. I don't know how to get that access token using my external script.
One option is to mock / stub this part of your application out during testing. For instance, you can provide a certain header that tells your application that you're in test mode, so that instead of calling the real Google APIs it calls a mock API. If your application is set up for dependency injection this could be trivial; otherwise, it may involve monkey-patching or similar.
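The dependency-injection variant of this can be sketched as follows. All class and method names here are hypothetical stand-ins for whatever layer of your application talks to Google; the point is only that the load test swaps in a fake that maps made-up tokens to canned users.

```python
class FakeUserInfoClient:
    """Stand-in for the real Google API client; returns canned data."""

    def __init__(self, users: dict):
        self._users = users

    def get_user(self, access_token: str) -> dict:
        return self._users[access_token]


class ProfileService:
    """Application code that depends on an injected user-info client."""

    def __init__(self, user_info_client):
        self._client = user_info_client

    def greeting(self, access_token: str) -> str:
        user = self._client.get_user(access_token)
        return f"Hello, {user['name']}"


# 250 fake users, one made-up token each, for the load test.
fake = FakeUserInfoClient({f"token-{i}": {"name": f"user{i}"} for i in range(250)})
service = ProfileService(fake)
print(service.greeting("token-7"))  # prints "Hello, user7"
```

In production the same `ProfileService` would receive a client that calls the real Google APIs with the Bearer token, so the code under test stays unchanged.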
Another option is to use an OAuth2 Service Account and acquire access tokens for a bunch of users in a test Google Apps domain. Your test script can do this and then just pass the access tokens just as a client normally would.