I am planning on creating a Twisted Python Google Drive API using the open-source txboxdotnet API as a base, which is based on Box.com's API and is located here.
Doing this would allow me to create a plugin for Tahoe-LAFS, a Least Authority File System which gives the user a number of additional client-side security functions. For my purposes, I am using the open-source branch of Tahoe-LAFS that allows for cloud interaction located here and an open-source public-clouds modification made for this branch.
My main question is would the creation of a Twisted Python API for Google Drive be allowed under the Terms and Conditions for Google Developers? The API will (of course) not contain any of my confidential developer credentials.
Well going on the assumption of best practices here and your description of the API usage and intent you are within the bounds of their terms. You can read for yourself here. Sections 3 and 4 to be exact.
Related
I am trying to use the Healthcare API, specifically the Healthcare Natural Language API for which there is this tutorial as well as this other one
The tutorial outlines how to run the API on a string; I've been tasked with figuring out how to make use of the API with a dataset of medical text data. I am most comfortable in Python out of all GCP options, so I attempted to run the code through Colab.
I used a service account json key for authentication, but this isn't best practice. So, I had to delete the key since we are dealing with patient data and everyone on my team is new to GCP.
In order for me to continue exploring running the Healthcare NLP API on a dataset rather than one string, I need to figure out authentication through a different method. In this regard, I have the following questions:
Pros/cons of me trying to run this through Colab?
Should I look to shifting to running my code within the GCP interface?
Are my choices Colab (and being forced to use a json key) vs working within GCP shell/terminal (with a plethora of authentication options)? This is what I gather from my research, but I am quite new to using APIs, working with cloud computing, etc.
I've tried to look into related tutorials such as this but their lack of direct relationship to what I am doing (ie: can't find one where the API being used is Healthcare), and my lack of familiarity with APIs and GCPs, I don't particularly understand what is going on + I keep seeing service accounts generally mentioned at one point or another, and I am not allowed to use service account keys for the time being.
Instead of using a service account you can use your own credentials, and supply them to your code using the "application default credentials". To set this up, make sure the GOOGLE_APPLICATION_CREDENTIALS environment variable is unset, then run gcloud auth application-default login (docs). After going through the login flow you can either generate an auth token manually, using gcloud auth application-default print-access-token, or you can use the Google client libraries, i.e.
from googleapiclient import discovery
api_version = "v1"
service_name = "healthcare"
# Returns an authorized API client by discovering the Healthcare API
# and using GOOGLE_APPLICATION_CREDENTIALS environment variable.
client = discovery.build(service_name, api_version)
Under the hood this is using the application default credentials, you can also do with the google-auth Python package if you want.
You can find a summary of all the standard authentication methods here
I have a python/Flask application, on our intranet, and I want people to authenticate to it using their Azure AD credentials. Pretty much every hit on Google/Bing/etc is about how to use AD to authenticate so you can subsequently use Microsoft APIs, such as Graph or Data Lake, or they are for .NET applications, or they are for stuff running on the Azure cloud.
The closest I've come to what I need is https://github.com/cicorias/python-flask-aad-v2, and the instructions refer to some older version of Azure. It would also be nice if I could specify whether an authenticated user should have access to this app, but I can live without it and simply have a list of allowed IDs in the app's back-end.
This cannot be that hard; I've done this in the past for both GCP and AWS, but I've hit the proverbial brick wall when it came to Azure. While this is not my first overall rodeo, it is my first Azure/AD rodeo, so to speak. I'm sure that part of my problem is that, being an Azure noob, I may not even be using the right search keywords.
Help?
Do not think in terms of the providers but in terms of the Authentication standards. Since you have integrated Google Login in your app in the past then you must have used something called OAuth as the auth standard. Azure AD also supports OAuth. You can use a python package called flask-azure-oauth to integrate it in your flask app.
You can refer to below code samples available in Microsoft Identity Platform documentation (https://learn.microsoft.com/en-us/azure/active-directory/develop/sample-v2-code#web-applications)
Sign in users - https://github.com/Azure-Samples/ms-identity-python-flask-tutorial
Sign in users and call Microsoft Graph - https://github.com/Azure-Samples/ms-identity-python-webapp
These links are for Python (Flask). You can get code samples for other languages or scenario from Microsoft Identity Platform documentation (https://learn.microsoft.com/en-us/azure/active-directory/develop/sample-v2-code#web-applications)
I am writing an application that uses Google's python client for GCS.
https://cloud.google.com/storage/docs/reference/libraries#client-libraries-install-python
I've had no issues using this, until I needed to write my functional tests.
The way our organization tests integrations like this is to write a simple stub of the API endpoints I hit, and point the Google client library (in this case) to my stub, instead of needing to hit Google's live endpoints.
I'm using a service account for authentication and am able to point the client at my stub when fetching a token because it gets that value from the service account's json key that you get when you create the service account.
What I don't seem able to do is point the client library at my stubbed API instead of making calls directly to Google.
Some work arounds that I've though of, that I don't like are:
- Allow the tests to hit the live endpoints.
- Put in some configuration that toggles using the real Google client library, or a mocked version of the library. I'd rather mock the API versus having mock code deployed to production.
Any help with this is greatly appreciated.
I’ve made some research and it seems like there’s nothing supported specifically for Cloud Storage using python. I found this GitHub issue entry with a related discussion, but for go.
I think you can open a public issue tracker asking for this functionality. I’m afraid by now it’s easier to keep using your second workaround.
I need to use Cloud Vision API in my python solution, I've been relying on an API key for a while now, but at the moment I'm trying to give my Compute Engine's default service account the scope needed to call Vision, with little luck so far.
I have enabled vision API in my project via cloud console, but I still get that 403 error:
Request had insufficient authentication scopes.
I would set access individually for each API from my gce's edit details tab, but couldn't find Vision listed along the other API's.
The only way I managed to correctly receive a correct response from Vision API is by flagging the "Allow full access to all Cloud APIs" checkbox, again from my gce's edit details tab, but that doesn't sound too secure to me.
Hopefully there are better ways to do this, but I couldn't find any on Vision's documentation on authentication, nor in any question here on stack overflow (some had a close topic, but none of the proposed answers quite fitted my case, or provided a working solution).
Thank you in advance for your help.
EDIT
I'm adding the list of every API I can individually enable in my gce's default service account from cloud console:
BigQuery; Bigtable Admin; Bigtable Data; Cloud Datastore; Cloud Debugger; Cloud Pub/Sub; Cloud Source Repositories; Cloud SQL; Compute Engine; Service Control; Service Management; Stackdriver Logging API; Stackdriver Monitoring API; Stackdriver Trace; Storage; Task queue; User info
None of them seems useful to my needs, although the fact that enabling full access to them all solves my problem is pretty confusing to me.
EDIT #2
I'll try and state my question(s) more concisely:
How do I add https://www.googleapis.com/auth/cloud-vision to my gce instance's default account?
I'm looking for a way to do that via any of the following: GCP console, gcloud command line, or even through Python (at the moment I'm using googleapiclient.discovery.build, I don't know if there is any way to ask for vision api scope through the library).
Or is it ok to enable all the scopes as long as limit the roles via IAM? And if that's the case how do I do that?
I really can't find my way around the documentation, thank you once again.
Google Cloud APIs (Vision, Natural Language, Translation, etc) do not need any special permissions, you should just enable them in your project (going to the API Library tab in the Console) and create an API key or a Service account to access them.
Your decision to move from API keys to Service Accounts is the correct one, given that Service Accounts are the recommended approach for authentication with Google Cloud Platform services, and for security reasons, Google recommends to use them instead of API keys.
That being said, I see that you are using the old Python API Client Libraries, which make use of the googleapiclient.discovery.build service that you mentioned. As of now, the newer idiomatic Client Libraries are the recommended approach, and they superseded the legacy API Client Libraries that you are using, so I would strongly encourage to move in that direction. They are easier to use, more understandable, better documented and are the recommended approach to access Cloud APIs programatically.
Getting that as the starting point, I will divide this answer in two parts:
Using Client Libraries
If you decided to follow my advice and migrate to the new Client Libraries, authentication will be really easy for you, given that Client Libraries use Application Default Credentials (ADC) for authentication. ADC make use of the default service account for Compute Engine in order to provide authentication, so you should not worry about it at all, as it will work by default.
Once that part is clear, you can move on to create a sample code (such as the one available in the documentation), and as soon as you test that everything is working as expected, you can move on to the complete Vision API Client Library reference page to get the information about how the library works.
Using (legacy) API Client Libraries
If, despite my words, you want to stick to the old API Client Libraries, you might be interested in this other documentation page, where there is some complete information about Authentication using the API Client Libraries. More specifically, there is a whole chapter devoted to explaining OAuth 2.0 authentication using Service Accounts.
With a simple code like the one below, you can use the google.oauth2.service_account module in order to load the credentials from the JSON key file of your preferred SA, specify the required scopes, and use it when building the Vision client by specifying credentials=credentials:
from google.oauth2 import service_account
import googleapiclient.discovery
SCOPES = ['https://www.googleapis.com/auth/cloud-vision']
SERVICE_ACCOUNT_FILE = '/path/to/SA_key.json'
credentials = service_account.Credentials.from_service_account_file(
SERVICE_ACCOUNT_FILE, scopes=SCOPES)
vision = googleapiclient.discovery.build('vision', 'v1', credentials=credentials)
EDIT:
I forgot to add that in order for Compute Engine instances to be able to work with Google APIs, it will have to be granted with the https://www.googleapis.com/auth/cloud-platform scope (in fact, this is the same as choosing the Allow full access to all Cloud APIs). This is documented in the GCE Service Accounts best practices, but you are right that this would allow full access to all resources and services in the project.
Alternatively, if you are concerned about the implications of allowing "access-all" scopes, in this other documentation page it is explained that you can allow full access and then perform the restriction access by IAM roles.
In any case, if you want to grant only the Vision scope to the instance, you can do so by running the following gcloud command:
gcloud compute instances set-service-account INSTANCE_NAME --zone=INSTANCE_ZONE --scopes=https://www.googleapis.com/auth/cloud-vision
The Cloud Vision API scope (https://www.googleapis.com/auth/cloud-vision) can be obtained, as for any other Cloud API, from this page.
Additionally, as explained in this section about SA permissions and access scopes, SA permissions should be compliant with instance scopes; that means that most restrictive permission would apply, so you need to have that in mind too.
To set the access scopes from the python client libraries with the same effect as that radio button in the GUI:
instance_client = compute_v1.InstancesClient()
instance.service_accounts = [
compute_v1.ServiceAccount(
email="$$$$$$$-compute#developer.gserviceaccount.com",
scopes=[
"https://www.googleapis.com/auth/compute",
"https://www.googleapis.com/auth/cloud-platform",
],
)
]
With a tutorial for creating instances from python here
I'm working on a tracking proxy (for want of a better term) written in Python. It's a simple http (wsgi) application that will run on one (maybe more) server and accepts event data from a desktop client. This service would then forward the tracking data on to some actual tracking platform (DeskMetrics, MixPanel, Google Analytics) so that we don't have to deal with the slicing and dicing of data.
The reason for this implementation is that it's much easier and faster to make changes to a server process that we control rather than having to ensure every client in the wild gets updated if the tracking backend changes in some way.
I've been looking up info on the various options and I was hoping somebody here would have some good advice from their own experiences. Ideally we'd be able to use Google Analytics as it's free for any amount of usage but paid options are fine.
My only real requirement is either a good Python library or a well documented api that I can write a wrapper for (this seems somewhat lacking in GA when it comes to triggering events through any method other than their js or other provided libs).
N.B. We're not really tracking server code so something like NewRelic isn't appropriate, we're just decoupling a desktop application from the specifics of the tracking backend.
We ran into this same problem a bunch of times, we ended up building a suite of server-side analytics libraries to make this easier.
Segment.io has libraries for Python, Ruby, Java, Node, .NET and PHP that abstract the APIs for Mixpanel, KISSmetrics, Google Analytics and a bunch of other analytics services.
You could integrate the Python library once, and then send your data wherever you want. The data is proxied through Segment.io's hosted service. Hopefully this cleans up the mess of integrating a bunch of libraries, each with slightly different APIs. (The service is free for the first million events.)
Have you tried anything below?
The Google Data APIs Python Client Library has source specific to analytics
http://code.google.com/p/gdata-python-client/
http://code.google.com/p/gdata-python-client/source/browse/#hg%2Fsamples%2Fanalytics
https://developers.google.com/gdata/articles/python_client_lib
You might be able to borrow from these sources as well;
Google has something they are working on for mobile and source is available in PHP, JSP, ASP.net and Perl: https://developers.google.com/analytics/devguides/collection/other/mobileWebsites
I also came accross this in PHP http://code.google.com/p/php-ga/
As for others:
KissMetrics: http://support.kissmetrics.com/apis/python
MixPanel: https://mixpanel.com/docs/integration-libraries/python
DeskMetrics: don't seem to have python, http://docs.deskmetrics.com/index.html
Sorry I cannot provide information based off extensive experience with anything python related other then providing a few of these resources. I would be interested to see what you come up with.