Preface: I work at an agency with access to many clients' Google Analytics accounts. I am working on setting up an API pipeline to pipe all the clients' data to a data warehouse. To use the Python API client library, I created a service account that needs to be added as a user by each client, which can take a while depending on the client.
Question: why is it that something like Stitch Data can access all of my clients' data without having a service account of theirs added to each individual client? How does the Singer tap used for this work to avoid the service-account problem?
Related
I set up a Google Sheets API instance and a Google Drive API instance, and then connect to the Google Sheet using the credentials key from a Python script (application) on my desktop. This script performs basic CRUD operations.
My question:
Is this connection secure? In other words, does the data travel over the Internet in plain text or encrypted?
If it is not secure, how can I ensure the data travels securely from the Python script to Google Sheets?
I have searched for information on API data integrity, but have had no luck finding out whether the connection to the API needs TLS or SSL.
The network calls are secure: they use HTTPS and thus are not transported in plain text. Post a link to the Python script here if you want someone to check it.
You do have to trust Google with all your data.
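For illustration, a minimal sketch of such a call (the spreadsheet ID, range, and access token below are placeholders): the endpoint is an https:// URL and the requests library verifies the server's TLS certificate by default, so the data is encrypted in transit.

```python
# Minimal sketch: placeholders for the spreadsheet ID and OAuth access token.
# The endpoint is https:// and requests verifies the TLS certificate by default,
# so the request and response bodies are encrypted in transit.
import requests

SPREADSHEET_ID = "your-spreadsheet-id"   # placeholder
ACCESS_TOKEN = "ya29.your-oauth-token"   # placeholder

resp = requests.get(
    f"https://sheets.googleapis.com/v4/spreadsheets/{SPREADSHEET_ID}/values/Sheet1!A1:C10",
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```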
I am trying to access a Google Sheet stored in my Drive through the Google Sheets REST API.
This will just be a Python script without any user interaction. How can I authenticate my request using something like an access key or a service account?
I understand the concept of generating access keys or creating a service account in my Google Cloud console, but I don't quite understand how the Sheet in my Drive can be associated with it.
I would like to know the steps I should follow in order to accomplish this. For instance, how can I send a request to this API endpoint?
GET https://sheets.googleapis.com/v4/spreadsheets/{spreadsheetId}
Note: I want to do this using the REST API directly; I do not want to use a Python client library that has already been developed. I simply want to hit the above endpoint using, say, the requests package.
Google does not permit API-only access to Google (Workspace?) documents.
See Authorizing Requests
API keys authenticate programs.
OAuth is used to authenticate users, and Google requires that users authenticate requests when accessing user data stored in Workspace documents.
Domain-wide delegation enables the use of a service account to operate on behalf of users in situations such as this, but it is only available for (paid) Workspace accounts.
I'm unsure how to refer to the free and paid (Workspace) versions.
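A minimal sketch of the delegation route described above, assuming a paid Workspace domain whose admin has granted the service account domain-wide delegation for the Sheets scope; the key file, impersonated user, and spreadsheet ID are placeholders:

```python
# Sketch only: assumes a Workspace domain where an admin has granted this
# service account domain-wide delegation for the spreadsheets scope.
import requests
from google.oauth2 import service_account
from google.auth.transport.requests import Request

SCOPES = ["https://www.googleapis.com/auth/spreadsheets.readonly"]

creds = service_account.Credentials.from_service_account_file(
    "service-account.json",          # placeholder key file
    scopes=SCOPES,
    subject="user@your-domain.com",  # the Workspace user to impersonate
)
creds.refresh(Request())             # obtain an access token

spreadsheet_id = "your-spreadsheet-id"   # placeholder
resp = requests.get(
    f"https://sheets.googleapis.com/v4/spreadsheets/{spreadsheet_id}",
    headers={"Authorization": f"Bearer {creds.token}"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["properties"]["title"])
```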
This is what I want to achieve:
1. Ask the user to authorize the collection of their data on a Google Analytics 4 property (or Universal Analytics, but I would rather not).
2. Programmatically retrieve and store the data every n hours.
I was able to do (1) client-side by asking for authorization with Google's OAuth2 and making a call to the Reporting API v4 https://developers.google.com/analytics/devguides/reporting/core/v4 using gapi on the front end.
However, I'm not sure how to do it on a schedule without user interaction. I've searched Google's API docs and I believe there's a way to do it in Python https://developers.google.com/analytics/devguides/reporting/core/v4/quickstart/service-py, but I am currently limited to Node and the browser. I guess I could make a server in Python that does the data fetching and connects with the Node application, but that's yet another layer of complication that I'm trying to avoid. Is there a way to do everything in Node?
GCP APIs are all documented in a way that allows anyone to generate client libraries in a variety of languages, including Node.js. The documentation for the Node.js client for Analytics Reporting is here.
For the question of how to schedule this on GCP, I would recommend you to use Cloud Scheduler. This will hit an endpoint running on Cloud Run, which will do the actual work. Alternatively, if you already have a service running somewhere else, you can simply add the required endpoints there and point Cloud Scheduler to it.
The overall design I would suggest goes something like this:
1. Build a site which takes the user through the OAuth2 login process, requesting the relevant Google Analytics Reporting API scopes required to make the request.
2. Store the obtained credentials in your user database (preferably Firestore in Datastore mode).
3. Set up a Cloud Run service (or anything else) with two endpoints:
   - Iteration endpoint: iterates through the list of users and adds tasks to Cloud Tasks to hit the download endpoint for each one.
   - Download endpoint: takes a user ID (e.g. as a query parameter) and performs the download for that user (a rough sketch appears after the notes below). You will need to load the credentials for the user from the database and use them to access the Reporting API.
4. Store the downloaded data in the desired location, e.g. Cloud Storage, Firestore, Cloud SQL, etc.
5. Set up Cloud Scheduler to hit the iteration endpoint at the desired frequency.
For the GCP services mentioned above (basically everything other than Analytics), you may use the "cloud" clients for Node.js, which are available here.
Note: the question you have asked is very broad and this answer is just a suggestion. You may think about other designs, whichever works best for you.
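As a rough sketch of the download step (shown in Python for brevity; the Node.js googleapis client follows the same pattern), this rebuilds credentials from a stored refresh token and pulls one report. The field names in `stored`, the view ID, and the metrics are illustrative only; a GA4 property would use the Analytics Data API instead, but the flow is analogous.

```python
# Rough sketch of the "download endpoint" step: rebuild credentials from the
# refresh token stored for a user and pull a report. All IDs, client secrets
# and the view ID below are placeholders.
from google.oauth2.credentials import Credentials
from googleapiclient.discovery import build

def download_for_user(stored):  # `stored` = dict loaded from your user database
    creds = Credentials(
        token=None,
        refresh_token=stored["refresh_token"],
        token_uri="https://oauth2.googleapis.com/token",
        client_id=stored["client_id"],
        client_secret=stored["client_secret"],
        scopes=["https://www.googleapis.com/auth/analytics.readonly"],
    )
    analytics = build("analyticsreporting", "v4", credentials=creds)
    report = analytics.reports().batchGet(body={
        "reportRequests": [{
            "viewId": stored["view_id"],
            "dateRanges": [{"startDate": "7daysAgo", "endDate": "today"}],
            "metrics": [{"expression": "ga:sessions"}],
        }]
    }).execute()
    return report  # persist to Cloud Storage / Firestore / Cloud SQL here
```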
I am making an application in Python.
In short:
The user inputs some images for calibration, and some images that are then transformed by an algorithm.
To further improve the algorithm and service, I want the users to upload calibration images to a central storage in the cloud. How would I go about this?
How do I make it secure (i.e. prevent people from randomly uploading terabytes of files)?
Is it possible to have a script on the server/cloud side that validates whether an uploaded file should be deleted or not?
I have some experience with Azure, but I'm open to anything.
A high-level perspective:
Develop middleware to manage user authentication and proxy the uploads to a cloud storage service yourself. In Python you may want to look at a web API framework like Django or Flask to implement user authentication with a database properly. You also have to implement a secure connection between the middleware and the client.
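For instance, a bare-bones sketch of such a middleware in Flask (the token set, size limit, and upload_to_storage stub are all placeholders to be replaced with your real user store and cloud SDK):

```python
# Bare-bones sketch of the middleware approach: the client never holds cloud
# credentials; the server checks a token and enforces an upload size limit.
from flask import Flask, abort, request
from werkzeug.utils import secure_filename

app = Flask(__name__)
app.config["MAX_CONTENT_LENGTH"] = 20 * 1024 * 1024  # reject uploads over 20 MB

API_TOKENS = {"example-client-token"}  # placeholder; back this with a real user database

def upload_to_storage(filename, data):
    # Stub: swap in the SDK of your provider (Azure Blob Storage, S3, GCS, ...).
    with open(f"/tmp/{filename}", "wb") as f:
        f.write(data)

@app.route("/upload", methods=["POST"])
def upload():
    token = request.headers.get("Authorization", "").removeprefix("Bearer ")
    if token not in API_TOKENS:
        abort(401)
    image = request.files.get("image")
    if image is None:
        abort(400)
    upload_to_storage(secure_filename(image.filename), image.read())
    return {"status": "ok"}
```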
A less recommended implementation is calling the cloud provider's API directly from the client. For example, AWS provides the boto Python client, which can access the S3 API with the access key (AK) and secret key (SK) of an IAM user. You could prompt the user for their AK and SK to upload files to S3, relying on AWS for authorization. However, this exposes your public cloud account to users; as a security measure, each user of your application would need a unique IAM user set up with a properly minimal access policy. If you have a lot of users, you will need to consider an IAM group for your application to minimize your user-management effort.
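If you do take that route, the client-side call is roughly the following sketch (bucket name and key layout are made up; the IAM user's policy should permit only s3:PutObject on that bucket):

```python
# Sketch of the direct-from-client route with boto3; the IAM user whose keys
# are entered here should be limited to s3:PutObject on this one bucket.
import os
import boto3

def upload_calibration_image(access_key, secret_key, path):
    s3 = boto3.client(
        "s3",
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )
    s3.upload_file(path, "example-calibration-bucket", f"uploads/{os.path.basename(path)}")
```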
I have integrated the latest SDKs of Twitter & Facebook in my iOS project and implemented functionality to retrieve the access tokens as well. Previously the app was using OAuth authentication via the server for both of these social networks, and the server was keeping the associated access tokens for the periodic communication with Twitter & Facebook.
So is there any way the access tokens created in iOS can be used by our server to communicate with Twitter & Facebook accounts?
The server-side scripts are implemented in Python/Django.
Sure, why not. Just establish a web service which receives the newly created ("short-lived") access tokens, exchanges them for long-lived ones, and stores them in some database. After that you can use them for your server-side requests.
Have a look here:
https://developers.facebook.com/docs/facebook-login/access-tokens/#extending
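The exchange itself is a single HTTPS request; a rough sketch (app ID, secret, and Graph API version are placeholders):

```python
# Sketch of the server-side exchange: trade the short-lived token sent up by the
# iOS app for a long-lived one and store it. App ID/secret are placeholders.
import requests

GRAPH_API_VERSION = "v2.3"  # use whatever version your app targets

def extend_facebook_token(short_lived_token):
    resp = requests.get(
        f"https://graph.facebook.com/{GRAPH_API_VERSION}/oauth/access_token",
        params={
            "grant_type": "fb_exchange_token",
            "client_id": "YOUR_APP_ID",
            "client_secret": "YOUR_APP_SECRET",
            "fb_exchange_token": short_lived_token,
        },
        timeout=30,
    )
    resp.raise_for_status()
    # Recent Graph API versions return JSON; very old ones returned a URL-encoded string.
    return resp.json()["access_token"]  # store this in your database
```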