I was writing a Python script to automate uploading some files to Google Drive. Since I'm still a newbie Python programmer and this is an exercise as much as anything else, I started following the Google Quickstart and decided to use their quickstart.py as the basis for my own script. In the part where it talks about how to create credentials for your Python script, it refers to the "Create credentials" link, at https://developers.google.com/workspace/guides/create-credentials
I follow the link, get into one of my Google Cloud projects, and try to set up the OAuth consent screen, using an "Internal" project, as they tell you... but I can't. Google says:
“Because you’re not a Google Workspace user, you can only make your app available to external (general audience) users.”
So I try to create an "External" project instead, and then proceed to create a new client ID, using a Desktop application. Then I download the JSON credentials and put them in the same folder as my Python script, as "credentials.json". I then execute the Python script in order to authenticate it: the browser opens, I log into my Google account, give it my permissions... and then the browser hangs, because it's redirecting to a localhost URL and obviously my little Python script isn't listening on my computer at all.
I believe they must have changed this recently, because a year ago I started following the same Python tutorial and could create credentials without problems, but the Google Drive API docs haven't been updated yet. So... how do I create credentials for a Python script now?
EDIT: adding here the source code for my script. As I said, it's very similar to Google's "quickstart.py":
from __future__ import print_function
import pickle
import os.path
from googleapiclient.discovery import build
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request
from googleapiclient.errors import HttpError
# If modifying these scopes, delete the file token_myappname.pickle.
SCOPES = ['https://www.googleapis.com/auth/drive.metadata', 'https://www.googleapis.com/auth/drive']

def main():
    """Shows basic usage of the Drive v3 API.
    Prints the names, owners and parents of the first 10 files the user has access to.
    """
    creds = None
    # The file token_myappname.pickle stores the user's access and refresh
    # tokens, and is created automatically when the authorization flow
    # completes for the first time.
    if os.path.exists('token_myappname.pickle'):
        with open('token_myappname.pickle', 'rb') as token:
            creds = pickle.load(token)
    # If there are no (valid) credentials available, let the user log in.
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                'credentials.json', SCOPES)
            creds = flow.run_local_server(port=0)
        # Save the credentials for the next run
        with open('token_myappname.pickle', 'wb') as token:
            pickle.dump(creds, token)

    service = build('drive', 'v3', credentials=creds)

    # Call the Drive v3 API; owners and parents are requested in fields
    # because they are printed below.
    results = service.files().list(
        pageSize=10,
        fields="nextPageToken, files(id, name, owners, parents)").execute()
    items = results.get('files', [])

    if not items:
        print('No files found.')
    else:
        print('Files:')
        for item in items:
            print(u'{0} {1} {2}'.format(item['name'], item['owners'], item['parents']))

if __name__ == '__main__':
    main()
I propose you use a service account to access the Drive.
For that, you need to share the drive (or the folder) with the service account's email address, and then use this code:
from googleapiclient.discovery import build
import google.auth
SCOPES = ['https://www.googleapis.com/auth/drive.metadata', 'https://www.googleapis.com/auth/drive']
def main():
    credentials, project_id = google.auth.default(scopes=SCOPES)
    service = build('drive', 'v3', credentials=credentials)

    # Call the Drive v3 API, listing only the files inside the shared folder.
    results = service.files().list(
        q="'1YJ6gMgACOqVVbcgKviJKtVa5ITgsI1yP' in parents",
        pageSize=10,
        fields="nextPageToken, files(id, name, owners, parents)").execute()
    items = results.get('files', [])

    if not items:
        print('No files found.')
    else:
        print('Files:')
        for item in items:
            print(u'{0} {1} {2}'.format(item['name'], item['owners'], item['parents']))

if __name__ == '__main__':
    main()
If you run your code on Google Cloud, in a Compute Engine instance for example, you need to customize the VM with the service account that you authorized on your Drive. (Don't use the Compute Engine default service account, or you will need extra configuration on your VM.)
If you run your script outside GCP, you need to generate a service account key file and store it on your local server. Then create an environment variable GOOGLE_APPLICATION_CREDENTIALS that references the full path of the stored key file.
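For instance, a minimal sketch of setting that variable from within Python before the client is built (the path is a placeholder for wherever you stored the key file):

import os

# Hypothetical location of the downloaded service account key file.
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = '/path/to/service-account-key.json'

With the variable set, google.auth.default() in the script above picks up the key automatically.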
Aside from the other solution posted by Guillaume Blaquiere, I also found another one on my own, which I wanted to post here in case it's helpful. All I had to do was... erm, actually read the code I was copying and pasting, in particular this line:
creds = flow.run_local_server(port=0)
I checked Google's documentation outside of the Quickstart and found the following: https://google-auth-oauthlib.readthedocs.io/en/latest/reference/google_auth_oauthlib.flow.html
It turns out the example code was opening a local port on my computer to listen for the redirect, and it wasn't working, probably due to the "port 0" part or some other network problem.
So the workaround I found was to use a different auth method found in the docs:
creds = flow.run_console()
In this case, you manually paste into the command line the auth code given to you by Google. I just tried it, and my credentials are now happily stored in my local pickle file.
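For context, here is a minimal sketch of the changed branch, assuming the same credentials.json as before (note that run_console() has since been removed from newer releases of google-auth-oauthlib, so this only applies to older versions):

flow = InstalledAppFlow.from_client_secrets_file('credentials.json', SCOPES)
# run_console() prints a URL to open in any browser; after authorizing,
# Google displays an auth code that you paste back into the terminal.
creds = flow.run_console()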
I'm hoping to use the Google Sheets API in a cloud function, which will run from my account's default service account, and I'm working in Python. However, I've only ever authenticated the Sheets library locally, using this bit of code:
import os.path
from googleapiclient.discovery import build
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request
from google.oauth2.credentials import Credentials
def gen_creds(path_to_secret: str, rw_vs_ro: str):
    """
    Generate the needed credentials to work with the Sheets v4 API based on your secret
    json credentials file.

    :param path_to_secret: The file path to your credentials json file
    :param rw_vs_ro: A string, 'r_o' or 'r_w', representing whether creds should be readonly or readwrite
    :return: The built service variable
    """
    if rw_vs_ro == 'r_o':
        scopes = ['https://www.googleapis.com/auth/spreadsheets.readonly']
        creds_nm = 'readonly_token.json'
    else:
        scopes = ['https://www.googleapis.com/auth/spreadsheets']
        creds_nm = 'readwrite_token.json'
    creds = None
    # The token file stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the
    # first time.
    if os.path.exists(creds_nm):
        creds = Credentials.from_authorized_user_file(creds_nm, scopes)
    # If there are no (valid) credentials available, let the user log in.
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                path_to_secret, scopes)
            creds = flow.run_local_server(port=0)
        # Save the credentials for the next run
        with open(creds_nm, 'w') as token:
            token.write(creds.to_json())
    return build('sheets', 'v4', credentials=creds)
And I'm not entirely sure how to translate this to something that a cloud function would understand, as the cloud function won't be running as me, and lacks the same type of os path that I have access to locally. Would appreciate any insight into what the translation process looks like here--I was only able to find examples in JS, which wasn't perfect for what I was going for. Then, I would love to understand how to actually implement this code in a cloud function in GCP. Thanks!
When you deploy a Cloud Function, your main code will have access to all the files deployed within that function. This means all you need to do is include your readwrite_token.json/readonly_token.json files when deploying the package.
Once that's done, since the function's working directory can differ from the directory your files are deployed to, you can't simply pass the token file names as bare strings; you have to reference the files as described in the GCP Function Filesystem documentation.
Also, you can't use InstalledAppFlow in the Cloud Function environment, since this flow is meant for desktop environments, so you'd better hope that branch never executes, or replace it with a different flow.
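To illustrate the path issue, here is a minimal sketch (not the asker's code) of loading one of the deployed token files relative to the function's source directory instead of the working directory:

import os
from google.oauth2.credentials import Credentials

# A deployed Cloud Function's working directory may differ from its source
# directory, so resolve the token file relative to this source file.
token_path = os.path.join(os.path.dirname(os.path.abspath(__file__)),
                          'readonly_token.json')
creds = Credentials.from_authorized_user_file(
    token_path, ['https://www.googleapis.com/auth/spreadsheets.readonly'])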
Actually, I found a simple answer to this question in the end--it's very easy to generate these credentials in GCP for Python! The exact replacement method for gen_creds is:
import google.auth
from googleapiclient.discovery import build
def gen_creds(rw_vs_ro: str):
    """
    Generate the service credentials to be used to query a google sheet

    :param rw_vs_ro: A string, 'r_o' or 'r_w', representing whether creds should be readonly or readwrite
    :return: The built service variable
    """
    if rw_vs_ro == 'r_o':
        scopes = ['https://www.googleapis.com/auth/spreadsheets.readonly']
    elif rw_vs_ro == 'r_w':
        scopes = ['https://www.googleapis.com/auth/spreadsheets']
    else:
        raise ValueError("rw_vs_ro must be 'r_o' or 'r_w'")
    creds, project = google.auth.default(scopes=scopes)
    service = build('sheets', 'v4', credentials=creds)
    return service
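For completeness, a quick usage sketch (the sheet ID and range are placeholders, and the function's service account must have been granted access to the sheet, e.g. by sharing it with the account's email address):

service = gen_creds('r_o')
result = service.spreadsheets().values().get(
    spreadsheetId='YOUR_SHEET_ID', range='Sheet1!A1:B2').execute()
print(result.get('values', []))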
Hope this is as helpful to others as it is to me!
I read the Google API documentation pages (Drive API, PyDrive) and created a Databricks notebook to connect to Google Drive. I used the sample code in the documentation page as follows:
from __future__ import print_function
import pickle
import os.path
from googleapiclient.discovery import build
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request
# If modifying these scopes, delete the file token.pickle.
SCOPES = ['https://www.googleapis.com/auth/drive.metadata.readonly']
def main():
    """Shows basic usage of the Drive v3 API.
    Prints the names and ids of the first 10 files the user has access to.
    """
    creds = None
    # The file token.pickle stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the first
    # time.
    if os.path.exists('token.pickle'):
        with open('token.pickle', 'rb') as token:
            creds = pickle.load(token)
    # If there are no (valid) credentials available, let the user log in.
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            # CRED_PATH is set elsewhere in the notebook to the credential
            # file path under /dbfs/FileStore/shared_uploads.
            flow = InstalledAppFlow.from_client_secrets_file(
                CRED_PATH, SCOPES)
            creds = flow.run_local_server()
        # Save the credentials for the next run
        with open('token.pickle', 'wb') as token:
            pickle.dump(creds, token)

    service = build('drive', 'v3', credentials=creds)

    # Call the Drive v3 API
    results = service.files().list(
        pageSize=10, fields="nextPageToken, files(id, name)").execute()
    items = results.get('files', [])

    if not items:
        print('No files found.')
    else:
        print('Files:')
        for item in items:
            print(u'{0} ({1})'.format(item['name'], item['id']))

if __name__ == '__main__':
    main()
The CRED_PATH holds the credential file path under /dbfs/FileStore/shared_uploads. The script prompts me with the URL to authorize the application, but immediately after allowing access it redirects to a page that says "This site can’t be reached: localhost refused to connect."
localhost is listening on the default port (8080).
I checked the redirect URI of the registered app in Google API Services, and it includes localhost.
I'm not sure what I should check or set to access the Google API in Databricks. Any thoughts are appreciated.
Although I'm not sure whether this is the best workaround for your situation, how about using a service account instead of the OAuth2 flow you are using? With a service account, the access token can be retrieved without opening a URL to obtain an authorization code, and the Drive API can still be used with the googleapis client for Python that you are using. I thought this might remove your issue.
The method for using the service account with your script is as follows.
Usage:
1. Create service account.
About this, you can see the following official document.
Creating and managing service accounts
and/or
Create a service account
When the service account is created, a JSON credential file is downloaded. This file is used by the script.
2. Sample script:
The sample script for using the service account with googleapis for python is as follows.
from oauth2client.service_account import ServiceAccountCredentials
from googleapiclient.discovery import build

credentialFileOfServiceAccount = '###.json'  # Please set the file path of the credential file of the service account.
creds = ServiceAccountCredentials.from_json_keyfile_name(
    credentialFileOfServiceAccount,
    ['https://www.googleapis.com/auth/drive.metadata.readonly'])
service = build('drive', 'v3', credentials=creds)

results = service.files().list(pageSize=10, fields="nextPageToken, files(id, name)").execute()
items = results.get('files', [])
if not items:
    print('No files found.')
else:
    print('Files:')
    for item in items:
        print(u'{0} ({1})'.format(item['name'], item['id']))
Note:
The Google Drive of the service account is different from your Google Drive. So in this case, share a folder on your Google Drive with the email address of the service account (this email address can be seen in the credential file). With that, the service account can get and put files in the folder, and you can see and edit the files in the folder on your Google Drive using the browser.
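As a quick illustration of putting a file into the shared folder, here is a hypothetical sketch (it requires a write scope such as https://www.googleapis.com/auth/drive instead of the read-only metadata scope used above; 'FOLDER_ID' and the file name are placeholders):

from googleapiclient.http import MediaFileUpload

# 'FOLDER_ID' is the ID of the folder shared with the service account.
file_metadata = {'name': 'report.txt', 'parents': ['FOLDER_ID']}
media = MediaFileUpload('report.txt', mimetype='text/plain')
uploaded = service.files().create(
    body=file_metadata, media_body=media, fields='id').execute()
print('Uploaded file ID:', uploaded.get('id'))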
References:
Creating and managing service accounts
Create a service account
I want to implement a simple application that would allow me to access Google Drive. I am following the Python quickstart. I run the application in Docker.
However, when I run the script it shows me "Please visit this URL to authorize this application:". If I follow the URL, it asks me to choose an account, shows a warning about it not being a verified app (I ignore it and go to my app page), asks for access to Google Drive and metadata (I allow it), and then it redirects me to http://localhost:46159/?state=f.. and shows an "unable to connect" page. The port may differ.
What is the problem? Is there a way to prevent the application running in Docker from asking for verification?
In order to avoid the "asking for verification" process you can instead use authorisation through service accounts.
In order to do so, first we have to create the service account:
Navigate to your GCP project.
Go to Credentials.
Click on Create credentials > Service account key.
Set the service account name, ID and Role (if applicable). Leave the Key type as JSON.
Click on Create. A JSON file containing the credentials of the newly created service account will be downloaded.
Now, copy the file to the folder holding your project and use the following modified code (based on the Quickstart example you used):
from __future__ import print_function
from googleapiclient.discovery import build
from google.oauth2 import service_account

SCOPES = ['https://www.googleapis.com/auth/drive.metadata.readonly']
SERVICE_ACCOUNT_FILE = '/path/to/service.json'

def main():
    """Shows basic usage of the Drive v3 API.
    Prints the names and ids of the first 10 files the user has access to.
    """
    creds = service_account.Credentials.from_service_account_file(
        SERVICE_ACCOUNT_FILE, scopes=SCOPES)

    service = build('drive', 'v3', credentials=creds)

    # Call the Drive v3 API
    results = service.files().list(
        pageSize=10, fields="nextPageToken, files(id, name)").execute()
    items = results.get('files', [])

    if not items:
        print('No files found.')
    else:
        print('Files:')
        for item in items:
            print(u'{0} ({1})'.format(item['name'], item['id']))

if __name__ == '__main__':
    main()
Note that service accounts behave like normal accounts (they have their own files, permissions, etc.). If you want a service account to act as an existing user of your domain, you can do so by using domain-wide delegation.
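In code, a minimal sketch of delegation looks like this, assuming domain-wide delegation has already been enabled for the service account in the Workspace admin console (the user email is a placeholder):

# Impersonate an existing user of the domain with the delegated service account.
delegated_creds = creds.with_subject('user@yourdomain.com')
service = build('drive', 'v3', credentials=delegated_creds)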
Reference
Create a service account
Using OAuth 2.0 for Server to Server Applications
Domain-Wide Delegation of Authority
My Python (3.6.7) code uses oauth2client to access Google Photos APIs. It successfully authenticates, but when it tries to access the Google Photos albums, it seems to be using the username as the project_id.
from __future__ import print_function
from apiclient.discovery import build
from httplib2 import Http
from oauth2client import file, client, tools

# Setup the Photos v1 API
SCOPES = 'https://www.googleapis.com/auth/photoslibrary.readonly'
store = file.Storage('credentials.json')
creds = store.get()
if not creds or creds.invalid:
    flow = client.flow_from_clientsecrets('scripts/client_id.json', SCOPES)
    creds = tools.run_flow(flow, store)
service = build('photoslibrary', 'v1', http=creds.authorize(Http()))

# Call the Photos v1 API
results = service.albums().list(
    pageSize=10, fields="nextPageToken,albums(id,title)").execute()
items = results.get('albums', [])
if not items:
    print('No albums found.')
else:
    print('Albums:')
    for item in items:
        print('{0} ({1})'.format(item['title'].encode('utf8'), item['id']))
When executing the above code, it prompts me with the auth page. When I successfully authenticate, it shows me the following error:
HttpError 403 when requesting {URL} returned "Photos Library API has not been used in project 123456 before or it is disabled. Enable it by visiting https://console.developers.google.com/apis/api/photoslibrary.googleapis.com/overview?project=123456 then retry. If you enabled this API recently, wait a few minutes for the action to propagate to our systems and retry.">
Interestingly, the number 123456 (obviously changed) is actually the first part of the client_id found in client_id.json.
But the project_id looks something like this: test1-235515.
So what I got from this error is that the oauth2client client is passing the client_id instead of the project_id. So even though I have enabled the Photos API, it will never access it correctly.
Please help with this error. How can I manually change the project_id?
The project ID is different from the project number. You will be able to see both in your Google Cloud Console configuration. See this documentation for more on how to identify your projects [1].
A single Google Cloud project can have many different OAuth client IDs configured. See this documentation for information about creating OAuth client credentials [2]. You should only have to make sure that the client you created belongs to the project for which you have enabled APIs. Going to the URL provided in the error message should take you to the right configuration page.
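If you want to double-check which project an OAuth client belongs to, the client secrets file itself records it; a small sketch, assuming the "installed" layout used by desktop-type clients:

import json

# Print the project the OAuth client was created under.
with open('scripts/client_id.json') as f:
    info = json.load(f)
print(info['installed']['project_id'])  # e.g. 'test1-235515'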
[1] https://cloud.google.com/resource-manager/docs/creating-managing-projects#identifying_projects
[2] https://support.google.com/cloud/answer/6158849?hl=en