google-drive-sdk export Daily Limit unauthenticated use - python

I am trying to download/export a file according to the v3 example published by google. I am getting the "Daily Limit for Unauthenticated Use Exceeded. Continued use requires signup." error.
I have searched here and elsewhere, and all of the links suggest I am missing the credentials setup. However, I am building on top of the basic quickstart example and am able to list the contents of my Drive folder in this same application. And yes, I have changed the requested scope from drive.metadata.readonly to drive.readonly to support downloading. What am I missing?
from __future__ import print_function
from apiclient.discovery import build
from httplib2 import Http
from oauth2client import file, client, tools
import io
from googleapiclient.http import MediaIoBaseDownload

# Setup the Drive v3 API
SCOPES = 'https://www.googleapis.com/auth/drive.readonly'
store = file.Storage('credentials.json')
creds = store.get()
if not creds or creds.invalid:
    flow = client.flow_from_clientsecrets('client_secret.json', SCOPES)
    creds = tools.run_flow(flow, store)
drive_service = build('drive', 'v3', http=creds.authorize(Http()))

# Call the Drive v3 API to list the first 10 items (this works).
# Example from Google.
results = drive_service.files().list(
    pageSize=10, fields="nextPageToken, files(id, name)").execute()
items = results.get('files', [])
if not items:
    print('No files found.')
else:
    print('Files:')
    for item in items:
        print('{0} ({1})'.format(item['name'], item['id']))

# Try to download the first item (it's a Google Doc I can edit; this FAILS).
# Code pretty much lifted from Google.
file_id = items[0]['id']
print(file_id)
request = drive_service.files().export_media(fileId=file_id,
                                             mimeType='application/pdf')
fh = io.BytesIO()
downloader = MediaIoBaseDownload(fh, request)
done = False
while done is False:
    status, done = downloader.next_chunk()
    print("Download %d%%." % int(status.progress() * 100))

Found it. The example from Google was caching the credentials file (credentials.json). When I originally ran the example, the scoped permissions were for drive.metadata.readonly, not drive.readonly. I think that when I changed the scope, the cached credential was no longer valid.
I deleted credentials.json and re-ran the script (and re-approved the credentials request in my browser), and it was successful. I also ended up using the following to store the data, since the BytesIO buffer wasn't actually writing anything to disk.
data = drive_service.files().export(fileId=file_id,
                                    mimeType='application/pdf').execute()
f = open('MyFile.pdf', 'wb')
f.write(data)
f.close()
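If you would rather keep the chunked MediaIoBaseDownload approach from the question, the in-memory buffer can be written out explicitly once the download finishes. A minimal sketch, assuming drive_service and file_id from the script above (the output name MyFile.pdf is just an example):

import io
from googleapiclient.http import MediaIoBaseDownload

# Export the Google Doc to PDF in chunks, keeping the bytes in memory.
request = drive_service.files().export_media(fileId=file_id,
                                             mimeType='application/pdf')
fh = io.BytesIO()
downloader = MediaIoBaseDownload(fh, request)
done = False
while not done:
    status, done = downloader.next_chunk()
    print("Download %d%%." % int(status.progress() * 100))

# BytesIO lives only in memory; persist the buffer to disk explicitly.
with open('MyFile.pdf', 'wb') as f:
    f.write(fh.getvalue())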

Related

Is it possible to sync or upload a file with Google Drive using Python without copying the whole folder?

I just started learning Python scripting and created a script using pydrive that uploads all files from a local folder (Linux OS) to Google Drive. I'm planning to modify the script for my automation and add a function that uploads only the most recent file added to the local folder, without re-uploading all the files already inside it. May I know if this is possible with a Python script alone?
Thank you in advance!
You don't need to use pydrive. You can use the Google API Python client library directly; as far as I know, pydrive uses the client library internally. There's a starter example here:
Quick start python
from __future__ import print_function

import os.path

from google.auth.transport.requests import Request
from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build
from googleapiclient.errors import HttpError

# If modifying these scopes, delete the file token.json.
SCOPES = ['https://www.googleapis.com/auth/drive.metadata.readonly']


def main():
    """Shows basic usage of the Drive v3 API.
    Prints the names and ids of the first 10 files the user has access to.
    """
    creds = None
    # The file token.json stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the first
    # time.
    if os.path.exists('token.json'):
        creds = Credentials.from_authorized_user_file('token.json', SCOPES)
    # If there are no (valid) credentials available, let the user log in.
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                'credentials.json', SCOPES)
            creds = flow.run_local_server(port=0)
        # Save the credentials for the next run
        with open('token.json', 'w') as token:
            token.write(creds.to_json())

    try:
        service = build('drive', 'v3', credentials=creds)

        # Call the Drive v3 API
        results = service.files().list(
            pageSize=10, fields="nextPageToken, files(id, name)").execute()
        items = results.get('files', [])

        if not items:
            print('No files found.')
            return
        print('Files:')
        for item in items:
            print(u'{0} ({1})'.format(item['name'], item['id']))
    except HttpError as error:
        # TODO(developer) - Handle errors from drive API.
        print(f'An error occurred: {error}')


if __name__ == '__main__':
    main()
Manage uploads
from googleapiclient.http import MediaFileUpload

file_metadata = {'name': 'photo.jpg'}
media = MediaFileUpload('files/photo.jpg', mimetype='image/jpeg')
file = drive_service.files().create(body=file_metadata,
                                    media_body=media,
                                    fields='id').execute()
print('File ID: %s' % file.get('id'))
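To answer the original question about uploading only the most recent file: one way is to pick the newest entry in the local folder by modification time and upload just that one. This is only a sketch, not part of the quickstart; upload_newest_file and local_folder are hypothetical names, and it assumes drive_service was built with a write scope such as https://www.googleapis.com/auth/drive.file (the read-only quickstart scope cannot upload):

import glob
import os

from googleapiclient.http import MediaFileUpload


def upload_newest_file(drive_service, local_folder):
    """Upload only the most recently modified file in local_folder."""
    paths = [p for p in glob.glob(os.path.join(local_folder, '*'))
             if os.path.isfile(p)]
    if not paths:
        print('Nothing to upload.')
        return None
    newest = max(paths, key=os.path.getmtime)  # newest by modification time
    media = MediaFileUpload(newest, resumable=True)
    created = drive_service.files().create(
        body={'name': os.path.basename(newest)},
        media_body=media,
        fields='id').execute()
    print('Uploaded %s as %s' % (newest, created.get('id')))
    return created.get('id')

To avoid re-uploading the same file on every run, you could also persist the timestamp of the last upload (for example in a small text file) and only upload when a newer file appears.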

How to get the url of a file on Google Drive using its ID with Python

In the code below I get the fileID of a CSV file on Google Drive. Now, I want to store the file content directly in a pandas DataFrame instead of downloading the CSV file and afterwards extracting the data (as shown in the code).
import io
import os.path

import pandas as pd
from google.auth.transport.requests import Request
from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build
from googleapiclient.http import MediaIoBaseDownload

# If modifying these scopes, delete the file token.json.
SCOPES = ['https://www.googleapis.com/auth/drive.readonly']


# Log in to Google Drive
def login():
    creds = None
    # The file token.json stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the first
    # time.
    if os.path.exists('token.json'):
        creds = Credentials.from_authorized_user_file('token.json', SCOPES)
    # If there are no (valid) credentials available, let the user log in.
    if not creds or not creds.valid:
        print("Log in to the Google Drive account which holds/shares the file database")
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                './src/credentials.json', SCOPES)
            creds = flow.run_local_server(port=0)
        # Save the credentials for the next run
        with open('token.json', 'w') as token:
            token.write(creds.to_json())
    # Return the service
    service = build('drive', 'v3', credentials=creds)
    return service


# Download a file from Google Drive
def downloadFile(file_name):
    # Authenticate
    service = login()

    # Search for the file by name
    response = service.files().list(q=f"name='{file_name}'", spaces='drive',
                                    fields='nextPageToken, files(id, name)').execute()
    for file in response.get('files', []):
        file_id = file.get('id')

    # Download the file if it exists
    if "file_id" in locals():
        request = service.files().get_media(fileId=file_id)
        fh = io.FileIO(f"./data/{file_name}.csv", "wb")
        downloader = MediaIoBaseDownload(fh, request)
        print(f"Downloading {file_name}.csv")
        done = False
        while not done:
            status, done = downloader.next_chunk()
    else:
        print(f"\033[1;31m Warning: Can't download >> {file_name} << because it is missing!!!\033[0;0m")
    return


downloadFile("NameOfFile")
Is there any way to achieve this?
Thanks a lot for your help
From "The problem is to be able to do that I need the file's URL but I'm not able to retrieve it.", I thought that your file might be a Google Spreadsheet. When the file is a Google Spreadsheet, webContentLink is not included in the retrieved metadata.
If my understanding of your situation is correct, how about the following modification?
Modified script:
From:

    file_id = file.get('id')
    # !!! Here, I would like to get the URL of the file and download it to a pandas data frame !!!
    file_url = file.get("webContentLink")

To:

    file_id = file.get('id')
    file_url = file.get("webContentLink")
    if not file_url:
        request = service.files().export_media(fileId=file_id, mimeType='text/csv')
        fh = io.BytesIO()
        downloader = MediaIoBaseDownload(fh, request)
        done = False
        while done is False:
            status, done = downloader.next_chunk()
            print("Download %d%%" % int(status.progress() * 100))
        fh.seek(0)
        df = pd.read_csv(fh)
        print(df)
In this modification, the Google Spreadsheet is exported as CSV data using the Drive API, and the exported data is loaded into the dataframe.
For this modification, please also add import io and from googleapiclient.http import MediaIoBaseDownload.
Note:
In this case, the Google Spreadsheet is exported as CSV data using the Drive API, so please include the scope https://www.googleapis.com/auth/drive.readonly or https://www.googleapis.com/auth/drive. If your scope is only https://www.googleapis.com/auth/drive.metadata.readonly, an error occurs. Please be careful about this.
Reference:
Files: export
Added:
When the file is already CSV data (rather than a Google Spreadsheet), please modify it as follows.
    file_id = file.get('id')
    request = service.files().get_media(fileId=file_id)
    fh = io.BytesIO()
    downloader = MediaIoBaseDownload(fh, request)
    done = False
    while done is False:
        status, done = downloader.next_chunk()
        print("Download %d%%" % int(status.progress() * 100))
    fh.seek(0)
    df = pd.read_csv(fh)
    print(df)
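If you are not sure in advance whether the matched file is a Google Spreadsheet or an uploaded CSV, one option is to branch on its mimeType. This is only a sketch built on the scripts above (it assumes service, file_name, io, pd, and MediaIoBaseDownload are already available, as in the question's downloadFile function):

    # Ask for mimeType as well, so we can choose between export_media and get_media.
    response = service.files().list(q=f"name='{file_name}'", spaces='drive',
                                    fields='files(id, name, mimeType)').execute()
    for file in response.get('files', []):
        file_id = file.get('id')
        if file.get('mimeType') == 'application/vnd.google-apps.spreadsheet':
            # Google Spreadsheet: must be exported to CSV.
            request = service.files().export_media(fileId=file_id, mimeType='text/csv')
        else:
            # Regular uploaded file (e.g. a CSV): download the stored bytes as-is.
            request = service.files().get_media(fileId=file_id)
        fh = io.BytesIO()
        downloader = MediaIoBaseDownload(fh, request)
        done = False
        while not done:
            status, done = downloader.next_chunk()
        fh.seek(0)
        df = pd.read_csv(fh)
        print(df)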

Downloading Google Documents using the API

This is the shareable link: link to file
The ID is 1wzCjl51u131v1KBgpbiKLJs8DPPakhXCFosfYjp7BY0.
So, following the Manage Downloads documentation:
file_id = '11wzCjl51u131v1KBgpbiKLJs8DPPakhXCFosfYjp7BY0'
request = drive_service.files().export_media(fileId=file_id,
                                             mimeType='application/pdf')
fh = io.BytesIO()
downloader = MediaIoBaseDownload(fh, request)
done = False
while done is False:
    status, done = downloader.next_chunk()
    print("Download %d%%." % int(status.progress() * 100))
I saved the file as dd.py, ran it using F5, and got this error:

  line 2, in <module>
    request = drive_service.files().export_media(fileId=file_id,
NameError: name 'drive_service' is not defined
First off, you can't be sure that the ID you find in the shareable link is in fact the ID of the file on Google Drive; this is not always the case. Actually, I have never known this to be the case.

'drive_service' is not defined

Second, you need to create the Drive service and be authenticated before you can run that code. You should try following the Python quickstart:
from __future__ import print_function
from googleapiclient.discovery import build
from httplib2 import Http
from oauth2client import file, client, tools

# If modifying these scopes, delete the file token.json.
SCOPES = 'https://www.googleapis.com/auth/drive.metadata.readonly'


def main():
    """Shows basic usage of the Drive v3 API.
    Prints the names and ids of the first 10 files the user has access to.
    """
    store = file.Storage('token.json')
    creds = store.get()
    if not creds or creds.invalid:
        flow = client.flow_from_clientsecrets('credentials.json', SCOPES)
        creds = tools.run_flow(flow, store)
    service = build('drive', 'v3', http=creds.authorize(Http()))

    # Call the Drive v3 API
    results = service.files().list(
        pageSize=10, fields="nextPageToken, files(id, name)").execute()
    items = results.get('files', [])

    if not items:
        print('No files found.')
    else:
        print('Files:')
        for item in items:
            print('{0} ({1})'.format(item['name'], item['id']))


if __name__ == '__main__':
    main()
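Putting the two pieces together: once the quickstart has produced an authenticated service, the export call from the question can be run against it. A sketch (download_as_pdf is a hypothetical helper), assuming the scope has been changed to https://www.googleapis.com/auth/drive.readonly (the metadata-only scope in the quickstart cannot export file content) and that file_id is a valid Drive file ID:

import io

from googleapiclient.http import MediaIoBaseDownload


def download_as_pdf(drive_service, file_id, out_path='document.pdf'):
    """Export a Google Doc to PDF with an already-authenticated Drive service."""
    request = drive_service.files().export_media(fileId=file_id,
                                                 mimeType='application/pdf')
    fh = io.BytesIO()
    downloader = MediaIoBaseDownload(fh, request)
    done = False
    while not done:
        status, done = downloader.next_chunk()
        print("Download %d%%." % int(status.progress() * 100))
    # Write the in-memory buffer out to disk.
    with open(out_path, 'wb') as f:
        f.write(fh.getvalue())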

I want to download a file from Google Drive using the Drive API

In this code:

file_id = '0BwwA4oUTeiV1UVNwOHItT0xfa2M'
request = drive_service.files().get_media(fileId=file_id)
fh = io.BytesIO()
downloader = MediaIoBaseDownload(fh, request)
done = False
while done is False:
    status, done = downloader.next_chunk()
    print("Download %d%%." % int(status.progress() * 100))
I don't know how to get file_id. I was getting file_id while uploading, but now I am not able to figure out how to get the file_id of a file that is already present on Google Drive.
For example, if my uploaded file has the name A001002.pdf, how can I get the file ID for this file?
There is some reference online which I am not able to understand.
Link: files.list
Any help?
The files.list method has a q parameter which is used for searching:
GET https://www.googleapis.com/drive/v3/files?q=name+%3D+'hello'&key={YOUR_API_KEY}
Python Guess
"""
Shows basic usage of the Drive v3 API.
Creates a Drive v3 API service and prints the names and ids of the last 10 files
the user has access to.
"""
from __future__ import print_function
from apiclient.discovery import build
from httplib2 import Http
from oauth2client import file, client, tools
# Setup the Drive v3 API
SCOPES = 'https://www.googleapis.com/auth/drive.metadata.readonly'
store = file.Storage('credentials.json')
creds = store.get()
if not creds or creds.invalid:
flow = client.flow_from_clientsecrets('client_secret.json', SCOPES)
creds = tools.run_flow(flow, store)
service = build('drive', 'v3', http=creds.authorize(Http()))
# Call the Drive v3 API
results = service.files().list(
pageSize=10, fields="*").execute()
items = results.get('files', [])
if not items:
print('No files found.')
else:
print('Files:')
for item in items:
print('{0} ({1})'.format(item['name'], item['id']))
Note: this example does not show how to add the additional search parameter; I am still Googling that, but I am not a Python dev, so you may know more about how to do that than I do.
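One way to pass that search parameter from Python: the client library accepts q as a keyword argument to files().list(). A sketch, assuming the service object built in the snippet above and the file name from the question:

# Search for a file by exact name; q uses the Drive search query syntax.
results = service.files().list(
    q="name = 'A001002.pdf'",
    pageSize=10,
    fields="files(id, name)").execute()
for item in results.get('files', []):
    print('{0} ({1})'.format(item['name'], item['id']))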
Depending on your implementation, there is one more alternative: in case you do not need to get the file ID programmatically, you can just open the file in Google Docs from the browser, and the ID is shown in the URL.

Getting WebViewLinks with Google Drive

I've just started trying to use the Google Drive API. Using the quickstart guide I set up the authentication; I can print a list of my files and I can even make copies. All that works great; however, I'm having trouble trying to access data from a file on Drive. In particular, I'm trying to get a WebViewLink, but when I call .get I receive only a small dictionary that has barely any of the file's metadata. The documentation makes it look like all the data should just be there by default, but it's not appearing, and I couldn't find any way to request additional information.
credentials = get_credentials()
http = credentials.authorize(httplib2.Http())
service = discovery.build('drive', 'v3', http=http)

results = service.files().list(fields="nextPageToken, files(id, name)").execute()
items = results.get('files', [])
if not items:
    print('No files found.')
else:
    print('Files:')
    for item in items:
        print(item['name'], item['id'])
        if "Target File" in item['name']:
            d = service.files().get(fileId=item['id']).execute()
            print(repr(d))
This is the output of the above code: (the formatting is my doing)
{u'mimeType': u'application/vnd.google-apps.document',
u'kind': u'drive#file',
u'id': u'1VO9cC8mGM67onVYx3_2f-SYzLJPR4_LteQzILdWJgDE',
u'name': u'Fix TVP Licence Issues'}
For anyone confused about the code, some of it is missing; it's just the basic get_credentials function from the API's quickstart page plus some constants and imports. For completeness, here's all that stuff, unmodified, from my code:
from __future__ import print_function
import httplib2
import os

from apiclient import discovery
import oauth2client
from oauth2client import client
from oauth2client import tools

SCOPES = 'https://www.googleapis.com/auth/drive'
CLIENT_SECRET_FILE = 'client_secret.json'
APPLICATION_NAME = 'Drive API Python Quickstart'

try:
    import argparse
    flags = argparse.ArgumentParser(parents=[tools.argparser]).parse_args()
except ImportError:
    flags = None


def get_credentials():
    """Gets valid user credentials from storage.

    If nothing has been stored, or if the stored credentials are invalid,
    the OAuth2 flow is completed to obtain the new credentials.

    Returns:
        Credentials, the obtained credential.
    """
    home_dir = os.path.expanduser('~')
    credential_dir = os.path.join(home_dir, '.credentials')
    if not os.path.exists(credential_dir):
        os.makedirs(credential_dir)
    credential_path = os.path.join(credential_dir,
                                   'drive-python-quickstart.json')

    store = oauth2client.file.Storage(credential_path)
    credentials = store.get()
    if not credentials or credentials.invalid:
        flow = client.flow_from_clientsecrets(CLIENT_SECRET_FILE, SCOPES)
        flow.user_agent = APPLICATION_NAME
        if flags:
            credentials = tools.run_flow(flow, store, flags)
        else:  # Needed only for compatibility with Python 2.6
            credentials = tools.run(flow, store)
        print('Storing credentials to ' + credential_path)
    return credentials
So what's missing? How can I get the API to return all that extra metadata that's just not appearing right now?
You are very close. With the newer version of the Drive API, v3, to retrieve other metadata properties you will have to add the fields parameter to specify the additional properties to include in a partial response.
In your case, since you are looking to retrieve the webViewLink property, your request should look something similar to this:
results = service.files().list(
    pageSize=10, fields="nextPageToken, files(id, name, webViewLink)").execute()
To display your items from the response:
for item in items:
    print('{0} {1} {2}'.format(item['name'], item['id'], item['webViewLink']))
I also suggest trying it out with the API Explorer so you can see which additional metadata properties you would like to include in your response.
Good luck, and I hope this helps! :)
You explicitly request only the id and name fields in your files.list call. Add webViewLink to the list: results = service.files().list(fields="nextPageToken, files(id, name, webViewLink)").execute(). To retrieve all metadata, files/* should be used. For more information about this performance optimization, see Working with partial resources in the Google Drive docs.
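If you want to see everything the API can return for a single file, the fields mask can simply be "*" on a files().get call. A sketch, assuming the service object and the item dictionary from the question's loop:

# Request every available metadata field for one file.
meta = service.files().get(fileId=item['id'], fields='*').execute()
print(meta.get('webViewLink'))
print(meta.get('mimeType'))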
I have written a custom function to help with getting a shareable web link given a file/folder ID. More information can be found here:
def get_webViewLink_by_id(spreadsheet_id):
    sharable_link_response = drive_service.files().get(
        fileId=spreadsheet_id, fields='webViewLink').execute()
    return sharable_link_response['webViewLink']


print(get_webViewLink_by_id(spreadsheet_id='10Ik3qXK4wseva20lNGUKUTBzKoywaugi6XOmRUoP-4A'))
