Does anyone know where I can find complete sample code for uploading a local file and getting contents with MediaFileUpload?
I really need to see both the HTML form used to post and the code to accept it. I'm pulling my hair out and so far only getting partial answers.
I found this question while trying to figure out where the heck "MediaFileUpload" came from in the Google API examples, and I eventually figured it out. Here is a more complete code example that I used to test things with Python 2.7.
You need a JSON credentials file for this code to work. This is the credentials file you get from your Google app / project / thing.
You also need a file to upload; I'm using "test.html" in this example.
from oauth2client.service_account import ServiceAccountCredentials
from apiclient.discovery import build
from apiclient.http import MediaFileUpload
#Set up a credentials object I think
creds = ServiceAccountCredentials.from_json_keyfile_name('credentials_from_google_app.json', ['https://www.googleapis.com/auth/drive'])
#Now build our api object, thing
drive_api = build('drive', 'v3', credentials=creds)
file_name = "test"
print "Uploading file " + file_name + "..."
#We have to make a request hash to tell the google API what we're giving it
body = {'name': file_name, 'mimeType': 'application/vnd.google-apps.document'}
#Now create the media file upload object and tell it what file to upload,
#in this case 'test.html'
media = MediaFileUpload('test.html', mimetype = 'text/html')
#Now we're doing the actual post, creating a new file of the uploaded type
fiahl = drive_api.files().create(body=body, media_body=media).execute()
#Because verbosity is nice
print "Created file '%s' id '%s'." % (fiahl.get('name'), fiahl.get('id'))
A list of valid Mime Types to use in the "body" hash is available at
https://developers.google.com/drive/v3/web/mime-types
A list of MIME type strings for MediaFileUpload (this one describes the local file you are uploading; Drive converts the upload to the type given in the "body" hash when that is a Google Docs type):
https://developers.google.com/drive/v3/web/integrate-open#open_files_using_the_open_with_contextual_menu
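If you would rather not hard-code the MIME type you pass to MediaFileUpload, Python's standard mimetypes module can guess it from the file extension. This is only a minimal sketch: the helper name guess_upload_mimetype and the fallback value are my own choices, not part of the Google client library.

```python
import mimetypes


def guess_upload_mimetype(path, default='application/octet-stream'):
    """Guess the MIME type of a local file from its extension.

    Falls back to `default` (an assumption of this sketch) when the
    extension is unknown or missing.
    """
    guessed, _encoding = mimetypes.guess_type(path)
    return guessed or default


print(guess_upload_mimetype('test.html'))
```

The result could then be passed as the mimetype argument, e.g. MediaFileUpload('test.html', mimetype=guess_upload_mimetype('test.html')).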
Python 2.7, resumable upload.
https://github.com/googleapis/google-api-python-client/blob/master/docs/media.md
from __future__ import print_function
import pickle
import os.path
from googleapiclient.http import MediaFileUpload
from googleapiclient.discovery import build
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request
# If modifying these scopes, delete the file token.pickle.
SCOPES = ['https://www.googleapis.com/auth/drive']
def main():
    """Shows basic usage of the Drive v3 API.
    Uploads a local file to a Drive folder using a resumable upload.
    """
    creds = None
    # The file token.pickle stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the first
    # time.
    if os.path.exists('token.pickle'):
        with open('token.pickle', 'rb') as token:
            creds = pickle.load(token)
    # If there are no (valid) credentials available, let the user log in.
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                'credentials.json', SCOPES)
            creds = flow.run_local_server(port=0)
        # Save the credentials for the next run
        with open('token.pickle', 'wb') as token:
            pickle.dump(creds, token)
    service = build('drive', 'v3', credentials=creds)
    media = MediaFileUpload(
        'big.jpeg',
        mimetype='image/jpeg',
        resumable=True
    )
    request = service.files().create(
        media_body=media,
        body={'name': 'Big', 'parents': ['<your folder Id>']}
    )
    # Upload chunk by chunk; response stays None until the last chunk is done.
    response = None
    while response is None:
        status, response = request.next_chunk()
        if status:
            print("Uploaded %d%%." % int(status.progress() * 100))
    print("Upload Complete!")
if __name__ == '__main__':
    main()
You won't need to post JSON yourself; the client library handles that for you.
We already provide full code samples, which can be found here: https://github.com/gsuitedevs/python-samples
You could also check the files.insert reference documentation, which contains a Python sample: https://developers.google.com/drive/v2/reference/files/insert
If this does not answer your question, perhaps you could explain in more detail what you want to achieve and the architecture you currently have in place.
I want to provide additional information on uploading to a specific drive folder. I am providing the example from my AWS Lambda with Python 3.7
Note:
You need the folder ID for your desired location. You can find this by going to your drive folder and looking for the ID in the URL in the Browser.
For example in this URL, https://drive.google.com/drive/u/0/folders/1G91IKgQqI9YgNj8Odc8SIOPHrWOjdvOO, 1G91IKgQqI9YgNj8Odc8SIOPHrWOjdvOO would be your ID.
You need to provide your service account email access to the folder to be accessed. The service account email is found in the IAM section of your Google Cloud account. Add access to your folder by going to it in Drive, clicking the "i" icon on the top right, clicking details, then manage access.
You also need the JSON file associated with the service account. Find/create this in the Service Accounts section in Google Cloud IAM on the KEYS tab. The file contains the private key for your project. Store it where your code can access it.
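If you would rather not copy the folder ID out of the URL by hand, it can be pulled out with the standard library. A small sketch, assuming the URL has the /folders/<id> shape shown in the note above; folder_id_from_url is a name I made up for illustration.

```python
from urllib.parse import urlparse


def folder_id_from_url(url):
    """Return the path segment that follows 'folders' in a Drive URL, or None."""
    parts = [p for p in urlparse(url).path.split('/') if p]
    if 'folders' in parts:
        idx = parts.index('folders')
        if idx + 1 < len(parts):
            return parts[idx + 1]
    return None


print(folder_id_from_url(
    'https://drive.google.com/drive/u/0/folders/1G91IKgQqI9YgNj8Odc8SIOPHrWOjdvOO'))
```

The returned string is what goes into the 'parents' list of the file metadata.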
I'm not sure which dependencies you need to install, but I think pydrive installed them all for me: pip3 install pydrive
from apiclient.discovery import build
from google.oauth2 import service_account
from googleapiclient.http import MediaFileUpload
# This provides what authority the service account has as well as the location of the JSON file containing the private key.
scopes = ['https://www.googleapis.com/auth/drive']
service_account_file = 'path/to/service_account.json'
# Create the credentials object for the service account
credentials = service_account.Credentials.from_service_account_file(service_account_file, scopes=scopes)
drive = build('drive', 'v3', credentials=credentials)
# Create the metadata for the file and upload it to the drive folder. Supply the corresponding MIME type for your file. The parents parameter is very important, this is where you supply the ID you found for your drive folder.
body = {'name': 'testfile.txt', 'mimeType': 'text/plain', 'parents': ["theStringForTheDriveFolder"]}
# Note: the keyword for MediaFileUpload is lowercase "mimetype" (the body dict uses "mimeType").
media = MediaFileUpload('path/to/testfile.txt', mimetype='text/plain')
drive.files().create(body=body, media_body=media).execute()
Here's the documentation I followed:
How to use service accounts to call google APIs: https://developers.google.com/identity/protocols/oauth2/service-account#python
Documentation for the Drive API V3 to upload files:
https://developers.google.com/drive/api/v3/reference/files/create
Documentation for the Google-Python API client:
https://github.com/googleapis/google-api-python-client/blob/main/docs/oauth.md
How to perform various upload types with the Drive API:
https://developers.google.com/drive/api/guides/manage-uploads#simple
How to create and use Service Accounts, including generating the JSON file/private keys:
https://developers.google.com/identity/protocols/oauth2/service-account
I'm trying to build a simple Python script to access Gmail's API and organize certain email messages in my inbox into a CSV file.
I see in the below documentation that accessing the messages is done using the user's (mine in this case) email address.
messages.list
I'm running into difficulty accessing the API. I'm getting the below error:
"Request is missing required authentication credential. Expected OAuth 2 access token, login cookie or other valid authentication credential. See https://developers.google.com/identity/sign-in/web/devconsole-project."
I was hoping to build a lightweight script, not a web application. Does anyone know if there is a way I can authenticate my own email address in a script?
PS: I suppose I could use selenium to automatically sign in but I was wondering if there was a way to do this using gmail's API.
You need to understand that the data you are trying to access is private user data. It is owned by a user (in this case, you), which means your application needs to be authorized by that user to access their data.
We do this with a method called OAuth2, which allows your application to request consent to read the user's emails in this case.
In order to use OAuth2 you must first register your application on the Google Developer Console and set up a few things; this identifies your application to Google.
All of this is explained in the Python quick-start for Gmail. Once you have that working you should be able to change the code to use messages.list instead of labels.list.
from __future__ import print_function
import pickle
import os.path
from googleapiclient.discovery import build
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request
# If modifying these scopes, delete the file token.pickle.
SCOPES = ['https://www.googleapis.com/auth/gmail.readonly']
def main():
    """Shows basic usage of the Gmail API.
    Lists the user's Gmail labels.
    """
    creds = None
    # The file token.pickle stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the first
    # time.
    if os.path.exists('token.pickle'):
        with open('token.pickle', 'rb') as token:
            creds = pickle.load(token)
    # If there are no (valid) credentials available, let the user log in.
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                'credentials.json', SCOPES)
            creds = flow.run_local_server(port=0)
        # Save the credentials for the next run
        with open('token.pickle', 'wb') as token:
            pickle.dump(creds, token)
    service = build('gmail', 'v1', credentials=creds)
    # Call the Gmail API
    results = service.users().labels().list(userId='me').execute()
    labels = results.get('labels', [])
    if not labels:
        print('No labels found.')
    else:
        print('Labels:')
        for label in labels:
            print(label['name'])
if __name__ == '__main__':
    main()
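To make the switch from labels.list to messages.list a bit more concrete, here is a rough sketch of what the call could look like. It assumes `service` is the object returned by build('gmail', 'v1', credentials=creds) in the quickstart above; the helper name list_message_ids and the max_messages parameter are mine, not part of the quickstart.

```python
def list_message_ids(service, max_messages=50):
    """Collect up to max_messages message IDs from the authorized user's mailbox.

    `service` is assumed to be the Gmail service object built in the
    quickstart. messages.list returns only IDs, one page at a time.
    """
    ids = []
    page_token = None
    while len(ids) < max_messages:
        response = service.users().messages().list(
            userId='me',
            maxResults=min(100, max_messages - len(ids)),
            pageToken=page_token).execute()
        ids.extend(m['id'] for m in response.get('messages', []))
        # Keep going only while the API reports another page.
        page_token = response.get('nextPageToken')
        if not page_token:
            break
    return ids[:max_messages]
```

Each ID can then be passed to service.users().messages().get(userId='me', id=...) to fetch the message itself before writing whatever fields you need to the CSV.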
I am quite new to working with Google Drive, and I am well aware that I can't ask Stack Overflow for a complete example of the scenario below. However, if you can direct me to something similar it would be really helpful. I am quite stuck and can't move forward.
I have uploaded 7-8 GB of files to Google Drive, including PDF, DOCX, PPT, etc. My goal is to list all the files that contain the term queried by the user. For instance, if I want to search for 'computer vision' using the Google Drive API, then the results should contain the list of files that contain the term 'computer vision'.
This scenario works when I type something in the Google Drive search box.
When I type machine learning, I get a list of files. How can I retrieve the same results programmatically? I have read the documentation of the Google Drive API and came across the syntax 'fullText contains term', but I don't know how to use it.
As you correctly said, an easy way to do this is to use the q parameter of the request, along with the fullText contains X operator. Below you can see an adaptation of the Python Quickstart from the reference that uses this feature:
from __future__ import print_function
import pickle
import os.path
from googleapiclient.discovery import build
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request
# If modifying these scopes, delete the file token.pickle.
SCOPES = ['https://www.googleapis.com/auth/drive.metadata.readonly']
def main():
    """Shows basic usage of the Drive v3 API.
    Prints the names and ids of the files whose content matches the query.
    """
    creds = None
    # The file token.pickle stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the first
    # time.
    if os.path.exists('token.pickle'):
        with open('token.pickle', 'rb') as token:
            creds = pickle.load(token)
    # If there are no (valid) credentials available, let the user log in.
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                'credentials.json', SCOPES)
            creds = flow.run_local_server(port=0)
        # Save the credentials for the next run
        with open('token.pickle', 'wb') as token:
            pickle.dump(creds, token)
    service = build('drive', 'v3', credentials=creds)
    # Call the Drive v3 API
    results = service.files().list(
        pageSize=1000, fields="nextPageToken, files(id, name)", q="fullText contains 'computer vision'").execute()
    items = results.get('files', [])
    if not items:
        print('No files found.')
    else:
        print('Files:')
        for item in items:
            print(u'{0} ({1})'.format(item['name'], item['id']))
if __name__ == '__main__':
    main()
Notice the q parameter upon calling the service.files().list() method.
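If the search term comes from user input rather than being hard-coded, the q string has to be built at run time. A minimal sketch of that: the Drive search documentation describes escaping single quotes inside the quoted value as \', and that is the only escaping done here. The helper name fulltext_query is hypothetical.

```python
def fulltext_query(term):
    """Build a Drive `q` string that matches files whose content contains `term`."""
    # Single quotes delimit the value, so quotes inside the term are escaped.
    escaped = term.replace("'", "\\'")
    return "fullText contains '{}'".format(escaped)


print(fulltext_query('computer vision'))
```

The result can be passed straight to the call above, e.g. service.files().list(q=fulltext_query(user_input), ...).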
Reference
Google Drive API - Search for Files
Python Drive API v3 reference - list()
in this code :
file_id = '0BwwA4oUTeiV1UVNwOHItT0xfa2M'
request = drive_service.files().get_media(fileId=file_id)
fh = io.BytesIO()
downloader = MediaIoBaseDownload(fh, request)
done = False
while done is False:
    status, done = downloader.next_chunk()
    print("Download %d%%." % int(status.progress() * 100))
I don't know how to get the file_id. I was getting the file_id while uploading, but now I am not able to figure out how to get the file_id of a file that is already on Google Drive.
For example, if my uploaded file is named A001002.pdf, how can I get the file ID for this file?
There is some reference online which I am not able to understand.
link: files.list
Any help?
The files.list method has a q parameter, which is used for searching:
GET https://www.googleapis.com/drive/v3/files?q=name+%3D+'hello'&key={YOUR_API_KEY}
Python Guess
"""
Shows basic usage of the Drive v3 API.
Creates a Drive v3 API service and prints the names and ids of the last 10 files
the user has access to.
"""
from __future__ import print_function
from apiclient.discovery import build
from httplib2 import Http
from oauth2client import file, client, tools
# Setup the Drive v3 API
SCOPES = 'https://www.googleapis.com/auth/drive.metadata.readonly'
store = file.Storage('credentials.json')
creds = store.get()
if not creds or creds.invalid:
    flow = client.flow_from_clientsecrets('client_secret.json', SCOPES)
    creds = tools.run_flow(flow, store)
service = build('drive', 'v3', http=creds.authorize(Http()))
# Call the Drive v3 API
results = service.files().list(
    pageSize=10, fields="*").execute()
items = results.get('files', [])
if not items:
    print('No files found.')
else:
    print('Files:')
    for item in items:
        print('{0} ({1})'.format(item['name'], item['id']))
Note that this example does not show how to add the additional parameter. I am still Googling that, but I am not a Python dev, so you may know more about how to do that than I do.
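For what it's worth, in the Python client the additional parameter is just the q keyword argument of files().list(). A hedged sketch, assuming `service` is the Drive v3 service object built in the example above; find_file_ids is a helper name made up for illustration.

```python
def find_file_ids(service, file_name):
    """Return (id, name) pairs for every file whose name equals file_name.

    `service` is assumed to be a Drive v3 service object. Drive allows
    several files with the same name, so a list is returned.
    """
    # Escape single quotes so the name cannot break out of the quoted value.
    query = "name = '{}'".format(file_name.replace("'", "\\'"))
    matches = []
    page_token = None
    while True:
        response = service.files().list(
            q=query,
            fields="nextPageToken, files(id, name)",
            pageToken=page_token).execute()
        for item in response.get('files', []):
            matches.append((item['id'], item['name']))
        page_token = response.get('nextPageToken')
        if not page_token:
            return matches
```

Calling find_file_ids(service, 'A001002.pdf') would then give the IDs to pass to get_media(fileId=...).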
Depending on your implementation, there is one more alternative. In case you do not need to programmatically get the fileID you can just open the file in google docs from the browser and the ID is shown in the URL.
I followed the Google Sheet Python API Quickstart guide (https://developers.google.com/sheets/api/quickstart/python) and was able to get it working using their supplied code:
def get_credentials():
    """Gets valid user credentials from storage.
    If nothing has been stored, or if the stored credentials are invalid,
    the OAuth2 flow is completed to obtain the new credentials.
    Returns:
        Credentials, the obtained credential.
    """
    # If modifying these scopes, delete your previously saved credentials
    # at ~/.credentials/sheets.googleapis.com-python-quickstart.json
    SCOPES = 'https://www.googleapis.com/auth/spreadsheets'
    CLIENT_SECRET_FILE = 'my/path/client_secret.json'
    APPLICATION_NAME = 'Google Sheets API Python Quickstart'
    credential_path = 'my/path/sheets.googleapis.com-python-quickstart.json'
    store = Storage(credential_path)
    credentials = store.get()
    ## !!!!! Is this needed?
    if not credentials or credentials.invalid:
        flow = client.flow_from_clientsecrets(CLIENT_SECRET_FILE, SCOPES)
        flow.user_agent = APPLICATION_NAME
        if flags:
            credentials = tools.run_flow(flow, store, flags)
        else: # Needed only for compatibility with Python 2.6
            credentials = tools.run(flow, store)
        print('Storing credentials to ' + credential_path)
    return credentials
In the default setup I downloaded two JSON files:
client_secret.JSON
downloaded to project directory.
sheets.googleapis.com-python-quickstart.JSON
downloaded to ~/.credentials directory
The sheets.googleapis.com JSON file starts with:
"_module": "oauth2client.client".
Question 1: What is the purpose for each of these JSON files?
Question 2: Are both of these JSON files needed to successfully use the Google Sheets API?
I am thinking no, as I am able to get the API working without the client_secret.JSON file.
How about this answer? I think when you know the OAuth2 process for retrieving access token and refresh token, you can understand the meaning of both files. The flow for retrieving access token and refresh token using OAuth2 is as follows.
Flow :
Download client_secret.JSON from the API Console.
client_secret.JSON includes client_id, client_secret and redirect_uris.
Retrieve an authorization code using scopes and client_id from client_secret.JSON.
Retrieve access token and refresh token using the authorization code, client_id, client_secret and redirect_uris.
Retrieved access token, refresh token and other parameters are saved to the file of sheets.googleapis.com-python-quickstart.JSON.
Note :
When you run the Quickstart for the first time, the authorization process is launched in your browser. At that point, the Quickstart script retrieves the authorization code using client_id and scopes, and then retrieves the access token and refresh token using the authorization code, client_id, client_secret and redirect_uris.
After the first run of the Quickstart, the access token is obtained with the refresh token stored in sheets.googleapis.com-python-quickstart.JSON, so retrieving an authorization code through the browser is no longer required. That is why client_secret.JSON is not needed once sheets.googleapis.com-python-quickstart.JSON exists.
I think that this leads to an answer for your Question 2.
However, if you want to change the scopes and/or the credentials in client_secret.JSON, the browser authorization and the retrieval of the authorization code have to be done again. For this, you have to remove sheets.googleapis.com-python-quickstart.JSON and authorize again. At that point, the Quickstart uses client_secret.JSON again.
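To make the difference between the two files a little more tangible, here is a small sketch that tells them apart by their keys. It assumes the usual layouts: a client secret file wraps client_id, client_secret and redirect_uris in a top-level "installed" or "web" object, while the file written by oauth2client's Storage holds the tokens at the top level. The function name describe_credentials_json is made up for this example.

```python
import json


def describe_credentials_json(text):
    """Tell a client-secret file apart from a stored-token file by its keys."""
    data = json.loads(text)
    # Client secret files from the API Console nest their fields under
    # "installed" (desktop apps) or "web" (web apps).
    for app_type in ('installed', 'web'):
        if app_type in data and 'client_secret' in data[app_type]:
            return 'client secret'
    # The stored-credentials file carries the tokens themselves.
    if 'refresh_token' in data or 'access_token' in data:
        return 'stored token'
    return 'unknown'
```

Running it on the contents of each file should return 'client secret' for client_secret.JSON and 'stored token' for sheets.googleapis.com-python-quickstart.JSON, if the files follow the layouts assumed above.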
References :
Using OAuth 2.0 to Access Google APIs
Authorization for Google Services
If this is not useful for you, I'm sorry.
I've just started trying to use the Google Drive API. Using the quickstart guide I set up the authentication, I can print a list of my files and I can even make copies. All that works great, however I'm having trouble trying to access data from a file on Drive. In particular, I'm trying to get a WebViewLink, however when I call .get I receive only a small dictionary that has barely any of the file's metadata. The documentation makes it look like all the data should just be there by default but it's not appearing. I couldn't find any way to flag for requesting any additional information.
credentials = get_credentials()
http = credentials.authorize(httplib2.Http())
service = discovery.build('drive', 'v3', http=http)
results = service.files().list(fields="nextPageToken, files(id, name)").execute()
items = results.get('files', [])
if not items:
    print('No files found.')
else:
    print('Files:')
    for item in items:
        print(item['name'], item['id'])
        if "Target File" in item['name']:
            d = service.files().get(fileId=item['id']).execute()
            print(repr(d))
This is the output of the above code: (the formatting is my doing)
{u'mimeType': u'application/vnd.google-apps.document',
u'kind': u'drive#file',
u'id': u'1VO9cC8mGM67onVYx3_2f-SYzLJPR4_LteQzILdWJgDE',
u'name': u'Fix TVP Licence Issues'}
For anyone confused about the code: the missing part is just the basic get_credentials function from the API's quickstart page, plus some constants and imports. For completeness, here's all of that, unmodified in my code:
from __future__ import print_function
import httplib2
import os
from apiclient import discovery
import oauth2client
from oauth2client import client
from oauth2client import tools
SCOPES = 'https://www.googleapis.com/auth/drive'
CLIENT_SECRET_FILE = 'client_secret.json'
APPLICATION_NAME = 'Drive API Python Quickstart'
try:
    import argparse
    flags = argparse.ArgumentParser(parents=[tools.argparser]).parse_args()
except ImportError:
    flags = None
def get_credentials():
    """Gets valid user credentials from storage.
    If nothing has been stored, or if the stored credentials are invalid,
    the OAuth2 flow is completed to obtain the new credentials.
    Returns:
        Credentials, the obtained credential.
    """
    home_dir = os.path.expanduser('~')
    credential_dir = os.path.join(home_dir, '.credentials')
    if not os.path.exists(credential_dir):
        os.makedirs(credential_dir)
    credential_path = os.path.join(credential_dir,
                                   'drive-python-quickstart.json')
    store = oauth2client.file.Storage(credential_path)
    credentials = store.get()
    if not credentials or credentials.invalid:
        flow = client.flow_from_clientsecrets(CLIENT_SECRET_FILE, SCOPES)
        flow.user_agent = APPLICATION_NAME
        if flags:
            credentials = tools.run_flow(flow, store, flags)
        else: # Needed only for compatibility with Python 2.6
            credentials = tools.run(flow, store)
        print('Storing credentials to ' + credential_path)
    return credentials
So what's missing, how can I get the API to return all that extra meta data that's just not appearing right now?
You are very close. With the newer version of the Drive API v3, to retrieve other metadata properties, you will have to add the fields parameter to specify additional properties to include in a partial response.
In your case, since you are looking to retrieve the webViewLink property, your request should look something like this:
results = service.files().list(
    pageSize=10, fields="nextPageToken, files(id, name, webViewLink)").execute()
To display your items from the response:
for item in items:
    print('{0} {1} {2}'.format(item['name'], item['id'], item['webViewLink']))
I also suggest trying it out with the API Explorer so you can see which additional metadata properties you would like to display in your response.
Good Luck and Hope this helps ! :)
You explicitly request only the id and name fields in your files.list call. Add webViewLink to the list: results = service.files().list(fields="nextPageToken, files(id, name, webViewLink)").execute(). To retrieve all metadata, files/* should be used. For more information about this performance optimization, see Working with partial resources in the Google Drive docs.
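If the set of properties you need changes often, the fields string can be assembled instead of typed out. A tiny sketch; list_fields is a made-up helper and simply reproduces the string format used in the call above.

```python
def list_fields(*file_fields):
    """Build a `fields` value for files().list() from file property names."""
    return "nextPageToken, files({})".format(", ".join(file_fields))


print(list_fields('id', 'name', 'webViewLink'))
# prints: nextPageToken, files(id, name, webViewLink)
```

The result would be passed as service.files().list(fields=list_fields('id', 'name', 'webViewLink')).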
I have written a custom function to help with getting a sharable web link given a file/folder id. More information can be gotten here
def get_webViewLink_by_id(spreadsheet_id):
    sharable_link_response = drive_service.files().get(fileId=spreadsheet_id, fields='webViewLink').execute()
    return sharable_link_response['webViewLink']
print(get_webViewLink_by_id(spreadsheet_id='10Ik3qXK4wseva20lNGUKUTBzKoywaugi6XOmRUoP-4A'))