I am trying to download a specific sheet from a spreadsheet on Google Drive but cannot find a way to do so. I am using the Python client library for the Drive API (v3) and passing file_id and mimeType to the export_media() function as shown below:
request = service.files().export_media(fileId=file_id,mimeType='text/csv')
media_request = http.MediaIoBaseDownload(local_fd, request)
This code always exports the first sheet. Can you describe a way to download a specific sheet (or several sheets) by providing the gid or some other parameter?
I don't think the Drive API has a feature to specify a sheet name.
Two workarounds spring to mind...
You could use the Sheets API (https://developers.google.com/sheets/api/reference/rest/) and write your own csv formatter. It sounds more complex than it is. It's probably 10 lines of code, especially if you go for Tab Separated instead of Comma Separated.
Use the Google Sheets File > Publish to the web feature to publish a CSV of any given sheet. Note that the content will be public, so anybody with the link (which is fairly obscure) would be able to read the data.
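For the first workaround (Sheets API plus your own CSV formatter), a minimal sketch could look like the following. The helper names rows_to_csv and download_sheet_as_csv are made up for illustration, and sheets_service is assumed to be a Sheets API v4 client built with the same credentials you already use for Drive.

```python
import csv
import io


def rows_to_csv(rows):
    """Format a list of rows (lists of cell values) as CSV text."""
    if not rows:
        return ""
    # The Sheets API omits trailing empty cells, so pad every row to the same width.
    width = max(len(row) for row in rows)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in rows:
        writer.writerow(list(row) + [""] * (width - len(row)))
    return buffer.getvalue()


def download_sheet_as_csv(sheets_service, spreadsheet_id, sheet_name):
    """Fetch one sheet's values and return them as CSV text.

    Assumption: sheets_service comes from
    googleapiclient.discovery.build('sheets', 'v4', credentials=creds).
    """
    result = sheets_service.spreadsheets().values().get(
        spreadsheetId=spreadsheet_id,
        range=sheet_name,  # a bare sheet name selects every cell on that sheet
    ).execute()
    return rows_to_csv(result.get("values", []))
```

Sheet names that contain spaces or punctuation should be wrapped in single quotes in the range, for example "'My Sheet'".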
You can use an old visualization API URL (see other answer):
f'https://docs.google.com/spreadsheets/d/{doc_id}/gviz/tq?tqx=out:csv&sheet={sheet_name}'
To make this request using the Google API Python library, you can use the credentials you already have and create an HTTP client instance yourself:
http_client = googleapiclient.discovery._auth.authorized_http(creds)
response, content = http_client.request(url)
Check response.status before you proceed.
Note that this API behaves a bit differently from the regular CSV export. In particular, I saw it do some odd things with headers: it drops a header on a numeric column unless the header cell is set to Plain Text (see here), and it merges multiple text rows at the top of the sheet into a single header row.
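Sheet names often contain spaces or characters such as "&", which would break the URL if pasted in as-is. Here is a small sketch of building the URL above with the sheet name percent-encoded; gviz_csv_url is a hypothetical helper name, not part of any Google library.

```python
from urllib.parse import quote, urlencode


def gviz_csv_url(doc_id, sheet_name):
    """Build the visualization-API CSV URL for one sheet of a spreadsheet."""
    # quote_via=quote encodes spaces as %20; safe=":" keeps "out:csv" readable.
    query = urlencode(
        {"tqx": "out:csv", "sheet": sheet_name},
        safe=":",
        quote_via=quote,
    )
    return f"https://docs.google.com/spreadsheets/d/{doc_id}/gviz/tq?{query}"
```

The result can be passed straight to http_client.request(url); the body in content is only worth parsing if response.status is 200.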
Related
I want to use the imgkit.from_url function to download a specific Google sheet for use in my python code.
Problem is that, while access to the sheet via Google sheets api using oauth2 works just fine, in order for from_url to work I need to access the sheet like a human user would. That is logging in with username and password. I.e. I can use from_url just fine but instead of the actual sheet it creates a jpg of the Google login mask.
As a workaround I am able to download the page as a zip file to a temporary folder via the export?format=zip URL parameter using the API, unzip it, find the correct HTML file, use the imgkit.from_file function and convert the result back to a bytes object, but this is a lot more convoluted, requires handling temporary files and folders, and also causes trouble with accessing the CSS file.
Since from_url allows me to circumvent a lot of steps and would work if I was able to access it, I'd very much prefer that method over from_file
So I'm able to use the googlesheets4 package in R to read a Google Sheet, but I'm unable to use pandas.read_csv(url) to read the same sheet. read_csv returns HTML (which seems to be because it is being redirected to an authorization page). When I set the Google Sheet to public, read_csv works.
This is for work, and the sheets are set to anyone in organization and view with link.
Anyone able to help?
My thought is that if a package in R can do it, that there must be a way in Python as well.
Can I read a Google spreadsheet which is open to everyone but doesn't have a Share option? There's a discussion here, but it assumes I have the authorization to click the Share option.
Even copying by URL to my own Google spreadsheet may serve the purpose.
Update:
The idea was once I create a Google API, I should be able to create a .json file with a client email. In the share option, I'm supposed to provide the client email of .json file. You may see: Accessing Google Spreadsheet Data using Python.
This is the spreadsheet page where I'm not finding any Share option: https://docs.google.com/spreadsheets/d/e/2PACX-1vSc_2y5N0I67wDU38DjDh35IZSIS30rQf7_NYZhtYYGU1jJYT6_kDx4YpF-qw0LSlGsBYP8pqM_a1Pd/pubhtml#
Issue:
Publishing the contents of a spreadsheet to the web is not the same as making a spreadsheet public.
The URL you shared refers to spreadsheet contents that were published to the web following these steps. This published website is not the same as the original file where the data comes from, and so it doesn't have most of its functionalities, like a Share button (it doesn't make sense to have a Share button anyway, since this URL is already public).
Solution:
If you want to access the spreadsheet data using a Service Account, you would have to do one of the following (better to use method 1 if you have access to the spreadsheet):
Share the spreadsheet itself (not the published contents) with the Service Account, as explained in the link you referenced.
Use your application to fetch the website contents from the provided URL.
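For the second method, here is a rough sketch that fetches the published contents as CSV instead of scraping the HTML page. It assumes the published link also answers at the ".../pub?output=csv" form of the URL, which is what the Publish to the web dialog offers when CSV is selected; published_csv_url and read_published_rows are hypothetical helper names.

```python
import csv
import io
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def published_csv_url(pubhtml_url, gid=None):
    """Turn a '.../pubhtml' link into the matching '.../pub?output=csv' link."""
    parts = urlsplit(pubhtml_url)
    path = parts.path
    if path.endswith("/pubhtml"):
        path = path[: -len("pubhtml")] + "pub"
    # Assumption: a single sheet is selected with gid=<id>&single=true.
    query = "output=csv" if gid is None else f"gid={gid}&single=true&output=csv"
    return urlunsplit((parts.scheme, parts.netloc, path, query, ""))


def read_published_rows(pubhtml_url, gid=None):
    """Download the published sheet and return its rows as lists of strings."""
    with urlopen(published_csv_url(pubhtml_url, gid)) as response:
        text = response.read().decode("utf-8")
    return list(csv.reader(io.StringIO(text, newline="")))
```

No Service Account or credentials are involved here, because the published contents are already public.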
Reference:
Make Google Docs, Sheets, Slides & Forms public
I wrote a small python program that works with data from a CSV file. I am tracking some numbers in a google sheet and I created the CSV file by downloading the google sheet. I am trying to find a way to have python read in the CSV file directly from google sheets, so that I do not have to download a new CSV when I update the spreadsheet.
I see that the requests library may be able to handle this, but I'm having a hard time figuring it out. I've chosen not to try the google APIs because this way seems simpler as long as I don't mind making the sheet public to those with the link, which is fine.
I've tried working with the requests documentation but I'm a novice programmer and I can't get it to read in as a CSV.
This is how the data is currently taken into python:
file = open('data1.csv', newline='')
reader = csv.reader(file)
I would like the file = open() to ideally be replaced by the requests library and pull directly from the spreadsheet.
You need to find the correct request URL that downloads the file.
Sample URL:
csv_url='https://docs.google.com/spreadsheets/d/169AMdEzYzH7NDY20RCcyf-JpxPSUaO0nC5JRUb8wwvc/export?format=csv&id=169AMdEzYzH7NDY20RCcyf-JpxPSUaO0nC5JRUb8wwvc&gid=0'
The way to do it is to download your file manually while inspecting the request URL in the Network tab of your browser's Developer Tools.
Then the following is enough:
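import requests

csv_url = YOUR_CSV_DOWNLOAD_URL  # placeholder: the export URL found above
res = requests.get(csv_url)
res.raise_for_status()  # stop here if the download failed
# Write the response body to disk, closing the file when done
with open('google.csv', 'wb') as f:
    f.write(res.content)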
This saves the CSV file as 'google.csv' in the current working directory (usually the folder of your Python script).
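If the goal is to replace file = open() entirely, the downloaded text can be handed to csv.reader without writing a file at all. This is a sketch that assumes the export is UTF-8 encoded; csv_rows_from_text and fetch_csv_rows are illustrative names.

```python
import csv
import io


def csv_rows_from_text(text):
    """Parse CSV text into a list of rows, as csv.reader would from a file."""
    # newline="" leaves line endings untouched so csv.reader handles them itself.
    return list(csv.reader(io.StringIO(text, newline="")))


def fetch_csv_rows(csv_url):
    """Download a CSV export URL and return its rows without touching the disk."""
    # Third-party dependency, imported here so the parsing helper works without it.
    import requests

    response = requests.get(csv_url)
    response.raise_for_status()
    # utf-8-sig also strips a byte-order mark if one is present.
    return csv_rows_from_text(response.content.decode("utf-8-sig"))
```

With this, reader = fetch_csv_rows(csv_url) takes the place of the open() and csv.reader() pair, and the rows are always the current contents of the sheet.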
import pandas as pd
import requests
YOUR_SHEET_ID=''
r = requests.get(f'https://docs.google.com/spreadsheet/ccc?key={YOUR_SHEET_ID}&output=csv')
open('dataset.csv', 'wb').write(r.content)
df = pd.read_csv('dataset.csv')
df.head()
I tried @adirmola's solution but had to tweak it a little.
He has a point that you need to find the correct URL for downloading the file. An easy solution is the one shown here: add "&output=csv" after your Google Sheet ID.
Hope it helps!
I'm not exactly sure about your usage scenario, and Adirmola already provided a very exact answer to your question, but my immediate question is why you want to download the CSV in the first place.
Google Sheets has a python library so you can just get the data from the GSheet directly.
You may also be interested in this answer since you're interested in watching for changes in GSheets
I would just like to say using the Oauth keys and google python API is not always an option. I found the above to be quite useful for my current application.
I was testing the Dropbox API for Python. My goal was to read a spreadsheet in my Dropbox without downloading it to local storage.
import dropbox
dbx = dropbox.Dropbox('my-token')
print(dbx.users_get_current_account())
fl = dbx.files_get_preview('/CGPA.xlsx')[1] # returns a Response object
After the above code, fl.text gives the HTML of the preview that a browser would show, and the data can be parsed from it.
My question is whether the SDK has a built-in method for getting specific information from the spreadsheet, such as the data of a row or a cell, preferably in JSON format. I previously used butterdb to extract data from a Google Drive spreadsheet. Is there similar functionality for Dropbox? I could not work it out from the docs: http://dropbox-sdk-python.readthedocs.io/en/master/
No, the Dropbox API doesn't offer the ability to selectively query parts of a spreadsheet file like this without downloading the whole file, but we'll consider it a feature request.
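Since the whole file has to be downloaded anyway, one possible approach is to download it into memory (not to disk) and parse it there. The sketch below is not a feature of the Dropbox SDK: it assumes the third-party openpyxl package for reading .xlsx data, and read_dropbox_xlsx and rows_to_json are made-up helper names.

```python
import io
import json


def rows_to_json(rows):
    """Turn rows (first row = column headers) into a JSON array of objects."""
    rows = list(rows)
    if not rows:
        return "[]"
    header, *data = rows
    return json.dumps([dict(zip(header, row)) for row in data])


def read_dropbox_xlsx(dbx, path):
    """Download an .xlsx file into memory and return the active sheet's rows.

    Assumption: dbx is a dropbox.Dropbox client and openpyxl is installed.
    """
    import openpyxl  # third-party; imported here so rows_to_json works without it

    # files_download returns (metadata, response); the bytes are in response.content.
    _metadata, response = dbx.files_download(path)
    workbook = openpyxl.load_workbook(
        io.BytesIO(response.content), read_only=True, data_only=True
    )
    sheet = workbook.active
    return [list(row) for row in sheet.iter_rows(values_only=True)]
```

For example, rows_to_json(read_dropbox_xlsx(dbx, '/CGPA.xlsx')) would give the sheet as JSON, and a single row or cell can be picked out of the returned list by index.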