I would like to retrieve data from a spreadsheet by searching for the spreadsheet by name. How can I do this?
result = service.spreadsheets().values().get(title=spreadsheetTitle, range=rangeName).execute()
From this example, you can open a spreadsheet by its title or url.
Example:
def open(self, title):
    """Opens a spreadsheet, returning a :class:`~gspread.Spreadsheet` instance.

    :param title: A title of a spreadsheet.

    If there's more than one spreadsheet with same title the first one
    will be opened.

    :raises gspread.SpreadsheetNotFound: if no spreadsheet with
                                         specified `title` is found.

    >>> c = gspread.Client(auth=('user@example.com', 'qwertypassword'))
    >>> c.login()
    >>> c.open('My fancy spreadsheet')
    """
    feed = self.get_spreadsheets_feed()
    for elem in feed.findall(_ns('entry')):
        elem_title = elem.find(_ns('title')).text
        if elem_title.strip() == title:
            return Spreadsheet(self, elem)
    else:
        raise SpreadsheetNotFound
You can also check these links:
How do I access (read, write) to Google Sheets spreadsheets with Python?
How do I search Google Spreadsheets?
You can use pygsheets, a Python library for the Google Sheets API v4.
import pygsheets
gc = pygsheets.authorize()
# Open spreadsheet by title
sh = gc.open('my new ssheet')
Given a Google Sheets URL like https://docs.google.com/spreadsheets/d/1dprQgvpy-qHNU5eHDoOUf9qXi6EqwBbsYPKHB_3c/edit#gid=1139845333
How could I use the gspread API to get the name of the sheet?
I mean the name might be sheet1, sheet2, etc.
Thanks!
I believe your goal is as follows.
You want to retrieve the sheet names from a Google Spreadsheet from the URL of https://docs.google.com/spreadsheets/d/###/edit#gid=1139845333.
You want to achieve this using gspread for Python.
In this case, how about the following sample script?
Sample script:
client = gspread.authorize(credentials)
url = "https://docs.google.com/spreadsheets/d/1dprQgvpy-qHNU5eHDoOUf9qXi6EqwBbsYPKHB_3c/edit#gid=1139845333"
spreadsheet = client.open_by_url(url)
sheet_names = [s.title for s in spreadsheet.worksheets()]
print(sheet_names)
In this script, please use your client = gspread.authorize(credentials).
When this script is run, the sheet names are returned as a list.
References:
open_by_url(url)
worksheets()
Added:
About your following new question,
May I know what if I only want the sheet name of a particular one? Usually, for each additional sheet we create, it comes with a series of number at the end (gid=1139845333), I just want the name for that sheet instead of all.
In this case, how about the following sample script?
Sample script:
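client = gspread.authorize(credentials)
url = "https://docs.google.com/spreadsheets/d/1dprQgvpy-qHNU5eHDoOUf9qXi6EqwBbsYPKHB_3c/edit#gid=1139845333"
gid = "1139845333"
# Open the spreadsheet first; the worksheets are looked up on this object
spreadsheet = client.open_by_url(url)
# Keep only the worksheet whose ID matches the gid from the URL
sheet_name = [s.title for s in spreadsheet.worksheets() if str(s.id) == gid]
if len(sheet_name) == 1:
    print(sheet_name)
else:
    print("No sheet of the GID " + gid)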
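The GID does not have to be typed in by hand; it can be read from the URL fragment. The following is a minimal sketch using only the standard library. The helper name gid_from_url is hypothetical and not part of gspread.

```python
from urllib.parse import urlparse, parse_qs


def gid_from_url(url):
    """Return the gid found in the URL fragment as a string, or None if absent."""
    fragment = urlparse(url).fragment        # e.g. "gid=1139845333"
    values = parse_qs(fragment).get("gid")   # parse the fragment like a query string
    return values[0] if values else None


url = "https://docs.google.com/spreadsheets/d/1dprQgvpy-qHNU5eHDoOUf9qXi6EqwBbsYPKHB_3c/edit#gid=1139845333"
print(gid_from_url(url))  # 1139845333
```

The returned string can be compared with str(s.id) in the same way as the hard-coded gid in the sample script.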
I created a Python function for an API call so I no longer have to do that in Power BI. It creates 5 XML files that are then combined into a single CSV file. I would like the function to run on Google Cloud (correct me if this is not a good idea).
I don't think it's possible to create XML files in the function (maybe it's possible to write to a bucket), but ideally I would like to skip the XML file creation and go straight to creating the CSV.
Please find the code for generating the XML files and combining them into a CSV below:
offices = ['NL001', 'NL002', 'NL003', 'NL004', 'NL005']
# For each office: log in, switch office, and write a separate XML file
for office in offices:
    xmlfilename = office + '.xml'
    session.service.SelectCompany(office, _soapheaders={'Header': auth_header})
    proces_url = cluster + r'/webservices/processxml.asmx?wsdl'
    proces = Client(proces_url)
    response = proces.service.ProcessXmlString(query.XML_String, _soapheaders={'Header': auth_header})
    f = open(xmlfilename, 'w')
    f.write(response)
    f.close()
to csv
if os.path.exists('CombinedFinance.csv'):
    os.remove('CombinedFinance.csv')
else:
    print("The file does not exist")
xmlfiles = ['NL001.xml','NL002.xml','NL003.xml','NL004.xml','NL005.xml']
for xmlfile in xmlfiles:
    with open(xmlfile, encoding='windows-1252') as xml_toparse:
        tree = ET.parse(xml_toparse)
        root = tree.getroot()
        columns = [element.attrib['label'] for element in root[0]]
        columns.append('?')
        data = [[field.text for field in row] for row in root[1::]]
        df = pd.DataFrame(data, columns=columns)
        df = df.drop('?', axis=1)
        df.to_csv('CombinedFinance.csv', mode='a', header=not os.path.exists('CombinedFinance.csv'))
Any ideas?
N.B. If I can improve my code, please let me know; I'm just learning all of this.
EDIT: In response to some comments, code now looks like this. When deploying to cloud I get the following error:
ERROR: (gcloud.functions.deploy) OperationError: code=13, message=Function deployment failed due to a health check failure. This usually indicates that your code was built successfully but failed during a test execution. Examine the logs to determine the cause. Try deploying again in a few minutes if it appears to be transient.
My requirements.txt looks like this:
zeep==3.4.0
pandas
Any ideas?
import pandas as pd
import xml.etree.ElementTree as ET
from zeep import Client
import query
import authentication
import os
sessionlogin = r'https://login.twinfield.com/webservices/session.asmx?wsdl'
login = Client(sessionlogin)
auth = login.service.Logon(authentication.username, authentication.password, authentication.organisation)
auth_header = auth['header']['Header']
cluster = auth['body']['cluster']
#Use cluster to create a session:
url_session = cluster + r'/webservices/session.asmx?wsdl'
session = Client(url_session)
#Select a company for the session:
offices = ['NL001', 'NL002', 'NL003', 'NL004', 'NL005']
# For each office: log in, switch office, and append its rows to the CSV
for office in offices:
    session.service.SelectCompany(office, _soapheaders={'Header': auth_header})
    proces_url = cluster + r'/webservices/processxml.asmx?wsdl'
    proces = Client(proces_url)
    response = proces.service.ProcessXmlString(query.XML_String, _soapheaders={'Header': auth_header})
    treetje = ET.ElementTree(ET.fromstring(response))
    root = treetje.getroot()
    columns = [element.attrib['label'] for element in root[0]]
    columns.append('?')
    data = [[field.text for field in row] for row in root[1::]]
    df = pd.DataFrame(data, columns=columns)
    df = df.drop('?', axis=1)
    df.to_csv('/tmp/CombinedFinance.csv', mode='a', header=not os.path.exists('/tmp/CombinedFinance.csv'))
A few things to consider about turning a regular Python script (what you have here) into a Cloud Function:
Cloud Functions respond to events -- either an HTTP request or some other background trigger. You should think about the question "what is going to trigger my function?"
HTTP functions take in a request that corresponds to the incoming request, and must return some sort of HTTP response
The only available part of the filesystem that you can write to is /tmp. You'll have to write all files there during the execution of your function
The filesystem is ephemeral. You can't expect files to stick around between invocations. Any file you create must either be stored elsewhere (like in a GCS bucket) or returned in the HTTP response (if it's an HTTP function)
A Cloud Function has a very specific signature that you'll need to wrap your existing business logic in:
def my_http_function(request):
    # business logic here
    ...
    return "This is the response", 200

def my_background_function(event, context):
    # business logic here
    ...
    # No return necessary
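Putting these points together for the question's use case, the XML responses could be turned into CSV in memory and returned in the HTTP response, so no file needs to be written at all. The following is only a sketch under stated assumptions: fetch_xml_responses and SAMPLE_XML are hypothetical stand-ins for the ProcessXmlString calls, and the XML shape (labelled columns in root[0], data rows after it, with one extra unlabelled field per row) is inferred from the parsing code in the question.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample shaped like the responses parsed in the question.
SAMPLE_XML = (
    "<browse>"
    "<th><td label='Office'/><td label='Amount'/></th>"
    "<tr><td>NL001</td><td>10.50</td><key/></tr>"
    "</browse>"
)


def fetch_xml_responses():
    """Hypothetical stand-in for the SOAP calls; returns one XML string per office."""
    return [SAMPLE_XML]


def xml_strings_to_csv(xml_strings):
    """Combine several XML responses into a single CSV string, entirely in memory."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    header_written = False
    for xml_string in xml_strings:
        root = ET.fromstring(xml_string)
        columns = [element.attrib["label"] for element in root[0]]
        if not header_written:
            writer.writerow(columns)
            header_written = True
        for row in root[1:]:
            # Keep only as many fields as there are labelled columns.
            writer.writerow([field.text for field in row][: len(columns)])
    return buffer.getvalue()


def my_http_function(request):
    csv_text = xml_strings_to_csv(fetch_xml_responses())
    # Body, status code, and headers, returned as a tuple.
    return csv_text, 200, {"Content-Type": "text/csv"}
```

If the CSV has to persist rather than be returned to the caller, the same string could be uploaded to a GCS bucket instead.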
I'm using the code below to update my Google Sheet with data from a PostgreSQL table. The table refreshes frequently, and I need to update the Google Sheet with the latest data from the table.
I'm new to the Google API. I went through posts found via Google and followed all the steps, such as sharing the Google Sheet with the client_email, but it is not working.
There are 3 columns as shown below,
The column headers are in the 3rd row, and I need to update the values from the 4th row onwards.
Below is the current code,
import psycopg2
import gspread
from oauth2client.service_account import ServiceAccountCredentials
import pprint
#Create scope
scope = ['https://spreadsheets.google.com/feeds', 'https://www.googleapis.com/auth/drive']
cnx_psql = psycopg2.connect(host="xxx.xxx.xxx.xx", database="postgres", user="postgres",
password="**********", port="5432")
psql_cursor = cnx_psql.cursor()
meta_query = '''select * from dl.quantity;'''
psql_cursor.execute(meta_query)
results = psql_cursor.fetchall()
cell_values = (results)
creds = ServiceAccountCredentials.from_json_keyfile_name('/Users/User_123/Documents/GS/gsheet_key.json',scope)
client = gspread.authorize(creds)
sheet = client.open('https://docs.google.com/spreadsheets/d/***************').sheet1
pp = pprint.PrettyPrinter()
result = sheet.get_all_record()
for i, val in enumerate(cell_values):
    cell_list[i].value = val
sheet.update_cells(cell_list)
psql_cursor.close()
cnx_psql.close()
Getting the below error,
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/gspread/client.py", line 123, in open
    self.list_spreadsheet_files()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/gspread/utils.py", line 37, in finditem
    return next((item for item in seq if func(item)))
StopIteration
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/Users/User_123/Documents/Googlesheet_update.py", line 30, in <module>
    sheet = client.open('https://docs.google.com/spreadsheets/d/********************').sheet1
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/gspread/client.py", line 131, in open
    raise SpreadsheetNotFound
gspread.exceptions.SpreadsheetNotFound
Your code and comments suggest that you are trying to open the spreadsheet using the full URL, but you're using the open function, which only works with titles.
From the docs:
You can open a spreadsheet by its title as it appears in Google Docs:
sh = gc.open('My poor gym results')
If you want to be specific, use a key (which can be extracted from the spreadsheet’s url):
sht1 = gc.open_by_key('0BmgG6nO_6dprdS1MN3d3MkdPa142WFRrdnRRUWl1UFE')
Or, if you feel really lazy to extract that key, paste the entire spreadsheet’s url
sht2 = gc.open_by_url('https://docs.google.com/spreadsheet/ccc?key=0Bm...FE&hl')
In your case the last example is the way to go, so use client.open_by_url instead of client.open.
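If a script has to accept either a title or a full URL, the choice between the two methods can be made from the string itself. This is a minimal sketch; the helper name open_spreadsheet is hypothetical, and it assumes only that the client object exposes the open and open_by_url methods quoted from the docs above.

```python
def open_spreadsheet(client, reference):
    """Open by URL when `reference` looks like a URL, otherwise by title."""
    if reference.startswith(("http://", "https://")):
        return client.open_by_url(reference)
    return client.open(reference)
```

With an authorized gspread client, open_spreadsheet(client, url).sheet1 would then replace the failing client.open(url).sheet1 call.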
This snippet of code will allow you to connect; from there, you can look at the documentation to complete the rest of your actions.
from oauth2client.service_account import ServiceAccountCredentials
import gspread
scope = ['https://spreadsheets.google.com/feeds', 'https://www.googleapis.com/auth/drive',
'https://www.googleapis.com/auth/drive.file', 'https://www.googleapis.com/auth/spreadsheets']
#Generate a json file by using service account auth in google developer console
'''
Link: https://console.developers.google.com/
1) Enable API Access for a Project if you haven’t done it yet.
2) Go to “APIs & Services > Credentials” and choose “Create credentials > Service account key”.
3) Fill out the form
4) Click “Create” and “Done”.
5) Press “Manage service accounts” above Service Accounts.
6) Press on ⋮ near recently created service account and select “Manage keys” and then click on “ADD KEY > Create new key”.
7) Select JSON key type and press “Create”.
8) Go to the google sheet and share the sheet with the email from service accounts.
'''
creds = ServiceAccountCredentials.from_json_keyfile_name('mod.json', scope)
client = gspread.authorize(creds)
sheet = client.open_by_url("#Paste your google sheet url here").sheet1
data = sheet.get_all_records()
sheet.update_cell(1, 1, "You made it") #Write this message in first row and first column
print(data)
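For the original goal of writing the query results from the 4th row onwards, the rows returned by fetchall() first need to become a list of lists together with a target range. The following sketch shows one way to compute both. The helper name rows_to_update and the sample rows are hypothetical, and the column arithmetic only covers columns A to Z.

```python
def rows_to_update(rows, first_row=4, first_col="A"):
    """Return (a1_range, values) for writing `rows` starting at `first_row`."""
    values = [list(row) for row in rows]          # tuples -> lists
    if not values:
        return None, []
    width = len(values[0])
    last_col = chr(ord(first_col) + width - 1)    # valid for columns A..Z only
    last_row = first_row + len(values) - 1
    return f"{first_col}{first_row}:{last_col}{last_row}", values


rows = [("apple", 3, 1.5), ("pear", 7, 0.9)]      # same shape as cursor.fetchall()
a1_range, values = rows_to_update(rows)
print(a1_range)  # A4:C5
```

The range and values could then be passed to a range-write call such as gspread's Worksheet.update. The argument order of that method differs between gspread versions, so check the documentation for the version you have installed.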
My use case is to use a script to create/update a sheet on my Google Drive and have it run every day so the data is correct.
My code properly creates the sheet, but each day it runs it creates a different sheet with the same name. I want to add a try/except to check whether the sheet was previously created and, if it was, just overwrite it.
I've spent a couple of hours trying to find an example where someone did this. I'm looking to return the sheet ID, whether the sheet is newly created or previously created.
def create_spreadsheet(sp_name, creds):
    proxy = None
    #Connect to sheet API
    sheets_service = build('sheets', 'v4', http=creds.authorize(httplib2.Http(proxy_info = proxy)))
    #create spreadsheet with title 'sp_title'
    sp_title = sp_name
    spreadsheet_req_body = {
        'properties': {
            'title': sp_title
        }
    }
    spreadsheet = sheets_service.spreadsheets().create(body=spreadsheet_req_body,
                                                       fields='spreadsheetId').execute()
    return spreadsheet.get('spreadsheetId')
You want to check whether a file (Spreadsheet) with a specific filename exists in your Google Drive.
If the file exists, you want to return its file ID.
If the file does not exist, you want to create a new Spreadsheet and return its file ID.
You want to achieve this using google-api-python-client for Python.
If my understanding is correct, how about this modification? The Drive API can confirm whether a file with a specific filename exists. Please think of this as just one of several answers.
Modification points:
In this modification, the Files: list method of the Drive API is used. The file is looked up with a search query.
In this case, the file is searched for by filename and mimeType, excluding files in the trash.
When the file exists, its file ID is returned.
When the file does NOT exist, a new Spreadsheet is created by your script and its file ID is returned.
Modified script:
Please modify your script as follows.
def create_spreadsheet(sp_name, creds):
    proxy = None
    sp_title = sp_name
    # --- I added the script below.
    drive_service = build('drive', 'v3', http=creds.authorize(httplib2.Http(proxy_info = proxy)))
    q = "name='%s' and mimeType='application/vnd.google-apps.spreadsheet' and trashed=false" % sp_title
    files = drive_service.files().list(q=q).execute()
    f = files.get('files')
    if f:
        return f[0]['id']
    # ---
    sheets_service = build('sheets', 'v4', http=creds.authorize(httplib2.Http(proxy_info = proxy)))
    spreadsheet_req_body = {
        'properties': {
            'title': sp_title
        }
    }
    spreadsheet = sheets_service.spreadsheets().create(body=spreadsheet_req_body,
                                                       fields='spreadsheetId').execute()
    return spreadsheet.get('spreadsheetId')
Note:
In this modification, I used https://www.googleapis.com/auth/drive.metadata.readonly as the scope. So please enable the Drive API, add the scope, and delete the file that contains the access token and refresh token; then authorize the scopes by running the script again. This way, the additional scope is reflected in the access token. Please be careful about this.
Reference:
Files: list of Drive API
If I misunderstood your question and this was not the direction you want, I apologize.
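One detail worth guarding against in the search query: a spreadsheet title that contains a single quote would break the q string. Drive search queries expect single quotes and backslashes inside a value to be escaped with a backslash. The following sketch builds the same query with that escaping. The helper name build_name_query is hypothetical.

```python
SPREADSHEET_MIME = "application/vnd.google-apps.spreadsheet"


def build_name_query(title):
    """Build a Files: list query matching a non-trashed spreadsheet by exact name."""
    # Escape backslashes first, then single quotes.
    escaped = title.replace("\\", "\\\\").replace("'", "\\'")
    return "name='%s' and mimeType='%s' and trashed=false" % (escaped, SPREADSHEET_MIME)


print(build_name_query("Bob's report"))
# name='Bob\'s report' and mimeType='application/vnd.google-apps.spreadsheet' and trashed=false
```

In the modified script, q = build_name_query(sp_title) could replace the inline string formatting.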
Has anyone used the importRows() function from the Fusion Tables API?
According to the API reference below,
https://developers.google.com/fusiontables/docs/v1/reference/table/importRows
I have to supply CSV data in the request body.
But what exactly should I put in the request body?
My code:
http = getAuthorizedHttp()
DISCOVERYURL = 'https://www.googleapis.com/discovery/v1/apis/{api}/{apiVersion}/rest'
ftable = build('fusiontables', 'v1', discoveryServiceUrl=DISCOVERYURL, http=http)
body = create_ft(CSVFILE,"title here") # the function to load csv file and create the table with columns from csv file.
result = ftable.table().insert(body=body).execute()
print result["tableId"] # good, I have got the id for new created table
# I have no idea how to go on here..
f = ftable.table().importRows(tableId=result["tableId"])
f.body = ?????????????
f.execute()
I finally fixed my problem, my code can be found in the following link.
https://github.com/childnotfound/parser/blob/master/uploader.py
I fixed the problem like this:
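from googleapiclient.http import MediaFileUpload
# Wrap the CSV file as a media upload and pass it as media_body
media = MediaFileUpload('example.csv', mimetype='application/octet-stream', resumable=True)
response = service.table().importRows(media_body=media, tableId='1cowubQ0vj_H9q3owo1vLM_gMyavvbuoNmRQaYiZV').execute()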