With pygsheets how to detect if a spreadsheet already exists? - python

Using pygsheets, I was looking around for a good way to open a Google sheet (by title) if it already exists, otherwise create it.
At the same time, I also wanted to make it r/w to myself and r/o to the rest of the world upon creating it.

Here's something that does just that:
import pygsheets
creds_file = "/path/to/your_creds_file.json"
gc = pygsheets.authorize(service_file=creds_file)
sheet_title = "my_google_sheet"
# Try to open the Google sheet based on its title and if it fails, create it
try:
    sheet = gc.open(sheet_title)
    print(f"Opened spreadsheet with id:{sheet.id} and url:{sheet.url}")
except pygsheets.SpreadsheetNotFound as error:
    # Can't find it, so create it
    res = gc.sheet.create(sheet_title)
    sheet_id = res['spreadsheetId']
    sheet = gc.open_by_key(sheet_id)
    print(f"Created spreadsheet with id:{sheet.id} and url:{sheet.url}")
    # Share with self to allow writing to it
    sheet.share('YOUR_EMAIL@gmail.com', role='writer', type='user')
    # Share with everyone for reading
    sheet.share('', role='reader', type='anyone')
# Write something into it
wks = sheet.sheet1
wks.update_value('A1', "something")
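The open-or-create pattern above can be wrapped in a small helper. The following is a minimal sketch, not part of pygsheets: the name open_or_create and its return shape are hypothetical, and it relies only on the three calls already used above (open, sheet.create, open_by_key). The exception class is passed in as an argument, so the helper itself needs no imports or credentials.

```python
def open_or_create(client, title, not_found_error):
    """Open the spreadsheet called `title`, creating it if it does not exist.

    Returns a (spreadsheet, created) tuple; `created` is True only when a
    new spreadsheet had to be made. Hypothetical helper, not a pygsheets API.
    """
    try:
        return client.open(title), False
    except not_found_error:
        # Same calls as in the snippet above: create, then reopen by key.
        res = client.sheet.create(title)
        return client.open_by_key(res["spreadsheetId"]), True


# Assumed usage with an authorized pygsheets client `gc`:
#   sheet, created = open_or_create(gc, "my_google_sheet", pygsheets.SpreadsheetNotFound)
#   if created:
#       sheet.share('', role='reader', type='anyone')
```

Returning the created flag lets the caller apply the sharing rules only when the spreadsheet is new, as in the original snippet.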

Related

Passing a file that hasn't been created, as input -- Python

I'm attempting to create a script that pulls an html table using pandas, doing some other intermediary steps, and then transposing the data into an Excel file.
The problem is, I want to pass the website, the ship's name, and then the subsequent filename that is created into the script but it keeps erroring out stating the file doesn't exist. I know it doesn't exist because it hasn't been created by the program.
Is there a way to run through the script passing the intended filename to be created as input? Thanks!
import pandas as pd
import os
import openpyxl
#This segment of code initially grabs the webpage from the website
webpage = input("Enter the webpage here: ")
ship_name = input("Enter the name of the ship here: ")
df = pd.read_html(webpage, skiprows=[7,14,15,16], index_col=None)
df[0].to_excel(ship_name + ".xlsx")
#This code segment does an initial clean up of the data: Gets rid of copied column data that comes over due to the colspan=3 tag in the original html source code
filename = ("aase.xlsx")
wb = openpyxl.load_workbook(filename)
sheet = wb['Sheet1']
status = sheet.cell(sheet.min_row, 1).value
print(status)
sheet.delete_rows(1)
sheet.delete_cols(3,2)
wb.save(filename)
Instead of calling this:
wb = openpyxl.load_workbook(filename)
Create a blank workbook in memory:
wb = openpyxl.Workbook()
Then save it later.
You could also refer to the "Simple Usage" documentation for a clearer example of this.
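Another way to avoid the missing-file error is to derive the workbook name once from the ship name and reuse it for both writing and loading, rather than hardcoding "aase.xlsx". This is a minimal sketch under that assumption; workbook_path is a hypothetical helper name and the current directory is assumed as the default output location.

```python
from pathlib import Path


def workbook_path(ship_name, directory="."):
    """Build the .xlsx path for a ship so every step uses the same file name."""
    return Path(directory) / f"{ship_name.strip()}.xlsx"


# Assumed usage with the question's variables:
#   filename = workbook_path(ship_name)
#   df[0].to_excel(filename)               # the file is created here
#   wb = openpyxl.load_workbook(filename)  # so it exists when it is loaded
```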

Gspread not clearing correct tab

I have a DataFrame "budget" that I'm trying to upload to a heavy spreadsheet with 22 tabs, more than one of which has RawData in some form in its name: "Raw Data >>", "RawData", "RawData_TargetCompletion"
I have the following code:
class GoogleSheets():
    def __init__(self):
        google_service_account_path = 'some_path'
        scopes = ['https://spreadsheets.google.com/feeds', 'https://www.googleapis.com/auth/drive']
        self.credentials = ServiceAccountCredentials.from_json_keyfile_name(google_service_account_path, scopes)
        self.sheets_connection = gspread.authorize(self.credentials)
    def load_spreadsheet(self, spreadsheet_key):
        self.sheet = self.sheets_connection.open_by_key(spreadsheet_key)
    def load_worksheet(self, worksheet_name):
        self.worksheet = self.sheet.worksheet(worksheet_name)
    def clear_range(self, data_range):
        self.sheet.values_clear(data_range)
spreadsheet_key = "this is a spreadsheet key"
worksheet_name = "RawData"
cell_ref = 'A:AT'
google_sheets = sheets.GoogleSheets()
google_sheets.load_spreadsheet(spreadsheet_key)
google_sheets.load_worksheet(worksheet_name)
google_sheets.clear_range(cell_ref)
google_sheets.upload_dataframe(budget)
My problem is that, in that heavy spreadsheet, it clears the first tab (not RawData) and then updates the RawData sheet.
This exact same code, but with another spreadsheet_key works fine and clears and updates the correct RawData tab regardless of the position of that RawData tab.
But in this heavy one, RawData has to be the first tab in the document because the clear part is not mapping correctly and clears the first tab always.
Is there a problem you see in the code I'm not seeing or have you encountered the same problem when updating heavy spreadsheets?
I believe your goal and situation are as follows.
You want to clear a range using gspread.
You have already been able to use the Sheets API.
Modification points:
According to the gspread documentation, values_clear(range) is a method of the class gspread.models.Spreadsheet, and its range argument is given in A1 notation.
Your script runs self.sheet.values_clear('A:AT'). Because this range does not include a sheet name, the first tab is always used. I think this is the cause of your issue.
To fix it, I propose including the sheet name in the A1 notation passed to values_clear(range).
With these points applied, your script becomes as follows.
Modified script:
From:
google_sheets.clear_range(cell_ref)
To:
google_sheets.clear_range("'{0}'!{1}".format(worksheet_name, cell_ref))
References:
values_clear(range)
A1 notation
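The sheet-qualified range used in the modified script can also be built with a small helper. This is a minimal sketch; qualified_range is a hypothetical name, and doubling any single quote inside the sheet name follows the usual spreadsheet convention for quoted sheet names.

```python
def qualified_range(sheet_name, cell_range):
    """Return an A1-notation range that names its sheet, e.g. 'RawData'!A:AT."""
    # A single quote inside a quoted sheet name is escaped by doubling it.
    escaped = sheet_name.replace("'", "''")
    return f"'{escaped}'!{cell_range}"


# Assumed usage with the question's variables:
#   google_sheets.clear_range(qualified_range(worksheet_name, cell_ref))
```

Quoting the name every time keeps the helper safe for tab names with spaces or symbols, such as "Raw Data >>".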

Using gspread to extract sheet ID

Can't seem to find any answer to this, but are there any functions/methods which can get a worksheet ID?
Currently, my code looks like this:
scope = ['https://spreadsheets.google.com/feeds','https://www.googleapis.com/auth/drive']
....code to authorize credentials goes here....
sheet = client.open(str(self.googleSheetFile)).worksheet(str(self.worksheet))
client.import_csv('abcdefg1234567abcdefg1234567', contents)
but I don't want to hardcode the abcdefg1234567abcdefg1234567. Is there anything I can do, like sheet.id()?
I believe your goal is as follows.
In order to use import_csv, you want to retrieve the Spreadsheet ID from sheet = client.open(str(self.googleSheetFile)).worksheet(str(self.worksheet)).
You want to achieve this using gspread with python.
In this case, you can retrieve the Spreadsheet ID from client.open(str(self.googleSheetFile)). So please modify your script as follows.
From:
sheet = client.open(str(self.googleSheetFile)).worksheet(str(self.worksheet))
client.import_csv('abcdefg1234567abcdefg1234567', contents)
To:
spreadsheet = client.open(str(self.googleSheetFile))
sheet = spreadsheet.worksheet(str(self.worksheet))
client.import_csv(spreadsheet.id, contents)
Note:
The gspread documentation describes this method as follows, so please be careful when using it.
This method removes all other worksheets and then entirely replaces the contents of the first worksheet.
This modified script supposes that you have already been able to get and put values for Google Spreadsheet using Sheets API with gspread.
Reference:
import_csv(file_id, data)

GSpread how to duplicate sheet

After googling and searching on Stack Overflow, I can't find a guide on how to duplicate an existing sheet (an existing template sheet) and save it as another sheet.
As per the docs, there is duplicate_sheet, but I can't manage to get a working example. Can anyone guide me with this?
import gspread
from gspread.models import Cell, Spreadsheet
scope = [
"https://www.googleapis.com/auth/spreadsheets.readonly",
"https://www.googleapis.com/auth/spreadsheets",
"https://www.googleapis.com/auth/drive.readonly",
"https://www.googleapis.com/auth/drive.file",
"https://www.googleapis.com/auth/drive",
]
json_key_absolute_path = "key.json"
credentials = ServiceAccountCredentials.from_json_keyfile_name(json_key_absolute_path, scope)
client = gspread.authorize(credentials)
spreadsheet_client = Spreadsheet(client)
spreadsheet_client.duplicate_sheet("18Qk5bzuA7JOBD8CTgwvKYRiMl_35it5AwcFG2Bi5npo", new_sheet_name="timcard2")
worksheet = client.open("timcard2")
worksheet.share("my_email@google.com", perm_type='user', role='writer')
You want to copy the source Spreadsheet as new Spreadsheet.
You want to achieve this using gspread with python.
You have already been able to get and put values for Google Spreadsheet using Sheets API.
If my understanding is correct, how about this answer?
Issue and solution:
It seems that the duplicate_sheet method of gspread copies a sheet within the same source Spreadsheet. To copy the source Spreadsheet as a new Spreadsheet, please use the copy() method of the Client class.
Sample script:
Please modify your script as follows.
From:
client = gspread.authorize(credentials)
spreadsheet_client = Spreadsheet(client)
spreadsheet_client.duplicate_sheet("18Qk5bzuA7JOBD8CTgwvKYRiMl_35it5AwcFG2Bi5npo", new_sheet_name="timcard2")
worksheet = client.open("timcard2")
worksheet.share("my_email@google.com", perm_type='user', role='writer')
To:
client = gspread.authorize(credentials)
client.copy("18Qk5bzuA7JOBD8CTgwvKYRiMl_35it5AwcFG2Bi5npo", title="timcard2", copy_permissions=True)
worksheet = client.open("timcard2")
worksheet.share("my_email@google.com", perm_type='user', role='writer')
When you run the script, the Spreadsheet which has the spreadsheet ID of 18Qk5bzuA7JOBD8CTgwvKYRiMl_35it5AwcFG2Bi5npo is copied as the spreadsheet name of timcard2. And, the permission information of the source Spreadsheet is also copied.
Note:
When copy_permissions=True is used, the permission information is copied as well. So, although I'm not sure about your actual situation, the call worksheet.share("my_email@google.com", perm_type='user', role='writer') might not be required. Please keep this in mind.
References:
duplicate_sheet
copy(file_id, title=None, copy_permissions=False)
Added:
You want to copy one of the sheets in a Google Spreadsheet.
That is how I understood your request. For this, a sample script is as follows.
Sample script:
client = gspread.authorize(credentials)
client.copy("18Qk5bzuA7JOBD8CTgwvKYRiMl_35it5AwcFG2Bi5npo", title="timcard2", copy_permissions=True)
ss = client.open("timcard2")
ss.share("my_email@google.com", perm_type='user', role='writer')
delete_sheets = ["Sheet2", "Sheet3", "Sheet4"] # Please set the sheet names you want to delete.
for s in delete_sheets:
    ss.del_worksheet(ss.worksheet(s))
In this sample, the sheets of "Sheet2", "Sheet3", "Sheet4" are deleted from the copied Spreadsheet.
Reference:
del_worksheet(worksheet)
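Instead of listing the sheets to delete by hand, the list can be computed from the sheet you want to keep. This is a minimal sketch; titles_to_delete is a hypothetical helper, and the usage comment assumes the copied spreadsheet ss from the sample above.

```python
def titles_to_delete(all_titles, keep):
    """Return every sheet title except `keep`, preserving the original order."""
    if keep not in all_titles:
        # Refuse to continue rather than delete every sheet by mistake.
        raise ValueError(f"sheet {keep!r} not found")
    return [title for title in all_titles if title != keep]


# Assumed usage with the copied spreadsheet `ss`:
#   names = [ws.title for ws in ss.worksheets()]
#   for name in titles_to_delete(names, "Sheet1"):
#       ss.del_worksheet(ss.worksheet(name))
```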

Python code to refresh the connection in individual excel sheet

I am a beginner in Python. I have written a few DBQ statements in Excel to fetch results, which should be refreshed whenever the workbook is opened. I have set the connection properties accordingly.
Below is my Python code for refreshing all connections:
import win32com.client
import time
xl = win32com.client.DispatchEx("Excel.Application")
wb = xl.workbooks.open("D:\\Excel sheets\\Test_consolidation.xlsx")
xl.Visible = True
time.sleep(10)
wb.Refreshall()
I have 3 sheets in the excel file, which has 3 different connections. I want to refresh one after the other.
Can someone help me with the python code to refresh the connections individually ? I would be really grateful for your help.
So if you want to refresh all of them but one after the other, instead of wb.Refreshall(), the command would be:
for conn in wb.connections:
    conn.Refresh()
If you want to link (in a dictionary for example) a connection to a sheet:
dict_conn_sheet = {} # create a new dict
for conn in wb.connections: # iterate over each connection in your excel file
    name_conn = conn.Name # get the name of the connection
    sheet_conn = conn.Ranges(1).Parent.Name # get the name of the sheet linked to this connection
    # add a key (the name of the sheet) and the value (the name of the connection) into the dictionary
    dict_conn_sheet[sheet_conn] = name_conn
Note: this approach does not work well if one sheet has more than one connection, because the dictionary keeps only one connection name per sheet.
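If a sheet can hold several connections, one option is to store a list of connection names per sheet. This is a minimal sketch; group_connections_by_sheet is a hypothetical helper, and it uses the same attributes as the loop above (conn.Name and conn.Ranges(1).Parent.Name).

```python
from collections import defaultdict


def group_connections_by_sheet(connections):
    """Map each sheet name to the list of connection names found on it."""
    by_sheet = defaultdict(list)
    for conn in connections:
        # Same attribute chain as above: first range of the connection -> its sheet.
        sheet_name = conn.Ranges(1).Parent.Name
        by_sheet[sheet_name].append(conn.Name)
    return dict(by_sheet)


# Assumed usage with the workbook `wb` opened above:
#   for name in group_connections_by_sheet(wb.connections).get('Sheet1', []):
#       wb.connections(name).Refresh()
```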
Then, if you want to update only one connection on a specific sheet (in my example it is called Sheet1):
sheet_name = 'Sheet1'
# refresh the connection linked to sheet_name,
# if it exists in the dictionary dict_conn_sheet
wb.connections(dict_conn_sheet[sheet_name]).Refresh()
Finally, if you know directly the name of the connection you want to update (let's say connection_Raj), just enter:
name_conn = 'connection_Raj'
wb.connections(name_conn).Refresh()
I hope this is clear, even if it does not exactly answer your question, as I'm not sure I understood what you want to do.
