Copy paste Google sheet without formulae and just data - python

I want to copy a sheet from a Google sheet to another Google sheet where I want to keep the data and formatting intact but not the formula.
Just want to copy the cell values to another sheet i.e. raw data.
I am using the google sheets api -
spreadsheets().sheets().copyTo(spreadsheetId=spreadsheet_id, sheetId=sheet_id, body={'destination_spreadsheet_id': target_spreadsheet})
but this copies the formulas as well and throws an error.

I believe your goal is as follows.
You want to copy a sheet in a source Spreadsheet to a destination Spreadsheet.
You want to remove the formulas while the cell formats and the cell values are kept.
You want to achieve this using googleapis for python.
In this case, how about the following patterns?
Pattern 1:
In this pattern, the script shown in your question is modified.
from googleapiclient.discovery import build
service = build("sheets", "v4", credentials=creds) # Please use your client.
srcSpreadsheetId = "###" # Please set source Spreadsheet ID.
srcSheetId = "###" # Please set source sheet ID.
dstSpreadsheetId = "###" # Please set destination Spreadsheet ID.
# Copy the sheet (formulas included) to the destination Spreadsheet.
res = service.spreadsheets().sheets().copyTo(spreadsheetId=srcSpreadsheetId, sheetId=srcSheetId, body={"destination_spreadsheet_id": dstSpreadsheetId}).execute()
# Paste the copied sheet's values over itself to remove the formulas.
service.spreadsheets().batchUpdate(spreadsheetId=dstSpreadsheetId, body={"requests": [{"copyPaste": {"source": {"sheetId": res["sheetId"]},"destination": {"sheetId": res["sheetId"]},"pasteType": "PASTE_VALUES"}}]}).execute()
When this script is run, the sheet in the source Spreadsheet is copied to the destination Spreadsheet, formulas included. The batchUpdate method then pastes the copied sheet's values over itself with PASTE_VALUES. As a result, the cell formats and the values are kept and the formulas are removed.
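As a small sketch of the second request in pattern 1, the copyPaste body can be built by a helper. The function name `paste_values_request` is hypothetical (it is not part of the Sheets API); the field names are the ones used in the script above.

```python
def paste_values_request(sheet_id):
    """Build a copyPaste request that overwrites a sheet with its own values."""
    # A range that holds only sheetId covers the whole sheet.
    grid_range = {"sheetId": sheet_id}
    return {
        "copyPaste": {
            "source": grid_range,
            "destination": dict(grid_range),
            "pasteType": "PASTE_VALUES",
        }
    }


# Example request body for batchUpdate; 123456789 is a placeholder sheet ID.
body = {"requests": [paste_values_request(123456789)]}
```

The resulting `body` can be passed to `service.spreadsheets().batchUpdate(spreadsheetId=..., body=body)`.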
Pattern 2:
In this pattern, the order of the steps is changed from pattern 1. With the script in pattern 1, if the source sheet contains formulas that refer to other sheets or other Spreadsheets, those references break after the copy, and the copied sheet has no values. Unfortunately, I couldn't confirm from your question whether this applies to you, so I would like to propose pattern 2 as well.
import time
from googleapiclient.discovery import build
service = build("sheets", "v4", credentials=creds) # Please use your client.
srcSpreadsheetId = "###" # Please set source Spreadsheet ID.
srcSheetId = "###" # Please set source sheet ID.
dstSpreadsheetId = "###" # Please set destination Spreadsheet ID.
# 1. Duplicate the source sheet in the source Spreadsheet as a temporary sheet.
newSheetId = 123456789 # Any integer sheet ID that is not already used in the source Spreadsheet.
service.spreadsheets().batchUpdate(spreadsheetId=srcSpreadsheetId,body={"requests": [{"duplicateSheet": {"sourceSheetId": srcSheetId,"newSheetId": newSheetId}}]}).execute()
time.sleep(3) # Wait for the formulas in the duplicated sheet to be calculated.
# 2. Remove formulas.
service.spreadsheets().batchUpdate(spreadsheetId=srcSpreadsheetId,body={"requests": [{"copyPaste": {"source": {"sheetId": newSheetId},"destination": {"sheetId": newSheetId},"pasteType": "PASTE_VALUES"}}]}).execute()
# 3. Copy the temporary sheet from the source Spreadsheet to the destination Spreadsheet.
service.spreadsheets().sheets().copyTo(spreadsheetId=srcSpreadsheetId,sheetId=newSheetId,body={"destination_spreadsheet_id": dstSpreadsheetId}).execute()
# 4. Delete the temporary sheet from the source Spreadsheet.
service.spreadsheets().batchUpdate(spreadsheetId=srcSpreadsheetId,body={"requests": [{"deleteSheet": {"sheetId": newSheetId}}]}).execute()
When this script is run, the following flow is run.
1. Duplicate the source sheet in the source Spreadsheet as a temporary sheet.
2. Remove the formulas from the temporary sheet.
3. Copy the temporary sheet from the source Spreadsheet to the destination Spreadsheet.
4. Delete the temporary sheet from the source Spreadsheet.
If the copied sheet is missing the values that come from formulas, please increase the wait time in time.sleep(3).
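One possible refinement of the flow above, offered as a sketch rather than a tested recipe: wrapping steps 2 and 3 in try/finally so that the temporary sheet is deleted even if a request fails. The function name `copy_values_only` and the `wait_seconds` parameter are hypothetical; the API calls are the same ones used in the script above.

```python
import time


def copy_values_only(service, src_spreadsheet_id, src_sheet_id,
                     dst_spreadsheet_id, temp_sheet_id, wait_seconds=3):
    """Copy one sheet to another Spreadsheet as values plus formatting."""
    sheets = service.spreadsheets()

    def batch(request):
        # Send a single request to the source Spreadsheet.
        return sheets.batchUpdate(
            spreadsheetId=src_spreadsheet_id, body={"requests": [request]}
        ).execute()

    # 1. Duplicate the source sheet as a temporary sheet.
    batch({"duplicateSheet": {"sourceSheetId": src_sheet_id,
                              "newSheetId": temp_sheet_id}})
    try:
        time.sleep(wait_seconds)  # Give the formulas time to be calculated.
        # 2. Replace the formulas with their values.
        batch({"copyPaste": {"source": {"sheetId": temp_sheet_id},
                             "destination": {"sheetId": temp_sheet_id},
                             "pasteType": "PASTE_VALUES"}})
        # 3. Copy the temporary sheet to the destination Spreadsheet.
        return sheets.sheets().copyTo(
            spreadsheetId=src_spreadsheet_id, sheetId=temp_sheet_id,
            body={"destination_spreadsheet_id": dst_spreadsheet_id},
        ).execute()
    finally:
        # 4. Always delete the temporary sheet from the source Spreadsheet.
        batch({"deleteSheet": {"sheetId": temp_sheet_id}})
```

The function returns the response of copyTo, whose "sheetId" is the ID of the new sheet in the destination Spreadsheet.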
References:
Method: spreadsheets.sheets.copyTo
Method: spreadsheets.batchUpdate

You could use SpreadsheetApp:
const destination = SpreadsheetApp.openById("id1")
const source = SpreadsheetApp.openById("id2")
source.getSheetByName("sheet-name").copyTo(destination)

Related

Reading GoogleSheet with pandas dataframe doing search on it

Do I need to read a Google Sheet with read_excel in order to search its columns in Python?
I must gather data from an entire Google Sheets file. I need to search by sheet name first, then gather information by looking up values in columns.
I started by looking up the two popular solutions on the internet:
The first uses the gspread package. As it relies on service_account.json info, I will not use it.
The second is appropriate for me, but it shows how to export as a CSV file. I need to get the data as an XLSX file.
The code is below:
import pandas as pd
sheet_id=" url "
sheet_name="sample_1"
url=f"https://docs.google...d/{sheet_id}/gviz/tq?tqx=out:csv&sheet={sheet_name}"
I have both the sheet_id and the sheet_name, but I need to export as an XLSX file.
Here I see an example of how to read an Excel file. Is there a way to read a Google spreadsheet as an Excel file?
Using Pandas to pd.read_excel() for multiple worksheets of the same workbook
xls = pd.ExcelFile('excel_file_path.xls')
# Now you can list all sheets in the file
xls.sheet_names
# ['house', 'house_extra', ...]
# to read just one sheet to dataframe:
df = pd.read_excel(file_name, sheet_name="house")
I have no problem reading a google sheet using the method I found here:
Python Read In Google Spreadsheet Using Pandas
spreadsheet_id = "<INSERT YOUR GOOGLE SHEET ID HERE>"
url = f"https://docs.google.com/spreadsheets/d/{spreadsheet_id}/export?format=csv"
df = pd.read_csv(url)
df.to_excel("my_sheet.xlsx")
You need to set the permissions of your sheet though. I found that setting it to "anyone with a link" worked.
UPDATE - based on comments below
If your spreadsheet has multiple tabs and you want to read anything other than the first sheet, you need to specify a sheetID as described here
spreadsheet_id = "<INSERT YOUR GOOGLE spreadsheetId HERE>"
sheet_id = "<INSERT YOUR GOOGLE sheetId HERE>"
url = f"https://docs.google.com/spreadsheets/d/{spreadsheet_id}/export?gid={sheet_id}&format=csv"
df = pd.read_csv(url)
df.to_excel("my_sheet.xlsx")
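The two export URLs above differ only in their query parameters, so one helper can build both. This is a sketch; `export_url` is a hypothetical name, and passing `fmt="xlsx"` is an assumption based on the same export endpoint (the snippets above only show `format=csv`).

```python
from urllib.parse import urlencode


def export_url(spreadsheet_id, fmt="csv", gid=None):
    """Build an export URL for a link-shared Google Sheets file."""
    params = {"format": fmt}
    if gid is not None:
        # gid selects a single tab; for CSV, the first sheet is used without it.
        params["gid"] = gid
    base = f"https://docs.google.com/spreadsheets/d/{spreadsheet_id}/export"
    return f"{base}?{urlencode(params)}"


# SPREADSHEET_ID is a placeholder; replace it with your own ID.
url = export_url("SPREADSHEET_ID", gid=0)
```

The resulting `url` can be passed to `pd.read_csv(url)` as in the snippets above.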

Read GoogleSheet with multiple sheets into pandas

I want to read google sheet with multiple sheets into a (or several) pandas dataframe.
I don't know the sheet names, or the number of sheets in advance.
The trivial attempt fails:
def main():
    path = r"https://docs.google.com/spreadsheets/d/1-MlSisrAxhOyKhrz6S08PG68j667Ym7jGExOyytpCSM/edit?usp=sharing"
    pd.read_excel(path)
fails with
ValueError: Excel file format cannot be determined, you must specify an engine manually.
Trying any format doesn't work.
All answers to this question refer to .csv, meaning a single sheet, or knowing the sheet name in advance.
Same goes for the 1st Google hit for "read google sheet python pandas".
Is there a standard way of doing this?
When your Spreadsheet is publicly shared, in your situation, how about the following sample script?
Sample script:
import openpyxl
import pandas as pd
import requests
from io import BytesIO
spreadsheetId = "###" # Please set your Spreadsheet ID.
url = "https://docs.google.com/spreadsheets/export?exportFormat=xlsx&id=" + spreadsheetId
res = requests.get(url)
data = BytesIO(res.content)
xlsx = openpyxl.load_workbook(filename=data)
for name in xlsx.sheetnames:
    values = pd.read_excel(data, sheet_name=name)
    # do something
In this sample script, the publicly shared Spreadsheet is exported as XLSX data. The exported data is opened and the sheet names are retrieved. Then, each sheet is read into a dataframe.
If you want to retrieve the specific sheets, please filter the sheet names from xlsx.sheetnames.
Note:
If your Spreadsheet is not publicly shared, this thread might be useful. Ref

Using gspread to extract sheet ID

Can't seem to find any answer to this, but are there any functions/methods which can get a worksheet ID?
Currently, my code looks like this:
scope = ['https://spreadsheets.google.com/feeds','https://www.googleapis.com/auth/drive']
....code to authorize credentials goes here....
sheet = client.open(str(self.googleSheetFile)).worksheet(str(self.worksheet))
client.import_csv('abcdefg1234567abcdefg1234567', contents)
but I don't want to hardcode the abcdefg1234567abcdefg1234567. Is there anything I can do, like sheet.id()?
I believe your goal is as follows.
In order to use import_csv, you want to retrieve the Spreadsheet ID from sheet = client.open(str(self.googleSheetFile)).worksheet(str(self.worksheet)).
You want to achieve this using gspread with python.
In this case, you can retrieve the Spreadsheet ID from client.open(str(self.googleSheetFile)). So please modify your script as follows.
From:
sheet = client.open(str(self.googleSheetFile)).worksheet(str(self.worksheet))
client.import_csv('abcdefg1234567abcdefg1234567', contents)
To:
spreadsheet = client.open(str(self.googleSheetFile))
sheet = spreadsheet.worksheet(str(self.worksheet))
client.import_csv(spreadsheet.id, contents)
Note:
The gspread documentation says the following about import_csv, so please be careful with it.
This method removes all other worksheets and then entirely replaces the contents of the first worksheet.
This modified script supposes that you have already been able to get and put values for Google Spreadsheet using Sheets API with gspread.
Reference:
import_csv(file_id, data)

Writing a Json file, cell by cell into a google spreadsheet

I am in the process of automating a process, in which I need to upload some data to a Google spreadsheet.
The data is originally located in a pandas dataframe, which is converted to a JSON file for upload.
The upload itself works, but all the data ends up in every cell: cell A1 contains all the data from the entire pandas dataframe, and so does each other cell in the spreadsheet :/
Of course, what I want to have happen is to place what is cell A1 in the dataframe, as A1 in the Google spreadsheet and so forth down to cell J173.
I am thinking I need to put in some sort of loop to make this happen, but I am not sure how JSON files work, so I am not succeeding in creating this loop.
I hope one of you can help
Below is the code
#Converting data to a json file for upload
csv_data = csv_data.to_json()
#Updating data
cell_list = sheet.range('A1:J173')
for cell in cell_list:
    cell.value = csv_data
sheet.update_cells(cell_list)
Windows 10
Python 3.8
You want to put the data of dataframe to Google Spreadsheet.
In your script, csv_data of csv_data.to_json() is the dataframe.
You want to achieve this using gspread with python.
From your script, I understood like this.
You have already been able to get and put values for Google Spreadsheet using Sheets API.
Pattern 1:
In this pattern, the method of values_update of gspread is used.
Sample script:
spreadsheetId = "###" # Please set the Spreadsheet ID.
sheetName = "Sheet1" # Please set the sheet name.
csv_data = ... # <--- Please set the dataframe.
client = gspread.authorize(credentials)
spreadsheet = client.open_by_key(spreadsheetId)
# The first row is the header, followed by the data rows.
values = [csv_data.columns.values.tolist()]
values.extend(csv_data.values.tolist())
spreadsheet.values_update(sheetName, params={'valueInputOption': 'USER_ENTERED'}, body={'values': values})
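A possible pitfall with pattern 1, stated here as an assumption about your data: if the dataframe contains missing values (NaN), the request body may not serialize to valid JSON. The sketch below builds the same list of values with blanks in place of NaN; `to_sheet_values` is a hypothetical helper, not part of gspread.

```python
import math


def to_sheet_values(columns, rows):
    """Return a header row followed by the data rows, with NaN/None blanked."""
    def clean(value):
        if value is None:
            return ""
        if isinstance(value, float) and math.isnan(value):
            return ""
        return value

    values = [list(columns)]
    values.extend([clean(v) for v in row] for row in rows)
    return values


# Small illustrative input; the second row has a missing score.
values = to_sheet_values(["name", "score"], [["a", 1.5], ["b", float("nan")]])
```

With a dataframe, this could be called as `to_sheet_values(csv_data.columns, csv_data.values.tolist())`.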
Pattern 2:
In this pattern, the library of gspread-dataframe is used.
Sample script:
from gspread_dataframe import set_with_dataframe # Please add this.
spreadsheetId = "###" # Please set the Spreadsheet ID.
sheetName = "Sheet1" # Please set the sheet name.
csv_data = ... # <--- Please set the dataframe.
client = gspread.authorize(credentials)
spreadsheet = client.open_by_key(spreadsheetId)
worksheet = spreadsheet.worksheet(sheetName)
set_with_dataframe(worksheet, csv_data)
References:
values_update
gspread-dataframe

GSpread how to duplicate sheet

After googling and searching on Stack Overflow, I can't find a guide on how to duplicate an existing sheet (an existing template sheet) and save it as another sheet.
As per the docs, there is duplicate_sheet, but I can't manage to get a working example. Can anyone guide me with this?
import gspread
from gspread.models import Cell, Spreadsheet
scope = [
"https://www.googleapis.com/auth/spreadsheets.readonly",
"https://www.googleapis.com/auth/spreadsheets",
"https://www.googleapis.com/auth/drive.readonly",
"https://www.googleapis.com/auth/drive.file",
"https://www.googleapis.com/auth/drive",
]
json_key_absolute_path = "key.json"
credentials = ServiceAccountCredentials.from_json_keyfile_name(json_key_absolute_path, scope)
client = gspread.authorize(credentials)
spreadsheet_client = Spreadsheet(client)
spreadsheet_client.duplicate_sheet("18Qk5bzuA7JOBD8CTgwvKYRiMl_35it5AwcFG2Bi5npo", new_sheet_name="timcard2")
worksheet = client.open("timcard2")
worksheet.share("my_email#google.com", perm_type='user', role='writer')
You want to copy the source Spreadsheet as new Spreadsheet.
You want to achieve this using gspread with python.
You have already been able to get and put values for Google Spreadsheet using Sheets API.
If my understanding is correct, how about this answer?
Issue and solution:
It seems that the duplicate_sheet method of gspread copies a sheet within the same source Spreadsheet. Ref In order to copy the source Spreadsheet as a new Spreadsheet, please use the copy() method of Class Client.
Sample script:
Please modify your script as follows.
From:
client = gspread.authorize(credentials)
spreadsheet_client = Spreadsheet(client)
spreadsheet_client.duplicate_sheet("18Qk5bzuA7JOBD8CTgwvKYRiMl_35it5AwcFG2Bi5npo", new_sheet_name="timcard2")
worksheet = client.open("timcard2")
worksheet.share("my_email#google.com", perm_type='user', role='writer')
To:
client = gspread.authorize(credentials)
client.copy("18Qk5bzuA7JOBD8CTgwvKYRiMl_35it5AwcFG2Bi5npo", title="timcard2", copy_permissions=True)
worksheet = client.open("timcard2")
worksheet.share("my_email#google.com", perm_type='user', role='writer')
When you run the script, the Spreadsheet which has the spreadsheet ID of 18Qk5bzuA7JOBD8CTgwvKYRiMl_35it5AwcFG2Bi5npo is copied as the spreadsheet name of timcard2. And, the permission information of the source Spreadsheet is also copied.
Note:
In this case, when copy_permissions=True is used, the permission information is also copied. So, although I'm not sure about your actual situation, the call to worksheet.share("my_email#google.com", perm_type='user', role='writer') might not be required. Please be careful about this.
References:
duplicate_sheet
copy(file_id, title=None, copy_permissions=False)
Added:
You want to copy one of sheets in Google Spreadsheet.
I could understand like above. For this, the sample script is as follows.
Sample script:
client = gspread.authorize(credentials)
client.copy("18Qk5bzuA7JOBD8CTgwvKYRiMl_35it5AwcFG2Bi5npo", title="timcard2", copy_permissions=True)
ss = client.open("timcard2")
ss.share("my_email#google.com", perm_type='user', role='writer')
delete_sheets = ["Sheet2", "Sheet3", "Sheet4"] # Please set the sheet names you want to delete.
for s in delete_sheets:
    ss.del_worksheet(ss.worksheet(s))
In this sample, the sheets of "Sheet2", "Sheet3", "Sheet4" are deleted from the copied Spreadsheet.
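If it is easier to name the sheet to keep than the sheets to delete, the list can be derived instead of written by hand. This is a sketch; `titles_to_delete` is a hypothetical helper, and it assumes the sheet titles were obtained with something like `[ws.title for ws in ss.worksheets()]`.

```python
def titles_to_delete(all_titles, keep):
    """Return the titles in all_titles that are not in keep, in order."""
    keep = set(keep)
    missing = keep.difference(all_titles)
    if missing:
        # Fail early rather than delete everything because of a typo.
        raise ValueError(f"Sheets not found: {sorted(missing)}")
    return [title for title in all_titles if title not in keep]


# Illustrative titles; only "Sheet1" is kept.
delete_sheets = titles_to_delete(
    ["Sheet1", "Sheet2", "Sheet3", "Sheet4"], ["Sheet1"]
)
```

The resulting `delete_sheets` can be used in the loop above.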
Reference:
del_worksheet(worksheet)
