Append CSV file to Google Sheet - python

I am trying to append this CSV file after the last row with data in this Google Sheet, but I can only overwrite the existing data.
import gspread
from oauth2client.service_account import ServiceAccountCredentials
scope = ["https://spreadsheets.google.com/feeds", 'https://www.googleapis.com/auth/spreadsheets',
         "https://www.googleapis.com/auth/drive.file", "https://www.googleapis.com/auth/drive"]
credentials = ServiceAccountCredentials.from_json_keyfile_name('key.json', scope)
client = gspread.authorize(credentials)
spreadsheet = client.open('upload_data')
with open('gmt2.csv', 'r') as file_obj:
    content = file_obj.read()
client.import_csv(spreadsheet.id, data=content)

When import_csv is used, the entire spreadsheet is overwritten by the CSV data, which is most likely the cause of your issue. In your situation, how about using the append_rows method instead? When your script is modified, it becomes as follows.
Modified script:
import csv

client = gspread.authorize(credentials)
# I modified the script below.
sheetName = "Sheet1"  # Please set the name of the sheet you want to append the CSV data to.
spreadsheet = client.open('upload_data')
worksheet = spreadsheet.worksheet(sheetName)
with open('gmt2.csv') as f:
    content = list(csv.reader(f))
worksheet.append_rows(content, value_input_option="USER_ENTERED")
In this case, the csv module is imported and used to parse the file before appending the rows.
References:
append_rows
Method: spreadsheets.values.append
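If you prefer to call the Sheets API directly through googleapiclient (the spreadsheets.values.append method referenced above), here is a minimal sketch. It assumes your existing credentials object is accepted by build() (otherwise create one with google.oauth2.service_account) and reuses the file name and spreadsheet object from the question.
import csv
from googleapiclient.discovery import build

service = build("sheets", "v4", credentials=credentials)  # reuse your existing credentials
with open("gmt2.csv") as f:
    values = list(csv.reader(f))
# Append the rows after the last row with data on Sheet1.
service.spreadsheets().values().append(
    spreadsheetId=spreadsheet.id,
    range="Sheet1",
    valueInputOption="USER_ENTERED",
    insertDataOption="INSERT_ROWS",
    body={"values": values},
).execute()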

Related

CSV to Google Sheets python

I have a CSV that I want to put into Sheet3 of a Google Sheet that has many sheets. I was hoping someone could help me complete this code. I am using the Google API. So far I have gotten the CSV to upload to Google Drive. Now I would like to change the code to update a specific Google Sheet's Sheet3 instead of creating a new sheet. Below you will find the code that I am using to create a new sheet with the CSV data.
# Import CSV to Google Drive
import os
import glob
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive

gauth = GoogleAuth()
drive = GoogleDrive(gauth)

# Line used to change the directory
os.chdir(r'DIRECTORY OF CSV')
list_of_files = glob.glob('DIRECTORY OF CSV\*')  # * means all; if you need a specific format, use *.csv
latest_file = max(list_of_files, key=os.path.getctime)
print(latest_file)

upload_file_list = [latest_file]
for upload_file in upload_file_list:
    gfile = drive.CreateFile({'parents': [{'id': 'THE GOOGLE ID'}]})
    # Read file and set it as the content of this instance.
    gfile.SetContentFile(upload_file)
    gfile.Upload()  # Upload the file.
I believe your goal is as follows.
You want to put CSV data into a specific sheet of a Google Spreadsheet.
You want to achieve this using Python.
You have already been able to get and put values in the Spreadsheet using the Sheets API.
In this case, how about the following sample script?
Sample script 1:
When googleapis for Python is used, how about the following sample script?
import csv
from googleapiclient.discovery import build

service = build("sheets", "v4", credentials=creds)  # Please use your script for authorization.
spreadsheet_id = "###"  # Please put your Spreadsheet ID.
sheet_name = "Sheet3"  # Please put the name of the sheet you want to use.
csv_file = "###"  # Please put the file path of the CSV file you want to use.

with open(csv_file, "r") as f:
    values = [r for r in csv.reader(f)]
request = service.spreadsheets().values().update(spreadsheetId=spreadsheet_id, range=sheet_name, valueInputOption="USER_ENTERED", body={"values": values}).execute()
Sample script 2:
When gspread for Python is used, how about the following sample script?
import gspread
import csv

client = gspread.oauth(###)  # Please use your script for authorization.
spreadsheet_id = "###"  # Please put your Spreadsheet ID.
sheet_name = "Sheet3"  # Please put the name of the sheet you want to use.
csv_file = "###"  # Please put the file path of the CSV file you want to use.

spreadsheet = client.open_by_key(spreadsheet_id)
worksheet = spreadsheet.worksheet(sheet_name)
with open(csv_file, "r") as f:
    values = [r for r in csv.reader(f)]
worksheet.update(values)
Note:
In both sample scripts, the CSV data is read from a CSV file on your local PC, converted to a 2-dimensional array, and written to "Sheet3" of the Google Spreadsheet using the Sheets API. The Drive API is not used.
Reference:
Method: spreadsheets.values.update

Python - download google sheet - to csv file

I'm looking for a way to save a Google Sheet as a CSV file on my computer.
I tried this:
import gspread
gc = gspread.service_account(filename='client_secret.json')
sh = gc.open("sheets").worksheet("sheet1")
sh.to_csv("exported_file.csv")
How can I make it work?
I would recommend using the gsheets and oauth2client modules for this.
from oauth2client.service_account import ServiceAccountCredentials
import gsheets
my_json = "client_secret.json"
my_sheet_url = f"https://docs.google.com/spreadsheets/d/{insert_id}"
SCOPE = ["https://spreadsheets.google.com/feeds", 'https://www.googleapis.com/auth/spreadsheets',
"https://www.googleapis.com/auth/drive.file", "https://www.googleapis.com/auth/drive"]
CREDS = ServiceAccountCredentials.from_json_keyfile_name(my_json, SCOPE)
sheets = gsheets.Sheets(CREDS)
sheet = sheets.get(my_sheet_url)
sheet[0].to_csv("export.csv")
This saves the first worksheet as a CSV file next to your .py file.
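If you would rather stay with gspread (as in your attempt above), note that a gspread worksheet has no to_csv method, but you can pull its values with get_all_values and write them out with the csv module. A minimal sketch, assuming the same service-account file and sheet names as in the question:
import csv
import gspread

gc = gspread.service_account(filename='client_secret.json')
worksheet = gc.open("sheets").worksheet("sheet1")

# Fetch every row of the worksheet as a list of lists and write it to disk.
rows = worksheet.get_all_values()
with open("exported_file.csv", "w", newline="") as f:
    csv.writer(f).writerows(rows)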

Read excel file from google drive without downloading file

I want to read Excel sheets from an Excel file on Google Drive without downloading it to my local machine. I searched the Google Drive API but couldn't find a solution. I tried the following code; please suggest how to fix it:
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
import pandas as pd

gauth = GoogleAuth()
gauth.LocalWebserverAuth()
drive = GoogleDrive(gauth)

file_id = 'abc'
file_name = 'abc.xlsx'
downloaded = drive.CreateFile({'id': file_id})
downloaded.GetContentFile(file_name)

class TestCase:
    def __init__(self, file_name, sheet):
        self.file_name = file_name
        self.sheet = sheet
        testcase = pd.read_excel(file_name, usecols=None, sheet_name=sheet)
        print(testcase)

class TestCaseSteps:
    def __init__(self, file_name, sheet):
        self.file_name = file_name
        self.sheet = sheet
        testcase = pd.read_excel(file_name, usecols=None, sheet_name=sheet)
        print(testcase)

testcases = TestCase(file_name, 'A')
steps = TestCaseSteps(file_name, 'B')
I believe your goal and situation are as follows.
You want to read the XLSX data downloaded from Google Drive using pd.read_excel.
You want to achieve this without saving the downloaded XLSX data as a file.
Your gauth = GoogleAuth() can be used for downloading the Google Spreadsheet in XLSX format.
In this case, I would like to propose the following flow.
Download the Google Spreadsheet in XLSX format. Here, the script requests the Spreadsheet's XLSX export endpoint directly using the requests library, and the access token is retrieved from gauth = GoogleAuth().
Read the downloaded XLSX data with pd.read_excel. Here, BytesIO is used for reading the data in memory.
With this flow, the Spreadsheet downloaded as XLSX data can be read without saving it to a file. When the above flow is reflected in the script, it becomes as follows.
Sample script:
Before you run the script, please set the Spreadsheet ID.
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
import pandas as pd
import requests
from io import BytesIO
spreadsheetId = "###" # <--- Please set the Spreadsheet ID.
# 1. Download the Google Spreadsheet as XLSX format.
gauth = GoogleAuth()
gauth.LocalWebserverAuth()
url = "https://www.googleapis.com/drive/v3/files/" + spreadsheetId + "/export?mimeType=application%2Fvnd.openxmlformats-officedocument.spreadsheetml.sheet"
res = requests.get(url, headers={"Authorization": "Bearer " + gauth.attr['credentials'].access_token})
# 2. The downloaded XLSX data is read with `pd.read_excel`.
sheet = "Sheet1"
values = pd.read_excel(BytesIO(res.content), usecols=None, sheet_name=sheet)
print(values)
References:
Download a Google Workspace Document
pandas.read_excel
Added:
The following sample script supposes that the XLSX file is stored on Google Drive, and that this XLSX file itself is downloaded (rather than exporting a Spreadsheet).
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
import pandas as pd
import requests
from io import BytesIO
file_id = "###" # <--- Please set the file ID of XLSX file.
# 1. Download the XLSX data.
gauth = GoogleAuth()
gauth.LocalWebserverAuth()
url = "https://www.googleapis.com/drive/v3/files/" + file_id + "?alt=media"
res = requests.get(url, headers={"Authorization": "Bearer " + gauth.attr['credentials'].access_token})
# 2. The downloaded XLSX data is read with `pd.read_excel`.
sheet = "Sheet1"
values = pd.read_excel(BytesIO(res.content), usecols=None, sheet_name=sheet)
print(values)
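If you need every sheet of the workbook at once rather than one sheet at a time, pd.read_excel also accepts sheet_name=None and then returns a dict of DataFrames keyed by sheet name. A small sketch reusing the res object from the script above:
# Read all sheets in one call; the result is {sheet_name: DataFrame, ...}.
all_sheets = pd.read_excel(BytesIO(res.content), sheet_name=None)
for name, df in all_sheets.items():
    print(name)
    print(df)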

How do I get all records from all sheets (tabs) in a google spreadsheet using one API call?

I wanted to retrieve all the records from a Google Spreadsheet in one API call (instead of looping through all the sheets and retrieving them one by one). Currently I'm doing this:
creds = ServiceAccountCredentials.from_json_keyfile_name('path-to-key/key.json', self.scope)
client = gspread.authorize(creds)
spread = client.open("My SpreadSheet")
data_for_sheet_0 = spread.get_worksheet(0).get_all_records()
data_for_sheet_1 = spread.get_worksheet(1).get_all_records()
.
.
.
As you can see, this is not efficient. Is there any way to get all of the sheet data (or the entire spreadsheet as an Iterable of Iterables)? Thanks.
Try this. It uses Spreadsheet.worksheets() to list every sheet and then fetches the records from each one.
Check out the gspread documentation: https://gspread.readthedocs.io/en/latest/user-guide.html#selecting-a-worksheet
creds = ServiceAccountCredentials.from_json_keyfile_name('path-to-key/key.json', self.scope)
client = gspread.authorize(creds)
spread = client.open("My SpreadSheet")

# Get a list of the worksheets inside the spreadsheet.
sheets = spread.worksheets()
for sheet in sheets:
    record = sheet.get_all_records()
    # Do whatever you want
    print(record)
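If your installed gspread version also provides Spreadsheet.values_batch_get (a wrapper around the spreadsheets.values.batchGet endpoint), you can fetch the values of every sheet with a single values request. A hedged sketch, noting that this returns raw rows rather than the header-keyed dicts that get_all_records gives you:
# Assumes your gspread version exposes Spreadsheet.values_batch_get.
worksheets = spread.worksheets()                         # one call to list the sheets
ranges = [f"'{ws.title}'" for ws in worksheets]          # quote titles so spaces are safe
response = spread.values_batch_get(ranges=ranges)        # one call to fetch all the values
all_values = {ws.title: vr.get("values", [])
              for ws, vr in zip(worksheets, response["valueRanges"])}
print(all_values)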

Any way to send an xlsxwriter generated file to azure data lake without writing to local disk?

For purposes of security, I need to move a file to Azure Data Lake storage without writing the file locally. This is an Excel workbook that is being created with the xlsxwriter package. Here is what I have tried, which returns ValueError: Seek only available in read mode:
import pandas as pd
from azure.datalake.store import core, lib, multithread
import xlsxwriter as xl
# Dataframes have undergone manipulation not listed in this code and come from a DB connection
matrix = pd.DataFrame(Database_Query1)
raw = pd.DataFrame(Database_Query2)
# Name datalake path for workbook
dlpath = '/datalake/file/path/file_name.xlsx'
# List store name
store_name = 'store_name_here'
# Create auth token
token = lib.auth(tenant_id= 'tenant_id_here',
client_id= 'client_id_here',
client_secret= 'client_secret_here')
# Create management file system client object
adl = core.AzureDLFileSystem(token, store_name= store_name)
# Create workbook structure
writer = pd.ExcelWriter(adl.open(dlpath, 'wb'), engine= 'xlsxwriter')
matrix.to_excel(writer, sheet_name= 'Compliance')
raw.to_excel(writer, sheet_name= 'Raw Data')
writer.save()
Any ideas? Thanks in advance.
If the data is not monstrously huge, you might consider keeping the bytes in memory and dumping the stream back to your adl:
from io import BytesIO

xlb = BytesIO()
# ... do what you need to do ... #
writer = pd.ExcelWriter(xlb, engine='xlsxwriter')
matrix.to_excel(writer, sheet_name='Compliance')
raw.to_excel(writer, sheet_name='Raw Data')
writer.save()

# Set the cursor of the stream back to the beginning
xlb.seek(0)

with adl.open(dlpath, 'wb') as fl:
    # This part I'm not entirely sure - consult what your adl write methods are
    fl.write(xlb.read())
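If you are on a newer pandas (1.5 or later), ExcelWriter.save() is deprecated in favour of close(), and using the writer as a context manager closes the workbook for you. A minimal sketch of the same approach, reusing the matrix, raw, adl and dlpath objects from the question:
from io import BytesIO
import pandas as pd

xlb = BytesIO()
# Write both DataFrames into the in-memory workbook; the context manager
# finalizes the workbook on exit, replacing the deprecated save().
with pd.ExcelWriter(xlb, engine='xlsxwriter') as writer:
    matrix.to_excel(writer, sheet_name='Compliance')
    raw.to_excel(writer, sheet_name='Raw Data')

# Rewind the stream and hand the bytes to the data lake client.
xlb.seek(0)
with adl.open(dlpath, 'wb') as fl:
    fl.write(xlb.read())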
