Google Spreadsheet to CSV in Google Drive - python

When I upload a CSV file to Google Drive, it is automatically converted to Google Sheets. How can I keep it as a CSV file in Drive? Or can I read the Google Sheet into a pandas DataFrame?
Development environment: Google Colab
Code Snippet:
Input
data = pd.read_csv("ner_dataset.desktop (3dec943a)",
encoding="latin1").fillna(method="ffill")
data.tail(10)
Output
[Desktop Entry]
0 Type=Link
1 Name=ner_dataset
2 URL=https://docs.google.com/spreadsheets/d/1w0...

WORKING CODE
from google.colab import auth
auth.authenticate_user()
import gspread
from oauth2client.client import GoogleCredentials
gc = gspread.authorize(GoogleCredentials.get_application_default())
worksheet = gc.open('Your spreadsheet name').sheet1
# get_all_values gives a list of rows.
rows = worksheet.get_all_values()
print(rows)
# Convert to a DataFrame and render.
import pandas as pd
pd.DataFrame.from_records(rows)
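If the spreadsheet is shared so that anyone with the link can view it, another option may be to let pandas read the sheet's CSV export directly. The sketch below only builds that export URL. The helper name sheet_csv_url and the spreadsheet ID are hypothetical, and it assumes the export?format=csv endpoint is reachable with your sharing settings.

```python
import re


def sheet_csv_url(sheet_url, gid=0):
    """Build a CSV export URL from a Google Sheets URL (hypothetical helper)."""
    match = re.search(r"/spreadsheets/d/([A-Za-z0-9_-]+)", sheet_url)
    if match is None:
        raise ValueError("not a Google Sheets URL: " + sheet_url)
    return ("https://docs.google.com/spreadsheets/d/" + match.group(1)
            + "/export?format=csv&gid=" + str(gid))


# "SPREADSHEET_ID" is a placeholder, not a real ID.
url = sheet_csv_url("https://docs.google.com/spreadsheets/d/SPREADSHEET_ID/edit#gid=0")
print(url)
# In Colab, the result could then be passed to pandas: data = pd.read_csv(url)
```

The gid argument selects the tab; 0 is usually the first sheet, but that is an assumption about your spreadsheet.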

# Mount the Drive
from google.colab import drive
drive.mount('drive')
# Authenticate with your own credentials (fill in yourself)
from pydrive.auth import GoogleAuth
gauth = GoogleAuth()
# Write the DataFrame to CSV and copy it to Drive
df.to_csv('data.csv')
!cp data.csv drive/'your drive'

Related

CSV to Google Sheets python

I have a CSV that I want to put into a Google Sheet, specifically into Sheet3 of a spreadsheet with many sheets. I was hoping someone could help me complete this code. I am using the Google API. So far I have gotten the CSV to upload to Google Drive. Now I would like to change the code to update a specific sheet (Sheet3) of an existing spreadsheet instead of creating a new one. Below is the code I am using to create a new sheet with the CSV data.
# Import Csv to Google Drive
import os
import glob
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
gauth = GoogleAuth()
drive = GoogleDrive(gauth)
# line used to change the directory
os.chdir(r'DIRECTORY OF CSV')
list_of_files = glob.glob(r'DIRECTORY OF CSV\*')  # * matches everything; use *.csv for a specific format
latest_file = max(list_of_files, key=os.path.getctime)
print(latest_file)
upload_file_list = [latest_file]
for upload_file in upload_file_list:
    gfile = drive.CreateFile({'parents': [{'id': 'THE GOOGLE ID'}]})
    # Read file and set it as the content of this instance.
    gfile.SetContentFile(upload_file)
    gfile.Upload()  # Upload the file.
I believe your goal is as follows.
You want to put CSV data into a specific sheet of a Google Spreadsheet.
You want to achieve this using Python.
You have already been able to get and put values for the Spreadsheet using the Sheets API.
In this case, how about the following sample script?
Sample script 1:
When googleapis for python is used, how about the following sample script?
import csv
from googleapiclient.discovery import build

service = build("sheets", "v4", credentials=creds)  # Please use your script for authorization.
spreadsheet_id = "###"  # Please put your Spreadsheet ID.
sheet_name = "Sheet3"  # Please put the name of the sheet you want to use.
csv_file = "###"  # Please put the file path of the CSV file you want to use.
# Read the CSV into a list of rows; the file is closed when the block ends.
with open(csv_file, "r", newline="") as f:
    values = list(csv.reader(f))
request = service.spreadsheets().values().update(spreadsheetId=spreadsheet_id, range=sheet_name, valueInputOption="USER_ENTERED", body={"values": values}).execute()
Sample script 2:
When gspread for python is used, how about the following sample script?
import gspread
import csv
client = gspread.oauth()  # Please use your script for authorization.
spreadsheet_id = "###"  # Please put your Spreadsheet ID.
sheet_name = "Sheet3"  # Please put the name of the sheet you want to use.
csv_file = "###"  # Please put the file path of the CSV file you want to use.
spreadsheet = client.open_by_key(spreadsheet_id)
worksheet = spreadsheet.worksheet(sheet_name)
# Read the CSV into a list of rows; the file is closed when the block ends.
with open(csv_file, "r", newline="") as f:
    values = list(csv.reader(f))
worksheet.update(values)
Note:
In both sample scripts, the CSV data is read from a CSV file on your local PC, converted to a 2-dimensional array, and put into "Sheet3" of the Google Spreadsheet using the Sheets API. The Drive API is not used in these sample scripts.
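As a rough illustration of that 2-dimensional array, the sketch below runs csv.reader over a small in-memory CSV. The sample contents are made up, and io.StringIO only stands in for the opened file.

```python
import csv
import io

# Stand-in for open(csv_file): a small in-memory CSV with made-up contents.
sample = io.StringIO("name,qty\napple,3\npear,5\n")

# Each CSV row becomes one inner list, and every cell stays a string.
values = list(csv.reader(sample))

# This is the shape passed as the "values" field of the request body.
body = {"values": values}
print(body)
```

Every cell is a string at this point. With valueInputOption="USER_ENTERED", the Sheets API interprets the values as if they had been typed into the sheet.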
Reference:
Method: spreadsheets.values.update

Read excel file from google drive without downloading file

I want to read sheets from an Excel file on Google Drive without downloading it to my local machine. I searched the Google Drive API docs but couldn't find a solution. I tried the following code and would appreciate suggestions:
'''
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
import pandas as pd
gauth = GoogleAuth()
gauth.LocalWebserverAuth()
drive = GoogleDrive(gauth)
file_id = 'abc'
file_name = 'abc.xlsx'
downloaded = drive.CreateFile({'id': file_id})
downloaded.GetContentFile(file_name)
class TestCase:
    def __init__(self, file_name, sheet):
        self.file_name = file_name
        self.sheet = sheet
        testcase = pd.read_excel(file_name, usecols=None, sheet_name=sheet)
        print(testcase)
class TestCaseSteps:
    def __init__(self, file_name, sheet):
        self.file_name = file_name
        self.sheet = sheet
        testcase = pd.read_excel(file_name, usecols=None, sheet_name=sheet)
        print(testcase)
testcases = TestCase(file_name, 'A')
steps = TestCaseSteps(file_name, 'B')
'''
I believe your goal and situation are as follows.
You want to read the XLSX data downloaded from Google Drive using pd.read_excel.
You want to achieve this without saving the downloaded XLSX data as a file.
Your gauth = GoogleAuth() can be used for downloading the Google Spreadsheet in XLSX format.
In this case, I would like to propose the following flow.
Download the Google Spreadsheet in XLSX format.
In this case, the script sends a request directly to the endpoint for exporting the Spreadsheet in XLSX format, using the requests library.
The access token is retrieved from gauth = GoogleAuth().
The downloaded XLSX data is read with pd.read_excel.
In this case, BytesIO is used for reading the data.
With this flow, the Spreadsheet is downloaded as XLSX data and read without saving it as a file. When the above flow is reflected in a script, it becomes as follows.
Sample script:
Before you run the script, please set the Spreadsheet ID.
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
import pandas as pd
import requests
from io import BytesIO
spreadsheetId = "###" # <--- Please set the Spreadsheet ID.
# 1. Download the Google Spreadsheet as XLSX format.
gauth = GoogleAuth()
gauth.LocalWebserverAuth()
url = "https://www.googleapis.com/drive/v3/files/" + spreadsheetId + "/export?mimeType=application%2Fvnd.openxmlformats-officedocument.spreadsheetml.sheet"
res = requests.get(url, headers={"Authorization": "Bearer " + gauth.attr['credentials'].access_token})
# 2. The downloaded XLSX data is read with `pd.read_excel`.
sheet = "Sheet1"
values = pd.read_excel(BytesIO(res.content), usecols=None, sheet_name=sheet)
print(values)
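The export URL in the script above can also be assembled with urllib.parse.quote rather than a hand-encoded string. This is only a sketch: export_url is a hypothetical helper and "FILE_ID" is a placeholder, not a real ID.

```python
from urllib.parse import quote

XLSX_MIME = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"


def export_url(file_id, mime_type=XLSX_MIME):
    """Return the Drive v3 export URL for a Google Workspace file (hypothetical helper)."""
    # safe="" makes quote() escape "/" as %2F, matching the URL in the script above.
    return ("https://www.googleapis.com/drive/v3/files/" + quote(file_id, safe="")
            + "/export?mimeType=" + quote(mime_type, safe=""))


print(export_url("FILE_ID"))
```

Passing a different mime_type, for example "text/csv", would request another export format, assuming Drive supports that conversion for the file.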
References:
Download a Google Workspace Document
pandas.read_excel
Added:
The following sample script assumes that the XLSX file has been uploaded to Google Drive, and downloads that XLSX file.
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
import pandas as pd
import requests
from io import BytesIO
file_id = "###" # <--- Please set the file ID of XLSX file.
# 1. Download the XLSX data.
gauth = GoogleAuth()
gauth.LocalWebserverAuth()
url = "https://www.googleapis.com/drive/v3/files/" + file_id + "?alt=media"
res = requests.get(url, headers={"Authorization": "Bearer " + gauth.attr['credentials'].access_token})
# 2. The downloaded XLSX data is read with `pd.read_excel`.
sheet = "Sheet1"
values = pd.read_excel(BytesIO(res.content), usecols=None, sheet_name=sheet)
print(values)

How to implement this Jupyter notebook on Google Colab?

I just started using Google Colab a few hours ago and I'm trying to figure out how to read, write, and save things.
I have this code in a Jupyter notebook, and I'm having trouble with the last part, where I save the file. I want to save it either on my local computer or on Google Drive.
import pandas as pd
pd.set_option('display.max_columns', 999)
#load data
df = pd.read_csv('D:\\Project\\database\\Isolation Forest\\IF 15 PERCENT.csv')
df.shape
#data info
info = df.info()
print(info)
#data description
describe = df.describe() #print(describe)
f = open('D:\\Project\\database\\Isolation Forest\\Final Description IF TEST11.txt', "w+")
print(describe, file=f)
f.close()
and
Google Colab Code:
import pandas as pd
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials
# Authenticate and create the PyDrive client.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)
link = '......'
fluff, id = link.split('=')
print (id) # Verify that you have everything after '='
downloaded = drive.CreateFile({'id':id})
downloaded.GetContentFile('IF 15 PERCENT.csv')
df = pd.read_csv('IF 15 PERCENT.csv',index_col=None)
info = df.info()
print(info)
describe = df.describe()
I don't really know how to save it now as a txt file opened with w+.
Thank you.
This will save your dataframe in text format:
tfile = open('test.txt', 'w+')
tfile.write(describe.to_string())
tfile.close()
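The same write can be done in a with block, so the file is closed even if the write fails. A minimal sketch: save_text is a hypothetical helper and the sample text stands in for describe.to_string(). On Colab, pointing the path at a folder under a mounted Drive (commonly /content/drive/MyDrive after drive.mount('/content/drive'), which is an assumption about your setup) would place the file on Google Drive.

```python
import os
import tempfile


def save_text(text, path):
    """Write text to path, creating or overwriting the file (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)
    return path


# Placeholder text; in the notebook this would be describe.to_string().
report = "count  3.0\nmean   2.0\n"
# A temporary folder keeps the sketch self-contained.
target = os.path.join(tempfile.mkdtemp(), "test.txt")
save_text(report, target)

with open(target, encoding="utf-8") as handle:
    print(handle.read())
```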

How to upload csv file into google drive and read it from same into python

I have a Google Drive to which I have already uploaded my CSV file. The share link for that file is:
https://drive.google.com/open?id=1P_UYUsgvGXUhPCKQiZWlEAynKoeldWEi
I also know the directory of the drive on my PC:
C:/Users/.../Google Drive/
Please give me a step-by-step guide on how to read this particular CSV file directly from Google Drive, without first downloading it to my PC before reading it into Python.
I have searched this forum and tried some of the given solutions, such as:
How to upload csv file (and use it) from google drive into google colaboratory
It did not work for me; it resulted in the error below:
3 from pydrive.auth import GoogleAuth
4 from pydrive.drive import GoogleDrive
----> 5 from google.colab import auth
6 from oauth2client.client import GoogleCredentials
7
ModuleNotFoundError: No module named 'google.colab'
You don't need that much out of that example to upload a file to google drive:
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
# access the drive
gauth = GoogleAuth()
drive = GoogleDrive(gauth)
# the file you want to upload, here simple example
f = drive.CreateFile()
f.SetContentFile('document.txt')
# upload the file
f.Upload()
print('title: %s, mimeType: %s' % (f['title'], f['mimeType']))
# read all files, the newly uploaded file will be there
file_list = drive.ListFile({'q': "'root' in parents and trashed=false"}).GetList()
for file1 in file_list:
    print('title: %s, id: %s' % (file1['title'], file1['id']))
Note: I used a simple placeholder file in this example rather than your existing one; you just have to change it to load the CSV file from the local PC where the Python script is running.
Kind regards
Here is a simple approach I use for all my csv files stored in Google Drive.
First import the necessary libraries that will facilitate your connection.
!pip install -U -q PyDrive
from google.colab import auth
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from oauth2client.client import GoogleCredentials
Next step is authentication and creating the PyDrive client in order to connect to your Drive.
This should give you a link to connect to Google Cloud SDK.
Select the Google Drive account you want to access. Copy the link and paste it onto the text field prompt on your Colab Notebook.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)
To get the file, you will need the id of the file in Google Drive.
downloaded = drive.CreateFile({'id':'1P_UYUsgvGXUhPCKQiZWlEAynKoeldWEi'}) # replace the id with id of the file you want to access
downloaded.GetContentFile('file.csv')
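If you only have the share link, the ID can be pulled out of it in code instead of by hand. The sketch below handles the open?id= form used in the question and, as an assumption, the /file/d/<ID>/view form that Drive also produces. drive_file_id is a hypothetical helper.

```python
from urllib.parse import urlparse, parse_qs


def drive_file_id(link):
    """Extract the file ID from a Drive share link (hypothetical helper)."""
    parsed = urlparse(link)
    query = parse_qs(parsed.query)
    if "id" in query:
        # Links of the form .../open?id=<ID>
        return query["id"][0]
    parts = parsed.path.split("/")
    if "d" in parts and parts.index("d") + 1 < len(parts):
        # Links of the form .../file/d/<ID>/view
        return parts[parts.index("d") + 1]
    raise ValueError("no file ID found in " + link)


print(drive_file_id("https://drive.google.com/open?id=1P_UYUsgvGXUhPCKQiZWlEAynKoeldWEi"))
```

The returned string can be passed as the 'id' value to drive.CreateFile.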
Finally, you can read the file as pandas dataframe.
import pandas as pd
df = pd.read_csv('file.csv')

How to upload csv file (and use it) from google drive into google colaboratory

I wanted to try out Python, and Google Colaboratory seemed the easiest option. I have some files in my Google Drive and wanted to upload them into Google Colaboratory.
Here is the code that I am using:
!pip install -U -q PyDrive
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials
# 1. Authenticate and create the PyDrive client.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)
# 2. Create & upload a file text file.
uploaded = drive.CreateFile({'xyz.csv': 'C:/Users/abc/Google Drive/def/xyz.csv'})
uploaded.Upload()
print('Uploaded file with title {}'.format(uploaded.get('title')))
import pandas as pd
xyz = pd.read_csv('Untitled.csv')
Basically, for user "abc", I wanted to upload the file xyz.csv from the folder "def".
I can upload the file, but when I ask for the title it says the title is "Untitled".
When I ask for the ID of the file that was uploaded, it changes every time, so I cannot use the ID.
How do I read the file, and how do I set a proper file name?
xyz = pd.read_csv('Untitled.csv') doesn't work
xyz = pd.read_csv('Untitled') doesn't work
xyz = pd.read_csv('xyz.csv') doesn't work
Here are some other links that i found..
How to import and read a shelve or Numpy file in Google Colaboratory?
Load local data files to Colaboratory
To read a csv file from my google drive into colaboratory, I needed to do the following steps:
1) I first needed to authorize colaboratory to access my google drive with PyDrive. I used their code example for that. (pasted below)
2) I also needed to log into my drive.google.com to find the target id of the file i wanted to download. I found this by right clicking on the file and copying the shared link for the ID. The id looks something like this: '1BH-rffqv_1auzO7tdubfaOwXzf278vJK'
3) Then I ran downloaded.GetContentFile('myName.csv') - putting in the name i wanted (in your case it is xyz.csv)
This seems to work for me!
I used the code they provided in their example:
# Code to read csv file into colaboratory:
!pip install -U -q PyDrive
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials
# 1. Authenticate and create the PyDrive client.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)
#2. Get the file
downloaded = drive.CreateFile({'id':'1BH-rffqv_1auzO7tdubfaOwXzf278vJK'}) # replace the id with id of file you want to access
downloaded.GetContentFile('xyz.csv')
#3. Read file as panda dataframe
import pandas as pd
xyz = pd.read_csv('xyz.csv')
Okay I'm pretty sure I'm quite late, but I'd like to put this out there, just in case.
I think the easiest way you could do this is by
from google.colab import drive
drive.mount("/content/drive")
This will generate a link. Click on it, sign in using Google OAuth, paste the key into the Colab cell, and you're connected!
Check out the list of available files in the sidebar on the left and copy the path of the file you want to access. Read it as you would any other file.
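To find that path without the sidebar, the mounted folder can be searched like any local directory. A minimal sketch: find_csv_files is a hypothetical helper, and /content/drive/MyDrive is assumed to be where your files appear after drive.mount("/content/drive"), which may differ for shared drives.

```python
from pathlib import Path


def find_csv_files(root):
    """Return all .csv paths under root, sorted by path (hypothetical helper)."""
    return sorted(Path(root).rglob("*.csv"))


# On Colab, after mounting: find_csv_files("/content/drive/MyDrive")
# Here the current directory is searched so the sketch runs anywhere.
for path in find_csv_files("."):
    print(path)
```

Any of the printed paths can be passed straight to pd.read_csv.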
File create takes a file body as its first parameter. If you check the documentation for file create, there are a number of fields you can fill out. In the example below you would add them to file_metadata, comma separated.
from googleapiclient.http import MediaFileUpload

# drive_service is assumed to be an authorized Drive API v3 service object.
file_metadata = {'name': 'photo.jpg'}
media = MediaFileUpload('files/photo.jpg',
                        mimetype='image/jpeg')
file = drive_service.files().create(body=file_metadata,
                                    media_body=media,
                                    fields='id').execute()
I suggest you read the file upload section of the documentation to get a better idea of how upload works and which files can actually be read from within Google Drive. I am not sure that this is going to give you access from Google Colaboratory.
Possible fix for your code.
I am not a python dev but my guess would be you can set your title by doing this.
uploaded = drive.CreateFile({'xyz.csv': 'C:/Users/abc/Google Drive/def/xyz.csv',
                             'name': 'xyz.csv'})
I think it's that simple with this command
# Mount Google Drive
import os
from google.colab import drive
drive.mount('/content/drive')
!pwd
!ls
import pandas as pd
df = pd.read_csv('Untitled.csv')
It will require authorization through Google OAuth and will create an authorization key. Put the key into the Colab cell.
Please be aware that the files in the Google Colab directory are sometimes not up to date with Google Drive after you delete or add files in your Google Drive.
