Create Function to write dataframe to google sheets with df.to_numpy() - python

I have a piece of Python code in which I create multiple DataFrames, which I want to write to my Google Drive.
I have defined the following function:
def write_to_google(FileName,dFrame):
    worksheet = gc.open(FileName).sheet1
    wb = gc.open(FileName)
    wsResults = wb.worksheet('Sheet1')
    print("untill here the function seems to operate")
    data = dFrame.to_numpy().tolist() #here I get an error
    headers = dFrame.columns.tolist()
    Data2Write = [headers] + data
    wsResults.update("A1",Data2Write)
dFrame is the DataFrame I want to write to FileName (which is an existing file).
I get the following error on the line data = dFrame.to_numpy().tolist():
AttributeError: 'str' object has no attribute 'to_numpy'
How do I get the .to_numpy part to work with the dFrame argument?
I searched Google but can't find anything on .to_numpy in the context of a function argument.
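The traceback says that dFrame is a str when the function runs, so a likely cause is that the function was called with the DataFrame's name in quotes, or with the two arguments swapped. The following is a minimal sketch, assuming pandas is installed. frame_to_rows is a hypothetical helper (not part of gspread or pandas) that isolates the conversion step so it can be checked without touching Google Sheets.

```python
import pandas as pd


def frame_to_rows(dFrame):
    """Return the header row plus data rows as nested lists (hypothetical helper)."""
    if not isinstance(dFrame, pd.DataFrame):
        # A str here usually means the variable's name was passed in quotes,
        # or the two arguments were given in the wrong order.
        raise TypeError(f"expected a DataFrame, got {type(dFrame).__name__}")
    headers = dFrame.columns.tolist()
    data = dFrame.to_numpy().tolist()
    return [headers] + data


# Illustrative data only.
df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})
rows = frame_to_rows(df)   # pass the object itself, not the string "df"
```

With this check in place, a call such as frame_to_rows("df") fails immediately with a TypeError that names the real problem, instead of the less obvious AttributeError.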

Related

Return the results of a function (input from tkinter) as new variables for use going forward

I am trying to create a function in tkinter that takes in some file path of a csv, converts it to json and that json file is then useable as a pandas dataframe (assigned to a variable) moving forward in the program.
def upload_initial(): # command for uploading 3 csv files, running them and creating json files to work with.
    try:
        df1 = upload_entry.get()
        df2 = upload_entry2.get()
        df3 = upload_entry3.get()
        airports_df = pd.read_csv(df1)
        freq_df = pd.read_csv(df2)
        runways_df = pd.read_csv(df3)
        freq_df.to_json("freq.json")
        airports_df.to_json("airports.json")
        runways_df.to_json("runways.json")
        freq_json = pd.read_json("freq.json")
        runways_json = pd.read_json("runways.json")
        airports_json = pd.read_json("airports.json")
        success_label = tk.Label(text="You have successfully uploaded your files! They have been converted to JSON", foreground='green')
        success_label.pack()
        openMenuNew()
    except FileNotFoundError:
        fileNotFound = tk.Label(text="File does not exist, please enter the full file path", foreground='red')
        fileNotFound.pack()
    return freq_json, runways_json, airports_json
freq_json, runways_json, airports_json = upload_initial()
The code above works for:
taking the data set in from the user input
converting it to json and saving it locally
printing the success message
handling the file error
I then want to use the JSON files (now pandas DataFrames after the conversion in the function) as variables moving forward, but I can't seem to save the variables freq_json, airports_json, and runways_json globally so that I can use them in other functions, access the DataFrames, etc. How do I save those variables from user input for this purpose?
Essentially, can someone explain how to set this up so that I can then call airports_json.head() in another cell and get that head back?
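One common approach is to have the callback store its results in a module-level container rather than return them, because tkinter discards the return value of a Button command. The following is a minimal sketch under stated assumptions: frames is a hypothetical dict, the paths are passed in as an argument instead of being read from the Entry widgets, the JSON round trip is dropped because pd.read_csv already yields a DataFrame, and a temporary CSV with made-up contents stands in for user input.

```python
import os
import tempfile

import pandas as pd

# Module-level container; every function in the program can read from it.
frames = {}


def upload_initial(paths):
    """Load each CSV and keep the resulting DataFrame in `frames`.

    `paths` maps a label to a file path; in the tkinter version these
    values would come from upload_entry.get() and the other entries.
    """
    try:
        loaded = {name: pd.read_csv(path) for name, path in paths.items()}
    except FileNotFoundError:
        return False          # nothing is stored if any file is missing
    frames.update(loaded)
    return True


# Stand-in for user input: write a small CSV to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "airports.csv")
    with open(csv_path, "w") as fh:
        fh.write("ident,name\nKJFK,John F Kennedy Intl\n")
    ok = upload_initial({"airports": csv_path})
    missing = upload_initial({"runways": os.path.join(tmp, "nope.csv")})

# Later, in any other function or cell:
airports_head = frames["airports"].head()
```

Storing the frames only after every file has loaded also avoids returning names that were never assigned, which is what happens in the original function when FileNotFoundError is caught and the return statement still runs.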

AttributeError: 'workbook' object has no attribute 'max_row'

I already referred to this post here, but it has no response.
I am using a public GitHub package here to copy all the formatting from one Excel file to another Excel file.
By formatting, I mean the color, font, freeze panes etc.
style_sheet = load_workbook(filename = 'sty1.xlsx')
data_sheet = load_workbook(filename = 'dat1.xlsx')
copy_styles(style_sheet, data_sheet) # the error happens in this line
The error producing line within copy_styles function is given below
def copy_styles(style_sheet, data_sheet):
    max_matched_row = min(style_sheet.max_row, data_sheet.max_row)
    max_matched_col = min(style_sheet.max_column, data_sheet.max_column)
The full copy_styles function can be found in this github link here
The error that I encounter is given below
AttributeError: 'workbook' object has no attribute 'max_row'
If you want a sample file to test, it can be found in this github issue here
Assuming you want to copy the style of the first sheet in the workbook, you should do:
copy_styles(style_sheet.sheet_by_index(0), data_sheet.sheet_by_index(0))
If you want to copy the style of all worksheets (assuming that they match), just loop over them:
style_wb = open_workbook(filename = 'sty1.xlsx')
data_wb = open_workbook(filename = 'dat1.xlsx')
for sheet_from, sheet_to in zip(style_wb.sheets(), data_wb.sheets()):
    copy_styles(sheet_from, sheet_to)
I changed the variable names on the second example to make it clear that they are workbooks, not sheets.
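Note that load_workbook, max_row, and max_column in the question are openpyxl names, whereas open_workbook, sheet_by_index, and sheets() belong to xlrd. If the workbooks are loaded with openpyxl, the same idea (pass worksheets, not workbooks) could look like the sketch below. It assumes openpyxl is installed, uses in-memory workbooks in place of sty1.xlsx and dat1.xlsx, and replaces the real copy_styles from the linked repository with a hypothetical stand-in that only computes the overlapping region.

```python
from openpyxl import Workbook


def copy_styles(style_sheet, data_sheet):
    """Hypothetical stand-in: only computes the overlapping region."""
    max_matched_row = min(style_sheet.max_row, data_sheet.max_row)
    max_matched_col = min(style_sheet.max_column, data_sheet.max_column)
    return max_matched_row, max_matched_col


# In-memory workbooks stand in for load_workbook('sty1.xlsx') and
# load_workbook('dat1.xlsx'); the cell values are made up.
style_wb = Workbook()
data_wb = Workbook()
style_wb.active.append([1, 2, 3])
style_wb.active.append([4, 5, 6])
data_wb.active.append(["a", "b"])

# Pass worksheets, not workbooks.
first = copy_styles(style_wb.worksheets[0], data_wb.worksheets[0])

# Or pair up all sheets by position.
all_regions = [copy_styles(s, d)
               for s, d in zip(style_wb.worksheets, data_wb.worksheets)]
```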

Uploading pandas dataframe to google spreadsheet

I followed the steps here and here but couldn't upload a pandas dataframe to google sheets.
First I tried the following code:
import gspread
from google.oauth2.service_account import Credentials
scope = ['https://spreadsheets.google.com/feeds',
         'https://www.googleapis.com/auth/drive']
credentials = Credentials.from_service_account_file('my_json_file_name.json', scopes=scope)
gc = gspread.authorize(credentials)
spreadsheet_key = '1FNMkcPz3aLCaWIbrC51lgJyuDFhe2KEixTX1lsdUjOY'
wks_name = 'Sheet1'
d2g.upload(df_qrt, spreadsheet_key, wks_name, credentials=credentials, row_names=True)
The above code returns the following error message: AttributeError: module 'df2gspread' has no attribute 'upload', which doesn't make sense to me, since df2gspread does have a function called upload.
Second, I tried to append my data to a dataframe that I artificially created on the google sheet by just entering the column names. This also didn't work and didn't provide any results.
import gspread_dataframe as gd
ws = gc.open("name_of_file").worksheet("Sheet1")
existing = gd.get_as_dataframe(ws)
updated = existing.append(df_qrt)
gd.set_with_dataframe(ws, updated)
Any help will be appreciated, thanks!
You are not importing the package properly.
Just do this
from df2gspread import df2gspread as d2g
When you convert a worksheet to a DataFrame using
existing = gd.get_as_dataframe(ws)
all the blank columns and rows in the sheet become part of the DataFrame with NaN values, so when you try to append another DataFrame to it, the result is not what you expect because the columns do not match.
Instead, try this to convert the worksheet to a DataFrame:
existing = pd.DataFrame(ws.get_all_records())
When you export a DataFrame to Google Sheets, the index of the DataFrame may be stored in the first column (it happened in my case, but I can't be sure it always does).
If the first column is the index, you can remove it using
existing.drop([''],axis=1,inplace=True)
Then this will work properly:
updated = existing.append(df_qrt)
gd.set_with_dataframe(ws, updated)
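On pandas 2.0 and later, DataFrame.append has been removed, so the append step above raises an AttributeError there. A minimal sketch of the same step with pd.concat follows; the two small frames are made-up stand-ins for the sheet contents and df_qrt.

```python
import pandas as pd

# Illustrative stand-ins for the existing sheet contents and the new data.
existing = pd.DataFrame({"quarter": ["Q1"], "revenue": [100]})
df_qrt = pd.DataFrame({"quarter": ["Q2"], "revenue": [120]})

# Replacement for existing.append(df_qrt); ignore_index=True renumbers
# the rows so the combined frame has a clean 0..n-1 index.
updated = pd.concat([existing, df_qrt], ignore_index=True)
```

The resulting updated frame can then be passed to gd.set_with_dataframe(ws, updated) as before.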

How to edit Excel (xlsx and xlsm) in python

I am very new to Python and this is my first project in python.
What I am doing is:
1. Retrieve the data from SQL Server.
2. Put the data in a predefined Excel template (specific worksheet).
3. If there is any data in this sheet, it should be replaced, and only the column names should remain in the sheet.
4. Another sheet in the Excel template contains a pivot representation of the data from step 2.
5. I need to refresh this pivot with the new data from sheet1.
6. The number of rows in sheet1 can change depending on the data from the database.
I am fine with step 1 but unable to perform the Excel operations.
I tried openpyxl but could not understand much of it.
https://openpyxl.readthedocs.io/en/stable/
code:
from openpyxl import load_workbook
wb2 = load_workbook('CnA_Rec.xlsx')
print (wb2.sheetnames)
rawsheet = wb2.get_sheet_by_name('RawData')
print (rawsheet.cell_range)
Error with above code:
AttributeError: 'Worksheet' object has no attribute 'cell_range'
I can access individual cells but not a range.
I need to select the current range and replace it with new data.
ref link: https://openpyxl.readthedocs.io/en/stable/api/openpyxl.worksheet.cell_range.html
Can anyone point me to an online example or some sample code for this?
So then, let's go for it with openpyxl. Where exactly is your problem? This is a very basic start. We can change this script during the process.
import openpyxl
wb = openpyxl.load_workbook('hello_world.xlsx')
# do magic with openpyxl here and save
ws = wb.worksheets[0]
ws.cell(row=1, column=3).value = 'Hello' # example
ws.cell(row=2, column=3).value = 'World' # example
for i in range(2,20):
    ws.cell(row=i,column=1).value = 'Row:' + str(i)
data = [ws.cell(row=i,column=1).value for i in range(1,11)]
print(data)
wb.save('hello_world.xlsx')
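For the specific steps in the question (find the used range, keep the header row, replace everything below it), a sketch along the following lines may help. It assumes openpyxl is installed and uses an in-memory workbook with made-up column names and rows in place of CnA_Rec.xlsx; with a real file, the workbook would come from load_workbook and be written back with wb.save.

```python
from openpyxl import Workbook

# In-memory stand-in for load_workbook('CnA_Rec.xlsx'); the column
# names and rows are made up for illustration.
wb = Workbook()
ws = wb.active
ws.title = "RawData"
ws.append(["id", "amount"])      # header row
ws.append([1, 10.0])             # old data
ws.append([2, 20.0])

rawsheet = wb["RawData"]         # index by name instead of get_sheet_by_name
old_range = rawsheet.dimensions  # used range as a string such as "A1:B3"

# Drop everything below the header row.
if rawsheet.max_row > 1:
    rawsheet.delete_rows(2, rawsheet.max_row - 1)

# Write the new rows starting at row 2; the row count may differ
# from the old data.
new_rows = [(7, 70.0), (8, 80.0), (9, 90.0)]
for r, row in enumerate(new_rows, start=2):
    for c, value in enumerate(row, start=1):
        rawsheet.cell(row=r, column=c, value=value)
```

A worksheet has no cell_range attribute; the dimensions property together with max_row and max_column covers the "current range" part of the question.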

Reading xls file with Python

import xlrd
cord = xlrd.open_workbook('MT_coordenadas_todas.xls')
id = cord.sheet_by_index(0)
print(id)
When I run my code in the terminal, I get
<xlrd.sheet.Sheet object at 0x7f897e3ecf90>
I want to take the first column, so what should I change in my code?
id is a reference to the sheet object. You need to use values = id.col_values(0) to read the values from the first column of that sheet.
