Adding rows to existing Excel file - python

I am trying to add a DataFrame to rows below existing cells on an existing .xlsx file with this code:
book = load_workbook(r"C:\path\file_name.xlsx")
writer = pd.ExcelWriter(r"C:\path\file_name.xlsx", engine='openpyxl')
writer.book = book
writer.sheets = {ws.title: ws for ws in book.worksheets}
contract_df.to_excel(writer, startrow = 10, header = False,
sheet_name='UsrLeaseContract')
writer.save()
I manage to add the data, but I am getting the following error when re-opening the file:
Removed Part: /xl/styles.xml part with XML error. (Styles) HRESULT
0x8000ffff Line 1, column 0. Repaired Records: Cell information from
/xl/worksheets/sheet1.xml part
and the detailed XML
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<recoveryLog xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
<logFileName>error344800_01.xml </logFileName>
<summary>
Errors were detected in file 'C:path\file_name.xlsx'
</summary>
<removedParts><removedPart>Removed Part: /xl/styles.xml part with XML error. (Styles) HRESULT 0x8000ffff Line 1, column 0.
</removedPart>
</removedParts><repairedRecords><repairedRecord>Repaired Records: Cell information from /xl/worksheets/sheet1.xml part</repairedRecord>
</repairedRecords></recoveryLog>

Have you tried using openpyxl directly to append the data?
I ran the code below without problems, and Excel showed no warnings when I opened the file afterwards.
from openpyxl import Workbook, load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows
import pandas as pd
# list of strings
vegetables = ['potatoes', 'carrots', 'cabbage']
# Calling DataFrame constructor on list
df = pd.DataFrame(vegetables)
wb = load_workbook('file_name.xlsx')
ws = wb['UsrLeaseContract']
for r in dataframe_to_rows(df, index=True, header=True):
    ws.append(r)
wb.save('file_name.xlsx')
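The question writes the data with header = False, so a variant that appends only the data rows may be closer to what is wanted. The following is a minimal sketch of that idea; the helper name append_frame, the throwaway demo workbook, and its column names are assumptions made for illustration.

```python
import os
import tempfile

import pandas as pd
from openpyxl import Workbook, load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows


def append_frame(path, sheet_name, frame):
    """Append the rows of `frame` below the last used row of an existing sheet."""
    wb = load_workbook(path)
    ws = wb[sheet_name]
    # index=False, header=False: only the data values are written
    for row in dataframe_to_rows(frame, index=False, header=False):
        ws.append(row)
    wb.save(path)


# Demo on a throwaway workbook (hypothetical data, not the asker's file)
demo_path = os.path.join(tempfile.mkdtemp(), "demo.xlsx")
wb = Workbook()
ws = wb.active
ws.title = "UsrLeaseContract"
ws.append(["name", "qty"])
wb.save(demo_path)

new_rows = pd.DataFrame({"name": ["potatoes", "carrots"], "qty": [3, 5]})
append_frame(demo_path, "UsrLeaseContract", new_rows)
```

Because the workbook is loaded and saved by openpyxl alone, pandas never touches the existing styles part of the file.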

Related

ValueError: Sheet 'Sheet1' already exists and if_sheet_exists is set to 'error'

I am trying to create an Excel file with 3 columns: system date, time, and a value read from a webpage at that time.
The intention is to build a DataFrame of the 3 values every time the code runs and append it to an existing Excel workbook (with one existing sheet).
I can create the DataFrame on every run, but when I try to append it to the Excel file, it throws this error:
ValueError: Sheet 'Sheet1' already exists and if_sheet_exists is set to 'error'
Can you please suggest where I am going wrong?
# Importing Libraries
from datetime import datetime
import pandas as pd
import requests
from bs4 import BeautifulSoup
import openpyxl
# Get today's date and format it
now = datetime.now()
Date = now.strftime ("%d/%m/%Y")
Time = now.strftime ("%H:%M")
# GET request to scrape. 'Page' variable to assign contents
page = requests.get("https://www.traderscockpit.com/?pageView=live-nse-advance-decline-ratio-chart")
# Create BeautifulSoup object to parse content
soup = BeautifulSoup(page.content, 'html.parser')
adv = soup.select_one('a:-soup-contains("Advanced:")').next_sibling.strip()
dec = soup.select_one('a:-soup-contains("Declined:")').next_sibling.strip()
ADratio = round(int(adv)/int(dec), 2)
df = pd.DataFrame({tuple([Date, Time, ADratio])})
#Load workbook and read last used row
path = r'C:\Users\kashk\OneDrive\Documents\ADratios.xlsx'
writer = pd.ExcelWriter (path, engine='openpyxl', mode = 'a')
wb = openpyxl.load_workbook(path)
startrow = writer.sheets['Sheet1'].max_row
#Append data frame to existing table in existing sheet
df.to_excel (writer, sheet_name = 'Sheet1', index = False, header = False, startrow = startrow)
writer.save()
writer.close()
A fast and easy solution is to upgrade pandas to 1.4.0 or later, which adds the if_sheet_exists='overlay' option:
pd.ExcelWriter(path, engine='openpyxl', mode='a', if_sheet_exists='overlay')
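Put together with the startrow logic from the question, the overlay approach might look like the sketch below. It assumes pandas 1.4.0 or later with openpyxl installed; the temporary file and the sample rows are made up so that the snippet runs on its own.

```python
import os
import tempfile

import pandas as pd
from openpyxl import Workbook, load_workbook

# Throwaway workbook standing in for ADratios.xlsx (hypothetical demo data)
path = os.path.join(tempfile.mkdtemp(), "ADratios_demo.xlsx")
wb = Workbook()
ws = wb.active
ws.title = "Sheet1"
ws.append(["01/01/2022", "09:30", 1.25])
wb.save(path)

df = pd.DataFrame([["01/01/2022", "09:45", 1.31]])

# mode='a' keeps the existing sheets; 'overlay' writes into the existing sheet
with pd.ExcelWriter(path, engine="openpyxl", mode="a", if_sheet_exists="overlay") as writer:
    # max_row is 1-based and startrow is 0-based, so this targets the next free row
    startrow = writer.sheets["Sheet1"].max_row
    df.to_excel(writer, sheet_name="Sheet1", index=False, header=False, startrow=startrow)
```

The with block saves and closes the file on exit, so no separate writer.save() or writer.close() call is needed.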
If you don't want to upgrade pandas, you can work around it by removing the sheet and rewriting it into the Excel file. (Not recommended if you have a lot of records, since it will be slow.)
path, sheet_name = 'ADratios.xlsx', 'Sheet1'
df.columns = ['Date', 'Time', 'ADratio']
with pd.ExcelWriter(path, engine='openpyxl', mode='a', if_sheet_exists='replace') as writer:
    # read the existing rows before the old sheet is removed
    df_bak = pd.read_excel(path)
    writer.book = openpyxl.load_workbook(path)
    # drop the old sheet so it can be written again with the combined data
    writer.book.remove(writer.book.worksheets[writer.book.sheetnames.index(sheet_name)])
    writer.sheets = {ws.title: ws for ws in writer.book.worksheets}
    pd.concat([df_bak, df], axis=0).to_excel(writer, sheet_name=sheet_name, index=False)

Create Excel Tables from Dictionary of Dataframes

I have a dictionary of dataframes.
dd = {
'table': pd.DataFrame({'Name':['Banana'], 'color':['Yellow'], 'type':'Fruit'}),
'another_table':pd.DataFrame({'city':['Atlanta'],'state':['Georgia'], 'Country':['United States']}),
'and_another_table':pd.DataFrame({'firstname':['John'], 'middlename':['Patrick'], 'lastnme':['Snow']}),
}
I would like to create an Excel file which contains Excel Table objects created from these dataframes. Each Table needs to be on a separate Tab/Sheet and Table names should match dataframe names.
Is this possible to do with Python?
So far I have only been able to export the data to Excel as plain cells, without converting it to tables, using xlsxwriter:
writer = pd.ExcelWriter('Results.xlsx', engine='xlsxwriter')
for sheet, frame in dd.items():
    frame.to_excel(writer, sheet_name = sheet)
writer.save()
For writing multiple sheets from Pandas, use the openpyxl library. In addition, to prevent overwriting, set the workbook sheets before each update.
Try this code:
import pandas as pd
import openpyxl
dd = {
'table': pd.DataFrame({'Name':['Banana'], 'color':['Yellow'], 'type':'Fruit'}),
'another_table':pd.DataFrame({'city':['Atlanta'],'state':['Georgia'], 'Country':['United States']}),
'and_another_table':pd.DataFrame({'firstname':['John'], 'middlename':['Patrick'], 'lastnme':['Snow']}),
}
filename = 'Results.xlsx' # must exist
wb = openpyxl.load_workbook(filename)
writer = pd.ExcelWriter(filename, engine='openpyxl')
for sheet, frame in dd.items():
    writer.sheets = dict((ws.title, ws) for ws in wb.worksheets) # need this to prevent overwrite
    frame.to_excel(writer, index=False, sheet_name = sheet)
writer.save()
# convert data to tables
wb = openpyxl.load_workbook(filename)
for ws in wb.worksheets:
    mxrow = ws.max_row
    mxcol = ws.max_column
    tab = openpyxl.worksheet.table.Table(displayName=ws.title, ref="A1:" + ws.cell(mxrow,mxcol).coordinate)
    ws.add_table(tab)
wb.save(filename)
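A possible single-pass variant, sketched here under the assumption that the openpyxl engine is used for writing, adds each table while the writer is still open, so the file does not have to exist beforehand or be reloaded. The temporary output path and the TableStyleMedium9 style are illustrative choices, not something the question asked for.

```python
import os
import tempfile

import pandas as pd
from openpyxl import load_workbook
from openpyxl.utils import get_column_letter
from openpyxl.worksheet.table import Table, TableStyleInfo

dd = {
    "table": pd.DataFrame({"Name": ["Banana"], "color": ["Yellow"], "type": ["Fruit"]}),
    "another_table": pd.DataFrame({"city": ["Atlanta"], "state": ["Georgia"], "Country": ["United States"]}),
}

# Hypothetical output location so the sketch does not overwrite a real Results.xlsx
out_path = os.path.join(tempfile.mkdtemp(), "Results_demo.xlsx")

with pd.ExcelWriter(out_path, engine="openpyxl") as writer:
    for sheet, frame in dd.items():
        frame.to_excel(writer, sheet_name=sheet, index=False)
        ws = writer.sheets[sheet]
        # Header row plus data rows, starting at A1
        ref = f"A1:{get_column_letter(frame.shape[1])}{frame.shape[0] + 1}"
        tab = Table(displayName=sheet, ref=ref)
        tab.tableStyleInfo = TableStyleInfo(name="TableStyleMedium9", showRowStripes=True)
        ws.add_table(tab)
```

Table display names must be unique within the workbook and cannot contain spaces, so dictionary keys used as names need to respect that.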

Copy excel sheet from one worksheet to another in Python

All I want to do is copy a worksheet from an excel workbook to another excel workbook in Python.
I want to maintain all formatting (coloured cells, tables etc.)
I have a number of excel files and I want to copy the first sheet from all of them into one workbook. I also want to be able to update the main workbook if changes are made to any of the individual workbooks.
It's a code block that will run every few hours and update the master spreadsheet.
I've tried pandas, but it doesn't maintain formatting and tables.
I've tried openpyxl, to no avail.
I thought the xlwings code below would work:
import xlwings as xw
wb = xw.Book('individual_files\\file1.xlsx')
sht = wb.sheets[0]
new_wb = xw.Book('Master Spreadsheet.xlsx')
new_wb.sheets["Sheet1"] = sht
But I just get the error:
----> 4 new_wb.sheets["Sheet1"] = sht
AttributeError: __setitem__
"file1.xlsx" above is an example first excel file.
"Master Spreadsheet.xlsx" is my master spreadsheet with all individual files.
In the end I did this:
def copyExcelSheet(sheetName):
    # `item` is the path of the source workbook, defined outside this function
    read_from = load_workbook(item)
    #open(destination, 'wb').write(open(source, 'rb').read())
    read_sheet = read_from.active
    write_to = load_workbook("Master file.xlsx")
    write_sheet = write_to[sheetName]
    for row in read_sheet.rows:
        for cell in row:
            new_cell = write_sheet.cell(row=cell.row, column=cell.column,
                                        value=cell.value)
            write_sheet.column_dimensions[get_column_letter(cell.column)].width = read_sheet.column_dimensions[get_column_letter(cell.column)].width
            if cell.has_style:
                new_cell.font = copy(cell.font)
                new_cell.border = copy(cell.border)
                new_cell.fill = copy(cell.fill)
                new_cell.number_format = copy(cell.number_format)
                new_cell.protection = copy(cell.protection)
                new_cell.alignment = copy(cell.alignment)
    write_sheet.merge_cells('C8:G8')
    write_sheet.merge_cells('K8:P8')
    write_sheet.merge_cells('R8:S8')
    write_sheet.add_table(newTable("table1","C10:G76","TableStyleLight8"))
    write_sheet.add_table(newTable("table2","K10:P59","TableStyleLight9"))
    write_to.save('Master file.xlsx')
    read_from.close()
With this to check if the sheet already exists:
#checks if sheet already exists and updates sheet if it does.
def checkExists(sheetName):
    book = load_workbook("Master file.xlsx") # open an Excel file and return a workbook
    if sheetName in book.sheetnames:
        print ("Removing sheet",sheetName)
        del book[sheetName]
    else:
        print ("No sheet ",sheetName," found, will create sheet")
    book.create_sheet(sheetName)
    book.save('Master file.xlsx')
And with this to create new tables:
def newTable(tableName,ref,styleName):
    tableName = tableName + ''.join(random.choices(string.ascii_uppercase + string.digits + string.ascii_lowercase, k=15))
    tab = Table(displayName=tableName, ref=ref)
    # Add a default style with striped rows and banded columns
    tab.tableStyleInfo = TableStyleInfo(name=styleName, showFirstColumn=False,showLastColumn=False, showRowStripes=True, showColumnStripes=True)
    return tab
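The copy loop above hard-codes the merged ranges and sets the column width once per cell. A more general version of the same idea could read the widths and merged ranges from the source sheet instead. The sketch below is one way to do that with openpyxl; the function name copy_sheet_contents and the two in-memory demo workbooks are assumptions for illustration, and tables, images and conditional formatting are not handled.

```python
from copy import copy

from openpyxl import Workbook
from openpyxl.styles import Font


def copy_sheet_contents(src_ws, dst_ws):
    """Copy values, basic cell styles, column widths and merged ranges."""
    for row in src_ws.iter_rows():
        for cell in row:
            new_cell = dst_ws.cell(row=cell.row, column=cell.column, value=cell.value)
            if cell.has_style:
                new_cell.font = copy(cell.font)
                new_cell.border = copy(cell.border)
                new_cell.fill = copy(cell.fill)
                new_cell.number_format = cell.number_format
                new_cell.protection = copy(cell.protection)
                new_cell.alignment = copy(cell.alignment)
    # Widths are stored per column, so copy them once, outside the cell loop
    for letter, dim in src_ws.column_dimensions.items():
        dst_ws.column_dimensions[letter].width = dim.width
    # Re-create the merged ranges found in the source instead of hard-coding them
    for merged in src_ws.merged_cells.ranges:
        dst_ws.merge_cells(merged.coord)


# Demo with two in-memory workbooks (hypothetical content)
src_wb, dst_wb = Workbook(), Workbook()
src, dst = src_wb.active, dst_wb.active
src["A1"] = "Title"
src["A1"].font = Font(bold=True)
src.merge_cells("A1:B1")
src.column_dimensions["A"].width = 25
copy_sheet_contents(src, dst)
```

Styles are copied object by object because openpyxl style objects belong to the workbook they were created in.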
Adapted from this solution, but note that in my (limited) testing (and as observed in the other Q&A), this does not support the After parameter of the Copy method, only Before. If you try to use After, it creates a new workbook instead.
import xlwings as xw
wb = xw.Book('individual_files\\file1.xlsx')
sht = wb.sheets[0]
new_wb = xw.Book('Master Spreadsheet.xlsx')
# copy this sheet into the new_wb *before* Sheet1:
sht.api.Copy(Before=new_wb.sheets['Sheet1'].api)
# now, remove Sheet1 from new_wb
new_wb.sheets['Sheet1'].delete()
This can be done using pywin32 directly. The Before or After parameter needs to be provided (see the api docs), and the parameter needs to be a worksheet <object>, not simply a worksheet Name or index value. So, for example, to add it to the end of an existing workbook:
def copy_sheet_within_excel_file(excel_filename, sheet_name_or_number_to_copy):
    excel_app = win32com_client.gencache.EnsureDispatch('Excel.Application')
    wb = excel_app.Workbooks.Open(excel_filename)
    wb.Worksheets[sheet_name_or_number_to_copy].Copy(After=wb.Worksheets[wb.Worksheets.Count])
    new_ws = wb.ActiveSheet
    return new_ws
As most of my code runs on end-user machines, I don't like to make assumptions whether Excel is open or not so my code determines if Excel is already open (see GetActiveObject), as in:
try:
    excel_app = win32com_client.GetActiveObject('Excel.Application')
except com_error:
    excel_app = win32com_client.gencache.EnsureDispatch('Excel.Application')
And then I also check to see if the workbook is already loaded (see Workbook.FullName). Iterate through the Application.Workbooks testing the FullName to see if the file is already open. If so, grab that wb as your wb handle.
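The workbook check described above could be sketched as follows. The helper name find_open_workbook is hypothetical, and the stand-in objects at the bottom exist only so the snippet runs without Excel; with pywin32, excel_app would be the Excel.Application object obtained earlier.

```python
import os
from types import SimpleNamespace


def find_open_workbook(excel_app, path):
    """Return the already-open workbook whose FullName matches `path`, else None."""
    target = os.path.normcase(os.path.abspath(path))
    for wb in excel_app.Workbooks:
        if os.path.normcase(os.path.abspath(wb.FullName)) == target:
            return wb
    return None


# Stand-in objects mimicking Application.Workbooks, so the sketch runs without Excel
book_a = SimpleNamespace(FullName=os.path.abspath("a.xlsx"))
book_b = SimpleNamespace(FullName=os.path.abspath("b.xlsx"))
fake_app = SimpleNamespace(Workbooks=[book_a, book_b])

found = find_open_workbook(fake_app, "b.xlsx")
missing = find_open_workbook(fake_app, "c.xlsx")
```

When the helper returns None, the caller would fall back to excel_app.Workbooks.Open(excel_filename) as in the function above.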
You might find this helpful for digging around the available Excel APIs directly from pywin32:
def show_python_interface_modules():
    os.startfile(os.path.dirname(win32com_client.gencache.GetModuleForProgID('Excel.Application').__file__))

openpyxl 'Worksheet' object has no attribute 'write'(python)

Sorry for my English. I need to open an xlsx document and write new values after the last used position, but I don't understand how to do it. My algorithm works like this:
Open the xlsx file: l_workbook = load_workbook(old_log_tmp_path)
Get all values from it
Code:
def iter_rows(ws):
    for row in ws.iter_rows():
        yield [cell for cell in row]
Create new xlsm file
Code:
workbook = xlsxwriter.Workbook(tf.name)
worksheet = workbook.add_worksheet()
Copy all values from l_workbook to workbook -> worksheet
But I don't think this is right; I think there must be a simpler way, like this:
l_workbook = load_workbook('EX2cRqM7xi1D.xlsx')
sheet = l_workbook.get_sheet_names()[0]
worksheet = l_workbook.get_sheet_by_name(sheet)
worksheet.write(1, 1, "TEST")
Running that script gave me the error below:
AttributeError: 'Worksheet' object has no attribute 'write'
My question is: How can I open an xlsm file and add new values to it (using openpyxl)?
UPD:
I tried this code, but it does not work:
import openpyxl
workbook = openpyxl.load_workbook('tmp3by148hj.xlsx')
ws = workbook.worksheets[0]
ws.cell(row=1, column=1).value = 'TEST'
You need to write into a cell:
worksheet.cell(row=1, column=1).value = 'TEST'
and finally save your changes:
workbook.save('tmp3by148hj.xlsx')
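Since the goal in the question is to add values after the last used position, a sketch of the whole round trip may help. It assumes a plain .xlsx file; the temporary path and the sample values are invented for the demo.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook

# Throwaway file standing in for the existing log workbook (hypothetical data)
log_path = os.path.join(tempfile.mkdtemp(), "log_demo.xlsx")
wb = Workbook()
wb.active.append(["first", 1])
wb.save(log_path)

workbook = load_workbook(log_path)
ws = workbook.worksheets[0]
# append() writes a whole row directly below the existing data
ws.append(["second", 2])
# The same idea with an explicit cell address: one row past the last used row
ws.cell(row=ws.max_row + 1, column=1).value = "TEST"
# Nothing reaches the file until save() is called
workbook.save(log_path)
```

For a workbook that contains macros, load_workbook accepts keep_vba=True so that the macros survive the save.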

Activate second worksheet with openpyxl

I am trying to activate worksheets in multiple Excel workbooks and write to sheets in both workbooks using Python and openpyxl. I am able to load the second workbook f, but I am unable to write the string Recon to cell G2 of my second workbook.
from openpyxl import Workbook, load_workbook
filename = 'sda_2015.xlsx'
wb = Workbook()
ws = wb.active
ws['G1'] = 'Path'
ws.title = 'Main'
adf = "Dirty Securities 04222015.xlsx"
f = "F:\\ana\\xlmacro\\" + adf
wb2 = load_workbook(f)
"""
wb22 = Workbook(wb2)
ws = wb22.active
ws['G1'] = "Recon2"
ws.title = 'Main2'
"""
print wb2.get_sheet_names()
wb.save(filename)
I commented out the code that is broken.
Update
I adjusted my code following the answer below. The value in cell H1 is written to wb2 in column H, but for some reason the column is hidden. I have tried writing to other columns, but the code still hides multiple columns. There are also occurrences where the code runs and titles ws2 as Main21 even though the value in the code is Main2.
from openpyxl import Workbook, load_workbook
filename = 'sda_2015.xlsx'
wb1 = Workbook()
ws1 = wb1.active
ws1['G1'] = 'Path'
ws1.title = 'Main'
adf = "Dirty Securities 04222015.xlsx"
f = "F:\\ana\\xlmacro\\" + adf
wb2 = load_workbook(f)
ws2 = wb2.active
ws2['H1'] = 'Recon2'
ws2.title = 'Main2'
print wb2.get_sheet_names()
wb1.save(filename)
wb2.save(f)
If you have two workbooks open, wb1 and wb2, you'll also need different names for the various worksheets: ws1 = wb1.active and ws2 = wb2.active.
If you're working with a file with macros, you'll need to set the keep_vba flag to True when opening it in order to preserve the macros.
I experienced the same thing with hidden cells. Eventually, I unpacked the Excel file and looked at the raw XML, and found that not all of the columns had a width dimension. Those without a width were being hidden by Excel.
A quick fix is to do something like this...
for col in 'ABCDEFG':
    if not worksheet.column_dimensions[col].width:
        worksheet.column_dimensions[col].width = 10
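The quick fix can be generalised to every used column rather than a fixed list of letters. The sketch below is an assumption-laden variant: the helper name show_used_columns is made up, and clearing the hidden flag goes beyond what the answer describes, so treat it as an optional extra.

```python
from openpyxl import Workbook
from openpyxl.utils import get_column_letter


def show_used_columns(worksheet, default_width=10):
    """Give every used column a width and clear its hidden flag."""
    for idx in range(1, worksheet.max_column + 1):
        dim = worksheet.column_dimensions[get_column_letter(idx)]
        if not dim.width:
            dim.width = default_width
        # Extra step not in the original fix: explicitly un-hide the column
        dim.hidden = False


# Demo on an in-memory workbook with a deliberately hidden column H
demo_wb = Workbook()
demo_ws = demo_wb.active
demo_ws["H1"] = "Recon2"
demo_ws.column_dimensions["H"].hidden = True
show_used_columns(demo_ws)
```

Calling the helper just before wb2.save(f) would apply it to the sheet that showed the hidden columns.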
