Activate hyphenation formatting while generating Excel document with Python - python

I'm using Python to generate an Excel document from a pandas DataFrame.
I can set the column width and text wrapping with workbook.add_format({"text_wrap": True}) and worksheet.set_column(f"{cols[idx]}:{cols[idx]}", 30, format), but I don't know how to activate "Hyphenation active". I can't find it in the docs: https://xlsxwriter.readthedocs.io/format.html
Here is a sample of my code:
import string
import pandas as pd
df = get_pandas_dataframe()
writer = pd.ExcelWriter(path, engine="xlsxwriter")
sheet_name = "abc"
df.to_excel(writer, index=False, sheet_name=sheet_name)
workbook = writer.book
worksheet = writer.sheets[sheet_name]
max_row, max_col = df.shape  # was pdf.shape, an undefined name
format = workbook.add_format({"text_wrap": True})
cols = dict(zip(range(26), list(string.ascii_uppercase)))
for idx, col in enumerate(df):
    worksheet.set_column(f"{cols[idx]}:{cols[idx]}", 30, format)
writer.save()
Any idea?

"Hyphenation Active" isn't an Excel option and hence it isn't supported by XlsxWriter.
You can verify that yourself in LibreOffice by saving an xlsx file with that option set on a cell, closing the file, and then re-opening it. The option will no longer be there.
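A side note on the question's code: the cols dictionary only covers columns A to Z. As far as I know, worksheet.set_column() in XlsxWriter also accepts zero-based numeric indices (first_col, last_col, width, format), which avoids column letters altogether. If letters are still needed, a minimal sketch of the conversion could look like the following. The name col_name is hypothetical; XlsxWriter ships xlsxwriter.utility.xl_col_to_name for the same job.

```python
def col_name(idx):
    """Convert a zero-based column index to an Excel column name.

    Hypothetical helper for illustration: 0 -> "A", 25 -> "Z", 26 -> "AA".
    """
    if idx < 0:
        raise ValueError("column index must be non-negative")
    name = ""
    n = idx + 1  # switch to 1-based numbering for the base-26 conversion
    while n:
        n, rem = divmod(n - 1, 26)
        name = chr(ord("A") + rem) + name
    return name
```

With this, the loop in the question could build its range as f"{col_name(idx)}:{col_name(idx)}" and would keep working past column Z.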

Related

Openpyxl column number formatting

I'm using openpyxl to append formatted DataFrame rows to an existing Excel file, or to create a new one, with the following code:
if os.path.isfile(transformed_file):  # if the file exists, load and append
    workbook = openpyxl.load_workbook(transformed_file)
    sheet = workbook['Sheet1']
    for row in dataframe_to_rows(df, header=False, index=False):
        sheet.append(row)
    workbook.save(transformed_file)
    workbook.close()
else:  # create the Excel file if it doesn't already exist
    with pd.ExcelWriter(path=transformed_file, engine='openpyxl') as writer:
        df.to_excel(writer, index=False, sheet_name='Sheet1')
I need to format column 'G' as a plain number ('0'). At the moment, when I open the Excel file, the values are shown as '1.23E+10'.
How could this be achieved for the sample above? Thank you!
Try the following code and see if it works for you:
from openpyxl import Workbook
wb = Workbook()
ws = wb.active
ws['A1'] = 123455656565464563302589013
ws['B1'] = 123455656565464563302589013
ws['A1'].number_format = '0'  # Number formatting
ws['B1'].number_format = '0.00E+00'  # Scientific formatting
wb.save("formating_test.xlsx")
I found a solution that worked for me. From the documentation I realized that the number format has to be set on each cell, so you iterate over the cells of the column:
for cell in sheet['D']:
    cell.number_format = '0'
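The per-cell loop can be wrapped in a small helper. This is only a sketch: set_column_number_format is a hypothetical name, and it assumes sheet[column_letter] yields the cells of that column from top to bottom, which is how an openpyxl worksheet behaves once the column contains data. The header row is skipped by default on the assumption that row 1 holds column titles.

```python
def set_column_number_format(sheet, column_letter, number_format="0", skip_header=True):
    """Apply one number format to every cell of a column.

    sheet is expected to behave like an openpyxl worksheet:
    sheet["G"] returns the cells of column G in row order.
    Returns the number of cells that were changed.
    """
    cells = list(sheet[column_letter])
    if skip_header:
        cells = cells[1:]  # assume the first row holds the column title
    for cell in cells:
        cell.number_format = number_format
    return len(cells)
```

For the question above, this would be called as set_column_number_format(sheet, "G") after the rows have been appended and before workbook.save(transformed_file).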

Convert DataFrame to excel by preserving existing sheets and increase the column size of excel

I am trying to write a DataFrame to Excel without overwriting the existing sheets.
The solution is to use pd.ExcelWriter with the openpyxl engine, which supports append mode.
Now I also have to increase the column width in the Excel file. When I use pd.ExcelWriter with the XlsxWriter engine for that, it overwrites the remaining sheets.
Openpyxl as an engine:
with pd.ExcelWriter("test.xlsx", engine="openpyxl", mode="a") as writer:
    df.to_excel(writer, sheet_name="name", startrow=num, startcol=num)
XlsxWriter as an engine:
workbook = xlsxwriter.Workbook('test.xlsx')
worksheet = workbook.add_worksheet()
worksheet.set_column(0, 0, 20)
Can someone please suggest to me a solution where I can do both things:
Keep the existing sheets
Increase the column width
You can use your ExcelWriter to adjust the column width; see the example below. Note that this approach can only add a new tab with its data, not update text within an existing tab. Unlike xlsxwriter, though, it will NOT delete any existing contents.
from openpyxl.utils.cell import get_column_letter
startRow = 12  # Change as per your requirements
startCol = 3  # Change as per your requirements
with pd.ExcelWriter("test.xlsx", engine="openpyxl", mode="a") as writer:  # Your code
    df.to_excel(writer, sheet_name="name", startrow=startRow, startcol=startCol)  # Your code... mostly
    worksheet = writer.sheets['name']  # Get the worksheet object
    for i, col in enumerate(df.columns):  # For each column in df, set width to 60
        worksheet.column_dimensions[get_column_letter(startCol+i+1)].width = 60
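Instead of a fixed width of 60, the widths could be derived from the cell contents. The following is a minimal sketch under stated assumptions: content_widths is a hypothetical helper, it works on plain rows of values rather than on a worksheet, and it treats one character as one unit of column width, which is only an approximation of how Excel measures widths.

```python
def content_widths(rows, padding=2, max_width=60):
    """Return one width per column, sized to the longest cell text.

    rows is any iterable of row sequences (for example a header row
    followed by data rows). None counts as empty text. Each width is
    the longest text length plus padding, capped at max_width.
    """
    widths = []
    for row in rows:
        for i, value in enumerate(row):
            length = 0 if value is None else len(str(value))
            if i == len(widths):
                widths.append(0)  # first time this column index is seen
            widths[i] = max(widths[i], length)
    return [min(w + padding, max_width) for w in widths]
```

The result could then be applied in the loop above by assigning widths[i] instead of 60, for example with rows built from [list(df.columns)] + df.values.tolist().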

Use data from a different sheet for table using xlsxwriter

I am using xlsxwriter to generate a file with quite a few formulas. From there, I want to create a table on another sheet. Everything is pretty straightforward until I want to use data from a different sheet for the table.
The documentation only shows examples of already having the data you need, and then passing that to the .add_table as the 'data' parameter.
What I am trying to do is this: (Which is structured how the rest of xlsxwriter's formulas are.)
df = pd.DataFrame(stuff)
writer = pd.ExcelWriter('File.xlsx', engine = 'xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1')
workbook = writer.book
worksheet1 = writer.sheets['Sheet1']
worksheet2 = workbook.add_worksheet('Summary Page')
data = f"'Sheet1'!$A$1:$D${len(df)}"
worksheet2.add_table(f'A1:D{len(df)}', {'data':data})
workbook.close()
This approach adds the new sheet, and creates a table the correct size. But then fills in the "data" with 'data' as a string down the first column with one character in each cell.
Is there a way to create a table referencing data from another sheet using xlsxwriter?
ExcelWriter is (obviously) for writing Excel files.
If you want to read data back from Excel after writing and saving it (did I understand you correctly?), use
ExcelFile.parse or read_excel to load the data into a DataFrame and then write it to Excel again with ExcelWriter. Unfortunately xlsxwriter does not support appending, so you would have to load and write all sheets again. Alternatively, use openpyxl as the engine. It is the default and could be omitted, but the minimal working example passes it explicitly to make the point:
import pandas as pd
df = pd.DataFrame({'Data': [10, 20, 30, 20, 15, 30, 45]})
writer = pd.ExcelWriter('test.xlsx', engine='openpyxl')
df.to_excel(writer, sheet_name='Sheet1')
writer.save()
data = pd.read_excel('test.xlsx', usecols='A:B', sheet_name='Sheet1', index_col=0)
writer = pd.ExcelWriter('test.xlsx', engine='openpyxl', mode='a')
# shape our data here
data.to_excel(writer, sheet_name='Sheet2')
writer.save()
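Regarding the question's add_table() call: the 'data' option expects a list of row lists rather than a range string. A string is most likely iterated character by character, which would explain the one-character-per-cell output. If the values are already available in the DataFrame, they can be passed directly. The sketch below builds the arguments from in-memory data; table_spec is a hypothetical helper, and it relies on add_table() accepting numeric (first_row, first_col, last_row, last_col, options) arguments as well as an 'A1:D5' range.

```python
def table_spec(headers, rows, first_row=0, first_col=0):
    """Build arguments for an XlsxWriter worksheet.add_table() call.

    Returns (first_row, first_col, last_row, last_col, options), where
    the table range covers one header row plus one row per data row.
    """
    rows = [list(r) for r in rows]
    last_row = first_row + len(rows)  # header row plus len(rows) data rows
    last_col = first_col + len(headers) - 1
    options = {
        "data": rows,  # list of lists, not a range string
        "columns": [{"header": str(h)} for h in headers],
    }
    return first_row, first_col, last_row, last_col, options
```

In the question's code this could be used as worksheet2.add_table(*table_spec(df.columns, df.values.tolist())). Values that only exist as formula results on Sheet1 would still need to be computed in Python or referenced by formulas in the table columns.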

Python: Writing Images and dataframes to the same excel file

I'm creating an Excel dashboard and want to generate a workbook that has DataFrames on half of the sheets and .png files on the other half. I'm having difficulty writing them to the same file in one go. Here's what I currently have. After my for loop runs, it seems I can't add additional worksheets, and I can't find anything explaining why. Any advice on how I might get my image files added to this workbook? Thanks!
dfs = dict()
dfs['AvgVisitsData'] = avgvisits
dfs['F2FCountsData'] = f2fcounts
writer = pd.ExcelWriter("MyData.xlsx", engine='xlsxwriter')
for name, df in dfs.items():
    df.to_excel(writer, sheet_name=name, index=False)
Then I want to add a couple sheets with some images to the same excel workbook. Something like this, but where I'm not creating a whole new workbook.
workbook = xlsxwriter.Workbook('MyData.xlsx')
worksheet = workbook.add_worksheet('image1')
worksheet.insert_image('A1', 'MemberCollateral.png')
Anyone have any tips to work around this?
Here is an example of how to get a handle to the underlying XlsxWriter workbook and worksheet objects and insert an image:
import pandas as pd
# Create a Pandas dataframe from some data.
df = pd.DataFrame({'Data': [10, 20, 30, 20, 15, 30, 45]})
# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter('pandas_image.xlsx', engine='xlsxwriter')
# Convert the dataframe to an XlsxWriter Excel object.
df.to_excel(writer, sheet_name='Sheet1')
# Get the xlsxwriter workbook and worksheet objects.
workbook = writer.book
worksheet = writer.sheets['Sheet1']
# Insert an image.
worksheet.insert_image('D3', 'logo.png')
# Close the Pandas Excel writer and output the Excel file.
writer.save()
See also Working with Python Pandas and XlsxWriter in the XlsxWriter docs for more examples
Here's the solution I came up with. I still couldn't find a way to do this without re-importing the workbook with load_workbook, but this got the job done.
# assign dataframes to dictionary and export them to excel
avgvisits = pd.DataFrame(pd.read_sql(avgvisits(), cnxn))
f2fcounts = pd.DataFrame(pd.read_sql(f2fcounts(), cnxn))
activityencounters = pd.DataFrame(pd.read_sql(ActivityEncounters(), cnxn))
activityencountersp = activityencounters.pivot_table(values='ActivityCount', index = ['Activity'], columns= ['QuarterYear'], aggfunc=np.max)
dfs = dict()
dfs['AvgVisitsData'] = avgvisits
dfs['F2FIndirect'] = f2fcounts
dfs['ActivityEncounters'] = activityencountersp
writer = pd.ExcelWriter("MyData.xlsx", engine='xlsxwriter')
for name, df in dfs.items():
    if name != 'ActivityEncounters':
        df.to_excel(writer, sheet_name=name, index=False)
    else:
        df.to_excel(writer, sheet_name=name, index=True)
writer.save()
writer.close()
# re-import the excel book and add the graph image files
wb = load_workbook('MyData.xlsx')
png_loc = 'MemberCollateral.png'
wb.create_sheet('MemberCollateralGraph')
ws = wb['MemberCollateralGraph']
my_png = openpyxl.drawing.image.Image(png_loc)
ws.add_image(my_png, 'A1')
png_loc = 'DirectIndirect.png'
ws = wb['F2FIndirect']
my_png = openpyxl.drawing.image.Image(png_loc)
ws.add_image(my_png, 'A10')
png_loc = 'QuarterlyActivitySummary.png'
ws = wb['ActivityEncounters']
my_png = openpyxl.drawing.image.Image(png_loc)
ws.add_image(my_png, 'A10')
wb.save('MyData.xlsx')

Open existing workbook with ExcelWriter [duplicate]

I use pandas to write to excel file in the following fashion:
import pandas
writer = pandas.ExcelWriter('Masterfile.xlsx')
data_filtered.to_excel(writer, "Main", cols=['Diff1', 'Diff2'])
writer.save()
Masterfile.xlsx already consists of number of different tabs. However, it does not yet contain "Main".
Pandas correctly writes to "Main" sheet, unfortunately it also deletes all other tabs.
Pandas docs says it uses openpyxl for xlsx files. Quick look through the code in ExcelWriter gives a clue that something like this might work out:
import pandas
from openpyxl import load_workbook
book = load_workbook('Masterfile.xlsx')
writer = pandas.ExcelWriter('Masterfile.xlsx', engine='openpyxl')
writer.book = book
## ExcelWriter for some reason uses writer.sheets to access the sheet.
## If you leave it empty it will not know that sheet Main is already there
## and will create a new sheet.
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
data_filtered.to_excel(writer, "Main", cols=['Diff1', 'Diff2'])
writer.save()
UPDATE: Starting from Pandas 1.3.0 the following function will not work properly, because functions DataFrame.to_excel() and pd.ExcelWriter() have been changed - a new if_sheet_exists parameter has been introduced, which has invalidated the function below.
Here you can find an updated version of the append_df_to_excel(), which is working for Pandas 1.3.0+.
Here is a helper function:
import os
import pandas as pd
from openpyxl import load_workbook
def append_df_to_excel(filename, df, sheet_name='Sheet1', startrow=None,
                       truncate_sheet=False,
                       **to_excel_kwargs):
    """
    Append a DataFrame [df] to existing Excel file [filename]
    into [sheet_name] Sheet.
    If [filename] doesn't exist, then this function will create it.
    @param filename: File path or existing ExcelWriter
                     (Example: '/path/to/file.xlsx')
    @param df: DataFrame to save to workbook
    @param sheet_name: Name of sheet which will contain DataFrame.
                       (default: 'Sheet1')
    @param startrow: upper left cell row to dump data frame.
                     Per default (startrow=None) calculate the last row
                     in the existing DF and write to the next row...
    @param truncate_sheet: truncate (remove and recreate) [sheet_name]
                           before writing DataFrame to Excel file
    @param to_excel_kwargs: arguments which will be passed to `DataFrame.to_excel()`
                            [can be a dictionary]
    @return: None
    Usage examples:
    >>> append_df_to_excel('d:/temp/test.xlsx', df)
    >>> append_df_to_excel('d:/temp/test.xlsx', df, header=None, index=False)
    >>> append_df_to_excel('d:/temp/test.xlsx', df, sheet_name='Sheet2',
                           index=False)
    >>> append_df_to_excel('d:/temp/test.xlsx', df, sheet_name='Sheet2',
                           index=False, startrow=25)
    (c) [MaxU](https://stackoverflow.com/users/5741205/maxu?tab=profile)
    """
    # Excel file doesn't exist - saving and exiting
    if not os.path.isfile(filename):
        df.to_excel(
            filename,
            sheet_name=sheet_name,
            startrow=startrow if startrow is not None else 0,
            **to_excel_kwargs)
        return
    # ignore [engine] parameter if it was passed
    if 'engine' in to_excel_kwargs:
        to_excel_kwargs.pop('engine')
    writer = pd.ExcelWriter(filename, engine='openpyxl', mode='a')
    # try to open an existing workbook
    writer.book = load_workbook(filename)
    # get the last row in the existing Excel sheet
    # if it was not specified explicitly
    if startrow is None and sheet_name in writer.book.sheetnames:
        startrow = writer.book[sheet_name].max_row
    # truncate sheet
    if truncate_sheet and sheet_name in writer.book.sheetnames:
        # index of [sheet_name] sheet
        idx = writer.book.sheetnames.index(sheet_name)
        # remove [sheet_name]
        writer.book.remove(writer.book.worksheets[idx])
        # create an empty sheet [sheet_name] using old index
        writer.book.create_sheet(sheet_name, idx)
    # copy existing sheets
    writer.sheets = {ws.title: ws for ws in writer.book.worksheets}
    if startrow is None:
        startrow = 0
    # write out the new sheet
    df.to_excel(writer, sheet_name, startrow=startrow, **to_excel_kwargs)
    # save the workbook
    writer.save()
Tested with the following versions:
Pandas 1.2.3
Openpyxl 3.0.5
With openpyxl version 2.4.0 and pandas version 0.19.2, the process @ski came up with gets a bit simpler:
import pandas
from openpyxl import load_workbook
with pandas.ExcelWriter('Masterfile.xlsx', engine='openpyxl') as writer:
    writer.book = load_workbook('Masterfile.xlsx')
    data_filtered.to_excel(writer, "Main", cols=['Diff1', 'Diff2'])
# That's it!
Starting in pandas 0.24 you can simplify this with the mode keyword argument of ExcelWriter:
import pandas as pd
with pd.ExcelWriter('the_file.xlsx', engine='openpyxl', mode='a') as writer:
    data_filtered.to_excel(writer)
I know this is an older thread, but this is the first item you find when searching, and the above solutions don't work if you need to retain charts in a workbook that you already have created. In that case, xlwings is a better option - it allows you to write to the excel book and keeps the charts/chart data.
simple example:
import xlwings as xw
import pandas as pd
#create DF
months = ['2017-01','2017-02','2017-03','2017-04','2017-05','2017-06','2017-07','2017-08','2017-09','2017-10','2017-11','2017-12']
value1 = [x * 5+5 for x in range(len(months))]
df = pd.DataFrame(value1, index = months, columns = ['value1'])
df['value2'] = df['value1']+5
df['value3'] = df['value2']+5
#load workbook that has a chart in it
wb = xw.Book('C:\\data\\bookwithChart.xlsx')
ws = wb.sheets['chartData']
ws.range('A1').options(index=False).value = df
wb = xw.Book('C:\\data\\bookwithChart_updated.xlsx')
xw.apps[0].quit()
Old question, but I am guessing some people still search for this - so...
I find this method nice because all worksheets are loaded into a dictionary of sheet name and dataframe pairs, created by pandas with the sheetname=None option. It is simple to add, delete or modify worksheets between reading the spreadsheet into the dict format and writing it back from the dict. For me the xlsxwriter works better than openpyxl for this particular task in terms of speed and format.
Note: future versions of pandas (0.21.0+) will change the "sheetname" parameter to "sheet_name".
# read a single or multi-sheet excel file
# (returns dict of sheetname(s), dataframe(s))
ws_dict = pd.read_excel(excel_file_path,
sheetname=None)
# all worksheets are accessible as dataframes.
# easy to change a worksheet as a dataframe:
mod_df = ws_dict['existing_worksheet']
# do work on mod_df...then reassign
ws_dict['existing_worksheet'] = mod_df
# add a dataframe to the workbook as a new worksheet with
# ws name, df as dict key, value:
ws_dict['new_worksheet'] = some_other_dataframe
# when done, write dictionary back to excel...
# xlsxwriter honors datetime and date formats
# (only included as example)...
with pd.ExcelWriter(excel_file_path,
                    engine='xlsxwriter',
                    datetime_format='yyyy-mm-dd',
                    date_format='yyyy-mm-dd') as writer:
    for ws_name, df_sheet in ws_dict.items():
        df_sheet.to_excel(writer, sheet_name=ws_name)
For the example in the 2013 question:
ws_dict = pd.read_excel('Masterfile.xlsx',
sheetname=None)
ws_dict['Main'] = data_filtered[['Diff1', 'Diff2']]
with pd.ExcelWriter('Masterfile.xlsx',
                    engine='xlsxwriter') as writer:
    for ws_name, df_sheet in ws_dict.items():
        df_sheet.to_excel(writer, sheet_name=ws_name)
There is a better solution in pandas 0.24:
with pd.ExcelWriter(path, mode='a') as writer:
    s.to_excel(writer, sheet_name='another sheet', index=False)
so upgrade your pandas now:
pip install --upgrade pandas
The solution by @MaxU does not work with recent versions of Python and the related packages. It raises the error:
"zipfile.BadZipFile: File is not a zip file"
I wrote a new version of the function that works with the updated packages, tested with python: 3.9 | openpyxl: 3.0.6 | pandas: 1.2.3
I also added more features to the helper function:
It now resizes all columns based on cell content width, so all values are visible (see "resizeColumns")
You can choose whether NaN is displayed as NaN or as an empty cell (see "na_rep")
Added "startcol": you can start writing from a specific column; otherwise it starts from col = 0
Here is the function:
import pandas as pd
def append_df_to_excel(filename, df, sheet_name='Sheet1', startrow=None, startcol=None,
                       truncate_sheet=False, resizeColumns=True, na_rep='NA', **to_excel_kwargs):
    """
    Append a DataFrame [df] to existing Excel file [filename]
    into [sheet_name] Sheet.
    If [filename] doesn't exist, then this function will create it.
    Parameters:
    filename : File path or existing ExcelWriter
               (Example: '/path/to/file.xlsx')
    df : dataframe to save to workbook
    sheet_name : Name of sheet which will contain DataFrame.
                 (default: 'Sheet1')
    startrow : upper left cell row to dump data frame.
               Per default (startrow=None) calculate the last row
               in the existing DF and write to the next row...
    truncate_sheet : truncate (remove and recreate) [sheet_name]
                     before writing DataFrame to Excel file
    resizeColumns: default = True. Resizes all columns based on cell content width
    to_excel_kwargs : arguments which will be passed to `DataFrame.to_excel()`
                      [can be dictionary]
    na_rep: default = 'NA'. If, instead of NaN, you want blank cells, just edit as follows: na_rep=''
    Returns: None
    *******************
    CONTRIBUTION:
    Current helper function generated by [Baggio]: https://stackoverflow.com/users/14302009/baggio?tab=profile
    Contributions to the current helper function: https://stackoverflow.com/users/4046632/buran?tab=profile
    Original helper function: (c) [MaxU](https://stackoverflow.com/users/5741205/maxu?tab=profile)
    Features of the new helper function:
    1) Now it works with python 3.9 and latest versions of pandas and openpyxl
    ---> Fixed the error: "zipfile.BadZipFile: File is not a zip file".
    2) It now resizes all columns based on cell content width, so all values are visible (see "resizeColumns")
    3) You can choose whether NaN is displayed as NaN or as an empty cell (see "na_rep")
    4) Added "startcol": you can start writing from a specific column; otherwise it starts from col = 0
    *******************
    """
    from openpyxl import load_workbook
    from string import ascii_uppercase
    from openpyxl.utils import get_column_letter
    from openpyxl import Workbook
    # ignore [engine] parameter if it was passed
    if 'engine' in to_excel_kwargs:
        to_excel_kwargs.pop('engine')
    try:
        # check that the file exists and is readable, then close it again
        with open(filename):
            pass
    except IOError:
        # file not accessible: create an empty workbook with the target sheet
        wb = Workbook()
        ws = wb.active
        ws.title = sheet_name
        wb.save(filename)
    writer = pd.ExcelWriter(filename, engine='openpyxl', mode='a')
    # Python 2.x: define [FileNotFoundError] exception if it doesn't exist
    try:
        FileNotFoundError
    except NameError:
        FileNotFoundError = IOError
    try:
        # try to open an existing workbook
        writer.book = load_workbook(filename)
        # get the last row in the existing Excel sheet
        # if it was not specified explicitly
        if startrow is None and sheet_name in writer.book.sheetnames:
            startrow = writer.book[sheet_name].max_row
        # truncate sheet
        if truncate_sheet and sheet_name in writer.book.sheetnames:
            # index of [sheet_name] sheet
            idx = writer.book.sheetnames.index(sheet_name)
            # remove [sheet_name]
            writer.book.remove(writer.book.worksheets[idx])
            # create an empty sheet [sheet_name] using old index
            writer.book.create_sheet(sheet_name, idx)
        # copy existing sheets
        writer.sheets = {ws.title: ws for ws in writer.book.worksheets}
    except FileNotFoundError:
        # file does not exist yet, we will create it
        pass
    if startrow is None:
        # startrow = -1
        startrow = 0
    if startcol is None:
        startcol = 0
    # write out the new sheet
    df.to_excel(writer, sheet_name, startrow=startrow, startcol=startcol, na_rep=na_rep, **to_excel_kwargs)
    if resizeColumns:
        ws = writer.book[sheet_name]
        def auto_format_cell_width(ws):
            # max_column + 1 so that the last column is resized as well
            # (the original range(1, ws.max_column) stopped one column short)
            for letter in range(1, ws.max_column + 1):
                maximum_value = 0
                for cell in ws[get_column_letter(letter)]:
                    val_to_check = len(str(cell.value))
                    if val_to_check > maximum_value:
                        maximum_value = val_to_check
                ws.column_dimensions[get_column_letter(letter)].width = maximum_value + 2
        auto_format_cell_width(ws)
    # save the workbook
    writer.save()
Example Usage:
# Create a sample dataframe
df = pd.DataFrame({'numbers': [1, 2, 3],
'colors': ['red', 'white', 'blue'],
'colorsTwo': ['yellow', 'white', 'blue'],
'NaNcheck': [float('NaN'), 1, float('NaN')],
})
# EDIT YOUR PATH FOR THE EXPORT
filename = r"C:\DataScience\df.xlsx"
# RUN ONE BY ONE IN ROW THE FOLLOWING LINES, TO SEE THE DIFFERENT UPDATES TO THE EXCELFILE
append_df_to_excel(filename, df, index=False, startrow=0) # Basic Export of df in default sheet (Sheet1)
append_df_to_excel(filename, df, sheet_name="Cool", index=False, startrow=0) # Append the sheet "Cool" where "df" is written
append_df_to_excel(filename, df, sheet_name="Cool", index=False) # Append another "df" to the sheet "Cool", just below the other "df" instance
append_df_to_excel(filename, df, sheet_name="Cool", index=False, startrow=0, startcol=5) # Append another "df" to the sheet "Cool" starting from col 5
append_df_to_excel(filename, df, index=False, truncate_sheet=True, startrow=10, na_rep = '') # Override (truncate) the "Sheet1", writing the df from row 10, and showing blank cells instead of NaN
def append_sheet_to_master(self, master_file_path, current_file_path, sheet_name):
    try:
        master_book = load_workbook(master_file_path)
        master_writer = pandas.ExcelWriter(master_file_path, engine='openpyxl')
        master_writer.book = master_book
        master_writer.sheets = dict((ws.title, ws) for ws in master_book.worksheets)
        current_frames = pandas.ExcelFile(current_file_path).parse(pandas.ExcelFile(current_file_path).sheet_names[0],
                                                                   header=None,
                                                                   index_col=None)
        current_frames.to_excel(master_writer, sheet_name, index=None, header=False)
        master_writer.save()
    except Exception as e:
        raise e
This works perfectly fine; the only problem is that the formatting of the master file (the file to which we add the new sheet) is lost.
writer = pd.ExcelWriter('prueba1.xlsx', engine='openpyxl', keep_date_col=True)
I hope the "keep_date_col" argument helps you.
I used the answer described here
from openpyxl import load_workbook
writer = pd.ExcelWriter(p_file_name, engine='openpyxl', mode='a')
writer.book = load_workbook(p_file_name)
writer.sheets = {ws.title:ws for ws in writer.book.worksheets}
df.to_excel(writer, 'Data', startrow=10, startcol=20)
writer.save()
book = load_workbook(xlsFilename)
writer = pd.ExcelWriter(self.xlsFilename)
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
df.to_excel(writer, sheet_name=sheetName, index=False)
writer.save()
The solution by @MaxU worked very well. I have just one suggestion:
If truncate_sheet=True is specified, then "startrow" should NOT be retained from the existing sheet. I suggest:
if startrow is None and sheet_name in writer.book.sheetnames:
    if not truncate_sheet:  # truncate_sheet would use startrow if provided (or zero below)
        startrow = writer.book[sheet_name].max_row
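The start-row rule suggested here can be isolated into a small pure function, which makes the intended behavior easy to check without touching a workbook. This is a sketch only; resolve_startrow and its parameters are hypothetical names, not part of the helper functions above.

```python
def resolve_startrow(startrow, sheet_exists, max_row, truncate_sheet):
    """Decide where to start writing, following the suggestion above.

    - an explicit startrow always wins
    - an existing sheet that is not truncated is appended to (max_row)
    - a new or truncated sheet starts at row 0
    """
    if startrow is not None:
        return startrow
    if sheet_exists and not truncate_sheet:
        return max_row
    return 0
```

Inside the helper, the arguments would come from sheet_name in writer.book.sheetnames and writer.book[sheet_name].max_row.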
I'd recommend using xlwings (https://docs.xlwings.org/en/stable/api.html); it is really powerful for this application. This is how I use it:
import xlwings as xw
import pandas as pd
import xlsxwriter
# function to get the active workbook
def getActiveWorkbook():
    try:
        # logic from xlwings to grab the current excel file
        activeWb = xw.books.active
    except Exception:
        # print error message if unable to get the current workbook
        print('Unable to grab the current Workbook')
        # pause() and exitProgram() are not defined in this snippet
        pause()
        exitProgram()
    else:
        return activeWb
# function that returns the last row number and last cell of a sheet
def getLastRow(myBook, sheetName):
    lastRow = myBook.sheets[sheetName].range("A1").current_region.last_cell.row
    lastCol = str(xlsxwriter.utility.xl_col_to_name(myBook.sheets[sheetName].range("A1").current_region.last_cell.column))
    return str(lastRow), lastCol + str(lastRow)
activeWb = getActiveWorkbook()
df = pd.DataFrame(data=[1,2,3])
# look at worksheet = Part Number Status
sheetName = "Sheet1"
ws = activeWb.sheets[sheetName]
lastRow, lastCell = getLastRow(activeWb, sheetName)
if int(lastRow) > 1:
    ws.range("A1:" + lastCell).clear()
ws.range("A1").options(index=False, header=False).value = df.fillna('')
This seems to work very well for my applications because .xlsm workbooks can be very tricky. You can execute this as a Python script or turn it into an executable with pyinstaller and then run the .exe through an Excel macro. You can also call VBA macros from Python using xlwings, which is very useful.
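You can add a DataFrame to an existing Excel file without overwriting its other sheets by opening the file with pd.ExcelWriter in append mode (mode='a') and passing that writer to DataFrame.to_excel(). Note that to_excel() itself has no mode parameter; the mode is set on the ExcelWriter.
Here's an example:
import pandas as pd
# Create a sample DataFrame
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})
# Open the existing Excel file in append mode and write the DataFrame to it.
# if_sheet_exists='overlay' (pandas 1.4+) writes into Sheet1 if it already exists;
# without it, pandas raises an error when the sheet name is already taken.
with pd.ExcelWriter('existing_file.xlsx', engine='openpyxl', mode='a', if_sheet_exists='overlay') as writer:
    df.to_excel(writer, index=False, sheet_name='Sheet1')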
Method:
Can create the file if it is not present
Appends to an existing Excel file by sheet name
import pandas as pd
from openpyxl import load_workbook
def write_to_excel(df, file, **kwds):
    # **kwds is forwarded to DataFrame.to_excel (sheet_name, columns, index, ...)
    try:
        book = load_workbook(file)
        writer = pd.ExcelWriter(file, engine='openpyxl')
        writer.book = book
        writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
        df.to_excel(writer, **kwds)
        writer.save()
    except FileNotFoundError:
        df.to_excel(file, **kwds)
Usage:
df_a = pd.DataFrame(range(10), columns=["a"])
df_b = pd.DataFrame(range(10, 20), columns=["b"])
write_to_excel(df_a, "test.xlsx", sheet_name="Sheet a", columns=['a'], index=False)
write_to_excel(df_b, "test.xlsx", sheet_name="Sheet b", columns=['b'])
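On newer pandas versions, where assigning writer.book is no longer supported, the same "create or append" decision could be expressed through the ExcelWriter arguments instead. The sketch below only chooses those arguments. excel_writer_kwargs is a hypothetical helper, and it assumes pandas 1.4 or later, where if_sheet_exists accepts 'overlay' and may only be passed together with mode='a'.

```python
import os


def excel_writer_kwargs(path, if_sheet_exists="overlay"):
    """Pick pd.ExcelWriter keyword arguments for create-or-append.

    Append mode is only valid for a file that already exists, and
    if_sheet_exists is only accepted together with mode="a".
    """
    kwargs = {"engine": "openpyxl"}
    if os.path.isfile(path):
        kwargs["mode"] = "a"
        kwargs["if_sheet_exists"] = if_sheet_exists
    else:
        kwargs["mode"] = "w"
    return kwargs
```

A call would then look like: with pd.ExcelWriter(path, **excel_writer_kwargs(path)) as writer: df.to_excel(writer, sheet_name="Sheet a", index=False).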
