Creating a table in an Excel sheet using Pandas and XlsxWriter - python

I am using Pandas and Python to copy all the data from one Excel sheet into a completely new workbook. The data is updated every day, so the new workbook is overwritten with the updated data each time. I want the data written to the new workbook to be formatted as an Excel table.
When I run the code I wrote for the table, it overwrites the data but does not put it into a table.
This is the code I have so far for the table formatting:
import pandas as pd
import xlsxwriter
excel_file_1 = 'Incident Report.xls'
df_first_shift = pd.read_excel(r'C:\Users\M\Downloads\Incident Report.xls')
print(df_first_shift)
df_all = pd.concat([df_first_shift])
writer = pd.ExcelWriter('Incident_Report.xlsx', engine = 'xlsxwriter')
dataframe = pd.DataFrame({'Data': [df_all.to_excel(r'C:\Users\M\OneDrive\Test\Incident_Report.xlsx', engine = 'xlsxwriter')]})
dataframe.to_excel(writer, sheet_name='Sheet1', startrow=1, header=False, index=False)
workbook = writer.book
worksheet = writer.sheets['Sheet1']
(max_row, max_col) = dataframe.shape
column_settings = [{'header':column} for column in dataframe.columns]
worksheet.add_table(0,0, max_row, max_col - 1, {'columns' : column_settings})
worksheet.set_column(0, max_col - 1, 12)
writer.save()
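A minimal sketch of one way to get the table, assuming the goal is simply to wrap the copied data in an Excel table. In the code above, df_all.to_excel(...) returns None, so dataframe holds a single empty cell rather than the report data; the sketch writes the data frame itself and sizes the table from its shape. The sample DataFrame, the temporary output path and the helper name write_as_table are placeholders, not part of the original code.

```python
import os
import tempfile

import pandas as pd


def write_as_table(df, path, sheet_name="Sheet1"):
    """Write df to path and wrap it in an Excel table (hypothetical helper)."""
    with pd.ExcelWriter(path, engine="xlsxwriter") as writer:
        # Write only the data rows, leaving row 0 free for the table header.
        df.to_excel(writer, sheet_name=sheet_name, startrow=1,
                    header=False, index=False)
        worksheet = writer.sheets[sheet_name]
        max_row, max_col = df.shape
        # The table header names come from the DataFrame columns.
        column_settings = [{"header": str(col)} for col in df.columns]
        # Rows 0..max_row cover the header row plus max_row data rows.
        worksheet.add_table(0, 0, max_row, max_col - 1,
                            {"columns": column_settings})
        worksheet.set_column(0, max_col - 1, 12)
    return (0, 0, max_row, max_col - 1)


# Stand-in for the DataFrame read from 'Incident Report.xls'.
df_all = pd.DataFrame({"Incident": ["A", "B", "C"], "Count": [1, 2, 3]})
out_path = os.path.join(tempfile.mkdtemp(), "Incident_Report.xlsx")
table_range = write_as_table(df_all, out_path)
```

Leaving the with block saves the workbook, so no separate save call is needed.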

Related

Convert DataFrame to excel by preserving existing sheets and increase the column size of excel

I am trying to write a DataFrame to an Excel file without overwriting the existing sheets.
One solution is to use pd.ExcelWriter with the openpyxl engine, which supports append mode.
Now I also need to increase the column width. For that I use XlsxWriter, but it overwrites the remaining sheets.
Openpyxl as an engine:
with pd.ExcelWriter("test.xlsx", engine="openpyxl", mode="a") as writer:
    df.to_excel(writer, sheet_name="name", startrow=num, startcol=num)
XlsxWriter as an engine:
workbook = xlsxwriter.Workbook('test.xlsx')
worksheet = workbook.add_worksheet()
worksheet.set_column(0, 0, 20)
Can someone please suggest to me a solution where I can do both things:
Keep the existing sheets
Increase the column width
You can use your ExcelWriter to adjust the column width, as in the example below. Note that this approach can only add a new sheet with its data; it cannot update text within an existing sheet. Unlike XlsxWriter, however, it will not delete any existing content.
import pandas as pd
from openpyxl.utils.cell import get_column_letter

startRow = 12  # Change as per your requirements
startCol = 3  # Change as per your requirements
with pd.ExcelWriter("test.xlsx", engine="openpyxl", mode="a") as writer:  # Your code
    df.to_excel(writer, sheet_name="name", startrow=startRow, startcol=startCol)  # Your code... mostly
    worksheet = writer.sheets['name']  # Get the worksheet object
    # Set the width to 60 for the index column and for each column in df
    for i in range(len(df.columns) + 1):  # +1 for the index column written by to_excel
        worksheet.column_dimensions[get_column_letter(startCol + i + 1)].width = 60

Use data from a different sheet for table using xlsxwriter

I am using xlsxwriter to generate a file with quite a few formulas. From there, I want to create a table on another sheet. Everything is pretty straightforward until I want to use data from a different sheet for the table.
The documentation only shows examples of already having the data you need, and then passing that to the .add_table as the 'data' parameter.
What I am trying to do is the following, which is structured like the rest of XlsxWriter's formulas:
df = pd.DataFrame(stuff)
writer = pd.ExcelWriter('File.xlsx', engine = 'xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1')
workbook = writer.book
worksheet1 = writer.sheets['Sheet1']
worksheet2 = workbook.add_worksheet('Summary Page')
data = f"'Sheet1'!$A$1:$D${len(df)}"
worksheet2.add_table(f'A1:D{len(df)}', {'data':data})
workbook.close()
This approach adds the new sheet and creates a table of the correct size, but it then fills the first column with the 'data' string itself, one character per cell.
Is there a way to create a table referencing data from another sheet using xlsxwriter?
ExcelWriter is (obviously) for writing Excel files.
If you want to read data from Excel after writing and saving it (did I understand you correctly?), use
ExcelFile.parse or read_excel to load the data into a DataFrame, then write it to Excel again with ExcelWriter. Unfortunately, XlsxWriter does not support appending, so you would have to load and rewrite all sheets. Alternatively, use openpyxl as the engine, which supports append mode. Depending on the pandas version and the installed packages it may already be the default, but it is given explicitly in the minimal working example to make the point:
import pandas as pd

df = pd.DataFrame({'Data': [10, 20, 30, 20, 15, 30, 45]})

# Write the initial sheet; leaving the with block saves and closes the file
with pd.ExcelWriter('test.xlsx', engine='openpyxl') as writer:
    df.to_excel(writer, sheet_name='Sheet1')

# Read the saved data back into a DataFrame
data = pd.read_excel('test.xlsx', usecols='A:B', sheet_name='Sheet1', index_col=0)

# Reopen the same file in append mode to add a second sheet
with pd.ExcelWriter('test.xlsx', engine='openpyxl', mode='a') as writer:
    # shape our data here
    data.to_excel(writer, sheet_name='Sheet2')
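On the table question itself: as far as I can tell, the 'data' option of add_table expects a list of rows (a list of lists) rather than a range string, which would explain why the string ended up spread over the cells one character at a time. Below is a minimal sketch, assuming it is acceptable to copy the values onto the summary sheet instead of referencing Sheet1 by formula. The sample DataFrame and the temporary file path are placeholders.

```python
import os
import tempfile

import pandas as pd

# Placeholder data standing in for pd.DataFrame(stuff)
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]})
path = os.path.join(tempfile.mkdtemp(), "File.xlsx")

with pd.ExcelWriter(path, engine="xlsxwriter") as writer:
    df.to_excel(writer, sheet_name="Sheet1", index=False)
    workbook = writer.book
    worksheet2 = workbook.add_worksheet("Summary Page")

    # add_table takes the cell values as a list of rows, not a range string.
    rows = df.values.tolist()
    n_rows, n_cols = df.shape
    headers = [{"header": str(col)} for col in df.columns]
    # One header row plus n_rows data rows, starting at the top-left cell.
    worksheet2.add_table(0, 0, n_rows, n_cols - 1,
                         {"data": rows, "columns": headers})
```

The copied values are static, so the summary table will not follow later changes made to Sheet1 in Excel.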

Overwriting one sheet in an existing excel file with python

I have an existing Excel file called test.xlsx. Its first sheet, tables, contains a table that points to the data in the second sheet, data.
I want to keep the tables sheet and its table while overwriting the entirety of the data sheet with the contents of the new_df dataframe.
My program:
import pandas as pd
import openpyxl
from openpyxl import load_workbook
file_to_update = r'C:\folder\test.xlsx'
book = load_workbook(file_to_update)
writer = pd.ExcelWriter(file_to_update, engine = 'openpyxl')
writer.book = book
new_df = pd.read_csv(r'C:\folder\other_csv.csv', sep = '|')
new_df.to_excel(writer, index = False)
writer.save()
writer.close()
The previously existing data is still there, but the new dataframe new_df data is in a new sheet, Sheet1.
The objective is to keep the existing tables sheet, while overwriting the data sheet with the data from the new dataframe new_df.
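One possible approach, sketched under the assumption that a recent pandas (1.3 or later) with openpyxl is available: open the workbook in append mode and pass if_sheet_exists='replace', so that only the named sheet is rewritten. The file is created in a temporary directory here to keep the example self-contained, and the sample frames stand in for the real workbook and CSV. Whether a table on the tables sheet survives the openpyxl round trip intact is something to verify on the real file.

```python
import os
import tempfile

import pandas as pd

path = os.path.join(tempfile.mkdtemp(), "test.xlsx")

# Build a stand-in workbook with the two sheets from the question.
with pd.ExcelWriter(path, engine="openpyxl") as writer:
    pd.DataFrame({"summary": [1]}).to_excel(writer, sheet_name="tables", index=False)
    pd.DataFrame({"old": [1, 2]}).to_excel(writer, sheet_name="data", index=False)

# Stand-in for the DataFrame read from the CSV file.
new_df = pd.DataFrame({"new": [10, 20, 30]})

# Append mode keeps the other sheets; 'replace' rewrites only the named one.
with pd.ExcelWriter(path, engine="openpyxl", mode="a",
                    if_sheet_exists="replace") as writer:
    new_df.to_excel(writer, sheet_name="data", index=False)

# Read every sheet back to check the result.
sheets = pd.read_excel(path, sheet_name=None)
```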

Insert worksheet at specified index in existing Excel file using Pandas

Is there a way to insert a worksheet at a specified index using Pandas? With the code below, when adding a dataframe as a new worksheet, it gets added after the last sheet in the existing Excel file. What if I want to insert it at say index 1?
import pandas as pd
from openpyxl import load_workbook
f = 'existing_file.xlsx'
df = pd.DataFrame({'cat':['A','B'], 'word': ['C','D']})
book = load_workbook(f)
writer = pd.ExcelWriter(f, engine = 'openpyxl')
writer.book = book
df.to_excel(writer, sheet_name = 'sheet')
writer.save()
writer.close()
Thank you.
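A minimal sketch of one way to do this with openpyxl directly rather than through pd.ExcelWriter: Workbook.create_sheet accepts a position, and dataframe_to_rows converts the DataFrame into rows that can be appended. The workbook is built in a temporary directory so the example is self-contained; the sheet names first, last and inserted are placeholders.

```python
import os
import tempfile

import pandas as pd
from openpyxl import Workbook, load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows

f = os.path.join(tempfile.mkdtemp(), "existing_file.xlsx")

# Build a stand-in for the existing file with two sheets.
wb = Workbook()
wb.active.title = "first"
wb.create_sheet("last")
wb.save(f)

df = pd.DataFrame({"cat": ["A", "B"], "word": ["C", "D"]})

book = load_workbook(f)
# The second argument is the zero-based position of the new sheet.
ws = book.create_sheet("inserted", 1)
# Append the header row followed by the data rows.
for row in dataframe_to_rows(df, index=False, header=True):
    ws.append(row)
book.save(f)

sheet_order = load_workbook(f).sheetnames
```

This writes plain cell values only; any styling that to_excel would normally apply to the header is not reproduced.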

Python: Writing Images and dataframes to the same excel file

I'm creating an Excel dashboard and want to generate a workbook that has DataFrames on half of the sheets and .png files on the other half. I'm having difficulty writing them to the same file in one go. Here's what I currently have. After my for loop runs, it seems I can't add any more worksheets, and I can't find anything explaining why. Any advice on how to get my image files added to this workbook? Thanks!
dfs = dict()
dfs['AvgVisitsData'] = avgvisits
dfs['F2FCountsData'] = f2fcounts
writer = pd.ExcelWriter("MyData.xlsx", engine='xlsxwriter')
for name, df in dfs.items():
    df.to_excel(writer, sheet_name=name, index = False)
Then I want to add a couple sheets with some images to the same excel workbook. Something like this, but where I'm not creating a whole new workbook.
workbook = xlsxwriter.Workbook('MyData.xlsx')
worksheet = workbook.add_worksheet('image1')
worksheet.insert_image('A1', 'MemberCollateral.png')
Anyone have any tips to work around this?
Here is an example of how to get a handle to the underlying XlsxWriter workbook and worksheet objects and insert an image:
import pandas as pd
# Create a Pandas dataframe from some data.
df = pd.DataFrame({'Data': [10, 20, 30, 20, 15, 30, 45]})
# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter('pandas_image.xlsx', engine='xlsxwriter')
# Convert the dataframe to an XlsxWriter Excel object.
df.to_excel(writer, sheet_name='Sheet1')
# Get the xlsxwriter workbook and worksheet objects.
workbook = writer.book
worksheet = writer.sheets['Sheet1']
# Insert an image.
worksheet.insert_image('D3', 'logo.png')
# Close the Pandas Excel writer and output the Excel file.
writer.close()
See also Working with Python Pandas and XlsxWriter in the XlsxWriter docs for more examples
Here's the solution I came up with. I still couldn't find a way to do this without re-importing the workbook with load_workbook, but this got the job done.
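import numpy as np
import openpyxl.drawing.image
import pandas as pd
from openpyxl import load_workbook

# assign dataframes to dictionary and export them to excel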
avgvisits = pd.DataFrame(pd.read_sql(avgvisits(), cnxn))
f2fcounts = pd.DataFrame(pd.read_sql(f2fcounts(), cnxn))
activityencounters = pd.DataFrame(pd.read_sql(ActivityEncounters(), cnxn))
activityencountersp = activityencounters.pivot_table(values='ActivityCount', index = ['Activity'], columns= ['QuarterYear'], aggfunc=np.max)
dfs = dict()
dfs['AvgVisitsData'] = avgvisits
dfs['F2FIndirect'] = f2fcounts
dfs['ActivityEncounters'] = activityencountersp
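writer = pd.ExcelWriter("MyData.xlsx", engine='xlsxwriter')
for name, df in dfs.items():
    # Keep the index only for the pivot table, where it holds the Activity labels
    if name != 'ActivityEncounters':
        df.to_excel(writer, sheet_name=name, index=False)
    else:
        df.to_excel(writer, sheet_name=name, index=True)
# close() saves the workbook, so a separate save() call is not needed
writer.close()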
# re-import the excel book and add the graph image files
wb = load_workbook('MyData.xlsx')
png_loc = 'MemberCollateral.png'
wb.create_sheet('MemberCollateralGraph')
ws = wb['MemberCollateralGraph']
my_png = openpyxl.drawing.image.Image(png_loc)
ws.add_image(my_png, 'A1')
png_loc = 'DirectIndirect.png'
ws = wb['F2FIndirect']
my_png = openpyxl.drawing.image.Image(png_loc)
ws.add_image(my_png, 'A10')
png_loc = 'QuarterlyActivitySummary.png'
ws = wb['ActivityEncounters']
my_png = openpyxl.drawing.image.Image(png_loc)
ws.add_image(my_png, 'A10')
wb.save('MyData.xlsx')
