I am trying to build a simple Python script that reads data from a .csv file, formats the data into an easy-to-read layout, then either writes it to a new xlsx file or appends to an existing xlsx file, depending on user input. That all works well, and I write to the new file using:
with pd.ExcelWriter(file_path) as writer:
    df.to_excel(writer, sheet_name='Master')
Now I want to add a second sheet that contains Excel charts built from the data, and I have extended the above code to:
with pd.ExcelWriter(file_path) as writer:
    df.to_excel(writer, sheet_name='Master')
    book = writer.book
    sheet = writer.sheets['Master']
    chart_a = book.add_chart({'type': 'line'})
    chart_a.add_series({
        'categories': ['Master', 1, 0, trend_data_row, 0],
        'values': ['Master', 1, 1, trend_data_row, 1],
    })
    chart_a.set_x_axis({'name': 'time', 'position_axis': 'on_tick'})
    chart_a.set_y_axis({'name': 'value'})
    chart_a.set_legend({'position': 'Bottom'})
    sheet.insert_chart('A11', chart_a)
    writer.save()
This adds the chart to the 'Master' sheet as expected, but I don't understand how to create the second sheet and insert the chart there instead. I have tried changing the name in sheet = writer.sheets[...] to a new one, 'Graphs', but I guess it's looking for an existing sheet with that name rather than creating one. Any help is really appreciated.
I don't understand how to create the second sheet and insert the chart there instead.
You can do it like this:
import pandas as pd
# Create a Pandas dataframe from some data.
df = pd.DataFrame({'Data': [10, 20, 30, 20, 15, 30, 45]})
# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter('pandas_chart.xlsx', engine='xlsxwriter')
# Convert the dataframe to an XlsxWriter Excel object.
df.to_excel(writer, sheet_name='Master')
# Get the xlsxwriter workbook object.
workbook = writer.book
# Add a new worksheet.
worksheet = workbook.add_worksheet('Graphs')
# Create a chart object.
chart = workbook.add_chart({'type': 'column'})
# Configure the series of the chart from the dataframe data.
chart.add_series({'values': ['Master', 1, 1, 7, 1]})
# Insert the chart into the worksheet.
worksheet.insert_chart('D2', chart)
# Close the Pandas Excel writer and output the Excel file.
writer.close()
then either writes it to a new xlsx file or appends to an existing xlsx file
XlsxWriter cannot write to an existing file.
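Applied to the line chart from the question, the same idea might look like the sketch below. The DataFrame contents and the temporary output path are made up for illustration, and trend_data_row is assumed to be the last data row; only add_worksheet, add_chart and insert_chart are doing the real work here.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the data read from the CSV file.
df = pd.DataFrame({'time': [0, 1, 2, 3, 4], 'value': [10, 20, 30, 20, 15]})
trend_data_row = len(df)  # assumed: last data row (row 0 holds the header)

# Temporary location so the sketch does not overwrite anything.
file_path = os.path.join(tempfile.mkdtemp(), 'report.xlsx')

with pd.ExcelWriter(file_path, engine='xlsxwriter') as writer:
    df.to_excel(writer, sheet_name='Master', index=False)
    book = writer.book

    # Create the second sheet explicitly instead of looking it up in writer.sheets.
    graphs = book.add_worksheet('Graphs')

    chart_a = book.add_chart({'type': 'line'})
    chart_a.add_series({
        'categories': ['Master', 1, 0, trend_data_row, 0],
        'values': ['Master', 1, 1, trend_data_row, 1],
    })
    chart_a.set_x_axis({'name': 'time', 'position_axis': 'on_tick'})
    chart_a.set_y_axis({'name': 'value'})
    chart_a.set_legend({'position': 'bottom'})

    # The series still reads from 'Master'; only the chart lives on 'Graphs'.
    graphs.insert_chart('A1', chart_a)
```

The with block closes the writer on exit, so no explicit save or close call is needed in this form.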
Related
I am using xlsxwriter to generate a file with quite a few formulas. From there, I want to create a table on another sheet. Everything is pretty straightforward until I want to use data from a different sheet for the table.
The documentation only shows examples of already having the data you need, and then passing that to the .add_table as the 'data' parameter.
What I am trying to do is this: (Which is structured how the rest of xlsxwriter's formulas are.)
df = pd.DataFrame(stuff)
writer = pd.ExcelWriter('File.xlsx', engine = 'xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1')
workbook = writer.book
worksheet1 = writer.sheets['Sheet1']
worksheet2 = workbook.add_worksheet('Summary Page')
data = f"'Sheet1'!$A$1:$D${len(df)}"
worksheet2.add_table(f'A1:D{len(df)}', {'data':data})
workbook.close()
This approach adds the new sheet and creates a table of the correct size, but then fills the first column with the 'data' string itself, one character per cell.
Is there a way to create a table referencing data from another sheet using xlsxwriter?
ExcelWriter is (obviously) for writing Excel files.
If you want to read data back from Excel after writing and saving it (did I understand you correctly?), use ExcelFile.parse or read_excel to load the data into a DataFrame, then write it to Excel again with ExcelWriter.
Unfortunately, XlsxWriter does not support appending, so with that engine you have to load and rewrite all sheets. Alternatively, use openpyxl as the engine, which does support append mode. The engine argument can often be omitted, but it is given explicitly in this minimal working example to make the point:
import pandas as pd
df = pd.DataFrame({'Data': [10, 20, 30, 20, 15, 30, 45]})
writer = pd.ExcelWriter('test.xlsx', engine='openpyxl')
df.to_excel(writer, sheet_name='Sheet1')
writer.close()
data = pd.read_excel('test.xlsx', usecols='A:B', sheet_name='Sheet1', index_col=0)
writer = pd.ExcelWriter('test.xlsx', engine='openpyxl', mode='a')
# shape our data here
data.to_excel(writer, sheet_name='Sheet2')
writer.close()
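If the goal is simply to have the same values appear as a table on a second sheet, one alternative is to hand the rows to add_table directly. As far as I can tell, the 'data' option expects a list of row lists rather than a range string, which would explain why the string in the question was spread out one character per cell. A minimal sketch, with made-up column names and a temporary output path:

```python
import os
import tempfile

import pandas as pd

# Hypothetical data standing in for `stuff` in the question.
df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6], 'c': [7, 8, 9], 'd': [10, 11, 12]})
path = os.path.join(tempfile.mkdtemp(), 'File.xlsx')

with pd.ExcelWriter(path, engine='xlsxwriter') as writer:
    df.to_excel(writer, sheet_name='Sheet1', index=False)
    summary = writer.book.add_worksheet('Summary Page')

    n_rows, n_cols = df.shape
    # Row 0 is the table header, so the table spans n_rows + 1 rows in total.
    summary.add_table(0, 0, n_rows, n_cols - 1, {
        'data': df.values.tolist(),  # list of row lists, not a range string
        'columns': [{'header': str(name)} for name in df.columns],
    })
```

Note that this copies the values; the table does not stay linked to Sheet1 if that sheet is edited later in Excel.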
I create an Excel file from a dataframe:
#writer = pd.ExcelWriter('ΜΟΝΑΔΙΚΕΣ_ΠΡΟΣΛΗΨΕΙΣ.xlsx', engine='xlsxwriter')
#uniq_pros.to_excel(writer, sheet_name='Sheet1')
#writer.save()
How can I add a watermark to every page of the Excel file, or a header with logo text and an image at the top of every page (or every 25 lines?), using Python?
The usual way to add a watermark in Excel (as suggested by Microsoft) is to add an image to the header. Here is one way to do it via Pandas and XlsxWriter:
import pandas as pd
# Create a Pandas dataframe from some data.
df = pd.DataFrame({'Data': [10, 20, 30, 20, 15, 30, 45]})
# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter('test.xlsx', engine='xlsxwriter')
# Convert the dataframe to an XlsxWriter Excel object.
df.to_excel(writer, sheet_name='Sheet1')
# Get the xlsxwriter workbook and worksheet objects.
workbook = writer.book
worksheet = writer.sheets['Sheet1']
# Set a worksheet header with image.
worksheet.set_header('&C&[Picture]',
                     {'image_center': 'watermark.png'})
# Close the Pandas Excel writer and output the Excel file.
writer.close()
See also Example: Adding Headers and Footers to Worksheets in the XlsxWriter docs.
I am using Pandas and Python to transfer all the data from one Excel sheet to a completely new sheet. The data is updated every day, so the new Excel sheet is overwritten with the updated data each time. I have been trying to get the overwritten data formatted as a table in the new Excel file.
When I run the code I wrote for the table, it overwrites the data but doesn't put it into a table format.
The following code is what I have so far for the table formatting:
import xlsxwriter
excel_file_1 = 'Incident Report.xls'
df_first_shift = pd.read_excel(r'C:\Users\M\Downloads\Incident Report.xls')
print(df_first_shift)
df_all = pd.concat([df_first_shift])
writer = pd.ExcelWriter('Incident_Report.xlsx', engine = 'xlsxwriter')
dataframe = pd.DataFrame({'Data': [df_all.to_excel(r'C:\Users\M\OneDrive\Test\Incident_Report.xlsx', engine = 'xlsxwriter')]})
dataframe.to_excel(writer, sheet_name='Sheet1', startrow=1, header=False, index=False)
workbook = writer.book
worksheet = writer.sheets['Sheet1']
(max_row, max_col) = dataframe.shape
column_settings = [{'header':column} for column in dataframe.columns]
worksheet.add_table(0,0, max_row, max_col - 1, {'columns' : column_settings})
worksheet.set_column(0, max_col - 1, 12)
writer.save()
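One likely cause is that the code above writes df_all to the file with a separate to_excel call and then builds the table from a wrapper DataFrame holding that call's return value (None). A minimal sketch of the usual pattern follows, writing df_all itself through the writer and then laying the table over the same cells. The sample data and the temporary output path are placeholders, not the real incident report.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the frame loaded with pd.read_excel.
df_all = pd.DataFrame({'Incident': ['A-1', 'A-2', 'A-3'], 'Shift': [1, 1, 2]})
out_path = os.path.join(tempfile.mkdtemp(), 'Incident_Report.xlsx')

with pd.ExcelWriter(out_path, engine='xlsxwriter') as writer:
    # Skip the header and leave row 0 free: the table writes its own header row.
    df_all.to_excel(writer, sheet_name='Sheet1', startrow=1, header=False, index=False)

    worksheet = writer.sheets['Sheet1']
    max_row, max_col = df_all.shape

    # One header entry per DataFrame column.
    column_settings = [{'header': str(column)} for column in df_all.columns]

    # Table covers the header row plus max_row data rows.
    worksheet.add_table(0, 0, max_row, max_col - 1, {'columns': column_settings})
    worksheet.set_column(0, max_col - 1, 12)
```

The key difference is that the shape and column names come from df_all, and the file is written only once, by the writer.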
I am storing a pandas DataFrame in an Excel sheet. When I re-run my code, I want the sheet completely overwritten. This is important because my code writes to the same file a few different times, i.e., loading and saving certain sheets at different moments, not wanting to disturb the sheets not currently being changed. Because of this, if a new iteration of the code produces fewer rows or columns, the old data will still be there. For example, if iteration #1 produces 500 rows but iteration #2 only produces 499, that 500th row will still show up in my Excel file.
I'm aware I could loop through all the cells and set their values to None, but I thought it would be more efficient to remove a given sheet completely, create_sheet with the same sheet name, and then save my DataFrame to the new sheet. The code below is an MRE of what I'm trying to do. It successfully removes the sheet, creates a new one, and saves the file, but the to_excel call doesn't seem to execute. The resulting Excel file has the 'test' sheet, but it is blank.
import pandas as pd
import numpy as np
import openpyxl
from openpyxl import load_workbook
from openpyxl import Workbook
df_data = {'A': np.random.randint(1, 50, 20),
           'B': np.random.randint(1, 50, 20),
           'C': np.random.randint(1, 50, 20),
           'D': np.random.randint(1, 50, 20)}
df = pd.DataFrame(data=df_data)
fn = 'test.xlsx'
sheet = 'test'
df.to_excel(fn, sheet_name=sheet)
df2 = pd.read_excel(fn, sheet_name=sheet, index_col=0)
df2.drop(columns=['A'], inplace=True)
book = load_workbook(fn)
writer = pd.ExcelWriter(fn, engine='openpyxl')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
s = book[sheet]
book.remove(s)
book.create_sheet(sheet, 0)
#THIS CODE WILL ACTUALLY WRITE TO THE SHEET, BUT df2 WILL NOT
#s2 = book[sheet]
#s2['A1'] = 'This will write to the sheet'
df2.to_excel(writer, sheet_name=sheet)
writer.save()
Note that my commented code will write to the proper sheet if uncommented. It seems to just be the to_excel line that doesn't work.
You could do this by using a function:
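import pandas as pd

def write2excel(filename, sheetname, dataframe):
    # Open the existing workbook in append mode so the other sheets are preserved.
    with pd.ExcelWriter(filename, engine='openpyxl', mode='a') as writer:
        workBook = writer.book
        try:
            # Drop the old sheet so no stale rows or columns survive.
            workBook.remove(workBook[sheetname])
        except KeyError:
            print("There is no such sheet in this file")
        dataframe.to_excel(writer, sheet_name=sheetname, index=False)
    # The with block saves the file on exit; no explicit save() is needed.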
After this, assuming you have a DataFrame df, a workbook Myfile.xlsx, and a sheet THE_sheet that you want to overwrite, call:
write2excel('Myfile.xlsx','THE_sheet',df)
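On pandas 1.3 or later, a similar result should be possible without touching the workbook object, by passing if_sheet_exists='replace' together with mode='a'. The sketch below assumes openpyxl is installed and uses a temporary file and made-up frames; it writes 500 rows, replaces the sheet with 499 rows, and leaves a second sheet alone.

```python
import os
import tempfile

import pandas as pd

fn = os.path.join(tempfile.mkdtemp(), 'test.xlsx')

# First run: two sheets, one of which should never be disturbed.
with pd.ExcelWriter(fn, engine='openpyxl') as writer:
    pd.DataFrame({'A': range(500)}).to_excel(writer, sheet_name='test', index=False)
    pd.DataFrame({'B': [1, 2]}).to_excel(writer, sheet_name='other', index=False)

# Second run: replace only the 'test' sheet with a shorter frame.
with pd.ExcelWriter(fn, engine='openpyxl', mode='a',
                    if_sheet_exists='replace') as writer:
    pd.DataFrame({'A': range(499)}).to_excel(writer, sheet_name='test', index=False)

# Read every sheet back into a dict of DataFrames to inspect the result.
result = pd.read_excel(fn, sheet_name=None)
```

Because the old sheet is dropped before the new one is written, the 500th row from the first run does not linger in the file.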
I'm creating an Excel dashboard and I want to generate an Excel workbook that has some dataframes on half of the sheets and .png files on the other half. I'm having difficulty writing them to the same file in one go. Here's what I currently have. It seems that once I run my for loop, I can't add additional worksheets. Any advice on how I might get my image files added to this workbook? I can't find anything about why I can't add any more worksheets. Thanks!
dfs = dict()
dfs['AvgVisitsData'] = avgvisits
dfs['F2FCountsData'] = f2fcounts
writer = pd.ExcelWriter("MyData.xlsx", engine='xlsxwriter')
for name, df in dfs.items():
    df.to_excel(writer, sheet_name=name, index=False)
Then I want to add a couple sheets with some images to the same excel workbook. Something like this, but where I'm not creating a whole new workbook.
workbook = xlsxwriter.Workbook('MyData.xlsx')
worksheet = workbook.add_worksheet('image1')
worksheet.insert_image('A1', 'MemberCollateral.png')
Anyone have any tips to work around this?
Here is an example of how to get a handle to the underlying XlsxWriter workbook and worksheet objects and insert an image:
import pandas as pd
# Create a Pandas dataframe from some data.
df = pd.DataFrame({'Data': [10, 20, 30, 20, 15, 30, 45]})
# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter('pandas_image.xlsx', engine='xlsxwriter')
# Convert the dataframe to an XlsxWriter Excel object.
df.to_excel(writer, sheet_name='Sheet1')
# Get the xlsxwriter workbook and worksheet objects.
workbook = writer.book
worksheet = writer.sheets['Sheet1']
# Insert an image.
worksheet.insert_image('D3', 'logo.png')
# Close the Pandas Excel writer and output the Excel file.
writer.close()
See also Working with Python Pandas and XlsxWriter in the XlsxWriter docs for more examples
Here's the solution I came up with. I still couldn't find a way to do this without re-importing the workbook with load_workbook, but this got the job done.
# assign dataframes to dictionary and export them to excel
avgvisits = pd.DataFrame(pd.read_sql(avgvisits(), cnxn))
f2fcounts = pd.DataFrame(pd.read_sql(f2fcounts(), cnxn))
activityencounters = pd.DataFrame(pd.read_sql(ActivityEncounters(), cnxn))
activityencountersp = activityencounters.pivot_table(values='ActivityCount', index = ['Activity'], columns= ['QuarterYear'], aggfunc=np.max)
dfs = dict()
dfs['AvgVisitsData'] = avgvisits
dfs['F2FIndirect'] = f2fcounts
dfs['ActivityEncounters'] = activityencountersp
writer = pd.ExcelWriter("MyData.xlsx", engine='xlsxwriter')
for name, df in dfs.items():
    if name != 'ActivityEncounters':
        df.to_excel(writer, sheet_name=name, index=False)
    else:
        df.to_excel(writer, sheet_name=name, index=True)
writer.close()
# re-import the excel book and add the graph image files
wb = load_workbook('MyData.xlsx')
png_loc = 'MemberCollateral.png'
wb.create_sheet('MemberCollateralGraph')
ws = wb['MemberCollateralGraph']
my_png = openpyxl.drawing.image.Image(png_loc)
ws.add_image(my_png, 'A1')
png_loc = 'DirectIndirect.png'
ws = wb['F2FIndirect']
my_png = openpyxl.drawing.image.Image(png_loc)
ws.add_image(my_png, 'A10')
png_loc = 'QuarterlyActivitySummary.png'
ws = wb['ActivityEncounters']
my_png = openpyxl.drawing.image.Image(png_loc)
ws.add_image(my_png, 'A10')
wb.save('MyData.xlsx')
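The re-import step can probably be avoided by adding the image sheets through writer.book before the writer is closed, since that is the same XlsxWriter workbook pandas is writing to. The sketch below is an assumption-laden illustration: the DataFrames are made up, the output goes to a temporary folder, and write_placeholder_png is a hypothetical helper that generates a 1x1 PNG so the example runs without the real chart images.

```python
import os
import struct
import tempfile
import zlib

import pandas as pd


def write_placeholder_png(path):
    """Hypothetical helper: write a 1x1 grayscale PNG standing in for a real chart."""
    def chunk(tag, data):
        body = tag + data
        return (struct.pack('>I', len(data)) + body
                + struct.pack('>I', zlib.crc32(body) & 0xFFFFFFFF))

    # Width, height, bit depth, colour type, compression, filter, interlace.
    header = struct.pack('>IIBBBBB', 1, 1, 8, 0, 0, 0, 0)
    pixels = zlib.compress(b'\x00\xff')  # filter byte + one white pixel
    with open(path, 'wb') as f:
        f.write(b'\x89PNG\r\n\x1a\n' + chunk(b'IHDR', header)
                + chunk(b'IDAT', pixels) + chunk(b'IEND', b''))


workdir = tempfile.mkdtemp()
png_loc = os.path.join(workdir, 'MemberCollateral.png')
write_placeholder_png(png_loc)

# Made-up frames standing in for the SQL query results.
dfs = {
    'AvgVisitsData': pd.DataFrame({'visits': [3, 5, 4]}),
    'F2FIndirect': pd.DataFrame({'count': [10, 12]}),
}

out_path = os.path.join(workdir, 'MyData.xlsx')
with pd.ExcelWriter(out_path, engine='xlsxwriter') as writer:
    for name, df in dfs.items():
        df.to_excel(writer, sheet_name=name, index=False)

    # New image-only sheet, created on the workbook pandas is already writing.
    image_sheet = writer.book.add_worksheet('MemberCollateralGraph')
    image_sheet.insert_image('A1', png_loc)

    # Image placed below the data on a sheet pandas created.
    writer.sheets['F2FIndirect'].insert_image('A10', png_loc)
```

The point is that everything happens on one workbook before it is closed; opening a second xlsxwriter.Workbook on the same filename, as in the question, would start a fresh file instead.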