Is there a way to insert a worksheet at a specified index using Pandas? With the code below, when adding a dataframe as a new worksheet, it gets added after the last sheet in the existing Excel file. What if I want to insert it at say index 1?
import pandas as pd
from openpyxl import load_workbook
f = 'existing_file.xlsx'
df = pd.DataFrame({'cat':['A','B'], 'word': ['C','D']})
book = load_workbook(f)
writer = pd.ExcelWriter(f, engine = 'openpyxl')
writer.book = book
df.to_excel(writer, sheet_name = 'sheet')
writer.save()
writer.close()
Thank you.
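Nobody in this thread shows a pandas-only way to do this, but one possible route is to let openpyxl create the sheet, since Workbook.create_sheet accepts a zero-based position. A minimal sketch under that assumption; the helper name insert_df_sheet is made up for illustration, and it writes the header plus the rows without the index:

```python
import pandas as pd
from openpyxl import load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows


def insert_df_sheet(path, df, sheet_name, index):
    """Write df to a new sheet placed at the given zero-based position.

    Hypothetical helper for illustration; not part of pandas or openpyxl.
    """
    wb = load_workbook(path)
    # create_sheet takes an optional position; omitting it appends at the end
    ws = wb.create_sheet(title=sheet_name, index=index)
    # Header row first, then one worksheet row per DataFrame row
    for row in dataframe_to_rows(df, index=False, header=True):
        ws.append(row)
    wb.save(path)
```

With the question's names, the call would look like insert_df_sheet('existing_file.xlsx', df, 'sheet', 1), which should place the new sheet second in the tab order.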
Related
I have code that should read the data from the current sheet, store it in df_old, append the current data with df = df_old.append(df), and then replace the data in the sheet with the new dataframe. Instead, it creates a new sheet with the same name and writes the new dataframe there. I tried adding if_sheet_exists="replace" as an argument to ExcelWriter, but this did not change anything. How can I force it to overwrite the data in the sheet with the current name?
df_old = pd.read_excel(r'C:\Users\XXX\Downloads\Digitalisation\mat_flow\reblend_v2.xlsx',sheet_name = ft_tags_final[i][j])
df = df_old.append(df)
with pd.ExcelWriter(r'C:\Users\XXX\Downloads\Digitalisation\mat_flow\reblend_v2.xlsx', engine="openpyxl", mode="a", if_sheet_exists="replace") as writer:
df.to_excel(writer, index=False, sheet_name = ft_tags_final[i][j])
I had the same issue and I solved it by using write instead of append. I also used openpyxl instead of xlsxwriter.
import pandas as pd
from openpyxl import load_workbook
book = load_workbook('Wallet.xlsx')
writer = pd.ExcelWriter('Wallet.xlsx', engine='openpyxl')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
# ^ The two lines above are the important part: they give pandas the existing sheets
Data.to_excel(writer, sheet_name='Main', header=None, index=False, startcol=number,startrow=counter)
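For comparison, on newer pandas releases the read, append, and replace cycle from the question may work without touching writer.book at all. The sketch below assumes pandas 1.3 or later (where if_sheet_exists is available) with openpyxl installed, and uses pd.concat because DataFrame.append was removed in pandas 2.0. The function name is hypothetical:

```python
import pandas as pd


def append_and_replace_sheet(path, sheet_name, new_rows):
    """Read a sheet, add new_rows below the old data, and rewrite the sheet.

    Hypothetical helper; assumes the sheet already exists in the workbook.
    """
    df_old = pd.read_excel(path, sheet_name=sheet_name)
    combined = pd.concat([df_old, new_rows], ignore_index=True)
    # mode="a" keeps the other sheets; "replace" drops and recreates this one
    with pd.ExcelWriter(path, engine="openpyxl", mode="a",
                        if_sheet_exists="replace") as writer:
        combined.to_excel(writer, sheet_name=sheet_name, index=False)
    return combined
```

If a second sheet with a numeric suffix still appears, an older pandas version that ignores if_sheet_exists is a likely cause.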
Hi, I'm trying to loop through a list to create multiple Excel files (not sheets). With the code below I was able to create a separate dataframe per company, but I was wondering if anyone has an example of looping through a list and creating one Excel file per company, named after the company.
from datetime import date, time
import xlsxwriter
import pandas as pd
from openpyxl import load_workbook
import openpyxl
#create a list of company - this list contains company names
dfList = list(set(df['Company']))
#create a separate DF by each company
for i, row in enumerate(dfList):
    dfName = dfNames[i]
    dfNew = df1[df1['Company'] == row]
    globals()[dfName] = dfNew
#create excel files, format cells, and export dataframe in each excel files
for i, row in enumerate(dfList):
    workbook = xlsxwriter.Workbook('C:/Users/user/row.xlsx')
    worksheet = workbook.add_worksheet()
    worksheet.write('C1', 'Work Location')
    worksheet.data_validation('C2:C100', {'validate': 'list','source':['WFH', 'Office']})
    workbook.close()
    #Update excel file created above with dataframe data
    book = load_workbook('C:/Users/user/row.xlsx')
    writer = pandas.ExcelWriter('C:/Users/user/row.xlsx', engine='openpyxl')
    writer.book = book
    writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
    row.to_excel(writer, "Sheet1", columns=['Company','Sales'])
    writer.save()
I found an answer and this worked for me
from datetime import date, time
import xlsxwriter
import pandas as pd
import os
from openpyxl import load_workbook
import openpyxl
#create a list of company - this list contains company names
dfList = list(set(df['Company']))
#create a separate DF by each company
for i, row in enumerate(dfList):
    dfName = dfNames[i]
    dfNew = df1[df1['Company'] == row]
    globals()[dfName] = dfNew
#create excel files, format cells, and export dataframe in each excel files
directory = r'C:/Users/user'
for i, row in enumerate(dfList):
    # row is a company name; build the path once and add the extension
    # (assumes the company names do not already end in .xlsx)
    path = os.path.join(directory, row + '.xlsx')
    workbook = xlsxwriter.Workbook(path)
    worksheet = workbook.add_worksheet()
    worksheet.write('C1', 'Work Location')
    worksheet.data_validation('C2:C100', {'validate': 'list','source':['WFH', 'Office']})
    workbook.close()
    #Update excel file created above with dataframe data
    book = load_workbook(path)
    writer = pd.ExcelWriter(path, engine='openpyxl')
    writer.book = book
    writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
    # write this company's dataframe (row itself is only the name string)
    df1[df1['Company'] == row].to_excel(writer, sheet_name="Sheet1", columns=['Company','Sales'])
    writer.save()
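A shorter variant of the same idea, offered only as a sketch: group the dataframe by company and write each group in a single pass with the xlsxwriter engine, so the file does not have to be reloaded with openpyxl. It assumes the 'Company' and 'Sales' columns from the code above, company names that are valid file names, and a made-up function name. Unlike the code above, it leaves the index out so that column C stays free for the dropdown:

```python
import os

import pandas as pd


def write_company_files(df, directory):
    """Write one .xlsx per distinct value in df['Company']; return the paths.

    Hypothetical helper; the column names mirror the ones used above.
    """
    paths = []
    for company, group in df.groupby("Company"):
        path = os.path.join(directory, f"{company}.xlsx")
        with pd.ExcelWriter(path, engine="xlsxwriter") as writer:
            group.to_excel(writer, sheet_name="Sheet1", index=False,
                           columns=["Company", "Sales"])
            # writer.sheets maps sheet names to XlsxWriter worksheet objects
            worksheet = writer.sheets["Sheet1"]
            worksheet.write("C1", "Work Location")
            worksheet.data_validation(
                "C2:C100", {"validate": "list", "source": ["WFH", "Office"]}
            )
        paths.append(path)
    return paths
```

Grouping also avoids creating one global variable per company, since each group is already the filtered dataframe.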
I'm creating an Excel dashboard and I want to generate a workbook that has dataframes on half of the sheets and .png files on the other half. I'm having difficulty writing them to the same file in one go. Here's what I currently have. It seems that after I run my for loop, it won't let me add additional worksheets. Any advice on how I might get my image files added to this workbook? I can't find anything about why I can't add any more worksheets. Thanks!
dfs = dict()
dfs['AvgVisitsData'] = avgvisits
dfs['F2FCountsData'] = f2fcounts
writer = pd.ExcelWriter("MyData.xlsx", engine='xlsxwriter')
for name, df in dfs.items():
    df.to_excel(writer, sheet_name=name, index = False)
Then I want to add a couple sheets with some images to the same excel workbook. Something like this, but where I'm not creating a whole new workbook.
workbook = xlsxwriter.Workbook('MyData.xlsx')
worksheet = workbook.add_worksheet('image1')
worksheet.insert_image('A1', 'MemberCollateral.png')
Anyone have any tips to work around this?
Here is an example of how to get a handle to the underlying XlsxWriter workbook and worksheet objects and insert an image:
import pandas as pd
# Create a Pandas dataframe from some data.
df = pd.DataFrame({'Data': [10, 20, 30, 20, 15, 30, 45]})
# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter('pandas_image.xlsx', engine='xlsxwriter')
# Convert the dataframe to an XlsxWriter Excel object.
df.to_excel(writer, sheet_name='Sheet1')
# Get the xlsxwriter workbook and worksheet objects.
workbook = writer.book
worksheet = writer.sheets['Sheet1']
# Insert an image.
worksheet.insert_image('D3', 'logo.png')
# Close the Pandas Excel writer and output the Excel file.
writer.save()
See also "Working with Python Pandas and XlsxWriter" in the XlsxWriter docs for more examples.
Here's the solution I came up with. I still couldn't find a way to do this without re-importing the workbook with load_workbook, but this got the job done.
import numpy as np
import openpyxl
import pandas as pd
from openpyxl import load_workbook
# assign dataframes to dictionary and export them to excel
avgvisits = pd.DataFrame(pd.read_sql(avgvisits(), cnxn))
f2fcounts = pd.DataFrame(pd.read_sql(f2fcounts(), cnxn))
activityencounters = pd.DataFrame(pd.read_sql(ActivityEncounters(), cnxn))
activityencountersp = activityencounters.pivot_table(values='ActivityCount', index = ['Activity'], columns= ['QuarterYear'], aggfunc=np.max)
dfs = dict()
dfs['AvgVisitsData'] = avgvisits
dfs['F2FIndirect'] = f2fcounts
dfs['ActivityEncounters'] = activityencountersp
writer = pd.ExcelWriter("MyData.xlsx", engine='xlsxwriter')
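for name, df in dfs.items():
    if name != 'ActivityEncounters':
        df.to_excel(writer, sheet_name=name, index=False)
    else:
        # the pivot table keeps its index (the Activity labels)
        df.to_excel(writer, sheet_name=name, index=True)
# close() also writes the file, so a separate save() call is not needed
writer.close()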
# re-import the excel book and add the graph image files
wb = load_workbook('MyData.xlsx')
png_loc = 'MemberCollateral.png'
wb.create_sheet('MemberCollateralGraph')
ws = wb['MemberCollateralGraph']
my_png = openpyxl.drawing.image.Image(png_loc)
ws.add_image(my_png, 'A1')
png_loc = 'DirectIndirect.png'
ws = wb['F2FIndirect']
my_png = openpyxl.drawing.image.Image(png_loc)
ws.add_image(my_png, 'A10')
png_loc = 'QuarterlyActivitySummary.png'
ws = wb['ActivityEncounters']
my_png = openpyxl.drawing.image.Image(png_loc)
ws.add_image(my_png, 'A10')
wb.save('MyData.xlsx')
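It may be possible to skip the reload entirely: with the xlsxwriter engine, writer.book is the underlying XlsxWriter workbook, so image-only sheets can be added to it before the writer is closed. A rough sketch, assuming the image files exist on disk until the writer closes; the function and argument names are hypothetical:

```python
import pandas as pd


def write_frames_and_images(path, frames, images):
    """Write each DataFrame to its own sheet, then add one sheet per image.

    frames: dict mapping sheet name -> DataFrame
    images: dict mapping sheet name -> image file path
    Hypothetical helper for illustration.
    """
    with pd.ExcelWriter(path, engine="xlsxwriter") as writer:
        for name, df in frames.items():
            df.to_excel(writer, sheet_name=name, index=False)
        # writer.book is the underlying xlsxwriter.Workbook
        workbook = writer.book
        for name, image_path in images.items():
            worksheet = workbook.add_worksheet(name)
            worksheet.insert_image("A1", image_path)
```

A call such as write_frames_and_images("MyData.xlsx", dfs, {"image1": "MemberCollateral.png"}) would then produce the data sheets and the image sheet in one file.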
I tried to edit an existing Excel file. My file is test.xlsx, with two sheets, All and Summary. These are the steps I followed:
import pandas as pd
df = pd.read_csv('abc.csv')
number_rows = len(df.index)
writer = pd.ExcelWriter('test.xlsx')
df.to_excel(writer, sheet_name = 'All',startrow = number_rows)
writer.save()
I want to edit the file (append data to the sheet named All), but when I run this code, the sheets Summary and All seem to be deleted, and a new sheet named All is created with only my new data in it. So, how can I append data to an Excel sheet without deleting the existing data? Thank you.
You can use the openpyxl engine along with the startrow parameter.
You also need to:
1. read the csv into df first
2. open the xlsx as a workbook using openpyxl
3. create a writer object using openpyxl as the engine
4. add the sheets to the writer object
5. add df to the writer object
Your Code (modified):
import pandas as pd
from openpyxl import load_workbook
df = pd.read_csv('abc.csv')
number_rows = len(df.index)
book = load_workbook('test.xlsx')
writer = pd.ExcelWriter('test.xlsx', engine='openpyxl')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
df.to_excel(writer, sheet_name = 'All',startrow = number_rows)
writer.save()
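On more recent pandas versions, assigning writer.book and calling writer.save() may no longer work, so here is a hedged alternative for the same task. It assumes pandas 1.4 or later (for if_sheet_exists="overlay") with openpyxl installed, and it takes the start row from the existing sheet rather than from the length of the new data. It also leaves out the index and header so that only data rows are appended, which is a choice made here and not something the answer above does. The function name is made up:

```python
import pandas as pd


def append_rows(path, sheet_name, new_rows):
    """Append new_rows below the existing data in sheet_name, keeping other sheets.

    Hypothetical helper; assumes pandas 1.4+ with openpyxl installed.
    """
    with pd.ExcelWriter(path, engine="openpyxl", mode="a",
                        if_sheet_exists="overlay") as writer:
        # max_row is the 1-based last used row, which equals the 0-based
        # index of the first free row that startrow expects
        first_free_row = writer.book[sheet_name].max_row
        new_rows.to_excel(writer, sheet_name=sheet_name,
                          startrow=first_free_row, index=False, header=False)
```

For the question's file this would be called as append_rows('test.xlsx', 'All', df), leaving the Summary sheet untouched.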
I have a pandas dataframe and I want to open an existing Excel workbook containing formulas, copy the dataframe into a specific set of columns (let's say from column A to column H), and save it as a new file with a different name.
The idea is to update an existing template, populate it with the dataframe in a specified set of column and then save a copy of the Excel file with a different name.
Any idea?
What I have is:
import pandas
from openpyxl import load_workbook
book = load_workbook('Template.xlsx')
writer = pandas.ExcelWriter('Template.xlsx', engine='openpyxl')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
df.to_excel(writer)
writer.save()
The code below should work, assuming that you are happy to copy into column A. I don't see a way to write into the sheet starting in a different column (without overwriting anything).
It incorporates @MaxU's suggestion of copying the template sheet before writing to it (having just lost a few hours' work on my own template workbook to pd.to_excel).
import pandas as pd
from openpyxl import load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows
from shutil import copyfile
template_file = 'Template.xlsx' # Has a header in row 1 already
output_file = 'Result.xlsx' # What we are saving the template as
# Copy Template.xlsx as Result.xlsx
copyfile(template_file, output_file)
# Read in the data to be pasted into the template
df = pd.read_csv('my_data.csv')
# Load the workbook and access the sheet we'll paste into
wb = load_workbook(output_file)
# wb[...] replaces the deprecated get_sheet_by_name()
ws = wb['Existing Result Sheet']
# Selecting a cell in the header row before writing makes append()
# start writing to the following line i.e. row 2
ws['A1']
# Write each row of the DataFrame
# In this case, I don't want to write the index (useless) or the header (already in the template)
for r in dataframe_to_rows(df, index=False, header=False):
    ws.append(r)
wb.save(output_file)
Try this:
df.to_excel(writer, startrow=10, startcol=1, index=False, engine='openpyxl')
Pay attention to the startrow and startcol parameters.
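If staying with plain openpyxl is preferred, writing cell by cell is one way to start in an arbitrary column of the copied template. This is only a sketch: the function name and the start_row/start_col arguments are made up, rows and columns are 1-based as in openpyxl, and any existing cells in the target range are overwritten:

```python
from shutil import copyfile

import pandas as pd
from openpyxl import load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows


def fill_template(template_file, output_file, df, sheet_name,
                  start_row=2, start_col=1):
    """Copy the template, then write df cell by cell from (start_row, start_col).

    Hypothetical helper; the template file itself is never modified.
    """
    copyfile(template_file, output_file)
    wb = load_workbook(output_file)  # formulas are kept as formulas by default
    ws = wb[sheet_name]
    rows = dataframe_to_rows(df, index=False, header=False)
    for r_offset, row in enumerate(rows):
        for c_offset, value in enumerate(row):
            ws.cell(row=start_row + r_offset, column=start_col + c_offset,
                    value=value)
    wb.save(output_file)
```

For example, fill_template('Template.xlsx', 'Result.xlsx', df, 'Existing Result Sheet', start_row=2, start_col=2) would start writing at cell B2.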