How can I export a list of DataFrames into one Excel spreadsheet?
The docs for to_excel state:
Notes
If passing an existing ExcelWriter object, then the sheet will be added
to the existing workbook. This can be used to save different
DataFrames to one workbook
writer = ExcelWriter('output.xlsx')
df1.to_excel(writer, 'sheet1')
df2.to_excel(writer, 'sheet2')
writer.save()
Following this, I thought I could write a function which saves a list of DataFrames to one spreadsheet as follows:
from openpyxl.writer.excel import ExcelWriter
def save_xls(list_dfs, xls_path):
    writer = ExcelWriter(xls_path)
    for n, df in enumerate(list_dfs):
        df.to_excel(writer,'sheet%s' % n)
    writer.save()
However (with a list of two small DataFrames, each of which can save to_excel individually), an exception is raised (Edit: traceback removed):
AttributeError: 'str' object has no attribute 'worksheets'
Presumably I am not calling ExcelWriter correctly. How should I call it to do this?
You should be using pandas' own ExcelWriter class:
from pandas import ExcelWriter
# from pandas.io.parsers import ExcelWriter
Then the save_xls function works as expected:
def save_xls(list_dfs, xls_path):
    with ExcelWriter(xls_path) as writer:
        for n, df in enumerate(list_dfs):
            df.to_excel(writer, sheet_name='sheet%s' % n)
In case anyone needs an example using a dictionary of DataFrames:
from pandas import ExcelWriter
def save_xls(dict_df, path):
    """
    Save a dictionary of DataFrames to an Excel file,
    with each DataFrame on a separate sheet.
    """
    # The context manager saves and closes the file on exit.
    with ExcelWriter(path) as writer:
        for key, df in dict_df.items():
            df.to_excel(writer, sheet_name=key)
Example:
save_xls(dict_df=my_dict, path='~/my_path.xlsx')
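Because each dictionary key becomes a sheet name, keys that are too long or contain certain characters can be rejected when the file is written. Excel limits sheet names to 31 characters and does not allow [ ] : * ? / \ in them. The helper below is a minimal sketch of one way to map arbitrary keys to unique, valid names; safe_sheet_names is a hypothetical name, not part of pandas.

```python
import re

# Characters Excel does not allow in a sheet name.
_INVALID = re.compile(r"[\[\]:*?/\\]")
MAX_LEN = 31  # Excel's limit on sheet-name length


def safe_sheet_names(keys):
    """Map each key to a unique sheet name that Excel will accept."""
    used = set()
    result = {}
    for key in keys:
        # Replace invalid characters, truncate, and fall back if empty.
        base = _INVALID.sub("_", str(key))[:MAX_LEN] or "sheet"
        name, n = base, 1
        # Excel compares sheet names case-insensitively, so dedupe on lower().
        while name.lower() in used:
            suffix = "_%d" % n
            name = base[:MAX_LEN - len(suffix)] + suffix
            n += 1
        used.add(name.lower())
        result[key] = name
    return result
```

Inside save_xls you could then build names = safe_sheet_names(dict_df) once and pass sheet_name=names[key] to to_excel.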
Sometimes there can be issues (see "Writing an excel file containing unicode") if the DataFrame contains characters that the writer engine does not support. To work around this, we can use the 'xlsxwriter' package, as in the case below.
For the code below:
from pandas import ExcelWriter
import xlsxwriter
writer = ExcelWriter('notes.xlsx')
for key in dict_df:
    dict_df[key].to_excel(writer, sheet_name=key, index=False)
writer.close()
I got an "IllegalCharacterError".
The code that worked:
%pip install xlsxwriter
from pandas import ExcelWriter
import xlsxwriter
# Select the engine when creating the writer.
writer = ExcelWriter('notes.xlsx', engine='xlsxwriter')
for key in dict_df:
    dict_df[key].to_excel(writer, sheet_name=key, index=False)
writer.close()
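An alternative to switching engines is to remove the offending characters before writing. As far as I know, openpyxl raises IllegalCharacterError for ASCII control characters other than tab, newline and carriage return, so the sketch below strips that range. Both the helper name strip_illegal and the exact character range are assumptions on my part, not something taken from the pandas docs.

```python
import re

# ASCII control characters except tab (\x09), newline (\x0a) and
# carriage return (\x0d). Assumed to match what openpyxl rejects.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def strip_illegal(value):
    """Remove control characters from strings; return other values unchanged."""
    if isinstance(value, str):
        return _CONTROL_CHARS.sub("", value)
    return value
```

Each text column can then be cleaned with df[col].map(strip_illegal) before calling to_excel.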
Related
I have come across a lot of answers and just wanted to check if this is the best answer
Write pandas dataframe values to excel to specific cell in a specific sheet.
The question is - assuming I have a dataframe "df".
I want to write to an existing excel file called "Name1.xlsx", in
worksheet called "exampleNames", and starting at cell d25.
What's the easiest/most efficient way to do that?
###############Updated!#############
I tried this
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import openpyxl
path = "C:\\Users\\ABC\\PycharmProjects\\ABC\\Name1.xlsx"
df = pd.DataFrame(np.random.randint(1,10,(3,2)),columns=['a','b'])
df.to_excel(path,sheet_name="exampleNames",startcol=5,startrow=5,header=None,index=False)
df.to_excel(path,sheet_name="NN",startcol=5,startrow=25,header=None,index=False)
This gave me the error:
ModuleNotFoundError: No module named 'openpyxl'
This is the approach suggested in the pandas docs
df.to_excel(writer, sheet_name='Sheet1', startcol=col,startrow=row, header=None)
where writer can be a path-like object, a file-like object, or an ExcelWriter object.
For example:
df.to_excel('sample.xlsx',sheet_name="exampleNames",startcol=5,startrow=5,header=None)
To save multiple DataFrames to one Excel file, you will have to use the writer object:
with pd.ExcelWriter('output.xlsx', engine="openpyxl", mode='a', if_sheet_exists='overlay') as writer:
    df1.to_excel(writer, sheet_name='exampleNames', startcol=5, startrow=5, header=None, index=False)
    df2.to_excel(writer, sheet_name='NN', startcol=5, startrow=25, header=None, index=False)
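Note that startrow and startcol are zero-based, so the cell D25 from the question corresponds to startrow=24 and startcol=3. The small helper below is a sketch that converts an A1-style reference into that pair; cell_to_start is a hypothetical name used only for illustration.

```python
import re


def cell_to_start(cell_ref):
    """Convert a reference such as 'D25' to zero-based (startrow, startcol)."""
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", cell_ref)
    if match is None:
        raise ValueError("not an A1-style cell reference: %r" % cell_ref)
    letters, digits = match.groups()
    col = 0
    # Column letters are a base-26 number with A=1 ... Z=26.
    for ch in letters.upper():
        col = col * 26 + (ord(ch) - ord("A") + 1)
    return int(digits) - 1, col - 1
```

With it, the call becomes startrow, startcol = cell_to_start('D25'), followed by df.to_excel(writer, sheet_name='exampleNames', startrow=startrow, startcol=startcol, header=False, index=False).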
I have refined an existing xlsx file and want to create three new files based on its content. I can produce the three new outputs, but I am not able to write them to new xlsx files.
I tried installing excelwriter, but that didn't fix my problem.
import pandas as pd
import xlsxwriter
xl_file = pd.ExcelFile('C:\\Users\\python_codes\\myfile.xlsx')
dfs = pd.read_excel('myfile.xlsx', sheetname="Sheet1")
test = dfs.drop_duplicates(subset='DetectionId', keep='first', inplace=False)
dfs2 = test[test['list_set_id'] == 1]
print(dfs2)
writer = dfs2.ExcelWriter('newfile.xlxs', engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1')
writer.save()
I want to write a new xlsx file with the filtered content from the existing file.
ExcelWriter belongs to the pandas module, not to a DataFrame instance.
writer = dfs2.ExcelWriter should be writer = pd.ExcelWriter
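A sketch of the corrected write step, wrapped in a function so it can be tried on any DataFrame. The function name write_filtered and the sample rows are made up for illustration; the column names come from the question. The engine is left to pandas' default here, and the frame being written is dfs2 (the question's code passes df, which is never defined).

```python
import pandas as pd


def write_filtered(dfs, out_path):
    """Drop duplicate detections, keep list_set_id == 1, write one sheet."""
    test = dfs.drop_duplicates(subset="DetectionId", keep="first")
    dfs2 = test[test["list_set_id"] == 1]
    # ExcelWriter is looked up on the pandas module (pd), not on dfs2.
    with pd.ExcelWriter(out_path) as writer:
        dfs2.to_excel(writer, sheet_name="Sheet1")
    return dfs2


# Made-up sample rows standing in for the contents of myfile.xlsx.
sample = pd.DataFrame(
    {"DetectionId": [10, 10, 11, 12], "list_set_id": [1, 1, 2, 1]}
)
```

Calling write_filtered(sample, 'newfile.xlsx') would write the two remaining rows; note the extension is .xlsx, not .xlxs as in the question.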
I'm trying to create an Excel file with pandas for a database I have generated.
I have tried both:
import pandas as pd
# write database to excel
df = pd.DataFrame(database)
# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter('fifa19.xlsx', engine='xlsxwriter')
# Convert the dataframe to an XlsxWriter Excel object.
df.to_excel(writer, sheet_name='Sheet1')
# Close the Pandas Excel writer and output the Excel file.
writer.save()
as well as:
import pandas as pd
df = pd.DataFrame(database).T
df.to_excel('database.xls')
However, neither option generates an Excel file. database is a dictionary.
From the Notes section of the pandas documentation itself:
If passing an existing ExcelWriter object, then the sheet will be added to the existing workbook. This can be used to save different DataFrames to one workbook:
>>> writer = pd.ExcelWriter('output.xlsx')
# writer = pd.ExcelWriter('/path_to_save/output.xlsx')
>>> df1.to_excel(writer,'Sheet1')
>>> df2.to_excel(writer,'Sheet2')
>>> writer.save()
For compatibility with to_csv, to_excel serializes lists and dicts to
strings before writing.
I am trying to write some text to a specific sheet in an Excel file. I export a number of pandas dataframes to the other tabs, but in this one I need only some text - basically some comments explaining how the other tabs were calculated.
I have tried this but it doesn't work:
import pandas as pd
writer=pd.ExcelWriter('myfile.xlsx')
writer.sheets['mytab'].write(1,1,'This is a test')
writer.close()
I have tried adding writer.book.add_worksheet('mytab') and
ws=writer.sheets['mytab']
ws.write(1,1,'This is a test')
but in all cases I am getting: KeyError: 'mytab'.
The only solution I have found is to write an empty dataframe to the tab before writing my text to the same tab:
emptydf=pd.DataFrame()
emptydf['x']=[None]
emptydf.to_excel(writer,'mytab',header=False, index=False)
I could of course create a workbook instance, as in the example on the documentation of xlsxwriter: http://xlsxwriter.readthedocs.io/worksheet.html
However, my problem is that I already have a pd.ExcelWriter instance, which is used in the rest of my code to create the other excel sheets.
I even tried passing a workbook instance to to_excel(), but it doesn't work:
workbook = xlsxwriter.Workbook('filename.xlsx')
emptydf.to_excel(workbook,'mytab',header=False, index=False)
Is there any alternative to my solution of exporting an empty dataframe - which seems as unpythonic as it can get?
You mentioned that you tried the add_worksheet() method of the writer.book object; it does seem to work and do what you want. Below is a reproducible example that ran successfully.
import pandas as pd
print(pd.__version__)
writer = pd.ExcelWriter('test.xlsx', engine='xlsxwriter')
workbook = writer.book
ws = workbook.add_worksheet('mytab')
ws.write(1,1,'This is a test')
writer.close()
Thought I'd also mention that I'm using pandas 0.18.1.
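For comparison, here is a sketch of the same idea with the openpyxl engine instead of xlsxwriter, mixing a DataFrame sheet and a text-only sheet in one writer. It assumes openpyxl is installed, so that writer.book is an openpyxl Workbook with a create_sheet method; write_with_notes and the sheet name 'data' are hypothetical names chosen for the example.

```python
import pandas as pd


def write_with_notes(path, df, notes):
    """Write df to a 'data' sheet and free text to a 'mytab' sheet."""
    # Assumes the openpyxl engine, so writer.book is an openpyxl Workbook.
    with pd.ExcelWriter(path, engine="openpyxl") as writer:
        df.to_excel(writer, sheet_name="data", index=False)
        ws = writer.book.create_sheet("mytab")
        # openpyxl cells are 1-based: row 2, column 2 is cell B2,
        # the same cell that xlsxwriter's zero-based write(1, 1, ...) targets.
        for offset, line in enumerate(notes):
            ws.cell(row=2 + offset, column=2, value=line)
```

For example, write_with_notes('test.xlsx', df, ['This is a test']) puts the comment in B2 of 'mytab' without writing an empty DataFrame first.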
I am trying to use ExcelWriter to write/add some information into a workbook that contains multiple sheets.
First time when I use the function, I am creating the workbook with some data. In the second call, I would like to add some information into the workbook in different locations into all sheets.
def Out_Excel(file_name,C,col):
    writer = pd.ExcelWriter(file_name,engine='xlsxwriter')
    for tab in tabs: # tabs here is provided from a different function that I did not write here to keep it simple and clean
        df = DataFrame(C) # the data is different for different sheets but I keep it simple in this case
        df.to_excel(writer,sheet_name = tab, startcol = 0 + col, startrow = 0)
    writer.save()
In the main code I call this function twice with different col to print out my data in different locations.
Out_Excel('test.xlsx',C,0)
Out_Excel('test.xlsx',D,10)
But the problem is that doing so the output is just the second call of the function as if the function overwrites the entire workbook. I guess I need to load the workbook that already exists in this case?
Any help?
Use load_workbook from openpyxl (see the xlsxwriter and openpyxl docs):
import pandas as pd
from openpyxl import load_workbook
book = load_workbook('test.xlsx')
writer = pd.ExcelWriter('test.xlsx', engine='openpyxl')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
df.to_excel(writer, sheet_name='tab_name')  # add any other to_excel parameters here
writer.save()
Pandas version 0.24.0 added the mode keyword, which allows you to append to Excel workbooks without jumping through the hoops we used to need. Just use mode='a' to append sheets to an existing workbook.
From the documentation:
with ExcelWriter('path_to_file.xlsx', mode='a') as writer:
    df.to_excel(writer, sheet_name='Sheet3')
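Applied to the Out_Excel function from the question, append mode could look like the sketch below. It assumes the openpyxl engine (the only one that supports mode='a') and a pandas version recent enough to accept if_sheet_exists='overlay'; the frames argument, a dict of tab name to DataFrame, is a stand-in for the question's tabs and data.

```python
import os

import pandas as pd


def out_excel(file_name, frames, col):
    """Write each DataFrame in frames (tab name -> DataFrame) at column col."""
    if os.path.exists(file_name):
        # Append to the existing workbook; 'overlay' writes into existing
        # sheets instead of replacing or renaming them.
        options = {"mode": "a", "if_sheet_exists": "overlay"}
    else:
        # First call: create the workbook (if_sheet_exists is append-only).
        options = {"mode": "w"}
    with pd.ExcelWriter(file_name, engine="openpyxl", **options) as writer:
        for tab, df in frames.items():
            df.to_excel(writer, sheet_name=tab, startcol=col,
                        startrow=0, index=False)
```

Calling out_excel('test.xlsx', first, 0) and then out_excel('test.xlsx', second, 10) should leave both blocks of data on each sheet, side by side.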
You could also try using the following method to create your Excel spreadsheet:
import pandas as pd
def generate_excel(csv_file, excel_loc, sheet_):
    data = pd.read_csv(csv_file, header=0, index_col=False)
    # The with block saves and closes the workbook on exit.
    with pd.ExcelWriter(excel_loc) as writer:
        data.to_excel(writer, sheet_name=sheet_, index=False)
Give this a try and let me know what you think.