I am trying to write a pandas data frame to an existing excel sheet on a new tab, but it gives me the following error:
AttributeError: 'NoneType' object has no attribute 'read'.
I've determined this is because pandas to_excel returns a NoneType object, which isn't allowing me to save the file with writer.save(). Does anyone know a workaround for this?
path = 'summary.xlsx'
book = load_workbook(path)
writer = pd.ExcelWriter(path, engine='openpyxl')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
df.to_excel(writer, sheet_name="results")
writer.save()
I had exactly the same issue.
I managed to work around it by removing the value in legacy_drawing from each sheet in the workbook.
path = 'summary.xlsx'
book = load_workbook(path)
writer = pd.ExcelWriter(path, engine='openpyxl')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
for s in list(writer.sheets.keys()):
writer.sheets[s].legacy_drawing = None
df.to_excel(writer, sheet_name="results")
writer.save()
Related
This code used to get a xlsx file and write over it, but after updating from pandas 1.1.5 to 1.5.1 I got zipfile.badzipfile file is not a zip file
Then I read here that after pandas 1.2.0 the pd.ExcelWriter(report_path, engine='openpyxl') creates a new file but as this is a completely empty file, openpyxl cannot load it.
Knowing that, I changed the code to this one, but now I'm getting AttributeError: property 'sheets' of 'OpenpyxlWriter' object has no setter. How should I handle this?
book = load_workbook('Resultados.xlsx')
writer = pd.ExcelWriter('Resultados.xlsx', engine='openpyxl')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
reader = pd.read_excel(r'Resultados.xlsx')
df = pd.DataFrame.from_dict(dict_)
df.to_excel(writer, index=False, header=False, startrow=len(reader) + 1)
writer.close()
TLDR
Use .update to modify writer.sheets
Rearrange the order of your script to get it working
# run before initializing the ExcelWriter
reader = pd.read_excel("Resultados.xlsx", engine="openpyxl")
book = load_workbook("Resultados.xlsx")
# use `with` to avoid other exceptions
with pd.ExcelWriter("Resultados.xlsx", engine="openpyxl") as writer:
writer.book = book
writer.sheets.update(dict((ws.title, ws) for ws in book.worksheets))
df.to_excel(writer, index=False, header=False, startrow=len(reader)+1)
Details
Recreating your problem with some fake data
import numpy as np
from openpyxl import load_workbook
import pandas as pd
if __name__ == "__main__":
# make some random data
np.random.seed(0)
df = pd.DataFrame(np.random.random(size=(5, 5)))
# this makes an existing file
with pd.ExcelWriter("Resultados.xlsx", engine="openpyxl") as writer:
df.to_excel(excel_writer=writer)
# make new random data
np.random.seed(1)
df = pd.DataFrame(np.random.random(size=(5, 5)))
# what you tried...
book = load_workbook("Resultados.xlsx")
writer = pd.ExcelWriter("Resultados.xlsx", engine="openpyxl")
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
reader = pd.read_excel("Resultados.xlsx")
# skipping this step as we defined `df` differently
# df = pd.DataFrame.from_dict(dict_)
df.to_excel(writer, index=False, header=False, startrow=len(reader)+1)
writer.close()
We get the same error plus a FutureWarning
...\StackOverflow\answer.py:23: FutureWarning: Setting the `book` attribute is not part of the public API, usage can give unexpected or corrupted results and will be removed in a future version
writer.book = book
Traceback (most recent call last):
File "...\StackOverflow\answer.py", line 24, in <module>
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
AttributeError: can't set attribute 'sheets'
The AttributeError is because sheets is a property of the writer instance. If you're unfamiliar with it, here is a resource.
In shorter terms, the exception is raised because sheets cannot be modified in the way you're trying. However, you can do this:
# use the `.update` method
writer.sheets.update(dict((ws.title, ws) for ws in book.worksheets))
That will move us past the the AttributeError, but we'll hit a ValueError a couple lines down:
reader = pd.read_excel("Resultados.xlsx")
Traceback (most recent call last):
File "...\StackOverflow\answer.py", line 26, in <module>
reader = pd.read_excel("Resultados.xlsx")
...
File "...\lib\site-packages\pandas\io\excel\_base.py", line 1656, in __init__
raise ValueError(
ValueError: Excel file format cannot be determined, you must specify an engine manually.
Do what the error message says and supply an argument to the engine parameter
reader = pd.read_excel("Resultados.xlsx", engine="openpyxl")
And now we're back to your original zipfile.BadZipFile exception
Traceback (most recent call last):
File "...\StackOverflow\answer.py", line 26, in <module>
reader = pd.read_excel("Resultados.xlsx", engine="openpyxl")
...
File "...\Local\Programs\Python\Python310\lib\zipfile.py", line 1334, in _RealGetContents
raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
After a bit of toying, I noticed that the Resultados.xlsx file could not be opened manually after running this line:
writer = pd.ExcelWriter("Resultados.xlsx", engine="openpyxl")
So I reordered some of the steps in your code:
# run before initializing the ExcelWriter
reader = pd.read_excel("Resultados.xlsx", engine="openpyxl")
book = load_workbook("Resultados.xlsx")
# the old way
# writer = pd.ExcelWriter("Resultados.xlsx", engine="openpyxl")
with pd.ExcelWriter("Resultados.xlsx", engine="openpyxl") as writer:
writer.book = book
writer.sheets.update(dict((ws.title, ws) for ws in book.worksheets))
df.to_excel(writer, index=False, header=False, startrow=len(reader)+1)
try this:
filepath = r'Resultados.xlsx'
with pd.ExcelWriter(
filepath,
engine='openpyxl',
mode='a',
if_sheet_exists='overlay') as writer:
reader = pd.read_excel(filepath)
df.to_excel(
writer,
startrow=reader.shape[0] + 1,
index=False,
header=False)
I'm doing some simple conditional formatting using xlsxwriter but I am getting this error when I run the code below.
AttributeError: 'Workbook' object has no attribute 'add_format'
I have updated xlsxwriter and looked at a lot of questions on SO and documentation but nothing has worked yet.
This is my code:
workbook = load_workbook(input_excel_filename)
writer = pd.ExcelWriter(input_excel_filename, engine="xlsxwriter")
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
trends_sheet = writer.sheets["Trends"]
slight_increase = writer.book.add_format({"bg_color":"#d3e6d5"})
trends_sheet.conditional_format("E:E", {"type":"cell", "criteria":"==", "value":"Slight Increase", "format":slight_increase})
Check if xlsxwriter package is installed or not....even I faced the same issue..resolved it after installing the package...same answer goes for any attribute error issue related to workbook/writer if your code is correct
Cause and solution
Makesure variable is usable
such as mine
first: workbook = writer.book
then: header_format = workbook.add_format(
Makesure already set pandas's engine (here using xlsxwriter)
when init ExcelWriter, set your engine
writer = pd.ExcelWriter(outputFile, engine='xlsxwriter’, options={'strings_to_urls': False} )
Makesure already installed related lib (xlsxwriter)
pip install xlsxwriter
or mine: pipenv install xlsxwriter
Full code for refer
import pandas as pd
writer = pd.ExcelWriter(
output_final_total_file,
engine='xlsxwriter',
options={'strings_to_urls': False}
)
...
df = pd.read_csv(outputExcelFile, sep=pandas_sep)
...
df.to_excel(outputExcelFile.replace('.csv', '.xlsx'), index=False)
...
df.to_excel(writer, sheet_name=SheetNamePay, startrow=1, header=False, index=False)
...
workbook = writer.book
header_format = workbook.add_format( # !!! here workable, no error
{
'bold': True,
'text_wrap': True,
# 'valign': 'top',
'valign': 'center',
# 'fg_color': '#D7E4BC',
'bg_color': '#edbd93',
'border': 1
}
)
Part of the problem was I needed to set writer.book explicitly. So add the line writer.book = workbook after defining writer. Also adding engine="openpyxl" to the ExcelWriter got rid of a subsequent error. Altogether this seems to work:
workbook = load_workbook(input_excel_filename)
writer = pd.ExcelWriter(input_excel_filename, engine="openpyxl")
writer.book = workbook
writer.sheets = dict((ws.title, ws) for ws in wb.worksheets)
data.to_excel(writer, sheet_name="Data", index=False)
writer.save()
writer.close()
I couldn't get it to work with conditional formatting but setting formatting in the Excel spreadsheet directly actually seems to work, because even if the data is rewritten the formatting stays intact.
I have dataframe (df) that is added to an existing excel file as a new tab ('Print'). I am having difficulties adjusting the column width any ideas?
Code
book = load_workbook('file.xlsx')
writer = pd.ExcelWriter('file.xlsx', engine = 'openpyxl')
writer.book = book
df.to_excel(writer,sheet_name = 'Print')
worksheet = writer.sheets['Print']
worksheet.set_column('B:B', 40) #This does not work
writer = pd.ExcelWriter('file.xlsx', engine='openpyxl')
writer.book = book
df.to_excel(writer, sheet_name='Print')
sheet = book.get_sheet_by_name('Print')
sheet.column_dimensions['B'].width = 40
writer.save()
I've never used nor done this, but I just googled for "worksheet.set_column" and found this:
https://xlsxwriter.readthedocs.io/worksheet.html
The syntax for the function is
set_column(first_col, last_col, width, cell_format, options)
So I'd say the answer is:
worksheet.set_column(2, 2, 40)
import pandas as pd
from openpyxl import load_workbook
book = load_workbook('test.xlsx')
writer = pd.ExcelWriter('test.xlsx')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
df = pd.DataFrame({'a':[1,3,5,7,4,5,6,4,7,8,9],
'b':[3,5,6,2,4,6,7,8,7,8,9]})
df.to_excel(writer, sheet_name='tab_name', index = False)
writer.save()
I get an error
AttributeError: 'Workbook' object has no attribute 'add_worksheet'
import pandas as pd
from openpyxl import load_workbook
fn = 'test.xlsx'
df = pd.read_excel(fn, header=None)
df2 = pd.DataFrame({'a':[1,3,5,7,4,5,6,4,7,8,9],
'b':[3,5,6,2,4,6,7,8,7,8,9]})
writer = pd.ExcelWriter(fn, engine='openpyxl')
book = load_workbook(fn)
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
df.to_excel(writer, sheet_name='tab_name', header=None, index=False)
df2.to_excel(writer, sheet_name='tab_name', header=None, index=False,
startcol=7,startrow=6)
writer.save()
I try to write to all files, that I have at the same time.
I have some files
izzymonroe#mail.ru.xlsx,
lucky-frog#mail.ru.xlsx,
lucky-frog#mail.ru.xlsx,
izzymonroe#mail.ru.xlsx,
Yubodrova#ya.ru.xlsx,
lucky-frog#mail.ru.xlsx,
Ant.karpoff2011#yandex.ru.xlsx
9rooney9#list.ru.xlsx
and I want to write data to this. But how can I send it to function(and I need to write to file value with groupby)
df = pd.read_excel('group.xlsx')
def add_xlsx_sheet(df, sheet_name=u'Смартфоны полно', index=True, digits=1, path='9rooney9#list.ru.xlsx'):
book = load_workbook(path)
writer = ExcelWriter('9rooney9#list.ru.xlsx', engine='openpyxl')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
if sheet_name in list(writer.sheets.keys()):
sh = book.get_sheet_by_name(sheet_name)
book.remove_sheet(sh)
df.to_excel(writer, sheet_name=u'Смартфоны полно', startrow=0, startcol=0,
float_format='%.{}f'.format(digits), index=index)
writer.save()
It works to one file, but it write all data to this file. But I need to write group, where id in mail complies the name of file
How can I specify all file in function and next
df.groupby('member_id').apply(lambda g: g.to_excel(str(g.name) + '.xlsx', 'sheet2'))
The problem was solved with df.groupby('col_name').apply(lambda x: add_xlsx_sheet(x, x.name, path='{}.xlsx'.format(x.name)))