This code used to read an xlsx file and write over it, but after updating from pandas 1.1.5 to 1.5.1 I get zipfile.BadZipFile: File is not a zip file.
Then I read here that after pandas 1.2.0 the pd.ExcelWriter(report_path, engine='openpyxl') creates a new file but as this is a completely empty file, openpyxl cannot load it.
Knowing that, I changed the code to this one, but now I'm getting AttributeError: property 'sheets' of 'OpenpyxlWriter' object has no setter. How should I handle this?
book = load_workbook('Resultados.xlsx')
writer = pd.ExcelWriter('Resultados.xlsx', engine='openpyxl')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
reader = pd.read_excel(r'Resultados.xlsx')
df = pd.DataFrame.from_dict(dict_)
df.to_excel(writer, index=False, header=False, startrow=len(reader) + 1)
writer.close()
TLDR
Use .update to modify writer.sheets
Rearrange the order of your script to get it working
# run before initializing the ExcelWriter
reader = pd.read_excel("Resultados.xlsx", engine="openpyxl")
book = load_workbook("Resultados.xlsx")
# use `with` to avoid other exceptions
with pd.ExcelWriter("Resultados.xlsx", engine="openpyxl") as writer:
    writer.book = book
    writer.sheets.update(dict((ws.title, ws) for ws in book.worksheets))
    df.to_excel(writer, index=False, header=False, startrow=len(reader)+1)
Details
Recreating your problem with some fake data
import numpy as np
from openpyxl import load_workbook
import pandas as pd
if __name__ == "__main__":
    # make some random data
    np.random.seed(0)
    df = pd.DataFrame(np.random.random(size=(5, 5)))
    # this makes an existing file
    with pd.ExcelWriter("Resultados.xlsx", engine="openpyxl") as writer:
        df.to_excel(excel_writer=writer)
    # make new random data
    np.random.seed(1)
    df = pd.DataFrame(np.random.random(size=(5, 5)))
    # what you tried...
    book = load_workbook("Resultados.xlsx")
    writer = pd.ExcelWriter("Resultados.xlsx", engine="openpyxl")
    writer.book = book
    writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
    reader = pd.read_excel("Resultados.xlsx")
    # skipping this step as we defined `df` differently
    # df = pd.DataFrame.from_dict(dict_)
    df.to_excel(writer, index=False, header=False, startrow=len(reader)+1)
    writer.close()
We get the same error plus a FutureWarning
...\StackOverflow\answer.py:23: FutureWarning: Setting the `book` attribute is not part of the public API, usage can give unexpected or corrupted results and will be removed in a future version
writer.book = book
Traceback (most recent call last):
File "...\StackOverflow\answer.py", line 24, in <module>
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
AttributeError: can't set attribute 'sheets'
The AttributeError is because sheets is a property of the writer instance. If you're unfamiliar with it, here is a resource.
In shorter terms, the exception is raised because sheets cannot be modified in the way you're trying. However, you can do this:
# use the `.update` method
writer.sheets.update(dict((ws.title, ws) for ws in book.worksheets))
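To make the property behaviour concrete, here is a minimal plain-Python sketch. The `Writer` class is a hypothetical stand-in, not pandas' actual implementation, and whether pandas hands back its internal dict or a fresh one may differ between versions. It only shows why assignment fails while calling a method on the returned dict does not.

```python
class Writer:
    """Hypothetical stand-in for a writer whose `sheets` is a read-only property."""

    def __init__(self):
        self._sheets = {}

    @property
    def sheets(self):
        # No setter is defined, so `writer.sheets = ...` raises AttributeError.
        return self._sheets


writer = Writer()

try:
    writer.sheets = {"Sheet1": "worksheet"}
    assignment_failed = False
except AttributeError:
    assignment_failed = True

# Calling a method on the dict that the property returns is still allowed.
writer.sheets.update({"Sheet1": "worksheet"})
```

Assignment goes through the property machinery and is rejected, whereas `.update` only reads the property and then mutates the object it returned.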
That will move us past the AttributeError, but we'll hit a ValueError a couple of lines down:
reader = pd.read_excel("Resultados.xlsx")
Traceback (most recent call last):
File "...\StackOverflow\answer.py", line 26, in <module>
reader = pd.read_excel("Resultados.xlsx")
...
File "...\lib\site-packages\pandas\io\excel\_base.py", line 1656, in __init__
raise ValueError(
ValueError: Excel file format cannot be determined, you must specify an engine manually.
Do what the error message says and supply an argument to the engine parameter
reader = pd.read_excel("Resultados.xlsx", engine="openpyxl")
And now we're back to your original zipfile.BadZipFile exception
Traceback (most recent call last):
File "...\StackOverflow\answer.py", line 26, in <module>
reader = pd.read_excel("Resultados.xlsx", engine="openpyxl")
...
File "...\Local\Programs\Python\Python310\lib\zipfile.py", line 1334, in _RealGetContents
raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
After a bit of toying, I noticed that the Resultados.xlsx file could not be opened manually after running this line:
writer = pd.ExcelWriter("Resultados.xlsx", engine="openpyxl")
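The same failure can be reproduced without pandas, on the assumption that creating the writer leaves a zero-byte file on disk. An .xlsx file is a zip archive, so the sketch below creates an empty file (the name `empty.xlsx` is only illustrative) and shows that `zipfile` rejects it with the message seen in the traceback.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    # A zero-byte file with an .xlsx name, mimicking a freshly truncated workbook.
    path = os.path.join(tmp, "empty.xlsx")
    open(path, "wb").close()

    looks_like_zip = zipfile.is_zipfile(path)

    try:
        zipfile.ZipFile(path)
        error_message = None
    except zipfile.BadZipFile as exc:
        error_message = str(exc)
```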
So I reordered some of the steps in your code:
# run before initializing the ExcelWriter
reader = pd.read_excel("Resultados.xlsx", engine="openpyxl")
book = load_workbook("Resultados.xlsx")
# the old way
# writer = pd.ExcelWriter("Resultados.xlsx", engine="openpyxl")
with pd.ExcelWriter("Resultados.xlsx", engine="openpyxl") as writer:
    writer.book = book
    writer.sheets.update(dict((ws.title, ws) for ws in book.worksheets))
    df.to_excel(writer, index=False, header=False, startrow=len(reader)+1)
try this:
filepath = r'Resultados.xlsx'
with pd.ExcelWriter(
        filepath,
        engine='openpyxl',
        mode='a',
        if_sheet_exists='overlay') as writer:
    reader = pd.read_excel(filepath)
    df.to_excel(
        writer,
        startrow=reader.shape[0] + 1,
        index=False,
        header=False)
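For a self-contained version of the append-mode approach, the sketch below builds a small workbook in a temporary directory and appends one row to it. It assumes pandas 1.4 or later (for `if_sheet_exists='overlay'`) with openpyxl installed. The helper name `append_rows` and the demo data are made up for illustration, and the start row is taken from the worksheet's `max_row` instead of re-reading the file.

```python
import os
import tempfile

import pandas as pd


def append_rows(path, df, sheet_name="Sheet1"):
    """Hypothetical helper: write df below the last used row of an existing sheet."""
    with pd.ExcelWriter(
        path, engine="openpyxl", mode="a", if_sheet_exists="overlay"
    ) as writer:
        # max_row is 1-based and startrow is 0-based, so this is the first free row.
        startrow = writer.sheets[sheet_name].max_row
        df.to_excel(
            writer, sheet_name=sheet_name, startrow=startrow, index=False, header=False
        )


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.xlsx")
    # Create an existing workbook with a header row and two data rows.
    pd.DataFrame({"a": [1, 2], "b": [3, 4]}).to_excel(path, index=False)
    # Append one more row without rewriting the header.
    append_rows(path, pd.DataFrame({"a": [5], "b": [6]}))
    result = pd.read_excel(path)
```

Because the writer is opened with `mode='a'`, the existing workbook is loaded rather than replaced, so there is no need to assign `writer.book` or `writer.sheets` by hand.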
Related
I want to create a Python script that samples CPU% every 5 seconds and writes it to an Excel file. I have managed to run the script once, and its output in Excel is below. How do I repeat it every 5 seconds and insert just the value into Excel, without the header name? Please help, I have just started learning Python.
Output: [screenshot of the Excel sheet]
import pandas as pd
from pandas import ExcelWriter
from pandas import ExcelFile
import numpy as np
import psutil
CPU = psutil.cpu_percent(interval=1)
df = pd.DataFrame({'CPU': [CPU]})
writer = ExcelWriter(r'C:\Users\kumardha\Desktop\DK_TEST\Pandas3.xlsx')
df.to_excel(writer,'Sheet1',index=False)
writer.save()
I'm assuming this is what you expected:
import pandas as pd
import numpy as np
import psutil
import time
from openpyxl import load_workbook
def append_df_to_excel(filename, df, sheet_name='Sheet1', startrow=None,
                       truncate_sheet=False,
                       **to_excel_kwargs):
    # ignore [engine] parameter if it was passed
    if 'engine' in to_excel_kwargs:
        to_excel_kwargs.pop('engine')
    writer = pd.ExcelWriter(filename, engine='openpyxl')
    # Python 2.x: define [FileNotFoundError] exception if it doesn't exist
    try:
        FileNotFoundError
    except NameError:
        FileNotFoundError = IOError
    try:
        # try to open an existing workbook
        writer.book = load_workbook(filename)
        # get the last row in the existing Excel sheet
        # if it was not specified explicitly
        if startrow is None and sheet_name in writer.book.sheetnames:
            startrow = writer.book[sheet_name].max_row
        # truncate sheet
        if truncate_sheet and sheet_name in writer.book.sheetnames:
            # index of [sheet_name] sheet
            idx = writer.book.sheetnames.index(sheet_name)
            # remove [sheet_name]
            writer.book.remove(writer.book.worksheets[idx])
            # create an empty sheet [sheet_name] using old index
            writer.book.create_sheet(sheet_name, idx)
        # copy existing sheets
        writer.sheets = {ws.title:ws for ws in writer.book.worksheets}
    except FileNotFoundError:
        pass
    if startrow is None:
        startrow = 0
    df.to_excel(writer, sheet_name, startrow=startrow,**to_excel_kwargs)
    # save the workbook
    writer.save()
def repeat(seconds,filename):
    first_time=True
    while True:
        CPU = psutil.cpu_percent(interval=1)
        df = pd.DataFrame({'CPU': [CPU]})
        s = str(CPU)
        b = print(s +' is current cpu at time '+time.ctime())
        if first_time:
            append_df_to_excel(filename,df,sheet_name='Sheet1',index=False)
            first_time=False
        else:
            append_df_to_excel(filename,df,sheet_name='Sheet1',header=False,index=False)
        time.sleep(seconds)
filename='path to filename'
repeat('delay you want in seconds',filename)
You can use subprocess.check_output to capture the output of running your script. I've used this before with Discord bots. I recommend you read this post: Running shell command and capturing the output
subprocess.check_output()
Good Luck
I'm doing some simple conditional formatting using xlsxwriter but I am getting this error when I run the code below.
AttributeError: 'Workbook' object has no attribute 'add_format'
I have updated xlsxwriter and looked at a lot of questions on SO and documentation but nothing has worked yet.
This is my code:
workbook = load_workbook(input_excel_filename)
writer = pd.ExcelWriter(input_excel_filename, engine="xlsxwriter")
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
trends_sheet = writer.sheets["Trends"]
slight_increase = writer.book.add_format({"bg_color":"#d3e6d5"})
trends_sheet.conditional_format("E:E", {"type":"cell", "criteria":"==", "value":"Slight Increase", "format":slight_increase})
Check whether the xlsxwriter package is installed. I faced the same issue and resolved it by installing the package. The same applies to any AttributeError related to the workbook or writer when your code is otherwise correct.
Cause and solution
Make sure the variable is usable
For example, in my code
first: workbook = writer.book
then: header_format = workbook.add_format(
Make sure the pandas engine is set (here, xlsxwriter)
When initializing ExcelWriter, set your engine
writer = pd.ExcelWriter(outputFile, engine='xlsxwriter', options={'strings_to_urls': False} )
Make sure the related library (xlsxwriter) is installed
pip install xlsxwriter
or, in my case: pipenv install xlsxwriter
Full code for reference
import pandas as pd
writer = pd.ExcelWriter(
    output_final_total_file,
    engine='xlsxwriter',
    options={'strings_to_urls': False}
)
...
df = pd.read_csv(outputExcelFile, sep=pandas_sep)
...
df.to_excel(outputExcelFile.replace('.csv', '.xlsx'), index=False)
...
df.to_excel(writer, sheet_name=SheetNamePay, startrow=1, header=False, index=False)
...
workbook = writer.book
header_format = workbook.add_format(  # !!! here workable, no error
    {
        'bold': True,
        'text_wrap': True,
        # 'valign': 'top',
        'valign': 'center',
        # 'fg_color': '#D7E4BC',
        'bg_color': '#edbd93',
        'border': 1
    }
)
Part of the problem was I needed to set writer.book explicitly. So add the line writer.book = workbook after defining writer. Also adding engine="openpyxl" to the ExcelWriter got rid of a subsequent error. Altogether this seems to work:
workbook = load_workbook(input_excel_filename)
writer = pd.ExcelWriter(input_excel_filename, engine="openpyxl")
writer.book = workbook
writer.sheets = dict((ws.title, ws) for ws in workbook.worksheets)
data.to_excel(writer, sheet_name="Data", index=False)
writer.save()
writer.close()
I couldn't get it to work with conditional formatting but setting formatting in the Excel spreadsheet directly actually seems to work, because even if the data is rewritten the formatting stays intact.
I am trying to write a pandas data frame to an existing excel sheet on a new tab, but it gives me the following error:
AttributeError: 'NoneType' object has no attribute 'read'.
I've determined this is because pandas to_excel returns a NoneType object, which isn't allowing me to save the file with writer.save(). Does anyone know a workaround for this?
path = 'summary.xlsx'
book = load_workbook(path)
writer = pd.ExcelWriter(path, engine='openpyxl')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
df.to_excel(writer, sheet_name="results")
writer.save()
I had exactly the same issue.
I managed to work around it by removing the value in legacy_drawing from each sheet in the workbook.
path = 'summary.xlsx'
book = load_workbook(path)
writer = pd.ExcelWriter(path, engine='openpyxl')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
for s in list(writer.sheets.keys()):
    writer.sheets[s].legacy_drawing = None
df.to_excel(writer, sheet_name="results")
writer.save()
I have an error when I'm trying to save an excel file with another name:
This is part of my code:
precios_read = pd.read_excel('Precios_{}.xls'.format(auth2), sheet_name='Precios')
precios_read = precios_read.sort_values(by=['Espacio'], ascending=True)
book = load_workbook('Template_sugerencia.xlsx')
writer = pd.ExcelWriter('Template_sugerencia.xlsx', engine='openpyxl')
writer.book = book
precios_read.to_excel(writer, sheet_name='template', startcol=12, startrow=5, index=False, merge_cells = True)
Recom.to_excel(writer, sheet_name='template', startcol=0, startrow=5, index=False, merge_cells = True)
cliente = auth + '_' + ids
writer.save('{}.xls'.format(cliente))
The problem is in the last line: writer.save('{}.xls'.format(cliente)). If I call writer.save() with no arguments, everything is fine and the file is saved, but if I pass the file name I want, it fails with:
TypeError: save() takes exactly 1 argument (2 given)
ExcelWriter only takes in the filename on create, e.g.:
writer = pd.ExcelWriter('Template_sugerencia.xlsx', engine='openpyxl')
writer.save has no arguments (the 1 argument is self). Calling it will save to the earlier specified filename.
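One way to end up with a differently named file, sketched under assumptions: copy the template to the target name first, then open the copy in append mode so the template itself stays untouched. This assumes pandas 1.4 or later with openpyxl. The file names and data are stand-ins created in a temporary directory, and the target uses an .xlsx extension because openpyxl writes the xlsx format.

```python
import os
import shutil
import tempfile

import pandas as pd
from openpyxl import load_workbook

with tempfile.TemporaryDirectory() as tmp:
    template = os.path.join(tmp, "template.xlsx")  # stand-in for the template file
    target = os.path.join(tmp, "client.xlsx")      # stand-in for the per-client name

    # Build a small template containing a sheet called "template".
    pd.DataFrame({"x": [1]}).to_excel(template, sheet_name="template", index=False)

    # Copy the template, then write into the copy instead of the original.
    shutil.copyfile(template, target)
    with pd.ExcelWriter(
        target, engine="openpyxl", mode="a", if_sheet_exists="overlay"
    ) as writer:
        pd.DataFrame({"y": [2]}).to_excel(
            writer, sheet_name="template", startcol=12, startrow=5, index=False
        )

    # startrow=5 and startcol=12 are 0-based: header in row 6, first value in row 7, column 13.
    target_cell = load_workbook(target)["template"].cell(row=7, column=13).value
    template_cell = load_workbook(template)["template"].cell(row=7, column=13).value
```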
I am trying to write to all of my files in one go.
I have these files:
izzymonroe#mail.ru.xlsx,
lucky-frog#mail.ru.xlsx,
lucky-frog#mail.ru.xlsx,
izzymonroe#mail.ru.xlsx,
Yubodrova#ya.ru.xlsx,
lucky-frog#mail.ru.xlsx,
Ant.karpoff2011#yandex.ru.xlsx
9rooney9#list.ru.xlsx
and I want to write data to them. But how can I pass each file to the function? (I also need to write the values grouped with groupby.)
df = pd.read_excel('group.xlsx')
def add_xlsx_sheet(df, sheet_name=u'Смартфоны полно', index=True, digits=1, path='9rooney9#list.ru.xlsx'):
    book = load_workbook(path)
    writer = ExcelWriter('9rooney9#list.ru.xlsx', engine='openpyxl')
    writer.book = book
    writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
    if sheet_name in list(writer.sheets.keys()):
        sh = book.get_sheet_by_name(sheet_name)
        book.remove_sheet(sh)
    df.to_excel(writer, sheet_name=u'Смартфоны полно', startrow=0, startcol=0,
                float_format='%.{}f'.format(digits), index=index)
    writer.save()
It works for one file, but it writes all the data to that file. I need to write each group to the file whose name matches the id (the email address).
How can I specify every file in the function and then use
df.groupby('member_id').apply(lambda g: g.to_excel(str(g.name) + '.xlsx', 'sheet2'))
The problem was solved with df.groupby('col_name').apply(lambda x: add_xlsx_sheet(x, x.name, path='{}.xlsx'.format(x.name)))
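As a rough illustration of the per-group idea, the sketch below loops over the groups and writes each one to a file named after its key. It is a simplification: it creates fresh files in a temporary directory instead of adding a sheet to existing workbooks, and the column values and sheet name are invented for the demo. It assumes pandas with openpyxl installed.

```python
import os
import tempfile

import pandas as pd

# Made-up data: each member_id corresponds to one output file.
df = pd.DataFrame(
    {
        "member_id": ["a#mail.ru", "a#mail.ru", "b#ya.ru"],
        "value": [1, 2, 3],
    }
)

with tempfile.TemporaryDirectory() as tmp:
    written = {}
    for member_id, group in df.groupby("member_id"):
        # The group key becomes the file name, so each file only gets its own rows.
        path = os.path.join(tmp, f"{member_id}.xlsx")
        group.to_excel(path, sheet_name="sheet2", index=False)
        written[member_id] = len(pd.read_excel(path, sheet_name="sheet2"))
```

A plain loop is used here instead of `apply` because the goal is a side effect (writing files), not a transformed result.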