I am building automation for Excel a multi-tabbed excel document. When I try to close the document I get the error below (full traceback, minus the personal details at the top), which then is corrupted and I cannot open the xlsx document. Unfortunately I haven't found any clues to go off of. I am using xlsxwriter functions to set row and column formatting, from what I've found this could be an issue but I haven't been able to track it down. Any thoughts on possible solutions?
writer.close()
File "/opt/homebrew/lib/python3.10/site-packages/pandas/io/excel/_base.py", line 1480, in close
self._save()
File "/opt/homebrew/lib/python3.10/site-packages/pandas/io/excel/_xlsxwriter.py", line 244, in _save
self.book.close()
File "/opt/homebrew/lib/python3.10/site-packages/xlsxwriter/workbook.py", line 324, in close
self._store_workbook()
File "/opt/homebrew/lib/python3.10/site-packages/xlsxwriter/workbook.py", line 709, in _store_workbook
xml_files = packager._create_package()
File "/opt/homebrew/lib/python3.10/site-packages/xlsxwriter/packager.py", line 137, in _create_package
self._write_worksheet_files()
File "/opt/homebrew/lib/python3.10/site-packages/xlsxwriter/packager.py", line 193, in _write_worksheet_files
worksheet._assemble_xml_file()
File "/opt/homebrew/lib/python3.10/site-packages/xlsxwriter/worksheet.py", line 4221, in _assemble_xml_file
self._write_cols()
File "/opt/homebrew/lib/python3.10/site-packages/xlsxwriter/worksheet.py", line 5807, in _write_cols
self._write_col_info(self.colinfo[col])
File "/opt/homebrew/lib/python3.10/site-packages/xlsxwriter/worksheet.py", line 5836, in _write_col_info
if width > 0:
TypeError: '>' not supported between instances of 'Format' and 'int'
Related
I have a folder with many xlsx files that I'd like to convert to csv files.
During my research, if found several threads about this topic, such as this or that one. Based on this, I formulated the following code using glob and pandas:
import glob
import pandas as pd
path = r'/Users/.../xlsx files'
excel_files = glob.glob(path + '/*.xlsx')
for excel in excel_files:
out = excel.split('.')[0]+'.csv'
df = pd.read_excel(excel) # error occurs here
df.to_csv(out)
But unfortunately, I got the following error message that I could not interpret in this context and I could not figure out how to solve this problem:
Traceback (most recent call last):
File "<input>", line 11, in <module>
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/util/_decorators.py", line 299, in wrapper
return func(*args, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/io/excel/_base.py", line 336, in read_excel
io = ExcelFile(io, storage_options=storage_options, engine=engine)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/io/excel/_base.py", line 1131, in __init__
self._reader = self._engines[engine](self._io, storage_options=storage_options)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/io/excel/_openpyxl.py", line 475, in __init__
super().__init__(filepath_or_buffer, storage_options=storage_options)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/io/excel/_base.py", line 391, in __init__
self.book = self.load_workbook(self.handles.handle)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/io/excel/_openpyxl.py", line 486, in load_workbook
return load_workbook(
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/openpyxl/reader/excel.py", line 317, in load_workbook
reader.read()
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/openpyxl/reader/excel.py", line 281, in read
apply_stylesheet(self.archive, self.wb)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/openpyxl/styles/stylesheet.py", line 198, in apply_stylesheet
stylesheet = Stylesheet.from_tree(node)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/openpyxl/styles/stylesheet.py", line 103, in from_tree
return super(Stylesheet, cls).from_tree(node)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/openpyxl/descriptors/serialisable.py", line 87, in from_tree
obj = desc.expected_type.from_tree(el)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/openpyxl/descriptors/serialisable.py", line 87, in from_tree
obj = desc.expected_type.from_tree(el)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/openpyxl/descriptors/serialisable.py", line 103, in from_tree
return cls(**attrib)
TypeError: __init__() got an unexpected keyword argument 'xfid'
Does anyone know how to fix this? Thanks a lot for your help!
I had the same problem here. After some hours thinking and searching I realized the problem is, actually, the file. I opened it using MS Excel, and save. Alakazan, problem solved.
The file was downloaded, so i think it's a "security" error or just an error from how the file was created. xD
EDIT:
It's not a security problem, but actually an error from the generation of file. The correct has the double of kb the wrong file.
An solution is: if using xlrd==1.2.0 the file can be opened, you can, after doing this, call read_excel to the Book(file opened by xlrd).
import xlrd
# df = pd.read_excel('TabelaPrecos.xlsx')
# The line above is the same result
a = xlrd.open_workbook('TabelaPrecos.xlsx')
b = pd.read_excel(a)
There is an xlsm file which I need to open and edit and extract data as per the macros enabled in the sheet.
But I'm not able to open the file itself.
I tried:
wb = openpyxl.load_workbook("workbook.xlsm",read_only=False,keep_vba=True)
Error Occuerred:
Traceback (most recent call last):
File "C:/Users/Downloads/Projects/Project1/Trial4.py", line 7, in <module>
wb = openpyxl.load_workbook("workbook.xlsm",read_only=False,keep_vba=True)
File "C:\Users\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\openpyxl\reader\excel.py", line 317, in load_workbook
reader.read()
File "C:\Users\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\openpyxl\reader\excel.py", line 282, in read
self.read_worksheets()
File "C:\Users\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\openpyxl\reader\excel.py", line 268, in read_worksheets
pivot = TableDefinition.from_tree(tree)
File "C:\Users\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\openpyxl\descriptors\serialisable.py", line 83, in from_tree
obj = desc.from_tree(el)
File "C:\Users\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\openpyxl\descriptors\sequence.py", line 85, in from_tree
return [self.expected_type.from_tree(el) for el in node]
File "C:\Users\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\openpyxl\descriptors\sequence.py", line 85, in <listcomp>
return [self.expected_type.from_tree(el) for el in node]
File "C:\Users\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\openpyxl\descriptors\serialisable.py", line 103, in from_tree
return cls(**attrib)
File "C:\Users\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\openpyxl\pivot\table.py", line 481, in __init__
self.scope = scope
File "C:\Users\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\openpyxl\descriptors\base.py", line 128, in __set__
raise ValueError(self.__doc__)
ValueError: Value must be one of {'selection', 'data', 'field'}
Can somebody provide a solution to this problem. TIA
I had the same problem not finding a solution on internet, so I deconstructed openpyxl and I finally found.
it would appear that a piece of code is missing to manage the "ConditionalFormat", so I removed it all from the source excel.
and that seems to have solved the problem.
In addition to Romain's answer, this issue seems to come from when conditional formatting is applied to specific cells that are part of a pivot table.
Ex. Have a pivot table with values from $A$2:$B$5. Setting a rule specifically on $A$2:$B$5 will cause an error, but if you set your conditional formatting rule to be applied to the more general 'All cells showing [values], for [Row Labels] and [Column Labels]' it won't throw an error.
I try to read a xlsx into a data frame:
itut_ir = pd.read_excel('C:\\Users\\Administrator\\Downloads\\reportdata.xlsx')
print(itut_ir.to_string())
I receive this:
Traceback (most recent call last):
File
"C:\Users\Administrator\eclipse-workspace\Reports\GOW\Report.py",
line 44, in
df = pd.read_excel('C:\Users\Administrator\Downloads\reportdata.xlsx')
File
"C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\excel_base.py",
line 304, in read_excel
io = ExcelFile(io, engine=engine) File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\excel_base.py",
line 824, in init
self._reader = self.enginesengine File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\excel_xlrd.py",
line 21, in init
super().init(filepath_or_buffer) File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\excel_base.py",
line 353, in init
self.book = self.load_workbook(filepath_or_buffer) File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\excel_xlrd.py",
line 36, in load_workbook
return open_workbook(filepath_or_buffer) File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\xlrd_init.py",
line 117, in open_workbook
zf = zipfile.ZipFile(filename) File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\zipfile.py",
line 1222, in init
self._RealGetContents() File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\zipfile.py",
line 1289, in _RealGetContents
raise BadZipFile("File is not a zip file") zipfile.BadZipFile: File is not a zip file
does anybody have an idea? the file does not seem to be broken, I can open it with Excel.
thanks!
*** UPDATE ***
the file producing the error is being downloaded from FTP. opening the original file works ... if that gives you a hint :) thanks
I had the same issue just a little bit ago with an XLSX that I created in LibreOffice.
The solution was to check the XLSX to make sure it wasn't corrupted. In my case, loading a previous version of the XLSX file corrected the problem.
I'm trying to merge multiple .xls files into a single workbook, where each file is inserted into a sheet, named with the .xls filename.
While surfing on web, I've seen the documentation of Pyexcel and a specific module which, as written here, could do the job easly.
Here's the code.
from pyexcel.cookbook import merge_all_to_a_book
import glob
merge_all_to_a_book(glob.glob("Dir\*.xls"),"output.xls")
As expected, it doesn't work. Here's the console output.
File "..\Desktop\scripts\provaimport.py", line 48, in <module>
merge_all_to_a_book(glob.glob("C:\Users\Tesisti\Desktop\forpythonscript\*.xls"),"output.xls")
File "C:\Users\Tesisti\Anaconda2\lib\site-packages\pyexcel\cookbook.py", line 148, in merge_all_to_a_book
merged.save_as(outfilename)
File "C:\Users\Tesisti\Anaconda2\lib\site-packages\pyexcel\internal\meta.py", line 339, in save_as
return save_book(self, file_name=filename, **keywords)
File "C:\Users\Tesisti\Anaconda2\lib\site-packages\pyexcel\internal\core.py", line 51, in save_book
return _save_any(a_source, book)
File "C:\Users\Tesisti\Anaconda2\lib\site-packages\pyexcel\internal\core.py", line 55, in _save_any
a_source.write_data(instance)
File "C:\Users\Tesisti\Anaconda2\lib\site-packages\pyexcel\plugins\sources\file_output.py", line 38, in write_data
**self._keywords)
File "C:\Users\Tesisti\Anaconda2\lib\site-packages\pyexcel\plugins\renderers\excel.py", line 30, in render_book_to_file
save_data(file_name, book.to_dict(), **keywords)
File "C:\Users\Tesisti\Anaconda2\lib\site-packages\pyexcel_io\io.py", line 119, in save_data
**keywords)
File "C:\Users\Tesisti\Anaconda2\lib\site-packages\pyexcel_io\io.py", line 141, in store_data
writer.write(data)
File "C:\Users\Tesisti\Anaconda2\lib\site-packages\pyexcel_io\book.py", line 58, in __exit__
self.close()
File "C:\Users\Tesisti\Anaconda2\lib\site-packages\pyexcel_xls\xlsw.py", line 86, in close
self.work_book.save(self._file_alike_object)
File "C:\Users\Tesisti\Anaconda2\lib\site-packages\xlwt\Workbook.py", line 710, in save
doc.save(filename_or_stream, self.get_biff_data())
File "C:\Users\Tesisti\Anaconda2\lib\site-packages\xlwt\Workbook.py", line 680, in get_biff_data
self.__worksheets[self.__active_sheet].selected = True
Any idea on how to fix?
It seems to me that glob.glob("Dir*.xls") returned an empty list of files. Hence pyexcel's plugin pyexcel-xls fails to create an empty file.
The current solution, I would recommend is to take the latest pyexcel-xls and use try-except statement around merge_all_to_a_book, catching empty file case.
I've been using pandas for a while and I think it is a great tool. I made a program to generate some excel files from some data collected by the user. The final user have been testing and using it for 6 months; it never failled till yesterday, when it generated a dagamaged excel file. When I opened it with a text editor, it was totally blank. The code to generate this file is this:
escritor = pandas.ExcelWriter(direccion, engine='xlsxwriter')
listaTotal.to_excel(escritor, index = False)
escritor.save()
and:
escritor = pandas.ExcelWriter(direccion + '.xlsx', engine='xlsxwriter')
self.listaFact.to_excel(escritor, index = False, startrow = 1, startcol = 0, sheet_name = 'Hoja1')
escritor.save()
The second code fragment also uses some format options for the 'xlsxwriter', an example here:
format = workbook.add_format()
format.set_font_size(9)
format.set_font_name('Sans Serif 12cpi')
format.set_border()
format.set_text_wrap()
This error happened twice; about 1 month ago and yesterday. I can't duplicate the error, I don't know what happened. And also the traceback is here, it shows the problem when the program reads the file, but this file was generated by the code posted before:
Exception in Tkinter callback
Traceback (most recent call last):
File "C:\Python27\lib\lib-tk\Tkinter.py", line 1532, in __call__
return self.func(*args)
File "C:\Users\WINNER\Documents\Visual Studio 2013\Projects\PythonApplication4\PythonApplication4\PythonApplication4.py", line 792, in botonGenerarPedido
self.generarPedido()
File "C:\Users\WINNER\Documents\Visual Studio 2013\Projects\PythonApplication4\PythonApplication4\PythonApplication4.py", line 904, in generarPedido
self.generarVentasDia()
File "C:\Users\WINNER\Documents\Visual Studio 2013\Projects\PythonApplication4\PythonApplication4\PythonApplication4.py", line 927, in generarVentasDia
listaTotal = pandas.io.excel.read_excel(direccion)
File "C:\Python27\lib\site-packages\pandas\io\excel.py", line 151, in read_excel
return ExcelFile(io, engine=engine).parse(sheetname=sheetname, **kwds)
File "C:\Python27\lib\site-packages\pandas\io\excel.py", line 188, in __init__
self.book = xlrd.open_workbook(io)
File "C:\Python27\lib\site-packages\xlrd\__init__.py", line 435, in open_workbook
ragged_rows=ragged_rows,
File "C:\Python27\lib\site-packages\xlrd\book.py", line 91, in open_workbook_xls
biff_version = bk.getbof(XL_WORKBOOK_GLOBALS)
File "C:\Python27\lib\site-packages\xlrd\book.py", line 1230, in getbof
bof_error('Expected BOF record; found %r' % self.mem[savpos:savpos+8])
File "C:\Python27\lib\site-packages\xlrd\book.py", line 1224, in bof_error
raise XLRDError('Unsupported format, or corrupt file: ' + msg)
XLRDError: Unsupported format, or corrupt file: Expected BOF record; found '\x00\x00\x00\x00\x00\x00\x00\x00'