Pandas / xlsxwriter writer.close() does not completely close the excel file - python

I'm trying to manually modify an Excel file after creating it with a Python script. Unfortunately, if the script is still running, a sharing violation error appears when I try to save the file under the same name.
Everything runs smoothly in the code. The file is created, filled and saved. I can open it and work on it, but I can't overwrite it under the same name while the script is still running.
import pandas as pd
from tkinter import filedialog

outpath = filedialog.asksaveasfile(
    mode="wb",
    filetypes=[("Excel", ("*.xls", "*.xlsx"))],
    defaultextension=".xlsx",
)
writer = pd.ExcelWriter(outpath, engine="xlsxwriter")
df1.to_excel(writer, sheet_name="Results")
writer.save()
writer.close()
I expect Python to fully close the Excel file so that I can overwrite it while the script is still running.

I also had this issue.
When trying to save changes in Excel I got "Sharing violation".
I solved it by adding writer.handles = None after writer.close().
writer = pd.ExcelWriter(workPath+'file.xlsx', engine='xlsxwriter')
# Add all your sheets and formatting here
...
# Save and release handle
writer.close()
writer.handles = None

I also ran into this. I couldn't save the file in Excel because of a "Sharing violation" because python.exe still had a handle on the file.
The accepted answer, to just use df.to_excel(), is correct if all you want to do is save the Excel file. But if you want to do more, such as adding formatting to the Excel file first, you will have to use pd.ExcelWriter().
The key is though, as Exho commented, that you use the form:
with pd.ExcelWriter(outpath, engine="xlsxwriter") as writer:
    # do stuff here
You don't use writer.save() or writer.close(), which are synonyms for the same call anyway. Instead the file is saved and closed and handles are released as soon as you leave the with scope.
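One thing worth checking in the question's code: filedialog.asksaveasfile(mode="wb") returns a file object that is already open, not a path. A writer that is handed an open file object generally does not close it for you. The sketch below shows that behaviour with the standard library's zipfile as a stand-in for the Excel writer (an .xlsx file is a zip archive). That pandas treats the handle the same way is my assumption, and the file name demo.xlsx is made up.

```python
import os
import tempfile
import zipfile

path = os.path.join(tempfile.mkdtemp(), "demo.xlsx")

# Stand-in for filedialog.asksaveasfile(mode="wb"): an already-open file object
handle = open(path, "wb")

# Stand-in for the Excel writer: it writes into the handle it was given
with zipfile.ZipFile(handle, "w") as archive:
    archive.writestr("sheet.txt", "data")

# The writer is closed, but the file object we opened ourselves is not
open_after_writer_closed = not handle.closed

# Closing it explicitly releases the file
handle.close()
closed_after_explicit_close = handle.closed
```

If that is what happens in the question, calling outpath.close() after writer.close(), or asking the dialog for a path with filedialog.asksaveasfilename() and passing that string to pd.ExcelWriter, should release the file.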

Your code looks too complicated. You don't need to deal with the writer yourself; df.to_excel() can do it for you.
Just use the simpler code: df1.to_excel(outpath, sheet_name="Results", engine='xlsxwriter'), as suggested in the docs.

I was facing a similar situation. The suggestion given by alec_djinn didn't work for multiple sheets, which is what I was working with. So I just left out the .close() call and it worked fine.

Related

Pandas library writing corrupted xlsx files when using an ExcelWriter

I'm trying to write some data to an Excel spreadsheet. When I use the DataFrame.to_excel method with a file path instead of an ExcelWriter object as the first argument
(e.g. pd.DataFrame([1, 2, 3]).to_excel("test.xlsx")), it works fine, except that it rewrites the whole file every time. I want to append data, and I don't see an option in the documentation that lets me set something like an append mode. So I'm using an ExcelWriter object, because that seems to have an append mode if you initialise it as follows (documentation):
writer = ExcelWriter("test.xlsx", mode='a', if_sheet_exists="overlay").
Then, if I understand correctly, you should be able to pass that object into the to_excel function like this:
pd.DataFrame([1, 2, 3]).to_excel(writer) and it shouldn't rewrite the whole file.
But, when I use an ExcelWriter to create or modify the file, excel gives me the error:
"Excel cannot open the file 'test.xlsx' because the file format or file extension is not valid. Verify that the file has not been corrupted and that the file extension matches the format of the file"
I have tried initialising the ExcelWriter with only the first argument, writer = ExcelWriter("test.xlsx"), and that produces the same error when opening the file.
I think the writer is writing corrupted excel files, anyone know a fix?
Fixed it, I wasn't closing the XlsxWriter with
writer.close()
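A quick way to see why the missing close() matters: an .xlsx file is a zip archive, and the part of the archive that readers look for is only written when the archive is closed. The sketch below uses only the standard library's zipfile, not pandas itself, and the member name and content are placeholders. It checks the zip container only, not whether the contents form a valid workbook.

```python
import os
import tempfile
import zipfile

folder = tempfile.mkdtemp()
path = os.path.join(folder, "test.xlsx")

# Write a member but do not close the archive yet
archive = zipfile.ZipFile(path, "w")
archive.writestr("xl/workbook.xml", "<workbook/>")
valid_before_close = zipfile.is_zipfile(path)

# Closing writes the central directory that readers look for
archive.close()
valid_after_close = zipfile.is_zipfile(path)
```

The same check, zipfile.is_zipfile("test.xlsx"), can be run on a file that Excel refuses to open to see whether it was left unfinished.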

Question about error when saving xlsm file using xlwings

I want to open an xlsm file via xlwings and then edit it and save it. However, some problems arose.
If I run the code with no Excel file open, or with another Excel file open but not being edited, it works fine. However, if I have an Excel file open and am working in it (for example, I open a blank workbook and enter 'test' in cell A1) and then run the code, sometimes it works, but sometimes it becomes unresponsive at the third line (wb_xl = xw.Book(copy)). In that case the code never gets past the third line. What confuses me is that the code works fine in some cases.
I want to know how to make the code work in all cases.
And there is one more problem.
If this code is executed while working with another Excel, only wb_xl should be terminated. I don't want another Excel to be closed. I want to exit only wb_xl. However, when the app.quit() code is executed, all open Excels are closed. In this case, how can I close only the Excel(wb_xl) opened through the code without closing the working Excel?
import xlwings as xw
copy = 'C:/Users/ijung/Desktop/210919_Mk_Lot_test/210922_101test.xlsm'
wb_xl = xw.Book(copy) #sometimes no response in this line
ws_xl = wb_xl.sheets['Main']
app = xw.apps.active
ws_xl.range('A1').value = 'test'
wb_xl.save()
app.quit()
#wb_xl.app.kill()
#wb_xl.close()
I also used openpyxl. However, in this part of wb_open.save(copy), an error such as xml.etree.ElementTree.ParseError: mismatched tag: line 20, column 8 occurred. When I use xlsx, the save works fine, but when I use xlsm, an error occurs.
import openpyxl
wb_open = openpyxl.load_workbook(copy, read_only = False, keep_vba = True)
ws_open = wb_open.active
ws_open.cell(1,1).value = 'test'
wb_open.save(copy) #error
wb_open.close()
To sum up, the purpose of this code is to open the xlsm file even while I am working in another Excel file, edit and save it, and close only this xlsm file. However, trying multiple packages and searching multiple sites has not solved the problem. I'm under a lot of stress with this issue. Any help would be greatly appreciated.
Thanks in advance.
openpyxl does not work with xlsm files that contain form objects.
I think the problem is app.quit(): it closes the whole Excel instance. Just use wb_xl.close() instead.
import xlwings as xw
copy = 'C:/Users/ijung/Desktop/210919_Mk_Lot_test/210922_101test.xlsm'
wb_xl = xw.Book(copy) #sometimes no response in this line
ws_xl = wb_xl.sheets['Main']
#app = xw.apps.active # not needed
ws_xl.range('A1').value = 'test'
wb_xl.save()
wb_xl.close()
This should close only the workbook. Take a look at this post; it has interesting answers.

Save Excel file using openpyxl without losing data

Unfortunately, this command always creates a new file for me, and the previous data is lost:
Account.save("Ex.xlsx")
The SaveCopyAs command does not work with a workbook.
I would simply like to replicate the SaveCopyAs command in Python to save my Excel file after writing to and updating it. Unfortunately, the save command deletes all the previous content.
When you execute Example=Workbook(), you are making a new file. That means when you execute Example.save("Jungle.xlsx"), you are overwriting the original file. Instead, you should use Example = load_workbook('Jungle.xlsx') to read the contents of the original so that Example.save("Jungle.xlsx") can act like an update.
See https://openpyxl.readthedocs.io/en/stable/tutorial.html#loading-from-a-file for more details.
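A minimal sketch of the difference, assuming openpyxl is installed. The temporary folder and the row contents are made up for illustration; only Workbook(), load_workbook() and save() come from the answer above.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook

path = os.path.join(tempfile.mkdtemp(), "Jungle.xlsx")

# First run: create the file with one row
wb = Workbook()
wb.active.append(["first run"])
wb.save(path)

# Later run: load the existing file instead of calling Workbook() again
wb = load_workbook(path)
wb.active.append(["second run"])
wb.save(path)

# Both rows are still there because the second save started from the loaded content
rows = [row[0] for row in load_workbook(path).active.iter_rows(values_only=True)]
```

Had the second run started with Workbook() instead of load_workbook(path), the save would have replaced the file and only the second row would remain.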

openpyxl error raise ValueError('Min value is {0}'.format(self.min)) in opening heavy file with formatting

I'm trying to use openpyxl for the first time on a very heavy file: it is over 20,500 KB and has a lot of formatting and a VBA macro.
My code keeps returning the following error:
  File " \Anaconda3\lib\site-packages\openpyxl\styles\alignment.py", line 52, in __init__
    self.relativeIndent = relativeIndent
  File " \Anaconda3\lib\site-packages\openpyxl\descriptors\base.py", line 107, in __set__
    raise ValueError('Min value is {0}'.format(self.min))
ValueError: Min value is 0
Would anyone know what the problem is / how to access the file despite it? I'm trying to post data into an existent Excel file to simplify processes and replace a heavy VBA code. So I can't just post it into a different xlsx file and call it using a VBA code (that would defeat the purpose).
Thanks a lot!
Here is my code :
wb = load_workbook(filename='C:/dev/CodeRep/ProjectName/MainFile 2021_01.xlsm', read_only = False, keep_vba = True)
The traceback says that there is a problem with the Alignment definition in the workbook's stylesheet. openpyxl follows the OOXML specification very closely to minimise unpleasant surprises later, which is why it tends to raise exceptions or give warnings rather than let things pass.
For more details we'll need to see the XML source for the stylesheet, or the Alignments part at least. You can find this by unzipping the XLSM file and looking for the styles.xml file. That will give you more information and also allow you to submit a bug report to openpyxl.
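You don't have to unzip the file by hand to look at the stylesheet. Here is a rough sketch using only the standard library that reads xl/styles.xml from the workbook and lists any negative indent or relativeIndent attributes. The function name, the regular expression and the sample archive are my own for illustration, and a negative indent is only one possible cause of this error.

```python
import os
import re
import tempfile
import zipfile

# Matches indent="-1" or relativeIndent="-1" style attributes with a negative value
NEGATIVE_INDENT = re.compile(rb'\b(?:indent|relativeIndent)="-\d+"')


def find_negative_indents(workbook_path):
    """Return the negative indent attributes found in xl/styles.xml."""
    with zipfile.ZipFile(workbook_path) as archive:
        styles = archive.read("xl/styles.xml")
    return [match.decode("ascii") for match in NEGATIVE_INDENT.findall(styles)]


# Hypothetical sample archive, standing in for a real workbook
sample = os.path.join(tempfile.mkdtemp(), "sample.xlsm")
with zipfile.ZipFile(sample, "w") as archive:
    archive.writestr(
        "xl/styles.xml",
        '<styleSheet><cellXfs><xf><alignment indent="-1"/></xf></cellXfs></styleSheet>',
    )

found = find_negative_indents(sample)
```

An empty list means the stylesheet has no negative indents and the problem lies elsewhere.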
Preprocess the file
I solved this issue by preprocessing the Excel file.
I found that my problem was in "*/myfile.xlsx/xl/styles.xml", where several xf tags had an attribute indent="-1". openpyxl only supports non-negative values and raises that exception when it finds a negative one.
After spending some time trying to override the entire openpyxl hierarchy in order to catch the exception, I decided to process the XLSX instead.
Here is my code:
import os
import zipfile

def fix_xlsx(file_name):
    with zipfile.ZipFile(file_name) as input_file, zipfile.ZipFile(file_name + ".out", "w", zipfile.ZIP_DEFLATED) as output_file:
        # Copy every member of the archive into the new file
        for inzipinfo in input_file.infolist():
            # Read each member whole: splitting into lines and re-joining
            # with b"\n" would insert extra bytes and corrupt binary parts
            data = input_file.read(inzipinfo)
            if "xl/styles.xml" in inzipinfo.filename:
                # Replace the negative indent that openpyxl rejects
                data = data.replace(b'indent="-1"', b'indent="0"')
            output_file.writestr(inzipinfo.filename, data)
    # Replace the original file with the patched copy
    os.replace(file_name + ".out", file_name)
Disclaimer:
I must say this is not a very elegant solution, as the entire file is processed and an auxiliary file is used.
Also, I am not enough of an Excel expert to tell whether changing indent="-1" to indent="0" for those tags might cause formatting problems in the file. This is my working solution, and I can't really tell what effect those tags have.
I had the same issue: the file wasn't accepted by openpyxl.
I just opened the file in MS Excel and saved it to a new file. And it worked after that.
I got the same error and wasn't able to figure out the exact cause, but noticed when I ran my python script in a different environment it worked without issue.
I realized it may have had something to do with the versions of the openpyxl and xlrd packages I was using so I downgraded them to openpyxl==3.0.4 and xlrd==1.2.0 (previously using openpyxl==3.0.7 and xlrd==2.0.1) and that solved my issue.
I ran into this issue. My solution was to pinpoint what was causing the error in the spreadsheet (it had something to do with a table that was recently modified) and reconstruct that table in the worksheet. That was much easier for me than debugging openpyxl or XML.

Openpyxl NotImplementedError Only When Loading Workbook

I have been working on a program that writes data to an Excel file using openpyxl, with the option of either loading an existing file or creating a new one. Creating a new file lets me write the data without any problems, but loading an existing file and writing new data to new rows raises a NotImplementedError at the line:
ws['A' + str(row)] = gene
even though it was the same for writing to a new file.
Any help would be greatly appreciated!
Update: Thanks Charlie, after removing use_iterators from:
wb = load_workbook(filename=file_name+'.xlsx', use_iterators=True), the code let me write to the file.
If you open a file in read-only mode, why do you expect to be able to edit it? The exception is raised for exactly this reason.
Remove use_iterators when opening the file to avoid this.
