Can't use xlwt to save Excel sheet in Python

So, I am facing a weird issue. In my code I have the following snippet:
workbook = xlwt.Workbook()
sheet = workbook.add_sheet("Sheet 1")
for x in range(len(table)):
    for y in range(len(table[x])):
        sheet.write(x, y, table[x][y])
workbook.save("output.xls")
which is from this question's answer: How do I export a two dimensional list in Python to excel?
where table is a multidimensional array (4x12 in my test code) that I want to convert to Excel. I know that the code reaches the save call and attempts it, because if I give it a directory it cannot write to due to insufficient admin rights, I get an error telling me so. But if I run the code with a normal directory... nothing happens.
I am not sure where this is going wrong.
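One possible explanation, offered as an assumption rather than a diagnosis: a relative filename such as "output.xls" is resolved against the current working directory of the process, which is not necessarily the folder that contains the script, so the file may have been saved somewhere unexpected. A minimal sketch using only the standard library (the helper name resolve_output_path is hypothetical) shows where such a path actually points:

```python
import os


def resolve_output_path(filename):
    """Return the absolute path that a relative filename would be written to."""
    # Relative paths are resolved against the current working directory,
    # not against the directory that contains the script.
    return os.path.abspath(filename)


if __name__ == "__main__":
    print("Current working directory:", os.getcwd())
    print("output.xls would be saved as:", resolve_output_path("output.xls"))
```

Passing the resolved absolute path to workbook.save() and printing it afterwards makes it easy to check whether the file landed where expected.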

Related

Opening an Excel File in Python Disables Dynamic Arrays

I have an Excel workbook that uses functions like OFFSET, UNIQUE, and FILTER, which spill into other cells. I'm using Python to analyze and write some data to the workbook, but after doing so these formulas revert to normal array formulas. This means they now take up a fixed number of cells (however many they took up before opening the file in Python) instead of adjusting to fit all of the data. I can revert the change by selecting the formula and hitting Enter, but there are so many of these formulas that it's more work to fix them than to just print the data to a text file and paste it into Excel manually. Is there any way to prevent this behavior?
I've been using openpyxl to open and save the workbook, but after encountering this issue also tried xlsxwriter and the dataframe to excel function from pandas. Both of them had the same issue as openpyxl. For context I am on python 3.11 and using the most recent version of these modules. I believe this issue is on the Python side and not the Excel side, so I don't think changing Excel settings will help, but maybe there is something there I missed.
Example:
I've created an empty workbook with two sheets, one called 'main' and one called 'input'. The 'main' sheet will analyze data from the 'input' sheet which will be entered with openpyxl. The data will just be values in the first column.
In cell A1 of the 'main' sheet, enter =OFFSET(input!A1,0,0,COUNTA(input!A:A),1).
This formula will just show a copy of the data. Since there currently isn't any data it gives a #REF! error, so it only takes up one cell.
Now I'll run the following python code to add the numbers 0-9 into the first column of the input sheet:
from openpyxl import load_workbook
wb = load_workbook('workbook.xlsx')
ws = wb['input']
for i in range(10):
    ws.append([i])
wb.save('workbook_2.xlsx')
When opening the new file, cell A1 on the 'main' sheet only has the first value, 0, instead of the range 0-9. When selecting the cell, you can see the formula is now {=OFFSET(input!A1,0,0,COUNTA(input!A:A),1)}. The curly brackets make it an array formula, so it won't spill. Hitting Enter in the formula removes the braces, and the cell properly spills to the full range.
If I can get this simple example to work, then expanding it to the data I'm using shouldn't be a problem.

Appending Excel cell values using pandas

Edit: I found a solution to my question. In short, follow the user manual for openpyxl instead of online tutorials: the tutorials I tried (more than one) raised errors, and their approach was significantly different from the one in the user manual. I also ended up not using pandas as much as I thought I would need to.
I am trying to append certain values in an Excel file with multiple sheets based on user inputs and then rewrite it to the Excel file (without deleting the rest of the sheets). So far I have tried this which seems to combine the data but I didn't quite see how it applied to what I am doing since I want to append a part of a sheet instead of rewrite the whole excel file. I have also tried a few other things with ExcelWriter but I don't quite understand it since it usually wipes all the data in the file (I may be using it wrong).
episode_dataframe = pd.read_excel(r'All_excerpts (Siena Copy)_test.xlsx', sheet_name=episode)
# episode is a specified string inputted by user, this line makes a data frame for the specified sheet
episode_dataframe.loc[(int(pass_num) - 1), 'Resources'] = resources
# resources is also a user inputted string, it's what I am trying to append the spreadsheet cell value to, this appends to corresponding data frame
path_R = open("All_excerpts (Siena Copy)_test.xlsx", "rb")
with pd.ExcelWriter(path_R) as writer:
    writer.book = openpyxl.load_workbook(path_R)
    # I copied this from [here][3], I think it should make the writer for the to_excel? I don't fully know
    episode_dataframe.to_excel(writer, sheet_name=episode, engine=openpyxl, if_sheet_exsits ='replace')
    # this should write the sheet data frame onto the file, but I don't want it to delete the other sheets
Additionally, I have been running into a bunch of other smaller errors. A big one was 'Workbook' object has no attribute 'add_worksheet', even though I'm not trying to add a worksheet. I also could not get their solution to work.
I am a bit of a novice at python, so my code might be a bit of a mess.
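Since the edit above says the fix came from working with openpyxl directly, here is a minimal sketch of that idea. It is not the poster's actual solution: the function name set_cell_by_header and the assumption that row 1 of the sheet holds the column headers are both hypothetical. It loads the workbook, changes a single cell, and saves again, which keeps the other sheets:

```python
from openpyxl import load_workbook


def set_cell_by_header(path, sheet_name, row_index, header, value):
    """Set one cell, located by column header and 0-based data row, then save."""
    workbook = load_workbook(path)
    sheet = workbook[sheet_name]
    # Assumption: the first worksheet row contains the column headers.
    headers = [cell.value for cell in sheet[1]]
    column = headers.index(header) + 1  # openpyxl columns are 1-based
    # Row 1 holds the headers, so data row 0 is worksheet row 2.
    sheet.cell(row=row_index + 2, column=column, value=value)
    # Saving writes the whole workbook back, so the other sheets are kept.
    workbook.save(path)
```

With the names from the question, the call would look something like set_cell_by_header("All_excerpts (Siena Copy)_test.xlsx", episode, int(pass_num) - 1, "Resources", resources). Note that openpyxl does not read every kind of object in a file, so things like shapes can be lost when saving over the original; working on a copy first seems prudent.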

Excel removes a formula set by Pandas, but setting the formula manually makes the formula work

After checking this post and seeing that it has no response, I have opened this one.
I am trying to set a formula in an Excel cell through Pandas in Python. So far it has worked by specifying the formula as text, but with a new formula I am having problems:
=FILTER(SHEET1!A2:I456,(IF(SHEET2!D9=0,SHEET1!D2:D456>SHEET2!D9,SHEET1!D2:D456>=SHEET2!D9)),"No data")
(In the python code, the " are specified as \" for the empty branch)
If I open the Excel file after the code runs, Excel complains that there is a problem and I have to accept a "recover", which shows that the formula has been removed, and the cell displays a 0.
After that, if I enter the same formula (with " instead of \") manually in the same cell, it works and the information is displayed.
I have tried to specify the cells with $ ($A$2) without success... I have also checked the Excel options, and formulas are set to evaluate in "Automatic".
What is the problem?
Regards.
After some more research I have found the problem. I'm using OFFICE 365, in case it might affect this answer.
What was driving me crazy was that the handwritten formula in Excel was working. I had a workaround that consisted of writing the contents of the formula as text without the = sign, so that Excel would not interpret it as a formula. Then I would open Excel, go to that cell, enter the = by hand, and when I pressed Enter the data was displayed.
Since I use Excel in Spanish, but with Pandas you have to write everything in English notation, I decided to look at what Excel did internally when I entered the = by hand and the formula worked. What I did was:
Change the file extension from .xlsx to .zip.
Open the zip and go to the path: xl/worksheets/sheet[number].xml.
Find the formula field, looking for <f> or </f>.
At that point I noticed that the content, instead of starting with:
FILTER(....)
started with:
_xlfn._xlws.FILTER(....)
So in the PANDAS code I changed:
cell_formula = f"=FILTER(...)"
to:
cell_formula = f"=_xlfn._xlws.FILTER(...)"
And then:
workbook = pandas_writer.book
worksheet = workbook.sheetnames[sheet_name]
worksheet.write_array_formula("A2:Y109", "{" + cell_formula + "}")
workbook.close()
And now when I open Excel I don't get the error and the formula shows the result. Looking at this section of the XlsxWriter documentation and at the Microsoft documentation, this function does not appear in either.
So if this happens to you, fix the function by hand, save the changes and inspect the internal XML that is generated by EXCEL.
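The manual inspection described above (rename to .zip, open xl/worksheets/sheet[number].xml, look for <f>) can also be scripted. This is only a sketch using the standard library; the function name list_sheet_formulas is made up for illustration, and it assumes the workbook uses the usual (transitional) SpreadsheetML namespace:

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the main SpreadsheetML schema used in worksheet parts.
MAIN_NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def list_sheet_formulas(xlsx_path):
    """Map each worksheet XML part in the .xlsx archive to its formula strings."""
    formulas = {}
    # An .xlsx file is a ZIP archive, so it can be opened without renaming it.
    with zipfile.ZipFile(xlsx_path) as archive:
        for name in archive.namelist():
            if name.startswith("xl/worksheets/") and name.endswith(".xml"):
                root = ET.fromstring(archive.read(name))
                # Formulas live in <f> elements, stored without the leading "=".
                formulas[name] = [f.text for f in root.iter(MAIN_NS + "f") if f.text]
    return formulas
```

Running this on a copy of the file saved by Excel after fixing the formula by hand shows the internal spelling (for example a prefix such as _xlfn._xlws.) that the Python code then has to write.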

Broken Excel output: Openpyxl formula settings?

I am creating some Excel spreadsheets from pandas DataFrames using the pandas.ExcelWriter().
Issue:
For some string input, this creates broken .xlsx files that need to be repaired. (problem with some content --- removed formula, cf error msg below)
I assume this happens because Excel interprets the cell content not as a string, but a formula which it cannot parse, e.g. when a string value starts with "="
Question:
When using xlsxwriter as engine, I can solve this issue by setting the argument options = {"strings_to_formulas" : False }
Is there a similar argument for openpyxl?
Troubleshooting:
I found the data_only argument to Workbook, but it only seems to apply to reading files / I cannot get it to work with ExcelWriter().
Not all output values are strings / I'd like to avoid converting all output to str
Could not find an applicable question on here
Any hints are much appreciated, thanks!
Error messages:
We found a problem with some content in 'file.xlsx'. Do you want us to try to recover as much as we can? If you trust the source of this workbook, click Yes
The log after opening says:
[...] summary="Following is a list of removed records:">Removed Records: Formula from /xl/worksheets/sheet1.xml part [...]
Code
import pandas
excelout = pandas.ExcelWriter(output_file, engine = "openpyxl")
df.to_excel(excelout)
excelout.save()
Versions:
pandas #0.24.2
openpyxl #2.5.6
Excel 2016 for Mac (but replicates on Win)
I've struggled with this issue too.
I found a strange solution for formulas.
I had to replace all ; (semicolon) signs with , (comma) in the formulas.
When I opened the resulting xlsx file with Excel, the error did not appear and the formula in Excel showed the usual ;.
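A likely reason, as far as I understand it, is that formulas stored in the file are expected in English notation with comma separators, and Excel translates them to the local separator when displaying them. The replacement can be sketched like this (the helper name semicolons_to_commas is hypothetical); it leaves semicolons inside double-quoted text alone:

```python
def semicolons_to_commas(formula):
    """Replace ';' argument separators with ',' outside double-quoted strings."""
    result = []
    in_string = False
    for ch in formula:
        if ch == '"':
            # Entering or leaving a quoted text literal.
            in_string = not in_string
        if ch == ";" and not in_string:
            result.append(",")
        else:
            result.append(ch)
    return "".join(result)


print(semicolons_to_commas('=IF(A1>0;"a;b";"c")'))  # =IF(A1>0,"a;b","c")
```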
I spent FAR too long trying to figure out this error.
Turned out I had an extra bracket, so the formula wasn't valid.
I know 99% of people will read this and say "that's not the issue" and move on, but take your formula and paste it into Excel if you can (replacing dynamic values as best you can) and see if Excel accepts it.
If it accepts it, fine, move on and find whatever the other cause is, but if you find it doesn't like the formula, maybe I just saved you a couple of hours....
My command: f'''=IF(ISBLANK(E{row}),FALSE," "))'''
Tiny command, could not understand what was wrong with it. :facepalm:
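For this particular kind of mistake, a quick pre-check in Python can catch it before the file is written. This is just a rough sketch (the function name paren_balance and the row value are made up for illustration); it counts parentheses and skips anything inside double quotes:

```python
def paren_balance(formula):
    """Return open-minus-close parenthesis count, ignoring text inside double quotes."""
    balance = 0
    in_string = False
    for ch in formula:
        if ch == '"':
            # A doubled quote ("") inside a string toggles twice, so state is kept.
            in_string = not in_string
        elif not in_string:
            if ch == "(":
                balance += 1
            elif ch == ")":
                balance -= 1
    return balance


row = 5  # hypothetical row number for illustration
broken = f'''=IF(ISBLANK(E{row}),FALSE," "))'''
fixed = f'''=IF(ISBLANK(E{row}),FALSE," ")'''
print(paren_balance(broken))  # -1: one closing parenthesis too many
print(paren_balance(fixed))   # 0: balanced
```

A result of 0 only means the counts match, not that the formula is valid, so pasting it into Excel is still the real test.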

pd.read_excel does recognize the file but does not actually read it

I've been busy working on some code and one part of it is importing an excel file. I've been using the code below. Now, on one pc it works but on another it does not (I did change the paths though). Python does recognize the excel file and does not give an error when loading, but when I print the table it says:
Empty DataFrame
Columns: []
Index: []
Just to be sure, I checked the filepath which seems to be correct. I also checked the sheetname but that is all good too.
df = pd.read_excel(book_filepath, sheet_name='Potentie_alles')
description = df["#"].map(str)
This raises a KeyError for '#' (# is the header of the first column of the sheet).
Does anyone know how to fix this?
Kind regards,
iCookieMonster
