File not writing unless I set a trace and wait - python

I'm having a bizarre issue trying to write an xlsx file in Python.
I'm using Python 2.7.x and xlsxwriter to write xlsx files.
Here's a code snippet for context:
workbook = xlsxwriter.Workbook('filename.xlsx')
worksheet = workbook.add_worksheet('worksheet_name')
worksheet.write_row('A1', make_header_row)  # <---- ROW 1
# ... initialize "fields" array
worksheet.write_row('A2', fields)  # <---- ROW 2
So here's the problem: Row 1 gets written, no problem. Row 2 never gets written... unless I stick an import pdb; pdb.set_trace() right above the line where I write Row 2. Waiting ~5 seconds in pdb and then hitting continue results in a successfully written second row.
I've tried flushing the workbook right after write_row, making sure the file is closed... nothing works.
Thanks for any help you can provide!

Related

Opening an Excel File in Python Disables Dynamic Arrays

I have an excel workbook that uses functions like OFFSET, UNIQUE, and FILTER which spill into other cells. I'm using python to analyze and write some data to the workbook, but after doing so these formulas revert into normal arrays. This means they now take up a fixed number of cells (however many they took up before opening the file in python) instead of adjusting to fit all of the data. I can revert the change by selecting the formula and hitting enter, but there are so many of these formulas that it's more work to fix them than to just print the data to a text file and paste it into excel manually. Is there any way to prevent this behavior?
I've been using openpyxl to open and save the workbook, but after encountering this issue also tried xlsxwriter and the dataframe to excel function from pandas. Both of them had the same issue as openpyxl. For context I am on python 3.11 and using the most recent version of these modules. I believe this issue is on the Python side and not the Excel side, so I don't think changing Excel settings will help, but maybe there is something there I missed.
Example:
I've created an empty workbook with two sheets, one called 'main' and one called 'input'. The 'main' sheet will analyze data from the 'input' sheet which will be entered with openpyxl. The data will just be values in the first column.
In cell A1 of the 'main' sheet, enter =OFFSET(input!A1,0,0,COUNTA(input!A:A),1).
This formula will just show a copy of the data. Since there currently isn't any data it gives a #REF! error, so it only takes up one cell.
Now I'll run the following python code to add the numbers 0-9 into the first column of the input sheet:
from openpyxl import load_workbook
wb = load_workbook('workbook.xlsx')
ws = wb['input']
for i in range(10):
    ws.append([i])
wb.save('workbook_2.xlsx')
When opening the new file, cell A1 on the 'main' sheet only has the first value, 0, instead of the range 0-9. When selecting the cell, you can see the formula is now {=OFFSET(input!A1,0,0,COUNTA(input!A:A),1)}. The curly brackets make it an array formula, so it won't spill. Hitting enter in the formula removes the array brackets and the cell properly spills to the full range.
If I can get this simple example to work, then expanding it to the data I'm using shouldn't be a problem.

Why does this Python command line make my Excel freeze and stop responding?

I am trying to make a Python script that refreshes a specific file. I have been able to do it with a few workbooks, but with a handful of them it seems to break my Excel. Here is my code:
import win32com.client
import time
# Start an instance of Excel
x1 = win32com.client.DispatchEx("Excel.Application")
# Open the workbook in said instance of Excel
wb = x1.workbooks.open(r"file path")
x1.Visible = True
# Refresh all data connections.
wb.RefreshAll()
x1.CalculateUntilAsyncQueriesDone()
wb.Save()
wb.Close(True)
time.sleep(5)
x1.Quit()
print("Excel Quit")
What happens is right when I get to x1.CalculateUntilAsyncQueriesDone() Excel just spins and whites out and says "Not responding." I've let it run for 15 minutes and nothing. Usually this query takes about 1 minute if I just simply open the spreadsheet and hit refresh all. Also, if I replace x1.CalculateUntilAsyncQueriesDone() with time.sleep(120) the code works perfectly. For some reason that line is breaking the entire process. I don't want to simply use time.sleep though, because sometimes the refresh will take longer or shorter.
Any help anyone can give would be greatly appreciated.

Using a function from another file to write to excel with xlsx

I have two python files:
1- writerCode.py:
import xlsxwriter
workbook = xlsxwriter.Workbook('demo.xlsx')
ws = workbook.add_worksheet()
def writeTo(x,y,array):
    j = 0
    while j < (len(array)):
        ws.write(x,y,array[j])
        j +=1
        x +=1
    return;
workbook.close()
2- testingCode.py:
from writerCode import *
an = ['123','234','123','432','123']
writeTo(0,0,an)
I want to import an[] items to excel.
When I run testingCode.py it creates 'demo.xlsx' with NOTHING in it. The excel file is empty meaning that it does not import an[] to the excel file as intended.
I was wondering if anybody knows what the problem is??
The problem is not "Using a function from another file".
You wrote: "The excel file is empty meaning that it does not import an[] to the excel file as intended."
It could also mean that the part of your code controlling Excel does not work.
test2.py:
def mydef(d):
    print(d)
test.py:
from test2 import *
data = "hello"
mydef(data)
results in
hello
So it's not that.
It's probably that you are closing your workbook right after you open it.
All the code is run on import, except for the function.
Quick (if ugly) fix, short of restructuring your code: remove workbook.close() from writerCode.py and call it after calling writeTo(), for example at the end of testingCode.py.
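To see why the workbook is already closed by the time writeTo() runs, here is a small sketch using only the standard library. It writes a throwaway module (demo_writer, a hypothetical name) laid out like writerCode.py, with a list of events standing in for the workbook, and shows that every top-level statement runs during the import.

```python
import importlib
import os
import sys
import tempfile

# A tiny module whose top level "opens" and "closes" a resource,
# mirroring the layout of writerCode.py (all names are hypothetical).
source = (
    "events = []\n"
    "events.append('open')\n"
    "def write_to(value):\n"
    "    events.append('write ' + value)\n"
    "events.append('close')\n"
)

tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "demo_writer.py"), "w") as f:
    f.write(source)

sys.path.insert(0, tmpdir)
demo_writer = importlib.import_module("demo_writer")

# By the time the import returns, 'close' has already been recorded.
events_after_import = list(demo_writer.events)
print(events_after_import)   # ['open', 'close']

# The function body only runs now, after the "close".
demo_writer.write_to("123")
print(demo_writer.events)    # ['open', 'close', 'write 123']
```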
xlsxwriter should be raising an Exception when a closed Workbook/Worksheet is being written to, but it doesn't:
import xlsxwriter
workbook = xlsxwriter.Workbook('demo.xlsx')
ws = workbook.add_worksheet()
workbook.close()
ws.write(0,0,"hello")
indeed results in a valid XLSX file with no data in the cell.
You are importing a workbook that has already been closed by the time testingCode.py uses it, and you can't write to a closed workbook.
So, move workbook.close() to testingCode.py. That way you are closing the workbook after calling the function writeTo(x, y, array).
Also try to use a for loop: the while loop works, but it makes your code a bit harder to read. Adding a space after each comma makes the code more readable as well.
def writeTo(x, y, array):
    for item in array:
        ws.write(x, y, item)
        x += 1

How can I make this python(using openpyxl) program run faster?

Here is my code:
import openpyxl
import os
os.chdir('c:\\users\\Desktop')
wb= openpyxl.load_workbook(filename= 'excel.xlsx',data_only = True)
wb.create_sheet(index=0,title='Summary')
sumsheet= wb.get_sheet_by_name('Summary')
print('Creating Summary Sheet')
#loop through worksheets
print('Looping Worksheets')
for sheet in wb.worksheets:
    for row in sheet.iter_rows():
        for cell in row:
            #find headers of columns needed
            if cell.value=='LowLimit':
                lowCol=cell.column
            if cell.value=='HighLimit':
                highCol=cell.column
            if cell.value=='MeasValue':
                measCol=cell.column
        #name new columns
        sheet['O1']='meas-low'
        sheet['P1']='high-meas'
        sheet['Q1']='Minimum'
        sheet['R1']='Margin'
        #find how many rows of each sheet
        maxrow=sheet.max_row
        i=0
        #subtraction using max row
        for i in range(2,maxrow+1):
            if sheet[str(highCol)+str(i)].value=='---':
                sheet['O'+str(i)]='='+str(measCol)+str(i)+'-'+str(lowCol)+str(i)
                sheet['P'+str(i)]='=9999'
                sheet['Q'+str(i)]='=MIN(O'+str(i)+':P'+str(i)+')'
                sheet['R'+str(i)]='=IF(AND(Q'+str(i)+'<3,Q'+str(i)+'>-3),"Marginal","")'
            elif sheet[str(lowCol)+str(i)].value=='---':
                sheet['O'+str(i)]='=9999'
                sheet['P'+str(i)]='='+str(highCol)+str(i)+'-'+str(measCol)+str(i)
                sheet['Q'+str(i)]='=MIN(O'+str(i)+':P'+str(i)+')'
                sheet['R'+str(i)]='=IF(AND(Q'+str(i)+'<3,Q'+str(i)+'>-3),"Marginal","")'
            else:
                sheet['O'+str(i)]='='+str(measCol)+str(i)+'-'+str(lowCol)+str(i)
                sheet['P'+str(i)]='='+str(highCol)+str(i)+'-'+str(measCol)+str(i)
                sheet['Q'+str(i)]='=MIN(O'+str(i)+':P'+str(i)+')'
                sheet['R'+str(i)]='=IF(AND(Q'+str(i)+'<3,Q'+str(i)+'>-3),"Marginal","")'
            ++i
print('Saving new wb')
import os
os.chdir('C:\\Users\\hpj683\\Desktop')
wb.save('example.xlsx')
This runs perfectly fine except that it takes 4 minutes to complete one excel workbook. Is there any way I can optimize my code to make this run faster? My research online suggested to change to read_only or write_only to make it run faster however my code requires reading and writing to an excel workbook, so neither of those worked.
The code could benefit from being broken down into separate functions. This will help you identify the slow bits and replace them bit by bit.
The following bits should not be in the loop for every row:
finding the headers
calling ws.max_row, which is very expensive
building cell references as strings such as ws["C" + str(i)]; use ws.cell(row=i, column=3) instead
And if the nested loop is not a formatting error then why is it nested?
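The idea of doing the header lookup once and then making a single pass over the data rows can be sketched without openpyxl. Here plain lists stand in for worksheet rows, the headers are assumed to sit in the first row, and find_header_columns is a hypothetical helper name.

```python
def find_header_columns(header_row, wanted):
    """Return {header name: 1-based column index} for the wanted headers."""
    positions = {}
    for index, value in enumerate(header_row, start=1):
        if value in wanted:
            positions[value] = index
    return positions

# Plain lists stand in for worksheet rows (assumption: headers are in row 1).
rows = [
    ["Test", "LowLimit", "MeasValue", "HighLimit"],
    ["t1", 1.0, 2.5, 4.0],
    ["t2", "---", 3.0, 5.0],
]

# Header positions are looked up once, before the row loop.
cols = find_header_columns(rows[0], {"LowLimit", "HighLimit", "MeasValue"})

margins = []
for row in rows[1:]:         # a single pass over the data rows
    low = row[cols["LowLimit"] - 1]
    meas = row[cols["MeasValue"] - 1]
    margins.append(None if low == "---" else meas - low)

print(cols)
print(margins)
```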
Also you should look at the profile module to find out what is slow. You might want to watch my talk on profiling openpyxl from last year's PyCon UK.
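As a minimal sketch of profiling, the standard library's cProfile (the C implementation of the profile interface) can be wrapped around the slow code and its report sorted by cumulative time. The functions slow_part and process_workbook are hypothetical stand-ins for the real script.

```python
import cProfile
import io
import pstats

def slow_part():
    # Hypothetical stand-in for the per-row work in the script.
    return sum(i * i for i in range(200000))

def process_workbook():
    # Hypothetical stand-in for the whole script.
    return [slow_part() for _ in range(3)]

profiler = cProfile.Profile()
profiler.enable()
process_workbook()
profiler.disable()

# Sort by cumulative time and show the ten most expensive entries.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(10)
report = stream.getvalue()
print(report)
```

The entries at the top of the report are the ones worth rewriting first.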
Good luck!

win32com Excel data input error

I'm exporting results of my script into Excel spreadsheet. Everything works fine, I put big sets of data into SpreadSheet, but sometimes an error occurs:
  File "C:\Python26\lib\site-packages\win32com\client\dynamic.py", line 550, in __setattr__
    self._oleobj_.Invoke(entry.dispid, 0, invoke_type, 0, value)
pywintypes.com_error: (-2147352567, 'Exception.', (0, None, None, None, 0, -2146777998), None)
I suppose it's not a problem with the input data format. I write several different types of data (strings, ints, floats, lists) and it works fine. When I run the script a second time it works fine, with no error. What's going on?
PS. This is the code that generates the error. What's strange is that the error doesn't always occur; roughly 30% of runs result in an error:
import win32com.client
def Generate_Excel_Report():
    Excel=win32com.client.Dispatch("Excel.Application")
    Excel.Workbooks.Add(1)
    Cells=Excel.ActiveWorkBook.ActiveSheet.Cells
    for i in range(100):
        Row=int(35+i)
        for j in range(10):
            Cells(int(Row),int(5+j)).Value="string"
    for i in range(100):
        Row=int(135+i)
        for j in range(10):
            Cells(int(Row),int(5+j)).Value=32.32 #float
Generate_Excel_Report()
The strangest for me is that when I run the script with the same code, the same input many times, then sometimes an error occurs, sometimes not.
This is most likely a synchronous COM access error. See my answer to Error while working with excel using python for details about why and a workaround.
I can't see why the file format/extension would make a difference. You'd be calling the same COM object either way. My experience with this error is that it's more or less random, but you can increase the chances of it happening by interacting with Excel while your script is running.
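A common way to cope with intermittent "Excel is busy" failures like this is to retry the failing call after a short pause. The sketch below is generic and runs without Excel: retry_call and BusyError are hypothetical names, with BusyError standing in for pywintypes.com_error. Whether this matches the workaround in the linked answer is not stated in this thread.

```python
import time

class BusyError(Exception):
    """Hypothetical stand-in for pywintypes.com_error."""

def retry_call(func, retries=5, delay=0.1, exceptions=(BusyError,)):
    """Call func(); on a listed exception, wait and try again.

    Re-raises the last exception once all attempts have failed.
    """
    for attempt in range(retries):
        try:
            return func()
        except exceptions:
            if attempt == retries - 1:
                raise
            time.sleep(delay)

# Simulate a call that fails twice before succeeding.
attempts = {"count": 0}

def flaky_write():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise BusyError("application is busy")
    return "written"

result = retry_call(flaky_write, retries=5, delay=0.01)
print(result, attempts["count"])
```

In a real script the cell assignment would be wrapped in a small function and passed to retry_call with the COM error type in exceptions; that wiring is an assumption and is not shown here.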
edit: It doesn't change a thing. The error still occurs, but less often: once in 10 simulations, whereas with an .xlsx file it was once in 3 simulations. Please help.
The problem was with the file I was opening. It was .xlsx; once I saved it as .xls the problem disappeared. So beware: do not use the COM interface with .xlsx or you'll get in trouble!
You should disable Excel interactivity while doing this.
import win32com.client
def Generate_Excel_Report():
    Excel=win32com.client.Dispatch("Excel.Application")
    #you won't see what happens (faster)
    Excel.ScreenUpdating = False
    #clicks on the Excel window have no effect
    #(set back to True before closing Excel)
    Excel.Interactive = False
    Excel.Workbooks.Add(1)
    Cells=Excel.ActiveWorkBook.ActiveSheet.Cells
    for i in range(100):
        Row=int(35+i)
        for j in range(10):
            Cells(int(Row),int(5+j)).Value="string"
    for i in range(100):
        Row=int(135+i)
        for j in range(10):
            Cells(int(Row),int(5+j)).Value=32.32 #float
    Excel.ScreenUpdating = True
    Excel.Interactive = True
Generate_Excel_Report()
You could also do the following to improve performance:
#Construct data block
string_line = []
for i in range(10):
    string_line.append("string")
string_block = []
for i in range(100):
    string_block.append(string_line)
#Write data block in one call
#(100 rows x 10 columns, starting at row 35, column 5)
ws = Excel.ActiveWorkbook.Sheets(1)
ws.Range(
    ws.Cells(35, 5),
    ws.Cells(134, 14)
).Value = string_block
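When writing a block in one call, the target range has to match the block's shape exactly: a 100 x 10 block starting at row 35, column 5 ends at row 134, column 14. A small helper can compute the corners from the data so they never drift apart; block_corners is a hypothetical name and the sketch runs without Excel.

```python
def block_corners(top, left, block):
    """Return ((top, left), (bottom, right)) for a rectangular 2-D block."""
    n_rows = len(block)
    n_cols = len(block[0])
    if any(len(row) != n_cols for row in block):
        raise ValueError("all rows must have the same length")
    return (top, left), (top + n_rows - 1, left + n_cols - 1)

# 100 rows x 10 columns of the same string, built with comprehensions.
string_block = [["string" for _ in range(10)] for _ in range(100)]

first, last = block_corners(35, 5, string_block)
print(first, last)  # (35, 5) (134, 14)
```

The two tuples are the (row, column) pairs to pass to ws.Cells for the top-left and bottom-right cells of the range.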
I had the same error while using xlwings to interact with Excel. xlwings also uses win32com in the backend.
After some debugging, I realized that this error pops up whenever the code is executed and the excel file (containing data) is not in focus. In order to resolve the issue, I simply select the file which is being processed and run the code and it always works for me.
