Freeze Panes first two rows and column with openpyxl

I am trying to freeze the first two rows and the first column with openpyxl. However, whenever I do so, Excel reports that the file is corrupted and no panes are frozen.
Current code:
import openpyxl
from openpyxl.worksheet.views import Pane
workbook = openpyxl.load_workbook(path)
worksheet = workbook[first_sheet]
freeze_panes = Pane(xSplit=2000, ySplit=3000, state="frozen", activePane="bottomRight")
worksheet.sheet_view.pane = freeze_panes
I looked at the documentation, but it offers little explanation of how to set the parameters.
Desired output: the first two rows and the first column stay in place while the rest of the sheet scrolls.
I came across the answer to the question below, but it fits a specific use case, so I wanted to ask a general question for future reference:
How to split Excel screen with Openpyxl?

To freeze the first two rows and the first column, set ws.freeze_panes as in the sample code below. As in Excel itself, you select the cell just below the last row and just to the right of the last column you want frozen: everything above and to the left of that cell is frozen. So, in your case, the cell should be B3. Hope this is what you are looking for.
import openpyxl
wb = openpyxl.load_workbook('Sample.xlsx')
ws = wb['Sheet1']
# Rows above B3 (rows 1-2) and columns left of it (column A) get frozen
mycell = ws['B3']
ws.freeze_panes = mycell
wb.save('Sample.xlsx')
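For the general case, the rule in this answer (pick the cell just past the rows and columns that should stay frozen) can be written as a small helper. This is only a sketch: the names column_letter and freeze_anchor are made up for illustration and are not part of openpyxl.

```python
def column_letter(index: int) -> str:
    """Convert a 1-based column index to Excel letters (1 -> A, 27 -> AA)."""
    if index < 1:
        raise ValueError("column index must be >= 1")
    letters = ""
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def freeze_anchor(frozen_rows: int, frozen_cols: int) -> str:
    """Return the cell coordinate to assign to freeze_panes (hypothetical helper)."""
    if frozen_rows < 0 or frozen_cols < 0:
        raise ValueError("counts must be non-negative")
    # The anchor is one row below and one column right of the frozen area.
    return f"{column_letter(frozen_cols + 1)}{frozen_rows + 1}"


print(freeze_anchor(2, 1))  # two frozen rows, one frozen column
```

With the workbook from the answer, this would be used as ws.freeze_panes = freeze_anchor(2, 1). A result of A1 means there is nothing to freeze.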

Related

Opening an Excel File in Python Disables Dynamic Arrays

I have an Excel workbook that uses functions like OFFSET, UNIQUE, and FILTER, which spill into other cells. I'm using Python to analyze the workbook and write some data to it, but after doing so these formulas revert to normal (fixed-size) array formulas. This means they now occupy a fixed number of cells (however many they occupied before the file was opened in Python) instead of adjusting to fit all of the data. I can revert the change by selecting the formula and hitting Enter, but there are so many of these formulas that fixing them is more work than printing the data to a text file and pasting it into Excel manually. Is there any way to prevent this behavior?
I've been using openpyxl to open and save the workbook, but after encountering this issue I also tried xlsxwriter and the DataFrame.to_excel function from pandas. Both had the same issue as openpyxl. For context, I am on Python 3.11 and using the most recent versions of these modules. I believe this issue is on the Python side and not the Excel side, so I don't think changing Excel settings will help, but maybe there is something there I missed.
Example:
I've created an empty workbook with two sheets, one called 'main' and one called 'input'. The 'main' sheet will analyze data from the 'input' sheet which will be entered with openpyxl. The data will just be values in the first column.
In cell A1 of the 'main' sheet, enter =OFFSET(input!A1,0,0,COUNTA(input!A:A),1).
This formula will just show a copy of the data. Since there currently isn't any data it gives a #REF! error, so it only takes up one cell.
Now I'll run the following python code to add the numbers 0-9 into the first column of the input sheet:
from openpyxl import load_workbook
wb = load_workbook('workbook.xlsx')
ws = wb['input']
for i in range(10):
    ws.append([i])
wb.save('workbook_2.xlsx')
When I open the new file, cell A1 on the 'main' sheet shows only the first value, 0, instead of the range 0-9. Selecting the cell shows that the formula is now {=OFFSET(input!A1,0,0,COUNTA(input!A:A),1)}. The curly brackets make it an array formula, so it won't spill. Hitting Enter in the formula removes the array, and the cell properly spills to the full range.
If I can get this simple example to work, then expanding it to the data I'm using shouldn't be a problem.

How to read outline levels using Python `openpyxl`?

My organization has a clean export for bills of materials (BOM). I would like to automatically parse the Excel file to check the BOM for certain attributes.
At the moment, I'm using Python with openpyxl.
I can read the workbook and worksheet just fine, but I cannot find the attribute that contains the "outline level" of each row (I fully concede that I may be using the wrong terminology; another candidate term might be "group").
When I open the file in Excel, I see the outline level buttons 1 2 3 4 5 at the left of the screen.
I would like to extract that level (1 2 3 4 5) for each row and to tell which group each row is in.
My initial code is:
from pathlib import Path
import openpyxl as xl
path = Path('<path-to-my-file>.xlsx')
wb = xl.load_workbook(filename=path)
sh = wb.worksheets[0]
# ... would like to put outline level reading code here
From reading other questions, I suspect that I need to look at the row_dimensions.group method of the worksheet, but I can't get a handle on the syntax or the exact attribute that I'm looking for.

Thanks for the post. I was struggling with the same problem, and seeing your post gave me an idea!
I solved it with the following code:
from pathlib import Path
import openpyxl as xl
path = Path('<path-to-my-file>.xlsx')
wb = xl.load_workbook(filename=path)
sh = wb.worksheets[0]
# row_dimensions maps row numbers to RowDimension objects
for row in sorted(sh.row_dimensions):
    dim = sh.row_dimensions[row]
    # outlineLevel and outline_level are two names for the same value
    print(row, dim.outlineLevel, dim.outline_level)

You can use the following code to read each row's outline level as an integer. I use similar code, with a few more lines, to find the maximum outline level in a sheet.
for index in range(ws.min_row, ws.max_row + 1):
    row_level = ws.row_dimensions[index].outline_level + 1
Here row_level is the outline level, which you can use as required. Please double-check the +1: if I remember correctly, ungrouped rows are stored with outline level 0, so you need to add one to match the numbered level buttons that Excel shows.
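Once the levels are read, working out which group each row belongs to is a matter of tracking the most recent row at a shallower level. The sketch below assumes that each summary row sits directly above its detail rows (as BOM exports often do; Excel's default outline puts summary rows below, in which case the rows would need to be walked in reverse). The function name parent_rows and the sample data are hypothetical.

```python
def parent_rows(levels):
    """Map each row number to the row number of its parent row (or None).

    `levels` is a list of (row_number, outline_level) pairs in sheet order,
    with outline_level 0 for top-level rows. Assumes each summary row sits
    directly above its detail rows.
    """
    parents = {}
    stack = []  # (row_number, level) of the currently open ancestors
    for row, level in levels:
        # Close every ancestor that is at the same depth or deeper.
        while stack and stack[-1][1] >= level:
            stack.pop()
        parents[row] = stack[-1][0] if stack else None
        stack.append((row, level))
    return parents


# Hypothetical sample: row 1 is an assembly, rows 2-3 are its parts,
# row 4 is a sub-part of row 3, and row 5 starts a new assembly.
sample = [(1, 0), (2, 1), (3, 1), (4, 2), (5, 0)]
print(parent_rows(sample))
```

The input pairs could be built from the worksheet with something like [(i, ws.row_dimensions[i].outline_level) for i in range(ws.min_row, ws.max_row + 1)].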

gspread - get_all_values() returns an empty list

If I fetch a sheet by name, the get_all_values function always gives me an empty list, even for a sheet that is definitely not empty.
import gspread
sheet = workbook.worksheet(sheet_name)
all_rows_list = sheet.get_all_values()
The only time get_all_values seems to return like it should is if I do the following:
all_rows_list = workbook.sheet1.get_all_values()
But the above works just for the first sheet and for no other, which is kind of useless for a workbook with more sheets.
What always works is reading row by row like
one_row_list = sheet.row_values(1) # first row
But the problem is that I'm trying to read a relatively big workbook with lots of sheets to figure out where I'm supposed to start writing, and it looks like reading row by row triggers "RESOURCES EXHAUSTED" error very fast.
So, am I doing something wrong or is get_all_values broken in gspread?
EDIT: Added a screenshot (not reproduced here).

gspread doesn't work well with sheets whose names can be mistaken for a cell reference in A1 notation (like X101 and AT8 in your case).
https://github.com/burnash/gspread/issues/554 is an older issue that describes the underlying problem (the symptoms in that issue are different, but I'm pretty sure the root problem is the same).
I'll copy the workaround that you discovered yourself, which is to provide an explicit range:
ws.range("A1:C" + str(end_row))
Here end_row is usually the row_count of the sheet.
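To find the sheets that are likely to hit this problem before reading them, a rough check on the sheet name may help. The pattern below is an assumption made for illustration (up to three column letters followed by a row number); it is not gspread's own logic, and the function name is hypothetical.

```python
import re

# Up to three column letters followed by a row number, e.g. "X101" or "AT8".
# This pattern is an illustrative assumption, not how gspread parses names.
_A1_LIKE = re.compile(r"[A-Za-z]{1,3}[1-9][0-9]*")


def looks_like_a1_reference(sheet_name: str) -> bool:
    """Return True if the sheet name could be read as a single A1 cell reference."""
    return _A1_LIKE.fullmatch(sheet_name) is not None


for name in ["X101", "AT8", "Sheet1", "Data 2023"]:
    print(name, looks_like_a1_reference(name))
```

Sheets flagged by this check are the ones where the explicit-range workaround is worth using instead of get_all_values.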

How to pull last cell in column using openpyxl in python

I created a small program that writes to an Excel file. I have another program that needs to read the last entry in column A every day. Since new data is imported into the Excel file every day, the cell that I need to capture keeps changing.
Is there a way to grab the last cell in column A using openpyxl in Python?
I don't have much experience with this, so I wasn't sure where to start.
import openpyxl
wb = openpyxl.load_workbook('text.xlsx')
sheet = wb['Sheet1']  # get_sheet_by_name() is deprecated in favor of indexing

From https://openpyxl.readthedocs.io/en/stable/tutorial.html:
Try this. It gets the entire column A and takes the last cell (use .value on it to read the contents):
sheet['A'][-1]
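One caveat, as far as I know: sheet['A'] runs down to the last used row of the whole sheet, so if another column is longer than column A, the last cell in A can be empty. A minimal sketch that walks backwards to the last cell that actually holds a value is shown below. The helper name is hypothetical, and the sample uses stand-in objects rather than real openpyxl cells.

```python
from types import SimpleNamespace


def last_filled_value(cells):
    """Return the value of the last cell whose value is not None, else None."""
    for cell in reversed(cells):
        if cell.value is not None:
            return cell.value
    return None


# Stand-ins for openpyxl cells: anything with a .value attribute works here.
column_a = tuple(SimpleNamespace(value=v) for v in ["date", 10, 12, None, None])
print(last_filled_value(column_a))
```

With a real worksheet, the call would be last_filled_value(sheet['A']).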

How to apply conditional formatting in openpyxl?

I am using openpyxl to manipulate a Microsoft Excel Worksheet.
What I want to do is to add a Conditional Formatting Rule that fills the rows with a given colour if the row number is even, leaves the row blank if not.
In Excel this can be done by selecting all the worksheet, creating a new formatting rule with the text =MOD(ROW();2)=0 or =EVEN(ROW()) = ROW().
I tried to implement this behaviour with the following lines of code (considering for example the first 10 rows):
redFill = PatternFill(start_color='EE1111', end_color='EE1111', fill_type='solid')
ws2.conditional_formatting.add('A1:A10', FormulaRule(formula=['MOD(ROW();2) = 0'], stopIfTrue=False, fill=redFill))
My program runs correctly but when I try to open the output Excel file, it tells me that the file contains unreadable content and it asks me if I want to recover the worksheet content. By clicking yes, the worksheet is what I expect but there is no formatting.
What is the correct way to apply such a formatting in openpyxl (possibly to the entire worksheet)?

Unfortunately, the way formulae are handled in conditional formatting is particularly opaque. The best thing to do is to create a file in Excel with the conditional format you want and inspect it by unzipping it. The rules are stored in the worksheet files and the formats in the styles file.
However, I suspect that the problem may simply be that you are using ";" to separate the parameters of the function: you must always use commas for this.
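A sample formula from one of my projects:
from openpyxl.styles import Font, PatternFill
from openpyxl.styles.differential import DifferentialStyle
from openpyxl.formatting.rule import Rule
green_text = Font(color="006100")
green_fill = PatternFill(bgColor="C6EFCE")
dxf2 = DifferentialStyle(font=green_text, fill=green_fill)
r3 = Rule(type="expression", dxf=dxf2)
# Arguments inside the formula are separated by commas, not semicolons
r3.formula = ["AND(ISNUMBER(C2), C2>=400)"]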
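If formulas are copied from an Excel installation whose locale uses ";" as the argument separator, they can be converted before being handed to openpyxl. The function below is a rough sketch with a made-up name: it only swaps separators outside double-quoted strings, and it does not handle array constants such as {1;2} or locale-specific decimal commas.

```python
def use_comma_separators(formula: str) -> str:
    """Replace ';' argument separators with ',' outside double-quoted strings."""
    result = []
    in_string = False
    for char in formula:
        if char == '"':
            # An escaped quote ("") toggles twice, so the state stays correct.
            in_string = not in_string
        if char == ";" and not in_string:
            result.append(",")
        else:
            result.append(char)
    return "".join(result)


print(use_comma_separators("MOD(ROW();2)=0"))
print(use_comma_separators('IF(A1=";";1;0)'))
```

For the rule in the question, the converted formula would then be passed as FormulaRule(formula=[use_comma_separators('MOD(ROW();2)=0')], stopIfTrue=False, fill=redFill).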
