Read excel autofilter with python


I'd like to read the autofilter rules from an Excel sheet in Python.
Suppose I have this kind of input:
[image: original input]
Then I filter one column with Excel's autofilter, for example:
[image: filtered input]
Is there a way to retrieve the applied autofilter rule in Python?
Currently the only option I know of is to set the autofilter via xlwings:
import xlwings as xw
# Open the workbook
workbook = xw.Book(r"C:\Users\Desktop\Example.xlsx")
# Set Autofilter
workbook.sheets[0].api.Range("A1:D4").AutoFilter(4,"Yes")
But is there an "inverse" function that reads the rule back?
Other libraries such as pandas, openpyxl or xlsxwriter would be fine too.

With xlwings you can generally do whatever VBA can do, so it is usually the better choice for this type of query.
You can read the applied filter from Criteria1, as shown below:
import xlwings as xw
# Open the workbook
workbook = xw.Book(r"C:\Users\Desktop\Example.xlsx")
# Set Autofilter
workbook.sheets[0].api.Range("A1:D4").AutoFilter(4,"Yes")
# List every column filter that is currently switched on
for count, item in enumerate(workbook.sheets[0].api.AutoFilter.Filters, 1):
    if item.On:
        print(f'{count}, {item.Criteria1}')
The output would be:
4, =Yes
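
If Excel itself is not available, a possible alternative is openpyxl, which exposes the filter definition stored in the file through ws.auto_filter. The sketch below is not part of the original answer: the helper names describe_autofilter and read_autofilter are made up for illustration, and it only covers plain value filters and custom filters (not top-10, colour or dynamic filters). It reports what was last saved in the workbook, so a filter applied in an open, unsaved Excel session would not show up.

```python
def describe_autofilter(auto_filter):
    """Return (filter_range, rules) for an openpyxl AutoFilter object.

    Each rule is a dict holding the 1-based column position inside the
    filter range, the list of selected values, and any custom criteria
    as (operator, value) pairs.
    """
    rules = []
    for fc in auto_filter.filterColumn:
        rule = {"column": fc.colId + 1, "values": [], "custom": []}
        if fc.filters is not None:
            # Values ticked in the filter drop-down
            rule["values"] = list(fc.filters.filter)
        if fc.customFilters is not None:
            # Criteria such as "greater than 5"
            rule["custom"] = [
                (cf.operator, cf.val) for cf in fc.customFilters.customFilter
            ]
        rules.append(rule)
    return auto_filter.ref, rules


def read_autofilter(path, sheet_name=None):
    """Load a workbook and describe the autofilter of one sheet."""
    # Lazy import: openpyxl is a third-party package (pip install openpyxl)
    from openpyxl import load_workbook

    wb = load_workbook(path)
    ws = wb[sheet_name] if sheet_name is not None else wb.active
    return describe_autofilter(ws.auto_filter)
```

It could be called as ref, rules = read_autofilter(r"C:\Users\Desktop\Example.xlsx"). For a sheet saved with the filter from the question, one would expect a single rule for column 4 with the value Yes, but this has not been checked against that file.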

Related

How to select non-adjacent ranges or cells of excel using xlwings in Python?

I wonder if there's a way to select non-adjacent ranges or cells in Excel simultaneously using xlwings in Python, because I don't want to use a loop for that.
xlwings.Range(xlwings.Range('a1:b4'), xlwings.Range('b8:d10')).color=(255,0,0)
I want to color Range('a1:b4') and Range('b8:d10'), so I used the code above, but it colored the whole Range('a1:d10'). How can I fix it?

The following will do what you want:
import xlwings as xw
path = 'foo.xlsx'
with xw.App() as app:
    # Open the workbook in the app instance managed by the with block
    wb = app.books.open(path)
    ws = wb.sheets.active
    # ws.range(xw.Range('a1:b4'), xw.Range('b8:d10')) spans the whole block A1:D10
    # A comma-separated address addresses the two areas independently
    ws.range("A1:B4,B8:D10").color = (255, 0, 0)
    wb.save(path)
    wb.close()

Python: How to save excel workbook without ruining dynamic spill/array formulas

Short description of the problem:
I am currently accessing an Excel workbook from Python with openpyxl.
I have some dynamic spill formulas in sheet1, such as filter(), byrow() and unique().
With the Python script I perform some operations on sheet2, but I do not touch sheet1 (where the dynamic spill formulas are located).
After calling the workbook.save() method in Python, the dynamic formulas in sheet1 are broken: they have become static and no longer have the dynamic behavior they had before the workbook went through Python.
What can I do? Pass a parameter to .save()? Use another method?
Detailed description of problem (with pictures):
I have a workbook called Original, with the following three sheets:
nums
dynamic
dump
In "nums" I have a cell for the ID (AA) and a column with some numerical values (picture1).
In "dynamic" I have some dynamic formulas, such as byrow() and filter(), that update automatically from the ID and the Values column of "nums" (picture2).
The sheet "dump" is empty for now.
I have a second workbook called Some_data, which has one sheet with a 3-column dataframe (picture3).
I dump the 3-column dataframe from Some_data into the empty "dump" sheet of Original with a Python script, and then use the workbook.save() method to save the result as a new workbook.
The code is here:
import pandas as pd
from openpyxl import load_workbook
# Placeholder paths; replace them with the real file locations
Some_data = "Some_data.xlsx"
Original = "Original.xlsx"
New_workbook = "New_workbook.xlsx"
df = pd.read_excel(Some_data, engine="openpyxl")
wb = load_workbook(filename=Original)
ws = wb["dump"]
# Write the three columns into B, C and D, starting at row 2
rownr = 2
for index, row in df.iterrows():
    ws["B" + str(rownr)] = row["col1"]
    ws["C" + str(rownr)] = row["col2"]
    ws["D" + str(rownr)] = row["col3"]
    rownr += 1
wb.save(New_workbook)
The "dump" sheet of the newly saved workbook is now populated.
The problem is that the dynamic formulas in the sheet "dynamic" have been ruined, although the Python script does not interact with either "nums" or "dynamic".
The dynamic array formulas (like filter) now have brackets around them (picture4) and are no longer dynamic: there is no blue outline around the array when it is selected, and they do not update automatically (picture5).
I need help with what to do. I want to save the Excel file without ruining the dynamic array formulas.
Thank you in advance for your help.
Frode

How to save excel file with openpyxl and preserve pivot table as is?

I have an Excel file: one sheet is used for writing data with Python, and another sheet contains a pivot table. I want to keep the pivot table exactly as it is in the source file.
The problem is that when I save a new workbook with openpyxl, open the file in Excel and refresh the pivot table, it loses the 'Field settings..' -> 'Repeat items label' checkbox, and I need to turn it back on manually each time. That is not very efficient; I would rather solve this with Python.
The sample file has it checked, but the checkbox setting seems to disappear after saving a new file with openpyxl.
from openpyxl import load_workbook
from pathlib import Path
from datetime import date
import os
sample_file_path = Path('sample_excel.xlsx') # source excel
result_folder_path = Path('results')
wb = load_workbook(sample_file_path)
ws = wb["t_mm"] # worksheet with pivot table I want to preserve as is
# some manipulations to other worksheet
xlsx_filename = "test_my_file_%s.xlsx" % date.today().strftime('%d%m%Y')
completename = os.path.join(result_folder_path, xlsx_filename)
wb.save(completename)
I read the documentation at https://openpyxl.readthedocs.io/en/stable/api/openpyxl.pivot.table.html, but couldn't figure out how to keep that checkbox. I am not an Excel or pivot table expert. I think the parameter I need is "showMultipleLabel=True", but from the docs I understand that it is "True" by default, so my checkbox should remain intact. Maybe it is another parameter?

Update a single cell in an Excel spreadsheet using Pandas

I'm just wondering how to update a single cell in an excel spreadsheet with Pandas in a python script. I don't want any of the other cells in the file to be overwritten, just the one cell I'm trying to update. I tried using .at[], .iat[], and .loc() but my excel spreadsheet does not update. None of the other deprecated methods like .set_value() work either. What am I doing wrong?
import pandas as pd
tp = pd.read_excel("testbook.xlsx", sheet_name = "Sheet1")
tp.at[1, 'A'] = 10

I would suggest using xlwings for this operation, as it may be easier than reading and writing the sheet through pandas DataFrames. The example below changes the value of "A1".
import xlwings as xw
sheet = xw.Book("testbook.xlsx").sheets("Sheet1")
sheet.range("A1").value = "hello world"
Also note that xlwings is included in the Anaconda distribution, if you're using that: https://docs.xlwings.org/en/stable/api.html
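
As for why the pandas attempt has no effect: tp.at[1, 'A'] = 10 only changes the DataFrame in memory, and nothing is written back to testbook.xlsx. If Excel is not installed (xlwings needs it), a minimal sketch with openpyxl could look like the following. The helper name update_cell is hypothetical. Note also that openpyxl rewrites the whole file on save, so workbook features it does not fully support may be lost, as the pivot-table and dynamic-array questions above illustrate.

```python
def update_cell(path, sheet_name, address, value):
    """Set a single cell in an existing workbook and save it in place."""
    # Lazy import: openpyxl is a third-party package (pip install openpyxl)
    from openpyxl import load_workbook

    wb = load_workbook(path)
    # Only this one cell is assigned; other cell values are left as loaded
    wb[sheet_name][address] = value
    wb.save(path)
```

For the workbook in the question this would be called as update_cell("testbook.xlsx", "Sheet1", "A1", 10).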

Using Python to load template excel file, insert a DataFrame to specific lines and save as a new file

I'm having trouble writing something that I believe should be relatively easy.
I have a template Excel file with a few sheets that contain some visualizations. I want to write a script that loads the template, inserts rows of an existing dataframe into specific cells on each sheet, and saves the result as a new Excel file.
The template already has all the cells and the visualizations designed, so I want to insert only the data without changing the design.
I tried several packages and none of them seemed to work for me.
Thanks for your help! :-)

I have written a package for inserting pandas DataFrames into Excel sheets (at specific rows/cells/columns). It's called pyxcelframe:
https://pypi.org/project/pyxcelframe/
It has very simple and short documentation, and the method you need is insert_frame.
So, let's say we have a pandas DataFrame called df that we have to insert into the sheet named "MySheet" of the Excel file "MyWorkbook", starting from cell B5. We can use the insert_frame function as follows:
from pyxcelframe import insert_frame
from openpyxl import load_workbook
workbook = load_workbook("MyWorkbook.xlsx")
worksheet = workbook["MySheet"]
insert_frame(worksheet=worksheet,
             dataframe=df,
             row_range=(5, 0),
             col_range=(2, 0))
A value of 0 as the second element of row_range or col_range means that no ending row or column is specified. If you need a specific ending row or column, replace the 0 with it.

Sounds like a job for xlwings. You didn't post any test data, but modifying the code below to suit your needs should be quite straightforward.
import xlwings as xw
wb = xw.Book('your_excel_template.xlsx')
wb.sheets['Sheet1'].range('A1').value = df[your_selected_rows]
wb.save('new_file.xlsx')
wb.close()
