Refresh single worksheet (excel) in workbook using xlwings - python

I looked around a bit but could not find an answer. I found RefreshAll() which is not what I want to do.
Say I have a workbook named "DATA" with following sheets, "Forecast Temps", "Actual Temps", "Table", "Summary".
Now imagine that Forecast Temps sheet has a time series function that grabs data from NWS. This worksheet needs to be refreshed and then temps added into specified column in worksheet "Table". After this, Sheet Summary can be refreshed to determine new high and lows for that day.
Yes - I could run RefreshAll() at each step, but this seems redundant and would take the script longer to run. I was wondering if there was a way to refresh a single sheet w/ xlwings.
I also know you can do it in VBA, but my plan is to write a python script and then create a Sub where I call RunPython ("ScriptName").
Would I be able to do something like:
import xlwings as xw
wb = xw.Book("path")
forecast_temps = wb.sheet[0]
summary = wb.sheet[1]
forecast_temps.refresh() #do not know the correct func here (if there is one)?

I think RefreshAll() is not part of xlwings API. You may call the Excel API like this: wb.api.RefreshAll().
If you know how to do it in VBA, the same will probably work in xlwings using .api. I think you will have some kind of workbook connection. wb.api.Connections should return a list of all workbook connections. From there you can go on with the WorkbookConnection-Objects, which have a Refresh() method.

Related

gspread - get_all_values() returns an empty list

If I call a sheet by name, get_all_values function will always give me an empty list for a sheet that is definitely not empty.
import gspread
sheet = workbook.worksheet(sheet_name)
all_rows_list = sheet.get_all_values()
The only time get_all_values seems to return like it should is if I do the following:
all_rows_list = workbook.sheet1.get_all_values()
But the above works just for the first sheet and for no other, which is kind of useless for a workbook with more sheets.
What always works is reading row by row like
one_row_list = sheet.row_values(1) # first row
But the problem is that I'm trying to read a relatively big workbook with lots of sheets to figure out where I'm supposed to start writing, and it looks like reading row by row triggers "RESOURCES EXHAUSTED" error very fast.
So, am I doing something wrong or is get_all_values broken in gspread?
EDIT:
Added a screenshot.
gspread doesn't work well with sheets with names that could be confused as a cell reference in the A1 notation (like X101 and AT8 in your case).
https://github.com/burnash/gspread/issues/554 is an older issue that describes the underlying problem (the symptoms in that issue are different, but I'm pretty sure the root problem is the same).
I'll copy the workaround with providing a range, that you've discovered yourself:
ws.range("A1:C"+str(end_row)) That end_row is usually row_count of the sheet.

How to split an Excel workbook by worksheet while preserving grouping

I am doing some excel reports for work and am given a book exported from SSRS daily. The book is nicely set up, with groupings applied to every sheet for an effect similar to pivot tables.
However the book comes with 32 sheets, and I eventually need to send out each sheet individually as a distinct report. Right now I am splitting them up manually, but I am wondering if there is a way to automate this while preserving the grouping.
I previously tried something like:
import xlrd
import pandas as pd
targetWorkbook = xlrd.open_workbook(r'report.xlsx', on_demand=True)
xlsxDoc = pd.ExcelFile('report.xlsx')
for sheet in targetWorkbook.sheet_names():
reportDF = pd.read_excel(xlsxDoc, sheet)
reportDF.to_excel("report - {}.xlsx".format(sheet))
However since I'm converting each sheet to a pandas datagrams, the grouping is lost.
There are multiple ways to read/interact with excel docs in python, but I can't find a clear way to pick out a sheet and save it as its own document without losing the grouping.
This is my full answer. I have used the Worksheets().Move() method. The main idea is to use win32com.client library.
This was tested and works on my Windows 10 system with Excel 2013 installed, and Python 3.7. The grouping format was moved intact with the worksheets. I am still working on getting the looping to work. I will revise my answer again when I get the looping to work.
My example has 3 worksheets, each with different grouping (subtotal) formats.
#
# Refined .Move() method, save new file using Active Worksheet property.
#
import win32com.client as win32
excel = win32.gencache.EnsureDispatch('Excel.Application')
wb0 = excel.Workbooks.Open(r'C:\python\so\original.xlsx')
excel.Visible = True
# Move sheet1.
wb0.Worksheets(1).Move()
excel.Application.ActiveWorkbook.SaveAs(r'C:\python\so\sheet1.xlsx')
# Move sheet2, which is now the front sheet.
wb0.Worksheets(1).Move()
excel.Application.ActiveWorkbook.SaveAs(r'C:\python\so\sheet2.xlsx')
# Save single remaining sheet as sheet3.
wb0.SaveAs(r'C:\python\so\sheet3.xlsx')
wb0.Close()
excel.Application.Quit()
You would also need to install pywin32, which is not a standard library item.
https://github.com/mhammond/pywin32
pip install pywin32

How to put a value to an excel cell?

You'll probably laugh at me, but I am sitting on this for two weeks. I'm using python with pandas.
All I want to do, is to put a calculated value in a pre-existing excel file to a specific cell without changing the rest of the file. That's it.
Openpyxl makes my file unusable (means, I can not open because it's "corrupted" or something) or it plainly delets the whole content of the file. Xlsxwriter cannot read or modify pre-existing files. So it has to be pandas.
And for some reason I can't use worksheet = writer.sheets['Sheet1'], because that leads to an "unhandled exception".
Guys. Help.
I tried a bunch of packages but (for a lot of reasons) I ended up using xlwings. You can do pretty much anything with it in python that you can do in Excel.
Documentation link
So with xlwings you'd have:
import xlwings as xw
# open app_excel
app_excel = xw.App(visible = False)
# open excel template
wbk = xw.Book( r'stuff.xlsx' )
# write to a cell
wbk.sheets['Sheet1'].range('B5').value = 15
# save in the same place with the same name or not
wbk.save()
wbk.save( r'things.xlsx' )
# kill the app_excel
app_excel.kill()
del app_excel
Let me know how it goes.

Reading cell value without redefining it with Openpyxl

I need to read this .xlsm database and some of the cells values I need are derived from Excel functions. To accomplish this I used:
from openpyxl import load_workbook
wb = load_workbook('file.xlsm', data_only=True, keep_vba=True)
ws = wb['Plan1']
And then, for every cell I wanted to read:
ws.cell(row=row, column=column).value
This works fine for getting the data out. But the problem comes with saving. When I do:
wb.save('file.xlsm')
It saves the file, but all the formulas inside the sheets are lost
My dilemma is reading the cell's displayed values on one of the database's sheet without modifying them, writing the code's output in a new sheet and saving it.
Read the file once in read-only and data-only mode to look at the values and another time keeping the VBA around. And save under a different name.

Using Python to load template excel file, insert a DataFrame to specific lines and save as a new file

I'm having troubles writing something that I believe should be relatively easy.
I have a template excel file, that has some visualizations on it with a few spreadsheets. I want to write a scripts that loads the template, inserts an existing dataframe rows to specific cells on each sheet, and saves the new excel file as a new file.
The template already have all the cells designed and the visualization, so i will want to insert this data only without changing the design.
I tried several packages and none of them seemed to work for me.
Thanks for your help! :-)
I have written a package for inserting Pandas DataFrames to Excel sheets (specific rows/cells/columns), it's called pyxcelframe:
https://pypi.org/project/pyxcelframe/
It has very simple and short documentation, and the method you need is insert_frame
So, let's say we have a Pandas DataFrame called df which we have to insert in the Excel file ("MyWorkbook") sheet named "MySheet" from the cell B5, we can just use insert_frame function as follows:
from pyxcelframe import insert_frame
from openpyxl import load_workbook
workbook = load_workbook("MyWorkbook.xlsx")
worksheet = workbook["MySheet"]
insert_frame(worksheet=worksheet,
dataframe=df,
row_range=(5, 0),
col_range=(2, 0))
0 as the value of the second element of row_range or col_range means that there is no ending row or column specified, if you need specific ending row/column you can replace 0 with it.
Sounds like a job for xlwings. You didn't post any test data, but modyfing below to suit your needs should be quite straight-forward.
import xlwings as xw
wb = xw.Book('your_excel_template.xlsx')
wb.sheets['Sheet1'].range('A1').value = df[your_selected_rows]
wb.save('new_file.xlsx')
wb.close()

Categories