Create internal links within excelsheet with openpyxl - python

I created an Excel file with around 50 worksheets. All the information is summarized in the first worksheet, but for detailed information people can check the source data in the corresponding worksheet.
I thought it would be nice to have an internal link to that worksheet (for example, people who want to know why sales were down in July 2016 could jump straight to the July 2016 worksheet).
But while I seem to be able to create hyperlinks to websites, I just want to link to locations inside this Excel file.
Is this possible at all?

This question is more about Excel than Python or programming, but you have to use #. For example:
ws = wb['Sheet1']
cell = ws['A1']
cell.value = '=HYPERLINK("#Sheet2!A2")'
This creates a link in cell A1 of Sheet1 to cell A2 of Sheet2.
You can also give the cell a human-friendly display text:
cell.value = '=HYPERLINK("#Sheet2!A2", "click here")'
The two cells may or may not be on the same sheet.
The # tells Excel that this is a hyperlink to a local location, much like # is used as an anchor in HTML.
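For a workbook with many sheets (like the 50 in the question), it may help to build the formula string in one place. The following is only a sketch: the helper name internal_link_formula is made up for illustration, and it assumes that sheet names should always be wrapped in single quotes, which Excel requires when a name contains spaces.

```python
def internal_link_formula(sheet_name, cell_ref="A1", text=None):
    """Build an Excel HYPERLINK formula that jumps to a cell in this workbook.

    Hypothetical helper, not part of openpyxl.
    """
    # Wrap the sheet name in single quotes (needed for names with spaces);
    # a literal single quote inside the name is written twice.
    quoted = "'" + sheet_name.replace("'", "''") + "'"
    target = "#{0}!{1}".format(quoted, cell_ref)
    if text is None:
        text = sheet_name
    # Inside a formula string literal, a double quote is escaped by doubling it.
    return '=HYPERLINK("{0}", "{1}")'.format(
        target.replace('"', '""'), text.replace('"', '""'))


# Possible use with openpyxl (assumes `wb` is an open workbook and `ws` is
# the summary sheet):
# for row, name in enumerate(wb.sheetnames, start=1):
#     ws.cell(row=row, column=1).value = internal_link_formula(name)
```

The function only builds a string, so it can be checked without opening Excel.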

Here is a function that can be used directly to create a hyperlink to a sheet in the same Excel file:
from openpyxl.styles import Font, colors
from openpyxl.worksheet.hyperlink import Hyperlink
def create_hyperlink(ws, at_cell, sheet_name, cell_ref='A1', display_name=None):
    if display_name is None:
        display_name = sheet_name
    to_location = "'{0}'!{1}".format(sheet_name, cell_ref)
    ws[at_cell].hyperlink = Hyperlink(display=display_name, ref=at_cell, location=to_location)
    ws[at_cell].value = display_name
    ws[at_cell].font = Font(u='single', color=colors.BLUE)
ws: worksheet where the link will be created
at_cell: cell location where the link will be created, e.g. 'A1'
sheet_name: name of the sheet the link points to, e.g. 'Sheet1'

Actually, you can add local hyperlinks but have to control the location. The specification says this of the location attribute:
Location within target. If target is a workbook (or this workbook)
this shall refer to a sheet and cell or a defined name. Can also be an
HTML anchor if target is HTML file.
I think this works by setting target to None and ref to the cell reference.

Related

Python: How to save excel workbook without ruining dynamic spill/array formulas

Short description of the problem:
I am currently accessing an Excel workbook from Python with openpyxl.
I have some dynamic spill formulas in sheet1, like filter(), byrow() and unique().
With the python script, I am doing some operations in sheet2, but I am not touching sheet1 (where the dynamic spill formulas are located).
When using workbook.save() method in Python, I experience that the dynamic formulas in sheet1 are ruined and static, not having the dynamic functionality they had before interacting with python.
What can I do? Use a parameter in .save()? Use another method?
Detailed description of problem (with pictures):
I have a workbook called Original, with the following three sheets:
nums
dynamic
dump
In "nums" I have a cell for ID (AA), and a column with some numerical values (picture1).
In "dynamic" I have some dynamic formulas like byrow() and filter() that update automatically with the values in the ID cell and the Values column of "nums" (picture2).
The sheet "dump" is for now empty.
I have a second workbook called Some_data, which has one sheet with a 3-column dataframe (picture3).
I am dumping the 3-column dataframe of Some_data into the empty "dump"-sheet of Original with a Python script, and then using the workbook.save() method to save the new workbook.
The code is here:
import pandas as pd
from openpyxl import load_workbook
Some_data = "path/to/Some_data.xlsx"  # placeholder: file path of the Some_data workbook
Original = "path/to/Original.xlsx"  # placeholder: file path of the Original workbook
df = pd.read_excel(Some_data, engine="openpyxl")
wb = load_workbook(filename=Original)
ws = wb["dump"]
rownr = 2
for index, row in df.iterrows():
    ws["B" + str(rownr)] = row["col1"]
    ws["C" + str(rownr)] = row["col2"]
    ws["D" + str(rownr)] = row["col3"]
    rownr += 1
wb.save("path/to/new_workbook.xlsx")  # placeholder: file path of the new workbook
The "dump" sheet of the newly saved workbook has now been populated.
The problem is that the dynamic formulas in the sheet "dynamic" have been ruined, although the Python script does not interact with either of the sheets "nums" or "dynamic".
First of all, the dynamic array formulas (like filter) now have brackets around them (picture4), and they are not dynamic anymore (there is no blue line around the array when it is selected, and they do not update automatically; picture5).
I need help with what to do. I want to save the excel-file, but with the dynamic array formulas not being ruined.
Thank you for your help, in advance.
Frode

How to get the CodeName of a sheet

I'm trying to use xlwings to deal with Excel files similarly to what I used to do via VBA.
As I've learned so far, I can access a sheet by name or by index, but both of these can be modified. Is there a way to access a sheet using its CodeName?
Here is an example:
I have a workbook with 3 sheets inside. One of them is a special sheet whose CodeName I changed to shReport in the VBA editor. So no matter who uses this file and renames the sheet to "Report" or "NiceReport", in VBA I can always use shReport.cells(1,1) to get what I need.
But in xlwings, it seems I can only use sht = wb.Sheets['Report'] or sht = wb.Sheets[0] to get the sheet as an object. This will fail if a user renames the sheet, or inserts or deletes sheets, which changes the index.
So I wonder if it's possible to use the CodeName to refer to the sheet. I've tried the api and don't get any CodeName back. The code below returns nothing:
for sht in wk.Sheets:
    print(sht.api.CodeName)
I'm not sure why your code does not work; however, this complete code example works:
import xlwings as xw
wb = xw.Book('Book1.xlsx')
for sheet in wb.sheets:
    print("Sheet Name: " + str(sheet))
    print("Code Name: " + str(wb.sheets(sheet).api.CodeName))
The last line can be split into
    sht = wb.sheets(sheet)
    print("Code Name: " + str(sht.api.CodeName))
so it's the same as your line of code.
From what I can see, you can only read the CodeName from a given sheet, so you have to check each sheet for the one that matches, as you did.
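The "check each sheet for the one that matches" step could be wrapped in a small helper. This is only a sketch: find_sheet_by_code_name is a made-up name, and it assumes every sheet object exposes api.CodeName the way the loop above does (which relies on Excel on Windows).

```python
def find_sheet_by_code_name(sheets, code_name):
    """Return the first sheet whose VBA CodeName matches, or None.

    Hypothetical helper; `sheets` is any iterable of sheet objects that
    expose `.api.CodeName`, for example `wb.sheets` in xlwings.
    """
    for sheet in sheets:
        # getattr with a default keeps the loop going if CodeName is missing.
        if getattr(sheet.api, "CodeName", None) == code_name:
            return sheet
    return None


# Possible use (assumes `wb` is an open xlwings Book):
# report = find_sheet_by_code_name(wb.sheets, "shReport")
```

Returning None instead of raising leaves it to the caller to decide what a missing sheet means.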

Update specific tab on google sheet - Python

I have the following code that appends a dataframe to a Google Sheet and runs every day.
I had to create 3 more tabs in this sheet and now, every time I upload the dataframe, it goes to another tab and not the one that I need.
I'm using the following code to update the Google Sheet:
gc = gspread.authorize(credentials)
sh = gc.open_by_key("1O1NKT4LRf7F17kRjupUD7peonCwT04BG-l7pbo5-BLU").sheet1
values = df.values.tolist()
sh.append_rows(values)
I tried a few things, such as
sh = gc.open_by_key("1O1NKT4LRf7F17kRjupUD7peonCwT04BG-l7pbo5-BLU").tabname
But it didn't work. Is there a way to do that?
Thank you
Using sheet1 gives you the first worksheet in your spreadsheet. If your target sheet is not the first worksheet, you need another method to access it.
The best option is to get the worksheet by title: if you select worksheets by index, you have to update your code whenever you rearrange your tabs.
Here are all the options that you can use to select a worksheet using gspread:
Select worksheet by index. Worksheet indexes start from zero:
sh = gc.open_by_key("1O1NKT4LRf7F17kRjupUD7peonCwT04BG-xxxxxx")
worksheet = sh.get_worksheet(0)
Or by title:
worksheet = sh.worksheet("January")
Or the most common case: Sheet1:
worksheet = sh.sheet1
To get a list of all worksheets: (check each worksheet in the list based on their title)
worksheet_list = sh.worksheets()
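Applied to the question, selecting the tab by title and then appending could look like the sketch below. The helper name append_rows_to_tab is made up for illustration; it only uses worksheet(title) and append_rows, the two calls already shown in this thread, and the tab title "January" is just the example title from above.

```python
def append_rows_to_tab(spreadsheet, tab_title, rows):
    """Append rows to the worksheet with the given title and return it.

    Hypothetical helper; `spreadsheet` is assumed to be the object returned
    by gc.open_by_key(...) (note: without the trailing .sheet1).
    """
    worksheet = spreadsheet.worksheet(tab_title)  # look up by title, not by position
    worksheet.append_rows(rows)
    return worksheet


# Possible use (assumes `gc` is an authorized client and `df` a DataFrame):
# sh = gc.open_by_key("<spreadsheet key>")
# append_rows_to_tab(sh, "January", df.values.tolist())
```

Because the tab is looked up by title, adding or reordering tabs later does not change where the rows go.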

does this library assume the Google Spreadsheet will have one sheet only?

I am trying to use this library to pull data from a Google Spreadsheet with two sheets in it. I can get data only from the first sheet but not the second sheet:
sheet = client.open("sheetname").sheet1
If I change sheet1 to sheet2, I get the following error:
sheet = client.open("filename").sheet2
AttributeError: 'Spreadsheet' object has no attribute 'sheet2'
How do I fix this? Any help is appreciated!
.sheet1 is used as a shortcut.
To get the second sheet, try this:
sheet = client.open("filename").get_worksheet(1)
1 means second sheet (starting from 0).
References:
Official documentation
In this case, you can use get_worksheet, worksheet and worksheets.
Sample script:
sh = client.open("###Spreadsheet name###") # or client.open_by_key(spreadsheetId)
worksheet = sh.get_worksheet(1) # Use the index of the sheet. 0 is the 1st sheet.
worksheet = sh.worksheet('Sheet2') # Use the sheet name of the sheet.
worksheet = sh.worksheets()[1] # In this case, all sheets are included in the array.
Note:
Currently, it seems that sh.sheet1 only refers to the 1st sheet.
Reference:
Selecting a Worksheet

Obtain name of worksheet using openpyxl

I am using openpyxl to access all of the tabs in a spreadsheet using the following:
rawReturnwb = openpyxl.load_workbook(ValidationsDir)
for sheet in rawReturnwb.worksheets:
    # do something...
This works fine. I would then like to access the worksheet name to use elsewhere in my code. However, when I try to access the worksheet name (printing sheet to the console) I get:
<Worksheet "SheetName">
the type of sheet is
<class 'openpyxl.worksheet.worksheet.Worksheet'>
Is there a way that I can just get the worksheet name returned (so my output would be "SheetName" only)? Or would I have to convert to a string and strip the parts of the string I don't need?
As suggested in a comment on the question, sheet.title works.
For example, this is some code to get the worksheet name from a given cell:
from openpyxl.cell import Cell
def get_cell_name(cell: Cell) -> str:
    """Get the name of the worksheet of a given cell"""
    return cell.parent.title
And in the case of the OP, the code could be something like:
rawReturnwb = openpyxl.load_workbook(ValidationsDir)
for sheet in rawReturnwb.worksheets:
    # ...
    if sheet.title == "Sheet1":
        continue
    # ...
FYI, the names of the variables are weird:
They should be lowercase in Python (according to PEP 8)
ValidationsDir is a weird name for the path of an Excel file
rawReturnwb could be renamed to book, for example
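Putting sheet.title and the naming suggestions together, collecting the names could look like this. It is only a sketch: sheet_titles and the skip parameter are made up for illustration, and the function relies on nothing more than the worksheets and title attributes used above.

```python
def sheet_titles(workbook, skip=()):
    """Return the titles of all worksheets, leaving out any listed in `skip`.

    Hypothetical helper; `workbook` is assumed to expose `.worksheets`,
    as an openpyxl workbook does.
    """
    return [sheet.title for sheet in workbook.worksheets if sheet.title not in skip]


# Possible use (assumes `validations_path` points to the Excel file):
# book = openpyxl.load_workbook(validations_path)
# print(sheet_titles(book, skip=("Sheet1",)))
```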
