How to retrieve hyperlinks in spreadsheet cells using gspread? - python

I am not able to retrieve hyperlinks from Google spreadsheet cells using gspread. I always get back the text of the cell and not the hyperlink itself.
I have tried
worksheet.cell(i, j, value_render_option="FORMULA")
with all three possible options for value_render_option, and none of them works.
I have seen some old answers here about using input_value, but that is unfortunately not supported anymore.

If your cell content is something like
=HYPERLINK("http://www.wikipedia.de","wikipedia")
try
cell = worksheet.cell(i, j, value_render_option='FORMULA').value
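Once the formula string is available, the URL still has to be pulled out of it. Here is a minimal sketch of that step, assuming the cell holds a literal =HYPERLINK("url","label") formula as in the example above. The helper name parse_hyperlink_formula is made up for illustration and is not part of gspread. Cells whose link was not entered as such a formula simply yield None.

```python
import re

# Matches =HYPERLINK("url","label") or =HYPERLINK("url").
# The argument separator may be "," or ";" depending on the spreadsheet locale.
_HYPERLINK_RE = re.compile(
    r'^=HYPERLINK\(\s*"([^"]*)"\s*(?:[,;]\s*"([^"]*)"\s*)?\)$',
    re.IGNORECASE,
)


def parse_hyperlink_formula(formula):
    """Return (url, label) for a literal HYPERLINK formula, else None.

    Hypothetical helper: it only understands quoted string arguments,
    not cell references or nested function calls.
    """
    if not isinstance(formula, str):
        return None
    match = _HYPERLINK_RE.match(formula.strip())
    if match is None:
        return None
    url = match.group(1)
    label = match.group(2)
    if label is None:
        # With no label argument the URL itself is displayed.
        label = url
    return url, label
```

It could be fed the value returned by the call above, for example parse_hyperlink_formula(worksheet.cell(i, j, value_render_option='FORMULA').value).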

Related

Find value of cell next to searched cell?

I'm trying to find the value of the cell NEXT to the cell in which I found a match. I have one column for tags and one for links. If a link has a certain tag, the link's value should be displayed. How can I do this in Python with the Google Spreadsheet API?
I couldn't find a solution in the documentation.
ll = sheet.findall(topic)
for i in ll:
    print(i.value)
How can I change it so it outputs the value of the column next to the cell?
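One possible approach, sketched under the assumption that sheet is a gspread worksheet and that the links sit one column to the right of the tags: each cell returned by findall carries row and col attributes, so the neighbour can be fetched with sheet.cell. The function name values_right_of_matches is made up for illustration.

```python
def values_right_of_matches(sheet, topic):
    """Return the value one column to the right of every cell matching topic.

    Assumes `sheet` offers findall(query) and cell(row, col), and that the
    returned cells expose .row, .col and .value, as gspread cells do.
    """
    results = []
    for match in sheet.findall(topic):
        # Same row, next column over.
        neighbour = sheet.cell(match.row, match.col + 1)
        results.append(neighbour.value)
    return results
```

Each sheet.cell call is a separate lookup, so for many matches it may be cheaper to read the whole sheet once and index into the rows locally.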

Separate text from URL in Excel column

What I have is a column in Excel with a list of URLs like this:
first link
second link
...
What I would like is to separate the "text" from the "URL" like this:
first link | http://www.example.com/1
second link | http://www.example.com/2
...
I'm using LibreOffice, but I'd also accept an answer for Google Spreadsheet or even a Python script.
A simple three-step solution (for Google Sheets):
1. Copy the column and 'paste values only' to get the text (or use concatenate-with-nothing).
2. Check this answer for a custom function to get the URL.
3. Concatenate the text and the URL.

Python docx paragraph in textbox

Is there any way to access and manipulate text in an existing docx document in a textbox with python-docx?
I tried to find a keyword in all paragraphs in a document by iteration:
from docx import Document

doc = Document('test.docx')
# doc.paragraphs only covers body-level paragraphs
for paragraph in doc.paragraphs:
    if '<DATE>' in paragraph.text:
        print('found date: ', paragraph.text)
It is found if placed in normal text, but not inside a textbox.
A workaround for textboxes that contain only formatted text is to use a floating, formatted table. It can be styled almost like a textbox (frames, colours, etc.) and is easily accessible by the docx API.
from docx import Document

doc = Document('test.docx')
# walk every paragraph of every cell of every table
for table in doc.tables:
    for row in table.rows:
        for cell in row.cells:
            for paragraph in cell.paragraphs:
                if '<DATE>' in paragraph.text:
                    print('found date: ', paragraph.text)
Not via the API, not yet at least. You'd have to uncover the XML structure it lives in and go down to the lxml level and perhaps XPath to find it. Something like this might be a start:
# the lxml element for w:body; xpath() is available on the element, not on the proxy object
body = doc.element.body
# assuming differentiating container element is w:textBox
text_box_p_elements = body.xpath('.//w:textBox//w:p')
I have no idea whether textBox is the actual element name here, you'd have to sort that out with the rest of the XPath path details, but this approach will likely work. I use similar approaches frequently to work around features that aren't built into the API yet.
opc-diag is a useful tool for inspecting the XML. The basic approach is to create a minimally small .docx file containing the type of thing you're trying to locate. Then use opc-diag to inspect the XML Word generates when you save the file:
$ opc browse test.docx document.xml
http://opc-diag.readthedocs.org/en/latest/index.html
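For comparison, here is a sketch that skips python-docx and reads word/document.xml with the standard library only. It assumes the text box paragraphs sit inside a w:txbxContent element, which is what the XML of such a test file typically shows; verify that against your own document. The function name textbox_paragraph_texts is made up for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (the "w:" prefix).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def textbox_paragraph_texts(docx_path):
    """Return the text of every w:p found inside a w:txbxContent element.

    Assumption: text box content is wrapped in w:txbxContent.
    """
    with zipfile.ZipFile(docx_path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    texts = []
    for container in root.iter(f"{{{W_NS}}}txbxContent"):
        for paragraph in container.iter(f"{{{W_NS}}}p"):
            # A paragraph's text is spread over w:t elements in its runs.
            pieces = (t.text or "" for t in paragraph.iter(f"{{{W_NS}}}t"))
            texts.append("".join(pieces))
    return texts
```

Word may store a text box twice, once in a modern form and once as a fallback for older readers, so the same paragraph can show up more than once in the result.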

Add link to text within cell, not entire cell

Using Openpyxl, is there a way to create a link within a cell?
I tried:
worksheet['A1'].hyperlink = 'http://mypage.com'
However, this sets the entire cell of 'A1' to be a link. I would like it to set the text within the cell to a link so that it looks like: My page in cell A1.
You can try something like this:
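from openpyxl import load_workbook

wb = load_workbook("my_book.xlsx")
worksheet1 = wb.active  # active is a property, not a method
cell_value = '=HYPERLINK("http://mypage.com", "My Page")'
worksheet1.cell(row=1, column=1, value=cell_value)
wb.save("my_book.xlsx")  # write the change back to disk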
The important part of my example is that you can just set the value of the cell to Excel's HYPERLINK function as a string. The first parameter is the link and the second parameter is the text to display in the cell.
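If the link or the display text comes from user data, building the formula string by hand breaks as soon as the text contains a double quote. A small sketch of a helper for that, assuming the usual spreadsheet convention that a quote inside a string literal is written as two quotes. The name hyperlink_formula is made up for illustration and is not an openpyxl function.

```python
def hyperlink_formula(url, label):
    """Build a HYPERLINK formula string, doubling any embedded quotes."""

    def quote(text):
        # Inside a formula string literal, " is escaped as "".
        return '"' + str(text).replace('"', '""') + '"'

    return f"=HYPERLINK({quote(url)}, {quote(label)})"
```

The result can be passed as the value argument in the example above, e.g. hyperlink_formula("http://mypage.com", "My Page").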

Reading text values in a PowerPoint table using pptx?

Using the pptx module,
tbl.cell(3,3).text
is a writable, but not a readable attribute. Is there a way to just read the text in a PowerPoint table? I'm trying to avoid COM and pptx is a great module, but lacks this particular feature.
Thanks!
At present, you'll need to go a level deeper to get text out of a cell using python-pptx. The cell.text 'getter' is on the roadmap.
Something like this should get it done:
cell = tbl.cell(3, 3)
# current python-pptx releases name this text_frame (older ones used textframe)
paragraphs = cell.text_frame.paragraphs
for paragraph in paragraphs:
    for run in paragraph.runs:
        print(run.text)
Let me know if you need more to go on.
