Python Selenium, select the next element in html DOM

I have to automate some actions on a page.
The page consists of a table in which each td element contains two a tags: the first one has a class, the second one has no class or id.
I can easily select the one with the class, but how do I get the other one? Is there a way to select the element next to another one, as in CSS?
This is a draft of the page structure:
<table>
<tr>
<td>
<a class="mylink"> element 1 </a>
<a>
<img src="">
</a>
</td>
</tr>
<tr>
<td>
<a class="mylink"> element 2 </a>
<a>
<img src="">
</a>
</td>
</tr>
</table>
I can select the first one with
fileLinkClass = "mylink"
driver.find_element(by=By.CLASS_NAME, value=fileLinkClass)
but I need to select and click the a link without the class. How can I accomplish this?
Thank you so much

You can use the XPath selector
'//td/a[2]'
to find the second a under each td.
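
The XPath can be checked offline against the sample markup before a browser is involved. The following is a minimal sketch that uses Python's standard-library xml.etree.ElementTree, which supports a limited XPath subset including positional predicates. The markup is the question's sample with the img tags self-closed so that it parses as XML (an assumption made only for this check), and the name SAMPLE is illustrative.

```python
import xml.etree.ElementTree as ET

# The question's sample table, with <img> self-closed so it is well-formed XML.
SAMPLE = """
<table>
  <tr><td><a class="mylink"> element 1 </a><a><img src=""/></a></td></tr>
  <tr><td><a class="mylink"> element 2 </a><a><img src=""/></a></td></tr>
</table>
"""

root = ET.fromstring(SAMPLE)

# ".//td/a[2]" is ElementTree's spelling of "//td/a[2]":
# the second <a> child of each <td>.
second_links = root.findall(".//td/a[2]")

for link in second_links:
    # Each matched link has no class attribute and wraps the <img>.
    print(link.get("class"), link.find("img") is not None)
```

In Selenium the same expression would be passed as driver.find_elements(By.XPATH, '//td/a[2]').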

Try using css selector
For single element selection
driver.find_element(By.CSS_SELECTOR,'.mylink + a')
For multiple elements selection
driver.find_elements(By.CSS_SELECTOR,'.mylink + a')
Then index into the list and click. For example:
elements = driver.find_elements(By.CSS_SELECTOR, '.mylink + a')
elements[0].click()
elements[1].click()
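
A sketch that clicks every link found by the adjacent-sibling selector might look like the following. It assumes driver is a Selenium WebDriver that has already loaded the page; the function name click_links_after_mylink is hypothetical. The locator strategy is passed as the plain string "css selector", which is the value of Selenium's By.CSS_SELECTOR, so the function needs no import of its own.

```python
CSS_SELECTOR = "css selector"  # same string as selenium's By.CSS_SELECTOR


def click_links_after_mylink(driver):
    """Click each <a> that immediately follows an <a class="mylink">.

    `driver` is assumed to be a Selenium WebDriver (hypothetical helper).
    Returns the number of links clicked.
    """
    # '+' is the CSS adjacent-sibling combinator: the <a> right after .mylink.
    links = driver.find_elements(CSS_SELECTOR, ".mylink + a")
    for link in links:
        link.click()
    return len(links)
```

If a click navigates away or re-renders the table, the remaining elements can go stale and would have to be located again before clicking.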

Related

Select the entire text from the following node with child nodes using xpath query in python

I want to extract the content of the node following an a tag with XPath in Python. So far I have managed to extract the content when there is no tag inside it. The problem is that my method does not work if the following node has a child node in it. I'm using the lxml package, and here is my code:
from lxml.html import etree, fromstring
reference_titles = root.xpath("//table[@id='vulnrefstable']/tr/td")
for tree in reference_titles:
    a_tag = tree.xpath('a/@href')[0]
    title = tree.xpath('a/following-sibling::text()')
This works for this HTML:
<tr>
<td class="r_average">
<a href="http://somelink.com" target="_blank" title="External url">
http://somelink.com
</a>
<br/> SECUNIA 27633
</td>
</tr>
Here the title is correctly "SECUNIA 27633" but in this html:
<tr>
<td class="r_average">
<a href="http://somelink.com" target="_blank" title="External url">
http://somelink.com
</a>
<br/> SECUNIA 27633 <i>Release Date:</i> tomorrow
</td>
</tr>
The result is "SECUNIA 27633 tomorrow"
How can I extract "SECUNIA 27633 Release Date: tomorrow"?
EDIT: Using node() instead of text() in the XPath returns all the nodes, so I use this and build the final string with a nested for statement:
title = tree.xpath('a/following-sibling::node()')
But I want to know whether there is a better way to extract the text content, regardless of child nodes, with a single XPath query.

Try this one:
for tree in reference_titles:
    a_tag = tree.xpath('a/@href')[0]
    title = " ".join([node.strip() for node in tree.xpath('.//text()[not(parent::a)]') if node.strip()])
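
The same idea (collect every text node in the cell except the link's own text) can be sketched without lxml. The version below swaps in the standard-library xml.etree.ElementTree and walks the td's children by hand; the function name text_after_link and the SAMPLE string are illustrative, and the markup is the question's second example.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<td class="r_average">
  <a href="http://somelink.com" target="_blank" title="External url">
    http://somelink.com
  </a>
  <br/> SECUNIA 27633 <i>Release Date:</i> tomorrow
</td>
"""


def text_after_link(td):
    """Return all text following the first <a> child of `td`, whitespace-normalised."""
    parts = []
    seen_link = False
    for child in td:
        if not seen_link:
            if child.tag == "a":
                seen_link = True
                parts.append(child.tail or "")  # text right after </a>
            continue
        parts.extend(child.itertext())          # text inside <i>, <b>, ...
        parts.append(child.tail or "")          # text right after the child
    # Collapse runs of whitespace into single spaces.
    return " ".join(" ".join(parts).split())


print(text_after_link(ET.fromstring(SAMPLE)))
```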

Get td class text with selenium

I want to get the text of a td element that has a class.
The html page
<table class="table table-striped">
<tbody>
<tr>
<td class="text-center">
<img .....>
</td>
<td>text</td>
<td>text</td>
<td class="text-center">
<a ....></a>
</td>
<td class="text-center">
TEXT I WANT TO TAKE HERE
</td>
<td class="text-center">
<a ....><i class="fa fa-times"></i></a>
</td>
</tr>
</tbody>
</table>
The text I want to take is "TEXT I WANT TO TAKE HERE".
I tried using the XPath below, but it didn't work:
table = browser.find_element_by_xpath(("//div[@class='table table-striped']/tbody/tr/td[5]"));
I got an error saying:
no such element: Unable to locate element: {"method":"xpath","selector":"//div[@class='table table-striped']/tbody/tr/td[5]"}
Is it because I have multiple classes in the selector and I have to use a dot?
(I tried 'table.table-striped', but it still didn't work.)

Your XPath is incorrect. The page has a table tag, but you are looking for a div tag, so you just need to replace div with table.
table = browser.find_element_by_xpath(("//table[@class='table table-striped']/tbody/tr/td[5]"));
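
The difference between the two expressions can be reproduced offline. This is a minimal sketch using the standard-library xml.etree.ElementTree on a simplified, well-formed stand-in for the question's table: the elided img and a tags are replaced by empty ones, and the table is wrapped in a body element so that a descendant search can reach it (both are assumptions made for the check).

```python
import xml.etree.ElementTree as ET

# Simplified, well-formed stand-in for the question's markup.
SAMPLE = """
<body>
  <table class="table table-striped">
    <tbody>
      <tr>
        <td class="text-center"><img/></td>
        <td>text</td>
        <td>text</td>
        <td class="text-center"><a></a></td>
        <td class="text-center">
          TEXT I WANT TO TAKE HERE
        </td>
        <td class="text-center"><a><i class="fa fa-times"></i></a></td>
      </tr>
    </tbody>
  </table>
</body>
"""

root = ET.fromstring(SAMPLE)

# The asker's expression looks for a <div>, which the markup does not contain.
wrong = root.findall(".//div[@class='table table-striped']/tbody/tr/td[5]")

# Replacing div with table matches the fifth cell of the row.
right = root.findall(".//table[@class='table table-striped']/tbody/tr/td[5]")

print(len(wrong), right[0].text.strip())
```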

Use the XPath below to get the text:
browser.find_element_by_xpath("//td[@class='text-center']").text
You can also add an index to pick the right cell, e.g.
browser.find_element_by_xpath("//td[@class='text-center'][3]").text

Use the XPath below to get the text TEXT I WANT TO TAKE HERE:
//table//tr/td[contains(text(), 'TEXT I WANT TO TAKE HERE')]
Updated answer: you can use any of the XPath expressions below to get your web element.
//td[5]
OR
//table[@class='table table-striped']//td[5]
OR
//table[@class='table table-striped']/..//following-sibling::td[5]
OR
//td[@class='text-center'][3]

In your XPath expression you are looking for a div tag, but your HTML does not have one. Perhaps you meant the table tag:
table = browser.find_element_by_xpath(("//table[@class='table table-striped']/tbody/tr/td[5]"));

Iterate through table row images and click on hyperlink

I have a table with multiple rows (tr), each containing multiple cells (td).
One of the cells contains an image hyperlink.
I would like to iterate through the table rows and click on the image cell of each row, using Selenium.
For example, this is one of my tables:
<table class="thetable" cellspacing="1" >
<thead></thead>
<tbody>
<tr class="visibleRow">
<td class="Item"></td>
<td class="modified" style="color: gray;"></td>
<td class="imageHyperlink">
<a href="#" role="button" title="Edit the item">
<img src="web/service/editRow.gif" />
</a>
</td>
</tr>
<tr class="visibleRow"></tr>
<tr class="anotherow" style="display: none;"></tr>
<tr class="visibleRow"></tr>
<tr class="editorRow" style="display: none;"></tr>
</tbody>
</table>
The only rows I want to iterate through are the ones with the class name visibleRow, and the only cells that need to be clicked are the ones with the class name imageHyperlink.
I implemented a for loop that iterates through the rows with class visibleRow, stores the matching cell in a cell variable, and clicks on it:
for row in driver.find_elements_by_css_selector("tr.visibleRow"):
    cell = row.find_elements_by_class_name("imageHyperlink")
    cell.click()
However, I am getting this error, as it seems the cell is not the clickable item:
AttributeError: 'list' object has no attribute 'click'
How can I fix this?

The call row.find_elements_by_class_name("imageHyperlink") (note the plural name elements) returns a list, which in your case will have zero or one element. Adding a second level of iteration should fix the problem:
for row in driver.find_elements_by_css_selector("tr.visibleRow"):
    for cell in row.find_elements_by_class_name("imageHyperlink"):
        cell.find_element_by_tag_name("a").click()
The inner loop iterates over the children of the row that have the class imageHyperlink; in your example, there will be either one of these (for the first visible row) or none (for the others). It then finds the first <a> element inside the cell and clicks on it.
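
The find_elements_by_* helpers used above were removed in newer Selenium 4 releases in favour of find_elements(by, value). A sketch of the same nested loop in that style follows. It assumes driver is a Selenium WebDriver on the page; the function name click_edit_links is hypothetical, and the strategy is given as the plain string "css selector" (the value of By.CSS_SELECTOR) so that the snippet carries no import.

```python
CSS_SELECTOR = "css selector"  # same string as selenium's By.CSS_SELECTOR


def click_edit_links(driver):
    """Click the link in each td.imageHyperlink of every tr.visibleRow.

    `driver` is assumed to be a Selenium WebDriver (hypothetical helper).
    Returns the number of links clicked.
    """
    clicked = 0
    for row in driver.find_elements(CSS_SELECTOR, "tr.visibleRow"):
        # find_elements returns a list (possibly empty), so iterate over it
        # instead of calling click() on the list itself.
        for link in row.find_elements(CSS_SELECTOR, "td.imageHyperlink a"):
            link.click()
            clicked += 1
    return clicked
```

Rows without an imageHyperlink cell simply contribute an empty list, so no error is raised for them.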

python BeautifulSoup4 break for loop when tag found

I have a problem breaking out of a for loop when going through HTML with bs4.
I want to save a list separated by headings.
The HTML code can look something like the example below; however, it contains more information between the desired tags:
<h2>List One</h2>
<td class="title">
<a title="Title One">This is Title One</a>
</td>
<td class="title">
<a title="Title Two">This is Title Two</a>
</td>
<h2>List Two</h2>
<td class="title">
<a title="Title Three">This is Title Three</a>
</td>
<td class="title">
<a title="Title Four">This is Title Four</a>
</td>
I would like to have the results printed like this:
List One
This is Title One
This is Title Two
List Two
This is Title Three
This is Title Four
I have come this far with my script:
import urllib2
from bs4 import BeautifulSoup
html = urllib2.urlopen('some webiste')
soup = BeautifulSoup(html, "lxml")
quote1 = soup.h2
print quote1.text
quote2 = quote1.find_next_sibling('h2')
print quote2.text
for quotes in soup.findAll('h2'):
    if quotes.find(text=True) == quote2.text:
        break
    if quotes.find(text=True) == quote1.text:
        for anchor in soup.findAll('td', {'class':'title'}):
            print anchor.text
    print quotes.text
I have tried to break the loop when "quote2" (List Two) is found, but the script gets all the td content and ignores the following h2 tags.
So how do I break the for loop at the next h2 tag?

In my opinion the problem lies in your HTML syntax. According to https://validator.w3.org it's not legal to mix "td" and "h3" (or generally any header tag). Also, implementing list with tables is most likely not a good practice.
If you can manipulate your input files, the list you seem to need could be implemented with "ul" and "li" tags (first 'li' in 'ul' containing the header) or, if you need to use tables, just put your header inside of "td" tag, or even more cleanly with "th"s:
<table>
<tr>
<th>Your title</th>
</tr>
<tr>
<td>Your data</td>
</tr>
</table>
If the input is not under your control, your script could perform search and replace on the input text anyway, putting the headers into table cells or list items.
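
If the input cannot be changed, another option is to walk the document in order and attach each title to the most recent heading, which avoids breaking out of a loop altogether. The sketch below uses the standard-library html.parser in place of BeautifulSoup, since it reports tags in document order and does not try to repair the td-outside-table markup. The class name HeadingGrouper and the SAMPLE string are illustrative, not part of the original script.

```python
from html.parser import HTMLParser

SAMPLE = """
<h2>List One</h2>
<td class="title"><a title="Title One">This is Title One</a></td>
<td class="title"><a title="Title Two">This is Title Two</a></td>
<h2>List Two</h2>
<td class="title"><a title="Title Three">This is Title Three</a></td>
<td class="title"><a title="Title Four">This is Title Four</a></td>
"""


class HeadingGrouper(HTMLParser):
    """Group the link text of td.title cells under the preceding <h2>."""

    def __init__(self):
        super().__init__()
        self.groups = []             # list of (heading, [titles])
        self._capture = None         # "h2" or "a" while inside such a tag
        self._in_title_cell = False  # True between <td class="title"> and </td>
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._capture, self._buffer = "h2", []
        elif tag == "td":
            self._in_title_cell = dict(attrs).get("class") == "title"
        elif tag == "a" and self._in_title_cell:
            self._capture, self._buffer = "a", []

    def handle_data(self, data):
        if self._capture:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_title_cell = False
        if tag != self._capture:
            return
        text = "".join(self._buffer).strip()
        if tag == "h2":
            self.groups.append((text, []))      # start a new group
        elif self.groups:
            self.groups[-1][1].append(text)     # add title to current group
        self._capture = None


parser = HeadingGrouper()
parser.feed(SAMPLE)
for heading, titles in parser.groups:
    print(heading)
    for title in titles:
        print(title)
```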

How do I click on a dynamically loaded link?

I am currently working on web automation via Selenium.
I have a html file where the relevant part is this:
<table>
<tbody>
<tr>
<td class="tabon" nowrap="">
<div class="tabon">
<a id="tab" href="(long dynamically generated string)">
<b>Main Page</b>
</a>
</div>
</td>
<td class="taboff" nowrap="">
<div class="taboff">
<a id="tab" href="(another long string)">Info</a>
</div>
</td>
</tr>
</tbody>
</table>
I want to be able to access the second tab. Using Selenium I can't actually "click" on the div tag.
try:
    browser.find_element_by_xpath(
        '//table/tbody/tr/td[2]/div/a').click()
except NoSuchElementException:
    print ('error')
This always results in an error. It has something to do with the fact that when the div tag is interacted with, it clicks on the URL anchor which changes the div such that the clicked on tag has a "tabon" property. How can Selenium mimic this?
EDIT: I neglected to note that the class with "tabon" has the title of the page in a separate bold tag.

Try this code, in case the tab "My Info" is visible on the webpage:
browser.find_element_by_xpath("//a[.='My Info']").click()
This will click on the element with tag 'a' and having exact innerHTML/text as My Info.

You need to click on the a tag, not on the div. In addition to the solution Subh provided, you can use .taboff a as a CSS selector. This selector walks down to the a tag inside the second td of the pasted HTML.
