I have some HTML with two links. Both have the same href value, but their onclick handlers and link text differ.
I'm not sure how to access the second link.
I tried using driver.find_element_by_link_text('text'), but I get a "no such element" error.
<div id="member">
<a href="#" onclick='add_member("abc"); return false;'>run abc</a>
<br>
<a href="#" onclick='add_member("def"); return false;'>run def</a>
</div>
There are multiple options to get the desired link.
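One option is to use find_element_by_xpath() and check the onclick attribute value. XPath string literals have no escape character, so the value containing double quotes is wrapped in single quotes:
link = driver.find_element_by_xpath("//div[@id='member']/a[contains(@onclick, 'add_member(\"def\")')]")
link.click()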
Another one would be to simply find both links and get the desired one by index:
div = driver.find_element_by_id('member')
links = div.find_elements_by_tag_name('a')
links[1].click()
Which option to choose depends on the rest of the HTML. Hopefully at least one of the two suggested solutions solves the issue.
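The quoting in the XPath option is easy to get wrong, because XPath 1.0 string literals cannot escape quote characters. Below is a minimal sketch of a helper that picks a safe quoting style. The names xpath_literal and onclick_locator are hypothetical and not part of Selenium; the resulting string is only meant to be handed to the driver's XPath lookup.

```python
def xpath_literal(text):
    """Return text as an XPath 1.0 string literal."""
    if '"' not in text:
        return f'"{text}"'
    if "'" not in text:
        return f"'{text}'"
    # Both quote types are present: glue the pieces together with concat().
    parts = text.split('"')
    return "concat(" + ", '\"', ".join(f'"{part}"' for part in parts) + ")"


def onclick_locator(member):
    """Build an XPath for the link whose onclick calls add_member("<member>")."""
    needle = xpath_literal(f'add_member("{member}")')
    return f'//div[@id="member"]/a[contains(@onclick, {needle})]'


print(onclick_locator("def"))
```

The printed string can be passed to find_element_by_xpath(), or to find_element(By.XPATH, ...) in Selenium 4.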
Related
I am trying to extract data from multiple pages of search results where the HTML in question looks like so:
<ul>
<li class="Card___StyledLi4-ulg8ho-7 jmevwM">...</li>
<li class="Card___StyledLi4-ulg8ho-7 jmevwM">...</li>
<li class="Card___StyledLi4-ulg8ho-7 jmevwM">...</li>
</ul>
I want to extract the text from the "li" tags, so I have:
text_data = WebDriverWait(driver,10).until(EC.visibility_of_all_elements_located((By.XPATH,'Card___StyledLi4-ulg8ho-7.jmevwM')))
print(text_data.text)
to wait for and target the "li" items. However, I get a "TimeoutException" error.
If I locate a single "li" item by its XPath under the same conditions, the data is returned, which makes me wonder whether I am specifying the class correctly.
Can anyone tell me what I'm doing wrong? Please let me know if there is any further information, you'd like me to provide.
I believe the XPath for these list items would be //li[@class="Card___StyledLi4-ulg8ho-7 jmevwM"] (or //*[@class="Card___StyledLi4-ulg8ho-7 jmevwM"] if you want all elements with that class rather than just li tags). You can take a look at this cheatsheet and this tutorial for further rules and examples of XPath.
You can also just use CSS Selectors like (By.CSS_SELECTOR, '.Card___StyledLi4-ulg8ho-7.jmevwM') in this case.
You have used the wrong locator type. It should be CSS_SELECTOR, and the selector needs a dot '.' in front of each class name, because it refers to a 'class'. Note that the wait returns a list of elements, so read .text from each one:
text_data = WebDriverWait(driver,10).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR,'.Card___StyledLi4-ulg8ho-7.jmevwM')))
print([li.text for li in text_data])
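To make the difference between the two locator types concrete, here is a small sketch that derives both forms from the same class attribute value. The helper names css_for_classes and xpath_for_exact_class are made up for this illustration and are not Selenium functions.

```python
def css_for_classes(class_attr):
    """Turn a class attribute value into a CSS selector matching all of its classes."""
    return "".join(f".{name}" for name in class_attr.split())


def xpath_for_exact_class(tag, class_attr):
    """Build an XPath that matches the class attribute string exactly."""
    return f'//{tag}[@class="{class_attr}"]'


class_attr = "Card___StyledLi4-ulg8ho-7 jmevwM"
print(css_for_classes(class_attr))              # use with By.CSS_SELECTOR
print(xpath_for_exact_class("li", class_attr))  # use with By.XPATH
```

The CSS form matches any element that has both classes, in any order. The XPath form compares the whole attribute string, so it stops matching if the page adds or reorders a class.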
I'm trying to get the URL from an element using get_attribute('href'). However, it returns null, as if the element had no href.
webdriver.find_element_by_xpath('//*[@id="__next"]/main/div[3]/div/section[1]/div/a[1]').get_attribute('href')
If I click the element manually, or call click() on it, I am taken to the button's URL, so there is a hyperlink associated with that button.
The html code of the element is that below:
<a data-testid="subcategory-content-with-no-link" class="styles__baseSubCategory-sc-rqlxha-0 styles__SubCategory-sc-rqlxha-3 kLvfti hFicjq">Mastite</a>
How can I get the URL/hyperlink of that button using Selenium?
Update
There's no href on any element inside the <div>, as can be seen in this screenshot of the HTML.
Although clicking the desired element redirects you to the desired URL, the href attribute is absent, which the data-testid value subcategory-content-with-no-link already suggests.
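However, it needs to be emphasized that the link target cannot be read from the markup itself. The element carries only generated class names, so the navigation is presumably handled by JavaScript attached to it: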
class="styles__baseSubCategory-sc-rqlxha-0 styles__SubCategory-sc-rqlxha-3 kLvfti hFicjq"
The HTML that you posted shows that there is no href attribute associated with that element. Check whether, for example, a parent element (possibly a div) contains the href attribute instead.
Edit
As a workaround, you can click the element to navigate to the URL and then read it from driver.current_url.
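A minimal sketch of that workaround follows. It assumes only that the driver object exposes current_url and the element exposes click(), as Selenium's do. The function name url_after_click and the polling parameters are hypothetical. Selenium also ships WebDriverWait with the url_changes expected condition, which could replace the manual loop.

```python
import time


def url_after_click(driver, element, timeout=10.0, poll=0.2):
    """Click element and return the URL the browser ends up on, or None on timeout."""
    before = driver.current_url
    element.click()
    deadline = time.monotonic() + timeout
    # Navigation may be asynchronous, so poll until the address changes.
    while time.monotonic() < deadline:
        current = driver.current_url
        if current != before:
            return current
        time.sleep(poll)
    return None
```

To collect several such links, call driver.back() after each lookup and locate the next element again, since element references usually go stale after navigation.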
I'm trying to get the specific link from this
<a href="/doi/10.1021/ed500712k" title="Next" class="header_contnav-next">
<i class="icon-angle-right"></i>
</a>
I'm only able to find all the links within the page and it would be helpful to extract this specific one.
Thank you !
I presume you're using the .find_all() function to get the links. To find a specific item, use .find() instead. If you're sure that only this element has its "class" attribute set to "header_contnav-next", all you need to do is specify it in dictionary format:
soup.find("a", {"class": "header_contnav-next"})['href']
From what I can see, you just need to find the <a> element, save it to some_variable, and then read its href attribute (some_variable['href'] in BeautifulSoup).
Can't give you more information from what you've provided.
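If BeautifulSoup is not available, the same lookup can be sketched with the standard library's html.parser instead. This is a substitute technique, not what the answers above use, and the names ClassLinkParser and find_href are made up for this example.

```python
from html.parser import HTMLParser


class ClassLinkParser(HTMLParser):
    """Record the href of the first <a> tag carrying a given class."""

    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self.href = None

    def handle_starttag(self, tag, attrs):
        if tag != "a" or self.href is not None:
            return
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        if self.wanted_class in classes:
            self.href = attributes.get("href")


def find_href(html, wanted_class):
    """Return the href of the first matching link, or None if there is none."""
    parser = ClassLinkParser(wanted_class)
    parser.feed(html)
    return parser.href


snippet = """
<a href="/doi/10.1021/ed500712k" title="Next" class="header_contnav-next">
<i class="icon-angle-right"></i>
</a>
"""
print(find_href(snippet, "header_contnav-next"))
```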
I can successfully get the href links from the http://quotes.toscrape.com/ example with:
response.css('div.quote > span > a::attr(href)').extract()
which returns the relative link from the href of each a tag:
['/author/Albert-Einstein', '/author/J-K-Rowling', '/author/Albert-Einstein', '/author/Jane-Austen', '/author/Marilyn-Monroe', '/author/Albert-Einstein', '/author/Andre-Gide', '/author/Thomas-A-Edison', '/author/Eleanor-Roosevelt', '/author/Steve-Martin']
By the way, in the example above each a tag has this format:
<a href="/author/Albert-Einstein">(about)</a>
So I tried to make the same for this site: http://www.thegoodscentscompany.com/allproc-1.html
The problem here is that the a tags are structured a bit differently:
formaldehyde
As you can see, I can't get the link from href using the method above. I want to get the link (http://www.thegoodscentscompany.com/data/rw1247381.html) from this a tag, but I could not manage it. How can I get this link?
Try this: response.css('a::attr(onclick)').re(r"Window\('(.*?)'\)")
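To see what that pattern captures, here is a standalone sketch using Python's re module rather than Scrapy. The onclick value is a guess at the page's markup. In particular, the function name openMainWindow is an assumption, since the pattern only requires a name ending in "Window".

```python
import re

# Hypothetical onclick value; only the URL comes from the question.
onclick = (
    "openMainWindow('http://www.thegoodscentscompany.com/data/rw1247381.html');"
    "return false;"
)

# Same pattern as in the answer: capture whatever sits between Window(' and ').
pattern = re.compile(r"Window\('(.*?)'\)")
links = pattern.findall(onclick)
print(links)
```

Scrapy's .re() applies the pattern to every selected onclick value and returns the captured groups as one flat list, so links that have no matching onclick are simply skipped.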
I am using Selenium with PhantomJS to perform headless dynamic scraping. I was able to extract all the information except popups triggered by ng-click, such as:
<button href="#" ng-click="navigation.login({edu:false})">login</button>
<a class="btn btn-primary" ng-click="login()">signup</a>
I want to get every tag that has an ng-click attribute, so that I can click it and extract information from the result.
The ng-click value and the tag name can be anything. I just want to check whether a tag has an ng-click attribute and, if it does, return that tag.
I don't want to use regex or anything like that.
The simplest solution is to use XPath to check the length of the ng-click value.
elements = driver.find_elements_by_xpath("//*[string-length(@ng-click) > 1]")
for element in elements:
    element.click()
It works.
elements = driver.find_elements_by_xpath("//*[@ng-click]")
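The attribute-presence predicate can be tried without a browser. The sketch below runs the same [@ng-click] test through the standard library's ElementTree, which supports only a subset of XPath (string-length() is not available there). The wrapping <div> and the third link are hypothetical additions so that the snippet parses as XML and contains one element that should not match.

```python
import xml.etree.ElementTree as ET

markup = """
<div>
  <button href="#" ng-click="navigation.login({edu:false})">login</button>
  <a class="btn btn-primary" ng-click="login()">signup</a>
  <a href="/plain">no handler</a>
</div>
""".strip()

root = ET.fromstring(markup)

# Select every descendant that carries an ng-click attribute, whatever its tag.
tagged = root.findall(".//*[@ng-click]")
names = [(element.tag, element.get("ng-click")) for element in tagged]
print(names)
```

In Selenium the equivalent CSS attribute selector is [ng-click], which can be used with By.CSS_SELECTOR.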