I am using Selenium for Python 2.7.10.
With XPath, I would like to locate the link in an href following the sibling of minimal-list__title (i.e. I'm looking for the child beneath minimal-list__value). Which XPath should I use?
<span class="minimal-list__title">ETF Home Page:</span>
<span class="minimal-list__value">
    <a href="http://www.robostoxetfs.com/">ROBO</a>
</span>
This is the current attempt:
import re

from selenium import webdriver as driver
from selenium.common.exceptions import NoSuchElementException

def get_link(driver, key):
    key = key + ":"
    try:
        find_value = driver.find_element_by_xpath("//span[@class='minimal-list__title' and . = '%s']/following-sibling::span/*[1]::a" % key).text
    except NoSuchElementException:
        return None
    else:
        value = re.search(r"(.+)", find_value).group().encode("utf-8")
        return value
website = get_link(driver, "ETF Home Page")
print "Website: %s" % website
Note that I am specifically interested in an XPath that gets the link from the child of the following sibling, because the function above uses "ETF Home Page:" in the page source as the identifier for what to search for.
You're almost correct:
//span[@class = "minimal-list__title" and . = "ETF Home Page:"]/following-sibling::span/a
Note that you don't need to worry about multiple elements matching the locator since you are using find_element_by_xpath() and it would give you the first matching element.
Though, if it makes sense in your case and you know the "ROBO" link text beforehand:
driver.find_element_by_link_text("ROBO")
To get an attribute value, use get_attribute():
find_value = driver.find_element_by_xpath('//span[@class = "minimal-list__title" and . = "ETF Home Page:"]/following-sibling::span/a').get_attribute("href")
String e = driver.findElement(By.xpath("//*[contains(@class,'minimal-list__value')]/a")).getAttribute("href");
//*[contains(@class,'minimal-list__value')]/a is the XPath; getAttribute("href") will give you the desired result.
Based on the text ETF Home Page:, to extract the link http://www.robostoxetfs.com/ from the child node of the following sibling, you can use either of the following XPath-based locator strategies:
Using xpath and following-sibling:
print(driver.find_element_by_xpath("//span[text()='ETF Home Page:']//following-sibling::span/a").get_attribute("href"))
Using xpath and following:
print(driver.find_element_by_xpath("//span[text()='ETF Home Page:']//following::span/a").get_attribute("href"))
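You can sanity-check what following-sibling does on the snippet above without a browser. This is a minimal stdlib sketch (minidom stands in for the browser's XPath engine, and the HTML is trimmed to the relevant elements):

```python
import xml.dom.minidom as minidom

# Trimmed version of the page snippet; no whitespace between the spans,
# so nextSibling is the element itself rather than a text node.
html = ("<div>"
        "<span class='minimal-list__title'>ETF Home Page:</span>"
        "<span class='minimal-list__value'>"
        "<a href='http://www.robostoxetfs.com/'>ROBO</a>"
        "</span></div>")

doc = minidom.parseString(html)
title = [s for s in doc.getElementsByTagName('span')
         if s.getAttribute('class') == 'minimal-list__title'][0]
value = title.nextSibling                   # following-sibling::span
link = value.getElementsByTagName('a')[0]   # /a
print(link.getAttribute('href'))            # http://www.robostoxetfs.com/
```

The key step is `title.nextSibling`: the value span is located relative to the title span, which is exactly what the following-sibling axis expresses.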
Related
I want to fetch text from a div, but there are a lot of duplicated classes. The only way to filter my search is by checking for specific text within a sibling. Right now this is what I have:
accountmanager = ()

def send_keys_in_loop_dropaccountmanager(locator):
    for i in range(5):
        try:
            global accountmanager
            test = wait.until(EC.element_to_be_clickable(locator)).text
            print(test)
            accountmanager = test
            break
        except:
            pass

send_keys_in_loop_dropaccountmanager((By.XPATH, "//div[contains(@class,'ahoy-value')] and following-sibling::div[contains(text(),'Accountmanager')]"))
print("accountmanager:", accountmanager)
I get no response at all.
Google inspector code (the text I want is selected in blue):
You can locate the parent element with class ahoy-label-value-pair based on the known child element text content, and then find another child of that parent, as follows:
"//div[@class='ahoy-label-value-pair'][contains(.,'Accountmanager')]//div[@class='ahoy-value']"
The Selenium code for this will look as follows:
accountmanager = wait.until(EC.visibility_of_element_located((By.XPATH, "//div[@class='ahoy-label-value-pair'][contains(.,'Accountmanager')]//div[@class='ahoy-value']"))).text
print("accountmanager name: ", accountmanager)
I figured it out. By removing the loop, checking other posts, and double-checking my XPath, I came up with the following:
element = wait.until(EC.visibility_of_element_located((By.XPATH,"//div[contains(text(), 'Accountmanager')]/following-sibling::div")))
accountmanager = element.text
print("accountmanager:", accountmanager)
Try the following XPath:
//div[text()='Peter Hendrik'][@class='ahoy-value']
Edit: If you want to go through the Accountmanager text, you can use the following XPath:
//div[text()='Accountmanager'][@class='ahoy-label']/following-sibling::div[@class='ahoy-value']
I am trying to find the following element in selenium
It is a username input field and I use loginLink = driver.find_element(By.NAME, "loginEmail") but keep getting a "no such element" message.
//input[#ng-reflect-name='loginEmail']
Use XPath or CSS; you can find by name only if the attribute key is literally name,
e.g. name="loginEmail".
driver.find_element_by_xpath("//input[#ng-reflect-name='loginEmail']")
driver.find_element_by_css_selector("input[ng-reflect-name='loginEmail']")
You can use XPath and CSS for any attribute, as:
xpath: //tagname[@attribute='value']
css: tagname[attribute='value']
Using the xpath given by PDHide the code you need to use is
loginLink = driver.find_element_by_xpath("//input[#ng-reflect-name='loginEmail']")
I'm trying to print a title from a link, but it doesn't return any values. Can anyone see where I've gone wrong?
Link to HTML for the link I'm trying to get the title from - http://imgur.com/a/niTAs
driver.get("http://www.theflightdeal.com/category/flight-deals/boston-flight-deals/")
results = driver.find_elements_by_xpath('//div[#class="post-entry half_post half_post_odd"]')
for result in results:
    main = result.find_element_by_xpath('//div[@class="entry-content"]')
    title1 = main.find_element_by_xpath('//h1/a')
    title = title1.get_attribute('title')
    print(title)
You need to prepend a . to your XPaths.
An XPath starting with / will search from the root of the document, instead of within the current element. See the function docs.
This will select the first link under this element:
myelement.find_elements_by_xpath(".//a")
However, this will select the first link on the page:
myelement.find_elements_by_xpath("//a")
I am trying to scrape the list of followings for a given instagram user. This requires using Selenium to navigate to the user's Instagram page and then clicking "following". However, I cannot seem to click the "following" button with Selenium.
driver = webdriver.Chrome()
url = 'https://www.instagram.com/beforeeesunrise/'
driver.get(url)
driver.find_element_by_xpath('//*[#id="react-root"]/section/main/article/header/div[2]/ul/li[3]/a').click()
However, this results in a NoSuchElementException. I copied the XPath from the HTML, and tried using the class name, partial link, and full link, but cannot seem to get this to work. I've also made sure that the above XPath includes the element with a "click" event listener.
UPDATE: By logging in I was able to get the above information. However (!), now I cannot get the resulting list of "followings". When I click on the button with the driver, the html does not include the information in the pop up dialog that you see on Instagram. My goal is to get all of the users that the given username is following.
Make sure you are using the correct XPath.
Use the following link to get exact XPaths for accessing web elements, and then try:
Selenium Command
Hope this helps to solve the problem!
Try a different XPath. I've verified this is unique on the page.
driver.find_element_by_xpath("//a[contains(.,'following')]")
Providing rich element-finding functionality, from a web-scraping perspective, is not Selenium's main goal, so a better option is to delegate that task to a dedicated tool like BeautifulSoup. After we find what we're looking for, we can ask Selenium to interact with the element.
The bridge between Selenium and BeautifulSoup is this amazing function below that I found here. The function takes a single BeautifulSoup element and generates a unique XPath that we can use in Selenium.
import os
import re
from selenium import webdriver
from bs4 import BeautifulSoup as bs
import itertools
def xpath_soup(element):
    """
    Generate xpath of soup element
    :param element: bs4 text or node
    :return: xpath as string
    """
    components = []
    child = element if element.name else element.parent
    for parent in child.parents:
        # type parent: bs4.element.Tag
        previous = itertools.islice(parent.children, 0, parent.contents.index(child))
        xpath_tag = child.name
        xpath_index = sum(1 for i in previous if i.name == xpath_tag) + 1
        components.append(xpath_tag if xpath_index == 1 else '%s[%d]' % (xpath_tag, xpath_index))
        child = parent
    components.reverse()
    return '/%s' % '/'.join(components)
driver = webdriver.Chrome(executable_path=YOUR_CHROMEDRIVER_PATH)
driver.get(url = 'https://www.instagram.com/beforeeesunrise/')
source = driver.page_source
soup = bs(source, 'html.parser')
button = soup.find('button', text=re.compile(r'Follow'))
xpath_for_the_button = xpath_soup(button)
elm = driver.find_element_by_xpath(xpath_for_the_button)
elm.click()
...and it works!
(But you need to write some code to log in with an account.)
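The positional-path idea behind xpath_soup can be sketched without bs4, using only the standard library (the document and element names here are hypothetical):

```python
import xml.etree.ElementTree as ET

def positional_xpath(root, target):
    """Build a /tag[n] path to target, mirroring xpath_soup's indexing:
    the index is the count of preceding same-named siblings + 1, and
    [1] is omitted, exactly as xpath_soup does."""
    def walk(node, path):
        counts = {}
        for child in node:
            counts[child.tag] = counts.get(child.tag, 0) + 1
            n = counts[child.tag]
            step = child.tag if n == 1 else '%s[%d]' % (child.tag, n)
            if child is target:
                return path + '/' + step
            found = walk(child, path + '/' + step)
            if found:
                return found
        return None
    return walk(root, '/' + root.tag)

root = ET.fromstring("<html><body><p>first</p><p><a href='x'>link</a></p></body></html>")
a = root.find(".//a")
print(positional_xpath(root, a))  # /html/body/p[2]/a
```

Like xpath_soup, the generated path leans on positions rather than attributes, so it is brittle against page changes but always unique at the moment it is computed.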
I am trying to get the src (URL) of the main image from the xkcd.com website. I am using the following code, but it returns something like session="2f69dd2e-b377-4d1f-9779-16dad1965b81", element="{ca4e825a-88d4-48d3-a564-783f9f976c6b}"
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
browser = webdriver.Firefox()
browser.get('http://xkcd.com')
assert 'xkcd' in browser.title
idlink= browser.find_element_by_id("comic")
#link = idlink.get_attribute("src") ## print link prints null
print idlink
Using the XPath method also returns the same as above.
browser.find_element_by_id returns a web element, and that is what you print.
In addition, the attribute you want is on a child element of idlink. Try
idlink = browser.find_element_by_css_selector("#comic > img")
print idlink.get_attribute("src")
idlink is now the web element for the img tag whose parent has the comic ID.
The URL is in src, so we want that attribute.
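A stdlib sketch of the parent-then-child selection (the page fragment is trimmed and the image URL is made up for the example):

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for the xkcd page; the image URL is hypothetical
page = ET.fromstring(
    "<body><div id='comic'>"
    "<img src='https://imgs.xkcd.com/comics/example.png'/>"
    "</div></body>")

# Select the img child of the div with id 'comic', then read its src
img = page.find(".//div[@id='comic']/img")
print(img.get('src'))  # https://imgs.xkcd.com/comics/example.png
```

Printing the div itself (the original code's idlink) only shows the element's repr; the URL lives on the img child's src attribute.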
Building off the answer here
You need to:
Select the img tag (you're currently selecting the div)
Get the contents of the src attribute of the img tag
img_tag = browser.find_element_by_xpath("//div[#id='comic']/img")
print img_tag.get_attribute("src")
The above should print the URL of the image
More techniques for locating elements using selenium's python bindings are available here
For more on using XPath with Selenium, see this tutorial