How to find what selenium see in a dom where it misses an image I see on screen?
Context: I have a Selenium python test
browser.wait_to_find_visible_element(By.ID, 'image')
that sometimes can't find an image that I see on the browser selenium launched for the test:
<div id="container">
<img id='image' src=''/>
</div>
To find out what selenium see instead, I get the enclosing div:
element = browser.find_displayed_elements(By.CSS_SELECTOR, '#container')
print element
which prints:
selenium.webdriver.remote.webelement.WebElement object at 0x9b3876c
and try to get the dom:
dom = browser.driver.execute_script('return arguments[0].parentNode', element)
print dom
which prints
None
What I'm missing?
Have you tried this
element = browser.find_displayed_elements(By.CSS_SELECTOR, '#container')
source_code = element.get_attribute("innerHTML")
# or
source_code = element.get_attribute("outerHTML")
Related
Situation
I'm using Selenium and Python to extract info from a page
Here is the div I want to extract from:
I want to extract the "Registre-se" and the "Login" text.
My code
from selenium import webdriver
url = 'https://www.bet365.com/#/AVR/B146/R^1'
driver = webdriver.Chrome()
driver.get(url.format(q=''))
elements = driver.find_elements_by_class_name('hm-MainHeaderRHSLoggedOutNarrow_Join ')
for e in elements:
print(e.text)
elements = driver.find_elements_by_class_name('hm-MainHeaderRHSLoggedOutNarrow_Login ')
for e in elements:
print(e.text)
Problem
My code don't send any output.
HTML
<div class="hm-MainHeaderRHSLoggedOutNarrow_Join ">Registre-se</div>
<div class="hm-MainHeaderRHSLoggedOutNarrow_Login " style="">Login</div>
By looking this HTML
<div class="hm-MainHeaderRHSLoggedOutNarrow_Join ">Registre-se</div>
<div class="hm-MainHeaderRHSLoggedOutNarrow_Login " style="">Login</div>
and your code, which looks okay to me, except that part you are using find_elements for a single web element.
and by reading this comment
The class name "hm-MainHeaderRHSLoggedOutMed_Login " only appear in
the inspect of the website, but not in the page source. What it's
supposed to do now?
It is clear that the element is in either iframe or shadow root.
Cause page_source does not look for iframe.
Please check if it is in iframe, then you'd have to switch to iframe first and then you can use the code that you have.
switch it like this :
driver.switch_to.frame(driver.find_element_by_xpath('xpath here'))
My code navigates to a website, and in the website there is an article which contains its own link/url/href.
I want to print this field.
My current code highlights the container that it is in, and then I try to do a for loop to get the href.
from selenium import webdriver
driver = webdriver.Chrome()
import time
url = 'https://library.ehaweb.org/eha/#!*menu=6*browseby=8*sortby=2*media=3*ce_id=2035*label=21986*ot_id=25553*marker=1283*featured=17286'
driver.get(url)
time.sleep(3)
page_source = driver.page_source
container=driver.find_element_by_xpath("//div[#class='list-box col-md-6 col-lg-6 col-xl-4 test']")
for j in container:
link= j.find_element_by_css_selector('a').get_attribute('href')
print(link)
If I correctly understand what you want, you just need to print element's child (a) attribute:
link = driver.find_element_by_xpath("//div[#class='list-box col-md-6 col-lg-6 col-xl-4 test']/a").get_attribute("href")
print(link)
This prints:
https://library.ehaweb.org/eha/2021/eha2021-virtual-congress/324511/hanny.al-samkari.pazopanib.for.severe.bleeding.and.transfusion-dependent.html?f=menu%3D6%2Abrowseby%3D8%2Asortby%3D2%2Amedia%3D3%2Ace_id%3D2035%2Alabel%3D21986%2Aot_id%3D25553%2Amarker%3D1283%2Afeatured%3D17286
If you want to use loop, then change container=driver.find_element_by_xpath("//div[#class='list-box col-md-6 col-lg-6 col-xl-4 test']") to
container=driver.find_elements_by_xpath("//div[#class='list-box col-md-6 col-lg-6 col-xl-4 test']")
For exactly this element the following locator would be enough:
//div[contains(#class, 'test')]/a
With the following code:
driver = webdriver.Chrome(executable_path='/snap/bin/chromium.chromedriver')
url = 'https://library.ehaweb.org/eha/#!*menu=6*browseby=8*sortby=2*media=3*ce_id=2035*label=21986*ot_id=25553*marker=1283*featured=17286'
driver.get(url)
driver.implicitly_wait(10)
container = driver.find_elements_by_xpath("//div[contains(#class, 'test')]")
for j in container:
link = j.find_element_by_css_selector('a').get_attribute('href')
print(link)
driver.close()
That page contain lots of inner URL. To click on EHA 2021 virtual container, you can use the below code.
eha_2021 = driver.find_element_by_css_selector('div#listing-main a')
eha_2021_link = eha_2021_link.get_attribute('href')
print(eha_2021_link)
Just in case if you want to click on COVID-19 Outbreak you may try the below code.
Code :
covid_19_element = driver.find_element(By.ID, 'menu-8')
covid_19_url = covid_19_element.get_attribute('href')
print(covid_19_url)
Suggestion :
Try to avoid this kind of xpath //div[#class='list-box col-md-6 col-lg-6 col-xl-4 test'] this looks bit dynamic and may change region wise. Always use locater in below order :
ID
Name
TagName
Class Name
Link Text
Partial Link Text
CSS selector
XPATH
Learn about them here
This works for me in getting the href
elems = driver.find_elements_by_xpath("//a[#href]")
for elem in elems:
print(elem.get_attribute("href"))
Loop through the list, take each element and fetch the required attribute value you want from it (in this case href).
I would like to click an image link and I need to be able to find it by its src, however it's still not working for some reason. Is this even possible? This is what I'm trying:
#Find item
item = WebDriverWait(driver, 100000).until(EC.presence_of_element_located((By.XPATH, "//img[#src=link]")))
#item = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, "//img[#alt='Bzs36xl9 xa']")))
item.click()
link = //assets.supremenewyork.com/170065/vi/BZS36xl9-xA.jpg in the above code. This matches the HTML from below.
The second locator works (finding image using alt), but I will only have the image source when the program actually runs.
HTML for the webpage:
<article>
<div class="inner-article">
<a style="height:81px;" href="/shop/accessories/h68lyxo2h/llhxzvydj">
<img width="81" height="81" src="//assets.supremenewyork.com/170065/vi/BZS36xl9-xA.jpg" alt="Bzs36xl9 xa">
</a>
</div>
</article>
I don't see why finding by alt would work and not src, is this possible? I saw another similar question which is where I got my solution but it didn't work for me. Thanks in advance.
EDIT
To find the link I have to parse through a website in JSON format, here's the code:
#Loads Supreme JSON website into an object
url = urllib2.urlopen('https://www.supremenewyork.com/mobile_stock.json')
obj = json.load(url)
items = obj["products_and_categories"]["Accessories"]
itm_name = "Sock"
index = 0;
for i in items:
if(itm_name in items[index]["name"]):
found_url = i["image_url"]
break
index += 1
str_link = str(found_url)
link = str_link.replace("ca","vi")
Use WebDriverWait and element_to_be_clickable.Try the following xpath.Hope this will work.
link ='//assets.supremenewyork.com/170065/vi/BZS36xl9-xA.jpg'
item = WebDriverWait(driver, 30).until(EC.element_to_be_clickable((By.XPATH, "//div[#class='inner-article']/a/img[#src='{}']".format(link))))
print(item.get_attribute('src'))
item.click()
item = WebDriverWait(driver, 100000).until(EC.presence_of_element_located((By.XPATH, "//img[#src=link]")))
Heres your problem, I can't believe it didn't jump out at me. You're asking the driver to find an element with a src of "link" NOT the variable link that you've defined earlier. Idk how to pass in variables into xpaths but i do know that you can use stringFormat to create the correct xpath string just before calling it.
i also dont speak python, so here's some pseudo java/c# to help you get the picture
String xPathString = String.Format("//img[#src='{0}']", link);
item = WebDriverWait(driver, 100000).until(EC.presence_of_element_located((By.XPATH, xPathString)))
I am trying to get src(URL) link of main image from xkcd.com website. I am using the following code but it returns something like session="2f69dd2e-b377-4d1f-9779-16dad1965b81", element="{ca4e825a-88d4-48d3-a564-783f9f976c6b}"
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
browser = webdriver.Firefox()
browser.get('http://xkcd.com')
assert 'xkcd' in browser.title
idlink= browser.find_element_by_id("comic")
#link = idlink.get_attribute("src") ## print link prints null
print idlink
using xpath method also returns same as above.
browser.find_element_by_id returns web element, and that is what you print.
In addition, the text you want is in child element of idlink. Try
idlink = browser.find_element_by_css_selector("#comic > img")
print idlink.get_attribute("src")
idlink is now web element with img tag who has parent with comic ID.
The URL is in src so we want that attribute.
Building off the answer here
You need to:
Select the img tag (you're currently selecting the div)
Get the contents of the source attribute of the img tag
img_tag = browser.find_element_by_xpath("//div[#id='comic']/img")
print img_tag.get_attribute("src")
The above should print the URL of the image
More techniques for locating elements using selenium's python bindings are available here
For more on using XPath with Selenium, see this tutorial
I get "element is no longer attached to the dom" exception even though the element is there and is clickable, I am trying to click the "next" arrow on ryanair website, the html for the next button is:
<li class="newer" ng-class="{'loadingsmall':loading}">
<a ng-disabled="loading" ng-click="loadMore(0, 'SelectInput$LinkButtonNext1', 1)"
title="Next Week" href="">
</a>
</li>
I located and clicked it in several methods:
elem = WebDriverWait(browser, 15).until(EC.element_to_be_clickable((By.XPATH,
"//a[#title='Next Week']")))
elem = browser.find_element_by_xpath("//a[#title='Next Week']")
elem.click()
and:
area = browser.find_element_by_xpath("//a[#title='Next Week']")
action = webdriver.ActionChains(browser)
action.move_to_element(area) action.click(area) action.perform()
and:
elem = browser.find_element_by_link_text('>')
elem.click()
all work fine if I have no action in between, but once I tell selenium to click on other elements on the page (I do not move to other pages, I stay on the same page and show some dynamic content) the "next" link only works the first time around, and then gives me the exceptions, help would be so greatly appreciated! :)
You're still on the same page but the DOM's elements have changed due to some AJAX actions & calls, so you're forced to re-detect your object using:
elem = browser.find_element_by_link_text('>')
If you don't do that, it is more than likely that you'll get a
StaleReferenceException
For more info