I am currently working on a demo Selenium project with Python. I have been able to navigate to a page, but when I try to collect the text within a "div class", Selenium fails to find the HTML element:
[Image: code to be collected]
I have made use of the wait functionality, but the code still does not find the HTML element.
Any suggestions on how to resolve this issue would be appreciated. Please see my code below:
[Image: my Selenium code]
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver import ActionChains
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import time
import json
# establish Json dict
global data
data = {}
global date
date = '''&checkin=2021-02-22&checkout=2021-02-28&adults=1&source'''
def find_info(place):
    data[place] = []
    driver = webdriver.Chrome('chromedriver.exe')
    driver.get("https://www.airbnb.co.uk/")
    time.sleep(2)
    #first_page_search_bar
    search_bar = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.CLASS_NAME, "_1xq16jy")))
    time.sleep(2)
    search_bar.clear()
    time.sleep(2)
    search_bar.send_keys(f"{place}")
    time.sleep(2)
    enter_button = driver.find_element_by_class_name("_1mzhry13")
    enter_button.click()
    #load page
    WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.CLASS_NAME, "_ljad0a")))
    page = driver.current_url
    new_url = page.replace("&source", date)
    # driver = webdriver.Chrome('chromedriver.exe')
    driver.get(new_url)
    time.sleep(3)
    click_button = driver.find_element_by_xpath('//*[@id="menuItemButton-price_range"]/button')
    click_button.click()
    time.sleep(5)
    price = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.XPATH, '/html/body/div[16]/section/div/div/div[2]/div/section/div[2]/div/div/div[1]')))
    print(price)
find_info("London, United Kingdom")
I've fixed the xpath at the end of your script:
price = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.XPATH, '(//div[@role="menu"]//div[@dir="ltr"])[1]/preceding-sibling::div')))
print(price.text)
Explanation: Under the <div role="menu" ...> there are three <div dir="ltr"> elements, and the first one happens to come just after the div you are looking for. So we find that one and select its preceding sibling.
Another recommendation: if you replace EC.presence_of_element_located with EC.element_to_be_clickable when you are looking for the input fields at the start, you can get rid of a few time.sleep statements.
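To illustrate why an explicit wait can stand in for fixed time.sleep calls: an explicit wait polls a condition and returns as soon as it holds, instead of always pausing for the full duration. The sketch below is a simplified stand-in written in plain Python, not Selenium's actual implementation; wait_until, its parameters, and the fake element_ready condition are hypothetical names used only for illustration.

```python
import time


def wait_until(condition, timeout=10.0, poll_frequency=0.5):
    """Call `condition` repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_frequency)


# Hypothetical condition: "ready" only on the third check, standing in for
# an element that becomes clickable shortly after the page loads.
calls = {"count": 0}


def element_ready():
    calls["count"] += 1
    return "search bar" if calls["count"] >= 3 else None


found = wait_until(element_ready, timeout=2.0, poll_frequency=0.01)
print(found, "after", calls["count"], "checks")
```

The wait ends on the first successful check, so the script only pauses as long as the page actually needs.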
A similar question has been asked many times, and I have gone over many of them, such as Debugging "Element is not clickable at point" error, Selenium Webdriver - element not clickable error in firefox, and ElementClickInterceptedException: Message: element click intercepted, but I haven't been able to solve my problem.
I want to select a subset of car brands from the website's search dropdown menu. Usually I would do it via Selenium's Select, but that doesn't do the trick here.
Here's my code.
from selenium import webdriver
from selenium.webdriver.chrome.service import Service as ChromeService
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.proxy import Proxy, ProxyType
from selenium.webdriver.common.by import By
import time
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.action_chains import ActionChains
ser = Service(executable_path= r'D:\chromedriver.exe')
#Note I have omitted the options that I use (proxy and header).
driver = webdriver.Chrome(service = ser)
driver.get("https://www.autotalli.com/")
time.sleep(5)
# Accepting cookies
driver.find_element(by = By.XPATH, value = "//button[contains(text(),'Asetuks')]").click()
time.sleep(5)
driver.find_element(by = By.XPATH, value = "//button[contains(text(),'Tallenna')]").click()
driver.maximize_window()
time.sleep(5)
#selecting parameters from the dropdown menu
element = WebDriverWait(driver, 20).until(
    EC.element_to_be_clickable((By.XPATH, "//*[@class = 'mbsc-input-wrap']")))
element.click()
element = WebDriverWait(driver, 20).until(
    EC.element_to_be_clickable((By.XPATH, "//*[@data-val = '66-duplicated']")))
element.click()
element = WebDriverWait(driver, 20).until(
    EC.element_to_be_clickable((By.XPATH, "//*[@class = 'mbsc-input-wrap']")))
element.click()
element = WebDriverWait(driver, 20).until(
    EC.element_to_be_clickable((By.XPATH, "//*[@data-val = '10-duplicated']")))
element.click()
What throws me off is that the code works for the 66-duplicated element but not for the 10-duplicated element, and the two are identical in every way. The error I get is
Exception has occurred: ElementClickInterceptedException
Message: element click intercepted: Element <div role="option" tabindex="-1" aria-selected="false" class="mbsc-sc-itm mbsc-sel-gr-itm mbsc-btn-e" data-index="2" data-val="10-duplicated" style="height:40px;line-height:40px;">...</div> is not clickable at point (268, 217). Other element would receive the click: <input tabindex="0" type="text" class="mbsc-sel-filter-input mbsc-control" placeholder="Hae">
To solve this, I have tried clicking with JavaScript, moving to the element and then clicking, and maximizing the window. None of these worked.
#Attempt 1: js:
driver.execute_script("arguments[0].click()", element)
#Attempt 2: moveToElement:
element = driver.find_element(by = By.XPATH, value = "//*[@data-val = '10-duplicated']")
actions = ActionChains(driver)
actions.move_to_element(element).perform()
element = WebDriverWait(driver, 20).until(
    EC.element_to_be_clickable((By.XPATH, "//*[@data-val = '10-duplicated']")))
element.click()
I have also tried a combination of these but to no avail.
However, when I put a breakpoint right before the click on element "10-duplicated", manually scroll and move the mouse to the element, and then run the remaining code, it works.
I am quite puzzled here. What's going on and how can this problem be solved?
There are 17 matches for the //*[@class = 'mbsc-input-wrap'] locator on that page, but you are opening the same first match twice. That is the Merkit droplist.
Selecting //*[@data-val = '66-duplicated'] (Nissan) from the opened droplist works because that option is among the visible options. Once Nissan is selected, however, the //*[@data-val = '10-duplicated'] (BMW) option is no longer visible, so you cannot click it directly.
To select it now you will have to do one of the following:
1. Cancel the previous selection of Nissan so that BMW is visible again when the droplist is opened.
2. Scroll the droplist.
3. Click //*[@data-val = '10-duplicated'] with JavaScript. This is not recommended, since it is not something a human user can do via the GUI.
Below is code for the first approach, cancelling the previous Nissan selection.
I have also made some improvements there.
from selenium import webdriver
from selenium.webdriver.chrome.service import Service as ChromeService
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.proxy import Proxy, ProxyType
from selenium.webdriver.common.by import By
import time
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.action_chains import ActionChains
ser = Service(executable_path= r'D:\chromedriver.exe')
#Note I have omitted the options that I use (proxy and header).
driver = webdriver.Chrome(service = ser)
driver.get("https://www.autotalli.com/")
wait = WebDriverWait(driver, 20)
time.sleep(5)
# Accepting cookies
driver.find_element(by = By.XPATH, value = "//button[contains(text(),'Asetuks')]").click()
time.sleep(5)
driver.find_element(by = By.XPATH, value = "//button[contains(text(),'Tallenna')]").click()
driver.maximize_window()
time.sleep(5)
#selecting parameters from the dropdown menu
wait.until(EC.element_to_be_clickable((By.XPATH, "//*[@class = 'mbsc-input-wrap']"))).click()
wait.until(EC.element_to_be_clickable((By.XPATH, "//*[@data-val = '66-duplicated']"))).click()
#clear the previously selected Nissan option
wait.until(EC.element_to_be_clickable((By.XPATH, "//span[contains(@class,'usedCarsMakeClear clearOption')]"))).click()
wait.until(EC.element_to_be_clickable((By.XPATH, "//*[@class = 'mbsc-input-wrap']"))).click()
wait.until(EC.element_to_be_clickable((By.XPATH, "//*[@data-val = '10-duplicated']"))).click()
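The answer above also mentions scrolling the droplist as an alternative. A minimal sketch of that idea follows, assuming the option can be brought into view with the browser's standard scrollIntoView DOM method; whether that is enough for this particular widget has not been verified. scroll_into_view_and_click is a hypothetical helper, driver is expected to be a Selenium WebDriver and element a WebElement, and the two stand-in classes exist only so the snippet runs without a browser.

```python
def scroll_into_view_and_click(driver, element):
    """Ask the browser to scroll `element` into view, then click it."""
    driver.execute_script(
        "arguments[0].scrollIntoView({block: 'center'});", element
    )
    element.click()


class FakeElement:
    """Stand-in for a WebElement that only records whether it was clicked."""

    def __init__(self):
        self.clicked = False

    def click(self):
        self.clicked = True


class FakeDriver:
    """Stand-in for a WebDriver that only records the scripts it receives."""

    def __init__(self):
        self.scripts = []

    def execute_script(self, script, *args):
        self.scripts.append((script, args))


fake_driver = FakeDriver()
fake_option = FakeElement()
scroll_into_view_and_click(fake_driver, fake_option)
print(fake_option.clicked)
```

With a real session, the element passed in would be the one located by //*[@data-val = '10-duplicated'], found with presence_of_element_located rather than element_to_be_clickable, since the option is not yet clickable before scrolling.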
I'm practicing by trying to scrape my university's course catalog. I have a few lines of Python that open the URL in Chrome and click the search button to bring up the course catalog. When I try to extract the text using find_elements_by_xpath(), it comes back blank. When I use the dev tools in Chrome, there definitely is text there.
from selenium import webdriver
import time
driver = webdriver.Chrome()
url = 'https://courses.osu.edu/psp/csosuct/EMPLOYEE/PUB/c/COMMUNITY_ACCESS.OSR_CAT_SRCH.GBL?'
driver.get(url)
time.sleep(3)
iframe = driver.find_element_by_id('ptifrmtgtframe')
driver.switch_to.frame(iframe)
element = driver.find_element_by_xpath('//*[@id="OSR_CAT_SRCH_WK_BUTTON1"]')
element.click()
course = driver.find_elements_by_xpath('//*[@id="OSR_CAT_SRCH_OSR_CRSE_HEADER$0"]')
print(course)
I'm trying to extract the text from the element 'OSR_CAT_SRCH_OSR_CRSE_HEADER$0'. I don't understand why it's not returning the text values, especially when I can see in the dev tools that it contains text.
You are not reading the element's .text, which is why you are not getting the text.
course = driver.find_element_by_xpath('//*[@id="OSR_CAT_SRCH_OSR_CRSE_HEADER$0"]').text
Make the change above on the second-to-last line.
Below is the full code after the changes:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time
driver = webdriver.Chrome()
url = 'https://courses.osu.edu/psp/csosuct/EMPLOYEE/PUB/c/COMMUNITY_ACCESS.OSR_CAT_SRCH.GBL?'
driver.get(url)
time.sleep(3)
iframe = driver.find_element_by_id('ptifrmtgtframe')
driver.switch_to.frame(iframe)
element = driver.find_element_by_xpath('//*[@id="OSR_CAT_SRCH_WK_BUTTON1"]')
element.click()
# wait up to 10 seconds for the element to be present
course = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.XPATH, '//*[@id="OSR_CAT_SRCH_OSR_CRSE_HEADER$0"]'))
).text
print(course)
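One detail worth spelling out: find_elements_by_xpath (plural) returns a list of elements, so .text has to be read from each item rather than from the list itself. A minimal sketch, using a stand-in class instead of real WebElement objects so that it runs without a browser; FakeElement and texts_of are hypothetical names, and the sample strings are made up.

```python
class FakeElement:
    """Stand-in exposing only the `.text` attribute used here."""

    def __init__(self, text):
        self.text = text


def texts_of(elements):
    """Return the stripped text of every element in a list."""
    return [element.text.strip() for element in elements]


# Made-up sample data standing in for the result of a plural find call.
courses = [FakeElement(" Course A "), FakeElement("Course B")]
print(texts_of(courses))
print(hasattr(courses, "text"))  # the list itself has no .text attribute
```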
I can't get the text from the element. I think the text is added to the element dynamically (by Angular) and is therefore not loaded directly in the element. The text inside the element has the form "3", with quotation marks around it.
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.common.action_chains import ActionChains
import xlsxwriter
import re
pattern = r"[\"\d{1, 2}\"]"
PATH = "C:\Program Files (x86)\chromedriver.exe"
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--no-sandbox')
driver = webdriver.Chrome(PATH, chrome_options=chrome_options)
driver.get("some-url")
xpathPain = "/html/body/div[2]/div/div/div[1]/div/div/div[1]/div[3]/div/div/div[1]/div[3]/development-numbers/status-numbers/div/div[2]/div/h4"
try:
    element = WebDriverWait(driver, 20).until(
        EC.presence_of_element_located((By.XPATH, xpathPain)))
    elementPain = driver.find_element_by_xpath(xpathPain)
    print(elementPain.text)
except TimeoutException:
    print("Failed to load elementPain")
The output is blank (an empty string). I have tried waiting until the text is loaded with the EC text_to_be_present_in_element(locator, text_), and I have tried using a regular expression for the text part.
The page source for the element is:
<h4 class="status-numbers__number">
"6"
<!---->
</h4>
So how do I get the number 6 from this element?
I have tried print(elementPain.get_attribute("innerHTML")) and that gets the "<!---->" part of the text but not the '"6"' part. I have also tried .getAttribute("innerText"), .getAttribute("textContent").
I have tried using the firefox geckodriver instead as well. No result.
I have managed to solve the issue using Firefox and this code:
try:
    element = WebDriverWait(driver, 20).until(
        EC.element_to_be_clickable((By.XPATH, xpathPain)))
    elementPain = driver.find_element_by_xpath(xpathPain)
    print(elementPain.get_attribute("innerHTML"))
except TimeoutException:
    print("Failed to load elementPain")
I don't know if it had to do with the element being out of the viewport.
Use the following XPath to identify the element.
You can use element.text or element.get_attribute("textContent") to get the text.
try:
    WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.XPATH, "//h4[@class='status-numbers__number']")))
    elementPain = driver.find_element_by_xpath("//h4[@class='status-numbers__number']")
    print(elementPain.text) #To get the text using text
    print(elementPain.get_attribute("textContent")) #To get the text using get_attribute()
except TimeoutException:
    print("Failed to load elementPain")
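Once the text has been read, the number still has to be pulled out of it. A minimal sketch, assuming the returned string holds a one- or two-digit number that may or may not be wrapped in literal quotation marks and whitespace (the quotes seen in the page source may simply be how the developer tools display a text node); extract_number is a hypothetical helper.

```python
import re


def extract_number(text):
    """Return the first run of one or two digits in `text` as an int, or None."""
    match = re.search(r"\d{1,2}", text)
    return int(match.group()) if match else None


print(extract_number('\n "6"\n'))  # quoted, with surrounding whitespace
print(extract_number("6"))         # bare number
print(extract_number(""))          # nothing to extract
```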
[Image: inspect element]
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
PATH = "C:\Program Files (x86)\chromedriver.exe"
driver = webdriver.Chrome(PATH)
driver.get('https://website.com/')
driver.maximize_window()
search = driver.find_element_by_id('UserName')
search.send_keys('UserName')
search = driver.find_element_by_id('Password')
search.send_keys('Password')
search.send_keys(Keys.RETURN)
try:
    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.LINK_TEXT, "Admin"))
    )
    element.click
    link = driver.find_element_by_link_text('Admin')
    link.click()
    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.LINK_TEXT, "Reports"))
    )
    element.click
    link = driver.find_element_by_link_text('Reports')
    link.click()
except:
    driver.quit()
driver.implicitly_wait(5)
sales_link = driver.find_element_by_link_text('Sales').click()
Below is the info from the website. I want to click on Sales but can't seem to do so. Any help would be appreciated.
<a _ngcontent-hyf-c12="" routerlink="./SalesReport" routerlinkactive="active" href="/Reports/SalesReport">Sales </a>
[Image: error]
This is what appears if I try to click on it with XPath:
[Image: error]
In your HTML pic, I see a whitespace after Sales.
Look carefully: href="/Reports/SalesReport">Sales </a>.
So find_element_by_link_text('Sales') will not work.
You can change it to find_element_by_link_text('Sales ').
However, this will be better:
driver.find_element_by_xpath("//a[contains(text(), 'Sales')]").click()
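To make the whitespace point concrete, here is a small plain-Python illustration of how an exact comparison differs from a substring check or a comparison after trimming. It only mimics the string logic and does not call Selenium; link_text is a value copied from the markup in the question.

```python
link_text = "Sales "  # note the trailing space, as in the question's markup

exact_match = link_text == "Sales"            # exact comparison
contains_match = "Sales" in link_text         # substring check, like contains()
trimmed_match = link_text.strip() == "Sales"  # comparison after trimming

print(exact_match, contains_match, trimmed_match)
```

In XPath, normalize-space() plays a role similar to strip(), so //a[normalize-space()='Sales'] is another option when contains() would match too many links.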
I want to make a parser for scraping prices, but I can't find a working method of parsing innerHTML.
I don't know why, but Selenium (getAttribute(innerHTML)), PhantomJS (page.evaluate(function(){return document.ElementToParse.innerHTML})) and scrapy-splash (loading a webpage using WebPageEngine and parsing the HTML) don't work. The result is always an empty "[]", null, or a WebElement.
I tested my code on Banggood's product pages and also on the landing page, but the result is always the same.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Firefox()
driver.get("https://www.banggood.com/BlitzWolf-Ampcore-Turbo-TC10-3A-Durable-USB-Type-C-Charging-Data-Cable-p-1188424.html?rmmds=category&cur_warehouse=CN") #random url
try:
    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.CLASS_NAME, "item_now_price"))
    )
finally:
    driver.quit()
print(element)
and output:
<selenium.webdriver.firefox.webelement.FirefoxWebElement (session="b0593791-138b-4177-a8f3-e7983143824a", element="d08f4717-d3f1-4594-8f2b-1bf943deb9f9")>
when I need something like:
6.59 (or US$6.59)
I also tried
price = driver.find_element_by_class_name('item_now_price').getAttribute("innerHTML")
and
var page = require('webpage').create();
page.open('https://www.banggood.com/BlitzWolf-Ampcore-Turbo-TC10-3A-Durable-USB-Type-C-Charging-Data-Cable-p-1188424.html?rmmds=category&cur_warehouse=CN', function(status) {
    var price = page.evaluate(function() {
        return document.getElementByClassName('item_now_price').innerHTML;
    });
    console.log('price is ' + price);
    phantom.exit();
});
but the result is null, and when I add
page.includeJs(/url/to/js)
the terminal stops working.
Once you get the element in selenium, you can get the text of that element with .text
See the slight adjustment to your first example below:
try:
    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.CLASS_NAME, "item_now_price"))
    )
    print(element.text)
finally:
    driver.quit()
See if that gets the results you're looking for.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Chrome()
driver.get("https://www.banggood.com/BlitzWolf-Ampcore-Turbo-TC10-3A-Durable-USB-Type-C-Charging-Data-Cable-p-1188424.html?rmmds=category&cur_warehouse=CN") #random url
try:
    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.CLASS_NAME, "item_now_price"))
    ).text
finally:
    driver.quit()
print(element)
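Since .text returns a string such as the US$6.59 mentioned in the question, a further step is needed if a numeric value is wanted. A minimal sketch, assuming the price text contains a single number with a dot as the decimal separator; parse_price is a hypothetical helper, and other price formats (thousands separators, comma decimals) are not handled.

```python
import re
from decimal import Decimal


def parse_price(text):
    """Return the first decimal number found in `text`, or None."""
    match = re.search(r"\d+(?:\.\d+)?", text)
    return Decimal(match.group()) if match else None


print(parse_price("US$6.59"))
print(parse_price("6.59"))
print(parse_price("n/a"))
```

Decimal is used instead of float so that the parsed price keeps exactly the digits shown on the page.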