I'm trying to scrape this website, Best Western Mornington Hotel, for the names of the hotel rooms and the price of each room. I'm using Selenium to try to scrape this data, but I keep getting no output, which I assume means I'm using the wrong selectors/XPath. Is there a method for identifying the correct XPath/div class/selector? I feel like I have selected the correct ones, but there is no output.
from re import sub
from decimal import Decimal
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
import time
seleniumurl = 'https://www.bestwestern.co.uk/hotels/best-western-mornington-hotel-london-hyde-park-83187/in-2021-06-03/out-2021-06-05/adults-1/children-0/rooms-1'
driver = webdriver.Chrome(executable_path='C:\\Users\\Conor\\Desktop\\diss\\chromedriver.exe')
driver.get(seleniumurl)
time.sleep(5)
working = driver.find_elements_by_class_name('room-type-block')
for work in working:
    name = work.find_elements_by_xpath('.//div/h4').string
    price = work.find_elements_by_xpath('.//div[2]/div[2]/div/div[1]/div/div[3]/div/div[1]/div/div[2]/div[1]/div[2]/div[1]/div[1]/span[2]').string
    print(name, price)
I only work with Selenium in Java, but from what I can see, you're trying to get a collection of WebElements and invoke toString() on them...
Shouldn't it be find_element_by_xpath, to get just one WebElement, and then a call to .text instead of .string?
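A minimal sketch of that change in Python, reusing the asker's original (unverified) XPaths:
working = driver.find_elements_by_class_name('room-type-block')
for work in working:
    # find_element (singular) returns a single WebElement; .text gives its visible text
    name = work.find_element_by_xpath('.//div/h4').text
    price = work.find_element_by_xpath('.//div[2]/div[2]/div/div[1]/div/div[3]/div/div[1]/div/div[2]/div[1]/div[2]/div[1]/div[1]/span[2]').text
    print(name, price)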
Marek is right: use .text instead of .string, or use .get_attribute("innerHTML"). I also think your XPath may be wrong, unless I'm looking at the wrong page. Here are some XPaths from the page you linked.
#This will get all the room type sections.
roomTypes = driver.find_elements_by_xpath("//div[contains(@class,'room-type-box__content')]")
#Within each section, this will get the room type title.
#Print out the room type titles.
for r in roomTypes:
    title = r.find_element_by_xpath(".//div[contains(@class,'room-type-title')]/h3")
    print(title.text)
Please use the selector div#rr_wrp div.room-type-block and the .visibility_of_all_elements_located method to get the list of category divs.
Within each of those divs, you can find the title with the XPath .//h2[@class="room-type--title"], the sub-category with .//strong[@class="trimmedTitle rt-item--title"], and the price with .//div[@class="rt-rate-right--row group"]//span[@data-bind="text: priceText"].
Then please try the following code, which uses zip() to iterate over the parallel lists:
driver = webdriver.Chrome(executable_path='C:\\Users\\Conor\\Desktop\\diss\\chromedriver.exe')
driver.get('https://www.bestwestern.co.uk/hotels/best-western-mornington-hotel-london-hyde-park-83187/in-2021-06-03/out-2021-06-05/adults-1/children-0/rooms-1')
wait = WebDriverWait(driver, 20)
elements = wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, 'div#rr_wrp div.room-type-block')))
for element in elements:
    for room_title in element.find_elements_by_xpath('.//h2[@class="room-type--title"]'):
        print("Main Title ==>> " + room_title.text)
    for room_type, room_price in zip(element.find_elements_by_xpath('.//strong[@class="trimmedTitle rt-item--title"]'), element.find_elements_by_xpath('.//div[@class="rt-rate-right--row group"]//span[@data-bind="text: priceText"]')):
        print(room_type.text + " " + room_price.text)
driver.quit()
Related
I'm trying to code a Python script that uses Selenium to automatically pick something random for dinner based on an inputted location. However, I keep getting this error message:
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"//*[@id="tsuid11"]/div[2]/div/a/div/div[3]/div"}
I don't really understand why this error is happening. I've watched the entire process load multiple times and am also certain the XPath value is correct.
This is my code:
import requests
import random
import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
location = input("Please enter your postal code: ")
driver = webdriver.Chrome("<path to chromedriver.exe>")
query = "food near " + location
print("Please give us a moment...")
driver.get("https://www.google.com/search?q=" + query)
time.sleep(3)
#this helps to click the "View All" option to see the entire list of restaurants nearby
view_all = driver.find_element_by_xpath('//*[@id="rso"]/div[1]/div/div/div/div/div[5]/div/g-more-link/a/div/span[1]')
view_all.click()
time.sleep(10)
#This is where I can't seem to find the element by its XPath
name = driver.find_element_by_xpath('//*[@id="tsuid11"]/div[2]/div/a/div/div[3]/div')
print(name)
driver.close()
I have also already searched this error up and got this: here
Based on the answer, the person mentioned "Could be a race condition where the find element is executing before it is present on the page". However, I've already added the time.sleep() function to mitigate that.
Any help would be appreciated :)
Update: I got it to work by replacing find-by-XPath with find-by-CSS-selector. However, this is just a workaround; I'm still going to try to figure this one out. Thanks for all the solutions, but unfortunately none of them worked for me.
To clarify: the View All button works perfectly fine on my end; the WebDriverWait approach also didn't let me find the element (I think time.sleep does essentially the same thing); and the element is not inside an iframe.
After some further probing, I've tested the full XPath: /html/body/div[6]/div/div[7]/div[1]/div/div/div[2]/div[2]/div/div/div/div/div/div/div/div[1]/div[4]/div[3]/div[2]/div/a/div/div[3]/div on the browser console itself, and did not find anything. By digging through the layers, I've noticed that it stops at /html/body/div[6]/div/div[7]/div[1]/div/div/div[2]/div[2]/div/div/div/div/div/div/div/div[1]/div[4]/div[3]/div[2]/div, right before the a tag. I'm not sure why as of now, will update again if I find anything.
Update 2:
I used the class name instead of the XPath, which is much more consistent for my output. That fixed everything for me :) Hope this helps.
You haven't mentioned what exactly you want to do after clicking on View All. Assuming you want to fetch the restaurant names, you could do that with the method below.
You can write the following code after these two lines from your code:
view_all.click()
time.sleep(10)
Sample code :
all_names = driver.find_elements(By.CSS_SELECTOR, "div[role='heading'] div")
print(all_names[0].text)
Or, in case you would like to fetch all the names:
names = []
for name in driver.find_elements(By.CSS_SELECTOR, "div[role='heading'] div"):
    names.append(name.text)
print(names)
Update 1:
from selenium.webdriver.common.action_chains import ActionChains  # needed for move_to_element below

driver = webdriver.Chrome("<path to chromedriver.exe>")
driver.maximize_window()
driver.implicitly_wait(30)
location = "545084"
query = "food near " + location
driver.get("https://www.google.com/search?q=" + query)
wait = WebDriverWait(driver, 10)
ActionChains(driver).move_to_element(wait.until(EC.element_to_be_clickable((By.XPATH, "//a[contains(@href, '/search?tbs')]")))).perform()
wait.until(EC.element_to_be_clickable((By.XPATH, "//a[contains(@href, '/search?tbs')]"))).click()
names = []
for name in driver.find_elements(By.XPATH, "//div[@aria-level='3']"):
    names.append(name.text)
print(names)
Output :
["Mum's Kitchen\nCatering Pte Ltd", 'Kitch', "Zoey's Diner", "McDonald's", 'ThaiExpress', 'Dian Xiao Er', "Swensen's", 'Pho Street Compass One', 'Singapore WaterDrop Tea House', 'LeNu at Compass One', "McDonald's Compass One", "McDonald's", 'Miami Bistro', 'Monster Curry', 'Texas Chicken (Hougang Capeview)', 'Paradise Hotpot at Compass One', "Long John Silver's Rivervale Mall", 'Boon Tong Kee # Compass One', 'Din Tai Fung', 'Fish & Co.', 'PUTIEN', 'Soup Restaurant 三盅两件 - Compass One']
Please check whether this element is inside a frame. When we use the debugger, the element is identified, but the code will not know about it; we need to drill down from the root frame to the corresponding frame, and then the element becomes interactable.
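A minimal sketch of that frame handling, assuming a hypothetical iframe on the page (the element here may not actually be inside one):
# Switch into the frame that (hypothetically) contains the element,
# interact with it, then switch back to the top-level document.
driver.switch_to.frame(driver.find_element_by_tag_name("iframe"))
# ... locate and interact with the element here ...
driver.switch_to.default_content()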
Can you try the lines of code below? I hope they help.
location = input("Please enter your postal code: ")
driver = webdriver.Chrome("<path to chromedriver.exe>")
#Open the browser maximized, because clicking the `View All` button sometimes throws an exception otherwise
driver.maximize_window()
query = "food near " + location
print("Please give us a moment...")
driver.get("https://www.google.com/search?q=" + query)
time.sleep(3)
#The original XPath sometimes gave me an error too, so I've updated it to click on the link for nearby restaurants
view_all = driver.find_element_by_xpath("//*[text()='View all']")
view_all.click()
time.sleep(10)
#The provided XPath was not right (I could not find it on the page); also updated print() to use .text to get the text
name = driver.find_element_by_xpath("//div[@id='tsuid9']//a")
print(name.text)
driver.close()
You can try to click on the element with an explicit wait:
viewAll = WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//*[text()='View all']")))
viewAll.click()
Imports:
import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait
Code:
#Change your xpath to click on View all
view_all = driver.find_element_by_xpath('//*[@id="rso"]//span[normalize-space(text())="View all"]')
#This will print all the restaurant names
numberof_restaurants = driver.find_elements_by_xpath('//*[contains(@id,"tsuid")]//div[@role="heading"]')
for restaurant in numberof_restaurants:
    name = restaurant.text
    print(name)
UPDATE:
So, thanks to the voted answer, it displayed some information, but not the right information: it shows 0kb out of 100. When I run console.log($0) in the inspect-element console, the item is displayed in the console. How do I fetch this?
I want to create a Python 3.x programme that gets my stats off of Netlify and Easybase using Selenium. The issue I have already come across is that the element does not have a specific class name, and the text widget isn't wrapped in a simple, uniquely identifiable tag. Here is a screenshot of the Netlify HTML, and this is the code that I used:
element = driver.find_element_by_name("github")
element.click()
login = driver.find_element_by_name("login")
login.send_keys(email)
password = driver.find_element_by_name("password")
password.send_keys(passwordstr)
loginbtn = driver.find_element_by_name("commit")
loginbtn.click()
getbandwidth = driver.find_element_by_xpath('//*[@id="main"]/div/div[1]/div/section/div/div/div/dl/div/dd')
print(getbandwidth.text)
getbandwidth = driver.find_element_by_xpath("//dd[@class='tw-text-xl tw-mt-4px tw-leading-none']")
You can use this to grab the first element with that class. The XPath below does the same, but lets you index other elements with similar classes:
(//dd[@class='tw-text-xl tw-mt-4px tw-leading-none'])[1]
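For example, a sketch that indexes the second matching dd (assuming more than one exists on the page):
# Grab the second element with that class by index and print its text
second_stat = driver.find_element_by_xpath("(//dd[@class='tw-text-xl tw-mt-4px tw-leading-none'])[2]")
print(second_stat.text)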
Normally we use WebDriver waits to allow time for the element to become visible:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
getbandwidth = WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH,"//dd[@class='tw-text-xl tw-mt-4px tw-leading-none']")))
The company has a list of 100+ sites, and I am trying to use Selenium WebDriver to automatically take a user to a chosen site. I am fairly new to programming, so please forgive me if my question is worded poorly. I am trying to take the name of a site, such as "Alpharetta - Cemex" in the example below, from the user, find it in this long list, and then select that link. Through testing, I am pretty sure the element I need to click is the h3 element that also holds the name of the site in its data-hmi-name attribute.
Website Code Example:
I have tried the code below and it never seems to work:
driver.find_element_by_css_selector("h3.tru-card-head-text uk-text-center[data-hmi-name='Alpharetta - Cemex']").click()
#For this one I tried to select the h3 element by searching for all the elements that have the name Alpharetta - Cemex
or
theCards = main.find_elements_by_tag_name("h3")  #I tried both of these declarations for theCards
#theCards = main.find_elements_by_class_name("tru-card-wrapper")
#then used the loop below. This obviously didn't work; it just returns an error that card.text doesn't exist
for card in theCards:
    #title = card.find_elements_by_tag_name("h3")
    print(card.text)
    if(card.text == theSite):
        card.click()
Any help or guidance would be so appreciated! I am new to programming in Python and if you can explain what I am doing wrong I'd be forever thankful!
If you want to click a single link (e.g. Alpharetta - Cemex), you can try something like the below:
theSite = "Alpharetta - Cemex" #You can store user inputted site Name here
linkXpath = "//a[h3[contains(text(),'" + theSite + "')]]"
WebDriverWait(driver, 30).until(EC.element_to_be_clickable((By.XPATH, linkXpath))).click() #This will wait for element to be clickable before it clicks
In case the above is not working: if your link is not on screen / not visible, you can use JavaScript to first scroll to the element and then click it, like below:
ele = WebDriverWait(driver, 30).until(EC.presence_of_element_located((By.XPATH, linkXpath)))
driver.execute_script("arguments[0].scrollIntoView();", ele )
driver.execute_script("arguments[0].click();", ele )
You need to import:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait
The input field is being recognized by Selenium (located using XPath), but I am not able to send any keys to it. There is no error, but the text I want to type is not shown in the search box. The input field is inside several divs and is in an iframe, but I'm pretty sure I took care of that. What can I do to fix this issue?
driver.implicitly_wait(20)
search = driver.find_element_by_xpath("/html/body/main/div/div/div/div[1]/div/div/div/div/div/div[2]/div/input")
search.send_keys("25%)")
Picture of HTML
The input field doesn't have an ID and it doesn't have a type either, so I'm not really sure how else to find it besides xpath.
I'm not sure how much differs between Java and Python here, but my suggestion is that you try using a JavaScript executor to enter the value:
Your Python code that gets the element:
search = driver.find_element_by_xpath("/html/body/main/div/div/div/div[1]/div/div/div/div/div/div[2]/div/input")
You basically pass the element this way in Java:
JavascriptExecutor js = (JavascriptExecutor)driver;
js.executeScript("arguments[0].value='<<Add your input String here>>';", search);
Many times it works when sendKeys doesn't.
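A rough Python equivalent of the Java snippet above, reusing the search element located earlier (assuming that setting the value attribute is enough for this page):
# Set the input's value directly via JavaScript instead of send_keys
driver.execute_script("arguments[0].value = '25%)';", search)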
Use class to locate the div element:
search_div = driver.find_element_by_class_name("search-input")
search = search_div.find_element_by_class_name("text")
Btw:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
klass = "search-input"
WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.CLASS_NAME, klass)))
search_div = driver.find_element_by_class_name(klass)
search = search_div.find_element_by_class_name("text")
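Then you can send the keys to it as in your original code, e.g.:
search.send_keys("25%)")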
I just want to write a simple login script for one website. However, I think the login page was written in JS, and it's really hard to locate the elements with Selenium.
The web page I am going to play with is:
"https://www.nike.com/snkrs/login?returnUrl=%2F"
This is how the page looks and how the inspect-element view looks:
I was trying to locate the element with the following code:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time
driver = webdriver.Firefox()
driver.get("https://www.nike.com/snkrs/login?returnUrl=%2Fthread%2Fe9680e08e7e3cd76b8832684037a58a369cad5ed")
time.sleep(5)
driver.switch_to.frame(driver.find_element_by_tag_name("iframe"))
elem = driver.find_element_by_xpath("//*[@id='ce3feab5-6156-441a-970e-23544473a623']")
elem.send_keys("pycon")
elem.send_keys(Keys.RETURN)
driver.close()
This code returns an error saying the element could not be found by [@id='ce3feab5-6156-441a-970e-23544473a623'].
I tried playing with frames, but that does not seem to work. If I go to the "view page source" page, it is full of JS code.
Is there a good way to play with such a web page with selenium?
Try changing the line:
elem = driver.find_element_by_xpath("//*[@id='ce3feab5-6156-441a-970e-23544473a623']")
to
elem = driver.find_element_by_xpath("//*[@type='email']")
My guess (and observation) is that the id changes each time you visit the page. The id looks auto-generated, and when I go to the page multiple times, the id is different each time.
You'll need to search for something that doesn't change. For example, you can search for the name attribute, which has the seemingly static value "emailAddress"
element = driver.find_element_by_name("emailAddress")
You could also use an xpath expression to search for other attributes, such as data-componentname:
element = driver.find_element_by_xpath("//input[@data-componentname='emailAddress']")
Also, instead of a hard-coded sleep, you can simply wait for the element to be visible:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Firefox()
driver.get("https://www.nike.com/snkrs/login")
element = WebDriverWait(driver, 10).until(
    EC.visibility_of_element_located((By.NAME, "emailAddress"))
)
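Once the wait returns the element, you can interact with it as before, for example (reusing the Keys import from your original script):
element.send_keys("pycon")
element.send_keys(Keys.RETURN)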