Trying to get the search bar ID from this website: http://www.pexels.com
browser = webdriver.Chrome(executable_path="C:\\Users\\James\\Documents\\PythonScripts\\chromedriver.exe")
url = "https://www.pexels.com"
browser.get(url)
browser.maximize_window()
search_bar = browser.find_element_by_id("//input[@id='search__input']")
search_bar.send_keys("sky")
search_button.click()
However this isn't correct and I'm not sure how to get the search to work. First time using selenium so all help is appreciated!
There's no id attribute in the tag you are searching for. You may use CSS selectors instead. Here's a sample snippet:
from selenium.webdriver.common.keys import Keys

search_bar = driver.find_element_by_css_selector("input[placeholder='Search for free photos…']")
search_bar.send_keys("sky")
search_bar.send_keys(Keys.RETURN)
The snippet above will type 'sky' into the search bar and press Enter.
Locating Elements gives an explanation on how to locate elements with Selenium.
The element you want to select looks like this in the DOM:
<input required="required" autofocus="" class="search__input" type="search" placeholder="Search for free photos…" name="s">
Since there is no id specified you can't use find_element_by_id.
For me this worked:
search_bar = driver.find_element_by_xpath('/html/body/header[2]/div/section/form/input')
search_bar.send_keys("sky")
search_bar.send_keys(Keys.RETURN)
For the selection you can also use the class name:
search_bar = driver.find_element_by_class_name("search__input")
or the name tag:
search_bar = driver.find_element_by_name('s')
However, locating elements by name is probably not a good idea if there are more elements with the same name (link).
BTW, if you are unsure about the XPath, the Google Chrome inspection tool lets you copy the XPath from the document:
Issue: I cannot get a clickable variable that points to the chosen anime title. The title is an <a> tag that contains a <strong> tag with the anime name.
What I want to do is:
1) Get all anime that appear on the website
2) Select the anime that has the same name as the input variable "b"
3) Make the chosen anime title clickable to redirect to its page and scrape it.
What is causing me a lot of issues is selecting the right anime, because all the anime titles share the same class name and only differ by the presence of the <strong> tag, and that doesn't seem enough to make the title clickable.
Website I use selenium on:
This is the full program code for the moment:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
import time
a = input("Enter the anime you are looking for: ")
b = str(a)
PATH = r"C:\Program Files (x86)\chromedriver.exe"
driver = webdriver.Chrome(PATH)
driver.get("https://myanimelist.net/anime.php")
print(driver.title)
'''CLOSE THE COOKIE BANNER, WHICH DOES NOT LET ME
PROCEED WITH THE PROGRAM'''
puls_cookie = driver.find_element(By.CLASS_NAME, "css-47sehv")
puls_cookie.click()
search = driver.find_element(By.XPATH,
"/html/body/div[2]/div[2]/div[3]/div[2]/div[3]/form/div[1]/div[1]/input")
time.sleep(2)
search.click()
search.send_keys(b)
search.send_keys(Keys.RETURN)
search2 = driver.find_elements(By.TAG_NAME, "strong")
i = 0
link = driver.find_element(By.XPATH, f"// a[contains(text(),\{b})]")
# the a represents the <a> tag and b represents your input text
link.click()
time.sleep(10)
driver.quit()
I wanted to open the page by clicking the blue name of one of the anime that came up as a result of the previous input in the search bar.
I AM VERY SORRY IF THIS EDIT STILL DOESN'T MAKE THE ISSUE CLEAR; English is not my native language and I'm pretty new to programming too, so it's very difficult for me.
I thank everyone that spent and (I wish) will spend time trying to help me; God bless you all
You don't need to specify an index when you use for ... in.
# wrong usage
for element in search2:
    anime = search2[i].text
# proper usage
for element in search2:
    anime = element.text
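To see why the first loop is wrong: i is never incremented, so search2[i].text reads the same first element on every pass. A browser-free sketch with plain stand-in objects (FakeElement is invented for illustration, not a Selenium class):

```python
# Stand-in for a Selenium WebElement: just an object with a .text attribute.
class FakeElement:
    def __init__(self, text):
        self.text = text

search2 = [FakeElement("Naruto"), FakeElement("Bleach"), FakeElement("One Piece")]

# Wrong usage: i stays 0, so every iteration reads the first element.
i = 0
wrong = [search2[i].text for _ in search2]
print(wrong)   # ['Naruto', 'Naruto', 'Naruto']

# Proper usage: iterate over the elements themselves.
proper = [element.text for element in search2]
print(proper)  # ['Naruto', 'Bleach', 'One Piece']
```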
If you want to go to the page of the first anime that comes up as a result of the search, you can use code like this:
anime = driver.find_element(By.XPATH, "//div[@class='js-scrollfix-bottom-rel']/article/div/div[2]/div[1]/a[1][text()='Date A Live']")
driver.get(anime.get_attribute("href"))
Since we are looking for a single element rather than a list, we can use it directly in driver.get without assigning it to a variable.
driver.get(driver.find_element(By.XPATH, "//div[@class='js-scrollfix-bottom-rel']/article/div/div[2]/div[1]/a[1][text()='Date A Live']").get_attribute("href"))
You can call the link in the 'href' element found in the anime result with the get_attribute method
If an element is searched for before it has loaded, an error is raised because the element is not found. For this, WebDriverWait is used, which waits for the element to be loaded.
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

anime = "Date A Live"
driver.get(WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, f"//div[@class='js-scrollfix-bottom-rel']/article/div/div[2]/div[1]/a[1][text()='{anime}']"))).get_attribute("href"))
driver.get("https://myanimelist.net/anime.php")
searchBar = driver.find_element(By.XPATH, "//input[@id='q']")
anime = "Date A Live"
searchBar.send_keys(anime)
searchBar.send_keys(Keys.ENTER)
driver.get(WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, f"//table/tbody/tr/td[2]/div/a[strong='{anime}']"))).get_attribute("href"))
Your code is not working because driver.find_elements() returns a list, even if it only finds one element. You are probably getting an error like: 'list' object has no attribute 'click'.
I think an easier way would be to find the element whose text matches your input string. The driver.find_element_by_xpath() method can do this. If your input string is stored in the variable b, you could do it with something like:
link = driver.find_element_by_xpath(f"//a[contains(text(),'{b}')]")  # a is the <a> tag and b is your input text
link.click()
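One caveat: an f-string like f"//a[contains(text(),'{b}')]" breaks as soon as the title itself contains a single quote (e.g. "JoJo's Bizarre Adventure"). A sketch of a workaround; the helper name xpath_literal is my own invention, and it builds a quote-safe XPath string literal, falling back to concat() when both quote kinds appear:

```python
def xpath_literal(s):
    """Return an XPath string literal for s, safe even if s contains quotes."""
    if "'" not in s:
        return f"'{s}'"
    if '"' not in s:
        return f'"{s}"'
    # Mixed quotes: split on single quotes and rejoin with concat().
    parts = s.split("'")
    return "concat(" + ", \"'\", ".join(f"'{p}'" for p in parts) + ")"

b = "JoJo's Bizarre Adventure"
xpath = f"//a[contains(text(),{xpath_literal(b)})]"
print(xpath)  # //a[contains(text(),"JoJo's Bizarre Adventure")]
```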
Trying to scrape a website, I created a loop and was able to locate all the elements. My problem is that the next button's id changes on every page, so I cannot use the id as a locator.
This is the next button on page 1:
<a rel="nofollow" id="f_c7" href="#" class="nextLink jasty-link"></a>
And this is the next button on page 2:
<a rel="nofollow" id="f_c9" href="#" class="nextLink jasty-link"></a>
Idea:
next_button = browser.find_elements_by_class_name("nextLink jasty-link")
next_button.click
I get this error message:
Message: no such element: Unable to locate element
The problem here might be that there are two next buttons on the page.
So I tried to create a list but the list is empty.
next_buttons = browser.find_elements_by_class_name("nextLink jasty-link")
print(next_buttons)
Any idea on how to solve my problem? Would really appreciate it.
This is the website:
https://fazarchiv.faz.net/faz-portal/faz-archiv?q=Kryptow%C3%A4hrungen&source=&max=10&sort=&offset=0&_ts=1657629187558#hitlist
There are two issues in my opinion:
Depending on where you access the site from, there is a cookie banner that may intercept the click, so you may have to accept it first:
browser.find_element_by_class_name('cb-enable').click()
To locate a single element (either of the two next buttons, it does not matter which), use browser.find_element() instead of browser.find_elements().
To select your element by multiple class names, use XPath:
next_button = browser.find_element(By.XPATH, '//a[contains(@class, "nextLink jasty-link")]')
or css selectors:
next_button = browser.find_element(By.CSS_SELECTOR, '.nextLink.jasty-link')
Note: to avoid the DeprecationWarning (find_element_by_* commands are deprecated; please use find_element() instead), additionally add the import from selenium.webdriver.common.by import By.
You can't select elements by multiple class names with find_element_by_class_name, so you can use find_elements_by_css_selector instead.
next_buttons = browser.find_elements_by_css_selector(".nextLink.jasty-link")
print(next_buttons)
You can then loop through the list and click the buttons:
next_buttons = browser.find_elements_by_css_selector(".nextLink.jasty-link")
for button in next_buttons:
    button.click()
Try the XPath below:
//a[contains(@class, 'step jasty-link')]/following-sibling::a
Situation
I'm using Selenium and Python to extract info from a page
Here is the div I want to extract from:
I want to extract the "Registre-se" and the "Login" text.
My code
from selenium import webdriver
url = 'https://www.bet365.com/#/AVR/B146/R^1'
driver = webdriver.Chrome()
driver.get(url.format(q=''))
elements = driver.find_elements_by_class_name('hm-MainHeaderRHSLoggedOutNarrow_Join ')
for e in elements:
    print(e.text)
elements = driver.find_elements_by_class_name('hm-MainHeaderRHSLoggedOutNarrow_Login ')
for e in elements:
    print(e.text)
Problem
My code doesn't produce any output.
HTML
<div class="hm-MainHeaderRHSLoggedOutNarrow_Join ">Registre-se</div>
<div class="hm-MainHeaderRHSLoggedOutNarrow_Login " style="">Login</div>
By looking at this HTML
<div class="hm-MainHeaderRHSLoggedOutNarrow_Join ">Registre-se</div>
<div class="hm-MainHeaderRHSLoggedOutNarrow_Login " style="">Login</div>
and your code, which looks okay to me, except the part where you are using find_elements for a single web element,
and by reading this comment
The class name "hm-MainHeaderRHSLoggedOutMed_Login " only appear in
the inspect of the website, but not in the page source. What it's
supposed to do now?
It is clear that the element is in either an iframe or a shadow root, because page_source does not include the contents of iframes.
Please check if it is in an iframe; if so, you'd have to switch to the iframe first, and then you can use the code that you have.
Switch to it like this:
driver.switch_to.frame(driver.find_element_by_xpath('xpath here'))
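To check whether an iframe is even present, you can scan the page's HTML for <iframe> tags: page_source shows the tags themselves, just not the framed document's elements. A browser-free sketch using only the standard library (the IframeFinder class, the sample HTML, and the /login-widget URL are invented for illustration; with Selenium you would feed driver.page_source to feed()):

```python
from html.parser import HTMLParser

class IframeFinder(HTMLParser):
    """Collect the src attribute of every <iframe> in an HTML document."""
    def __init__(self):
        super().__init__()
        self.iframes = []

    def handle_starttag(self, tag, attrs):
        if tag == "iframe":
            self.iframes.append(dict(attrs).get("src", ""))

# Invented sample; in Selenium, pass driver.page_source instead.
sample = '<html><body><iframe src="/login-widget"></iframe></body></html>'
finder = IframeFinder()
finder.feed(sample)
print(finder.iframes)  # ['/login-widget']
```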
On this website, I'm trying to find an element based on its XPATH, but the XPATH keeps changing. What's the next best alternative?
Snippet from website
<button class="KnkXXg vHcWfw T1alpA kiPMng AvEAGQ vM2UTA DM1_6g _-kwXsw Mqe1NA SDIrVw edrpZg" type="button" aria-expanded="true"><span class="nW7nAQ"><div class="VpIG5Q"></div></span></button>
XPATH:
//*[@id="__id15"]/div/div/div[1]/div[2]/div
#Sometimes id is a number between 15-18
//*[@id="__id23"]/div/div/div[1]/div[2]/div
#Sometimes id is a number between 13-23
Here's how I use the code:
element = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, """//*[@id="__id3"]/div/div/div[1]/div[2]/div/div/div/button"""))).click()
I've tried clicking the element by finding the button class, but for whatever reason it won't do anything.
element = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CLASS_NAME, "KnkXXg vHcWfw T1alpA kiPMng AvEAGQ vM2UTA DM1_6g _-kwXsw Mqe1NA SDIrVw edrpZg"))).click()
If part of the text keeps changing, you can use contains() in the XPath:
//*[contains(@id,"__id")]/div/div/div[1]/div[2]/div
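The same trick can be exercised without a browser: match on the stable "__id" prefix instead of the full, changing id. A sketch with the standard library (the sample markup and the id values are invented for illustration):

```python
import xml.etree.ElementTree as ET

# Invented sample: the numeric suffix of the id changes between page loads.
sample = """
<root>
  <button id="__id17"><span>target</span></button>
  <button id="stable">other</button>
</root>
"""
root = ET.fromstring(sample)

# Keep only the stable prefix of the id, ignoring the changing suffix.
matches = [el for el in root.iter("button")
           if el.get("id", "").startswith("__id")]
print(matches[0].get("id"))  # __id17
```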
I am trying to print off some housing prices and am having trouble using XPath. Here's my code:
from selenium import webdriver
driver = webdriver.Chrome("my/path/here")
driver.get("https://www.realtor.com/realestateandhomes-search/?pgsz=10")
for house_number in range(1, 11):
    try:
        price = driver.find_element_by_xpath(
            '//*[@id="{}"]/div[2]/div[1]'.format(house_number))
        print(price.text)
    except:
        print('couldnt find')
I am on this website, trying to print off the housing prices of the first ten houses.
My output is that for all the houses that say "NEW", that gets taken as the price instead of the actual price. But for the bottom two, which don't have that NEW sticker, the actual price is recorded.
How do I make my XPath selector so it selects the numbers and not NEW?
You can disable image loading like this, which can increase your fetching speed:
from selenium import webdriver

# Disable image loading
chrome_opt = webdriver.ChromeOptions()
prefs = {"profile.managed_default_content_settings.images": 2}
chrome_opt.add_experimental_option("prefs", prefs)
driver = webdriver.Chrome(chrome_options=chrome_opt, executable_path="my/path/here")
driver.get("https://www.realtor.com/realestateandhomes-search/Bladen-County_NC/sby-6/pg-1?pgsz=10")
for house_number in range(1, 11):
    try:
        price = driver.find_element_by_xpath('//*[@id="{}"]/div[2]/div[@class="srp-item-price"]'.format(house_number))
        print(price.text)
    except:
        print('couldnt find')
You're on the right track, you've just made an XPath that is too brittle. I would try making it a little more verbose, without relying on indices and wildcards.
Here's your XPath (I used id="1" for example purposes):
//*[@id="1"]/div[2]/div[1]
And here's the HTML (some attributes/elements removed for brevity):
<li id="1">
<div></div>
<div class="srp-item-body">
<div>New</div><!-- this is optional! -->
<div class="srp-item-price">$100,000</div>
</div>
</li>
First, replace the * wildcard with the element that you are expecting to contain the id="1". This simply serves as a way to help "self-document" the XPath a little bit better:
//li[@id="1"]/div[2]/div[1]
Next, you want to target the second <div>, but instead of searching by index, try to use the element's attributes if applicable, such as class:
//li[@id="1"]/div[@class="srp-item-body"]/div[1]
Lastly, you want to target the <div> with the price. Since the "New" text was in its own <div>, your XPath was targeting the first <div> ("New"), not the <div> with the price. Your XPath did, however, work when the "New" <div> was absent.
We can use a similar method as the previous step, targeting by attribute. This forces the XPath to always target the <div> with the price:
//li[@id="1"]/div[@class="srp-item-body"]/div[@class="srp-item-price"]
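As a browser-free sanity check, you can run both XPaths against that sample markup with the standard library's ElementTree, which supports exactly these positional and attribute predicates (the optional HTML comment is dropped and a <ul> wrapper added so the snippet parses as XML):

```python
import xml.etree.ElementTree as ET

sample = """
<ul>
  <li id="1">
    <div></div>
    <div class="srp-item-body">
      <div>New</div>
      <div class="srp-item-price">$100,000</div>
    </div>
  </li>
</ul>
"""
root = ET.fromstring(sample)

# Brittle version: second <div> child of <li>, then its first <div> child -> "New".
brittle = root.find('.//li[@id="1"]/div[2]/div[1]')
# Robust version: target the price <div> by its class attribute.
robust = root.find('.//li[@id="1"]/div[@class="srp-item-body"]/div[@class="srp-item-price"]')
print(brittle.text, robust.text)  # New $100,000
```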
Hope this helps!
And so... having said all of that, if you are just interested in the prices and nothing else, this would probably also work :)
for price in driver.find_elements_by_class_name('srp-item-price'):
    print(price.text)
Can you try this code:
from selenium import webdriver
driver = webdriver.Chrome()
driver.maximize_window()
driver.get("https://www.realtor.com/realestateandhomes-search/Bladen-County_NC/sby-6/pg-1?pgsz=10")
prices = driver.find_elements_by_xpath('//*[@class="data-price-display"]')
for price in prices:
    print(price.text)
It will print
$39,900
$86,500
$39,500
$40,000
$179,000
$31,000
$104,900
$94,900
$54,900
$19,900
Do let me know if any other details are also required.