Problem
I'm trying to use Python and Selenium to scrape this website, where the submenu only appears after the mouse hovers over the "Dossiê" menu.
Attempt
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
browser = webdriver.Chrome()
action = ActionChains(browser)
(...)
action.move_to_element(browser.find_element_by_id('menuItem1')).perform()
When this code is run, no exception is raised and the element is found, but the move_to_element call doesn't show the submenu.
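A common fix is to wait for the menu item to be present, hover over it, and then wait for the submenu entry to become visible before interacting with it. Here is a minimal sketch along those lines; the menuItem1 id comes from the question, while the URL and the submenu locator are hypothetical placeholders to adapt:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

browser = webdriver.Chrome()
browser.get("https://example.com")  # placeholder: the actual page URL

wait = WebDriverWait(browser, 10)
# Wait until the menu item exists in the DOM, then hover over it.
menu = wait.until(EC.presence_of_element_located((By.ID, "menuItem1")))
ActionChains(browser).move_to_element(menu).perform()

# Hypothetical submenu locator: replace it with the real one from the page.
submenu = wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "#menuItem1 ul li a")))
submenu.click()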
Related
I'm trying to scrape the list of services from this site, but I'm not able to click through to the next page.
This is what I've tried so far using Selenium & bs4:
#attempt1
next_pg_btn = browser.find_elements(By.CLASS_NAME, 'ui-lib-pagination_item_nav')
next_pg_btn.click() # nothing happens
#attempt2
browser.find_element(By.XPATH, "//div[@role = 'button']").click() # nothing happens
#attempt3 - saw in some Stack Overflow post that sometimes we need to scroll to the
#bottom of the page to make the button clickable, so tried that
browser.execute_script("window.scrollTo(0, 2500)")
browser.find_element(By.XPATH, "//div[@role = 'button']").click() # nothing happens
I'm not very experienced with scraping; please advise how to handle this and where I'm going wrong.
Thanks
Several issues with your code:
You used the wrong locators.
You probably need to wait for the element to be loaded before clicking it. (If you perform other actions on the page before clicking the pagination, this isn't needed, since by the time you scrape the page content the web elements have already loaded.)
The pagination button is at the bottom of the page, so you need to scroll to bring it into the visible viewport.
After scrolling, a short delay should be added, as you can see in the code below.
Then the pagination element can be clicked.
The following code works:
import time
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
options = Options()
options.add_argument("start-maximized")
webdriver_service = Service(r'C:\webdrivers\chromedriver.exe')  # raw string so the backslashes aren't treated as escape sequences
driver = webdriver.Chrome(options=options, service=webdriver_service)
wait = WebDriverWait(driver, 10)
url = "https://www.tamm.abudhabi/en/life-events/individual/HousingProperties"
driver.get(url)
pagination = wait.until(EC.presence_of_element_located((By.CLASS_NAME, "ui-lib-pagination__item_nav")))
# Accessing this property scrolls the element into view as a side effect.
pagination.location_once_scrolled_into_view
time.sleep(0.5)  # brief pause so the scroll settles before the click
pagination.click()
I want to scrape links to news articles using scrapy + selenium. The website I am using has a 'Load more' button, so I want Selenium to click this button to load all the articles.
I have looked for similar questions and tried various options already such as
element = driver.find_element(By.XPATH, value='//*[@id="fusion-app"]/main/div/div/div/div/div[4]/div/div/button')
driver.execute_script("arguments[0].click();", element)
and
element = WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, ".ais-InfiniteHits-loadMore")))
ActionChains(driver).move_to_element(element).click().perform()
All to no avail. I've also inserted some print statements in between to check whether the code actually runs, and that seems fine; I think it's just a matter of the button not being located or clicked.
This is the HTML of the button, by the way:
<button class="ais-InfiniteHits-loadMore">Load more </button>
And when I print element, this is what I get: <selenium.webdriver.remote.webelement.WebElement (session="545716eef622a12bdbeddef99e02bdef", element="551741ec-4616-4bd4-b8fd-57c2f4bffb00")>
Is someone able to help me out? Thank you in advance.
You hit two different errors, a pop-up and an element click interception, when you can just use JavaScript to click that element:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from webdriver_manager.chrome import ChromeDriverManager

options = webdriver.ChromeOptions()
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)
driver.maximize_window()
wait = WebDriverWait(driver, 30)
driver.get('https://www.businessoffashion.com/search/?q=Louis+Vuitton&f=Articles%2CFashion+Shows%2CNews')
# Close the pop-up first.
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, 'button.ab-close-button'))).click()
# Click the "Load more" button via JavaScript so the click can't be intercepted.
elem = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, ".ais-InfiniteHits-loadMore")))
driver.execute_script("arguments[0].click()", elem)
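To load all the articles rather than just the next batch, you would typically repeat the click until the button stops appearing. A minimal sketch, assuming the button is removed from the DOM once everything is loaded:

import time
from selenium.common.exceptions import TimeoutException

while True:
    try:
        # Re-locate the button on every iteration, since the page re-renders.
        elem = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, ".ais-InfiniteHits-loadMore")))
    except TimeoutException:
        break  # button is gone, so all articles are loaded
    driver.execute_script("arguments[0].click()", elem)
    time.sleep(1)  # give the new results a moment to render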
I'm new to web scraping. I want to click the "Afficher plus" ("Show more") button, which has the HTML code below.
<button class="more-results js-more-results" id="search-more-results">Afficher plus de résultats</button>
I tried the code below in Selenium, but it doesn't click the button.
driver = webdriver.Safari()
driver.get(carte)
bt = driver.find_element_by_id("search-more-results")
bt.click()
"carte" is the link the of web page I want to scrape.
1. Check whether this button is inside an iframe (see the sketch at the end of this answer).
2. If not, try waiting until this button is clickable.
3. Regarding '"carte" is the link of the web page I want to scrape': you are using the get() function incorrectly. It is used to open a page, not to get a link from a page.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Safari()
driver.get("YOUR WEB PAGE")
wait = WebDriverWait(driver, 30)
bt = wait.until(EC.element_to_be_clickable((By.ID, "search-more-results")))
bt.click()
If this doesn't work, try a CSS selector or XPath. Example for CSS:
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#search-more-results")))
driver.find_element(By.CSS_SELECTOR, "#search-more-results").click()
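For the iframe case from point 1: if the button does sit inside a frame, you have to switch into it before locating the button. A minimal sketch, with a hypothetical frame locator you would need to adapt:

# Hypothetical locator: point it at the actual <iframe> on the page.
frame = wait.until(EC.presence_of_element_located((By.TAG_NAME, "iframe")))
driver.switch_to.frame(frame)
driver.find_element(By.ID, "search-more-results").click()
driver.switch_to.default_content()  # switch back to the main document afterwards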
I'm trying to use ActionChains to click a button with Python, but it just refuses to work no matter what I do.
The issue is that whenever the website opens, it opens with an overlay.
I want my program to click the 'OK' button on the overlay. Whatever code I write just ends up clicking the overlay itself.
Here is my code:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.action_chains import ActionChains
driver = webdriver.Chrome()
URL = 'https://sam.gov/search/?index=opp&page=1&sort=-relevance&sfm%5Bstatus%5D%5Bis_active%5D=true&sfm%5Bdates%5D%5BresponseDue%5D%5BresponseDueSelect%5D=customDate&sfm%5Bdates%5D%5BresponseDue%5D%5BresponseDueFrom%5D=05%2F29%2F2021&sfm%5Bdates%5D%5BresponseDue%5D%5BresponseDueTo%5D=05%2F29%2F2022&sfm%5Bkeywords%5D%5B0%5D%5Bkey%5D=541511&sfm%5Bkeywords%5D%5B0%5D%5Bvalue%5D=541511'
driver.get(URL)
overlay = driver.find_element(By.ID, "cdk-overlay-0")
button = driver.find_element(By.CSS_SELECTOR, "button.usa-button")
ActionChains(driver).move_to_element(overlay).click(button).perform()
And the relevant HTML from the webpage I'm looking at is:
<div id="cdk-overlay-0" class="cdk-overlay-pane" ...>
for the overlay, and
<button class="usa-button">OK</button>
for the button itself.
My code always ends up clicking just the overlay and not the button (the overlay gets a blue outline when it is clicked).
Yo, try this
from selenium import webdriver
URL = 'https://sam.gov/search/?index=opp&page=1&sort=-relevance&sfm%5Bstatus%5D%5Bis_active%5D=true&sfm%5Bdates%5D%5BresponseDue%5D%5BresponseDueSelect%5D=customDate&sfm%5Bdates%5D%5BresponseDue%5D%5BresponseDueFrom%5D=05%2F29%2F2021&sfm%5Bdates%5D%5BresponseDue%5D%5BresponseDueTo%5D=05%2F29%2F2022&sfm%5Bkeywords%5D%5B0%5D%5Bkey%5D=541511&sfm%5Bkeywords%5D%5B0%5D%5Bvalue%5D=541511'
driver = webdriver.Chrome()
driver.get(URL)
driver.find_element_by_xpath(r'/html/body/div/div[2]/div/sds-dialog-container/layout-splash-modal/div[4]/div[2]/div/button').click()
I just used the XPath and clicked it directly, without ActionChains.
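That absolute XPath is brittle, though; it breaks as soon as the page structure changes. A sturdier sketch that waits for the OK button, matching on the usa-button class and the "OK" label from the question's HTML:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

wait = WebDriverWait(driver, 10)
# Match the button by its class and its visible "OK" label.
ok_button = wait.until(EC.element_to_be_clickable(
    (By.XPATH, "//button[contains(@class, 'usa-button') and normalize-space()='OK']")))
ok_button.click()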
I'm trying to crawl the website "http://everydayhealth.com". However, I found that the page is dynamically rendered: when I click the "More" button, new news items are shown. Using splinter to click the button doesn't make browser.html update to the current HTML content, though. Is there a way to get the newest HTML source, using either splinter or Selenium? My code in splinter is as follows:
import requests
from bs4 import BeautifulSoup
from splinter import Browser
browser = Browser()
browser.visit('http://everydayhealth.com')
browser.click_link_by_text("More")
print(browser.html)
Based on @Louis's answer, I rewrote the program as follows:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
driver = webdriver.Firefox()
driver.get("http://www.everydayhealth.com")
more_xpath = '//a[@class="btn-more"]'
more_btn = WebDriverWait(driver, 10).until(lambda driver: driver.find_element_by_xpath(more_xpath))
more_btn.click()
more_news_xpath = '(//a[#href="http://www.everydayhealth.com/recipe-rehab/5-herbs-and-spices-to-intensify-flavor.aspx"])[2]'
WebDriverWait(driver, 5).until(lambda driver: driver.find_element_by_xpath(more_news_xpath))
print(driver.execute_script("return document.documentElement.outerHTML;"))
driver.quit()
However, in the output text I still couldn't find the text from the updated page. For example, when I search for "Is Milk Your Friend or Foe?", it returns nothing. What's the problem?
With Selenium, assuming that driver is your initialized WebDriver object, this will give you the HTML that corresponds to the state of the DOM at the time you make the call:
driver.execute_script("return document.documentElement.outerHTML;")
The return value is a string so you could do:
print(driver.execute_script("return document.documentElement.outerHTML;"))
When I use Selenium for tasks like this, I know browser.page_source does get updated.
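Whichever call you use, you can feed the updated source straight back into a parser. A small sketch with BeautifulSoup (assuming bs4 is installed, and using a hypothetical link filter):

from bs4 import BeautifulSoup

# driver.page_source reflects the DOM after the "More" click.
soup = BeautifulSoup(driver.page_source, "html.parser")
for link in soup.select('a[href*="everydayhealth.com"]'):
    print(link.get_text(strip=True), link["href"])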