I want to scrape links to news articles using scrapy + selenium. The website I am using uses a 'Load more' button, so I obviously want selenium to click on this button to load all articles.
I have looked for similar questions and tried various options already such as
element = driver.find_element(By.XPATH, value='//*[#id="fusion-app"]/main/div/div/div/div/div[4]/div/div/button')
driver.execute_script("arguments[0].click();", element)
and
element = WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, ".ais-InfiniteHits-loadMore")))
ActionChains(driver).move_to_element(element).click().perform()
All to no result. I've also inserted some print statements in between to check whether it does run the code, and that seems to work fine; I think it's just a matter of the button not being located/clicked on.
This is the html of the button btw:
<button class="ais-InfiniteHits-loadMore">Load more </button>
And when I print element, this is what I get: <selenium.webdriver.remote.webelement.WebElement (session="545716eef622a12bdbeddef99e02bdef", element="551741ec-4616-4bd4-b8fd-57c2f4bffb00")>
Is someone able to help me out? Thank you in advance.
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()),options=options)
driver.maximize_window()
wait = WebDriverWait(driver, 30)
driver.get('https://www.businessoffashion.com/search/?q=Louis+Vuitton&f=Articles%2CFashion+Shows%2CNews')
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR,'button.ab-close-button'))).click()
elem=wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, ".ais-InfiniteHits-loadMore")))
driver.execute_script("arguments[0].click()", elem)
You hit two different errors with a pop up and an element click intereception when you can just use javascript to click that element.
Import:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
Related
I am making a screen scraper using the Selenium Python library and I have already made some code so that I can log in. For some reason I am now stuck on the main menu and can't select any of the options. I have tried using CSS Selector, Class Name, and XPATH and none have been able to select any of the possible choices. No matter what happens, I always get a TimeoutException even with a long delay.
The portion of the page I am trying to scrape from is here.
The relevant code is as follows:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
# Open browser and go to the Webex login page
driver = webdriver.Chrome()
driver.get('https://admin.webex.com')
delay = 10 # seconds
long_delay = 20
# Login portion removed
# Menu selection goes here.
# I have tried the following with no luck
# The following lines produce a TimeoutException error
# Selecting menu item
WebDriverWait(driver, long_delay).until(EC.presence_of_element_located((By.XPATH, "span[#class='left-nav-item__link']"))).click()
WebDriverWait(driver, long_delay).until(EC.presence_of_element_located((By.XPATH, '//mch-left-nav-item-group[4]/ul/mch-left-nav-item[3]/li/span'))).click()
WebDriverWait(driver, long_delay).until(EC.presence_of_element_located((By.XPATH, "//webex-root/webex-main[#class='control-hub-container']//webex-sidebar/mch-left-nav/nav/mch-left-nav-item-group[4]/ul/mch-left-nav-item[3]//span[#class='left-nav-item__link']"))).click()
WebDriverWait(driver, long_delay).until(EC.presence_of_element_located((By.CSS_SELECTOR, "span[class='left-nav-item__link']"))).click()
WebDriverWait(driver, delay).until(EC.presence_of_element_located((By.CSS_SELECTOR, "[aria-label] mch-left-nav-item-group:nth-of-type(4) mch-left-nav-item:nth-of-type(3) .left-nav-item__link"))).click()
# Selecting group of items
WebDriverWait(driver, delay).until(EC.presence_of_all_elements_located((By.CLASS_NAME, 'left-nav-item')))
WebDriverWait(driver, delay).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, 'li.left-nav-item')))
# Selecting parent
WebDriverWait(driver, long_delay).until(EC.presence_of_element_located((By.CSS_SELECTOR, "li[data-test-name='calling']")))
Does anyone have an idea why I'm not able to select any element?
Try xpath for the third one:
WebDriverWait(driver, long_delay).until(EC.presence_of_element_located((By.XPATH, "span[#class='left-nav-item__link']"))).click()
To anyone having this issue.
When a new tab is opened while you are using Selenium, even though you are viewing the new page, Selenium is not. You have to switch to the correct window. Something such as the following can help you with that.
driver.switch_to.window(driver.window_handles[NUMBER])
I want to scrape some data from "https://www.techadvisor.co.uk/review/wearable-tech/". I figured out that looping through the pages with Beautifulsoup does not work. This is the reason why I tried to open it with selenium. The "Accept All" Button to overcome the GDPR blocker cannot be located.
I tried:
browser = webdriver.Chrome()
browser.get("https://www.techadvisor.co.uk/review/wearable-tech/")
# button = browser.find_element_by_xpath('/html/body/div/div[3]/div[5]/button[2]')
# WebDriverWait(browser, 20).until(EC.element_to_be_clickable((By.XPATH, "html/body/div/div[3]/div[5]/button[2]"))).click()
I always receive NoSuchElementException
To be honest, I found the Xpath really weird, but I got this from the Google Chrome inspect.
Every solution proposal or tip is appreciated :)
To click on Accept All button which is inside an iframe.You need to switch to iframe first in order to click the button.
Induce WebDriverWait() and wait for frame_to_be_available_and_switch_to_it() and use the following css selector.
Induce WebDriverWait() and wait for element_to_be_clickable() and use the following xpath selector.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
browser = webdriver.Chrome()
browser.get("https://www.techadvisor.co.uk/review/wearable-tech/")
WebDriverWait(browser,10).until(EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR,"iframe[id^='sp_message_iframe']")))
WebDriverWait(browser, 10).until(EC.element_to_be_clickable((By.XPATH, "//button[text()='Accept All']"))).click()
I know the question is old,
but i would like provide my own solution!
First step is to recognize the "id" of the form that you are actually view, and then you need to move the focus on it!
driver.switch_to_frame(driver.find_element_by_xpath('//*[#id="gdpr-consent-notice"]'))
cookies = driver.find_element_by_xpath('/html/body/app-root/app-theme/div/div/app-notice/app-theme/div/div/app-home/div/div[3]/div[2]/a[3]/span')
cookies.click()
When I run the code, the website loads up fine but then it won't click on the button- an error appears saying the element is not interacterble. What do I need to do to click the button? I am relatively new to this and would be grateful for any help.
I have already tried finding it by id and tag.
page = driver.get("https://kenpreston.co.uk/author/")
element = driver.find_element_by_id('mk-button-31')
element.click()
SOLVED:
I used driver.find_element_by_link_text and this worked fine.
I have checked the website and noticed that mk-button-31 is an id for a div tag and inside it there is an a tag. Try getting the url from the a tag and do another driver.get instead of clicking on it.
Also the whole div tag is not clickable so that is why you are getting this error.
Use sleep from time library to be sure page fully loaded
from time import sleep
page = driver.get("https://kenpreston.co.uk/author/")
sleep(2)
element = driver.find_element_by_id('mk-button-31')
element.click()
Looks like your element is not clickable you need to replace this id selector with the css and need to wait for the element before click on it.
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
page = driver.get("https://kenpreston.co.uk/author/")
element = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#mk-button-31 span"))
element.click();
Consider adding Explicit Wait to your script as it might be the case the DOM had finished loading and the button you're looking for is still not there.
The classes you're looking for are:
WebDriverWait
expected_conditions
Suggested code change:
#your other imports here
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions
#your other code here
page = driver.get("https://kenpreston.co.uk/author/")
element = WebDriverWait(driver, 10).until(expected_conditions.element_to_be_clickable((By.ID, "mk-button-31")))
element.click()
More information: How to use Selenium to test web applications using AJAX technology
So I'm scraping using selenium and I want to click 'next' button in 'Defensive' section but the code I wrote clicks 'next' on 'Summary'.
Here's the url for you to try :
https://www.whoscored.com/Regions/252/Tournaments/2/Seasons/7361/Stages/16368/PlayerStatistics/England-Premier-League-2018-2019
So it's selecting 'Defensive' and I can see it selected in the window but the next page doesnt appear. On clicking 'Summary' I found out next function is actually happening there.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
browser= webdriver.Chrome(executable_path ="C:\Program Files (x86)\Google\Chrome\chromedriver.exe")
browser.get('https://www.whoscored.com/Regions/252/Tournaments/2/Seasons/7361/Stages/16368/PlayerStatistics/England-Premier-League-2018-2019')
browser.find_element_by_xpath("""//*[#id="stage-top-player-stats-options"]/li[2]/a""").click()
element = WebDriverWait(browser, 20).until(EC.presence_of_element_located((By.XPATH, """//*[#id="next"]""")))
browser.execute_script("arguments[0].click();", element)
The xpath for next button is not unique for this page. try this,
element = WebDriverWait(browser, 20).until(EC.presence_of_element_located((By.XPATH, "//*[#id='stage-top-player-stats-defensive']//a[#id='next']")))
browser.execute_script("arguments[0].click();", element)
or
element = WebDriverWait(browser, 20).until(EC.presence_of_element_located((By.XPATH, "//*[#id='stage-top-player-stats-defensive']//a[#id='next']")))
element.click()
For each tab (Summary, Defensive, ..) new next button with same id=next added to the DOM.
Select Defensive and you will see there will be two next buttons with same id=next, select Offensive and there will be three next buttons.
With basic id=next selector you always click to the first next button from Summary tab. Because you're using JavaScript and nothing happen, try to click with Selenium click method and you will get an error.
To solve the problem adjust your selector to be more specific to the dom - #statistics-paging-defensive #next.
Also when you first time open the page there's cookies acceptance screen appears and block the page, you can use method like below to skip it.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import selenium.common.exceptions as EX
def accept_cookies():
try:
WebDriverWait(browser, 20)\
.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button.qc-cmp-button")))\
.click()
except EX.NoSuchElementException or EX.TimeoutException:
pass
#...
browser = webdriver.Chrome(executable_path ="C:\Program Files (x86)\Google\Chrome\chromedriver.exe")
browser.get('https://www.whoscored.com/Regions/252/Tournaments/2/Seasons/7361/Stages/16368/PlayerStatistics/England-Premier-League-2018-2019')
wait = WebDriverWait(browser, 20)
browser.get(baseUrl)
accept_cookies()
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "[href='#stage-top-player-stats-defensive']"))).click()
next_button = wait.until(
EC.element_to_be_clickable((By.CSS_SELECTOR, "#statistics-paging-defensive #next")))
next_button.click()
Your elements locators must be unique
Avoid using XPath wildcards - * as it will cause performance degradation and prolonged elements lookup timings
Avoid using JavaScriptExecutor for clicking, well-behaved Selenium test must do what real user does and I doubt that real user will be opening browser console and typing something like document.getElementById('next').click(), he will use the mouse
Assuming all above you should come up with a selector which uniquely identifies next button on Defensive tab which would be something like:
//div[#id='statistics-paging-defensive']/descendant::a[#id='next']
References:
XPath Tutorial
XPath Axes
XPath Operators & Functions
I am trying to write a program which will interact with a web application.
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
browser = webdriver.Firefox()
browser.get('https://www.easports.com/fifa/ultimate-team/web-app/')
element = WebDriverWait(browser, 20).until(EC.presence_of_element_located(\
(By.ID, 'Login')))
element.click()
print("Found %s" % element)
print("locatione %s" % element.location)
By design it should've open the browser, wait for page to load and then find and click "Login" button. "Login" button is found, but it doesn't click on it. Location is obviously printing wrong coordinates too. What are possible reasons and solutions for this problem?
Try locating the button with another logic. For example, via class:
element = browser.find_element_by_class_name('call-to-action')
element.click()
To find and click the Login button you can use the following line of code :
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//button[#class='standard call-to-action']"))).click()