how to accept a forced cookie with selenium webbrowser - python

I'm trying to accept a cookie popup and have looked at a similar question.
This is how the popup looks: https://i.stack.imgur.com/FxChW.png
I have tried different things. For instance, I tried following the other question's solution, like so:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
browser = webdriver.Firefox()
browser.get('https://www.volkskrant.nl/best-gelezen?utm_source=pocket_mylist')
wait = WebDriverWait(browser, 4)
element = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, 'button.message-component:nth-child(1)')))
This, however, gives an error.
I tried a bunch of different things but I can't seem to select anything on the page (at least nothing from the popup).
I know this question has already been asked a couple of times but I did not find a solution.
Has anyone else encountered this problem and knows how to accept the cookies so I can get to the regular site?
thanks in advance!

The button is inside an iframe:
iframe[title='Iframe title']
You need to switch to it first in Selenium:
wait = WebDriverWait(driver, 10)
wait.until(EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR, "iframe[title='Iframe title']")))
After this, you can click the accept cookies button.
Full code:
driver = webdriver.Chrome(driver_path)
driver.maximize_window()
driver.implicitly_wait(30)
driver.get("https://www.volkskrant.nl/best-gelezen?utm_source=pocket_mylist")
wait = WebDriverWait(driver, 10)
wait.until(EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR, "iframe[title='Iframe title']")))
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button[title='Akkoord']"))).click()

This element is inside an iframe.
So you first have to switch to that iframe, and only after that will you be able to accept the cookies.
browser = webdriver.Firefox()
browser.get('https://www.volkskrant.nl/best-gelezen?utm_source=pocket_mylist')
wait = WebDriverWait(browser, 20)
wait.until(EC.frame_to_be_available_and_switch_to_it((By.XPATH, "//iframe[contains(@src,'preload')]")))
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, 'button[title="Akkoord"]'))).click()
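When nothing inside the popup can be selected, it can help to confirm which iframe actually holds the button. The following is a minimal diagnostic sketch and not part of either answer: find_frame_containing is a hypothetical helper name, and it assumes the frames sit directly in the top-level document (no nested iframes). It relies only on find_elements, switch_to.frame and switch_to.default_content.

```python
def find_frame_containing(driver, frame_locator, target_locator):
    """Return the index of the first top-level frame that contains the target.

    Both locators are (By, value) tuples, e.g. (By.TAG_NAME, "iframe").
    Returns None when no frame contains a matching element.
    """
    frames = driver.find_elements(*frame_locator)
    for index, frame in enumerate(frames):
        driver.switch_to.frame(frame)
        try:
            # find_elements returns an empty list instead of raising.
            if driver.find_elements(*target_locator):
                return index
        finally:
            # Always return to the top-level document before the next frame.
            driver.switch_to.default_content()
    return None
```

For the page in the question this might be called as find_frame_containing(browser, (By.TAG_NAME, "iframe"), (By.CSS_SELECTOR, "button[title='Akkoord']")). A result of None would suggest the button is not in any top-level iframe, or has not loaded yet.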

Related

Selenium scrolling randomly

So I am trying to scrape data from a table from several hundred pages on a website. Here is part of what I have so far:
driver.get("link")
driver.maximize_window()
window_before = driver.window_handles[0]
driver.switch_to.window(window_before)
wait = WebDriverWait(driver, 10)
driver.execute_script("window.scrollTo(0, 350)")
games = driver.find_elements(By.XPATH, '//*[@id="schedule"]/tbody/tr')
This code only works sometimes. If I run this chunk 10 times, only 5 times will the website actually scroll down. I tried using this:
for i in range(0, 2): driver.find_element(By.XPATH, '//*[@id="meta"]/div[1]/p[1]/a').send_keys(Keys.DOWN)
but the same issue arises. Sometimes that scrolls down the amount I need, other times it does nothing, and other times it scrolls the entire page.
This part of my code navigates to the first link I need to click and on the next page I need to scroll another page, where the same issue is present. This is all part of a loop that goes through several hundred pages to read html tables, so even if it works the first 50 times, I won't get all the data I need.
Edit: Directly after the above snippet I have this:
for idx, game in enumerate(games):
    driver.find_element(By.XPATH, '/html/body/div[2]/div[6]/div[3]/div[2]/table/tbody/tr['+str(idx+1)+']/td[6]/a').click()
Which is where I get the "element is not clickable at point (X, Y)" error.
Am I doing something wrong here, or is there a work around to accomplish my goal?
Here is one way to access the href attribute of every 'Box Score' link on that page (according to OP's clarification in comments):
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.action_chains import ActionChains
chrome_options = Options()
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument('disable-notifications')
chrome_options.add_argument("window-size=1280,720")
webdriver_service = Service("chromedriver/chromedriver") ## path to where you saved chromedriver binary
browser = webdriver.Chrome(service=webdriver_service, options=chrome_options)
wait = WebDriverWait(browser, 20)
actions = ActionChains(browser)
url = 'https://www.basketball-reference.com/leagues/NBA_2014_games-october.html'
browser.get(url)
# print(browser.page_source)
# browser.maximize_window()
try:
    wait.until(EC.element_to_be_clickable((By.XPATH, '//div[@class="qc-cmp2-summary-section"]'))).click()
    print('clicked cookie parent')
    wait.until(EC.element_to_be_clickable((By.XPATH, '//button[@mode="primary"]'))).click()
    print('accepted cookies')
except Exception as e:
    print('no cookies')
wait.until(EC.element_to_be_clickable((By.XPATH, '//div[@id="all_schedule"]'))).location_once_scrolled_into_view
table_with_score_links = wait.until(EC.presence_of_element_located((By.XPATH, '//table[@id="schedule"]')))
# print(table_with_score_links.get_attribute('outerHTML'))
links_from_table = [x.get_attribute('href') for x in table_with_score_links.find_elements(By.TAG_NAME, 'a') if x.text == 'Box Score']
print(links_from_table)
Result printed in terminal:
clicked cookie parent
accepted cookies
['https://www.basketball-reference.com/boxscores/201310290IND.html', 'https://www.basketball-reference.com/boxscores/201310290MIA.html', 'https://www.basketball-reference.com/boxscores/201310290LAL.html', 'https://www.basketball-reference.com/boxscores/201310300CLE.html', 'https://www.basketball-reference.com/boxscores/201310300TOR.html', 'https://www.basketball-reference.com/boxscores/201310300PHI.html', 'https://www.basketball-reference.com/boxscores/201310300DET.html', 'https://www.basketball-reference.com/boxscores/201310300NYK.html', 'https://www.basketball-reference.com/boxscores/201310300NOP.html', 'https://www.basketball-reference.com/boxscores/201310300MIN.html', 'https://www.basketball-reference.com/boxscores/201310300HOU.html', 'https://www.basketball-reference.com/boxscores/201310300SAS.html', 'https://www.basketball-reference.com/boxscores/201310300DAL.html', 'https://www.basketball-reference.com/boxscores/201310300UTA.html', 'https://www.basketball-reference.com/boxscores/201310300PHO.html', 'https://www.basketball-reference.com/boxscores/201310300SAC.html', 'https://www.basketball-reference.com/boxscores/201310300GSW.html', 'https://www.basketball-reference.com/boxscores/201310310CHI.html', 'https://www.basketball-reference.com/boxscores/201310310LAC.html']
I tried to make variable names as descriptive as possible, and also left some commented out lines of code, to help with the thought process - build up to reach the end goal.
You can now go through those links one by one, etc.
Selenium documentation can be found here: https://www.selenium.dev/documentation/
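Going through the collected links one by one could look like the sketch below. visit_links and handle_page are hypothetical names that do not appear in the answer, and the pause between requests is an assumed default rather than anything the site documents.

```python
import time


def visit_links(driver, links, handle_page, delay_seconds=1.0):
    """Open each link in turn and collect whatever handle_page(driver) returns."""
    results = {}
    for link in links:
        driver.get(link)
        results[link] = handle_page(driver)
        # Pause between requests so the site is not hit in a tight loop.
        time.sleep(delay_seconds)
    return results
```

With the names from the answer, visit_links(browser, links_from_table, lambda d: d.page_source) would return a dict mapping each box score URL to its HTML, which can then be parsed for tables. Loading each URL directly avoids clicking links in the table, which is where the "element is not clickable" error came from.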

Selenium does NOT do anything after getting rid of GDPR consent

It's my first time using Selenium and webscraping. I have been stuck with the annoying GDPR iframe. I am simply trying to go to a website, search something in the search bar and then click in one of the results. But it does not seem to do anything after I get rid of the GDPR consent.
Importantly, it does not give any errors.
This is my very simple code:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time
#Web driver
driver = webdriver.Chrome(executable_path=r"C:\Program Files (x86)\chromedriver.exe")
driver.get("https://transfermarkt.co.uk/")
search = driver.find_element_by_name("query")
search.send_keys("Sevilla FC")
search.send_keys(Keys.RETURN)
WebDriverWait(driver,10).until(EC.frame_to_be_available_and_switch_to_it((By.ID, "sp_message_iframe_382445")))
WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, "//button[text()='ACCEPT ALL']"))).click()
try:
    sevfclink = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, "368")))
    sevfclink.click()
except:
    driver.quit()
time.sleep(5)
driver.quit()
Not sure where you get the iframe from but the id might be dynamic so try this.
driver.get("https://transfermarkt.co.uk/")
wait = WebDriverWait(driver,10)
search = wait.until(EC.element_to_be_clickable((By.NAME, "query")))
search.send_keys("Sevilla FC", Keys.RETURN)
wait.until(EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR,"iframe[id^='sp_message_iframe']")))
wait.until(EC.element_to_be_clickable((By.XPATH, "//button[text()='ACCEPT ALL']"))).click()
driver.switch_to.default_content()
try:
    sevfclink = wait.until(EC.element_to_be_clickable((By.ID, "368")))
    sevfclink.click()
except:
    pass
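The switch_to.default_content() call is the step that is easy to forget: until it runs, every lookup still happens inside the consent iframe. One way to make it automatic is a small context manager. This is only a sketch, and inside_frame is a hypothetical helper name, not a Selenium API.

```python
from contextlib import contextmanager


@contextmanager
def inside_frame(driver, frame):
    """Switch into frame, then always switch back to the top-level document."""
    driver.switch_to.frame(frame)
    try:
        yield driver
    finally:
        # Runs even if the body raises, so later lookups target the main page.
        driver.switch_to.default_content()
```

Here frame is anything switch_to.frame accepts (a WebElement, a name or id string, or an index). Code placed in a "with inside_frame(driver, frame):" block can click the consent button, and once the block exits, a lookup such as (By.ID, "368") runs against the main page again.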
It looks like the two lines starting with WebDriverWait throw an error. If I skip them and go straight to the try statement, I get the results of the search: a page showing an overview of Sevilla FC. I presume the WebDriverWait lines are there to wait for something, but from what I can tell they are unnecessary.

Selenium GDPR NoSuchElementException

I want to scrape some data from "https://www.techadvisor.co.uk/review/wearable-tech/". I figured out that looping through the pages with Beautifulsoup does not work. This is the reason why I tried to open it with selenium. The "Accept All" Button to overcome the GDPR blocker cannot be located.
I tried:
browser = webdriver.Chrome()
browser.get("https://www.techadvisor.co.uk/review/wearable-tech/")
# button = browser.find_element_by_xpath('/html/body/div/div[3]/div[5]/button[2]')
# WebDriverWait(browser, 20).until(EC.element_to_be_clickable((By.XPATH, "html/body/div/div[3]/div[5]/button[2]"))).click()
I always receive a NoSuchElementException.
To be honest, I found the Xpath really weird, but I got this from the Google Chrome inspect.
Every solution proposal or tip is appreciated :)
The Accept All button is inside an iframe, so you need to switch to the iframe first in order to click it.
Use WebDriverWait() with frame_to_be_available_and_switch_to_it() and the CSS selector below to switch to the iframe.
Then use WebDriverWait() with element_to_be_clickable() and the XPath below to click the button.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
browser = webdriver.Chrome()
browser.get("https://www.techadvisor.co.uk/review/wearable-tech/")
WebDriverWait(browser,10).until(EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR,"iframe[id^='sp_message_iframe']")))
WebDriverWait(browser, 10).until(EC.element_to_be_clickable((By.XPATH, "//button[text()='Accept All']"))).click()
I know the question is old,
but I would like to provide my own solution!
The first step is to identify the "id" of the form you are actually viewing, and then move the focus to it:
driver.switch_to_frame(driver.find_element_by_xpath('//*[@id="gdpr-consent-notice"]'))
cookies = driver.find_element_by_xpath('/html/body/app-root/app-theme/div/div/app-notice/app-theme/div/div/app-home/div/div[3]/div[2]/a[3]/span')
cookies.click()

Selenium Webdriver won't find element in xblock

I don't know if the question even makes sense - I'm very new to Python and Selenium and coding in general.
The story is I'm trying to automate the process of saving edX course webpages as HTML. I'm using the latest iPython and Webdriver. This is what I've done so far:
from selenium import webdriver
driver = webdriver.Chrome(executable_path=r'/Users/Khoa_Ngo\bin\chromedriver\chromedriver.exe')
driver.get('https://courses.edx.org/login')
#logging in
driver.find_element_by_id('login-email').send_keys('EMAIL')
driver.find_element_by_id('login-password').send_keys('PASSWORD')
driver.find_element_by_xpath('//*[@type="submit"]').click()
#choosing course
driver.find_element_by_xpath('//*[@href="/courses/course-v1:Microsoft+DEV262x+1T2020a/course/"]').click()
What I want to do next is to save the webpage as HTML, store it somewhere, and then click "Next" to proceed to the next course module and repeat. But I can't seem to locate the button. Here is what I've tried:
driver.find_element_by_xpath('/html/body/div[3]/div[2]/div[2]/div[1]/section[1]/main/div/div/div[1]/button[2]').click()
driver.find_element_by_css_selector('#sequence_adf942ea-fcee-289c-a1f8-3c557ee5fb15 > div.sequence-nav > button.sequence-nav-button.button-next')
I don't think this element is in an iframe. However it's in some kind of "xblock". I'm not sure how that will affect the selection.
This is the webpage I saved: https://drive.google.com/drive/folders/1Zr6sGO0j-H-Tze_lBgkLnQXuQA0pxsWr?usp=sharing
Is this information enough to answer my question? Thank you for your help!
Try the XPath below:
wait = WebDriverWait(driver, 10)
wait.until(EC.element_to_be_clickable((By.XPATH, "//button[@class='sequence-nav-button button-next'][contains(.,'Next')]"))).click()
or
wait = WebDriverWait(driver, 10)
wait.until(EC.element_to_be_clickable((By.XPATH, "//button[@class='sequence-nav-button-label'][contains(.,'Next')]"))).click()
You can also try a JavaScript click:
wait = WebDriverWait(driver, 10)
nextbutton = wait.until(EC.element_to_be_clickable((By.XPATH, "//button[@class='sequence-nav-button-label'][contains(.,'Next')]")))
driver.execute_script("arguments[0].click();", nextbutton)
Note: please add the imports below to your solution:
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
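For the saving step described in the question, driver.page_source holds the current page's HTML as a string. A minimal sketch, assuming a hypothetical helper name save_page_source and a file naming scheme (page_000.html, page_001.html, ...) chosen only for illustration:

```python
from pathlib import Path


def save_page_source(driver, directory, index):
    """Write the current page's HTML to directory/page_NNN.html and return the path."""
    target_dir = Path(directory)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / f"page_{index:03d}.html"
    target.write_text(driver.page_source, encoding="utf-8")
    return target
```

A loop could call save_page_source(driver, "edx_pages", i) and then click the Next button with one of the waits above, repeating until the button can no longer be found.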

selenium chromedriver different values of xpath between terminal and actual driver

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
url = 'https://www.msha.gov/mine-data-retrieval-system'
driver = webdriver.Chrome(executable_path='chromedriver')
driver.get(url)
#driver.find_element_by_xpath('//*[@id="mstr90"]/div[1]/div/div') error
#driver.find_elements_by_xpath('//input') gives 3 while in driver gives 10
I am unable to find the input element labelled "Search by Mine ID by typing here..". The document is fully loaded, but Selenium can't locate it. What I want to do is simply enter "0100003" and then submit.
An iframe is present on your page. Before you interact with the input box you need to switch to the iframe. Refer to the code below to resolve your issue.
driver.get("https://www.msha.gov/mine-data-retrieval-system")
wait = WebDriverWait(driver, 10)
driver.switch_to.frame("iframe1")
# Click the empty placeholder first so the real input box becomes available
wait.until(EC.element_to_be_clickable((By.XPATH, "//div[@class='mstrmojo-SimpleObjectInputBox-empty']"))).click()
inputBox1 = wait.until(EC.element_to_be_clickable((By.XPATH, "//div[@class='mstrmojo-SimpleObjectInputBox-container mstrmojo-scrollNode']//input")))
inputBox1.send_keys("0100003")
Updated code to handle the dropdown:
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "div#mstr100,mstrmojo-Popup.mstrmojo.SearchBoxSelector-suggest"))).click()
Note: please add the imports below to your solution:
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
The element you are trying to find is inside an iframe, so you will need to switch to that iframe first and then find the element. It is also best practice to use waits to give pages and elements time to load, so that a find does not time out and throw an error.
iframe = WebDriverWait(driver, 15).until(EC.presence_of_element_located((By.CSS_SELECTOR, '#iframe1')))
driver.switch_to.frame(iframe)
mine_id = WebDriverWait(driver, 15).until(EC.presence_of_element_located((By.XPATH, '//*[@id="mstr90"]/div[1]/div/div')))
Then you need to click this element to make it interactable.
mine_id.click()
Once you have clicked it, you need to re-find the input box before sending keys.
mine_id_input = WebDriverWait(driver, 15).until(EC.presence_of_element_located((By.CSS_SELECTOR, '#mstr90 input')))
mine_id_input.send_keys('0100003')
To select the suggestion displayed:
suggestion = WebDriverWait(driver, 15).until(EC.presence_of_element_located((By.CSS_SELECTOR, '#mstr100')))
suggestion.click()
If you want to continue interacting with elements outside the iframe after this is done, switch back out of the iframe like this:
driver.switch_to.default_content()
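To make the idea of an explicit wait concrete, here is a simplified stand-in written in plain Python. It is not Selenium's implementation of WebDriverWait; wait_until is a hypothetical name, and the sketch only illustrates the pattern of polling a condition until it returns something truthy or the timeout expires.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)
```

For example, wait_until(lambda: driver.find_elements(By.CSS_SELECTOR, '#mstr90 input')) keeps checking until the list of matches is non-empty and then returns it. In real scripts WebDriverWait with expected_conditions, as used in the answers above, is the better choice.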
