Python Selenium Webdriver - Try except loop

I'm trying to automate processes on a webpage that loads frame by frame. I want to set up a try-except loop which executes only after an element is confirmed present. This is the code I've set up:
import time

from selenium.common.exceptions import NoSuchElementException

while True:
    try:
        link = driver.find_element_by_xpath(linkAddress)
    except NoSuchElementException:
        time.sleep(2)
The above code does not work, while the following naive approach does:
time.sleep(2)
link = driver.find_element_by_xpath(linkAddress)
Is there anything missing in the above try-except loop? I've tried various combinations, including using time.sleep() before try rather than after except.
Thanks

The answer to your specific question is:
import time

from selenium.common.exceptions import NoSuchElementException

link = None
while not link:
    try:
        link = driver.find_element_by_xpath(linkAddress)
    except NoSuchElementException:
        time.sleep(2)
However, there is a better way to wait until an element appears on a page: explicit waits

Another way could be:
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By

try:
    element = WebDriverWait(driver, 2).until(
        EC.presence_of_element_located((By.XPATH, linkAddress))
    )
except TimeoutException as ex:
    print(ex.msg)
Inside the WebDriverWait call, pass the driver instance and the number of seconds to wait.

Related

Python Selenium - Using Try and except in a loop for automated reading on a website

I'm trying to create an automatic SoundCloud player. I started learning Selenium in Python a few days ago; I'm still a beginner.
I would like a loop so that when the script enters the page, it clicks on "I accept" (for the cookies), then presses the space key, and refreshes after 60 s.
Here is the code:
from selenium import webdriver
from time import sleep
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from typing import KeysView
from selenium.webdriver.common.keys import Keys
from selenium.common.exceptions import NoSuchElementException
driver = webdriver.Chrome(executable_path="chromedriver.exe")
driver.implicitly_wait(17)
driver.get("https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=&cad=rja&uact=8&ved=2ahUKEwjkk_fNxM_6AhXrgc4BHWcSCjoQFnoECAYQAQ&url=https%3A%2F%2Fsoundcloud.com%2Fuser-997915825%2Fpmh-ah-aah&usg=AOvVaw2vFzZHA4d8DH1r5GjZ6e4y")
try:
element = driver.find_element(By.ID, 'onetrust-accept-btn-handler')
action = ActionChains(driver)
action.click(on_element = element)
action.perform()
driver.implicitly_wait(33)
except NoSuchElementException:
ActionChains(driver).key_down(Keys.SPACE).key_up(Keys.SPACE).perform()
while True:
sleep(60)
driver.refresh()
But when I put in try: and then except NoSuchElementException:, I don't even have time to run it before I get errors on those lines such as:
Try statement must have at least one except or finally clause. Pylance [Ln 17]
Expected expression. Pylance [Ln 28]
I think I imported the right modules:
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException
And finally when I try to run it I have this nice error : SyntaxError: expected 'except' or 'finally' block.
When I remove try and except as follows:
from selenium import webdriver
from time import sleep
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from typing import KeysView
from selenium.webdriver.common.keys import Keys
from selenium.common.exceptions import NoSuchElementException
driver = webdriver.Chrome(executable_path="chromedriver.exe")
driver.implicitly_wait(17)
driver.get("https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=&cad=rja&uact=8&ved=2ahUKEwjkk_fNxM_6AhXrgc4BHWcSCjoQFnoECAYQAQ&url=https%3A%2F%2Fsoundcloud.com%2Fuser-997915825%2Fpmh-ah-aah&usg=AOvVaw2vFzZHA4d8DH1r5GjZ6e4y")
element = driver.find_element(By.ID, 'onetrust-accept-btn-handler')
action = ActionChains(driver)
action.click(on_element = element)
action.perform()
driver.implicitly_wait(33)
ActionChains(driver).key_down(Keys.SPACE).key_up(Keys.SPACE).perform()
while True:
    sleep(60)
    driver.refresh()
The script enters the page, clicks on "I accept", and then presses the SPACE key to start playback, but when it refreshes there is no cookie-acceptance window anymore, so it never starts playing again.
That's why I'm looking to use conditions. I've searched several threads about this and tried to understand their code before copying it, but so far all the methods I've tried have failed. It has been blocking me constantly for two days and my eyes are starting to hurt.
If anyone can find a solution to my problem I'd be delighted, whether the problem is in the code or in me.
Thanks for your correction, Prophet, here is the new code:
from selenium import webdriver
from time import sleep
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from typing import KeysView
from selenium.webdriver.common.keys import Keys
from selenium.common.exceptions import NoSuchElementException
driver = webdriver.Chrome(executable_path="chromedriver.exe")
driver.get("https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=&cad=rja&uact=8&ved=2ahUKEwjkk_fNxM_6AhXrgc4BHWcSCjoQFnoECAYQAQ&url=https%3A%2F%2Fsoundcloud.com%2Fuser-997915825%2Fpmh-ah-aah&usg=AOvVaw2vFzZHA4d8DH1r5GjZ6e4y")
try:
element = driver.find_element(By.ID, 'onetrust-accept-btn-handler')
action = ActionChains(driver)
action.click(on_element = element)
action.perform()
except NoSuchElementException:
ActionChains(driver).key_down(Keys.SPACE).key_up(Keys.SPACE).perform()
while True:
sleep(7)
driver.refresh()
You are missing indentation.
This is a very basic python syntax rule.
The code inside try or except or any other block must be indented relative to the preceding code.
With the indentation your code could look like this:
driver = webdriver.Chrome(executable_path="chromedriver.exe")
driver.implicitly_wait(17)
driver.get("https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=&cad=rja&uact=8&ved=2ahUKEwjkk_fNxM_6AhXrgc4BHWcSCjoQFnoECAYQAQ&url=https%3A%2F%2Fsoundcloud.com%2Fuser-997915825%2Fpmh-ah-aah&usg=AOvVaw2vFzZHA4d8DH1r5GjZ6e4y")
try:
    element = driver.find_element(By.ID, 'onetrust-accept-btn-handler')
    action = ActionChains(driver)
    action.click(on_element=element)
    action.perform()
    driver.implicitly_wait(33)
except NoSuchElementException:
    ActionChains(driver).key_down(Keys.SPACE).key_up(Keys.SPACE).perform()
Also, driver.implicitly_wait(33) is not a pause command.
driver.implicitly_wait is set once per session. It sets the timeout: how long to keep trying to find (polling) an element on the page.
If you want a pause, time.sleep() should be used, though this is discouraged, since hardcoded sleeps are generally not recommended in Selenium code.
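To illustrate the difference: both implicit and explicit waits boil down to a polling loop with a deadline, not a fixed pause. A minimal pure-Python sketch of that idea (the `condition` callable and timings here are illustrative, not Selenium's actual internals):

```python
import time

def wait_until(condition, timeout=10.0, poll=0.5):
    """Poll `condition` until it returns a truthy value or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result  # succeed as soon as the condition holds
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)  # brief pause between polls, unlike one long sleep
```

Unlike a hardcoded sleep, such a loop returns as soon as the condition is met, which is why WebDriverWait is preferred over time.sleep().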

Python Selenium define custom TimeoutException

I have a pytest script with multiple classes, each containing a set of tests. Currently, every test in every class has the same TimeoutException handler defined. For example,
class Test1:
    def test_1(self):
        try:
            "do something"
        except TimeoutException:
            "handle exception"

    def test_2(self):
        try:
            "do something"
        except TimeoutException:
            "handle exception"

class Test2:
    def test_3(self):
        try:
            "do something"
        except TimeoutException:
            "handle exception"
The "handle exception" part is where I have the same code in every test. I was wondering if there is a more pythonic way to do this; it seems sloppy to have the same lines pasted into each test for my TimeoutException handler.
Any help is appreciated and if more information is desired please let me know.
Thanks in advance!
Maybe try adding explicit waits for the operations you want to handle. Explicit waits are designed for test cases where each test has its own logic. I assume your issue concerns Selenium, since you tagged your question with it. You can define a wait in every test, for every element you want to wait for.
To implement such an explicit wait, you can use the most popular tool designed for WebDrivers: WebDriverWait. You set the timeout duration and the condition that determines when the TimeoutException should be raised.
Here is an example from the official Selenium site:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Firefox()
driver.get("http://somedomain/url_that_delays_loading")
try:
    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.ID, "myDynamicElement"))
    )
finally:
    driver.quit()
I modified this case a little according to your code; here is an example of customizing the TimeoutException handling (in the simplest way, ignoring the Page Object Model):
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException

driver = webdriver.Firefox()
driver.get("http://somedomain/url_that_delays_loading")

class Test1:
    def test_1(self):
        try:
            element = WebDriverWait(driver, 10).until(
                EC.presence_of_element_located((By.ID, "myDynamicElement"))
            )
        except TimeoutException:
            "handle exception"
Another wait that is really useful is the implicit wait, which sets an overall timeout for all element lookups done by Selenium, like this:
driver.implicitly_wait(10)
That means the webdriver will wait up to ten seconds for a WebElement, after which a NoSuchElementException will be raised. However, this kind of wait should be treated as a global, case-independent wait.
I hope this answer helps.
Here is the documentation on how to do this:
https://selenium-python.readthedocs.io/waits.html
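As a sketch of a more pythonic route for the original question, the shared "handle exception" code can also be factored into a decorator, so each test states only its own logic. The `TimeoutException` class below is a stand-in for `selenium.common.exceptions.TimeoutException` purely so this example is self-contained, and the handler body is a placeholder for your real recovery code:

```python
import functools

class TimeoutException(Exception):
    """Stand-in for selenium.common.exceptions.TimeoutException."""

def handle_timeout(func):
    """Run a test, routing any TimeoutException through one shared handler."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except TimeoutException as ex:
            # the shared "handle exception" code lives here once
            # (logging, screenshots, pytest.fail, ...)
            return f"handled: {ex}"
    return wrapper

class Test1:
    @handle_timeout
    def test_1(self):
        raise TimeoutException("element not found in time")

    @handle_timeout
    def test_2(self):
        return "ok"
```

With the real Selenium exception imported in place of the stand-in, the same decorator applies unchanged to every test class.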

unable to locate elements in chrome page using selenium python

this is the HTML code:
this is my code:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
timeout = 20
driver = webdriver.Chrome()
driver.get('http://polarionprod1.delphiauto.net/polarion/#/project/10032024_MY21_FORD_PODS_SDPS_P702/wiki/10_Testing/SysTs_ATR')
wait = WebDriverWait(driver, 10)
men_menu = 0
while men_menu == 0:
    try:
        men_menu = WebDriverWait(driver, timeout).until(
            EC.element_to_be_clickable((By.CSS_SELECTOR, '#polarion_type_icon'))
        )
        print(men_menu)
    except TimeoutException:
        if men_menu == 0:
            print(men_menu)
            continue
        else:
            men_menu.click()
            break
I have used try and except in a loop since the web page I am dealing with takes a long time to load. When I run the code, it always seems to stay in the try block, where I print the value of men_menu to see if the element has been located, but it always prints 0. That's how I confirmed the element is not getting located.
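For reference, the loop above has a structural quirk: men_menu.click() sits inside the except branch, so even when the wait succeeds, the loop exits without clicking. A restructured sketch of the polling loop (using a hypothetical stand-in locator in place of WebDriverWait, so the control flow is self-contained and runnable) would be:

```python
class TimeoutException(Exception):
    """Stand-in for selenium.common.exceptions.TimeoutException."""

def make_flaky_locator(failures_before_success):
    """Hypothetical locator that times out a few times, then finds the element."""
    remaining = [failures_before_success]
    def locate():
        if remaining[0] > 0:
            remaining[0] -= 1
            raise TimeoutException("page still loading")
        # in real code: the WebElement returned by WebDriverWait(...).until(...)
        return "men_menu element"
    return locate

locate = make_flaky_locator(3)
men_menu = None
while men_menu is None:
    try:
        men_menu = locate()
    except TimeoutException:
        continue  # keep polling while the slow page loads

# element found: act on it *after* the loop, not inside except
print(men_menu)  # in real code: men_menu.click()
```

This does not explain why the element never matches (a wrong selector or an iframe could also cause that), but it puts the click on the path that actually runs when the wait succeeds.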

How to make Selenium only click a button and nothing else? Inconsistent clicking

My goal: to scrape the number of projects done by a user on Khan Academy.
To do so I need to parse the user's profile page. But I need to click on "Show more" to see all the projects a user has done, and then scrape them.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException,StaleElementReferenceException
from bs4 import BeautifulSoup
# here is one example of a user
driver = webdriver.Chrome()
driver.get('https://www.khanacademy.org/profile/trekcelt/projects')
# to infinite click on show more button until there is none
while True:
    try:
        showmore_project = WebDriverWait(driver, 10).until(
            EC.presence_of_element_located((By.CLASS_NAME, 'showMore_17tx5ln'))
        )
        showmore_project.click()
    except TimeoutException:
        break
    except StaleElementReferenceException:
        break
# parsing the profile
soup=BeautifulSoup(driver.page_source,'html.parser')
# get a list of all the projects
project=soup.find_all(class_='title_1usue9n')
# get the number of projects
print(len(project))
This code returns 0 for print(len(project)), which is not normal: if you manually check https://www.khanacademy.org/profile/trekcelt/projects you can see that the number of projects is definitely not 0.
The weird thing: at first you can see (in the webdriver) that this code works fine, and then Selenium clicks on something other than the show more button, for example one of the project links, which changes the page, and that's why we get 0.
I don't understand how to correct my code so Selenium clicks only the right button and nothing else.
Check out the following implementation to get the desired behavior. When the script is running, take a closer look at the scroll bar to see the progress.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from bs4 import BeautifulSoup
with webdriver.Chrome() as driver:
    wait = WebDriverWait(driver, 10)
    driver.get('https://www.khanacademy.org/profile/trekcelt/projects')
    while True:
        try:
            showmore = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, '[class^="showMore"] > a')))
            driver.execute_script("arguments[0].click();", showmore)
        except Exception:
            break

    soup = BeautifulSoup(driver.page_source, 'html.parser')
    project = soup.find_all(class_='title_1usue9n')
    print(len(project))
Another way would be:
while True:
    try:
        showmore = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, '[class^="showMore"] > a')))
        showmore.location_once_scrolled_into_view
        showmore.click()
        wait.until(EC.invisibility_of_element_located((By.CSS_SELECTOR, '[class^="spinnerContainer"] > img[class^="loadingSpinner"]')))
    except Exception:
        break
Output at this moment:
381
I have modified the accepted answer to improve the performance of your script. Comments on how this is achieved are in the code:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException, StaleElementReferenceException
from bs4 import BeautifulSoup
import time
start_time = time.time()

# here is one example of a user
with webdriver.Chrome() as driver:
    driver.get('https://www.khanacademy.org/profile/trekcelt/projects')
    # This code will wait until the first Show More is displayed (after the page has loaded)
    showmore_project = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.CLASS_NAME, 'showMore_17tx5ln')))
    showmore_project.click()
    # click Show More repeatedly until there is none left
    while True:
        try:
            # We retrieve and click until we no longer find the element.
            # NoSuchElementException is raised when we reach the end, which saves the 10 s wait time.
            showmore_project = driver.find_element_by_css_selector('.showMore_17tx5ln [role="button"]')
            # Using JS to send the click avoids Selenium throwing an exception when the click
            # would not land on the right element.
            driver.execute_script("arguments[0].click();", showmore_project)
        except StaleElementReferenceException:
            continue
        except NoSuchElementException:
            break

    # parsing the profile
    soup = BeautifulSoup(driver.page_source, 'html.parser')
    # get a list of all the projects
    project = soup.find_all(class_='title_1usue9n')
    # get the number of projects
    print(len(project))
    print(time.time() - start_time)
Execution Time1: 14.343502759933472
Execution Time2: 13.955228090286255
Hope this help you!

Issues when scraping a web page using selenium in python

I have been given a model for a successful web scraper on a selected website; however, when I alter it to collect data from a second website, it keeps returning an error. I'm not sure if the error is in the code or the website is refusing my requests. Could you please look through this and see where my issue lies? Any help hugely appreciated!
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Chrome()
try:
    driver.get("http://www.caiso.com/TodaysOutlook/Pages/supply.aspx") # load the page
    WebDriverWait(driver, 5).until(EC.presence_of_element_located((By.CSS_SELECTOR, '.highcharts-legend-item highcharts-pie-series highcharts-color-0'))) # wait till relevant elements are on the page
except:
    driver.quit() # quit if there was an error getting the page or we've waited 15 seconds and the stats haven't appeared.
stat_elements = driver.find_elements_by_css_selector('.highcharts-legend-item highcharts-pie-series highcharts-color-0')
for el in stat_elements:
    print(el.find_element_by_css_selector('b').text)
    print(el.find_element_by_css_selector('br').text)
driver.quit()
First of all, you are passing the wrong CSS selector; it should be
.highcharts-legend-item.highcharts-pie-series.highcharts-color-0
not as you have written it.
Then you are quitting the browser on a timeout and afterwards still trying to use it, which causes the error:
try:
    driver.get("http://www.caiso.com/TodaysOutlook/Pages/supply.aspx") # load the page
    WebDriverWait(driver, 5).until(EC.presence_of_element_located((By.CSS_SELECTOR, '.highcharts-legend-item.highcharts-pie-series.highcharts-color-0'))) # wait till relevant elements are on the page
except:
    driver.quit()
Next, on each list item you are fetching the text:
print(el.find_element_by_css_selector('b').text)
Debugged Code here:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
from selenium.common.exceptions import NoSuchElementException

driver = webdriver.Chrome()
try:
    driver.get("http://www.caiso.com/TodaysOutlook/Pages/supply.aspx") # load the page
    WebDriverWait(driver, 30).until(EC.presence_of_element_located((By.CSS_SELECTOR, '.highcharts-legend-item.highcharts-pie-series.highcharts-color-0'))) # wait till relevant elements are on the page
    #driver.quit() # quit if there was an error getting the page or we've waited 15 seconds and the stats haven't appeared.
except TimeoutException:
    pass
finally:
    try:
        stat_elements = driver.find_elements_by_css_selector('.highcharts-legend-item.highcharts-pie-series.highcharts-color-0')
        for el in stat_elements:
            for i in el.find_elements_by_tag_name('b'):
                print(i.text)
            for i in el.find_elements_by_tag_name('br'):
                print(i.text)
    except NoSuchElementException:
        print("No Such Element Found")
    driver.quit()
I hope this has solved your problem if not then let me know.