I am using Selenium with Python, relying on implicit waits and try/except blocks to catch errors. However, I have noticed that if the browser crashes (say, the user closes it while the program is executing), my Python program hangs, and the implicit-wait timeout never fires. The process below just sits there forever.
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium import webdriver
import datetime
import time
import sys
import os
def open_browser():
    print "Opening web page..."
    driver = webdriver.Chrome()
    driver.implicitly_wait(1)
    #driver.set_page_load_timeout(30)
    return driver
driver = open_browser() # Opens web browser
# LET'S SAY I CLOSE THE BROWSER RIGHT HERE!
# IF I CLOSE THE PROCESS HERE, THE PROGRAM WILL HANG FOREVER
time.sleep(5)

while True:
    try:
        driver.get('http://www.google.com')
        break
    except:
        driver.quit()
        driver = open_browser()
The code you have provided will always hang whenever fetching the Google home page raises an exception.
What is probably happening is that the attempt to get the page raises an exception that would normally halt the program, but your bare except clause masks it.
Try the following amendment to your loop.
max_attempts = 10
attempts = 0
while attempts <= max_attempts:
    try:
        print "Retrieving google"
        driver.get('http://www.google.com')
        break
    except:
        print "Retrieving google failed"
        attempts += 1
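The bounded-retry pattern above can also be factored into a small reusable helper. This is a generic sketch; the `retry` name and signature are illustrative, not part of Selenium:

```python
def retry(action, max_attempts=10, on_error=None):
    """Call action() until it succeeds or max_attempts is exhausted.

    Returns action()'s result on success; re-raises the last
    exception once every attempt has failed. on_error, if given,
    is called with (attempt_number, exception) after each failure.
    """
    last_exc = None
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except Exception as exc:  # narrow to WebDriverException in real use
            last_exc = exc
            if on_error is not None:
                on_error(attempt, exc)
    raise last_exc
```

In the question's terms, the loop body becomes `retry(lambda: driver.get('http://www.google.com'))`, with `on_error` used to quit and reopen the browser between attempts.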
I use Selenium with Chrome to extract information from online sources. Basically, I loop over a list of URLs (stored in mylinks) and load the webpages in the browser as follows:
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
from bs4 import BeautifulSoup

options = Options()
options.add_argument("window-size=1200,800")
browser = webdriver.Chrome(chrome_options=options)
browser.implicitly_wait(30)

for x in mylinks:
    try:
        browser.get(x)
        soup = BeautifulSoup(browser.page_source, "html.parser")
        city = soup.find("div", {"class": "city"}).text
    except:
        continue
My problem is that the browser "freezes" at some point. I know this is caused by the webpage. As a consequence my routine stops, since the browser no longer responds. browser.implicitly_wait(30) does not help here; neither an explicit nor an implicit wait solves the problem.
I want to "timeout" the problem, meaning that I want to quit() the browser after x seconds (in case the browser freezes) and restart it.
I know that I could use a subprocess with timeout like:
import subprocess

def startprocess(filepath, waitingtime):
    p = subprocess.Popen("C://mypath//" + filepath)
    try:
        p.wait(waitingtime)
    except subprocess.TimeoutExpired:
        p.kill()
However, for my task this solution would be second-best.
Question: is there an alternative way to timeout the browser.get(x) step in the loop above (in case the browser freezes) and to continue to the next step?
I am automating a boring data entry task, so I created a program that basically clicks and types for me using Selenium. It runs great, except when it reaches a specific "Edit Details..." element that I need clicked: Selenium is unable to locate the element regardless of which method I try.
I've tried using a CSS selector that accesses the ID, to no avail. I also tried XPath, as well as being more specific with a 'contains' clause on the button text. As a last resort, I used the Selenium IDE to see what locator it records when I physically click the button, and it used the exact same ID that I state in my code. I am completely lost on how to go about fixing this.
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.common.exceptions import *
import pyautogui as py
import time, sys, os, traceback

# Launching Browser
browser = webdriver.Ie()
wait = WebDriverWait(browser, 15)  # Waiting

# Laziness Functions
def clickCheck(Method, Locator, elemName):
    wait.until(EC.element_to_be_clickable((Method, Locator)))
    print(elemName + ' Clickable')

# Commence main function
try:
    # Do a lot of Clicks and Stuff until it reaches "Edit Details..." element
    """THIS IS WHERE THE PROBLEM LIES"""
    time.sleep(3)
    clickCheck(By.CSS_SELECTOR, 'td[id="DSCEditObjectSummary"]', "Edit Details")
    elemEdit = browser.find_element_by_css_selector('td[id="DSCEditObjectSummary"]')
    elemEdit.click()
# FAILSAFES
except:
    print('Unknown error has Occured')
    exc_info = sys.exc_info()
    traceback.print_exception(*exc_info)
    del exc_info
finally:  # Executes at the end and closes all processes
    print('Ending Program')
    browser.quit()
    os.system("taskkill /f /im IEDriverServer.exe")
    sys.exit()
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: Unable to find element with css selector == [id="DSCEditObjectSummary"]
This is the error I get. All I want is for this element to be clicked, just like all the other elements that are successfully located by CSS selectors. The image below indicates in blue the exact line for the "Edit Details..." button.
Edit Details Button
It looks like the issue may be the page loading slowly, or, as another commenter mentioned, the element being inside an iframe. I typically try clicking by X/Y coordinates with a macro tool like AppRobotic if you're running this on Windows. If it's an issue with the page loading slowly, I usually try stopping the page load and interacting with the page a bit; something like this should work for you:
import win32com.client
from win32com.client import Dispatch
x = win32com.client.Dispatch("AppRobotic.API")

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.common.exceptions import *
import pyautogui as py
import time, sys, os, traceback

# Launching Browser
browser = webdriver.Ie()
wait = WebDriverWait(browser, 15)  # Waiting

# Laziness Functions
def clickCheck(Method, Locator, elemName):
    wait.until(EC.element_to_be_clickable((Method, Locator)))
    print(elemName + ' Clickable')

browser.get('https://www.google.com')
# wait 20 seconds
x.Wait(20000)
# scroll down a couple of times in case page javascript is waiting for user interaction
x.Type("{DOWN}")
x.Wait(2000)
x.Type("{DOWN}")
x.Wait(2000)
# forcefully stop pageload at this point
browser.execute_script("window.stop();")
# if clicking with Selenium still does not work here, use screen coordinates
# (xCoord and yCoord are placeholders for the button's on-screen position)
x.MoveCursor(xCoord, yCoord)
x.MouseLeftClick()
x.Wait(2000)
I am posting this answer mostly for other folks who might run into the same problem and stumble upon this post. As @PeterBejan mentioned, the element I was trying to click was nested in an iframe. I tried accessing that iframe but was thrown a NoSuchFrameException. Further digging revealed that the frame was nested inside three other frames, and I had to switch to each frame from the top level down to access the element. This is the code I used:
wait.until(EC.frame_to_be_available_and_switch_to_it("TopLevelFrameName"))
wait.until(EC.frame_to_be_available_and_switch_to_it("SecondaryFrameName"))
wait.until(EC.frame_to_be_available_and_switch_to_it("TertiaryFrameName"))
wait.until(EC.frame_to_be_available_and_switch_to_it("FinalFrameName"))
clickCheck(By.ID, 'ElementID', "Edit Details")
elemEdit = browser.find_element_by_id("ElementID")
elemEdit.click()
I am writing a script that will check if the proxy is working. The program should:
1. Load the proxy from the list (txt).
2. Go to any page (for example wikipedia)
3. If the page has loaded (even not completely) it saves the proxy data to another txt file.
It must all run in a loop, and it must also check whether the browser has displayed an error. My problem is reliably closing the previous browser on each iteration; after several loops, several browsers are left open.
P.S. I replaced the iteration counter with a random number.
from selenium import webdriver
import random
from configparser import ConfigParser
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import traceback

while 1:
    ini = ConfigParser()
    ini.read('liczba_proxy.ini')
    random.random()
    liczba_losowa = random.randint(1, 999)
    f = open('user-agents.txt')
    lines = f.readlines()
    user_agent = lines[liczba_losowa]
    user_agent = str(user_agent)
    s = open('proxy_list.txt')
    proxy = s.readlines()
    i = ini.getint('liczba', 'liczba')
    prefs = {"profile.managed_default_content_settings.images": 2}
    chrome_options = webdriver.ChromeOptions()
    chrome_options.add_argument('--proxy-server=%s' % proxy[liczba_losowa])
    chrome_options.add_argument(f'user-agent={user_agent}')
    chrome_options.add_experimental_option("prefs", prefs)
    driver = webdriver.Chrome(chrome_options=chrome_options, executable_path=r'C:\Python\Driver\chromedriver.exe')
    driver.get('https://en.wikipedia.org/wiki/Replication_error_phenotype')

    def error_catching():
        print("error")
        driver.stop_client()
        traceback.print_stack()
        traceback.print_exc()
        return False

    def presence_of_element(driver, timeout=5):
        try:
            w = WebDriverWait(driver, timeout)
            w.until(EC.presence_of_element_located((By.ID, 'siteNotice')))
            print('work')
            driver.stop_client()
            return True
        except:
            print('not working')
            driver.stop_client()
            error_catching()
Without commenting on your code design: to close a driver instance, use driver.close() or driver.quit() instead of your driver.stop_client().
The first one closes the browser window on which the focus is set.
The second one closes all the browser windows and ends the WebDriver session gracefully.
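Since the loop in the question keeps leaking browsers, one way to guarantee quit() runs on every iteration, even when an exception escapes, is a small context manager. This is a sketch of the pattern; managed_driver is an illustrative name, not a Selenium API:

```python
from contextlib import contextmanager

@contextmanager
def managed_driver(factory):
    """Build a driver with factory() and guarantee quit() on exit,
    even when the body raises, so stale browsers don't pile up."""
    driver = factory()
    try:
        yield driver
    finally:
        # quit() ends the whole WebDriver session; close() would only
        # close the focused window and leave the driver process running.
        driver.quit()
```

Each iteration then reads `with managed_driver(lambda: webdriver.Chrome(chrome_options=chrome_options)) as driver: driver.get(url)`, and the browser is shut down however the body exits.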
Use
driver.quit()
Obs.: I'm pretty sure you should not structure test cases like that... "while 1"? So your test will never end?
I suggest you set up your tests as TestCases and run the whole suite to get feedback about what passed or not, and maybe set up a cron job to keep running it from time to time.
Here is a simple example of mine using test cases with Django and splinter (splinter is built on top of Selenium):
https://github.com/Diegow3b/python-django-basictestcase/blob/master/myApp/tests/test_views.py
I would like to refresh a page if its loading time exceeds my expectation, so I planned to use the existing set_page_load_timeout(time_to_wait) function. But it turns out that after the timeout fires, calling driver.get() no longer seems to work.
I've written a simple program below and hit the problem.
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
import time

driver = webdriver.Chrome()
time.sleep(5)
driver.set_page_load_timeout(2)
try:
    driver.get("https://aws.amazon.com/")
except TimeoutException as e:
    print str(e)
    driver.set_page_load_timeout(86400)
    time.sleep(5)
print "open page"
driver.get("https://aws.amazon.com/")
print "page loaded"
The environment info:
chrome=67.0.3396.99
chromedriver=2.40.565386 (45a059dc425e08165f9a10324bd1380cc13ca363),platform=Mac OS X 10.13.4 x86_64
Selenium Version: 3.12.0
What you are seeing is an unfortunate situation: when a get/navigation call times out, the connection is no longer stable, so you can't operate the browser reliably afterwards.
The only workaround that exists as of now is to disable the pageLoadStrategy, but then you lose a lot of good perks that automatically wait for page load on get and click operations:
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver import DesiredCapabilities
import time

cap = DesiredCapabilities.CHROME
cap["pageLoadStrategy"] = "none"
driver = webdriver.Chrome(desired_capabilities=cap)
I'm trying to run a script in Selenium/Python that requires logins at different points before the rest of the script can run. Is there any way to tell the script to pause at the login screen and wait for the user to manually enter a username and password (maybe something that waits for the page title to change before continuing the script)?
This is my code so far:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import Select
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.keys import Keys
import unittest, time, re, getpass
driver = webdriver.Firefox()
driver.get("https://www.facebook.com/")
someVariable = getpass.getpass("Press Enter after You are done logging in")
driver.find_element_by_xpath('//*[@id="profile_pic_welcome_688052538"]').click()
Use WebDriverWait. For example, this performs a google search and then waits for a certain element to be present before printing the result:
import contextlib
import selenium.webdriver as webdriver
import selenium.webdriver.support.ui as ui

with contextlib.closing(webdriver.Firefox()) as driver:
    driver.get('http://www.google.com')
    wait = ui.WebDriverWait(driver, 10)  # timeout after 10 seconds
    inputElement = driver.find_element_by_name('q')
    inputElement.send_keys('python')
    inputElement.submit()
    results = wait.until(lambda driver: driver.find_elements_by_class_name('g'))
    for result in results:
        print(result.text)
        print('-'*80)
wait.until will either return the result of the lambda function, or raise a selenium.common.exceptions.TimeoutException if the lambda keeps returning a falsey value for 10 seconds.
You can find a little more information on WebDriverWait in the Selenium book.
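To illustrate those semantics, wait.until behaves roughly like the poll-until-truthy loop sketched below. This is a simplification, not Selenium's actual implementation, which also supports a configurable poll frequency and an ignored-exceptions list:

```python
import time

def until(condition, timeout=10.0, poll=0.1):
    """Poll condition() until it returns a truthy value, then return it.

    Raises TimeoutError if no truthy value appears within timeout seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        value = condition()
        if value:
            return value
        if time.monotonic() > deadline:
            raise TimeoutError("condition not met within %.1fs" % timeout)
        time.sleep(poll)
```

With this model it is clear why a lambda that keeps returning `[]` (an empty, falsey element list) simply polls until the timeout expires.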
from selenium import webdriver
import getpass  # <-- IMPORT THIS

def loginUser():
    # Open your browser, and point it to the login page
    someVariable = getpass.getpass("Press Enter after You are done logging in")  # <-- THIS IS THE SECOND PART
    # Here is where you put the rest of the code you want to execute
Then, whenever you want to run the script, you call loginUser() and it does its thing.
This works because getpass.getpass() works exactly like input(), except it doesn't echo any characters (it's meant for accepting passwords without showing them to everyone looking at the screen).
So what happens is: your page loads, then everything stops; your user manually logs in, goes back to the Python CLI, and hits Enter.