Why does this Selenium test fail UNLESS I insert a pdb breakpoint? - python

I have a Selenium LiveServerTestCase (django project) that tests an AJAXy page. After the main page loads, there's a delay before another item loads, and I need to test for that second item. I'm trying to include a smart way to wait without doing time.sleep(too_long), but the test always fails unless I insert a pdb breakpoint.
def some_test_thing(self):
    # loads a page with some ajaxy stuff, so there's a
    # delay that needs to be accounted for
    url = "something..."
    self.browser.get(url)
    # import ipdb; ipdb.set_trace()  # this test only passes with this statement, wtf^^?
    self.assertWithWait(self.assert_, args=(some_args,))
and I use the convenience function assertWithWait, which tries the assertion in a loop, catching the AssertionError if the test fails, until a timeout.
def assertWithWait(self, assertion, timeout=10, args=(), kwargs={}):
    total_time = 0
    while total_time < timeout:
        try:
            assertion(*args, **kwargs)
        except AssertionError:
            pass
        else:  # assertion passed
            return
        time.sleep(0.2)
        total_time += 0.2
    assertion(*args, **kwargs)  # final try, will cause test failure if unsuccessful
The problem is that the test always fails if I don't have that pdb breakpoint - the loop runs until the timeout at 10 seconds. However, if I include that pdb breakpoint, and even if I enter "c" and continue immediately (like within 1 second), the test will pass. So clearly it's not a timing issue, because the test completes successfully well within the 10 second timeout in the breakpoint case, but fails after waiting 10 seconds in the no-breakpoint case. So it seems like something related to multiprocessing or multithreading that Selenium is doing, that maybe gets released by the pdb breakpoint? I'm grabbing at straws a bit here.
Help!
EDIT
I'm using this assertWithWait not just to wait for things to load in the page, but to wait for the results of some AJAXy server calls to check that things changed in the database too. So I can't just use Selenium's built-in WebDriverWait functions. Sorry, should have been more clear about that.

You are basically reinventing what an Explicit Wait was made for; use it and handle TimeoutException:
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
def some_test_thing(self):
    try:
        element = WebDriverWait(self.browser, 10).until(
            EC.presence_of_element_located((By.ID, "myDynamicElement"))
        )
    except TimeoutException:
        self.fail("Element not found")
This waits up to 10 seconds before throwing a TimeoutException, or returns the element as soon as it is found, anywhere from 0 to 10 seconds in. By default, WebDriverWait calls the ExpectedCondition every 500 milliseconds until it returns successfully. A successful return means true for ExpectedConditions of Boolean type, and a non-null value for all other ExpectedCondition types.
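That polling behavior can be sketched in plain Python for intuition. This is a simplified model, not Selenium's actual implementation; the name wait_until is made up here:

```python
import time

class TimeoutException(Exception):
    pass

def wait_until(condition, timeout=10, poll_frequency=0.5):
    """Call `condition` repeatedly until it returns a truthy value
    or `timeout` seconds elapse."""
    end_time = time.time() + timeout
    while True:
        value = condition()
        if value:  # truthy return counts as success
            return value
        if time.time() + poll_frequency > end_time:
            raise TimeoutException("condition not met in %ss" % timeout)
        time.sleep(poll_frequency)
```

The real WebDriverWait.until additionally passes the driver into the condition and swallows a configurable set of ignored exceptions between polls.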

DUH, my bad. Objects! Let's walk through the calling signature of assertWithWait and the way that assertion later gets executed, and the problem will become clear. So here's the relevant stuff:
def assertWithWait(self, assertion, timeout=10, args=(), kwargs={}):
    ...
    assertion(*args, **kwargs)
    ...
Let's suppose the assertion is
self.assertEqual(SomeModel.objects.get(pk=1).some_field,
                 "some value")
So we call assertWithWait like this:
self.assertWithWait(self.assertEqual,
                    args=(SomeModel.objects.get(pk=1).some_field,
                          "some value"))
Can you spot my dumb mistake yet??
The args are evaluated at the moment assertWithWait is called, so every iteration of the loop inside assertWithWait calls assertEqual with exactly the same, already-computed arguments (objects!). The database query runs once, before the loop ever starts, and its stale result is compared over and over.
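The same trap can be reproduced without Django at all. Arguments in Python are evaluated once, at the call site, before the callee ever runs:

```python
counter = {"value": 0}

# Evaluated HERE, once -- just like SomeModel.objects.get(...) inside args=(...)
snapshot = counter["value"]

counter["value"] = 5  # the "database" changes afterwards

print(snapshot)  # 0 -- a retry loop would keep re-using this stale value

# Wrapping the lookup in a lambda defers it until each call
fresh = lambda: counter["value"]
print(fresh())  # 5 -- re-evaluated now, sees the update
```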
The fix is to change the calling signature of assertWithWait to use lambda:
def assertWithWait(self, assertion_func, timeout=6):
    """
    Tries the assertion in a loop until a timeout.
    Usage:
        self.assertWithWait(lambda: assertion_statement)
    e.g.
        self.assertWithWait(lambda: self.assertEqual(something, something_else))
    """
    ...
    try:
        assertion_func()
    except AssertionError:
        pass
    ...
Now we have an all-purpose assertWithWait that can check browser elements OR database values (or whatever else you like). And as a bonus, the calling syntax got easier to read - python ftw!
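Filled in, the fixed helper might look like this (a sketch; the 0.2-second poll interval is carried over from the original version):

```python
import time

def assertWithWait(self, assertion_func, timeout=6):
    """
    Tries the assertion in a loop until a timeout.
    Usage:
        self.assertWithWait(lambda: self.assertEqual(something, something_else))
    """
    total_time = 0
    while total_time < timeout:
        try:
            assertion_func()
        except AssertionError:
            pass
        else:  # assertion passed
            return
        time.sleep(0.2)
        total_time += 0.2
    assertion_func()  # final try, will cause test failure if unsuccessful
```

Because assertion_func is a lambda, the database query inside it re-runs on every poll, which is exactly what the waiting loop needs.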

Related

Unable to select reCaptchaV2 checkbox in SeleniumBase

I have a SeleniumBase code like this
from seleniumbase import SB
from func_timeout import func_set_timeout, FunctionTimedOut
def checkbox():
print('- click checkbox')
checkbox = 'span#recaptcha-anchor'
try:
sb.wait_for_element(checkbox)
sb.click(checkbox)
sb.sleep(4)
except FunctionTimedOut as e:
print('- 👀 checkbox:', e)
When I call checkbox(), it gives an error and the browser crashes quickly without clicking the checkbox.
I tried replacing the selector with
    checkbox = 'id#recaptcha-anchor-label'
    checkbox = 'id#rc-anchor-center-item'
but it didn't work.
You might want to try avoiding the captcha altogether by running SeleniumBase in undetected-chromedriver mode (set uc=True, or run with --uc):
from seleniumbase import SB

with SB(uc=True) as sb:
    sb.open("https://nowsecure.nl/#relax")
    try:
        sb.assert_text("OH YEAH, you passed!", "h1", timeout=7)
        sb.post_message("Selenium wasn't detected!", duration=2)
        print("\n Success! Website did not detect Selenium! ")
    except Exception:
        print("\n Sorry! Selenium was detected!")
If you still need to go through the captcha, then make sure that if you need to click something in an iframe, that you switch to the iframe first, such as with sb.switch_to_frame("iframe").
Also, the sb.click(selector) method will automatically wait for the element, which means that calling sb.wait_for_element(selector) before that is unnecessary.

Restart python program whenever it encounters an exception

So I've got a selenium python script that occasionally runs into the following error at differing sections in my code:
Exception has occurred: WebDriverException
Message: unknown error: cannot determine loading status
from target frame detached
But when I encounter this error, if I re-run the program without changing anything, everything works again (so I know that it's either a problem with the website or the webdriver). In the meantime, I was wondering if there was a way to tell python to restart the program if a WebDriverException is encountered. Any help would be greatly appreciated, thank you.
You could try os.execv(), which restarts a Python script by replacing the current process with a fresh copy of itself. Flush any buffered output first with sys.stdout.flush():
import os
import sys

try:
    <your block of code>
except WebDriverException:
    print(<Your Error>)
    sys.stdout.flush()
    os.execv(sys.argv[0], sys.argv)
You could simply use the os module to do this: os.execv(sys.argv[0], sys.argv)
Use a main() function as a starting point for your program, and trigger that function again whenever you need to restart. Something like
def foo1():
    return

def foo2():
    try:
        ...
    except:
        main()

def main():
    foo1()
    foo2()

if __name__ == "__main__":
    main()
If the error occurs because it does not find a certain tag, you could put a wait
element = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.ID, "myDynamicElement"))
)
or worst case
browser.implicitly_wait(5)
or
time.sleep(5)
which waits 5 seconds for the element to appear

Retry for Selenium Exception in Subproces for Python

I have a Python-with-Selenium script that should be called with an argument that differs for each person (I called it Tester.py). To make that easier, I put the call in a new Python file and run it with subprocess, like this:
# Explanation about retry is below
@retry(retry_count=5, delay=5)
def job():
    try:
        subprocess.Popen("python Tester.py -user Margareth", shell=True)
        return True
    except subprocess.CalledProcessError:
        return False
        print("System Retrying")
        pass
and my retry wrapper looks like this (but it's not working):
def retry(retry_count=5, delay=5, allowed_exceptions=()):
    def decorator(f):
        @functools.wraps(f)
        def wrapper(*args, **kwargs):
            for _ in range(retry_count):
                try:
                    result = f(*args, **kwargs)
                    if result:
                        pass
                except allowed_exceptions as e:
                    pass
                time.sleep(delay)
        return wrapper
    return decorator
My only problem with Tester.py is Selenium's TimeoutException; I want to retry up to x times with an x-second delay when it fails, but somehow the try/except isn't working and the retries fire at seemingly random intervals, which makes a real mess. Any clue?
There is no need for a function decorator to achieve retry behavior. It can be done by a "simple" for loop in the code:
import subprocess
import time

retry_count = 5
delay = 5
success = False
for _ in range(retry_count):
    try:
        # check_call (unlike Popen) waits for the script to finish and
        # raises CalledProcessError on a non-zero exit code
        subprocess.check_call("python Tester.py -user Margareth", shell=True)
        success = True
        break
    except subprocess.CalledProcessError:
        print("System Retrying")
        time.sleep(delay)
Here, if the subprocess.check_call call returns without raising an error, we break out of the loop; otherwise we retry up to retry_count times. (Note that subprocess.Popen would not work here: it returns immediately and never raises CalledProcessError, so the except branch would never run.)
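That said, if you prefer the decorator form, the wrapper needs to return on success and re-raise after exhausting its attempts. A sketch (not the original poster's code):

```python
import functools
import time

def retry(retry_count=5, delay=5, allowed_exceptions=(Exception,)):
    def decorator(f):
        @functools.wraps(f)
        def wrapper(*args, **kwargs):
            last_exc = None
            for _ in range(retry_count):
                try:
                    return f(*args, **kwargs)  # success: stop retrying
                except allowed_exceptions as e:
                    last_exc = e
                    time.sleep(delay)
            raise last_exc  # all attempts failed
        return wrapper
    return decorator
```

The key differences from the broken version: a return on success (instead of `if result: pass`) and a final raise so a persistent failure is not silently swallowed.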
Also, as a side note, try to avoid shell=True in subprocess calls, as it can open an attack vector for anyone wanting to exploit your system. In the current setup there is no immediate risk, since no user input goes into the call, but it is still best practice to avoid it.
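Avoiding shell=True usually just means passing the command as a list, so no shell ever parses the string. A sketch (the Tester.py call is the question's; the inline-script demo below is self-contained):

```python
import subprocess
import sys

# The question's call without a shell: program and arguments as a list
cmd = [sys.executable, "Tester.py", "-user", "Margareth"]
# subprocess.check_call(cmd)  # would run Tester.py directly, no shell involved

# Self-contained demonstration using an inline script instead
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", "Margareth"],
    capture_output=True, text=True, check=True,
)
print(result.stdout.strip())  # Margareth
```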

How to set sigalarm to repeat over and over again in linux?

I'm trying to make alarm work over and over again.
My handler is
def handler_SIGALRM(signum, frame):
    print "waiting"
    signal.alarm(2)
The alarm only fires once, even though I set it again every time the handler runs.
In my code I also use sigchld and sys.exit after the child working.
I'm running it with cygwin.
EDIT:
I need to write code that prints "waiting" every second, using sigalarm and not loops.
I'm an idiot, I edited the wrong code
You put your signal.alarm(2) in a wrong place. See my example below.
import time
import signal

class TimeoutException(Exception):
    pass

def signalHandler(signum, frame):
    raise TimeoutException()

timeout_duration = 5
signal.signal(signal.SIGALRM, signalHandler)

for i in range(5):
    signal.alarm(timeout_duration)
    try:
        """Do something that has a possibility of taking a lot of time
        and exceeding the timeout_duration"""
        time.sleep(20)
    except TimeoutException as exc:
        print "Notify your program that the timeout_duration has passed"
    finally:
        # Clean out the alarm
        signal.alarm(0)
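If the goal really is a timer that re-fires on its own (the question's "print waiting every second"), signal.setitimer takes an interval argument and re-arms itself, so the handler never has to call signal.alarm again. A sketch (Unix only; the sub-second values just keep the demo short -- for the question you would use 1, 1):

```python
import signal
import time

ticks = []

def handler(signum, frame):
    ticks.append(time.time())  # the question would print "waiting" here

signal.signal(signal.SIGALRM, handler)
# first fire after 0.05s, then repeat every 0.05s thereafter
signal.setitimer(signal.ITIMER_REAL, 0.05, 0.05)
time.sleep(0.3)
signal.setitimer(signal.ITIMER_REAL, 0)  # a delay of 0 cancels the timer
print(len(ticks))
```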

Restart a process if running longer than x amount of minutes

I have a program that creates a multiprocessing pool to handle a webextraction job. Essentially, a list of product ID's is fed into a pool of 10 processes that handle the queue. The code is pretty simple:
import multiprocessing

num_procs = 10
products = ['92765937', '20284759', '92302047', '20385473', ...etc]

def worker():
    for workeritem in iter(q.get, None):
        time.sleep(10)
        get_product_data(workeritem)
        q.task_done()
    q.task_done()

q = multiprocessing.JoinableQueue()
procs = []
for i in range(num_procs):
    procs.append(multiprocessing.Process(target=worker))
    procs[-1].daemon = True
    procs[-1].start()

for product in products:
    time.sleep(10)
    q.put(product)
q.join()

for p in procs:
    q.put(None)
q.join()

for p in procs:
    p.join()
The get_product_data() function takes the product, opens an instance of Selenium, navigates to a site, logs in, and collects the details of the product, outputting them to a csv file. The problem is, randomly (literally... it happens at different points of the website's navigation or extraction process) Selenium will just sit there and stop doing its job. No exceptions are thrown or anything. I've done everything I can in the get_product_data() function to prevent this, but it seems to just be a problem with Selenium (I've tried using Firefox, PhantomJS, and Chrome as its driver, and still run into the same problem no matter what).
Essentially, the process should never run for longer than, say, 10 minutes. Is there any way to kill a process and restart it with the same product id if it has been running for longer than the specified time?
This is all running on a Debian Wheezy box with Python 2.7.
You could write your code using multiprocessing.Pool and the timeout() function suggested by @VooDooNOFX. Not tested, consider it executable pseudo-code:
#!/usr/bin/env python
import signal
from contextlib import closing
from multiprocessing import Pool

class Alarm(Exception):
    pass

def alarm_handler(*args):
    raise Alarm("timeout")

def mp_get_product_data(id, timeout=10, nretries=3):
    signal.signal(signal.SIGALRM, alarm_handler)  # XXX could move it to initializer
    for i in range(nretries):
        signal.alarm(timeout)
        try:
            return id, get_product_data(id), None
        except Alarm as e:
            timeout *= 2  # retry with increased timeout
        except Exception as e:
            break
        finally:
            signal.alarm(0)  # disable alarm, no need to restore handler
    return id, None, str(e)

if __name__ == "__main__":
    with closing(Pool(num_procs)) as pool:
        for id, result, error in pool.imap_unordered(mp_get_product_data, products):
            if error is not None:  # report and/or reschedule
                print("error: {} for {}".format(error, id))
    pool.join()
You need to ask Selenium to wait an explicit amount of time, or wait for some implicit DOM object to be available. Take a quick look at the selenium docs about that.
From the link, here's a process that waits 10 seconds for the DOM element myDynamicElement to appear.
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait  # available since 2.4.0
from selenium.webdriver.support import expected_conditions as EC  # available since 2.26.0

ff = webdriver.Firefox()
ff.get("http://somedomain/url_that_delays_loading")
try:
    element = WebDriverWait(ff, 10).until(
        EC.presence_of_element_located((By.ID, "myDynamicElement"))
    )
except TimeoutException as why:
    # Do something to reject this item, possibly by re-adding it to the worker queue.
    pass
finally:
    ff.quit()
If nothing is available in the given time period, a selenium.common.exceptions.TimeoutException is raised, which you can catch in a try/except block like the one above.
EDIT
Another option is to ask multiprocessing to timeout the process after some amount of time. This is done using the built-in library signal. Here's an excellent example of doing this, however it's still up to you to add that item back into the work queue when you detect a process has been killed. You can do this in the def handler section of the code.
