Explicit Wait not timing out consistently in Selenium Python - python

I have the following code using Selenium in Python 3:
import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

profile = webdriver.FirefoxProfile()
profile.set_preference('webdriver.load.strategy', 'unstable')
browser = webdriver.Firefox(profile)
browser.set_page_load_timeout(10)
url = 'my_url'
while True:
    try:
        st = time.time()
        browser.get(url)
        print('Finished get!')
        time.sleep(2)
        wait = WebDriverWait(browser, 10)
        element = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, 'div[my_attr="my_attr"]')))
        print('Success after {} seconds.'.format(round(time.time()-st)))
        break
    except Exception:  # still lets Ctrl-C stop the loop
        print('Timed out after {} seconds.'.format(round(time.time()-st)))
        print('Reloading')
        continue
From my understanding, using the explicit wait here (even with the unstable load strategy and page load timeout), what should happen is that the page should load, it should look for the element specified, and if either the page doesn't load within 10 seconds or the element is not found within 10ish seconds, it should time out and reload again (because of the try/except clause with the while loop).
However, what I'm finding is that it's not timing out consistently. For example, I've had instances where the loading times out after 10ish seconds the first time around, but once it reloads, it doesn't time out and instead "succeeds" after like 140 seconds. Or sometimes it doesn't time out at all and just keeps running until it succeeds. Because of the unstable load strategy, I don't think the page load itself is ever timing out (more specifically, the 'Finished get!' message always prints). But the explicit wait here that I specified also does not seem to be consistent. Is there something in my code that is overriding the timeouts? I want the timeouts to be consistent such that if either the page doesn't load or the element isn't located within 10ish seconds, I want it to timeout and reload. I don't ever want it to go on for 100+ seconds, even if it succeeds.
Note that I'm using the unstable webdriver load strategy here because the page I'm going to takes forever to completely load so I want to go straight through the code once the elements I need are found without needing the entire page to finish loading.

After some more testing, I have located the source of the problem. It's not that the waits are not working. The problem is that all the time is being taken up by the locator. I discovered this by essentially writing my own wait function and using the .find_element_by_css_selector() method, which is where all the runtime is occurring when it takes 100+ seconds. Because of the nature of my locator and the complexity of the page source, it's taking 100+ seconds sometimes for the locator to find the element when the page is nearly fully loaded. The locator time is not factored into the wait time. I presume that the only "solution" to this is to write a more efficient locator.
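The behaviour described above can be reproduced without a browser. The following is a minimal sketch, assuming a simplified polling loop in the style of WebDriverWait.until (it is not Selenium's actual code): the deadline is only compared between calls to the condition, so a single slow lookup can run well past the timeout. The names poll_until and slow_locator are hypothetical and exist only for this illustration.

```python
import time


def poll_until(condition, timeout, poll_interval=0.05):
    """Simplified stand-in for an explicit wait: call condition() repeatedly
    until it returns a truthy value or the deadline has passed."""
    deadline = time.monotonic() + timeout
    while True:
        value = condition()  # blocks for as long as the lookup itself takes
        if value:
            return value
        time.sleep(poll_interval)
        # The deadline is only checked here, between lookups.
        if time.monotonic() > deadline:
            raise TimeoutError("condition not met within {} s".format(timeout))


def slow_locator():
    """Hypothetical locator that needs 0.3 s to search a complex page."""
    time.sleep(0.3)
    return "element"


start = time.monotonic()
result = poll_until(slow_locator, timeout=0.1)
elapsed = time.monotonic() - start
print("Got {!r} after {:.2f} s despite a 0.1 s timeout".format(result, elapsed))
```

In this sketch the wait "succeeds" even though the lookup took about three times the timeout, which mirrors the 100+ second successes in the question on a smaller scale.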

Related

Equivalent of time.sleep() in Selenium

I realize this is a relatively simple question but I haven't found the answer yet.
I'm using driver.get() in a for loop that iterates through some urls. To help avoid my IP address getting blocked, I've implemented time.sleep(5) before the driver.get statement in the for loop.
Basically, I just want a wait period to make my scraping seem more natural.
I think time.sleep may be causing page crashes. What is the equivalent of time.sleep in selenium? From what I understand, implicitly_wait just sets the amount of time before throwing an exception, but I'm not sure that's what I want here? I want a specific amount of time for the driver to wait.
time.sleep()
The sleep() function is from the time module which suspends execution of the current thread for a given number of seconds.
WebDriver is an out-of-process library that instructs the browser what to do, and the browser itself is asynchronous, so WebDriver cannot track the real-time state of the HTML DOM. This gives rise to intermittent issues in Selenium scripts, caused by race conditions between the browser and the user's instructions.
Selenium currently has no method identical to time.sleep(). There are, however, two waiting mechanisms at your disposal, which you can use depending on what your automated tests need.
Implicit wait: In this case, WebDriver polls the DOM for a certain duration when trying to find any element. This can be useful when certain elements on the webpage are not available immediately and need some time to load.
def implicitly_wait(self, time_to_wait) -> None:
    """
    Sets a sticky timeout to implicitly wait for an element to be found,
    or a command to complete. This method only needs to be called one
    time per session. To set the timeout for calls to
    execute_async_script, see set_script_timeout.
    :Args:
     - time_to_wait: Amount of time to wait (in seconds)
    :Usage:
        ::
            driver.implicitly_wait(30)
    """
    self.execute(Command.SET_TIMEOUTS, {
        'implicit': int(float(time_to_wait) * 1000)})
Explicit wait: This type of wait allows your code to halt program execution, or freeze the thread, until the condition you pass it resolves. As an example:
presence_of_element_located()
visibility_of_element_located()
element_to_be_clickable()
There is no specific method in Selenium for hardcoded pauses like Python's general time.sleep() method.
As you mentioned, there are implicitly_wait and the explicit WebDriverWait with Expected Conditions, but neither of these is a hardcoded pause.
Both implicitly_wait and WebDriverWait set a timeout: how long to poll for the presence of some element or for some condition. If the condition is fulfilled or the element is present, the program flow immediately continues to the next line of code.
So, if you want to insert a pause, you have to use a general Python method that suspends the program / thread, such as time.sleep().
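Since the goal in the question is a pause that makes the request pattern look more natural, the fixed time.sleep(5) could be given a small random component. This is only a sketch under that assumption: polite_pause, its jitter_seconds parameter, and the example URLs are hypothetical, and the sleep argument is there purely so the pause can be swapped out when testing.

```python
import random
import time


def polite_pause(base_seconds, jitter_seconds=0.0, sleep=time.sleep):
    """Pause for base_seconds plus a random extra of up to jitter_seconds.

    The sleep argument exists only so the pause can be replaced in tests.
    Returns the delay that was requested.
    """
    delay = base_seconds + random.uniform(0.0, jitter_seconds)
    sleep(delay)
    return delay


# Example: record the requested delays instead of really sleeping.
recorded = []
for url in ["https://example.com/a", "https://example.com/b"]:  # placeholder URLs
    polite_pause(5, jitter_seconds=2, sleep=recorded.append)
    # driver.get(url) would go here in a real script
print(recorded)
```

In a real loop the default sleep=time.sleep would be left in place, so each iteration pauses for somewhere between 5 and 7 seconds before the next driver.get().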

How to scrape every specific amount of time with Python Selenium Chrome Driver?

Situation: There is a website which requires me to scrape information from it every x seconds. The site in question has information which requires my input, thus I decided to go with Selenium. The action flow looks like that: User can click in the browser section or interact with the website and the Selenium browser will scrape a specific piece of information every x seconds.
What have I tried?:
driver.wait (for any kind of element or a specific time); this, unfortunately, doesn't work as I don't have a specific element the browser shall wait for.
time.sleep(0.5) in a while True loop; this didn't work as the scraping and processing part (which may run simultaneously) took time as well, this time.sleep(0.5) may be off by a few seconds.
I looked into creating a Google Chrome Plugin which may do actions and send that information to the Python script in charge, though this surpassed the efforts it should, hence I decided against it.
To sum up, how can I scrape information from a Selenium Chrome Driver session every fixed amount of time?
You can simply wait for the difference between the interval and the time your work took. You also need to make sure the interval is greater than the time your program takes per iteration. I used 5 here, so if your program takes 1 second to run, it waits for 5-1=4 seconds. The difference is a floating-point value, so you can switch to int and add checks for the 0-1 second case if needed. Note that the pause itself has to be time.sleep(); driver.implicitly_wait() only sets the element lookup timeout and does not pause.
import time
while True:
    now = time.time()
    time.sleep(1)  # stands in for the scraping work
    later = time.time()
    difference = later - now
    print(difference)
    # Sleep for the rest of the 5-second interval (never a negative amount)
    time.sleep(max(0, 5 - difference))
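A variation on the same idea is to compute every deadline from the original start time, so that small timing errors do not add up over many iterations. The sketch below is one possible way to do that, not the answerer's method; run_every and the fake clock helpers are hypothetical names, and the fake clock is there only so the example finishes instantly.

```python
import time


def run_every(interval, task, iterations, clock=time.monotonic, sleep=time.sleep):
    """Call task() at start, start + interval, start + 2 * interval, ...

    Each deadline is computed from the original start time, so the time
    spent inside task() does not accumulate as drift.
    """
    start = clock()
    for i in range(iterations):
        task()
        next_deadline = start + (i + 1) * interval
        remaining = next_deadline - clock()
        if remaining > 0:  # if task() overran the interval, do not sleep
            sleep(remaining)


# Example with a fake clock so the behaviour can be checked instantly.
fake_now = [0.0]
sleeps = []


def fake_clock():
    return fake_now[0]


def fake_sleep(seconds):
    sleeps.append(seconds)
    fake_now[0] += seconds


def fake_scrape():
    fake_now[0] += 1.5  # pretend scraping takes 1.5 s


run_every(5, fake_scrape, iterations=3, clock=fake_clock, sleep=fake_sleep)
print(sleeps)  # [3.5, 3.5, 3.5]
```

In a real session, task would be the function that reads the required information from the driver, and the default clock and sleep would be used.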

Python & Selenium: Difference between driver.implicitly_wait() and time.sleep()

Yes, I know both are used to wait for some specified time.
Selenium:
driver.implicitly_wait(10)
Python:
import time
time.sleep(10)
Is there any difference between these two?
time.sleep(secs)
time.sleep(secs) suspends the execution of the current thread for the given number of seconds. The argument may be a floating point number to indicate a more precise sleep time. The suspension time may be longer than requested by an arbitrary amount because of the scheduling of other activity in the system. In older Python versions the actual suspension time could also be less than requested, because any caught signal terminated the sleep() following execution of that signal's catching routine; since Python 3.5 the function sleeps for at least secs even if interrupted by a signal, unless the signal handler raises an exception.
You can find a detailed discussion in How to sleep webdriver in python for milliseconds
implicitly_wait(time_to_wait)
implicitly_wait(time_to_wait) specifies the amount of time, in seconds, that the WebDriver instance (i.e. the driver) should wait when searching for an element or elements that are not immediately present in the HTML DOM. The default setting is 0, which means that when the driver is asked to find an element or elements, it searches once and returns the result immediately.
After a fresh page load, an element or elements may or may not be found by an immediate search, so your automation script may face any of these exceptions:
NoSuchElementException
TimeoutException
ElementNotVisibleException
ElementNotSelectableException
ElementClickInterceptedException
ElementNotInteractableException
Hence we introduce an implicit wait. With an implicit wait set, the driver polls the DOM tree for the configured amount of time, looking for the element or elements, before throwing a NoSuchElementException. By that time the element or elements you were looking for may be available in the HTML DOM. As your code already sets the implicit wait to 10 seconds, the driver will poll the HTML DOM for up to 10 seconds.
You can find a detailed discussion in Using implicit wait in selenium
time.sleep(10) pauses code execution for 10 seconds, whatever the state of the page.
driver.implicitly_wait(10) waits a maximum of 10 seconds for an element to be present. If the element is found after 2 seconds, code execution continues without waiting for the remaining 8 seconds.
When an implicit wait is used in a test script, it is declared globally and automatically applies to every element lookup in that script. In Java, for example, it is set with driver.manage().timeouts().implicitlyWait(10, TimeUnit.SECONDS);. The driver then waits for the element to be present in the DOM; as soon as the element appears, execution continues. While it waits, script execution is on hold.
With Thread.sleep(1000), the script is held for 1000 ms regardless: even if the element appears in the DOM after 500 ms, execution stays at this point until the full 1000 ms have passed.
Thread.sleep() is a static wait that holds the script for a fixed amount of time, whereas an implicit wait holds script execution only until the element is present in the DOM.
Hope this helps!

Make Selenium wait 10 seconds

Yes I know the question has been asked quite often but I still don't get it. I want to make Selenium wait, no matter what. I tried these methods
driver.set_page_load_timeout(30)
driver.implicitly_wait(90)
WebDriverWait(driver, 10)
driver.set_script_timeout(30)
and other things but it does not work. I need selenium to wait 10 seconds. NO not until some element is loaded or whatever, just wait 10 seconds. I know there is this
try:
    element_present = EC.presence_of_element_located((By.ID, 'whatever'))
    WebDriverWait(driver, timeout).until(element_present)
except TimeoutException:
    print("Timed out waiting for page to load")
I do not want that.
If waiting for a fixed number of seconds is too much (not achievable) for Selenium, what other (Python) libraries/programs would be capable of achieving this task? With Java's Selenium it does not seem to be a problem...
All the APIs you have mentioned are basically timeouts, so each one waits until either some event happens or the maximum time is reached.
set_page_load_timeout - Sets the amount of time to wait for a page load to complete before throwing an error. If the timeout is negative, page loads can be indefinite.
implicitly_wait - Specifies the amount of time the driver should wait when searching for an element if it is not immediately present.
set_script_timeout - Sets the amount of time to wait for an asynchronous script to finish execution before throwing an error. If the timeout is negative, then the script will be allowed to run indefinitely.
For more information please visit this page. (The documentation is for the Java binding, but the functionality should be the same for all the bindings.)
So, if you want Selenium (or any script) to wait 10 seconds, or any other fixed time, the best thing is to put the thread to sleep.
In python it would be
import time
time.sleep(10)
The simplest way to do this in Java is using
try {
    Thread.sleep(10*1000);
} catch (InterruptedException e) {
    e.printStackTrace();
}

Reload time & retries in selenium for a url

I am working on selenium with python for downloading file from a url.
from selenium import webdriver
profile = webdriver.FirefoxProfile()
profile.set_preference('browser.download.folderList', 2) # custom location
profile.set_preference('browser.download.manager.showWhenStarting', False)
profile.set_preference('browser.download.dir', '/tmp')
profile.set_preference('browser.helperApps.neverAsk.saveToDisk', 'text/csv')
browser = webdriver.Firefox(profile)
try:
    browser.get("http://www.drugcite.com/?q=ACTIMMUNE")
    browser.find_element
    browser.find_element_by_id('exportpt').click()
    browser.find_element_by_id('exporthlgt').click()
except:
    pass
I want to set a timeout for this program: if the URL has not loaded within 60 seconds because of a network issue, it should retry, waiting up to 60 seconds each time, and after 3 tries it should move on.
How can I achieve this in this code?
Thanks
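You could use browser.implicitly_wait(60) (see WebDriver.implicitly_wait). Note, though, that this only sets how long the driver waits when looking up elements; it does not limit page loads.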
There is nothing built in to do this. However, I wouldn't have said it would be too hard.
Just use an explicit wait to find a particular element that should be there when the page loads. Set the timeout to be 60 seconds on this explicit wait.
Wrap this in a loop that executes up to three times. To avoid it running three times unnecessarily, put in a break statement when the explicit wait actually runs without any issue.
That means it'll run up to three times, waiting 60 seconds a time, and once it's successful it'll exit the loop. If it isn't successful after all of that, then it'll crash.
Note: I've not actually tried this but it's just a logical solution!
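The loop described above might look roughly like the following. This is an untested-against-a-browser sketch that uses only the standard library: retry and flaky_load are hypothetical names, and in a real script the action passed in would call browser.get(url) followed by an explicit wait with a 60-second timeout. Unlike the description above, this version returns None after the last failed attempt instead of crashing, so that the program can "go ahead" as the question asks.

```python
import time


def retry(action, attempts=3, delay=0.0, sleep=time.sleep):
    """Call action() up to `attempts` times and return its first result.

    Returns None if every attempt raises, so the caller can carry on.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as exc:  # in real code, catch TimeoutException only
            print("Attempt {} failed: {}".format(attempt, exc))
            if attempt < attempts:
                sleep(delay)
    return None


# Example: a hypothetical page load that fails twice, then succeeds.
calls = []


def flaky_load():
    calls.append(1)
    if len(calls) < 3:
        raise TimeoutError("page did not load")
    return "loaded"


outcome = retry(flaky_load, attempts=3)
print(outcome, len(calls))
```

Returning on the first success plays the role of the break statement mentioned above, so the loop never runs more often than it needs to.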
