Python Selenium: send request and avoid "Waiting for (website)..."

I am launching several requests on different tabs. While one tab loads I will iteratively go to other tabs and see whether they have loaded correctly. The mechanism works great except for one thing: a lot of time is wasted "Waiting for (website)..."
I move from one tab to the next by raising an exception whenever a key element that I need to find is missing. But in order to check for this exception (and therefore move on to the other tabs, as it should), I have to wait for the request to finish (that is, for the "Waiting for..." message to disappear).
Would it be possible not to wait? That is, would it be possible to launch the request via browser.get(..) and then immediately change tab?

Yes, you can do that by changing the driver's pageLoadStrategy capability. Below is an example for Firefox:
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

# Copy the defaults so the shared DesiredCapabilities.FIREFOX dict is not mutated
cap = DesiredCapabilities.FIREFOX.copy()
# "none" makes driver.get() return immediately instead of waiting for the page load
cap["pageLoadStrategy"] = "none"
print(cap)
driver = webdriver.Firefox(capabilities=cap)
driver.get("http://tarunlalwani.com")
# execute code for tab 2
# execute code for tab 3
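With this setting driver.get() returns without waiting for the page, so all the waiting is up to you. You can also use "eager" instead of "none"; in that case the driver waits only until the DOM is ready (document.readyState is "interactive"), not for images and other subresources.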
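Since the waiting is now your job, a loop over the open tabs is one way to do it. The following is only a minimal sketch, not part of the original answer: poll_tabs and dom_complete are hypothetical helper names, and it assumes the tabs are already open and that checking document.readyState is a good enough "loaded" test for your pages (you may prefer to look for your key element instead).

```python
import time


def dom_complete(driver):
    """Return True once the current tab reports a fully loaded document."""
    return driver.execute_script("return document.readyState") == "complete"


def poll_tabs(driver, is_ready=dom_complete, timeout=30.0, poll=0.5):
    """Visit every open tab in turn until each is ready or `timeout` seconds pass.

    Returns the set of window handles that never became ready.
    """
    pending = set(driver.window_handles)
    deadline = time.monotonic() + timeout
    while pending and time.monotonic() < deadline:
        for handle in list(pending):
            driver.switch_to.window(handle)  # focus the tab before querying it
            if is_ready(driver):
                pending.discard(handle)
        if pending:
            time.sleep(poll)  # avoid a tight busy loop between passes
    return pending
```

Calling poll_tabs(driver) after launching the requests returns an empty set when every tab finished in time; any handles left in the result are tabs you may want to reload.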

Related

Python 3 Selenium WebDriverWait causes script to hang/freeze forever

I have a script that uses a selenium webdriver (geckodriver) and loads various webpages.
The script works fine at the beginning, but then at a random point it stops working without raising any error (the program sort of hangs without really doing anything).
I added some logging statements to check where it hangs, and the hang is caused by the WebDriverWait statement (see below).
The last thing that is printed in the log is "get_records - Loaded".
The expected behavior to me would be to either print "get_records - Acquired pager", or to raise a TimeoutException after 10 seconds.
[...]
logging.info("get_records - Getting url: {}".format(url))
driver.get(url)
logging.info("get_records - Loaded")
# Get records number and result pages
elem = WebDriverWait(driver, 10).until(ec.element_to_be_clickable(
    (By.XPATH, "//td[@align='right']/span[@class='pager']"))
)
logging.info("get_records - Acquired pager")
[...]
Python version: 3.7.3
Selenium version: 3.141.0
Firefox version: 70.0.1
It seems like a similar bug happened with previous version (Selenium WebDriver (2.25) Timeout Not Working), but that bug was closed.
Is anyone having the same issue?
Update:
It seems like adding time.sleep(0.5) before elem prevents the script from freezing (either "get_records - Acquired pager" is printed, or the TimeoutException is raised).
Even though this is a workaround for the issue, I would rather not add any forced wait.
I have had exactly the same experience: the script works fine at first but hangs forever after some time. The 10-second timeout only limits how long WebDriverWait keeps polling for the element. The timeout for each HTTP request that the Python script sends to the webdriver is a separate setting, and it is not defined. It is None by default, meaning a request will wait indefinitely.
Short answer:
driver.command_executor.set_timeout(10)
driver.get(url)
Explanation:
Take chromedriver as an example. Whenever you run a Selenium script, a process named 'chromedriver' starts as well. Let's call it the 'control process'. It opens the browser and controls it. It also acts as an HTTP server, whose address and port you can get from driver.command_executor._url. It receives an HTTP request, processes it, tells the browser to do something (for example, open a URL) and returns a response. Details here.
When you call
elem = WebDriverWait(driver, 10).until(ec.element_to_be_clickable(
    (By.XPATH, "//td[@align='right']/span[@class='pager']"))
)
you are actually sending requests to the 'control process', which is an HTTP server, telling it to do something (find some elements in the current page). The timeout of 10 means that WebDriverWait keeps repeating that request for up to 10 seconds before it gives up and raises a TimeoutException in the Python script.
But what really happens here is that the 'control process' receives a request and never responds, so the Python side never gets a chance to check its deadline. I don't really know what is happening inside the 'control process'.
The Python selenium package uses urllib3 to send requests, with socket._GLOBAL_DEFAULT_TIMEOUT as the timeout. That means no timeout by default, so a request waits indefinitely. You can set one with driver.command_executor.set_timeout(10). Now, if the 'control process' breaks, you will get a timeout exception and can, for example, recreate the webdriver.
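The recovery step mentioned above (recreating the webdriver once a request times out) could look roughly like the sketch below. It is an illustration under assumptions, not something the selenium package provides: get_or_restart and make_driver are hypothetical names, and because the exact exception raised on a command timeout depends on the Selenium and urllib3 versions in use, the exception types are passed in as a parameter rather than hard-coded.

```python
def get_or_restart(driver, url, make_driver, errors=(Exception,)):
    """Load `url`; if the driver fails to answer, replace it and retry once.

    `make_driver` is a zero-argument callable that builds a fresh driver.
    Returns the driver that ended up loading the page (old or new).
    """
    try:
        driver.get(url)
        return driver
    except errors:
        try:
            driver.quit()  # best effort: the old control process may be dead
        except Exception:
            pass
        fresh = make_driver()
        fresh.get(url)
        return fresh
```

A call such as driver = get_or_restart(driver, url, webdriver.Firefox) keeps the rest of the script working with whichever driver survived. Narrowing errors to the exception you actually observe avoids restarting the browser on unrelated failures.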

Webbrowser opens two windows instead of two tabs

I'm trying to open two websites in two tabs in my web browser. What actually happens is that two separate web browser windows are opened.
import webbrowser
webbrowser.open_new('https://www.msn.com')
webbrowser.open_new_tab('https://www.aol.com/')
The issue is likely that the browser hasn't finished opening by the time you ask for a new tab. The docs do state that open_new_tab() opens a tab "if possible" and is otherwise equivalent to open_new(), which is why you are seeing two windows.
I suggest putting a small delay between the calls:
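import time
import webbrowser

url1 = 'https://www.msn.com'
url2 = 'https://www.aol.com/'
webbrowser.open_new(url1)
time.sleep(1)  # give the browser time to start before asking for a tab
webbrowser.open_new_tab(url2)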
Your other option is to poll the running processes and wait until the first instance of the browser appears before asking for a new tab.
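The polling option could be sketched as follows. This is only an illustration under stated assumptions: wait_until and process_running are hypothetical helpers, process_running relies on the pgrep command and therefore only works on POSIX systems, and "firefox" is a placeholder for whatever your default browser's process is called. A running process also does not guarantee that the browser is ready to accept a new tab, so a short extra delay may still be needed.

```python
import subprocess
import time


def wait_until(predicate, timeout=10.0, poll=0.2):
    """Call `predicate` repeatedly until it returns True or `timeout` expires.

    Returns True if the predicate succeeded, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)


def process_running(name):
    """POSIX-only assumption: ask `pgrep` whether any process matches `name`."""
    result = subprocess.run(["pgrep", "-f", name], stdout=subprocess.DEVNULL)
    return result.returncode == 0


# Intended use (left as comments so that running this file opens nothing):
# webbrowser.open_new(url1)
# wait_until(lambda: process_running("firefox"))
# webbrowser.open_new_tab(url2)
```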

Python3, Selenium, Chromedriver console window

I've made a Selenium test using Python 3 and the selenium library.
I've also used Tkinter to make a GUI for entering some input (account, password...).
I've managed to hide the console window for Python by saving the script with the .pyw extension, and when I build an executable from my code, the console doesn't show up even if the script is saved with the .py extension.
However, every time chromedriver starts, it also opens a console window, and when the driver exits, this window does not close.
So in a loop, I'm left with many webdriver consoles.
Is there a workaround to prevent the driver from launching a console every time it runs?
I hated dealing with this in selenium until I remembered that this was an obvious use case for context managers just like the usage of open.
I did find out that selenium is about to add this officially to their package in this pull request
Until this is officially added, this snippet should give you the functionality you need to get things going :)
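import contextlib
from selenium import webdriver

@contextlib.contextmanager
def Chrome(*args, **kwargs):
    # A distinct local name avoids shadowing the imported webdriver module
    driver = webdriver.Chrome(*args, **kwargs)
    try:
        yield driver
    finally:
        driver.quit()  # always end the session, even if the body raises

with Chrome() as driver:
    # whatever you're planning on doing goes here
    pass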
driver.close() and driver.quit() are two different methods for closing the browser session in Selenium WebDriver.
driver.close() - It closes the browser window on which the focus is set.
driver.quit() - It basically calls the driver.dispose method, which in turn closes all the browser windows and ends the WebDriver session gracefully.
You should use driver.quit() whenever you want to end the program. It closes all open browser windows and terminates the WebDriver session. If you do not call driver.quit() at the end of the program, the WebDriver session will not be closed properly and its files will not be cleared from memory. This may result in memory leaks.

Python Selenium: How to let human interaction in a automated script?

I am working on a script that shows a CAPTCHA and a few other things in a pop-up window. I am writing the script for Firefox. Is it possible for me to enter the values by hand and have the script resume its operations when I hit the Submit button? I guess with some kind of infinite loop?
You could wait for the submit button to be clicked by the user:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Firefox()
# load the page
driver.get("https://www.google.com/recaptcha/api2/demo")
# get the submit button
bt_submit = driver.find_element(By.CSS_SELECTOR, "[type=submit]")
# wait for the user to click the submit button (check every 1s with a 1000s timeout);
# the button goes stale once the form is submitted and the page is replaced
WebDriverWait(driver, timeout=1000, poll_frequency=1) \
    .until(EC.staleness_of(bt_submit))
print("submitted")
I believe what you're asking is if it's possible to have your selenium script pause for human interaction that can't be fully automated.
There are several ways to do this:
Easy but hacky feeling:
In your python script, put
import pdb
pdb.set_trace()
At the point you want to pause for the human. This will cause the python app to drop to a debugger. When the human has done their thing, they can type c and hit enter to continue running the selenium script.
Slightly less hacky feeling (and easier for the human).
At the point where you want to put the pause, assuming the user submits something, you can do something like (with an import time at the top):
for _ in range(100):  # or loop forever, but this allows a timeout if the user falls asleep or whatever
    if driver.current_url.find("captcha") == -1:
        break
    time.sleep(6)  # wait 6 seconds per pass, so the user has 10 minutes before the timeout occurs
More elegant approaches are left as an exercise for the reader (I know there's a way you should be able to not have to busy-wait, but I haven't used it in too long)
One way is to check for some content of the next page that loads after the CAPTCHA is entered, and wait until it is found.
Another way is to poll the current URL until it changes (the URL usually changes after the CAPTCHA is entered), or to wait for the next URL, with:
while driver.current_url == your_current_url:
    time.sleep(1)  # requires import time at the top

Reload time & retries in selenium for a url

I am working on selenium with python for downloading file from a url.
from selenium import webdriver
profile = webdriver.FirefoxProfile()
profile.set_preference('browser.download.folderList', 2)  # custom location
profile.set_preference('browser.download.manager.showWhenStarting', False)
profile.set_preference('browser.download.dir', '/tmp')
profile.set_preference('browser.helperApps.neverAsk.saveToDisk', 'text/csv')
browser = webdriver.Firefox(profile)
try:
    browser.get("http://www.drugcite.com/?q=ACTIMMUNE")
    browser.find_element_by_id('exportpt').click()
    browser.find_element_by_id('exporthlgt').click()
except:
    pass
I want to set a timeout for this program: if the URL has not loaded within 60 seconds because of a network issue, it should retry every 60 seconds, and after 3 tries it should move on.
How can I achieve such in this code?
Thanks
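You could use browser.implicitly_wait(60) (see WebDriver.implicitly_wait). Note that this sets how long element lookups wait for an element to appear, not how long the page load itself may take.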
There is nothing built in to do this. However, I wouldn't have said it would be too hard.
Just use an explicit wait to find a particular element that should be there when the page loads. Set the timeout to be 60 seconds on this explicit wait.
Wrap this in a loop that executes up to three times. To avoid it running three times unnecessarily, put in a break statement when the explicit wait actually runs without any issue.
That means it'll run up to three times, waiting up to 60 seconds each time, and once it's successful it'll exit the loop. If it isn't successful after all of that, then it'll crash.
Note: I've not actually tried this but it's just a logical solution!
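The loop described above might be sketched like this. It is untested against the site in the question and makes a few assumptions: load_with_retries and export_link_present are hypothetical names, the 'exportpt' id is taken from the question's code, and instead of crashing after the last attempt the helper returns False so the caller can "go ahead" as the question asks. The exception types to retry on are a parameter, since you will probably want to narrow them to Selenium's TimeoutException.

```python
def load_with_retries(browser, url, page_ready, attempts=3, timeout=60,
                      errors=(Exception,)):
    """Try to load `url` up to `attempts` times, allowing `timeout` seconds each.

    `page_ready(browser, timeout)` must raise if the page did not load in time.
    Returns True on the first successful attempt, False if every attempt failed.
    """
    browser.set_page_load_timeout(timeout)  # bound how long browser.get() blocks
    for _ in range(attempts):
        try:
            browser.get(url)
            page_ready(browser, timeout)
            return True  # success: stop retrying
        except errors:
            continue  # this attempt failed; fall through to the next one
    return False


def export_link_present(browser, timeout):
    """Explicit wait for the 'exportpt' element used in the question."""
    # Imported lazily so load_with_retries itself has no Selenium dependency.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait
    WebDriverWait(browser, timeout).until(
        EC.presence_of_element_located((By.ID, "exportpt")))
```

With the browser from the question, load_with_retries(browser, "http://www.drugcite.com/?q=ACTIMMUNE", export_link_present) returns True as soon as one attempt succeeds, and the click() calls can then run only in that case.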
