I was trying to scrape a website with Selenium (it is a website with dynamically loaded content).
To wait for the dynamic content to load, I usually use time.sleep(), but I recently learned about (and tried) driver.implicitly_wait(), and with it I do not get the expected content.
Here is the code:
from selenium import webdriver
import os
import time
os.environ['MOZ_HEADLESS'] = '1'
baseSite = 'https://bair.berkeley.edu/students.html'
driver = webdriver.Firefox()
#driver.implicitly_wait(5) --> full content is not retrieved
driver.get(baseSite)
time.sleep(5) # full content is retrieved
source = driver.page_source
print(source)
Setting an implicit wait timeout (i.e., using implicitly_wait()) does not affect how the browser loads a page. What that method does is poll the DOM for the desired element when you call find_element or find_elements. In the code you posted, setting the implicit wait timeout has no effect because you are not attempting to find any elements on the page.
If you were to provide more details about what you’re expecting (aside from saying, “I want the page to be ‘fully loaded’”, because that phrase is so vague as to be meaningless), it might be easier to provide more guidance.
You could use an explicit wait, which would wait until a certain condition is met. For example,
element = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, "myDynamicElement")))
would wait until WebDriver locates that element (or until 10 seconds have passed, at which point it times out). This link is helpful for learning more about the different types of waits in Selenium: https://selenium-python.readthedocs.io/waits.html
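To make that concrete, a minimal sketch against the page from the question; the CSS selector here is an assumption, not something verified against that page's markup:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Firefox()
driver.get('https://bair.berkeley.edu/students.html')
# Wait up to 10 seconds for the dynamically inserted content to appear;
# 'div.student' is a hypothetical selector for that content.
WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.CSS_SELECTOR, 'div.student'))
)
source = driver.page_source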
Related
For some reason, when I try to wait for an element by any criterion (visibility, presence, etc.), it waits until the whole page has loaded and not only for the specific element.
The page uses GTM, in case that is related. Here is the relevant part of the code:
ui.WebDriverWait(driver, 40).until(EC.element_to_be_clickable((By.ID, 'menu-item-15608')))
startLoadReg = time.time()
driver.find_element_by_id("menu-item-15608").click()
driver.switch_to.window(driver.window_handles[-1])
RegTabLoaded = ui.WebDriverWait(driver, 45).until(EC.presence_of_element_located((By.ID, "page-title")))
endLoadReg = time.time()
Does anyone know what might cause this?
Thanks!
When you load a page in Selenium, get() first waits until the document readiness state becomes complete. See https://www.w3.org/TR/webdriver/#dfn-waiting-for-the-navigation-to-complete
So even if you have waits for specific conditions on specific elements, Selenium will still wait until the initial DOM has been built.
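If you do not want get() to block until the document is complete, the page load strategy can be relaxed instead; a minimal sketch, assuming Selenium 4 where this is exposed as page_load_strategy:
from selenium import webdriver
from selenium.webdriver.firefox.options import Options

options = Options()
# 'eager' returns once DOMContentLoaded fires; 'none' returns as soon as the
# initial document is received; the default is 'normal' (full load).
options.page_load_strategy = 'eager'
driver = webdriver.Firefox(options=options)
driver.get('https://example.com')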
I need to constantly reload a web page as quickly as possible and check, if something has changed. But when I tried the following, it didn't work.
I used something like this:
while True:
    driver.get(driver.current_url)
    source = driver.page_source
    # -- check for change --
while using the Edge webdriver (it was the fastest one), but even after the change had already occurred, the webdriver was still getting the old version of the page.
I don't know whether the driver caches the page or something, but I need to make sure that I always get the current version. How can I achieve that?
You invoke get() again on the current URL before you pull the page_source to check for a change. It is worth mentioning that even though the browser may have reached 'document.readyState' equal to "complete" and handed control back to Selenium, that does not guarantee that all of the JavaScript and Ajax calls on the new page have completed. Until the JavaScript and Ajax calls associated with the DOM tree have finished, the page is not completely rendered and you may not be able to track the intended changes.
An ideal way to check for changes would be to use WebDriverWait in conjunction with the expected_conditions clause title_contains, as follows:
while True:
    driver.get(driver.current_url)
    WebDriverWait(driver, 10).until(EC.title_contains("full_or_partial_text_of_the_page_title"))
    source = driver.page_source
    # -- check for change --
Note: Since the page title resides within the <head> tag of the HTML DOM, a better solution is to use WebDriverWait for the visibility of an element that will be present in all situations within the <body> tag of the DOM tree, as follows:
while True:
    driver.get(driver.current_url)
    WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.ID, "id_of_element_present_in_all_situation")))
    source = driver.page_source
    # -- check for change --
driver = webdriver.Chrome()
#driver.set_page_load_timeout(10)
driver.get("sitename.com")
driver.find_element_by_id("usernameId").send_keys("myusername")
Setting a page load timeout proved counterproductive, as the page load was killed even before the elements had actually loaded!
Currently, when I try to log in to a site, find_element_by_id() waits for the complete page to load and then gets me the element. I've read about implicit/explicit waits used along with ExpectedConditions, but as far as I understand they are used to wait for an element to appear (dynamically) after the complete page has loaded.
Is there a way I can find an element as soon as it is visible (polling is good enough), without waiting for the complete page to load? It would be great to do so, as some pages take quite a while to load (heavy traffic, low availability, or a poor internet connection could be reasons).
I am using Selenium with Python, and a Chrome Driver. Thanks.
Take a look at the Selenium Python documentation.
It has visibility_of_element_located.
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

wait = WebDriverWait(driver, 10)
element = wait.until(EC.visibility_of_element_located((By.ID, 'someid')))
It is a best practice to wait for the entire page to load before you take any further action. However, if you want to stop the page load partway through (or load the page only for a specified time and carry on), you can change this in the browser's profile settings.
In case of Firefox :
profile = webdriver.FirefoxProfile()
profile.set_preference("http.response.timeout", 10)
profile.set_preference("dom.max_script_run_time", 10)
driver = webdriver.Firefox(firefox_profile=profile)
Hope it helps, cheers.
I am writing a bot in Python with the Selenium module. When I open a webpage with my bot, the page contains many external resources besides the DOM, so it takes a long time to load completely. I used explicit and implicit waits to eliminate this problem, since I just wanted a specific element to load and not the whole page, but it didn't work. The problem is, if I run the following:
driver = webdriver.Firefox()
driver.get('somewebpage')
elm = WebDriverWait(driver, 5).until(EC.presence_of_element_located((By.ID, 'someelementID')))
elm.click()
It doesn't work, since Selenium waits for driver.get() to fully retrieve the webpage before it proceeds further. Now I want to write code that sets a timeout for driver.get(), like:
driver.get('somewebpage').timeout(5)
where driver.get() stops loading the page after 5 seconds and the program flow proceeds, whether or not the webpage fully loaded.
By the way, I have searched for the feature described above and came across this:
Selenium WebDriver go to page without waiting for page load
But the problem is that the answer in the link above does not give the Python equivalent code.
How do I accomplish the feature I am looking for?
Python equivalent code for the question linked above (Selenium WebDriver go to page without waiting for page load):
from selenium import webdriver
profile = webdriver.FirefoxProfile()
profile.set_preference('webdriver.load.strategy', 'unstable')
driver = webdriver.Firefox(profile)
and:
driver.set_page_load_timeout(5)
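Note that once set_page_load_timeout() is exceeded, get() raises a TimeoutException, so you would normally catch it and then fall back to an explicit wait for the element you actually need. A rough sketch, reusing the placeholders from the question:
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Firefox()
driver.set_page_load_timeout(5)
try:
    driver.get('somewebpage')
except TimeoutException:
    # The page load was cut off after 5 seconds; the element may already be in the DOM.
    pass
elm = WebDriverWait(driver, 5).until(EC.presence_of_element_located((By.ID, 'someelementID')))
elm.click()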
There are a ton of questions on this. Here is an example that waits until all jQuery Ajax calls have completed, with a 5 second timeout.
from selenium.webdriver.support.ui import WebDriverWait
WebDriverWait(driver, 5).until(lambda s: s.execute_script("return jQuery.active == 0"))
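The same execute_script() approach can be used to wait for plain document readiness when the page does not use jQuery; a small sketch:
from selenium.webdriver.support.ui import WebDriverWait

# Wait up to 5 seconds for the browser to report the document as fully loaded.
WebDriverWait(driver, 5).until(
    lambda d: d.execute_script("return document.readyState") == "complete"
)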
It was a really tedious issue to solve. I just did the following and the problem got resolved:
driver= webdriver.Firefox()
driver.set_page_load_timeout(5)
driver.get('somewebpage')
It worked for me using Firefox driver (and Chrome driver as well).
I am using the Python unit testing library (unittest) with Selenium WebDriver. I am trying to find an element by its name. About half of the time, the tests throw a NoSuchElementException, and the other half of the time they do not.
I was wondering if it had to do with the selenium webdriver not waiting long enough for the page to load.
driver = webdriver.WhatEverBrowser()
driver.implicitly_wait(60) # This line will cause it to search for 60 seconds
It only needs to be inserted in your code once (I usually do it right after creating the webdriver object).
For example, if your page for some reason takes 30 seconds to load (buy a new server) and the element is one of the last things to show up on the page, it pretty much just keeps checking over and over again whether the element is there, for up to 60 seconds; then, if it doesn't find it, it throws the exception.
Also make sure your scope is correct, i.e., if you are focused on a frame and the element you are looking for is NOT in that frame, it will NOT find it.
I see that too. What I do is just wait it out...
you could try:
import time
from selenium.common.exceptions import NoSuchElementException

while True:
    try:
        x = driver.find_element_by_name('some_name')
        break
    except NoSuchElementException:
        time.sleep(1)
        # possibly use driver.get() again if needed
Also, try updating your Selenium to the newest version with pip install --upgrade selenium
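A bounded alternative to the open-ended loop above, using the explicit-wait machinery from the other answers (the name is a placeholder):
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Polls roughly every 500 ms and raises TimeoutException after 60 seconds.
x = WebDriverWait(driver, 60).until(
    EC.presence_of_element_located((By.NAME, 'some_name'))
)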
I put my money on a frame, as I had a similar issue before :)
Check your HTML again and see whether you are dealing with a frame. If you are, then switching to the correct frame will let you locate the element.
Python
driver.switch_to.frame("frameName")
Then search for the element.
If not, try adding a wait time as others have suggested.
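A slightly fuller sketch of the frame handling (the frame name and element name are placeholders):
# Switch into the frame, locate the element, then switch back out.
driver.switch_to.frame("frameName")
element = driver.find_element_by_name("some_name")
driver.switch_to.default_content()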
One way to handle waiting for an element to appear is like this:
import selenium.webdriver.support.ui as ui
wait = ui.WebDriverWait(driver,10)
wait.until(lambda driver: driver.find_element_by_name('some_name'))
elem = driver.find_element_by_name('some_name')
You are correct that the webdriver is not waiting for the page to load; there is no built-in default wait on driver.get().
To resolve this, you have to define an explicit wait, so that while the page is loading it will not search for the WebElement. The URL below helps with this:
http://docs.seleniumhq.org/docs/04_webdriver_advanced.jsp
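For example, a minimal explicit wait (the ID is a placeholder):
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Polls the DOM until the element is present, or raises TimeoutException after 10 seconds.
element = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.ID, "some_id"))
)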
You need a wait-until (your element loads). If you are sure that your element will eventually appear on the page, this ensures that whatever validations you run only occur after your expected element has loaded.
I feel it might be a synchronisation issue (i.e., the webdriver speed and the application speed are mismatched).
Use an implicit wait:
driver.implicitly_wait(9)  # seconds (the Python equivalent of a 9000 ms implicit wait)