Extracting links with a specific class with Selenium in Python

I am trying to extract links from an infinite scroll website.
Here is my code for scrolling down the page:
from selenium import webdriver
import time

driver = webdriver.Chrome('C:\\Program Files (x86)\\Google\\Chrome\\chromedriver.exe')
driver.get('http://seekingalpha.com/market-news/top-news')

for i in range(0, 2):
    driver.implicitly_wait(15)
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(20)
I aim to extract specific links from this page, with class="market_current_title" and HTML like the following:
<a class="market_current_title" href="/news/3223955-dow-wraps-best-week-since-2011-s-and-p-strongest-week-since-2014" sasource="titles_mc_top_news" target="_self">Dow wraps up best week since 2011; S&P in strongest week since 2014</a>
When I used
URL = driver.find_elements_by_class_name('market_current_title')
I ended up with the error "stale element reference: element is not attached to the page document". Then I tried
URL = driver.find_elements_by_xpath("//div[@id='a']//a[@class='market_current_title']")
but it says that there is no such link.
Do you have any idea how to solve this problem?

You're probably trying to interact with an element that has already changed (probably elements above your scrolling position and off screen). Try this answer for some good options on how to overcome this.
Here's a snippet:
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
import selenium.webdriver.support.expected_conditions as EC
import selenium.webdriver.support.ui as ui
# return True if element is visible within 2 seconds, otherwise False
def is_visible(self, locator, timeout=2):
    try:
        ui.WebDriverWait(driver, timeout).until(
            EC.visibility_of_element_located((By.CSS_SELECTOR, locator)))
        return True
    except TimeoutException:
        return False
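A minimal usage sketch under the question's setup (URL and class name taken from the question; this is a hedged illustration, not a guaranteed fix): re-locate the elements after every scroll and read their href attributes immediately, so no stale reference is reused.
import time
from selenium import webdriver

driver = webdriver.Chrome('C:\\Program Files (x86)\\Google\\Chrome\\chromedriver.exe')
driver.get('http://seekingalpha.com/market-news/top-news')

links = []
for i in range(0, 2):
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(5)  # or poll with is_visible('a.market_current_title') from the snippet above
    # find_elements is called again after each scroll, so the references are fresh
    for a in driver.find_elements_by_class_name('market_current_title'):
        links.append(a.get_attribute('href'))

print(set(links))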

Related

How to fix "stale element reference: element is not attached to the page document" while scraping with Selenium

Good evening modern-day heroes, hope everyone's safe and sound!
What I'm hoping to achieve with this Selenium script is to load the page, click the BTC, ETH and XRP icons to filter results, keep clicking the "show more" button until the maximum number of elements (1138) has been loaded, obtain the hrefs of those 1138 companies, then click through and visit their respective pages and scrape further data points located on each internal page visited.
With that said, I've tried lots of different approaches, including just printing the link of each company, which worked; however, it fails to properly visit the extracted hrefs and says "stale element reference: element is not attached to the page document".
I've heard that explicit/implicit waits could help fix this, but I can't seem to wrap my head around how to use them, particularly with the links variable, which is where the code stops and gives me the error above.
I have a feeling the issue is with the while loop and how it processes the fact that I'm looping through a list of links that will be visited next. I can't emphasize how grateful I'll be if someone can guide me in the right direction!
from selenium.webdriver import Chrome
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import pandas as pd
import time
from selenium.common.exceptions import NoSuchElementException, ElementNotVisibleException
webdriver = '/Users/karimnabil/projects/selenium_js/chromedriver-1'
driver = Chrome(webdriver)
url = 'https://acceptedhere.io/catalog/company/'
driver.get(url)

btc = driver.find_element_by_xpath("//ul[@role='currency-list']/li[1]/a")
btc.click()
eth = driver.find_element_by_xpath("//ul[@role='currency-list']/li[2]/a")
eth.click()
xrp = driver.find_element_by_xpath("//ul[@role='currency-list']/li[5]/a")
xrp.click()

all_categories = driver.find_element_by_xpath("//div[@class='dropdownMenu']/ul/li[1]")
all_categories.click()
time.sleep(1)

maximum_number = 1138
while True:
    show_more = driver.find_element_by_xpath("//div[@class='row search-result']/div[3]/button")
    elements = driver.find_elements_by_xpath("//div[@class='row desktop-results mobile-hide']/div")
    if len(elements) > maximum_number:
        break
    show_more.click()
    time.sleep(1)

for element in elements:
    links = element.find_elements_by_xpath(".//div/div/div[2]/div/div/div[1]/a")
    links = [url.get_attribute('href') for url in links]
    time.sleep(0.5)
    for link in links:
        driver.get(link)
        company_title = driver.find_element_by_xpath("//h3").text
        print(company_title)
When you navigate through pages, the elements you put in your variables (e.g. show_more) become stale, since you are now on a different page. You may also need to wait for an element to load or to become clickable. Here are some examples:
https://seleniumbyexamples.github.io/waitclickable
https://seleniumbyexamples.github.io/waitvisibility
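Separately, a minimal sketch of one way to avoid the stale references in the loop from the question (XPaths copied from the question; a hedged illustration, not the exact fix): collect the href strings first, then navigate, since plain strings cannot go stale.
# collect hrefs as plain strings before leaving the results page...
all_links = []
for element in driver.find_elements_by_xpath("//div[@class='row desktop-results mobile-hide']/div"):
    for a in element.find_elements_by_xpath(".//div/div/div[2]/div/div/div[1]/a"):
        all_links.append(a.get_attribute('href'))

# ...then navigate; only strings are reused here, so nothing can go stale
for link in all_links:
    driver.get(link)
    company_title = driver.find_element_by_xpath("//h3").text
    print(company_title)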

Circumventing Stale Element Exceptions in Selenium

I have read several articles on this site regarding the StaleElementReferenceException and am aware that this error is caused by the element no longer being in the site's DOM. What I am trying to do is click the bottom links on this webpage in order to go on and see the next page's listings. I have tried a few ways around this exception, and haven't found any that work. Here is an example of the code I have tried, and what I thought it might accomplish.
driver = webdriver.Chrome(r'C:\Users\Hank\Desktop\chromedriver_win32\chromedriver.exe')
driver.get('https://steamcommunity.com/market/listings/440/Unusual%20Old%20Guadalajara')
import time
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait as wait
from selenium.webdriver.support.expected_conditions import presence_of_element_located
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import StaleElementReferenceException
action = ActionChains(driver)
page_links = wait(driver, 10).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, '[class^=market_paging_pagelink]')))
try:
    action.move_to_element(page_links[1]).click().perform()
except StaleElementReferenceException:
    print("Exception received, trying again")
    time.sleep(5)
    page_links = wait(driver, 10).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, '[class^=market_paging_pagelink]')))
    action.move_to_element(page_links[1]).click().perform()
I was hoping that this code segment would attempt to move to the element at the bottom, click it, or return the error message, and try again, succeeding the second time. Instead, the code simply throws the error again. If my question has already been answered, please direct me to the relevant link.
Thank you!
The approach I normally go for is to click Next page until the button gets disabled/invisible.
Here's a working example based on your page. You should obviously do whatever is relevant in the while loop; I chose to capture prices for the sake of example.
url="https://steamcommunity.com/market/listings/440/Unusual%20Old%20Guadalajara"
driver.get(url)
next_button=wait(driver, 10).until(EC.presence_of_element_located((By.ID,'searchResults_btn_next')))
# capture the start value from "Showing x-xx of 22 results"
#need this to check against later
ref_val=wait(driver, 10).until(EC.presence_of_element_located((By.ID,'searchResults_start'))).text
while next_button.get_attribute('class') == 'pagebtn':
next_button.click()
#wait until ref_val has changed
wait(driver, 10).until(lambda driver: wait(driver, 10).until(EC.presence_of_element_located((By.ID,'searchResults_start'))).text != ref_val)
# ====== Do whatever relevant here =============================
page_num=wait(driver, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR,'.market_paging_pagelink.active'))).text
print(f"Prices from page {page_num}")
prices = wait(driver, 10).until(EC.presence_of_all_elements_located(
(By.XPATH, ".//span[#class='market_listing_price market_listing_price_with_fee']")))
for price in prices:
print(price.text)
#================================================================
#get the new reference value
ref_val = wait(driver, 10).until(EC.presence_of_element_located((By.ID, 'searchResults_start'))).text

StaleElementReferenceException in Python

I am trying to scrape data from the Sunshine List website (http://www.sunshinelist.ca/) using the Selenium package, but I get the error below. From several other related posts I understand that I need to use WebDriverWait to explicitly ask the driver to wait/refresh, but I am unable to identify where and how I should call the function.
StaleElementReferenceException: Message: The element reference of <tr class="even"> is stale: either the element is no longer attached to the DOM or the page has been refreshed
import numpy as np
import pandas as pd
import requests
import time
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.common.exceptions import StaleElementReferenceException
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
ffx_bin = FirefoxBinary(r'C:\Users\BhagatM\AppData\Local\Mozilla Firefox\firefox.exe')
ffx_caps = DesiredCapabilities.FIREFOX
ffx_caps['marionette'] = True
driver = webdriver.Firefox(capabilities=ffx_caps,firefox_binary=ffx_bin)
driver.get("http://www.sunshinelist.ca/")
driver.maximize_window()
tablewotags1 = []
while True:
    divs = driver.find_element_by_id('datatable-disclosures')
    divs1 = divs.find_elements_by_tag_name('tbody')
    for d1 in divs1:
        div2 = d1.find_elements_by_tag_name('tr')
        for d2 in div2:
            tablewotags1.append(d2.text)
    try:
        driver.find_element_by_link_text('Next →').click()
    except NoSuchElementException:
        break
year1=tablewotags1[0::10]
name1=tablewotags1[3::10]
position1=tablewotags1[4::10]
employer1=tablewotags1[1::10]
df1=pd.DataFrame({'Year':year1,'Name':name1,'Position':position1,'Employer':employer1})
df1.to_csv('Sunshine List-1.csv', index=False)
If your problem is to click the "Next" button, you can do that with the xpath:
driver = webdriver.Firefox(executable_path=r'/pathTo/geckodriver')
driver.get("http://www.sunshinelist.ca/")
wait = WebDriverWait(driver, 20)
el = wait.until(EC.presence_of_element_located((By.XPATH, "//ul[@class='pagination']/li[@class='next']/a[@href='#' and text()='Next → ']")))
el.click()
For each click on the "Next" button, you should find that button again and click on it.
Or do something like this:
max_attempts = 10
next_btn = None
while True:
    try:
        next_btn = self.driver.find_element_by_css_selector(".next>a")
        break
    except NoSuchElementException:
        time.sleep(0.5)
        max_attempts -= 1
        if max_attempts == 0:
            self.fail("Cannot find element.")
And after this code, perform the click action.
PS: Also try adding just time.sleep(x) after finding the element and then doing the click action.
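For instance, a minimal sketch of that suggestion (reusing the selector from the snippet above; the one-second pause is an arbitrary choice):
import time

next_btn = self.driver.find_element_by_css_selector(".next>a")
time.sleep(1)  # brief pause after finding the element, before clicking it
next_btn.click()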
Try the code below.
When the element is no longer attached to the DOM and the StaleElementReferenceException is raised, search for the element again to re-reference it.
Please note I checked this with Chrome:
try:
    driver.find_element_by_css_selector('div[id="datatable-disclosures_wrapper"] li[class="next"]>a').click()
except StaleElementReferenceException:
    driver.find_element_by_css_selector('div[id="datatable-disclosures_wrapper"] li[class="next"]>a').click()
except NoSuchElementException:
    break
Stale element exceptions can be handled by catching StaleElementReferenceException so that the for loop continues to execute when a find_element call fails.
from selenium.common import exceptions
and customize the body of your for loop as:
# inside your for loop:
try:
    driver.find_elements_by_id("data")  # method to find element
    # your code
except exceptions.StaleElementReferenceException:
    pass
When the StaleElementException is raised, that means something changed on the site, but not in the list you have. So the trick is to refresh that list every time, inside the loop, like this:
while True:
    driver.implicitly_wait(4)
    for d1 in driver.find_element_by_id('datatable-disclosures').find_element_by_tag_name('tbody').find_elements_by_tag_name('tr'):
        tablewotags1.append(d1.text)
    try:
        driver.switch_to.default_content()
        driver.find_element_by_xpath('//*[@id="datatable-disclosures_wrapper"]/div[2]/div[2]/div/ul/li[7]/a').click()
    except NoSuchElementException:
        print('Don\'t be so cryptic about error messages, they are good\n...Script broke clicking next')  # jk aside, put some info there
        break
Hope this helps you, cheers.
Edit:
So I went to the said website; the layout is pretty straightforward, but the structure repeats itself about four times, so when you crawl the site like that, something is bound to change.
So I've edited the code to only scrape one tbody tree. This tree comes from the first datatable-disclosures. I also added some waits, as sketched below.
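A rough reconstruction of what that edit could look like (a hedged sketch based on the selectors used elsewhere in this thread, not the answerer's exact code): wait for the table, then read rows only from the first tbody under #datatable-disclosures.
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

tablewotags1 = []
while True:
    # wait until the first disclosures table body is present before reading its rows
    tbody = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, '#datatable-disclosures tbody')))
    for row in tbody.find_elements_by_tag_name('tr'):
        tablewotags1.append(row.text)
    try:
        # the "Next" pager link, as used in the other answers on this question
        driver.find_element_by_css_selector('#datatable-disclosures_wrapper li.next > a').click()
    except NoSuchElementException:
        break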

Cannot locate element within a web page using Selenium Python

I just want to write a simple log-in script for one website. However, I think the log-in page was written in JS, and it's really hard to locate the elements with Selenium.
The web page I am going to play with is:
"https://www.nike.com/snkrs/login?returnUrl=%2F"
Here is how the page and its inspect-element view look (screenshots omitted).
I was trying to locate the element by following code:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time
driver = webdriver.Firefox()
driver.get("https://www.nike.com/snkrs/login?returnUrl=%2Fthread%2Fe9680e08e7e3cd76b8832684037a58a369cad5ed")
time.sleep(5)
driver.switch_to.frame(driver.find_element_by_tag_name("iframe"))
elem = driver.find_element_by_xpath("//*[@id='ce3feab5-6156-441a-970e-23544473a623']")
elem.send_keys("pycon")
elem.send_keys(Keys.RETURN)
driver.close()
This code returns an error saying that the element could not be found by [@id='ce3feab5-6156-441a-970e-23544473a623'].
I tried playing with frames, but it does not seem to work. If I go to the "view page source" page, it is full of JS code.
Is there a good way to play with such a web page with selenium?
Try changing the code:
elem = driver.find_element_by_xpath("//*[@id='ce3feab5-6156-441a-970e-23544473a623']")
to
elem = driver.find_element_by_xpath("//*[@type='email']")
My guess (and observation) is that the id changes each time you visit the page. The id looks auto-generated, and when I go to the page multiple times, the id is different each time.
You'll need to search for something that doesn't change. For example, you can search for the name attribute, which has the seemingly static value "emailAddress"
element = driver.find_element_by_name("emailAddress")
You could also use an xpath expression to search for other attributes, such as data-componentname:
element = driver.find_element_by_xpath("//input[@data-componentname='emailAddress']")
Also, instead of a hard-coded sleep, you can simply wait for the element to be visible:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Firefox()
driver.get("https://www.nike.com/snkrs/login")
element = WebDriverWait(driver, 10).until(
    EC.visibility_of_element_located((By.NAME, "emailAddress"))
)

Wait until page is loaded with Selenium WebDriver for Python

I want to scrape all the data of a page implemented by an infinite scroll. The following Python code works.
for i in range(100):
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(5)
This means every time I scroll down to the bottom, I need to wait 5 seconds, which is generally enough for the page to finish loading the newly generated contents. But, this may not be time efficient. The page may finish loading the new contents within 5 seconds. How can I detect whether the page finished loading the new contents every time I scroll down? If I can detect this, I can scroll down again to see more contents once I know the page finished loading. This is more time efficient.
The webdriver will wait for a page to load by default via the .get() method.
As you may be looking for some specific element, as @user227215 said, you should use WebDriverWait to wait for an element located in your page:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import TimeoutException
browser = webdriver.Firefox()
browser.get("url")
delay = 3 # seconds
try:
    myElem = WebDriverWait(browser, delay).until(EC.presence_of_element_located((By.ID, 'IdOfMyElement')))
    print("Page is ready!")
except TimeoutException:
    print("Loading took too much time!")
I have used it for checking alerts. You can use any of the other locator methods to find the element.
EDIT 1:
I should mention that the webdriver will wait for a page to load by default. It does not wait for loading inside frames or for ajax requests. It means when you use .get('url'), your browser will wait until the page is completely loaded and then go to the next command in the code. But when you are posting an ajax request, webdriver does not wait and it's your responsibility to wait an appropriate amount of time for the page or a part of page to load; so there is a module named expected_conditions.
Trying to pass find_element_by_id to the constructor for presence_of_element_located (as shown in the accepted answer) caused NoSuchElementException to be raised. I had to use the syntax in fragles' comment:
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
driver = webdriver.Firefox()
driver.get('url')
timeout = 5
try:
    element_present = EC.presence_of_element_located((By.ID, 'element_id'))
    WebDriverWait(driver, timeout).until(element_present)
except TimeoutException:
    print("Timed out waiting for page to load")
This matches the example in the documentation. Here is a link to the documentation for By.
Find below 3 methods:
readyState
Checking page readyState (not reliable):
def page_has_loaded(self):
    self.log.info("Checking if {} page is loaded.".format(self.driver.current_url))
    page_state = self.driver.execute_script('return document.readyState;')
    return page_state == 'complete'
The wait_for helper function is good, but unfortunately click_through_to_new_page is open to the race condition where we manage to execute the script in the old page, before the browser has started processing the click, and page_has_loaded just returns true straight away.
id
Comparing new page ids with the old one:
def page_has_loaded_id(self):
    self.log.info("Checking if {} page is loaded.".format(self.driver.current_url))
    try:
        new_page = browser.find_element_by_tag_name('html')
        return new_page.id != old_page.id
    except NoSuchElementException:
        return False
It's possible that comparing ids is not as effective as waiting for stale reference exceptions.
staleness_of
Using staleness_of method:
@contextlib.contextmanager
def wait_for_page_load(self, timeout=10):
    self.log.debug("Waiting for page to load at {}.".format(self.driver.current_url))
    old_page = self.find_element_by_tag_name('html')
    yield
    WebDriverWait(self, timeout).until(staleness_of(old_page))
For more details, check Harry's blog.
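A short usage sketch, assuming the context manager above is adapted to take a plain driver instead of self (staleness_of comes from selenium.webdriver.support.expected_conditions; the link text used here is hypothetical):
import contextlib
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support.expected_conditions import staleness_of

@contextlib.contextmanager
def wait_for_page_load(driver, timeout=10):
    old_page = driver.find_element_by_tag_name('html')
    yield
    WebDriverWait(driver, timeout).until(staleness_of(old_page))

# click a link inside the block; the block exits once the old page goes stale
with wait_for_page_load(driver, timeout=10):
    driver.find_element_by_link_text('Next').click()  # hypothetical link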
As mentioned in the answer from David Cullen, I've always seen recommendations to use a line like the following one:
element_present = EC.presence_of_element_located((By.ID, 'element_id'))
WebDriverWait(driver, timeout).until(element_present)
It was difficult for me to find all the possible locators that can be used with By in one place, so I thought it would be useful to provide the list here.
According to Web Scraping with Python by Ryan Mitchell:
ID
Used in the example; finds elements by their HTML id attribute.
CLASS_NAME
Used to find elements by their HTML class attribute. Why is this function CLASS_NAME and not simply CLASS? Using the form object.CLASS would create problems for Selenium's Java library, where .class is a reserved method. In order to keep the Selenium syntax consistent between different languages, CLASS_NAME was used instead.
CSS_SELECTOR
Finds elements by their class, id, or tag name, using the #idName, .className, tagName convention.
LINK_TEXT
Finds HTML tags by the text they contain. For example, a link that says "Next" can be selected using (By.LINK_TEXT, "Next").
PARTIAL_LINK_TEXT
Similar to LINK_TEXT, but matches on a partial string.
NAME
Finds HTML tags by their name attribute. This is handy for HTML forms.
TAG_NAME
Finds HTML tags by their tag name.
XPATH
Uses an XPath expression ... to select matching elements.
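A quick sketch showing a few of these locators side by side (the element ids, class names and values here are made-up placeholders):
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

wait = WebDriverWait(driver, 10)
# the locator names below map directly to the list above; the values are placeholders
wait.until(EC.presence_of_element_located((By.ID, 'element_id')))
wait.until(EC.presence_of_element_located((By.CLASS_NAME, 'some-class')))
wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, '#element_id .some-class')))
wait.until(EC.presence_of_element_located((By.LINK_TEXT, 'Next')))
wait.until(EC.presence_of_element_located((By.NAME, 'emailAddress')))
wait.until(EC.presence_of_element_located((By.XPATH, "//input[@name='emailAddress']")))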
From selenium/webdriver/support/wait.py
driver = ...
from selenium.webdriver.support.wait import WebDriverWait
element = WebDriverWait(driver, 10).until(
    lambda x: x.find_element_by_id("someId"))
On a side note, instead of scrolling down 100 times, you can check if there are no more modifications to the DOM (we are in the case of the bottom of the page being AJAX lazy-loaded)
import time
import logging

def scrollDown(driver, value):
    driver.execute_script("window.scrollBy(0," + str(value) + ")")

# Scroll down the page until the DOM stops changing
def scrollDownAllTheWay(driver):
    old_page = driver.page_source
    while True:
        logging.debug("Scrolling loop")
        for i in range(2):
            scrollDown(driver, 500)
            time.sleep(2)
        new_page = driver.page_source
        if new_page != old_page:
            old_page = new_page
        else:
            break
    return True
Have you tried driver.implicitly_wait? It is like a setting for the driver, so you only call it once per session, and it basically tells the driver to wait the given amount of time before each command can be executed.
driver = webdriver.Chrome()
driver.implicitly_wait(10)
So if you set a wait time of 10 seconds it will execute the command as soon as possible, waiting 10 seconds before it gives up. I've used this in similar scroll-down scenarios so I don't see why it wouldn't work in your case. Hope this is helpful.
Be sure to use a lowercase 'w' in implicitly_wait.
Here I did it using a rather simple form:
from selenium import webdriver

browser = webdriver.Firefox()
browser.get("url")
searchTxt = ''
while not searchTxt:
    try:
        searchTxt = browser.find_element_by_name('NAME OF ELEMENT')
        searchTxt.send_keys("USERNAME")
    except:
        continue
Solution for ajax pages that continuously load data. The previous methods stated do not work. What we can do instead is grab the page DOM, hash it, and compare old and new hash values over a delta time.
import time
from selenium import webdriver
def page_has_loaded(driver, sleep_time=2):
    '''
    Waits for page to completely load by comparing current page hash values.
    '''
    def get_page_hash(driver):
        '''
        Returns html dom hash
        '''
        # can find element by either 'html' tag or by the html 'root' id
        dom = driver.find_element_by_tag_name('html').get_attribute('innerHTML')
        # dom = driver.find_element_by_id('root').get_attribute('innerHTML')
        dom_hash = hash(dom.encode('utf-8'))
        return dom_hash

    page_hash = 'empty'
    page_hash_new = ''

    # comparing old and new page DOM hashes to verify the page is fully loaded
    while page_hash != page_hash_new:
        page_hash = get_page_hash(driver)
        time.sleep(sleep_time)
        page_hash_new = get_page_hash(driver)
        print('<page_has_loaded> - page not loaded')

    print('<page_has_loaded> - page loaded: {}'.format(driver.current_url))
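A short usage sketch (the URL is only a placeholder):
driver = webdriver.Chrome()
driver.get('https://example.com')  # placeholder URL
page_has_loaded(driver)  # blocks until two consecutive DOM hashes match
print('safe to start scraping')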
How about putting WebDriverWait in a while loop and catching the exceptions?
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException

browser = webdriver.Firefox()
browser.get("url")
delay = 3  # seconds
while True:
    try:
        WebDriverWait(browser, delay).until(EC.presence_of_element_located((By.ID, 'IdOfMyElement')))
        print("Page is ready!")
        break  # it will break from the loop once the specific element is present
    except TimeoutException:
        print("Loading took too much time! - Try again")
You can do that very simply with this function:
import time

def page_is_loading(driver):
    while True:
        x = driver.execute_script("return document.readyState")
        if x == "complete":
            return True
        time.sleep(0.2)
and when you want to do something after the page has finished loading, you can use:
Driver = webdriver.Firefox(options=Options, executable_path='geckodriver.exe')
Driver.get("https://www.google.com/")
while not page_is_loading(Driver):
    continue
Driver.execute_script("alert('page is loaded')")
Use this in your code:
from selenium import webdriver
driver = webdriver.Firefox() # or Chrome()
driver.implicitly_wait(10) # seconds
driver.get("http://www.......")
Or you can use this code if you are looking for a specific tag:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Firefox() #or Chrome()
driver.get("http://www.......")
try:
    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.ID, "tag_id"))
    )
finally:
    driver.quit()
Very good answers here. Quick example of wait for XPATH.
# wait for sizes to load - 2s timeout
try:
    WebDriverWait(driver, 2).until(expected_conditions.presence_of_element_located(
        (By.XPATH, "//div[@id='stockSizes']//a")))
except TimeoutException:
    pass
I struggled a bit to get this working, as it didn't work for me as expected. Anyone who is still struggling to get this working may check this.
I wanted to wait for an element to be present on the webpage before proceeding with my manipulations.
We can use WebDriverWait(driver, 10, 1).until(), but the catch is that until() expects a function which it can execute, for the period of the timeout provided (in our case 10 seconds), once every 1 second. So keeping it like below worked for me:
wait_for_element = WebDriverWait(driver, 10, 1)
element_found = wait_for_element.until(lambda x: x.find_element_by_class_name("MY_ELEMENT_CLASS_NAME").is_displayed())
Here is what until() does behind the scenes:
def until(self, method, message=''):
    """Calls the method provided with the driver as an argument until the \
    return value is not False."""
    screen = None
    stacktrace = None
    end_time = time.time() + self._timeout
    while True:
        try:
            value = method(self._driver)
            if value:
                return value
        except self._ignored_exceptions as exc:
            screen = getattr(exc, 'screen', None)
            stacktrace = getattr(exc, 'stacktrace', None)
        time.sleep(self._poll)
        if time.time() > end_time:
            break
    raise TimeoutException(message, screen, stacktrace)
If you are trying to scroll and find all items on a page, you can consider using the following. It is a combination of a few methods mentioned by others here, and it did the job for me:
while True:
    try:
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        driver.implicitly_wait(30)
        time.sleep(4)
        elem1 = WebDriverWait(driver, 30).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "element-name")))
        len_elem_1 = len(elem1)
        print(f"A list Length {len_elem_1}")
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        driver.implicitly_wait(30)
        time.sleep(4)
        elem2 = WebDriverWait(driver, 30).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "element-name")))
        len_elem_2 = len(elem2)
        print(f"B list Length {len_elem_2}")
        if len_elem_1 == len_elem_2:
            print(f"final length = {len_elem_1}")
            break
    except TimeoutException:
        print("Loading took too much time!")
Selenium can't detect whether the page is fully loaded or not, but JavaScript can. I suggest you try this.
from selenium.webdriver.support.ui import WebDriverWait
WebDriverWait(driver, 100).until(lambda driver: driver.execute_script('return document.readyState') == 'complete')
This will execute JavaScript code instead of using Python, because JavaScript can detect when a page is fully loaded; it will show 'complete'. This code means: for up to 100 seconds, keep checking document.readyState until 'complete' shows.
nono = driver.current_url
driver.find_element(By.XPATH, "//button[@value='Send']").click()
while driver.current_url == nono:
    pass
print("page loaded.")
