Selenium WebDriverWait does not reset timer in for loop | Python

I am using Python 3.8.5 & Selenium 3.141.0 to automate login processes. I am locating login buttons with CSS selectors, but in my testing the selectors can change (I'm a little fuzzy on this - something dynamic to do with how the page loads??).
My current solution is to iterate over a list of CSS selectors that I have observed (they do repeat):
driver = webdriver.Chrome(options=options)
success = True
errorMSG = ""
for loginClickID in paper.loginClickID:
    wait = WebDriverWait(driver, 12)
    try:
        wait.until(
            EC.presence_of_element_located((By.CSS_SELECTOR, loginClickID))
        )
    except Exception as e:
        log("\nFailed to see click-to-login element: " + loginClickID + "\nHTML output shown below:\n\n" + driver.page_source)
        success = False
        errorMSG = e
    if not success:
        response = driver.page_source
        driver.quit()
        return f"{paper.brand} login_fail\nLoginClickID: {loginClickID}\nNot found in:\n{response}\n{errorMSG}\n"
I am creating a new WebDriverWait() object for each iteration of the loop. However, when I debug the code and step through it manually, the second time I enter the loop the wait.until() method returns immediately, without even throwing an exception (which is very strange, right?), and the loop exits completely (the CSS selector list has two elements).
My thought is that somehow the wait.until() timer is not resetting?
I've tried reloading the page using driver.refresh(), and sleeping the Python code using time.sleep(1) in the except: section, in the hope that might help reset things. It has not...
I've included all my ChromeDriver options for context:
options = Options()
# options.add_argument("--headless")
options.add_argument("window-size=1400,1500")
options.add_argument("--disable-gpu")
options.add_argument("--no-sandbox")
options.add_argument("enable-automation")
options.add_argument("--disable-infobars")
options.add_argument("--disable-dev-shm-usage")
options.add_argument("start-maximized")
options.add_argument("--disable-browser-side-navigation")
I am using:
Google Chrome 80.0.3987.106
&
ChromeDriver 80.0.3987.106
Any Suggestions?

Why use CSS selectors, which are obviously dynamic? As you say, they change.
Why don't you use XPath?
xpath_email = "//input[@type='email']"
xpath_password = "//input[@type='password']"
driver.find_element_by_xpath(xpath_email)
time.sleep(1)
driver.find_element_by_xpath(xpath_password)
These are, obviously, generic XPaths, but you can find them on whatever login page you are on and change them accordingly.
This way, no matter what the test case, your logic will work.

Related

driver.find_element(By.XPATH, "xpath") not working

I am trying to learn Selenium in Python and I have faced a problem which has stopped me from progressing.
As you might know, previous versions of Selenium have different syntax compared to the latest one, and I have tried everything to fill the form with my code, but nothing happens. I am trying to find an XPath element on [https://demo.seleniumeasy.com/basic-first-form-demo.html], but whatever I do, I cannot type my message into the message field.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service as ChromeService
import time

options = webdriver.ChromeOptions()
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option("useAutomationExtension", False)
service = ChromeService(executable_path="C:/Users/SnappFood/.cache/selenium/chromedriver/win32/110.0.5481.77/chromedriver.exe")
driver = webdriver.Chrome()
driver.maximize_window()
driver.get("https://demo.seleniumeasy.com/basic-first-form-demo.html")
time.sleep(1000)
message_field = driver.find_element(By.XPATH, '//*[@id="user-message"]')
message_field.send_keys("Hello World")
show_message_button = driver.find_element(By.XPATH, '//*[@id="get-input"]/button')
show_message_button.click()
With this code, I expect to fill the message field in the form and click the "SHOW MESSAGE" button to print my typed text, but what happens is that my code only opens a new Chrome page with an empty field.
I have to mention that I don't get any errors from PyCharm and the code runs with no errors.
I would really appreciate it if you could help me understand what I am doing wrong.
You need to wait for the page to load correctly.
For this, the best approach is to use WebDriverWait:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait

driver = webdriver.Chrome()
driver.maximize_window()
driver.get("https://demo.seleniumeasy.com/basic-first-form-demo.html")

# Wait for the page to load
wait = WebDriverWait(driver, 10)
wait.until(lambda driver: driver.find_element(By.XPATH, '//*[@id="user-message"]'))

message_field = driver.find_element(By.XPATH, '//*[@id="user-message"]')
message_field.send_keys("Hello World")
show_message_button = driver.find_element(By.XPATH, '//*[@id="get-input"]/button')
show_message_button.click()
Tested, it works fine.
You can also use time.sleep() to wait for the load if you know the loading times:
import time

driver = webdriver.Chrome()
driver.maximize_window()
driver.get("https://demo.seleniumeasy.com/basic-first-form-demo.html")
time.sleep(5)
message_field = driver.find_element(By.XPATH, '//*[@id="user-message"]')
message_field.send_keys("Hello World")
time.sleep(5)
show_message_button = driver.find_element(By.XPATH, '//*[@id="get-input"]/button')
show_message_button.click()
time.sleep(5)
With this XPath, two elements match //*[@id="user-message"], so you need to make it unique. Try the XPath below; this will work as expected.
message_field = driver.find_element(By.XPATH, '//input[@id="user-message"]')

Scraping only the portion that loads - Without Scrolling

I have written a simple web-scraping script using Selenium, but I want to scrape only the portion that is present 'before scroll'.
Say it is this page I want to scrape - https://en.wikipedia.org/wiki/Pandas_(software) - Selenium reads information up to the absolute last element/text, which for me is the 'Powered by MediaWiki' button at the far bottom-right of the page.
What I want Selenium to do is stop after DataFrames (see screenshot) and not scroll down to the bottom.
And I also want to know where on the page it stops. I have checked multiple sources and most of them ask for infinite scroll websites. No one asks for just the 'visible' half of a page.
This is my code now:
from selenium import webdriver

EXECUTABLE = r"chromedriver.exe"

# get the URL
url = "https://en.wikipedia.org/wiki/Pandas_(software)"

# open the chromedriver
driver = webdriver.Chrome(executable_path=EXECUTABLE)

# google window is maximized so that all webpages are rendered in the same size
driver.maximize_window()

# make the driver wait for 30 seconds before throwing a time-out exception
driver.implicitly_wait(30)

# get URL
driver.get(url)

for element in driver.find_elements_by_xpath("//*"):
    try:
        # stuff
        pass
    except:
        continue

driver.close()
Absolutely any direction is appreciated. I have tried to be as clear as possible here but let me know if any more details are required.
I don't think that is possible. Observe the DOM: all the informational elements are under one section, I mean one tag, div[@id='content'], which is already visible to Selenium. Even if you try with //*, div[@id='content'] is visible.
And trying to check whether an element is visible though not scrolled to will also return True. (If someone knows how to do what you are asking for, even I would like to know.)
from selenium import webdriver
from selenium.webdriver.support.expected_conditions import _element_if_visible

driver = webdriver.Chrome(executable_path='path to chromedriver.exe')
driver.maximize_window()
driver.implicitly_wait(30)
driver.get("https://en.wikipedia.org/wiki/Pandas_(software)")

elements = driver.find_elements_by_xpath("//div[@id='content']//*")
for element in elements:
    try:
        if _element_if_visible(element):
            print(element.get_attribute("innerText"))
    except:
        break

driver.quit()
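If you only want elements that sit inside the initial, unscrolled viewport, one possible approximation (not from the answer above, and untested against Wikipedia specifically) is to ask the browser itself via getBoundingClientRect():

```python
# JavaScript that reports whether an element's top edge is inside the current
# viewport; driver.execute_script passes the element in as arguments[0].
IN_VIEWPORT_JS = """
var rect = arguments[0].getBoundingClientRect();
return rect.top >= 0 && rect.top < window.innerHeight;
"""


def is_in_initial_viewport(driver, element):
    """True if the element starts within the visible (unscrolled) part of the page."""
    return driver.execute_script(IN_VIEWPORT_JS, element)
```

Called before any scrolling, this would let a loop like the one above break at the first element that falls below the fold.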

Selenium. Unable to locate element from the html website

Here's the link to the website I'm trying to scrape (I'm just practising for the moment, nothing fancy):
link
Here's my script; it's quite long, but nothing too complicated:
from selenium import webdriver

if __name__ == "__main__":
    print("Web Scraping application started")

    PATH = "driver\chromedriver.exe"
    options = webdriver.ChromeOptions()
    options.add_argument("--disable-gpu")
    options.add_argument("--window-size=1200,900")
    options.add_argument('enable-logging')
    driver = webdriver.Chrome(options=options, executable_path=PATH)
    driver.get('https://fr.hotels.com/')
    driver.maximize_window()

    destination_location_element = driver.find_element_by_id("qf-0q-destination")
    check_in_date_element = driver.find_element_by_id("qf-0q-localised-check-in")
    check_out_date_element = driver.find_element_by_id("qf-0q-localised-check-out")
    search_button_element = driver.find_element_by_xpath('//*[@id="hds-marquee"]/div[2]/div[1]/div/form/div[4]/button')

    print('Printing type of search_button_element')
    print(type(search_button_element))

    destination_location_element.send_keys('Paris')
    check_in_date_element.clear()
    check_in_date_element.send_keys("29/05/2021")
    check_out_date_element.clear()
    check_out_date_element.send_keys("30/05/2021")

    close_date_window = driver.find_element_by_xpath('/html/body/div[7]/div[4]/button')
    print('Printing type of close_date_window')
    print(type(close_date_window))
    close_date_window[0].click()

    search_button_element.click()
    time.sleep(10)

    hotels = driver.find_element_by_class_name('hotel-wrap')
    print("\n")

    i = 1
    for hotel in hotels:
        try:
            print(hotel.find_element_by_xpath('//*[@id="listings"]/ol/li[' + str(i) + ']/article/section/div/h3/a').text)
            print(hotel.find_element_by_xpath('//*[@id="listings"]/ol/li[' + str(i) + ']/article/section/div/address/span').text)
        except Exception as ex:
            print(ex)
            print('Failed to extract data from element.')
        i = i + 1
        print('\n')

    driver.close()
    print('Web Scraping application completed')
And here's the error I get :
File "hotelscom.py", line 21, in <module>
destination_location_element = driver.find_element_by_id("qf-0q-destination")
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":"[id="qf-0q-destination"]"}
(Session info: chrome=90.0.4430.85)
Any idea how to fix that? I don't understand why I get this error, because that id is right there in the HTML code. But I guess I'm wrong.
You have multiple problems, both with your code and with the site.
SITE PROBLEMS
1. The site is served from multiple servers, and different servers have different HTML code. I do not know whether this depends on location or not.
2. The version I have a solution for has a few serious bugs (or maybe those are features). Among them:
When you press Enter it starts the hotel search while a date field is open, when you just want to close that date field. So it is a problem to close input fields in the traditional way.
Selenium's clear() does not work as it is supposed to.
BUGS IN YOUR CODE
1. You are defining the window size in options and you are maximizing the window immediately after the site is opened. Use only one of the two.
2. You are entering dates like "29/05/2021", but the site recognises only formats like "05/30/2021". It is a big difference.
3. You are not using any waits, and they are extremely important.
4. Your locators are wrong and unstable. Even locators with an id did not always work for me, because if you make a search there are two elements for some of them. So I replaced them with CSS selectors.
Please note that my solution works only for an old version of the site. If you want a specific version to be opened, you will need to either:
Get the site by a direct IP address, like driver.get('site ip address')
Implement a strategy in your framework which recognises which site version is opened and applies inputs accordingly.
SOLUTION
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait

if __name__ == "__main__":
    print("Web Scraping application started")

    options = webdriver.ChromeOptions()
    options.add_argument("--disable-gpu")
    options.add_argument("--window-size=1200,900")
    options.add_argument('enable-logging')
    driver = webdriver.Chrome(options=options, executable_path='/snap/bin/chromium.chromedriver')
    driver.get('https://fr.hotels.com/')
    wait = WebDriverWait(driver, 15)

    wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#qf-0q-destination")))
    destination_location_element = driver.find_element_by_css_selector("#qf-0q-destination")
    destination_location_element.send_keys('Paris, France')
    wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".widget-autosuggest.widget-autosuggest-visible table tr")))
    destination_location_element.send_keys(Keys.TAB)  # workaround to close destination field
    driver.find_element_by_css_selector(".widget-query-sub-title").click()
    wait.until(EC.invisibility_of_element_located((By.CSS_SELECTOR, ".widget-query-group.widget-query-destination [aria-expanded=true]")))

    wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#qf-0q-localised-check-in")))
    check_in_date_element = driver.find_element_by_css_selector("#qf-0q-localised-check-in")
    check_in_date_element.send_keys(Keys.CONTROL, 'a')  # workaround to replace clear() method
    check_in_date_element.send_keys(Keys.DELETE)        # workaround to replace clear() method
    # check_in_date_element.click()
    check_in_date_element.send_keys("05/30/2021")

    # wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "#qf-0q-localised-check-out")))
    check_out_date_element = driver.find_element_by_id("qf-0q-localised-check-out")
    check_out_date_element.click()
    check_out_date_element.send_keys(Keys.CONTROL, 'a')
    check_out_date_element.send_keys(Keys.DELETE)
    check_out_date_element.send_keys("05/31/2021")
    driver.find_element_by_css_selector(".widget-query-sub-title").click()  # workaround to close end date

    wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#hds-marquee button"))).click()
I spent a few hours on this; the task just seemed interesting to me.
It works for this UI:
The code can still be optimised. It's up to you.
UPDATE:
I found out that the site has at least three home pages with three different locators for the Destination and other fields.
The easiest workaround that came to my mind is something like this:
try:
    element = driver.find_element_by_css_selector("#qf-0q-destination")
    if element.is_displayed():
        wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#qf-0q-destination")))
        destination_location_element = driver.find_element_by_css_selector("#qf-0q-destination")
        print("making input to Destination field of site 1")
        destination_location_element.send_keys('Paris, France')
        # input following data
except:
    print("Page 1 not found")

try:
    element = driver.find_element_by_css_selector("input[name=q-destination-srs7]")
    if element.is_displayed():
        wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input[name=q-destination-srs7]")))
        destination_location_element = driver.find_element_by_css_selector("input[name=q-destination-srs7]")
        print("making input to Destination field of site 2")
        destination_location_element.send_keys('Paris, France')
        # input following data
except:
    print("Page 2 is not found")

try:
    element = driver.find_element_by_css_selector("form[method=GET]>div>._1yFrqc")
    if element.is_displayed():
        wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "form[method=GET]>div>._1yFrqc")))
        destination_location_element = driver.find_element_by_css_selector("form[method=GET]>div>._1yFrqc")
        print("making input to Destination field of site 3")
        destination_location_element.send_keys('Paris, France')
        # input following data
except:
    print("Page 3 is not found")
But the best solution would be to have a direct access to a specific server that has only one version available.
Please also note that if you access the site by a direct link for France: https://fr.hotels.com/?pos=HCOM_FR&locale=fr_FR your input dates will be as you initially specified, for example 30/05/2021.
Try this:
driver.find_element_by_xpath(".//div[contains(@class,'destination')]/input[@name='q-destination']")
Also, please add a wait after you maximize the window.
You are missing a wait / sleep before finding the element.
So, just add this:
element = WebDriverWait(driver, 20).until(
    EC.element_to_be_clickable((By.ID, "qf-0q-destination")))
element.click()
To use this you will need the following imports:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

Selenium Element.get_attribute occasionally failing

Given this code:
options = webdriver.ChromeOptions()
options.add_argument("headless")
driver = webdriver.Chrome(options=options)
driver.get('https://covid19.apple.com/mobility')
elements = driver.find_elements_by_css_selector("div.download-button-container a")
csvLink = [el.get_attribute("href") for el in elements]
driver.quit()
At the end, csvLink sometimes has the link and most times not. If I stop at the last line in the debugger, it often has nothing in csvLink, but if I manually execute (in the debugger) elements[0].get_attribute('href'), the correct link is returned. Every time.
If I replace
csvLink = [el.get_attribute("href") for el in elements]
with a direct call -
csvLink = elements[0].get_attribute("href")
it also fails. But, again, if I'm stopped at the driver.quit() line and execute it manually, the correct link is returned.
Is there a time or path dependency I'm unaware of in using Selenium?
I'm guessing it has to do with how and when the JavaScript loads the link: Selenium grabs the element before the JavaScript has populated its href attribute value. Try explicitly waiting for the selector, something like:
(
    WebDriverWait(driver, 20)
    .until(EC.presence_of_element_located(
        (By.CSS_SELECTOR, "div.download-button-container a[href]")))
    .click()
)
Reference:
Selenium - wait until element is present, visible and interactable
How do I target elements with an attribute that has any value in CSS?
Also, if you curl https://covid19.apple.com/mobility, my suspicion would be that the element exists (maybe), but the href is blank.

Selenium - Python - same code works well with firefox but 10min slower with chrome

I want to click a button after an image is visible/loaded.
The test takes 10 minutes to run with Chrome vs 0:00:00.029671 with Firefox.
It's too slow; I'd rather run the tests manually.
How can I get the same execution time?
I'm desperate; this has taken me days, with multiple code solutions from the internet.
I upgraded to Google Chrome 75.0.3770.90 and the matching ChromeDriver.
I added some options to run Chrome (not very helpful in this case):
options.add_argument('--no-sandbox')
options.add_argument('--disable-gpu')
options.add_argument('start-maximized')
options.add_argument('disable-infobars')
options.add_argument("--disable-extensions")
connectionStatus = True
while connectionStatus == True:
    try:
        WebDriverWait(conn.driver, 10).until(ec.visibility_of_element_located(
            (By.CSS_SELECTOR, "img[src='../public/images//disconnect.png']")))
        element = conn.driver.find_element(By.CSS_SELECTOR, 'img[src="../public/images//disconnect.png"]')
        element.is_displayed()
        print("disconnect")
        connectionStatus = False
    except NoSuchElementException as e:
        print("Waiting for M to disconnect from VW")
        time.sleep(10)
    except TimeoutException:
        print("TIMEOUT - Element not found: ")

conn.driver.find_element(By.CSS_SELECTOR, "#btnSendUpd").click()
Execution:
Start: 2019-06-18 16:13:06.710734
TIMEOUT - Element not found:
Diff = 0:05:00.004450
disconnect
Diff = 0:05:00.046355
NB: the HTML contains only CSS classes, no IDs, so I cannot use findElementById.
Windows 10, 64-bit (I use the 32-bit chromedriver; they say it works on 64-bit)
Selenium 3.141.0
I was told that the website I'm testing uses a hidden iframe (Comet programming with JavaScript). A basic technique for dynamic web applications is to use a hidden iframe HTML element (an inline frame, which allows a website to embed one HTML document inside another). This invisible iframe is sent as a chunked block, which implicitly declares it as infinitely long (sometimes called a "forever frame").
I checked the development tools, Network tab:
it's like the script never stops (F12, Network, in Chrome), and I think Chrome is waiting for it to finish; that's why it is so slow (Firefox doesn't wait).
As a workaround I added this line to force Chrome not to wait too long for the page to load:
driver.set_page_load_timeout(7)
Execution now takes seconds:
Start: 2019-06-20 13:23:24.746351
TIMEOUT - Element not found
Diff = 0:00:07.004282
disconnect
Diff = 0:00:07.036196
